# Performance measurements
This section tracks performance measurements across releases.
## NQueens (DM42)
Performance recorded for various releases on the DM42 with the `small` build
option (the only one that fits all releases). All rows use the same `NQueens`
benchmark; times are in milliseconds, best of 5 runs, on USB power, with
presumably no GC.
| Version | Time | PGM Size | QSPI Size | Note |
|---------|---------|-----------|-----------|-------------------------|
| 0.5.2 | 1310 | 711228 | 1548076 | |
| 0.5.1 | | | | |
| 0.4.10+ | 1205 | 651108 | | RPL stack runloop |
| 0.4.10 | 1070 | 650116 | | Focused optimizations |
| 0.4.9+ | 1175 | | | Range-based type checks |
| 0.4.9+ | 1215 | | | Remove busy animation |
| 0.4.9 | 1447 | 646028 | 1531868 | No LastArg in progs |
| 0.4.8 | 1401 | 633932 | 1531868 | |
| 0.4.7 | 1397 | 628188 | 1531868 | |
| 0.4.6 | 1380 | 629564 | 1531868 | |
| 0.4.5 | 1383 | 624572 | 1531868 | |
| 0.4.4 | 1377 | 624656 | 1531868 | Implements Undo/LastArg |
| 0.4.3S | 1278 | 617300 | 1523164 | 0.4.3 build "small" |
| 0.4.3 | 1049 | 717964 | 1524812 | Switch to -Os |
| 0.4.2 | 1022 | 708756 | 1524284 | |
| 0.4.1 | 1024 | 687444 | 1522788 | |
| 0.4 | 998 | 656516 | 1521748 | Feature tests 7541edf |
| 0.3.1 | 746 | 618884 | 1517620 | Faster busy 3f3ab4b |
| 0.3 | 640 | 610820 | 1516900 | Busy anim 4ab3c97 |
| 0.2.4 | 522 | 597372 | 1514292 | |
| 0.2.3 | 526 | 594724 | 1514276 | Switching to -O2 |
| 0.2.2 | 723 | 540292 | 1512980 | |
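
The "best of 5 runs" methodology behind the table above can be sketched as follows. This is only an illustration in Python, not the RPL program that runs on the calculator: the `first_queens_solution` solver, the board size of 8 and the `best_of` helper are all assumptions introduced for this example.

```python
import time


def first_queens_solution(n=8):
    """Return the first N-queens placement found by backtracking, or None.

    cols[r] is the column of the queen placed on row r.
    """
    cols = []

    def safe(row, col):
        # A new queen conflicts with an earlier one if they share
        # a column or a diagonal.
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place(row):
        if row == n:
            return True
        for col in range(n):
            if safe(row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()
        return False

    return cols if place(0) else None


def best_of(func, runs=5):
    """Run func `runs` times and return the shortest duration in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append((time.perf_counter() - start) * 1000.0)
    return min(durations)


if __name__ == "__main__":
    print(f"best of 5: {best_of(first_queens_solution):.3f} ms")
```

Taking the minimum rather than the mean filters out runs that were slowed down by unrelated activity, which is presumably why the tables report the best run.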
## NQueens (DM32)
Performance recorded for various releases on the DM32 with the `fast` build
option. All rows use the same `NQueens` benchmark; times are in milliseconds,
best of 5 runs. There is no GC column, because a garbage collection is harder
to trigger given how much more memory the calculator has. Also, experimentally,
the numbers for the USB and battery measurements are almost identical at the
moment. As I understand it, there are plans for a USB overclock like on the
DM42, but at the moment it is not there.
| Version | Time | PGM Size | QSPI Size | Note |
|---------|---------|-----------|-----------|-------------------------|
| 0.5.2 | 1752 | | | |
| 0.5.1 | 1746 | | | |
| 0.5.0 | 1723 | | | |
| 0.4.10+ | 1804 | 761252 | | RPL stack runloop |
| 0.4.10 | 1803 | 731052 | | Focused optimizations |
| 0.4.9 | 2156 | 772732 | 1534316 | No LastArg in progs |
| 0.4.8 | 2201 | 749892 | 1534316 | |
| 0.4.7 | 2209 | 742868 | 1534316 | |
| 0.4.6 | 2204 | 743492 | 1534316 | |
| 0.4.5 | 2171 | 730092 | 1534316 | |
| 0.4.4 | 2170 | 730076 | 1534316 | Implements Undo/LastArg |
| 0.4.3 | 2081 | 718020 | 1527092 | |
| 0.4.2 | 2242 | 708756 | 1524284 | |
| 0.4.1 | 2152 | 687500 | 1522788 | |
| 0.4 | | | | Feature tests 7541edf |
| 0.3.1 | | | | |
| 0.3 | | | | |
| 0.2.4 | | | | |
| 0.2.3 | | | | |
## Collatz conjecture check
This test checks the tail recursion optimization in the RPL interpreter.
The code can be found in the `CBench` program in the `Demo.48S` state.
The HP48 cannot run the benchmark because it does not have integer arithmetic.
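
To illustrate what tail recursion optimization means for this kind of test, here is a minimal Python sketch. It is not the `CBench` program itself, whose source is not reproduced here; the function names are hypothetical. The first function is written in tail-recursive form, and the second is the loop that an interpreter with tail-call elimination effectively executes instead.

```python
def collatz_steps_recursive(n, steps=0):
    """Tail-recursive form: each recursive call is the last action taken.

    Python does not eliminate tail calls, so this version grows the call
    stack; an interpreter that optimizes tail recursion would not.
    """
    if n == 1:
        return steps
    if n % 2 == 0:
        return collatz_steps_recursive(n // 2, steps + 1)
    return collatz_steps_recursive(3 * n + 1, steps + 1)


def collatz_steps_loop(n):
    """Equivalent loop: constant stack depth, same result."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def check_range(limit):
    """Check every start value in 1..limit-1 and return the longest trajectory."""
    return max(collatz_steps_loop(n) for n in range(1, limit))


if __name__ == "__main__":
    print(collatz_steps_loop(27), check_range(1000))
```

Both forms rely on exact integer arithmetic, since `3 * n + 1` must not be rounded.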
Timings on 0.4.10 are:
* HP50G: 397.438s
* DM32: 28.507s (14x faster)
* DM42: 15.769s (25x faster)
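
The speedup factors quoted above are the ratio of the HP50G time to each calculator's time. A quick check in Python reproduces them from the listed timings; the `timings` dictionary and the `speedup` helper are names introduced here for illustration only.

```python
# Timings on 0.4.10, in seconds, as listed above.
timings = {"HP50G": 397.438, "DM32": 28.507, "DM42": 15.769}


def speedup(baseline, other):
    """Return how many times faster `other` is than `baseline`."""
    return baseline / other


if __name__ == "__main__":
    for name in ("DM32", "DM42"):
        ratio = speedup(timings["HP50G"], timings[name])
        print(f"{name}: {ratio:.1f}x faster than HP50G")
```

The ratios come out at roughly 13.9 and 25.2, which round to the 14x and 25x figures given in the list.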
| Version | DM32 ms | DM42 ms |
|---------|---------|---------|
| 0.5.2 | 26733 | 15695 |
| 0.4.10 | 28507 | 15769 |
## SumTest (decimal performance)
| Version | DM32 ms | DM42 ms |
|---------|---------|---------|
| 0.5.2 | 215421 | 143412 |
## Drawing `sin X` with `FunctionPlot`
* DM32 Intel Decimal: 2332 - 5140
* DM32 variable precision (6): 2423 -
* DM32 variable precision (24): 3863 - 6005
* DM32 variable precision (36): 6567 - 10186
* DM32 variable precision (48): 8377 - 10259

Crash at precision 3