2023-11-03 19:42:12 +01:00
# Performance measurements

This section tracks performance measurements across releases.

<!--- DMNONE --->
## NQueens (DM42)

Performance recorded for various releases on the DM42 with the `small` build
option (the only one that fits all releases). All runs use the same `NQueens`
benchmark; times are in milliseconds, best of 5 runs, on USB power, presumably
with no GC.

| Version | Time (ms) | PGM Size | QSPI Size | Note                    |
|---------|-----------|----------|-----------|-------------------------|
| 0.6.0   | 1183      | 409252   | 187516    | New table-free decimal  |
| 0.5.2   | 1310      | 711228   | 1548076   |                         |
| 0.5.1   |           |          |           |                         |
| 0.4.10+ | 1205      | 651108   |           | RPL stack runloop       |
| 0.4.10  | 1070      | 650116   |           | Focused optimizations   |
| 0.4.9+  | 1175      |          |           | Range-based type checks |
| 0.4.9+  | 1215      |          |           | Remove busy animation   |
| 0.4.9   | 1447      | 646028   | 1531868   | No LastArg in progs     |
| 0.4.8   | 1401      | 633932   | 1531868   |                         |
| 0.4.7   | 1397      | 628188   | 1531868   |                         |
| 0.4.6   | 1380      | 629564   | 1531868   |                         |
| 0.4.5   | 1383      | 624572   | 1531868   |                         |
| 0.4.4   | 1377      | 624656   | 1531868   | Implements Undo/LastArg |
| 0.4.3S  | 1278      | 617300   | 1523164   | 0.4.3 build "small"     |
| 0.4.3   | 1049      | 717964   | 1524812   | Switch to -Os           |
| 0.4.2   | 1022      | 708756   | 1524284   |                         |
| 0.4.1   | 1024      | 687444   | 1522788   |                         |
| 0.4     | 998       | 656516   | 1521748   | Feature tests 7541edf   |
| 0.3.1   | 746       | 618884   | 1517620   | Faster busy 3f3ab4b     |
| 0.3     | 640       | 610820   | 1516900   | Busy anim 4ab3c97       |
| 0.2.4   | 522       | 597372   | 1514292   |                         |
| 0.2.3   | 526       | 594724   | 1514276   | Switching to -O2        |
| 0.2.2   | 723       | 540292   | 1512980   |                         |

Release 0.5.2 "Christmas Eve": Reaching hard limits on DM42

This release was a bit longer in coming than earlier ones, because we are about
to reach the limits of what can fit on a DM42. This release uses 711228 bytes
out of the 716800 (99.2%).

Without the Intel Decimal Library code, we use only 282980 bytes. This means
that the Intel Decimal Library code uses 60.2% of the total code space. Being
able to move further requires a rather radical rethinking of the project, where
we replace the Intel Decimal Library with size-optimized decimal code.

As a result, release 0.5.2 will be the last one using the Intel Decimal Library,
and is released in parallel with 0.6.0, which switches to a table-free and
variable-precision implementation of decimal code that uses much less code
space. The two releases should otherwise be functionally identical.

**New features**

* Shift and rotate instructions (#622)
* Add `CompatibleTypes` and `DetailedTypes` settings to control `Type` results
* Recognize HP-compatible negative values for flags, e.g. `-64 SF` (#625)
* Add settings to control multiline result and stack display (#634)

**Bug fixes**

* Truncate to `WordSize` the small results of binary operations (#624)
* Fix day-of-week shortcut in simulator
* Avoid double-evaluation of immediate commands when there is no help
* Generate an error when selecting base 1 (#628)
* Avoid `Number too big` error on based numbers
* Correctly garbage-collect menu entries (#630)
* Select default settings that allow solver to find solutions (#627)
* Fix display of decimal numbers (broken by multi-line display)
* Fix rendering of menu entries for `Fix`, `Std`, etc
* Detect non-finite results in arithmetic, e.g. `(-8)^0.3` (#635, #639)
* Fix range-checking for `Dig` to allow `-1` value
* Accept large values for `Fix`, `Sci` and `Eng` (for variable precision)
* Restore missing last entry in built-in units menu (#638)
* Accept `Hz` and non-primary units as input for `ConvertToUnitPrefix` (#640)
* Fix LEB128 encoding for signed value 64 and similar (#642)
* Do not parse `IfThenElse` as a command
* Do not consider `E` as a digit in decimal numbers (#643)
* Do not parse `min` as a function in units, but as minute (#644)

**Improvements**

* Add `OnesComplement` flag for binary operation (not used yet)
* Add `ComplexResults` (-103) flag (not used yet)
* Accept negative values for `B→R` (according to `WordSize`)
* Add documentation for `STO` and `RCL` accessing flash storage
* Mention `True` and `False` in documentation
* Rename `MaxBigNumBits` to `MaxNumberBits`
* Return HP-compatible values from `Type` function
* Minor optimization of flags implementation
* Catalog auto-completion now suggests all possible spellings (#626)
* Add aliases for `CubeRoot` and `Hypothenuse`
* Align based number promotion rules to HP calculators (#629)
* Expand the range of garbage collector integrity check on simulator
* Show command according to preferences in error messages (#633)
* Avoid crash in `debug_printf` if used before font initialization
* Update performance data in documentation
* Add ability to disable any reference to Intel Decimal Floating-point library
* Simplify C++ notations for safe pointers (`+x` and `operator bool()`)
* Fix link to old `db48x` project in `README.md`

Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
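The code-space percentages quoted in the 0.5.2 release note can be rechecked with a few lines of arithmetic. The sketch below is only an illustration in Python: the constant and function names are made up for this example, and the byte counts are copied from the release note and from the DM42 table.

```python
# Figures quoted in the 0.5.2 release note and the DM42 table (all in bytes).
DM42_CODE_LIMIT = 716800      # total code space available on the DM42
PGM_0_5_2 = 711228            # 0.5.2 program size, with the Intel Decimal Library
PGM_WITHOUT_INTEL = 282980    # same release without the Intel Decimal Library
PGM_0_6_0 = 409252            # 0.6.0 program size (table-free decimal code)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of the DM42 code space used by release 0.5.2.
used = percent(PGM_0_5_2, DM42_CODE_LIMIT)
# Share of the 0.5.2 program taken by the Intel Decimal Library.
intel_share = percent(PGM_0_5_2 - PGM_WITHOUT_INTEL, PGM_0_5_2)
# Reduction in program size when going from 0.5.2 to 0.6.0.
saved = percent(PGM_0_5_2 - PGM_0_6_0, PGM_0_5_2)

print(f"0.5.2 uses {used}% of the DM42 code space")
print(f"Intel Decimal Library share of 0.5.2: {intel_share}%")
print(f"0.6.0 is {saved}% smaller than 0.5.2")
```

The first two values reproduce the 99.2% and 60.2% figures in the release note; the third is derived from the table and is not stated in the original text.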
## NQueens (DM32)

Performance recorded for various releases on the DM32 with the `fast` build
option. All runs use the same `NQueens` benchmark; times are in milliseconds,
best of 5 runs. There is no GC column, because a garbage collection is harder
to trigger given how much more memory the calculator has. Experimentally, the
USB and battery measurements are almost identical at the moment. As I
understand it, a USB overclock like the one on the DM42 is planned, but it is
not there yet.

| Version | Time (ms) | PGM Size | QSPI Size | Note                    |
|---------|-----------|----------|-----------|-------------------------|
| 0.6.0   | 1751      | 467260   | 187948    | New table-free decimal  |
| 0.5.2   | 1752      | 856228   | 1550436   |                         |
| 0.5.1   | 1746      |          |           |                         |
| 0.5.0   | 1723      |          |           |                         |
| 0.4.10+ | 1804      | 761252   |           | RPL stack runloop       |
| 0.4.10  | 1803      | 731052   |           | Focused optimizations   |
| 0.4.9   | 2156      | 772732   | 1534316   | No LastArg in progs     |
| 0.4.8   | 2201      | 749892   | 1534316   |                         |
| 0.4.7   | 2209      | 742868   | 1534316   |                         |
| 0.4.6   | 2204      | 743492   | 1534316   |                         |
| 0.4.5   | 2171      | 730092   | 1534316   |                         |
| 0.4.4   | 2170      | 730076   | 1534316   | Implements Undo/LastArg |
| 0.4.3   | 2081      | 718020   | 1527092   |                         |
| 0.4.2   | 2242      | 708756   | 1524284   |                         |
| 0.4.1   | 2152      | 687500   | 1522788   |                         |
| 0.4     |           |          |           | Feature tests 7541edf   |
| 0.3.1   |           |          |           |                         |
| 0.3     |           |          |           |                         |
| 0.2.4   |           |          |           |                         |
| 0.2.3   |           |          |           |                         |
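The "best of 5 runs" convention used for the NQueens tables can be illustrated on a host machine as follows. This is a hedged sketch, not the procedure used on the calculators: `best_of_ms` and `sample_workload` are hypothetical names, and the workload is a stand-in for the `NQueens` program.

```python
import time
from typing import Callable


def best_of_ms(workload: Callable[[], object], runs: int = 5) -> float:
    """Run `workload` `runs` times and return the fastest wall-clock time in ms."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Keep the minimum: it is the run least disturbed by background activity.
        best = min(best, elapsed_ms)
    return best


def sample_workload() -> int:
    """Hypothetical stand-in for the benchmark program."""
    return sum(i * i for i in range(10_000))


print(f"best of 5: {best_of_ms(sample_workload):.3f} ms")
```

Taking the minimum rather than the mean tends to filter out runs slowed by unrelated events, which is presumably why the tables report the best run.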
## Collatz conjecture check

This test checks the tail recursion optimization in the RPL interpreter.
The code can be found in the `CBench` program in the `Demo.48S` state.
The HP48 cannot run the benchmark because it does not have integer arithmetic.

Timings on 0.4.10 are:

* HP50G: 397.438s
* DM32: 28.507s (14x faster)
* DM42: 15.769s (25x faster)

| Version | DM32 ms | DM42 ms |
|---------|---------|---------|
| 0.6.0   | 26256   | 15355   |
| 0.5.2   | 26733   | 15695   |
| 0.4.10  | 28507   | 15769   |
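For readers unfamiliar with the Collatz rule, the kind of computation this benchmark exercises can be sketched in Python. This is not a translation of the `CBench` program, whose source is not reproduced here; `collatz_steps` and `check_range` are hypothetical names, and the loop is written iteratively rather than with the tail recursion that the RPL test targets.

```python
def collatz_steps(n: int) -> int:
    """Count the steps needed for n to reach 1 under the Collatz rule."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def check_range(limit: int) -> int:
    """Check every start value in 1..limit and return the longest step count."""
    return max(collatz_steps(start) for start in range(1, limit + 1))


print(collatz_steps(27))  # 111
print(check_range(30))    # 111, reached for the start value 27
```

The computation relies on exact integer arithmetic, which is the capability the text notes is missing on the HP48.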
## SumTest (decimal performance)

* VP = Variable Precision
* ID = Intel Decimal Library
* HW = Hardware-accelerated (`float` or `double` types)

For 100000 loops, we see that the variable-precision implementation at 24
digits is roughly 10 times slower (about 11x on DM32 and 12x on DM42) than the
fixed-precision implementation at 34 digits (128 bits).

| Version      | DM32 ms | DM42 ms |
|--------------|---------|---------|
| 0.6.0 (VP24) | 2377390 | 1768510 |
| 0.5.2 (ID)   | 215421  | 143412  |

For 1000 loops, comparing variable-precision decimal with the earlier Intel
Decimal Library:

| Version      | DM32 ms | DM42 ms |
|--------------|---------|---------|
| 0.6.4 (VP24) | 32346   | 23011   |
| 0.6.4 (VP12) | 13720   | 10548   |
| 0.6.4 (VP6)  | 6905    | 5623    |
| 0.5.2 (ID)   | 2154    | 1434    |

DM32 (ms):

| Version      | HW7  | HW16 | VP6  | VP12  | VP24  | VP36  |
|--------------|------|------|------|-------|-------|-------|
| 0.5.2 (ID)   | 2154 |      |      |       |       |       |
| 0.6.0 (Note) |      |      |      |       | 23773 |       |
| 0.6.2        |      |      | 7436 | 16017 | 34898 | 62012 |
| 0.6.4        | 1414 | 1719 | 6905 | 13720 | 32346 | 60259 |

DM42 (ms):

| Version      | HW7  | HW16 | VP6  | VP12  | VP24  | VP36  |
|--------------|------|------|------|-------|-------|-------|
| 0.5.2 (ID)   | 1434 |      |      |       |       |       |
| 0.6.0 (Note) |      |      |      |       | 17685 |       |
| 0.6.2        |      |      | 5842 | 10782 | 23714 | 42269 |
| 0.6.4        | 422  | 705  | 5623 | 10548 | 23811 | 42363 |

Note: Results for 0.6.0 are artificially good because intermediate computations
were not made with increased precision.
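The slowdown quoted for the 100000-loop SumTest run can be recomputed from the first table of this section. The following Python sketch is illustrative only; the dictionary layout and the `slowdown` helper are assumptions made for this example, while the timings are copied from that table.

```python
# Timings in milliseconds for 100000 loops, copied from the SumTest table.
TIMINGS_MS = {
    "DM32": {"0.6.0 (VP24)": 2377390, "0.5.2 (ID)": 215421},
    "DM42": {"0.6.0 (VP24)": 1768510, "0.5.2 (ID)": 143412},
}


def slowdown(device: str) -> float:
    """Ratio of variable-precision time to Intel Decimal time for one device."""
    row = TIMINGS_MS[device]
    return row["0.6.0 (VP24)"] / row["0.5.2 (ID)"]


for device in TIMINGS_MS:
    print(f"{device}: VP24 is {slowdown(device):.1f}x slower than ID")
```

This gives 11.0x on the DM32 and 12.3x on the DM42, consistent with the "roughly 10 times slower" estimate.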
## Drawing `sin X` with `FunctionPlot`

| Configuration   | DM32 ms    | DM42 ms    |
|-----------------|------------|------------|
| HW7             | 1869-2000  | 1681-1744  |
| HW16            | 1928-2067  | 1679-2060  |
| ID              | 2332-5140  |            |
| VP24            | 3683-6005  | 3377-3511  |
| VP36            | 6567-10186 | 4434-4709  |
| VP48            | 8377-10259 | 5964-6123  |

Crash at precision 3

<!--- !DMNONE --->