Inlining of (not all) control flow structures #39

OctaveLarose · 2024-01-29T13:54:21Z

Calls to:

ifTrue:
ifFalse:
ifTrue:ifFalse:
ifFalse:ifTrue:
whileTrue:
whileFalse:

...are no longer treated as method calls: the instructions of their block arguments are inlined into the enclosing scope. This relies on several JumpX bytecodes, which move the bytecode index backward or forward within the body of the method being executed.
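To illustrate the jump-based approach, here is a minimal, self-contained sketch of an interpreter loop. It is not the code from this PR: the `Value` and `Bytecode` types, the variant names, and the convention that jump offsets are relative to the jump instruction itself are all assumptions made up for the example, and the real som-rs bytecode set differs.

```rust
// Hypothetical, simplified value and bytecode types (illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
}

#[derive(Debug, Clone, Copy)]
enum Bytecode {
    PushInt(i64),
    PushBool(bool),
    PushLocal(usize),
    PopLocal(usize),
    Sub,
    GreaterThan,
    /// Unconditional jump forward by the given number of instructions.
    Jump(usize),
    /// Pops the top of the stack and jumps forward if it was `false`.
    JumpOnFalsePop(usize),
    /// Unconditional jump backward, used to close a loop.
    JumpBackward(usize),
}

fn pop_int(stack: &mut Vec<Value>) -> i64 {
    match stack.pop() {
        Some(Value::Int(n)) => n,
        other => panic!("expected an integer, got {:?}", other),
    }
}

/// Runs `code` and returns whatever is left on the stack.
fn run(code: &[Bytecode], locals: &mut [Value]) -> Vec<Value> {
    let mut stack: Vec<Value> = Vec::new();
    let mut pc = 0; // the bytecode index that the jumps modify
    while pc < code.len() {
        match code[pc] {
            Bytecode::PushInt(n) => stack.push(Value::Int(n)),
            Bytecode::PushBool(b) => stack.push(Value::Bool(b)),
            Bytecode::PushLocal(i) => stack.push(locals[i]),
            Bytecode::PopLocal(i) => locals[i] = stack.pop().expect("stack underflow"),
            Bytecode::Sub => {
                let rhs = pop_int(&mut stack);
                let lhs = pop_int(&mut stack);
                stack.push(Value::Int(lhs - rhs));
            }
            Bytecode::GreaterThan => {
                let rhs = pop_int(&mut stack);
                let lhs = pop_int(&mut stack);
                stack.push(Value::Bool(lhs > rhs));
            }
            Bytecode::Jump(offset) => {
                pc += offset;
                continue;
            }
            Bytecode::JumpOnFalsePop(offset) => match stack.pop() {
                Some(Value::Bool(false)) => {
                    pc += offset;
                    continue;
                }
                Some(Value::Bool(true)) => {}
                other => panic!("expected a boolean, got {:?}", other),
            },
            Bytecode::JumpBackward(offset) => {
                pc -= offset;
                continue;
            }
        }
        pc += 1;
    }
    stack
}

/// `cond ifTrue: [ 1 ] ifFalse: [ 2 ]` with both block bodies inlined.
fn inlined_if_true_if_false(cond: bool) -> Vec<Bytecode> {
    vec![
        Bytecode::PushBool(cond),    // 0: the receiver
        Bytecode::JumpOnFalsePop(3), // 1: on false, skip to 4
        Bytecode::PushInt(1),        // 2: body of the ifTrue: block
        Bytecode::Jump(2),           // 3: skip the ifFalse: body, to 5
        Bytecode::PushInt(2),        // 4: body of the ifFalse: block
    ]
}

/// `[ n > 0 ] whileTrue: [ n := n - 1 ]`, with `n` stored in local 0.
fn inlined_while_true() -> Vec<Bytecode> {
    vec![
        Bytecode::PushLocal(0),      // 0: condition block body starts here
        Bytecode::PushInt(0),        // 1
        Bytecode::GreaterThan,       // 2
        Bytecode::JumpOnFalsePop(6), // 3: on false, leave the loop (to 9)
        Bytecode::PushLocal(0),      // 4: loop block body starts here
        Bytecode::PushInt(1),        // 5
        Bytecode::Sub,               // 6
        Bytecode::PopLocal(0),       // 7
        Bytecode::JumpBackward(8),   // 8: back to the condition (to 0)
    ]
}
```

In this sketch, `run(&inlined_if_true_if_false(true), &mut [])` leaves `Value::Int(1)` on the stack without creating a block or performing a send, and running `inlined_while_true()` with local 0 set to `Value::Int(3)` ends with that local at `Value::Int(0)`.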

Some control flow structures are not inlined yet, e.g. to:do:, and:, etc. But I'm optimistic they will be easy to implement, since the logic they rely on should already be present in this PR.

I'm not sure what the speedup is yet, but it can be pushed further: the current implementation has a few shortcomings and design decisions I'm unsure of, usually marked by a TODO comment in my code. I especially dislike the change I had to make to som-interpreter-bc/src/block.rs. In short, it's almost ready for merging, and since it works on my machine (I may have missed things, though!) I figured it was time to at least open the PR to show that these changes exist.

I had started this work a year ago for fun and then dropped it. I picked it up recently since I was considering using som-rs for some research.

~~Also now that I'm typing all this out, I realize this PR also contains some specialized bytecode like PushConstant[0|1|2] or Send[1|2|3] which I also implemented a year ago...~~ (EDIT: I've removed those from this branch now) Additionally, I implemented the halt: primitive and removed the bytecode for it. Let me know if you want those changes to be proposed in their own PR.

Let me know what you think. I don't know when I'll do the finishing touches for it, but hopefully soon.

…ined blocks, not sure how to fix that yet

…d unused. I wish they were usable, but afaik it's not doable in Rust to replace matches with more clever static array lookups (at least not with enums that have optional arguments)

…e my life easier

… though: "self" in inlined blocks no longer points to the right value, so Bounce fails.

… (Mandelbrot infinite loop)

…e (had to make some compiler.rs data structures public)

…d moving some more logic to backpatch_jump() )

OctaveLarose · 2024-02-07T14:57:43Z

Just added and/or inlining, which is an extra 17% or so!
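For readers unfamiliar with the technique, here is a rough sketch of how `and:` and `or:` can be compiled to short-circuiting jumps instead of sends. Everything in it is an assumption for illustration: the tiny `Expr` AST, the `Dup`/`Pop`/`JumpOnFalsePop`/`JumpOnTruePop` bytecodes, and this particular `backpatch_jump` helper are hypothetical and not taken from the PR.

```rust
// Hypothetical AST: the right-hand side stands for the inlined block body.
enum Expr {
    Literal(bool),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Hypothetical bytecodes; jump offsets are relative to the jump itself.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bytecode {
    PushBool(bool),
    Dup,
    Pop,
    JumpOnFalsePop(usize),
    JumpOnTruePop(usize),
}

fn compile(expr: &Expr, code: &mut Vec<Bytecode>) {
    match expr {
        Expr::Literal(b) => code.push(Bytecode::PushBool(*b)),
        Expr::And(lhs, rhs) | Expr::Or(lhs, rhs) => {
            compile(lhs, code);
            // Keep a copy of the receiver: it is the result if we short-circuit.
            code.push(Bytecode::Dup);
            // The jump target is not known yet, so emit a placeholder offset.
            let jump_idx = code.len();
            code.push(match expr {
                Expr::And(..) => Bytecode::JumpOnFalsePop(0),
                _ => Bytecode::JumpOnTruePop(0),
            });
            // No short-circuit: drop the receiver and evaluate the block body.
            code.push(Bytecode::Pop);
            compile(rhs, code);
            backpatch_jump(code, jump_idx);
        }
    }
}

/// Rewrites the placeholder jump at `jump_idx` so it lands just past the
/// code emitted so far.
fn backpatch_jump(code: &mut [Bytecode], jump_idx: usize) {
    let offset = code.len() - jump_idx;
    code[jump_idx] = match code[jump_idx] {
        Bytecode::JumpOnFalsePop(_) => Bytecode::JumpOnFalsePop(offset),
        Bytecode::JumpOnTruePop(_) => Bytecode::JumpOnTruePop(offset),
        other => panic!("not a jump: {:?}", other),
    };
}

/// Evaluates compiled code and returns the boolean left on the stack.
fn run(code: &[Bytecode]) -> bool {
    let mut stack = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        match code[pc] {
            Bytecode::PushBool(b) => stack.push(b),
            Bytecode::Dup => {
                let top = *stack.last().expect("stack underflow");
                stack.push(top);
            }
            Bytecode::Pop => {
                stack.pop().expect("stack underflow");
            }
            Bytecode::JumpOnFalsePop(offset) => {
                if !stack.pop().expect("stack underflow") {
                    pc += offset;
                    continue;
                }
            }
            Bytecode::JumpOnTruePop(offset) => {
                if stack.pop().expect("stack underflow") {
                    pc += offset;
                    continue;
                }
            }
        }
        pc += 1;
    }
    stack.pop().expect("expression left no result")
}
```

With this scheme, `true and: [ false ]` compiles to `PushBool(true), Dup, JumpOnFalsePop(3), Pop, PushBool(false)`. A dedicated bytecode that tests the top of the stack without popping it would avoid the `Dup`/`Pop` pair; the sketch keeps them separate only to stay small.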

Thanks for fixing the benchmark runner. Also w.r.t the specialized bytecode branch, here it is: #40.

The CI for the ReBench benchmarks also fails due to a change you made to the rebench.conf file which references a local ReBenchDB instance that I would guess you used for local testing.
Excluding that change should fix that other issue as well.

Oops, my bad. Will fix that one.

It seems to simply be a matter of the benchmark parameters in run_benchmarks.sh (1 0 7) being inadequate for these two benchmarks.
These benchmarks always give me an error when 0 is their second argument.
Replacing that 0 with 1 fixes this issue and all benchmarks can run successfully.
I believe this argument specifies the number of inner iterations, which could explain why 0 causes issues.

Ah, that does make sense. I thought it was a bug, but it's nice that it doesn't seem to be.

som-rs-benchmarker · 2024-02-07T15:08:15Z

Here are the benchmark results for fixing-inlining (commit: 5efc26a):

AST interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 197.90 ms ± 14.83 (187.51..233.91)    | 1.02x ± 0.09 (0.92..1.07) |
| BubbleSort      | 268.79 ms ± 9.69 (255.11..286.55)     | 0.99x ± 0.12 (0.74..1.06) |
| DeltaBlue       | 158.85 ms ± 6.10 (152.72..171.58)     | 1.00x ± 0.07 (0.87..1.06) |
| Dispatch        | 179.55 ms ± 3.29 (176.31..187.99)     | 0.98x ± 0.04 (0.90..1.02) |
| Fannkuch        | 122.12 ms ± 1.83 (118.15..124.06)     | 1.00x ± 0.04 (0.90..1.04) |
| Fibonacci       | 359.55 ms ± 13.43 (342.85..386.45)    | 0.99x ± 0.05 (0.91..1.04) |
| FieldLoop       | 328.60 ms ± 15.04 (312.53..365.05)    | 1.00x ± 0.06 (0.92..1.03) |
| GraphSearch     | 81.35 ms ± 3.96 (76.33..89.09)        | 1.01x ± 0.05 (0.97..1.04) |
| IntegerLoop     | 321.87 ms ± 21.47 (305.80..381.31)    | 0.99x ± 0.10 (0.88..1.05) |
| JsonSmall       | 197.37 ms ± 2.98 (192.47..200.94)     | 1.00x ± 0.05 (0.89..1.04) |
| List            | 229.96 ms ± 4.54 (223.74..240.40)     | 0.99x ± 0.04 (0.93..1.04) |
| Loop            | 419.06 ms ± 14.74 (403.25..440.46)    | 0.97x ± 0.11 (0.75..1.05) |
| Mandelbrot      | 272.50 ms ± 33.26 (249.15..362.82)    | 1.05x ± 0.13 (1.01..1.08) |
| NBody           | 205.46 ms ± 9.63 (195.47..230.11)     | 0.96x ± 0.11 (0.75..1.04) |
| PageRank        | 300.68 ms ± 11.58 (287.36..327.14)    | 0.96x ± 0.10 (0.80..1.07) |
| Permute         | 298.78 ms ± 11.64 (283.59..325.52)    | 0.95x ± 0.07 (0.86..1.01) |
| Queens          | 245.45 ms ± 21.69 (222.66..290.28)    | 1.05x ± 0.10 (1.00..1.08) |
| QuickSort       | 77.56 ms ± 8.52 (69.61..93.84)        | 1.00x ± 0.13 (0.87..1.06) |
| Recurse         | 281.25 ms ± 27.84 (255.08..338.34)    | 0.99x ± 0.13 (0.86..1.06) |
| Richards        | 3902.50 ms ± 66.18 (3805.40..3999.52) | 1.00x ± 0.02 (0.97..1.03) |
| Sieve           | 412.96 ms ± 16.81 (397.45..454.38)    | 0.95x ± 0.08 (0.85..1.04) |
| Storage         | 85.78 ms ± 5.43 (81.06..98.70)        | 1.04x ± 0.07 (1.01..1.09) |
| Sum             | 159.85 ms ± 5.22 (153.12..171.07)     | 0.99x ± 0.05 (0.91..1.05) |
| Towers          | 306.93 ms ± 5.04 (297.17..313.27)     | 0.95x ± 0.06 (0.87..1.06) |
| TreeSort        | 157.22 ms ± 8.30 (149.39..178.15)     | 0.96x ± 0.09 (0.83..1.03) |
| WhileLoop       | 385.35 ms ± 25.15 (346.75..420.31)    | 1.05x ± 0.07 (1.00..1.10) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 0.99x ± 0.02 (0.95..1.05) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 86.37 ms ± 2.62 (83.12..90.94)        | 1.51x ± 0.06 (1.44..1.55) |
| BubbleSort      | 122.98 ms ± 11.72 (116.05..155.19)    | 1.54x ± 0.15 (1.52..1.58) |
| DeltaBlue       | 68.68 ms ± 8.19 (64.37..91.63)        | 1.58x ± 0.20 (1.47..1.67) |
| Dispatch        | 90.58 ms ± 8.14 (84.64..113.39)       | 1.76x ± 0.17 (1.68..1.82) |
| Fannkuch        | 55.63 ms ± 2.20 (53.08..60.93)        | 1.46x ± 0.07 (1.40..1.54) |
| Fibonacci       | 153.54 ms ± 6.30 (147.12..169.46)     | 1.22x ± 0.05 (1.19..1.24) |
| FieldLoop       | 206.22 ms ± 11.10 (195.29..230.15)    | 1.19x ± 0.07 (1.16..1.24) |
| GraphSearch     | 40.88 ms ± 4.19 (36.18..47.16)        | 1.46x ± 0.16 (1.36..1.53) |
| IntegerLoop     | 161.38 ms ± 9.68 (155.00..187.26)     | 1.92x ± 0.13 (1.76..1.97) |
| JsonSmall       | 107.67 ms ± 9.34 (98.78..123.48)      | 1.60x ± 0.14 (1.55..1.65) |
| List            | 120.99 ms ± 6.61 (114.37..134.00)     | 2.33x ± 0.13 (2.25..2.38) |
| Loop            | 213.23 ms ± 17.93 (200.77..259.24)    | 1.93x ± 0.17 (1.88..2.02) |
| Mandelbrot      | 151.35 ms ± 19.36 (132.02..187.29)    | 1.96x ± 0.26 (1.88..2.02) |
| NBody           | 91.30 ms ± 1.05 (89.83..92.78)        | 0.92x ± 0.04 (0.87..0.98) |
| PageRank        | 137.20 ms ± 4.26 (130.86..147.58)     | 1.64x ± 0.06 (1.59..1.69) |
| Permute         | 134.27 ms ± 10.75 (126.63..162.76)    | 1.25x ± 0.11 (1.16..1.28) |
| Queens          | 100.04 ms ± 2.76 (96.16..103.79)      | 1.30x ± 0.08 (1.15..1.37) |
| QuickSort       | 37.97 ms ± 3.37 (33.91..44.73)        | 2.10x ± 0.37 (1.48..2.32) |
| Recurse         | 132.91 ms ± 7.97 (122.55..149.42)     | 1.45x ± 0.09 (1.39..1.49) |
| Richards        | 1642.80 ms ± 58.31 (1578.67..1771.95) | 1.93x ± 0.11 (1.72..2.00) |
| Sieve           | 191.48 ms ± 4.70 (184.74..197.58)     | 1.83x ± 0.06 (1.78..1.89) |
| Storage         | 39.86 ms ± 4.94 (35.97..53.61)        | 1.48x ± 0.19 (1.40..1.53) |
| Sum             | 86.66 ms ± 5.75 (80.39..95.29)        | 2.01x ± 0.14 (1.93..2.12) |
| Towers          | 147.73 ms ± 11.61 (141.04..180.15)    | 1.23x ± 0.10 (1.18..1.29) |
| TreeSort        | 52.68 ms ± 2.36 (50.12..56.99)        | 2.39x ± 0.13 (2.22..2.51) |
| WhileLoop       | 206.70 ms ± 21.63 (196.36..267.95)    | 2.78x ± 0.31 (2.57..2.88) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.68x ± 0.03 (0.92..2.78) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head
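As a sanity check on how to read the summary row, the sketch below recomputes it from the per-benchmark factors in the bytecode interpreter table above. The reported 1.68x is consistent with a plain arithmetic mean of that column; whether rebench-tabler actually computes it this way is an assumption, and `summarize` is a hypothetical helper, not part of any tool mentioned here.

```rust
/// Speedup factors copied from the "fixing-inlining (head)" column of the
/// bytecode interpreter table for commit 5efc26a, in table order.
const BYTECODE_SPEEDUPS: [f64; 26] = [
    1.51, 1.54, 1.58, 1.76, 1.46, 1.22, 1.19, 1.46, 1.92, 1.60, 2.33, 1.93, 1.96,
    0.92, 1.64, 1.25, 1.30, 2.10, 1.45, 1.93, 1.83, 1.48, 2.01, 1.23, 2.39, 2.78,
];

/// Returns (arithmetic mean, minimum, maximum) of the given factors.
fn summarize(factors: &[f64]) -> (f64, f64, f64) {
    let mean = factors.iter().sum::<f64>() / factors.len() as f64;
    let min = factors.iter().copied().fold(f64::INFINITY, f64::min);
    let max = factors.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    (mean, min, max)
}
```

`summarize(&BYTECODE_SPEEDUPS)` gives a mean of about 1.68 with a range of 0.92 to 2.78, which matches the "Average Speedup" row.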

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup

smarr · 2024-02-07T15:08:20Z

Patches to the SOM benchmark harness or specific benchmarks to give more useful feedback are always welcome :)

And indeed, some benchmarks don't work with all possible parameters. And yeah, zero iterations should probably also produce a warning to the user, since it's likely an unintended error :)

(Especially since quite a while ago (multiple years, I believe) I changed the harness to not support warmup anymore, since I think it should be handled outside the harness and supporting it leads to unexpected recompilation by a JIT. Removing it also meant removing the middle 0 as command line parameter)

OctaveLarose · 2024-02-07T16:36:09Z

Yeah, thinking about it, I've no idea why I was telling it to do 0 inner iterations. I'll consider submitting some changes to the benchmark harness, since those errors confounded me for longer than I care to admit.

…p etc logic

…x_pop etc logic for jumps, breaks in some cases

som-rs-benchmarker · 2024-02-26T13:04:30Z

Here are the benchmark results for fixing-inlining (commit: 540956b):

AST interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 203.63 ms ± 11.22 (188.53..226.03)    | 1.05x ± 0.07 (1.00..1.12) |
| BubbleSort      | 273.65 ms ± 13.82 (257.39..300.76)    | 0.98x ± 0.08 (0.88..1.06) |
| DeltaBlue       | 166.25 ms ± 11.78 (151.26..191.08)    | 0.99x ± 0.10 (0.85..1.08) |
| Dispatch        | 191.57 ms ± 5.47 (184.40..203.21)     | 1.02x ± 0.04 (0.98..1.06) |
| Fannkuch        | 130.29 ms ± 6.62 (121.43..141.92)     | 1.04x ± 0.07 (0.96..1.09) |
| Fibonacci       | 401.24 ms ± 22.55 (377.96..440.94)    | 0.97x ± 0.09 (0.87..1.08) |
| FieldLoop       | 339.10 ms ± 15.55 (318.98..370.15)    | 0.95x ± 0.06 (0.87..1.01) |
| GraphSearch     | 86.05 ms ± 5.41 (75.22..94.72)        | 1.03x ± 0.08 (0.92..1.10) |
| IntegerLoop     | 344.69 ms ± 30.94 (323.66..415.97)    | 1.00x ± 0.10 (0.91..1.05) |
| JsonSmall       | 206.67 ms ± 11.67 (195.56..230.83)    | 1.00x ± 0.06 (0.94..1.04) |
| List            | 237.43 ms ± 9.61 (224.15..251.13)     | 1.01x ± 0.06 (0.91..1.04) |
| Loop            | 438.01 ms ± 12.68 (408.24..449.04)    | 1.00x ± 0.05 (0.91..1.04) |
| Mandelbrot      | 270.49 ms ± 16.19 (248.93..305.27)    | 1.01x ± 0.09 (0.87..1.09) |
| NBody           | 219.65 ms ± 17.07 (204.50..263.01)    | 1.02x ± 0.09 (0.96..1.07) |
| PageRank        | 309.15 ms ± 10.93 (296.94..330.60)    | 1.01x ± 0.08 (0.85..1.05) |
| Permute         | 302.77 ms ± 9.68 (290.83..317.44)     | 0.98x ± 0.07 (0.87..1.06) |
| Queens          | 238.01 ms ± 7.63 (229.75..251.26)     | 0.99x ± 0.05 (0.93..1.04) |
| QuickSort       | 75.31 ms ± 4.19 (70.91..85.24)        | 0.99x ± 0.06 (0.94..1.03) |
| Recurse         | 291.94 ms ± 17.51 (271.72..318.99)    | 0.94x ± 0.16 (0.66..1.08) |
| Richards        | 4004.43 ms ± 87.80 (3879.97..4202.34) | 1.01x ± 0.03 (0.97..1.03) |
| Sieve           | 432.97 ms ± 16.30 (416.13..457.13)    | 1.03x ± 0.05 (0.99..1.06) |
| Storage         | 89.37 ms ± 13.49 (79.72..122.48)      | 1.03x ± 0.16 (0.99..1.07) |
| Sum             | 176.27 ms ± 9.58 (163.34..192.01)     | 0.91x ± 0.12 (0.68..0.98) |
| Towers          | 329.40 ms ± 13.19 (307.82..348.45)    | 1.00x ± 0.06 (0.92..1.05) |
| TreeSort        | 164.72 ms ± 13.49 (153.94..193.50)    | 1.05x ± 0.11 (0.91..1.10) |
| WhileLoop       | 365.99 ms ± 23.19 (346.02..428.64)    | 0.97x ± 0.07 (0.90..1.02) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.00x ± 0.02 (0.91..1.05) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 77.43 ms ± 1.00 (75.29..78.86)        | 1.40x ± 0.10 (1.24..1.50) |
| BubbleSort      | 112.19 ms ± 2.72 (108.16..116.21)     | 1.33x ± 0.26 (0.91..1.62) |
| DeltaBlue       | 65.10 ms ± 2.48 (61.19..69.14)        | 1.54x ± 0.19 (1.22..1.71) |
| Dispatch        | 83.88 ms ± 5.12 (80.02..96.93)        | 1.65x ± 0.21 (1.31..1.85) |
| Fannkuch        | 50.95 ms ± 2.55 (48.13..56.51)        | 1.47x ± 0.10 (1.32..1.56) |
| Fibonacci       | 137.94 ms ± 9.68 (128.13..162.80)     | 1.03x ± 0.13 (0.87..1.16) |
| FieldLoop       | 184.46 ms ± 6.49 (175.14..195.70)     | 1.31x ± 0.13 (1.08..1.44) |
| GraphSearch     | 35.00 ms ± 1.30 (32.67..37.09)        | 1.40x ± 0.09 (1.26..1.49) |
| IntegerLoop     | 154.45 ms ± 13.99 (143.89..191.16)    | 1.87x ± 0.23 (1.56..2.04) |
| JsonSmall       | 103.63 ms ± 13.09 (87.82..130.67)     | 1.46x ± 0.30 (1.11..1.67) |
| List            | 108.50 ms ± 7.23 (99.88..121.64)      | 2.14x ± 0.22 (1.83..2.32) |
| Loop            | 185.98 ms ± 4.60 (180.35..195.73)     | 1.74x ± 0.16 (1.55..1.95) |
| Mandelbrot      | 123.02 ms ± 4.59 (117.60..131.81)     | 1.68x ± 0.19 (1.36..1.83) |
| NBody           | 88.33 ms ± 4.29 (83.50..98.35)        | 0.92x ± 0.11 (0.78..1.10) |
| PageRank        | 125.48 ms ± 2.62 (120.88..128.74)     | 1.65x ± 0.08 (1.48..1.73) |
| Permute         | 118.36 ms ± 11.18 (109.02..147.65)    | 1.15x ± 0.14 (1.02..1.28) |
| Queens          | 93.16 ms ± 2.20 (90.02..95.99)        | 1.16x ± 0.16 (0.87..1.31) |
| QuickSort       | 29.71 ms ± 0.92 (28.25..31.43)        | 1.75x ± 0.22 (1.38..2.01) |
| Recurse         | 113.02 ms ± 2.23 (108.61..116.02)     | 1.21x ± 0.12 (1.05..1.33) |
| Richards        | 1456.16 ms ± 23.35 (1414.57..1483.53) | 1.78x ± 0.09 (1.67..1.94) |
| Sieve           | 178.41 ms ± 6.56 (168.55..192.64)     | 1.81x ± 0.19 (1.49..1.94) |
| Storage         | 33.95 ms ± 2.24 (31.43..39.49)        | 1.41x ± 0.14 (1.23..1.58) |
| Sum             | 72.12 ms ± 1.39 (69.97..74.59)        | 1.60x ± 0.33 (1.14..1.97) |
| Towers          | 126.80 ms ± 10.65 (117.98..152.62)    | 1.13x ± 0.13 (1.00..1.26) |
| TreeSort        | 50.55 ms ± 2.50 (48.95..57.27)        | 2.16x ± 0.49 (1.50..2.57) |
| WhileLoop       | 180.50 ms ± 11.67 (168.51..204.81)    | 2.71x ± 0.69 (1.70..3.23) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.56x ± 0.05 (0.92..2.71) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup

OctaveLarose · 2024-02-29T14:58:08Z

Feel free to review that one as is. It's got a lot of stuff to improve upon, but I'd appreciate initial feedback since there are currently a number of design decisions I'm unsure about, and which I'm positive you'll identify and possibly find flaws in.

Sent you an email regarding the AST interpreter, btw!

…e need for ast_body in Block

…t except hash

som-rs-benchmarker · 2024-03-06T16:06:40Z

Here are the benchmark results for fixing-inlining (commit: fabf18d):

AST interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 199.56 ms ± 9.62 (189.46..224.31)     | 0.95x ± 0.12 (0.80..1.06) |
| BubbleSort      | 309.98 ms ± 25.34 (273.52..355.59)    | 1.12x ± 0.12 (0.96..1.20) |
| DeltaBlue       | 161.21 ms ± 5.75 (153.29..168.81)     | 0.97x ± 0.07 (0.84..1.08) |
| Dispatch        | 193.96 ms ± 10.65 (182.31..218.21)    | 0.90x ± 0.09 (0.77..1.01) |
| Fannkuch        | 136.70 ms ± 15.09 (122.43..174.32)    | 0.96x ± 0.17 (0.71..1.11) |
| Fibonacci       | 387.63 ms ± 25.70 (357.58..437.03)    | 0.99x ± 0.09 (0.92..1.07) |
| FieldLoop       | 354.22 ms ± 29.76 (321.50..425.78)    | 1.02x ± 0.11 (0.92..1.11) |
| GraphSearch     | 92.21 ms ± 14.11 (76.91..117.29)      | 1.11x ± 0.19 (0.95..1.17) |
| IntegerLoop     | 340.66 ms ± 18.64 (317.28..373.48)    | 1.00x ± 0.08 (0.91..1.06) |
| JsonSmall       | 211.00 ms ± 7.58 (199.75..225.69)     | 0.93x ± 0.13 (0.75..1.08) |
| List            | 237.81 ms ± 11.52 (226.64..259.04)    | 0.94x ± 0.09 (0.77..1.05) |
| Loop            | 428.96 ms ± 17.21 (404.18..450.58)    | 1.00x ± 0.05 (0.95..1.06) |
| Mandelbrot      | 264.80 ms ± 9.27 (252.33..283.28)     | 0.94x ± 0.07 (0.82..1.02) |
| NBody           | 251.62 ms ± 24.15 (217.85..289.16)    | 1.09x ± 0.15 (0.93..1.30) |
| PageRank        | 314.94 ms ± 7.16 (305.30..326.95)     | 1.00x ± 0.06 (0.88..1.06) |
| Permute         | 324.24 ms ± 39.92 (289.91..400.44)    | 1.04x ± 0.14 (0.94..1.09) |
| Queens          | 243.09 ms ± 10.64 (228.92..264.31)    | 1.00x ± 0.07 (0.89..1.07) |
| QuickSort       | 76.36 ms ± 4.18 (69.26..84.04)        | 0.95x ± 0.12 (0.82..1.07) |
| Recurse         | 274.84 ms ± 17.86 (256.20..317.90)    | 0.95x ± 0.10 (0.84..1.05) |
| Richards        | 4254.69 ms ± 80.26 (4138.73..4409.92) | 1.02x ± 0.04 (0.97..1.07) |
| Sieve           | 432.90 ms ± 11.16 (420.86..455.97)    | 0.96x ± 0.06 (0.89..1.07) |
| Storage         | 92.53 ms ± 7.13 (86.43..109.42)       | 1.06x ± 0.09 (1.00..1.12) |
| Sum             | 167.97 ms ± 6.13 (160.98..178.46)     | 0.97x ± 0.10 (0.78..1.08) |
| Towers          | 327.81 ms ± 26.01 (303.86..397.13)    | 0.98x ± 0.10 (0.90..1.07) |
| TreeSort        | 174.54 ms ± 11.23 (160.83..195.05)    | 1.04x ± 0.12 (0.88..1.17) |
| WhileLoop       | 380.97 ms ± 19.39 (361.74..412.88)    | 0.98x ± 0.08 (0.86..1.08) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 0.99x ± 0.02 (0.90..1.12) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 85.47 ms ± 11.04 (72.63..107.18)      | 1.39x ± 0.24 (1.14..1.61) |
| BubbleSort      | 122.21 ms ± 11.12 (107.59..140.19)    | 1.66x ± 0.22 (1.38..1.87) |
| DeltaBlue       | 61.43 ms ± 3.94 (57.27..70.13)        | 1.54x ± 0.13 (1.42..1.68) |
| Dispatch        | 85.59 ms ± 6.40 (78.96..99.00)        | 1.69x ± 0.17 (1.52..1.85) |
| Fannkuch        | 52.04 ms ± 2.71 (48.38..57.70)        | 1.62x ± 0.11 (1.52..1.75) |
| Fibonacci       | 156.41 ms ± 24.16 (136.36..217.60)    | 1.46x ± 0.35 (1.08..1.81) |
| FieldLoop       | 190.95 ms ± 11.67 (175.97..203.03)    | 1.14x ± 0.21 (0.84..1.35) |
| GraphSearch     | 41.07 ms ± 5.89 (31.99..49.31)        | 1.46x ± 0.27 (1.13..1.63) |
| IntegerLoop     | 165.39 ms ± 18.99 (146.00..199.17)    | 1.91x ± 0.25 (1.68..2.06) |
| JsonSmall       | 93.95 ms ± 4.44 (88.93..105.37)       | 1.38x ± 0.11 (1.27..1.54) |
| List            | 113.91 ms ± 15.81 (101.89..154.93)    | 2.35x ± 0.34 (2.20..2.46) |
| Loop            | 202.01 ms ± 25.51 (180.59..268.85)    | 1.89x ± 0.28 (1.57..2.02) |
| Mandelbrot      | 129.64 ms ± 14.47 (115.76..163.33)    | 1.75x ± 0.21 (1.64..1.85) |
| NBody           | 87.34 ms ± 6.35 (80.91..100.94)       | 1.17x ± 0.16 (0.93..1.28) |
| PageRank        | 129.20 ms ± 7.25 (122.66..148.22)     | 1.64x ± 0.11 (1.56..1.72) |
| Permute         | 134.59 ms ± 16.76 (119.72..163.20)    | 1.31x ± 0.25 (0.97..1.48) |
| Queens          | 99.13 ms ± 11.76 (90.38..129.35)      | 1.34x ± 0.20 (1.10..1.49) |
| QuickSort       | 29.87 ms ± 0.64 (28.81..30.82)        | 1.86x ± 0.11 (1.65..1.97) |
| Recurse         | 118.74 ms ± 7.74 (112.80..138.34)     | 1.27x ± 0.15 (1.04..1.41) |
| Richards        | 1483.59 ms ± 44.03 (1425.34..1565.25) | 1.91x ± 0.08 (1.83..2.01) |
| Sieve           | 181.77 ms ± 11.48 (167.36..203.24)    | 1.87x ± 0.18 (1.62..2.02) |
| Storage         | 43.24 ms ± 7.45 (36.04..56.21)        | 1.54x ± 0.38 (1.20..1.95) |
| Sum             | 85.97 ms ± 7.10 (76.82..98.95)        | 1.97x ± 0.30 (1.65..2.29) |
| Towers          | 124.95 ms ± 7.67 (115.16..142.56)     | 1.14x ± 0.13 (0.95..1.27) |
| TreeSort        | 50.74 ms ± 5.46 (44.98..62.00)        | 2.29x ± 0.52 (1.52..2.71) |
| WhileLoop       | 176.55 ms ± 10.14 (163.64..199.54)    | 3.04x ± 0.18 (2.90..3.08) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.68x ± 0.05 (1.14..3.04) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup

…lining returnnonlocal

…rf is unaffected, then this is fine

som-rs-benchmarker · 2024-03-08T18:10:47Z

Here are the benchmark results for fixing-inlining (commit: aabc4ac):

AST interpreter

+-----------------+----------------------------------------+---------------------------+
| Benchmark       | master (base)                          | fixing-inlining (head)    |
+-----------------+----------------------------------------+---------------------------+
| Bounce          | 207.04 ms ± 7.69 (196.22..220.04)      | 1.01x ± 0.08 (0.91..1.11) |
| BubbleSort      | 296.66 ms ± 19.98 (271.17..342.62)     | 1.11x ± 0.08 (1.05..1.15) |
| DeltaBlue       | 179.80 ms ± 23.57 (157.65..228.48)     | 1.00x ± 0.17 (0.85..1.14) |
| Dispatch        | 186.83 ms ± 7.92 (178.72..205.95)      | 0.92x ± 0.07 (0.82..1.00) |
| Fannkuch        | 130.18 ms ± 8.18 (121.27..146.00)      | 1.01x ± 0.08 (0.89..1.07) |
| Fibonacci       | 375.29 ms ± 20.13 (349.88..404.02)     | 0.96x ± 0.10 (0.84..1.07) |
| FieldLoop       | 341.64 ms ± 19.09 (320.57..386.67)     | 1.00x ± 0.08 (0.88..1.06) |
| GraphSearch     | 84.40 ms ± 5.62 (76.60..95.25)         | 0.84x ± 0.09 (0.71..0.92) |
| IntegerLoop     | 350.47 ms ± 22.82 (319.74..398.71)     | 0.96x ± 0.08 (0.91..1.03) |
| JsonSmall       | 208.59 ms ± 12.47 (194.41..233.51)     | 1.00x ± 0.09 (0.90..1.10) |
| List            | 264.01 ms ± 35.64 (236.57..340.39)     | 1.08x ± 0.16 (0.95..1.19) |
| Loop            | 418.31 ms ± 9.61 (403.90..438.02)      | 0.96x ± 0.07 (0.82..1.01) |
| Mandelbrot      | 267.12 ms ± 13.18 (248.89..288.69)     | 0.98x ± 0.08 (0.86..1.06) |
| NBody           | 205.14 ms ± 10.71 (192.77..226.92)     | 0.89x ± 0.08 (0.81..0.99) |
| PageRank        | 322.08 ms ± 17.64 (298.19..351.75)     | 1.05x ± 0.06 (1.00..1.08) |
| Permute         | 319.02 ms ± 13.70 (293.09..340.31)     | 1.00x ± 0.07 (0.92..1.09) |
| Queens          | 260.06 ms ± 18.76 (241.98..296.15)     | 1.06x ± 0.09 (1.00..1.11) |
| QuickSort       | 81.13 ms ± 5.32 (74.80..89.86)         | 0.98x ± 0.10 (0.83..1.07) |
| Recurse         | 289.95 ms ± 16.28 (270.04..316.96)     | 0.98x ± 0.12 (0.76..1.06) |
| Richards        | 4297.01 ms ± 127.46 (4060.14..4461.71) | 1.00x ± 0.05 (0.93..1.04) |
| Sieve           | 425.08 ms ± 18.56 (405.63..470.33)     | 0.97x ± 0.05 (0.92..1.01) |
| Storage         | 84.56 ms ± 5.73 (78.79..97.33)         | 0.95x ± 0.07 (0.90..1.00) |
| Sum             | 170.56 ms ± 6.78 (158.58..179.44)      | 0.97x ± 0.11 (0.77..1.07) |
| Towers          | 332.40 ms ± 27.09 (310.98..386.50)     | 0.99x ± 0.10 (0.90..1.06) |
| TreeSort        | 156.14 ms ± 9.90 (144.07..173.52)      | 0.93x ± 0.10 (0.80..1.02) |
| WhileLoop       | 378.81 ms ± 19.71 (359.67..426.19)     | 1.00x ± 0.07 (0.90..1.07) |
|                 |                                        |                           |
| Average Speedup |               (baseline)               | 0.98x ± 0.02 (0.84..1.11) |
+-----------------+----------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 81.58 ms ± 8.01 (71.19..96.00)        | 1.41x ± 0.21 (1.11..1.57) |
| BubbleSort      | 115.19 ms ± 6.01 (104.22..123.86)     | 1.56x ± 0.11 (1.41..1.65) |
| DeltaBlue       | 72.77 ms ± 12.87 (62.34..96.37)       | 1.90x ± 0.35 (1.76..2.04) |
| Dispatch        | 85.63 ms ± 7.69 (79.02..102.50)       | 1.70x ± 0.20 (1.42..1.85) |
| Fannkuch        | 49.52 ms ± 3.53 (46.78..57.83)        | 1.69x ± 0.14 (1.57..1.77) |
| Fibonacci       | 149.21 ms ± 17.21 (132.85..180.60)    | 1.71x ± 0.21 (1.60..1.81) |
| FieldLoop       | 191.39 ms ± 18.11 (174.15..227.76)    | 1.31x ± 0.14 (1.21..1.41) |
| GraphSearch     | 34.99 ms ± 1.42 (32.93..37.43)        | 1.44x ± 0.08 (1.33..1.50) |
| IntegerLoop     | 168.21 ms ± 25.52 (145.42..221.43)    | 2.06x ± 0.33 (1.90..2.19) |
| JsonSmall       | 96.86 ms ± 6.80 (88.68..112.98)       | 1.31x ± 0.17 (1.11..1.52) |
| List            | 114.03 ms ± 6.80 (104.83..123.93)     | 2.26x ± 0.20 (2.01..2.46) |
| Loop            | 200.52 ms ± 12.72 (183.73..225.27)    | 1.95x ± 0.17 (1.74..2.09) |
| Mandelbrot      | 124.08 ms ± 4.69 (116.32..132.52)     | 1.71x ± 0.12 (1.55..1.81) |
| NBody           | 86.20 ms ± 5.85 (81.36..101.69)       | 1.14x ± 0.11 (1.02..1.23) |
| PageRank        | 141.13 ms ± 15.99 (123.62..179.45)    | 1.83x ± 0.24 (1.59..1.95) |
| Permute         | 121.45 ms ± 9.94 (112.79..145.63)     | 1.21x ± 0.18 (0.93..1.35) |
| Queens          | 91.65 ms ± 2.77 (87.41..96.65)        | 1.30x ± 0.10 (1.10..1.41) |
| QuickSort       | 30.24 ms ± 1.21 (28.23..31.93)        | 1.82x ± 0.16 (1.60..2.14) |
| Recurse         | 124.52 ms ± 6.03 (115.23..132.19)     | 1.27x ± 0.14 (1.11..1.45) |
| Richards        | 1513.27 ms ± 46.42 (1455.53..1627.84) | 1.94x ± 0.11 (1.77..2.07) |
| Sieve           | 184.89 ms ± 17.00 (164.64..212.76)    | 1.81x ± 0.25 (1.45..2.03) |
| Storage         | 38.09 ms ± 7.22 (32.84..53.76)        | 1.64x ± 0.32 (1.47..1.73) |
| Sum             | 74.93 ms ± 8.12 (68.06..91.49)        | 1.79x ± 0.22 (1.61..1.94) |
| Towers          | 131.28 ms ± 10.27 (118.55..146.06)    | 1.15x ± 0.15 (0.95..1.30) |
| TreeSort        | 50.38 ms ± 2.10 (47.64..54.60)        | 1.96x ± 0.46 (1.42..2.53) |
| WhileLoop       | 184.17 ms ± 14.76 (168.28..218.88)    | 2.82x ± 0.35 (2.36..3.20) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.68x ± 0.04 (1.14..2.82) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup

…e comment as to why

som-rs-benchmarker · 2024-03-08T19:09:42Z

Here are the benchmark results for fixing-inlining (commit: 1a900aa):

AST interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 199.38 ms ± 12.62 (190.96..229.98)    | 0.96x ± 0.10 (0.82..1.05) |
| BubbleSort      | 291.37 ms ± 21.05 (259.04..327.35)    | 1.02x ± 0.09 (0.94..1.11) |
| DeltaBlue       | 180.14 ms ± 17.89 (161.50..206.15)    | 0.98x ± 0.12 (0.87..1.10) |
| Dispatch        | 194.75 ms ± 10.14 (179.26..216.72)    | 0.96x ± 0.09 (0.85..1.07) |
| Fannkuch        | 134.70 ms ± 7.97 (123.54..145.50)     | 1.00x ± 0.15 (0.75..1.11) |
| Fibonacci       | 391.27 ms ± 20.94 (366.21..431.76)    | 1.00x ± 0.08 (0.88..1.06) |
| FieldLoop       | 361.57 ms ± 28.21 (330.20..422.96)    | 1.01x ± 0.11 (0.88..1.11) |
| GraphSearch     | 89.82 ms ± 5.00 (82.03..100.80)       | 0.96x ± 0.14 (0.76..1.13) |
| IntegerLoop     | 353.36 ms ± 12.44 (332.16..371.69)    | 1.01x ± 0.09 (0.87..1.14) |
| JsonSmall       | 211.74 ms ± 9.81 (196.20..223.90)     | 0.98x ± 0.09 (0.85..1.07) |
| List            | 258.38 ms ± 21.50 (229.18..292.64)    | 1.05x ± 0.11 (0.93..1.11) |
| Loop            | 449.88 ms ± 28.01 (410.36..491.96)    | 1.02x ± 0.11 (0.88..1.11) |
| Mandelbrot      | 265.24 ms ± 7.54 (257.23..276.12)     | 0.99x ± 0.07 (0.84..1.07) |
| NBody           | 239.13 ms ± 27.62 (205.53..291.76)    | 1.04x ± 0.14 (0.92..1.13) |
| PageRank        | 322.47 ms ± 16.59 (300.63..344.84)    | 1.01x ± 0.06 (0.96..1.07) |
| Permute         | 327.38 ms ± 12.06 (316.34..352.58)    | 0.94x ± 0.13 (0.70..1.09) |
| Queens          | 251.20 ms ± 12.21 (224.34..264.79)    | 0.99x ± 0.07 (0.90..1.07) |
| QuickSort       | 73.30 ms ± 2.18 (70.21..76.86)        | 0.85x ± 0.09 (0.71..0.99) |
| Recurse         | 288.25 ms ± 23.93 (268.21..328.65)    | 0.99x ± 0.11 (0.88..1.08) |
| Richards        | 4236.13 ms ± 72.36 (4151.19..4340.03) | 1.01x ± 0.04 (0.96..1.07) |
| Sieve           | 452.65 ms ± 30.68 (410.63..513.71)    | 1.03x ± 0.08 (0.96..1.09) |
| Storage         | 105.57 ms ± 17.71 (84.86..138.94)     | 1.16x ± 0.21 (1.05..1.29) |
| Sum             | 180.13 ms ± 15.02 (160.70..206.88)    | 0.98x ± 0.14 (0.82..1.10) |
| Towers          | 337.90 ms ± 35.42 (309.24..419.27)    | 1.02x ± 0.12 (0.95..1.08) |
| TreeSort        | 165.32 ms ± 12.73 (149.81..185.65)    | 0.94x ± 0.13 (0.76..1.08) |
| WhileLoop       | 408.32 ms ± 34.27 (382.39..499.33)    | 1.03x ± 0.12 (0.87..1.10) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.00x ± 0.02 (0.85..1.16) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 76.14 ms ± 2.04 (73.63..80.37)        | 1.28x ± 0.13 (1.05..1.41) |
| BubbleSort      | 117.16 ms ± 5.39 (111.59..124.76)     | 1.47x ± 0.22 (1.11..1.66) |
| DeltaBlue       | 63.57 ms ± 5.59 (57.78..75.73)        | 1.70x ± 0.19 (1.49..1.82) |
| Dispatch        | 86.57 ms ± 11.62 (80.82..118.98)      | 1.64x ± 0.24 (1.47..1.78) |
| Fannkuch        | 55.70 ms ± 7.51 (48.98..67.46)        | 1.82x ± 0.31 (1.47..2.03) |
| Fibonacci       | 140.65 ms ± 7.50 (134.90..159.63)     | 1.55x ± 0.14 (1.31..1.64) |
| FieldLoop       | 206.00 ms ± 26.04 (177.78..268.50)    | 1.38x ± 0.21 (1.15..1.55) |
| GraphSearch     | 38.31 ms ± 4.31 (34.37..45.90)        | 1.54x ± 0.18 (1.42..1.61) |
| IntegerLoop     | 151.91 ms ± 7.50 (144.32..168.03)     | 1.80x ± 0.14 (1.58..1.92) |
| JsonSmall       | 98.96 ms ± 7.98 (90.35..116.32)       | 1.31x ± 0.27 (0.94..1.56) |
| List            | 105.70 ms ± 2.62 (101.79..110.73)     | 2.10x ± 0.14 (1.84..2.32) |
| Loop            | 208.23 ms ± 25.39 (185.14..261.12)    | 1.87x ± 0.26 (1.66..2.06) |
| Mandelbrot      | 129.45 ms ± 10.02 (117.76..151.10)    | 1.68x ± 0.17 (1.50..1.80) |
| NBody           | 91.56 ms ± 9.11 (81.89..105.14)       | 1.21x ± 0.15 (1.03..1.28) |
| PageRank        | 138.89 ms ± 10.11 (125.71..151.84)    | 1.63x ± 0.31 (1.23..1.89) |
| Permute         | 127.85 ms ± 11.90 (115.90..149.90)    | 1.30x ± 0.23 (0.95..1.48) |
| Queens          | 103.22 ms ± 16.16 (88.80..142.50)     | 1.51x ± 0.29 (1.18..1.62) |
| QuickSort       | 30.77 ms ± 1.03 (29.37..32.68)        | 2.04x ± 0.18 (1.65..2.14) |
| Recurse         | 126.93 ms ± 15.28 (108.70..155.74)    | 1.45x ± 0.21 (1.18..1.52) |
| Richards        | 1532.81 ms ± 56.49 (1466.77..1622.98) | 2.05x ± 0.12 (1.89..2.15) |
| Sieve           | 180.25 ms ± 6.94 (173.60..193.45)     | 1.76x ± 0.17 (1.47..1.99) |
| Storage         | 34.38 ms ± 2.09 (31.94..38.05)        | 1.20x ± 0.24 (0.97..1.63) |
| Sum             | 87.63 ms ± 9.18 (76.92..106.35)       | 1.90x ± 0.40 (1.39..2.26) |
| Towers          | 144.88 ms ± 22.65 (117.11..179.31)    | 1.42x ± 0.24 (1.25..1.50) |
| TreeSort        | 55.88 ms ± 4.88 (52.63..67.78)        | 2.91x ± 0.29 (2.58..3.09) |
| WhileLoop       | 190.27 ms ± 15.92 (170.66..209.23)    | 2.98x ± 0.43 (2.47..3.32) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.71x ± 0.05 (1.20..2.98) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup
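For readers checking the summary row: in the bytecode interpreter table directly above, the "Average Speedup" figure is consistent with a plain arithmetic mean of the 26 per-benchmark speedups, and the parenthesized range is consistent with their minimum and maximum. The sketch below reproduces that row under this assumption. The source does not state how rebench-tabler actually computes it, and the names `SpeedupSummary`, `summarize` and `BYTECODE_SPEEDUPS` are hypothetical, introduced only for illustration.

```rust
/// Summary of a column of per-benchmark speedup factors.
/// (Hypothetical type, not part of som-rs or rebench-tabler.)
#[derive(Debug)]
struct SpeedupSummary {
    mean: f64,
    min: f64,
    max: f64,
}

/// Arithmetic mean, minimum and maximum of the given speedups.
/// Assumption: this mirrors the "Average Speedup" row; the real tool may differ.
fn summarize(speedups: &[f64]) -> Option<SpeedupSummary> {
    if speedups.is_empty() {
        return None;
    }
    let sum: f64 = speedups.iter().sum();
    let min = speedups.iter().copied().fold(f64::INFINITY, f64::min);
    let max = speedups.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(SpeedupSummary {
        mean: sum / speedups.len() as f64,
        min,
        max,
    })
}

/// Speedup column of the bytecode interpreter table above, Bounce through WhileLoop.
const BYTECODE_SPEEDUPS: [f64; 26] = [
    1.28, 1.47, 1.70, 1.64, 1.82, 1.55, 1.38, 1.54, 1.80, 1.31, 2.10, 1.87, 1.68,
    1.21, 1.63, 1.30, 1.51, 2.04, 1.45, 2.05, 1.76, 1.20, 1.90, 1.42, 2.91, 2.98,
];

fn main() {
    let summary = summarize(&BYTECODE_SPEEDUPS).expect("table is non-empty");
    // Prints "1.71x (1.20..2.98)", matching the table's summary row.
    println!("{:.2}x ({:.2}..{:.2})", summary.mean, summary.min, summary.max);
}
```

The "± 0.05" part of the row is not reproduced here, since the per-run samples it would be derived from are only in the raw ReBench data files.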

som-rs-benchmarker · 2024-03-08T19:16:33Z

Here are the benchmark results for fixing-inlining (commit: 5bc87a7):

AST interpreter

+-----------------+----------------------------------------+---------------------------+
| Benchmark       | master (base)                          | fixing-inlining (head)    |
+-----------------+----------------------------------------+---------------------------+
| Bounce          | 210.46 ms ± 10.55 (192.24..230.97)     | 1.07x ± 0.07 (0.97..1.14) |
| BubbleSort      | 277.06 ms ± 17.25 (260.20..313.41)     | 1.02x ± 0.08 (0.94..1.07) |
| DeltaBlue       | 172.86 ms ± 19.95 (153.51..212.52)     | 1.02x ± 0.16 (0.84..1.12) |
| Dispatch        | 204.12 ms ± 16.61 (183.65..229.23)     | 1.06x ± 0.10 (0.94..1.12) |
| Fannkuch        | 142.41 ms ± 21.49 (124.30..199.25)     | 1.11x ± 0.18 (0.98..1.20) |
| Fibonacci       | 378.01 ms ± 13.26 (353.49..401.68)     | 1.01x ± 0.05 (0.93..1.07) |
| FieldLoop       | 339.96 ms ± 8.89 (328.21..352.99)      | 0.97x ± 0.05 (0.91..1.04) |
| GraphSearch     | 85.69 ms ± 3.90 (80.10..91.61)         | 1.01x ± 0.07 (0.89..1.07) |
| IntegerLoop     | 337.92 ms ± 16.49 (318.94..365.94)     | 1.03x ± 0.08 (0.93..1.09) |
| JsonSmall       | 216.26 ms ± 10.06 (203.71..231.25)     | 1.04x ± 0.07 (0.93..1.09) |
| List            | 251.13 ms ± 15.59 (235.77..290.41)     | 1.04x ± 0.08 (0.96..1.10) |
| Loop            | 434.79 ms ± 27.44 (402.58..484.10)     | 0.98x ± 0.08 (0.91..1.03) |
| Mandelbrot      | 272.17 ms ± 17.85 (256.13..309.56)     | 1.01x ± 0.09 (0.91..1.07) |
| NBody           | 232.25 ms ± 14.97 (208.39..258.28)     | 1.03x ± 0.10 (0.91..1.10) |
| PageRank        | 318.30 ms ± 19.59 (282.55..345.04)     | 1.04x ± 0.08 (0.96..1.10) |
| Permute         | 318.55 ms ± 25.28 (292.26..379.68)     | 1.03x ± 0.10 (0.94..1.10) |
| Queens          | 254.63 ms ± 29.54 (233.67..323.61)     | 1.03x ± 0.14 (0.92..1.11) |
| QuickSort       | 76.95 ms ± 5.39 (72.17..91.16)         | 0.99x ± 0.08 (0.92..1.02) |
| Recurse         | 285.30 ms ± 20.55 (263.82..334.33)     | 1.04x ± 0.09 (0.96..1.08) |
| Richards        | 4237.34 ms ± 134.19 (4041.86..4432.84) | 1.01x ± 0.04 (0.98..1.03) |
| Sieve           | 450.02 ms ± 22.70 (424.15..482.46)     | 0.99x ± 0.07 (0.91..1.09) |
| Storage         | 88.56 ms ± 5.70 (83.25..103.13)        | 1.02x ± 0.10 (0.90..1.10) |
| Sum             | 174.12 ms ± 12.60 (163.40..200.76)     | 1.00x ± 0.11 (0.85..1.10) |
| Towers          | 321.43 ms ± 13.69 (298.82..346.01)     | 1.00x ± 0.07 (0.91..1.06) |
| TreeSort        | 166.17 ms ± 14.17 (149.82..190.82)     | 1.06x ± 0.10 (1.01..1.10) |
| WhileLoop       | 388.97 ms ± 19.99 (363.76..427.57)     | 0.98x ± 0.10 (0.80..1.06) |
|                 |                                        |                           |
| Average Speedup |               (baseline)               | 1.02x ± 0.02 (0.97..1.11) |
+-----------------+----------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter

+-----------------+---------------------------------------+---------------------------+
| Benchmark       | master (base)                         | fixing-inlining (head)    |
+-----------------+---------------------------------------+---------------------------+
| Bounce          | 88.59 ms ± 19.31 (77.26..141.62)      | 1.60x ± 0.41 (1.16..1.74) |
| BubbleSort      | 117.18 ms ± 13.02 (107.81..151.11)    | 1.67x ± 0.19 (1.60..1.73) |
| DeltaBlue       | 67.47 ms ± 12.38 (58.22..98.02)       | 1.74x ± 0.46 (1.14..1.88) |
| Dispatch        | 83.42 ms ± 3.41 (79.02..91.90)        | 1.63x ± 0.17 (1.34..1.78) |
| Fannkuch        | 53.99 ms ± 6.31 (47.70..67.43)        | 1.64x ± 0.29 (1.32..1.85) |
| Fibonacci       | 147.74 ms ± 12.91 (136.30..176.41)    | 1.61x ± 0.18 (1.43..1.74) |
| FieldLoop       | 188.34 ms ± 8.66 (175.75..201.44)     | 1.31x ± 0.12 (1.15..1.47) |
| GraphSearch     | 37.29 ms ± 5.06 (33.22..50.90)        | 1.36x ± 0.37 (0.93..1.63) |
| IntegerLoop     | 167.35 ms ± 19.61 (146.20..198.50)    | 1.99x ± 0.27 (1.77..2.20) |
| JsonSmall       | 112.00 ms ± 16.69 (93.32..139.64)     | 1.60x ± 0.29 (1.31..1.85) |
| List            | 118.29 ms ± 18.48 (100.08..166.65)    | 2.37x ± 0.49 (1.71..2.60) |
| Loop            | 208.54 ms ± 12.68 (197.75..235.71)    | 1.89x ± 0.25 (1.46..2.11) |
| Mandelbrot      | 127.48 ms ± 15.79 (118.63..171.41)    | 1.79x ± 0.25 (1.61..1.94) |
| NBody           | 88.59 ms ± 6.37 (83.09..100.84)       | 1.18x ± 0.09 (1.13..1.23) |
| PageRank        | 135.51 ms ± 18.11 (121.94..171.36)    | 1.71x ± 0.28 (1.38..1.84) |
| Permute         | 118.04 ms ± 2.72 (113.81..121.88)     | 1.23x ± 0.10 (1.11..1.33) |
| Queens          | 99.86 ms ± 11.11 (89.90..128.60)      | 1.43x ± 0.23 (1.16..1.56) |
| QuickSort       | 32.38 ms ± 5.38 (27.92..46.52)        | 1.93x ± 0.42 (1.46..2.26) |
| Recurse         | 130.52 ms ± 17.93 (110.79..167.61)    | 1.38x ± 0.24 (1.14..1.54) |
| Richards        | 1530.31 ms ± 54.64 (1432.74..1601.48) | 1.97x ± 0.09 (1.91..2.11) |
| Sieve           | 190.44 ms ± 20.74 (173.85..241.78)    | 2.04x ± 0.23 (2.00..2.11) |
| Storage         | 38.93 ms ± 5.85 (32.94..52.98)        | 1.54x ± 0.32 (1.16..1.79) |
| Sum             | 77.52 ms ± 6.67 (69.81..93.95)        | 1.98x ± 0.18 (1.88..2.06) |
| Towers          | 127.29 ms ± 6.29 (120.00..138.15)     | 1.15x ± 0.12 (1.02..1.26) |
| TreeSort        | 54.58 ms ± 10.20 (47.42..76.89)       | 2.46x ± 0.56 (1.96..2.80) |
| WhileLoop       | 177.23 ms ± 5.85 (167.26..184.88)     | 2.95x ± 0.27 (2.43..3.23) |
|                 |                                       |                           |
| Average Speedup |              (baseline)               | 1.74x ± 0.06 (1.15..2.95) |
+-----------------+---------------------------------------+---------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist for more details about the setup

OctaveLarose added 30 commits December 10, 2022 17:32

Updated rebench config

a084c30

adding bytecodes send1,2,3

63d2d6c

supersend1-2-3 bytecodes

bf0e77e

Option type instead of relying like u8::MAX like a maniac

9c3eb96

push 0, 1, nil

bb335ad

push constant 0, 1, 2

3389a7b

non functional ifTrue inlining, with new JumpOnFalseTopNil opcode

5eb4f4d

progress with ifTrue inlining, fixed some bugs but still not functional

d67c0c0

very WIP commit but getting closer to successful inlining

30cbb0b

functional ifTrue inlining at least for running Mandelbrot!

3536084

Minor cleanups and fixed it for Bounce at least

01bdda9

Basic bash script for running benchmarks

38a331e

Fixed other benchmarks. Known issue: breaks variable shadowing in inlined blocks, not sure how to fix that yet

8118773

Inlining ifFalse:, and the associate opcode JumpOnTrueTopNil

0089092

Removed bytecodes NAMES and PADDED_NAMES, since they were annoying and unused. I wish they were usable, but afaik it's not doable in Rust to replace matches with more clever static array lookups (at least not with enums that have optional arguments)

64769c1

start of ifTrue:ifFalse: inlining, and some slight refactoring to make my life easier

06fabe5

successful ifTrue:ifFalse: and ifFalse:ifTrue: inlining. Known issue, though: "self" in inlined blocks no longer points to the right value, so Bounce fails.

016e151

Fixed the self bug, was an oversight

0504ad6

Improved some panic messages and left a TODO

ad99e99

Removed an old TODO and slightly optimized binary ops

9840c01

Had forgotten to inline ifFalse:ifTrue:, oops.

fdbb7b5

Working towards inlining whileTrue:

0e45b1a

Successfully inlining one whileTrue in Bounce

aab089d

Inlining every ifTrue makes Bounce not crash at least

e3a1a31

Progress with whileTrue, seems to occasionally make infinite loops

bda7fd7

added whileFalse: but still haven't fixed some issues with whileTrue: (Mandelbrot infinite loop)

9c3ed08

Wrote an inliner module because inlining code was getting too sizeable (had to make some compiler.rs data structures public)

23f6132

Setting the stage for patching inner blocks during inlining

3e29715

fixed mandelbrot infinite loop, but not Json

6f04062

Minor refactoring in inlining code for slightly more clarity (included moving some more logic to backpatch_jump() )

052c831

some more advanced inlining tests

dd6b338

Hirevo added the M-interpreter (Module: Interpreter), P-medium (Priority: Medium), and C-performance (Category: Performance improvements) labels Feb 7, 2024

yuria2'ing

b452df9

OctaveLarose mentioned this pull request Feb 14, 2024

Specialized bytecodes #40

Merged

OctaveLarose added 4 commits February 21, 2024 11:34

BROKEN: ongoing merge with specialized bc branch

25b7f27

BROKEN: ongoing merge with specialized bc branch. Lacking dup_popx_pop etc logic

fb7ea67

BROKEN: ongoing merge with specialized bc branch. Implemented dup_popx_pop etc logic for jumps, breaks in some cases

7133fd9

functional merge with specialized BC branch

540956b

OctaveLarose added 4 commits March 6, 2024 10:47

ongoing work: adapt_block_after_outer_inlined(). which will remove the need for ast_body in Block

1868393

fixed disassembler bug related to push1

c06e05a

Progress with adapt_block_after_outer_inlined. passes every basic test except hash

355675e

block adaptation after inlining: seems functional, actually!

fabf18d

OctaveLarose added 3 commits March 8, 2024 17:08

test to see if perf is affected (probably not): changing rules for inlining returnnonlocal

6ee323c

another test for inlining returnnonlocal (deltablue disabled!). if perf is inaffected, then this is fine

8ba8c58

reactivated tests, nonlocal case is fine

aabc4ac

returning to the old "only nonlocal ret" strat, and added an expansive comment as to why

1a900aa

formatting pass too

5bc87a7

Hirevo self-assigned this Apr 22, 2024
