GC Parallel Sweeping #45

Open: wants to merge 15 commits into base: master


@Hirevo (Owner) commented May 15, 2024

The idea for this PR came to me after realising that deallocating values is, strictly speaking, wasted time for the interpreter: it doesn't affect any of its computations.

So this PR adds a dedicated thread, owned by the GC heap, that receives the dead objects identified by the GC and deallocates them there rather than on the main interpreter thread, in the hope that the interpreter can then resume its work sooner.
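
As a rough illustration of the idea (not the code in this PR), here is a minimal sketch of a deallocation thread fed through a standard-library channel. `Object`, `BackgroundSweeper`, `sweep` and `shutdown` are hypothetical names chosen for this example, and it assumes the dead objects are `Send`, which may not hold for the interpreter's actual heap values.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a heap object; not a type from this PR.
#[allow(dead_code)]
struct Object {
    payload: Vec<u8>,
}

/// Owns a background thread whose only job is to drop what it receives.
struct BackgroundSweeper {
    sender: Option<Sender<Vec<Box<Object>>>>,
    worker: Option<JoinHandle<usize>>,
}

impl BackgroundSweeper {
    fn new() -> Self {
        let (sender, receiver) = mpsc::channel::<Vec<Box<Object>>>();
        let worker = thread::spawn(move || {
            let mut freed = 0;
            // The loop ends once every sender has been dropped.
            for batch in receiver {
                freed += batch.len();
                // The actual deallocation happens here, off the main thread.
                drop(batch);
            }
            freed
        });
        Self {
            sender: Some(sender),
            worker: Some(worker),
        }
    }

    /// Hands a batch of dead objects to the worker without waiting for it.
    fn sweep(&self, dead: Vec<Box<Object>>) {
        if let Some(sender) = &self.sender {
            // If the worker is gone, `send` returns the batch inside the
            // error, and it is dropped on the calling thread instead.
            let _ = sender.send(dead);
        }
    }

    /// Closes the channel, waits for the worker, and returns how many
    /// objects it freed.
    fn shutdown(mut self) -> usize {
        drop(self.sender.take());
        self.worker
            .take()
            .map(|worker| worker.join().unwrap_or(0))
            .unwrap_or(0)
    }
}

fn main() {
    let sweeper = BackgroundSweeper::new();
    let dead: Vec<Box<Object>> = (0..4)
        .map(|_| Box::new(Object { payload: vec![0; 16] }))
        .collect();
    sweeper.sweep(dead);
    println!("freed {} objects", sweeper.shutdown());
}
```

Sending dead objects in batches rather than one at a time keeps the number of channel operations on the interpreter thread low, but the hand-off is still not free, so whether it pays off has to be measured.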

I am not yet sure whether this is a performance improvement.

Depends on #33.

@Hirevo added the C-enhancement (Category: Enhancements), M-interpreter (Module: Interpreter), P-medium (Priority: Medium) and C-performance (Category: Performance improvements) labels May 15, 2024
@Hirevo self-assigned this May 15, 2024
@som-rs-benchmarker (bot) commented May 15, 2024

Here are the benchmark results for feat/parallel-sweeping (commit: 58d555f):

AST interpreter
+-----------------+----------------------------------------+-------------------------------+
| Benchmark       | master (base)                          | feat/parallel-sweeping (head) |
+-----------------+----------------------------------------+-------------------------------+
| Bounce          | 202.27 ms ± 10.28 (185.55..217.29)     | 0.86x ± 0.15 (0.68..1.04)     |
| BubbleSort      | 283.07 ms ± 36.41 (243.28..364.36)     | 1.03x ± 0.14 (0.96..1.11)     |
| DeltaBlue       | 168.45 ms ± 12.41 (148.39..186.87)     | 1.02x ± 0.17 (0.80..1.17)     |
| Dispatch        | 217.88 ms ± 43.39 (180.77..298.29)     | 1.06x ± 0.24 (0.93..1.25)     |
| Fannkuch        | 131.81 ms ± 11.62 (112.85..148.99)     | 0.99x ± 0.11 (0.90..1.09)     |
| Fibonacci       | 377.62 ms ± 20.36 (353.01..407.17)     | 0.97x ± 0.12 (0.81..1.07)     |
| FieldLoop       | 343.02 ms ± 40.38 (305.33..451.97)     | 1.00x ± 0.15 (0.82..1.14)     |
| GraphSearch     | 100.28 ms ± 20.33 (83.20..148.40)      | 1.21x ± 0.25 (1.15..1.29)     |
| IntegerLoop     | 369.28 ms ± 48.60 (315.29..454.61)     | 1.02x ± 0.16 (0.86..1.09)     |
| JsonSmall       | 223.71 ms ± 22.60 (191.88..255.13)     | 1.11x ± 0.13 (1.02..1.21)     |
| List            | 249.96 ms ± 10.54 (238.68..266.56)     | 0.96x ± 0.12 (0.75..1.08)     |
| Loop            | 432.63 ms ± 21.98 (403.24..467.37)     | 1.03x ± 0.11 (0.86..1.12)     |
| Mandelbrot      | 274.07 ms ± 18.59 (250.79..311.52)     | 1.04x ± 0.11 (0.87..1.15)     |
| NBody           | 230.91 ms ± 20.71 (201.71..256.89)     | 1.03x ± 0.11 (0.88..1.09)     |
| PageRank        | 335.60 ms ± 30.96 (292.55..379.90)     | 1.07x ± 0.13 (0.90..1.21)     |
| Permute         | 321.81 ms ± 29.21 (294.16..383.19)     | 0.99x ± 0.12 (0.83..1.07)     |
| Queens          | 259.25 ms ± 23.19 (221.09..299.67)     | 0.93x ± 0.14 (0.76..1.13)     |
| QuickSort       | 73.75 ms ± 2.75 (71.34..79.88)         | 0.91x ± 0.10 (0.76..1.02)     |
| Recurse         | 309.77 ms ± 38.69 (272.89..396.57)     | 1.04x ± 0.15 (0.92..1.12)     |
| Richards        | 4107.32 ms ± 102.28 (3941.51..4277.60) | 1.00x ± 0.04 (0.92..1.03)     |
| Sieve           | 427.53 ms ± 26.13 (396.18..476.53)     | 0.93x ± 0.12 (0.73..1.06)     |
| Storage         | 92.85 ms ± 8.05 (83.27..104.08)        | 1.04x ± 0.15 (0.81..1.17)     |
| Sum             | 170.19 ms ± 13.69 (156.92..202.00)     | 1.08x ± 0.11 (0.97..1.18)     |
| Towers          | 329.32 ms ± 25.30 (298.39..370.25)     | 1.02x ± 0.10 (0.92..1.12)     |
| TreeSort        | 166.31 ms ± 11.74 (151.23..187.77)     | 0.99x ± 0.11 (0.84..1.11)     |
| WhileLoop       | 389.74 ms ± 36.04 (345.53..439.52)     | 1.07x ± 0.11 (0.99..1.12)     |
|                 |                                        |                               |
| Average Speedup |               (baseline)               | 1.02x ± 0.03 (0.86..1.21)     |
+-----------------+----------------------------------------+-------------------------------+

The raw ReBench data files are available for download here: baseline and head

Bytecode interpreter
+-----------------+---------------------------------------+-------------------------------+
| Benchmark       | master (base)                         | feat/parallel-sweeping (head) |
+-----------------+---------------------------------------+-------------------------------+
| Bounce          | 73.49 ms ± 4.34 (67.67..81.48)        | 0.82x ± 0.13 (0.63..0.96)     |
| BubbleSort      | 110.80 ms ± 11.98 (98.67..137.42)     | 0.90x ± 0.13 (0.80..1.00)     |
| DeltaBlue       | 59.52 ms ± 4.19 (54.52..68.01)        | 0.70x ± 0.17 (0.51..0.87)     |
| Dispatch        | 91.81 ms ± 17.23 (73.43..122.87)      | 0.92x ± 0.21 (0.72..1.12)     |
| Fannkuch        | 48.19 ms ± 5.25 (43.91..57.86)        | 0.70x ± 0.21 (0.48..1.00)     |
| Fibonacci       | 144.44 ms ± 21.87 (129.45..196.34)    | 0.80x ± 0.14 (0.67..0.87)     |
| FieldLoop       | 162.46 ms ± 13.01 (149.47..187.32)    | 0.99x ± 0.17 (0.72..1.14)     |
| GraphSearch     | 33.76 ms ± 5.25 (30.31..48.26)        | 0.88x ± 0.20 (0.66..1.06)     |
| IntegerLoop     | 147.27 ms ± 9.56 (136.69..170.50)     | 0.83x ± 0.13 (0.66..0.97)     |
| JsonSmall       | 87.17 ms ± 5.96 (78.66..98.68)        | 0.91x ± 0.10 (0.76..1.04)     |
| List            | 111.58 ms ± 16.08 (96.90..138.33)     | 0.83x ± 0.18 (0.58..0.98)     |
| Loop            | 194.04 ms ± 14.88 (175.88..224.23)    | 0.92x ± 0.09 (0.83..1.02)     |
| Mandelbrot      | 115.69 ms ± 15.09 (101.48..150.91)    | 0.86x ± 0.17 (0.65..1.02)     |
| NBody           | 79.22 ms ± 5.52 (75.17..90.97)        | 0.79x ± 0.14 (0.58..0.97)     |
| PageRank        | 124.69 ms ± 5.97 (115.19..138.17)     | 0.79x ± 0.12 (0.64..1.02)     |
| Permute         | 116.94 ms ± 11.41 (104.57..144.25)    | 0.83x ± 0.18 (0.56..1.00)     |
| Queens          | 93.48 ms ± 6.70 (85.59..106.84)       | 0.90x ± 0.19 (0.60..1.05)     |
| QuickSort       | 33.56 ms ± 7.73 (27.08..44.85)        | 0.95x ± 0.24 (0.81..1.13)     |
| Recurse         | 122.43 ms ± 17.71 (108.85..171.40)    | 0.85x ± 0.18 (0.59..0.98)     |
| Richards        | 1473.83 ms ± 97.54 (1345.55..1691.55) | 0.75x ± 0.08 (0.65..0.85)     |
| Sieve           | 166.66 ms ± 14.13 (156.32..205.15)    | 0.83x ± 0.16 (0.58..0.96)     |
| Storage         | 32.92 ms ± 1.52 (31.30..36.09)        | 0.79x ± 0.14 (0.59..0.97)     |
| Sum             | 69.05 ms ± 3.58 (65.13..75.51)        | 0.83x ± 0.20 (0.59..1.02)     |
| Towers          | 130.08 ms ± 12.86 (119.07..162.76)    | 0.99x ± 0.16 (0.73..1.09)     |
| TreeSort        | 52.32 ms ± 7.40 (45.34..64.02)        | 0.94x ± 0.16 (0.83..1.09)     |
| WhileLoop       | 170.93 ms ± 9.25 (159.72..191.28)     | 0.86x ± 0.14 (0.62..0.99)     |
|                 |                                       |                               |
| Average Speedup |              (baseline)               | 0.85x ± 0.03 (0.70..0.99)     |
+-----------------+---------------------------------------+-------------------------------+

The raw ReBench data files are available for download here: baseline and head

The benchmarks were run using ReBench v1.2.0
The statistical analysis was done using rebench-tabler v0.1.0

The source code of this benchmark runner is available as a GitHub Gist, which gives more details about the setup.
