Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: use Arc<str> and lasso for string interning #766

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

jrray
Copy link
Collaborator

@jrray jrray commented Jun 20, 2023

Here is some exploration of making the members of State take up less memory and/or faster to clone. It uses the Arc<str> technique for the values in OptionMap, since these values are deemed to have somewhat high cardinality, and uses lasso to do string interning for PkgNameBuf and friends, since these values should have relatively low cardinality.

Benchmarking and memory profiling are on my todo list, to see if this is worthwhile.

@codecov
Copy link

codecov bot commented Jun 20, 2023

Codecov Report

Attention: 19 lines in your changes are missing coverage. Please review.

Comparison is base (e8cf549) 53.55% compared to head (686d7be) 53.81%.
Report is 310 commits behind head on main.

❗ Current head 686d7be differs from pull request most recent head 97d8075. Consider uploading reports for the commit 97d8075 to get more accurate results

Files Patch % Lines
crates/parsedbuf/src/lib.rs 80.00% 5 Missing ⚠️
crates/spk-build/src/build/binary.rs 50.00% 3 Missing ⚠️
...rates/spk-schema/crates/foundation/src/name/mod.rs 71.42% 2 Missing ⚠️
crates/spk-schema/src/option.rs 80.00% 2 Missing ⚠️
crates/spk-solve/crates/solution/src/solution.rs 71.42% 2 Missing ⚠️
crates/spk-cli/common/src/flags.rs 50.00% 1 Missing ⚠️
crates/spk-schema/src/package.rs 0.00% 1 Missing ⚠️
crates/spk-schema/src/v0/spec.rs 87.50% 1 Missing ⚠️
crates/spk-solve/crates/graph/src/graph.rs 90.90% 1 Missing ⚠️
crates/spk-solve/src/solver.rs 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #766      +/-   ##
==========================================
+ Coverage   53.55%   53.81%   +0.26%     
==========================================
  Files         258      240      -18     
  Lines       20466    19116    -1350     
==========================================
- Hits        10960    10288     -672     
+ Misses       9506     8828     -678     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jrray
Copy link
Collaborator Author

jrray commented Jun 21, 2023

Here's what I'm seeing before and after using dhat solving the same requests.

dhat: Total:     6,200,590,152 bytes in 23,982,868 blocks
dhat: At t-gmax: 743,071,743 bytes in 6,981,444 blocks
dhat: At t-end:  175,296,359 bytes in 810,908 blocks
dhat: Total:     5,694,959,715 bytes in 15,450,695 blocks
dhat: At t-gmax: 540,982,583 bytes in 2,595,216 blocks
dhat: At t-end:  165,912,942 bytes in 693,749 blocks

The size difference is not massive here but the number of blocks went down a lot. t-gmax is the point during the run when the heap was the largest.

As for run speed differences, using an optimized build without any profiling enabled, so far I'm seeing roughly the same run times for either build. Lasso can perform a lot better if you can know you no longer need to intern any more strings, but we don't have a way to do that in the solver.

@rydrman
Copy link
Collaborator

rydrman commented Jun 25, 2023

Interesting - it's too bad that the performance doesn't get affected too much but the reduction in memory usage still seems pretty worth while

@rydrman
Copy link
Collaborator

rydrman commented Feb 14, 2024

From the meeting today:

  • leaving open as we want to come back to this soon and try to reduce the memory footprint of the solver

First in a series of commits to make this kind of change for values that
are part of `State` and will benefit greatly from faster cloning and lower
memory footprint. Cloning an Arc is an O(1) operation and the solver has
to make many clones of State as it progresses.

There is currently a lot of `.to_string()` used on these values but this
is expected to change into Arc::clone as other things get similarly
converted.

Signed-off-by: J Robert Ray <[email protected]>
Use the lasso crate to intern package name strings, so that any instances
of the same name share the same backing memory. These are also very cheap
to clone because it only clones the "Spur" (a u32).

The tradeoff is that there is some contention from using `ThreadedRodeo`
since we need to support accessing the backing store from multiple threads.

Signed-off-by: J Robert Ray <[email protected]>
@rydrman
Copy link
Collaborator

rydrman commented Feb 26, 2024

leaving this here for posterity: ratatui/ratatui#601

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants