HAMT Performance Improvements #54

Draft: wants to merge 32 commits into base: dev-integrity

@ssoelvsten ssoelvsten commented Aug 21, 2025

So, I've been looking at the current state of the HAMT. For now, here are all of the associated cleanups I've made.

  • Makefiles
  • test/lib for Standard Library
  • Removing dead code and sorting through all HAMT-related files.
  • Moved the hash function into its own Standard Library module for separate testing and reuse.
  • Fixed all hash functions.
  • Cleaned up and reviewed the current HAMT.

Here is my roadmap to continue:

  • Make the List library use a namespaced style (so that it does not conflict with "overloaded" names from other parts of the standard library). `length` is still also exported into the global namespace.
  • Add Unit library - depends(ish) on #55 (Library is imported as a closure rather than an array)
  • Implement a separate StencilVector.trp (useful for separate testing)
  • Reimplement the HAMT in a Map.trp
  • Wrap the HAMT in a Set.trp
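
To make the planned split concrete, here is a rough sketch of what a stencil vector usually looks like: a bitmask records which logical slots are occupied, and the values are stored densely in slot order. This is only an illustration in Python, not the planned `StencilVector.trp`; the class name, method names, and the immutable-update style are all assumptions.

```python
class StencilVector:
    """Sparse immutable vector: `mask` marks occupied slots, `values` is dense."""

    def __init__(self, mask=0, values=()):
        self.mask = mask
        self.values = tuple(values)

    def _index(self, slot):
        # The dense position is the number of occupied slots below `slot`.
        return bin(self.mask & ((1 << slot) - 1)).count("1")

    def has(self, slot):
        return ((self.mask >> slot) & 1) == 1

    def get(self, slot, default=None):
        return self.values[self._index(slot)] if self.has(slot) else default

    def set(self, slot, value):
        # Return a new vector; overwrite if occupied, otherwise insert.
        i = self._index(slot)
        tail = self.values[i + 1:] if self.has(slot) else self.values[i:]
        return StencilVector(self.mask | (1 << slot),
                             self.values[:i] + (value,) + tail)


v = StencilVector().set(5, "a").set(2, "b")
print(v.get(5), v.get(2), v.get(3))
```

A HAMT node would then be such a vector indexed by a few bits of the key's hash, which is why testing it separately seems worthwhile.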

@ssoelvsten ssoelvsten changed the title Fork/dev integrity/hamt HAMT Performance Improvements Aug 21, 2025
These are not per se testing the runtime itself. Rather, they are checking that
the standard library actually is correct.

Furthermore, the 'hamt_stress.trp' and 'hamt_perf.trp' tests time how long
certain operations take. These tests break on other machines (and presumably
also on the same machine).
The git history already documents the fact that these functions are 'optimized'...
And once again, rename the make target so that it follows the convention of
matching the folder's name.
According to an SO post (https://stackoverflow.com/a/41537995/13300643),
the other value is a prime number pretty close to the original value. Yet,
using that one instead breaks many of the intended properties of the hash
function.
This is done by emulating Knuth's actual original multiplicative
hash function (fixed for data structures of size 2^31... or less).
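
For reference, a minimal sketch of Knuth-style multiplicative hashing for a table of 2^p slots. The commit messages do not spell out the constants, so the multiplier 2654435769 (that is, floor(2^32 / golden ratio)), the 32-bit word size, and the names `knuth_hash` and `table_bits` are assumptions for illustration; Python stands in for Troupe here.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1
# Assumed multiplier: floor(2**32 / golden ratio) = 0x9E3779B9.
KNUTH_MULTIPLIER = 2654435769


def knuth_hash(key: int, table_bits: int) -> int:
    """Map `key` to a slot in a table with 2**table_bits entries."""
    if not 0 < table_bits <= 31:
        raise ValueError("table_bits must be in 1..31")
    # Multiply modulo 2**32, then keep the top `table_bits` bits.
    product = (key * KNUTH_MULTIPLIER) & WORD_MASK
    return product >> (WORD_BITS - table_bits)


print([knuth_hash(k, 8) for k in range(4)])
```

The final shift is the important part: the high bits of the product depend on all bits of the key, whereas the low bits do not.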
This makes it larger than (most) ASCII characters. This should decrease
the number of collisions that happen with any regular string.
This also makes it use a Mersenne prime rather than just a power of two.
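
A sketch of the kind of string hash these two changes seem to describe: a polynomial hash whose base exceeds the ASCII range and whose modulus is a Mersenne prime. The concrete values (base 131, modulus 2^31 - 1) and the name `string_hash` are assumptions; the commits do not state which numbers were actually chosen.

```python
BASE = 131              # assumed base, larger than any ASCII code point (0..127)
MODULUS = (1 << 31) - 1  # 2**31 - 1, a Mersenne prime


def string_hash(text: str) -> int:
    """Polynomial hash: h = (h * BASE + code point) mod MODULUS per character."""
    h = 0
    for ch in text:
        h = (h * BASE + ord(ch)) % MODULUS
    return h


print(string_hash("ab"), string_hash("ba"))
```

With a base above the character range, two short ASCII strings cannot collide simply because one character's value spills into the next position.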
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch from 929de88 to ff53745 Compare August 21, 2025 08:53
As you can see, Claude, I got questions...
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch 3 times, most recently from d4f33ca to 78e0e98 Compare August 21, 2025 15:24
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch from 78e0e98 to b7736bd Compare August 22, 2025 14:09
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch from b7736bd to 49b21d0 Compare August 22, 2025 14:39
This way, if someone else has a `filter`, `map`, and so on, you can still
access these functions. Like in SML, the `length` function is exported
both as `List.length` and `length`.
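
The same idea, sketched in Python rather than Troupe since only the naming scheme matters here. The helper names and the use of `SimpleNamespace` are purely illustrative assumptions, not how the Troupe library is written.

```python
from types import SimpleNamespace


def _length(xs):
    return len(xs)


def _map(f, xs):
    return [f(x) for x in xs]


def _filter(p, xs):
    return [x for x in xs if p(x)]


# Everything lives under the `List` namespace...
List = SimpleNamespace(length=_length, map=_map, filter=_filter)
# ...and only `length` is additionally exported unqualified.
length = List.length

print(List.map(lambda x: x + 1, [1, 2, 3]), length([1, 2, 3]))
# Any other `map` in scope (here Python's builtin) stays reachable.
print(list(map(str, [1, 2])))
```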
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch from 49b21d0 to b7d14a1 Compare August 22, 2025 14:45
@ssoelvsten ssoelvsten force-pushed the fork/dev-integrity/hamt branch from 2d23590 to c7a2886 Compare August 22, 2025 14:58