Add Hashtable and LongHashingUtils utilities #11409
Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 21 commits on May 20, 2026
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a534e4f4f4
dougqh
commented
May 19, 2026
  public static final int hash(Object obj0, Object obj1, Object obj2, Object obj3, Object obj4) {
-   return hash(hashCode(obj0), hashCode(obj1), hashCode(obj2), hashCode(obj3));
+   return hash(hashCode(obj0), hashCode(obj1), hashCode(obj2), hashCode(obj3), hashCode(obj4));
sarahchen6
reviewed
May 20, 2026
… create()
Replace Support.MAX_RATIO_NUMERATOR / _DENOMINATOR with a single float MAX_RATIO constant, and add a Support.create(int, float) overload that takes a scale factor. Callers now write Support.create(n, MAX_RATIO) instead of stitching together the int arithmetic at the call site.
The scaled size is truncated (not ceiled) before going through sizeFor. sizeFor already rounds up to the next power of two, so truncation just absorbs float fuzz that would otherwise push a result like 12 * 4/3 = 16.0000005f past 16 and double the bucket array size for no reason.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
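A minimal sketch of the truncate-then-round sizing this commit describes. The class name, the `truncated` / `ceiled` helpers, and this `sizeFor` body are illustrative assumptions, not the PR's code. The scaled value is passed in directly (one float step above 16) so the example does not depend on how any particular float product happens to round.

```java
final class ScaledSizeSketch {
  // Hypothetical stand-in for Support.sizeFor: round up to the next power of two.
  // Overflow handling and the MAX_BUCKETS cap are deliberately left out of this sketch.
  static int sizeFor(int requestedSize) {
    if (requestedSize <= 1) {
      return 1;
    }
    return Integer.highestOneBit(requestedSize - 1) << 1;
  }

  // Truncate the scaled size, then round up to a power of two.
  static int truncated(float scaled) {
    return sizeFor((int) scaled);
  }

  // Ceil the scaled size first; any fuzz just above a power of two doubles the result.
  static int ceiled(float scaled) {
    return sizeFor((int) Math.ceil(scaled));
  }

  public static void main(String[] args) {
    float fuzzy = Math.nextUp(16.0f); // smallest float strictly greater than 16
    System.out.println(truncated(fuzzy) + " buckets vs " + ceiled(fuzzy) + " buckets");
  }
}
```

With a value barely above 16, truncation keeps the request at 16 buckets, while ceiling bumps it to 17 and the power-of-two rounding then lands on 32.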
Five small cleanups from a design re-review pass:
1. Support javadoc: drop the stale "methods are package-private" sentence; most of them were made public in earlier commits for higher-arity callers. Also drop the "nested BucketIterator" framing (iterators are peers of Support inside Hashtable, not nested inside Support).
2. MAX_RATIO javadoc: drop the Math.ceil recommendation; create(int, float) deliberately truncates and is the canonical pathway.
3. Document the null-hash treatment on D1.Entry.hash and D2.Entry.hash so the behavior difference is explicit: D1 uses Long.MIN_VALUE as a sentinel that's collision-free against any int-valued hashCode(); D2 has no such sentinel and relies on matches() to resolve null/null vs hash-0 collisions.
4. Rename Support.MAX_CAPACITY -> MAX_BUCKETS and sizeFor's parameter to requestedSize. The cap is on the bucket-array length, not entry count; the new name reflects that. Error messages updated to match.
5. Drop the `abstract` modifier on Hashtable in favor of `final` with a private constructor. Nothing actually subclasses Hashtable -- the abstract was a namespace device that read as "intended for extension."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
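To illustrate item 3, here is a rough sketch of how a two-key entry can lean on matches() when a (null, null) key pair and a pair of hash-0 keys land on the same hash. The combiner (`31 * h1 + h2` with null contributing 0), the class name, and the field names are assumptions made for this example; the PR's D2 may combine hashes differently.

```java
import java.util.Objects;

final class TwoKeyMatchSketch {
  // Assumed combiner: a null key contributes 0, exactly like a key whose hashCode() is 0.
  static long hash(Object k1, Object k2) {
    int h1 = (k1 == null) ? 0 : k1.hashCode();
    int h2 = (k2 == null) ? 0 : k2.hashCode();
    return 31L * h1 + h2;
  }

  static final class Entry {
    final long hash;
    final Object key1;
    final Object key2;

    Entry(Object key1, Object key2) {
      this.hash = TwoKeyMatchSketch.hash(key1, key2);
      this.key1 = key1;
      this.key2 = key2;
    }

    // The hash comparison is only a fast filter; key equality settles collisions.
    boolean matches(long hash, Object k1, Object k2) {
      return this.hash == hash && Objects.equals(key1, k1) && Objects.equals(key2, k2);
    }
  }

  public static void main(String[] args) {
    Entry nullNull = new Entry(null, null);
    // "" has hashCode() 0, so ("", "") collides with (null, null) under this combiner.
    System.out.println(nullNull.matches(hash("", ""), "", ""));
  }
}
```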
- Add Support.insertHeadEntry(buckets, long keyHash, entry) overload that derives the bucket index itself. Callers that already have a hash but not the index (the common case) now avoid the redundant bucketIndex(...) hop.
- D1.insert, D1.insertOrReplace, D2.insert, D2.insertOrReplace: use the new overload, drop the (thisBuckets local, bucketIndex compute, setNext, store) sequence at each call site.
- D2.buckets: drop the `private` modifier to match D1.buckets. Both are package-private so iterator tests in the same package can drive Support.bucketIterator against the table's bucket array. Added a short comment on both fields documenting the rationale.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
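The head-insert overload pair could look roughly like the following. This is a sketch under stated assumptions: the `Entry` shape, the fold-and-mask `bucketIndex`, and the class name are invented for illustration, and the bucket array length is assumed to be a power of two.

```java
final class HeadInsertSketch {
  static final class Entry {
    final long keyHash;
    Entry next;

    Entry(long keyHash) {
      this.keyHash = keyHash;
    }
  }

  // Assumed index derivation: fold the high 32 bits into the low 32, then mask.
  static int bucketIndex(Entry[] buckets, long keyHash) {
    int folded = (int) (keyHash ^ (keyHash >>> 32));
    return folded & (buckets.length - 1);
  }

  // Index-based form: link the entry in front of the current chain head.
  static void insertHeadEntry(Entry[] buckets, int index, Entry entry) {
    entry.next = buckets[index];
    buckets[index] = entry;
  }

  // Hash-based form: derives the index so call sites do not have to.
  static void insertHeadEntry(Entry[] buckets, long keyHash, Entry entry) {
    insertHeadEntry(buckets, bucketIndex(buckets, keyHash), entry);
  }

  public static void main(String[] args) {
    Entry[] buckets = new Entry[8];
    Entry first = new Entry(3L);
    Entry second = new Entry(11L); // 11 & 7 == 3, so it shares a bucket with `first`
    insertHeadEntry(buckets, first.keyHash, first);
    insertHeadEntry(buckets, second.keyHash, second);
    System.out.println(buckets[3] == second && second.next == first);
  }
}
```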
Three follow-ups from the design review:
- Make Hashtable.Entry.next private. All same-package readers (BucketIterator) already had a next() accessor; the leftover direct field reads now route through it. Closes the "mixed encapsulation" gap where some readers used the accessor and same-package ones reached for the field.
- BucketIterator and MutatingBucketIterator now document that chain-walk work happens in next() (and the constructor for the first match); hasNext() is an O(1) field read.
- Add D1.getOrCreate(K, Function) and D2.getOrCreate(K1, K2, BiFunction). Both reuse the lookup hash for the insert on miss, avoiding the double-hash that "get; if null then insert" callers would otherwise pay.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
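The single-hash getOrCreate idea can be sketched as below. Everything here is an assumption for illustration (the class, the `Entry` layout, the fixed 16-bucket array, and the `hashCalls` counter, which exists only to make the "one hash per call" property observable); the actual D1.getOrCreate signature may differ, for instance in what the Function produces.

```java
import java.util.Objects;
import java.util.function.Function;

final class GetOrCreateSketch<K, V> {
  static final class Entry<K, V> {
    final long hash;
    final K key;
    final V value;
    Entry<K, V> next;

    Entry(long hash, K key, V value) {
      this.hash = hash;
      this.key = key;
      this.value = value;
    }
  }

  @SuppressWarnings("unchecked")
  private final Entry<K, V>[] buckets = (Entry<K, V>[]) new Entry[16];

  int hashCalls; // demonstration-only counter

  private long hash(K key) {
    hashCalls++;
    return key == null ? Long.MIN_VALUE : key.hashCode();
  }

  V getOrCreate(K key, Function<? super K, ? extends V> creator) {
    long h = hash(key); // computed once, used for both lookup and insert
    int index = (int) (h ^ (h >>> 32)) & (buckets.length - 1);
    for (Entry<K, V> e = buckets[index]; e != null; e = e.next) {
      if (e.hash == h && Objects.equals(e.key, key)) {
        return e.value;
      }
    }
    Entry<K, V> created = new Entry<>(h, key, creator.apply(key));
    created.next = buckets[index];
    buckets[index] = created;
    return created.value;
  }

  public static void main(String[] args) {
    GetOrCreateSketch<String, StringBuilder> table = new GetOrCreateSketch<>();
    StringBuilder a = table.getOrCreate("k", StringBuilder::new);
    StringBuilder b = table.getOrCreate("k", StringBuilder::new);
    System.out.println((a == b) + " after " + table.hashCalls + " hash computations");
  }
}
```

A "get; if null then insert" caller would instead hash the key twice on every miss, once per call.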
Addresses PR #11409 review comments:
- #3267164119 / #3267165525: wrap every single-line if/break body in braces (7 sites across BucketIterator, MutatingBucketIterator, and the full-table Iterator).
- #3275947761 / #3275948108 (sarahchen6): null out the removed/replaced entry's next pointer after splicing it out of the chain in MutatingBucketIterator.remove / .replace. Applied the same fix to the full-table Iterator.remove for consistency.
Rationale: detaching prevents accidental traversal through a removed entry via a stale reference and lets the GC reclaim a chain tail that the removed entry was the last referrer to.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
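The detach-on-remove fix can be pictured with a small stand-alone chain. This is only a sketch: the `Entry` class and the static `remove` helper are hypothetical and stand in for what MutatingBucketIterator.remove does inside the real table.

```java
final class DetachOnRemoveSketch {
  static final class Entry {
    final String key;
    Entry next;

    Entry(String key, Entry next) {
      this.key = key;
      this.next = next;
    }
  }

  // Removes the first entry with the given key and returns the (possibly new) chain head.
  static Entry remove(Entry head, String key) {
    Entry prev = null;
    for (Entry cur = head; cur != null; prev = cur, cur = cur.next) {
      if (cur.key.equals(key)) {
        Entry after = cur.next;
        if (prev == null) {
          head = after;
        } else {
          prev.next = after;
        }
        // Detach: a stale reference to `cur` can no longer walk into the live chain.
        cur.next = null;
        return head;
      }
    }
    return head;
  }

  public static void main(String[] args) {
    Entry c = new Entry("c", null);
    Entry b = new Entry("b", c);
    Entry a = new Entry("a", b);
    Entry head = remove(a, "b");
    System.out.println(head == a && a.next == c && b.next == null);
  }
}
```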
66ec7f6 to
e2642cd
dougqh
commented
May 20, 2026
…sistency
Addresses PR #11409 review comment #3276167001. The method parallels the primitive hash(boolean) / hash(int) / hash(long) / ... family, so naming it hash(Object) -- with null collapsing to Long.MIN_VALUE as a sentinel distinct from any real hashCode -- matches the rest of the public surface.
Test call sites that pass a literal null now disambiguate against hash(int[]) / hash(Object[]) / hash(Iterable) via an (Object) cast.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
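A small sketch of why the (Object) cast is needed and why the sentinel cannot collide. The overload bodies below are placeholders written for this example, not the LongHashingUtils implementations; only the overload shapes and the Long.MIN_VALUE sentinel come from the commit message.

```java
import java.util.Arrays;

final class NullHashSketch {
  // null maps to a sentinel outside the int range, so no real hashCode() can equal it.
  static long hash(Object obj) {
    return obj == null ? Long.MIN_VALUE : obj.hashCode();
  }

  // Placeholder bodies: the point is that these overloads exist alongside hash(Object).
  static long hash(int[] values) {
    return Arrays.hashCode(values);
  }

  static long hash(Object[] values) {
    return Arrays.hashCode(values);
  }

  static long hash(Iterable<?> values) {
    long h = 1;
    for (Object value : values) {
      h = 31 * h + hash(value);
    }
    return h;
  }

  public static void main(String[] args) {
    // hash(null) would not compile: int[], Object[] and Iterable are all applicable
    // and none is more specific than the others. The cast selects hash(Object).
    System.out.println(hash((Object) null) == Long.MIN_VALUE);
  }
}
```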
Contributor
/merge
What Does This Do
Introduces Hashtable, which serves as a lighter-weight alternative to HashMap.
Motivation
Hashtable is parameterized on Entry types, allowing for lower overhead:
- The Entry can hold multiple fields that make up the key.
- The Entry can hold mutable fields that make up the value.
- The Entry can include metadata useful for eviction, etc.
Hashtable includes D1 and D2 for 1-D and 2-D maps respectively, but also includes a Support class that can be used to build higher-dimensional or more complicated map structures.
This is particularly useful in aggregation workloads with multipart keys where lookups dominate insertions. In those situations, a solution based on Hashtable avoids constantly allocating a composite key object that will be immediately thrown away.
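As a rough illustration of the motivation, the sketch below keeps a two-part key and a mutable counter in one entry, so a lookup compares the key fields directly and allocates nothing on a hit. It is a stand-alone toy, not the PR's Hashtable API: the class, the `service` / `operation` fields, the fixed 64-bucket array, and the hash combiner are all assumptions.

```java
final class AggregateSketch {
  static final class Entry {
    final long hash;
    final String service;   // key part 1
    final String operation; // key part 2
    long count;             // mutable value, updated in place
    Entry next;

    Entry(long hash, String service, String operation) {
      this.hash = hash;
      this.service = service;
      this.operation = operation;
    }
  }

  private final Entry[] buckets = new Entry[64];

  private static long hash(String service, String operation) {
    return 31L * service.hashCode() + operation.hashCode();
  }

  private Entry find(long h, String service, String operation) {
    int index = (int) (h ^ (h >>> 32)) & (buckets.length - 1);
    for (Entry e = buckets[index]; e != null; e = e.next) {
      if (e.hash == h && e.service.equals(service) && e.operation.equals(operation)) {
        return e;
      }
    }
    return null;
  }

  void record(String service, String operation) {
    long h = hash(service, operation);
    Entry e = find(h, service, operation);
    if (e == null) { // allocation happens only on the first sighting of a key
      int index = (int) (h ^ (h >>> 32)) & (buckets.length - 1);
      e = new Entry(h, service, operation);
      e.next = buckets[index];
      buckets[index] = e;
    }
    e.count++;
  }

  long count(String service, String operation) {
    Entry e = find(hash(service, operation), service, operation);
    return e == null ? 0 : e.count;
  }

  public static void main(String[] args) {
    AggregateSketch table = new AggregateSketch();
    table.record("web", "GET");
    table.record("web", "GET");
    System.out.println(table.count("web", "GET"));
  }
}
```

A HashMap-based version would typically need a composite key object per lookup, which is the allocation this design aims to avoid.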
Additional Notes
Split out of #11382 into its own stand-alone change:
- datadog.trace.util.Hashtable: generic open-addressed-ish bucket table keyed by a 64-bit hash. The public abstract Entry lets client code subclass it for higher-arity keys (e.g. for multi-field aggregation keys in the metrics aggregator). Support helpers (create, clear, bucketIndex, bucketIterator, mutatingBucketIterator) are package-private but enough for higher layers built on top.
- datadog.trace.util.LongHashingUtils: chained 64-bit hash combiners with primitive overloads (boolean, short, int, long, Object). Used in place of varargs combiners to avoid Object[] allocation and boxing on the hot path.
No callers within internal-api yet. The first usage will land in #11382 (AggregateTable + AggregateEntry), which now becomes a smaller, more focused diff once this lands first.
Test plan
- :internal-api:compileJava passes (verified locally).
🤖 Generated with Claude Code
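For readers unfamiliar with the chained-combiner style mentioned for LongHashingUtils, here is a hedged sketch. The `addToHash` name, the multiply-by-31 mixing step, and the class name are assumptions made for this example; only the idea of per-type overloads chained through a running long (instead of a varargs Object[] call) comes from the description.

```java
final class ChainedHashSketch {
  // Core step: fold one int-sized value into the running 64-bit hash.
  static long addToHash(long hash, int value) {
    return 31L * hash + value;
  }

  // Primitive overloads avoid boxing; each reduces its argument to an int.
  static long addToHash(long hash, long value) {
    return addToHash(hash, Long.hashCode(value));
  }

  static long addToHash(long hash, boolean value) {
    return addToHash(hash, Boolean.hashCode(value));
  }

  static long addToHash(long hash, Object value) {
    return addToHash(hash, value == null ? 0 : value.hashCode());
  }

  public static void main(String[] args) {
    // Chained calls: no Object[] is created and 200 / true are never boxed.
    long h = addToHash(addToHash(addToHash(0L, "svc"), 200), true);
    System.out.println(h);
  }
}
```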