Add Hashtable and LongHashingUtils utilities #11409
Merged: gh-worker-dd-mergequeue-cf854d merged 21 commits into master from dougqh/util-hashtable on May 20, 2026
10956b2 Add Hashtable and LongHashingUtils to datadog.trace.util (dougqh)
035dc09 Add unit tests for Hashtable and LongHashingUtils (dougqh)
7728b60 Apply spotless formatting to Hashtable and LongHashingUtils (dougqh)
8cd2d86 Add JMH benchmarks for Hashtable.D1 and D2 (dougqh)
c689ef9 Add benchmark results to HashtableBenchmark header (dougqh)
75790eb Address review feedback on Hashtable (dougqh)
6056ff7 Fix dropped argument in HashingUtils 5-arg Object hash (dougqh)
da55021 Address review feedback on Hashtable (dougqh)
8b8b088 Drop reflection in iterator tests via package-private D1.buckets (dougqh)
0fde7cd Add context-passing forEach to Hashtable.D1 and D2 (dougqh)
6d6c2e0 Move forEach loop body to Support helper (dougqh)
268de2b Move bucket-head cast to Support.bucket helper (dougqh)
93813b9 Drop d1_/d2_ prefix from per-table benchmark methods (dougqh)
11a58bf Add Hashtable.Support helpers: MAX_RATIO, insertHeadEntry, MutatingTa… (dougqh)
8f1828d Swap MAX_RATIO numerator/denominator pair for a single float + scaled… (dougqh)
c0d3e26 Tighten Hashtable docs + rename MAX_CAPACITY to MAX_BUCKETS (dougqh)
a0978ba Dedupe chain-head splice in D1/D2 via keyHash insertHeadEntry overload (dougqh)
e604a8f Tighten Entry.next encapsulation; doc hasNext; add D1/D2 getOrCreate (dougqh)
e2642cd Hashtable: add missing braces and detach removed/replaced entries (dougqh)
585ca56 Rename LongHashingUtils.hashCodeX(Object) to hash(Object) for API con… (dougqh)
52c0fe6 Merge branch 'master' into dougqh/util-hashtable (dougqh)
169 changes: 169 additions & 0 deletions
internal-api/src/jmh/java/datadog/trace/util/HashtableD1Benchmark.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| package datadog.trace.util; | ||
|
|
||
| import static java.util.concurrent.TimeUnit.MICROSECONDS; | ||
|
|
||
| import java.util.HashMap; | ||
| import java.util.Map; | ||
| import java.util.function.Consumer; | ||
| import org.openjdk.jmh.annotations.Benchmark; | ||
| import org.openjdk.jmh.annotations.BenchmarkMode; | ||
| import org.openjdk.jmh.annotations.Fork; | ||
| import org.openjdk.jmh.annotations.Level; | ||
| import org.openjdk.jmh.annotations.Measurement; | ||
| import org.openjdk.jmh.annotations.Mode; | ||
| import org.openjdk.jmh.annotations.OperationsPerInvocation; | ||
| import org.openjdk.jmh.annotations.OutputTimeUnit; | ||
| import org.openjdk.jmh.annotations.Scope; | ||
| import org.openjdk.jmh.annotations.Setup; | ||
| import org.openjdk.jmh.annotations.State; | ||
| import org.openjdk.jmh.annotations.Threads; | ||
| import org.openjdk.jmh.annotations.Warmup; | ||
| import org.openjdk.jmh.infra.Blackhole; | ||
|
|
||
| /** | ||
| * Compares {@link Hashtable.D1} against equivalent {@link HashMap} usage for add, update, and | ||
| * iterate operations. | ||
| * | ||
| * <p>Each benchmark thread owns its own map ({@link Scope#Thread}), but a non-trivial thread count | ||
| * is used so allocation/GC pressure surfaces in the throughput numbers — that pressure is the main | ||
| * thing Hashtable is built to avoid. | ||
| * | ||
| * <ul> | ||
| * <li><b>add</b>: clear the map, then re-insert N fresh entries | ||
| * ({@code @OperationsPerInvocation(N_KEYS)}). Captures the steady-state cost of building up a | ||
| * map. | ||
| * <li><b>update</b>: for an existing key, increment a counter. Hashtable does {@code get} + | ||
| * field mutation (no allocation); HashMap uses {@code merge(k, 1L, Long::sum)}, the idiomatic | ||
| * Java 8+ way, which boxes a new {@code Long} per call once the count passes 127. | ||
| * <li><b>iterate</b>: walk every entry and consume its key + value. | ||
| * </ul> | ||
| * | ||
| * <p><b>Update</b> is where Hashtable dominates: D1 is ~14x faster, because the HashMap path | ||
| * allocates per call (a {@code Long}) and the resulting GC pressure throttles throughput under | ||
| * multiple threads. <b>Add</b> is roughly comparable (both allocate one entry per insert). | ||
| * <b>Iterate</b> is essentially a wash: both are bucket walks. <pre> | ||
| * MacBook M1 8 threads (Java 8) | ||
| * | ||
| * Benchmark                                Mode  Cnt     Score     Error   Units | ||
| * HashtableD1Benchmark.add_hashMap         thrpt    6   187.883 ± 189.858  ops/us | ||
| * HashtableD1Benchmark.add_hashtable       thrpt    6   198.710 ± 273.035  ops/us | ||
| * | ||
| * HashtableD1Benchmark.update_hashMap      thrpt    6   127.392 ±  87.482  ops/us | ||
| * HashtableD1Benchmark.update_hashtable    thrpt    6  1810.244 ±  44.645  ops/us | ||
| * | ||
| * HashtableD1Benchmark.iterate_hashMap     thrpt    6    20.043 ±   0.752  ops/us | ||
| * HashtableD1Benchmark.iterate_hashtable   thrpt    6    22.208 ±   0.956  ops/us | ||
| * </pre> | ||
| */ | ||
| @Fork(2) | ||
| @Warmup(iterations = 2) | ||
| @Measurement(iterations = 3) | ||
| @BenchmarkMode(Mode.Throughput) | ||
| @OutputTimeUnit(MICROSECONDS) | ||
| @Threads(8) | ||
| public class HashtableD1Benchmark { | ||
|
|
||
| static final int N_KEYS = 64; | ||
| static final int CAPACITY = 128; | ||
|
|
||
| static final String[] SOURCE_KEYS = new String[N_KEYS]; | ||
|
|
||
| static { | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| SOURCE_KEYS[i] = "key-" + i; | ||
| } | ||
| } | ||
|
|
||
| static final class D1Counter extends Hashtable.D1.Entry<String> { | ||
| long count; | ||
|
|
||
| D1Counter(String key) { | ||
| super(key); | ||
| } | ||
| } | ||
|
|
||
| /** Reusable iteration consumer — avoids per-call lambda capture allocation. */ | ||
| static final class BhD1Consumer implements Consumer<D1Counter> { | ||
| Blackhole bh; | ||
|
|
||
| @Override | ||
| public void accept(D1Counter e) { | ||
| bh.consume(e.key); | ||
| bh.consume(e.count); | ||
| } | ||
| } | ||
|
|
||
| @State(Scope.Thread) | ||
| public static class D1State { | ||
| Hashtable.D1<String, D1Counter> table; | ||
| HashMap<String, Long> hashMap; | ||
| String[] keys; | ||
| int cursor; | ||
| final BhD1Consumer consumer = new BhD1Consumer(); | ||
|
|
||
| @Setup(Level.Iteration) | ||
| public void setUp() { | ||
| table = new Hashtable.D1<>(CAPACITY); | ||
| hashMap = new HashMap<>(CAPACITY); | ||
| keys = SOURCE_KEYS; | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| table.insert(new D1Counter(keys[i])); | ||
| hashMap.put(keys[i], 0L); | ||
| } | ||
| cursor = 0; | ||
| } | ||
|
|
||
| String nextKey() { | ||
| int i = cursor; | ||
| cursor = (i + 1) & (N_KEYS - 1); // wraps to 0; requires N_KEYS to be a power of two | ||
| return keys[i]; | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| @OperationsPerInvocation(N_KEYS) | ||
| public void add_hashtable(D1State s) { | ||
| Hashtable.D1<String, D1Counter> t = s.table; | ||
| String[] keys = s.keys; | ||
| t.clear(); | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| t.insert(new D1Counter(keys[i])); | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| @OperationsPerInvocation(N_KEYS) | ||
| public void add_hashMap(D1State s) { | ||
| HashMap<String, Long> m = s.hashMap; | ||
| String[] keys = s.keys; | ||
| m.clear(); | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| m.put(keys[i], (long) i); | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| public long update_hashtable(D1State s) { | ||
| D1Counter e = s.table.get(s.nextKey()); | ||
| return ++e.count; | ||
| } | ||
|
|
||
| @Benchmark | ||
| public Long update_hashMap(D1State s) { | ||
| return s.hashMap.merge(s.nextKey(), 1L, Long::sum); | ||
| } | ||
|
|
||
| @Benchmark | ||
| public void iterate_hashtable(D1State s, Blackhole bh) { | ||
| s.consumer.bh = bh; | ||
| s.table.forEach(s.consumer); | ||
| } | ||
|
|
||
| @Benchmark | ||
| public void iterate_hashMap(D1State s, Blackhole bh) { | ||
| for (Map.Entry<String, Long> entry : s.hashMap.entrySet()) { | ||
| bh.consume(entry.getKey()); | ||
| bh.consume(entry.getValue()); | ||
| } | ||
| } | ||
| } |
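
The update benchmark above contrasts two ways of bumping a per-key counter. The following is a minimal sketch of that contrast using only `java.util.HashMap`; it does not use the PR's `Hashtable` class. `CounterUpdateSketch`, `Counter`, `incrementBoxed`, and `incrementInPlace` are hypothetical names introduced for illustration, with `Counter` standing in for an entry subclass such as `D1Counter`.

```java
import java.util.HashMap;
import java.util.Map;

public final class CounterUpdateSketch {
  /** Hypothetical mutable holder; stands in for an entry subclass with a count field. */
  static final class Counter {
    long count;
  }

  /** Boxed path: merge stores a boxed Long, which is a fresh object once the value passes 127. */
  static long incrementBoxed(Map<String, Long> map, String key) {
    return map.merge(key, 1L, Long::sum);
  }

  /** In-place path: one lookup, then a field write. Assumes the key is already present. */
  static long incrementInPlace(Map<String, Counter> map, String key) {
    Counter c = map.get(key);
    return ++c.count;
  }

  public static void main(String[] args) {
    Map<String, Long> boxed = new HashMap<>();
    Map<String, Counter> inPlace = new HashMap<>();
    boxed.put("key-0", 0L);
    inPlace.put("key-0", new Counter());
    for (int i = 0; i < 1000; i++) {
      incrementBoxed(boxed, "key-0");
      incrementInPlace(inPlace, "key-0");
    }
    // Both paths reach the same count; only the allocation behaviour differs.
    System.out.println(boxed.get("key-0") + " " + inPlace.get("key-0").count); // prints "1000 1000"
  }
}
```

Both paths compute the same result, so any throughput gap in a benchmark like this one comes from how the value is stored, not from what is computed.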
209 changes: 209 additions & 0 deletions
internal-api/src/jmh/java/datadog/trace/util/HashtableD2Benchmark.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,209 @@ | ||
| package datadog.trace.util; | ||
|
|
||
| import static java.util.concurrent.TimeUnit.MICROSECONDS; | ||
|
|
||
| import java.util.HashMap; | ||
| import java.util.Map; | ||
| import java.util.Objects; | ||
| import java.util.function.Consumer; | ||
| import org.openjdk.jmh.annotations.Benchmark; | ||
| import org.openjdk.jmh.annotations.BenchmarkMode; | ||
| import org.openjdk.jmh.annotations.Fork; | ||
| import org.openjdk.jmh.annotations.Level; | ||
| import org.openjdk.jmh.annotations.Measurement; | ||
| import org.openjdk.jmh.annotations.Mode; | ||
| import org.openjdk.jmh.annotations.OperationsPerInvocation; | ||
| import org.openjdk.jmh.annotations.OutputTimeUnit; | ||
| import org.openjdk.jmh.annotations.Scope; | ||
| import org.openjdk.jmh.annotations.Setup; | ||
| import org.openjdk.jmh.annotations.State; | ||
| import org.openjdk.jmh.annotations.Threads; | ||
| import org.openjdk.jmh.annotations.Warmup; | ||
| import org.openjdk.jmh.infra.Blackhole; | ||
|
|
||
| /** | ||
| * Compares {@link Hashtable.D2} against equivalent {@link HashMap} usage for add, update, and | ||
| * iterate operations. | ||
| * | ||
| * <p>Each benchmark thread owns its own map ({@link Scope#Thread}), but a non-trivial thread count | ||
| * is used so allocation/GC pressure surfaces in the throughput numbers — that pressure is the main | ||
| * thing Hashtable is built to avoid. | ||
| * | ||
| * <ul> | ||
| * <li><b>add</b>: clear the map, then re-insert N fresh entries | ||
| * ({@code @OperationsPerInvocation(N_KEYS)}). Captures the steady-state cost of building up a | ||
| * map. | ||
| * <li><b>update</b>: for an existing key, increment a counter. Hashtable does {@code get} + | ||
| * field mutation (no allocation); HashMap uses {@code merge(k, 1L, Long::sum)}, the idiomatic | ||
| * Java 8+ way, which boxes a new {@code Long} per call once the count passes 127. | ||
| * <li><b>iterate</b>: walk every entry and consume its key + value. | ||
| * </ul> | ||
| * | ||
| * <p>The D2 variants additionally pay for a composite-key wrapper allocation in the HashMap path | ||
| * (Java has no built-in tuple-as-key) — D2 sidesteps it by taking both key parts directly. | ||
| * | ||
| * <p><b>Update</b> is where Hashtable dominates: D2 is ~26x faster, because the HashMap path | ||
| * allocates per call (a {@code Long}, plus a {@code Key2}) and the resulting GC pressure throttles | ||
| * throughput under multiple threads. <b>Add</b> scores ~3x higher for D2 (no {@code Key2} | ||
| * allocation), but with wide error bars. <b>Iterate</b> is close; both are bucket walks. <pre> | ||
| * MacBook M1 8 threads (Java 8) | ||
| * | ||
| * Benchmark                                Mode  Cnt     Score     Error   Units | ||
| * HashtableD2Benchmark.add_hashMap         thrpt    6    77.082 ±  72.278  ops/us | ||
| * HashtableD2Benchmark.add_hashtable       thrpt    6   216.813 ± 413.236  ops/us | ||
| * | ||
| * HashtableD2Benchmark.update_hashMap      thrpt    6    56.077 ±  23.716  ops/us | ||
| * HashtableD2Benchmark.update_hashtable    thrpt    6  1445.868 ± 157.705  ops/us | ||
| * | ||
| * HashtableD2Benchmark.iterate_hashMap     thrpt    6    19.508 ±   0.760  ops/us | ||
| * HashtableD2Benchmark.iterate_hashtable   thrpt    6    16.968 ±   0.371  ops/us | ||
| * </pre> | ||
| */ | ||
| @Fork(2) | ||
| @Warmup(iterations = 2) | ||
| @Measurement(iterations = 3) | ||
| @BenchmarkMode(Mode.Throughput) | ||
| @OutputTimeUnit(MICROSECONDS) | ||
| @Threads(8) | ||
| public class HashtableD2Benchmark { | ||
|
|
||
| static final int N_KEYS = 64; | ||
| static final int CAPACITY = 128; | ||
|
|
||
| static final String[] SOURCE_K1 = new String[N_KEYS]; | ||
| static final Integer[] SOURCE_K2 = new Integer[N_KEYS]; | ||
|
|
||
| static { | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| SOURCE_K1[i] = "key-" + i; | ||
| SOURCE_K2[i] = i * 31 + 17; | ||
| } | ||
| } | ||
|
|
||
| static final class D2Counter extends Hashtable.D2.Entry<String, Integer> { | ||
| long count; | ||
|
|
||
| D2Counter(String k1, Integer k2) { | ||
| super(k1, k2); | ||
| } | ||
| } | ||
|
|
||
| /** Composite key for the HashMap baseline against D2. */ | ||
| static final class Key2 { | ||
| final String k1; | ||
| final Integer k2; | ||
| final int hash; | ||
|
|
||
| Key2(String k1, Integer k2) { | ||
| this.k1 = k1; | ||
| this.k2 = k2; | ||
| this.hash = Objects.hash(k1, k2); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean equals(Object o) { | ||
| if (!(o instanceof Key2)) { | ||
| return false; | ||
| } | ||
| Key2 other = (Key2) o; | ||
| return Objects.equals(k1, other.k1) && Objects.equals(k2, other.k2); | ||
| } | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| return hash; | ||
| } | ||
| } | ||
|
|
||
| /** Reusable iteration consumer — avoids per-call lambda capture allocation. */ | ||
| static final class BhD2Consumer implements Consumer<D2Counter> { | ||
| Blackhole bh; | ||
|
|
||
| @Override | ||
| public void accept(D2Counter e) { | ||
| bh.consume(e.key1); | ||
| bh.consume(e.key2); | ||
| bh.consume(e.count); | ||
| } | ||
| } | ||
|
|
||
| @State(Scope.Thread) | ||
| public static class D2State { | ||
| Hashtable.D2<String, Integer, D2Counter> table; | ||
| HashMap<Key2, Long> hashMap; | ||
| String[] k1s; | ||
| Integer[] k2s; | ||
| int cursor; | ||
| final BhD2Consumer consumer = new BhD2Consumer(); | ||
|
|
||
| @Setup(Level.Iteration) | ||
| public void setUp() { | ||
| table = new Hashtable.D2<>(CAPACITY); | ||
| hashMap = new HashMap<>(CAPACITY); | ||
| k1s = SOURCE_K1; | ||
| k2s = SOURCE_K2; | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| table.insert(new D2Counter(k1s[i], k2s[i])); | ||
| hashMap.put(new Key2(k1s[i], k2s[i]), 0L); | ||
| } | ||
| cursor = 0; | ||
| } | ||
|
|
||
| int nextIndex() { | ||
| int i = cursor; | ||
| cursor = (i + 1) & (N_KEYS - 1); // wraps to 0; requires N_KEYS to be a power of two | ||
| return i; | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| @OperationsPerInvocation(N_KEYS) | ||
| public void add_hashtable(D2State s) { | ||
| Hashtable.D2<String, Integer, D2Counter> t = s.table; | ||
| String[] k1s = s.k1s; | ||
| Integer[] k2s = s.k2s; | ||
| t.clear(); | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| t.insert(new D2Counter(k1s[i], k2s[i])); | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| @OperationsPerInvocation(N_KEYS) | ||
| public void add_hashMap(D2State s) { | ||
| HashMap<Key2, Long> m = s.hashMap; | ||
| String[] k1s = s.k1s; | ||
| Integer[] k2s = s.k2s; | ||
| m.clear(); | ||
| for (int i = 0; i < N_KEYS; ++i) { | ||
| m.put(new Key2(k1s[i], k2s[i]), (long) i); | ||
| } | ||
| } | ||
|
|
||
| @Benchmark | ||
| public long update_hashtable(D2State s) { | ||
| int i = s.nextIndex(); | ||
| D2Counter e = s.table.get(s.k1s[i], s.k2s[i]); | ||
| return ++e.count; | ||
| } | ||
|
|
||
| @Benchmark | ||
| public Long update_hashMap(D2State s) { | ||
| int i = s.nextIndex(); | ||
| return s.hashMap.merge(new Key2(s.k1s[i], s.k2s[i]), 1L, Long::sum); | ||
| } | ||
|
|
||
| @Benchmark | ||
| public void iterate_hashtable(D2State s, Blackhole bh) { | ||
| s.consumer.bh = bh; | ||
| s.table.forEach(s.consumer); | ||
| } | ||
|
|
||
| @Benchmark | ||
| public void iterate_hashMap(D2State s, Blackhole bh) { | ||
| for (Map.Entry<Key2, Long> entry : s.hashMap.entrySet()) { | ||
| bh.consume(entry.getKey()); | ||
| bh.consume(entry.getValue()); | ||
| } | ||
| } | ||
| } |
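
The D2 benchmark's Javadoc says that taking both key parts directly avoids the composite-key wrapper a `HashMap` needs. The sketch below shows one way a chained table could do that. It is an illustration under stated assumptions, not the PR's `Hashtable.D2` implementation: `TwoKeyTableSketch`, its `Entry` class, the `31 * h1 + h2` hash mix, and the fixed power-of-two bucket count are all hypothetical choices made for this example.

```java
import java.util.Objects;

public final class TwoKeyTableSketch {
  /** Hypothetical entry: carries both key parts, a cached hash, and a mutable payload. */
  static final class Entry {
    final String k1;
    final Integer k2;
    final int hash;
    long count;
    Entry next; // next entry in the same bucket chain

    Entry(String k1, Integer k2) {
      this.k1 = k1;
      this.k2 = k2;
      this.hash = hash(k1, k2);
    }
  }

  private final Entry[] buckets;
  private final int mask;

  TwoKeyTableSketch(int bucketCount) {
    if (bucketCount <= 0 || Integer.bitCount(bucketCount) != 1) {
      throw new IllegalArgumentException("bucketCount must be a positive power of two");
    }
    this.buckets = new Entry[bucketCount];
    this.mask = bucketCount - 1;
  }

  /** Combines the two key hashes without allocating a wrapper object or a varargs array. */
  static int hash(String k1, Integer k2) {
    return 31 * Objects.hashCode(k1) + Objects.hashCode(k2);
  }

  /** Splices the entry in at the head of its chain. Does not check for duplicates. */
  void insert(Entry e) {
    int idx = e.hash & mask;
    e.next = buckets[idx];
    buckets[idx] = e;
  }

  /** Looks up by both key parts directly; returns null when no entry matches. */
  Entry get(String k1, Integer k2) {
    int h = hash(k1, k2);
    for (Entry e = buckets[h & mask]; e != null; e = e.next) {
      if (e.hash == h && Objects.equals(e.k1, k1) && Objects.equals(e.k2, k2)) {
        return e;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    TwoKeyTableSketch table = new TwoKeyTableSketch(8);
    table.insert(new Entry("key-0", 17));
    table.insert(new Entry("key-1", 48));
    for (int i = 0; i < 3; i++) {
      table.get("key-0", 17).count++; // no Key2-style wrapper is created for the lookup
    }
    System.out.println(table.get("key-0", 17).count); // prints 3
    System.out.println(table.get("key-0", 48) == null); // prints true
  }
}
```

Compare this with the `Key2` baseline in the benchmark, where every `merge` call constructs a new `Key2` just to perform the lookup.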
Related bug fix