Commit cd623c4

Auto merge of #563 - Amanieu:foldhash, r=Amanieu
Change the default hasher to foldhash

[Foldhash](https://github.com/orlp/foldhash) performs generally better than AHash while still avoiding the pitfalls of FxHash with certain distributions (such as only hashing aligned values).
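
The aligned-value pitfall mentioned in the commit message can be illustrated with a small sketch. It is not taken from FxHash, AHash, or foldhash: `fx_style_hash` and `distinct_buckets` are hypothetical helpers, the multiplier is assumed to be the 64-bit constant used by `rustc-hash` 1.x, and the table is assumed to take its bucket index from the low bits of the hash.

```rust
use std::collections::HashSet;

// Illustrative only: a single-word multiply hash in the style of FxHash.
// The constant is an assumption; any odd multiplier shows the same effect.
const K: u64 = 0x517c_c1b7_2722_0a95;

/// Hypothetical helper: hash one integer key by multiplying it with `K`.
fn fx_style_hash(key: u64) -> u64 {
    key.wrapping_mul(K)
}

/// Count how many distinct bucket indices `keys` map to when the bucket
/// index is the low `bits` bits of the hash.
fn distinct_buckets(keys: impl Iterator<Item = u64>, bits: u32) -> usize {
    let mask = (1u64 << bits) - 1;
    let mut seen = HashSet::new();
    for key in keys {
        seen.insert(fx_style_hash(key) & mask);
    }
    seen.len()
}

fn main() {
    // 64 keys aligned to 8 bytes, as pointer-like values often are.
    let aligned = (0..64u64).map(|i| i * 8);
    // 64 consecutive keys for comparison.
    let serial = 0..64u64;

    println!("aligned: {} of 64 buckets used", distinct_buckets(aligned, 6));
    println!("serial:  {} of 64 buckets used", distinct_buckets(serial, 6));
}
```

Multiplying by an odd constant keeps the trailing zero bits of the key, so the 8-byte-aligned keys land in only 8 of the 64 buckets while the serial keys fill all 64.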
2 parents edd22e1 + 7762511 commit cd623c4

6 files changed: 54 additions, 99 deletions

Cargo.toml

3 additions, 3 deletions

@@ -14,7 +14,7 @@ rust-version = "1.63.0"
 
 [dependencies]
 # For the default hasher
-ahash = { version = "0.8.7", default-features = false, optional = true }
+foldhash = { version = "0.1.2", default-features = false, optional = true }
 
 # For external trait impls
 rayon = { version = "1.0", optional = true }
@@ -66,10 +66,10 @@ rustc-dep-of-std = [
 # Enables the deprecated RawEntry API.
 raw-entry = []
 
-# Provides a default hasher. Currently this is AHash but this is subject to
+# Provides a default hasher. Currently this is foldhash but this is subject to
 # change in the future. Note that the default hasher does *not* provide HashDoS
 # resistance, unlike the one in the standard library.
-default-hasher = ["dep:ahash"]
+default-hasher = ["dep:foldhash"]
 
 # Enables usage of `#[inline]` on far more functions than by default in this
 # crate. This may lead to a performance increase but often comes at a compile

README.md

3 additions, 48 deletions

@@ -26,59 +26,14 @@ in environments without `std`, such as embedded systems and kernels.
 ## Features
 
 - Drop-in replacement for the standard library `HashMap` and `HashSet` types.
-- Uses [AHash](https://github.com/tkaitchuck/aHash) as the default hasher, which is much faster than SipHash.
-  However, AHash does *not provide the same level of HashDoS resistance* as SipHash, so if that is important to you, you might want to consider using a different hasher.
+- Uses [foldhash](https://github.com/orlp/foldhash) as the default hasher, which is much faster than SipHash.
+  However, foldhash does *not provide the same level of HashDoS resistance* as SipHash, so if that is important to you, you might want to consider using a different hasher.
 - Around 2x faster than the previous standard library `HashMap`.
 - Lower memory usage: only 1 byte of overhead per entry instead of 8.
 - Compatible with `#[no_std]` (but requires a global allocator with the `alloc` crate).
 - Empty hash maps do not allocate any memory.
 - SIMD lookups to scan multiple hash entries in parallel.
 
-## Performance
-
-Compared to the previous implementation of `std::collections::HashMap` (Rust 1.35).
-
-With the hashbrown default AHash hasher:
-
-| name | oldstdhash ns/iter | hashbrown ns/iter | diff ns/iter | diff % | speedup |
-| :-------------------------- | :----------------: | ----------------: | :----------: | ------: | ------- |
-| insert_ahash_highbits | 18,865 | 8,020 | -10,845 | -57.49% | x 2.35 |
-| insert_ahash_random | 19,711 | 8,019 | -11,692 | -59.32% | x 2.46 |
-| insert_ahash_serial | 19,365 | 6,463 | -12,902 | -66.63% | x 3.00 |
-| insert_erase_ahash_highbits | 51,136 | 17,916 | -33,220 | -64.96% | x 2.85 |
-| insert_erase_ahash_random | 51,157 | 17,688 | -33,469 | -65.42% | x 2.89 |
-| insert_erase_ahash_serial | 45,479 | 14,895 | -30,584 | -67.25% | x 3.05 |
-| iter_ahash_highbits | 1,399 | 1,092 | -307 | -21.94% | x 1.28 |
-| iter_ahash_random | 1,586 | 1,059 | -527 | -33.23% | x 1.50 |
-| iter_ahash_serial | 3,168 | 1,079 | -2,089 | -65.94% | x 2.94 |
-| lookup_ahash_highbits | 32,351 | 4,792 | -27,559 | -85.19% | x 6.75 |
-| lookup_ahash_random | 17,419 | 4,817 | -12,602 | -72.35% | x 3.62 |
-| lookup_ahash_serial | 15,254 | 3,606 | -11,648 | -76.36% | x 4.23 |
-| lookup_fail_ahash_highbits | 21,187 | 4,369 | -16,818 | -79.38% | x 4.85 |
-| lookup_fail_ahash_random | 21,550 | 4,395 | -17,155 | -79.61% | x 4.90 |
-| lookup_fail_ahash_serial | 19,450 | 3,176 | -16,274 | -83.67% | x 6.12 |
-
-
-With the libstd default SipHash hasher:
-
-| name | oldstdhash ns/iter | hashbrown ns/iter | diff ns/iter | diff % | speedup |
-| :------------------------ | :----------------: | ----------------: | :----------: | ------: | ------- |
-| insert_std_highbits | 19,216 | 16,885 | -2,331 | -12.13% | x 1.14 |
-| insert_std_random | 19,179 | 17,034 | -2,145 | -11.18% | x 1.13 |
-| insert_std_serial | 19,462 | 17,493 | -1,969 | -10.12% | x 1.11 |
-| insert_erase_std_highbits | 50,825 | 35,847 | -14,978 | -29.47% | x 1.42 |
-| insert_erase_std_random | 51,448 | 35,392 | -16,056 | -31.21% | x 1.45 |
-| insert_erase_std_serial | 87,711 | 38,091 | -49,620 | -56.57% | x 2.30 |
-| iter_std_highbits | 1,378 | 1,159 | -219 | -15.89% | x 1.19 |
-| iter_std_random | 1,395 | 1,132 | -263 | -18.85% | x 1.23 |
-| iter_std_serial | 1,704 | 1,105 | -599 | -35.15% | x 1.54 |
-| lookup_std_highbits | 17,195 | 13,642 | -3,553 | -20.66% | x 1.26 |
-| lookup_std_random | 17,181 | 13,773 | -3,408 | -19.84% | x 1.25 |
-| lookup_std_serial | 15,483 | 13,651 | -1,832 | -11.83% | x 1.13 |
-| lookup_fail_std_highbits | 20,926 | 13,474 | -7,452 | -35.61% | x 1.55 |
-| lookup_fail_std_random | 21,766 | 13,505 | -8,261 | -37.95% | x 1.61 |
-| lookup_fail_std_serial | 19,336 | 13,519 | -5,817 | -30.08% | x 1.43 |
-
 ## Usage
 
 Add this to your `Cargo.toml`:
@@ -107,7 +62,7 @@ This crate has the following Cargo features:
 - `raw-entry`: Enables access to the deprecated `RawEntry` API.
 - `inline-more`: Adds inline hints to most functions, improving run-time performance at the cost
   of compilation time. (enabled by default)
-- `default-hasher`: Compiles with ahash as default hasher. (enabled by default)
+- `default-hasher`: Compiles with foldhash as default hasher. (enabled by default)
 - `allocator-api2`: Enables support for allocators that support `allocator-api2`. (enabled by default)
 
 ## License

benches/bench.rs

27 additions, 27 deletions

@@ -1,5 +1,5 @@
 // This benchmark suite contains some benchmarks along a set of dimensions:
-//   Hasher: std default (SipHash) and crate default (AHash).
+//   Hasher: std default (SipHash) and crate default (foldhash).
 //   Int key distribution: low bit heavy, top bit heavy, and random.
 //   Task: basic functionality: insert, insert_erase, lookup, lookup_fail, iter
 #![feature(test)]
@@ -18,7 +18,7 @@ use std::{
 const SIZE: usize = 1000;
 
 // The default hashmap when using this crate directly.
-type AHashMap<K, V> = HashMap<K, V, DefaultHashBuilder>;
+type FoldHashMap<K, V> = HashMap<K, V, DefaultHashBuilder>;
 // This uses the hashmap from this crate with the default hasher of the stdlib.
 type StdHashMap<K, V> = HashMap<K, V, RandomState>;
 
@@ -58,22 +58,22 @@ impl Drop for DropType {
 }
 
 macro_rules! bench_suite {
-    ($bench_macro:ident, $bench_ahash_serial:ident, $bench_std_serial:ident,
-     $bench_ahash_highbits:ident, $bench_std_highbits:ident,
-     $bench_ahash_random:ident, $bench_std_random:ident) => {
-        $bench_macro!($bench_ahash_serial, AHashMap, 0..);
+    ($bench_macro:ident, $bench_foldhash_serial:ident, $bench_std_serial:ident,
+     $bench_foldhash_highbits:ident, $bench_std_highbits:ident,
+     $bench_foldhash_random:ident, $bench_std_random:ident) => {
+        $bench_macro!($bench_foldhash_serial, FoldHashMap, 0..);
         $bench_macro!($bench_std_serial, StdHashMap, 0..);
         $bench_macro!(
-            $bench_ahash_highbits,
-            AHashMap,
+            $bench_foldhash_highbits,
+            FoldHashMap,
             (0..).map(usize::swap_bytes)
         );
         $bench_macro!(
             $bench_std_highbits,
             StdHashMap,
             (0..).map(usize::swap_bytes)
         );
-        $bench_macro!($bench_ahash_random, AHashMap, RandomKeys::new());
+        $bench_macro!($bench_foldhash_random, FoldHashMap, RandomKeys::new());
         $bench_macro!($bench_std_random, StdHashMap, RandomKeys::new());
     };
 }
@@ -97,11 +97,11 @@ macro_rules! bench_insert {
 
 bench_suite!(
     bench_insert,
-    insert_ahash_serial,
+    insert_foldhash_serial,
     insert_std_serial,
-    insert_ahash_highbits,
+    insert_foldhash_highbits,
     insert_std_highbits,
-    insert_ahash_random,
+    insert_foldhash_random,
     insert_std_random
 );
 
@@ -122,11 +122,11 @@ macro_rules! bench_grow_insert {
 
 bench_suite!(
     bench_grow_insert,
-    grow_insert_ahash_serial,
+    grow_insert_foldhash_serial,
     grow_insert_std_serial,
-    grow_insert_ahash_highbits,
+    grow_insert_foldhash_highbits,
     grow_insert_std_highbits,
-    grow_insert_ahash_random,
+    grow_insert_foldhash_random,
     grow_insert_std_random
 );
 
@@ -158,11 +158,11 @@ macro_rules! bench_insert_erase {
 
 bench_suite!(
     bench_insert_erase,
-    insert_erase_ahash_serial,
+    insert_erase_foldhash_serial,
     insert_erase_std_serial,
-    insert_erase_ahash_highbits,
+    insert_erase_foldhash_highbits,
     insert_erase_std_highbits,
-    insert_erase_ahash_random,
+    insert_erase_foldhash_random,
     insert_erase_std_random
 );
 
@@ -187,11 +187,11 @@ macro_rules! bench_lookup {
 
 bench_suite!(
     bench_lookup,
-    lookup_ahash_serial,
+    lookup_foldhash_serial,
     lookup_std_serial,
-    lookup_ahash_highbits,
+    lookup_foldhash_highbits,
     lookup_std_highbits,
-    lookup_ahash_random,
+    lookup_foldhash_random,
     lookup_std_random
 );
 
@@ -216,11 +216,11 @@ macro_rules! bench_lookup_fail {
 
 bench_suite!(
     bench_lookup_fail,
-    lookup_fail_ahash_serial,
+    lookup_fail_foldhash_serial,
     lookup_fail_std_serial,
-    lookup_fail_ahash_highbits,
+    lookup_fail_foldhash_highbits,
     lookup_fail_std_highbits,
-    lookup_fail_ahash_random,
+    lookup_fail_foldhash_random,
     lookup_fail_std_random
 );
 
@@ -244,11 +244,11 @@ macro_rules! bench_iter {
 
 bench_suite!(
     bench_iter,
-    iter_ahash_serial,
+    iter_foldhash_serial,
     iter_std_serial,
-    iter_ahash_highbits,
+    iter_foldhash_highbits,
     iter_std_highbits,
-    iter_ahash_random,
+    iter_foldhash_random,
     iter_std_random
 );
 

src/lib.rs

3 additions, 3 deletions

@@ -39,11 +39,11 @@
 #![cfg_attr(feature = "nightly", warn(fuzzy_provenance_casts))]
 #![cfg_attr(feature = "nightly", allow(internal_features))]
 
-/// Default hasher for [`HashMap`], [`HashSet`] and [`HashTable`].
+/// Default hasher for [`HashMap`] and [`HashSet`].
 #[cfg(feature = "default-hasher")]
-pub type DefaultHashBuilder = core::hash::BuildHasherDefault<ahash::AHasher>;
+pub type DefaultHashBuilder = foldhash::fast::RandomState;
 
-/// Dummy default hasher for [`HashMap`], [`HashSet`] and [`HashTable`].
+/// Dummy default hasher for [`HashMap`] and [`HashSet`].
 #[cfg(not(feature = "default-hasher"))]
 pub enum DefaultHashBuilder {}
 

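
The alias above moves from a `BuildHasherDefault<...>` wrapper to a `RandomState` type. The sketch below uses only the standard library to show the general difference between those two kinds of builder. It is an illustration under assumptions: std's `DefaultHasher` and `RandomState` stand in for the real hashers, `hash_with_fresh_builder` is a hypothetical helper, and nothing here describes how `foldhash::fast::RandomState` actually seeds itself.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault};

/// Hypothetical helper: hash `value` with a freshly constructed builder `S`.
fn hash_with_fresh_builder<S: BuildHasher + Default>(value: u64) -> u64 {
    S::default().hash_one(value)
}

/// A builder that carries no state: every instance creates the hasher via
/// `Default`, so all instances hash a given input identically.
type FixedBuilder = BuildHasherDefault<DefaultHasher>;

fn main() {
    let a = hash_with_fresh_builder::<FixedBuilder>(42);
    let b = hash_with_fresh_builder::<FixedBuilder>(42);
    // Stateless builder: the two hashes are always equal.
    assert_eq!(a, b);

    // std's RandomState picks keys per instance, so these two values are
    // expected to differ between instances and between runs.
    let c = hash_with_fresh_builder::<RandomState>(42);
    let d = hash_with_fresh_builder::<RandomState>(42);
    println!("fixed: {a:#x} {b:#x}");
    println!("random: {c:#x} {d:#x}");
}
```

`BuildHasher::hash_one` requires Rust 1.71 or later, which is newer than the `rust-version` declared in the crate's `Cargo.toml`; the sketch is meant to be run on its own.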
src/map.rs

10 additions, 10 deletions

@@ -15,7 +15,7 @@ pub use crate::raw_entry::*;
 
 /// A hash map implemented with quadratic probing and SIMD lookup.
 ///
-/// The default hashing algorithm is currently [`AHash`], though this is
+/// The default hashing algorithm is currently [`foldhash`], though this is
 /// subject to change at any point in the future. This hash function is very
 /// fast for all types of keys, but this algorithm will typically *not* protect
 /// against attacks such as HashDoS.
@@ -142,7 +142,7 @@ pub use crate::raw_entry::*;
 /// [`with_hasher`]: #method.with_hasher
 /// [`with_capacity_and_hasher`]: #method.with_capacity_and_hasher
 /// [`fnv`]: https://crates.io/crates/fnv
-/// [`AHash`]: https://crates.io/crates/ahash
+/// [`foldhash`]: https://crates.io/crates/foldhash
 ///
 /// ```
 /// use hashbrown::HashMap;
@@ -270,7 +270,7 @@ impl<K, V> HashMap<K, V, DefaultHashBuilder> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`], for example with
     /// [`with_hasher`](HashMap::with_hasher) method.
     ///
@@ -300,7 +300,7 @@ impl<K, V> HashMap<K, V, DefaultHashBuilder> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`], for example with
     /// [`with_capacity_and_hasher`](HashMap::with_capacity_and_hasher) method.
     ///
@@ -333,7 +333,7 @@ impl<K, V, A: Allocator> HashMap<K, V, DefaultHashBuilder, A> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`], for example with
     /// [`with_hasher_in`](HashMap::with_hasher_in) method.
     ///
@@ -377,7 +377,7 @@ impl<K, V, A: Allocator> HashMap<K, V, DefaultHashBuilder, A> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`], for example with
     /// [`with_capacity_and_hasher_in`](HashMap::with_capacity_and_hasher_in) method.
     ///
@@ -429,7 +429,7 @@ impl<K, V, S> HashMap<K, V, S> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`].
     ///
     /// The `hash_builder` passed should implement the [`BuildHasher`] trait for
@@ -471,7 +471,7 @@ impl<K, V, S> HashMap<K, V, S> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`].
     ///
     /// The `hash_builder` passed should implement the [`BuildHasher`] trait for
@@ -521,7 +521,7 @@ impl<K, V, S, A: Allocator> HashMap<K, V, S, A> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`].
     ///
     /// [`HashDoS`]: https://en.wikipedia.org/wiki/Collision_attack
@@ -556,7 +556,7 @@ impl<K, V, S, A: Allocator> HashMap<K, V, S, A> {
     /// The `hash_builder` normally use a fixed key by default and that does
     /// not allow the `HashMap` to be protected against attacks such as [`HashDoS`].
     /// Users who require HashDoS resistance should explicitly use
-    /// [`ahash::RandomState`] or [`std::collections::hash_map::RandomState`]
+    /// [`std::collections::hash_map::RandomState`]
     /// as the hasher when creating a [`HashMap`].
     ///
     /// [`HashDoS`]: https://en.wikipedia.org/wiki/Collision_attack
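
The doc comments above recommend passing `std::collections::hash_map::RandomState` to `with_hasher` when HashDoS resistance matters. A minimal sketch of that pattern follows. To stay free of external crates it uses `std::collections::HashMap` as a stand-in, on the assumption that `hashbrown::HashMap` offers the same `with_hasher` constructor the doc comments refer to; `count_words` is a hypothetical helper.

```rust
use std::collections::hash_map::RandomState;
// Stand-in so the sketch compiles without external crates. With hashbrown
// as a dependency, this import would be `use hashbrown::HashMap;` instead.
use std::collections::HashMap;

/// Hypothetical helper: count words from untrusted input in a map that is
/// explicitly built with the randomly keyed SipHash builder from std.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, u32, RandomState> {
    // Naming the hasher explicitly avoids depending on the crate default,
    // which the docs say does not protect against HashDoS.
    let mut counts = HashMap::with_hasher(RandomState::new());
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words(&["a", "b", "a"]);
    println!("a: {:?}, b: {:?}", counts.get("a"), counts.get("b"));
}
```

Spelling out the third type parameter in the return type keeps the choice of hasher visible at the call site, so a later change of the crate's default hasher does not silently alter it.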
