Commit 30c01ca

Improve documentation
- Move comparison table to a separate section.
- Use CSS icons to make table more readable.
- Refer to the table from backend documentations.
- Explain how backends store and manipulate interned data.

Signed-off-by: Tin Švagelj <[email protected]>
1 parent 6dcb898 commit 30c01ca

6 files changed with 458 additions and 133 deletions

src/backend/bucket/mod.rs

Lines changed: 27 additions & 33 deletions
@@ -9,39 +9,33 @@ use crate::{symbol::expect_valid_symbol, DefaultSymbol, Symbol};
 use alloc::{string::String, vec::Vec};
 use core::{iter::Enumerate, marker::PhantomData, slice};
 
-/// An interner backend that reduces memory allocations by using string buckets.
-///
-/// # Note
-///
-/// Implementation inspired by matklad's blog post that can be found here:
-/// <https://matklad.github.io/2020/03/22/fast-simple-rust-interner.html>
-///
-/// # Usage Hint
-///
-/// Use when deallocations or copy overhead is costly or when
-/// interning of static strings is especially common.
-///
-/// # Usage
-///
-/// - **Fill:** Efficiency of filling an empty string interner.
-/// - **Resolve:** Efficiency of interned string look-up given a symbol.
-/// - **Allocations:** The number of allocations performed by the backend.
-/// - **Footprint:** The total heap memory consumed by the backend.
-/// - **Contiguous:** True if the returned symbols have contiguous values.
-/// - **Iteration:** Efficiency of iterating over the interned strings.
-///
-/// Rating varies between **bad**, **ok**, **good** and **best**.
-///
-/// | Scenario | Rating |
-/// |:------------|:--------:|
-/// | Fill | **good** |
-/// | Resolve | **best** |
-/// | Allocations | **good** |
-/// | Footprint | **ok** |
-/// | Supports `get_or_intern_static` | **yes** |
-/// | `Send` + `Sync` | **yes** |
-/// | Contiguous | **yes** |
-/// | Iteration | **best** |
+/// An interner backend that reduces memory allocations by using buckets.
+///
+/// # Overview
+/// This interner uses fixed-size buckets to store interned strings. Each bucket is
+/// allocated once and holds a set number of strings. When a bucket becomes full, a new
+/// bucket is allocated to hold more strings. Buckets are never deallocated, which reduces
+/// the overhead of frequent memory allocations and copying.
+///
+/// ## Trade-offs
+/// - **Advantages:**
+///   - Strings in already used buckets remain valid and accessible even as new strings
+///     are added.
+/// - **Disadvantages:**
+///   - Slightly slower access times due to double indirection (looking up the string
+///     involves an extra level of lookup through the bucket).
+///   - Memory may be used inefficiently if many buckets are allocated but only partially
+///     filled because of large strings.
+///
+/// ## Use Cases
+/// This backend is ideal when interned strings must remain valid even after new ones are
+/// added.
+///
+/// Refer to the [comparison table][crate::_docs::comparison_table] for comparison with
+/// other backends.
+///
+/// [matklad's blog post]:
+///     https://matklad.github.io/2020/03/22/fast-simple-rust-interner.html
 #[derive(Debug)]
 pub struct BucketBackend<'i, S: Symbol = DefaultSymbol> {
     spans: Vec<InternedStr>,
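
The bucket scheme described in the new `BucketBackend` docs can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `BucketSketch`, its fields, and the choice to measure bucket capacity in bytes are all assumptions made for the example, and it uses safe indices where a real backend might store pointers.

```rust
/// Hypothetical, simplified bucket store (not the crate's `BucketBackend`).
pub struct BucketSketch {
    /// All buckets allocated so far; a bucket is never grown after creation.
    buckets: Vec<String>,
    /// One entry per symbol: (bucket index, start offset, byte length).
    spans: Vec<(usize, usize, usize)>,
    /// Assumed minimum byte capacity of a freshly allocated bucket.
    bucket_cap: usize,
}

impl BucketSketch {
    pub fn new(bucket_cap: usize) -> Self {
        Self {
            buckets: vec![String::with_capacity(bucket_cap)],
            spans: Vec::new(),
            bucket_cap,
        }
    }

    /// Stores `s` and returns its symbol (a plain index in this sketch).
    pub fn intern(&mut self, s: &str) -> usize {
        let last = self.buckets.last().expect("at least one bucket always exists");
        if last.capacity() - last.len() < s.len() {
            // The current bucket cannot hold `s` without reallocating,
            // so open a new bucket instead of growing the old one.
            let cap = self.bucket_cap.max(s.len());
            self.buckets.push(String::with_capacity(cap));
        }
        let idx = self.buckets.len() - 1;
        let bucket = &mut self.buckets[idx];
        let start = bucket.len();
        bucket.push_str(s); // Fits in spare capacity, so existing bytes do not move.
        self.spans.push((idx, start, s.len()));
        self.spans.len() - 1
    }

    /// Double indirection: symbol -> span -> bucket contents.
    pub fn resolve(&self, symbol: usize) -> Option<&str> {
        let &(idx, start, len) = self.spans.get(symbol)?;
        Some(&self.buckets[idx][start..start + len])
    }

    pub fn bucket_count(&self) -> usize {
        self.buckets.len()
    }
}
```

Because a bucket is only appended to while it has spare capacity, the bytes of an already interned string keep their address as more strings are added, which is the property the docs list as the main advantage.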

src/backend/buffer.rs

Lines changed: 13 additions & 25 deletions
@@ -5,34 +5,22 @@ use crate::{symbol::expect_valid_symbol, DefaultSymbol, Symbol};
 use alloc::vec::Vec;
 use core::{mem, str};
 
-/// An interner backend that appends all interned string information in a single buffer.
+/// An interner backend that concatenates all interned string contents into one large
+/// [`Vec`] buffer. Unlike [`StringBackend`][crate::backend::StringBackend], each string's
+/// length is stored in the same buffer, immediately preceding the respective string
+/// data.
 ///
-/// # Usage Hint
+/// ## Trade-offs
+/// - **Advantages:**
+///   - Accessing interned strings is fast, as it requires a single lookup.
+/// - **Disadvantages:**
+///   - Iteration is slow because it requires consecutive reading of lengths to advance.
 ///
-/// Use this backend if memory consumption is what matters most to you.
-/// Note though that unlike all other backends symbol values are not contigous!
+/// ## Use Cases
+/// This backend is ideal for storing many small (<255 characters) strings.
 ///
-/// # Usage
-///
-/// - **Fill:** Efficiency of filling an empty string interner.
-/// - **Resolve:** Efficiency of interned string look-up given a symbol.
-/// - **Allocations:** The number of allocations performed by the backend.
-/// - **Footprint:** The total heap memory consumed by the backend.
-/// - **Contiguous:** True if the returned symbols have contiguous values.
-/// - **Iteration:** Efficiency of iterating over the interned strings.
-///
-/// Rating varies between **bad**, **ok**, **good** and **best**.
-///
-/// | Scenario | Rating |
-/// |:------------|:--------:|
-/// | Fill | **best** |
-/// | Resolve | **bad** |
-/// | Allocations | **best** |
-/// | Footprint | **best** |
-/// | Supports `get_or_intern_static` | **no** |
-/// | `Send` + `Sync` | **yes** |
-/// | Contiguous | **no** |
-/// | Iteration | **bad** |
+/// Refer to the [comparison table][crate::_docs::comparison_table] for comparison with
+/// other backends.
 #[derive(Debug)]
 pub struct BufferBackend<'i, S: Symbol = DefaultSymbol> {
     len_strings: usize,
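
The length-prefixed layout described in the `BufferBackend` docs can be illustrated with a small sketch. Everything here is an assumption made for the example: the `BufferSketch` type, the use of the entry's byte offset as the symbol, and the 7-bit variable-length encoding of the length prefix. The crate's actual encoding may differ.

```rust
/// Hypothetical single-buffer store: each entry is `[length prefix][string bytes]`.
pub struct BufferSketch {
    buffer: Vec<u8>,
}

impl BufferSketch {
    pub fn new() -> Self {
        Self { buffer: Vec::new() }
    }

    /// Appends `s` and returns the byte offset of its entry as the symbol.
    pub fn intern(&mut self, s: &str) -> usize {
        let symbol = self.buffer.len();
        let mut len = s.len();
        // Assumed encoding: 7 bits per byte, high bit set when more bytes follow.
        loop {
            let byte = (len & 0x7f) as u8;
            len >>= 7;
            if len == 0 {
                self.buffer.push(byte);
                break;
            }
            self.buffer.push(byte | 0x80);
        }
        self.buffer.extend_from_slice(s.as_bytes());
        symbol
    }

    /// Decodes the length prefix at `pos`; returns (length, offset of string data).
    fn read_len(&self, mut pos: usize) -> Option<(usize, usize)> {
        let mut len = 0usize;
        let mut shift = 0u32;
        loop {
            if shift >= usize::BITS {
                return None; // Malformed prefix, e.g. from an invalid symbol.
            }
            let byte = *self.buffer.get(pos)?;
            pos += 1;
            len |= ((byte & 0x7f) as usize) << shift;
            if byte & 0x80 == 0 {
                return Some((len, pos));
            }
            shift += 7;
        }
    }

    /// Single lookup: the length and the data sit next to each other.
    pub fn resolve(&self, symbol: usize) -> Option<&str> {
        let (len, start) = self.read_len(symbol)?;
        let bytes = self.buffer.get(start..start.checked_add(len)?)?;
        std::str::from_utf8(bytes).ok()
    }

    /// Iteration must decode every length prefix in order to find the next entry.
    pub fn iter(&self) -> impl Iterator<Item = (usize, &str)> + '_ {
        let mut pos = 0;
        std::iter::from_fn(move || {
            let symbol = pos;
            let (len, start) = self.read_len(pos)?;
            let s = std::str::from_utf8(self.buffer.get(start..start + len)?).ok()?;
            pos = start + len;
            Some((symbol, s))
        })
    }
}
```

In this sketch the symbols are byte offsets, so they are not contiguous, and a symbol that does not point at the start of an entry resolves to `None` or to unrelated bytes. Strings shorter than 128 bytes cost one prefix byte under the assumed encoding.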

src/backend/string.rs

Lines changed: 18 additions & 29 deletions
@@ -5,38 +5,27 @@ use crate::{symbol::expect_valid_symbol, DefaultSymbol, Symbol};
 use alloc::{string::String, vec::Vec};
 use core::{iter::Enumerate, slice};
 
-/// An interner backend that accumulates all interned string contents into one string.
+/// An interner backend that concatenates all interned string contents into one large
+/// buffer and keeps track of string bounds in a separate [`Vec`].
+///
+/// The implementation is inspired by [CAD97's](https://github.com/CAD97)
+/// [`strena`](https://github.com/CAD97/strena) crate.
 ///
-/// # Note
+/// ## Trade-offs
+/// - **Advantages:**
+///   - Separated length tracking allows fast iteration.
+/// - **Disadvantages:**
+///   - Many insertions separated by external allocations can cause the buffer to drift
+///     far away (in memory) from the `Vec` storing string ends, which impedes the
+///     performance of all interning operations.
+///   - Resolving a symbol requires two heap lookups because data and length are stored in
+///     separate containers.
 ///
-/// Implementation inspired by [CAD97's](https://github.com/CAD97) research
-/// project [`strena`](https://github.com/CAD97/strena).
+/// ## Use Cases
+/// This backend is good for storing fewer, larger strings and for general use.
 ///
-/// # Usage Hint
-///
-/// Use this backend if runtime performance is what matters most to you.
-///
-/// # Usage
-///
-/// - **Fill:** Efficiency of filling an empty string interner.
-/// - **Resolve:** Efficiency of interned string look-up given a symbol.
-/// - **Allocations:** The number of allocations performed by the backend.
-/// - **Footprint:** The total heap memory consumed by the backend.
-/// - **Contiguous:** True if the returned symbols have contiguous values.
-/// - **Iteration:** Efficiency of iterating over the interned strings.
-///
-/// Rating varies between **bad**, **ok**, **good** and **best**.
-///
-/// | Scenario | Rating |
-/// |:------------|:--------:|
-/// | Fill | **good** |
-/// | Resolve | **ok** |
-/// | Allocations | **good** |
-/// | Footprint | **good** |
-/// | Supports `get_or_intern_static` | **no** |
-/// | `Send` + `Sync` | **yes** |
-/// | Contiguous | **yes** |
-/// | Iteration | **good** |
+/// Refer to the [comparison table][crate::_docs::comparison_table] for comparison with
+/// other backends.
 #[derive(Debug)]
 pub struct StringBackend<'i, S: Symbol = DefaultSymbol> {
     ends: Vec<usize>,
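
The layout described in the `StringBackend` docs, one concatenated buffer plus a separate list of string ends, can be sketched as below. `StringSketch` is a hypothetical stand-in written for illustration; only the `ends: Vec<usize>` field name comes from the diff, and the rest is assumed.

```rust
/// Hypothetical two-container store: concatenated contents plus end offsets.
pub struct StringSketch {
    /// `ends[i]` is the byte offset one past the end of string `i` in `buffer`.
    ends: Vec<usize>,
    /// All interned strings, concatenated without separators.
    buffer: String,
}

impl StringSketch {
    pub fn new() -> Self {
        Self { ends: Vec::new(), buffer: String::new() }
    }

    /// Appends `s` and returns its symbol; symbols are contiguous indices.
    pub fn intern(&mut self, s: &str) -> usize {
        self.buffer.push_str(s);
        self.ends.push(self.buffer.len());
        self.ends.len() - 1
    }

    /// Two lookups: first the bounds in `ends`, then the bytes in `buffer`.
    pub fn resolve(&self, symbol: usize) -> Option<&str> {
        let end = *self.ends.get(symbol)?;
        // A string starts where the previous one ended (or at 0 for the first).
        let start = if symbol == 0 { 0 } else { self.ends[symbol - 1] };
        Some(&self.buffer[start..end])
    }

    /// Iteration walks `ends` directly, without decoding anything from `buffer`.
    pub fn iter(&self) -> impl Iterator<Item = (usize, &str)> + '_ {
        let mut start = 0;
        self.ends.iter().enumerate().map(move |(symbol, &end)| {
            let s = &self.buffer[start..end];
            start = end;
            (symbol, s)
        })
    }
}
```

Keeping the bounds in their own `Vec` is what makes iteration cheap here, at the cost of touching two separate heap allocations on every `resolve`.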
