To set up the disk cache, first choose an engine suited to your workload. Currently, ***foyer*** supports 3 kinds of engines:
66
+
67
+
- `Engine::Large`: For cache entries larger than 2 KiB. Friendly to HDD/SSD while minimizing memory usage for indexing.
68
+
- `Engine::Small`: For cache entries smaller than 2 KiB. A set-associative cache that does not use memory for indexing.
69
+
- `Engine::Mixed(ratio)`: For cache entries of all sizes. Combines `Engine::Large` and `Engine::Small`; `ratio` controls the proportion of the capacity given to `Engine::Small`. Introduces a little overhead compared to using `Engine::Large` or `Engine::Small` alone.
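
The selection rule in the list above can be sketched in a few lines of plain Rust. This is only an illustration: `EngineChoice`, `suggest_engine`, and `split_capacity` are hypothetical names that are not part of ***foyer***, and treating an entry of exactly 2 KiB as "large" is an assumption, since the text leaves the boundary unspecified.

```rust
/// Size threshold named in the text: 2 KiB.
const SMALL_ENTRY_LIMIT: usize = 2 * 1024;

/// Hypothetical stand-in for the engine choice; this is NOT foyer's `Engine` type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EngineChoice {
    Large,
    Small,
    /// Fraction of the disk capacity given to the small-object part.
    Mixed(f64),
}

/// Suggest an engine from the smallest and largest expected entry sizes in bytes.
/// Assumption of this sketch: an entry of exactly 2 KiB counts as large.
fn suggest_engine(min_entry: usize, max_entry: usize, small_ratio: f64) -> EngineChoice {
    if min_entry >= SMALL_ENTRY_LIMIT {
        EngineChoice::Large
    } else if max_entry < SMALL_ENTRY_LIMIT {
        EngineChoice::Small
    } else {
        EngineChoice::Mixed(small_ratio)
    }
}

/// Split a total capacity into (small, large) parts for a mixed setup.
fn split_capacity(total: u64, small_ratio: f64) -> (u64, u64) {
    let small = (total as f64 * small_ratio.clamp(0.0, 1.0)) as u64;
    (small, total - small)
}
```

With a workload that stores both 64-byte and 64 KiB entries, `suggest_engine(64, 65536, 0.1)` yields the mixed choice, and `split_capacity` shows how a ratio of `0.1` would divide the disk capacity under this model.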
70
+
71
+
For more details about the engines, please refer to [Design - Architecture](/docs/design/architecture).
72
+
73
+
:::warning
74
+
75
+
`Engine::Small` and `Engine::Mixed` are in preview and have **NOT** undergone sufficient testing in production environments. Please use them with caution in production systems.
76
+
77
+
If you have such needs, you can contact me via GitHub. We can work together to improve the system for production and make ***foyer*** better! 🥰
78
+
79
+
:::
80
+
81
+
#### 2.3.2 Setup shared options
82
+
83
+
Some options are shared between engines. You can set up the shared options before setting up engine-specific options.
84
+
85
+
##### 2.3.2.1 Setup device options
68
86
69
87
By default, the hybrid cache will **NOT** include a disk cache unless you specify a device. With the default configuration, the hybrid cache runs in a mode compatible with an in-memory cache, and all lookups to the disk return a miss. This is useful for debugging, or if you want to support either the in-memory cache or the hybrid cache depending on your project's configuration.
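
The behavior described above can be modeled with a toy two-tier cache. This is a minimal sketch under stated assumptions, not ***foyer***'s implementation: `ToyHybridCache` and its methods are hypothetical, the tiers are plain `HashMap`s, and moving an entry to disk on eviction is a simplification chosen for the example.

```rust
use std::collections::HashMap;

/// Toy two-tier cache. `disk` stays `None` until a device is configured.
struct ToyHybridCache {
    memory: HashMap<String, Vec<u8>>,
    disk: Option<HashMap<String, Vec<u8>>>,
}

impl ToyHybridCache {
    /// No device specified: behaves like a plain in-memory cache.
    fn memory_only() -> Self {
        Self { memory: HashMap::new(), disk: None }
    }

    /// Device specified: the disk tier exists.
    fn with_disk() -> Self {
        Self { memory: HashMap::new(), disk: Some(HashMap::new()) }
    }

    fn insert(&mut self, key: &str, value: Vec<u8>) {
        self.memory.insert(key.to_string(), value);
    }

    /// Evict from memory. The entry survives only if a disk tier exists.
    fn evict(&mut self, key: &str) {
        if let Some(value) = self.memory.remove(key) {
            if let Some(disk) = self.disk.as_mut() {
                disk.insert(key.to_string(), value);
            }
        }
    }

    /// Look up memory first, then disk. Without a device the disk lookup always misses.
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        if let Some(value) = self.memory.get(key) {
            return Some(value);
        }
        self.disk.as_ref().and_then(|disk| disk.get(key))
    }
}
```

The calling code is identical in both modes; only the constructor differs, which is what makes switching between the two by configuration cheap.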
70
88
@@ -74,9 +92,7 @@ RisingWave[^risingwave] supports caching the LSM-tree meta and blocks in both hy
74
92
75
93
:::
76
94
77
-
#### 2.3.2 Run on hybrid cache mode with a device
78
-
79
-
To specify a device for the hybrid cache, just call `with_device_config()`[^with-device-config] and provide the device config.
95
+
To enable the hybrid cache mode, a device needs to be specified by calling `with_device_options()`[^with-device-options] and providing the device options.
80
96
81
97
Currently, the storage of the hybrid cache supports 2 kinds of devices:
82
98
@@ -97,28 +113,27 @@ Let's take `DirectFsDevice` as an example:
This example uses directory `/data/foyer` to store disk cache data using a device options builder. With the default configuration, ***foyer*** will take 80% of the current free space as the disk cache capacity. You can also specify the disk cache capacity and per file size with the builder.
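
As a worked example of the default sizing rule, the arithmetic can be written out as below. The function names are hypothetical, and rounding the capacity down to a whole number of files is an assumption of this sketch rather than documented behavior.

```rust
/// Default share of the current free space used as disk cache capacity: 80%, per the text.
fn default_capacity(free_bytes: u64) -> u64 {
    free_bytes / 10 * 8
}

/// Number of whole files that fit into the capacity.
/// Assumption of this sketch: a partial file is not counted; a zero file size yields 0.
fn file_count(capacity: u64, file_size: u64) -> u64 {
    capacity.checked_div(file_size).unwrap_or(0)
}
```

For a device with 100 GiB free, this gives a default capacity of 80 GiB, which a per-file size of 64 MiB would divide into 1280 files.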
108
124
109
-
For more details, please refer to the API document.[^direct-fs-device-options-builder][^direct-file-device-options-builder]
125
+
For more details, please refer to the API document.[^direct-fs-device-options][^direct-file-device-options]
110
126
111
-
#### 2.3.3 Restrict the throughput
127
+
##### 2.3.2.2 Restrict the throughput
112
128
113
129
The bandwidth of the disk is much lower than the bandwidth of the memory. To avoid excessive use of the disk bandwidth, it is **HIGHLY RECOMMENDED** to set up the admission picker with a rate limiter.
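
To illustrate the idea of rate-limited admission, here is a toy token bucket in plain Rust. It is a sketch only: `RateLimitedAdmission` is a hypothetical type, not ***foyer***'s admission picker, and the clock is passed in as a parameter so that the example stays deterministic.

```rust
/// Toy token-bucket limiter deciding whether an entry may be written to disk.
/// Hypothetical type for illustration; not part of foyer.
struct RateLimitedAdmission {
    rate: f64,      // tokens (bytes) refilled per second
    capacity: f64,  // maximum burst in bytes
    tokens: f64,    // currently available bytes
    last_secs: f64, // time of the last refill
}

impl RateLimitedAdmission {
    /// Start with a full bucket that allows a one-second burst.
    fn new(bytes_per_sec: f64) -> Self {
        Self {
            rate: bytes_per_sec,
            capacity: bytes_per_sec,
            tokens: bytes_per_sec,
            last_secs: 0.0,
        }
    }

    /// Return `true` if an entry of `entry_bytes` may be admitted at time `now_secs`.
    fn admit(&mut self, entry_bytes: usize, now_secs: f64) -> bool {
        // Refill according to the elapsed time, capped at the burst capacity.
        let elapsed = (now_secs - self.last_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        self.last_secs = self.last_secs.max(now_secs);

        let cost = entry_bytes as f64;
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            // Rejected entries simply skip the disk cache.
            false
        }
    }
}
```

Entries rejected by the limiter are not written to disk, so disk write traffic stays near the configured rate at the cost of a lower disk hit ratio during bursts.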
@@ -134,17 +149,32 @@ For more details, please refer to [Design - Architecture](/docs/design/architect
134
149
135
150
:::
136
151
137
-
#### 2.3.4 Achieve better performance
152
+
##### 2.3.2.3 Other shared options
138
153
139
-
The hybrid cache builder also provides a lot of detailed arguments for tuning.
154
+
There are also other shared options for tuning or other purposes.
140
155
141
-
For example:
156
+
- `with_runtime_options()`: Set the runtime options to enable and set up the dedicated runtime.
157
+
- `with_compression()`: Set the compression algorithm for serialization and deserialization.
158
+
- `with_recover_mode()`: Set the recover mode.
159
+
- `with_flush()`: Set whether to trigger a flush operation after writing to the disk.
160
+
- ...
142
161
143
-
- `with_indexer_shards()` can be used for mitigating hot shards of the indexer.
144
-
- `with_flushers()`, `with_reclaimers()` and `with_recover_concurrency()` can be used to tune the concurrency of the inner components.
145
-
- `with_runtime_config()` can be used to enable the dedicated runtime or further runtime splitting.
162
+
For more details, please refer to the API document.[^hybrid-cache-builder]
146
163
147
-
Tuning the optimized parameters requires an understanding of ***foyer*** interior design and benchmarking with the real workload. For more details, please refer to [Design - Architecture](/docs/design/architecture).
164
+
:::tip
165
+
166
+
Tuning these parameters requires an understanding of ***foyer***'s internal design and benchmarking with the real workload. For more details, please refer to [Topic - Tuning](/docs/topic/tuning) and [Design - Architecture](/docs/design/architecture).
167
+
168
+
:::
169
+
170
+
#### 2.3.3 Setup engine-specific options
171
+
172
+
Each engine also has its specific options for tuning or other purposes.
173
+
174
+
- `with_large_object_disk_cache_options()`[^with-large-object-disk-cache-options]: Set the options for the large object disk cache.
175
+
- `with_small_object_disk_cache_options()`[^with-small-object-disk-cache-options]: Set the options for the small object disk cache.
176
+
177
+
For more details, please refer to the API document [^large-engine-options][^small-engine-options].
148
178
149
179
## 3. `HybridCache` Usage
150
180
@@ -193,7 +223,7 @@ The hybrid cache also provides a `writer()` interface for advanced usage, such a
***foyer***, as its slogan says, is a hybrid cache library for the Rust programming language. 🦀
18
+
19
+
## What is hybrid cache?
20
+
21
+
A hybrid cache is a caching system that utilizes both memory and disk storage simultaneously.
22
+
23
+
<div style="text-align: center;">
24
+
25
+

26
+
27
+
</div>
28
+
29
+
It is commonly used to extend an insufficient memory cache in systems that use an Object Store Service (OSS, e.g. AWS S3) as their primary data storage[^oss-dia], in order to **improve performance** and **reduce costs**[^risingwave].
30
+
31
+
## Why do we need a hybrid cache?
32
+
33
+
More and more systems are using OSS as their primary data storage. OSS has many great features, such as low storage cost, high availability and durability, and almost unlimited scalability.
34
+
35
+
However, OSS also has downsides. For example, its latency is high and unpredictable, and its cost increases with each access. These downsides are amplified with a large working set, because more data is exchanged between the cache and OSS.
36
+
37
+
<div style="text-align: center;">
38
+
39
+

40
+
41
+
</div>
42
+
43
+
With a hybrid cache, the ability to cache the working set can be extended from memory only to memory and disk. This can reduce data exchange between cache and OSS, thereby improving performance and reducing costs.
As a hybrid cache, ***foyer*** provides the following highlighted features:
54
+
55
+
- **Hybrid Cache**: Seamlessly integrates both in-memory and disk cache for optimal performance and flexibility.
56
+
- **Plug-and-Play Algorithms**: Empowers users with easily replaceable caching algorithms, ensuring adaptability to diverse use cases.
57
+
- **Fearless Concurrency**: Built to handle high concurrency with robust thread-safe mechanisms, guaranteeing reliable performance under heavy loads.
58
+
- **Zero-Copy Abstraction**: Leveraging Rust's robust type system, the in-memory cache in ***foyer*** achieves better performance with zero-copy abstraction.
59
+
- **User-Friendly Interface**: Offers a simple and intuitive API, making cache integration effortless and accessible for developers of all levels.
60
+
- **Out-of-the-Box Observability**: Integrates popular observability systems such as Prometheus, Grafana, OpenTelemetry, and Jaeger in just **ONE** line.
61
+
62
+
## Why use foyer, when you have 'X'?
63
+
64
+
Unfortunately, there is currently no general-purpose hybrid cache library in the Rust community. If you need a hybrid cache, ***foyer*** would be your best choice.
65
+
66
+
CacheLib[^cachelib] provides a Rust binding. However, it only provides limited interfaces. You still need to patch the C++ codebase if you have requirements such as logging, metrics, or tracing support. Besides, ***foyer*** provides a better-optimized storage engine implementation than CacheLib. You can try both ***foyer*** and CacheLib to compare the benchmarks.
67
+
68
+
For use as an in-memory-only cache, ***foyer*** also provides compatible interfaces and competitive performance. Benchmarks[^benchmark] show that ***foyer*** outperforms moka[^moka] and is second only to quick-cache[^quick-cache].
69
+
70
+
<div style="text-align: center;">
71
+
72
+

73
+
74
+
</div>
75
+
76
+
## What's next?
77
+
78
+
- To learn how to use ***foyer*** in your project, go to [Tutorial](/docs/category/tutorial).
79
+
- To learn how to solve various challenging situations with ***foyer***, go to [Topic](/docs/category/topic).
80
+
- To learn how other projects use ***foyer***, go to [Case Study](/docs/category/case-study).
81
+
- To learn the design of ***foyer***, go to [Design](/docs/category/design).
82
+
83
+
## Acknowledgement
84
+
85
+
***foyer*** draws inspiration from CacheLib[^cachelib], a well-known cache library written in C++, and Caffeine[^caffeine], a widely-used in-memory cache library in Java, among other projects like moka[^moka], intrusive-rs[^intrusive-rs], etc.
86
+
87
+
Thank you for your efforts! 🥰
88
+
89
+
[^oss-dia]: Systems using OSS as their primary data storage: [RisingWave](https://github.com/risingwavelabs/risingwave), [Chroma Cloud](https://github.com/chroma-core/chroma), [SlateDB](https://github.com/slatedb/slatedb), etc.
90
+
91
+
[^risingwave]: How the streaming database RisingWave uses foyer to improve performance and reduce costs: [Case Study - RisingWave](/docs/case-study/risingwave).