Commit e1b8507

Simone Sanfratello authored
feat: use storage and invalidation (#7)
* feat: storage redis and memory, with invalidation
* feat: redis storage, keys expiration
* feat: optional invalidation
* feat: bench script
* feat: redis heuristic references clean
* feat: redis gc (wip)
* feat: redis gc (wip)
* feat: redis gc (wip)
* feat: redis gc and options validation
* feat: redis gc and options validation
* fix: flaky test
* chore: remove joi
* fix: flaky test
* fix: flaky test
* doc: wip
* fix: remove awaits
* doc: wip
* feat: createCache interface, doc, test
* doc: add Simone Sanfratello as maintainer
* doc: grammar
* doc: update examples
1 parent f3f9ed8 commit e1b8507

25 files changed, +3633 -296 lines changed

Diff for: .github/workflows/ci.yml

+17 -3
@@ -1,15 +1,29 @@
 name: ci

-on: [push, pull_request]
-
+on:
+  push:
+    paths-ignore:
+      - 'docs/**'
+      - '*.md'
+  pull_request:
+    paths-ignore:
+      - 'docs/**'
+      - '*.md'
+
 jobs:
   test:
     runs-on: ubuntu-latest

     strategy:
       matrix:
         node-version: [12.x, 14.x, 16.x]
-
+        redis-tag: [5, 6]
+    services:
+      redis:
+        image: redis:${{ matrix.redis-tag }}
+        ports:
+          - 6379:6379
+        options: --entrypoint redis-server
     steps:
       - uses: actions/checkout@v2

Diff for: .gitignore

+5
@@ -6,6 +6,8 @@ yarn-debug.log*
 yarn-error.log*
 lerna-debug.log*

+package-lock.json
+
 # Diagnostic reports (https://nodejs.org/api/report.html)
 report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

@@ -102,3 +104,6 @@ dist

 # TernJS port file
 .tern-port
+
+.vscode
+docker-compose.yml

Diff for: README.md

+166 -11
@@ -12,10 +12,11 @@ npm i async-cache-dedupe
 ## Example

 ```js
-import { Cache } from 'async-cache-dedupe'
+import { createCache } from 'async-cache-dedupe'

-const cache = new Cache({
-  ttl: 5 // seconds
+const cache = createCache({
+  ttl: 5, // seconds
+  storage: { type: 'memory' },
 })

 cache.define('fetchSomething', async (k) => {
@@ -44,35 +45,189 @@ Commonjs/`require` is also supported.

 ## API

-### `new Cache(opts)`
+### `createCache(opts)`

 Creates a new cache.

 Options:

-* `tll`: the maximum time a cache entry can live, default `0`; if `0`, an element is removed from the cache as soon as as the promise resolves.
-* `cacheSize`: the maximum amount of entries to fit in the cache for each defined method, default `1024`.
+* `ttl`: the maximum time a cache entry can live, default `0`; if `0`, an element is removed from the cache as soon as the promise resolves.
+* `onDedupe`: a function that is called every time a call to a defined function is deduped.
+* `onHit`: a function that is called every time there is a hit in the cache.
+* `onMiss`: a function that is called every time the result is not in the cache.
+* `storage`: the storage options; default is `{ type: "memory" }`
+  Storage options are:
+  * `type`: `memory` (default) or `redis`
+  * `options`: by storage type
+    * for `memory` type
+      * `size`: maximum number of items to store in the cache _per resolver_. Default is `1024`.
+      * `invalidation`: enable invalidation, see [invalidation](#invalidation). Default is disabled.
+      * `log`: a `pino`-compatible logger instance, default is disabled.
+
+      Example
+
+      ```js
+      createCache({ storage: { type: 'memory', options: { size: 2048 } } })
+      ```
+
+    * for `redis` type
+      * `client`: a redis client instance, mandatory. Should be an `ioredis` client or compatible.
+      * `invalidation`: enable invalidation, see [invalidation](#invalidation). Default is disabled.
+      * `invalidation.referencesTTL`: references TTL in seconds, that is, how long the references stay alive; it should be set to the maximum of all the caches' `ttl` values.
+      * `log`: a `pino`-compatible logger instance, default is disabled.
+
+      Example
+
+      ```js
+      createCache({ storage: { type: 'redis', options: { client: new Redis(), invalidation: { referencesTTL: 60 } } } })
+      ```

 ### `cache.define(name[, opts], original(arg, cacheKey))`

 Define a new function to cache of the given `name`.

+The `define` method adds a `cache[name]` function that will call the `original` function if the result is not present
+in the cache. The cache key for `arg` is computed using [`safe-stable-stringify`](https://www.npmjs.com/package/safe-stable-stringify) and it is passed as the `cacheKey` argument to the original function.
+
 Options:

-* `tll`: the maximum time a cache entry can live, default as defined in the cache.
-* `cacheSize`: the maximum amount of entries to fit in the cache for each defined method, default as defined in the cache.
+* `ttl`: the maximum time a cache entry can live, default as defined in the cache; if it is zero (the default), caching is disabled and calls to the function are only deduped.
 * `serialize`: a function to convert the given argument into a serializable object (or string).
+* `onDedupe`: a function that is called every time a call to the defined function is deduped.
 * `onHit`: a function that is called every time there is a hit in the cache.
+* `onMiss`: a function that is called every time the result is not in the cache.
+* `storage`: the storage to use, same as above. It's possible to specify different storages for each defined function for fine-tuning.
+* `references`: sync or async function to generate references, it receives `(args, key, result)` from the defined function call and must return an array of strings or falsy; see [invalidation](#invalidation) to know how to use them.
+Example

-The `define` method adds a `cache[name]` function that will call the `original` function if the result is not present
-in the cache. The cache key for `arg` is computed using [`safe-stable-stringify`](https://www.npmjs.com/package/safe-stable-stringify)
-and it is passed as the `cacheKey` argument to the original function.
+```js
+const cache = createCache({ ttl: 60 })
+
+cache.define('fetchUser', {
+  references: (args, key, result) => result ? [`user~${result.id}`] : null
+},
+(id) => database.find({ table: 'users', where: { id }}))
+
+await cache.fetchUser(1)
+```

 ### `cache.clear([name], [arg])`

 Clear the cache. If `name` is specified, all the cache entries from the function defined with that name are cleared.
 If `arg` is specified, only the elements cached with the given `name` and `arg` are cleared.

+## Invalidation
+
+Along with `time to live` invalidation of the cache entries, we can use invalidation by keys.
+The concept behind invalidation by keys is that entries have an auxiliary key set that explicitly links requests along with their own result. These auxiliary keys are called here `references`.
+Consider a scenario: we have a _user_ entry `{id: 1, name: "Alice"}` that may change often or rarely, so the `ttl` system alone is not accurate:
+
+* it can be updated before the `ttl` expires; in this case the old value is served until the `ttl` expires.
+* it may not be updated before the `ttl` expires; in this case there is no need to reload the value, because it has not changed.
+
+To solve this common problem, we can use `references`.
+We can say that the result of defined function `getUser(id: 1)` has reference `user~1`, and the result of defined function `findUsers`, containing `{id: 1, name: "Alice"},{id: 2, name: "Bob"}` has references `[user~1,user~2]`.
+So we can find the results in the cache by their `references`, independently of the request that generated them, and we can invalidate by `references`.
+
+So, when a write event involving `user {id: 1}` happens (usually an update), we can remove all the entries in the cache that have references to `user~1`, that is, the results of `getUser(id: 1)` and `findUsers`, and they will be reloaded at the next request with the new data; the result of `getUser(id: 2)` is not affected.
+
+Explicit invalidation is disabled by default; you have to enable it in the `storage` settings.
+
+See [mercurius-cache-example](https://github.com/mercurius/mercurius-cache-example) for a complete example.
+
+### Redis
+
+Using a `redis` storage is the best choice for a shared and/or large cache.
+All the `references` entries in redis have a `referencesTTL`, so they are all cleaned up at some point.
+The `referencesTTL` value should be set to the maximum of all the `ttl`s, so that references stay available for every cache entry while still expiring eventually, which avoids leaking data.
+Still, we should keep `references` up to date to be more efficient on writes and invalidation, using the `garbage collector` function, which prunes the expired references: while expired references do not compromise the cache integrity, they slow down the I/O operations.
+Storage `memory` doesn't have `gc`.
+
+### Redis garbage collector
+
+As noted above, while the garbage collector is optional, it is highly recommended in order to keep references up to date and to improve the performance of setting and invalidating cache entries.
+
+### `storage.gc([mode], [options])`
+
+* `mode`: `lazy` (default) or `strict`.
+  In `lazy` mode, only a chunk of the `references` is randomly checked and possibly freed; running `lazy` jobs repeatedly tends to eventually clear all the expired `references`.
+  In `strict` mode, all the `references` are checked and freed, and after that, `references` and entries are perfectly clean.
+  `lazy` mode is the lightweight, heuristic way to ensure cached entries and `references` are cleared without stressing `redis` too much; `strict` mode, on the other hand, puts more load on `redis` to get a perfect result.
+  The best strategy is to combine both, running `lazy` jobs often along with occasional `strict` ones, depending on the size of the cache.
+
+Options:
+
+* `chunk`: the chunk size of references analyzed per loop, default `64`
+* `lazy.chunk`: the chunk size of references analyzed per loop in `lazy` mode, default `64`; if both `chunk` and `lazy.chunk` are set, the maximum one is taken
+* `lazy.cursor`: the cursor offset, default zero; the cursor should be set to `report.cursor` to continue scanning from the previous operation
+
+Returns a `report` of the `gc` job, as follows:
+
+```json
+"report":{
+  "references":{
+    "scanned":["r:user:8", "r:group:11", "r:group:16"],
+    "removed":["r:user:8", "r:group:16"]
+  },
+  "keys":{
+    "scanned":["users~1"],
+    "removed":["users~1"]
+  },
+  "loops":4,
+  "cursor":0,
+  "error":null
+}
+```
+
+Example
+
+```js
+import { createCache, createStorage } from 'async-cache-dedupe'
+
+const cache = createCache({
+  ttl: 5,
+  storage: { type: 'redis', options: { client: redisClient, invalidation: true } },
+})
+// ... cache.define('fetchSomething'
+
+const storage = createStorage('redis', { client: redisClient, invalidation: true })
+
+let cursor
+setInterval(async () => {
+  const report = await storage.gc('lazy', { lazy: { cursor } })
+  if (report.error) {
+    console.error('error on redis gc', report.error)
+    return
+  }
+  console.log('gc report (lazy)', report)
+  cursor = report.cursor
+}, 60e3).unref()
+
+setInterval(async () => {
+  const report = await storage.gc('strict', { chunk: 128 })
+  if (report.error) {
+    console.error('error on redis gc', report.error)
+    return
+  }
+  console.log('gc report (strict)', report)
+}, 10 * 60e3).unref()
+
+```
+
+---
+
+## Maintainers
+
+* [__Matteo Collina__](https://github.com/mcollina), <https://twitter.com/matteocollina>, <https://www.npmjs.com/~matteo.collina>
+* [__Simone Sanfratello__](https://github.com/simone-sanfratello), <https://twitter.com/simonesanfradev>, <https://www.npmjs.com/~simone.sanfra>
+
+---
+
+## Breaking Changes
+
+* version `0.5.0` -> `0.6.0`
+  * `options.cacheSize` is dropped in favor of `storage`
+
 ## License

 MIT
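
Note on the README's invalidation section: the idea of linking cache entries to `references` and removing entries by reference can be sketched as follows. This is only an illustration under stated assumptions: it uses plain in-memory maps, and the `ReferenceStore` class and its `set`/`get`/`invalidate` methods are hypothetical names, not the storage implementation added by this commit.

```js
'use strict'

// Hypothetical in-memory store, for illustration only.
// It is NOT the async-cache-dedupe storage implementation.
class ReferenceStore {
  constructor () {
    this.entries = new Map() // cache key -> cached value
    this.references = new Map() // reference -> Set of cache keys
  }

  // Store a value and link its key to each of the given references.
  set (key, value, references = []) {
    this.entries.set(key, value)
    for (const reference of references) {
      if (!this.references.has(reference)) {
        this.references.set(reference, new Set())
      }
      this.references.get(reference).add(key)
    }
  }

  get (key) {
    return this.entries.get(key)
  }

  // Remove every entry linked to any of the given references
  // and return the keys that were actually removed.
  invalidate (references) {
    const removed = []
    for (const reference of references) {
      const keys = this.references.get(reference)
      if (!keys) continue
      for (const key of keys) {
        if (this.entries.delete(key)) removed.push(key)
      }
      this.references.delete(reference)
    }
    return removed
  }
}

const store = new ReferenceStore()
store.set('getUser~1', { id: 1, name: 'Alice' }, ['user~1'])
store.set('getUser~2', { id: 2, name: 'Bob' }, ['user~2'])
store.set('findUsers', [{ id: 1, name: 'Alice' }, { id: 2, name: 'Bob' }], ['user~1', 'user~2'])

// A write to user 1 invalidates everything that references `user~1`.
const removed = store.invalidate(['user~1'])
console.log(removed) // [ 'getUser~1', 'findUsers' ]
```

In this sketch `getUser~2` survives the invalidation, matching the README's scenario. The `user~2` reference still lists the removed `findUsers` key afterwards; a real storage has to clean up such stale links.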

Diff for: bench/bench.sh

+29
@@ -0,0 +1,29 @@
+#!/bin/bash
+
+CWD="$(dirname $0)"
+
+TTL=1
+ENTRIES=100000
+REFERENCES=5
+GET=1
+INVALIDATE=0
+
+
+echo "Running benchmarks..."
+
+node $CWD/storage.js memory $TTL $ENTRIES $REFERENCES $GET $INVALIDATE
+
+echo -e "\n-----\n"
+
+node $CWD/storage.js redis $TTL $ENTRIES $REFERENCES $GET $INVALIDATE
+
+echo -e "\n-----\n"
+
+REFERENCES=1
+INVALIDATE=1
+
+node $CWD/storage.js memory $TTL $ENTRIES $REFERENCES $GET $INVALIDATE
+
+echo -e "\n-----\n"
+
+node $CWD/storage.js redis $TTL $ENTRIES $REFERENCES $GET $INVALIDATE

Diff for: bench/storage.js

+77
@@ -0,0 +1,77 @@
+'use strict'
+
+const { hrtime } = require('process')
+const path = require('path')
+const Redis = require('ioredis')
+const createStorage = require(path.resolve(__dirname, '../storage'))
+
+// NOTE: these are very basic benchmarks for tweaking
+// performance is affected by keys and references size
+
+function ms (ns) {
+  return Number(ns) / 1e6
+}
+
+async function main () {
+  let [,, type, ttl, entries, references, set, invalidate] = process.argv
+
+  ttl = Number(ttl)
+  references = Number(references)
+  set = set === 'true' || set === '1'
+  invalidate = invalidate === 'true' || invalidate === '1'
+
+  console.log(`
+    type: ${type}
+    ttl: ${ttl}
+    entries: ${entries}
+    references: ${references}
+    set: ${set}
+    invalidate: ${invalidate}
+  `)
+
+  const options = {
+    invalidation: invalidate
+  }
+
+  if (type === 'redis') {
+    options.client = new Redis()
+  }
+
+  let start = hrtime.bigint()
+  const storage = createStorage(type, options)
+  let end = hrtime.bigint()
+  console.log(`storage created in ${ms(end - start)} ms`)
+
+  start = hrtime.bigint()
+  for (let i = 0; i < entries; i++) {
+    const r = []
+    for (let j = 0; j < references; j++) {
+      r.push(`reference-${i + j}`)
+    }
+    await storage.set(`key-${i}`, `value-${i}`, ttl, r)
+  }
+  end = hrtime.bigint()
+  console.log(`set ${entries} entries (ttl: ${!!ttl}, references: ${references}) in ${ms(end - start)} ms`)
+
+  if (set) {
+    start = hrtime.bigint()
+    for (let i = 0; i < entries; i++) {
+      await storage.get(`key-${i}`)
+    }
+    end = hrtime.bigint()
+    console.log(`get ${entries} entries (ttl: ${!!ttl}, references: ${references}) in ${ms(end - start)} ms`)
+  }
+
+  if (invalidate) {
+    start = hrtime.bigint()
+    for (let i = 0; i < entries; i++) {
+      await storage.invalidate([`reference-${i}`])
+    }
+    end = hrtime.bigint()
+    console.log(`invalidate ${entries} entries (ttl: ${!!ttl}, references: ${references}) in ${ms(end - start)} ms`)
+  }
+
+  options.client && options.client.disconnect()
+}
+
+main()

Diff for: example.mjs

-26
This file was deleted.
