
make Map the default (round 2) #40

Open · gurgunday opened this issue Dec 18, 2023 · 2 comments

@gurgunday (Collaborator) commented Dec 18, 2023

I finally cracked the code after digging through more benchmarks and stumbling upon a comment from a V8 developer.

TL;DR: if we use ordered numeric keys, as we do in the benchmarks, V8 will heavily optimize property access. To illustrate, if we just force the keys to be non-numeric with something like `x + '_'`, the picture becomes much more interesting :)
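
To make the key-shape point concrete, here is a minimal sketch (an illustration, not code from the benchmark repo) of the two kinds of keys involved:

```js
// Minimal illustration: the same data stored under integer-like keys
// vs. keys forced to be non-numeric strings
const obj = {}
for (let x = 0; x < 1000; x++) {
  obj[x] = { id: x }       // integer-like keys: V8 can keep these in its fast, array-like elements store
  obj[x + '_'] = { id: x } // non-numeric keys: stored as regular named properties, a very different code path
}
```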

Objection

Benchmarks

Let's modify the execute function to make our keys non-numeric; the following results surface for cache-get-inmemory/:

{ cpu: { brand: 'M1', speed: '2.40 GHz' } }
| Node   | Option             | Msecs/op       | Ops/sec  | V8                    |
| ------ | ------------------ | -------------- | -------- | --------------------- |
| 20.9.0 | toad-cache-lru-map | 0.434283 msecs | 2302.644 | V8 11.3.244.8-node.16 |
| 20.9.0 | toad-cache-lru     | 0.601765 msecs | 1661.777 | V8 11.3.244.8-node.16 |

Here are the execute functions of toad-cache-lru-map.js and toad-cache-lru-object.js:

```js
async function execute() {
  for (var x = 0; x < ELEMENT_COUNT; x++) {
    const value = cache.get(x + 'a') // became x + 'a'
    if (!value) {
      const newValue = {
        id: x
      }

      cache.set(x + 'a', newValue) // became x + 'a'
    }
  }
}
```

Then I realized another potential issue with the default parameters: MAX_ITEMS was higher than ELEMENT_COUNT.

Ideally, MAX_ITEMS should be less than ELEMENT_COUNT so that the caches also have to evict entries, giving a more complete picture of performance.
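
To see why that matters, here's a toy Map-based LRU (an illustration only, not toad-cache's actual implementation) showing the evictions that kick in once more keys are inserted than the cache can hold:

```js
// Toy Map-based LRU for illustration only; not toad-cache's implementation
class ToyLruMap {
  constructor (max) {
    this.max = max
    this.items = new Map()
  }

  get (key) {
    if (!this.items.has(key)) return undefined
    const value = this.items.get(key)
    this.items.delete(key) // re-insert to mark the key as most recently used
    this.items.set(key, value)
    return value
  }

  set (key, value) {
    if (this.items.has(key)) {
      this.items.delete(key)
    } else if (this.items.size >= this.max) {
      this.items.delete(this.items.keys().next().value) // evict the least recently used key
    }
    this.items.set(key, value)
  }
}

const toy = new ToyLruMap(5000) // MAX_ITEMS
for (let x = 0; x < 8500; x++) toy.set(x + 'a', { id: x }) // ELEMENT_COUNT inserts
console.log(toy.items.size) // 5000: the last 3500 inserts each evicted an older entry
```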

Let's modify common.js:

```js
const { validateEqual } = require("validation-utils");

const MAX_ITEMS = 5000 // became 5000 from 10000
const TTL = 1000 * 60 * 60
const ELEMENT_COUNT = 8500 // became 8500 from 5000

async function validateAccuracy(getFn) {
  const value = await getFn('123')
  validateEqual(value.id, '123')
}

module.exports = {
  ELEMENT_COUNT,
  MAX_ITEMS,
  TTL,
  validateAccuracy,
};
```

Here are the results of a rerun:

{ cpu: { brand: 'M1', speed: '2.40 GHz' } }
| Node   | Option             | Msecs/op       | Ops/sec | V8                    |
| ------ | ------------------ | -------------- | ------- | --------------------- |
| 20.9.0 | toad-cache-lru-map | 1.387450 msecs | 720.747 | V8 11.3.244.8-node.16 |
| 20.9.0 | toad-cache-lru     | 2.053608 msecs | 486.948 | V8 11.3.244.8-node.16 |

I argue that a generic LRU will not have number-based keys, and especially not the very special case of consecutive integers as keys (if that were the case, we'd use an array), so I propose once again to make the default export Map-based.
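
For comparison, here are a few made-up keys of the kind a generic cache is far more likely to see in practice:

```js
// Hypothetical examples: realistic cache keys are usually strings, not consecutive integers
const cache = new Map()
cache.set('203.0.113.7', { hits: 1 })                         // rate-limit bucket keyed by client IP
cache.set('GET:/users/42', { body: '...' })                   // response cache keyed by method + route
cache.set('550e8400-e29b-41d4-a716-446655440000', { id: 42 }) // entity keyed by a uuid primary key
```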

What do you think?

If you do not agree, it's fine! But I feel like repos such as fastify-rate-limit would greatly benefit from the change.

@kibertoad (Owner)

@gurgunday That's an interesting finding, but also one that needs to be documented! Sometimes integers will be used as keys, as they are a popular choice for primary IDs in databases; but sometimes UUIDs or other alphanumeric options are used. So in an ideal scenario we should have benchmarks for both cases and provide recommendations about what to use in the project readme. Oh, and fastify-rate-limit can switch to the map-based cache already; both are exposed.

Could you send a PR to https://github.com/kibertoad/nodejs-benchmark-tournament, adding separate benchmarks for string-based IDs for in-memory caches? Based on that, I could adjust the default and the readme.
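
For what it's worth, a string-key variant of the execute loop could look roughly like this (a sketch with assumed names and structure, not the actual benchmark code):

```js
// Rough sketch only: assumed structure, not the actual nodejs-benchmark-tournament code
const { randomUUID } = require('node:crypto')
const { ELEMENT_COUNT } = require('./common') // assumed path to the shared config

// pre-generate uuid keys so key creation isn't measured inside the loop
const keys = Array.from({ length: ELEMENT_COUNT }, () => randomUUID())

async function execute() {
  for (let x = 0; x < ELEMENT_COUNT; x++) {
    const value = cache.get(keys[x]) // `cache` is created per benchmark file, as in the existing snippets
    if (!value) {
      cache.set(keys[x], { id: x })
    }
  }
}
```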

@gurgunday (Collaborator, Author) commented Dec 18, 2023

Will do that!

As you said, using the Map import explicitly for rate-limit would be smarter, since we can provide solid reasoning for it if someone asks. I'll send the PR there after confirming the results with you.
