enable adding indexes to collections to speed up initial data lookups #257

Open
wants to merge 1 commit into
base: main

samwillis
Collaborator

@samwillis samwillis commented Jul 13, 2025

Stacked on #256.

Indexes are created like so:

collection.createIndex(
  (row) => row.age,
  {
    name: `ageIndex`, // optional name in an optional options object
  }
)

The callback extracts the value to index, using the same expression builder that the query builder uses for where clauses.

You can add as many indexes to a collection as you like, and they can be created after the collection was created initially.

This PR does not link indexes up to live queries; that's in #258.

changeset-bot bot commented Jul 13, 2025

🦋 Changeset detected

Latest commit: e526c1e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 6 packages
Name Type
@tanstack/db Patch
@tanstack/electric-db-collection Patch
@tanstack/query-db-collection Patch
@tanstack/react-db Patch
@tanstack/vue-db Patch
@tanstack/db-example-react-todo Patch

pkg-pr-new bot commented Jul 13, 2025

@tanstack/db-example-react-todo

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@257

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@257

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@257

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@257

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@257

commit: e526c1e

Contributor

github-actions bot commented Jul 13, 2025

Size Change: +3.65 kB (+9.68%) ⚠️

Total Size: 41.4 kB

Filename Size Change
./packages/db/dist/esm/collection.js 11.8 kB +3.31 kB (+39.14%) 🚨
./packages/db/dist/esm/query/builder/ref-proxy.js 890 B +48 B (+5.7%) 🔍
./packages/db/dist/esm/query/compiler/evaluators.js 1.46 kB +119 B (+8.91%) 🔍
./packages/db/dist/esm/query/compiler/order-by.js 713 B -220 B (-23.58%) 🎉
./packages/db/dist/esm/utils/comparison.js 397 B +397 B (new file) 🆕
ℹ️ View Unchanged
Filename Size
./packages/db/dist/esm/deferred.js 230 B
./packages/db/dist/esm/errors.js 150 B
./packages/db/dist/esm/index.js 568 B
./packages/db/dist/esm/local-only.js 815 B
./packages/db/dist/esm/local-storage.js 2.07 kB
./packages/db/dist/esm/optimistic-action.js 294 B
./packages/db/dist/esm/proxy.js 3.75 kB
./packages/db/dist/esm/query/builder/functions.js 531 B
./packages/db/dist/esm/query/builder/index.js 3.49 kB
./packages/db/dist/esm/query/compiler/group-by.js 2.09 kB
./packages/db/dist/esm/query/compiler/index.js 1.75 kB
./packages/db/dist/esm/query/compiler/joins.js 1.2 kB
./packages/db/dist/esm/query/compiler/select.js 657 B
./packages/db/dist/esm/query/ir.js 318 B
./packages/db/dist/esm/query/live-query-collection.js 2.06 kB
./packages/db/dist/esm/query/optimizer.js 2.24 kB
./packages/db/dist/esm/SortedMap.js 1.24 kB
./packages/db/dist/esm/transactions.js 2.29 kB
./packages/db/dist/esm/utils.js 419 B

compressed-size-action::db-package-size

Contributor

github-actions bot commented Jul 13, 2025

Size Change: 0 B

Total Size: 1.05 kB

ℹ️ View Unchanged
Filename Size
./packages/react-db/dist/esm/index.js 152 B
./packages/react-db/dist/esm/useLiveQuery.js 902 B

compressed-size-action::react-db-package-size

Contributor

@kevin-dp kevin-dp left a comment

This PR needs some additional work to minimize the bookkeeping overhead of the index range and to improve code quality by introducing the necessary abstractions for indexes.

/**
 * Represents an index for fast lookups on a collection
 * All indexes are ordered to support range queries
 */
export interface CollectionIndex<

I think it would be more appropriate for CollectionIndex to be a class instead of a plain JS object. Right now, it's an object leaking internal data structures to whoever wants to use it. As a result, the CollectionImpl class now contains a lot of additional complexity for building and maintaining indexes. It would be better to abstract all of that away behind a CollectionIndex class. That way, the changes to CollectionImpl will be minimal.

So, I'd rather have a CollectionIndex that implements an index and is a black box (i.e. all implementation details are contained within the class and the caller doesn't need to know about them). That also simplifies code evolution since we could swap out internal data structures without requiring changes to the calling code, as long as the public interface remains the same.

Such a CollectionIndex class would expose an add(value, key) method which will then update its internal data structures such as valueMap, orderedEntries, and indexedKeys. It would also expose methods for equality lookups and range scans.
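
As a rough sketch of the black-box shape described above (this is not the PR's code): the class name `CollectionIndexSketch`, the comparator parameter, and every method other than `add(value, key)` are assumptions made for illustration. It also assumes indexed values are primitives, so they can be used as `Map` keys.

```typescript
// Hypothetical sketch; names and signatures are assumptions, not the PR's API.
type RowKey = string | number

class CollectionIndexSketch<TValue> {
  // indexed value -> keys of the rows holding that value
  private valueMap = new Map<TValue, Set<RowKey>>()
  // distinct indexed values, kept in ascending order for range scans
  private orderedValues: TValue[] = []
  private compare: (a: TValue, b: TValue) => number

  constructor(compare: (a: TValue, b: TValue) => number) {
    this.compare = compare
  }

  add(value: TValue, key: RowKey): void {
    let keys = this.valueMap.get(value)
    if (!keys) {
      keys = new Set<RowKey>()
      this.valueMap.set(value, keys)
      // Insert the new value at its sorted position
      this.orderedValues.splice(this.lowerBound(value), 0, value)
    }
    keys.add(key)
  }

  remove(value: TValue, key: RowKey): void {
    const keys = this.valueMap.get(value)
    if (!keys) return
    keys.delete(key)
    if (keys.size === 0) {
      this.valueMap.delete(value)
      this.orderedValues.splice(this.lowerBound(value), 1)
    }
  }

  // Returns a copy so callers cannot mutate internal state
  equalityLookup(value: TValue): Set<RowKey> {
    return new Set<RowKey>(this.valueMap.get(value) ?? [])
  }

  // Keys of all rows whose value lies in [from, to], both ends inclusive
  rangeScan(from: TValue, to: TValue): Set<RowKey> {
    const result = new Set<RowKey>()
    for (let i = this.lowerBound(from); i < this.orderedValues.length; i++) {
      const value = this.orderedValues[i]!
      if (this.compare(value, to) > 0) break
      for (const key of this.valueMap.get(value)!) result.add(key)
    }
    return result
  }

  // Binary search: first position whose value is >= the given value
  private lowerBound(value: TValue): number {
    let lo = 0
    let hi = this.orderedValues.length
    while (lo < hi) {
      const mid = (lo + hi) >>> 1
      if (this.compare(this.orderedValues[mid]!, value) < 0) lo = mid + 1
      else hi = mid
    }
    return lo
  }
}
```

With this shape, the collection would only call `add`, `remove`, `equalityLookup`, and `rangeScan`; the sorted array could later be swapped for a tree without touching the callers.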

this.validateCollectionUsable(`createIndex`)

// Generate unique ID for this index
const indexId = `index_${++this.indexCounter}`

Why not just the indexCounter number?


// Check if this is a simple field comparison: field op value

if (leftArg.type === `ref` && rightArg.type === `val`) {

What about the other way around?
e.g. imagine someone wrote lte(18, user.age).
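
One way to handle the reversed form is to normalize the comparison before looking for an index, swapping the arguments and flipping the operator. The sketch below is illustrative only: the `Ref`/`Val` shapes mirror the `type` checks in the snippet above, but the operator names, `flippedOp`, and `normalizeComparison` are assumptions rather than code from this PR.

```typescript
// Hypothetical sketch; types and names are assumptions for illustration.
type Ref = { type: `ref`; path: string[] }
type Val = { type: `val`; value: unknown }
type Arg = Ref | Val
type ComparisonOp = `eq` | `gt` | `gte` | `lt` | `lte`

// Operator to use once the two arguments have been swapped
const flippedOp: Record<ComparisonOp, ComparisonOp> = {
  eq: `eq`,
  gt: `lt`,
  gte: `lte`,
  lt: `gt`,
  lte: `gte`,
}

interface NormalizedComparison {
  op: ComparisonOp
  field: Ref
  value: Val
}

// Returns the comparison in "field op value" form, or undefined when the
// expression is not a simple comparison between a field and a value.
function normalizeComparison(
  op: ComparisonOp,
  left: Arg,
  right: Arg
): NormalizedComparison | undefined {
  if (left.type === `ref` && right.type === `val`) {
    return { op, field: left, value: right }
  }
  if (left.type === `val` && right.type === `ref`) {
    return { op: flippedOp[op], field: right, value: left }
  }
  return undefined
}
```

Under these assumptions, `lte(18, user.age)` normalizes to `gte(user.age, 18)`, so the index lookup code only ever has to deal with one argument order.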

}

// All arguments must be optimizable
return expression.args.every((arg) => this.canOptimizeExpression(arg))

Can't we optimize the parts that are optimizable to already gain some speedup there?
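
For an AND expression, a partial optimization could look roughly like this: use indexes for the conjuncts that have one to narrow the candidate keys, then evaluate the remaining conjuncts only on those candidates. The `Conjunct` shape and `filterWithPartialIndexes` are hypothetical and exist only to illustrate the idea; they are not part of this PR.

```typescript
// Hypothetical sketch; the Conjunct shape is an assumption for illustration.
type Row = Record<string, unknown>

interface Conjunct {
  // Present only when an index can answer this conjunct
  indexLookup?: () => Set<string>
  // Row-level evaluation, used when no index applies
  test: (row: Row) => boolean
}

function filterWithPartialIndexes(
  rows: Map<string, Row>,
  conjuncts: Conjunct[]
): string[] {
  let candidates: Set<string> | undefined
  const residual: Conjunct[] = []

  for (const conjunct of conjuncts) {
    if (!conjunct.indexLookup) {
      residual.push(conjunct)
      continue
    }
    const keys = conjunct.indexLookup()
    // AND logic: keep only keys matched by every indexed conjunct so far
    candidates =
      candidates === undefined
        ? keys
        : new Set(Array.from(candidates).filter((key) => keys.has(key)))
  }

  // Fall back to a full scan only when no conjunct could use an index
  const keysToScan = candidates ?? rows.keys()
  const matches: string[] = []
  for (const key of keysToScan) {
    const row = rows.get(key)
    if (row !== undefined && residual.every((c) => c.test(row))) {
      matches.push(key)
    }
  }
  return matches
}
```

The non-indexed conjuncts still run, but only over the rows that survived the index lookups instead of over the whole collection.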

const fieldPath = fieldArg.path

// Find an index that matches this field
for (const index of this.indexes.values()) {

I've seen this exact code also in canOptimizeSimpleComparison. Let's extract it to a function findIndex.
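
A minimal sketch of such a helper, assuming each index records the field path it covers: the `IndexLike` shape and its `fieldPath` property are assumptions here, since the matching logic inside the loop is not shown in the snippet above.

```typescript
// Hypothetical sketch; IndexLike is an assumed shape, not the PR's type.
interface IndexLike {
  id: string
  fieldPath: string[]
}

// Returns the first index whose field path equals the requested path
function findIndex<TIndex extends IndexLike>(
  indexes: Map<string, TIndex>,
  fieldPath: string[]
): TIndex | undefined {
  for (const index of indexes.values()) {
    const samePath =
      index.fieldPath.length === fieldPath.length &&
      index.fieldPath.every((segment, i) => segment === fieldPath[i])
    if (samePath) return index
  }
  return undefined
}
```

Both call sites could then share this one function, with the "can optimize" check reducing to whether it returns a value.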


if (results.length > 0) {
// Intersect all matching keys (AND logic)
let intersectedKeys = results[0]!.matchingKeys

Let's move all this set intersection logic to a utility function
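
Such a utility might look like the sketch below. The name `intersectSets` and the choice to return an empty set for an empty input list are assumptions, not something taken from this PR.

```typescript
// Hypothetical utility; name and empty-input behavior are assumptions.
function intersectSets<T>(sets: ReadonlyArray<ReadonlySet<T>>): Set<T> {
  if (sets.length === 0) return new Set<T>()

  // Iterate the smallest set and probe the others, so the work is bounded
  // by the most selective result
  const [smallest, ...rest] = [...sets].sort((a, b) => a.size - b.size)
  const result = new Set<T>()
  for (const item of smallest!) {
    if (rest.every((other) => other.has(item))) result.add(item)
  }
  return result
}
```

The AND branch could then pass the `matchingKeys` sets of all results to this function in a single call.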

const expression = toExpression(whereExpression)
const evaluator = compileSingleRowExpression(expression)
const result = evaluator(item as Record<string, unknown>)
return Boolean(result)

Why do we need to convert it to a boolean? Shouldn't a WHERE clause always be a predicate (i.e. evaluate to a boolean value)?

Base automatically changed from query-optimiser to main July 16, 2025 10:53
cursor bot pushed a commit that referenced this pull request Jul 16, 2025
- Updated utils.ts to use Map instead of WeakMap for cycle detection (from main)
- Kept all collection indexing utility functions
- Updated optimizer.ts to use main branch patterns while preserving enhancements
- Added distinct import to compiler/index.ts from main branch
- Fixed non-null assertions in compiler to match main branch style
- All review comments from PR #257 remain addressed
@samwillis samwillis force-pushed the index-collections branch from 95b29d8 to e526c1e Compare July 16, 2025 11:16