-
Notifications
You must be signed in to change notification settings - Fork 138
Make indexing opt-in #950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Make indexing opt-in #950
Conversation
🦋 Changeset detectedLatest commit: feaba11 The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
More templates
@tanstack/angular-db
@tanstack/db
@tanstack/db-ivm
@tanstack/electric-db-collection
@tanstack/offline-transactions
@tanstack/powersync-db-collection
@tanstack/query-db-collection
@tanstack/react-db
@tanstack/rxdb-db-collection
@tanstack/solid-db
@tanstack/svelte-db
@tanstack/trailbase-db-collection
@tanstack/vue-db
commit: |
|
Size Change: +1.54 kB (+1.77%) Total Size: 88.7 kB
ℹ️ View Unchanged
|
|
Size Change: 0 B Total Size: 3.34 kB ℹ️ View Unchanged
|
kevin-dp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments that i would like to see addressed. Nothing major. There are no unit tests for the new BasicIndex. We should add unit tests.
| if (!collection.config.defaultIndexType) { | ||
| if (isDevModeEnabled()) { | ||
| console.warn( | ||
| `[TanStack DB] Auto-indexing is enabled but no defaultIndexType is set. ` + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this warning is misleading because it says auto-indexing is enabled but actually since we return false here it is disabled. We should make explicit that we are disabling it because no defaultIndexType is set.
I think it would make more sense to still keep it enabled and just fall back to a default index type in case the user does not provide one?
| * Simpler and smaller than BTreeIndex, good for read-heavy workloads. | ||
| * Use BTreeIndex for write-heavy workloads with large collections. | ||
| */ | ||
| export class BasicIndex< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: For many users it may not be clear which index they should pick based on the names BasicIndex and BTreeIndex. Perhaps we should rename them to clarify this: BasicIndex -> ReadOptimizedIndex and BTreeIndex -> WriteOptimizedIndex. The docstrings can then explain like they do now that ReadOptimizedIndex is backed by a Map + sorted Array and its performance implications and that WriteOptimizedIndex is backed by a B+ tree.
| /** | ||
| * Binary search to find insertion point in sorted array | ||
| */ | ||
| private findInsertionIndex(value: any): number { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this method. We already have a findInsertPosition helper functon in utils/array-utils.ts.
That helper function expects the array to be an array of tuples though so perhaps we should modify that helper to be over a basic array of values. And we can define a variant of it on top for tupled arrays (such that the parts of the code that were using it before don't need to be changed, just need to use the tupled variant).
| // No more keys for this value, remove from map and sorted array | ||
| this.valueMap.delete(normalizedValue) | ||
|
|
||
| const idx = this.findInsertionIndex(normalizedValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This findInsertionIndex + the if test below is the logic that is needed to delete a value from a sorted array. As mentioned in a previous comment findInsertionIndex should be replaced by the already existing findInsertPosition. And we should add an additional helper function deleteInSortedArray to that array-utils.ts file that does exactly this logic. And then here just use that deleteInSortedArray function.
| /** | ||
| * Builds the index from a collection of entries | ||
| */ | ||
| build(entries: Iterable<[TKey, any]>): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting build to do something like this:
entries.map(([key, value]) => this.add(key, value))Because right now it duplicates the add logic.
| */ | ||
| equalityLookup(value: any): Set<TKey> { | ||
| const normalizedValue = normalizeValue(value) | ||
| return new Set(this.valueMap.get(normalizedValue) ?? []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this.valueMap.get(normalizedValue) already returns a set so let's do:
return this.valueMap.get(normalizedValue) ?? new Set()| * Performs a reversed range query | ||
| */ | ||
| rangeQueryReversed(options: RangeQueryOptions = {}): Set<TKey> { | ||
| // For BasicIndex, reversed is the same result set, just different iteration order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true?
rangeQuery looks up the index for from and the index for to and then iterates from from to to with this for loop:
for (let i = startIdx; i < endIdx; i++)So it is explicitly expecting startIdx to be smaller than endIdx and so going left to right by incrementing the index.
However, when doing a reversed range query we go from right to left, e.g. from 10 to 5. So shouldn't we either swap the from and to here or modify rangeQuery to find the insertions indexes of both from and to and go from the smallest to the biggest index? i.e. swap startIdx and endIdx if endIdx < startIdx.
This is how the B+ tree index does it which seems to me that we also need to do this here:
rangeQueryReversed(options: RangeQueryOptions = {}): Set<TKey> {
const { from, to, fromInclusive = true, toInclusive = true } = options
return this.rangeQuery({
from: to ?? this.orderedEntries.maxKey(),
to: from ?? this.orderedEntries.minKey(),
fromInclusive: toInclusive,
toInclusive: fromInclusive,
})
}| get orderedEntriesArrayReversed(): Array<[any, Set<TKey>]> { | ||
| return [...this.sortedValues] | ||
| .reverse() | ||
| .map((value) => [value, this.valueMap.get(value) ?? new Set()]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code iterates 3 times through the sorted values. We can improve it by a single pass with a for loop or with reduce.
| /** Index type to use (e.g., BasicIndex, BTreeIndex) */ | ||
| indexType?: TIndexType | ||
| /** Options passed to the index constructor */ | ||
| options?: TIndexType extends new ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to pass both the index type (i.e. constructor) and the arguments for the constructor. Isn't that equivalent but a bit more verbose than to just pass an index creation function: () => BaseIndex.
| // Dev mode detection settings - ON by default in non-production | ||
| let devModeConfig: IndexDevModeConfig = { | ||
| enabled: true, | ||
| collectionSizeThreshold: 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This threshold seems arbitrary. Going through an non-indexed collection of 1000 rows should be pretty fast. I guess modern machines can easily go to perhaps 1M rows before it really becomes slow and needs indexes. Time-based suggestions (slowQueryThresholdMs) makes much more sense imo.
This POC explores making indexing a tree-shakeable opt-in feature to reduce
default bundle size. Key changes:
- Create new `@tanstack/db/indexing` entry point for BTreeIndex and related utils
- Add index registry to replace hard-coded BTreeIndex defaults
- Add dev-mode auto-detection for when indexes would help (collection size,
slow queries)
- Remove BTreeIndex value export from main entry (types still exported)
Bundle size improvements when NOT using indexing:
- Minified: ~15 KB saved (6.9%)
- Gzipped: ~5.4 KB saved (8.5%)
Usage after this change:
```ts
// Option 1: Enable indexing globally
import { enableIndexing } from '@tanstack/db/indexing'
enableIndexing()
// Option 2: Use explicit index type (best for tree-shaking)
import { BTreeIndex } from '@tanstack/db/indexing'
collection.createIndex((row) => row.userId, { indexType: BTreeIndex })
// Dev mode for index suggestions
import { configureIndexDevMode } from '@tanstack/db'
configureIndexDevMode({ enabled: true, collectionSizeThreshold: 100 })
```
Note: Additional savings (~25KB more) possible by making index-optimization.ts
lazy-loaded, but would require more extensive refactoring of change-events.ts
and order-by.ts.
…y default
Breaking change: autoIndex now defaults to 'off' instead of 'eager'.
This reduces default bundle size by not requiring indexing code.
Changes:
- autoIndex defaults to 'off' - users must explicitly enable or add indexes
- Dev mode suggestions are ON by default (in non-production) to help
developers identify when indexes would improve performance
- Updated tests to reflect new default behavior
Users who want auto-indexing can either:
1. Set autoIndex: 'eager' on individual collections
2. Import and register BTreeIndex globally:
```ts
import { enableIndexing } from '@tanstack/db/indexing'
enableIndexing()
```
Dev mode will warn in console when queries could benefit from indexes,
and this will also be available in devtools.
Most collections won't have perf issues until much larger sizes. Slow query detection (10ms) is the more actionable metric.
- Add MapIndex for equality lookups (eq, in) without BTree overhead - Change enableIndexing() to use MapIndex by default - Add enableBTreeIndexing() for ORDER BY optimization on large collections - BTreeIndex is only needed for sorted iteration (10k+ items with ORDER BY) Bundle size impact: - MapIndex: ~5 KB (~1.3 KB gzipped) - BTreeIndex: ~33 KB (~7.8 KB gzipped) - Savings with MapIndex: ~27 KB (~6 KB gzipped)
…ndexType - Add BasicIndex using Map + sorted Array for both equality and range queries - Remove registry pattern - pass defaultIndexType to collection constructor instead - Remove lazy index infrastructure (LazyIndexWrapper, IndexProxy) - Simplify indexing.ts entry point to just export BasicIndex and BTreeIndex - Update all tests to explicitly set defaultIndexType where needed - Update changeset to reflect simplified approach Breaking changes: - createIndex() requires defaultIndexType on collection or indexType in config - enableIndexing()/enableBTreeIndexing() removed, use defaultIndexType instead
6df29a2 to
79db466
Compare
No description provided.