Skip to content

fix: Optimize user presence updates using batched WatermelonDB#7034

Open
Yaddalapalli-Charan-Kumar-Naidu wants to merge 5 commits intoRocketChat:developfrom
Yaddalapalli-Charan-Kumar-Naidu:fix/watermelon-db-bottleneck
Open

fix: Optimize user presence updates using batched WatermelonDB#7034
Yaddalapalli-Charan-Kumar-Naidu wants to merge 5 commits intoRocketChat:developfrom
Yaddalapalli-Charan-Kumar-Naidu:fix/watermelon-db-bottleneck

Conversation

@Yaddalapalli-Charan-Kumar-Naidu
Copy link
Contributor

@Yaddalapalli-Charan-Kumar-Naidu Yaddalapalli-Charan-Kumar-Naidu commented Mar 4, 2026

Proposed changes

This PR fixes a severe performance bottleneck and a race condition inside [getUsersPresence.ts] Previously, the code fired db.write() within an asynchronous forEach loop, which bypassed WatermelonDB's batching mechanism and caused massive UI stuttering by spinning up hundreds of concurrent writer processes.

Furthermore, usersBatch was never deduplicated, meaning concurrent processes would attempt to create the exact same user ID simultaneously, throwing native SQLite Unique Constraint violations (Unhandled Promise Rejections) and crashing the sync.

This PR deduplicates usersBatch and properly utilizes Q.where alongside a single, atomic db.batch() execution to safely and efficiently commit all user updates at once.

Issue(s)

Fixes #7032

How to test or reproduce

  1. Open a heavily populated channel/server where user presence updates are rapidly firing.
  2. In the develop branch, observe significant frame drops and background SQLite Unhandled Promise Rejections due to unbatched WatermelonDB writes and constraint violations.
  3. Switch to this PR's branch, and observe perfectly smooth UI performance without any background SQLite Unhandled Promise Rejections.

Screenshots

N/A - Background WatermelonDB Synchronization Performance Fix.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves a current function)
  • New feature (non-breaking change which adds functionality)
  • Documentation update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works (if applicable)
  • I have added necessary documentation (if applicable)
  • Any dependent changes have been merged and published in downstream modules

Further comments

Summary by CodeRabbit

  • Refactor
    • Improved efficiency of presence syncs via chunked batch processing and deduplication to reduce database operations and during timed flushes.
  • Bug Fixes
    • Added robust error handling with per-item fallback writes and logging when batch updates fail, plus outer-workflow error logging.
  • New Features
    • Added a persisted user preference to toggle presence-capability.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 4, 2026

Walkthrough

Deduplicates user IDs and replaces per-user concurrent WatermelonDB writes with chunked bulk reads (Q.oneOf) and batched prepareUpdate/prepareCreate operations executed via db.batch, with per-operation fallback writes and logging; also exports setPresenceCap and adjusts upstream flushing to pass deduplicated IDs.

Changes

Cohort / File(s) Summary
Presence batching & fallback
app/lib/methods/getUsersPresence.ts
Deduplicate incoming usersBatch ([...new Set(...)]); import Q and logging helper; fetch existing user records in 900-id chunks via Q.oneOf(...); build existing-record map; prepare prepareUpdate/prepareCreate ops and execute db.batch() per chunk; on batch failure log error and attempt per-operation fallback writes with try/catch and per-operation logging; wrap overall flow in outer try/catch.
Public API: presence cap
app/lib/methods/getUsersPresence.ts
Add exported setPresenceCap(enabled: boolean) to persist presence-capability setting and wire to preferences/Redux actions; adjust timer-driven flushing to pass deduplicated user IDs to processing.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant getUsersPresence
  participant DB as WatermelonDB
  participant Logger

  Caller->>getUsersPresence: invoke with usersBatch
  getUsersPresence->>getUsersPresence: deduplicate IDs
  getUsersPresence->>DB: chunked Q.oneOf(...) bulk queries
  DB-->>getUsersPresence: existing records
  getUsersPresence->>getUsersPresence: prepare create/update operations
  getUsersPresence->>DB: db.batch(ops) per chunk
  DB-->>getUsersPresence: success / throws
  alt batch success
    getUsersPresence->>Caller: return
  else batch failure
    getUsersPresence->>Logger: log batch error
    getUsersPresence->>DB: attempt per-operation fallback writes (individual db.batch)
    DB-->>getUsersPresence: success / throws
    getUsersPresence->>Logger: log fallback errors if any
    getUsersPresence->>Caller: return
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The introduction of the setPresenceCap function for toggling presence-capability settings appears to be outside the scope of fixing the getUsersPresence performance bottleneck and race condition. Clarify whether setPresenceCap is necessary for this PR or move it to a separate PR focused on presence-capability management features.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main optimization: batched WatermelonDB writes for user presence updates, which directly addresses the performance bottleneck in the linked issue.
Linked Issues check ✅ Passed The PR successfully implements all three core requirements: deduplicates usersBatch via Set, uses batched Q.oneOf queries for bulk fetches, and executes all operations in a single atomic db.batch() call.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/lib/methods/getUsersPresence.ts`:
- Around line 91-114: The catch currently swallows errors from the atomic
db.batch call in getUsersPresence.ts; instead, surface the failure and attempt a
safe fallback so one bad op doesn't drop all updates: in the catch block log the
error (include the exception), then perform a fallback write that applies each
operation individually (or in small chunks) using db.write with per-operation
try/catch so successful presence updates still persist; use the same
userCollection prepareCreate/prepareUpdate logic (operations,
existingRecordsMap) when doing the per-item fallback and avoid re-throwing
unless you want the caller to handle retry.
- Around line 92-93: The current fetch using Q.oneOf(userIds) in
getUsersPresence.ts can exceed SQLite bind limits for large userIds arrays;
change the logic that builds existingRecords (using userCollection and Q.oneOf)
to chunk userIds into batches (e.g., max 900 IDs) and perform multiple queries,
then merge/concat the results into existingRecords before continuing; ensure you
reference the variables userIds, existingRecords, userCollection and the Q.oneOf
call so you split userIds into chunks, run query per chunk, and combine fetched
records.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fff624e9-97f0-4f4b-baf6-dcce33047448

📥 Commits

Reviewing files that changed from the base of the PR and between dbba268 and 80b1888.

📒 Files selected for processing (1)
  • app/lib/methods/getUsersPresence.ts
📜 Review details
🔇 Additional comments (2)
app/lib/methods/getUsersPresence.ts (2)

3-3: Good use of WatermelonDB query builder import.

Q is used correctly for the bulk lookup path and keeps the query idiomatic.


127-127: Nice dedup before dispatching batched presence fetch.

Using getUsersPresence([...new Set(usersBatch)]) is the right fix to prevent duplicate IDs from generating duplicate create/update operations.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/lib/methods/getUsersPresence.ts`:
- Around line 133-135: The outer catch in getUsersPresence.ts currently swallows
all errors from the fetch/prepare phase; instead log the exception and context
so failures are visible. Replace the empty catch block in the getUsersPresence
function with a call to the module's logger (e.g., processLogger.error or
logger.error) or console.error if no logger is available, including the caught
error (e) and relevant context (like userIds or operation name) to aid debugging
while preserving existing control flow.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 823540cc-d66d-4cda-8aab-24ca013e51b3

📥 Commits

Reviewing files that changed from the base of the PR and between 80b1888 and c960d4f.

📒 Files selected for processing (1)
  • app/lib/methods/getUsersPresence.ts
📜 Review details
🔇 Additional comments (2)
app/lib/methods/getUsersPresence.ts (2)

92-119: Nice improvement: bulk fetch + single batch write path is solid.

This is a strong fix for the previous writer contention path: chunked Q.oneOf reads and a single db.batch in one db.write transaction are the right direction here.


148-148: Good dedup at dispatch boundary.

Using new Set(usersBatch) before getUsersPresence avoids duplicate work and reduces race pressure on writes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/lib/methods/getUsersPresence.ts`:
- Around line 120-123: The fallback currently fires per-operation writes in
parallel and ignores failures; change the catch block to perform sequential
writes over operations (e.g., for...of operations) and for each await
db.write(() => db.batch(operation)) inside a try/catch so you can log
per-operation errors (include identifying info from the Model instance) and
optionally collect failed operations into an array to surface or rethrow at the
end; update the code paths around log(e), operations, Model, db.write and
db.batch to implement this sequential, per-operation error handling and logging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c394aebb-a504-4441-8e60-2cab65af66ce

📥 Commits

Reviewing files that changed from the base of the PR and between c960d4f and a979bb6.

📒 Files selected for processing (1)
  • app/lib/methods/getUsersPresence.ts
📜 Review details
🔇 Additional comments (2)
app/lib/methods/getUsersPresence.ts (2)

92-119: Strong improvement: batched read/prepare/write flow is aligned with the performance fix.

This is a solid shift away from per-user writer transactions and should materially reduce UI stutter during large presence syncs.


139-139: Good deduplication at flush time.

[...new Set(usersBatch)] is a practical fix to avoid repeated processing of the same uid in one cycle.

Copy link
Member

@diegolmello diegolmello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever you make performance optimizations, you need to report a detailed explanation of how this is impacting performance and compare the difference.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/lib/methods/getUsersPresence.ts`:
- Around line 104-110: The User model is missing `@field` declarations for status
and statusText so Object.assign(u, user) in getUsersPresence.ts (inside
existingRecord.prepareUpdate and userCollection.prepareCreate) writes are
dropped by WatermelonDB; add two new fields (status and statusText) with `@field`
decorators to the User model class (e.g., class User / User.js) and update the
database schema to include corresponding columns (status, statusText) and
migration if needed so presence values persisted; ensure any TypeScript
types/interfaces for TUserModel or the model exports are updated to include
these properties.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e19208e2-cc65-4395-966d-e3b2056e704e

📥 Commits

Reviewing files that changed from the base of the PR and between a979bb6 and 35a29ca.

📒 Files selected for processing (1)
  • app/lib/methods/getUsersPresence.ts
📜 Review details
🔇 Additional comments (3)
app/lib/methods/getUsersPresence.ts (3)

92-99: Good move to chunked Q.oneOf fetches before write prep.

This read path is much safer for large user sets and aligns with the stated performance objective.


114-138: Atomic batch with logged sequential fallback is a solid reliability improvement.

Line 116 primary db.batch(...operations) + Line 121-134 per-operation fallback gives better resilience and observability than the previous fire-and-forget pattern.


155-155: Deduplicating usersBatch at flush time is the right fix for duplicate-ID races.

This directly addresses the duplicate create/update pressure described in the issue.

Comment on lines +104 to +110
return existingRecord.prepareUpdate((u: TUserModel) => {
Object.assign(u, user);
});
}
return userCollection.prepareCreate(u => {
u._raw = sanitizedRaw({ id: user._id }, userCollection.schema);
Object.assign(u, user);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== User model fields (check declared `@field/`@json columns) =="
cat -n app/lib/database/model/User.js 2>/dev/null | head -250

echo
echo "== Schema definition for users table =="
find . -type f \( -name "*.js" -o -name "*.ts" \) -path "*/database/*" | xargs grep -l "users" | head -5
echo "Searching for users schema definition..."
rg -n "users|status_text|statusText" app/lib/database -B2 -A2 --type ts --type js | head -100

Repository: RocketChat/Rocket.Chat.ReactNative

Length of output: 7245


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== getUsersPresence.ts imports and context =="
cat -n app/lib/methods/getUsersPresence.ts | head -120

echo
echo "== Check which User model is imported =="
rg -n "import.*User|from.*database" app/lib/methods/getUsersPresence.ts

echo
echo "== Full app/lib/database/schema/app.js users table schema =="
rg -A20 "name: 'users'" app/lib/database/schema/app.js | head -40

Repository: RocketChat/Rocket.Chat.ReactNative

Length of output: 5996


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Database initialization/configuration =="
cat -n app/lib/database/index.ts 2>/dev/null || cat -n app/lib/database/index.js 2>/dev/null | head -100

echo
echo "== Search for database.active definition =="
rg -n "database.*active|\.active\s*=" app/lib/database -B3 -A3 | head -50

Repository: RocketChat/Rocket.Chat.ReactNative

Length of output: 6978


Add status and statusText field declarations to the User model to persist presence data.

Lines 105 and 110 assign the full server user payload via Object.assign(u, user). The User model at app/lib/database/model/User.js lacks @field decorators for status and statusText, so WatermelonDB silently discards these writes. This leaves presence data inconsistent—Redux holds the values but the database does not. Add these fields to the User model and ensure the app schema includes the corresponding columns.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/lib/methods/getUsersPresence.ts` around lines 104 - 110, The User model
is missing `@field` declarations for status and statusText so Object.assign(u,
user) in getUsersPresence.ts (inside existingRecord.prepareUpdate and
userCollection.prepareCreate) writes are dropped by WatermelonDB; add two new
fields (status and statusText) with `@field` decorators to the User model class
(e.g., class User / User.js) and update the database schema to include
corresponding columns (status, statusText) and migration if needed so presence
values persisted; ensure any TypeScript types/interfaces for TUserModel or the
model exports are updated to include these properties.

@Yaddalapalli-Charan-Kumar-Naidu
Copy link
Contributor Author

Yaddalapalli-Charan-Kumar-Naidu commented Mar 4, 2026

Whenever you make performance optimizations, you need to report a detailed explanation of how this is impacting performance and compare the difference.

To give some context on the performance side, here’s a detailed explanation of what this PR actually does:

  1. Stopping the "Bridge Traffic Jam"
    Previously, the code was firing off a separate database transaction for every single user update within a loop. If a big channel had 200 people changing status at once, the app would try to cross the React Native bridge 200 times in a row.
    Now, we gather everything first and do one single atomic batch write. It’s basically the difference between making 200 trips to the grocery store for one item each, versus one trip for the whole list.

  2. Bulk Lookups instead of 1-by-1 Checks
    Before updating a user, we need to know if they already exist in our local database (to decide if we create or update them). We used to ask the database for each user individually. Now, we just ask for everyone in the batch at once. It’s significantly faster.

  3. Handling SQLite's "999" Limit
    The reason for that 900 chunking logic is that SQLite which WatermelonDB uses has a hard limit: you can't ask for more than 999 things in a single query. On really busy servers, we might be syncing presence for over 1,000 users at once. Without this chunking, the app would simply crash with a database error. I've simplified this code so it's clear how we're staying under that limit.

  4. Smart Deduplication
    If a user flips from online -> away -> online very quickly, we now deduplicate that batch. We only process their latest state once, instead of wasting battery and CPU processing three different status changes that happened in the same 2-second window.

@diegolmello
Copy link
Member

@Yaddalapalli-Charan-Kumar-Naidu I meant where are your measurements?

@Yaddalapalli-Charan-Kumar-Naidu
Copy link
Contributor Author

@Yaddalapalli-Charan-Kumar-Naidu I meant where are your measurements?

@Yaddalapalli-Charan-Kumar-Naidu I meant where are your measurements?
@diegolmello here are the measurements from device testing where I simulated syncing 100 users:
Develop branch: ~2354 ms
My new branch: ~302 ms
I instrumented getUsersPresence.ts with performance.now() to measure the total sync latency from receiving api response to db writes
Optimized branch: measured the time to prepare operations and execute the single db.batch() call.
Both tests were run with a 100-user simulation on the same physical device for consistency.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors getUsersPresence to reduce WatermelonDB write contention and unique-constraint crashes by batching user upserts and deduplicating queued presence requests.

Changes:

  • Replace per-user db.write() calls with a single db.batch() of prepared updates/creates, using Q.oneOf to prefetch existing user records.
  • Deduplicate usersBatch before calling getUsersPresence to reduce duplicate work and constraint collisions.
  • Add error logging and a fallback path when the main batch write fails.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +95 to +97
for (let i = 0; i < userIds.length; i += 900) {
chunks.push(userIds.slice(i, i + 900));
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk size 900 is a magic number with no explanation. If this is to stay under SQLite/WatermelonDB parameter limits (commonly 999), please extract it into a named constant (e.g., SQLITE_MAX_BIND_PARAMS_MARGIN) and add a short comment explaining why this value is chosen so future changes don’t accidentally break large presence updates.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +112
const operations: Model[] = users.map((user: IUser) => {
const existingRecord = existingRecordsMap.get(user._id);
if (existingRecord) {
return existingRecord.prepareUpdate((u: TUserModel) => {
Object.assign(u, user);
});
}
return userCollection.prepareCreate(u => {
u._raw = sanitizedRaw({ id: user._id }, userCollection.schema);
Object.assign(u, user);
});
} catch (e) {
// User not found
});
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

operations is built based on a pre-fetch of existing records. If getUsersPresence can run concurrently (it’s called from setTimeout without awaiting), two in-flight calls can both decide a user doesn’t exist and both prepare a create for the same id, causing the same unique-constraint crash this PR aims to avoid. Consider serializing getUsersPresence executions (e.g., an in-flight promise/mutex) and/or handling unique-constraint failures by re-fetching and retrying as an update instead of a create.

Copilot uses AI. Check for mistakes.
Comment on lines +121 to +134
for (const operation of operations) {
try {
// eslint-disable-next-line no-await-in-loop
await db.write(() => db.batch(operation));
} catch (operationError: any) {
log({
message: 'Fallback per-operation write failed',
operationId: operation.id,
operationTable: operation.collection.table,
error: operationError?.message || operationError
});
failedOperations.push(operation);
}
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback path retries by writing each operation individually (db.write(() => db.batch(operation))). If the initial batch fails due to a single bad record, this can reintroduce the same performance issue (many writes) and significantly delay UI/sync work. A more scalable fallback is to retry in smaller batches (e.g., split operations into chunks or bisect to isolate the failing op) and only fall back to per-item writes for the minimal failing subset.

Suggested change
for (const operation of operations) {
try {
// eslint-disable-next-line no-await-in-loop
await db.write(() => db.batch(operation));
} catch (operationError: any) {
log({
message: 'Fallback per-operation write failed',
operationId: operation.id,
operationTable: operation.collection.table,
error: operationError?.message || operationError
});
failedOperations.push(operation);
}
}
const processBatchWithFallback = async (ops: Model[]): Promise<void> => {
if (!ops.length) {
return;
}
try {
await db.write(async () => {
await db.batch(...ops);
});
} catch (batchError: any) {
if (ops.length === 1) {
const operation = ops[0];
log({
message: 'Fallback per-operation write failed',
operationId: operation.id,
operationTable: operation.collection.table,
error: batchError?.message || batchError
});
failedOperations.push(operation);
return;
}
const mid = Math.floor(ops.length / 2);
const left = ops.slice(0, mid);
const right = ops.slice(mid);
await processBatchWithFallback(left);
await processBatchWithFallback(right);
}
};
await processBatchWithFallback(operations);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Performance issue in getUsersPresence due to unbatched WatermelonDB writes and race condition

3 participants