fix(memory): dedupe Composio sources to the latest authorization per identity by sanil-23 · Pull Request #2843 · tinyhumansai/openhuman

sanil-23 · 2026-05-28T11:52:50Z

Summary

Intelligence → Memory was rendering every Composio connection ever returned by the backend, including superseded re-authorizations — visually 9 zombie rows where only 3 reflected current truth on a real account.
Dedupe rows in buildRows by (toolkit + identity) and keep the largest createdAt per group.
Drop the earlier status-priority idea (ACTIVE always wins) — a freshly re-authorized connection that then expires is the user's actual current truth and must beat a stale-but-still-marked-ACTIVE older row.
Vitest coverage: 8 cases including a regression guard on the createdAt-vs-status precedence.

Problem

composio.list_connections returns every connection record the Composio tenant has ever held for that account — including auths the user has since superseded by re-authorizing. The frontend dropped these into buildRows unchanged, so the Memory sources panel grew a row per historical authorization. On one tester's account this expanded to 9 rows (1 ACTIVE + 7 EXPIRED + 1 REVOKED across gmail/github/notion) where only 3 reflected anything actionable. Every duplicate row also shared the same per-toolkit MemorySyncStatus block, repeating the same "2 chunks · 1d ago" multiple times.

Solution

Two-stage filter in buildRows:

Toolkit gate (unchanged) — drop toolkits with no memory-tree sync provider.
Dedupe — group by ${toolkit}::${identity} when identity (accountEmail ?? workspace ?? username) is non-null; otherwise by toolkit alone. Within a group, keep the connection with the largest createdAt.

isMoreRecentConnection is pure recency. An earlier draft preferred ACTIVE/CONNECTED over other statuses regardless of timestamp, but that loses the right answer in one critical case: when a user re-authorizes and the fresh auth then expires, the new EXPIRED is the user's current state — the older row that is "still ACTIVE" is stale data from a superseded authorization. A regression test pins this.

Once the backend starts populating identity fields, two genuinely different accounts on the same toolkit (e.g. two Gmail addresses) will automatically keep separate rows without further code changes.

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
N/A: behaviour-only change — no new feature row added to docs/TEST-COVERAGE-MATRIX.md, so the matrix doesn't need an update.
N/A: behaviour-only change — no feature IDs to list under ## Related.
No new external network dependencies introduced (mock backend used per Testing Strategy)
N/A: cosmetic frontend cleanup that does not touch the docs/RELEASE-MANUAL-SMOKE.md surfaces.
N/A: discovered organically while reviewing the Memory panel, no linked issue.

Impact

Runtime / platform: desktop only — pure app/src/ change, no Rust, no schema, no backend.
Performance: trivial; replaces one linear pass with a linear pass + a Map<string, ComposioConnection> lookup.
Behavior: in any environment where a user has re-authorized a connection, the Memory sources panel will collapse from N rows per toolkit to 1 row per logical identity. The Sync button, freshness pill, and disabled-on-non-active rules continue to apply as before — they just apply to the one winning row.
Migration: none. No persisted state, no API contract change.
Forward compatibility: dedupe key uses identity when present, so when the backend starts populating accountEmail / workspace / username, distinct accounts on the same toolkit will automatically keep separate rows without further code changes.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/memory-sources-dedupe-by-latest
Commit SHA: ff0dd265

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Added comprehensive tests for memory source deduplication, recency handling, status precedence, identity-preservation, and toolkit filtering.
Bug Fixes
- Improved memory connection deduplication so only the most recent relevant connection is shown per toolkit/identity, honors status precedence, and excludes non-syncable toolkits.

…identity Memory Sources (Intelligence -> Memory) was rendering every Composio connection row returned by the backend, including superseded re-authorizations. For a user with 3 historical Gmail auths (1 ACTIVE plus 2 EXPIRED), 3 Notion auths, etc., the panel showed 9 zombie rows where only 3 reflected current truth. Group connections in buildRows by (toolkit + identity) and keep the row with the largest createdAt. Identity uses accountEmail ?? workspace ?? username; when the backend does not populate any of those (current Composio passthrough), the group collapses by toolkit alone. Once identity fields ship, two genuinely distinct accounts on the same toolkit will automatically keep separate rows. isMoreRecentConnection is pure recency: larger createdAt wins. An earlier draft used a status-priority rule (ACTIVE/CONNECTED beats EXPIRED regardless of createdAt) but that is wrong: if a user re-authorizes and the fresh authorization then expires, the new EXPIRED is the actual current truth -- they need to re-auth. A stale 'still marked ACTIVE' row from a superseded authorization should not win. A regression test pins this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-28T11:53:07Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a6fbe64-d1aa-4094-a700-769837b1b5b7

📥 Commits

Reviewing files that changed from the base of the PR and between ff0dd26 and 0d9bbb8.

📒 Files selected for processing (1)

app/src/components/intelligence/MemorySources.test.tsx

🚧 Files skipped from review as they are similar to previous changes (1)

app/src/components/intelligence/MemorySources.test.tsx

📝 Walkthrough

Walkthrough

This PR exports and refactors buildRows to filter connections by syncableToolkits, deduplicate connections per toolkit (and optional identity) by newest createdAt, and produce rows annotated with matching MemorySyncStatus. It also adds isMoreRecentConnection and tests for both helpers.

Changes

Memory sources connection deduplication

Layer / File(s)	Summary
Recency comparison helper `app/src/components/intelligence/MemorySources.tsx`, `app/src/components/intelligence/MemorySources.test.tsx`	`isMoreRecentConnection` compares connections by `createdAt`. Tests verify ordering across statuses (including newer `EXPIRED` superseding older `ACTIVE`) and handling of missing `createdAt`.
Connection deduplication and filtering `app/src/components/intelligence/MemorySources.tsx`, `app/src/components/intelligence/MemorySources.test.tsx`	`buildRows` now filters connections to `syncableToolkits`, deduplicates by toolkit and optional identity, keeps the most recent connection per key, and attaches the corresponding `MemorySyncStatus`. Tests cover collapsing by newest `createdAt`, status precedence, identity-based separation, filtering of non-syncable toolkits, and status attachment.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

memory

Suggested reviewers

graycyrus

Poem

🐰 New rows hop in, then softly cease,
Newest timestamps sing and find their peace.
Toolkits sync or quietly drop,
Rabbits dedupe—never stop.
A tiny hop for code, a joyous bop.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly summarizes the main change: deduplicating Composio sources to keep only the latest authorization per identity.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/components/intelligence/MemorySources.test.tsx`:
- Around line 24-27: Rename the snake_case local variables in the test to
camelCase: change older_active and newer_expired to olderActive and newerExpired
wherever they are declared and used (the conn(...) calls and the
isMoreRecentConnection(...) assertions) so the variables follow the repo naming
rules; ensure you update both instances in the test that reference conn and
isMoreRecentConnection to use the new names.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 22a88b6b-565c-4c92-9888-4997eb29130f

📥 Commits

Reviewing files that changed from the base of the PR and between 1884e10 and ff0dd26.

📒 Files selected for processing (2)

app/src/components/intelligence/MemorySources.test.tsx
app/src/components/intelligence/MemorySources.tsx

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-28T12:02:38Z

Actionable comments posted: 0

graycyrus

@sanil-23 hey! the code looks good to me — the dedupe logic is solid and the test suite is thorough (especially the regression guard pinning that a newer EXPIRED beats an older ACTIVE). CI is still pending on several checks (Frontend Unit Tests, Frontend Coverage, Build Tauri App, all E2E suites), so i'll hold off on approving until those are green. once CI passes, i'll come back and approve this. let me know if anything comes up.

jainsanil18

LGTM — small, well-tested cleanup of zombie Composio rows in Memory sources.

sanil-23 requested a review from a team May 28, 2026 11:52

coderabbitai Bot added the memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. label May 28, 2026

coderabbitai Bot requested changes May 28, 2026

View reviewed changes

Comment thread app/src/components/intelligence/MemorySources.test.tsx Outdated

test(memory): camelCase locals in dedupe test per CodeRabbit nit

0d9bbb8

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot approved these changes May 28, 2026

View reviewed changes

graycyrus reviewed May 28, 2026

View reviewed changes

jainsanil18 approved these changes May 28, 2026

View reviewed changes

sanil-23 merged commit 2d68c81 into tinyhumansai:main May 28, 2026
31 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(memory): dedupe Composio sources to the latest authorization per identity#2843

fix(memory): dedupe Composio sources to the latest authorization per identity#2843
sanil-23 merged 2 commits into
tinyhumansai:mainfrom
sanil-23:fix/memory-sources-dedupe-by-latest

sanil-23 commented May 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

graycyrus left a comment

Uh oh!

jainsanil18 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

sanil-23 commented May 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Submission Checklist

Impact

Related

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

jainsanil18 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sanil-23 commented May 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading