Skip to content

fix(memory): dedupe Composio sources to the latest authorization per identity#2843

Merged
sanil-23 merged 2 commits into
tinyhumansai:mainfrom
sanil-23:fix/memory-sources-dedupe-by-latest
May 28, 2026
Merged

fix(memory): dedupe Composio sources to the latest authorization per identity#2843
sanil-23 merged 2 commits into
tinyhumansai:mainfrom
sanil-23:fix/memory-sources-dedupe-by-latest

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 28, 2026

Summary

  • Intelligence → Memory was rendering every Composio connection ever returned by the backend, including superseded re-authorizations — visually 9 zombie rows where only 3 reflected current truth on a real account.
  • Dedupe rows in buildRows by (toolkit + identity) and keep the largest createdAt per group.
  • Drop the earlier status-priority idea (ACTIVE always wins) — a freshly re-authorized connection that then expires is the user's actual current truth and must beat a stale-but-still-marked-ACTIVE older row.
  • Vitest coverage: 8 cases including a regression guard on the createdAt-vs-status precedence.

Problem

composio.list_connections returns every connection record the Composio tenant has ever held for that account — including auths the user has since superseded by re-authorizing. The frontend dropped these into buildRows unchanged, so the Memory sources panel grew a row per historical authorization. On one tester's account this expanded to 9 rows (1 ACTIVE + 7 EXPIRED + 1 REVOKED across gmail/github/notion) where only 3 reflected anything actionable. Every duplicate row also shared the same per-toolkit MemorySyncStatus block, repeating the same "2 chunks · 1d ago" multiple times.

Solution

Two-stage filter in buildRows:

  1. Toolkit gate (unchanged) — drop toolkits with no memory-tree sync provider.
  2. Dedupe — group by ${toolkit}::${identity} when identity (accountEmail ?? workspace ?? username) is non-null; otherwise by toolkit alone. Within a group, keep the connection with the largest createdAt.

isMoreRecentConnection is pure recency. An earlier draft preferred ACTIVE/CONNECTED over other statuses regardless of timestamp, but that loses the right answer in one critical case: when a user re-authorizes and the fresh auth then expires, the new EXPIRED is the user's current state — the older row that is "still ACTIVE" is stale data from a superseded authorization. A regression test pins this.

Once the backend starts populating identity fields, two genuinely different accounts on the same toolkit (e.g. two Gmail addresses) will automatically keep separate rows without further code changes.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
  • N/A: behaviour-only change — no new feature row added to docs/TEST-COVERAGE-MATRIX.md, so the matrix doesn't need an update.
  • N/A: behaviour-only change — no feature IDs to list under ## Related.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: cosmetic frontend cleanup that does not touch the docs/RELEASE-MANUAL-SMOKE.md surfaces.
  • N/A: discovered organically while reviewing the Memory panel, no linked issue.

Impact

  • Runtime / platform: desktop only — pure app/src/ change, no Rust, no schema, no backend.
  • Performance: trivial; replaces one linear pass with a linear pass + a Map<string, ComposioConnection> lookup.
  • Behavior: in any environment where a user has re-authorized a connection, the Memory sources panel will collapse from N rows per toolkit to 1 row per logical identity. The Sync button, freshness pill, and disabled-on-non-active rules continue to apply as before — they just apply to the one winning row.
  • Migration: none. No persisted state, no API contract change.
  • Forward compatibility: dedupe key uses identity when present, so when the backend starts populating accountEmail / workspace / username, distinct accounts on the same toolkit will automatically keep separate rows without further code changes.

Related

  • Closes:
  • Follow-up PR(s)/TODOs: surface a "history" affordance for users who genuinely want to see the prior superseded auths (currently they're invisible from the UI; only reachable via the raw RPC).

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/memory-sources-dedupe-by-latest
  • Commit SHA: ff0dd265

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Tests

    • Added comprehensive tests for memory source deduplication, recency handling, status precedence, identity-preservation, and toolkit filtering.
  • Bug Fixes

    • Improved memory connection deduplication so only the most recent relevant connection is shown per toolkit/identity, honors status precedence, and excludes non-syncable toolkits.

Review Change Stack

…identity

Memory Sources (Intelligence -> Memory) was rendering every Composio
connection row returned by the backend, including superseded
re-authorizations. For a user with 3 historical Gmail auths (1 ACTIVE
plus 2 EXPIRED), 3 Notion auths, etc., the panel showed 9 zombie rows
where only 3 reflected current truth.

Group connections in buildRows by (toolkit + identity) and keep the
row with the largest createdAt. Identity uses
accountEmail ?? workspace ?? username; when the backend does not
populate any of those (current Composio passthrough), the group
collapses by toolkit alone. Once identity fields ship, two genuinely
distinct accounts on the same toolkit will automatically keep
separate rows.

isMoreRecentConnection is pure recency: larger createdAt wins. An
earlier draft used a status-priority rule (ACTIVE/CONNECTED beats
EXPIRED regardless of createdAt) but that is wrong: if a user
re-authorizes and the fresh authorization then expires, the new
EXPIRED is the actual current truth -- they need to re-auth. A stale
'still marked ACTIVE' row from a superseded authorization should not
win. A regression test pins this.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team May 28, 2026 11:52
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a6fbe64-d1aa-4094-a700-769837b1b5b7

📥 Commits

Reviewing files that changed from the base of the PR and between ff0dd26 and 0d9bbb8.

📒 Files selected for processing (1)
  • app/src/components/intelligence/MemorySources.test.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/src/components/intelligence/MemorySources.test.tsx

📝 Walkthrough

Walkthrough

This PR exports and refactors buildRows to filter connections by syncableToolkits, deduplicate connections per toolkit (and optional identity) by newest createdAt, and produce rows annotated with matching MemorySyncStatus. It also adds isMoreRecentConnection and tests for both helpers.

Changes

Memory sources connection deduplication

Layer / File(s) Summary
Recency comparison helper
app/src/components/intelligence/MemorySources.tsx, app/src/components/intelligence/MemorySources.test.tsx
isMoreRecentConnection compares connections by createdAt. Tests verify ordering across statuses (including newer EXPIRED superseding older ACTIVE) and handling of missing createdAt.
Connection deduplication and filtering
app/src/components/intelligence/MemorySources.tsx, app/src/components/intelligence/MemorySources.test.tsx
buildRows now filters connections to syncableToolkits, deduplicates by toolkit and optional identity, keeps the most recent connection per key, and attaches the corresponding MemorySyncStatus. Tests cover collapsing by newest createdAt, status precedence, identity-based separation, filtering of non-syncable toolkits, and status attachment.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

memory

Suggested reviewers

  • graycyrus

Poem

🐰 New rows hop in, then softly cease,
Newest timestamps sing and find their peace.
Toolkits sync or quietly drop,
Rabbits dedupe—never stop.
A tiny hop for code, a joyous bop.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly summarizes the main change: deduplicating Composio sources to keep only the latest authorization per identity.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. label May 28, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/components/intelligence/MemorySources.test.tsx`:
- Around line 24-27: Rename the snake_case local variables in the test to
camelCase: change older_active and newer_expired to olderActive and newerExpired
wherever they are declared and used (the conn(...) calls and the
isMoreRecentConnection(...) assertions) so the variables follow the repo naming
rules; ensure you update both instances in the test that reference conn and
isMoreRecentConnection to use the new names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 22a88b6b-565c-4c92-9888-4997eb29130f

📥 Commits

Reviewing files that changed from the base of the PR and between 1884e10 and ff0dd26.

📒 Files selected for processing (2)
  • app/src/components/intelligence/MemorySources.test.tsx
  • app/src/components/intelligence/MemorySources.tsx

Comment thread app/src/components/intelligence/MemorySources.test.tsx Outdated
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanil-23 hey! the code looks good to me — the dedupe logic is solid and the test suite is thorough (especially the regression guard pinning that a newer EXPIRED beats an older ACTIVE). CI is still pending on several checks (Frontend Unit Tests, Frontend Coverage, Build Tauri App, all E2E suites), so i'll hold off on approving until those are green. once CI passes, i'll come back and approve this. let me know if anything comes up.

Copy link
Copy Markdown
Contributor

@jainsanil18 jainsanil18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — small, well-tested cleanup of zombie Composio rows in Memory sources.

@sanil-23 sanil-23 merged commit 2d68c81 into tinyhumansai:main May 28, 2026
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants