fix(web): stop truncating multi-line plugin transcript messages #1185
abhay-codes07 wants to merge 1 commit into
parseTranscriptMessages builds its regex with the m flag, and the lookahead that terminates a message body ends with |$. Under m, $ matches at every line end, so the lazy [\s\S]*? body stops at the first newline: every line of a message after the first was silently dropped from Codex session and Amp thread documents. Continuation lines carrying "memory id: ..." were lost too, so extractArtifacts never surfaced their Memory ID chips, and previews/message counts were computed from truncated text.

Replace the |$ alternative with (?![\s\S]), a true end-of-input assertion unaffected by the m flag. The adjacent parseClaudeCodeTurns regex is fine as-is: it uses $ without the m flag.

Add regression tests through parsePluginDocument covering multi-line bodies, memory-id artifact extraction from continuation lines, literal \n normalization, and unchanged single-line behavior. The multi-line cases fail against the previous regex.

Fixes supermemoryai#1184
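The difference between the two terminators can be reproduced in isolation. The sketch below is not the project's regex: the header pattern is a simplified guess at the "N. [role]" format, and `splitMessages` is a hypothetical helper introduced only to swap the final lookahead alternative.

```typescript
// Hypothetical, simplified header pattern: "N. [role] body". The real regex in
// parseTranscriptMessages is not shown in this PR text and may differ.
const transcript = [
  "1. [user] Hello there",
  "Here is more context on line two",
  "2. [assistant] Sure!",
  "Second line of the reply",
].join("\n");

// Builds the splitter with a caller-supplied final lookahead alternative so
// the two terminators can be compared on the same input.
function splitMessages(source: string, terminator: string): string[] {
  const pattern = new RegExp(
    `^\\d+\\. \\[(\\w+)\\] ([\\s\\S]*?)(?=\\n\\d+\\. \\[\\w+\\] |${terminator})`,
    "gm",
  );
  return Array.from(source.matchAll(pattern), (match) => match[2]);
}

// Under the m flag, $ also matches just before each newline, so the lazy body
// is satisfied as soon as it reaches the end of its first line.
const truncated = splitMessages(transcript, "$");

// (?![\s\S]) succeeds only when no character at all follows, whatever the flags.
const preserved = splitMessages(transcript, "(?![\\s\\S])");

console.log(truncated); // bodies cut at the first newline
console.log(preserved); // bodies keep their continuation lines
```

With this input, `truncated` holds only the first line of each message, while `preserved` keeps both lines of each body.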
    return {
      id: "doc_1",
      title: "Codex session",
      content,
      metadata: { sm_source: "codex" },
      memoryEntries: [],
    } as unknown as PluginDocumentInput
The object literal returned from makeCodexSessionDocument uses a type assertion (as unknown as PluginDocumentInput) instead of a type annotation. According to the 'Use type annotations instead of assertions for object literals' rule, you should annotate the return type of the function or the variable directly rather than casting. For example, annotate the function's return type: function makeCodexSessionDocument(content: string): PluginDocumentInput { return { ... }; } and remove the as unknown as PluginDocumentInput assertion.
Spotted by Graphite (based on custom rule: TypeScript style guide (Google))
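A minimal sketch of the annotated form the comment describes. The `PluginDocumentInput` interface here is a hypothetical stand-in containing only the fields visible in the snippet; the real type may require more fields, in which case the fixture would need to supply them before the assertion could be dropped.

```typescript
// Hypothetical shape for illustration only; the real PluginDocumentInput in
// apps/web may declare different or additional fields.
interface PluginDocumentInput {
  id: string;
  title: string;
  content: string;
  metadata: { sm_source: string };
  memoryEntries: unknown[];
}

// Annotating the return type makes the compiler check the literal directly,
// so a missing or misspelled field is reported rather than hidden by a cast.
function makeCodexSessionDocument(content: string): PluginDocumentInput {
  return {
    id: "doc_1",
    title: "Codex session",
    content,
    metadata: { sm_source: "codex" },
    memoryEntries: [],
  };
}
```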
Testing
The testing subagent classified this as a parser-only change with no UI, mobile, API, or integration testing required. Focused parser tests and formatting/lint checks passed on PR HEAD.

Commands run:

    export PATH="$HOME/.bun/bin:$PATH"
    bun test apps/web/lib/plugin-document.test.ts
    bunx biome check apps/web/lib/plugin-document.ts apps/web/lib/plugin-document.test.ts

Result:

Before, on

    [
      "Hello there",
      "Sure!"
    ]

After, on

    [
      "Hello there\nHere is more context on line two",
      "Sure!\nSecond line of the reply"
    ]

Verdict
✅ Passed. Focused parser regression tests and
Summary
Reviewed

Verdict
✅ Reviewed: no issues found. The change is narrowly scoped and the parser behavior matches the intended transcript preservation.
What
Fixes #1184

Multi-line messages in plugin transcript documents (Codex sessions, Amp threads) were being truncated to their first line in the document modal. The message-splitting regex in parseTranscriptMessages is built with the m flag, and its terminating lookahead ended with |$. Under m, $ matches at every line end, so the lazy [\s\S]*? body (written to span lines) stopped at the first newline.

For this input:

the parser produced messages "Hello there" and "Sure!"; the continuation lines landed in no message at all.

Knock-on effects: memory id: lines inside a body were dropped before extractArtifacts could surface them, so the Memory ID chip never showed for multi-line turns.

This bites on essentially every real transcript, since normalizeContent converts literal \n escapes to real newlines before parsing.

How
One-token change: the |$ alternative in the lookahead becomes |(?![\s\S]), a true end-of-input assertion that the m flag can't reinterpret. Message bodies now run until the next N. [role] header or the actual end of the transcript.

The neighboring parseClaudeCodeTurns regex also ends with |$ but doesn't use the m flag, so it already behaves correctly and is left untouched.

Testing
apps/web/lib/plugin-document.test.ts (bun:test, same setup as the other lib/*.test.ts files), exercised through the public parsePluginDocument: multi-line bodies preserved, memory-id artifacts extracted from continuation lines, literal \n normalization, and single-line messages unchanged. biome check clean; no new tsc errors in the touched files.

cc @MaheshtheDev
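The interaction between literal \n normalization and memory-id extraction can be sketched end to end. Everything below is an assumption made for illustration: `normalizeContentSketch`, `parseBodiesSketch`, and `memoryIdsSketch` are hypothetical stand-ins for normalizeContent, parseTranscriptMessages, and extractArtifacts, the header pattern is simplified, and `mem_123` is a made-up identifier.

```typescript
// Hypothetical stand-in for normalizeContent: turns the two-character escape
// sequence backslash + n into a real newline before parsing.
function normalizeContentSketch(raw: string): string {
  return raw.replace(/\\n/g, "\n");
}

// Hypothetical stand-in for the message splitter, using the fixed terminator
// (?![\s\S]) so a body only ends at the next header or at end of input.
function parseBodiesSketch(content: string): string[] {
  const pattern = /^\d+\. \[\w+\] ([\s\S]*?)(?=\n\d+\. \[\w+\] |(?![\s\S]))/gm;
  return Array.from(
    normalizeContentSketch(content).matchAll(pattern),
    (match) => match[1],
  );
}

// Hypothetical stand-in for artifact extraction: collects ids from lines of
// the form "memory id: <id>" anywhere inside a message body.
function memoryIdsSketch(body: string): string[] {
  return Array.from(body.matchAll(/^memory id: (\S+)$/gm), (match) => match[1]);
}

// Stored content with literal backslash-n escapes, as the description says
// real transcripts arrive. "mem_123" is an invented id.
const stored = "1. [user] Remember this\\nmemory id: mem_123\\n2. [assistant] Done";

const bodies = parseBodiesSketch(stored);
const memoryIds = memoryIdsSketch(bodies[0]);

console.log(bodies);
console.log(memoryIds);
```

Because the first body keeps its continuation line, the memory-id line is still present when extraction runs; with the old |$ terminator that line would have fallen outside every message.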