Skip to content

Commit 139174b

Browse files
committed
fix(knowledge): key hard-delete result off rows actually deleted
Address Cursor review: hardDeleteDocuments returned existingIds.length (the requested count) and cleaned storage for the full pre-tx set, even though the decrement was driven by the rows the transaction actually deleted. Under a concurrent delete that claimed some ids first, that overstated the result and re-touched storage for rows this call didn't delete. Drive the storage cleanup, log, and return value from deletedDocs (the returning() rows) so all four are consistent.
1 parent f759a0d commit 139174b

1 file changed

Lines changed: 16 additions & 14 deletions

File tree

apps/sim/lib/knowledge/documents/service.ts

Lines changed: 16 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -2052,7 +2052,6 @@ export async function hardDeleteDocuments(
20522052
}
20532053

20542054
const existingIds = documentsToDelete.map((doc) => doc.id)
2055-
const docInfoById = new Map(documentsToDelete.map((doc) => [doc.id, doc]))
20562055

20572056
// Resolve subscriptions for every candidate billed owner before the transaction
20582057
// (these are reads). Connector-synced documents are never metered at ingest
@@ -2068,37 +2067,40 @@ export async function hardDeleteDocuments(
20682067
subByUser.set(billedUserId, await getHighestPrioritySubscription(billedUserId))
20692068
}
20702069

2071-
// Decrement inside the same transaction that deletes the rows, driven by the
2072-
// rows THIS transaction actually deleted (`returning()`) — so a concurrent
2073-
// delete of the same ids removes 0 rows and decrements nothing (no double
2074-
// decrement), and a rollback leaves the counter untouched (no inflation).
2070+
// Everything downstream is keyed off the rows THIS transaction actually deleted
2071+
// (`returning()`), not the requested ids — so a concurrent delete that removed
2072+
// some ids first doesn't get double-decremented, double-cleaned, or counted
2073+
// here, and a rollback leaves the counter untouched.
2074+
let deletedDocs: typeof documentsToDelete = []
20752075
await db.transaction(async (tx) => {
20762076
await tx.delete(embedding).where(inArray(embedding.documentId, existingIds))
20772077
const deletedRows = await tx
20782078
.delete(document)
20792079
.where(inArray(document.id, existingIds))
20802080
.returning({ id: document.id })
20812081

2082+
const deletedIds = new Set(deletedRows.map((row) => row.id))
2083+
deletedDocs = documentsToDelete.filter((doc) => deletedIds.has(doc.id))
2084+
20822085
const bytesByUser = new Map<string, number>()
2083-
for (const { id } of deletedRows) {
2084-
const info = docInfoById.get(id)
2085-
if (!info || info.connectorId || info.fileSize <= 0) continue
2086-
const billedUserId = info.uploadedBy ?? info.kbUserId
2086+
for (const doc of deletedDocs) {
2087+
if (doc.connectorId || doc.fileSize <= 0) continue
2088+
const billedUserId = doc.uploadedBy ?? doc.kbUserId
20872089
if (!billedUserId) continue
2088-
bytesByUser.set(billedUserId, (bytesByUser.get(billedUserId) ?? 0) + info.fileSize)
2090+
bytesByUser.set(billedUserId, (bytesByUser.get(billedUserId) ?? 0) + doc.fileSize)
20892091
}
20902092
for (const [billedUserId, bytes] of bytesByUser) {
20912093
await decrementStorageUsageInTx(tx, subByUser.get(billedUserId) ?? null, billedUserId, bytes)
20922094
}
20932095
})
20942096

2095-
await deleteDocumentStorageFiles(documentsToDelete, requestId)
2097+
await deleteDocumentStorageFiles(deletedDocs, requestId)
20962098

2097-
logger.info(`[${requestId}] Hard deleted ${existingIds.length} documents`, {
2098-
documentIds: existingIds,
2099+
logger.info(`[${requestId}] Hard deleted ${deletedDocs.length} documents`, {
2100+
documentIds: deletedDocs.map((doc) => doc.id),
20992101
})
21002102

2101-
return existingIds.length
2103+
return deletedDocs.length
21022104
}
21032105

21042106
export async function deleteDocument(

0 commit comments

Comments
 (0)