Persist local embedding progress on failure #74

Merged
clemsgrs merged 1 commit into main from codex/persist-local-embedding-progress
Mar 18, 2026

@clemsgrs clemsgrs commented Mar 18, 2026

Summary

Local single-process embedding used to defer all persistence until the entire embedding stage finished. If a later slide failed, the embeddings of the earlier, successful slides existed only in memory and had not been written yet, so the run lost all embedding progress.

This change makes local embedding durable and resumable:

  • persist each completed slide immediately during local single-process embedding
  • keep process_list.csv feature and aggregation status aligned with the artifacts that have actually been written
  • when Pipeline.run(...) is rerun with resume=True, skip slides whose local embedding artifacts are already complete and marked successful
  • leave the multi-GPU pipeline path unchanged, while still making direct local embed_slides(..., output_dir=...) durable on later-slide failure
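
The persistence and resume behavior described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `embed_slides_durably`, `embed_one`, the `feature_status` column name, and the per-slide JSON artifact layout are all hypothetical; only `process_list.csv` and the `resume` flag are taken from the description.

```python
import csv
import json
import os
from pathlib import Path


def write_status(output_dir: Path, rows: dict) -> None:
    """Rewrite the status CSV via a temp file so a crash cannot leave it half-written."""
    tmp = output_dir / "process_list.csv.tmp"
    with tmp.open("w", newline="") as f:
        writer = csv.writer(f)
        # Column names are assumptions made for this sketch.
        writer.writerow(["slide_id", "feature_status"])
        for slide_id, status in rows.items():
            writer.writerow([slide_id, status])
    os.replace(tmp, output_dir / "process_list.csv")


def read_status(output_dir: Path) -> dict:
    """Return {slide_id: feature_status}, or an empty dict if no CSV exists yet."""
    path = output_dir / "process_list.csv"
    if not path.exists():
        return {}
    with path.open(newline="") as f:
        return {row["slide_id"]: row["feature_status"] for row in csv.DictReader(f)}


def embed_slides_durably(slide_ids, embed_one, output_dir, resume=False):
    """Embed slides one at a time, persisting each result before starting the next.

    Returns the list of slide ids that were actually computed in this call.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    status = read_status(output_dir) if resume else {}
    computed = []
    for slide_id in slide_ids:
        artifact = output_dir / f"{slide_id}.json"
        # Skip only when the artifact exists AND the CSV marks it successful.
        if resume and status.get(slide_id) == "success" and artifact.exists():
            continue
        try:
            embedding = embed_one(slide_id)
        except Exception:
            # Record the failure, then re-raise; earlier slides are already on disk.
            status[slide_id] = "failed"
            write_status(output_dir, status)
            raise
        tmp = output_dir / f"{slide_id}.json.tmp"
        tmp.write_text(json.dumps(embedding))
        os.replace(tmp, artifact)  # the artifact appears only once fully written
        # Update the status only after the artifact is in place, so the CSV
        # never claims success for a file that was not written.
        status[slide_id] = "success"
        write_status(output_dir, status)
        computed.append(slide_id)
    return computed
```

In this sketch, a failure on the last slide leaves every earlier artifact and its `success` row on disk, and a second call with `resume=True` recomputes only the slides that are missing or not marked successful.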

User impact

If a large local batch fails near the end, the embeddings that finished before the failure stay on disk instead of being discarded. A resumed local pipeline run now reuses those completed artifacts and computes only the remaining slides.

@clemsgrs clemsgrs changed the title from "[codex] Persist local embedding progress on failure" to "Persist local embedding progress on failure" Mar 18, 2026
@clemsgrs clemsgrs marked this pull request as ready for review March 18, 2026 01:08
@clemsgrs clemsgrs merged commit 6475a82 into main Mar 18, 2026
2 checks passed
@clemsgrs clemsgrs deleted the codex/persist-local-embedding-progress branch March 18, 2026 13:37