feat(mutation): route serial add mutations through coordinator#156
feat(mutation): route serial add mutations through coordinator#156gwokhou wants to merge 12 commits into
Conversation
802d0e6 to
0401b81
Compare
|
Pushed three follow-up commits to handle the conflict, the review fix, and the final coverage gaps separately:
Validation:
Coverage-mode validation:
|
…col.add Since the add mutation stopped eagerly snapshotting .openkb/files (it registers the new blob via track_new only on success), a blob written by col.add() leaks if index_long_document raises afterward: pageindex.db is rolled back by the snapshot but the blob file is not, and nothing reclaims it. Delete the doc on the failure path so the indexer owns cleanup of a half-applied add, restoring the .openkb/files rollback-surface guarantee.
… prep run_add_mutation handles snapshot/body failures itself and returns False, so the broad except only ever catches pre-mutation errors (name resolution, registry read, plan construction). Echo the real exception instead of the old 'Failed to prepare mutation snapshot' label, which hid the cause at DEBUG level.
|
Found two more issues in self-review. All fixed.
|
…rdinator # Conflicts: # openkb/cli.py # openkb/indexer.py
cli._cleanup_staging duplicated add_coordinator._cleanup_staging_dirs (both guard None then shutil.rmtree with ignore_errors). Drop the cli copy and route its two pre-coordinator call sites through the coordinator helper.
AddMutationPlan.rollback_error_message had zero callers — every construction site (cli.py x2, tests x6) used the default. Remove the field and inline the literal at its single read site.
|
@KylinMountain The PR is ready for review🚀 Synced with Update — synced with
|
Background
#142 made serial
openkb addcrash-safe with journals, touched-path snapshots, staged publish, rollback, and recovery drain. The recovery model is now in place, but its orchestration still lives inline in the CLI add paths.Before OpenKB can safely add prepare workers or
--jobs, official KB mutation ownership needs to be explicit. Future workers may prepare disposable artifacts, but one serial owner must remain responsible for snapshotting rollback-critical paths, publishing staged artifacts, updating the registry, marking the commit point, and handling rollback or post-commit cleanup.This PR introduces that owner for add-like mutations. The coordinator lifts the crash-safe lifecycle from #142 into a higher-level abstraction: it owns transaction mechanics, while the add paths keep document-specific publish, compile, and register logic.
This is the Stage 3 bridge from the parallel architecture roadmap: #151. It keeps ingestion serial, makes the mutation boundary reviewable, and gives future non-add commands a pattern to adopt when they need the same recovery guarantees.
Summary
AddMutationPlanandrun_add_mutationas the serial coordinator for add-like KB mutationsraw/.openkb/filesrollback surface intact without eager snapshotting — the long-doc blob is registered lazily viasnapshot.track_new(), andindex_long_documentdeletes the PageIndex doc if it fails aftercol.add()so a half-applied add leaves no orphan blob (review-driven fix for a gap left by perf(mutation): don't snapshot the whole blob store on every add #155)Scope
This is the Stage 3 slice from the parallel architecture roadmap: #151
In scope:
openkb add <file>openkb add <URL>after URL fetchopenkb add --from-pageindex-cloud <DOC_ID>Out of scope:
openkb add <dir> --jobs Nremove,recompile,lint --fix, reports, chat sessions, or watch batchingWhy this matters
After this PR, future parallel ingestion work has a clear boundary:
That boundary is what lets later PRs split prepare from commit without weakening the recovery guarantees from #142.
Verification
UV_PYTHON=3.13 UV_CACHE_DIR=/tmp/uv-cache uv run --extra dev pytest tests/test_add_command.py -qUV_PYTHON=3.13 UV_CACHE_DIR=/tmp/uv-cache uv run --extra dev pytest tests/test_mutation.py tests/test_locks.py tests/test_add_command.py tests/test_url_ingest.py tests/test_indexer.py -qgit diff --check main...HEAD