feat: add serial crash-safe mutation recovery#142
Conversation
5346945 to
4ca33c7
Compare
|
Solid PR — the snapshot/rollback design is sound and the hardlink backup contract holds (every writer into the hardlinked dirs is atomic temp+replace or append-only). One gap worth closing before merge, since it's the PR's own goal: the URL path isn't crash-safe. |
Keep URL ingests on add_single_file's staged conversion path so source artifacts are published only after the mutation snapshot exists and can roll back on failure. Add regression coverage for URL failure cleanup while preserving downloaded raw files for retry, and replace legacy asyncio.run mocks with async compiler stubs to avoid unawaited coroutine warnings in the touched tests.
@KylinMountain Thanks for the CR, gap fixed🫡 |
TL;DR
This PR makes document import fail more cleanly. If
openkb addis interrupted or errors halfway through, OpenKB should no longer leave half-written KB files behind; it can recover the previous stable state before accepting the next mutation.Summary
Adds crash-safe mutation recovery for serial KB mutations without bringing back the concurrent add pipeline from #104.
This PR keeps the scope focused on the first split-out piece from the abandoned PR: journaled snapshot/rollback for
addand PageIndex Cloud import, plus the atomic writer prerequisites needed for hardlink-backed rollback.What changed
Added
openkb/mutation.pywith:Wired recovery into serial mutation paths:
openkb add <file>converts into staging, snapshots final KB paths, publishes, compiles, then commits via registry write + journal markopenkb add --from-pageindex-cloudprepares cloud data read-only, snapshots doc-specific paths, writes/compiles/registers under the same recovery contractKept conversion serial while adding staging support:
convert_document(..., staging_dir=...)writes raw/source artifacts into isolated staging before commitMade wiki writers atomic where hardlink snapshots depend on temp+replace semantics:
Explicitly out of scope
This does not reintroduce the concurrency half of #104:
ThreadPoolExecutorfile_processing_jobsValidation
UV_CACHE_DIR=/tmp/uv-cache UV_PYTHON=3.13 uv run --extra dev pytest -q898 passed, 5 warningsUV_CACHE_DIR=/tmp/uv-cache UV_PYTHON=3.13 uv run --extra dev pytest tests/test_add_command.py::TestImportFromPageindexCloud tests/test_mutation.py -q25 passedUV_CACHE_DIR=/tmp/uv-cache UV_PYTHON=3.13 uv run --extra dev pytest tests/test_converter.py tests/test_add_command.py tests/test_mutation.py -q72 passedUV_CACHE_DIR=/tmp/uv-cache UV_PYTHON=3.13 uv run --extra dev ruff check openkb/converter.py openkb/mutation.py tests/test_mutation.pyNotes
Full-project
ruffandtyare not clean on currentmain; this PR fixes the new diagnostics introduced by the recovery changes but does not widen scope to existing lint/type debt.