fix: preserve completed task progress on checkpoint resume #231
Open
jafreck wants to merge 4 commits into
- Add prepareForResume() to clear terminalExhaustion, failed/blocked tasks, and stale Phase 4 flow checkpoint entries on reload
- Reset __flowCheckpoint and __phase4FlowCheckpoint status from 'failed' to 'running' so the Cadre runner re-enters correctly
- Filter Phase 4 completedExecutionIds to only retain substeps for fully-completed tasks; failed/in-flight tasks re-enter from scratch
- Reset flow checkpoint error field in resetFromPhase()
- Accumulate outputTokens from assistant.message events in Copilot JSONL parser as fallback when usage summary is missing
- Improve Lore MCP tool documentation in agent prompt partial with explicit tool names and stronger guidance to prefer Lore over view
- Tune zstd fixture config: maxParallelAgents 12→8, resume false
checkpoint.completeTask() was defined but never called, leaving completedTasks empty. On resume, filterPhase4CompletedExecutionIds used the empty set to filter out ALL task substep entries from the Phase 4 flow checkpoint, causing the entire task graph to restart.

Fix:
- runCommitSubstep now calls checkpoint.completeTask(task.id) so completedTasks is populated during Phase 4 (all execution modes).
- filterPhase4CompletedExecutionIds derives completed tasks from the flow checkpoint's own /commit entries when completedTasks is empty (backward compat for existing checkpoints).
- Add test for back-fill resume path.
- Fix task-graph-builder test schema (add parent_symbol_id column required by updated @jafreck/lore).
…-reset
# Conflicts:
#   src/core/checkpoint.ts
#   tests/core/checkpoint.test.ts
#   tests/fixtures/zstd-c-project/migration.config.json
- Add symbol_metrics row for synthetic symbol in kb-server test
- Update semantic search sort order expectation in kb-search-tool test
Problem
checkpoint.completeTask() was defined but never called anywhere during Phase 4 execution. This left completedTasks permanently empty in the checkpoint. On resume, filterPhase4CompletedExecutionIds() used this empty set to decide which Phase 4 flow entries to keep, and since no tasks were "completed", it removed all task substep entries, causing the entire task graph to restart from scratch.

This was observed during a live zstd C→Rust migration: after ~46 hours and 43 committed tasks, a kill-and-resume reset the migration to 3 committed tasks.
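For reviewers, the failure mode can be reproduced in a few lines. This is a minimal sketch, not the code in src/core/checkpoint.ts: the execution ID shape "<taskId>/<substep>", the helper names, and the sample IDs are assumptions made for illustration.

```typescript
// Assumed ID shape (hypothetical): "<taskId>/<substep>", e.g. "task-1/commit".
function taskIdOfEntry(executionId: string): string {
  const slash = executionId.lastIndexOf("/");
  return slash === -1 ? executionId : executionId.slice(0, slash);
}

// Keep only the substep entries whose task appears in the completed set.
function keepCompletedEntries(
  executionIds: string[],
  completedTasks: Set<string>,
): string[] {
  return executionIds.filter((id) => completedTasks.has(taskIdOfEntry(id)));
}

// Entries recorded in the Phase 4 flow checkpoint before the kill.
const recordedEntries = ["task-1/migrate", "task-1/commit", "task-2/migrate"];

// completeTask() was never called, so the set is empty and every entry is dropped.
const keptBeforeFix = keepCompletedEntries(recordedEntries, new Set<string>());

// With completeTask("task-1") recorded, the substeps of task-1 survive the filter.
const keptAfterFix = keepCompletedEntries(recordedEntries, new Set<string>(["task-1"]));

console.log(keptBeforeFix.length, keptAfterFix);
```

With an empty completed set the filter cannot tell committed tasks from in-flight ones, which matches the restart-from-scratch behavior described above.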
Fix
Populate completedTasks during Phase 4 (all execution modes):
- runCommitSubstep() now calls checkpoint.completeTask(task.id) after the code-migrator commit step.
- The complete step also calls completeTask() for completeness.

Backward-compatible fallback in filterPhase4CompletedExecutionIds:
- When completedTasks is empty, derives the completed set from the flow checkpoint's own /commit entries (tasks that have a committed substep are treated as completed).
- Back-fills completedTasks from this derived set so downstream logic stays consistent.

Other changes
- Fix task-graph-builder.test.ts test schema: add parent_symbol_id column to the symbols table, required by the updated @jafreck/lore package.

Testing
should back-fill completedTasks from Phase 4 commit entries when completedTasks is empty
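
The backward-compatible fallback described under Fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the ResumeState shape, the phase4CompletedExecutionIds field name, the "<taskId>/<substep>" ID format, and the function name are all hypothetical.

```typescript
// Hypothetical checkpoint shape; the real field names may differ.
interface ResumeState {
  completedTasks: string[];
  phase4CompletedExecutionIds: string[];
}

// Assumed ID shape: "<taskId>/<substep>", with "commit" as the final substep.
function parseExecutionId(id: string): { taskId: string; substep: string } {
  const slash = id.lastIndexOf("/");
  return slash === -1
    ? { taskId: id, substep: "" }
    : { taskId: id.slice(0, slash), substep: id.slice(slash + 1) };
}

function filterWithBackfill(state: ResumeState): ResumeState {
  let completed = new Set<string>(state.completedTasks);

  if (completed.size === 0) {
    // Fallback for older checkpoints: a task with a commit entry counts as completed.
    completed = new Set<string>(
      state.phase4CompletedExecutionIds
        .map(parseExecutionId)
        .filter((entry) => entry.substep === "commit")
        .map((entry) => entry.taskId),
    );
  }

  // Retain substep entries only for completed tasks; other tasks re-enter from scratch.
  const kept = state.phase4CompletedExecutionIds.filter((id) =>
    completed.has(parseExecutionId(id).taskId),
  );

  // Back-fill completedTasks so downstream logic sees the derived set.
  return { completedTasks: Array.from(completed), phase4CompletedExecutionIds: kept };
}

const resumed = filterWithBackfill({
  completedTasks: [],
  phase4CompletedExecutionIds: ["task-1/migrate", "task-1/commit", "task-2/migrate"],
});
console.log(resumed);
```

In this sketch task-1 has a commit entry, so its substeps are kept and completedTasks is back-filled with it, while task-2 has no commit entry and would re-enter from scratch.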