This page documents the current implementation in codescribe/lib/_loop.py.
code-scribe loop runs a bounded execution/review workflow over a task file.
The public entry point is:
codescribe/lib/_loop.py: prompt_loop()
The implementation is centered on PromptLoopRunner.
For each loop iteration:
- Reload task metadata if the task file changed.
- Run an execution-phase agent.
- Build a harness-computed LoopSummary from the execution RunResult.
- If the execution agent reported STATUS: COMPLETE, stop early.
- Otherwise run a review-phase agent.
- Read review_output.toml and update pending items.
- Stop early if review reports no pending items and no blocker.
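The per-iteration steps above can be sketched as a small control loop. This is a minimal illustration, not the code in PromptLoopRunner: ExecutionOutcome, ReviewOutcome, and run_loops are hypothetical names, the two phases are passed in as plain callables, and the task-metadata reload and LoopSummary construction are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExecutionOutcome:
    """Hypothetical stand-in for what the harness learns from one execution phase."""
    status: str  # "COMPLETE" or "INCOMPLETE"
    next_steps: List[str] = field(default_factory=list)


@dataclass
class ReviewOutcome:
    """Hypothetical stand-in for the parsed review_output.toml."""
    pending: List[str] = field(default_factory=list)
    blocker: str = ""


def run_loops(
    max_loops: int,
    execute: Callable[[int, List[str]], ExecutionOutcome],
    review: Callable[[int, ExecutionOutcome], ReviewOutcome],
) -> int:
    """Run up to max_loops iterations and return how many were completed."""
    pending: List[str] = []
    loops_completed = 0
    for loop_index in range(1, max_loops + 1):
        outcome = execute(loop_index, pending)
        loops_completed = loop_index
        if outcome.status == "COMPLETE":
            break  # execution agent reported STATUS: COMPLETE
        verdict = review(loop_index, outcome)
        pending = verdict.pending
        if not pending and not verdict.blocker:
            break  # review found nothing pending and no blocker
    return loops_completed
```

Note that a non-empty blocker with no pending items does not end the loop in this sketch, which follows the wording of the last step: both conditions must hold to stop early.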
Important correction:
- The current execution prompt tells the agent to complete as much work as possible in one session. It is not a strict “do exactly one task and exit” loop.
Cross-loop state lives in the PromptLoopRunner instance, not primarily in
files:
- loop_summaries
- review_summaries
- pending_items
- loops_completed
The harness injects these summaries back into later prompts.
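As a rough picture of that in-memory state, the sketch below groups the four fields in one container. The field names come from the list above; the types, the LoopState class, and the record method are assumptions made for illustration, including the choice to replace pending items each loop rather than accumulate them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoopState:
    """Hypothetical container mirroring the four in-memory fields."""
    loop_summaries: List[str] = field(default_factory=list)
    review_summaries: List[str] = field(default_factory=list)
    pending_items: List[str] = field(default_factory=list)
    loops_completed: int = 0

    def record(self, loop_summary: str, review_summary: str, pending: List[str]) -> None:
        """Store the results of one finished loop."""
        self.loop_summaries.append(loop_summary)
        self.review_summaries.append(review_summary)
        # Assumption: the newest pending list replaces the previous one.
        self.pending_items = list(pending)
        self.loops_completed += 1
```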
The loop writes artifacts under .codescribe/loop/:
- run.toml
  - run metadata
  - configured model and limits
  - cumulative token counters
  - loops_completed
- state.toml
  - current run_id, loop_index, and phase
  - workdir
  - task_file
  - updated_at
- execution.toml
  - execution-phase event log for the most recent loop
  - overwritten at the start of each execution phase
- review_output.toml
  - structured output from the review agent
  - overwritten on each review phase
- review.toml
  - review-phase event log
These files are useful for inspection and crash recovery, but the main state relay is still in memory.
PromptLoopRunner.__post_init__():
- resolves workdir,
- ensures the task file is inside the workdir,
- constructs the model with set_neural_model(..., reasoning=reason),
- builds the loop system prompt with build_system_prompt(...),
- loads the task chat template with load_chat_template(..., return_meta=True),
- reads optional task metadata:
[tools]
bash = ["rg", "python", "python3"]
That bash list extends the bounded execution-phase bash allowlist.
build_system_prompt(...) gives both loop phases a shared policy prompt.
It emphasizes:
- never fabricating file contents, command output, or test results,
- determining state by tools rather than assumption,
- staying inside the working directory,
- using relative or workdir-contained paths in bash,
- batching independent reads early,
- preferring direct implementation over repeated exploration,
- validating after meaningful changes,
- using edit for targeted changes and write for new files or rewrites.
The execution phase uses:
- Agent(...)
- tools = make_tools(workdir, bash_allow=...)
- max_iterations = agent_iterations
- logging to .codescribe/loop/execution.toml
If --log or --log-path is also provided, logs are fanned out through
MultiToolLogSink to both the execution log and the requested extra log path.
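The fan-out idea can be illustrated with a tiny composite sink. The interface of MultiToolLogSink is not documented here, so everything in this sketch is an assumption: the write(event) method, the LogSink protocol, and the ListSink and FanOutSink classes are hypothetical stand-ins.

```python
from typing import Iterable, List, Protocol


class LogSink(Protocol):
    """Assumed minimal sink interface: accept one event at a time."""
    def write(self, event: dict) -> None: ...


class ListSink:
    """In-memory sink used here in place of a file-backed log."""
    def __init__(self) -> None:
        self.events: List[dict] = []

    def write(self, event: dict) -> None:
        self.events.append(event)


class FanOutSink:
    """Forward every event to each wrapped sink, in order."""
    def __init__(self, sinks: Iterable[LogSink]) -> None:
        self._sinks = list(sinks)

    def write(self, event: dict) -> None:
        for sink in self._sinks:
            sink.write(event)
```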
build_execution_task(...) injects:
- loop progress,
- files created across prior loops,
- files edited across prior loops,
- recent commands and errors from the last loop,
- pending items.
That context is assembled by format_loop_context(...), which is how the
harness keeps the next execution session oriented without requiring the agent to
re-read state files.
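The kind of text block such a formatter might produce can be sketched as follows. This is not the real format_loop_context: render_loop_context, its parameters, and the section titles are hypothetical, chosen only to mirror the five kinds of context listed above.

```python
from typing import List, Sequence


def render_loop_context(
    loop_index: int,
    max_loops: int,
    files_created: Sequence[str],
    files_edited: Sequence[str],
    recent_commands: Sequence[str],
    recent_errors: Sequence[str],
    pending_items: Sequence[str],
) -> str:
    """Render prior-loop facts as a plain-text block for the next prompt."""

    def section(title: str, items: Sequence[str]) -> List[str]:
        # Always emit the heading so the agent can see that a category is empty.
        if not items:
            return [f"{title}: (none)"]
        return [f"{title}:"] + [f"  - {item}" for item in items]

    lines = [f"Loop {loop_index} of {max_loops}"]
    lines += section("Files created in prior loops", files_created)
    lines += section("Files edited in prior loops", files_edited)
    lines += section("Commands run in the last loop", recent_commands)
    lines += section("Errors in the last loop", recent_errors)
    lines += section("Pending items", pending_items)
    return "\n".join(lines)
```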
Behavior differs slightly on the first loop:
- if the task file can be read up front, its full contents are injected and the agent is told not to re-read it,
- otherwise the agent is told to read the task file.
Later loops are explicitly told not to re-read the task file or re-glob the workspace just for orientation.
The execution agent is asked to finish with <final_answer> containing:
- STATUS: COMPLETE or STATUS: INCOMPLETE
- the plan followed
- exact check output
- when incomplete, a NEXT STEPS: section
The harness parses:
- STATUS: via extract_status(...)
- NEXT STEPS: via extract_pending_items(...)
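A parser for this report format could look roughly like the sketch below. The functions parse_status and parse_next_steps are hypothetical and only illustrate the idea; the actual matching rules of extract_status(...) and extract_pending_items(...) are not described on this page and may differ, for example in how they treat bullets or blank lines.

```python
import re
from typing import List, Optional

_STATUS_RE = re.compile(r"^\s*STATUS:\s*(COMPLETE|INCOMPLETE)\b", re.MULTILINE)


def parse_status(final_answer: str) -> Optional[str]:
    """Return "COMPLETE" or "INCOMPLETE", or None when no status line is present."""
    match = _STATUS_RE.search(final_answer)
    return match.group(1) if match else None


def parse_next_steps(final_answer: str) -> List[str]:
    """Collect the items that follow a NEXT STEPS: heading, up to the first blank line."""
    items: List[str] = []
    in_section = False
    for raw in final_answer.splitlines():
        line = raw.strip()
        if line.upper().startswith("NEXT STEPS:"):
            in_section = True
            continue
        if not in_section:
            continue
        if not line:
            if items:
                break  # a blank line after at least one item ends the section
            continue
        # Strip a leading "-", "*", "1." or "1)" list marker.
        items.append(re.sub(r"^(?:[-*]|\d+[.)])\s+", "", line))
    return items
```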
The review phase runs a separate fresh agent.
Current review toolset:
- read
- glob
- write
- bounded bash
The review bash allowlist is tighter than execution:
- ls
- stat
- pwd
- find
- grep
- head
- tail
- which
- env
- rg
The review agent iteration budget is:
max(6, agent_iterations // 2)
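To make the arithmetic concrete, the formula can be wrapped in a one-line function (review_iteration_budget is a hypothetical name used only for this example). The floor of 6 takes over whenever agent_iterations is below 14.

```python
def review_iteration_budget(agent_iterations: int) -> int:
    """Half the execution budget, rounded down, but never fewer than 6 iterations."""
    return max(6, agent_iterations // 2)


# With the default of 30 execution iterations, review gets 15.
print(review_iteration_budget(30))
# With a small budget such as 8, the floor applies and review still gets 6.
print(review_iteration_budget(8))
```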
Important correction:
- The review phase is not limited to only read/glob/write; it also has a restricted bash tool.
build_review_task(...) gives the review agent:
- a harness-computed summary of verified actions,
- any rejected tool calls,
- the execution agent’s final report,
- the path where it must write review_output.toml.
Rejected calls are important: they were attempted by the execution agent but the harness refused to run them, so they had no workspace effect and should be considered unverified.
The review agent is instructed to write TOML like:
loop = 2
summary = "What actually happened"
blocker = ""
[[pending]]
item = "Concrete next step"
The harness then reads pending and blocker from that file.
loop_summary_from_result(...) builds a LoopSummary from the structured
RunResult rather than reparsing the event log.
It tracks:
- files_written
- files_edited
- files_read
- commands_run
- errors
- rejected
- token counts
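The bucketing behind such a summary can be sketched as below. The real RunResult and LoopSummary types are not shown on this page, so ToolCall, LoopSummarySketch, and summarize_calls are hypothetical. The sketch also assumes that failed calls are recorded only as errors and that token counts are tracked elsewhere.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ToolCall:
    """Hypothetical record of one tool call; the real RunResult shape may differ."""
    tool: str               # "read", "write", "edit", or "bash"
    target: str             # file path or command line
    error: str = ""         # non-empty when the call ran but failed
    rejected: bool = False  # True when the harness refused to run it


@dataclass
class LoopSummarySketch:
    files_written: List[str] = field(default_factory=list)
    files_edited: List[str] = field(default_factory=list)
    files_read: List[str] = field(default_factory=list)
    commands_run: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def summarize_calls(calls: Iterable[ToolCall]) -> LoopSummarySketch:
    """Sort tool calls into the summary buckets, de-duplicating paths and commands."""
    summary = LoopSummarySketch()
    buckets = {
        "write": summary.files_written,
        "edit": summary.files_edited,
        "read": summary.files_read,
        "bash": summary.commands_run,
    }
    for call in calls:
        if call.rejected:
            summary.rejected.append(f"{call.tool}: {call.target}")
            continue  # rejected calls never ran, so they touch no other bucket
        if call.error:
            summary.errors.append(f"{call.tool}: {call.error}")
            continue  # assumption: failed calls are reported only as errors
        bucket = buckets.get(call.tool)
        if bucket is not None and call.target not in bucket:
            bucket.append(call.target)
    return summary
```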
Important correction:
- Although older comments may refer to the TOML event log, the current summary logic is driven by Agent.run() output semantics.
The loop stops early in either of these cases:
- The execution agent emits STATUS: COMPLETE.
- The review output contains:
  - no pending items, and
  - an empty blocker.
Loop mode is bounded to the configured working directory.
Current enforcement includes:
- task file must be inside workdir,
- file tools resolve paths within the root,
- bash runs with a bounded allowlist,
- review bash is even more restrictive.
This is a policy layer, not an OS sandbox.
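Two of these checks are simple enough to sketch. The helpers is_inside and command_allowed are hypothetical and are not taken from the codescribe sources. The command check looks only at the first token, so it illustrates the policy idea and would not by itself stop shell features such as pipes or command chaining.

```python
import shlex
from pathlib import Path
from typing import Collection


def is_inside(root: Path, candidate: str) -> bool:
    """True when candidate resolves to the root itself or a path beneath it."""
    root = root.resolve()
    path = Path(candidate)
    resolved = (path if path.is_absolute() else root / path).resolve()
    return resolved == root or root in resolved.parents


def command_allowed(command: str, allowlist: Collection[str]) -> bool:
    """True when the first token of the command is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    return bool(tokens) and tokens[0] in allowlist
```

Resolving before comparing means that a relative path such as ../secrets is rejected even though it is written relative to the working directory.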
code-scribe loop defaults today:
- agent_loops = 5
- agent_iterations = 30
The loop implementation and CLI option defaults agree on the current limits.
However, the command help text in codescribe/cli/_commands.py still contains
older wording that says each loop picks the single most important next task and
exits. That description is stale relative to the current execution prompt in
codescribe/lib/_loop.py, which instructs the agent to complete as much work as
possible in one session.