Loop mode

This page documents the current implementation in codescribe/lib/_loop.py.

Overview

code-scribe loop runs a bounded execution/review workflow over a task file.

The public entry point is:

codescribe/lib/_loop.py: prompt_loop()

The implementation is centered on PromptLoopRunner.

Current execution model

For each loop iteration:

1. Reload task metadata if the task file changed.
2. Run an execution-phase agent.
3. Build a harness-computed LoopSummary from the execution RunResult.
4. If the execution agent reported STATUS: COMPLETE, stop early.
5. Otherwise run a review-phase agent.
6. Read review_output.toml and update pending items.
7. Stop early if review reports no pending items and no blocker.
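The control flow above can be sketched roughly as follows. This is a minimal illustration, not the code in PromptLoopRunner: `run_loops`, `ReviewOutput`, and the `execute`/`review` callables are hypothetical names, and the metadata reload and LoopSummary steps are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ReviewOutput:
    """Hypothetical container for what the harness reads after review."""
    pending: List[str] = field(default_factory=list)
    blocker: str = ""


def run_loops(
    max_loops: int,
    execute: Callable[[int, List[str]], str],
    review: Callable[[int, str], ReviewOutput],
) -> Tuple[int, str]:
    """Return (loops_completed, reason_for_stopping)."""
    pending: List[str] = []
    for loop_index in range(1, max_loops + 1):
        # Execution phase: the callable returns "COMPLETE" or "INCOMPLETE".
        status = execute(loop_index, pending)
        if status == "COMPLETE":
            return loop_index, "execution reported COMPLETE"
        # Review phase: only reached when execution did not finish.
        result = review(loop_index, status)
        pending = result.pending
        if not result.pending and not result.blocker:
            return loop_index, "review found nothing pending"
    return max_loops, "loop budget exhausted"
```

The pending items returned by one review are handed to the next execution call, which mirrors how the harness relays state between loops in memory.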

Important correction:

The current execution prompt tells the agent to complete as much work as possible in one session. It is not a strict “do exactly one task and exit” loop.

In-memory cross-loop state

Cross-loop state lives in the PromptLoopRunner instance, not primarily in files:

loop_summaries
review_summaries
pending_items
loops_completed

The harness injects these summaries back into later prompts.

Persistent artifacts

The loop writes artifacts under .codescribe/loop/:

run.toml
- run metadata
- configured model and limits
- cumulative token counters
- loops_completed
state.toml
- current run_id, loop_index, and phase
- workdir
- task_file
- updated_at
execution.toml
- execution-phase event log for the most recent loop
- overwritten at the start of each execution phase
review_output.toml
- structured output from the review agent
- overwritten on each review phase
review.toml
- review-phase event log

These files are useful for inspection and crash recovery, but state is still passed between loops primarily in memory.

Task loading and tool configuration

PromptLoopRunner.__post_init__():

resolves workdir,
ensures the task file is inside the workdir,
constructs the model with set_neural_model(..., reasoning=reason),
builds the loop system prompt with build_system_prompt(...),
loads the task chat template with load_chat_template(..., return_meta=True),
reads optional task metadata:

[tools]
bash = ["rg", "python", "python3"]

That bash list extends the bounded execution-phase bash allowlist.

System prompt

build_system_prompt(...) gives both loop phases a shared policy prompt.

It emphasizes:

never fabricating file contents, command output, or test results,
determining state by tools rather than assumption,
staying inside the working directory,
using relative or workdir-contained paths in bash,
batching independent reads early,
preferring direct implementation over repeated exploration,
validating after meaningful changes,
using edit for targeted changes and write for new files or rewrites.

Execution phase

The execution phase uses:

Agent(...)
tools = make_tools(workdir, bash_allow=...)
max_iterations = agent_iterations
logging to .codescribe/loop/execution.toml

If --log or --log-path is also provided, logs are fanned out through MultiToolLogSink to both the execution log and the requested extra log path.

Execution task prompt

build_execution_task(...) injects:

loop progress,
files created across prior loops,
files edited across prior loops,
recent commands and errors from the last loop,
pending items.

That context is assembled by format_loop_context(...), which is how the harness keeps the next execution session oriented without requiring the agent to re-read state files.
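A rough sketch of how such a context block could be assembled is shown here. `LoopRecord` and `format_context` are hypothetical stand-ins for LoopSummary and format_loop_context(...); the real output format is not documented on this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoopRecord:
    """Hypothetical stand-in for the harness's per-loop summary."""
    files_written: List[str] = field(default_factory=list)
    files_edited: List[str] = field(default_factory=list)
    commands_run: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def format_context(records: List[LoopRecord], pending: List[str], max_loops: int) -> str:
    # Files accumulate across all prior loops.
    created = unique(p for r in records for p in r.files_written)
    edited = unique(p for r in records for p in r.files_edited)
    # Commands and errors come from the most recent loop only.
    last = records[-1] if records else LoopRecord()
    lines = [
        f"Loop {len(records) + 1} of {max_loops}",
        "Files created so far: " + (", ".join(created) or "none"),
        "Files edited so far: " + (", ".join(edited) or "none"),
        "Commands in last loop: " + (", ".join(last.commands_run) or "none"),
        "Errors in last loop: " + (", ".join(last.errors) or "none"),
        "Pending items:",
    ]
    lines += [f"- {item}" for item in pending] or ["- none"]
    return "\n".join(lines)
```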

Behavior differs slightly on the first loop:

if the task file can be read up front, its full contents are injected and the agent is told not to re-read it,
otherwise the agent is told to read the task file.

Later loops are explicitly told not to re-read the task file or re-glob the workspace just for orientation.

Completion contract

The execution agent is asked to finish with <final_answer> containing:

STATUS: COMPLETE or STATUS: INCOMPLETE
the plan followed
exact check output
when incomplete, a NEXT STEPS: section

The harness parses:

STATUS: via extract_status(...)
NEXT STEPS: via extract_pending_items(...)
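For orientation, the two parsing steps might look something like the sketch below. `parse_status` and `parse_next_steps` are hypothetical names, and the real extract_status(...) and extract_pending_items(...) may accept different formats.

```python
import re
from typing import List, Optional


def parse_status(report: str) -> Optional[str]:
    """Return "COMPLETE" or "INCOMPLETE", or None when no status line exists."""
    match = re.search(r"^\s*STATUS:\s*(COMPLETE|INCOMPLETE)\b", report, re.MULTILINE)
    return match.group(1) if match else None


def parse_next_steps(report: str) -> List[str]:
    """Collect the list items that follow a NEXT STEPS: line."""
    items: List[str] = []
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("NEXT STEPS:"):
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            if items:
                break  # a blank line after the items ends the section
            continue
        # Remove a leading "-", "*", "1." or "1)" marker.
        items.append(re.sub(r"^(?:[-*]|\d+[.)])\s*", "", stripped))
    return items


report = "STATUS: INCOMPLETE\nPlan: refactor parser\nNEXT STEPS:\n- fix tests\n2. update docs\n\nnotes"
status = parse_status(report)
steps = parse_next_steps(report)
```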

Review phase

The review phase runs a separate fresh agent.

Current review toolset:

read
glob
write
bounded bash

The review bash allowlist is tighter than execution:

ls
stat
pwd
find
grep
head
tail
which
env
rg

The review agent iteration budget is:

max(6, agent_iterations // 2)
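As a quick worked example of that formula, the review agent gets half the execution budget, with a floor of six iterations. The function name `review_iterations` is only for illustration.

```python
def review_iterations(agent_iterations: int) -> int:
    """Half the execution budget (integer division), but never fewer than 6."""
    return max(6, agent_iterations // 2)


default_budget = review_iterations(30)  # the CLI default for agent_iterations
small_budget = review_iterations(8)     # 8 // 2 == 4, so the floor applies
```

With the default of 30 execution iterations the review agent gets 15; any setting of 13 or fewer yields the floor of 6.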

Important correction:

The review phase is not limited to only read/glob/write; it also has a restricted bash tool.

Review input

build_review_task(...) gives the review agent:

a harness-computed summary of verified actions,
any rejected tool calls,
the execution agent’s final report,
the path where it must write review_output.toml.

Rejected calls are important: they were attempted by the execution agent but the harness refused to run them, so they had no workspace effect and should be considered unverified.

Review output format

The review agent is instructed to write TOML like:

loop = 2
summary = "What actually happened"
blocker = ""

[[pending]]
item = "Concrete next step"

The harness then reads pending and blocker from that file.

Loop summaries

loop_summary_from_result(...) builds a LoopSummary from the structured RunResult rather than reparsing the event log.

It tracks:

files_written
files_edited
files_read
commands_run
errors
rejected
token counts

Important correction:

Although older comments may refer to the TOML event log, the current summary logic works from the structured result returned by Agent.run(), not from the log file.
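To make the idea concrete, here is a sketch of building a summary from structured call records instead of log text. The `ToolCall` shape is purely an assumption for illustration, since the real RunResult structure is not described on this page, and token counts are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolCall:
    """Hypothetical record of one tool call inside a run result."""
    tool: str               # "read", "write", "edit", or "bash"
    target: str             # file path or command line
    error: str = ""         # non-empty when the call ran and failed
    rejected: bool = False  # True when the harness refused to run it


@dataclass
class LoopSummary:
    files_written: List[str] = field(default_factory=list)
    files_edited: List[str] = field(default_factory=list)
    files_read: List[str] = field(default_factory=list)
    commands_run: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


BUCKETS = {
    "write": "files_written",
    "edit": "files_edited",
    "read": "files_read",
    "bash": "commands_run",
}


def summarize(calls: List[ToolCall]) -> LoopSummary:
    summary = LoopSummary()
    for call in calls:
        if call.rejected:
            # Rejected calls never ran, so they are tracked separately.
            summary.rejected.append(f"{call.tool}: {call.target}")
            continue
        bucket = BUCKETS.get(call.tool)
        if bucket is not None:
            getattr(summary, bucket).append(call.target)
        if call.error:
            summary.errors.append(f"{call.tool}: {call.error}")
    return summary
```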

Early exit conditions

The loop stops early in either of these cases:

1. The execution agent emits STATUS: COMPLETE.
2. The review output contains:
   - no pending items, and
   - an empty blocker.

Safety model

Loop mode is bounded to the configured working directory.

Current enforcement includes:

task file must be inside workdir,
file tools resolve paths within the root,
bash runs with a bounded allowlist,
review bash is even more restrictive.

This is a policy layer, not an OS sandbox.
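The two kinds of check described above can be sketched as follows. `resolve_inside` and `command_allowed` are hypothetical helpers, not the harness functions; only the review allowlist values are taken from this page.

```python
import shlex
from pathlib import Path

# Review-phase allowlist as listed earlier on this page.
REVIEW_BASH_ALLOW = {"ls", "stat", "pwd", "find", "grep", "head", "tail", "which", "env", "rg"}


def resolve_inside(root: str, candidate: str) -> Path:
    """Resolve candidate against root and refuse anything that escapes it."""
    root_path = Path(root).resolve()
    target = (root_path / candidate).resolve()
    if not target.is_relative_to(root_path):  # requires Python 3.9+
        raise ValueError(f"{candidate!r} escapes the working directory")
    return target


def command_allowed(command: str, allow: set) -> bool:
    """Accept a command only when its first token is in the allowlist."""
    try:
        parts = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parse failures
    return bool(parts) and parts[0] in allow
```

A first-token check like this does not inspect pipes, `&&` chains, or arguments, which is one reason an allowlist of this kind remains a policy layer rather than an OS sandbox.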

Defaults currently exposed by the CLI

Current defaults for code-scribe loop:

agent_loops = 5
agent_iterations = 30

CLI note

The loop implementation and CLI option defaults agree on the current limits.

However, the command help text in codescribe/cli/_commands.py still contains older wording that says each loop picks the single most important next task and exits. That description is stale relative to the current execution prompt in codescribe/lib/_loop.py, which instructs the agent to complete as much work as possible in one session.
