Open follow-ups that still appear relevant to the current code.
- Add tests for `Agent.run()` stop conditions:
  - final text
  - max iterations
  - tool budget exhaustion
  - repeated-call blocking
  - invalid tool-argument JSON recovery
- Add direct tests for `RunResult`, `ToolResult`, and `RejectedCall` so future refactors do not silently change loop semantics.
- Consider stronger context-compaction beyond the current workspace summary plus output truncation.
- Add tests for `PromptLoopRunner` early-exit behavior:
  - execution `STATUS: COMPLETE`
  - empty review `pending` and blocker
  - execution
- Decide whether review should always run after execution, or whether the current “skip review on STATUS: COMPLETE” behavior is final API.
- Document or implement a clearer crash-resume story for `.codescribe/loop/` artifacts.
- Tighten bounded `bash` safety. It still uses `shell=True` after validation.
- Add tests for path-bounding edge cases in `ReadTool`, `GlobTool`, `EditTool`, and `WriteTool`.
- Consider whether `EditTool` should explicitly reject overlapping/nested edits instead of relying mainly on exact-match uniqueness.
- Add backend smoke tests for:
  - `OpenAICompModel`
  - `AnthropicModel`
  - `ArgoModel`
  - `TFModel`
- Decide whether `supports_native_tools=True` is the right name for backends that emulate tool calls through strict JSON prompting.
- Document the practical support matrix for reasoning/token accounting across providers.
- Keep docs focused on code-backed behavior and remove speculative framework comparisons unless they are needed and maintained.
- Reconcile `README.rst` examples and defaults with the current CLI and loop implementation.
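
For the `bash` safety item above, one possible direction is to parse the validated command into an argument vector and run it without a shell. This is only a sketch under assumptions: `parse_command`, `run_bounded`, and the `allowed` set are hypothetical names, not the project's current API, and the real validation rules are not reproduced here.

```python
from __future__ import annotations

import shlex
import subprocess


def parse_command(command: str, allowed: set[str]) -> list[str]:
    """Split a command string into argv and check the program name.

    `allowed` is a hypothetical allowlist of program names or paths.
    """
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in allowed:
        raise ValueError(f"command not allowed: {command!r}")
    return argv


def run_bounded(
    command: str,
    allowed: set[str],
    cwd: str,
    timeout: float = 10.0,
) -> subprocess.CompletedProcess:
    """Run a validated command with shell=False.

    Because no shell is involved, `;`, `|`, `>`, and `$(...)` are passed
    to the program as literal argument text instead of being interpreted.
    """
    argv = parse_command(command, allowed)
    return subprocess.run(
        argv,
        cwd=cwd,
        shell=False,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

The trade-off is that pipes, redirects, and globbing stop working, so this only fits if the tool is meant to run single programs. If shell features are required, the validation step has to carry the full safety burden instead.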
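
For the path-bounding item above, the edge cases worth testing can be illustrated with a small containment check. This is a sketch, not the logic the file tools actually use: `resolve_within`, `escapes`, and `ESCAPE_CASES` are hypothetical names introduced here.

```python
from pathlib import Path


def resolve_within(root: Path, candidate: str) -> Path:
    """Resolve `candidate` against `root` and refuse anything outside it."""
    root = root.resolve()
    # Joining with an absolute candidate discards `root`; resolve() also
    # collapses `..` segments and follows symlinks that exist on disk.
    target = (root / candidate).resolve()
    if target != root and root not in target.parents:
        raise ValueError(f"path escapes workspace: {candidate}")
    return target


def escapes(root: Path, candidate: str) -> bool:
    """Return True when `candidate` would land outside `root`."""
    try:
        resolve_within(root, candidate)
    except ValueError:
        return True
    return False


# Inputs a bounded tool should reject for a workspace directory named "ws".
ESCAPE_CASES = [
    "../outside.txt",       # plain parent traversal
    "a/../../outside.txt",  # traversal hidden behind a valid prefix
    "/etc/passwd",          # absolute path
    "../ws-other/x",        # sibling sharing the root's name prefix
]
```

The sibling-prefix case matters because a string `startswith` comparison on the root path would accept it, while a component-wise comparison does not. A symlink inside the workspace that points outside it is a further case to cover, and needs a real filesystem fixture to test.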
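
For the `EditTool` item above, explicit rejection of overlapping or nested edits could look roughly like the following. It assumes edits are `(old, new)` string pairs matched by exact text, which may not match the real `EditTool` interface. `locate_edits` and `apply_edits` are hypothetical helpers.

```python
from __future__ import annotations


def locate_edits(text: str, edits: list[tuple[str, str]]) -> list[tuple[int, int, str]]:
    """Map each (old, new) edit to a (start, end, new) span, rejecting conflicts."""
    spans = []
    for old, new in edits:
        if not old:
            raise ValueError("empty match text")
        start = text.find(old)
        # Require exactly one occurrence, counting overlapping ones too.
        if start == -1 or text.find(old, start + 1) != -1:
            raise ValueError(f"expected exactly one match for {old!r}")
        spans.append((start, start + len(old), new))
    spans.sort()
    # After sorting by start, any span that begins before the previous one
    # ends is overlapping it or nested inside it.
    for (_, prev_end, _), (next_start, _, _) in zip(spans, spans[1:]):
        if next_start < prev_end:
            raise ValueError("overlapping or nested edits")
    return spans


def apply_edits(text: str, edits: list[tuple[str, str]]) -> str:
    """Apply all edits against the original text in one pass."""
    # Work from the end so earlier offsets stay valid.
    for start, end, new in reversed(locate_edits(text, edits)):
        text = text[:start] + new + text[end:]
    return text
```

Uniqueness alone does not catch the nested case: `"beta gamma"` and `"gamma"` can each match exactly once in the same text while still targeting the same characters.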