Complete guide to running BaseAgent and interpreting its output
python3 agent.py --instruction "Your task description"

| Argument | Type | Description |
|---|---|---|
| --instruction | string | The task for the agent to complete |
# Create a file
python3 agent.py --instruction "Create a file called hello.txt with 'Hello, World!'"
# Read and explain code
python3 agent.py --instruction "Read src/core/loop.py and explain what it does"
# Find files
python3 agent.py --instruction "Find all Python files that contain 'import json'"

# Multi-step task
python3 agent.py --instruction "Create a Python module in src/utils/helpers.py with functions for string manipulation, then write tests for it"
# Code modification
python3 agent.py --instruction "Add error handling to all functions in src/api/client.py that make HTTP requests"
# Investigation task
python3 agent.py --instruction "Find the bug causing the TypeError in the test output and fix it"

Configure the agent's behavior with environment variables:
# DeepSeek API for challenge runs
export DEEPSEEK_API_KEY="your-token"
export DEEPSEEK_BASE_URL="https://api.deepseek.com"
export LLM_MODEL="deepseek-v4-pro"
# Cost management
export LLM_COST_LIMIT="10.0"
# Run with inline variables
LLM_COST_LIMIT="5.0" python3 agent.py --instruction "..."

Challenge API policy: this agent is configured to use only the DeepSeek API for cost reasons. Challenge runs must use DEEPSEEK_API_KEY and the configured DeepSeek model. Do not add or rely on Chutes, OpenRouter, Anthropic, OpenAI, or other provider fallbacks for challenge execution.
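As a rough illustration of how a wrapper script might read and validate the variables above, here is a minimal sketch. `AgentConfig` and `load_config` are hypothetical names and are not part of BaseAgent. The fallback values simply copy the example values shown above; whether the agent itself uses those defaults is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container for the settings described above."""
    api_key: str
    base_url: str
    model: str
    cost_limit: float


def load_config(env: Mapping[str, str]) -> AgentConfig:
    """Build an AgentConfig from environment-style key/value pairs."""
    api_key = env.get("DEEPSEEK_API_KEY", "")
    if not api_key:
        raise ValueError("DEEPSEEK_API_KEY must be set")

    # Fallbacks mirror the example values above; the real defaults are an assumption.
    cost_limit = float(env.get("LLM_COST_LIMIT", "10.0"))
    if cost_limit <= 0:
        raise ValueError("LLM_COST_LIMIT must be positive")

    return AgentConfig(
        api_key=api_key,
        base_url=env.get("DEEPSEEK_BASE_URL", "https://api.deepseek.com"),
        model=env.get("LLM_MODEL", "deepseek-v4-pro"),
        cost_limit=cost_limit,
    )
```

Calling `load_config(os.environ)` before launching a run would surface a missing key or a malformed cost limit early, rather than partway through a task.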
BaseAgent emits JSONL (JSON Lines) events to stdout:
sequenceDiagram
participant Agent
participant stdout as Standard Output
Agent->>stdout: {"type": "thread.started", "thread_id": "sess_..."}
Agent->>stdout: {"type": "turn.started"}
loop Tool Execution
Agent->>stdout: {"type": "item.started", "item": {...}}
Agent->>stdout: {"type": "item.completed", "item": {...}}
end
Agent->>stdout: {"type": "turn.completed", "usage": {...}}
| Event | Description |
|---|---|
| thread.started | Session begins, includes unique thread ID |
| turn.started | Agent begins processing the instruction |
| item.started | A tool call is starting |
| item.completed | A tool call has completed |
| turn.completed | Agent finished, includes token usage |
| turn.failed | An error occurred |
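To show how the event types in the table could be consumed, the sketch below counts events by type and picks out the usage block from `turn.completed`. The `summarize` function is a hypothetical helper, and the sample lines are abbreviated from the event shapes documented here; real runs may carry additional fields.

```python
import json
from collections import Counter


def summarize(jsonl_text: str) -> dict:
    """Count events by type and return the final usage block, if present."""
    counts = Counter()
    usage = None
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between events
        event = json.loads(line)
        counts[event.get("type", "unknown")] += 1
        if event.get("type") == "turn.completed":
            usage = event.get("usage")
    return {
        "counts": dict(counts),
        "usage": usage,
        "failed": counts["turn.failed"] > 0,
    }


sample = "\n".join([
    '{"type": "thread.started", "thread_id": "sess_1706890123456"}',
    '{"type": "turn.started"}',
    '{"type": "turn.completed", "usage": {"input_tokens": 5432, '
    '"cached_input_tokens": 4890, "output_tokens": 256}}',
])
print(summarize(sample))
```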
{"type": "thread.started", "thread_id": "sess_1706890123456"}
{"type": "turn.started"}
{"type": "item.started", "item": {"type": "command_execution", "id": "1", "command": "shell_command({command: 'ls -la'})", "status": "in_progress"}}
{"type": "item.completed", "item": {"type": "command_execution", "id": "1", "command": "shell_command", "status": "completed", "aggregated_output": "total 40\ndrwxr-xr-x...", "exit_code": 0}}
{"type": "item.completed", "item": {"type": "agent_message", "id": "2", "content": "I found the files. Now creating hello.txt..."}}
{"type": "item.started", "item": {"type": "command_execution", "id": "3", "command": "write_file({file_path: 'hello.txt', content: 'Hello, World!'})", "status": "in_progress"}}
{"type": "item.completed", "item": {"type": "command_execution", "id": "3", "command": "write_file", "status": "completed", "exit_code": 0}}
{"type": "turn.completed", "usage": {"input_tokens": 5432, "cached_input_tokens": 4890, "output_tokens": 256}}

Agent logs go to stderr:
[14:30:15] [superagent] ============================================================
[14:30:15] [superagent] SuperAgent Starting (SDK 3.0 - DeepSeek API)
[14:30:15] [superagent] ============================================================
[14:30:15] [superagent] Model: deepseek-v4-pro
[14:30:15] [superagent] Instruction: Create hello.txt with 'Hello World'...
[14:30:15] [loop] Getting initial state...
[14:30:16] [loop] Iteration 1/200
[14:30:16] [compaction] Context: 5432 tokens (3.2% of 168000)
[14:30:16] [loop] Prompt caching: 1 system + 2 final messages marked (3 breakpoints)
[14:30:17] [loop] Executing tool: write_file
[14:30:17] [loop] Iteration 2/200
[14:30:18] [loop] No tool calls in response
[14:30:18] [loop] Requesting self-verification before completion
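If the stderr logs need to be processed programmatically, something like the following could work. The patterns are inferred only from the sample lines above (`[HH:MM:SS] [component] message`), so treat the log format as an assumption that may change between versions. `parse_log_line` and `context_usage` are hypothetical helper names.

```python
import re

# Matches lines such as "[14:30:16] [loop] Iteration 1/200".
LOG_LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] \[([^\]]+)\] (.*)$")
# Matches messages such as "Context: 5432 tokens (3.2% of 168000)".
CONTEXT = re.compile(r"Context: (\d+) tokens \(([\d.]+)% of (\d+)\)")


def parse_log_line(line: str):
    """Return (time, component, message), or None if the line has another shape."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groups() if match else None


def context_usage(message: str):
    """Return (used_tokens, window_tokens) from a compaction message, else None."""
    match = CONTEXT.search(message)
    if not match:
        return None
    return int(match.group(1)), int(match.group(3))
```

Returning `None` for unrecognized lines keeps the parser tolerant of log messages that do not follow the sampled shape.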
# Send JSONL to file, logs to terminal
python3 agent.py --instruction "..." > output.jsonl
# Send logs to file, JSONL to terminal
python3 agent.py --instruction "..." 2> agent.log
# Both to separate files
python3 agent.py --instruction "..." > output.jsonl 2> agent.log

# Get all completed items
python3 agent.py --instruction "..." | jq 'select(.type == "item.completed")'
# Get final usage stats
python3 agent.py --instruction "..." | jq 'select(.type == "turn.completed") | .usage'
# Get all agent messages
python3 agent.py --instruction "..." | jq 'select(.item.type == "agent_message") | .item.content'

import json
import subprocess

# Run the agent and capture stdout (JSONL events) and stderr (logs)
result = subprocess.run(
    ["python3", "agent.py", "--instruction", "Your task"],
    capture_output=True,
    text=True,
)

# Parse JSONL output, skipping blank lines
events = [json.loads(line) for line in result.stdout.splitlines() if line.strip()]

# Find usage stats
for event in events:
    if event.get("type") == "turn.completed":
        print(f"Input tokens: {event['usage']['input_tokens']}")
        print(f"Output tokens: {event['usage']['output_tokens']}")

flowchart TB
subgraph Input["Input Phase"]
Cmd["python3 agent.py --instruction '...'"]
Parse["Parse Arguments"]
Init["Initialize Components"]
end
subgraph Explore["Exploration Phase"]
State["Get Current State"]
Context["Build Initial Context"]
end
subgraph Execute["Execution Phase"]
Loop["Agent Loop"]
Tools["Execute Tools"]
Verify["Self-Verification"]
end
subgraph Output["Output Phase"]
JSONL["Emit JSONL Events"]
Stats["Report Statistics"]
end
Cmd --> Parse --> Init
Init --> State --> Context
Context --> Loop
Loop --> Tools --> Loop
Loop --> Verify
Verify --> Stats
Loop --> JSONL
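Because the flowchart shows JSONL events being emitted while the agent loop is still running, a caller may prefer to handle events as they arrive instead of waiting for the process to exit. The sketch below is one way to do that. `iter_events` and `stream_agent` are hypothetical helpers, and silently skipping non-JSON lines is an assumption about how stray output should be treated.

```python
import json
import subprocess
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed event per non-empty line, skipping lines that are not JSON objects."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: stray non-JSON output is safe to ignore.
            continue
        if isinstance(event, dict):
            yield event


def stream_agent(instruction: str) -> Iterator[dict]:
    """Run the agent and yield events as they are written to stdout."""
    with subprocess.Popen(
        ["python3", "agent.py", "--instruction", instruction],
        stdout=subprocess.PIPE,  # stderr is left attached to the terminal
        text=True,
    ) as proc:
        yield from iter_events(proc.stdout)
```

A caller could then write `for event in stream_agent("..."):` and react to each `item.started` or `item.completed` event as it appears.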
# Create a file
python3 agent.py --instruction "Create config.yaml with database settings for PostgreSQL"
# Read and summarize
python3 agent.py --instruction "Read README.md and create a one-paragraph summary"
# Modify a file
python3 agent.py --instruction "Add a new function to src/utils.py that validates email addresses"

# Explain code
python3 agent.py --instruction "Explain how the authentication system works in src/auth/"
# Find patterns
python3 agent.py --instruction "Find all API endpoints and list them with their HTTP methods"
# Review code
python3 agent.py --instruction "Review src/api/handlers.py for potential security issues"

# Investigate error
python3 agent.py --instruction "Find why 'test_user_creation' is failing and fix it"
# Trace behavior
python3 agent.py --instruction "Trace the data flow from user input to database in the signup process"

# Setup
python3 agent.py --instruction "Create a Python project structure with src/, tests/, and setup.py"
# Add feature
python3 agent.py --instruction "Add logging to all functions in src/core/ using Python's logging module"
# Refactor
python3 agent.py --instruction "Refactor src/utils.py to follow the single responsibility principle"

Each agent run creates a new session with a unique ID:
{"type": "thread.started", "thread_id": "sess_1706890123456"}

stateDiagram-v2
[*] --> Initializing: python3 agent.py
Initializing --> Running: thread.started
Running --> Iterating: turn.started
Iterating --> Executing: item.started
Executing --> Iterating: item.completed
Iterating --> Verifying: No tool calls
Verifying --> Iterating: Needs more work
Verifying --> Complete: Verified
Iterating --> Failed: Error
Complete --> [*]: turn.completed
Failed --> [*]: turn.failed
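The lifecycle in the diagram can be approximated from the stdout events alone. The sketch below is a simplification, not the agent's internal implementation. The Verifying state is folded into Iterating because the "No tool calls" trigger is not an emitted event, and any event without a listed transition leaves the state unchanged. `TRANSITIONS` and `final_state` are hypothetical names.

```python
# Transition table transcribed from the diagram: (state, event type) -> next state.
# Verifying is merged into Iterating because it cannot be observed on stdout.
TRANSITIONS = {
    ("Initializing", "thread.started"): "Running",
    ("Running", "turn.started"): "Iterating",
    ("Iterating", "item.started"): "Executing",
    ("Executing", "item.completed"): "Iterating",
    ("Iterating", "turn.completed"): "Complete",
    ("Iterating", "turn.failed"): "Failed",
}


def final_state(event_types) -> str:
    """Replay a sequence of event types and return the resulting state."""
    state = "Initializing"
    for event_type in event_types:
        # Unknown combinations (for example an agent_message that arrives as
        # item.completed while Iterating) keep the current state.
        state = TRANSITIONS.get((state, event_type), state)
    return state
```

Replaying the event types from a captured run gives a quick check of whether the session ended in Complete or Failed.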
# Set lower cost limit for testing
export LLM_COST_LIMIT="2.0"

# Watch tool executions in real-time
python3 agent.py --instruction "..." 2>&1 | grep -E "Executing tool|Iteration"

# Full verbose output
python3 agent.py --instruction "..." 2>&1 | tee agent_debug.log

- Tools Reference - Available tools and their parameters
- Configuration - Customize agent behavior
- Best Practices - Tips for effective usage