A world-class multi-agent engineering harness for Codex, Claude Code, CCSwitch, and local model routing.
Make Plus feel like Pro.
Chinese documentation: README.zh-CN.md
GPT-class models are excellent.
But Plus-level quotas are not infinite.
If you spawn many internal subagents directly inside Codex, your best-model quota can disappear fast.
A deep repo audit, a parallel multi-agent review, or one ambitious refactor can burn through the budget you wanted to save for judgment.
That is why this Skill exists.
The mission:
Make Plus feel like Pro.
This Skill turns that constraint into an engineering system:
Let the best model act as the brain. Let Claude Code plus your CCSwitch models act as hands. Let Codex stay in control.
In other words:
Codex does not need to do every low-level subtask itself. Codex plans, routes, supervises, and verifies. Claude Code executes through external worker models.
This is a miniature cost-management operating system for multi-agent coding.
Current version: v0.7.1
| Version | What changed | Why it matters |
|---|---|---|
| v0.7.1 | Fixes #24: manual `workflow-retry-node` now changes the workflow status from `succeeded` to `needs_rerun`, records invalidated nodes, and marks old node handoff/gate/token evidence as stale. | Codex and dashboards no longer accept a workflow that was manually invalidated. Pending nodes must run again before the workflow can be treated as done. |
| v0.7.0 | Adds the first workflow DAG controller layer for GitHub issues #20, #21, and #22: YAML/JSON workflow validation, dry-run topological batches, mock workflow execution, structured handoff templates and validation, node gates, retry decisions, loop guard, workflow status, reports, and MCP tools. | Codex can now test long-running multi-agent pipelines as small verifiable nodes instead of one vague conversation. The first version is intentionally mock-safe, controller-owned, and data-backed before spending model quota. |
| v0.6.4 | Fixed GitHub issues #16, #17, and #18: final-only output now budgets persisted final text instead of raw stream noise, `--cwd` runs use the cwd-scoped artifact root with a run index for polling, and actual token aggregates are computed from raw `modelUsage` before redaction. | Codex now has measurable evidence for low-noise worker supervision: short final-only tasks no longer die from thinking/system stream noise, project artifacts stay inside the target workspace, and usage dashboards do not report fake zero-token runs. |
| v0.6.3 | Fixed the GitHub Actions docs deploy secret-scan false positive by splitting placeholder test tokens in selftest code. | The public docs pipeline can publish v0.6.x without mistaking safe placeholder examples for real credentials. |
| v0.6.2 | Fixed #15: Claude stream `modelUsage` is captured as `actual_model_usage`; metadata, dashboard, usage summary, and controller reports now distinguish declared route from actual billed model and flag `route_mismatch`. Added `supervise-decision` as a compatibility alias for `decision-review`. | Codex can now catch the painful case where a worker says it used one model but Claude actually bills another. The controller sees the real model, real usage, and mismatch risk. |
| v0.6.1 | Completed the GitHub issue audit pass: controller reports now include by-model totals, per-run duration, token estimates, stdout/events bytes, warning/blocking counts, and dashboard token estimates; legacy metadata writes now use the same UTF-8/control-character sanitizer. | The closed issues now have stronger evidence, not just feature names. Codex can hand you a report that is actually enough to judge worker health without opening raw logs. |
| v0.6.0 | Fixed GitHub issues #3-#12: transactional role-team launches, hard output/event budgets, final-only mode, route-preserving follow-ups, Windows UTF-8 checks, risk severity split, secret finding classification, source-vs-artifact diff summaries, operations dashboard, controller pressure reports, and supervisor decision review. | Codex can now manage Claude Code workers like a real controller: start teams without leaving silent workers behind, stop runaway output, preserve the chosen model, audit risk with clearer signals, and export acceptance evidence. |
| v0.5.1 | Fixed GitHub issues #1 and #2: portable `tools/cc-orchestrator` copies now discover `version.json` and Prompt Pack assets, and `clean-workspace` no longer suggests deleting freshly initialized scaffold folders. | Workspace governance is now safer and more portable: lightweight tool copies work, and cleanup does not undo initialization. |
| v0.5.0 | Added workspace governance: `.agent-workspace` artifact routing, `init-workspace`, `workspace-status`, `migrate-data`, `clean-workspace`, `archive-runs`, `repair-mcp-paths`, and `folder-policy`, with matching MCP tools. | Codex can now keep Claude Code worker logs, reports, dashboards, temp files, rollback notes, templates, and policies inside one managed folder without touching project source. |
| v0.4.1 | Added rolling `checkpoint-###.md` summaries, deduplicated tool-call summaries, default artifact-writing controller poll, and exact queued/running/done/failed queue states. | Codex can now inspect only decision-grade summaries while workers keep raw audit logs on disk. |
| v0.4.0 | Added the Codex Controller Playbook, Prompt Pack, compact controller-mode polling, `cc_summarize_run`, `cc_compact_events`, one-click verification scoring, real queue policy, model registry, local override preservation, worker quality history, failure-mode detection, and timeline dashboard. | Codex can now manage Claude Code workers like a real controller: watch compact progress, stop bad runs, verify changes, learn which model is best, and preserve local preferences across upgrades. |
| v0.3.0 | Added `cc_verify_run`, hard write-scope checks, mock streaming E2E tests, queue scheduling, usage summaries, upgrade checks, MCP auto-registration, and benchmark suite. | Turns the project from “can run workers” into a safer control console with verification, migration, and low-cost testing. |
| v0.2.0 | Added live streaming control: `run-streaming`, `poll-run`, `stop-run`, `run-status`, team spawning, cross review, dashboard, reports, and cost guard. | Codex can watch and manage Claude Code workers in real time instead of waiting blindly. |
| v0.1.0 | Built the first Skill + MCP + CLI foundation with CCSwitch profile discovery, model scoring, role routing, `CLAUDE.md` generation, visible Claude Code windows, logs, and safe defaults. | Proved the core idea: Codex is the brain, Claude Code is the worker layer, CCSwitch is the local model router. |
v0.7.1 - Manual Retry Invalidates Workflow Success
- Fixed #24: `workflow-retry-node` no longer leaves the workflow-level status as `succeeded` after invalidating nodes.
- Manual retry now sets workflow status to `needs_rerun`, records `requires_rerun=true`, and stores the invalidated node list.
- Invalidated nodes no longer expose old `handoff`, `handoff_validation`, `gate`, run id, token, or cost fields as current acceptance evidence.
- Workflow reports now show a visible manual-invalidation warning and stale evidence markers.
- Expanded `selftest` with manual retry status, stale evidence, and report warning checks.
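The invalidation rules above can be pictured as a small state transition. The following is a minimal sketch, not the Skill's actual code: the `invalidate_nodes` helper, the dictionary layout, and the `evidence_stale`, `invalidated_nodes`, `run_id`, `tokens`, and `cost` key names are hypothetical. Only the `succeeded` and `needs_rerun` statuses and the `requires_rerun` flag come from the notes above.

```python
from copy import deepcopy

# Evidence keys that should stop counting as current acceptance evidence
# once a node is invalidated (key names are illustrative assumptions).
EVIDENCE_FIELDS = ("handoff", "handoff_validation", "gate", "run_id", "tokens", "cost")


def invalidate_nodes(workflow: dict, node_ids: list[str]) -> dict:
    """Return a copy of `workflow` with the given nodes marked for rerun."""
    updated = deepcopy(workflow)
    for node_id in node_ids:
        node = updated["nodes"][node_id]
        node["status"] = "pending"
        node["evidence_stale"] = True  # hypothetical stale-evidence marker
        for field in EVIDENCE_FIELDS:
            node.pop(field, None)  # old evidence leaves the current view
    # The workflow as a whole can no longer be treated as done.
    updated["status"] = "needs_rerun"
    updated["requires_rerun"] = True
    updated["invalidated_nodes"] = sorted(node_ids)
    return updated


workflow = {
    "status": "succeeded",
    "nodes": {
        "plan": {"status": "succeeded", "gate": "passed"},
        "refactor": {"status": "succeeded", "gate": "passed", "handoff": {"ok": True}},
    },
}
result = invalidate_nodes(workflow, ["refactor"])
print(result["status"], result["invalidated_nodes"])
```

Returning a copy keeps the original record available for audit, which fits the idea of marking evidence stale instead of silently rewriting history.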
v0.7.0 - Workflow DAG, Handoff Contracts, and Node Gates
- Implements #20: `workflow-validate`, `workflow-dry-run`, `workflow-run --mock`, `workflow-status`, `workflow-retry-node`, `workflow-stop`, and `workflow-report`. Real DAG worker execution is intentionally disabled in v0.7.0 until the controller loop has more production gates.
- Implements #21: `handoff-template`, `handoff-validate`, `handoff-read`, and `handoff-repair-prompt`.
- Implements #22: mock node controller decisions for `advance`, `retry`, `block`, `cancel`, gate checks, retry invalidation, and loop-guard blocking.
- Adds MCP tools for the workflow and handoff commands.
- Adds `examples/workflows/safe-refactor.yaml`.
- Expands selftest with DAG validation, handoff validation, retry, max-retry blocking, missing-handoff blocking, report decision trails, and controller-only no-source-change gates.
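To make "dry-run topological batches" concrete, here is a minimal sketch using Python's standard `graphlib` module. It is an illustration of the general technique, not the code behind `workflow-dry-run`; the `topological_batches` function and the example node names are assumptions.

```python
from graphlib import TopologicalSorter


def topological_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group DAG nodes into batches whose members could run in parallel.

    `dependencies` maps each node to the set of nodes it depends on.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # validates the graph and detects cycles
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical workflow: plan first, two parallel nodes, then a review.
example = {
    "plan": set(),
    "refactor": {"plan"},
    "tests": {"plan"},
    "review": {"refactor", "tests"},
}
print(topological_batches(example))  # [['plan'], ['refactor', 'tests'], ['review']]
```

A dry run of this kind costs no model quota: it only answers which nodes would run together and in what order.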
v0.6.4 - Data-Proven Worker Supervision Fixes
- Fixed #16: `--final-only` now filters raw stream events before applying the persisted stdout budget.
- Fixed #16: final-only stdout writes compact final result text instead of raw `system`/`assistant` stream JSON.
- Fixed #17: `run`, `run-streaming`, and `run-visible` now use the `--cwd` workspace's `.agent-workspace/claude-code-orchestrator` artifact root.
- Fixed #17: added a run index so `poll-run`, `run-status`, `stop-run`, `last-run`, and summaries can still find cwd-scoped run folders by run id.
- Fixed #18: actual token aggregates are computed from raw Claude `modelUsage` before log/event redaction.
- Replaced slow Windows `tasklist` PID checks with a Windows API process-status check.
- Expanded `mock-stream-test` with data gates for final-only noise filtering, cwd artifact routing, and token aggregate preservation.
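The #16 fix boils down to an ordering change: filter first, budget second. A minimal sketch of that idea follows. It assumes each stream line is a JSON object with a `type` field and that the final answer arrives in an event of type `result` carrying a `result` string; the `final_only_text` helper and its `max_chars` parameter are hypothetical names, not the Skill's API.

```python
import json


def final_only_text(stream_lines: list[str], max_chars: int = 2000) -> str:
    """Keep only the final result text, then apply the budget to that text."""
    final_text = ""
    for line in stream_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed stream noise
        if isinstance(event, dict) and event.get("type") == "result":
            final_text = str(event.get("result", ""))
    # The budget is applied after filtering, so noisy system/assistant
    # events never count against it.
    return final_text[:max_chars]


stream = [
    json.dumps({"type": "system", "subtype": "init", "padding": "x" * 5000}),
    json.dumps({"type": "assistant", "message": {"content": "thinking..."}}),
    json.dumps({"type": "result", "result": "Audit complete: 2 findings."}),
]
print(final_only_text(stream, max_chars=100))
```

With the budget applied to raw lines instead, the 5000-character `system` event in this example would exhaust a 100-character budget before the result was ever seen.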
v0.6.3 - Docs Deploy Stability
- Fixed a GitHub Actions secret-scan false positive in selftest placeholder-token coverage.
- Kept the placeholder-secret regression test, but split the sample token string so repository-level scans do not treat it as a real key.
- Published the fix as a documented hotfix so README, docs changelog, package metadata, and version metadata stay aligned.
v0.6.2 - Actual Model Attribution
- Fixed #15: streaming runs now persist Claude result `modelUsage` as `actual_model_usage`.
- Added `actual_model`, `actual_cost_usd`, `actual_total_tokens`, and `route_mismatch` to run metadata/status.
- `detect_failure_modes` now raises a high-severity `route_mismatch` flag when declared and actual models differ.
- `usage-summary` groups by actual model when available while preserving declared model fields.
- Dashboard and controller reports now show declared model, actual model, mismatch state, and actual cost.
- `healthcheck` now documents that actual model attribution is verified from Claude stream results.
- Added `supervise-decision` as a compatibility alias for `decision-review` so #14 retests pass either command name.
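A rough sketch of how declared-versus-actual attribution could work is shown below. It assumes `modelUsage` is a mapping from model name to a usage object with `inputTokens`, `outputTokens`, and `costUSD` keys; that shape, the `attribute_actual_model` helper, the rule of picking the model with the most tokens, and the model names are all assumptions for illustration. The output field names follow the list above.

```python
def attribute_actual_model(declared_model: str, model_usage: dict[str, dict]) -> dict:
    """Derive actual-model fields from a per-model usage mapping."""
    totals = {
        model: usage.get("inputTokens", 0) + usage.get("outputTokens", 0)
        for model, usage in model_usage.items()
    }
    # Treat the model with the most tokens as the one that did the work.
    actual_model = max(totals, key=totals.get) if totals else None
    return {
        "declared_model": declared_model,
        "actual_model": actual_model,
        "actual_total_tokens": sum(totals.values()),
        "actual_cost_usd": sum(u.get("costUSD", 0.0) for u in model_usage.values()),
        "route_mismatch": actual_model is not None and actual_model != declared_model,
    }


# Hypothetical model names: the worker was routed to model-a but billed as model-b.
usage = {"model-b": {"inputTokens": 1200, "outputTokens": 300, "costUSD": 0.01}}
report = attribute_actual_model("model-a", usage)
print(report["actual_model"], report["route_mismatch"])
```

Keeping both `declared_model` and `actual_model` in the record is what lets a dashboard show the mismatch instead of overwriting one with the other.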
v0.6.1 - Issue Audit Completion
- Expanded `controller-report` / `pressure-report` Markdown with by-model usage totals, duration, token estimates, output bytes, event bytes, budget stops, warning counts, blocking counts, and max severity.
- Added per-run report rows with duration, token estimates, stdout/events bytes, warning/blocking counts, budget state, and source/artifact counts.
- Added token estimates to the local operations dashboard output-budget panel.
- Added warning/blocking risk counts to `usage-summary` and its by-model breakdown.
- Routed remaining legacy metadata writes through the same UTF-8/control-character sanitizer used by streaming runs.
v0.6.0 - Controller Operations Hardening
- Fixed #4: `spawn-role-team` now preflights team capacity and rolls back partial launches by stopping already-started runs.
- Fixed #5: `run-streaming` / `cc_run_streaming_agent` now support `max_output_bytes`, `max_events_bytes`, `soft_output_bytes`, `output_budget_policy`, `kill_on_excessive_output`, `final_only`, and `final_max_chars`.
- Fixed #6: `send-instruction` now preserves the previous profile/model by default and records route drift when rerouted.
- Fixed #7: metadata, events, CLI JSON, and dashboard output now sanitize invalid control characters and preserve UTF-8 Chinese paths/prompts.
- Fixed #8: risk flags now expose `blocking_ok`, `has_warnings`, `max_severity`, `warning_count`, and `blocking_count`; old `ok` remains compatible and means no blocking risk.
- Fixed #9: secret scanning now classifies real candidates, placeholders/examples, identifiers, config key names, and unknown review items without printing raw secrets.
- Fixed #10: run diffs and diff summaries now split project source changes from `.agent-workspace` agent artifacts.
- Fixed #11: dashboard is now an operations panel with worker filters, heartbeat, stop reason, output budget, risk level, route drift, and source/artifact sections.
- Fixed #12: added `controller-report` / `pressure-report` and MCP `cc_controller_report` / `cc_pressure_report` for acceptance-ready Markdown reports.
- Added #3 MVP: `decision-review` and MCP `cc_decision_review` produce supervisor-style approve/revise/block reviews with evidence, objections, missing evidence, and required changes.
- Expanded `mock-stream-test` to verify output-budget stopping without spending model quota.
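The severity split from #8 can be sketched as a small summary function. This is an illustration only: the finding layout (`severity`, `blocking`), the severity scale, and the `summarize_risk` helper are assumptions. The returned field names match the ones listed above, including the legacy `ok` field that means "no blocking risk".

```python
SEVERITY_ORDER = ["none", "low", "medium", "high"]  # hypothetical scale


def summarize_risk(findings: list[dict]) -> dict:
    """Split findings into warnings and blockers and report the worst severity."""
    blocking = [f for f in findings if f.get("blocking")]
    warnings = [f for f in findings if not f.get("blocking")]
    max_severity = max(
        (f.get("severity", "low") for f in findings),
        key=SEVERITY_ORDER.index,
        default="none",
    )
    blocking_ok = not blocking
    return {
        "blocking_ok": blocking_ok,
        "ok": blocking_ok,  # legacy field: true when nothing blocks
        "has_warnings": bool(warnings),
        "warning_count": len(warnings),
        "blocking_count": len(blocking),
        "max_severity": max_severity,
    }


findings = [
    {"id": "large_diff", "severity": "medium", "blocking": False},
    {"id": "write_scope_violation", "severity": "high", "blocking": True},
]
summary = summarize_risk(findings)
print(summary)
```

Separating `has_warnings` from `blocking_ok` lets a controller accept a run that carries warnings while still refusing one with any blocker.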
v0.5.1 - Portable Assets and Safer Cleanup
- Fixed `tools/cc-orchestrator` lightweight copies so package assets can be discovered from `CC_ORCHESTRATOR_SKILL_ROOT`, the full Skill root, or colocated assets under `scripts/cc-orchestrator`.
- Added a portable colocated `version.json` and Prompt Pack under `scripts/cc-orchestrator`.
- Updated `healthcheck` to report `skill_root`, `version_path`, `prompt_pack_path`, and whether Prompt Pack assets exist.
- Fixed `clean-workspace` so freshly initialized scaffold directories are protected even when empty.
- Added selftest coverage for Prompt Pack availability and scaffold-preserving cleanup.
v0.5.0 - Workspace Governance
- Added `.agent-workspace/claude-code-orchestrator` as the default home for agent-generated artifacts.
- Added `init-workspace` to create runs, reports, dashboard, archives, rollback, logs, tmp, templates, and policies folders.
- Added `workspace-status` to show exactly where Codex and Claude Code will write artifacts.
- Added `migrate-data` to safely move legacy `runs`, `reports`, and `dashboard` data into the managed workspace.
- Added `clean-workspace`, dry-run by default, to clean tmp files, non-scaffold empty folders, and expired run folders.
- Added `archive-runs` to zip old run folders into `archives/`.
- Added `repair-mcp-paths` to update `.mcp.json` with `CC_ORCHESTRATOR_WORKSPACE_ROOT` and `CC_ORCHESTRATOR_ARTIFACT_ROOT`.
- Added `folder-policy` to write a machine-readable rule: manage only agent artifacts, never project source.
- Added matching MCP tools: `cc_init_workspace`, `cc_workspace_status`, `cc_migrate_data`, `cc_clean_workspace`, `cc_archive_runs`, `cc_repair_mcp_paths`, and `cc_folder_policy`.
- Updated worker prompts and generated `CLAUDE.md` so Claude Code workers keep logs, reports, temp files, and rollback notes under the managed artifact root.
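The folder layout described above is simple enough to sketch with `pathlib`. This is a minimal illustration run against a temporary directory, not the real `init-workspace` implementation, which also writes templates and policy files; the `init_workspace` function name is an assumption, while the folder names come from the list above.

```python
import tempfile
from pathlib import Path

# Scaffold folders named in the release notes above.
SCAFFOLD_DIRS = ("runs", "reports", "dashboard", "archives", "rollback",
                 "logs", "tmp", "templates", "policies")


def init_workspace(project_root: Path) -> Path:
    """Create the managed artifact folders without touching project source."""
    artifact_root = project_root / ".agent-workspace" / "claude-code-orchestrator"
    for name in SCAFFOLD_DIRS:
        (artifact_root / name).mkdir(parents=True, exist_ok=True)
    return artifact_root


with tempfile.TemporaryDirectory() as tmp:
    root = init_workspace(Path(tmp))
    created = sorted(p.name for p in root.iterdir())
    print(created)
```

Because every folder lives under one artifact root, cleanup and archiving can operate on that root alone and never need to reason about project source files.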
v0.4.1 - Controller Checkpoints, Tool Dedup, Queue State Polish
- Added rolling `checkpoints/checkpoint-###.md` files for long-running Claude Code workers.
- Each checkpoint records what is done, what was found, what changed, what remains, and whether the worker is drifting.
- Added deduplicated tool-call summaries, such as `Grep x7` and `Read x3`.
- Made controller-mode `poll-run` write controller artifacts by default.
- Added `last_meaningful_action`, `new_findings`, `tool_call_summary`, and `controller_attention_flags` to controller summaries.
- Changed queue success state to `done`, with explicit `queued`, `running`, `done`, `failed`, `timed_out`, and `cancelled` states.
- Polished the local HTML dashboard with top model routing, left worker list, center timeline/logs, and right diff/risk/control commands.
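The deduplicated summary format (`Grep x7`, `Read x3`) can be reproduced with a counter. This is a minimal sketch of the idea, assuming the tool names have already been extracted from the event stream; the `tool_call_summary` function here is illustrative and not the Skill's own helper.

```python
from collections import Counter


def tool_call_summary(tool_names: list[str]) -> list[str]:
    """Collapse repeated tool calls into 'Name xN' entries, most frequent first."""
    counts = Counter(tool_names)
    return [f"{name} x{count}" for name, count in counts.most_common()]


calls = ["Grep"] * 7 + ["Read"] * 3 + ["Edit"]
print(tool_call_summary(calls))  # ['Grep x7', 'Read x3', 'Edit x1']
```

Eleven raw events become three short strings, which is the kind of compression that keeps controller polling cheap.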
v0.4.0 - Codex Controller System
- Added `references/codex-controller-playbook.md`, the dedicated Codex scheduling manual.
- Documented when Codex should work directly and when it should delegate to Claude Code.
- Documented poll cadence, stop signals, cross-review rules, write-permission rules, and verification gates.
- Added Prompt Pack templates: `repo-audit`, `bugfix`, `security-audit`, `frontend-polish`, `test-generation`, `refactor-plan`, and `release-check`.
- Added `cc_poll_run --mode controller` for compact controller summaries instead of raw event dumps.
- Added `cc_summarize_run` and `cc_compact_events`.
- Added controller artifacts: `progress_summary.json`, `latest_decision.md`, `risk_flags.json`, `changed_files.json`, and `tool_timeline.md`.
- Added real queue policy support with max concurrency, priority, retry policy, timeout policy, and state summaries.
- Added `model_registry.json` and `model_benchmark_history.json` support.
- Added `local_policy.override.json` so local preferences survive GitHub updates.
- Added worker quality scoring history for solved status, scope safety, secret safety, failure flags, token usage, hallucination, and rework.
- Added automatic failure-mode detection for stalled workers, repeated search, excessive output, destructive command risk, test failure plus success claims, write-scope violations, and secret-like output.
- Added model registry aggregation from CCSwitch scans, benchmark history, and worker quality history.
- Added MCP tools for model registry, local policy, worker scoring, Prompt Pack rendering, queue policy, compact events, and run summaries.
- Added daily Codex automation guidance for checking GitHub updates without auto-applying them.
v0.3.0 - Verification, Packaging, and Safer Operations
- Added one-click `cc_verify_run`.
- Chained diff summary, write-scope check, secret scan, optional test commands, and Markdown report into the acceptance flow.
- Added hard write-scope enforcement after runs.
- Added a conservative rollback helper based on git snapshots.
- Added mock streaming end-to-end tests, so streaming can be tested without spending model quota.
- Added benchmark suite entrypoints for code, review, security, long-context, and multimodal planning tasks.
- Added daily usage summaries from saved run logs.
- Added upgrade and version state tracking.
- Added Windows MCP auto-registration installer.
- Added stronger install preservation rules for local config.
- Added `version.json` as a single version metadata source.
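Hard write-scope enforcement can be pictured as a path check over the files a run changed. The sketch below is an assumption-laden illustration, not the logic behind `check-write-scope`: the `check_write_scope` function, its parameters, and the prefix-matching rule are hypothetical, loosely mirroring the `--allow`, `--deny`, and `--max-diff-lines` options that appear in this README's command examples.

```python
from pathlib import PurePosixPath


def _under(path: PurePosixPath, roots: list[str]) -> bool:
    """True if `path` equals one of `roots` or sits somewhere beneath it."""
    return any(path == r or r in path.parents for r in map(PurePosixPath, roots))


def check_write_scope(changed_files, allow, deny, diff_lines=0, max_diff_lines=800):
    """Return a list of violations; an empty list means the run stayed in scope."""
    violations = []
    for name in changed_files:
        path = PurePosixPath(name)
        if _under(path, deny):
            violations.append(f"denied path changed: {name}")
        elif not _under(path, allow):
            violations.append(f"outside allowed paths: {name}")
    if diff_lines > max_diff_lines:
        violations.append(f"diff too large: {diff_lines} > {max_diff_lines}")
    return violations


problems = check_write_scope(
    ["src/app.py", ".env", "docs/readme.md"],
    allow=["src"],
    deny=[".env"],
    diff_lines=120,
)
print(problems)
```

Deny rules are checked before allow rules so that a sensitive file is reported as denied even if it happens to sit under an allowed folder.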
v0.2.0 - Live Worker Control
- Added `run-streaming` / `cc_run_streaming_agent`.
- Started Claude Code with `--output-format stream-json --include-partial-messages`.
- Wrote live `events.ndjson` files for each run.
- Added `poll-run`, `run-status`, and `stop-run`.
- Added role team spawning.
- Added team result collection.
- Added cross-review worker loops.
- Added run reports and export flow.
- Added local HTML dashboard foundation.
- Added cost guard settings for concurrency and timeout.
- Added visible Claude Code worker window support.
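Polling a live `events.ndjson` file usually means reading only what was appended since the last poll. The sketch below shows one way to do that with a byte offset; it is an illustration under assumptions, not the implementation of `poll-run`, and the `poll_events` helper and event contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def poll_events(events_path: Path, offset: int = 0):
    """Read complete NDJSON events appended since `offset`; return (events, new_offset)."""
    events = []
    with events_path.open("rb") as handle:
        handle.seek(offset)
        for raw in handle:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; retry on the next poll
            offset += len(raw)
            if raw.strip():
                events.append(json.loads(raw))
    return events, offset


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "events.ndjson"
    # First write ends mid-event, as it might while a worker is streaming.
    path.write_bytes(b'{"type": "system"}\n{"type": "assis')
    first, offset = poll_events(path)
    with path.open("ab") as handle:
        handle.write(b'tant"}\n')
    second, offset = poll_events(path, offset)
    print(first, second)
```

Stopping at an unterminated line means a half-written event is never parsed, and the saved offset ensures no event is reported twice.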
v0.1.0 - Skill, MCP, CLI, and CCSwitch Foundation
- Created the Codex Skill entrypoint.
- Added bundled MCP server.
- Added CLI orchestrator.
- Added CCSwitch profile discovery.
- Added Claude Code binary discovery.
- Added local model scoring by role.
- Added role-based model routing.
- Added default read-only planning mode.
- Added Claude Code subprocess execution.
- Added run metadata, prompt, stdout, stderr, and last-run logs.
- Added `CLAUDE.md` worker persona generation.
- Added UTF-8-safe Windows output handling.
- Added safe secret redaction defaults.
- Added English and Chinese README foundation.
claude-code-orchestrator-skill is a Codex Skill with a bundled MCP server and CLI.
It lets Codex:
- discover local Claude Code
- read CCSwitch profiles
- find all configured Claude-compatible models
- score models by role
- route agents to the best local model
- launch Claude Code as an external worker
- keep runs read-only by default
- save run metadata and logs under `.agent-workspace/claude-code-orchestrator`
- initialize, inspect, clean, migrate, archive, and govern agent artifact folders
- expose everything through MCP tools
- handle Windows UTF-8 output safely
- write a project `CLAUDE.md` so Claude Code workers receive stable role/persona instructions
You need:
- Codex
- Claude Code
- CCSwitch
- Multiple models configured inside CCSwitch
- Python 3.10+
The Skill is most powerful when CCSwitch has several models with different strengths:
- strong reasoning model
- strong code model
- fast cheap model
- review/security model
- fallback model
Paste this into Codex:
Install the Codex Skill and MCP server from https://github.com/chu459/claude-code-orchestrator-skill. Put the Skill at ~/.codex/skills/claude-code-orchestrator, wire the bundled MCP server into Codex config.toml, run selftest, healthcheck, score-models, init-workspace, workspace-status, and show me the selected multi-agent routing plan. Do not print secrets.
Windows PowerShell:

```powershell
$tmp = Join-Path $env:TEMP "claude-code-orchestrator-skill.zip"; `
iwr -UseBasicParsing "https://github.com/chu459/claude-code-orchestrator-skill/archive/refs/heads/main.zip" -OutFile $tmp; `
$dir = Join-Path $env:TEMP "claude-code-orchestrator-skill"; `
if (Test-Path $dir) { Remove-Item $dir -Recurse -Force }; `
Expand-Archive $tmp -DestinationPath $dir -Force; `
& (Get-ChildItem $dir -Recurse -Filter install.ps1 | Select-Object -First 1).FullName
```

macOS / Linux:

```bash
tmp="$(mktemp -d)" && \
curl -L "https://github.com/chu459/claude-code-orchestrator-skill/archive/refs/heads/main.zip" -o "$tmp/skill.zip" && \
unzip -q "$tmp/skill.zip" -d "$tmp" && \
bash "$tmp"/claude-code-orchestrator-skill-main/install/install.sh
```

Add this to Codex config.toml:
```toml
[mcp_servers.claude-code-orchestrator]
command = "python"
args = [
  "-c",
  "import os,sys,runpy; home=os.environ.get('CODEX_HOME') or os.path.join(os.environ.get('USERPROFILE') or os.path.expanduser('~'), '.codex'); root=os.environ.get('CC_ORCHESTRATOR_HOME') or os.path.join(home, 'skills', 'claude-code-orchestrator', 'scripts', 'cc-orchestrator'); sys.path.insert(0, root); runpy.run_path(os.path.join(root, 'server.py'), run_name='__main__')"
]

[mcp_servers.claude-code-orchestrator.env]
PYTHONIOENCODING = "utf-8"
PYTHONUTF8 = "1"
CC_ORCHESTRATOR_WORKSPACE_ROOT = "."
CC_ORCHESTRATOR_ARTIFACT_ROOT = ".agent-workspace/claude-code-orchestrator"
```

Or let the safe installer write Codex/Claude MCP config after backing up existing files:

```powershell
powershell -ExecutionPolicy Bypass -File .\install\install-mcp.ps1
```

```bash
export CC_ORCHESTRATOR_HOME="$HOME/.codex/skills/claude-code-orchestrator/scripts/cc-orchestrator"
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" selftest
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" healthcheck
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" score-models
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" init-workspace --cwd .
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" workspace-status --cwd .
```

Healthcheck:
```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" healthcheck
```

List CCSwitch profiles:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" list-profiles
```

Score local models:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" score-models
```

Write strategy reports:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" write-reports
```

Initialize and inspect the managed workspace:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" init-workspace --cwd /path/to/project
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" workspace-status --cwd /path/to/project
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" folder-policy --cwd /path/to/project --apply
```

Write a CLAUDE.md worker persona into a project:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" write-claude-md --cwd /path/to/project --role implementation
```

Run a read-only architecture worker:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" run "Map this repository architecture" --role architecture
```

Run a streaming background worker:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" run-streaming "Review this repository" --role review
```

Poll, list, or stop workers:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" poll-run --run-id <run_id>
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" run-status
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" stop-run --run-id <run_id> --force
```

Spawn and collect a role team:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" spawn-role-team "Audit this repository" --roles requirements,architecture,security,testing
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" collect-team-results --team-id <team_id>
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" cross-review --run-id <run_id> --run-id <run_id>
```

Safety and acceptance helpers:
```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" preflight-write-scope --cwd /path/to/project --allow src --deny .env --max-diff-lines 800
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" check-write-scope --cwd /path/to/project
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" diff-summary --cwd /path/to/project
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" secret-scan-run --run-id <run_id>
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" verify-run --run-id <run_id> --test-command "npm test"
```

Scheduling and reporting:
```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" benchmark-model --profile PROFILE --execute
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" benchmark-suite --profile PROFILE
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" calibrate-policy --preference coding=glm-5 --preference multimodal=qwen3.7-plus
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" cost-guard --max-concurrent 4 --max-timeout-seconds 1200 --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" usage-summary --write-report
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" init-workspace
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" workspace-status
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" migrate-data
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" clean-workspace
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" archive-runs --older-than-days 30
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" repair-mcp-paths --create
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" folder-policy --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" queue-submit "Review this repo" --role review --priority 100
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" queue-tick --max-concurrent 3
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" queue-policy --max-concurrent 3 --default-timeout-seconds 900 --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" model-registry --refresh --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" local-policy --preference development=GLM5.2 --preference multimodal=qwen3.7-plus --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" score-worker --run-id <run_id>
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" summarize-run --run-id <run_id>
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" render-prompt --template bugfix --task "Fix the bug"
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" upgrade-check --apply
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" mock-stream-test
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" dashboard
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" export-report --run-id <run_id>
```

Open a visible Claude Code worker window:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" run-visible "Inspect this repository" --role architecture
```

Inspect the latest run:
```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" last-run
```

| Tool | Purpose |
|---|---|
| `cc_healthcheck` | Check Claude Code, CCSwitch, config |
| `cc_list_profiles` | List CCSwitch profiles |
| `cc_pick_profile` | Pick a profile/model for a role |
| `cc_run_agent` | Run a Claude Code worker |
| `cc_run_streaming_agent` | Start a background Claude Code worker with stream-json events |
| `cc_poll_run` | Poll one run in compact controller mode by default; raw deltas are still available |
| `cc_summarize_run` | Write and return controller artifacts plus rolling checkpoints |
| `cc_compact_events` | Compact raw `events.ndjson` into a small timeline and deduplicated tool summary |
| `cc_stop_run` | Stop a specific running Claude Code worker |
| `cc_run_status` | List active Claude Code workers or inspect one run |
| `cc_send_instruction` | Stop and restart a run with recovered context and a new instruction |
| `cc_spawn_role_team` | Start several role workers at once |
| `cc_collect_team_results` | Summarize team output and mark agreements/conflicts |
| `cc_cross_review` | Launch second-round reviewer workers |
| `cc_preflight_write_scope` | Fix allowed paths, denied paths, and max diff before writes |
| `cc_check_write_scope` | Block acceptance when a run changed files outside the write scope |
| `cc_diff_summary` | Summarize changed files, risks, and test need |
| `cc_secret_scan_run` | Scan run logs/events/diff for leaked secrets |
| `cc_rollback_run` | Conservative rollback when git snapshots prove it is safe |
| `cc_verify_run` | Run diff summary, scope check, secret scan, tests, and report |
| `cc_benchmark_model` | Run or plan a small model benchmark |
| `cc_benchmark_suite` | Run or plan fixed code/review/security/context/multimodal benchmarks |
| `cc_model_registry` | Build the local model capability database |
| `cc_calibrate_policy` | Persist local model preference notes |
| `cc_local_policy` | Read or write user-owned routing overrides preserved across upgrades |
| `cc_score_worker` | Grade one worker run and update quality history |
| `cc_prompt_pack` | List or render reusable worker prompts |
| `cc_cost_guard` | Configure max concurrency and timeout guardrails |
| `cc_usage_summary` | Estimate daily tokens, duration, failures, and model usage |
| `cc_queue_submit` | Submit a priority worker job |
| `cc_queue_tick` | Start queued jobs up to the concurrency limit |
| `cc_queue_status` | Inspect queued, running, done, failed, timed_out, and cancelled jobs |
| `cc_queue_cancel` | Cancel a queued or running job |
| `cc_queue_policy` | Read or write queue concurrency, retry, and timeout policy |
| `cc_upgrade_check` | Preserve local model preferences across upgrades |
| `cc_mock_stream_test` | Test streaming/poll/stop/status with a fake Claude stream |
| `cc_init_workspace` | Initialize `.agent-workspace`, templates, policy files, rollback/log dirs, and optional `CLAUDE.md` |
| `cc_workspace_status` | Show exactly where Codex and Claude Code artifacts will be written |
| `cc_migrate_data` | Dry-run or move legacy runs, reports, and dashboard into the managed workspace |
| `cc_clean_workspace` | Clean tmp files, non-scaffold empty dirs, and expired run artifacts, dry-run by default |
| `cc_archive_runs` | Zip old run folders under `archives/` |
| `cc_repair_mcp_paths` | Repair `.mcp.json` so MCP writes into the managed workspace |
| `cc_folder_policy` | Return or write the rule that only agent artifacts are managed |
| `cc_dashboard` | Generate a local HTML worker dashboard |
| `cc_open_run_folder` | Open or return a run log folder |
| `cc_export_report` | Export a run or team Markdown report |
| `cc_controller_report` | Export controller acceptance and pressure-test evidence |
| `cc_pressure_report` | Alias for pressure-test reports |
| `cc_decision_review` | Supervisor-style approve/revise/block decision review |
| `cc_run_visible_agent` | Open a visible Claude Code worker |
| `cc_last_run` | Inspect last run |
| `cc_git_diff` | Inspect git diff |
| `cc_workflow_plan` | Build a multi-agent workflow plan |
| `cc_workflow_validate` | Validate a YAML/JSON workflow DAG |
| `cc_workflow_dry_run` | Preview topological workflow batches |
| `cc_workflow_run` | Run a workflow; `mock=true` avoids model quota |
| `cc_workflow_status` | Inspect node state, gate details, and decisions |
| `cc_workflow_retry_node` | Invalidate one node and downstream nodes |
| `cc_workflow_stop` | Cancel a workflow |
| `cc_workflow_report` | Write a workflow report with decision trail |
| `cc_handoff_template` | Return a role handoff schema and example |
| `cc_handoff_validate` | Validate a run handoff |
| `cc_handoff_read` | Read a run handoff |
| `cc_handoff_repair_prompt` | Build a repair prompt for missing handoff fields |
| `cc_write_claude_md` | Write a project `CLAUDE.md` for Claude Code worker behavior |
| `cc_score_models` | Score local models |
| `cc_write_strategy_reports` | Write score and routing reports |
Claude Code can read a project-level CLAUDE.md file.
This is extremely useful for orchestration, because Codex can set the worker's persona before launching it.
The generated CLAUDE.md tells Claude Code:
- Codex is the controller, planner, reviewer, and final decision maker
- Claude Code is an external worker process
- the assigned role, such as `architecture`, `implementation`, or `review`
- safety rules about secrets, destructive commands, and unrelated changes
- progress-reporting rules for long-running work

Create one:

```bash
python "$CC_ORCHESTRATOR_HOME/cc_orchestrator.py" write-claude-md --cwd /path/to/project --role review
```

If the project already has CLAUDE.md, the command is conservative:

- default: do not overwrite
- `--append`: append the orchestrator-managed section
- `--force`: replace after writing a timestamped backup
Through MCP, Codex can call:
cc_write_claude_md
Recommended flow:
1. Codex plans the work
2. Codex writes CLAUDE.md for the selected worker role
3. Codex launches Claude Code through this Skill
4. Claude Code follows the project persona and role rules
5. Codex reviews logs, diffs, and final output
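Steps 2 and 3 of this flow map onto two CLI calls already shown in this README. The sketch below only assembles the argument lists and does not run anything. The `cli` and `review_flow` helpers and the `DEFAULT_HOME` fallback are hypothetical conveniences; the subcommands and flags (`write-claude-md --cwd --role`, `run-streaming --role`) are the ones documented above.

```python
import os
import sys
from pathlib import Path

# Assumed install location, matching the CC_ORCHESTRATOR_HOME export in this README.
DEFAULT_HOME = (Path.home() / ".codex" / "skills" / "claude-code-orchestrator"
                / "scripts" / "cc-orchestrator")


def cli(*args: str) -> list[str]:
    """Build an argv list for cc_orchestrator.py without running it."""
    home = Path(os.environ.get("CC_ORCHESTRATOR_HOME", DEFAULT_HOME))
    return [sys.executable, str(home / "cc_orchestrator.py"), *args]


def review_flow(project: str, task: str, role: str = "review") -> list[list[str]]:
    """Write the worker persona, then launch a streaming worker for the same role."""
    return [
        cli("write-claude-md", "--cwd", project, "--role", role),
        cli("run-streaming", task, "--role", role),
    ]


for argv in review_flow("/path/to/project", "Review this repository"):
    print(argv[2:])
```

Each list can be handed to `subprocess.run` once the plan looks right; `poll-run` and `verify-run` then cover the review in step 5.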
You can ask Codex to create a daily automation that checks chu459/claude-code-orchestrator-skill for new commits.
Recommended behavior:
- report the latest GitHub commit
- report local `HEAD`
- report installed Skill version
- summarize changes
- never pull or overwrite automatically
- only apply updates when `auto_apply` is explicitly enabled
Suggested prompt:
Create a daily Codex automation that checks whether chu459/claude-code-orchestrator-skill has new commits. Report remote commit, local HEAD, installed Skill version, uncommitted changes, and a short summary. Do not pull or apply updates unless auto_apply is explicitly enabled.
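The report-first policy in that prompt boils down to a small decision rule. The sketch below is one way to express it; the `UpdateReport` fields, the action names, and the choice to never apply over uncommitted changes are assumptions, not behavior documented for the Skill.

```python
from dataclasses import dataclass


@dataclass
class UpdateReport:
    remote_commit: str
    local_head: str
    installed_version: str
    has_uncommitted_changes: bool
    update_available: bool
    action: str  # "none", "report-only", or "apply"


def plan_update(remote_commit, local_head, installed_version, dirty, auto_apply=False):
    """Decide what the daily check should do; it never pulls by itself."""
    # Simplification: any difference counts as an available update,
    # even though a local HEAD could also be ahead of the remote.
    update_available = remote_commit != local_head
    if not update_available:
        action = "none"
    elif not auto_apply:
        action = "report-only"
    elif dirty:
        # Assumption: uncommitted local changes always block an automatic apply.
        action = "report-only"
    else:
        action = "apply"
    return UpdateReport(
        remote_commit, local_head, installed_version, dirty, update_available, action
    )


report = plan_update("abc1234", "def5678", "v0.7.1", dirty=False)
print(report.action)
```

The commit hashes in the usage line are placeholders.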
| Role | Purpose |
|---|---|
| `requirements` | Requirements, scope, non-goals, acceptance criteria |
| `architecture` | Repository map, likely files, implementation strategy, risks |
| `security` | Secrets, permissions, command risk, supply-chain risk |
| `testing` | Validation commands, expected signals, residual risk |
| `implementation` | Scoped edits when write access is explicitly allowed |
| `review` | Findings ordered by severity, file references, open questions |
| `ops` | Deployment, logs, rollback, runtime risk |
This project is not just “spawn more agents”.
It is:
Brain: best model for judgment
Hands: cheaper/faster worker models for execution
Ledger: every run saved
Manager: Codex controls the flow
That is why it is a cost-management harness.
```mermaid
flowchart TD
    User["User"] --> Codex["Codex Controller"]
    Codex --> Skill["Claude Code Orchestrator Skill"]
    Skill --> MCP["Bundled MCP Server"]
    Skill --> CLI["cc_orchestrator.py CLI"]
    MCP --> Router["Role + Model Router"]
    CLI --> Router
    Router --> CCSwitch["CCSwitch Profiles"]
    CCSwitch --> Models["Qwen / GLM / Claude-compatible Models"]
    Router --> ClaudeMD["Project CLAUDE.md"]
    ClaudeMD --> ClaudeCode["Claude Code Worker Process"]
    Router --> ClaudeCode
    ClaudeCode --> Runs[".agent-workspace/claude-code-orchestrator/runs/<run_id> logs"]
    Runs --> Codex
```
The default posture is intentionally conservative:
- read-only planning by default
- `permission_mode = plan` unless write access is explicitly enabled
- `allow_write=true` required for scoped implementation work
- no global CCSwitch mutation
- secrets are redacted from tool output and persisted logs
- UTF-8-safe output on Windows
- timeout output is preserved when Python exposes partial stdout/stderr
- existing `CLAUDE.md` files are not overwritten unless `--append` or `--force` is used
- workspace governance manages only `.agent-workspace/claude-code-orchestrator` artifacts, not project source
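Two of these rules, the plan-by-default permission mode and secret redaction, are easy to picture in code. The sketch below is illustrative only: the regex, the `[REDACTED]` marker, the restriction of write access to the `implementation` role, and the use of `acceptEdits` as the write-enabled mode are assumptions, not the Skill's actual policy.

```python
import re

# Illustrative pattern only; a real redactor needs a broader, tested set.
SECRET_PATTERN = re.compile(
    r"([A-Z0-9_]*(?:api[_-]?key|token|secret|password)[A-Z0-9_]*)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(text):
    """Replace the value of key=value or key: value pairs whose key looks secret."""
    return SECRET_PATTERN.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)


def resolve_permission_mode(role, allow_write=False):
    """Stay in read-only planning unless write access is explicitly enabled."""
    if role == "implementation" and allow_write:
        return "acceptEdits"  # assumption: the mode used once writes are allowed
    return "plan"


print(redact("ANTHROPIC_API_KEY=sk-example-value done"))
print(resolve_permission_mode("review", allow_write=True))
```

Applying `redact` before anything is returned to the controller or written to disk keeps one code path responsible for both tool output and persisted logs.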
What works today:
- use `run-streaming` to start Claude Code with `--output-format stream-json --include-partial-messages`
- read live events from `events.ndjson`
- use `poll-run` to inspect compact controller progress, risk flags, changed files, and timeline
- use `run-status` to list active workers
- use `stop-run` to terminate a runaway worker
- use `run-visible` when the user wants a real terminal window
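Because `events.ndjson` holds one JSON object per line, it can be inspected with a few lines of Python. This is a minimal sketch: it assumes each event is a JSON object with a `type` field, and the function name `summarize_events` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_events(path):
    """Count events by their `type` field, tolerating blank or partial lines."""
    counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # A worker may still be writing the last line.
                counts["<unparsed>"] += 1
                continue
            if isinstance(event, dict):
                counts[event.get("type", "<unknown>")] += 1
            else:
                counts["<unknown>"] += 1
    return counts
```

Counting by type is a cheap way to see whether a worker is still producing output without loading the full stream into the controller's context.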
Windows:
```powershell
Get-Content ".agent-workspace\claude-code-orchestrator\runs\<run_id>\stdout.txt" -Wait
Get-Content ".agent-workspace\claude-code-orchestrator\runs\<run_id>\events.ndjson" -Wait
```
macOS / Linux:
```bash
tail -f ".agent-workspace/claude-code-orchestrator/runs/<run_id>/stdout.txt"
tail -f ".agent-workspace/claude-code-orchestrator/runs/<run_id>/events.ndjson"
```
The P0 live-control loop is:
```text
cc_run_streaming_agent -> events.ndjson
cc_poll_run            -> compact controller progress, risk flags, changed files, timeline
cc_summarize_run       -> write controller artifacts and checkpoint-###.md
cc_run_status          -> active worker list
cc_stop_run            -> kill a stuck or expensive worker
```
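The poll and stop steps of that loop can be sketched as a supervisor function. This is an assumption-heavy illustration: `poll` and `stop` are injected callables standing in for `cc_poll_run` and `cc_stop_run`, and the `done` and `risk_flags` keys are hypothetical, not the tools' real response schema.

```python
import time


def supervise(run_id, poll, stop, max_polls=20, interval_s=0.0):
    """Poll a run until it finishes; stop it on risk flags or an exhausted poll budget."""
    for _ in range(max_polls):
        status = poll(run_id)  # stand-in for cc_poll_run
        if status.get("risk_flags"):
            stop(run_id)  # stand-in for cc_stop_run
            return "stopped: risk"
        if status.get("done"):
            return "finished"
        time.sleep(interval_s)
    # The worker never finished within the budget, so treat it as stuck.
    stop(run_id)
    return "stopped: poll budget exhausted"
```

Bounding the number of polls gives the controller a hard ceiling on how long a stuck or expensive worker can keep running.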
Full design notes:
docs/realtime-progress.md
The goal is intentionally ambitious:
Become one of the world's top multi-agent collaboration harnesses: strong models as the brain, cheaper models as hands, Codex as controller, and MCP as the nervous system.
This is not about spectacle.
It is about bringing model cost, context cost, worker cost, and human attention cost into one auditable engineering loop.
- Codex Skill
- Bundled MCP Server
- CCSwitch profile discovery
- Local model scoring
- Role-based model routing
- Claude Code subprocess launching
- Visible Claude Code window
- UTF-8 safe Windows output
- Run logs and `last-run`
- `CLAUDE.md` worker persona writer
- Live event stream with `events.ndjson`
- Poll/stop/status tools for live control
- Role team spawning and result collection
- Cross-review worker loop
- Preflight write-scope file
- Diff summary and secret scan helpers
- Conservative rollback helper
- Automatic `verify-run` acceptance pipeline
- Mock streaming E2E test
- Hard write-scope enforcement after runs
- Queue scheduling with priority, concurrency, timeout, and retry metadata
- Daily usage summary from logs
- Version and upgrade-state mechanism
- MCP auto-registration installer for Windows
- Fixed benchmark suite entrypoint
- Model benchmark/calibration entrypoints
- Cost guard policy
- Local HTML dashboard
- Codex Controller Playbook
- Prompt Pack
- Compact controller-mode polling
- Rolling checkpoint summaries
- Tool-call deduplication
- Run timeline visualization
- Model registry and benchmark history
- Local policy override preserved across upgrades
- Worker quality scoring
- Failure-mode detection
- Queue policy with priority, retry, timeout, and max concurrency
- `.agent-workspace` artifact routing
- Workspace init/status/migration/cleanup/archive tools
- MCP path repair and folder policy
- Daily update monitor automation
- Web-style local dashboard
- Agent result voting
MIT.
Not affiliated with OpenAI, Anthropic, Claude, Claude Code, or CCSwitch.
