RFC: observable execution — live run visibility + context economy from one event stream

## The gap

Today a squad run is mostly a **black box**. The user sees the kickoff and the final artifact (a commit / PR / file), but almost nothing about *what the agent actually did in between* — which tools it called, what it read, what it reasoned about, where it spent its context budget. Two distinct problems fall out of this, and they share a root cause:

1. **No live visibility.** As a user I want to *watch the work happen* — like watching a teammate — not just read the final commit. The intermediate steps (tool calls, file reads, web research, sub-agent spawns, decisions) are invisible during the run and largely unrecoverable after.

2. **No context economy.** Long-running orchestrator/agent sessions re-process their entire accumulated conversation on every tool call, so each inline step in a large session costs roughly the whole session's context. Isolating a task into a fresh sub-agent with a small context is often cheaper, but nothing today measures or surfaces *where* context and tokens go, so there is no basis to budget, compact, or decide inline-vs-isolate.
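
A back-of-envelope sketch of the inline-vs-isolate trade-off in point 2. It assumes the simplest possible cost model: every tool call re-reads the full context, the context grows by a fixed amount per call, and prompt caching and output tokens are ignored. The function name and all numbers are hypothetical, not measurements.

```python
def input_tokens_processed(start_context: int, calls: int, growth_per_call: int) -> int:
    """Total input tokens read across `calls` tool calls.

    Assumption: the whole context is re-read on every call and grows by
    `growth_per_call` tokens after each call (no caching, no compaction).
    """
    return sum(start_context + i * growth_per_call for i in range(calls))


# Hypothetical task: 10 tool calls, each adding about 2,000 tokens of results.
PARENT_CONTEXT = 80_000   # tokens already accumulated in the orchestrator session
CHILD_CONTEXT = 5_000     # tokens in a fresh sub-agent's starting prompt
HANDOFF_SUMMARY = 1_500   # tokens the sub-agent returns to the parent

# Inline: every call drags the large parent context along.
inline = input_tokens_processed(PARENT_CONTEXT, calls=10, growth_per_call=2_000)

# Isolated: the calls run against the small child context, then the parent
# spends one turn reading the summary on top of its existing context.
child = input_tokens_processed(CHILD_CONTEXT, calls=10, growth_per_call=2_000)
isolated = child + (PARENT_CONTEXT + HANDOFF_SUMMARY)

print(f"inline={inline} isolated={isolated}")
```

Under these made-up numbers the inline path reads 890,000 input tokens and the isolated path 221,500. Real ratios depend on caching and on how much the sub-agent must be told up front, which is exactly what per-layer measurement would reveal.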

## Root insight: one substrate solves both

Both gaps are downstream of the same missing capability: **runs don't emit a structured, inspectable execution-event stream.** If every run emitted typed events — `tool_call`, `file_read`, `web_fetch`, `subagent_spawn`, `decision`, `token_usage{layer,delta}`, `artifact` — then:

- **Visibility** = render that stream to the user live (and persist it for replay).
- **Context economy** = aggregate the `token_usage` events to see per-layer / per-tool cost, enabling budgets, compaction triggers, and isolate-vs-inline heuristics.

Build the event model once; both workstreams consume it.
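
One possible shape for the event model, as a minimal sketch rather than a proposed final schema. The event type names come from the list above; every field name (`run_id`, `seq`, `agent_id`, `parent_agent_id`, `ts`, `payload`) is an assumption made for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional

# Event types named in this RFC.
EVENT_TYPES = {
    "tool_call", "file_read", "web_fetch", "subagent_spawn",
    "decision", "token_usage", "artifact",
}


@dataclass(frozen=True)
class ExecutionEvent:
    """One entry in a run's event stream (field names are hypothetical)."""
    run_id: str
    seq: int                      # monotonically increasing per run
    type: str                     # one of EVENT_TYPES
    payload: Dict[str, Any] = field(default_factory=dict)
    agent_id: str = "orchestrator"
    parent_agent_id: Optional[str] = None
    ts: float = field(default_factory=time.time)

    def __post_init__(self) -> None:
        if self.type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.type!r}")

    def to_json(self) -> str:
        """Serialize to one JSON object, suitable for a line-per-event log."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, line: str) -> "ExecutionEvent":
        return cls(**json.loads(line))


# A token_usage event carrying the {layer, delta} pair from the RFC.
event = ExecutionEvent(
    run_id="run-1",
    seq=0,
    type="token_usage",
    payload={"layer": "orchestrator", "delta": 1200},
    ts=0.0,
)
restored = ExecutionEvent.from_json(event.to_json())
```

Keeping type-specific data in `payload` lets the visibility renderer and the token aggregator share one envelope while each reads only the event types it cares about.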

## Workstream A — Live execution visibility

- Stream structured events from an in-flight run to the console (and persist to the run record / local API for replay).
- Human-legible activity feed: *"reading X… searched web for Y… spawned profiler… wrote Z."*
- Distinct from a post-hoc outcome record (#817) or an aggregate org-cycle TUI (#662) — this is the **single-run, mid-flight** view.
- Surfaces sub-agent fan-out as a tree (what each lane is doing right now).
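
The activity feed and the fan-out tree above could be derived from the same events roughly as follows. This is an illustrative sketch: events are plain dicts, and the keys `agent`, `type`, and `payload` plus the payload fields (`path`, `query`, `child`, `tool`) are assumptions, not an agreed schema.

```python
from typing import Dict, List


def describe(event: dict) -> str:
    """Turn one event into a short human-legible phrase."""
    payload = event.get("payload", {})
    kind = event["type"]
    if kind == "file_read":
        return f"reading {payload['path']}"
    if kind == "web_fetch":
        return f"searched web for {payload['query']}"
    if kind == "subagent_spawn":
        return f"spawned {payload['child']}"
    if kind == "artifact":
        return f"wrote {payload['path']}"
    if kind == "tool_call":
        return f"called {payload['tool']}"
    return kind  # fall back to the raw type name


def render_tree(events: List[dict], root: str = "orchestrator") -> List[str]:
    """Indented lines showing each agent's most recent activity."""
    latest: Dict[str, str] = {}
    children: Dict[str, List[str]] = {}
    for event in events:
        latest[event["agent"]] = describe(event)
        if event["type"] == "subagent_spawn":
            children.setdefault(event["agent"], []).append(event["payload"]["child"])

    lines: List[str] = []

    def walk(agent: str, depth: int) -> None:
        lines.append(f"{'  ' * depth}{agent}: {latest.get(agent, 'starting')}")
        for child in children.get(agent, []):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


events = [
    {"agent": "orchestrator", "type": "file_read", "payload": {"path": "README.md"}},
    {"agent": "orchestrator", "type": "subagent_spawn", "payload": {"child": "profiler"}},
    {"agent": "profiler", "type": "tool_call", "payload": {"tool": "perf"}},
    {"agent": "orchestrator", "type": "subagent_spawn", "payload": {"child": "writer"}},
    {"agent": "writer", "type": "artifact", "payload": {"path": "report.md"}},
]
print("\n".join(render_tree(events)))
```

Re-running `render_tree` on each new event gives the "what each lane is doing right now" view; printing `describe(event)` as events arrive gives the linear feed.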

## Workstream B — Context economy

- Per-run, per-layer token accounting from the `token_usage` events (orchestrator context vs each sub-agent vs tool results).
- Surface "context spent here" so the user/agent can see the bloat (relates to #889 — zombie `running` entries inflating context).
- Decision support: when to isolate work into a sub-agent vs run inline; budget caps + auto-compact triggers (#702); cache-friendly context layout (#703).
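
A minimal sketch of the per-layer accounting described above, assuming each `token_usage` event carries `layer` and `delta` as in `token_usage{layer,delta}`. The layer labels, budget values, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def context_report(events: List[dict]) -> Dict[str, int]:
    """Sum token deltas per layer, ignoring non-token_usage events."""
    totals: Dict[str, int] = defaultdict(int)
    for event in events:
        if event["type"] == "token_usage":
            totals[event["layer"]] += event["delta"]
    return dict(totals)


def over_budget(report: Dict[str, int], budgets: Dict[str, int]) -> List[str]:
    """Layers whose spend exceeds their cap; layers without a cap are skipped."""
    return [
        layer for layer, used in report.items()
        if layer in budgets and used > budgets[layer]
    ]


events = [
    {"type": "token_usage", "layer": "orchestrator", "delta": 1200},
    {"type": "token_usage", "layer": "tool_results", "delta": 5000},
    {"type": "tool_call", "tool": "grep"},  # not a token event, skipped
    {"type": "token_usage", "layer": "subagent:profiler", "delta": 800},
    {"type": "token_usage", "layer": "orchestrator", "delta": 300},
]

report = context_report(events)
# Largest spenders first, for a "context spent here" view.
ranked = sorted(report.items(), key=lambda item: item[1], reverse=True)
# A compaction trigger could fire for any layer returned here.
exceeded = over_budget(report, {"orchestrator": 10_000, "tool_results": 4_000})
```

Because the report is a pure function of the event list, it can be recomputed mid-run for live budget checks or after the fact from a persisted stream.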

## Why one issue

A and B are two consumers of one new substrate (the event stream). Designing them together avoids building visibility and accounting twice. Implementation should split into child issues per workstream (one branch / one PR each).

## First steps

1. Define the execution-event schema (typed events + persistence shape).
2. Emit events from the run loop — minimal set first: `tool_call`, `subagent_spawn`, `token_usage`, `artifact`.
3. **A:** render live to console + persist for replay. **B:** aggregate `token_usage` into a per-run context report.
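
Steps 2 and 3 could start from something as small as the sketch below: an emitter that appends each event as one JSON line (for replay) and fans it out to live subscribers (for console rendering or aggregation). The class, method, and field names are hypothetical, and an in-memory buffer stands in for the run record.

```python
import io
import json
from typing import Callable, Iterator, List, TextIO


class EventEmitter:
    """Writes events as JSON lines to `sink` and notifies live subscribers."""

    def __init__(self, run_id: str, sink: TextIO) -> None:
        self.run_id = run_id
        self._sink = sink
        self._seq = 0
        self._subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        self._subscribers.append(callback)

    def emit(self, event_type: str, **payload) -> dict:
        event = {
            "run_id": self.run_id,
            "seq": self._seq,
            "type": event_type,
            "payload": payload,
        }
        self._seq += 1
        self._sink.write(json.dumps(event) + "\n")  # persist first
        for callback in self._subscribers:          # then render live
            callback(event)
        return event


def replay(source: TextIO) -> Iterator[dict]:
    """Yield persisted events in order, skipping blank lines."""
    for line in source:
        if line.strip():
            yield json.loads(line)


# Stand-in for the run record; a real run might use a file or the local API.
sink = io.StringIO()
emitter = EventEmitter("run-1", sink)

live: List[dict] = []
emitter.subscribe(live.append)

# The minimal event set from step 2.
emitter.emit("tool_call", tool="grep")
emitter.emit("subagent_spawn", child="profiler")
emitter.emit("token_usage", layer="orchestrator", delta=1200)
emitter.emit("artifact", path="report.md")

sink.seek(0)
replayed = list(replay(sink))
```

Here the replayed stream matches what live subscribers saw, which is the property that lets workstreams A and B share one source of truth.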

## Related

- Visibility: #817, #693, #662, #824
- Context: #703, #702, #889, #893
- Epic: #707 (Claude Code integration optimization)

