RFC: observable execution — live run visibility + context economy from one event stream #898

@kokevidaurre

The gap

Today a squad run is mostly a black box. The user sees the kickoff and the final artifact (a commit / PR / file), but almost nothing about what the agent actually did in between — which tools it called, what it read, what it reasoned about, where it spent its context budget. Two distinct problems fall out of this, and they share a root cause:

  1. No live visibility. As a user I want to watch the work happen — like watching a teammate — not just read the final commit. The intermediate steps (tool calls, file reads, web research, sub-agent spawns, decisions) are invisible during the run and largely unrecoverable after.

  2. No context economy. Long-running orchestrator/agent sessions re-process their entire accumulated conversation on every tool call. Inline work in a big session gets expensive; isolating a task into a fresh sub-agent (small context) is often cheaper — but nothing today measures or surfaces where context/tokens go, so there's no basis to budget, compact, or decide inline-vs-isolate.
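
To make the inline-vs-isolate trade-off concrete, here is a rough cost sketch. It assumes a deliberately simple model in which every tool call re-reads the whole accumulated context; it ignores prompt caching and output tokens. The function name `reprocessedTokens` and all numbers are hypothetical, chosen only for illustration.

```typescript
// Hypothetical cost model: each tool call re-processes the entire context
// accumulated so far, then appends its own result to that context.
function reprocessedTokens(
  startContext: number,   // tokens already in the session when the task starts
  toolCalls: number,      // number of tool calls the task needs
  tokensPerCall: number   // tokens each call adds to the context
): number {
  let total = 0;
  let context = startContext;
  for (let i = 0; i < toolCalls; i++) {
    total += context;         // this call re-reads everything so far
    context += tokensPerCall; // its result grows the context
  }
  return total;
}

// Same 10-call task, run inside a large session vs. a fresh sub-agent.
const inlineCost = reprocessedTokens(80000, 10, 2000);
const isolatedCost = reprocessedTokens(4000, 10, 2000);

console.log({ inlineCost, isolatedCost });
```

Under these assumed numbers the inline run re-processes 890,000 tokens and the isolated run 130,000. Real ratios depend on caching and on how much the sub-agent must hand back to its parent, which is exactly what measured token_usage events would reveal.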

Root insight: one substrate solves both

Both gaps are downstream of the same missing capability: runs don't emit a structured, inspectable execution-event stream. If every run emitted typed events — tool_call, file_read, web_fetch, subagent_spawn, decision, token_usage{layer,delta}, artifact — then:

  • Visibility = render that stream to the user live (and persist it for replay).
  • Context economy = aggregate the token_usage events to see per-layer / per-tool cost, enabling budgets, compaction triggers, and isolate-vs-inline heuristics.

Build the event model once; both workstreams consume it.
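
One possible shape for that event model is a discriminated union consumed by both workstreams. The event names below come from this RFC; every payload field other than `layer` and `delta` (for example `tool`, `path`, `ref`), and the helper names `render` and `tokensByLayer`, are assumptions made for illustration, not a proposed final schema.

```typescript
// Event names are from the RFC; payload fields other than `layer` and
// `delta` are assumed for this sketch.
type ExecutionEvent =
  | { type: "tool_call"; tool: string }
  | { type: "file_read"; path: string }
  | { type: "web_fetch"; url: string }
  | { type: "subagent_spawn"; agent: string }
  | { type: "decision"; summary: string }
  | { type: "token_usage"; layer: string; delta: number }
  | { type: "artifact"; ref: string };

// Visibility consumer: turn one event into one human-readable line.
function render(e: ExecutionEvent): string {
  switch (e.type) {
    case "tool_call": return `tool      ${e.tool}`;
    case "file_read": return `read      ${e.path}`;
    case "web_fetch": return `fetch     ${e.url}`;
    case "subagent_spawn": return `spawn     ${e.agent}`;
    case "decision": return `decision  ${e.summary}`;
    case "token_usage": return `tokens    ${e.layer} +${e.delta}`;
    case "artifact": return `artifact  ${e.ref}`;
  }
}

// Context-economy consumer: sum token deltas per layer.
function tokensByLayer(events: ExecutionEvent[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const e of events) {
    if (e.type === "token_usage") {
      totals[e.layer] = (totals[e.layer] ?? 0) + e.delta;
    }
  }
  return totals;
}

const sample: ExecutionEvent[] = [
  { type: "file_read", path: "README.md" },
  { type: "token_usage", layer: "orchestrator", delta: 900 },
  { type: "subagent_spawn", agent: "researcher" },
  { type: "token_usage", layer: "subagent", delta: 400 },
  { type: "token_usage", layer: "orchestrator", delta: 100 },
];

const lines = sample.map(render);
const totals = tokensByLayer(sample);
console.log(lines.join("\n"));
console.log(totals);
```

Because both functions take the same `ExecutionEvent` values, adding a new event type forces an update to `render` at compile time while leaving the aggregation untouched.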

Workstream A — Live execution visibility

Workstream B — Context economy

Why one issue

A and B are two consumers of one new substrate (the event stream). Designing them together avoids building visibility and accounting twice. Implementation should split into child issues per workstream (one branch / one PR each).

First steps

  1. Define the execution-event schema (typed events + persistence shape).
  2. Emit events from the run loop — minimal set first: tool_call, subagent_spawn, token_usage, artifact.
  3. A: render live to console + persist for replay. B: aggregate token_usage into a per-run context report.
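
The three steps above could be wired together roughly as follows. This is a minimal sketch, not the squad run loop: `RunEventStream`, the payload fields other than `layer` and `delta`, and the in-memory `log` array (standing in for an append-only JSON Lines file) are all assumptions.

```typescript
// Minimal event set from step 2; payload fields other than `layer` and
// `delta` are assumed for this sketch.
type RunEvent =
  | { type: "tool_call"; tool: string }
  | { type: "subagent_spawn"; agent: string }
  | { type: "token_usage"; layer: string; delta: number }
  | { type: "artifact"; ref: string };

type Listener = (e: RunEvent) => void;

// Hypothetical in-process stream: the run loop calls emit(), consumers subscribe.
class RunEventStream {
  private listeners: Listener[] = [];
  subscribe(fn: Listener): void {
    this.listeners.push(fn);
  }
  emit(e: RunEvent): void {
    for (const fn of this.listeners) fn(e);
  }
}

const stream = new RunEventStream();
const log: string[] = [];                   // stand-in for a JSON Lines file
const report: Record<string, number> = {};  // per-run context report

// A: render live to the console.
stream.subscribe((e) => console.log(`[${e.type}] ${JSON.stringify(e)}`));
// A: persist one JSON document per line for later replay.
stream.subscribe((e) => log.push(JSON.stringify(e)));
// B: aggregate token_usage per layer.
stream.subscribe((e) => {
  if (e.type === "token_usage") {
    report[e.layer] = (report[e.layer] ?? 0) + e.delta;
  }
});

// Simulated run loop.
stream.emit({ type: "token_usage", layer: "orchestrator", delta: 1200 });
stream.emit({ type: "tool_call", tool: "grep" });
stream.emit({ type: "subagent_spawn", agent: "researcher" });
stream.emit({ type: "token_usage", layer: "subagent", delta: 300 });
stream.emit({ type: "token_usage", layer: "orchestrator", delta: 300 });
stream.emit({ type: "artifact", ref: "pull-request" });

// Replay: parse the persisted lines back into events.
const replayed = log.map((line) => JSON.parse(line) as RunEvent);
console.log(report, replayed.length);
```

Keeping the consumers as independent subscribers means workstreams A and B can ship in separate PRs against the same stream, as proposed under "Why one issue".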

Related

Metadata

Labels: enhancement, priority:P1, rfc, squad:cli
