feat(agent-think): issue repro/fix agent on Workers + Containers by mattzcarey · Pull Request #1861 · cloudflare/agents

mattzcarey · 2026-07-02T16:33:56Z

Supersedes #1844 — the same reproduce / open-pr skills, but as a single persistent Think agent on Workers + Cloudflare Containers instead of CI Actions runners.

What this is

@agent-think <instruction> on any issue (e.g. "reproduce this issue", "open a PR fixing this") dispatches a run that works in a real Linux container (gh/git/npm/node/wrangler), streams a live thread UI, and reports back on the issue as the agent-think GitHub App.

@agent-think reproduce this issue        (issue comment)
   │  issue_comment webhook → gh-app (private worker: verifies sig, member-gates,
   │  mints short-lived installation token, reacts 👀, RPCs dispatch)
   ▼
agent-think (this package — holds NO GitHub App credentials)
   ├─ AgentThink WorkerEntrypoint — dispatch() returns in ~1s (durable submit only)
   ├─ ThinkAgent DO — owns the Workspace (SQLite VFS) + the durable turn;
   │    container gh/git auth runs inside the turn (beforeTurn), not in dispatch
   ├─ Sandbox DO — container host; WarmPool DO keeps one pre-warmed
   ├─ CommandCenterAgent DO — singleton registry (threads + counters);
   │    ThinkAgent reports lifecycle events fire-and-forget
   └─ UI: `/` command center (metrics, per-repo cards, ChatGPT-style thread
        sidebar, live over agents state sync) · /thread/:session thread view

Model: openai/gpt-5.5 through the default AI Gateway's model catalog (Unified Billing over the AI binding, no provider key; providers: [openai] plugin required for catalog slugs — verified for text, tool calls, and streaming). Turn idempotency is per triggering comment, so re-mentions on an issue start fresh turns while RPC retries stay deduped.

Patterns follow aron/cloudflare-workspaces-prototype (hackspace): the Agent DO owns the Workspace; compute is a separate warm-pooled Sandbox DO dialed per-connect.

Why not the #1844 shape

No Actions runner lifecycle: the turn runs via Think's durable submitMessages (idempotency key = repo#issue plus the triggering comment ID), survives DO eviction, and both verbs on an issue share one workspace/thread.
Webhook → running agent in ~5s; a pre-warmed container skips boot cost.
Every repro ships a minimal Vite + React frontend (skill enforces the exact 7-file recipe, matching the examples/* house style): maintainers click the deployed workers.dev URL, press Trigger bug, and watch expected-vs-actual in the page. Deploys use wrangler deploy --temporary with a claimable preview account.

Notes for reviewers

enable_abortsignal_rpc compat flag is required — the container backend's health probe passes an AbortSignal over cross-DO RPC (documented in wrangler.jsonc).
agent-think/AGENTS.md has the aims, architecture, self-imposed rules, and the edge cases that cost real debugging time.
The gh-app webhook worker lives in internal GitLab (team-apps) and holds all App credentials; this package is public-safe by construction.

Verified in prod

Green end-to-end run on #1859 (2026-07-02): trigger comment → 👀 + "🧠 on it" in 5s → 30-min durable turn (clone, npm install, repro project, real temporary-account deploy) → structured repro report posted by agent-think[bot] with live URL + claim link.

🤖 Generated with Claude Code

A Think agent that reproduces and fixes cloudflare/agents GitHub issues inside a container-backed @cloudflare/workspace VFS, triggered from an issue comment (@agent-think <instruction>) via a GitHub App webhook worker. Supersedes the CI-based /repro + /pr Actions from #1844: same skills, but running as a persistent Worker + pre-warmed container instead of Actions runners.

Architecture (patterns from aron/cloudflare-workspaces-prototype):

- AgentThink WorkerEntrypoint: dispatch() RPC from the webhook worker; returns in ~1s (submitMessages only; container gh/git auth happens inside the durable turn via beforeTurn, so the caller's waitUntil cancellation window can never kill the run)
- ThinkAgent DO owns the Workspace (SQLite VFS) + the durable turn; two exec backends: container (full Linux: gh/git/npm/node/wrangler) and just-bash isolate for cheap text ops
- Sandbox DO hosts the Cloudflare Container; WarmPool DO keeps one container pre-warmed and hands it out per session
- live thread UI (Vite + React) at /thread/:session
- skills (reproduce / open-pr) mounted read-only from R2; repros must ship a minimal Vite frontend so maintainers can click the deployed URL and watch the failing behavior in a UI

The worker holds no GitHub App credentials: the webhook worker mints a short-lived installation token per dispatch.

Note: requires the enable_abortsignal_rpc compat flag, because the container backend's health probe passes an AbortSignal over cross-DO RPC.

Verified end-to-end in prod on issue #1859: trigger comment to bot reply in 5s, 30-minute turn with real clone/install/deploy, structured repro report posted back on the issue.

changeset-bot · 2026-07-02T16:34:01Z

⚠️ No Changeset found

Latest commit: 233afbf

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types



devin-ai-integration

Devin Review found 4 potential issues.

- Replace HANDOFF.md (session-log style) with AGENTS.md: aims, how the system works, the rules we hold ourselves to, and the edge cases that cost real debugging time (abortsignal RPC flag, lossy tail, stale container reconcile, R2 skill seeding, Access, WARP builds).
- Vitest configs move next to their suites (test/, tests-e2e/); root keeps only vite.config.ts (thread UI build).
- Env files standardise on .env + .env_example (the e2e harness reads .env); drop .dev.vars.example and the stray root WARP pem copy.
- Drop the vite alias workaround for agents/chat/react: the subpath export exists upstream now, so plain resolution works.

@cloudflare/worker-bundler, ws, @types/ws: imported nowhere. @cloudflare/workers-types: redundant — tsconfig consumes the wrangler-generated worker-configuration.d.ts runtime types only. (isomorphic-git and @platformatic/vfs stay: optional peers of @cloudflare/workspace whose main entry — which we bundle — imports both; git.diff runs on isomorphic-git.)

- Commit worker-configuration.d.ts (wrangler types) and stop ignoring it: CI has no way to generate it, so the tsconfig types reference failed with TS2688 on a fresh checkout.
- tsconfig extends agents/tsconfig (verbatimModuleSyntax et al.), with the types list overridden to the generated runtime file; keeping @cloudflare/workers-types alongside it would conflict. client.tsx now typechecks too (it was outside the old include).
- compatibility_date 2026-05-26 -> 2026-06-11 (repo standard) in both configs; types regenerated against it.
- The write tool now takes the same per-file lock as edit: its stat-then-write mode preservation had the same interleaving window that edit's read-modify-write guards against. Lock extracted to src/tools/fs/file-lock.ts.
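For readers unfamiliar with the per-file lock mentioned in the last item, here is a minimal sketch of one common way to serialize async operations per path. It is not the contents of src/tools/fs/file-lock.ts; the name withFileLock and the promise-chain approach are assumptions made for illustration.

```typescript
// Hypothetical per-path async lock: operations on the same path run one at a
// time, while operations on different paths are free to run concurrently.
const tails = new Map<string, Promise<void>>();

async function withFileLock<T>(path: string, fn: () => Promise<T>): Promise<T> {
  // The stored tail never rejects, so chaining with a single callback is safe.
  const previous = tails.get(path) ?? Promise.resolve();
  const run = previous.then(fn);
  // Swallow the outcome in the tail so one failure does not block later waiters.
  const tail = run.then(
    () => undefined,
    () => undefined
  );
  tails.set(path, tail);
  try {
    return await run;
  } finally {
    // Drop the entry only if nothing queued behind this operation.
    if (tails.get(path) === tail) tails.delete(path);
  }
}
```

Under this sketch, both a stat-then-write and a read-modify-write would wrap their whole sequence in withFileLock(path, ...), so neither can observe the other's half-finished state.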

pkg-pr-new · 2026-07-02T18:44:49Z

agents

npm i https://pkg.pr.new/agents@1861

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1861

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1861

create-think

npm i https://pkg.pr.new/create-think@1861

hono-agents

npm i https://pkg.pr.new/hono-agents@1861

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1861

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1861

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1861

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1861

commit: 233afbf

Model: openai/gpt-5.5 through the default AI Gateway's model catalog (Unified Billing over the AI binding — no provider key). The providers: [openai] plugin is required: workers-ai-provider refuses {provider}/{model} slugs without it (verified empirically: text, tool calls, and streaming all work with the plugin; raw env.AI.run works either way but Think needs an AI SDK LanguageModel). Command center: the root URL is now a dashboard run by a singleton CommandCenterAgent (synced-state registry of every thread + counters). ThinkAgent reports dispatch/tool/turn events fire-and-forget — observing must never break a run. The UI gains a ChatGPT-style left sidebar listing threads reverse-chronologically, live over agents state sync; /thread/:session renders inside the same shell. The old plain-text root banner is gone (root serves the SPA, with a worker fallback where asset-first routing is not emulated).

Main screen leads with per-repo cards (name, github link, issue/status counts) per the wireframe; the sidebar gains a search filter and a ChatGPT-style Recents treatment. Sidebar persists across thread navigation (unchanged).

The repo#issue idempotency key silently swallowed re-mentions: once an issue's first turn completed, submitMessages returned the old submission (accepted:false) and nothing ran. The key now includes the triggering commentId (passed by gh-app); dev dispatches without one get a random key. Webhook redeliveries are already deduped in gh-app's KV before dispatch, so nothing is lost.
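The key change described above can be sketched as follows. This is an illustration under assumptions, not the package's code: the DispatchInput shape, the function name turnIdempotencyKey, and the key layout are all hypothetical, and a real Worker would more likely use crypto.randomUUID() for the dev suffix.

```typescript
// Hypothetical dispatch input; field names are assumptions for illustration.
interface DispatchInput {
  repo: string; // e.g. "cloudflare/agents"
  issue: number;
  commentId?: number; // triggering comment id, passed by the webhook worker
}

// Build the key used to dedupe a turn submission.
function turnIdempotencyKey(
  input: DispatchInput,
  randomSuffix: () => string = () => Math.random().toString(36).slice(2)
): string {
  const base = `${input.repo}#${input.issue}`;
  // Same comment id: RPC retries collapse onto one turn.
  // Different comment id: a re-mention on the same issue starts a fresh turn.
  if (input.commentId !== undefined) return `${base}:${input.commentId}`;
  // Dev dispatches carry no comment id, so each call gets its own key.
  return `${base}:dev-${randomSuffix()}`;
}
```

With a key shaped like this, two dispatches for the same comment map to the same submission, while a second mention on the same issue no longer hits the completed first turn.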

Cloudflare Access on the domain passes authenticated HTTP but eats WebSocket upgrades (zero WS ever reached the worker — the thread view only worked via useAgentChat's HTTP get-messages polling). Plain useAgent state sync has no such fallback, so the command center rendered empty. GET /api/command-center returns the registry snapshot; the client hydrates from it and polls while the WS is not connected.
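The hydrate-then-poll behaviour described above might look roughly like the sketch below. Only the GET /api/command-center endpoint comes from the commit message; the RegistrySnapshot shape, the createSnapshotFallback name, and the 5-second default interval are assumptions.

```typescript
// Hypothetical snapshot shape; the real registry carries threads and counters.
type RegistrySnapshot = { threads: string[] };

function createSnapshotFallback(
  fetchSnapshot: () => Promise<RegistrySnapshot>,
  apply: (snapshot: RegistrySnapshot) => void,
  intervalMs = 5000
) {
  let timer: ReturnType<typeof setInterval> | undefined;
  let connected = false;

  const refresh = async (): Promise<void> => {
    try {
      apply(await fetchSnapshot());
    } catch {
      // Keep the last known state; the next tick retries.
    }
  };

  const startPolling = (): void => {
    if (!connected && timer === undefined) {
      timer = setInterval(() => void refresh(), intervalMs);
    }
  };

  const stopPolling = (): void => {
    if (timer !== undefined) {
      clearInterval(timer);
      timer = undefined;
    }
  };

  return {
    // Initial hydrate over HTTP, then poll until the socket reports open.
    async hydrate(): Promise<void> {
      await refresh();
      startPolling();
    },
    // Wire this to the WebSocket open (true) and close (false) events.
    setConnected(isConnected: boolean): void {
      connected = isConnected;
      if (isConnected) stopPolling();
      else startPolling();
    },
  };
}
```

In the browser, fetchSnapshot would presumably wrap a fetch of /api/command-center and parse the JSON body; injecting it keeps the sketch independent of any particular client library.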

ThreadMeta carries the GitHub issue title and who mentioned @agent-think (login + avatar). Activity rows and the sidebar show the title; the requester's avatar sits on each row with a hover tooltip ('login: instruction'). Both flow from the webhook payload through dispatch; old threads without the fields fall back to the instruction.
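A small sketch of the fallback rule for old threads, assuming field names that the commit message does not spell out (issueTitle, requester.login, and requester.avatarUrl are hypothetical):

```typescript
// Hypothetical ThreadMeta shape; only "title", "login", "avatar" and
// "instruction" are named in the commit message, the field names are assumed.
interface ThreadMeta {
  instruction: string;
  issueTitle?: string;
  requester?: { login: string; avatarUrl: string };
}

// Title shown in activity rows and the sidebar: the issue title when present,
// otherwise the instruction (old threads have no title).
function displayTitle(meta: ThreadMeta): string {
  const title = meta.issueTitle?.trim();
  return title ? title : meta.instruction;
}

// Hover tooltip on the requester's avatar, in the 'login: instruction' form.
function avatarTooltip(meta: ThreadMeta): string {
  return meta.requester
    ? `${meta.requester.login}: ${meta.instruction}`
    : meta.instruction;
}
```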

…es died at the assets router

The assets layer forwards ordinary no-asset-match requests to the worker but not WebSocket upgrades, so every wss:// connect to /agents/* failed while plain HTTP worked, which is why the command center sat on the HTTP fallback and showed 'disconnected'. (Corrects the earlier Access diagnosis; Access passes authenticated WS fine.)

docs(agent-think): HANDOFF — branch/PR landed, note skill-recipe veri…

c6fecff

…fication

devin-ai-integration Bot reviewed Jul 2, 2026

Comment thread agent-think/wrangler.jsonc Outdated

Comment thread agent-think/test/wrangler.jsonc Outdated

Comment thread agent-think/src/tools/fs/tools/write.ts Outdated

Comment thread agent-think/tsconfig.json

mattzcarey added 4 commits July 2, 2026 17:40

docs(agent-think): gh-app no longer posts an 'on it' comment (👀 only)

6e46ebc

mattzcarey added 9 commits July 2, 2026 20:24

docs(agent-think): advertise auto-created parent dirs in the write tool

b25d2ff

The store already mkdir -p's the parent on every write; telling the model saves it a container exec mkdir round-trip first.

chore(agent-think): observable command-center reporting

1340528

Log (never throw) when a lifecycle report fails, and emit one structured line per registry update — silent-success and silent-failure were indistinguishable in the logs.
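One way to get "log, never throw" semantics for a fire-and-forget report is sketched below. The LifecycleEvent shape, the reportLifecycle name, and the log line format are assumptions, not the package's actual reporting code.

```typescript
// Hypothetical event shape for a lifecycle report.
type LifecycleEvent = {
  thread: string;
  kind: "dispatch" | "tool" | "turn";
  at: number;
};

// Fire-and-forget: start the report, log the outcome as one structured line,
// and never let a failure propagate into the caller's turn.
function reportLifecycle(
  send: (event: LifecycleEvent) => Promise<void>,
  event: LifecycleEvent,
  log: (line: string) => void = console.log
): void {
  // Starting from Promise.resolve() also captures a synchronous throw in send().
  void Promise.resolve()
    .then(() => send(event))
    .then(
      () => log(JSON.stringify({ msg: "registry update", ok: true, ...event })),
      (err: unknown) =>
        log(
          JSON.stringify({
            msg: "registry update",
            ok: false,
            error: String(err),
            ...event,
          })
        )
    );
}
```

Because both outcomes emit a line, a successful report and a failed one are distinguishable in the logs, while the function itself returns void synchronously.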

fix(agent-think): hermetic assets fixture for the unit suite

d1c0000

CI has no vite output (dist/client is not committed), so the root-route test read an empty body. The test config now points ASSETS at a committed fixture with the SPA root node.

mattzcarey mentioned this pull request Jul 3, 2026

feat(agent-think): sync feature docs and resume on CI #1867

Open

mattzcarey merged commit d1ce3cb into main Jul 3, 2026
4 checks passed

mattzcarey deleted the feat/agent-think branch July 3, 2026 12:59

mattzcarey mentioned this pull request Jul 3, 2026

feat(agent-think): repro branches + per-PR live demo deployments in the skills #1868

Merged

mattzcarey merged 15 commits into main from feat/agent-think
