fix(opencode): surface provider APIErrors instead of silent _Done_ #205

Merged
LIU9293 merged 4 commits into main from fix/opencode-api-error-detection
Apr 24, 2026

Conversation

@LIU9293 LIU9293 commented Apr 24, 2026

Summary

  • Fixes the root cause of the C0AUUDD0VDX:1776966674.077239 thread, where the agent kept replying _Done_ forever. The OpenCode SDK returns HTTP 200 on provider failures and stashes the error on data.info.error (e.g. an Anthropic APIError for "image dimensions exceed max allowed size for many-image requests: 2000 pixels"). The adapter in packages/agents/opencode/client.ts only inspected the top-level result.error, so the kernel fell back to the _Done_ placeholder at packages/core/kernel/request-run.ts:859.
  • Once the real error is surfaced, the oversized-image case still poisons every future turn, because the full history is replayed on each prompt. For that specific error we now make a best-effort session.revert call on the poisoned user/assistant turn so the thread can recover without manual intervention.
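
To make the failure mode concrete, here is a minimal sketch of why a check on the top-level error misses a provider failure. The type names, the nested `data.message` / `data.statusCode` fields, and the 400 status are assumptions for illustration only; the PR itself only states that the error lives on data.info.error and that the HTTP status is 200.

```typescript
// Hypothetical shapes: only `error` and `data.info.error` are described in this PR.
interface InfoError {
  name: string;
  data?: { message?: string; statusCode?: number };
}

interface PromptResult {
  error?: unknown; // top-level (transport) failure
  data?: { info?: { error?: InfoError }; parts?: unknown[] };
}

// Old behaviour: only the top-level error is consulted.
function failedTopLevelOnly(result: PromptResult): boolean {
  return result.error !== undefined;
}

// New behaviour: a provider failure nested under data.info.error also counts.
function failedIncludingInfo(result: PromptResult): boolean {
  return result.error !== undefined || result.data?.info?.error !== undefined;
}

// A response that looks successful at the top level but carries a provider error.
const providerFailure: PromptResult = {
  data: {
    info: {
      error: {
        name: "APIError",
        data: {
          message:
            "image dimensions exceed max allowed size for many-image requests: 2000 pixels",
          statusCode: 400, // assumed value, not taken from the PR
        },
      },
    },
    parts: [], // zero text parts, which is what led to the _Done_ placeholder
  },
};

console.log(failedTopLevelOnly(providerFailure)); // false
console.log(failedIncludingInfo(providerFailure)); // true
```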

Changes

  • packages/agents/opencode/client.ts
    • Extract data.info.error and throw a descriptive OpenCode <Name> (status N): <message> error after every session.prompt call.
    • Detect the Anthropic oversized-image APIError and call client.session.revert({ sessionID, messageID: info.parentID }) before throwing, so the offending turn is dropped from session history.
  • packages/agents/test/opencode-client.test.ts (new) — unit tests for extractInfoError, formatInfoError, and isOversizedImageError.
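
For readers who have not opened the diff, the three helpers could look roughly like the sketch below. Only the function names and the `OpenCode <Name> (status N): <message>` format come from this PR; the bodies, the `InfoError` type, and the assumption that the message and status code sit under `error.data` are illustrative and may differ from the merged code.

```typescript
// Normalised view of data.info.error (hypothetical type, not from the PR).
type InfoError = { name: string; message?: string; statusCode?: number };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Pull data.info.error out of an untyped prompt response, if present.
function extractInfoError(data: unknown): InfoError | undefined {
  if (!isRecord(data)) return undefined;
  const info = data.info;
  if (!isRecord(info)) return undefined;
  const err = info.error;
  if (!isRecord(err)) return undefined;
  // Assumption: message and statusCode live on err.data.
  const payload = isRecord(err.data) ? err.data : {};
  return {
    name: typeof err.name === "string" ? err.name : "UnknownError",
    message: typeof payload.message === "string" ? payload.message : undefined,
    statusCode: typeof payload.statusCode === "number" ? payload.statusCode : undefined,
  };
}

// Build the "OpenCode <Name> (status N): <message>" string described above.
function formatInfoError(err: InfoError): string {
  const status = err.statusCode !== undefined ? ` (status ${err.statusCode})` : "";
  return `OpenCode ${err.name}${status}: ${err.message ?? "no message"}`;
}

// Match the Anthropic oversized-image failure by name and message text.
function isOversizedImageError(err: InfoError): boolean {
  return (
    err.name === "APIError" &&
    /image dimensions exceed max allowed size/i.test(err.message ?? "")
  );
}
```

Matching on the message text is brittle if the provider rewords the error, which is one reason the revert is treated as best-effort rather than relied upon.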

Scope notes

  • Ode's buildPromptParts only emits text parts today, so client-side image downsampling is out of scope here — the offending screenshots are produced by tool outputs on the OpenCode server side. If/when Ode starts attaching user-uploaded images to prompts, downsampling should be added in packages/agents/shared.ts::buildPromptParts at that point.

Test

  • bun test packages/agents (90 pass)
  • bunx tsc --noEmit -p . (clean)

Kai Liu added 2 commits April 24, 2026 10:20
The OpenCode SDK returns HTTP 200 even when the underlying provider call
failed: the failure is stashed on `data.info.error` (e.g. Anthropic
APIError when an image in session history exceeds 2000px) and the
assistant message ends up with zero text parts. The adapter only looked
at the top-level `result.error`, so the kernel silently fell back to
`_Done_` and hid the real failure from the user.

- Detect `data.info.error` on prompt responses and throw with the
  provider error name, status code, and message.
- When the error is the Anthropic "image dimensions exceed max allowed
  size" case, best-effort `session.revert` the poisoned turn so the
  thread is not permanently stuck failing every subsequent prompt.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dceef8f618

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agents/opencode/client.ts Outdated
// the user.
const data = result.data as Record<string, unknown>;
const infoError = extractInfoError(data);
if (infoError) {

P1 Badge Exclude stop-abort errors from fatal info.error handling

sendMessage now throws for every data.info.error, but this field also carries non-provider failures like MessageAbortedError (the union is documented in this file). When a user issues stop, an abort can race ahead of the stop watcher in request-run.ts; in that race this unconditional throw turns an intentional stop into a failed run/error status instead of the existing graceful stop path. Limit this branch to provider-side failures (e.g., API/auth) or explicitly treat abort errors as non-fatal when the request is being stopped.
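
One way to express the suggested guard is a small predicate that decides whether an info.error should abort the run. This is a sketch, not the adapter's code: `shouldThrowForInfoError` and `NON_FATAL_ERROR_NAMES` are hypothetical names, and it assumes aborts can be recognised by their `name` alone.

```typescript
// Minimal error shape for this sketch (hypothetical type).
type InfoError = { name: string; message?: string };

// Assumption: an intentional stop is identified purely by the error name.
const NON_FATAL_ERROR_NAMES = new Set<string>(["MessageAbortedError"]);

// True only for failures that should be surfaced as a failed run.
function shouldThrowForInfoError(err: InfoError | undefined): boolean {
  if (!err) return false; // no error on the response
  return !NON_FATAL_ERROR_NAMES.has(err.name); // aborts fall through to the stop path
}
```

With this in place, a response carrying MessageAbortedError is left for the existing graceful stop handling, while API and auth failures still throw.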


Comment thread packages/agents/opencode/client.ts Outdated
// in this thread are not permanently broken. We fire-and-forget:
// revert failures should not mask the original error.
if (isOversizedImageError(infoError)) {
await tryRevertOversizedImageTurn(client, activeSessionId, data, workingPath);

P2 Badge Avoid awaiting best-effort revert before surfacing API error

The oversized-image cleanup is described as best-effort/fire-and-forget, but this path awaits tryRevertOversizedImageTurn before throwing the original provider error. If the revert call is slow or hangs, the user-facing error is delayed (or can time out) even though cleanup is optional. Trigger revert asynchronously (or with a short timeout) so the original API failure is reported immediately.
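
The "short timeout" variant mentioned here could be sketched as follows. `withTimeout` and `bestEffort` are hypothetical helpers, not part of the adapter or the OpenCode SDK; the idea is that cleanup is awaited for at most a bounded time and its failures are swallowed so they cannot mask the original error.

```typescript
// Resolve with the work's value, or with "timeout" if it takes longer than `ms`.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | "timeout"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Run optional cleanup with a deadline; never throws.
async function bestEffort(cleanup: () => Promise<unknown>, ms: number): Promise<"done" | "timeout" | "failed"> {
  try {
    const outcome = await withTimeout(cleanup(), ms);
    return outcome === "timeout" ? "timeout" : "done";
  } catch {
    return "failed"; // cleanup errors must not mask the original provider error
  }
}
```

A caller would `await bestEffort(() => revertTurn(), 2000)` (with `revertTurn` standing in for the real revert call) and then throw the provider error regardless of the outcome. Note that a timed-out revert is not cancelled, it is only no longer waited for.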


Kai Liu added 2 commits April 24, 2026 15:29
…rget revert

Addresses Codex review on PR #205:
- P1: skip throw when info.error.name is MessageAbortedError so a user's
  stop does not race ahead of the kernel's graceful stop path and flip
  the run into a failed state.
- P2: dispatch tryRevertOversizedImageTurn asynchronously instead of
  awaiting it, so a slow/hung revert cannot delay surfacing the original
  provider error to the user.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3bd611339


// do NOT await here, so a slow/hung revert cannot delay the user
// from seeing the original API failure.
if (isOversizedImageError(infoError)) {
void tryRevertOversizedImageTurn(client, activeSessionId, data, workingPath);

P1 Badge Serialize oversized-image revert with session lock

sendMessage detaches tryRevertOversizedImageTurn with void and immediately throws, so runtime.withSessionLock is released before session.revert finishes. In the oversized-image path, a user retry that starts right after this error can append a new turn, and the late revert to the old info.parentID can then roll back that fresh turn/context unexpectedly. Please keep the revert in the same serialized flow (or otherwise block subsequent prompts until it completes) to avoid cross-turn history truncation.
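
The ordering concern can be illustrated with a toy per-session lock. The `withSessionLock` below is a hypothetical stand-in built from a promise chain, not the real runtime.withSessionLock, and the `sleep` call simulates a slow session.revert. Because the simulated revert is awaited inside the locked section, a retry queued on the same session cannot start until it has finished.

```typescript
// Hypothetical stand-in for runtime.withSessionLock: one promise chain per session.
const tails = new Map<string, Promise<unknown>>();

function withSessionLock<T>(sessionId: string, fn: () => Promise<T>): Promise<T> {
  const prev = tails.get(sessionId) ?? Promise.resolve();
  // Start fn once the previous holder settles, whether it succeeded or failed.
  const run = prev.then(fn, fn);
  // The stored tail never rejects, so one failure does not break the chain.
  tails.set(sessionId, run.catch(() => undefined));
  return run;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const log: string[] = [];

// First prompt: fails, reverts inside the lock, then surfaces the error.
function failingPrompt(): Promise<void> {
  return withSessionLock("s1", async () => {
    await sleep(20); // simulated session.revert
    log.push("revert done");
    throw new Error("OpenCode APIError: image too large");
  });
}

// User retry on the same session, issued immediately afterwards.
function retryPrompt(): Promise<void> {
  return withSessionLock("s1", async () => {
    log.push("retry started");
  });
}

async function demo(): Promise<string[]> {
  const first = failingPrompt().catch(() => undefined);
  const second = retryPrompt();
  await Promise.all([first, second]);
  return log;
}
```

In `demo()`, "revert done" is always logged before "retry started". Detaching the revert with `void` would remove that guarantee, at the cost discussed in the earlier P2 comment: the user waits for the revert before seeing the error.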


@LIU9293 LIU9293 merged commit 8ad437f into main Apr 24, 2026
2 checks passed