fix(streaming): normalize non-SSE upstream chat completions by SantiagoDePolonia · Pull Request #412 · ENTERPILOT/GoModel

SantiagoDePolonia · 2026-06-18T08:12:51Z

Summary

Fixes #411 — streaming chat completions where the connection appears to hang after the model has clearly finished, leaving the client "waiting for the model to continue outputting."

Root cause: Some OpenAI-compatible upstreams ignore stream:true and reply with a single buffered application/json completion (no data: framing, no [DONE]). The gateway forwarded that body verbatim under a text/event-stream content type, so SSE clients waited forever for an end-of-stream marker that never arrived.

Reproduced with bailian/deepseek-v3.2 (a dedicated Aliyun MaaS deepseek deployment that silently drops streaming):

HTTP/1.1 200 OK
Content-Type: text/event-stream                          ← gateway claims SSE
{"choices":[{"finish_reason":"stop","message":{...}}]}   ← but buffered JSON, no data:/[DONE]

Other models (qwen-flash, glm-5.1, deepseek-v4-*, OpenAI, Anthropic, Groq) stream normally — so this is a general robustness gap for any upstream that drops streaming, not a single-provider issue.

Not a version regression

The faulty path predates the versions named in the issue: CompatibleProvider.StreamChatCompletion (raw DoStream passthrough) is byte-identical at v0.1.40 and today, and the unconditional text/event-stream header has existed since the first commit. The v0.1.40..v0.1.42 diff touches only embeddings, health/readiness, dashboard, and storage — nothing in the chat-streaming path. The hang is triggered by the upstream model/endpoint behavior, not a gateway code change. This PR hardens the long-standing edge case.

Fix

New shared wrapper EnsureChatCompletionSSE, applied in CompatibleProvider.StreamChatCompletion so every OpenAI-compatible provider (bailian, groq, oracle, vllm, xiaomi, deepseek, openai) benefits. Following Postel's Law:

Genuine SSE (starts with data:, :, event:, …) passes through untouched, no buffering.
A buffered JSON completion (starts with {) is re-emitted as one SSE chunk (object → chat.completion.chunk, each choice's message → delta) followed by a terminal data: [DONE].

Testing

New table-driven tests for EnsureChatCompletionSSE (buffered-JSON conversion, genuine-SSE passthrough, leading-comment passthrough, nil).
Manually verified with curl: deepseek-v3.2 now emits a proper chunk and [DONE]; previously-working providers still stream multi-chunk unchanged.
make test-race and make lint pass (pre-commit).

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Chat completion streaming is now normalized to standard Server-Sent Events across providers, with consistent data: frames and a proper [DONE] terminator.
- Improved behavior for non-SSE JSON responses and for upstream disconnects mid-stream, so already-received content is still delivered and the stream ends cleanly.
- Added request validation in the XAI provider to return a clear error when the request is missing.
Tests
- Expanded coverage for SSE pass-through (including comment-prefixed payloads), JSON-to-SSE conversion, and mid-body read error scenarios.

Some OpenAI-compatible upstreams ignore stream:true and reply with a single buffered application/json completion (no data: framing, no [DONE]). The gateway forwarded that body verbatim under a text/event-stream content type, so SSE clients waited forever for an end-of-stream marker that never arrived — the connection appeared to hang after the model had clearly finished (issue #411). Add EnsureChatCompletionSSE, applied in the shared CompatibleProvider.StreamChatCompletion so every OpenAI-compatible provider benefits. Genuine SSE streams pass through untouched with no buffering; a buffered JSON completion is re-emitted as one SSE chunk (object -> chat.completion.chunk, message -> delta) followed by a terminal data: [DONE]. This is a long-standing latent defect, not a regression between the versions named in the issue; the fix hardens the gateway against any upstream that silently drops streaming. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-18T08:13:05Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 646778d2-cc5b-4fb2-98f8-4fd07d4404c2

📥 Commits

Reviewing files that changed from the base of the PR and between ee8ef7f and de658e3.

📒 Files selected for processing (2)

internal/providers/chat_stream_normalize.go
internal/providers/xai/xai.go

📝 Walkthrough

Walkthrough

Adds EnsureChatCompletionSSE, a new normalizer that inspects the first ~512 bytes of an upstream chat completion stream to determine whether it is buffered JSON or genuine SSE, converts buffered JSON to SSE format (rewriting the object field and message→delta), and appends the terminal data: [DONE]. The function is wired into three provider implementations—CompatibleProvider, DeepSeek, and XAI—with explicit error checks before wrapping, and seven unit tests cover all branches including partial-body recovery on read failures.

Changes

SSE Stream Normalization

Layer / File(s)	Summary
EnsureChatCompletionSSE and JSON-to-SSE conversion `internal/providers/chat_stream_normalize.go`	New file adds `EnsureChatCompletionSSE` (peek-based JSON vs. SSE detection), `bufferedCompletionToSSE` (rewrites `object`, moves `message`→`delta`, appends `data: [DONE]`), `firstNonSpaceByte` helper, and `bufferedReadCloser` wrapper.
Unit tests for all code paths `internal/providers/chat_stream_normalize_test.go`	Defines `errAfterReadCloser` test helper and covers JSON-to-SSE rewriting with field transforms, byte-for-byte pass-through for real SSE, SSE-with-leading-comment pass-through, partial body preservation on read errors (ensuring `[DONE]` still appends), and nil-stream handling.
Provider StreamChatCompletion integration `internal/providers/openai/compatible_provider.go`, `internal/providers/deepseek/deepseek.go`, `internal/providers/xai/xai.go`	Three providers now capture `(stream, err)` from `p.client.DoStream`, return errors without wrapping, and wrap successful streams with `EnsureChatCompletionSSE` before returning. Ensures all providers emit properly-terminated SSE regardless of upstream format.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Provider as Provider.StreamChatCompletion
    participant DoStream as p.client.DoStream
    participant Normalizer as EnsureChatCompletionSSE

    Client->>Provider: StreamChatCompletion(request)
    Provider->>DoStream: POST /chat/completions
    DoStream-->>Provider: (stream, err)
    alt err != nil
        Provider-->>Client: return nil, err
    else stream ok
        Provider->>Normalizer: wrap(stream)
        Normalizer->>Normalizer: peek first 512 bytes
        alt buffered JSON response
            Normalizer->>Normalizer: rewrite object to chat.completion.chunk
            Normalizer->>Normalizer: move message to delta fields
            Normalizer->>Normalizer: emit data: [DONE] terminator
        else genuine SSE stream
            Normalizer->>Normalizer: pass through via bufferedReadCloser
        end
        Normalizer-->>Provider: normalized SSE stream
        Provider-->>Client: normalized SSE stream
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A stream came in half-baked, no [DONE] in sight,
The rabbit peeked ahead and said, "this ain't right!"
It dressed up the JSON in data: clothes,
Appended the marker before the stream slows.
Now clients can rest when the model is through —
No hanging forever, hooray, it's brand new! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: normalizing non-SSE upstream chat completions by introducing SSE formatting where missing.
Linked Issues check	✅ Passed	The PR implementation fully addresses issue `#411` by introducing EnsureChatCompletionSSE to normalize non-SSE responses and append the [DONE] marker that was causing the hang.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to fixing the streaming chat completion issue: new SSE normalization logic, integration into compatible providers, and comprehensive tests.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/connection-regression

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-06-18T08:15:13Z

Confidence Score: 5/5

This looks safe to merge.

No blocking issues found in the changed code.

_{Reviews (3): Last reviewed commit: "fix(streaming): avoid tainted size arith..." | Re-trigger Greptile}

codecov-commenter · 2026-06-18T08:16:49Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 77.19298% with 13 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/providers/chat_stream_normalize.go	79.06%	7 Missing and 2 partials ⚠️
internal/providers/deepseek/deepseek.go	50.00%	1 Missing and 1 partial ⚠️
internal/providers/xai/xai.go	66.66%	1 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/providers/chat_stream_normalize.go`:
- Around line 40-44: The error handling path after io.ReadAll in the function
discards any partial data that was successfully read before the error occurred,
silently losing generated content. Instead of returning only chatDonePayload
when err is not nil, modify the error branch to combine the partial body content
that was read with the done payload, ensuring that any successfully generated
content is preserved and emitted before the DONE marker. This requires returning
both the partial data and the completion signal rather than discarding the
partial data entirely.
- Around line 33-35: The reader.Peek(peekForNonSSE) call blocks waiting for the
full 512 bytes (peekForNonSSE) to be available or EOF, which delays SSE streams
and makes streaming appear frozen. Instead of peeking the full peekForNonSSE
buffer size, use a smaller peek size that is sufficient to detect whether the
response starts with a JSON object (indicated by the opening brace character)
without blocking unnecessarily. Consider using a minimal peek size like 1 byte
or a small reasonable amount to check the first non-whitespace character,
allowing valid SSE streams to pass through immediately without waiting for the
full buffer to fill.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 854ae2ef-3f38-4502-b64e-9b0c100034b3

📥 Commits

Reviewing files that changed from the base of the PR and between 68cd937 and c999793.

📒 Files selected for processing (3)

internal/providers/chat_stream_normalize.go
internal/providers/chat_stream_normalize_test.go
internal/providers/openai/compatible_provider.go

Address PR review feedback on the chat-completions SSE normalizer: - Classify the stream from its first non-whitespace byte via incremental Peek instead of Peek(512), so a genuine SSE upstream that emits a small first token then pauses is no longer held back waiting for a full buffer. - Preserve partially-read bodies: when the upstream looked like buffered JSON but the read failed mid-body, forward what arrived before [DONE] instead of replacing it with a bare done marker. - Apply EnsureChatCompletionSSE to the deepseek and xai providers, which stream via raw DoStream and previously bypassed the normalizer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

CodeQL flagged the make() capacity computation (len(payload)+constants) as a possible allocation-size overflow. Build the normalized SSE chunk with a bytes.Buffer instead, which removes the explicit size arithmetic; behavior is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/providers/chat_stream_normalize.go`:
- Around line 41-44: The condition at line 41 in the error handling block checks
both `err != nil && len(body) == 0`, but this is unreachable dead code. After
the Peek operation on line 34 successfully identifies the JSON opening brace
character, that byte is buffered and will be returned by the subsequent
io.ReadAll call, ensuring the body is never empty. Remove the `len(body) == 0`
check from the condition and keep only `if err != nil` to eliminate the dead
code branch and clarify the actual error handling path.

In `@internal/providers/xai/xai.go`:
- Around line 186-191: The StreamChatCompletion method in the Provider type
dereferences the req parameter without validating it is not nil, which will
cause a panic if a nil request is passed. Add a nil-check guard at the beginning
of the StreamChatCompletion method that returns an appropriate error if req is
nil, following the same pattern already implemented in the DeepSeek provider
referenced in this PR. This should occur before the req.WithStreaming() call on
line 190.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1fc2ef05-f172-41c1-af83-7e6f3c50deea

📥 Commits

Reviewing files that changed from the base of the PR and between c999793 and ee8ef7f.

📒 Files selected for processing (4)

internal/providers/chat_stream_normalize.go
internal/providers/chat_stream_normalize_test.go
internal/providers/deepseek/deepseek.go
internal/providers/xai/xai.go

Address follow-up PR review: - xai StreamChatCompletion now guards a nil request before dereferencing it, matching the deepseek provider's contract in this PR. - Remove the unreachable len(body)==0 branch in EnsureChatCompletionSSE: the '{' that classifies the body is already buffered, so io.ReadAll always returns at least that byte. bufferedCompletionToSSE already forwards partial bytes and appends [DONE], so behavior is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread internal/providers/chat_stream_normalize.go Outdated

Comment thread internal/providers/openai/compatible_provider.go

Comment thread internal/providers/chat_stream_normalize.go Outdated

Comment thread internal/providers/chat_stream_normalize.go Outdated

github-advanced-security AI found potential problems Jun 18, 2026

View reviewed changes

Comment thread internal/providers/chat_stream_normalize.go Fixed

SantiagoDePolonia mentioned this pull request Jun 18, 2026

1.4.2 The conversation cannot be properly ended. #411

Closed

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread internal/providers/chat_stream_normalize.go Outdated

Comment thread internal/providers/chat_stream_normalize.go Outdated

SantiagoDePolonia and others added 2 commits June 18, 2026 12:42

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread internal/providers/chat_stream_normalize.go Outdated

Comment thread internal/providers/xai/xai.go

SantiagoDePolonia merged commit 9025183 into main Jun 18, 2026
20 checks passed

Uh oh!

Conversation

SantiagoDePolonia commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Not a version regression

Fix

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

greptile-apps Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Confidence Score: 5/5

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SantiagoDePolonia commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

greptile-apps Bot commented Jun 18, 2026 •

edited

Loading

codecov-commenter commented Jun 18, 2026 •

edited

Loading