
fix(client): abandon timed-out WebSocket handshakes without blocking (#113) #115

Merged
nficano merged 1 commit into main from fix/ws-connect-timeout-113
Jun 12, 2026
@nficano nficano commented Jun 12, 2026

Contributor

The connect timeout path called HttpClient.close(), which waits for the
abandoned handshake to complete; against a server that accepts the TCP
connection but never finishes the HTTP upgrade, the JDK client retries
the GET on EOF, turning the configured timeout into an indefinite hang.
The timeout path now cancels the handshake stage and uses shutdownNow(),
so connect() throws promptly. Regression test connects to a silent
accept-and-discard server with a 500ms timeout and bounds the wall
clock.
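
The timeout handling described above can be sketched roughly as follows. This is a minimal illustration, not the actual WebSocketTransport code: the class name HandshakeTimeoutSketch, the method awaitOrAbandon, the forceShutdown hook, and the "interrupted" message are all hypothetical, and only the timeout branch is modelled closely on the description.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the real connect() is structured differently.
final class HandshakeTimeoutSketch {
  private HandshakeTimeoutSketch() {}

  /**
   * Waits for the handshake stage up to the timeout. On timeout it cancels the
   * stage and runs a non-blocking shutdown hook instead of a blocking close.
   */
  static <T> T awaitOrAbandon(
      CompletableFuture<T> handshake, Duration timeout, Runnable forceShutdown) {
    try {
      return handshake.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // Abandon the in-flight handshake rather than waiting for it to finish.
      handshake.cancel(true);
      try {
        forceShutdown.run(); // e.g. httpClient::shutdownNow (assumption)
      } catch (RuntimeException ignored) {
        // best-effort
      }
      throw new IllegalStateException("WebSocket connect timed out", e);
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      throw new IllegalStateException("WebSocket connect failed: " + cause, cause);
    } catch (InterruptedException e) {
      // Restore the interrupt flag; this message is an assumption of the sketch.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("WebSocket connect interrupted", e);
    }
  }
}
```

A caller could pass the future returned by httpClient.newWebSocketBuilder().buildAsync(uri, listener) as the handshake stage and httpClient::shutdownNow as the hook. Note that HttpClient.close() and HttpClient.shutdownNow() are only available from Java 21 onward.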

Closes #113

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Summary by CodeRabbit

  • Bug Fixes

    • Improved WebSocket connection timeout handling to prevent indefinite hangs when servers accept connections but fail to complete the upgrade. Connections now properly timeout and close as configured.
  • Tests

    • Added test coverage for WebSocket connection timeout scenarios to ensure timely failure under adverse server conditions.


coderabbitai Bot commented Jun 12, 2026

Walkthrough

The fix addresses an indefinite hang in the WebSocket connect timeout path: when the server accepts the TCP connection but never completes the HTTP upgrade, the JDK client retries the GET on EOF while HttpClient.close() blocks waiting for the abandoned handshake. The timeout path now cancels the handshake and calls shutdownNow() to stop the client forcibly, so connect() throws promptly. A new test confirms this against a silent server socket.

Changes

WebSocket Timeout Hang Fix

| Layer / File(s) | Summary |
|---|---|
| Timeout path cancellation fix: arcp-client/src/main/java/dev/arcp/client/WebSocketTransport.java | WebSocketTransport.connect() now cancels the in-flight handshake future and calls httpClient.shutdownNow() on timeout instead of httpClient.close(), preventing indefinite hangs against silent servers. |
| Timeout behavior validation: arcp-client/src/test/java/dev/arcp/client/WebSocketConnectTimeoutTest.java | New test validates that connect() fails promptly with IllegalStateException("WebSocket connect timed out") when connecting to a server that accepts TCP but never responds to the HTTP upgrade, and verifies elapsed time stays within a reasonable bound relative to the 500ms timeout. |
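
For readers who want to reproduce the scenario, a silent server of the kind the test describes might look roughly like the sketch below. It is not the code from WebSocketConnectTimeoutTest: the class name SilentServer and its methods are hypothetical, and this version keeps accepted connections open and discards whatever the client sends, which may differ from how the real test server behaves.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical test fixture: accepts TCP connections and never writes a byte.
final class SilentServer implements AutoCloseable {
  private final ServerSocket listener;
  private final List<Socket> accepted = new CopyOnWriteArrayList<>();

  SilentServer() throws IOException {
    // Port 0 asks the OS for a free ephemeral port on the loopback interface.
    listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    Thread acceptLoop = new Thread(this::acceptForever, "silent-server-accept");
    acceptLoop.setDaemon(true);
    acceptLoop.start();
  }

  int port() {
    return listener.getLocalPort();
  }

  private void acceptForever() {
    while (!listener.isClosed()) {
      try {
        Socket socket = listener.accept();
        accepted.add(socket);
        Thread drain = new Thread(() -> discard(socket), "silent-server-drain");
        drain.setDaemon(true);
        drain.start();
      } catch (IOException e) {
        return; // listener was closed; stop accepting
      }
    }
  }

  // Reads and drops everything the client sends; never replies.
  private static void discard(Socket socket) {
    try (InputStream in = socket.getInputStream()) {
      byte[] buffer = new byte[1024];
      while (in.read(buffer) != -1) {
        // discard
      }
    } catch (IOException ignored) {
      // the socket was closed underneath us; nothing to do
    }
  }

  @Override
  public void close() throws IOException {
    listener.close();
    for (Socket socket : accepted) {
      try {
        socket.close();
      } catch (IOException ignored) {
        // best-effort
      }
    }
  }
}
```

A regression test could point connect() at the server's port() with a 500ms timeout, assert on the IllegalStateException, and bound the elapsed wall-clock time.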

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A socket that lingered too long on the wire,
Once made the connection climb higher and higher,
But now with a shutdown—so forceful and brave—
The timeout returns from its hanging grave! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: fixing WebSocket timeout handling to abandon handshakes without blocking. |
| Linked Issues check | ✅ Passed | The code changes directly address issue #113 by replacing HttpClient.close() with cancellation and shutdownNow() to prevent indefinite hangs on timeout. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing the WebSocket connect timeout issue; the test validates the timeout behavior and the implementation change addresses the root cause. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
arcp-client/src/main/java/dev/arcp/client/WebSocketTransport.java (1)

120-140: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clean up transport-owned resources on failed connect paths.

At Line 120 and Line 129, connect(...) throws before releasing transport.inbound / transport.inboundExecutor created at Line 83. Repeated timeout/failure retries can accumulate abandoned executors.

Proposed fix
@@
     } catch (java.util.concurrent.ExecutionException e) {
       Throwable cause =
           e.getCause() instanceof CompletionException ce ? ce.getCause() : e.getCause();
       try {
         httpClient.close();
       } catch (RuntimeException ignored) {
         // best-effort
       }
+      transport.inbound.close();
+      transport.inboundExecutor.shutdown();
       throw new IllegalStateException("WebSocket connect failed: " + cause, cause);
     } catch (java.util.concurrent.TimeoutException e) {
@@
       try {
         httpClient.shutdownNow();
       } catch (RuntimeException ignored) {
         // best-effort
       }
+      transport.inbound.close();
+      transport.inboundExecutor.shutdown();
       throw new IllegalStateException("WebSocket connect timed out", e);
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@arcp-client/src/main/java/dev/arcp/client/WebSocketTransport.java` around
lines 120 - 140, connect(...) in WebSocketTransport throws on ExecutionException
and TimeoutException without cleaning up the transport-owned resources created
earlier, leaking transport.inbound and transport.inboundExecutor; in both catch
blocks (the ExecutionException block handling "WebSocket connect failed" and the
TimeoutException block handling "WebSocket connect timed out") ensure you
explicitly close or release transport.inbound and shut down
transport.inboundExecutor (e.g., call transport.inbound.close() if present and
transport.inboundExecutor.shutdownNow()/shutdown with a short await), wrapping
those calls in try/catch as best-effort cleanup before closing
httpClient/shutting down the client and rethrowing the IllegalStateException so
repeated failures don't accumulate abandoned executors.
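
The best-effort cleanup suggested in this comment could be factored into a small helper along these lines. This is only a sketch: the name ConnectCleanupSketch.releaseQuietly is hypothetical, and it assumes transport.inbound is some AutoCloseable and transport.inboundExecutor is an ExecutorService, neither of which is confirmed by the diff shown here.

```java
import java.util.concurrent.ExecutorService;

// Hypothetical helper for the failed-connect paths; types are assumptions.
final class ConnectCleanupSketch {
  private ConnectCleanupSketch() {}

  /** Releases transport-owned resources without letting cleanup failures escape. */
  static void releaseQuietly(AutoCloseable inbound, ExecutorService inboundExecutor) {
    if (inbound != null) {
      try {
        inbound.close();
      } catch (Exception ignored) {
        // best-effort: the connect failure is the error worth reporting
      }
    }
    if (inboundExecutor != null) {
      try {
        // shutdownNow() interrupts running tasks; shutdown() would let them finish.
        inboundExecutor.shutdownNow();
      } catch (RuntimeException ignored) {
        // best-effort
      }
    }
  }
}
```

Both catch blocks could call such a helper just before throwing the IllegalStateException, so a cleanup failure never masks the original connect error.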

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 80e97e4a-048e-41b6-a3bf-e3a2532082b2

📥 Commits

Reviewing files that changed from the base of the PR and between e4d40b4 and 1d18c95.

📒 Files selected for processing (2)
  • arcp-client/src/main/java/dev/arcp/client/WebSocketTransport.java
  • arcp-client/src/test/java/dev/arcp/client/WebSocketConnectTimeoutTest.java

@codecov

codecov Bot commented Jun 12, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

@nficano nficano merged commit 4e2b4bc into main Jun 12, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

WebSocketTransport.connect timeout path can hang: HttpClient.close() blocks until the abandoned handshake completes

1 participant