feat(broker): egress-auth broker (runner side)#56
Merged
Conversation
…nv leak, concurrent
…down - Validate the control-channel request byte (constants ctrlReqDial / ctrlRespOK / ctrlRespErr); refuse a malformed datagram with 0xFF so a confused in-sandbox process that inherited fd 3 fails fast and never mints a connection off-spec. - Reorder the fd dispatch: wrap and verify the runner's own end with net.FileConn BEFORE handing the peer its fd via SCM_RIGHTS, so a net.FileConn failure refuses cleanly instead of stranding the peer with a connection that never gets served (its request would hang to timeout). - Clone http.DefaultTransport (guarded type assertion, matching safeHTTPTransport) instead of sharing the process-global one, and add a ResponseHeaderTimeout backstop. - BaseContext ties each per-conn server to broker shutdown so cancel() promptly cancels in-flight upstream requests. - Test: fix fd leaks (childFD + received fd) and assert the refusal path. Note: deliberately do NOT join the control goroutine in the dispatch WaitGroup — it exits only when its Recvmsg peer fully closes (childEnd is closed by the caller after stop()), so joining it would deadlock.
This is the broker code's first PR, so the linter surfaces issues across the original commits too. Fixes: - errcheck: wrap deferred syscall.Close / f.Close in the test. - gosec G112: ReadHeaderTimeout on the per-conn http.Server. - govet shadow: reuse the outer err in the test instead of re-declaring. - noctx: http.NewRequestWithContext in the test. - gosec G115 (int->uintptr on fds): exclude by text + path, matching the repo's existing G706/G703 handling for rules absent from the pinned CI golangci-lint v2.4 (fds from socketpair/ParseUnixRights never overflow). Verified: golangci-lint run (GOOS=linux) reports 0 issues; config verify passes; the broker unit test still passes in the linux container.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Egress-auth broker (runner side)
Turns the runner into a local authenticating broker so the fduty CLI's outbound API calls authenticate without the app_key ever entering the bash environment.
What changes
environment/broker_linux.go(//go:build linux): per-invocationsocketpair(AF_UNIX, SOCK_SEQPACKET|SOCK_CLOEXEC)control channel; child end inherited by bash as fd 3 viacmd.ExtraFiles; a control goroutine answers each handshake by minting a dedicatedSOCK_STREAMconnection and passing it back viaSCM_RIGHTS, then serves anhttputil.ReverseProxythat overwrites the sentinel?app_key=with the real per-person key and forwards to the upstream.PR_SET_DUMPABLE=0hardens the runner's in-memory keys.environment/broker_other.go(//go:build !linux):BrokerSupported=false+ stubs. darwin/windows don't carry the broker and fall back to the legacy env-key path.executeBashCommandaccepts*protocol.BashCredential; in broker mode it setsFLASHDUTY_CRED_FD=3+ an http placeholder base URL and never puts the key incmd.Env.protocol.BashCredential{Key, BaseURL}+BashArgs.Credential.ws/client.go: advertises&broker=1on WS connect iffBrokerSupported.Robustness (control channel)
0xFFand never mints a connection.net.FileConnbefore handing the peer its fd, so anet.FileConnfailure refuses cleanly instead of stranding the peer with a connection that never gets served.http.DefaultTransport(guarded type assertion) with aResponseHeaderTimeoutbackstop;BaseContextties each per-conn server to broker shutdown so teardown promptly cancels in-flight upstream requests.Why
The app_key in the bash env leaks via
printenvand/proc/<sibling>/environon shared-uid hosts. The broker keeps the key in runner memory only; attribution is by-construction (the runner is the parent of each bash, fd 3 is private to that invocation). Linux-only by construction (SOCK_CLOEXEC/PR_SET_DUMPABLEare undefined on darwin); capability is negotiated so a heterogeneous BYOC fleet stays correct.Verification
TestServeBrokerControl_RewritesKeyAndProxies(Linux, in container) — control channel + SCM_RIGHTS dispatch + key rewrite + the0xFFrefusal path; passes under-count=3with no fd/goroutine leak.TestBrokerE2E_RealFduty, opt-in): realfdutythrough the broker → real local pgy → authenticated real data; app_key absent from bash env (FLASHDUTY_CRED_FD=3only); two concurrentfdutyboth authenticate, no byte interleave.broker=1(Safari loggedbroker=true); an AI-SRE session ranfduty channel listthrough the broker and returned real channels withexit_code:0. Container had noFLASHDUTY_APP_KEY— the only auth path was the broker.