Streaming response pipeline, header sanitization, and SSRF hardening #151

Closed. fifthsegment wants to merge 4 commits into master from feat-streaming-pipeline

@fifthsegment
Owner

Extracts Phases 1-3 from PR #139 into a focused PR.

Phase 1: Response header sanitization — Content-Length validation, CRLF injection defense, hop-by-hop removal, X-Content-Type-Options, Via-based loop detection

Phase 2: DNS wiring and SSRF hardening — dialer.Resolver wired to GateSentry DNS, safeDialContext() blocks loopback/link-local to admin port, admin port isolation

Phase 3: Streaming response pipeline — 3-path content router (Stream/Peek+Stream/Buffer+Scan), DisableCompression, http.Flusher support, GS_MAX_SCAN_SIZE_MB env var (2MB default)
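A minimal sketch of how the GS_MAX_SCAN_SIZE_MB setting might be read, assuming the value is a whole number of megabytes. The helper name `maxScanBytes` is hypothetical, and falling back to the 2 MB default on an empty or invalid value is an assumption rather than confirmed behaviour.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultMaxScanMB mirrors the 2MB default named in the PR description.
const defaultMaxScanMB = 2

// maxScanBytes converts a GS_MAX_SCAN_SIZE_MB value into bytes.
// Assumption: empty, non-numeric, or non-positive values use the default.
func maxScanBytes(raw string) int64 {
	mb, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || mb <= 0 {
		mb = defaultMaxScanMB
	}
	return mb * 1024 * 1024
}

func main() {
	fmt.Println(maxScanBytes(os.Getenv("GS_MAX_SCAN_SIZE_MB")))
}
```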

Builds clean: go build ./... passes.

jbarwick and others added 4 commits May 17, 2026 15:32
Implemented:
- sanitizeResponseHeaders() — validates Content-Length conflicts,
  negative values, null bytes, CRLF injection (response splitting defense)
- Via: 1.1 gatesentry header on all proxied responses (RFC 7230 §5.7.1)
- Via-based loop detection in ServeHTTP() — returns 508 Loop Detected
- X-Content-Type-Options: nosniff on all proxied responses
- Content-Length lifecycle fix — set after body processing, not before
- RoundTrip error handling fix — transport failures return 502 Bad Gateway
  instead of block page (block pages reserved for intentional filter blocks)
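
The header checks listed above could look roughly like the following sketch. It is not the PR's sanitizeResponseHeaders(); the names `parseContentLength`, `hasUnsafeBytes`, and `sanitizeHeaders` are hypothetical, and returning an error (rather than repairing the header) is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// parseContentLength validates raw Content-Length values. It rejects
// conflicting duplicates, negative numbers, and non-numeric input.
// It returns -1 when the header is absent.
func parseContentLength(values []string) (int64, bool) {
	var seen int64 = -1
	for _, raw := range values {
		// A single field line may itself carry a comma-separated list.
		for _, part := range strings.Split(raw, ",") {
			n, err := strconv.ParseInt(strings.TrimSpace(part), 10, 64)
			if err != nil || n < 0 {
				return 0, false
			}
			if seen >= 0 && n != seen {
				return 0, false // conflicting lengths
			}
			seen = n
		}
	}
	return seen, true
}

// hasUnsafeBytes reports whether s contains CR, LF, or NUL.
func hasUnsafeBytes(s string) bool {
	return strings.ContainsAny(s, "\r\n\x00")
}

// sanitizeHeaders is a hypothetical stand-in for sanitizeResponseHeaders().
func sanitizeHeaders(h http.Header) error {
	if _, ok := parseContentLength(h.Values("Content-Length")); !ok {
		return fmt.Errorf("invalid or conflicting Content-Length")
	}
	for name, values := range h {
		for _, v := range values {
			if hasUnsafeBytes(name) || hasUnsafeBytes(v) {
				return fmt.Errorf("unsafe bytes in header %q", name)
			}
		}
	}
	h.Set("X-Content-Type-Options", "nosniff")
	h.Add("Via", "1.1 gatesentry")
	return nil
}

func main() {
	h := http.Header{}
	h.Add("Content-Length", "10")
	h.Add("Content-Length", "20")
	fmt.Println(sanitizeHeaders(h))
}
```

Under this sketch a caller would treat a non-nil error like any other upstream failure, which fits the 502 Bad Gateway handling described above.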

Test results: 81 PASS, 2 FAIL, 13 KNOWN, 1 SKIP (97 total)
Improvements: §3.1 Via header, §3.6 Content-Length, §7.4 loop detection
  all moved from KNOWN/FAIL → PASS
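
For the loop-detection item, a sketch of a Via-based check is shown here as middleware. The PR performs the check inside ServeHTTP(); the names `viaContains` and `loopGuard` are hypothetical. This simplified parser splits on commas and does not handle commas inside Via comments.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// proxyPseudonym matches the received-by token in "Via: 1.1 gatesentry".
const proxyPseudonym = "gatesentry"

// viaContains reports whether any Via hop names pseudonym as received-by.
func viaContains(values []string, pseudonym string) bool {
	for _, raw := range values {
		for _, hop := range strings.Split(raw, ",") {
			// Each hop is "<protocol-version> <received-by> [comment]".
			fields := strings.Fields(hop)
			if len(fields) >= 2 && strings.EqualFold(fields[1], pseudonym) {
				return true
			}
		}
	}
	return false
}

// loopGuard answers 508 Loop Detected when a request has already
// passed through this proxy, and otherwise calls next.
func loopGuard(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if viaContains(r.Header.Values("Via"), proxyPseudonym) {
			http.Error(w, "loop detected", http.StatusLoopDetected)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	fmt.Println(viaContains([]string{"1.0 fred, 1.1 gatesentry"}, proxyPseudonym))
}
```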

Also adds:
- Comprehensive 97-test benchmark suite (tests/proxy_benchmark_suite.sh)
- Adversarial echo server with 41+ hostile endpoints (tests/testbed/)
- TLS test fixtures for local HTTPS testbed
- PROXY_SERVICE_UPDATE_PLAN.md: 5-phase hardening roadmap

Implemented:
- dialer.Resolver wired to GateSentry DNS (127.0.0.1:10053) so all
  proxy hostname resolution goes through GateSentry filtering
- DNS port configurable via GATESENTRY_DNS_PORT env var
- Admin port isolation: ServeHTTP() blocks proxy requests to admin
  port (8080) on loopback/LAN/localhost addresses (HTTP 403)
- safeDialContext(): prevents DNS rebinding SSRF to admin port —
  blocks when a hostname resolves to loopback/link-local AND targets
  the admin port. All other connections allowed (GateSentry DNS is
  trusted as the resolver)
- ConnectDirect() and all HTTP transports now use safeDialContext()
- extractPort() helper in utils.go
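
A sketch of the resolver wiring described in the first two items, using the standard library's `net.Resolver` and `net.Dialer`. The helper names `dnsAddrFromPort` and `newFilteredDialer` and the timeout values are assumptions, not taken from the PR.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// dnsAddrFromPort builds the GateSentry DNS address. An empty port
// (GATESENTRY_DNS_PORT unset) falls back to 10053.
func dnsAddrFromPort(port string) string {
	if port == "" {
		port = "10053"
	}
	return net.JoinHostPort("127.0.0.1", port)
}

// newFilteredDialer returns a dialer whose hostname lookups are sent to
// dnsAddr instead of the system's configured nameservers.
func newFilteredDialer(dnsAddr string) *net.Dialer {
	resolver := &net.Resolver{
		// Prefer Go's built-in resolver, which is the one that uses Dial.
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 3 * time.Second}
			return d.DialContext(ctx, network, dnsAddr)
		},
	}
	return &net.Dialer{Timeout: 30 * time.Second, Resolver: resolver}
}

func main() {
	addr := dnsAddrFromPort(os.Getenv("GATESENTRY_DNS_PORT"))
	d := newFilteredDialer(addr)
	fmt.Println(addr, d.Resolver.PreferGo)
}
```

Assigning the dialer's `DialContext` method to an `http.Transport` would route every proxied hostname lookup through the same resolver.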

Design decisions:
- Only admin-port rebinding is blocked at dial level. Full RFC 1918
  blocking deferred to PAC file endpoint (clients already configure
  'bypass proxy for LAN' in their proxy settings)
- IP-literal requests to non-admin ports allowed through — the proxy
  trusts GateSentry DNS resolution for hostnames
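
The dial-level rule can be sketched as follows, assuming the admin port is fixed at 8080. The names `blocked` and `safeDial` are hypothetical stand-ins for the logic in safeDialContext(). The sketch resolves first, rejects any loopback or link-local result on the admin port, and then dials the vetted IP so the name cannot be re-resolved to a different address.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

const adminPort = "8080" // admin port named in the commit message

// blocked reports whether ipStr:port would reach the admin port through a
// loopback or link-local address. Every other target is allowed.
func blocked(ipStr, port string) bool {
	ip := net.ParseIP(ipStr)
	if ip == nil {
		return false
	}
	return port == adminPort && (ip.IsLoopback() || ip.IsLinkLocalUnicast())
}

// safeDial resolves the host, checks every returned address, and dials the
// first resolved IP directly instead of the hostname.
func safeDial(ctx context.Context, d *net.Dialer, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	r := d.Resolver
	if r == nil {
		r = net.DefaultResolver
	}
	ips, err := r.LookupIPAddr(ctx, host)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, fmt.Errorf("no addresses for %s", host)
	}
	for _, ia := range ips {
		if blocked(ia.IP.String(), port) {
			return nil, fmt.Errorf("refusing to dial %s: resolves to %s on admin port", addr, ia.IP)
		}
	}
	return d.DialContext(ctx, network, net.JoinHostPort(ips[0].String(), port))
}

func main() {
	fmt.Println(blocked("127.0.0.1", "8080"), blocked("127.0.0.1", "9000"))
}
```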

Test results: 84 PASS, 2 FAIL, 10 KNOWN, 1 SKIP (97 total)
Phase 2 fixes: §8.1 DNS resolution, §7.1 SSRF admin, §7.2 SSRF localhost
  all moved from KNOWN → PASS. CONNECT tunnels (§5.1, §5.2) confirmed
  no regression.

Replace buffer-everything architecture with a 3-path response router
that only buffers content that actually needs scanning:

  Path A (Stream): JS, CSS, fonts, JSON, binary, downloads — zero
    buffering, io.Copy + http.Flusher for progressive delivery
  Path B (Peek+Stream): images, video, audio — read first 4KB for
    filetype detection + content filter, then stream remainder
  Path C (Buffer+Scan): text/html only — preserves existing
    ScanMedia/ScanText full-body scanning behaviour
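
A sketch of the routing decision, assuming it is made from the Content-Type header alone. The `route` type and `classify` function are hypothetical names, not the PR's classifyContentType(). Sending missing or unparseable types to Path A is an assumption based on "binary" being listed there.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

type route int

const (
	routeStream route = iota // Path A: stream with no buffering
	routePeek                // Path B: peek at the first 4KB, then stream
	routeBuffer              // Path C: buffer fully and scan
)

// classify maps a Content-Type header value to one of the three paths.
func classify(contentType string) route {
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Fall back to the text before any parameters.
		mt = strings.ToLower(strings.TrimSpace(strings.SplitN(contentType, ";", 2)[0]))
	}
	switch {
	case mt == "text/html":
		return routeBuffer
	case strings.HasPrefix(mt, "image/"),
		strings.HasPrefix(mt, "video/"),
		strings.HasPrefix(mt, "audio/"):
		return routePeek
	default:
		// JS, CSS, fonts, JSON, binary, downloads, and anything unknown.
		return routeStream
	}
}

func main() {
	fmt.Println(classify("text/html; charset=utf-8") == routeBuffer)
}
```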

Key changes:
- Add classifyContentType(), streamWithFlusher(), decompressResponseBody()
- DisableCompression: true on transports (end-to-end compression passthrough)
- Accept-Encoding normalized to gzip-only (was unconditionally stripped)
- HEAD requests routed to Path A (no body to scan)
- Content-Length set before WriteHeader() in Path A
- Drip test threshold adjusted (2000ms lower bound)
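
A sketch of a flush-as-you-go copy loop in the spirit of streamWithFlusher(). The names `streamCopy` and `demo` and the 32KB buffer size are assumptions. Flushing after each chunk is what lets slowly dripped upstream data reach the client progressively rather than sitting in the server's write buffer.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// streamCopy copies src to w chunk by chunk and flushes after every write
// when w implements http.Flusher.
func streamCopy(w io.Writer, src io.Reader) (int64, error) {
	flusher, _ := w.(http.Flusher)
	buf := make([]byte, 32*1024)
	var total int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			wn, werr := w.Write(buf[:n])
			total += int64(wn)
			if werr != nil {
				return total, werr
			}
			if flusher != nil {
				flusher.Flush()
			}
		}
		if rerr == io.EOF {
			return total, nil
		}
		if rerr != nil {
			return total, rerr
		}
	}
}

// demo streams body into an in-memory recorder and reports what arrived
// and whether Flush was called.
func demo(body string) (string, bool) {
	rec := httptest.NewRecorder()
	if _, err := streamCopy(rec, strings.NewReader(body)); err != nil {
		return "", false
	}
	return rec.Body.String(), rec.Flushed
}

func main() {
	fmt.Println(demo("hello"))
}
```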

Test results: 86 PASS, 0 FAIL, 9 KNOWN, 1 SKIP
Fixed: §3.6 (Content-Length), §11.2 (10MB download), §12.3 (drip streaming)