Summary
Harden Firewall.apply_stream so a sensitive pattern that is split across two
streamed chunks is still detected, rather than evading every per-chunk regex.
Why this matters
apply_stream wraps each chunk in a synthetic RawResult and redacts it
independently. A secret such as a JWT or connection string delivered as "...eyJabc" then
"def.ghi..." matches no single-chunk pattern and reaches the LLM unredacted. As streaming
becomes the default UX for agent tool output, this boundary gap applies to a growing share
of that output.
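A minimal reproduction of the gap, as a sketch only. The JWT-like pattern, the `[REDACTED]` placeholder, and the `redact_per_chunk` helper are illustrative assumptions, not the project's actual patterns or API; the secret is obviously fake.

```python
import re

# Hypothetical JWT-like pattern, for illustration only.
JWT_LIKE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def redact_per_chunk(chunks):
    """Stand-in for stateless redaction: each chunk is scanned on its own."""
    return [JWT_LIKE.sub("[REDACTED]", chunk) for chunk in chunks]


fake_secret = "eyJabc.defghi.jklmno"  # fake value, split mid-token below
chunks = ["token=eyJabc.def", "ghi.jklmno end"]

# Scanning each chunk separately: neither half matches the pattern.
streamed = "".join(redact_per_chunk(chunks))

# Scanning the reassembled text: the pattern matches as intended.
reassembled = JWT_LIKE.sub("[REDACTED]", "".join(chunks))

print(fake_secret in streamed)     # the secret survives per-chunk scanning
print(fake_secret in reassembled)  # the secret is gone when scanned whole
```

The consumer of the stream concatenates the chunks, so the secret is fully recoverable downstream even though no single chunk ever matched.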
Current evidence
- firewall/transform.py: apply_stream() iterates chunks and calls self.transform(...) per chunk; redaction state does not carry across chunks.
- kernel/_stream.py: _stream_chunks delegates per-chunk to apply_stream; docstring notes the firewall is stateless across chunks.
- firewall/redaction.py operates on one value at a time.
External context
Streaming tokenized output is standard; split-token evasion is a recognized class of
filter bypass.
Proposed implementation
- Maintain a small carry-over tail buffer of the last N characters of string content
across chunks for the string-scanning patterns, scanning tail + chunk.
- Only re-emit the newly safe portion; hold back a bounded suffix until the next chunk
(or until the final chunk) so a pattern straddling the boundary is caught.
- Make the buffering configurable and bounded so streaming latency and memory stay
predictable; document the residual risk when a user disables it.
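The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Firewall's implementation: `TailBufferRedactor`, `redact_stream`, `FAKE_JWT`, and the `[REDACTED]` placeholder are hypothetical names, chunks are assumed to be plain strings, and the tail size is assumed to be at least as long as the longest secret the pattern should catch.

```python
import re
from typing import Iterable, Iterator

# Hypothetical pattern for illustration; real patterns live in the firewall.
FAKE_JWT = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


class TailBufferRedactor:
    """Hypothetical helper: holds back a bounded suffix between chunks."""

    def __init__(self, pattern: re.Pattern, tail: int,
                 placeholder: str = "[REDACTED]") -> None:
        if tail < 0:
            raise ValueError("tail must be non-negative")
        self._pattern = pattern
        self._tail = tail
        self._placeholder = placeholder
        self._held = ""  # unemitted suffix carried over from earlier chunks

    def feed(self, chunk: str) -> str:
        """Scan held-back text plus the new chunk; return the newly safe part."""
        buf = self._held + chunk
        cut = max(0, len(buf) - self._tail)  # everything after cut is held
        out, pos = [], 0
        for m in self._pattern.finditer(buf):
            if m.end() > cut:
                # Match reaches into the held region and might still grow:
                # keep it whole and rescan it with the next chunk.
                cut = min(cut, m.start())
                break
            out.append(buf[pos:m.start()])
            out.append(self._placeholder)
            pos = m.end()
        out.append(buf[pos:cut])
        self._held = buf[cut:]
        return "".join(out)

    def flush(self) -> str:
        """Final chunk seen: redact and release whatever is still held."""
        out = self._pattern.sub(self._placeholder, self._held)
        self._held = ""
        return out


def redact_stream(chunks: Iterable[str], pattern: re.Pattern,
                  tail: int) -> Iterator[str]:
    """Drive the redactor over a chunk sequence, flushing at the end."""
    redactor = TailBufferRedactor(pattern, tail)
    for chunk in chunks:
        safe = redactor.feed(chunk)
        if safe:
            yield safe
    rest = redactor.flush()
    if rest:
        yield rest
```

In this sketch the output depends only on the chunk sequence and the configured tail, which keeps it deterministic. The held-back text is at most the tail plus one match that straddles the cut, so memory stays bounded only while matches are no longer than the tail; a secret longer than the tail can still slip through, which is the residual risk to document.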
AI-agent execution notes
- Inspect first: firewall/transform.py (apply_stream), kernel/_stream.py,
tests/test_firewall_stream.py.
- Determinism is required (AGENTS.md), so buffering must be deterministic: the same chunk sequence must always produce the same output.
- Edge cases: final chunk must flush the buffer; binary/non-string payloads unaffected.
Acceptance criteria
- A fake secret split across two chunks is redacted in the streamed Frames.
- The final chunk flushes any held-back content.
- Per-chunk latency stays bounded (buffer size capped).
Test plan
Add split-pattern cases to tests/test_firewall_stream.py. Run make ci.
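One way to structure the split-pattern cases is to try every two-chunk split of a payload rather than a single hand-picked boundary. The sketch below is self-contained and does not call the real Firewall: `failing_splits`, `per_chunk`, and `whole_stream` are hypothetical stand-ins, and the pattern and secret are fake. In the real test file the callable under test would presumably wrap apply_stream.

```python
import re
from typing import Callable, Iterable, List

# Fake secret and hypothetical pattern, for illustration only.
FAKE_SECRET = "eyJabc.defghi.jklmno"
PATTERN = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")
PAYLOAD = "token=" + FAKE_SECRET + " end"
EXPECTED = "token=[REDACTED] end"


def failing_splits(redact: Callable[[List[str]], Iterable[str]],
                   payload: str, expected: str) -> List[int]:
    """Return every split offset at which the joined output is wrong."""
    bad = []
    for i in range(1, len(payload)):
        joined = "".join(redact([payload[:i], payload[i:]]))
        if joined != expected:
            bad.append(i)
    return bad


def per_chunk(chunks: List[str]) -> List[str]:
    """Stand-in for stateless behaviour: scan each chunk independently."""
    return [PATTERN.sub("[REDACTED]", c) for c in chunks]


def whole_stream(chunks: List[str]) -> List[str]:
    """Reference oracle (not streaming): scan the reassembled text."""
    return [PATTERN.sub("[REDACTED]", "".join(chunks))]


# Offsets where the stateless stand-in leaks all or part of the secret.
leaky = failing_splits(per_chunk, PAYLOAD, EXPECTED)
# The oracle should never fail; a fixed apply_stream should match it.
clean = failing_splits(whole_stream, PAYLOAD, EXPECTED)
```

Comparing against the exact expected string, rather than only checking that the full secret is absent, also catches partial leaks where a prefix of the secret is redacted but the remainder is emitted in the next chunk.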
Documentation plan
Document streaming redaction semantics and any residual limitation in
docs/context_firewall.md.
Migration and compatibility notes
Streaming consumers may observe a delay of up to one chunk on the held-back suffix, since it
is released only with the next chunk or at end of stream. Document this behavior. No
migration is expected.
Risks and tradeoffs
Adds bounded state to a deliberately stateless component; keep the buffer small and the
contract documented to avoid surprising latency.
Suggested labels
security, reliability, testing