Extend the policy prover to reason about permitted flows across multiple connected sandboxes, not just individual sandbox policies in isolation.
Problem
Today OPP verifies a single sandbox policy at a time. In real deployments, agents in separate sandboxes communicate with each other -- one sandbox posts to Slack, another reads from GitHub, a third aggregates and reports to Telegram. The security question becomes: what is the combined allowed flow across the system?
A policy that looks locked down in isolation may still enable unintended end-to-end paths when composed with other sandboxes. For example:
- Sandbox A can read from a database and post to Slack
- Sandbox B can read from Slack and push to GitHub
- Together, data flows from the database to GitHub even though neither sandbox individually allows that
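The example above can be sketched as a small check, assuming each policy is reduced to the set of endpoints a sandbox may read and the set it may write. The `Policy` type, the helper functions, and the endpoint names are hypothetical illustrations, not OPP's actual policy format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # Hypothetical simplification: a policy is only the endpoints
    # a sandbox may read from and write to.
    name: str
    reads: frozenset
    writes: frozenset


def allows_flow(policy, src, dst):
    """True if a single sandbox can move data from src to dst on its own."""
    return src in policy.reads and dst in policy.writes


def composed_flow(first, second, src, dst):
    """True if data can go src -> shared endpoint -> dst across two sandboxes."""
    shared = first.writes & second.reads  # endpoints that connect the two
    return src in first.reads and dst in second.writes and bool(shared)


sandbox_a = Policy("A", frozenset({"database"}), frozenset({"slack"}))
sandbox_b = Policy("B", frozenset({"slack"}), frozenset({"github"}))

# Neither sandbox allows database -> github by itself...
print(allows_flow(sandbox_a, "database", "github"))  # False
print(allows_flow(sandbox_b, "database", "github"))  # False
# ...but the pair does, through the shared Slack endpoint.
print(composed_flow(sandbox_a, sandbox_b, "database", "github"))  # True
```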
Scope
- Accept multiple policy files as input (one per sandbox)
- Model shared endpoints as connection points between sandboxes
- Run reachability analysis across the composed graph
- Surface end-to-end exfiltration and write paths that only exist through composition
- Report which sandbox boundaries the flow crosses
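One possible shape for the analysis in this list is a breadth-first search over a graph whose nodes are endpoints and whose edges are labelled with the sandbox that permits them. This is a minimal sketch under stated assumptions: the `POLICIES` input format and the function names are hypothetical, and it coarsely assumes that anything a sandbox can read may reach anything it can write.

```python
from collections import deque

# Hypothetical input: sandbox name -> (readable endpoints, writable endpoints).
POLICIES = {
    "A": ({"database"}, {"slack"}),
    "B": ({"slack"}, {"github"}),
    "C": ({"github"}, {"telegram"}),
}


def build_edges(policies):
    """Each sandbox contributes edges read_endpoint -> write_endpoint, labelled with its name."""
    edges = {}
    for sandbox, (reads, writes) in sorted(policies.items()):
        for src in sorted(reads):
            for dst in sorted(writes):
                if src != dst:
                    edges.setdefault(src, []).append((dst, sandbox))
    return edges


def composed_flows(policies, source):
    """Breadth-first search from source.

    Returns {endpoint: [sandboxes crossed, in order]} for endpoints whose
    shortest path from source needs more than one sandbox.
    """
    edges = build_edges(policies)
    crossed = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for dst, sandbox in edges.get(node, []):
            if dst not in crossed:
                crossed[dst] = crossed[node] + [sandbox]
                queue.append(dst)
    # Keep only flows that exist through composition.
    return {dst: path for dst, path in crossed.items() if len(path) > 1}


flows = composed_flows(POLICIES, "database")
for dst, path in flows.items():
    print(f"database -> {dst} via sandboxes {' -> '.join(path)}")
```

Because the search is breadth-first, an endpoint that a single sandbox can already reach is recorded with a one-sandbox path and filtered out, so only composition-only flows are reported, along with the ordered list of sandbox boundaries they cross.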
Outcome
Users can answer: "given these N sandboxes and their policies, what end-to-end data flows are possible?" This is especially relevant as multi-agent architectures become more common.
Context
Tracked under OS-43. This is the most forward-looking expansion item, and it depends on single-sandbox verification being solid first.