Merged

Anet #813

48 changes: 48 additions & 0 deletions src/components/NavigationDocs.jsx
@@ -76,6 +76,54 @@ export const docsNavigation = [
      { title: 'CLI', href: '/get-started/cli' },
    ],
  },
  {
    title: 'AGENT NETWORK',
    links: [
      { title: 'What is Agent Network?', href: '/agent-network' },
      { title: 'How It Works', href: '/agent-network/how-it-works' },
      { title: 'Quickstart', href: '/agent-network/quickstart' },
      { title: 'Providers', href: '/agent-network/providers' },
      {
        title: 'Policies',
        href: '/agent-network/policies',
        links: [
          {
            title: 'Token & Budget Limits',
            href: '/agent-network/policies/limits',
          },
          { title: 'Guardrails', href: '/agent-network/policies/guardrails' },
        ],
      },
      {
        title: 'Usage & Logs',
        href: '/agent-network/usage-and-logs',
        links: [
          {
            title: 'Usage Overview',
            href: '/agent-network/usage-and-logs/usage-overview',
          },
          {
            title: 'Access Logs',
            href: '/agent-network/usage-and-logs/access-logs',
          },
          {
            title: 'Log Collection & Retention',
            href: '/agent-network/usage-and-logs/log-collection',
          },
        ],
      },
      { title: 'Global Limits', href: '/agent-network/global-limits' },
      {
        title: 'Integrations',
        href: '/agent-network/integrations',
        links: [
          { title: 'Claude Code', href: '/agent-network/integrations/claude-code' },
          { title: 'Codex', href: '/agent-network/integrations/codex' },
          { title: 'LiteLLM', href: '/agent-network/integrations/litellm' },
        ],
      },
    ],
  },
  {
    title: 'MANAGE NETBIRD',
    links: [
70 changes: 70 additions & 0 deletions src/pages/agent-network/global-limits.mdx
@@ -0,0 +1,70 @@
export const description =
'Account-wide token and spend caps applied across every Agent Network policy, scoped to groups or users or left account-wide.'

# Global Limits

Global limits are account-wide caps on token usage and spend that apply across
**every** policy and **every provider** — a backstop independent of any single policy's
limits. They are **limit-only** rules: unlike a policy, a global limit never selects a
provider or authorizes traffic, and it isn't tied to one. It caps a caller's total
consumption no matter which provider or gateway the request is routed to.

<p>
<img src="/docs-static/img/agent-network/global-limits/agent-network-global-limits-list.png" alt="agent network global limits list" className="imagewrapper-big" />
</p>

A global limit acts as an **always-on ceiling**. It is evaluated before any policy,
on every request, and every applicable rule must pass. Because rules can only tighten a
caller's effective limit and never loosen it, adding one never widens access.

## Scope

Each rule targets who it applies to:

- **Target groups** — the rule binds when the caller's groups intersect the rule's groups.
- **Target users** — the rule binds a specific user directly.
- **Untargeted** — a rule with no target groups or users applies to **every** caller (the
account-wide default).

A request can be bound by several rules at once (for example an account-wide rule plus a
group-specific one); all of them are enforced.
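
The scoping rules above can be sketched as a small predicate. This is an illustration only: the `rule` and `caller` shapes (`targetGroups`, `targetUsers`, `userId`, `groups`) are hypothetical names chosen for the example, not NetBird's data model.

```js
// Hypothetical shapes, for illustration only:
//   rule   = { name, targetGroups: string[], targetUsers: string[] }
//   caller = { userId: string, groups: string[] }
function ruleBindsCaller(rule, caller) {
  const groups = rule.targetGroups ?? []
  const users = rule.targetUsers ?? []

  // Untargeted rule: applies to every caller (the account-wide default).
  if (groups.length === 0 && users.length === 0) return true

  // Target users: the rule binds a specific user directly.
  if (users.includes(caller.userId)) return true

  // Target groups: the rule binds when the two group sets intersect.
  return groups.some((g) => caller.groups.includes(g))
}

// All binding rules are enforced, so collect every match, not just the first.
function bindingRules(rules, caller) {
  return rules.filter((rule) => ruleBindsCaller(rule, caller))
}

const rules = [
  { name: 'account-wide', targetGroups: [], targetUsers: [] },
  { name: 'engineering', targetGroups: ['eng'], targetUsers: [] },
  { name: 'alice-only', targetGroups: [], targetUsers: ['alice'] },
]
const bound = bindingRules(rules, { userId: 'bob', groups: ['eng', 'ops'] })
console.log(bound.map((r) => r.name)) // [ 'account-wide', 'engineering' ]
```

Here a caller in the `eng` group is bound by both the untargeted rule and the group rule at once, which matches the "several rules at once" case described above.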

## Caps and Windows

A global limit carries the same cap shape as a [policy limit](/agent-network/policies/limits):

- A **token cap** and/or a **budget (USD) cap**.
- Each cap can be set **per user** and/or **per group**, over a fixed **window**.

Windows and counting work exactly as described in
[Token & Budget Limits](/agent-network/policies/limits#how-the-window-works): caps apply to
a fixed, epoch-aligned window, and the check is run **before** the request against usage
already accumulated — so a request that starts under the cap is allowed even if it crosses
it, and the next one is blocked.
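
A minimal sketch of that behavior, under two assumptions that the text does not confirm: windows are aligned to the Unix epoch, and a cap counts as exhausted once accumulated usage reaches it. The function names and numbers are made up for the example.

```js
// Start of the fixed window containing `nowSec`, assuming windows are
// aligned to the Unix epoch (an assumption made for this sketch).
function windowStart(nowSec, windowSec) {
  return Math.floor(nowSec / windowSec) * windowSec
}

// The check runs before the request, against usage already accumulated.
// Treating "used >= cap" as exhausted is an assumption.
function allowRequest(usedSoFar, cap) {
  return usedSoFar < cap
}

const cap = 1000 // tokens per window (hypothetical)
let used = 900

// Request 1 starts under the cap, so it is allowed even though it crosses it.
const first = allowRequest(used, cap)
used += 250 // usage is added once the response has been metered

// Request 2 is checked against 1150 accumulated tokens and is blocked.
const second = allowRequest(used, cap)

console.log(first, second) // true false
console.log(windowStart(7384, 3600)) // 7200
```

Because every instant maps to exactly one window start, two nodes computing `windowStart` for the same timestamp agree on which counter to update.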

## How Enforcement Works

On every request, NetBird first evaluates all global limits that bind the caller. Each
applicable rule must pass; the first rule whose token or budget cap is exhausted denies the
request, before any policy is considered. The denial surfaces in the
[access logs](/agent-network/usage-and-logs/access-logs) as a **Token limit exceeded** or
**Budget limit exceeded** reason. For per-group caps, usage is attributed to the lowest
matching target group.
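
The evaluation order described above could look roughly like the following. The rule and usage shapes are hypothetical; only the two denial reasons are taken from the text.

```js
// Hypothetical shapes, for illustration only:
//   rule  = { name, tokenCap?, budgetCapUsd? }
//   usage = { [ruleName]: { tokens, spendUsd } } for the current window
function evaluateGlobalLimits(rules, usage) {
  for (const rule of rules) {
    const used = usage[rule.name] ?? { tokens: 0, spendUsd: 0 }
    if (rule.tokenCap !== undefined && used.tokens >= rule.tokenCap) {
      return { allowed: false, reason: 'Token limit exceeded', rule: rule.name }
    }
    if (rule.budgetCapUsd !== undefined && used.spendUsd >= rule.budgetCapUsd) {
      return { allowed: false, reason: 'Budget limit exceeded', rule: rule.name }
    }
  }
  // Every applicable rule passed; policy evaluation can proceed.
  return { allowed: true }
}

const limitRules = [
  { name: 'account-wide', tokenCap: 1_000_000 },
  { name: 'engineering', budgetCapUsd: 50 },
]
const decision = evaluateGlobalLimits(limitRules, {
  'account-wide': { tokens: 400_000, spendUsd: 12 },
  engineering: { tokens: 90_000, spendUsd: 50 },
})
console.log(decision)
// { allowed: false, reason: 'Budget limit exceeded', rule: 'engineering' }
```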

## How It Combines with Policy Limits

Global limits and [policy limits](/agent-network/policies/limits) are enforced together: a
request must pass **both** the global ceiling and the selected policy's own limits. Because
every applicable cap binds and the most restrictive one wins, a global limit can only
tighten what a policy allows, never widen it. Use global limits to set an overall account
ceiling, and policy limits for finer-grained, per-policy control.

## Create a Global Limit

Go to **Agent Network → Configuration → Global Limits** and add a rule with the token
and/or budget caps and an optional target. Leave the target empty to apply it account-wide,
or pick groups or users to scope it.

<p>
<img src="/docs-static/img/agent-network/global-limits/agent-network-create-global-limit.png" alt="create global limit modal with token and budget caps" className="imagewrapper" />
</p>
237 changes: 237 additions & 0 deletions src/pages/agent-network/how-it-works.mdx
@@ -0,0 +1,237 @@
import { Note } from '@/components/mdx'

export const description =
'How NetBird Agent Network works: the architecture behind keyless, identity-based access to LLM APIs and internal resources, and the lifecycle of a single agent request from the tunnel through routing, policy, and key injection to the upstream provider.'

# How Agent Network Works

Agent Network gives every agent a real identity and governs what it can reach over
NetBird's encrypted overlay. It works along two paths, depending on what the agent is
calling:

- **LLM APIs and AI gateways** are reached through a single **agent network endpoint**
served by the NetBird proxy, which sits between your agents and the APIs they call. You
point your agent at that endpoint instead of the provider's URL. The proxy ties each
request to an identity and evaluates it against your policies, enforcing token and
budget limits, quotas, and model guardrails. It also injects the upstream provider key
server-side, forwards the request, and records usage and cost for every call.
- **Internal resources**, such as databases, internal APIs, and self-hosted models, are
reached directly over **peer-to-peer WireGuard tunnels**, the same way any NetBird peer
reaches another. This traffic is governed by the same identities and access policies
but does not pass through the proxy, so there is no endpoint or key injection — the
agent connects straight to the resource over the overlay.

## Architecture

Agent Network is built on two existing NetBird capabilities: the **overlay network**
(an encrypted WireGuard mesh between peers) and the **reverse proxy** (a peer that
terminates requests and forwards them to upstreams). Around those, the management
service adds an identity-aware control plane for AI traffic.

### LLM APIs and AI Gateways

The diagram below illustrates the **first path** — an LLM request: the agent reaches the
endpoint over the WireGuard overlay, the proxy enforces identity, policies, limits, and guardrails
against the management control plane, injects the provider key, and forwards to the
upstream API or gateway. The proxy can also inject the calling agent's identity into the
request, so the gateway itself can attribute usage and enforce its own limits based on the
agent's group membership. For example, with a LiteLLM gateway it writes the agent's IdP groups
into `metadata.tags` and its identity into the `x-litellm-end-user-id` header, so LiteLLM
can apply tag budgets and per-user attribution.
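
As a rough illustration of the LiteLLM case, the helper below copies a caller's identity into an outgoing request. The `request` and `caller` shapes and the function name are assumptions made for this sketch; only `metadata.tags` and `x-litellm-end-user-id` come from the description above. Whether existing client-supplied tags are replaced or merged is not stated, so the sketch simply replaces them.

```js
// Hypothetical shapes, for illustration only:
//   request = { headers: object, body: object }
//   caller  = { identity: string, groups: string[] }
function stampIdentityForLiteLLM(request, caller) {
  return {
    ...request,
    headers: { ...request.headers, 'x-litellm-end-user-id': caller.identity },
    body: {
      ...request.body,
      // Replaces any existing tags (an assumption, see the note above).
      metadata: { ...(request.body.metadata ?? {}), tags: [...caller.groups] },
    },
  }
}

const stamped = stampIdentityForLiteLLM(
  { headers: { 'content-type': 'application/json' }, body: { model: 'gpt-4o' } },
  { identity: 'alice@example.com', groups: ['eng', 'ml-research'] }
)
console.log(stamped.headers['x-litellm-end-user-id']) // alice@example.com
console.log(stamped.body.metadata.tags) // [ 'eng', 'ml-research' ]
```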

<p>
<img src="/docs-static/img/agent-network/how-it-works/agent-network-diagram-llm-apis.png" alt="agent network LLM request path through the NetBird proxy" className="imagewrapper-big" />
</p>

- **NetBird client** — the agent's device joins the overlay as a peer. Its requests to
the endpoint are routed through the WireGuard tunnel, not the public internet.
- **Proxy peer** — handles LLM traffic only. It terminates the request, establishes
the caller's identity, runs the routing and policy pipeline, injects the provider key,
and forwards to the upstream API or gateway.
- **Management service** — the control plane. It holds providers, policies, guardrails,
and limits; resolves identities against your IdP; answers the proxy's per-request
policy checks; and records usage and access logs.
- **Identity provider** — your existing IdP (Okta, Microsoft Entra ID, Google, …)
supplies the identities and group memberships that policies are written against.
- **Upstreams** — for LLM traffic, the proxy forwards to LLM APIs and AI gateways.

The endpoint hostname itself (for example `https://sailcloth.netbird.ai`) is generated
when you connect your first provider and is only reachable from inside your overlay.
Comment on lines +58 to +59

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Use a placeholder hostname instead of a concrete endpoint.

Please replace https://sailcloth.netbird.ai with a placeholder-style value (for example, https://<your-agent-endpoint>.netbird.ai) in public docs.

As per coding guidelines, “Never include real customer names, internal hostnames, IPs, or production credentials in MDX documentation or public/ files — use placeholders instead”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/pages/agent-network/how-it-works.mdx` around lines 58 - 59, Replace the
concrete endpoint example in how-it-works.mdx with a placeholder hostname.
Update the sentence in the public docs content to use a generic value like
https://<your-agent-endpoint>.netbird.ai instead of the real hostname, keeping
the rest of the explanation the same. Use the existing documentation text in the
page body as the location to edit and avoid any real customer or internal
identifiers.

Source: Coding guidelines

It applies to LLM traffic only; internal resources keep their normal peer addresses on the
overlay.

### Internal Resources

The **second path** covers everything that isn't an LLM API — internal databases,
internal APIs, and self-hosted models on a GPU host. Here the proxy is not involved at
all. The agent connects to the target's overlay address **directly over a peer-to-peer
WireGuard tunnel**, exactly the way any NetBird peer reaches another. Access is still
identity-based: the agent's peer identity and group membership are matched against your
access policies, so it can reach only the resources it is authorized for. Because the
traffic never passes through the proxy, this path has no agent network endpoint, no
provider-key injection, and no token, budget, or per-request LLM logging — it is governed
like standard NetBird peer-to-peer access. This keeps internal traffic fast and private,
flowing straight between the two peers. Because NetBird is a peer-to-peer network, this
also works in reverse, so a resource can reach back to an agent when needed, such as to
deliver a callback or webhook.

<p>
<img src="/docs-static/img/agent-network/how-it-works/agent-network-diagram-internal-resources.png"
alt="agent network internal resource request path through WireGuard overlay" className="imagewrapper-big" />
</p>

## The Lifecycle of an LLM Request

This pipeline applies to LLM traffic — requests to the agent network endpoint.
Access to internal resources skips it entirely and flows peer-to-peer (see [Internal
Resources](#internal-resources)).

The proxy runs each request through an ordered chain of middleware. On the way to the
upstream:

1. **Establish identity.** The request arrives over the WireGuard tunnel, so the proxy
maps it to the calling NetBird peer and its identity — tied to your IdP for a human
user, or the peer's own NetBird identity for an autonomous agent — together with its
group membership. See [Identity and Authentication](#identity-and-authentication).
2. **Parse the request.** Read the target model and stream flag from the body, and
capture the prompt if prompt collection is enabled.
3. **Route and inject the key.** Match the model to a provider the caller's groups are
authorized to use, rewrite the upstream target, strip any client-supplied auth headers,
and inject the provider's key from server-side storage — see
[Routing](#routing-matching-a-request-to-a-provider) and [Keyless Access](#keyless-access).
4. **Check policy and limits.** Ask management to select the matching policy and evaluate
account- and policy-level token and budget caps. If the caller is unauthorized or a cap
is exhausted, the request is denied here. See [Policies, Limits, and
Guardrails](#policies-limits-and-guardrails).
5. **Stamp identity for the gateway.** Add the caller's identity to the upstream request
(for example into `metadata.tags` and `x-litellm-end-user-id`) for gateways that key
their own budgets and attribution off it.
6. **Apply guardrails.** Enforce the model allowlist and the prompt-capture rules.

The request is then forwarded to the upstream API or gateway. On the response leg, in
reverse:

7. **Meter.** Extract token counts from the response and convert them to cost.
8. **Record.** Post the usage back to management to update the limit counters. Usage is
always recorded; a full access-log entry is written when log collection is on — see
[Usage and Access Logs](#usage-and-access-logs).

A denial at any gate returns `403` to the client with a machine-readable reason, and the
request is still recorded so it appears in your logs.
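
The gate-by-gate behavior can be pictured as a loop that stops at the first denial. This is a simplified sketch, not the proxy's implementation: the `{ name, check }` gate shape, the result objects, and the in-memory `log` array are all hypothetical.

```js
// Each gate is a hypothetical { name, check } pair. `check` returns null to
// let the request continue, or a denial reason string to stop it.
function runPipeline(gates, request, log) {
  for (const gate of gates) {
    const reason = gate.check(request)
    if (reason !== null) {
      // Denied requests are still recorded so they show up in the logs.
      log.push({ model: request.model, denied: true, reason })
      return { forwarded: false, status: 403, reason }
    }
  }
  log.push({ model: request.model, denied: false })
  return { forwarded: true }
}

const gates = [
  { name: 'route', check: (r) => (r.model === 'gpt-4o' ? null : 'Model not available') },
  { name: 'limits', check: (r) => (r.usedTokens < 1000 ? null : 'Token limit exceeded') },
]
const log = []
const served = runPipeline(gates, { model: 'gpt-4o', usedTokens: 10 }, log)
const denied = runPipeline(gates, { model: 'gpt-4o', usedTokens: 5000 }, log)
console.log(served) // { forwarded: true }
console.log(denied) // { forwarded: false, status: 403, reason: 'Token limit exceeded' }
```

Both calls leave an entry in `log`, which mirrors the point that denials are recorded as well as served requests.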

## Identity and Authentication

Every request is tied to a real identity before any policy runs, and that identity always
comes from the **NetBird tunnel**. Because the request arrives over WireGuard, the proxy
maps its source to the enrolled peer and resolves the peer's NetBird identity and group
membership:

- For a **human user** — for example someone running Claude Code — the NetBird identity is
tied to your identity provider (Okta, Microsoft Entra ID, Google, …), so the request
carries that user and the groups they belong to.
- For an **autonomous agent**, the identity is the agent's own NetBird peer identity and
the groups assigned to that peer.

Either way the request carries a real identity and its **group membership**, captured at
request time. There is no API key or separate login on the client — the tunnel is the
credential. Policies are written against those groups, so access to AI follows the same
identities your organization already manages.

## Routing: Matching a Request to a Provider

A request names a model (for example `claude-opus-4-8` or `gpt-4o`). The router picks the
provider to serve it by:

1. **Model claim.** Keeping providers whose allowed-models list includes the requested
model. A provider with no model list acts as a catch-all gateway.
2. **Group authorization.** Keeping only providers the caller's groups are allowed to
reach. This authorization is compiled from your policies, so a provider is reachable
only where a policy grants it.
3. **Specificity.** Preferring a same-vendor, explicitly claimed model over a catch-all
gateway.

If no provider claims the model, the request is denied as **model not available**. If a
provider claims it but the caller's groups aren't authorized, it's denied as **no
authorized provider**. When a route is found, the proxy records which configured provider
was selected and which groups authorized it.
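
A simplified sketch of the three routing steps follows. The provider shape (`models`, `allowedGroups`) is hypothetical, and the specificity step is reduced to "explicit claim beats catch-all"; the same-vendor preference is not modeled here.

```js
// Hypothetical provider shape, for illustration only:
//   { name, models: string[], allowedGroups: string[] }
// An empty `models` list marks a catch-all gateway.
function routeRequest(providers, model, callerGroups) {
  // 1. Model claim: keep providers that list the model, plus catch-alls.
  const claiming = providers.filter((p) => p.models.length === 0 || p.models.includes(model))
  if (claiming.length === 0) return { denied: 'Model not available' }

  // 2. Group authorization: keep providers the caller's groups may reach.
  const authorized = claiming.filter((p) => p.allowedGroups.some((g) => callerGroups.includes(g)))
  if (authorized.length === 0) return { denied: 'No authorized provider' }

  // 3. Specificity: prefer an explicit claim over a catch-all gateway.
  const explicit = authorized.find((p) => p.models.includes(model))
  return { provider: (explicit ?? authorized[0]).name }
}

const providers = [
  { name: 'anthropic-direct', models: ['claude-opus-4-8'], allowedGroups: ['eng'] },
  { name: 'gateway', models: [], allowedGroups: ['eng', 'ops'] },
]
console.log(routeRequest(providers, 'claude-opus-4-8', ['eng'])) // { provider: 'anthropic-direct' }
console.log(routeRequest(providers, 'gpt-4o', ['ops'])) // { provider: 'gateway' }
console.log(routeRequest(providers, 'gpt-4o', ['guests'])) // { denied: 'No authorized provider' }
```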

## Policies, Limits, and Guardrails

Routing decides *where* a request can go; policies decide *whether it may* and *under what
budget*. By default nothing is allowed — a policy must connect a **source group** to one
or more **providers**.

At request time, management evaluates, in order:

- **Account ceilings.** Account-wide budget rules are checked first. If an account-level
token or budget cap is exhausted, the request is denied regardless of policy.
- **Applicable policies.** Among enabled policies whose providers include the selected
provider and whose source groups intersect the caller's groups, management picks one to
attribute the request to. Uncapped policies and larger remaining budgets are preferred,
with deterministic tie-breaking, so requests drain the most appropriate bucket first.
- **Limits.** A policy may cap **tokens** or **spend** per user and/or per group over a
fixed time window. Usage is accumulated in windowed counters aligned to a fixed epoch,
so the same totals hold across a clustered deployment.
- **Guardrails.** A policy can attach guardrails such as a **model allowlist** (reject
models outside the list) and **prompt capture** controls.
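
The policy-selection preference could be sketched as a filter followed by a sort. Everything here is an assumption for illustration: the policy shape, the use of `remaining: null` to mean uncapped, and the tie-break on `id` (the text says only that tie-breaking is deterministic, not what it keys on).

```js
// Hypothetical policy shape, for illustration only:
//   { id, enabled, providers: string[], sourceGroups: string[], remaining }
// `remaining` is the budget left in the window, or null when uncapped.
function selectPolicy(policies, providerName, callerGroups) {
  const candidates = policies.filter(
    (p) =>
      p.enabled &&
      p.providers.includes(providerName) &&
      p.sourceGroups.some((g) => callerGroups.includes(g))
  )
  candidates.sort((a, b) => {
    const ra = a.remaining === null ? Infinity : a.remaining
    const rb = b.remaining === null ? Infinity : b.remaining
    // Uncapped first, then larger remaining budget first.
    if (ra !== rb) return rb > ra ? 1 : -1
    // Deterministic tie-break on id (an assumption for this sketch).
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0
  })
  return candidates[0] ?? null
}

const policies = [
  { id: 'p-capped', enabled: true, providers: ['openai'], sourceGroups: ['eng'], remaining: 500 },
  { id: 'p-uncapped', enabled: true, providers: ['openai'], sourceGroups: ['eng'], remaining: null },
  { id: 'p-disabled', enabled: false, providers: ['openai'], sourceGroups: ['eng'], remaining: null },
]
console.log(selectPolicy(policies, 'openai', ['eng']).id) // p-uncapped
```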

Each denial carries a reason that surfaces in the access log:

| Reason | Meaning |
| --- | --- |
| Model not available | No provider is configured to serve the requested model |
| No authorized provider | A provider serves the model, but the caller's groups aren't allowed |
| Model not allowed | A guardrail's model allowlist rejected the model |
| Token limit exceeded | A policy or account token cap is exhausted for the window |
| Budget limit exceeded | A policy or account spend cap is exhausted for the window |

See [Policies](/agent-network/policies) and [Global Limits](/agent-network/global-limits)
for how to configure these.

## Keyless Access

Provider API keys live only on the server. When you connect a provider, its key is stored
encrypted by the management service. During a request the proxy **strips** any
client-supplied authorization headers (`Authorization`, `x-api-key`, and similar) and
**injects** the provider's key on the way to the upstream.

The practical effect: agents authenticate to NetBird with their NetBird identity, never
with a provider key. Keys can't leak from a client because clients never hold them, and
rotating a provider key is a single server-side change.
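
The strip-then-inject step might look like this in outline. The `provider` shape (`authHeader`, `authValue`) is hypothetical, and the list of stripped headers contains only the two names the text gives; the "and similar" headers are not enumerated in the source, so they are left out.

```js
// Header names listed in the text. The "and similar" set is not enumerated
// there, so this list is deliberately incomplete.
const CLIENT_AUTH_HEADERS = ['authorization', 'x-api-key']

// `provider` is a hypothetical { authHeader, authValue } record that stands
// in for the key held server-side.
function injectProviderKey(headers, provider) {
  const out = {}
  for (const [name, value] of Object.entries(headers)) {
    // Strip client-supplied credentials; HTTP header names are case-insensitive.
    if (!CLIENT_AUTH_HEADERS.includes(name.toLowerCase())) out[name] = value
  }
  // Inject the server-side key on the way to the upstream.
  out[provider.authHeader] = provider.authValue
  return out
}

const upstreamHeaders = injectProviderKey(
  { Authorization: 'Bearer client-supplied', 'Content-Type': 'application/json' },
  { authHeader: 'x-api-key', authValue: 'server-side-secret' }
)
console.log(upstreamHeaders)
// { 'Content-Type': 'application/json', 'x-api-key': 'server-side-secret' }
```

Whatever the client sends in its own auth headers is discarded, which is why a client never needs to hold a provider key.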

## Usage and Access Logs

Agent Network separates lightweight accounting from full audit detail:

- **Usage** is recorded for **every** served request — identity, provider, model, tokens,
and cost — regardless of any logging setting. This always-on stream powers the usage
dashboards and the limit counters, and is retained indefinitely.
- **Access logs** add the full per-request detail (method, path, status, duration, and —
when prompt capture is on — the prompt and completion). Full access-log entries are
written only when **log collection** is enabled for the account, and are swept after a
configurable **retention period**. Prompts can be redacted for PII.
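
The split between the two streams can be sketched as follows. The in-memory `store`, the `settings` flags (`logCollection`, `promptCapture`), and the request fields are hypothetical names chosen for the example, as is the sample path.

```js
// Hypothetical in-memory stores and settings flags, for illustration only.
function recordRequest(store, settings, req) {
  // Usage is always recorded, regardless of any logging setting.
  store.usage.push({
    identity: req.identity,
    provider: req.provider,
    model: req.model,
    tokens: req.tokens,
    costUsd: req.costUsd,
  })

  // Full access-log entries are written only when log collection is enabled.
  if (!settings.logCollection) return
  const entry = { method: req.method, path: req.path, status: req.status, durationMs: req.durationMs }
  if (settings.promptCapture) {
    entry.prompt = req.prompt
    entry.completion = req.completion
  }
  store.accessLogs.push(entry)
}

const sample = {
  identity: 'alice@example.com', provider: 'openai', model: 'gpt-4o',
  tokens: 420, costUsd: 0.01, method: 'POST', path: '/example/path',
  status: 200, durationMs: 850, prompt: 'hello', completion: 'hi',
}
const store = { usage: [], accessLogs: [] }
recordRequest(store, { logCollection: false, promptCapture: false }, sample)
recordRequest(store, { logCollection: true, promptCapture: false }, sample)
console.log(store.usage.length, store.accessLogs.length) // 2 1
```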

See [Usage & Logs](/agent-network/usage-and-logs) for the dashboards and controls.

## The Overlay Network

The transport underneath all of this is NetBird's WireGuard overlay. The agent's device
is a peer, the proxy is a peer, and connections are established **directly between peers**.
Because WireGuard is UDP-based and peer-to-peer, the overlay traverses NAT and firewalls
without opening inbound ports, changing security groups, or altering network topology.

This is also where the two paths differ:

- **LLM traffic** rides the overlay to reach the **proxy** peer, which then applies the
pipeline above and forwards to the upstream API or gateway.
- **Internal resources** — databases, APIs, and self-hosted models — are reached over a
**direct peer-to-peer tunnel** between the agent and the target peer, with no proxy in
between. Access is governed by the same identities and access policies as any other
NetBird peer, so an agent reaches only the resources its identity is allowed to.

## Next steps

- [Quickstart](/agent-network/quickstart). Deploy Agent Network and make your first keyless call.
- [Providers](/agent-network/providers). Connect LLM APIs, gateways, and local models.
- [Policies](/agent-network/policies). Authorize identities and attach limits and guardrails.
- [Usage & Logs](/agent-network/usage-and-logs). Track cost, usage, and per-request audit.