Skip to content

fix(audit): install SequenceCounter before auth so auth_verify lands seq=1 (closes #174)#176

Merged
initializ-mk merged 1 commit into
mainfrom
fix/issue-174-auth-verify-seq
Jun 16, 2026
Merged

fix(audit): install SequenceCounter before auth so auth_verify lands seq=1 (closes #174)#176
initializ-mk merged 1 commit into
mainfrom
fix/issue-174-auth-verify-seq

Conversation

@initializ-mk

Copy link
Copy Markdown
Contributor

Summary

Closes #174. The `auth_verify` / `auth_fail` audit events were emitted without a `seq` field because the per-invocation `SequenceCounter` was installed downstream of the auth middleware. Operators saw `auth_verify` sitting outside the per-correlation_id sequence, breaking the FWS-8 gap-detection contract.

Symptom (user-visible, v0.15.0)

```
auth_verify (no seq) ← BUG
session_start seq=1
guardrail_check seq=2
llm_call seq=3
...
```

Fix

Install the `SequenceCounter` on `r.Context()` before the auth chain runs, via a thin middleware wrapper. The auth chain's `OnAuth` callback now uses `EmitFromContext(req.Context(), ...)` and stamps `seq=1` on `auth_verify`. The runner's per-A2A-request setup calls a new `coreruntime.EnsureSequenceCounter` helper that detects the existing counter and reuses it, so `session_start` lands `seq=2` and the per-correlation_id sequence is gap-free.

After

```
auth_verify seq=1 ← FIXED
session_start seq=2
guardrail_check seq=3
llm_call seq=4
...
```

Pieces

File Change
`forge-core/runtime/audit_schema.go` New `EnsureSequenceCounter(ctx)` helper — returns ctx unchanged when a counter is present, installs fresh when absent
`forge-cli/runtime/sequence_counter_middleware.go` (new) Thin wrapper around `auth.Middleware` that installs the counter on `r.Context()` before delegating
`forge-cli/runtime/runner.go` (1) `AuthMiddleware: installSequenceCounterMiddleware(auth.Middleware(authCfg))` (2) `makeAuthAuditCallback` switches `Emit` → `EmitFromContext(req.Context(), ...)` (3) Three in-request `WithSequenceCounter(ctx, new(...))` calls → `EnsureSequenceCounter(ctx)`
`forge-cli/runtime/auth_audit_seq_test.go` (new) 5 tests pinning the invariant

What stays as plain `Emit` (intentional, documented in PR #173)

Site Why `Emit` is correct
Egress proxy `OnAttempt` (`source: proxy`) Subprocess HTTP CONNECT has no Go ctx
`mcp_tool_conflict` Startup-time, no invocation scope
Schedule dispatcher (`schedule_fire` / `schedule_complete`) Scheduler tick runs outside any A2A request scope

After this PR, the only "in-scope-but-plain-Emit" pattern that motivated #175 (the lint) is gone. The three remaining sites are genuinely scope-less.

Cost

The wrapper allocates a `SequenceCounter` (~24 bytes) on every HTTP request, including auth-skipped paths (`/.well-known/agent-card.json`, `/healthz`). The counter is unused on those paths but allocation is in the same ballpark as the request struct itself; threading skip-path knowledge into the wrapper would be more complexity than the savings warrant.

End-to-end smoke

`--no-auth` path (wrapper still runs, but auth middleware is a passthrough):

```
session_start seq=1
session_end seq=2
invocation_complete seq=3
```

Auth on (internal-token loopback, like channel-adapter callbacks and CronJob trigger pods):

```
auth_verify seq=1
session_start seq=2
session_end seq=3
invocation_complete seq=4
```

Test plan

  • `go test -count=1 ./...` clean in forge-core and forge-cli
  • `golangci-lint run ./...` → 0 issues
  • `gofmt -w` applied
  • 5 new tests pass:
    • `TestAuthAudit_SeqStampedWhenCounterInstalled` — the Audit emit: stamp seq on auth_verify / auth_fail by installing the sequence counter before the auth middleware #174 pin: counter on req.Context() → auth_verify gets seq=1
    • `TestAuthAudit_NoSeqWhenCounterAbsent` — back-compat: no counter on ctx → omitempty drops the field (legacy embedders that wire their own server.Server without the wrapper still work)
    • `TestSequenceCounterMiddleware_InstallsCounterBeforeNext` — the wrapper installs the counter before delegating to next
    • `TestEnsureSequenceCounter_ReusesExisting` — runner must NOT clobber an auth-installed counter (counter-pointer equality + sequence-continuity assertions)
    • `TestEnsureSequenceCounter_InstallsFresh` — --no-auth / direct-test path: install a fresh counter when ctx has none
  • End-to-end smoke against `/tmp/forge-fix174` binary verifying both paths produce the expected seq layout
  • Existing `TestAuthAudit_*` tests in `auth_audit_test.go` still pass (the back-compat assertion confirms emit shape is unchanged when no counter is installed)

Schema impact

None. The events were already declared to carry `seq` — the field's `omitempty` JSON tag just dropped it when the value was zero. After the fix, consumers see the field they were already expecting on every per-invocation event; legacy consumers that ignored unknown keys see no behavioral change.

Follow-on

When this merges, the third row of the four-row "plain Emit is intentional" table in #175's lint allow-list disappears (auth callback becomes `EmitFromContext`). The three remaining sites — egress proxy `source: proxy`, `mcp_tool_conflict`, scheduler dispatcher — are genuinely scope-less.

…seq=1 (closes #174)

Pre-fix: auth_verify / auth_fail audit events lacked the seq field
because the SequenceCounter was installed downstream (in the runner's
per-A2A-request setup) of the auth middleware. The auth chain's
OnAuth callback used plain AuditLogger.Emit and couldn't reach a
counter that didn't exist yet on req.Context().

Symptom (user-visible on v0.15.0):

  auth_verify           (no seq)   ← BUG
  session_start          seq=1
  guardrail_check        seq=2
  llm_call               seq=3
  ...

The FWS-8 contract promises every event under a given correlation_id
exposes a monotonically increasing seq for gap detection. auth_verify
sitting outside the sequence breaks that.

Fix: install the SequenceCounter on r.Context() BEFORE the auth chain
runs, via a thin middleware wrapper around auth.Middleware in the
runner. The auth callback now uses EmitFromContext(req.Context(), ...)
and stamps seq=1 on auth_verify. The runner's per-A2A-request setup
calls coreruntime.EnsureSequenceCounter (new helper) which detects
the existing counter and reuses it — so session_start lands seq=2
and the sequence is gap-free.

Three pieces:

  - forge-core/runtime/audit_schema.go gains EnsureSequenceCounter:
    returns ctx unchanged when a counter is present, installs a fresh
    one when absent. Used at every invocation-entry point that may
    run downstream of an upstream middleware.

  - forge-cli/runtime/sequence_counter_middleware.go (new) is the
    thin wrapper: composes (next) → authMW(next) → install counter
    → serve. Cost: ~24B counter allocation per request, even on
    skip paths — simpler than threading skip-path knowledge into the
    wrapper, and the alloc is in the same ballpark as the request
    struct.

  - forge-cli/runtime/runner.go:
    * AuthMiddleware: auth.Middleware(authCfg) →
      installSequenceCounterMiddleware(auth.Middleware(authCfg))
    * makeAuthAuditCallback: Emit → EmitFromContext(req.Context())
    * Three in-request WithSequenceCounter(ctx, new(...)) calls →
      EnsureSequenceCounter(ctx). Counter is reused from the auth
      wrapper (auth ON path) or installed fresh (--no-auth path).
    * Scheduler dispatcher's WithSequenceCounter call is left alone
      — it runs outside any HTTP request, no upstream counter.

Three other plain-Emit sites stay as documented in PR #173:

  - source=proxy egress: subprocess HTTP CONNECT has no Go ctx
  - mcp_tool_conflict: startup-time, pre-invocation
  - schedule_fire / schedule_complete: scheduler tick, no A2A scope

End-to-end smoke (--no-auth path):

  session_start          seq=1
  session_end            seq=2
  invocation_complete    seq=3

End-to-end smoke (auth on, internal token):

  auth_verify            seq=1   ← FIXED
  session_start          seq=2
  session_end            seq=3
  invocation_complete    seq=4

Tests:

  - TestAuthAudit_SeqStampedWhenCounterInstalled — the #174 pin: with
    a counter on req.Context(), auth_verify lands seq=1
  - TestAuthAudit_NoSeqWhenCounterAbsent — back-compat: no counter →
    omitempty drops the field (legacy embedders without the wrapper)
  - TestSequenceCounterMiddleware_InstallsCounterBeforeNext — the
    wrapper installs the counter before delegating
  - TestEnsureSequenceCounter_ReusesExisting — runner must not
    clobber the auth-installed counter
  - TestEnsureSequenceCounter_InstallsFresh — --no-auth path: install
    a fresh counter when none is present

Schema impact: none. The events were already declared to carry seq.
@initializ-mk initializ-mk merged commit 6597c21 into main Jun 16, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Audit emit: stamp seq on auth_verify / auth_fail by installing the sequence counter before the auth middleware

1 participant