broker: retire /v1/mint-aws-creds (issue #71 Option A end-state) #72

@hanwencheng

Context

Issue #71 outlined a two-stage migration:

  • Option B (landed in b0c6515 / c54a69b / a9f3330 on evm): pivot /v1/mint-aws-creds internals to AssumeRoleWithWebIdentity so the endpoint survives the cloud-setup.md §4 federation. The wire shape is preserved, and deployments that ran §4 no longer get AccessDenied from the integrated path.
  • Option A (this issue): migrate the callers, then retire the route. Daemons fetch /v1/mint-oidc-jwt and call AssumeRoleWithWebIdentity client-side. The endpoint goes away.

The caller-side migration is already done as part of the Option B commits:

  • crates/agentkeys-provisioner/src/aws_creds.rs::fetch_via_broker_default_ttl — fetches OIDC JWT, does STS client-side via aws-sdk-sts with anonymous credentials.
  • crates/agentkeys-mcp/src/lib.rs::McpHandler::broker_env_for_provision — uses the new helper.
  • crates/agentkeys-cli/src/lib.rs::broker_env_for_provision — same.
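
For orientation, the client-side flow those helpers follow can be sketched roughly as below. This is a std-only illustration and not the code in aws_creds.rs: `JwtSource`, `StsExchange`, `AwsCreds`, and `fetch_creds` are hypothetical names, and the STS call (made through aws-sdk-sts with anonymous credentials in the real helper) is hidden behind a trait here.

```rust
/// Temporary credentials from an STS-style exchange (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct AwsCreds {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub session_token: String,
}

/// Stand-in for the broker call to /v1/mint-oidc-jwt, which takes only the bearer.
pub trait JwtSource {
    fn mint_oidc_jwt(&self, bearer: &str) -> Result<String, String>;
}

/// Stand-in for AssumeRoleWithWebIdentity. The JWT itself is the proof of
/// identity, so the caller needs no AWS credentials of its own.
pub trait StsExchange {
    fn assume_role_with_web_identity(
        &self,
        role_arn: &str,
        session_name: &str,
        jwt: &str,
    ) -> Result<AwsCreds, String>;
}

/// Step 1: get a short-lived OIDC JWT from the broker.
/// Step 2: trade it for temporary AWS credentials, client-side.
pub fn fetch_creds<J: JwtSource, S: StsExchange>(
    broker: &J,
    sts: &S,
    bearer: &str,
    role_arn: &str,
    session_name: &str,
) -> Result<AwsCreds, String> {
    let jwt = broker.mint_oidc_jwt(bearer)?;
    sts.assume_role_with_web_identity(role_arn, session_name, &jwt)
}
```

The point of the shape is that the broker never touches AWS: it only appears in step 1, which is what lets it become a pure JWT signer.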

Production daemons no longer call /v1/mint-aws-creds. The route still exists for callers who want server-side gates (audit + grants + idempotency + multi-anchor coordination) but has no in-tree caller as of a9f3330.

Goal

Delete the route and its handler. Broker becomes a pure JWT signer — zero AWS principals at runtime, single mint path. Compromise blast radius drops to "OIDC signing key only."

What's in scope

  1. Drop the route registration in crates/agentkeys-broker-server/src/lib.rs:39:
    .route("/v1/mint-aws-creds", post(handlers::mint::mint_aws_creds))
  2. Drop the entire mint_aws_creds + mint_v2 handler in crates/agentkeys-broker-server/src/handlers/mint.rs (~700 LOC including: body parsing, EIP-191 per-call sig verification, grant resolution / consume, audit anchor write loop, response shaping, helpers).
  3. Delete crates/agentkeys-broker-server/tests/mint_v2_flow.rs (the only test suite that exercises this endpoint).
  4. Decide what happens to the policy gates that lived inside mint_v2:
    • Audit anchor write per mint. /v1/mint-oidc-jwt already audits the JWT mint via state.audit.record_mint(...). Plus AWS CloudTrail records every AssumeRoleWithWebIdentity call with the assumed role + session name. Two audit sources are arguably better than one. Multi-anchor (sqlite + EVM) coordination has no daemon-side equivalent today — it goes away with the endpoint.
    • Phase B explicit-grant enforcement. try_consume(grant_id) was the policy gate. Options:
      • (a) Move grant check to /v1/mint-oidc-jwt time. Requires the JWT request to carry an intent (service + scope_path) that the grant table can match. Currently /v1/mint-oidc-jwt takes only the bearer.
      • (b) Encode grant outcome into the JWT (scope claim, max_uses → JWT TTL) and let AWS bucket policy enforce. Limits granularity.
      • (c) Drop server-side grant enforcement entirely; rely on AWS PrincipalTag + bucket policy for isolation.
    • Idempotency-Key dedup. Currently keyed on body hash + key. Options:
      • (a) Move to /v1/mint-oidc-jwt keyed on bearer hash + key. Functional, but of limited value because the JWT is already short-lived (5min default).
      • (b) Drop. Daemons can dedup client-side via the JWT cache.
    • Per-OmniAccount rate limiting (MintRateLimiter::check_mint). Move to /v1/mint-oidc-jwt. Same code, different call site.
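
To make the rehoming question concrete, here is a rough sketch of what grant option (a) plus the rate-limit move could look like at JWT-mint time. Everything in it is an assumption for illustration: `Intent`, `Grants`, `RateLimiter`, `GateError`, and `gate_jwt_mint` are hypothetical names, the fixed-window limiter is a stand-in for whatever MintRateLimiter actually does, and idempotency is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Option (a): the JWT request carries an intent the grant table can match.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Intent {
    pub service: String,
    pub scope_path: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    RateLimited,
    NoGrant,
    GrantExhausted,
}

/// Hypothetical grant table: remaining uses per (account, intent).
#[derive(Default)]
pub struct Grants {
    remaining: HashMap<(String, Intent), u32>,
}

impl Grants {
    pub fn allow(&mut self, account: &str, intent: Intent, max_uses: u32) {
        self.remaining.insert((account.to_string(), intent), max_uses);
    }

    /// Consume one use of a matching grant, or say why that is not possible.
    pub fn try_consume(&mut self, account: &str, intent: &Intent) -> Result<(), GateError> {
        let key = (account.to_string(), intent.clone());
        let left = self.remaining.get_mut(&key).ok_or(GateError::NoGrant)?;
        if *left == 0 {
            return Err(GateError::GrantExhausted);
        }
        *left -= 1;
        Ok(())
    }
}

/// Fixed-window per-account limiter (an assumed policy, not the real one).
pub struct RateLimiter {
    max_per_window: u32,
    window: Duration,
    seen: HashMap<String, (Instant, u32)>,
}

impl RateLimiter {
    pub fn new(max_per_window: u32, window: Duration) -> Self {
        Self { max_per_window, window, seen: HashMap::new() }
    }

    pub fn check_mint(&mut self, account: &str, now: Instant) -> Result<(), GateError> {
        let entry = self.seen.entry(account.to_string()).or_insert((now, 0));
        // Start a fresh window once the previous one has elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 >= self.max_per_window {
            return Err(GateError::RateLimited);
        }
        entry.1 += 1;
        Ok(())
    }
}

/// Gates run before the JWT is signed: rate limit first, then the grant,
/// so a throttled caller never burns a grant use.
pub fn gate_jwt_mint(
    limiter: &mut RateLimiter,
    grants: &mut Grants,
    account: &str,
    intent: &Intent,
    now: Instant,
) -> Result<(), GateError> {
    limiter.check_mint(account, now)?;
    grants.try_consume(account, intent)
}
```

Note the cost this sketch makes visible: the grant check only works if the /v1/mint-oidc-jwt request grows an intent field, which is exactly the wire change option (a) calls out.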

Acceptance criteria

  • crates/agentkeys-broker-server/src/lib.rs no longer registers /v1/mint-aws-creds.
  • crates/agentkeys-broker-server/src/handlers/mint.rs deleted (or shrunk to just the helpers /v1/mint-oidc-jwt reuses).
  • crates/agentkeys-broker-server/tests/mint_v2_flow.rs deleted.
  • Phase B grant enforcement and rate-limit checks move to /v1/mint-oidc-jwt per the chosen option above.
  • Multi-anchor audit policy (sqlite + EVM) decision documented — kept (re-homed at JWT mint), or dropped with explicit note in the runbook.
  • cargo build -p agentkeys-broker-server clean for both feature combos.
  • cargo test -p agentkeys-broker-server --features audit-evm,auth-email-link,auth-oauth2-google passes.
  • cargo test --workspace passes.
  • cargo clippy --workspace --all-features -- -D warnings clean.
  • bash harness/stage-7-issue-64-done.sh exits 0.
  • docs/operator-runbook-stage7.md AWS IAM Trust §'Mint-time STS path' rewritten — single path only.
  • docs/stage7-demo-and-verification.md §5 rewritten — drop the 'two paths' framing.
  • Live walkthrough on https://broker.litentry.org confirms /v1/mint-aws-creds returns 404 and the daemon-side path still works end-to-end.

Migration sequence (recommended)

  1. Decide the gate-rehoming policy (the four bullets under item 4 of "What's in scope" above). This is the architectural question; the rest is mechanical.
  2. Move the gate code to /v1/mint-oidc-jwt (or document its drop).
  3. Delete the route + handler + tests in one commit.
  4. Doc updates in the same commit or a follow-up.
  5. Operator redeploys; verify live.

Out of scope

  • TEE-derived OIDC signer (tracked separately, plan §8 / heima-gaps §3).
  • Live EVM audit anchor (currently EvmStubAnchor — Phase E hardening).
  • The 3 pre-existing failing npm tests in provisioner-scripts/src/lib/email.test.ts (real-S3 calls failing due to local IAM perms — unrelated).

Why now / why not yet

Why now: Production daemons no longer use the endpoint. Keeping it is dead weight.

Why not yet: The gate-rehoming decision is real architecture work. Doing this without thinking about audit/grants/idempotency is how you delete a working policy enforcement layer by accident. The two-path system is fine to live in for a release or two while the rehoming is designed.
