Clarify AI Gateway response-cache contract and metadata #85

@stackbilt-admin

Summary

AI Gateway response caching is different from provider prompt/prefix caching. Cloudflare's current AI Gateway caching docs make the distinction concrete:

  • Gateway cache only applies to identical requests unless callers provide cf-aig-cache-key.
  • Per-request controls are cf-aig-cache-ttl, cf-aig-skip-cache, and cf-aig-cache-key.
  • Responses expose cache status with cf-aig-cache-status as HIT or MISS.
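As a rough illustration of the request side of that contract, the sketch below turns three optional controls into the Cloudflare header names listed above. `GatewayCacheControls` and `buildCacheHeaders` are hypothetical names, not part of llm-providers, and the sketch assumes the TTL is given in seconds.

```typescript
// Hypothetical option bag; only the header names come from the list above.
interface GatewayCacheControls {
  cacheKey?: string;
  cacheTtl?: number; // assumption: expressed in seconds
  skipCache?: boolean;
}

// Emit a header only when the caller set the corresponding control.
function buildCacheHeaders(controls: GatewayCacheControls): Record<string, string> {
  const headers: Record<string, string> = {};
  if (controls.cacheKey !== undefined) {
    headers["cf-aig-cache-key"] = controls.cacheKey;
  }
  if (controls.cacheTtl !== undefined) {
    headers["cf-aig-cache-ttl"] = String(controls.cacheTtl);
  }
  if (controls.skipCache === true) {
    headers["cf-aig-skip-cache"] = "true";
  }
  return headers;
}
```

With no controls set, the function returns an empty object, so the Gateway falls back to identical-request matching.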

llm-providers already has pieces of this surface (GatewayMetadata.cacheKey, GatewayMetadata.cacheTtl, ResponseCacheAdapter, and CacheHints.strategy: 'response' | 'both'), but downstream consumers still need to know too much about which fields affect Cloudflare AI Gateway versus local response-cache adapters.

Docs:

Current code surface

  • BaseProvider.getAIGatewayHeaders() forwards cf-aig-cache-key and cf-aig-cache-ttl when the provider base URL is AI Gateway.
  • GatewayMetadata does not expose skipCache.
  • CacheHints.strategy: 'response' is documented, but does not itself create Gateway response-cache headers or an explicit local adapter policy.
  • Provider response metadata does not normalize cf-aig-cache-status when Gateway returns it.
  • The factory-level ResponseCacheAdapter key is internal and separate from AI Gateway's cache key behavior.

Proposed work

  1. Add explicit GatewayMetadata.skipCache?: boolean mapped to cf-aig-skip-cache: true for HTTP-provider Gateway calls.
  2. Add response metadata normalization for cf-aig-cache-status where provider adapters can access response headers.
  3. Document how CacheHints.strategy relates to GatewayMetadata and ResponseCacheAdapter:
    • provider-prefix: provider/native prompt cache only.
    • response: response-cache policy only; the caller must still provide Gateway metadata or a response cache adapter.
    • both: both layers, still separately observable.
  4. Consider a small helper for deterministic cache key generation that consumers can call before dispatch, so gateway/custom cache keys are stable without copying factory internals.
  5. Add tests that prove the existing Cloudflare headers still work and that the new skip-cache and cache-status surfaces are backward-compatible.
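Items 3 and 4 could be sketched roughly as follows. This is not the factory's existing key logic: `layersFor`, `canonicalize`, and `stableCacheKey` are hypothetical names, and the choice of SHA-256 over key-sorted JSON is an assumption made for illustration.

```typescript
import { createHash } from "node:crypto";

// Strategy names as listed in item 3; the flag names are illustrative only.
type CacheStrategy = "provider-prefix" | "response" | "both";

// Which cache layers a strategy opts into. Neither flag configures a layer by itself.
function layersFor(strategy: CacheStrategy): { providerPrefix: boolean; response: boolean } {
  return {
    providerPrefix: strategy !== "response",
    response: strategy !== "provider-prefix",
  };
}

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted so that key order never changes the output.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}

// Hypothetical helper: hash only the fields the caller considers cache-relevant.
function stableCacheKey(parts: Json, prefix = "llm"): string {
  const digest = createHash("sha256").update(canonicalize(parts)).digest("hex");
  return `${prefix}:${digest}`;
}
```

A consumer could compute the key before dispatch and pass it as GatewayMetadata.cacheKey, so two requests that differ only in property order map to the same Gateway cache entry.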

Acceptance criteria

  • GatewayMetadata.skipCache is exported, documented, and mapped to the current Cloudflare header name.
  • Provider responses routed through AI Gateway can expose normalized cache status, at least as response.metadata.aiGatewayCacheStatus.
  • README shows a minimal AI Gateway response-cache example with cacheKey, cacheTtl, and skipCache.
  • Tests cover the new headers and verify that Cloudflare-specific headers are not sent to non-Gateway base URLs.
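One possible shape for the status and header-scoping criteria is sketched below. All function names are hypothetical, and the hostname check is an assumption about how a Gateway base URL is recognised; how BaseProvider actually detects one is not shown in this issue.

```typescript
// Assumption: Gateway traffic is recognised by the default AI Gateway hostname.
const AI_GATEWAY_HOST = "gateway.ai.cloudflare.com";

function isAIGatewayUrl(baseUrl: string): boolean {
  try {
    return new URL(baseUrl).hostname === AI_GATEWAY_HOST;
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Drop every cf-aig-* header unless the request is going to AI Gateway.
function scopeGatewayHeaders(
  baseUrl: string,
  headers: Record<string, string>,
): Record<string, string> {
  const scoped: Record<string, string> = {};
  const keepGatewayHeaders = isAIGatewayUrl(baseUrl);
  for (const [name, value] of Object.entries(headers)) {
    if (keepGatewayHeaders || !name.toLowerCase().startsWith("cf-aig-")) {
      scoped[name] = value;
    }
  }
  return scoped;
}

type AIGatewayCacheStatus = "HIT" | "MISS";

// Minimal structural type so both fetch Headers and test doubles fit.
interface HeaderReader {
  get(name: string): string | null;
}

// Returns undefined when the header is absent or carries an unexpected value.
function normalizeCacheStatus(headers: HeaderReader): AIGatewayCacheStatus | undefined {
  const raw = headers.get("cf-aig-cache-status")?.trim().toUpperCase();
  return raw === "HIT" || raw === "MISS" ? raw : undefined;
}
```

A provider adapter could assign the result of `normalizeCacheStatus` to response.metadata.aiGatewayCacheStatus and leave the field unset when the header is missing.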

Notes

Do not make response caching default for arbitrary chat/agent turns. Gateway response caching is exact-request caching and should remain an explicit policy choice.

Labels: P2-medium, enhancement
