Summary
AI Gateway response caching is different from provider prompt/prefix caching. Cloudflare's current AI Gateway caching docs make the distinction concrete:
- Gateway cache only applies to identical requests unless callers provide cf-aig-cache-key.
- Per-request controls are cf-aig-cache-ttl, cf-aig-skip-cache, and cf-aig-cache-key.
- Responses expose cache status with cf-aig-cache-status as HIT or MISS.
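At the wire level, the controls listed above might look like the following sketch. The header names come from the list; the example values, the readCacheStatus helper, and the assumption that the TTL is expressed in seconds are illustrative and not taken from llm-providers.

```typescript
type AIGatewayCacheStatus = 'HIT' | 'MISS';

// Per-request cache controls. Values are illustrative only.
const requestCacheHeaders: Record<string, string> = {
  // Custom key: lets non-identical requests share one cache entry.
  'cf-aig-cache-key': 'faq:refund-policy:v1',
  // Assumed to be a TTL in seconds.
  'cf-aig-cache-ttl': '3600',
  // Add 'cf-aig-skip-cache': 'true' to bypass the cache for one request.
};

// Hypothetical helper: reads cf-aig-cache-status from a plain header map.
// Header names are matched case-insensitively; unknown values yield undefined.
function readCacheStatus(
  responseHeaders: Record<string, string>,
): AIGatewayCacheStatus | undefined {
  for (const [name, value] of Object.entries(responseHeaders)) {
    if (name.toLowerCase() !== 'cf-aig-cache-status') continue;
    const status = value.trim().toUpperCase();
    if (status === 'HIT') return 'HIT';
    if (status === 'MISS') return 'MISS';
    return undefined;
  }
  return undefined;
}
```

Returning undefined for a missing or unrecognized value keeps the status distinguishable from a real MISS when the request never went through Gateway.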
llm-providers already has pieces of this surface (GatewayMetadata.cacheKey, GatewayMetadata.cacheTtl, ResponseCacheAdapter, and CacheHints.strategy: 'response' | 'both'), but downstream consumers still need to know too much about which fields affect Cloudflare AI Gateway versus local response-cache adapters.
Docs:
Current code surface
- BaseProvider.getAIGatewayHeaders() forwards cf-aig-cache-key and cf-aig-cache-ttl when the provider base URL is AI Gateway.
- GatewayMetadata does not expose skipCache.
- CacheHints.strategy: 'response' is documented, but does not itself create Gateway response-cache headers or an explicit local adapter policy.
- Provider response metadata does not normalize cf-aig-cache-status when Gateway returns it.
- The factory-level ResponseCacheAdapter key is internal and separate from AI Gateway's cache key behavior.
Proposed work
- Add explicit GatewayMetadata.skipCache?: boolean mapped to cf-aig-skip-cache: true for HTTP-provider Gateway calls.
- Add response metadata normalization for cf-aig-cache-status where provider adapters can access response headers.
- Document how CacheHints.strategy relates to GatewayMetadata and ResponseCacheAdapter:
  - provider-prefix: provider/native prompt cache only.
  - response: response-cache policy only, but caller must provide Gateway metadata or a response cache adapter.
  - both: both layers, still separately observable.
- Consider a small helper for deterministic cache key generation that consumers can call before dispatch, so gateway/custom cache keys are stable without copying factory internals.
- Add tests that prove old Cloudflare headers still work and the new skip-cache/status surfaces are backward-compatible.
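A rough sketch of how the skipCache mapping and the cache key helper could fit together. The GatewayMetadata shape is reduced to the three cache fields discussed here; isAIGatewayUrl, gatewayCacheHeaders, and stableCacheKey are hypothetical names. The hostname check and the SHA-256 over key-sorted JSON are assumptions, not the library's current behavior.

```typescript
import { createHash } from 'node:crypto';

// Reduced, hypothetical shape: cacheKey and cacheTtl exist today, skipCache is proposed.
interface GatewayMetadata {
  cacheKey?: string;
  cacheTtl?: number;
  skipCache?: boolean;
}

// Assumption: Gateway base URLs are recognized by hostname.
function isAIGatewayUrl(baseUrl: string): boolean {
  try {
    return new URL(baseUrl).hostname === 'gateway.ai.cloudflare.com';
  } catch {
    return false; // not a parseable URL, so not a Gateway URL
  }
}

// Maps metadata to cf-aig-* headers, and emits nothing for non-Gateway URLs.
function gatewayCacheHeaders(
  baseUrl: string,
  metadata: GatewayMetadata = {},
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (!isAIGatewayUrl(baseUrl)) return headers;
  if (metadata.cacheKey !== undefined) headers['cf-aig-cache-key'] = metadata.cacheKey;
  if (metadata.cacheTtl !== undefined) headers['cf-aig-cache-ttl'] = String(metadata.cacheTtl);
  if (metadata.skipCache === true) headers['cf-aig-skip-cache'] = 'true';
  return headers;
}

// Recursively sorts object keys and drops undefined values so that
// logically equal inputs serialize to the same JSON string.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(record).sort()) {
      if (record[key] !== undefined) sorted[key] = canonicalize(record[key]);
    }
    return sorted;
  }
  return value;
}

// Deterministic key: hex SHA-256 of the canonical JSON form of the input.
function stableCacheKey(input: Record<string, unknown>): string {
  return createHash('sha256').update(JSON.stringify(canonicalize(input))).digest('hex');
}
```

A consumer could then compute the key before dispatch and pass it through metadata, for example gatewayCacheHeaders(baseUrl, { cacheKey: stableCacheKey(requestBody), cacheTtl: 3600 }). Array order is preserved on purpose, since message order changes the meaning of a chat request.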
Acceptance criteria
- GatewayMetadata.skipCache is exported, documented, and mapped to the current Cloudflare header name.
- Provider responses routed through AI Gateway can expose normalized cache status, at least as response.metadata.aiGatewayCacheStatus.
- README shows a minimal AI Gateway response-cache example with cacheKey, cacheTtl, and skipCache.
- Tests cover the new headers and verify that Cloudflare-specific headers are not sent to non-Gateway base URLs.
Notes
Do not make response caching default for arbitrary chat/agent turns. Gateway response caching is exact-request caching and should remain an explicit policy choice.
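One way to keep that choice explicit is to resolve the strategy into per-layer flags that default to off. This is a minimal sketch: resolveCacheLayers and responseCacheActive are hypothetical helpers. The strategy names follow the list under Proposed work, and the provider-prefix default of off is an assumption.

```typescript
type CacheStrategy = 'provider-prefix' | 'response' | 'both';

interface CacheLayers {
  providerPrefix: boolean; // provider/native prompt cache
  response: boolean; // Gateway or local response cache
}

// With no strategy, response caching stays off.
// Assumption: the provider-prefix layer also defaults to off in this sketch.
function resolveCacheLayers(strategy?: CacheStrategy): CacheLayers {
  switch (strategy) {
    case 'provider-prefix':
      return { providerPrefix: true, response: false };
    case 'response':
      return { providerPrefix: false, response: true };
    case 'both':
      return { providerPrefix: true, response: true };
    default:
      return { providerPrefix: false, response: false };
  }
}

// The response layer only takes effect when the caller also supplies
// Gateway cache metadata or a response cache adapter.
function responseCacheActive(
  strategy: CacheStrategy | undefined,
  provided: { gatewayCacheMetadata: boolean; responseCacheAdapter: boolean },
): boolean {
  const layers = resolveCacheLayers(strategy);
  return layers.response && (provided.gatewayCacheMetadata || provided.responseCacheAdapter);
}
```

Under this shape an ordinary chat or agent turn with no hints never reaches a response cache, and strategy 'response' without metadata or an adapter is a no-op, not an implicit default.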