Skip to content

Tool call history tracking invalidates implicit cache during chat stream session #2508

@shawn-maybush

Description

@shawn-maybush

Environment details

  • Programming language: Python
  • OS: Debian
  • Language runtime version: Python 3.13.5
  • Package version: 1.73.1

output_contents.append(chunk.candidates[0].content)

In a streaming response, the model returns data in multiple chunks. If the model streams a thought followed by a tool call, they arrive in separate chunks. Because output_contents is populated with output_contents.append(chunk.candidates[0].content) during the stream, it is a list of Content objects.

By calling .extend(output_contents) in record_history, the SDK is dumping every single chunk as a separate Content entry into the history array. Instead of rolling up the parts into a single Content(role="model", parts=[...]), it’s producing history that looks like this:

[
  Content(role="user", parts=["Do a task..."]),
  Content(role="model", parts=["<thought>..."]),
  Content(role="model", parts=["</thought>"]),
  Content(role="model", parts=[FunctionCall(...)])
]

Gemini's Context Caching relies on strict left-to-right byte matching of the context payload. The cache expects alternating user and model turns. Even though the API might technically tolerate back-to-back model turns by silently collapsing them on the backend during inference, the raw JSON structure of the request has changed.

Because the serialized JSON array now contains multiple consecutive {"role": "model", ...} objects instead of one merged object, the byte-prefix of the request differs from whatever standard single-turn format was cached, resulting in a 100% cache miss (Cached Read = 0).

Metadata

Metadata

Labels

priority: p2Moderately-important priority. Fix may not be included in next release.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions