fix(kvcache): count per-tier bytes when data reaches a tier via eviction #420

Open
jay-tau wants to merge 2 commits into mlcommons:main from jay-tau:fix/kvcache-eviction-tier-bytes

@jay-tau jay-tau commented Jun 9, 2026

Fixes #408

Summary

MultiTierCache._demote_entry() (kv_cache_benchmark/kv_cache/cache.py) increments the offload counters and storage_tokens_processed when it evicts an entry into a lower tier, but it never increments the destination tier's byte counter. As a result, tier_cpu_kv_bytes_written / tier_storage_kv_bytes_written — and the *_gb and *_write_bandwidth_gbps values derived from them in results.json → summary.cache_stats — report 0 on any multi-tier setup where data reaches CPU or NVMe through eviction rather than direct allocation (the typical GPU-first configuration). The read side of the demotion (data read out of the source tier) was likewise untracked.

This is the bug reported in #408, where offloads_storage was 810 (data demonstrably written to storage) yet tier_storage_kv_bytes_written_gb and tier_storage_write_bandwidth_gbps were 0.0.

Fix

Inside the existing stats_lock block in _demote_entry():

  • increment the source tier's tier_*_kv_bytes_read counter (the eviction reads the entry out of from_tier), and
  • increment the destination tier's tier_*_kv_bytes_written counter for the cpu / nvme branches.

This mirrors the accounting already performed on the direct-allocation path (_allocate_cache_inner) and the read path (access_cache), reusing the same nvme → storage stat-name mapping.
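
As a rough illustration of the accounting described above, here is a minimal, self-contained sketch. It is not the code in cache.py: `ToyTierStats`, `record_demotion`, and `STAT_NAME` are hypothetical names made up for this example, and only the counter names are taken from this PR.

```python
import threading
from collections import defaultdict

# Assumption: the nvme tier is reported under "storage" stat names,
# as the PR description says; the dict itself is illustrative only.
STAT_NAME = {"gpu": "gpu", "cpu": "cpu", "nvme": "storage"}


class ToyTierStats:
    """Toy stand-in for the stats kept by MultiTierCache (not the real class)."""

    def __init__(self):
        self.stats_lock = threading.Lock()
        self.stats = defaultdict(int)

    def record_demotion(self, from_tier, to_tier, size_bytes):
        """Account for one entry evicted from from_tier into to_tier."""
        src = STAT_NAME[from_tier]
        dst = STAT_NAME[to_tier]
        with self.stats_lock:
            # Counted before the fix: the offload event itself.
            self.stats[f"offloads_{dst}"] += 1
            # Added by the fix: bytes read out of the source tier ...
            self.stats[f"tier_{src}_kv_bytes_read"] += size_bytes
            # ... and bytes written into the destination tier.
            self.stats[f"tier_{dst}_kv_bytes_written"] += size_bytes


toy = ToyTierStats()
toy.record_demotion("gpu", "cpu", 4096)   # GPU entry evicted into CPU
toy.record_demotion("cpu", "nvme", 4096)  # CPU entry evicted into NVMe
```

Doing all three increments under one lock acquisition keeps the offload count and the byte counters consistent with each other when several threads demote entries at once.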

Test

Strengthens TestThreeTierEvictionCascade::test_full_cascade_gpu_to_cpu_to_nvme_to_delete, which already forces a GPU→CPU→NVMe cascade (every new entry is admitted to GPU, so CPU/NVMe bytes can only arrive via demotion). New assertions verify:

  • the demote-only byte counters (tier_cpu_kv_bytes_written, tier_storage_kv_bytes_written, tier_gpu_kv_bytes_read, tier_cpu_kv_bytes_read) are non-zero;
  • the user-facing summary exposes non-zero tier_{cpu,storage}_kv_bytes_written_gb and tier_storage_write_bandwidth_gbps;
  • byte conservation — bytes read out of a tier equal the bytes written into the tier below it — pinning the accounting to the demotion path specifically.

This test fails before the fix (counters 0 while offloads_cpu/offloads_storage are non-zero) and passes after.
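
The byte-conservation idea can be sketched as a standalone check. This is an illustration rather than the assertions in the actual test: `conservation_violations` is a hypothetical helper, the two sample dicts are made-up numbers, and the equality only holds in a demote-only scenario where no other reads or writes touch the CPU and NVMe tiers.

```python
def conservation_violations(stats):
    """Return a list of problems found in demote-only byte accounting.

    Each pair is (bytes read out of a tier, bytes written into the tier below).
    """
    pairs = [
        ("tier_gpu_kv_bytes_read", "tier_cpu_kv_bytes_written"),
        ("tier_cpu_kv_bytes_read", "tier_storage_kv_bytes_written"),
    ]
    problems = []
    for read_key, written_key in pairs:
        read_bytes = stats.get(read_key, 0)
        written_bytes = stats.get(written_key, 0)
        if read_bytes == 0 or written_bytes == 0:
            # The symptom from #408: offloads happened but bytes stayed at 0.
            problems.append(f"{read_key} / {written_key}: counter is zero")
        elif read_bytes != written_bytes:
            problems.append(
                f"{read_key}={read_bytes} != {written_key}={written_bytes}"
            )
    return problems


# Made-up stats: offloads recorded, byte counters never incremented.
before_fix = {"offloads_cpu": 3, "offloads_storage": 2}

# Made-up stats: byte counters populated and balanced across tiers.
after_fix = {
    "offloads_cpu": 3,
    "offloads_storage": 2,
    "tier_gpu_kv_bytes_read": 12288,
    "tier_cpu_kv_bytes_written": 12288,
    "tier_cpu_kv_bytes_read": 8192,
    "tier_storage_kv_bytes_written": 8192,
}

print(conservation_violations(before_fix))
print(conservation_violations(after_fix))
```

Checking equality between adjacent tiers, rather than only non-zero values, is what ties the counters to the demotion path: a counter incremented elsewhere would be non-zero but would not balance.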

Verification

@jay-tau jay-tau requested a review from a team June 9, 2026 18:52
github-actions Bot commented Jun 9, 2026

MLCommons CLA bot:
Thank you very much for your submission; we really appreciate it. Before we can accept your contribution,
we ask that you sign the MLCommons CLA (Apache 2). Please submit your GitHub ID to our onboarding form to initiate
authorization. If you are from an MLCommons member organization, we will request that you be added to the CLA.
If you are not from a member organization, we will email you a CLA to sign. For any questions, please contact
support@mlcommons.org.
0 out of 1 committers have signed the MLCommons CLA.
@jay-tau
You can retrigger this bot by commenting recheck in this Pull Request

dslik commented Jun 10, 2026

Contributor

@hazemawadalla, can you review?

@jay-tau jay-tau marked this pull request as ready for review June 10, 2026 17:06
jay-tau commented Jun 10, 2026

Author

recheck

@hazemawadalla

Contributor

thanks let me check it one more time today

MultiTierCache._demote_entry() incremented the offload counters and
storage_tokens_processed but never the destination tier's byte counter,
so tier_cpu_kv_bytes_written / tier_storage_kv_bytes_written (and the
_gb and _write_bandwidth_gbps values derived from them in results.json)
stayed 0 on any multi-tier setup where data reaches CPU or NVMe through
eviction rather than direct allocation. The read side of the demotion
(data read out of the source tier) was likewise untracked.

Increment the source tier's read-bytes counter and the destination
tier's written-bytes counter inside the existing stats lock, mirroring
the accounting already done on the direct-allocation and access paths.

Strengthen the three-tier eviction cascade test to assert the
demote-only byte counters are non-zero and that bytes read out of a tier
equal the bytes written into the tier below it.

Fixes mlcommons#408

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jay-tau jay-tau force-pushed the fix/kvcache-eviction-tier-bytes branch from efc9407 to c4692fa Compare June 12, 2026 16:27
dslik commented Jun 12, 2026

Contributor

> thanks let me check it one more time today

@hazemawadalla If this change looks good to you, can you open a new PR so that we can get this in? Resolving the CLA issue may take some time.

jay-tau commented Jun 12, 2026 via email

Author

Development

Successfully merging this pull request may close these issues.

KVCache: tier_storage_kv_bytes_written_gb and tier_cpu_kv_bytes_written_gb always 0 when data reaches those tiers via eviction

4 participants