fix(kvcache): count per-tier bytes when data reaches a tier via eviction by jay-tau · Pull Request #420 · mlcommons/storage

jay-tau · 2026-06-09T18:52:55Z

Fixes #408

Summary

MultiTierCache._demote_entry() (kv_cache_benchmark/kv_cache/cache.py) increments the offload counters and storage_tokens_processed when it evicts an entry into a lower tier, but it never increments the destination tier's byte counter. As a result, tier_cpu_kv_bytes_written / tier_storage_kv_bytes_written — and the *_gb and *_write_bandwidth_gbps values derived from them in results.json → summary.cache_stats — report 0 on any multi-tier setup where data reaches CPU or NVMe through eviction rather than direct allocation (the typical GPU-first configuration). The read side of the demotion (data read out of the source tier) was likewise untracked.

This is the bug reported in #408, where offloads_storage was 810 (data demonstrably written to storage) yet tier_storage_kv_bytes_written_gb and tier_storage_write_bandwidth_gbps were 0.0.
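To illustrate why the derived values in that report were 0.0 even though 810 offloads had occurred, here is a minimal sketch of how a byte counter might feed the summary. The function name, the output keys, the GiB divisor, and the bytes-over-wall-time bandwidth formula are all assumptions for illustration; the benchmark's actual derivation may differ.

```python
# Assumption: 1 GB is treated as 1024**3 bytes; the benchmark may use 1e9 instead.
BYTES_PER_GB = 1024 ** 3


def derive_write_metrics(bytes_written, duration_s):
    """Hypothetical helper: turn a raw byte counter into summary values."""
    written_gb = bytes_written / BYTES_PER_GB
    # Assumption: bandwidth is gigabytes written divided by elapsed seconds.
    bandwidth = written_gb / duration_s if duration_s > 0 else 0.0
    return {"written_gb": written_gb, "write_bandwidth_gbps": bandwidth}


# If the counter is never incremented, both derived values are zero,
# no matter how many offloads were counted elsewhere.
print(derive_write_metrics(0, 60.0))
```

Under these assumptions, any metric computed only from the byte counter stays at zero until the counter itself is incremented on the eviction path.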

Fix

Inside the existing stats_lock block in _demote_entry():

increment the source tier's tier_*_kv_bytes_read counter (the eviction reads the entry out of from_tier), and
increment the destination tier's tier_*_kv_bytes_written counter for the cpu / nvme branches.

This mirrors the accounting already performed on the direct-allocation path (_allocate_cache_inner) and the read path (access_cache), reusing the same nvme → storage stat-name mapping.
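The accounting described above can be sketched roughly as follows. This is a toy stand-in, not the code in cache.py: the class ToyTierStats, the method record_demotion, and the STAT_TIER_NAME mapping are hypothetical names, and only the counter names come from this PR.

```python
import threading
from collections import defaultdict

# Hypothetical mapping: the "nvme" tier reports under "storage" stat names.
STAT_TIER_NAME = {"gpu": "gpu", "cpu": "cpu", "nvme": "storage"}


class ToyTierStats:
    """Toy stand-in for the cache's stats bookkeeping (not the real class)."""

    def __init__(self):
        self.stats = defaultdict(int)
        self.stats_lock = threading.Lock()

    def record_demotion(self, from_tier, to_tier, size_bytes):
        src = STAT_TIER_NAME[from_tier]
        dst = STAT_TIER_NAME[to_tier]
        with self.stats_lock:
            # Behaviour that already existed: count the offload itself.
            self.stats[f"offloads_{dst}"] += 1
            # Added by the fix: the eviction reads the entry out of the source tier...
            self.stats[f"tier_{src}_kv_bytes_read"] += size_bytes
            # ...and writes it into the destination tier.
            self.stats[f"tier_{dst}_kv_bytes_written"] += size_bytes


toy = ToyTierStats()
toy.record_demotion("gpu", "cpu", 4096)   # GPU -> CPU demotion
toy.record_demotion("cpu", "nvme", 4096)  # CPU -> NVMe demotion
print(dict(toy.stats))
```

Updating both counters under the same lock as the offload counter keeps the three values consistent with each other when several threads evict concurrently.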

Test

Strengthens TestThreeTierEvictionCascade::test_full_cascade_gpu_to_cpu_to_nvme_to_delete, which already forces a GPU→CPU→NVMe cascade (every new entry is admitted to GPU, so CPU/NVMe bytes can only arrive via demotion). New assertions verify:

the demote-only byte counters (tier_cpu_kv_bytes_written, tier_storage_kv_bytes_written, tier_gpu_kv_bytes_read, tier_cpu_kv_bytes_read) are non-zero;
the user-facing summary exposes non-zero tier_{cpu,storage}_kv_bytes_written_gb and tier_storage_write_bandwidth_gbps;
byte conservation — bytes read out of a tier equal the bytes written into the tier below it — pinning the accounting to the demotion path specifically.

This test fails before the fix (counters 0 while offloads_cpu/offloads_storage are non-zero) and passes after.
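The shape of the new assertions can be sketched as a standalone helper. This is an illustration only: assert_demotion_accounting is a hypothetical name, the counter values are made up, and the real test reads its counters from the cache under test rather than from a literal dict.

```python
def assert_demotion_accounting(stats):
    """Check demote-only counters for a GPU-first cascade (toy helper)."""
    demote_only = [
        "tier_cpu_kv_bytes_written",
        "tier_storage_kv_bytes_written",
        "tier_gpu_kv_bytes_read",
        "tier_cpu_kv_bytes_read",
    ]
    for name in demote_only:
        assert stats.get(name, 0) > 0, f"{name} should be non-zero"

    # Byte conservation: what leaves a tier must arrive in the tier below it.
    assert stats["tier_gpu_kv_bytes_read"] == stats["tier_cpu_kv_bytes_written"]
    assert stats["tier_cpu_kv_bytes_read"] == stats["tier_storage_kv_bytes_written"]


# Made-up counter values, for illustration only.
example_stats = {
    "tier_gpu_kv_bytes_read": 8192,
    "tier_cpu_kv_bytes_written": 8192,
    "tier_cpu_kv_bytes_read": 4096,
    "tier_storage_kv_bytes_written": 4096,
}
assert_demotion_accounting(example_stats)
```

The equality checks only hold in a workload where entries reach CPU and NVMe solely through demotion; a direct allocation into a lower tier or an access-path read would add bytes to one side but not the other.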

Verification

Full KV cache suite: 206 passed, 28 skipped (skips are CUDA-only).
The before/after counters were also cross-checked against the darshan-measured ground truth in #408 (~1% delta, attributable to .npy header bytes excluded from bench size accounting).
No new ruff findings on the changed lines.

github-actions · 2026-06-09T18:53:10Z

MLCommons CLA bot:
Thank you very much for your submission; we really appreciate it. Before we can accept your contribution,
we ask that you sign the MLCommons CLA (Apache 2). Please submit your GitHub ID to our onboarding form to initiate
authorization. If you are from a MLCommons member organization, we will request that you be added to the CLA.
If you are not from a member organization, we will email you a CLA to sign. For any questions, please contact
support@mlcommons.org.
0 out of 1 committers have signed the MLCommons CLA.
❌ @jay-tau
You can retrigger this bot by commenting recheck in this Pull Request

dslik · 2026-06-10T15:15:19Z

@hazemawadalla, can you review?

jay-tau · 2026-06-10T17:19:34Z

recheck

hazemawadalla · 2026-06-12T16:21:44Z

thanks let me check it one more time today

MultiTierCache._demote_entry() incremented the offload counters and storage_tokens_processed but never the destination tier's byte counter, so tier_cpu_kv_bytes_written / tier_storage_kv_bytes_written (and the _gb and _write_bandwidth_gbps values derived from them in results.json) stayed 0 on any multi-tier setup where data reaches CPU or NVMe through eviction rather than direct allocation. The read side of the demotion (data read out of the source tier) was likewise untracked.

Increment the source tier's read-bytes counter and the destination tier's written-bytes counter inside the existing stats lock, mirroring the accounting already done on the direct-allocation and access paths.

Strengthen the three-tier eviction cascade test to assert the demote-only byte counters are non-zero and that bytes read out of a tier equal the bytes written into the tier below it.

Fixes mlcommons#408

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dslik · 2026-06-12T16:36:31Z

> thanks let me check it one more time today

@hazemawadalla If this change looks good to you, can you open a new PR so that we can get this in? Resolving the CLA issue may take some time.

jay-tau · 2026-06-12T18:52:48Z

Do I need to do anything beyond replying to the mail that I received related to the CLA? Joel

jay-tau requested a review from a team June 9, 2026 18:52

jay-tau mentioned this pull request Jun 9, 2026

KVCache: tier_storage_kv_bytes_written_gb and tier_cpu_kv_bytes_written_gb always 0 when data reaches those tiers via eviction #408

Open

jay-tau marked this pull request as draft June 9, 2026 19:07

jay-tau mentioned this pull request Jun 9, 2026

KVCache: eviction-driven tier I/O is excluded from latency P95 SLA checks (Storage/CPU write & read latencies never recorded in _demote_entry) #421

Open

dslik added the KVCache TF label Jun 10, 2026

jay-tau marked this pull request as ready for review June 10, 2026 17:06

jay-tau force-pushed the fix/kvcache-eviction-tier-bytes branch from efc9407 to c4692fa June 12, 2026 16:27

Merge branch 'main' into fix/kvcache-eviction-tier-bytes

8589955

jay-tau wants to merge 2 commits into mlcommons:main from jay-tau:fix/kvcache-eviction-tier-bytes

4 participants
