refactor: prefill RPC cache for collect_events#7077

Open
hanabi1224 wants to merge 10 commits into
mainfrom
hm/prefill-cache-collect-events

Conversation

@hanabi1224
Contributor

@hanabi1224 hanabi1224 commented May 18, 2026

Summary of changes

Changes introduced in this pull request:

  • prefill RPC cache for collect_events
  • remove cacheless resolve_to_key_addr in favor of id_to_deterministic_address. A few RPC methods (Filecoin.GasEstimateGasLimit, Filecoin.MpoolPushMessage, eth_call, trace_call) should benefit from this.
  • add automatic cache hit/miss metrics for every SizeTrackingCache
  • update docs accordingly

The id_to_deterministic_address cache has a good hit rate, roughly 89.5% (17708 hits out of 19782 lookups) in the sample below:

# HELP cache_id_to_deterministic_address_hits Cache hits of id_to_deterministic_address
# TYPE cache_id_to_deterministic_address_hits gauge
cache_id_to_deterministic_address_hits 17708
# HELP cache_id_to_deterministic_address_misses Cache misses of id_to_deterministic_address
# TYPE cache_id_to_deterministic_address_misses gauge
cache_id_to_deterministic_address_misses 2074
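
The hit rate implied by the two gauges above can be worked out with a few lines of Rust. This is only an illustrative sketch: `hit_rate` is a hypothetical helper, not a function from this PR, and the inputs are the sample values quoted above.

```rust
/// Fraction of lookups served from the cache, in the range [0.0, 1.0].
/// Returns `None` when no lookups have been recorded yet.
/// NOTE: hypothetical helper for illustration, not part of the PR.
fn hit_rate(hits: u64, misses: u64) -> Option<f64> {
    let total = hits.checked_add(misses)?;
    if total == 0 {
        return None;
    }
    Some(hits as f64 / total as f64)
}

fn main() {
    // Sample values copied from the metrics dump above.
    let rate = hit_rate(17_708, 2_074).expect("at least one lookup");
    println!("hit rate: {:.1}%", rate * 100.0); // prints "hit rate: 89.5%"
}
```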

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • New Features

    • Events are now collected and cached during RPC cache warmup for newly validated tipsets.
  • Refactor

    • Unified caching with new sync/async get-or-insert semantics and direct size-tracking cache usage.
    • Standardized deterministic address resolution across chain sync, RPC, wallet, message-pool, interpreter, and state-manager paths.
    • Increased default cache sizes and simplified cache/shallow-clone ergonomics.
  • Chores

    • Updated dependency declarations and reorganized imports for consistency.

Review Change Stack

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label May 18, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds sync/async get-or-insert APIs to SizeTrackingCache, promotes quick_cache dependency, standardizes CidWrapper and prelude usage, migrates all callers from resolve_to_key_addr to resolve_to_deterministic_address, refactors several cache wrappers to hold SizeTrackingCache directly, and updates ETH RPC caching and event-collection call sites.
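
The get-or-insert semantics mentioned in the walkthrough can be sketched roughly as follows. This is not Forest's SizeTrackingCache or the quick_cache API: `SketchCache` is a hypothetical stand-in built on std types, assuming only that the initializer runs on a miss and that the cache itself stores the result.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::Mutex;

/// Hypothetical stand-in for a cache with get-or-insert semantics.
struct SketchCache<K, V> {
    inner: Mutex<HashMap<K, V>>,
}

impl<K: Eq + Hash + Clone, V: Clone> SketchCache<K, V> {
    fn new() -> Self {
        Self { inner: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached value for `key`, or runs `init`, stores its
    /// result, and returns it. A failing `init` leaves the cache untouched.
    fn get_or_insert_with<E>(
        &self,
        key: &K,
        init: impl FnOnce() -> Result<V, E>,
    ) -> Result<V, E> {
        // For simplicity the lock is held while `init` runs.
        let mut map = self.inner.lock().expect("lock poisoned");
        if let Some(v) = map.get(key) {
            return Ok(v.clone());
        }
        let v = init()?;
        // Insertion happens here, so `init` must not insert by itself.
        map.insert(key.clone(), v.clone());
        Ok(v)
    }
}

fn main() {
    let cache: SketchCache<u64, String> = SketchCache::new();
    let mut calls = 0;
    for _ in 0..3 {
        let v = cache
            .get_or_insert_with(&42, || {
                calls += 1;
                Ok::<_, String>("resolved".to_string())
            })
            .unwrap();
        assert_eq!(v, "resolved");
    }
    // The initializer ran only for the first lookup.
    assert_eq!(calls, 1);
}
```

Because the cache owns the insertion step, an initializer only needs to compute and return the value; pushing into the cache from inside the initializer would be redundant.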

Changes

Cache and Address Resolution Consolidation

| Layer | File(s) | Summary |
| --- | --- | --- |
| SizeTrackingCache APIs & deps | Cargo.toml, src/utils/cache/size_tracking.rs | Adds get_or_insert_with and get_or_insert_async (and inline annotations) to SizeTrackingCache; moves quick_cache = "0.6" to main dependencies and bumps get-size2 to 0.9. |
| Prelude & import consolidation | src/lib.rs, src/db/car/forest.rs, src/db/car/mod.rs, src/db/blockstore_with_read_cache.rs, src/message_pool/msgpool/selection.rs, src/message_pool/msgpool/msg_pool.rs | Re-exports CidWrapper via crate prelude and converts modules to use crate::prelude::* for common types. |
| CidWrapper equivalence & block caches | src/utils/get_size/mod.rs, src/chain_sync/bad_block_cache.rs, src/db/blockstore_with_read_cache.rs, src/db/car/forest.rs | Implements quick_cache::Equivalent<CidWrapper> for Cid; standardizes cache key types to CidWrapper and updates lookups/insertions accordingly. |
| Cache wrapper refactors | src/chain/store/chain_store.rs, src/state_migration/common/mod.rs, src/state_manager/cache.rs | Converts MessagesInTipsetCache, MigrationCache, and ForestCache to tuple structs containing SizeTrackingCache; updates API shapes, shallow_clone, and tests accordingly. |
| MigrationCache & StateMigration usage | src/state_migration/* | Change MigrationCache API to accept &str keys, add ShallowClone, and update StateMigration and MinerMigrator to use shallow_clone() and borrowed keys. |
| StateTree & StateManager address resolution | src/shim/state_tree.rs, src/state_manager/address_resolution.rs, src/state_manager/cache.rs, src/state_manager/mod.rs | Rename resolve_to_deterministic_addr → resolve_to_deterministic_address, prefer delegated_address when present, remove legacy resolve_to_key_addr helper, update get_bls_public_key signature to take Address by value, and use async cache population for deterministic-address resolution. |
| Remove legacy helper & update callers | src/interpreter/mod.rs, src/interpreter/fvm2.rs, src/interpreter/fvm3.rs, src/interpreter/fvm4.rs, src/rpc/methods/gas.rs, src/rpc/methods/mpool.rs, src/rpc/methods/wallet.rs, src/message_pool/msgpool/provider.rs, src/state_manager/message_simulation.rs | Remove free resolve_to_key_addr helper and update callers to call state.resolve_to_deterministic_address(...) or state_manager.resolve_to_deterministic_address(...). |
| ETH RPC caching & event API | src/rpc/methods/eth.rs, src/rpc/methods/eth/filter/mod.rs, src/daemon/mod.rs | Switch ETH block caches from ForestCache to SizeTrackingCache using get_or_insert_async; change EthEventHandler::collect_events to accept &StateManager and update call sites; daemon warmup pre-fills events. |
| Message-pool signature cache & utils | src/message_pool/msgpool/msg_pool.rs, src/message_pool/msgpool/utils.rs | Use Cid/CidWrapper::from directly for signature-validation and BLS caches (get_cloned/push), and adjust small imports. |
| Minor import/test adjustments | tests/*, various small files | Adjust tests and small import rearrangements to match API and import changes (Arc/Arc vs Vec, .0 access, import ordering). |

Sequence Diagram

sequenceDiagram
  participant RPC as RPC / Daemon
  participant EthHandler as EthEventHandler
  participant StateMgr as StateManager
  participant Cache as SizeTrackingCache
  RPC->>EthHandler: collect_events(&StateMgr, tipset)
  EthHandler->>StateMgr: load_executed_tipset(...)
  EthHandler->>StateMgr: resolve_to_deterministic_address(...)
  EthHandler->>Cache: get_or_insert_async(key, compute)
  Cache-->>EthHandler: cached events / newly computed events
  EthHandler-->>RPC: return events (or warn on error)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'refactor: prefill RPC cache for collect_events' is directly related to the PR's main objective of prefilling the RPC cache for event collection, as evidenced by changes in src/daemon/mod.rs (event cache prefilling) and src/rpc/methods/eth.rs (cache refactoring). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.21% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@hanabi1224 hanabi1224 marked this pull request as ready for review May 18, 2026 21:37
@hanabi1224 hanabi1224 requested a review from a team as a code owner May 18, 2026 21:37
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team May 18, 2026 21:37
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/state_manager/address_resolution.rs`:
- Around line 61-82: The initializer passed to
id_to_deterministic_address_cache.get_or_insert_async is performing an extra
insertion by calling self.id_to_deterministic_address_cache.push(id, resolved);
remove that redundant push so the initializer only computes and returns
Ok(resolved) (letting SizeTrackingCache::get_or_insert_async handle
insertion/size accounting); you can keep the resolution logic
(StateTree::new_from_root, resolve_to_deterministic_address, load_tipset_state)
unchanged—just delete the push call inside the async block.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 39d054a6-e444-407d-9bc5-765dfa106e52

📥 Commits

Reviewing files that changed from the base of the PR and between e0bca2a and b869240.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (30)
  • Cargo.toml
  • src/chain/store/chain_store.rs
  • src/chain_sync/bad_block_cache.rs
  • src/chain_sync/tipset_syncer.rs
  • src/daemon/mod.rs
  • src/db/blockstore_with_read_cache.rs
  • src/db/car/forest.rs
  • src/db/car/mod.rs
  • src/interpreter/fvm2.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/interpreter/mod.rs
  • src/lib.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/provider.rs
  • src/message_pool/msgpool/selection.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/gas.rs
  • src/rpc/methods/mpool.rs
  • src/rpc/methods/wallet.rs
  • src/shim/state_tree.rs
  • src/state_manager/address_resolution.rs
  • src/state_manager/cache.rs
  • src/state_manager/message_simulation.rs
  • src/state_manager/mod.rs
  • src/state_migration/common/mod.rs
  • src/state_migration/common/state_migration.rs
  • src/state_migration/nv21/miner.rs
  • src/utils/cache/size_tracking.rs
💤 Files with no reviewable changes (2)
  • src/db/car/mod.rs
  • src/interpreter/mod.rs

Comment thread src/state_manager/address_resolution.rs
@hanabi1224 hanabi1224 force-pushed the hm/prefill-cache-collect-events branch from b869240 to 45a4eff Compare May 18, 2026 21:54
@hanabi1224 hanabi1224 force-pushed the hm/prefill-cache-collect-events branch from 45a4eff to a8404ef Compare May 18, 2026 22:05
@codecov

codecov Bot commented May 18, 2026

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

Comment thread src/state_manager/mod.rs

- const DEFAULT_TIPSET_CACHE_SIZE: NonZeroUsize = nonzero!(1024usize);
- const DEFAULT_ID_TO_DETERMINISTIC_ADDRESS_CACHE_SIZE: NonZeroUsize = nonzero!(1024usize);
+ const DEFAULT_TIPSET_CACHE_SIZE: NonZeroUsize = nonzero!(8192usize);
Contributor Author

~27MiB maximum

cache_tipset_state_executed_tipset_5_size_bytes 10313496
# HELP tipset_state_executed_tipset_5_len Length of cache tipset_state_executed_tipset_5
# TYPE tipset_state_executed_tipset_5_len gauge
tipset_state_executed_tipset_5_len 3423
# HELP tipset_state_executed_tipset_5_cap Capacity of cache tipset_state_executed_tipset_5
# TYPE tipset_state_executed_tipset_5_cap gauge
tipset_state_executed_tipset_5_cap 8192

Comment thread src/state_manager/mod.rs
- const DEFAULT_TIPSET_CACHE_SIZE: NonZeroUsize = nonzero!(1024usize);
- const DEFAULT_ID_TO_DETERMINISTIC_ADDRESS_CACHE_SIZE: NonZeroUsize = nonzero!(1024usize);
+ const DEFAULT_TIPSET_CACHE_SIZE: NonZeroUsize = nonzero!(8192usize);
+ const DEFAULT_ID_TO_DETERMINISTIC_ADDRESS_CACHE_SIZE: NonZeroUsize = nonzero!(8192usize);
Contributor Author

~0.6MiB maximum

cache_id_to_deterministic_address_6_size_bytes 61072
# HELP id_to_deterministic_address_6_len Length of cache id_to_deterministic_address_6
# TYPE id_to_deterministic_address_6_len gauge
id_to_deterministic_address_6_len 694
# HELP id_to_deterministic_address_6_cap Capacity of cache id_to_deterministic_address_6
# TYPE id_to_deterministic_address_6_cap gauge
id_to_deterministic_address_6_cap 8192

  fn read_cache_size() -> NonZeroUsize {
  // Arbitrary number, can be adjusted
- const DEFAULT: NonZeroUsize = nonzero!(1024usize);
+ const DEFAULT: NonZeroUsize = nonzero!(8192usize);
Contributor Author

~45MiB maximum

cache_msg_in_tipset_1_size_bytes 23020119
# HELP msg_in_tipset_1_len Length of cache msg_in_tipset_1
# TYPE msg_in_tipset_1_len gauge
msg_in_tipset_1_len 4018
# HELP msg_in_tipset_1_cap Capacity of cache msg_in_tipset_1
# TYPE msg_in_tipset_1_cap gauge
msg_in_tipset_1_cap 8192
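
One plausible way to arrive at a "maximum" figure like the ones quoted in these comments is a linear extrapolation from the sampled gauges: average bytes per entry times capacity. The sketch below assumes entries stay roughly the same size as the cache fills up; `projected_max_mib` is a hypothetical helper, and the PR does not say how the quoted estimates were actually derived.

```rust
const MIB: f64 = 1024.0 * 1024.0;

/// Linear extrapolation of a cache's size at full capacity, in MiB,
/// from a sampled (size_bytes, len) pair. Returns `None` for an empty cache.
/// NOTE: hypothetical helper; assumes a stable average entry size.
fn projected_max_mib(size_bytes: u64, len: u64, cap: u64) -> Option<f64> {
    if len == 0 {
        return None;
    }
    let avg_entry_bytes = size_bytes as f64 / len as f64;
    Some(avg_entry_bytes * cap as f64 / MIB)
}

fn main() {
    // Values from the msg_in_tipset sample above: size_bytes, len, cap.
    let mib = projected_max_mib(23_020_119, 4_018, 8_192).expect("non-empty cache");
    println!("msg_in_tipset projected max: ~{mib:.1} MiB");
}
```

For the msg_in_tipset sample this lands close to the ~45MiB quoted above; the other caches can be plugged in the same way, though the result depends on how representative the sampled entries are.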


/// Similar to `resolve_to_key_addr` in the `forest_vm` [`crate::state_manager`] but does not
/// allow `Actor` type of addresses. Uses `ts` to generate the VM state.
pub async fn resolve_to_key_addr(
Contributor Author

@coderabbitai double check if this removed fn resolve_to_key_addr is logically identical to resolve_to_deterministic_address

Contributor

🧩 Analysis chain

Let me fetch the full body of the removed resolve_to_key_addr function to do a proper comparison.


🏁 Script executed:

#!/bin/bash
# Get the full diff for address_resolution.rs to see the complete removed function
git diff HEAD~1 -- src/state_manager/address_resolution.rs

Repository: ChainSafe/forest
Length of output: 29


🏁 Script executed:

#!/bin/bash
# Try fetching the full PR diff via GitHub CLI
gh pr diff 7077 --repo ChainSafe/forest | grep -A 80 "resolve_to_key_addr"

Repository: ChainSafe/forest
Length of output: 76


🏁 Script executed:

#!/bin/bash
# Fetch the PR diff via GitHub REST API (public repo, no auth needed)
curl -s -H "Accept: application/vnd.github.v3.diff" \
  "https://api.github.com/repos/ChainSafe/forest/pulls/7077" \
  | grep -A 80 "resolve_to_key_addr"

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check if there are any remaining references to resolve_to_key_addr in the codebase
rg "resolve_to_key_addr" -n -C 3

echo "---"
# Also check git log to find available commits
git log --oneline | head -5

Repository: ChainSafe/forest
Length of output: 61


🏁 Script executed:

#!/bin/bash
# Try to get the base branch SHA for the PR
curl -s "https://api.github.com/repos/ChainSafe/forest/pulls/7077" | \
  python3 -c "import sys,json; d=json.load(sys.stdin); print(d.get('base',{}).get('sha',''))"

Length of output: 41


🏁 Script executed:

#!/bin/bash
# Fetch the base version of address_resolution.rs
curl -s "https://raw.githubusercontent.com/ChainSafe/forest/e0bca2ae61a44fd70a79abd6693a1378a2ec40dd/src/state_manager/address_resolution.rs"

Length of output: 4751


Now let me fetch the old resolve_to_key_addr helper from src/interpreter/mod.rs to complete the comparison.


🏁 Script executed:

#!/bin/bash
# Fetch the base version of interpreter/mod.rs to see the removed resolve_to_key_addr helper
curl -s "https://raw.githubusercontent.com/ChainSafe/forest/e0bca2ae61a44fd70a79abd6693a1378a2ec40dd/src/interpreter/mod.rs"

Length of output: 1212


@hanabi1224 I've retrieved and compared both implementations. Here's the detailed analysis:


Old StateManager::resolve_to_key_addr (removed)

match addr.protocol() {
    BLS | Secp256k1 | Delegated => return Ok(*addr),  // return as-is
    Actor => return Err(...),                           // bail
    _ => {}                                             // falls through for ID
};
// For ID: calls standalone `resolve_to_key_addr` helper:
//   1. get_actor(addr) from StateTree
//   2. if delegated_address present → return it
//   3. else → load account state, return pubkey_address()

New StateManager::resolve_to_deterministic_address

match address.protocol() {
    BLS | Secp256k1 | Delegated => Ok(address),       // return as-is
    Actor => bail!(...),                                // bail
    ID => {
        // Cached; calls StateTree::resolve_to_deterministic_address:
        //   1. if delegated_address present → return it
        //   2. else → load account state, return pubkey_address()
    }
}

Verdict: Logically identical for all practical inputs, with two enhancements in the new version

| Aspect | Old resolve_to_key_addr | New resolve_to_deterministic_address |
| --- | --- | --- |
| BLS/Secp256k1/Delegated | return as-is ✅ | return as-is ✅ |
| Actor | error ✅ | error ✅ |
| ID → delegated_address present | return it ✅ | return it ✅ |
| ID → no delegated_address | pubkey_address() | pubkey_address() |
| Caching | ❌ none | id_to_deterministic_address_cache |
| _ => {} catch-all | ✅ (any unknown future protocol tries resolution) | ✅ exhaustive match (ID explicit) |

The underlying resolution paths for ID addresses (the standalone resolve_to_key_addr helper and StateTree::resolve_to_deterministic_address) are also logically identical: both prefer delegated_address if present, and otherwise fall back to the account actor's pubkey_address(). The AI summary for src/shim/state_tree.rs does note that the old version had a "version/protocol-gated workaround" for delegated_address while the new one is unconditional, but this is a minor edge-case difference in the StateTree layer, not a semantic divergence for normal usage.

In summary: yes, the removed resolve_to_key_addr is logically identical to resolve_to_deterministic_address, with the new version adding caching as an optimization on top.
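
The resolution rules compared above can be condensed into a small self-contained sketch. The types here are simplified assumptions made for illustration (`Address` with string payloads, `ActorEntry`, and a `HashMap` standing in for the state tree); they are not Forest's real types, and caching is left out.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an address; real addresses carry payload bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Address {
    Id(u64),
    Secp256k1(String),
    Actor(String),
    Bls(String),
    Delegated(String),
}

/// Hypothetical actor record: an optional delegated address plus the
/// key address that the account state would provide.
struct ActorEntry {
    delegated_address: Option<Address>,
    pubkey_address: Address,
}

/// Mirrors the decision table above: key-like addresses pass through,
/// `Actor` addresses are rejected, and `Id` addresses are looked up.
fn resolve_to_deterministic_address(
    actors: &HashMap<u64, ActorEntry>,
    address: Address,
) -> Result<Address, String> {
    match address {
        Address::Bls(_) | Address::Secp256k1(_) | Address::Delegated(_) => Ok(address),
        Address::Actor(_) => Err("cannot resolve an Actor address".to_string()),
        Address::Id(id) => {
            let actor = actors
                .get(&id)
                .ok_or_else(|| format!("actor {id} not found"))?;
            // Prefer the delegated address, else fall back to the key address.
            Ok(actor
                .delegated_address
                .clone()
                .unwrap_or_else(|| actor.pubkey_address.clone()))
        }
    }
}

fn main() {
    let mut actors = HashMap::new();
    actors.insert(
        100,
        ActorEntry { delegated_address: None, pubkey_address: Address::Bls("key-a".into()) },
    );
    actors.insert(
        200,
        ActorEntry {
            delegated_address: Some(Address::Delegated("delegated-b".into())),
            pubkey_address: Address::Secp256k1("key-b".into()),
        },
    );
    println!("{:?}", resolve_to_deterministic_address(&actors, Address::Id(100)));
    println!("{:?}", resolve_to_deterministic_address(&actors, Address::Id(200)));
    println!("{:?}", resolve_to_deterministic_address(&actors, Address::Actor("x".into())));
}
```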


🧠 Learnings used
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6666
File: src/tool/subcommands/archive_cmd.rs:628-631
Timestamp: 2026-03-02T09:43:34.946Z
Learning: In Rust sources related to Forest snapshot handling, implement the rule: enable message_receipts and events (message_receipts: true, events: true) only for GC snapshots as defined in src/db/gc/snapshot.rs, since these are internal snapshots created during garbage collection. For user-facing export commands such as src/tool/subcommands/archive_cmd.rs, disable receipts and events by default (message_receipts: false, events: false) to keep user-facing snapshots smaller, unless explicitly requested. This guidance targets Rust files; apply consistently across similar snapshot-related modules, using the narrowest applicable scope when extending beyond the two specified files.

Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 6903
File: src/shim/executor.rs:243-251
Timestamp: 2026-04-14T17:24:13.631Z
Learning: In the ChainSafe/forest (LesnyRumcajs) codebase, do not require rustdoc comments on public methods when the method’s purpose is clearly self-explanatory from its name (e.g., `entries(&self)`, `into_entries(self)`). When reviewing Rust (`.rs`) files, avoid flagging missing documentation for these obviously named public APIs, reserving doc requirements for less clear or non-obvious public methods.

Collaborator

@akaladarshi akaladarshi left a comment

@hanabi1224 Some documents, such as metrics.md, still haven't been updated for the LRU cache metrics. Please update those as well.

Comment thread src/chain_sync/bad_block_cache.rs Outdated
Comment thread src/utils/cache/size_tracking.rs
Comment thread src/rpc/methods/eth.rs
  let block_cid = tipset.key().cid()?;
  ETH_BLOCK_HASH_TX_CACHE
- .get_or_else(&block_cid.into(), async move || {
+ .get_or_insert_async(&CidWrapper::from(block_cid), async move {
Collaborator

Don't we need to track hit and miss metrics for ETH_BLOCK_HASH_TX_CACHE here?

Contributor Author

I think we could add hit/miss metrics at the SizeTrackingCache level instead of instrumenting each caller one by one. I'd do this in a subsequent PR.
@coderabbitai please create a tracking issue for this.

Contributor

🧩 Analysis chain

@hanabi1224, I've created a tracking issue for adding hit/miss metrics at the SizeTrackingCache level. Here's the link once the script runs — it will be assigned to you.

The issue captures:

  • The motivation (avoiding per-caller instrumentation in favor of a centralized approach)
  • A link back to this PR and the specific comment
  • The proposed work (adding prometheus-style counters keyed by cache name inside SizeTrackingCache, covering get_or_insert_with, get_or_insert_async, and other retrieval methods)

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo ChainSafe/forest \
  --title "feat: add hit/miss metrics to SizeTrackingCache" \
  --body "## Summary

Currently, \`SizeTrackingCache\` does not track cache hit/miss metrics. Instead of adding per-caller instrumentation (e.g., on \`ETH_BLOCK_HASH_TX_CACHE\`, \`ETH_BLOCK_FULL_TX_CACHE\`, etc.), it would be cleaner and more scalable to implement hit/miss tracking centrally at the \`SizeTrackingCache\` level so all caches benefit automatically.

## Motivation

In PR `#7077`, \`ETH_BLOCK_HASH_TX_CACHE\` and \`ETH_BLOCK_FULL_TX_CACHE\` were migrated from \`ForestCache\` to \`SizeTrackingCache\`. During review, it was noted ([comment](https://github.com/ChainSafe/forest/pull/7077#discussion_r3264120152)) that hit/miss metrics are not currently tracked for these caches. Rather than doing this one by one on the caller side, the preferred approach is to add hit/miss metrics at the \`SizeTrackingCache\` level.

## Proposed Work

- Add hit and miss counters/metrics inside \`SizeTrackingCache\` (e.g., via \`prometheus\` counters keyed by cache name).
- Ensure \`get_or_insert_with\` and \`get_or_insert_async\` (and any other retrieval methods) increment the appropriate counter.
- Verify metrics are visible/exported via the existing metrics infrastructure.

## References

- PR `#7077` (context for this issue)
- Requested by `@hanabi1224`
" \
  --assignee hanabi1224

Length of output: 76


🏁 Script executed:

#!/bin/bash
curl -s -X POST \
  -H "Authorization: token ${GH_TOKEN}" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/ChainSafe/forest/issues \
  -d '{
    "title": "feat: add hit/miss metrics to SizeTrackingCache",
    "body": "## Summary\n\nCurrently, `SizeTrackingCache` does not track cache hit/miss metrics. Instead of adding per-caller instrumentation (e.g., on `ETH_BLOCK_HASH_TX_CACHE`, `ETH_BLOCK_FULL_TX_CACHE`, etc.), it would be cleaner and more scalable to implement hit/miss tracking centrally at the `SizeTrackingCache` level so all caches benefit automatically.\n\n## Motivation\n\nIn PR `#7077`, `ETH_BLOCK_HASH_TX_CACHE` and `ETH_BLOCK_FULL_TX_CACHE` were migrated from `ForestCache` to `SizeTrackingCache`. During review, it was noted ([comment](https://github.com/ChainSafe/forest/pull/7077#discussion_r3264120152)) that hit/miss metrics are not currently tracked for these caches. Rather than doing this one by one on the caller side, the preferred approach is to add hit/miss metrics at the `SizeTrackingCache` level.\n\n## Proposed Work\n\n- Add hit and miss counters/metrics inside `SizeTrackingCache` (e.g., via `prometheus` counters keyed by cache name).\n- Ensure `get_or_insert_with` and `get_or_insert_async` (and any other retrieval methods) increment the appropriate counter.\n- Verify metrics are visible/exported via the existing metrics infrastructure.\n\n## References\n\n- PR `#7077` (context for this issue)\n- Requested by `@hanabi1224`\n",
    "assignees": ["hanabi1224"]
  }' | jq '{number: .number, url: .html_url, title: .title}'

Length of output: 53


@hanabi1224, I'm unable to create the issue automatically in this environment due to authentication constraints. Here are the details you can use to open it manually:


Title: feat: add hit/miss metrics to SizeTrackingCache

Body:

## Summary

Currently, `SizeTrackingCache` does not track cache hit/miss metrics. Instead of adding per-caller instrumentation (e.g., on `ETH_BLOCK_HASH_TX_CACHE`, `ETH_BLOCK_FULL_TX_CACHE`, etc.), it would be cleaner and more scalable to implement hit/miss tracking centrally at the `SizeTrackingCache` level so all caches benefit automatically.

## Motivation

In PR `#7077`, `ETH_BLOCK_HASH_TX_CACHE` and `ETH_BLOCK_FULL_TX_CACHE` were migrated from `ForestCache` to `SizeTrackingCache`. During review, it was noted ([comment](https://github.com/ChainSafe/forest/pull/7077#discussion_r3264120152)) that hit/miss metrics are not currently tracked for these caches. Rather than doing this one by one on the caller side, the preferred approach is to add hit/miss metrics at the `SizeTrackingCache` level.

## Proposed Work

- Add hit and miss counters/metrics inside `SizeTrackingCache` (e.g., via `prometheus` counters keyed by cache name).
- Ensure `get_or_insert_with` and `get_or_insert_async` (and any other retrieval methods) increment the appropriate counter.
- Verify metrics are visible/exported via the existing metrics infrastructure.

## References

- PR `#7077` (context for this issue)
- Requested by `@hanabi1224`

Assignee: hanabi1224

You can create it directly at: https://github.com/ChainSafe/forest/issues/new


🧠 Learnings used
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Learnt from: sudo-shashank
Repo: ChainSafe/forest PR: 6498
File: src/rpc/methods/eth.rs:1778-1780
Timestamp: 2026-01-28T21:30:16.490Z
Learning: When reviewing Ethereum RPC method implementations (e.g., in src/rpc/methods/*.rs), treat the limit/lookback parameter as a sentinel flag: -1 indicates no lookback limit (equivalent to LOOKBACK_NO_LIMIT). Ensure code paths properly translate -1 to the defined constant, document the behavior in comments, and validate that any downstream logic respects an unlimited lookback rather than applying a default cap.

Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6666
File: src/tool/subcommands/archive_cmd.rs:628-631
Timestamp: 2026-03-02T09:43:34.946Z
Learning: In Rust sources related to Forest snapshot handling, implement the rule: enable message_receipts and events (message_receipts: true, events: true) only for GC snapshots as defined in src/db/gc/snapshot.rs, since these are internal snapshots created during garbage collection. For user-facing export commands such as src/tool/subcommands/archive_cmd.rs, disable receipts and events by default (message_receipts: false, events: false) to keep user-facing snapshots smaller, unless explicitly requested. This guidance targets Rust files; apply consistently across similar snapshot-related modules, using the narrowest applicable scope when extending beyond the two specified files.

Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 6903
File: src/shim/executor.rs:243-251
Timestamp: 2026-04-14T17:24:13.631Z
Learning: In the ChainSafe/forest (LesnyRumcajs) codebase, do not require rustdoc comments on public methods when the method’s purpose is clearly self-explanatory from its name (e.g., `entries(&self)`, `into_entries(self)`). When reviewing Rust (`.rs`) files, avoid flagging missing documentation for these obviously named public APIs, reserving doc requirements for less clear or non-obvious public methods.

Contributor Author

Done in this PR

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Contributor Author

@akaladarshi now we have automatic hit/miss stats for every SizeTrackingCache
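
For readers who want a concrete picture of "automatic" hit/miss stats, here is a minimal sketch, assuming the counters live inside the cache wrapper so that every lookup updates them. `CountingCache` and `render_metrics` are hypothetical names and a plain `HashMap` stands in for the real storage; this is not Forest's `SizeTrackingCache`. The metric names only mirror the `cache_<name>_hits` / `cache_<name>_misses` shape quoted in the PR summary.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical cache wrapper that records its own hit/miss counts.
/// Illustrative only; not the real `SizeTrackingCache`.
pub struct CountingCache<K, V> {
    name: &'static str,
    map: HashMap<K, V>,
    hits: AtomicU64,
    misses: AtomicU64,
}

impl<K: Hash + Eq, V: Clone> CountingCache<K, V> {
    pub fn new(name: &'static str) -> Self {
        Self {
            name,
            map: HashMap::new(),
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
        }
    }

    pub fn insert(&mut self, key: K, value: V) {
        self.map.insert(key, value);
    }

    /// Every lookup goes through here, so callers never touch the counters.
    pub fn get(&self, key: &K) -> Option<V> {
        let found = self.map.get(key).cloned();
        let counter = if found.is_some() { &self.hits } else { &self.misses };
        counter.fetch_add(1, Ordering::Relaxed);
        found
    }

    pub fn hits(&self) -> u64 {
        self.hits.load(Ordering::Relaxed)
    }

    pub fn misses(&self) -> u64 {
        self.misses.load(Ordering::Relaxed)
    }

    /// Renders the two counters as `cache_<name>_hits` / `cache_<name>_misses` lines.
    pub fn render_metrics(&self) -> String {
        format!(
            "cache_{0}_hits {1}\ncache_{0}_misses {2}\n",
            self.name,
            self.hits(),
            self.misses()
        )
    }
}

fn main() {
    let mut cache = CountingCache::new("demo");
    cache.insert(1u64, "f01".to_string());
    let _ = cache.get(&1); // counted as a hit
    let _ = cache.get(&2); // counted as a miss
    print!("{}", cache.render_metrics());
}
```

Keeping the counters inside the wrapper means a new cache gets metrics without any per-call-site bookkeeping, which is the property the comment above describes.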

Collaborator

@hanabi1224 I saw that the peek method in quick cache doesn't record a cache hit or miss.

Is it not worth tracking? We weren't recording it to begin with, but I just want to confirm here once for the record.

Collaborator

I am asking specifically about BadBlockCache, since it only uses peek.

Contributor Author

@akaladarshi that makes sense. I don't think we need peek as the cache policy is now CLOCK-pro instead of LRU
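
As background, one common reason to reach for peek under an LRU policy is that a regular read promotes the entry and so changes which key is evicted next. The toy below sketches only that LRU behaviour; `TinyLru` is a hypothetical type, assumes a capacity of at least 1, and does not model CLOCK-Pro or the internals of any real cache crate.

```rust
use std::collections::VecDeque;

/// Toy LRU with a fixed capacity (assumed >= 1).
/// The front of the deque is the next eviction victim.
pub struct TinyLru {
    capacity: usize,
    entries: VecDeque<(u64, &'static str)>,
}

impl TinyLru {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, entries: VecDeque::new() }
    }

    pub fn insert(&mut self, key: u64, value: &'static str) {
        // Drop any stale copy of the key, then evict if the cache is full.
        self.entries.retain(|(k, _)| *k != key);
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((key, value));
    }

    /// Reads the value and promotes the entry to most recently used.
    pub fn get(&mut self, key: u64) -> Option<&'static str> {
        let pos = self.entries.iter().position(|(k, _)| *k == key)?;
        let entry = self.entries.remove(pos)?;
        self.entries.push_back(entry);
        Some(entry.1)
    }

    /// Reads the value without touching the recency order.
    pub fn peek(&self, key: u64) -> Option<&'static str> {
        self.entries.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
    }
}

fn main() {
    let mut lru = TinyLru::new(2);
    lru.insert(1, "a");
    lru.insert(2, "b");
    let _ = lru.peek(1); // no promotion: key 1 is still the oldest entry
    lru.insert(3, "c"); // evicts key 1
    println!("key 1 after peek + insert: {:?}", lru.peek(1));
}
```

Swapping the `peek(1)` call for `get(1)` would promote key 1 and cause key 2 to be evicted instead.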

Comment thread src/daemon/mod.rs
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/state_manager/cache.rs (1)

62-74: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Hit/miss accounting is reversed here.

Line 66 only executes when the insert future is actually polled, so it marks a cache miss, not a hit. As written, computed entries increment TIPSET_HIT, while cached reads increment TIPSET_MISS.

Suggested fix
-        let mut hit = false;
+        let mut miss = false;
         let value = self
             .cache
             .get_or_insert_async(key, async {
-                hit = true;
+                miss = true;
                 compute().await
             })
             .await?;
-        if hit {
-            TIPSET_HIT.inc();
-        } else {
-            TIPSET_MISS.inc();
-        }
+        if miss {
+            TIPSET_MISS.inc();
+        } else {
+            TIPSET_HIT.inc();
+        }
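
The claim that the closure runs only on a miss can be checked in isolation. Below is a minimal synchronous sketch, assuming a plain `HashMap` in place of the real cache and returning the outcome instead of bumping `TIPSET_HIT` / `TIPSET_MISS`; `get_or_insert_with` and `Lookup` are hypothetical names used only for this illustration.

```rust
use std::collections::HashMap;

/// Outcome of a lookup, returned instead of updating global counters
/// so the behaviour is easy to assert on.
#[derive(Debug, PartialEq, Eq)]
pub enum Lookup {
    Hit,
    Miss,
}

/// Simplified, synchronous analogue of the `get_or_insert_async` call above.
/// `compute` is invoked only when `key` is absent from the map.
pub fn get_or_insert_with<F>(
    cache: &mut HashMap<u64, String>,
    key: u64,
    compute: F,
) -> (String, Lookup)
where
    F: FnOnce() -> String,
{
    let mut miss = false;
    let value = cache
        .entry(key)
        .or_insert_with(|| {
            // Reaching this closure means the key was not cached.
            miss = true;
            compute()
        })
        .clone();
    let outcome = if miss { Lookup::Miss } else { Lookup::Hit };
    (value, outcome)
}

fn main() {
    let mut cache = HashMap::new();
    let (_, first) = get_or_insert_with(&mut cache, 7, || "state".to_string());
    // The second closure never runs because the key is already present.
    let (_, second) = get_or_insert_with(&mut cache, 7, || unreachable!("already cached"));
    println!("{first:?} then {second:?}");
}
```

Naming the flag after the event that sets it (`miss`) keeps the branch that follows hard to invert by accident.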
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/state_manager/cache.rs` around lines 62 - 74, The hit/miss flag is
inverted because the async closure given to get_or_insert_async runs only on
cache misses; change the logic so hit starts true and the closure sets hit =
false (so a computed entry marks a miss), then keep the TIPSET_HIT / TIPSET_MISS
increments as-is; update the mutable variable usage around get_or_insert_async
and the async closure that calls compute() to flip the boolean assignment (use
hit = true before calling, and inside the async closure set hit = false).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/cache/size_tracking.rs`:
- Around line 187-195: The compilation fails because get_or_insert_async uses
impl Future but std::future::Future is not imported; add the import "use
std::future::Future;" to the module's imports (top of
src/utils/cache/size_tracking.rs) so the function signature in
get_or_insert_async can resolve the Future trait; ensure the new use is grouped
with other std imports.

---

Outside diff comments:
In `@src/state_manager/cache.rs`:
- Around line 62-74: The hit/miss flag is inverted because the async closure
given to get_or_insert_async runs only on cache misses; change the logic so hit
starts true and the closure sets hit = false (so a computed entry marks a miss),
then keep the TIPSET_HIT / TIPSET_MISS increments as-is; update the mutable
variable usage around get_or_insert_async and the async closure that calls
compute() to flip the boolean assignment (use hit = true before calling, and
inside the async closure set hit = false).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f212dfa7-c1dc-4385-9772-3bcead221089

📥 Commits

Reviewing files that changed from the base of the PR and between 64e4d70 and cf4ebe4.

📒 Files selected for processing (2)
  • src/state_manager/cache.rs
  • src/utils/cache/size_tracking.rs

Comment thread src/utils/cache/size_tracking.rs Outdated
@hanabi1224
Contributor Author

There are some documents where we still haven't updated the LRU cache metrics, such as metrics.md. Please update those as well.

@akaladarshi docs updated

akaladarshi previously approved these changes May 20, 2026
Collaborator

@akaladarshi akaladarshi left a comment

LGTM. Nice Work!

Just one comment regarding peek (BadBlockCache).

Comment thread src/chain/store/chain_store.rs Outdated
-        Self {
-            cache: self.cache.shallow_clone(),
-        }
+        Self(self.0.shallow_clone())
Member

It'd be nice to improve ergonomics here. Consider implementing Deref/DerefMut to avoid always specifying .0.
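
A minimal sketch of that suggestion, assuming a tuple-struct wrapper around some inner cache type. `CacheHandle` is a hypothetical name and `HashMap` stands in for the real inner type; the actual change in the PR may look different.

```rust
use std::collections::HashMap;
use std::ops::{Deref, DerefMut};

/// Hypothetical tuple-struct wrapper; `HashMap` stands in for the inner cache.
pub struct CacheHandle(HashMap<u64, String>);

impl Deref for CacheHandle {
    type Target = HashMap<u64, String>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for CacheHandle {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

fn main() {
    let mut cache = CacheHandle(HashMap::new());
    // Method calls auto-deref to the inner map, so `.0` is not needed.
    cache.insert(1, "head".to_string());
    println!("{:?}", cache.get(&1));
}
```

With both impls in place, shared and mutable method calls resolve through the wrapper, so `.0` only shows up where the wrapper itself is constructed or destructured.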

Contributor Author

Fixed.

///
/// Gets or inserts an item in the cache with key.
/// Concurrent callers for the same key are coalesced — only one runs
/// `compute`, the rest wait on the result.
Member

why did you cut this out?

Contributor Author

It was dropped by mistake. Fixed.
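
To illustrate what the restored doc comment means by coalescing, here is a blocking sketch built only from the standard library, assuming one `OnceLock` slot per key. `CoalescingCache` is a hypothetical type; the real method is async and backed by a different cache, and this toy never evicts entries.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

/// Hypothetical coalescing cache: one `OnceLock` slot per key.
pub struct CoalescingCache {
    slots: Mutex<HashMap<u64, Arc<OnceLock<String>>>>,
}

impl CoalescingCache {
    pub fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()) }
    }

    pub fn get_or_insert_with<F>(&self, key: u64, compute: F) -> String
    where
        F: FnOnce() -> String,
    {
        // Hold the map lock only long enough to find or create the slot.
        let mut slots = self.slots.lock().unwrap();
        let slot = slots.entry(key).or_default().clone();
        drop(slots);
        // `get_or_init` runs at most one initializer per slot;
        // concurrent callers for the same key block until it finishes.
        slot.get_or_init(compute).clone()
    }
}

fn main() {
    let cache = Arc::new(CoalescingCache::new());
    let runs = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let (cache, runs) = (Arc::clone(&cache), Arc::clone(&runs));
            thread::spawn(move || {
                cache.get_or_insert_with(42, || {
                    runs.fetch_add(1, Ordering::SeqCst);
                    "computed".to_string()
                })
            })
        })
        .collect();
    for handle in handles {
        assert_eq!(handle.join().unwrap(), "computed");
    }
    println!("compute ran {} time(s)", runs.load(Ordering::SeqCst));
}
```

All four threads receive the same value, while the counter shows how many times the closure actually ran.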

Labels

RPC requires calibnet RPC checks to run on CI

3 participants