
feat(p2p): log received messages with peer count #323

Merged
MegaRedHand merged 1 commit into main from log-p2p-message-received
Apr 28, 2026

Conversation

@pablodeymo (Collaborator)

Motivation

Today, when an ethlambda node receives a p2p message we log per-message-type
lines (Received block from gossip, Received attestation from gossip,
Received BlocksByRoot request, etc.), but none of them carry the count of
currently connected peers. Operators reading logs cannot easily tell, at a
glance, whether ongoing gossip activity is being amplified by a healthy mesh
or coming from just a couple of peers; they have to cross-reference the
last Peer connected / Peer disconnected event, which can be far up the
log.

This PR adds a single, dedicated log line emitted on every inbound p2p
message that includes the peer count, giving operators a continuously
updated network-health signal aligned with traffic.

The log level matches existing chain/fork events (Fork choice reorg detected, Block imported, Peer connected) — info!.

Description

What is logged

A new info! log line per inbound p2p message:

INFO ethlambda_p2p::...: P2P message received kind="block" peer_count=4

kind is a stable snake_case identifier; peer_count is the size of
P2PServer::connected_peers at the moment of reception.

Logged kinds

| kind | Source protocol | Triggering event |
|------|-----------------|------------------|
| block | gossipsub | block topic message |
| aggregation | gossipsub | aggregation topic message |
| attestation | gossipsub | any attestation_<subnet_id> topic message |
| status_request | req/resp | inbound Status request |
| status_response | req/resp | inbound Status response |
| blocks_by_root_request | req/resp | inbound BlocksByRoot request |
| blocks_by_root_response | req/resp | inbound BlocksByRoot response |
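The gossip-side half of this table can be sketched as a small topic-to-kind mapper. This is an illustrative sketch only; the function name and topic strings here are assumptions, not the PR's actual identifiers:

```rust
// Hypothetical mapper from a gossip topic name to the stable snake_case
// `kind` label logged by the new info! line. Unknown topics return None,
// mirroring the PR's decision to leave them on the existing trace! path.
fn topic_to_kind(topic: &str) -> Option<&'static str> {
    match topic {
        "block" => Some("block"),
        "aggregation" => Some("aggregation"),
        // every attestation_<subnet_id> topic collapses to one "attestation" kind
        t if t.starts_with("attestation_") => Some("attestation"),
        _ => None,
    }
}

fn main() {
    assert_eq!(topic_to_kind("block"), Some("block"));
    assert_eq!(topic_to_kind("attestation_7"), Some("attestation"));
    assert_eq!(topic_to_kind("some_future_topic"), None);
    println!("ok");
}
```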

Deliberate exclusions

  • Unknown gossip topics stay at trace! (existing behavior, unchanged).
    They mostly come from cross-version peers and would otherwise add info!
    noise on every misrouted message. This is the one deliberate exception to
    "every received message".
  • OutboundFailure / InboundFailure / ResponseSent in the req/resp
    protocol are not "received messages" in the user-visible sense and remain
    unchanged.
  • Connection lifecycle events (Peer connected, Peer disconnected)
    already include peer_count — untouched.
  • Outbound publish logs (Published block to gossipsub, etc.) — out of
    scope; this PR is about inbound traffic only.

Implementation

Two emit sites — one per protocol entry point — to minimize duplication
and keep the new log close to the earliest knowable point:

  1. crates/net/p2p/src/gossipsub/handler.rs

    • peer_count captured once at the top of handle_gossipsub_message
      into a local variable. The read is immutable and does not conflict
      with the later mutable use of server inside each match arm.
    • info!(kind = …, peer_count, "P2P message received") is placed at
      the head of each known-topic match arm, so the log fires before
      decompression / decoding work — i.e. it captures the message even if
      decoding subsequently fails.
  2. crates/net/p2p/src/req_resp/handlers.rs

    • In handle_req_resp_message, peer_count is captured once per
      Message::Request / Message::Response branch.
    • The new info! is emitted inside the inner request/response variant
      match, so kind is precise (status_request vs blocks_by_root_request,
      etc.).

No new helpers, no new types, no signature changes to the existing
handle_status_request / handle_blocks_by_root_request etc. functions —
they already had &mut P2PServer access. The kind strings are short
literals inlined at each call site (≤ 7 sites across 2 files); abstracting
them into a helper or enum would have been over-engineering.
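A minimal sketch of the emit-site pattern described above, under stated assumptions: all names are illustrative, and the logger is modeled as a Vec<String> so the log-before-decode ordering is observable (the real code emits through tracing's info! macro):

```rust
// Illustrative sketch, not the PR's actual code. Demonstrates the two key
// properties: peer_count is an immutable read completed before any mutable
// use of the server, and the log fires before decode work, so a message is
// recorded even when decoding subsequently fails.
struct Server {
    connected_peers: Vec<u64>, // stand-in for P2PServer::connected_peers
    log: Vec<String>,          // stand-in for the tracing subscriber
}

fn decode_block(raw: &[u8]) -> Result<(), &'static str> {
    if raw.is_empty() { Err("empty payload") } else { Ok(()) }
}

fn handle_block_message(server: &mut Server, raw: &[u8]) -> Result<(), &'static str> {
    // Capture once; the immutable borrow ends here (usize is Copy).
    let peer_count = server.connected_peers.len();
    // Log first: fires even if decode_block fails below.
    server.log.push(format!(
        "P2P message received kind=\"block\" peer_count={peer_count}"
    ));
    decode_block(raw)
}

fn main() {
    let mut server = Server { connected_peers: vec![1, 2, 3, 4], log: Vec::new() };
    let result = handle_block_message(&mut server, b"");
    assert!(result.is_err());            // decoding failed...
    assert_eq!(server.log.len(), 1);     // ...but the reception was still logged
    assert!(server.log[0].contains("peer_count=4"));
    println!("ok");
}
```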

Why a dedicated log line, and not just a new peer_count field on the existing Received … logs?

Two reasonable designs were considered:

  • Option A — augment each existing Received … log with a
    peer_count field (no new log lines, ~7 sites touched).
  • Option B (this PR) — one dedicated P2P message received log per
    reception (one new log line per inbound message).

We chose Option B as a first iteration to evaluate whether the new
correlation is useful in practice without altering the existing per-type
logs that operators already grep for. If the new line proves redundant
with the existing ones, collapsing into Option A is a small follow-up.
The rationale is captured in docs/plans/p2p-received-peer-count-log.md (a
local plan doc, not committed).

Field ordering

Follows the project logging convention from CLAUDE.md
(temporal → identity → identifiers → context → metadata): kind is the
identity-ish field, peer_count is metadata, both come before the message
string.

How to Test

  1. make fmt — passes.
  2. make lint (clippy -D warnings) — passes.
  3. make test — passes (full workspace test suite green; no test changes
    needed since this is logging-only).
  4. Manual smoke test against a local devnet:
    make run-devnet
    # In another shell:
    docker logs -f <ethlambda-container> 2>&1 | grep "P2P message received"
    You should see lines like:
    INFO ethlambda_p2p::gossipsub::handler: P2P message received kind="block" peer_count=4
    INFO ethlambda_p2p::gossipsub::handler: P2P message received kind="attestation" peer_count=4
    INFO ethlambda_p2p::req_resp::handlers: P2P message received kind="status_request" peer_count=4
    
    Verify that:
    • peer_count matches the value last reported by Peer connected /
      Peer disconnected events.
    • All expected kind values appear during normal operation
      (block and attestation every slot; aggregation once
      aggregations occur; status_* shortly after each new connection;
      blocks_by_root_* when a peer is missing blocks and re-syncs).
    • No P2P message received line appears for messages on unknown
      topics (those stay at trace!).
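To check that all expected kind values appear, the grepped output can be tallied per kind. This is a hypothetical helper, with inline sample lines standing in for the `docker logs` stream:

```shell
# Count occurrences of each kind="..." label in the log output.
# The heredoc below substitutes for: docker logs <ethlambda-container> 2>&1 | grep "P2P message received"
cat <<'EOF' | grep -o 'kind="[a-z_]*"' | sort | uniq -c | sort -rn
INFO ethlambda_p2p::gossipsub::handler: P2P message received kind="attestation" peer_count=4
INFO ethlambda_p2p::gossipsub::handler: P2P message received kind="attestation" peer_count=4
INFO ethlambda_p2p::gossipsub::handler: P2P message received kind="block" peer_count=4
INFO ethlambda_p2p::req_resp::handlers: P2P message received kind="status_request" peer_count=4
EOF
```

A kind that never shows up in the tally during normal operation (e.g. no status_* shortly after a new connection) would point at a missed emit site.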

Risk / open questions

  • Log volume. This adds one info! line per inbound message. With
    N validators, that is roughly N attestations + 1 block per slot per
    node, plus any aggregations and req/resp traffic — acceptable on the
    current devnets. If volume becomes a concern in production, the clear
    follow-up is to collapse peer_count into the existing per-type logs
    (Option A above) and remove this dedicated line.
  • Operational follow-up. No metric is added in this PR. If we later
    want a counterpart lean_p2p_messages_received_total{kind=…} counter,
    that can be a separate change.
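The metric follow-up mentioned above could take roughly this shape. This is a sketch only: the counter name comes from the PR text, but the struct and method names are assumptions, and a real implementation would go through the project's metrics crate rather than a HashMap:

```rust
// Hypothetical per-kind counter modeling lean_p2p_messages_received_total{kind=...}.
use std::collections::HashMap;

#[derive(Default)]
struct P2pMetrics {
    // kind label -> monotonically increasing message count
    messages_received_total: HashMap<&'static str, u64>,
}

impl P2pMetrics {
    // Increment the counter for one received message of the given kind.
    fn inc(&mut self, kind: &'static str) {
        *self.messages_received_total.entry(kind).or_insert(0) += 1;
    }
}

fn main() {
    let mut m = P2pMetrics::default();
    m.inc("block");
    m.inc("attestation");
    m.inc("attestation");
    assert_eq!(m.messages_received_total["attestation"], 2);
    println!("ok");
}
```

A counter like this would carry the same traffic signal as the dedicated log line with none of the log-volume cost, which is why both bot reviews suggest it as the production-scale alternative.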

Files changed

  • crates/net/p2p/src/gossipsub/handler.rs — 3 new info! calls + 1
    local peer_count capture.
  • crates/net/p2p/src/req_resp/handlers.rs — 4 new info! calls + 2
    local peer_count captures; restructure of the match to scope
    peer_count to the request/response branches.

No new files. No public API changes. No dependency changes.

  Add a dedicated info-level log emitted on each inbound p2p message that
  includes the connected-peer count, giving operators visibility into
  network activity correlated with peer connectivity.

  Logged kinds: block, aggregation, attestation, status_request,
  status_response, blocks_by_root_request, blocks_by_root_response.
  Unknown gossip topics remain at trace level. Existing per-message
  reception logs are unchanged.
@github-actions

🤖 Kimi Code Review

Overall Assessment: The PR adds useful observability for P2P message handling, but has significant gaps in debuggability (missing peer IDs) and operational concerns (log verbosity).

Critical Issues:

1. Missing Peer Identification in Logs

  • File: crates/net/p2p/src/gossipsub/handler.rs (lines 33, 66, 97)
  • File: crates/net/p2p/src/req_resp/handlers.rs (lines 32, 44, 51, 58)
  • The logs capture peer_count but omit the specific peer ID. For P2P debugging (e.g., identifying malicious or buggy peers), the specific peer is essential. The message in gossipsub has a source: Option<PeerId> field, and req_resp has peer in scope.
  • Suggestion: Add %peer (or ?peer for debug) to all log statements. For gossipsub: info!(%peer, kind = "block", peer_count, "P2P message received");

2. Log Level Too Verbose for High-Frequency Messages

  • File: crates/net/p2p/src/gossipsub/handler.rs lines 66, 97 (attestations/aggregations)
  • Attestations arrive at high frequency (thousands per slot on mainnet). Using info! level will cause log spam and I/O overhead in production.
  • Suggestion: Use debug! for attestation and aggregation messages. Keep info! for block messages (lower frequency, higher importance).

Code Quality:

3. Inconsistent Variable Placement

  • File: crates/net/p2p/src/gossipsub/handler.rs line 29 vs req_resp/handlers.rs lines 32, 50
  • In gossipsub, peer_count is computed once at function start. In req_resp, it's computed separately in each match arm (Request/Response). While the lazy approach in req_resp is technically more efficient (computed only when needed), consistency would improve readability.
  • Suggestion: Consider moving peer_count to the top of the function scope in req_resp/handlers.rs for consistency, unless the arms are genuinely cold paths.

4. Redundant String Allocation in Error Path

  • File: crates/net/p2p/src/req_resp/handlers.rs line 68
  • Not introduced by this PR, but adjacent: String::from_utf8_lossy(&message) allocates a Cow eagerly. Since this is just for logging, keep the conversion inside the log field so it is only evaluated when the warn! level is enabled: warn!(%peer, ?code, message = %String::from_utf8_lossy(&message), ...).

Security/Performance Notes:

  • O(1) Assumption: server.connected_peers.len() assumes O(1) complexity. Ensure connected_peers is not a DashMap or concurrent structure with expensive .len() operations (rare, but verify).
  • SSZ/Data Safety: The decompression and SSZ decoding happen after logging, which is correct (fail fast on cheap ops, log before heavy CPU work).

Suggested Diff Amendments:

// In gossipsub/handler.rs, line 33:
info!(peer = ?message.source, kind = "block", peer_count, "P2P message received");

// In req_resp/handlers.rs, line 32:
info!(%peer, kind = "status_request", peer_count, "P2P message received");

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

🤖 Codex Code Review

  1. crates/net/p2p/src/gossipsub/handler.rs:33, crates/net/p2p/src/gossipsub/handler.rs:66, crates/net/p2p/src/gossipsub/handler.rs:97, crates/net/p2p/src/req_resp/handlers.rs:35, crates/net/p2p/src/req_resp/handlers.rs:39, crates/net/p2p/src/req_resp/handlers.rs:55, and crates/net/p2p/src/req_resp/handlers.rs:59 add unconditional info! logs on the hottest untrusted inbound paths. The node’s default log filter is INFO (bin/ethlambda/src/main.rs:93), so every inbound attestation/block/request now emits an extra synchronous log line before decompression and SSZ decode. Those handlers already log the successfully decoded payload later at info! (gossipsub/handler.rs:51, 80, 112, req_resp/handlers.rs:109, 116, 126, 150), so this doubles log volume for valid traffic and makes log-I/O easier to abuse under gossip spam. In a consensus client, that is a real availability risk. These new events should be debug!/trace!, sampled, or converted to metrics if the goal is traffic visibility.

No other correctness or consensus-path regressions stood out in this diff; it is otherwise just instrumentation. I did not run tests for this review.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps (Contributor)

greptile-apps Bot commented Apr 28, 2026

Greptile Summary

This PR adds a lightweight P2P message received info! log line at each inbound p2p message site — three gossipsub topic arms and four req/resp variants — carrying a kind label and a peer_count snapshot. The implementation is minimal: one usize capture per entry point, no new helpers, no signature changes.

Confidence Score: 5/5

Safe to merge — logging-only change with no logic mutations, no new types, no signature changes.

All changes are additive info! log lines and usize captures. Borrow safety is correct (peer_count holds a Copy value, so the immutable borrow of connected_peers ends immediately before any mutable use of server). The deliberate exclusions (unknown topics, Response::Error, outbound events) are well-documented. No P1 or P0 findings.

No files require special attention.

Important Files Changed

| Filename | Overview |
|----------|----------|
| crates/net/p2p/src/gossipsub/handler.rs | Adds a single peer_count capture before the topic match and one info! log per known gossip topic (block, aggregation, attestation); no logic changes. |
| crates/net/p2p/src/req_resp/handlers.rs | Restructures request/response match arms into blocks to scope peer_count captures; adds info! logs for status_request, blocks_by_root_request, status_response, blocks_by_root_response; Response::Error is deliberately excluded. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Inbound P2P event] --> B{Protocol}
    B -->|gossipsub| C[handle_gossipsub_message]
    B -->|req_resp| D[handle_req_resp_message]

    C --> C1[peer_count = connected_peers.len]
    C1 --> C2{topic_kind}
    C2 -->|BLOCK| C3["info! kind=block, peer_count"]
    C2 -->|AGGREGATION| C4["info! kind=aggregation, peer_count"]
    C2 -->|ATTESTATION_SUBNET_*| C5["info! kind=attestation, peer_count"]
    C2 -->|unknown| C6["trace! unchanged"]

    D --> D1{message type}
    D1 -->|Request| D2[peer_count = connected_peers.len]
    D2 --> D3{request variant}
    D3 -->|Status| D4["info! kind=status_request, peer_count"]
    D3 -->|BlocksByRoot| D5["info! kind=blocks_by_root_request, peer_count"]
    D1 -->|Response| D6[peer_count = connected_peers.len]
    D6 --> D7{response variant}
    D7 -->|Success/Status| D8["info! kind=status_response, peer_count"]
    D7 -->|Success/BlocksByRoot| D9["info! kind=blocks_by_root_response, peer_count"]
    D7 -->|Error| D10["warn! Received error response unchanged"]
```


@github-actions

🤖 Claude Code Review

Here is my review of PR #323:


Review: feat(p2p): log received messages with peer count

Overview

This PR adds a dedicated info! log line per inbound p2p message carrying kind and peer_count, giving operators an in-line network-health signal without cross-referencing connection lifecycle events. The change touches two files: gossipsub/handler.rs and req_resp/handlers.rs.


What's Good

  • The peer_count snapshot is taken once per dispatch site before any decompression/decoding work. Capturing it early means the log fires even when decode subsequently fails — correct behavior.
  • The req_resp restructuring (adding braces around match arms to scope peer_count) is minimal and preserves identical semantics.
  • Field ordering (kind, peer_count, message string) matches the project convention from CLAUDE.md (identity → metadata).
  • Response::Error keeps its existing warn! log without the new info! line — a sensible deliberate exclusion.
  • The _ => unknown-topic branch is correctly left at trace!, consistent with the stated exclusion policy.

Issues and Suggestions

1. Double logging on every received message (design concern)

Every inbound message now emits two info! lines:

  • The new generic "P2P message received" line.
  • The existing specific line ("Received block from gossip", "Received BlocksByRoot response", etc.).

For example, a gossip block now produces:

INFO P2P message received kind="block" peer_count=4
INFO Received block from gossip slot=... proposer=... block_root=...

The PR justifies this as "Option B" but the cost is real: N attestations + 1 block per slot effectively doubles info! log volume for the hottest paths. Option A (adding peer_count to the existing specific log lines) would deliver the same operational signal with fewer lines and better co-location of context. Worth reconsidering before merging, since collapsing later requires a follow-up PR that may never happen.

2. Redundant peer_count computation on the unknown-topic path

In handler.rs:29, peer_count is captured unconditionally before the topic match. When topic_kind falls into the _ => arm (unknown topic), the value is computed and never used. This is not a correctness issue — connected_peers.len() is a trivial usize read — but moving the capture inside each known arm (or after the check) would make the code's intent clearer.

3. Pre-existing convention violation in handle_status_request (not introduced here, but adjacent)

handlers.rs:109:

info!(finalized_slot=%request.finalized.slot, head_slot=%request.head.slot, "Received status request from peer {peer}");

peer is embedded in the message string rather than as a structured tracing field (%peer). The project convention (and CLAUDE.md) calls for structured fields. Outside scope for this PR, but a quick cleanup opportunity.

4. peer not included in the new gossipsub log line

Gossipsub messages don't have a single canonical source peer (they propagate through the mesh, and propagation_source is discarded with _), so omitting peer from the gossipsub log is correct. Worth a brief comment in the PR description or code to prevent future confusion, since the req_resp variant could reasonably carry %peer but doesn't.

5. Minor: peer_count captured twice in req_resp/handlers.rs (lines 32 and 51)

This is deliberate (one per Message::Request, one per Message::Response) and correct. No issue, just noting it is a bit repetitive — if the outer match ever grows a third arm that needs it, it will need a third capture.


Risk

  • Log volume: One additional info! per inbound message at current devnet scale is acceptable, but the PR's own risk section correctly identifies this as the only concern. Should be revisited before any production-scale deployment.
  • No correctness risk: This is purely additive logging. No state, no data flow, no error handling is changed.
  • No security risk: Logging peer_count (a length of a local collection) cannot leak sensitive information.

Summary

The implementation is correct and follows project conventions. The main open question is whether the "Option B" design (dedicated line per message) is the right long-term trade-off versus Option A (adding peer_count to existing lines). If log volume is a future concern, Option A will produce a cleaner fix than a post-hoc removal. Recommend either switching to Option A now, or explicitly marking this PR as a temporary evaluation (e.g., in a TODO comment) to ensure the follow-up consolidation actually happens.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@MegaRedHand MegaRedHand merged commit dcf2287 into main Apr 28, 2026
7 checks passed
@MegaRedHand MegaRedHand deleted the log-p2p-message-received branch April 28, 2026 20:28