Conversation
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
📝 WalkthroughWalkthroughAdds a new "High Availability" section to the docs sidebar and introduces three new HA guides: an overview of Raft-based operation, a 5-node cluster setup guide, and a single-to-HA migration runbook. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3293 +/- ##
==========================================
+ Coverage 62.45% 62.48% +0.03%
==========================================
Files 122 122
Lines 13047 13047
==========================================
+ Hits 8149 8153 +4
+ Misses 4012 4009 -3
+ Partials 886 885 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/.vitepress/config.ts (1)
345-355:⚠️ Potential issue | 🟡 MinorFix MD040: add a language to the RTT adaptation equations code fence.
The fenced block under “Adapting for different RTT values” starts at
Line 349without a language specifier. Add something liketext(ormath) so markdownlint MD040 is satisfied.Suggested change
-``` +```text heartbeat_timeout = RTT_MAX × 4 election_timeout = heartbeat_timeout × 4 leader_lease_timeout = heartbeat_timeout / 2 send_timeout = RTT_MAX × 3</details> Based on learnings, markdownlint-cli2 (MD040) reports: “Fenced code blocks should have a language specified (MD040, fenced-code-language)”. <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/.vitepress/config.tsaround lines 345 - 355, The fenced code block
under the "Adapting for different RTT values" heading is missing a language tag
which triggers markdownlint MD040; edit the markdown string containing that
heading and change the opening triple backticks from plainto a language-tagged fence (for exampletext or ```math) so the RTT equations
block is annotated (preserve the existing lines for heartbeat_timeout,
election_timeout, leader_lease_timeout, send_timeout). Ensure only the opening
fence is updated and no other content is altered.</details> </blockquote></details> </blockquote></details>🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed. Inline comments: In `@docs/guides/ha/cluster-setup.md`: - Around line 262-266: The current failover step uses a blunt kill command `kill -SIGTERM $(pgrep evm)` which may match multiple processes; replace it with a targeted stop by either: 1) using the evm systemd unit (e.g. `systemctl stop evm` on the leader host) if available, or 2) resolving the leader's exact PID from the leader instance (use its PID file or parse the leader-specific log entry to get one PID), or 3) using `pgrep -f` with a stricter pattern that uniquely matches the leader process and verify it returns exactly one PID before calling `kill -SIGTERM <pid>`; ensure the docs show verifying a single PID and prefer systemd/PID-file methods over a plain `pgrep evm` call. - Around line 29-32: The fenced multiaddr code block containing "/ip4/<ip>/tcp/<port>/p2p/<peer-id>" is missing a language tag; update the fence for that snippet from ``` to ```text so the block reads as a text code fence (i.e., change the fence surrounding the multiaddr example to use the "text" language tag). - Around line 230-233: The fenced log snippet in cluster-setup.md is missing a language tag (MD040); edit the fenced block containing "INF raft: entering follower state leader=node-1" / "INF block applied from raft log..." and add the language identifier `text` after the opening backticks so the fence reads ```text; update the block with that tag around the existing log lines in the file. - Around line 221-226: The fenced log block containing "INF raft: entering candidate state node=node-1" and the line "INF raft: election won tally=3" is missing a language tag; update the opening fence from ``` to ```text so the snippet is marked as plain text (i.e., change the code fence that precedes the lines starting with "INF raft:" to ```text). In `@docs/guides/ha/overview.md`: - Around line 358-366: The P2P port in the YAML example under the "Interaction with P2P" section is inconsistent (listen_address uses 7676) and should match the rest of the docs; update the YAML snippet's listen_address and peers example to use 26656 (or replace the numeric port with a <P2P_PORT> placeholder) so the listen_address and peers entries are consistent with cluster-setup.md; specifically edit the "listen_address" line and any sample "peers" entries in the overview.md snippet to use 26656 or the placeholder. - Around line 349-354: The fenced code block containing the RTT scaling rules (lines showing heartbeat_timeout, election_timeout, leader_lease_timeout, send_timeout) is missing a language tag which triggers markdownlint MD040; update that fenced block by adding a language identifier (e.g., text or bash) after the opening triple backticks so it becomes ```text (or ```bash) to satisfy the linter and improve readability. - Around line 144-159: Decide whether raft.peers is meant to include the local node, then update both docs to be explicit and consistent: if raft.peers must include self, remove the word "remote" from the overview.md description of raft.peers and replace the example list with one that includes the local node (use the same nodeID@host:port entries shown in cluster-setup.md); if raft.peers must exclude self, update the cluster-setup.md examples so each node’s raft.peers omits its own nodeID and add a clear note in overview.md explaining “raft.peers excludes the local node (do not list node-X in node-X’s config)”; in either case, ensure the examples and the explanatory sentence referencing raft.peers are identical across overview.md and cluster-setup.md and add a short clarifying comment adjacent to the existing example to avoid future inconsistencies. In `@docs/guides/ha/single-to-ha.md`: - Around line 75-90: Update the signer-copy step in single-to-ha.md to avoid hardcoding priv_validator_key.json: either (A) use the canonical filename produced by the tool (the signer file created by running evm init, e.g., signer.json) to match cluster-setup.md, or (B) change the text/commands to a generic instruction that tells users to “copy the signer key file generated by evm init” (reference evm init and the signer filename like signer.json/priv_validator_key.json) so the guide is robust and consistent with cluster-setup.md. - Around line 279-298: Update the two fenced code blocks in the "Check leader election" section so they include a language specifier (e.g., text or log); specifically, change the block containing the lines starting with "INF raft: election won tally=3 leader=node-1" and the block starting with "INF raft: entering follower state leader=node-1" to use ```text (or ```log) instead of plain ``` to satisfy MD040. --- Outside diff comments: In `@docs/.vitepress/config.ts`: - Around line 345-355: The fenced code block under the "Adapting for different RTT values" heading is missing a language tag which triggers markdownlint MD040; edit the markdown string containing that heading and change the opening triple backticks from plain ``` to a language-tagged fence (for example ```text or ```math) so the RTT equations block is annotated (preserve the existing lines for heartbeat_timeout, election_timeout, leader_lease_timeout, send_timeout). Ensure only the opening fence is updated and no other content is altered.🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
3932bf12-f4e0-4e30-85b7-dab97b9d351c📒 Files selected for processing (4)
docs/.vitepress/config.tsdocs/guides/ha/cluster-setup.mddocs/guides/ha/overview.mddocs/guides/ha/single-to-ha.md
Critical fixes: - Fix snapshot_threshold math: 5000 ÷ 10 = 500s ≈ 8.3 min (not 83s) - Fix trailing_logs math: 18000 ÷ 10 = 1800s = 30 min (not 5 min) Medium fixes: - Fix heartbeat_timeout description: it is a follower-side election trigger, not the interval at which the leader sends heartbeats - Add explicit restart instruction after Step 5 data copy in single-to-ha.md so the chain keeps producing blocks during preparation (Steps 6-8) - Replace priv_validator_key.json with signer.json in single-to-ha.md to match cluster-setup.md and the E2E tests Minor fixes: - Exclude self from raft.peers in all examples (cluster-setup.md node-1 yaml/CLI/systemd, single-to-ha.md node-1 and node-2) - Add "exclude local node" note to raft.peers description in overview.md - Fix P2P port in overview.md Interaction with P2P section (7676 → 26656) - Add text language tag to all bare fenced blocks (MD040): multiaddr example, RTT equations, and all log snippets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
raft_production.md had no sidebar entry and its content was fully superseded by the new ha/ guides. Extract the three pieces that were unique to it — bootstrap flag docs, auto-detection startup mode explanation, and static-membership limitation note — into ha/overview.md, then delete the file. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Passing --evnode.signer.passphrase inline exposes the secret in ps aux, journalctl, and shell history. - Add EnvironmentFile=/etc/ev-node/env (chmod 600) to the systemd unit in cluster-setup.md with setup instructions - Replace all inline <YOUR_PASSPHRASE> occurrences with $EV_SIGNER_PASSPHRASE sourced from /etc/ev-node/env in every evm start / evm init snippet across both guides Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace "peers list is identical" stub in node-2 config with an explicit peers list that excludes node-2 itself, and add a note that each node must omit itself from raft.peers - Replace "Wait ~30 seconds" in rolling restart with journalctl one-liners that exit as soon as the node logs follower/leader state, giving a deterministic signal instead of an arbitrary timeout Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The abbreviated node-2 snippet with "# peers list is identical" caused a startup failure: with raft_addr=0.0.0.0:5001 the bootstrap code's literal address comparison does not recognise node-2@10.0.0.2:5001 as self, so node-2 is appended twice and deduplicateServers returns "duplicate peers found in config". - Fix intro text: "only raft.node_id and raft_addr differ" → "raft.node_id is unique; raft.peers and p2p.peers must exclude self" - Expand node-2 snippet to a full evnode.yaml with the correct peers list (node-1, node-3, node-4, node-5 — no node-2) and an inline explanation of the wildcard address pitfall - Align overview.md trailing_logs example to 1 block/s (matching block_time: "1s" used throughout) and note the 10 block/s rate too Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/guides/ha/cluster-setup.md (1)
290-294:⚠️ Potential issue | 🟡 MinorFallback kill path still doesn’t enforce single-PID targeting.
Line 290 says to verify exactly one PID, but Line 291–293 do not enforce that before sending SIGTERM. Add an explicit cardinality check to prevent stopping the wrong process.
Proposed doc-safe command patch
-# Fallback: stop the process directly (verify exactly one PID before killing) -PID=$(pgrep -f "evm start") -echo "Stopping PID $PID" -kill -SIGTERM "$PID" +# Fallback: stop the process directly (must resolve exactly one PID) +mapfile -t PIDS < <(pgrep -f "evm start") +if [ "${`#PIDS`[@]}" -ne 1 ]; then + echo "Expected exactly 1 evm PID, found ${`#PIDS`[@]}: ${PIDS[*]}" + exit 1 +fi +echo "Stopping PID ${PIDS[0]}" +kill -SIGTERM "${PIDS[0]}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/guides/ha/cluster-setup.md` around lines 290 - 294, The fallback kill path uses PID=$(pgrep -f "evm start") and then kill -SIGTERM "$PID" without verifying cardinality; change the instructions to first capture all matching PIDs (via pgrep -f "evm start"), check that exactly one PID is present (fail/exit with an error message if zero or more than one), and only then assign/use that single PID variable and run kill -SIGTERM on it; reference the PID variable, the pgrep -f "evm start" invocation, and the kill -SIGTERM command so the doc update enforces the single-PID check before killing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/guides/ha/cluster-setup.md`:
- Around line 77-80: The guide's systemd examples incorrectly recommend passing
the passphrase via the CLI flag --evnode.signer.passphrase (which expands
$EV_SIGNER_PASSPHRASE into the command line and exposes the secret); update the
systemd unit examples and any CLI examples that use --evnode.signer.passphrase
(search for occurrences of --evnode.signer.passphrase and EnvironmentFile) to
instead use --evnode.signer.passphrase_file pointing to a protected file (and
explain creating the file and setting permissions), and update the explanatory
text at the cited examples (including the instances around the earlier example
and the later one) to recommend passphrase_file for production rather than
embedding the expanded env var on ExecStart.
---
Duplicate comments:
In `@docs/guides/ha/cluster-setup.md`:
- Around line 290-294: The fallback kill path uses PID=$(pgrep -f "evm start")
and then kill -SIGTERM "$PID" without verifying cardinality; change the
instructions to first capture all matching PIDs (via pgrep -f "evm start"),
check that exactly one PID is present (fail/exit with an error message if zero
or more than one), and only then assign/use that single PID variable and run
kill -SIGTERM on it; reference the PID variable, the pgrep -f "evm start"
invocation, and the kill -SIGTERM command so the doc update enforces the
single-PID check before killing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e2ebbdf5-5ca6-4b02-aaf8-c61c1722965d
📒 Files selected for processing (4)
docs/guides/ha/cluster-setup.mddocs/guides/ha/overview.mddocs/guides/ha/single-to-ha.mddocs/guides/raft_production.md
💤 Files with no reviewable changes (1)
- docs/guides/raft_production.md
✅ Files skipped from review due to trivial changes (2)
- docs/guides/ha/single-to-ha.md
- docs/guides/ha/overview.md
| ./evm init --evnode.node.aggregator=true --evnode.signer.passphrase "$EV_SIGNER_PASSPHRASE" | ||
| ``` | ||
|
|
||
| > Set `EV_SIGNER_PASSPHRASE` in your shell session before running this command so the passphrase does not appear in `ps aux` or your shell history. The [EnvironmentFile setup](#running-as-a-systemd-service) later in this guide shows how to store it securely. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether ev-node/evm supports a safer non-CLI passphrase input mechanism.
# Expected: presence of flags/config such as passphrase-file, stdin prompt, or dedicated secret provider.
# If none are found, docs should not claim CLI args are hidden from process inspection.
rg -n -C3 'signer\.passphrase|passphrase-file|passphrase_file|stdin|prompt|EnvironmentFile' docs
rg -n -C3 'signer\.passphrase|passphrase-file|passphrase_file|stdin|prompt' .Repository: evstack/ev-node
Length of output: 50373
🏁 Script executed:
sed -n '305,365p' docs/guides/ha/cluster-setup.mdRepository: evstack/ev-node
Length of output: 1968
Update systemd service configuration to use passphrase file instead of CLI argument.
The security claim at line 313-314 is incorrect. When systemd loads EnvironmentFile=/etc/ev-node/env, the environment variable $EV_SIGNER_PASSPHRASE gets expanded into the actual secret value. Passing it to the CLI via --evnode.signer.passphrase=$EV_SIGNER_PASSPHRASE (line 350) makes the expanded passphrase visible in ps aux, /proc/<pid>/cmdline, and process inspection tools—directly contradicting the guide's security guarantee.
Use the supported --evnode.signer.passphrase_file flag instead:
Corrected unit file snippet
[Service]
EnvironmentFile=/etc/ev-node/env
ExecStart=/usr/local/bin/evm start \
--evnode.node.aggregator=true \
...other flags... \
--evnode.signer.passphrase_file=/etc/ev-node/passphraseThen store the passphrase in a separate file:
echo -n "<YOUR_PASSPHRASE>" | sudo tee /etc/ev-node/passphrase > /dev/null
sudo chmod 600 /etc/ev-node/passphraseAlso applies to lines 77 and 229 where the same false security claim appears—those examples should also be updated to recommend --evnode.signer.passphrase_file for production use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/guides/ha/cluster-setup.md` around lines 77 - 80, The guide's systemd
examples incorrectly recommend passing the passphrase via the CLI flag
--evnode.signer.passphrase (which expands $EV_SIGNER_PASSPHRASE into the command
line and exposes the secret); update the systemd unit examples and any CLI
examples that use --evnode.signer.passphrase (search for occurrences of
--evnode.signer.passphrase and EnvironmentFile) to instead use
--evnode.signer.passphrase_file pointing to a protected file (and explain
creating the file and setting permissions), and update the explanatory text at
the cited examples (including the instances around the earlier example and the
later one) to recommend passphrase_file for production rather than embedding the
expanded env var on ExecStart.
Summary
evm net-info, systemd unit, rolling restart, and failover testingTest plan
yarn dev— navigate to each of the three HA guides, verify sidebar links and internal cross-references workjust testlink— no broken links🤖 Generated with Claude Code
Summary by CodeRabbit