The current repository gives you five practical local development flows:
- a single-node flow where one Go node persists chain state, funds test accounts, validates transactions, and commits blocks
- a small multi-node devnet flow where one node produces blocks and other configured nodes follow through transport-backed replication and sync
- a scheduling flow where you elect a validator set, inspect the derived active round and scheduled proposer, and optionally enforce that schedule for local block production
- a manual certificate-gated consensus flow where you build a concrete next-block template, submit signed proposals and votes for the active round, and commit only after a quorum certificate exists
- an automated certificate-gated devnet flow where the scheduled proposer self-proposes, active validators auto-vote, timeout can rotate the proposer, and the next proposer can reuse the stored candidate body for the same height
The browser wallet can create a local account, export and import it, inspect node-side account state, sign a transaction, and send it to the node.
Deterministic WASM smart contracts and a confidential compute marketplace are planned next phases, not part of the current runnable workflow. Product-oriented applications and target use cases are outlined in docs/applications.md.
From the repository root:
go run ./cmd/node
Sanity-check the node with:
Invoke-RestMethod http://localhost:8080/health
curl.exe -i http://localhost:8080/v1/health
Invoke-RestMethod http://localhost:8080/v1/alerts
Invoke-RestMethod http://localhost:8080/v1/slo
Invoke-RestMethod http://localhost:8080/v1/alert-rules
Invoke-RestMethod http://localhost:8080/v1/recording-rules
Invoke-RestMethod http://localhost:8080/v1/dashboards
Invoke-RestMethod http://localhost:8080/v1/status
Invoke-RestMethod http://localhost:8080/v1/consensus
curl.exe http://localhost:8080/metrics
curl.exe http://localhost:8080/v1/alert-rules/prometheus
curl.exe http://localhost:8080/v1/recording-rules/prometheus
curl.exe http://localhost:8080/v1/dashboards/grafana
`/health` tells you whether the process is responding. `/v1/health` tells you whether the node is actually ready, based on validator-set expectations, recovery backlog, consensus warnings, peer-sync condition, and recent diagnostics. `/v1/slo` tells you how those same signals roll up into operator-facing objectives for readiness, consensus continuity, and peer-sync continuity.
curl.exe -i http://localhost:8080/health
curl.exe -i http://localhost:8080/v1/health
Invoke-RestMethod http://localhost:8080/v1/alerts
Invoke-RestMethod http://localhost:8080/v1/slo
Invoke-RestMethod http://localhost:8080/v1/metrics
curl.exe http://localhost:8080/metrics
What to expect:
- `/health` returns `200` as long as the API loop is alive
- `/v1/health` returns `200` when only `pass` or `warn` checks exist and `503` when at least one `fail` check is active
- `checks` currently cover `api`, `validator_set`, `recovery`, `consensus`, `settlement_throughput`, `peer_sync`, and `diagnostics`
- `/v1/alerts` turns those same operator signals into a derived critical or warning alert set for polling dashboards and automation:
  - `settlement_throughput_reduced` or `settlement_throughput_stalled` when queued transactions fall behind the configured automatic block cadence, with the current worst-case drain forecast included in alert detail when throughput baselines exist
  - targeted `peer_import_blocked`, `peer_admission_blocked`, and `peer_replication_blocked` warnings
  - the aggregate `peer_snapshot_restored` warning and the repair-path-specific `peer_snapshot_restore_divergence`, `peer_snapshot_restore_import_repair`, and `peer_snapshot_restore_fetch_fallback` warnings when retained peer incidents point to those fault classes
- `/v1/slo` groups them into objective states so operators can see whether readiness, consensus continuity, peer sync continuity, or settlement throughput is meeting, at risk, breached, or not applicable, with settlement detail carrying the same worst-case drain forecast context as health and alerts
- `/metrics` exports the alert, health, and SLO state as Prometheus-style gauges:
  - state gauges such as `zephyr_node_ready`, `zephyr_health_check_status`, `zephyr_alert_active`, and `zephyr_slo_objective_status`
  - peer-incident gauges such as `zephyr_peer_sync_reason_occurrence_count` and `zephyr_peer_sync_error_code_occurrence_count`
  - recent peer horizon gauges such as `zephyr_peer_sync_horizon_occurrence_count`
  - per-peer retained-incident gauges like `zephyr_peer_sync_peer_occurrence_count`
  - per-peer snapshot-repair metadata gauges such as `zephyr_peer_snapshot_restore_last_height`, `zephyr_peer_snapshot_restore_last_observed_at_seconds`, and `zephyr_peer_snapshot_restore_age_seconds`
  - chain throughput gauges such as `zephyr_chain_total_committed_transaction_count` and `zephyr_chain_window_transactions_per_second`
  - settlement queue-drain gauges such as `zephyr_settlement_queue_drain_lag_seconds`, `zephyr_settlement_queue_drain_threshold_seconds`, `zephyr_settlement_queue_drain_utilization_ratio`, `zephyr_settlement_estimated_queue_drain_warn_utilization_ratio`, `zephyr_settlement_estimated_queue_drain_warn_utilization_ratio_max`, `zephyr_settlement_estimated_queue_drain_seconds`, and `zephyr_settlement_estimated_queue_drain_seconds_max`
- `/v1/metrics` keeps the structured JSON view, including `chainThroughput` windows for `1m`, `5m`, and `15m`, fixed `peerSyncSummary.horizons` windows for `5m`, `15m`, `1h`, `6h`, and `24h`, plus the typed `settlementThroughput` lag, threshold, alert-metadata, utilization-ratio, drain-estimate, per-estimate warn-utilization-ratio, and peak-drain-estimate view
- `/v1/alert-rules` keeps the structured recommended alert bundle, while `/v1/alert-rules/prometheus` exports the enabled subset as Prometheus-rule YAML for scrape-based alerting stacks
- `/v1/recording-rules` keeps the structured recommended recording bundle, while `/v1/recording-rules/prometheus` exports the enabled subset as Prometheus recording-rule YAML for dashboard and aggregation stacks, including canonical settlement-throughput state, queue-drain utilization, projected queue-drain pressure, max projected queue-drain pressure, queue-drain estimate, max queue-drain estimate, and recent-TPS rollups; on passive nodes the settlement-specific rules stay visible in JSON with `disabledReason` while the Prometheus export omits them
- `/v1/dashboards` keeps the structured recommended dashboard bundle, while `/v1/dashboards/grafana` exports the enabled subset as Grafana-oriented JSON built on the current recording rules and metrics, including the overview throughput, settlement-state, raw queue-drain-lag, queue-drain-utilization, queue-drain-estimate, worst-case queue-drain-time, and worst-case queue-drain-pressure panels; on passive nodes the settlement-specific panels stay visible in JSON with `disabledReason` while the Grafana export omits them
- use `/v1/health` together with `/v1/alerts`, `/v1/slo`, `/metrics`, `/v1/metrics`, `GET /v1/status`, `/v1/dashboards`, and structured logs when you need both a quick readiness gate and deeper incident context
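The liveness-versus-readiness split above can be scripted. The following is a minimal Go sketch that relies only on the documented status codes (`200` from `/health`, `200` or `503` from `/v1/health`). The helper names `classifyNode` and `probe`, and the state labels they return, are illustrative assumptions and not part of the repository.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// classifyNode maps the two documented status codes to a coarse state.
// liveCode is the status from /health; readyCode is the status from /v1/health.
// A code of 0 means the request itself failed. The labels are hypothetical.
func classifyNode(liveCode, readyCode int) string {
	switch {
	case liveCode != http.StatusOK:
		return "down"
	case readyCode == http.StatusOK:
		return "ready" // only pass or warn checks exist
	case readyCode == http.StatusServiceUnavailable:
		return "alive-not-ready" // at least one fail check is active
	default:
		return "unknown"
	}
}

// probe returns the HTTP status code for url, or 0 if the request fails.
func probe(client *http.Client, url string) int {
	resp, err := client.Get(url)
	if err != nil {
		return 0
	}
	defer resp.Body.Close()
	return resp.StatusCode
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	base := "http://localhost:8080" // local node address used in the commands above
	state := classifyNode(probe(client, base+"/health"), probe(client, base+"/v1/health"))
	fmt.Println("node state:", state)
}
```

A gate like this is only a first filter; `/v1/alerts` and `/v1/slo` still carry the detail needed to explain why a node is not ready.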
Once you have a node running, you can export the recommended monitoring bundles directly from the API:
Invoke-RestMethod http://localhost:8080/v1/alert-rules
curl.exe http://localhost:8080/v1/alert-rules/prometheus
What to expect:
- `/v1/alert-rules` returns readiness, consensus, throughput, and peer-sync rule groups with expressions, severities, source metrics, and disabled reasons when a rule is not applicable to the current node configuration; the throughput group adds settlement queue-drain rules, and the peer-sync group includes continuity rules plus targeted peer import, peer admission, peer replication, and peer snapshot-restore diagnostics, including repair-path-specific divergence, import-repair, and fetch-fallback snapshot rules
- `/v1/alert-rules/prometheus` exports only the enabled subset as Prometheus-rule YAML so you can drop it into a standard scrape-plus-alert workflow without hand-translating expressions
- treat the bundle as a production-oriented starting point rather than a final policy set; tune durations, severities, and escalation paths for your deployment
Once you have a node running, you can export the recommended aggregation rollups directly from the API:
Invoke-RestMethod http://localhost:8080/v1/recording-rules
curl.exe http://localhost:8080/v1/recording-rules/prometheus
What to expect:
- `/v1/recording-rules` returns readiness, consensus, throughput, peer-sync, and operator-summary recording-rule groups with stable `record` names, expressions, source metrics, and disabled reasons when a rule is not applicable to the current node configuration:
  - the throughput group includes `zephyr:settlement_throughput:at_risk`, `zephyr:settlement_throughput:breached`, `zephyr:settlement_queue_drain:warn_utilization`, `zephyr:settlement_queue_drain:fail_utilization`, `zephyr:settlement_queue_drain:estimate_seconds_1m`, `zephyr:settlement_queue_drain:estimate_seconds_5m`, `zephyr:settlement_queue_drain:estimate_seconds_15m`, `zephyr:settlement_queue_drain:estimate_seconds_max`, `zephyr:settlement_queue_drain:estimate_warn_utilization_1m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_5m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_15m`, and `zephyr:settlement_queue_drain:estimate_warn_utilization_max`
  - the peer-sync group includes the incident-pressure rollups `zephyr:peer_sync:incident_pressure_by_peer` and `zephyr:peer_sync:incident_pressure_by_horizon` plus the snapshot-repair rollups `zephyr:peer_sync:snapshot_restore_pressure_by_peer`, `zephyr:peer_sync:snapshot_restore_age_by_peer`, `zephyr:peer_sync:snapshot_restore_pressure`, and `zephyr:peer_sync:snapshot_restore_pressure_by_reason`
  - the operator-summary group includes `zephyr:chain:transactions_per_second_1m`, `zephyr:chain:transactions_per_second_5m`, and `zephyr:chain:transactions_per_second_15m`
  - the snapshot-repair rollups now keep both the aggregate `peer_snapshot_restored` code and the split `peer_snapshot_restore_*` codes in `relatedAlertCodes`
- `/v1/recording-rules/prometheus` exports only the enabled subset as Prometheus recording-rule YAML so you can drop it into a standard scrape-plus-dashboard workflow without hand-translating expressions; when a recording rule maps to multiple related alerts it now emits a comma-joined `alert_codes` label instead of silently dropping that metadata
- use these rollups as the default dashboard query layer on top of `/metrics`, then import or adapt the higher-level dashboard bundles for your deployment
Once you have a node running, you can export the recommended dashboard bundles directly from the API:
Invoke-RestMethod http://localhost:8080/v1/dashboards
curl.exe http://localhost:8080/v1/dashboards/grafana
What to expect:
- `/v1/dashboards` returns overview, consensus-and-recovery, and peer-sync dashboard bundles with stable panel IDs, PromQL queries, source endpoints, related recording rules, related alert codes, and disabled reasons when a dashboard or panel is not applicable to the current node configuration
- the overview bundle now includes:
  - a `Recent transaction throughput` panel built on `zephyr:chain:transactions_per_second_1m`, `zephyr:chain:transactions_per_second_5m`, and `zephyr:chain:transactions_per_second_15m`
  - a `Settlement throughput state` panel built on `zephyr:settlement_throughput:at_risk` and `zephyr:settlement_throughput:breached`
  - a raw `Settlement queue-drain lag` panel built on `zephyr_settlement_queue_drain_lag_seconds` plus `zephyr_settlement_queue_drain_threshold_seconds`
  - a normalized `Settlement queue-drain utilization` panel built on `zephyr:settlement_queue_drain:warn_utilization` and `zephyr:settlement_queue_drain:fail_utilization`
  - an `Estimated queue-drain pressure` panel built on `zephyr:settlement_queue_drain:estimate_warn_utilization_1m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_5m`, and `zephyr:settlement_queue_drain:estimate_warn_utilization_15m`
  - a `Worst-case estimated queue-drain pressure` stat built on `zephyr:settlement_queue_drain:estimate_warn_utilization_max`
  - an `Estimated queue-drain time` panel built on `zephyr:settlement_queue_drain:estimate_seconds_1m`, `zephyr:settlement_queue_drain:estimate_seconds_5m`, and `zephyr:settlement_queue_drain:estimate_seconds_15m`
  - a `Worst-case estimated queue-drain time` stat built on `zephyr:settlement_queue_drain:estimate_seconds_max`
- the peer-sync bundle includes incident-by-state, incident-by-reason, incident-by-error-code, per-peer incident-pressure, `Peer incident pressure horizons`, `Peer snapshot restore pressure by peer`, `Peer snapshot restore heights`, `Peer snapshot restore age`, `Peer snapshot restore pressure`, and `Peer snapshot restore reasons` panels tied to the peer import, admission, replication, and snapshot-restore alerts
  - the peer-incident and snapshot-repair panels now carry both the aggregate `peer_snapshot_restored` code and the split `peer_snapshot_restore_*` codes in `relatedAlertCodes`
  - the per-peer, horizon, and per-peer snapshot-repair panels are built on `zephyr:peer_sync:incident_pressure_by_peer`, `zephyr:peer_sync:incident_pressure_by_horizon`, `zephyr:peer_sync:snapshot_restore_pressure_by_peer`, and `zephyr:peer_sync:snapshot_restore_age_by_peer`
- `/v1/dashboards/grafana` exports only the enabled dashboards and panels as Grafana-oriented JSON so you can import a starting Zephyr dashboard set after wiring a Prometheus data source to `/metrics`
- treat the bundle as a production-oriented starting point rather than a final layout; tune datasource selection, labels, thresholds, and panel arrangement for your deployment
Use separate terminals and separate data directories.
Node A, producer:
$env:ZEPHYR_NODE_ID="node-a"
$env:ZEPHYR_HTTP_ADDR=":8080"
$env:ZEPHYR_DATA_DIR="var/devnet-a"
$env:ZEPHYR_PEERS="http://localhost:8081"
$env:ZEPHYR_ENABLE_BLOCK_PRODUCTION="true"
$env:ZEPHYR_ENABLE_PEER_SYNC="true"
go run ./cmd/node
Node B, replica:
$env:ZEPHYR_NODE_ID="node-b"
$env:ZEPHYR_HTTP_ADDR=":8081"
$env:ZEPHYR_DATA_DIR="var/devnet-b"
$env:ZEPHYR_PEERS="http://localhost:8080"
$env:ZEPHYR_ENABLE_BLOCK_PRODUCTION="false"
$env:ZEPHYR_ENABLE_PEER_SYNC="true"
go run ./cmd/node
What to expect:
- Node A accepts wallet transactions and can produce blocks
- Node B polls peer status on `ZEPHYR_SYNC_INTERVAL`
- transactions, faucet credits, proposals, votes, and blocks are replicated over the current transport implementation
- if validator private keys are configured, `GET /v1/status` exposes a signed identity proof and `peerSyncSummary`, and `GET /v1/peers` shows verification, admission state, per-peer sync telemetry, restart-safe import, snapshot, and replication-failure context, derived incident counters, and durable `recentIncidents` history for configured peers
- if Node B starts late or misses a block import, it can recover from Node A's snapshot
- Start a node with a validator address or validator private key.
- Submit an election, then inspect:
Invoke-RestMethod http://localhost:8080/v1/validators
Invoke-RestMethod http://localhost:8080/v1/consensus
You should see:
- a validator snapshot version that increases when the election result is replaced
- the persisted validator list and normalized election config
- `totalVotingPower`, `quorumVotingPower`, `currentRound`, `currentRoundStartedAt`, and `nextProposer`
If you want the node to prove which validator it represents over the current HTTP transport, start it with a validator private key:
$env:ZEPHYR_NODE_ID="node-a"
$env:ZEPHYR_VALIDATOR_PRIVATE_KEY="<base64-pkcs8-p256-private-key>"
go run ./cmd/node
Then inspect:
Invoke-RestMethod http://localhost:8080/v1/status
Invoke-RestMethod http://localhost:8080/v1/peers
If you want the current HTTP devnet to fail closed on unsigned or mismatched peers, start the node with:
$env:ZEPHYR_NODE_ID="node-a"
$env:ZEPHYR_VALIDATOR_PRIVATE_KEY="<base64-pkcs8-p256-private-key>"
$env:ZEPHYR_PEERS="http://localhost:8081"
$env:ZEPHYR_REQUIRE_PEER_IDENTITY="true"
$env:ZEPHYR_PEER_VALIDATORS="http://localhost:8081=zph_validator_b"
go run ./cmd/node
What to expect:
- `GET /v1/status` reports `peerIdentityRequired=true`
- `GET /v1/peers` shows `expectedValidator`, `admitted`, `admissionError`, `syncState`, `heightDelta`, per-peer `incidentCount`, `incidentOccurrences`, `latestIncidentAt`, the latest import, snapshot-repair, and replication-failure metadata, and durable `recentIncidents` history for each configured peer
- background sync and outgoing replication use only admitted peers under this policy
- replicated peer POST requests without a valid identity, or from validators outside the configured binding allowlist, are rejected with `403`
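To make the binding allowlist concrete, here is a small Go sketch of how a `<peer-url>=<validator-address>` value such as the one in `ZEPHYR_PEER_VALIDATORS` could be parsed and checked. It is an illustration under stated assumptions, not the node's actual parser: the comma separator for multiple bindings, the trailing-slash normalization, and the function names `parsePeerValidators` and `admit` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePeerValidators parses "<peer-url>=<validator-address>" bindings.
// Assumption: multiple bindings are comma-separated; the real format may differ.
func parsePeerValidators(raw string) (map[string]string, error) {
	bindings := make(map[string]string)
	for _, entry := range strings.Split(raw, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		// Split on the last "=" so a "=" inside the URL does not break parsing.
		idx := strings.LastIndex(entry, "=")
		if idx <= 0 || idx == len(entry)-1 {
			return nil, fmt.Errorf("malformed binding %q", entry)
		}
		peerURL := strings.TrimRight(entry[:idx], "/")
		bindings[peerURL] = entry[idx+1:]
	}
	return bindings, nil
}

// admit reports whether the validator address a peer proves matches its binding.
// Peers without a binding are rejected, mirroring a fail-closed policy.
func admit(bindings map[string]string, peerURL, provenValidator string) bool {
	expected, ok := bindings[strings.TrimRight(peerURL, "/")]
	return ok && expected == provenValidator
}

func main() {
	bindings, err := parsePeerValidators("http://localhost:8081=zph_validator_b")
	if err != nil {
		panic(err)
	}
	fmt.Println(admit(bindings, "http://localhost:8081", "zph_validator_b")) // true
	fmt.Println(admit(bindings, "http://localhost:8081", "zph_validator_x")) // false
}
```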
If you want incident-friendly JSON logs while keeping the existing HTTP API surfaces, start the node with:
$env:ZEPHYR_ENABLE_STRUCTURED_LOGS="true"
go run ./cmd/node
What to expect:
- consensus diagnostics, peer-sync incidents, and snapshot-restore recovery events are emitted as newline-delimited JSON
- `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/metrics` report `structuredLogsEnabled=true`
- the logs are designed to pair with `GET /v1/metrics`, `diagnostics`, and `peerSyncSummary` rather than replace those durable views
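Because the structured logs are newline-delimited JSON, they can be summarized with a few lines of Go. The sketch below counts log lines by one string field. The field name `event` and the sample records are assumptions made for illustration; the actual log schema is not documented here and may use different keys.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// countByField reads newline-delimited JSON from logs and counts the values
// of one string field. Lines that are not JSON objects are skipped, which
// tolerates plain-text lines mixed into the same stream.
func countByField(logs, field string) (map[string]int, error) {
	counts := make(map[string]int)
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		var record map[string]interface{}
		if err := json.Unmarshal([]byte(line), &record); err != nil {
			continue // not a JSON object
		}
		if value, ok := record[field].(string); ok {
			counts[value]++
		}
	}
	return counts, scanner.Err()
}

func main() {
	// Hypothetical sample lines; the real field names may differ.
	sample := `{"event":"peer_sync_incident","peer":"http://localhost:8081"}
{"event":"snapshot_restore","height":12}
plain text line
{"event":"peer_sync_incident","peer":"http://localhost:8081"}`
	counts, err := countByField(sample, "event")
	if err != nil {
		panic(err)
	}
	fmt.Println(counts["peer_sync_incident"], counts["snapshot_restore"]) // 2 1
}
```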
This is the lowest-level way to exercise the current certified block path.
- Start a node with proposer scheduling and certificate enforcement enabled:
$env:ZEPHYR_NODE_ID="node-a"
$env:ZEPHYR_VALIDATOR_ADDRESS="zph_validator_a"
$env:ZEPHYR_VALIDATOR_PRIVATE_KEY="<base64-pkcs8-p256-private-key>"
$env:ZEPHYR_ENFORCE_PROPOSER_SCHEDULE="true"
$env:ZEPHYR_REQUIRE_CONSENSUS_CERTIFICATES="true"
go run ./cmd/node
- Make sure a validator set already exists.
- Queue at least one transaction in the mempool.
- Fetch the concrete next block candidate and the active round:
Invoke-RestMethod http://localhost:8080/v1/dev/block-template
Invoke-RestMethod http://localhost:8080/v1/consensus
- Build a signed proposal whose `height`, `round`, `previousHash`, `producedAt`, full `transactions`, ordered `transactionIds`, and `blockHash` match that template exactly.
- POST the proposal to `/v1/consensus/proposals`.
- Submit validator votes to `/v1/consensus/votes` until a quorum certificate exists for that same `height`, `round`, and `blockHash`.
- Commit that exact block template by reusing the returned `producedAt` timestamp:
$body = @{ producedAt = "2026-03-24T13:00:00Z" } | ConvertTo-Json
Invoke-RestMethod http://localhost:8080/v1/dev/produce-block -Method Post -ContentType 'application/json' -Body $body
This is the closest current path to a production-style validator flow, but it is still a first-pass timeout-driven round engine.
- Start the initial round-0 proposer:
$env:ZEPHYR_NODE_ID="node-a"
$env:ZEPHYR_HTTP_ADDR=":8080"
$env:ZEPHYR_DATA_DIR="var/devnet-a"
$env:ZEPHYR_PEERS="http://localhost:8081"
$env:ZEPHYR_VALIDATOR_PRIVATE_KEY="<base64-pkcs8-p256-private-key-a>"
$env:ZEPHYR_ENABLE_BLOCK_PRODUCTION="true"
$env:ZEPHYR_ENABLE_PEER_SYNC="true"
$env:ZEPHYR_ENABLE_CONSENSUS_AUTOMATION="true"
$env:ZEPHYR_CONSENSUS_INTERVAL="250ms"
$env:ZEPHYR_CONSENSUS_ROUND_TIMEOUT="2s"
$env:ZEPHYR_ENFORCE_PROPOSER_SCHEDULE="true"
$env:ZEPHYR_REQUIRE_CONSENSUS_CERTIFICATES="true"
go run ./cmd/node
- Start another active validator:
$env:ZEPHYR_NODE_ID="node-b"
$env:ZEPHYR_HTTP_ADDR=":8081"
$env:ZEPHYR_DATA_DIR="var/devnet-b"
$env:ZEPHYR_PEERS="http://localhost:8080"
$env:ZEPHYR_VALIDATOR_PRIVATE_KEY="<base64-pkcs8-p256-private-key-b>"
$env:ZEPHYR_ENABLE_BLOCK_PRODUCTION="true"
$env:ZEPHYR_ENABLE_PEER_SYNC="true"
$env:ZEPHYR_ENABLE_CONSENSUS_AUTOMATION="true"
$env:ZEPHYR_CONSENSUS_INTERVAL="250ms"
$env:ZEPHYR_CONSENSUS_ROUND_TIMEOUT="2s"
$env:ZEPHYR_REQUIRE_PEER_IDENTITY="true"
$env:ZEPHYR_PEER_VALIDATORS="http://localhost:8080=zph_validator_a"
$env:ZEPHYR_REQUIRE_CONSENSUS_CERTIFICATES="true"
go run ./cmd/node
- Submit an election that makes both validators active and puts Node A first in the proposer schedule.
- Queue at least one transaction on the current proposer.
- Inspect the live state:
Invoke-RestMethod http://localhost:8080/v1/status
Invoke-RestMethod http://localhost:8080/v1/consensus
Invoke-RestMethod http://localhost:8081/v1/consensus
Expected behavior:
- startup fails if automation is enabled without `ZEPHYR_VALIDATOR_PRIVATE_KEY`
- the scheduled proposer builds the next block template and persists a self-contained proposal automatically
- active validators persist and replicate votes automatically for that proposal
- once quorum is observed, the proposer commits from the stored certified proposal body without requiring `POST /v1/dev/produce-block`
- if the active proposer stalls past `ZEPHYR_CONSENSUS_ROUND_TIMEOUT`, the node advances `currentRound`, rotates `nextProposer`, and the new proposer can reuse the latest stored candidate body for that same height
- admitted peers replicate the proposal, vote, certificate, and committed block over the current HTTP transport, and failed outgoing proposal, vote, or block dissemination is retained as durable `replication_blocked` peer evidence
- `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template` now expose `roundEvidence` so operators can see the active round deadline, proposal presence, leading vote power, quorum remaining, replay backlog, warnings, and certificate state
- those same responses now expose `roundHistory`, which shows the pending height across prior and active rounds so operators can inspect proposer rotation and stalled rounds side by side
- those same responses now expose `blockReadiness`, which shows whether the local template matches stored proposals and certificates and whether commit or import can proceed from stored certified artifacts
- those same responses now expose `recovery`, which shows pending replayable local proposal or vote actions, pending import backlog, and recent replay/completion plus local certified `block_commit` and snapshot-restore metadata from the broader local consensus recovery surface
- those same responses now expose `diagnostics`, which show recent rejected proposal, vote, commit, or import actions with stable error codes
- those same responses now expose `peerSyncHistory`, which keeps recent cross-peer sync incidents and retained `replication_blocked` dissemination failures visible even after restart
- those same responses now expose `peerSyncSummary`, which rolls those incidents up by peer, state, reason, and error code plus fixed recent `5m`, `15m`, `1h`, `6h`, and `24h` horizons so operators can see the dominant network problem quickly
- `GET /v1/metrics` now provides the same durable peer summary alongside machine-readable consensus-action, diagnostic, peer-incident reason and error-code counters, peer horizon counters, and live peer-runtime counters for dashboards or automation
- if `ZEPHYR_ENABLE_STRUCTURED_LOGS=true`, the node also emits newline-delimited JSON logs for diagnostics, peer incidents, and snapshot recovery as those events happen
- if a peer link drops and later returns, validators keep rebroadcasting their latest local proposal or vote for the pending height until the matching certificate exists
- if a validator restarts mid-round after persisting a local proposal or vote, the node can replay that pending action from the persisted recovery state
- confirm a validator set already exists through `GET /v1/validators`
- confirm the signer address is part of that validator set
- confirm the proposal height matches the node's `nextHeight` in `GET /v1/consensus`
- confirm the proposal round matches the node's `currentRound` in `GET /v1/consensus`, unless you are intentionally advancing to a higher round
- confirm the proposal signer matches `nextProposer`
- confirm the proposal `previousHash` matches the current chain tip
- confirm the proposal `producedAt`, full `transactions`, and ordered `transactionIds` come from the same `GET /v1/dev/block-template` response as `blockHash`
- confirm votes reference the same `blockHash`, `height`, and `round` as a known proposal
- confirm the signed payload still matches the visible request fields exactly
- inspect `diagnostics` in `GET /v1/status` or `GET /v1/consensus` after a rejection; common codes now include `unexpected_proposer`, `stale_round`, `conflicting_proposal`, `unknown_proposal`, and `template_mismatch`
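Several of the checks above amount to a field-by-field comparison between the block template and the proposal you built. The Go sketch below shows one way to run that comparison locally before submitting. The `blockFields` struct and `mismatchedFields` helper are hypothetical and only model the fields named in the checklist; the full `transactions` bodies are left out for brevity and would need the same treatment.

```go
package main

import "fmt"

// blockFields models the fields the checklist says must match between the
// template and the proposal. This is an illustrative type, not the node's.
type blockFields struct {
	Height         uint64
	Round          uint64
	PreviousHash   string
	ProducedAt     string
	TransactionIDs []string // order matters
	BlockHash      string
}

// equalOrdered reports whether two ID lists match element by element.
func equalOrdered(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// mismatchedFields lists the JSON field names that differ.
func mismatchedFields(template, proposal blockFields) []string {
	var diffs []string
	if template.Height != proposal.Height {
		diffs = append(diffs, "height")
	}
	if template.Round != proposal.Round {
		diffs = append(diffs, "round")
	}
	if template.PreviousHash != proposal.PreviousHash {
		diffs = append(diffs, "previousHash")
	}
	if template.ProducedAt != proposal.ProducedAt {
		diffs = append(diffs, "producedAt")
	}
	if !equalOrdered(template.TransactionIDs, proposal.TransactionIDs) {
		diffs = append(diffs, "transactionIds")
	}
	if template.BlockHash != proposal.BlockHash {
		diffs = append(diffs, "blockHash")
	}
	return diffs
}

func main() {
	// Placeholder values for illustration only.
	template := blockFields{
		Height: 5, Round: 0, PreviousHash: "prev", BlockHash: "hash",
		ProducedAt:     "2026-03-24T13:00:00Z",
		TransactionIDs: []string{"tx-1", "tx-2"},
	}
	proposal := template
	proposal.ProducedAt = "2026-03-24T13:00:01Z" // regenerated timestamp
	fmt.Println(mismatchedFields(template, proposal)) // [producedAt]
}
```

An empty result does not guarantee acceptance, since the node also verifies the signature and proposer schedule, but a non-empty one points at the likely source of a `template_mismatch` rejection.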
- confirm the node started with `ZEPHYR_VALIDATOR_PRIVATE_KEY`; startup rejects automation without it
- confirm the local validator is part of the active validator set shown by `GET /v1/validators`
- confirm `GET /v1/consensus` reports the expected validator in `nextProposer`
- confirm the proposer node still has `ZEPHYR_ENABLE_BLOCK_PRODUCTION=true`
- confirm `GET /v1/status` or `GET /v1/consensus` shows `consensusAutomationEnabled=true`
- confirm `ZEPHYR_CONSENSUS_ROUND_TIMEOUT` is long enough for proposal and vote dissemination in your local setup
- confirm there is at least one queued transaction or a previously stored proposal body when you expect automatic proposal generation
- inspect `roundEvidence` in `GET /v1/status`, `GET /v1/consensus`, or `GET /v1/dev/block-template` to see whether the node is waiting for a proposal, collecting votes, timed out, waiting for reproposal, or already certified
- use `leadingVotePower`, `quorumRemaining`, `pendingReplayRounds`, and `warnings` inside `roundEvidence` to separate partial quorum, timeout, replay backlog, and proposer-schedule problems
- inspect `roundHistory` in those same responses to compare round-0, round-1, and later proposer attempts for the pending height without losing visibility into earlier rounds
- inspect `blockReadiness` in those same responses to see whether the current local template matches a stored proposal, whether a matching certificate exists, and whether a certified stored proposal is already ready for commit or import
- inspect `recovery` in those same responses to see whether the node still has pending replayable local proposal or vote actions, blocked peer-import heights, or a recent snapshot restore after a restart or dropped peer link
- inspect `diagnostics` in those same responses to see whether recent failures were caused by stale rounds, unexpected proposers, missing proposals, template mismatch, missing certificates, or other rejected consensus actions
- inspect `GET /v1/metrics` when you want machine-readable totals for pending replay, diagnostic code frequency, and live peer sync-state distribution during the incident
- inspect `GET /metrics` when you want those same health, recovery, peer, and alert signals in Prometheus-compatible text for scrape-based dashboards or alerts
- inspect `GET /v1/alerts` when you want the current derived critical and warning alerts without reconstructing them from raw health or metric data
- inspect `GET /v1/slo` when you want the same incident evidence projected into compact objective states for readiness, consensus continuity, and peer sync continuity
- inspect `GET /v1/alert-rules` or `GET /v1/alert-rules/prometheus` when you are wiring alert managers or alert rule files and want the bundle Zephyr currently recommends, including peer import, peer admission, peer replication, and peer snapshot-restore diagnostics plus the repair-path-specific divergence, import-repair, and fetch-fallback snapshot rules built from retained incident state
- enable `ZEPHYR_ENABLE_STRUCTURED_LOGS=true` when you want those same incident transitions as newline-delimited JSON in the node logs
- inspect `curl.exe -i http://localhost:8080/v1/health` to separate a live node from a ready one; `503` usually means recovery backlog or peer-sync availability has escalated into a hard failure, while `warn` highlights degraded but still serving conditions
- remember the current engine now supports timeout-driven proposer rotation, latest-artifact rebroadcast after peer recovery, restart-safe local proposal or vote replay, pending import recovery, snapshot-restore history, durable peer-incident history, cross-peer `peerSyncSummary`, machine-readable `/v1/metrics`, Prometheus-style `/metrics`, derived `/v1/health`, derived `/v1/alerts`, derived `/v1/slo`, recommended alert-rule bundles, recommended recording-rule bundles, recommended `/v1/dashboards`, exported `/v1/dashboards/grafana`, structured event logs, per-height round history, block readiness inspection, and bounded rejection diagnostics, but broader recovery coverage plus broader dashboard coverage and export adapters are still limited
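When reading `roundEvidence` during a stalled round, it helps to reason about the numbers directly. The Go sketch below is a rough local model, not the node's logic: it assumes `quorumRemaining` is the gap between `quorumVotingPower` and `leadingVotePower`, and that the round deadline is `currentRoundStartedAt` plus `ZEPHYR_CONSENSUS_ROUND_TIMEOUT`. Both relationships, and the helper names and state labels, are assumptions inferred from the field names.

```go
package main

import (
	"fmt"
	"time"
)

// quorumRemaining returns how much voting power is still missing.
// Assumption: quorum is reached once leading vote power meets the threshold.
func quorumRemaining(quorumVotingPower, leadingVotePower uint64) uint64 {
	if leadingVotePower >= quorumVotingPower {
		return 0
	}
	return quorumVotingPower - leadingVotePower
}

// roundDeadline derives a deadline from the round start and the timeout.
// Assumption: the node computes its deadline the same way.
func roundDeadline(startedAt time.Time, timeout time.Duration) time.Time {
	return startedAt.Add(timeout)
}

// describeRound gives a coarse, hypothetical label for the round state.
func describeRound(hasProposal bool, remaining uint64, now, deadline time.Time) string {
	switch {
	case remaining == 0:
		return "quorum reached"
	case !now.Before(deadline):
		return "timed out"
	case !hasProposal:
		return "waiting for proposal"
	default:
		return "collecting votes"
	}
}

func main() {
	started := time.Date(2026, 3, 24, 13, 0, 0, 0, time.UTC)
	deadline := roundDeadline(started, 2*time.Second) // matches the 2s timeout above
	now := started.Add(1 * time.Second)

	remaining := quorumRemaining(7, 4) // example voting-power values
	fmt.Println(remaining, describeRound(true, remaining, now, deadline)) // 3 collecting votes
}
```

If the values reported by the node disagree with this model, trust the node; the sketch is only meant to make the relationship between the fields easier to reason about.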
- inspect `GET /v1/peers` first; `syncState=snapshot_restored` tells you a peer-specific repair happened, `lastSnapshotRestoreReason` distinguishes `peer_diverged`, `import_repair`, and `fetch_fallback`, `lastReplicationFailureReason` shows whether outgoing dissemination most recently failed on a proposal, vote, or block, and durable incidents plus the per-peer counters keep that story available after restart
- inspect `lastImportErrorCode`, `lastImportFailureHeight`, and `lastImportFailureBlockHash` on that peer view when the repair was triggered by a rejected block import
- inspect `peerSyncSummary` in `GET /v1/status` or `GET /v1/consensus` to see whether the issue is isolated to one peer or part of a broader pattern such as repeated `unreachable` incidents, admission failures, `replication_blocked` proposal or vote churn, or `proposal_required` import blocks across several peers
- inspect `GET /v1/metrics` to compare that durable summary with the live `peerRuntime.bySyncState` distribution, then use Prometheus-facing peer-level gauges such as `zephyr_peer_sync_peer_occurrence_count` plus the reason or error-code rollups when some peers have already recovered and others are still failing
- inspect `diagnostics` in `GET /v1/status` or `GET /v1/consensus`; if the latest `block_import_rejected` entry has `source=peer_sync`, the node hit a block-import problem during background sync before falling back to snapshot restore
- inspect `recovery.pendingImportCount` and `recovery.pendingImportHeights` to see whether the node is still blocked on a peer-import path or whether that backlog has already been cleared
- inspect `recovery.lastSnapshotRestoreAt`, `recovery.lastSnapshotRestoreHeight`, and `recovery.lastSnapshotRestoreBlockHash` to confirm that snapshot repair actually ran and which chain tip it restored
- inspect `recovery.recentActions` for a completed `block_commit` action when you want durable evidence that the local proposer finished a certified commit, or for a completed `block_import` action followed by a completed `snapshot_restore` action when you are debugging catch-up or divergence repair
- remember that peer snapshot restore now preserves the local node's own recovery, diagnostic, and peer-sync incident history, so post-incident inspection stays on the repairing node instead of inheriting the peer's local WAL context
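The restore reasons and the repair-path-specific alert codes appear to line up one to one. The Go sketch below encodes that pairing so a script can jump from a peer's `lastSnapshotRestoreReason` to the alert it should expect. The pairing is inferred from the names alone and is an assumption, as is the fallback to the aggregate `peer_snapshot_restored` code; `alertCodeForRestore` is a hypothetical helper.

```go
package main

import "fmt"

// restoreAlertByReason pairs each documented restore reason with the
// repair-path-specific alert code that appears to correspond to it.
// Assumption: the pairing is inferred from naming, not confirmed by the source.
var restoreAlertByReason = map[string]string{
	"peer_diverged":  "peer_snapshot_restore_divergence",
	"import_repair":  "peer_snapshot_restore_import_repair",
	"fetch_fallback": "peer_snapshot_restore_fetch_fallback",
}

// alertCodeForRestore returns the specific alert code for a restore reason,
// falling back to the aggregate code for unknown or empty reasons.
func alertCodeForRestore(reason string) string {
	if code, ok := restoreAlertByReason[reason]; ok {
		return code
	}
	return "peer_snapshot_restored"
}

func main() {
	for _, reason := range []string{"peer_diverged", "import_repair", "fetch_fallback", ""} {
		fmt.Printf("%q -> %s\n", reason, alertCodeForRestore(reason))
	}
}
```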
- confirm the peer validator node is started with `ZEPHYR_VALIDATOR_PRIVATE_KEY`
- confirm the private key is a base64-encoded PKCS#8 P-256 key
- confirm `GET /v1/status` on the remote node includes an `identity` object
- confirm `GET /v1/peers` shows the expected `validatorAddress`, then read `identityError`, `admissionError`, and `syncState` for the exact failure mode
- if you enable `ZEPHYR_REQUIRE_PEER_IDENTITY`, peer-originated replicated POST requests without a valid identity are rejected with `403`
- if you configure `ZEPHYR_PEER_VALIDATORS`, confirm the bound `<peer-url>=<validator-address>` pair matches what the peer proves in `GET /v1/status`
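The key-format check in the list above can be done offline with the Go standard library. The sketch below decodes a candidate value and verifies that it is a PKCS#8-encoded ECDSA key on the P-256 curve. It assumes standard padded base64 (`base64.StdEncoding`); if the node expects a different base64 variant, adjust accordingly. `checkValidatorKey` and `generateExampleKey` are hypothetical helpers, and the generated key is for local experiments only.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"encoding/base64"
	"errors"
	"fmt"
)

// checkValidatorKey verifies that encoded is base64 (standard alphabet,
// assumed), wraps a PKCS#8 private key, and that the key is ECDSA on P-256.
func checkValidatorKey(encoded string) error {
	der, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return fmt.Errorf("not valid base64: %w", err)
	}
	parsed, err := x509.ParsePKCS8PrivateKey(der)
	if err != nil {
		return fmt.Errorf("not a PKCS#8 private key: %w", err)
	}
	key, ok := parsed.(*ecdsa.PrivateKey)
	if !ok {
		return errors.New("key is not an ECDSA key")
	}
	if key.Curve.Params().Name != "P-256" {
		return errors.New("key is not on the P-256 curve")
	}
	return nil
}

// generateExampleKey creates a throwaway key in the expected encoding.
func generateExampleKey() (string, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return "", err
	}
	der, err := x509.MarshalPKCS8PrivateKey(key)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(der), nil
}

func main() {
	encoded, err := generateExampleKey()
	if err != nil {
		panic(err)
	}
	fmt.Println(checkValidatorKey(encoded) == nil)     // true
	fmt.Println(checkValidatorKey("not-a-key") == nil) // false
}
```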
- confirm `GET /v1/consensus` shows a non-empty validator set
- confirm the active `currentRound` and `nextProposer` match the proposal and certificate you expect to commit
- confirm `GET /v1/dev/block-template` and your proposal use the same `blockHash`, `previousHash`, `producedAt`, full `transactions`, and `transactionIds`
- confirm `GET /v1/consensus` shows a latest certificate for that same `height`, `round`, and `blockHash`
- confirm you replay `POST /v1/dev/produce-block` with the same `producedAt` used by the certified template when you are using the manual path
- inspect `blockReadiness` in `GET /v1/status`, `GET /v1/consensus`, or `GET /v1/dev/block-template`; common warnings now include `proposal_missing`, `local_template_mismatch`, `certificate_missing`, and `certified_proposal_differs_from_local_template`
- inspect `diagnostics` in `GET /v1/status` or `GET /v1/consensus`; common commit-side codes now include `proposal_required`, `template_mismatch`, `certificate_required`, and `not_scheduled_proposer`
- disable `ZEPHYR_REQUIRE_CONSENSUS_CERTIFICATES` only if you intentionally want a looser local dev flow
- Start Node A.
- Start Node B as a replica or second validator.
- Start the Vue wallet against Node A.
- Create a wallet.
- Fund the wallet with the local dev faucet.
- Sign and broadcast a sample transaction.
- Produce a block on Node A or enable automated certified consensus if you already have validator keys.
- Inspect `GET /v1/status`, `GET /v1/peers`, `GET /v1/blocks/latest`, and `GET /v1/accounts/{address}` on both nodes.
- If validator private keys are configured, confirm peer identity verification succeeds in `GET /v1/peers`.
- Submit a validator election and inspect `GET /v1/validators` plus `GET /v1/consensus`.
- Either submit a matching proposal and validator votes manually, or let the automated proposer and validators handle the active round.
- If the active proposer stalls, watch `currentRound` advance and `nextProposer` rotate.
- Inspect the resulting block, vote tallies, and certificate on both nodes.
- Optionally restart a node and confirm the validator snapshot, round state, consensus artifacts, and `recovery` state survived.
- If the restarted node had a pending local proposal or vote, confirm the action is replayed and later marked completed once the block finalizes.