The default local base URL is:
http://localhost:8080
Change it with ZEPHYR_HTTP_ADDR when starting the node.
- `ZEPHYR_HTTP_ADDR`: HTTP bind address, default `:8080`
- `ZEPHYR_NODE_ID`: node identifier, default `node-local`
- `ZEPHYR_VALIDATOR_ADDRESS`: local validator address for proposer-schedule enforcement and status output, default empty
- `ZEPHYR_VALIDATOR_PRIVATE_KEY`: base64-encoded PKCS#8 P-256 private key used to derive and sign the node transport identity plus automated proposal and vote messages, default empty
- `ZEPHYR_DATA_DIR`: durable node state directory, default `var/node`
- `ZEPHYR_PEERS`: comma-separated peer base URLs, default empty
- `ZEPHYR_BLOCK_INTERVAL`: automatic block-production interval, default `15s`
- `ZEPHYR_CONSENSUS_INTERVAL`: consensus automation ticker interval, default `1s`
- `ZEPHYR_CONSENSUS_ROUND_TIMEOUT`: active round timeout before automation advances to the next round, default `5s`
- `ZEPHYR_SYNC_INTERVAL`: peer poll/sync interval, default `5s`
- `ZEPHYR_MAX_TXS_PER_BLOCK`: maximum committed transactions per block, default `100`
- `ZEPHYR_ENABLE_BLOCK_PRODUCTION`: enable local block production, default `true`
- `ZEPHYR_ENABLE_CONSENSUS_AUTOMATION`: enable the current timeout-driven automation loop, default `false`
- `ZEPHYR_ENABLE_PEER_SYNC`: enable background peer sync, default `true`
- `ZEPHYR_ENABLE_STRUCTURED_LOGS`: emit newline-delimited JSON event logs for diagnostics, peer incidents, and snapshot recovery, default `false`
- `ZEPHYR_REQUIRE_PEER_IDENTITY`: when `true`, replicated peer POST requests must include a valid signed transport identity, default `false`
- `ZEPHYR_PEER_VALIDATORS`: comma-separated `<peer-url>=<validator-address>` bindings used to pin configured peers to expected validators, default empty
- `ZEPHYR_ENFORCE_PROPOSER_SCHEDULE`: when `true`, only the scheduled proposer for the active round may produce the next block once a validator set exists, default `false`
- `ZEPHYR_REQUIRE_CONSENSUS_CERTIFICATES`: when `true`, local block commit and remote block import require a matching proposal and quorum certificate, default `false`

Notes:

- startup rejects `ZEPHYR_ENABLE_CONSENSUS_AUTOMATION=true` unless `ZEPHYR_VALIDATOR_PRIVATE_KEY` is configured
- if `ZEPHYR_VALIDATOR_ADDRESS` is also set, startup rejects mismatches between that address and the private key-derived address
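
The two startup rules in the notes can be checked before launching a node. The following is a minimal sketch, not the node's own validation code: `check_startup_config` and the `derive_address` callable are hypothetical names, the address derivation itself is not described here and must be supplied by the caller, and treating the flag as a case-insensitive `true` is an assumption.

```python
# Documented defaults for the variables this sketch inspects.
DEFAULTS = {
    "ZEPHYR_VALIDATOR_ADDRESS": "",
    "ZEPHYR_VALIDATOR_PRIVATE_KEY": "",
    "ZEPHYR_ENABLE_CONSENSUS_AUTOMATION": "false",
}


def check_startup_config(env, derive_address=None):
    """Return a list of problems; an empty list means both documented rules pass.

    derive_address is a hypothetical callable mapping the private key to a
    validator address. Without it, the mismatch rule is skipped.
    """
    cfg = {**DEFAULTS, **env}
    problems = []
    # Assumption: the flag is compared as a case-insensitive "true".
    automation = cfg["ZEPHYR_ENABLE_CONSENSUS_AUTOMATION"].strip().lower() == "true"
    private_key = cfg["ZEPHYR_VALIDATOR_PRIVATE_KEY"]
    address = cfg["ZEPHYR_VALIDATOR_ADDRESS"]

    if automation and not private_key:
        problems.append(
            "ZEPHYR_ENABLE_CONSENSUS_AUTOMATION=true requires ZEPHYR_VALIDATOR_PRIVATE_KEY"
        )
    if private_key and address and derive_address is not None:
        if derive_address(private_key) != address:
            problems.append(
                "ZEPHYR_VALIDATOR_ADDRESS does not match the private key-derived address"
            )
    return problems
```

Calling `check_startup_config(dict(os.environ))` in a deployment script would surface the first rule early; the second rule only runs when a derivation function is available.
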
```json
{
  "height": 1,
  "round": 1,
  "blockHash": "<64-hex-block-hash>",
  "previousHash": "",
  "producedAt": "2026-03-24T13:00:00Z",
  "transactionIds": [
    "<64-hex-transaction-id>"
  ],
  "transactions": [
    {
      "from": "zph_sender_a",
      "to": "zph_receiver",
      "amount": 5,
      "nonce": 1,
      "memo": "tx-1",
      "payload": "<canonical-transaction-payload>",
      "publicKey": "<base64-spki-public-key>",
      "signature": "<base64-signature>"
    }
  ],
  "proposer": "zph_validator_b",
  "payload": "{\"blockHash\":\"<64-hex-block-hash>\",\"height\":1,\"previousHash\":\"\",\"producedAt\":\"2026-03-24T13:00:00Z\",\"proposer\":\"zph_validator_b\",\"round\":1,\"transactionIds\":[\"<64-hex-transaction-id>\"]}",
  "publicKey": "<base64-spki-public-key>",
  "signature": "<base64-signature>",
  "proposedAt": "2026-03-24T13:00:01Z"
}
```

Current meaning:

- `blockHash` must be the derived hash of `height`, `previousHash`, `producedAt`, and ordered `transactionIds`
- `transactions` must be present and must match `transactionIds` in the same order
- the proposer signs that full template commitment, not just a standalone hash string
- the scheduled proposer is derived from both `height` and `round`
- validators can verify the candidate directly from the proposal body without relying on local mempool convergence alone
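
The `payload` string in the example proposal looks like compact JSON with alphabetically sorted keys over the committed fields. The sketch below reproduces that string under that assumption; it is inferred from the example only, not from the node's serializer. `proposal_payload`, `transactions_match_ids`, and the `tx_id_of` callable are hypothetical names, and how a transaction ID is derived is not specified here.

```python
import json

# Fields that appear inside the example proposal's signed payload string.
PROPOSAL_PAYLOAD_FIELDS = (
    "blockHash",
    "height",
    "previousHash",
    "producedAt",
    "proposer",
    "round",
    "transactionIds",
)


def proposal_payload(proposal):
    """Serialize the committed fields as compact, key-sorted JSON (assumed canonical form)."""
    commitment = {name: proposal[name] for name in PROPOSAL_PAYLOAD_FIELDS}
    return json.dumps(commitment, sort_keys=True, separators=(",", ":"))


def transactions_match_ids(proposal, tx_id_of):
    """Check that transactions are present and line up with transactionIds in order.

    tx_id_of is a caller-supplied function that returns the ID of one transaction.
    """
    if "transactions" not in proposal:
        return False
    derived = [tx_id_of(tx) for tx in proposal["transactions"]]
    return derived == list(proposal["transactionIds"])
```

A verifier would compare `proposal_payload(proposal)` with the received `payload` before checking the signature over it.
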
```json
{
  "height": 1,
  "round": 1,
  "blockHash": "<64-hex-block-hash>",
  "voter": "zph_validator_a",
  "payload": "{\"blockHash\":\"<64-hex-block-hash>\",\"height\":1,\"round\":1,\"voter\":\"zph_validator_a\"}",
  "publicKey": "<base64-spki-public-key>",
  "signature": "<base64-signature>",
  "votedAt": "2026-03-24T13:00:02Z"
}
```

```json
{
  "height": 1,
  "round": 1,
  "blockHash": "<64-hex-block-hash>",
  "votingPower": 43000,
  "quorumVotingPower": 28667,
  "voterCount": 2,
  "voters": ["zph_validator_a", "zph_validator_b"],
  "createdAt": "2026-03-24T13:00:03Z"
}
```

```json
{
  "nodeId": "node-a",
  "validatorAddress": "zph_validator_a",
  "payload": "{\"nodeId\":\"node-a\",\"signedAt\":\"2026-03-24T09:15:00Z\",\"validatorAddress\":\"zph_validator_a\"}",
  "publicKey": "<base64-spki-public-key>",
  "signature": "<base64-signature>",
  "signedAt": "2026-03-24T09:15:00Z"
}
```

`roundEvidence` is a derived operator-facing view exposed by `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template`.

Current fields include:

- `height`, `round`, `nextProposer`, `startedAt`, `deadlineAt`, and `timedOut`
- `quorumVotingPower`, the target voting power required for certification in the active round
- `state`, which is currently one of `no_validator_set`, `idle`, `waiting_for_proposal`, `waiting_for_reproposal`, `collecting_votes`, or `certified`
- `proposalPresent`, `proposalBlockHash`, and `proposalProposer` for the active round
- `latestKnownProposalRound` and `latestKnownProposalBlockHash` when the node has seen a newer stored proposal for the same height than the currently active round
- `voteTallies` for the active round
- `leadingVoteBlockHash`, `leadingVotePower`, `leadingVoteCount`, `quorumRemaining`, and `partialQuorum` so operators can see whether a round is converging or stalled below quorum
- `localVotePresent` and `localVoteBlockHash` for the local validator when configured
- `pendingReplayCount` and `pendingReplayRounds` to show whether the local node still has replayable actions for the active height
- `certificatePresent` and `certificateBlockHash` when the active round already has a matching quorum certificate
- `warnings`, currently drawn from `timeout_elapsed`, `partial_quorum`, `reproposal_pending`, `replay_pending`, and `proposal_not_from_scheduled_proposer`

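
To illustrate how the leading-vote fields might relate to `voteTallies` and `quorumVotingPower`, here is a small sketch. The tally entry shape (`blockHash`, `votingPower`, `voterCount`) and the exact rules for `quorumRemaining` and `partialQuorum` are assumptions made for illustration; the node may compute them differently.

```python
def summarize_tallies(tallies, quorum_voting_power):
    """Derive leading-vote fields from a list of per-block tallies (assumed shape)."""
    if not tallies:
        return {
            "leadingVoteBlockHash": "",
            "leadingVotePower": 0,
            "leadingVoteCount": 0,
            "quorumRemaining": quorum_voting_power,
            "partialQuorum": False,
        }
    # The leading tally is the one with the most accumulated voting power.
    leading = max(tallies, key=lambda tally: tally["votingPower"])
    power = leading["votingPower"]
    return {
        "leadingVoteBlockHash": leading["blockHash"],
        "leadingVotePower": power,
        "leadingVoteCount": leading["voterCount"],
        # Assumption: remaining power never goes below zero once quorum is met.
        "quorumRemaining": max(0, quorum_voting_power - power),
        # Assumption: "partial" means some votes exist but quorum is not reached.
        "partialQuorum": 0 < power < quorum_voting_power,
    }
```

With a single tally of 20000 voting power against a quorum target of 28667, this sketch reports 8667 remaining and a partial quorum.
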
`roundHistory` is a derived per-height view exposed by `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template`.

Current fields include:

- `height`, which is currently the pending `nextHeight`
- `rounds`, sorted by round number for that height
- each round entry exposes `round`, `active`, `startedAt`, `scheduledProposer`, `proposalPresent`, `proposalBlockHash`, `proposalProposer`, `voteTallies`, `certificatePresent`, and `certificateBlockHash`

Current behavior:
- the active round is always included for the pending height, even if that round has no stored proposal yet
- prior rounds remain visible when the node advances after timeout or accepts higher-round messages
- operators can compare round-0 and round-1 proposal, vote, and certificate state directly without reconstructing it from logs or diagnostics
`blockReadiness` is a derived next-block readiness view exposed by `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template`.

Current fields include:

- `height`, the current pending `nextHeight`
- `localTemplateAvailable`, `localTemplateBlockHash`, `localTemplateProducedAt`, and `localTemplateTransactionCount`
- `storedProposalCount` and `certifiedProposalCount`
- `matchingLocalProposalRound` and `matchingLocalCertificate`
- `readyToCommitLocalTemplate`, `readyToCommitStoredProposal`, and `readyToImportCertifiedBlock`
- `latestCertifiedRound`, `latestCertifiedBlockHash`, and `latestCertifiedProducedAt`
- `warnings`, currently drawn from `proposal_missing`, `local_template_mismatch`, `certificate_missing`, `certified_proposal_differs_from_local_template`, and `certified_proposal_available_without_local_template`

Current behavior:

- when no proposal exists yet, the view shows whether the node can build a local candidate and warns with `proposal_missing`
- when a proposal exists but lacks quorum, the view shows the matching round but warns with `certificate_missing`
- when a certified proposal exists for the pending height, the view shows whether the current local template still matches it and whether commit or peer import can proceed from stored artifacts
- wrong `producedAt` or wrong imported block attempts now surface as `template_mismatch` in diagnostics instead of looking like a missing proposal
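
An operator script could turn the `blockReadiness` view into a one-line status. The sketch below is only one possible reading of the fields: the helper name `describe_block_readiness` and the order in which the flags are checked are assumptions, not behavior defined by the node.

```python
def describe_block_readiness(readiness):
    """Map a blockReadiness object (parsed JSON dict) to a short human-readable status."""
    warnings = list(readiness.get("warnings", []))
    # Assumption: the ready flags are checked in this order of preference.
    if readiness.get("readyToCommitLocalTemplate"):
        return "local template can be committed"
    if readiness.get("readyToCommitStoredProposal"):
        return "stored proposal can be committed"
    if readiness.get("readyToImportCertifiedBlock"):
        return "certified block can be imported"
    if "proposal_missing" in warnings:
        return "waiting for a proposal"
    if "certificate_missing" in warnings:
        return "waiting for quorum"
    return "not ready: " + (", ".join(warnings) or "no warnings reported")
```

The input would typically be the `blockReadiness` member of a parsed `GET /v1/status` response.
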
`recovery` is the durable local consensus-action recovery view exposed by `GET /v1/status`, `GET /v1/consensus`, `GET /v1/dev/block-template`, and local proposal or vote submissions.

Current fields include:

- `pendingActionCount`, `pendingReplayCount`, `pendingImportCount`, `pendingImportHeights`, `needsReplay`, and `needsRecovery`
- `lastSnapshotRestoreAt`, `lastSnapshotRestoreHeight`, and `lastSnapshotRestoreBlockHash`
- `pendingActions`, which now list replayable local actions plus pending import-repair actions that still need follow-up
- `recentActions`, which show the latest local consensus actions with `status`, `replayAttempts`, `lastReplayAt`, and `completedAt`, including successful local certified `block_commit` events

Current behavior:

- locally authored proposals and votes for the configured validator are persisted into this WAL view
- successful local certified commits are now recorded as completed `block_commit` actions so the proposer-side path remains visible after quorum
- timeout-driven round advance is also recorded for operator history
- recoverable peer block-import failures now append a pending `block_import` action so operators can see blocked import heights directly in the recovery view
- when automation rebroadcasts a stored local proposal or vote, the matching action updates `replayAttempts` and `lastReplayAt`
- when a block is committed locally or imported for that height, pending proposal, vote, and import-repair actions for that height are marked completed
- when peer sync falls back to snapshot restore, the node preserves its own recovery and diagnostic history, completes any blocked import actions through the restored height, and records a completed `snapshot_restore` action with the restored height and latest block hash

`diagnostics` is a bounded recent rejection-history view exposed by `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template`.

Current fields include:

- `recent`, newest first
- each diagnostic exposes `kind`, `code`, `message`, `height`, `round`, `blockHash`, `validator`, `source`, and `observedAt`

Current behavior:

- rejected proposal submissions append a `proposal_rejected` diagnostic
- rejected vote submissions append a `vote_rejected` diagnostic
- rejected local commit attempts append a `block_commit_rejected` diagnostic
- rejected peer block imports append a `block_import_rejected` diagnostic
- background peer-sync import failures append the same `block_import_rejected` diagnostic with `source` set to `peer_sync` before snapshot fallback
- `code` is a stable operator-facing category such as `unexpected_proposer`, `stale_round`, `conflicting_proposal`, `conflicting_vote`, `proposal_required`, `template_mismatch`, `certificate_required`, or `not_scheduled_proposer`
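
Because `code` is described as a stable category, a client can bucket recent diagnostics by it. This is a minimal client-side sketch; `count_diagnostics_by_code` is a hypothetical helper, and mapping a blank code to `unknown` is a choice made here rather than documented node behavior.

```python
from collections import Counter


def count_diagnostics_by_code(diagnostics_view):
    """Count entries of diagnostics["recent"] per code (input is a parsed JSON dict)."""
    recent = diagnostics_view.get("recent", [])
    # A missing or blank code is grouped under "unknown" (choice made in this sketch).
    return Counter(entry.get("code") or "unknown" for entry in recent)
```

For example, `count_diagnostics_by_code(status["diagnostics"])["stale_round"]` gives the number of retained stale-round rejections, where `status` is a parsed `GET /v1/status` response.
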
`peerSyncHistory` is a bounded durable peer-sync incident view exposed by `GET /v1/status`, `GET /v1/consensus`, and `GET /v1/dev/block-template`.

Current fields include:

- `recent`, newest first
- each incident exposes `peerUrl`, `state`, `reason`, `localHeight`, `peerHeight`, `heightDelta`, `blockHash`, `errorCode`, `errorMessage`, `firstObservedAt`, `lastObservedAt`, and `occurrences`

Current behavior:

- repeated incidents for the same peer and same incident shape are merged into one record with a higher `occurrences` count instead of growing the history indefinitely
- failed outgoing proposal, vote, or block replication is retained as `replication_blocked` incidents with `reason` set to the artifact type and transport-oriented `errorCode` values such as `timeout` or `http_status_500`
- the history survives restart because it is stored in the durable ledger state
- peer snapshot restore preserves the local node's own peer-sync incident history instead of replacing it with the repairing peer's local context
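
The merge-by-shape behavior can be pictured with a short sketch. What counts as the "same incident shape" is not defined here, so the key used below (`peerUrl`, `state`, `reason`, `errorCode`) is an assumption, as are the `record_incident` name and the `limit` of 50 retained records.

```python
def _shape(incident):
    # Assumed definition of "incident shape"; the node may use other fields.
    return (
        incident["peerUrl"],
        incident["state"],
        incident.get("reason", ""),
        incident.get("errorCode", ""),
    )


def record_incident(history, incident, observed_at, limit=50):
    """Merge an incident into a bounded, newest-first history list (mutated in place)."""
    for existing in history:
        if _shape(existing) == _shape(incident):
            # Repeat of a known incident: bump the counter instead of appending.
            existing["occurrences"] += 1
            existing["lastObservedAt"] = observed_at
            break
    else:
        history.append(
            {
                **incident,
                "firstObservedAt": observed_at,
                "lastObservedAt": observed_at,
                "occurrences": 1,
            }
        )
    # ISO-8601 UTC timestamps in one format sort correctly as strings.
    history.sort(key=lambda item: item["lastObservedAt"], reverse=True)
    del history[limit:]
    return history
```

Recording the same `replication_blocked` incident twice for one peer leaves a single record with `occurrences` equal to 2.
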
`peerSyncSummary` is a bounded derived cross-peer incident summary exposed by `GET /v1/status`, `GET /v1/consensus`, `GET /v1/dev/block-template`, and `GET /v1/metrics`.

Current fields include:

- `incidentCount`, `affectedPeerCount`, `totalOccurrences`, and `latestObservedAt`
- `states`, where each entry exposes `state`, `incidentCount`, `affectedPeerCount`, `totalOccurrences`, and `latestObservedAt`
- `horizons`, where each entry exposes fixed `5m`, `15m`, or `1h` windows keyed by retained-incident `LastObservedAt`, with horizon-local incident, peer, occurrence, and dominant state or reason or error-code summaries
- `reasons`, where each entry exposes `reason`, `incidentCount`, `affectedPeerCount`, `totalOccurrences`, and `latestObservedAt`
- `errorCodes`, where each entry exposes `errorCode`, `incidentCount`, `affectedPeerCount`, `totalOccurrences`, and `latestObservedAt`
- `peers`, where each entry exposes `peerUrl`, `incidentCount`, `totalOccurrences`, `latestState`, `latestReason`, `latestErrorCode`, `latestBlockHash`, and `latestObservedAt`

Current behavior:

- repeated incidents for one peer increase `totalOccurrences` without inflating the distinct `incidentCount`
- the summary is derived from the durable peer-sync incident history, so it survives restart and peer snapshot repair
- blank peer incident reasons or error codes are normalized to `unknown` for cross-peer aggregation and metrics export
- state, reason, and error-code summaries are sorted by dominant total occurrences and peer summaries are sorted by latest observation time
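
The aggregation rules above can be sketched as a pure function over the incident history. This is an illustration under stated assumptions, not the node's implementation: `summarize_incidents` is a hypothetical name, only the `states`, `reasons`, and `errorCodes` buckets are derived, and horizons and per-peer summaries are left out.

```python
def _bucket(history, field):
    """Group incidents by one field, normalizing blank values to "unknown"."""
    groups = {}
    for incident in history:
        name = incident.get(field) or "unknown"
        group = groups.setdefault(
            name, {"incidents": 0, "peers": set(), "occurrences": 0, "latest": ""}
        )
        group["incidents"] += 1
        group["peers"].add(incident["peerUrl"])
        group["occurrences"] += incident["occurrences"]
        group["latest"] = max(group["latest"], incident["lastObservedAt"])
    rows = [
        {
            field: name,
            "incidentCount": group["incidents"],
            "affectedPeerCount": len(group["peers"]),
            "totalOccurrences": group["occurrences"],
            "latestObservedAt": group["latest"],
        }
        for name, group in groups.items()
    ]
    # Dominant buckets (most occurrences) come first.
    rows.sort(key=lambda row: row["totalOccurrences"], reverse=True)
    return rows


def summarize_incidents(history):
    """Derive top-level totals plus state, reason, and error-code buckets."""
    return {
        "incidentCount": len(history),
        "affectedPeerCount": len({incident["peerUrl"] for incident in history}),
        "totalOccurrences": sum(incident["occurrences"] for incident in history),
        "latestObservedAt": max(
            (incident["lastObservedAt"] for incident in history), default=""
        ),
        "states": _bucket(history, "state"),
        "reasons": _bucket(history, "reason"),
        "errorCodes": _bucket(history, "errorCode"),
    }
```

Note how a record with `occurrences` of 3 adds 3 to `totalOccurrences` but only 1 to `incidentCount`, matching the first behavior bullet.
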
Returns the durable validator snapshot, the latest consensus artifacts, and the derived consensus summary for the next height.
Current behavior:
- the response includes `consensusAutomationEnabled`, `structuredLogsEnabled`, `proposerScheduleEnforced`, and `consensusCertificatesRequired`
- `validatorSet` exposes the durable validator snapshot
- `artifacts` exposes the latest stored proposal, votes, and certificate
- `consensus` now includes `currentRound` and `currentRoundStartedAt` in addition to `nextHeight`, `nextProposer`, total voting power, and quorum target
- `nextProposer` reflects the active round, not only the next height
- `roundEvidence` exposes the round deadline, proposal presence, vote tallies, leading vote, quorum remaining, replay backlog, warnings, local vote, and certificate state for operator inspection
- `roundHistory` exposes the pending height across rounds so operators can compare prior and active proposer attempts side by side
- `blockReadiness` exposes whether the current local template matches stored proposals and certificates for the pending height
- `recovery` exposes the local consensus-action WAL, including pending replayable actions, pending import backlog, and recent replay, completion, plus snapshot-restore metadata
- `diagnostics` exposes recent rejected proposal, vote, commit, and import events
- `peerSyncHistory` exposes recent durable cross-peer sync incidents
- `peerSyncSummary` exposes the derived cross-peer totals for those incidents

Validates and persists a signed proposal for the next block height.
Current behavior:
- the proposer must be part of the active validator set
- the proposer must match the scheduled proposer for that height and round
- `previousHash` must match the current chain tip for that height
- `blockHash` must match the proposal's `producedAt` plus ordered `transactionIds`
- `transactions` must be present and must match `transactionIds` in the same order
- the node rejects stale lower-round proposals after it has already moved forward
- a valid higher-round proposal can advance the local active round when needed
- the proposal is stored durably and replicated to admitted peers
- when automation is enabled, the scheduled proposer uses the same validation path internally before broadcasting the proposal
- when the proposal is authored by the node's configured local validator, the node persists a local recovery action for restart replay
- when automation is enabled and a peer link comes back, the proposer can rebroadcast its latest stored proposal for the pending height until a matching certificate exists
- rejected proposals are appended to the diagnostic history exposed by status and consensus surfaces
Validates and persists a signed validator vote for a known proposal.
Current behavior:
- the voter must be part of the active validator set
- the vote must target a known proposal for that height and round
- duplicate same-block votes from the same validator are idempotent
- the node rejects stale lower-round votes after it has already moved forward
- a valid higher-round vote can advance the local active round when needed, as long as the referenced proposal is known
- if the accumulated vote power reaches quorum, the node stores a commit certificate artifact
- when certificate enforcement is enabled, that certificate can unlock local commit and remote import for the matching block hash
- when automation is enabled, active validators use the same validation path internally before broadcasting their vote
- when the vote is authored by the node's configured local validator, the node persists a local recovery action for restart replay
- when automation is enabled and a peer link comes back, validators can rebroadcast their latest stored vote for the pending height until the matching certificate exists
- rejected votes are appended to the diagnostic history exposed by status and consensus surfaces
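
The vote-handling rules for a single height and round can be sketched as follows. This is a simplified model under several assumptions: the state layout, the `accept_vote` name, and the return strings are hypothetical; rejecting a second vote for a different block as `conflicting_vote` is inferred from the diagnostic code list; signature checks, round advancement, and persistence are omitted.

```python
def accept_vote(state, vote, validators, quorum_voting_power):
    """Apply one vote to in-memory state for a single height and round.

    state: {"votes": {voter: blockHash}, "certificate": dict or None}
    validators: {validator address: voting power}
    """
    voter = vote["voter"]
    if voter not in validators:
        return "rejected: voter not in validator set"

    previous = state["votes"].get(voter)
    if previous == vote["blockHash"]:
        # Duplicate same-block vote: idempotent, nothing changes.
        return "duplicate"
    if previous is not None:
        # Assumption: a second vote for a different block is rejected.
        return "rejected: conflicting_vote"

    state["votes"][voter] = vote["blockHash"]
    voters = sorted(v for v, h in state["votes"].items() if h == vote["blockHash"])
    power = sum(validators[v] for v in voters)

    if power >= quorum_voting_power and state["certificate"] is None:
        state["certificate"] = {
            "blockHash": vote["blockHash"],
            "votingPower": power,
            "quorumVotingPower": quorum_voting_power,
            "voterCount": len(voters),
            "voters": voters,
        }
        return "certified"
    return "accepted"
```

With illustrative powers of 20000 and 23000 for two validators and a quorum target of 28667, the first vote is accepted and the second produces a certificate with `votingPower` 43000 and `voterCount` 2.
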
Returns a simple liveness response.
Returns a Prometheus-compatible text export derived from the same durable and live operator signals exposed by `GET /v1/metrics`, `GET /v1/health`, `GET /v1/alerts`, and `GET /v1/slo`.

Current behavior:

- the response uses `text/plain; version=0.0.4; charset=utf-8`
- the endpoint keeps returning `200` while the HTTP API is alive; readiness is exported through `zephyr_node_ready`, `zephyr_health_status`, and `zephyr_health_check_status` instead of surfacing failure as HTTP `503`
- the current metric families cover node flags, chain height and mempool size, consensus height and round state, recovery backlog, retained consensus action history, retained consensus diagnostic buckets, live peer runtime counts, durable peer-sync incident summaries including recent `5m`, `15m`, `1h`, `6h`, and `24h` horizon gauges keyed by incident `LastObservedAt`, active derived alert gauges, and SLO-oriented objective gauges
- `GET /v1/metrics` remains the structured JSON surface for automation that wants typed objects, while `GET /metrics` is the scrape-friendly adapter for Prometheus-style monitoring stacks
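
Since the endpoint keeps returning `200`, a script that wants readiness from the text export has to read the gauge value itself. The helper below is a minimal sketch of reading one sample from Prometheus text exposition output; `read_gauge` is a hypothetical name, it returns only the first matching sample, and the sample text in the usage note is illustrative rather than captured from a node.

```python
def read_gauge(exposition_text, metric_name):
    """Return the first sample value for metric_name, or None if it is absent."""
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines plus HELP and TYPE comments
        if "{" in line:
            # Labelled sample: name{labels} value [timestamp]
            name = line[: line.index("{")]
            fields = line[line.rindex("}") + 1 :].split()
        else:
            # Unlabelled sample: name value [timestamp]
            name, *fields = line.split()
        if name == metric_name and fields:
            return float(fields[0])
    return None
```

For example, given the text `"# TYPE zephyr_node_ready gauge\nzephyr_node_ready 1\n"`, calling `read_gauge(text, "zephyr_node_ready")` returns `1.0`. For production scraping, a full parser from a Prometheus client library is a better fit than this sketch.
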
Returns a derived readiness response built from durable ledger state plus the latest live peer-runtime view.
Current behavior:
- `/health` remains a simple liveness probe, while `/v1/health` is the richer readiness surface for operators and automation
- the top-level response includes `generatedAt`, node identity, peer count, runtime flags, `live`, `ready`, `status`, ordered `checks`, and flattened `warnings`
- `status` is currently one of `ok`, `warn`, or `fail`, while each check uses `pass`, `warn`, or `fail`
- the current checks are `api`, `validator_set`, `recovery`, `consensus`, `settlement_throughput`, `peer_sync`, and `diagnostics`; `settlement_throughput` becomes active when automatic block production has both an enabled producer and a positive block interval
- `warn` checks keep the node live and ready but surface degraded conditions such as recent diagnostics, early peer observation, or consensus warnings
- `fail` checks set `ready=false` and return HTTP `503`; the current hard-fail cases are recovery backlog and peer-sync availability failures when peer sync is enabled
- the `peer_sync` detail now appends recent retained-incident horizon summaries as `recentOccurrences=5m:...|15m:...|1h:...` plus `recentPeers=...`, making it easier to distinguish fresh peer churn from older retained pressure without querying Prometheus
- `warnings` is a flattened operator-facing list built from the active warn or fail checks so dashboards do not need to re-derive short incident summaries
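
The relationship between per-check results, the overall `status`, `ready`, and the HTTP code can be sketched as below. The check object shape (`name`, `status`, `detail`) and the warning text format are assumptions for illustration; only the pass, warn, and fail semantics come from the description above.

```python
def summarize_health(checks):
    """Roll ordered checks up into status, ready, HTTP code, and flattened warnings."""
    statuses = [check["status"] for check in checks]
    if "fail" in statuses:
        status = "fail"
    elif "warn" in statuses:
        status = "warn"
    else:
        status = "ok"
    # warn keeps the node ready; only fail flips readiness and yields 503.
    ready = status != "fail"
    warnings = [
        f'{check["name"]}: {check.get("detail", "")}'.rstrip(": ")
        for check in checks
        if check["status"] in ("warn", "fail")
    ]
    return {
        "status": status,
        "ready": ready,
        "httpStatus": 200 if ready else 503,
        "warnings": warnings,
    }
```

A load balancer would gate on the HTTP code alone, while a dashboard could show the flattened `warnings` list directly.
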
Returns a derived alert surface built from the same readiness, recovery, consensus, diagnostics, and peer-sync signals used elsewhere in the node API.
Current behavior:
- the top-level response includes node identity, readiness or status, runtime flags, counts by severity, and the current active alerts
- alert severities are currently `critical` and `warning`
- current alert codes include `validator_set_missing`, `consensus_recovery_backlog`, `consensus_state_warning`, `settlement_throughput_reduced`, `settlement_throughput_stalled`, `peer_sync_unavailable`, `peer_sync_degraded`, `peer_import_blocked`, `peer_admission_blocked`, `peer_replication_blocked`, `peer_snapshot_restored`, `peer_snapshot_restore_divergence`, `peer_snapshot_restore_import_repair`, `peer_snapshot_restore_fetch_fallback`, and `recent_consensus_diagnostics`
- the endpoint always returns `200`; unlike `/v1/health`, it is intended for dashboards and polling systems that want the active alert set instead of an HTTP readiness gate
- `settlement_throughput_reduced` and `settlement_throughput_stalled` are derived from queued mempool work versus the age of the latest committed block when automatic block production is enabled, and now include the current worst-case drain estimate plus warn-normalized forecast in `detail` when recent throughput baselines exist
- `peer_import_blocked` is derived from retained `import_blocked` incidents and includes the representative import error code in `detail`
- `peer_admission_blocked` is derived from retained `unadmitted` incidents and includes the representative admission reason in `detail`
- `peer_replication_blocked` is derived from retained `replication_blocked` incidents and includes the representative artifact `reason` plus transport-oriented `errorCode` in `detail`
- `peer_snapshot_restored` is derived as the aggregate snapshot-repair warning and includes the dominant repair `reason` plus representative `errorCode` when available
- `peer_snapshot_restore_divergence`, `peer_snapshot_restore_import_repair`, and `peer_snapshot_restore_fetch_fallback` are derived from the retained snapshot-repair reason buckets so polling systems can distinguish the repair path directly from `/v1/alerts`
- `/metrics` mirrors this alert state through `zephyr_alert_count`, `zephyr_alert_count_by_severity`, `zephyr_alert_active`, and `zephyr_alert_observed_at_seconds`
- `GET /v1/slo` builds on the same evidence when operators want objective-style summaries instead of raw alert cards
- `GET /v1/alert-rules` and `GET /v1/alert-rules/prometheus` build on the same evidence when operators want recommended monitoring bundles rather than only the current runtime state
- `GET /v1/recording-rules` and `GET /v1/recording-rules/prometheus` build on the same evidence when operators want reusable dashboard and aggregation rollups rather than only current alert or SLO state
- `GET /v1/dashboards` and `GET /v1/dashboards/grafana` build one layer higher on the same evidence when operators want recommended dashboard structure and Grafana import material instead of only current cards or rollups

Returns an SLO-oriented objective summary built from the same readiness, alert, recovery, diagnostics, and peer-sync signals used elsewhere in the node API.
Current behavior:
- the top-level response includes `generatedAt`, node identity, optional validator address, peer count, `ready`, `healthStatus`, counts by alert severity, counts by objective status, and the current objective list
- objective statuses are currently `meeting`, `at_risk`, `breached`, and `not_applicable`
- the current objectives are `node_readiness`, `consensus_continuity`, `peer_sync_continuity`, and `settlement_throughput`
- `node_readiness` summarizes whether the node is still ready to serve traffic, `consensus_continuity` summarizes whether the next-height pipeline is clear enough to progress, `peer_sync_continuity` summarizes whether at least one admitted and reachable peer path exists when peer sync is enabled, and `settlement_throughput` summarizes whether queued transactions are clearing within the expected automatic block-production window
- the endpoint always returns `200`; unlike `/v1/health`, it is designed for dashboards and automation that want a stable objective summary rather than an HTTP readiness gate
- `/metrics` mirrors this summary through `zephyr_slo_objective_count`, `zephyr_slo_status_count`, and `zephyr_slo_objective_status`
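
Because the endpoint always returns `200`, automation has to inspect the objective list itself. The sketch below picks the most severe status across objectives; the objective entry shape (a `status` field per entry) and the severity ordering are assumptions made for illustration.

```python
# Assumed ordering from most to least severe.
STATUS_SEVERITY = ["breached", "at_risk", "meeting", "not_applicable"]


def worst_objective_status(objectives):
    """Return the most severe status present in a list of objective dicts."""
    present = {objective["status"] for objective in objectives}
    for status in STATUS_SEVERITY:
        if status in present:
            return status
    # No objectives, or only unrecognized statuses.
    return "not_applicable"
```

A polling job could page only when `worst_objective_status` returns `breached` and log a notice on `at_risk`.
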
Returns a machine-readable recommended alert bundle derived from the current Zephyr metrics, alert codes, and SLO objectives.
Current behavior:
- the top-level response includes `generatedAt`, node identity, optional validator address, peer count, `peerSyncEnabled`, current health or alert summary counts, total rule counts, and grouped rule bundles
- rules are currently grouped into readiness, consensus, throughput, and peer-sync bundles; the throughput group adds queue-drain warnings for automatic block production, and the peer-sync group includes continuity plus targeted peer import, peer admission, peer replication, and peer snapshot-restore diagnostics, including repair-path-specific divergence, import-repair, and fetch-fallback snapshot-restore rules
- each rule includes `summary`, `description`, `expression`, severity, component, source metrics, related alert codes or SLO objectives, and whether the rule is currently enabled for the node's runtime configuration
- throughput and peer-sync rules stay visible in the JSON surface even when the related runtime mode is unavailable; settlement-throughput rules include `enabled=false` plus a `disabledReason` when automatic block production is disabled or has no positive block interval, and peer-sync rules do the same when peer sync is disabled or no peers are configured
- the current bundle is intentionally opinionated: it is a recommended starting point for monitoring stacks built on `zephyr_node_ready`, `zephyr_alert_active`, and `zephyr_slo_objective_status`

Returns the enabled portion of the same bundle as Prometheus-rule YAML.
Current behavior:
- the response uses `application/yaml; charset=utf-8`
- only enabled rules are exported, so peer-sync alerts are omitted when peer sync is disabled or no peers are configured and throughput alerts are omitted when automatic block production is disabled or has no positive block interval
- rules are grouped into readiness, consensus, throughput, and peer-sync groups and include severity, component, summary, description, and source-metric annotations
- this endpoint is designed as an export adapter for monitoring systems that already scrape `GET /metrics`

Returns a machine-readable recommended recording-rule bundle derived from the current Zephyr metrics, alerts, and SLO objectives.
Current behavior:
- the top-level response includes `generatedAt`, node identity, optional validator address, peer count, `peerSyncEnabled`, current health or alert summary counts, total rule counts, and grouped rule bundles
- rules are currently grouped into readiness, consensus, throughput, peer-sync, and operator-summary bundles
- each rule includes a stable `record` name, `summary`, `description`, `expression`, component, source metrics, related alert codes or SLO objectives, and whether the rule is currently enabled for the node's runtime configuration
- the throughput group adds canonical `zephyr:settlement_throughput:at_risk` and `zephyr:settlement_throughput:breached` rollups for queue-drain state plus normalized `zephyr:settlement_queue_drain:warn_utilization` and `zephyr:settlement_queue_drain:fail_utilization` rollups for threshold pressure, along with canonical max projected-pressure and max drain-time rollups for worst-case settlement forecast views
- the peer-sync snapshot-repair rollups now carry both the compatibility aggregate `peer_snapshot_restored` code and the split `peer_snapshot_restore_divergence`, `peer_snapshot_restore_import_repair`, and `peer_snapshot_restore_fetch_fallback` codes in `relatedAlertCodes`
- throughput and peer-sync rules stay visible in the JSON surface even when the related runtime mode is unavailable; settlement-throughput rules include `enabled=false` plus a `disabledReason` when automatic block production is disabled or has no positive block interval, and peer-sync rules do the same when peer sync is disabled or no peers are configured
- the current bundle is intentionally opinionated: it is a recommended starting point for dashboards, fleet rollups, and downstream Prometheus recording-rule files built on:
  - `zephyr_node_ready`, `zephyr_alert_count_by_severity`, `zephyr_slo_objective_status`, and recovery or peer-runtime gauges
  - the settlement-throughput rollups `zephyr:settlement_throughput:at_risk` and `zephyr:settlement_throughput:breached`
  - the settlement queue-drain utilization rollups `zephyr:settlement_queue_drain:warn_utilization` and `zephyr:settlement_queue_drain:fail_utilization`
  - the settlement queue-drain estimate rollups `zephyr:settlement_queue_drain:estimate_seconds_1m`, `zephyr:settlement_queue_drain:estimate_seconds_5m`, `zephyr:settlement_queue_drain:estimate_seconds_15m`, and `zephyr:settlement_queue_drain:estimate_seconds_max`
  - the settlement queue-drain projected-pressure rollups `zephyr:settlement_queue_drain:estimate_warn_utilization_1m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_5m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_15m`, and `zephyr:settlement_queue_drain:estimate_warn_utilization_max`
  - the per-peer incident-pressure rollups `zephyr:peer_sync:incident_pressure_by_peer` and `zephyr:peer_sync:incident_pressure_by_horizon`
  - the snapshot-repair pressure rollups `zephyr:peer_sync:snapshot_restore_pressure_by_peer`, `zephyr:peer_sync:snapshot_restore_age_by_peer`, `zephyr:peer_sync:snapshot_restore_pressure`, and `zephyr:peer_sync:snapshot_restore_pressure_by_reason`
  - the canonical recent-TPS rollups `zephyr:chain:transactions_per_second_1m`, `zephyr:chain:transactions_per_second_5m`, and `zephyr:chain:transactions_per_second_15m`

Returns the enabled portion of the same bundle as Prometheus recording-rule YAML.
Current behavior:
- the response uses `application/yaml; charset=utf-8`
- only enabled rules are exported, so settlement-throughput recording rules are omitted when automatic block production is disabled or has no positive block interval, and peer-sync recording rules are omitted when peer sync is disabled or no peers are configured
- rules are grouped into readiness, consensus, throughput, peer-sync, and operator-summary groups and include stable `record` names plus component, group, related objective, or related alert labels when applicable
- the throughput group includes `zephyr:settlement_throughput:at_risk`, `zephyr:settlement_throughput:breached`, `zephyr:settlement_queue_drain:warn_utilization`, `zephyr:settlement_queue_drain:fail_utilization`, `zephyr:settlement_queue_drain:estimate_seconds_1m`, `zephyr:settlement_queue_drain:estimate_seconds_5m`, `zephyr:settlement_queue_drain:estimate_seconds_15m`, `zephyr:settlement_queue_drain:estimate_seconds_max`, `zephyr:settlement_queue_drain:estimate_warn_utilization_1m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_5m`, `zephyr:settlement_queue_drain:estimate_warn_utilization_15m`, and `zephyr:settlement_queue_drain:estimate_warn_utilization_max`
- the peer-sync group includes the per-peer incident-pressure rollups `zephyr:peer_sync:incident_pressure_by_peer` and `zephyr:peer_sync:incident_pressure_by_horizon` plus the snapshot-repair rollups `zephyr:peer_sync:snapshot_restore_pressure_by_peer`, `zephyr:peer_sync:snapshot_restore_age_by_peer`, `zephyr:peer_sync:snapshot_restore_pressure`, and `zephyr:peer_sync:snapshot_restore_pressure_by_reason`
- multi-alert recording rules now export a comma-joined `alert_codes` label so related-alert metadata survives the Prometheus YAML export
- this endpoint is designed as an export adapter for monitoring systems that already scrape `GET /metrics` and want reusable dashboard or aggregation series without hand-writing PromQL

Returns a machine-readable recommended dashboard bundle derived from the current Zephyr metrics, SLO objectives, alert state, and recording rules.
Current behavior:
- the top-level response includes
generatedAt, node identity, optional validator address, peer count, peerSyncEnabled, structuredLogsEnabled, current health or objective summary counts, total dashboard counts, total panel counts, and the current dashboard list
- dashboards are currently grouped into operator overview, consensus-and-recovery, and peer-sync bundles
- the overview bundle now includes:
  - a Recent transaction throughput panel backed by the canonical zephyr:chain:transactions_per_second_1m, zephyr:chain:transactions_per_second_5m, and zephyr:chain:transactions_per_second_15m recording rules
  - a Settlement throughput state panel backed by zephyr:settlement_throughput:at_risk and zephyr:settlement_throughput:breached
  - a raw Settlement queue-drain lag panel backed by zephyr_settlement_queue_drain_lag_seconds plus zephyr_settlement_queue_drain_threshold_seconds
  - a normalized Settlement queue-drain utilization panel backed by zephyr:settlement_queue_drain:warn_utilization and zephyr:settlement_queue_drain:fail_utilization
  - an Estimated queue-drain pressure panel backed by the canonical zephyr:settlement_queue_drain:estimate_warn_utilization_1m, zephyr:settlement_queue_drain:estimate_warn_utilization_5m, and zephyr:settlement_queue_drain:estimate_warn_utilization_15m recording rules
  - a Worst-case estimated queue-drain pressure stat backed by zephyr:settlement_queue_drain:estimate_warn_utilization_max
  - an Estimated queue-drain time panel backed by the canonical zephyr:settlement_queue_drain:estimate_seconds_1m, zephyr:settlement_queue_drain:estimate_seconds_5m, and zephyr:settlement_queue_drain:estimate_seconds_15m recording rules
  - a Worst-case estimated queue-drain time stat backed by zephyr:settlement_queue_drain:estimate_seconds_max
- the peer-sync bundle includes incident-by-state, incident-by-reason, incident-by-error-code, per-peer incident-pressure, Peer incident pressure horizons, Peer snapshot restore pressure by peer, Peer snapshot restore heights, Peer snapshot restore age, Peer snapshot restore pressure, and Peer snapshot restore reasons panels tied back to the peer import, peer admission, peer replication, and peer snapshot-restore alert codes
- the peer incident and snapshot-repair panels now carry both the compatibility aggregate peer_snapshot_restored code and the split peer_snapshot_restore_* codes in relatedAlertCodes
- the per-peer, horizon, and per-peer snapshot-repair panels use the canonical recording rules zephyr:peer_sync:incident_pressure_by_peer, zephyr:peer_sync:incident_pressure_by_horizon, zephyr:peer_sync:snapshot_restore_pressure_by_peer, and zephyr:peer_sync:snapshot_restore_age_by_peer
- each panel includes a stable panel id, kind, summary, description, PromQL queries, source metrics, source endpoints, related recording rules, related alert codes or objectives, and whether the panel is currently enabled for the node's runtime configuration
- settlement-specific overview panels stay visible in the JSON surface even when automatic block production monitoring is unavailable, and the peer-sync dashboard does the same when peer sync is disabled or no peers are configured; in both cases the affected panels include enabled=false plus a disabledReason so operators can see what would become active on a producing or synced node
- the current bundle is intentionally opinionated: it is a recommended starting point for Grafana or other dashboard tooling built on GET /metrics, the recording-rule bundle, and the higher-level health, alert, or SLO projections
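As a rough illustration of how a client might use the enabled and disabledReason fields, the sketch below walks an already-decoded dashboard bundle and lists the panels that are currently disabled. Only id, enabled, and disabledReason are named in this section; the top-level dashboards key and the per-dashboard title and panels keys are assumptions about the JSON layout, and the sample values are hand-written.

```python
from typing import Any


def disabled_panels(bundle: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (dashboard title, panel id, disabledReason) for each disabled panel.

    Assumed layout (not confirmed by the docs): bundle["dashboards"] is a list
    and each dashboard carries "title" plus a "panels" list.
    """
    rows = []
    for dashboard in bundle.get("dashboards", []):
        for panel in dashboard.get("panels", []):
            # A panel without an explicit flag is treated as enabled here.
            if panel.get("enabled", True):
                continue
            rows.append((
                dashboard.get("title", ""),
                panel.get("id", ""),
                panel.get("disabledReason", ""),
            ))
    return rows


# Hand-written sample; panel ids and reason strings are illustrative only.
sample = {
    "dashboards": [
        {"title": "Operator overview", "panels": [
            {"id": "tx-throughput", "enabled": True},
            {"id": "settlement-lag", "enabled": False,
             "disabledReason": "block production disabled"},
        ]},
        {"title": "Peer sync", "panels": [
            {"id": "incident-by-state", "enabled": False,
             "disabledReason": "no peers configured"},
        ]},
    ]
}

for title, panel_id, reason in disabled_panels(sample):
    print(f"{title}: {panel_id} disabled ({reason})")
```

Listing disabled panels alongside their reasons mirrors the intent described above: operators can see what would become active on a producing or synced node.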
Returns the enabled portion of the same bundle as Grafana-oriented JSON.
Current behavior:
- the response uses application/json; charset=utf-8
- only enabled dashboards and panels are exported, so peer-sync dashboards are omitted when peer sync is disabled or no peers are configured
- each exported entry includes a stable filename, dashboard UID, title, tags, and prewired PromQL targets for its panels
- this endpoint is designed as an export adapter for operators who already scrape GET /metrics and want a dashboard starting point without recreating the panel queries by hand
Returns the local runtime status for the current node, including consensus summary and whether proposer or certificate enforcement is enabled.
Current behavior:
- the response includes consensusAutomationEnabled and structuredLogsEnabled
- the embedded consensus view now exposes currentRound, currentRoundStartedAt, and the active-round nextProposer
- roundEvidence exposes the active round deadline, state, vote tallies, leading vote, quorum remaining, replay backlog, warnings, proposal presence, local vote, and certificate visibility for operators
- roundHistory exposes the pending height across rounds so operators can inspect round-0, round-1, and later attempts together
- blockReadiness exposes whether the local template is ready to commit and whether a certified stored proposal is already ready for commit or import
- recovery exposes pending replayable local actions, pending import backlog, and recent replay/completion plus snapshot-restore metadata from the local consensus-action WAL
- diagnostics exposes recent rejected proposal, vote, commit, and import events
- peerSyncHistory exposes a durable recent history of cross-peer sync incidents, including repeated failures merged by occurrence count
- peerSyncSummary exposes affected-peer totals, dominant states, dominant reasons, dominant error codes, the latest incident summary across peers, and fixed recent horizons for 5m, 15m, 1h, 6h, and 24h keyed by each retained incident's latest observation time
- GET /v1/metrics offers a machine-readable roll-up of that durable summary plus live peer runtime counts
- GET /metrics offers a Prometheus-style text projection of the same operator signals for scrape-based monitoring and alerting
- GET /v1/health offers a pass, warn, or fail readiness summary derived from the same durable and live operator signals; unlike /health, it can return HTTP 503 when fail checks are active
- GET /v1/alerts offers the current derived warning and critical alert set for operator polling and dashboard integration
- GET /v1/slo offers the current SLO-oriented objective summary derived from those same readiness, recovery, consensus, diagnostics, and peer signals
- GET /v1/alert-rules and GET /v1/alert-rules/prometheus offer recommended monitoring bundles derived from those same metrics, alert codes, and objective states
- GET /v1/recording-rules and GET /v1/recording-rules/prometheus offer recommended dashboard and aggregation rollups derived from those same metrics, alert codes, and objective states
- GET /v1/dashboards and GET /v1/dashboards/grafana offer recommended dashboard bundles and Grafana export derived from those same metrics, recording rules, alert codes, and objective states
- when ZEPHYR_VALIDATOR_PRIVATE_KEY is configured, the response includes an identity object with a signed transport proof for the local validator
- peerIdentityRequired is true when strict peer admission or explicit peer-validator binding is enabled
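Because GET /v1/health can answer with HTTP 503 while fail checks are active, a polling client has to read that response rather than treat it as a transport error. The following is a minimal sketch using only the Python standard library. It assumes the default base URL http://localhost:8080 and assumes that the 503 response still carries a JSON body; neither the body layout nor the helper names come from the node itself.

```python
import json
import urllib.error
import urllib.request


def fetch_health(base_url: str = "http://localhost:8080", timeout: float = 5.0):
    """Return (http_status, decoded_body) from GET /v1/health.

    urllib raises HTTPError for a 503, so the error object is read like a
    normal response. Assumption: the 503 body is JSON, like the 200 body.
    """
    url = f"{base_url}/v1/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read().decode("utf-8"))


def is_ready(http_status: int) -> bool:
    """Treat only HTTP 200 as ready; 503 signals active fail checks."""
    return http_status == 200


if __name__ == "__main__":
    status, body = fetch_health()
    print("ready" if is_ready(status) else "not ready", body)
```

Keying readiness off the HTTP status keeps the probe independent of the exact JSON field names, which this section does not spell out.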
Returns a machine-readable observability snapshot built from durable ledger state plus the latest live peer runtime views.
Current behavior:
- the top-level response includes generatedAt, node identity, runtime flags including structuredLogsEnabled, and embedded status, consensus, and recovery summaries
- consensusActions rolls up the durable local WAL and recovery actions into totalCount, pendingCount, totalReplayAttempts, latest record or completion times, and byType or byStatus buckets; current types can include proposal, vote, round_advance, block_commit, block_import, and snapshot_restore
- diagnostics rolls up the bounded rejection history into totalCount, latestObservedAt, and byKind, byCode, or bySource buckets
- peerSyncSummary reuses the durable cross-peer incident summary also exposed by status, consensus, and block-template responses, including fixed recent horizons for 5m, 15m, 1h, 6h, and 24h that filter retained incidents by LastObservedAt
- peerRuntime reflects the current configured peer set and live syncState distribution, including reachable or admitted counts versus unreachable or unadmitted counts
- unlike peerSyncSummary, peerRuntime is derived from the latest in-memory peer view and may reset on process restart until peers are seen again
- chainThroughput summarizes committed-chain throughput with total committed block and transaction counts, latest committed block time and interval, and fixed 1m, 5m, and 15m windows carrying block counts, transaction counts, blocks per second, transactions per second, and average transactions per block
- settlementThroughput adds the structured settlement queue-drain view, including whether monitoring is applicable, the current health-style status, active alert metadata plus observedAt when present, the latest commit age, backlog lag, normalized warn and fail utilization ratios, recent backlog-drain estimates for the 1m, 5m, and 15m throughput windows, per-estimate warn utilization ratios for those same windows, an explicit peakDrainEstimate summary for the current worst-case backlog projection, and the expected, warn, or fail thresholds derived from automatic block production
- GET /metrics reuses these same rollups in Prometheus-compatible text form, including:
  - readiness gauges such as zephyr_node_ready and zephyr_health_check_status
  - alert gauges such as zephyr_alert_count and zephyr_alert_active
  - SLO gauges such as zephyr_slo_status_count and zephyr_slo_objective_status
  - aggregate peer-incident gauges such as zephyr_peer_sync_reason_occurrence_count and zephyr_peer_sync_error_code_occurrence_count
  - recent-horizon peer gauges such as zephyr_peer_sync_horizon_incident_count, zephyr_peer_sync_horizon_affected_peer_count, zephyr_peer_sync_horizon_occurrence_count, and zephyr_peer_sync_horizon_latest_observed_at_seconds labeled by window
  - per-peer retained-incident gauges such as zephyr_peer_sync_peer_incident_count, zephyr_peer_sync_peer_occurrence_count, and zephyr_peer_sync_peer_latest_observed_at_seconds labeled by peer_url plus the latest dominant state, reason, and error code
  - per-peer snapshot-repair metadata gauges such as zephyr_peer_snapshot_restore_last_height, zephyr_peer_snapshot_restore_last_observed_at_seconds, and zephyr_peer_snapshot_restore_age_seconds labeled by peer_url and retained repair reason
  - chain throughput gauges such as zephyr_chain_total_committed_transaction_count, zephyr_chain_latest_block_interval_seconds, and zephyr_chain_window_transactions_per_second
  - settlement gauges such as zephyr_settlement_monitoring_applicable, zephyr_settlement_latest_commit_age_seconds, zephyr_settlement_queue_drain_lag_seconds, zephyr_settlement_expected_interval_seconds, zephyr_settlement_queue_drain_threshold_seconds, zephyr_settlement_queue_drain_utilization_ratio, zephyr_settlement_estimated_queue_drain_warn_utilization_ratio, zephyr_settlement_estimated_queue_drain_warn_utilization_ratio_max, zephyr_settlement_estimated_queue_drain_seconds, and zephyr_settlement_estimated_queue_drain_seconds_max
- GET /v1/dashboards and GET /v1/dashboards/grafana build directly on these same Prometheus-facing rollups plus the recording-rule bundle when operators want prewired dashboard panels instead of only raw metrics
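For quick checks without a Prometheus server, the text form of GET /metrics can be read directly. The sketch below is a deliberately small parser for the Prometheus text exposition format that pulls every sample of one metric along with its labels, for example the per-window values of zephyr_peer_sync_horizon_incident_count. The sample payload is hand-written for illustration (the HELP text and values are not real node output), and the parser ignores timestamps and does not unescape label values.

```python
import re

# name{labels} value [timestamp]; the labels group is optional.
_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str, metric: str) -> list[tuple[dict[str, str], float]]:
    """Return (labels, value) pairs for every sample of the given metric."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines plus HELP and TYPE comments
        match = _SAMPLE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


# Illustrative scrape body, not captured from a running node.
sample_metrics = """\
# HELP zephyr_peer_sync_horizon_incident_count Retained incidents per horizon.
# TYPE zephyr_peer_sync_horizon_incident_count gauge
zephyr_peer_sync_horizon_incident_count{window="5m"} 1
zephyr_peer_sync_horizon_incident_count{window="1h"} 4
zephyr_node_ready 1
"""

by_window = {
    labels["window"]: value
    for labels, value in parse_samples(
        sample_metrics, "zephyr_peer_sync_horizon_incident_count"
    )
}
print(by_window)
```

For anything beyond ad hoc inspection, a maintained Prometheus client library is likely a better fit than a hand-rolled regular expression.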
When ZEPHYR_ENABLE_STRUCTURED_LOGS=true, the node emits newline-delimited JSON event logs alongside the existing text startup log.
Current behavior:
- every entry includes timestamp, level, component, event, nodeId, and optional validatorAddress
- consensus diagnostic entries use component=consensus and event=diagnostic, then add kind, code, message, height, round, blockHash, validator, source, and observedAt
- peer incident entries use component=peer_sync and event=incident, then add peerUrl, state, reason, localHeight, peerHeight, heightDelta, blockHash, errorCode, errorMessage, firstObservedAt, lastObservedAt, and occurrences
- snapshot restore entries use component=recovery and event=snapshot_restore, then add peer, height, blockHash, and restoredAt
- the current structured-log surface is intentionally narrow: it focuses on consensus rejection, peer incident, and snapshot-repair paths so operators can correlate the same events exposed by diagnostics, peerSyncHistory, GET /v1/metrics, GET /metrics, GET /v1/alerts, GET /v1/slo, and the higher-level readiness summaries from GET /v1/health
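Since the JSON entries are interleaved with the plain-text startup log, a consumer needs to skip non-JSON lines before filtering. A minimal sketch of that filter follows, selecting entries by the component and event fields described above. The sample lines are hand-written, and values such as the level, state, and peer URL are illustrative rather than taken from real output.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str], component: str, event: str) -> Iterator[dict]:
    """Yield structured-log entries matching the given component and event."""
    for raw in lines:
        raw = raw.strip()
        if not raw.startswith("{"):
            continue  # plain-text startup log or blank line
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # truncated or non-JSON line
        if entry.get("component") == component and entry.get("event") == event:
            yield entry


# Illustrative log stream; field values are made up for the example.
sample_lines = [
    "node starting on :8080",
    '{"timestamp": "2024-01-01T00:00:00Z", "level": "warn", '
    '"component": "peer_sync", "event": "incident", "nodeId": "node-local", '
    '"peerUrl": "http://peer-a:8080", "state": "unreachable", "occurrences": 3}',
    '{"timestamp": "2024-01-01T00:00:05Z", "level": "info", '
    '"component": "recovery", "event": "snapshot_restore", "nodeId": "node-local", '
    '"peer": "http://peer-a:8080", "height": 42}',
]

incidents = list(iter_events(sample_lines, "peer_sync", "incident"))
for entry in incidents:
    print(entry["peerUrl"], entry["state"], entry["occurrences"])
```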
Returns the latest known view of configured peers.
Current behavior:
- each peer view includes the remote validatorAddress when advertised
- expectedValidator, admitted, and admissionError show the local admission policy and whether the peer passed it
- identityPresent, identityVerified, and identityError show whether the peer exposed a signed transport identity and whether local verification succeeded
- heightDelta and syncState show whether the peer is aligned, ahead, behind, divergent, unadmitted, unreachable, blocked on import, or was recently repaired through snapshot restore
- lastSyncAttemptAt and lastSyncSuccessAt show the last peer-sync attempt and completion times for that peer
- lastImportErrorCode, lastImportErrorMessage, lastImportFailureAt, lastImportFailureHeight, and lastImportFailureBlockHash show the most recent import-side failure observed while syncing from that peer
- lastSnapshotRestoreAt, lastSnapshotRestoreHeight, lastSnapshotRestoreBlockHash, and lastSnapshotRestoreReason show the latest snapshot-based repair event for that peer, with reasons currently drawn from fetch_fallback, import_repair, and peer_diverged
- lastReplicationErrorCode, lastReplicationErrorMessage, lastReplicationFailureAt, lastReplicationFailureHeight, lastReplicationFailureBlockHash, and lastReplicationFailureReason show the latest outgoing proposal, vote, or block dissemination failure retained for that peer
- when retained incident history exists, the latest import, snapshot-repair, and replication-failure metadata is backfilled into the peer view after restart even before live peer polling refreshes that peer
- incidentCount, incidentOccurrences, and latestIncidentAt expose the derived per-peer counters from the durable incident history
- recentIncidents exposes the durable per-peer incident history the node kept on disk, including state, reason, local and peer heights, block hash, error details, first and last observation time, and merged occurrence count
- when strict peer admission or peer binding is enabled, background sync and outgoing replication use only admitted peers
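The peer view fields above lend themselves to a compact operator summary. The sketch below takes an already-decoded list of peer views, counts peers per syncState, and collects the admissionError of every peer that is not admitted. It assumes the peer views arrive as a list of objects and that each one carries its address under a url key; that key name and the literal syncState strings in the sample are hypothetical, while admitted, admissionError, and syncState are the documented field names.

```python
from collections import Counter
from typing import Any


def summarize_peers(
    peers: list[dict[str, Any]], url_key: str = "url"
) -> tuple[dict[str, int], dict[str, str]]:
    """Return (count per syncState, admissionError per unadmitted peer).

    url_key is an assumed field name for the peer address.
    """
    states = Counter(peer.get("syncState", "unknown") for peer in peers)
    rejected = {
        peer.get(url_key, "?"): peer.get("admissionError", "")
        for peer in peers
        if not peer.get("admitted", False)
    }
    return dict(states), rejected


# Illustrative peer views; state strings and error text are made up.
sample_peers = [
    {"url": "http://peer-a:8080", "syncState": "aligned", "admitted": True},
    {"url": "http://peer-b:8080", "syncState": "aligned", "admitted": True},
    {"url": "http://peer-c:8080", "syncState": "unadmitted", "admitted": False,
     "admissionError": "validator binding mismatch"},
]

state_counts, rejected_peers = summarize_peers(sample_peers)
print(state_counts)
print(rejected_peers)
```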
Calculates a validator set from the provided candidates, votes, and config, persists it durably in the ledger, increments the validator-set version, resets pending proposal, vote, and certificate artifacts, and resets the active round to height nextHeight, round 0.
Returns the latest durable validator snapshot produced by POST /v1/election.
Returns the current persisted account view for the requested address.
Returns the latest committed local block.
Returns a committed block by exact height.
Credits a local account for development and testing.
Accepts a signed transaction envelope and queues it in the node's persisted mempool after validation.
Builds and returns the deterministic next block candidate from the current mempool and chain tip.
Current behavior:
- the response includes the exact blockHash, previousHash, producedAt, full transactions, and ordered transactionIds validators should certify
- operators can use that data directly when constructing a signed self-contained proposal
- the response also includes the current consensus summary, roundEvidence, roundHistory, blockReadiness, recovery, diagnostics, peerSyncHistory, peerSyncSummary, and latest durable artifacts for operator context
Forces immediate block production from the current local mempool or a stored certified proposal.
Behavior:
- with no JSON body, the node uses the current time as the block timestamp for ungated local production
- you may send { "producedAt": "<RFC3339 timestamp>" } to target a specific previously fetched block template or a specific stored certified proposal
- if proposer-schedule enforcement is enabled, the endpoint returns 409 when the local validator is not the scheduled proposer for the active round
- if certificate enforcement is enabled, the endpoint returns 409 unless the resulting block exactly matches a stored proposal template and quorum certificate
- when certificate enforcement is enabled and a matching certified proposal exists, the node can commit from the stored proposal body even if the local mempool no longer contains those transactions
- when automation is enabled, the scheduled proposer may reach the same commit path without an operator POST as soon as quorum exists for its current round proposal
- rejected local commit attempts are appended to the diagnostic history exposed by status and consensus surfaces
- wrong producedAt for an otherwise known certified proposal now reports template_mismatch instead of proposal_required
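Because a mismatched producedAt is reported as template_mismatch, a client targeting a previously fetched template is probably safest echoing the template's producedAt string verbatim instead of reformatting it. The sketch below builds the request body that way and maps the documented 409 response to a short description. The helper names are hypothetical, the endpoint path is intentionally left out because this section does not state it, and treating any 2xx status as success is an assumption.

```python
import json
from typing import Any, Optional


def production_body(template: Optional[dict[str, Any]] = None) -> bytes:
    """Build the POST body for forced block production.

    With no template the body is empty, so the node uses the current time.
    Otherwise the template's producedAt is copied unchanged to avoid
    precision or timezone drift that could cause a mismatch.
    """
    if template is None:
        return b""
    return json.dumps({"producedAt": template["producedAt"]}).encode("utf-8")


def describe_result(http_status: int) -> str:
    """Summarize the response status; only 409 is documented in this section."""
    if http_status == 409:
        return "rejected: proposer schedule or certificate enforcement"
    if 200 <= http_status < 300:
        return "accepted"  # assumption: success is signaled with a 2xx status
    return f"unexpected status {http_status}"


# Illustrative template fragment; the hash and timestamp are made up.
template = {"blockHash": "abc123", "producedAt": "2024-01-01T00:00:00Z"}
print(production_body(template).decode("utf-8"))
print(describe_result(409))
```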
These endpoints are used by the current devnet sync layer. They exist for node replication, not wallet clients.
When ZEPHYR_VALIDATOR_PRIVATE_KEY is configured, replicated POST requests carry these signed source headers:
- X-Zephyr-Source-Node
- X-Zephyr-Source-Validator
- X-Zephyr-Source-Identity-Payload
- X-Zephyr-Source-Public-Key
- X-Zephyr-Source-Signature
- X-Zephyr-Source-Signed-At
Current behavior:
- if signed transport-identity headers are present, they must be complete and valid or the request is rejected with 400
- when ZEPHYR_REQUIRE_PEER_IDENTITY=true, replicated peer POST requests must include a valid signed transport identity or they are rejected with 403
- when ZEPHYR_PEER_VALIDATORS is configured, replicated peer POST requests are also rejected with 403 unless the proven validator belongs to the configured peer-binding allowlist
- proposal, vote, and block dissemination for the current automation flow use these same admitted peer paths
- the automation path now sends proposals before votes to avoid vote-before-proposal races on the happy path
- the automation loop also rebroadcasts the latest stored proposal and latest stored local vote for the pending height until a matching certificate exists, which helps delayed peers recover on the current HTTP devnet
- GET /v1/peers now shows whether a given peer most recently aligned normally, fell back to snapshot restore, triggered an import-side repair path during sync, or retained a recent outgoing replication failure, and durable incidents backfill the latest import, snapshot, and replication telemetry after restart
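The 400 and 403 rules above can be read as a small decision procedure. The sketch below is one possible interpretation, not the node's implementation: signature verification is reduced to a boolean supplied by the caller, the order of the checks is an assumption, and the function names are hypothetical. It also includes a parser for the ZEPHYR_PEER_VALIDATORS format, splitting each binding at its first = sign.

```python
from typing import Mapping, Optional

IDENTITY_HEADERS = (
    "X-Zephyr-Source-Node",
    "X-Zephyr-Source-Validator",
    "X-Zephyr-Source-Identity-Payload",
    "X-Zephyr-Source-Public-Key",
    "X-Zephyr-Source-Signature",
    "X-Zephyr-Source-Signed-At",
)


def parse_peer_validators(raw: str) -> dict[str, str]:
    """Parse comma-separated <peer-url>=<validator-address> bindings."""
    bindings = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        url, sep, validator = item.partition("=")
        if not sep or not url or not validator:
            raise ValueError(f"malformed binding: {item!r}")
        bindings[url] = validator
    return bindings


def rejection_status(
    headers: Mapping[str, str],
    signature_valid: bool,
    proven_validator: Optional[str],
    require_identity: bool,
    allowed_validators: frozenset = frozenset(),
) -> Optional[int]:
    """Return 400 or 403 when the request should be rejected, else None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    present = [h for h in IDENTITY_HEADERS if lowered.get(h.lower())]
    if present:
        # Partial or invalid identity headers are a client error.
        if len(present) != len(IDENTITY_HEADERS) or not signature_valid:
            return 400
    elif require_identity or allowed_validators:
        return 403  # identity required (or binding configured) but absent
    if allowed_validators and proven_validator not in allowed_validators:
        return 403  # proven validator is outside the binding allowlist
    return None


bindings = parse_peer_validators("http://a:8080=val1, http://b:8080=val2")
allowlist = frozenset(bindings.values())
full = {name: "x" for name in IDENTITY_HEADERS}  # placeholder header values
print(rejection_status(full, True, "val1", True, allowlist))
```

Returning None for an admitted request keeps the sketch neutral about the success status, which this section does not specify.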
Imports a committed block from another node.
If certificate enforcement is enabled on the receiving node, the imported block must match a stored proposal template and quorum certificate or the import is rejected.
Rejected imports are appended to the diagnostic history exposed by status and consensus surfaces.
If proposals exist for that height but the imported block does not match any stored proposal template, the rejection now reports template_mismatch.
Returns the current durable node snapshot used for catch-up restore.
When another node applies this snapshot through peer sync, it preserves its own local recovery, diagnostic, peer-sync incident history, and derived peer-sync summary context instead of replacing that operator context with the peer's local WAL or diagnostics.