dabeacon-indexer

An official DAppNode project.

Ethereum validator indexer for the consensus (beacon) layer. Tracks a set of validators, writes per-epoch attestation / proposal / sync-committee outcomes rewards to Postgres, and exposes a web UI + JSON API + live SSE stream.

The design balances simplicity with completeness: the beacon node is treated as the source of truth via its /rewards/* and /duties/* endpoints, and beacon-spec logic is reimplemented locally only where it clearly pays off.

Designed to share a beacon node that's also doing validation duties: historical backfill can be pointed at a separate archive node so the live (non-archive) node keeps serving duties uninterrupted.

Features

Per-epoch attestation, proposal, sync-committee tracking — inclusion slot, vote correctness (head / target / source), rewards.
Live head tracking via beacon SSE (head, finalized_checkpoint, chain_reorg). Reorgs delete non-finalized rows and re-scan.
Concurrent live + backfill in the same process. Each owns a disjoint epoch range (live owns (f₀, ∞), backfill owns […=f₀]) so no row is scanned twice. Live has foreground priority.
Optional split beacon clients — point backfill_beacon_url at an archive node while the live client stays on your validator's attached (non-archive) node. On startup the indexer probes the backfill node and warns/errors if it can't serve the earliest epoch you need.
Chain-spec driven constants — SLOTS_PER_EPOCH, SECONDS_PER_SLOT, SYNC_COMMITTEE_SIZE, MAX_COMMITTEES_PER_SLOT, ALTAIR_FORK_EPOCH are fetched from /eth/v1/config/spec at startup. Works unchanged on mainnet, holesky, hoodi, or any custom network.
Cross-fork block deserialization — one typed struct per fork (phase0 → fulu). Electra attestation encoding (EIP-7549 committee_bits) is decoded correctly.
Multiple indexer instances, one DB — per-validator watermarks use GREATEST; finalized rows are immutable; reorg deletes target only non-finalized rows. A non-archive head-tracker and an archive backfiller can target the same validator set on the same DB.
Strict data invariants — malformed SSZ, size mismatches, missing committee entries, inclusion-slot < attestation-slot, etc. all surface as Error::InconsistentBeaconData and abort the epoch rather than writing partial data.
In-memory caches on the beacon client: head slot (2 s TTL), head finality checkpoints (10 s TTL), per-epoch committees + proposer/attester/sync duties. Invalidated on reorg.

Architecture at a glance

                   ┌───────────────┐
                   │ beacon node   │  ── /eth/v1/events (SSE)
                   │ (live client) │◄─ /eth/v2/beacon/blocks, /duties, …
                   └──────┬────────┘
                          │
                ┌─────────┴─────────┐
                │  beacon_client/   │  HTTP + retries + caches
                └─────────┬─────────┘
                          │
      ┌───────────────────┼───────────────────┐
      ▼                   ▼                   ▼
┌──────────┐      ┌────────────────┐   ┌────────────┐
│ live/    │      │ scanner/       │   │ backfill.rs│
│ head     │      │ (epoch scan)   │   │ (historical│
│ finality │      │ attestations   │   │  catch-up) │
│ reorg    │      │ proposals      │   │            │
└────┬─────┘      │ sync           │   └─────┬──────┘
     │            └────────┬───────┘         │
     │                     │                 │
     │                     ▼                 │
     │              ┌──────────────┐         │
     └─────────────►│ db/scanner/  │◄────────┘
                    │ (writes)     │
                    └──────┬───────┘
                           │
                           │ Postgres
                           ▼
                    ┌──────────────┐
                    │ db/api/      │
                    │ (reads)      │
                    └──────┬───────┘
                           │
                           ▼
                    ┌──────────────┐
                    │ web/         │   REST + /api/live/sse
                    │ (axum)       │
                    └──────────────┘

Module map:

beacon_client/ — Beacon API HTTP client, per-endpoint wrappers, wire types.
scanner/ — per-epoch scan pipeline (blocks → duties → rewards → DB writes).
live/ — SSE consumer for live head, finalization, chain_reorg.
backfill.rs — historical catch-up, archival-capability probe.
db/scanner/ — write-side queries (upsert, finalize, delete).
db/api/ — read-side queries behind web endpoints.
web/ — Axum HTTP server, REST API, live SSE.
chain.rs — chain-spec accessors (slots_per_epoch(), altair_epoch(), …).
config.rs — CLI + TOML + env merge.
error.rs — single Error enum shared across the crate.

Requirements

PostgreSQL 14+
Rust (edition 2024; see Cargo.toml)
A beacon node with /eth/v1/events, /eth/v2/beacon/blocks/{id}, /eth/v1/validator/duties/*, /eth/v1/beacon/rewards/*, /eth/v1/config/spec. Lighthouse, Prysm, Nimbus, Teku, Lodestar all work.

Optional but recommended when running in combined live + backfill mode:

A second, archive beacon node (e.g. lighthouse beacon ... --reconstruct-historic-states) pointed at by backfill_beacon_url. Keeps the live node unburdened by deep-history queries.

Quick start

# 1. Start Postgres
docker compose up -d db

# 2. Configure
cp config.example.toml config.toml
$EDITOR config.toml         # set validators, beacon_url, (optional) backfill_beacon_url

# 3. Set DB URL
export DATABASE_URL=postgres://dabeacon:dabeacon@localhost:5432/dabeacon

# 4. Run
cargo run --release

Open http://localhost:3000.

Configuration

All settings may be passed via CLI flag, env var, or TOML file. Precedence: CLI > env > TOML > built-in default. See config.example.toml.

Beacon nodes

CLI / env	TOML	Default	Purpose
`--beacon-url` / `BEACON_URL`	`beacon_url`	`http://localhost:5052`	Live client (head tracking, finality rescan, SSE). Non-archive nodes are supported but experimental, and the beacon node has to be configured accordingly.
`--backfill-beacon-url` / `BACKFILL_BEACON_URL`	`backfill_beacon_url`	(shares live)	Optional separate client for historical backfill. Must be archive-capable if set.

Database

CLI / env	TOML	Required
`--database-url` / `DATABASE_URL`	`database_url`	Yes

Migrations in migrations/ apply automatically at startup.

Validators to track

Either via CLI:

dabeacon-indexer --validators 123,456,789

or via TOML (with optional tags for UI grouping):

[[validators]]
index = 123
tags = ["pool-a", "node-1"]

[[validators]]
index = 456
tags = ["pool-a", "node-2"]

Mode flags

Flag	Default	Behaviour
`--mode` / `RUN_MODE`	`both`	Which workloads to run: `live` (head tracking + finality rescans + web server only — no historical backfill), `backfill` (one-shot historical catch-up, no live, no web), or `both`. See Running modes.
`--max-backfill-depth` / `MAX_BACKFILL_DEPTH`	(unlimited)	Clamp the earliest epoch backfill will start from. Protects against accidentally re-scanning from genesis for a newly-added validator.
`--non-contiguous-backfill` / `NON_CONTIGUOUS_BACKFILL`	`false`	Walk every epoch in the backfill range and scan only those (validator, epoch) pairs that don't already have a finalized row. Use after widening validator set or reducing `max_backfill_depth`.
`--scan-mode` / `SCAN_MODE`	`auto`	Attestation scan strategy. `dense` fetches every block in the epoch and derives correctness from attestations vs the canonical chain — amortises well for 30+ validators. `sparse` derives correctness from rewards and scans forward block-by-block only for duties the rewards show were included — ~4–5 API calls per epoch vs ~100 for dense. `auto` resolves to `sparse` when 5 or fewer validators are tracked. See scan mode semantics.

Web server

Flag	Default
`--web-port` / `WEB_PORT`	`3000`	HTTP port. Bound to `0.0.0.0`.
`--api-key` / `API_KEY`	(empty)	If set, required as `?api_key=…` on `/api/live/sse`. Read-only REST endpoints are always open.

Running modes

Selected via --mode {live|backfill|both} (env RUN_MODE, default both). Every mode is independently selectable; live and backfill are not shorthands for each other.

`both` — default: live + backfill concurrently

cargo run --release

On startup the indexer:

Connects to the beacon node, fetches the chain spec, seeds validator metadata.
Reads the current finality checkpoint f₀.
Spawns a background task that backfills epochs […=f₀] (using the backfill client — archive node if configured).
Runs live head tracking in the foreground, owning epochs (f₀, ∞).
The web server runs from startup with both DB reads and the SSE stream.
A periodic reconcile task re-fetches active validators' state every epoch so exits land in the DB without a process restart.

Live and backfill never touch the same epoch. finalized=true rows are immutable; live's finalized=false rows get promoted on the next finalized_checkpoint event.

`backfill` only

cargo run --release -- --mode backfill

Runs the historical backfill until everything up to current finality is covered, then exits. Use this for the first run against an archive node to seed the database.

No web server, no live tracking. Re-extends finality if the chain advances mid-pass. The validator-state reconcile loop is skipped — backfill processes finalized history where state at any past epoch is fully determined; only the startup seed is needed.

`live` only

cargo run --release -- --mode live

Head tracking + finality rescans + web server, no historical backfill. Two situations where this is the right pick:

Multi-instance: historical data is owned by a separate dedicated backfill instance writing to the same Postgres (typical: one head-tracker per validator's local non-archive node, one shared archive backfiller). The indexer assumes the DB is already (or being) populated by another writer for everything up to f₀ and just owns (f₀, ∞).
No archive client available: you only have a non-archive beacon node and don't want to run an archive node. Live mode keeps working — but the DB will only contain data from the moment this instance first started forward, so any view spanning epochs older than that startup will show gaps. Acceptable for "I just want to track from now on", not for historical analysis.

Split archive / non-archive nodes

Your validator's beacon node stays on live tracking. An archive node (or any beacon node configured with full historical state) handles the backfill.

# config.toml
beacon_url = "http://10.0.0.10:5052"           # non-archive, attached to validator
backfill_beacon_url = "http://10.0.0.20:5052"  # archive

The startup flow probes the backfill client at the earliest epoch it will try to scan. If the probe fails and backfill_beacon_url is set, the indexer exits with a descriptive error (backfill_beacon_url must be an archive node). If no dedicated URL is set and the shared client can't serve the range, you get a warning that tells you exactly which flag to set.

Resuming after downtime

Restarting any time just works; the indexer resumes from each validator's last_scanned_epoch watermark. If the gap exceeds the live node's retention (~1 day on a default Lighthouse) the startup probe warns you to set backfill_beacon_url + --non-contiguous-backfill to refill the gap.

Attestation scan modes

--scan-mode controls the attestation stage of finalized scans (backfill + finalization rescan). Live scans are unaffected.

Dense (default for >5 validators)

Fetches every block in the epoch and a one-epoch "late window" for inclusion discovery (~64 blocks), builds a canonical block-root map, and computes correctness by comparing each attestation's votes against the chain. Amortises well when most slots have at least one tracked duty.

Sparse (default for ≤5 validators)

Fetches /eth/v1/validator/duties/attester/{epoch} + /eth/v1/beacon/rewards/attestations/{epoch} plus (cached) committees. For each tracked duty whose rewards show inclusion, scans forward block-by-block from the duty's slot until the including block is found. For duties the rewards show as missed, skips the block scan entirely.

Typical network cost drops from ~100 calls/epoch to ~4–5, making long backfills on tiny validator sets practical.

Semantic difference to be aware of

Dense mode's *_correct columns mean "the vote was right". Sparse mode's mean "the validator earned the reward for that component", which requires correct vote AND timely inclusion (next slot for head, within ~5 for source, within 32 for target). A correct head vote included one slot late reads head_correct = false in sparse but true in dense. Operators typically care about the reward-qualifying definition; sparse mode surfaces that directly.

All reward columns, the included flag, inclusion_slot, and inclusion_delay match between modes (modulo rare beacon-node quirks where rewards show inclusion but the block scan can't locate it — then inclusion_slot is NULL and a warning is logged).

Multi-instance (sharing a Postgres)

Safe scenarios:

Two indexers tracking disjoint validator sets (e.g. one handles indices 1–500, another handles 501–1000). Per-validator watermarks + row upserts guarantee independence.
One head-tracker + one dedicated archive backfiller on the same set. The backfiller writes finalized=true rows; the head-tracker writes finalized=false rows that get promoted on finality. Reorg deletes only non-finalized rows, so the backfiller's output is immutable against the head-tracker's reorg path.

Unsafe:

Two head-trackers on overlapping validator sets — they'll churn on every reorg (each deletes the other's non-finalized rows).

The finalized=true-means-immutable invariant is load-bearing. It's documented on every write function in src/db/scanner/* and on scanner::scan_epoch. Don't violate it.

API surface

REST

All read-only, never requires auth (regardless of api_key).

Method + path	Returns
`GET /api/validators`	Per-validator summary with rates
`GET /api/stats`	Global aggregate counts + rates
`GET /api/epochs`	Per-epoch summaries (paginated, filterable)
`GET /api/attestations`	Per-(validator, epoch) attestation duties (paginated, filterable)
`GET /api/proposals`	Per-slot block proposals (paginated, filterable)
`GET /api/sync_duties`	Per-(validator, slot) sync-committee duties (paginated, filterable)
`GET /api/rewards`	1-day / 7-day / 30-day / all-time reward windows per validator + totals
`GET /api/meta`	Tracked validators list + tag map

Common pagination params: page, per_page, order=asc|desc. Filters vary by endpoint (see src/web/api/*.rs).

SSE

Path	Purpose
`GET /api/live/sse?api_key=…`	Live per-slot view for the current + previous epoch: tracked attesters, proposer + proposal outcome, sync committee participation. Refreshes on every broadcast event (live slot scanned, finalization, backfill epoch processed) and at least every 6 s.

Static web UI

Served from web/ at /. A minimal SvelteKit frontend consuming the REST endpoints above.

Database

Schema in migrations/. Key tables:

Table	Key	Contents
`validators`	`validator_index`	pubkey, activation/exit epochs, `last_scanned_epoch`
`attestation_duties`	`(validator_index, epoch)`	inclusion slot/delay, correctness, rewards
`sync_duties`	`(validator_index, slot)`	participated, reward, missed_block
`block_proposals`	`slot`	proposer_index, proposed, rewards
`instances`	`instance_id` (UUID)	`heartbeat` (for observability only)

finalized bool column on the three duty tables; upserts gate on WHERE finalized = FALSE, so finalized rows are immutable.

Development

cargo build
cargo test        # 64 unit tests, offline (uses captured block fixtures)
cargo clippy --all-targets

Fixtures live under testdata/blocks/ — one captured block per fork (phase0 → fulu) from a live Lighthouse node. cargo test doesn't hit the network.

Integration tests

scripts/run-integration-tests.sh spins up an ephemeral Postgres via docker-compose.test.yml, picks a random recent finalized epoch + a random proposer subset, and runs the dense-vs-sparse equivalence check and the performance bench against a real beacon node. Copy .env.test.example to .env.test and set BEACON_URL first; pin TEST_EPOCH / TEST_VALIDATORS there to reproduce a specific run. Both tests are #[ignore]d in cargo test since they need beacon + DB access.

Key invariants (before touching the scanner / DB)

Backfill must always pass finalized=true to scan_epoch. This makes its rows immune to reorg deletes and upsert overwrites.
Live must pass finalized=false and rely on db::scanner::finalization::finalize_up_to_epoch to promote them.
Every write upsert has WHERE finalized = FALSE — preserve that guard.
Reorg deletes only match finalized = FALSE. Same reasoning.
Malformed data is always fatal to the epoch, never silently tolerated. See Error::InconsistentBeaconData.

These are documented as doc-comments on the relevant functions in src/db/scanner/* and src/scanner/mod.rs.

Errors

One error type, crate::error::Error. Variants:

Http(reqwest::Error) — transport-layer failure.
BeaconApi { status, message } — non-2xx from beacon node. Pattern-matched on status: 404 where missing-slot vs error needs distinguishing.
Json(serde_json::Error) — SSE payload parse failure.
Database(sqlx::Error) / Migration(sqlx::migrate::MigrateError) — DB.
InvalidBlockId(String) — locally-constructed id validation.
InconsistentBeaconData(String) — spec-level data violation from the node.

CLI-level startup (config parsing) uses anyhow; beyond that everything is the typed enum.

License

GNU General Public License v3.0 or later. See LICENSE for the full text or https://www.gnu.org/licenses/gpl-3.0.html.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github/workflows		.github/workflows
migrations		migrations
scripts		scripts
src		src
testdata/blocks		testdata/blocks
web		web
.dockerignore		.dockerignore
.env.test.example		.env.test.example
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
config.example.toml		config.example.toml
docker-compose.test.yml		docker-compose.test.yml
docker-compose.yml		docker-compose.yml
rust-toolchain.toml		rust-toolchain.toml

Folders and files

Latest commit

History

Repository files navigation

dabeacon-indexer

Features

Architecture at a glance

Requirements

Quick start

Configuration

Beacon nodes

Database

Validators to track

Mode flags

Web server

Running modes

both — default: live + backfill concurrently

backfill only

live only

Split archive / non-archive nodes

Resuming after downtime

Attestation scan modes

Dense (default for >5 validators)

Sparse (default for ≤5 validators)

Semantic difference to be aware of

Multi-instance (sharing a Postgres)

API surface

REST

SSE

Static web UI

Database

Development

Integration tests

Key invariants (before touching the scanner / DB)

Errors

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`both` — default: live + backfill concurrently

`backfill` only

`live` only

Packages