A fast, resilient Go service for pre-processing and publishing software releases into CVMFS without holding the repository transaction lock during file processing.
The standard CVMFS publishing workflow acquires an exclusive lock on the Stratum 0 repository for the full duration of tar extraction, compression, hashing, and CAS upload — serialising work that is intrinsically parallel. After the catalog is committed, Stratum 1 replicas must fetch all new objects from scratch, causing widespread cache misses on worker nodes.
cvmfs-prepub accepts a packaged tar file and:
- Unpacks and processes files in parallel — compress, SHA-256 hash, deduplicate against the existing CAS — without touching the overlay filesystem or holding any lock.
- Uploads objects to the CAS backend (S3 or local filesystem) before acquiring a gateway lease.
- Pre-warms Stratum 1 replicas with the new objects before the catalog is committed (Option B), so replication becomes catalog-only after the flip.
- Merges catalogs directly by fetching the current CVMFS catalog from Stratum 0, applying the new entries to the correct sub-catalog SQLite database, finalising (compress + SHA-256), and committing via the `cvmfs_gateway` lease API. No overlay filesystem is required; catalog merging is done entirely in Go using the CVMFS schema 2.5 SQLite format.
- Supports private share directories — files can be published under a hidden, randomly-named path (analogous to a Google Docs share link) that is invisible in `readdir()` but accessible by direct path, using the CVMFS `kFlagHidden` catalog flag.
- Recovers automatically from crashes at any stage — every state transition is an atomic filesystem rename backed by a WAL journal.
The existing `cvmfs_server publish` workflow continues to work in parallel; the gateway lease enforces mutual exclusion at the path level.
| Capability | Option A | Option B (HTTP) | Option B (MQTT) |
|---|---|---|---|
| Pre-processes tar | ✓ | ✓ | ✓ |
| Bypasses overlay FS | ✓ | ✓ | ✓ |
| Pre-warms Stratum 1 | ✗ | ✓ | ✓ |
| Inbound firewall rules at S1 | None | TCP 9100 (data push) | TCP 9100 (data push); outbound TCP 8883 to broker (control) |
| New infrastructure | None | Receiver agent on each S1 | Receiver agent + MQTT broker |
| Phase | 1 | 2 | 2 |
The MQTT variant of Option B shifts the control plane (announce/ready exchange) onto a shared broker — receivers connect outbound to the broker (TCP 8883) so Stratum 1 sites need not be reachable from Stratum 0 for signalling. The data plane is identical in both variants: after the ready exchange the publisher connects directly to each receiver's HTTP endpoint (TCP 9100 inbound on each S1) to push CAS objects. MQTT therefore helps when S1 sites cannot accept arbitrary inbound connections from S0, but each receiver must still accept the data push from S0 on port 9100.
See REFERENCE.md §5, §6, and §20.11 for full topology diagrams, trade-off analysis, and MQTT topic schema.
| Document | Contents |
|---|---|
| REFERENCE.md | Full architecture, subsystem design, Go package layout, configuration reference, security considerations, deployment roadmap, comparison with the traditional workflow, and provenance/transparency log |
| INSTALL.md | Build instructions, configuration, systemd setup, and smoke-test procedure |
Key sections in REFERENCE.md:
- §4 System Overview — topology diagram and gateway API summary
- §7.1 Job State Machine — spool directory layout and crash-recovery model
- §7.2 Processing Pipeline — fan-out channel graph and single-pass compress+hash
- §8 Lifecycle Cleanup — per-repository TTL-based GC with proxy-agnostic access tracking
- §10 Configuration Reference — annotated YAML config
- §12 Deployment Roadmap — phased rollout with exit criteria per phase
- §16 Security, Confidentiality, Integrity, and Traceability — audit trail, supply-chain fields, tamper-evident publishing
- §16.7 Stratum 1 Distribution Security — SSRF guard, MQTT mTLS, topic ACLs, input bounds
- §17 Comparison with Traditional `cvmfs_server publish` — head-to-head table, where the fundamental difference lies, and when to use each approach
- §18 Provenance and Transparency Log — four-layer attribution chain, Rekor integration, CI OIDC token validation, and offline verification workflow
- §20.11 MQTT Control Plane — topic schema, flow, presence/LWT, security controls, configuration flags
cvmfs-bits/
├── cmd/
│ ├── prepub/ # Main service binary
│ └── prepubctl/ # Admin CLI (drain, abort, status)
├── internal/
│ ├── api/ # REST server and Orchestrator
│ ├── job/ # Job struct and FSM
│ ├── spool/ # Atomic spool directory manager + WAL journal
│ ├── cas/ # CAS backend (local FS; S3 in roadmap)
│ ├── lease/ # cvmfs_gateway lease client with heartbeat
│ ├── pipeline/ # Processing stages (unpack, compress, dedup, upload, catalog)
│ ├── distribute/ # Stratum 1 pre-warmer with quorum gating
│ ├── gc/ # Lifecycle GC scheduler (Phase 3)
│ └── access/ # Pluggable access-event tracking (Phase 3)
├── pkg/
│ ├── observe/ # OTel tracing + Prometheus metrics + slog logger
│ ├── cvmfshash/ # CVMFS content hash format utilities
│ └── cvmfscatalog/ # CVMFS catalog: schema 2.5 SQLite, MD5 path encoding, merge, secret dirs
├── testutil/
│ ├── fakegateway/ # In-process cvmfs_gateway with chaos controls
│ ├── fakecas/ # In-memory CAS with latency/failure injection
│ ├── fakestratum1/ # In-process Stratum 1 receiver with partition simulation
│ └── simulate/ # Cluster simulator wiring all fakes; integration tests
├── docs/ # Architecture SVG diagrams
├── REFERENCE.md # Full design and implementation reference
├── INSTALL.md # Build, configuration, and deployment guide
├── go.mod
└── Makefile
# Build
make build
# Run the cluster integration test (simulates a full publish in-process)
make run-sim
# Start the service (Option A, local CAS)
./cvmfs-prepub \
--gateway-url http://localhost:4929 \
--cas-type localfs \
--cas-root /srv/cvmfs/cas \
--spool-root /var/spool/cvmfs-prepub \
  --listen :8080

See INSTALL.md for full deployment instructions.
Every significant operation emits an OpenTelemetry span. Prometheus metrics are
exposed at `/api/v1/metrics`. Structured JSON logs use `log/slog` throughout.
The `testutil/simulate` package runs the full publish pipeline in-process with
fake infrastructure components, each emitting their own spans, making distributed
traces observable in a single `go test` run without any external services.
- Go 1.22+
- `cvmfs_gateway` ≥ 1.2 (for the lease-and-payload API)
- Write access to the CAS backend (local filesystem or S3-compatible)
- HTTP read access to the Stratum 0 CAS (required for manifest fetch and catalog download during merge)
- TCP 9100 inbound on each Stratum 1 for the CAS object data push (Option B only — both HTTP and MQTT variants)
- TCP 8883 outbound from each Stratum 1 to the MQTT broker, plus TCP 8883 inbound on the broker host (Option B MQTT variant only)
- No `cvmfs` client tools required on the pre-publisher node — catalog merging is done natively in Go