
docs(operations): env-vars runbook + gitignore .env.production.snapshot* #34

Merged
pulkitpareek18 merged 1 commit into main from dev on May 15, 2026


@pulkitpareek18
Collaborator

Documents the manual VPS env-injection procedure (as it stands today after the SMTP rollout) plus the GitHub-Actions-managed PROD_ENV_FILE path that retires it on the next change.

Why now

Today's SMTP env injection for issue #27 exposed two real procedural gaps:

  1. `docker compose restart` does NOT reload `env_file`. It restarts the existing container with the environment it was created with. To pick up edits to the env_file, you MUST run `docker compose up -d --force-recreate`. We hit this live during the SMTP rollout: the env was injected and the restart ran clean, but `sendMail()` kept logging "SMTP not configured" until we forced a recreate. The gotcha is now documented at the top of this runbook so it doesn't bite the next operator.
  2. Local `.env.production.snapshot` files weren't gitignored. They contain real production credentials. Caught before any commit, but the .gitignore now explicitly covers both the latest (`.env.production.snapshot`) and the per-change archival copies (`.env.production.snapshot.*`).
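The restart-vs-recreate distinction in point 1 can be sketched as two small shell helpers. This is a minimal illustration, not the runbook's own script: the service name `api`, the `DOCKER` override variable, and the helper names `reload_env` and `verify_env` are assumptions made for this example.

```shell
# Hypothetical defaults; adjust to the real compose service name.
SERVICE="${SERVICE:-api}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

reload_env() {
  # Recreate the container so env_file is read again.
  # `docker compose restart` would keep the environment the container was created with.
  "$DOCKER" compose up -d --force-recreate "$SERVICE"
}

verify_env() {
  # Print one variable as seen from inside the running container.
  # -T disables pseudo-TTY allocation so this also works in scripts.
  "$DOCKER" compose exec -T "$SERVICE" printenv "$1"
}
```

After editing the env file, an operator would call `reload_env` and then something like `verify_env SMTP_HOST` to confirm the container actually sees the new value.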

What's in the runbook

`docs/operations/env-vars.md`:

  • 5-location map of where env vars live (VPS .env, local .env, two snapshot files, .env.example) and which is authoritative
  • The restart-vs-recreate gotcha at the top, called out clearly
  • Step-by-step procedures:
    • Add or change one or more env vars (snapshot → edit → recreate → verify → sync)
    • Rotate a secret (with the "new before old" ordering rule)
    • Emergency rollback (from `.env.bak.<ts>` backups, ~30 seconds)
    • Full re-creation from local snapshot if the VPS `.env` is lost
  • Service-specific operational pre-reqs: Brevo IP allowlist, SPF/DKIM/DMARC DNS for inbox delivery, Postgres password rotation order, DIDRegistry transferOwnership ordering on blockchain wallet rotation
  • The plan to retire manual editing — the `PROD_ENV_FILE` GitHub secret path. Backwards-compatible (guarded by `if: env.PROD_ENV_FILE != ''`) so today's manual flow keeps working until the secret is added in GH UI.
  • Audit-log table at the bottom. Every production env change adds a row. First entry: today's SMTP injection.
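The backwards-compatible `PROD_ENV_FILE` behaviour described in the list above can be approximated in plain shell. This is only a sketch under stated assumptions: the real guard lives in the workflow as `if: env.PROD_ENV_FILE != ''`, while the function name `write_env_file`, the timestamp format of the backup suffix, and the `600` file mode are choices made for this example.

```shell
write_env_file() {
  # $1 is the target env file, e.g. /opt/zeroauth/.env on the VPS.
  target="$1"

  # Mirror the workflow guard: with no secret set, leave the manual file alone.
  if [ -z "${PROD_ENV_FILE:-}" ]; then
    echo "PROD_ENV_FILE not set; leaving $target untouched"
    return 0
  fi

  # Keep a timestamped backup so a rollback stays possible (assumed suffix format).
  if [ -f "$target" ]; then
    cp -p "$target" "$target.bak.$(date +%Y%m%d%H%M%S)"
  fi

  # Write with a restrictive umask in a subshell, then tighten an existing file too.
  ( umask 077; printf '%s\n' "$PROD_ENV_FILE" > "$target" )
  chmod 600 "$target"
}
```

Writing the file is only half of a deploy: the container still has to be recreated afterwards for the new values to take effect.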

Test plan

  • All snapshots gitignored (`git check-ignore .env.production.snapshot[.]` → 0)
  • No secrets in any tracked file (`git diff` filtered for known credential strings → 0 hits)
  • Production SMTP working end-to-end (verified via fresh-signup + duplicate-signup; both `Email: sent` with messageIds)
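The first test-plan item can be turned into a repeatable check along these lines. It is a sketch rather than the command actually run for this PR: the helper name `check_snapshots_ignored` is hypothetical, and it relies only on `git check-ignore -q`, which exits 0 when the given path is ignored.

```shell
check_snapshots_ignored() {
  # Run from the repository root. Returns non-zero if any listed path is NOT ignored.
  rc=0
  for f in "$@"; do
    # -q accepts a single pathname, so check the paths one at a time.
    if git check-ignore -q -- "$f"; then
      echo "ignored: $f"
    else
      echo "NOT ignored: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

A call such as `check_snapshots_ignored .env.production.snapshot .env.production.snapshot.2026-05-15` (the dated suffix is an invented example) exercises both the latest-snapshot pattern and the per-change archival pattern.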

🤖 Generated with Claude Code

Documents the manual VPS env-injection procedure as it stands today, plus
the GitHub-Actions-managed PROD_ENV_FILE path to retire it when the next
change comes up.

Why now: today's SMTP env injection (issue #27) exposed two procedural
gaps that bit us live:

  1. `docker compose restart` does NOT reload env_file. It restarts the
     existing container without re-reading the compose configuration.
     The env_file is loaded at container CREATION, not restart. You MUST
     `up -d --force-recreate` to pick up env edits.
     We hit this on the SMTP rollout: env was injected, restart ran clean,
     but the container kept logging "SMTP not configured" because it was
     still running with its boot-time env. Switched to force-recreate and
     the SMTP_HOST appeared.
  2. The local .env.production.snapshot files weren't gitignored. They
     contain real production credentials. Caught before any commit, and
     the .gitignore now covers both `.env.production.snapshot` (latest)
     and `.env.production.snapshot.*` (per-change archival copies). Existing
     .gitignore covered `.env` / `.env.local` / `.env.production` /
     `.env.*.local` but NOT `.env.production.snapshot` (different shape).

docs/operations/env-vars.md covers:

  - Where env vars live across 5 locations (VPS .env, local .env, two
    snapshot files, .env.example) and which is authoritative
  - The restart-vs-recreate gotcha (load-bearing)
  - Step-by-step procedures: add/change env, rotate secret, emergency
    rollback (from .env.bak.<ts> backups), full re-creation from local
    snapshot
  - Service-specific operational pre-reqs: Brevo IP allowlist
    (104.207.143.14 required, 5.7.1 errors otherwise), SPF/DKIM/DMARC
    DNS records on zeroauth.dev for inbox delivery, Postgres password
    rotation order, DIDRegistry transferOwnership order on blockchain
    wallet rotation
  - The plan to retire manual editing: GitHub Actions secret
    PROD_ENV_FILE written to /opt/zeroauth/.env on every deploy. Backwards-
    compatible — guarded by `if: env.PROD_ENV_FILE != ''` so today's
    manual flow keeps working until the secret is added in the GH UI.
  - An audit-log table at the bottom — every production env change adds
    a row (date, who, keys touched, rationale). First entry: today's
    SMTP injection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 15, 2026 06:54
@pulkitpareek18 pulkitpareek18 merged commit 02a79d0 into main May 15, 2026
1 of 3 checks passed
@pulkitpareek18 pulkitpareek18 deleted the dev branch May 15, 2026 06:54
@pulkitpareek18 pulkitpareek18 review requested due to automatic review settings May 15, 2026 07:15
pulkitpareek18 added a commit that referenced this pull request May 15, 2026
…ot* (#34)
