24 changes: 13 additions & 11 deletions docs/plans/ROADMAP.md
@@ -79,23 +79,25 @@ will *worsen* end-to-end latency. Functions and Firestore must move together.

 **Migration order (sequence is load-bearing)**

-1. **Pre-flight**: `gcloud firestore export gs://the-postbox-game-backup-eu/pre-migration-$(date +%F)` (create the backup bucket in `europe-west2` first). Tag the repo, freeze deploys.
+1. **Pre-flight**: a managed export/import bucket must be **co-located with the database**, and you change locations mid-migration, so you need **two** buckets. Create a **US** bucket for exporting out of `nam5` (`gcloud storage buckets create gs://the-postbox-game-backup-us --location=us`) and an **EU** bucket for importing into `eur3` (`gcloud storage buckets create gs://the-postbox-game-backup-eu --location=europe-west2`). Pre-migration backup → US bucket: `gcloud firestore export gs://the-postbox-game-backup-us/pre-migration-$(date +%F)`. Tag the repo, freeze deploys.
 2. **Region-pin Cloud Functions** (code change, no deploy yet):
    - v2 callables (`nearbyPostboxes`, `startScoring`, `updateDisplayName`, `registerFcmToken`, `userClaimHistory`, `submitReport`, `reviewReport`, `routePostboxes`): add `{ region: "europe-west2" }` to `onCall` options.
    - v2 scheduler (`newDayScoreboard`): add `region: "europe-west2"` alongside `schedule` + `timeZone`.
    - v1 triggers (`onFriendAdded`, `onUserCreated`): wrap with `functionsV1.region("europe-west2")`.
 3. **Firestore database move (disruptive — Path A recommended)**:
    1. Maintenance mode via the existing `maintenance_mode` Remote Config flag (per `b721caa`); client renders "we'll be right back".
-   2. Final export to `gs://…-backup-eu`.
-   3. Delete `(default)` (delete protection is already `DISABLED`).
-   4. `gcloud firestore databases create --location=eur3 --database='(default)'`.
-   5. Re-deploy `firestore.rules` and `firestore.indexes.json`; wait for indexes to finish before reopening traffic.
-   6. `gcloud firestore import` from the backup.
-   7. Verify counts on `postbox`, `claims`, `users`, `leaderboards`, `fcmTokens`.
+   2. Final export → **US** bucket: `gcloud firestore export gs://the-postbox-game-backup-us/final-$(date +%F-%H%M)`; note the printed `outputUriPrefix`.
+   3. Baseline counts on the frozen `nam5` data: `cd functions && npm run verify-migration -- snapshot --out pre-nam5.json`.
+   4. Copy the final export US→EU (eur3 can only import from an EU bucket): `gcloud storage cp -r gs://the-postbox-game-backup-us/final-… gs://the-postbox-game-backup-eu/final-…`.
+   5. Delete `(default)` (delete protection is already `DISABLED`).
+   6. `gcloud firestore databases create --location=eur3 --database='(default)'`.
+   7. Re-deploy `firestore.rules` and `firestore.indexes.json`; wait for the 5 composite indexes to finish (`gcloud firestore indexes composite list`) before reopening traffic.
+   8. `gcloud firestore import gs://the-postbox-game-backup-eu/final-…` (from the **EU** bucket).
+   9. Verify counts match: `npm run verify-migration -- snapshot --out post-eur3.json && npm run verify-migration -- compare pre-nam5.json post-eur3.json` (exit 0 = safe). Covers `postbox`, `claims`, `users`, `leaderboards`, `fcmTokens`, `reports`, `reportQuotas` + nested groups.
    - Path B (named DB in `eur3`, dual-write, switch reads, retire `(default)`) is the fallback if Path A downtime is unacceptable; **avoid** unless forced, since every `admin.firestore()` call would need to target the non-default DB.
-4. **Deploy new functions**: `firebase deploy --only functions` *creates* europe-west2 copies; us-central1 copies are not deleted automatically. Verify all 8 healthy.
-5. **Pin Flutter client** to europe-west2 via a single helper `lib/firebase_functions_eu.dart` exposing an `appFunctions` getter. Refactor the 8 callable call sites in `lib/user_repository.dart`, `lib/wear/wear_compass_page.dart`, `lib/notification_service.dart`, `lib/nearby.dart`, `lib/wear/wear_claim_page.dart`, `lib/claim_history_screen.dart`, `lib/claim.dart`. Bump the app version. Keep both regions live until us-central1 invocations drop to zero in Cloud Logging.
-6. **Storage bucket** (do last): create `the-postbox-game-eu` in `europe-west2`, regen SDK config via FlutterFire CLI, migrate any objects with `gsutil -m cp -r` (currently nothing referenced from client code).
+4. **Deploy new functions**: `firebase deploy --only functions` *creates* europe-west2 copies; **decline** the prompt to delete the orphaned us-central1 copies — keep the 8 callables live for old installs. But the 3 event-driven functions (`onUserCreated`, `onFriendAdded`, `newDayScoreboard`) fire automatically and would double-run from both regions, so delete only those old copies: `firebase functions:delete onUserCreated onFriendAdded newDayScoreboard --region us-central1`. Verify all **11** healthy in `europe-west2` and that no us-central1 `newDayScoreboard` scheduler job remains.
+5. **Pin Flutter client** to europe-west2 via a single helper `lib/firebase_functions_eu.dart` exposing an `appFunctions` getter (`instanceFor(region: 'europe-west2')`). Done in PR #159: all 11 call sites swapped (`nearby`, `claim_quiz_sheet`, `wear` ×2, `route` ×2, `reports`, `admin`, `user_repository`, `claim_history_screen`, `notification_service`), guarded by `test/firebase_functions_region_test.dart`. Bump the app version. Keep both regions live until us-central1 invocations drop to zero in Cloud Logging.
+6. **Storage bucket** (do last): create `the-postbox-game-eu` in `europe-west2`, regen SDK config via FlutterFire CLI, migrate the existing objects with `gsutil -m cp -r` (the default bucket holds `report_photos/` and `osm_changesets/`).
 7. **Decommission us-central1 functions** once Cloud Logging shows zero invocations for one release cycle: `firebase functions:delete <name> --region us-central1` for each.
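
Step 4 splits the old us-central1 copies into two groups: callables, which stay live for old installs, and event-driven functions, which must be deleted at once to avoid double runs. A minimal sketch of that decision follows. It is illustrative only: `DeployedFn` and `staleDeleteCommand` are hypothetical names that do not exist in the repository, while the function names and the `firebase functions:delete` command are taken from the steps above.

```typescript
// Hypothetical planning helper (not in the repo): decides which old-region
// copies to delete right after the europe-west2 deploy.
type FnKind = "callable" | "event";

interface DeployedFn {
  fn: string;
  kind: FnKind;
}

// Function names and kinds as listed in steps 2 and 4 of the roadmap.
const deployedFns: DeployedFn[] = [
  { fn: "nearbyPostboxes", kind: "callable" },
  { fn: "startScoring", kind: "callable" },
  { fn: "updateDisplayName", kind: "callable" },
  { fn: "registerFcmToken", kind: "callable" },
  { fn: "userClaimHistory", kind: "callable" },
  { fn: "submitReport", kind: "callable" },
  { fn: "reviewReport", kind: "callable" },
  { fn: "routePostboxes", kind: "callable" },
  { fn: "onUserCreated", kind: "event" },
  { fn: "onFriendAdded", kind: "event" },
  { fn: "newDayScoreboard", kind: "event" },
];

// Event-driven functions fire on their own, so a stale copy would double-run.
// Callables only run when a client invokes them, so old copies can stay.
function staleDeleteCommand(fns: DeployedFn[], oldRegion: string): string {
  const names = fns.filter((f) => f.kind === "event").map((f) => f.fn);
  return `firebase functions:delete ${names.join(" ")} --region ${oldRegion}`;
}

const keptCallables = deployedFns.filter((f) => f.kind === "callable").length;
const deleteCommand = staleDeleteCommand(deployedFns, "us-central1");
console.log(keptCallables, deleteCommand);
```

The remaining callables are the ones step 7 retires later, once Cloud Logging shows no us-central1 invocations.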

**Risks & mitigations**
@@ -113,7 +115,7 @@ will *worsen* end-to-end latency. Functions and Firestore must move together.
 - **Tests**: `cd functions && npm test` + `flutter test` both green.
 - **Logs**: Cloud Logging shows invocations only in `europe-west2` after the deprecation window.

-**Rollback**: delete the new `(default)`, re-create in `nam5`, import the pre-migration backup, re-deploy us-central1 functions from the freeze tag, revert the Flutter client `instanceFor` change, ship a hotfix.
+**Rollback**: delete the new `(default)`, re-create in `nam5`, import the pre-migration backup **from the US bucket** (`gs://the-postbox-game-backup-us/pre-migration-…`), re-deploy us-central1 functions from the freeze tag, leave the Flutter client unshipped (or revert the `instanceFor` change), ship a hotfix.

 ### Performance Monitoring custom traces (was #105)

3 changes: 2 additions & 1 deletion functions/package.json
@@ -11,7 +11,8 @@
     "deploy": "firebase deploy --only functions",
     "logs": "firebase functions:log",
     "test": "npm run build && nyc mocha lib/test/test.index.js --reporter spec",
-    "plan-route": "npm run build && node lib/scripts/plan_route.js"
+    "plan-route": "npm run build && node lib/scripts/plan_route.js",
+    "verify-migration": "npm run build && node lib/scripts/verify_migration.js"
   },
   "engines": {
     "node": "22"
80 changes: 80 additions & 0 deletions functions/src/_migrationVerify.ts
@@ -0,0 +1,80 @@
+/*
+ * _migrationVerify.ts — pure helpers for the read-only Firestore migration
+ * verification CLI (scripts/verify_migration.ts).
+ *
+ * Kept separate from the CLI (and free of any firebase-admin / I/O) so the
+ * diff verdict — the part where a silent bug would falsely report "match" and
+ * let traffic reopen on incomplete data — is unit-testable. See the
+ * `diffSnapshots` tests in test/test.index.ts.
+ */
+
+/** A point-in-time document-count snapshot of the `(default)` database. */
+export interface MigrationSnapshot {
+  project: string;
+  /** ISO-8601 timestamp the snapshot was taken. */
+  generatedAt: string;
+  /** Root collection name -> document count. */
+  roots: Record<string, number>;
+  /** Collection-group id -> document count (across all nesting depths). */
+  groups: Record<string, number>;
+}
+
+export type DiffScope = "root" | "group";
+export type DiffStatus = "ok" | "mismatch" | "missing";
+
+export interface DiffRow {
+  name: string;
+  scope: DiffScope;
+  /** Count in the "before" snapshot, or null if the key is absent there. */
+  before: number | null;
+  /** Count in the "after" snapshot, or null if the key is absent there. */
+  after: number | null;
+  /** (after ?? 0) - (before ?? 0). */
+  delta: number;
+  status: DiffStatus;
+}
+
+export interface DiffResult {
+  /** True only if every row is "ok" (present on both sides, equal counts). */
+  ok: boolean;
+  rows: DiffRow[];
+}
+
+function diffSection(
+  scope: DiffScope,
+  before: Record<string, number>,
+  after: Record<string, number>,
+): DiffRow[] {
+  const names = Array.from(
+    new Set([...Object.keys(before), ...Object.keys(after)]),
+  ).sort();
+
+  return names.map((name) => {
+    const b = Object.prototype.hasOwnProperty.call(before, name)
+      ? before[name]
+      : null;
+    const a = Object.prototype.hasOwnProperty.call(after, name)
+      ? after[name]
+      : null;
+    const delta = (a ?? 0) - (b ?? 0);
+    const status: DiffStatus =
+      b === null || a === null ? "missing" : a === b ? "ok" : "mismatch";
+    return { name, scope, before: b, after: a, delta, status };
+  });
+}
+
+/**
+ * Compares two snapshots and returns a per-collection diff. A correct
+ * migration leaves every count unchanged, so any non-"ok" row means the
+ * import is incomplete (or grew) and traffic must NOT be reopened.
+ */
+export function diffSnapshots(
+  before: MigrationSnapshot,
+  after: MigrationSnapshot,
+): DiffResult {
+  const rows = [
+    ...diffSection("root", before.roots, after.roots),
+    ...diffSection("group", before.groups, after.groups),
+  ];
+  return { ok: rows.every((r) => r.status === "ok"), rows };
+}
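
To make the verdict rule concrete, here is a compact, self-contained restatement of how each key is classified. It is a sketch for illustration, not the module itself: `statusFor` and the sample counts are hypothetical, and mapping a failed verdict to exit code 1 is an assumption (the roadmap only states that exit 0 means safe).

```typescript
// Illustrative restatement of the per-key rule in diffSection; not repo code.
type Counts = Record<string, number>;
type Verdict = "ok" | "mismatch" | "missing";

function statusFor(before: Counts, after: Counts, key: string): Verdict {
  const has = (o: Counts) => Object.prototype.hasOwnProperty.call(o, key);
  // A key absent on either side is "missing", never silently treated as 0.
  if (!has(before) || !has(after)) return "missing";
  return before[key] === after[key] ? "ok" : "mismatch";
}

// Hypothetical counts; real values come from the two snapshot files.
const preCounts: Counts = { users: 120, claims: 4300, reports: 7 };
const postCounts: Counts = { users: 120, claims: 4299 };

const verdicts = Object.keys({ ...preCounts, ...postCounts })
  .sort()
  .map((key) => ({ key, status: statusFor(preCounts, postCounts, key) }));

// One bad row is enough to block reopening traffic.
const safeToReopen = verdicts.every((v) => v.status === "ok");
const exitCode = safeToReopen ? 0 : 1; // assumption: any failure exits 1

console.log(verdicts, exitCode);
```

Here `claims` lost a document and `reports` was never imported, so the verdict is not safe even though `users` matches.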
26 changes: 15 additions & 11 deletions functions/src/_notifications.ts
@@ -1,7 +1,8 @@
 import "./adminInit";
 import * as admin from "firebase-admin";
 import * as functions from "firebase-functions";
-import * as functionsV1 from "firebase-functions/v1";
+import { onDocumentUpdated } from "firebase-functions/v2/firestore";
+import { FIRESTORE_TRIGGER_REGION } from "./_region";
 import { getTodayLondon } from "./_dateUtils";

 const database = admin.firestore();
@@ -286,23 +287,26 @@ export const registerFcmToken = functions.https.onCall(async (request) => {

 // ── Firestore trigger: friend added ──────────────────────────────────────

-export const onFriendAdded = functionsV1.firestore
-  .document("users/{uid}")
-  .onUpdate(async (change, context) => {
-    const uid: string = context.params.uid;
-    const before: string[] =
-      (change.before.data()?.friends as string[] | undefined) ?? [];
-    const after: string[] =
-      (change.after.data()?.friends as string[] | undefined) ?? [];
+// 2nd-gen Firestore trigger in europe-west4 (eur3's Eventarc region); eur3 has
+// no Gen1 Firestore triggers. See FIRESTORE_TRIGGER_REGION in _region.ts.
+export const onFriendAdded = onDocumentUpdated(
+  { document: "users/{uid}", region: FIRESTORE_TRIGGER_REGION },
+  async (event) => {
+    const uid = event.params.uid;
+    const before =
+      (event.data?.before.data()?.friends as string[] | undefined) ?? [];
+    const after =
+      (event.data?.after.data()?.friends as string[] | undefined) ?? [];

     const newFriends = diffFriends(before, after);
     if (newFriends.length === 0) return;

     const adderDisplayName =
-      (change.after.data()?.displayName as string | undefined) ||
+      (event.data?.after.data()?.displayName as string | undefined) ||
       `Player_${uid.slice(0, 6)}`;

     await Promise.allSettled(
       newFriends.map((fuid) => notifyFriendOfAddition(fuid, adderDisplayName))
     );
-  });
+  },
+);
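
The trigger body relies on `diffFriends`, which this hunk does not show. A minimal sketch of what such a helper might do, assuming it returns the UIDs present in `after` but absent from `before`. The names `diffFriendsSketch` and `adderName` and the sample UIDs are hypothetical; only the display-name fallback expression is taken from the trigger above.

```typescript
// Hypothetical stand-in for diffFriends; the real helper is not in this diff.
// Assumption: "new friends" are UIDs in `after` that were not in `before`.
function diffFriendsSketch(before: string[], after: string[]): string[] {
  return after.filter((uid) => before.indexOf(uid) === -1);
}

// Mirrors the display-name fallback used in the trigger body.
function adderName(displayName: string | undefined, uid: string): string {
  return displayName || `Player_${uid.slice(0, 6)}`;
}

const addedFriends = diffFriendsSketch(["alice01"], ["alice01", "bob0002"]);
const fallbackName = adderName(undefined, "abcdef123456");
console.log(addedFriends, fallbackName);
```

With this rule, an update that only removes friends or changes unrelated fields yields an empty list, which matches the early `return` in the trigger.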
19 changes: 19 additions & 0 deletions functions/src/_region.ts
@@ -0,0 +1,19 @@
+import { setGlobalOptions } from "firebase-functions/v2";
+
+// UK-only audience reading the eur3 Firestore: pin every Cloud Function to
+// europe-west2 so round-trips stay in-region. setGlobalOptions covers all v2
+// functions (callables + scheduler); the v1 triggers reference FUNCTION_REGION
+// directly via .region(). See ROADMAP v1.3 (us-central1 -> europe-west2).
+//
+// Imported first in index.ts so setGlobalOptions runs before any function
+// module is evaluated (and therefore before each function is defined).
+export const FUNCTION_REGION = "europe-west2";
+
+// eur3 has no Gen1 Firestore triggers at all (Gen1 deploys fail in every
+// region with "...is in region eur3-europe-west1 which is not supported"), so
+// onFriendAdded is a 2nd-gen trigger. Eventarc maps the eur3 multi-region to
+// europe-west4, so the function must run there — NOT europe-west2/west1.
+// (Auth triggers like onUserCreated are global and stay on FUNCTION_REGION.)
+export const FIRESTORE_TRIGGER_REGION = "europe-west4";
+
+setGlobalOptions({ region: FUNCTION_REGION });
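
The region rule described in these comments can be summarised as a lookup. The sketch below is illustrative only: `regionFor` and `TriggerKind` are hypothetical and do not exist in `_region.ts`, which simply exports the two constants and calls `setGlobalOptions`.

```typescript
// Values copied from _region.ts; the helper below is hypothetical.
const functionRegion = "europe-west2";
const firestoreTriggerRegion = "europe-west4";

type TriggerKind = "callable" | "scheduler" | "auth" | "firestore";

// Only Firestore triggers follow the database's Eventarc region; everything
// else stays close to the UK audience in europe-west2.
function regionFor(kind: TriggerKind): string {
  return kind === "firestore" ? firestoreTriggerRegion : functionRegion;
}

console.log(regionFor("firestore"), regionFor("callable"));
```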
1 change: 1 addition & 0 deletions functions/src/index.ts
@@ -1,3 +1,4 @@
+import "./_region";
 import "./adminInit";
 import { nearbyPostboxes } from "./nearbyPostboxes";
 import { startScoring } from "./startScoring";
6 changes: 5 additions & 1 deletion functions/src/onUserCreated.ts
@@ -1,6 +1,7 @@
 import "./adminInit";
 import * as admin from "firebase-admin";
 import * as functions from "firebase-functions/v1";
+import { FUNCTION_REGION } from "./_region";
 import {
   containsProfanity,
   MIN_DISPLAY_NAME_CHARS,
@@ -16,7 +17,10 @@ export function sanitiseName(name: string, uid: string): string {
   return t;
 }

-export const onUserCreated = functions.auth.user().onCreate(async (user) => {
+export const onUserCreated = functions
+  .region(FUNCTION_REGION)
+  .auth.user()
+  .onCreate(async (user) => {
   const raw =
     user.displayName ||
     (user.email