diff --git a/SERIALIZATIONS.md b/SERIALIZATIONS.md
new file mode 100644
index 0000000..8cfae53
--- /dev/null
+++ b/SERIALIZATIONS.md
@@ -0,0 +1,334 @@
+---
+title: "iSamples Data Serializations"
+subtitle: "A catalog of the parquet files that back the iSamples query substrate"
+author: "iSamples team"
+date: today
+toc: true
+categories: [data, architecture, parquet]
+---
+
+## 1. Purpose and scope
+
+iSamples has roughly a dozen parquet files in circulation at any given
+moment — each with a specific role, a specific upstream parent, and a
+specific set of downstream consumers (the web Explorer, the Python
+reference notebook, the progressive globe, the PQG conformance work).
+Some are primary archival products; others are derived aggregates
+or caches; still others are source-specific variants published
+outside the `data.isamples.org` namespace.
+
+This document is a **catalog**, not an ingestion guide: it tells you
+what each file is, where it came from, who consumes it, and where in
+the spec tree to look for its normative definition. For how to *build*
+these files, see the scripts in
+[`scripts/`](https://github.com/isamplesorg/isamplesorg.github.io/tree/main/scripts)
+and the converters in
+[`pqg/`](https://github.com/isamplesorg/pqg). For how to *query* them,
+see [`query-spec.qmd`](query-spec.qmd). For how to *cite* them, see
+the Zenodo deposition plan.
+
+All sizes and row counts below were verified by DuckDB `DESCRIBE` +
+`COUNT(*)` against `https://data.isamples.org/` on 2026-04-24.
+
+## 2. The derivation DAG
+
+```
+Zenodo export (doi:10.5281/zenodo.15278211, ~300 MB, 6.7 M samples)
+  │   sample-centric, nested STRUCTs (PQG "export" format)
+  │
+  └─► isamples_202512_narrow.parquet  (820 MB, 101 M rows)
+        │   graph-normalized, nodes + _edge_ rows (PQG "narrow")
+        │
+        └─► isamples_202601_wide.parquet  (278 MB, 20.7 M rows)
+              │   entity-centric, p__* relationship arrays (PQG "wide")
+              │
+              ├─► isamples_202604_wide.parquet  (292 MB, 20.7 M rows)
+              │     = 202601 wide + ~47 K OpenContext thumbnails
+              │     (see scripts/enrich_wide_with_oc_thumbnails.py)
+              │
+              ├─► isamples_202601_wide_h3.parquet  (292 MB, 20.7 M)
+              │     = wide + h3_res4 / h3_res6 / h3_res8 columns
+              │
+              ├─► isamples_202601_samples_map_lite.parquet  (60 MB, 6.0 M)
+              │     display projection for map points
+              │
+              ├─► isamples_202601_sample_facets_v2.parquet  (63 MB, 6.0 M)
+              │     pid → facet-URI strings for multi-dim filtering
+              │
+              ├─► isamples_202601_facet_summaries.parquet  (2 KB, 56 rows)
+              │     baseline (facet_type, facet_value, count) tuples
+              │
+              ├─► isamples_202601_facet_cross_filter.parquet  (6 KB, 526 rows)
+              │     single-active-filter cross cache
+              │
+              └─► isamples_202601_h3_summary_res{4,6,8}.parquet
+                    geospatial aggregates for the progressive globe
+                    (38 K / 112 K / 176 K cells)
+
+Source-specific variants (parallel to the substrate, not derived from it):
+
+oc_isamples_pqg.parquet        (GCS, 11.8 M, narrow, OC-only)
+oc_isamples_pqg_wide.parquet   (GCS,  2.5 M, wide,   OC-only)
+  └─► serve as upstream for OpenContext thumbnails folded into 202604 wide
+```
+
+Arrows indicate derivation, not containment. Every file in the left
+column can be rebuilt from its parent by a script in
+`isamples-python/` or `isamplesorg.github.io/scripts/`.
+
+## 3. Catalog
+
+### Tier: source of truth
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `zenodo.15278211` export | Aggregated Zenodo export (all 4 sources, sample-centric, nested) | ~300 MB | 6.7 M | SESAR + OpenContext + GEOME + Smithsonian ingestion | PQG converters (narrow, wide) | PQG §3.3 (export format) |
+
+### Tier: graph normalization
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `isamples_202512_narrow.parquet` | Graph-normalized with explicit `_edge_` rows; canonical archival form | 820 MB | 101.4 M | Zenodo export | Graph traversals, PQG tutorials, narrow→wide converter, Zenodo archive | PQG §3.1, §4.2 |
+| `isamples_202601_wide.parquet` | Entity-centric, relationships as `p__*` arrays; primary analytic substrate | 278 MB | 20.7 M | narrow | Search Explorer, Python notebook, facet/h3/lite derivations | PQG §3.2, §4.5 |
+| `isamples_202604_wide.parquet` | 202601 wide + ~47 K OC thumbnails folded in | 292 MB | 20.7 M | 202601 wide + `oc_isamples_pqg.parquet` | `current/wide.parquet` alias points here | PQG §3.2 |
+
+### Tier: derived aggregates (progressive globe / H3)
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `isamples_202601_wide_h3.parquet` | Wide with `h3_res{4,6,8}` BIGINT columns pre-joined | 292 MB | 20.7 M | wide | Deep-Dive Analysis tutorial (H3 filtering without join) | QUERY_SPEC §2.4 |
+| `isamples_202601_h3_summary_res4.parquet` | Continental tier: `(h3_cell, sample_count, center_lat, center_lng, dominant_source, source_count, resolution)` | 580 KB | 38 K | wide_h3 | Interactive Explorer globe (zoomed out), Python Explorer H3 tier mode | QUERY_SPEC §2.4 |
+| `isamples_202601_h3_summary_res6.parquet` | Regional tier | 1.6 MB | 112 K | wide_h3 | Interactive Explorer globe (mid zoom) | QUERY_SPEC §2.4 |
+| `isamples_202601_h3_summary_res8.parquet` | Neighborhood tier | 2.4 MB | 176 K | wide_h3 | Interactive Explorer globe (close zoom) | QUERY_SPEC §2.4 |
+
+### Tier: display projections
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `isamples_202601_samples_map_lite.parquet` | Minimum map projection; only `MaterialSampleRecord` rows with coordinates | 60 MB | 6.0 M | wide (filtered) | Interactive Explorer point-level rendering below ~120 km altitude | QUERY_SPEC §4.1 |
+
+### Tier: facet caches
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `isamples_202601_sample_facets_v2.parquet` | `(pid, source, material, context, object_type, label, description, place_name)` — all VARCHAR scalars; each facet column is a single URI per sample (not an array) | 63 MB | 6.0 M | wide | Search Explorer multi-dim facet filtering | QUERY_SPEC §3.3, §5.1 |
+| `isamples_202601_facet_summaries.parquet` | Baseline `(facet_type, facet_value, scheme, count)` | 2 KB | 56 | wide | Every tutorial (instant initial facet counts) | QUERY_SPEC §3.3 tier 1 |
+| `isamples_202601_facet_cross_filter.parquet` | Pre-computed counts for single-filter cross-facet queries | 6 KB | 526 | wide | Search Explorer cross-filter UI | QUERY_SPEC §3.3 tier 2a |
+
+### Tier: alternative export formats (upstream of the aggregated Zenodo export)
+
+The `export_client` can emit each source's records in multiple formats;
+the aggregated Zenodo deposition archives the GeoParquet flavor, but
+JSONL and CSV are also emitted by the same pipeline and are useful for
+streaming or human inspection.
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `isamples_export_*.jsonl` | Streaming JSON export (one sample per line, nested structs) | per query | — | `isamplesorg/export_client` (`isample export -f jsonl`) | Local DuckDB ingestion, STAC catalog generation | [export_client docs](https://github.com/isamplesorg/export_client) |
+| `isamples_export_*.csv` | Flat CSV export — convenience only, not authoritative for the query substrate | per query | — | `isamplesorg/export_client` (`isample export -f csv`) | Human inspection | [export_client docs](https://github.com/isamplesorg/export_client) |
+| `stac.json` / `manifest.json` | STAC/discovery sidecars emitted with local exports | < 1 KB | — | `isamplesorg/export_client` | STAC browser, local server, refresh workflow | [export_client README](https://github.com/isamplesorg/export_client) |
+
+### Tier: legacy bindings and convenience copies
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| Solr indexed documents | Legacy search-server binding for the same canonical query dimensions. *Not a portable serialization*; listed here because QUERY_SPEC §5.3 documents the Solr dialect bindings | N/A | ~6 M | `isamplesorg/isamples_inabox` harvest/index pipelines + schema mappings | [iSamples Central](https://central.isample.xyz/isamples_central/) (API offline as of Aug 2025; Solr schema remains the authoritative precedent for dimension names) | [QUERY_SPEC §5.3](query-spec.qmd#sec-bindings) |
+| H3 + lite CSV twins | Human-readable CSV duplicates of `samples_map_lite.parquet` and `h3_summary_res{4,6,8}.parquet` | ~640 MB total | mirror | the corresponding parquet files | Manual inspection only | parquet copies are authoritative; CSV twins excluded from the Zenodo substrate deposition by design |
+
+### Tier: source-specific variants (not part of the substrate)
+
+| File | Role | Size | Rows | Upstream | Consumers | Spec |
+|---|---|---:|---:|---|---|---|
+| `oc_isamples_pqg.parquet` (GCS) | OpenContext-only narrow; carries `thumbnail_url` values absent from the aggregated export | ~1.8 GB | 11.8 M | OpenContext ETL (Eric Kansa) | `scripts/enrich_wide_with_oc_thumbnails.py` → 202604 wide; PQG development | PQG §3.1 |
+| `oc_isamples_pqg_wide.parquet` (GCS) | OpenContext-only wide | ~600 MB | 2.5 M | OC narrow | OC-specific analyses, PQG benchmarks | PQG §3.2 |
+
+**No OpenContext sidecar file exists yet.** Per the sidecar-pattern plan
+(Raymond endorsed 2026-04-17), thumbnails are currently merged directly
+into `isamples_202604_wide.parquet` rather than joined at query time
+from a sidecar. A future `isamples_202601_oc_sidecar.parquet` (keyed on
+`pid`, with `thumbnail_url`, `is_public`, `license`, `media_url`,
+`harvested_at`) is planned — see
+`project_isamples_sidecar_pattern.md`.
+
+## 4. Per-file detail
+
+URL convention: each file is available at
+`https://data.isamples.org/<filename>` (versioned, 1-yr immutable cache)
+and, where applicable, at `https://data.isamples.org/current/<alias>`
+(302 redirect, 5-min cache). Examples below use the versioned URL; swap
+for the alias when you want "latest."
+
+### 4.1 Zenodo export (source of truth)
+
+- **Role**: The raw aggregated Zenodo export — all four sources, sample-centric, nested STRUCTs.
+- **DOI**: `10.5281/zenodo.15278211`
+- **Headline schema** (PQG export, 19 cols): `sample_identifier`, `label`, `description`, `produced_by {sampling_site {sample_location {latitude, longitude, ...}}}`, etc.
+- **Query pattern**: one row per sample; no JOINs needed for basic queries.
+- **DuckDB**: download the parquet from Zenodo, then
+  `SELECT * FROM read_parquet('isamples_export_*.parquet') LIMIT 10`.
+
+### 4.2 `isamples_202512_narrow.parquet`
+
+- **Role**: PQG narrow format — the canonical, lossless graph-normalized representation.
+- **Headline schema** (40 cols): `row_id, pid, otype, s, p, o, n, altids, geometry, ...entity-specific columns...`. Edges are rows with `otype='_edge_'` and populated `s/p/o`.
+- **Query pattern**: multi-hop JOIN via `_edge_` rows (see PQG §2.2).
+- **DuckDB**:
+  ```sql
+  SELECT COUNT(*) FROM read_parquet('https://data.isamples.org/isamples_202512_narrow.parquet')
+  WHERE otype = 'MaterialSampleRecord';
+  ```
+
+### 4.3 `isamples_202601_wide.parquet`
+
+- **Role**: PQG wide format — primary analytic substrate for Explorer + notebook.
+- **Headline schema** (49 cols): same core columns as narrow, plus `p__produced_by`, `p__sample_location`, `p__sampling_site`, `p__site_location`, `p__responsibility`, `p__registrant`, `p__has_material_category`, `p__has_context_category`, `p__has_sample_object_type`, `p__keywords`, `p__curation`, `p__related_resource` — each an integer array of target `row_id`s. Exact DuckDB types are mixed: `p__produced_by`, `p__sample_location`, `p__sampling_site`, `p__site_location`, `p__registrant`, `p__curation` are `INTEGER[]`; `p__has_material_category`, `p__has_context_category`, `p__has_sample_object_type`, `p__keywords`, `p__responsibility`, `p__related_resource` are `BIGINT[]`.
+- **Column name gotcha**: the source column is `n` on wide/narrow (PQG convention), not `source`. Alias it in projections (e.g. `n AS source`) to match what the lite and facet parquets already call it.
+- **Query pattern**: entity-centric; relationships via array-element JOIN (see PQG §3.2).
+- **DuckDB**:
+  ```sql
+  SELECT n AS source, COUNT(*) FROM read_parquet('https://data.isamples.org/isamples_202601_wide.parquet')
+  WHERE otype = 'MaterialSampleRecord' GROUP BY n ORDER BY 2 DESC;
+  ```
+
+### 4.4 `isamples_202604_wide.parquet`
+
+- **Role**: 202601 wide enriched with ~47 K OpenContext thumbnails. `current/wide.parquet` 302-redirects here.
+- **Headline schema**: identical to 202601 wide (49 cols). Only the `thumbnail_url` column on OC `MaterialSampleRecord` rows is populated differently.
+- **Query pattern**: drop-in replacement for 202601 wide; use `current/wide.parquet` unless you need a pinned version.
+- **DuckDB**:
+  ```sql
+  SELECT COUNT(*) FROM read_parquet('https://data.isamples.org/current/wide.parquet')
+  WHERE thumbnail_url IS NOT NULL;
+  ```
+
+### 4.5 `isamples_202601_wide_h3.parquet`
+
+- **Role**: Wide with H3 indices pre-joined, so H3 predicates don't need a join.
+- **Headline schema** (52 cols): wide columns + `h3_res4`, `h3_res6`, `h3_res8` (BIGINT).
+- **Query pattern**: direct H3-cell filtering without an H3 UDF.
+- **DuckDB**:
+  ```sql
+  SELECT COUNT(*) FROM read_parquet('https://data.isamples.org/isamples_202601_wide_h3.parquet')
+  WHERE h3_res6 = 604932829406232575;
+  ```
+
+### 4.6 `isamples_202601_h3_summary_res{4,6,8}.parquet`
+
+- **Role**: Zoom-adaptive aggregates that back the Cesium progressive globe and the Python Explorer's "H3 tier" rendering mode.
+- **Headline schema** (7 cols, identical across resolutions): `h3_cell` (BIGINT), `sample_count` (INT), `center_lat`, `center_lng` (DOUBLE), `dominant_source` (VARCHAR), `source_count` (INT), `resolution` (INT).
+- **Query pattern**: fetch the right resolution for the current zoom; no join needed.
+- **DuckDB**:
+  ```sql
+  SELECT * FROM read_parquet('https://data.isamples.org/isamples_202601_h3_summary_res6.parquet')
+  ORDER BY sample_count DESC LIMIT 20;
+  ```
+
+### 4.7 `isamples_202601_samples_map_lite.parquet`
+
+- **Role**: Display projection for point-level map rendering. Contains only `MaterialSampleRecord` rows with valid coordinates.
+- **Headline schema** (9 cols): `pid, label, source, latitude, longitude, place_name, result_time, h3_res8, h3_res8_hex`. **No `description`** — it's in wide only.
+- **Query pattern**: the Explorer reads this directly when altitude drops below the point-render threshold.
+- **DuckDB**:
+  ```sql
+  SELECT source, COUNT(*) FROM read_parquet('https://data.isamples.org/isamples_202601_samples_map_lite.parquet')
+  WHERE latitude BETWEEN 32 AND 42 GROUP BY 1;
+  ```
+
+### 4.8 `isamples_202601_sample_facets_v2.parquet`
+
+- **Role**: Cross-dimension facet filtering — one row per sample, each facet column holds a single controlled-vocabulary URI (the leaf concept the sample is tagged with at that dimension).
+- **Headline schema** (8 cols, all VARCHAR): `pid, source, material, context, object_type, label, description, place_name`. `material`/`context`/`object_type` are scalar URI strings, NOT arrays — the file's grain is one row per sample, so a sample tagged with multiple material URIs is represented by a single chosen URI (currently the first/leaf). For multi-material accuracy, JOIN back to `wide.p__has_material_category`.
+- **Query pattern**: `WHERE material = '<uri>'` for exact match; `WHERE material ILIKE '%rock%'` to substring-match URI fragments.
+- **DuckDB**:
+  ```sql
+  SELECT pid, label
+  FROM read_parquet('https://data.isamples.org/isamples_202601_sample_facets_v2.parquet')
+  WHERE material ILIKE '%rock%'
+  LIMIT 10;
+  ```
+
+### 4.9 `isamples_202601_facet_summaries.parquet`
+
+- **Role**: Baseline (no-filter) facet counts. Loaded by every tutorial at startup.
+- **Headline schema** (4 cols, 56 rows): `facet_type` (source|material|context|object_type), `facet_value`, `scheme`, `count`.
+- **Query pattern**: sort by `count DESC` to render a top-N facet list.
+- **DuckDB**:
+  ```sql
+  SELECT * FROM read_parquet('https://data.isamples.org/isamples_202601_facet_summaries.parquet')
+  WHERE facet_type = 'material' ORDER BY count DESC;
+  ```
+
+### 4.10 `isamples_202601_facet_cross_filter.parquet`
+
+- **Role**: Cross-facet counts for the single-active-filter case (QUERY_SPEC §3.3 tier 2a). Avoids recomputing when one facet dimension is active.
+- **Headline schema** (7 cols, 526 rows): `filter_source, filter_material, filter_context, filter_object_type, facet_type, facet_value, count`. Exactly one `filter_*` column is non-NULL per row.
+- **Query pattern**: lookup by the active filter to get counts for the remaining dimensions.
+- **DuckDB**:
+  ```sql
+  SELECT facet_type, facet_value, count FROM read_parquet('https://data.isamples.org/isamples_202601_facet_cross_filter.parquet')
+  WHERE filter_source = 'SESAR' ORDER BY facet_type, count DESC;
+  ```
+
+### 4.11 `oc_isamples_pqg.parquet` and `oc_isamples_pqg_wide.parquet` (OC variants)
+
+- **Role**: OpenContext-specific PQG files maintained by Eric Kansa. Hosted at
+  `https://storage.googleapis.com/opencontext-parquet/`, **not** under
+  `data.isamples.org`. They are not part of the cross-source substrate —
+  they carry OC-internal detail (notably `thumbnail_url`) that the
+  aggregated Zenodo export drops.
+- **Headline schema**: PQG narrow (40 cols) and wide (47 cols). OC wide has slightly fewer `p__*` columns than the unified wide — this is schema drift, not semantically meaningful for standard queries.
+- **Consumer**: `scripts/enrich_wide_with_oc_thumbnails.py` uses OC narrow to fill thumbnails into 202604 unified wide. Also used directly in PQG benchmark work.
+- **Future**: these become the prototype upstream for per-source sidecars (see §3, bottom row).
+
+## 5. URL convention
+
+All substrate files live under `https://data.isamples.org/` — a
+Cloudflare Worker fronting an R2 bucket. The Worker provides:
+
+- **Versioned URLs** `https://data.isamples.org/isamples_<YYYYMM>_<variant>.parquet`
+  — 1-year immutable cache. Safe to pin in papers, Zenodo manifests,
+  reproducibility notebooks.
+- **Alias URLs** `https://data.isamples.org/current/<alias>` — 302
+  redirect with 5-min cache; always resolves to the latest snapshot.
+  Use for "always fresh" consumers. Currently
+  `current/wide.parquet → isamples_202604_wide.parquet`.
+
+**Never reference the raw
+`pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/...` URL.** It bypasses
+the Worker and defeats both the alias layer and the Cache-Control
+headers that DuckDB-WASM relies on for HTTP range requests.
+
+OpenContext-specific variants live at
+`https://storage.googleapis.com/opencontext-parquet/` and are
+maintained outside this convention.
+
+## 6. Relationship to other documents
+
+- **[`query-spec.qmd`](query-spec.qmd)** §5.1 — the DuckDB binding table,
+  which maps query-spec dimensions (`source`, `material`, `bbox`, `h3`,
+  `time`, `text`) to the specific parquet files above. This catalog
+  says *what* the files are; the query spec says *which dimension*
+  each file serves.
+- **`ZENODO_DEPOSITION_PLAN.md`** (in the monorepo root) — specifies
+  which subset of these files are archived in each Zenodo deposition.
+  The 202601 deposition bundles the 10 R2-served files plus a
+  `MANIFEST.json` and `README.md`. Source-specific OC variants and
+  the raw Zenodo export are **not** part of the substrate deposition.
+- **[`pqg/docs/PQG_SPECIFICATION.md`](https://github.com/isamplesorg/pqg/blob/main/docs/PQG_SPECIFICATION.md)** — defines the three canonical formats
+  (export, narrow, wide) whose schemas the primary files conform to.
+  §3.5 is the normative section.
+- **`pqg/docs/conformance_matrix.md`** (planned) — will document, for
+  each file above, exactly which clauses of the PQG spec it satisfies
+  (required columns, allowed `otype` values, edge-type constraints,
+  etc.). This catalog is the prose companion; the conformance matrix
+  will be the machine-checkable companion.
+- **`project_isamples_sidecar_pattern.md`** (memory) — planning for
+  per-source sidecars that would sit alongside the unified wide file
+  rather than being folded in at build time (as OC thumbnails
+  currently are). When that lands, it adds a new tier to §3.
+
+---
+
+*Last updated: 2026-04-24 by iSamples team. Row counts and sizes
+verified by DuckDB against `https://data.isamples.org/` on the same
+date.*