Add unstable_allowIndexMap for cheap indexed source maps#1750

Open: robhogan wants to merge 2 commits into main from export-D108384690

@robhogan

Contributor

Summary:
Allow Metro to emit [indexed source maps](https://tc39.es/ecma426/#sec-index-source-map) for `.map` requests, behind `serializer.unstable_allowIndexMap`.

When source maps are stored compactly as VLQ (see the `unstable_compactSourceMaps` producer diff), the default `.map` serialisation decodes every module back to tuples and re-encodes them into a single flat map. The result is correct and byte-identical to today's output, but it costs decode and re-encode CPU time on every whole-bundle `.map` request.

With `unstable_allowIndexMap` enabled, the serialiser can pass each module's VLQ `mappings` string through verbatim, skipping the decode and re-encode step and vastly reducing the computational cost of map generation.
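
To make the passthrough idea concrete, here is a minimal sketch of assembling an index map whose sections reuse each module's `mappings` string untouched. It is not Metro's implementation: `buildIndexMap`, the shape of the `modules` array, and the assumption that each module occupies whole lines of the bundle are all hypothetical. Only the `sections`/`offset`/`map` layout comes from the index map format itself.

```javascript
// Hypothetical helper, not Metro's serialiser: builds an index source map
// from per-module maps without decoding any VLQ mappings.
function buildIndexMap(modules) {
  const sections = [];
  let line = 0; // zero-based generated line where the next module starts
  for (const {code, map} of modules) {
    sections.push({
      offset: {line, column: 0},
      map: {
        version: 3,
        sources: map.sources,
        names: map.names,
        mappings: map.mappings, // copied verbatim, never decoded
      },
    });
    // Assumption: modules are joined with '\n', so each starts on a new line.
    line += code.split('\n').length;
  }
  return {version: 3, sections};
}

// Illustrative input: two tiny modules with already-encoded mappings.
const modules = [
  {
    code: 'var a = 1;\nvar b = 2;',
    map: {sources: ['a.js'], names: [], mappings: 'AAAA;AACA'},
  },
  {
    code: 'var c = 3;',
    map: {sources: ['c.js'], names: [], mappings: 'AAAA'},
  },
];

const indexMap = buildIndexMap(modules);
console.log(JSON.stringify(indexMap.sections.map(s => s.offset)));
// [{"line":0,"column":0},{"line":2,"column":0}]
```

Each section carries its own `version`, `sources` and `names`, which is consistent with the repetition between sections noted in the tradeoffs.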

Note: this is a no-op if `unstable_compactSourceMaps` is not enabled (or if no VLQ-stored map is present), so it can be safely enabled ahead of the producer.
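
For orientation, enabling both options might look roughly like the following `metro.config.js` sketch. The two option names and their `transformer`/`serializer` namespaces are taken from this PR; treating them as plain booleans is an assumption.

```javascript
// metro.config.js (sketch). Option names come from this PR; the boolean
// values are an assumption about how the flags are switched on.
const config = {
  transformer: {
    // Store per-module source maps compactly as VLQ (the producer side).
    unstable_compactSourceMaps: true,
  },
  serializer: {
    // Allow whole-bundle maps to be emitted in indexed form.
    unstable_allowIndexMap: true,
  },
};

module.exports = config;
```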

The flag is honoured wherever Metro serialises a whole-bundle source map (standalone `.map` requests, the map emitted alongside a `.bundle`, and `metro build` output), not just on the dedicated `.map` route.

The tradeoffs are:

  • Compatibility with consumers of the map (so this will only be enabled in a breaking change). Browsers, including Chrome DevTools (CDT) and React Native DevTools (RNDT), and Metro's own `Consumer`/`SourceMetadataMapConsumer` all support it.
  • A slight increase in `.map` size over the wire, because some data is repeated between `sections`. The increase turns out to be very modest: ~3.4% for a large bundle.
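
On the compatibility point, the extra work a consumer has to do for an index map is small: find the section covering a generated position, then shift the position into that section's coordinates. The sketch below illustrates that step only. `findSection` is a hypothetical helper, not the API of Metro's `Consumer`, and it assumes sections are sorted by offset and positions are zero-based.

```javascript
// Hypothetical helper: find the section of an index map that covers a
// zero-based generated position and rebase the position onto that section.
function findSection(indexMap, line, column) {
  let lo = 0;
  let hi = indexMap.sections.length - 1;
  let found = -1;
  // Binary search for the last section starting at or before the position.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const {offset} = indexMap.sections[mid];
    const startsAtOrBefore =
      offset.line < line || (offset.line === line && offset.column <= column);
    if (startsAtOrBefore) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  if (found === -1) {
    return null;
  }
  const {offset, map} = indexMap.sections[found];
  return {
    map,
    line: line - offset.line,
    // The column offset only applies on the section's first line.
    column: line === offset.line ? column - offset.column : column,
  };
}

// Illustrative index map with two sections.
const exampleIndexMap = {
  version: 3,
  sections: [
    {offset: {line: 0, column: 0}, map: {version: 3, sources: ['a.js'], names: [], mappings: 'AAAA;AACA'}},
    {offset: {line: 2, column: 0}, map: {version: 3, sources: ['c.js'], names: [], mappings: 'AAAA'}},
  ],
};

const hit = findSection(exampleIndexMap, 2, 4);
console.log(hit.map.sources[0], hit.line, hit.column); // c.js 0 4
```

The returned `map` is an ordinary flat source map, so the rest of the lookup can proceed as for a non-indexed map.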

## E2E benchmark: cold FBiOS `.bundle` then `.map` (with/without worker threads)

Each matrix entry was run 8 times, with runs interleaved.

Simulating real-world use, a `.bundle` is requested first, followed by a `.map`, so that the `.map` request uses the already-in-memory `Graph` and the time to satisfy it is largely CPU-bound serialisation.

Child-process workers (Metro default):

| metric | base | compact_flat | compact_indexed |
|---|---|---|---|
| heap used, post-build (MB) | 1640 | **795 (−51.5%)** | **795 (−51.5%)** |
| heap growth during build (MB) | 1589 | 744 (−56.7%) | 530 (−62.0%) |
| main-isolate RSS (MB) | 1837 | 936 (−48.9%) | 935 (−49.0%) |
| process-tree RSS (MB) | 15646 | 14580 (−6.7%) | 14727 (−7.6%) |
| build CPU (s) | 607 | 606 (n.s.) | 604 (n.s.) |
| map serialize, wall (s) | 12.0 | **13.9 (+16.2%)** | **6.0 (−49.9%)** |
| map size (MB) | 154.9 | 154.9 | 160.1 (+3.4%) |

(NB: the same benchmarks were repeated under `unstable_workerThreads`; the findings were essentially the same, see base diff.)

Takeaways:

  • The memory win is the headline, and it is identical in both worker modes: compact storage cuts the retained module graph (`heapUsed`) by ~51.5–51.8% (≈1.64 GB → ≈0.79 GB), with extremely tight confidence intervals. `compact_flat` and `compact_indexed` deliver it equally, because the saving comes from how maps are stored.
  • `.map` serialisation is where flat and indexed differ, and the result is independent of worker mode (it is main-process work): `compact_flat` costs +16–22% wall time (decode VLQ → re-encode; output byte-identical to base), while `compact_indexed` is ~2× faster than base (−48–50%; VLQ passes through verbatim) at the cost of a +3.4% larger indexed-format `.map`.
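
As a worked check of how the relative figures are read, the following sketch recomputes two of the percentages from the rounded table values. `pctChange` is a hypothetical helper. The table's percentages were presumably computed from unrounded per-run data, so recomputing from the rounded cells will not reproduce every figure exactly.

```javascript
// Relative change of `value` against `base`, in percent.
function pctChange(base, value) {
  return ((value - base) / base) * 100;
}

// Map size: 154.9 MB (base) vs 160.1 MB (compact_indexed).
const mapSizeDelta = pctChange(154.9, 160.1).toFixed(1);
// Heap used post-build: 1640 MB (base) vs 795 MB (compact configs).
const heapDelta = pctChange(1640, 795).toFixed(1);

console.log(mapSizeDelta); // 3.4
console.log(heapDelta); // -51.5
```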

## Changelog

 - **[Experimental]** `serializer.unstable_allowIndexMap` in combination with `transformer.unstable_compactSourceMaps` builds source maps much more efficiently

Reviewed By: huntie

Differential Revision: D108384690

robhogan and others added 2 commits June 25, 2026 06:08
Summary:
Scripts and findings for profiling Metro's memory and CPU during bundling, and an
end-to-end benchmark of the compact VLQ source-map work stacked on top.

**Methodology:**
- Start Metro with `NODE_ARGS="--expose-gc --inspect=9230" DEV=1 js1 run --prefetch=false`
- WildeBundle URL: `GET http://localhost:8081/xplat/js/RKJSModules/EntryPoints/WildeBundle.bundle?platform=ios&dev=true&app=com.facebook.Wilde`
- RSS profiling via /proc, heap snapshots via Chrome DevTools Protocol
- Graph freed via DELETE to the bundle URL (same as fill-http-cache)
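
For readers unfamiliar with RSS sampling via /proc, the sketch below shows one way to read a process's resident set size on Linux by parsing the `VmRSS` line of `/proc/<pid>/status`. It is an illustration in JavaScript, not the contents of the scripts in this commit. `parseVmRssKb` is a hypothetical helper and the sample text is made up.

```javascript
// Hypothetical helper: extract the resident set size (in kB) from the text
// of /proc/<pid>/status on Linux. Returns null if the line is missing.
function parseVmRssKb(statusText) {
  const match = /^VmRSS:\s+(\d+)\s+kB$/m.exec(statusText);
  return match ? Number(match[1]) : null;
}

// Made-up sample in the format the kernel uses for this file.
const sampleStatus =
  'Name:\tnode\nVmPeak:\t 2500000 kB\nVmRSS:\t 1881088 kB\nThreads:\t11\n';

const rssKb = parseVmRssKb(sampleStatus);
console.log(rssKb); // 1881088
```

On a real system the input would come from something like `fs.readFileSync('/proc/' + pid + '/status', 'utf8')`. A process-tree figure would need the same read repeated for each child process and summed.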

**Scripts added:**
- `fb-metro-cli/memory-investigation/heap-profile.js` — Automated CDP-based profiler: captures 3 heap snapshots (baseline, post-build, post-delete) and compares them
- `fb-metro-cli/memory-investigation/heap-compare.js` — Standalone snapshot comparator with streaming parser for multi-GB .heapsnapshot files
- `fb-metro-cli/memory-investigation/heap-injector.js` — Optional in-process module exposing /memory, /gc, /snapshot HTTP endpoints
- `metro/scripts/profile-memory.sh` — Quick RSS-only profiling via /proc
- `fb-metro-cli/memory-investigation/compact-bench-measure.js` — One measurement cycle: builds WildeBundle, then requests WildeBundle.map, recording memory (RSS/heap) + build CPU + .map serialize CPU via CDP
- `fb-metro-cli/memory-investigation/run-compact-bench.sh` — Orchestrator: fresh Metro per repeat across three configs (base / compact_flat / compact_indexed), cold or warm cache
- `fb-metro-cli/memory-investigation/compact-bench-stats.js` — Welch t-test analysis between any two configs
- `fb-metro-cli/memory-investigation/README.md`, `compact-sourcemaps-benchmark-results.md` — Full writeup of methodology and results
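
For context on the Welch t-test mentioned above, here is a minimal sketch of the statistic and its Welch–Satterthwaite degrees of freedom. It is not the code in `compact-bench-stats.js`. The function names are hypothetical and the input samples are toy numbers, not benchmark data. Turning `t` and `df` into a p-value additionally needs a Student's t CDF, which is omitted here.

```javascript
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs) {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic and Welch-Satterthwaite degrees of freedom for two
// independent samples with possibly unequal variances.
function welch(a, b) {
  const va = sampleVariance(a) / a.length;
  const vb = sampleVariance(b) / b.length;
  const t = (mean(a) - mean(b)) / Math.sqrt(va + vb);
  const df =
    (va + vb) ** 2 /
    (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return {t, df};
}

// Toy samples chosen so the result is easy to verify by hand.
const result = welch([1, 2, 3], [2, 4, 6]);
console.log(result.t.toFixed(3), result.df.toFixed(3)); // -1.549 2.941
```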

**Baseline results (WildeBundle, June 2025):**
- Startup: 819 MB RSS / 426 MB heap used
- Post-build: 2,338 MB RSS / 1,549 MB heap used (+1,122 MB heap)
- Post-delete: 507 MB heap used (DELETE frees 93% of build growth)
- Arrays dominate: 10M Array objects + backing stores = 858 MB (77% of growth)
- Source maps stored as decoded number-tuple arrays are the primary consumer:
  ~678 MB, 60% of build growth (9,866,476 tuples across 16,562 modules)

**Compact source maps — end-to-end benchmark (n=3, WildeBundle):**
Three configs: `base` (decoded tuples), `compact_flat` (VLQ storage, flat .map),
`compact_indexed` (VLQ storage, indexed passthrough .map).
- Memory (both compact configs): heap −51% cold / −53% warm; RSS −48%
  (1654→810 MB heap cold; all Welch p < 1e-5).
- Build CPU: unchanged cold; ~20% faster warm with compact storage.
- Serialize CPU (`.map` request): `compact_flat` +18% vs base (decode + re-encode),
  `compact_indexed` −49% vs base (passthrough). Flat .map is byte-identical to base;
  indexed .map is +3.4% larger. Bundle output byte-identical across all configs.
Full tables in `compact-sourcemaps-benchmark-results.md`.

Differential Revision: D107879392
@meta-cla Bot added the CLA Signed label on Jun 26, 2026
@meta-codesync

meta-codesync Bot commented Jun 26, 2026

Contributor

@robhogan has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108384690.

Labels: CLA Signed, meta-exported
