feat(constants)!: switch URLs to v0.9.0 layout + add MODEL_REGISTRY #1148
msluszniak wants to merge 7 commits into
URL refresh
-----------
Every URL constant in the library now points at the restructured HF
layout under `resolve/v0.9.0`. File names follow
`<model>_<size>_<backend>_<precision>.pte`, and files sit under per-size
and per-backend directories. Affects:
- modelUrls.ts: 170 URL refs rewritten to new paths. The 8da4w-typo
file `lfm2_5_350m_xnnpack_8w4da.pte` is corrected to `..._8da4w.pte`.
- ocr/models.ts: CRAFT detector URL + CRNN per-language URL template
switch to the new `<lang>/xnnpack/crnn_<lang>_xnnpack_fp32.pte` shape.
- tts/models.ts: Kokoro consts re-rooted to
`<size>/xnnpack/kokoro_<size>_<component>_xnnpack_fp32.pte`.
- tts/voices.ts: voices/ and phonemizer/ asset paths kept in place;
only the `${VERSION_TAG}` value bumps.
- versions.ts: VERSION_TAG -> resolve/v0.9.0. NEXT_VERSION_TAG
collapsed into VERSION_TAG. PREVIOUS_VERSION_TAG=resolve/v0.8.0
retained for the two @deprecated Llama QLoRA aliases (LLAMA3_2_*_QLORA)
that continue to resolve their v0.8.0 file. SpinQuant is the canonical
quantized Llama 3.2 variant going forward.
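The naming convention above can be sketched as a small path builder. This is only an illustration, not the library's code: `PteParts`, `pteFileName`, and `ptePath` are hypothetical names, and the sketch covers just the plain `<size>/<backend>/<model>_<size>_<backend>_<precision>.pte` case (no component or language segment). The resulting path is relative to the version tag.

```typescript
// Hypothetical description of one .pte file in the new layout.
interface PteParts {
  model: string;     // e.g. 'lfm2_5'
  size: string;      // e.g. '350m'
  backend: string;   // e.g. 'xnnpack'
  precision: string; // e.g. '8da4w'
}

// File name: <model>_<size>_<backend>_<precision>.pte
function pteFileName({ model, size, backend, precision }: PteParts): string {
  return `${model}_${size}_${backend}_${precision}.pte`;
}

// Path relative to the version tag: per-size directory, then per-backend directory.
function ptePath(parts: PteParts): string {
  return `${parts.size}/${parts.backend}/${pteFileName(parts)}`;
}

const examplePath = ptePath({
  model: 'lfm2_5',
  size: '350m',
  backend: 'xnnpack',
  precision: '8da4w',
});
// examplePath === '350m/xnnpack/lfm2_5_350m_xnnpack_8da4w.pte'
```

Building names from parts rather than typing them by hand is one way to avoid transpositions such as `8w4da` for `8da4w`.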
MODEL_REGISTRY
--------------
Adds `constants/modelRegistry.ts` — a typed accessor grouped by
capability (LLM, VLM, CLASSIFICATION, OBJECT_DETECTION,
SEMANTIC_SEGMENTATION, INSTANCE_SEGMENTATION, STYLE_TRANSFER,
SPEECH_TO_TEXT, TEXT_EMBEDDING, IMAGE_EMBEDDING, IMAGE_GENERATION,
VAD). Each entry is callable with `{ quant, backend }`:
    MODEL_REGISTRY.LLM.LLAMA3_2_3B                  // default (base)
    MODEL_REGISTRY.LLM.LLAMA3_2_3B({ quant: true }) // SpinQuant
When read as a value (object access), an entry returns the default
config; when called, it resolves the requested variant. `backend` is
accepted in the signature for forward compatibility, but the library
still picks the backend via `Platform.OS` at module load.
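One way to get an entry that is both a value and a callable is to copy the default config's fields onto a function object. The sketch below assumes that approach; `makeAccessor`, the config shape, and the model names and file names are hypothetical, and `backend` is left out for brevity. It follows the defaults described in this commit message (bare access and `quant` omitted resolve to the base variant).

```typescript
type ModelOpts = { quant?: boolean };

// A config that is also callable: the fields of C plus a call signature.
type Accessor<C extends { modelName: string }> = C & ((opts?: ModelOpts) => C);

// Hypothetical factory: `base` is the default, `quantized` is returned for { quant: true }.
function makeAccessor<C extends { modelName: string }>(base: C, quantized: C): Accessor<C> {
  const resolve = (opts?: ModelOpts): C => (opts?.quant ? quantized : base);
  // Copy the default config's fields onto the function object itself.
  return Object.assign(resolve, base);
}

// Placeholder values, not the library's real config.
const LLAMA3_2_3B = makeAccessor(
  { modelName: 'llama-3.2-3b', modelSource: 'placeholder-base.pte' },
  { modelName: 'llama-3.2-3b-spinquant', modelSource: 'placeholder-spinquant.pte' },
);

const asValue = LLAMA3_2_3B.modelName;                     // read as a value
const asCall = LLAMA3_2_3B({ quant: true }).modelName;     // called for a variant
```

Note that `typeof LLAMA3_2_3B` is `'function'` in this sketch, which matters for any API that treats function arguments specially.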
The previous flat `MODEL_REGISTRY = { ALL_MODELS: {...} }` export in
modelUrls.ts is removed; its internal-only consumer (the urlToModelName
lookup) now reads from a private `_ALL_MODELS` array.
Resolves the JS-API side of the HF naming convention migration.
The umbrella lfm-2.5 HF repo hosts two distinct models — the text LLM
(1.2B + 350M) and the vision-language model (1.6B + 450M). The
migrator collapsed the VL size tokens (`vl_1_6b`, `vl_450m`) to bare
numeric sizes, making VL 1.6B indistinguishable from a hypothetical
text 1.6B variant. It also left the four per-variant tokenizers at
their legacy `lfm2.5-*/` paths instead of moving them next to the new
backend dirs.
HF state (separate commits on the repo):
- VL .pte files renamed to `vl_<size>/xnnpack/lfm_2_5_vl_<size>_*.pte`
- tokenizers moved into `<size>/` and `vl_<size>/` next to each cell
- legacy `lfm2.5-*-instruct/` and `lfm2.5-VL-*/` dirs cleaned out
- config.json files refreshed (vl_* configs now carry
`model: lfm_2_5_vl` + `capabilities: [vision, text-generation]`)
This commit refreshes the matching URL constants in modelUrls.ts so
every LFM2.5 model points at its new HF path.
Covers the new grouped MODEL_REGISTRY shape (capability groups with
callable accessors), the `{ quant, backend }` options, default vs
quantized resolution, the still-supported direct-import pattern, and a
short migration note from the previous flat `ALL_MODELS` dict.
22 files updated across apps/llm, apps/computer-vision, apps/speech,
apps/text-embeddings, and apps/bare-rn. Each flat model-constant import
is replaced with the corresponding `MODEL_REGISTRY.<GROUP>.<NAME>` (or
`(...)({ quant: true })` for quantized variants). Llama QLoRA aliases
remain imported under their flat names — they're deprecated and not
part of the registry.
Net effect: -242 / +158 lines (collapsed imports, terser callsites).
Apps now serve as the canonical usage example for the typed registry.
…ctions
`useState` auto-invokes function-typed initial values as lazy initializers, so passing a MODEL_REGISTRY accessor unwraps it into a plain config, breaking reference equality against the accessor stored in MODELS. Compare by `modelName` (falling back to `===` for picker users without one, e.g. VoiceConfig).
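The effect can be reproduced without React. In the sketch below, `initialState` is a stand-in for the documented `useState` rule (a function passed as the initial value is called once and its result becomes the state); it is not React's implementation. The accessor, its model name, and the `Config` shape are hypothetical, and the comparison helper borrows the `sameValue` name from the review thread with an assumed signature.

```typescript
type Config = { modelName?: string };

// Stand-in for useState's lazy-initializer rule: functions are invoked, other values kept.
function initialState<T>(initial: T | (() => T)): T {
  return typeof initial === 'function' ? (initial as () => T)() : initial;
}

// Hypothetical accessor: a function that also carries the default config's fields.
const defaultConfig = { modelName: 'llama-3.2-3b' };
const accessor = Object.assign(() => defaultConfig, defaultConfig);

// The accessor is a function, so it gets called and the state holds the plain config.
const stored: Config = initialState<Config>(accessor);

// Compare by modelName when both sides have one; otherwise fall back to identity.
function sameValue(a: Config, b: Config): boolean {
  const am = a.modelName;
  const bm = b.modelName;
  if (typeof am === 'string' && typeof bm === 'string') return am === bm;
  return a === b;
}

const identical = stored === accessor;          // false: plain object vs function
const equivalent = sameValue(stored, accessor); // true: same modelName
```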
Each accessor's `backend` parameter is now typed to exactly the backends the model ships with; passing an unsupported one is a compile-time error. `Platform.OS` still picks the default when `backend` is omitted. The per-backend (quant × backend) variant matrix lives in modelRegistry.ts so modelUrls.ts stays flat-per-model. Unifies DISTILUSE_BASE_MULTILINGUAL_CASED_V2 to one accessor with xnnpack + coreml; the _8DA4W and _COREML named constants stay as deprecated aliases.
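A minimal sketch of how a per-model backend type can turn an unsupported backend into a compile-time error, assuming a generic parameter that narrows the accepted `backend` values. `makeAccessor`, the config shape, and the model names are hypothetical; `platformDefault` stands in for whatever `Platform.OS` would select.

```typescript
type Backend = 'xnnpack' | 'coreml';
type ModelOpts<B extends Backend> = { quant?: boolean; backend?: B };
type Config = { modelName: string; backend: Backend };

// B lists exactly the backends this model ships with.
function makeAccessor<B extends Backend>(modelName: string, platformDefault: B) {
  return (opts?: ModelOpts<B>): Config => ({
    modelName,
    // Fall back to the platform default when no backend is requested.
    backend: opts?.backend ?? platformDefault,
  });
}

// Hypothetical models: one xnnpack-only, one shipping both backends.
const xnnpackOnly = makeAccessor<'xnnpack'>('hypothetical-llm', 'xnnpack');
const dualBackend = makeAccessor<Backend>('hypothetical-embedding', 'xnnpack');

const fallback = xnnpackOnly().backend;                      // 'xnnpack'
const explicit = dualBackend({ backend: 'coreml' }).backend; // 'coreml'

// Rejected by the type checker, since 'coreml' is not assignable to 'xnnpack':
// xnnpackOnly({ backend: 'coreml' });
```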
…ariant
Bare accessors (and undefined `quant`) now resolve to the quantized
variant when one is published; pass `{ quant: false }` to opt out. Docs
and example apps are updated to match — dual pickers keep both rows by
making the FP32 entry the explicit opt-out.
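The resolution rule in this commit can be sketched as follows. `Variants` and `resolveVariant` are hypothetical names and the string values are placeholders; the sketch only illustrates "quantized by default when published, `{ quant: false }` to opt out".

```typescript
// Hypothetical variant table: FP32 always exists, a quantized build may not.
type Variants<C> = { fp32: C; quantized?: C };

function resolveVariant<C>(variants: Variants<C>, opts?: { quant?: boolean }): C {
  // An omitted or undefined `quant` means "quantized if one is published".
  const wantQuant = opts?.quant ?? true;
  return wantQuant && variants.quantized !== undefined ? variants.quantized : variants.fp32;
}

const withQuant: Variants<string> = { fp32: 'fp32-config', quantized: 'quantized-config' };
const fp32Only: Variants<string> = { fp32: 'fp32-config' };

const bare = resolveVariant(withQuant);                     // 'quantized-config'
const optOut = resolveVariant(withQuant, { quant: false }); // 'fp32-config'
const noQuantPublished = resolveVariant(fp32Only);          // 'fp32-config'
```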
      if (typeof am === 'string' && typeof bm === 'string') return am === bm;
      return a === b;
    }
This code is duplicated in several files. It should probably be factored out to some common utilities file. However, in general I would be in favour of not using the model accessor as both a value and a function.
    /**
     * An accessor that behaves as the platform-default config when read as a value
     * (e.g. `MODEL_REGISTRY.LLM.LLAMA3_2_3B.modelName`) and as a function when
     * called (e.g. `MODEL_REGISTRY.LLM.LLAMA3_2_3B({ quant: false })`).
     */
    type Accessor<
      C extends { modelName: string },
      B extends Backend = Backend,
    > = C & ((opts?: ModelOpts<B>) => C);
I don't think we should make the accessor behave like both a value and a function and just make the user explicitly call e.g. MODEL_REGISTRY.LLM.LLAMA3_2_3B() for default config (perhaps with some stylistic changes like changing the names to lowercase to indicate these are getters and not constants; the use pattern in the user code would be something like const LLAMA3_2_3B = models.llm.llama3_2_3b()). I feel the current approach might generate some problems e.g. when comparing models as in example apps using sameValue workaround.
Description
Switches every URL constant in the library to the restructured HF layout
under `resolve/v0.9.0` and adds the typed `MODEL_REGISTRY` accessor.
URL refresh
All URLs now follow `<model>_<size>_<backend>_<precision>.pte`; files
sit under per-size and per-backend directories on HF:
- `modelUrls.ts`: 170 URL refs rewritten. The `lfm2_5_350m_xnnpack_8w4da.pte` typo is corrected to `_8da4w.pte`.
- `ocr/models.ts`: CRAFT detector URL + CRNN per-language URL switch to the new `<lang>/xnnpack/crnn_<lang>_xnnpack_fp32.pte` shape.
- `tts/models.ts`: Kokoro consts re-rooted to `<size>/xnnpack/kokoro_<size>_<component>_xnnpack_fp32.pte`.
- `tts/voices.ts`: `voices/` and `phonemizer/` asset paths kept in place; only the tag value bumps.
- `versions.ts`: `VERSION_TAG → resolve/v0.9.0`. `NEXT_VERSION_TAG` removed. `PREVIOUS_VERSION_TAG = resolve/v0.8.0` retained for the `@deprecated` Llama QLoRA aliases.
MODEL_REGISTRY
New `constants/modelRegistry.ts` exports a typed accessor grouped by
capability (LLM / VLM / CLASSIFICATION / OBJECT_DETECTION /
SEMANTIC_SEGMENTATION / INSTANCE_SEGMENTATION / STYLE_TRANSFER /
SPEECH_TO_TEXT / TEXT_EMBEDDING / IMAGE_EMBEDDING / IMAGE_GENERATION /
VAD). Each entry is callable with `{ quant, backend }`.
Object access returns the default config; calling resolves the
requested variant. `backend` is accepted in the type signature for
forward-compat; the library still picks via `Platform.OS` at module
load. Per-backend selection lands in a follow-up.
Deprecations
`LLAMA3_2_3B_QLORA`, `LLAMA3_2_1B_QLORA`: `@deprecated`; the .pte
files stay at `v0.8.0`, and the constants still resolve those URLs.
Use `LLAMA3_2_*_SPINQUANT` going forward.
Introduces a breaking change?
URL paths under `${VERSION_TAG}` change: any code that hardcoded
`resolve/v0.8.0` URLs through the constants keeps working only if it
read the constants at runtime (the constant values themselves are
updated). Historical tags continue to resolve old paths, so apps pinned
to a previous library version are unaffected.
The flat `MODEL_REGISTRY = { ALL_MODELS: {...} }` export in
`modelUrls.ts` is removed; the new `MODEL_REGISTRY` from
`constants/modelRegistry.ts` is the replacement. The internal
URL→name lookup (`getModelNameForUrl`) is preserved.
Type of change
Tested on
`yarn typecheck` clean across the monorepo. Runtime behaviour validated
against the migrated HF state (every URL resolves at v0.9.0).
Testing instructions
In application code:
Related issues
#431
#612
Checklist