leps_localizer: --ios-bundle should emit a runtime-loadable ZIP (compiled .mlmodelc, assetID-wrapped) #77

@mihow

What's wrong

research/leps_localizer/scripts/export_classifier.py --ios-bundle currently emits a staging directory containing:

<model-id>.mlpackage/
<model-id>-category-map.json
<model-id>.model-info.json

The downstream consumer (LepsAI iOS, mihow/LepsAI) downloads ZIPs from s3://ami-models/mobile/<assetID>.zip at runtime via its AssetManager and tries to load <assetID>.mlmodelc directly with MLModel(contentsOf:). That initializer expects a compiled .mlmodelc, not a raw .mlpackage, and the compiled form is normally produced ahead of time by Xcode's build step or xcrun coremlc compile. So shipping the staging dir as-is silently breaks: the ZIP extracts cleanly, but the model file the app looks up is missing.

This was hit during the 2026-05-08 ship of global-butterflies-resnet50-512 to LepsAI. Worked around by manually compiling on a macOS VM (xcrun coremlc compile in.mlpackage outdir/) and re-zipping.

What the consumer expects

Match the working na-butterflies-v3.zip layout exactly. LepsAI/Services/AssetDownloader.swift::flattenIfNeeded looks for a single top-level dir matching assetID:

<assetID>/
  <assetID>.mlmodelc/
    weights/weight.bin
    coremldata.bin
    metadata.json
    model.mil
    analytics/coremldata.bin
  <assetID>-category-map.json
  <assetID>.model-info.json
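
A layout check along these lines could catch the mismatch before upload. This is a minimal sketch, not the consumer's actual logic: the function name check_ios_zip_layout is hypothetical, and it assumes that "at least one file under <assetID>/<assetID>.mlmodelc/" is a good enough signal that the compiled model is present.

```python
import zipfile
from pathlib import PurePosixPath


def check_ios_zip_layout(zip_path, asset_id):
    """Return a list of layout problems; an empty list means the ZIP looks right.

    Hypothetical helper: it mirrors the layout described in this issue, not the
    actual checks in AssetDownloader.swift.
    """
    with zipfile.ZipFile(zip_path) as zf:
        # Ignore explicit directory entries; only file paths matter here.
        names = [n for n in zf.namelist() if not n.endswith("/")]

    problems = []

    # The archive should contain exactly one top-level entry, named <assetID>.
    top_levels = {PurePosixPath(n).parts[0] for n in names}
    if top_levels != {asset_id}:
        problems.append(
            f"expected single top-level dir {asset_id!r}, found {sorted(top_levels)}"
        )

    # Assumption: any file under <assetID>.mlmodelc/ means the compiled model exists.
    compiled_prefix = f"{asset_id}/{asset_id}.mlmodelc/"
    if not any(n.startswith(compiled_prefix) for n in names):
        problems.append(f"no files under {compiled_prefix}")

    # The two JSON sidecars sit next to the compiled model.
    for suffix in ("-category-map.json", ".model-info.json"):
        expected = f"{asset_id}/{asset_id}{suffix}"
        if expected not in names:
            problems.append(f"missing {expected}")

    return problems
```

Running it against a ZIP of the raw staging dir would report the missing top-level dir and the missing .mlmodelc, which is the failure described above.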

Suggested fix

Add a follow-on step to --ios-bundle (or a new --ios-zip flag) that:

  1. Compiles the .mlpackage to .mlmodelc via xcrun coremlc compile (must run on macOS; gate behind platform.system() == "Darwin" and surface a clear error otherwise, since the compile is Apple-toolchain-only).
  2. Wraps the three artifacts (.mlmodelc/, -category-map.json, .model-info.json) in a parent dir named <model-id>/.
  3. ZIPs that parent dir to <out_dir>/<model-id>.zip.
  4. Prints the sha256 + size for the catalog entry on the consumer side.

Sketch:

import hashlib, platform, shutil, subprocess, zipfile

if args.ios_bundle and args.ios_zip:
    # coremlc ships only with Apple's toolchain, so fail fast elsewhere.
    if platform.system() != "Darwin":
        raise SystemExit("--ios-zip requires macOS (xcrun coremlc compile)")
    # Compile into a scratch dir; coremlc writes <model-id>.mlmodelc inside it.
    mlmodelc_parent = out_dir / "_compile"
    mlmodelc_parent.mkdir(exist_ok=True)
    subprocess.run(
        ["xcrun", "coremlc", "compile", str(mlpackage_path), str(mlmodelc_parent)],
        check=True,
    )
    # Assemble the <model-id>/ parent dir the consumer expects.
    pkg_dir = out_dir / args.model_id
    if pkg_dir.exists():
        shutil.rmtree(pkg_dir)
    pkg_dir.mkdir()
    shutil.move(str(mlmodelc_parent / f"{args.model_id}.mlmodelc"), str(pkg_dir))
    shutil.copy2(cat_map_path, pkg_dir)
    shutil.copy2(info_path, pkg_dir)
    # Archive with paths relative to out_dir so <model-id>/ is the top-level entry.
    zip_path = out_dir / f"{args.model_id}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(pkg_dir.rglob("*")):
            zf.write(p, p.relative_to(out_dir))
    sha = hashlib.sha256(zip_path.read_bytes()).hexdigest()
    print(f"ios-zip -> {zip_path}  sha256={sha}  size={zip_path.stat().st_size}")
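
The sketch reads the whole ZIP into memory to hash it. If the archives get large, the sha256 + size step could be done in chunks instead. A minimal variant, where the helper name sha256_and_size and the 1 MiB chunk size are assumptions rather than anything in the exporter today:

```python
import hashlib


def sha256_and_size(path, chunk_size=1 << 20):
    """Return (hex digest, byte count) for a file, reading it chunk by chunk.

    Hypothetical helper; chunk_size defaults to 1 MiB, an arbitrary choice.
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size
```

The returned pair would replace the sha and st_size values in the final print of the sketch.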

Alternative (cross-platform): add a sibling shell helper scripts/pack_ios_zip.sh that runs on a macOS VM/host and consumes the staging dir, so the Linux trainer doesn't need to think about platform gates.

Why it's worth doing

The packaging step is currently tribal knowledge — the failure mode (silent on-device skip) is hard to diagnose without inspecting na-butterflies-v3.zip for comparison. Baking it into the exporter avoids the next person re-discovering it, and gives a stable artifact to upload.

References

  • Working pattern: https://object-arbutus.cloud.computecanada.ca/ami-models/mobile/na-butterflies-v3.zip (inspect with `unzip -l`).
  • Consumer code: mihow/LepsAI LepsAI/Services/AssetDownloader.swift (extraction) and LepsAI/Services/AssetManager.swift (lookup).
  • Script that hit the bug: `research/leps_localizer/scripts/export_classifier.py` line ~621 (the `--ios-bundle` block).
