A Query.Farm VGI worker for DuckDB.
vgi-vision · a Query.Farm VGI worker · powered by ONNX Runtime
A VGI worker that brings image classification to
DuckDB/SQL. Point it at a column of image bytes (BLOB) — or a file path — and
get ImageNet-1k predictions back: a single top label per row, or the top-K
labels with confidence scores, as plain SQL scalar and table functions.
Inference runs out-of-process on MobileNetV2 via
ONNX Runtime (no PyTorch); the model and labels
download on first use and cache locally. It is built to be
commercial-marketplace-safe — deliberately no AGPL-licensed
Ultralytics/YOLO — so every dependency is permissively licensed.
INSTALL vgi FROM community; LOAD vgi;
ATTACH 'vision' (TYPE vgi, LOCATION 'uv run vision_worker.py');
SELECT id, vision.top_label(image) FROM photos;
SELECT * FROM vision.classify((SELECT image FROM photos LIMIT 1)); -- top 5
SELECT * FROM vision.classify((SELECT image FROM photos LIMIT 1), 10); -- top 10
SELECT * FROM vision.image_classes(); -- 1000 labelsInference runs out-of-process on a permissively-licensed ONNX model via onnxruntime. The model + labels are downloaded on first use and cached; nothing large is committed to the repo.
This worker is built specifically to be safe for a commercial marketplace. It deliberately does not use Ultralytics / YOLOv8, whose code is AGPL-3.0 — unacceptable for proprietary/commercial distribution (the same reason
vgi-pdfrejected PyMuPDF). Everything below is permissively licensed.
| Component | What | License |
|---|---|---|
| Classifier weights | MobileNetV2 (opset 12), trained on ImageNet-1k (1000 classes). Sourced from the ONNX Model Zoo, mirrored on Hugging Face as onnxmodelzoo/mobilenetv2-12. |
Apache-2.0 |
| Inference runtime | onnxruntime | MIT |
| Image decoding | Pillow | MIT-CMU (HPND) |
| Numerics | NumPy | BSD-3-Clause |
| Labels | ImageNet-1k synset list (synset.txt) |
Apache-2.0 (ONNX Model Zoo) |
| This worker | vgi-vision | MIT (see LICENSE) |
The exact model file is mobilenetv2-12.onnx (~13.3 MB). It is downloaded on
first use to ~/.cache/vgi-vision/ (override with VGI_VISION_CACHE_DIR, or
point VGI_VISION_MODEL / VGI_VISION_LABELS at local copies). Pre-fetch it
with make fetch-model. The weights are never committed (gitignored).
Object detection (detect_objects, count_objects, has_object) is not
shipped in v1. The popular detection models (YOLOv5/v8) carry AGPL-3.0 or
otherwise non-permissive terms, and the clearly-permissive alternatives
(e.g. an SSD-MobileNet exported from the Apache-2.0 TensorFlow Object Detection
API) lack a single stable, well-labelled ONNX artifact we were comfortable
bundling for a commercial product. Rather than ship a model with murky
provenance, v1 ships classification only. Detection can be added later behind
the same classify-style surface once a vetted Apache/BSD/MIT detection model +
labels are pinned.
| Function | Kind | Returns | Notes |
|---|---|---|---|
classify(image[, top_k]) |
table | (label VARCHAR, confidence DOUBLE) |
Top-k predictions (default top_k=5), ordered by confidence desc. image is a BLOB; a classify(path[, top_k]) VARCHAR overload reads a file off disk. |
top_label(image) |
scalar | VARCHAR |
The #1 predicted label. BLOB input; top_label(path) VARCHAR overload reads a file. |
image_classes() |
table | (idx INT, label VARCHAR) |
The model's full 1000-class ImageNet label set. |
confidence values are softmax probabilities in [0, 1].
Images are untrusted. The worker never crashes on bad input:
NULLimage →NULL(top_label) / no rows (classify).- Malformed / non-image / empty bytes →
NULL/ no rows. - Over-large images (a decompression-bomb guard caps decode at 64 MP) →
NULL/ no rows. - A genuine model-load failure (offline, disk) raises a clear, actionable error telling you how to fetch the model — it is not silently swallowed per row.
The ONNX session + labels load once per worker process and are cached for its
lifetime, so the cost is amortised across every row of every query. The worker
also warms the model at startup (vision_worker.py's run() calls
model.warm_up()), so the first query of an ATTACH is fast and the end-to-end
SQL suite stays deterministic under load.
export PATH="$HOME/.local/bin:$PATH"
uv sync --extra dev
make fetch-model # download + cache the model and labels (one-off)
uv run --no-sync pytest -q # unit + integration tests
make test-sql # DuckDB end-to-end (.test) over a tiny generated PNG
uv run --no-sync ruff check . && uv run --no-sync mypy vgi_vision/Model-dependent tests skip cleanly when the model isn't present (e.g. offline),
so a bare checkout still goes green; they run for real once make fetch-model
(or any first query) has populated the cache.
vision_worker.py # worker entry point (PEP 723 header) + startup warm-up
vgi_vision/
model.py # pure inference: download/cache, warm-up, preprocess, classify
scalars.py # top_label(image) / top_label(path)
tables.py # classify(...), image_classes()
schema_utils.py # field() column-comment helper
tests/ # pytest: pure logic + in-process scalar/table + Client E2E
test/sql/ # DuckDB sqllogictest .test files
Written by Query.Farm.
Copyright 2026 Query Farm LLC - https://query.farm
