The Official Model Zoo for Tensorbit Labs.
This repository serves as a centralized library for pre-optimized neural network, large language model, and vision transformer binaries. Each model in this collection has been processed through the full Tensorbit P-D-Q pipeline (Pruning, Distillation, and Quantization) to ensure maximum performance on edge hardware without sacrificing reasoning capabilities.
Standard open-source models are often too "heavy" for on-device deployment. Tensorbit Labs transforms them into lightweight, efficient variants through a specialized pipeline that combines pruning, distillation, and quantization, reducing model size and latency while maintaining high accuracy.
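To make the "Q" step concrete, here is a minimal, illustrative sketch of symmetric INT4 weight quantization in pure Python. This is not the Tensorbit implementation; the function names and the per-tensor scaling scheme are assumptions chosen for clarity.

```python
# Illustrative sketch of symmetric INT4 quantization (the "Q" in P-D-Q).
# Hypothetical helper names; not the Tensorbit API.

def quantize_int4(weights):
    """Map float weights to signed 4-bit integers in [-8, 7] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs else 1.0   # 7 = largest positive INT4 value
    return [max(-8, min(7, round(w / scale))) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

w = [0.82, -0.13, 0.05, -0.91]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)   # each entry within scale/2 of the original
```

The round-trip error per weight is bounded by half the scale, which is why per-tensor (or finer-grained) scaling matters for preserving accuracy at 4-bit precision.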
- Memory Efficiency: Substantially reduced VRAM footprint (see performance_comparison.csv for per-model figures).
- Inference Speed: Optimized for tensorbit-run execution on NPU/ARM architectures.
- Verified Benchmarks: Every binary is benchmarked via tensorbit-bench to ensure accuracy parity with the original models.
See performance_comparison.csv for raw PyTorch vs. Tensorbit-optimized comparisons for every model in the zoo.
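A quick way to explore the comparison file is the stdlib `csv` module. The column names and values below are synthetic placeholders for illustration only, not real benchmark results; check the actual header of performance_comparison.csv.

```python
# Hypothetical sketch of parsing a performance-comparison CSV.
# Column names and numbers are illustrative assumptions, not real data.
import csv
import io

sample = """model,pytorch_latency_ms,tensorbit_latency_ms
tb-llama-4-8b,210.0,74.0
tb-vit-large,35.0,12.0
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Speedup = raw PyTorch latency / optimized latency, per model
speedups = {r["model"]: float(r["pytorch_latency_ms"]) / float(r["tensorbit_latency_ms"])
            for r in rows}
```

For the real file, replace `io.StringIO(sample)` with `open("performance_comparison.csv")`.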
| Model Name | Base Architecture | Sparsity | Precision | Target Hardware |
|---|---|---|---|---|
| tb-llama-4-8b | Llama 4 | 45% | INT4 | Apple M4 / Snapdragon G3 |
| tb-mistral-next | Mistral | 30% | INT4 | ARM Cortex-A78 |
| tb-vit-large | ViT | 50% | INT8 | Industrial NPU |
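The Sparsity column above can be read as the fraction of weights zeroed out during pruning. A minimal magnitude-pruning sketch in pure Python shows the idea; the actual Tensorbit pruning pass is not documented here, and this helper is an assumption for illustration.

```python
# Illustrative magnitude pruning to a target sparsity: zero the
# smallest-|w| fraction of weights. Hypothetical helper, not the real pipeline.

def prune_to_sparsity(weights, sparsity):
    """Zero the smallest-magnitude `sparsity` fraction of `weights`."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune get dropped
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.01, 0.4, 0.02, -0.7, 0.03, 0.5, -0.1, 0.2, 0.06]
pruned = prune_to_sparsity(w, 0.5)   # 50% sparsity, as listed for tb-vit-large
```

In practice, pruning is applied per layer with structured patterns that hardware can exploit; global unstructured pruning as shown here is only the simplest variant.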
These models are stored as .tb binaries designed to be loaded directly into the tensorbit-run engine.
```shell
# Example: Running a Tensorbit model locally
./tensorbit-run --model ./models/tb-llama-4-8b.tb --prompt "Explain quantum gravity."
```

We focus on optimizing high-impact, open-weights models. If you would like to request a specific model optimization or contribute a "Tensorbit-ified" version of your own architecture, please open an issue or pull request with the label `model-request`.
The optimized model weights (.tb binaries) are provided under the Apache License 2.0. Please refer to the original model creators (e.g., Meta, Mistral AI, etc.) for the licenses covering the underlying architectures.