Tensorbit-Labs/tensorbit-models

Tensorbit Models

The Official Model Zoo for Tensorbit Labs.

This repository is a centralized library of pre-optimized binaries for neural networks, large language models, and vision transformers. Each model in this collection has been processed through the full Tensorbit P-D-Q pipeline (Pruning, Distillation, and Quantization) to improve performance on edge hardware while preserving the reasoning capabilities of the original model.

Why Tensorbit Models?

Standard open-source models are often too "heavy" for on-device deployment. Tensorbit Labs converts them into lightweight versions through a specialized pipeline that combines pruning, distillation, and quantization, reducing size and latency while maintaining high accuracy.

  • Memory Efficiency: Up to % reduction in VRAM footprint.
  • Inference Speed: Optimized for tensorbit-run execution on NPU/ARM architectures.
  • Verified Benchmarks: Every binary is benchmarked via tensorbit-bench to ensure accuracy parity with the original models.
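The pruning and quantization stages named above can be illustrated with a minimal, generic sketch. It is not the Tensorbit implementation, which this README does not describe: it assumes unstructured magnitude pruning and symmetric per-tensor INT8 quantization, and it omits distillation because that stage requires a training loop. All function names and the sample weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


# Illustrative weights only, not taken from any model in the zoo.
weights = [0.02, -1.27, 0.40, -0.05, 0.90, 0.01, -0.63, 0.10]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(pruned)
print(q)
```

With 50% sparsity, the four smallest-magnitude weights become zero, and each surviving weight is stored as one signed byte plus a shared scale. The rounding error per weight is at most half the scale.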

Performance Comparison

See performance_comparison.csv for a comparison of raw PyTorch and Tensorbit statistics for every model in the zoo.
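As a rough sketch of how such a comparison file could be summarized, the snippet below computes a per-model speedup with Python's csv module. The column names (model, pytorch_latency_ms, tensorbit_latency_ms) are assumptions, since the schema of performance_comparison.csv is not shown here, and the rows are placeholders rather than real measurements.

```python
import csv
import io

# Placeholder rows, NOT real measurements. The column names are assumptions
# and may not match the actual performance_comparison.csv.
SAMPLE_CSV = """model,pytorch_latency_ms,tensorbit_latency_ms
example-model-a,200.0,50.0
example-model-b,90.0,60.0
"""


def speedups(csv_file):
    """Return {model: baseline latency / optimized latency} for each row."""
    result = {}
    for row in csv.DictReader(csv_file):
        baseline = float(row["pytorch_latency_ms"])
        optimized = float(row["tensorbit_latency_ms"])
        result[row["model"]] = baseline / optimized
    return result


ratios = speedups(io.StringIO(SAMPLE_CSV))
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x faster")
```

To read the real file, pass `open("performance_comparison.csv", newline="")` instead of the in-memory sample and adjust the column names to match its header row.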

Model Catalog

| Model Name | Base Architecture | Sparsity | Precision | Target Hardware |
| --- | --- | --- | --- | --- |
| tb-llama-4-8b | Llama 4 | 45% | INT4 | Apple M4 / Snapdragon G3 |
| tb-mistral-next | Mistral | 30% | INT4 | ARM Cortex-A78 |
| tb-vit-large | ViT | 50% | INT8 | Industrial NPU |
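The Precision column translates into storage cost roughly as follows. This is a back-of-the-envelope sketch that counts dense weight storage only: it ignores quantization scales, file metadata, and any savings from sparse encoding. The 8-billion parameter count is an assumption read from the name tb-llama-4-8b, not a figure stated in this README.

```python
# Bits used to store one weight at each precision.
BITS_PER_WEIGHT = {"FP32": 32, "FP16": 16, "INT8": 8, "INT4": 4}


def dense_weight_bytes(num_params, precision):
    """Bytes needed to store num_params weights densely at a precision."""
    return num_params * BITS_PER_WEIGHT[precision] // 8


params = 8_000_000_000  # assumption: "8b" in the model name means 8e9 parameters
fp16_bytes = dense_weight_bytes(params, "FP16")
int4_bytes = dense_weight_bytes(params, "INT4")

print(f"FP16: {fp16_bytes / 1e9:.1f} GB")
print(f"INT4: {int4_bytes / 1e9:.1f} GB")
print(f"Reduction: {100 * (1 - int4_bytes / fp16_bytes):.0f}%")
```

Under these assumptions, moving from FP16 to INT4 shrinks dense weight storage by a factor of four, before any additional gain from pruning.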

Usage

These models are stored as .tb binaries designed to be loaded directly into the tensorbit-run engine.

# Example: Running a Tensorbit model locally
./tensorbit-run --model ./models/tb-llama-4-8b.tb --prompt "Explain quantum gravity."
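When the engine needs to be driven from a script, the same command can be wrapped with Python's subprocess module. The sketch below uses only the --model and --prompt flags shown above. The helper names are hypothetical, and it assumes that tensorbit-run writes its result to standard output, which this README does not confirm.

```python
import subprocess
from pathlib import Path


def build_command(model_path, prompt, binary="./tensorbit-run"):
    """Assemble the argument list using only the flags shown in this README."""
    return [binary, "--model", str(model_path), "--prompt", prompt]


def run_model(model_path, prompt, binary="./tensorbit-run"):
    """Run the engine and return whatever it prints to standard output."""
    # Fail early with a clear error instead of relying on the engine's own message.
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"model file not found: {model_path}")
    completed = subprocess.run(
        build_command(model_path, prompt, binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout  # assumption: the generated text goes to stdout


command = build_command("./models/tb-llama-4-8b.tb", "Explain quantum gravity.")
print(command)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the prompt contains spaces or quotation marks.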

Contribution & Requests

We focus on optimizing high-impact, open-weights models. If you would like to request a specific model optimization or contribute a "Tensorbit-ified" version of your own architecture, please open an issue or pull request with the label model-request.

License

The optimized weights (.tb model files) are provided under the Apache License 2.0. Please refer to the original model creators (e.g., Meta, Mistral AI) for the licenses that govern the underlying base models.

About

Official library of pre-optimized Tensorbit models. Ready-to-deploy LLMs and Vision Transformers for edge hardware, optimized via the Tensorbit P-D-Q pipeline.
