Merged
2 changes: 1 addition & 1 deletion .github/workflows/doctor-smoke.yml
@@ -83,7 +83,7 @@ jobs:
# exercises the LD_LIBRARY_PATH patch path without pulling tensorrt
# (which has no Mac wheel and is huge on Linux). Doctor should still
# exit 0 with TRT libs ⚠ NOT installed warnings.
-/tmp/tether-doctor-venv/bin/python -m pip install "$(ls dist/tether-*.whl)[onnx]"
+/tmp/tether-doctor-venv/bin/python -m pip install "$(ls dist/fastcrest_tether-*.whl)[onnx]"

- name: Verify tether --version matches pyproject.toml
shell: bash
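The wheel glob in the install step above moves from `tether-*.whl` to `fastcrest_tether-*.whl` (underscore), while the project itself is installed as `fastcrest-tether` (hyphen). This follows from how wheel filenames escape the distribution name. A minimal sketch of that rule, assuming the build backend follows the current packaging specification; the helper name `wheel_name_prefix` is hypothetical and not part of Tether:

```python
import re


def wheel_name_prefix(distribution: str) -> str:
    """Return the name component a wheel filename uses for `distribution`."""
    # Normalise the project name: lowercase, and collapse runs of "-", "_"
    # and "." into a single "-".
    normalized = re.sub(r"[-_.]+", "-", distribution).lower()
    # "-" separates the fields of a wheel filename, so it becomes "_" here.
    return normalized.replace("-", "_")


print(wheel_name_prefix("fastcrest-tether"))
```

This is why a shell glob over `dist/` has to use the underscore form even though `pip install fastcrest-tether` uses the hyphen.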
6 changes: 3 additions & 3 deletions GOALS.yaml
@@ -213,7 +213,7 @@ goals:

- id: fresh-install-verified
status: done
-description: "`pip install 'tether[serve,gpu] @ git+https://...'` succeeds on a fresh python:3.12-slim container and `tether --help` runs. Catches dep resolution breakage, broken editable installs, missing files in the wheel. Regression-check every release."
+description: "`pip install 'fastcrest-tether[serve,gpu] @ git+https://...'` succeeds on a fresh python:3.12-slim container and `tether --help` runs. Catches dep resolution breakage, broken editable installs, missing files in the wheel. Regression-check every release."
check: "test -f tests/test_fresh_install.py && .venv/bin/python -m pytest tests/test_fresh_install.py -q 2>/dev/null"
weight: 8

@@ -416,8 +416,8 @@ goals:
weight: 10

- id: serve-one-line-install
-description: "pip install tether && tether serve <hf-id> works on a Jetson with no Docker, no Triton YAML, no NVIDIA Container Toolkit. The DX win that beats Triton even though Triton wins on raw features. Most customers will tolerate a 1.5x latency hit for a 60s install vs a 4-hour Triton-on-Jetson setup."
-check: "test -f docs/quickstart_jetson.md && grep -q 'pip install tether' README.md 2>/dev/null"
+description: "pip install fastcrest-tether && tether serve <hf-id> works on a Jetson with no Docker, no Triton YAML, no NVIDIA Container Toolkit. The DX win that beats Triton even though Triton wins on raw features. Most customers will tolerate a 1.5x latency hit for a 60s install vs a 4-hour Triton-on-Jetson setup."
+check: "test -f docs/quickstart_jetson.md && grep -q 'pip install fastcrest-tether' README.md 2>/dev/null"
weight: 7

# ── DEFERRED LONG-TERM SERVE (Phase 4-5; do NOT build now) ────────
26 changes: 13 additions & 13 deletions README.md
@@ -2,10 +2,10 @@

> by [FastCrest](https://fastcrest.com) — deployment infrastructure for vision-language-action models.

-[![PyPI](https://img.shields.io/pypi/v/tether.svg)](https://pypi.org/project/tether/)
-[![Python](https://img.shields.io/pypi/pyversions/tether.svg)](https://pypi.org/project/tether/)
-[![License](https://img.shields.io/pypi/l/tether.svg)](https://github.com/FastCrest/tether/blob/main/LICENSE)
-[![Downloads](https://img.shields.io/pypi/dm/tether.svg)](https://pypi.org/project/tether/)
+[![PyPI](https://img.shields.io/pypi/v/fastcrest-tether.svg)](https://pypi.org/project/fastcrest-tether/)
+[![Python](https://img.shields.io/pypi/pyversions/fastcrest-tether.svg)](https://pypi.org/project/fastcrest-tether/)
+[![License](https://img.shields.io/pypi/l/fastcrest-tether.svg)](https://github.com/FastCrest/tether/blob/main/LICENSE)
+[![Downloads](https://img.shields.io/pypi/dm/fastcrest-tether.svg)](https://pypi.org/project/fastcrest-tether/)

![Tether — pip install + tether doctor + tether --help on Modal A10G with TRT EP active](assets/tether-tweet.gif)

@@ -28,16 +28,16 @@ The bootstrap installer detects your platform (Mac / Jetson Orin / NVIDIA GPU /
**Manual install** if you know what you want:

```bash
-pip install tether # core
-pip install 'tether[serve,gpu,monolithic]' # GPU production path
-pip install 'tether[serve,onnx]' # Mac / CPU runtime
+pip install fastcrest-tether # core
+pip install 'fastcrest-tether[serve,gpu,monolithic]' # GPU production path
+pip install 'fastcrest-tether[serve,onnx]' # Mac / CPU runtime
```

Requires Python ≥ 3.10.

### What's new in v0.11.2 (2026-05-29)

-- **`tether connect` works on a clean install** — `requests` is now a core dependency, so `tether connect status` no longer raises `ModuleNotFoundError` on `pip install tether` without extras (it had been an undeclared import that only resolved transitively).
+- **`tether connect` works on a clean install** — `requests` is now a core dependency, so `tether connect status` no longer raises `ModuleNotFoundError` on `pip install fastcrest-tether` without extras (it had been an undeclared import that only resolved transitively).
- **`--fast-kernels` cleared the formal N=100/task L3 LIBERO parity gate** — on Pi0.5 LIBERO-10 tasks 0-2 (600 episodes), Triton fast kernels scored 91.3% (274/300) vs native ORT 85.3% (256/300) — 6.0pp *ahead* of native, so kill-trigger 3 stays clear and the opt-in Triton runtime stays on.
- **Hardened monolithic serve/bench path, with external-data ONNX** — dedicated ORT provider-options + tokenizer-loading modules are extracted from the request hot path; ONNX models with external weight data (`.onnx` + `.onnx_data`, required once a graph exceeds the 2 GB protobuf limit) now load in both serve and the weight-fusion export pass.
- **Cleaner streams** — integration-command errors route to stderr, so `--json` consumers and shell pipelines get a clean stdout.
@@ -65,8 +65,8 @@ Breaking: module renames — `tether.exporters.{pi0,smolvla,gr00t}_exporter` →
We ship patches frequently — make sure you're on the latest:

```bash
-pip install --upgrade tether # pip
-uv add --refresh tether # uv (the --refresh flag is required;
+pip install --upgrade fastcrest-tether # pip
+uv add --refresh fastcrest-tether # uv (the --refresh flag is required;
# uv caches the package index aggressively
# and won't see new releases without it)
```
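After upgrading, the installed version can be confirmed from Python as well as with `tether --version`. The sketch below assumes only the standard library's `importlib.metadata`; the lookup key is the distribution name (`fastcrest-tether`), which is what this change renames, while the `tether` command keeps its name. The helper `installed_version` is hypothetical:

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "fastcrest-tether") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "fastcrest-tether is not installed")
```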
@@ -83,7 +83,7 @@ only needed for caches built by v0.5.3 or earlier.

## Performance

-`tether[serve,gpu]` uses ONNX Runtime's TensorRT execution provider out of the box. Measured on Modal A10G (Ampere, sm_8.6) on 2026-04-29 against SmolVLA monolithic (5 warmup + 20 measured forward passes, batch=1):
+`fastcrest-tether[serve,gpu]` uses ONNX Runtime's TensorRT execution provider out of the box. Measured on Modal A10G (Ampere, sm_8.6) on 2026-04-29 against SmolVLA monolithic (5 warmup + 20 measured forward passes, batch=1):

| Provider | Mean latency | p95 |
|---|---|---|
@@ -125,7 +125,7 @@ Full reproducer + 9-iteration debug log: [`reflex_context/03_experiments/2026-04
Adds ~2 GB to `[serve,gpu]` install (the `tensorrt` package + bundled libs). If you don't want it:

```bash
-pip install 'tether[serve,gpu-min]' # ORT-CUDA only, ~5x slower on transformers
+pip install 'fastcrest-tether[serve,gpu-min]' # ORT-CUDA only, ~5x slower on transformers
```

Or disable the `LD_LIBRARY_PATH` patch (e.g. if it conflicts with another env-aware tool):
@@ -275,7 +275,7 @@ Hidden legacy commands (`export`, `bench`, `replay`, etc.) stay callable as alia
### Install notes

- `[monolithic]` extra is required for the cos=+1.000000 verified export path (pins transformers==5.3.0)
-- CPU-only: `pip install 'tether[serve,onnx,monolithic]'`
+- CPU-only: `pip install 'fastcrest-tether[serve,onnx,monolithic]'`
- GPU install needs the FULL cuDNN 9 system library (not just the pip wheel). Easiest path: NVIDIA's container `docker run --gpus all -it nvcr.io/nvidia/tensorrt:24.10-py3`, then `apt-get install -y clang` (for lerobot→evdev), then the pip install
- `tether serve` errors loudly if cuDNN can't load — no silent CPU fallback
- First `tether go` downloads weights (~1-14 GB depending on model) — cached on subsequent runs
2 changes: 1 addition & 1 deletion contrib/ros2/README.md
@@ -1,7 +1,7 @@
# ROS2 Robot Adapter Starter Kits

**Fork-and-customize** adapters for running Reflex VLA on real robots via ROS2.
-These are NOT part of the core `pip install tether` — they live in `contrib/`
+These are NOT part of the core `pip install fastcrest-tether` — they live in `contrib/`
and you're expected to adapt them to your specific robot setup.

## Available Adapters
2 changes: 1 addition & 1 deletion docs/cli_reference.md
@@ -58,7 +58,7 @@ tether go --model pi05-libero --dry-run
| `--api-key` | _(none)_ | If set, `/act` requires `X-Tether-Key` header (or `Authorization: Bearer`) |
| `--dry-run` | `false` | Probe + resolve + print plan; do not pull or serve |

-Full flag list: `tether go --help`. Note: models that ship as raw PyTorch require the `[monolithic]` extra (`pip install 'tether[monolithic]'`) for the inline export step.
+Full flag list: `tether go --help`. Note: models that ship as raw PyTorch require the `[monolithic]` extra (`pip install 'fastcrest-tether[monolithic]'`) for the inline export step.
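
As a rough illustration of the `--api-key` row in the table above, a client would attach the key in the `X-Tether-Key` header when calling `/act`. The sketch below only builds the request and does not send it. The base URL, the POST method, and the JSON payload are assumptions made for illustration; the actual request schema is not shown in this diff, and `build_act_request` is a hypothetical helper:

```python
import json
import urllib.request


def build_act_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the /act endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/act",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Tether-Key": api_key,  # header name from the --api-key row
        },
        method="POST",  # assumption: the endpoint accepts POST
    )


# Placeholder URL and payload; replace with the real server address and schema.
req = build_act_request("http://localhost:8000", "my-secret", {"placeholder": True})
print(req.full_url, req.get_method())
```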

---

4 changes: 2 additions & 2 deletions docs/eval.md
@@ -62,7 +62,7 @@ Bounded `failure_mode` enum, surfaced in the CLI + telemetry:
| `egl-black-frames` | Force `MUJOCO_GL=osmesa` in your env. |
| `dep-version-conflict` | Pin `robosuite==1.4.1`, `bddl==1.0.1`, `mujoco==3.3.2`. Use `--runtime modal` for known-good pins. |
| `osmesa-compile-hang` | Increase `--preflight-timeout` (cold containers take 60-180s for first-scene compile). |
-| `import-error` | `pip install 'tether[eval-local]'` for local; `--runtime modal` for the bundled image. |
+| `import-error` | `pip install 'fastcrest-tether[eval-local]'` for local; `--runtime modal` for the bundled image. |

The 5th failure (per-episode OOM) is per-call probabilistic; backoff + a legible error in the runner covers it.

@@ -145,7 +145,7 @@ Above-`$50` estimate triggers an extra **"are you sure?"** warning so the custom
Phase 1: **Linux x86_64 only**. Requires the `[eval-local]` extra:

```bash
-pip install 'tether[eval-local]'
+pip install 'fastcrest-tether[eval-local]'
```

`--runtime local` **never silently falls back to Modal**. If the local env is broken, `tether eval` fails loud with a remediation pointer. This avoids surprise Modal bills + masks real env-config issues.
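
The fail-loud behaviour described above can be pictured as a preflight check that raises with the renamed install command instead of switching runtimes. This is a sketch only, not Tether's implementation: the module list is an assumption borrowed from the pins named in the `dep-version-conflict` row, and both function names are hypothetical.

```python
import importlib.util
from typing import Iterable, List

# Assumption: these module names stand in for whatever `[eval-local]` installs.
ASSUMED_EVAL_MODULES = ("robosuite", "bddl", "mujoco")


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_local_runtime(names: Iterable[str] = ASSUMED_EVAL_MODULES) -> None:
    """Raise with a remediation hint rather than falling back to another runtime."""
    missing = missing_modules(names)
    if missing:
        raise RuntimeError(
            "local eval runtime is missing: " + ", ".join(missing)
            + ". Try: pip install 'fastcrest-tether[eval-local]'"
            + " or rerun with --runtime modal."
        )
```

Calling `require_local_runtime()` before an evaluation run would surface a broken environment immediately, with the remediation text already using the new distribution name.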
10 changes: 5 additions & 5 deletions docs/getting_started.md
@@ -1,6 +1,6 @@
# Getting started with Tether

-A 30-minute walkthrough of the typical first hour after `pip install tether[serve,gpu]`.
+A 30-minute walkthrough of the typical first hour after `pip install fastcrest-tether[serve,gpu]`.

This guide assumes a Linux box with an NVIDIA GPU. CPU-only deployments work with `[serve]` instead of `[serve,gpu]` — every example below applies, just replace `--device cuda` with `--device cpu`.

@@ -10,10 +10,10 @@ This guide assumes a Linux box with an NVIDIA GPU. CPU-only deployments work wit

```bash
# GPU box (requires CUDA 12 + cuDNN 9 — easiest via nvcr.io/nvidia/tensorrt container):
-pip install 'tether[serve,gpu] @ git+https://github.com/FastCrest/tether'
+pip install 'fastcrest-tether[serve,gpu] @ git+https://github.com/FastCrest/tether'

# CPU-only box:
-pip install 'tether[serve,onnx] @ git+https://github.com/FastCrest/tether'
+pip install 'fastcrest-tether[serve,onnx] @ git+https://github.com/FastCrest/tether'
```

Or for development from source:
@@ -205,7 +205,7 @@ tether export lerobot/smolvla_base --target orin-nano --output ./sv
scp -r ./sv jetson:~/sv

# On the Jetson (Jetpack 6.x with TensorRT preinstalled):
-pip install 'tether[serve] @ git+https://github.com/FastCrest/tether'
+pip install 'fastcrest-tether[serve] @ git+https://github.com/FastCrest/tether'
tether serve ./sv --port 8000 --device cuda
```

@@ -236,7 +236,7 @@ ORT 1.20+ requires CUDA 12.x + cuDNN 9.x. The pip-installed `nvidia-cudnn-cu12`
```bash
docker run --gpus all -it --rm nvcr.io/nvidia/tensorrt:24.10-py3
# inside the container:
-pip install 'tether[serve,gpu] @ git+https://github.com/FastCrest/tether'
+pip install 'fastcrest-tether[serve,gpu] @ git+https://github.com/FastCrest/tether'
tether serve ...
```

4 changes: 2 additions & 2 deletions docs/mcp.md
@@ -5,7 +5,7 @@ Tether exposes a [Model Context Protocol](https://spec.modelcontextprotocol.io/)
## Install

```bash
-pip install tether[mcp]
+pip install fastcrest-tether[mcp]
```

Pulls [`fastmcp`](https://github.com/jlowin/fastmcp) >= 3.0 alongside the core dependencies.
@@ -89,7 +89,7 @@ Shadow actions, A/B policy routing, and dataset validation run via explicit tool

**"fastmcp not installed"**
```bash
-pip install tether[mcp]
+pip install fastcrest-tether[mcp]
```

**Claude Desktop doesn't list Tether as a tool**
6 changes: 3 additions & 3 deletions docs/otel.md
@@ -5,7 +5,7 @@
## Install

```bash
-pip install 'tether[tracing]'
+pip install 'fastcrest-tether[tracing]'
```

Without the `[tracing]` extra, tracing no-ops silently; the server emits nothing and costs nothing. Your serve behavior is unchanged.
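
The silent no-op described above is a common optional-dependency pattern. The sketch below shows one way it could look, assuming the `[tracing]` extra provides the `opentelemetry` package; the helper `span` and the tracer name `tether-sketch` are hypothetical and not taken from Tether's source:

```python
from contextlib import nullcontext

try:
    # Importable only when the OpenTelemetry API is installed.
    from opentelemetry import trace
except ImportError:
    trace = None


def span(name: str):
    """Return a real span context manager if OpenTelemetry is present, else a no-op."""
    if trace is None:
        return nullcontext()
    return trace.get_tracer("tether-sketch").start_as_current_span(name)


with span("act"):
    result = 1 + 1  # stand-in for the traced work
```

The calling code is identical in both cases, which is what lets serve behave the same with or without the extra.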
@@ -102,8 +102,8 @@ File a GitHub issue if either is blocking an integration.

## Troubleshooting

-**"Tracing skipped — pip install tether[tracing] to enable"**
-The `[tracing]` extra isn't installed. Run `pip install 'tether[tracing]'` and restart serve.
+**"Tracing skipped — pip install fastcrest-tether[tracing] to enable"**
+The `[tracing]` extra isn't installed. Run `pip install 'fastcrest-tether[tracing]'` and restart serve.

**Spans appear in Phoenix but not in Datadog**
Your Datadog Agent probably has `receivers.otlp` disabled. Enable OTLP gRPC on :4317 in `datadog.yaml`, or point `--otel-endpoint` at an OTel Collector that forwards to Datadog.
4 changes: 2 additions & 2 deletions docs/transports.md
@@ -8,8 +8,8 @@ produces actions; the transport delivers them to the robot client.

| Transport | Flag | When to use | Install |
|---|---|---|---|
-| **HTTP** (default) | `--transport http` | Standard REST API. Works with any HTTP client (curl, Python requests, browser). Best for prototyping + debugging. | `pip install tether[serve]` |
-| **ZMQ** | `--transport zmq` | Low-latency binary wire. 20× lower bandwidth for multi-camera setups. 10× smaller robot-side install. Best for production robot deployments where every millisecond matters. | Server: `pip install tether[serve]`. Robot: `pip install pyzmq msgpack numpy opencv-python-headless` (~25 MB) |
+| **HTTP** (default) | `--transport http` | Standard REST API. Works with any HTTP client (curl, Python requests, browser). Best for prototyping + debugging. | `pip install fastcrest-tether[serve]` |
+| **ZMQ** | `--transport zmq` | Low-latency binary wire. 20× lower bandwidth for multi-camera setups. 10× smaller robot-side install. Best for production robot deployments where every millisecond matters. | Server: `pip install fastcrest-tether[serve]`. Robot: `pip install pyzmq msgpack numpy opencv-python-headless` (~25 MB) |
| **ROS2** | (v1.0) | Native ROS2 action server. Reserved for v1.0. | — |

## Quick Start
2 changes: 1 addition & 1 deletion docs/troubleshooting.md
@@ -183,7 +183,7 @@ ERROR: Ignored the following versions that require a different python version: 0

**Fix:** On Jetson, install `[serve]` only — **not** `[monolithic]`:
```bash
-pip install 'tether[serve]'
+pip install 'fastcrest-tether[serve]'
```

The monolithic ONNX export (`tether export --monolithic`) requires lerobot and must run on a **Python 3.12+ host** (desktop, cloud GPU, or Docker). Export there, then copy the ONNX to the Jetson and serve it:
6 changes: 3 additions & 3 deletions examples/01-chat-quickstart.md
@@ -7,7 +7,7 @@
## Install

```bash
-pip install tether
+pip install fastcrest-tether
```

(~40 seconds; pulls torch + transformers + onnx as deps. Yes, it's chunky — the chat agent itself is ~50 KB, but it shares the install with the rest of Tether.)
@@ -55,9 +55,9 @@ Here's what the doctor found:
✓ tether 0.3.5
⚠ torch + CUDA — torch 2.11.0, CUDA unavailable (you're on Apple Silicon)
⚠ ONNX Runtime — not installed
-Action: pip install 'tether[serve,onnx]'
+Action: pip install 'fastcrest-tether[serve,onnx]'
⚠ fastapi + uvicorn — not installed
-Action: pip install 'tether[serve,onnx]'
+Action: pip install 'fastcrest-tether[serve,onnx]'
```

### "What models can I deploy?"
2 changes: 1 addition & 1 deletion examples/01_circle_tap_so100/README.md
@@ -18,7 +18,7 @@ by 0o8o0 (MIT) per ADR `reflex_context/01_decisions/2026-05-06-vendor-auto-soarm

```bash
# Install Tether with the so100 + bench-game extras
-pip install 'tether[so100]' # adds scservo_sdk for the Pi-side driver
+pip install 'fastcrest-tether[so100]' # adds scservo_sdk for the Pi-side driver
pip install lerobot==0.5.1 # for the from-scratch ACT training step
```

6 changes: 3 additions & 3 deletions examples/02-deploy-smolvla-jetson.md
@@ -34,7 +34,7 @@ pip install onnxruntime-gpu \
--index-url https://pypi.jetson-ai-lab.io/jp6/cu126

# 3. Install tether with [serve] only (NOT [gpu], NOT [monolithic])
-pip install 'tether[serve]'
+pip install 'fastcrest-tether[serve]'
```

> **Why not `[monolithic]`?** The monolithic export extra depends on `lerobot==0.5.1`, which requires **Python ≥ 3.12**. JetPack 6 ships Python 3.10. Export your model on a desktop/cloud machine with Python 3.12+, then copy the ONNX to the Jetson and serve it.
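
The Python-version constraint in the note above can be turned into a small check that picks the install command for the current interpreter. A minimal sketch under that single constraint (`[monolithic]` needs Python ≥ 3.12); the function names are hypothetical and real installs may have further platform requirements:

```python
import sys
from typing import Tuple

# From the note above: lerobot==0.5.1 (pulled in by [monolithic]) needs Python >= 3.12.
MONOLITHIC_MIN_PYTHON = (3, 12)


def can_install_monolithic(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """True if this interpreter version satisfies the [monolithic] requirement."""
    return tuple(version) >= MONOLITHIC_MIN_PYTHON


def suggested_install(version: Tuple[int, int] = sys.version_info[:2]) -> str:
    """Return the pip command matching the given Python version."""
    extras = "serve,monolithic" if can_install_monolithic(version) else "serve"
    return f"pip install 'fastcrest-tether[{extras}]'"


print(suggested_install((3, 10)))  # JetPack 6 ships Python 3.10
print(suggested_install((3, 12)))  # desktop / cloud export host
```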
@@ -49,7 +49,7 @@ echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
### What each piece provides:
- `numpy<2` — ABI compatibility with Jetson AI Lab pre-built wheels
- `torch` / `onnxruntime-gpu` (from Jetson AI Lab) — GPU-accelerated inference, compiled for `aarch64` + JetPack CUDA
-- `tether[serve]` — FastAPI + uvicorn HTTP inference server + embodiment validation
+- `fastcrest-tether[serve]` — FastAPI + uvicorn HTTP inference server + embodiment validation

This pulls ~2 GB of dependencies. Takes 5-10 minutes on the Jetson.

@@ -61,7 +61,7 @@ Since monolithic export requires Python 3.12+ (for `lerobot`), the typical Jetso

```bash
# On your desktop / cloud GPU (Python 3.12+)
-pip install 'tether[serve,monolithic]'
+pip install 'fastcrest-tether[serve,monolithic]'
tether export --model smolvla-base --out ./smolvla-export/
```

2 changes: 1 addition & 1 deletion examples/03-distill-pi05.md
@@ -13,7 +13,7 @@ SnapFlow ([arxiv 2604.05656](https://arxiv.org/abs/2604.05656)) is the canonical
## Install

```bash
-pip install 'tether[monolithic]'
+pip install 'fastcrest-tether[monolithic]'
```

## The command
2 changes: 1 addition & 1 deletion examples/README.md
@@ -12,7 +12,7 @@ Each example is a self-contained walkthrough you can paste into a terminal. Star
All examples assume you've installed at least the base package:

```bash
-pip install tether
+pip install fastcrest-tether
```

Some examples need the GPU or monolithic export extras — each example flags what it needs at the top.
4 changes: 2 additions & 2 deletions examples/so_arm100_smolvla.py
@@ -19,8 +19,8 @@
- USB-to-serial bridge on /dev/ttyUSB0 (Linux) or /dev/tty.usbserial-* (Mac)

Python requirements:
-pip install 'tether[serve,gpu,monolithic,lerobot]' # GPU host
-pip install 'tether[serve,onnx,lerobot,so100]' # Mac / Pi at the arm
+pip install 'fastcrest-tether[serve,gpu,monolithic,lerobot]' # GPU host
+pip install 'fastcrest-tether[serve,onnx,lerobot,so100]' # Mac / Pi at the arm

Calibration:
If you already have a LeRobot calibration file (recorded via
2 changes: 1 addition & 1 deletion infra/license-worker/README.md
@@ -90,7 +90,7 @@ python -m tether.admin.issue_license \

```bash
# On the customer's machine
-pip install --upgrade tether
+pip install --upgrade fastcrest-tether
tether pro activate REFLEX-XXXX-XXXX-XXXX
# ✓ License fetched, signature verified, written to ~/.reflex/pro.license
# ✓ Hardware bound
2 changes: 1 addition & 1 deletion infra/telemetry-worker/worker.js
@@ -1,7 +1,7 @@
/**
* Reflex telemetry endpoint — Cloudflare Worker.
*
-* Receives heartbeat POSTs from `pip install tether` deployments
+* Receives heartbeat POSTs from `pip install fastcrest-tether` deployments
* running with a valid Pro license OR free-tier telemetry enabled.
* Validates the payload shape, writes one row per heartbeat to D1,
* and returns 204 No Content.