Push-to-toggle voice dictation for your whole desktop.
Hit a key. Talk. The words land in whatever app is focused. Hit it again to stop.
dit streams your microphone to ElevenLabs Scribe v2 Realtime
and pastes each finalized sentence into the focused window the instant it's ready — no app
to switch to, no transcript window to copy out of. It's a single static binary, written in
Rust, that runs the same way on Linux, macOS and Windows.
It started as whisperflow.py —
a Linux/Wayland-only Python script. This is the portable, dependency-light rewrite.
    ┌─ press F9 ──────────────────────────────────── press F9 ─┐
    ▼                                                          ▼
    mic ──► resample 16 kHz ──► WebSocket ──► Scribe v2 Realtime
                                                      │
                           committed_transcript ◄─────┘
                                     │
                 typed as keystrokes ──► ✶ focused app
Linux / macOS:
curl -fsSL https://raw.githubusercontent.com/reddb-io/dit/main/install.sh | bash
Windows (PowerShell):
irm https://raw.githubusercontent.com/reddb-io/dit/main/install.ps1 | iex
The installer detects your OS/arch, picks the best build for your host (see below), downloads
the matching binary, verifies its .sha256, installs or updates it (~/.local/bin on Unix,
%LOCALAPPDATA%\Programs\dit on Windows) and puts it on your PATH. If dit is already installed,
it reads the local dit --version, compares it with the requested/latest release, skips a no-op
reinstall when already current, and updates older local binaries in place. On Linux it also restarts
an active dit.service so the running desktop agent picks up the new binary.
It then walks you through the rest interactively: prompts for your ElevenLabs API key, offers to install the runtime libraries (detecting apt/dnf/pacman/zypper), and offers to set up the autostart service — then smoke-tests that the binary runs.
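The version check the installer performs can be sketched roughly as below. This is a minimal illustration, not the installer's actual code: it assumes `dit --version` prints something like `dit 0.2.4` and that release tags look like `v0.2.4`. The names `parse_version`, `decide` and `Action` are hypothetical, and the README does not say what happens when the local binary is newer than the requested one, so that case is only reported.

```rust
use std::cmp::Ordering;

/// What to do after comparing versions (hypothetical names).
#[derive(Debug, PartialEq)]
enum Action {
    Skip,         // already current: no-op
    Update,       // local binary is older: replace it
    LocalIsNewer, // not covered by the README; left to the caller
}

/// Parse "dit 0.2.4", "v0.2.4" or "0.2.4" into (major, minor, patch).
/// Pre-release suffixes such as "-rc1" are rejected in this sketch.
fn parse_version(text: &str) -> Option<(u64, u64, u64)> {
    // Use the last whitespace-separated token so "dit 0.2.4" works too.
    let token = text.split_whitespace().last()?.trim_start_matches('v');
    let mut parts = token.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Compare the local `--version` output with the target release tag.
fn decide(local_output: &str, target_tag: &str) -> Option<Action> {
    let local = parse_version(local_output)?;
    let target = parse_version(target_tag)?;
    // Tuples compare field by field, which matches major/minor/patch order.
    Some(match local.cmp(&target) {
        Ordering::Equal => Action::Skip,
        Ordering::Less => Action::Update,
        Ordering::Greater => Action::LocalIsNewer,
    })
}
```

For example, `decide("dit 0.1.0", "v0.2.4")` yields `Some(Action::Update)`, while unparseable input yields `None` so the caller can fall back to a plain install.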
# fully non-interactive, e.g. for provisioning
curl -fsSL .../install.sh | bash -s -- --yes --api-key sk_... --with-service
# other flags
curl -fsSL .../install.sh | bash -s -- --version v0.1.0
curl -fsSL .../install.sh | bash -s -- --install-dir /usr/local/bin --skip-deps --no-service
Once installed, dit upgrades itself. dit update is a first-class, self-contained command, with no
curl | bash and no re-running the installer:
dit update # upgrade to the latest release (no-op if already current)
dit update --check # just report whether a newer version exists
dit update --force # re-download and reinstall the current version
dit update --version v0.2.4 # pin a specific release
What it does, end to end:
- Resolves the latest release from the GitHub API (or the tag you pin).
- Picks the right asset for this host: the correct arch, and the glibc-portable or fully-static -static variant, inferred from how the running binary itself was built.
- Downloads over HTTPS (rustls, no system OpenSSL) and verifies the published SHA-256 before touching anything; a mismatch aborts the update.
- Atomically replaces the running executable in place (safe on Windows too), then on Linux restarts an active dit.service so the desktop agent picks up the new binary.
- Idempotent: running it twice in a row just prints "already the latest release — nothing to update."
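The asset-selection step can be illustrated with a small lookup. This is a sketch only: the asset names come from the Releases table in this README, but the function `asset_name` and its `want_static` parameter are hypothetical, and how dit really infers the variant is not shown here. The `os` and `arch` strings follow the values of `std::env::consts::OS` and `std::env::consts::ARCH` (where 32-bit ARM is reported as `"arm"`).

```rust
/// Map (os, arch, want_static) to a release asset name, or None if the
/// combination is not in the published asset table.
fn asset_name(os: &str, arch: &str, want_static: bool) -> Option<String> {
    let name = match (os, arch) {
        ("linux", "x86_64") | ("linux", "aarch64") => {
            let suffix = if want_static { "-static" } else { "" };
            format!("dit-linux-{arch}{suffix}")
        }
        // No static armv7 asset is listed, so refuse rather than guess.
        ("linux", "arm") if !want_static => "dit-linux-armv7".to_string(),
        // The static flag only matters on Linux in this sketch.
        ("macos", "aarch64") | ("macos", "x86_64") => format!("dit-macos-{arch}"),
        ("windows", "x86_64") => "dit-windows-x86_64.exe".to_string(),
        _ => return None,
    };
    Some(name)
}
```

A caller would typically pass `std::env::consts::OS` and `std::env::consts::ARCH`; returning `None` for unknown hosts lets the updater stop with a clear message instead of downloading the wrong binary.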
Grab the binary for your platform from the Releases page:
| Platform | Asset |
|---|---|
| Linux x86_64 | dit-linux-x86_64 |
| Linux aarch64 | dit-linux-aarch64 |
| Linux armv7 (32-bit ARM) | dit-linux-armv7 |
| Linux x86_64 — fully static | dit-linux-x86_64-static |
| Linux aarch64 — fully static | dit-linux-aarch64-static |
| macOS Apple Silicon | dit-macos-aarch64 |
| macOS Intel | dit-macos-x86_64 |
| Windows x86_64 | dit-windows-x86_64.exe |
curl -fsSL https://github.com/reddb-io/dit/releases/latest/download/dit-linux-x86_64 -o dit
chmod +x dit && sudo mv dit /usr/local/bin/
Every asset ships a .sha256 sidecar; verify it with shasum -a 256 -c dit-<asset>.sha256.
Note
Distro-portable by design. The default Linux x86_64/aarch64/armv7 binaries are built
with cargo-zigbuild against an old glibc floor
(2.28), so a single binary runs on every Ubuntu LTS from 20.04 on (20.04 / 22.04 / 24.04 / 26.04) and
Debian 10+, with no more "version `GLIBC_2.39' not found" when you move between releases. If your
host glibc is older than 2.28 (Ubuntu 18.04, for example, ships 2.27), or you're on a musl distro
like Alpine, grab the *-static variant instead (ALSA is linked in, so it needs nothing on the
system). The install script and dit update
detect this and pick the right one for you automatically.
Note
Dependencies — there's essentially one, and the installer handles it. The prebuilt Linux
binary is self-contained: its only runtime dependency is the ALSA shared library libasound2
(audio) — no libxdo, wl-clipboard, GTK or appindicator (input, clipboard and the tray are all
pure-Rust). The install script checks for it and offers to install it via your package manager
(apt/dnf/pacman/zypper); with --yes it just does it. The *-static build links ALSA in,
so it needs nothing at all — use it (or --static) if you'd rather not touch system packages.
libasound2-dev and pkg-config are only needed to compile from source, never to run a release.
macOS and Windows need nothing extra.
cargo install --path . # or: cargo build --release
Linux build dependencies:
sudo apt-get install -y libasound2-dev pkg-config
(Those are the only build dependencies: the Linux input, clipboard and tray are all pure-Rust, so no X11/GTK/xdo/appindicator dev packages are needed.)
macOS and Windows need no extra system packages.
Put your ElevenLabs API key in ~/.dit.env:
echo 'ELEVENLABS_API_KEY=sk_your_key_here' > ~/.dit.env
Alternatively, export ELEVENLABS_API_KEY or pass --env-file <path>.
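For illustration, reading a key file in the KEY=VALUE shape shown above could look like the sketch below. It is an assumption about the format, not dit's actual parser: the helper `read_key` is hypothetical, it skips blank lines and `#` comments, and it does not handle quoting. How dit orders the file, the environment variable and `--env-file` is not specified here, so no precedence is implied.

```rust
/// Return the value for `key` from KEY=VALUE text, or None if absent.
fn read_key(contents: &str, key: &str) -> Option<String> {
    for line in contents.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split at the first '=' only, so values may contain '='.
        if let Some((name, value)) = line.split_once('=') {
            if name.trim() == key {
                return Some(value.trim().to_string());
            }
        }
    }
    None
}
```

A caller might combine it with `std::fs::read_to_string` on the chosen path and treat a missing key as a configuration error.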
dit # F9 toggle, Portuguese
dit --language en # English
dit --hotkey F8 # any of F1..F12
dit --device "Fifine" # prefer an input device by name substring
dit --no-filler # strip "uh"/"um" from the output
dit --keyterm RedDB --keyterm Scribe # bias toward names/jargon (repeatable)
dit --vad-silence 0.8 # commit faster on shorter pauses
dit --region eu # EU data residency
dit --list-devices # list inputs and exit
dit doctor # diagnose mic/keyboard/session permissions
dit update # update to the latest release (no-op if current)
dit update --check # only report whether an update is available
Press F9 → speak → press F9 again. While recording, the tray icon becomes a high-contrast VU meter: dark red bars mean silence/no input, green bars mean healthy speech level, and yellow/red bars mean loud input. Ctrl+C quits. Crank up logs with RUST_LOG=dit=debug.
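To make the meter concrete, here is a rough sketch of how an RMS level could drive the three colour bands and a 5-bar display. The thresholds (0.01 and 0.5), the linear bar scale, and the names `rms`, `classify`, `bars` and `Level` are illustrative assumptions, not values taken from dit.

```rust
/// Colour band for the tray meter (hypothetical names).
#[derive(Debug, PartialEq)]
enum Level {
    Silent,  // dark red: silence or no input
    Healthy, // green: normal speech level
    Loud,    // yellow/red: loud input
}

/// Root-mean-square of samples in [-1.0, 1.0]; 0.0 for an empty chunk.
fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// Classify a level into a colour band. Thresholds are guesses.
fn classify(level: f32) -> Level {
    const SILENCE_BELOW: f32 = 0.01;
    const LOUD_ABOVE: f32 = 0.5;
    if level < SILENCE_BELOW {
        Level::Silent
    } else if level > LOUD_ABOVE {
        Level::Loud
    } else {
        Level::Healthy
    }
}

/// Number of lit bars out of 5, on a simple linear scale.
fn bars(level: f32) -> usize {
    ((level.clamp(0.0, 1.0) * 5.0).ceil() as usize).min(5)
}
```

A real meter would more likely use a logarithmic (dB) scale, since perceived loudness is not linear; the linear mapping here only keeps the example short.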
| Flag | Default | Description |
|---|---|---|
| --language | pt | Scribe language code (pt, en, es, …) |
| --model | scribe_v2_realtime | Scribe realtime model id |
| --hotkey | F9 | Toggle key (F1..F12) |
| --device | system default | Input device name substring |
| --no-filler | off | Remove filler words (no_verbatim) |
| --keyterm <TERM> | none | Bias the model toward a term; repeatable |
| --vad-silence <SECS> | 1.5 | Silence before a segment commits; lower = snappier |
| --region | global | API region: global, us, eu, in |
| --no-preview | off | Disable the live terminal preview |
| --env-file | ~/.dit.env | Path to the key file |
| --list-devices | n/a | Print input devices and exit |
dit is resilient to desktop hardware churn: on Linux it monitors /dev/input
for keyboards plugged in after startup, debounces duplicate hotkey events from
multi-event keyboards, ranks real capture devices ahead of noisy ALSA aliases,
and retries/fails over if a microphone stream disappears.
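Debouncing duplicate hotkey events can be sketched as a small time-window filter. This is one common way to do it, shown under assumptions: the `Debouncer` type and the window length are hypothetical, and dit's actual strategy may differ.

```rust
use std::time::{Duration, Instant};

/// Drops events that arrive within `window` of the last accepted one.
struct Debouncer {
    window: Duration,
    last_accepted: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Debouncer { window, last_accepted: None }
    }

    /// Returns true if the event at `now` should toggle recording.
    fn accept(&mut self, now: Instant) -> bool {
        match self.last_accepted {
            // Too close to the previous accepted event: treat as a duplicate.
            Some(prev) if now.duration_since(prev) < self.window => false,
            _ => {
                self.last_accepted = Some(now);
                true
            }
        }
    }
}
```

Passing the timestamp in, rather than calling `Instant::now()` inside `accept`, keeps the filter easy to test. The window is measured from the last accepted event, so a burst of duplicates cannot keep extending it.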
Tip
For the sharpest transcripts: pass names and jargon with --keyterm (e.g. --keyterm Kubernetes),
turn on --no-filler for clean prose, and lower --vad-silence (e.g. 0.8) if you want each
sentence to land sooner at the cost of slightly more fragmentation.
dit is a long-running process — it has to be, since something must listen for the hotkey. To
have it start at login and stay ready, install it as a user service:
dit service install # autostart with defaults
dit service install --language en --no-filler # …or bake in your flags
dit service status
dit service uninstall

| OS | What it installs |
|---|---|
| Linux | a systemd --user service (journalctl --user -u dit -f for logs), or an XDG autostart .desktop entry if there's no user systemd |
| macOS | a LaunchAgent in ~/Library/LaunchAgents |
| Windows | a logon task via Task Scheduler |
Important
It installs a user-session agent, not a root/system daemon — and that's deliberate. A system
service runs isolated from your login session (no display, no audio, no input access on Linux; in
"session 0" with no desktop on Windows), so it physically couldn't read your keyboard or type
into your apps. dit must live inside your graphical session.
Two surfaces keep your words safe without ever risking the focused app's text:
- Live terminal preview: the unstable partial_transcript "materializes" on a single, self-rewriting line in your terminal. You watch the sentence form in real time, but the app in focus only ever receives committed (finalized) text. No backspace-and-retype into a window we don't control, so there's no way to clobber what's already there.
- Append-only transcript log: every committed segment is written to ~/.dit/sessions/session-<ts>.txt. If typing fails, the app loses focus, or the connection drops, the text is still on disk. A previewed tail that never got a final commit is recorded too (marked # [uncommitted]), saved for recovery, not typed late.
… materializing this senten ← live preview (dim, rewrites in place)
This sentence is now committed. ← locked in, typed into the app + logged
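An append-only log of this kind can be written with the standard library alone. The sketch below is illustrative: the function `append_segment` is hypothetical, and the exact line layout (one segment per line, with the `# [uncommitted]` marker as a prefix) is an assumption about the file format.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one segment as a line. Uncommitted tails get a marker prefix.
fn append_segment(path: &Path, text: &str, committed: bool) -> io::Result<()> {
    // `append(true)` makes every write land at the end of the file,
    // so earlier segments are never overwritten.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    if committed {
        writeln!(file, "{text}")?;
    } else {
        writeln!(file, "# [uncommitted] {text}")?;
    }
    file.flush()
}
```

The parent directory must already exist; a caller would create it first with `std::fs::create_dir_all`.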
dit is faithful to the original script's streaming contract:
- partial_transcript events are never typed into the app: they're an unstable preview, and typing them character-by-character would scramble the output.
- committed_transcript events are stable per-segment text, committed by the server's Voice Activity Detection on each pause. Every one is typed into the focused app immediately.
- Identical consecutive segments are de-duplicated so nothing lands twice.
- On stop, an empty commit: true frame flushes the last open segment.
- While audio is streaming, dit computes a lightweight RMS level locally and updates the tray icon about 5 times per second as a chunky 5-bar meter. Only the level is used for the icon; no audio is written to disk by default.
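The first three rules of the contract above fit in a few lines of state handling. This is a sketch under assumptions: the `Event` enum is a simplified stand-in for the server's messages (its shape is hypothetical), and trimming whitespace and skipping empty segments are choices made here, not documented behaviour.

```rust
/// Simplified stand-in for the two transcript event kinds.
enum Event {
    Partial(String),
    Committed(String),
}

/// Remembers the last delivered segment to drop consecutive duplicates.
struct Committer {
    last: Option<String>,
}

impl Committer {
    fn new() -> Self {
        Committer { last: None }
    }

    /// Returns the text to deliver to the focused app, or None.
    fn handle(&mut self, event: Event) -> Option<String> {
        match event {
            // Partials are preview-only and never reach the app.
            Event::Partial(_) => None,
            Event::Committed(text) => {
                let text = text.trim().to_string();
                // Skip empty segments and exact repeats of the previous one.
                if text.is_empty() || self.last.as_deref() == Some(text.as_str()) {
                    return None;
                }
                self.last = Some(text.clone());
                Some(text)
            }
        }
    }
}
```

Only consecutive repeats are dropped: the same sentence spoken again after a different segment is delivered, since `last` tracks just the most recent segment.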
Text delivery is platform-specific. On Linux/Wayland, dit sets the clipboard and emits the paste
chord through /dev/uinput (Ctrl+V, or Ctrl+Shift+V with --paste-shift for terminals); this
makes delivery much more reliable than trying to synthesize every character individually. The text is
also appended to the session log, so a failed paste can still be recovered.
| Concern | Crate | Replaces (whisperflow.py) |
|---|---|---|
| Global hotkey | rdev | evdev + input group |
| Audio capture | cpal | parec (PulseAudio) |
| WebSocket | tokio-tungstenite | websockets |
| Text injection | enigo | wl-copy + ydotool key ctrl+v |
| Notifications | notify-rust | notify-send |
Important
Linux — dit uses a kernel-level input backend (works the same on X11 and Wayland, since
X11 global grabs and X11 input don't reach native Wayland apps): it reads the hotkey from
/dev/input (evdev) and types by setting the clipboard and emitting the paste chord through
/dev/uinput. No external libraries or tools — but it needs a one-time permission setup, which the
installer offers to do:
sudo usermod -aG input $USER # read the keyboard
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/99-uinput.rules
sudo udevadm control --reload && sudo udevadm trigger # write to uinput
# then log out and back in
In a terminal, paste is Ctrl+Shift+V, so pass --paste-shift to make dit use that chord.
Note
macOS — grant Accessibility permission (System Settings → Privacy & Security →
Accessibility) so dit can read the hotkey and type into the focused app.
Windows — works out of the box.
Versioning is commit-driven. release-plz reads the
Conventional Commits on main and opens a release PR
that bumps the version (feat → minor, fix → patch, !/BREAKING CHANGE → major) and updates
CHANGELOG.md. Merging that PR creates the version tag.
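The commit-to-bump mapping stated above can be written out as a small function. This is only a sketch of the rule as the README describes it (feat → minor, fix → patch, `!` or a `BREAKING CHANGE:` footer → major); release-plz's real logic may differ in detail, and the names `Bump`, `bump_for` and `release_bump` are hypothetical.

```rust
/// Bump levels, ordered so that `max` picks the strongest one.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Bump {
    NoRelease,
    Patch,
    Minor,
    Major,
}

/// Classify one commit message written as "type(scope)!: subject".
fn bump_for(message: &str) -> Bump {
    let header = message.lines().next().unwrap_or("");
    let prefix = match header.split_once(':') {
        Some((prefix, _subject)) => prefix.trim(),
        None => return Bump::NoRelease, // not a conventional commit
    };
    let breaking = prefix.ends_with('!')
        || message.lines().any(|line| line.starts_with("BREAKING CHANGE:"));
    if breaking {
        return Bump::Major;
    }
    // Drop an optional "(scope)" to get the bare commit type.
    let kind = prefix.split('(').next().unwrap_or(prefix).trim();
    match kind {
        "feat" => Bump::Minor,
        "fix" => Bump::Patch,
        _ => Bump::NoRelease,
    }
}

/// The bump for a release is the strongest bump among its commits.
fn release_bump(messages: &[&str]) -> Bump {
    messages.iter().map(|m| bump_for(m)).max().unwrap_or(Bump::NoRelease)
}
```

For instance, a batch containing `fix: a`, `feat: b` and `docs: c` resolves to `Bump::Minor`, because one feature outweighs any number of fixes.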
The tag triggers the release build on Blacksmith runners. Linux
x86_64/aarch64 compile natively then get their glibc floor lowered to 2.28 by
cargo-zigbuild (so one binary spans every modern
distro); armv7 cross-compiles the same way. Two extra fully-static musl binaries
(*-static) build inside messense/rust-musl-cross containers with a statically-linked ALSA, as a
fallback for ancient or musl hosts. macOS ships separate Apple Silicon and Intel binaries, Windows an .exe. Every target
is built --locked, stripped, smoke-tested, and published to a GitHub Release with .sha256
sidecars and a git-cliff changelog.
commits (feat:/fix:/…) ─► release-plz PR ─► merge ─► tag vX.Y.Z ─► binaries + GitHub Release
So you never tag by hand — just write conventional commits and merge the release PR. No PAT
needed: the tag release-plz creates triggers the build directly (you can also rebuild any tag
manually with gh workflow run release.yml -f version=X.Y.Z).
MIT © RedDB.io