cctts — Claude Code Text-To-Speech

GitHub release license: MIT

Listen to Claude's responses while you read code, switch windows, or step away from the screen. A Claude Code plugin that speaks assistant output aloud via Microsoft's free edge-tts — independent of the native /voice STT toggle.

Platform: Windows. Tested on Windows 10/11. macOS/Linux code paths exist but are unvalidated on real hardware — see Known limits.

Heads-up: assistant text leaves your machine — every spoken block is sent to Microsoft's Edge TTS endpoint. See Privacy before enabling on sensitive work.

Requirements

  • Claude Code (any recent version)
  • Python 3.10+ on PATH (per edge-tts requirements)
  • Windows 10/11 (macOS/Linux experimental — see Known limits)
  • Internet access for synthesis — text is sent to Microsoft's Edge TTS endpoint
  • Optional: pycaw for cctts -headphones, miniaudio for cctts -streaming

Install

From a Claude Code session, run these two slash commands (one at a time — Claude Code's slash parser does not chain with &&):

/plugin marketplace add Quigleybits/cctts
/plugin install cctts@quigleybits

Then fully restart Claude Code — close the terminal / IDE session and reopen it. /reload-plugins is not enough: cctts ships a UserPromptSubmit hook that only registers at session startup, so a soft reload won't pick it up. (If cctts -status reaches the LLM as a normal prompt instead of being intercepted, you skipped the full restart.)

Upgrading

To pick up a newer version of an already-installed cctts, use the update verb — /plugin install is fresh-install only and won't bump the version:

/plugin update cctts@quigleybits

Or run bare /plugin and select cctts from the management UI. After an upgrade, fully restart Claude Code again so the hook re-registers against the new code. Any .claude/cctts.json files left over from v0.1.0 migrate into state.projects automatically on the next hook fire — no manual cleanup needed.

Note: first-run auto-install. On the first speak event, the worker auto-installs the one runtime dependency (edge-tts) into your active Python. The hook output shows "cctts: installing edge-tts (one-time, ~5s)... Ctrl-C to abort", then "cctts: edge-tts ready.", and normal playback continues. If auto-install fails (e.g. PEP 668 managed-Python systems), install manually:

pip install --user edge-tts

~/.claude/hooks/.cctts-state.json is created with defaults on first hook fire: rate %115, global voice 1 (Aria), and auto-voice mode ON — each terminal is auto-assigned the lowest free voice from the pool (voices 1–9), so the first terminal you open hears Aria, the second hears Ava, the third Jenny, and so on. Run cctts -g -1 (or any -g -N) if you'd rather every terminal share one voice.

Source. Repo: https://github.com/Quigleybits/cctts. The quigleybits marketplace points at this repo via the .claude-plugin/marketplace.json self-manifest.

Quick start

After install + restart, type any of these as a raw prompt (no slash):

cctts                toggle TTS on/off in THIS terminal
cctts -3             switch THIS terminal to voice 3
cctts %150           set THIS terminal's rate to 150% of original speed
cctts !80            set THIS terminal's volume to 80% of original
cctts -3 %150 !80    voice + rate + volume for THIS terminal in one shot
cctts -g -3          broadcast voice 3 to every current terminal + new default
cctts -g             toggle GLOBAL on/off
cctts -p -3          set voice 3 for THIS project (stored in state.projects)
cctts -p             toggle THIS project on/off
cctts -intro         interactive first-run picker (sets global default)
cctts -help          focused magic-phrase reference (when you forget a flag)

These fire instantly via a UserPromptSubmit hook — no LLM round-trip.

Scope: per-terminal, global, project

Every cctts flag (voice, rate, volume, -notify, -chime, -pause, etc.) picks one of three scopes. Per-terminal is the default — a naked cctts -3 only affects THIS shell, not your other Claude Code terminals. Add one flag to switch scope:

| Flag | Scope | Stored at | Affects |
| --- | --- | --- | --- |
| (none) | terminal | state.terminals[sid] | only the current CLAUDE_SESSION_ID |
| -g | global broadcast | state.global + rewrites every auto_voice_assignments[sid] + clears matching per-terminal overrides | every currently-tracked terminal AND the new-terminal default |
| -p | project | state.projects[<git-root>] (resolved from cwd) | every terminal whose cwd resolves to this project |

-g and -p are mutually exclusive; specifying both is an error.

Precedence (highest → lowest): per-terminal override → project override → auto-voice → global default.

The hook resolves the effective voice/rate/volume per fire using the active CLAUDE_SESSION_ID (terminal) and cwd (project — walked up to the nearest .git, falling back to cwd if no git is found). Auto-voice only ever resolves voice_index; rate and volume fall straight through terminal → project → global. enabled is supported at all three layers.
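
The precedence chain can be pictured as a small lookup. This is a minimal sketch, not the plugin's actual code: the function name `resolve` and the plain-dict layers are assumptions made for illustration, and only the ordering (terminal, then project, then auto-voice for `voice_index` only, then global) comes from the description above.

```python
from typing import Any, Optional


def resolve(field: str, terminal: dict, project: dict,
            auto_voice: Optional[int], global_: dict) -> Any:
    """Resolve one setting: terminal -> project -> auto-voice -> global."""
    if field in terminal:
        return terminal[field]
    if field in project:
        return project[field]
    # Auto-voice only ever supplies voice_index; rate and volume skip it.
    if field == "voice_index" and auto_voice is not None:
        return auto_voice
    return global_.get(field)
```

With an auto-assigned slot of 2 and a global default of 1, `resolve("voice_index", {}, {}, 2, {"voice_index": 1})` gives 2, while `resolve("rate", {}, {}, 2, {"rate": "+15%"})` falls through to the global rate.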

Bare cctts, bare cctts -g, and bare cctts -p all toggle their scope's enabled flag (the symmetry is by design).

Magic phrases

The hook intercepts any prompt that starts with cctts (case-insensitive) and runs the CLI in-process, blocking the prompt from reaching the model. Anything that doesn't match (e.g. "fix the cctts hook", "what does cctts do") passes through normally.
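
A rough sketch of the interception test follows. The exact matching rule is an assumption: here "starts with cctts" is read as "the first word is exactly cctts", with leading whitespace tolerated. The name `is_magic_phrase` is hypothetical.

```python
import re

# Assumption: the first whitespace-delimited word must be exactly "cctts".
# The real hook may use a different rule.
_MAGIC_RE = re.compile(r"^\s*cctts(\s|$)", re.IGNORECASE)


def is_magic_phrase(prompt: str) -> bool:
    """Return True when the prompt should go to the CLI instead of the model."""
    return _MAGIC_RE.match(prompt) is not None
```

Under this rule `cctts -status` and `CCTTS` are intercepted, while "fix the cctts hook" passes through to the model.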

Combining: any of -N, %N, !N can be mixed in one call (e.g. cctts -3 %150 !80). Prepend -g to broadcast globally or -p to scope to the project — see Scope for the precedence rules.

Toggle, pause, interrupt

| Phrase | Effect |
| --- | --- |
| cctts | Toggle THIS terminal on/off |
| cctts -g | Toggle GLOBAL on/off (affects every session that hasn't manually toggled) |
| cctts -p | Toggle THIS PROJECT on/off (writes enabled under state.projects[<git-root>]) |
| cctts -p -status | Print the current project's entry as JSON |
| cctts -pause | Pause THIS terminal (cleared automatically on SessionEnd). -g for global pause (persists across sessions). -p writes paused: true to the project entry; every terminal whose cwd resolves here goes silent on its next fire |
| cctts -resume | Resume THIS terminal (writes an explicit unpaused override; -reset or SessionEnd clears it). -g for global resume. -p drops the project's paused field (entry is removed if it becomes empty) |
| cctts -skip | Skip the currently-playing block; queued blocks continue |
| cctts -stop | Kill current + queued blocks for THIS terminal. -g to interrupt every terminal |
| cctts -reset | Clear THIS terminal's voice/rate/volume/notify/chime/paused overrides. -g clears every terminal's overrides |
| cctts -more | Speak the next 5000-char page of a block that hit the speech cap. Per-terminal pending queue; recurses across multiple pages |

Voice, rate, volume

| Phrase | Effect |
| --- | --- |
| cctts -1 through cctts -20 | Set voice slot for THIS terminal (see Voice catalog; global default = 1). -g -N broadcasts; -p -N writes the project entry AND clears the override on every terminal in this repo |
| cctts -voice <name> | Set voice by full edge-tts name for THIS terminal (e.g. cctts -voice zh-CN-XiaoxiaoNeural). -g -voice <name> broadcasts globally; -p -voice <name> writes to the project entry |
| cctts %N | Set rate as percent of original for THIS terminal (%25-%400, %100 = 1x) |
| cctts !N | Set volume as percent of original for THIS terminal (!1-!200, !100 = original) |
| cctts -p -3 %150 !80 | Write voice/rate/volume to state.projects[<git-root>] and broadcast the field clear to matching terminals |
| cctts -p -chime / -p -notify | Write chime: true / notify: true to the project entry (broadcasts same as above) |
| cctts -p -clear | Remove the project entry from state.projects (does NOT broadcast; per-terminal overrides stay) |

Replay, test, inspect

| Phrase | Effect |
| --- | --- |
| cctts -replay [N] | Re-speak the Nth most recent block (default 1, range 1–5) |
| cctts -test [-N] "hello" | Speak the given text once at current voice, or voice N if -N is supplied. No state change |
| cctts -test "hello" -voice <name> | Audition a specific edge-tts voice by name; no state change. Argument order matters: -voice must come AFTER the test text |
| cctts -status | Show current state (plugin version, session id, global + effective voice/rate/volume, modes, auto-voice assignment) |
| cctts -voices | List available voices |
| cctts -help | Print the focused magic-phrase reference |

Audio-cue toggles (per-terminal default; -g for global)

| Phrase | Effect |
| --- | --- |
| cctts -chime | Toggle chime audio cues: adds a tool-call chime at PreToolUse and a stop-chime at Stop. Does NOT silence prose; prose always speaks when cctts is enabled and not paused |
| cctts -notify | Toggle spoken readback of Notification events: both permission prompts ("Claude needs permission to use Bash") and the idle "Claude is waiting for your input" nudge. When ON, the stop-chime is suppressed (the readback supplies the end-of-block signal). Does NOT silence prose. Renamed from -waiting in 0.1.0; legacy state files auto-migrate |

Advanced — machine-level toggles (global only)

These describe machine-level capability (audio routing, optional deps), not per-terminal preference, so they don't follow the -g convention — there is no per-terminal scope.

| Phrase | Effect |
| --- | --- |
| cctts -headphones | Toggle headphones-only mode: TTS is suppressed when the default audio device doesn't look like a headphone/headset (requires pycaw) |
| cctts -streaming | Toggle streaming playback: the worker plays the first edge-tts chunk before the full synthesis completes (requires miniaudio; falls back to MCI when unavailable). No pause/resume in streaming mode yet |

Setup / maintenance

| Phrase | Effect |
| --- | --- |
| cctts -intro | Interactive first-run setup (auditions 3 voices, prompts for rate; writes to global) |
| cctts -prewarm | Pre-synthesise canned acks/chimes into the F3 cache (one per voice). Run after install or after editing voices.md; takes ~100s (~1.5 min) and is safe to re-run (skips already-cached entries) |

Auto-voice (no flag — automatic)

New terminals automatically get a distinct voice slot (1–9) so parallel sessions sound different. The pool cycles when 10+ terminals are open. A manual cctts -N overrides the assignment for that terminal; cctts -g -N broadcasts a single voice to every current terminal. (The picker and cctts -10 through cctts -20 still reach slots 10–20; those slots are manual-selection only and, by design, never enter the auto-voice rotation.)
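
The "lowest free slot" assignment could look roughly like the sketch below. The names `POOL` and `assign_voice` and the flat `{session_id: slot}` mapping are assumptions for illustration. How the pool cycles once all nine slots are taken is not specified here, so the wrap-around rule in the code is a guess.

```python
POOL = range(1, 10)  # auto-voice slots 1-9


def assign_voice(assignments: dict[str, int], sid: str) -> int:
    """Give a new terminal the lowest free slot; reuse an existing assignment."""
    if sid in assignments:
        return assignments[sid]
    used = set(assignments.values())
    free = [slot for slot in POOL if slot not in used]
    if free:
        slot = free[0]
    else:
        # Assumption: with all nine slots in use, wrap around by terminal count.
        slot = len(assignments) % len(POOL) + 1
    assignments[sid] = slot
    return slot
```

Starting from an empty mapping, the first three terminals receive slots 1, 2, and 3, which matches the Aria, Ava, Jenny order of the default catalog.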

Slash commands

Discoverable but slower — these go through the LLM (~3–6s round-trip). Prefer the magic phrases above when you know the shortcut.

| Command | Effect |
| --- | --- |
| /cctts:settings | Inline settings picker (uses AskUserQuestion for voice/rate/toggle/test) |
| /cctts:voice | 3-stage picker (Language → Locale → Voice) over the full ~322-voice catalogue. Accepts a direct slot number to skip the picker, e.g. /cctts:voice 3 |
| /cctts:test | Audition a voice with a test phrase. Empty args → voice picker; 1-20 → that voice slot; free text → speak it at current voice |
| /cctts:help | Print the magic-phrase reference (same content as cctts -help) |

Rate syntax

%N is percent of original speech speed.

| Input | Meaning | edge-tts equivalent (storage) |
| --- | --- | --- |
| %100 | Original speed (1x) | +0% |
| %115 | Default (1.15x) | +15% |
| %150 | 1.5x | +50% |
| %200 | 2x (max useful) | +100% |
| %75 | 0.75x | -25% |
| %50 | Half speed | -50% |

Storage uses the edge-tts native offset format internally; the %N form is the user-facing input/display layer.
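
The mapping in the table is a fixed offset of 100. The sketch below shows both directions; the function names are hypothetical and not taken from the cctts source.

```python
def rate_to_offset(percent: int) -> str:
    """Convert a user-facing %N rate into a signed edge-tts style offset."""
    return f"{percent - 100:+d}%"


def offset_to_rate(offset: str) -> int:
    """Convert a stored offset such as '+15%' back into the %N display value."""
    return int(offset.rstrip("%")) + 100
```

For example, `rate_to_offset(115)` returns `"+15%"` and `offset_to_rate("-25%")` returns `75`.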

Volume syntax

!N is percent of original volume. Range !1-!200.

| Input | Meaning | edge-tts equivalent (storage) |
| --- | --- | --- |
| !100 | Original volume (default) | +0% |
| !80 | 80% (quieter) | -20% |
| !50 | Half volume | -50% |
| !150 | 1.5x (louder) | +50% |
| !200 | Max (2x, edge-tts hard cap) | +100% |

edge-tts hard-caps volume at -99% to +100%; the CLI rejects inputs that would exceed this range.
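
The !1-!200 input range lines up with the -99% to +100% cap once 100 is subtracted. A minimal sketch of that check, with a hypothetical function name and error message:

```python
def volume_to_offset(percent: int) -> str:
    """Validate a !N volume and convert it to a signed offset string."""
    if not 1 <= percent <= 200:
        # Hypothetical message; the real CLI's wording may differ.
        raise ValueError(f"volume !{percent} is outside !1-!200")
    return f"{percent - 100:+d}%"
```

The lowest accepted input, `volume_to_offset(1)`, yields `"-99%"`, and the highest, `volume_to_offset(200)`, yields `"+100%"`.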

Page cap (5000 chars per page)

Long assistant turns are split into 5000-char pages. The head plays now; the rest is stashed per-terminal for cctts -more to resume.

When a page hits the cap, you hear:

  1. The prose chunk (split at the last sentence boundary before 5000 chars).
  2. A chime (Windows SystemNotification).
  3. The announcement: "Five thousand character limit reached. Say cctts minus more to continue."

Type cctts -more to speak the next page. If more remains after that page, you'll get another chime + announcement; otherwise the queue is empty. Pending content is per-terminal — each Claude Code window has its own continuation. A fresh prose block (new Stop/PreToolUse fire) overwrites whatever's pending, so you can't accidentally replay last hour's tail.

Override via CLAUDE_TTS_MAX_CHARS env var (also sets the filter's pre-pagination cap, used in tests). The default 5000 ≈ 4 minutes of audio at the default rate.
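
Splitting at the last sentence boundary before the cap can be sketched as follows. This is an illustration only: the function name `split_page` is hypothetical, and treating ".", "!" or "?" followed by whitespace as a sentence boundary is an assumption about the real filter.

```python
import re


def split_page(text: str, cap: int = 5000) -> tuple[str, str]:
    """Return (head, rest), where head ends at the last sentence boundary within cap."""
    if len(text) <= cap:
        return text, ""
    window = text[:cap]
    # Assumption: a boundary is '.', '!' or '?' followed by whitespace.
    ends = [m.end() for m in re.finditer(r"[.!?]\s", window)]
    cut = ends[-1] if ends else cap  # hard cut when no boundary is found
    return text[:cut].rstrip(), text[cut:].lstrip()
```

The head is what would be spoken immediately; the rest is what a later `cctts -more` would pick up, and the same function can be applied to it again for further pages.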

Speech cache

Synthesized MP3s are cached at ~/.claude/hooks/cctts-cache/ keyed by (text, voice, rate, volume). Subsequent identical blocks (e.g. -replay) skip the edge-tts round-trip entirely. LRU eviction triggers at 100 MB or 200 files — whichever first.
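
A cache of this shape could be built from a hash of the four key fields plus a size-and-count sweep. The sketch below is an assumption-heavy illustration: the SHA-256 key, the use of file modification time as the recency signal, and the names `cache_key` and `evict` are not taken from the cctts source.

```python
import hashlib
from pathlib import Path

MAX_BYTES = 100 * 1024 * 1024  # 100 MB
MAX_FILES = 200


def cache_key(text: str, voice: str, rate: str, volume: str) -> str:
    """Derive a stable filename stem from (text, voice, rate, volume)."""
    raw = "\x1f".join((text, voice, rate, volume))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def evict(cache_dir: Path, max_bytes: int = MAX_BYTES,
          max_files: int = MAX_FILES) -> list[Path]:
    """Delete the oldest MP3s until both the size and the count limit hold."""
    # Assumption: mtime stands in for "least recently used".
    files = sorted(cache_dir.glob("*.mp3"), key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in files)
    removed = []
    while files and (total > max_bytes or len(files) > max_files):
        oldest = files.pop(0)
        total -= oldest.stat().st_size
        oldest.unlink()
        removed.append(oldest)
    return removed
```

Because rate and volume are part of the key, changing either one produces a cache miss even for identical text and voice.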

Voice catalog

Edit voices.md to add/remove rows. The CLI re-reads on each invocation. Full edge-tts voice list: python -m edge_tts --list-voices (works regardless of whether edge-tts.exe is on PATH — relevant on Windows, where cctts's auto-install uses pip install --user and the user-site Scripts\ dir often isn't on PATH).

Defaults: slot 1 is the canonical default, and auto-voice hands out slots 1–9 in order of terminal opening. Slots 10–20 are reachable via cctts -10 through cctts -20 (single-keystroke selection caps at slot 9 for muscle memory; 10+ are 2-digit). Auto-voice doesn't touch slots 10–20: by design, the parallel-terminal pool stays at 9 distinct voices.

| # | Voice | Character |
| --- | --- | --- |
| 1 | en-US-AriaNeural | Default: conversational, expressive |
| 2 | en-US-AvaNeural | Natural multilingual |
| 3 | en-US-JennyNeural | Warm, customer-service tone |
| 4 | en-GB-SoniaNeural | UK female |
| 5 | en-GB-LibbyNeural | UK female, younger |
| 6 | en-US-GuyNeural | US male |
| 7 | en-US-AndrewNeural | US male, conversational |
| 8 | en-US-MichelleNeural | US female, more formal |
| 9 | en-US-EmmaNeural | US female, friendly |
| 10 | en-US-RogerNeural | US male, lively |
| 11 | en-IN-NeerjaExpressiveNeural | Indian female, expressive |
| 12 | en-IE-EmilyNeural | Irish female |
| 13 | en-US-ChristopherNeural | US male, authoritative |
| 14 | en-CA-ClaraNeural | Canadian female |
| 15 | en-GB-MaisieNeural | UK female, youthful |
| 16 | en-GB-RyanNeural | UK male |
| 17 | en-GB-ThomasNeural | UK male, formal |
| 18 | en-AU-NatashaNeural | Australian female |
| 19 | en-HK-YanNeural | Hong Kong female |
| 20 | en-US-AnaNeural | US female, cartoon |

Other languages

cctts speaks ~322 edge-tts voices across 70+ languages. The 20 slots above are English-curated, but non-English users have two paths:

Picker (no editing). Run /cctts:voice and select Language → Locale → Voice. The choice applies to the current terminal. Broadcast to global or project from the magic-phrase forms:

cctts -g -voice zh-CN-XiaoxiaoNeural    # global default
cctts -p -voice ja-JP-NanamiNeural      # this project

Custom slots. Hand-edit voices.md to replace a slot with your language's voice — cctts -<N> then targets it directly. See all-voices.md for the full catalogue (regenerate with python cli/gen_all_voices.py).

Per-project override

Project settings live in ~/.claude/hooks/.cctts-state.json under the top-level projects map, keyed by the git root of your cwd (or the cwd itself if it isn't in a git repo). Use the CLI to read/write — bare cctts -p is the on/off toggle, mirroring bare cctts and bare cctts -g:

cctts -p                  # toggle THIS project on/off (writes `enabled` to state.projects[<root>])
cctts -p -status          # print the current project's entry as JSON
cctts -p -4 %125 !80      # voice 4, rate %125, volume !80
cctts -p -chime           # set chime=true for this project
cctts -p -notify          # set notify=true for this project
cctts -p -pause           # mute every terminal in this repo (sticks until -resume)
cctts -p -resume          # un-mute (drops the `paused` field; entry is removed if empty)
cctts -p -clear           # remove the project entry from state.projects

The CLI is the canonical interface — there's no file in the project tree to hand-edit. If you want to inspect or hand-edit the raw entries, the state file is at ~/.claude/hooks/.cctts-state.json and a project entry looks like:

"projects": {
  "C:/Users/aidan/git_projects/myrepo": {
    "enabled": true,
    "voice_index": 4,
    "rate": "+25%",
    "volume": "-20%",
    "chime": true,
    "notify": true,
    "paused": false
  }
}
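
The key of an entry like the one above is the git root of the cwd, or the cwd itself outside a repo. A minimal sketch of that lookup, using a hypothetical function name and assuming forward-slash keys as in the example entry:

```python
from pathlib import Path


def project_key(cwd: Path) -> str:
    """Walk up from cwd to the nearest directory containing .git."""
    cwd = cwd.resolve()
    for candidate in (cwd, *cwd.parents):
        # .git can be a directory or, in worktrees, a file; exists() covers both.
        if (candidate / ".git").exists():
            return candidate.as_posix()
    # No repository found: fall back to the cwd itself.
    return cwd.as_posix()
```

Every terminal opened anywhere below the same repository root therefore maps to the same project entry.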

Broadcast semantics. Every -p field write isn't just an entry write — it also clears the matching per-terminal override on every terminal currently inside the project (matched by recorded cwd → project root). Without that, a terminal that already ran cctts -3 would shadow a later cctts -p -5 indefinitely. -p mirrors -g in this respect, but scoped to the repo. Terminals outside the project are untouched. A per-terminal override set after the broadcast still wins on the next fire — broadcast clears, it doesn't lock.
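
The write-then-clear behaviour can be sketched on a plain dict. The state layout used here is an assumption: in particular, the `project_root` field on each terminal entry is a hypothetical stand-in for however the real state file records a terminal's cwd.

```python
def project_write(state: dict, root: str, field: str, value) -> None:
    """Write a project field, then clear the same field on terminals in that project."""
    state.setdefault("projects", {}).setdefault(root, {})[field] = value
    for term in state.get("terminals", {}).values():
        # Assumption: each terminal entry stores its resolved project root.
        if term.get("project_root") == root:
            term.pop(field, None)  # clear only the matching override
```

Only the written field is cleared, so a terminal that set both a voice and a rate keeps its rate override after a project-level voice write.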

paused: true short-circuits every hook fire whose cwd resolves to this repo — useful for sensitive projects you'd rather not have read aloud. Note: the project entry uses the edge-tts native rate/volume format (+25%, -20%), not the user-facing %N/!N form — see docs/todo.md for the planned unification.

Migrating from v0.1.0. If you upgraded from v0.1.0 and you had .claude/cctts.json files in your repos, they auto-migrate into state.projects on the next hook fire in each repo, then the file is deleted. Look for migrated <path> -> state.projects[<key>] in ~/.claude/hooks/claude-tts.log to confirm.

Env-var force-off

CLAUDE_TTS_DISABLED=1 in the shell environment kills TTS regardless of state. Useful for permanent silence in CI / scripts.
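
A hook could honour the variable with a one-line guard like the one below. This is a sketch with a hypothetical function name; whether values other than the literal "1" also disable TTS is not stated here, so the strict comparison is an assumption.

```python
import os
from collections.abc import Mapping


def tts_disabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the force-off variable is set to "1"."""
    # Assumption: only the exact value "1" counts as disabled.
    return env.get("CLAUDE_TTS_DISABLED") == "1"
```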

Logs

Tail ~/.claude/hooks/claude-tts.log to see every fire, queued block, and worker lifecycle. The log rotates at 1 MB with a single .log.1 backup retained.
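
For reference, Python's standard `RotatingFileHandler` produces exactly this kind of layout (one live file plus a single `.1` backup). The sketch below shows the idea; it is not a claim about how cctts writes its log, and the logger name, format string, and the reading of "1 MB" as 1,000,000 bytes are assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def make_logger(path: Path, max_bytes: int = 1_000_000) -> logging.Logger:
    """Create a logger that rotates at max_bytes and keeps one backup file."""
    logger = logging.getLogger(f"cctts-sketch:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep sketch output out of the root logger
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=1,
                                  encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With `backupCount=1`, each rollover renames the live file to `<name>.1`, replacing any previous backup.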

Useful greps:

  • \[PreToolUse\]|\[Stop\] — every hook fire
  • queue \d+ block — every queued speech
  • play done ok=False — playback failures
  • cache hit|cache miss — speech-cache activity
  • synth attempt \d+ — network retries
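
The same patterns can be tallied from Python when grep isn't handy, for example on a stock Windows shell. The regexes are copied from the list above; the category names and the helper `count_matches` are hypothetical.

```python
import re
from collections.abc import Iterable

PATTERNS = {
    "hook_fire": re.compile(r"\[PreToolUse\]|\[Stop\]"),
    "queued": re.compile(r"queue \d+ block"),
    "play_failed": re.compile(r"play done ok=False"),
    "cache": re.compile(r"cache hit|cache miss"),
    "retry": re.compile(r"synth attempt \d+"),
}


def count_matches(lines: Iterable[str]) -> dict[str, int]:
    """Count how many lines match each of the useful log patterns."""
    counts = dict.fromkeys(PATTERNS, 0)
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A typical call would pass an open file handle for `~/.claude/hooks/claude-tts.log`, since iterating a text file yields its lines.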

Troubleshooting

No audio. Check ~/.claude/hooks/claude-tts.log for play done ok=False. Likely causes: no audio output device, MP3 codec issue, missing edge-tts. The worker auto-installs edge-tts on first speak; if you see cctts: edge-tts install failed in the hook output, your Python is PEP 668-managed or offline — install manually: pip install --user edge-tts.

cctts magic phrase not intercepting. The UserPromptSubmit hook is registered in hooks/hooks.json but only loads at session startup — a soft /reload-plugins won't activate it. Fully quit and reopen Claude Code (close the terminal / IDE session, not just /reload-plugins). Confirm with cctts -status — if you see "UserPromptSubmit operation blocked by hook" with the status output as the reason, the hook is live. If cctts -status still reaches the LLM, the hook didn't load: check ~/.claude/hooks/claude-tts.log (created on first fire) and verify the plugin is enabled in ~/.claude/plugins/installed_plugins.json.

/cctts:voice says "command not found". The plugin isn't loaded. Run /plugin install cctts@quigleybits and restart Claude Code. Verify with /plugin list.

Voice didn't change after -3. Cached worker may still be playing. Wait for current audio to finish; next fire uses the new voice. Or run cctts -3 again (it bumps kill_seq, killing in-flight).

Known limits

  • Windows: tested and supported. Uses Win32 mciSendString; pause/resume + barge-in fully supported.
  • macOS/Linux: experimental — not validated for the 0.1.0 release. macOS uses afplay with SIGSTOP/SIGCONT for pause; Linux prefers mpv (with IPC for pause/resume) and falls back to paplay/aplay (no pause). These paths exist in the codebase but were written and tested only via the GitHub Actions matrix (no real-audio smoke). Manual hardware validation is on the post-0.1.0 roadmap. If you run cctts on macOS or Linux, expect rough edges and please report regressions via the log.
  • Fenced code blocks are collapsed to a heuristic summary like "wrote a 12-line Python snippet" — language detected from the fence info string, line count from the body. Inline code is spoken verbatim.
  • Per-terminal scope needs an active Claude Code session id (resolved from CLAUDE_SESSION_ID or the sidecar file at ~/.claude/hooks/.cctts-session-id). If neither is available (e.g. running the CLI from a plain shell outside any Claude Code session), per-terminal setters refuse and prompt you to use cctts -g for global scope.
  • Mid-sentence barge-in only fires on slash-command actions, not microphone input.
  • A non-cctts user prompt automatically bumps kill_seq, so any in-flight TTS stops on the worker's next 100ms tick. Magic phrases like cctts -status skip this auto-interrupt — the CLI bumps kill_seq itself for the ops that need it.
  • Notification events (permission prompts, idle "Claude is waiting" nudges) follow a two-flag matrix: -notify=ON speaks the notification message (bypasses dedupe + history — you'll always hear "Claude needs permission to use Bash" even on repeat); -notify=OFF + -chime=ON plays a chime cue instead; both OFF is fully silent.
  • Streaming mode (cctts -streaming) needs miniaudio installed. Without it the worker logs streaming: miniaudio not installed, falling back to MCI path and uses the regular synth-then-MCI flow. Streaming has no pause/resume yet — toggle back off if you need those.
  • cctts -headphones mode matches the default Windows playback device's friendly name against headphone|headset|earbud|earphone|airpod|a2dp|hands-free. Unrecognised names (e.g. branded Bluetooth headphones like "WH-1000XM5") will be treated as non-headphones — extend _HEADPHONE_RE in hooks/claude-tts.py if needed. Fails open: any pycaw error returns "headphones present" so audio is never silently dropped because detection broke.
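
The headphone check in the last item can be illustrated with the pattern quoted there. This is a sketch: the alternation is taken from the text above, but case-insensitive matching and the helper name `looks_like_headphones` are assumptions, and the real `_HEADPHONE_RE` in hooks/claude-tts.py may differ.

```python
import re

# Alternation copied from the list item above; IGNORECASE is an assumption.
_HEADPHONE_RE = re.compile(
    r"headphone|headset|earbud|earphone|airpod|a2dp|hands-free",
    re.IGNORECASE,
)


def looks_like_headphones(friendly_name: str) -> bool:
    """Return True when the device's friendly name matches a headphone keyword."""
    return _HEADPHONE_RE.search(friendly_name) is not None
```

A branded name such as "WH-1000XM5" contains none of the keywords, which is why it is treated as a non-headphone device until the pattern is extended.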

Privacy

Assistant text leaves your machine. Every spoken block is sent to Microsoft's Edge TTS endpoint for synthesis. That includes whatever the assistant printed to you — file paths, snippets of code, hostnames, ticket numbers, anything in the response stream. Fenced code blocks are filtered out before synthesis, but inline code and free prose are not.
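
The fenced-code filtering mentioned above could be approximated as below, producing the kind of summary described under Known limits. This is a sketch under assumptions: the regex, the function name `collapse_fences`, and the use of the raw info string as the language label are illustrative, not the plugin's actual filter.

```python
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fence itself

_FENCE_RE = re.compile(
    re.escape(FENCE) + r"([^\n]*)\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def collapse_fences(text: str) -> str:
    """Replace each fenced block with a short spoken summary."""
    def summarise(match: re.Match) -> str:
        lang = match.group(1).strip() or "code"  # from the fence info string
        line_count = len(match.group(2).splitlines())
        return f"wrote a {line_count}-line {lang} snippet"
    return _FENCE_RE.sub(summarise, text)
```

Inline code is not matched by this pattern, so it would still be sent for synthesis along with the surrounding prose.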

The most recent spoken block is also persisted locally to ~/.claude/hooks/.cctts-state.json (for replay) and the first 80 chars of each block are written to ~/.claude/hooks/claude-tts.log.

Don't enable cctts in sessions handling secrets, customer PII, or anything else you wouldn't paste into Bing. For such sessions, set CLAUDE_TTS_DISABLED=1 in the environment or toggle the terminal off with bare cctts.

Uninstall

/plugin uninstall cctts@quigleybits
/plugin marketplace remove quigleybits

Then fully restart Claude Code so the UserPromptSubmit hook unloads.

Optional cleanup (these files aren't removed by /plugin uninstall):

  • ~/.claude/hooks/.cctts-state.json — per-terminal + global + per-project state
  • ~/.claude/hooks/claude-tts.log (+ .log.1) — fire log
  • ~/.claude/hooks/cctts-cache/ — cached MP3s (can be large)
  • ~/.claude/hooks/.cctts-session-id — session-id sidecar
  • Any leftover .claude/cctts.json in repos last used with v0.1.0 (0.1.1+ auto-migrates these into the central state file, so this only applies if you never opened those repos with v0.1.1)

Development

Run the test suite:

python -m pytest -q

Regenerate slash-command files after editing voices.md:

python cli/gen_commands.py

480 tests cover hooks (UserPromptSubmit/PreToolUse/Stop/Notification/SubagentStop), state schema + per-session dedupe/kill-seq isolation, project storage + migration, the speech filter, the worker (cache, retries, streaming + MCI, long-input chunking), and the CLI surface (per-terminal/-g/-p scope, intro/test/replay/skip).

License

MIT — see LICENSE.
