Abhaykumar9035/ProjectFriday

Offline Voice Assistant

A fully offline, privacy-focused voice assistant that runs locally on your machine. It uses Whisper for Speech-to-Text, Gemma (via llama.cpp) for intelligence, and Coqui XTTS for high-quality, clonable Text-to-Speech.

Features

  • Offline Reliability: No internet connection required after initial model downloads.
  • Voice Cloning: Clone any voice using just a few seconds of audio samples.
  • Low Latency: Optimized for reasonably fast CPU inference (GPU recommended for TTS).
  • Privacy: No audio or text leaves your device.

Prerequisites

  • Python 3.9+
  • Git
  • Basic knowledge of terminal/command prompt.

Setup Instructions

1. Install Dependencies

pip install -r requirements.txt

Note: You may need to install PyTorch separately depending on your hardware (CUDA/CPU).
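Since the right PyTorch build depends on your hardware, it can help to check which device is actually usable before starting the assistant. The following is a minimal sketch; `pick_device` is a hypothetical helper name, not part of this repository.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed with a usable GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all, so only CPU work is possible.
        return "cpu"

    import torch  # imported lazily so this script also runs without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Inference device: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch wheel is probably a CPU-only build.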

2. Download Binaries and Models

ASR (Whisper.cpp)

  1. Clone whisper.cpp or download the prebuilt binary for your OS.
  2. Place main.exe (Windows) or main (Linux/Mac) inside asr/whisper.cpp/.
  3. Download a model (e.g., ggml-base.en.bin) and place it in asr/whisper.cpp/models/.

LLM (Llama.cpp)

  1. Clone llama.cpp or download the prebuilt binary.
  2. Place main.exe (Windows) or main (Linux/Mac) inside llm/llama.cpp/.
  3. Download the Gemma GGUF model (e.g., gemma-2b-it.gguf).
  4. Place the model in llm/llama.cpp/models/.

TTS (Coqui XTTS)

The TTS Python library (Coqui TTS) automatically downloads the XTTS-v2 model on first run. That first run is the only TTS step that needs an internet connection.
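For orientation, here is a hedged sketch of how XTTS-v2 is typically driven through the Coqui `TTS.api` interface. The function name `speak_to_file` and the default output path are assumptions for illustration; the repository's own wrapper may look different.

```python
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def speak_to_file(
    text: str,
    speaker_wavs: list[str],
    out_path: str = "reply.wav",  # hypothetical default, not from the project
    language: str = "en",
) -> str:
    """Synthesize `text` in the voice of the reference clips and save it as WAV."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not speaker_wavs:
        raise ValueError("at least one reference WAV is required for cloning")

    # Imported lazily: loading Coqui TTS is slow and pulls in PyTorch.
    from TTS.api import TTS

    tts = TTS(XTTS_MODEL)  # downloads the model on first use
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wavs,
        language=language,
        file_path=out_path,
    )
    return out_path
```

Loading the model takes several seconds, so a long-running assistant would normally create the `TTS` object once and reuse it rather than reloading it per reply.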

3. Voice Cloning Setup

  1. Record 3-5 audio samples (WAV format, roughly 5-10 seconds each) of the voice you want to clone.
  2. Place them in the client_voice/ directory.
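The sample guidelines above can be checked automatically before the first run. This is a minimal sketch using only the standard library; `check_voice_samples` is a hypothetical helper, and it assumes the clips are uncompressed PCM WAV files, which is what Python's `wave` module can read.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Length of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_voice_samples(folder="client_voice", min_s=5.0, max_s=10.0):
    """Return a list of human-readable problems; an empty list means all good."""
    samples = sorted(Path(folder).glob("*.wav"))
    problems = []
    if not 3 <= len(samples) <= 5:
        problems.append(f"expected 3-5 samples, found {len(samples)}")
    for sample in samples:
        duration = wav_duration_seconds(sample)
        if not min_s <= duration <= max_s:
            problems.append(
                f"{sample.name}: {duration:.1f}s is outside {min_s:.0f}-{max_s:.0f}s"
            )
    return problems


if __name__ == "__main__":
    for problem in check_voice_samples():
        print("WARNING:", problem)
```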

4. Configuration

If your binaries, models, or voice samples live somewhere other than the default locations described above, edit config.yaml to point to the correct paths.
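The actual keys in config.yaml are not listed in this README, so the sketch below uses hypothetical key names purely to illustrate the idea of defaults that a config file can override. It assumes PyYAML is available for the file-loading step.

```python
from pathlib import Path

# Hypothetical keys; check config.yaml for the real names.
# Defaults mirror the directory layout from the setup steps (Linux/Mac binary names).
DEFAULTS = {
    "whisper_binary": "asr/whisper.cpp/main",
    "whisper_model": "asr/whisper.cpp/models/ggml-base.en.bin",
    "llama_binary": "llm/llama.cpp/main",
    "llama_model": "llm/llama.cpp/models/gemma-2b-it.gguf",
    "voice_dir": "client_voice",
}


def resolve_paths(overrides: dict, root=".") -> dict:
    """Merge overrides into the defaults and resolve each path against `root`."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    merged = {**DEFAULTS, **overrides}
    # Absolute override paths are kept as-is by the `/` operator.
    return {key: Path(root) / value for key, value in merged.items()}


def load_config(path="config.yaml") -> dict:
    """Read overrides from a YAML file (requires PyYAML)."""
    import yaml

    with open(path, encoding="utf-8") as handle:
        overrides = yaml.safe_load(handle) or {}
    return resolve_paths(overrides, root=Path(path).parent)
```

Rejecting unknown keys early tends to catch typos in the config file that would otherwise be silently ignored.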

Usage

Run the assistant:

python assistant.py

  1. Wait for initialization to finish.
  2. When it says "Listening...", speak into your microphone.
  3. Stop speaking; the pause triggers processing.
  4. The assistant replies in the cloned voice.
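Step 3 relies on detecting when you have stopped speaking. The README does not say how the project does this, so the following is only a sketch of one common approach: an energy threshold over short audio frames. The threshold and frame counts are illustrative assumptions, and the input is assumed to be 16-bit little-endian mono PCM.

```python
import math
import struct


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit little-endian mono PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def utterance_end(frames, threshold=500.0, silence_frames=25):
    """Return the index of the frame where the utterance ends, or None.

    The utterance ends once `silence_frames` consecutive quiet frames follow
    at least one loud frame. Leading silence is ignored.
    """
    heard_speech = False
    quiet_run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= silence_frames:
                return index
    return None
```

With 20 ms frames, `silence_frames=25` corresponds to about half a second of silence. A lower value responds faster but may cut off speakers who pause mid-sentence.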

Troubleshooting

  • "Whisper binary not found": Ensure asr/whisper.cpp/main.exe (or main on Linux/Mac) exists.
  • "Llama binary not found": Ensure llm/llama.cpp/main.exe (or main on Linux/Mac) exists.
  • Slow TTS: XTTS is computationally heavy, so a GPU is strongly recommended. On CPU, expect a significant delay before each reply.
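The two "binary not found" cases can be checked up front with a small preflight script. This is a sketch under the directory layout described in the setup steps; `missing_files` and `binary_name` are hypothetical helpers, not functions from this repository.

```python
import sys
from pathlib import Path


def binary_name(platform: str = sys.platform) -> str:
    """Executable name the setup steps expect on the given platform."""
    return "main.exe" if platform.startswith("win") else "main"


def missing_files(root=".", platform: str = sys.platform) -> list:
    """Return one message per expected binary that is not present under `root`."""
    root = Path(root)
    exe = binary_name(platform)
    expected = {
        "Whisper binary": root / "asr" / "whisper.cpp" / exe,
        "Llama binary": root / "llm" / "llama.cpp" / exe,
    }
    return [
        f"{label} not found: {path}"
        for label, path in expected.items()
        if not path.is_file()
    ]


if __name__ == "__main__":
    for message in missing_files():
        print(message)
```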

Structure

  • assistant.py: Main entry point.
  • modules/: Wrappers for binary interactions.
  • utils/: Audio and Text helpers.
  • config.yaml: System configuration.
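To give a sense of what a wrapper in modules/ might do, here is a hedged sketch that builds command lines for the two binaries and runs them with `subprocess`. The function names are hypothetical. The flags shown (`-m`, `-f`, `-nt` for whisper.cpp; `-m`, `-p`, `-n` for llama.cpp) are the ones accepted by the `main` executables of those projects, but verify them against the version you downloaded.

```python
import subprocess


def whisper_command(binary, model, wav_path) -> list:
    """Transcribe a WAV file; -nt suppresses timestamps in the output."""
    # whisper.cpp expects 16 kHz WAV input.
    return [str(binary), "-m", str(model), "-f", str(wav_path), "-nt"]


def llama_command(binary, model, prompt: str, max_tokens: int = 128) -> list:
    """Generate up to `max_tokens` tokens in response to `prompt`."""
    return [str(binary), "-m", str(model), "-p", prompt, "-n", str(max_tokens)]


def run(command: list, timeout: float = 120.0) -> str:
    """Run a command and return its stripped standard output.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if the binary takes too long.
    """
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout.strip()
```

Passing the command as a list rather than a shell string avoids quoting problems when the prompt contains spaces or special characters.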

About

Private, fully local voice assistant that keeps your conversations offline while supporting intelligent responses and cloned voice synthesis.
