A minimal FastAPI service demonstrating the SIP webhook flow: a phone call arrives at Sayna, Sayna forwards a signed webhook here, and a background voice agent (Deepgram STT → Gemini stream → ElevenLabs TTS) holds a conversation with the caller until they hang up.

    ┌──────────┐       ┌──────────┐  signed webhook   ┌──────────────────┐
    │  Caller  │──SIP─>│  Sayna   │──────────────────>│  POST /sayna/    │
    └──────────┘       │  Server  │                   │     webhook      │
                       │          │                   │                  │
                       │          │<──WebSocket──┐    │  (background)    │
                       └────┬─────┘   audio      │    └────────┬─────────┘
                            │                    │             │
                            ▼                    │             ▼
                       ┌──────────┐              │       ┌──────────────┐
                       │ LiveKit  │              └───────│ SaynaClient  │
                       │   room   │                      │  + Gemini    │
                       └──────────┘                      └──────────────┘

Flow:
- SIP call hits the Sayna server, which routes it into a LiveKit room.
- Sayna posts a signed webhook to `POST /sayna/webhook`.
- The route verifies the HMAC signature with `WebhookReceiver` and dispatches a background task.
- The background `VoiceSession` opens a WebSocket to Sayna, joins the same LiveKit room as `ai-agent`, and speaks the greeting.
- Each final STT transcript is fed to the `VoiceAgent`, which streams a Gemini response.
- The agent yields one sentence at a time; each sentence is sent to TTS immediately for low-latency speech.
- When the caller hangs up, `participant_disconnected` fires, the session disconnects, and history is cleared.
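
The signature check in the flow above can be illustrated with a generic HMAC-SHA256 sketch built on the standard library. This is not the `WebhookReceiver` implementation: the signed message layout (`<timestamp>.<body>`), the five-minute freshness window, and the helper names `sign_payload` and `verify_signature` are assumptions made for illustration only. The example itself delegates this step to `WebhookReceiver` from the SDK.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_payload(secret: str, timestamp: str, body: bytes) -> str:
    """Hex HMAC-SHA256 over "<timestamp>." + body (assumed message layout)."""
    message = timestamp.encode("utf-8") + b"." + body
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_signature(
    secret: str,
    timestamp: str,
    body: bytes,
    signature: str,
    max_age_seconds: int = 300,  # assumed freshness window
    now: Optional[float] = None,
) -> bool:
    """Reject stale timestamps, then compare signatures in constant time."""
    current = time.time() if now is None else now
    try:
        sent_at = float(timestamp)
    except ValueError:
        return False
    if abs(current - sent_at) > max_age_seconds:
        return False  # too old or too far in the future: possible replay
    expected = sign_payload(secret, timestamp, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Whatever the exact scheme, the signature should be computed over the raw request bytes, before any JSON parsing, since re-serialising the payload can change the bytes and invalidate it.
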
- Python 3.10+
- A Google AI API key for Gemini
- Docker (for the local Sayna + LiveKit + LiveKit SIP stack)
The bundled ../docker-compose.yml brings up Sayna (port 3002), LiveKit (7880), LiveKit SIP (5063), and Redis. From the examples/ directory:

    export DEEPGRAM_API_KEY=...
    export ELEVENLABS_API_KEY=...
    docker compose up

The `sayna` service is configured with `../sayna.example.yaml`, which forwards SIP webhooks to http://localhost:5002/sayna/webhook, the route exposed by this example.

    cd python-sayna-example
    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    cp .env.example .env
    # edit .env and set GOOGLE_API_KEY
    python main.py

The server listens on http://0.0.0.0:5002 by default. Place a SIP call to the running LiveKit SIP gateway and the agent will answer.

| Variable | Required | Default | Description |
|---|---|---|---|
| `GOOGLE_API_KEY` | yes | (none) | Google AI API key for Gemini |
| `SAYNA_URL` | no | `http://localhost:3001` | Sayna API base URL |
| `SAYNA_API_KEY` | no | `secret-key-1234567890` | Matches `auth.api_secrets[0].secret` in `sayna.example.yaml` |
| `SAYNA_WEBHOOK_SECRET` | no | `hook-secret-1234567890` | Matches `sip.hook_secret` in `sayna.example.yaml` |
| `ELEVENLABS_VOICE_ID` | no | `ZIlrSGI4jZqobxRKprJz` | ElevenLabs voice the agent speaks with |
| `PORT` | no | `5002` | FastAPI bind port; must match `hooks[].url` in `sayna.example.yaml` |
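
As a rough illustration of how these variables could be read, here is a minimal standard-library sketch. It is not the contents of `config.py`, which may well use a different mechanism; the `Settings` dataclass layout and the `load_settings` helper are hypothetical, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the table above."""

    google_api_key: str
    sayna_url: str
    sayna_api_key: str
    sayna_webhook_secret: str
    elevenlabs_voice_id: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    google_api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not google_api_key:
        # The only variable without a default, so fail early and clearly.
        raise RuntimeError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=google_api_key,
        sayna_url=env.get("SAYNA_URL", "http://localhost:3001"),
        sayna_api_key=env.get("SAYNA_API_KEY", "secret-key-1234567890"),
        sayna_webhook_secret=env.get("SAYNA_WEBHOOK_SECRET", "hook-secret-1234567890"),
        elevenlabs_voice_id=env.get("ELEVENLABS_VOICE_ID", "ZIlrSGI4jZqobxRKprJz"),
        port=int(env.get("PORT", "5002")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the loader easy to test, for example `load_settings({"GOOGLE_API_KEY": "test-key"})`.
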

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/` | Liveness check |
| `POST` | `/sayna/webhook` | Receives signed SIP webhooks from Sayna |

    python-sayna-example/
    ├── main.py            # FastAPI app + the two routes
    ├── config.py          # Settings (defaults from sayna.example.yaml)
    ├── prompts.py         # Voice assistant system prompt + fallbacks
    ├── voice_agent.py     # Gemini streaming + sentence extraction (no Sayna imports)
    ├── voice_session.py   # SaynaClient lifecycle + STT → agent → TTS glue
    ├── requirements.txt
    ├── .env.example
    └── .gitignore

voice_agent.py and voice_session.py are intentionally decoupled — the agent module has no
Sayna dependency, so it can be exercised against any text-in/text-out harness.
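
To give a feel for the sentence-at-a-time behaviour, here is a minimal sketch of sentence extraction over a stream of text chunks. It is an illustration and not the code in `voice_agent.py`: the `iter_sentences` name, the synchronous generator shape, and the punctuation-based boundary rule are all assumptions. A real agent would more likely consume an async stream from the model.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: ., ! or ? followed by whitespace ends a sentence.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in the chunk stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # The last part may still be growing, so keep it buffered.
        buffer = parts.pop()
        for sentence in parts:
            sentence = sentence.strip()
            if sentence:
                yield sentence
    # Flush whatever remains when the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail
```

Feeding it the chunks `["Hel", "lo there. How ", "are you? I'm", " fine"]` yields `"Hello there."`, `"How are you?"`, and `"I'm fine"` in that order, so the first sentence can be sent to TTS before the model has finished responding. This simple rule will also split after abbreviations such as "Dr.", which a production splitter would need to handle.
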

- `../nestjs-ai-sdk-server/`: Node.js sibling demonstrating the browser/`POST /start` flow
- `../sayna.example.yaml`: Sayna server config used by this example
- `sayna-client` on PyPI: the Python SDK
- Sayna docs