Configuration Reference

The entire stack is controlled by a single config.toml file. All settings have sensible defaults; you only need to set API keys and choose your providers.

Full Configuration Example

config.toml

[server]
port = "8080"              # Server listen port
public_ip = ""             # Public IP for ICE/STUN (required for TURN)
turn_secret = ""           # Shared secret for built-in STUN/TURN server
jwt_secret = ""            # Optional: JWT secret for token auth
api_key = ""               # Optional: API key for /token endpoint

[plugins]
directory = "./plugins"    # Plugin and skills directory

[pipeline]
barge_in = true            # Allow users to interrupt the agent
greeting = ""              # Optional: greeting message on connect
debug = false              # Emit timing events over DataChannel

[stt]
provider = "deepgram"      # deepgram | openai

[llm]
provider = "openai"        # openai | ollama

[tts]
provider = "cartesia"      # cartesia | deepgram | elevenlabs

# ── Provider credentials ──

[deepgram]
api_key = ""               # Deepgram API key (STT + optional TTS)

[openai]
api_key = ""               # OpenAI API key
model = "gpt-4o-mini"      # LLM model name

[cartesia]
api_key = ""               # Cartesia API key (TTS)

[elevenlabs]
api_key = ""               # ElevenLabs API key (TTS)

[ollama]
host = "http://localhost:11434"  # Ollama server URL
model = "gemma4:e4b"             # Local model name

[server]

| Key | Default | Description |
| --- | --- | --- |
| port | "8080" | HTTP listen port |
| public_ip | "" | Public IP for ICE candidates and STUN/TURN (required when deploying to cloud) |
| turn_secret | "" | Shared secret for built-in STUN/TURN server (required when public_ip is set) |
| jwt_secret | "" | If set, WHIP requests require a valid JWT bearer token |
| api_key | "" | If set, the /token endpoint requires this key to issue JWTs |

STUN/TURN Server

When deploying to the cloud (EC2, DigitalOcean, etc.), browsers cannot directly connect to your server's private IP. Set public_ip and turn_secret to enable the built-in STUN/TURN server on port 3478.

Cloud deployment config

[server]
public_ip = "1.2.3.4"      # Your server's public IP
turn_secret = "your-secret"  # Shared secret for TURN authentication

The server automatically starts a STUN/TURN server when both values are set. The username is always voiceagent and the credential is your turn_secret.

| Setting | Description |
| --- | --- |
| public_ip | Public IP address advertised in ICE candidates (required for NAT traversal) |
| turn_secret | Shared secret for TURN authentication. Username is always "voiceagent" |
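
Clients that need an explicit ICE server list can derive one from these two settings. The following is a minimal sketch, not part of the server itself: the helper name build_ice_servers is hypothetical, and it assumes the built-in server answers both STUN and TURN at public_ip on port 3478 and that the client accepts the standard WebRTC iceServers shape.

```python
import json

TURN_PORT = 3478              # port of the built-in STUN/TURN server
TURN_USERNAME = "voiceagent"  # fixed username, per this reference


def build_ice_servers(public_ip: str, turn_secret: str) -> list[dict]:
    """Build a WebRTC-style iceServers list (hypothetical helper)."""
    if not public_ip:
        # No public IP configured: the built-in STUN/TURN server is not started.
        return []
    if not turn_secret:
        # turn_secret is required whenever public_ip is set.
        raise ValueError("turn_secret is required when public_ip is set")
    return [
        {"urls": [f"stun:{public_ip}:{TURN_PORT}"]},
        {
            "urls": [f"turn:{public_ip}:{TURN_PORT}"],
            "username": TURN_USERNAME,
            "credential": turn_secret,  # the credential is the shared secret
        },
    ]


if __name__ == "__main__":
    # Serialize for a browser client's RTCPeerConnection configuration.
    print(json.dumps(build_ice_servers("1.2.3.4", "your-secret"), indent=2))
```

Raising an error for a missing turn_secret mirrors the rule in the [server] table; returning an empty list keeps local setups working without extra configuration.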

[pipeline]

| Key | Default | Description |
| --- | --- | --- |
| barge_in | true | Allow users to interrupt the agent mid-response |
| greeting | "" | If set, the agent speaks this message when a session starts |
| debug | false | Emit timing events (STT, LLM, TTS latency) over the DataChannel |

STT Providers

Set [stt] provider to one of these values:

Deepgram Nova-3

provider = "deepgram"

Real-time streaming STT with word-level timestamps and punctuation. Recommended for lowest latency.

[deepgram]
api_key = "your-deepgram-api-key"

OpenAI Whisper

provider = "openai"

OpenAI's speech recognition model. Uses the same OpenAI API key as the OpenAI LLM provider.

[openai]
api_key = "your-openai-api-key"

LLM Providers

Set [llm] provider to one of these values:

OpenAI

provider = "openai"

Supports GPT-4o and GPT-4o-mini, with streaming completions and function calling for plugin invocation.

[openai]
api_key = "your-openai-api-key"
model = "gpt-4o-mini"     # or "gpt-4o" for maximum capability

Ollama (local)

provider = "ollama"

Run local models with zero API costs. Requires Ollama running separately.

[ollama]
host = "http://localhost:11434"
model = "gemma4:e4b"      # or "mistral", "qwen2.5", etc.

TTS Providers

Set [tts] provider to one of these values:

Cartesia Sonic

provider = "cartesia"

Ultra-low latency streaming TTS with natural prosody. Recommended for fastest time-to-first-byte.

[cartesia]
api_key = "your-cartesia-api-key"

Deepgram Aura

provider = "deepgram"

High-quality neural TTS. Uses the same Deepgram API key as STT.

[deepgram]
api_key = "your-deepgram-api-key"   # same key for STT + TTS

ElevenLabs

provider = "elevenlabs"

Premium voice cloning and multilingual speech synthesis.

[elevenlabs]
api_key = "your-elevenlabs-api-key"
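
Because each provider value must match one of the names listed above, a quick check before starting the server can catch typos. The sketch below is illustrative only: check_providers is a hypothetical helper that takes an already-parsed config.toml as a dict (for example from Python's tomllib), and it is not how the server itself validates configuration.

```python
# Allowed values, taken from the provider lists in this reference.
ALLOWED_PROVIDERS = {
    "stt": {"deepgram", "openai"},
    "llm": {"openai", "ollama"},
    "tts": {"cartesia", "deepgram", "elevenlabs"},
}


def check_providers(config: dict) -> list[str]:
    """Return warnings for a parsed config.toml dict (hypothetical helper)."""
    warnings = []
    for section, allowed in ALLOWED_PROVIDERS.items():
        provider = config.get(section, {}).get("provider")
        if provider is None:
            continue  # key omitted: the documented default applies
        if provider not in allowed:
            warnings.append(
                f"[{section}] provider = {provider!r} is not one of {sorted(allowed)}"
            )
            continue
        # Ollama runs locally and needs no API key; the other providers do.
        # A key may still arrive via an environment variable, so this is
        # only a warning, not an error.
        if provider != "ollama" and not config.get(provider, {}).get("api_key"):
            warnings.append(
                f"[{section}] uses {provider!r} but [{provider}] api_key is empty"
            )
    return warnings


if __name__ == "__main__":
    sample = {
        "stt": {"provider": "deepgram"},
        "llm": {"provider": "ollama"},
        "tts": {"provider": "cartesia"},
        "deepgram": {"api_key": "your-deepgram-api-key"},
        "cartesia": {"api_key": "your-cartesia-api-key"},
    }
    print(check_providers(sample))
```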

[plugins]

| Key | Default | Description |
| --- | --- | --- |
| directory | "./plugins" | Path to the plugins and skills directory. Scanned at startup. |

The plugin directory is laid out as follows:

plugins/
├── plugins/           # External plugins (Python/TS/JS)
│   ├── math-calculate/
│   ├── weather-get/
│   └── time-get/
└── skills/            # Skill definitions (Markdown)
    ├── friendly-assistant/
    └── concise-responder/
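
To confirm that a directory matches this layout before pointing the server at it, a short script can list what sits under plugins/ and skills/. This is only a sketch: list_plugin_dirs is a hypothetical helper that inspects directory names, and it makes no assumption about how the server actually loads plugins or skills.

```python
from pathlib import Path


def list_plugin_dirs(directory: str = "./plugins") -> dict[str, list[str]]:
    """List subdirectory names under plugins/ and skills/ (hypothetical helper)."""
    root = Path(directory)
    found = {}
    for kind in ("plugins", "skills"):
        sub = root / kind
        if sub.is_dir():
            # Each plugin or skill lives in its own subdirectory.
            found[kind] = sorted(p.name for p in sub.iterdir() if p.is_dir())
        else:
            found[kind] = []  # subdirectory missing: nothing to report
    return found


if __name__ == "__main__":
    for kind, names in list_plugin_dirs().items():
        print(f"{kind}: {', '.join(names) or '(none)'}")
```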

Environment Variables

Provider API keys can also be set via environment variables. These take precedence over config.toml values:

| Variable | Overrides |
| --- | --- |
| DEEPGRAM_API_KEY | [deepgram] api_key |
| OPENAI_API_KEY | [openai] api_key |
| CARTESIA_API_KEY | [cartesia] api_key |
| ELEVENLABS_API_KEY | [elevenlabs] api_key |
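
The precedence rule can be pictured as a two-step lookup: environment first, then config.toml. The sketch below illustrates that order; resolve_api_key is a hypothetical helper, and treating an empty environment variable as unset is an assumption rather than documented behavior.

```python
import os

# Environment variable for each provider section, from the table above.
ENV_OVERRIDES = {
    "deepgram": "DEEPGRAM_API_KEY",
    "openai": "OPENAI_API_KEY",
    "cartesia": "CARTESIA_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}


def resolve_api_key(config: dict, provider: str, env=os.environ) -> str:
    """Return the effective API key for a provider (hypothetical helper)."""
    from_env = env.get(ENV_OVERRIDES[provider], "")
    if from_env:
        # Assumption: an empty variable counts as unset.
        return from_env
    # Fall back to the value parsed from config.toml.
    return config.get(provider, {}).get("api_key", "")


if __name__ == "__main__":
    parsed = {"openai": {"api_key": "key-from-config"}}
    print(resolve_api_key(parsed, "openai"))
```

Passing env explicitly makes the lookup easy to exercise without touching the real process environment.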

JWT Authentication (Optional)

When jwt_secret is set, the WHIP endpoint requires a valid JWT bearer token. Clients obtain tokens from the /token endpoint, which requires the API key when api_key is also set.

config.toml

[server]
jwt_secret = "your-secret-key"    # Enable JWT auth
api_key = "your-api-key"          # Protect the /token endpoint

Token request

curl -X POST http://localhost:8080/token \
  -H "Authorization: Bearer your-api-key"

WHIP with JWT

curl -X POST http://localhost:8080/whip \
  -H "Authorization: Bearer <jwt-token>" \
  -H "Content-Type: application/sdp" \
  --data-binary @offer.sdp   # plain -d would strip the line breaks SDP depends on
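
The same two requests can be built from a script. The sketch below mirrors the curl commands using only the Python standard library; the helper names token_request and whip_request are hypothetical, and because this reference does not specify the /token response format, extracting the JWT from the response is left to the caller.

```python
import urllib.request

BASE_URL = "http://localhost:8080"  # matches the default port


def token_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build POST /token, authenticated with the server's api_key."""
    return urllib.request.Request(
        f"{base_url}/token",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def whip_request(
    jwt_token: str, sdp_offer: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build POST /whip with the SDP offer as body and the JWT as bearer token."""
    return urllib.request.Request(
        f"{base_url}/whip",
        data=sdp_offer.encode("utf-8"),  # body is sent byte-for-byte
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/sdp",
        },
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(token_request("your-api-key")) as resp:
#       body = resp.read()  # response format is not specified in this reference
```

Building the requests separately from sending them keeps the sketch usable with any HTTP client and makes the headers easy to inspect.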