Complete reference for all agent configuration parameters available in the Voice AI API.
Overview
Agent configuration consists of:
- Root-level parameters: name and kb_id are set at the root level of create/update requests
- config object: contains conversation style, voice settings, timing controls, and integrations
All configuration parameters are optional (except name when creating) - the system uses sensible defaults for any omitted fields.
Status Management: Agent status (paused, deployed, disabled) cannot be updated directly through the update endpoint. Use the dedicated endpoints: Deploy Agent, Pause Agent, or Delete Agent.
Update Behavior: Agent updates are partial. Omit a root-level field or config field to leave it unchanged. Use null only for fields or containers that are documented below as nullable or clearable.
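The partial-update rule can be sketched as a recursive merge. The helper below is illustrative only, not the platform's actual server logic: omitted keys keep their current value, an explicit null (`None`) clears a field, and nested objects such as `config` merge key by key.

```python
def merge_agent_update(current: dict, update: dict) -> dict:
    """Illustrative partial-update merge: omitted keys are preserved,
    explicit null (None) clears a field, nested objects merge recursively."""
    merged = dict(current)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_agent_update(merged[key], value)
        else:
            merged[key] = value  # includes None, which clears the field
    return merged

current = {"name": "My Agent",
           "config": {"llm_temperature": 0.7, "phone_number": "+14155551234"}}
update = {"config": {"llm_temperature": 0.3, "phone_number": None}}
result = merge_agent_update(current, update)
# name is unchanged; llm_temperature is updated; phone_number is cleared
```

Remember that only fields documented as nullable accept `null`; for everything else, omit the field instead.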
Agent-Level Parameters
These parameters are set at the root level of agent create/update requests, not inside the config object:
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes (create), No (update) | Agent name. Must be at least 1 character. |
| kb_id | number | No | Knowledge base ID to assign to the agent. Set to an existing knowledge base ID to assign it, or null to remove the knowledge base. See Knowledge Base Management. |
Configuration Structure
When creating or updating an agent, the request structure is:
```json
{
  "name": "My Agent",
  "config": {
    "prompt": "You are a helpful assistant...",
    "greeting": "Hello! How can I help you?",
    "llm_temperature": 0.7,
    "llm_model": "gemini-2.5-flash-lite",
    "tts_min_sentence_len": 20,
    "tts_params": {
      "voice_id": "abc",
      "temperature": 1.0
    },
    "allow_interruptions": true,
    "min_silence_duration": 0.55,
    "phone_number": "+14155551234",
    "recording_enabled": true,
    "mcp_servers": [...]
  },
  "kb_id": 123
}
```
The config object contains all behavioral parameters. The kb_id parameter is set at the root level to assign a knowledge base to the agent.
Core Conversation Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | string | null | Custom instructions that define the agent’s personality, role, and behavior. This is the system prompt sent to the LLM. |
| greeting | string | null | Initial message the agent speaks when a call starts. If not provided, the agent will start with a default greeting. |
| llm_temperature | number | 0.7 | Controls the creativity/randomness of the LLM responses. Range: 0.0-2.0. Lower values (0.0-0.5) = more focused and deterministic. Higher values (1.0-2.0) = more creative and varied. |
| llm_model | string | "gemini-2.5-flash-lite" | The language model to use. Options: "gpt-4o-mini", "gemini-2.5-flash", "gemini-2.5-flash-lite", "gpt-4o", "gemini-2.5-pro". |
Dynamic Variables
dynamic_variables are call-scoped values that you pass when a session starts. They are not stored in the saved agent config, but you can reference them from both your prompt and greeting with bare placeholders such as {{customer_name}} or {{order_id}}.
```json
{
  "config": {
    "prompt": "You are helping {{customer_name}} with order {{order_id}} for the {{account_tier}} account.",
    "greeting": "Hi {{customer_name}}, I can help with your {{city}} delivery today."
  }
}
```
You can provide dynamic_variables through the Web SDK, the API reference, or from webhooks.inbound_call before an inbound call starts. For inbound-call webhook payloads, response examples, and testing, use the dedicated Webhooks guide.
dynamic_variables must be a flat object of string, number, or boolean values.
- Omitted variables are allowed.
- Variables that are not referenced by the runtime prompt or greeting are ignored.
- If the same placeholder appears multiple times, the same runtime value is used for every occurrence.
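The substitution rules above can be sketched with a small helper. This is an illustrative model of the behavior described, not the platform's implementation: unknown placeholders are left in place, unreferenced variables are ignored, and numbers and booleans are stringified.

```python
import re

def render_template(text: str, variables: dict) -> str:
    """Replace {{name}} placeholders with call-scoped values.
    Placeholders with no matching variable are left as-is;
    extra variables are simply ignored."""
    def substitute(match):
        key = match.group(1)
        return str(variables[key]) if key in variables else match.group(0)
    return re.sub(r"\{\{(\w+)\}\}", substitute, text)

prompt = "You are helping {{customer_name}} with order {{order_id}}."
rendered = render_template(prompt, {"customer_name": "Ada",
                                    "order_id": 42,
                                    "unused": True})
# "You are helping Ada with order 42." — "unused" is ignored
```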
For call-scoped config changes, use agent_overrides instead of dynamic_variables. agent_overrides keeps the same nested shape as agent config, but currently supports only a limited runtime subset under tts_params.
Voice & Speech Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| tts_min_sentence_len | number | 20 | Minimum sentence length (in characters) before TTS starts streaming audio. Lower values = faster response time but more choppy audio. Higher values = smoother audio but longer wait time. |
| tts_params | object | See TTS Parameters | Nested object containing voice and TTS generation settings. |
TTS Parameters
The tts_params object contains voice-specific settings:
| Parameter | Type | Default | Description |
|---|---|---|---|
| voice_id | string | null | ID of the voice to use. Get available voices from List Voices endpoint. If not provided, uses default voice. |
| model | string | null | TTS model to use. Supported models: voiceai-tts-v1-latest, voiceai-tts-v1-2026-02-10 (English only), voiceai-tts-multilingual-v1-latest, voiceai-tts-multilingual-v1-2026-02-10 (multilingual). If not provided, automatically selected based on language. |
| language | string | null | Language code (ISO 639-1 format) or "auto" for ASR-detected language. Supported languages: en, ca, sv, es, fr, de, it, pt, pl, ru, nl. Use "auto" to automatically detect language from speech recognition (requires multilingual model). Defaults to "en" at runtime if not set. |
| temperature | number | 1.0 | Sampling temperature (0.0-2.0). Higher values make output more random. Controls voice variation and expressiveness. |
| top_p | number | 0.8 | Nucleus sampling parameter (0.0-1.0). Controls diversity of output. |
| dictionary_id | string | null | Managed pronunciation dictionary ID. See Pronunciation Dictionaries. |
| dictionary_version | number | null | Optional saved dictionary version to pin. If omitted, the latest version is used. Requires dictionary_id. |
On update, omit individual tts_params fields to preserve their current values. For nullable TTS fields such as voice_id, model, language, temperature, top_p, dictionary_id, and dictionary_version, pass null to clear that specific value.
Automatic Language Detection: Set language: "auto" to automatically detect the user’s language from speech recognition. This requires a multilingual model (e.g., voiceai-tts-multilingual-v1-latest). The agent will match the TTS language to the detected speech language in real-time.
Use Pronunciation Dictionaries to manage custom pronunciations and then attach a dictionary to the agent with tts_params.dictionary_id.
Interruption & Turn-Taking
| Parameter | Type | Default | Description |
|---|---|---|---|
| allow_interruptions | boolean | true | Whether users can interrupt the agent while it’s speaking. When false, users must wait for the agent to finish. |
| allow_interruptions_on_greeting | boolean | false | Whether to allow interruptions during the greeting message. Useful for impatient callers. |
| min_interruption_words | number | 1 | Minimum number of words required for an interruption to be recognized. Range: 0-10. 0 = any speech triggers interruption, 1+ = requires that many words. |
| auto_noise_reduction | boolean | true | Enable automatic noise reduction based on environment detection. Improves speech recognition in noisy environments. |
Timing & Endpointing
These parameters control when the system detects speech start/end and manages conversation flow:
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_silence_duration | number | 0.55 | Minimum duration of silence (seconds) before considering speech has ended. Lower = more responsive but may cut off slow speakers. |
| min_speech_duration | number | 0.5 | Minimum duration of speech (seconds) required to trigger an interruption. Prevents false interruptions from brief sounds. |
| min_endpointing_delay | number | 0.5 | Minimum delay (seconds) before considering speech has ended. Works with min_silence_duration for endpointing. |
| max_endpointing_delay | number | 3.0 | Maximum delay (seconds) before forcing speech to end. Prevents long pauses from blocking conversation. |
| vad_activation_threshold | number | 0.5 | Voice activity detection threshold. Range: 0.0-1.0. Lower = more sensitive (detects quieter speech), higher = less sensitive (requires clearer speech). |
| user_silence_timeout | number | 10.0 | Seconds of user silence before agent prompts to check if user is still there. Helps detect dropped calls or disengaged users. |
| max_call_duration_seconds | number | null | Maximum call duration in seconds. null = no limit. Useful for limiting costs or enforcing time constraints. |
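How the silence and endpointing thresholds combine can be sketched as a simple decision function. This is a deliberately simplified model for intuition, with the default values from the table; the real endpointing pipeline is more involved.

```python
def turn_ended(silence_s: float,
               min_silence_duration: float = 0.55,
               min_endpointing_delay: float = 0.5,
               max_endpointing_delay: float = 3.0) -> bool:
    """Simplified end-of-turn decision: the turn ends once observed silence
    exceeds both minimum thresholds, and is always forced to end once
    silence reaches max_endpointing_delay."""
    if silence_s >= max_endpointing_delay:
        return True  # force the turn to end: never wait past the maximum
    return silence_s >= max(min_silence_duration, min_endpointing_delay)

# 0.3s of silence: keep listening; 0.6s: end of turn with defaults
```

Tuning for responsiveness means lowering both minimums (as in the fast-paced example later in this page); tuning for slow, deliberate speakers means raising them.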
Agent Control Permissions
| Parameter | Type | Default | Description |
|---|---|---|---|
| allow_agent_to_end_call | boolean | false | Whether the agent can end calls via tool calls or timeout. When true, agent can proactively end conversations. |
| allow_agent_to_skip_turn | boolean | false | Whether the agent can skip turns and yield conversation control. Useful for multi-party scenarios. |
Phone Number
| Parameter | Type | Default | Description |
|---|---|---|---|
| phone_number | string | null | Assigned phone number in E.164 format (e.g., "+14155551234"). Must be a number from your available phone numbers. See Phone Number Management. |
On update, set phone_number: null to unassign the current phone number. Empty strings are also normalized to null.
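The normalization and format rules can be sketched as a small validator. This is illustrative only; the E.164 check below is the standard pattern (a `+` followed by up to 15 digits), not necessarily the exact server-side validation.

```python
import re

E164 = re.compile(r"^\+[1-9]\d{1,14}$")  # '+' then up to 15 digits, no leading 0

def normalize_phone_number(value):
    """Sketch of the update semantics: empty string and null both mean
    'unassign'; anything else must be valid E.164."""
    if value is None or value == "":
        return None  # unassign the current phone number
    if not E164.fullmatch(value):
        raise ValueError(f"not a valid E.164 number: {value!r}")
    return value
```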
Call Recording
| Parameter | Type | Default | Description |
|---|---|---|---|
| recording_enabled | boolean | true | Whether new calls for this agent should be recorded when the active connection mode supports recording. Disable it to skip creating recordings. |
On update, omit recording_enabled to leave it unchanged. Set recording_enabled: false to disable recording for future calls. Existing recordings are unaffected.
Use the Call Recording endpoint to check recording status and retrieve the merged MP3 URL for completed calls.
Integrations
Webhooks
| Parameter | Type | Default | Description |
|---|---|---|---|
| webhooks | object | null | Webhook configuration for event notifications, inbound call personalization, and callable tools. See Webhooks for details. |
On update, omit webhooks to leave existing webhook config unchanged. Set webhooks: null to clear all webhook configuration, or clear individual webhook containers as described below.
This section is the configuration-field reference. For webhook payload examples, signature verification, test endpoints, and inbound-call response behavior, use the dedicated Webhooks guide.
The webhooks object contains three distinct configs. webhooks.events, webhooks.inbound_call, and webhooks.tools use different contracts:
- webhooks.events[] supports secret (write-only on create/update) and has_secret (read-only on fetch), with fan-out across enabled endpoints.
- webhooks.inbound_call supports secret (write-only on create/update) and has_secret (read-only on fetch). Use it to return dynamic_variables and optional agent_overrides for personalization only, not routing.
- webhooks.tools define outbound API calls and do not use secret.
Event webhook fields (webhooks.events[])
Inbound: Voice.ai sends POST requests to each enabled event webhook URL. Configure secret for HMAC signature verification.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| events[].url | string | Yes | - | Webhook endpoint URL |
| events[].secret | string | No | null | HMAC-SHA256 signing secret (write-only on create/update) |
| events[].has_secret | boolean | No | false | Whether a signing secret is configured (read-only on fetch) |
| events[].events | array | No | [] | Event types to receive (empty = all) |
| events[].timeout | number | No | 5 | Request timeout in seconds (1-30) |
| events[].enabled | boolean | No | true | Whether webhook event callbacks are active |
On update, omit webhooks.events to preserve the current list, set webhooks.events: null to clear it, or pass a full array to replace it. Within a replacement array, omitted events[].secret values are preserved only for entries whose url exactly matches an existing endpoint, events[].secret: null clears that endpoint’s signing secret, duplicate URLs are rejected, and events[].events: [] means that endpoint receives all event types.
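A receiver-side HMAC check for these signed webhooks can be sketched with the standard library. The signature header name and encoding (hex is assumed here) are not specified on this page, so treat both as assumptions and check the Webhooks guide for the exact scheme.

```python
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, signature_hex: str) -> bool:
    """Verify an HMAC-SHA256 signature over the raw request body.
    Hex encoding of the signature is assumed for illustration; always
    compare with a constant-time comparison."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Verify against the raw bytes of the request body, before any JSON parsing, so that whitespace or key-ordering differences cannot break the check.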
Inbound call webhook fields (webhooks.inbound_call)
Inbound: Voice.ai sends POST requests to your URL before an inbound call starts. Use this webhook to personalize a call with dynamic_variables and optional runtime agent_overrides. Do not use inbound_call to route calls to different agents.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| inbound_call.url | string | Yes | - | Webhook endpoint URL |
| inbound_call.secret | string | No | null | HMAC-SHA256 signing secret (write-only on create/update) |
| inbound_call.has_secret | boolean | No | false | Whether a signing secret is configured (read-only on fetch) |
| inbound_call.timeout | number | No | 5 | Request timeout in seconds (1-30) |
| inbound_call.enabled | boolean | No | true | Whether inbound call personalization is active |
On update, omit webhooks.inbound_call to preserve it, set webhooks.inbound_call: null to remove it, and set inbound_call.secret: null to clear only the signing secret.
Tool webhook fields (webhooks.tools[])
Outbound: Voice.ai calls your API. Configure auth_type, auth_token, or headers to authenticate the outbound request to your endpoint.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| tools[].name | string | Yes | - | Tool name used by the agent |
| tools[].description | string | Yes | - | Human-readable tool description |
| tools[].url | string | Yes | - | Your API endpoint URL |
| tools[].parameters | object | Yes | - | Tool argument schema |
| tools[].method | string | Yes | - | GET, POST, PUT, PATCH, or DELETE |
| tools[].execution_mode | string | Yes | - | sync (wait for response) or async (accept 2xx) |
| tools[].auth_type | string | Yes | - | none, bearer_token, api_key, or custom_headers |
| tools[].auth_token | string | No | null | Token for bearer_token or api_key |
| tools[].headers | object | No | {} | Custom headers (for auth_type: custom_headers) |
| tools[].response | object | No | {} | Expected response shape |
| tools[].timeout | number | No | 10 | Request timeout in seconds |
On update, omit webhooks.tools to leave the current tool list unchanged, set webhooks.tools: null to clear all tools, or pass a new array to replace the current list.
MCP Servers
| Parameter | Type | Default | Description |
|---|---|---|---|
| mcp_servers | array | null | List of Model Context Protocol (MCP) server configurations. Allows agents to connect to external tools and data sources. See Model Context Protocol for details. |
On update, omit mcp_servers to preserve the current server list, set mcp_servers: null to clear it, or pass a new array to replace the current list.
Each MCP server configuration includes:
- name (string, required) - Human-readable name
- description (string, optional) - Server description
- url (string, required) - MCP server endpoint URL
- auth_type (string, optional) - Authentication type: "none", "bearer_token", "api_key", "custom_headers". Default: "none"
- auth_token (string, optional) - Token for authentication
- headers (object, optional) - Custom HTTP headers
Examples
Basic Configuration
```json
{
  "prompt": "You are a helpful customer support agent.",
  "greeting": "Thank you for calling. How can I help you?",
  "llm_model": "gemini-2.5-flash-lite"
}
```
Advanced Configuration
```json
{
  "prompt": "You are a technical support specialist. Be concise and professional.",
  "greeting": "Hello, this is technical support. What issue can I help you with?",
  "llm_temperature": 0.5,
  "llm_model": "gemini-2.5-flash",
  "tts_params": {
    "voice_id": "custom_voice_123",
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "en",
    "temperature": 0.9
  },
  "allow_interruptions": true,
  "min_interruption_words": 2,
  "min_silence_duration": 0.6,
  "vad_activation_threshold": 0.6,
  "max_call_duration_seconds": 1800,
  "phone_number": "+14155551234",
  "mcp_servers": [
    {
      "name": "Database Server",
      "url": "https://api.example.com/mcp",
      "auth_type": "bearer_token",
      "auth_token": "your-token"
    }
  ]
}
```
Multilingual Agent with Auto Language Detection
```json
{
  "prompt": "You are a multilingual customer support agent. Respond in the same language as the customer.",
  "greeting": "Hello! How can I help you today?",
  "llm_model": "gemini-2.5-flash-lite",
  "tts_params": {
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "auto"
  },
  "allow_interruptions": true
}
```
This configuration automatically detects the user’s language from speech recognition and matches the TTS output language accordingly.
Configuration for Fast-Paced Conversations
```json
{
  "prompt": "You are a quick and efficient assistant.",
  "llm_temperature": 0.3,
  "allow_interruptions": true,
  "min_interruption_words": 0,
  "min_silence_duration": 0.4,
  "min_endpointing_delay": 0.3,
  "max_endpointing_delay": 2.0,
  "vad_activation_threshold": 0.4
}
```
Configuration for Formal Conversations
```json
{
  "prompt": "You are a professional business assistant. Speak formally and wait for complete thoughts.",
  "allow_interruptions": false,
  "min_silence_duration": 0.8,
  "min_endpointing_delay": 0.7,
  "max_endpointing_delay": 4.0,
  "vad_activation_threshold": 0.6
}
```
Configuration with Webhooks
This example shows the saved agent config shape only. For webhook request/response payload examples and test flows, see the dedicated Webhooks guide.
```json
{
  "prompt": "You are a helpful sales assistant.",
  "greeting": "Hi! How can I help you today?",
  "webhooks": {
    "events": [
      {
        "url": "https://your-server.com/webhooks/voice-events",
        "secret": "your-hmac-secret",
        "events": ["call.started", "call.completed"],
        "enabled": true
      }
    ],
    "inbound_call": {
      "url": "https://your-server.com/webhooks/inbound-call",
      "secret": "your-inbound-call-secret",
      "enabled": true
    },
    "tools": [
      {
        "name": "get_account_status",
        "description": "Fetches the latest account status for a customer.",
        "url": "https://your-server.com/webhooks/tools/account-status",
        "parameters": { "customer_id": "string" },
        "method": "POST",
        "execution_mode": "sync",
        "auth_type": "api_key",
        "auth_token": "your-api-key",
        "headers": { "X-Service-Version": "2026-02" },
        "response": {
          "type": "object",
          "properties": {
            "status": { "type": "string" },
            "tier": { "type": "string" }
          }
        },
        "timeout": 10
      }
    ]
  }
}
```
webhooks.events[] and webhooks.inbound_call use secret for HMAC verification of inbound requests. webhooks.tools use auth_type/auth_token for outbound API authentication. See the Webhooks guide for payload examples and the Web SDK guide for SDK usage.
Best Practices
- Start Simple: Begin with just prompt and greeting, then add parameters as needed
- Test Incrementally: Adjust one parameter at a time to understand its effect
- Timing Parameters: Fine-tune min_silence_duration and vad_activation_threshold based on your use case
- Interruptions: Set allow_interruptions: false for formal scenarios, true for casual conversations
- Voice Selection: Use voice_id in tts_params to select a specific voice from your available voices
- Call Duration: Set max_call_duration_seconds to prevent runaway costs or enforce time limits
Next Steps