Complete reference for all agent configuration parameters available in the Voice AI API.
Overview
Agent configuration consists of:
- Root-level parameters: name and kb_id are set at the root level of create/update requests
- config object: contains conversation style, voice settings, timing controls, and integrations
All configuration parameters are optional (except name when creating) - the system uses sensible defaults for any omitted fields.
Status Management: Agent status (paused, deployed, disabled) cannot be updated directly through the update endpoint. Use the dedicated endpoints: Deploy Agent, Pause Agent, or Delete Agent.
Update Behavior: Agent updates are partial. Omit a root-level field or config field to leave it unchanged. Use null only for fields or containers that are documented below as nullable or clearable.
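The partial-update rule can be sketched as a recursive merge. The helper below is illustrative only, not the platform's actual server logic: omitted keys keep their current value, an explicit null (`None`) clears a field, and nested objects such as `config` merge key by key.

```python
def merge_agent_update(current: dict, update: dict) -> dict:
    """Illustrative partial-update merge: omitted keys are preserved,
    explicit null (None) clears a field, nested objects merge recursively."""
    merged = dict(current)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_agent_update(merged[key], value)
        else:
            merged[key] = value  # includes None, which clears the field
    return merged

current = {"name": "My Agent",
           "config": {"llm_temperature": 0.7, "phone_number": "+14155551234"}}
update = {"config": {"llm_temperature": 0.3, "phone_number": None}}
result = merge_agent_update(current, update)
# name is unchanged; llm_temperature is updated; phone_number is cleared
```

Remember that only fields documented as nullable accept `null`; for everything else, omit the field instead.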
Agent-Level Parameters
These parameters are set at the root level of agent create/update requests, not inside the config object:
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes (create), No (update) | Agent name. Must be at least 1 character. |
| kb_id | number | No | Knowledge base ID to assign to the agent. Set to an existing knowledge base ID to assign it, or null to remove the knowledge base. See Knowledge Base Management. |
Configuration Structure
When creating or updating an agent, the request structure is:
```json
{
  "name": "My Agent",
  "config": {
    "prompt": "You are a helpful assistant...",
    "greeting": "Hello! How can I help you?",
    "llm_temperature": 0.7,
    "llm_model": "gemini-2.5-flash-lite",
    "tts_min_sentence_len": 20,
    "tts_params": {
      "voice_id": "abc",
      "temperature": 1.0
    },
    "allow_interruptions": true,
    "min_silence_duration": 0.55,
    "phone_number": "+14155551234",
    "recording_enabled": true,
    "mcp_servers": [...]
  },
  "kb_id": 123
}
```
The config object contains all behavioral parameters. The kb_id parameter is set at the root level to assign a knowledge base to the agent.
Core Conversation Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | string | null | Custom instructions that define the agent’s personality, role, and behavior. This is the system prompt sent to the LLM. |
| greeting | string | null | Initial message the agent speaks when a call starts. If not provided, the agent will start with a default greeting. |
| llm_temperature | number | 0.7 | Controls the creativity/randomness of the LLM responses. Range: 0.0-2.0. Lower values (0.0-0.5) = more focused and deterministic. Higher values (1.0-2.0) = more creative and varied. |
| llm_model | string | "gemini-2.5-flash-lite" | The language model to use. Options: "gpt-4o-mini", "gemini-2.5-flash", "gemini-2.5-flash-lite", "gpt-4o", "gemini-2.5-pro". |
Dynamic Variables
dynamic_variables are call-scoped values that you pass when a session starts. They are not stored in the saved agent config, but you can reference them from both your prompt and greeting with bare placeholders such as {{customer_name}} or {{order_id}}.
```json
{
  "config": {
    "prompt": "You are helping {{customer_name}} with order {{order_id}} for the {{account_tier}} account.",
    "greeting": "Hi {{customer_name}}, I can help with your {{city}} delivery today."
  }
}
```
You can provide dynamic_variables through the Web SDK, the API reference, or from webhooks.inbound_call before an inbound call starts. For inbound-call webhook payloads, response examples, and testing, use the dedicated Webhooks guide.
dynamic_variables must be a flat object of string, number, or boolean values.
- Omitted variables are allowed.
- Variables that are not referenced by the runtime prompt or greeting are ignored.
- If the same placeholder appears multiple times, the same runtime value is used for every occurrence.
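The substitution rules above can be sketched with a small helper. This is an illustrative model of the behavior described, not the platform's implementation: unknown placeholders are left in place, unreferenced variables are ignored, and numbers and booleans are stringified.

```python
import re

def render_template(text: str, variables: dict) -> str:
    """Replace {{name}} placeholders with call-scoped values.
    Placeholders with no matching variable are left as-is;
    extra variables are simply ignored."""
    def substitute(match):
        key = match.group(1)
        return str(variables[key]) if key in variables else match.group(0)
    return re.sub(r"\{\{(\w+)\}\}", substitute, text)

prompt = "You are helping {{customer_name}} with order {{order_id}}."
rendered = render_template(prompt, {"customer_name": "Ada",
                                    "order_id": 42,
                                    "unused": True})
# "You are helping Ada with order 42." — "unused" is ignored
```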
For call-scoped config changes, use agent_overrides instead of dynamic_variables. agent_overrides keeps the same nested shape as agent config, but currently supports only a limited runtime subset under tts_params.
Voice & Speech Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| tts_min_sentence_len | number | 20 | Minimum sentence length (in characters) before TTS starts streaming audio. Lower values = faster response time but more choppy audio. Higher values = smoother audio but longer wait time. |
| tts_params | object | See TTS Parameters | Nested object containing voice and TTS generation settings. |
TTS Parameters
The tts_params object contains voice-specific settings:
| Parameter | Type | Default | Description |
|---|---|---|---|
| voice_id | string | null | ID of the voice to use. Get available voices from List Voices endpoint. If not provided, uses default voice. |
| model | string | null | TTS model to use. Supported models: voiceai-tts-v1-latest, voiceai-tts-v1-2026-02-10 (English only), voiceai-tts-multilingual-v1-latest, voiceai-tts-multilingual-v1-2026-02-10 (multilingual). If not provided, automatically selected based on language. |
| language | string | null | Language code (ISO 639-1 format) or "auto" for ASR-detected language. Supported languages: en, ca, sv, es, fr, de, it, pt, pl, ru, nl. Use "auto" to automatically detect language from speech recognition (requires multilingual model). Defaults to "en" at runtime if not set. |
| temperature | number | 1.0 | Sampling temperature (0.0-2.0). Higher values make output more random. Controls voice variation and expressiveness. |
| top_p | number | 0.8 | Nucleus sampling parameter (0.0-1.0). Controls diversity of output. |
| dictionary_id | string | null | Managed pronunciation dictionary ID. See Pronunciation Dictionaries. |
| dictionary_version | number | null | Optional saved dictionary version to pin. If omitted, the latest version is used. Requires dictionary_id. |
On update, omit individual tts_params fields to preserve their current values. For nullable TTS fields such as voice_id, model, language, temperature, top_p, dictionary_id, and dictionary_version, pass null to clear that specific value.
Automatic Language Detection: Set language: "auto" to automatically detect the user’s language from speech recognition. This requires a multilingual model (e.g., voiceai-tts-multilingual-v1-latest). The agent will match the TTS language to the detected speech language in real-time.
Use Pronunciation Dictionaries to manage custom pronunciations and then attach a dictionary to the agent with tts_params.dictionary_id.
Interruption & Turn-Taking
| Parameter | Type | Default | Description |
|---|---|---|---|
| allow_interruptions | boolean | true | Whether users can interrupt the agent while it’s speaking. When false, users must wait for the agent to finish. |
| allow_interruptions_on_greeting | boolean | false | Whether to allow interruptions during the greeting message. Useful for impatient callers. |
| min_interruption_words | number | 1 | Minimum number of words required for an interruption to be recognized. Range: 0-10. 0 = any speech triggers interruption, 1+ = requires that many words. |
| auto_noise_reduction | boolean | true | Enable automatic noise reduction based on environment detection. Improves speech recognition in noisy environments. |
Timing & Endpointing
These parameters control when the system detects speech start/end and manages conversation flow:
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_silence_duration | number | 0.55 | Minimum duration of silence (seconds) before considering speech has ended. Lower = more responsive but may cut off slow speakers. |
| min_speech_duration | number | 0.5 | Minimum duration of speech (seconds) required to trigger an interruption. Prevents false interruptions from brief sounds. |
| min_endpointing_delay | number | 0.5 | Minimum delay (seconds) before considering speech has ended. Works with min_silence_duration for endpointing. |
| max_endpointing_delay | number | 3.0 | Maximum delay (seconds) before forcing speech to end. Prevents long pauses from blocking conversation. |
| vad_activation_threshold | number | 0.5 | Voice activity detection threshold. Range: 0.0-1.0. Lower = more sensitive (detects quieter speech), higher = less sensitive (requires clearer speech). |
| user_silence_timeout | number | 10.0 | Seconds of user silence before agent prompts to check if user is still there. Helps detect dropped calls or disengaged users. |
| max_call_duration_seconds | number | null | Maximum call duration in seconds. null = no limit. Useful for limiting costs or enforcing time constraints. |
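How the silence and endpointing thresholds combine can be sketched as a simple decision function. This is a deliberately simplified model for intuition, with the default values from the table; the real endpointing pipeline is more involved.

```python
def turn_ended(silence_s: float,
               min_silence_duration: float = 0.55,
               min_endpointing_delay: float = 0.5,
               max_endpointing_delay: float = 3.0) -> bool:
    """Simplified end-of-turn decision: the turn ends once observed silence
    exceeds both minimum thresholds, and is always forced to end once
    silence reaches max_endpointing_delay."""
    if silence_s >= max_endpointing_delay:
        return True  # force the turn to end: never wait past the maximum
    return silence_s >= max(min_silence_duration, min_endpointing_delay)

# 0.3s of silence: keep listening; 0.6s: end of turn with defaults
```

Tuning for responsiveness means lowering both minimums (as in the fast-paced example later in this page); tuning for slow, deliberate speakers means raising them.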
Agent Control Permissions
| Parameter | Type | Default | Description |
|---|---|---|---|
| allow_agent_to_end_call | boolean | false | Whether the agent can end calls via tool calls or timeout. When true, agent can proactively end conversations. |
| allow_agent_to_skip_turn | boolean | false | Whether the agent can skip turns and yield conversation control. Useful for multi-party scenarios. |
Phone Number
| Parameter | Type | Default | Description |
|---|---|---|---|
| phone_number | string | null | Assigned phone number in E.164 format (e.g., "+14155551234"). Must be a number from your available phone numbers. See Phone Number Management. |
On update, set phone_number: null to unassign the current phone number. Empty strings are also normalized to null.
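The normalization and format rules can be sketched as a small validator. This is illustrative only; the E.164 check below is the standard pattern (a `+` followed by up to 15 digits), not necessarily the exact server-side validation.

```python
import re

E164 = re.compile(r"^\+[1-9]\d{1,14}$")  # '+' then up to 15 digits, no leading 0

def normalize_phone_number(value):
    """Sketch of the update semantics: empty string and null both mean
    'unassign'; anything else must be valid E.164."""
    if value is None or value == "":
        return None  # unassign the current phone number
    if not E164.fullmatch(value):
        raise ValueError(f"not a valid E.164 number: {value!r}")
    return value
```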
Call Recording
| Parameter | Type | Default | Description |
|---|---|---|---|
| recording_enabled | boolean | true | Whether new calls for this agent should be recorded when the active connection mode supports recording. Disable it to skip creating recordings. |
On update, omit recording_enabled to leave it unchanged. Set recording_enabled: false to disable recording for future calls. Existing recordings are unaffected.
Use the Call Recording endpoint to check recording status and retrieve the merged MP3 URL for completed calls.
Integrations
Webhooks
| Parameter | Type | Default | Description |
|---|---|---|---|
| webhooks | object | null | Webhook configuration for event notifications, inbound call personalization, and callable tools. See Webhooks for details. |
On update, omit webhooks to leave existing webhook config unchanged. Set webhooks: null to clear all webhook configuration, or clear individual webhook containers as described below.
This section is the configuration-field reference. For webhook payload examples, signature verification, test endpoints, and inbound-call response behavior, use the dedicated Webhooks guide.
The webhooks object contains three distinct configs. webhooks.events, webhooks.inbound_call, and webhooks.tools use different contracts:
- webhooks.events[] supports secret (write-only on create/update) and has_secret (read-only on fetch), with fan-out across enabled endpoints.
- webhooks.inbound_call supports secret (write-only on create/update) and has_secret (read-only on fetch). Use it to return dynamic_variables and optional agent_overrides for personalization only, not routing.
- webhooks.tools define outbound API calls and do not use secret.
Event webhook fields (webhooks.events[])
Inbound: Voice.ai sends POST requests to each enabled event webhook URL. Configure secret for HMAC signature verification.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| events[].url | string | Yes | - | Webhook endpoint URL |
| events[].secret | string | No | null | HMAC-SHA256 signing secret (write-only on create/update) |
| events[].has_secret | boolean | No | false | Whether a signing secret is configured (read-only on fetch) |
| events[].events | array | No | [] | Event types to receive (empty = all) |
| events[].timeout | number | No | 5 | Request timeout in seconds (1-30) |
| events[].enabled | boolean | No | true | Whether webhook event callbacks are active |
On update, omit webhooks.events to preserve the current list, set webhooks.events: null to clear it, or pass a full array to replace it. Within a replacement array, omitted events[].secret values are preserved only for entries whose url exactly matches an existing endpoint, events[].secret: null clears that endpoint’s signing secret, duplicate URLs are rejected, and events[].events: [] means that endpoint receives all event types.
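A receiver-side HMAC check for these signed webhooks can be sketched with the standard library. The signature header name and encoding (hex is assumed here) are not specified on this page, so treat both as assumptions and check the Webhooks guide for the exact scheme.

```python
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, signature_hex: str) -> bool:
    """Verify an HMAC-SHA256 signature over the raw request body.
    Hex encoding of the signature is assumed for illustration; always
    compare with a constant-time comparison."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Verify against the raw bytes of the request body, before any JSON parsing, so that whitespace or key-ordering differences cannot break the check.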
Inbound call webhook fields (webhooks.inbound_call)
Inbound: Voice.ai sends POST requests to your URL before an inbound call starts. Use this webhook to personalize a call with dynamic_variables and optional runtime agent_overrides. Do not use inbound_call to route calls to different agents.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| inbound_call.url | string | Yes | - | Webhook endpoint URL |
| inbound_call.secret | string | No | null | HMAC-SHA256 signing secret (write-only on create/update) |
| inbound_call.has_secret | boolean | No | false | Whether a signing secret is configured (read-only on fetch) |
| inbound_call.timeout | number | No | 5 | Request timeout in seconds (1-30) |
| inbound_call.enabled | boolean | No | true | Whether inbound call personalization is active |
On update, omit webhooks.inbound_call to preserve it, set webhooks.inbound_call: null to remove it, and set inbound_call.secret: null to clear only the signing secret.
Tool webhook fields (webhooks.tools[])
Outbound: Voice.ai calls your API. Configure auth_type, auth_token, or headers to authenticate the outbound request to your endpoint.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| tools[].name | string | Yes | - | Tool name used by the agent |
| tools[].description | string | Yes | - | Human-readable tool description |
| tools[].url | string | Yes | - | Your API endpoint URL |
| tools[].parameters | object | Yes | - | Tool argument schema |
| tools[].method | string | Yes | - | GET, POST, PUT, PATCH, or DELETE |
| tools[].execution_mode | string | Yes | - | sync (wait for response) or async (accept 2xx) |
| tools[].auth_type | string | Yes | - | none, bearer_token, api_key, or custom_headers |
| tools[].auth_token | string | No | null | Token for bearer_token or api_key |
| tools[].headers | object | No | {} | Custom headers (for auth_type: custom_headers) |
| tools[].response | object | No | {} | Expected response shape |
| tools[].timeout | number | No | 10 | Request timeout in seconds |
On update, omit webhooks.tools to leave the current tool list unchanged, set webhooks.tools: null to clear all tools, or pass a new array to replace the current list.
MCP Servers
| Parameter | Type | Default | Description |
|---|---|---|---|
| mcp_servers | array | null | List of Model Context Protocol (MCP) server configurations. Allows agents to connect to external tools and data sources. See Model Context Protocol for details. |
On update, omit mcp_servers to preserve the current server list, set mcp_servers: null to clear it, or pass a new array to replace the current list.
Each MCP server configuration includes:
- name (string, required) - Human-readable name
- description (string, optional) - Server description
- url (string, required) - MCP server endpoint URL
- auth_type (string, optional) - Authentication type: "none", "bearer_token", "api_key", "custom_headers". Default: "none"
- auth_token (string, optional) - Token for authentication
- headers (object, optional) - Custom HTTP headers
Examples
Basic Configuration
```json
{
  "prompt": "You are a helpful customer support agent.",
  "greeting": "Thank you for calling. How can I help you?",
  "llm_model": "gemini-2.5-flash-lite"
}
```
Advanced Configuration
```json
{
  "prompt": "You are a technical support specialist. Be concise and professional.",
  "greeting": "Hello, this is technical support. What issue can I help you with?",
  "llm_temperature": 0.5,
  "llm_model": "gemini-2.5-flash",
  "tts_params": {
    "voice_id": "custom_voice_123",
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "en",
    "temperature": 0.9
  },
  "allow_interruptions": true,
  "min_interruption_words": 2,
  "min_silence_duration": 0.6,
  "vad_activation_threshold": 0.6,
  "max_call_duration_seconds": 1800,
  "phone_number": "+14155551234",
  "mcp_servers": [
    {
      "name": "Database Server",
      "url": "https://api.example.com/mcp",
      "auth_type": "bearer_token",
      "auth_token": "your-token"
    }
  ]
}
```
Multilingual Agent with Auto Language Detection
```json
{
  "prompt": "You are a multilingual customer support agent. Respond in the same language as the customer.",
  "greeting": "Hello! How can I help you today?",
  "llm_model": "gemini-2.5-flash-lite",
  "tts_params": {
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "auto"
  },
  "allow_interruptions": true
}
```
This configuration automatically detects the user’s language from speech recognition and matches the TTS output language accordingly.
Configuration for Fast-Paced Conversations
```json
{
  "prompt": "You are a quick and efficient assistant.",
  "llm_temperature": 0.3,
  "allow_interruptions": true,
  "min_interruption_words": 0,
  "min_silence_duration": 0.4,
  "min_endpointing_delay": 0.3,
  "max_endpointing_delay": 2.0,
  "vad_activation_threshold": 0.4
}
```
Configuration for Formal Conversations
```json
{
  "prompt": "You are a professional business assistant. Speak formally and wait for complete thoughts.",
  "allow_interruptions": false,
  "min_silence_duration": 0.8,
  "min_endpointing_delay": 0.7,
  "max_endpointing_delay": 4.0,
  "vad_activation_threshold": 0.6
}
```
Configuration with Webhooks
This example shows the saved agent config shape only. For webhook request/response payload examples and test flows, see the dedicated Webhooks guide.
```json
{
  "prompt": "You are a helpful sales assistant.",
  "greeting": "Hi! How can I help you today?",
  "webhooks": {
    "events": [
      {
        "url": "https://your-server.com/webhooks/voice-events",
        "secret": "your-hmac-secret",
        "events": ["call.started", "call.completed"],
        "enabled": true
      }
    ],
    "inbound_call": {
      "url": "https://your-server.com/webhooks/inbound-call",
      "secret": "your-inbound-call-secret",
      "enabled": true
    },
    "tools": [
      {
        "name": "get_account_status",
        "description": "Fetches the latest account status for a customer.",
        "url": "https://your-server.com/webhooks/tools/account-status",
        "parameters": { "customer_id": "string" },
        "method": "POST",
        "execution_mode": "sync",
        "auth_type": "api_key",
        "auth_token": "your-api-key",
        "headers": { "X-Service-Version": "2026-02" },
        "response": {
          "type": "object",
          "properties": {
            "status": { "type": "string" },
            "tier": { "type": "string" }
          }
        },
        "timeout": 10
      }
    ]
  }
}
```
webhooks.events[] and webhooks.inbound_call use secret for HMAC verification of inbound requests. webhooks.tools use auth_type/auth_token for outbound API authentication. See the Webhooks guide for payload examples and the Web SDK guide for SDK usage.
Best Practices
- Start Simple: Begin with just prompt and greeting, then add parameters as needed
- Test Incrementally: Adjust one parameter at a time to understand its effect
- Timing Parameters: Fine-tune min_silence_duration and vad_activation_threshold based on your use case
- Interruptions: Set allow_interruptions: false for formal scenarios, true for casual conversations
- Voice Selection: Use voice_id in tts_params to select a specific voice from your available voices
- Call Duration: Set max_call_duration_seconds to prevent runaway costs or enforce time limits
Next Steps