Agent Configuration

Complete reference for all agent configuration parameters available in the Voice AI API.

Prerequisites: API key

Overview

Agent configuration consists of:

Root-level parameters - name and kb_id are set at the root level of create/update requests
config object - Contains conversation style, voice settings, timing controls, and integrations

All configuration parameters are optional (except name when creating) - the system uses sensible defaults for any omitted fields.

Status Management: Agent status (paused, deployed, disabled) cannot be updated directly through the update endpoint. Use the dedicated endpoints: Deploy Agent, Pause Agent, or Delete Agent.

Agent-Level Parameters

These parameters are set at the root level of agent create/update requests, not inside the config object:

Parameter	Type	Required	Description
`name`	string	Yes (create), No (update)	Agent name. Must be at least 1 character.
`kb_id`	number	No	Knowledge base ID to assign to the agent. Set to an existing knowledge base ID to assign it, or `null` to remove the knowledge base. See Knowledge Base Management.

Configuration Structure

When creating or updating an agent, the request structure is:

{
  "name": "My Agent",
  "config": {
    "prompt": "You are a helpful assistant...",
    "greeting": "Hello! How can I help you?",
    "llm_temperature": 0.7,
    "llm_model": "gemini-2.5-flash-lite",
    "tts_min_sentence_len": 20,
    "tts_params": {
      "voice_id": "abc",
      "temperature": 1.0
    },
    "allow_interruptions": true,
    "min_silence_duration": 0.55,
    "phone_number": "+14155551234",
    "mcp_servers": [...]
  },
  "kb_id": 123
}

The config object contains all behavioral parameters. The kb_id parameter is set at the root level to assign a knowledge base to the agent.
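
As a rough illustration of the root-level versus config split, the following Python sketch assembles a request body. The helper name build_agent_body and the _UNSET sentinel are hypothetical and not part of the API; the sketch only encodes the rules stated above (name is required on create, and kb_id may be a number or null).

```python
import json
from typing import Any, Optional

_UNSET = object()  # hypothetical sentinel: "leave kb_id out" vs "send kb_id: null"


def build_agent_body(
    name: Optional[str] = None,
    config: Optional[dict] = None,
    kb_id: Any = _UNSET,
    creating: bool = True,
) -> dict:
    """Assemble a create/update request body (illustrative helper only)."""
    if creating and not name:
        raise ValueError("name is required when creating (at least 1 character)")
    body: dict = {}
    if name is not None:
        if len(name) < 1:
            raise ValueError("name must be at least 1 character")
        body["name"] = name  # root level, not inside config
    if config:
        body["config"] = config  # all behavioral parameters live here
    if kb_id is not _UNSET:
        body["kb_id"] = kb_id  # None serializes to null, which removes the knowledge base
    return body


create_body = build_agent_body("My Agent", {"prompt": "You are a helpful assistant..."}, kb_id=123)
detach_kb_body = build_agent_body(kb_id=None, creating=False)
print(json.dumps(create_body, indent=2))
print(json.dumps(detach_kb_body))
```

Using a sentinel rather than None as the default keeps "do not touch the knowledge base" distinct from "remove the knowledge base" in update requests.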

Core Conversation Parameters

Parameter	Type	Default	Description
`prompt`	string	`null`	Custom instructions that define the agent’s personality, role, and behavior. This is the system prompt sent to the LLM.
`greeting`	string	`null`	Initial message the agent speaks when a call starts. If not provided, the agent will start with a default greeting.
`llm_temperature`	number	`0.7`	Controls the creativity/randomness of the LLM responses. Range: 0.0-2.0. Lower values (0.0-0.5) = more focused and deterministic. Higher values (1.0-2.0) = more creative and varied.
`llm_model`	string	`"gemini-2.5-flash-lite"`	The language model to use. Options: `"gpt-4o-mini"`, `"gemini-2.5-flash"`, `"gemini-2.5-flash-lite"`, `"gpt-4o"`, `"gemini-2.5-pro"`.
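
A client-side pre-flight check for these fields might look like the sketch below. The function name check_core_params is hypothetical; the range, defaults, and model list are copied from the table above, and the API may apply its own validation.

```python
from typing import List

# Values copied from the parameter table above.
ALLOWED_LLM_MODELS = {
    "gpt-4o-mini",
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gpt-4o",
    "gemini-2.5-pro",
}


def check_core_params(config: dict) -> List[str]:
    """Return a list of problems found in the core conversation fields."""
    problems = []
    temperature = config.get("llm_temperature", 0.7)  # documented default
    is_number = isinstance(temperature, (int, float)) and not isinstance(temperature, bool)
    if not is_number or not 0.0 <= temperature <= 2.0:
        problems.append("llm_temperature must be a number in 0.0-2.0")
    model = config.get("llm_model", "gemini-2.5-flash-lite")  # documented default
    if model not in ALLOWED_LLM_MODELS:
        problems.append(f"unknown llm_model: {model!r}")
    return problems
```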

Dynamic Variables

dynamic_variables are call-scoped values that you pass when a session starts. They are not stored in the saved agent config, but you can reference them from your prompt with bare placeholders such as {{customer_name}} or {{order_id}}.

{
  "config": {
    "prompt": "You are helping {{customer_name}} with order {{order_id}}."
  }
}

You can provide dynamic_variables through the Web SDK, through the API (see the API reference), or from the webhooks.inbound_call webhook before an inbound call starts.

dynamic_variables must be a flat object of string, number, or boolean values.
Omitted variables are allowed.
Variables that are not referenced by the runtime prompt are ignored.
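
The rules above can be sketched as a small local check, run before starting a session. This is an illustration only: the helper name check_dynamic_variables is hypothetical, and the regular expression assumes placeholders are bare identifiers wrapped in double braces, as in the example prompt.

```python
import re
from typing import Dict, List

# Assumption: placeholders look like {{identifier}} with no spaces or filters.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def check_dynamic_variables(prompt: str, variables: dict) -> Dict[str, List[str]]:
    """Verify the flat-object rule and report missing/unused variable names."""
    for key, value in variables.items():
        # Only string, number, or boolean values are allowed (no nesting, no null).
        if not isinstance(value, (str, int, float, bool)):
            raise TypeError(f"dynamic variable {key!r} must be a string, number, or boolean")
    referenced = set(PLACEHOLDER.findall(prompt))
    provided = set(variables)
    return {
        "missing": sorted(referenced - provided),  # allowed, but worth knowing about
        "unused": sorted(provided - referenced),   # ignored at runtime
    }


report = check_dynamic_variables(
    "You are helping {{customer_name}} with order {{order_id}}.",
    {"customer_name": "Ada", "tier": "gold"},
)
print(report)
```

Neither list is an error according to the rules above; the report is only a convenience for spotting typos in placeholder names.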

Voice & Speech Parameters

Parameter	Type	Default	Description
`tts_min_sentence_len`	number	`20`	Minimum sentence length (in characters) before TTS starts streaming audio. Lower values = faster response time but choppier audio. Higher values = smoother audio but longer wait time.
`tts_params`	object	See TTS Parameters	Nested object containing voice and TTS generation settings.

TTS Parameters

The tts_params object contains voice-specific settings:

Parameter	Type	Default	Description
`voice_id`	string	`null`	ID of the voice to use. Get available voices from the List Voices endpoint. If not provided, the default voice is used.
`model`	string	`null`	TTS model to use. Supported models: `voiceai-tts-v1-latest`, `voiceai-tts-v1-2026-02-10` (English only), `voiceai-tts-multilingual-v1-latest`, `voiceai-tts-multilingual-v1-2026-02-10` (multilingual). If not provided, automatically selected based on language.
`language`	string	`null`	Language code (ISO 639-1 format) or `"auto"`. Supported languages: `en`, `ca`, `sv`, `es`, `fr`, `de`, `it`, `pt`, `pl`, `ru`, `nl`. `"auto"` detects the language from speech recognition and requires a multilingual model. Defaults to `"en"` at runtime if not set.
`temperature`	number	`1.0`	Sampling temperature (0.0-2.0). Higher values make output more random. Controls voice variation and expressiveness.
`top_p`	number	`0.8`	Nucleus sampling parameter (0.0-1.0). Controls diversity of output.

Automatic Language Detection: Set language: "auto" to automatically detect the user’s language from speech recognition. This requires a multilingual model (e.g., voiceai-tts-multilingual-v1-latest). The agent will match the TTS language to the detected speech language in real-time.
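
The model and language constraints can be checked locally before sending a request. The sketch below is a hypothetical helper (check_tts_params is not an API function); the model names, language codes, ranges, and defaults are copied from the table above.

```python
from typing import List

ENGLISH_MODELS = {"voiceai-tts-v1-latest", "voiceai-tts-v1-2026-02-10"}
MULTILINGUAL_MODELS = {
    "voiceai-tts-multilingual-v1-latest",
    "voiceai-tts-multilingual-v1-2026-02-10",
}
SUPPORTED_LANGUAGES = {"en", "ca", "sv", "es", "fr", "de", "it", "pt", "pl", "ru", "nl"}


def check_tts_params(tts: dict) -> List[str]:
    """Return a list of problems found in a tts_params object."""
    problems = []
    model = tts.get("model")        # None means "selected automatically"
    language = tts.get("language")  # None means "en" at runtime
    if model is not None and model not in ENGLISH_MODELS | MULTILINGUAL_MODELS:
        problems.append(f"unknown model: {model!r}")
    if language not in (None, "auto") and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language!r}")
    if language == "auto" and model in ENGLISH_MODELS:
        problems.append('language "auto" requires a multilingual model')
    if model in ENGLISH_MODELS and language not in (None, "en", "auto"):
        problems.append("English-only model used with a non-English language")
    if not 0.0 <= tts.get("temperature", 1.0) <= 2.0:
        problems.append("temperature must be in 0.0-2.0")
    if not 0.0 <= tts.get("top_p", 0.8) <= 1.0:
        problems.append("top_p must be in 0.0-1.0")
    return problems
```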

Interruption & Turn-Taking

Parameter	Type	Default	Description
`allow_interruptions`	boolean	`true`	Whether users can interrupt the agent while it’s speaking. When `false`, users must wait for the agent to finish.
`allow_interruptions_on_greeting`	boolean	`false`	Whether to allow interruptions during the greeting message. Useful for impatient callers.
`min_interruption_words`	number	`1`	Minimum number of words required for an interruption to be recognized. Range: 0-10. `0` = any speech triggers interruption, `1+` = requires that many words.
`auto_noise_reduction`	boolean	`true`	Enable automatic noise reduction based on environment detection. Improves speech recognition in noisy environments.

Timing & Endpointing

These parameters control when the system detects speech start/end and manages conversation flow:

Parameter	Type	Default	Description
`min_silence_duration`	number	`0.55`	Minimum duration of silence (seconds) before considering speech has ended. Lower = more responsive but may cut off slow speakers.
`min_speech_duration`	number	`0.5`	Minimum duration of speech (seconds) required to trigger an interruption. Prevents false interruptions from brief sounds.
`min_endpointing_delay`	number	`0.5`	Minimum delay (seconds) before considering speech has ended. Works with `min_silence_duration` for endpointing.
`max_endpointing_delay`	number	`3.0`	Maximum delay (seconds) before forcing speech to end. Prevents long pauses from blocking conversation.
`vad_activation_threshold`	number	`0.5`	Voice activity detection threshold. Range: 0.0-1.0. Lower = more sensitive (detects quieter speech), higher = less sensitive (requires clearer speech).
`user_silence_timeout`	number	`10.0`	Seconds of user silence before the agent prompts to check whether the user is still there. Helps detect dropped calls or disengaged users.
`max_call_duration_seconds`	number	`null`	Maximum call duration in seconds. `null` = no limit. Useful for limiting costs or enforcing time constraints.
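
To see which timing values are in effect for a partial config, the defaults from the table can be merged locally, as in the sketch below. The helper names are hypothetical. The check that min_endpointing_delay should not exceed max_endpointing_delay is an assumption drawn from the parameter names, not a documented rule.

```python
from typing import List

# Defaults copied from the table above.
TIMING_DEFAULTS = {
    "min_silence_duration": 0.55,
    "min_speech_duration": 0.5,
    "min_endpointing_delay": 0.5,
    "max_endpointing_delay": 3.0,
    "vad_activation_threshold": 0.5,
    "user_silence_timeout": 10.0,
    "max_call_duration_seconds": None,  # no limit
}


def effective_timing(config: dict) -> dict:
    """Overlay the timing fields present in config onto the documented defaults."""
    merged = dict(TIMING_DEFAULTS)
    merged.update({k: v for k, v in config.items() if k in TIMING_DEFAULTS})
    return merged


def check_timing(config: dict) -> List[str]:
    """Return a list of problems found in the effective timing values."""
    timing = effective_timing(config)
    problems = []
    if not 0.0 <= timing["vad_activation_threshold"] <= 1.0:
        problems.append("vad_activation_threshold must be in 0.0-1.0")
    # Assumption: a minimum delay above the maximum is a configuration mistake.
    if timing["min_endpointing_delay"] > timing["max_endpointing_delay"]:
        problems.append("min_endpointing_delay exceeds max_endpointing_delay")
    return problems


fast_paced = {"min_silence_duration": 0.4, "min_endpointing_delay": 0.3, "max_endpointing_delay": 2.0}
print(effective_timing(fast_paced))
```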

Agent Control Permissions

Parameter	Type	Default	Description
`allow_agent_to_end_call`	boolean	`false`	Whether the agent can end calls via tool calls or timeout. When `true`, agent can proactively end conversations.
`allow_agent_to_skip_turn`	boolean	`false`	Whether the agent can skip turns and yield conversation control. Useful for multi-party scenarios.

Phone Number

Parameter	Type	Default	Description
`phone_number`	string	`null`	Assigned phone number in E.164 format (e.g., `"+14155551234"`). Must be a number from your available phone numbers. See Phone Number Management.

Integrations

Webhooks

Parameter	Type	Default	Description
`webhooks`	object	`null`	Webhook configuration for event notifications, inbound call personalization, and callable tools. See Webhooks for details.

The webhooks object contains three distinct configs. webhooks.events, webhooks.inbound_call, and webhooks.tools use different contracts:

webhooks.events supports secret (write-only on create/update) and has_secret (read-only on fetch).
webhooks.inbound_call supports secret (write-only on create/update) and has_secret (read-only on fetch). Use it to return dynamic_variables for personalization only, not routing.
webhooks.tools define outbound API calls and do not use secret.

Event webhook fields (`webhooks.events`)

Inbound: Voice.ai sends POST requests to your URL. Configure secret for HMAC signature verification.

Field	Type	Required	Default	Description
`events.url`	string	Yes	-	Webhook endpoint URL
`events.secret`	string	No	`null`	HMAC-SHA256 signing secret (write-only on create/update)
`events.has_secret`	boolean	No	`false`	Whether a signing secret is configured (read-only on fetch)
`events.events`	array	No	`[]`	Event types to receive (empty = all)
`events.timeout`	number	No	`5`	Request timeout in seconds (1-30)
`events.enabled`	boolean	No	`true`	Whether webhook event callbacks are active
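
On the receiving side, HMAC-SHA256 verification generally follows the pattern sketched below. This section does not specify the header that carries the signature or its encoding, so the sketch assumes a hex-encoded digest computed over the raw request body; check the Webhooks guide for the actual contract before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Compare a received signature against HMAC-SHA256(secret, raw_body).

    Assumption: the signature is the hex digest of the raw body. The real
    header name and encoding are defined in the Webhooks guide.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


# Example with a locally computed signature (the payload is illustrative only).
body = b'{"event": "call.started"}'
signature = hmac.new(b"your-hmac-secret", body, hashlib.sha256).hexdigest()
print(verify_signature("your-hmac-secret", body, signature))
```

Verify against the raw bytes as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.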

Inbound call webhook fields (`webhooks.inbound_call`)

Inbound: Voice.ai sends POST requests to your URL before an inbound call starts. Use this webhook to personalize a call with dynamic_variables. Do not use inbound_call to route calls to different agents.

Field	Type	Required	Default	Description
`inbound_call.url`	string	Yes	-	Webhook endpoint URL
`inbound_call.secret`	string	No	`null`	HMAC-SHA256 signing secret (write-only on create/update)
`inbound_call.has_secret`	boolean	No	`false`	Whether a signing secret is configured (read-only on fetch)
`inbound_call.timeout`	number	No	`5`	Request timeout in seconds (1-30)
`inbound_call.enabled`	boolean	No	`true`	Whether inbound call personalization is active

Tool webhook fields (`webhooks.tools`)

Outbound: Voice.ai calls your API. Configure auth_type, auth_token, or headers to authenticate the outbound request to your endpoint.

Field	Type	Required	Default	Description
`tools[].name`	string	Yes	-	Tool name used by the agent
`tools[].description`	string	Yes	-	Human-readable tool description
`tools[].url`	string	Yes	-	Your API endpoint URL
`tools[].parameters`	object	Yes	-	Tool argument schema
`tools[].method`	string	Yes	-	`GET`, `POST`, `PUT`, `PATCH`, or `DELETE`
`tools[].execution_mode`	string	Yes	-	`sync` (wait for response) or `async` (accept 2xx)
`tools[].auth_type`	string	Yes	-	`none`, `bearer_token`, `api_key`, or `custom_headers`
`tools[].auth_token`	string	No	`null`	Token for bearer_token or api_key
`tools[].headers`	object	No	`{}`	Custom headers (for auth_type: custom_headers)
`tools[].response`	object	No	`{}`	Expected response shape
`tools[].timeout`	number	No	`10`	Request timeout in seconds
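
Because most tool fields are required, a quick local check can catch incomplete definitions before an update request is sent. The helper names below are hypothetical; the required fields, allowed values, and defaults are copied from the table above.

```python
import copy
from typing import List

REQUIRED_TOOL_FIELDS = (
    "name", "description", "url", "parameters", "method", "execution_mode", "auth_type",
)
ALLOWED_VALUES = {
    "method": {"GET", "POST", "PUT", "PATCH", "DELETE"},
    "execution_mode": {"sync", "async"},
    "auth_type": {"none", "bearer_token", "api_key", "custom_headers"},
}
OPTIONAL_DEFAULTS = {"auth_token": None, "headers": {}, "response": {}, "timeout": 10}


def check_tool(tool: dict) -> List[str]:
    """Return a list of problems found in one webhooks.tools entry."""
    problems = [f"missing required field: {f}" for f in REQUIRED_TOOL_FIELDS if f not in tool]
    for field, allowed in ALLOWED_VALUES.items():
        if field in tool and tool[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    return problems


def with_tool_defaults(tool: dict) -> dict:
    """Show the documented defaults that apply to omitted optional fields."""
    merged = copy.deepcopy(OPTIONAL_DEFAULTS)  # deep copy so defaults are never shared
    merged.update(tool)
    return merged
```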

MCP Servers

Parameter	Type	Default	Description
`mcp_servers`	array	`null`	List of Model Context Protocol (MCP) server configurations. Allows agents to connect to external tools and data sources. See Model Context Protocol for details.

Each MCP server configuration includes:

name (string, required) - Human-readable name
description (string, optional) - Server description
url (string, required) - MCP server endpoint URL
auth_type (string, optional) - Authentication type: "none", "bearer_token", "api_key", "custom_headers". Default: "none"
auth_token (string, optional) - Token for authentication
headers (object, optional) - Custom HTTP headers
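
The field list above can be applied locally as in this sketch, which fills in the documented auth_type default and rejects entries that lack a required field. The function name normalize_mcp_server is hypothetical.

```python
MCP_AUTH_TYPES = {"none", "bearer_token", "api_key", "custom_headers"}


def normalize_mcp_server(server: dict) -> dict:
    """Check required fields and apply the documented auth_type default."""
    for field in ("name", "url"):
        if not server.get(field):
            raise ValueError(f"MCP server is missing required field: {field}")
    normalized = {"auth_type": "none", **server}  # explicit values override the default
    if normalized["auth_type"] not in MCP_AUTH_TYPES:
        raise ValueError(f"auth_type must be one of {sorted(MCP_AUTH_TYPES)}")
    return normalized


servers = [
    normalize_mcp_server({"name": "Database Server", "url": "https://api.example.com/mcp"}),
]
print(servers)
```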

Examples

Basic Configuration

{
  "prompt": "You are a helpful customer support agent.",
  "greeting": "Thank you for calling. How can I help you?",
  "llm_model": "gemini-2.5-flash-lite"
}

Advanced Configuration

{
  "prompt": "You are a technical support specialist. Be concise and professional.",
  "greeting": "Hello, this is technical support. What issue can I help you with?",
  "llm_temperature": 0.5,
  "llm_model": "gemini-2.5-flash",
  "tts_params": {
    "voice_id": "custom_voice_123",
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "en",
    "temperature": 0.9
  },
  "allow_interruptions": true,
  "min_interruption_words": 2,
  "min_silence_duration": 0.6,
  "vad_activation_threshold": 0.6,
  "max_call_duration_seconds": 1800,
  "phone_number": "+14155551234",
  "mcp_servers": [
    {
      "name": "Database Server",
      "url": "https://api.example.com/mcp",
      "auth_type": "bearer_token",
      "auth_token": "your-token"
    }
  ]
}

Multilingual Agent with Auto Language Detection

{
  "prompt": "You are a multilingual customer support agent. Respond in the same language as the customer.",
  "greeting": "Hello! How can I help you today?",
  "llm_model": "gemini-2.5-flash-lite",
  "tts_params": {
    "model": "voiceai-tts-multilingual-v1-latest",
    "language": "auto"
  },
  "allow_interruptions": true
}

This configuration automatically detects the user’s language from speech recognition and matches the TTS output language accordingly.

Configuration for Fast-Paced Conversations

{
  "prompt": "You are a quick and efficient assistant.",
  "llm_temperature": 0.3,
  "allow_interruptions": true,
  "min_interruption_words": 0,
  "min_silence_duration": 0.4,
  "min_endpointing_delay": 0.3,
  "max_endpointing_delay": 2.0,
  "vad_activation_threshold": 0.4
}

Configuration for Formal Conversations

{
  "prompt": "You are a professional business assistant. Speak formally and wait for complete thoughts.",
  "allow_interruptions": false,
  "min_silence_duration": 0.8,
  "min_endpointing_delay": 0.7,
  "max_endpointing_delay": 4.0,
  "vad_activation_threshold": 0.6
}

Configuration with Webhooks

{
  "prompt": "You are a helpful sales assistant.",
  "greeting": "Hi! How can I help you today?",
  "webhooks": {
    "events": {
      "url": "https://your-server.com/webhooks/voice-events",
      "secret": "your-hmac-secret",
      "events": ["call.started", "call.completed"],
      "enabled": true
    },
    "inbound_call": {
      "url": "https://your-server.com/webhooks/inbound-call",
      "secret": "your-inbound-call-secret",
      "enabled": true
    },
    "tools": [
      {
        "name": "get_account_status",
        "description": "Fetches the latest account status for a customer.",
        "url": "https://your-server.com/webhooks/tools/account-status",
        "parameters": { "customer_id": "string" },
        "method": "POST",
        "execution_mode": "sync",
        "auth_type": "api_key",
        "auth_token": "your-api-key",
        "headers": { "X-Service-Version": "2026-02" },
        "response": {
          "type": "object",
          "properties": {
            "status": { "type": "string" },
            "tier": { "type": "string" }
          }
        },
        "timeout": 10
      }
    ]
  }
}

webhooks.events and webhooks.inbound_call use secret for HMAC verification of inbound requests. webhooks.tools use auth_type/auth_token for outbound API authentication. See the Webhooks guide for payload examples and the Web SDK guide for SDK usage.

Best Practices

Start Simple: Begin with just prompt and greeting, then add parameters as needed
Test Incrementally: Adjust one parameter at a time to understand its effect
Timing Parameters: Fine-tune min_silence_duration and vad_activation_threshold based on your use case
Interruptions: Set allow_interruptions: false for formal scenarios, true for casual conversations
Voice Selection: Use voice_id in tts_params to select a specific voice from your available voices
Call Duration: Set max_call_duration_seconds to prevent runaway costs or enforce time limits

Next Steps

Agent Quickstart - Create your first agent
Webhooks - Receive event, inbound call, and tool webhook notifications
Web SDK - Pass dynamic_variables and connect from the browser
Phone Number Management - Assign phone numbers
Model Context Protocol - Connect MCP servers
API Reference - Complete API documentation
