Complete reference for all agent configuration parameters available in the Voice AI API.
Prerequisites: API key

Overview

Agent configuration consists of:
  1. Root-level parameters - name and kb_id are set at the root level of create/update requests
  2. config object - Contains conversation style, voice settings, timing controls, and integrations
All parameters are optional except name, which is required when creating an agent. The system applies its defaults to any omitted fields.
Status Management: Agent status (paused, deployed, disabled) cannot be updated directly through the update endpoint. Use the dedicated endpoints: Deploy Agent, Pause Agent, or Delete Agent.

Agent-Level Parameters

These parameters are set at the root level of agent create/update requests, not inside the config object:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes (create), No (update) | Agent name. Must be at least 1 character. |
| kb_id | number | No | Knowledge base ID to assign to the agent. Set to an existing knowledge base ID to assign it, or null to remove the knowledge base. See Knowledge Base Management. |
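As a minimal sketch of the kb_id behavior described above, the snippet below builds the JSON body of an update request. The helper name kb_update_body is hypothetical, and the sketch only produces the body; it does not call the API. The point it illustrates is that Python's None serializes to JSON null, which is the value that removes the knowledge base.

```python
import json


def kb_update_body(kb_id):
    """Build an update body that assigns (int) or removes (None) a knowledge base.

    Hypothetical helper for illustration; not part of the Voice AI API.
    """
    if kb_id is not None and (isinstance(kb_id, bool) or not isinstance(kb_id, int)):
        raise TypeError("kb_id must be an integer or None")
    # kb_id sits at the root of the body, not inside "config".
    return json.dumps({"kb_id": kb_id})


print(kb_update_body(123))   # {"kb_id": 123}
print(kb_update_body(None))  # {"kb_id": null}
```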

Configuration Structure

When creating or updating an agent, the request structure is:
{
  "name": "My Agent",
  "config": {
    "prompt": "You are a helpful assistant...",
    "greeting": "Hello! How can I help you?",
    "llm_temperature": 0.7,
    "llm_model": "gemini-2.5-flash-lite",
    "tts_min_sentence_len": 20,
    "tts_params": {
      "voice_id": "abc",
      "temperature": 1.0
    },
    "allow_interruptions": true,
    "min_silence_duration": 0.55,
    "phone_number": "+14155551234",
    "mcp_servers": [...]
  },
  "kb_id": 123
}
The config object contains all behavioral parameters. The kb_id parameter is set at the root level to assign a knowledge base to the agent.
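The structure above could be assembled in code roughly as follows. This is a sketch under assumptions: build_create_request is a hypothetical helper, and it only builds the request body, leaving the HTTP call and authentication to your own client. It keeps name and kb_id at the root and everything else under config.

```python
import json


def build_create_request(name, config=None, kb_id=None):
    """Assemble a create-agent body with root-level name/kb_id and a nested config.

    Hypothetical helper for illustration; not part of the Voice AI API.
    """
    if not isinstance(name, str) or len(name) < 1:
        raise ValueError("name must be a string of at least 1 character")
    body = {"name": name}
    if config:
        # Behavioral parameters all live inside "config".
        body["config"] = dict(config)
    if kb_id is not None:
        # kb_id belongs at the root, next to "name".
        body["kb_id"] = kb_id
    return body


body = build_create_request(
    "My Agent",
    {"prompt": "You are a helpful assistant...", "llm_temperature": 0.7},
    kb_id=123,
)
print(json.dumps(body, indent=2))
```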

Core Conversation Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | null | Custom instructions that define the agent’s personality, role, and behavior. This is the system prompt sent to the LLM. |
| greeting | string | null | Initial message the agent speaks when a call starts. If not provided, the agent will start with a default greeting. |
| llm_temperature | number | 0.7 | Controls the creativity/randomness of the LLM responses. Range: 0.0-2.0. Lower values (0.0-0.5) = more focused and deterministic. Higher values (1.0-2.0) = more creative and varied. |
| llm_model | string | "gemini-2.5-flash-lite" | The language model to use. Options: "gpt-4o-mini", "gemini-2.5-flash", "gemini-2.5-flash-lite", "gpt-4o", "gemini-2.5-pro". |
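A client-side check of these parameters before sending a request might look like the sketch below. The function name check_core_params is hypothetical; the ranges and model names are copied from the table, but whether the API itself rejects out-of-range values (and with what error) is not stated here, so treat this as a local pre-flight check only.

```python
# Model names copied from the llm_model row above.
ALLOWED_LLM_MODELS = {
    "gpt-4o-mini",
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gpt-4o",
    "gemini-2.5-pro",
}


def check_core_params(config):
    """Return a list of problems found in the core conversation parameters."""
    problems = []
    for key in ("prompt", "greeting"):
        value = config.get(key)
        if value is not None and not isinstance(value, str):
            problems.append(f"{key} must be a string or null")
    # Fall back to the documented defaults when a field is omitted.
    temperature = config.get("llm_temperature", 0.7)
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        problems.append("llm_temperature must be a number")
    elif not 0.0 <= temperature <= 2.0:
        problems.append("llm_temperature must be between 0.0 and 2.0")
    model = config.get("llm_model", "gemini-2.5-flash-lite")
    if model not in ALLOWED_LLM_MODELS:
        problems.append(f"unknown llm_model: {model!r}")
    return problems


print(check_core_params({"llm_temperature": 2.5, "llm_model": "gpt-5"}))
```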

Voice & Speech Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tts_min_sentence_len | number | 20 | Minimum sentence length (in characters) before TTS starts streaming audio. Lower values = faster response time but choppier audio. Higher values = smoother audio but longer wait time. |
| tts_params | object | See TTS Parameters | Nested object containing voice and TTS generation settings. |
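To build intuition for the tts_min_sentence_len trade-off, here is a toy model of sentence buffering. It is an assumption about the general technique, not the service's actual implementation: text arrives in pieces, and a chunk is released for synthesis only once it is at least min_len characters long and ends at sentence punctuation. The name chunk_for_tts is hypothetical.

```python
def chunk_for_tts(pieces, min_len=20):
    """Yield text chunks of at least min_len characters that end a sentence.

    Toy illustration only; the real TTS pipeline may behave differently.
    """
    buffer = ""
    for piece in pieces:
        buffer += piece
        candidate = buffer.strip()
        if len(candidate) >= min_len and candidate.endswith((".", "!", "?")):
            yield candidate
            buffer = ""
    # Flush whatever is left when the text stream ends.
    if buffer.strip():
        yield buffer.strip()


pieces = ["Hi. ", "Thanks for calling. ", "How can I help?"]
print(list(chunk_for_tts(pieces, min_len=20)))
print(list(chunk_for_tts(pieces, min_len=1)))
```

With the higher threshold the short "Hi." is held back and merged into the next sentence (fewer, longer chunks); with a threshold of 1 every sentence is released as soon as it completes.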

TTS Parameters

The tts_params object contains voice-specific settings:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| voice_id | string | null | ID of the voice to use. Get available voices from the List Voices endpoint. If not provided, the default voice is used. |
| temperature | number | 1.0 | Sampling temperature (0.0-2.0). Higher values make output more random. Controls voice variation and expressiveness. |
| top_p | number | 0.8 | Nucleus sampling parameter (0.0-1.0). Controls diversity of output. |

Interruption & Turn-Taking

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| allow_interruptions | boolean | true | Whether users can interrupt the agent while it’s speaking. When false, users must wait for the agent to finish. |
| allow_interruptions_on_greeting | boolean | false | Whether to allow interruptions during the greeting message. Useful for impatient callers. |
| min_interruption_words | number | 1 | Minimum number of words required for an interruption to be recognized. Range: 0-10. 0 = any speech triggers interruption, 1+ = requires that many words. |
| auto_noise_reduction | boolean | true | Enable automatic noise reduction based on environment detection. Improves speech recognition in noisy environments. |
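The interaction between allow_interruptions and min_interruption_words can be pictured with the sketch below. It models only the semantics stated in the table (0 means any speech counts, 1 or more means that many words are required); the function is_interruption is hypothetical, and the real system may count words or handle partial transcripts differently.

```python
def is_interruption(transcript, min_interruption_words=1, allow_interruptions=True):
    """Decide whether detected user speech should interrupt the agent.

    Illustrative model of the documented semantics, not the service's code.
    """
    if not 0 <= min_interruption_words <= 10:
        raise ValueError("min_interruption_words must be between 0 and 10")
    if not allow_interruptions:
        return False
    if min_interruption_words == 0:
        # Any detected speech counts, even before words are transcribed.
        return True
    return len(transcript.split()) >= min_interruption_words


print(is_interruption("uh", min_interruption_words=2))
print(is_interruption("wait please", min_interruption_words=2))
```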

Timing & Endpointing

These parameters control when the system detects speech start/end and manages conversation flow:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| min_silence_duration | number | 0.55 | Minimum duration of silence (seconds) before considering speech has ended. Lower = more responsive but may cut off slow speakers. |
| min_speech_duration | number | 0.5 | Minimum duration of speech (seconds) required to trigger an interruption. Prevents false interruptions from brief sounds. |
| min_endpointing_delay | number | 0.5 | Minimum delay (seconds) before considering speech has ended. Works with min_silence_duration for endpointing. |
| max_endpointing_delay | number | 3.0 | Maximum delay (seconds) before forcing speech to end. Prevents long pauses from blocking conversation. |
| vad_activation_threshold | number | 0.5 | Voice activity detection threshold. Range: 0.0-1.0. Lower = more sensitive (detects quieter speech), higher = less sensitive (requires clearer speech). |
| user_silence_timeout | number | 10.0 | Seconds of user silence before agent prompts to check if user is still there. Helps detect dropped calls or disengaged users. |
| max_call_duration_seconds | number | null | Maximum call duration in seconds. null = no limit. Useful for limiting costs or enforcing time constraints. |
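Because several of these values interact, a quick consistency check before sending a config can catch mistakes early. The sketch below fills in the documented defaults and then applies three checks. Only the 0.0-1.0 range for vad_activation_threshold comes from the table; the non-negative rule and the requirement that min_endpointing_delay not exceed max_endpointing_delay are assumptions that seem reasonable but are not confirmed by this reference. The name check_timing is hypothetical.

```python
# Defaults copied from the table above (None stands for JSON null).
TIMING_DEFAULTS = {
    "min_silence_duration": 0.55,
    "min_speech_duration": 0.5,
    "min_endpointing_delay": 0.5,
    "max_endpointing_delay": 3.0,
    "vad_activation_threshold": 0.5,
    "user_silence_timeout": 10.0,
    "max_call_duration_seconds": None,
}


def check_timing(config):
    """Return a list of problems in the timing parameters of a config dict."""
    merged = {**TIMING_DEFAULTS, **config}
    problems = []
    # Documented range.
    if not 0.0 <= merged["vad_activation_threshold"] <= 1.0:
        problems.append("vad_activation_threshold must be between 0.0 and 1.0")
    # Assumption: durations cannot be negative.
    for key in TIMING_DEFAULTS:
        value = merged[key]
        if value is not None and value < 0:
            problems.append(f"{key} must not be negative")
    # Assumption: the minimum delay should not exceed the maximum.
    if merged["min_endpointing_delay"] > merged["max_endpointing_delay"]:
        problems.append("min_endpointing_delay exceeds max_endpointing_delay")
    return problems


print(check_timing({"min_endpointing_delay": 4.0}))
```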

Agent Control Permissions

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| allow_agent_to_end_call | boolean | false | Whether the agent can end calls via tool calls or timeout. When true, agent can proactively end conversations. |
| allow_agent_to_skip_turn | boolean | false | Whether the agent can skip turns and yield conversation control. Useful for multi-party scenarios. |

Phone Number

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| phone_number | string | null | Assigned phone number in E.164 format (e.g., "+14155551234"). Must be a number from your available phone numbers. See Phone Number Management. |
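A format check for E.164 can be done locally before the request is sent. The sketch below tests only the shape of the string (a plus sign, a non-zero leading digit, and at most 15 digits in total); it cannot tell whether the number is one of your available phone numbers. The helper name is_e164 is hypothetical.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number):
    """Return True if number is a string in E.164 format."""
    return isinstance(number, str) and E164_PATTERN.fullmatch(number) is not None


print(is_e164("+14155551234"))     # True
print(is_e164("+1 415 555 1234"))  # False (spaces are not allowed)
```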

Integrations

MCP Servers

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mcp_servers | array | null | List of Model Context Protocol (MCP) server configurations. Allows agents to connect to external tools and data sources. See Model Context Protocol for details. |
Each MCP server configuration includes:
  • name (string, required) - Human-readable name
  • description (string, optional) - Server description
  • url (string, required) - MCP server endpoint URL
  • auth_type (string, optional) - Authentication type: "none", "bearer_token", "api_key", "custom_headers". Default: "none"
  • auth_token (string, optional) - Token for authentication
  • headers (object, optional) - Custom HTTP headers
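The field list above could be turned into a small builder like the one below. This is a sketch: make_mcp_server is a hypothetical helper, it enforces only the required fields and the four documented auth_type values, and it omits unset optional fields on the assumption that omitted fields take their defaults.

```python
# Values copied from the auth_type bullet above.
AUTH_TYPES = {"none", "bearer_token", "api_key", "custom_headers"}


def make_mcp_server(name, url, description=None, auth_type="none",
                    auth_token=None, headers=None):
    """Build one entry for the mcp_servers array.

    Hypothetical helper for illustration; not part of the Voice AI API.
    """
    if not name:
        raise ValueError("name is required")
    if not url:
        raise ValueError("url is required")
    if auth_type not in AUTH_TYPES:
        raise ValueError(f"auth_type must be one of {sorted(AUTH_TYPES)}")
    server = {"name": name, "url": url, "auth_type": auth_type}
    optional = {"description": description, "auth_token": auth_token, "headers": headers}
    # Leave out optional fields that were not provided.
    server.update({key: value for key, value in optional.items() if value is not None})
    return server


config = {
    "mcp_servers": [
        make_mcp_server(
            "Database Server",
            "https://api.example.com/mcp",
            auth_type="bearer_token",
            auth_token="your-token",
        )
    ]
}
print(config)
```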

Examples

Basic Configuration

{
  "prompt": "You are a helpful customer support agent.",
  "greeting": "Thank you for calling. How can I help you?",
  "llm_model": "gemini-2.5-flash-lite"
}

Advanced Configuration

{
  "prompt": "You are a technical support specialist. Be concise and professional.",
  "greeting": "Hello, this is technical support. What issue can I help you with?",
  "llm_temperature": 0.5,
  "llm_model": "gemini-2.5-flash",
  "tts_params": {
    "voice_id": "custom_voice_123",
    "temperature": 0.9
  },
  "allow_interruptions": true,
  "min_interruption_words": 2,
  "min_silence_duration": 0.6,
  "vad_activation_threshold": 0.6,
  "max_call_duration_seconds": 1800,
  "phone_number": "+14155551234",
  "mcp_servers": [
    {
      "name": "Database Server",
      "url": "https://api.example.com/mcp",
      "auth_type": "bearer_token",
      "auth_token": "your-token"
    }
  ]
}

Configuration for Fast-Paced Conversations

{
  "prompt": "You are a quick and efficient assistant.",
  "llm_temperature": 0.3,
  "allow_interruptions": true,
  "min_interruption_words": 0,
  "min_silence_duration": 0.4,
  "min_endpointing_delay": 0.3,
  "max_endpointing_delay": 2.0,
  "vad_activation_threshold": 0.4
}

Configuration for Formal Conversations

{
  "prompt": "You are a professional business assistant. Speak formally and wait for complete thoughts.",
  "allow_interruptions": false,
  "min_silence_duration": 0.8,
  "min_endpointing_delay": 0.7,
  "max_endpointing_delay": 4.0,
  "vad_activation_threshold": 0.6
}

Best Practices

  • Start Simple: Begin with just prompt and greeting, then add parameters as needed
  • Test Incrementally: Adjust one parameter at a time to understand its effect
  • Timing Parameters: Fine-tune min_silence_duration and vad_activation_threshold based on your use case
  • Interruptions: Set allow_interruptions: false for formal scenarios, true for casual conversations
  • Voice Selection: Use voice_id in tts_params to select a specific voice from your available voices
  • Call Duration: Set max_call_duration_seconds to prevent runaway costs or enforce time limits
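One way to follow the "Test Incrementally" advice is to generate trial configs that each differ from a baseline in exactly one parameter. The sketch below does that; one_at_a_time is a hypothetical helper, and the candidate values are arbitrary examples, not recommendations.

```python
def one_at_a_time(base, candidates):
    """Return configs that each change exactly one parameter of base.

    candidates maps a parameter name to the values to try for it.
    """
    variants = []
    for key, values in candidates.items():
        for value in values:
            if base.get(key) == value:
                continue  # identical to the baseline, nothing to test
            variants.append({**base, key: value})
    return variants


base = {
    "prompt": "You are a helpful customer support agent.",
    "min_silence_duration": 0.55,
    "vad_activation_threshold": 0.5,
}
trials = one_at_a_time(base, {
    "min_silence_duration": [0.4, 0.55, 0.8],
    "vad_activation_threshold": [0.4, 0.6],
})
for trial in trials:
    print(trial)
```

Each resulting dict can be sent as the config of an update request and evaluated on a test call before moving to the next one.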

Next Steps