Quickstart - Voice AI Developer Docs

The Voice.ai API provides a straightforward interface to our state-of-the-art text to speech and voice cloning. This short guide will help you get started with our Text to Speech API. More details about the Text to Speech API can be found in the API Documentation.

Create Your API Key

Visit the API Keys page to create your first API key. On that page, click the “+ API Key” button, typically located in the top-right corner, to generate a new API key. Once you have created your API key, you can make your first API request.

Creating an Agent with Python

Prerequisites:

Python 3 or higher (3.9 or later for the --upgrade-deps flag used below)
A virtual environment

You can download the full code for this example here. First, you’ll need Python 3 installed on your system. You can find the installers for your operating system here.

Virtual Environment Setup

python3 -m venv --upgrade-deps .venv

On Unix or macOS, activate the virtual environment with:

source .venv/bin/activate

On Windows you’ll activate the virtual environment with:

.\.venv\Scripts\activate

We’re going to use the requests library for our example:

pip install requests
pip freeze > requirements.txt

Now your environment is ready to go!

Phone Number Setup

We’re going to need a phone number to call our agent. Let’s start by making our first API call to query for numbers available for purchase. You will need a payment method set up to use this endpoint. You will also need an API key to call this endpoint; you can go here to create one. Make sure you save it somewhere safe, as you can only access it once. If you lose it, just delete the old one and create another.

import requests

API_KEY = 'put-your-api-key-here'
BASE_URL = 'https://agent.voice.ai/api/v1'
HEADERS = {
    'Authorization': f'Bearer {API_KEY}'
}
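
Rather than pasting the key into the script, one option is to read it from an environment variable so it never ends up in version control. The sketch below is only an illustration: the variable name VOICE_AI_API_KEY and the helper build_headers are hypothetical, not part of the Voice.ai API. The header format is the same Bearer scheme used above.

```python
import os


def build_headers(env_var: str = "VOICE_AI_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this sketch; use any name you like.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {"Authorization": f"Bearer {api_key}"}
```

With this in place, HEADERS = build_headers() can replace the hard-coded dictionary.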

Make sure you replace the placeholder with your own API key, or the next call will fail. Now we’re going to actually call the API:

try:
    available_phone_numbers_request_body = {
        'country_code': 'US'
    }

    available_phone_numbers_response = requests.post(
        f'{BASE_URL}/agent/search-phone-numbers', 
        headers=HEADERS, 
        json=available_phone_numbers_request_body
    )

    if available_phone_numbers_response.status_code != 200:
        raise Exception(f'''
                            Error while checking available phone numbers 
                            {available_phone_numbers_response.status_code} 
                            {available_phone_numbers_response.text}
                        ''')

    available_phone_numbers = available_phone_numbers_response.json()

    print(f'Got available phone numbers from API {available_phone_numbers}')

except requests.exceptions.JSONDecodeError:
    print('Error could not decode JSON from response.')
except requests.exceptions.RequestException as e:
    print(f'Error during request: {e}')
except Exception as e:
    print(f'Error: {e}')

If that works correctly, you should see a list of phone numbers available for purchase in JSON format. Next, let’s add the code to actually purchase one. Once this code is added, running the script will purchase a number on your behalf, so be careful running it after this point. The rest of the agent code in this quickstart should be placed inside the try block.

if available_phone_numbers.get('total_results', 0) == 0:
    raise Exception('No phone numbers found to purchase')

first_available_phone_number = available_phone_numbers.get('results', [])[0].get('phone_number')

if not first_available_phone_number:
    raise Exception('Could not parse available phone numbers response')

purchase_request_body = {
    'phone_number': first_available_phone_number
}

purchase_phone_number_response = requests.post(
    f'{BASE_URL}/agent/purchase-phone-number', 
    headers=HEADERS, 
    json=purchase_request_body
)

if purchase_phone_number_response.status_code != 200:
    raise Exception(f'''
                    Error attempting to purchase phone number 
                    {purchase_phone_number_response.status_code} 
                    {purchase_phone_number_response.text}
                    ''')

purchased_phone_number = purchase_phone_number_response.json()

if purchased_phone_number.get('status', 'unknown') != 'purchased':
    raise Exception('Phone number not available for purchase, please try again')

print(f'Purchased phone number for agent {purchased_phone_number}')
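
The number-selection step above indexes straight into the results list, which raises an IndexError if the list turns out to be empty. A minimal sketch of a more defensive helper follows, assuming the response shape used in this guide (a results list whose items carry a phone_number field). The function name first_phone_number is hypothetical.

```python
def first_phone_number(search_response: dict) -> str:
    """Return the first phone number from a search response.

    Assumes the shape used in this guide: {'results': [{'phone_number': ...}, ...]}.
    """
    results = search_response.get("results") or []
    if not results:
        raise ValueError("No phone numbers found to purchase")

    number = results[0].get("phone_number")
    if not number:
        raise ValueError("Could not parse available phone numbers response")
    return number
```

Because the helper takes a plain dictionary, it can be checked against a hand-written response without making a request or purchasing anything.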

Agent Configuration and Deployment

Now that we have a phone number, the last thing we need to do is configure and deploy our agent with it. This is a minimal configuration; for a full reference of the available agent configuration options, see here.

agent_config = {
    'name': 'My First Agent',
    'config': {
        'prompt': '''
            You are a helpful call center agent. 
            You will help customers with billing problems, returns, and questions about merchandise.
        ''',
        'greeting': 'Thank you for calling the Voice AI help line, what can I help you with?',
        'llm_model': 'gemini-2.5-flash-lite',
        'allow_interruptions': True,
        'min_interruption_words': 1,
        'auto_noise_reduction': True,
        'allow_agent_to_skip_turn': True,
        'phone_number': first_available_phone_number
    }
}

create_agent_response = requests.post(f'{BASE_URL}/agent/', headers=HEADERS, json=agent_config)

if create_agent_response.status_code != 201:
    raise Exception(f'Error creating agent {create_agent_response.status_code} {create_agent_response.text}')

agent_id = create_agent_response.json().get('agent_id')

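deploy_agent_response = requests.post(f'{BASE_URL}/agent/{agent_id}/deploy', headers=HEADERS)

# Only report success if the deploy call returned a non-error status
if not deploy_agent_response.ok:
    raise Exception(f'Error deploying agent {deploy_agent_response.status_code} {deploy_agent_response.text}')

print(f'Congrats! Your agent {agent_id} can now be called at {first_available_phone_number}')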

Now you can call your agent and test it! You can manage your agents and phone numbers, view metrics, and more through the UI or the API.

Non-Streaming Text to Speech Example

The following example shows how to generate audio using the standard (non-streaming) Text to Speech endpoint.

import requests
# Your Voice.ai API key
api_key = "YOUR_API_KEY"
url = "https://tts-api.voice.ai/api/v2/audio/speech"
# Request headers
headers = {
    "X-API-Token": api_key,
    "Content-Type": "application/json"
}
# Request payload (non-streaming)
payload = {
    "voice": "d1bf0f33-8e0e-4fbf-acf8-45c3c6262513",
    "text": "Hello! This is a test of the Voice.ai text to speech API.",
    "audio_format": "mp3",
    "streaming": False
}
# Make the API request (non-streaming)
response = requests.post(url, headers=headers, json=payload)
# Check if the request was successful (200 or 201)
if response.status_code in (200, 201):
    output_file_path = "output_audio.mp3"
    with open(output_file_path, "wb") as f:
        f.write(response.content)
    print(f"Success! Audio saved to {output_file_path}")
else:
    print(f"Error: {response.status_code}")
    print(response.text)
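
The request payload above can also be produced by a small helper, which keeps the field names in one place and makes the streaming flag easy to flip. This is a minimal sketch that assumes only the four fields shown in the example. The function name build_tts_payload and the empty-text check are illustrative additions, not API requirements.

```python
# Voice ID copied from the example above
DEFAULT_VOICE = "d1bf0f33-8e0e-4fbf-acf8-45c3c6262513"


def build_tts_payload(text, voice=DEFAULT_VOICE, audio_format="mp3", streaming=False):
    """Build the JSON body for a Text to Speech request."""
    if not text or not text.strip():
        # Local sanity check for this sketch; the API may apply its own validation
        raise ValueError("text must not be empty")
    return {
        "voice": voice,
        "text": text,
        "audio_format": audio_format,
        "streaming": streaming,
    }
```

The result can be passed directly as json=build_tts_payload("Hello!") in the requests.post call.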

Streaming Support for Low Latency Use-Cases

The previous example shows how to generate audio using our standard speed platform, which is perfect for most use cases. For use cases that require low latency, such as Voice AI agents, we also offer an endpoint with streaming support that sends the audio data as it is generated, rather than waiting for the whole file to complete before returning it. Below is an example of how to generate audio using our streaming endpoint.

import requests
# Your Voice.ai API key
api_key = "YOUR_API_KEY"
url = "https://tts-api.voice.ai/api/v2/audio/speech"
# Request headers
headers = {
    "X-API-Token": api_key,
    "Content-Type": "application/json"
}
# Request payload (streaming)
payload = {
    "voice": "d1bf0f33-8e0e-4fbf-acf8-45c3c6262513",
    "text": "Hello! This is a test of the Voice.ai text to speech API.",
    "audio_format": "mp3",
    "streaming": True
}
# Make the API request with stream=True to handle streaming response
response = requests.post(url, headers=headers, json=payload, stream=True)
# Check if the request was successful (200 or 201)
if response.status_code in (200, 201):
    output_file_path = "stream_output.mp3"
    with open(output_file_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
    print(f"Success! Streaming audio saved to {output_file_path}")
else:
    print(f"Error: {response.status_code}")
    print(response.text)
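
The chunk-writing loop above can be pulled into a helper that accepts any iterable of byte chunks, which makes it possible to exercise the file handling without calling the API. A minimal sketch follows. The name write_chunks is hypothetical, and the byte count it returns is simply a convenience for logging.

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], output_file_path: str) -> int:
    """Write non-empty chunks to a file as they arrive; return total bytes written."""
    total = 0
    with open(output_file_path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks, as in the loop above
                f.write(chunk)
                total += len(chunk)
    return total
```

In the streaming example this would be called as write_chunks(response.iter_content(chunk_size=1024), "stream_output.mp3").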
