Voice Cloning API

Clone a voice from an audio sample to create a custom voice for speech generation.

Prerequisites: API key, audio sample (MP3, WAV, or PCM format)

Clone the Voice

Upload your audio file directly using multipart/form-data. See the Clone Voice endpoint for details.

import requests

# Upload audio file directly
with open('voice_sample.mp3', 'rb') as f:
    files = {'file': ('voice_sample.mp3', f, 'audio/mpeg')}
    data = {
        'name': 'My Cloned Voice',
        'voice_visibility': 'PRIVATE'
    }
    response = requests.post(
        'https://dev.voice.ai/api/v1/tts/clone-voice',
        headers={'Authorization': 'Bearer YOUR_API_KEY'},
        files=files,
        data=data
    )

data = response.json()
print(f"Voice ID: {data['voice_id']}, Status: {data['status']}")

Check Voice Status

Check the voice status using the voice_id from the clone response. See the Get Voice endpoint for details.

import requests
import time

voice_id = data['voice_id']  # From clone response

# Poll until voice is available
while True:
    response = requests.get(
        f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
        headers={'Authorization': 'Bearer YOUR_API_KEY'}
    )
    response.raise_for_status()  # Fail fast on HTTP errors (e.g. an invalid API key)
    status_data = response.json()
    status = status_data['status']
    print(f"Status: {status}")
    
    if status == 'AVAILABLE':
        print("Voice is ready!")
        break
    elif status == 'FAILED':
        print("Voice cloning failed")
        break
    
    time.sleep(2)  # Wait 2 seconds before checking again

Status values:

PENDING - Voice cloning has started
PROCESSING - Voice is being processed
AVAILABLE - Voice is ready to use
FAILED - Voice cloning failed
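
The polling loop above runs until the voice reaches AVAILABLE or FAILED, so a clone that stalls in PENDING or PROCESSING would block indefinitely. A minimal sketch of a bounded wait follows. The helper name wait_for_voice, the 120-second default timeout, and the injected fetch_status callable are illustrative assumptions, not part of the API.

```python
import time

# Statuses after which polling can stop (from the status list above)
TERMINAL_STATUSES = {"AVAILABLE", "FAILED"}


def wait_for_voice(fetch_status, timeout_s=120.0, interval_s=2.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status.

    fetch_status is any zero-argument callable returning the current
    status string. Raises TimeoutError if no terminal status is seen
    before timeout_s seconds have elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next wait would run past the deadline
        if clock() + interval_s > deadline:
            raise TimeoutError(f"Voice still {status} after {timeout_s} seconds")
        sleep(interval_s)
```

In practice, fetch_status would wrap the GET request from the example above and return the status field of its JSON response. Passing sleep and clock as parameters keeps the helper easy to test without real delays.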

List Voices

List all voices owned by the authenticated user. See the List Voices endpoint for details.

import requests

response = requests.get(
    'https://dev.voice.ai/api/v1/tts/voices',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

voices = response.json()
for voice in voices:
    print(f"Voice ID: {voice['voice_id']}, Name: {voice['name']}, Status: {voice['status']}")
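
Because each listed voice carries a status, a client may want to keep only the voices that are ready for speech generation. The sketch below assumes, as the loop above does, that the response is a list of objects with voice_id, name, and status fields. The helper name ready_voices and the sample records are hypothetical.

```python
def ready_voices(voices):
    """Return only the voices whose status is AVAILABLE."""
    return [v for v in voices if v.get("status") == "AVAILABLE"]


# Hypothetical sample shaped like the fields printed above
sample = [
    {"voice_id": "v1", "name": "Narrator", "status": "AVAILABLE"},
    {"voice_id": "v2", "name": "Draft", "status": "PROCESSING"},
    {"voice_id": "v3", "name": "Broken", "status": "FAILED"},
]

for voice in ready_voices(sample):
    print(f"Ready: {voice['voice_id']} ({voice['name']})")
```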

Request Parameters

file (required) - Audio file (MP3, WAV, or OGG format, max 7.5MB)
name (optional) - Name for the voice
voice_visibility (optional) - "PUBLIC" or "PRIVATE" (default: "PUBLIC")

These parameters apply to the Clone Voice endpoint, which accepts multipart/form-data. Upload the audio file directly; no base64 encoding is needed.
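
Since voice_visibility defaults to "PUBLIC", a caller who wants a private voice has to send "PRIVATE" explicitly. The following sketch builds the optional form fields and rejects unknown visibility values before any request is made. The helper name build_clone_fields is hypothetical, and the local validation is an assumption about what is convenient for a client, not documented server behavior.

```python
# Values listed for voice_visibility in the parameter list above
ALLOWED_VISIBILITY = {"PUBLIC", "PRIVATE"}


def build_clone_fields(name=None, voice_visibility=None):
    """Build the optional form fields for a clone request.

    Fields left as None are omitted so the server applies its defaults.
    """
    fields = {}
    if name is not None:
        fields["name"] = name
    if voice_visibility is not None:
        if voice_visibility not in ALLOWED_VISIBILITY:
            raise ValueError(
                f"voice_visibility must be one of {sorted(ALLOWED_VISIBILITY)}"
            )
        fields["voice_visibility"] = voice_visibility
    return fields


print(build_clone_fields(name="My Cloned Voice", voice_visibility="PRIVATE"))
```

The returned dictionary can be passed as the data argument of requests.post, alongside the files argument that carries the audio.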

Update Voice

Update voice metadata (name and/or visibility). See the Update Voice endpoint for details.

import requests

response = requests.patch(
    f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},  # json= sets Content-Type automatically
    json={'name': 'Updated Voice Name', 'voice_visibility': 'PRIVATE'}
)

data = response.json()
print(f"Updated: {data['name']}, Visibility: {data['voice_visibility']}")

Delete Voice

Delete a voice. Only the voice's owner can delete it. See the Delete Voice endpoint for details.

import requests

response = requests.delete(
    f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

data = response.json()
print(f"Deleted: {data['voice_id']}")

Audio Requirements

Formats: MP3, WAV, or PCM
Max size: 7.5MB
Quality: Clear, single-speaker audio works best
Duration: 10-60 seconds recommended
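
Some of these requirements can be checked locally before uploading. The sketch below covers only format and size. It assumes that "7.5MB" means 7,500,000 bytes and that the formats map to the .mp3, .wav, and .pcm extensions; both are guesses, and the Request Parameters list names OGG rather than PCM, so adjust the set to whatever the endpoint actually accepts. The helper name check_audio_sample is hypothetical.

```python
import os

MAX_BYTES = 7_500_000  # Assumption: "7.5MB" read as decimal megabytes
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".pcm"}  # Assumption: one extension per listed format


def check_audio_sample(filename, size_bytes):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems


print(check_audio_sample("voice_sample.mp3", 1_000_000))
```

For a file on disk, the size can be obtained with os.path.getsize(path). Duration and speaker count are not checked here, since that would require decoding the audio.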

Generate Speech

Use your voice_id to generate speech

Streaming

Learn about real-time audio generation

API Reference

Explore complete API documentation
