Clone a voice from an audio sample to create a custom voice for speech generation.
Prerequisites: an API key and an audio sample in MP3, WAV, or PCM format.
1. Convert Audio to Base64

Encode your audio file to base64 format:
import base64

with open('voice_sample.mp3', 'rb') as f:
    base64_audio = base64.b64encode(f.read()).decode('utf-8')

2. Clone the Voice

Send the base64-encoded audio to the API to create your custom voice:
import requests

response = requests.post(
    'https://dev.voice.ai/api/v1/tts/clone-voice',
    headers={'Authorization': 'Bearer YOUR_API_KEY', 'Content-Type': 'application/json'},
    json={'base64_audio': base64_audio, 'name': 'My Cloned Voice', 'voice_visibility': 'PRIVATE'}
)

response.raise_for_status()  # Stop on an HTTP error before reading the body

data = response.json()
print(f"Voice ID: {data['voice_id']}, Status: {data['status']}")

3. Check Voice Status

Cloning is asynchronous. Poll the voice status using the voice_id from the clone response until it reaches AVAILABLE or FAILED:
import requests
import time

voice_id = data['voice_id']  # From clone response

# Poll until voice is available
while True:
    response = requests.get(
        f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
        headers={'Authorization': 'Bearer YOUR_API_KEY'}
    )
    response.raise_for_status()  # Surface HTTP errors (for example, a bad API key)
    status_data = response.json()
    status = status_data['status']
    print(f"Status: {status}")
    
    if status == 'AVAILABLE':
        print("Voice is ready!")
        break
    elif status == 'FAILED':
        print("Voice cloning failed")
        break
    
    time.sleep(2)  # Wait 2 seconds before checking again
Status values:
  • PENDING - Voice cloning has started
  • PROCESSING - Voice is being processed
  • AVAILABLE - Voice is ready to use
  • FAILED - Voice cloning failed
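
The polling loop above runs until the voice reaches AVAILABLE or FAILED, so it never exits if the status stays at PENDING or PROCESSING. A bounded variant is sketched below. The helper name `wait_for_voice`, the 120-second default timeout, and the injected `fetch_status` callable are illustrative assumptions, not part of the API.

```python
import time
from typing import Callable

# Terminal statuses taken from the status list above.
TERMINAL_STATUSES = {"AVAILABLE", "FAILED"}


def wait_for_voice(
    fetch_status: Callable[[], str],
    interval: float = 2.0,
    timeout: float = 120.0,  # Arbitrary default chosen for this sketch
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next wait would take us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"voice still {status} after {timeout} seconds")
        sleep(interval)


# Possible usage with the status endpoint from this guide (not executed here):
#
#   import requests
#   url = f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}'
#   headers = {'Authorization': 'Bearer YOUR_API_KEY'}
#   final = wait_for_voice(lambda: requests.get(url, headers=headers).json()['status'])
```

Passing the status lookup in as a callable keeps the waiting logic independent of the HTTP client, which also makes it easy to exercise without network access.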

List Voices

List all voices owned by the authenticated user:
import requests

response = requests.get(
    'https://dev.voice.ai/api/v1/tts/voices',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

voices = response.json()
for voice in voices:
    print(f"Voice ID: {voice['voice_id']}, Name: {voice['name']}, Status: {voice['status']}")
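
Because each voice in the list carries a status, it can be useful to keep only the voices that are ready for speech generation. The sketch below assumes the response body is a JSON array of objects with `voice_id`, `name`, and `status` fields, as the loop above implies. The helper name `voices_with_status` and the sample data are made up for illustration.

```python
def voices_with_status(voices, status="AVAILABLE"):
    """Return the voices whose 'status' field equals the given status."""
    return [voice for voice in voices if voice.get("status") == status]


# Made-up sample shaped like the list response assumed above.
sample = [
    {"voice_id": "v1", "name": "Narrator", "status": "AVAILABLE"},
    {"voice_id": "v2", "name": "Draft", "status": "PROCESSING"},
]

ready = voices_with_status(sample)
print([voice["voice_id"] for voice in ready])
```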

Clone Voice Request Parameters

  • base64_audio (required) - Base64-encoded audio file (max 7.5MB)
  • name (optional) - Name for the voice
  • voice_visibility (optional) - "PUBLIC" or "PRIVATE" (default: "PUBLIC"; pass "PRIVATE" explicitly to keep a voice private)
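
The parameters above can be assembled into the JSON body with a small helper that rejects invalid input before any request is sent. This is a minimal sketch; the function name `build_clone_payload` is hypothetical, and omitting an optional field simply leaves the server default in effect.

```python
# Allowed values taken from the parameter list above.
VISIBILITIES = {"PUBLIC", "PRIVATE"}


def build_clone_payload(base64_audio, name=None, voice_visibility=None):
    """Build the JSON body for the clone request, validating as we go."""
    if not base64_audio:
        raise ValueError("base64_audio is required")
    payload = {"base64_audio": base64_audio}
    if name is not None:
        payload["name"] = name
    if voice_visibility is not None:
        if voice_visibility not in VISIBILITIES:
            raise ValueError(
                f"voice_visibility must be one of {sorted(VISIBILITIES)}"
            )
        payload["voice_visibility"] = voice_visibility
    return payload


# Example: keep the voice private instead of relying on the PUBLIC default.
payload = build_clone_payload("YWJj", name="My Cloned Voice", voice_visibility="PRIVATE")
print(sorted(payload))
```

The resulting dictionary can be passed as the `json=` argument of the `requests.post` call shown in the cloning step.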

Update Voice

Update voice metadata (name and/or visibility):
import requests

response = requests.patch(
    f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
    headers={'Authorization': 'Bearer YOUR_API_KEY', 'Content-Type': 'application/json'},
    json={'name': 'Updated Voice Name', 'voice_visibility': 'PRIVATE'}
)

data = response.json()
print(f"Updated: {data['name']}, Visibility: {data['voice_visibility']}")

Delete Voice

Delete a voice. Only the voice's owner can delete it:
import requests

response = requests.delete(
    f'https://dev.voice.ai/api/v1/tts/voice/{voice_id}',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

data = response.json()
print(f"Deleted: {data['voice_id']}")

Audio Requirements

  • Formats: MP3, WAV, or PCM
  • Max size: 7.5MB
  • Quality: Clear, single-speaker audio works best
  • Duration: 10-60 seconds recommended
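
These requirements can be checked locally before uploading. The sketch below rests on several assumptions: that the 7.5MB limit applies to the raw file and that 1 MB means 1024 × 1024 bytes (the text does not say either way), and that the format can be inferred from a `.mp3`, `.wav`, or `.pcm` file extension. The helper name `encode_audio_sample` is hypothetical. Duration and audio quality are not checked.

```python
import base64
from pathlib import Path

# Assumption: 7.5MB measured on the raw bytes, with 1 MB = 1024 * 1024 bytes.
MAX_BYTES = int(7.5 * 1024 * 1024)
# Assumption: the format is identified by file extension.
ALLOWED_SUFFIXES = {".mp3", ".wav", ".pcm"}


def encode_audio_sample(filename: str, data: bytes) -> str:
    """Validate an audio sample and return it as a base64 string."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported format {suffix!r}; expected MP3, WAV, or PCM")
    if not data:
        raise ValueError("audio sample is empty")
    if len(data) > MAX_BYTES:
        raise ValueError(f"audio sample is {len(data)} bytes; limit is {MAX_BYTES}")
    return base64.b64encode(data).decode("utf-8")


# Possible usage with a file on disk (not executed here):
#
#   path = Path('voice_sample.mp3')
#   base64_audio = encode_audio_sample(path.name, path.read_bytes())
```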