POST /api/v1/tts/speech
Generate Speech
curl --request POST \
  --url https://dev.voice.ai/api/v1/tts/speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "<string>",
  "voice_id": "<string>",
  "audio_format": "mp3",
  "temperature": 1,
  "top_p": 0.8,
  "model": "voiceai-tts-v1-latest",
  "language": "en"
}
'
"<string>"

Authorizations

Authorization
string
header
required

Bearer token authentication. Use your API key as the bearer token. Format: Authorization: Bearer <token>

Body

application/json
text
string
required

The text to generate speech for

voice_id
string

Voice ID to use. Omit to use the default built-in voice.

audio_format
enum<string>
default:mp3

Audio output format (32kHz sample rate)

Available options:
mp3,
wav,
pcm,
alaw_8000,
mp3_22050_32,
mp3_24000_48,
mp3_44100_32,
mp3_44100_64,
mp3_44100_96,
mp3_44100_128,
mp3_44100_192,
opus_48000_32,
opus_48000_64,
opus_48000_96,
opus_48000_128,
opus_48000_192,
pcm_8000,
pcm_16000,
pcm_22050,
pcm_24000,
pcm_32000,
pcm_44100,
pcm_48000,
ulaw_8000,
wav_16000,
wav_22050,
wav_24000
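
The option names above appear to follow a codec_samplerate_bitrate pattern. The sketch below splits a name into those parts on the client side. `parse_audio_format` and `AudioFormat` are hypothetical helpers, not part of the API. Reading the third field as a bitrate in kbps is an assumption, as is applying the documented 32kHz rate to the unsuffixed names (mp3, wav, pcm).

```python
from typing import NamedTuple, Optional

# Assumption: unsuffixed formats use the 32kHz rate stated in the reference.
DEFAULT_SAMPLE_RATE = 32000


class AudioFormat(NamedTuple):
    codec: str
    sample_rate: int
    bitrate_kbps: Optional[int]  # assumed meaning of the third field


def parse_audio_format(name: str) -> AudioFormat:
    """Split an audio_format value such as 'mp3_44100_128' into its parts."""
    parts = name.split("_")
    codec = parts[0]
    # Second field, when present, is read as the sample rate in Hz.
    sample_rate = int(parts[1]) if len(parts) > 1 else DEFAULT_SAMPLE_RATE
    # Third field, when present, is read as a bitrate (assumed kbps).
    bitrate = int(parts[2]) if len(parts) > 2 else None
    return AudioFormat(codec, sample_rate, bitrate)
```

A helper like this can be useful when choosing a playback sample rate for raw pcm output, which carries no header describing itself.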
temperature
number
default:1

Sampling temperature (0.0-2.0)

Required range: 0 <= x <= 2
top_p
number
default:0.8

Nucleus sampling parameter (0.0-1.0)

Required range: 0 <= x <= 1
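
Checking the two sampling ranges before sending a request avoids a round trip for an out-of-range value. This is a minimal client-side sketch. `validate_sampling` is a hypothetical helper, and how the server itself reports an out-of-range value is not described in this reference.

```python
def validate_sampling(temperature: float = 1.0, top_p: float = 0.8) -> dict[str, float]:
    """Return the sampling fields for the request body, or raise ValueError."""
    # Documented range for temperature: 0 <= x <= 2
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be within [0, 2], got {temperature}")
    # Documented range for top_p: 0 <= x <= 1
    if not 0.0 <= top_p <= 1.0:
        raise ValueError(f"top_p must be within [0, 1], got {top_p}")
    return {"temperature": temperature, "top_p": top_p}
```

Calling it with no arguments returns the documented defaults (temperature 1, top_p 0.8).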
model
enum<string>

TTS model to use. If not provided, automatically selected based on language. English uses non-multilingual models; other languages use multilingual models.

Available options:
voiceai-tts-v1-latest,
voiceai-tts-v1-2026-02-10,
voiceai-tts-multilingual-v1-latest,
voiceai-tts-multilingual-v1-2026-02-10
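
The selection rule described above can be mirrored on the client when you prefer to send the model explicitly. This is a sketch only. `default_model` is a hypothetical helper, and picking the `-latest` variants is an assumption, since the reference does not say which version the server selects when model is omitted.

```python
def default_model(language: str = "en") -> str:
    """Pick a model name following the documented English / non-English split."""
    if language == "en":
        # English uses the non-multilingual model family.
        return "voiceai-tts-v1-latest"
    # Every other language uses the multilingual model family.
    return "voiceai-tts-multilingual-v1-latest"
```

Pinning a dated model name instead of a `-latest` one is a reasonable choice when output needs to stay stable across releases.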
language
enum<string>
default:en

Language code (ISO 639-1 format)

Available options:
en,
ca,
sv,
es,
fr,
de,
it,
pt,
pl,
ru,
nl

Response

Successful Response - Returns binary audio file (32kHz sample rate)

MP3 audio file (32kHz sample rate, compressed)
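
Because the response body is binary audio rather than JSON, a client should write the raw bytes to a file instead of decoding them as text. The following is a minimal sketch using only the Python standard library. `build_request` and `synthesize_to_file` are hypothetical names, the URL is taken from the curl example, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://dev.voice.ai/api/v1/tts/speech"  # from the curl example


def build_request(api_key: str, text: str, **options) -> urllib.request.Request:
    """Build the POST request; options are optional body fields such as language."""
    payload = {"text": text, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str, **options) -> int:
    """Send the request, write the binary audio to path, return the byte count."""
    request = build_request(api_key, text, **options)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    # Open in binary mode: the body is audio data, not text.
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

For example, `synthesize_to_file(api_key, "Hello", "hello.mp3", audio_format="mp3", language="en")` would save the default MP3 output, assuming `api_key` holds a valid key.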