Generate speech from text. If voice_id is provided, that voice is used; otherwise the default built-in voice is used. Returns the complete audio file. This is a synchronous endpoint: the request blocks until generation completes.
Bearer token authentication. Use your API key as the bearer token. Format: Authorization: Bearer <your API key>
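As a minimal sketch of the authentication scheme described above, the header can be built like this. The helper name `auth_headers` and the placeholder key are illustrative only and are not part of the API.

```python
from typing import Dict


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Authorization header, using the API key as the bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key}"}


# "YOUR_API_KEY" is a placeholder, not a real key.
headers = auth_headers("YOUR_API_KEY")
```

The resulting dictionary can be passed as the request headers of whichever HTTP client is in use.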
The text to generate speech for
Optional voice ID. If omitted, the default built-in voice is used.
Audio format. Allowed values: mp3, wav, pcm.
Temperature for generation. Range: 0 <= x <= 2.
Top-p sampling parameter. Range: 0 <= x <= 1.
TTS model to use. Supported models: voiceai-tts-v1-latest, voiceai-tts-v1-2025-12-19 (English only), voiceai-tts-multilingual-v1-latest, voiceai-tts-multilingual-v1-2025-01-14 (multilingual). If not provided, the model is selected automatically at runtime based on the language: English ('en') uses the non-multilingual models; all other languages use the multilingual models.
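The automatic model selection can be mirrored on the client side roughly as follows. The model names come from the list above; the function name `pick_model` and the choice of the `-latest` aliases as defaults are assumptions, since the description does not say which concrete model the server picks.

```python
from typing import Optional

# Assumed defaults: the "-latest" alias of each model family.
ENGLISH_MODEL = "voiceai-tts-v1-latest"
MULTILINGUAL_MODEL = "voiceai-tts-multilingual-v1-latest"


def pick_model(language: str = "en", model: Optional[str] = None) -> str:
    """Return the explicit model if given, else choose one by language."""
    if model is not None:
        return model
    # English uses a non-multilingual model; every other language
    # uses a multilingual model.
    return ENGLISH_MODEL if language == "en" else MULTILINGUAL_MODEL
```

An explicit model always wins here, so a dated model can still be pinned when reproducible output matters.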
Language code (ISO 639-1 format). Supported languages: en (English), ca (Catalan), sv (Swedish), es (Spanish), fr (French), de (German), it (Italian), pt (Portuguese), pl (Polish), ru (Russian), nl (Dutch). Defaults to 'en' if not provided.
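The documented value constraints can be checked before a request is sent. This is a sketch only: the function and argument names are hypothetical (the request field names are not stated here), and the server remains the authority on what it accepts.

```python
from typing import List, Optional

AUDIO_FORMATS = {"mp3", "wav", "pcm"}
LANGUAGES = {"en", "ca", "sv", "es", "fr", "de", "it", "pt", "pl", "ru", "nl"}


def validate_options(
    audio_format: Optional[str] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    language: str = "en",  # documented default
) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    errors = []
    if audio_format is not None and audio_format not in AUDIO_FORMATS:
        errors.append(f"audio format must be one of {sorted(AUDIO_FORMATS)}")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        errors.append("temperature must satisfy 0 <= x <= 2")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        errors.append("top-p must satisfy 0 <= x <= 1")
    if language not in LANGUAGES:
        errors.append(f"unsupported language code: {language!r}")
    return errors
```

Values left as None are treated as omitted and are not checked.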
Successful Response - Returns binary audio file
MP3 audio file
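Putting the pieces together, a call might look like the sketch below, which uses only the Python standard library. Several details are assumptions because they are not stated in this reference: the endpoint URL (`TTS_URL` is a placeholder), the POST method, the JSON request body, and the field name `text`. Only `voice_id` is named in the description.

```python
import json
import urllib.request
from typing import Optional

# Placeholder URL: substitute the real endpoint.
TTS_URL = "https://api.example.com/tts"


def build_tts_request(
    api_key: str, text: str, voice_id: Optional[str] = None
) -> urllib.request.Request:
    """Build the HTTP request without sending it."""
    payload = {"text": text}  # field name "text" is an assumption
    if voice_id is not None:
        payload["voice_id"] = voice_id  # omitted -> default built-in voice
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(
    api_key: str,
    text: str,
    path: str,
    voice_id: Optional[str] = None,
    timeout: float = 120.0,
) -> int:
    """Send the request, write the binary audio to path, return its size in bytes."""
    request = build_tts_request(api_key, text, voice_id)
    # The endpoint is synchronous, so this blocks until the full file is ready.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

For example, `synthesize_to_file("YOUR_API_KEY", "Hello, world.", "hello.mp3")` would write the returned audio to `hello.mp3`. Because the call blocks for the whole generation, a generous timeout is a reasonable choice for long texts.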