Base URL
All API requests should be prefixed with the following base URL:
base-url
Authentication
All API requests must be authenticated using an API key. Include your API key in the request headers as:
auth-header
Endpoints Overview
This page documents the following endpoints:
POST /voice/create: Create a voice from a base64-encoded audio sample.
GET /voice/get-voices: List your voices.
POST /voice/update: Update an existing voice.
POST /voice/delete: Delete a voice.
POST /audio/speech: Convert text to speech (supports streaming and non-streaming).
Create a Voice
Method: POST
Path: /voice/create
Create a voice from a base64-encoded audio sample.
Request Body
name (string, required): Name of the voice.
type (string, required): Visibility of the voice. Example: PUBLIC or PRIVATE.
audio (string, required): Base64-encoded audio data.
description (string, optional): Description of the voice.
voiceTags (string[], optional): List of tag IDs.
Example Request Body
create-voice.json
Code Examples
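The following is a minimal Python sketch of building a create-voice request from the fields listed above. The base URL and the API key header name are not given on this page, so `BASE_URL` and `API_KEY_HEADER` are hypothetical placeholders, and `build_create_voice_request` is an illustrative helper rather than part of any official client. The function only builds the request; it does not send it.

```python
import base64
import json
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_create_voice_request(api_key, name, voice_type, audio_bytes,
                               description=None, voice_tags=None):
    """Build (but do not send) a POST /voice/create request."""
    body = {
        "name": name,
        "type": voice_type,  # "PUBLIC" or "PRIVATE"
        # The audio sample must be sent as a base64-encoded string.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }
    # Optional fields are only included when the caller supplies them.
    if description is not None:
        body["description"] = description
    if voice_tags is not None:
        body["voiceTags"] = list(voice_tags)
    return urllib.request.Request(
        BASE_URL + "/voice/create",
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The response format is not described on this page, so handling it is left out of the sketch.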
List Voices
Method: GET
Path: /voice/get-voices
List your voices.
Query Parameters
filter (string, optional): Filter voices. Example: my.
Code Examples
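As a rough Python sketch, a list-voices request with the optional `filter` query parameter could be built as follows. `BASE_URL`, `API_KEY_HEADER`, and the helper name are hypothetical placeholders, since the real base URL and header name are not shown on this page.

```python
import urllib.parse
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_list_voices_request(api_key, filter_value=None):
    """Build (but do not send) a GET /voice/get-voices request."""
    url = BASE_URL + "/voice/get-voices"
    if filter_value is not None:
        # urlencode escapes the value so it is safe inside a query string.
        url += "?" + urllib.parse.urlencode({"filter": filter_value})
    return urllib.request.Request(
        url, headers={API_KEY_HEADER: api_key}, method="GET"
    )
```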
Update a Voice
Method: POST
Path: /voice/update
Update an existing voice.
Request Body
id (string, required): Voice ID.
name (string, optional): New name.
description (string, optional): New description.
type (string, optional): New type, e.g. PRIVATE.
Code Examples
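One way to build an update request in Python is sketched below. It sends only the fields the caller wants to change, alongside the required `id`. `BASE_URL`, `API_KEY_HEADER`, and the helper name are hypothetical placeholders. Whether the API leaves omitted fields unchanged is an assumption, not something this page states.

```python
import json
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_update_voice_request(api_key, voice_id, name=None,
                               description=None, voice_type=None):
    """Build (but do not send) a POST /voice/update request."""
    body = {"id": voice_id}
    # Map the optional arguments onto the documented field names.
    optional = {"name": name, "description": description, "type": voice_type}
    body.update({key: value for key, value in optional.items()
                 if value is not None})
    return urllib.request.Request(
        BASE_URL + "/voice/update",
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```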
Delete a Voice
Method: POST
Path: /voice/delete
Delete a voice.
Request Body
id (string, required): Voice ID to delete.
Code Examples
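A delete request carries only the voice ID in its JSON body. The Python sketch below builds one under the same assumptions as before: `BASE_URL`, `API_KEY_HEADER`, and the helper name are hypothetical placeholders.

```python
import json
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_delete_voice_request(api_key, voice_id):
    """Build (but do not send) a POST /voice/delete request."""
    return urllib.request.Request(
        BASE_URL + "/voice/delete",
        data=json.dumps({"id": voice_id}).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",  # the endpoint uses POST, not the HTTP DELETE verb
    )
```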
Text to Speech – Non-Streaming
Method: POST
Path: /audio/speech
Generate audio by sending text and a voice ID. Non-streaming returns the full audio file in the response.
Common Request Fields
text (string, required): Text to convert to speech.
voice (string, required): Voice ID.
audio_format (string, optional): e.g. mp3. Defaults to mp3.
streaming (boolean, optional): false for non-streaming.
latency_setting (number, optional): 0 or 1.
Non-Streaming Code Examples
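A non-streaming call could look roughly like the Python sketch below. `BASE_URL`, `API_KEY_HEADER`, and both helper names are hypothetical placeholders. The `save_speech` helper also assumes the response body is the raw audio file, which matches the description above but is not spelled out in detail on this page.

```python
import json
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_speech_request(api_key, text, voice, audio_format="mp3",
                         latency_setting=0):
    """Build (but do not send) a non-streaming POST /audio/speech request."""
    body = {
        "text": text,
        "voice": voice,
        "audio_format": audio_format,
        "streaming": False,  # non-streaming: the full file comes back at once
        "latency_setting": latency_setting,
    }
    return urllib.request.Request(
        BASE_URL + "/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


def save_speech(request, path):
    """Send the request and write the response body to `path`.

    Assumes the response body is the raw audio file.
    """
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

For example, `save_speech(build_speech_request(key, "Hello", voice_id), "hello.mp3")` would write the generated audio to disk, where `key` and `voice_id` are your own values.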
Text to Speech – Streaming
Streaming returns audio bytes as they are generated, which is useful for low-latency use cases like Voice AI agents.
Streaming Code Examples
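A streaming client can consume the response in chunks instead of waiting for the whole file. The Python sketch below sets `streaming` to true in the request body and reads the response incrementally. `BASE_URL`, `API_KEY_HEADER`, the helper names, and the 4096-byte chunk size are illustrative assumptions, as is the idea that the streamed body is plain audio bytes with no extra framing.

```python
import json
import urllib.request

# Assumptions: both values below are placeholders, not documented values.
BASE_URL = "https://api.example.com"  # hypothetical base URL
API_KEY_HEADER = "X-API-Key"          # hypothetical header name


def build_streaming_speech_request(api_key, text, voice, audio_format="mp3"):
    """Build (but do not send) a streaming POST /audio/speech request."""
    body = {"text": text, "voice": voice,
            "audio_format": audio_format, "streaming": True}
    return urllib.request.Request(
        BASE_URL + "/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


def iter_audio_chunks(response, chunk_size=4096):
    """Yield successive byte chunks from any object with a read(n) method."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


def stream_speech(request, handle_chunk):
    """Send the request and pass each audio chunk to `handle_chunk`."""
    with urllib.request.urlopen(request) as response:
        for chunk in iter_audio_chunks(response):
            handle_chunk(chunk)
```

Here `handle_chunk` might write to an audio player's buffer or to a file opened in binary mode, so playback can begin before generation finishes.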
Schemas
CreateVoiceRequest
Type: object
Properties:
name (string)
type (PUBLIC | PRIVATE)
audio (string, base64)
description (string, optional)
voiceTags (string[], optional)
UpdateVoiceRequest
Type: object
Properties:
id (string)
name (string, optional)
description (string, optional)
type (PUBLIC | PRIVATE, optional)
DeleteVoiceRequest
Type: object
Properties:
id (string)
SpeechRequest
Type: object
Properties:
text (string)
voice (string)
audio_format ("mp3" | "wav", optional, default "mp3")
streaming (boolean, optional, default false)
latency_setting (0 | 1, optional, default 0)
creativity (number, optional)
diversity (number, optional)
precision (number, optional)
adherence (number, optional)
guidance (number, optional)
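On the client side, the SpeechRequest schema can be mirrored as a small Python dataclass that checks the enumerated values before anything is sent. This is an illustrative sketch, not an official client type. The class layout and the `to_payload` helper are assumptions, and no ranges are enforced for the numeric tuning fields because this page does not give any.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    """Client-side mirror of the SpeechRequest schema (illustrative only)."""
    text: str
    voice: str
    audio_format: str = "mp3"
    streaming: bool = False
    latency_setting: int = 0
    creativity: Optional[float] = None
    diversity: Optional[float] = None
    precision: Optional[float] = None
    adherence: Optional[float] = None
    guidance: Optional[float] = None

    def __post_init__(self):
        # Only the values enumerated in the schema are accepted.
        if self.audio_format not in ("mp3", "wav"):
            raise ValueError("audio_format must be 'mp3' or 'wav'")
        if self.latency_setting not in (0, 1):
            raise ValueError("latency_setting must be 0 or 1")

    def to_payload(self):
        """Return a JSON-ready dict, dropping optional fields left unset."""
        return {key: value for key, value in asdict(self).items()
                if value is not None}
```

Calling `json.dumps(SpeechRequest("Hello", "voice-id").to_payload())` would then produce a request body containing only the required fields and the defaulted ones.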