Request
The generate() function can be used to generate speech. The following parameters are available:
Parameters
- text (required) - The text you want to convert to speech
- voice (required) - Voice ID to use for generation
- language (optional) - ISO 639 language code to use for generation
- speed (optional) - Voice speed from 0.7 to 1.2; defaults to 1.0
- format (optional) - Output audio format; defaults to wav
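As a rough illustration of the parameter rules above, the sketch below checks the required fields and the speed range before a request is made. The helper name build_generate_params is hypothetical and not part of the SDK, and treating the 0.7 to 1.2 range as inclusive is an assumption.

```python
from typing import Any, Dict, Optional

# Limits and defaults taken from the parameter list above.
MIN_SPEED = 0.7
MAX_SPEED = 1.2


def build_generate_params(
    text: str,
    voice: str,
    language: Optional[str] = None,
    speed: float = 1.0,
    format: str = "wav",
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return them as a dict.

    This is not the SDK's own code; it only mirrors the documented
    required/optional split and defaults.
    """
    if not text:
        raise ValueError("text is required")
    if not voice:
        raise ValueError("voice is required")
    # Assumption: both ends of the documented range are allowed.
    if not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")

    params: Dict[str, Any] = {
        "text": text,
        "voice": voice,
        "speed": speed,
        "format": format,
    }
    # Only include the optional language code when the caller set one.
    if language is not None:
        params["language"] = language
    return params
```

The resulting dict could then be unpacked into whatever call your client exposes; how generate() itself is invoked depends on the SDK and is not shown here.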
Audio Formats
The format parameter supports the following audio formats:
- wav (default) - Standard wave format
- mp3 - MPEG Layer III audio
- ogg_opus - OGG container with Opus codec
- webm_opus - WebM container with Opus codec
- flac - Free Lossless Audio Codec
- pcm_s16le - Raw PCM 16-bit little-endian
- mulaw - μ-law encoded audio
- alaw - A-law encoded audio
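When saving output, it can help to derive the file extension from the chosen format. The sketch below maps each format value listed above to an extension. The mapping and the output_filename helper are illustrative assumptions, not part of the SDK; in particular, the extensions chosen for the raw and companded formats (pcm_s16le, mulaw, alaw) are only common conventions.

```python
# Format values come from the list above; the extensions are assumptions
# based on common conventions, not something the API prescribes.
FORMAT_EXTENSIONS = {
    "wav": ".wav",
    "mp3": ".mp3",
    "ogg_opus": ".ogg",
    "webm_opus": ".webm",
    "flac": ".flac",
    "pcm_s16le": ".pcm",
    "mulaw": ".ulaw",
    "alaw": ".alaw",
}


def output_filename(stem: str, format: str = "wav") -> str:
    """Hypothetical helper: build a filename for the given audio format.

    Raises ValueError for a format that is not in the documented list.
    """
    try:
        extension = FORMAT_EXTENSIONS[format]
    except KeyError:
        supported = ", ".join(sorted(FORMAT_EXTENSIONS))
        raise ValueError(
            f"unsupported format {format!r}; expected one of: {supported}"
        ) from None
    return stem + extension
```

For example, output_filename("greeting", "ogg_opus") yields "greeting.ogg", and an unknown format fails early instead of producing a mislabeled file.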
Need guidance on tone, punctuation, or narration style? Run through the TTS Prompting Guide for playbooks, checklists, and review workflows.
Response
The response for speech generation is streamed audio bytes.
- The Content-Type is audio/wav.
- The content is streamed back to the caller.
- The file type of the generated audio is wav.
If you use the streaming interface (Python SDK), you can start acting on the byte chunks as they arrive, e.g. stream them to a file.
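One way to stream the byte chunks to a file is sketched below. It assumes only that the streaming interface yields an iterable of bytes objects; the save_audio_stream helper is hypothetical, and the way you obtain the chunk iterator from the SDK is not shown because it depends on the client you use.

```python
from typing import Iterable


def save_audio_stream(chunks: Iterable[bytes], path: str) -> int:
    """Hypothetical helper: write streamed audio chunks to a file.

    Each chunk is written as soon as it is received, so the whole
    response never has to be held in memory. Returns the number of
    bytes written.
    """
    total = 0
    with open(path, "wb") as audio_file:
        for chunk in chunks:
            # Skip empty chunks, which some streaming clients emit.
            if not chunk:
                continue
            audio_file.write(chunk)
            total += len(chunk)
    return total
```

Passing a generator rather than a list keeps the write incremental: the file grows as chunks arrive, which is the main benefit of the streaming interface.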
Choosing a Voice
Browse the roster on the Voices page and match voices to your use case before generating audio.
Examples
For error codes and retry guidance, see Troubleshooting.