Speech Generation

Speech generation, also known as speech synthesis or text-to-speech (TTS), is an integral part of modern AI systems. This endpoint supports many languages: Spitch offers production-ready voices in English, Hausa, Igbo, Yoruba, Amharic, and Nigerian Pidgin, so you can choose accents and styles that match your audience.

Request

Use the generate() function to generate speech. It accepts the following parameters:

Parameters

text (required) - The text you want to convert to speech
voice (required) - Voice ID to use for generation
language (optional) - ISO 639 language code to use for generation
speed (optional) - Voice speed from 0.7 to 1.2; defaults to 1.0
format (optional) - Output audio format, defaults to wav
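
As a rough illustration of the constraints listed above, the sketch below checks the parameters locally before a request is made. The helper name build_request and the idea of client-side validation are assumptions for illustration, not part of the Spitch SDK; only the parameter names, the defaults, and the 0.7 to 1.2 speed range come from this page.

```python
from typing import Optional

# Speed bounds and defaults as documented above.
MIN_SPEED = 0.7
MAX_SPEED = 1.2


def build_request(
    text: str,
    voice: str,
    language: Optional[str] = None,
    speed: float = 1.0,
    format: str = "wav",
) -> dict:
    """Hypothetical helper: validate parameters and return them as a dict."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if not voice:
        raise ValueError("voice is required")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")

    params = {"text": text, "voice": voice, "speed": speed, "format": format}
    # language is optional, so include it only when the caller supplies it.
    if language is not None:
        params["language"] = language
    return params
```

The resulting dict could then be unpacked into the call, e.g. client.speech.generate(**params), assuming the keyword names match those listed above.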

Audio Formats

The format parameter supports the following audio formats:

wav (default) - Standard wave format
mp3 - MPEG Layer III audio
ogg_opus - OGG container with Opus codec
webm_opus - WebM container with Opus codec
flac - Free Lossless Audio Codec
pcm_s16le - Raw PCM 16-bit little-endian
mulaw - μ-law encoded audio
alaw - A-law encoded audio
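
When saving output, it can help to derive the file extension from the format value. The mapping below is a sketch: the format names come from the list above, but the extensions are common conventions chosen for illustration (in particular .pcm, .ulaw, and .alaw), and output_filename is a hypothetical helper rather than an SDK function.

```python
# Format values from the list above, mapped to conventional file extensions.
# The extensions are assumptions; the API does not prescribe them.
FORMAT_EXTENSIONS = {
    "wav": "wav",
    "mp3": "mp3",
    "ogg_opus": "ogg",
    "webm_opus": "webm",
    "flac": "flac",
    "pcm_s16le": "pcm",
    "mulaw": "ulaw",
    "alaw": "alaw",
}


def output_filename(stem: str, format: str = "wav") -> str:
    """Hypothetical helper: build a filename for the requested audio format."""
    try:
        extension = FORMAT_EXTENSIONS[format]
    except KeyError:
        raise ValueError(f"unsupported format: {format!r}") from None
    return f"{stem}.{extension}"
```

Note that pcm_s16le, mulaw, and alaw are raw encodings without a container header, so a player generally has to be told how to interpret them.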

The examples below show a complete request.

Need guidance on tone, punctuation, or narration style? Run through the TTS Prompting Guide for playbooks, checklists, and review workflows.

Response

The response for speech generation is streamed audio bytes.

With the default format, the Content-Type is audio/wav.
The content is streamed back to the caller.
The generated audio is wav unless you request another format with the format parameter. If you use the streaming interface (Python SDK), you can act on the byte chunks as they arrive, e.g. write them to a file.
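
Writing streamed chunks to a file might look like the sketch below. It deliberately takes any iterable of bytes, because the exact streaming call in the Python SDK is not shown on this page; save_stream is a hypothetical helper, and how you obtain the chunks from the SDK is left as an assumption.

```python
from typing import Iterable


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Hypothetical helper: write byte chunks to path as they arrive.

    Returns the total number of bytes written.
    """
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            # Skip empty chunks, which some streams emit as keep-alives.
            if chunk:
                f.write(chunk)
                total += len(chunk)
    return total
```

Because the chunks are written one at a time, the full audio never has to be held in memory.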

Choosing a Voice

Browse the roster on the Voices page and match voices to your use case before generating audio.

Examples

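import os
from spitch import Spitch

# The client picks up the API key from the SPITCH_API_KEY environment variable.
# Prefer setting it in your shell; it is set inline here only for illustration.
os.environ["SPITCH_API_KEY"] = "YOUR_API_KEY"
client = Spitch()

# Request Yoruba speech as MP3.
response = client.speech.generate(
    text="Bawo ni ololufe mi?",
    language="yo",
    voice="sade",
    speed=1.0,
    format="mp3",
)

# Open the file only after the request succeeds, so a failed call
# does not leave an empty file behind.
with open("new.mp3", "wb") as f:
    f.write(response.read())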

For error codes and retry guidance, see Troubleshooting.
