POST /v1/transcriptions
Speech to Text
curl --request POST \
  --url https://api.spitch.app/v1/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'content=@example-file'
{
  "request_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "text": "<string>",
  "segments": [
    {
      "text": "<string>",
      "start": 123,
      "end": 123
    }
  ],
  "timestamps": [
    {
      "text": "<string>",
      "start": 123,
      "end": 123
    }
  ],
  "detected_language": "<string>"
}

Authorizations

Authorization
string
header
required

Authenticate with Authorization: Bearer <token>. The service accepts JWTs, API keys, and guest tokens through this bearer token header.
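As a minimal sketch of the header described above, the helper below builds the bearer header in Python. The environment variable name `SPITCH_API_KEY` is an assumption made for illustration and is not defined by this page.

```python
import os


def auth_headers(token=None):
    """Return the Authorization header for a bearer token.

    Falls back to the SPITCH_API_KEY environment variable, which is a
    hypothetical name chosen for this sketch.
    """
    token = token or os.environ.get("SPITCH_API_KEY")
    if not token:
        raise ValueError("no bearer token supplied")
    return {"Authorization": f"Bearer {token}"}
```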

Body

multipart/form-data

Multipart form data. Provide exactly one audio source: an uploaded file or a public URL.

content
file
required

Audio file to transcribe.

language
string

Optional ISO 639 language code for the spoken audio. Omit it to let the service use automatic language handling.

special_words
string

Optional comma-separated words to bias recognition toward domain-specific names or terms.

timestamp
enum<string>
default:none

Timestamp granularity for returned segments. Use none, sentence, or word.

Available options: sentence, word, none
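The body fields above could be assembled and sent roughly as follows. This is a sketch using only the Python standard library, not an official client: the function names `build_multipart` and `transcribe` are hypothetical, and passing `special_words` as a Python list that gets joined with commas is a convenience assumed here.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.spitch.app/v1/transcriptions"
TIMESTAMP_OPTIONS = {"none", "sentence", "word"}


def build_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one file part named 'content'.

    Returns (body_bytes, content_type_header_value). The filename is not
    escaped, so this sketch assumes it contains no double quotes.
    """
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="content"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, token, language=None, special_words=None, timestamp="none"):
    """POST an audio file and return the decoded JSON response."""
    if timestamp not in TIMESTAMP_OPTIONS:
        raise ValueError(f"timestamp must be one of {sorted(TIMESTAMP_OPTIONS)}")
    fields = {"timestamp": timestamp}
    if language:
        fields["language"] = language
    if special_words:
        fields["special_words"] = ",".join(special_words)
    audio = Path(path)
    body, content_type = build_multipart(fields, audio.name, audio.read_bytes())
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call might look like `transcribe("meeting.wav", token, language="en", timestamp="sentence")`, where the file name and language code are placeholders.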

Response

Successful Response

request_id
string<uuid>
required

text
string
required

segments
Segment · object[] | null

timestamps
Segment · object[] | null

detected_language
string | null
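Because `segments`, `timestamps`, and `detected_language` may be null, a client has to handle their absence. The sketch below maps a decoded response onto typed Python objects; the class and function names are hypothetical, and since this page does not state the unit of `start` and `end`, the values are stored as returned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    text: str
    start: float  # unit not stated in this reference; kept as returned
    end: float


@dataclass
class Transcription:
    request_id: str
    text: str
    segments: Optional[List[Segment]]
    timestamps: Optional[List[Segment]]
    detected_language: Optional[str]


def _to_segments(items):
    """Convert a list of segment dicts, passing null (None) through."""
    if items is None:
        return None
    return [Segment(item["text"], item["start"], item["end"]) for item in items]


def parse_transcription(payload: dict) -> Transcription:
    """Build a Transcription from the decoded JSON response body."""
    return Transcription(
        request_id=payload["request_id"],  # required
        text=payload["text"],  # required
        segments=_to_segments(payload.get("segments")),
        timestamps=_to_segments(payload.get("timestamps")),
        detected_language=payload.get("detected_language"),
    )
```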