POST /v1/audio/transcriptions
Create transcription
curl --request POST \
  --url http://localhost:8080/v1/audio/transcriptions \
  --header 'Content-Type: multipart/form-data' \
  --form 'model=<string>' \
  --form file='@example-file' \
  --form 'fallbacks=<string>' \
  --form stream=true \
  --form 'language=<string>' \
  --form 'prompt=<string>' \
  --form response_format=json \
  --form 'file_format=<string>'
{
  "duration": 123,
  "language": "<string>",
  "logprobs": [
    {
      "bytes": [
        123
      ],
      "logprob": 123,
      "token": "<string>"
    }
  ],
  "segments": [
    {
      "id": 123,
      "seek": 123,
      "start": 123,
      "end": 123,
      "text": "<string>",
      "tokens": [
        123
      ],
      "temperature": 123,
      "avg_logprob": 123,
      "compression_ratio": 123,
      "no_speech_prob": 123
    }
  ],
  "task": "<string>",
  "text": "<string>",
  "usage": {
    "type": "tokens",
    "input_tokens": 123,
    "input_token_details": {
      "text_tokens": 123,
      "audio_tokens": 123
    },
    "output_tokens": 123,
    "total_tokens": 123,
    "seconds": 123
  },
  "words": [
    {
      "word": "<string>",
      "start": 123,
      "end": 123
    }
  ],
  "extra_fields": {
    "request_type": "<string>",
    "provider": "openai",
    "model_requested": "<string>",
    "model_deployment": "<string>",
    "latency": 123,
    "chunk_index": 123,
    "raw_request": {},
    "raw_response": {},
    "cache_debug": {
      "cache_hit": true,
      "cache_id": "<string>",
      "hit_type": "<string>",
      "provider_used": "<string>",
      "model_used": "<string>",
      "input_tokens": 123,
      "threshold": 123,
      "similarity": 123
    }
  }
}

Body

multipart/form-data
model (string, required): Model in provider/model format
file (file, required): Audio file to transcribe
fallbacks (string[])
stream (boolean)
language (string)
prompt (string)
response_format (enum<string>): Available options: json, text, srt, verbose_json, vtt
file_format (string)
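
The body parameters above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library, not an official client: the helper names (`build_multipart`, `build_request`, `send`), the model ID `openai/whisper-1`, and the file name are illustrative assumptions, and the URL is simply the one from the curl sample. It validates `model` and `response_format` against the constraints listed above before encoding the form.

```python
import json
import uuid
import urllib.request

# Values listed under response_format in the Body section.
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one 'file' part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(model, file_name, file_bytes, response_format="json",
                  language=None, prompt=None,
                  url="http://localhost:8080/v1/audio/transcriptions"):
    """Build (but do not send) a POST request for the transcription endpoint."""
    if "/" not in model:
        raise ValueError("model must be in provider/model format")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    fields = {"model": model, "response_format": response_format}
    if language:
        fields["language"] = language
    if prompt:
        fields["prompt"] = prompt
    body, content_type = build_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def send(request):
    """Send the request and parse a JSON response (assumes response_format=json)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)

# Hypothetical usage against a locally running server:
# audio = open("sample.wav", "rb").read()
# result = send(build_request("openai/whisper-1", "sample.wav", audio))
# print(result["text"])
```

Building the request separately from sending it keeps the validation and encoding steps easy to check without a running server.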

Response

Successful response

duration (number)
language (string)
logprobs (object[])
segments (object[])
task (string)
text (string)
usage (object)
words (object[])
extra_fields (object): Additional fields included in responses
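
As an illustration of consuming the response fields above, the sketch below turns the `segments` array into SRT-style subtitle text. It assumes that each segment's `start` and `end` are expressed in seconds and that `segments` is populated (the schema only types them as numbers and an object array, so treat both as assumptions); the function names and sample data are hypothetical.

```python
def format_timestamp(seconds):
    """Render a time in seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(response):
    """Build SRT text from a parsed response; returns '' if there are no segments."""
    lines = []
    for index, seg in enumerate(response.get("segments") or [], start=1):
        lines.append(str(index))
        lines.append(
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}"
        )
        lines.append(seg["text"].strip())
        lines.append("")  # blank line separates cues
    return "\n".join(lines)


# Hypothetical parsed response containing only the fields used here.
sample = {
    "text": "Hello world",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": " Hello"},
        {"id": 1, "start": 1.5, "end": 2.25, "text": " world"},
    ],
}
print(segments_to_srt(sample))
```

If the server can return `srt` directly via `response_format`, that is likely simpler; a conversion like this is mainly useful when the JSON fields are needed for other purposes as well.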