curl --request POST \
--url http://localhost:8080/openai/v1/chat/completions \
--header 'Content-Type: application/json' \
--data '
{
"model": "gpt-4",
"messages": [
{
"role": "system",
"name": "<string>",
"content": "<string>",
"tool_call_id": "<string>",
"refusal": "<string>",
"reasoning": "<string>",
"annotations": [
{
"type": "<string>",
"url_citation": {
"start_index": 123,
"end_index": 123,
"title": "<string>",
"url": "<string>",
"sources": {},
"type": "<string>"
}
}
],
"tool_calls": [
{
"function": {
"name": "<string>",
"arguments": "<string>"
},
"index": 123,
"type": "<string>",
"id": "<string>"
}
]
}
],
"stream": true,
"max_tokens": 123,
"max_completion_tokens": 123,
"temperature": 1,
"top_p": 123,
"frequency_penalty": 0,
"presence_penalty": 0,
"logit_bias": {},
"logprobs": true,
"top_logprobs": 123,
"n": 123,
"stop": "<string>",
"seed": 123,
"user": "<string>",
"tools": [
{
"type": "function",
"custom": {},
"cache_control": {
"type": "ephemeral",
"ttl": "<string>"
}
}
],
"tool_choice": "none",
"parallel_tool_calls": true,
"response_format": {},
"reasoning_effort": "none",
"service_tier": "<string>",
"stream_options": {
"include_obfuscation": true,
"include_usage": true
},
"fallbacks": [
"<string>"
]
}
'
{
"id": "<string>",
"choices": [
{
"index": 123,
"finish_reason": "<string>",
"log_probs": {
"content": [
{
"bytes": [
123
],
"logprob": 123,
"token": "<string>",
"top_logprobs": [
{
"bytes": [
123
],
"logprob": 123,
"token": "<string>"
}
]
}
],
"refusal": [
{
"bytes": [
123
],
"logprob": 123,
"token": "<string>"
}
],
"text_offset": [
123
],
"token_logprobs": [
123
],
"tokens": [
"<string>"
],
"top_logprobs": [
{}
]
},
"text": "<string>",
"message": {
"role": "assistant",
"name": "<string>",
"content": "<string>",
"tool_call_id": "<string>",
"refusal": "<string>",
"audio": {
"id": "<string>",
"data": "<string>",
"expires_at": 123,
"transcript": "<string>"
},
"reasoning": "<string>",
"reasoning_details": [
{
"id": "<string>",
"index": 123,
"type": "reasoning.summary",
"summary": "<string>",
"text": "<string>",
"signature": "<string>",
"data": "<string>"
}
],
"annotations": [
{
"type": "<string>",
"url_citation": {
"start_index": 123,
"end_index": 123,
"title": "<string>",
"url": "<string>",
"sources": {},
"type": "<string>"
}
}
],
"tool_calls": [
{
"function": {
"name": "<string>",
"arguments": "<string>"
},
"index": 123,
"type": "<string>",
"id": "<string>"
}
]
},
"delta": {
"role": "<string>",
"content": "<string>",
"refusal": "<string>",
"audio": {
"id": "<string>",
"data": "<string>",
"expires_at": 123,
"transcript": "<string>"
},
"reasoning": "<string>",
"reasoning_details": [
{
"id": "<string>",
"index": 123,
"type": "reasoning.summary",
"summary": "<string>",
"text": "<string>",
"signature": "<string>",
"data": "<string>"
}
],
"tool_calls": [
{
"function": {
"name": "<string>",
"arguments": "<string>"
},
"index": 123,
"type": "<string>",
"id": "<string>"
}
]
}
}
],
"created": 123,
"model": "<string>",
"object": "<string>",
"service_tier": "<string>",
"system_fingerprint": "<string>",
"usage": {
"prompt_tokens": 123,
"prompt_tokens_details": {
"text_tokens": 123,
"audio_tokens": 123,
"image_tokens": 123,
"cached_read_tokens": 123,
"cached_write_tokens": 123
},
"completion_tokens": 123,
"completion_tokens_details": {
"text_tokens": 123,
"accepted_prediction_tokens": 123,
"audio_tokens": 123,
"citation_tokens": 123,
"num_search_queries": 123,
"reasoning_tokens": 123,
"image_tokens": 123,
"rejected_prediction_tokens": 123
},
"total_tokens": 123,
"cost": {
"input_tokens_cost": 123,
"output_tokens_cost": 123,
"reasoning_tokens_cost": 123,
"citation_tokens_cost": 123,
"search_queries_cost": 123,
"request_cost": 123,
"total_cost": 123
}
},
"extra_fields": {
"request_type": "<string>",
"provider": "openai",
"model_requested": "<string>",
"model_deployment": "<string>",
"latency": 123,
"chunk_index": 123,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "<string>",
"hit_type": "<string>",
"requested_provider": "<string>",
"requested_model": "<string>",
"provider_used": "<string>",
"model_used": "<string>",
"input_tokens": 123,
"threshold": 123,
"similarity": 123
}
},
"search_results": [
{
"title": "<string>",
"url": "<string>",
"date": "<string>",
"last_updated": "<string>",
"snippet": "<string>",
"source": "<string>"
}
],
"videos": [
{
"url": "<string>",
"thumbnail_url": "<string>",
"thumbnail_width": 123,
"thumbnail_height": 123,
"duration": 123
}
],
"citations": [
"<string>"
]
}
Creates a chat completion using OpenAI-compatible format. Supports streaming via SSE.
Async inference: Send x-bf-async: true to submit the request as a background job and receive a job ID immediately. Poll with x-bf-async-id: <job-id> to retrieve the result. When the job is still processing, the response will have an empty choices array. When completed, choices will contain the full result. See Async Inference for details.
Note: This endpoint also works without the /v1 prefix (e.g., /openai/chat/completions).
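Since streaming responses use OpenAI-compatible SSE framing, each event arrives as a `data: <json>` line and the stream ends with `data: [DONE]`. A minimal parser for that framing might look like this (a sketch based on the delta chunk shape in the response schema below; the exact wire format should be confirmed against your gateway version):

```python
import json

def parse_sse_stream(lines):
    """Yield parsed JSON chunks from OpenAI-compatible SSE lines.

    Assumes each event is a single 'data: <json>' line and that the
    stream terminates with 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

def collect_text(lines):
    """Accumulate assistant text from the 'delta.content' of each chunk."""
    parts = []
    for chunk in parse_sse_stream(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)
```

Feed it the lines of the HTTP response body as they arrive; anything after `[DONE]` is ignored.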
x-bf-async: Set to true to submit this request as an async job. Returns immediately with a job ID. Not compatible with streaming.
Example value: true
x-bf-async-id: Poll for results of a previously submitted async job by providing the job ID returned from the initial async request.
Override the default result TTL in seconds. Results expire after this duration from completion time.
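Putting the async headers together, the submit-then-poll flow might be sketched as below. This is illustrative only: it assumes the gateway runs on localhost:8080 as in the curl example, that polling hits the same endpoint, and that the job ID comes back in the response's `id` field; only the header names and the empty-`choices`-while-processing behavior are stated by the docs above.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080/openai/v1/chat/completions"

def async_headers(job_id=None):
    """Headers for Bifrost async inference: x-bf-async submits a
    background job; x-bf-async-id polls an existing one."""
    if job_id is None:
        return {"Content-Type": "application/json", "x-bf-async": "true"}
    return {"Content-Type": "application/json", "x-bf-async-id": job_id}

def post_json(payload, headers):
    req = urllib.request.Request(
        BASE_URL, data=json.dumps(payload).encode(),
        headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def run_async(payload, poll_interval=2.0):
    """Submit the request as a job, then poll until 'choices' is non-empty."""
    job = post_json(payload, async_headers())
    job_id = job["id"]  # assumption: the job ID is returned in "id"
    while True:
        result = post_json(payload, async_headers(job_id))
        if result.get("choices"):  # empty array means still processing
            return result
        time.sleep(poll_interval)
```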
model: Model identifier (e.g., gpt-4, gpt-3.5-turbo). Example: "gpt-4"
messages: List of messages in the conversation
stream: Whether to stream the response
max_tokens: Maximum tokens to generate (legacy; use max_completion_tokens instead)
max_completion_tokens: Maximum tokens to generate
temperature: 0 <= x <= 2
frequency_penalty: -2 <= x <= 2
presence_penalty: -2 <= x <= 2
tool_choice: one of none, auto, required
response_format: Format for the response
reasoning_effort: OpenAI reasoning effort level; one of none, minimal, low, medium, high, xhigh
fallbacks: Fallback models
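Most of the request fields above are optional. A minimal body exercising the common ones, including gateway-level fallbacks, might look like this (the fallback model name is a placeholder chosen for illustration):

```python
import json

# Minimal OpenAI-compatible request body for the Bifrost gateway.
payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize SSE in one sentence."},
    ],
    "max_completion_tokens": 256,
    "temperature": 1,                # allowed range: 0 <= x <= 2
    "stream": False,
    "fallbacks": ["gpt-3.5-turbo"],  # tried in order if gpt-4 fails
}

body = json.dumps(payload, indent=2)  # send as the POST body
```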
Successful response.
usage: Token usage information
extra_fields: Additional fields included in responses