Overview
OpenAI is the baseline schema for Bifrost. When using OpenAI directly, parameters are passed through with minimal conversion: mostly validation and filtering of OpenAI-specific features.
Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/chat/completions |
| Responses API | ✅ | ✅ | /v1/responses |
| Text Completions | ✅ | ✅ | /v1/completions |
| Embeddings | ✅ | - | /v1/embeddings |
| Speech (TTS) | ✅ | ✅ | /v1/audio/speech |
| Transcriptions (STT) | ✅ | ✅ | /v1/audio/transcriptions |
| Files | ✅ | - | /v1/files |
| Batch | ✅ | - | /v1/batches |
| List Models | ✅ | - | /v1/models |
1. Chat Completions
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Model identifier |
| messages | array | ✅ | ChatMessage array with roles (docs) |
| temperature | float | ❌ | Sampling temperature (0-2) |
| top_p | float | ❌ | Nucleus sampling parameter |
| stop | string/array | ❌ | Stop sequences |
| max_completion_tokens | int | ❌ | Maximum output tokens (minimum 16) |
| frequency_penalty | float | ❌ | Frequency penalty (-2 to 2) |
| presence_penalty | float | ❌ | Presence penalty (-2 to 2) |
| logit_bias | object | ❌ | Token logit adjustments |
| logprobs | bool | ❌ | Include log probabilities |
| top_logprobs | int | ❌ | Number of log probabilities per token |
| seed | int | ❌ | Reproducibility seed |
| response_format | object | ❌ | Output format (docs) |
| tools | array | ❌ | Tool objects (docs) |
| tool_choice | string/object | ❌ | "auto", "none", "required", or specific tool |
| parallel_tool_calls | bool | ❌ | Allow multiple simultaneous tool calls |
| stream_options | object | ❌ | Streaming options (docs) |
| reasoning | object | ❌ | Reasoning parameters (Bifrost docs, OpenAI docs) |
| user | string | ❌ | Truncated to 64 chars |
| metadata | object | ❌ | Custom metadata |
| store | bool | ❌ | Filtered for non-OpenAI routing |
| service_tier | string | ❌ | Filtered for non-OpenAI routing |
| prompt_cache_key | string | ❌ | Filtered for non-OpenAI routing |
| prediction | object | ❌ | Predicted output for acceleration |
| audio | object | ❌ | Audio output config |
| modalities | array | ❌ | Response modalities (text, audio) |
- Reasoning: OpenAI supports reasoning.effort (minimal, low, medium, high) and reasoning.max_tokens; both are passed through directly. When routing to other providers, "minimal" effort is converted to "low" for compatibility. See the Bifrost reasoning docs.
- Messages: All message roles are supported: system, user, assistant, tool, developer (treated as system). Content types: text, images via URL (image_url), audio input (input_audio). Tool messages include a tool_call_id.
- Tools: Standard OpenAI tool format with strict mode support. Tool choice: "auto", "none", "required", or a specific tool by name.
- Responses: Passed through in standard OpenAI format. Finish reasons: stop, length, tool_calls, content_filter. Usage includes token counts and optionally cached/reasoning token details.
- Streaming: Server-Sent Events format with delta.content, delta.tool_calls, finish_reason, and usage (final chunk only, automatically included by Bifrost). stream_options: { include_usage: true } is set by default for all streaming calls; see the streaming sketch below.
- Cache Control: cache_control fields are stripped from messages, their content blocks, and tools before sending.
- Token Enforcement: max_completion_tokens is enforced to a minimum of 16; values below 16 are automatically raised to 16.
- Special handling: the user field is truncated to 64 characters; prompt_cache_key, store, and service_tier are filtered when routing to non-OpenAI providers.
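A minimal sketch of both call styles, assuming a Bifrost gateway listening at http://localhost:8080 (the base URL, port, and gpt-4o model choice are assumptions; adjust for your deployment). The first request exercises the max_completion_tokens floor and user truncation described above; the second shows how to consume the SSE stream.

```python
import json
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what an LLM gateway does in one sentence."},
    ],
    "temperature": 0.7,
    "max_completion_tokens": 8,  # below the floor; Bifrost raises this to 16
    "user": "x" * 100,           # longer than 64 chars; Bifrost truncates it
}

resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
choice = resp.json()["choices"][0]
print(choice["message"]["content"])
print(choice["finish_reason"])  # stop, length, tool_calls, or content_filter

# Streaming variant: Bifrost sets stream_options.include_usage automatically,
# so the final chunk carries usage and may have an empty choices array.
payload["stream"] = True
with requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, stream=True, timeout=60) as stream:
    for line in stream.iter_lines():
        if line.startswith(b"data: ") and line != b"data: [DONE]":
            chunk = json.loads(line[len(b"data: "):])
            if chunk.get("choices"):
                print(chunk["choices"][0]["delta"].get("content", ""), end="")
```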
2. Responses API
The Responses API is OpenAI's newer interface for model interactions, with first-class support for structured output and stateful conversations.
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Model identifier |
| input | string/array | ✅ | Text or ContentBlock array (docs) |
| max_output_tokens | int | ✅ | Maximum output length |
| background | bool | ❌ | Run request in background mode |
| conversation | string | ❌ | Conversation ID for continuing a conversation |
| include | array | ❌ | Array of fields to include in response (e.g., "web_search_call.action.sources") |
| instructions | string | ❌ | System instructions |
| max_tool_calls | int | ❌ | Maximum number of tool calls |
| metadata | object | ❌ | Custom metadata |
| parallel_tool_calls | bool | ❌ | Allow multiple simultaneous tool calls |
| previous_response_id | string | ❌ | ID of previous response to continue from |
| prompt_cache_key | string | ❌ | Prompt caching key |
| reasoning | object | ❌ | ResponsesParametersReasoning configuration (Bifrost docs) |
| safety_identifier | string | ❌ | Safety identifier for content filtering |
| service_tier | string | ❌ | Service tier for the request |
| stream_options | object | ❌ | ResponsesStreamOptions configuration |
| store | bool | ❌ | Store the response for later retrieval |
| temperature | float | ❌ | Sampling temperature |
| text | object | ❌ | ResponsesTextConfig for output formatting |
| top_logprobs | int | ❌ | Number of log probabilities to return per token |
| top_p | float | ❌ | Nucleus sampling parameter |
| tool_choice | string/object | ❌ | ResponsesToolChoice strategy |
| tools | array | ❌ | ResponsesTool objects (docs) |
| truncation | string | ❌ | Truncation strategy (auto or off) |
| user | string | ❌ | Truncated to 64 chars |
Special Message Handling (gpt-oss vs other models):
OpenAI models handle reasoning differently depending on the model family:
- Non-gpt-oss models (GPT-4o, o1, etc.): Send reasoning as summaries. Reasoning-only messages (with no summary and only content blocks) are filtered out since these models don’t support reasoning content blocks in the request format.
- gpt-oss models: Send reasoning as content blocks. Reasoning summaries in the request are converted to content blocks since gpt-oss expects reasoning as structured blocks, not summaries.
This conversion ensures compatibility across different model architectures for the structured Responses API. See Bifrost reasoning docs for detailed reasoning handling.
Token & Parameter Enforcement:
- max_output_tokens is enforced to a minimum of 16; values below 16 are automatically raised to 16.
- The reasoning.max_tokens field is automatically removed from the JSON payload (the OpenAI Responses API doesn't accept it).
Other conversions:
- Action types zoom and region are converted to screenshot
- cache_control fields are stripped from messages and tools
- Unsupported tool types are silently filtered (only these are supported: function, file_search, computer_use_preview, web_search, mcp, code_interpreter, image_generation, local_shell, custom, web_search_preview)
Response: Includes id, status (completed, incomplete, pending, error), output array with message content, and token usage.
Streaming: Server-Sent Events with types: response.created, response.in_progress, response.output_item.added, response.content_part.added, response.output_text.delta, response.function_call_arguments.delta, response.completed, response.incomplete. stream_options: { include_usage: true } is set by default for all streaming calls.
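A sketch of a basic Responses API call, with the same assumed gateway address and model as above; the output-parsing shape follows the standard OpenAI Responses format (message items carrying output_text content blocks).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

payload = {
    "model": "openai/gpt-4o",
    "input": "Write a haiku about gateways.",
    "instructions": "Respond in English.",
    "max_output_tokens": 64,
}

resp = requests.post(f"{BASE_URL}/v1/responses", json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()
print(data["status"])  # completed, incomplete, pending, or error

# Message items carry a content list with output_text blocks.
for item in data["output"]:
    if item["type"] == "message":
        for block in item["content"]:
            if block["type"] == "output_text":
                print(block["text"])
```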
3. Text Completions (Legacy)
Text Completions is a legacy API. Use Chat Completions for new implementations.
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Model identifier |
| prompt | string/array | ✅ | Completion prompt(s) |
| max_tokens | int | ❌ | Maximum output tokens |
| temperature | float | ❌ | Sampling temperature |
| top_p | float | ❌ | Nucleus sampling |
| stop | string/array | ❌ | Stop sequences |
| user | string | ❌ | Truncated to 64 chars |
- Array prompts generate multiple completions. Finish reasons: stop or length. Streaming uses SSE format. stream_options: { include_usage: true } is set by default for streaming calls.
- The user field is truncated to 64 characters or set to nil if it exceeds the limit.
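A sketch of the legacy endpoint with an array prompt (the gateway address and the gpt-3.5-turbo-instruct model choice are assumptions).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

payload = {
    "model": "openai/gpt-3.5-turbo-instruct",
    "prompt": ["Roses are", "The capital of France is"],  # array prompt: one completion per entry
    "max_tokens": 16,
    "stop": ["\n"],
}

resp = requests.post(f"{BASE_URL}/v1/completions", json=payload, timeout=60)
resp.raise_for_status()
for choice in resp.json()["choices"]:
    print(choice["index"], choice["text"].strip(), choice["finish_reason"])  # stop or length
```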
4. Embeddings
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Model identifier |
| input | string/array | ✅ | Text(s) to embed (docs) |
| encoding_format | string | ❌ | float or base64 |
| dimensions | int | ❌ | Output embedding dimensions |
| user | string | ❌ | NOT truncated (unlike chat/text) |
- No streaming support. Returns an embedding array with usage counts.
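A sketch of a batched embedding request (gateway address and model name are assumptions; text-embedding-3-small is one model known to honor the dimensions parameter).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

payload = {
    "model": "openai/text-embedding-3-small",
    "input": ["first document", "second document"],
    "encoding_format": "float",
    "dimensions": 256,  # only honored by models that support shortened embeddings
}

resp = requests.post(f"{BASE_URL}/v1/embeddings", json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()
vectors = [item["embedding"] for item in data["data"]]
print(len(vectors), len(vectors[0]), data["usage"]["prompt_tokens"])
```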
5. Speech (Text-to-Speech)
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | tts-1 or tts-1-hd |
| input | string | ✅ | Text to convert to speech |
| voice | string | ✅ | alloy, echo, fable, onyx, nova, shimmer |
| response_format | string | ❌ | mp3, opus, aac, flac, wav, pcm |
| speed | float | ❌ | 0.25 to 4.0 (default 1.0) |
- Returns raw binary audio. Streaming is supported in SSE format (base64 chunks), but not all models support streaming. stream_options: { include_usage: true } is set by default for streaming calls.
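A non-streaming sketch that writes the binary response straight to disk (gateway address and output filename are assumptions).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

payload = {
    "model": "openai/tts-1",
    "input": "Hello from the gateway.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
}

resp = requests.post(f"{BASE_URL}/v1/audio/speech", json=payload, timeout=120)
resp.raise_for_status()
with open("speech.mp3", "wb") as f:
    f.write(resp.content)  # raw binary audio, not JSON
```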
6. Transcriptions (Speech-to-Text)
Requests use multipart/form-data, not JSON.
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| file | binary | ✅ | Audio file (multipart form-data) |
| model | string | ✅ | whisper-1 |
| language | string | ❌ | ISO-639-1 language code |
| prompt | string | ❌ | Optional prompt for context |
| temperature | float | ❌ | Sampling temperature |
| response_format | string | ❌ | json, text, srt, vtt, verbose_json |
- Supported audio formats: mp3, mp4, mpeg, mpga, m4a, wav, webm
- Response: Includes text, task, language, duration, and optionally word-level timing. Streaming is supported in SSE format. stream_options: { include_usage: true } is set by default for streaming calls.
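A multipart upload sketch (gateway address and the meeting.wav filename are assumptions; note the form fields go in data, not a JSON body).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

with open("meeting.wav", "rb") as audio:  # any supported format: mp3, wav, webm, ...
    resp = requests.post(
        f"{BASE_URL}/v1/audio/transcriptions",
        files={"file": ("meeting.wav", audio, "audio/wav")},
        data={
            "model": "openai/whisper-1",
            "language": "en",
            "response_format": "verbose_json",
        },
        timeout=300,
    )

resp.raise_for_status()
data = resp.json()
print(data["text"], data.get("duration"))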
7. Files API
Upload
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| file | binary | ✅ | File to upload (multipart form-data) |
| purpose | string | ✅ | batch, fine-tune, or assistants |
| filename | string | ❌ | Custom filename (defaults to file.jsonl) |
Response: FileObject with id, bytes, created_at, filename, purpose, status (docs)
List Files
Query Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| purpose | string | ❌ | Filter by purpose |
| limit | int | ❌ | Results per page |
| after | string | ❌ | Pagination cursor |
| order | string | ❌ | asc or desc |
Cursor-based pagination with has_more flag.
Retrieve / Delete / Content
Operations:
- GET /v1/files/{file_id} - Retrieve file metadata
- DELETE /v1/files/{file_id} - Delete file
- GET /v1/files/{file_id}/content - Download file content
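A full round trip over the Files API: upload a JSONL file in memory, fetch its metadata and content, then delete it. The gateway address and the request shape inside the JSONL rows are assumptions; the row format follows the standard OpenAI batch input layout.

```python
import io
import json
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

# One JSONL line per request, in the standard OpenAI batch input format.
rows = [{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions",
         "body": {"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}}]
jsonl = "\n".join(json.dumps(r) for r in rows).encode()

resp = requests.post(
    f"{BASE_URL}/v1/files",
    files={"file": ("requests.jsonl", io.BytesIO(jsonl), "application/jsonl")},
    data={"purpose": "batch"},
    timeout=60,
)
resp.raise_for_status()
file_id = resp.json()["id"]

meta = requests.get(f"{BASE_URL}/v1/files/{file_id}", timeout=30).json()            # metadata
content = requests.get(f"{BASE_URL}/v1/files/{file_id}/content", timeout=30).text   # raw JSONL
requests.delete(f"{BASE_URL}/v1/files/{file_id}", timeout=30)                       # delete
```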
8. Batch API
Create Batch
Request Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| input_file_id | string | Conditional | File ID OR requests array (not both) |
| requests | array | Conditional | BatchRequestItem objects (converted to JSONL) |
| endpoint | string | ✅ | Target endpoint (e.g., /v1/chat/completions) |
| completion_window | string | ❌ | 24h (default) |
| metadata | object | ❌ | Custom metadata |
Response: BifrostBatchCreateResponse with id, endpoint, input_file_id, status, created_at, request_counts (docs). Statuses: BatchStatus (validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled)
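A batch-creation sketch using a previously uploaded JSONL file (the gateway address is an assumption, and file-abc123 is a placeholder for an ID returned by the Files upload above).

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address
file_id = "file-abc123"             # placeholder: ID returned by the Files upload

payload = {
    "input_file_id": file_id,
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"job": "nightly-eval"},
}

resp = requests.post(f"{BASE_URL}/v1/batches", json=payload, timeout=60)
resp.raise_for_status()
batch = resp.json()
print(batch["id"], batch["status"])  # a new batch starts in "validating"
```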
List Batches
Query Parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| limit | int | ❌ | Results per page |
| after | string | ❌ | Pagination cursor |
Retrieve / Cancel Batch
Operations:
- GET /v1/batches/{batch_id} - Retrieve batch status
- POST /v1/batches/{batch_id}/cancel - Cancel batch
Get Results
- Batch must be completed (has output_file_id)
- Download output file via Files API
- Parse JSONL - each BatchResultItem: {id, custom_id, response: {status_code, body}}
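A polling-and-parsing sketch covering those three steps (gateway address is an assumption, batch-abc123 is a placeholder for the ID returned at creation, and the 30-second poll interval is arbitrary).

```python
import json
import time
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address
batch_id = "batch-abc123"           # placeholder: ID returned at creation

# Poll until the batch reaches a terminal status.
while True:
    batch = requests.get(f"{BASE_URL}/v1/batches/{batch_id}", timeout=30).json()
    if batch["status"] in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(30)

if batch["status"] == "completed":
    content = requests.get(f"{BASE_URL}/v1/files/{batch['output_file_id']}/content", timeout=60)
    for line in content.text.splitlines():
        item = json.loads(line)  # BatchResultItem
        print(item["custom_id"], item["response"]["status_code"])
```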
9. List Models
GET /v1/models - Lists available models with metadata. Model IDs in Bifrost responses are prefixed with openai/ (e.g., openai/gpt-4o). Results are aggregated from all configured API keys. No request body or parameters required.
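A minimal listing sketch (gateway address is an assumption):

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

resp = requests.get(f"{BASE_URL}/v1/models", timeout=30)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])  # e.g. openai/gpt-4o, prefixed with the provider name
```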
Common Error Codes
HTTP Status → Error Type mapping:

| HTTP Status | Error Type |
|---|---|
| 400 | invalid_request_error |
| 401 | authentication_error |
| 403 | permission_error |
| 404 | not_found_error |
| 429 | rate_limit_error |
| 500 | api_error |
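A sketch of generic error handling against this mapping, assuming the standard OpenAI error envelope ({"error": {"type": ..., "message": ...}}) and the same assumed gateway address as the earlier sketches.

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed local Bifrost gateway address

# Deliberately invalid request: "messages" is missing, so a 400 with
# type "invalid_request_error" is expected.
resp = requests.post(f"{BASE_URL}/v1/chat/completions",
                     json={"model": "openai/gpt-4o"}, timeout=30)
if not resp.ok:
    err = resp.json().get("error", {})
    print(resp.status_code, err.get("type"), err.get("message"))
```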