Overview
Cerebras is a fully OpenAI-compatible provider that supports the OpenAI chat, text completion, and streaming features. Bifrost delegates all functionality to the OpenAI provider implementation and applies standard parameter filtering. Key characteristics:
- Complete OpenAI compatibility - All chat, text, and streaming features supported
- Full tool calling - Function definitions and parallel tool execution
- Streaming support - Server-Sent Events with token usage tracking
- Parameter preservation - Passes through all standard OpenAI parameters
- Responses API - Full support with format conversion
Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/chat/completions |
| Responses API | ✅ | ✅ | /v1/chat/completions |
| Text Completions | ✅ | ✅ | /v1/completions |
| List Models | ✅ | - | /v1/models |
| Embeddings | ❌ | ❌ | - |
| Image Generation | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
| Transcriptions (STT) | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
Unsupported Operations (❌): Embeddings, Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Cerebras API. These operations return UnsupportedOperationError.
1. Chat Completions
Request Parameters
Cerebras supports all standard OpenAI chat completion parameters. For the full parameter reference and behavior, see OpenAI Chat Completions.
Filtered Parameters
Removed for Cerebras compatibility:
- prompt_cache_key - Not supported
- verbosity - Anthropic-specific
- store - Not supported
- service_tier - OpenAI-specific
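
As a rough illustration of the filtering described above, the following Python sketch drops the four listed parameters from a request dictionary before it is forwarded. It is not Bifrost's implementation: the function name `filter_cerebras_params`, the dict-based request shape, and the placeholder model name are all assumptions made for this example.

```python
# Illustrative sketch only; not Bifrost's actual code.
# The parameter names come from the list above; everything else is hypothetical.
FILTERED_PARAMS = frozenset(
    {"prompt_cache_key", "verbosity", "store", "service_tier"}
)


def filter_cerebras_params(params: dict) -> dict:
    """Return a copy of params without the keys filtered for Cerebras."""
    return {
        key: value
        for key, value in params.items()
        if key not in FILTERED_PARAMS
    }


request = {
    "model": "example-model",  # placeholder model name
    "temperature": 0.2,
    "store": True,
    "service_tier": "auto",
}
filtered = filter_cerebras_params(request)
```

Returning a new dictionary leaves the caller's original request untouched, which keeps the filtering step free of side effects.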
Reasoning Parameter
Cerebras delegates to OpenAI via ToOpenAIChatRequest, so reasoning parameters are transformed during conversion: reasoning.effort values are mapped per the convention for OpenAI-compatible providers (e.g., minimal → low), and reasoning.max_tokens is removed.
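
The reasoning transformation can be pictured with a small Python sketch. It assumes a nested `reasoning` dictionary and assumes that only `minimal` is remapped while other effort values pass through unchanged; the full mapping is defined by the OpenAI-compatible providers convention, and `transform_reasoning` is a hypothetical name, not a Bifrost function.

```python
import copy

# Assumption: only "minimal" is remapped; other effort values pass through.
EFFORT_MAP = {"minimal": "low"}


def transform_reasoning(request: dict) -> dict:
    """Return a copy of request with reasoning adjusted as described above."""
    result = copy.deepcopy(request)
    reasoning = result.get("reasoning")
    if isinstance(reasoning, dict):
        effort = reasoning.get("effort")
        if effort is not None:
            reasoning["effort"] = EFFORT_MAP.get(effort, effort)
        # reasoning.max_tokens is removed during conversion.
        reasoning.pop("max_tokens", None)
    return result


converted = transform_reasoning(
    {"reasoning": {"effort": "minimal", "max_tokens": 1024}}
)
```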
Cerebras supports all standard OpenAI message types, tools, responses, and streaming formats. For details on message handling, tool conversion, responses, and streaming, refer to OpenAI Chat Completions.
2. Responses API
Bifrost converts the Responses API format to Chat Completions internally, then converts the response back.
3. Text Completions
Cerebras supports the legacy text completion API:
| Parameter | Mapping |
|---|---|
| prompt | Sent as-is |
| max_tokens | max_tokens |
| temperature | temperature |
| top_p | top_p |
| stop | stop sequences |
The response returns choices[].text with the completion text.
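
To make the parameter mapping concrete, here is a hedged Python sketch that builds a text completion payload and reads choices[].text from a response. The helper names, the placeholder model name, and the sample response are invented for illustration; the sample response is hand-written, not real Cerebras output.

```python
import json


def build_text_completion_payload(
    model, prompt, max_tokens=None, temperature=None, top_p=None, stop=None
):
    """Build a payload using the parameters from the table above."""
    payload = {"model": model, "prompt": prompt}
    optional = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stop": stop,
    }
    # Only include parameters the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def completion_texts(response):
    """Collect choices[].text from a text completion response."""
    return [choice["text"] for choice in response.get("choices", [])]


payload = build_text_completion_payload(
    "example-model", "Once upon a time", max_tokens=32, stop=["\n"]
)
body = json.dumps(payload)  # JSON body for a POST to /v1/completions

# Hand-written sample response, used only to show the access pattern.
sample_response = {"choices": [{"text": " there was a fox."}]}
texts = completion_texts(sample_response)
```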
4. Text Completions Streaming
Streaming text completions use the same SSE format as chat streaming.
5. List Models
Lists available models from Cerebras with capabilities and context length information.
Unsupported Features
| Feature | Reason |
|---|---|
| Embedding | Not offered by Cerebras API |
| Image Generation | Not offered by Cerebras API |
| Speech/TTS | Not offered by Cerebras API |
| Transcription/STT | Not offered by Cerebras API |
| Batch Operations | Not offered by Cerebras API |
| File Management | Not offered by Cerebras API |
Caveats
User Field Size Limit
Severity: Low
Behavior: A user field longer than 64 characters is silently dropped
Impact: Longer user identifiers are lost
Code: SanitizeUserField enforces 64-char max
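
The behavior in this caveat can be sketched in Python as follows. This is a hypothetical analogue of SanitizeUserField, not its actual code; in particular, it assumes the limit is counted in characters via `len()`, whereas the real function may count differently.

```python
from typing import Optional

MAX_USER_FIELD_LENGTH = 64  # limit stated in the caveat above


def sanitize_user_field(user: Optional[str]) -> Optional[str]:
    """Return user unchanged, or None when it exceeds the 64-character limit."""
    if user is not None and len(user) > MAX_USER_FIELD_LENGTH:
        # Dropped silently: no error is raised and no truncation is attempted.
        return None
    return user


kept = sanitize_user_field("a" * 64)     # exactly at the limit, so kept
dropped = sanitize_user_field("a" * 65)  # over the limit, so dropped
```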

