Overview
Azure is a cloud provider offering access to OpenAI and Anthropic models through the Azure OpenAI Service. Bifrost performs conversions including:
- Deployment mapping - Model identifiers mapped to Azure deployment IDs with version handling
- Authentication modes - API key or bearer token (OAuth) support
- Model routing - Automatic provider detection (OpenAI vs Anthropic) based on deployment
- API versioning - Configurable API versions with preview support for Responses API
- Custom endpoints - Full control over Azure endpoint configuration
- Multi-model support - Unified interface for OpenAI, Anthropic (via Azure), and Gemini models
- Request/response pass-through - Support for raw request/response bodies for advanced use cases
Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /openai/v1/chat/completions |
| Responses API | ✅ | ✅ | /openai/v1/responses |
| Embeddings | ✅ | - | /openai/v1/embeddings |
| Files | ✅ | - | /openai/v1/files |
| List Models | ✅ | - | /openai/v1/models |
| Batch | ❌ | ❌ | - |
| Text Completions | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
Azure-specific: Batch operations and Text Completions are not supported by the Azure OpenAI Service. The Responses API uses the preview API version and is available for both OpenAI and Anthropic models.
1. Chat Completions
Request Parameters
Core Parameter Mapping
| Parameter | Azure Handling | Notes |
|---|---|---|
| model | Mapped to deployment_id | Supports version matching and base model matching |
| max_completion_tokens | Direct pass-through | OpenAI models only |
| temperature, top_p | Direct pass-through | Same across all models |
| All other params | Model-specific conversion | Converted per underlying provider (OpenAI/Anthropic) |
Authentication Configuration
Azure uses custom endpoint and deployment configuration; see the Gateway and Go SDK quickstarts for setup examples.
Key Configuration
The key configuration for Azure requires:
- endpoint - Azure OpenAI resource endpoint (required)
- api_version - API version to use (default: 2024-10-21)
- deployments - Map of model names to deployment IDs (optional, can be provided per-request)
- allowed_models - List of allowed models to use from this key (optional)
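As a rough illustration, the key fields described above might be modeled like this in Go. The struct and field names here are illustrative stand-ins that mirror the documented config keys, not Bifrost's actual types:

```go
package main

import "fmt"

// AzureKeyConfig is an illustrative shape for the documented key fields;
// it is not Bifrost's actual configuration type.
type AzureKeyConfig struct {
	Endpoint      string            // required: Azure OpenAI resource endpoint
	APIVersion    string            // optional: defaults to "2024-10-21" when empty
	Deployments   map[string]string // optional: model name -> deployment ID
	AllowedModels []string          // optional: models permitted for this key
}

// withDefaults fills in the documented default API version when none is set.
func withDefaults(c AzureKeyConfig) AzureKeyConfig {
	if c.APIVersion == "" {
		c.APIVersion = "2024-10-21"
	}
	return c
}

func main() {
	cfg := withDefaults(AzureKeyConfig{
		Endpoint: "https://my-resource.openai.azure.com",
		Deployments: map[string]string{
			"gpt-4o": "my-gpt4o-deployment",
		},
		AllowedModels: []string{"gpt-4o"},
	})
	fmt.Println(cfg.APIVersion)
}
```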
Deployment Selection
Deployments can be specified at three levels (in order of precedence):
1. Per-request (highest priority)
2. Key configuration
3. Model name (lowest priority): if no deployment is specified, the model name is used as the deployment ID directly
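The precedence above, together with the base-model matching mentioned in the parameter table, can be sketched as follows. The function name and the prefix-based matching are illustrative assumptions, not Bifrost's actual implementation (which lives in `models.go`):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveDeployment sketches the documented precedence: a per-request
// deployment wins, then the key's deployment map (with a loose base-model
// match that ignores version suffixes), and finally the model name itself
// is used as the deployment ID.
func resolveDeployment(requestDeployment, model string, keyDeployments map[string]string) string {
	if requestDeployment != "" {
		return requestDeployment // 1. per-request (highest priority)
	}
	if d, ok := keyDeployments[model]; ok {
		return d // 2. exact match in key configuration
	}
	for name, d := range keyDeployments {
		// loose match: e.g. "gpt-4-turbo" can fall back to a "gpt-4" mapping
		if strings.HasPrefix(model, name) {
			return d
		}
	}
	return model // 3. model name used as deployment ID directly
}

func main() {
	deployments := map[string]string{"gpt-4": "prod-gpt4"}
	fmt.Println(resolveDeployment("", "gpt-4-turbo", deployments))
	fmt.Println(resolveDeployment("", "text-embedding-3-small", nil))
}
```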
OpenAI Models
When using OpenAI models (GPT-4, GPT-4 Turbo, GPT-3.5-Turbo, etc.), Bifrost passes OpenAI-compatible parameters through directly.
Parameter Mapping for OpenAI
All OpenAI-standard parameters are supported. Refer to the OpenAI documentation for conversion details.
Anthropic Models
When using Anthropic models through Azure (Claude 3 family), Bifrost converts requests to Anthropic format.
Parameter Mapping for Anthropic
All Anthropic-standard parameters are supported with special handling:
- Reasoning/Thinking: reasoning parameters converted to Anthropic's thinking structure
- System messages: Extracted and placed in a separate system field
- Tool message grouping: Consecutive tool messages merged
Special Notes for Azure + Anthropic
- API version automatically set to 2023-06-01 for Anthropic models
- Endpoints use /anthropic/v1/ paths internally
- Authentication uses x-api-key header for Anthropic models
- Minimum reasoning budget: 1024 tokens
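The per-provider differences listed above can be sketched as a small helper. This is an illustration of the documented behavior, not Bifrost's actual code; the `api-key` header for OpenAI models is an assumption based on standard Azure OpenAI authentication:

```go
package main

import "fmt"

// authForModel sketches the documented split: Anthropic models get the
// 2023-06-01 API version and x-api-key authentication, while OpenAI models
// keep the configured version and Azure's standard api-key header.
func authForModel(provider, configuredVersion string) (apiVersion, authHeader string) {
	if provider == "anthropic" {
		return "2023-06-01", "x-api-key"
	}
	return configuredVersion, "api-key"
}

// minReasoningBudget enforces the documented 1024-token floor for
// Anthropic thinking budgets.
func minReasoningBudget(requested int) int {
	if requested < 1024 {
		return 1024
	}
	return requested
}

func main() {
	v, h := authForModel("anthropic", "2024-10-21")
	fmt.Println(v, h)
	fmt.Println(minReasoningBudget(256))
}
```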
API Versioning
- Default version: 2024-10-21 (supports latest OpenAI features)
- Preview version: preview (used for Responses API)
- Custom version: Set via api_version in key config
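A minimal sketch of these version rules, assuming a simple per-operation switch (the function and operation names are illustrative, not Bifrost's actual API):

```go
package main

import "fmt"

// apiVersionFor sketches the documented rules: Responses API requests use
// the "preview" version, everything else uses the configured version or
// falls back to the 2024-10-21 default.
func apiVersionFor(operation, configured string) string {
	if operation == "responses" {
		return "preview"
	}
	if configured == "" {
		return "2024-10-21"
	}
	return configured
}

func main() {
	fmt.Println(apiVersionFor("responses", "2024-10-21"))
	fmt.Println(apiVersionFor("chat", ""))
}
```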
Streaming
Streaming uses OpenAI or Anthropic format depending on model type:
- OpenAI models: Standard OpenAI streaming with chat.completion.chunk events
- Anthropic models: Anthropic streaming format with content blocks
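The model-type detection that drives this routing (handled in `azure.go`, per the caveats below) might look roughly like the following. The prefix-based check is an illustrative assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// providerForModel sketches the automatic detection: Claude model names
// route to the Anthropic conversion path, everything else to the OpenAI
// path. This is an illustration, not Bifrost's actual detection logic.
func providerForModel(model string) string {
	if strings.HasPrefix(strings.ToLower(model), "claude") {
		return "anthropic"
	}
	return "openai"
}

func main() {
	fmt.Println(providerForModel("claude-3-opus"))
	fmt.Println(providerForModel("gpt-4o"))
}
```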
2. Responses API
The Responses API is available for both OpenAI and Anthropic models on Azure and uses the preview API version.
Request Parameters
Core Parameter Mapping
| Parameter | Azure Handling | Notes |
|---|---|---|
| instructions | Becomes system message | Model-specific conversion |
| input | Converted to user message(s) | String or array support |
| max_output_tokens | Model-specific field mapping | OpenAI vs Anthropic conversion |
| All other params | Model-specific conversion | Converted per underlying provider |
OpenAI Models
For OpenAI models (GPT-4, etc.), conversion follows OpenAI's Responses API format.
Anthropic Models
For Anthropic models (Claude, etc.), conversion follows Anthropic's message format:
- instructions becomes the system message
- reasoning mapped to the thinking structure
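The mapping above can be sketched with simplified stand-in types; these structs are illustrations of the documented conversion, not Bifrost's or Anthropic's actual request types:

```go
package main

import "fmt"

// responsesRequest is a simplified stand-in for a Responses API request.
type responsesRequest struct {
	Instructions    string
	Input           string
	ReasoningBudget int
}

// anthropicRequest is a simplified stand-in for an Anthropic-format request.
type anthropicRequest struct {
	System         string
	UserMessage    string
	ThinkingBudget int
}

// toAnthropic moves instructions into the system field and the reasoning
// budget into the thinking budget, per the conversion rules listed above.
func toAnthropic(r responsesRequest) anthropicRequest {
	return anthropicRequest{
		System:         r.Instructions,
		UserMessage:    r.Input,
		ThinkingBudget: r.ReasoningBudget,
	}
}

func main() {
	out := toAnthropic(responsesRequest{Instructions: "Be terse.", Input: "Hi", ReasoningBudget: 2048})
	fmt.Println(out.System, out.ThinkingBudget)
}
```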
Endpoint Configuration
Special Handling
- Uses the /openai/v1/responses endpoint with the preview API version
- All request body conversions handled automatically
- Supports raw request body passthrough for advanced cases
3. Embeddings
Embeddings are supported for OpenAI models only (not available for Anthropic models on Azure).
Request Parameters
| Parameter | Azure Handling |
|---|---|
| input | Direct pass-through |
| model | Mapped to deployment |
| dimensions | Direct pass-through (when supported) |
Response Conversion
The embeddings response is passed through directly from Azure OpenAI in the standard format.
4. Files API
File operations are supported for OpenAI models only.
Supported Operations
| Operation | Support |
|---|---|
| Upload | ✅ |
| List | ✅ |
| Retrieve | ✅ |
| Delete | ✅ |
| Get Content | ✅ |
5. List Models
Request Parameters
None required.
Response Conversion
Lists available models/deployments configured in the Azure key. The response includes model metadata, capabilities, and lifecycle status.
Caveats
Deployment ID Required
Severity: High
Behavior: Model names must map to Azure deployment IDs
Impact: Request fails without valid deployment mapping
Code: azure.go:145-200
Model Provider Detection
Severity: Medium
Behavior: Automatic detection of OpenAI vs Anthropic based on model name
Impact: Different conversion logic applied transparently
Code: azure.go:92-114
Responses API Preview Version (including Anthropic)
Severity: Medium
Behavior: The Responses API automatically uses the preview API version, which differs from the Chat Completions API version; this applies to both OpenAI and Anthropic models.
Impact: Different API version for Responses vs Chat Completions. Automatic version override for Responses requests.
Code: azure.go:92-114, azure.go:109-113, azure.go:694
Version Matching for Deployments
Severity: Low
Behavior: Model version differences ignored when matching to deployments
Impact: gpt-4 and gpt-4-turbo can map to the same deployment
Code: models.go:13-58
Configuration
HTTP Settings: API Version 2024-10-21 (configurable) | Max Connections 5000 | Max Idle 60 seconds
Endpoint Format: https://{resource-name}.openai.azure.com/openai/v1/{path}?api-version={version}
Note: Bifrost automatically constructs URLs using the endpoint from key configuration and the configured API version.
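The endpoint format above can be sketched as a small URL builder; the function name and parameters are illustrative, not Bifrost's actual code:

```go
package main

import "fmt"

// buildAzureURL sketches the documented endpoint format:
// https://{resource-name}.openai.azure.com/openai/v1/{path}?api-version={version}
// Bifrost constructs this from the key's endpoint and configured API version.
func buildAzureURL(endpoint, path, apiVersion string) string {
	return fmt.Sprintf("%s/openai/v1/%s?api-version=%s", endpoint, path, apiVersion)
}

func main() {
	fmt.Println(buildAzureURL("https://my-resource.openai.azure.com", "chat/completions", "2024-10-21"))
}
```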
Setup & Configuration
Azure requires endpoint URLs, deployment mappings, and API version configuration. For detailed instructions on setting up Azure authentication, see the quickstart guides:
- Gateway
- Go SDK
See Provider-Specific Authentication - Azure in the Gateway Quickstart for configuration steps using Web UI, API, or config.json.

