Overview

Azure is a cloud provider offering access to OpenAI and Anthropic models through the Azure OpenAI Service. Bifrost performs conversions including:

Deployment mapping - Model identifiers mapped to Azure deployment IDs with version handling
Authentication modes - API key or bearer token (OAuth) support
Model routing - Automatic provider detection (OpenAI vs Anthropic) based on deployment
API versioning - Configurable API versions with preview support for Responses API
Custom endpoints - Full control over Azure endpoint configuration
Multi-model support - Unified interface for OpenAI, Anthropic (via Azure), and Gemini models
Request/response pass-through - Support for raw request/response bodies for advanced use cases

Supported Operations

Operation	Non-Streaming	Streaming	Endpoint
Chat Completions	✅	✅	`/openai/v1/chat/completions`
Responses API	✅	✅	`/openai/v1/responses`
Embeddings	✅	-	`/openai/v1/embeddings`
Files	✅	-	`/openai/v1/files`
List Models	✅	-	`/openai/v1/models`
Image Generation	✅	✅	`/openai/v1/images/generations`
Image Edit	✅	✅	`/openai/v1/images/edits`
Image Variation	❌	❌	-
Batch	❌	❌	-
Text Completions	❌	❌	-
Speech (TTS)	❌	❌	-

Azure-specific: Batch operations and Text Completions are not supported by Azure OpenAI Service. Responses API uses preview API version and is available for both OpenAI and Anthropic models.

1. Chat Completions

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`model`	Mapped to `deployment_id`	Supports version matching and base model matching
`max_completion_tokens`	Direct pass-through	OpenAI models only
`temperature`, `top_p`	Direct pass-through	Same across all models
All other params	Model-specific conversion	Converted per underlying provider (OpenAI/Anthropic)

Authentication Configuration

Azure uses custom endpoint and deployment configuration:

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/gpt-4-deployment",
    "messages": [{"role": "user", "content": "Hello"}],
    "deployment": "my-gpt4-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ChatCompletionRequest(ctx, &schemas.BifrostChatRequest{
    Provider: schemas.Azure,
    Model:    "gpt-4",
    Input:    messages,
    Params: &schemas.ChatParameters{
        ExtraParams: map[string]interface{}{
            "deployment": "my-gpt4-deployment",
            "endpoint": "https://my-org.openai.azure.com",
        },
    },
})

Key Configuration

Azure supports two authentication methods:

Azure Entra ID (Service Principal)

If you set client_id, client_secret, and tenant_id, Azure Entra ID authentication will be used with priority over API key authentication.

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "client_id": "your-client-id",
    "client_secret": "your-client-secret",
    "tenant_id": "your-tenant-id",
    "scopes": ["https://cognitiveservices.azure.com/.default"],
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Required Azure Roles:

For OpenAI models: Cognitive Services OpenAI User
For Anthropic models: Cognitive Services AI Services User

Direct Authentication (API Key)

{
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21",
    "deployments": {
      "gpt-4": "my-gpt4-deployment",
      "gpt-4-turbo": "my-gpt4-turbo-deployment",
      "claude-3": "my-claude-deployment"
    }
  }
}

Configuration Details:

endpoint - Azure OpenAI resource endpoint (required)
client_id - Azure Entra ID client ID (optional, for Service Principal auth)
client_secret - Azure Entra ID client secret (optional, for Service Principal auth)
tenant_id - Azure Entra ID tenant ID (optional, for Service Principal auth)
scopes - OAuth scopes for token requests (default: ["https://cognitiveservices.azure.com/.default"])
api_version - API version to use (default: 2024-10-21)
deployments - Map of model names to deployment IDs (optional, can be provided per-request)
allowed_models - List of allowed models to use from this key (optional)

Deployment Selection

Deployments can be specified at three levels (in order of precedence):

Per-request (highest priority)
```
{"deployment": "custom-deployment"}
```

Key configuration

{"deployments": {"gpt-4": "my-gpt4-deployment"}}

Model name (lowest priority, if no deployment specified) Model name is used as deployment ID directly

OpenAI Models

When using OpenAI models (GPT-4, GPT-4 Turbo, GPT-3.5-Turbo, etc.), Bifrost passes through OpenAI-compatible parameters directly.

Parameter Mapping for OpenAI

All OpenAI-standard parameters are supported. Refer to OpenAI documentation for detailed conversion details.

Anthropic Models

When using Anthropic models through Azure (Claude 3 family), Bifrost converts requests to Anthropic format.

Parameter Mapping for Anthropic

All Anthropic-standard parameters are supported with special handling:

Reasoning/Thinking: reasoning parameters converted to Anthropic’s thinking structure
System messages: Extracted and placed in separate system field
Tool message grouping: Consecutive tool messages merged

Refer to Anthropic documentation for detailed conversion details.

Special Notes for Azure + Anthropic

API version automatically set to 2023-06-01 for Anthropic models
Endpoints use /anthropic/v1/ paths internally
Authentication uses x-api-key header for Anthropic models
Minimum reasoning budget: 1024 tokens

API Versioning

Default version: 2024-10-21 (supports latest OpenAI features)
Preview version: preview (used for Responses API)
Custom version: Set via api_version in key config

Different API versions may have different feature support. Bifrost automatically adjusts endpoint paths and parameters based on the configured version.

Streaming

Streaming uses OpenAI or Anthropic format depending on model type:

OpenAI models: Standard OpenAI streaming with chat.completion.chunk events
Anthropic models: Anthropic streaming format with content blocks

2. Responses API

The Responses API is available for both OpenAI and Anthropic models on Azure and uses the preview API version.

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`instructions`	Becomes system message	Model-specific conversion
`input`	Converted to user message(s)	String or array support
`max_output_tokens`	Model-specific field mapping	OpenAI vs Anthropic conversion
All other params	Model-specific conversion	Converted per underlying provider

OpenAI Models

For OpenAI models (GPT-4, etc.), conversion follows OpenAI’s Responses API format.

Anthropic Models

For Anthropic models (Claude, etc.), conversion follows Anthropic’s message format:

instructions becomes system message
reasoning mapped to thinking structure

Endpoint Configuration

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/claude-3-sonnet",
    "input": "Hello, how are you?",
    "instructions": "You are a helpful assistant",
    "deployment": "my-claude-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ResponsesRequest(ctx, &schemas.BifrostResponsesRequest{
    Provider: schemas.Azure,
    Model:    "claude-3-sonnet",
    Input:    messages,
    Params: &schemas.ResponsesParameters{
        Instructions: schemas.Ptr("You are a helpful assistant"),
    },
})

Special Handling

Uses /openai/v1/responses endpoint with preview API version
All request body conversions handled automatically
Supports raw request body passthrough for advanced cases

OpenAI Models - gpt-oss Special Message Handling: For OpenAI models through Azure, see OpenAI Responses API documentation for details on special gpt-oss model handling regarding reasoning conversion (summaries vs. content blocks). Anthropic Models: Refer to Anthropic Responses API for parameter details.

3. Embeddings

Embeddings are supported for OpenAI models only (not available for Anthropic models on Azure).

Request Parameters

Parameter	Azure Handling
`input`	Direct pass-through
`model`	Mapped to deployment
`dimensions`	Direct pass-through (when supported)

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": ["text to embed"],
    "deployment": "my-embedding-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.EmbeddingRequest(ctx, &schemas.BifrostEmbeddingRequest{
    Provider: schemas.Azure,
    Model:    "text-embedding-3-small",
    Input: &schemas.EmbeddingInput{
        Texts: []string{"text to embed"},
    },
})

Response Conversion

Embeddings response is passed through directly from Azure OpenAI with standard format:

{
  "data": [
    {
      "object": "embedding",
      "embedding": [0.1234, -0.5678, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

4. Files API

Files operations are supported for OpenAI models only.

Supported Operations

Operation	Support
Upload	✅
List	✅
Retrieve	✅
Delete	✅
Get Content	✅

Files are stored in Azure and can be used with batch operations.

5. Image Generation

Image Generation is supported for OpenAI models on Azure and uses the OpenAI-compatible format.

Request Parameters

Core Parameter Mapping

Parameter	Azure Handling	Notes
`model`	Mapped to `deployment_id`	Deployment ID must be configured
`prompt`	Direct pass-through	Prompt text for image generation
All other params	Direct pass-through	Uses OpenAI format

Azure uses the same conversion as OpenAI (see OpenAI Image Generation):

Model & Prompt: bifrostReq.Model → req.Model (mapped to deployment), bifrostReq.Prompt → req.Prompt
Parameters: All other fields from bifrostReq are embedded directly into the request struct via struct embedding

Configuration

Gateway
Go SDK

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/dall-e-3",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 1,
    "deployment": "my-image-gen-deployment"
  }' \
  -H "api-key: YOUR_AZURE_API_KEY"

resp, err := client.ImageGenerationRequest(ctx, &schemas.BifrostImageGenerationRequest{
    Provider: schemas.Azure,
    Model:    "dall-e-3",
    Input: &schemas.ImageGenerationInput{
        Prompt: "A sunset over the mountains",
    },
    Params: &schemas.ImageGenerationParameters{
        Size:    schemas.Ptr("1024x1024"),
        N:       schemas.Ptr(1),
    },
})

Response Conversion

Non-streaming: Azure responses are unmarshaled directly into BifrostImageGenerationResponse since Bifrost’s response schema is a superset of OpenAI’s format. All fields are passed through as-is.
Streaming: Azure streaming responses use Server-Sent Events (SSE) format with the same event types as OpenAI (see OpenAI Image Generation Streaming).

Streaming

Image generation streaming is supported and uses OpenAI’s streaming format with Server-Sent Events (SSE).

6. Image Edit

Requests use multipart/form-data, not JSON.

Image Edit is supported for OpenAI models on Azure and uses the OpenAI-compatible format. Azure uses the same conversion as OpenAI (see OpenAI Image Edit):

Request Conversion: Uses openai.HandleOpenAIImageEditRequest with Azure-specific URL construction
URL Format: {endpoint}/openai/deployments/{deployment}/images/edits?api-version={apiVersion}
Authentication: Azure API key or OAuth bearer token (via getAzureAuthHeaders)
Deployment Mapping: Model identifier mapped to Azure deployment ID
Response Conversion: Same as OpenAI - responses unmarshaled directly into BifrostImageGenerationResponse
Streaming: Supported via openai.HandleOpenAIImageEditStreamRequest with Azure-specific URL and authentication

Endpoint: /openai/deployments/{deployment}/images/edits?api-version={apiVersion}

7. List Models

Request Parameters

None required.

Response Conversion

Lists available models/deployments configured in the Azure key. Response includes model metadata, capabilities, and lifecycle status.

{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882411,
      "status": "active",
      "lifecycle_status": "stable",
      "capabilities": {
        "chat_completion": true,
        "embeddings": false
      }
    }
  ]
}

Caveats

Deployment ID Required

Severity: High Behavior: Model names must map to Azure deployment IDs Impact: Request fails without valid deployment mapping Code: azure.go:145-200

Model Provider Detection

Severity: Medium Behavior: Automatic detection of OpenAI vs Anthropic based on model name Impact: Different conversion logic applied transparently Code: azure.go:92-114

Responses API Preview Version (including Anthropic)

Severity: Medium Behavior: Responses API automatically uses preview API version, which differs from Chat Completions API version. For Anthropic models, Responses API specifically uses preview API version. Impact: Different API version for Responses vs Chat Completions. Automatic version override for Responses requests. Code: azure.go:92-114, azure.go:109-113, azure.go:694

Version Matching for Deployments

Severity: Low Behavior: Model version differences ignored when matching to deployments Impact: gpt-4 and gpt-4-turbo can map to same deployment Code: models.go:13-58

Configuration

HTTP Settings: API Version 2024-10-21 (configurable) | Max Connections 5000 | Max Idle 60 seconds Endpoint Format: https://{resource-name}.openai.azure.com/openai/v1/{path}?api-version={version} Note: Bifrost automatically constructs URLs using the endpoint from key configuration and the configured API version.

Setup & Configuration

Azure requires endpoint URLs, deployment mappings, and API version configuration. For detailed instructions on setting up Azure authentication, see the quickstart guides:

Gateway
Go SDK

See Provider-Specific Authentication - Azure in the Gateway Quickstart for configuration steps using Web UI, API, or config.json.

Overview

Quick Start

Providers & Guides

SDK Integrations

MCP Gateway

Custom plugins

Open Source Features

Enterprise Features

​Overview

​Supported Operations

​1. Chat Completions

​Request Parameters

​Core Parameter Mapping

​Authentication Configuration

​Key Configuration

​Azure Entra ID (Service Principal)

​Direct Authentication (API Key)

​Deployment Selection

​OpenAI Models

​Parameter Mapping for OpenAI

​Anthropic Models

​Parameter Mapping for Anthropic

​Special Notes for Azure + Anthropic

​API Versioning

​Streaming

​2. Responses API

​Request Parameters

​Core Parameter Mapping

​OpenAI Models

​Anthropic Models

​Endpoint Configuration

​Special Handling

​3. Embeddings

​Request Parameters

​Response Conversion

​4. Files API

​Supported Operations

​5. Image Generation

​Request Parameters

​Core Parameter Mapping

​Configuration

​Response Conversion

​Streaming

​6. Image Edit

​7. List Models

​Request Parameters

​Response Conversion

​Caveats

​Configuration

​Setup & Configuration

Overview

Supported Operations

1. Chat Completions

Request Parameters

Core Parameter Mapping

Authentication Configuration

Key Configuration

Azure Entra ID (Service Principal)

Direct Authentication (API Key)

Deployment Selection

OpenAI Models

Parameter Mapping for OpenAI

Anthropic Models

Parameter Mapping for Anthropic

Special Notes for Azure + Anthropic

API Versioning

Streaming

2. Responses API

Request Parameters

Core Parameter Mapping

OpenAI Models

Anthropic Models

Endpoint Configuration

Special Handling

3. Embeddings

Request Parameters

Response Conversion

4. Files API

Supported Operations

5. Image Generation

Request Parameters

Core Parameter Mapping

Configuration

Response Conversion

Streaming

6. Image Edit

7. List Models

Request Parameters

Response Conversion

Caveats

Configuration

Setup & Configuration