Overview

AWS Bedrock supports multiple model families (Claude, Nova, Mistral, Llama, Cohere, Titan) with significant structural differences from OpenAI’s format. Bifrost performs extensive conversion including:
  • Model family detection - Automatic routing based on model ID to handle family-specific parameters
  • Parameter renaming - e.g., max_completion_tokens → maxTokens, stop → stopSequences
  • Reasoning transformation - reasoning parameters mapped to model-specific thinking/reasoning structures (Anthropic, Nova)
  • Tool restructuring - Function definitions converted to Bedrock’s ToolConfig format
  • Message conversion - System message extraction, tool message grouping, image format adaptation (base64 only)
  • AWS authentication - Automatic SigV4 request signing with credential chain support
  • Structured output - response_format converted to specialized tool definitions
  • Service tier & guardrails - Support for Bedrock-specific performance and safety configurations

Model Family Support

Family | Chat | Responses | Text | Embeddings | Image Generation | Image Edit | Image Variation
Claude (Anthropic)
Nova (Amazon)
Mistral
Llama
Cohere
Titan

Supported Operations

Operation | Non-Streaming | Streaming | Endpoint
Chat Completions | | | converse
Responses API | | | converse
Text Completions | | | invoke
Embeddings | | - | invoke
Files | | - | S3 (via SDK)
Batch | | - | batch
List Models | | - | listFoundationModels
Image Generation | | | invoke
Image Edit | | | invoke
Image Variation | | | invoke
Speech (TTS) | | | -
Transcriptions (STT) | | | -
Unsupported Operations (❌): Speech (TTS) and Transcriptions (STT) are not supported by the upstream AWS Bedrock API. These return UnsupportedOperationError.
Limitations: Images must be in base64 or data URI format (remote URLs not supported). Text completion streaming is not supported.

1. Chat Completions

Request Parameters

Parameter Mapping

Parameter | Transformation | Notes
max_completion_tokens | inferenceConfig.maxTokens | Required field in Bedrock
temperature, top_p | Direct pass-through to inferenceConfig |
stop | inferenceConfig.stopSequences | Array of strings
response_format | → Structured output tool (see Structured Output) | Creates bf_so_* tool
tools | Schema restructured (see Tool Conversion) |
tool_choice | Type mapped (see Tool Conversion) |
reasoning | Model-specific thinking config (see Reasoning / Thinking) |
user | metadata.userID (if provided) | Bedrock-specific metadata
service_tier | serviceModelTier (if provided) | Performance tier selection
top_k | Via extra_params (model-specific) | Bedrock-specific sampling

Dropped Parameters

The following parameters are silently ignored: frequency_penalty, presence_penalty, logit_bias, logprobs, top_logprobs, seed, parallel_tool_calls
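The mapping table and the dropped-parameter list can be sketched together in Python. This is only an illustration of the documented behavior, not Bifrost's code: the function name to_inference_config is hypothetical, and the camelCase key topP is assumed from Bedrock's Converse inferenceConfig naming.

```python
# Parameters the section above lists as silently ignored for Bedrock.
DROPPED = {
    "frequency_penalty", "presence_penalty", "logit_bias",
    "logprobs", "top_logprobs", "seed", "parallel_tool_calls",
}


def to_inference_config(params: dict) -> dict:
    """Map OpenAI-style sampling parameters to a Converse-style inferenceConfig.

    Hypothetical helper: it mirrors the table above, nothing more.
    """
    params = {k: v for k, v in params.items() if k not in DROPPED}
    config = {}
    if "max_completion_tokens" in params:
        config["maxTokens"] = params["max_completion_tokens"]
    if "temperature" in params:
        config["temperature"] = params["temperature"]
    if "top_p" in params:
        config["topP"] = params["top_p"]
    if "stop" in params:
        stop = params["stop"]
        # stopSequences is an array of strings, so wrap a bare string.
        config["stopSequences"] = [stop] if isinstance(stop, str) else list(stop)
    return config
```

Because dropped parameters are ignored rather than rejected, a request that sets only seed or logit_bias produces an empty configuration instead of an error.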

Extra Parameters

Use extra_params (SDK) or pass directly in request body (Gateway) for Bedrock-specific fields:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [{"role": "user", "content": "Hello"}],
    "guardrailConfig": {
      "guardrailIdentifier": "guardrail-id",
      "guardrailVersion": "1",
      "trace": "enabled"
    },
    "performanceConfig": {
      "latency": "optimized"
    }
  }'
Available Extra Parameters:
  • guardrailConfig - Bedrock guardrail configuration with guardrailIdentifier, guardrailVersion, trace
  • performanceConfig - Performance optimization with latency (“optimized” or “standard”)
  • additionalModelRequestFieldPaths - Pass-through for model-specific fields not in standard schema
  • promptVariables - Variables for prompt templates (if using prompt caching)
  • requestMetadata - Custom metadata for request tracking

Cache Control

Prompt caching is supported via cache control directives:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "This context will be cached",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      }
    ],
    "system": [
      {
        "type": "text",
        "text": "You are a helpful assistant",
        "cache_control": {"type": "ephemeral"}
      }
    ]
  }'

Reasoning / Thinking

Documentation: See Bifrost Reasoning Reference
Reasoning/thinking support varies by model family:

Anthropic Claude Models

Parameter Mapping:
  • reasoning.effort → thinkingConfig.type = "enabled" (always enabled when reasoning present)
  • reasoning.max_tokens → thinkingConfig.budgetTokens (token budget for thinking)
Critical Constraints:
  • Minimum budget: 1024 tokens required; requests below this fail with error
  • Dynamic budget: -1 is converted to 1024 automatically
// Request
{"reasoning": {"effort": "high", "max_tokens": 2048}}

// Bedrock conversion
{"thinkingConfig": {"type": "enabled", "budgetTokens": 2048}}
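A minimal Python sketch of the Claude constraints, assuming reasoning.max_tokens is always present in the request (the behavior when it is omitted is not described here). The helper name claude_thinking_config and the ValueError are illustrative choices, not Bifrost's API.

```python
MIN_THINKING_BUDGET = 1024  # minimum budget stated under "Critical Constraints"


def claude_thinking_config(reasoning: dict) -> dict:
    """Convert an OpenAI-style reasoning object to a Claude thinkingConfig."""
    budget = reasoning["max_tokens"]
    if budget == -1:
        # Dynamic budget: -1 is converted to the minimum.
        budget = MIN_THINKING_BUDGET
    elif budget < MIN_THINKING_BUDGET:
        raise ValueError(
            f"reasoning.max_tokens must be >= {MIN_THINKING_BUDGET}, got {budget}"
        )
    # The type is always "enabled" once a reasoning object is present.
    return {"thinkingConfig": {"type": "enabled", "budgetTokens": budget}}
```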

Amazon Nova Models

Parameter Mapping:
  • reasoning.effort → reasoningConfig.thinkingLevel (“low” → low, “high” → high)
  • reasoning.max_tokens → Max reasoning tokens (affects inference configuration)
// Request
{"reasoning": {"effort": "high", "max_tokens": 10000}}

// Bedrock conversion
{"reasoningConfig": {"type": "enabled", "thinkingLevel": "high"}}
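The family-specific routing can be illustrated with a short Python sketch. The substring checks in detect_family are an assumption about how a model ID might be classified (Bifrost's real detection logic is not shown on this page), and both function names are hypothetical.

```python
def detect_family(model_id: str) -> str:
    """Guess the model family from a Bedrock model ID (illustrative heuristic)."""
    if "anthropic." in model_id:
        return "claude"
    if "amazon.nova" in model_id:
        return "nova"
    return "other"


def nova_reasoning_config(reasoning: dict) -> dict:
    """Map reasoning.effort to the Nova reasoningConfig shown above."""
    # Only "low" and "high" are documented; other values are passed
    # through unchanged here, which is an assumption.
    return {
        "reasoningConfig": {"type": "enabled", "thinkingLevel": reasoning["effort"]}
    }
```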

Message Conversion

Critical Caveats

  • System message extraction: System messages are removed from messages array and placed in separate system field
  • Tool message grouping: Consecutive tool messages are merged into single user message with tool result content blocks
  • Image format: Only base64/data URI supported; remote image URLs are not supported by Bedrock Converse API
  • Document support: PDF, CSV, DOC, DOCX, XLS, XLSX, HTML, TXT, MD formats supported
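System extraction and tool-message grouping can be sketched as below. The sketch assumes every message has plain string content and that tool messages carry tool_call_id; the output block shapes follow Bedrock's Converse message format, and the function name convert_messages is hypothetical.

```python
def convert_messages(messages):
    """Split OpenAI-style messages into (system_blocks, converse_messages)."""
    system, converted = [], []
    last_was_tool = False
    for msg in messages:
        role = msg["role"]
        if role == "system":
            # System messages leave the array and go to a separate field.
            system.append({"text": msg["content"]})
            continue
        if role == "tool":
            block = {
                "toolResult": {
                    "toolUseId": msg["tool_call_id"],
                    "content": [{"text": msg["content"]}],
                }
            }
            if last_was_tool:
                # Consecutive tool messages share one user message.
                converted[-1]["content"].append(block)
            else:
                converted.append({"role": "user", "content": [block]})
            last_was_tool = True
            continue
        converted.append({"role": role, "content": [{"text": msg["content"]}]})
        last_was_tool = False
    return system, converted
```

The returned message list can therefore be shorter than the input, which is the structural change the caveats describe.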

Image Conversion

  • Base64 images: Data URL → {type: "image", source: {type: "base64", mediaType: "image/png", data: "..."}}
  • URL images: ❌ Not supported - Will fail if attempted
  • Documents: Converted to document content blocks with MIME types
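As a rough illustration of the base64-only rule, the following Python sketch turns a data URL into the image block shape shown in the first bullet and rejects anything else. The function name and the regular expression are assumptions made for this example.

```python
import re

# Matches data URLs such as "data:image/png;base64,AAAA".
_DATA_URL = re.compile(
    r"^data:(?P<media>image/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def image_block_from_url(url: str) -> dict:
    """Build an image content block from a base64 data URL."""
    match = _DATA_URL.match(url)
    if match is None:
        # Remote http(s) URLs end up here.
        raise ValueError("only base64 data URLs are supported for Bedrock images")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "mediaType": match["media"],
            "data": match["data"],
        },
    }
```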

Cache Control Locations

Cache directives supported on:
  • System content blocks (entire system message)
  • User message content blocks (specific parts)
  • Tool definitions within tool configuration

Tool Conversion

Tool definitions are restructured:
  • function.name → name (preserved)
  • function.parameters → inputSchema (Schema format)
  • function.strict → Dropped (not supported by Bedrock)

Tool Choice Mapping

OpenAI | Bedrock
"auto" | auto (default)
"none" | Omitted (not explicitly supported)
"required" | any
Specific tool | {type: "tool", name: "X"}
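The tool and tool-choice rules can be sketched in Python as follows. The flat output shape (name, description, inputSchema) follows the structured-output example on this page; representing auto and any as {"type": ...} objects is an assumption made for symmetry with the specific-tool form, and both helper names are hypothetical.

```python
def convert_tool(tool: dict) -> dict:
    """Restructure an OpenAI function tool; function.strict is not copied."""
    fn = tool["function"]
    converted = {"name": fn["name"], "inputSchema": fn.get("parameters", {})}
    if "description" in fn:
        converted["description"] = fn["description"]
    return converted


def convert_tool_choice(choice):
    """Map an OpenAI tool_choice value; None means the field is omitted."""
    if choice == "none":
        return None
    if choice == "auto":
        return {"type": "auto"}
    if choice == "required":
        return {"type": "any"}
    # OpenAI's specific-tool form: {"type": "function", "function": {"name": "X"}}
    return {"type": "tool", "name": choice["function"]["name"]}
```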

Tool Call Handling

Tool calls are converted between formats:
  • Bifrost → Bedrock: Tool call arguments converted from JSON object to input field
  • Bedrock → Bifrost: Tool use results with toolUseId, converted back to Bifrost format
  • Tool results: Merged consecutive tool messages into single user message

Structured Output

Structured output uses a special tool-based approach:
// Request with structured output
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "response",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "number"}
        }
      }
    }
  }
}

// Bedrock conversion (internal)
{
  "tools": [{
    "name": "bf_so_response",
    "description": "Structured output tool",
    "inputSchema": {
      "type": "object",
      "properties": {...}
    }
  }],
  "toolChoice": {"type": "tool", "name": "bf_so_response"}
}

// Response extraction
// Tool use input is extracted and returned as contentStr
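The request-side conversion above can be reproduced with a few lines of Python. This sketch only mirrors the example shown here; the helper name structured_output_tool is hypothetical, and it assumes response_format.type is "json_schema".

```python
def structured_output_tool(response_format: dict) -> dict:
    """Turn a json_schema response_format into a forced bf_so_* tool."""
    spec = response_format["json_schema"]
    name = "bf_so_" + spec["name"]
    return {
        "tools": [{
            "name": name,
            "description": "Structured output tool",
            "inputSchema": spec["schema"],
        }],
        # Forcing the tool makes the model answer through the schema.
        "toolChoice": {"type": "tool", "name": name},
    }
```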

Response Conversion

Field Mapping

  • stopReason → finish_reason: endTurn/stopSequence → stop, maxTokens → length, toolUse → tool_calls
  • usage.inputTokens → prompt_tokens | usage.outputTokens → completion_tokens
  • Cache tokens: cacheReadInputTokens, cacheWriteInputTokens → prompt_tokens_details/completion_tokens_details
  • reasoning/thinking blocks → reasoning_details with index, type, text, and signature
  • Tool call input (object) → arguments (JSON string)
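A compact Python sketch of three of these mappings (finish reason, token usage, tool arguments). Passing unknown stop reasons through unchanged is an assumption; the page does not say how unmapped values are handled, and the helper names are hypothetical.

```python
import json

FINISH_REASONS = {
    "endTurn": "stop",
    "stopSequence": "stop",
    "maxTokens": "length",
    "toolUse": "tool_calls",
}


def finish_reason(stop_reason: str) -> str:
    # Unknown values are returned as-is (assumption, not documented).
    return FINISH_REASONS.get(stop_reason, stop_reason)


def convert_usage(usage: dict) -> dict:
    return {
        "prompt_tokens": usage["inputTokens"],
        "completion_tokens": usage["outputTokens"],
    }


def tool_call_arguments(tool_input: dict) -> str:
    # Bedrock returns the tool input as an object; OpenAI-style clients
    # expect the arguments as a JSON string.
    return json.dumps(tool_input)
```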

Structured Output Response

When structured output is detected:
  • Tool call with name bf_so_* is treated as structured output
  • input object is extracted and returned as contentStr
  • Removed from toolCalls array
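The extraction step might look like the following Python sketch. It assumes tool calls are dictionaries with name and input keys and that contentStr holds the JSON-serialized input; both are simplifications for illustration, and the function name is hypothetical.

```python
import json


def extract_structured_output(tool_calls):
    """Return (content_str, remaining_tool_calls)."""
    content_str = None
    remaining = []
    for call in tool_calls:
        if content_str is None and call["name"].startswith("bf_so_"):
            # The synthetic tool's input becomes the message content.
            content_str = json.dumps(call["input"])
        else:
            remaining.append(call)
    return content_str, remaining
```

Callers never see the bf_so_* call itself, only ordinary content plus any genuine tool calls.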

Streaming

Chat Completions Streaming

Event sequence from Bedrock Converse Stream API:
  1. Initial message role: contentBlockIndex and role information
  2. Content block starts: toolUse blocks with toolUseId, name
  3. Content block deltas:
    • Text delta: Incremental text content
    • Tool use delta: Accumulated tool call arguments (JSON)
    • Reasoning delta: Reasoning text and optional signature
  4. Message completion: stopReason and final token counts
  5. Usage metrics: Token counts, cached tokens, performance metrics
Streaming event conversion:
  • Each Bedrock streaming event → Multiple Bifrost chunks as needed
  • Tool arguments accumulated across deltas and emitted on block end
  • Reasoning content emitted with signature if present
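The accumulate-then-emit behavior for tool arguments can be sketched as a Python generator. The event shapes (contentBlockStart, contentBlockDelta, contentBlockStop with contentBlockIndex) follow the Bedrock ConverseStream format; the emitted chunk shape and the function name are assumptions for this example.

```python
def accumulate_tool_calls(events):
    """Yield one complete tool call per tool-use block, on block end."""
    pending = {}  # contentBlockIndex -> partially received tool call
    for event in events:
        if "contentBlockStart" in event:
            start = event["contentBlockStart"]
            tool = start["start"].get("toolUse")
            if tool:
                pending[start["contentBlockIndex"]] = {
                    "id": tool["toolUseId"],
                    "name": tool["name"],
                    "fragments": [],
                }
        elif "contentBlockDelta" in event:
            delta = event["contentBlockDelta"]
            fragment = delta["delta"].get("toolUse", {}).get("input")
            index = delta["contentBlockIndex"]
            if fragment is not None and index in pending:
                pending[index]["fragments"].append(fragment)
        elif "contentBlockStop" in event:
            call = pending.pop(event["contentBlockStop"]["contentBlockIndex"], None)
            if call:
                yield {
                    "id": call["id"],
                    "name": call["name"],
                    "arguments": "".join(call["fragments"]),
                }
```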

Text Completion Streaming

Not supported - AWS Bedrock’s text completion API does not support streaming.

Responses API Streaming

Streaming responses use OpenAI-compatible lifecycle events:
  • response.created
  • response.in_progress
  • content_part.start
  • content_part.delta
  • content_part.done
  • function_call_arguments.delta
  • function_call_arguments.done
  • output_item.done
Special handling:
  • Tool arguments accumulated across deltas
  • Content block indices mapped to output indices
  • Synthetic events emitted for text/reasoning content

2. Responses API

The Responses API uses the same underlying converse endpoint but converts between OpenAI’s Responses format and Bedrock’s Messages format.

Request Parameters

Parameter Mapping

Parameter | Transformation
max_output_tokens | Renamed to maxTokens (via inferenceConfig)
temperature, top_p | Direct pass-through
instructions | Becomes system message
tools | Schema restructured (see Chat Completions)
tool_choice | Type mapped (see Chat Completions)
reasoning | Mapped to thinking/reasoning config (see Reasoning / Thinking)
text | Converted to output_format (Bedrock-specific)
include | Via extra_params (Bedrock-specific)
stop | Via extra_params, renamed to stopSequences
truncation | Auto-set to "auto" for computer tools

Extra Parameters

Use extra_params (SDK) or pass directly in request body (Gateway):
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    "input": "Hello, how are you?",
    "stop": ["###"]
  }'

Input & Instructions

  • Input: String wrapped as user message or array converted to messages
  • Instructions: Becomes system message (same extraction as Chat Completions)
  • Cache control: Supported on instructions (system) and input messages
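A small Python sketch of how a Responses request could be reshaped for the converse endpoint. It assumes array input items are simple {"role", "content"} pairs with string content; real Responses input items are richer, and the function name is hypothetical.

```python
def responses_to_converse(request: dict) -> dict:
    """Reshape a Responses API request into a Converse-style body."""
    raw = request["input"]
    if isinstance(raw, str):
        # A bare string is wrapped as a single user message.
        messages = [{"role": "user", "content": [{"text": raw}]}]
    else:
        messages = [
            {"role": item["role"], "content": [{"text": item["content"]}]}
            for item in raw
        ]
    body = {"messages": messages}
    if request.get("instructions"):
        # Instructions become the system message.
        body["system"] = [{"text": request["instructions"]}]
    if "max_output_tokens" in request:
        body["inferenceConfig"] = {"maxTokens": request["max_output_tokens"]}
    return body
```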

Response Conversion

  • stopReason → status: endTurn/stopSequence → completed, maxTokens → incomplete
  • usage.inputTokens/usage.outputTokens preserved with cache tokens → *_tokens_details.cached_tokens
  • Output items: text → message | toolUse → function_call | thinking → reasoning

Streaming

Event sequence: response.created → response.in_progress → content_part.start → content_part.delta → content_part.done → output_item.done

3. Text Completions (Legacy)

Legacy API using invoke endpoint. Streaming not supported. Only Claude (Anthropic) and Mistral models supported.
Request conversion:
  • Claude models: Uses Anthropic’s /v1/complete format with prompt wrapping
    • prompt auto-wrapped with \n\nHuman: {prompt}\n\nAssistant:
    • max_tokens → max_tokens_to_sample
    • temperature, top_p direct pass-through
    • top_k, stop via extra_params
  • Mistral models: Uses standard format
    • max_tokens → max_tokens
    • temperature, top_p direct pass-through
    • stop → stop
Response conversion:
  • Claude: completion → choices[0].text
  • Mistral: outputs[].text → choices[] (supports multiple)
  • stopReason → finish_reason
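For the Claude path, the prompt wrapping and field renames can be sketched as follows. Function names are hypothetical, and the choice dictionary is reduced to index and text for brevity.

```python
def claude_text_completion_body(prompt, max_tokens, temperature=None, top_p=None):
    """Build a legacy Claude completion body with Human/Assistant wrapping."""
    body = {
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    }
    if temperature is not None:
        body["temperature"] = temperature
    if top_p is not None:
        body["top_p"] = top_p
    return body


def claude_completion_to_choices(response: dict) -> list:
    """Map Claude's completion field to a single choices[0].text entry."""
    return [{"index": 0, "text": response["completion"]}]
```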

4. Embeddings

Supported embedding models: Titan, Cohere

Request Parameters

Parameter Mapping

Parameter | Transformation | Notes
input | Direct pass-through | Text or array of texts
dimensions | ⚠️ Not supported | Titan has fixed dimensions per model
encoding_format | Via extra_params | "base64" or "float"
Titan-specific:
  • No dimension customization
  • Fixed output size per model version
Cohere-specific:
  • Reuses Cohere format conversion
  • Similar parameter mapping to standard Cohere

Response Conversion

  • Titan: embedding → single embedding vector
  • Cohere: Reuses Cohere response format with embeddings array
  • usage.inputTokens → usage.prompt_tokens

5. Image Generation

Supported image generation models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1

Request Conversion

Parameter (Bifrost) | Transformation (Bedrock)
prompt | textToImageParams.text
n | imageGenerationConfig.numberOfImages
negativePrompt | textToImageParams.negativeText
seed | imageGenerationConfig.seed
quality | imageGenerationConfig.quality (see Quality Mapping)
style | textToImageParams.style
size | imageGenerationConfig.width & imageGenerationConfig.height

Quality Mapping

The quality parameter is automatically mapped to Bedrock’s expected format:
Input Value | Bedrock Value | Notes
"low" | "standard" | Mapped automatically
"medium" | "standard" | Mapped automatically
"high" | "premium" | Mapped automatically
"default" | "standard" | Passed through (case-insensitive)
"premium" | "premium" | Passed through (case-insensitive)

Response Conversion

Parameter (Bedrock) | Transformation (Bifrost)
images | data.b64_json
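On the request side, the quality mapping and the "WxH" size split described in this section can be sketched in Python. The table values come from Quality Mapping; returning unlisted values unchanged (lowercased) is an assumption, since the page does not describe that case, and both function names are hypothetical.

```python
QUALITY_MAP = {
    "low": "standard",
    "medium": "standard",
    "high": "premium",
    "default": "standard",
    "premium": "premium",
}


def map_quality(quality: str) -> str:
    key = quality.lower()  # matching is case-insensitive
    # Values outside the table fall through unchanged (assumption).
    return QUALITY_MAP.get(key, key)


def parse_size(size: str) -> tuple:
    """Split a "WxH" string into integer (width, height)."""
    width, height = size.lower().split("x")
    return int(width), int(height)
```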

Example Request

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/amazon.nova-canvas-v1:0",
    "prompt": "A futuristic cityscape with a flying car",
    "size": "1024x1024",
    "seed": 123,
    "negative_prompt": "bikes",
    "n": 2
  }'

6. Image Edit

Requests use multipart/form-data, not JSON.
Supported image edit models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1
Bedrock supports three image edit task types: INPAINTING, OUTPAINTING, and BACKGROUND_REMOVAL. The type field is required and must be one of these values.
Request Parameters
Parameter | Type | Required | Notes
model | string | | Model identifier (must be Titan or Nova Canvas model)
type | string | | Edit type: "inpainting", "outpainting", or "background_removal"
prompt | string | | Text description of the edit (required for inpainting/outpainting)
image[] | binary | | Image file(s) to edit (only first image used)
mask | binary | | Mask image file (for inpainting/outpainting)
n | int | | Number of images to generate (1-10, for inpainting/outpainting only)
size | string | | Image size: "WxH" format (e.g., "1024x1024", for inpainting/outpainting only)
quality | string | | Image quality (for inpainting/outpainting only). See Quality Mapping for supported values.
cfgScale | float | | CFG scale (via ExtraParams["cfgScale"], for inpainting/outpainting only)
negative_text | string | | Negative prompt (via ExtraParams["negative_text"], for inpainting/outpainting only)
mask_prompt | string | | Mask prompt (via ExtraParams["mask_prompt"], for inpainting/outpainting only)
return_mask | bool | | Return mask in response (via ExtraParams["return_mask"], for inpainting/outpainting only)
outpainting_mode | string | | Outpainting mode (via ExtraParams["outpainting_mode"], outpainting only): "DEFAULT" or "PRECISE"

Request Conversion
  • Task Type Mapping: Params.Type is mapped to taskType:
    • "inpainting" → "INPAINTING"
    • "outpainting" → "OUTPAINTING"
    • "background_removal" → "BACKGROUND_REMOVAL"
    • Any other value returns an error: "unsupported type for Bedrock"
  • Image Conversion: First image in Input.Images is converted to base64: image.Image → base64 string
  • Task-Specific Parameters:
    • INPAINTING: Uses inPaintingParams:
      • prompt → inPaintingParams.text
      • image (base64) → inPaintingParams.image
      • mask (if present) → inPaintingParams.maskImage (base64)
      • negative_text (via ExtraParams) → inPaintingParams.negativeText
      • mask_prompt (via ExtraParams) → inPaintingParams.maskPrompt
      • return_mask (via ExtraParams) → inPaintingParams.returnMask
    • OUTPAINTING: Uses outPaintingParams:
      • prompt → outPaintingParams.text
      • image (base64) → outPaintingParams.image
      • mask (if present) → outPaintingParams.maskImage (base64)
      • negative_text (via ExtraParams) → outPaintingParams.negativeText
      • mask_prompt (via ExtraParams) → outPaintingParams.maskPrompt
      • return_mask (via ExtraParams) → outPaintingParams.returnMask
      • outpainting_mode (via ExtraParams, validated to "DEFAULT" or "PRECISE") → outPaintingParams.outPaintingMode
    • BACKGROUND_REMOVAL: Uses backgroundRemovalParams:
      • image (base64) → backgroundRemovalParams.image
      • No other parameters supported
  • Image Generation Config (for INPAINTING and OUTPAINTING only):
    • n → imageGenerationConfig.numberOfImages
    • size → imageGenerationConfig.width and imageGenerationConfig.height (parsed from "WxH" format)
    • quality → imageGenerationConfig.quality (see Quality Mapping)
    • cfgScale (via ExtraParams["cfgScale"]) → imageGenerationConfig.cfgScale
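The request conversion above can be condensed into a Python sketch that covers the task type mapping, base64 encoding, and the generation config. Only a subset of parameters is handled (no ExtraParams or quality), and the function name build_image_edit_body is hypothetical.

```python
import base64

TASK_TYPES = {
    "inpainting": ("INPAINTING", "inPaintingParams"),
    "outpainting": ("OUTPAINTING", "outPaintingParams"),
    "background_removal": ("BACKGROUND_REMOVAL", "backgroundRemovalParams"),
}


def build_image_edit_body(edit_type, image, prompt=None, mask=None, n=None, size=None):
    """Build a Bedrock-style image edit body from raw image bytes."""
    if edit_type not in TASK_TYPES:
        raise ValueError("unsupported type for Bedrock")
    task_type, params_key = TASK_TYPES[edit_type]
    params = {"image": base64.b64encode(image).decode("ascii")}
    body = {"taskType": task_type, params_key: params}
    if task_type == "BACKGROUND_REMOVAL":
        # Background removal accepts the image and nothing else.
        return body
    if prompt is not None:
        params["text"] = prompt
    if mask is not None:
        params["maskImage"] = base64.b64encode(mask).decode("ascii")
    config = {}
    if n is not None:
        config["numberOfImages"] = n
    if size is not None:
        width, height = size.lower().split("x")
        config["width"], config["height"] = int(width), int(height)
    if config:
        body["imageGenerationConfig"] = config
    return body
```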
Response Conversion
  • Uses the same response structure as image generation: BedrockImageGenerationResponse → BifrostImageGenerationResponse
  • Response includes:
    • images[]: Array of base64-encoded images
    • maskImage: Base64-encoded mask image (if return_mask was true)
    • error: Error message (if present)
Endpoint: Same as image generation: invoke endpoint
Streaming: Image edit streaming is not supported by Bedrock.

7. Image Variation

Requests use multipart/form-data, not JSON.
Supported image variation models: Titan Image Generator v1, Titan Image Generator v2, Nova Canvas v1
Request Parameters
Parameter | Type | Required | Notes
model | string | | Model identifier (must be Titan or Nova Canvas model)
image | binary | | Image file to create variations from (supports multiple images via image[])
n | int | | Number of images to generate (1-10)
size | string | | Image size: "WxH" format (e.g., "1024x1024")
quality | string | | Image quality. See Quality Mapping for supported values.
cfgScale | float | | CFG scale (via ExtraParams["cfgScale"])
prompt | string | | Prompt/text for variation (via ExtraParams["prompt"])
negativeText | string | | Negative prompt (via ExtraParams["negativeText"])
similarityStrength | float | | Similarity strength (via ExtraParams["similarityStrength"]): Range 0.2 to 1.0

Request Conversion
  • Task Type: taskType is set to "IMAGE_VARIATION"
  • Image Conversion: All images are converted to base64 strings:
    • Primary image: Input.Image.Image → base64 string → imageVariationParams.images[0]
    • Additional images: ExtraParams["images"] (stored as [][]byte by HTTP handler) → base64 strings → appended to imageVariationParams.images[]
  • Image Variation Parameters:
    • prompt (via ExtraParams["prompt"]) → imageVariationParams.text
    • negativeText (via ExtraParams["negativeText"]) → imageVariationParams.negativeText
    • similarityStrength (via ExtraParams["similarityStrength"]) → imageVariationParams.similarityStrength (validated to range [0.2, 1.0])
  • Image Generation Config:
    • n → imageGenerationConfig.numberOfImages
    • size → imageGenerationConfig.width and imageGenerationConfig.height (parsed from "WxH" format)
    • quality (via ExtraParams["quality"]) → imageGenerationConfig.quality (see Quality Mapping)
    • cfgScale (via ExtraParams["cfgScale"]) → imageGenerationConfig.cfgScale
Response Conversion
  • Uses the same response structure as image generation: BedrockImageGenerationResponse → BifrostImageGenerationResponse
  • Response includes:
    • images[]: Array of base64-encoded image variations
    • error: Error message (if present)
Endpoint: Same as image generation: invoke endpoint
Streaming: Image variation streaming is not supported by Bedrock.

8. Batch API

Request formats: requests array (CustomID + Params) or input_file_id
Pagination: Cursor-based with afterId, beforeId, limit
Endpoints:
  • POST /batch - Create batch
  • GET /batch - List batches
  • GET /batch/{batch_id} - Retrieve batch
  • POST /batch/{batch_id}/cancel - Cancel batch
Response: JSONL format with {recordId, modelOutput: {...}} or {recordId, error: {...}}
Status mapping:
Bedrock Status | Bifrost Mapping
Submitted, Validating | Validating
InProgress | InProgress
Completed | Completed
Failed, PartiallyCompleted | Failed
Stopping | Cancelling
Stopped | Cancelled
Expired | Expired
Note: RFC3339Nano timestamps converted to Unix timestamps, multi-key retry supported
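The status table and the timestamp conversion can be sketched in Python. The regular expression accepts RFC 3339 timestamps with any number of fractional digits and drops the fraction; whether Bifrost truncates or rounds is not stated, so that detail is an assumption, as are the names used here.

```python
import re
from datetime import datetime

BATCH_STATUS = {
    "Submitted": "Validating",
    "Validating": "Validating",
    "InProgress": "InProgress",
    "Completed": "Completed",
    "Failed": "Failed",
    "PartiallyCompleted": "Failed",
    "Stopping": "Cancelling",
    "Stopped": "Cancelled",
    "Expired": "Expired",
}

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def rfc3339_to_unix(timestamp: str) -> int:
    """Convert an RFC3339/RFC3339Nano timestamp to whole Unix seconds."""
    match = _RFC3339.match(timestamp)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {timestamp!r}")
    base, _fraction, zone = match.groups()
    offset = "+0000" if zone == "Z" else zone.replace(":", "")
    parsed = datetime.strptime(base + offset, "%Y-%m-%dT%H:%M:%S%z")
    return int(parsed.timestamp())
```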

9. Files API

S3-backed file operations. Files are stored in S3 buckets integrated with Bedrock.
Upload: Multipart/form-data with file (required) and filename (optional)
Field mapping:
  • id (file ID)
  • filename
  • size_bytes (from S3 object size)
  • created_at (Unix timestamp from S3 LastModified)
  • mime_type (derived from content or explicitly set)
Endpoints:
  • POST /v1/files - Upload
  • GET /v1/files - List (cursor pagination)
  • GET /v1/files/{file_id} - Retrieve metadata
  • DELETE /v1/files/{file_id} - Delete
  • GET /v1/files/{file_id}/content - Download content
Note: File purpose always "batch", status always "processed"

10. List Models

Request: GET /v1/models (no body)
Field mapping:
  • id (model name with deployment prefix if applicable)
  • display_name → name
  • created_at (Unix timestamp)
Pagination: Token-based with NextPageToken, FirstID, LastID
Filtering:
  • Region-based model filtering
  • Deployment mapping from configuration
  • Model allowlist support (allowed_models config)
Multi-key support: Results aggregated from all keys, filtered by allowedModels if configured

11. AWS Authentication & Configuration

Bifrost automatically handles AWS Bedrock authentication via multiple methods including explicit credentials, IAM roles, and bearer tokens with automatic Signature Version 4 (SigV4) signing.

Setup & Configuration

For detailed instructions on setting up AWS Bedrock authentication including credentials, IAM roles, regions, and deployment mapping, see the quickstart guides:
See Provider-Specific Authentication - AWS Bedrock in the Gateway Quickstart for configuration steps using Web UI, API, or config.json.

Endpoints

  • Runtime API: bedrock-runtime.{region}.amazonaws.com/model/{path}
  • Control Plane: bedrock.{region}.amazonaws.com (list models)
  • Batch API: Via bedrock-runtime

12. Error Handling

HTTP Status Mapping:
Status | Bifrost Error Type | Notes
400 | invalid_request_error | Bad request parameters
401 | authentication_error | Invalid/expired credentials
403 | permission_denied_error | Access denied to model/resource
404 | not_found_error | Model or resource not found
429 | rate_limit_error | Rate limit exceeded
500 | api_error | Server error
529 | overloaded_error | Service overloaded
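For client code that needs to branch on these types, the table can be expressed as a simple lookup. The fallback to api_error for unlisted status codes is an assumption; the page does not say how other codes are classified, and the names below are hypothetical.

```python
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_denied_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "api_error",
    529: "overloaded_error",
}


def error_type(status_code: int) -> str:
    # Unlisted codes default to "api_error" (assumption).
    return ERROR_TYPES.get(status_code, "api_error")


def is_retryable(status_code: int) -> bool:
    """Illustrative policy: retry on rate limiting and overload only."""
    return error_type(status_code) in {"rate_limit_error", "overloaded_error"}
```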
Error Response Structure:
type BifrostError struct {
    IsBifrostError bool
    StatusCode     *int
    Error          struct {
        Type    string // Error classification
        Message string // Human-readable message
        Error   error  // Underlying error
    }
}
Special Cases:
  • Context cancellation → RequestCancelled
  • Request timeout → ErrProviderRequestTimedOut
  • Streaming errors → Sent via channel with stream end indicator
  • Response unmarshalling → ErrProviderResponseUnmarshal

Caveats

Severity: High | Behavior: Only base64/data URI images supported; remote URLs not supported | Impact: Requests with URL-based images fail | Code: chat.go:image handling
Severity: High | Behavior: reasoning.max_tokens must be >= 1024 | Impact: Requests with lower values fail with error | Code: chat.go:reasoning validation
Severity: High | Behavior: System messages removed from array, placed in separate system field | Impact: Message array structure differs from input | Code: chat.go:message conversion
Severity: High | Behavior: Consecutive tool messages merged into single user message | Impact: Message count and structure changes | Code: chat.go:tool message handling
Severity: Medium | Behavior: Reasoning/thinking config varies significantly by model family | Impact: Parameter mapping differs for Claude vs Nova vs other families | Code: chat.go, utils.go:model detection
Severity: Medium | Behavior: Text completion streaming returns error | Impact: Streaming not available for legacy completions API | Code: text.go:streaming
Severity: Low | Behavior: response_format converted to special bf_so_* tool | Impact: Tool call count and structure changes internally | Code: chat.go:structured output handling
Severity: Low | Behavior: Model IDs with region prefixes matched against deployment config | Impact: Model availability depends on deployment configuration | Code: models.go:deployment matching