

Overview

Bifrost ships a comprehensive end-to-end test harness (tests/e2e/api/collections/provider-harness.json) that exercises every provider’s translation surface. This page documents which features are covered for each provider, sourced from each upstream’s official docs and verified against what’s in the harness collection today. Total: 321 requests across 12 folders covering the native API, drop-in routes, cross-model routing, passthrough endpoints, feature variations, and explicit backlog coverage.

Run the harness with:
make run-provider-harness-test

[PREVIEW] tag: requests prefixed [PREVIEW] are gated behind INCLUDE_PREVIEW=1 (default-skipped) because they target account- or region-scoped resources — preview-model deployments, vector stores, cached content references, MCP servers — that the harness can’t provision in arbitrary environments. To include them:
make run-provider-harness-test INCLUDE_PREVIEW=1

Output:
  • tmp/newman-report.html — rich HTML report
  • tmp/harness-failures.md — categorized failures + coverage matrices
  • interactive viewer at http://localhost:8090 with Resend + Copy curl

Status legend

  • ✅ — exercised by the harness, expected to pass against a properly configured upstream
  • ✅* — exercised, but needs an environment-side resource the harness can’t manufacture (vector store, cached content reference, real audio bytes, MCP server, preview deployment, etc.). These rows are typically [PREVIEW]-tagged in the collection so they are skipped by default; opt in with make run-provider-harness-test INCLUDE_PREVIEW=1
  • (blank) — provider supports this feature but the harness doesn’t yet exercise it (gap; PRs welcome)

Features a provider doesn’t natively support are simply omitted from that provider’s table — there’s no N/A row, since each table only lists features within that provider’s own API surface.

Per-provider coverage

OpenAI

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE)
Stop sequences
Sampling params (temperature / top_p)
Logprobs / top_logprobs
Seed (deterministic outputs)
Predicted outputs
Function calling (custom tool)
Tool choice forced (required)
Tool choice specific function
Parallel tool calls
Strict tool input
MCP toolset ✅*
Web search (basic)
Code interpreter (Responses API)
File search (vector store) ✅*
Computer use (Responses API) ✅*
Vision (image input)
PDF input ✅*
Audio input ✅*
Reasoning effort (reasoning_effort)
Reasoning summary (Responses API)
Structured output (json_schema)
Response format JSON object
Prompt caching (ephemeral)
Stream options w/ usage
Service tier (auto / flex / priority)
Background mode (Responses async)
Truncation strategy (auto)
Include array (Responses)
Custom tool (Responses)
Skills / container
Token counting (/v1/responses/input_tokens)
Batch API (create / list)
Files API (list)
Models list

Anthropic

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE)
Stop sequences
Function calling (custom tool)
Tool choice forced (any)
Tool choice specific function
Parallel tool calls
Strict tool input
Tool input examples
Defer loading
Allowed callers
Eager input streaming
Tool search (BM25 / regex)
MCP toolset ✅*
Web search (basic / dynamic filtering / domain filter / user location)
Web fetch
Code execution
Computer use
Text editor tool
Bash tool
Memory tool
Vision (image input)
PDF input (URL or base64 source)
Citations on document blocks
Extended thinking (thinking.budget_tokens)
Adaptive thinking (thinking.adaptive)
Interleaved thinking (beta)
Structured output (json_schema)
Output config / effort
Prompt caching (ephemeral)
Prompt caching (1-hour TTL)
Service tier
anthropic-beta header
Skills / container
Context management / 1M context
Compaction beta
Token-efficient tools beta
Fine-grained tool streaming
Fast mode (Opus 4.6) ✅*
Redact thinking beta
Token counting (/v1/messages/count_tokens)
Batch API (create / list)
Files API (list)
Models list

Bedrock

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE / AWS event-stream)
Stop sequences
Sampling params (temperature OR top_p, not both on Anthropic models)
Function calling
Tool choice forced
Tool choice specific function
Parallel tool calls
MCP toolset
Web search (basic / dynamic filtering / domain filter / user location)
Web fetch
Code execution
Computer use
Text editor tool
Bash tool
Memory tool
Vision (image; URLs auto-fetched + inlined by Bifrost)
PDF input (URLs auto-fetched + inlined)
Citations on document blocks
Extended thinking (thinking.budget_tokens)
Interleaved thinking (beta)
Structured output (json_schema via tool-mode workaround)
Output config / effort
Prompt caching (ephemeral)
Prompt caching (1-hour TTL)
anthropic-beta header passthrough
Context management / 1M context
Cross-region inference (global. prefix)
Cross-region inference (us. prefix)
Native Converse (/bedrock/model/{m}/converse)
Native InvokeModel (/bedrock/model/{m}/invoke)
Model invocation jobs (Batch API equivalent)
Batch API

Gemini (Google AI Studio)

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE)
Stop sequences
Sampling params (temperature / top_p / top_k)
Logprobs
Function calling
Tool choice forced (any)
Tool choice specific function
Parallel tool calls
Web search (basic)
Code execution
URL context tool
Google search grounding
Vision (image input)
PDF input ✅*
Audio input
YouTube URL input
Thinking budget (thinkingConfig)
Structured output (json_schema)
Response MIME type
Prompt caching (implicit, via cached content)
Cached content reference (cachedContent: "cachedContents/{id}") ✅*
Cached contents lifecycle (POST/GET/PATCH/DELETE /v1beta/cachedContents)
Cached contents list (GET /v1beta/cachedContents)
Safety settings
Token counting (:countTokens)
Files API (list)
Models list

Vertex AI

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE)
Sampling params (temperature / top_p)
Function calling
Tool choice forced
Defer loading (Anthropic)
Allowed callers (Anthropic)
Tool search (Anthropic BM25)
MCP toolset ✅*
Web search (basic / dynamic filtering / domain filter / user location)
Web fetch
Code execution
Google search grounding
Computer use (Anthropic)
Text editor tool (Anthropic)
Bash tool (Anthropic)
Memory tool (Anthropic)
Vision (image input)
PDF input (URLs auto-fetched + inlined for Vertex-Anthropic)
Audio input (Gemini) ✅*
Citations on document blocks
Extended thinking (thinking.budget_tokens)
Adaptive thinking
Thinking budget (Gemini thinkingConfig)
Interleaved thinking (Anthropic beta)
Structured output (json_schema)
Response MIME type (Gemini)
Output config / effort (Anthropic)
Prompt caching (ephemeral)
Prompt caching (1-hour TTL)
Cached contents lifecycle (POST/GET/PATCH/DELETE /v1/projects/.../cachedContents)
anthropic-beta header passthrough
Safety settings (Gemini)
Context management / 1M context (Anthropic)
Token counting
Model Garden (Llama / Mistral) ✅*

Azure OpenAI

Feature | Status
Basic chat
System message
Multi-turn conversation
Streaming (SSE)
Stop sequences
Sampling params (temperature / top_p)
Logprobs / top_logprobs
Seed (deterministic outputs)
Predicted outputs
Function calling
Tool choice forced
Tool choice specific function
Parallel tool calls
Strict tool input
Code interpreter (Responses preview)
File search (Responses preview) ✅*
Vision (image input)
Audio input (gpt-4o-audio-preview) ✅*
Reasoning effort (o3 deployment) ✅*
Structured output (json_schema)
Response format JSON object
Service tier (auto / flex / priority)
Skills / container
Azure On Your Data (azure_search)

Bifrost-side normalizations applied automatically (these don’t appear as separate rows):
  • Vision URL images on Bedrock — fetched and inlined as base64 (Bedrock Converse only accepts inline bytes)
  • PDF URL documents on Bedrock — same fetch+inline path
  • PDF URL documents on Vertex Claude — same (Vertex-Anthropic doesn’t accept URL document sources)
  • Anthropic-style {type:"document",source:{...}} blocks on /v1/chat/completions — normalized to {type:"file",file:{...}} at JSON unmarshal so every provider’s converter sees the same shape
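The last normalization above can be sketched in Python (Bifrost performs it in Go at JSON unmarshal; the exact field mapping inside "file" is an assumption here, shown only to illustrate the shape change):

```python
def normalize_document_blocks(message: dict) -> dict:
    """Rewrite Anthropic-style {"type": "document", "source": {...}} content
    blocks into the {"type": "file", "file": {...}} shape, so every
    provider converter downstream sees the same block type."""
    content = message.get("content")
    if not isinstance(content, list):
        return message  # plain-string content needs no rewriting
    rewritten = []
    for block in content:
        if isinstance(block, dict) and block.get("type") == "document":
            # assumed mapping: the "source" payload moves under "file"
            rewritten.append({"type": "file", "file": block.get("source", {})})
        else:
            rewritten.append(block)
    return {**message, "content": rewritten}
```

Blocks of any other type (text, image, tool_use, …) pass through untouched, which is what lets the same converter code serve every provider.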

Cross-cutting (Bifrost-specific)

These exercise Bifrost’s translation layer between provider shapes — every check uses the unified POST /v1/chat/completions endpoint with provider/model prefix routing.
Feature | Status
Cross-model routing (50 models × 5 providers)
Cross-cut function calling (4 providers)
Cross-cut structured output
Cross-cut streaming
Cross-cut vision
Cross-cut web search
Cross-cut code execution
Cross-cut tool choice forced
Cross-cut extended thinking
Cross-cut prompt caching
Cross-cut stop sequences
Failover via X-Bifrost-Fallback-Models
Virtual key auth via X-Bifrost-VK
Sampling-params auto-strip for Opus 4.7+
Computer-use generation auto-correct (4.5↔4.6 pairing)
Rate limit propagation with Retry-After

Passthrough surface (*_passthrough/*)

Catch-all forwarding routes that strip incoming auth headers and inject Bifrost’s configured provider key.
Feature | OpenAI | Anthropic | Bedrock | Azure | Gemini
Basic chat — N/A
Streaming — N/A N/A N/A N/A
Vision — N/A N/A N/A N/A
Web search — N/A N/A
Code execution / code interpreter — N/A N/A N/A
Function calling / tool use — N/A N/A N/A N/A
Computer use — N/A N/A N/A N/A
Extended thinking — N/A N/A N/A N/A
Prompt caching — N/A N/A N/A N/A
Bedrock: passthrough is not supported by design. AWS SigV4 signing requires Bifrost to sign the request with its own credentials, which fundamentally conflicts with byte-for-byte forwarding. Use the typed /bedrock/model/{modelId}/converse, /converse-stream, or /invoke routes instead — those go through Bifrost’s typed Bedrock provider with proper SigV4 handling.
Vertex: no passthrough variant. Google OAuth bearer tokens are rotated per-request and can’t be bridged through a byte-for-byte forward.
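The auth-header strip plus key injection that the passthrough routes perform can be sketched as follows (header names differ per provider — Authorization vs. x-api-key — and this is an illustrative sketch, not Bifrost’s implementation):

```python
def prepare_passthrough_headers(incoming: dict, key_header: str, provider_key: str) -> dict:
    """Drop any caller-supplied credential headers, then inject Bifrost's
    configured provider credential before forwarding the request."""
    stripped = {
        name: value for name, value in incoming.items()
        if name.lower() not in {"authorization", "x-api-key"}  # assumed strip list
    }
    stripped[key_header] = provider_key
    return stripped
```

Non-credential headers (Accept, Content-Type, beta flags, etc.) pass through unchanged, which is what keeps the forwarding byte-for-byte for everything except auth.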

Coverage by transport route

Route | What it tests
drop-in | Provider-native shape via /openai, /anthropic, /bedrock, /genai. Most thorough coverage.
cross-model | Unified /v1/chat/completions with provider/model prefix. Tests Bifrost’s translation layer.
passthrough | /*_passthrough/* byte-for-byte forwarding with auth-header strip + Bifrost key injection.

Known gaps that need external setup

These cells remain unresolved even after a full run because they require provider-side state the harness can’t manufacture:
  • OpenAI File Search — needs a real vs_* vector store ([PREVIEW]-tagged)
  • OpenAI / Azure Audio Input — needs real base64 audio bytes ([PREVIEW]-tagged)
  • OpenAI / Anthropic Batch creation — needs a real input file ID
  • Gemini Cached Content reference — the lifecycle endpoints (create/list/retrieve/update/delete) work end-to-end; only the referencing tests (passing cachedContents/{id} to generateContent) need a pre-provisioned cache with 32k+ tokens of content (Gemini’s minimum) ([PREVIEW]-tagged)
  • Vertex Anthropic features in us-central1 — region-restricted unless GOOGLE_LOCATION=global
  • Vertex preview-model deployments (Gemini-3.x, etc.) — [PREVIEW]-tagged; require account access
  • Vertex-Anthropic URL document sources — Vertex doesn’t accept URL document sources upstream; bifrost auto-fetches and inlines via inlineDocumentURLs for parity with direct Anthropic
  • Azure preview deployments (o3 / gpt-4o-audio-preview / computer-use-preview) — [PREVIEW]-tagged; require deployment provisioning
  • MCP toolset tests — [PREVIEW]-tagged; need a reachable MCP server

Coverage report layout

When you run the harness, tmp/harness-failures.md is generated with three matrices:
  1. Feature × Provider — every feature row × every provider column
  2. Feature × Route — same features × drop-in / cross-model / passthrough
  3. Per-(provider, model) — every distinct provider/model tuple, with the exact features each one exercised
Plus three “missing coverage” lists (per-provider, per-route, per-model) that surface gaps as a backlog.

Extending coverage

The full backlog of candidate test additions lives in tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md — a provider-by-provider feature inventory sourced from each provider’s official docs. Adding a new test is a one-line entry in the relevant sub-folder of provider-harness.json.
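For illustration, a new entry might look like the following Postman-collection-style item (the request name, URL variable, model, and body here are all hypothetical — copy an existing entry in the relevant sub-folder for the exact conventions the harness expects):

```python
import json

# Hypothetical harness entry in the Postman collection v2.1 item shape;
# every name and value is illustrative, not an actual collection entry.
new_item = {
    "name": "OpenAI - seed determinism",
    "request": {
        "method": "POST",
        "header": [{"key": "Content-Type", "value": "application/json"}],
        "url": {"raw": "{{baseUrl}}/v1/chat/completions"},
        "body": {
            "mode": "raw",
            "raw": json.dumps({
                "model": "openai/gpt-4o-mini",
                "messages": [{"role": "user", "content": "Say hi."}],
                "seed": 42,
            }),
        },
    },
}
```

Once the entry sits in the right sub-folder, the coverage matrices in tmp/harness-failures.md pick it up automatically on the next run.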