{
"id": "<string>",
"type": "message",
"role": "assistant",
"content": [
{
"text": "<string>",
"thinking": "<string>",
"signature": "<string>",
"data": "<string>",
"tool_use_id": "<string>",
"id": "<string>",
"name": "<string>",
"input": {},
"server_name": "<string>",
"content": "<string>",
"source": {
"media_type": "<string>",
"data": "<string>",
"url": "<string>"
},
"cache_control": {
"ttl": "<string>"
},
"citations": {
"enabled": true
},
"context": "<string>",
"title": "<string>"
}
],
"model": "<string>",
"stop_sequence": "<string>",
"usage": {
"input_tokens": 123,
"output_tokens": 123,
"cache_creation_input_tokens": 123,
"cache_read_input_tokens": 123,
"cache_creation": {
"ephemeral_5m_input_tokens": 123,
"ephemeral_1h_input_tokens": 123
}
}
}
Creates a message using the Anthropic Messages API format. Supports streaming via SSE.
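Since the endpoint can stream via SSE, a client has to read `data:` lines and stitch the partial text together. The sketch below is a minimal illustration, assuming the stream uses Anthropic's event shapes (`content_block_delta` events carrying a `text_delta`), which this page does not spell out. `collect_text` is a hypothetical helper name, and the input is any iterable of decoded lines from the HTTP response.

```python
import json
from typing import Iterable


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas from an iterable of SSE lines.

    Assumes Anthropic-style streaming events; other event types are ignored.
    """
    parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON payloads instead of failing
        if not isinstance(event, dict):
            continue
        if event.get("type") == "content_block_delta":
            delta = event.get("delta") or {}
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)
```

In practice the lines would come from a streaming HTTP response body; the helper only depends on the line format, so it can be exercised with a plain list of strings.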
Async inference: Send x-bf-async: true to submit the request as a background job and receive a job ID immediately. Poll with x-bf-async-id: <job-id> to retrieve the result. When the job is still processing, the response will have an empty content array. When completed, content will contain the full result. See Async Inference for details.
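The polling behaviour described above can be sketched as a small loop: keep sending `x-bf-async-id` until `content` is no longer empty. This is an illustration under assumptions, not Bifrost's client. `poll_async_job` and its `fetch` callable are hypothetical, `fetch` stands in for whatever HTTP call your client makes (including authentication), and the job ID is passed in because this page does not show where it appears in the initial response.

```python
import time
from typing import Any, Callable, Dict


def poll_async_job(
    fetch: Callable[[Dict[str, str]], Dict[str, Any]],
    job_id: str,
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the response has a non-empty `content` array.

    `fetch` receives the extra headers to send and returns the parsed JSON body.
    """
    headers = {"x-bf-async-id": job_id}
    for attempt in range(max_attempts):
        response = fetch(headers)
        if response.get("content"):
            return response  # completed: content holds the full result
        if attempt < max_attempts - 1:
            sleep(interval)  # still processing: content is empty
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")
```

Injecting `fetch` and `sleep` keeps the loop independent of any HTTP library and easy to test with canned responses.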
Bearer token authentication. Use your provider API key or Bifrost authentication token.
Virtual keys (prefixed with sk-bf-) can also be passed here.
Set x-bf-async to true to submit this request as an async job. The response returns immediately with a job ID. Not compatible with streaming.
true
Set x-bf-async-id to the job ID returned from the initial async request to poll for the result of that job.
Override the default result TTL in seconds. Results expire after this duration from completion time.
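The headers above can be assembled as in the following sketch. It assumes the bearer token travels in the standard `Authorization: Bearer <token>` header; `build_headers` is a hypothetical helper, and the header for the result TTL override is left out because its name is not given here.

```python
from typing import Dict, Optional


def build_headers(
    token: str,
    *,
    submit_async: bool = False,
    poll_job_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for a message request.

    `token` may be a provider API key, a Bifrost authentication token,
    or a virtual key (prefixed with sk-bf-).
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if submit_async:
        headers["x-bf-async"] = "true"  # submit as a background job
    if poll_job_id is not None:
        headers["x-bf-async-id"] = poll_job_id  # poll an earlier job
    return headers
```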
Model identifier (e.g., claude-3-opus-20240229)
"claude-3-opus-20240229"
Maximum tokens to generate
List of messages in the conversation
System prompt
Automatic caching directives for the whole request
Whether to stream the response
0 <= x <= 1
MCP servers configuration (requires beta header)
Structured output format (requires beta header)
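A request body covering the core fields described above might be built as follows. The field names (`model`, `max_tokens`, `messages`, `system`, `stream`) are taken from the Anthropic Messages format and are an assumption here, since this page lists only the descriptions. `build_message_body` is a hypothetical helper; it also enforces the rule that async submission is not compatible with streaming.

```python
from typing import Any, Dict, List, Optional


def build_message_body(
    model: str,
    max_tokens: int,
    messages: List[Dict[str, Any]],
    *,
    system: Optional[str] = None,
    stream: bool = False,
    submit_async: bool = False,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body (field names assumed)."""
    if stream and submit_async:
        # The async header is documented as not compatible with streaming.
        raise ValueError("async submission cannot be combined with streaming")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    body: Dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": messages,
    }
    if system is not None:
        body["system"] = system
    if stream:
        body["stream"] = True
    return body
```

For example, `build_message_body("claude-3-opus-20240229", 256, [{"role": "user", "content": "Hello"}])` yields a dictionary that can be passed to `json.dumps` and sent as the request payload.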
Successful response
end_turn, max_tokens, stop_sequence, tool_use, pause_turn, refusal, model_context_window_exceeded, null
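Reading a successful response usually comes down to joining the text of the content blocks and checking token usage. The sketch below works against the response shape shown in the example (`content`, `usage.input_tokens`, `usage.output_tokens`). It assumes the stop values listed above are carried in a `stop_reason` field, as in Anthropic's format, and `summarize_response` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract joined text, total tokens, and a truncation flag."""
    blocks = response.get("content") or []
    # Keep only blocks that carry a string `text` value.
    text = "".join(
        block["text"]
        for block in blocks
        if isinstance(block, dict) and isinstance(block.get("text"), str)
    )
    usage = response.get("usage") or {}
    total_tokens = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    return {
        "text": text,
        "total_tokens": total_tokens,
        # Assumed field name; "max_tokens" means the output hit the token limit.
        "truncated": response.get("stop_reason") == "max_tokens",
    }
```

An empty `content` array, as returned while an async job is still processing, yields an empty `text` string without raising.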