POST /v1/async/chat/completions
Create async chat completion
curl --request POST \
  --url http://localhost:8080/v1/async/chat/completions \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-4",
  "messages": [
    {
      "role": "assistant",
      "name": "<string>",
      "content": "<string>",
      "tool_call_id": "<string>",
      "refusal": "<string>",
      "audio": {
        "id": "<string>",
        "data": "<string>",
        "expires_at": 123,
        "transcript": "<string>"
      },
      "reasoning": "<string>",
      "reasoning_details": [
        {
          "id": "<string>",
          "index": 123,
          "type": "reasoning.summary",
          "summary": "<string>",
          "text": "<string>",
          "signature": "<string>",
          "data": "<string>"
        }
      ],
      "annotations": [
        {
          "type": "<string>",
          "url_citation": {
            "start_index": 123,
            "end_index": 123,
            "title": "<string>",
            "url": "<string>",
            "sources": {},
            "type": "<string>"
          }
        }
      ],
      "tool_calls": [
        {
          "function": {
            "name": "<string>",
            "arguments": "<string>"
          },
          "index": 123,
          "type": "<string>",
          "id": "<string>"
        }
      ]
    }
  ],
  "fallbacks": [
    "<string>"
  ],
  "stream": true,
  "frequency_penalty": 0,
  "logit_bias": {},
  "logprobs": true,
  "max_completion_tokens": 123,
  "metadata": {},
  "modalities": [
    "<string>"
  ],
  "parallel_tool_calls": true,
  "presence_penalty": 0,
  "prompt_cache_key": "<string>",
  "reasoning": {
    "effort": "none",
    "max_tokens": 123
  },
  "response_format": {},
  "safety_identifier": "<string>",
  "service_tier": "<string>",
  "stream_options": {
    "include_obfuscation": true,
    "include_usage": true
  },
  "store": true,
  "temperature": 1,
  "tool_choice": "none",
  "tools": [
    {
      "type": "function",
      "custom": {},
      "cache_control": {
        "type": "ephemeral",
        "ttl": "<string>"
      }
    }
  ],
  "seed": 123,
  "top_p": 0.5,
  "top_logprobs": 10,
  "stop": "<string>",
  "prediction": {
    "type": "<string>",
    "content": "<string>"
  },
  "prompt_cache_retention": "in-memory",
  "web_search_options": {
    "search_context_size": "low",
    "user_location": {
      "type": "<string>",
      "approximate": {
        "city": "<string>",
        "country": "<string>",
        "region": "<string>",
        "timezone": "<string>"
      }
    }
  },
  "truncation": "<string>",
  "user": "<string>",
  "verbosity": "low"
}
'
Example response:

{
  "id": "<string>",
  "status": "pending",
  "created_at": "2023-11-07T05:31:56Z",
  "expires_at": "2023-11-07T05:31:56Z",
  "completed_at": "2023-11-07T05:31:56Z",
  "status_code": 123,
  "result": "<unknown>",
  "error": {
    "event_id": "<string>",
    "type": "<string>",
    "is_bifrost_error": true,
    "status_code": 123,
    "error": {
      "type": "<string>",
      "code": "<string>",
      "message": "<string>",
      "param": "<string>",
      "event_id": "<string>"
    },
    "extra_fields": {
      "provider": "openai",
      "model_requested": "<string>",
      "request_type": "<string>"
    }
  }
}

Headers

x-bf-async-job-result-ttl
integer
default:3600

Time-to-live in seconds for the job result after completion. Defaults to 3600 (1 hour). After expiry, the job result is automatically cleaned up.
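As a sketch, the TTL override can be sent alongside the usual JSON headers. The URL and header name come from the examples above; the request is only constructed here, not sent:

```python
import json
import urllib.request

payload = {
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Build (but do not send) the async request, keeping the job result
# for 2 hours instead of the default 3600 seconds (1 hour).
req = urllib.request.Request(
    "http://localhost:8080/v1/async/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-bf-async-job-result-ttl": "7200",  # seconds; default is 3600
    },
    method="POST",
)
```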

Body

application/json
model
string
required

Model in provider/model format (e.g., openai/gpt-4)

Example:

"openai/gpt-4"

messages
object[]
required

List of messages in the conversation
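Since only `model` and `messages` are required, a valid request body can be far smaller than the full example above. A minimal sketch:

```python
import json

# Minimal body: only "model" and "messages" are required; every other
# field shown in the full request example is optional.
body = {
    "model": "openai/gpt-4",
    "messages": [
        {"role": "user", "content": "Summarize this release note."},
    ],
}

# "model" uses provider/model format, e.g. "openai/gpt-4".
provider, _, model_name = body["model"].partition("/")
print(json.dumps(body))
```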

fallbacks
string[]

Fallback models in provider/model format

stream
boolean

Whether to stream the response

frequency_penalty
number
Required range: -2 <= x <= 2
logit_bias
object
logprobs
boolean
max_completion_tokens
integer
metadata
object
modalities
string[]
parallel_tool_calls
boolean
presence_penalty
number
Required range: -2 <= x <= 2
prompt_cache_key
string
reasoning
object
response_format
object

Format for the response

safety_identifier
string
service_tier
string
stream_options
object
store
boolean
temperature
number
Required range: 0 <= x <= 2
tool_choice
Available options:
none,
auto,
required
tools
object[]
seed
integer

Deterministic sampling seed

top_p
number

Nucleus sampling parameter

Required range: 0 <= x <= 1
top_logprobs
integer

Number of most likely tokens to return at each position

Required range: 0 <= x <= 20
stop

Up to 4 sequences where the API will stop generating tokens
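A client-side guard like the following (an illustrative helper, not part of the API) can keep a request within the four-sequence limit; it assumes `stop` may be a single string or a list of strings, as in the OpenAI-style schema:

```python
def normalize_stop(stop):
    """Coerce a stop value to a list and enforce the 4-sequence limit.

    Assumes stop is either one string or a list of strings.
    """
    seqs = [stop] if isinstance(stop, str) else list(stop)
    if len(seqs) > 4:
        raise ValueError("stop accepts at most 4 sequences")
    return seqs
```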

prediction
object

Predicted output content for the model to reference (OpenAI only). Can reduce latency.

prompt_cache_retention
enum<string>

Prompt cache retention policy

Available options:
in-memory,
24h
web_search_options
object

Web search options for chat completions (OpenAI only)

truncation
string
user
string
verbosity
enum<string>
Available options:
low,
medium,
high

Response

Job accepted for processing

Response returned when creating or polling an async job

id
string
required

Unique identifier for the async job

status
enum<string>
required

The status of an async job

Available options:
pending,
processing,
completed,
failed
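A client typically polls until the job leaves the two in-flight states. This sketch encodes the four documented statuses (the function name is illustrative, and the polling call itself is not shown):

```python
# The four documented job statuses, split by whether the job is finished.
IN_FLIGHT = {"pending", "processing"}
TERMINAL = {"completed", "failed"}

def is_done(job: dict) -> bool:
    """Return True once the job's status is terminal (completed or failed)."""
    status = job["status"]
    if status not in IN_FLIGHT | TERMINAL:
        raise ValueError(f"unknown job status: {status!r}")
    return status in TERMINAL
```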
created_at
string<date-time>
required

When the job was created

expires_at
string<date-time>

When the job result expires and will be cleaned up

completed_at
string<date-time>

When the job completed (successfully or with failure)

status_code
integer

HTTP status code of the completed operation

result
any

The result of the completed operation (shape depends on the request type)

error
object

Error response from Bifrost
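When a job fails, the nested `error` object carries the details. A sketch (the helper name is illustrative) for surfacing a readable message from the documented shape, using `error.error.message` and `error.extra_fields.provider`:

```python
def describe_error(job: dict) -> str:
    """Summarize a failed job from the documented error shape."""
    err = job.get("error") or {}
    inner = err.get("error") or {}
    provider = (err.get("extra_fields") or {}).get("provider", "unknown")
    message = inner.get("message") or err.get("type") or "unknown error"
    return f"[{provider}] {message}"

failed_job = {
    "id": "job_123",
    "status": "failed",
    "error": {
        "type": "provider_error",
        "error": {"message": "rate limit exceeded"},
        "extra_fields": {"provider": "openai"},
    },
}
print(describe_error(failed_job))  # [openai] rate limit exceeded
```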