A valid request URL is required to generate request examples{
"id": "<string>",
"object": "response.compaction",
"model": "<string>",
"created_at": 123,
"output": [
{
"id": "<string>",
"content": "<string>",
"call_id": "<string>",
"name": "<string>",
"arguments": "<string>",
"output": {},
"action": {},
"error": "<string>",
"queries": [
"<string>"
],
"results": [
{}
],
"summary": [
{
"text": "<string>"
}
],
"encrypted_content": "<string>"
}
],
"usage": {
"input_tokens": 123,
"input_tokens_details": {
"text_tokens": 123,
"audio_tokens": 123,
"image_tokens": 123,
"cached_read_tokens": 123,
"cached_write_tokens": 123
},
"output_tokens": 123,
"output_tokens_details": {
"text_tokens": 123,
"accepted_prediction_tokens": 123,
"audio_tokens": 123,
"reasoning_tokens": 123,
"rejected_prediction_tokens": 123,
"citation_tokens": 123,
"num_search_queries": 123
},
"total_tokens": 123,
"cost": {
"input_tokens_cost": 123,
"output_tokens_cost": 123,
"reasoning_tokens_cost": 123,
"citation_tokens_cost": 123,
"search_queries_cost": 123,
"request_cost": 123,
"total_cost": 123
}
},
"extra_fields": {
"request_type": "<string>",
"model_requested": "<string>",
"model_deployment": "<string>",
"latency": 123,
"chunk_index": 123,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "<string>",
"hit_type": "<string>",
"requested_provider": "<string>",
"requested_model": "<string>",
"provider_used": "<string>",
"model_used": "<string>",
"input_tokens": 123,
"threshold": 123,
"similarity": 123
}
}
}Compact context (OpenAI)
Compresses a conversation into an opaque compaction item using the OpenAI-compatible
/v1/responses/compact endpoint. Drop-in compatible with the OpenAI SDK.
The response output contains the user messages plus a final item with
type: "response.compaction" and encrypted_content. Pass this output as input
to future Responses API calls to continue the conversation using the compacted context.
Note: This endpoint also works without the /v1 prefix (e.g., /openai/responses/compact).
A valid request URL is required to generate request examples{
"id": "<string>",
"object": "response.compaction",
"model": "<string>",
"created_at": 123,
"output": [
{
"id": "<string>",
"content": "<string>",
"call_id": "<string>",
"name": "<string>",
"arguments": "<string>",
"output": {},
"action": {},
"error": "<string>",
"queries": [
"<string>"
],
"results": [
{}
],
"summary": [
{
"text": "<string>"
}
],
"encrypted_content": "<string>"
}
],
"usage": {
"input_tokens": 123,
"input_tokens_details": {
"text_tokens": 123,
"audio_tokens": 123,
"image_tokens": 123,
"cached_read_tokens": 123,
"cached_write_tokens": 123
},
"output_tokens": 123,
"output_tokens_details": {
"text_tokens": 123,
"accepted_prediction_tokens": 123,
"audio_tokens": 123,
"reasoning_tokens": 123,
"rejected_prediction_tokens": 123,
"citation_tokens": 123,
"num_search_queries": 123
},
"total_tokens": 123,
"cost": {
"input_tokens_cost": 123,
"output_tokens_cost": 123,
"reasoning_tokens_cost": 123,
"citation_tokens_cost": 123,
"search_queries_cost": 123,
"request_cost": 123,
"total_cost": 123
}
},
"extra_fields": {
"request_type": "<string>",
"model_requested": "<string>",
"model_deployment": "<string>",
"latency": 123,
"chunk_index": 123,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "<string>",
"hit_type": "<string>",
"requested_provider": "<string>",
"requested_model": "<string>",
"provider_used": "<string>",
"model_used": "<string>",
"input_tokens": 123,
"threshold": 123,
"similarity": 123
}
}
}Authorizations
Bearer token authentication. Use your provider API key or Bifrost authentication token.
Virtual keys (prefixed with sk-bf-) can also be passed here.
Body
Model identifier (e.g., "gpt-4o"). When routing through Bifrost's provider-prefixed path, use "provider/model" format.
Conversation to compact. Required unless previous_response_id is set. Compaction items (type "response.compaction") from prior responses may be included.
System instructions that persist across the compacted context.
ID of a previous response to extend rather than sending full input.
Bifrost-specific — fallback model list in provider/model format.
Response
Successful compaction response
Always "response.compaction"
"response.compaction"
The compacted output — the original user messages plus a final item of type "response.compaction" whose encrypted_content holds the opaque compacted state. Pass the full output array as input to a future Responses API request.
Show child attributes
Show child attributes
Show child attributes
Show child attributes
Additional fields included in responses
Show child attributes
Show child attributes
Was this page helpful?

