A valid request URL is required to generate request examples{
"id": "<string>",
"object": "response.compaction",
"model": "<string>",
"created_at": 123,
"output": [
{
"id": "<string>",
"content": "<string>",
"call_id": "<string>",
"name": "<string>",
"arguments": "<string>",
"output": {},
"action": {},
"error": "<string>",
"queries": [
"<string>"
],
"results": [
{}
],
"summary": [
{
"text": "<string>"
}
],
"encrypted_content": "<string>"
}
],
"usage": {
"input_tokens": 123,
"input_tokens_details": {
"text_tokens": 123,
"audio_tokens": 123,
"image_tokens": 123,
"cached_read_tokens": 123,
"cached_write_tokens": 123
},
"output_tokens": 123,
"output_tokens_details": {
"text_tokens": 123,
"accepted_prediction_tokens": 123,
"audio_tokens": 123,
"reasoning_tokens": 123,
"rejected_prediction_tokens": 123,
"citation_tokens": 123,
"num_search_queries": 123
},
"total_tokens": 123,
"cost": {
"input_tokens_cost": 123,
"output_tokens_cost": 123,
"reasoning_tokens_cost": 123,
"citation_tokens_cost": 123,
"search_queries_cost": 123,
"request_cost": 123,
"total_cost": 123
}
},
"extra_fields": {
"request_type": "<string>",
"model_requested": "<string>",
"model_deployment": "<string>",
"latency": 123,
"chunk_index": 123,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "<string>",
"hit_type": "<string>",
"requested_provider": "<string>",
"requested_model": "<string>",
"provider_used": "<string>",
"model_used": "<string>",
"input_tokens": 123,
"threshold": 123,
"similarity": 123
}
}
}Compact context
Compresses a conversation into an opaque encrypted compaction item using the OpenAI-compatible context compaction API.
The response output array contains the original user messages plus a final item
with type: "response.compaction" and an encrypted_content field. Pass the
full output array as input to future Responses API requests to continue the
conversation without retransmitting the full history.
Supported providers: OpenAI, Azure OpenAI, xAI. Requests to unsupported providers return a 400 error.
A valid request URL is required to generate request examples{
"id": "<string>",
"object": "response.compaction",
"model": "<string>",
"created_at": 123,
"output": [
{
"id": "<string>",
"content": "<string>",
"call_id": "<string>",
"name": "<string>",
"arguments": "<string>",
"output": {},
"action": {},
"error": "<string>",
"queries": [
"<string>"
],
"results": [
{}
],
"summary": [
{
"text": "<string>"
}
],
"encrypted_content": "<string>"
}
],
"usage": {
"input_tokens": 123,
"input_tokens_details": {
"text_tokens": 123,
"audio_tokens": 123,
"image_tokens": 123,
"cached_read_tokens": 123,
"cached_write_tokens": 123
},
"output_tokens": 123,
"output_tokens_details": {
"text_tokens": 123,
"accepted_prediction_tokens": 123,
"audio_tokens": 123,
"reasoning_tokens": 123,
"rejected_prediction_tokens": 123,
"citation_tokens": 123,
"num_search_queries": 123
},
"total_tokens": 123,
"cost": {
"input_tokens_cost": 123,
"output_tokens_cost": 123,
"reasoning_tokens_cost": 123,
"citation_tokens_cost": 123,
"search_queries_cost": 123,
"request_cost": 123,
"total_cost": 123
}
},
"extra_fields": {
"request_type": "<string>",
"model_requested": "<string>",
"model_deployment": "<string>",
"latency": 123,
"chunk_index": 123,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "<string>",
"hit_type": "<string>",
"requested_provider": "<string>",
"requested_model": "<string>",
"provider_used": "<string>",
"model_used": "<string>",
"input_tokens": 123,
"threshold": 123,
"similarity": 123
}
}
}Authorizations
Bearer token authentication. Use your provider API key or Bifrost authentication token.
Virtual keys (prefixed with sk-bf-) can also be passed here.
Body
Model in provider/model format (e.g., "openai/gpt-4o")
Conversation to compact. Required unless previous_response_id is provided. Can be a string or array of ResponsesMessage objects. Compaction items from prior responses (type "response.compaction") may be included in the array.
System instructions that persist across the compacted context.
ID of a previous response to extend rather than sending full input.
Fallback model list in provider/model format.
Response
Successful compaction response
Always "response.compaction"
"response.compaction"
The compacted output — the original user messages plus a final item of type "response.compaction" whose encrypted_content holds the opaque compacted state. Pass the full output array as input to a future Responses API request.
Show child attributes
Show child attributes
Show child attributes
Show child attributes
Additional fields included in responses
Show child attributes
Show child attributes
Was this page helpful?

