
Overview

Bifrost computes request costs using a built-in pricing catalog that is automatically synced from a remote datasheet. Custom Pricing lets you override those catalog prices at runtime without redeploying, applying your own rates for any model across any combination of provider, key, and virtual key scopes.

Key capabilities:
  • Scoped overrides — apply prices globally or narrow them to a specific provider, provider key, or virtual key
  • Pattern matching — target an exact model name or a wildcard prefix (e.g. gpt-4*)
  • Request type filtering — restrict an override to one or more specific operations (chat, embeddings, image generation, etc.); at least one request type is required
  • Hierarchical resolution — the most-specific matching override always wins; broader scopes act as fallbacks

Pricing data source

Overrides are applied on top of a pricing catalog, so Bifrost needs one to work from. By default it ships with built-in prices and syncs them every 24 hours. If you maintain your own datasheet, you can point Bifrost at a custom pricing URL:
  1. Navigate to Models in the sidebar
  2. Click the Pricing Settings tab
  3. Enter your pricing datasheet URL in the Pricing Datasheet URL field
  4. Set the Pricing Sync Interval (in hours)
  5. Click Save

Scope hierarchy

Every override is assigned a scope kind that determines which requests it applies to. When Bifrost resolves pricing for a request, it evaluates all matching overrides and selects the one with the most specific scope, in this order:
virtual_key_provider_key  (most specific)
virtual_key_provider
virtual_key
provider_key
provider
global                    (least specific / catch-all)
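
The ordering above can be sketched as a simple ranking. This is a minimal illustration, not Bifrost's actual resolver: the function name pick_most_specific and the dictionary shape of an override are assumptions made for the example, and it presumes the overrides passed in have already matched the request's model and request type.

```python
# Scope kinds from most to least specific, in the order listed above.
SCOPE_ORDER = [
    "virtual_key_provider_key",
    "virtual_key_provider",
    "virtual_key",
    "provider_key",
    "provider",
    "global",
]


def pick_most_specific(matching_overrides):
    """Return the override with the most specific scope, or None if none matched.

    Hypothetical helper: assumes each override is a dict with a "scope_kind" key.
    """
    if not matching_overrides:
        return None
    # A lower index in SCOPE_ORDER means a more specific scope.
    return min(matching_overrides, key=lambda o: SCOPE_ORDER.index(o["scope_kind"]))


candidates = [
    {"id": "global-default", "scope_kind": "global"},
    {"id": "anthropic-flat-rate", "scope_kind": "provider"},
    {"id": "vk-prod-rate", "scope_kind": "virtual_key"},
]
print(pick_most_specific(candidates)["id"])  # vk-prod-rate
```

Here the virtual_key override wins, and the provider and global overrides would only be used if it were absent.
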
Scope kinds and their required identifiers:
| Scope kind | Required | Description |
|---|---|---|
| global | (none) | Applies to every request regardless of provider, key, or virtual key |
| provider | provider_id | Applies to all keys under a specific provider |
| provider_key | provider_key_id | Applies to a specific provider API key only |
| virtual_key | virtual_key_id | Applies to all requests made under a virtual key |
| virtual_key_provider | virtual_key_id + provider_id | Applies when a virtual key routes to a specific provider |
| virtual_key_provider_key | virtual_key_id + provider_key_id | Most specific: virtual key + exact provider API key |
Scope identifiers are exclusive to their scope kind — you cannot mix them. For example, virtual_key_provider requires virtual_key_id and provider_id and must not include provider_key_id.
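
The identifier rules can be checked before an override is submitted. The sketch below is a client-side illustration only: the helper name scope_errors and its error messages are hypothetical, and Bifrost's own validation may differ in wording and behavior.

```python
# Required identifiers per scope kind, as described in the table above.
REQUIRED_IDS = {
    "global": set(),
    "provider": {"provider_id"},
    "provider_key": {"provider_key_id"},
    "virtual_key": {"virtual_key_id"},
    "virtual_key_provider": {"virtual_key_id", "provider_id"},
    "virtual_key_provider_key": {"virtual_key_id", "provider_key_id"},
}
ALL_IDS = {"provider_id", "provider_key_id", "virtual_key_id"}


def scope_errors(override):
    """Return a list of problems with the scope identifiers (empty if valid).

    Hypothetical helper: assumes the override is a dict shaped like the
    JSON examples in this document.
    """
    kind = override.get("scope_kind")
    if kind not in REQUIRED_IDS:
        return [f"unknown scope_kind: {kind!r}"]
    present = {name for name in ALL_IDS if override.get(name)}
    required = REQUIRED_IDS[kind]
    errors = [f"missing {name}" for name in sorted(required - present)]
    errors += [f"unexpected {name}" for name in sorted(present - required)]
    return errors


print(scope_errors({
    "scope_kind": "virtual_key_provider",
    "virtual_key_id": "vk-abc123",
    "provider_id": "anthropic",
    "provider_key_id": "key-1",  # not allowed for this scope kind
}))  # ['unexpected provider_key_id']
```
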

Pattern matching

The pattern field controls which model names the override applies to. The match_type field controls how the pattern is interpreted.
| Match type | Behavior | Example |
|---|---|---|
| exact | Matches only the exact model name | gpt-4o matches only gpt-4o |
| wildcard | Prefix match; pattern must end with * | gpt-4* matches gpt-4o, gpt-4-turbo, gpt-4o-mini |
For wildcard patterns, append a * at the end of the prefix. For example, claude-3* will match all Claude 3 variants.
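
The two match types can be illustrated with a few lines of Python. This is a sketch of the documented behavior under the assumption that matching is a plain, case-sensitive string comparison; the function name model_matches is made up for the example.

```python
def model_matches(pattern, match_type, model):
    """Return True if the model name matches the pattern under the given match type."""
    if match_type == "exact":
        return model == pattern
    if match_type == "wildcard":
        if not pattern.endswith("*"):
            raise ValueError("wildcard pattern must end with '*'")
        # Everything before the trailing '*' is treated as a prefix.
        return model.startswith(pattern[:-1])
    raise ValueError(f"unknown match_type: {match_type!r}")


print(model_matches("gpt-4*", "wildcard", "gpt-4o-mini"))   # True
print(model_matches("gpt-4o", "exact", "gpt-4o-mini"))      # False
print(model_matches("gpt-4*", "wildcard", "gpt-3.5-turbo")) # False
```
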

Request type filtering

request_types is required and must contain at least one value. Only request types that have pricing support are accepted. Stream variants are treated identically to their base type — specifying chat_completion covers both streaming and non-streaming chat requests.
| Type | Description |
|---|---|
| chat_completion | Chat requests (streaming included) |
| text_completion | Legacy text completions (streaming included) |
| responses | Responses API requests (streaming included) |
| embedding | Embedding generation |
| rerank | Reranking |
| speech | Text-to-speech (streaming included) |
| transcription | Speech-to-text (streaming included) |
| image_generation | Image generation (streaming included) |
| image_variation | Image variation |
| image_edit | Image editing (streaming included) |
| video_generation | Video generation |
| video_remix | Video remixing |
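
A pre-flight check for request_types might look like the following. It is a sketch only: the set is copied from the table above, while the function name validate_request_types and the error messages are hypothetical rather than anything Bifrost exposes.

```python
# Request types with pricing support, copied from the table above.
SUPPORTED_REQUEST_TYPES = {
    "chat_completion", "text_completion", "responses", "embedding",
    "rerank", "speech", "transcription", "image_generation",
    "image_variation", "image_edit", "video_generation", "video_remix",
}


def validate_request_types(request_types):
    """Return the list unchanged if valid; raise ValueError otherwise."""
    if not request_types:
        raise ValueError("request_types must contain at least one value")
    unknown = sorted(set(request_types) - SUPPORTED_REQUEST_TYPES)
    if unknown:
        raise ValueError(f"unsupported request types: {unknown}")
    return list(request_types)


print(validate_request_types(["chat_completion", "responses"]))
```
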

Creating an override

  1. Navigate to Models > Pricing Overrides in the sidebar
  2. Click Create Override
  3. Fill in the form:
    • Name — a human-readable label
    • Scope — select the scope kind and provide the matching IDs
    • Pattern — enter the model name or wildcard prefix
    • Match type — choose Exact or Wildcard
    • Request types — select one or more request types (required)
    • Pricing fields — enter the price values you want to override (only non-zero fields are applied)
  4. Click Save

Pricing fields reference

Only fields with non-zero values are applied. All values are cost per unit in USD.
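
One way to picture the non-zero rule is as a merge of the override's fields over a catalog entry. The sketch below rests on two assumptions that this page does not confirm: that fields absent from the patch keep their catalog values, and that the patch arrives as a JSON-encoded string (as in the examples later on this page). The function name apply_pricing_patch and the catalog values are illustrative.

```python
import json


def apply_pricing_patch(catalog_entry, pricing_patch):
    """Overlay the non-zero fields of a JSON-encoded patch onto a catalog entry.

    Hypothetical helper; returns a new dict and leaves catalog_entry untouched.
    """
    patch = json.loads(pricing_patch)
    merged = dict(catalog_entry)
    for field, value in patch.items():
        if value != 0:  # zero-valued fields are ignored, not applied
            merged[field] = value
    return merged


# Illustrative catalog values, not real prices.
catalog = {"input_cost_per_token": 0.0000025, "output_cost_per_token": 0.00001}
patch = '{"input_cost_per_token":0.000002,"output_cost_per_token":0}'
print(apply_pricing_patch(catalog, patch))
```

In this sketch the input price is replaced while the output price keeps its catalog value, because the patch supplies zero for it.
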

Token costs

| Field | Description |
|---|---|
| input_cost_per_token | Standard input token cost |
| output_cost_per_token | Standard output token cost |
| input_cost_per_token_batches | Input token cost for batch requests |
| output_cost_per_token_batches | Output token cost for batch requests |
| input_cost_per_token_priority | Input token cost for priority requests |
| output_cost_per_token_priority | Output token cost for priority requests |
| input_cost_per_character | Input cost per character (character-billed models) |

Token tier costs

| Field | Description |
|---|---|
| input_cost_per_token_above_128k_tokens | Input cost above 128k context |
| output_cost_per_token_above_128k_tokens | Output cost above 128k context |
| input_cost_per_token_above_200k_tokens | Input cost above 200k context |
| output_cost_per_token_above_200k_tokens | Output cost above 200k context |

Cache costs

| Field | Description |
|---|---|
| cache_creation_input_token_cost | Cost to write a token to the prompt cache |
| cache_read_input_token_cost | Cost to read a cached input token |
| cache_creation_input_token_cost_above_200k_tokens | Cache creation above 200k context |
| cache_read_input_token_cost_above_200k_tokens | Cache read above 200k context |
| cache_read_input_token_cost_priority | Priority cache read cost |
| cache_read_input_image_token_cost | Cache read cost for image tokens |
| cache_creation_input_audio_token_cost | Cache creation cost for audio tokens |

Image costs

| Field | Description |
|---|---|
| input_cost_per_image | Cost per input image |
| output_cost_per_image | Cost per generated image |
| input_cost_per_pixel | Cost per input pixel |
| output_cost_per_pixel | Cost per output pixel |
| input_cost_per_image_token | Cost per image input token |
| output_cost_per_image_token | Cost per image output token |
| output_cost_per_image_low_quality | Generated image, low quality |
| output_cost_per_image_medium_quality | Generated image, medium quality |
| output_cost_per_image_high_quality | Generated image, high quality |
| output_cost_per_image_auto_quality | Generated image, auto quality |
| output_cost_per_image_above_512_and_512_pixels | Generated image > 512×512 |
| output_cost_per_image_above_1024_and_1024_pixels | Generated image > 1024×1024 |
| output_cost_per_image_above_2048_and_2048_pixels | Generated image > 2048×2048 |
| output_cost_per_image_above_4096_and_4096_pixels | Generated image > 4096×4096 |

Audio and video costs

| Field | Description |
|---|---|
| input_cost_per_audio_token | Cost per audio input token |
| input_cost_per_audio_per_second | Cost per second of audio input |
| input_cost_per_second | Cost per second of input (generic) |
| input_cost_per_video_per_second | Cost per second of video input |
| output_cost_per_audio_token | Cost per audio output token |
| output_cost_per_second | Cost per second of audio output |
| output_cost_per_video_per_second | Cost per second of video output |
| input_cost_per_video_per_second_above_128k_tokens | Video input cost above 128k context |
| input_cost_per_audio_per_second_above_128k_tokens | Audio input cost above 128k context |

Other costs

| Field | Description |
|---|---|
| search_context_cost_per_query | Cost per web search context query |
| code_interpreter_cost_per_session | Cost per code interpreter session |

Examples

Flat rate for all Anthropic models

Apply a single input/output rate to every Claude model served through the Anthropic provider:
{
  "id": "anthropic-flat-rate",
  "name": "Anthropic flat rate",
  "scope_kind": "provider",
  "provider_id": "anthropic",
  "match_type": "wildcard",
  "pattern": "claude*",
  "request_types": ["chat_completion", "text_completion", "responses"],
  "pricing_patch": "{\"input_cost_per_token\":0.000003,\"output_cost_per_token\":0.000015}"
}
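
Note that pricing_patch in this payload is a JSON-encoded string rather than a nested object, which is why the inner quotes are escaped. Writing that escaping by hand is error-prone, so one option is to build the payload programmatically. The sketch below only constructs and prints the payload; how it is then submitted to Bifrost is outside what this page describes.

```python
import json

# Prices to apply, as a plain dictionary.
prices = {
    "input_cost_per_token": 0.000003,
    "output_cost_per_token": 0.000015,
}

override = {
    "id": "anthropic-flat-rate",
    "name": "Anthropic flat rate",
    "scope_kind": "provider",
    "provider_id": "anthropic",
    "match_type": "wildcard",
    "pattern": "claude*",
    "request_types": ["chat_completion", "text_completion", "responses"],
    # Serialize the inner object to a string; the outer dumps call below
    # takes care of escaping its quotes.
    "pricing_patch": json.dumps(prices, separators=(",", ":")),
}

print(json.dumps(override, indent=2))
```

Python may serialize very small floats in exponent notation (for example 3e-06), which is still valid JSON and parses back to the same number.
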

Per-virtual-key negotiated rate

A specific virtual key has negotiated lower prices for GPT-4o:
{
  "id": "vk-prod-gpt4o-rate",
  "name": "Prod VK — GPT-4o negotiated rate",
  "scope_kind": "virtual_key",
  "virtual_key_id": "vk-abc123",
  "match_type": "exact",
  "pattern": "gpt-4o",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000002,\"output_cost_per_token\":0.000008}"
}
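
To see what the negotiated rate means in practice, here is a rough worked example. It assumes the simplest possible cost model, input tokens times the input rate plus output tokens times the output rate, and ignores caching, tiers, and any other fields Bifrost may factor in. The function name estimate_cost and the token counts are illustrative.

```python
# Rates taken from the pricing_patch in the example above (USD per token).
INPUT_COST_PER_TOKEN = 0.000002
OUTPUT_COST_PER_TOKEN = 0.000008


def estimate_cost(input_tokens, output_tokens):
    """Simplified estimate: tokens multiplied by their per-token rates."""
    return input_tokens * INPUT_COST_PER_TOKEN + output_tokens * OUTPUT_COST_PER_TOKEN


# 1,200 input tokens and 300 output tokens:
# 1200 * 0.000002 + 300 * 0.000008 = 0.0024 + 0.0024
cost = estimate_cost(input_tokens=1200, output_tokens=300)
print(f"{cost:.4f}")  # 0.0048
```
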

Image generation override

Override costs for a specific image model at global scope:
{
  "id": "dall-e-3-rate",
  "name": "DALL-E 3 custom rate",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "dall-e-3",
  "request_types": ["image_generation"],
  "pricing_patch": "{\"output_cost_per_image_high_quality\":0.04,\"output_cost_per_image_medium_quality\":0.02}"
}

Global catch-all for a new model

Use a global override to add pricing for a model not yet in the built-in catalog:
{
  "id": "my-new-model-rate",
  "name": "my-new-model pricing",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "my-new-model-v1",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000001,\"output_cost_per_token\":0.000005}"
}

Next steps

  • Virtual Keys — Attach virtual-key-scoped overrides to virtual keys for per-customer pricing
  • Budget and Limits — Understand how costs are tracked against budgets
  • Model Catalog — Deep dive into how pricing resolution and cost calculation work internally