Custom Pricing

Overview

Bifrost computes request costs using a built-in pricing catalog that is automatically synced from a remote datasheet. Custom Pricing lets you override those catalog prices at runtime without redeploying, applying your own rates for any model across any combination of provider, key, and virtual key scopes.

Key capabilities:

- Scoped overrides - apply prices globally or narrow them to a specific provider, provider key, or virtual key
- Pattern matching - target an exact model name or a wildcard prefix (e.g. gpt-4*)
- Request type filtering - restrict an override to one or more specific operations (chat, embeddings, image generation, etc.); at least one request type is required
- Hierarchical resolution - the most-specific matching override always wins; broader scopes act as fallbacks

Pricing data source

Before you configure overrides, make sure Bifrost has a pricing catalog to work from. By default it ships with built-in prices and syncs them every 24 hours. You can point it at a custom pricing URL if you maintain your own datasheet.

Web UI

1. Navigate to Models in the sidebar
2. Click the Pricing Settings tab
3. Enter your pricing datasheet URL in the Pricing Datasheet URL field
4. Set the Pricing Sync Interval (in hours)
5. Click Save

config.json

{
  "framework": {
    "pricing": {
      "pricing_url": "https://your-host/pricing.json",
      "pricing_sync_interval": 86400
    }
  }
}

Field	Type	Required	Default	Description
`pricing_url`	string (URI)	No	built-in	URL of the pricing datasheet to sync from
`pricing_sync_interval`	integer	No	`86400`	Sync interval in seconds. Minimum `3600` (1 hour)
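
The constraints in the table can be checked before a config file is deployed. The following is a minimal sketch, not Bifrost's own validation: the helper name `validate_pricing_config` is hypothetical, and the check that `pricing_url` has a scheme and host is an assumption about what "string (URI)" should accept. Only the field names, the default of 86400, and the 3600-second minimum come from the table.

```python
from urllib.parse import urlparse

DEFAULT_SYNC_INTERVAL = 86400  # seconds (24 hours), the documented default
MIN_SYNC_INTERVAL = 3600       # seconds (1 hour), the documented minimum


def validate_pricing_config(pricing: dict) -> dict:
    """Return a normalized copy of a framework.pricing block or raise ValueError.

    Hypothetical helper for illustration only.
    """
    result = {}

    url = pricing.get("pricing_url")
    if url is not None:
        parsed = urlparse(url)
        # ASSUMPTION: a usable datasheet URL has both a scheme and a host.
        if not parsed.scheme or not parsed.netloc:
            raise ValueError(f"pricing_url is not an absolute URL: {url!r}")
        result["pricing_url"] = url

    interval = pricing.get("pricing_sync_interval", DEFAULT_SYNC_INTERVAL)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("pricing_sync_interval must be an integer number of seconds")
    if interval < MIN_SYNC_INTERVAL:
        raise ValueError(
            f"pricing_sync_interval must be at least {MIN_SYNC_INTERVAL} seconds"
        )
    result["pricing_sync_interval"] = interval
    return result
```

An empty block is valid and falls back to the default interval; an interval such as 60 is rejected because it is below one hour.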

Scope hierarchy

Every override is assigned a scope kind that determines which requests it applies to. When Bifrost resolves pricing for a request, it evaluates all matching overrides and selects the one with the most specific scope. Scope kinds, from most to least specific:

virtual_key_provider_key  (most specific)
virtual_key_provider
virtual_key
provider_key
provider
global                    (least specific / catch-all)
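
The ordering above can be sketched as a small resolver. This is an illustration of the precedence rule only, not Bifrost's implementation: the function names are hypothetical, pattern and request-type matching are left out, and the source does not say how ties between two overrides of the same scope kind are broken (this sketch simply keeps the first one).

```python
# Scope kinds ordered from most to least specific, as listed above.
SCOPE_PRECEDENCE = [
    "virtual_key_provider_key",
    "virtual_key_provider",
    "virtual_key",
    "provider_key",
    "provider",
    "global",
]

ID_FIELDS = ("virtual_key_id", "provider_id", "provider_key_id")


def scope_applies(override: dict, request: dict) -> bool:
    """True if every identifier the override carries equals the request's value."""
    return all(
        override[field] == request.get(field)
        for field in ID_FIELDS
        if field in override
    )


def resolve_override(overrides: list, request: dict):
    """Return the applicable override with the most specific scope, or None."""
    candidates = [o for o in overrides if scope_applies(o, request)]
    if not candidates:
        return None
    # A lower index in SCOPE_PRECEDENCE means a more specific scope.
    return min(candidates, key=lambda o: SCOPE_PRECEDENCE.index(o["scope_kind"]))
```

With a global, a provider, and a virtual_key override all defined, a request carrying the matching virtual key resolves to the virtual_key override; the other two only apply when no narrower scope matches.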

Scope kinds and their required identifiers:

Scope kind	Required identifiers	Description
`global`	-	Applies to every request regardless of provider, key, or virtual key
`provider`	`provider_id`	Applies to all keys under a specific provider
`provider_key`	`provider_key_id`	Applies to a specific provider API key only
`virtual_key`	`virtual_key_id`	Applies to all requests made under a virtual key
`virtual_key_provider`	`virtual_key_id` + `provider_id`	Applies when a virtual key routes to a specific provider
`virtual_key_provider_key`	`virtual_key_id` + `provider_key_id`	Most specific: virtual key + exact provider API key

Scope identifiers are exclusive to their scope kind - you cannot mix them. For example, virtual_key_provider requires virtual_key_id and provider_id and must not include provider_key_id.
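
The exclusivity rule can be expressed as a lookup from scope kind to its required identifiers. The sketch below mirrors the table above; the helper name `scope_errors` and the wording of the messages are hypothetical, and Bifrost's actual error responses may differ.

```python
# Required identifiers per scope kind, copied from the table above.
REQUIRED_IDS = {
    "global": set(),
    "provider": {"provider_id"},
    "provider_key": {"provider_key_id"},
    "virtual_key": {"virtual_key_id"},
    "virtual_key_provider": {"virtual_key_id", "provider_id"},
    "virtual_key_provider_key": {"virtual_key_id", "provider_key_id"},
}

ALL_IDS = {"virtual_key_id", "provider_id", "provider_key_id"}


def scope_errors(override: dict) -> list:
    """Return a list of problems with the override's scope fields (empty if valid)."""
    kind = override.get("scope_kind")
    if kind not in REQUIRED_IDS:
        return [f"unknown scope_kind: {kind!r}"]

    present = {field for field in ALL_IDS if override.get(field)}
    required = REQUIRED_IDS[kind]

    errors = [f"{kind} requires {field}" for field in sorted(required - present)]
    errors += [f"{kind} must not include {field}" for field in sorted(present - required)]
    return errors
```

For example, a virtual_key_provider override that also carries provider_key_id yields one error, matching the rule stated above.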

Pattern matching

The pattern field controls which model names the override applies to. The match_type field controls how the pattern is interpreted.

Match type	Behavior	Example
`exact`	Matches only the exact model name	`gpt-4o` matches only `gpt-4o`
`wildcard`	Prefix match - pattern must end with `*`	`gpt-4*` matches `gpt-4o`, `gpt-4-turbo`, `gpt-4o-mini`

For wildcard patterns, append * to the end of the prefix. For example, claude-3* matches every model whose name begins with claude-3.
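
The two match types amount to string equality and a prefix test. A minimal sketch, assuming matching is case-sensitive (the source does not say) and using a hypothetical function name:

```python
def matches_model(pattern: str, match_type: str, model: str) -> bool:
    """Apply the exact / wildcard rules from the table above to a model name."""
    if match_type == "exact":
        return model == pattern
    if match_type == "wildcard":
        if not pattern.endswith("*"):
            raise ValueError("wildcard pattern must end with '*'")
        # Everything before the trailing '*' is treated as a literal prefix.
        return model.startswith(pattern[:-1])
    raise ValueError(f"unknown match_type: {match_type!r}")
```

With this sketch, `matches_model("gpt-4*", "wildcard", "gpt-4o-mini")` is true, while `matches_model("gpt-4o", "exact", "gpt-4o-mini")` is false.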

Request type filtering

request_types is required and must contain at least one value. Only request types that have pricing support are accepted. Stream variants are treated identically to their base type - specifying chat_completion covers both streaming and non-streaming chat requests.

Request type	Description
`chat_completion`	Chat requests (streaming included)
`text_completion`	Legacy text completions (streaming included)
`responses`	Responses API requests (streaming included)
`embedding`	Embedding generation
`rerank`	Reranking
`speech`	Text-to-speech (streaming included)
`transcription`	Speech-to-text (streaming included)
`image_generation`	Image generation (streaming included)
`image_variation`	Image variation
`image_edit`	Image editing (streaming included)
`video_generation`	Video generation
`video_remix`	Video remixing
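
The filtering rule can be sketched as a membership test after folding a stream variant into its base type. The list of supported types is taken from the table above; how Bifrost names stream variants internally is not documented here, so the `_stream` suffix below is purely an assumption for illustration, as are the function names.

```python
# Request types with pricing support, from the table above.
SUPPORTED_REQUEST_TYPES = frozenset({
    "chat_completion", "text_completion", "responses", "embedding",
    "rerank", "speech", "transcription", "image_generation",
    "image_variation", "image_edit", "video_generation", "video_remix",
})

STREAM_SUFFIX = "_stream"  # ASSUMPTION: hypothetical naming of stream variants


def base_request_type(request_type: str) -> str:
    """Fold a (hypothetically named) stream variant into its base type."""
    if request_type.endswith(STREAM_SUFFIX):
        return request_type[: -len(STREAM_SUFFIX)]
    return request_type


def override_covers(request_types: list, request_type: str) -> bool:
    """True if an override's request_types list covers the incoming request type."""
    if not request_types:
        raise ValueError("request_types must contain at least one value")
    unsupported = set(request_types) - SUPPORTED_REQUEST_TYPES
    if unsupported:
        raise ValueError(f"unsupported request types: {sorted(unsupported)}")
    return base_request_type(request_type) in request_types
```

Under these assumptions, an override listing only chat_completion covers both the streaming and non-streaming chat request, and does not cover embedding.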

Creating an override

Web UI

1. Navigate to Models → Pricing Overrides in the sidebar
2. Click Create Override
3. Fill in the form:
   - Name - a human-readable label
   - Scope - select the scope kind and provide the matching IDs
   - Pattern - enter the model name or wildcard prefix
   - Match type - choose Exact or Wildcard
   - Request types - select one or more request types (required)
   - Pricing fields - enter the price values you want to override (only non-zero fields are applied)
4. Click Save

API

curl -X POST http://localhost:8080/api/governance/pricing-overrides \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-4o reduced input cost",
    "scope_kind": "global",
    "match_type": "exact",
    "pattern": "gpt-4o",
    "request_types": ["chat_completion"],
    "patch": {
      "input_cost_per_token": 0.0000025,
      "output_cost_per_token": 0.000010
    }
  }'

Response:

{
  "message": "Pricing override created successfully",
  "pricing_override": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "name": "GPT-4o reduced input cost",
    "scope_kind": "global",
    "match_type": "exact",
    "pattern": "gpt-4o",
    "request_types": ["chat_completion"],
    "pricing_patch": "{\"input_cost_per_token\":0.0000025,\"output_cost_per_token\":0.00001}",
    "created_at": "2026-03-20T10:00:00Z",
    "updated_at": "2026-03-20T10:00:00Z"
  }
}
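
For scripted setups, the same create call can be built with Python's standard library. This is a sketch equivalent to the curl command above, assuming the same local address and no authentication; the helper name `build_create_request` is hypothetical. The request is only constructed here, and the commented lines show how it could be sent to a running instance.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same address as the curl example


def build_create_request(override: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a pricing override."""
    return urllib.request.Request(
        url=f"{base_url}/api/governance/pricing-overrides",
        data=json.dumps(override).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "name": "GPT-4o reduced input cost",
    "scope_kind": "global",
    "match_type": "exact",
    "pattern": "gpt-4o",
    "request_types": ["chat_completion"],
    "patch": {
        "input_cost_per_token": 0.0000025,
        "output_cost_per_token": 0.000010,
    },
}

request = build_create_request(payload)

# To send it against a running Bifrost instance:
#   with urllib.request.urlopen(request) as response:
#       created = json.load(response)["pricing_override"]
```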

Update (sparse patch):

curl -X PATCH http://localhost:8080/api/governance/pricing-overrides/{id} \
  -H "Content-Type: application/json" \
  -d '{
    "patch": {
      "input_cost_per_token": 0.000002
    }
  }'

Delete:

curl -X DELETE http://localhost:8080/api/governance/pricing-overrides/{id}

List (with optional filters):

# All overrides
curl http://localhost:8080/api/governance/pricing-overrides

# Filter by scope
curl "http://localhost:8080/api/governance/pricing-overrides?scope_kind=virtual_key&virtual_key_id=vk-abc123"

config.json

Pricing overrides are defined under governance.pricing_overrides. Each entry requires id, name, scope_kind, match_type, pattern, and request_types. The pricing_patch field is a JSON-encoded string containing only the fields you want to override.

{
  "governance": {
    "pricing_overrides": [
      {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "name": "Global GPT-4o rate",
        "scope_kind": "global",
        "match_type": "exact",
        "pattern": "gpt-4o",
        "request_types": ["chat_completion"],
        "pricing_patch": "{\"input_cost_per_token\":0.0000025,\"output_cost_per_token\":0.00001}"
      },
      {
        "id": "660e8400-e29b-41d4-a716-446655440001",
        "name": "All Claude models for prod VK",
        "scope_kind": "virtual_key",
        "virtual_key_id": "vk-abc123",
        "match_type": "wildcard",
        "pattern": "claude-3*",
        "request_types": ["chat_completion"],
        "pricing_patch": "{\"input_cost_per_token\":0.000003,\"output_cost_per_token\":0.000015}"
      }
    ]
  }
}

Field	Type	Required	Description
`id`	string	Yes	Unique override ID (UUID recommended)
`name`	string	Yes	Human-readable label
`scope_kind`	string	Yes	One of: `global`, `provider`, `provider_key`, `virtual_key`, `virtual_key_provider`, `virtual_key_provider_key`
`virtual_key_id`	string	Conditional	Required for `virtual_key*` scopes
`provider_id`	string	Conditional	Required for `provider` and `virtual_key_provider` scopes
`provider_key_id`	string	Conditional	Required for `provider_key` and `virtual_key_provider_key` scopes
`match_type`	string	Yes	`exact` or `wildcard`
`pattern`	string	Yes	Model name or wildcard prefix ending with `*`
`request_types`	array	Yes	Request types this override applies to. At least one value required.
`pricing_patch`	string	No	JSON-encoded pricing fields to override
`config_hash`	string	No	Auto-managed. Do not set manually
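
Because pricing_patch is a JSON document stored inside a JSON string, hand-escaping the quotes is easy to get wrong. One way to avoid that is to generate the entry programmatically; the sketch below uses only Python's `json` module, and the helper name `encode_pricing_patch` is hypothetical.

```python
import json


def encode_pricing_patch(fields: dict) -> str:
    """Serialize pricing fields into the compact JSON string stored in pricing_patch."""
    return json.dumps(fields, separators=(",", ":"))


entry = {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "name": "Global GPT-4o rate",
    "scope_kind": "global",
    "match_type": "exact",
    "pattern": "gpt-4o",
    "request_types": ["chat_completion"],
    "pricing_patch": encode_pricing_patch({
        "input_cost_per_token": 0.0000025,
        "output_cost_per_token": 0.00001,
    }),
}

# Serializing the whole config escapes the inner quotes automatically.
config_fragment = json.dumps({"governance": {"pricing_overrides": [entry]}}, indent=2)
```

Python writes very small floats in exponent notation (for example 2.5e-06), which is still a valid JSON number and decodes to the same value.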

Pricing fields reference

Only fields with non-zero values are applied. All values are cost per unit in USD.
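
The non-zero rule can be pictured as a field-by-field merge onto the catalog entry. The sketch below assumes that this is how a patch combines with the catalog price; the function name is hypothetical and the actual merge logic lives inside Bifrost.

```python
def apply_pricing_patch(base: dict, patch: dict) -> dict:
    """Return catalog pricing with the patch's non-zero fields applied on top.

    ASSUMPTION: a patch replaces individual fields and leaves the rest of the
    catalog entry untouched; zero-valued patch fields are ignored.
    """
    merged = dict(base)  # copy so the catalog entry itself is not mutated
    for field, value in patch.items():
        if value != 0:
            merged[field] = value
    return merged
```

One consequence of this rule is that a patch cannot set a price to zero: a zero value is treated as "not set" and the catalog price stays in effect.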

Token costs

Field	Description
`input_cost_per_token`	Standard input token cost
`output_cost_per_token`	Standard output token cost
`input_cost_per_token_batches`	Input token cost for batch requests
`output_cost_per_token_batches`	Output token cost for batch requests
`input_cost_per_token_priority`	Input token cost for priority requests
`output_cost_per_token_priority`	Output token cost for priority requests
`input_cost_per_token_flex`	Input token cost for flex requests
`output_cost_per_token_flex`	Output token cost for flex requests
`input_cost_per_character`	Input cost per character (character-billed models)

Token tier costs

Field	Description
`input_cost_per_token_above_128k_tokens`	Input cost above 128k context
`output_cost_per_token_above_128k_tokens`	Output cost above 128k context
`input_cost_per_token_above_200k_tokens`	Input cost above 200k context
`input_cost_per_token_above_200k_tokens_priority`	Input cost above 200k context for priority requests
`output_cost_per_token_above_200k_tokens`	Output cost above 200k context
`output_cost_per_token_above_200k_tokens_priority`	Output cost above 200k context for priority requests
`input_cost_per_token_above_272k_tokens`	Input cost above 272k context
`input_cost_per_token_above_272k_tokens_priority`	Input cost above 272k context for priority requests
`output_cost_per_token_above_272k_tokens`	Output cost above 272k context
`output_cost_per_token_above_272k_tokens_priority`	Output cost above 272k context for priority requests

Cache costs

Field	Description
`cache_creation_input_token_cost`	Cost to write a token to the prompt cache
`cache_read_input_token_cost`	Cost to read a cached input token
`cache_creation_input_token_cost_above_200k_tokens`	Cache creation above 200k context
`cache_read_input_token_cost_above_200k_tokens`	Cache read above 200k context
`cache_read_input_token_cost_above_200k_tokens_priority`	Cache read above 200k context for priority requests
`cache_read_input_token_cost_priority`	Priority cache read cost
`cache_read_input_token_cost_flex`	Flex cache read cost
`cache_read_input_token_cost_above_272k_tokens`	Cache read above 272k context
`cache_read_input_token_cost_above_272k_tokens_priority`	Cache read above 272k context for priority requests
`cache_read_input_image_token_cost`	Cache read cost for image tokens
`cache_creation_input_audio_token_cost`	Cache creation cost for audio tokens

Image costs

Field	Description
`input_cost_per_image`	Cost per input image
`output_cost_per_image`	Cost per generated image
`input_cost_per_pixel`	Cost per input pixel
`output_cost_per_pixel`	Cost per output pixel
`input_cost_per_image_token`	Cost per image input token
`output_cost_per_image_token`	Cost per image output token
`output_cost_per_image_low_quality`	Generated image - low quality
`output_cost_per_image_medium_quality`	Generated image - medium quality
`output_cost_per_image_high_quality`	Generated image - high quality
`output_cost_per_image_auto_quality`	Generated image - auto quality
`output_cost_per_image_above_512_and_512_pixels`	Generated image > 512×512
`output_cost_per_image_above_1024_and_1024_pixels`	Generated image > 1024×1024
`output_cost_per_image_above_2048_and_2048_pixels`	Generated image > 2048×2048
`output_cost_per_image_above_4096_and_4096_pixels`	Generated image > 4096×4096

Audio and video costs

Field	Description
`input_cost_per_audio_token`	Cost per audio input token
`input_cost_per_audio_per_second`	Cost per second of audio input
`input_cost_per_second`	Cost per second of input (generic)
`input_cost_per_video_per_second`	Cost per second of video input
`output_cost_per_audio_token`	Cost per audio output token
`output_cost_per_second`	Cost per second of audio output
`output_cost_per_video_per_second`	Cost per second of video output
`input_cost_per_video_per_second_above_128k_tokens`	Video input cost above 128k context
`input_cost_per_audio_per_second_above_128k_tokens`	Audio input cost above 128k context

Other costs

Field	Description
`search_context_cost_per_query`	Cost per web search context query
`code_interpreter_cost_per_session`	Cost per code interpreter session

Examples

Flat rate for all Anthropic models

Apply a single input/output rate to every Claude model served through the Anthropic provider:

{
  "id": "anthropic-flat-rate",
  "name": "Anthropic flat rate",
  "scope_kind": "provider",
  "provider_id": "anthropic",
  "match_type": "wildcard",
  "pattern": "claude*",
  "request_types": ["chat_completion", "text_completion", "responses"],
  "pricing_patch": "{\"input_cost_per_token\":0.000003,\"output_cost_per_token\":0.000015}"
}

Per-virtual-key negotiated rate

Apply negotiated, lower GPT-4o prices to requests made under a specific virtual key:

{
  "id": "vk-prod-gpt4o-rate",
  "name": "Prod VK - GPT-4o negotiated rate",
  "scope_kind": "virtual_key",
  "virtual_key_id": "vk-abc123",
  "match_type": "exact",
  "pattern": "gpt-4o",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000002,\"output_cost_per_token\":0.000008}"
}
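
To see what these rates mean for a single request, the sketch below prices a chat request from the pricing_patch above. It assumes the simplest possible formula, tokens multiplied by the per-token price, and ignores tiers, caching, and other fields; the function name is hypothetical.

```python
import json
from decimal import Decimal


def request_cost(input_tokens: int, output_tokens: int, pricing_patch: str) -> Decimal:
    """Cost in USD under the ASSUMED formula: tokens x per-token price."""
    # parse_float=Decimal keeps the prices exact instead of using binary floats.
    prices = json.loads(pricing_patch, parse_float=Decimal)
    return (
        input_tokens * prices["input_cost_per_token"]
        + output_tokens * prices["output_cost_per_token"]
    )


negotiated = '{"input_cost_per_token":0.000002,"output_cost_per_token":0.000008}'
cost = request_cost(1000, 500, negotiated)
```

Under that assumption, 1,000 input tokens and 500 output tokens cost 0.002 + 0.004 = 0.006 USD at the negotiated rate.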

Image generation override

Override costs for a specific image model at global scope:

{
  "id": "dall-e-3-rate",
  "name": "DALL-E 3 custom rate",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "dall-e-3",
  "request_types": ["image_generation"],
  "pricing_patch": "{\"output_cost_per_image_high_quality\":0.04,\"output_cost_per_image_medium_quality\":0.02}"
}

Global catch-all for a new model

Use a global override to add pricing for a model not yet in the built-in catalog:

{
  "id": "my-new-model-rate",
  "name": "my-new-model pricing",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "my-new-model-v1",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000001,\"output_cost_per_token\":0.000005}"
}

Next steps

- Virtual Keys - Attach virtual-key-scoped overrides to virtual keys for per-customer pricing
- Budget and Limits - Understand how costs are tracked against budgets
- Model Catalog - Deep dive into how pricing resolution and cost calculation work internally
