# Custom Pricing

> Set custom rates for any model across global or virtual key scopes, optionally narrowed to a specific provider or key.

## Overview

Bifrost computes request costs using a built-in pricing catalog that is automatically synced from a remote datasheet. **Custom Pricing** lets you override those catalog prices at runtime without redeploying, applying your own rates for any model across any combination of provider, key, and virtual key scopes.

**Key capabilities:**

* **Scoped overrides** - apply prices globally or narrow them to a specific provider, provider key, or virtual key
* **Pattern matching** - target an exact model name or a wildcard prefix (e.g. `gpt-4*`)
* **Request type filtering** - restrict an override to one or more specific operations (chat, embeddings, image generation, etc.); at least one request type is required
* **Hierarchical resolution** - the most-specific matching override always wins; broader scopes act as fallbacks

***

## Pricing data source

Overrides are applied on top of a pricing catalog, so Bifrost needs one to work from. By default it ships with built-in prices and syncs them every 24 hours. You can point it at a custom pricing URL if you maintain your own datasheet.

<Tabs group="pricing-source">
  <Tab title="Web UI">
    1. Navigate to **Models** in the sidebar
    2. Click the **Pricing Settings** tab
    3. Enter your pricing datasheet URL in the **Pricing Datasheet URL** field
    4. Set the **Pricing Sync Interval** (in hours)
    5. Click **Save**
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "framework": {
        "pricing": {
          "pricing_url": "https://your-host/pricing.json",
          "model_parameters_url": "https://your-host/model-parameters.json",
          "pricing_sync_interval": 86400
        }
      }
    }
    ```

    `file://` URLs are also supported for loading datasheets from the local filesystem:

    ```json theme={null}
    "pricing_url": "file:///opt/bifrost/pricing.json"
    ```

    | Field                   | Type         | Required | Default  | Description                                                                          |
    | ----------------------- | ------------ | -------- | -------- | ------------------------------------------------------------------------------------ |
    | `pricing_url`           | string (URI) | No       | built-in | URL of the pricing datasheet. Supports `http://`, `https://`, and `file://`          |
    | `model_parameters_url`  | string (URI) | No       | built-in | URL of the model parameters datasheet. Supports `http://`, `https://`, and `file://` |
    | `pricing_sync_interval` | integer      | No       | `86400`  | Sync interval in seconds. Minimum `3600` (1 hour)                                    |
  </Tab>
</Tabs>
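The Web UI takes the sync interval in hours while `config.json` takes seconds, so a small conversion helper can prevent unit mistakes. The sketch below is illustrative only: the function name is hypothetical, and it simply applies the documented `3600` second minimum.

```python
MIN_SYNC_INTERVAL_SECONDS = 3600  # documented minimum (1 hour)


def sync_interval_seconds(hours: float) -> int:
    """Convert an interval in hours to the seconds value used in config.json."""
    seconds = int(hours * 3600)
    if seconds < MIN_SYNC_INTERVAL_SECONDS:
        raise ValueError("pricing_sync_interval must be at least 3600 seconds (1 hour)")
    return seconds


print(sync_interval_seconds(24))  # 86400, the documented default
```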

***

## Scope hierarchy

Every override is assigned a **scope kind** that determines which requests it applies to. When Bifrost resolves pricing for a request, it evaluates all matching overrides and selects the one with the most specific scope. More specific scopes always win over broader ones.

```
virtual_key_provider_key  (most specific)
virtual_key_provider
virtual_key
provider_key
provider
global                    (least specific / catch-all)
```
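As a rough illustration of the "most specific scope wins" rule, the ordering above can be modelled as a ranked list. This is a sketch, not Bifrost's implementation: the function name and the dict shape are assumptions, and how ties between two overrides of the same scope kind are broken is not covered here.

```python
# Most specific first, mirroring the order listed above.
SCOPE_ORDER = [
    "virtual_key_provider_key",
    "virtual_key_provider",
    "virtual_key",
    "provider_key",
    "provider",
    "global",
]


def pick_most_specific(matching_overrides):
    """Return the override with the most specific scope, or None if none matched."""
    if not matching_overrides:
        return None
    # A lower index in SCOPE_ORDER means a more specific scope.
    return min(matching_overrides, key=lambda o: SCOPE_ORDER.index(o["scope_kind"]))


candidates = [
    {"name": "global default", "scope_kind": "global"},
    {"name": "prod VK rate", "scope_kind": "virtual_key"},
    {"name": "provider rate", "scope_kind": "provider"},
]
print(pick_most_specific(candidates)["name"])  # prod VK rate
```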

**Scope kinds and their required identifiers:**

| Scope kind                 | Required                             | Description                                                          |
| -------------------------- | ------------------------------------ | -------------------------------------------------------------------- |
| `global`                   | -                                    | Applies to every request regardless of provider, key, or virtual key |
| `provider`                 | `provider_id`                        | Applies to all keys under a specific provider                        |
| `provider_key`             | `provider_key_id`                    | Applies to a specific provider API key only                          |
| `virtual_key`              | `virtual_key_id`                     | Applies to all requests made under a virtual key                     |
| `virtual_key_provider`     | `virtual_key_id` + `provider_id`     | Applies when a virtual key routes to a specific provider             |
| `virtual_key_provider_key` | `virtual_key_id` + `provider_key_id` | Most specific: virtual key + exact provider API key                  |

<Note>
  Scope identifiers are exclusive to their scope kind - you cannot mix them. For example, `virtual_key_provider` requires `virtual_key_id` and `provider_id` and must not include `provider_key_id`.
</Note>
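The identifier rules in the table and note above can be checked before an override is submitted. The following is a minimal sketch under the assumption that an override is held as a plain dict; the helper name `scope_errors` is hypothetical and is not part of Bifrost.

```python
# Identifiers each scope kind requires, taken from the table above.
REQUIRED_IDS = {
    "global": set(),
    "provider": {"provider_id"},
    "provider_key": {"provider_key_id"},
    "virtual_key": {"virtual_key_id"},
    "virtual_key_provider": {"virtual_key_id", "provider_id"},
    "virtual_key_provider_key": {"virtual_key_id", "provider_key_id"},
}
ALL_IDS = {"virtual_key_id", "provider_id", "provider_key_id"}


def scope_errors(override: dict) -> list[str]:
    """List missing or unexpected scope identifiers for an override dict."""
    kind = override.get("scope_kind")
    if kind not in REQUIRED_IDS:
        return [f"unknown scope_kind: {kind!r}"]
    present = {name for name in ALL_IDS if override.get(name)}
    required = REQUIRED_IDS[kind]
    errors = [f"missing {name}" for name in sorted(required - present)]
    errors += [f"unexpected {name}" for name in sorted(present - required)]
    return errors


bad = {
    "scope_kind": "virtual_key_provider",
    "virtual_key_id": "vk-abc123",
    "provider_id": "openai",
    "provider_key_id": "key-1",  # not allowed for this scope kind
}
print(scope_errors(bad))  # ['unexpected provider_key_id']
```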

***

## Pattern matching

The `pattern` field controls which model names the override applies to. The `match_type` field controls how the pattern is interpreted.

| Match type | Behavior                                 | Example                                                 |
| ---------- | ---------------------------------------- | ------------------------------------------------------- |
| `exact`    | Matches only the exact model name        | `gpt-4o` matches only `gpt-4o`                          |
| `wildcard` | Prefix match - pattern must end with `*` | `gpt-4*` matches `gpt-4o`, `gpt-4-turbo`, `gpt-4o-mini` |

<Info>
  For wildcard patterns, append a `*` at the end of the prefix. For example, `claude-3*` will match all Claude 3 variants.
</Info>
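The two match types can be expressed in a few lines. This sketch assumes case-sensitive comparison, which the text above does not state, and the function name is illustrative only.

```python
def matches(pattern: str, match_type: str, model: str) -> bool:
    """Return True if the model name satisfies the pattern under the match type."""
    if match_type == "exact":
        return model == pattern
    if match_type == "wildcard":
        if not pattern.endswith("*"):
            raise ValueError("wildcard patterns must end with '*'")
        # Everything before the trailing '*' is treated as a prefix.
        return model.startswith(pattern[:-1])
    raise ValueError(f"unknown match_type: {match_type!r}")


print(matches("gpt-4*", "wildcard", "gpt-4o-mini"))  # True
print(matches("gpt-4o", "exact", "gpt-4o-mini"))     # False
```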

***

## Lookup precedence

When Bifrost resolves pricing for a request, it tries the following lookup candidates, in order, against the catalog (built-in entries plus your overrides):

1. The alias's canonical `model_name` (`routing_info.resolved_key_alias.model_name`)
2. The alias's wire `model_id` (`routing_info.resolved_key_alias.model_id`)
3. The model the caller sent (`routing_info.model`)

The first non-empty candidate that matches a catalog entry wins. This precedence handles the **opaque deployment ID** case: when an admin aliases an unrecognisable wire ID (e.g. an Azure deployment `12345-azure-prod`) to a canonical name the catalog knows (e.g. `claude-sonnet-4-5`) via [Static Aliasing](/providers/aliasing-models), pricing finds the catalog entry through the canonical name even though the wire identifier alone would not match.

When no key-level alias matches, candidates (1) and (2) are absent and the lookup falls straight through to the model the caller sent, which preserves pre-alias behavior.

**Overrides** are matched against the wire model (`model_id` when an alias matched, otherwise the caller-sent model) so per-deployment override pricing stays addressable regardless of how the catalog entry was found.
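The lookup order can be sketched as follows. The dict shape used for `routing_info` is an assumption made for illustration (only the field names come from the list above), and the helper names and the caller-sent model `sonnet-prod` are hypothetical.

```python
def catalog_candidates(routing_info: dict) -> list[str]:
    """Ordered, non-empty catalog lookup candidates for a request."""
    alias = routing_info.get("resolved_key_alias") or {}
    ordered = [alias.get("model_name"), alias.get("model_id"), routing_info.get("model")]
    return [name for name in ordered if name]


def resolve_catalog_entry(routing_info: dict, catalog: dict):
    """Return the first candidate that exists in the catalog, or None."""
    for name in catalog_candidates(routing_info):
        if name in catalog:
            return name
    return None


def override_match_target(routing_info: dict):
    """Overrides match the wire model: model_id if aliased, else the caller's model."""
    alias = routing_info.get("resolved_key_alias") or {}
    return alias.get("model_id") or routing_info.get("model")


catalog = {"claude-sonnet-4-5": {"input_cost_per_token": 0.000003}}
aliased = {
    "model": "sonnet-prod",
    "resolved_key_alias": {"model_name": "claude-sonnet-4-5", "model_id": "12345-azure-prod"},
}
print(resolve_catalog_entry(aliased, catalog))  # claude-sonnet-4-5
print(override_match_target(aliased))           # 12345-azure-prod
```

Without an alias, both helpers fall back to the caller-sent model, which matches the pre-alias behavior described above.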

***

## Request type filtering

`request_types` is **required** and must contain at least one value. Only request types that have pricing support are accepted. Stream variants are treated identically to their base type - specifying `chat_completion` covers both streaming and non-streaming chat requests.

| Type               | Description                                  |
| ------------------ | -------------------------------------------- |
| `chat_completion`  | Chat requests (streaming included)           |
| `text_completion`  | Legacy text completions (streaming included) |
| `responses`        | Responses API requests (streaming included)  |
| `embedding`        | Embedding generation                         |
| `rerank`           | Reranking                                    |
| `speech`           | Text-to-speech (streaming included)          |
| `transcription`    | Speech-to-text (streaming included)          |
| `image_generation` | Image generation (streaming included)        |
| `image_variation`  | Image variation                              |
| `image_edit`       | Image editing (streaming included)           |
| `video_generation` | Video generation                             |
| `video_remix`      | Video remixing                               |
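
A client-side check of `request_types` before submitting an override might look like the sketch below. The set mirrors the table above; the function name is hypothetical, and Bifrost's own validation may report errors differently.

```python
SUPPORTED_REQUEST_TYPES = {
    "chat_completion", "text_completion", "responses", "embedding",
    "rerank", "speech", "transcription", "image_generation",
    "image_variation", "image_edit", "video_generation", "video_remix",
}


def validate_request_types(request_types):
    """Raise ValueError if the list is empty or contains unsupported types."""
    if not request_types:
        raise ValueError("request_types must contain at least one value")
    unsupported = sorted(set(request_types) - SUPPORTED_REQUEST_TYPES)
    if unsupported:
        raise ValueError(f"unsupported request types: {unsupported}")
    return list(request_types)


print(validate_request_types(["chat_completion", "responses"]))
```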

***

## Creating an override

<Tabs group="config-method">
  <Tab title="Web UI">
    1. Navigate to **Models** → **Pricing Overrides** in the sidebar

           <img src="https://mintcdn.com/bifrost/aRrq3Eaiqd-A73Wp/media/ui-custom-pricing-table.png?fit=max&auto=format&n=aRrq3Eaiqd-A73Wp&q=85&s=b54178f2856bb7fe42f91421d5fa6922" alt="Pricing Overrides Table" width="3492" height="2366" data-path="media/ui-custom-pricing-table.png" />

    2. Click **Create Override**

    3. Fill in the form:
       * **Name** - a human-readable label
       * **Scope** - select the scope kind and provide the matching IDs
       * **Pattern** - enter the model name or wildcard prefix
       * **Match type** - choose **Exact** or **Wildcard**
       * **Request types** - select one or more request types (required)
       * **Pricing fields** - enter the price values you want to override (only non-zero fields are applied)

    4. Click **Save**

           <img src="https://mintcdn.com/bifrost/aRrq3Eaiqd-A73Wp/media/ui-custom-pricing-form.png?fit=max&auto=format&n=aRrq3Eaiqd-A73Wp&q=85&s=f1cdcb5777d1aad950ed6248cd972a7e" alt="Pricing Override Form" width="3492" height="2366" data-path="media/ui-custom-pricing-form.png" />
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    curl -X POST http://localhost:8080/api/governance/pricing-overrides \
      -H "Content-Type: application/json" \
      -d '{
        "name": "GPT-4o reduced input cost",
        "scope_kind": "global",
        "match_type": "exact",
        "pattern": "gpt-4o",
        "request_types": ["chat_completion"],
        "patch": {
          "input_cost_per_token": 0.0000025,
          "output_cost_per_token": 0.000010
        }
      }'
    ```

    **Response:**

    ```json theme={null}
    {
      "message": "Pricing override created successfully",
      "pricing_override": {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "name": "GPT-4o reduced input cost",
        "scope_kind": "global",
        "match_type": "exact",
        "pattern": "gpt-4o",
        "request_types": ["chat_completion"],
        "pricing_patch": "{\"input_cost_per_token\":0.0000025,\"output_cost_per_token\":0.00001}",
        "created_at": "2026-03-20T10:00:00Z",
        "updated_at": "2026-03-20T10:00:00Z"
      }
    }
    ```

    **Update (sparse patch):**

    ```bash theme={null}
    curl -X PATCH http://localhost:8080/api/governance/pricing-overrides/{id} \
      -H "Content-Type: application/json" \
      -d '{
        "patch": {
          "input_cost_per_token": 0.000002
        }
      }'
    ```

    **Delete:**

    ```bash theme={null}
    curl -X DELETE http://localhost:8080/api/governance/pricing-overrides/{id}
    ```

    **List (with optional filters):**

    ```bash theme={null}
    # All overrides
    curl http://localhost:8080/api/governance/pricing-overrides

    # Filter by scope
    curl "http://localhost:8080/api/governance/pricing-overrides?scope_kind=virtual_key&virtual_key_id=vk-abc123"
    ```
  </Tab>

  <Tab title="config.json">
    Pricing overrides are defined under `governance.pricing_overrides`. Each entry requires `id`, `name`, `scope_kind`, `match_type`, `pattern`, and `request_types`. The `pricing_patch` is a JSON-encoded string containing only the fields you want to override.

    ```json theme={null}
    {
      "governance": {
        "pricing_overrides": [
          {
            "id": "550e8400-e29b-41d4-a716-446655440000",
            "name": "Global GPT-4o rate",
            "scope_kind": "global",
            "match_type": "exact",
            "pattern": "gpt-4o",
            "request_types": ["chat_completion"],
            "pricing_patch": "{\"input_cost_per_token\":0.0000025,\"output_cost_per_token\":0.00001}"
          },
          {
            "id": "660e8400-e29b-41d4-a716-446655440001",
            "name": "All Claude 3 models for prod VK",
            "scope_kind": "virtual_key",
            "virtual_key_id": "vk-abc123",
            "match_type": "wildcard",
            "pattern": "claude-3*",
            "request_types": ["chat_completion"],
            "pricing_patch": "{\"input_cost_per_token\":0.000003,\"output_cost_per_token\":0.000015}"
          }
        ]
      }
    }
    ```

    | Field             | Type   | Required    | Description                                                                                                     |
    | ----------------- | ------ | ----------- | --------------------------------------------------------------------------------------------------------------- |
    | `id`              | string | Yes         | Unique override ID (UUID recommended)                                                                           |
    | `name`            | string | Yes         | Human-readable label                                                                                            |
    | `scope_kind`      | string | Yes         | One of: `global`, `provider`, `provider_key`, `virtual_key`, `virtual_key_provider`, `virtual_key_provider_key` |
    | `virtual_key_id`  | string | Conditional | Required for `virtual_key*` scopes                                                                              |
    | `provider_id`     | string | Conditional | Required for `provider` and `virtual_key_provider` scopes                                                       |
    | `provider_key_id` | string | Conditional | Required for `provider_key` and `virtual_key_provider_key` scopes                                               |
    | `match_type`      | string | Yes         | `exact` or `wildcard`                                                                                           |
    | `pattern`         | string | Yes         | Model name or wildcard prefix ending with `*`                                                                   |
    | `request_types`   | array  | Yes         | Request types this override applies to. At least one value required.                                            |
    | `pricing_patch`   | string | No          | JSON-encoded pricing fields to override                                                                         |
    | `config_hash`     | string | No          | Auto-managed. Do not set manually                                                                               |
  </Tab>
</Tabs>
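Because `pricing_patch` in `config.json` is a JSON string nested inside JSON, escaping it by hand is error-prone. The sketch below generates an entry programmatically; `make_override` is a hypothetical helper, not a Bifrost API, and the values are copied from the examples on this page.

```python
import json
import uuid


def make_override(name, scope_kind, match_type, pattern, request_types, prices, **scope_ids):
    """Build one governance.pricing_overrides entry as a plain dict."""
    entry = {
        "id": str(uuid.uuid4()),  # the field table recommends a UUID
        "name": name,
        "scope_kind": scope_kind,
        "match_type": match_type,
        "pattern": pattern,
        "request_types": list(request_types),
        # pricing_patch is a JSON-encoded string, so encode the prices separately.
        "pricing_patch": json.dumps(prices, separators=(",", ":")),
    }
    entry.update(scope_ids)  # e.g. virtual_key_id="vk-abc123"
    return entry


entry = make_override(
    "Prod VK - GPT-4o negotiated rate", "virtual_key", "exact", "gpt-4o",
    ["chat_completion"],
    {"input_cost_per_token": 0.000002, "output_cost_per_token": 0.000008},
    virtual_key_id="vk-abc123",
)
print(json.dumps({"governance": {"pricing_overrides": [entry]}}, indent=2))
```

Python writes very small floats in exponent notation (for example `2e-06`), which is still a valid JSON number.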

***

## Pricing fields reference

Only fields with non-zero values are applied. Each value is the cost **per unit** (token, image, second, and so on) in USD.
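
To make the non-zero rule concrete, here is a small sketch of merging a patch over base prices and pricing a request. The base prices are hypothetical, the helper names are illustrative, and the cost formula is a deliberately simplified assumption (input and output tokens only, no tiers or cache), not Bifrost's actual calculation.

```python
def apply_patch(base: dict, patch: dict) -> dict:
    """Overlay non-zero patch fields on the base prices."""
    merged = dict(base)
    for field, value in patch.items():
        if value:  # zero values leave the base price untouched
            merged[field] = value
    return merged


def token_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Simplified cost: tokens multiplied by their per-token price, in USD."""
    return (input_tokens * prices.get("input_cost_per_token", 0.0)
            + output_tokens * prices.get("output_cost_per_token", 0.0))


# Hypothetical base prices, for illustration only.
base = {"input_cost_per_token": 0.000005, "output_cost_per_token": 0.000015}
patch = {"input_cost_per_token": 0.0000025, "output_cost_per_token": 0}

prices = apply_patch(base, patch)
# Input uses the patched rate; output keeps the base rate because the patch value is 0.
print(round(token_cost(prices, 1_000_000, 200_000), 6))  # 5.5
```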

### Token costs

| Field                            | Description                                        |
| -------------------------------- | -------------------------------------------------- |
| `input_cost_per_token`           | Standard input token cost                          |
| `output_cost_per_token`          | Standard output token cost                         |
| `input_cost_per_token_batches`   | Input token cost for batch requests                |
| `output_cost_per_token_batches`  | Output token cost for batch requests               |
| `input_cost_per_token_priority`  | Input token cost for priority requests             |
| `output_cost_per_token_priority` | Output token cost for priority requests            |
| `input_cost_per_token_flex`      | Input token cost for flex requests                 |
| `output_cost_per_token_flex`     | Output token cost for flex requests                |
| `input_cost_per_character`       | Input cost per character (character-billed models) |

### Token tier costs

| Field                                              | Description                                          |
| -------------------------------------------------- | ---------------------------------------------------- |
| `input_cost_per_token_above_128k_tokens`           | Input cost above 128k context                        |
| `output_cost_per_token_above_128k_tokens`          | Output cost above 128k context                       |
| `input_cost_per_token_above_200k_tokens`           | Input cost above 200k context                        |
| `input_cost_per_token_above_200k_tokens_priority`  | Input cost above 200k context for priority requests  |
| `output_cost_per_token_above_200k_tokens`          | Output cost above 200k context                       |
| `output_cost_per_token_above_200k_tokens_priority` | Output cost above 200k context for priority requests |
| `input_cost_per_token_above_272k_tokens`           | Input cost above 272k context                        |
| `input_cost_per_token_above_272k_tokens_priority`  | Input cost above 272k context for priority requests  |
| `output_cost_per_token_above_272k_tokens`          | Output cost above 272k context                       |
| `output_cost_per_token_above_272k_tokens_priority` | Output cost above 272k context for priority requests |

### Cache costs

| Field                                                    | Description                                         |
| -------------------------------------------------------- | --------------------------------------------------- |
| `cache_creation_input_token_cost`                        | Cost to write a token to the prompt cache           |
| `cache_read_input_token_cost`                            | Cost to read a cached input token                   |
| `cache_creation_input_token_cost_above_200k_tokens`      | Cache creation above 200k context                   |
| `cache_read_input_token_cost_above_200k_tokens`          | Cache read above 200k context                       |
| `cache_read_input_token_cost_above_200k_tokens_priority` | Cache read above 200k context for priority requests |
| `cache_read_input_token_cost_priority`                   | Priority cache read cost                            |
| `cache_read_input_token_cost_flex`                       | Flex cache read cost                                |
| `cache_read_input_token_cost_above_272k_tokens`          | Cache read above 272k context                       |
| `cache_read_input_token_cost_above_272k_tokens_priority` | Cache read above 272k context for priority requests |
| `cache_read_input_image_token_cost`                      | Cache read cost for image tokens                    |
| `cache_creation_input_audio_token_cost`                  | Cache creation cost for audio tokens                |

### Image costs

| Field                                              | Description                      |
| -------------------------------------------------- | -------------------------------- |
| `input_cost_per_image`                             | Cost per input image             |
| `output_cost_per_image`                            | Cost per generated image         |
| `input_cost_per_pixel`                             | Cost per input pixel             |
| `output_cost_per_pixel`                            | Cost per output pixel            |
| `input_cost_per_image_token`                       | Cost per image input token       |
| `output_cost_per_image_token`                      | Cost per image output token      |
| `output_cost_per_image_low_quality`                | Generated image - low quality    |
| `output_cost_per_image_medium_quality`             | Generated image - medium quality |
| `output_cost_per_image_high_quality`               | Generated image - high quality   |
| `output_cost_per_image_auto_quality`               | Generated image - auto quality   |
| `output_cost_per_image_above_512_and_512_pixels`   | Generated image > 512×512        |
| `output_cost_per_image_above_1024_and_1024_pixels` | Generated image > 1024×1024      |
| `output_cost_per_image_above_2048_and_2048_pixels` | Generated image > 2048×2048      |
| `output_cost_per_image_above_4096_and_4096_pixels` | Generated image > 4096×4096      |

### Audio and video costs

| Field                                               | Description                         |
| --------------------------------------------------- | ----------------------------------- |
| `input_cost_per_audio_token`                        | Cost per audio input token          |
| `input_cost_per_audio_per_second`                   | Cost per second of audio input      |
| `input_cost_per_second`                             | Cost per second of input (generic)  |
| `input_cost_per_video_per_second`                   | Cost per second of video input      |
| `output_cost_per_audio_token`                       | Cost per audio output token         |
| `output_cost_per_second`                            | Cost per second of audio output     |
| `output_cost_per_video_per_second`                  | Cost per second of video output     |
| `input_cost_per_video_per_second_above_128k_tokens` | Video input cost above 128k context |
| `input_cost_per_audio_per_second_above_128k_tokens` | Audio input cost above 128k context |

### Other costs

| Field                               | Description                       |
| ----------------------------------- | --------------------------------- |
| `search_context_cost_per_query`     | Cost per web search context query |
| `code_interpreter_cost_per_session` | Cost per code interpreter session |

***

## Examples

### Flat rate for all Anthropic models

Apply a single input/output rate to every Claude model served through the Anthropic provider:

```json theme={null}
{
  "id": "anthropic-flat-rate",
  "name": "Anthropic flat rate",
  "scope_kind": "provider",
  "provider_id": "anthropic",
  "match_type": "wildcard",
  "pattern": "claude*",
  "request_types": ["chat_completion", "text_completion", "responses"],
  "pricing_patch": "{\"input_cost_per_token\":0.000003,\"output_cost_per_token\":0.000015}"
}
```

### Per-virtual-key negotiated rate

A specific virtual key has negotiated lower prices for GPT-4o:

```json theme={null}
{
  "id": "vk-prod-gpt4o-rate",
  "name": "Prod VK - GPT-4o negotiated rate",
  "scope_kind": "virtual_key",
  "virtual_key_id": "vk-abc123",
  "match_type": "exact",
  "pattern": "gpt-4o",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000002,\"output_cost_per_token\":0.000008}"
}
```

### Image generation override

Override costs for a specific image model at global scope:

```json theme={null}
{
  "id": "dall-e-3-rate",
  "name": "DALL-E 3 custom rate",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "dall-e-3",
  "request_types": ["image_generation"],
  "pricing_patch": "{\"output_cost_per_image_high_quality\":0.04,\"output_cost_per_image_medium_quality\":0.02}"
}
```

### Global catch-all for a new model

Use a global override to add pricing for a model not yet in the built-in catalog:

```json theme={null}
{
  "id": "my-new-model-rate",
  "name": "my-new-model pricing",
  "scope_kind": "global",
  "match_type": "exact",
  "pattern": "my-new-model-v1",
  "request_types": ["chat_completion"],
  "pricing_patch": "{\"input_cost_per_token\":0.000001,\"output_cost_per_token\":0.000005}"
}
```

***

## Next steps

* **[Virtual Keys](../features/governance/virtual-keys)** - Attach virtual-key-scoped overrides to virtual keys for per-customer pricing
* **[Budget and Limits](../features/governance/budget-and-limits)** - Understand how costs are tracked against budgets
* **[Model Catalog](../architecture/framework/model-catalog)** - Deep dive into how pricing resolution and cost calculation work internally