
Overview

Model limits let you enforce spending caps and rate limits keyed on a specific model (or all models), an optional provider, and a scope that determines who the limit applies to. They are the unified control plane for all model-level governance in Bifrost:
  • Global provider budgets — cap what OpenAI (or any provider) can spend across all traffic
  • Virtual key top-level budgets — limit how much a specific virtual key can spend across all its providers
  • Virtual key per-provider budgets — limit what a virtual key can spend on a single provider
  • Per-model limits — enforce fine-grained caps on individual models for any of the above scopes
The user scope is available in Bifrost Enterprise. Support for customer and team scopes is coming soon.

Scope system

Every model limit has a scope that determines the audience it applies to.
| Scope | Who it applies to | Scope Target required? |
| --- | --- | --- |
| global | All traffic through Bifrost | No |
| virtual_key | All requests made with a specific virtual key | Yes (the virtual key ID) |
| user | All requests made by a specific user (Enterprise only) | Yes (the user ID) |
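
The scope rules above can be checked before a limit is submitted. The sketch below is illustrative only: `validate_scope` is a hypothetical helper, not part of Bifrost, and it assumes the scope target is passed as a plain string ID.

```python
from typing import Optional

# Scopes from the table above that need a Scope Target.
SCOPES_REQUIRING_TARGET = {"virtual_key", "user"}
KNOWN_SCOPES = {"global"} | SCOPES_REQUIRING_TARGET


def validate_scope(scope: str, scope_id: Optional[str] = None) -> None:
    """Raise ValueError if the scope is unknown or is missing its target."""
    if scope not in KNOWN_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    if scope in SCOPES_REQUIRING_TARGET and not scope_id:
        raise ValueError(f"scope {scope!r} requires a scope target")


validate_scope("global")                          # no target needed
validate_scope("virtual_key", "vk-staging-team")  # target supplied
```

Calling `validate_scope("virtual_key")` without a target raises `ValueError`.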
Scope + model name combinations:
| model_name | provider | scope | What it governs |
| --- | --- | --- | --- |
| * (All Models) | openai | global | Global OpenAI provider budget |
| * (All Models) | (none) | virtual_key | That VK’s top-level cross-provider budget |
| * (All Models) | anthropic | virtual_key | That VK’s Anthropic-only budget |
| gpt-4o | openai | global | Hard cap on gpt-4o usage across all traffic |
| claude-3-5-sonnet-20241022 | (none) | virtual_key | Per-VK cap on a specific model |
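
One way to read the combinations above is as three independent dimensions: which models, which providers, and which audience. The sketch below spells that reading out. `describe_limit` is a hypothetical helper written for this page, and the wording it returns is not taken from Bifrost.

```python
from typing import Optional


def describe_limit(model_name: str, provider: Optional[str], scope: str) -> str:
    """Summarize what a (model_name, provider, scope) combination governs."""
    audience = {
        "global": "all traffic",
        "virtual_key": "one virtual key",
        "user": "one user",
    }[scope]
    # "*" is the wildcard shown as "All Models" in the dashboard.
    models = "all models" if model_name == "*" else f"model {model_name}"
    # A missing provider means the limit spans every provider.
    where = f"on {provider}" if provider else "across all providers"
    return f"{models} {where}, for {audience}"


print(describe_limit("*", None, "virtual_key"))
```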

Configuration

Navigate to Budget & Limits → Model Limits in the Bifrost dashboard.

Table view

The table shows all configured model limits with their current usage. Use the toolbar to find what you need:
  • Search — filter by model name
  • Scope dropdown — show only global or virtual_key limits
  • Provider dropdown — show only limits for a specific provider
The Scope Target column links directly back to the parent entity (e.g. clicking a virtual key badge takes you to that VK).

Adding a model limit

Click Add Model Limit to open the configuration sheet.
  1. Provider — select a specific provider or leave as All Providers
  2. Model Name — search and select a model, or pick All Models to cover every model for the chosen provider/scope
  3. Scope — choose Global or Virtual Key
  4. Scope Target — appears when scope is Virtual Key; select the target virtual key
  5. Budget — add one or more budget lines, each with a dollar cap and reset duration. Multiple budgets per limit are supported (e.g. $50/day + $500/month).
  6. Rate Limits — optionally set token and/or request limits with their own reset durations
Click Create Limit to save.
Model name and scope are locked after creation. To change them, delete the limit and recreate it.

Examples

Global provider cap

Keep OpenAI spend under $1,000/month, regardless of which virtual key triggered the request:
curl -X POST "http://localhost:8080/api/governance/model-configs" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "*",
    "provider": "openai",
    "scope": "global",
    "budgets": [
      { "max_limit": 1000.00, "reset_duration": "1M" }
    ]
  }'
You can also manage this limit from the Providers page → Governance tab for each provider; both paths write to the same underlying entry.
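
For scripted setups, the same call can be built from Python's standard library. This is a minimal sketch that reuses the endpoint and field names from the curl example. `build_model_limit_request` and `BASE_URL` are illustrative names, and the block only constructs the request without sending it.

```python
import json
import urllib.request

# Same host as the curl examples; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def build_model_limit_request(model_name, scope, budgets, provider=None, scope_id=None):
    """Build (but do not send) a POST request that creates a model limit."""
    payload = {"model_name": model_name, "scope": scope, "budgets": budgets}
    # provider and scope_id are optional, so only include them when set.
    if provider is not None:
        payload["provider"] = provider
    if scope_id is not None:
        payload["scope_id"] = scope_id
    return urllib.request.Request(
        f"{BASE_URL}/api/governance/model-configs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_model_limit_request(
    "*", "global",
    [{"max_limit": 1000.00, "reset_duration": "1M"}],
    provider="openai",
)
# To send it against a running instance: urllib.request.urlopen(req)
```

The other examples on this page differ only in the arguments, for instance `scope="virtual_key"` with `scope_id="vk-staging-team"`.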

Virtual key top-level budget

Cap the total spend for a virtual key across all its providers:
curl -X POST "http://localhost:8080/api/governance/model-configs" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "*",
    "scope": "virtual_key",
    "scope_id": "vk-staging-team",
    "budgets": [
      { "max_limit": 200.00, "reset_duration": "1M" }
    ]
  }'

Virtual key per-provider budget

Let the staging VK use Anthropic up to $50/month independently of its OpenAI spend:
curl -X POST "http://localhost:8080/api/governance/model-configs" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "*",
    "provider": "anthropic",
    "scope": "virtual_key",
    "scope_id": "vk-staging-team",
    "budgets": [
      { "max_limit": 50.00, "reset_duration": "1M" }
    ]
  }'
These VK governance limits are also editable through the Virtual Keys page → provider governance section.

Multi-budget daily + monthly cap

Protect against both runaway daily spikes and monthly overruns on a single model:
curl -X POST "http://localhost:8080/api/governance/model-configs" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "gpt-4o",
    "provider": "openai",
    "scope": "global",
    "budgets": [
      { "max_limit": 30.00, "reset_duration": "1d" },
      { "max_limit": 500.00, "reset_duration": "1M" }
    ]
  }'
All budgets must pass for a request to be allowed — a spike that exhausts the daily cap blocks further requests until it resets, even if the monthly cap has room remaining.
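
The all-must-pass rule can be illustrated with a small sketch. It assumes a budget counts as exhausted once its recorded usage reaches `max_limit`. The `Budget` class and its `current_usage` field are hypothetical and do not reflect Bifrost's internal representation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Budget:
    max_limit: float          # dollar cap
    reset_duration: str       # e.g. "1d" or "1M"
    current_usage: float = 0.0


def first_exhausted(budgets: List[Budget]) -> Optional[Budget]:
    """Return the first exhausted budget, or None if every budget has room."""
    for budget in budgets:
        if budget.current_usage >= budget.max_limit:
            return budget
    return None


# The daily cap is spent while the monthly cap still has room.
budgets = [Budget(30.00, "1d", current_usage=30.00),
           Budget(500.00, "1M", current_usage=120.00)]
blocked_by = first_exhausted(budgets)
print(blocked_by.reset_duration if blocked_by else "allowed")
```

Here the request is blocked by the daily budget until that budget resets, even though only $120 of the monthly $500 has been used.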

How limits interact

When a request arrives, Bifrost checks every applicable model limit independently. All must pass:
Request: VK "staging" → openai → gpt-4o

Checks run in order:
  1. Global gpt-4o limit (if any)
  2. Global openai limit (if any)
  3. VK "staging" top-level limit (if any)
  4. VK "staging" → openai limit (if any)
If any single limit is exhausted, the request is blocked. After a successful response, the request's cost is counted against every matching limit.
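
The check-then-charge flow described above can be sketched as follows. The matching rule is an assumption made for illustration: a limit applies when its model is `*` or equals the requested model, its provider is unset or equals the requested provider, and its scope is global or names the requesting virtual key. The `Limit` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Limit:
    name: str
    model_name: str            # "*" means all models
    provider: Optional[str]    # None means all providers
    scope: str                 # "global" or "virtual_key"
    scope_id: Optional[str]    # virtual key ID for virtual_key scope
    max_limit: float
    usage: float = 0.0


def matches(limit: Limit, vk: str, provider: str, model: str) -> bool:
    """Assumed matching rule: every dimension must be a wildcard or equal."""
    if limit.scope == "virtual_key" and limit.scope_id != vk:
        return False
    if limit.model_name != "*" and limit.model_name != model:
        return False
    if limit.provider is not None and limit.provider != provider:
        return False
    return True


def check_request(limits: List[Limit], vk: str, provider: str, model: str) -> Optional[str]:
    """Return the name of the first exhausted matching limit, or None if allowed."""
    for limit in limits:
        if matches(limit, vk, provider, model) and limit.usage >= limit.max_limit:
            return limit.name
    return None


def record_cost(limits: List[Limit], vk: str, provider: str, model: str, cost: float) -> None:
    """After a successful response, charge the cost to every matching limit."""
    for limit in limits:
        if matches(limit, vk, provider, model):
            limit.usage += cost


limits = [
    Limit("global gpt-4o", "gpt-4o", "openai", "global", None, 500.0),
    Limit("global openai", "*", "openai", "global", None, 1000.0),
    Limit("vk staging top-level", "*", None, "virtual_key", "staging", 200.0),
    Limit("vk staging openai", "*", "openai", "virtual_key", "staging", 50.0),
    Limit("vk staging anthropic", "*", "anthropic", "virtual_key", "staging", 50.0),
]
```

In this sketch, a $50 request from VK "staging" to openai/gpt-4o is charged to the first four limits. It exhausts the VK's openai limit, so further OpenAI requests from that key are blocked while its Anthropic requests still pass.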

Next Steps

  • Budget & Limits — Budgets at the virtual key, team, and customer hierarchy level
  • Virtual Keys — Create and manage virtual keys with provider configs
  • Routing — Automatic failover when a limit is exhausted