Overview
Bifrost exposes Prometheus metrics via two methods:
- Pull-based (Scraping): Traditional /metrics endpoint that Prometheus can scrape
- Push-based (Push Gateway): Push metrics to a Prometheus Push Gateway for cluster deployments
For multi-node deployments: Use the Push Gateway method to ensure accurate metric aggregation. Traditional scraping may miss nodes behind load balancers.
Pull-based Scraping
Bifrost automatically exposes a /metrics endpoint when the telemetry plugin is enabled (it is enabled by default). No additional configuration is needed.
When Bifrost’s authentication is enabled (auth_config.is_enabled = true), the /metrics endpoint requires Basic auth credentials. You must include the same admin_username and admin_password from your auth_config in the Prometheus scrape configuration. Without this, Prometheus will receive 401 Unauthorized responses and scraping will silently fail.
Prometheus Configuration
Add Bifrost to your Prometheus configuration (prometheus.yml):
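A minimal sketch of such a scrape job, assuming Bifrost is reachable at a hypothetical host and port (bifrost.example.internal:8080) and that a 15-second scrape interval suits your setup. Only the /metrics path comes from this page; the job name, target, and interval are illustrative and should be adjusted to your deployment.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "bifrost"            # illustrative job name
    metrics_path: /metrics         # endpoint exposed by the telemetry plugin
    scrape_interval: 15s           # assumption; tune to your needs
    static_configs:
      # Hypothetical address; replace with your Bifrost host:port.
      - targets: ["bifrost.example.internal:8080"]
```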
If Bifrost authentication is enabled, add basic_auth to your scrape config:
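As a sketch, the same kind of scrape job with credentials might look like the following. The target address is hypothetical, and the placeholder values stand in for the admin_username and admin_password from your auth_config.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "bifrost"            # illustrative job name
    metrics_path: /metrics
    basic_auth:
      username: "<admin_username>" # placeholder: value from auth_config
      password: "<admin_password>" # placeholder: value from auth_config
      # Prometheus also accepts password_file instead of an inline password.
    static_configs:
      # Hypothetical address; replace with your Bifrost host:port.
      - targets: ["bifrost.example.internal:8080"]
```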
Endpoint
Push-based (Push Gateway)
For multi-node cluster deployments, the Prometheus plugin pushes metrics to a Prometheus Push Gateway. This ensures all nodes’ metrics are captured regardless of load balancer routing.
Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| push_gateway_url | string | ✅ Yes | - | Push Gateway URL (e.g., http://pushgateway:9091) |
| job_name | string | ❌ No | bifrost | Job label for pushed metrics |
| instance_id | string | ❌ No | hostname | Instance identifier for metric grouping |
| push_interval | integer | ❌ No | 15 | Push interval in seconds (1-300) |
| basic_auth | object | ❌ No | - | Basic auth credentials |
Basic Auth Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| username | string | ✅ Yes | Basic auth username |
| password | string | ✅ Yes | Basic auth password |
Setup
Via the UI:
- Navigate to Observability → Prometheus in the Bifrost UI
- The /metrics endpoint is shown at the top for scraping configuration
- To enable Push Gateway:
  - Enter the Push Gateway URL
  - Configure Job Name and Push Interval as needed
  - Optionally set a custom Instance ID
  - Enable Basic Authentication if required
  - Toggle Enable Push Gateway on
- Click Save Prometheus Configuration
Available Metrics
The following metrics are available from both the /metrics endpoint and Push Gateway:
HTTP Metrics
| Metric | Type | Description |
|---|---|---|
| http_requests_total | Counter | Total HTTP requests by path, method, status |
| http_request_duration_seconds | Histogram | HTTP request latency |
| http_request_size_bytes | Histogram | Request body size |
| http_response_size_bytes | Histogram | Response body size |
Bifrost LLM Metrics
| Metric | Type | Description |
|---|---|---|
| bifrost_upstream_requests_total | Counter | Total requests to LLM providers |
| bifrost_upstream_latency_seconds | Histogram | Provider request latency |
| bifrost_success_requests_total | Counter | Successful provider requests |
| bifrost_error_requests_total | Counter | Failed provider requests |
| bifrost_input_tokens_total | Counter | Total input tokens processed |
| bifrost_output_tokens_total | Counter | Total output tokens generated |
| bifrost_cost_total | Counter | Total cost in USD |
| bifrost_cache_hits_total | Counter | Cache hits by type |
| bifrost_stream_first_token_latency_seconds | Histogram | Time to first token (streaming) |
| bifrost_stream_inter_token_latency_seconds | Histogram | Inter-token latency (streaming) |
| bifrost_key_rotation_events_total | Counter | Per-attempt retry/rotation events with key identifiers (see below; v1.5.0-prerelease4+) |
Default Labels
All Bifrost metrics include these labels:
- provider - LLM provider name
- model - Model identifier
- method - Request type (chat, completion, embedding, etc.)
- virtual_key_id / virtual_key_name - Virtual key identifiers
- selected_key_id / selected_key_name - API key that successfully served the request ("" when all attempts failed)
- number_of_retries - Total attempts minus one (across all keys)
- fallback_index - Fallback position
- team_id / team_name - Team identifiers (if governance enabled)
- customer_id / customer_name - Customer identifiers (if governance enabled)
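To illustrate how these metrics and labels can be combined, here is a sketch of a Prometheus recording-rules file. The rule names, the 5-minute window, and the choice of the 95th percentile are assumptions for illustration. The _bucket suffix is the standard series name Prometheus histograms expose.

```yaml
# bifrost-rules.yml (hypothetical file name, loaded via rule_files in prometheus.yml)
groups:
  - name: bifrost-llm
    rules:
      # Fraction of upstream provider requests that failed, per provider.
      - record: bifrost:upstream_error_ratio:rate5m
        expr: |
          sum by (provider) (rate(bifrost_error_requests_total[5m]))
          /
          sum by (provider) (rate(bifrost_upstream_requests_total[5m]))
      # Approximate p95 provider latency, per provider and model.
      - record: bifrost:upstream_latency_seconds:p95
        expr: |
          histogram_quantile(
            0.95,
            sum by (le, provider, model) (
              rate(bifrost_upstream_latency_seconds_bucket[5m])
            )
          )
```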
v1.5.0-prerelease4+: selected_key_id / selected_key_name are only populated when the request succeeds. On final errors both are empty; use bifrost_key_rotation_events_total or the attempt_trail log field to see which keys were tried.
Key Rotation Events v1.5.0-prerelease4+
bifrost_key_rotation_events_total is incremented once per failed attempt (not per request), giving you time-series visibility into retry pressure:
| Label | Values | Description |
|---|---|---|
| provider | e.g. openai | LLM provider |
| requested_model | e.g. gpt-4o | Model as requested (before any alias resolution) |
| key_id | UUID | The provider API key that failed on this attempt |
| key_name | string | Human-readable name of the provider API key |
| fail_reason | error type string | Provider error type (e.g. rate_limit_error, network_error) |
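One way to act on this counter is an alerting rule on its rate. The following is a sketch only: the alert name, the threshold of 0.1 failed attempts per second, the 10-minute hold, and the severity label are illustrative assumptions, not recommended values.

```yaml
# bifrost-alerts.yml (hypothetical file name, loaded via rule_files in prometheus.yml)
groups:
  - name: bifrost-key-rotation
    rules:
      - alert: BifrostKeyRetryPressure   # illustrative alert name
        # Rate of failed attempts per provider key and failure reason.
        expr: |
          sum by (provider, key_name, fail_reason) (
            rate(bifrost_key_rotation_events_total[5m])
          ) > 0.1
        for: 10m                         # assumption; avoids alerting on brief spikes
        labels:
          severity: warning
        annotations:
          summary: "Key {{ $labels.key_name }} on {{ $labels.provider }} is failing ({{ $labels.fail_reason }})"
```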
Push Gateway Setup
If you don’t have a Push Gateway running, deploy one:
Docker
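As one possible approach, a Push Gateway can be run with Docker Compose. This sketch assumes the official prom/pushgateway image and its default port 9091; the service name and restart policy are illustrative choices.

```yaml
# docker-compose.yml (sketch)
services:
  pushgateway:
    image: prom/pushgateway      # official Prometheus Push Gateway image
    ports:
      - "9091:9091"              # Push Gateway listens on 9091 by default
    restart: unless-stopped      # assumption; pick the policy that fits your environment
```

With this service running, the push_gateway_url in Bifrost would point at the service on port 9091, for example http://pushgateway:9091 when both containers share a network.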
Kubernetes (Helm)
Configure Prometheus to Scrape Push Gateway
Add to your prometheus.yml:
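A minimal sketch of such a scrape job, assuming the Push Gateway is reachable at pushgateway:9091 (the address used elsewhere on this page); the job name is illustrative.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "pushgateway"      # illustrative job name
    honor_labels: true           # keep the job/instance labels pushed by Bifrost
    static_configs:
      - targets: ["pushgateway:9091"]
```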
The honor_labels: true setting is important - it preserves the job and instance labels pushed by Bifrost instead of overwriting them with the Push Gateway’s labels.
Pull vs Push: When to Use Each
| Scenario | Recommended Method |
|---|---|
| Single Bifrost instance | Pull (scraping) |
| Multiple instances, direct access | Pull (scraping) |
| Multiple instances behind load balancer | Push (Push Gateway) |
| Kubernetes with service mesh | Pull or Push |
| Serverless / ephemeral instances | Push (Push Gateway) |
Why Push for Clusters?
When multiple Bifrost instances run behind a load balancer:
- Scraping randomness: Each scrape may hit different nodes, missing metrics from others
- Instance tracking: Push Gateway properly tracks per-instance metrics via the instance label
- Aggregation: Downstream tools (Grafana, Datadog) can aggregate across all instances
Troubleshooting
Push Gateway Connection Failed
- Verify the Push Gateway URL is correct and reachable from Bifrost
- Check firewall rules between Bifrost and Push Gateway
- Ensure Push Gateway is running: curl http://pushgateway:9091/metrics
Metrics Not Appearing
- Verify the telemetry plugin is enabled (required for metrics collection)
- Check Bifrost logs for push errors
- Verify Prometheus is scraping the Push Gateway with honor_labels: true
Authentication Failed
- Double-check username and password
- Ensure basic auth is configured on the Push Gateway side
- Check for special characters that may need escaping

