Overview

Bifrost exposes Prometheus metrics via two methods:
  1. Pull-based (Scraping): Traditional /metrics endpoint that Prometheus can scrape
  2. Push-based (Push Gateway): Push metrics to a Prometheus Push Gateway for cluster deployments
For multi-node deployments, use the Push Gateway method so that every node's metrics are captured. Scraping through a load balancer reaches only one node per scrape, so metrics from the other nodes can be missed.

Pull-based Scraping

Bifrost automatically exposes a /metrics endpoint when the telemetry plugin is enabled (enabled by default). No additional configuration is needed.

Prometheus Configuration

Add a Bifrost scrape job to your prometheus.yml:
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s

Endpoint

GET /metrics
Returns metrics in Prometheus exposition format.
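
As a rough illustration of what a scrape returns, the sketch below parses a small exposition-format payload into (name, labels, value) tuples. The metric name comes from the tables in this document, but the payload, label values, and the `parse_exposition` helper are made up for the example. It is a minimal parser that ignores timestamps and does not unescape label values; it is not a replacement for a real Prometheus client library.

```python
import re

# Illustrative payload only: the label values and numbers are invented.
SAMPLE = """\
# HELP bifrost_upstream_requests_total Total requests to LLM providers
# TYPE bifrost_upstream_requests_total counter
bifrost_upstream_requests_total{provider="provider-a",model="model-a"} 42
bifrost_upstream_requests_total{provider="provider-b",model="model-b"} 7
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label="value" pairs; escaped characters are matched but not unescaped
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_exposition(SAMPLE):
    print(name, labels, value)
```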

Push-based (Push Gateway)

For multi-node cluster deployments, the Prometheus plugin pushes metrics to a Prometheus Push Gateway. This ensures all nodes’ metrics are captured regardless of load balancer routing.
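
To make the push model more concrete: the Pushgateway HTTP API groups pushed metrics by labels encoded in the URL path, in the form /metrics/job/<job>/<label>/<value>. The sketch below builds such a URL from a job name and an instance identifier. It assumes Bifrost groups by job and instance (the Configuration section describes instance_id as the identifier for metric grouping); the exact path Bifrost uses is not stated here, and `push_url` is a hypothetical helper.

```python
from urllib.parse import quote


def push_url(base_url, job, instance):
    """Build a Pushgateway grouping URL for a job/instance pair.

    Values that are empty or contain '/' need the Pushgateway's base64
    label encoding, which this sketch does not implement.
    """
    for value in (job, instance):
        if not value or "/" in value:
            raise ValueError(f"unsupported grouping value: {value!r}")
    return (
        f"{base_url.rstrip('/')}/metrics"
        f"/job/{quote(job, safe='')}"
        f"/instance/{quote(instance, safe='')}"
    )


print(push_url("http://pushgateway:9091", "bifrost", "node-1"))
```

Because each instance pushes to its own grouping key, one node's push does not overwrite another node's metrics.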

Configuration

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| push_gateway_url | string | ✅ Yes | - | Push Gateway URL (e.g., http://pushgateway:9091) |
| job_name | string | ❌ No | bifrost | Job label for pushed metrics |
| instance_id | string | ❌ No | hostname | Instance identifier for metric grouping |
| push_interval | integer | ❌ No | 15 | Push interval in seconds (1-300) |
| basic_auth | object | ❌ No | - | Basic auth credentials |

Basic Auth Configuration

| Field | Type | Required | Description |
|---|---|---|---|
| username | string | ✅ Yes | Basic auth username |
| password | string | ✅ Yes | Basic auth password |
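
The constraints in the two tables above can be checked before a configuration is saved. The following is a minimal sketch, not Bifrost's own validation code: `validate_push_config` is a hypothetical helper, the field names, defaults, and the 1-300 range are taken from the tables, and the http/https scheme check is an added assumption.

```python
import socket
from urllib.parse import urlparse


def validate_push_config(cfg):
    """Return cfg with documented defaults applied, or raise ValueError."""
    url = cfg.get("push_gateway_url")
    if not url:
        raise ValueError("push_gateway_url is required")
    parsed = urlparse(url)
    # Assumption: only http(s) URLs are meaningful for a Push Gateway.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("push_gateway_url must be an http(s) URL")

    interval = cfg.get("push_interval", 15)
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("push_interval must be an integer")
    if not 1 <= interval <= 300:
        raise ValueError("push_interval must be between 1 and 300 seconds")

    auth = cfg.get("basic_auth")
    if auth is not None:
        for key in ("username", "password"):
            if not auth.get(key):
                raise ValueError(f"basic_auth.{key} is required")

    return {
        "push_gateway_url": url,
        "job_name": cfg.get("job_name", "bifrost"),
        # The table lists the hostname as the default instance identifier.
        "instance_id": cfg.get("instance_id") or socket.gethostname(),
        "push_interval": interval,
        "basic_auth": auth,
    }


print(validate_push_config({"push_gateway_url": "http://pushgateway:9091"}))
```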

Setup

  1. Navigate to Observability > Prometheus in the Bifrost UI
  2. The /metrics endpoint is shown at the top for scraping configuration
  3. To enable Push Gateway:
    • Enter the Push Gateway URL
    • Configure Job Name and Push Interval as needed
    • Optionally set a custom Instance ID
    • Enable Basic Authentication if required
    • Toggle Enable Push Gateway on
    • Click Save Prometheus Configuration

Available Metrics

The following metrics are available from both the /metrics endpoint and Push Gateway:

HTTP Metrics

| Metric | Type | Description |
|---|---|---|
| http_requests_total | Counter | Total HTTP requests by path, method, status |
| http_request_duration_seconds | Histogram | HTTP request latency |
| http_request_size_bytes | Histogram | Request body size |
| http_response_size_bytes | Histogram | Response body size |

Bifrost LLM Metrics

| Metric | Type | Description |
|---|---|---|
| bifrost_upstream_requests_total | Counter | Total requests to LLM providers |
| bifrost_upstream_latency_seconds | Histogram | Provider request latency |
| bifrost_success_requests_total | Counter | Successful provider requests |
| bifrost_error_requests_total | Counter | Failed provider requests |
| bifrost_input_tokens_total | Counter | Total input tokens processed |
| bifrost_output_tokens_total | Counter | Total output tokens generated |
| bifrost_cost_total | Counter | Total cost in USD |
| bifrost_cache_hits_total | Counter | Cache hits by type |
| bifrost_stream_first_token_latency_seconds | Histogram | Time to first token (streaming) |
| bifrost_stream_inter_token_latency_seconds | Histogram | Inter-token latency (streaming) |

Default Labels

All Bifrost metrics include these labels:
  • provider - LLM provider name
  • model - Model identifier
  • method - Request type (chat, completion, embedding, etc.)
  • virtual_key_id / virtual_key_name - Virtual key identifiers
  • selected_key_id / selected_key_name - Actual key used
  • number_of_retries - Retry count
  • fallback_index - Fallback position
  • team_id / team_name - Team identifiers (if governance enabled)
  • customer_id / customer_name - Customer identifiers (if governance enabled)
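
These labels are what make per-provider or per-team breakdowns possible. As a sketch of the idea, the snippet below sums sample values by one label, which is roughly what a PromQL expression such as `sum by (provider) (bifrost_input_tokens_total)` computes at a single point in time. The sample data and the `sum_by` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical samples of bifrost_input_tokens_total: (labels, value) pairs.
samples = [
    ({"provider": "provider-a", "model": "model-a"}, 100.0),
    ({"provider": "provider-a", "model": "model-b"}, 50.0),
    ({"provider": "provider-b", "model": "model-c"}, 30.0),
]


def sum_by(samples, label):
    """Sum sample values grouped by the given label name."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Samples without the label fall into the empty-string group.
        totals[labels.get(label, "")] += value
    return dict(totals)


print(sum_by(samples, "provider"))
```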

Push Gateway Setup

If you don’t have a Push Gateway running, deploy one:

Docker

docker run -d -p 9091:9091 prom/pushgateway

Kubernetes (Helm)

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install pushgateway prometheus-community/prometheus-pushgateway

Configure Prometheus to Scrape Push Gateway

Add to your prometheus.yml:
scrape_configs:
  - job_name: 'pushgateway'
    honor_labels: true
    static_configs:
      - targets: ['pushgateway:9091']
The honor_labels: true setting is important: it preserves the job and instance labels pushed by Bifrost. Without it, Prometheus attaches the job and instance of the Push Gateway scrape target and renames the pushed values to exported_job and exported_instance.
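
The effect of honor_labels can be modelled in a few lines. This is a simplified sketch of how Prometheus resolves conflicts between labels present in the scraped data and labels attached from the scrape target; `resolve_labels` is a hypothetical function and the label values are examples.

```python
def resolve_labels(scraped, target, honor_labels):
    """Merge scraped labels with target labels the way a scrape would.

    honor_labels=True keeps the scraped value on conflict;
    honor_labels=False keeps the target value and renames the scraped
    one to exported_<name>.
    """
    result = dict(scraped)
    for name, value in target.items():
        if name in scraped:
            if honor_labels:
                continue  # keep the value pushed by the application
            result["exported_" + name] = scraped[name]
        result[name] = value
    return result


pushed = {"job": "bifrost", "instance": "node-1"}
gateway_target = {"job": "pushgateway", "instance": "pushgateway:9091"}

print(resolve_labels(pushed, gateway_target, honor_labels=True))
print(resolve_labels(pushed, gateway_target, honor_labels=False))
```

With honor_labels disabled, every node's series would carry the Push Gateway's own job and instance, and per-node dashboards would have to query exported_instance instead.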

Pull vs Push: When to Use Each

| Scenario | Recommended Method |
|---|---|
| Single Bifrost instance | Pull (scraping) |
| Multiple instances, direct access | Pull (scraping) |
| Multiple instances behind load balancer | Push (Push Gateway) |
| Kubernetes with service mesh | Pull or Push |
| Serverless / ephemeral instances | Push (Push Gateway) |

Why Push for Clusters?

When multiple Bifrost instances run behind a load balancer:
  1. Scraping randomness: Each scrape may hit different nodes, missing metrics from others
  2. Instance tracking: Push Gateway tracks metrics per instance via the instance label
  3. Aggregation: Downstream tools (Grafana, Datadog) can aggregate across all instances
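
The first point can be seen in a toy simulation. The sketch below assumes a round-robin load balancer in front of three nodes, each with its own value for the same counter; real load balancers may route differently, and the node names and numbers are invented.

```python
from itertools import cycle

# Hypothetical per-node values of the same counter at one moment in time.
node_counters = {"node-1": 100, "node-2": 250, "node-3": 40}


def scrape_via_load_balancer(counters, n_scrapes):
    """Return the value seen by each scrape when routed round-robin."""
    backends = cycle(counters)  # cycles through node names in order
    return [counters[next(backends)] for _ in range(n_scrapes)]


observed = scrape_via_load_balancer(node_counters, 4)
true_total = sum(node_counters.values())

print("values seen by successive scrapes:", observed)
print("actual total across nodes:", true_total)
```

Each scrape sees a single node's counter, so the scraped series jumps between unrelated values and never reflects the cluster-wide total. With the push model, each node reports under its own instance label and the total can be computed downstream.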

Troubleshooting

Push Gateway Connection Failed

failed to push metrics to push gateway: connection refused
  • Verify the Push Gateway URL is correct and reachable from Bifrost
  • Check firewall rules between Bifrost and Push Gateway
  • Ensure Push Gateway is running: curl http://pushgateway:9091/metrics

Metrics Not Appearing

  • Verify the telemetry plugin is enabled (required for metrics collection)
  • Check Bifrost logs for push errors
  • Verify Prometheus is scraping the Push Gateway with honor_labels: true

Authentication Failed

  • Double-check username and password
  • Ensure basic auth is configured on the Push Gateway side
  • Check for special characters that may need escaping
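
Whether a special character needs escaping depends on where the credentials end up. The sketch below shows two common cases using only the Python standard library: in an HTTP Authorization header the credentials are base64-encoded, so most characters are safe, while credentials embedded in a URL must be percent-encoded. How Bifrost stores and sends the credentials is not specified here; both helpers are hypothetical and intended for debugging a Push Gateway by hand.

```python
import base64
from urllib.parse import quote


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    if ":" in username:
        # The first colon separates username from password in Basic auth.
        raise ValueError("username must not contain ':'")
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def quote_userinfo(value):
    """Percent-encode a username or password for use inside a URL."""
    return quote(value, safe="")


print(basic_auth_header("Aladdin", "open sesame"))
print(quote_userinfo("p@ss/word"))
```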