# Prometheus

> Monitor Bifrost metrics with Prometheus scraping or Push Gateway for multi-node deployments

## Overview

Bifrost exposes Prometheus metrics via two methods:

1. **Pull-based (Scraping)**: Traditional `/metrics` endpoint that Prometheus can scrape
2. **Push-based (Push Gateway)**: Push metrics to a Prometheus Push Gateway for cluster deployments

<Note>
  **For multi-node deployments**: Use the Push Gateway method to ensure accurate metric aggregation. Scraping a load-balanced address reaches only one node per scrape, so metrics from the other nodes are missed.
</Note>

***

## Pull-based Scraping

Bifrost automatically exposes a `/metrics` endpoint whenever the telemetry plugin is enabled, which it is by default. No additional configuration is needed.

<Info>
  When Bifrost's authentication is enabled (`auth_config.is_enabled = true`), the `/metrics` endpoint requires Basic auth credentials. Include the same `admin_username` and `admin_password` from your `auth_config` in the Prometheus scrape configuration. Without them, every scrape receives `401 Unauthorized`, the target is reported as down in Prometheus, and no metrics are collected.
</Info>

### Prometheus Configuration

Add Bifrost to your Prometheus `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
```

If Bifrost authentication is enabled, add `basic_auth` to your scrape config:

```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
    basic_auth:
      username: '<admin_username>'
      password: '<admin_password>'
```
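Before pointing Prometheus at an authenticated instance, it can help to confirm the credentials by hand. The following is a minimal sketch using only the Python standard library. `build_basic_auth_header` and `fetch_metrics` are hypothetical helper names, and `bifrost-host:8080` is the same placeholder address used in the scrape config above.

```python
import base64
import urllib.request


def build_basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic `Authorization` header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def fetch_metrics(base_url: str, username: str, password: str,
                  timeout: float = 5.0) -> str:
    """GET <base_url>/metrics with Basic auth and return the response body.

    urllib raises urllib.error.HTTPError on a 401, so wrong credentials
    surface as an exception instead of an empty result.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        headers={"Authorization": build_basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (not executed here; needs a reachable Bifrost instance):
#   body = fetch_metrics("http://bifrost-host:8080",
#                        "<admin_username>", "<admin_password>")
```

If the call raises an `HTTPError` with status 401, the same credentials will also fail in the Prometheus `basic_auth` block.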

### Endpoint

```
GET /metrics
```

Returns metrics in Prometheus exposition format.
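For ad-hoc inspection outside Prometheus, the text exposition format is simple enough to parse by hand. The sketch below is an illustration, not a full parser: it skips `# HELP` and `# TYPE` lines, does not unescape label values, and the sample text uses made-up values and a reduced label set rather than real Bifrost output.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> list:
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified pattern cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative input only; real output carries more labels and metrics.
SAMPLE_TEXT = """\
# TYPE bifrost_upstream_requests_total counter
bifrost_upstream_requests_total{provider="openai",model="gpt-4o"} 42
http_requests_total 7
"""

samples = parse_exposition(SAMPLE_TEXT)
```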

***

## Push-based (Push Gateway)

For multi-node cluster deployments, the telemetry plugin can push metrics to a [Prometheus Push Gateway](https://github.com/prometheus/pushgateway). This ensures all nodes' metrics are captured regardless of load balancer routing.

### Configuration

| Field              | Type      | Required | Default   | Description                                        |
| ------------------ | --------- | -------- | --------- | -------------------------------------------------- |
| `push_gateway_url` | `string`  | ✅ Yes    | -         | Push Gateway URL (e.g., `http://pushgateway:9091`) |
| `job_name`         | `string`  | ❌ No     | `bifrost` | Job label for pushed metrics                       |
| `instance_id`      | `string`  | ❌ No     | hostname  | Instance identifier for metric grouping            |
| `push_interval`    | `integer` | ❌ No     | `15`      | Push interval in seconds (1-300)                   |
| `basic_auth`       | `object`  | ❌ No     | -         | Basic auth credentials                             |

### Basic Auth Configuration

| Field      | Type     | Required | Description         |
| ---------- | -------- | -------- | ------------------- |
| `username` | `string` | ✅ Yes    | Basic auth username |
| `password` | `string` | ✅ Yes    | Basic auth password |
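The constraints in the two tables above can be checked before a config is deployed. The following sketch mirrors the documented defaults and ranges. `normalize_push_gateway_config` is a hypothetical helper written for this page, not Bifrost's own validation logic, which may differ in detail.

```python
import socket


def normalize_push_gateway_config(cfg: dict) -> dict:
    """Apply the documented defaults and reject out-of-range values."""
    url = cfg.get("push_gateway_url")
    if not url:
        raise ValueError("push_gateway_url is required")

    interval = cfg.get("push_interval", 15)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("push_interval must be an integer")
    if not 1 <= interval <= 300:
        raise ValueError("push_interval must be between 1 and 300 seconds")

    result = {
        "push_gateway_url": url,
        "job_name": cfg.get("job_name", "bifrost"),
        # The table lists the hostname as the default instance identifier.
        "instance_id": cfg.get("instance_id") or socket.gethostname(),
        "push_interval": interval,
    }

    auth = cfg.get("basic_auth")
    if auth is not None:
        if not auth.get("username") or not auth.get("password"):
            raise ValueError("basic_auth needs both username and password")
        result["basic_auth"] = {
            "username": auth["username"],
            "password": auth["password"],
        }
    return result
```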

***

## Setup

<Tabs group="setup-method">
  <Tab title="UI">
    1. Navigate to **Observability** → **Prometheus** in the Bifrost UI
    2. The `/metrics` endpoint is shown at the top for scraping configuration
    3. To enable Push Gateway:
       * Enter the **Push Gateway URL**
       * Configure **Job Name** and **Push Interval** as needed
       * Optionally set a custom **Instance ID**
       * Enable **Basic Authentication** if required
       * Toggle **Enable Push Gateway** on
       * Click **Save Prometheus Configuration**
  </Tab>

  <Tab title="Config File">
    ```json
    {
      "plugins": [
        {
          "name": "telemetry",
          "enabled": true,
          "config": {
            "push_gateway": {
              "enabled": true,
              "push_gateway_url": "http://pushgateway:9091",
              "job_name": "bifrost",
              "push_interval": 15
            }
          }
        }
      ]
    }
    ```

    ### With Basic Auth

    ```json
    {
      "plugins": [
        {
          "name": "telemetry",
          "enabled": true,
          "config": {
            "push_gateway": {
              "enabled": true,
              "push_gateway_url": "http://pushgateway:9091",
              "job_name": "bifrost",
              "push_interval": 15,
              "instance_id": "bifrost-node-1",
              "basic_auth": {
                "username": "admin",
                "password": "secret"
              }
            }
          }
        }
      ]
    }
    ```
  </Tab>
</Tabs>

***

## Available Metrics

The following metrics are available from both the `/metrics` endpoint and Push Gateway:

### HTTP Metrics

| Metric                          | Type      | Description                                 |
| ------------------------------- | --------- | ------------------------------------------- |
| `http_requests_total`           | Counter   | Total HTTP requests by path, method, status |
| `http_request_duration_seconds` | Histogram | HTTP request latency                        |
| `http_request_size_bytes`       | Histogram | Request body size                           |
| `http_response_size_bytes`      | Histogram | Response body size                          |

### Bifrost LLM Metrics

| Metric                                       | Type      | Description                                                                                       |
| -------------------------------------------- | --------- | ------------------------------------------------------------------------------------------------- |
| `bifrost_upstream_requests_total`            | Counter   | Total requests to LLM providers                                                                   |
| `bifrost_upstream_latency_seconds`           | Histogram | Provider request latency                                                                          |
| `bifrost_success_requests_total`             | Counter   | Successful provider requests                                                                      |
| `bifrost_error_requests_total`               | Counter   | Failed provider requests                                                                          |
| `bifrost_input_tokens_total`                 | Counter   | Total input tokens processed                                                                      |
| `bifrost_output_tokens_total`                | Counter   | Total output tokens generated                                                                     |
| `bifrost_cost_total`                         | Counter   | Total cost in USD                                                                                 |
| `bifrost_cache_hits_total`                   | Counter   | Cache hits by type                                                                                |
| `bifrost_stream_first_token_latency_seconds` | Histogram | Time to first token (streaming)                                                                   |
| `bifrost_stream_inter_token_latency_seconds` | Histogram | Inter-token latency (streaming)                                                                   |
| `bifrost_key_rotation_events_total`          | Counter   | Per-attempt retry/rotation events with key identifiers (see below) <sup>v1.5.0-prerelease4+</sup> |

### Default Labels

All Bifrost metrics include these labels:

* `provider` - LLM provider name
* `model` - Model identifier
* `method` - Request type (chat, completion, embedding, etc.)
* `virtual_key_id` / `virtual_key_name` - Virtual key identifiers
* `selected_key_id` / `selected_key_name` - API key that successfully served the request (`""` when all attempts failed)
* `number_of_retries` - Total attempts minus one (across all keys)
* `fallback_index` - Fallback position
* `team_id` / `team_name` - Team identifiers (if governance enabled)
* `customer_id` / `customer_name` - Customer identifiers (if governance enabled)

<Note>
  **v1.5.0-prerelease4+**: `selected_key_id` / `selected_key_name` are only populated when the request succeeds. On final errors both are empty - use `bifrost_key_rotation_events_total` or the `attempt_trail` log field to see which keys were tried.
</Note>

### Key Rotation Events <sup>v1.5.0-prerelease4+</sup>

`bifrost_key_rotation_events_total` is incremented once per **failed attempt** (not per request), giving you time-series visibility into retry pressure. It carries the following labels:

| Label             | Values            | Description                                                    |
| ----------------- | ----------------- | -------------------------------------------------------------- |
| `provider`        | e.g. `openai`     | LLM provider                                                   |
| `requested_model` | e.g. `gpt-4o`     | Model as requested (before any alias resolution)               |
| `key_id`          | UUID              | The provider API key that failed on this attempt               |
| `key_name`        | string            | Human-readable name of the provider API key                    |
| `fail_reason`     | error type string | Provider error type (e.g. `rate_limit_error`, `network_error`) |

**Example queries:**

```promql
# Failed-attempt rate per provider and failure reason
sum by (provider, fail_reason) (
  rate(bifrost_key_rotation_events_total[5m])
)

# Which specific keys are hitting rate limits most often
topk(5, sum by (provider, key_name, fail_reason) (
  rate(bifrost_key_rotation_events_total{fail_reason="rate_limit_error"}[1h])
))
```
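To make the per-attempt semantics concrete, the sketch below models a single request as a list of attempts and derives the values described above: `number_of_retries`, `selected_key_id`, and one rotation event per failed attempt. The `Attempt` structure and `summarize_request` are hypothetical and exist only for illustration; they do not reflect Bifrost's internal types or the exact shape of the `attempt_trail` log field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attempt:
    """One provider call within a single request (hypothetical structure)."""
    key_id: str
    fail_reason: Optional[str] = None  # None means the attempt succeeded


def summarize_request(attempts: List[Attempt]) -> dict:
    """Derive the label values and rotation events for one request."""
    failed = [a for a in attempts if a.fail_reason is not None]
    succeeded = [a for a in attempts if a.fail_reason is None]
    return {
        # Total attempts minus one, across all keys.
        "number_of_retries": max(len(attempts) - 1, 0),
        # Empty string when every attempt failed.
        "selected_key_id": succeeded[-1].key_id if succeeded else "",
        # One counter increment per failed attempt, labelled by key and reason.
        "rotation_events": [(a.key_id, a.fail_reason) for a in failed],
    }


summary = summarize_request([
    Attempt("key-a", "rate_limit_error"),
    Attempt("key-b", "network_error"),
    Attempt("key-c"),
])
```

In this example a single request produces two rotation events and a `number_of_retries` of 2, which is why the counter grows faster than the request count under retry pressure.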

***

## Push Gateway Setup

If you don't have a Push Gateway running, deploy one:

### Docker

```bash
docker run -d -p 9091:9091 prom/pushgateway
```

### Kubernetes (Helm)

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install pushgateway prometheus-community/prometheus-pushgateway
```

### Configure Prometheus to Scrape Push Gateway

Add to your `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'pushgateway'
    honor_labels: true
    static_configs:
      - targets: ['pushgateway:9091']
```

<Note>
  The `honor_labels: true` setting is important: it preserves the `job` and `instance` labels pushed by Bifrost. Without it, Prometheus overwrites them with the Push Gateway target's own labels and moves the pushed values to `exported_job` and `exported_instance`.
</Note>
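The effect of `honor_labels` can be modelled in a few lines. This is a simplified sketch of Prometheus's label conflict handling, written for illustration only; `merge_labels` is a hypothetical function, and relabeling rules and empty-label edge cases are left out.

```python
def merge_labels(scraped: dict, target: dict, honor_labels: bool) -> dict:
    """Combine labels from a scraped sample with the scrape target's labels."""
    merged = dict(scraped)
    for name, value in target.items():
        if name not in merged:
            merged[name] = value  # no conflict: attach the target label
        elif not honor_labels:
            # Conflict, honor_labels off: the target label wins and the
            # scraped value is kept under an "exported_" prefix.
            merged["exported_" + name] = merged[name]
            merged[name] = value
        # Conflict, honor_labels on: keep the scraped (pushed) value as-is.
    return merged


pushed = {"job": "bifrost", "instance": "bifrost-node-1", "provider": "openai"}
target = {"job": "pushgateway", "instance": "pushgateway:9091"}

kept = merge_labels(pushed, target, honor_labels=True)
renamed = merge_labels(pushed, target, honor_labels=False)
```

With `honor_labels: true`, queries can group by `instance` to separate Bifrost nodes. Without it, every series appears to come from the single `pushgateway:9091` instance.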

***

## Pull vs Push: When to Use Each

| Scenario                                | Recommended Method      |
| --------------------------------------- | ----------------------- |
| Single Bifrost instance                 | Pull (scraping)         |
| Multiple instances, direct access       | Pull (scraping)         |
| Multiple instances behind load balancer | **Push (Push Gateway)** |
| Kubernetes with service mesh            | Pull or Push            |
| Serverless / ephemeral instances        | **Push (Push Gateway)** |

### Why Push for Clusters?

When multiple Bifrost instances run behind a load balancer:

1. **Scraping randomness**: The load balancer routes each scrape to an arbitrary node, so a single scrape reflects one node and misses the others
2. **Instance tracking**: Push Gateway properly tracks per-instance metrics via `instance` label
3. **Aggregation**: Downstream tools (Grafana, Datadog) can aggregate across all instances
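Per-instance tracking works because the Push Gateway stores metrics in groups addressed by a URL path of the form `/metrics/job/<job>/<label>/<value>`. The sketch below builds that path under the assumption that Bifrost uses `job` and `instance` as the grouping key, which the label descriptions on this page suggest but do not confirm. `grouping_url` is a hypothetical helper.

```python
import base64
from urllib.parse import quote


def _path_segment(label: str, value: str) -> str:
    """Encode one grouping-key pair as a Push Gateway path segment."""
    if "/" in value:
        # Values containing "/" use the "@base64" form with URL-safe base64.
        encoded = base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")
        return f"{label}@base64/{encoded}"
    return f"{label}/{quote(value, safe='')}"


def grouping_url(push_gateway_url: str, job_name: str, instance_id: str) -> str:
    """Return the URL of the metric group for one Bifrost instance."""
    return (
        push_gateway_url.rstrip("/")
        + "/metrics/"
        + _path_segment("job", job_name)
        + "/"
        + _path_segment("instance", instance_id)
    )


url = grouping_url("http://pushgateway:9091", "bifrost", "bifrost-node-1")
```

A `DELETE` request to such a URL removes that group from the Push Gateway, which is useful for clearing metrics left behind by a node that no longer exists.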

***

## Troubleshooting

### Push Gateway Connection Failed

```
failed to push metrics to push gateway: connection refused
```

* Verify the Push Gateway URL is correct and reachable from Bifrost
* Check firewall rules between Bifrost and Push Gateway
* Ensure Push Gateway is running: `curl http://pushgateway:9091/metrics`
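When `curl` is not available on the Bifrost host, a plain TCP check distinguishes "nothing is listening" from an HTTP-level problem. This is a minimal standard-library sketch; `host_and_port` and `is_reachable` are hypothetical helper names, and the URL is the placeholder used throughout this page.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple:
    """Extract (host, port) from a URL, falling back to the scheme default."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call (run it from the Bifrost host):
#   is_reachable("http://pushgateway:9091")
```

A `False` result points to DNS, firewall, or a stopped Push Gateway. A `True` result with continued push errors points to the URL path or authentication instead.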

### Metrics Not Appearing

* Verify the telemetry plugin is enabled (required for metrics collection)
* Check Bifrost logs for push errors
* Verify Prometheus is scraping the Push Gateway with `honor_labels: true`

### Authentication Failed

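* Confirm that the `username` and `password` under `basic_auth` match the credentials the Push Gateway expects
* Ensure basic auth is configured on the Push Gateway side
* If the password contains `"` or `\`, escape it as `\"` or `\\` in the JSON config file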
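One way to avoid escaping mistakes in the config file is to let a JSON serializer produce the `basic_auth` block. The sketch below uses Python's `json` module; `basic_auth_fragment` is a hypothetical helper and the password shown is a made-up example.

```python
import json


def basic_auth_fragment(username: str, password: str) -> str:
    """Serialize a basic_auth block with correct JSON string escaping."""
    return json.dumps(
        {"basic_auth": {"username": username, "password": password}},
        indent=2,
    )


# A password containing a double quote and a backslash.
fragment = basic_auth_fragment("admin", 'p"a\\ss')
print(fragment)
```

The printed fragment can be pasted into the `push_gateway` section of the config file in place of a hand-written `basic_auth` object.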
