# Clustering

> Enterprise-grade high-availability clustering with automatic service discovery, intelligent traffic distribution, and gossip-based state synchronization for production deployments.

## Overview

**Bifrost Clustering** delivers production-ready high availability through a peer-to-peer network architecture with automatic service discovery. The clustering system uses gossip protocols to maintain consistent state across nodes while providing seamless scaling, automatic failover, and zero-downtime deployments.

### Why Clustering Matters

Modern AI gateway deployments require robust infrastructure to handle production workloads:

| Challenge                   | Impact                                      | Clustering Solution                              |
| --------------------------- | ------------------------------------------- | ------------------------------------------------ |
| **Single Point of Failure** | Complete service outage if gateway fails    | Distributed architecture with automatic failover |
| **Traffic Spikes**          | Performance degradation under high load     | Dynamic load distribution across multiple nodes  |
| **Provider Rate Limits**    | Request throttling and service interruption | Distributed rate limit tracking across cluster   |
| **Regional Latency**        | Poor user experience in distant regions     | Geographic distribution with local processing    |
| **Maintenance Windows**     | Service downtime during updates             | Rolling updates with zero-downtime deployment    |
| **Capacity Planning**       | Over/under-provisioning resources           | Elastic scaling based on real-time demand        |

### Core Features

| Feature                         | Description                                                                    |
| ------------------------------- | ------------------------------------------------------------------------------ |
| **Automatic Service Discovery** | 6 discovery methods for any infrastructure (K8s, Consul, etcd, DNS, UDP, mDNS) |
| **Peer-to-Peer Architecture**   | No single point of failure with equal node participation                       |
| **Gossip-Based State Sync**     | Real-time synchronization of traffic patterns and limits                       |
| **Automatic Failover**          | Seamless traffic redistribution when nodes fail                                |
| **Zero-Downtime Updates**       | Rolling deployments without service interruption                               |

***

## Architecture

### Peer-to-Peer Network Design

Bifrost clustering uses a **peer-to-peer (P2P) network** where all nodes are equal participants. Each node:

* Discovers peers automatically using the configured discovery method
* Receives application state and counter updates over gRPC
* Tracks cluster membership and node liveness over a memberlist gossip layer
* Handles failover automatically

### Cluster Communication

Bifrost uses two transports for different responsibilities. Membership and node-liveness signals run over a memberlist gossip layer; everything else (configuration changes, governance counters, routing rules, all replicated entity types) travels over a dedicated gRPC channel.

| Transport             | Default port        | Carries                                                                                                                                                                            |
| --------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Memberlist gossip** | `10101` (TCP + UDP) | Cluster membership, node join/leave, liveness probes, region metadata                                                                                                              |
| **gRPC counter sync** | `10102` (TCP)       | Application messages: governance usage counters, config sync, routing rules, virtual keys, providers, RBAC, MCP tools, pricing, auth config, and 25+ other replicated entity types |

This split lets membership churn (joins, leaves, failure detection) stay isolated from the higher-volume application message stream, and lets each transport be tuned, scaled, and observed independently. The gRPC layer was introduced in v1.4.0; before then, all traffic ran over gossip.

#### Application messages and entity types

Each replicated message carries an `EntityType` identifying the kind of state being broadcast. Bifrost replicates 30+ entity types across the cluster, including: model catalog, virtual keys, providers, governance counters, routing rules, RBAC, MCP tools and tool groups, pricing and pricing overrides, access profiles, prompt deployments, auth configuration, and cluster diagnostics. See the [Replicated Entity Types](#replicated-entity-types) reference for the complete list.

#### Message dedup and invalidation

Each broadcast carries a unique message ID and a `SentAt` timestamp. Receivers run a deduper (default 5-minute TTL) keyed by message ID, so a node that has already processed a given message ignores re-broadcasts of the same ID. When a newer message (one with a later `SentAt`) arrives with the same ID, the existing entry is invalidated and replaced.
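The dedup behavior described above can be sketched as a small TTL table. This is an illustrative model only, not Bifrost's implementation: the `Message` and `Deduper` names are hypothetical, and treating "newer" as "later `SentAt`" is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Message:
    id: str         # message ID carried by the broadcast
    sent_at: float  # SentAt timestamp, in seconds
    payload: str


class Deduper:
    """Illustrative TTL deduper keyed by message ID (hypothetical, not Bifrost's code)."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        # message ID -> (sent_at of the processed message, entry expiry time)
        self._seen: Dict[str, Tuple[float, float]] = {}

    def should_process(self, msg: Message) -> bool:
        now = self.clock()
        # Evict expired entries so the table stays bounded.
        self._seen = {k: v for k, v in self._seen.items() if v[1] > now}
        entry = self._seen.get(msg.id)
        if entry is not None and msg.sent_at <= entry[0]:
            return False  # re-broadcast of a message already processed
        # Unknown ID, or a later SentAt for a known ID: replace the entry.
        self._seen[msg.id] = (msg.sent_at, now + self.ttl)
        return True
```

Injecting the clock keeps the sketch testable: a caller can advance a fake clock past the TTL and confirm that an old ID is accepted again.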

**Convergence**: The cluster is eventually consistent; under normal conditions, all nodes converge to the same state within seconds.

### Node Identity and Region

Each node in the cluster has two pieces of identity metadata:

* **`node_id`** - configured via `cluster_config.node_id` and surfaced in cluster status output, the React Flow topology view, and diagnostics. The actual memberlist node name is derived from this value combined with the gossip port. Set `node_id` explicitly per pod or instance rather than omitting it, so that cluster status stays readable; UUID-style IDs are fine.
* **`region`** - free-form region label (e.g. `"us-east-1"`, `"eu-west"`) read from `cluster_config.region` and propagated in node metadata. Defaults to `"unknown"` when omitted. Region is used for [regional leader election](#leader-election) and for region-aware operations; it does not gate gossip scope or membership.

### Leader Election

Bifrost runs two leader elections in parallel: one cluster-wide and one per region.

| Election            | Scope                                 | What it does                                                                                                                                                   |
| ------------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Cluster leader**  | All `StateAlive` nodes in the cluster | Coordinates cluster-wide singleton tasks (e.g. pricing URL fetch and broadcast) so only one node hits the upstream and other nodes receive the result via gRPC |
| **Regional leader** | Nodes within the same `region` value  | Coordinates region-scoped operations                                                                                                                           |

Election is deterministic: the lexicographically-first healthy member wins. The election loop re-evaluates membership every 30 seconds, so leadership transfers automatically when nodes join, leave, or fail. There is nothing to configure - leader election runs whenever clustering is enabled.
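Because the rule is deterministic, every node can compute the same answer from the same membership view without exchanging votes. A minimal sketch of that rule follows; the `Member` type and function names are hypothetical, and which string Bifrost actually compares (node name versus `node_id`) is an assumption here.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Member(NamedTuple):
    name: str    # node name used for ordering (assumed comparison key)
    region: str  # value of cluster_config.region
    alive: bool  # True when the member is healthy


def elect_cluster_leader(members: List[Member]) -> Optional[str]:
    """Pick the lexicographically-first healthy member across the cluster."""
    healthy = [m.name for m in members if m.alive]
    return min(healthy) if healthy else None


def elect_regional_leaders(members: List[Member]) -> Dict[str, str]:
    """Pick the lexicographically-first healthy member within each region."""
    by_region: Dict[str, List[str]] = defaultdict(list)
    for m in members:
        if m.alive:
            by_region[m.region].append(m.name)
    return {region: min(names) for region, names in by_region.items()}
```

Note that plain string ordering places `bifrost-10` before `bifrost-2`, so zero-padded or fixed-width IDs make the outcome easier to predict.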

### Minimum Node Requirements

<Note>
  **Recommended: at least 3 nodes** for optimal fault tolerance.
</Note>

| Cluster Size | Fault Tolerance  | Use Case                      |
| ------------ | ---------------- | ----------------------------- |
| **3 nodes**  | 1 node failure   | Small production deployments  |
| **5 nodes**  | 2 node failures  | Medium production deployments |
| **7+ nodes** | 3+ node failures | Large enterprise deployments  |

***

## Configuration Basics

### Core Configuration Structure

The new clustering configuration uses a `cluster_config` object with integrated service discovery:

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "node_id": "bifrost-1",
    "region": "us-east-1",
    "discovery": {
      "enabled": true,
      "type": "kubernetes",
      "service_name": "bifrost-cluster"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    },
    "grpc": {
      "port": 10102,
      "dial_timeout_seconds": 5
    }
  }
}
```

<Note>
  **Required ports for v1.4.x and later:**

  * `10101/TCP` and `10101/UDP` for memberlist gossip (membership and liveness)
  * `10102/TCP` for the gRPC counter sync transport (application messages)

  Both ports must be reachable peer-to-peer between cluster nodes. NetworkPolicies, security groups, and firewall rules need to allow traffic on both.
</Note>

Discovery-specific fields (e.g. `k8s_label_selector`, `consul_address`, `etcd_endpoints`) slot into the `discovery` object alongside `type` - see each method's section below.

<Note>
  At startup, cluster mode requires either a non-empty `peers` list or `discovery.enabled: true`.
</Note>

### Common Discovery Configuration Fields

All discovery methods support these common fields:

| Field                   | Type     | Required    | Description                                                                                     |
| ----------------------- | -------- | ----------- | ----------------------------------------------------------------------------------------------- |
| `enabled`               | boolean  | No          | Enable/disable discovery (must be `true` to use discovery at runtime)                           |
| `type`                  | string   | Yes         | Discovery type: `kubernetes`, `consul`, `etcd`, `dns`, `udp`, `mdns`                            |
| `service_name`          | string   | Conditional | Required for `consul`, `etcd`, `udp`, and typically `mdns`; optional for `kubernetes` and `dns` |
| `bind_port`             | integer  | No          | Port for cluster communication (default: 10101)                                                 |
| `dial_timeout`          | duration | No          | Discovery timeout (default: 10s)                                                                |
| `allowed_address_space` | array    | No          | CIDR ranges to filter discovered nodes (e.g., `["10.0.0.0/8"]`)                                 |
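To illustrate what `allowed_address_space` filtering amounts to, the sketch below keeps only discovered addresses that fall inside one of the configured CIDR ranges. The function name is hypothetical, and treating an empty list as "no filtering" is an assumption rather than documented behavior.

```python
import ipaddress
from typing import Iterable, List


def filter_by_address_space(addresses: Iterable[str],
                            allowed_cidrs: List[str]) -> List[str]:
    """Keep only the addresses contained in at least one allowed CIDR range."""
    if not allowed_cidrs:
        return list(addresses)  # assumption: no ranges configured means no filtering
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    kept = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        # Membership is False when the IP version differs from the network's.
        if any(ip in net for net in networks):
            kept.append(addr)
    return kept
```

For example, with `["10.0.0.0/8"]` a discovered peer at `10.1.2.3` is kept while one at `192.168.1.5` is dropped.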

### Gossip Configuration

| Field                      | Description                                        | Default |
| -------------------------- | -------------------------------------------------- | ------- |
| `port`                     | Memberlist gossip port (used for both TCP and UDP) | 10101   |
| `config.timeout_seconds`   | Health check timeout, in seconds                   | 10      |
| `config.success_threshold` | Successful checks to mark healthy                  | 3       |
| `config.failure_threshold` | Failed checks to mark unhealthy                    | 3       |
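One way to picture how the two thresholds interact is a small state tracker that flips a node's status only after enough checks in a row agree. This is a sketch under the assumption that the thresholds count consecutive results; the `HealthTracker` class is hypothetical and not taken from Bifrost.

```python
class HealthTracker:
    """Tracks one peer's health from a stream of check results (illustrative)."""

    def __init__(self, success_threshold: int = 3, failure_threshold: int = 3,
                 healthy: bool = True) -> None:
        self.success_threshold = success_threshold
        self.failure_threshold = failure_threshold
        self.healthy = healthy
        self.successes = 0  # consecutive successful checks
        self.failures = 0   # consecutive failed checks

    def record(self, ok: bool) -> bool:
        """Record one check result and return the resulting health status."""
        if ok:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.success_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy
```

With the defaults, a single failed check does not mark a node unhealthy, which damps flapping on a briefly congested network.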

### gRPC Configuration

The gRPC transport carries all application messages and counter sync between nodes. It is enabled automatically whenever `cluster_config.enabled` is `true`; configuration is optional.

| Field                  | Description                          | Default |
| ---------------------- | ------------------------------------ | ------- |
| `port`                 | TCP port for the cluster gRPC server | 10102   |
| `dial_timeout_seconds` | Timeout when dialing a peer for gRPC | 5       |

If you omit the `grpc` block entirely, both defaults apply. Override only when the defaults conflict with your environment (e.g. another service already binding `10102`).

### Top-level Fields

| Field       | Type    | Required    | Description                                                                               |
| ----------- | ------- | ----------- | ----------------------------------------------------------------------------------------- |
| `enabled`   | boolean | Yes         | Master switch for cluster mode                                                            |
| `node_id`   | string  | Recommended | Logical identifier for this node, used in cluster status and topology views               |
| `region`    | string  | No          | Region label (e.g. `us-east-1`); defaults to `unknown` and gates regional leader election |
| `peers`     | array   | Conditional | Static peer list (`host:port` per entry). Required if `discovery.enabled` is `false`      |
| `gossip`    | object  | No          | Memberlist gossip settings (see above)                                                    |
| `grpc`      | object  | No          | gRPC counter sync settings (see above)                                                    |
| `discovery` | object  | Conditional | Auto-discovery settings. Required if `peers` is empty                                     |
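The field rules above lend themselves to a quick pre-flight check before a node is started. The sketch below is a hypothetical helper, not part of Bifrost: it encodes only the rules stated in this section (the `peers` or `discovery.enabled` requirement, the required `discovery.type`, and the documented defaults) and skips the peer check when `type` is `broker`.

```python
from typing import Any, Dict, List


def validate_cluster_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    cluster = config.get("cluster_config") or {}
    if not cluster.get("enabled"):
        return []  # cluster mode is off, nothing to check
    if cluster.get("type") == "broker":
        return []  # broker mode does not use peers or discovery
    problems = []
    discovery = cluster.get("discovery") or {}
    if not cluster.get("peers") and not discovery.get("enabled"):
        problems.append("set a non-empty `peers` list or `discovery.enabled: true`")
    if discovery.get("enabled") and not discovery.get("type"):
        problems.append("`discovery.type` is required when discovery is enabled")
    return problems


def effective_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the documented defaults for fields that were omitted."""
    cluster = config.get("cluster_config") or {}
    gossip = cluster.get("gossip") or {}
    grpc = cluster.get("grpc") or {}
    return {
        "region": cluster.get("region") or "unknown",
        "gossip_port": gossip.get("port", 10101),
        "grpc_port": grpc.get("port", 10102),
        "grpc_dial_timeout_seconds": grpc.get("dial_timeout_seconds", 5),
    }
```

Running a check like this in CI against each node's `config.json` catches a missing peer source before the node fails at startup.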

***

## Broker Mode

The default `mesh` clustering described above is peer-to-peer: every node must
accept inbound gossip and gRPC connections from every other node. Some
environments do not allow that. **Google Cloud Run**, for example, exposes only
a single inbound serving port per instance, runs ephemeral instances with no
stable addresses, and offers no instance-to-instance networking - so memberlist
gossip and the gRPC mesh cannot form.

**Broker mode** solves this. Instead of connecting to each other, every node
makes a single **outbound** connection to a central **broker** process. The
broker is a pure relay: a message received from one node is fanned out to all
other connected nodes. The broker also pushes a **roster** (the list of
connected node IDs) to every node.

```mermaid theme={null}
flowchart LR
    A["Node A<br/>(Cloud Run)"] -->|outbound stream| B["Broker<br/>(relay)"]
    C["Node B<br/>(Cloud Run)"] -->|outbound stream| B
    D["Node C<br/>(Cloud Run)"] -->|outbound stream| B
    B -.->|fan-out| A
    B -.->|fan-out| C
    B -.->|fan-out| D
```

A message from Node A travels to the broker, which forwards it to Node B and Node C (never back to A).

Because nodes only need **outbound** connectivity, broker mode runs on any
platform that can make an outbound gRPC connection.
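The relay behavior can be modeled in a few lines. The sketch below is an in-memory stand-in with hypothetical names (`Broker`, `publish`, `inboxes`); the real broker relays over gRPC streams, which this model deliberately leaves out.

```python
from typing import Dict, List, Tuple


class Broker:
    """In-memory model of a pure relay: fan out to everyone except the sender."""

    def __init__(self) -> None:
        # node ID -> list of (sender, payload) delivered to that node
        self.inboxes: Dict[str, List[Tuple[str, str]]] = {}

    def connect(self, node_id: str) -> None:
        self.inboxes[node_id] = []

    def disconnect(self, node_id: str) -> None:
        self.inboxes.pop(node_id, None)

    def roster(self) -> List[str]:
        """The list of connected node IDs that the broker pushes to nodes."""
        return sorted(self.inboxes)

    def publish(self, sender: str, payload: str) -> List[str]:
        """Forward a message to all connected nodes other than the sender."""
        recipients = [n for n in self.inboxes if n != sender]
        for node_id in recipients:
            self.inboxes[node_id].append((sender, payload))
        return recipients
```

The model makes the single-instance constraint easy to see: two separate `Broker` objects would hold disjoint `inboxes`, so a message published to one would never reach nodes connected to the other.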

### How it differs from mesh mode

| Aspect             | Mesh mode                            | Broker mode                                           |
| ------------------ | ------------------------------------ | ----------------------------------------------------- |
| Connectivity       | Every node connects to every node    | Each node makes one outbound connection to the broker |
| Membership         | memberlist gossip                    | Roster pushed by the broker                           |
| Discovery          | 6 discovery methods                  | Not used - the broker is the rendezvous point         |
| Ports on a node    | `10101/TCP+UDP`, `10102/TCP` inbound | None - outbound only                                  |
| Leader election    | Deterministic over gossip members    | Deterministic over the broker roster (same algorithm) |
| Entity replication | Over the gRPC mesh                   | Over the broker relay - identical entity types        |

Leadership in broker mode uses the **same deterministic rule** as mesh mode:
the lexicographically-smallest node ID in the roster is the leader. Every node
computes this independently from the roster the broker pushes, so there is
nothing to configure and no broker-side election.

### Configuration

Nodes run in broker mode by setting `cluster_config.type` to `broker` and
pointing at the broker address:

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "type": "broker",
    "region": "us-east-1",
    "broker": {
      "address": "broker.example.run.app:443",
      "tls": true,
      "auth_token": "your-shared-secret"
    }
  }
}
```

See the [config.json cluster reference](/deployment-guides/config-json/cluster#broker-mode)
for the full field list.

### Running the broker

The broker is **not** a separate binary - the same Bifrost Enterprise image
runs as the broker when started with the `-mode=broker` flag (or the
`BIFROST_MODE=broker` environment variable):

```bash theme={null}
bifrost-enterprise -mode=broker -app-dir /app/data
```

In broker mode the process branches before the normal server bootstrap: it starts
**only** the relay gRPC server and runs no database, providers, plugins, or
HTTP gateway. It reads `cluster_config.broker` from the same `config.json` and
serves on `broker.listen_port` (default `50051`). A standard gRPC health
service is registered for readiness probes.

### Deploying on Cloud Run

<Warning>
  All nodes must connect to the **same** broker process. Fan-out cannot span
  multiple broker instances, so the broker must run as a **single instance**.
</Warning>

**Broker service:**

* Deploy as a Cloud Run service with `min-instances=1` and `max-instances=1`.
* Enable **HTTP/2** (end-to-end) so gRPC works.
* Expose on `:443`; nodes use the service URL as `broker.address` with `tls: true`.
* Set the Cloud Run container port to `50051` so it matches
  `cluster_config.broker.listen_port`, or override `listen_port` to `8080` to
  match Cloud Run's default `$PORT`.
* Set `auth_token` so only your nodes can connect.

**Node services:**

* Deploy normally - they only need outbound access to the broker URL.
* Set `cluster_config.type` to `broker` and `broker.address` to the broker URL.

<Note>
  Cloud Run caps a single request - including a streaming gRPC connection - at
  60 minutes. When the broker stream is closed by the platform, each node
  automatically reconnects with exponential backoff, so this is transparent.
  gRPC keepalive pings are enabled on both sides to keep otherwise-idle streams
  alive within that window.
</Note>
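For intuition about what reconnecting with exponential backoff looks like in practice, the sketch below generates a delay schedule. The base delay, growth factor, cap, and jitter values are illustrative assumptions; the actual values Bifrost uses are not stated here.

```python
import random
from typing import Iterator, Optional


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0,
                   attempts: int = 8, jitter: float = 0.0,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield reconnect delays that grow geometrically up to a cap."""
    rng = rng or random.Random()
    delay = base
    for _ in range(attempts):
        # Jitter spreads reconnects out so nodes do not all retry at once.
        jittered = delay * (1.0 + rng.uniform(-jitter, jitter))
        yield min(cap, jittered)
        delay = min(cap, delay * factor)
```

With the default arguments and no jitter, the schedule is 1, 2, 4, 8, 16, 32, 60, 60 seconds. A non-zero `jitter` is worth considering when many nodes lose the same broker stream at once.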

### Roster and reconnection

The broker pushes the roster on three triggers: to all nodes whenever a node
connects or disconnects, to a joining node as the first frame it receives (the
full roster), and to all nodes on a periodic rebroadcast (every \~20s) that acts
as a safety net for any node that missed an event-driven update. A node that
stops receiving roster heartbeats treats the broker as down and enters its
reconnect loop.
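On the node side, this amounts to remembering the latest roster and noticing when heartbeats stop. The sketch below is a hypothetical model: the `RosterWatch` name and the choice to tolerate three missed rebroadcasts before declaring the broker down are assumptions, not documented values.

```python
import time
from typing import Callable, List, Optional


class RosterWatch:
    """Tracks the latest roster and detects missing roster heartbeats."""

    def __init__(self, rebroadcast_seconds: float = 20.0, missed_allowed: int = 3,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Assumed rule: tolerate a few missed rebroadcasts before giving up.
        self.timeout = rebroadcast_seconds * missed_allowed
        self.clock = clock
        self.roster: List[str] = []
        self.last_roster_at = clock()

    def on_roster(self, node_ids: List[str]) -> None:
        """Called for every roster frame, event-driven or periodic."""
        self.roster = sorted(node_ids)
        self.last_roster_at = self.clock()

    def broker_down(self) -> bool:
        """True when no roster has arrived within the timeout."""
        return self.clock() - self.last_roster_at > self.timeout

    def leader(self) -> Optional[str]:
        """Lexicographically-smallest node ID in the current roster."""
        return self.roster[0] if self.roster else None
```

Because each node derives the leader locally from its own copy of the roster, two nodes holding different roster versions can briefly name different leaders.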

<Note>
  There is a brief window after a node disconnects where nodes can disagree on
  the leader until the updated roster lands everywhere - the same
  eventual-consistency window that gossip has in mesh mode.
</Note>

***

## Service Discovery Methods

Bifrost supports 6 service discovery methods to fit any infrastructure. Choose based on your deployment environment:

<CardGroup cols={2}>
  <Card title="Kubernetes" icon="dharmachakra" href="#kubernetes-discovery">
    Native K8s pod discovery via label selectors
  </Card>

  <Card title="Consul" icon="diamond" href="#consul-discovery">
    HashiCorp Consul service mesh integration
  </Card>

  <Card title="etcd" icon="database" href="#etcd-discovery">
    etcd-based distributed discovery
  </Card>

  <Card title="DNS" icon="globe" href="#dns-discovery">
    Traditional DNS SRV record discovery
  </Card>

  <Card title="UDP Broadcast" icon="tower-broadcast" href="#udp-broadcast-discovery">
    Local network broadcast discovery
  </Card>

  <Card title="mDNS" icon="wifi" href="#mdns-discovery">
    Multicast DNS for local development
  </Card>
</CardGroup>

***

## Kubernetes Discovery

**Best for:** Kubernetes deployments with StatefulSets or Deployments

Kubernetes discovery uses the K8s API to automatically discover pods based on label selectors. This is the most common method for cloud-native deployments.

### How It Works

1. Each Bifrost pod queries the Kubernetes API for pods matching the label selector
2. Discovers pod IPs automatically as pods scale up/down
3. Works seamlessly with StatefulSets, Deployments, and DaemonSets
4. No external dependencies required

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "kubernetes",
      "service_name": "bifrost-cluster",
      "k8s_namespace": "default",
      "k8s_label_selector": "app=bifrost"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter            | Required | Description                                                                                                                 | Example                                   |
| -------------------- | -------- | --------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------- |
| `k8s_namespace`      | No       | Kubernetes namespace to search. Defaults to `default` when empty; set it explicitly if Bifrost runs in a custom namespace. | `"production"`                            |
| `k8s_label_selector` | Yes      | Label selector for pod discovery                                                                                            | `"app=bifrost"`, `"app=bifrost,env=prod"` |
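A selector such as `app=bifrost,env=prod` matches a pod only when every listed label is present with the given value. The sketch below shows that matching rule locally; it is a simplified, hypothetical helper that handles only `key=value` terms, whereas the Kubernetes API also accepts operators such as `!=` and set-based expressions.

```python
from typing import Dict


def parse_selector(selector: str) -> Dict[str, str]:
    """Parse a comma-separated list of key=value terms into a dict."""
    pairs: Dict[str, str] = {}
    for term in selector.split(","):
        term = term.strip()
        if not term:
            continue
        key, sep, value = term.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"unsupported selector term: {term!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def matches(pod_labels: Dict[str, str], selector: str) -> bool:
    """True when the pod carries every label required by the selector."""
    required = parse_selector(selector)
    return all(pod_labels.get(k) == v for k, v in required.items())
```

A pod labeled only `app=bifrost` does not match `app=bifrost,env=prod`, which is a common reason for a cluster that shows a single member.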

### Kubernetes Deployment Example

<Tabs>
  <Tab title="StatefulSet">
    ```yaml theme={null}
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: bifrost
      namespace: default
    spec:
      serviceName: bifrost-cluster
      replicas: 3
      selector:
        matchLabels:
          app: bifrost
      template:
        metadata:
          labels:
            app: bifrost
        spec:
          serviceAccountName: bifrost
          containers:
          - name: bifrost
            image: <enterprise_repo_base_url>/bifrost:latest
            ports:
            - containerPort: 8080
              name: http
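            # Gossip uses port 10101 over both TCP and UDP
            - containerPort: 10101
              name: gossip
              protocol: TCP
            - containerPort: 10101
              name: gossip-udp
              protocol: UDP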
            - containerPort: 10102
              name: grpc
            volumeMounts:
            - name: config
              mountPath: /etc/bifrost
          volumes:
          - name: config
            configMap:
              name: bifrost-config
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bifrost
      namespace: default
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: bifrost-pod-reader
      namespace: default
    rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: bifrost-pod-reader
      namespace: default
    subjects:
    - kind: ServiceAccount
      name: bifrost
      namespace: default
    roleRef:
      kind: Role
      name: bifrost-pod-reader
      apiGroup: rbac.authorization.k8s.io
    ```
  </Tab>

  <Tab title="Deployment">
    ```yaml theme={null}
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: bifrost
      namespace: default
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: bifrost
      template:
        metadata:
          labels:
            app: bifrost
        spec:
          serviceAccountName: bifrost
          containers:
          - name: bifrost
            image: <enterprise_repo_base_url>/bifrost:latest
            ports:
            - containerPort: 8080
              name: http
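            # Gossip uses port 10101 over both TCP and UDP
            - containerPort: 10101
              name: gossip
              protocol: TCP
            - containerPort: 10101
              name: gossip-udp
              protocol: UDP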
            - containerPort: 10102
              name: grpc
            volumeMounts:
            - name: config
              mountPath: /etc/bifrost
          volumes:
          - name: config
            configMap:
              name: bifrost-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: bifrost-cluster
      namespace: default
    spec:
      clusterIP: None
      selector:
        app: bifrost
      ports:
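      # Gossip uses port 10101 over both TCP and UDP
      - port: 10101
        name: gossip
        protocol: TCP
      - port: 10101
        name: gossip-udp
        protocol: UDP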
      - port: 10102
        name: grpc
    ```
  </Tab>
</Tabs>

### Troubleshooting

<Accordion title="Pods not discovering each other">
  **Symptoms**: Cluster shows only 1 member, pods running in isolation

  **Solutions**:

  * Verify ServiceAccount has RBAC permissions to list pods
  * Check label selector matches pod labels exactly
  * Ensure namespace is correct (defaults to "default")
  * Verify gossip port (10101) and gRPC port (10102) are not blocked by NetworkPolicies
  * Check logs for "error listing pods" messages
</Accordion>

<Accordion title="Permission denied errors">
  **Symptoms**: "error getting kubernetes config" or "forbidden" errors

  **Solutions**:

  * Create ServiceAccount for Bifrost pods
  * Create Role with `get`, `list`, `watch` permissions on pods
  * Create RoleBinding linking ServiceAccount to Role
  * Verify RBAC is enabled in cluster
</Accordion>

<Accordion title="Cluster forms but nodes show as unhealthy">
  **Symptoms**: Nodes discovered but marked as "suspect" or "dead"

  **Solutions**:

  * Verify gossip port (10101) and gRPC port (10102) are accessible between pods
  * Check for NetworkPolicies blocking pod-to-pod communication
  * Increase `timeout_seconds` in gossip config if network is slow
  * Verify pods are in Running state with `kubectl get pods`
</Accordion>

***

## Consul Discovery

**Best for:** Consul service mesh environments, multi-datacenter deployments

Consul discovery integrates with HashiCorp Consul for service registration and discovery. Ideal for environments already using Consul for service mesh or service discovery.

### How It Works

1. Each Bifrost node registers itself with Consul on startup
2. Nodes query Consul to discover other Bifrost instances
3. Consul performs health checks on each node
4. Unhealthy nodes are automatically deregistered
5. Supports multi-datacenter deployments

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "consul",
      "service_name": "bifrost-cluster",
      "consul_address": "consul.service.consul:8500"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter        | Required | Description          | Example                                                                        |
| ---------------- | -------- | -------------------- | ------------------------------------------------------------------------------ |
| `consul_address` | No       | Consul agent address | `"localhost:8500"`, `"consul.service.consul:8500"` (default: `localhost:8500`) |

<Note>
  Consul discovery automatically registers each node with health checks. The health check monitors the gossip port (TCP).
</Note>
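When a Consul health check reports a node as critical, it can help to reproduce the same kind of probe by hand. The sketch below is a generic TCP connect check, not Consul's implementation; the function name and the one-second timeout are illustrative choices.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `tcp_port_open("<node-ip>", 10101)` from the host where the Consul agent runs indicates whether the gossip port is reachable over TCP from the agent's network position.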

### Docker Compose with Consul

```yaml theme={null}
version: '3.8'

services:
  consul:
    image: hashicorp/consul:latest
    command: agent -dev -client=0.0.0.0
    ports:
      - "8500:8500"
    networks:
      - bifrost-net

  bifrost-1:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config-node1.json:/etc/bifrost/config.json
    ports:
      - "8080:8080"
    depends_on:
      - consul
    networks:
      - bifrost-net

  bifrost-2:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config-node2.json:/etc/bifrost/config.json
    ports:
      - "8081:8080"
    depends_on:
      - consul
    networks:
      - bifrost-net

  bifrost-3:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config-node3.json:/etc/bifrost/config.json
    ports:
      - "8082:8080"
    depends_on:
      - consul
    networks:
      - bifrost-net

networks:
  bifrost-net:
    driver: bridge
```

### Troubleshooting

<Accordion title="Failed to register with Consul">
  **Symptoms**: "failed to register service with Consul" errors

  **Solutions**:

  * Verify Consul agent is accessible at configured address
  * Check Consul agent logs for registration errors
  * Ensure Consul ACL token has write permissions if ACLs are enabled
  * Verify network connectivity between Bifrost and Consul
  * Check firewall rules allow connections to port 8500
</Accordion>

<Accordion title="Services registered but not discovered">
  **Symptoms**: Consul UI shows services but nodes don't join cluster

  **Solutions**:

  * Verify `service_name` matches across all nodes
  * Check Consul service health checks are passing
  * Ensure gossip port is accessible between nodes
  * Verify nodes are registered in correct datacenter
  * Check for DNS resolution issues if using service DNS names
</Accordion>

<Accordion title="Health checks failing">
  **Symptoms**: Services show as critical in Consul UI

  **Solutions**:

  * Verify gossip port (10101) and gRPC port (10102) are accessible
  * Check Consul agent can reach node's gossip port
  * Increase health check timeout in Consul if needed
  * Review Bifrost logs for startup errors
  * Ensure nodes have correct IP addresses registered
</Accordion>

***

## etcd Discovery

**Best for:** etcd-based distributed systems, existing etcd infrastructure

etcd discovery uses etcd's distributed key-value store for service registration and discovery. Perfect for environments already using etcd or requiring strong consistency.

### How It Works

1. Each Bifrost node registers itself in etcd with a lease
2. Nodes maintain lease through keepalive messages
3. Nodes query etcd prefix to discover other instances
4. Failed nodes' leases expire and are automatically removed
5. Provides strongly consistent service registry

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "etcd",
      "service_name": "bifrost-cluster",
      "etcd_endpoints": [
        "http://etcd-1:2379",
        "http://etcd-2:2379",
        "http://etcd-3:2379"
      ],
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter        | Required | Description                 | Example                                                                     |
| ---------------- | -------- | --------------------------- | --------------------------------------------------------------------------- |
| `etcd_endpoints` | Yes      | Array of etcd endpoint URLs | `["http://localhost:2379"]`, `["https://etcd1:2379", "https://etcd2:2379"]` |
| `dial_timeout`   | No       | Connection timeout          | `"10s"` (default), `"30s"`                                                  |

<Note>
  Each node registers under `/services/{service_name}/{node_id}` with a 30-second TTL lease.
</Note>
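The lease mechanics can be modeled without a running etcd cluster. The sketch below is an in-memory stand-in with hypothetical names (`LeaseRegistry`, `keepalive`, `discover`); it uses the key layout and 30-second TTL from the note above, while storing the node address as the value is an assumption.

```python
import time
from typing import Callable, Dict, Tuple


class LeaseRegistry:
    """In-memory model of lease-based registration (not an etcd client)."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        # key -> (address, lease expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def register(self, service_name: str, node_id: str, address: str) -> str:
        key = f"/services/{service_name}/{node_id}"
        self._entries[key] = (address, self.clock() + self.ttl)
        return key

    def keepalive(self, key: str) -> bool:
        """Extend the lease; returns False if it has already expired."""
        entry = self._entries.get(key)
        if entry is None or entry[1] <= self.clock():
            self._entries.pop(key, None)
            return False
        self._entries[key] = (entry[0], self.clock() + self.ttl)
        return True

    def discover(self, service_name: str) -> Dict[str, str]:
        """List live registrations under the service prefix."""
        now = self.clock()
        prefix = f"/services/{service_name}/"
        return {k: addr for k, (addr, exp) in self._entries.items()
                if k.startswith(prefix) and exp > now}
```

A node that stops sending keepalives simply drops out of `discover` once its lease runs out, which mirrors how failed nodes disappear from the registry without explicit deregistration.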

### Docker Compose with etcd

```yaml theme={null}
version: '3.8'

services:
  etcd:
    image: quay.io/coreos/etcd:latest
    command:
      - etcd
      # The member name must match the name used in --initial-cluster
      - --name=etcd
      - --advertise-client-urls=http://etcd:2379
      - --listen-client-urls=http://0.0.0.0:2379
      - --listen-peer-urls=http://0.0.0.0:2380
      - --initial-cluster=etcd=http://etcd:2380
      - --initial-advertise-peer-urls=http://etcd:2380
    ports:
      - "2379:2379"
      - "2380:2380"
    networks:
      - bifrost-net

  bifrost-1:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8080:8080"
    depends_on:
      - etcd
    networks:
      - bifrost-net

  bifrost-2:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8081:8080"
    depends_on:
      - etcd
    networks:
      - bifrost-net

  bifrost-3:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8082:8080"
    depends_on:
      - etcd
    networks:
      - bifrost-net

networks:
  bifrost-net:
    driver: bridge
```

### Troubleshooting

<Accordion title="Failed to create etcd client">
  **Symptoms**: "etcd client error" on startup

  **Solutions**:

  * Verify etcd endpoints are accessible
  * Check URL format (http\:// or https\://)
  * Ensure etcd cluster is healthy and running
  * Verify network connectivity to etcd endpoints
  * Check firewall rules allow connections to port 2379
  * Increase `dial_timeout` if network is slow
</Accordion>

<Accordion title="Failed to register with etcd">
  **Symptoms**: "failed to register with etcd" errors

  **Solutions**:

  * Verify etcd cluster is accepting writes
  * Check etcd cluster has available space
  * Ensure authentication credentials are configured if etcd has auth enabled
  * Review etcd logs for permission or quota errors
  * Verify node can resolve etcd hostnames
</Accordion>

<Accordion title="Lease keepalive failures">
  **Symptoms**: Nodes repeatedly registering/deregistering

  **Solutions**:

  * Check network stability between nodes and etcd
  * Verify etcd cluster is not overloaded
  * Monitor etcd metrics for high latency
  * Increase lease TTL if network has high latency
  * Check for etcd leader election issues
</Accordion>

***

## DNS Discovery

**Best for:** Traditional infrastructure, static node addresses, cloud DNS services

DNS discovery uses standard DNS resolution to discover cluster nodes. It works with any DNS server and is ideal for static deployments or cloud environments with DNS integration.

### How It Works

1. Configure DNS A records or SRV records for cluster nodes
2. Bifrost queries DNS to resolve configured names
3. All returned IP addresses are treated as potential cluster members
4. Supports multiple DNS names for different node groups
5. Works with internal DNS, cloud DNS, or public DNS

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "dns",
      "service_name": "bifrost-cluster",
      "dns_names": [
        "bifrost-cluster.local",
        "bifrost-nodes.internal.company.com"
      ],
      "bind_port": 10101
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter   | Required | Description                     | Example                                                              |
| ----------- | -------- | ------------------------------- | -------------------------------------------------------------------- |
| `dns_names` | Yes      | Array of DNS names to resolve   | `["bifrost.local"]`, `["node1.local", "node2.local", "node3.local"]` |
| `bind_port` | No       | Port appended to discovered IPs | `10101` (default)                                                    |

<Tip>
  DNS discovery is passive - it doesn't register nodes. You must manage DNS records externally (via DNS server, cloud DNS, or Kubernetes DNS).
</Tip>
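As a rough illustration of the resolution step, the sketch below resolves each configured name and appends `bind_port` to every returned address. It is an approximation under stated assumptions, not Bifrost's code: the function names `resolve_ipv4` and `discover_peers` are hypothetical, and only IPv4 A records are considered.

```python
import socket
from typing import Callable, Iterable, List


def resolve_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the system resolver gives for a name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def discover_peers(
    dns_names: Iterable[str],
    bind_port: int = 10101,
    resolve: Callable[[str], List[str]] = resolve_ipv4,
) -> List[str]:
    """Resolve every name and return de-duplicated "ip:port" candidates."""
    peers: List[str] = []
    for name in dns_names:
        try:
            addresses = resolve(name)
        except OSError:
            continue  # unresolvable name: skip it, other names may still resolve
        for ip in addresses:
            candidate = f"{ip}:{bind_port}"
            if candidate not in peers:
                peers.append(candidate)
    return peers
```

Running `discover_peers(["bifrost-cluster.local"])` from a node is a quick way to see which addresses that node would treat as potential members.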

### Setup Examples

<Tabs>
  <Tab title="Cloud DNS (AWS Route53)">
    ```bash theme={null}
    # Create A records for each node
    aws route53 change-resource-record-sets \
      --hosted-zone-id Z1234567890ABC \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "bifrost-cluster.internal.company.com",
            "Type": "A",
            "TTL": 60,
            "ResourceRecords": [
              {"Value": "10.0.1.10"},
              {"Value": "10.0.1.11"},
              {"Value": "10.0.1.12"}
            ]
          }
        }]
      }'
    ```
  </Tab>

  <Tab title="Kubernetes Headless Service">
    ```yaml theme={null}
    apiVersion: v1
    kind: Service
    metadata:
      name: bifrost-cluster
      namespace: default
    spec:
      clusterIP: None  # Headless service
      selector:
        app: bifrost
      ports:
      - port: 10101
        name: gossip
    ---
    # DNS will resolve bifrost-cluster.default.svc.cluster.local
    # to all pod IPs matching the selector
    ```
  </Tab>

  <Tab title="Local DNS (dnsmasq)">
    ```bash theme={null}
    # /etc/dnsmasq.conf
    address=/bifrost-cluster.local/192.168.1.10
    address=/bifrost-cluster.local/192.168.1.11
    address=/bifrost-cluster.local/192.168.1.12

    # Or use /etc/hosts on each node
    echo "192.168.1.10 node1.bifrost.local" >> /etc/hosts
    echo "192.168.1.11 node2.bifrost.local" >> /etc/hosts
    echo "192.168.1.12 node3.bifrost.local" >> /etc/hosts
    ```
  </Tab>
</Tabs>

### Troubleshooting

<Accordion title="DNS lookup errors">
  **Symptoms**: "dns lookup error" in logs, no nodes discovered

  **Solutions**:

  * Verify DNS names are resolvable: `nslookup bifrost-cluster.local`
  * Check DNS server is accessible from Bifrost nodes
  * Verify `/etc/resolv.conf` has correct nameserver
  * Test DNS resolution from inside container if using Docker
  * Check for DNS caching issues (try flushing DNS cache)
</Accordion>

<Accordion title="No nodes discovered via DNS">
  **Symptoms**: DNS resolves but cluster has 0 members

  **Solutions**:

  * Verify DNS returns multiple A records (not CNAME)
  * Check that returned IPs are correct and reachable
  * Ensure `bind_port` matches actual gossip port on nodes
  * Verify nodes are listening on returned IP addresses
  * Use `dig` or `nslookup` to verify DNS response format
</Accordion>

<Accordion title="Nodes discovered but can't connect">
  **Symptoms**: IPs discovered but gossip connection fails

  **Solutions**:

  * Verify gossip port (10101) and gRPC port (10102) are open on all nodes
  * Check firewall rules between nodes
  * Ensure nodes are listening on correct network interface
  * Verify IP addresses match node's actual network addresses
  * Test connectivity: `telnet <ip> 10101`
</Accordion>
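Where `telnet` is not installed, a short script can perform the same TCP reachability test against the gossip and gRPC ports. This is a minimal sketch: the helper name `check_ports` is hypothetical, and it only verifies that a TCP connection can be opened, not that the peer speaks the cluster protocol.

```python
import socket
from typing import Dict, Iterable


def check_ports(host: str, ports: Iterable[int] = (10101, 10102), timeout: float = 3.0) -> Dict[int, bool]:
    """Attempt a TCP connection to each port and report whether it succeeded."""
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results
```

For example, `check_ports("10.0.1.11")` run from another node returns a mapping of port to reachability for the two default ports.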

***

## UDP Broadcast Discovery

**Best for:** Local network deployments, on-premise infrastructure, development clusters

UDP broadcast discovery automatically finds nodes on the same local network using broadcast packets. No external dependencies required.

### How It Works

1. Nodes broadcast UDP discovery beacons on configured port
2. Other nodes on the same network respond with acknowledgments
3. Nodes discover each other's IP addresses automatically
4. Limited to nodes on the same broadcast domain (subnet)
5. Requires `allowed_address_space` for security

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "udp",
      "service_name": "bifrost-cluster",
      "udp_broadcast_port": 9999,
      "allowed_address_space": [
        "192.168.1.0/24",
        "10.0.0.0/8"
      ],
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter               | Required | Description                          | Example                                                 |
| ----------------------- | -------- | ------------------------------------ | ------------------------------------------------------- |
| `udp_broadcast_port`    | Yes      | Port for broadcast discovery         | `9999`, `8888`                                          |
| `allowed_address_space` | Yes      | CIDR ranges to limit discovery scope | `["192.168.1.0/24"]`, `["10.0.0.0/8", "172.16.0.0/12"]` |
| `dial_timeout`          | No       | Time to wait for responses           | `"10s"` (default)                                       |

<Warning>
  UDP broadcast discovery requires `allowed_address_space` to be configured. This prevents scanning arbitrary networks and limits discovery to trusted subnets.
</Warning>
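To make the effect of `allowed_address_space` concrete, the following sketch filters candidate IPs against a list of CIDR ranges using Python's standard `ipaddress` module. The function name `filter_allowed` is hypothetical and this is not Bifrost's implementation; it only demonstrates the CIDR membership test.

```python
import ipaddress
from typing import Iterable, List


def filter_allowed(candidates: Iterable[str], allowed_address_space: Iterable[str]) -> List[str]:
    """Keep only candidate IPs that fall inside one of the allowed CIDR ranges."""
    # ip_network() is strict by default: it rejects CIDRs with host bits set.
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_address_space]
    allowed: List[str] = []
    for candidate in candidates:
        ip = ipaddress.ip_address(candidate)
        if any(ip in network for network in networks):
            allowed.append(candidate)
    return allowed
```

Because `ipaddress.ip_network` is strict by default, a range written with a host address such as `192.168.1.10/24` raises `ValueError`, which makes this a convenient way to check CIDR notation before deploying a config.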

### Docker Compose Example

```yaml theme={null}
version: '3.8'

services:
  bifrost-1:
    image: <enterprise_repo_base_url>/bifrost:latest
    network_mode: bridge
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8080:8080"
      - "9999:9999/udp"
      - "10101:10101"
      - "10102:10102"

  bifrost-2:
    image: <enterprise_repo_base_url>/bifrost:latest
    network_mode: bridge
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
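    ports:
      # Only the HTTP port is published. Host ports 9999/udp, 10101, and 10102
      # are already bound by bifrost-1, and containers on the same bridge
      # network reach each other directly without published ports.
      - "8081:8080"

  bifrost-3:
    image: <enterprise_repo_base_url>/bifrost:latest
    network_mode: bridge
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      # Same as bifrost-2: publish only the HTTP port to avoid host port conflicts.
      - "8082:8080"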
```

<Note>
  Use `network_mode: bridge` (default) or `host` for UDP broadcast. Custom networks may not support broadcast.
</Note>

### Troubleshooting

<Accordion title="No nodes discovered via UDP broadcast">
  **Symptoms**: Discovery runs but finds 0 nodes

  **Solutions**:

  * Verify `allowed_address_space` includes node IP addresses
  * Check UDP broadcast port is open (firewall/security groups)
  * Ensure nodes are on same subnet/broadcast domain
  * Verify broadcast is enabled on network interface
  * Test with `tcpdump -i any -n udp port 9999`
  * Check Docker network mode supports broadcast (use bridge or host)
</Accordion>

<Accordion title="Address space filtering issues">
  **Symptoms**: "not in allowed address space" warnings

  **Solutions**:

  * Verify CIDR notation is correct (e.g., `192.168.1.0/24`)
  * Ensure `allowed_address_space` covers all node IPs
  * Check node IP addresses: `ip addr` or `ifconfig`
  * Use the network address in each range (e.g., `192.168.1.0/24`), not a host address (e.g., `192.168.1.10/24`)
  * Test CIDR match online or with ipcalc
</Accordion>

<Accordion title="Permission denied on UDP port">
  **Symptoms**: "permission denied" or "address already in use"

  **Solutions**:

  * Check if another process is using the UDP broadcast port
  * Verify the port number is 1024 or higher (non-privileged), or run as root
  * Use `netstat -tulpn | grep 9999` to check port usage
  * Change `udp_broadcast_port` to different value
  * Ensure firewall isn't blocking UDP on that port
</Accordion>

***

## mDNS Discovery

**Best for:** Local development, testing, zero-configuration setups

mDNS (Multicast DNS) provides zero-configuration service discovery on local networks. Perfect for development and testing without requiring any infrastructure setup.

### How It Works

1. Nodes advertise themselves via mDNS (Bonjour/Avahi)
2. Other nodes browse for mDNS services
3. Automatic discovery within the same local network
4. No DNS server or configuration required
5. Limited to local network segment

### Configuration

```json theme={null}
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "mdns",
      "service_name": "bifrost",
      "mdns_service": "_bifrost._tcp",
      "dial_timeout": "10s"
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  }
}
```

### Configuration Parameters

| Parameter      | Required | Description                     | Example                                      |
| -------------- | -------- | ------------------------------- | -------------------------------------------- |
| `mdns_service` | No       | mDNS service type               | `"_bifrost._tcp"` (default), `"_myapp._tcp"` |
| `dial_timeout` | No       | Time to wait for mDNS responses | `"10s"` (default)                            |

<Warning>
  mDNS is designed for development and testing. For production, use Kubernetes, Consul, or etcd discovery.
</Warning>

### Local Development Example

```bash theme={null}
# Start first node
docker run -p 8080:8080 -p 10101:10101 -p 10201:10102 \
  -v $(pwd)/config-mdns.json:/etc/bifrost/config.json \
  <enterprise_repo_base_url>/bifrost:latest

# Start second node (discovers first automatically)
docker run -p 8081:8080 -p 10111:10101 -p 10211:10102 \
  -v $(pwd)/config-mdns.json:/etc/bifrost/config.json \
  <enterprise_repo_base_url>/bifrost:latest

# Start third node (discovers both automatically)
docker run -p 8082:8080 -p 10121:10101 -p 10221:10102 \
  -v $(pwd)/config-mdns.json:/etc/bifrost/config.json \
  <enterprise_repo_base_url>/bifrost:latest
```

### Troubleshooting

<Accordion title="mDNS services not discovered">
  **Symptoms**: Nodes don't discover each other via mDNS

  **Solutions**:

  * Verify mDNS is enabled on network (check firewall)
  * Ensure multicast is enabled on network interface
  * Check nodes are on same local network segment
  * Verify mDNS port 5353 is not blocked
  * Test mDNS resolution: `avahi-browse -a` (Linux) or `dns-sd -B` (macOS)
  * Increase `dial_timeout` if discovery is slow
</Accordion>

<Accordion title="Network address validation errors">
  **Symptoms**: "skipping invalid host address" warnings

  **Solutions**:

  * This is normal - mDNS returns network/broadcast addresses
  * mDNS automatically filters invalid addresses (127.x.x.x, \*.0, \*.255)
  * Check that nodes have valid non-loopback IP addresses
  * Ensure nodes are not using 127.0.0.1 for binding
  * Verify network interface has proper IP configuration
</Accordion>
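The filtering rule listed above (loopback, `*.0`, and `*.255` addresses are skipped) can be approximated as follows. This is an illustrative sketch only: the function name `is_valid_host_address` is hypothetical, it handles IPv4 only, and Bifrost's actual validation may apply additional checks.

```python
import ipaddress


def is_valid_host_address(address: str) -> bool:
    """Return False for loopback addresses and IPv4 addresses ending in .0 or .255."""
    try:
        ip = ipaddress.IPv4Address(address)
    except ipaddress.AddressValueError:
        return False  # not a well-formed IPv4 address
    if ip.is_loopback:  # 127.0.0.0/8
        return False
    last_octet = ip.packed[-1]
    return last_octet not in (0, 255)
```

If every address a node advertises fails this kind of check, the node has no usable address for peers to connect to, which is why the solutions above focus on interface configuration.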

<Accordion title="Discovery works but cluster unstable">
  **Symptoms**: Nodes discover then disconnect repeatedly

  **Solutions**:

  * mDNS is eventually consistent; allow time for announcements to propagate
  * Check gossip port accessibility between nodes
  * Verify network doesn't drop multicast packets
  * Consider using a more robust discovery method for production
  * Check for network congestion or packet loss
</Accordion>

***

## Deployment Patterns

### Docker Compose Deployment

Complete example using Consul discovery with a shared PostgreSQL config store:

```yaml theme={null}
version: '3.8'

services:
  postgres:
    image: postgres:14
    environment:
      POSTGRES_DB: bifrost
      POSTGRES_USER: bifrost
      POSTGRES_PASSWORD: bifrost_password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - bifrost-net

  consul:
    image: hashicorp/consul:latest
    command: agent -dev -client=0.0.0.0
    ports:
      - "8500:8500"
    networks:
      - bifrost-net

  bifrost-1:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8080:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - bifrost-net

  bifrost-2:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8081:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - bifrost-net

  bifrost-3:
    image: <enterprise_repo_base_url>/bifrost:latest
    environment:
      - BIFROST_CONFIG=/etc/bifrost/config.json
    volumes:
      - ./config.json:/etc/bifrost/config.json
    ports:
      - "8082:8080"
    depends_on:
      - postgres
      - consul
    networks:
      - bifrost-net

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - bifrost-1
      - bifrost-2
      - bifrost-3
    networks:
      - bifrost-net

volumes:
  postgres_data:

networks:
  bifrost-net:
    driver: bridge
```

**nginx.conf** for load balancing:

```nginx theme={null}
events {
    worker_connections 1024;
}

http {
    upstream bifrost_cluster {
        least_conn;
        server bifrost-1:8080 max_fails=3 fail_timeout=30s;
        server bifrost-2:8080 max_fails=3 fail_timeout=30s;
        server bifrost-3:8080 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;
        
        location / {
            proxy_pass http://bifrost_cluster;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }
        
        location /health {
            # Answered by nginx itself; this does not probe the Bifrost nodes.
            access_log off;
            default_type text/plain;
            return 200 "healthy\n";
        }
    }
}
```

### Kubernetes Production Deployment

Production-ready Kubernetes deployment with StatefulSet:

<Note>
  If you use PostgreSQL for `config_store`, ensure the target database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>

```yaml theme={null}
apiVersion: v1
kind: ConfigMap
metadata:
  name: bifrost-config
  namespace: bifrost
data:
  config.json: |
    {
      "cluster_config": {
        "enabled": true,
        "discovery": {
          "enabled": true,
          "type": "kubernetes",
          "service_name": "bifrost-cluster",
          "k8s_namespace": "bifrost",
          "k8s_label_selector": "app=bifrost,component=gateway"
        },
        "gossip": {
          "port": 10101,
          "config": {
            "timeout_seconds": 10,
            "success_threshold": 3,
            "failure_threshold": 3
          }
        }
      },
      "config_store": {
        "enabled": true,
        "type": "postgres",
        "config": {
          "host": "postgres.bifrost.svc.cluster.local",
          "port": "5432",
          "user": "bifrost",
          "password": "changeme",
          "db_name": "bifrost",
          "ssl_mode": "require"
        }
      }
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bifrost
  namespace: bifrost
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bifrost-pod-reader
  namespace: bifrost
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bifrost-pod-reader
  namespace: bifrost
subjects:
- kind: ServiceAccount
  name: bifrost
  namespace: bifrost
roleRef:
  kind: Role
  name: bifrost-pod-reader
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: bifrost
  namespace: bifrost
spec:
  serviceName: bifrost-cluster
  replicas: 3
  selector:
    matchLabels:
      app: bifrost
      component: gateway
  template:
    metadata:
      labels:
        app: bifrost
        component: gateway
    spec:
      serviceAccountName: bifrost
      containers:
      - name: bifrost
        image: <enterprise_repo_base_url>/bifrost:latest
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 10101
          name: gossip
          protocol: TCP
        - containerPort: 10102
          name: grpc
          protocol: TCP
        env:
        - name: BIFROST_CONFIG
          value: /etc/bifrost/config.json
        volumeMounts:
        - name: config
          mountPath: /etc/bifrost
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2000m"
            memory: "2Gi"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
      volumes:
      - name: config
        configMap:
          name: bifrost-config
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost-cluster
  namespace: bifrost
spec:
  clusterIP: None
  selector:
    app: bifrost
    component: gateway
  ports:
  - port: 10101
    name: gossip
    protocol: TCP
  - port: 10102
    name: grpc
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: bifrost
spec:
  type: LoadBalancer
  selector:
    app: bifrost
    component: gateway
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: bifrost-pdb
  namespace: bifrost
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: bifrost
      component: gateway
```

### Bare Metal / VM Deployment

For bare metal or VM deployments using systemd:

**Step 1: Install Bifrost on each node**

```bash theme={null}
# Download Bifrost Enterprise binary
curl -O https://releases.getmaxim.ai/bifrost-enterprise/latest/bifrost-enterprise-linux-amd64
chmod +x bifrost-enterprise-linux-amd64
sudo mv bifrost-enterprise-linux-amd64 /usr/local/bin/bifrost-enterprise
```

**Step 2: Create configuration file**

```bash theme={null}
sudo mkdir -p /etc/bifrost
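# The redirect in `sudo cat > file` runs as the calling user, so write via `sudo tee` instead
sudo tee /etc/bifrost/config.json > /dev/null <<EOF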
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "dns",
      "service_name": "bifrost-cluster",
      "dns_names": ["bifrost-cluster.internal.company.com"]
    },
    "gossip": {
      "port": 10101,
      "config": {
        "timeout_seconds": 10,
        "success_threshold": 3,
        "failure_threshold": 3
      }
    }
  },
  "config_store": {
    "enabled": true,
    "type": "postgres",
    "config": {
      "host": "postgres.internal.company.com",
      "port": "5432",
      "user": "bifrost",
      "password": "secure_password",
      "db_name": "bifrost",
      "ssl_mode": "require"
    }
  }
}
EOF
```

**Step 3: Create systemd service**

```bash theme={null}
# Write the unit file via `sudo tee` so the write itself runs with root privileges
sudo tee /etc/systemd/system/bifrost.service > /dev/null <<EOF
[Unit]
Description=Bifrost Enterprise API Gateway
After=network.target

[Service]
Type=simple
User=bifrost
Group=bifrost
Environment="BIFROST_CONFIG=/etc/bifrost/config.json"
ExecStart=/usr/local/bin/bifrost-enterprise
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/bifrost

[Install]
WantedBy=multi-user.target
EOF
```

**Step 4: Setup DNS records**

```bash theme={null}
# Add A records for bifrost-cluster.internal.company.com
# pointing to all node IPs:
# 10.0.1.10  (node1)
# 10.0.1.11  (node2)
# 10.0.1.12  (node3)
```

**Step 5: Start and enable service**

```bash theme={null}
sudo useradd -r -s /bin/false bifrost
sudo mkdir -p /var/lib/bifrost
sudo chown bifrost:bifrost /var/lib/bifrost
sudo systemctl daemon-reload
sudo systemctl enable bifrost
sudo systemctl start bifrost
sudo systemctl status bifrost
```

**Step 6: Verify cluster formation**

```bash theme={null}
# Check logs on each node
sudo journalctl -u bifrost -f

# Look for messages like:
# "successfully joined X peers on startup"
# "cluster health: HEALTHY"
```

***

## Cluster Operations

### Cluster Topology View

The admin UI ships an interactive cluster graph (React Flow) that renders the live cluster as a circle of nodes with edges colored by reachability. Each node card shows its `node_id`, region, current state (alive / suspect / dead / left), and a leader badge if it currently holds either the cluster-wide or regional leadership. The view updates in the background and triggers an automatic diagnostic on leader transitions. Single-node clusters render the simplified single-card view.

### Cluster Diagnostics

A diagnostic flow lets administrators verify in-cluster reachability end-to-end. When triggered, the local node broadcasts a no-op diagnostic ping to every peer that advertises the `ack:v1` capability and streams ACK status back to the caller as each peer responds. Use it to:

* Confirm that a newly added node is receiving and acknowledging cluster messages
* Surface peers that are partitioned or silently dropping traffic
* Validate that NetworkPolicies / firewall rules permit gRPC after a config change

The diagnostic does not mutate state on any node - the message is purely a round-trip probe. It is admin-only and is invoked from the cluster topology view or the equivalent admin endpoint.

### Mixed-Version Rollouts

Bifrost negotiates per-peer capabilities so newer and older cluster members can run side-by-side during rolling upgrades. Peers advertise the `ack:v1` capability in their gossip metadata; a node only tracks ACKs from peers that advertise it. Older peers still receive broadcasts but are excluded from the pending-ACK set so they don't trigger false "unacked" metrics or unnecessary retries.

In practice this means: you can roll out a new Bifrost version one pod at a time, and the cluster will degrade ACK tracking gracefully for the older pods until they're replaced. Rolling-update strategies (Kubernetes `RollingUpdate`, Nomad canaries, etc.) are supported without quorum loss.
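The capability gating can be pictured with the sketch below: every peer receives the broadcast, but only peers advertising `ack:v1` enter the pending-ACK set. The `Peer` and `AckTracker` types and their methods are hypothetical names chosen for illustration, not Bifrost's internal API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

ACK_CAPABILITY = "ack:v1"  # capability name from the text above


@dataclass
class Peer:
    node_id: str
    capabilities: Set[str] = field(default_factory=set)


class AckTracker:
    """Track pending ACKs only for peers that advertise the ACK capability."""

    def __init__(self) -> None:
        self._pending: Dict[str, Set[str]] = {}  # message ID -> node IDs still owed an ACK

    def broadcast(self, message_id: str, peers: Iterable[Peer]) -> Set[str]:
        """Record a broadcast and return the node IDs whose ACK is awaited."""
        awaited = {p.node_id for p in peers if ACK_CAPABILITY in p.capabilities}
        self._pending[message_id] = set(awaited)
        return awaited

    def ack(self, message_id: str, node_id: str) -> None:
        """Mark one peer's ACK as received; unknown messages or peers are ignored."""
        self._pending.get(message_id, set()).discard(node_id)

    def unacked(self, message_id: str) -> Set[str]:
        """Return the peers that have not yet acknowledged the message."""
        return set(self._pending.get(message_id, set()))
```

With two upgraded peers and one older peer, `broadcast` awaits only the two upgraded ones, so the older peer can never appear in `unacked`.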

### Leader-Coordinated Tasks

A small number of cluster-wide tasks run only on the elected leader, with results broadcast over gRPC so followers stay in sync without each one independently doing the work. The clearest example is **pricing sync**: only the leader fetches the upstream pricing URL on the configured interval and then broadcasts a database reload message; followers reload from the local store rather than each making the upstream HTTP call. Region-scoped variants of this pattern run on the regional leader.

There is no configuration to enable this - leader-coordinated tasks engage automatically when clustering is enabled. If the leader fails, the next election (within \~30 seconds) hands the responsibility to a new node.

### Replicated Entity Types

Every replicated message carries an `EntityType` identifying the kind of state being broadcast. The cluster replicates the following entity types over gRPC:

| Category                     | Entity Types                                                                                                                |
| ---------------------------- | --------------------------------------------------------------------------------------------------------------------------- |
| **Catalog & Providers**      | `model_catalog`, `provider`, `model_config`, `pricing`, `pricing_override`                                                  |
| **Governance**               | `governance`, `virtual_key`, `team`, `customer`, `business_unit_governance`, `user_governance`, `access_profile`, `api_key` |
| **Security & Auth**          | `auth_config`, `rbac`, `proxy_config`, `client_config`                                                                      |
| **Routing & Load Balancing** | `routing_rule`, `load_balancing`, `load_balancing_log`                                                                      |
| **MCP**                      | `mcp_tool`, `mcp_tool_group`                                                                                                |
| **Guardrails**               | `guardrail`, `guardrail_config`, `guardrail_sampling`                                                                       |
| **Plugins & Connectors**     | `plugin`, `observability_connector`                                                                                         |
| **Prompts & Storage**        | `prompt_deployment`, `kv_store`                                                                                             |
| **Cluster**                  | `cluster_diagnostic` (no-op probe used by the diagnostic flow)                                                              |

Each entity type follows the same broadcast and dedup rules: a unique message ID, a `SentAt` timestamp, and a deduper TTL on the receiver side. Newer messages with the same ID invalidate older ones, so a late-arriving stale message will not overwrite fresh state.
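One way to read these rules is sketched below: the receiver remembers the latest `SentAt` it applied per message ID for a limited time and rejects anything that is not strictly newer. This is an interpretation, not Bifrost's implementation. The `Deduper` class and `should_apply` method are hypothetical, and the 120-second default assumes the deduper TTL equals the 2-minute cache TTL mentioned in the troubleshooting section.

```python
import time
from typing import Callable, Dict, Tuple

DEDUP_TTL_SECONDS = 120.0  # assumption: the 2-minute cache TTL from the troubleshooting section


class Deduper:
    """Receiver-side filter for duplicate and stale messages (hypothetical sketch)."""

    def __init__(self, ttl: float = DEDUP_TTL_SECONDS, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._seen: Dict[str, Tuple[float, float]] = {}  # ID -> (sent_at, entry expiry)

    def should_apply(self, message_id: str, sent_at: float) -> bool:
        """Return True if the message is new or newer than the one already applied."""
        now = self._clock()
        # Evict entries whose TTL has elapsed so the cache stays bounded.
        self._seen = {m: v for m, v in self._seen.items() if v[1] > now}
        previous = self._seen.get(message_id)
        if previous is not None and sent_at <= previous[0]:
            return False  # duplicate, or older than what was already applied
        self._seen[message_id] = (sent_at, now + self._ttl)
        return True
```

In this sketch the TTL bounds memory use: an entry is forgotten once it expires, so protection against stale messages only holds within the TTL window.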

***

## Troubleshooting

### General Clustering Issues

<Accordion title="Cluster forms but only has 1 member">
  **Symptoms**: Each node thinks it's the only member

  **Common Causes & Solutions**:

  * **Discovery not configured**: Verify `discovery.enabled: true` and `discovery.type` is set
  * **Service name mismatch**: Ensure all nodes have identical `service_name`
  * **Gossip port blocked**: Check firewall allows TCP port 10101 between nodes
  * **Discovery method issues**: See method-specific troubleshooting above
  * **Network isolation**: Verify nodes can reach each other on gossip port
</Accordion>
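The first two causes can be checked offline by comparing the `config.json` files collected from each node. The sketch below is a hypothetical helper, not a Bifrost tool: `find_config_problems` and `REQUIRED_FIELDS` are illustrative names, and the required-field map covers only the etcd, DNS, UDP, and mDNS parameter tables on this page.

```python
from typing import Dict, List

# Required discovery fields per type, taken from the parameter tables on this page.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "etcd": ["etcd_endpoints"],
    "dns": ["dns_names"],
    "udp": ["udp_broadcast_port", "allowed_address_space"],
    "mdns": [],
}


def find_config_problems(configs: Dict[str, dict]) -> List[str]:
    """Return readable problems for a mapping of node name -> parsed config.json."""
    problems: List[str] = []
    service_names = set()
    for node, config in configs.items():
        cluster = config.get("cluster_config", {})
        discovery = cluster.get("discovery", {})
        if not cluster.get("enabled") or not discovery.get("enabled"):
            problems.append(f"{node}: clustering or discovery is not enabled")
        discovery_type = discovery.get("type")
        if not discovery_type:
            problems.append(f"{node}: discovery.type is not set")
        # Types without an entry in the map are not checked for required fields.
        for field_name in REQUIRED_FIELDS.get(discovery_type, []):
            if not discovery.get(field_name):
                problems.append(f"{node}: missing required field {field_name}")
        service_names.add(discovery.get("service_name"))
    if len(service_names) > 1:
        problems.append("service_name differs between nodes")
    return problems
```

Load each node's file with `json.load` and pass the results keyed by node name; an empty list means none of these particular problems were found.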

<Accordion title="Split brain - nodes form separate clusters">
  **Symptoms**: Nodes divided into separate clusters

  **Common Causes & Solutions**:

  * **Network partition**: Check network connectivity between all nodes
  * **Different discovery configs**: Ensure all nodes use same discovery settings
  * **Firewall blocking gossip**: Verify bidirectional connectivity on port 10101
  * **Discovery scoped incorrectly**: Check label selectors, DNS names, or address spaces
  * **Restart all nodes**: Re-forming the cluster sometimes requires restarting all nodes simultaneously
</Accordion>

<Accordion title="High memory usage in cluster">
  **Symptoms**: Memory grows over time, especially in large clusters

  **Common Causes & Solutions**:

  * **Large gossip messages**: Check size of gossiped data
  * **Too many nodes**: Clusters typically work best with 3-7 nodes; consider reducing the node count
  * **Message deduplication cache**: This is normal, cache TTL is 2 minutes
  * **Increase node resources**: Ensure adequate memory allocation
</Accordion>

<Accordion title="Cluster unstable - nodes flapping">
  **Symptoms**: Nodes repeatedly join and leave cluster

  **Common Causes & Solutions**:

  * **Network instability**: Check for packet loss or high latency
  * **Resource constraints**: Ensure nodes have adequate CPU/memory
  * **Timeout too aggressive**: Increase `timeout_seconds` in gossip config
  * **Health check failures**: Review liveness probe configuration
  * **Discovery intervals**: Check discovery isn't running too frequently
</Accordion>

<Accordion title="Cannot broadcast messages to cluster">
  **Symptoms**: Broadcast queue errors, messages not propagating

  **Common Causes & Solutions**:

  * **Queue not initialized**: Check logs for initialization errors
  * **No active members**: Verify cluster has multiple healthy members
  * **Gossip port unreachable**: Test connectivity between all nodes
  * **Message too large**: Check size of broadcast messages
</Accordion>

**Key log messages to look for**:

```
✅ Successful cluster formation:
- "successfully joined X peers on startup"
- "cluster health: HEALTHY"
- "discovered X nodes"

⚠️ Warning signs:
- "no new nodes discovered"
- "failed to join cluster"
- "cluster health: NOT HEALTHY"
- "node marked as suspect"

❌ Errors:
- "discovery failed"
- "failed to broadcast"
- "timeout waiting for response"
```
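When scanning long logs, the messages above can be grouped automatically. The sketch below is a hypothetical helper that matches the listed phrases with regular expressions, treating `X` as a number. Exact log wording may differ between versions, so adjust the patterns to what your nodes actually emit.

```python
import re
from typing import Dict, Iterable, List

# Phrases from the list above; "X" is assumed to be a number in real log lines.
PATTERNS: Dict[str, List[str]] = {
    "ok": [
        r"successfully joined \d+ peers on startup",
        r"cluster health: HEALTHY",
        r"discovered \d+ nodes",
    ],
    "warning": [
        r"no new nodes discovered",
        r"failed to join cluster",
        r"cluster health: NOT HEALTHY",
        r"node marked as suspect",
    ],
    "error": [
        r"discovery failed",
        r"failed to broadcast",
        r"timeout waiting for response",
    ],
}


def classify(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group log lines by the first category whose pattern they match."""
    buckets: Dict[str, List[str]] = {level: [] for level in PATTERNS}
    for line in lines:
        for level, patterns in PATTERNS.items():
            if any(re.search(pattern, line) for pattern in patterns):
                buckets[level].append(line)
                break  # a line is counted once, under the first matching level
    return buckets
```

Passing the lines of a saved `journalctl -u bifrost` dump to `classify` gives a quick summary of how many healthy, warning, and error messages each node produced.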

### Health Check Endpoints

Monitor cluster health via HTTP endpoints:

```bash theme={null}
# Check if node is healthy
curl http://localhost:8080/health

# Get cluster status (if exposed)
curl http://localhost:8080/cluster/status

# Expected response shows all cluster members, for example:
# {
#   "local_node": "bifrost-remote-10101-...",
#   "members": 3,
#   "healthy_members": 3,
#   "cluster_health": "HEALTHY"
# }
```
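For automated monitoring, the status body can be evaluated against the node count you expect. The sketch below assumes the response has the fields shown in the example above and that the endpoint is exposed in your deployment. The function name `evaluate_status` is hypothetical.

```python
import json
from typing import Tuple


def evaluate_status(body: str, expected_members: int) -> Tuple[bool, str]:
    """Check a cluster status JSON body and return (healthy, reason)."""
    status = json.loads(body)
    members = status.get("members", 0)
    healthy = status.get("healthy_members", 0)
    if status.get("cluster_health") != "HEALTHY":
        return False, f"cluster_health is {status.get('cluster_health')!r}"
    if members < expected_members:
        return False, f"only {members} of {expected_members} members joined"
    if healthy < members:
        return False, f"{members - healthy} member(s) unhealthy"
    return True, "ok"
```

The body can be fetched with `urllib.request.urlopen` or any HTTP client; comparing against an expected member count catches the case where a node reports itself healthy while running as a single-member cluster.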

***

This clustering implementation ensures Bifrost can handle enterprise-scale deployments with high availability, automatic service discovery, and intelligent traffic distribution across any infrastructure.