# Datadog

> Native Datadog integration for APM traces, LLM Observability, and metrics

## Overview

<Frame>
  <img src="https://mintcdn.com/bifrost/HE3esZlo6PW0-gsx/media/dd-trace.png?fit=max&auto=format&n=HE3esZlo6PW0-gsx&q=85&s=0cdb006e115843fd17c76077b219fc2e" alt="Datadog LLM Observability dashboard" width="2200" height="1218" data-path="media/dd-trace.png" />
</Frame>

The **Datadog plugin** provides native integration with the Datadog observability platform, offering three pillars of observability for your LLM operations:

* **APM Traces** - Distributed tracing via dd-trace-go v2 with W3C Trace Context support for end-to-end request visibility
* **LLM Observability** - Native Datadog LLM Obs integration for AI/ML-specific monitoring
* **Metrics** - Operational metrics via DogStatsD or the Metrics API

Unlike the [OTel plugin](/features/observability/otel), which sends generic OpenTelemetry data, the Datadog plugin uses Datadog's native SDKs for richer integration with Datadog-specific features such as LLM Observability dashboards and ML App grouping.

***

## Deployment Modes

<Frame>
  <img src="https://mintcdn.com/bifrost/HE3esZlo6PW0-gsx/media/dd-mode.png?fit=max&auto=format&n=HE3esZlo6PW0-gsx&q=85&s=c907bef2a9c33c241a04572ec73723c6" alt="Datadog plugin deployment modes" width="2294" height="1086" data-path="media/dd-mode.png" />
</Frame>

The plugin supports two deployment modes:

| Mode                | Description                              | Requirements                  | Best For                                                  |
| ------------------- | ---------------------------------------- | ----------------------------- | --------------------------------------------------------- |
| **Agent** (default) | Sends data through a local Datadog Agent | Datadog Agent running on host | Production deployments with existing agent infrastructure |
| **Agentless**       | Sends data directly to Datadog APIs      | API key only                  | Serverless, containers, or simplified deployments         |

### Agent Mode

In agent mode, the plugin communicates with a locally running Datadog Agent:

* **APM Traces** → Agent at `localhost:8126`
* **Metrics** → DogStatsD at `localhost:8125`

The agent handles batching and retries, and it provides lower latency. This is the recommended mode for production deployments where the Datadog Agent is already installed.

### Agentless Mode

In agentless mode, the plugin sends data directly to Datadog's intake APIs (`{site}` is your configured Datadog site, e.g. `datadoghq.com`):

* **LLM Observability** → `https://api.{site}/api/intake/llm-obs/v1/trace/spans`
* **Metrics** → `https://api.{site}` Metrics API (series to `/api/v2/series`, distributions to `/api/v1/distribution_points`)

This mode requires an API key but simplifies deployment by eliminating the need for a local agent. Ideal for serverless environments, Kubernetes pods, or quick testing.

<Note>
  Datadog officially supports agentless submission for [LLM Observability](https://docs.datadoghq.com/llm_observability/instrumentation/api/) and [metrics](https://docs.datadoghq.com/api/latest/metrics/), but **not** for general APM tracing - the [dd-trace-go setup](https://docs.datadoghq.com/tracing/trace_collection/dd_libraries/go/) assumes a running Agent (or the serverless extension). The plugin points the tracer at the public trace intake so APM spans are still emitted, but if you need fully-supported APM, run the Datadog Agent (agent mode). LLM Observability and metrics are unaffected.
</Note>
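To make the `{site}` placeholder concrete, here is a minimal Go sketch that expands the agentless endpoints listed above for a given site. `buildIntakeURLs` and `intakeURLs` are hypothetical names used for illustration only; they are not part of the plugin's API.

```go
package main

import (
	"fmt"
	"strings"
)

// intakeURLs holds the agentless endpoints derived from a Datadog site.
type intakeURLs struct {
	LLMObs        string
	Series        string
	Distributions string
}

// buildIntakeURLs is a hypothetical helper that expands the {site}
// placeholder from the endpoint list. An empty site falls back to the
// documented default, datadoghq.com.
func buildIntakeURLs(site string) intakeURLs {
	site = strings.TrimSpace(site)
	if site == "" {
		site = "datadoghq.com"
	}
	base := "https://api." + site
	return intakeURLs{
		LLMObs:        base + "/api/intake/llm-obs/v1/trace/spans",
		Series:        base + "/api/v2/series",
		Distributions: base + "/api/v1/distribution_points",
	}
}

func main() {
	urls := buildIntakeURLs("datadoghq.eu")
	fmt.Println(urls.LLMObs)
	fmt.Println(urls.Series)
	fmt.Println(urls.Distributions)
}
```

If outbound traffic is restricted, these are the hosts and paths that would need to be reachable from the Bifrost process.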

***

## Configuration

### Configuration Fields

| Field                     | Type     | Required       | Default               | Description                                                                                                                                                   |
| ------------------------- | -------- | -------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `service_name`            | `string` | No             | `bifrost`             | Service name displayed in Datadog APM                                                                                                                         |
| `ml_app`                  | `string` | No             | (uses `service_name`) | ML application name for LLM Observability grouping                                                                                                            |
| `agent_addr`              | `string` | No             | `localhost:8126`      | Datadog Agent address as combined `host:port` (agent mode only, supports `env.VAR_NAME`). Overridden by `agent_host` when set                                 |
| `agent_host`              | `string` | No             | -                     | Datadog Agent host, set separately from the port (agent mode only, supports `env.VAR_NAME`). Takes precedence over `agent_addr`                               |
| `agent_port`              | `string` | No             | `8126`                | Datadog Agent port, used with `agent_host` (agent mode only, supports `env.VAR_NAME`)                                                                         |
| `dogstatsd_addr`          | `string` | No             | `localhost:8125`      | DogStatsD server address as combined `host:port` (agent mode only, supports `env.VAR_NAME`). Overridden by `dogstatsd_host` when set                          |
| `dogstatsd_host`          | `string` | No             | -                     | DogStatsD server host, set separately from the port (agent mode only, supports `env.VAR_NAME`). Takes precedence over `dogstatsd_addr`                        |
| `dogstatsd_port`          | `string` | No             | `8125`                | DogStatsD server port, used with `dogstatsd_host` (agent mode only, supports `env.VAR_NAME`)                                                                  |
| `env`                     | `string` | No             | -                     | Environment tag (e.g., `production`, `staging`)                                                                                                               |
| `version`                 | `string` | No             | -                     | Service version tag                                                                                                                                           |
| `custom_tags`             | `object` | No             | -                     | Additional tags for all traces and metrics                                                                                                                    |
| `enable_metrics`          | `bool`   | No             | `true`                | Enable metrics emission                                                                                                                                       |
| `enable_traces`           | `bool`   | No             | `true`                | Enable APM traces                                                                                                                                             |
| `enable_llm_obs`          | `bool`   | No             | `true`                | Enable LLM Observability                                                                                                                                      |
| `group_traces_by_session` | `bool`   | No             | `false`               | Group requests sharing the same `x-bf-session-id` into one APM trace (agent mode only). See [Grouping APM Traces by Session](#grouping-apm-traces-by-session) |
| `agentless`               | `bool`   | No             | `false`               | Use agentless mode (direct API)                                                                                                                               |
| `api_key`                 | `string` | Agentless only | -                     | Datadog API key (supports `env.VAR_NAME`)                                                                                                                     |
| `site`                    | `string` | No             | `datadoghq.com`       | Datadog site/region                                                                                                                                           |

### Environment Variable Substitution

The `api_key`, `agent_addr`, `agent_host`, `agent_port`, `dogstatsd_addr`, `dogstatsd_host`, `dogstatsd_port`, and `custom_tags` fields support environment variable substitution using the `env.` prefix:

```json theme={null}
{
  "api_key": "env.DD_API_KEY",
  "agent_addr": "env.DD_AGENT_ADDR",
  "dogstatsd_addr": "env.DD_DOGSTATSD_ADDR",
  "custom_tags": {
    "team": "env.TEAM_NAME",
    "cost_center": "env.COST_CENTER"
  }
}
```

<Note>
  Substitution is **whole-value only** — `"env.DD_HOST:8125"` does not work, because the entire field is treated as one variable reference. If your environment exposes the agent host and port as separate variables (common in Kubernetes, where the host is injected from the downward API `status.hostIP` and the port is fixed), use the separate `agent_host` / `agent_port` and `dogstatsd_host` / `dogstatsd_port` fields instead of `agent_addr` / `dogstatsd_addr`. When a `*_host` field is set it takes precedence over the combined `*_addr`, and the matching `*_port` defaults to `8126` (agent) / `8125` (DogStatsD).
</Note>
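The whole-value rule can be illustrated with a short Go sketch. `resolveEnvRef` is a hypothetical function, not the plugin's actual resolver, and returning an error for an unset variable is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveEnvRef is a hypothetical illustration of whole-value substitution.
// A value starting with "env." is treated, in its entirety, as a reference to
// one environment variable; any other value is returned as a literal.
// lookup stands in for os.LookupEnv so the behaviour is easy to exercise.
func resolveEnvRef(value string, lookup func(string) (string, bool)) (string, error) {
	if !strings.HasPrefix(value, "env.") {
		return value, nil
	}
	name := strings.TrimPrefix(value, "env.")
	resolved, ok := lookup(name)
	if !ok {
		// Assumption: an unset variable is reported as an error.
		return "", fmt.Errorf("environment variable %q is not set", name)
	}
	return resolved, nil
}

func main() {
	env := map[string]string{"DD_HOST": "10.0.0.5"}
	lookup := func(k string) (string, bool) {
		v, ok := env[k]
		return v, ok
	}

	// Whole-value reference: resolves to the variable's value.
	v, err := resolveEnvRef("env.DD_HOST", lookup)
	fmt.Println(v, err)

	// Mixed value: the variable name becomes "DD_HOST:8125", which is not set.
	v, err = resolveEnvRef("env.DD_HOST:8125", lookup)
	fmt.Println(v, err)
}
```

The second call fails because everything after `env.` is read as the variable name, which is why a host and port held in separate variables need the separate `*_host` / `*_port` fields.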

#### Separate host and port (Kubernetes)

```json theme={null}
{
  "dogstatsd_host": "env.DD_AGENT_HOST",
  "agent_host": "env.DD_AGENT_HOST"
}
```

The ports are omitted here because they default to `8126` (agent) and `8125` (DogStatsD). Set `agent_port` / `dogstatsd_port` (literal or `env.` reference) only if your agent listens on non-standard ports.

With the Kubernetes downward API injecting the node IP:

```yaml theme={null}
env:
  - name: DD_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
```

***

## Setup

<Tabs group="setup-method">
  <Tab title="UI">
    <Frame>
      <img src="https://mintcdn.com/bifrost/HE3esZlo6PW0-gsx/media/dd-config-page.png?fit=max&auto=format&n=HE3esZlo6PW0-gsx&q=85&s=8894478ffe39d1b749407276a7721796" alt="Datadog plugin configuration page in the Bifrost UI" width="3504" height="2130" data-path="media/dd-config-page.png" />
    </Frame>

    Configure the Datadog plugin through the Bifrost UI:

    1. Navigate to **Plugins**
    2. Enable the **Datadog** plugin
    3. Configure the required fields based on your deployment mode
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    package main

    import (
        "context"
        bifrost "github.com/maximhq/bifrost/core"
        "github.com/maximhq/bifrost/core/schemas"
        "github.com/maximhq/bifrost/framework/modelcatalog"
        datadog "github.com/maximhq/bifrost-enterprise/plugins/datadog"
    )

    func main() {
        ctx := context.Background()
        logger := schemas.NewLogger()
        
        // Initialize model catalog (required for cost calculation)
        modelCatalog := modelcatalog.NewModelCatalog(logger)
        
        // Agent mode configuration
        ddPlugin, err := datadog.Init(ctx, &datadog.Config{
            ServiceName: "my-llm-service",
            Env:         "production",
            Version:     "1.0.0",
            CustomTags: map[string]string{
                "team": "platform",
            },
        }, logger, modelCatalog, "1.0.0")
        if err != nil {
            panic(err)
        }
        
        // Initialize Bifrost with the plugin
        client, err := bifrost.Init(ctx, schemas.BifrostConfig{
            Account: &yourAccount, // placeholder: your own account implementation, defined elsewhere
            Plugins: []schemas.Plugin{ddPlugin},
        })
        if err != nil {
            panic(err)
        }
        defer client.Shutdown()
        
        // All requests are now traced to Datadog
    }
    ```

    For agentless mode:

    ```go theme={null}
    // Agentless mode configuration
    enableAgentless := true
    ddPlugin, err := datadog.Init(ctx, &datadog.Config{
        ServiceName: "my-llm-service",
        Env:         "production",
        Agentless:   &enableAgentless,
        APIKey:      &schemas.EnvVar{EnvVarName: "DD_API_KEY"},
        Site:        "datadoghq.com",
    }, logger, modelCatalog, "1.0.0")
    ```
  </Tab>

  <Tab title="config.json">
    ### Agent Mode (Minimal)

    ```json theme={null}
    {
      "plugins": [
        {
          "enabled": true,
          "name": "datadog",
          "config": {
            "service_name": "bifrost",
            "env": "production"
          }
        }
      ]
    }
    ```

    ### Agent Mode (Full Configuration)

    ```json theme={null}
    {
      "plugins": [
        {
          "enabled": true,
          "name": "datadog",
          "config": {
            "service_name": "my-llm-gateway",
            "ml_app": "my-ml-application",
            "agent_addr": "localhost:8126",
            "dogstatsd_addr": "localhost:8125",
            "env": "production",
            "version": "1.2.3",
            "custom_tags": {
              "team": "platform",
              "cost_center": "env.COST_CENTER"
            },
            "enable_metrics": true,
            "enable_traces": true,
            "enable_llm_obs": true
          }
        }
      ]
    }
    ```

    ### Agentless Mode

    ```json theme={null}
    {
      "plugins": [
        {
          "enabled": true,
          "name": "datadog",
          "config": {
            "service_name": "my-llm-gateway",
            "env": "production",
            "agentless": true,
            "api_key": "env.DD_API_KEY",
            "site": "datadoghq.com"
          }
        }
      ]
    }
    ```

    Set the environment variable:

    ```bash theme={null}
    export DD_API_KEY="your-datadog-api-key"
    ```
  </Tab>
</Tabs>

***

## Datadog Sites

The plugin supports all Datadog regional sites. Set the `site` field to match your Datadog account region:

| Site          | Region                   | Value               |
| ------------- | ------------------------ | ------------------- |
| US1 (default) | United States            | `datadoghq.com`     |
| US3           | United States            | `us3.datadoghq.com` |
| US5           | United States            | `us5.datadoghq.com` |
| EU1           | Europe                   | `datadoghq.eu`      |
| AP1           | Asia Pacific (Japan)     | `ap1.datadoghq.com` |
| AP2           | Asia Pacific (Australia) | `ap2.datadoghq.com` |
| US1-FED       | US Government            | `ddog-gov.com`      |
| US2-FED       | US Government            | `us2.ddog-gov.com`  |

<Note>
  Ensure your API key corresponds to the selected site. API keys from one region will not work with another.
</Note>

***

## LLM Observability

<Frame>
  <img src="https://mintcdn.com/bifrost/HE3esZlo6PW0-gsx/media/dd-llmobs.png?fit=max&auto=format&n=HE3esZlo6PW0-gsx&q=85&s=293326662ec84536b8fd6f2c35be2db7" alt="Datadog LLM Observability dashboard" width="3504" height="2126" data-path="media/dd-llmobs.png" />
</Frame>

The Datadog plugin integrates with [Datadog LLM Observability](https://docs.datadoghq.com/llm_observability/) to provide AI/ML-specific monitoring capabilities.

### ML App Grouping

LLM traces are grouped under an **ML App** in Datadog. By default, this uses your `service_name`, but you can specify a dedicated ML App name:

```json theme={null}
{
  "service_name": "bifrost-gateway",
  "ml_app": "customer-support-ai"
}
```

This allows you to:

* Group related LLM operations across multiple services
* Track costs and performance by application
* Apply ML-specific alerts and dashboards

### Session Tracking

The plugin supports session tracking via the `x-bf-session-id` header. Include this header in your requests to group related LLM calls into a conversation session:

```bash theme={null}
curl -X POST https://your-bifrost-gateway/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-bf-session-id: user-123-session-456" \
  -d '{...}'
```

Sessions appear in Datadog LLM Observability, allowing you to trace entire conversation flows.

### Grouping APM Traces by Session

By default, every request is its own APM trace, so a multi-turn conversation appears as many separate traces in the APM trace view. (Cross-trace grouping normally lives in [LLM Observability sessions](#session-tracking), a separate product from APM.) Enable `group_traces_by_session` to instead group every request sharing the same `x-bf-session-id` into a **single APM trace**, where each request renders as a top-level sibling span:

```json theme={null}
{
  "group_traces_by_session": true
}
```

```bash theme={null}
# Both requests below land in the same APM trace
curl ... -H "x-bf-session-id: user-123-session-456" -d '{...}'
curl ... -H "x-bf-session-id: user-123-session-456" -d '{...}'
```

Bifrost derives a stable Datadog trace ID from the `x-bf-session-id` value, so all requests carrying that header resolve to the same trace.

<Note>
  * **Agent mode only.** APM spans are emitted only in agent mode. In agentless mode the plugin emits LLM Observability spans only, which already group via [session tracking](#session-tracking).
  * **W3C traceparent takes precedence.** If a request carries an inbound [`traceparent`](#w3c-distributed-tracing) header, it stays on that distributed trace and is not regrouped by session.
  * **APM traces are not built for long-lived sessions.** Datadog has practical limits on spans-per-trace and trace-assembly windows; very long sessions may render with large time gaps or be truncated. For long conversations, prefer LLM Observability sessions.
</Note>
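This page does not specify how the trace ID is derived from the session ID. The Go sketch below only illustrates the property that matters: a deterministic function maps the same `x-bf-session-id` to the same ID every time. `traceIDFromSession` and the choice of SHA-256 are assumptions for illustration, not Bifrost's actual derivation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// traceIDFromSession is a hypothetical derivation, not Bifrost's actual one.
// It hashes the session ID with SHA-256 and keeps the first 16 bytes as a
// 128-bit trace ID, rendered as 32 hex characters.
func traceIDFromSession(sessionID string) string {
	sum := sha256.Sum256([]byte(sessionID))
	return hex.EncodeToString(sum[:16])
}

func main() {
	a := traceIDFromSession("user-123-session-456")
	b := traceIDFromSession("user-123-session-456")
	c := traceIDFromSession("user-999-session-000")

	fmt.Println("same session, same ID:", a == b)
	fmt.Println("different session, same ID:", a == c)
	fmt.Println("hex length:", len(a))
}
```

Because the mapping needs no shared state, separate gateway instances handling requests from one session would compute the same ID independently.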

### W3C Distributed Tracing

The plugin supports [W3C Trace Context](https://www.w3.org/TR/trace-context/) for distributed tracing across services. When your upstream service sends a `traceparent` header, Bifrost automatically links its spans as children of the parent trace.

```bash theme={null}
curl -X POST https://your-bifrost-gateway/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" \
  -d '{...}'
```

This enables:

* **End-to-end visibility** - See LLM calls in the context of your full application trace
* **Cross-service correlation** - Link frontend requests → backend services → Bifrost → LLM providers
* **Latency attribution** - Understand how LLM latency contributes to overall request time

The `traceparent` header format follows the W3C standard:

```
traceparent: {version}-{trace-id}-{parent-id}-{trace-flags}
```

All Datadog APM spans created by Bifrost will be linked to the parent span, appearing as children in the Datadog trace view.
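For upstream services that want to validate the header before forwarding it, here is a minimal Go sketch of a `traceparent` parser following the W3C field layout shown above. It handles version `00` only, and `parseTraceParent` is a hypothetical helper rather than anything exposed by Bifrost.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// traceParent holds the four fields of a version-00 traceparent header.
type traceParent struct {
	Version  string
	TraceID  string
	ParentID string
	Sampled  bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if (c < '0' || c > '9') && (c < 'a' || c > 'f') {
			return false
		}
	}
	return true
}

// parseTraceParent validates and splits a version-00 traceparent value.
func parseTraceParent(header string) (traceParent, error) {
	parts := strings.Split(strings.TrimSpace(header), "-")
	if len(parts) != 4 {
		return traceParent{}, errors.New("traceparent: want 4 dash-separated fields")
	}
	version, traceID, parentID, flags := parts[0], parts[1], parts[2], parts[3]
	switch {
	case version != "00":
		return traceParent{}, errors.New("traceparent: this sketch only handles version 00")
	case !isLowerHex(traceID, 32) || traceID == strings.Repeat("0", 32):
		return traceParent{}, errors.New("traceparent: invalid trace-id")
	case !isLowerHex(parentID, 16) || parentID == strings.Repeat("0", 16):
		return traceParent{}, errors.New("traceparent: invalid parent-id")
	case !isLowerHex(flags, 2):
		return traceParent{}, errors.New("traceparent: invalid trace-flags")
	}
	bits, _ := strconv.ParseUint(flags, 16, 8) // cannot fail: flags validated above
	return traceParent{
		Version:  version,
		TraceID:  traceID,
		ParentID: parentID,
		Sampled:  bits&0x01 == 1, // lowest bit is the sampled flag
	}, nil
}

func main() {
	tp, err := parseTraceParent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace=%s parent=%s sampled=%t\n", tp.TraceID, tp.ParentID, tp.Sampled)
}
```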

### What's Captured

For each LLM operation, the plugin sends the following to LLM Observability:

* **Input/Output Messages** - Full conversation history with role attribution
* **Token Usage** - Input, output, and total token counts
* **Cost** - Calculated cost in USD based on model pricing
* **Latency** - Request duration and time-to-first-token for streaming
* **Model Info** - Provider, model name, and request parameters
* **Tool Calls** - Function/tool call details for agentic workflows

***

## Metrics Reference

The plugin emits the following metrics to Datadog:

| Metric                               | Type         | Description                        | Tags                            |
| ------------------------------------ | ------------ | ---------------------------------- | ------------------------------- |
| `bifrost.requests.total`             | Counter      | Total LLM requests                 | provider, model, method         |
| `bifrost.success.total`              | Counter      | Successful requests                | provider, model, method         |
| `bifrost.errors.total`               | Counter      | Failed requests                    | provider, model, method, reason |
| `bifrost.latency.seconds`            | Histogram    | Request latency distribution       | provider, model, method         |
| `bifrost.tokens.input`               | Counter      | Input/prompt tokens consumed       | provider, model                 |
| `bifrost.tokens.output`              | Counter      | Output/completion tokens generated | provider, model                 |
| `bifrost.tokens.total`               | Counter      | Total tokens (input + output)      | provider, model                 |
| `bifrost.request.cost.usd`           | Distribution | Per-request cost in USD            | provider, model                 |
| `bifrost.cache.hits`                 | Counter      | Cache hits                         | provider, model, cache\_type    |
| `bifrost.stream.first_token_latency` | Histogram    | Time to first token (streaming)    | provider, model                 |
| `bifrost.stream.inter_token_latency` | Histogram    | Inter-token latency (streaming)    | provider, model                 |

### Migrating from `bifrost.cost.usd`

The cost metric was renamed from `bifrost.cost.usd` to `bifrost.request.cost.usd`, and its type changed from **Gauge** to **Distribution**. The gauge was last-write-wins per flush window, so concurrent requests with the same tags collapsed to a single value and no query could recover the true total spend. The new name is required because Datadog permanently associates a metric name with its first-seen type per organization — orgs that previously received the gauge cannot receive the same name as a distribution.

**Affected assets:** any dashboards, monitors, saved views, or alerts that query `bifrost.cost.usd`.

**To migrate:**

1. Replace `bifrost.cost.usd` with `bifrost.request.cost.usd` in all queries.
2. Update aggregations for Distribution semantics — each sample is one request's cost:
   * Total spend: `sum:bifrost.request.cost.usd{*}` (do **not** append `.as_count()` or `.rollup(sum)`; the `sum:` aggregator already returns the additive total)
   * Per-request statistics: `avg:`, `max:`, or percentile aggregators
3. Recreate monitors and alerts on the new metric, adjusting thresholds if they assumed gauge behavior (the gauge systematically under-reported under concurrent load).

`bifrost.cost.usd` stops receiving data once the upgrade completes; during a rolling deploy both metrics receive data, so update dashboards at or shortly after the upgrade. Historical gauge data remains queryable under the old name for Datadog's standard retention window.
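The difference between the two metric types can be shown with a small Go simulation. This is a simplified model under stated assumptions: `gaugeFlush` keeps only the last sample in a flush window, and `distributionSum` adds every sample. Both function names and the cost values are hypothetical.

```go
package main

import "fmt"

// gaugeFlush models last-write-wins: only the final sample reported within
// a flush window survives.
func gaugeFlush(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	return samples[len(samples)-1]
}

// distributionSum models the additive total that a sum: query returns over
// a distribution, where every sample is retained.
func distributionSum(samples []float64) float64 {
	total := 0.0
	for _, s := range samples {
		total += s
	}
	return total
}

func main() {
	// Three requests with identical tags in one flush window (USD).
	costs := []float64{0.25, 0.5, 0.125}

	fmt.Println("gauge reports:", gaugeFlush(costs))
	fmt.Println("distribution sum:", distributionSum(costs))
}
```

With these samples the gauge reports only the last request's cost, while the distribution sum recovers the full spend for the window.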

### Custom Tags

All metrics include your configured `custom_tags` plus automatic tags for:

* `provider` - LLM provider (openai, anthropic, etc.)
* `model` - Model name
* `method` - Type of request (chat, embedding, etc.)
* `bifrost_node` - Per-instance identity (`BIFROST_NODE_ID` if set, otherwise `hostname-pid`)
* plus Bifrost-context tags when available (virtual key, selected key, team, customer, fallback index)
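The `bifrost_node` fallback described in the list above can be sketched in Go. `nodeTag` is a hypothetical helper, and the `unknown` hostname fallback is an assumption added for the example.

```go
package main

import (
	"fmt"
	"os"
)

// nodeTag sketches the documented fallback for the bifrost_node tag:
// BIFROST_NODE_ID when set, otherwise "<hostname>-<pid>".
func nodeTag(nodeID, hostname string, pid int) string {
	if nodeID != "" {
		return "bifrost_node:" + nodeID
	}
	return fmt.Sprintf("bifrost_node:%s-%d", hostname, pid)
}

func main() {
	hostname, err := os.Hostname()
	if err != nil {
		hostname = "unknown" // assumption: placeholder when the lookup fails
	}
	fmt.Println(nodeTag(os.Getenv("BIFROST_NODE_ID"), hostname, os.Getpid()))
}
```

Setting `BIFROST_NODE_ID` explicitly keeps the tag value stable across restarts, whereas the `hostname-pid` fallback changes whenever the process ID changes.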

***

## Captured Data

Each APM trace includes comprehensive LLM operation metadata:

### Span Attributes

* **Span Name** - Based on request type (`genai.chat`, `genai.embedding`, etc.)
* **Service Info** - `service.name`, `service.version`, `env` (Datadog's [unified service tagging](https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/): `service`, `version`, `env`)
* **Provider & Model** - `gen_ai.provider.name`, `gen_ai.request.model`

### Request Parameters

* Temperature, max\_tokens, top\_p, stop sequences
* Presence/frequency penalties
* Tool configurations and parallel tool calls
* Custom parameters via `ExtraParams`

### Input/Output Data

* Complete chat history with role-based messages
* Prompt text for completions
* Response content with role attribution
* Tool calls and results
* Reasoning and refusal content (when present)

When Enterprise guardrail redaction is enabled, Bifrost applies trace redaction replacements before exporting completed traces to Datadog. Exported span content receives the redacted or placeholderized value, but reversible reveal mappings are not exported. For the full mode matrix, see [Guardrail Redaction](/enterprise/guardrails/redaction).

### Performance Metrics

* Token usage (prompt, completion, total)
* Cost calculations in USD
* Latency and timing (start/end timestamps)
* Time to first token (streaming)
* Error details with status codes

### Bifrost Context

* Virtual key ID and name
* Selected key ID and name
* Team ID and name
* Customer ID and name
* Retry count and fallback index

***

## Plugin Span Filtering

By default every plugin's pre- and post-hook execution generates a span, which can bloat APM traces when many plugins are active (e.g. 8 built-in plugins × 2 hooks = 16 plugin spans per request). Use `plugin_span_filter` inside the Datadog plugin config to control which plugin spans are exported. This affects only the exported APM trace spans — plugin execution and metrics are unchanged.

**Via config.json** (inside the Datadog plugin config):

```json theme={null}
{
  "plugins": [
    {
      "name": "datadog",
      "enabled": true,
      "config": {
        "service_name": "bifrost",
        "agent_addr": "localhost:8126",
        "enable_traces": true,
        "plugin_span_filter": {
          "mode": "exclude",
          "plugins": ["logging", "compat", "telemetry"]
        }
      }
    }
  ]
}
```

**Via the UI**: Open the **Observability** page, select the **Datadog** connector, and click **Configure Plugin Tracing**. Toggle individual plugins on or off and save. UI-saved settings persist across restarts unless `plugin_span_filter` is set in config.json with a higher `version` value.

**Filter modes:**

| Mode      | Behaviour                                            |
| --------- | ---------------------------------------------------- |
| `exclude` | Export spans for all plugins **except** those listed |
| `include` | Export spans **only** for the listed plugins         |

**Plugin names:** list each plugin using the exact name shown for it in the **Configure Plugin Tracing** sheet — this is the same name that appears in the span (`plugin.<name>.<stage>`), and it is what the filter matches against. Note that some plugins are registered under a different name than their config key: the enterprise prompts and governance plugins appear as `enterprise-prompts` and `enterprise-governance` (not `prompts`/`governance`). Common names include `telemetry`, `logging`, `otel`, `semantic_cache`, `compat`, `maxim`, `enterprise-prompts`, `enterprise-governance`, `datadog`, `bigquery`, `guardrails`, `adaptive-loadbalancer`, and `model-catalog-resolver`. The exact set depends on which plugins are loaded in your deployment.

When a plugin span is filtered out, its children are automatically re-parented to the nearest exported ancestor so the trace hierarchy stays connected. The filter applies to APM trace spans only; it does not change DogStatsD metrics, which are never derived from plugin spans.

<Note>
  Each observability connector has its own independent `plugin_span_filter`: filtering plugin spans for Datadog does not affect OTel, BigQuery, or any other connector. `plugin_span_filter` follows the standard plugin config precedence rules; to make a config.json value override UI-saved DB settings on restart, set a higher `version` on the Datadog plugin entry (e.g. `"version": 2`). See [Plugin Versioning](/deployment-guides/config-json/plugins) for details.
</Note>
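The filter modes and the re-parenting rule described in this section can be sketched in Go. The `span` and `spanFilter` types are simplified stand-ins invented for this example, and the assumption that a non-plugin span can be the child of a plugin span is illustrative; the plugin's real data structures may differ.

```go
package main

import "fmt"

// span is a minimal stand-in for an APM span. Plugin is empty for
// non-plugin spans, which are always exported.
type span struct {
	ID       string
	ParentID string
	Plugin   string
}

// spanFilter mirrors the documented plugin_span_filter shape.
type spanFilter struct {
	Mode    string // "exclude" or "include"
	Plugins map[string]bool
}

// exported reports whether a span survives the filter.
func (f spanFilter) exported(s span) bool {
	if s.Plugin == "" {
		return true
	}
	listed := f.Plugins[s.Plugin]
	if f.Mode == "include" {
		return listed
	}
	return !listed
}

// apply drops filtered plugin spans and re-parents each surviving span to
// its nearest exported ancestor. It assumes the parent links are acyclic.
func (f spanFilter) apply(spans []span) []span {
	byID := make(map[string]span, len(spans))
	for _, s := range spans {
		byID[s.ID] = s
	}
	var out []span
	for _, s := range spans {
		if !f.exported(s) {
			continue
		}
		parent := s.ParentID
		for parent != "" {
			p, ok := byID[parent]
			if !ok || f.exported(p) {
				break // unknown (external) or exported parent: keep it
			}
			parent = p.ParentID // skip the filtered ancestor
		}
		s.ParentID = parent
		out = append(out, s)
	}
	return out
}

func main() {
	spans := []span{
		{ID: "root"},
		{ID: "log", ParentID: "root", Plugin: "logging"},
		{ID: "llm", ParentID: "log"},
	}
	f := spanFilter{Mode: "exclude", Plugins: map[string]bool{"logging": true}}
	for _, s := range f.apply(spans) {
		fmt.Printf("%s parent=%q\n", s.ID, s.ParentID)
	}
}
```

In this example the `logging` span is dropped and the `llm` span attaches directly to `root`, so the exported trace stays connected.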

***

## Supported Request Types

The Datadog plugin captures all Bifrost request types:

| Request Type                  | Span Name             | LLM Obs Type   |
| ----------------------------- | --------------------- | -------------- |
| Chat Completion               | `genai.chat`          | LLM Span       |
| Chat Completion (streaming)   | `genai.chat`          | LLM Span       |
| Text Completion               | `genai.text`          | LLM Span       |
| Text Completion (streaming)   | `genai.text`          | LLM Span       |
| Embeddings                    | `genai.embedding`     | Embedding Span |
| Speech Generation             | `genai.speech`        | Task Span      |
| Speech Generation (streaming) | `genai.speech`        | Task Span      |
| Transcription                 | `genai.transcription` | Task Span      |
| Transcription (streaming)     | `genai.transcription` | Task Span      |
| Responses API                 | `genai.responses`     | LLM Span       |
| Responses API (streaming)     | `genai.responses`     | LLM Span       |

***

## When to Use

### Datadog Plugin

Choose the Datadog plugin when you:

* Use Datadog as your primary observability platform
* Want native LLM Observability integration with ML App grouping
* Need seamless correlation with existing Datadog APM traces via W3C distributed tracing
* Require Datadog-specific features like notebooks and dashboards
* Want session tracking for conversation flows

### vs. OTel Plugin

Use the [OTel plugin](/features/observability/otel) when you:

* Need multi-vendor observability (send to multiple backends)
* Are using Datadog via an OpenTelemetry Collector
* Want vendor flexibility to switch backends without code changes
* Prefer standardized OpenTelemetry semantic conventions

<Note>
  You can use both plugins simultaneously if needed. The Datadog plugin provides native integration while OTel can send to additional backends.
</Note>

### vs. Built-in Observability

Use [Built-in Observability](/features/observability/default) for:

* Local development and testing
* Simple self-hosted deployments
* Setups that must avoid external dependencies
* Direct database access to logs

***

## Troubleshooting

### Agent Connectivity Issues

Verify the Datadog Agent is running and accessible:

```bash theme={null}
# Check agent status
datadog-agent status

# Test APM endpoint
curl -v http://localhost:8126/info

# Send a test DogStatsD packet (UDP is connectionless, so a clean exit does not confirm receipt)
echo "test.metric:1|c" | nc -u -w1 localhost 8125
```

### Agentless Mode Not Working

1. Verify your API key is valid (if you are not on US1, replace `datadoghq.com` with your configured `site`):

```bash theme={null}
curl -X GET "https://api.datadoghq.com/api/v1/validate" \
  -H "DD-API-KEY: $DD_API_KEY"
```

2. Ensure the `site` matches your API key's region

3. Check that the API key environment variable is set:

```bash theme={null}
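# Reports whether the variable is set without printing the secret
[ -n "$DD_API_KEY" ] && echo "DD_API_KEY is set" || echo "DD_API_KEY is not set"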
```

### Missing Traces

1. Enable debug logging in Bifrost:

```bash theme={null}
bifrost-http --log-level debug
```

2. Verify traces are enabled in your configuration:

```json theme={null}
{
  "enable_traces": true,
  "enable_llm_obs": true
}
```

3. Check for errors in the Bifrost logs related to the Datadog plugin

### Missing Metrics

1. Verify DogStatsD is running (agent mode):

```bash theme={null}
datadog-agent status | grep DogStatsD
```

2. Ensure metrics are enabled:

```json theme={null}
{
  "enable_metrics": true
}
```

3. For agentless mode, verify your API key has metrics submission permissions

### LLM Observability Not Appearing

1. Confirm `enable_llm_obs` is `true` (the default)
2. Verify your Datadog plan includes LLM Observability
3. Check the ML App name in Datadog under **LLM Observability** → **Applications**

***

## Next Steps

* **[OTel Plugin](/features/observability/otel)** - OpenTelemetry integration for multi-vendor observability
* **[Built-in Observability](/features/observability/default)** - Local logging for development
* **[Telemetry](/features/telemetry)** - Prometheus metrics and dashboards