# Reasoning

> Cross-provider reference for reasoning and thinking capabilities in AI models

## Overview

Reasoning (called "thinking" by some providers) lets a model expose its step-by-step thought process before it gives a final answer. Several providers offer the feature, each with its own request and response format.

<Info>
  Bifrost normalizes all provider-specific reasoning formats to a consistent OpenAI-compatible structure using `reasoning` in requests and `reasoning_details` in responses.
</Info>

***

## Provider Support Matrix

| Provider            | Request Field     | Response Field      | Min Budget      | Effort Levels                               | Streaming |
| ------------------- | ----------------- | ------------------- | --------------- | ------------------------------------------- | --------- |
| OpenAI              | `reasoning`       | `reasoning_details` | None            | `minimal`, `low`, `medium`, `high`          | ✅         |
| Anthropic           | `thinking`        | Content blocks      | **1024 tokens** | `enabled` only                              | ✅         |
| Bedrock (Anthropic) | `thinking`        | Content blocks      | **1024 tokens** | `enabled` only                              | ✅         |
| Gemini 2.5+         | `thinking_config` | `thought` parts     | 1024            | Budget-only                                 | ✅         |
| Gemini 3.0+         | `thinking_config` | `thought` parts     | 1024            | `minimal`, `low`, `medium`, `high` + Budget | ✅         |

***

## Request Configuration

### Chat Completions API

<Tabs>
  <Tab title="JSON">
    ```json theme={null}
    {
      "model": "provider/model-name",
      "messages": [...],
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    package main

    import (
    	"github.com/maximhq/bifrost/core/schemas"
    )

    // Declared at package level so the snippet parses on its own.
    var chatReq = &schemas.BifrostChatRequest{
    	Provider: schemas.OpenAI,
    	Model:    "gpt-4o",
    	Input: []schemas.ChatMessage{
    		{
    			Role: schemas.ChatMessageRoleUser,
    			Content: &schemas.ChatMessageContent{
    				ContentStr: schemas.Ptr("Explain quantum computing"),
    			},
    		},
    	},
    	Params: &schemas.ChatParameters{
    		MaxCompletionTokens: schemas.Ptr(4096),
    		Reasoning: &schemas.ChatReasoning{
    			Effort:    schemas.Ptr("high"),
    			MaxTokens: schemas.Ptr(4096),
    		},
    	},
    }
    ```
  </Tab>
</Tabs>

### Responses API

<Tabs>
  <Tab title="JSON">
    ```json theme={null}
    {
      "model": "provider/model-name",
      "input": [...],
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096,
        "summary": "detailed"
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    package main

    import (
    	"github.com/maximhq/bifrost/core/schemas"
    )

    var responsesReq = &schemas.BifrostResponsesRequest{
    	Provider: schemas.Anthropic,
    	Model:    "claude-3-5-sonnet-20241022",
    	Input: []schemas.ResponsesMessage{
    		{
    			Role: schemas.Ptr(schemas.ResponsesInputMessageRoleUser),
    			Content: &schemas.ResponsesMessageContent{
    				ContentStr: schemas.Ptr("Explain quantum computing"),
    			},
    		},
    	},
    	Params: &schemas.ResponsesParameters{
    		MaxOutputTokens: schemas.Ptr(4096),
    		Reasoning: &schemas.ResponsesParametersReasoning{
    			Effort:    schemas.Ptr("high"),
    			MaxTokens: schemas.Ptr(4096),
    			Summary:   schemas.Ptr("detailed"),
    		},
    	},
    }
    ```
  </Tab>
</Tabs>

<Note>
  Responses API supports both `effort` + `max_tokens` (like Chat Completions) and adds the optional `summary` parameter for output summarization.
</Note>

### Parameter Reference

#### Chat Completions API Parameters

| Parameter    | Type     | Description                           |
| ------------ | -------- | ------------------------------------- |
| `effort`     | `string` | Reasoning intensity level             |
| `max_tokens` | `int`    | Maximum tokens for reasoning (budget) |

#### Responses API Parameters

| Parameter    | Type     | Description                                   |
| ------------ | -------- | --------------------------------------------- |
| `effort`     | `string` | Reasoning intensity level                     |
| `max_tokens` | `int`    | Maximum tokens for reasoning (budget)         |
| `summary`    | `string` | Summary level: `brief`, `detailed`, or `json` |


***

## Provider-Specific Conversions

### OpenAI

OpenAI accepts effort-based reasoning only. Bifrost resolves the effort in this order:

1. If `reasoning.effort` is provided → use it directly
2. Else if `reasoning.max_tokens` is provided → estimate effort from it

In both cases, the `max_tokens` field is cleared before the request is sent to OpenAI.

**Conversion Examples**:

<Tabs>
  <Tab title="Effort (JSON)">
    ```json theme={null}
    // Bifrost Request (with effort)
    {
      "reasoning": {
        "effort": "high"
      }
    }

    // OpenAI Request Sent
    {
      "reasoning": {
        "effort": "high"
      }
    }
    ```
  </Tab>

  <Tab title="Effort (Go)">
    ```go theme={null}
    // Bifrost request with effort (native field)
    chatReq := &schemas.BifrostChatRequest{
    	Provider: schemas.OpenAI,
    	Model:    "gpt-4o",
    	Input:    messages,
    	Params: &schemas.ChatParameters{
    		MaxCompletionTokens: schemas.Ptr(4096),
    		Reasoning: &schemas.ChatReasoning{
    			Effort: schemas.Ptr("high"),
    		},
    	},
    }

    // OpenAI receives effort directly, max_tokens is cleared
    ```
  </Tab>

  <Tab title="Max Tokens (JSON)">
    ```json theme={null}
    // Bifrost Request (with max_tokens only)
    {
      "max_completion_tokens": 4096,
      "reasoning": {
        "max_tokens": 3000
      }
    }

    // Estimation: ratio = 3000/4096 ≈ 0.73 → "high"
    // OpenAI Request Sent
    {
      "reasoning": {
        "effort": "high"
      }
    }
    ```
  </Tab>

  <Tab title="Max Tokens (Go)">
    ```go theme={null}
    // Bifrost request with max_tokens only
    chatReq := &schemas.BifrostChatRequest{
    	Provider: schemas.OpenAI,
    	Model:    "gpt-4o",
    	Input:    messages,
    	Params: &schemas.ChatParameters{
    		MaxCompletionTokens: schemas.Ptr(4096),
    		Reasoning: &schemas.ChatReasoning{
    			MaxTokens: schemas.Ptr(3000),
    		},
    	},
    }

    // Bifrost estimates effort from max_tokens
    // ratio = 3000/4096 ≈ 0.73 → effort = "high"
    // OpenAI receives effort, max_tokens cleared
    ```
  </Tab>
</Tabs>

**Supported Effort Levels**: `minimal`, `low`, `medium`, `high`

<Note>
  When `minimal` is encountered, it's converted to `low` for non-OpenAI providers. OpenAI receives only: `low`, `medium`, `high`.
</Note>

***

### Anthropic

Anthropic uses a `thinking` parameter with a different structure.

<Tabs>
  <Tab title="Request Conversion (JSON)">
    ```json theme={null}
    // Bifrost Request
    {
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096
      }
    }

    // Anthropic Request
    {
      "thinking": {
        "type": "enabled",
        "budget_tokens": 4096
      }
    }
    ```
  </Tab>

  <Tab title="Request Conversion (Go)">
    ```go theme={null}
    // Using Bifrost Go SDK
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Anthropic,
      Model:    "claude-3-5-sonnet-20241022",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          MaxTokens: schemas.Ptr(4096), // Anthropic native field
        },
      },
    }

    // Bifrost converts to Anthropic format:
    // {
    //   "thinking": {
    //     "type": "enabled",
    //     "budget_tokens": 4096
    //   }
    // }
    ```
  </Tab>

  <Tab title="Response Conversion (JSON)">
    ```json theme={null}
    // Anthropic Response (content blocks)
    {
      "content": [
        {
          "type": "thinking",
          "thinking": "Let me analyze this step by step...",
          "signature": "EqoBCkgIAR..."
        },
        {
          "type": "text",
          "text": "The answer is 42."
        }
      ]
    }

    // Bifrost Response
    {
      "choices": [{
        "message": {
          "content": "The answer is 42.",
          "reasoning": "Let me analyze this step by step...",
          "reasoning_details": [{
            "index": 0,
            "type": "text",
            "text": "Let me analyze this step by step...",
            "signature": "EqoBCkgIAR..."
          }]
        }
      }]
    }
    ```
  </Tab>

  <Tab title="Response Conversion (Go)">
    ```go theme={null}
    // After calling Bifrost Chat Completions with reasoning
    resp, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), chatReq)
    if err != nil {
      log.Fatal(err)
    }

    // Extract reasoning from response
    choice := resp.Choices[0]
    message := choice.Message

    // Access combined reasoning text
    reasoningText := message.Reasoning

    // Access detailed reasoning blocks
    for i, details := range message.ReasoningDetails {
      fmt.Printf("Block %d: %s\n", i, details.Text)
      if details.Signature != "" {
        fmt.Printf("  Signature: %s\n", details.Signature)
      }
    }
    ```
  </Tab>
</Tabs>

**Conversion Rules**:

| Bifrost                | Anthropic                | Notes                        |
| ---------------------- | ------------------------ | ---------------------------- |
| `reasoning.effort`     | `thinking.type`          | Always mapped to `"enabled"` |
| `reasoning.max_tokens` | `thinking.budget_tokens` | Token budget for reasoning   |

<Warning>
  **Critical Constraint**: Anthropic requires `reasoning.max_tokens >= 1024`. Requests with lower values will **fail with an error**.
</Warning>

**Dynamic Budget Handling**:

| Input Value    | Converted To             |
| -------------- | ------------------------ |
| `-1` (dynamic) | `1024` (minimum default) |
| `< 1024`       | **Error**                |
| `>= 1024`      | Pass-through             |

**Code Reference**: `core/providers/anthropic/chat.go:104-134`
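
The budget rules in the table above can be sketched as a small standalone function. This is an illustration only, not the code at the referenced path: `normalizeAnthropicBudget`, the constants, and the error value are hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// minAnthropicBudget mirrors the 1024-token minimum described above.
const minAnthropicBudget = 1024

// dynamicBudget is the -1 sentinel that requests a dynamic budget.
const dynamicBudget = -1

// errBudgetTooLow is a hypothetical error value used only in this sketch.
var errBudgetTooLow = errors.New("reasoning.max_tokens must be >= 1024")

// normalizeAnthropicBudget applies the three table rows:
// -1 becomes the minimum, values below the minimum are rejected,
// and anything else passes through unchanged.
func normalizeAnthropicBudget(maxTokens int) (int, error) {
	switch {
	case maxTokens == dynamicBudget:
		return minAnthropicBudget, nil
	case maxTokens < minAnthropicBudget:
		return 0, errBudgetTooLow
	default:
		return maxTokens, nil
	}
}

func main() {
	for _, in := range []int{-1, 512, 4096} {
		budget, err := normalizeAnthropicBudget(in)
		fmt.Println(in, budget, err)
	}
}
```

Validating the budget before the request is sent surfaces the 1024-token constraint as a local error instead of a provider-side failure.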

***

### Bedrock (Anthropic Models)

Bedrock uses the same structure as Anthropic for Claude models.

<Tabs>
  <Tab title="Request (JSON)">
    ```json theme={null}
    // Bifrost Request
    {
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096
      }
    }

    // Bedrock Request (for Anthropic/Claude models)
    {
      "additionalModelRequestFields": {
        "reasoning_config": {
          "type": "enabled",
          "budget_tokens": 4096
        }
      }
    }
    ```
  </Tab>

  <Tab title="Request (Go)">
    ```go theme={null}
    // Using Bifrost Go SDK with Bedrock provider
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Bedrock,
      Model:    "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          MaxTokens: schemas.Ptr(4096), // Bedrock Anthropic native field
        },
      },
    }

    // Bifrost converts to Bedrock format with reasoning_config
    ```
  </Tab>
</Tabs>

<Note>
  The same 1024 minimum token budget constraint applies to Bedrock Anthropic models. Attempts to set `max_tokens` below 1024 will result in an error.
</Note>

**Code Reference**: `core/providers/bedrock/utils.go:34-47`

***

### Bedrock (Nova Models)

Bedrock Nova models use an effort-based approach similar to OpenAI.

<Tabs>
  <Tab title="Request Conversion (JSON)">
    ```json theme={null}
    // Bifrost Request
    {
      "reasoning": {
        "effort": "high",
        "max_tokens": 4096
      }
    }

    // Bedrock Request (for Nova models)
    {
      "additionalModelRequestFields": {
        "reasoningConfig": {
          "type": "enabled",
          "maxReasoningEffort": "high"
        }
      }
    }
    ```
  </Tab>

  <Tab title="Request Conversion (Go)">
    ```go theme={null}
    // Using Bifrost Go SDK with Bedrock Nova
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Bedrock,
      Model:    "us.amazon.nova-pro-v1:0",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          Effort: schemas.Ptr("high"), // Nova native field
        },
      },
    }

    // Bifrost converts to Bedrock Nova format:
    // reasoningConfig: {
    //   type: "enabled",
    //   maxReasoningEffort: "high"
    // }
    ```
  </Tab>

  <Tab title="Effort Levels">
    | Bifrost Effort   | Nova Effort | Configuration                             |
    | ---------------- | ----------- | ----------------------------------------- |
    | `minimal`, `low` | `"low"`     | Normal parameters allowed                 |
    | `medium`         | `"medium"`  | Normal parameters allowed                 |
    | `high`           | `"high"`    | Clears `maxTokens`, `temperature`, `topP` |
  </Tab>
</Tabs>

**Key Differences from Anthropic**:

* No minimum token budget constraint
* Uses effort levels instead of token budgets
* High effort mode automatically clears conflicting parameters

**Code Reference**: `core/providers/bedrock/utils.go:48-89`
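
As a rough illustration of the effort mapping and the high-effort parameter clearing described above, the following sketch uses a hypothetical `novaRequest` struct and `applyNovaEffort` helper. Neither name comes from the Bifrost codebase, and rejecting unknown effort values is an assumption of this sketch.

```go
package main

import "fmt"

// novaRequest is a hypothetical, trimmed-down request shape for this sketch.
type novaRequest struct {
	MaxTokens   *int
	Temperature *float64
	TopP        *float64
	Effort      string // value destined for maxReasoningEffort
}

// applyNovaEffort maps a Bifrost effort to a Nova effort and, for "high",
// clears the parameters listed as conflicting in the table above.
func applyNovaEffort(req *novaRequest, effort string) error {
	switch effort {
	case "minimal", "low":
		req.Effort = "low"
	case "medium":
		req.Effort = "medium"
	case "high":
		req.Effort = "high"
		req.MaxTokens, req.Temperature, req.TopP = nil, nil, nil
	default:
		// Rejecting unknown values is a choice made for this sketch.
		return fmt.Errorf("unsupported effort %q", effort)
	}
	return nil
}

func main() {
	maxTokens, temperature := 4096, 0.7
	req := &novaRequest{MaxTokens: &maxTokens, Temperature: &temperature}
	if err := applyNovaEffort(req, "high"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Effort, req.MaxTokens == nil, req.Temperature == nil)
}
```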

***

### Gemini

Gemini uses `thinking_config`, which accepts a token budget on all supported models and an effort level on Gemini 3.0+.

#### Model Version Support

| Gemini Version | `thinkingBudget` | `thinkingLevel` | Notes                         |
| -------------- | ---------------- | --------------- | ----------------------------- |
| **2.5+**       | ✅                | ❌               | Budget-only models            |
| **3.0+**       | ✅                | ✅               | Support both budget and level |

<Warning>
  **Important**: Only ONE parameter (`thinkingBudget` or `thinkingLevel`) should be sent to Gemini at a time. When both `reasoning.max_tokens` and `reasoning.effort` are provided in a Bifrost request, `max_tokens` takes priority and is converted to `thinkingBudget`.
</Warning>

#### Priority Rules

Bifrost resolves `reasoning.max_tokens` and `reasoning.effort` in this order:

```
1. If max_tokens is provided → USE thinkingBudget (ignores effort)
2. Else if effort is provided:
   - Gemini 3.0+ → USE thinkingLevel (more native)
   - Gemini 2.5 → CONVERT effort to thinkingBudget
3. Else → disable reasoning
```
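
The priority rules above can be sketched as a single selection function. The `thinkingConfig` struct, the `supportsLevel` flag, and the injected `estimate` callback are assumptions made for illustration; handling of `"none"` and the Pro-model remapping is left out to keep the sketch short, and the disabled case is represented here by the zero value.

```go
package main

import "fmt"

// thinkingConfig is a hypothetical stand-in for Gemini's thinking_config.
type thinkingConfig struct {
	IncludeThoughts bool
	ThinkingBudget  *int
	ThinkingLevel   string
}

// resolveGeminiThinking picks exactly one of budget or level.
// supportsLevel is true for Gemini 3.0+ models; estimate converts an
// effort string to a token budget for models that only accept budgets.
func resolveGeminiThinking(maxTokens *int, effort *string, supportsLevel bool, estimate func(string) int) thinkingConfig {
	switch {
	case maxTokens != nil:
		// Rule 1: a budget always wins; a budget of 0 also hides thoughts.
		b := *maxTokens
		return thinkingConfig{IncludeThoughts: b != 0, ThinkingBudget: &b}
	case effort != nil && supportsLevel:
		// Rule 2a: Gemini 3.0+ takes the level natively.
		return thinkingConfig{IncludeThoughts: true, ThinkingLevel: *effort}
	case effort != nil:
		// Rule 2b: budget-only models get an estimated budget.
		b := estimate(*effort)
		return thinkingConfig{IncludeThoughts: true, ThinkingBudget: &b}
	default:
		// Rule 3: nothing requested, reasoning stays disabled.
		return thinkingConfig{}
	}
}

func main() {
	budget, effort := 4096, "high"
	estimate := func(string) int { return 3482 } // stub estimator

	both := resolveGeminiThinking(&budget, &effort, true, estimate)
	fmt.Println(*both.ThinkingBudget, both.ThinkingLevel == "")

	levelOnly := resolveGeminiThinking(nil, &effort, true, estimate)
	fmt.Println(levelOnly.ThinkingLevel)
}
```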

<Tabs>
  <Tab title="Budget Priority (JSON)">
    ```json theme={null}
    // Bifrost Request - Both fields provided
    {
      "model": "gemini-3.0-flash",
      "reasoning": {
        "effort": "high",        // Ignored
        "max_tokens": 4096      // Takes priority
      }
    }

    // Gemini 3.0+ Request - Only budget sent
    {
      "generation_config": {
        "thinking_config": {
          "include_thoughts": true,
          "thinking_budget": 4096
        }
      }
    }
    ```
  </Tab>

  <Tab title="Effort to Level (Gemini 3.0+)">
    ```json theme={null}
    // Bifrost Request - Effort only
    {
      "model": "gemini-3.0-flash",
      "reasoning": {
        "effort": "high"
      }
    }

    // Gemini 3.0+ Request - Converted to level
    {
      "generation_config": {
        "thinking_config": {
          "include_thoughts": true,
          "thinking_level": "high"
        }
      }
    }
    ```
  </Tab>

  <Tab title="Effort to Budget (Gemini 2.5)">
    ```json theme={null}
    // Bifrost Request - Effort only
    {
      "model": "gemini-2.5-flash",
      "max_completion_tokens": 4096,
      "reasoning": {
        "effort": "high"
      }
    }

    // Gemini 2.5 Request - Converted to budget
    // Calculation: 1024 + (0.80 × (4096 - 1024)) = 3482
    {
      "generation_config": {
        "thinking_config": {
          "include_thoughts": true,
          "thinking_budget": 3482
        }
      }
    }
    ```
  </Tab>
</Tabs>

#### Model-Specific Level Conversions

Gemini Pro models have stricter constraints on thinking levels:

| Bifrost Effort | Non-Pro Models | Pro Models   | Notes                       |
| -------------- | -------------- | ------------ | --------------------------- |
| `"none"`       | Empty string   | Empty string | Disables thinking           |
| `"minimal"`    | `"minimal"`    | `"low"`      | Pro doesn't support minimal |
| `"low"`        | `"low"`        | `"low"`      | Supported on all            |
| `"medium"`     | `"medium"`     | `"high"`     | Pro doesn't support medium  |
| `"high"`       | `"high"`       | `"high"`     | Supported on all            |

**Example**:

```text theme={null}
// For "gemini-3.0-flash-thinking-exp" (non-Pro)
effort: "medium" → thinkingLevel: "medium"

// For "gemini-3.0-pro" (Pro model)
effort: "medium" → thinkingLevel: "high"  // Converted up
```
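
A minimal sketch of the level table above follows. How Bifrost detects a Pro model is not stated here, so `isProModel` uses a hypothetical substring check purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isProModel is a hypothetical heuristic for this sketch: it treats any
// model ID containing "-pro" as a Pro model.
func isProModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "-pro")
}

// thinkingLevelFor maps a Bifrost effort to a Gemini thinking level
// following the conversion table above.
func thinkingLevelFor(model, effort string) string {
	if effort == "none" {
		return "" // empty string disables thinking
	}
	if !isProModel(model) {
		return effort // non-Pro models take the level as-is
	}
	switch effort {
	case "minimal":
		return "low" // Pro has no minimal level
	case "medium":
		return "high" // Pro has no medium level
	default:
		return effort
	}
}

func main() {
	fmt.Println(thinkingLevelFor("gemini-3.0-flash-thinking-exp", "medium"))
	fmt.Println(thinkingLevelFor("gemini-3.0-pro", "medium"))
}
```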

#### Special Values

| Value    | Field        | Behavior                                        | Use Case                            |
| -------- | ------------ | ----------------------------------------------- | ----------------------------------- |
| `0`      | `max_tokens` | `thinking_budget: 0`, `include_thoughts: false` | Explicitly disable reasoning        |
| `-1`     | `max_tokens` | `thinking_budget: -1`                           | **Dynamic budget** (Gemini decides) |
| `"none"` | `effort`     | `thinking_budget: 0`, `include_thoughts: false` | Disable reasoning                   |

<Tabs>
  <Tab title="Dynamic Budget (JSON)">
    ```json theme={null}
    // Bifrost Request - Dynamic budget
    {
      "reasoning": {
        "max_tokens": -1
      }
    }

    // Gemini Request - Sent as-is
    {
      "generation_config": {
        "thinking_config": {
          "include_thoughts": true,
          "thinking_budget": -1
        }
      }
    }
    ```
  </Tab>

  <Tab title="Disable Reasoning (JSON)">
    ```json theme={null}
    // Bifrost Request - Method 1
    {
      "reasoning": {
        "max_tokens": 0
      }
    }

    // Bifrost Request - Method 2
    {
      "reasoning": {
        "effort": "none"
      }
    }

    // Gemini Request - Both become
    {
      "generation_config": {
        "thinking_config": {
          "include_thoughts": false,
          "thinking_budget": 0
        }
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK Examples">
    ```go theme={null}
    // Using Bifrost Go SDK with Gemini
    // Example 1: Dynamic budget
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Gemini,
      Model:    "gemini-2.0-flash-thinking-exp-1219",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          MaxTokens: schemas.Ptr(-1), // Let Gemini decide
        },
      },
    }

    // Example 2: Effort-based for Gemini 3.0+ (reassigns chatReq from Example 1)
    chatReq = &schemas.BifrostChatRequest{
      Provider: schemas.Gemini,
      Model:    "gemini-3.0-flash",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          Effort: schemas.Ptr("high"), // Converts to thinkingLevel
        },
      },
    }

    // Example 3: Budget-based (all versions)
    chatReq = &schemas.BifrostChatRequest{
      Provider: schemas.Gemini,
      Model:    "gemini-2.5-flash",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          MaxTokens: schemas.Ptr(3000), // Direct budget
        },
      },
    }
    ```
  </Tab>
</Tabs>

#### Response Conversion

<Tabs>
  <Tab title="Response (JSON)">
    ```json theme={null}
    // Gemini Response
    {
      "candidates": [{
        "content": {
          "parts": [
            {
              "thought": true,
              "text": "Analyzing the problem..."
            },
            {
              "text": "The answer is 42."
            }
          ]
        }
      }]
    }

    // Bifrost Response
    {
      "choices": [{
        "message": {
          "content": "The answer is 42.",
          "reasoning": "Analyzing the problem...",
          "reasoning_details": [{
            "index": 0,
            "type": "text",
            "text": "Analyzing the problem..."
          }]
        }
      }]
    }
    ```
  </Tab>

  <Tab title="Response (Go)">
    ```go theme={null}
    // After calling Bifrost Chat Completions with Gemini
    resp, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), chatReq)
    if err != nil {
      log.Fatal(err)
    }

    // Extract reasoning from response
    choice := resp.Choices[0]
    message := choice.Message

    // Access combined reasoning text
    fmt.Printf("Reasoning: %s\n", message.Reasoning)

    // Access detailed reasoning blocks
    for i, details := range message.ReasoningDetails {
      if details.Type == "text" {
        fmt.Printf("Thinking block %d:\n%s\n", i, details.Text)
      }
    }

    // Access final answer
    fmt.Printf("Answer:\n%s\n", message.Content)
    ```
  </Tab>
</Tabs>

#### Conversion Summary

**Bifrost → Gemini (Request)**:

| Input                        | Gemini 2.5                | Gemini 3.0+                                | Note                |
| ---------------------------- | ------------------------- | ------------------------------------------ | ------------------- |
| `max_tokens: 4096`           | `thinking_budget: 4096`   | `thinking_budget: 4096`                    | Direct pass-through |
| `max_tokens: -1`             | `thinking_budget: -1`     | `thinking_budget: -1`                      | Dynamic budget      |
| `max_tokens: 0`              | `thinking_budget: 0`      | `thinking_budget: 0`                       | Disabled            |
| `effort: "high"` only        | `thinking_budget: 3482`\* | `thinking_level: "high"`                   | Estimated or native |
| `effort: "medium"` only      | `thinking_budget: 2330`\* | `thinking_level: "medium"` or `"high"`\*\* | Estimated or native |
| Both `effort` + `max_tokens` | Uses `max_tokens`         | Uses `max_tokens`                          | Priority rule       |

\* Assumes `max_completion_tokens: 4096` and the estimation formula; when the field is omitted, the Gemini default (8192) applies and the estimate is larger\
\*\* Pro models convert `"medium"` to `"high"`

**Gemini → Bifrost (Response)**:

| Gemini Field          | Bifrost Field          | Conversion                |
| --------------------- | ---------------------- | ------------------------- |
| `thinking_budget`     | `reasoning.max_tokens` | Direct mapping            |
| `thinking_level`      | `reasoning.effort`     | Level → effort mapping    |
| `thought: true` parts | `reasoning_details[]`  | Array of reasoning blocks |
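
To make the response mapping concrete, here is a hedged sketch that splits Gemini parts into answer text and reasoning blocks. The `geminiPart` and `reasoningDetail` types and the `splitParts` helper are hypothetical, and joining multiple thought parts with a newline is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// geminiPart is a hypothetical, minimal view of a Gemini content part.
type geminiPart struct {
	Thought bool
	Text    string
}

// reasoningDetail is a hypothetical, minimal view of a reasoning_details entry.
type reasoningDetail struct {
	Index int
	Type  string
	Text  string
}

// splitParts routes parts flagged with thought=true into reasoning details
// and concatenates the remaining parts into the final answer.
func splitParts(parts []geminiPart) (content, reasoning string, details []reasoningDetail) {
	var answer, thoughts []string
	for _, p := range parts {
		if p.Thought {
			details = append(details, reasoningDetail{Index: len(details), Type: "text", Text: p.Text})
			thoughts = append(thoughts, p.Text)
			continue
		}
		answer = append(answer, p.Text)
	}
	// The newline separator between thought parts is an assumption.
	return strings.Join(answer, ""), strings.Join(thoughts, "\n"), details
}

func main() {
	content, reasoning, details := splitParts([]geminiPart{
		{Thought: true, Text: "Analyzing the problem..."},
		{Text: "The answer is 42."},
	})
	fmt.Println(content)
	fmt.Println(reasoning)
	fmt.Println(len(details))
}
```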

**Code References**:

* `core/providers/gemini/utils.go` (Chat Completions)
* `core/providers/gemini/responses.go` (Responses API)
* `core/providers/gemini/types.go` (Constants)

***

## Two Reasoning Methods: Effort vs. Max Tokens

Providers control reasoning in one of two ways, and Bifrost supports both:

### Reasoning Model Types

| Model                | Providers                 | Request Field          | Native Format                                    |
| -------------------- | ------------------------- | ---------------------- | ------------------------------------------------ |
| **Effort-Based**     | OpenAI, AWS Bedrock Nova  | `reasoning.effort`     | `reasoning_effort` (Chat) / `effort` (Responses) |
| **Max-Tokens-Based** | Anthropic, Cohere, Gemini | `reasoning.max_tokens` | `thinking.budget_tokens`                         |

**Important**: Both `effort` and `max_tokens` can be specified in a single request. Bifrost uses a **priority hierarchy** to determine which field is used.

### Priority Logic: Native vs. Estimated

When both `effort` and `max_tokens` are present in a request, Bifrost prioritizes the field that is **native** to the target provider:

#### **For Max-Tokens-Based Providers** (Anthropic, Cohere, Gemini)

```
1. If reasoning.max_tokens is provided → USE IT (native field)
2. Else if reasoning.effort is provided → ESTIMATE max_tokens from effort
3. Else → disable reasoning
```

**Example** (Cohere):

```json theme={null}
// Request with both fields
{
  "reasoning": {
    "effort": "high",
    "max_tokens": 2000
  }
}
```

**Result**: Uses `max_tokens: 2000` directly, ignores `effort`
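
The three-step priority for max-tokens-based providers might look like the sketch below. The `reasoning` struct and `resolveBudget` function are illustrative names, and the estimator is passed in as a callback so the sketch stays independent of any particular conversion formula.

```go
package main

import "fmt"

// reasoning is a hypothetical mirror of the request's reasoning object.
type reasoning struct {
	Effort    *string
	MaxTokens *int
}

// resolveBudget returns the token budget to send and whether reasoning is
// enabled. The native max_tokens field wins; effort is only a fallback.
func resolveBudget(r *reasoning, estimate func(effort string) (int, error)) (budget int, enabled bool, err error) {
	switch {
	case r == nil:
		return 0, false, nil
	case r.MaxTokens != nil:
		return *r.MaxTokens, true, nil // step 1: native field
	case r.Effort != nil:
		budget, err = estimate(*r.Effort) // step 2: estimated from effort
		return budget, err == nil, err
	default:
		return 0, false, nil // step 3: reasoning disabled
	}
}

func main() {
	effort, maxTokens := "high", 2000
	estimate := func(string) (int, error) { return 3277, nil } // stub estimator

	budget, enabled, err := resolveBudget(&reasoning{Effort: &effort, MaxTokens: &maxTokens}, estimate)
	fmt.Println(budget, enabled, err)
}
```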

#### **For Effort-Based Providers** (OpenAI, AWS Bedrock Nova)

```
1. If reasoning.effort is provided → USE IT (native field)
2. Else if reasoning.max_tokens is provided → ESTIMATE effort from max_tokens
3. Else → disable reasoning
```

**Example** (OpenAI Chat Completions):

```json theme={null}
// Request with both fields
{
  "reasoning": {
    "effort": "high",
    "max_tokens": 2000
  }
}
```

**Result**: Uses `effort: "high"` directly, strips `max_tokens` from JSON
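
One way to picture the "strips `max_tokens` from JSON" step is a pointer field tagged `omitempty`, as in this sketch. The struct, `toEffortOnly`, and `encode` are hypothetical helpers, and the estimator is a stub supplied by the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reasoning is a hypothetical mirror of the request's reasoning object.
type reasoning struct {
	Effort    *string `json:"effort,omitempty"`
	MaxTokens *int    `json:"max_tokens,omitempty"`
}

// toEffortOnly builds the payload for an effort-based provider: it keeps a
// provided effort, otherwise estimates one from max_tokens, and always
// leaves MaxTokens nil so omitempty drops it from the JSON.
func toEffortOnly(r reasoning, estimate func(budget int) string) reasoning {
	out := reasoning{Effort: r.Effort}
	if out.Effort == nil && r.MaxTokens != nil {
		e := estimate(*r.MaxTokens)
		out.Effort = &e
	}
	return out
}

// encode serializes the payload so the stripped field is visible.
func encode(r reasoning) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	effort, budget := "high", 2000
	out := toEffortOnly(reasoning{Effort: &effort, MaxTokens: &budget}, func(int) string { return "medium" })
	s, err := encode(out)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```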

<Accordion title="Why Priority Matters">
  **Reason 1: Accuracy** - Native fields provide direct control without estimation loss

  **Reason 2: Consistency** - Using native fields ensures the exact user intent is preserved

  **Reason 3: Performance** - Avoids unnecessary conversions when native field is already provided
</Accordion>

***

## Estimator Functions

Bifrost provides two estimator functions to convert between reasoning methods. These are used when the native field is not available.

### Function 1: Effort → Max Tokens

**Function**: `GetBudgetTokensFromReasoningEffort()`

**File**: `core/providers/utils/utils.go:1350-1387`

**Signature**:

```go theme={null}
func GetBudgetTokensFromReasoningEffort(
    effort string,           // "minimal", "low", "medium", "high"
    minBudgetTokens int,     // Provider-specific minimum (e.g., 1024 for Anthropic)
    maxTokens int,           // Total completion tokens available
) (int, error)
```

**Algorithm**:

```
1. Define ratio for effort level:
   - "minimal"  → 2.5%  (0.025)
   - "low"      → 15%   (0.15)
   - "medium"   → 42.5% (0.425)
   - "high"     → 80%   (0.80)

2. Calculate budget:
   budget = minBudgetTokens + (ratio × (maxTokens - minBudgetTokens))

3. Clamp to valid range:
   if budget < minBudgetTokens → budget = minBudgetTokens
   if budget > maxTokens → budget = maxTokens
```
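
The algorithm above can be sketched as a standalone function. This is not the Bifrost implementation: `budgetFromEffort` is a hypothetical name, rejecting unknown effort strings is a choice made for this sketch, and rounding to the nearest integer is an assumption (the rounding mode is not stated here).

```go
package main

import (
	"fmt"
	"math"
)

// effortRatios holds the per-level ratios from step 1.
var effortRatios = map[string]float64{
	"minimal": 0.025,
	"low":     0.15,
	"medium":  0.425,
	"high":    0.80,
}

// budgetFromEffort interpolates between minBudget and maxTokens using the
// ratio for the given effort, then clamps the result to that range.
func budgetFromEffort(effort string, minBudget, maxTokens int) (int, error) {
	ratio, ok := effortRatios[effort]
	if !ok {
		return 0, fmt.Errorf("unknown effort %q", effort)
	}
	if minBudget > maxTokens {
		return 0, fmt.Errorf("maxTokens (%d) is below minBudget (%d)", maxTokens, minBudget)
	}
	// Rounding to nearest is an assumption of this sketch.
	budget := minBudget + int(math.Round(ratio*float64(maxTokens-minBudget)))
	if budget < minBudget {
		budget = minBudget
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	return budget, nil
}

func main() {
	for _, effort := range []string{"low", "medium", "high"} {
		budget, err := budgetFromEffort(effort, 1024, 4096)
		fmt.Println(effort, budget, err)
	}
}
```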

**Conversion Examples** (with `minBudgetTokens=1024`, `maxTokens=4096`):

| Effort    | Ratio | Calculation         | Result |
| --------- | ----- | ------------------- | ------ |
| `minimal` | 2.5%  | 1024 + 0.025 × 3072 | 1101   |
| `low`     | 15%   | 1024 + 0.15 × 3072  | 1485   |
| `medium`  | 42.5% | 1024 + 0.425 × 3072 | 2330   |
| `high`    | 80%   | 1024 + 0.80 × 3072  | 3482   |

<Note>
  Because the formula starts at `minBudgetTokens` and the ratio never exceeds 1, results already fall between `minBudgetTokens` and `maxTokens`. The clamp in step 3 acts as a safeguard, for example for the Anthropic minimum of 1024.
</Note>

**Error Handling**:

```go theme={null}
if minBudgetTokens > maxTokens {
    return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
}
```

**Code Example**:

```go theme={null}
// Cohere: Convert effort to token budget
budgetTokens, err := providerUtils.GetBudgetTokensFromReasoningEffort(
    "high",                    // effort
    1,                         // Cohere min
    4096,                      // max completion tokens
)
// Returns: 3277 tokens
```

### Function 2: Max Tokens → Effort

**Function**: `GetReasoningEffortFromBudgetTokens()`

**File**: `core/providers/utils/utils.go:1308-1345`

**Signature**:

```go theme={null}
func GetReasoningEffortFromBudgetTokens(
    budgetTokens int,        // Reasoning token budget
    minBudgetTokens int,     // Provider-specific minimum
    maxTokens int,           // Total completion tokens available
) string                     // Returns: "none", "low", "medium", or "high"
```

**Algorithm**:

```
1. Normalize budget to valid range:
   if budget < min → budget = min
   if budget > max → budget = max

2. Calculate ratio:
   ratio = (budgetTokens - minBudgetTokens) / (maxTokens - minBudgetTokens)

3. Map ratio to effort level:
   if ratio ≤ 0.25  → "low"
   if ratio ≤ 0.60  → "medium"
   if ratio > 0.60  → "high"
```
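
A minimal sketch of this reverse mapping follows, combined with the defensive defaults listed later in this section. `effortFromBudget` is a hypothetical name and the code is an illustration of the steps above, not the Bifrost source.

```go
package main

import "fmt"

// effortFromBudget maps a token budget to an effort level by its position
// between minBudget and maxTokens.
func effortFromBudget(budget, minBudget, maxTokens int) string {
	// Defensive defaults for inputs that make the ratio meaningless.
	if budget <= 0 {
		return "none"
	}
	if maxTokens <= 0 {
		return "medium"
	}
	if maxTokens <= minBudget {
		return "high"
	}
	// Step 1: normalize the budget into [minBudget, maxTokens].
	if budget < minBudget {
		budget = minBudget
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	// Step 2: position of the budget within the range, from 0 to 1.
	ratio := float64(budget-minBudget) / float64(maxTokens-minBudget)
	// Step 3: thresholds at 0.25 and 0.60.
	switch {
	case ratio <= 0.25:
		return "low"
	case ratio <= 0.60:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	for _, budget := range []int{1024, 1900, 3000} {
		fmt.Println(budget, effortFromBudget(budget, 1024, 4096))
	}
}
```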

**Conversion Examples** (with `minBudgetTokens=1024`, `maxTokens=4096`):

| Budget Tokens | Ratio | Effort   |
| ------------- | ----- | -------- |
| 1024          | 0%    | `low`    |
| 1101          | 2.5%  | `low`    |
| 1500          | 15.5% | `low`    |
| 1900          | 28.5% | `medium` |
| 2500          | 48.0% | `medium` |
| 3000          | 64.3% | `high`   |
| 3400          | 77.3% | `high`   |

**Defensive Defaults**:

```go theme={null}
if budgetTokens <= 0 {
    return "none"
}
if maxTokens <= 0 {
    return "medium"  // Safe default
}
if maxTokens <= minBudgetTokens {
    return "high"    // Can't calculate ratio
}
```

**Code Example**:

```go theme={null}
// Convert Anthropic budget back to effort for display
effort := providerUtils.GetReasoningEffortFromBudgetTokens(
    3000,   // budget tokens from Anthropic response
    1024,   // Anthropic minimum
    4096,   // max tokens
)
// Returns: "high"
```

***

## Provider-Specific Constants

Different providers have different constraints on reasoning budget:

### Min Budget Constants

| Provider          | File                                | MinBudgetTokens | Reason                          |
| ----------------- | ----------------------------------- | --------------- | ------------------------------- |
| Anthropic         | `core/providers/anthropic/types.go` | **1024**        | Anthropic API requirement       |
| Bedrock Anthropic | `core/providers/bedrock/types.go`   | **1024**        | Same as Anthropic               |
| Bedrock Nova      | `core/providers/bedrock/types.go`   | 1               | More flexible                   |
| Cohere            | `core/providers/cohere/types.go`    | 1               | Flexible                        |
| Gemini            | `core/providers/gemini/types.go`    | 1024            | Default minimum for conversions |

### Default Completion Tokens (for ratio calculation)

When `max_completion_tokens` is not provided, these defaults are used for ratio calculations:

| Provider                           | Default | File                             |
| ---------------------------------- | ------- | -------------------------------- |
| OpenAI, Anthropic, Cohere, Bedrock | 4096    | `core/providers/*/types.go`      |
| Gemini                             | 8192    | `core/providers/gemini/types.go` |
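
As a small illustration of how these defaults feed the ratio calculation, the sketch below falls back to the table values when `max_completion_tokens` is absent. The lowercase provider keys and the `completionTokensFor` helper are assumptions made for this example.

```go
package main

import "fmt"

// defaultCompletionTokens mirrors the table above; the lowercase provider
// keys are an assumption of this sketch.
var defaultCompletionTokens = map[string]int{
	"openai":    4096,
	"anthropic": 4096,
	"cohere":    4096,
	"bedrock":   4096,
	"gemini":    8192,
}

// completionTokensFor returns the requested value when present, otherwise
// the provider default. The boolean reports whether a value was found.
func completionTokensFor(provider string, requested *int) (int, bool) {
	if requested != nil {
		return *requested, true
	}
	tokens, ok := defaultCompletionTokens[provider]
	return tokens, ok
}

func main() {
	requested := 2000
	fmt.Println(completionTokensFor("anthropic", &requested))
	fmt.Println(completionTokensFor("gemini", nil))
}
```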

***

## Effort-to-Token Conversion Examples

### Example 1: Estimate tokens from effort (Anthropic)

<Tabs>
  <Tab title="JSON">
    **Input**:

    ```json theme={null}
    {
      "model": "anthropic/claude-3-5-sonnet",
      "max_completion_tokens": 2000,
      "reasoning": {
        "effort": "high"
      }
    }
    ```

    **Conversion Process**:

    1. `effort = "high"` → `ratio = 0.80`
    2. `minBudgetTokens = 1024` (Anthropic)
    3. `maxCompletionTokens = 2000`
    4. `budget = 1024 + (0.80 × (2000 - 1024))`
    5. `budget = 1024 + (0.80 × 976)`
    6. `budget = 1024 + 780`
    7. **Result: 1804 tokens**

    **Anthropic Request Generated**:

    ```json theme={null}
    {
      "thinking": {
        "type": "enabled",
        "budget_tokens": 1804
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    import (
      "github.com/maximhq/bifrost/core/providers/utils"
      "github.com/maximhq/bifrost/core/schemas"
    )

    // Using Bifrost Go SDK
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Anthropic,
      Model:    "claude-3-5-sonnet-20241022",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(2000),
        Reasoning: &schemas.ChatReasoning{
          Effort: schemas.Ptr("high"), // Effort provided, max_tokens not set
        },
      },
    }

    // Bifrost automatically converts effort to budget tokens:
    // 1. Get ratio for "high": 0.80
    // 2. Calculate: 1024 + (0.80 × (2000 - 1024)) = 1804
    // 3. Send to Anthropic with budget_tokens: 1804

    // Alternatively, call the estimator function directly:
    budgetTokens, err := utils.GetBudgetTokensFromReasoningEffort(
      "high",     // effort
      1024,       // Anthropic minimum
      2000,       // max completion tokens
    )
    if err != nil {
      // Handle the conversion error instead of discarding it
    }
    // budgetTokens == 1804
    ```
  </Tab>
</Tabs>
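
The arithmetic in Example 1 can be reproduced without the SDK. The sketch below is a standalone illustration of the formula `budget = min + ratio × (max − min)`. The function name `budgetFromRatio` and its input checks are assumptions made for this example, and the ratio is passed in directly because only the `high` ratio (0.80) appears in this example.

```go
package main

import (
	"errors"
	"fmt"
)

// budgetFromRatio (hypothetical helper) applies
// budget = min + ratio * (max - min) and truncates the fractional part,
// which matches the worked example (1804, not 1805).
func budgetFromRatio(ratio float64, minBudgetTokens, maxCompletionTokens int) (int, error) {
	if ratio < 0 || ratio > 1 {
		return 0, errors.New("ratio must be between 0 and 1")
	}
	// Assumed guard for this sketch: the formula needs a positive range.
	if maxCompletionTokens <= minBudgetTokens {
		return 0, errors.New("max completion tokens must exceed the minimum budget")
	}
	span := float64(maxCompletionTokens - minBudgetTokens)
	return minBudgetTokens + int(ratio*span), nil
}

func main() {
	budget, err := budgetFromRatio(0.80, 1024, 2000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(budget) // 1804
}
```

Truncation rather than rounding is inferred from the example's result; the real estimator may differ in edge cases.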

### Example 2: Estimate effort from tokens (Bedrock Nova)

<Tabs>
  <Tab title="JSON">
    **Input**:

    ```json theme={null}
    {
      "model": "bedrock/us.amazon.nova-pro-v1:0",
      "max_completion_tokens": 4096,
      "reasoning": {
        "max_tokens": 2000
      }
    }
    ```

    **Conversion Process**:

    1. `budgetTokens = 2000`
    2. `minBudgetTokens = 1` (Nova)
    3. `maxCompletionTokens = 4096`
    4. `ratio = (2000 - 1) / (4096 - 1)`
    5. `ratio = 1999 / 4095`
    6. `ratio = 0.488` (48.8%)
    7. Since `0.25 < 0.488 ≤ 0.60` → **Result: "medium"**

    **Bedrock Nova Request Generated**:

    ```json theme={null}
    {
      "reasoningConfig": {
        "type": "enabled",
        "maxReasoningEffort": "medium"
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    import (
      "github.com/maximhq/bifrost/core/providers/utils"
      "github.com/maximhq/bifrost/core/schemas"
    )

    // Using Bifrost Go SDK with max_tokens (not effort)
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Bedrock,
      Model:    "us.amazon.nova-pro-v1:0",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          MaxTokens: schemas.Ptr(2000), // Max tokens provided, effort not set
        },
      },
    }

    // Bifrost automatically estimates effort from max_tokens:
    // 1. Calculate ratio: (2000 - 1) / (4096 - 1) = 0.488
    // 2. Since 0.25 < 0.488 ≤ 0.60 → "medium"
    // 3. Send to Bedrock Nova with effort: "medium"

    // Alternatively, manually call the estimator function:
    effort := utils.GetReasoningEffortFromBudgetTokens(
      2000,  // budget tokens
      1,     // Nova minimum
      4096,  // max completion tokens
    )
    // Returns: "medium"
    ```
  </Tab>
</Tabs>
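
The reverse conversion in Example 2 can be sketched the same way. This is a minimal standalone illustration, not the body of `GetReasoningEffortFromBudgetTokens`. The `medium` band (`0.25 < ratio ≤ 0.60`) comes from the example; treating everything at or below 0.25 as `low` and everything above 0.60 as `high` is an assumption, as are the function name `effortFromBudget` and its error handling.

```go
package main

import (
	"errors"
	"fmt"
)

// effortFromBudget (hypothetical helper) maps a token budget to an effort
// level using ratio = (budget - min) / (max - min).
func effortFromBudget(budgetTokens, minBudgetTokens, maxCompletionTokens int) (string, error) {
	if maxCompletionTokens <= minBudgetTokens {
		return "", errors.New("max completion tokens must exceed the minimum budget")
	}
	ratio := float64(budgetTokens-minBudgetTokens) /
		float64(maxCompletionTokens-minBudgetTokens)

	switch {
	case ratio <= 0.25: // assumed lower band
		return "low", nil
	case ratio <= 0.60: // band shown in Example 2
		return "medium", nil
	default: // assumed upper band
		return "high", nil
	}
}

func main() {
	effort, err := effortFromBudget(2000, 1, 4096)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(effort) // medium
}
```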

### Example 3: Both fields provided (priority used)

<Tabs>
  <Tab title="JSON">
    **Input**:

    ```json theme={null}
    {
      "model": "anthropic/claude-3-5-sonnet",
      "max_completion_tokens": 4096,
      "reasoning": {
        "effort": "medium",
        "max_tokens": 2500
      }
    }
    ```

    **Logic for Max-Tokens-Based Provider**:

    1. Check: Is `max_tokens` provided? → **YES**
    2. Use `max_tokens` directly (ignore `effort`)
    3. Validate: `2500 >= 1024`? → **YES**

    **Anthropic Request Generated**:

    ```json theme={null}
    {
      "thinking": {
        "type": "enabled",
        "budget_tokens": 2500
      }
    }
    ```

    **Note**: The `effort: "medium"` is completely ignored because `max_tokens` takes priority.
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    import "github.com/maximhq/bifrost/core/schemas"

    // Using Bifrost Go SDK with BOTH effort and max_tokens
    chatReq := &schemas.BifrostChatRequest{
      Provider: schemas.Anthropic,
      Model:    "claude-3-5-sonnet-20241022",
      Input:    messages,
      Params: &schemas.ChatParameters{
        MaxCompletionTokens: schemas.Ptr(4096),
        Reasoning: &schemas.ChatReasoning{
          Effort:    schemas.Ptr("medium"),   // Provided but ignored
          MaxTokens: schemas.Ptr(2500),       // This takes priority
        },
      },
    }

    // Bifrost Priority Logic:
    // 1. For max-tokens-based providers (Anthropic):
    //    → Check if max_tokens is provided? YES
    //    → Use it directly: 2500
    //    → Ignore effort: "medium"
    //    → Validate: 2500 >= 1024? YES ✓
    // 2. Send to Anthropic with budget_tokens: 2500

    // Result: effort is completely ignored, max_tokens is used
    ```
  </Tab>
</Tabs>
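
The priority rule in Example 3 can be written as a short decision function. The sketch below is an illustration under stated assumptions: `reasoningParams` is a local stand-in for the request fields (not the `schemas.ChatReasoning` type), and `resolveBudget` and its `effortToBudget` callback are hypothetical names for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// reasoningParams mirrors the two request fields. It is a local stand-in,
// not the Bifrost schemas type.
type reasoningParams struct {
	Effort    *string
	MaxTokens *int
}

// resolveBudget (hypothetical helper) picks the token budget for a
// max-tokens-based provider: max_tokens wins when present, effort is only
// converted when max_tokens is absent.
func resolveBudget(p reasoningParams, minBudget int, effortToBudget func(string) (int, error)) (int, error) {
	if p.MaxTokens != nil {
		if *p.MaxTokens < minBudget {
			return 0, fmt.Errorf("reasoning.max_tokens must be >= %d, got %d", minBudget, *p.MaxTokens)
		}
		return *p.MaxTokens, nil // effort is ignored on this path
	}
	if p.Effort != nil {
		return effortToBudget(*p.Effort)
	}
	return 0, errors.New("reasoning requires effort or max_tokens")
}

func main() {
	effort, maxTokens := "medium", 2500
	budget, err := resolveBudget(
		reasoningParams{Effort: &effort, MaxTokens: &maxTokens},
		1024, // Anthropic minimum
		func(string) (int, error) { return 0, errors.New("not reached when max_tokens is set") },
	)
	fmt.Println(budget, err) // 2500 <nil>
}
```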

***

## Response Format

### Bifrost Standard Response

Bifrost returns reasoning from every provider in a normalized `reasoning_details` array:

```json theme={null}
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Final response text",
      "reasoning_details": [
        {
          "index": 0,
          "type": "text",
          "text": "Step-by-step reasoning content...",
          "signature": "optional_signature_for_verification"
        }
      ]
    }
  }]
}
```

### Reasoning Details Fields

| Field       | Type     | Description                                   | Present In         |
| ----------- | -------- | --------------------------------------------- | ------------------ |
| `index`     | `int`    | Position in reasoning sequence                | All                |
| `type`      | `string` | Content type (`text`, `encrypted`, `summary`) | All                |
| `text`      | `string` | Reasoning content                             | Chat Completions   |
| `summary`   | `string` | Reasoning summary                             | Responses API      |
| `signature` | `string` | Cryptographic signature for verification      | Anthropic, Bedrock |
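
For clients that consume the raw JSON rather than the Go SDK, the fields above can be decoded with the standard library. The struct definitions below are local types written for this sketch, derived only from the JSON shape shown in this section; they are not the types exported by the Bifrost `schemas` package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reasoningDetail is a local decoding type for one reasoning_details entry.
type reasoningDetail struct {
	Index     int    `json:"index"`
	Type      string `json:"type"`
	Text      string `json:"text,omitempty"`
	Summary   string `json:"summary,omitempty"`
	Signature string `json:"signature,omitempty"`
}

// chatResponse covers only the parts of the response used here.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content          string            `json:"content"`
			ReasoningDetails []reasoningDetail `json:"reasoning_details"`
		} `json:"message"`
	} `json:"choices"`
}

// parseReasoning (hypothetical helper) collects the reasoning entries
// from every choice in a response body.
func parseReasoning(body []byte) ([]reasoningDetail, error) {
	var resp chatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	var details []reasoningDetail
	for _, choice := range resp.Choices {
		details = append(details, choice.Message.ReasoningDetails...)
	}
	return details, nil
}

func main() {
	body := []byte(`{"choices":[{"message":{"role":"assistant","content":"Final response text",
		"reasoning_details":[{"index":0,"type":"text","text":"Step-by-step reasoning content..."}]}}]}`)
	details, err := parseReasoning(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(details[0].Type, details[0].Text)
}
```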

### Type Mappings

| Reasoning Type        | When Used                            | Source                     |
| --------------------- | ------------------------------------ | -------------------------- |
| `reasoning.text`      | Direct thinking/reasoning content    | Anthropic, Gemini, Bedrock |
| `reasoning.encrypted` | Signature-verified reasoning         | Anthropic, Bedrock Nova    |
| `reasoning.summary`   | Summarized reasoning (Responses API) | All providers              |

<Note>
  **OpenAI Implementation**: OpenAI (both Chat Completions and Responses API) is effort-based, following the standard priority logic: if `effort` is provided, it's used directly; if only `max_tokens` is provided, effort is estimated from it. The `max_tokens` field is then cleared before JSON serialization via `MarshalJSON` (`core/providers/openai/types.go:383-453`), since OpenAI's APIs don't accept it.
</Note>

***

## Streaming

### Stream Event Types

| Provider  | Reasoning Event         | Signature Event     |
| --------- | ----------------------- | ------------------- |
| OpenAI    | `reasoning` (top-level) | N/A                 |
| Anthropic | `thinking_delta`        | `signature_delta`   |
| Bedrock   | `thinking_delta`        | `signature_delta`   |
| Gemini    | `thought` (in content)  | `thought_signature` |

### Anthropic Streaming Example

```
// Stream events
event: content_block_start
data: {"type": "content_block_start", "content_block": {"type": "thinking"}}

event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Let me"}}

event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": " analyze..."}}

event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "EqoB..."}}

event: content_block_stop
data: {"type": "content_block_stop"}
```

### Bifrost Stream Response

```json theme={null}
// Thinking delta
{
  "choices": [{
    "delta": {
      "reasoning_details": [{
        "index": 0,
        "type": "text",
        "text": "Let me analyze..."
      }]
    }
  }]
}

// Signature delta
{
  "choices": [{
    "delta": {
      "reasoning_details": [{
        "index": 0,
        "signature": "EqoB..."
      }]
    }
  }]
}
```
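
On the client side, the normalized deltas are typically merged back into complete reasoning blocks. The sketch below shows one way to do that, keyed by `index`. The types and the `mergeDeltas` function are local to this example, and appending signature chunks (rather than replacing them) is an assumption.

```go
package main

import "fmt"

// reasoningDelta is a local stand-in for one streamed reasoning_details entry.
type reasoningDelta struct {
	Index     int
	Type      string
	Text      string
	Signature string
}

// reasoningBlock is the merged result for one index.
type reasoningBlock struct {
	Text      string
	Signature string
}

// mergeDeltas (hypothetical helper) concatenates text and signature
// chunks per index, in arrival order.
func mergeDeltas(deltas []reasoningDelta) map[int]reasoningBlock {
	merged := make(map[int]reasoningBlock)
	for _, d := range deltas {
		block := merged[d.Index]
		block.Text += d.Text
		block.Signature += d.Signature // assumed: signature chunks are appended
		merged[d.Index] = block
	}
	return merged
}

func main() {
	blocks := mergeDeltas([]reasoningDelta{
		{Index: 0, Type: "text", Text: "Let me"},
		{Index: 0, Type: "text", Text: " analyze..."},
		{Index: 0, Signature: "EqoB..."},
	})
	fmt.Printf("%q %q\n", blocks[0].Text, blocks[0].Signature)
	// "Let me analyze..." "EqoB..."
}
```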

***

## Caveats Summary

<Accordion title="Minimum Budget (Anthropic/Bedrock)">
  **Severity**: High
  **Behavior**: `reasoning.max_tokens` must be >= 1024
  **Impact**: Requests with lower values fail with an error
  **Workaround**: Always set max\_tokens >= 1024 for Anthropic/Bedrock
</Accordion>

<Accordion title="Dynamic Budget Not Supported">
  **Severity**: Medium
  **Behavior**: `reasoning.max_tokens = -1` is converted to `1024`
  **Impact**: Dynamic budgeting not available on Anthropic/Bedrock
  **Workaround**: Set explicit token budget
</Accordion>

<Accordion title="Effort Level Normalization">
  **Severity**: Low
  **Behavior**: OpenAI's `minimal` converted to `low` when routing to other providers
  **Impact**: Slightly different reasoning behavior
</Accordion>

<Accordion title="Signature Field Provider-Specific">
  **Severity**: Low
  **Behavior**: `signature` field only present in Anthropic/Bedrock responses
  **Impact**: Signature-based verification only available for these providers
</Accordion>

<Accordion title="Thinking Type Always Enabled">
  **Severity**: Low
  **Behavior**: Anthropic's `thinking.type` always set to `"enabled"` regardless of effort
  **Impact**: Thinking cannot be disabled once the `reasoning` parameter is present
</Accordion>

<Accordion title="Gemini: Only One Parameter Sent">
  **Severity**: Medium
  **Behavior**: When both `effort` and `max_tokens` are provided, only `thinkingBudget` is sent to Gemini (effort is dropped)
  **Impact**: Effort value is completely ignored when max\_tokens is present
  **Workaround**: Provide only the parameter you want to use
</Accordion>

<Accordion title="Gemini: Model Version Differences">
  **Severity**: Medium
  **Behavior**: Gemini 2.5 only supports `thinkingBudget`, while 3.0+ supports both `thinkingBudget` and `thinkingLevel`
  **Impact**: Effort-only requests on 2.5 are converted to budget; on 3.0+ they use native levels
  **Note**: Bifrost automatically detects version and uses appropriate conversion
</Accordion>

<Accordion title="Gemini Pro: Limited Level Support">
  **Severity**: Low
  **Behavior**: Pro models only support "low" and "high" thinking levels
  **Impact**: `"minimal"` → `"low"`, `"medium"` → `"high"` for Pro models
  **Note**: Non-Pro models support all four levels: minimal, low, medium, high
</Accordion>
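
The Pro-model level mapping above can be expressed as a small function. This is an illustrative sketch: the name `geminiThinkingLevel` is hypothetical, and whether a model counts as "Pro" is passed in as a flag because the detection logic is not described here.

```go
package main

import "fmt"

// geminiThinkingLevel (hypothetical helper) maps a Bifrost effort value to
// a thinking level. Non-Pro models keep all four levels; Pro models
// collapse to "low" and "high" as described in the caveat above.
func geminiThinkingLevel(effort string, isPro bool) (string, error) {
	switch effort {
	case "minimal", "low", "medium", "high":
		// recognized level
	default:
		return "", fmt.Errorf("unknown effort level %q", effort)
	}
	if !isPro {
		return effort, nil
	}
	switch effort {
	case "minimal":
		return "low", nil
	case "medium":
		return "high", nil
	}
	return effort, nil // "low" and "high" pass through unchanged
}

func main() {
	level, _ := geminiThinkingLevel("medium", true)
	fmt.Println(level) // high

	level, _ = geminiThinkingLevel("medium", false)
	fmt.Println(level) // medium
}
```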

***

## Complete Provider Comparison

### Reasoning Model

| Provider            | Model Type       | Budget Type           | Min Budget | Signature Support |
| ------------------- | ---------------- | --------------------- | ---------- | ----------------- |
| OpenAI              | Effort-based     | Effort-based          | None       | ❌                 |
| Anthropic           | Thinking blocks  | Token budget          | **1024**   | ✅                 |
| Bedrock (Anthropic) | Reasoning config | Token budget          | **1024**   | ✅                 |
| Bedrock (Nova)      | Reasoning config | Effort-based          | None       | ❌                 |
| Gemini 2.5+         | Thinking config  | Token budget          | 1024       | ✅                 |
| Gemini 3.0+         | Thinking config  | Dual (budget + level) | 1024       | ✅                 |

### Parameter Support

| Provider            | `effort`                | `max_tokens` | `summary` | Streaming |
| ------------------- | ----------------------- | ------------ | --------- | --------- |
| OpenAI              | ✅ (4 levels)            | ✅            | ❌         | ✅         |
| Anthropic           | ❌ (binary)              | ✅            | ✅         | ✅         |
| Bedrock (Anthropic) | ❌ (binary)              | ✅            | ✅         | ✅         |
| Bedrock (Nova)      | ✅ (3 levels)            | ⚠️ (ignored) | ❌         | ✅         |
| Gemini 2.5+         | ⚠️ (converts to budget) | ✅            | ❌         | ✅         |
| Gemini 3.0+         | ✅ (4 levels)            | ✅            | ❌         | ✅         |

***

## Troubleshooting

### Anthropic: "reasoning.max\_tokens must be >= 1024"

**Cause**: Sending a reasoning request with `reasoning.max_tokens` below 1024

**Solution**: Ensure `reasoning.max_tokens >= 1024` for Anthropic/Bedrock Anthropic models

```json theme={null}
// ❌ Invalid
{"reasoning": {"effort": "high", "max_tokens": 500}}

// ✅ Valid
{"reasoning": {"effort": "high", "max_tokens": 1024}}
```

### OpenAI: Model doesn't support reasoning

**Cause**: Using a model that doesn't support reasoning (e.g., `gpt-4-turbo`)

**Solution**: Use a model with native reasoning support, such as the o-series (for example `o1` or `o3-mini`). `gpt-4o` and `gpt-4o-mini` are not reasoning models.

### Bedrock Nova: `max_tokens` parameter being ignored

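**Expected Behavior**: Bedrock Nova uses effort-based reasoning only, so `max_tokens` is never sent as a token budget. When `max_tokens` is the only field provided, Bifrost estimates an effort level from it (see Example 2 above).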

**Solution**: Provide `effort` parameter instead of `max_tokens` for Nova models

```json theme={null}
// ✅ Correct for Nova
{"reasoning": {"effort": "high"}}
```

***
