> ## Documentation Index
> Fetch the complete documentation index at: https://docs.getbifrost.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Code Mode

> AI writes Python to orchestrate tools. Reduces token usage by 50%+ when using multiple MCP servers.

<Note>
  This feature is only available on `v1.4.0-prerelease1` and above.
</Note>

## Overview

**Code Mode** is a transformative approach to using MCP that solves a critical problem at scale:

> **The Problem:** When you connect 8-10 MCP servers (150+ tools), every single request includes all tool definitions in the context. The LLM spends most of its budget reading tool catalogs instead of doing actual work.

**The Solution:** Instead of exposing 150 tools directly, Code Mode exposes just **four generic tools**. The LLM uses those tools to write Python code (Starlark) that orchestrates everything else in a sandbox.

### The Impact

Compare a workflow across 5 MCP servers with \~100 tools:

**Classic MCP Flow:**

* 6 LLM turns
* 100 tools in context **every turn** (600 tool-definition tokens)
* All intermediate results flow through the model

**Code Mode Flow:**

* 3-4 LLM turns
* Only 4 tools + definitions on-demand
* Intermediate results processed in sandbox

**Result: \~50% cost reduction + 30-40% faster execution**

Code Mode provides four meta-tools to the AI:

1. **`listToolFiles`** - Discover available MCP servers
2. **`readToolFile`** - Load Python stub signatures on-demand
3. **`getToolDocs`** - Get detailed documentation for a specific tool
4. **`executeToolCode`** - Execute Python code with full tool bindings

## When to Use Code Mode

**Enable Code Mode if you have:**

* ✅ 3+ MCP servers connected
* ✅ Complex multi-step workflows
* ✅ Concerned about token costs or latency
* ✅ Tools that need to interact with each other

**Keep Classic MCP if you have:**

* ✅ Only 1-2 small MCP servers
* ✅ Simple, direct tool calls
* ✅ Very latency-sensitive use cases (though Code Mode is usually faster)

**You can mix both:** Enable Code Mode for "heavy" servers (web, documents, databases) and keep small utilities as direct tools.

***

## How Code Mode Works

### The Four Tools

Instead of seeing 150+ tool definitions, the model sees four generic tools:

```mermaid theme={null}
graph LR
    LLM["<b>LLM Context</b><br/><i>Compact & Efficient</i>"]

    List["<b>listToolFiles</b><br/>Discover servers"]
    Read["<b>readToolFile</b><br/>Load signatures"]
    Docs["<b>getToolDocs</b><br/>Get detailed docs"]
    Execute["<b>executeToolCode</b><br/>Run code with bindings"]

    Hidden["<i>All other MCP servers<br/>hidden behind these 4 tools</i>"]

    LLM --> List
    LLM --> Read
    LLM --> Docs
    LLM --> Execute

    List -.-> Hidden
    Read -.-> Hidden
    Docs -.-> Hidden
    Execute -.-> Hidden

    style LLM fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
    style List fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
    style Read fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
    style Docs fill:#E1F5FE,stroke:#0288D1,stroke-width:2.5px,color:#1A1A1A
    style Execute fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
    style Hidden fill:#EEEEEE,stroke:#424242,stroke-width:1.5px,stroke-dasharray: 5 5,color:#1A1A1A
```

### The Execution Flow

```mermaid theme={null}
graph LR
    User["<b>1. User Request</b><br/>Search YouTube<br/>& save to file"]

    Discover["<b>2. Discover Tools</b><br/>listToolFiles()"]

    GetDefs["<b>3. Load Definitions</b><br/>readToolFile()"]

    Write["<b>4. Write Code</b><br/>Python<br/>in sandbox"]

    Execute["<b>5. Execute</b><br/>Real MCP calls<br/>contained in VM"]

    Result["<b>6. Compact Result</b><br/>{saved:10}"]

    Response["<b>7. Final Response</b><br/>Found & saved<br/>10 videos"]

    User --> Discover
    Discover --> GetDefs
    GetDefs --> Write
    Write --> Execute
    Execute --> Result
    Result --> Response

    style User fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
    style Discover fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
    style GetDefs fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
    style Write fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
    style Execute fill:#E8F5E9,stroke:#1B5E20,stroke-width:3px,color:#1A1A1A
    style Result fill:#FFFDE7,stroke:#F57F17,stroke-width:2.5px,color:#1A1A1A
    style Response fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
```

**Key insight:** All the complex orchestration happens inside the sandbox. The LLM only receives the final, compact result-not every intermediate step.

***

## Why This Matters at Scale

### Classic MCP with 5 servers (100 tools):

```
Turn 1: Prompt + search query + [100 tool definitions]
Turn 2: Prompt + search result + [100 tool definitions]
Turn 3: Prompt + channel list + [100 tool definitions]
Turn 4: Prompt + video list + [100 tool definitions]
Turn 5: Prompt + summaries + [100 tool definitions]
Turn 6: Prompt + doc result + [100 tool definitions]

Total: 6 LLM calls, ~600+ tokens in tool definitions alone
```

### Code Mode with same 5 servers:

```
Turn 1: Prompt + 4 tools (listToolFiles, readToolFile, getToolDocs, executeToolCode)
Turn 2: Prompt + server list + 4 tools
Turn 3: Prompt + selected definitions + 4 tools + [EXECUTES CODE]
        [YouTube search, channel list, videos, summaries, doc creation all happen in sandbox]
Turn 4: Prompt + final result + 4 tools

Total: 3-4 LLM calls, ~50 tokens in tool definitions
Result: 50% cost reduction, 3-4x fewer LLM round trips
```

***

## Enabling Code Mode

Code Mode must be enabled **per MCP client**. Once enabled, that client's tools are accessed through the four meta-tools rather than exposed directly.

**Best practice:** Enable Code Mode for 3+ servers or any "heavy" server (web search, documents, databases).

<Tabs>
  <Tab title="Web UI">
    ### Enable Code Mode for a Client

    1. Navigate to **MCP Gateway** in the sidebar
    2. Click on a client row to open the configuration sheet

    <Frame>
      <img src="https://mintcdn.com/bifrost/_ilYf7u7HJP58LQG/media/ui-mcp-edit-server.png?fit=max&auto=format&n=_ilYf7u7HJP58LQG&q=85&s=d9a3599e8075b9975d7f1f7078939db8" alt="MCP Client Configuration" width="3492" height="2358" data-path="media/ui-mcp-edit-server.png" />
    </Frame>

    3. In the **Basic Information** section, toggle **Code Mode Client** to enabled
    4. Click **Save Changes**

    Once enabled:

    * This client's tools are no longer in the default tool list
    * They become accessible through `listToolFiles()` and `readToolFile()`
    * The AI can write code using `executeToolCode()` to call them
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    # When adding a new client
    curl -X POST http://localhost:8080/api/mcp/client \
      -H "Content-Type: application/json" \
      -d '{
        "name": "youtube",
        "connection_type": "http",
        "connection_string": "http://localhost:3001/mcp",
        "tools_to_execute": ["*"],
        "is_code_mode_client": true
      }'

    # Or update an existing client
    curl -X PUT http://localhost:8080/api/mcp/client/{id} \
      -H "Content-Type: application/json" \
      -d '{
        "name": "youtube",
        "connection_type": "http",
        "connection_string": "http://localhost:3001/mcp",
        "tools_to_execute": ["*"],
        "is_code_mode_client": true
      }'
    ```
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "mcp": {
        "client_configs": [
          {
            "name": "youtube",
            "connection_type": "http",
            "connection_string": "http://localhost:3001/mcp",
            "tools_to_execute": ["*"],
            "is_code_mode_client": true
          },
          {
            "name": "filesystem",
            "connection_type": "stdio",
            "stdio_config": {
              "command": "npx",
              "args": ["-y", "@anthropic/mcp-filesystem"]
            },
            "tools_to_execute": ["*"],
            "is_code_mode_client": true
          }
        ]
      }
    }
    ```
  </Tab>
</Tabs>

### Go SDK Setup

```go theme={null}
mcpConfig := &schemas.MCPConfig{
    ClientConfigs: []schemas.MCPClientConfig{
        {
            Name:             "youtube",
            ConnectionType:   schemas.MCPConnectionTypeHTTP,
            ConnectionString: bifrost.Ptr("http://localhost:3001/mcp"),
            ToolsToExecute:   []string{"*"},
            IsCodeModeClient: true, // Enable code mode
        },
        {
            Name:           "filesystem",
            ConnectionType: schemas.MCPConnectionTypeSTDIO,
            StdioConfig: &schemas.MCPStdioConfig{
                Command: "npx",
                Args:    []string{"-y", "@anthropic/mcp-filesystem"},
            },
            ToolsToExecute:   []string{"*"},
            IsCodeModeClient: true, // Enable code mode
        },
    },
}
```

***

## The Four Code Mode Tools

When Code Mode clients are connected, Bifrost automatically adds four meta-tools to every request:

### 1. listToolFiles

Lists all available virtual `.pyi` stub files for connected code mode servers.

**Example output (Server-level binding):**

```
servers/
  youtube.pyi
  filesystem.pyi
```

**Example output (Tool-level binding):**

```
servers/
  youtube/
    search.pyi
    get_video.pyi
  filesystem/
    read_file.pyi
    write_file.pyi
```

### 2. readToolFile

Reads a virtual `.pyi` file to get compact Python function signatures for tools.

**Parameters:**

* `fileName` (required): Path like `servers/youtube.pyi` or `servers/youtube/search.pyi`
* `startLine` (optional): 1-based starting line for partial reads
* `endLine` (optional): 1-based ending line for partial reads

**Example output:**

```python theme={null}
# youtube server tools
# Usage: youtube.tool_name(param=value)
# For detailed docs: use getToolDocs(server="youtube", tool="tool_name")

def search(query: str, maxResults: int = None) -> dict:  # Search for videos
def get_video(id: str) -> dict:  # Get video details
```

### 3. getToolDocs

Get detailed documentation for a specific tool when the compact signature from `readToolFile` is not sufficient.

**Parameters:**

* `server` (required): The server name (e.g., `"youtube"`)
* `tool` (required): The tool name (e.g., `"search"`)

**Example output:**

```python theme={null}
# ============================================================================
# Documentation for youtube.search tool
# ============================================================================
#
# USAGE INSTRUCTIONS:
# Call tools using: result = youtube.tool_name(param=value)
# No async/await needed - calls are synchronous.
#
# CRITICAL - HANDLING RESPONSES:
# Tool responses are dicts. To avoid runtime errors:
# 1. Use print(result) to inspect the response structure first
# 2. Access dict values with brackets: result["key"] NOT result.key
# 3. Use .get() for safe access: result.get("key", default)
# ============================================================================

def search(query: str, maxResults: int = None) -> dict:
    """
    Search for videos on YouTube.

    Args:
        query (str): Search query (required)
        maxResults (int): Max results to return (optional)

    Returns:
        dict: Response from the tool. Structure varies by tool.
              Use print(result) to inspect the actual structure.

    Example:
        result = youtube.search(query="...")
        print(result)  # Always inspect response first!
        value = result.get("key", default)  # Safe access
    """
    ...
```

### 4. executeToolCode

Executes Python code in a sandboxed Starlark interpreter with access to all code mode server tools.

**Parameters:**

* `code` (required): Python code to execute

**Execution Environment:**

* Python code runs in a Starlark interpreter (Python subset)
* All code mode servers are exposed as global objects (e.g., `youtube`, `filesystem`)
* Tool calls are **synchronous** - no async/await needed
* Use `print()` for logging (output captured in logs)
* Assign to `result` variable to return a value
* Tool execution timeout applies (default 30s)

**Syntax notes:**

* Use keyword arguments: `server.tool(param="value")` NOT `server.tool({"param": "value"})`
* Access dict values with brackets: `result["key"]` NOT `result.key`
* List comprehensions work: `[x for x in items if x["active"]]`

**Example code:**

```python theme={null}
# Search YouTube and return formatted results
results = youtube.search(query="AI news", maxResults=5)
titles = [item["snippet"]["title"] for item in results["items"]]
print("Found", len(titles), "videos")
result = {"titles": titles, "count": len(titles)}
```

***

## Binding Levels

Code Mode supports two binding levels that control how tools are organized in the virtual file system:

### Server-Level Binding (Default)

All tools from a server are grouped into a single `.pyi` file.

```
servers/
  youtube.pyi        ← Contains all youtube tools
  filesystem.pyi     ← Contains all filesystem tools
```

**Best for:**

* Servers with few tools
* When you want to see all tools at once
* Simpler discovery workflow

### Tool-Level Binding

Each tool gets its own `.pyi` file.

```
servers/
  youtube/
    search.pyi
    get_video.pyi
    get_channel.pyi
  filesystem/
    read_file.pyi
    write_file.pyi
    list_directory.pyi
```

**Best for:**

* Servers with many tools
* When tools have large/complex schemas
* More focused documentation per tool

### Configuring Binding Level

Binding level is a **global setting** that controls how Code Mode's virtual file system is organized. It affects how the AI discovers and loads tool definitions.

<Tabs>
  <Tab title="Web UI">
    Binding level can be viewed in the MCP configuration overview:

    <Frame>
      <img src="https://mintcdn.com/bifrost/_ilYf7u7HJP58LQG/media/ui-mcp-config.png?fit=max&auto=format&n=_ilYf7u7HJP58LQG&q=85&s=6e9d5cd353f66dc5943e0bdf819a1f95" alt="MCP Gateway Configuration" width="3492" height="2358" data-path="media/ui-mcp-config.png" />
    </Frame>

    * **Server-level (default)**: One `.pyi` file per MCP server
      * Use when: 5-20 tools per server, want simple discovery
      * Example: `servers/youtube.pyi` contains all YouTube tools

    * **Tool-level**: One `.pyi` file per individual tool
      * Use when: 30+ tools per server, want minimal context bloat
      * Example: `servers/youtube/search.pyi`, `servers/youtube/list_channels.pyi`

    Both modes use the same four-tool interface (`listToolFiles`, `readToolFile`, `getToolDocs`, `executeToolCode`). The choice is purely about **context efficiency per read operation**.
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "mcp": {
        "tool_manager_config": {
          "code_mode_binding_level": "server"
        }
      }
    }
    ```

    Options: `"server"` (default) or `"tool"`
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    mcpConfig := &schemas.MCPConfig{
        ToolManagerConfig: &schemas.MCPToolManagerConfig{
            CodeModeBindingLevel: schemas.CodeModeBindingLevelTool, // or CodeModeBindingLevelServer
        },
        ClientConfigs: []schemas.MCPClientConfig{
            // ... clients
        },
    }
    ```
  </Tab>
</Tabs>

***

## Auto-Execution with Code Mode

Code Mode tools can be auto-executed in [Agent Mode](./agent-mode), but with **additional validation**:

1. The `listToolFiles` and `readToolFile` tools are always auto-executable (they're read-only)
2. The `executeToolCode` tool is auto-executable **only if** all tool calls within the code are allowed

### How Validation Works

When `executeToolCode` is called in agent mode:

1. Bifrost parses the Python code
2. Extracts all `serverName.toolName()` calls
3. Checks each call against `tools_to_auto_execute` for that server
4. If ALL calls are allowed → auto-execute
5. If ANY call is not allowed → return to user for approval

**Example:**

```json theme={null}
{
  "name": "youtube",
  "tools_to_execute": ["*"],
  "tools_to_auto_execute": ["search"],
  "is_code_mode_client": true
}
```

```python theme={null}
# This code WILL auto-execute (only uses search)
results = youtube.search(query="AI")
result = results

# This code will NOT auto-execute (uses delete_video which is not in auto-execute list)
youtube.delete_video(id="abc123")
```

***

## Code Execution Environment

### Available APIs

| Available              | Not Available            |
| ---------------------- | ------------------------ |
| Python-like syntax     | `import` statements      |
| Synchronous tool calls | Classes (use dicts)      |
| `print()` for logging  | File I/O                 |
| Dict/List operations   | Network access           |
| List comprehensions    | `random`, `time` modules |

### Runtime Environment Details

**Engine:** Starlark interpreter (Python subset)

**Tool Exposure:** Tools from code mode clients are exposed as global objects:

```python theme={null}
# If you have a 'youtube' code mode client with a 'search' tool
results = youtube.search(query="AI news")
```

**Code Processing:**

1. Code is validated for syntax errors
2. Tool calls are extracted and validated
3. Code executes in isolated Starlark context
4. Result variable is automatically serialized to JSON

**Execution Limits:**

* Default timeout: 30 seconds per tool execution
* Memory isolation: Each execution gets its own context
* No access to host file system or network
* Logs captured from print() calls

### Error Handling

Bifrost provides detailed error messages with hints:

```python theme={null}
# Error: youtube is not defined
# Hints:
# - Variable or identifier 'youtube' is not defined
# - Available server keys: youtubeAPI, filesystem
# - Use one of the available server keys as the object name
```

### Timeouts

* Default: 30 seconds per tool call
* Configure via `tool_execution_timeout` in `tool_manager_config`
* Long-running operations are interrupted with timeout error

***

## Real-World Impact Comparison

### Scenario: E-commerce Assistant with Multiple Services

**Setup:**

* 10 MCP servers (product catalog, inventory, payments, shipping, chat, analytics, docs, images, calendar, notifications)
* Average 15 tools per server = **150 total tools**
* Complex multi-step task: "Find matching products, check inventory, compare prices, get shipping estimate, create quote"

### Classic MCP Results

| Metric              | Value            |
| ------------------- | ---------------- |
| LLM Turns           | 8-10             |
| Tokens in Tool Defs | \~2,400 per turn |
| Avg Request Tokens  | 4,000-5,000      |
| Avg Total Cost      | \$3.20-4.00      |
| Latency             | 18-25 seconds    |

**Problem:** Most context goes to tool definitions. Model makes redundant tool calls. Every intermediate result travels back through the LLM.

### Code Mode Results

| Metric              | Value              |
| ------------------- | ------------------ |
| LLM Turns           | 3-4                |
| Tokens in Tool Defs | \~100-300 per turn |
| Avg Request Tokens  | 1,500-2,000        |
| Avg Total Cost      | \$1.20-1.80        |
| Latency             | 8-12 seconds       |

**Benefit:** Model writes one Python script. All orchestration happens in sandbox. Only compact result returned to LLM.

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Agent Mode" icon="robot" href="./agent-mode">
    Combine Code Mode with auto-execution
  </Card>

  <Card title="MCP Gateway URL" icon="server" href="./gateway-url">
    Expose your tools to external clients
  </Card>
</CardGroup>
