
MCP Architecture Overview

What is MCP in Bifrost?

The Model Context Protocol (MCP) system in Bifrost enables AI models to seamlessly discover and execute external tools, transforming static chat models into dynamic, action-capable agents. This architecture bridges the gap between AI reasoning and real-world tool execution.

Core MCP Principles:
  • Dynamic Discovery - Tools are discovered at runtime, not hardcoded
  • Client-Side Execution - Bifrost controls all tool execution for security
  • Multi-Protocol Support - STDIO, HTTP, and SSE connection types
  • Request-Level Filtering - Granular control over tool availability
  • Async Execution - Non-blocking tool invocation and response handling

MCP System Components


MCP Connection Architecture

Multi-Protocol Connection System

Bifrost supports four MCP connection types, each optimized for different tool deployment patterns:

Connection Type Details

InProcess Connections (In-Memory Tools):
  • Use Case: Embedded tools, high-performance operations, testing
  • Performance: Lowest possible latency (~0.1ms) with no IPC overhead
  • Security: Highest security as tools run in the same process
  • Limitations: Go package only, cannot be configured via JSON
STDIO Connections (Local Tools):
  • Use Case: Command-line tools, local scripts, filesystem operations
  • Performance: Low latency (~1-10ms) due to local execution
  • Security: High security with full local control
  • Limitations: Single-server deployment, resource sharing
HTTP Connections (Remote Services):
  • Use Case: Web APIs, microservices, cloud functions
  • Performance: Network-dependent latency (~10-500ms)
  • Security: Configurable with authentication and encryption
  • Advantages: Scalable, multi-server deployment, service isolation
SSE Connections (Streaming Tools):
  • Use Case: Real-time data feeds, live monitoring, event streams
  • Performance: Variable latency depending on stream frequency
  • Security: Similar to HTTP with streaming capabilities
  • Benefits: Real-time updates, persistent connections, event-driven
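
As a rough sketch, the three JSON-configurable connection types might appear in a Bifrost configuration like the following (the `stdio_config` shape mirrors the Code Mode example later in this document; the `http_config` and `sse_config` keys and URLs are illustrative assumptions, and InProcess connections are omitted because they cannot be configured via JSON):

```json
{
  "mcp": {
    "client_configs": [
      {
        "name": "filesystem",
        "connection_type": "stdio",
        "stdio_config": {
          "command": "npx",
          "args": ["-y", "@anthropic/mcp-filesystem"]
        }
      },
      {
        "name": "websearch",
        "connection_type": "http",
        "http_config": { "url": "https://tools.internal.example.com/mcp" }
      },
      {
        "name": "monitoring",
        "connection_type": "sse",
        "sse_config": { "url": "https://events.internal.example.com/mcp/sse" }
      }
    ]
  }
}
```
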
📖 MCP Configuration: MCP Setup Guide →

Tool Discovery & Registration

Dynamic Tool Discovery Process

The MCP system discovers tools at runtime rather than requiring static configuration, enabling flexible and adaptive tool availability:

Tool Registry Management

Registration Process:
  1. Connection Establishment - MCP client connects to configured servers
  2. Capability Exchange - Server announces available tools and schemas
  3. Tool Validation - Bifrost validates tool definitions and security
  4. Registry Update - Tools are registered in the internal tool registry
  5. Availability Notification - Tools become available for AI model use
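
The registration steps above can be sketched as a simplified model (the `validate_tool` helper and the registry shape are illustrative, not Bifrost's actual API; the `client-tool` key naming follows the `filesystem-read_file` convention used in the filtering headers later in this document):

```python
# Minimal model of the tool registration flow (illustrative only).

def validate_tool(tool: dict) -> bool:
    """Step 3: require a name and a parameters schema before registering."""
    return bool(tool.get("name")) and isinstance(tool.get("parameters"), dict)

def register_tools(registry: dict, client_name: str, announced_tools: list) -> list:
    """Steps 4-5: add validated tools to the registry, namespaced by client."""
    available = []
    for tool in announced_tools:  # Step 2: tools announced by the server
        if not validate_tool(tool):
            continue  # invalid definitions are rejected, not registered
        key = f"{client_name}-{tool['name']}"  # e.g. "filesystem-read_file"
        registry[key] = tool
        available.append(key)
    return available

registry = {}
tools = [
    {"name": "read_file", "parameters": {"type": "object"}},
    {"name": "bad_tool"},  # missing schema -> fails validation
]
print(register_tools(registry, "filesystem", tools))  # → ['filesystem-read_file']
```
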
Registry Features:
  • Dynamic Updates - Tools can be added/removed during runtime
  • Version Management - Support for tool versioning and compatibility
  • Access Control - Request-level tool filtering and permissions
  • Health Monitoring - Continuous tool availability checking
Tool Metadata Structure:
  • Name & Description - Human-readable tool identification
  • Parameters Schema - JSON schema for tool input validation
  • Return Schema - Expected response format definition
  • Capabilities - Tool feature flags and limitations
  • Authentication - Required credentials and permissions
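
A concrete tool definition carrying this metadata might look like the following, following the MCP convention of a name, a description, and a JSON-schema parameter block (return schema, capability flags, and authentication requirements are tracked alongside this core definition; the example tool itself is illustrative):

```json
{
  "name": "read_file",
  "description": "Read the contents of a file at the given path",
  "inputSchema": {
    "type": "object",
    "properties": {
      "path": { "type": "string", "description": "Path to the file to read" }
    },
    "required": ["path"]
  }
}
```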

Tool Filtering & Access Control

Multi-Level Filtering System

Bifrost provides granular control over tool availability through a sophisticated filtering system:

Filtering Configuration Levels

Request-Level Filtering:
# Include only specific MCP clients
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-mcp-include-clients: filesystem,websearch" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'

# Include only specific tools
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-mcp-include-tools: filesystem-read_file,websearch-search" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'
Configuration-Level Filtering:
  • Client Selection - Choose which MCP servers to connect to
  • Tool Blacklisting - Permanently disable dangerous or unwanted tools
  • Permission Mapping - Map user roles to available tool sets
  • Environment-Based - Different tool sets for development vs production
Security Benefits:
  • Principle of Least Privilege - Only necessary tools are exposed
  • Dynamic Access Control - Per-request tool availability
  • Audit Trail - Track which tools are used by which requests
  • Risk Mitigation - Prevent access to dangerous operations
📖 Tool Filtering: MCP Tool Control →

Tool Execution Engine

Async Tool Execution Architecture

The MCP execution engine handles tool invocation asynchronously to maintain system responsiveness and enable complex multi-tool workflows:

Execution Flow Characteristics

Validation Phase:
  • Parameter Validation - Ensure tool arguments match expected schema
  • Permission Checking - Verify tool access permissions for the request
  • Rate Limiting - Apply per-tool and per-user rate limits
  • Security Scanning - Check for potentially dangerous operations
Execution Phase:
  • Timeout Management - Bounded execution time to prevent hanging
  • Error Handling - Graceful handling of tool failures and timeouts
  • Result Streaming - Support for tools that return streaming responses
  • Resource Monitoring - Track tool resource usage and performance
Response Phase:
  • Result Formatting - Convert tool outputs to consistent format
  • Error Enrichment - Add context and suggestions for tool failures
  • Multi-Result Aggregation - Combine multiple tool outputs coherently
  • Context Integration - Merge tool results into conversation context
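
The three phases can be sketched as a single async wrapper that validates arguments, bounds execution time, and normalizes the result (a minimal model using asyncio; the helper names and result format are illustrative, not Bifrost's API):

```python
import asyncio

async def execute_tool(tool_fn, args: dict, schema: set, timeout_s: float = 30.0) -> dict:
    # Validation phase: arguments must match the expected parameter names.
    unknown = set(args) - schema
    if unknown:
        return {"status": "error", "error": f"unexpected arguments: {sorted(unknown)}"}

    # Execution phase: bounded execution time to prevent hanging.
    try:
        result = await asyncio.wait_for(tool_fn(**args), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"status": "error", "error": f"tool timed out after {timeout_s}s"}
    except Exception as exc:  # graceful handling of tool failures
        return {"status": "error", "error": str(exc)}

    # Response phase: convert output to a consistent format.
    return {"status": "ok", "result": result}

async def echo(text: str) -> str:
    return text.upper()

print(asyncio.run(execute_tool(echo, {"text": "hi"}, schema={"text"})))
# → {'status': 'ok', 'result': 'HI'}
```
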

Multi-Turn Conversation Support

The MCP system enables sophisticated multi-turn conversations where AI models can:
  1. Initial Tool Discovery - Request available tools for a given context
  2. Tool Execution - Execute one or more tools based on user request
  3. Result Analysis - Analyze tool outputs and determine next steps
  4. Follow-up Actions - Execute additional tools based on previous results
  5. Response Synthesis - Combine tool results into coherent user response
Example Multi-Turn Flow:
User: "Find recent news about AI and save interesting articles"
AI: → Execute web_search("AI news recent")
AI: → Analyze search results
AI: → Execute save_article() for each interesting result
AI: → Respond with summary of saved articles
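
The multi-turn flow above amounts to a driver loop: send the conversation, execute any requested tools, feed the results back, and repeat until the model answers in plain text. A minimal sketch, where the scripted model and the lambda tool stand in for real LLM and MCP calls:

```python
def run_conversation(model, tools: dict, messages: list) -> str:
    """Loop until the model responds without requesting tools."""
    while True:
        reply = model(messages)
        calls = reply.get("tool_calls", [])
        if not calls:  # response synthesis: final text answer
            return reply["content"]
        messages.append(reply)
        for call in calls:  # tool execution phase
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": result})

# Stand-ins: a scripted "model" that first searches, then summarizes.
def fake_model(messages):
    if not any(m.get("role") == "tool" for m in messages):
        return {"role": "assistant", "content": None,
                "tool_calls": [{"name": "web_search", "args": {"query": "AI news recent"}}]}
    return {"role": "assistant", "content": "Saved 2 interesting articles."}

tools = {"web_search": lambda query: f"results for {query!r}"}
print(run_conversation(fake_model, tools, [{"role": "user", "content": "Find recent AI news"}]))
# → Saved 2 interesting articles.
```
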

Complete User-Controlled Tool Execution Flow

The following diagram shows the end-to-end user experience with MCP tool execution, highlighting the critical user control points and decision-making process.

Key Flow Characteristics

User Control Points:
  • Security Layer - Your application controls all tool execution decisions
  • Approval Gate - Users can approve or deny each tool execution
  • Transparency - Full visibility into what tools will be executed and why
  • Conversation Continuity - Tool results seamlessly integrate into conversation flow
Security Benefits:
  • No Automatic Execution - Tools never execute without explicit approval
  • Audit Trail - Complete logging of all tool execution decisions
  • Contextual Security - Approval decisions can consider full conversation context
  • Graceful Denials - Denied tools result in informative responses, not errors
Implementation Patterns:
// Example tool execution control in your application.
// The `client`, `ctx`, and helper functions are application-provided.
func handleToolExecution(ctx context.Context, toolCall schemas.ChatToolCall, userContext UserContext) error {
    // YOUR SECURITY AND APPROVAL LOGIC HERE
    if !userContext.HasPermission(toolCall.Function.Name) {
        return createDenialResponse("Tool not permitted for user role")
    }

    if requiresApproval(toolCall) {
        approved := promptUserForApproval(toolCall)
        if !approved {
            return createDenialResponse("User denied tool execution")
        }
    }

    // Execute the tool via Bifrost
    result, err := client.ExecuteMCPTool(ctx, toolCall)
    if err != nil {
        return handleToolError(err)
    }

    return addToolResultToHistory(result)
}
This flow ensures that while AI models can discover and request tool usage, all actual execution remains under user control, providing the perfect balance of AI capability and human oversight.

Agent Mode Architecture

Agent Mode transforms Bifrost into an autonomous agent runtime by automatically executing pre-approved tools. This section details the internal architecture of the agent execution loop.

Agent Execution Loop

The agent mode operates as an iterative loop that continues until one of the termination conditions is met:

Tool Classification System

When the LLM returns tool calls, Bifrost classifies each tool based on the client configuration:

Mixed Tool Response Format

When a response contains both auto-executable and non-auto-executable tools, the agent creates a special response format:

Chat API Response Format

{
  "id": "chatcmpl-abc123",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "The Output from allowed tools calls is - {\"filesystem_read_file\":\"file contents here\",\"filesystem_list_directory\":\"[\\\"file1.txt\\\",\\\"file2.txt\\\"]\"}\n\nNow I shall call these tools next...",
      "tool_calls": [
        {
          "id": "call_write_123",
          "type": "function",
          "function": {
            "name": "filesystem_write_file",
            "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
          }
        }
      ]
    }
  }]
}
The content field contains JSON-formatted results from auto-executed tools. The tool_calls array contains only non-auto-executable tools awaiting approval. Setting finish_reason to "stop" ensures the agent loop exits.

Responses API Response Format

{
  "id": "resp-abc123",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{
        "type": "text",
        "text": "The Output from allowed tools calls is - {...}\n\nNow I shall call these tools next..."
      }]
    },
    {
      "type": "function_call",
      "role": "assistant",
      "call_id": "call_write_123",
      "name": "filesystem_write_file",
      "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
    }
  ]
}

Agent Depth Control

The max_agent_depth setting prevents infinite loops and controls resource usage.
When max depth is reached, the response may contain pending tool calls that weren't executed. Your application should handle this gracefully.
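
A sketch of how a depth limit bounds the loop and surfaces unexecuted calls to the caller (illustrative model only; the return shape is an assumption, not Bifrost's response format):

```python
def agent_loop(model, execute, max_agent_depth: int = 5) -> dict:
    """Run the agent loop, stopping at max depth with any pending calls."""
    messages = []
    reply = model(messages)
    for _ in range(max_agent_depth):
        calls = reply.get("tool_calls", [])
        if not calls:  # normal termination: plain-text answer
            return {"content": reply["content"], "pending_tool_calls": []}
        for call in calls:  # auto-execute pre-approved tools
            messages.append({"role": "tool", "content": execute(call)})
        reply = model(messages)
    # Max depth reached: surface unexecuted calls for the application to handle.
    return {"content": reply.get("content"), "pending_tool_calls": reply.get("tool_calls", [])}

# A model that always requests another tool call never terminates on its own,
# so the depth limit stops it and reports the pending call.
out = agent_loop(lambda m: {"tool_calls": [{"name": "noop"}]},
                 execute=lambda c: "ok", max_agent_depth=3)
print(len(out["pending_tool_calls"]))  # → 1
```
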

Code Mode Architecture

Code Mode enables AI models to write and execute Python-like code (Starlark) that orchestrates multiple MCP tools in a single request. This provides a powerful meta-layer for complex multi-tool workflows.

Code Mode System Overview

Virtual File System (VFS)

Code Mode generates Python stub files (.pyi) for all connected MCP tools, providing compact function signatures.
When code_mode_binding_level: "server" (default), tools are grouped by MCP client:
servers/
├── filesystem.pyi      → All filesystem tools
├── web_search.pyi      → All web search tools
└── database.pyi        → All database tools
Generated Stub Example:
# servers/filesystem.pyi
# Usage: filesystem.tool_name(param=value)
# For detailed docs: use getToolDocs(server="filesystem", tool="tool_name")

def read_file(path: str) -> dict:  # Read contents of a file
def write_file(path: str, content: str) -> dict:  # Write content to a file
def list_directory(path: str) -> dict:  # List directory contents
Usage in Code:
files = filesystem.list_directory(path=".")
content = filesystem.read_file(path=files["entries"][0])
result = content

Code Execution Flow

Starlark Sandbox

The code execution environment is carefully sandboxed using Starlark, a Python-like language designed for configuration and embedded scripting:

Available Features

  • Python-like syntax - Familiar Python syntax and semantics
  • Synchronous calls - No async/await needed, direct function calls
  • List comprehensions - [x for x in items if condition]
  • print() - Output captured and returned in logs
  • Dict/List operations - Standard Python data structures
  • Tool bindings - All connected MCP tools as globals

Restrictions

  • Imports - No import statements (tools are pre-bound)
  • Classes - Not supported; use dicts and functions instead
  • File I/O - No direct filesystem access (use MCP tools)
  • Network - No direct network access (use MCP tools)
  • Randomness/Time - Deterministic execution only
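
For example, a typical Code Mode script sticks to comprehensions, dicts, and pre-bound tool calls. In the sketch below, `filesystem` is stubbed so the snippet runs as plain Python; in Code Mode it would be injected as a global binding, and the stub itself (which uses a class) lives outside the sandboxed script:

```python
# Stub standing in for the pre-bound `filesystem` tool binding (hypothetical data).
class filesystem:
    @staticmethod
    def list_directory(path):
        return {"entries": ["a.log", "b.txt", "c.log"]}

# Starlark-style orchestration: no imports, no classes, no direct I/O.
files = filesystem.list_directory(path=".")
logs = [name for name in files["entries"] if name.endswith(".log")]
print(logs)  # → ['a.log', 'c.log']
result = {"log_count": len(logs)}
```
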

Code Mode Security Model

Code Mode Configuration

{
  "mcp": {
    "client_configs": [
      {
        "name": "filesystem",
        "is_code_mode_client": true,
        "connection_type": "stdio",
        "stdio_config": {
          "command": "npx",
          "args": ["-y", "@anthropic/mcp-filesystem"]
        },
        "tools_to_execute": ["*"]
      }
    ],
    "tool_manager_config": {
      "code_mode_binding_level": "server",
      "tool_execution_timeout": "30s"
    }
  }
}

Code Mode vs Agent Mode

| Aspect          | Agent Mode                     | Code Mode                                    |
| --------------- | ------------------------------ | -------------------------------------------- |
| Execution Model | LLM decides one tool at a time | LLM writes code orchestrating multiple tools |
| Iterations      | Multiple LLM round-trips       | Single LLM call, code handles orchestration  |
| Complexity      | Simple tool chains             | Complex workflows with conditionals/loops    |
| Latency         | Higher (multiple LLM calls)    | Lower (single LLM call + code execution)     |
| Control         | Per-tool approval possible     | Code runs atomically                         |
| Best For        | Interactive agents             | Batch operations, complex data processing    |

MCP Integration Patterns

Common Integration Scenarios

1. Filesystem Operations
  • Tools: list_files, read_file, write_file, create_directory
  • Use Cases: Code analysis, document processing, file management
  • Security: Sandboxed file access, path validation, permission checks
  • Performance: Local execution for fast file operations
2. Web Search & Information Retrieval
  • Tools: web_search, fetch_url, extract_content, summarize
  • Use Cases: Research assistance, fact-checking, content gathering
  • Integration: External search APIs, content parsing services
  • Caching: Response caching for repeated queries
3. Database Operations
  • Tools: query_database, insert_record, update_record, schema_info
  • Use Cases: Data analysis, report generation, database administration
  • Security: Read-only access by default, query validation, injection prevention
  • Performance: Connection pooling, query optimization
4. API Integrations
  • Tools: Custom business logic tools, third-party service integration
  • Use Cases: CRM operations, payment processing, notification sending
  • Authentication: API key management, OAuth token handling
  • Error Handling: Retry logic, fallback mechanisms

MCP Server Development Patterns

Simple STDIO Server:
  • Language: Any language that can read/write JSON to stdin/stdout
  • Deployment: Single executable, minimal dependencies
  • Use Case: Local tools, development utilities, simple scripts
HTTP Service Server:
  • Architecture: RESTful API with MCP protocol endpoints
  • Scalability: Horizontal scaling, load balancing
  • Use Case: Shared tools, enterprise integrations, cloud services
Hybrid Approach:
  • Local + Remote: Combine STDIO tools for local operations with HTTP for remote services
  • Failover: Use local fallbacks when remote services are unavailable
  • Optimization: Route tool calls to most appropriate execution environment
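
The failover pattern can be sketched as: try the remote HTTP tool first, and fall back to a local equivalent when the remote service is unreachable (both executors below are illustrative stand-ins, not real MCP calls):

```python
def call_with_fallback(remote, local, *args, **kwargs):
    """Route to the remote service, falling back to the local tool on failure."""
    try:
        return remote(*args, **kwargs)
    except ConnectionError:
        return local(*args, **kwargs)

def remote_search(query):
    # Simulated outage of the remote MCP server.
    raise ConnectionError("remote MCP server unreachable")

def local_search(query):
    return f"local results for {query!r}"

print(call_with_fallback(remote_search, local_search, "bifrost"))
# → local results for 'bifrost'
```
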
📖 MCP Development: Tool Development Guide →

Security & Safety Considerations

MCP Security Architecture

Security Measures:

Connection Security:
  • Authentication - API keys, certificates, or token-based auth for HTTP/SSE
  • Encryption - TLS for HTTP connections, secure pipes for STDIO
  • Network Isolation - Firewall rules and network segmentation
Execution Security:
  • Sandboxing - Isolated execution environments for tools
  • Resource Limits - CPU, memory, and time constraints
  • Permission Model - Principle of least privilege for tool access
Operational Security:
  • Regular Updates - Keep MCP servers and tools updated
  • Monitoring - Continuous security monitoring and alerting
  • Incident Response - Procedures for security incidents involving tools