
Overview

When an LLM returns tool calls in its response, Bifrost does not automatically execute them. Instead, your application explicitly calls the tool execution API, giving you full control over:
  • Which tool calls to execute
  • User approval workflows
  • Security validation
  • Audit logging
The basic flow is: Chat Request → Review Tool Calls → Execute Tools → Continue Conversation. For detailed architecture diagrams, see the MCP Architecture documentation.
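The review step between receiving tool calls and executing them can be sketched as a simple allowlist gate. This is only an illustration under stated assumptions: the ToolCall type mirrors the tool call JSON used in the examples in this document, while the partitionToolCalls helper and the allowlist policy are hypothetical and not part of Bifrost.

```go
package main

// ToolCall mirrors the tool call JSON object returned in a chat response.
type ToolCall struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"`
	Function FunctionCall `json:"function"`
}

// FunctionCall holds the tool name and its JSON-encoded arguments.
type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// partitionToolCalls is a hypothetical helper: it splits the requested tool
// calls into those the application will execute and those it rejects, using
// an allowlist keyed by tool name.
func partitionToolCalls(calls []ToolCall, allowed map[string]bool) (approved, rejected []ToolCall) {
	for _, c := range calls {
		if allowed[c.Function.Name] {
			approved = append(approved, c)
		} else {
			rejected = append(rejected, c)
		}
	}
	return approved, rejected
}
```

Only the approved calls would then be sent to the tool execution endpoint; the rejected ones can be logged for auditing or routed to a user approval prompt.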

Authentication

The /v1/mcp/tool/execute endpoint uses the same authentication as other inference endpoints like /v1/chat/completions:
Auth Configuration | Behavior
disable_auth_on_inference: true | No auth required
disable_auth_on_inference: false | Auth required
Virtual keys and authentication are independent layers that work together. For details on how to use virtual keys with authentication, see Authentication and Virtual Keys.

End-to-End Example

Step 1: Send Chat Request

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "List files in the current directory"
      }
    ]
  }'
Response with tool calls:
{
  "id": "chatcmpl-abc123",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_xyz789",
        "type": "function",
        "function": {
          "name": "filesystem_list_directory",
          "arguments": "{\"path\": \".\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
Tool names are prefixed with the MCP client name (e.g., filesystem_list_directory). This ensures uniqueness across multiple MCP clients.

Step 2: Execute the Tool

The request body matches the tool call object from the response:
curl -X POST http://localhost:8080/v1/mcp/tool/execute \
  -H "Content-Type: application/json" \
  -d '{
    "id": "call_xyz789",
    "type": "function",
    "function": {
      "name": "filesystem_list_directory",
      "arguments": "{\"path\": \".\"}"
    }
  }'
Tool result response:
{
  "role": "tool",
  "content": "[\"config.json\", \"main.go\", \"README.md\"]",
  "tool_call_id": "call_xyz789"
}

Step 3: Continue the Conversation

Assemble the full conversation history and continue:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "List files in the current directory"
      },
      {
        "role": "assistant",
        "content": null,
        "tool_calls": [{
          "id": "call_xyz789",
          "type": "function",
          "function": {
            "name": "filesystem_list_directory",
            "arguments": "{\"path\": \".\"}"
          }
        }]
      },
      {
        "role": "tool",
        "content": "[\"config.json\", \"main.go\", \"README.md\"]",
        "tool_call_id": "call_xyz789"
      }
    ]
  }'
Final response:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The current directory contains 3 files:\n\n1. **config.json** - Configuration file\n2. **main.go** - Go source file\n3. **README.md** - Documentation"
    },
    "finish_reason": "stop"
  }]
}

Response Formats

Bifrost supports two API formats for tool execution:

Chat Format (Default)

Use ?format=chat or omit the parameter:
POST /v1/mcp/tool/execute?format=chat
Request:
{
  "id": "call_xyz789",
  "type": "function",
  "function": {
    "name": "filesystem_read_file",
    "arguments": "{\"path\": \"config.json\"}"
  }
}
Response:
{
  "role": "tool",
  "content": "{\"key\": \"value\"}",
  "tool_call_id": "call_xyz789"
}

Responses Format

Use ?format=responses for the Responses API format:
POST /v1/mcp/tool/execute?format=responses
Request:
{
  "type": "function_call",
  "call_id": "call_xyz789",
  "name": "filesystem_read_file",
  "arguments": "{\"path\": \"config.json\"}"
}
Response:
{
  "type": "function_call_output",
  "call_id": "call_xyz789",
  "output": "{\"key\": \"value\"}"
}
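A client that supports both formats has to decode two response shapes. The sketch below normalizes them into one value, assuming the two response bodies shown above. ToolOutput and parseToolResult are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolOutput is a format-independent view of a tool execution result.
type ToolOutput struct {
	CallID string
	Output string
}

// parseToolResult decodes a tool execution response body according to the
// value used for the format query parameter ("" and "chat" are treated the
// same, since chat is the default).
func parseToolResult(format string, body []byte) (ToolOutput, error) {
	switch format {
	case "", "chat":
		var m struct {
			Role       string `json:"role"`
			Content    string `json:"content"`
			ToolCallID string `json:"tool_call_id"`
		}
		if err := json.Unmarshal(body, &m); err != nil {
			return ToolOutput{}, fmt.Errorf("decode chat result: %w", err)
		}
		if m.Role != "tool" {
			return ToolOutput{}, fmt.Errorf("unexpected role %q", m.Role)
		}
		return ToolOutput{CallID: m.ToolCallID, Output: m.Content}, nil
	case "responses":
		var m struct {
			Type   string `json:"type"`
			CallID string `json:"call_id"`
			Output string `json:"output"`
		}
		if err := json.Unmarshal(body, &m); err != nil {
			return ToolOutput{}, fmt.Errorf("decode responses result: %w", err)
		}
		if m.Type != "function_call_output" {
			return ToolOutput{}, fmt.Errorf("unexpected type %q", m.Type)
		}
		return ToolOutput{CallID: m.CallID, Output: m.Output}, nil
	default:
		return ToolOutput{}, fmt.Errorf("unknown format %q", format)
	}
}
```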

Multiple Tool Calls

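LLMs often request multiple tools in a single response. You can execute them sequentially or in parallel; the loop below runs them one at a time and appends each result to the history:
for _, toolCall := range *response.Choices[0].Message.ToolCalls {
    result, err := client.ExecuteChatMCPTool(ctx, toolCall)
    if err != nil {
        // Handle the error (for example, log it), then move on to the next call
        continue
    }
    // The result is already a tool message, so it can be appended as-is
    history = append(history, *result)
}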

Error Handling

Tool execution can fail because of a timeout, a missing tool or disconnected client, or a tool that is filtered out by configuration:
result, err := client.ExecuteChatMCPTool(ctx, toolCall)
if err != nil {
    switch {
    case errors.Is(err, context.DeadlineExceeded):
        // Tool execution timed out
    case strings.Contains(err.Error(), "tool not found"):
        // Tool doesn't exist or client disconnected
    case strings.Contains(err.Error(), "not allowed"):
        // Tool filtered out by configuration
    default:
        // Other execution error
    }
}
Gateway error responses:
{
  "error": {
    "type": "tool_execution_error",
    "message": "Tool 'filesystem_delete_file' is not allowed for this request"
  }
}
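When calling the gateway over HTTP, the error body can be decoded into a typed error. The sketch below assumes only the two fields shown in the example above (type and message). GatewayError and parseGatewayError are hypothetical names, and a real response may carry additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GatewayError holds the fields of the "error" object in a gateway response.
type GatewayError struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// Error implements the error interface.
func (e *GatewayError) Error() string {
	return e.Type + ": " + e.Message
}

// parseGatewayError decodes an error response body. On success it returns a
// *GatewayError; if the body is not valid JSON or has no "error" object, it
// returns a plain error describing the problem.
func parseGatewayError(body []byte) error {
	var envelope struct {
		Error *GatewayError `json:"error"`
	}
	if err := json.Unmarshal(body, &envelope); err != nil {
		return fmt.Errorf("decode error response: %w", err)
	}
	if envelope.Error == nil {
		return fmt.Errorf("response has no error object")
	}
	return envelope.Error
}
```

Callers can then branch on the Type field (for example "tool_execution_error") instead of matching substrings of the message.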

Copy-Pastable Responses

Tool execution responses are designed to be directly appended to your conversation history:
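// The tool result is already in the correct message format
toolResult, err := client.ExecuteChatMCPTool(ctx, toolCall)
if err != nil {
    return err // toolResult may be nil here, so do not dereference it
}

// Append it directly to the conversation history
history = append(history, *toolResult)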
The response includes:
  • Correct role field ("tool")
  • Matching tool_call_id for correlation
  • Properly formatted content
