Multi-Provider Setup
Configure multiple providers so you can switch between them seamlessly. This example shows how to configure the OpenAI, Anthropic, and Mistral providers.
type MyAccount struct{}
func (a *MyAccount) GetConfiguredProviders() ([]schemas.ModelProvider, error) {
return []schemas.ModelProvider{schemas.OpenAI, schemas.Anthropic, schemas.Mistral}, nil
}
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.OpenAI:
return []schemas.Key{{
Value: os.Getenv("OPENAI_API_KEY"),
Models: []string{},
Weight: 1.0,
}}, nil
case schemas.Anthropic:
return []schemas.Key{{
Value: os.Getenv("ANTHROPIC_API_KEY"),
Models: []string{},
Weight: 1.0,
}}, nil
case schemas.Mistral:
return []schemas.Key{{
Value: os.Getenv("MISTRAL_API_KEY"),
Models: []string{},
Weight: 1.0,
}}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
// Return same config for all providers
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
}, nil
}
If Bifrost receives a request for a provider that was not returned by GetConfiguredProviders() at bifrost.Init() time, it sets that provider up on the fly using GetConfigForProvider(), which may add latency to the first request to that provider.
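As a minimal sketch of that lazy setup (Groq and the model name here are hypothetical additions, assuming your GetKeysForProvider can also return a key for schemas.Groq):
// First request to a provider not returned by GetConfiguredProviders().
// Bifrost configures it on the fly via GetConfigForProvider(), so this call
// may be slower; subsequent requests reuse the already-prepared provider.
response, err := client.ChatCompletionRequest(context.Background(), &schemas.BifrostChatRequest{
	Provider: schemas.Groq,              // assumption: provider constant name
	Model:    "llama-3.3-70b-versatile", // hypothetical model name
	Input:    messages,
})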
Making Requests
Once providers are configured, you can make requests to any specific provider. This example shows how to send a request directly to Mistral’s latest vision model. Bifrost handles the provider-specific API formatting automatically.
response, err := client.ChatCompletionRequest(context.Background(), &schemas.BifrostChatRequest{
Provider: schemas.Mistral,
Model: "pixtral-12b-latest",
Input: messages,
})
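A minimal sketch of handling the result, using only fields from the BifrostChatResponse struct shown later on this page (log and fmt are from the standard library; the exact error type is whatever ChatCompletionRequest returns in your Bifrost version):
if err != nil {
	log.Fatalf("chat completion failed: %v", err)
}
// ID and Model come straight from the standardized BifrostChatResponse.
fmt.Printf("response %s produced by model %s\n", response.ID, response.Model)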
Environment Variables
Set up your API keys for the providers you want to use:
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export CEREBRAS_API_KEY="your-cerebras-api-key"
export MISTRAL_API_KEY="your-mistral-api-key"
export GROQ_API_KEY="your-groq-api-key"
export COHERE_API_KEY="your-cohere-api-key"
Advanced Configuration
Weighted Load Balancing
Distribute requests across multiple API keys or providers based on custom weights. This example shows how to split traffic 70/30 between two OpenAI keys, useful for managing rate limits or costs across different accounts.
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.OpenAI:
return []schemas.Key{{
Value: os.Getenv("OPENAI_API_KEY_1"),
Models: []string{},
Weight: 0.7, // 70% of requests
},
{
Value: os.Getenv("OPENAI_API_KEY_2"),
Models: []string{},
Weight: 0.3, // 30% of requests
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Model-Specific Keys
Use different API keys for specific models, allowing you to manage access controls and billing separately. This example uses a premium key for advanced reasoning models (o1-preview, o1-mini) and a standard key for regular GPT models.
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.OpenAI:
return []schemas.Key{
{
Value: os.Getenv("OPENAI_API_KEY"),
Models: []string{"gpt-4o", "gpt-4o-mini"},
Weight: 1.0,
},
{
Value: os.Getenv("OPENAI_API_KEY_PREMIUM"),
Models: []string{"o1-preview", "o1-mini"},
Weight: 1.0,
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Custom Base URL
Override the default API endpoint for a provider. This is useful for connecting to self-hosted models, local development servers, or OpenAI-compatible APIs like vLLM, Ollama, or LiteLLM.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
switch provider {
case schemas.OpenAI:
return &schemas.ProviderConfig{
NetworkConfig: schemas.NetworkConfig{
BaseURL: "http://localhost:8000/v1", // Custom endpoint
},
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
For self-hosted providers like Ollama and SGL, BaseURL is required. For standard providers, it’s optional and overrides the default endpoint.
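As a sketch for a self-hosted setup, assuming schemas.Ollama is the provider constant in your Bifrost version and Ollama is listening on its default local port:
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
	switch provider {
	case schemas.Ollama: // assumption: provider constant name
		return &schemas.ProviderConfig{
			NetworkConfig: schemas.NetworkConfig{
				BaseURL: "http://localhost:11434", // required for self-hosted providers
			},
			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
		}, nil
	}
	return nil, fmt.Errorf("provider %s not supported", provider)
}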
Managing Retries
Configure retry behavior for handling temporary failures and rate limits. This example sets up exponential backoff with up to 5 retries, starting with a 1 ms delay and capping at 10 seconds, which is well suited to transient network issues.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
switch provider {
case schemas.OpenAI:
return &schemas.ProviderConfig{
NetworkConfig: schemas.NetworkConfig{
MaxRetries: 5,
RetryBackoffInitial: 1 * time.Millisecond,
RetryBackoffMax: 10 * time.Second,
},
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
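For intuition: assuming a simple doubling backoff with no jitter, the five retry delays would be roughly 1 ms, 2 ms, 4 ms, 8 ms, and 16 ms, so the 10-second cap only comes into play if you raise the initial delay or the retry count.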
Custom Concurrency and Buffer Size
Fine-tune performance by adjusting worker concurrency and queue sizes per provider (defaults are 1000 workers and a 5000-request queue). This example gives OpenAI higher limits than Anthropic (100 workers, 500 queue) for throughput, while Anthropic gets more conservative limits (25 workers, 100 queue) to respect its rate limits.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
switch provider {
case schemas.OpenAI:
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.ConcurrencyAndBufferSize{
MaxConcurrency: 100, // Max number of concurrent requests (no of workers)
BufferSize: 500, // Max number of requests in the buffer (queue size)
},
}, nil
case schemas.Anthropic:
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.ConcurrencyAndBufferSize{
MaxConcurrency: 25,
BufferSize: 100,
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Custom Headers
Bifrost supports two ways to add custom headers to provider requests: static headers configured at the provider level, and dynamic headers passed per request via context.
Configure headers that are automatically included in every request to a specific provider using NetworkConfig.ExtraHeaders:
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
switch provider {
case schemas.OpenAI:
return &schemas.ProviderConfig{
NetworkConfig: schemas.NetworkConfig{
ExtraHeaders: map[string]string{
"x-custom-org": "my-organization",
"x-environment": "production",
},
},
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Send custom headers with individual requests by adding them to the request context. Headers are automatically propagated to the provider:
import (
"context"
"github.com/maximhq/bifrost/core/schemas"
)
func makeRequestWithCustomHeaders() {
// Create base context
ctx := context.Background()
// Add custom headers using BifrostContextKeyExtraHeaders
extraHeaders := map[string][]string{
"user-id": {"user-123"},
"session-id": {"session-abc"},
"custom-metadata": {"value1", "value2"}, // Multiple values supported
}
ctx = context.WithValue(ctx, schemas.BifrostContextKeyExtraHeaders, extraHeaders)
// Make request with custom headers
response, err := client.ChatCompletionRequest(ctx, &schemas.BifrostChatRequest{
Provider: schemas.OpenAI,
Model: "gpt-4o-mini",
Input: messages,
})
if err != nil {
// Handle error
}
}
How it works:
- Headers are stored as map[string][]string in the context
- Multiple values per header name are supported
- Header names are case-insensitive and normalized to lowercase
- Headers are accessible throughout the request lifecycle
Example use cases:
- User identification: user-id, tenant-id
- Request tracking: correlation-id, trace-id
- Custom metadata: department, cost-center
- A/B testing: experiment-id, variant
Security Denylist
Bifrost maintains a security denylist of headers that are never forwarded to providers, regardless of configuration:
denylist := map[string]bool{
"proxy-authorization": true,
"cookie": true,
"host": true,
"content-length": true,
"connection": true,
"transfer-encoding": true,
// prevent auth/key overrides
"x-api-key": true,
"x-goog-api-key": true,
"x-bf-api-key": true,
"x-bf-vk": true,
}
This denylist is applied to both static and dynamic headers to prevent security vulnerabilities.
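For example, a denylisted name is never forwarded even if you pass it per request; a short sketch reusing the context mechanism shown above:
// "cookie" is on the denylist and will not reach the provider;
// "x-trace-id" is not denylisted, so it is forwarded as usual.
ctx := context.WithValue(context.Background(), schemas.BifrostContextKeyExtraHeaders, map[string][]string{
	"cookie":     {"session=secret"}, // stripped by the denylist
	"x-trace-id": {"trace-123"},      // forwarded
})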
Setting Up a Proxy
Route requests through proxies for compliance, security, or geographic requirements. This example shows an HTTP proxy for OpenAI and an authenticated SOCKS5 proxy for Anthropic, useful for corporate environments or regional access.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
switch provider {
case schemas.OpenAI:
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
ProxyConfig: &schemas.ProxyConfig{
Type: schemas.HttpProxy,
URL: "http://localhost:8000", // Proxy URL
},
}, nil
case schemas.Anthropic:
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
ProxyConfig: &schemas.ProxyConfig{
Type: schemas.Socks5Proxy,
URL: "http://localhost:8000", // Proxy URL
Username: "user",
Password: "password",
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Send Back Raw Response
Include the original provider response alongside Bifrost’s standardized response format. Useful for debugging and accessing provider-specific metadata.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
SendBackRawResponse: true, // Include raw provider response
}, nil
}
When enabled, the raw provider response appears in ExtraFields.RawResponse:
type BifrostChatResponse struct {
ID string `json:"id"`
Choices []BifrostResponseChoice `json:"choices"`
Created int `json:"created"` // The Unix timestamp (in seconds).
Model string `json:"model"`
Object string `json:"object"` // "chat.completion" or "chat.completion.chunk"
ServiceTier string `json:"service_tier"`
SystemFingerprint string `json:"system_fingerprint"`
Usage *BifrostLLMUsage `json:"usage"`
ExtraFields BifrostResponseExtraFields `json:"extra_fields"`
}
type BifrostResponseExtraFields struct {
RequestType RequestType `json:"request_type"`
Provider ModelProvider `json:"provider"`
ModelRequested string `json:"model_requested"`
Latency int64 `json:"latency"` // in milliseconds (for streaming responses this will be each chunk latency, and the last chunk latency will be the total latency)
ChunkIndex int `json:"chunk_index"` // used for streaming responses to identify the chunk index, will be 0 for non-streaming responses
RawResponse interface{} `json:"raw_response,omitempty"`
CacheDebug *BifrostCacheDebug `json:"cache_debug,omitempty"`
}
Send Back Raw Request
Include the original request sent to the provider alongside Bifrost’s response. Useful for debugging request transformations and verifying what was actually sent to the provider.
func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
return &schemas.ProviderConfig{
NetworkConfig: schemas.DefaultNetworkConfig,
ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
SendBackRawRequest: true, // Include raw provider request
}, nil
}
When enabled, the raw provider request appears in ExtraFields.RawRequest:
type BifrostResponseExtraFields struct {
// ... other fields
RawRequest interface{} `json:"raw_request,omitempty"`
RawResponse interface{} `json:"raw_response,omitempty"`
}
You can enable both SendBackRawRequest and SendBackRawResponse together to see the complete request-response cycle for debugging purposes.
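Since both fields are typed as interface{}, reading them requires a type assertion. A minimal sketch, assuming the provider payloads decode to generic JSON maps (the concrete type can differ by provider):
if raw, ok := response.ExtraFields.RawResponse.(map[string]interface{}); ok {
	// Provider-specific metadata that Bifrost does not map into its own schema.
	fmt.Println("raw response fields:", len(raw))
}
if rawReq, ok := response.ExtraFields.RawRequest.(map[string]interface{}); ok {
	fmt.Println("raw request fields:", len(rawReq))
}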
Provider-Specific Authentication
Enterprise cloud providers require additional configuration beyond API keys. Configure Azure, AWS Bedrock, and Google Vertex with platform-specific authentication details.
Azure
Azure supports two authentication methods:
Azure Entra ID (Service Principal)
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.Azure:
return []schemas.Key{
{
Value: "", // Leave empty for Service Principal auth
Models: []string{"gpt-4o", "gpt-4o-mini"},
Weight: 1.0,
AzureKeyConfig: &schemas.AzureKeyConfig{
Endpoint: os.Getenv("AZURE_ENDPOINT"),
ClientID: bifrost.Ptr(os.Getenv("AZURE_CLIENT_ID")),
ClientSecret: bifrost.Ptr(os.Getenv("AZURE_CLIENT_SECRET")),
TenantID: bifrost.Ptr(os.Getenv("AZURE_TENANT_ID")),
Deployments: map[string]string{
"gpt-4o": "gpt-4o-deployment",
"gpt-4o-mini": "gpt-4o-mini-deployment",
},
APIVersion: bifrost.Ptr("2024-08-01-preview"),
},
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Direct Authentication
For simpler use cases, provide the authentication credential directly in the Value field:
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.Azure:
return []schemas.Key{
{
Value: os.Getenv("AZURE_OPENAI_KEY"),
Models: []string{"gpt-4o", "gpt-4o-mini"},
Weight: 1.0,
AzureKeyConfig: &schemas.AzureKeyConfig{
Endpoint: os.Getenv("AZURE_ENDPOINT"),
Deployments: map[string]string{
"gpt-4o": "gpt-4o-deployment",
"gpt-4o-mini": "gpt-4o-mini-deployment",
},
APIVersion: bifrost.Ptr("2024-08-01-preview"),
},
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
If ClientID, ClientSecret, and TenantID are configured, Service Principal authentication is used. Otherwise, direct authentication with the Value field is used.
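One way to keep both paths in a single Account is to branch on which environment variables are present. A sketch under that assumption (the azureKey helper is hypothetical, and the deployment names are reused from the examples above):
func (a *MyAccount) azureKey() schemas.Key {
	cfg := &schemas.AzureKeyConfig{
		Endpoint: os.Getenv("AZURE_ENDPOINT"),
		Deployments: map[string]string{
			"gpt-4o": "gpt-4o-deployment",
		},
		APIVersion: bifrost.Ptr("2024-08-01-preview"),
	}
	key := schemas.Key{Models: []string{"gpt-4o"}, Weight: 1.0, AzureKeyConfig: cfg}
	if os.Getenv("AZURE_CLIENT_ID") != "" {
		// Service Principal (Entra ID): leave Value empty and set the client credentials.
		cfg.ClientID = bifrost.Ptr(os.Getenv("AZURE_CLIENT_ID"))
		cfg.ClientSecret = bifrost.Ptr(os.Getenv("AZURE_CLIENT_SECRET"))
		cfg.TenantID = bifrost.Ptr(os.Getenv("AZURE_TENANT_ID"))
	} else {
		// Direct authentication: put the API key in Value.
		key.Value = os.Getenv("AZURE_OPENAI_KEY")
	}
	return key
}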
AWS Bedrock
AWS Bedrock supports both explicit credentials and IAM role authentication:
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.Bedrock:
return []schemas.Key{
{
Models: []string{"anthropic.claude-3-sonnet-20240229-v1:0", "anthropic.claude-v2:1"},
Weight: 1.0,
Value: os.Getenv("AWS_API_KEY"), // Leave empty for IAM role authentication
BedrockKeyConfig: &schemas.BedrockKeyConfig{
AccessKey: os.Getenv("AWS_ACCESS_KEY_ID"), // Leave empty for API Key authentication or system's IAM pickup
SecretKey: os.Getenv("AWS_SECRET_ACCESS_KEY"), // Leave empty for API Key authentication or system's IAM pickup
SessionToken: bifrost.Ptr(os.Getenv("AWS_SESSION_TOKEN")), // Optional
Region: bifrost.Ptr("us-east-1"),
// For model profiles (inference profiles)
Deployments: map[string]string{
"claude-3-sonnet": "us.anthropic.claude-3-sonnet-20240229-v1:0",
},
// For direct model access without profiles
ARN: bifrost.Ptr("arn:aws:bedrock:us-east-1:123456789012:inference-profile"),
},
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Notes:
- If using API key authentication, set the Value field to the API key; otherwise leave it empty for IAM role authentication.
- With IAM role authentication, if both AccessKey and SecretKey are empty, Bifrost uses the IAM credentials from the environment.
- ARN is required for URL formation; the Deployments mapping is ignored without it.
- When using ARN + Deployments, Bifrost uses model (inference) profiles; otherwise it builds the path from the incoming model name directly.
Google Vertex
Google Vertex requires project configuration and authentication credentials:
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
switch provider {
case schemas.Vertex:
return []schemas.Key{
{
Value: os.Getenv("VERTEX_API_KEY"), // only when using gemini or fine-tuned models
Models: []string{"gemini-pro", "gemini-pro-vision"},
Weight: 1.0,
VertexKeyConfig: &schemas.VertexKeyConfig{
ProjectID: os.Getenv("VERTEX_PROJECT_ID"), // GCP project ID
ProjectNumber: os.Getenv("VERTEX_PROJECT_NUMBER"), // GCP project number (only when using fine-tuned models)
Region: "us-central1", // GCP region
AuthCredentials: os.Getenv("VERTEX_CREDENTIALS"), // Service account JSON
Deployments: map[string]string{
"fine-tuned-gemini-2.5-pro": "123456789"
},
},
},
}, nil
}
return nil, fmt.Errorf("provider %s not supported", provider)
}
Notes:
- You can leave both API Key and Auth Credentials empty to use service account authentication from the environment.
- You must set Project Number if using fine-tuned models.
- API Key Authentication is only supported for Gemini and fine-tuned models.
- You can use custom fine-tuned models by passing vertex/<your-fine-tuned-model-id>, or vertex/<model-deployment-alias> if you have set the deployments in the key config.
Vertex AI support for fine-tuned models is currently in beta. Requests to non-Gemini fine-tuned models may fail, so please test and report any issues.
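As a sketch of calling a fine-tuned model through the deployment alias defined above (whether the model needs the vertex/ prefix depends on how you route the request; with Provider set explicitly, the bare alias is shown here as an assumption):
response, err := client.ChatCompletionRequest(context.Background(), &schemas.BifrostChatRequest{
	Provider: schemas.Vertex,
	Model:    "fine-tuned-gemini-2.5-pro", // alias from VertexKeyConfig.Deployments
	Input:    messages,
})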
Best Practices
Keys are fetched from your GetKeysForProvider implementation on every request. Ensure your implementation is optimized for speed to avoid adding latency:
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
// ✅ Good: Fast in-memory lookup
switch provider {
case schemas.OpenAI:
return a.cachedOpenAIKeys, nil // Pre-cached keys
}
// ❌ Avoid: Database queries, API calls, complex algorithms
// This will add latency to every AI request
// keys := fetchKeysFromDatabase(provider) // Too slow!
// return processWithComplexLogic(keys) // Too slow!
return nil, fmt.Errorf("provider %s not supported", provider)
}
Recommendations:
- Cache keys in memory during application startup (see the sketch after this list)
- Use simple switch statements or map lookups
- Avoid database queries, file I/O, or network calls
- Keep complex key processing logic outside the request path
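A minimal sketch of that pre-caching pattern, using a variation of the MyAccount type from earlier that stores the keys built once at startup:
type MyAccount struct {
	cachedOpenAIKeys []schemas.Key
}

func NewMyAccount() *MyAccount {
	// Build the key list once; GetKeysForProvider then just returns the
	// cached slice, keeping the per-request path free of I/O and allocation.
	return &MyAccount{
		cachedOpenAIKeys: []schemas.Key{{
			Value:  os.Getenv("OPENAI_API_KEY"),
			Models: []string{},
			Weight: 1.0,
		}},
	}
}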
Next Steps