Overview
Want to see Bifrost's performance in your specific environment? The Bifrost Benchmarking Repository provides everything you need to conduct comprehensive performance tests tailored to your infrastructure and workload requirements.

What You Can Test:
- Custom Instance Sizes - Test on your preferred AWS/GCP/Azure instances
- Your Workload Patterns - Use your actual request/response sizes
- Different Configurations - Compare various Bifrost settings
- Provider Comparisons - Benchmark against other AI gateways
- Load Scenarios - Test burst loads, sustained traffic, and endurance
💡 Open Source: The benchmarking tool is completely open source! Feel free to submit pull requests if you think anything is missing or could be improved.
Prerequisites
Before running benchmarks, ensure you have:
- Go 1.23+ installed on your testing machine
- Bifrost instance running and accessible
- Target API providers configured (OpenAI, Anthropic, etc.)
- Network access between benchmark tool and Bifrost
- Sufficient resources on the testing machine to generate load
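A quick way to sanity-check these prerequisites before running anything (port 8080 is an assumption; substitute the port your Bifrost instance actually listens on):

```shell
# Confirm the Go toolchain is new enough (should report go1.23 or later)
go version

# Confirm Bifrost is reachable from the testing machine; any HTTP status
# code (even an error) means the network path works
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/v1/chat/completions
```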
Quick Start
1. Clone the Repository
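Assuming Git is installed; the URL below is the expected public location of the benchmarking repository, so double-check it against the Bifrost docs:

```shell
# Clone the benchmarking repository and enter it
git clone https://github.com/maximhq/bifrost-benchmarking.git
cd bifrost-benchmarking
```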
2. Build the Benchmark Tool
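With Go 1.23+ installed, this is a standard Go build from the repository root (the `-o` name matches the executable referenced below):

```shell
# Build the benchmark binary in the repository root
go build -o benchmark .
```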
This produces a benchmark executable (benchmark.exe on Windows).
3. Run Your First Benchmark
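Using the flags documented in the next section, a minimal first run against a local Bifrost instance might look like this (the port is an assumption; use your own):

```shell
# 500 req/s for 10 seconds (the documented defaults), writing results.json
./benchmark -provider bifrost -port 8080
```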
Configuration Options
The benchmark tool offers extensive configuration through command-line flags.

Basic Configuration
| Flag | Required | Description | Default |
|---|---|---|---|
| -provider <name> | Yes | Provider name (e.g., bifrost, litellm) | None |
| -port <number> | Yes | Port number of your Bifrost instance | None |
| -endpoint <path> | No | API endpoint path | v1/chat/completions |
| -rate <number> | No | Requests per second | 500 |
| -duration <seconds> | No | Test duration in seconds | 10 |
| -output <filename> | No | Results output file | results.json |
Advanced Configuration
| Flag | Description | Default |
|---|---|---|
| -include-provider-in-request | Include provider name in request payload | false |
| -big-payload | Use larger, more complex request payloads | false |
Benchmark Scenarios
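As a sketch, the scenarios below map onto invocations like the following (ports, rates, durations, and output filenames are all illustrative):

```shell
# 1. Basic performance test -- default rate and duration
./benchmark -provider bifrost -port 8080

# 2. High-load stress test -- push well past the default 500 req/s
./benchmark -provider bifrost -port 8080 -rate 5000 -duration 30

# 3. Large payload test -- bigger request/response bodies
./benchmark -provider bifrost -port 8080 -big-payload

# 4. Endurance test -- long duration to surface leaks and drift
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 600

# 5. Comparative benchmarking -- identical load against another gateway
./benchmark -provider bifrost -port 8080 -rate 1000 -output bifrost.json
./benchmark -provider litellm -port 4000 -rate 1000 -output litellm.json
```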
1. Basic Performance Test
Test standard performance with typical request sizes using the default flags.

2. High-Load Stress Test
Push your instance to its limits by raising -rate well above your expected production load.

3. Large Payload Test
Test with bigger request/response sizes by passing -big-payload.

4. Endurance Test
Run a long-running stability test by extending -duration well beyond the 10-second default and watching memory over the run.

5. Comparative Benchmarking
Compare Bifrost against other gateways by re-running the same load with a different -provider and -port.

Understanding Results
The benchmark tool generates detailed JSON results with comprehensive metrics.

Key Metrics Explained
Critical Performance Indicators
Success Rate:
- Target: >99.9% for production readiness
- Excellent: 100% (perfect reliability)

Latency:
- P50 (Median): Typical user experience
- P99: Worst-case user experience
- Mean: Overall average performance

Memory:
- Peak: Maximum memory consumption
- Average: Sustained memory usage
- After - Before: Memory growth during test
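The "After - Before" figure is a simple subtraction of the reported memory values; with illustrative numbers:

```shell
# Memory growth over the test = heap after - heap before (values illustrative)
before_mb=85
after_mb=132
echo "memory growth: $((after_mb - before_mb)) MB"   # prints: memory growth: 47 MB
```

Flat growth across a long endurance run is a good sign; steady growth suggests a leak worth investigating.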
Instance Sizing Recommendations
Based on your benchmark results, use these guidelines for production sizing.

Resource Planning Matrix
| Target RPS | Memory Usage | Recommended Instance | Notes |
|---|---|---|---|
| < 1,000 | < 1GB | t3.small | Cost-effective for light loads |
| 1,000 - 3,000 | 1-2GB | t3.medium | Balanced performance/cost |
| 3,000 - 5,000 | 2-4GB | t3.large | High-performance production |
| 5,000+ | 3-6GB | t3.xlarge+ | Enterprise/mission-critical |
Configuration Tuning Based on Results
If seeing high latency:
- Increase initial_pool_size
- Increase buffer_size
- Consider a larger instance

If seeing high memory usage:
- Decrease initial_pool_size
- Optimize buffer_size
- Monitor for memory leaks

If seeing request errors or timeouts:
- Reduce the request rate
- Increase timeout settings
- Check provider rate limits
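For reference, a hypothetical config fragment showing where these knobs might be set (the surrounding structure and values are assumptions; consult your actual Bifrost configuration file for the real layout):

```json
{
  "client": {
    "initial_pool_size": 500,
    "buffer_size": 1000
  }
}
```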
Advanced Testing Scenarios
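The three patterns below can be sketched with ordinary shell around the benchmark binary (ports, rates, iteration counts, and filenames are all illustrative):

```shell
# Burst load: alternate short high-rate spikes with normal-rate recovery runs
for i in 1 2 3; do
  ./benchmark -provider bifrost -port 8080 -rate 3000 -duration 10 -output "burst_$i.json"
  ./benchmark -provider bifrost -port 8080 -rate 500  -duration 30 -output "normal_$i.json"
done

# Multi-instance: drive two Bifrost instances in parallel
./benchmark -provider bifrost -port 8080 -output node1.json &
./benchmark -provider bifrost -port 8081 -output node2.json &
wait

# Payload sizes: identical load, small vs. large payloads
./benchmark -provider bifrost -port 8080 -output small.json
./benchmark -provider bifrost -port 8080 -big-payload -output large.json
```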
Burst Load Testing
Simulate traffic spikes by alternating short high-rate runs with longer runs at your normal rate.

Multi-Instance Testing
Test horizontal scaling by running the same benchmark against several Bifrost instances on different ports.

Different Payload Sizes
Compare performance across payload sizes by running the same load with and without -big-payload.

Continuous Benchmarking
Automated Testing Pipeline
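One minimal way to automate the pipeline is a scheduled run with dated output files; a sketch assuming a cron-capable host (path, schedule, and port are illustrative):

```shell
# Append a nightly 60-second run to the crontab
# (% must be escaped as \% inside a crontab entry)
( crontab -l 2>/dev/null; \
  echo '0 2 * * * cd /opt/bifrost-benchmarking && ./benchmark -provider bifrost -port 8080 -duration 60 -output results_$(date +\%F).json' \
) | crontab -
```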
Set up regular performance regression testing on a schedule (for example, nightly from cron or CI) so regressions surface before they reach production.

Performance Monitoring Integration
Monitor key metrics over time:
- Success rate trends
- Latency percentile changes
- Memory usage patterns
- Throughput capacity
Troubleshooting
Common Issues
Connection Refused:
- Check that PORT is defined in the .env file at the repository root
High Error Rates:
- Check provider API key limits
- Verify Bifrost configuration
- Monitor upstream provider status

Poor Performance:
- Reduce the request rate for a baseline test
- Monitor system resources during testing
- Check for memory leaks in long tests
- Adjust Bifrost pool sizes

Inconsistent Results:
- Run multiple test iterations
- Account for network variability
- Use longer test durations (60+ seconds)
- Isolate the testing environment
- Try pointing gateway requests at a mock provider
Next Steps
After Running Benchmarks
- Analyze Results: Compare against official benchmarks
- Optimize Configuration: Tune based on your specific results
- Plan Capacity: Size instances based on measured performance
- Set Up Monitoring: Track key metrics in production
Compare Results
- t3.medium Performance - Compare against medium instance results
- t3.xlarge Performance - Compare against high-performance configuration

