
Overview

Guardrails in Bifrost provide enterprise-grade content safety, security validation, and policy enforcement for LLM requests and responses. The system validates inputs and outputs in real-time against your specified policies, ensuring responsible AI deployment with comprehensive protection against harmful content, prompt injection, PII leakage, and policy violations.
[Screenshot: Guardrails overview showing rules and profiles management]

Core Concepts

Bifrost Guardrails are built around two core concepts that work together to provide flexible and powerful content protection:
| Concept | Description |
| --- | --- |
| Rules | Custom policies defined using CEL (Common Expression Language) that determine what content to validate and when. Rules can apply to inputs, outputs, or both, and can be linked to one or more profiles for evaluation. |
| Profiles | Configurations for external guardrail providers (AWS Bedrock, Azure Content Safety, GraySwan, Patronus AI). Profiles are reusable and can be shared across multiple rules. |
How They Work Together:
  • Profiles define how content is evaluated using external provider capabilities
  • Rules define when and what content gets evaluated using CEL expressions
  • A single rule can use multiple profiles for layered protection
  • Profiles can be reused across different rules for consistency

Key Features

| Feature | Description |
| --- | --- |
| Multi-Provider Support | AWS Bedrock, Azure Content Safety, GraySwan, and Patronus AI integration |
| Dual-Stage Validation | Guard both inputs (prompts) and outputs (responses) |
| Real-Time Processing | Synchronous and asynchronous validation modes |
| CEL-Based Rules | Define custom policies using Common Expression Language |
| Reusable Profiles | Configure providers once, use across multiple rules |
| Sampling Control | Apply rules to a percentage of requests for performance tuning |
| Automatic Remediation | Block, redact, or modify content based on policy |
| Comprehensive Logging | Detailed audit trails for compliance |
Access Guardrails from the Bifrost dashboard:
| Page | Path | Description |
| --- | --- | --- |
| Configuration | Guardrails > Configuration | Manage guardrail rules and their settings |
| Providers | Guardrails > Providers | Configure and manage guardrail profiles |

Architecture

The following diagram illustrates how Rules and Profiles work together to validate LLM requests.
Flow Description:
  1. Incoming Request - LLM request arrives at Bifrost
  2. Input Validation - Applicable rules evaluate the input using linked profiles
  3. LLM Processing - If input passes, request is forwarded to the LLM provider
  4. Output Validation - Response is evaluated by output rules using linked profiles
  5. Response - Validated response is returned (or blocked/modified based on violations)
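The five steps above can be sketched as a two-stage pipeline. This is a minimal illustration, not Bifrost's actual internals: rule and profile names are hypothetical, and profiles are modeled as callables that return True on a violation.

```python
# Sketch of the guardrail request flow (hypothetical model, not Bifrost internals).

def validate(rules, stage, text):
    """Run every enabled rule for a stage; content passes only if no
    linked profile reports a violation."""
    for rule in rules:
        if not rule["enabled"] or rule["apply_to"] not in (stage, "both"):
            continue
        for profile in rule["profiles"]:
            if profile(text):  # profile returns True on violation
                return False
    return True

def handle_request(rules, request_text, call_llm):
    # 1-2. Incoming request, then input validation
    if not validate(rules, "input", request_text):
        return {"error": "blocked by input guardrails"}
    # 3. Input passed: forward to the LLM provider
    response_text = call_llm(request_text)
    # 4. Output validation on the response
    if not validate(rules, "output", response_text):
        return {"error": "blocked by output guardrails"}
    # 5. Return the validated response
    return {"content": response_text}
```

The same `validate` helper runs at both stages; only the `apply_to` filter changes which rules participate.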

Supported Guardrail Providers

Bifrost integrates with leading guardrail providers to offer comprehensive protection:

AWS Bedrock Guardrails

Amazon Bedrock Guardrails provides enterprise-grade content filtering and safety features with deep AWS integration.
[Screenshot: AWS Bedrock Guardrails configuration form]
Capabilities:
  • Content Filters: Hate speech, insults, sexual content, violence, misconduct
  • Denied Topics: Block specific topics or categories
  • Word Filters: Custom profanity and sensitive word blocking
  • PII Protection: Detect and redact 50+ PII entity types
  • Contextual Grounding: Verify responses against source documents
  • Prompt Attack Detection: Identify injection and jailbreak attempts
  • Image Content Support: Analyze images in addition to text (PNG, JPEG)
Configuration Fields:
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| access_key | string | No* | - | AWS Access Key ID |
| secret_key | string | No* | - | AWS Secret Access Key |
| bedrock_api_key | string | No* | - | Alternative Bedrock API key (Bearer token) |
| guardrail_arn | string | Yes | - | ARN of the Bedrock guardrail |
| guardrail_version | string | Yes | - | Version of the guardrail (e.g., "1", "DRAFT") |
| region | string | Yes | - | AWS region |
*Either access_key + secret_key OR bedrock_api_key must be provided for authentication.
Authentication Methods:
Uses AWS SDK with static credentials:
{
  "access_key": "AKIAXXXXXXXXXXXXXXXXXX",
  "secret_key": "your-secret-access-key",
  "guardrail_arn": "arn:aws:bedrock:us-east-1:123456789:guardrail/abc123",
  "guardrail_version": "1",
  "region": "us-east-1"
}
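The alternative Bearer-token method uses bedrock_api_key in place of the access/secret pair. A sketch with placeholder values, assuming the remaining fields are unchanged:

```json
{
  "bedrock_api_key": "your-bedrock-api-key",
  "guardrail_arn": "arn:aws:bedrock:us-east-1:123456789:guardrail/abc123",
  "guardrail_version": "1",
  "region": "us-east-1"
}
```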
Supported AWS Regions:
| Region Code | Region Name |
| --- | --- |
| us-east-1 | US East (N. Virginia) |
| us-east-2 | US East (Ohio) |
| us-west-1 | US West (N. California) |
| us-west-2 | US West (Oregon) |
| ap-south-1 | Asia Pacific (Mumbai) |
| ap-northeast-1 | Asia Pacific (Tokyo) |
| ap-northeast-2 | Asia Pacific (Seoul) |
| ap-southeast-1 | Asia Pacific (Singapore) |
| ap-southeast-2 | Asia Pacific (Sydney) |
| eu-central-1 | Europe (Frankfurt) |
| eu-west-1 | Europe (Ireland) |
| eu-west-2 | Europe (London) |
| eu-west-3 | Europe (Paris) |
Supported Content Types:
  • Text content
  • Images (PNG, JPEG formats)
Usage Metrics Returned:
Bedrock guardrails return detailed usage metrics for cost tracking and monitoring:
| Metric | Description |
| --- | --- |
| content_policy_units | Units consumed by content policy evaluation |
| contextual_grounding_policy_units | Units for grounding checks |
| sensitive_information_policy_units | Units for PII detection |
| topic_policy_units | Units for topic filtering |
| word_policy_units | Units for word filtering |
| automated_reasoning_policy_units | Units for reasoning checks |
| content_policy_image_units | Units for image content analysis |
Supported PII Types:
  • Personal identifiers (SSN, passport, driver’s license)
  • Financial information (credit cards, bank accounts)
  • Contact information (email, phone, address)
  • Medical information (health records, insurance)
  • Device identifiers (IP addresses, MAC addresses)

Azure Content Safety

Azure AI Content Safety provides multi-modal content moderation powered by Microsoft's advanced AI models.
[Screenshot: Azure Content Safety configuration form]
Capabilities:
  • Severity-Based Filtering: 4-level severity classification (Safe, Low, Medium, High)
  • Multi-Category Detection: Hate, sexual, violence, self-harm content
  • Prompt Shield: Advanced jailbreak and injection detection
  • Indirect Attack Detection: Identify hidden malicious instructions
  • Protected Material: Detect copyrighted content (output only)
  • Custom Blocklists: Define organization-specific blocked terms
Configuration Fields:
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| endpoint | string | Yes | - | Azure Content Safety endpoint URL |
| api_key | string | Yes | - | Azure subscription key |
| analyze_enabled | boolean | No | true | Enable content analysis for Hate, Sexual, Violence, SelfHarm |
| analyze_severity_threshold | enum | No | "medium" | Severity level to trigger: low, medium, or high |
| jailbreak_shield_enabled | boolean | No | false | Enable jailbreak detection (input only) |
| indirect_attack_shield_enabled | boolean | No | false | Enable indirect prompt attack detection (input only) |
| copyright_enabled | boolean | No | false | Enable copyrighted content detection (output only) |
| text_blocklist_enabled | boolean | No | false | Enable custom blocklist filtering |
| blocklist_names | array | No | - | List of Azure blocklist names to apply |
Severity Threshold Levels:
| Threshold | Numeric Value | Behavior |
| --- | --- | --- |
| low | 2 | Most strict - blocks severity 2 and above |
| medium | 4 | Balanced - blocks severity 4 and above |
| high | 6 | Least strict - blocks only severity 6 |
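The threshold table reduces to a simple comparison. A sketch of that logic (not the Azure SDK; Azure reports category severities as 0, 2, 4, or 6):

```python
# Map each configurable threshold to the minimum severity it blocks,
# per the table above (illustrative sketch, not the Azure SDK).
THRESHOLDS = {"low": 2, "medium": 4, "high": 6}

def is_blocked(category_severity: int, threshold: str = "medium") -> bool:
    """Block when the reported severity meets or exceeds the threshold."""
    return category_severity >= THRESHOLDS[threshold]
```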
Detection Categories:
  • Hate and fairness
  • Sexual content
  • Violence
  • Self-harm
Input-only features: Jailbreak Shield and Indirect Attack Shield apply only to input validation.
Output-only features: Copyright detection applies only to output validation.

Patronus AI

Patronus AI specializes in LLM security and safety with advanced evaluation capabilities.
Capabilities:
  • Hallucination Detection: Identify factually incorrect responses
  • PII Detection: Comprehensive personal data identification
  • Toxicity Screening: Multi-language toxic content detection
  • Prompt Injection Defense: Advanced attack pattern recognition
  • Custom Evaluators: Build organization-specific safety checks
  • Real-Time Monitoring: Continuous safety validation
Advanced Features:
  • Context-aware evaluation
  • Multi-turn conversation analysis
  • Custom policy templates
  • Integration with existing safety workflows

GraySwan Cygnal

GraySwan Cygnal Monitor provides AI safety monitoring with natural language rule definitions and advanced threat detection capabilities.
[Screenshot: GraySwan configuration form]
Capabilities:
  • Violation Scoring: Continuous 0-1 scale violation detection with configurable thresholds
  • Custom Natural Language Rules: Define safety rules in plain English without code
  • Policy Management: Use pre-built policies from GraySwan platform or create custom ones
  • Indirect Prompt Injection (IPI) Detection: Identify hidden instructions in user inputs
  • Mutation Detection: Detect attempts to manipulate or alter content
  • Reasoning Modes: Choose from fast (“off”), balanced (“hybrid”), or thorough (“thinking”) analysis
Configuration Fields:
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| api_key | string | Yes | - | GraySwan API key |
| violation_threshold | number | No | 0.5 | Score threshold (0-1) for triggering intervention. Lower values are more strict. |
| reasoning_mode | enum | No | "off" | Analysis depth: off (fastest), hybrid (balanced), or thinking (most thorough) |
| policy_id | string | No | - | Single custom policy ID from GraySwan platform |
| policy_ids | array | No | - | Multiple policy IDs for aggregated rule evaluation |
| rules | object | No | - | Custom natural language rules as key-value pairs |
Custom Rules Example:
[Screenshot: GraySwan custom rules]
Rules are defined as key-value pairs where the key is the rule name and the value is a natural language description:
{
  "rules": {
    "no_profanity": "Do not allow profanity or vulgar language",
    "no_pii": "Do not allow personally identifiable information",
    "professional_tone": "Ensure all responses maintain a professional tone"
  }
}
Detection Features:
  • Real-time violation scoring
  • Multi-rule evaluation
  • IPI attack detection
  • Content mutation monitoring
  • Detailed violation descriptions with rule attribution
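Since Cygnal reports a continuous 0-1 violation score, the intervention decision against violation_threshold reduces to a comparison. A sketch (the exact boundary handling at the threshold is an assumption):

```python
def should_intervene(violation_score: float, violation_threshold: float = 0.5) -> bool:
    """Intervene when the score meets or exceeds the threshold.
    Lowering the threshold makes the guardrail stricter (sketch,
    not GraySwan's implementation)."""
    return violation_score >= violation_threshold
```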

Guardrail Rules

Guardrail Rules are custom policies that define when and how content validation occurs. Rules use CEL (Common Expression Language) expressions to evaluate requests and can be linked to one or more profiles for execution.
[Screenshot: Guardrail rules list showing configured rules with status and actions]

Rule Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| id | integer | Yes | Unique identifier for the rule |
| name | string | Yes | Descriptive name for the rule |
| description | string | No | Explanation of what the rule does |
| enabled | boolean | Yes | Whether the rule is active |
| cel_expression | string | Yes | CEL expression for rule evaluation |
| apply_to | enum | Yes | When to apply: input, output, or both |
| sampling_rate | integer | No | Percentage of requests to evaluate (0-100) |
| timeout | integer | No | Execution timeout in milliseconds |
| provider_config_ids | array | No | IDs of profiles to use for evaluation |
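Put together, a rule object might look like the following sketch (all values are illustrative, not defaults):

```json
{
  "id": 1,
  "name": "Block PII in Prompts",
  "description": "Blocks prompts containing personally identifiable information",
  "enabled": true,
  "cel_expression": "request.messages.exists(m, m.role == \"user\")",
  "apply_to": "input",
  "sampling_rate": 100,
  "timeout": 5000,
  "provider_config_ids": [1, 2]
}
```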

Creating Rules

  1. Navigate to Rules
    • Go to Guardrails > Configuration
    • Click Add Rule
[Screenshot: Guardrail rules list showing configured rules with status and actions]
  2. Configure Rule Settings
Basic Information:
  • Name: Enter a descriptive name (e.g., "Block PII in Prompts")
  • Description: Explain the rule's purpose
  • Enabled: Toggle to activate the rule
Evaluation Settings:
  • Apply To: Select when to apply the rule
    • input - Validate incoming prompts only
    • output - Validate LLM responses only
    • both - Validate both inputs and outputs
  • CEL Expression: Define the validation logic
  • Sampling Rate: Set percentage of requests to evaluate (default: 100%)
  • Timeout: Set maximum execution time in milliseconds
  3. Link Profiles
    • Select one or more profiles to use for evaluation
    • Rules will execute all linked profiles in sequence
  4. Save and Test
    • Click Save Rule
    • Use the Test button to validate with sample content

CEL Expression Examples

CEL (Common Expression Language) provides a powerful way to define rule conditions. Here are common patterns:
Always Apply Rule:
true
Apply to User Messages Only:
request.messages.exists(m, m.role == "user")
Apply to Messages Containing Keywords:
request.messages.exists(m, m.content.contains("confidential"))
Apply Based on Model:
request.model.startsWith("gpt-4")
Apply to Long Prompts:
request.messages.filter(m, m.role == "user").map(m, m.content.size()).sum() > 1000
Combine Multiple Conditions:
request.model.startsWith("gpt-4") && request.messages.exists(m, m.role == "user" && m.content.size() > 500)
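To make the CEL semantics concrete, the long-prompt expression above is equivalent to the following Python over a plain request dict (a sketch for illustration, not the CEL runtime):

```python
def is_long_prompt(request: dict, limit: int = 1000) -> bool:
    """Python equivalent of the CEL expression:
    request.messages.filter(m, m.role == "user")
           .map(m, m.content.size()).sum() > 1000
    """
    user_lengths = [len(m["content"]) for m in request["messages"]
                    if m["role"] == "user"]
    return sum(user_lengths) > limit
```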

Linking Rules to Profiles

Rules can be linked to multiple profiles for comprehensive validation:
[Screenshot: Rule configuration showing linked profiles]
Best Practices:
  • Link PII detection rules to profiles with PII capabilities (Bedrock, Patronus)
  • Link content filtering rules to profiles with content safety features (Azure, Bedrock, GraySwan)
  • Use GraySwan for custom natural language rules when you need flexible, readable policies
  • Use multiple profiles for defense-in-depth (e.g., Bedrock + Patronus for PII, Azure + GraySwan for content)
  • Set appropriate timeouts when using multiple profiles

Managing Profiles

Profiles are reusable configurations for external guardrail providers. Each profile contains provider-specific settings including credentials, endpoints, and detection thresholds.
[Screenshot: Guardrail profiles list showing configured providers]

Profile Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| id | integer | Yes | Unique identifier for the profile |
| provider_name | string | Yes | Provider type: bedrock, azure, grayswan, patronus_ai |
| policy_name | string | Yes | Descriptive name for the policy |
| enabled | boolean | Yes | Whether the profile is active |
| config | object | No | Provider-specific configuration |

Creating Profiles

  1. Navigate to Providers
    • Go to Guardrails > Providers
    • Click Add Profile
[Screenshot: Create guardrail profile form]
  2. Select Provider Type
    • Choose from: AWS Bedrock, Azure Content Safety, GraySwan, or Patronus AI
  3. Configure Provider Settings
    • Enter credentials and endpoint information
    • Configure detection thresholds and actions
    • See provider-specific setup sections above for detailed configuration
  4. Save Profile
    • Click Save Profile
    • The profile is now available for linking to rules

Provider Capabilities

Each provider offers different capabilities. Choose profiles based on your validation needs:
| Capability | AWS Bedrock | Azure Content Safety | GraySwan | Patronus AI |
| --- | --- | --- | --- | --- |
| PII Detection | Yes | No | No | Yes |
| Content Filtering | Yes | Yes | Yes | Yes |
| Prompt Injection | Yes | Yes | Yes | Yes |
| Hallucination Detection | No | No | No | Yes |
| Toxicity Screening | Yes | Yes | Yes | Yes |
| Custom Policies | Yes | Yes | Yes | Yes |
| Custom Natural Language Rules | No | No | Yes | No |
| Image Support | Yes | No | No | No |
| IPI Detection | No | Yes | Yes | No |
| Mutation Detection | No | No | Yes | No |

Best Practices

Profile Organization:
  • Create separate profiles for different use cases (PII, content filtering, etc.)
  • Use descriptive policy names that indicate the profile’s purpose
  • Keep credentials secure using environment variables
Performance Considerations:
  • Enable only the profiles you need to minimize latency
  • Use sampling rates on rules for high-traffic endpoints
  • Set appropriate timeouts to prevent slow requests
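One common way to implement a percentage sampling rate is a deterministic hash on a stable request attribute, so the same request ID always gets the same decision. This is a generic sketch of the technique, not how Bifrost implements sampling:

```python
import hashlib

def should_evaluate(request_id: str, sampling_rate: int) -> bool:
    """Evaluate the rule for roughly `sampling_rate` percent of requests.
    Hashing the request ID keeps the decision deterministic per request
    (illustrative sketch, not Bifrost's implementation)."""
    if sampling_rate >= 100:
        return True
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = digest[0] * 256 + digest[1]  # uniform value in 0..65535
    return bucket % 100 < sampling_rate
```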
Security:
  • Store API keys and credentials in environment variables or secrets managers
  • Regularly rotate credentials
  • Use least-privilege IAM roles for AWS Bedrock

Using Guardrails in Requests

Attaching Guardrails to API Calls

Once configured, attach guardrails to your LLM requests using custom headers.
Single Guardrail:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-id: bedrock-prod-guardrail" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Multiple Guardrails (Sequential):
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-ids: bedrock-prod-guardrail,azure-content-safety-001" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Guardrail Configuration in Request:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ],
    "bifrost_config": {
      "guardrails": {
        "input": ["bedrock-prod-guardrail"],
        "output": ["patronus-ai-001"],
        "async": false
      }
    }
  }'
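The curl examples above translate directly into client code. A minimal Python sketch that builds the headers and body for the multi-guardrail header form (the guardrail IDs are the illustrative ones from the examples):

```python
import json

def build_guardrail_request(content: str, guardrail_ids: list[str]) -> tuple[dict, bytes]:
    """Build the headers and JSON body for a chat completion with
    guardrails attached via the x-bf-guardrail-ids header, mirroring
    the curl examples above (sketch)."""
    headers = {
        "Content-Type": "application/json",
        "x-bf-guardrail-ids": ",".join(guardrail_ids),
    }
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": content}],
    }).encode()
    return headers, body
```

POST the returned body with those headers to `http://localhost:8080/v1/chat/completions` using any HTTP client.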

Guardrail Response Handling

Successful Validation (200):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'd be happy to help you with your task..."
      },
      "finish_reason": "stop"
    }
  ],
  "extra_fields": {
    "guardrails": {
      "input_validation": {
        "guardrail_id": "bedrock-prod-guardrail",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 245
      },
      "output_validation": {
        "guardrail_id": "patronus-ai-001",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 312
      }
    }
  }
}
Validation Failure - Blocked (446):
{
  "error": {
    "message": "Request blocked by guardrails",
    "type": "guardrail_violation",
    "code": 446,
    "details": {
      "guardrail_id": "bedrock-prod-guardrail",
      "validation_stage": "input",
      "violations": [
        {
          "type": "PII",
          "category": "SSN",
          "severity": "HIGH",
          "action": "block",
          "text_excerpt": "My SSN is ***-**-****"
        },
        {
          "type": "prompt_injection",
          "severity": "CRITICAL",
          "action": "block",
          "confidence": 0.95
        }
      ],
      "processing_time_ms": 198
    }
  }
}
Validation Warning - Logged (246):
{
  "id": "chatcmpl-def456",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response with redacted content..."
      },
      "finish_reason": "stop"
    }
  ],
  "bifrost_metadata": {
    "guardrails": {
      "output_validation": {
        "guardrail_id": "azure-content-safety-001",
        "status": "warning",
        "violations": [
          {
            "type": "profanity",
            "severity": "LOW",
            "action": "redact",
            "modifications": 2
          }
        ],
        "processing_time_ms": 187
      }
    }
  }
}
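Client code can branch on the three response shapes above: 200 (passed), 246 (passed with warnings), and 446 (blocked). A hypothetical helper illustrating that handling (not an official client; field paths follow the 446 example above):

```python
def handle_guardrail_response(status_code: int, body: dict) -> str:
    """Classify a Bifrost response by guardrail outcome, per the
    status codes documented above (sketch)."""
    if status_code == 446:
        violations = body["error"]["details"]["violations"]
        kinds = ", ".join(v["type"] for v in violations)
        return f"blocked: {kinds}"
    if status_code == 246:
        return "passed with warnings (content may have been redacted)"
    if status_code == 200:
        return "passed"
    return f"unexpected status {status_code}"
```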