> ## Documentation Index
> Fetch the complete documentation index at: https://docs.getbifrost.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Vertex AI

> Google Vertex AI API conversion guide - multi-model support, OAuth2 authentication, project/region configuration

## Overview

Vertex AI is Google's unified ML platform providing access to Google's Gemini models, Anthropic Claude models, and other third-party LLMs through a single API. Bifrost performs conversions including:

* **Multi-model support** - Unified interface for Gemini, Anthropic, and third-party models
* **OAuth2 authentication** - Service account credentials with automatic token refresh
* **Project and region management** - Automatic endpoint construction from GCP project/region
* **Model routing** - Automatic provider detection (Gemini vs Anthropic) based on model name
* **Request conversion** - Conversion to underlying provider format (Gemini or Anthropic)
* **Embeddings support** - Vector generation with task type and truncation options
* **Model discovery** - Paginated model listing with deployment information

### Supported Operations

| Operation            | Non-Streaming | Streaming | Endpoint                                  |
| -------------------- | ------------- | --------- | ----------------------------------------- |
| Chat Completions     | ✅             | ✅         | `/generate`                               |
| Responses API        | ✅             | ✅         | `/messages`                               |
| Embeddings           | ✅             | -         | `/embeddings`                             |
| Image Generation     | ✅             | -         | `/generateContent` or `/predict` (Imagen) |
| Image Edit           | ✅             | -         | `/generateContent` or `/predict` (Imagen) |
| Video Generation     | ✅             | -         | `/predictLongRunning` (Veo models only)   |
| Image Variation      | ❌             | -         | Not supported                             |
| List Models          | ✅             | -         | `/models`                                 |
| Text Completions     | ❌             | ❌         | -                                         |
| Speech (TTS)         | ❌             | ❌         | -                                         |
| Transcriptions (STT) | ❌             | ❌         | -                                         |
| Files                | ❌             | ❌         | -                                         |
| Batch                | ❌             | ❌         | -                                         |

<Note>
  **Unsupported Operations** (❌): Text Completions, Speech, Transcriptions, Files, and Batch are not supported by Vertex AI. These return `UnsupportedOperationError`.

  **Vertex-specific**: Endpoints vary by model type. Responses API available for both Gemini and Anthropic models.
</Note>

***

## Setup & Configuration

Vertex AI requires Google Cloud project configuration and authentication credentials. Three authentication methods are supported.

<Note>
  The `aliases` field (mapping model names to fine-tuned model IDs or endpoint
  identifiers) requires **v1.5.0-prerelease2 or later**. On v1.4.x, use
  `deployments` inside `vertex_key_config` instead - see the [v1.5.0 Migration
  Guide](/migration-guides/v1.5.0#breaking-change-9-provider-deployments-removed-migrate-to-aliases)
  for details.
</Note>

### 1. Service Account JSON (Recommended for Production)

Provide a credential JSON string in `auth_credentials`. The JSON must contain a `type` field. Supported types: `service_account` (most common), `impersonated_service_account`, `authorized_user`, `external_account`, `external_account_authorized_user`.

<Tabs>
  <Tab title="Web UI">
    <Frame>
      <img src="https://mintcdn.com/bifrost/yMrf9PoN6zx_eXsX/media/ui-vertex-service-account-auth-setup.png?fit=max&auto=format&n=yMrf9PoN6zx_eXsX&q=85&s=cd1cd5c52d5e10bd9c2393b42bb1b105" alt="Google Vertex AI Service Account (JSON) authentication setup in the Bifrost Web UI showing Project ID, Region, and Auth Credentials fields" width="3492" height="2366" data-path="media/ui-vertex-service-account-auth-setup.png" />
    </Frame>

    1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
    2. Click **"Add Key"** (or edit an existing key)
    3. Under **Authentication Method**, select **"Service Account (JSON)"**
    4. Set **Project ID**: Your Google Cloud project ID
    5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
    6. Set **Region**: e.g., `us-central1`
    7. Set **Auth Credentials**: Paste your service account JSON or reference an env var (e.g., `env.VERTEX_CREDENTIALS`)
    8. Configure **Aliases**: Map model names to fine-tuned model IDs (if using fine-tuned models)
    9. Save
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    # Step 1: Create the provider
    curl -X POST http://localhost:8080/api/providers \
      -H "Content-Type: application/json" \
      -d '{"provider": "vertex"}'

    # Step 2: Create a key (Service Account JSON)
    curl -X POST http://localhost:8080/api/providers/vertex/keys \
      -H "Content-Type: application/json" \
      -d '{
        "name": "vertex-sa-key",
        "value": "",
        "models": ["*"],
        "weight": 1.0,
        "vertex_key_config": {
          "project_id": "env.VERTEX_PROJECT_ID",
          "region": "us-central1",
          "auth_credentials": "env.VERTEX_CREDENTIALS"
        }
      }'
    ```

    <Note>
      **On v1.4.x**, two differences apply: - Pass `keys` directly in the `POST
              /api/providers` body - there is no separate `/api/providers/{provider}/keys`
      endpoint. - Use `deployments` inside `vertex_key_config` instead of the
      top-level `aliases` field for fine-tuned model mappings.
    </Note>
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "providers": {
        "vertex": {
          "keys": [
            {
              "name": "vertex-sa-key",
              "value": "",
              "models": ["*"],
              "weight": 1.0,
              "vertex_key_config": {
                "project_id": "env.VERTEX_PROJECT_ID",
                "region": "us-central1",
                "auth_credentials": "env.VERTEX_CREDENTIALS"
              }
            }
          ]
        }
      }
    }
    ```

    <Note>
      On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the
      top-level `aliases` field for fine-tuned model mappings.
    </Note>
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
        switch provider {
        case schemas.Vertex:
            return []schemas.Key{
                {
                    Value:  schemas.EnvVar{}, // Leave empty when using service account credentials
                    Models: []string{"*"},
                    Weight: 1.0,
                    VertexKeyConfig: &schemas.VertexKeyConfig{
                        ProjectID:       *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                        Region:          *schemas.NewEnvVar("us-central1"),
                        AuthCredentials: *schemas.NewEnvVar("env.VERTEX_CREDENTIALS"), // full service account JSON
                    },
                },
            }, nil
        }
        return nil, fmt.Errorf("provider %s not supported", provider)
    }
    ```
  </Tab>
</Tabs>

### 2. Application Default Credentials

Leave `auth_credentials` empty. Bifrost calls `google.FindDefaultCredentials()` - Google's ADC library - which resolves credentials in this order:

1. `GOOGLE_APPLICATION_CREDENTIALS` env var (path to a JSON credential file)
2. Application default credential file (`~/.config/gcloud/application_default_credentials.json`, written by `gcloud auth application-default login`)
3. GCE/GKE/Cloud Run/App Engine metadata server (attached service account or Workload Identity)

<Tabs>
  <Tab title="Web UI">
    <Frame>
      <img src="https://mintcdn.com/bifrost/yMrf9PoN6zx_eXsX/media/ui-vertex-default-service-account-auth-setup.png?fit=max&auto=format&n=yMrf9PoN6zx_eXsX&q=85&s=fd3dcc2e41084667b7c8d4cc32ee4b74" alt="Google Vertex AI Application Default Credentials setup in the Bifrost Web UI showing Project ID and Region fields with no credential inputs" width="3492" height="2368" data-path="media/ui-vertex-default-service-account-auth-setup.png" />
    </Frame>

    1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
    2. Click **"Add Key"** (or edit an existing key)
    3. Under **Authentication Method**, select **"Service Account (Attached)"**
    4. Set **Project ID**: Your Google Cloud project ID
    5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
    6. Set **Region**: e.g., `us-central1`
    7. Configure **Aliases** if needed
    8. Save

    Ensure `GOOGLE_APPLICATION_CREDENTIALS` is set in your environment, or that Workload Identity / gcloud is configured.
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    # Step 1: Create the provider
    curl -X POST http://localhost:8080/api/providers \
      -H "Content-Type: application/json" \
      -d '{"provider": "vertex"}'

    # Step 2: Create a key (Application Default Credentials)
    curl -X POST http://localhost:8080/api/providers/vertex/keys \
      -H "Content-Type: application/json" \
      -d '{
        "name": "vertex-adc-key",
        "value": "",
        "models": ["*"],
        "weight": 1.0,
        "vertex_key_config": {
          "project_id": "env.VERTEX_PROJECT_ID",
          "region": "us-central1",
          "auth_credentials": ""
        }
      }'
    ```

    <Note>
      **On v1.4.x**, pass `keys` directly in the `POST /api/providers` body - there
      is no separate `/api/providers/{provider}/keys` endpoint.
    </Note>
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "providers": {
        "vertex": {
          "keys": [
            {
              "name": "vertex-adc-key",
              "value": "",
              "models": ["*"],
              "weight": 1.0,
              "vertex_key_config": {
                "project_id": "env.VERTEX_PROJECT_ID",
                "region": "us-central1",
                "auth_credentials": ""
              }
            }
          ]
        }
      }
    }
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
        switch provider {
        case schemas.Vertex:
            return []schemas.Key{
                {
                    Value:  schemas.EnvVar{},
                    Models: []string{"*"},
                    Weight: 1.0,
                    VertexKeyConfig: &schemas.VertexKeyConfig{
                        ProjectID: *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                        Region:    *schemas.NewEnvVar("us-central1"),
                        // Leave AuthCredentials empty - uses Application Default Credentials
                    },
                },
            }, nil
        }
        return nil, fmt.Errorf("provider %s not supported", provider)
    }
    ```
  </Tab>
</Tabs>

### 3. API Key (Gemini and Fine-Tuned Models Only)

Set `value` to your Vertex API key. API key authentication is supported only for Gemini models and fine-tuned Gemini models. For Anthropic models on Vertex, use Service Account or Application Default Credentials.

<Tabs>
  <Tab title="Web UI">
    <Frame>
      <img src="https://mintcdn.com/bifrost/yMrf9PoN6zx_eXsX/media/ui-vertex-api-key-auth-setup.png?fit=max&auto=format&n=yMrf9PoN6zx_eXsX&q=85&s=b8e06f34f209af1bcf312568827e325b" alt="Google Vertex AI API Key authentication setup in the Bifrost Web UI showing API Key, Project ID, Region, and Project Number fields" width="3492" height="2366" data-path="media/ui-vertex-api-key-auth-setup.png" />
    </Frame>

    1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
    2. Click **"Add Key"** (or edit an existing key)
    3. Under **Authentication Method**, select **"API Key"**
    4. Set **API Key**: Your Vertex AI API key
    5. Set **Project ID**: Your Google Cloud project ID
    6. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
    7. Set **Region**: e.g., `us-central1`
    8. Configure **Aliases**: Map short names to fine-tuned model IDs (e.g., `my-model` → `123456789`)
    9. Save
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    # Step 1: Create the provider
    curl -X POST http://localhost:8080/api/providers \
      -H "Content-Type: application/json" \
      -d '{"provider": "vertex"}'

    # Step 2: Create a key (API Key - Gemini + fine-tuned models)
    curl -X POST http://localhost:8080/api/providers/vertex/keys \
      -H "Content-Type: application/json" \
      -d '{
        "name": "vertex-api-key",
        "value": "env.VERTEX_API_KEY",
        "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
        "weight": 1.0,
        "aliases": {
          "my-fine-tuned-model": "123456789"
        },
        "vertex_key_config": {
          "project_id": "env.VERTEX_PROJECT_ID",
          "project_number": "env.VERTEX_PROJECT_NUMBER",
          "region": "us-central1"
        }
      }'
    ```

    <Note>
      **On v1.4.x**, two differences apply:

      * Pass `keys` directly in the `POST /api/providers` body - there is no separate `/api/providers/{provider}/keys` endpoint.
      * Replace the top-level `aliases` with `"deployments"` inside `vertex_key_config`:

      ```json theme={null}
      "vertex_key_config": {
        "project_id": "env.VERTEX_PROJECT_ID",
        "region": "us-central1",
        "deployments": {
          "my-fine-tuned-model": "123456789"
        }
      }
      ```
    </Note>
  </Tab>

  <Tab title="config.json">
    ```json theme={null}
    {
      "providers": {
        "vertex": {
          "keys": [
            {
              "name": "vertex-api-key",
              "value": "env.VERTEX_API_KEY",
              "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
              "weight": 1.0,
              "aliases": {
                "my-fine-tuned-model": "123456789"
              },
              "vertex_key_config": {
                "project_id": "env.VERTEX_PROJECT_ID",
                "project_number": "env.VERTEX_PROJECT_NUMBER",
                "region": "us-central1"
              }
            }
          ]
        }
      }
    }
    ```

    <Note>
      On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the
      top-level `aliases` field.
    </Note>
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
        switch provider {
        case schemas.Vertex:
            return []schemas.Key{
                {
                    Value:  *schemas.NewEnvVar("env.VERTEX_API_KEY"), // only when using Gemini or fine-tuned models
                    Models: []string{"gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"},
                    Weight: 1.0,
                    Aliases: schemas.KeyAliases{
                        "my-fine-tuned-model": "123456789",
                    },
                    VertexKeyConfig: &schemas.VertexKeyConfig{
                        ProjectID:     *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                        ProjectNumber: *schemas.NewEnvVar("env.VERTEX_PROJECT_NUMBER"), // required for fine-tuned models
                        Region:        *schemas.NewEnvVar("us-central1"),
                    },
                },
            }, nil
        }
        return nil, fmt.Errorf("provider %s not supported", provider)
    }
    ```
  </Tab>
</Tabs>

<Note>
  Vertex AI support for fine-tuned models is currently in beta. Requests to
  non-Gemini fine-tuned models may fail, so please test and report any issues.
</Note>

**`vertex_key_config` fields:**

| Field              | Required | Description                                            |
| ------------------ | -------- | ------------------------------------------------------ |
| `project_id`       | Yes      | Google Cloud project ID                                |
| `region`           | Yes      | GCP region (e.g., `us-central1`, `eu-west1`, `global`) |
| `auth_credentials` | No       | Service account JSON string (leave empty for ADC)      |
| `project_number`   | No       | GCP project number (required for fine-tuned models)    |

**Key-level fields:**

| Field     | Required | Description                                                                               |
| --------- | -------- | ----------------------------------------------------------------------------------------- |
| `value`   | No       | Vertex API key (Gemini and fine-tuned models only; leave empty for Service Account / ADC) |
| `aliases` | No       | Map model names to fine-tuned model IDs or endpoint identifiers (v1.5.0-prerelease2+)     |
| `models`  | Yes      | Models this key can serve; use `["*"]` to allow all                                       |

***

## GKE Workload Identity Federation

When running Bifrost on GKE, [Workload Identity Federation](https://cloud.google.com/kubernetes-engine/docs/concepts/workload-identity) (WIF) lets pods authenticate to Vertex AI without managing service account keys. The pod inherits an IAM identity through the Kubernetes ServiceAccount, and Bifrost picks it up automatically via [Application Default Credentials](#2-application-default-credentials).

**What you need:**

1. The GCP-side prerequisites below (API enabled, IAM service account, WIF binding)
2. A Bifrost Vertex key using **"Service Account (Attached)"** auth - see [Application Default Credentials](#2-application-default-credentials) for Web UI, API, config.json, and Go SDK setup. For Helm, see [Helm - Google Vertex AI](/deployment-guides/helm/providers#google-vertex-ai).
3. The Kubernetes ServiceAccount annotated for WIF:

```bash theme={null}
kubectl annotate serviceaccount KSA_NAME \
  --namespace NAMESPACE \
  iam.gke.io/gcp-service-account=IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com
```

Replace `IAM_SA_NAME` with the IAM Service Account created in [Step 3](#gcp-prerequisites) below.

### GCP Prerequisites

<Accordion title="Step 1: Enable the Vertex AI API">
  The Vertex AI API must be enabled in your project. Search for `aiplatform` in the [API Library](https://console.cloud.google.com/apis/library) or run:

  ```bash theme={null}
  gcloud services enable aiplatform.googleapis.com --project=PROJECT_ID
  ```

  WIF uses the IAM Credentials API for token exchange. Enable it as well:

  ```bash theme={null}
  gcloud services enable iamcredentials.googleapis.com --project=PROJECT_ID
  ```
</Accordion>

<Accordion title="Step 2: Enable Workload Identity on GKE">
  **Autopilot clusters:** WIF is always enabled. Skip this step.

  **Standard clusters:** Enable the workload identity pool and GKE metadata server:

  ```bash theme={null}
  # Enable Workload Identity on the cluster
  gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --workload-pool=PROJECT_ID.svc.id.goog

  # Enable GKE metadata server on each node pool
  gcloud container node-pools update NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION \
    --workload-metadata=GKE_METADATA
  ```

  Verify:

  ```bash theme={null}
  gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(workloadIdentityConfig.workloadPool)"
  # Expected: PROJECT_ID.svc.id.goog
  ```
</Accordion>

<Accordion title="Step 3: Create an IAM Service Account and grant Vertex access">
  Create a dedicated IAM Service Account (or use an existing one) and grant it the Vertex AI User role:

  ```bash theme={null}
  # Create the service account
  gcloud iam service-accounts create IAM_SA_NAME \
    --display-name="Bifrost Vertex AI" \
    --project=PROJECT_ID

  # Grant Vertex AI access
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
  ```
</Accordion>

<Accordion title="Step 4: Bind the Kubernetes ServiceAccount to the IAM Service Account">
  Allow the Kubernetes ServiceAccount to impersonate the IAM Service Account:

  ```bash theme={null}
  gcloud iam service-accounts add-iam-policy-binding \
    IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]"
  ```

  Replace `NAMESPACE` and `KSA_NAME` with your Bifrost pod's namespace and Kubernetes ServiceAccount name.

  Then annotate the Kubernetes ServiceAccount so GKE knows which IAM identity to map:

  ```bash theme={null}
  kubectl annotate serviceaccount KSA_NAME \
    --namespace NAMESPACE \
    iam.gke.io/gcp-service-account=IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com
  ```

  If deploying with the Bifrost Helm chart, set the annotation via `serviceAccount.annotations` in your values file - see [Helm - Google Vertex AI](/deployment-guides/helm/providers#google-vertex-ai) for the full example.
</Accordion>

### Verify

From inside the Bifrost pod, confirm the GKE metadata server returns a token:

```bash theme={null}
kubectl exec -n NAMESPACE POD_NAME -- \
  wget -qO- --header="Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
```

Replace `NAMESPACE` and `POD_NAME` with your Bifrost namespace and any running Bifrost pod name (e.g., `bifrost-0` for a StatefulSet or use `kubectl get pods -n NAMESPACE` to find it).

A JSON response with an `access_token` field confirms WIF is working. Then send a request through Bifrost to a Vertex model (e.g., `vertex/gemini-2.5-flash`) to verify end-to-end.

### Troubleshooting

| Symptom                                | Likely Cause                                                                         | Fix                                                                                                                                                                                     |
| -------------------------------------- | ------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `"could not find default credentials"` | GKE metadata server not enabled, or Kubernetes ServiceAccount missing WIF annotation | Enable GKE metadata server on the node pool ([Step 2](#gcp-prerequisites)); verify the `iam.gke.io/gcp-service-account` annotation on the ServiceAccount ([Step 4](#gcp-prerequisites)) |
| `403 Forbidden` from Vertex API        | IAM Service Account lacks Vertex permissions                                         | Grant `roles/aiplatform.user` to the IAM Service Account                                                                                                                                |
| `403` during token exchange            | WIF binding missing                                                                  | Run the `add-iam-policy-binding` command from Step 4; confirm `roles/iam.workloadIdentityUser` is granted                                                                               |
| Wrong project or region errors         | Bifrost config mismatch                                                              | Check `project_id` and `region` in the Vertex key configuration                                                                                                                         |

***

## Beta Headers

For Anthropic models on Vertex AI, Bifrost validates `anthropic-beta` headers and drops unsupported headers from the request.

**Supported**: `computer-use-*`, `compact-*`, `context-management-*`, `interleaved-thinking-*`, `context-1m-*`

**Not supported**: `structured-outputs-*`, `advanced-tool-use-*`, `mcp-client-*`, `prompt-caching-scope-*`, `files-api-*`, `skills-*`, `fast-mode-*`, `redact-thinking-*`

You can override these defaults per provider via the **Beta Headers** tab in provider configuration or via [`beta_header_overrides`](/quickstart/gateway/provider-configuration#beta-header-overrides). See the full support matrix in the [Anthropic provider docs](/providers/supported-providers/anthropic#beta-headers).

<Frame>
  <img src="https://mintcdn.com/bifrost/ywAmWSjmbrf-3qJw/media/vertex-ai-setting-anthropic-beta-headers.png?fit=max&auto=format&n=ywAmWSjmbrf-3qJw&q=85&s=de22d93722e5dfaff8cab60b70d516d9" alt="Vertex AI Beta Headers configuration tab showing supported and unsupported Anthropic beta features with override options" width="2318" height="2648" data-path="media/vertex-ai-setting-anthropic-beta-headers.png" />
</Frame>

***

# 1. Chat Completions

## Request Parameters

### Core Parameter Mapping

| Parameter        | Vertex Handling           | Notes                                                |
| ---------------- | ------------------------- | ---------------------------------------------------- |
| `model`          | Maps to Vertex model ID   | Region-specific endpoint constructed automatically   |
| All other params | Model-specific conversion | Converted per underlying provider (Gemini/Anthropic) |

### Key Configuration

The key configuration for Vertex requires Google Cloud credentials:

```json theme={null}
{
  "vertex_key_config": {
    "project_id": "my-gcp-project",
    "region": "us-central1",
    "auth_credentials": "{service-account-json}"
  }
}
```

**Configuration Details**:

* `project_id` - GCP project ID (required)
* `region` - GCP region for API endpoints (required)
  * Examples: `us-central1`, `us-west1`, `eu-west1`, `global`
* `auth_credentials` - Service account JSON credentials (optional if using default credentials)

### Authentication Methods

1. **Service Account JSON** (recommended for production)

   ```json theme={null}
   { "auth_credentials": "{full-service-account-json}" }
   ```

2. **Application Default Credentials** (for local development)
   * Requires `GOOGLE_APPLICATION_CREDENTIALS` environment variable
   * Leave `auth_credentials` empty

## Gemini Models

When using Google's Gemini models, Bifrost converts requests to Gemini's API format.

### Parameter Mapping for Gemini

All Gemini-compatible parameters are supported. Special handling includes:

* **System prompts**: Converted to Gemini's system message format
* **Tool usage**: Mapped to Gemini's function calling format
* **Streaming**: Uses Gemini's streaming protocol

Refer to [Gemini documentation](/providers/supported-providers/gemini) for detailed conversion details.

## Anthropic Models (Claude)

When using Anthropic models through Vertex AI, Bifrost converts requests to Anthropic's message format.

### Parameter Mapping for Anthropic

All Anthropic-standard parameters are supported:

* **Reasoning/Thinking**: `reasoning` parameters converted to `thinking` structure
* **System messages**: Extracted and placed in separate `system` field
* **Tool message grouping**: Consecutive tool messages merged
* **API version**: Automatically set to `vertex-2023-10-16` for Anthropic models

Refer to [Anthropic documentation](/providers/supported-providers/anthropic) for detailed conversion details.

### Special Notes for Vertex + Anthropic

* Responses API uses special `/v1/messages` endpoint
* `anthropic_version` automatically set to `vertex-2023-10-16`
* Minimum reasoning budget: 1024 tokens
* Model field removed from request (Vertex uses different identification)

## Region Selection

The region determines the API endpoint:

| Region        | Endpoint                                | Purpose                   |
| ------------- | --------------------------------------- | ------------------------- |
| `us-central1` | `us-central1-aiplatform.googleapis.com` | US Central                |
| `us-west1`    | `us-west1-aiplatform.googleapis.com`    | US West                   |
| `eu-west1`    | `eu-west1-aiplatform.googleapis.com`    | Europe West               |
| `global`      | `aiplatform.googleapis.com`             | Global (no region prefix) |

Availability varies by region. Check [GCP documentation](https://cloud.google.com/vertex-ai/docs/general/locations) for model availability.

## Streaming

Streaming format depends on model type:

* **Gemini models**: Standard Gemini streaming with server-sent events
* **Anthropic models**: Anthropic message streaming format

***

# 2. Responses API

The Responses API is available for both Anthropic (Claude) and Gemini models on Vertex AI.

## Request Parameters

### Core Parameter Mapping

| Parameter           | Vertex Handling              | Notes                             |
| ------------------- | ---------------------------- | --------------------------------- |
| `instructions`      | Becomes system message       | Model-specific conversion         |
| `input`             | Converted to messages        | String or array support           |
| `max_output_tokens` | Model-specific field mapping | Gemini vs Anthropic conversion    |
| All other params    | Model-specific conversion    | Converted per underlying provider |

### Gemini Models

For Gemini models, conversion follows Gemini's Responses API format.

### Anthropic Models (Claude)

For Anthropic models, conversion follows Anthropic's message format:

* `instructions` becomes system message
* `reasoning` mapped to `thinking` structure

### Configuration

<Tabs>
  <Tab title="Gateway">
    ```bash theme={null}
    curl -X POST http://localhost:8080/v1/responses \
      -H "Content-Type: application/json" \
      -d '{
        "model": "vertex/claude-3-5-sonnet",
        "input": "What is AI?",
        "instructions": "You are a helpful assistant",
        "project_id": "my-gcp-project",
        "region": "us-central1"
      }' \
      -H "X-Goog-Authorization: Bearer {token}"
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    resp, err := client.ResponsesRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostResponsesRequest{
        Provider: schemas.Vertex,
        Model:    "claude-3-5-sonnet",
        Input:    messages,
        Params: &schemas.ResponsesParameters{
            Instructions: schemas.Ptr("You are a helpful assistant"),
        },
    })
    ```
  </Tab>
</Tabs>

### Special Handling

* Endpoint: `/v1/messages` (Anthropic format)
* `anthropic_version` set to `vertex-2023-10-16` automatically
* Model and region fields removed from request
* Raw request body passthrough supported

Refer to [Anthropic Responses API](/providers/supported-providers/anthropic#2-responses-api) for parameter details.

***

# 3. Embeddings

Embeddings are supported for Gemini and other models that support embedding generation.

## Request Parameters

### Core Parameters

| Parameter    | Vertex Mapping                    | Notes                |
| ------------ | --------------------------------- | -------------------- |
| `input`      | `instances[].content`             | Text to embed        |
| `dimensions` | `parameters.outputDimensionality` | Optional output size |

### Advanced Parameters

Use `extra_params` for embedding-specific options:

<Tabs>
  <Tab title="Gateway">
    ```bash theme={null}
    curl -X POST http://localhost:8080/v1/embeddings \
      -H "Content-Type: application/json" \
      -d '{
        "model": "text-embedding-004",
        "input": ["text to embed"],
        "dimensions": 256,
        "task_type": "RETRIEVAL_DOCUMENT",
        "title": "Document title",
        "project_id": "my-gcp-project",
        "region": "us-central1",
        "autoTruncate": true
      }'
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    resp, err := client.EmbeddingRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostEmbeddingRequest{
        Provider: schemas.Vertex,
        Model:    "text-embedding-004",
        Input: &schemas.EmbeddingInput{
            Texts: []string{"text to embed"},
        },
        Params: &schemas.EmbeddingParameters{
            Dimensions: schemas.Ptr(256),
            ExtraParams: map[string]interface{}{
                "task_type": "RETRIEVAL_DOCUMENT",
                "title": "Document title",
                "autoTruncate": true,
            },
        },
    })
    ```
  </Tab>
</Tabs>

#### Embedding Parameters

| Parameter      | Type    | Description                                                                                                               |
| -------------- | ------- | ------------------------------------------------------------------------------------------------------------------------- |
| `task_type`    | string  | Task type hint: `RETRIEVAL_QUERY`, `RETRIEVAL_DOCUMENT`, `SEMANTIC_SIMILARITY`, `CLASSIFICATION`, `CLUSTERING` (optional) |
| `title`        | string  | Optional title to help model produce better embeddings (used with task\_type)                                             |
| `autoTruncate` | boolean | Auto-truncate input to max tokens (defaults to true)                                                                      |

### Task Type Effects

Different task types optimize embeddings for specific use cases:

* `RETRIEVAL_DOCUMENT` - Optimized for documents in retrieval systems
* `RETRIEVAL_QUERY` - Optimized for queries searching documents
* `SEMANTIC_SIMILARITY` - Optimized for semantic similarity tasks
* `CLASSIFICATION` - For classification tasks
* `CLUSTERING` - For clustering tasks

## Response Conversion

Embeddings response includes vectors and truncation information:

```json theme={null}
{
  "embeddings": [
    {
      "values": [0.1234, -0.5678, ...],
      "statistics": {
        "token_count": 15,
        "truncated": false
      }
    }
  ]
}
```

**Response Fields**:

* `values` - Embedding vector as floats
* `statistics.token_count` - Input token count
* `statistics.truncated` - Whether input was truncated due to length

***

# 4. Image Generation

Image Generation is supported for Gemini and Imagen on Vertex AI. The provider automatically routes to the appropriate format based on the model type.

## Request Parameters

### Core Parameter Mapping

| Parameter        | Vertex Handling                       | Notes                                             |
| ---------------- | ------------------------------------- | ------------------------------------------------- |
| `model`          | Mapped to deployment/model identifier | Model type detected automatically                 |
| `prompt`         | Model-specific conversion             | Converted per underlying provider (Gemini/Imagen) |
| All other params | Model-specific conversion             | Converted per underlying provider                 |

### Model Type Detection

Vertex automatically detects the model type and uses the appropriate conversion:

1. **Gemini Models**: Uses Gemini format (same as [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation))
2. **Imagen Models**: Uses Imagen format (detected via `IsImagenModel()`)

### Configuration

<Tabs>
  <Tab title="Gateway">
    ```bash theme={null}
    curl -X POST http://localhost:8080/v1/images/generations \
      -H "Content-Type: application/json" \
      -d '{
        "model": "vertex/imagen-4.0-generate-001",
        "prompt": "A sunset over the mountains",
        "size": "1024x1024",
        "n": 2,
        "project_id": "my-gcp-project",
        "region": "us-central1"
      }' \
      -H "X-Goog-Authorization: Bearer {token}"
    ```
  </Tab>

  <Tab title="Go SDK">
    ```go theme={null}
    resp, err := client.ImageGenerationRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostImageGenerationRequest{
        Provider: schemas.Vertex,
        Model:    "imagen-4.0-generate-001",
        Input: &schemas.ImageGenerationInput{
            Prompt: "A sunset over the mountains",
        },
        Params: &schemas.ImageGenerationParameters{
            Size: schemas.Ptr("1024x1024"),
            N:    schemas.Ptr(2),
        },
    })
    ```
  </Tab>
</Tabs>

## Request Conversion

Vertex converts requests based on model type:

* **Gemini Models**: Uses `gemini.ToGeminiImageGenerationRequest()` - same conversion as standard Gemini (see [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation))
* **Imagen Models**: Uses `gemini.ToImagenImageGenerationRequest()` - Imagen-specific format with size/aspect ratio conversion

All request bodies are converted to `map[string]interface{}` and the `region` field is removed before sending to Vertex API.

## Response Conversion

* **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini
* **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format

## Endpoint Selection

The provider automatically selects the endpoint based on model type:

* **Fine-tuned models**: `/v1beta1/projects/{projectNumber}/locations/{region}/endpoints/{deployment}:generateContent`
* **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict`
* **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent`

## Streaming

Image generation streaming is not supported by Vertex AI.

***

# 5. Image Edit

<Warning>Requests use **multipart/form-data**, not JSON.</Warning>

Image Edit is supported for Gemini and Imagen models on Vertex AI. The provider automatically routes to the appropriate format based on the model type.

**Request Parameters**

| Parameter            | Type   | Required | Notes                                                                                                                                                           |
| -------------------- | ------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`              | string | ✅        | Model identifier (must be Gemini or Imagen model)                                                                                                               |
| `prompt`             | string | ✅        | Text description of the edit                                                                                                                                    |
| `image[]`            | binary | ✅        | Image file(s) to edit (supports multiple images)                                                                                                                |
| `mask`               | binary | ❌        | Mask image file                                                                                                                                                 |
| `type`               | string | ❌        | Edit type: `"inpainting"`, `"outpainting"`, `"inpaint_removal"`, `"bgswap"` (Imagen only)                                                                       |
| `n`                  | int    | ❌        | Number of images to generate (1-10)                                                                                                                             |
| `output_format`      | string | ❌        | Output format: `"png"`, `"webp"`, `"jpeg"`                                                                                                                      |
| `output_compression` | int    | ❌        | Compression level (0-100%)                                                                                                                                      |
| `seed`               | int    | ❌        | Seed for reproducibility (via `ExtraParams["seed"]`)                                                                                                            |
| `negative_prompt`    | string | ❌        | Negative prompt (via `ExtraParams["negativePrompt"]`)                                                                                                           |
| `maskMode`           | string | ❌        | Mask mode (via `ExtraParams["maskMode"]`, Imagen only): `"MASK_MODE_USER_PROVIDED"`, `"MASK_MODE_BACKGROUND"`, `"MASK_MODE_FOREGROUND"`, `"MASK_MODE_SEMANTIC"` |
| `dilation`           | float  | ❌        | Mask dilation (via `ExtraParams["dilation"]`, Imagen only): Range \[0, 1]                                                                                       |
| `maskClasses`        | int\[] | ❌        | Mask classes (via `ExtraParams["maskClasses"]`, Imagen only): For `MASK_MODE_SEMANTIC`                                                                          |

***

**Request Conversion**

Vertex uses the same conversion functions as Gemini:

1. **Gemini Models**: Uses `gemini.ToGeminiImageEditRequest()` - same conversion as standard Gemini (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit))
2. **Imagen Models**: Uses `gemini.ToImagenImageEditRequest()` - Imagen-specific format with edit mode mapping and mask configuration (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit))

**Model Validation**: Only Gemini and Imagen models are supported. Other models return `ConfigurationError`.

**Request Body Processing**:

* All request bodies are converted to `map[string]interface{}` for Vertex API compatibility
* The `region` field is removed before sending to Vertex API
* For Gemini models, unsupported fields are stripped via `stripVertexGeminiUnsupportedFields()` (removes `id` from function\_call and function\_response)

**Response Conversion**

* **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini
* **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format

**Endpoint Selection**

The provider automatically selects the endpoint based on model type:

* **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent`
* **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict`

**Streaming**

Image edit streaming is not supported by Vertex AI.

**Image Variation**

Image variation is not supported by Vertex AI.

***

# 6. List Models

## Request Parameters

None required. Automatically uses project\_id and region from key config.

## Response Conversion

Lists models available in the specified project and region with metadata and deployment information:

```json theme={null}
{
  "models": [
    {
      "name": "projects/{project}/locations/{region}/models/gemini-2.0-flash",
      "display_name": "Gemini 2.0 Flash",
      "description": "Fast multimodal model",
      "version_id": "1",
      "version_aliases": ["latest", "stable"],
      "capabilities": [...],
      "deployed_models": [...]
    }
  ],
  "next_page_token": "..."
}
```

## Custom vs Non-Custom Models

<Warning>
  **Important**: Vertex AI's List Models API **only returns custom fine-tuned
  models** that have been deployed to your project. It does NOT return standard
  foundation models (Gemini, Claude, etc.).
</Warning>

To provide a complete model listing experience, Bifrost performs **multi-pass model discovery**:

### Three-Pass Model Discovery

1. **First Pass - Custom Models from API Response**
   * Queries Vertex AI's List Models API
   * Returns only custom fine-tuned models deployed to your project
   * Custom models are identified by having deployment values that contain only digits
   * Example: `"deployment": "1234567890"`

2. **Second Pass - Non-Custom Models from Aliases**
   * Adds standard foundation models from your `aliases` configuration
   * Non-custom models have alphanumeric deployment values (e.g., `gemini-pro`, `claude-3-5-sonnet`)
   * Filters by the key-level `models` allowlist, if specified
   * Example: `"deployment": "gemini-2.0-flash"`

3. **Third Pass - Allowed Models Not in Aliases**
   * Adds models specified in `models` that weren't in the `aliases` map
   * Ensures all explicitly allowed models appear in the list
   * Uses the model name itself as the deployment value
   * Skips digit-only model IDs (reserved for custom models)

### Model Filtering Logic

* **If `models` is empty and no aliases are configured**: No models are returned
* **If `models` is empty but aliases are configured**: Only aliased models are returned
* **If `models` is `["*"]`**: All models from all three passes are included (unrestricted)
* **If `models` is non-empty**: Only models/aliases whose request names appear in `models` are included
* **Duplicate Prevention**: Each model ID is tracked to prevent duplicates across passes

### Model Name Formatting

Non-custom models from aliases and allowed models are automatically formatted for display:

* `gemini-pro` → "Gemini Pro"
* `claude-3-5-sonnet` → "Claude 3 5 Sonnet"
* `gemini_2_flash` → "Gemini 2 Flash"

Formatting uses title case and converts hyphens/underscores to spaces.

### Example Configuration

<Tabs>
  <Tab title="With Custom Models Only">
    ```json theme={null}
    {
      "aliases": {
        "my-gemini-ft": "1234567890",
        "my-claude-ft": "9876543210"
      },
      "vertex_key_config": {
        "project_id": "my-project",
        "region": "us-central1"
      }
    }
    ```

    This returns only your custom fine-tuned models from the API.
  </Tab>

  <Tab title="With Foundation Models">
    ```json theme={null}
    {
      "aliases": {
        "gemini-2.0-flash": "gemini-2.0-flash",
        "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022"
      },
      "vertex_key_config": {
        "project_id": "my-project",
        "region": "us-central1"
      }
    }
    ```

    This returns both custom models AND foundation models from aliases.
  </Tab>

  <Tab title="With Allowed Models Filter">
    ```json theme={null}
    {
      "models": ["gemini-2.0-flash", "claude-3-5-sonnet"],
      "aliases": {
        "gemini-2.0-flash": "gemini-2.0-flash",
        "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022",
        "gemini-1.5-pro": "gemini-1.5-pro"
      },
      "vertex_key_config": {
        "project_id": "my-project",
        "region": "us-central1"
      }
    }
    ```

    Only returns `gemini-2.0-flash` and `claude-3-5-sonnet`, excluding `gemini-1.5-pro`.
  </Tab>
</Tabs>

### Pagination

Model listing is paginated automatically. If more than 100 models exist, `next_page_token` will be present. Bifrost handles pagination internally.

***

## Caveats

<Accordion title="Project ID and Region Required">
  **Severity**: High **Behavior**: Both project\_id and region required for all
  operations **Impact**: Request fails without valid GCP project/region
  configuration **Code**: `vertex.go:127-138`
</Accordion>

<Accordion title="OAuth2 Token Management">
  **Severity**: Medium **Behavior**: Tokens cached and automatically refreshed
  when expired **Impact**: First request slightly slower due to auth; cached for
  subsequent requests **Code**: `vertex.go:34-55`
</Accordion>

<Accordion title="Anthropic Model Detection">
  **Severity**: Medium **Behavior**: Automatic detection of Anthropic vs Gemini
  models **Impact**: Different conversion logic applied transparently **Code**:
  `vertex.go` chat/responses endpoints
</Accordion>

<Accordion title="Model-Specific Responses API Handling">
  **Severity**: Low **Behavior**: Responses API automatically routes to
  Anthropic or Gemini implementation based on model **Impact**: Different
  conversion logic applied transparently per model **Code**:
  `vertex.go:836-1080`
</Accordion>

<Accordion title="Anthropic Version Lock">
  **Severity**: Low **Behavior**: `anthropic_version` always set to
  `vertex-2023-10-16` for Claude **Impact**: Cannot override Anthropic version
  for Claude on Vertex **Code**: `utils.go:33, 71`
</Accordion>

<Accordion title="Embeddings Precision Preservation">
  **Severity**: Low **Behavior**: Vertex returns float64 embeddings, and Bifrost
  preserves that precision in normalized embedding responses **Impact**: No
  precision loss in the `/v1/embeddings` response path **Code**:
  `embedding.go:84-91`
</Accordion>

<Accordion title="List Models API Returns Only Custom Models">
  **Severity**: High **Behavior**: Vertex AI's List Models API only returns
  custom fine-tuned models, NOT foundation models **Impact**: Bifrost performs
  three-pass discovery to include foundation models from aliases and the
  key-level `models` allowlist **Why**: This is a Vertex AI API limitation -
  foundation models must be explicitly configured **Code**: `models.go:76-217`
</Accordion>

***

## Configuration

**HTTP Settings**: OAuth2 authentication with automatic token refresh | Region-specific endpoints | Max Connections 5000 | Max Idle 60 seconds

**Scope**: `https://www.googleapis.com/auth/cloud-platform`

**Endpoint Format**: `https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/{resource}`

**Note**: For `global` region, endpoint is `https://aiplatform.googleapis.com/v1/projects/{project}/locations/global/{resource}`

## Video Generation

Vertex AI routes video generation through Gemini's Veo models using the `predictLongRunning` endpoint. All parameters are identical to [Gemini Video Generation](/providers/supported-providers/gemini#video-generation).

<Note>
  Only Veo models are supported (e.g., `veo-2.0-generate-001`). Passing a
  non-Veo model name returns a configuration error.
</Note>

**Supported Operations**

| Operation | Supported | Notes                         |
| --------- | --------- | ----------------------------- |
| Generate  | ✅         | `POST /v1/videos`             |
| Retrieve  | ✅         | `GET /v1/videos/{id}`         |
| Download  | ✅         | `GET /v1/videos/{id}/content` |
| Delete    | ❌         | Not supported                 |
| List      | ❌         | Not supported                 |
| Remix     | ❌         | Not supported                 |
