Create embeddings (LiteLLM - Cohere format)

curl --request POST \
  --url http://localhost:8080/litellm/cohere/v2/embed \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "embed-english-v3.0",
  "input_type": "<string>",
  "texts": [
    "<string>"
  ],
  "images": [
    "<string>"
  ],
  "inputs": [
    {
      "content": [
        {
          "type": "text",
          "text": "<string>",
          "image_url": {
            "url": "<string>"
          },
          "thinking": "<string>",
          "document": {
            "data": {},
            "id": "<string>"
          }
        }
      ]
    }
  ],
  "embedding_types": [
    "<string>"
  ],
  "output_dimension": 123,
  "max_tokens": 123,
  "truncate": "<string>"
}
'

{
  "id": "<string>",
  "embeddings": {
    "float": [
      [
        123
      ]
    ],
    "int8": [
      [
        123
      ]
    ],
    "uint8": [
      [
        123
      ]
    ],
    "binary": [
      [
        123
      ]
    ],
    "ubinary": [
      [
        123
      ]
    ],
    "base64": [
      "<string>"
    ]
  },
  "response_type": "<string>",
  "texts": [
    "<string>"
  ],
  "images": [
    {
      "width": 123,
      "height": 123,
      "format": "<string>",
      "bit_depth": 123
    }
  ],
  "meta": {
    "api_version": {
      "version": "<string>",
      "is_deprecated": true,
      "is_experimental": true
    },
    "billed_units": {
      "input_tokens": 123,
      "output_tokens": 123,
      "search_units": 123,
      "classifications": 123
    },
    "tokens": {
      "input_tokens": 123,
      "output_tokens": 123
    },
    "warnings": [
      "<string>"
    ]
  }
}

Body

application/json

model

string

required

ID of an available embedding model

Example:

"embed-english-v3.0"

input_type

string

required

Specifies the type of input passed to the model. Required for embedding models v3 and higher.

texts

string[]

Array of strings to embed. Maximum 96 texts per call. At least one of texts, images, or inputs is required.

Maximum array length: 96

images

string[]

Array of image data URIs for multimodal embedding. Maximum 1 image per call. Supports JPEG, PNG, WebP, and GIF, up to 5 MB.

Maximum array length: 1

inputs

object[]

Array of mixed text/image components for embedding. Maximum 96 per call.

Maximum array length: 96
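
The limits on texts, images, and inputs can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_embed_inputs` is hypothetical, and treating an empty array as "missing" is an assumption, since the reference only says that at least one of the three fields is required.

```python
# Limits taken from the field descriptions above.
MAX_TEXTS = 96
MAX_IMAGES = 1
MAX_INPUTS = 96


def validate_embed_inputs(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the documented limits are met.

    Hypothetical client-side helper; the server remains the source of truth.
    """
    problems = []
    limits = {"texts": MAX_TEXTS, "images": MAX_IMAGES, "inputs": MAX_INPUTS}

    # Assumption: an empty array counts as not provided.
    if not any(body.get(field) for field in limits):
        problems.append("one of texts, images, or inputs is required")

    # Check each array against its maximum length.
    for field, limit in limits.items():
        count = len(body.get(field) or [])
        if count > limit:
            problems.append(f"{field}: {count} items exceeds the maximum of {limit}")

    return problems
```

Calling `validate_embed_inputs({"texts": ["hello"]})` returns an empty list, while a body with two images yields a single message about the images limit.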

embedding_types

string[]

Specifies the return format types (float, int8, uint8, binary, ubinary, base64). Defaults to float if unspecified.

output_dimension

integer

Number of dimensions for output embeddings (256, 512, 1024, 1536). Available only for embed-v4 and newer models.
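
The allowed values for embedding_types and output_dimension listed above lend themselves to a small guard. This sketch assumes the listed values are exhaustive; the helper name `output_options` is hypothetical, and it does not check whether the chosen model actually supports output_dimension.

```python
# Values copied from the field descriptions above.
ALLOWED_EMBEDDING_TYPES = {"float", "int8", "uint8", "binary", "ubinary", "base64"}
ALLOWED_OUTPUT_DIMENSIONS = {256, 512, 1024, 1536}


def output_options(embedding_types=None, output_dimension=None) -> dict:
    """Build the output-related part of a request body.

    Raises ValueError for values outside the documented sets.
    """
    options = {}

    if embedding_types is not None:
        unknown = sorted(set(embedding_types) - ALLOWED_EMBEDDING_TYPES)
        if unknown:
            raise ValueError(f"unsupported embedding_types: {unknown}")
        options["embedding_types"] = list(embedding_types)
    # If embedding_types is omitted, the field is left out of the body
    # so that the documented default (float) applies.

    if output_dimension is not None:
        if output_dimension not in ALLOWED_OUTPUT_DIMENSIONS:
            raise ValueError(f"unsupported output_dimension: {output_dimension}")
        options["output_dimension"] = output_dimension

    return options
```

The returned dict can be merged into the rest of the request body, for example with `body.update(output_options(["float", "int8"], 512))`.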

max_tokens

integer

Maximum tokens to embed per input before truncation.

truncate

string

How to handle inputs that exceed the token limit. Defaults to END. In Cohere's own API the accepted values are NONE, START, and END.
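
As an alternative to the curl sample, a text-only request can be assembled with the Python standard library. This is a minimal sketch under a few assumptions: the helper names `build_embed_request` and `send` are hypothetical, the base URL is the localhost address used in the curl sample, and the `input_type` value `search_document` comes from Cohere's API rather than from this page. Only the two required fields and texts are set.

```python
import json
import urllib.request


def build_embed_request(texts, model="embed-english-v3.0",
                        input_type="search_document",
                        base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for the embed endpoint."""
    # model and input_type are required; texts is one of the three input fields.
    body = {"model": model, "input_type": input_type, "texts": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/litellm/cohere/v2/embed",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect or log, for example `send(build_embed_request(["hello world"]))` once the gateway is running.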

Response

Successful response

id

string

Response ID

embeddings

object

Embedding data object with different types

response_type

string

Response type (embeddings_floats, embeddings_by_type)

texts

string[]

Original text entries

images

object[]

Original image entries
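
The response fields above can be combined to match each input text with its vector. The sketch below assumes that embeddings is an object keyed by embedding type, as in the sample response, and that the vectors come back in the same order as texts; the page does not state the ordering explicitly. The function name `pair_texts_with_vectors` is hypothetical.

```python
def pair_texts_with_vectors(response: dict, embedding_type: str = "float") -> list[tuple]:
    """Return (text, vector) pairs for one embedding type.

    Assumes embeddings[embedding_type][i] belongs to texts[i].
    """
    vectors = response.get("embeddings", {}).get(embedding_type)
    if vectors is None:
        raise KeyError(f"response has no {embedding_type!r} embeddings")

    texts = response.get("texts") or []
    # Guard against a mismatch, e.g. when the request used images instead of texts.
    if len(texts) != len(vectors):
        raise ValueError("texts and embeddings differ in length")

    return list(zip(texts, vectors))
```

For a request that asked for several embedding_types, the function can be called once per type with the same response.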
