Rerank documents

curl --request POST \
  --url http://localhost:8080/v1/rerank \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "cohere/rerank-v3.5",
  "query": "<string>",
  "documents": [
    {
      "text": "<string>",
      "id": "<string>",
      "meta": {}
    }
  ],
  "fallbacks": [
    "<string>"
  ],
  "top_n": 2,
  "max_tokens_per_doc": 2,
  "priority": 123,
  "return_documents": true
}
'

{
  "results": [
    {
      "index": 1,
      "relevance_score": 123,
      "document": {
        "text": "<string>",
        "id": "<string>",
        "meta": {}
      }
    }
  ],
  "model": "<string>",
  "id": "<string>",
  "usage": {
    "prompt_tokens": 123,
    "prompt_tokens_details": {
      "text_tokens": 123,
      "audio_tokens": 123,
      "image_tokens": 123,
      "cached_tokens": 123
    },
    "completion_tokens": 123,
    "completion_tokens_details": {
      "text_tokens": 123,
      "accepted_prediction_tokens": 123,
      "audio_tokens": 123,
      "citation_tokens": 123,
      "num_search_queries": 123,
      "reasoning_tokens": 123,
      "image_tokens": 123,
      "rejected_prediction_tokens": 123,
      "cached_tokens": 123
    },
    "total_tokens": 123,
    "cost": {
      "input_tokens_cost": 123,
      "output_tokens_cost": 123,
      "request_cost": 123,
      "total_cost": 123
    }
  },
  "extra_fields": {
    "request_type": "<string>",
    "provider": "openai",
    "model_requested": "<string>",
    "model_deployment": "<string>",
    "latency": 123,
    "chunk_index": 123,
    "raw_request": {},
    "raw_response": {},
    "cache_debug": {
      "cache_hit": true,
      "cache_id": "<string>",
      "hit_type": "<string>",
      "provider_used": "<string>",
      "model_used": "<string>",
      "input_tokens": 123,
      "threshold": 123,
      "similarity": 123
    }
  }
}
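
For callers who prefer Python to curl, the request above might be sketched as follows using only the standard library. The helper names `build_rerank_request` and `rerank` are hypothetical and not part of any SDK, and the URL is simply the localhost address from the curl example.

```python
import json
import urllib.request

# Assumed local gateway address, copied from the curl example above.
RERANK_URL = "http://localhost:8080/v1/rerank"


def build_rerank_request(model, query, documents, **options):
    """Build (but do not send) a POST request for the rerank endpoint."""
    payload = {"model": model, "query": query, "documents": documents}
    # Optional fields such as top_n or return_documents are passed through as-is.
    payload.update(options)
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rerank(model, query, documents, timeout=30.0, **options):
    """Send the request and return the decoded JSON response body."""
    request = build_rerank_request(model, query, documents, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `rerank("cohere/rerank-v3.5", "my query", [{"text": "some passage"}], top_n=2)` would then return a dictionary shaped like the response shown above, provided a server is listening at that address.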

POST /v1/rerank

Body

application/json

model

string

required

Model in provider/model format

Example:

"cohere/rerank-v3.5"

query

string

required

Query used to score and reorder documents

Minimum string length: 1

documents

object[]

required

Documents to rerank

Minimum array length: 1

fallbacks

string[]

Fallback models in provider/model format

top_n

integer

Maximum number of ranked results to return

Required range: x >= 1

max_tokens_per_doc

integer

Maximum tokens to consider per document (provider-dependent)

Required range: x >= 1

priority

integer

Request priority hint (provider-dependent)

return_documents

boolean

Whether to include document content in each result
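
The constraints listed above (required fields, minimum lengths, and ranges) can be checked on the client before sending. The following is a minimal sketch, not the server's actual validation: `validate_rerank_body` is a hypothetical helper, and treating "provider/model format" as "contains a slash" is an assumption.

```python
def validate_rerank_body(body):
    """Return a list of constraint violations; an empty list means the body looks valid."""
    errors = []

    # model: required, provider/model format (assumed here to mean it contains "/").
    model = body.get("model")
    if not isinstance(model, str) or "/" not in model:
        errors.append("model must be a string in provider/model format")

    # query: required, minimum string length 1.
    query = body.get("query")
    if not isinstance(query, str) or len(query) < 1:
        errors.append("query must be a string of length >= 1")

    # documents: required, minimum array length 1.
    documents = body.get("documents")
    if not isinstance(documents, list) or len(documents) < 1:
        errors.append("documents must be an array with at least one item")

    # top_n and max_tokens_per_doc: optional integers, each >= 1 when present.
    for name in ("top_n", "max_tokens_per_doc"):
        value = body.get(name)
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            errors.append(f"{name} must be an integer >= 1")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation at once.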

Response

Successful response

results

object[]

required

Ranked results ordered by relevance score descending

model

string

required

Model used to perform reranking

id

string

Unique identifier for the rerank response

usage

object

Token usage information

extra_fields

object

Additional fields included in responses
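
A client usually needs to pair each ranked result with the document it refers to. The sketch below shows one way to do that, along with reading the reported cost. Both helper names are hypothetical, and it assumes that `index` is the zero-based position of the document in the request's `documents` array, which this page does not state explicitly.

```python
def attach_ranked_documents(documents, response):
    """Pair each ranked result with its input document.

    Assumes result["index"] is the zero-based position of the document in the
    request's documents array; this page does not state that explicitly.
    """
    ranked = []
    for result in response["results"]:
        ranked.append({
            "score": result["relevance_score"],
            # Prefer the echoed document (return_documents=true), else look it up.
            "document": result.get("document") or documents[result["index"]],
        })
    return ranked


def total_cost(response):
    """Return usage.cost.total_cost, or None when usage or cost is absent."""
    usage = response.get("usage") or {}
    cost = usage.get("cost") or {}
    return cost.get("total_cost")
```

Because `results` is already ordered by relevance score descending, the returned list keeps that order and needs no further sorting. `usage` is not marked required, so `total_cost` tolerates its absence.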
