Create text completion (OpenAI format)

curl --request POST \
  --url http://localhost:8080/openai/v1/completions \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "<string>",
  "stream": true,
  "max_tokens": 123,
  "temperature": 1,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "logit_bias": {},
  "logprobs": 123,
  "n": 123,
  "stop": "<string>",
  "suffix": "<string>",
  "echo": true,
  "best_of": 123,
  "user": "<string>",
  "seed": 123,
  "fallbacks": [
    "<string>"
  ]
}
'
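The same request can be assembled from Python. The sketch below is a minimal illustration using only the standard library. The helper name `build_completion_request` and its `base_url` default are assumptions made for this example, and the helper only constructs the request object, so sending it is left to the caller.

```python
import json
import urllib.request


def build_completion_request(prompt, model="gpt-3.5-turbo-instruct",
                             base_url="http://localhost:8080", **options):
    """Build (but do not send) a POST request for /openai/v1/completions.

    build_completion_request is a hypothetical helper, not part of any SDK.
    Extra keyword arguments (max_tokens, temperature, fallbacks, ...) are
    copied into the JSON body unchanged.
    """
    body = {"model": model, "prompt": prompt}
    body.update(options)
    return urllib.request.Request(
        url=f"{base_url}/openai/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_completion_request("Say hello", max_tokens=16, temperature=1)
# Sending requires a running gateway at base_url:
# with urllib.request.urlopen(req) as resp:
#     completion = json.load(resp)
```

Keeping construction separate from sending makes the JSON body easy to inspect before any network call is made.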

Example response:

{
  "id": "<string>",
  "choices": [
    {
      "index": 123,
      "finish_reason": "<string>",
      "log_probs": {
        "content": [
          {
            "bytes": [
              123
            ],
            "logprob": 123,
            "token": "<string>",
            "top_logprobs": [
              {
                "bytes": [
                  123
                ],
                "logprob": 123,
                "token": "<string>"
              }
            ]
          }
        ],
        "refusal": [
          {
            "bytes": [
              123
            ],
            "logprob": 123,
            "token": "<string>"
          }
        ],
        "text_offset": [
          123
        ],
        "token_logprobs": [
          123
        ],
        "tokens": [
          "<string>"
        ],
        "top_logprobs": [
          {}
        ]
      },
      "text": "<string>",
      "message": {
        "role": "assistant",
        "name": "<string>",
        "content": "<string>",
        "tool_call_id": "<string>",
        "refusal": "<string>",
        "audio": {
          "id": "<string>",
          "data": "<string>",
          "expires_at": 123,
          "transcript": "<string>"
        },
        "reasoning": "<string>",
        "reasoning_details": [
          {
            "id": "<string>",
            "index": 123,
            "type": "reasoning.summary",
            "summary": "<string>",
            "text": "<string>",
            "signature": "<string>",
            "data": "<string>"
          }
        ],
        "annotations": [
          {
            "type": "<string>",
            "url_citation": {
              "start_index": 123,
              "end_index": 123,
              "title": "<string>",
              "url": "<string>",
              "sources": {},
              "type": "<string>"
            }
          }
        ],
        "tool_calls": [
          {
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            },
            "index": 123,
            "type": "<string>",
            "id": "<string>"
          }
        ]
      },
      "delta": {
        "role": "<string>",
        "content": "<string>",
        "refusal": "<string>",
        "audio": {
          "id": "<string>",
          "data": "<string>",
          "expires_at": 123,
          "transcript": "<string>"
        },
        "reasoning": "<string>",
        "reasoning_details": [
          {
            "id": "<string>",
            "index": 123,
            "type": "reasoning.summary",
            "summary": "<string>",
            "text": "<string>",
            "signature": "<string>",
            "data": "<string>"
          }
        ],
        "tool_calls": [
          {
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            },
            "index": 123,
            "type": "<string>",
            "id": "<string>"
          }
        ]
      }
    }
  ],
  "model": "<string>",
  "object": "<string>",
  "system_fingerprint": "<string>",
  "usage": {
    "prompt_tokens": 123,
    "prompt_tokens_details": {
      "text_tokens": 123,
      "audio_tokens": 123,
      "image_tokens": 123,
      "cached_tokens": 123
    },
    "completion_tokens": 123,
    "completion_tokens_details": {
      "text_tokens": 123,
      "accepted_prediction_tokens": 123,
      "audio_tokens": 123,
      "citation_tokens": 123,
      "num_search_queries": 123,
      "reasoning_tokens": 123,
      "image_tokens": 123,
      "rejected_prediction_tokens": 123,
      "cached_tokens": 123
    },
    "total_tokens": 123,
    "cost": {
      "input_tokens_cost": 123,
      "output_tokens_cost": 123,
      "request_cost": 123,
      "total_cost": 123
    }
  },
  "extra_fields": {
    "request_type": "<string>",
    "provider": "openai",
    "model_requested": "<string>",
    "model_deployment": "<string>",
    "latency": 123,
    "chunk_index": 123,
    "raw_request": {},
    "raw_response": {},
    "cache_debug": {
      "cache_hit": true,
      "cache_id": "<string>",
      "hit_type": "<string>",
      "provider_used": "<string>",
      "model_used": "<string>",
      "input_tokens": 123,
      "threshold": 123,
      "similarity": 123
    }
  }
}
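The example response shows three places where generated text can appear on a choice: `text`, `message.content`, and `delta.content`. Which of these is populated for a given request is not stated here, so the sketch below simply checks them in turn. `choice_text` and `completion_texts` are hypothetical helpers, and the lookup order is an assumption.

```python
def choice_text(choice):
    """Return the generated text of one `choices` entry, or "" if none.

    Checks `text`, then `message.content`, then `delta.content`.
    The order is an assumption for illustration, not documented behavior.
    """
    if isinstance(choice.get("text"), str):
        return choice["text"]
    for key in ("message", "delta"):
        content = (choice.get(key) or {}).get("content")
        if isinstance(content, str):
            return content
    return ""


def completion_texts(response):
    """Collect the text of every choice, ordered by its `index` field."""
    choices = sorted(response.get("choices") or [],
                     key=lambda c: c.get("index", 0))
    return [choice_text(c) for c in choices]


sample = {
    "choices": [
        {"index": 1, "delta": {"content": "world"}},
        {"index": 0, "text": "hello"},
    ]
}
print(completion_texts(sample))
```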

Body

application/json

model (string, required): Model identifier. Example: "gpt-3.5-turbo-instruct"

prompt (required): The prompt(s) to generate completions for

stream (boolean): Whether to stream the response

max_tokens (integer)

temperature (number): Required range: 0 <= x <= 2

top_p (number)

frequency_penalty (number): Required range: -2 <= x <= 2

presence_penalty (number): Required range: -2 <= x <= 2

logit_bias (object)

logprobs (integer)

stop

suffix (string)

echo (boolean)

best_of (integer)

user (string)

seed (integer)

fallbacks (string[])
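Three body parameters carry a required range. A client can check these before sending, as in the minimal sketch below. `RANGES` and `range_errors` are hypothetical names, only the ranges listed in the parameter list above are checked, and how the server reports an out-of-range value is not covered here.

```python
# Ranges copied from the parameter list above; other fields are not checked.
RANGES = {
    "temperature": (0, 2),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}


def range_errors(body):
    """Return a message for each documented range that `body` violates."""
    errors = []
    for name, (low, high) in RANGES.items():
        value = body.get(name)
        if value is None:
            continue  # optional parameter not supplied
        if not low <= value <= high:
            errors.append(
                f"{name} must satisfy {low} <= x <= {high}, got {value}"
            )
    return errors


print(range_errors({"temperature": 2.5, "presence_penalty": -2}))
```

The bounds are inclusive, so `presence_penalty: -2` passes while `temperature: 2.5` is reported.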

Response

Successful response

id (string)

choices (object[])

model (string)

object (string)

system_fingerprint (string)

usage (object): Token usage information

extra_fields (object): Additional fields included in responses
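The `usage` and `extra_fields` objects are nested, so reading them usually means a few guarded lookups. The sketch below is one way to flatten the fields shown in the example response. `summarize_usage` is a hypothetical helper, and it assumes that any of these fields may be absent from a given response.

```python
def summarize_usage(response):
    """Pull token counts, total cost, and cache status from a response dict.

    Every field is treated as optional; missing values come back as None.
    """
    usage = response.get("usage") or {}
    cost = usage.get("cost") or {}
    extra = response.get("extra_fields") or {}
    cache = extra.get("cache_debug") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
        "total_cost": cost.get("total_cost"),
        "provider": extra.get("provider"),
        "cache_hit": cache.get("cache_hit"),
    }


example = {
    "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12},
    "extra_fields": {"provider": "openai"},
}
summary = summarize_usage(example)
```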
