Extracts text and content from documents or images using optical character recognition. Supports PDF URLs, base64-encoded documents, and image URLs.
Bearer token authentication. Use your provider API key or Bifrost authentication token.
Virtual keys (prefixed with sk-bf-) can also be passed here.
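As a minimal sketch of the authentication described above, the token can be sent as a standard HTTP Bearer credential. The helper names `build_auth_headers` and `is_virtual_key` are hypothetical and used only for illustration; the `sk-bf-` prefix is the one stated above.

```python
def build_auth_headers(token: str) -> dict:
    """Return HTTP headers that carry the token as a Bearer credential."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def is_virtual_key(token: str) -> bool:
    """Report whether the token looks like a virtual key (sk-bf- prefix)."""
    return token.startswith("sk-bf-")
```

The same header shape applies whether the token is a provider API key, a Bifrost authentication token, or a virtual key; only the token value differs.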
Model in provider/model format, for example "mistral/mistral-ocr-latest"
Optional unique identifier for the request
Fallback models in provider/model format
Whether to include base64-encoded images in the response
Specific page indices to process (0-based). Required range: x >= 0
Maximum number of images to extract per page. Required range: x >= 1
Minimum image size in pixels to extract. Required range: x >= 1
Format for extracted tables (e.g., "markdown", "html")
Whether to extract page headers
Whether to extract page footers
Granularity of confidence scores to include in the response. Available options: page, block, word, document
Format for bounding box annotations. Supports text, json_object, and json_schema modes.
Format for document-level annotations. Supports text, json_object, and json_schema modes.
Custom prompt for document annotation
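The parameters above can be assembled into a request body along the following lines. This is only a sketch under stated assumptions: the field names (`model`, `document`, `pages`, `image_limit`, `image_min_size`, `confidence_scores_granularity`) and the shape of the `document` object are hypothetical, chosen to mirror the descriptions, and should be checked against the actual schema. The range and enum checks follow the constraints listed above.

```python
from typing import Any, Optional

# Granularity values listed in the parameter description above.
CONFIDENCE_GRANULARITIES = {"page", "block", "word", "document"}


def build_ocr_payload(
    model: str,
    document_url: str,
    pages: Optional[list] = None,
    image_limit: Optional[int] = None,
    image_min_size: Optional[int] = None,
    confidence_granularity: Optional[str] = None,
) -> dict:
    """Build an OCR request body; all field names here are assumptions."""
    provider, sep, name = model.partition("/")
    if not (provider and sep and name):
        raise ValueError("model must be in provider/model format")

    payload: dict = {
        "model": model,
        # Hypothetical document shape for a PDF URL input.
        "document": {"type": "document_url", "document_url": document_url},
    }
    if pages is not None:
        if any(p < 0 for p in pages):
            raise ValueError("page indices are 0-based and must be >= 0")
        payload["pages"] = list(pages)
    if image_limit is not None:
        if image_limit < 1:
            raise ValueError("image_limit must be >= 1")
        payload["image_limit"] = image_limit
    if image_min_size is not None:
        if image_min_size < 1:
            raise ValueError("image_min_size must be >= 1")
        payload["image_min_size"] = image_min_size
    if confidence_granularity is not None:
        if confidence_granularity not in CONFIDENCE_GRANULARITIES:
            raise ValueError("unknown confidence granularity")
        payload["confidence_scores_granularity"] = confidence_granularity
    return payload
```

Validating on the client side is optional; it simply surfaces out-of-range values before the request is sent. Optional parameters that are left unset are omitted from the body rather than sent as null.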