Runs OCR over a scanned PDF or image and returns the recognized content as per-page markdown, preserving document structure, including tables.
High-quality OCR and layout/table reconstruction are hard to achieve locally. Mistral OCR returns clean, structured markdown rather than a flat character dump.
For PDFs that already contain a text layer, pdf-tools/extract_text is cheaper. Use pages to limit which pages are processed (billing is per page processed).
Optional 0-indexed page numbers to process. Defaults to all pages.
Preferred output style. The response always includes per-page markdown.
Pre-signed GET URL of the scanned PDF or image.
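
A minimal sketch of assembling arguments for this tool, assuming the page selection is passed as a list under the key pages and the document location under a key named url. Only pages is named in the text above; the url key and the helper build_ocr_arguments are hypothetical and used purely for illustration. The helper converts 1-based page numbers, as shown in a PDF viewer, into the 0-indexed form the tool expects, and removes duplicates so the request does not list the same page twice.

```python
from typing import Iterable, Optional


def build_ocr_arguments(url: str, human_pages: Optional[Iterable[int]] = None) -> dict:
    """Build a hypothetical argument dict for the OCR tool.

    human_pages uses 1-based numbering (as seen in a PDF viewer);
    the tool expects 0-indexed page numbers.
    """
    args = {"url": url}  # "url" is an assumed key name, not confirmed by the source
    # Convert to 0-indexed, drop duplicates, and sort for a stable request.
    pages = sorted({p - 1 for p in (human_pages or [])})
    if pages:
        if pages[0] < 0:
            raise ValueError("page numbers must be >= 1")
        args["pages"] = pages
    # With no page selection, "pages" is omitted so the tool defaults to all pages.
    return args


args = build_ocr_arguments("https://example.com/scan.pdf", [3, 1, 3])
print(args["pages"])  # [0, 2]
```

Since billing is per page processed, the length of the resulting pages list is a reasonable way to estimate how many pages a request covers before sending it.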