Hybrid AI mode
Hybrid mode routes complex pages to a dedicated AI backend. It raises table accuracy from roughly 49% to roughly 93% and adds OCR for scanned text, extraction of mathematical formulas, and AI image descriptions. Available on Pro and Business plans.
ℹ Info
Fast mode vs Hybrid mode
| Feature | Fast (default) | Hybrid AI |
|---|---|---|
| Speed | ~0.015s / page | ~0.5s / page |
| Table accuracy | ~49% | ~93% (+90% relative) |
| Reading order | ~90% | ~93% |
| Heading detection | ~74% | ~82% |
| OCR (scanned PDFs) | ❌ | ✅ |
| Formula extraction | ❌ | ✅ |
| Image descriptions | ❌ | ✅ (full mode) |
| Tier required | All | Pro / Business |
| Fallback if unavailable | — | Falls back to Fast |
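The per-page speeds in the table translate into rough job times. A minimal sketch of that arithmetic, assuming the approximate figures above apply uniformly to every page (real timings will vary with page complexity, and the function name is illustrative only):

```python
# Approximate per-page processing times, taken from the comparison table.
SECONDS_PER_PAGE = {"fast": 0.015, "hybrid": 0.5}


def estimate_seconds(pages: int, mode: str) -> float:
    """Rough processing-time estimate for a document with `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * SECONDS_PER_PAGE[mode]


if __name__ == "__main__":
    for mode in SECONDS_PER_PAGE:
        print(f"{mode}: ~{estimate_seconds(200, mode):.1f}s for 200 pages")
```

For a 200-page document this works out to roughly 3 seconds in Fast mode and roughly 100 seconds in Hybrid mode.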
When to use Hybrid mode
- Complex tables — financial statements, pricing tables, comparison matrices with merged cells or no visible borders
- Scanned PDFs — documents digitised by scanner or camera with no selectable text layer
- Mathematical formulas — LaTeX-heavy academic papers, scientific reports
- Mixed layouts — documents combining text, tables, images and sidebars
- Image-heavy reports — when you want AI-generated natural language descriptions of charts and diagrams
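The criteria above can be turned into a small decision helper. This is a sketch only: the function and its boolean parameters are hypothetical, and it assumes that omitting the hybrid key leaves the default Fast engine in place. The option keys (hybrid, hybrid_mode, force_ocr) and the value docling-fast are the ones used in the API examples in this guide.

```python
import json


def choose_options(complex_tables=False, scanned=False, fully_scanned=False,
                   formulas=False, mixed_layout=False, describe_images=False):
    """Return an options dict for a job request (hypothetical helper)."""
    needs_hybrid = any([complex_tables, scanned, fully_scanned,
                        formulas, mixed_layout, describe_images])
    if not needs_hybrid:
        # Assumption: with no "hybrid" key the default Fast engine is used.
        return {}
    options = {"hybrid": "docling-fast"}
    if describe_images:
        options["hybrid_mode"] = "full"  # also adds AI image descriptions
    if fully_scanned:
        options["force_ocr"] = True  # run OCR on every page
    return options


if __name__ == "__main__":
    # The JSON string is what would go into the `options` form field.
    print(json.dumps(choose_options(fully_scanned=True)))
```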
Enabling Hybrid mode
From the web UI
In the parser options panel, scroll to Advanced — Hybrid mode and select Hybrid (smart routing). Optionally enable Picture descriptions for AI captions.
Via API
# Standard Hybrid — tables + OCR
-F 'options={"hybrid":"docling-fast"}'
# Full Hybrid — also adds AI image descriptions
-F 'options={"hybrid":"docling-fast","hybrid_mode":"full"}'
# Force OCR on every page (scanned PDFs)
-F 'options={"hybrid":"docling-fast","force_ocr":true}'
OCR for scanned PDFs
Scanned PDFs contain images of pages rather than selectable text. The fast engine cannot extract text from these documents. Hybrid mode uses an OCR engine (EasyOCR / Tesseract / RapidOCR — auto-selected) to recognise text in each page image.
# Process a scanned PDF
curl -X POST https://api.ragify.it/jobs \
-H "X-Api-Key: rg_..." \
-F "file=@scanned_contract.pdf" \
-F 'options={
"format":["markdown"],
"hybrid":"docling-fast",
"force_ocr":true
}'
✦ Tip
Set force_ocr: true when you know a document is fully scanned. Without it, Hybrid mode applies OCR only to pages where no selectable text is found.
Formula extraction
Hybrid mode can identify and extract mathematical formulas. In the JSON output, formulas appear as elements with type: "formula" and the formula is rendered in LaTeX notation within the content field.
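Formula elements can be collected from a parsed result with a simple filter. A minimal sketch, assuming the JSON output has already been loaded into a flat Python list of element dicts shaped like the example in this section. The overall layout of the output document is not specified here, so the list structure, the helper name, and the "paragraph" type in the sample are assumptions.

```python
def extract_formulas(elements):
    """Return (page_number, latex) pairs for every formula element."""
    return [
        (el.get("page_number"), el["content"])
        for el in elements
        if el.get("type") == "formula"
    ]


if __name__ == "__main__":
    sample = [
        # "paragraph" is a hypothetical type used only for contrast.
        {"type": "paragraph", "content": "Mass-energy equivalence:",
         "page_number": 4},
        {"type": "formula", "content": "E = mc^{2}", "page_number": 4,
         "bbox": [100, 300, 400, 330]},
    ]
    print(extract_formulas(sample))
```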
{
"type": "formula",
"content": "E = mc^{2}",
"page_number": 4,
"bbox": [100, 300, 400, 330]
}
AI image descriptions (full mode)
When hybrid_mode: "full" is set, each extracted image is processed by SmolVLM (a compact vision-language model) to generate a natural language description. The description appears in the JSON output alongside the figure element.
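The generated descriptions can be gathered per page, for example to build alt text. A minimal sketch, assuming the JSON output has been loaded into a flat list of element dicts shaped like the figure example in this section; the list layout and the helper name are assumptions, not part of the documented API.

```python
from collections import defaultdict


def descriptions_by_page(elements):
    """Group figure descriptions by page number."""
    by_page = defaultdict(list)
    for el in elements:
        # Keep only figure elements that carry a non-empty description.
        if el.get("type") == "figure" and el.get("content"):
            by_page[el["page_number"]].append(el["content"])
    return dict(by_page)


if __name__ == "__main__":
    sample = [
        {"type": "figure", "content": "Bar chart showing quarterly revenue growth.",
         "page_number": 5, "bbox": [72, 150, 540, 420]},
        {"type": "formula", "content": "E = mc^{2}", "page_number": 4},
    ]
    print(descriptions_by_page(sample))
```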
{
"type": "figure",
"content": "Bar chart showing quarterly revenue growth from Q1 to Q4 2025, with Q3 showing the highest bar at €2.4M.",
"page_number": 5,
"bbox": [72, 150, 540, 420]
}
◆ Note
Automatic fallback
hybrid_fallback defaults to true. If the Hybrid AI backend is unavailable or returns an error for a page, that page is automatically re-processed with the fast engine and the job completes normally. You will not receive an error, only slightly lower accuracy for the affected pages.
Set hybrid_fallback: false only if you require guaranteed Hybrid processing and prefer the job to fail rather than silently fall back.
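The fallback behaviour described above can be pictured with a short sketch. It illustrates the documented semantics and is not the service's implementation; hybrid_engine and fast_engine are hypothetical callables standing in for the two backends.

```python
def process_page(page, hybrid_engine, fast_engine, hybrid_fallback=True):
    """Try the hybrid backend first; optionally fall back to the fast engine."""
    try:
        return hybrid_engine(page)
    except Exception:
        if not hybrid_fallback:
            # Strict mode: surface the failure instead of falling back.
            raise
        return fast_engine(page)


if __name__ == "__main__":
    def unavailable(page):
        raise RuntimeError("hybrid backend unavailable")

    # With the default setting the page is still processed, by the fast engine.
    print(process_page("page-1", unavailable, lambda p: f"fast:{p}"))
```

With hybrid_fallback=False the same call raises the backend error, which mirrors a job failing rather than silently falling back.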