-
|
Hi @wyh, great questions! Let me answer each one.

1. Does hybrid mode need a GPU?
No, GPU is not required. The hybrid server runs on CPU by default. If a GPU is available, it will automatically detect and use it for faster processing, but it's entirely optional.

2. How long does it take to process a page?
It depends on the processing path:
The backend processing averages about 0.685 seconds per document, with a range of 0.19s to 4.24s depending on page complexity.

3. Does the system automatically decide which page goes to which mode?
Yes, fully automatic. The built-in
This means most pages are processed lightning-fast in Java, while only complex pages (tables, charts, etc.) are sent to the backend for deeper analysis.

How to run

Step 1. Install with hybrid support
pip install -U "opendataloader-pdf[hybrid]"

Step 2. Start the backend server (first terminal)
opendataloader-pdf-hybrid --port 5002

Step 3. Process PDFs with hybrid mode (second terminal)
# Basic hybrid mode (auto triage)
opendataloader-pdf --hybrid docling-fast input.pdf
# With custom server URL and timeout
opendataloader-pdf --hybrid docling-fast --hybrid-url http://localhost:5002 --hybrid-timeout 60000 input.pdf
# With fallback to Java on backend error
opendataloader-pdf --hybrid docling-fast --hybrid-fallback input.pdf
# Full mode — send ALL pages to backend (required for enrichments)
opendataloader-pdf --hybrid docling-fast --hybrid-mode full input.pdf
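
If you end up running this over many files, a small wrapper can save some typing. The following is only a sketch: the function name `run_hybrid`, the `HYBRID_URL`, `HYBRID_TIMEOUT_MS`, and `DRY_RUN` variables are made up for illustration, and it assumes the `--hybrid-url`, `--hybrid-timeout`, and `--hybrid-fallback` flags shown above can be combined in one call.

```shell
# Hypothetical wrapper around the commands above (not part of the tool).
# Assumption: the flags shown separately above may be used together.
run_hybrid() {
  # Rebuild the argument list: the CLI name, the flags, then the PDF path.
  set -- opendataloader-pdf --hybrid docling-fast \
    --hybrid-url "${HYBRID_URL:-http://localhost:5002}" \
    --hybrid-timeout "${HYBRID_TIMEOUT_MS:-60000}" \
    --hybrid-fallback "$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of executing it.
    echo "$*"
  else
    "$@"
  fi
}

# Example (dry run, nothing is executed):
#   DRY_RUN=1 run_hybrid input.pdf
```

With `DRY_RUN=1` the function only prints the command, which is a cheap way to check the flags before pointing it at a real backend.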
For more details, see the Hybrid Mode documentation. Hope this helps! Feel free to ask if you have more questions. |
-
|
Thanks for such a detailed reply. I actually tried it on both GPU and CPU, and it works like a charm. Thanks again. |
-
|
Hi @Tellyang7!

Model storage location
By default, models are saved to
To change this, you have two options:

Option 1. Environment variable (applies to all Docling operations):
export DOCLING_CACHE_DIR=/path/to/your/cache
opendataloader-pdf-hybrid --port 5002

Option 2. Pre-download with a custom path using docling-tools:
docling-tools models download -o /path/to/your/models

Pre-downloading models (offline / air-gapped)
You can download all models ahead of time so the hybrid server starts instantly with no network access needed:
# Download the default model set
docling-tools models download
# Download ALL available models (including optional VLM, OCR, etc.)
docling-tools models download --all
# Download to a custom directory
docling-tools models download -o /data/docling-models

What gets downloaded and from where
All models come from Hugging Face Hub (
Models are cached after the first download — the warning only appears once. |
-
How long does it take to process a page in hybrid mode?
Does the system automatically decide which page goes to the right mode?