Evaluation mode · Experimental

M2c — Docling Markdown mode

Model receives Docling's full linearized Markdown for the entire document.

In one paragraph

Instead of one targeted card, the model gets `docling.md` — Docling's full linearized text version of the whole document. NOAA = ~260 KB; V27 = 4.1 MB; V35 = 2.8 MB. Tests whether the linearized full-document format works when the model has sufficient context. Frontier-tier reference handles V27/V35 fine; smaller open models overflow on the larger documents.

How the inputs are generated

Generation · 01

Generator script

Docling library — produced during the initial pipeline conversion

Input sources

• Docling docling.md export (whole-document linearization)

AI use

No — pure deterministic transformation

OCR / re-OCR

Inherits from Docling's extraction step

Tool: Docling --force-reocr

Approximate processing time

Docling conversion: ~25 min for V27 (with OCR); model inference: ~10 seconds per cell when prompt fits.

Resource intensity

High — Docling extraction with OCR, multi-minute

Determinism

Deterministic (same input → same output, byte-identical)

Introduced

Cycle 8, 2026-05-21.

Related variants

Cross-reference · 06

← Back to all variants