GovToolsErschließung: Input variants


Input variants — how evidence is shaped for the model

This page lists every form in which evidence can be handed to the open model. Each entry links to its inputs, generator script, AI use, OCR tool, processing time, and results.

Pipeline versions

01

Pipeline v0.6.1 (full card): The reference full-featured card. It packages all structural information that Docling extracts into one Markdown card per detected table.
Baseline · Open 27%

Per-table card variants

02

Evidence-Preserving Table Normalization Layer: A deterministic five-layer normalizer (OCR-confusable, column-type, table-context, and authority normalization) with per-cell provenance. Repairs 444 cells across 123 cards in the V27/V35 corpus; 0 cells touched on born-digital NOAA.
Experimental · Best 10/13 (Qwen2.5-7B, Granite-3.3-8B, Llama-3 8B)

CSV-only card: Table data rendered as raw CSV inside a Markdown code block. The most effective open-tier variant.
Recommended · Best 11/13

CSV-only with deterministic header normalization: CSV cards whose column headers are cleaned by deterministic rules (no LLM). Addresses cycle-17 CELL_READ_ERROR failures caused by malformed Docling headers.
Experimental · no score listed

CSV with row-de-merge: CSV cards in which fused multi-row records are split back into separate rows. Addresses OCR-merged rows such as the cyprid 'April May,1491 35,...' case.
Experimental · Best 10/13 (Qwen2.5-7B)

CSV with multi-page table stitching: Multi-page tables are reunited across the PDF page break. Closes Q-NOAA-CALC-001 for the frontier-tier reference; open models still need stronger arithmetic to use it.
Experimental · Best 9/13 (Qwen2.5-7B, Granite-3.3-8B)

Micro card (≤1K tokens): The smallest viable card, consisting of a caption, an inline Markdown table, and 6 lines of YAML. Targets the 4K-context open-model tier.
Experimental · Open 40%

Table-only card: Just the Markdown table plus a 1-line caption. No frontmatter, no metadata.
Experimental · Open 52%

Labeled-faithfulness card: The full v0.6.1 card with every section explicitly tagged by provenance type (verbatim-cells, inherited, summarized, inferred).
Experimental · Open 49%
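
To make the CSV-only and row-de-merge variants above more concrete, here is a minimal Python sketch. It is not the project's generator script: the function names `render_csv_card` and `split_fused_row`, the column names, and the splitting heuristic are all assumptions made for illustration. Only the 'April May' / '1491 35' fragment comes from the entry above; the other sample values are invented.

```python
import csv
import io

# Built indirectly so the example itself contains no literal code fence.
FENCE = "`" * 3


def split_fused_row(row):
    """Split a row that looks like two OCR-merged records.

    Hypothetical heuristic (an assumption, not the project's rule): if every
    cell holds exactly two whitespace-separated tokens, treat the row as two
    fused records; otherwise return it unchanged.
    """
    parts = [cell.split() for cell in row]
    if row and all(len(p) == 2 for p in parts):
        return [[p[0] for p in parts], [p[1] for p in parts]]
    return [row]


def render_csv_card(header, rows, de_merge=False):
    """Render a table as raw CSV wrapped in a Markdown code fence."""
    out_rows = []
    for row in rows:
        out_rows.extend(split_fused_row(row) if de_merge else [row])
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(out_rows)
    return f"{FENCE}csv\n{buf.getvalue()}{FENCE}\n"


# Illustrative data: the second row mimics the fused 'April May' record.
card = render_csv_card(
    ["month", "count"],
    [["March", "120"], ["April May", "1491 35"]],
    de_merge=True,
)
print(card)
```

A heuristic this simple would also split legitimate two-word cells (a place name next to a two-number cell, for example), so a real de-merge step would presumably need column-type or table-context checks before splitting.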

Document-level maps

03

Evaluation modes

04
