- the OmniAI benchmark is bad
- Check out OmniDocBench[1] instead
- Mistral OCR is far, far behind most open-source OCR models, and even further behind Gemini
- End-to-end OCR is still extremely tricky
- composed pipelines work better (layout detection -> reading order -> OCR every element)
- complex table parsing is still extremely difficult
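
To make the "composed pipeline" point concrete, here is a rough Python sketch of the three stages (layout detection -> reading order -> OCR per element). Everything in it is an assumption for illustration: `Region`, `reading_order`, `run_pipeline`, and the fake detector and OCR functions are hypothetical names, not the API of any real OCR library, and the reading-order step is a naive single-column heuristic rather than what a real system would use.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Region:
    """One layout element found on a page (hypothetical structure)."""
    kind: str                        # e.g. "title", "text", "table"
    bbox: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def reading_order(regions: List[Region], row_tolerance: int = 20) -> List[Region]:
    """Naive ordering: top to bottom, then left to right.

    Top edges are bucketed into bands of `row_tolerance` pixels so that
    regions starting at roughly the same height are read left to right.
    Multi-column pages would need something smarter than this.
    """
    return sorted(regions, key=lambda r: (r.bbox[1] // row_tolerance, r.bbox[0]))


def run_pipeline(
    page: Any,
    detect_layout: Callable[[Any], List[Region]],
    ocr_region: Callable[[Any, Region], str],
) -> List[Tuple[str, str]]:
    """Compose the stages: detect regions, order them, OCR each one."""
    regions = detect_layout(page)
    ordered = reading_order(regions)
    return [(r.kind, ocr_region(page, r)) for r in ordered]


# Stand-ins for real models, so the sketch runs without any dependencies.
def fake_detect(page: Any) -> List[Region]:
    return [
        Region("text", (50, 200, 500, 400)),
        Region("title", (50, 10, 500, 60)),
        Region("table", (50, 420, 500, 700)),
    ]


def fake_ocr(page: Any, region: Region) -> str:
    return f"<{region.kind} content>"


result = run_pipeline(None, fake_detect, fake_ocr)
for kind, text in result:
    print(kind, text)
```

The appeal of this shape is that each stage can be swapped or evaluated on its own, for example routing `table` regions to a dedicated table parser, since that is the part that remains hardest.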