I wonder how this compares to open-source models such as Llama 3.2, which might be less accurate but even cheaper if self-hosted. I'll see if I can run the benchmark.
Also, regarding the failure case in the footnote, I think Gemini actually got that right (or at least outperformed Reducto). The original document seems to have what I call a "3D" table, where the third axis is the rows within each cell, and having multiple headers is probably the best approximation in Markdown.
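To make the "3D" idea concrete, here's a rough sketch of one way such a table could be flattened. It's only my reading of "multiple headers", not what Gemini or Reducto actually do: each cell holds a list of sub-rows, and the function emits one Markdown table per sub-row index, repeating the header each time. The function name `slices_to_markdown` and the sample data are made up for illustration.

```python
def slices_to_markdown(headers, rows):
    """Flatten a table whose cells each hold several sub-rows.

    Assumption (hypothetical layout): rows[i][j] is a list of strings,
    one per sub-row inside that cell. One Markdown table is emitted per
    sub-row index, so the header row repeats for each slice.
    """
    # Depth of the third axis: the largest number of sub-rows in any cell.
    depth = max(len(cell) for row in rows for cell in row)
    header_line = "| " + " | ".join(headers) + " |"
    rule_line = "| " + " | ".join("---" for _ in headers) + " |"

    blocks = []
    for k in range(depth):
        lines = [header_line, rule_line]
        for row in rows:
            # Cells with fewer sub-rows than `depth` are padded with "".
            cells = [cell[k] if k < len(cell) else "" for cell in row]
            lines.append("| " + " | ".join(cells) + " |")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Made-up data: the second column has two sub-rows per cell.
    headers = ["Item", "Value"]
    rows = [[["A"], ["1", "2"]],
            [["B"], ["3", "4"]]]
    print(slices_to_markdown(headers, rows))
```

The trade-off is that the relationship between slices is only implied by the repeated header, which is about as close as plain Markdown tables get without merged cells.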
replies(1):