Ingesting PDFs and why Gemini 2.0 changes everything

Ingesting PDFs accurately is a noble goal which will no doubt be solved as LLMs get better. However, I need to point out that the financial statement example used in the article already has a solution: iXBRL.

Many financial regulators require companies to publish statements heavily marked up with iXBRL. That markup captures nuances in the numbers (scale, sign, unit, reporting period) that OCR of a post-processed table cannot recover.
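
To make that concrete, here is a rough sketch of pulling tagged numbers out of an iXBRL fragment with Python's standard library. The sample markup, the `ex:` concept names and the `extract_facts` helper are made up for illustration. I'm also assuming the displayed text uses commas as thousands separators, whereas a real filing declares its number format in a `format` attribute that a proper processor would honour.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespace of Inline XBRL 1.1.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# Hypothetical fragment: the page shows "1,234" and "(56)", while the
# tags carry the scale, sign, unit and reporting context.
SAMPLE = """<div xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <p>Revenue: <ix:nonFraction name="ex:Revenue" contextRef="FY2024"
      unitRef="USD" scale="6" decimals="-6">1,234</ix:nonFraction></p>
  <p>Net loss: (<ix:nonFraction name="ex:ProfitLoss" contextRef="FY2024"
      unitRef="USD" scale="3" decimals="-3" sign="-">56</ix:nonFraction>)</p>
</div>"""


def extract_facts(xhtml):
    """Return one dict per ix:nonFraction element, in document order."""
    root = ET.fromstring(xhtml)
    facts = []
    for el in root.iter(f"{{{IX_NS}}}nonFraction"):
        # Assumption: commas are thousands separators in the displayed text.
        text = "".join(el.itertext()).replace(",", "").strip()
        # scale="6" means the displayed number is in millions.
        value = Decimal(text) * (Decimal(10) ** int(el.get("scale", "0")))
        # sign="-" marks a negative value even though the page uses brackets.
        if el.get("sign") == "-":
            value = -value
        facts.append({
            "name": el.get("name"),
            "context": el.get("contextRef"),
            "unit": el.get("unitRef"),
            "value": value,
        })
    return facts


for fact in extract_facts(SAMPLE):
    print(fact["name"], fact["context"], fact["unit"], fact["value"])
```

An OCR pass over the rendered table would see "1,234" and "(56)" and have to guess that one is in millions, the other in thousands, and that the brackets mean a loss. The tags state all of that explicitly.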

Of course, financial documents are a narrow subset of the problem.

Maybe the problem is with PDF as a format: unfortunately, PDFs lose that metadata when they are built from source documents.

I can't help but feel that PDFs could be more portable, as their acronym suggests they should be.