I think very soon a new model will destroy whatever startups and services are built around document ingestion. I mean a model that can take in a PDF page as an image and transcribe it to text with near-perfect accuracy.
Extracting plain text isn’t that much of a problem, relatively speaking. Things get challenging when interpreting more complex elements like nested lists, tables, sidebars, footnotes/endnotes, cross-references, images, and diagrams.