I definitely vaguely remember doing some incredibly cool things with PDFs and OCR about 6 or 7 years ago. Some project comes to mind... Google tells me it was "tesseract", and that sounds familiar.
(1) be stored in a single file
(2) allow tables, images and anything else that can be shown on a piece of paper
(3) won't have animation, fold-out text, or anything that cannot be shown on a piece of paper
(4) won't require JavaScript or access to external sites
That means never. We got lucky that we at least got PDF before "web designers" made (3) impossible and marketers made (4) impossible.
But for real, that's a pretty easy set of hurdles. Really, the barrier is the psychological fallacy that PDFs are immutable.
Re "PDFs are immutable." - that's not a psychological fallacy, that's a primary advantage of PDFs. If I wanted a mutable format, I'd take an odt (or an rtf or a doc). An "output only" format lets you use the very latest version of an editor app while the result still works even in ancient readers, which is very desirable in many contexts.