357 points ingve | 2 comments
dwheeler ◴[] No.43974621[source]
The better solution is to embed, in the PDF, the editable source document. This is easily done by LibreOffice. Embedding it takes very little space in general (because it compresses well), and then you have MUCH better information on what the text is and its meaning. It works just fine with existing PDF readers.
replies(5): >>43974667 #>>43974983 #>>43975217 #>>43975401 #>>43976216 #
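For anyone curious what "embedding the source in the PDF" can look like at the file level, here is a minimal sketch using only the Python standard library. It writes a blank one-page PDF that carries a Flate-compressed attachment through the standard embedded-file mechanism (a name tree under `/Names /EmbeddedFiles`). This is an illustration under assumptions, not LibreOffice's export code, and LibreOffice's hybrid-PDF export may store the source differently. The function names and the `source.fodt` filename are hypothetical, and the extractor only understands files produced by this sketch.

```python
import re
import zlib


def build_pdf_with_attachment(name: str, payload: bytes) -> bytes:
    """Return a blank one-page PDF carrying `payload` as an embedded file.

    Assumes `name` is plain ASCII without parentheses or backslashes,
    since no PDF string escaping is done here.
    """
    fname = name.encode("ascii")
    compressed = zlib.compress(payload)  # stored with /Filter /FlateDecode
    objects = [
        # 1: catalog, pointing at the page tree and the attachment name tree
        b"<< /Type /Catalog /Pages 2 0 R /Names << /EmbeddedFiles 4 0 R >> >>",
        # 2: page tree with a single page
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # 3: a blank US Letter page
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
        # 4: name tree mapping the file name to its file specification
        b"<< /Names [(" + fname + b") 5 0 R] >>",
        # 5: file specification referencing the embedded stream
        b"<< /Type /Filespec /F (" + fname + b") /EF << /F 6 0 R >> >>",
        # 6: the compressed attachment itself
        b"<< /Type /EmbeddedFile /Filter /FlateDecode /Length %d >>\nstream\n"
        % len(compressed) + compressed + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj", needed by xref
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    # Classic cross-reference table: one 20-byte entry per object.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_pos,
    )
    return bytes(out)


def extract_attachment(pdf: bytes) -> bytes:
    """Recover the attachment from a PDF written by build_pdf_with_attachment.

    Not a general PDF parser: it relies on the exact dictionary layout above.
    """
    match = re.search(
        rb"/Type /EmbeddedFile[^>]*/Length (\d+) >>\nstream\n", pdf
    )
    if match is None:
        raise ValueError("no embedded file found")
    start = match.end()
    length = int(match.group(1))
    return zlib.decompress(pdf[start:start + length])
```

Calling `build_pdf_with_attachment("source.fodt", data)` and then `extract_attachment(...)` on the result should round-trip `data`. How much space the attachment adds depends on how compressible the source is; an already-zipped format such as `.odt` will not shrink much further.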
1. carabiner ◴[] No.43975217[source]
I bet 90% of the problem space is legacy PDFs. My company has thousands of these. Some are crappy scans. Some have Adobe's OCR embedded, but most have none at all.
replies(1): >>43977501 #
2. ◴[] No.43977501[source]