101 points kozmonaut | 5 comments
spwa4 ◴[] No.45393748[source]
Luckily, one thing LLMs with image input are ridiculously good at is piracy. You want to get a book off a Kindle? Easier than with a real book, easily.

What Amazon could block is getting books from other sources onto a Kindle. But there are plenty of devices. I use an iPad.

replies(2): >>45393904 #>>45394196 #
duskwuff ◴[] No.45393904[source]
An LLM? Just what I always wanted - an OCR tool that hallucinates.
replies(2): >>45394155 #>>45394243 #
1. spwa4 ◴[] No.45394243[source]
You don't? Think about it. If your picture/source data is not perfectly clear ... what do you want? We all want perfection, but if you can't have that ...

Would you prefer what current OCR does and just suddenly sentences go 2#!@%7Q&*@3 ladfk !@$?

Or would you rather have a reasonable completion of a sentence that is nearly always (but not quite always) correct, that even actually takes the context into account?

replies(2): >>45394259 #>>45394549 #
2. duskwuff ◴[] No.45394259[source]
> Would you prefer what current OCR does and just suddenly sentences go 2#!@%7Q&*@3 ladfk !@$?

Yes, actually. I'd rather be aware that the OCR tool failed somewhere than have the tool silently fabricate part of the text, or "correct" perceived errors which were present in the source document.

replies(1): >>45394417 #
3. boredhedgehog ◴[] No.45394417[source]
But you aren't aware, because the OCR doesn't know that it failed. You would have to go through the entire text by hand to fix the corruptions, but that's too much work, so you won't, and the corruptions stay in.

In practice and at scale, the guesses of the LLM are the superior outcome.

replies(1): >>45394712 #
4. akho ◴[] No.45394549[source]
Your picture of an ebook is perfectly clear.
5. thaumasiotes ◴[] No.45394712{3}[source]
> But you aren't aware, because the OCR doesn't know that it failed. You would have to go through the entire text by hand to fix the corruptions, but that's too much work, so you won't, and the corruptions stay in.

Well, if you assume that you're never going to read the book, then sure. But in that case it's even more efficient to not OCR the book at all. You'll never know the difference.

If you do read the book, you'll know where the failures are. And they're easy to correct if you can edit the document. I usually file reports of printing errors in Kindle books when I encounter them.

(Do the errors get corrected? No.)