1303 points | serjester | 1 comment
ThinkBeat No.42955283
Hmm, I have been doing a bit of this manually lately for a personal project. I am working on some old books that are far past any copyright but are not available anywhere on the net (being in Norwegian makes a book a lot more obscure), so I have been working on creating ebooks out of them.

I have a scanner and some OCR processes I run things through. My automatic process gets me to about 85%.

The pain of going from 85% to 99%, though, is considerable (and in my case manual, though Perl helps).
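For anyone wondering what scripted cleanup after OCR can look like, here is a minimal sketch in Python rather than Perl. It is not the process described above. The two rules (rejoining words hyphenated across a line break, collapsing runs of spaces) and the function names are assumptions picked for illustration.

```python
import re


def rejoin_hyphenated(text: str) -> str:
    # Join a word that was split across a line break: "bø-\nker" -> "bøker".
    # Assumption: a hyphen at end of line is a layout break, not a real hyphen.
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def collapse_spaces(text: str) -> str:
    # Squeeze runs of spaces and tabs into one space; newlines are left alone.
    return re.sub(r"[ \t]+", " ", text)


def clean_ocr(text: str) -> str:
    # Apply both hypothetical rules in order.
    return collapse_spaces(rejoin_hyphenated(text))
```

Rules like these are lossy. For poems, line breaks carry meaning, and a genuinely hyphenated compound at the end of a line would be merged by mistake, so the output still needs a manual pass.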

I went to try this AI on one of the short poem manuscripts I have.

I told the prompt I wanted PDF to Markdown, and it said sure, go ahead, give me the PDF. I went to upload it. It spent a long time spinning, then a quick message came up, something like

"Failed to count tokens"

but it just flashes and goes away.

I guess the PDF is too big? Weird though, it's not a lot of pages.
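One way to test the "too big" guess is to compare file size against page count, since a scanned PDF can be large even with few pages when each page is a high-resolution image. The sketch below is a rough heuristic and not a real PDF parser: it counts `/Type /Page` markers in the raw bytes, which undercounts when page objects sit inside compressed object streams. The function name and the returned keys are assumptions.

```python
import re

# Heuristic marker for a page object; the lookahead skips "/Type /Pages".
PAGE_MARKER = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def summarize_pdf(data: bytes) -> dict:
    # Size of the raw file in mebibytes.
    size_mb = len(data) / (1024 * 1024)
    # Approximate page count from uncompressed page objects only.
    pages = len(PAGE_MARKER.findall(data))
    return {
        "size_mb": size_mb,
        "pages": pages,
        # None when no page markers were found (e.g. compressed objects).
        "mb_per_page": size_mb / pages if pages else None,
    }
```

Usage would be something like `summarize_pdf(Path("poems.pdf").read_bytes())` with `from pathlib import Path`, where `poems.pdf` is a hypothetical file name. A high size per page would point at large scans rather than at the number of pages.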

1. sumedh No.42956006
Take a screenshot of the PDF page, give that to the LLM, and see if it can be processed.

Your PDF might have some quirks inside that the LLM cannot process.