(www.marginalia.nu)

357 points ingve | 1 comments | 13 May 25 15:01 UTC

elpalek ◴[13 May 25 20:00 UTC] No.43977047[source]▶

I recently tested OCR on a (non-English) PDF with Gemini 2.5 Pro. First, I asked it directly to extract the text from the PDF. Result: a random text blob, not usable.

Second, I converted the PDF into per-page JPGs. Gemini performed exceptionally: near-perfect text extraction, with the formatting intact, in Markdown.

Maybe there's an internal difference in how the model processes PDF vs. JPG.
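
For anyone wanting to reproduce the second step, here is a minimal sketch of rendering a PDF to one JPG per page. It assumes poppler's `pdftoppm` is installed; the comment does not say which tool was actually used, and the helper names, the `page` output prefix, and the 200 DPI default are illustrative choices, not anything from the original test.

```python
import shutil
import subprocess
from pathlib import Path


def pdftoppm_command(pdf_path, out_prefix, dpi=200):
    """Build the argv that renders every page of a PDF to JPEG.

    -jpeg selects JPEG output and -r sets the resolution in DPI.
    The 200 DPI default is an assumption, not a recommendation.
    """
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pages(pdf_path, out_dir, dpi=200):
    """Render the pages into out_dir and return the image paths in page order."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm (poppler-utils) was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdftoppm writes files named <prefix>-<page number>.jpg
    subprocess.run(pdftoppm_command(pdf_path, out_dir / "page", dpi), check=True)
    return sorted(out_dir.glob("page-*.jpg"))
```

Each resulting image can then be sent to the model as a separate page, which forces it to work from rendered pixels rather than from whatever text is embedded in the file.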

replies(1): >>43977077 #

1. jagged-chisel ◴[13 May 25 20:03 UTC] No.43977077[source]▶

The model probably isn't rendering the PDF, just looking in the file for text.
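
To illustrate why "looking in the file for text" can go wrong, here is a rough sketch of naive extraction from a PDF content stream. It says nothing about what Gemini actually does internally. The content stream below is a hand-written toy, and `naive_strings` is a hypothetical helper that only handles literal strings shown with the `Tj` operator.

```python
import re
import zlib

# Toy content stream (an assumption for illustration): two strings drawn at
# different positions with the Tj "show text" operator.
CONTENT = b"BT /F1 12 Tf 72 700 Td (Hello) Tj 0 -14 Td (World) Tj ET"

# Real PDFs usually store such streams compressed (/FlateDecode), so the
# raw bytes in the file generally do not contain the readable strings.
compressed = zlib.compress(CONTENT)


def naive_strings(data):
    """Return literal strings passed to Tj, in stream order.

    The bytes are character codes in the font's own encoding. For an
    embedded or subset font without a Unicode mapping they need not be
    readable text at all, and reading order and layout are not recorded.
    """
    pattern = rb"\(((?:[^()\\]|\\.)*)\)\s*Tj"
    return [m.decode("latin-1") for m in re.findall(pattern, data)]


extracted = naive_strings(zlib.decompress(compressed))
```

Even in this best case the result is just a list of strings with no positions attached. With custom font encodings, as is common in non-English documents, the same approach can yield exactly the kind of random text blob described above, while a rendered image sidesteps the problem.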

PDF to Text, a challenging problem