273 points aaln | 2 comments
Onavo ◴[] No.42151201[source]
What's the PDF parsing like?
replies(1): >>42152699 #
1. aaln ◴[] No.42152699[source]
Extract all the text from the PDF, render each page as an image, then send each page's text along with its image to an LLM, together with the desired output structure.
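
A rough sketch of that per-page flow, in case it helps. This is not the actual implementation: it assumes the text extraction and page rendering have already been done by whatever PDF library you prefer, so `page_texts` and `page_images` are plain inputs. `PageRequest`, `build_page_requests`, `parse_pages`, and the `call_llm` callable are all hypothetical names, and the example assumes the model replies with JSON.

```python
import base64
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class PageRequest:
    """One LLM request: a single page's prompt plus its rendered image."""
    page_number: int
    prompt: str
    image_b64: str  # page image, base64-encoded for transport


def build_page_requests(page_texts, page_images, output_schema):
    """Pair each page's extracted text with its image and the desired structure.

    page_texts:    list[str], one entry per page (extraction done elsewhere)
    page_images:   list[bytes], one rendered image per page (rendering done elsewhere)
    output_schema: JSON-serialisable description of the structure we want back
    """
    if len(page_texts) != len(page_images):
        raise ValueError("need exactly one image per page of text")

    schema_json = json.dumps(output_schema, indent=2)
    requests = []
    for number, (text, image) in enumerate(zip(page_texts, page_images), start=1):
        prompt = (
            f"Extracted text of page {number}:\n{text}\n\n"
            "Using the text and the attached page image, reply with JSON "
            f"matching this structure:\n{schema_json}"
        )
        image_b64 = base64.b64encode(image).decode("ascii")
        requests.append(PageRequest(number, prompt, image_b64))
    return requests


def parse_pages(requests, call_llm: Callable[[PageRequest], str]):
    """Send every page request through call_llm (hypothetical) and decode the JSON replies."""
    return [json.loads(call_llm(request)) for request in requests]
```

`call_llm` is left as a plain callable so any vision-capable model client can be plugged in. Sending the text and the image together gives the model both the exact characters and the page layout.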
replies(1): >>42153144 #
2. Onavo ◴[] No.42153144[source]
You are not doing any of the fancy table-extraction stuff?