(llamaocr.com)

293 points lapnect | 1 comments | 16 Nov 24 04:57 UTC

nutlope ◴[16 Nov 24 07:16 UTC] No.42155007[source]▶

Hi all, I'm the author of llama-ocr. Thank you for sharing & for the kind comments! I built this earlier this week since I wanted a simple API to do OCR – it uses Llama 3.2 Vision (hosted on together.ai, where I work) to parse images into structured markdown. I also have it available as an npm package.

Planning to add a bunch of other features like the ability to parse PDFs, output a response in JSON, etc. If anyone has any questions, feel free to send them and I'll try to respond!

replies(5): >>42155235 #>>42155376 #>>42155942 #>>42158372 #>>42159434 #

1. rch ◴[16 Nov 24 21:18 UTC] No.42159434[source]▶

>>42155007 #

I've had trouble pulling scientific content out of poster PDFs, mostly because tools like nougat fall apart on different layouts.

Have you considered that usage yet?

Llama-OCR: Document to Markdown