Hi all, I'm the author of llama-ocr. Thank you for sharing & for the kind comments! I built this earlier this week since I wanted a simple API to do OCR – it uses llama 3.2 vision (hosted on together.ai, where i work) to parse images into structured markdown. I also have it available as an npm package.
Planning to add a bunch of other features like the ability to parse PDFs, output a response in JSON, ect... If anyone has any questions, feel free to send them and I'll try to respond!
replies(5):