(llamaocr.com)

293 points lapnect | 2 comments | 16 Nov 24 04:57 UTC

nutlope ◴[16 Nov 24 07:16 UTC] No.42155007[source]▶

Hi all, I'm the author of llama-ocr. Thank you for sharing & for the kind comments! I built this earlier this week since I wanted a simple API to do OCR. It uses Llama 3.2 Vision (hosted on together.ai, where I work) to parse images into structured markdown. I also have it available as an npm package.

I'm planning to add a bunch of other features, like the ability to parse PDFs, output a response in JSON, etc. If anyone has any questions, feel free to send them and I'll try to respond!

replies(5): >>42155235 #>>42155376 #>>42155942 #>>42158372 #>>42159434 #

1. nh2 ◴[16 Nov 24 09:00 UTC] No.42155376[source]▶

>>42155007 #

I put in a bill that has 3 identical line items and it didn't include them as 3 bullet points as usual, but generated a table with a "quantity" column that doesn't exist on the original paper.

Is this degree of transformation expected/desirable?

(It also means that the output is sometimes a bullet point list, sometimes a table, making further automatic processing a bit harder.)
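
One way to cope with the list-or-table variation downstream is to normalize both shapes into plain rows before further processing. The following is a minimal Python sketch, not part of llama-ocr: the function names `normalize_items` and `split_row` are hypothetical, the sample receipts are made up, and it assumes the output contains only simple markdown bullets and pipe tables.

```python
import re

# A markdown bullet such as "- Coffee 3.50" (also "*" or "+").
BULLET = re.compile(r"^\s*[-*+]\s+(.*)$")
# A pipe-table delimiter row such as "| --- | :---: |".
SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def split_row(line):
    """Split one pipe-table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def normalize_items(markdown):
    """Return every line item as a list of cells.

    Bullets become single-cell rows; table body rows become multi-cell rows.
    Table header and delimiter rows are dropped. Hypothetical helper.
    """
    rows = []
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        if SEPARATOR.match(line):
            continue  # delimiter row (or a bare horizontal rule)
        bullet = BULLET.match(line)
        if bullet:
            rows.append([bullet.group(1).strip()])
        elif "|" in line:
            # A header row is the table row directly above a delimiter row.
            if i + 1 < len(lines) and SEPARATOR.match(lines[i + 1]):
                continue
            rows.append(split_row(line))
    return rows


# Made-up sample outputs for the same bill in both shapes.
as_bullets = "- Coffee 3.50\n- Coffee 3.50\n- Coffee 3.50"
as_table = "| Item | Quantity | Price |\n| --- | --- | --- |\n| Coffee | 3 | 3.50 |"

bullet_rows = normalize_items(as_bullets)
table_rows = normalize_items(as_table)
```

This only gives both outputs a common shape. It does not undo the aggregation itself: three identical bullets still come back as three rows, and the table still comes back as one row with a quantity cell that was not on the original paper.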

replies(1): >>42156858 #

2. zainia ◴[16 Nov 24 15:18 UTC] No.42156858[source]▶

>>42155376 (TP) #

Here's the prompt being used, tweaking that might help: https://github.com/Nutlope/llama-ocr/blob/main/src/index.ts#...

Llama-OCR: Document to Markdown