Show HN: Documind – Open-source AI tool to turn documents into structured data

1. inexcf ◴[18 Nov 24 12:51 UTC] No.42171899[source]▶

Got excited about an open-source tool doing this.

Alas, i am let down. It is an open-source tool creating the prompt for the OpenAI API and i can't go and send customer data to them.

I'm aware of https://github.com/clovaai/donut so i hoped this would be more like that.

replies(5): >>42171944 #>>42171963 #>>42172184 #>>42172234 #>>42195901 #

2. _joel ◴[18 Nov 24 12:59 UTC] No.42171944[source]▶

>>42171899 (TP) #

You can self host OpenAPI compatible models with lmstudio and the like. I've used it with https://anythingllm.com/

3. turblety ◴[18 Nov 24 13:03 UTC] No.42171963[source]▶

>>42171899 (TP) #

You might be able to use Ollama, which has a OpenAI compatible API.

replies(1): >>42172054 #

4. Zambyte ◴[18 Nov 24 13:17 UTC] No.42172054[source]▶

>>42171963 #

Not without chaning the code (should be easy though)

https://github.com/DocumindHQ/documind/blob/d91121739df03867...

5. Tammilore ◴[18 Nov 24 13:39 UTC] No.42172184[source]▶

>>42171899 (TP) #

Hi. I totally get the concern about sending data to OpenAI. Right now, Documind uses OpenAI's API just so people could quickly get started and see what it is like, but I’m open to adding options and contributions that would be better for privacy.

replies(1): >>42188570 #

6. ◴[18 Nov 24 13:46 UTC] No.42172234[source]▶

>>42171899 (TP) #

7. inexcf ◴[19 Nov 24 22:01 UTC] No.42188570[source]▶

>>42172184 #

That sounds great.

8. sidmo ◴[20 Nov 24 17:00 UTC] No.42195901[source]▶

>>42171899 (TP) #

I'd recommend checking out vision language models. They generate embeddings of the images themselves (as a collection of patches) and you can see query matching displayed as a heatmap over the document. Picks up text that OCR misses. I built a simple API over it if you want to try it out: https://github.com/DataFog/vlm-api