
168 points Tammilore | 2 comments

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema
- Returns clean, structured JSON that's ready to use
- Works with just a PDF link + your schema definition

Just run npm install documind to get started.

inexcf ◴[] No.42171899[source]
Got excited about an open-source tool doing this.

Alas, I am let down. It is an open-source tool that builds the prompt for the OpenAI API, and I can't send customer data to them.

I'm aware of https://github.com/clovaai/donut so I hoped this would be more like that.

replies(5): >>42171944 #>>42171963 #>>42172184 #>>42172234 #>>42195901 #
1. turblety ◴[] No.42171963[source]
You might be able to use Ollama, which has an OpenAI-compatible API.
replies(1): >>42172054 #
2. Zambyte ◴[] No.42172054[source]
Not without changing the code (should be easy, though).

https://github.com/DocumindHQ/documind/blob/d91121739df03867...
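For illustration only, here is a rough sketch of why an OpenAI-compatible API makes the swap small: the request has the same shape, and only the base URL, key, and model change. This is not Documind's code. `ChatTarget`, `BuiltRequest`, and `buildChatRequest` are hypothetical names, the model name is a placeholder for whatever you have pulled locally, and the URL assumes Ollama's default local port.

```typescript
// Hypothetical description of an OpenAI-compatible server to talk to.
interface ChatTarget {
  baseUrl: string; // root of the OpenAI-style API, including /v1
  apiKey: string;
  model: string;
}

// Plain description of the HTTP request, so nothing is sent until you choose to.
interface BuiltRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumption: Ollama running locally on its default port.
const OLLAMA: ChatTarget = {
  baseUrl: "http://localhost:11434/v1",
  apiKey: "ollama", // placeholder; Ollama ignores it, but clients usually expect one
  model: "llama3.2", // placeholder; use a model you have pulled locally
};

// Build a chat-completions request for the given target.
function buildChatRequest(target: ChatTarget, prompt: string): BuiltRequest {
  // Drop trailing slashes so the path joins cleanly.
  const root = target.baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${target.apiKey}`,
    },
    body: JSON.stringify({
      model: target.model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

To actually send it you would do something like `const { url, ...init } = buildChatRequest(OLLAMA, "..."); await fetch(url, init);`. Whether a local model handles document extraction as well as the hosted one is a separate question that this sketch does not answer.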