
168 points Tammilore | 1 comment

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema
- Returns clean, structured JSON that's ready to use
- Works with just a PDF link + your schema definition

Just run npm install documind to get started.
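
To make the "schema in, JSON out" idea concrete, here is a minimal sketch of checking returned JSON against a schema. The schema shape, the field names, the `checkAgainstSchema` helper, and the sample data are all hypothetical and are not taken from Documind's API. It only shows one way a field list could be compared with whatever JSON an extraction step returns.

```javascript
// Hypothetical schema shape for illustration only; not Documind's documented format.
const invoiceSchema = [
  { name: "invoiceNumber", type: "string" },
  { name: "total", type: "number" },
  { name: "lineItems", type: "array" },
];

// Compare a parsed JSON object with the schema and collect any mismatches.
function checkAgainstSchema(schema, data) {
  const problems = [];
  for (const field of schema) {
    if (!(field.name in data)) {
      problems.push(`missing field: ${field.name}`);
      continue;
    }
    const value = data[field.name];
    // typeof reports arrays as "object", so detect them explicitly.
    const actual = Array.isArray(value) ? "array" : typeof value;
    if (actual !== field.type) {
      problems.push(`${field.name}: expected ${field.type}, got ${actual}`);
    }
  }
  return problems;
}

// Made-up example of what an extraction step might return.
const extracted = { invoiceNumber: "INV-001", total: "120.50", lineItems: [] };
console.log(checkAgainstSchema(invoiceSchema, extracted));
```

A check like this only catches missing fields and wrong types. It cannot tell whether a value with the right type was actually present in the source document.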

asjfkdlf ◴[] No.42173400[source]
I am looking for a similar service that turns any document (PNG, PDF, DOCX) into JSON (preserving the field relationships). I tried with ChatGPT, but hallucinations are common. Does anything exist?
replies(2): >>42173587 #>>42173893 #
1. omk ◴[] No.42173587[source]
This also uses OpenAI's GPT model, so the same hallucinations are likely here for PDFs.