←back to thread

168 points Tammilore | 1 comments | | HN request time: 0.828s | source

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema - Returns clean, structured JSON that's ready to use - Works with just a PDF link + your schema definition

Just run npm install documind to get started.

1. khaki54 ◴[] No.42172027[source]
Not sure I would want something non-deterministic in my data pipeline. Maybe if it used GenAI to _develop a ruleset_ that could then be deployed, it would be more practical.