(github.com)

169 points by Tammilore | 1 comment | 18 Nov 24 10:51 UTC

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema
- Returns clean, structured JSON that's ready to use
- Works with just a PDF link + your schema definition

Just run npm install documind to get started.

1. khaki54 [18 Nov 24 13:13 UTC] No.42172027

>>42171311 (OP)

Not sure I would want something non-deterministic in my data pipeline. Maybe if it used GenAI to _develop a ruleset_ that could then be deployed, it would be more practical.
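To make the "develop a ruleset, then deploy it" idea concrete, here is a minimal sketch. It is not how Documind works. The field names and regex patterns are hypothetical stand-ins for rules a model might propose once during development and a human would review and freeze. After that, extraction is ordinary pattern matching, so the same input always gives the same output.

```python
import re
from typing import Dict, Optional

# Hypothetical ruleset: assumed to have been drafted once (for example with
# GenAI assistance), reviewed, and then frozen. No model runs at extraction time.
RULES: Dict[str, str] = {
    "invoice_number": r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)",
    "total": r"Total\s*:?\s*\$?([0-9]+(?:\.[0-9]{2})?)",
}


def extract(text: str, rules: Dict[str, str] = RULES) -> Dict[str, Optional[str]]:
    """Apply each frozen rule to the text; unmatched fields come back as None."""
    result: Dict[str, Optional[str]] = {}
    for field, pattern in rules.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[field] = match.group(1) if match else None
    return result


sample = "Invoice No: INV-2024-001\nTotal: $149.50"
print(extract(sample))
```

The trade-off is that frozen rules only cover layouts they were written for; a document that matches no rule yields None for that field rather than a guess.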

Show HN: Documind – Open-source AI tool to turn documents into structured data