Not sure I would want something non-deterministic in my data pipeline. Maybe if it used GenAI to _develop a ruleset_ that could then be deployed, it would be more practical.
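To make the "develop a ruleset, then deploy it" idea concrete, here is a minimal sketch of what the deployed half might look like. Everything in it is an assumption for illustration: the `Rule` shape, the `invoiceRules` list, and the `applyRules` helper are hypothetical and are not part of documind or any other library. The point is only that once the rules are written down (by an LLM, a human, or both) and reviewed, the extraction step is plain regex matching and returns the same output for the same input every time.

```typescript
// Hypothetical rule format: a field name plus a regex with one capture group.
interface Rule {
  field: string;
  pattern: RegExp;
}

// Illustrative ruleset. In the workflow suggested above, a model would propose
// rules like these once, a person would review them, and only the reviewed
// rules would run in the pipeline.
const invoiceRules: Rule[] = [
  { field: "invoiceNumber", pattern: /Invoice\s*#?\s*:?\s*([A-Z0-9-]+)/i },
  { field: "total", pattern: /Total\s*:?\s*\$?\s*([0-9]+(?:\.[0-9]{2})?)/i },
];

// Apply every rule to the text. No model call happens here, so the result
// depends only on the input text and the ruleset.
function applyRules(text: string, rules: Rule[]): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const rule of rules) {
    const match = rule.pattern.exec(text);
    // Fields with no match are reported as null rather than guessed.
    result[rule.field] = match?.[1] ?? null;
  }
  return result;
}

// Made-up sample text standing in for text already extracted from a PDF.
const sampleText = "Invoice #: INV-1042\nTotal: $318.50";
console.log(JSON.stringify(applyRules(sampleText, invoiceRules)));
```

The trade-off is the usual one: rules like these are predictable and easy to test, but they break on layouts nobody anticipated, which is where a model-based extractor tends to do better.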
What it does:
- Extracts specific data from PDFs based on your custom schema
- Returns clean, structured JSON that's ready to use
- Works with just a PDF link + your schema definition
Just run `npm install documind` to get started.