←back to thread

169 points Tammilore | 2 comments | | HN request time: 0.424s | source

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema - Returns clean, structured JSON that's ready to use - Works with just a PDF link + your schema definition

Just run npm install documind to get started.

Show context
avereveard ◴[] No.42172088[source]
> an interesting open source project

enthusiastically setting up a lounge chair

> OPENAI_API_KEY=your_openai_api_key

carrying it back apathetically

replies(2): >>42172205 #>>42172807 #
1. Tammilore ◴[] No.42172205[source]
Thanks for the laugh and your feedback! I know that depending on an OpenAI isn't ideal for everyone. I'm considering ways to make it more self-contained in the future, so it’s great to hear what users are looking for.
replies(1): >>42175974 #
2. avereveard ◴[] No.42175974[source]
litellm would be a start, then you just pass in a model string that includes the provider, and can default on openai gpts, that removes most of the effort in adapting stuff both from you and other users.