
439 points david927 | 1 comment

What are you working on? Any new ideas which you're thinking about?
1. superdocs1 No.44424365
Building an app that extracts key information from PDFs + highlights citations. You provide a PDF and a JSON schema defining what to extract, and it returns the extracted values, the citations and their precise locations in the document.
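To make the input/output described above concrete, here is a rough Python sketch of what a schema and a cited result might look like. Everything in it is an assumption for illustration: the post does not give the actual API, so the field names (`value`, `citations`, `page`, `bbox`), the helper functions, and the sample data are all hypothetical. Only the JSON Schema keywords (`type`, `properties`, `required`) are standard.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical citation shape: where in the PDF a value came from."""
    text: str                                # quoted snippet backing the value
    page: int                                # 1-based page number (assumed)
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) on that page (assumed)


@dataclass
class ExtractedField:
    """Hypothetical result shape: one extracted value plus its citations."""
    value: object
    citations: list[Citation]


# JSON Schema describing what to extract (field names are made up).
schema = {
    "type": "object",
    "properties": {
        "party_name": {"type": "string"},
        "effective_date": {"type": "string"},
    },
    "required": ["party_name", "effective_date"],
}

# Made-up response in an assumed format, not real output from the app.
raw = {
    "party_name": {
        "value": "Acme Corp",
        "citations": [
            {"text": "Acme Corp", "page": 1, "bbox": [72.0, 140.5, 310.2, 154.0]},
        ],
    },
    "effective_date": {"value": "2024-01-01", "citations": []},
}


def parse_result(raw: dict) -> dict[str, ExtractedField]:
    """Convert the assumed raw response dict into typed objects."""
    fields = {}
    for name, item in raw.items():
        cites = [
            Citation(c["text"], c["page"], tuple(c["bbox"]))
            for c in item["citations"]
        ]
        fields[name] = ExtractedField(item["value"], cites)
    return fields


def uncited_fields(schema: dict, result: dict[str, ExtractedField]) -> list[str]:
    """List schema properties that are missing or have no citation to check."""
    return [
        name
        for name in schema["properties"]
        if name not in result or not result[name].citations
    ]


result = parse_result(raw)
needs_review = uncited_fields(schema, result)
```

A check like `uncited_fields` is one way a reviewer could flag values that cannot be verified against the document; in this made-up data, `effective_date` has no citation and would be flagged.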

This is especially valuable in workflows where verifying LLM-extracted information is critical (e.g. legal and finance). It can handle complex layouts such as multiple columns and tables, as well as scanned documents.

Planning to offer this both as an API and a self-hosted option for organizations with strict data privacy requirements.

Screenshot: https://superdocs.io/highlight.png