Humans in this space tend to make mistakes like entering rent as a per-square-foot rate instead of an absolute value, or missing a rent escalation in year 3.
These errors tend to be easy to catch, even for the same person reviewing the data. A reviewer would notice that the rent steps looked strange (escalations in Y2 and Y4 but not Y3), or that rent jumped by an order of magnitude from one month to the next.
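The checks a reviewer applies by eye can be sketched as code. This is a minimal illustration, not any particular system's validation logic; the data shapes, function names, and the 10x threshold are all assumptions for the example:

```python
def find_rent_jumps(monthly_rent):
    """Flag month-to-month changes of roughly an order of magnitude,
    the kind a reviewer would spot at a glance.
    monthly_rent: list of (month_label, rent) tuples."""
    issues = []
    for (prev_month, prev_rent), (month, rent) in zip(monthly_rent, monthly_rent[1:]):
        if prev_rent > 0 and (rent / prev_rent >= 10 or rent / prev_rent <= 0.1):
            issues.append(f"{prev_month} -> {month}: rent went from {prev_rent} to {rent}")
    return issues

def missing_escalation_years(annual_rent):
    """Flag a flat year sandwiched between escalation years,
    e.g. steps in Y2 and Y4 but not Y3.
    annual_rent: list of rents for Y1, Y2, ... in order."""
    steps = [annual_rent[i + 1] > annual_rent[i] for i in range(len(annual_rent) - 1)]
    missing = []
    for i in range(1, len(steps) - 1):
        if steps[i - 1] and not steps[i] and steps[i + 1]:
            missing.append(i + 2)  # 1-indexed year that failed to step up
    return missing

print(find_rent_jumps([("Jan", 5000), ("Feb", 50000)]))
print(missing_escalation_years([100, 103, 103, 109]))  # Y3 never stepped up
```

Checks like these work precisely because human errors leave visible discontinuities in otherwise regular data.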
AI fails differently: it can invent reasonable-looking rent steps. These models are designed to produce output that seems plausible, even when it's completely made up.
When humans are wrong, they tend to misread what's there, which is far less insidious than inventing something that isn't.
And if a human has to review everything the AI produces, what's the point of the AI in the first place? The human becomes the source of truth either way.