(github.com)

990 points pierre | 1 comments | 20 Oct 25 06:26 UTC

yoran ◴[20 Oct 25 07:22 UTC] No.45640836[source]▶

How does an LLM approach to OCR compare to, say, Azure AI Document Intelligence (https://learn.microsoft.com/en-us/azure/ai-services/document...) or Google's Vision API (https://cloud.google.com/vision?hl=en)?

replies(7): >>45640943 #>>45640992 #>>45642214 #>>45643557 #>>45644126 #>>45647313 #>>45667751 #

1. junto ◴[22 Oct 25 12:00 UTC] No.45667751[source]▶

>>45640836 #

Not sure how it compares, but we ran some trials with Azure AI Document Intelligence and were very surprised at how good it was. One of our test documents was a poor photograph of a document with quite a skew, and (to our surprise) it also detected the customer’s human-legible signature and extracted their name from that signature.

DeepSeek OCR