(github.com)

990 points pierre | 1 comment | 20 Oct 25 06:26 UTC

yoran [20 Oct 25 07:22 UTC] No.45640836

How does an LLM approach to OCR compare to, say, Azure AI Document Intelligence (https://learn.microsoft.com/en-us/azure/ai-services/document...) or Google's Vision API (https://cloud.google.com/vision?hl=en)?

1. make3 [20 Oct 25 13:17 UTC] No.45643557

Aren't all of these multimodal LLM approaches, just open vs. closed ones?

DeepSeek OCR