←back to thread

DeepSeek OCR

(github.com)
990 points pierre | 1 comments | | HN request time: 0.358s | source
1. joshstrange ◴[] No.45645761[source]
> [2025/x/x] We release DeepSeek-OCR, a model to investigate the role of vision encoders from an LLM-centric viewpoint.

So close but it should be 2025/X/XX as "X" = 10 in Roman Numerals /s

Jokes aside, this is really neat and I'm looking forward to getting this running. For most OCR-type stuff I just use AWS Textract since I need it so rarely and that service does a decent job. I really like how well this model seems to extract images/figures as well from the original document.