DeepSeek OCR

(github.com)
990 points | pierre | 1 comment
ellisd ◴[] No.45641234[source]
The paper makes no mention of Anna’s Archive. I wouldn’t be surprised if DeepSeek took advantage of Anna’s offer granting OCR researchers access to their collection of 7.5 million Chinese non-fiction books (350 TB) ... which is bigger than Library Genesis.

https://annas-archive.org/blog/duxiu-exclusive.html

replies(5): >>45641927 #>>45642797 #>>45642836 #>>45643509 #>>45644415 #
dev1ycan ◴[] No.45643509[source]
Oh great, so now Anna's Archive will get taken down as well by another trash LLM provider abusing repositories that students and researchers use. Meta torrenting 70 TB from Library Genesis wasn't enough.
replies(4): >>45643563 #>>45643595 #>>45643640 #>>45643646 #
1. ◴[] No.45643640[source]