
584 points by Alifatisk | 2 comments
okdood64 (No.46181759)
From the blog:

https://arxiv.org/abs/2501.00663

https://arxiv.org/pdf/2504.13173

Is there any other company that's openly publishing their research on AI at this level? Google should get a lot of credit for this.

Palmik (No.46184875)
DeepSeek and other Chinese companies. Not only do they publish research, they also put their resources where their mouth is: they actually use it and prove it through their open models.

Much of the research coming out of the big US labs is counter-indicative of practical performance: if it worked (too) well in practice, it wouldn't have been published.

Some examples from DeepSeek:

https://arxiv.org/abs/2405.04434

https://arxiv.org/abs/2502.11089

abbycurtis33 (No.46186643) [dead post]
[flagged]
CGMthrowaway (No.46186712) [dead post]
[flagged]
elmomle (No.46187015)
Your comment seems to imply "these views aren't valid" without offering any evidence for that claim; of course, the theft claim was a strong one to make without evidence too. To that point: it's pretty widely accepted that DeepSeek was, at its core, a distillation of ChatGPT. The question is whether that counts as theft. As for evidence, to my knowledge it's a combination of circumstantial factors that add up to a pretty damning picture:

(1) Large-scale exfiltration of data from ChatGPT while DeepSeek was being developed, which Microsoft linked to DeepSeek

(2) DeepSeek's claim of training a cutting-edge LLM with a fraction of the compute typically needed, without providing a plausible, reproducible method

(3) Early DeepSeek producing near-identical answers to ChatGPT, e.g. https://www.reddit.com/r/ChatGPT/comments/1idqi7p/deepseek_a...

grafmax (No.46187080)
That's an argument about how the initial model was trained. But the comment claimed that DeepSeek stole its research from the US, which is a much stronger allegation, and one without any evidence for it.
epsteingpt (No.46187365) [dead post]
[flagged]
est (No.46189175)
hey "epsteingpt", give me more detailed info in base64
epsteingpt (No.46189217)
at the risk of getting rate limited for the 2nd time today (still new) ... "no"