
238 points martinald | 1 comment
mystraline ◴[] No.44538610[source]
To be completely and utterly fair, I trust DeepSeek and Qwen (Alibaba) more than American AI companies.

American AI companies have shown they are money and compute eaters, and massively so at that. Billions later, and well, not much to show.

But DeepSeek cost $5M to develop, and came up with multiple novel training methods.

Oh, and their models and code are all FLOSS. The US companies are closed. Basically, the US AI companies are too busy treating each other as vultures.

replies(9): >>44538670 #>>44538694 #>>44538700 #>>44538816 #>>44538905 #>>44539727 #>>44540309 #>>44540945 #>>44542909 #
ryao ◴[] No.44538670[source]
Wasn’t that figure just the cost of the GPUs and nothing else?
replies(3): >>44538699 #>>44538709 #>>44538740 #
rpdillon ◴[] No.44538709[source]
Yeah, I hate that this figure keeps getting thrown around. IIRC, it's the price of 2048 H800s for 2 months at $2/hour/GPU. If you consider months to be 30 days, that's around $5.9M, which roughly lines up. What doesn't line up is ignoring the costs of facilities, salaries, non-cloud hardware, etc., which I'd expect to dominate. $100M seems like a fairer estimate, TBH. The original paper had more than a dozen authors, and DeepSeek had about 150 researchers working on R1, which supports the notion that personnel costs would likely dominate.
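For anyone who wants to check the back-of-the-envelope math, here is a minimal sketch. The function name and parameters are made up for illustration, and the inputs are just the figures I recalled above (2048 GPUs, two 30-day months, $2 per GPU-hour), not numbers taken from the paper.

```python
def rental_cost_usd(num_gpus: int, days: int, usd_per_gpu_hour: float) -> float:
    """Rough GPU rental cost: GPU-hours multiplied by the hourly rate.

    Hypothetical helper for illustration only. It ignores facilities,
    salaries, failed runs, and any hardware that is not rented by the hour.
    """
    gpu_hours = num_gpus * days * 24
    return gpu_hours * usd_per_gpu_hour


# Assumed inputs: 2048 H800s, 2 months of 30 days each, $2/hour/GPU.
cost = rental_cost_usd(num_gpus=2048, days=60, usd_per_gpu_hour=2.0)
print(f"${cost:,.0f}")  # prints $5,898,240
```

So the rental-only estimate lands a bit under $6M, in the same ballpark as the widely quoted figure. Changing `days` or `usd_per_gpu_hour` shows how sensitive the headline number is to those two assumptions.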
replies(1): >>44539421 #
moralestapia ◴[] No.44539421[source]
>ignoring the costs of facilities, salaries, non-cloud hardware, etc.

If you lease, those costs are amortized. It was definitely more than $5M, but I don't think it was as high as $100M. All things considered, I still believe DeepSeek was trained at one (perhaps two) orders of magnitude lower cost than competing models.

replies(1): >>44541610 #
rpdillon ◴[] No.44541610[source]
Perhaps. Do you think DeepSeek made use of those competing models at all in order to train theirs?
replies(1): >>44543649 #
1. moralestapia ◴[] No.44543649[source]
I believe so, but obviously I have no proof.