
Grok 3: Another win for the bitter lesson

(www.thealgorithmicbridge.com)
129 points by kiyanwang | 2 comments
smy20011 No.43112235
Did they? DeepSeek spent about 17 months reaching SOTA results on a significantly smaller budget. xAI's model isn't a substantial leap beyond DeepSeek R1, yet it uses roughly 100 times more compute.

Given $3 billion, xAI would invest $2.5 billion in GPUs and $0.5 billion in talent; DeepSeek would invest $1 billion in GPUs and $2 billion in talent.

I would argue that the latter approach (DeepSeek's) is more scalable. It's extremely difficult to scale compute by another 100x, but with sufficient investment in talent, achieving the equivalent of a 10x compute increase through efficiency gains is far more feasible.
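
A minimal back-of-the-envelope sketch of that trade-off, assuming a toy model where effective compute is raw GPU capacity multiplied by efficiency gains from research talent. The $3B splits come from the comment above; the 3x-per-$1B efficiency multiplier and the function name are purely illustrative assumptions, not real figures:

    # Toy model, not real figures: effective compute = raw compute * efficiency,
    # where efficiency compounds with talent spend. The 3x-per-$1B multiplier is
    # an arbitrary assumption chosen only to illustrate the argument.
    def effective_compute(gpu_budget_bn: float, talent_budget_bn: float,
                          efficiency_gain_per_bn: float = 3.0) -> float:
        raw_compute = gpu_budget_bn                               # 1 compute unit per $1B of GPUs
        efficiency = efficiency_gain_per_bn ** talent_budget_bn   # compounding research gains
        return raw_compute * efficiency

    # Hypothetical $3B splits from the comment above
    gpu_heavy = effective_compute(gpu_budget_bn=2.5, talent_budget_bn=0.5)     # ~4.3 units
    talent_heavy = effective_compute(gpu_budget_bn=1.0, talent_budget_bn=2.0)  # ~9.0 units

    print(f"GPU-heavy split:    {gpu_heavy:.1f} effective-compute units")
    print(f"talent-heavy split: {talent_heavy:.1f} effective-compute units")

With a smaller assumed multiplier (say 1.2x per $1B) the GPU-heavy split comes out ahead instead, which is essentially the disagreement in the replies below.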

replies(10): >>43112269 #>>43112330 #>>43112430 #>>43112606 #>>43112625 #>>43112895 #>>43112963 #>>43115065 #>>43116618 #>>43123381 #
1. dogma1138 No.43112430
DeepSeek didn't seem to invest in talent so much as in smuggling restricted GPUs into China via third countries.

Also, not for nothing, scaling compute 100x or even 1000x is much easier than scaling talent 10x or even 2x, since you don't need more workers, you need discovery.

replies(1): >>43113483 #
2. tw1984 No.43113483
Talent is not something you can just freely pick up from your local Walmart.