
Grok 3: Another win for the bitter lesson

(www.thealgorithmicbridge.com)
129 points by kiyanwang | 1 comment
smy20011 No.43112235
Did they? DeepSeek spent about 17 months achieving SOTA results with a significantly smaller budget. While xAI's model isn't a substantial leap beyond DeepSeek R1, it uses 100 times more compute.

Given $3 billion, xAI would invest $2.5 billion in GPUs and $0.5 billion in talent; DeepSeek would invest $1 billion in GPUs and $2 billion in talent.

I would argue that the latter approach (DeepSeek's) is more scalable. It's extremely difficult to increase compute by 100x, but with sufficient investment in talent, a 10x gain in effective compute is far more feasible.
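To make the trade-off concrete, here's a toy calculation (the GPU price and the efficiency multiplier bought by extra talent are pure assumptions; only the dollar splits come from the comment above):

    # Toy model: dollar splits from the comment above, everything else is a guess.
    GPU_COST = 40_000  # hypothetical $ per accelerator

    def effective_compute(gpu_budget_usd, software_efficiency):
        # effective compute ~ number of GPUs you can buy * how well you use them
        return (gpu_budget_usd / GPU_COST) * software_efficiency

    # xAI-style split: $2.5B on GPUs, $0.5B on talent (baseline efficiency of 1.0)
    xai_style = effective_compute(2.5e9, 1.0)

    # DeepSeek-style split: $1B on GPUs, $2B on talent
    # (assume the extra talent buys a 3x software-efficiency win -- pure guess)
    deepseek_style = effective_compute(1.0e9, 3.0)

    print(f"xAI-style:      {xai_style:,.0f} GPU-equivalents")
    print(f"DeepSeek-style: {deepseek_style:,.0f} GPU-equivalents")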

sigmoid10 No.43112269
>It's extremely difficult to increase compute by 100x, but with sufficient investment in talent, a 10x gain in effective compute is far more feasible.

The article explains how in reality the opposite is true, especially when you look at it long term: compute grows exponentially, humans do not.

smy20011 No.43112294
Humans do write code that scales with compute.

Performance is always raw performance * software efficiency. You can use shitty software and waste all those FLOPs.
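A minimal sketch of that relationship (the utilization numbers are made up, just to illustrate the point):

    # delivered performance = raw FLOPs * software efficiency
    def delivered_flops(raw_flops, software_efficiency):
        # software_efficiency in (0, 1]: fraction of peak FLOPs your stack actually uses
        return raw_flops * software_efficiency

    big_cluster = delivered_flops(raw_flops=100e18, software_efficiency=0.2)   # 100x compute, wasteful software
    small_cluster = delivered_flops(raw_flops=10e18, software_efficiency=0.6)  # 10x compute, well-tuned software

    # A 10x raw-compute advantage shrinks to ~3.3x once software quality is factored in.
    print(big_cluster / small_cluster)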