S1: A $6 R1 competitor?

(timkellogg.me)
851 points tkellogg | 1 comments
swiftcoder ◴[] No.42948127[source]
> having 10,000 H100s just means that you can do 625 times more experiments than s1 did

I think the ball is very much in their court to demonstrate they actually are using their massive compute in such a productive fashion. My BigTech experience would tend to suggest that frugality went out the window the day the valuation took off, and they are in fact just burning compute for little gain, because why not...

replies(5): >>42948369 #>>42948616 #>>42948712 #>>42949773 #>>42953287 #
gessha ◴[] No.42948712[source]
This is pure speculation on my part, but I think at some point a company's valuation became tied to how big their compute is, so everybody jumped on the bandwagon.
replies(3): >>42948854 #>>42949513 #>>42951813 #
tyfon ◴[] No.42951813[source]
I don't think you need to speculate too hard. On CNBC they are not tracking revenue, profits or technical breakthroughs, but how much the big companies are spending (on GPUs). That's the metric!
replies(5): >>42951860 #>>42952948 #>>42953193 #>>42954800 #>>42955651 #
1. LeifCarrotson ◴[] No.42953193[source]
I probably don't have to repeat it, but this is a perfect example of Goodhart's Law: when a metric is used as a target, it loses its effectiveness as a metric.

If you were a reporter who didn't necessarily understand how to value a particular algorithm or training operation, but you wanted a simple number to compare the amount of work OpenAI vs. Google vs. Facebook are putting into their models, yeah, it makes sense. How many petaflops their datacenters are churning through in aggregate is probably correlated with the thing you're trying to understand. And it's probably easier to look at their financials and correlate how much they've spent on GPUs to how many petaflops of compute they need.

But when your investors are giving you more money based on how well they perceive you're doing, and their perception is not an oracle but is instead directly based on how much money you're spending... the GPUs don't actually need to do anything other than make number go up.