387 points reaperducer | 43 comments
1. SubiculumCode ◴[] No.45772210[source]
Given that AI is a national security matter now, I'd expect the U.S. to step in and rescue certain companies in the event of a crash. However, I'd give higher chances to NVIDIA than OpenAI. Weights are easily transferable and the expertise lives in the engineers, but the ability to keep making advanced chips is not as easily transferred.
replies(4): >>45772241 #>>45772328 #>>45772343 #>>45772651 #
2. embedding-shape ◴[] No.45772241[source]
Why is ML knowledge "in the engineers" while chip manufacturing apparently sits in the company/hardware/something other than the engineers/humans?
replies(6): >>45772325 #>>45772346 #>>45772355 #>>45772369 #>>45772507 #>>45772729 #
3. NBJack ◴[] No.45772325[source]
Read up a bit on the effort needed to get a fab going, and the yield rates. While engineers are crucial in the setup, the fab itself is not as 'fungible' as the employees involved.

I can spin up a strong ML team through hiring in probably 6-12 months with the right funding. Building a chip fab and getting it to a sensible yield would take 3-5 years, significantly more funding, strong supply lines, etc.

replies(5): >>45772443 #>>45772496 #>>45772509 #>>45772514 #>>45773390 #
4. philipwhiuk ◴[] No.45772328[source]
If they're too-important-to-fail they're too important not to be broken up or nationalised.
replies(3): >>45772693 #>>45773573 #>>45774853 #
5. ◴[] No.45772343[source]
6. tonyarkles ◴[] No.45772346[source]
First-order: because of the capex and lead times. If you grab a bunch of world-class ML folks and put them in a room together, they're going to be able to start producing world-class work together. If you grab a bunch of world-class chip designers in the same scenario but don't have world-class fabs for them to use, they're not going to be able to ship competitive designs.
replies(1): >>45772419 #
7. bob1029 ◴[] No.45772355[source]
One person can implement a transformer model from scratch in a weekend. Hardware is not the valuable part of machine learning. Data and how it is used are.

The "magic of AI" doesn't live inside an Nvidia GPU. There are billions of dollars of marketing being deployed to convince you it does. As soon as the market realizes that nvidia != magic AI box, the music should stop pretty quickly.

replies(2): >>45772397 #>>45772784 #
8. jeffwask ◴[] No.45772369[source]
The start-up costs of creating a new chip manufacturer are significantly higher (you can't just SaaS your way into factories), and the chips themselves are more subject to IP and patents owned by that company.
9. tehjoker ◴[] No.45772397{3}[source]
That's true, but without the kind of horsepower modern hardware provides, AI would be nearly impossible, though I'm skeptical that all of it is needed, especially given DeepSeek's amazing results.

There are some important innovations on the algorithm and network-architecture side, but these ideas can only be tried because the hardware supports them. The ideas themselves have been around for decades.

replies(1): >>45772790 #
10. embedding-shape ◴[] No.45772419{3}[source]
> If you grab a bunch of world-class chip designers in the same scenario but don't have world-class fabs for them to use, they're not going to be able to ship competitive designs.

But why such an unfair comparison?

Instead of comparing "skilled people with hardware vs. skilled people without hardware", why not compare it to "a bunch of world-class ML folks" without any computers to do the work? How could they produce world-class work then?

replies(1): >>45772760 #
11. wongarsu ◴[] No.45772443{3}[source]
But the fabs don't belong to NVIDIA, they belong to TSMC. I have no doubt that Taiwan and maybe even the US government would step in to save TSMC if it somehow ran into existential problems, but that doesn't provide an argument for saving NVIDIA.
12. trollbridge ◴[] No.45772496{3}[source]
Right. I could spin up a strong ML team, an AI startup, build a foundation model, etc., given a reasonable amount of seed capital.

Build a chip fab? I've got no idea where to start or where to even find people to hire, and I know the equipment we'd need to acquire would also be quite difficult to get at any price.

13. thesz ◴[] No.45772507[source]
Chip manufacturing is extremely time-consuming, especially when we are talking about masks for lithography.

The rights to the masks for chips and their constituent parts (IP blocks) belong to the companies.

And one definitely does not want those masks sold off to (arbitrary) higher bidders during a bankruptcy process.

14. OfficialTurkey ◴[] No.45772509{3}[source]
> I can spin up a strong ML team through hiring in probably 6-12 months with the right funding.

Mark Zuckerberg would like a word with you

15. singron ◴[] No.45772514{3}[source]
Nvidia isn't a fab.
16. lz400 ◴[] No.45772651[source]
Even if/when the bubble pops, I don't think NVIDIA is even close to needing a rescue or being in trouble. They might end up being worth $2 trillion instead of $5 trillion, but they're still selling GPUs nobody else knows how to make, powering one of the most important technologies in the world. And that's before counting all their other divisions.

The .com bubble didn't stop the internet or e-commerce; they still won and revolutionized everything. Just because there's a bubble doesn't mean AI won't be successful. It almost certainly will be. We've all used it; it's genuinely useful and transformative. Let's not miss the forest for the trees.

17. jimbokun ◴[] No.45772693[source]
While that is a sensible opinion, the 2008 crash showed that it is not the opinion of decision makers in the US.
18. jimbokun ◴[] No.45772729[source]
Chip designs have strong IP protections.

AI models do not. Sure, you can't just copy the exact floating-point values without permission, but with enough capital you can train a model that's just as good, since the training and inference techniques are well known.

replies(1): >>45773420 #
19. jimbokun ◴[] No.45772760{4}[source]
Much easier and cheaper to source computers than a fab.
replies(1): >>45773318 #
20. chermi ◴[] No.45772784{3}[source]
Umm, part of it does. It's necessary but not sufficient, at least to achieve this on the timescales we've seen. Scale is part of the "magic".
21. chermi ◴[] No.45772790{4}[source]
Deepseek required existing models that required the horsepower.
replies(1): >>45777027 #
22. embedding-shape ◴[] No.45773318{5}[source]
Right, but to source a fab you need experience as well; it's not something you can just hire a random person to do.
replies(1): >>45773867 #
23. embedding-shape ◴[] No.45773390{3}[source]
> I can spin up a strong ML team through hiring in probably 6-12 months with the right funding

Not sure what to call this except "HN hubris" or something.

There are hundreds of companies that thought (and still think) the exact same thing, and even after 24 months or more of "the right funding" they still haven't delivered the results.

I think you're misunderstanding how difficult all of this is if you think it's merely a money problem. Otherwise we'd see SOTA models from new groups every month, which we obviously don't. Instead we have a few big labs iteratively pushing SOTA forward, with the occasional upstart appearing (DeepSeek, Kimi et al.), but it isn't as easy as you're making it out to be.

replies(3): >>45773610 #>>45773835 #>>45774090 #
24. embedding-shape ◴[] No.45773420{3}[source]
> But with enough capital you can train a model just as good, as the training and inference techniques are well known

You're not alone in believing that money alone can train a good model, and I've already answered elsewhere why things aren't as easy as you believe. But beyond that, where are y'all getting this from? Is there some popular social media influencer who keeps parroting it? Clearly you're not involved in those processes/workflows yourself, or you wouldn't claim it's just a money problem, so where is this coming from?

25. whimsicalism ◴[] No.45773573[source]
I’m curious if those of you calling for nationalization have worked for the government or a state-owned enterprise like Amtrak. People should witness the effects of long-term public sector ownership on productivity and effectiveness in a workplace.
replies(2): >>45774072 #>>45774782 #
26. whimsicalism ◴[] No.45773610{4}[source]
There’s a lot in LLM training that is pretty commodity at this point. The difficulty is in data - and a large part of why it has gotten more challenging is simply that some of the best sources of data have locked down against scraping post-2022 and it is less permissible to use copyrighted data than the “move fast and break things” pre-2023 era.

As you mentioned, multiple no-name Chinese companies have done it and published many of their results. There is a commodity recipe for dense transformer training. The difference between the Chinese and US labs is that the former face fewer data restrictions.
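
To make the "commodity recipe" point concrete, here's a minimal sketch of the training loop itself: plain next-token cross-entropy over a decoder model. Everything here (the toy model, random tokens, hyperparameters) is a placeholder for illustration; the part that actually differentiates labs, per the above, is the data that would replace the random batch.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    vocab, d_model, seq_len = 1000, 64, 32
    model = nn.Sequential(                  # stand-in for a real decoder stack
        nn.Embedding(vocab, d_model),
        nn.Linear(d_model, vocab),
    )
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

    for step in range(3):
        tokens = torch.randint(0, vocab, (8, seq_len))   # placeholder for real training data
        inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from tokens <= t
        logits = model(inputs)                           # (batch, seq-1, vocab)
        loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
        print(step, loss.item())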

I think people overindex on the Meta example. It's hard to fully understand why Meta/Llama have failed as hard as they have, but they are an outlier case. Microsoft AI only just started its efforts in earnest and is already beating Meta, shockingly.

27. marcyb5st ◴[] No.45773835{4}[source]
Fully agree. I also think we are deep into diminishing-returns territory.

If I had to guess, OAI and the others pay top dollar for talent that has a higher probability of discovering the next "attention" mechanism, and investors are betting this is coming soon (hence the huge capitalizations and the willingness to live with $11B losses per quarter). If they lose patience with throwing money at the problem, I see only a few players remaining in the race, because they have other revenue streams.

28. tonyarkles ◴[] No.45773867{6}[source]
To simplify it down even more:

- For the ML team, you need money. Money to pay them and money to get access to GPUs. You might buy the GPUs and make your own server farm (which also takes time) or you might just burn all that money with AWS and use their GPUs. You can trade off money vs. time.

- For the chip design team, you need money and time. There's no workaround for the time aspect of it. You can't spend more money and get a fab quicker.

replies(1): >>45773937 #
29. embedding-shape ◴[] No.45773937{7}[source]
> - For the ML team, you need money. Money to pay them and money to get access to GPUs. You might buy the GPUs and make your own server farm (which also takes time) or you might just burn all that money with AWS and use their GPUs. You can trade off money vs. time.

Even if you do those things, though, it doesn't guarantee success or that you'll be able to train something bigger. For that you need knowledge, hard work, and expertise, regardless of how much money you have. It's not a problem you can solve by throwing money at it, although many are trying. You can increase the chances of discovering something novel that helps you build something SOTA, but as recent history tells us, it isn't as simple as "ML team + money == SOTA model in a few months".

replies(1): >>45774519 #
30. saulpw ◴[] No.45774072{3}[source]
Yeah, like IBM and Intel and GE and GM are shining examples of how effectively the private sector runs companies. Maybe large enterprises are by their nature inefficient. Maybe productivity isn't the best metric for a utility. We could, for instance, prioritize resiliency, longevity, accessibility, and environmental concerns.
replies(1): >>45774099 #
31. noosphr ◴[] No.45774090{4}[source]
>Otherwise we'd see SOTA models from new groups every month

We do.

It's just that startups don't go after the frontier models but niche spaces which are underserved and can be explored with a few million in hardware.

Just like OpenAI made GPT-2 before it made GPT-3.

replies(1): >>45774208 #
32. whimsicalism ◴[] No.45774099{4}[source]
Even those problematic companies exemplify the difference: when enterprises are mismanaged and fail, capital is reallocated away from them.
replies(1): >>45775190 #
33. embedding-shape ◴[] No.45774208{5}[source]
> We do.

> It's just that startups don't go after the frontier models but niche spaces

But both of "New SOTA models every month" and "Startups don't go for SOTA" cannot be true at the same time. Either we get new SOTA models from new groups every month (not true today at least) or we don't, maybe because the labs are focusing on non-SOTA instead.

replies(1): >>45774513 #
34. noosphr ◴[] No.45774513{6}[source]
State of the art doesn't mean frontier.
replies(1): >>45774746 #
35. tonyarkles ◴[] No.45774519{8}[source]
Sure. No guarantees that you could throw money at putting an ML team together and have a new SOTA model in a few months. You might, you might not.

You know what I can guarantee? No matter how much money you throw at it, you will not have a new SOTA fab in a few months.

36. embedding-shape ◴[] No.45774746{7}[source]
I've always taken that term literally, basically "top of the top". If you're not getting the best responses from that LLM, then it's not "top of the top" anymore, regardless of size.

Then something could be "SOTA in its class" I suppose, but personally that's less interesting and also not what the parent commenter claimed, which was basically "anyone with money can get SOTA models up and running".

Edit: Wikipedia seems to agree with me too:

> The state of the art (SOTA or SotA, sometimes cutting edge, leading edge, or bleeding edge) refers to the highest level of general development, as of a device, technique, or scientific field achieved at a particular time

I haven't heard of anyone using SOTA to not mean "at the front of the pack", but maybe people outside of ML use the word differently.

replies(1): >>45776117 #
37. overfeed ◴[] No.45774782{3}[source]
The USPS does more for its workers and customers than FedEx does. There are addresses FedEx won't service due to "inefficiencies"; it hands those packages over to the USPS for delivery.
38. will4274 ◴[] No.45774853[source]
Fwiw, this is a facile argument. You make no attempt to demonstrate that after a major reorganization (breakup/nationalization) the firm will continue to have the desirable attributes (innovation, efficiency, ability to build) that made it too important to fail.
39. saulpw ◴[] No.45775190{5}[source]
The US government just allocated $10B toward Intel, and it bailed out GM in the past. So what you said is clearly not the case. Now we have publicly funded private management that is failing. At least if these firms were publicly owned and managed outright, they wouldn't be gutted by executives prioritizing quarterly profits.
replies(1): >>45775367 #
40. whimsicalism ◴[] No.45775367{6}[source]
Executives should prioritize cheaply producing things people are willing to pay for. If there is a bias towards short-termism, that is a governance problem that should be addressed.

I agree that the US taking stakes or picking winners is bad, I don't think it follows that nationalization is the solution.

41. noosphr ◴[] No.45776117{8}[source]
A SOTA decoder model is a bigger deal than yet another trillion-parameter encoder-only model trained on benchmarks.

I don't get why you think that the only way that you can beat the big guys is by having more parameters than them.

replies(1): >>45777145 #
42. tehjoker ◴[] No.45777027{5}[source]
That was claimed but never proven. I personally don't believe the American companies making this claim. I suspect they made it up to protect their valuations when they were hideously embarrassed and lost a trillion dollars in equity.
43. embedding-shape ◴[] No.45777145{9}[source]
> I don't get why you think that the only way that you can beat the big guys is by having more parameters than them.

Yeah, and I don't understand why people have to argue against points others haven't made; it makes it less fun to participate in any discussion.

Whatever gets the best responses (regardless of parameter count, specific architecture, or the addition of other things) is what I'd consider SOTA; I guess you can go by your own definition.