
Google is winning on every AI front

(www.thealgorithmicbridge.com)
993 points vinhnx | 1 comments
thunderbird120 ◴[] No.43661807[source]
This article doesn't mention TPUs anywhere. I don't think it's obvious to people outside of Google's ecosystem just how extraordinarily good the JAX + TPU ecosystem is. Google has several structural advantages over other major players, but the largest one is that they roll their own compute solution, which is actually very mature and competitive. TPUs are extremely good at both training and inference[1], especially at scale. Google's ability to tailor their mature hardware to exactly what they need gives them a massive leg up on the competition. AI companies fundamentally have to answer the question "what can you do that no one else can?". Google's hardware advantage provides an actual answer to that question, one which can't be erased the next time someone drops a new model onto Hugging Face.

[1]https://blog.google/products/google-cloud/ironwood-tpu-age-o...

replies(12): >>43661870 #>>43661974 #>>43663154 #>>43663455 #>>43663647 #>>43663720 #>>43663956 #>>43664320 #>>43664354 #>>43672472 #>>43673285 #>>43674134 #
mike_hearn ◴[] No.43663720[source]
TPUs aren't necessarily a pro. They go back 15 years and don't seem to have yielded any kind of durable advantage. Developing them is expensive, but their architecture was often over-fit to yesterday's algorithms, which is why they've been through so many redesigns. Their competitors have routinely moved much faster using CUDA.

Once the space settles down, the balance might tip towards specialized accelerators, but NVIDIA has plenty of room to make specialized silicon and cut prices too. Google has yet to prove that the TPU investment is worth it.

replies(4): >>43663930 #>>43664015 #>>43666501 #>>43668095 #
summerlight ◴[] No.43668095[source]
Not sure how familiar you are with the internal situation... But from my experience, I think it's safe to say that TPU basically multiplies Google's computation capability by 10x, if not 20x. Also, they don't need to compete with others to secure expensive Nvidia chips. If this is not an advantage, I don't see what could be considered an advantage. The entire point of vertical integration is to secure full control of your stack so your capability won't be limited by potential competitors, and TPU is one of the key components of its strategy.

Also worth noting that its Ads division is the largest, heaviest user of TPU. Thanks to TPU, Ads can afford to run a bunch of different expensive models that would not be realistically affordable on GPU. The revenue delta from this is more than enough to pay off the entire investment history for TPU.

replies(1): >>43668328 #
mike_hearn ◴[] No.43668328[source]
They must very much compete with others. All these chips are being fabbed at the same facilities in Taiwan and capacity trades off against each other. Google has to compete for the same fab capacity alongside everyone else, as well as skilled chip designers etc.

> The revenue delta from this is more than enough to pay off the entire investment history for TPU.

Possibly; such statements were common when I was there too but digging in would often reveal that the numbers being used for what things cost, or how revenue was being allocated, were kind of ad hoc and semi-fictional. It doesn't matter as long as the company itself makes money, but I heard a lot of very odd accounting when I was there. Doubtful that changed in the years since.

Regardless the question is not whether some ads launches can pay for the TPUs, the question is whether it'd have worked out cheaper in the end to just buy lots of GPUs. Answering that would require a lot of data that's certainly considered very sensitive, and would mean making some assumptions about whether Google could have negotiated private deals etc.

replies(1): >>43669840 #
1. summerlight ◴[] No.43669840[source]
> They must very much compete with others. All these chips are being fabbed at the same facilities in Taiwan and capacity trades off against each other.

I'm not sure what point you're trying to make here. Following your logic, even if you have a fab you need to compete for rare metals, ASML etc etc... That's logic built for nothing but its own sake. In the real world, it is much easier to compete outside Nvidia's own allocation, as you get rid of the critical bottleneck. And Nvidia has every incentive to control the supply to maximize its own profit, not to meet demand.

> Possibly; such statements were common when I was there too but digging in would often reveal that the numbers being used for what things cost, or how revenue was being allocated, were kind of ad hoc and semi-fictional.

> Regardless the question is not whether some ads launches can pay for the TPUs, the question is whether it'd have worked out cheaper in the end to just buy lots of GPUs.

Of course everyone can build their own narratives in favor of their launch, but I've been involved in some of those ads quality launches and can say pretty confidently that most of those launches would not have been launchable without TPU at all. This was especially true in the early days of TPU, as the supply of datacenter GPUs was extremely limited and immature.

Can more GPUs solve this? Companies are talking about 100k to 200k H100s as a massive cluster, and Google already has much larger TPU clusters with computation capability that is a different order of magnitude. The problem is, you cannot simply buy more computation even if you have lots of money. I've been pretty clear about how relying on Nvidia's supply could be a critical limiting factor from a strategic point of view, but you're trying to move the point. Please don't.