
251 points slyall | 9 comments
1. gregw2 ◴[] No.42064762[source]
The article credits two academics (Hinton, Fei-Fei Li) and a CEO (Jensen Huang). But really it was three academics.

Jensen Huang, reasonably, was desperate for any market that could suck up more compute, one he could pivot to from gaming GPUs once gaming saturated its ability to use compute. Screen resolutions, visible polygons, and texture maps only demand so much compute; it's an S-curve like everything else. So from a marketing/market-development and capital-investment perspective I do think he deserves credit. Certainly the Intel guys struggled to similarly recognize it (and to execute even on plain GPUs).

But... the technical/academic insight behind the CUDA/GPU vision, in my view, came from Ian Buck's "Brook" PhD thesis at Stanford under Pat Hanrahan (Pixar and Tableau co-founder, Turing Award winner), and Ian promptly took it to Nvidia, where it was commercialized under Jensen.
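
(For the unfamiliar, a minimal sketch of the model Brook pioneered and CUDA commercialized - a scalar kernel mapped in parallel across large arrays - looks roughly like this in CUDA terms. This is my own illustration, not code from Buck's thesis:)

    // Each GPU thread applies the same scalar function to one array element:
    // the data-parallel "kernel over a stream" idea at the heart of Brook/CUDA.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    // Host side: launch enough 256-thread blocks to cover n elements.
    // saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);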

For a good telling of this under-told story, see one of Hanrahan's lectures at MIT: https://www.youtube.com/watch?v=Dk4fvqaOqv4

Corrections welcome.

replies(2): >>42065944 #>>42073157 #
2. markhahn ◴[] No.42065944[source]
Jensen embraced AI as a way to recover TAM after ASICs took over crypto mining. You can see that in-between period in Nvidia's revenue and profit graphs.

By that time, GPGPU had been around for a long, long time. CUDA still doesn't have much to do with AI - sure, it supports AI usage and even includes some AI-specific features (low/mixed-precision blocked operations).
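
(To make that concrete: I mean something like the tensor-core path exposed through CUDA's WMMA API - half-precision inputs accumulated in float over small matrix tiles. A rough sketch, assuming an sm_70+ card; names are illustrative:)

    #include <mma.h>
    #include <cuda_fp16.h>
    using namespace nvcuda;

    // One warp computes a 16x16x16 tile product C = A*B,
    // with half-precision inputs and float accumulation.
    __global__ void tile_mma(const half *a, const half *b, float *c) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

        wmma::fill_fragment(acc, 0.0f);              // zero the accumulator tile
        wmma::load_matrix_sync(a_frag, a, 16);       // leading dimension 16
        wmma::load_matrix_sync(b_frag, b, 16);
        wmma::mma_sync(acc, a_frag, b_frag, acc);    // tensor-core matrix multiply-accumulate
        wmma::store_matrix_sync(c, acc, 16, wmma::mem_row_major);
    }

    // Launch with a single warp: tile_mma<<<1, 32>>>(d_a, d_b, d_c);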

replies(2): >>42066745 #>>42070739 #
3. cameldrv ◴[] No.42066745[source]
Jensen embraced AI way before that. cuDNN was released back in 2014. I remember being at ICLR in 2015, and there were three companies with booths: Google and Facebook, who were recruiting, and NVIDIA, which was selling a 4-GPU desktop computer.
replies(2): >>42067387 #>>42072539 #
4. dartos ◴[] No.42067387{3}[source]
Well, as soon as matmul had a marketable use (ML predictive algorithms), Nvidia was on top of it.

I don’t think they were thinking of LLMs in 2014, tbf.

replies(1): >>42071405 #
5. aleph_minus_one ◴[] No.42070739[source]
> Jensen embraced AI as a way to recover TAM after ASICs took over crypto mining.

TAM: Total Addressable Market

6. throwaway314155 ◴[] No.42071405{4}[source]
Effectively no one was, but LLMs are precisely "ML predictive algorithms". That neural networks more broadly would scale at all on gaming chips was plenty of foresight to be impressed with.
7. esjeon ◴[] No.42072539{3}[source]
No. At the time, it was about GPGPU, not AI.
8. a-dub ◴[] No.42073157[source]
that's what i remember. i remember reading an academic paper about a cool hack where someone was getting the shaders in gpus to do massively parallel general purpose vector ops. it was this massive orders of magnitude scaling that enabled neural networks to jump out of obscurity and into the limelight.

i remember prior to that, support vectors and rkhs were the hotness for continuous signal style ml tasks. they weren't particularly scalable and transfer learning formulations seemed quite complicated. (they were, however, pretty good for demos and contests)

replies(1): >>42073253 #
9. sigmoid10 ◴[] No.42073253[source]
You're probably thinking of this paper: https://ui.adsabs.harvard.edu/abs/2004PatRe..37.1311O/abstra...

They were running a massive neural network (by the standards back then) on a GPU years before CUDA even existed. Even funnier, they demoed it on ATI cards. But it still took until 2012 and AlexNet making heavy use of CUDA's simpler interface before the Deep Learning hype started to take off outside purely academic playgrounds.

So the insight came neither from Jensen nor from the other authors mentioned above, but they were the first ones to capitalise on it.