Andrej Karpathy: Software in the era of AI [video]

I watched Karpathy's Intro to Large Language Models[0] not so long ago and must say that I'm a bit confused by this presentation, and it's a bit unclear to me what it adds.

1.5 years ago he saw tool use in agent systems as the future of LLMs, which seemed reasonable to me. There was (and maybe still is) potential for a lot of business cases to be explored, but every system is defined by its boundaries nonetheless. We still don't know all the challenges we face at those boundaries, or whether these could be modelled into a virtual space and handled by software, and therefore potentially also by AI and businesses.

Now it all just seems to be analogies and what role LLMs could play in our modern landscape. We should treat LLMs as encapsulated systems of their own ...but sometimes an LLM becomes the operating system, sometimes it's the CPU, sometimes it's the mainframe from the 60s with time-sharing, a big fab complex, or even outright electricity itself?

He's showing an iOS app, which seems to be, sorry for the dismissive tone, an example of a better-looking counter. This demo app was in a presentable state for a demo after a day, and it took him a week to implement Google's OAuth2 stuff. Is that somehow exciting? What was that?

The only way I could interpret this is that it just shows the big divide we're currently in. LLMs are a final API product for some, but an unoptimized generative software model with sophisticated-but-opaque algorithms for others. Both sides are badly in need of real-world use cases: the product side for fresh training data, and the business side for insights, integrations and shareholder value.

Am I all of a sudden the one lacking imagination? Is he just slurping the CEO Kool-Aid, and does he still have investments in OpenAI? Can we at least agree that we're still dealing with software here?

[0]: https://www.youtube.com/watch?v=zjkBMFhNj_g

> and must say that I'm a bit confused by this presentation, and it's a bit unclear to me what it adds.

I think the disconnect might come from the fact that Karpathy is speaking as someone whose day-to-day computing work has already been radically transformed by this technology (and he interacts with a ton of other people for whom this is the case), so he's not trying to sell the possibility of it: that would be like trying to sell the possibility of an airplane to someone who's already cruising around in one every day. Instead the mode of the presentation is more: well, here we are at the dawn of a new era of computing, it really happened. Now how can we relate this to the history of computing to anticipate where we're headed next?

> ...but sometimes an LLM becomes the operating system, sometimes it's the CPU, sometimes it's the mainframe from the 60s with time-sharing, a big fab complex, or even outright electricity itself?

He uses these analogies in clear and distinct ways to characterize separate facets of the technology. If you were unclear on the meanings of the separate analogies, it seems like the talk may offer some value for you after all, but you may be missing some prerequisites.

> This demo app was in a presentable state for a demo after a day, and it took him a week to implement Google's OAuth2 stuff. Is that somehow exciting? What was that?

The point here was that he'd built the core of the app within a day, without knowing the Swift language or the iOS app dev ecosystem, by leveraging LLMs, but that the OAuth setup part of the process remains old-fashioned and blocks people from leveraging LLMs the way they can when writing code. He goes on to show concretely how this could be improved.