Andrej Karpathy: Software in the era of AI [video]

(www.youtube.com)

1479 points sandslash | 3 comments | 19 Jun 25 00:33 UTC


mentalgear ◴[19 Jun 25 09:33 UTC] No.44316934[source]▶

Meanwhile, this morning I asked Claude 4 to write a simple EXIF normalizer. After two rounds of prompting it to double-check its code, I still had to point out that it makes no sense to load the entire image for reorienting if the EXIF orientation is fine in the first place.

Vibe vs reality, and anyone actually working in the space daily can attest to how brittle these systems are.

Maybe this changes in SWE with more automated tests in verifiable simulators, but the real world is far too complex to simulate in its vastness.
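For illustration only (this is not code from the thread): a minimal Python sketch of the check described in the EXIF point above, assuming JPEG input and using only the standard library. It reads the orientation tag from the metadata segment without decoding any pixel data. The function names `read_exif_orientation` and `needs_reorientation` are hypothetical, and the parser is deliberately simplified (it only looks at IFD0 of the first Exif APP1 segment).

```python
import struct
from typing import Optional

ORIENTATION_TAG = 0x0112  # standard EXIF/TIFF orientation tag


def _orientation_from_tiff(tiff: bytes) -> Optional[int]:
    """Scan IFD0 of a TIFF block for the orientation tag."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, _type, _n = struct.unpack(endian + "HHI", entry[:8])
        if tag == ORIENTATION_TAG:
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None


def read_exif_orientation(data: bytes) -> Optional[int]:
    """Return the EXIF orientation from JPEG bytes, or None if absent."""
    if data[:2] != b"\xff\xd8":  # missing SOI marker: not a JPEG
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # lost sync with the segment structure
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: image data begins here
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(payload[6:])
        pos += 2 + length
    return None


def needs_reorientation(data: bytes) -> bool:
    """True only when the tag is present and is not the default (1)."""
    orientation = read_exif_orientation(data)
    return orientation is not None and 2 <= orientation <= 8
```

Under these assumptions, a normalizer would only decode and rotate the image when `needs_reorientation` returns true, and would pass every other file through untouched.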

replies(7): >>44317104 #>>44317116 #>>44317136 #>>44317214 #>>44317305 #>>44317622 #>>44317741 #

sensanaty ◴[19 Jun 25 10:41 UTC] No.44317305[source]▶

>>44316934 #

There are also those instances where Microsoft unleashed Copilot on the .NET repo, and it resulted in the most hilariously terrible PRs that required the maintainers to basically tell Copilot every single step it should take to fix the issue. They were basically writing the PRs themselves at that point, except doing it through an intermediary that was much dumber, slower and less practical than them.

And don't get me started on my own experiences with these things, and no, I'm not a luddite, I've tried my damnedest and have followed all the cutting-edge advice you see posted on HN and elsewhere.

Time and time again, the reality of these tools falls flat on its face while people like Andrej hype things up as if we're 5 minutes away from having Claude become Skynet or whatever, or as he puts it, before we enter the world of "Software 3.0" (coincidentally totally unrelated to Web 3.0 and the grift we had to endure there, I'm sure).

To preempt the common arguments:

- no I'm not saying LLMs are useless or have no use cases

- yes there's a possibility if you extrapolate from current trends (https://xkcd.com/605/) that they indeed will be Skynet

- yes I've tried the latest and greatest model released 7 minutes ago to the best of my ability

- yes I've tried giving it prompts so detailed a literal infant could follow along and accomplish the task

- yes I've fiddled with providing it more/less context

- yes I've tried keeping it to a single chat rather than multiple chats, as well as vice versa

- yes I've tried Claude Code, Gemini Pro 2.5 With Deep Research, Roocode, Cursor, Junie, etc.

- yes I've tried having 50 different "agents" running and only choosing the best output from the lot.

I'm sure there's a new gotcha being written up as we speak, probably something along the lines of "Well for me it doubled my productivity!" and that's great, I'm genuinely happy for you if that's the case, but for me and my team who have been trying diligently to use these tools for anything that wasn't a microscopic toy project, it has fallen apart time and time again.

The idea of an application UI or god forbid an entire fucking Operating System being run via these bullshit generators is just laughable to me, it's like I'm living on a different planet.

replies(5): >>44317421 #>>44317440 #>>44317630 #>>44317721 #>>44318531 #

diggan ◴[19 Jun 25 11:00 UTC] No.44317421[source]▶

>>44317305 #

You're neither the first nor the last person to have a seemingly vastly different experience than me and others.

So I'm curious, what am I doing differently from what you did/do when you try them out?

This is maybe a bit out there, but would you be up for sending me a screen recording of exactly what you're doing? Or maybe even a video call sharing your screen? I'm not working in the space and have no products or services to sell; I'm only curious why this gap seemingly exists between you and me. My only motive would be to understand whether I'm the one who is missing something, or whether there are more effective ways to help people understand how they can use LLMs and what they can use them for.

My email is on my profile if you're up for it. Invitation open for others in the same boat as parent too.

replies(1): >>44317810 #

bsenftner ◴[19 Jun 25 11:58 UTC] No.44317810[source]▶

>>44317421 #

I'm a greybeard, 45+ years coding, including being active in AI during the mid-80s, and I've used it wherever it applied throughout my entire career. That career has been media and animation production backends, where the work is at both the technical and creative edge.

I currently have an AI-integrated office suite, which has attorneys, professional writers, and political activists using the system. It is office software: word processing, spreadsheets, project management, and about two dozen types of AI agents that act as virtual co-workers.

No, my users are not programmers, but I do have interns: college students with anywhere from 3 to 10 years of experience writing software.

I see the same AI usage problems with my users and my interns. My office system bends over backwards to address this, but people are people: they do not realize that AI does not know what they are talking about. They will frequently ask questions with no preamble, no introduction to the subject. They will change topics, not bothering to start a new session or tell the AI the topic is now different. There is a huge number of things they do, often with escalating frustration evident in their prompts, that all come down to the same basic issue: the LLM was not given the context to understand the subject at hand, and the user, like many people, keeps explaining past the point of confusion, adding new confusion along the way.

I see this over and over. It frustrates the users to anger, yet at the same time, if they communicated with a human in the same manner, they'd have a verbal fight almost instantly.

The problem is one of communication. ...and for a huge number of you I just lost you. You've not been taught to understand the power of communication, so you do not respect the subject. How to communicate is practically everything when it comes to human collaboration. It is how one orders their mind, how one collaborates with others, AND how one gets AI to respond in the manner they desire.

But our current software development industry, and by extension all of STEM, has been shortchanged by never being taught how to communicate effectively, no, not at all. Presentations and how to sell are not effective communication; that's persuasion, about 5% of what it takes to convey understanding to others, which then unblocks resistance to change.

replies(2): >>44317896 #>>44318408 #

1. diggan ◴[19 Jun 25 12:10 UTC] No.44317896[source]▶

>>44317810 #

But parent explicitly mentioned:

> - yes I've tried giving it prompts so detailed a literal infant could follow along and accomplish the task

Which you're saying they might have missed in the end regardless?

replies(1): >>44317941 #

2. bsenftner ◴[19 Jun 25 12:18 UTC] No.44317941[source]▶

>>44317896 (TP) #

I'd like to see the prompt. I suspect that "literal infant" is expected to be a software developer without preamble. The initial sentence to an LLM carries far more weight than the rest: it sets the stage, the context for understanding what follows. If there is no introduction to the subject at hand, the response will be just like that of anyone fed a wall of words: confusion as to what all this is about.

replies(1): >>44318374 #

3. diggan ◴[19 Jun 25 13:08 UTC] No.44318374[source]▶

>>44317941 #

You and me both :) But I always try to read the comments here with the most charitable interpretation I can come up with.
