
1481 points | sandslash | 1 comment
eitally No.44318713
It's going to be very interesting to see how things evolve in enterprise IT, especially but not exclusively in regulated industries. As more SaaS services are at least partly vibe coded, how are CIOs going to understand and mitigate risk? As more internal developers use LLM-powered coding interfaces and become less clear on exactly how their resulting code works, how will that codebase be maintained and incrementally updated with new features, especially in solo dev teams (which are common)?

I easily see a huge future for agentic assistance in the enterprise, but I struggle mightily to see how many IT leaders would accept the output code of something like a menugen app as production-viable.

Additionally, if you're licensing code from external vendors who've built their own products at least partly through LLM-driven superpowers, how do you have faith that they know how things work and won't inadvertently break something they don't know how to fix? This goes for niche tools (like Clerk, or Polar.sh or similar) as much as for big heavy things (like a CRM or ERP).

I was on the CEO track about ten years ago and left it for a new career in big tech, and I don't envy the folks currently trying to figure out the future of safe, secure IT in the enterprise.

replies(4): >>44318751 >>44318835 >>44318900 >>44319146
r2b2 No.44318900
I've found that as LLMs improve, some of their bugs become increasingly slippery - I think of it as the uncanny valley of code.

Put another way, when I cause bugs, they are often glaring (more typos, fewer logic mistakes). Plus, when you're the author, debugging is usually straightforward since you already have a deep sense of how the code works - you lived through it.

So far, using LLMs has dragged my productivity down. The bugs LLMs introduce are often subtle logical errors in otherwise "working" code. These errors are especially hard to debug when you didn't write the code yourself - now you have to learn the code as thoroughly as if you had written it anyway.
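
To make that concrete, here is a made-up Python sketch (the function and data are invented for illustration, not taken from any real codebase) of the kind of bug I mean: it runs without complaint and looks reasonable in review, yet it is quietly wrong.

    # Hypothetical illustration only - invented function and data, not from a real PR.
    from datetime import date

    def users_active_in_window(users, start: date, end: date):
        """Return users whose last_seen falls inside the window [start, end]."""
        # Subtle bug: `< end` excludes the end date itself, so activity on the
        # final day of the window is silently dropped. Nothing ever crashes.
        return [u for u in users if start <= u["last_seen"] < end]

    users = [
        {"name": "a", "last_seen": date(2024, 6, 1)},
        {"name": "b", "last_seen": date(2024, 6, 30)},  # active on the last day
    ]
    print(users_active_in_window(users, date(2024, 6, 1), date(2024, 6, 30)))
    # Prints only user "a". The code "works", the output looks plausible,
    # and the error only surfaces when someone audits the numbers.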

I also find it more stressful to deploy LLM code. I know in my bones how carefully I write code, thanks to a decade of roughly "one non-critical bug per 10k lines" that lets me sleep at night. The quality of LLM code can be quite chaotic.

That said, I'm not holding my breath. I expect this to all flip someday, with an LLM becoming a better and more stable coder than I am, so I guess I will keep working with them to make sure I'm proficient when that day comes.

replies(3): >>44319115 >>44319120 >>44323546
throw234234234 No.44323546
Saw a recent talk where someone described AI as making errors, but not the errors a human would naturally make - usually "plausible but wrong" answers. In other words, the errors these AIs make are of a different nature than a human's. That's the danger: reviews are now harder, and I can't trust AI output the way I'd trust a person coding, at least at present. The agent tools (Claude Code, Aider, etc.) are a little better in that they can at least consume build and test output, but even then I've noticed they do things that are wrong yet "plausible and build fine".
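
As a concrete (and entirely invented) sketch of "plausible and builds fine" - fetch_with_retry and flaky below are hypothetical names, not from any real agent session - here's Python that passes the quick check an agent might run, while quietly retrying errors that can never succeed:

    # Hypothetical sketch: plausible-looking code that passes a shallow check.
    import time

    def fetch_with_retry(fetch, attempts: int = 3, delay: float = 0.1):
        last_error = None
        for _ in range(attempts):
            try:
                return fetch()
            except Exception as exc:  # too broad: also retries non-transient errors
                last_error = exc
                time.sleep(delay)
        raise last_error

    # The kind of quick test that "proves" it works:
    calls = {"n": 0}
    def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("transient")
        return "ok"

    assert fetch_with_retry(flaky) == "ok"  # green, so the over-broad except slips through

The check is green and the diff reads fine; the problem (silently retrying a bad request or an auth failure three times) only shows up in production behaviour.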

I've noticed it in my day-to-day: reviewing an AI PR is different from reviewing a PR from a co-worker, with different kinds of problems. Unfortunately the AI issues tend to be the subtle kind - the things that could sneak into production code if I'm not diligent. It means reviews matter more, and I can't lean on past experience with a co-worker and the typical quality of their PRs - effectively, every new PR comes from a different worker.