Andrej Karpathy: Software in the era of AI [video]

(www.youtube.com)

1480 points sandslash | 1 comments | 19 Jun 25 00:33 UTC | HN request time: 0.201s | source

Show context

abdullin ◴[19 Jun 25 07:03 UTC] No.44316210[source]▶

Tight feedback loops are the key in working productively with software. I see that in codebases up to 700k lines of code (legacy 30yo 4GL ERP systems).

The best part is that AI-driven systems are fine with running even more tight loops than what a sane human would tolerate.

Eg. running full linting, testing and E2E/simulation suite after any minor change. Or generating 4 versions of PR for the same task so that the human could just pick the best one.

replies(7): >>44316306 #>>44316946 #>>44317531 #>>44317792 #>>44318080 #>>44318246 #>>44318794 #

bandoti ◴[19 Jun 25 12:36 UTC] No.44318080[source]▶

>>44316210 #

Here’s a few problems I foresee:

1. People get lazy when presented with four choices they had no hand in creating, and they don’t look over the four and just click one, ignoring the others. Why? Because they have ten more of these on the go at once, diminishing their overall focus.

2. Automated tests, end-to-end sim., linting, etc—tools already exist and work at scale. They should be robust and THOROUGHLY reviewed by both AI and humans ideally.

3. AI is good for code reviews and “another set of eyes” but man it makes serious mistakes sometimes.

An anecdote for (1), when ChatGPT tries to A/B test me with two answers, it’s incredibly burdensome for me to read twice virtually the same thing with minimal differences.

Code reviewing four things that do almost the same thing is more of a burden than writing the same thing once myself.

replies(2): >>44318111 #>>44318430 #

abdullin ◴[19 Jun 25 12:40 UTC] No.44318111[source]▶

>>44318080 #

A simple rule applies: "No matter what tool created the code, you are still responsible for what you merge into main".

As such, task of verification, still falls on hands of engineers.

Given that and proper processes, modern tooling works nicely with codebases ranging from 10k LOC (mixed embedded device code with golang backends and python DS/ML) to 700k LOC (legacy enterprise applications from the mainframe era)

replies(3): >>44318177 #>>44318268 #>>44319968 #

1. ponector ◴[19 Jun 25 12:57 UTC] No.44318268[source]▶

>>44318111 #

> As such, task of verification, still falls on hands of engineers.

Even before LLM it was a common thing to merge changes which completely brake test environment. Some people really skip verification phase of their work.

↑