Andrej Karpathy: Software in the era of AI [video]

(www.youtube.com)

1479 points sandslash | 1 comments | 19 Jun 25 00:33 UTC

abdullin ◴[19 Jun 25 07:03 UTC] No.44316210[source]▶

Tight feedback loops are the key to working productively with software. I see that in codebases of up to 700k lines of code (legacy 30-year-old 4GL ERP systems).

The best part is that AI-driven systems are fine with running even tighter loops than a sane human would tolerate.

E.g., running the full linting, testing and E2E/simulation suite after any minor change. Or generating 4 versions of a PR for the same task so that the human can just pick the best one.
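
A minimal sketch of that loop, assuming each candidate change can be judged by a list of automated checks. The names `passes_all`, `shortlist`, `no_todo_left` and `fits_line_limit` are hypothetical stand-ins; real checks would call a linter, a test runner or an E2E suite. Only the candidates that pass everything are left for the human to choose from.

```python
from typing import Callable, List, Sequence

# A check takes a candidate change (here just a string) and reports pass/fail.
# Assumption: real checks would shell out to a linter, test runner, or E2E suite.
Check = Callable[[str], bool]


def passes_all(candidate: str, checks: Sequence[Check]) -> bool:
    """Run every check against one candidate; stop at the first failure."""
    return all(check(candidate) for check in checks)


def shortlist(candidates: Sequence[str], checks: Sequence[Check]) -> List[str]:
    """Keep only the candidates that pass the full suite, in original order."""
    return [c for c in candidates if passes_all(c, checks)]


# Two toy checks standing in for a linter and a style rule.
def no_todo_left(patch: str) -> bool:
    return "TODO" not in patch


def fits_line_limit(patch: str) -> bool:
    return all(len(line) <= 80 for line in patch.splitlines())


# Four generated versions of the same change.
candidates = [
    "return a + b",
    "return a + b  # TODO handle overflow",
    "return sum([a, b])",
    "return " + "a + " * 30 + "b",
]

survivors = shortlist(candidates, [no_todo_left, fits_line_limit])
for patch in survivors:
    print(patch)
```

The final choice among the survivors stays with the human, as in the comment above; the automation only removes candidates that fail a check.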

replies(7): >>44316306 #>>44316946 #>>44317531 #>>44317792 #>>44318080 #>>44318246 #>>44318794 #

OvbiousError ◴[19 Jun 25 09:35 UTC] No.44316946[source]▶

>>44316210 #

I don't think the human is the problem here, but the time it takes to run the full testing suite.

replies(6): >>44317032 #>>44317123 #>>44317166 #>>44317246 #>>44317515 #>>44318555 #

tlb ◴[19 Jun 25 10:33 UTC] No.44317246[source]▶

>>44316946 #

Yes, and (some near-future) AI is also more patient and better at multitasking than a reasonable human. It can make a change, submit for full fuzzing, and if there's a problem it can continue with the saved context it had when making the change. It can work on 100s of such changes in parallel, while a human trying to do this would mix up the reasons for the change with all the other changes they'd done by the time the fuzzing result came back.

LLMs are worse at many things than human programmers, so you have to try to compensate by leveraging the things they're better at. Don't give up with "they're bad at such and such" until you've tried using their strengths.

replies(1): >>44317950 #

HappMacDonald ◴[19 Jun 25 12:19 UTC] No.44317950[source]▶

>>44317246 #

You can't run N bots in parallel with testing between each attempt unless you're also running N tests in parallel.

If you could run N tests in parallel, then you could probably also run the components of one test in parallel and keep it from taking 2 hours in the first place.
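
To make the parallelism point concrete, here is a rough sketch using Python's standard `concurrent.futures`. The helper `run_tests_in_parallel` and the two toy checks are hypothetical; it assumes the tests are independent of each other, which is exactly the property a slow suite often lacks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_tests_in_parallel(
    tests: Dict[str, Callable[[], None]], workers: int
) -> Dict[str, bool]:
    """Run independent test callables on a thread pool; map name -> passed."""

    def run_one(test: Callable[[], None]) -> bool:
        try:
            test()
            return True
        except AssertionError:
            return False

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(run_one, test) for name, test in tests.items()}
        # result() blocks until each submitted test has finished.
        return {name: future.result() for name, future in futures.items()}


def check_addition() -> None:
    assert 1 + 1 == 2


def check_broken() -> None:
    assert 1 + 1 == 3


results = run_tests_in_parallel(
    {"addition": check_addition, "broken": check_broken}, workers=2
)
print(results)
```

Threads only help when the tests wait on I/O or external processes; CPU-bound Python tests would need separate processes or machines, which is where the extra compute cost comes in.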

To me this all sounds like snake oil to convince people to do something they were already doing, but by also spinning up N times as many compute instances and burning endless tokens along the way. And by the time it's demonstrated that it doesn't really offer anything more than doing it yourself, well, you've already given them all of your money, so their job is done.

replies(1): >>44318148 #

1. abdullin ◴[19 Jun 25 12:44 UTC] No.44318148[source]▶

>>44317950 #

Running tests is already an engineering problem.

In one of the systems (supply chain SaaS) we invested so much effort in having good tests in a simulated environment that we could run full-stack tests at kHz rates: roughly 5k tests per second on a laptop.
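
For illustration only, here is a sketch of why a simulated environment can get that fast: when the dependencies live in memory, a scenario is just a few function calls. `InMemoryInventory`, `scenario_receive_then_ship` and `measure_throughput` are invented for this example and say nothing about how the system described above was actually built.

```python
import time
from typing import Callable, Dict


class InMemoryInventory:
    """Hypothetical stand-in for a database-backed inventory service."""

    def __init__(self) -> None:
        self.stock: Dict[str, int] = {}

    def receive(self, sku: str, qty: int) -> None:
        self.stock[sku] = self.stock.get(sku, 0) + qty

    def ship(self, sku: str, qty: int) -> None:
        if self.stock.get(sku, 0) < qty:
            raise ValueError("insufficient stock")
        self.stock[sku] -= qty


def scenario_receive_then_ship() -> None:
    """One test scenario: fresh state, two operations, one assertion."""
    inventory = InMemoryInventory()
    inventory.receive("widget", 10)
    inventory.ship("widget", 4)
    assert inventory.stock["widget"] == 6


def measure_throughput(scenario: Callable[[], None], runs: int) -> float:
    """Return scenarios executed per second over the given number of runs."""
    start = time.perf_counter()
    for _ in range(runs):
        scenario()
    elapsed = time.perf_counter() - start
    return runs / elapsed if elapsed > 0 else float("inf")


rate = measure_throughput(scenario_receive_then_ship, 1000)
print(f"{rate:.0f} scenarios per second")
```

The number printed depends on the machine and on how much work each scenario does, so it should not be compared directly with the figure quoted above.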