399 points nomdep | 1 comments
tptacek No.44295712
I'm fine with anybody saying AI agents don't work for their work-style and am not looking to rebut this piece, but I'm going to take this opportunity to call something out.

The author writes "reviewing code is actually harder than most people think. It takes me at least the same amount of time to review code not written by me than it would take me to write the code myself". That sounds within an SD of true for me, too, and I had a full-time job close-reading code (for security vulnerabilities) for many years.

But it's important to know that when you're dealing with AI-generated code for simple, tedious, or rote tasks --- what they're currently best at --- you're not on the hook for reading the code that carefully, or at least, not on the same hook. Hold on before you jump on me.

Modern Linux kernels allow almost-arbitrary code to be injected at runtime, via eBPF (which is just a C program compiled to an imaginary virtual RISC). The kernel can mostly reliably keep these programs from crashing the kernel. The reason for that isn't that we've solved the halting problem; it's that eBPF doesn't allow most programs at all --- for instance, it must be easily statically determined that any backwards branch in the program runs for a finite and small number of iterations. eBPF isn't even good at determining that condition holds; it just knows a bunch of patterns in the CFG that it's sure about and rejects anything that doesn't fit.
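
To make the "reject anything that doesn't fit a known pattern" idea concrete, here is a toy sketch. Everything in it is an assumption made up for illustration: the five-instruction set (`set`, `add`, `jlt`, `jmp`, `exit`), the `Insn` type, and the `MAX_ITERATIONS` bound have nothing to do with the real eBPF instruction set or the kernel verifier's actual checks. It only shows the shape of the approach: recognize one loop pattern that is certainly bounded and refuse every other backward branch, including ones that would in fact terminate.

```python
from dataclasses import dataclass

MAX_ITERATIONS = 64  # hypothetical bound, not the real verifier's limit
OPS = {"set", "add", "jlt", "jmp", "exit"}  # toy instruction set (assumed)
JUMPS = {"jlt", "jmp"}


@dataclass(frozen=True)
class Insn:
    op: str          # one of OPS
    reg: str = ""    # register written by set/add, or tested by jlt
    imm: int = 0     # constant operand
    target: int = 0  # absolute instruction index, used by jumps only


def _is_bounded_counter_loop(program, pc):
    """True only for: set r, 0 / <straight-line body with one add r, k> / jlt r, N."""
    jump = program[pc]
    start = jump.target
    if jump.op != "jlt" or start == 0 or jump.imm > MAX_ITERATIONS:
        return False
    init = program[start - 1]
    if init.op != "set" or init.reg != jump.reg or init.imm != 0:
        return False
    body = program[start:pc]
    if any(i.op in JUMPS or i.op == "exit" for i in body):
        return False
    writes = [i for i in body if i.op in ("set", "add") and i.reg == jump.reg]
    if len(writes) != 1 or writes[0].op != "add" or writes[0].imm <= 0:
        return False
    # No other jump may land inside the loop and skip the counter initialisation.
    for other_pc, other in enumerate(program):
        if other_pc != pc and other.op in JUMPS and start <= other.target <= pc:
            return False
    return True


def verify(program):
    """Accept a program only if every backward branch matches the known pattern."""
    for pc, insn in enumerate(program):
        if insn.op not in OPS:
            return False
        if insn.op not in JUMPS:
            continue
        if not 0 <= insn.target < len(program):
            return False
        if insn.target > pc:
            continue  # a forward branch cannot form a loop on its own
        if not _is_bounded_counter_loop(program, pc):
            return False
    return bool(program) and program[-1].op == "exit"


bounded = [Insn("set", "r1", 0), Insn("add", "r1", 1),
           Insn("jlt", "r1", 10, target=1), Insn("exit")]
infinite = [Insn("set", "r1", 0), Insn("jmp", target=0), Insn("exit")]
# Terminates after 10 iterations, but the counter init is not directly before
# the loop, so the pattern does not match and the program is rejected anyway.
fine_but_rejected = [Insn("set", "r1", 0), Insn("set", "r2", 0),
                     Insn("add", "r1", 1), Insn("jlt", "r1", 10, target=2),
                     Insn("exit")]

print(verify(bounded))            # True
print(verify(infinite))           # False
print(verify(fine_but_rejected))  # False
```

The third program is the interesting one: it is perfectly safe, and the checker rejects it without trying to prove anything about it.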

That's how you should be reviewing agent-generated code, at least at first; not like a human security auditor, but like the eBPF verifier. If I so much as need to blink when reviewing agent output, I just kill the PR.

If you want to tell me that every kind of code you've ever had to review is equally tricky to review, I'll stipulate to that. But that's not true for me. It is in fact very easy for me to look at a rote recitation of an idiomatic Go function and say "yep, that's what that's supposed to be".

replies(7): >>44295745 #>>44295773 #>>44295785 #>>44295795 #>>44296065 #>>44296839 #>>44296921 #
112233 No.44295795
This is a radical and healthy way to do it. Obviously wrong: reject. Obviously right: accept. In any other case: also reject, as non-obvious.
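
Written out as code, the rule is a three-way classification that collapses to two outcomes. This is only a sketch: the `Assessment` and `Verdict` names and the idea of judging a PR hunk by hunk are assumptions for illustration, and the judgment itself still comes from a human glance, not from the function.

```python
from enum import Enum


class Assessment(Enum):
    OBVIOUSLY_RIGHT = "obviously right"
    OBVIOUSLY_WRONG = "obviously wrong"
    UNCLEAR = "unclear"  # anything that makes the reviewer stop and think


class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"


def review(assessments):
    """Accept only if there is something to review and every hunk is obviously right."""
    if assessments and all(a is Assessment.OBVIOUSLY_RIGHT for a in assessments):
        return Verdict.ACCEPT
    return Verdict.REJECT  # wrong and unclear are treated the same way


print(review([Assessment.OBVIOUSLY_RIGHT, Assessment.OBVIOUSLY_RIGHT]).value)  # accept
print(review([Assessment.OBVIOUSLY_RIGHT, Assessment.UNCLEAR]).value)          # reject
```

Note that there is no "look closer" branch: the cost of an unclear hunk is paid by regenerating, not by reviewing harder.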

I guess it is far removed from the advertised use case. Also, I feel one would be better off having auto-complete powered by an LLM in this case.

replies(3): >>44295800 #>>44295849 #>>44296573 #
1. vidarh No.44296573
Auto-complete means having to babysit it.

The more I use this, the longer I let the LLM work before I even look at the output, beyond maybe having it chug along on another screen and occasionally glancing over.

My shortest runs now usually take minutes of the LLM expanding my prompt into a plan, writing the tests, writing the code, linting its code, fixing any issues, and writing a commit message before I even review things.