Generative AI coding tools and agents do not work for me

(blog.miguelgrinberg.com)

399 points nomdep | 1 comments | 17 Jun 25 00:33 UTC

tptacek ◴[17 Jun 25 04:16 UTC] No.44295712[source]▶

I'm fine with anybody saying AI agents don't work for their work-style and am not looking to rebut this piece, but I'm going to take this opportunity to call something out.

The author writes "reviewing code is actually harder than most people think. It takes me at least the same amount of time to review code not written by me than it would take me to write the code myself". That sounds within an SD of true for me, too, and I had a full-time job close-reading code (for security vulnerabilities) for many years.

But it's important to know that when you're dealing with AI-generated code for simple, tedious, or rote tasks --- what they're currently best at --- you're not on the hook for reading the code that carefully, or at least, not on the same hook. Hold on before you jump on me.

Modern Linux kernels allow almost-arbitrary code to be injected at runtime, via eBPF (which is just a C program compiled to an imaginary virtual RISC). The kernel can mostly reliably keep these programs from crashing it. The reason for that isn't that we've solved the halting problem; it's that eBPF doesn't allow most programs at all --- for instance, it must be easily statically determined that any backwards branch in the program runs for a finite and small number of iterations. The eBPF verifier isn't even good at determining that this condition holds; it just knows a bunch of patterns in the CFG that it's sure about and rejects anything that doesn't fit.

That's how you should be reviewing agent-generated code, at least at first; not like a human security auditor, but like the eBPF verifier. If I so much as need to blink when reviewing agent output, I just kill the PR.
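As a rough illustration of that "accept only what you recognise" style of checking, here is a toy sketch in Python. It is not how the eBPF verifier is implemented; the names `verify`, `is_bounded_range_loop`, and `MAX_ITERATIONS` are hypothetical, and the check looks only at loops (it ignores recursion, comprehensions, and calls into other code).

```python
import ast

MAX_ITERATIONS = 64  # hypothetical bound, chosen only for illustration


def is_bounded_range_loop(node: ast.For) -> bool:
    """Accept only `for ... in range(<small non-negative int literal>)`."""
    call = node.iter
    return (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "range"
        and len(call.args) == 1
        and not call.keywords
        and isinstance(call.args[0], ast.Constant)
        and type(call.args[0].value) is int
        and 0 <= call.args[0].value <= MAX_ITERATIONS
    )


def verify(source: str) -> bool:
    """Return True only if every loop matches the one pattern we are sure about."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.While, ast.AsyncFor)):
            # Not a recognised pattern: reject without trying to reason about it.
            return False
        if isinstance(node, ast.For) and not is_bounded_range_loop(node):
            return False
    return True
```

The sketch rejects `while n > 0: n -= 1` even though that loop obviously terminates. That is the point of the analogy: the checker does not try to be clever, it rejects anything outside the small set of shapes it already trusts.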

If you want to tell me that every kind of code you've ever had to review is equally tricky to review, I'll stipulate to that. But that's not true for me. It is in fact very easy for me to look at a rote recitation of an idiomatic Go function and say "yep, that's what that's supposed to be".

replies(7): >>44295745 #>>44295773 #>>44295785 #>>44295795 #>>44296065 #>>44296839 #>>44296921 #

monero-xmr ◴[17 Jun 25 04:30 UTC] No.44295785[source]▶

>>44295712 #

I mostly just approve PRs because I trust my engineers. I have developed a 6th sense for thousand-line PRs and knowing which 100-300 lines need careful study.

Yes I have been burned. But 99% of the time, with proper test coverage it is not an issue, and the time (money) savings have been enormous.

"Ship it!" - me

replies(2): >>44295968 #>>44295984 #

theK ◴[17 Jun 25 05:26 UTC] No.44295984[source]▶

>>44295785 #

I think this points out the crux of the difference between collaborating with other devs and collaborating with an AI. The article correctly states that the AI will never learn your preferences or the idiosyncrasies of the specific project/company etc. because it is effectively amnesic. You cannot trust the AI the same way you trust other known collaborators because you don't have a real relationship with it.

replies(2): >>44296402 #>>44296584 #

1. loandbehold ◴[17 Jun 25 07:02 UTC] No.44296402[source]▶

>>44295984 #

Most AI coding tools are working on this problem. For example, with Claude Code you can add your preferences to a claude.md file. When I notice that I'm repeatedly correcting the same AI mistake, I add an instruction to claude.md to avoid it in the future. claude.md is exactly that: a memory of your preferences, idiosyncrasies, and other project-related info.
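A hypothetical example of what such a file might contain (the entries below are invented for illustration and are not taken from any real project; Claude Code's documentation spells the filename CLAUDE.md):

```
# Project conventions (illustrative)
- Use table-driven tests for Go code.
- Do not add new dependencies without asking first.
- Wrap errors with context instead of returning them bare.
```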
