
399 points nomdep | 1 comment | source
tptacek ◴[] No.44295712[source]
I'm fine with anybody saying AI agents don't work for their work-style and am not looking to rebut this piece, but I'm going to take this opportunity to call something out.

The author writes "reviewing code is actually harder than most people think. It takes me at least the same amount of time to review code not written by me than it would take me to write the code myself". That sounds within an SD of true for me, too, and I had a full-time job close-reading code (for security vulnerabilities) for many years.

But it's important to know that when you're dealing with AI-generated code for simple, tedious, or rote tasks --- what they're currently best at --- you're not on the hook for reading the code that carefully, or at least, not on the same hook. Hold on before you jump on me.

Modern Linux kernels allow almost-arbitrary code to be injected at runtime, via eBPF (which is just a C program compiled to an imaginary virtual RISC). The kernel can mostly reliably keep these programs from crashing the kernel. The reason for that isn't that we've solved the halting problem; it's that eBPF doesn't allow most programs at all --- for instance, it must be easily statically determined that any backwards branch in the program runs for a finite and small number of iterations. eBPF isn't even good at determining whether that condition holds; it just knows a bunch of patterns in the CFG that it's sure about and rejects anything that doesn't fit.

That's how you should be reviewing agent-generated code, at least at first; not like a human security auditor, but like the eBPF verifier. If I so much as need to blink when reviewing agent output, I just kill the PR.

If you want to tell me that every kind of code you've ever had to review is equally tricky to review, I'll stipulate to that. But that's not true for me. It is in fact very easy for me to look at a rote recitation of an idiomatic Go function and say "yep, that's what that's supposed to be".

replies(7): >>44295745 #>>44295773 #>>44295785 #>>44295795 #>>44296065 #>>44296839 #>>44296921 #
kenjackson ◴[] No.44295745[source]
I can read code much faster than I can write it.

This might be the defining line for Gen AI - people who can read code faster than they can write it will find it useful, and those who write faster than they can read won't use it.

replies(2): >>44295945 #>>44296246 #
autobodie ◴[] No.44295945[source]
I think that's wrong. I only have to write code once, maybe twice. But when using AI agents, I have to read many (5? 10? I will always give up before 15) PRs before finding one close enough that I won't have to rewrite all of it. This nonsense has not saved me any time, and the process is miserable.

I also haven't found any benefit in aiming for smaller or larger PRs. The aggregate efficiency seems to even out, because smaller PRs are easier to weed through but they are not less likely to be trash.

replies(1): >>44296043 #
kenjackson ◴[] No.44296043[source]
I only generate the code once with GenAI and typically fix a bug or two - or at worst use its structure. Rarely do I toss a full PR.

It’s interesting some folks can use them to build functioning systems and others can’t get a PR out of them.

replies(2): >>44296359 #>>44296712 #
dagw ◴[] No.44296359[source]
It’s interesting some folks can use them to build functioning systems and others can’t get a PR out of them.

It is 100% a function of what you are trying to build, what language and libraries you are building it in, and how sensitive that thing is to factors like performance and getting the architecture just right. I've experienced building functioning systems with hardly any intervention, and also repeatedly failing to get code that even compiles after over an hour of effort. There exists a small but popular subset of programming tasks where gen AI excels, and a massive tail of tasks where it is much less useful.