I wonder how long it will be before the trademarks of LLM text are seen as a sign of bad writing or laziness instead? And then maybe we'll have an arms race of stylistic changes.
---
Completely agree with the author:
Earlier this week I asked Claude to summarize a bunch of code files while I was hunting for a bug. It wrote paragraphs and offered three suggestions. But when I actually read it, most of it was generic and vague. The conditions required to trigger the bug in those ways couldn't actually exist, but it wrapped the ideas in so many words that it took me longer to notice the suggestions were wrong.
I told it "this won't happen those ways [because blah blah blah]" and it did the "you are correct!" compliment-dance and tried again: one new suggestion, plus a claimed reason why one of its original suggestions might still be right. The new suggestion seemed promising, but I wasn't entirely convinced, so I tried again. It went back to the original three suggestions - the "here's why that won't happen" was still in the context window, but it had hit some limit of its model. It was as if it was trying to reconcile being reinforcement-learning'd into "generate something that looks like a helpful answer" with "there is information in the context window saying the text I want to generate is wrong," and failing. We got into a loop.
It's a rare bug, so I don't know yet whether that promising suggestion was right. I've added some logging around it in the meantime.
The counterfactuals are hard to evaluate:
* would I have identified that potential change quicker without asking it? Or at all?
* would I have identified something else that it didn't point out?
* what if I hadn't noticed the problems with some other suggestions and spent a bunch of time chasing them?
The words:information ratio was a big problem in spotting the issues.
So was the "text completion" quality of its output - the RL-shaped assumption that "if you're asking about a problem here, there must be a solution I can offer." It didn't seem to be genuinely evaluating the code and then deciding; it was more like "yes, I will definitely tell you there are things we can change, here are some that seem plausible."
Imagine if my coworker had asked me the question and I'd just copy-pasted Claude's first crap attempt back to them in response. Rude as hell.