
371 points ulrischa | 1 comment
not2b ◴[] No.43235837[source]
If the hallucinated code doesn't compile (or in an interpreted language, immediately throws exceptions), then yes, that isn't risky because that code won't be used. I'm more concerned about code that appears to work for some test cases but solves the wrong problem or inadequately solves the problem, and whether we have anyone on the team who can maintain that code long-term or document it well enough so others can.
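To make the concern concrete, here is a hypothetical sketch (not from the thread) of the kind of bug being described: code that passes the spot checks someone happens to run, yet solves the problem incorrectly for other inputs. The `median` function below is an assumed example.

```python
# Hypothetical illustration: a "median" that passes spot checks but is wrong.
# A naive version sorts the list and takes the middle element:
def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

# Spot checks on odd-length inputs all pass:
assert median([3, 1, 2]) == 2
assert median([5]) == 5

# But for an even-length input the median should average the two middle
# values; this version silently returns one of them instead:
print(median([1, 2, 3, 4]))  # prints 3, but the correct median is 2.5
```

A test suite built only from the odd-length cases would pass, which is exactly why review by someone who understands the spec matters more than "the tests are green".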
replies(2): >>43235865 #>>43237349 #
t14n ◴[] No.43235865[source]
fwiw this problem already exists with my more junior co-workers. and also my own code that I write when exhausted!

if you have trusted processes for review and aren't always rushing out changes without triple checking your work (plus a review from another set of eyes), then I think you catch a lot of the subtler bugs that are emitted from an LLM.

replies(1): >>43244565 #
not2b ◴[] No.43244565[source]
Yes, code review can catch these things. But code review for more complex issues works better when the submitter can walk the reviewers through the design and explain the details (sometimes the reviewers catch a flaw in the submitter's reasoning before they spot the issue in the code: it becomes clear that the developer didn't adequately understand the spec or the problem to be solved). If an LLM produced the code, a rigorous review process takes longer, which reduces the value of using the LLM in the first place.