
66 points by appwiz | 1 comment
simonw No.44383691
I still don't think hallucinations in generated code matter very much. They show up the moment you try to run the code, and with the current batch of "coding agent" systems it's the LLM itself that spots the error when it attempts to run what it just wrote.
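That loop is simple enough to sketch: generate code, execute it in a subprocess, and feed any traceback back to the model for another attempt. A minimal illustration, assuming a hypothetical llm_generate() call standing in for whatever model API the agent actually uses:

    import subprocess
    import sys
    import tempfile

    def run_generated_code(code: str) -> tuple[bool, str]:
        """Run model-generated Python in a subprocess; return (success, stderr)."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True, text=True, timeout=30,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out after 30s"
        return result.returncode == 0, result.stderr

    def repair_loop(task: str, max_attempts: int = 3) -> str:
        """Generate, execute, and feed any traceback back for a fix."""
        code = llm_generate(task)  # hypothetical model call, not a real API
        for _ in range(max_attempts):
            ok, stderr = run_generated_code(code)
            if ok:
                return code
            # A hallucinated import or method name surfaces here as a
            # traceback, which the model sees on the next attempt.
            code = llm_generate(
                f"{task}\n\nThis attempt failed:\n{code}\n\nError:\n{stderr}\nFix it."
            )
        return code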

I was surprised that this paper talked more about RAG solutions than about tool-use-based solutions. Tool use seems to me like a proven solution to this problem at this point.

replies(4): >>44384474 >>44384576 >>44387027 >>44388124
mucha No.44387027
Interesting. How do existing systems catch Task Requirement hallucinations?
replies(1): >>44387759
simonw No.44387759
They don't. My comment was about "hallucinations in generated code".