
277 points by simianwords | 1 comment
amelius No.45149170
They hallucinate because it's an ill-defined problem with two conflicting use cases:

1. If I give it the first two lines of a story, I want the LLM to complete the story. This requires hallucination, because it has to make things up; the story has to be original.

2. If I ask it a question, I want it to reply with facts. It should not make anything up.

LMs were originally designed for (1), because researchers thought (2) was out of reach. It turned out that, without any fundamental changes, LMs could do a little bit of (2), and since that discovery things have improved, but not to the point where hallucination has disappeared or is even under control. (See the sketch below.)
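
A minimal sketch of the tension, assuming the Hugging Face transformers library and the small placeholder model "gpt2": the same weights serve both use cases, and in practice only the decoding settings change. Note that greedy decoding merely removes randomness; it does not make the output factual.

    # Minimal sketch, assuming the `transformers` library and a
    # placeholder model ("gpt2"); real chat models differ.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Once upon a time, in a city built on stilts,"
    inputs = tok(prompt, return_tensors="pt")

    # Use case (1): story completion. Sampling with a high temperature
    # deliberately invites novel, "made up" continuations.
    story = model.generate(**inputs, max_new_tokens=50,
                           do_sample=True, temperature=1.0, top_p=0.95)

    # Use case (2): factual answering. Greedy decoding removes the
    # randomness, but it only reduces variance; it does not make the
    # model truthful, which is the point of the comment above.
    answer = model.generate(**inputs, max_new_tokens=50, do_sample=False)

    print(tok.decode(story[0], skip_special_tokens=True))
    print(tok.decode(answer[0], skip_special_tokens=True))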

1. skybrian No.45149889
I don’t think it’s inherently ill-defined, since the context can tell you whether fiction is being requested or not. For an AI chatbot, the default shouldn’t be fiction.

What is true is that during pretraining, the model doesn’t know enough to determine this or to distinguish between what it knows and what it’s making up. This is a higher-level distinction that emerges later, if at all.
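
A crude way to see the gap: the only native "confidence" signal a pretrained model exposes is the probability it assigns to each token, and that measures fluency in context, not knowledge. A minimal sketch, assuming PyTorch and transformers with the placeholder model "gpt2":

    # Score each token of a statement by the log-probability the model
    # assigned it. Low scores flag surprising tokens, not false ones;
    # that is exactly the distinction the pretrained model lacks.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    text = "The capital of Australia is Canberra."
    ids = tok(text, return_tensors="pt").input_ids

    with torch.no_grad():
        logits = model(ids).logits              # shape [1, seq, vocab]

    # Log-probability of each token given its preceding context.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)

    for t, lp in zip(tok.convert_ids_to_tokens(ids[0, 1:].tolist()),
                     token_lp):
        print(f"{t:>12}  {lp.item():7.3f}")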

The recent research that discovered an “evil vector” is an example of such a higher-level distinction.
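
For context, that finding belongs to the steering-vector family of techniques: estimate a direction in hidden-state space as a difference of mean activations over contrasting prompts, then add it back into the residual stream during generation. Below is a minimal sketch of that general recipe; the model, layer index, scale, and prompts are all placeholder choices, not what the research used.

    # Minimal steering-vector sketch on a placeholder model ("gpt2").
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    LAYER = 6  # arbitrary middle layer of a 12-layer model

    def mean_hidden(prompts):
        # Mean hidden state after block LAYER, at each prompt's last
        # token. (hidden_states[i + 1] is the output of block i.)
        states = []
        with torch.no_grad():
            for p in prompts:
                ids = tok(p, return_tensors="pt").input_ids
                out = model(ids, output_hidden_states=True)
                states.append(out.hidden_states[LAYER + 1][0, -1])
        return torch.stack(states).mean(dim=0)

    # The vector is a difference of means over contrasting prompt sets.
    vec = mean_hidden(["You are cruel and deceptive.",
                       "You enjoy hurting people."]) \
        - mean_hidden(["You are kind and honest.",
                       "You enjoy helping people."])

    def steer(module, inputs, output):
        # Forward hook: nudge the residual stream along the vector.
        # The 4.0 scale is an arbitrary illustrative choice.
        return (output[0] + 4.0 * vec,) + output[1:]

    handle = model.transformer.h[LAYER].register_forward_hook(steer)
    ids = tok("How should I treat my coworkers?",
              return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=30, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
    handle.remove()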