
277 points simianwords | 2 comments
rhubarbtree ◴[] No.45152883[source]
I find this rather oddly phrased.

LLMs hallucinate because they are language models. They are stochastic models of language. They model language, not truth.

If “truthy” responses are common in the training set for a given prompt, you are more likely to get something useful as output. It feels like we stumbled into that property, decided the model was useful as an information-retrieval tool, and now use RL to reinforce that useful behaviour. But it is still a (biased) language model.

I don’t think that’s how humans work. There’s more to it. We need a model of language, but it’s not sufficient to explain our mental mechanisms. We have other ways of thinking than generating language fragments.

Trying to eliminate cases where a stochastic model the size of an LLM gives “undesirable” or “untrue” responses seems rather odd.

replies(9): >>45152948 #>>45153052 #>>45153156 #>>45153672 #>>45153695 #>>45153785 #>>45154058 #>>45154227 #>>45156698 #
crystal_revenge ◴[] No.45153785[source]
People also tend not to understand the absurdity of assuming that we can make LLMs stop hallucinating. It would imply not only that truth is absolutely objective, but that it exists on some smooth manifold which language can be mapped to.

That means there would be some high-dimensional surface representing "all true things". Any fact could be trivially resolved as "true" or "false" simply by checking whether it was represented on this surface. Whether "My social security number is 123-45-6789" is true could be determined simply by checking whether that statement maps onto the truth manifold. Likewise, you could wander around the manifold and generate an enumeration of all true things.

If such a thing existed it would make even the wildest fantasies about AGI seem tame.

edit: To simplify it further, this would imply you could have an 'is_true(statement: string): bool' function for any arbitrary statement in English.
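To make the objection concrete, here is a minimal sketch (the `is_true` oracle and the liar-sentence encoding are purely illustrative, not any real API): whatever boolean a candidate oracle returns for a sentence asserting its own falsehood, the answer contradicts the sentence, so no total, consistent `is_true` can exist.

```python
def consistent_on_liar(is_true):
    """Test a candidate truth oracle on the liar sentence, which
    asserts that the oracle itself returns False for it."""
    liar = "is_true(liar) == False"
    verdict = is_true(liar)      # what the oracle answers
    truth = (verdict == False)   # what the sentence then actually is
    return verdict == truth      # consistent only if these agree

# No oracle can answer consistently, whichever way it goes:
print(consistent_on_liar(lambda s: True))   # False
print(consistent_on_liar(lambda s: False))  # False
```

Any implementation, however clever, hits the same wall on self-referential input; the two constant oracles above are just the simplest counterexamples.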

replies(5): >>45153832 #>>45154240 #>>45154507 #>>45155042 #>>45155447 #
1. beeflet ◴[] No.45155447[source]
Maybe if a language model were absolutely massive, it could <think> enough to simulate the entire universe and determine your social security number
replies(1): >>45155634 #
2. riwsky ◴[] No.45155634[source]
42