277 points simianwords | 2 comments
rhubarbtree ◴[] No.45152883[source]
I find this rather oddly phrased.

LLMs hallucinate because they are language models. They are stochastic models of language. They model language, not truth.

If “truthy” responses are common in the training set for a given prompt, you’re more likely to get something useful as output. It feels like we stumbled into that property, decided the model is useful as an information-retrieval tool, and now use RL to reinforce that useful behaviour. But it’s still a (biased) language model.
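
To make the “stochastic model of language” point concrete, here is a toy sketch (made-up vocabulary and scores, not any real model): the next token is sampled from a distribution shaped by the training data, and nothing in that step checks truth.

    import numpy as np

    # Toy next-token step: sample from a distribution over a tiny vocabulary.
    # The scores are made up; the point is that sampling follows whatever the
    # training data made likely, with no notion of truth anywhere in the loop.
    vocab = ["Paris", "Lyon", "Mars", "1889"]
    logits = np.array([4.0, 1.5, 0.5, 2.0])  # fake scores for "The Eiffel Tower is in ..."

    def sample_next_token(logits, temperature=1.0):
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)

    print(vocab[sample_next_token(logits)])  # usually "Paris", occasionally not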

I don’t think that’s how humans work. There’s more to it. We need a model of language, but that alone isn’t sufficient to explain our mental mechanisms. We have ways of thinking other than generating language fragments.

Trying to eliminate cases where a stochastic model the size of an LLM gives “undesirable” or “untrue” responses seems rather odd.

replies(9): >>45152948 #>>45153052 #>>45153156 #>>45153672 #>>45153695 #>>45153785 #>>45154058 #>>45154227 #>>45156698 #
crystal_revenge ◴[] No.45153785[source]
People also tend not to understand the absurdity of assuming that we can make LLMs stop hallucinating. It would imply not only that truth is absolutely objective, but that it lives on some smooth manifold onto which language can be mapped.

That means there would be some high-dimensional surface representing "all true things". Any fact could be trivially resolved as "true" or "false" simply by checking whether it was represented on this surface. Whether or not "My social security number is 123-45-6789" is true could be determined simply by checking whether that statement maps onto the truth manifold. Likewise, you could wander around the truth manifold and generate every true thing.

If such a thing existed it would make even the wildest fantasies about AGI seem tame.

edit: To simplify it further, this would imply you could have an 'is_true(statement: string): bool' function for any arbitrary statement in English.
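
Spelled out as code, the claim looks like this (purely hypothetical: 'embed' and 'TruthManifold' are stand-ins for things nobody knows how to build):

    class TruthManifold:
        """Hypothetical surface containing 'all true things'."""
        def contains(self, point) -> bool:
            raise NotImplementedError("the part nobody knows how to build")

    def embed(statement: str):
        """Hypothetical map from arbitrary English onto the manifold's space."""
        raise NotImplementedError("also the part nobody knows how to build")

    def is_true(statement: str) -> bool:
        # If both pieces existed, a membership test would be a universal fact-checker.
        return TruthManifold().contains(embed(statement))

A single call would settle anything from the social security example to open conjectures, which is exactly why it reads like a fantasy rather than a roadmap.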

replies(5): >>45153832 #>>45154240 #>>45154507 #>>45155042 #>>45155447 #
mqus ◴[] No.45153832[source]
Well, no. The article pretty much says that any arbitrary statement can be mapped to {true, false, I don't know}. That's still not 100% accurate, but it at least seems reachable. The model just needs to be able to recognize unknowns, not verify every single fact.
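
Put differently, the weaker requirement is abstention rather than omniscience. A rough sketch, assuming the model exposes some calibrated confidence score ('p_true' here is a stand-in, not a real API):

    from enum import Enum

    class Verdict(Enum):
        TRUE = "true"
        FALSE = "false"
        UNKNOWN = "I don't know"

    # Abstain whenever confidence is low instead of guessing.
    def classify(p_true: float, threshold: float = 0.9) -> Verdict:
        if p_true >= threshold:
            return Verdict.TRUE
        if p_true <= 1 - threshold:
            return Verdict.FALSE
        return Verdict.UNKNOWN

    print(classify(0.97))  # Verdict.TRUE
    print(classify(0.60))  # Verdict.UNKNOWN
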
replies(1): >>45153929 #
gary_0 ◴[] No.45153929[source]
Determining a statement's truth (or whether it lies outside the system's knowledge) is an old problem in machine intelligence, with whole subfields like knowledge graphs devoted to it, and it's NOT a problem LLMs were originally meant to address at all.

LLMs are text generators that are very good at writing a book report based on a prompt and the patterns learned from the training corpus, but it's an entirely separate problem to go through that book report statement by statement and determine if each one is true/false/unknown. And that problem is one that the AI field has already spent 60 years on, so there's a lot of hubris in assuming you can just solve that and bolt it onto the side of GPT-5 by next quarter.
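
For scale, here is a toy version of that separate problem: check each extracted claim against a knowledge graph and answer true/false/unknown. The hard parts (building the graph, parsing free text into triples at all) are exactly where those 60 years went; this only shows the shape of the task.

    # Tiny in-memory "knowledge graph" of (subject, relation, object) triples.
    FACTS = {("Paris", "capital_of", "France")}

    def check(subj: str, rel: str, obj: str) -> str:
        if (subj, rel, obj) in FACTS:
            return "true"
        # If we know this subject/relation pair and the object differs, call it
        # false (only sound for single-valued relations like capital_of).
        if any(s == subj and r == rel for s, r, _ in FACTS):
            return "false"
        return "unknown"

    print(check("Paris", "capital_of", "France"))  # true
    print(check("Paris", "capital_of", "Spain"))   # false
    print(check("Lyon", "population", "500000"))   # unknown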

replies(1): >>45155631 #
red75prime ◴[] No.45155631[source]
> And that problem is one that the AI field has already spent 60 years on

I hope you don't think that the solution will be a closed-form expression. The solution will have to involve exploration and learning. The things that LLMs happen to be instrumental in, you know.

replies(2): >>45155808 #>>45156174 #
1. sirwhinesalot ◴[] No.45156174[source]
Not the same person, but I think the "structure" of what the ML model is learning can have a substantial impact, especially if it then builds on that to produce further output.

Learning to guess the next token is very different from learning to map text to a hypervector representing a graph of concepts. This shows up in image-classification tasks involving overlapping objects, where the output must describe their relative positioning: vector-symbolic models perform substantially better than more "brute-force" neural nets of equivalent size.
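
For a flavour of what mapping text to a hypervector can mean, here is a rough vector-symbolic sketch (the generic bind/bundle recipe, not any particular published model): symbols are random bipolar hypervectors, binding is elementwise multiplication, bundling is addition, and relations can be read back out by similarity.

    import numpy as np

    D = 10_000
    rng = np.random.default_rng(0)

    def hv():
        # Random bipolar hypervector; distinct symbols are nearly orthogonal.
        return rng.choice([-1, 1], size=D)

    CAT, DOG, LEFT_OF, ABOVE, SUBJ, REL, OBJ = (hv() for _ in range(7))

    # Encode "cat is left of dog": bind each filler to its role (elementwise
    # multiply), then bundle the role/filler pairs (add).
    scene = SUBJ * CAT + REL * LEFT_OF + OBJ * DOG

    # Query the relation: unbind the REL role and compare candidates by cosine.
    probe = scene * REL
    for name, cand in [("left_of", LEFT_OF), ("above", ABOVE)]:
        sim = probe @ cand / (np.linalg.norm(probe) * np.linalg.norm(cand))
        print(name, round(float(sim), 2))  # left_of scores far higher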

But this is still different from hardcoding a knowledge graph or using closed-form expressions.

Human intelligence relies on neural structures very similar to those we use for movement. Reference frames are both how we navigate the world and how we think. There's no reason to limit ourselves to next-token prediction. It works great because it's easy to set up with the training data we have, but it's otherwise a very "dumb" way to go about it.

replies(1): >>45179842 #
2. red75prime ◴[] No.45179842[source]
I mostly agree. But next-token prediction is the pretraining phase of an LLM, not all there is to LLMs.