277 points simianwords | 2 comments

roxolotl ◴[] No.45148981[source]
This seems inherently false to me, or at least partly false. It’s reasonable to say LLMs hallucinate because they aren’t trained to say when they don’t have a statistically significant answer. But there is no knowledge of correct vs incorrect in these systems. It’s all statistics, so what OpenAI is describing sounds like a reasonable way to reduce hallucinations, but not a way to eliminate them, and not something that addresses the root cause.
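To put a number on that incentive point: under accuracy-only grading, guessing beats abstaining no matter how unsure the model is. A toy sketch in Python (the probabilities are made up purely for illustration):

    # Expected grade when a correct answer scores 1 and anything else scores 0.
    def expected_score(p_correct: float, guess: bool) -> float:
        return p_correct if guess else 0.0  # abstaining ("I don't know") always scores 0

    for p in (0.9, 0.5, 0.1, 0.01):
        print(f"p={p:<4}  guess={expected_score(p, True):.2f}  abstain={expected_score(p, False):.2f}")
    # Even at p=0.01 the guess has positive expected score, so a model tuned
    # against such a grader learns to answer rather than admit uncertainty.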
replies(4): >>45149040 #>>45149166 #>>45149458 #>>45149946 #
ACCount37 ◴[] No.45149166[source]
Is there any knowledge of "correct vs incorrect" inside you?

If "no", then clearly, you can hit general intelligence without that.

And if "yes", then I see no reason why an LLM can't have that knowledge crammed inside it too.

Would it be perfect? Hahahaha no. But I see no reason why "good enough" could not be attained.

replies(3): >>45149445 #>>45149581 #>>45155233 #
thaumasiotes ◴[] No.45149581[source]
> And if "yes", then I see no reason why an LLM can't have that knowledge crammed inside it too.

An LLM, by definition, doesn't have such a concept. It's a model of language, hence "LLM".

Do you think the phrase just means "software"? Why?

replies(1): >>45149734 #
ACCount37 ◴[] No.45149734[source]
If I had a penny for every confidently incorrect "LLMs can’t do X", I’d be able to buy an H100 with them.

Here's a simple test: make up a brand new word, or a brand new person. Then ask a few LLMs what the word means, or when that person was born.

If an LLM had zero operational awareness of its knowledge, it would be unable to recognize that the word/person is unknown to it. It would always generate a plausible-sounding explanation for what the word might mean, the same exact way it does for the word "carrot". Or a plausible-sounding birth date, the way it does for the person "Abraham Lincoln".

In practice, most production-grade LLMs would recognize that a word or a person is unknown to them.

This is a very limited and basic version of the desirable "awareness of its own knowledge" - and one that's already present in current LLMs! Clearly, there's room for improved self-awareness.
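A rough sketch of the test described above, assuming the OpenAI Python SDK; the model name and the invented word are placeholders, not anything from the thread:

    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    made_up_word = "flumbrixet"  # invented on the spot; not a real word
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user",
                   "content": f'What does the word "{made_up_word}" mean?'}],
    )
    print(resp.choices[0].message.content)
    # Repeat with "carrot" for comparison: the claim is that a production model
    # answers confidently for the real word and says it doesn't know the fake one.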

replies(1): >>45150023 #
pessimizer ◴[] No.45150023[source]
Do they "recognize" that they don’t know the word, or are there just no statistically plausible surroundings to embed a nonsense word into, other than the settings that usually surround un-tokenizable words?

If you told one to write a Lewis Carroll poem about a nonsense word, it wouldn’t have any problem. Not because it "recognizes" the word as being like a nonsense word in a Lewis Carroll poem, but because those poems are filled with other un-tokenizable words that could be replaced with anything.

I'm starting to come to the conclusion that LLMs are Mad-Libs at scale. Which are actually very useful. If there are paragraphs where I can swap out the words for other words, and generate a plausible idea, I can try it out in the real world and it might really work.

replies(2): >>45150523 #>>45152182 #
ACCount37 ◴[] No.45150523{3}[source]
I don’t think there’s a direct link to the tokenizer; it’s a higher-level capability. You can stitch together a nonsense word out of common "word fragment" tokens and see if that impairs the LLM’s ability to recognize the word as nonsense.
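A sketch of that probe, again assuming tiktoken: glue ordinary English fragments into a word that means nothing, check how it actually tokenizes, and then ask the model about it as in the earlier test.

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    fragments = ["cons", "ter", "ment", "able"]  # familiar English subword material
    stitched = "".join(fragments)                # "constermentable" -- not a real word
    print(stitched, "->", enc.encode(stitched))
    # Whether each fragment survives as its own token depends on the merge rules;
    # encode() shows the actual split. If the model still flags the stitched word
    # as unknown, the behavior isn't just a tokenizer artifact.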
replies(1): >>45151128 #
Jensson ◴[] No.45151128{4}[source]
That is wrong. I just generated 5 random letters in Python and sent them to gpt-5, and it totally failed to answer properly: it said "Got it, whats up :)" even though what I wrote isn’t recognizable at all.

The "capability" you see is the LLM recognizing it’s a human-typed random string, since human-typed random strings are not very random. If you send it an actual random word, it typically fails.
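For reference, the kind of string being described: truly random letters rather than a human keyboard-mash, e.g.:

    import random
    import string

    word = "".join(random.choices(string.ascii_lowercase, k=5))
    print(word)  # e.g. "qzvkx"; paste this into the chat and ask what it means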

replies(1): >>45156014 #
1. pfg_ ◴[] No.45156014{5}[source]
I tried this four times, and every time it recognized it as nonsense.
replies(1): >>45156122 #
2. typpilol ◴[] No.45156122[source]
Same