
277 points | simianwords | 1 comment
kingstnap No.45150336
There is a deeply wrong part of this paper that no one has mentioned:

The model head doesn't hallucinate. The sampler does.

Say you ask an LLM when X was born and it doesn't know, and you look at the actual model output, which is a probability distribution over tokens.

IDK is cleanly represented as a uniform probability over dates from Jan 1 to Dec 31.

If you ask it to answer a multiple-choice question and it doesn't know, it will say this:

25% A, 25% B, 25% C, 25% D.

Which is exactly, and correctly, the "right answer". The model has admitted it doesn't know. It doesn't hallucinate anything.

In reality we need something smarter than a random sampler to actually extract this information. The knowledge, and the lack of knowledge, is already there; the sampler just turns it into bullshit.
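
To make that concrete, here is a minimal sketch (in Python, with made-up probabilities and an arbitrary threshold, not anything from the paper) of a "smarter than random sampling" decision rule: read the answer-token distribution directly and abstain when it's near-uniform, i.e. when its entropy is high.

    import math

    # Hypothetical answer-token distributions read off the model head
    # (numbers are invented for illustration).
    confident = {"A": 0.94, "B": 0.02, "C": 0.02, "D": 0.02}
    clueless  = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

    def entropy_bits(probs):
        # Shannon entropy of the distribution, in bits.
        return -sum(p * math.log2(p) for p in probs.values() if p > 0)

    def answer_or_abstain(probs, threshold_bits=1.0):
        # Answer with the argmax token only if the distribution is peaked;
        # otherwise admit ignorance. The threshold is an arbitrary choice.
        if entropy_bits(probs) > threshold_bits:
            return "I don't know"
        return max(probs, key=probs.get)

    print(answer_or_abstain(confident))  # A
    print(answer_or_abstain(clueless))   # I don't know

With the uniform distribution the entropy is 2 bits, so this rule abstains instead of flipping a four-sided coin and producing a confident-looking wrong answer.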

replies(4): >>45150445 >>45150464 >>45154049 >>45167516
1. cyanydeez No.45150464
I'm betting there's a graph model using various vectors that could improve outcomes on known-knowns.

But unknown-unknowns likely reduce to the Halting problem, which human intelligence doesn't really solve either.