
277 points by simianwords | 1 comment
rhubarbtree ◴[] No.45152883[source]
I find this rather oddly phrased.

LLMs hallucinate because they are language models. They are stochastic models of language. They model language, not truth.

If “truthy” responses are common in the training set for a given prompt, you’re more likely to get something useful as output. It feels like we stumbled into that property, decided the model is useful as an information-retrieval tool, and now use RL to reinforce that useful behaviour. But it’s still a (biased) language model.
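
To make “stochastic model of language” concrete, here is a toy sketch (the prompt, tokens and probabilities below are invented for illustration, not taken from any real model): sampling picks a continuation in proportion to its probability, and nothing in that procedure ever checks the result against reality.

    import random

    # Toy next-token distribution for the prompt "The capital of Australia is".
    # The numbers are made up; a real model produces them from learned weights.
    next_token_probs = {
        "Canberra":  0.55,  # the true answer happens to be most likely...
        "Sydney":    0.30,  # ...but a plausible wrong answer is close behind
        "Melbourne": 0.10,
        "Vienna":    0.05,
    }

    def sample_next_token(probs: dict[str, float]) -> str:
        """Sample a token in proportion to its probability -- no truth check anywhere."""
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights, k=1)[0]

    # Roughly 45% of samples here are a confident-sounding wrong answer.
    print(sample_next_token(next_token_probs))

A “hallucination” is just this sampling loop doing what it always does when the distribution puts mass on false continuations.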

I don’t think that’s how humans work. There’s more to it. We need a model of language, but it’s not sufficient to explain our mental mechanisms. We have other ways of thinking than generating language fragments.

Trying to eliminate cases where a stochastic model the size of an LLM gives “undesirable” or “untrue” responses seems rather odd.

1. munchler ◴[] No.45153672[source]
This is directly addressed in the article, which argues that language models can be trained to abstain when uncertain by changing how rewards are set up: current incentives reward guessing over being honest about uncertainty. If you disagree, it would be helpful to explain why rather than responding only to the title.
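
As a rough sketch of that incentive argument (the scoring values below are illustrative, not the article’s): under a binary grader, guessing never scores worse than abstaining in expectation, so a model tuned on that signal learns to guess; once wrong answers carry a penalty, abstaining wins whenever confidence is low enough.

    # Expected score of guessing vs. abstaining, given confidence p_correct.
    # wrong_penalty = 0 models binary grading; > 0 models a rubric that
    # docks points for confident errors (values are illustrative).
    def expected_score(p_correct: float, wrong_penalty: float) -> dict[str, float]:
        return {
            "guess":   p_correct * 1.0 + (1 - p_correct) * (-wrong_penalty),
            "abstain": 0.0,
        }

    for p in (0.9, 0.5, 0.2):
        binary    = expected_score(p, wrong_penalty=0.0)
        penalized = expected_score(p, wrong_penalty=1.0)
        print(f"confidence={p:.1f}  binary={binary}  penalized={penalized}")

With no penalty, guessing always has non-negative expected value, so it dominates abstaining; with a one-point penalty the break-even confidence is 0.5, and abstaining becomes the better move below it.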