
416 points floverfelt | 7 comments
oo0shiny ◴[] No.45057794[source]
> My former colleague Rebecca Parsons, has been saying for a long time that hallucinations aren’t a bug of LLMs, they are a feature. Indeed they are the feature. All an LLM does is produce hallucinations, it’s just that we find some of them useful.

What a great way of framing it. I've been trying to explain this to people, but this is a succinct version of what I was stumbling to convey.

replies(5): >>45060348 #>>45060455 #>>45061299 #>>45061334 #>>45061655 #
jstrieb ◴[] No.45060348[source]
I have been explaining this to friends and family by comparing LLMs to actors. They deliver a performance in-character, and are only factual if it happens to make the performance better.

https://jstrieb.github.io/posts/llm-thespians/

replies(4): >>45061465 #>>45061893 #>>45061997 #>>45063090 #
1. red75prime ◴[] No.45061893[source]
The analogy goes down the drain when the criterion for a good performance is being objectively right, as with Reinforcement Learning from Verifiable Rewards (RLVR).
replies(3): >>45062885 #>>45064907 #>>45065336 #
2. ACCount37 ◴[] No.45062885[source]
RLVR can also encourage hallucinations quite easily. Think of the SAT: a random answer is right 20% of the time; "I don't know" is right 0% of the time. If you only reward test score, you encourage guesswork. So good RL reward design is as important as ever.
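
A back-of-the-envelope sketch of that incentive (the reward values below are hypothetical, not any lab's actual grading scheme): with an accuracy-only reward, guessing on an unknown question always has higher expected reward than abstaining, while a scheme that penalizes wrong answers and gives partial credit for abstaining flips the incentive.

    # Expected reward on a question the model genuinely doesn't know,
    # assuming 5 answer choices (reward values are hypothetical).
    def expected_reward(p_lucky_guess, r_correct, r_wrong, r_abstain):
        guess = p_lucky_guess * r_correct + (1 - p_lucky_guess) * r_wrong
        return guess, r_abstain

    # Accuracy-only reward: 1 for correct, 0 otherwise.
    print(expected_reward(0.2, r_correct=1.0, r_wrong=0.0, r_abstain=0.0))
    # -> (0.2, 0.0): guessing wins, so the policy learns to bluff.

    # Penalize confident errors, give partial credit for abstaining.
    print(expected_reward(0.2, r_correct=1.0, r_wrong=-1.0, r_abstain=0.25))
    # -> (-0.6, 0.25): abstaining wins, so "I don't know" becomes viable.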

That being said, there are methods to train LLMs against hallucinations, and they do improve hallucination-avoidance. But anti-hallucination capabilities are fragile and do not fully generalize. There's no (known) way to train full awareness of its own capabilities into an LLM.

replies(1): >>45064815 #
3. neonspark ◴[] No.45064815[source]
I think what you say is true, and it's exactly true for humans as well. There is no known way to completely eliminate unintentional bullshit coming out of a human's mouth. We have many techniques for reducing it, including critical thinking, but we are all susceptible to it, and I imagine we do it many times a day without too much concern.

We need to make these models much, much better, but it's going to be quite difficult to get their hallucination rates down even to human levels. And the BS will always be there with us. I suppose BS is the natural side effect of any complex system, artificial or biological, that tries to navigate the problem space of reality and speak about it. These systems, sometimes called "minds", are going to produce things that sound right but just are not true.

replies(1): >>45065236 #
4. jstrieb ◴[] No.45064907[source]
Nobody that I'd be using this analogy with is currently using LLMs for tasks covered by RLVR. They're asking models for factual information about the real world (a Google replacement) or to generate text (write a cover letter), not for the kind of output that is verifiable within a formal system, which is by definition the kind of output RLVR is intended to improve. The actor analogy is still helpful for giving intuition to non-technical people who don't know how to think about LLMs, but do use them.

Also, unless I am mistaken, RLVR changes the training to make LLMs less likely to hallucinate, but in no way does it make hallucination impossible. Under the hood, the models still work the same way (after training), and the analogy still applies, no?

replies(1): >>45066936 #
5. ACCount37 ◴[] No.45065236{3}[source]
It's a feeling I can't escape: that by trying to build thinking machines, we glimpse more and more of how the human mind works, and why it works the way it does - imperfections and all.

"Critical thinking" and "scientific method" feel quite similar to the "let's think step by step" prompt for the early LLMs. More elaborate directions, compensating for the more subtle flaws of a more capable mind.

6. jimbokun ◴[] No.45065336[source]
But being "objectively right" is not the goal of an actor.

Which is why it's a good metaphor for the behavior of LLMs.

7. red75prime ◴[] No.45066936[source]
> Under the hood, the models still work the same way (after training), and the analogy still applies, no?

Under the hood we have billions of parameters that defy any simple analogies.

The network's operation is shaped by human data, but its structure is not like the human brain's. So we have something that is human-like in some ways, yet deviates from humans in ways that are unlikely to resemble anything we can observe in humans (and use as a basis for analogy).