
385 points vessenes | 4 comments

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, the token-by-token choice at each step leads to runaway errors -- these can't be damped mathematically.
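To make that compounding-error intuition concrete (my own toy numbers, not anything from LeCun): if every generated token independently derails the answer with probability e, and nothing ever recovers from a derailment, the chance an n-token response stays on track is (1 - e)^n, e.g.:

    for e in (0.001, 0.01, 0.05):
        for n in (100, 1000):
            print(f"e={e}, n={n}: P(on track) ~ {(1 - e) ** n:.2e}")

Whether real decoding errors actually behave independently like this is, as I understand it, exactly the part people argue about.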

Instead, he offers the idea that we should have something like an 'energy minimization' architecture; as I understand it, this would have a notion of the 'energy' of an entire response, and training would try to minimize that.

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether there's any engineering work being done around it. I can't find much after the release of I-JEPA from his group.
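For what it's worth, the toy picture in my head is something like the sketch below: sample a few candidate responses, score each whole response with an energy function, keep the lowest-energy one. The dummy_energy here is a made-up placeholder just so the snippet runs; I'm not claiming this is how LeCun's actual proposal or JEPA works.

    from typing import Callable, List

    def pick_min_energy(prompt: str,
                        candidates: List[str],
                        energy: Callable[[str, str], float]) -> str:
        # 'energy' stands in for a learned compatibility score over the
        # (prompt, whole response) pair; lower is supposed to mean better.
        return min(candidates, key=lambda resp: energy(prompt, resp))

    def dummy_energy(prompt: str, response: str) -> float:
        # Placeholder: penalize responses whose length differs a lot from
        # the prompt's. A real system would learn this function instead.
        return abs(len(response) - len(prompt))

    print(pick_min_energy("What is 2 + 2?",
                          ["4", "Four, I think.", "It is 4."],
                          dummy_energy))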

1. killthebuddha ◴[] No.43365456[source]
I've always felt like the argument is super flimsy because "of course we can _in theory_ do error correction". I've never seen even a semi-rigorous argument that error correction is _theoretically_ impossible. Do you have a link to somewhere where such an argument is made?
replies(3): >>43366044 #>>43367051 #>>43370111 #
2. aithrowawaycomm ◴[] No.43366044[source]
In theory, transformers are Turing-complete and LLMs can do anything computable. The more down-to-earth argument is that transformer LLMs aren't able to correct errors in the systematic way LeCun is describing: it's task-specific "whack-a-mole," involving either tailored synthetic data or expensive RLHF.

In particular, if you train an LLM to do Task A and Task B with acceptable accuracy, that does not guarantee it can combine the tasks in a common-sense way. "For each step of A, do B on the intermediate results" is a whole new Task C that likely needs its own fine-tuning. (This one actually does have some theoretical evidence from computational complexity, and it was the first thing I noticed in 2023 when testing chain-of-thought prompting. It's not that the LLM can't do Task C; it just takes extra training.)
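If you want to see the gap yourself, a tiny eval harness like the one below makes the point: accuracy on A and B separately tells you nothing about accuracy on the composed task. ask_model is just a stub for whatever model call you're testing, and the tasks are made-up examples.

    def ask_model(prompt: str) -> str:
        # Stub: swap in a real LLM call here.
        return ""

    def accuracy(examples) -> float:
        hits = sum(ask_model(p).strip() == answer for p, answer in examples)
        return hits / len(examples)

    task_a = [("Sort the list 3,1,2 in ascending order.", "1,2,3")]
    task_b = [("Double every number in the list 1,2,3.", "2,4,6")]
    task_c = [("Sort 3,1,2 ascending, then double every number.", "2,4,6")]

    for name, examples in (("Task A", task_a), ("Task B", task_b),
                           ("Task C (A then B)", task_c)):
        print(name, accuracy(examples))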

3. vhantz ◴[] No.43367051[source]
> of course we can _in theory_ do error correction

Oh yeah? This is begging the question.

4. tyronehed ◴[] No.43370111[source]
As soon as you need to start leaning heavily on error correction, that is an indication that your architecture and solution are not correct. The final solution needs to be elegant and very close to perfect from the start.

You must always keep close to the only known example we have of an intelligence, which is the human brain. As soon as you start to wander away from the way the human brain does it, you are on your own and no longer relying on a known example of intelligence. Certainly that might be possible, but since there's only one known example of intelligence in this universe, it seems ridiculous to do anything but stick close to that example, the human brain.