As a result, you'll never get 100% consistent outputs or behavior (the way you can, at least in principle, with a traditional algorithm or hand-written business logic). And that has borne out in my usage across every model I've worked with.
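To make the contrast concrete, here's a minimal sketch (the function names and the toy probability distribution are illustrative, not any real model's API): deterministic business logic returns the same answer every time, while LLM-style generation samples each next token from a probability distribution, so two identical prompts can diverge at any step whenever the temperature is above zero.

```python
import random

def traditional_logic(order_total: float) -> float:
    # Classic business logic: the same input always yields the same output.
    return order_total * 0.9 if order_total > 100 else order_total

def llm_style_next_token(probabilities: dict[str, float], temperature: float = 0.8) -> str:
    # LLM-style decoding: sample the next token from a distribution.
    # Higher temperature flattens the weights and increases variability.
    weights = [p ** (1.0 / temperature) for p in probabilities.values()]
    return random.choices(list(probabilities.keys()), weights=weights)[0]

probs = {"refactor": 0.5, "rewrite": 0.3, "delete": 0.2}
print(traditional_logic(120.0))                           # always 108.0
print([llm_style_next_token(probs) for _ in range(5)])    # varies run to run
```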
There's also an upper-bound problem with context: every LLM eventually hits a point where it "loses focus" and develops a sort of LLM ADD. That's when hallucinations and random, unrequested changes start creeping in, and a previously productive chat spirals to the point where you have to start over.