
385 points by vessenes | 1 comment

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, choosing one token at a time leads to runaway errors -- mistakes compound at each step and can't be damped mathematically.
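To make the compounding-error intuition concrete, here's a toy back-of-the-envelope sketch (my numbers, not LeCun's): if each sampled token independently goes wrong with probability eps, the chance that an n-token response stays error-free is (1 - eps)^n, which decays exponentially with length.

```python
# Toy illustration of the compounding-error argument (not LeCun's math):
# with an independent per-token error rate eps, the probability that an
# n-token response contains no error decays exponentially in n.

def p_error_free(eps: float, n: int) -> float:
    """Probability that all n token choices are 'correct'."""
    return (1 - eps) ** n

for n in (10, 100, 1000):
    print(f"eps=0.01, n={n:5d}: P(no error) = {p_error_free(0.01, n):.3f}")
# eps=0.01, n=   10: P(no error) = 0.904
# eps=0.01, n=  100: P(no error) = 0.366
# eps=0.01, n= 1000: P(no error) = 0.000
```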

Instead, he proposes an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize that energy.
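For what it's worth, here is a minimal sketch of how I picture the energy-based idea -- emphatically not LeCun's actual architecture, just a generic contrastive energy model where a network scores a whole (prompt, response) pair and training pushes compatible pairs toward low energy and corrupted pairs toward high energy:

```python
# A minimal sketch of the energy-based idea (my reading, not LeCun's JEPA):
# an energy network scores a whole (prompt, response) pair; a simple
# contrastive margin loss lowers energy on good pairs and raises it on bad ones.

import torch
import torch.nn as nn

class EnergyModel(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(),
            nn.Linear(128, 1),  # one scalar energy for the whole pair
        )

    def forward(self, prompt_emb: torch.Tensor, response_emb: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([prompt_emb, response_emb], dim=-1)).squeeze(-1)

model = EnergyModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in embeddings: in a real system these would come from an encoder.
prompt = torch.randn(32, 64)
good = prompt + 0.1 * torch.randn(32, 64)  # responses compatible with the prompt
bad = torch.randn(32, 64)                  # incompatible responses

for step in range(100):
    e_good = model(prompt, good)
    e_bad = model(prompt, bad)
    # Hinge loss: good pairs should sit at least a margin below bad pairs.
    loss = torch.relu(1.0 + e_good - e_bad).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```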

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think of LeCun's take, and whether there's any engineering being done around it. I can't find much after the release of I-JEPA from his group.

tyronehed ◴[] No.43365788[source]
Any transformer-based LLM will never achieve AGI because it's only trying to pick the next word. You need planning on a much larger scale to achieve AGI. Also, the characteristics of LLMs do not resemble any existing intelligence that we know of. Does a baby require two years of statistical analysis to become useful? No. Transformer architectures are parlor tricks. They are a glorified Google, but they're not reasoning or planning. If you want that, then you have to base your architecture on the known examples of intelligence that we are aware of in the universe. And that's not a transformer. In fact, whatever AGI emerges will absolutely not contain a transformer.
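To be concrete about what "picking the next word" means, here is a toy greedy decoding loop (made-up vocabulary and scores, no real model): at every step the decoder commits to one token and never revisits the choice.

```python
# What "picking the next word" means mechanically: a toy greedy decoder
# over a made-up vocabulary and scoring function (no real model here).

import numpy as np

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def next_token_logits(context: list[str]) -> np.ndarray:
    """Stand-in for a transformer forward pass: one score per vocab token."""
    rng = np.random.default_rng(len(context))  # deterministic toy scores
    return rng.normal(size=len(VOCAB))

tokens: list[str] = ["the"]
while tokens[-1] != "<eos>" and len(tokens) < 10:
    logits = next_token_logits(tokens)
    tokens.append(VOCAB[int(np.argmax(logits))])  # greedy: argmax each step
print(" ".join(tokens))
```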
replies(3): >>43366660 #>>43366893 #>>43366959 #
1. flawn ◴[] No.43366660[source]
It's not just about picking the next word here; that alone doesn't refute whether Transformers can achieve AGI. Words are just one representation of information. And whether it resembles any intelligence we know of is not an argument either, because there is no reason to believe that all intelligence must look like anything we've seen (e.g. us, or other animals). The underlying architecture of Attention & MLPs can surely still realize something we could call an AGI, and on certain tasks it can arguably be considered one already. I also don't know for certain whether we will hit any roadblocks or architectural asymptotes, but I haven't come across any well-founded argument that Transformers definitely could not reach AGI.
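For reference, the Attention & MLPs building block I mean is roughly this -- a minimal PyTorch sketch (pre-norm variant, arbitrary sizes, not any particular model):

```python
# Minimal sketch of a transformer block: self-attention plus a
# position-wise MLP, each with a residual connection (pre-norm variant).

import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # self-attention + residual
        x = x + self.mlp(self.norm2(x))                    # position-wise MLP + residual
        return x

x = torch.randn(2, 16, 64)           # (batch, sequence, embedding)
print(TransformerBlock()(x).shape)   # torch.Size([2, 16, 64])
```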