385 points vessenes | 3 comments

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, the token-choice method at each step leads to runaway errors -- these can't be damped mathematically.
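A back-of-the-envelope sketch of that compounding argument (my numbers, not LeCun's math): if each sampled token independently goes wrong with probability e, the chance an n-token response contains no error is (1 - e)^n, which decays exponentially with length.

    # Toy illustration of error compounding under token-by-token sampling.
    # Assumes independent per-token error probability e -- a simplification.
    for e in (0.001, 0.01):
        for n in (100, 1_000, 10_000):
            p_clean = (1 - e) ** n  # probability the whole response is error-free
            print(f"e={e:<6} n={n:<6} P(no error) = {p_clean:.2e}")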

In its place, he offers the idea that we should have an 'energy minimization' architecture; as I understand it, this would have a concept of the 'energy' of an entire response, and training would try to minimize that.
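For what it's worth, here is a toy sketch of that energy-based reading (my own construction, not anything from LeCun's papers): learn a scalar energy over whole (prompt, response) pairs, then pick the candidate response that minimizes it, instead of committing token by token.

    import torch
    import torch.nn as nn

    # Toy energy function over whole (prompt, response) embedding pairs;
    # lower energy = more compatible. Purely illustrative.
    class Energy(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1)
            )

        def forward(self, prompt_emb, response_emb):
            pair = torch.cat([prompt_emb, response_emb], dim=-1)
            return self.net(pair).squeeze(-1)  # one scalar energy per pair

    energy = Energy()
    prompt = torch.randn(1, 64)                 # one prompt embedding
    candidates = torch.randn(8, 64)             # 8 whole-response embeddings
    scores = energy(prompt.expand(8, -1), candidates)
    best = candidates[scores.argmin()]          # minimum-energy response wins

Training would then push energy down on good pairs and up on bad ones (the usual contrastive EBM recipe), so the score reflects the whole response rather than one step at a time.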

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether there's any engineering being done around it. I can't find much after the release of I-JEPA from his group.

eximius ◴[] No.43367519[source]
I believe that so long as weights are fixed at inference time, we'll be at a dead end.

Will Titans be sufficiently "neuroplastic" to escape that? Maybe, I'm not sure.

Ultimately, I think what will be required is an architecture built around "looping", where the model's outputs are both some form of "self update" and "optional actionality", so that interacting with the model is more like "sampling from a thought space".
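A minimal sketch of how I read that loop (the interface and names are my invention): the model maps a state to (new_state, optional action), and "thinking" is just iterating the self-update.

    import random

    # Hypothetical interface: model(state) -> (new_state, action_or_None).
    # "Thinking" iterates the self-update; an action is emitted only when
    # the model decides to act.
    def dummy_model(state):
        new_state = state + 1                          # the "self update"
        action = f"act@{state}" if random.random() < 0.2 else None
        return new_state, action                       # "optional actionality"

    def think(model, state, max_steps=32):
        for _ in range(max_steps):
            state, action = model(state)
            if action is not None:
                yield action                           # sample from the "thought space"

    print(list(think(dummy_model, 0)))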

replies(3): >>43367644 #>>43370757 #>>43372112 #
1. randomNumber7 ◴[] No.43370757[source]
Why? Even animals sleep. And if you learn an instrument, for example, you will notice that a lot of the muscle-memory learning happens during sleep.
replies(1): >>43375590 #
2. eximius ◴[] No.43375590[source]
I guess you're saying that non-inference-time training can be that "sleep period"?
replies(1): >>43377819 #
3. randomNumber7 ◴[] No.43377819[source]
Yes, I could imagine something like a humanoid robot where the "short term memory" is just a context big enough to hold all of the day's input. Then, during "sleep", that information is processed by training.
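A minimal sketch of that day/night split, with hypothetical generate/finetune methods standing in for whatever the real model would expose:

    # Day: everything observed lands in a context buffer (short-term memory).
    # Night: the buffer is consolidated into the weights, then cleared.
    # `model.generate` and `model.finetune` are hypothetical stand-ins.
    class Agent:
        def __init__(self, model):
            self.model = model
            self.short_term = []                 # the day's raw inputs

        def observe(self, event):
            self.short_term.append(event)

        def act(self, query):
            # Inference conditions on fixed weights plus today's buffer.
            return self.model.generate(query, context=self.short_term)

        def sleep(self):
            # "Sleep": gradient updates distill the day into long-term weights.
            self.model.finetune(self.short_term)
            self.short_term.clear()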

But I also think that current LLM tech does not lead to AGI. You can't train something on pattern matching and have it magically become intelligent (although I could be wrong).

Imo an AGI would need to be able to interact with the environment and learn to reflect on its interactions and its abilities within it. I suspect we have the hardware to build something as intelligent as a cat or a dog, but not the algorithms.