
385 points vessenes | 1 comments | | HN request time: 0s | source

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, the token-choice method at each step leads to runaway errors -- these can't be damped mathematically.
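A back-of-the-envelope sketch of that compounding-error argument: if each sampled token independently has some probability of going wrong, the chance that a whole response stays on track decays exponentially with length. The per-token error rate here is an assumed, illustrative number, not anything LeCun has quoted:

```python
# Illustrative only: assume each autoregressive token has an independent
# probability e of being "wrong" (a strong simplification of the argument).
e = 0.01  # assumed per-token error rate

for n in (10, 100, 1000):
    p_correct = (1 - e) ** n  # probability the whole n-token response has no error
    print(f"{n:5d} tokens -> P(no error) = {p_correct:.4f}")
```

The point of the sketch is just the shape of the curve: even a small per-token error rate drives long-response reliability toward zero, which is the mathematical core of the "runaway errors" claim.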

Instead, he offers the idea that we should have something like an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize that.

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think of LeCun's take, and whether there's any engineering being done around it. I can't find much after the release of I-JEPA from his group.

jurschreuder ◴[] No.43366246[source]
This concept comes from Hopfield networks.

If two nodes are on, but the connection between them is negative, this causes energy to be higher.

If one of those nodes switches off, energy is reduced.

With two nodes this is trivial. With 10 nodes it's more difficult to solve, and with billions of nodes it is impossible to "solve".

All you can do then is try to get the energy as low as possible.

This way, neural networks can also infer "new" information -- states they were never explicitly trained on, but that are consistent with the constraints they have learned about the world so far.

1. vessenes ◴[] No.43367355[source]
So, what's modeled as a "node" in an EBM, and what's modeled as a connection? Are they vectors in a tensor (well, I suppose that's almost certainly a yes)? Do they run side by side with a model that's being trained? Is the node-connectivity architecture fixed, or learned?