
385 points | by vessenes | 1 comment

So, LeCun has been quite public in saying that he believes LLMs will never stop hallucinating because, essentially, the per-token sampling at each step leads to runaway errors -- these can't be damped mathematically.
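A toy illustration of the compounding-error intuition (my sketch, not LeCun's actual formulation): if each sampled token is independently "wrong" with some small probability e, the chance that an n-token answer contains no error decays exponentially in n.

```python
# Toy model of runaway error in autoregressive generation:
# each token is wrong with independent probability e, so the
# probability of a fully correct n-token answer is (1 - e)^n.
def p_no_error(e: float, n: int) -> float:
    return (1.0 - e) ** n

# Even a 1% per-token error rate makes long answers unreliable.
for n in (10, 100, 1000):
    print(n, p_no_error(0.01, n))
```

Real token errors are neither independent nor equally severe, so this is only the shape of the argument, not a quantitative claim.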

Instead, he proposes an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize that.
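A minimal sketch of that idea as I read it (the `energy` function here is a hypothetical stand-in, not a trained model): rather than committing to one token at a time, score each complete candidate response with an energy function and prefer the lowest-energy one.

```python
# Energy-based selection sketch: lower energy = more compatible
# prompt/response pair. A real system would learn this function;
# here we fake it with word overlap purely for illustration.
def energy(prompt: str, response: str) -> float:
    overlap = len(set(prompt.split()) & set(response.split()))
    return -float(overlap)

def pick(prompt: str, candidates: list[str]) -> str:
    # Choose the whole response that minimizes energy.
    return min(candidates, key=lambda r: energy(prompt, r))

best = pick("the sky is", ["the sky is blue", "bananas are yellow"])
```

The point of the framing is that the objective is defined over the entire response, so per-step sampling errors can't compound in the same way.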

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether there's any engineering being done around it. I can't find much after the release of I-JEPA from his group.

inimino No.43367126
I have a paper coming up that I modestly hope will clarify some of this.

The short answer should be that it's obvious LLM training and inference are both ridiculously inefficient and biologically implausible, and therefore there have to be some big optimization wins still on the table.

Etheryte No.43367860
How does this idea compare to the rationale presented by Rich Sutton in The Bitter Lesson [0]? Put briefly, why do you think biological plausibility matters?

[0] http://www.incompleteideas.net/IncIdeas/BitterLesson.html

inimino No.43376139
I'll have to refer you to my forthcoming paper for the full argument, but basically, humans (and all animals) experience surprise and then we attribute that surprise to a cause, and then we update (learn).

In ANNs we backpropagate uniformly, so the error correction is distributed over the whole network rather than attributed to a specific cause. This is why LLM training is inefficient.
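To make the contrast concrete (my own illustration, not the commenter's forthcoming method): a vanilla SGD step nudges every weight on every example, whereas a hypothetical "surprise-gated" step would only touch the weights most implicated in the error.

```python
# Uniform update: every weight moves in proportion to its gradient.
def sgd_step(w, grads, lr=0.1):
    return [wi - lr * g for wi, g in zip(w, grads)]

# Gated update (illustrative): only weights with a large gradient
# ("surprising" error signal) are changed; the rest are left alone.
def gated_step(w, grads, lr=0.1, threshold=0.5):
    return [wi - lr * g if abs(g) > threshold else wi
            for wi, g in zip(w, grads)]

w = [1.0, 1.0, 1.0]
grads = [0.9, 0.1, 0.0]
```

Whether sparse, attributed updates like this actually close the efficiency gap is exactly the open question the commenter is raising.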