
385 points | vessenes | 1 comment

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, picking one token at a time lets errors compound step by step -- and these can't be damped mathematically.
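For intuition, the compounding-error argument is usually sketched with a simple independence assumption: if each sampled token goes wrong with probability e, a whole n-token answer stays on track with probability (1 - e)^n, which decays exponentially. A toy calculation (the numbers are mine and purely illustrative, not LeCun's):

    # If every token is independently wrong with probability e,
    # the chance an n-token response stays fully on track is (1 - e)^n.
    e = 0.01  # assumed per-token error rate, illustrative only
    for n in (10, 100, 1000):
        print(n, (1 - e) ** n)
    # 10 -> ~0.904, 100 -> ~0.366, 1000 -> ~0.00004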

Instead, he offers the idea that we should have an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize it.
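Here's a minimal sketch of how I picture that, assuming an EBM here just means a learned scalar score over (prompt, full response) pairs -- the energy function below is a made-up stand-in, not anything from LeCun's papers:

    # Toy energy-minimization decoding: score WHOLE candidate responses
    # and keep the argmin, instead of sampling one token at a time.
    def energy(prompt: str, response: str) -> float:
        # Stand-in heuristic (a real EBM would be a trained network):
        # lower energy for responses that share words with the prompt.
        overlap = len(set(prompt.split()) & set(response.split()))
        return len(response) / 100.0 - overlap

    def pick(prompt, candidates):
        return min(candidates, key=lambda r: energy(prompt, r))

    print(pick("why is the sky blue",
               ["the sky scatters blue light more", "magic"]))
    # -> "the sky scatters blue light more"

The point is just the shape of it: one score for the entire response, so errors get judged globally rather than token by token.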

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether there's any engineering being done around it. I can't find much after his group's release of I-JEPA.

1. estebarb | No.43365718
I have no idea about EBMs, but I have researched the language-modelling side a bit. And let's be honest, GPT is not the best learner we can point to right now -- we ourselves are. GPT needs far more data and energy than a human, so clearly there is a better architecture somewhere waiting to be discovered.

Attention works, yes. But it is not biologically plausible at all. We don't do quadratic comparisons across a whole book, and we don't need to see thousands of samples to understand something.
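To make the quadratic part concrete, here's vanilla scaled dot-product attention in numpy -- standard shapes, random filler data, nothing model-specific:

    import numpy as np

    n, d = 1024, 64                    # sequence length, head dimension
    Q = np.random.randn(n, d)
    K = np.random.randn(n, d)
    V = np.random.randn(n, d)

    scores = Q @ K.T / np.sqrt(d)      # (n, n): every token vs. every token
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True) # row-wise softmax
    out = w @ V                        # (n, d)
    print(scores.shape)                # (1024, 1024)

That (n, n) matrix is the problem: double the context and you quadruple the comparisons, which is nothing like how we read.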

Personally, I think that in the long term, recurrent architectures and test-time training have a better chance than today's full attention.
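By test-time training I mean the general idea of adapting weights on the test input itself with a self-supervised loss before predicting. A cartoon version (my sketch, not any specific paper's method): fit a tiny next-value predictor to one test sequence with a few SGD steps at inference time.

    import numpy as np

    rng = np.random.default_rng(0)
    # one "test" sequence: a noisy sine wave
    seq = np.sin(np.linspace(0, 6, 200)) + 0.05 * rng.standard_normal(200)

    # self-supervised objective on the test input: predict the next
    # value from the previous 8, via a few SGD steps at inference time
    X = np.lib.stride_tricks.sliding_window_view(seq[:-1], 8)
    y = seq[8:]
    w = np.zeros(8)
    for _ in range(200):               # the "training" in test-time training
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= 0.1 * grad

    pred = X[-1] @ w                   # forecast tuned to THIS sequence
    print(round(pred, 3), round(seq[-1], 3))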

Also, I think that OpenAI's biggest contribution is demonstrating that reasoning-like behaviors can emerge from really good language modelling.