Why do they have to use the word “hallucination” when the model simply makes a mistake? If you told your teacher or boss that you didn’t get the answer wrong, you hallucinated it, they’d send you to the hospital.
In its place, he offers the idea of an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize that.
Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether any engineering has been done around it. I can't find much after the release of I-JEPA from his group.
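For what it's worth, my rough mental picture of the generic energy-based idea is something like the sketch below. This is just an illustration with made-up embeddings and a toy contrastive margin loss, not LeCun's actual architecture: a network assigns a scalar 'energy' to a whole (context, response) pair, and training pushes energy down for compatible pairs and up for mismatched ones.

```python
# Toy energy-based-model sketch (illustrative only, not LeCun's proposal).
import torch
import torch.nn as nn

class EnergyModel(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # Maps a concatenated (context, response) embedding to one scalar energy.
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, context, response):
        return self.net(torch.cat([context, response], dim=-1)).squeeze(-1)

model = EnergyModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):
    context = torch.randn(32, 64)                 # stand-in context embeddings
    good = context + 0.1 * torch.randn(32, 64)    # "compatible" responses
    bad = torch.randn(32, 64)                     # mismatched responses

    e_good = model(context, good)
    e_bad = model(context, bad)

    # Margin loss: compatible pairs should end up with lower energy than mismatched ones.
    loss = torch.clamp(1.0 + e_good - e_bad, min=0).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()
```

If that's roughly the right picture, generating a response would then be a separate search/optimization over candidate responses to find a low-energy one, rather than sampling token by token. But again, I may be misreading what he's proposing.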