
385 points by vessenes | 2 comments

So, LeCun has been quite public in saying that he believes LLMs will never fix hallucinations because, essentially, choosing one token at a time leads to errors that compound at each step -- and these can't be damped mathematically.
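A toy way to see that compounding-error argument (my own illustration, assuming an independent per-token error rate e, which is a simplification of the real setting): if each token is wrong with probability e, the chance the whole sequence stays on track is (1 - e)^n, which decays exponentially with length.

    # Toy sketch of the compounding-error claim, assuming an independent
    # per-token error rate e (a simplification of the real setting).
    def p_on_track(e: float, n: int) -> float:
        return (1.0 - e) ** n

    print(p_on_track(0.01, 100))    # ~0.37
    print(p_on_track(0.01, 1000))   # ~0.00004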

Instead, he proposes something like an 'energy minimization' architecture; as I understand it, this would assign an 'energy' to an entire response, and training would try to minimize that.
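Very roughly, and purely as my own sketch of the idea, that would replace token-by-token sampling at inference time with scoring whole candidate responses; energy_model below is a hypothetical network that scores a full (prompt, response) pair, lower meaning more compatible.

    # Hypothetical sketch: score entire candidate responses with a learned
    # energy function and keep the lowest-energy one, instead of committing
    # to one token at a time.
    def pick_response(prompt: str, candidates: list[str], energy_model) -> str:
        return min(candidates, key=lambda r: energy_model(prompt, r))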

Which is to say, I don't fully understand this. That said, I'm curious to hear what ML researchers think about LeCun's take, and whether there's any engineering being done around it. I can't find much after the release of I-JEPA from his group.

bravura | No.43368085
Okay I think I qualify. I'll bite.

LeCun's argument is this:

1) You can't learn an accurate world model just from text.

2) Multimodal learning (vision, language, etc) and interaction with the environment is crucial for true learning.

He and people like Hinton and Bengio have been saying for a while that there are tasks mice can handle that AI can't, that even reaching mouse-level intelligence would be a breakthrough, and that we cannot get there through language learning alone.

A simple example from "How Large Are Lions? Inducing Distributions over Quantitative Attributes" (https://arxiv.org/abs/1906.01327) is this: Learning the size of objects using pure text analysis requires significant gymnastics, while vision demonstrates physical size more easily. To determine the size of a lion you'll need to read thousands of sentences about lions, or you could look at two or three pictures.

LeCun isn't saying that LLMs aren't useful. He's just concerned with bigger problems, like AGI, which he believes cannot be solved purely through linguistic analysis.

The energy minimization architecture is more about joint multimodal learning.

(Energy minimization is a very old idea. LeCun has been on about it for a while, and it's less controversial these days. Back when everyone tried to give neural models a probabilistic interpretation, computing the normalization term / partition function was expensive. Energy minimization basically said: set up a sensible loss and minimize it.)
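A minimal sketch of that contrast, in my notation rather than anything from LeCun's code: the probabilistic view needs p(y|x) = exp(-E(x, y)) / Z(x), where Z(x) sums exp(-E(x, y')) over every possible y' and is intractable for large output spaces; an energy/margin loss sidesteps Z entirely.

    import torch.nn.functional as F

    # E is any network mapping (x, y) to a scalar energy; assumed defined elsewhere.
    # Instead of normalizing over all possible y (the partition function),
    # push the energy of observed pairs below that of mismatched pairs by a margin.
    def margin_loss(E, x, y_pos, y_neg, margin=1.0):
        return F.relu(margin + E(x, y_pos) - E(x, y_neg)).mean()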

throw310822 | No.43368801
I don't get it.

1) Yes, it's true that learning from text alone is very hard. But LLMs are multimodal now.

2) That "size of a lion" paper is from 2019, which is a geological era ago. The SOTA then was GPT-2, which was barely able to spit out coherent text.

3) Have you tried asking a mouse to play chess, reason its way through a physics problem, or write some code? I'm really curious on which benchmarks mice are surpassing ChatGPT/Grok/Claude, etc.

YeGoblynQueenne | No.43377806
Oh, mice can solve a plethora of physics problems before it's time for breakfast. They have to navigate the, well, physical world, after all.

I'm also really curious what benchmarks LLMs have passed that include surviving without being eaten by a cat, or a gull, or an owl, while looking for food to survive and feed its young in an arbitrary environment chosen at random from urban, rural, natural, etc. What's ChatGPT's score on that kind of benchmark?

throw310822 | No.43378896
> mice can solve a plethora of physics problems before it's time for breakfast

Ah, really? Which ones? And nope, physical agility is not "solving a physics problem"; otherwise soccer players and figure skaters would all have PhDs, which doesn't seem to be the case.

I mean, an automated system that solves equations to keep balance is not particularly "intelligent". We usually call intelligence the ability to solve generic problems, not the ability of a very specialized system to solve the same problem again and again.

YeGoblynQueenne | No.43379409
>> Ah, really? Which ones? And nope, physical agility is not "solving a physics problem"; otherwise soccer players and figure skaters would all have PhDs, which doesn't seem to be the case.

Yes, everything that has to do with navigating physical reality, including, but not restricted to, physical agility. Those are physics problems that animals, including humans, know how to solve, and very often we have no idea how to program a computer to solve them.

And you're saying that solving physics problems means you have a PhD? So, for example, Archimedes did not solve any physics problems, otherwise he'd have a PhD?

throw310822 | No.43380411
> Those are physics problems that animals, including humans, know how to solve

No, those are problems that animals and humans solve, not problems they know how to solve. I'm not the greatest biochemistry expert who ever lived just because of what goes on in my cells.

Now, I understand perfectly well the argument that "even small animals do things that our machines cannot do". That's been indisputably true for a long time. Today, though, it seems to be more a matter of embodiment and speed of processing than of a level of intelligence out of our reach. We already have machines that understand natural language perfectly well and display higher cognitive abilities than any other animal: abstract reasoning, creating and understanding metaphors, following detailed instructions, writing fiction, etc.