Yann LeCun to depart Meta and launch AI startup focused on 'world models'

He also said other things about LLMs that turned out to be either wrong or easily bypassed with some glue. While I understand where he comes from, and that his stance is pure research-y theory driven, at the end of the day his positions were wrong.

Previously, he very publicly and strongly said:

a) LLMs can't do math. They trick us in poetry but that's subjective. They can't do objective math.

b) they can't plan

c) by the very nature of autoregressive arch, errors compound. So the longer you go in your generation, the higher the error rate. so at long contexts the answers become utter garbage.

All of these were proven wrong, 1-2 years later. "a" at the core (gold at IMO), "b" w/ software glue and "c" with better training regimes.
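For reference, the compounding-error argument in (c) can be written as a toy calculation. This is only a sketch under strong assumptions that are not in the comment above: every token is wrong independently and with the same probability `e`, and a single wrong token ruins the answer. The function name `p_all_correct` is made up for illustration.

```python
# Toy model of the compounding-error argument.
# Assumptions (not from the thread): each token is wrong independently with
# the same probability e, and one wrong token makes the whole answer wrong.

def p_all_correct(e: float, n: int) -> float:
    """Probability that n generated tokens in a row are all correct."""
    return (1.0 - e) ** n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # With e = 0.01 the probability shrinks geometrically as n grows.
        print(n, p_all_correct(0.01, n))
```

Under these assumptions the decay is geometric in `n`. A model that can notice and repair its own mistakes breaks the independence assumption, which is roughly where the disagreement in the replies sits.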

I'm not interested in the will-it-won't-it debates about AGI. I'm happy with what we have now, and I think these things are good enough for several use cases. But it's important to note when people making strong claims get them wrong. Again, I think I get where he's coming from, but public stances aren't the place to get into deep research minutiae.

That being said, I hope he gets to find whatever it is that he's looking for, and wish him success in his endeavours. Between him, Fei Fei Li and Ilya, something cool has to come out of the small shops. Heck, I'm even rooting for the "let's commoditise lora training" that Mira's startup seems to go for.

replies(3): >>45897933 #>>45898169 #>>45905642 #

5. consumer451 ◴[12 Nov 25 08:27 UTC] No.45897687{3}[source]▶

>>45897658 #

Who’s gonna pay for that inference?

It’s going to take money, what if your AGI has some tax policy ideas that are different from the inference owners?

Why would they let that AGI out into the wild?

Let’s say you create AGI. How long will it take for society to recover? How long will it take for people of a certain tax ideology to finally say oh OK, UBI maybe?

The last part is my main question. How long do you think it would take our civilization to recover from the introduction of AGI?

Edit: sama gets a lot of shit, but I have to admit at least he used to work on the UBI problem, orb and all. However, those days seem very long gone from the outside, at least.

replies(3): >>45898336 #>>45900951 #>>45905114 #

6. lm28469 ◴[12 Nov 25 08:46 UTC] No.45897826{3}[source]▶

>>45897658 #

How old are you?

That's what they've been selling us for the past 50 years and nothing has changed, all the productivity gain was pocketed by the elite

replies(1): >>45900093 #

7. qsort ◴[12 Nov 25 08:52 UTC] No.45897871{3}[source]▶

>>45897658 #

>> non-$$ logic [...] aside from misanthropy

> I hope AGI can be used to automate work

You people need a PR guy, I'm serious. OpenAI is the first company I've ever seen that comes across as actively trying to be misanthropic in its messaging. I'm probably too old-fashioned, but this honestly sounds like Marlboro launching the slogan "lung cancer for the weak of mind".

replies(1): >>45899722 #

8. eloisant ◴[12 Nov 25 08:58 UTC] No.45897906[source]▶

>>45897613 #

That's the old dream of creating life, becoming God. Like the Golem, Frankenstein...

9. ilaksh ◴[12 Nov 25 09:01 UTC] No.45897933[source]▶

>>45897683 #

That's true but I also think despite being wrong about the capabilities of LLMs, LeCun has been right in that variations of LLMs are not an appropriate target for long term research that aims to significantly advance AI. Especially at the level of Meta.

I think transformers have been proven to be general purpose, but that doesn't mean that we can't use new fundamental approaches.

To me it's obvious that researchers are acting like sheep as they always do. He's trying to come up with a real innovation.

LeCun has seen how new paradigms have taken over. Variations of LLMs are not the type of new paradigm that serious researchers should be aiming for.

I wonder if there can be a unification of spatial-temporal representations and language. I am guessing diffusion video generators already achieve this in some way. But I wonder if new techniques can improve the efficiency and capabilities.

I assume the Nested Learning stuff is pretty relevant.

Although I've never totally grokked transformers and LLMs, I always felt that MoE was the right direction and besides having a strong mapping or unified view of spatial and language info, there also should somehow be the capability of representing information in a non-sequential way. We really use sequences because we can only speak or hear one sound at a time. Information in general isn't particularly sequential, so I doubt that's an ideal representation.

So I guess I am kind of proposing variations of transformers myself, to be honest.

But besides being able to convert between sequential discrete representations and less discrete non-sequential representations (maybe you have tokens but every token has a scalar attached), there should be lots of tokenizations, maybe for each expert. Then you have experts that specialize in combining and translating between different scalar-token tokenizations.

Like automatically clustering problems or world model artifacts or something and automatically encoding DSLs for each sub problem.

I wish I really understood machine learning.

10. tonii141 ◴[12 Nov 25 09:35 UTC] No.45898169[source]▶

>>45897683 #

a) Still true: vanilla LLMs can’t do math, they pattern-match unless you bolt on tools.

b) Still true: next-token prediction isn’t planning.

c) Still true: error accumulation is mitigated, not eliminated. Long-context quality still relies on retrieval, checks, and verifiers.

Yann’s claims were about LLMs as LLMs. With tooling, you can work around limits, but the core point stands.

replies(2): >>45898248 #>>45898683 #

11. NitpickLawyer ◴[12 Nov 25 09:49 UTC] No.45898248{3}[source]▶

>>45898169 #

a) no, gemini 2.5 was shown to "win" gold w/o tools. - https://arxiv.org/html/2507.15855v1

b) reductionism isn't worth our time. Planning works in the real world, today. (try any agentic tool like cc/codex/whatever). And if you're set on the purist view, there's mounting evidence from Anthropic that there is planning in the core of an LLM.

c) so ... not true? Long context works today.

This is simply moving goalposts and nothing more. X can't do Y -> well, here they are doing Y -> well, not like that.

replies(1): >>45898433 #

12. ACCount37 ◴[12 Nov 25 10:01 UTC] No.45898328[source]▶

>>45897613 #

Have you ever seen that "science advocate vs scientist" comic?

https://www.smbc-comics.com/?id=2088

It's true. When it comes to the people doing bleeding edge research and development, the answer often is "BECAUSE IT'S FUCKING AWESOME". Regardless of what they tell the corporate higher-ups or put on the grant application statements.

Sure, a lot of people believe that AGI is going to make the world a better place. But "mad scientist" is a stereotype for a reason. You look into their eyes and you see the flame of madness flickering behind them.

13. Arkhaine_kupo ◴[12 Nov 25 10:02 UTC] No.45898336{4}[source]▶

>>45897687 #

I am not someone working on AGI but I think a lot of people work backwards from the expected outcome.

Expected outcome is usually something like a Post-Scarcity society, this is a society where basic needs are all covered.

If we could all live in a future with a free house and a robot that does our chores and food is never scarce, we should work towards that, they believe.

The intermediate steps aren't thought out, in the same way that, for example, the Communist Manifesto does little to explain the transition from capitalism to communism. It simply says there will be a need for things like forcing the bourgeoisie to join the common workers and that there will be a transition phase, but it gives no clear steps between the two systems.

Similarly, many AGI proponents think in terms of "wouldn't it be cool if there was an AI that did all the bits of life we don't like doing", without the systemic analysis that many people do those bits because they need money to eat, for example.

14. TheAceOfHearts ◴[12 Nov 25 10:05 UTC] No.45898355[source]▶

>>45897613 #

I'm a true believer in AGI being able to become a force for immense good if deployed carefully by responsible parties.

Currently one of the key issues with a lot of fields is that they operate as independent / largely isolated silos. If you could build a true AGI capable of achieving top-level mastery across multiple disciplines it would likely be able to integrate all that knowledge and make a lot of significant discoveries that would improve people's lives. Just exploring existing problem spaces with the full intellectual toolkit that humanity has developed is probably enough to make significant progress.

Our understanding of biology is still painfully primitive. To give a concrete example, I dream that someday it'll be possible to develop medical interventions that allow humans to regrow missing limbs and fix almost any health issue.

Have you ever lived with depression or any other psychiatric problem? I think if we could create medical interventions and environments that are conducive to healing psychiatric problems, that would also be a massive quality of life improvement for huge numbers of people. Do you know how our current psychiatric interventions work? You try some drug, flip a coin to see if it does anything and wait 4 weeks to get the result. Then you keep iterating and hope that eventually the doctor finds some magical combination to make life barely tolerable.

I think the best path forward for improving humanity's understanding of biology, and ultimately medical science, is to go all-in on AGI-style technology.

15. tonii141 ◴[12 Nov 25 10:16 UTC] No.45898433{4}[source]▶

>>45898248 #

a) That "no-tools" win depends on prompt orchestration which can still be categorized as tooling.

b) Next-token training doesn’t magically grant inner long-horizon planners.

c) Long context ≠ robust at any length. Degradation with scale remains.

Not moving goalposts, just keeping terms precise.

replies(1): >>45899019 #

16. killerstorm ◴[12 Nov 25 10:56 UTC] No.45898667[source]▶

>>45897613 #

R&D can be automated to speed up medical research - saving lives, prolonging life, etc.

Assistant robots for the elderly. In many countries population is shrinking, so fundamentally just not enough people to take care of the old.

17. killerstorm ◴[12 Nov 25 10:59 UTC] No.45898683{3}[source]▶

>>45898169 #

My man, math is pattern matching, not magic. So is logic. And computation.

Please learn the basics before you discuss what LLMs can and can't do.

replies(1): >>45899359 #

18. ACCount37 ◴[12 Nov 25 11:51 UTC] No.45899019{5}[source]▶

>>45898433 #

My man, you're literally moving all the goalposts as we speak.

It's not just "long context" - you demand "infinite context" and "any length" now. Even humans don't have that. "No tools" is no longer enough - what, do you demand "no prompts" now too? Having LLMs decompose tasks and prompt each other the way humans do is suddenly a no-no?

replies(1): >>45899469 #

19. p_v_doom ◴[12 Nov 25 12:22 UTC] No.45899287{3}[source]▶

>>45897658 #

Automating work and making life easier for people are two entirely different things. Automating work tends to lead to life becoming harder for people, mostly on account of who is benefiting from the automation. Basically, that better life ain't gonna happen under capitalism.

20. rhubarbtree ◴[12 Nov 25 12:30 UTC] No.45899342[source]▶

>>45897613 #

Well, AGI could accelerate scientific and medical discovery, saving lives and impacting billions of people positively.

The potential downside is admittedly severe.

replies(2): >>45904976 #>>45905599 #

21. ozgrakkurt ◴[12 Nov 25 12:31 UTC] No.45899359{4}[source]▶

>>45898683 #

I'm no expert on math but "math is pattern matching" really sounds wrong.

Maybe programming is mostly pattern matching but modern math is built on theory and proofs right?

replies(2): >>45900035 #>>45905753 #

22. tonii141 ◴[12 Nov 25 12:45 UTC] No.45899469{6}[source]▶

>>45899019 #

I’m not demanding anything, I’m pointing out that performance tends to degrade as context scales, which follows from current LLM architectures as autoregressive models.

In that sense, Yann was right.

replies(1): >>45901699 #

23. NiloCK ◴[12 Nov 25 13:01 UTC] No.45899621[source]▶

>>45897613 #

Trying to engage in good faith here but I don't really get this. You're pretending to have never encountered positive visions of technologically advanced futures.

Cure all disease?

Stop aging?

End material scarcity?

It's completely fair to expect that these are all twisted monkey's paw scenarios that turn out dystopian, but being unable to understand any positive motivations for the creation of AGI seems a bit far fetched.

replies(1): >>45900298 #

24. FergusArgyll ◴[12 Nov 25 13:12 UTC] No.45899722{4}[source]▶

>>45897871 #

Matt Levine calls it business negging

25. noddybear ◴[12 Nov 25 13:40 UTC] No.45900035{5}[source]▶

>>45899359 #

Nah, it's all pattern matching. This is how automated theorem provers like Isabelle are built, applying operations to lemmas/expressions to reach proofs.

replies(2): >>45900776 #>>45901563 #

26. cantor_S_drug ◴[12 Nov 25 13:45 UTC] No.45900093{4}[source]▶

>>45897826 #

Here's my prediction: the rapid progress of AI will make money as an accounting practice irrelevant. Take the concept of "Future is already here but unevenly distributed." When we will have true abundance, what the elites will target is the convex hull of progress, they want to be in control of leading edge / leading wavefront and its direction and who has access to resources and decision making. In such a scenario of abundance, the populace will have access to iPhone 50 but the elites will have access to iPhone 500, i.e. uneven distribution. Elites would like to directly control which resource gets allocated to which projects. Elon is already doing that with his immense clout. This implies we would have a sort of multidimensional resource-based economy.

replies(3): >>45901642 #>>45902055 #>>45905033 #

27. rchaud ◴[12 Nov 25 14:03 UTC] No.45900298{3}[source]▶

>>45899621 #

That the development of this technology is in the hands of a few people that don't use even a fraction of their staggering wealth to address these challenges now, tells me that they aren't interested in using AI to solve them later.

28. staticman2 ◴[12 Nov 25 14:38 UTC] No.45900776{6}[source]▶

>>45900035 #

I'm sure if you pick a sufficiently broad definition of pattern matching your argument is true by definition!

Unfortunately that has nothing to do with the topic of discussion, which is the capabilities of LLMs, which may require a narrower definition of pattern matching.

29. jpadkins ◴[12 Nov 25 14:52 UTC] No.45900951{4}[source]▶

>>45897687 #

If you are genuine in your questions, I will give them a shot.

AGI applied to the inputs (or supply chain) of what is needed for inference (power, DC space, chips, network equipment, etc) will dramatically reduce the costs of inference. Most of the costs of stuff today are driven by the scarcity of "smart people's time". The raw resources of material needed are dirt cheap (cheaper than water). Transforming raw resources into useful high tech is a function of applied intelligence. Replace the human intelligence with machine intelligence, and costs will keep dropping (faster than the curve they are already on). Economic history has already shown this effect to be true; as we develop better tools to assist human productivity, the unit cost per piece of tech drops dramatically (Moore's law is just one example, everything that tech touches experiences this effect).

If you look at almost any universal problem with the human condition, one important bottleneck to improving it is intelligence (or "smart people's time").

30. vbarrielle ◴[12 Nov 25 15:44 UTC] No.45901563{6}[source]▶

>>45900035 #

Automated theorem provers are also built around backtracking, which is absent in LLMs.
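To make "backtracking" concrete, here is a toy depth-first search that applies rules to a state and abandons a branch when it hits a dead end. It is only a sketch: the integer states, the two rules, and the names `RULES` and `search` are hypothetical and chosen for brevity, and nothing here reflects how Isabelle or any real prover is implemented.

```python
from typing import Callable, Optional

# Hypothetical toy "prover": states are positive integers, rules are
# functions on them. Both rules only ever increase a positive state.
Rule = tuple[str, Callable[[int], int]]

RULES: list[Rule] = [
    ("double", lambda x: x * 2),
    ("add3", lambda x: x + 3),
]


def search(state: int, goal: int, depth: int) -> Optional[list[str]]:
    """Depth-first search for a rule sequence that turns state into goal."""
    if state == goal:
        return []
    if depth == 0 or state > goal:
        # Dead end: returning None makes the caller try its next rule,
        # which is the backtracking step.
        return None
    for name, rule in RULES:
        rest = search(rule(state), goal, depth - 1)
        if rest is not None:
            return [name] + rest
    return None


if __name__ == "__main__":
    print(search(1, 11, 5))
```

Here the search first follows `double` until it overshoots the goal, then undoes that last choice and tries `add3` instead. The pattern-matching part is the rule application; the backtracking part is the ability to discard a partial attempt and resume from an earlier state.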

31. snapcaster ◴[12 Nov 25 15:50 UTC] No.45901642{5}[source]▶

>>45900093 #

Why didn't that happen when historically productivity already increased 10000x? Why would this time be different?

32. snapcaster ◴[12 Nov 25 15:55 UTC] No.45901699{7}[source]▶

>>45899469 #

Not sure if you're just someone who doesn't want to ever lose an argument or you're actually coping this hard

33. lm28469 ◴[12 Nov 25 16:22 UTC] No.45902055{5}[source]▶

>>45900093 #

We already have an abundance of things, food and energy; what we need is meaning and time, not iPhone 5000s.

34. mrguyorama ◴[12 Nov 25 19:27 UTC] No.45904976{3}[source]▶

>>45899342 #

Would AGI actually be better than just giving all these dollars to various rote and boring sciences?

Bioscience is the next real revolution IMO. Figuring out our bodies as systems and how to program them will lead to a change bigger than information technology.

But what we need for that is not AGI. Bioscience suffers from a total lack of data. We only mapped the human genome a couple decades ago, and that's overselling it. We are currently in the process of slowly mapping out many proteins and receptors and interactions in the body.

We finally have the tooling to do that. We finally have the understanding to do that. What is limiting us right now is mostly the amount of graduate students being paid to laboriously analyze those proteins and what they interact with and other data points.

Once we have enough of that data, we can approach big ideas and other extremely beneficial models.

Right now we are in the calm before the storm. We are mid-1800s physics, just collecting the data necessary to discover and quantify models of electromagnetic energy and fields, the modeling of which is what directly led to the information and then computer revolution. Most advancements of the 20th century were about utilizing those models to master the electromagnetic field. Similar data was how we figured out the nuclear forces.

We should be funding the mapping of the human biological system. We should be gathering the data required to interact with our bodies.

No amount of "self improving superintelligent AGI" can actually overcome the whole "There's no data" problem. If we had a magical AGI in 1750, it would not have been able to produce Maxwell's equations.

35. mrguyorama ◴[12 Nov 25 19:29 UTC] No.45905033{5}[source]▶

>>45900093 #

>When we will have true abundance, what the elites will target is the convex hull of progress

We have abundance. The elite took it all.

Every single dollar gained from increased productivity over the past 50 years has been given to them, by changes in tax structures primarily, and their current claim is that actually we need to give them more.

Because that's all they know. More. A bigger slice of the pie. They demonstrably do not want to make the pie bigger.

Making the whole pie bigger dilutes their control and power over resources. They'd rather just take more of the current pie.

36. mrguyorama ◴[12 Nov 25 19:34 UTC] No.45905114{4}[source]▶

>>45897687 #

>sama gets a lot of shit, but I have to admit at least he used to work on the UBI problem, orb and all.

The Orb was never ever ever meant to ever fix anything about UBI or even help it happen.

It was always about creating a hyped enough cryptocoin he could use as an ATM to fund himself and other things. That's what all these assholes got into crypto for, like, demonstrably. It was always about taking investment from fools who could not punish you for screwing them over, and then taking your bag and going home to play.

The orb was a sales and marketing gimmick. There's nothing it could do that couldn't be done by commodity fingerprint scanners.

37. ◴[12 Nov 25 20:01 UTC] No.45905599{3}[source]▶

>>45899342 #

38. HarHarVeryFunny ◴[12 Nov 25 20:04 UTC] No.45905642[source]▶

>>45897683 #

> So the longer you go in your generation, the higher the error rate. so at long contexts the answers become utter garbage.

Not totally wrong. They can self-correct, but it seems context rot will eventually set in.

39. HarHarVeryFunny ◴[12 Nov 25 20:09 UTC] No.45905753{5}[source]▶

>>45899359 #

When an LLM does it, it's pattern matching.

RL training amounts to pattern matching.

How does an LLM decode Base64? Decode algorithm? No - predictive pattern matching.

An LLM isn't predicting what a person thinks - it's predicting what a person does.
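For contrast with the Base64 remark, the explicit decoding algorithm is short. The sketch below handles only the standard alphabet with `=` padding, the name `b64decode_manual` is made up for illustration, and it says nothing about what an LLM does internally; it is checked against Python's standard `base64` module.

```python
import base64

# Standard Base64 alphabet: each character encodes 6 bits.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64decode_manual(text: str) -> bytes:
    """Decode standard padded Base64 by explicit bit manipulation."""
    stripped = text.rstrip("=")
    # Turn every character into its 6-bit value and concatenate the bits.
    bits = "".join(format(ALPHABET.index(ch), "06b") for ch in stripped)
    # Drop leftover bits that do not fill a whole byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


if __name__ == "__main__":
    sample = "aGVsbG8="
    assert b64decode_manual(sample) == base64.b64decode(sample)
    print(b64decode_manual(sample))
```

The algorithm is a fixed table lookup plus bit regrouping, so whether a model has learned that procedure or a statistical approximation of its outputs cannot be settled from a few correct decodes alone.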
