What an insane time horizon to define success. I suppose he easily can raise enough capital for that kind of runway.
If you think LLMs are not the future, then you need to come up with something better.
If you have a theoretical idea, that's great, but take it to at least GPT-2 level first before writing off LLMs.
Theoretical people love coming up with "better ideas" that fall flat or have hidden gotchas when they get to practical implementation
As Linus says, "talk is cheap, show me the code".
AI agents like LLMs make great use of pre-computed information. Providing a comprehensive but efficient world model (one where more detail is available wherever one is paying more attention given a specific task) will definitely yield new autonomous agents.
Swarms of these, acting in concert or with some hive mind, could be how we get to AGI.
I wish I could help, world models are something I am very passionate about.
Are all critiques of the obvious decline in physical durability of American-made products invalid unless they figure out a solution to the problem? Or may critics of a subject exist without necessarily being accredited engineers themselves?
And also it has extreme limitations that only world models or RL can fix.
Meta can't fight Google (which has an integrated supply chain, from TPUs to its own research lab) or OpenAI (brand awareness, best models).
The only other thing I can imagine is not very charitable: intellectual greed.
It can't just be that, can it? I genuinely don't understand. I would love to be educated.
No doubt his pitch deck will be the same garbage slides he’s been peddling in every talk since the 2010s.
If you want to predict future text, you use an LLM. If you want to predict future frames in a video, you go with Diffusion. But what both of them lack is object permanence. If a car isn't visible in the input frame, it won't be visible in the output. But in the real world, there are A LOT of things that are invisible (image) or not mentioned but only implied (text) that still strongly affect the future. Every kid knows that when you roll a marble behind your hand, it'll come out on the other side. But LLMs and Diffusion models routinely fail to predict that, as for them the object disappears when it stops being visible.
Based on what I heard from others, world models are considered the missing ingredient for useful robots and self-driving cars. If that's halfway accurate, it would make sense to pour A LOT of money into world models, because they will unlock high-value products.
Corporate R&D teams are there to absorb risk, innovate, disrupt, create new fields, not for doing small incremental improvements. "If we know it works, it's not research." (Albert Einstein)
I also agree with LeCun that LLMs in their current form - are a dead end. Note that this does not mean that I think we have already exploited LLMs to the limit, we are still at the beginning. We also need to create an ecosystem in which they can operate well: for instance, to combine LLMs with Web agents better we need a scalable "C2B2C" (customer delegated to business to business) micropayment infrastructure, because as these systems have already begun talking to each other, in the longer run nobody would offer their APIs for free.
I work on spatial/geographic models, inter alia, which by coincidence is one of the directions mentioned in the LeCun article. I do not know what his reasoning is, but mine was/is: LMs are language models, and should (only) be used as such. We need other models - in particular a knowledge model (KM/KB) that cleanly separates knowledge from text generation - it looks to me right now that only that will solve hallucination.
Previously, he very publicly and strongly said:
a) LLMs can't do math. They trick us in poetry but that's subjective. They can't do objective math.
b) they can't plan
c) by the very nature of autoregressive arch, errors compound. So the longer you go in your generation, the higher the error rate. so at long contexts the answers become utter garbage.
All of these were proven wrong, 1-2 years later. "a" at the core (gold at IMO), "b" w/ software glue and "c" with better training regimes.
I'm not interested in the will-it-won't-it debates about AGI; I'm happy with what we have now, and I think these things are good enough now for several use cases. But it's important to note when people making strong claims get them wrong. Again, I think I get where he's coming from, but public stances aren't the place to get into the deep research minutiae.
That being said, I hope he gets to find whatever it is that he's looking for, and wish him success in his endeavours. Between him, Fei-Fei Li and Ilya, something cool has to come out of the small shops. Heck, I'm even rooting for the "let's commoditise LoRA training" that Mira's startup seems to be going for.
It’s going to take money, what if your AGI has some tax policy ideas that are different from the inference owners?
Why would they let that AGI out into the wild?
Let’s say you create AGI. How long will it take for society to recover? How long will it take for people of a certain tax ideology to finally say oh OK, UBI maybe?
The last part is my main question. How long do you think it would take our civilization to recover from the introduction of AGI?
Edit: sama gets a lot of shit, but I have to admit at least he used to work on the UBI problem, orb and all. However, those days seem very long gone from the outside, at least.
Somehow it's one of the most valuable businesses in the world instead.
I don't know him, but, if not him, who else would be responsible for that?
Messing with the logic in the loop and combining models has an enormous potential, but it's more engineering than researching, and it's just not the sort of work that LeCun is interested in. I think the conflict lies there, that Facebook is an engineering company, and a possible future of AI lies in AI engineering rather than AI research.
Maybe at university, but not at a trillion-dollar company. The job of chief scientist there is to lead risky things that will work, to please the shareholders.
And while we've been able to approximate the world behind the words, it's just full of hallucinations because the AIs lack axiomatic systems beyond manually constructed machinery.
You can probably expand the capabilities by attaching to the front-end, but I suspect that Yann is seeing limits to this and wants to go back and build up from the back-end of world reasoning and then _among other things_ attach LLMs at the front-end (but maybe on equal terms with vision models, which allows for seamless integration of LLM interfacing _combined_ with vision for proper autonomous systems).
Everything from the sorites paradox to leaky abstractions; everything real defies precise definition when you look closely at it, and when you try to abstract over it, to chunk up, the details have an annoying way of making themselves visible again.
You can get purity in mathematical models, and in information systems, but those imperfectly model the world and continually need to be updated, refactored, and rewritten as they decay and diverge from reality.
These things are best used as tools by something similar to LLMs, models to be used, built and discarded as needed, but never a ground source of truth.
> I hope AGI can be used to automate work
You people need a PR guy, I'm serious. OpenAI is the first company I've ever seen that comes across as actively trying to be misanthropic in its messaging. I'm probably too old-fashioned, but this honestly sounds like Marlboro launching the slogan "lung cancer for the weak of mind".
I think transformers have been proven to be general purpose, but that doesn't mean that we can't use new fundamental approaches.
To me it's obvious that researchers are acting like sheep as they always do. He's trying to come up with a real innovation.
LeCun has seen how new paradigms have taken over. Variations of LLMs are not the type of new paradigm that serious researchers should be aiming for.
I wonder if there can be a unification of spatial-temporal representations and language. I am guessing diffusion video generators already achieve this in some way. But I wonder if new techniques can improve the efficiency and capabilities.
I assume the Nested Learning stuff is pretty relevant.
Although I've never totally grokked transformers and LLMs, I always felt that MoE was the right direction and besides having a strong mapping or unified view of spatial and language info, there also should somehow be the capability of representing information in a non-sequential way. We really use sequences because we can only speak or hear one sound at a time. Information in general isn't particularly sequential, so I doubt that's an ideal representation.
So I guess I am kind of into variations of transformers myself, to be honest.
But besides being able to convert between sequential discrete representations and less discrete non-sequential representations (maybe you have tokens but every token has a scalar attached), there should be lots of tokenizations, maybe for each expert. Then you have experts that specialize in combining and translating between different scalar-token tokenizations.
Like automatically clustering problems or world model artifacts or something and automatically encoding DSLs for each sub problem.
I wish I really understood machine learning.
If the answer is yes, then better to keep him, because he has already proved himself and you can win in the long-term. With Meta's pockets, you can always create a new department specifically for short-term projects.
If the answer is no, then nothing to discuss here.
His stance is understandable, but hardly the best way to rally a team that needs to push current tech to the limit.
The real issue: Meta is *far behind* Google, Anthropic, and OpenAI.
A radical shift is absolutely necessary - regardless of how much we sympathize with LeCun’s vision.
----
According to Grok, these were LeCun's real contributions at Meta (2013–2025):
----
- PyTorch – he championed a dynamic, open-source framework; now powers 70%+ of AI research
- LLaMA 1–3 – his open-source push; he even picked the name
- SAM / SAM 2 – born from his "segment anything like a baby" vision
- JEPA (I-JEPA, V-JEPA) – his personal bet on non-autoregressive world models
----
Everything else (Movie Gen, LLaMA 4, Meta AI Assistant) came after he left or was outside his scope.
The last time LeCun disagreed with the AI mainstream was when he kept working on neural nets while everyone thought they were a dead end. He might be entirely right in his LLM scepticism. It's hardly a surefire path. He didn't prevent Meta from working on LLMs anyway.
The issue is more that his position is not compatible with short-term investors' expectations, and that's fatal in a company like Meta at the position LeCun occupies.
Well, no, Meta is behind the main framework used by nearly anyone largely thanks to LeCun. LLaMA was also very significant in making open weight a thing and that largely contributed to avoiding Google and OpenAI consolidating as the sole providers.
It's not a perfect tenure but implying he didn't deliver anything is far too harsh.
If you follow LeCun on social media, you can see that the way FAIR’s results are assessed is very narrow-minded and still follows the academic mindset. He mentioned that his research is evaluated by: "Research evaluation is a difficult task because the product impact may occur years (sometimes decades) after the work. For that reason, evaluation must often rely on the collective opinion of the research community through proxies such as publications, citations, invited talks, awards, etc."
But as an industry researcher, he should know how his research fits with the company vision and be able to assess that easily. If the company's vision is to be the leader in AI, then as of now, he seems to have failed that objective, even though he has been at Meta for more than 10 years.
Yes, but he was hired in the ZIRP era, when all SV companies were hiring every opinionated academic and giving them free rein and unlimited money to burn in the hopes that maybe they'd create the next big thing for them eventually.
These are very different economic times right now, after the FED infinite money glitch has been patched out, so people do need to adjust and start actually making some products of value for their seven-figure costs to their employers, or end up being shown the door.
b) Still true: next-token prediction isn’t planning.
c) Still true: error accumulation is mitigated, not eliminated. Long-context quality still relies on retrieval, checks, and verifiers.
Yann’s claims were about LLMs as LLMs. With tooling, you can work around limits, but the core point stands.
Social networks weren't even novel at the inception of FB. MySpace, Friendster, and Hi5 were already popular with millions of users.
Zuck operated it well and was able to grow it from 0 to what it is today. That is what matters.
Frontier models are all profitable. Inference is sold with a damn good margin, and the amount of inference AI companies sell keeps rising. This necessitates putting more and more money into infrastructure. AI R&D is extremely expensive too, and this necessitates even more spending.
A mistake I see people make over and over again is keeping track of the spending but overlooking the revenue altogether. Which sure is weird: you don't get from $0B in revenue to $12B in revenue in a few years by not having a product anyone wants to buy.
And I find all the talk of "non-deterministic hallucinatory nature" to be overrated. Because humans suffer from all of that too, just less severely. On top of a number of other issues current AIs don't suffer from.
Nonetheless, we use human labor for things. All AI has to do is provide a "good enough" alternative, and it often does.
b) reductionism isn't worth our time. Planning works in the real world, today (try any agentic tool like cc/codex/whatever). And if you're set on the purist view, there's mounting evidence from Anthropic that there is planning in the core of an LLM.
c) so ... not true? Long context works today.
This is simply moving goalposts and nothing more. X can't do Y -> well, here they are doing Y -> well, not like that.
Also, like… it’s Facebook. It has a history of ploughing billions into complete nonsense (see metaverse). It is clearly not particularly risk averse.
They tend to get incredibly offended when they see anyone who doesn't toe the Party's line - let alone believe that the Chinese government is untrustworthy and evil.
https://www.smbc-comics.com/?id=2088
It's true. When it comes to the people doing bleeding edge research and development, the answer often is "BECAUSE IT'S FUCKING AWESOME". Regardless of what they tell the corporate higher-ups or put on the grant application statements.
Sure, a lot of people believe that AGI is going to make the world a better place. But "mad scientist" is a stereotype for a reason. You look into their eyes and you see the flame of madness flickering behind them.
The expected outcome is usually something like a post-scarcity society: one where basic needs are all covered.
If we could all live in a future with a free house and a robot that does our chores, where food is never scarce, we should work towards that, they believe.
The intermediate steps aren't thought out, in the same way that, for example, the communist manifesto does little to explain the transition from capitalism to communism. It simply says there will be the need for things like forcing the bourgeoisie to join the common workers and there will be a transition phase, but no clear steps between either system.
Similarly, many AGI proponents think in terms of "wouldn't it be cool if there was an AI that did all the bits of life we don't like doing", without the systemic analysis that many people do those bits because, for example, they need money to eat.
Currently one of the key issues with a lot of fields is that they operate as independent / largely isolated silos. If you could build a true AGI capable of achieving top-level mastery across multiple disciplines it would likely be able to integrate all that knowledge and make a lot of significant discoveries that would improve people's lives. Just exploring existing problem spaces with the full intellectual toolkit that humanity has developed is probably enough to make significant progress.
Our understanding of biology is still painfully primitive. To give a concrete example, I dream that someday it'll be possible to develop medical interventions that allow humans to regrow missing limbs and fix almost any health issue.
Have you ever lived with depression or any other psychiatric problem? I think if we could create medical interventions and environments that are conducive to healing psychiatric problems, that would also be a massive quality-of-life improvement for huge numbers of people. Do you know how our current psychiatric interventions work? You try some drug, flip a coin to see if it does anything and wait 4 weeks to get the result. Then you keep iterating and hope that eventually the doctor finds some magical combination to make life barely tolerable.
I think the best path forward for improving humanity's understanding of biology, and ultimately medical science, is to go all-in on AGI-style technology.
My thinking is that such world models should be integrated with LLMs the way the lower levels of perception are integrated with higher brain function.
Cracking that is a huge step, pure multi-modal trained models will probably give us a hint, but I think we're some ways from seeing a pure multi-modal open model which can be pulled apart/modified. Even then they're still train and deploy not dynamically learning. I worry we're just going to see LSTM design bolted onto deep LLM because we don't know where else to go and it will be fragile and take eons to train.
And less said about the crap of "but inference is doing some kind of minimization within the context window" the better, it's vacuous and not where great minds should be looking for a step forwards.
b) Next-token training doesn’t magically grant inner long-horizon planners.
c) Long context ≠ robust at any length. Degradation with scale remains.
Not moving goalposts, just keeping terms precise.
Oh god, that is massively under-selling their learning ability. These models are able to extract and explain why jokes are funny without even knowing basic vocab, yet there are pure-code models out there with lingual rules baked in from day one that still struggle with basic grammar.
The _point_ of LLMs arguably is their ability to learn any pattern thrown at them with enough compute, with the exception of learning how logical processes work; and pure LLMs only see "time" in the sense that a paragraph begins and ends.
At the least they have taught computers, "how to language", which in regards to how to interact with a machine is a _huge_ step forward.
Unfortunately the financial incentives are split between agentic model usage (taking the idea of a computerised butler further), maximizing model memory and raw learning capacity (answering all problems at any time), and long-range consistency (longer ranges give more stable results for a few reasons, but we're some way from seeing an LLM with 128k experts and 10e18 active tokens).
I think in terms of building the perfect monkey butler we already have most or all of the parts. With regard to a model which can dynamically learn on the fly... LLMs are not the end of the story, and we need something that allows the models to more closely tie their LS to the context. Frankly, the fact that DeepSeek gave us an LLM with LS was a huge leap, since previous model attempts had been overly complex and had failed in training.
How does this fit together with a startup? Would investors happily invest into this knowing not to expect anything in return for at least the next 5-10 years?
I have a disorder characterised by the brain failing to filter own its own sensory noise, my vision is full of analogue TV-like distortion and other artefacts. Sometimes when it's bad I can see my brain constructing an image in real time rather than this perception happening instantaneously, particularly when I'm out walking. A deer becomes a bundle of sticks becomes a muddy pile of rocks (what it actually is) for example over the space of seconds. This to me is pretty strong evidence we do not experience reality directly, and instead construct our perceptions predictively from whatever is to hand.
PyTorch, used by everyone, yet no real value to stockholders; Meta even "fired" the creator of PyTorch days ago.
SAM is great, but what value does it bring to Meta's business? Nobody knows about it. Great tool BTW.
JEPA is a failure (will it get better? I hope so.)
Did you read my list?
LeCun hasn't produced anything noteworthy in the past decade.
He uses the same slides in all of his presentations.
LLMs, while not yet AGI, have shown tremendous progress, and are actually useful for 99% of use cases for the average person.
The remaining 1% is for deep research into the deep unknown (physics, chemistry, genetics, diseases, the nature of intelligence itself), an area in which they falter.
I suppose they could solve superintelligence and cure cancer and build fusion reactors with it, but that's 100% outside their comfort zone - if they manage to build synthetic conversation partners and synthetic content generators as good as or better than the real thing, the value of having every other human on the planet registered to one of their social networks goes to zero.
Which is impossible anyway - I use Facebook to maintain real human connections and keep up with people I care about, not to consume infinite content.
/s
In the software development world, yes; outside of that, virtually none. Yes, you can transcribe a video call in Office, but that's not ground-breaking. I dare you to list 10 impacts on different fields that LLM/diffusion models are having - excluding tech, including at least half blue-collar fields and at least half white-collar fields, at different levels from the lowest to the highest in the company hierarchy. Impact here specifically means a significant reduction of costs or a significant increase in revenue. Go on.
Even when writing, it shifts the mental burden from an easy thing (writing code) to a very hard thing (reading that code, validating it's right, hallucination free, and then refactoring it to match your teams code style and patterns).
It's great for building a first-order approximation of a tech demo app that you then throw out and build from scratch, and auto-complete. In my experience, anyways. I'm sure others have had different experiences.
I would have loved to see a VLM utilizing JEPA for example, but it simply never happened.
Assistant robots for the elderly. In many countries population is shrinking, so fundamentally just not enough people to take care of the old.
You're quite correct that our "reality" is in part constructed. The Flashed Face Distortion Effect [0][1] (wherein faces in the peripheral vision appear distorted due to the brain filling in the missing information with what was there previously) is just one example.
[0] https://en.wikipedia.org/wiki/Flashed_face_distortion_effect [1] https://www.nature.com/articles/s41598-018-37991-9
Please learn the basics before you discuss what LLMs can and can't do.
Starting with the sophomoric questions of the optimist who mistakes the possible for the viable: how definite of a thing is "the world", how knowable is it, what is even knowledge... and then back through the more pragmatic: by whom is it knowable, to what degree, and by what means. The mystics: is "the world" the same thing as "the sum of information about the world"? The spooks: how does one study those fields of information which are already agentic and actively resist being studied by changing themselves, such as easily emerge anywhere more than n(D) people gather?
Plenty of food for thought from why ontologies are/aren't a thing. The classical example of how this plays out in the market being search engines winning over internet directories. But that's one turn of the wheel. Look at what search engines grew into quarter century later. What their outgrowths are doing to people's attitude towards knowledge. Different timescale, different picture.
Fundamentally, I don't think human language has sufficient resolution to model large spans of reality within the limited human attention span. The physical limits of human language as information processing device have been hit at some point in the XX century. Probably that 1970s divergence between productivity and wages.
So while LLMs are "computers speak language now" and it's amazing if sad that they cracked it by more data and not by more model, what's more amazing is how many people are continually ready to mistake language for thought. Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?
Practically, what we lack is not the right architecture for "big knowing machine", but better tools for ad-hoc conceptual modeling of local situations. And, just like poetry that rhymes, this is exactly what nobody has a smidgen of interest to serve to consumers, thus someone will just build it in their basement in the hope of turning the tables on everyone. Probably with the help of LLMs as search engines and code generators. Yall better hurry. They're almost done.
If OpenAI can build a "social" network of completely generated content, that can kill Meta. Even today I venture to guess that most of the engagement on their platforms is not driven by real friends, so an AI-driven platform won't be too different, or it might make content generation so easy as to make your friends engage again.
Apart from that, the ludicrous vision of the metaverse seems much more plausible with highly realistic world models.
Is he advocating for philosophical idealism of the mind, or does he have an alternative physicalist theory?
Text and languages contain structured information and encode a lot of real-world complexity (or it's "modelling" that).
Not saying we won't pivot to visual data or world simulations, but he was clearly not the type of person to compete with other LLM research labs, nor did he propose any alternative that could be used to create something interesting for end-users.
But that sure didn't happen.
Also do you get comorbid headaches with yours out of interest?
I really resonate with his view due to my background in physics and information theory. I for one welcome his new experimentation in other realms while so many still hack away at their LLMs in pursuit of SOTA benchmarks.
If that content becomes even cheaper, of higher quality and highly tailored to you, that is probably worth a lot of money, or at least worth not losing your entire company to a new competitor.
The future is here folks, join us as we build this giant slop machine in order to sell new socks to boomers.
It's not just "long context" - you demand "infinite context" and "any length" now. Even humans don't have that. "No tools" is no longer enough - what, do you demand "no prompts" now too? Having LLMs decompose tasks and prompt each other the way humans do is suddenly a no-no?
Why is this idea of a world model helpful? Because it allows multiple interesting things, like predict what happens next, model counterfactuals (what would happen if I do X or don't do X) and many other things that tend to be needed for actual principled reasoning.
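As a toy sketch of that interface (all names below are hypothetical placeholders, not any particular system's API): a world model is essentially a learned transition function you can query both for "what happens next" and for counterfactual branches from the same state.

    # Illustrative sketch only; `transition` stands in for a learned network.
    from dataclasses import dataclass
    from typing import Any, Callable, List, Tuple

    @dataclass
    class WorldModel:
        transition: Callable[[Any, Any], Any]  # (state, action) -> predicted next state

        def predict(self, state, action):
            # "What happens next if I do `action`?"
            return self.transition(state, action)

        def rollout(self, state, actions: List[Any]):
            # Imagine several steps ahead without touching the real world.
            trajectory = [state]
            for a in actions:
                state = self.predict(state, a)
                trajectory.append(state)
            return trajectory

        def counterfactual(self, state, do_x, dont_x) -> Tuple[Any, Any]:
            # Compare "what happens if I do X" against "if I don't",
            # branching from the same starting state.
            return self.predict(state, do_x), self.predict(state, dont_x)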
https://en.wikipedia.org/wiki/2022_University_of_Idaho_murde...
It's pretty much dog eat dog at top management positions.
It's not exactly a space for free-thinking timelines.
Is the real bubble ignorance? Maybe you'll cool down but the rest of the world? There will just be more DeepSeek and more advances until the US loses its standing.
When did they make groundbreaking foundation models though? DeepMind and OpenAI have done plenty of revolutionary things, what did Meta AI do while being led by LeCun?
[1] Doctor of Philosophy:
This is why we're losing innovation.
Look at electric cars, batteries, solar panels, rare earths and many more. Bubble or struggle for survival? Right, because if the US has no AI the world will have no AI? That's the real bubble - being stuck in an ancient world view.
Meta's stock has already tanked for "over" investing in AI. Bubble, where?
But the skill sets needed to avoid and survive personnel issues in academia are different from industry. My 2c.
The US government basically forced AT&T to use revenue from its monopoly to do fundamental research for the public good. Could the government do the same thing to our modern megacorps? Absolutely! Will it? I doubt it.
https://www.nytimes.com/1956/01/25/archives/att-settles-anti...
I understand Meta's not academia nor charity, but come on, how much profit do they need to make before we can expect them to allocate part of their resources towards some long-term goals beneficial for society, not only for shareholders?
Hasn't that narrow focus and chasing of profits gotten us into trouble already?
Thankfully I don't get comorbid headaches – in fact I seldom get headaches at all. And even on the odd occasion that I do, they're mild and short-lived (like minutes). I don't recall ever having a headache that was severe, or that lasted any length of time.
Yours does sound much more extreme than mine, in that mine is in no way debilitating. It's more just frustrating that it exists at all, and that it isn't more widely recognised and researched. I have yet to meet an optician that seems entirely convinced that it's even a real phenomenon.
You assume that's the only use of it.
And are people not using these code generators?
Is this an issue with a lost generation that forgot what Capex is? We've moved from Capex to Opex and now the notion is lost, is it? You can hire an army of software developers but can't build hardware.
Is it better when everyone buys DeepSeek or a non-US version? Well then you don't need to spend Capex but you won't have revenue either.
Meta is now just competing against giants like OpenAI, Anthropic and Google, plus all the new Chinese companies; I see no real chance for them to offer a popular chat model, but rather to market their AI as a bundled product for companies which want to advertise, where the images and videos will be automatically generated by Meta.
The potential downside is admittedly severe.
It seems they've given up on the research and are now doubling down on LLMs.
Maybe programming is mostly pattern matching but modern math is built on theory and proofs right?
We have estimates that range from 30% to 70% gross margin on API LLM inference prices at major labs, 50% middle road. 10% to 80% gross margin on user-facing subscription services, error bars inflated massively. We also have many reports that inference compute has come to outmatch training run compute for frontier models by a factor of x10 or more over the lifetime of a model.
The only source of uncertainty is: how much inference do the free tier users consume? Which is something that the AI companies themselves control: they are in charge of which models they make available to the free users, and what the exact usage caps for free users are.
Adding that up? Frontier models are profitable.
This goes against the popular opinion, which is where the disbelief is coming from.
Note that I'm talking LLMs rather than things like image or video generation models, which may have vastly different economics.
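A back-of-the-envelope version of that arithmetic, using the midpoint figures above (50% gross margin on inference, lifetime inference compute ~10x the training run); every number here is an illustrative assumption, not a reported financial:

    # Back-of-the-envelope sketch; all numbers are illustrative assumptions.
    inference_cost = 100.0                                    # lifetime compute cost of serving a model (arbitrary units)
    gross_margin = 0.50                                       # midpoint of the 30-70% estimates
    inference_revenue = inference_cost / (1 - gross_margin)   # 200.0: revenue implied by a 50% margin
    training_cost = inference_cost / 10                       # training run ~1/10 of lifetime inference compute

    lifetime_result = inference_revenue - inference_cost - training_cost
    print(lifetime_result)  # 90.0: the model pays back its training run, before R&D and free-tier usage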
Damn did you just invent that? That's really catchy.
Skills may transfer to other research areas, lessons may be learnt, closing the feedback loop with usage provides more data and opportunities for learning. It also creates a culture where bullshit isn’t possible, as the thing has to actually work. Academic research often ends up serving no one but the researchers, because there is little or no incentive to produce real knowledge.
Anyway, how much that matters for an investor is hard to form a clear answer to - investors are after all not directly looking for profitability as such, but for valuation growth. The two are linked but not the same -- any investor in OpenAI today probably also places themselves into a game of chance, betting on OpenAI making more breakthroughs and increasing the cash flow even more -- not just becoming profitable at the same rate of cash flow. So there's still some of the same risk baked into this investment.
But with a new startup like LeCun's is going to be, it's 100% on the risk side and 0% on the optionality side. The path to profitability for a startup would be something like 1) a breakthrough is made 2) that breakthrough is utilized in a way that generates cash flow 3) the company becomes profitable (and at this point hopefully the valuation is good.)
There's a lot of things that can go wrong at every step here (aside from the obvious), including e.g. making a breakthrough that doesn't represent a defensible moat for your startup, failing to build the structure of the business necessary to generate cashflow, ... OpenAI et al already have a lot of that behind them, and while that doesn't mean that they don't face upcoming risks and challenges, the huge amount of cashflow they have available helps them overcome these issues far more easily than a startup, which will stop solving problems if you stop feeding money into it.
In any case if I have to guess, we will see shallow things like the Sora app, a video generation tiktok social network and deeper integration like fake influencers, content generation that fits your preferences and ad publishers preferences
A more evil incarnation of this might be a social network where you aren't sure who is real and who isn't. This will probably be a natural evolution of the need to bootstrap a social network with people, replacing them with LLMs.
The capabilities of LLMs are limited by what's in their training data. You can use all the tricks in the book to squeeze the most out of that - RL, synthetic data, agentic loops, tools, etc, but at the end of the day their core intelligence and understanding is limited by that data and their auto-regressive training. They are built for mimicry, not creativity and intelligence.
Best of luck to LeCun. I hope by world models he means embodied AI or humanoid robots. We'll have to wait and see.
Cure all disease?
Stop aging?
End material scarcity?
It's completely fair to expect that these are all twisted monkey's paw scenarios that turn out dystopian, but being unable to understand any positive motivations for the creation of AGI seems a bit far fetched.
"Why Bell Labs Worked" [1]
"The Influence of Bell Labs" [2]
"Bringing back the golden days of Bell Labs" [3]
"Remembering Bell Labs as legendary idea factory prepares to leave N.J. home" [4] or
"Innovation and the Bell Labs Miracle" [5]
interesting too.
[1] https://news.ycombinator.com/item?id=43957010 [2] https://news.ycombinator.com/item?id=42275944 [3] https://news.ycombinator.com/item?id=32352584 [4] https://news.ycombinator.com/item?id=39077867 [5] https://news.ycombinator.com/item?id=3635489
your whole reasoning is neither here nor there, and attacks a straw man - YLC for sure knows that human experience of reality is heavily modified and distorted
but he also knows, and I'd bet he's very right on this, that we don't "sip reality through a narrow straw of tokens/words", and that we don't learn "just from our/approved written down notes", and only under very specific and expensive circumstances (training runs)
anything closer to more-direct-world-models (as LLMs are ofc at a very indirect level world models) has very high likelihood of yielding lots of benefits
The mental model this person has of this feed of words is what an LLM at best has (but human model likely much richer since they have a brain, not just a transformer). No real-world experience or grounding, therefore no real-world model. The only model they have is of the world they have experience with - a world of words.
I don't disagree that the world is full of fuzziness. But the problem I have with this portrayal is that formal models are often normative rather than analytical. They create reality rather than being an interpretation or abstraction of reality.
People may well have a fuzzy idea of how their credit card works, but how it really works is formally defined by financial institutions. And this is not just true for software products. It's also largely true for manufactured products. Our world is very much shaped by artifacts and man-made rules.
Our probabilistic, fuzzy concepts are often simply a misconception. That doesn't mean it's not important of course. It is important for an AI to understand how people talk about things even if their idea of how these things work is flawed.
And then there is the sort of semi-formal language used in legal or scientific contexts that often has to be translated into formal models before it can become effective. Law makers almost never write algorithms (when they do, they are often buggy). But tax authorities and accounting software vendors do have to formally model the language in the law and then potentially change those formal definitions after court decisions.
My point is that the way in which the modeled, formal world interacts with probabilistic, fuzzy language and human actions is complex. In my opinion we will always need both. AIs ultimately need to understand both and be able to combine them just like (competent) humans do. AI "tool use" is a stop-gap. It's not a sufficient level of understanding.
No, I think he's suggesting that "world models" are more impactful. The issue for him inside Meta is that there is already a research group looking at that, and they are wildly more successful (in terms of getting research to product) and way fucking cheaper to run than FAIR.
Also LeCun is stuck weirdly in product land, rather than research (RL-R) which means he's not got the protection of Abrash to isolate him from the industrial stupidity that is the product council.
The models aren’t Chinese; they're the entire world's - unless I became Chinese without realizing it.
This is the weirdest technology market that I’ve seen. Researchers are getting rewarded with VC money to try what remains a science experiment. That used to be a bad word and now that gets rewarded with billions of dollars in valuation.
_That is what an AI winter is_.
Like, if you look at the previous ones, it's a cycle of over-hype, over-promising, funding collapse after the ridiculous over-promising does not materialise. But the tech tends to hang around. Voice recognition did not change the world in the 90s, but neither did it entirely vanish once it was realised that there had been over-promising, say.
If LLMs actually hit a plateau, then investment will flow towards other architectures.
You must not have lived through the dot-com boom. Almost everything under the sun was being sold under a website that started with an "e". ePets, ePlants, eStamps, eUnderwear, eStocks, eCards, eInvites.....
> is massively monopolistic and have unbounded discretionary research budget
that is the case for most megacorps. if you look at all the financial instruments.
modern monopolies are not equal to single corporation domination. modern monopolies are portfolios who do business using the same methods and strategies.
the problem is that private interests strive mostly for control, not money or progress. if they have to spend a lot of money to stay in control of (their (share of the)) segments, they will do that, which is why stuff like the current graph of investments of, by and for AI companies and the industries works.
A modern equivalent and "breadth" of a Bell Labs (et. al) kind of R&D speed could not be controlled and would 100% result in actual Artificial Intelligence vs all those white labelababbebel (sry) AI toys we get now.
Post-WWI and WWII "business psychology" has built a culture that cannot thrive in a free world (free as in undisturbed and left to all devices available) for a variety of reasons, but mostly because of elements with a medieval/dark-age kind of aggressive tendency to come to power and maintain it that way.
In other words: not having a Bell Labs kind of setup anymore ensures that the variety of approaches taken on large scales aka industry-wide or systemic, remains narrow enough.
>> away from long-term research toward commercial AI products and large language models - LLMs
This feels more like what I see every day: the people in charge desperately looking for some way - any way - to capitalize on the frenzy. They're not looking to fund research; they just want to get even richer. It's pets.ai this time.
I actually suspected 5HT2A might be involved before that study came out, since my visual distortions sometimes resemble those caused by psychedelics. It's also known that both psychedelics and, anecdotally from patients' groups, SSRIs can cause symptoms similar to visual snow syndrome; I had a bad experience with SSRIs, for example, but serotonin antagonists actually fixed my vision temporarily - albeit with intolerable side effects, so I had to stop.
It's definitely a bit of a faff that people have never heard of it, I had to see a neuro-ophthalmologist and a migraine specialist to get a diagnosis. On the other hand being relatively unknown does mean doctors can be willing to experiment. My headaches at least are controlled well these days.
I don't know if that's indicative of the market as a whole though. Zuck just seems really gutted they fell behind with Llama 4.
We need another illegal Steve Jobs style freeze on talent theft (/s or I get downvoted to oblivion).
If Deepseek is free it undermines the value of LLMs, so the value of these US companies is mainly speculation/FOMO over AGI.
And I stopped reading him, since he - in my opinion - trashed on autopilot everything the 99% did - and these 99% were already beyond two standard deviations of greatness.
It is even more problematic if you have absolutely no results, e.g. products, to back your claims.
How did you determine that "surefire paths to success still available"? Most academics agree that LLMs (or LLMs alone) are not going to lead us to AGI. How are you so certain?
I'll happily step out of the way once someone simply tells me what it is you're trying to accomplish. Until you can actually define it, you can't do "it".
https://www.youtube.com/watch?v=l-OLgbdZ3kk
In this video we explore Predictive Coding – a biologically plausible alternative to the backpropagation algorithm, deriving it from first principles.
Predictive coding and Hebbian learning are interconnected learning mechanisms where Hebbian learning rules are used to implement the brain's predictive coding framework. Predictive coding models the brain as a hierarchical system that minimizes prediction errors by sending top-down predictions and bottom-up error signals, while Hebbian learning, often simplified as "neurons that fire together, wire together," provides a biologically plausible way to update the network's weights to improve predictions over time.
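A minimal numeric sketch of that idea, assuming a single linear generative layer; this is a toy illustration of predictive coding with a Hebbian-style weight update, not the derivation from the video.

    # Toy predictive-coding sketch: a latent state predicts the input top-down,
    # the prediction error flows bottom-up, and weights change Hebbian-style.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(16, 4))    # generative weights: latent (4) -> observation (16)

    def infer(x, W, n_steps=50, lr_state=0.1):
        """Settle the latent state by repeatedly reducing the prediction error."""
        z = np.zeros(4)
        for _ in range(n_steps):
            err = x - W @ z                    # bottom-up error: input minus top-down prediction
            z += lr_state * (W.T @ err)        # nudge the state to explain the input better
        return z, err

    def hebbian_update(W, err, z, lr=0.01):
        """'Fire together, wire together': weight change = error below x activity above."""
        return W + lr * np.outer(err, z)

    x = rng.normal(size=16)                    # a fake sensory observation
    z, err = infer(x, W)
    W = hebbian_update(W, err, z)              # future predictions of x improve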
I’ve worked for multiple startups and I’ve watched startup job boards most of my career.
A lot of VC-backed startups have a founder with a research background and are focused on proving out some hypothesis. I don’t see anything uncommon about this arrangement.
If you live near a University that does a lot of research it’s very common to encounter VC backed startups that are trying to prove out and commercialize some researcher’s experiment. It’s also common for those founders to spend some time at a FAANG or similar firm before getting VC funded.
yes, a glib response, but think about it: we define an intelligence test for humans, which by definition is an artificial construct. If we then get a computer to do well on the test we haven't proved it's on par with human intelligence, just that both meet some of the markers that the test makers are using as rough proxies for human intelligence. Maybe this helps signal or judge if AI is a useful tool for specific problems, but it doesn't mean AGI
Same goes for academia. People's visions compete for other people's financial budgets, time and other resources. Some dogs get to eat, study, train at the frontier and with top tools in top environments while the others hope to find a good enough shelter.
From what I recall there were some biotech stocks in that era that do fit the bill.
Not that I believe AGI is the measure of success, there's probably much more efficient ways to achieve company goals than simulating humans.
That didn’t last. People in the know knew that once you have a billion users and insane revenue and market power and have basically bought or driven out of business most of your competitors (Diapers.com, Jet.com, etc) you can eventually slow down your physical expansion, tighten the screws on your suppliers, increase efficiencies, and start printing money.
The VCs who are funding these companies are hoping that they have found the next Amazon. Many will probably go out of business, but some might join the ranks of trillion dollar companies.
As for IQ tests and the like, to the extent they are "scientific" they are designed based on empirical observations of humans. It is not designed to measure the intelligence of a statistical system containing a compressed version of the internet.
After that, VC had become more like PE, investing in stuff that was working already but needed money to scale.
Who says they don't make money? Same with open source software that offer a hosted version.
> If Deepseek is free it undermines the value of LLMs, so the value of these US companies is mainly speculation/FOMO over AGI
Freemium, open source and other models all exist. Does it undermine the value of e.g. Salesforce?
He’s not completely wrong in the sense that hallucinations aren’t completely solved, but hallucinations are definitely becoming less and less of a problem, to the point where AI can be a daily driver even for coders.
> Our probabilistic, fuzzy concepts are often simply a misconception.
How eg a credit card works today is defined by financial institutions. How it might work tomorrow is defined by politics, incentives, and human action. It's not clear how to model those with formal language.
I think most systems we interact with are fuzzy because they are in a continual state of change due to the aforementioned human society factors.
Unfortunately that has nothing to do with the topic of discussions, which is the capabilities of LLMs, which may require a more narrow definition of pattern matching.
Why are these so different?
Yann was largely wrong about AI. Yann coined the term stochastic parrot and derided LLMs as a dead end. It’s now utterly clear how much utility LLMs have and that whatever these LLMs are doing, it is much more than stochastic parroting.
I wouldn’t give money to Yann; the guy is a stubborn idiot and closed-minded. Whatever he’s doing won't even touch LLM technology. He was so publicly deriding LLMs that I see no way he will back-pedal from that.
I don't think LLMs are the end of the story for AGI. But I think they are a stepping stone. Whatever AGI is in the end, LLMs or something close to them will be a modular component or aspect of the final product. For LeCun to dismiss even the possibility of this is idiotic. Horrible investment move to give money to Yann to pursue AGI likely without even considering LLMs.
This is something that was true last year, but hanging on by a thread this year. Genie shows this off really well, but it's also in the video models as well.[1]
[1]https://storage.googleapis.com/gdm-deepmind-com-prod-public/...
AGI applied to the inputs (or supply chain) of what is needed for inference (power, DC space, chips, network equipment, etc) will dramatically reduce the costs of inference. Most of the costs of stuff today are driven by the scarcity of "smart people's time". The raw resources of material needed are dirt cheap (cheaper than water). Transforming raw resources into useful high tech is a function of applied intelligence. Replace the human intelligence with machine intelligence, and costs will keep dropping (faster than the curve they are already on). Economic history has already shown this effect to be true; as we develop better tools to assist human productivity, the unit cost per piece of tech drops dramatically (Moore's law is just one example; everything that tech touches experiences this effect).
If you look at almost any universal problem with the human condition, one important bottleneck to improving it is intelligence (or "smart people's time").
Like the new spin out Episteme from OpenAI?
I cannot remember the quote, but it's something to the effect of "Listen closely to grey haired men when they talk about what is possible, and never listen when they talk about what is impossible."
I wonder what changed. Does AI look like a safe bet? Or does every other bet seem to not have any reasonable return?
They generate revenue, but most companies are in the hole for the research capital outlay.
If open source models from China become popular, then the only thing that matters is distribution / moat.
Can these companies build distribution advantage and moats?
Biotech has been a YC darling. Was Ginkgo Bioworks not doing science experiments?
Clean energy was a big YC fad roughly 15 years ago. Billions were invested towards scientific research into biofuels, solar, etc.
There's absolutely no reason to think this. In fact, all of the evidence we have to this point suggests that scaling intelligence horizontally doesn't increase capabilities – you have to scale vertically.
Additionally, as it stands I'd argue there are foundational architectural advancements needed before artificial neural networks can learn and reason at the same level as (or better than) humans across a wide variety of tasks. I suspect when we solve this for LLMs the same techniques could be applied to world models. Fundamentally, the question to ask here is whether AGI is I/O-dependent, and I see no reason to believe this to be the case – if someone removes your eyes and cuts off your hands they don't make you any less generally intelligent.
He's a great researcher, but that's abysmal leadership. He had to go.
If he gets funding (and he probably will) that's a win for everyone.
The second reason is by how much it's going to be better in the end. Fusion has to compete with hydro, nuclear, solar and wind. It makes exactly the same energy, so the upside is already capped unlike with AI which brings something disruptive.
Training on 2,500 hours of prerecorded video of people playing Minecraft, they produce a neural net world model of Minecraft. It is basically a learned Minecraft simulator. You can actually play Minecraft in it, in real time.
They then train a neural net agent to play Minecraft and achieve specific goals all the way up to obtaining diamonds. But the agent never plays the real game of Minecraft during training. It only plays in the world model. The agent is trained in its own imagination. Of course this is why it is called Dreamer.
The advantage of this is that once you have a world model, no extra real data is required to train agents. The only input to the system is a relatively small dataset of prerecorded video of people playing Minecraft, and the output is an agent that can achieve specific goals in the world. Traditionally this would require many orders of magnitude more real data to achieve, and the real data would need to be focused on the specific goals you want the agent to achieve. World models are a great way to cheaply amplify a small amount of undifferentiated real data into a large amount of goal-directed synthetic data.
Now, Minecraft itself is already a world model that is cheap to run, so a learned world model of Minecraft may not seem that useful. Minecraft is just a testbed. World models are very appealing for domains where it is expensive to gather real data, like robotics. I recommend listening to the interview above if you want to know more.
World models can also be useful in and of themselves, as games that you can play, or to generate videos. But I think their most important application will be in training agents.
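A skeletal sketch of that two-stage recipe (learn a simulator from logged play, then train the agent entirely inside it); `WorldModel`, `Agent`, and the method names below are hypothetical placeholders, not Dreamer's actual API.

    # Sketch of the two-stage recipe described above; everything here is a placeholder.
    def train_world_model(world_model, recorded_episodes, steps):
        # Stage 1: learn to simulate the environment from prerecorded video/actions.
        for _ in range(steps):
            batch = recorded_episodes.sample()               # clips of (observations, actions)
            world_model.update(batch)                        # fit next-step prediction
        return world_model

    def train_agent_in_imagination(agent, world_model, steps, horizon=15):
        # Stage 2: the agent never touches the real game; it acts inside the model.
        for _ in range(steps):
            state = world_model.sample_start_state()
            for _ in range(horizon):
                action = agent.act(state)
                state, reward = world_model.step(state, action)  # imagined transition
                agent.observe(state, action, reward)
            agent.update()                                   # improve the policy from imagined rollouts
        return agent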
If consumption of slop turns out to be a novelty that goes away and enough time goes by without a leap to truly useful intelligence, the AI investment will go down.
But ultimately I agree with you that this entire societal process is just categorically different. It's simply not a description or definition of something, and therefore the question of how formal it can be doesn't really make sense.
Formalisms are tools for a specific but limited purpose. I think we need those tools. Trying to replace them with something fuzzy makes no sense to me either.
> how many people are continually ready to mistake language for thought
This is a fundamental illusion - where, rote memory and names and words get mistaken for understanding. This was wonderfully illustrated here [1]. Few really grok what understanding actually is. This is an unfortunate by-product of our education system.
> Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?
Brilliant way to state the fundamental human condition. ie, we are all zombies conditioned to imitate rather than understand. Social media amplifies the zombification, and now LLMs do that too.
> Starting with the sophomoric questions of the optimist who mistakes the possible for the viable
This is the fundamental tension between operationalized meaning and imagination. A grokking soul gathers mists from the cosmic chaos and creates meaning and operationalizes it for its own benefit and then continually adapts it.
> it's amazing if sad that they cracked it by more data and not by more model
I was speaking to experts in the sciences (chemistry). They were shocked that the underlying architecture is brute force. They expected a compact, information-compressed theory which is able to model independent of data. The problem with brute-force approaches is that they don't scale and don't capture the essences which are embodied in theories.
> The physical limits of human language as information processing device have been hit at some point in the XX century
That limit was hit 2,000 years back, when humans realized that formalism was needed to operationalize meaning, and natural language was too vague to capture and communicate it. Because the world model that natural language captures encompasses "everything", whereas making it "useful" requires limiting it via formalism.
Talk is cheap. Until they're actually cash flow positive, I'll believe it when I see it
I've started breathing a little easier about the possibilty of AI taking all our software engineering jobs after using Anthropic's dev tools.
If the people making the models and tools that are supposed to take all our jobs can't even fix their own issues in a dependable and expedient manner, then we're probably going to be ok for a bit.
This isn't a slight against Anthropic, I love their products and use them extensively. It's more a recognition of the fact that the more difficult aspects of engineering are still quite difficult, and in a way LLMs just don't seem well suited for.
And of course it doesn't work. Humans don't have world models. There's no such thing as a world model!
And let's not speak about those so deep into sloth that they put it to use to deteriorate, rather than augment as they claim to do, humane creative recreational activities.
So they're piling gobs of capital into an "AI" company with four customers with the hope that it is the one that becomes the home run (they know it won't, but LPs give you money to deploy it!)
It also means that companies like Yann's potential new one have the best chance in history of being funded, and that's a great thing.
P.S. all VCs outside the top-10 lose against the S&P. While I love that dumb capital is being injected into big, risky bets, surely the other shoe will drop at some point. Or is this just wealth redistribution with extra steps?
If you think about Theranos, Magic Leap, OpenAI, Anthropic, they are all the same: one idea that's kinda plausible (well, if you don't look too closely), a slick demo, and well-connected founders.
Much as a lot of people dislike LeCun (just look at the blind posts about him) he did run and setup a very successful team inside meta, well nominally at least.
The specific issue you linked is related to the way Ink works, and the way terminals use ANSI escape codes to control rendering. When building a terminal app there is a tradeoff between (1) visual consistency between what is rendered in the viewport and scrollback, and (2) scrolling and flickering which are sometimes negligible and sometimes a really bad experience. We are actively working on rewriting our rendering code to pick a better point along this tradeoff curve, which will mean better rendering soon. In the meantime, a simple workaround that tends to help is to make the terminal taller.
Please keep the feedback coming!
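For anyone curious what that tradeoff looks like concretely, here is a minimal sketch (in Python, and emphatically not Claude Code's or Ink's actual implementation) of the in-place redraw approach: jump the cursor back up over the previously drawn frame with ANSI escape codes and rewrite it, rather than printing new lines.

```python
import sys
import time

# Minimal illustration of in-place terminal re-rendering with ANSI escape
# codes. This is a generic sketch of the technique, not Claude Code's code.

CURSOR_UP = "\x1b[{n}A"  # move the cursor up n lines
ERASE_LINE = "\x1b[2K"   # erase the entire current line

def render_frame(lines, prev_line_count):
    """Redraw a small viewport in place instead of appending new lines."""
    out = sys.stdout
    if prev_line_count:
        # Jump back to the top of the previously drawn frame...
        out.write(CURSOR_UP.format(n=prev_line_count))
    for line in lines:
        # ...then erase and rewrite each line of the viewport.
        out.write(ERASE_LINE + line + "\n")
    out.flush()
    return len(lines)

if __name__ == "__main__":
    drawn = 0
    for tick in range(5):
        frame = [f"spinner tick {tick}", "status: rendering..."]
        drawn = render_frame(frame, drawn)
        time.sleep(0.2)
```

The catch is exactly the one described above: anything that has already scrolled out of the viewport can't be repainted this way, so you either accept inconsistent scrollback or re-render aggressively and risk flicker.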
Capital always chases the highest rate of return as well, and margins on energy production are tight. Margins on performing labor are huge.
The issue is context. Trying to make an AI assistant with text-only inputs is doable but limiting. You need to know the _context_ of all the data, and without visual input most of it is useless.
For example "Where is the other half of this" is almost impossible to solve unless you have an idea of what "this" is.
But to do that you need to have cameras, and to use cameras you need to have position, object, and people tracking. And that is a hard problem that's not solved.
The hypothesis is that "world models" solve that with an implicit understanding of the world and the objects in context.
Nobody had a way to do silicon transistor manufacturing at scale until the traitorous eight flipped Shockley the bird and took a $1.4M seed investment from Sherman Fairchild.
Big bets on uncertain technology are what tech is supposed to be about.
CC is one of the best and most innovative pieces of software of the last decade. Anthropic has so much money. No judgment, just curious, do you have someone who’s an expert on terminal rendering on the team? If not, why? If so, why choose a buggy / poorly designed TUI library — or why not fix it upstream?
Manager: Now you "have" AI, release 10 features instead of 1 in the next month.
Devs: Spending 50% more working hours to make AI code "work" and deliver 10.
They're trying desperately to find profit in what so far has been the biggest boondoggle of all time.
Now, it's not like he opened up Anthropic's books for an audit, so you don't necessarily have to trust him. But you do need to believe that either (a) what he is saying is roughly true or (b) he is making the sort of fraudulent statements that could get you sent to prison.
Direct Realism is the idea that reality is directly available to us and that any intermediate transformations made by our brains are not enough to change the dial.
Direct Realism has long been refuted. There are a number of examples, e.g. the hot and cold bucket; the straw in a glass; rainbows and other epiphenomena, etc.
LeCun had chosen to focus on the latter. He can't be blamed for not having taken the second hat.
Why they decided not to do that is kind of a puzzle.
Some world models can also be updated by their respective AI agents, e.g. "I, Mr. Bot, have moved the ice cream into the freezer from the car" (thereby updating the state of freezer and car, by transferring ice cream from one to the other, and making that the context for future interactions).
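A toy sketch of what such an agent-updatable world model could look like, reduced to nothing more than a mutable object-location store (the class and method names are invented purely for illustration):

```python
# Toy agent-updatable world model: a mutable store of object locations.
# Names and structure are made up for illustration, not any real system.

class WorldModel:
    def __init__(self):
        self.locations = {}  # object -> container, e.g. "ice cream" -> "car"

    def observe(self, obj, container):
        """Record where an object currently is."""
        self.locations[obj] = container

    def move(self, obj, dest):
        """Agent action: update the state when the agent moves an object."""
        src = self.locations.get(obj, "somewhere unknown")
        self.locations[obj] = dest
        return f"I, Mr. Bot, have moved the {obj} into the {dest} from the {src}"

    def where(self, obj):
        """Later queries are answered from the updated state, not raw text."""
        return self.locations.get(obj, "unknown")

model = WorldModel()
model.observe("ice cream", "car")
print(model.move("ice cream", "freezer"))  # becomes context for future turns
print(model.where("ice cream"))            # -> freezer
```

The point is only that the state persists across interactions and stays consistent for whatever the agent does next.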
There are trillions of labor dollars that can be replaced by software. The US alone has almost $12 trillion of labor annually.
If an AI company has a 10% shot of developing a product that can replace 10% of it, they are worth $120 billion in expected value. (These numbers are obviously just for illustration).
The unprecedented numbers are a simple function of the unprecedented market size. Nobody has ever had a chance of creating trillions of dollars of economic value in a handful of years before.
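Spelling out that back-of-the-envelope expected value with the commenter's own illustrative numbers:

```python
# Back-of-the-envelope expected value, using the illustrative numbers above.
us_labor = 12e12        # ~$12 trillion of US labor annually
share_replaced = 0.10   # the product replaces 10% of that labor
prob_success = 0.10     # 10% chance the company actually pulls it off

expected_value = prob_success * share_replaced * us_labor
print(f"${expected_value / 1e9:.0f}B")  # -> $120B
```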
Payroll alone for 10 AI researchers at $300k/yr would cost over $3 million per year. And his wealth probably isn't fully liquid. Given payroll + compute, he would be bankrupt in a year. Of course he's not using just his own money.
However, I expect he will be a major investor. Most founders prefer to maintain some control.
That's not how profits work. Companies don't get paid for the value they create but for the value they can capture; otherwise the ffmpeg people would already be trillionaires.
If you have a dozen companies making the same general-purpose technology (not product), your only hope is being able to slap ads on top of it, which is why they're so keen on targeting consumers rather than trying to automate jobs.
The first ventures were funding voyages to a New World thousands of miles away, essentially a different planet as far as the people then were concerned.
Venture capital for a new B2B application is playing it safe as far as risk capital goes
The phenomenon you're seeing is well described here: "The Perfect AI Startup" (https://www.bloomberg.com/opinion/newsletters/2025-09-29/the...)
“It was the most absurd pitch meeting,” one investor who met with Murati said. “She was like, ‘So we’re doing an AI company with the best AI people, but we can’t answer any questions.’”
Despite that vagueness, Murati raised $2 billion in funding...
Other terminal apps make different tradeoffs: for example Vim virtualizes scrolling, which has tradeoffs like the scroll physics feeling non-native and lines getting fully clipped. Other apps do what Claude Code does but don’t re-render scrollback, which avoids flickering but means the UI is often garbled if you scroll up.
Tech debt isn't something that even experienced large teams are immune to. I'm not a huge TypeScript fan, so their choice to run the app on Node felt to me like a trade-off: development speed, using the experience the team already had, at the expense of long-term growth and performance. I regularly experience pretty intense flickering, rendering issues, high CPU usage, and even crashes, but that doesn't stop me from finding the product incredibly useful.
Developing good software, especially in a format that is relatively revolutionary, takes time to get right, and I'm sure whatever efforts they have internally to push forward a refactor will be worth it. But, just like in any software development, refactors are prone to timeline slips and scope creep. A company having tons of money doesn't change the nature of problem-solving in software development.
So it's made it easier for people to be taken advantage of at the grocery store etc.
When the bubble pops, and it’s very close to popping, there’s going to be a lot of burning piles of cash with no viable path to recover that money.
I'd argue it's a failure of education or general lack of intelligence. The existence of a tool to speed the process up doesn't preclude people understanding the process.
I don't think this relates as closely to AI as you seem to. I'm simply better at building things, and doing things, with AI than without. Not just faster, better. If that's not true for you, you're either using it wrong or maybe you already knew how to do everything - if so, good for you!
That kind of hallucination is somewhat acceptable for something marketed as a chatbot, less so for an assistant helping you with scientific knowledge and research.
Kimi K2 Thinking:
> As for why we chose INT4 instead of more "advanced" formats like MXFP4/NVFP4, it's indeed, as many have mentioned, to better support non-Blackwell architecture hardware.
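For readers who haven't seen these formats, here is a minimal, generic sketch of symmetric per-tensor INT4 quantization, just to show what the family of formats in that quote is doing; it is not Kimi K2's actual quantization scheme.

```python
import numpy as np

# Generic symmetric INT4 quantization sketch (integer values in [-8, 7]).
# Purely illustrative; not the scheme any particular model actually ships.

def quantize_int4(weights):
    """Map float weights to 4-bit integers with a single scale factor."""
    scale = np.abs(weights).max() / 7.0  # 7 is the largest positive int4 value
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights; cheap on any GPU generation."""
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, s = quantize_int4(w)
print("max abs error:", np.abs(w - dequantize_int4(q, s)).max())
```

The appeal the quote points at is that plain integer formats like this dequantize cheaply everywhere, whereas MXFP4/NVFP4 get their main benefit from hardware support that, to my knowledge, only arrives with Blackwell-class GPUs.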
It gets lost on people in tech-centric fields because Claude's at the forefront of things we care about, but Anthropic is basically unknown among the wider populace.
Last I'd looked a few months ago, Anthropic's brand awareness was in the middle single digits; OpenAI/ChatGPT was somewhere around 80% for comparison. MS/Copilot and Gemini were somewhere between the two, but closer to OpenAI than Anthropic.
tl;dr - Anthropic has a lot more to gain from awareness campaigns than the other major model providers do.
Meta chief AI scientist Yann LeCun plans to exit and launch own startup - https://news.ycombinator.com/item?id=45886217 - Nov 2025 (14 comments)
That thread didn't spend any time on the frontpage so we can treat the current post as non-dupe.
* Sign in with Apple on the website
* Buy subscriptions from iOS In App Purchases
* Remove our payment info from our account before the inevitable data breach
* Give paying subscribers an easy way to get actual support
As a frequent traveller I'm not sure if some of those features are gated by region, because some people said they can do some of those things; but if that is true, it still makes the UX worse than the competitors'.
However, I speak with a small subset of our most experienced engineers and they all love Claude Sonnet 4.5. Who knows if this lead will last.
Musk cares about AI research as much as he cared about Path of Exile
https://www.vox.com/podcasts/467048/unexplainable-hearing-au...
I don't see what the basis for this is that wouldn't be equally true for OpenAI.
Anthropic's edge is that they very arguably have some of the best technology available right now, despite operating at a fraction of the scale of their direct competitors. They have to start building mind and marketshare if they're going to hold that position, though, which is the point of advertising.
Interestingly, Yoshua Bengio is the only one who hasn't given in to industry, even though he could easily raise a lot of money.
The reality is that while LLMs can make mistakes mid-output, those interim mistakes don't necessarily detract from the model's final output. We see a version of this all the time with agents as they make tactical mistakes but quickly backtrack and ultimately solve the root problem.
It really felt like LeCun was willing to die on this hill. He continued to argue about really pedantic things like the importance of researchers, etc.
I'm glad he's gone and hopeful Meta can actually deliver real AI products for their users with better leadership.