If you think LLMs are not the future then you need to come with something better
If you have a theoretical idea that's great, but take to at least GPT2 level first before writing off LLMs
Theoretical people love coming up with "better ideas" that fall flat or have hidden gotchas when they get to practical implementation
As Linus says, "talk is cheap, show me the code".
Are all critiques of the obvious decline in physical durability of American-made products invalid unless they figure out a solution to the problem? Or may critics of a subject exist without necessarily being accredited engineers themselves?
If you want to predict future text, you use an LLM. If you want to predict future frames in a video, you go with Diffusion. But what both of them lack is object permanence. If a car isn't visible in the input frame, it won't be visible in the output. But in the real world, there are A LOT of things that are invisible (image) or not mentioned but only implied (text) that still strongly affect the future. Every kid knows that when you roll a marble behind your hand, it'll come out on the other side. But LLMs and Diffusion models routinely fail to predict that, as for them the object disappears when it stops being visible.
Based on what I heard from others, world models are considered the missing ingredient for useful robots and self-driving cars. If that's halfway accurate, it would make sense to pour A LOT of money into world models, because they will unlock high-value products.
Corporate R&D teams are there to absorb risk, innovate, disrupt, create new fields, not for doing small incremental improvements. "If we know it works, it's not research." (Albert Einstein)
I also agree with LeCun that LLMs in their current form - are a dead end. Note that this does not mean that I think we have already exploited LLMs to the limit, we are still at the beginning. We also need to create an ecosystem in which they can operate well: for instance, to combine LLMs with Web agents better we need a scalable "C2B2C" (customer delegated to business to business) micropayment infrastructure, because as these systems have already begun talking to each other, in the longer run nobody would offer their APIs for free.
I work on spatial/geographic models, inter alia, which by coincident is one of the direction mentioned in the LeCun article. I do not know what his reasoning is, but mine was/is: LMs are language models, and should (only) be used as such. We need other models - in particular a knowledge model (KM/KB) to cleanly separate knowledge from text generation - it looks to me right now that only that will solve hallucination.
Messing with the logic in the loop and combining models has an enormous potential, but it's more engineering than researching, and it's just not the sort of work that LeCun is interested in. I think the conflict lies there, that Facebook is an engineering company, and a possible future of AI lies in AI engineering rather than AI research.
Maybe at university, but not at a trillion dollar company. That job as chief scientist is leading risky things that will work to please the shareholders.
And while we've been able to approximate the world behind the words, it's just full of hallucinations because the AI's lack axiomatic systems beyond much manually constructed machinery.
You can probably expand the capabilties by attaching to the front-end but I suspect that Yann is seeing limits to this and wants to go back and build up from the back-end of world reasoning and then _among other things_ attach LLM's at the front-end (but maybe on equal terms with vision models that allows for seamless integration of LLM interfacing _combined_ with vision for proper autonomous systems).
Everything from the sorites paradox to leaky abstractions; everything real defies precise definition when you look closely at it, and when you try to abstract over it, to chunk up, the details have an annoying way of making themselves visible again.
You can get purity in mathematical models, and in information systems, but those imperfectly model the world and continually need to be updated, refactored, and rewritten as they decay and diverge from reality.
These things are best used as tools by something similar to LLMs, models to be used, built and discarded as needed, but never a ground source of truth.
The last time LeCun disagreed with the AI mainstream was when he kept working on neural net when everyone thought it was a dead end. He might be entirely right in his LLM scepticism. It's hardly a surefire path. He didn't prevent Meta from working on LLM anyway.
The issue is more than his position is not compatible with short term investors expectations and that's fatal in a company like Meta at the position LeCun occupies.
Yes but he was hired in the ZIRP era where all SV companies were hiring every opinionated academic and giving them free reign and unlimited money to burn in the hopes that maybe they'll create the next big thing for them eventually.
These are very different economic times right now, after the FED infinite money glitch has been patched out, so now people do need to adjust to them and start actually making some products of value for their seven figure costs to their employers, or end up being shown the door.
Also, like… it’s Facebook. It has a history of ploughing billions into complete nonsense (see metaverse). It is clearly not particularly risk averse.
Cracking that is a huge step, pure multi-modal trained models will probably give us a hint, but I think we're some ways from seeing a pure multi-modal open model which can be pulled apart/modified. Even then they're still train and deploy not dynamically learning. I worry we're just going to see LSTM design bolted onto deep LLM because we don't know where else to go and it will be fragile and take eons to train.
And less said about the crap of "but inference is doing some kind of minimization within the context window" the better, it's vacuous and not where great minds should be looking for a step forwards.
Oh god, that is massively under-selling their learning ability. These models are able to extract and reply with why jokes are funny without even knowing basic vocab, yet there are pure-code models out there with lingual rules baked in from day one which still struggle with basic grammar.
The _point_ of LLMs arguably is there ability to learn any pattern thrown at it with enough compute. With an exception to learning how logical processes work, and pure LLMs only see "time" in the sense of a paragraph begins and ends.
At the least they have taught computers, "how to language", which in regards to how to interact with a machine is a _huge_ step forward.
Unfortunately the financial incentives are split between agentic model usage (taking the idea of a computerised butler further), maximizing model memory and raw learning capacity (answering all problems at any time), and long-range consistency (longer ranges give better stable results due to a few reasons, but we're some way from seeing an LLM with a 128k experts and 10e18 active tokens).
I think in terms of building the perfect monkey butler we already have most or all of the parts. With regard to a model which can dynamically learn on the fly... LLMs are not the end of the story and we need something to allow the models to more closely tie their LS with the context. Frankly the fact that DeepSeek gave us an LLM with LS was a huge leap since previous model attempts had been overly complex and had failed in training.
In the software development world yes, outside of that, virtually none. Yes, you can transcribe a video call in Office, yes, but that's not ground breaking. I dare you to list 10 impacts on different fields, excluding tech and including at least half blue collar fields and at least half white collar fields , at different levels from the lowest to the highest in the company hierarchy, that LLM/Diffusion models are having. Impact here specifically means a significant reduction of costs or a significant increase of revenue. Go on
Even when writing, it shifts the mental burden from an easy thing (writing code) to a very hard thing (reading that code, validating it's right, hallucination free, and then refactoring it to match your teams code style and patterns).
It's great for building a first-order approximation of a tech demo app that you then throw out and build from scratch, and auto-complete. In my experience, anyways. I'm sure others have had different experiences.
Starting with the sophomoric questions of the optimist who mistakes the possible for the viable: how definite of a thing is "the world", how knowable is it, what is even knowledge... and then back through the more pragmatic: by whom is it knowable, to what degree, and by what means. The mystics: is "the world" the same thing as "the sum of information about the world"? The spooks: how does one study those fields of information which are already agentic and actively resist being studied by changing themselves, such as easily emerge anywhere more than n(D) people gather?
Plenty of food for thought from why ontologies are/aren't a thing. The classical example of how this plays out in the market being search engines winning over internet directories. But that's one turn of the wheel. Look at what search engines grew into quarter century later. What their outgrowths are doing to people's attitude towards knowledge. Different timescale, different picture.
Fundamentally, I don't think human language has sufficient resolution to model large spans of reality within the limited human attention span. The physical limits of human language as information processing device have been hit at some point in the XX century. Probably that 1970s divergence between productivity and wages.
So while LLMs are "computers speak language now" and it's amazing if sad that they cracked it by more data and not by more model, what's more amazing is how many people are continually ready to mistake language for thought. Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?
Practically, what we lack is not the right architecture for "big knowing machine", but better tools for ad-hoc conceptual modeling of local situations. And, just like poetry that rhymes, this is exactly what nobody has a smidgen of interest to serve to consumers, thus someone will just build it in their basement in the hope of turning the tables on everyone. Probably with the help of LLMs as search engines and code generators. Yall better hurry. They're almost done.
I don't disagree that the world is full of fuzziness. But the problem I have with this portrayal is that formal models are often normative rather than analytical. They create reality rather than being an interpretation or abstraction of reality.
People may well have a fuzzy idea of how their credit card works, but how it really works is formally defined by financial institutions. And this is not just true for software products. It's also largely true for manufactured products. Our world is very much shaped by artifacts and man-made rules.
Our probabilistic, fuzzy concepts are often simply a misconception. That doesn't mean it's not important of course. It is important for an AI to understand how people talk about things even if their idea of how these things work is flawed.
And then there is the sort of semi-formal language used in legal or scientific contexts that often has to be translated into formal models before it can become effective. Law makers almost never write algorithms (when they do, they are often buggy). But tax authorities and accounting software vendors do have to formally model the language in the law and then potentially change those formal definitions after court decisions.
My point is that the way in which the modeled, formal world interacts with probabilistic, fuzzy language and human actions is complex. In my opinion we will always need both. AIs ultimately need to understand both and be able to combine them just like (competent) humans do. AI "tool use" is a stop-gap. It's not a sufficient level of understanding.
No, I think hes suggesting that "world models" are more impactful. The issue for him inside meta is that there is already a research group looking at that, and are wildly more successful (in terms of getting research to product) and way fucking cheaper to run than FAIR.
Also LeCun is stuck weirdly in product land, rather than research (RL-R) which means he's not got the protection of Abrash to isolate him from the industrial stupidity that is the product council.
How did you determine that "surefire paths to success still available"? Most academics agree that LLMs (or LLMs alone) are not going to lead us to AGI. How are you so certain?
Not that I believe AGI is the measure of success, there's probably much more efficient ways to achieve company goals than simulating humans.
> Our probabilistic, fuzzy concepts are often simply a misconception.
How eg a credit card works today is defined by financial institutions. How it might work tomorrow is defined by politics, incentives, and human action. It's not clear how to model those with formal language.
I think most systems we interact with are fuzzy because they are in a continual state of change due to the aforementioned human society factors.
This is something that was true last year, but hanging on by a thread this year. Genie shows this off really well, but it's also in the video models as well.[1]
[1]https://storage.googleapis.com/gdm-deepmind-com-prod-public/...
But ultimately I agree with you that this entire societal process is just categorically different. It's simply not a description or definition of something, and therefore the question of how formal it can be doesn't really make sense.
Formalisms are tools for a specific but limited purpose. I think we need those tools. Trying to replace them with something fuzzy makes no sense to me either.
> how many people are continually ready to mistake language for thought
This is a fundamental illusion - where, rote memory and names and words get mistaken for understanding. This was wonderfully illustrated here [1]. Few really grok what understanding actually is. This is an unfortunate by-product of our education system.
> Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?
Brilliant way to state the fundamental human condition. ie, we are all zombies conditioned to imitate rather than understand. Social media amplifies the zombification, and now LLMs do that too.
> Starting with the sophomoric questions of the optimist who mistakes the possible for the viable
This is the fundamental tension between operationalized meaning and imagination. A grokking soul gathers mists from the cosmic chaos and creates meaning and operationalizes it for its own benefit and then continually adapts it.
> it's amazing if sad that they cracked it by more data and not by more model
I was speaking to experts in the sciences (chemistry). They were shocked that the underlying architecture is brute force. They expected a compact information-compressed theory which is able to model independent of data. The problem with brute-force approaches are that they dont scale, and dont capture the essences which are embodied in theories.
> The physical limits of human language as information processing device have been hit at some point in the XX century
2000 years back when humans realized that formalism was needed to operationalize meaning, and natural language was too vague to capture and communicate it. Because the world model that natural language captures encompasses "everything" whereas for making it "useful" requires to limit it via formalism.
And let's not speak about those so deep into sloth that put it into use to deteriorate, and not augment as they claim to do, humane creative recreational activities.
Talking to these people is exhausting, so I cut straight to the chase: name the exact, unavoidable conditions that would prove AGI won’t happen.
Shockingly, nobody has an answer. They’ve never even considered it.
That’s because their whole belief is unfalsifiable.
How many decades did it take for neural nets to take off?
The reason we're even talking about LeCun today is because he was early in seeing the promise of neural nets and stuck with it through the whole AI winter when most people thought it was a waste of time.
The problem isn't LLMS, the problem is that everyone is trying to build bigger/better llms or manually code agents around LLMs. Meanwhile, projects like Mu Zero are forgotten, despite being vastly more important for things like self driving.