129 points jxmorris12 | 1 comment
quanto ◴[] No.43131490[source]
> And years later, we’re still not quite at FSD. Teslas certainly can’t drive themselves; Waymos mostly can, within a pre-mapped area, but still have issues and intermittently require human intervention.

This is a bit unfair to Waymo as it is near-fully commercial in cities like Los Angeles. There is no human driver in your hailed ride.

> But this has turned out to be wrong. A few new AI systems (notably OpenAI o1/o3 line and Deepseek R1) contradict this theory. They are autoregressive language models, but actually get better by generating longer outputs:

The arrow of causality is flipped here. Longer outputs do not make a model better; a better model can produce longer outputs without being derailed. The referenced graph from DeepSeek doesn't prove what the author claims. Since this argument is one of the key points of the article, this logical error is a serious one.

> He presents this problem of compounding errors as a critical flaw in language models themselves, something that can’t be overcome without switching away from the current autoregressive paradigm.

LeCun is a bit reductive here (understandably, as it was a talk for a live audience). Indeed, autoregressive algorithms can go astray because previous errors do not get corrected, or worse, accumulate. However, an LLM is not autoregressive in the customary sense: it is not like a streaming algorithm (O(n)) used in time-series forecasting. LLMs have attention mechanisms and large context windows, making the algorithm at least quadratic, depending on the implementation. In other words, an LLM can backtrack if the current path is off and start afresh from a previous point of its choice, not just the last output. So, yes, the author is making a valid point here, but the technical details were missing. On a minor note, the non-error probability in LeCun's slide actually relies on a non-autoregressive (independence) assumption. He seems to be contradicting himself in the very same slide.
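For concreteness, LeCun's slide models a per-token error probability e, so an n-token output is error-free with probability (1 - e)^n. A minimal sketch (with hypothetical numbers; e = 0.01 is illustrative, not from the slide) shows how fast that probability collapses under the independence assumption being disputed above:

```python
# LeCun's compounding-error argument, as commonly stated:
# each generated token independently errs with probability e,
# so an n-token output is error-free with probability (1 - e)^n.
# The independence assumption is the contested part: attention over
# the full context means later tokens are not independent of earlier ones.

def p_error_free(e: float, n: int) -> float:
    """Probability an n-token output has no errors, assuming independence."""
    return (1.0 - e) ** n

for n in (10, 100, 1000):
    print(n, p_error_free(0.01, n))
```

Under this model the probability of a correct output decays exponentially in length, which is the claimed critical flaw; the counterargument in this thread is that the independence premise itself does not hold for attention-based models.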

I actually agree with the author on the overarching thesis. There is almost a fetishization of AGI and humanoid robots. There are plenty of interesting applications well before those things are accomplished. The correct focus, IMO, should be measurable economic benefits, not sci-fi terms (although I concede these grandiose visions can be beneficial for fundraising!).

replies(2): >>43131560 #>>43133562 #
wigglin ◴[] No.43131560[source]
It's not true that Waymo is fully autonomous. It's been revealed that they maintain human "fleet response" agents to intervene in their operations. They have not revealed how often these human agents intervene, possibly because it would undermine their branding as fully autonomous.
replies(3): >>43131624 #>>43131771 #>>43132645 #
jxmorris12 ◴[] No.43131624[source]
Yeah, this is exactly my point. The miles-driven-per-intervention (or whatever you want to call it) has gone way up, but interventions still happen all the time. I don't think anyone expects the number of interventions to drop to zero any time soon, and this certainly doesn't seem to be a barrier to Waymo's expansion.