It basically takes the image-generation approach of progressively refining the entire output at once and applies it to text. It can self-correct mid-process.
The blog post where I originally found it, which goes into more detail and raises some issues with it: https://timkellogg.me/blog/2025/02/17/diffusion
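To make the idea concrete, here's a toy sketch (my own, not the post's actual algorithm) of that unmask-and-revise loop; `predict` is a stand-in for a trained denoiser:

```python
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
MASK = "<mask>"

def predict(tokens):
    # Stand-in for a trained denoiser: proposes a (token, confidence)
    # pair for every position. A real model would condition on the
    # whole current sequence, masked slots included.
    return [(random.choice(VOCAB), random.random()) for _ in tokens]

def diffusion_generate(length=6, steps=5):
    seq = [MASK] * length
    for step in range(steps):
        proposals = predict(seq)
        # Every position is re-predicted on every pass; only the k most
        # confident proposals are written, the rest stay masked and are
        # revisited, so an early bad choice can still be revised later.
        by_conf = sorted(range(length), key=lambda i: -proposals[i][1])
        k = length * (step + 1) // steps
        keep = set(by_conf[:k])
        seq = [proposals[i][0] if i in keep else MASK
               for i in range(length)]
    return seq

print(" ".join(diffusion_generate()))
```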
as a result of the whole learning process, the toddler in particular learns how to self-correct, i.e. as a grown-up s/he knows, without much trial and error anymore, how to continue in a straight line if the previous step went sideways for whatever reason
>An LLM using autoregressive inference can only compound errors.
That is a pretty strong statement, completely dismissing the possibility that some self-correction may be emerging there.
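For contrast, here is the autoregressive loop in the same toy style (again mine, purely illustrative): each token is frozen into the conditioning context the moment it is sampled, and no later step ever revisits it, which is the sense in which errors are said to compound:

```python
import random

def model_step(prefix):
    # Stand-in for next-token sampling conditioned on the prefix.
    return random.choice(["the", "cat", "sat", "on", "a", "mat"])

def autoregressive_generate(length=6):
    seq = []
    for _ in range(length):
        # The model sees only the frozen prefix; once seq[0] is
        # written, nothing in this loop can go back and change it.
        seq.append(model_step(seq))
    return seq

print(" ".join(autoregressive_generate()))
```

Whether training can still instill a learned tendency to steer back on course within that constraint is exactly the question.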
The metric may include, say, the weight/density of an attracting cluster of facts - somewhat like gravitation drives matter around in the Universe. LLM training can then be thought of as pre-distributing matter in its own very high-dimensional Universe according to a semantic "gravitational" field.
The resulting - emergent - metric and associated geometry are currently mind-bogglingly incomprehensible, and even in much, much simpler, single-digit-dimensional spaces, systems like the ones LeCun describes can still be [quasi]stable and/or [quasi]periodic around, say, some attractor(s).
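To give a concrete single-digit-dimensional example of what "periodic around an attractor" means (my own toy, nothing from LeCun): the logistic map at r=3.2 falls onto a stable period-2 cycle from almost any starting point in (0, 1):

```python
# Logistic map: x -> r * x * (1 - x)
def logistic(x, r=3.2):
    return r * x * (1.0 - x)

traj = [0.123]  # arbitrary initial condition
for _ in range(100):
    traj.append(logistic(traj[-1]))

# The tail alternates near 0.513045 and 0.799455: the trajectory has
# been pulled onto a periodic attractor regardless of where it began.
print([round(v, 6) for v in traj[-4:]])
```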