The next-gen LLMs are going to use something like mipmaps in graphics: a stack of progressively smaller versions of the image, with a 1x1 image at the top. The same concept applies to text. When you're writing something, your have a high-level idea in mind that serves as a guide. That idea is such a mipmap. Perhaps the next-gen LLMs will be generating a few parallel sequencies, the top-level will be a slow-pace anchor and the bottom-level being the actual text that depends on slower upper levels.