> You won't get an LLM outputting "wait, that's not right" halfway through their original output
No, that's one contiguous response from the LLM. I have screenshots, because I was so surprised the first time. I've had it happen many times. This was (as I always use LLMs) via direct API calls. The first time it happened was with the largest Llama 3.5 model. It usually happens only in one-shot settings: no prior context, and a base/empty system prompt.
> LLMs don't exhibit such an inner feedback loop
That's not true at all. Next-token prediction is conditioned on all of the preceding text, including the word that was just produced. Within a single response, the model uses what it has already said to decide what it says next, just as a Markov chain conditions on its prior output.
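
A minimal sketch of that feedback loop, assuming a hypothetical `model.next_token_logits(tokens)` scoring function (the names are illustrative, not a real API):

```python
def generate(model, prompt_tokens, max_new_tokens=50):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # The model sees the prompt *plus* every token it has already emitted,
        # so earlier output directly conditions later output within one response.
        logits = model.next_token_logits(tokens)
        next_tok = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
        tokens.append(next_tok)
    return tokens
```

The key point is the `tokens.append(next_tok)` step: each new token is fed back into the context for the next prediction, which is exactly the kind of inner feedback loop being denied above.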