When the goal is to have the LLM output an "idea", not just a "next token", selecting from the logits vector should break that original idea... If the idea is complete, there should be no need for sampling over the logits.
The sampling, in this framework, should not happen near the output level ("what will the next spoken word be").
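To make concrete what "sampling over the logits" at the output level means: a minimal sketch of per-token temperature sampling, in plain Python (the function name and setup are illustrative, not from any specific LLM library). This is the token-by-token randomness the comment argues against applying to a supposedly complete idea.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    # Softmax with temperature: scale the logits, exponentiate, normalize.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index weighted by its probability -- this choice
    # happens at every single token, i.e. at the output level.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

As temperature approaches zero this collapses to greedy argmax; higher temperatures let lower-logit tokens win, which is exactly the per-word perturbation being criticized.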