I'm a complete novice when it comes to LLMs, but wouldn't a confined model (no internet access) eventually just start repeating itself on each consecutive run, or is entropy enough for it to produce endless creativity?
replies(3):
At each step, the model doesn't output text directly; it produces a probability distribution over possible next tokens. To make this distribution useful, the software samples a token from it. I'm simplifying here, but how strongly the sampling favors the most probable next token is controlled by the temperature setting. A temperature of 0 means that (in theory) it'll always choose the most probable token, making the output deterministic. A non-zero temperature means that it will sometimes choose less likely tokens, so the same prompt can produce different results on each run.
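If it helps to see it concretely, here's a rough sketch of temperature sampling in plain Python. The three-word vocabulary, the scores, and the `sample_token` helper are all made up for illustration; real inference libraries do this over tens of thousands of tokens and usually add other tricks on top.

```python
import math
import random

def sample_token(logits, temperature, rng=random):
    """Return the index of the chosen token (hypothetical helper)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1)
    # the distribution before it is turned into sampling weights.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Toy example: made-up scores for three candidate next tokens.
vocab = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.1]

greedy = [vocab[sample_token(logits, 0)] for _ in range(5)]
sampled = [vocab[sample_token(logits, 1.0)] for _ in range(5)]
print("temperature 0:  ", greedy)
print("temperature 1.0:", sampled)
```

With temperature 0 every draw is the top-scoring token, while with temperature 1.0 the lower-scoring tokens turn up some of the time, which is where the run-to-run variation comes from.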
Hope this helps.