724 points simonw | 6 comments
xnx ◴[] No.44527256[source]
> It’s worth noting that LLMs are non-deterministic,

This is probably better phrased as "LLMs may not provide consistent answers due to changing data and built-in randomness."

Barring rare(?) GPU race conditions, LLMs produce the same output given the same inputs.

replies(7): >>44527264 #>>44527395 #>>44527458 #>>44528870 #>>44530104 #>>44533038 #>>44536027 #
simonw ◴[] No.44527395[source]
I don't think those race conditions are rare. None of the big hosted LLMs provide a temperature=0 plus fixed seed feature which they guarantee won't return different results, despite clear demand for that from developers.
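One commonly cited reason the order of parallel work can change a result is that floating-point addition is not associative. The sketch below is a CPU-only illustration of that single effect, not a reproduction of any GPU kernel or hosted LLM; the helper `sum_in_order` and the input values are made up for the example.

```python
def sum_in_order(values, order):
    """Accumulate values in the given index order, the way one particular
    scheduling of a parallel reduction might happen to combine them."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Hypothetical inputs chosen so that rounding is visible in double precision.
values = [1e16, 1.0, -1e16, 1.0]

# Same numbers, two accumulation orders.
forward = sum_in_order(values, [0, 1, 2, 3])    # 1.0 is absorbed by 1e16 and lost
reordered = sum_in_order(values, [0, 2, 1, 3])  # large terms cancel first

print(forward, reordered)
```

If a reduction's order depends on thread timing or on how requests are batched, differences like this can nudge two near-tied logits past each other, so even argmax decoding may diverge.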
replies(3): >>44527634 #>>44529574 #>>44529823 #
1. xnx ◴[] No.44527634[source]
Fair. I dislike "non-deterministic" as a blanket descriptor for all LLMs, since it implies some type of magic or quantum effect.
replies(4): >>44527956 #>>44528597 #>>44528690 #>>44529070 #
2. dekhn ◴[] No.44527956[source]
I see LLM inference as sampling from a distribution. Multiple details go into that sampling: everything from parameters like temperature, to numerical imprecision, to batch mixing effects, as well as the next-token-selection approach (always pick the max, sample from the posterior distribution, etc.). But ultimately, if it were truly important to get stable outputs, everything I listed above could be engineered (temp=0, very good numerical control, no batching, and always picking the max-probability next token).

dekhn from a decade ago cared a lot about stable outputs. dekhn today thinks sampling from a distribution is a far more practical approach for nearly all use cases. I could see stable outputs mattering when the false negative rate of a medical diagnostic exceeded a reasonable threshold.
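The knobs listed above can be sketched in a few lines. This is a toy next-token picker under stated assumptions, not any vendor's decoding code: the logits are invented, `pick_token` is a hypothetical helper, and temperature 0 is treated as "always take the max", which is a common convention rather than a guarantee about hosted APIs.

```python
import math
import random


def softmax(logits, temperature):
    """Turn logits into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, temperature, rng):
    """Greedy argmax when temperature is 0, otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(list(range(len(logits))), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5]  # made-up scores for a three-token vocabulary

# Greedy decoding ignores the random source entirely.
greedy = [pick_token(logits, 0, random.Random()) for _ in range(5)]

# Sampling with the same seed replays the same sequence of draws.
rng_a = random.Random(42)
rng_b = random.Random(42)
run_a = [pick_token(logits, 1.0, rng_a) for _ in range(20)]
run_b = [pick_token(logits, 1.0, rng_b) for _ in range(20)]

print(greedy, run_a == run_b)
```

In this sketch the only randomness is the explicit `rng`, so fixing the seed or using temperature 0 makes the output repeatable. It deliberately leaves out the numerical and batching effects mentioned above, which are the harder part to control in a real serving stack.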

3. basch ◴[] No.44528597[source]
I agree it's phrased poorly.

Better said would be: LLMs are designed to act as if they were non-deterministic.

replies(1): >>44528792 #
4. tanewishly ◴[] No.44528690[source]
Errr... that word implies some type of non-deterministic effect. Like using a randomizer without specifying the seed (i.e., sampling from a distribution). I mean, stuff like NFAs (non-deterministic finite automata) isn't magic.
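As a small illustration of the automata-theory sense of the word, here is a sketch of a toy NFA over the alphabet {a, b} that accepts strings ending in "ab". The automaton, its state numbering, and the function name `nfa_accepts` are all invented for this example. The machine is "non-deterministic" because state 0 has two possible moves on 'a', yet simulating it involves no randomness at all: the code just tracks the set of states it could be in.

```python
# Hypothetical NFA: state 0 loops on any symbol, and may also guess that an
# 'a' starts the final "ab" by moving to state 1; state 2 is accepting.
START = 0
ACCEPTING = {2}
TRANSITIONS = {
    (0, "a"): {0, 1},  # two choices on the same input: the non-determinism
    (0, "b"): {0},
    (1, "b"): {2},
}


def nfa_accepts(text):
    """Return True if some path through the NFA ends in an accepting state."""
    current = {START}
    for ch in text:
        # Follow every possible transition from every currently possible state.
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, ch), set())
        }
    return bool(current & ACCEPTING)


for word in ["ab", "aab", "aba", "b", ""]:
    print(repr(word), nfa_accepts(word))
```

The same input always yields the same answer, which is the sense in which NFAs are not magic.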
5. ◴[] No.44528792[source]
6. EdiX ◴[] No.44529070[source]
Interesting, but in general it does not imply that. For example: https://en.wikipedia.org/wiki/Nondeterministic_finite_automa...