
724 points simonw | 1 comment | source
xnx ◴[] No.44527256[source]
> It’s worth noting that LLMs are non-deterministic,

This is probably better phrased as "LLMs may not provide consistent answers due to changing data and built-in randomness."

Barring rare(?) GPU race conditions, LLMs produce the same output given the same inputs.

replies(7): >>44527264 #>>44527395 #>>44527458 #>>44528870 #>>44530104 #>>44533038 #>>44536027 #
llm_nerd ◴[] No.44533038[source]
That non-deterministic claim, along with the rather ludicrous claim that this is all just some accidental self-awareness of the model (rather than Elon clearly and obviously sticking his fat fingers into the machine), makes the linked piece technically dubious.

A baked LLM is 100% deterministic. It is a straightforward set of matrix algebra with a perfectly deterministic output at a base state. There is no magic quantum mystery machine happening in the model. We add randomization -- the seed or temperature -- as a value-add, to randomize the outputs with the intention of giving creativity. So while it might be true that "in the customer-facing default state an LLM gives non-deterministic output", this is not some base truth about LLMs.
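To illustrate where that randomization enters: the forward pass produces a fixed logit vector, and temperature and seed only affect the sampling step afterwards. A toy sketch (not any particular model's code):

```python
import math
import random

def sample(logits, temperature=1.0, seed=None):
    """Softmax-with-temperature sampling over a logit vector."""
    rng = random.Random(seed)
    if temperature == 0:
        # Greedy decoding: fully deterministic given the logits.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.1]
print(sample(logits, temperature=0))            # always index 0
print(sample(logits, temperature=1.0, seed=7))  # reproducible for a fixed seed
```

With temperature 0 the output is the argmax every time; with a fixed seed the sampled output is reproducible, which is the "deterministic at a base state" claim above.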

replies(1): >>44533633 #
simonw ◴[] No.44533633[source]
LLMs work using huge amounts of matrix multiplication.

Floating point multiplication is non-associative:

  a = 0.1, b = 0.2, c = 0.3
  a * (b * c) = 0.006
  (a * b) * c = 0.006000000000000001
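You can verify those exact values in any Python shell:

```python
a, b, c = 0.1, 0.2, 0.3

# IEEE 754 rounds after every operation, so the grouping changes the result.
left = a * (b * c)
right = (a * b) * c

print(left)           # 0.006
print(right)          # 0.006000000000000001
print(left == right)  # False
```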
Almost all serious LLMs are deployed across multiple GPUs and have operations executed in batches for efficiency.

As such, the order in which those multiplications are run depends on all sorts of factors. There are no guarantees of operation order, which means non-associative floating point operations play a role in the final result.
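The effect is easy to reproduce on a CPU; here the two groupings stand in for the different reduction orders a batched GPU kernel can produce:

```python
big, small = 1e16, 1.0

# Adding the small values one at a time loses each one to rounding;
# adding them together first lets their sum survive.
one_order = (big + small) + small
other_order = big + (small + small)

print(one_order)                 # 1e+16
print(other_order)               # 1.0000000000000002e+16
print(one_order == other_order)  # False
```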

This means that, in practice, most deployed LLMs are non-deterministic even with a fixed seed.

That's why vendors don't pair their seed parameters with a promise of deterministic output - because that's a promise they cannot keep.

Here's an example: https://cookbook.openai.com/examples/reproducible_outputs_wi...

> Developers can now specify seed parameter in the Chat Completion request to receive (mostly) consistent outputs. [...] There is a small chance that responses differ even when request parameters and system_fingerprint match, due to the inherent non-determinism of our models.

replies(2): >>44534555 #>>44536746 #
1. ◴[] No.44534555[source]