Grok: Searching X for "From:Elonmusk (Israel or Palestine or Hamas or Gaza)"

(simonwillison.net)

725 points simonw | 1 comments | 11 Jul 25 00:22 UTC

xnx ◴[11 Jul 25 00:34 UTC] No.44527256[source]▶

> It’s worth noting that LLMs are non-deterministic,

This is probably better phrased as "LLMs may not provide consistent answers due to changing data and built-in randomness."

Barring rare(?) GPU race conditions, LLMs produce the same output given the same inputs.
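
For context on why accumulation order matters at all: floating-point addition is not associative, so if a race changes the order in which partial sums are combined, the result can differ. A minimal CPU-only sketch in Python (the helper name `left_to_right_sum` is made up for illustration; this is not a GPU kernel, only the arithmetic effect):

```python
def left_to_right_sum(values):
    """Add floats strictly in the order given, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two different accumulation orders.
order_a = left_to_right_sum([1e16, 1.0, -1e16])  # the 1.0 is lost to rounding
order_b = left_to_right_sum([1e16, -1e16, 1.0])  # the big terms cancel first

print(order_a, order_b)
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
```

An explicit loop is used instead of the built-in `sum`, since newer Python versions may apply compensated summation to floats and hide the effect.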

replies(7): >>44527264 #>>44527395 #>>44527458 #>>44528870 #>>44530104 #>>44533038 #>>44536027 #

simonw ◴[11 Jul 25 00:58 UTC] No.44527395[source]▶

>>44527256 #

I don't think those race conditions are rare. None of the big hosted LLMs provide a temperature=0 plus fixed seed feature which they guarantee won't return different results, despite clear demand for that from developers.

replies(3): >>44527634 #>>44529574 #>>44529823 #

xnx ◴[11 Jul 25 01:41 UTC] No.44527634[source]▶

>>44527395 #

Fair. I dislike "non-deterministic" as a blanket descriptor for all LLMs, since it implies some type of magic or quantum effect.

replies(4): >>44527956 #>>44528597 #>>44528690 #>>44529070 #

1. dekhn ◴[11 Jul 25 02:42 UTC] No.44527956[source]▶

>>44527634 #

I see LLM inference as sampling from a distribution. Multiple details go into that sampling: parameters like temperature, numerical imprecision, batch mixing effects, and the next-token-selection approach (always pick max, sample from the posterior distribution, etc). But ultimately, if it were truly important to get stable outputs, everything I listed above can be engineered for (temp=0, very good numerical control, not batching, and always picking the max-probability next token).
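
A rough sketch of the two selection strategies, in plain Python with toy logits. The function names (`softmax`, `pick_greedy`, `pick_sampled`) and the numbers are made up for illustration; a real inference stack does this over a full vocabulary on the GPU, and this sketch ignores batching and numerical effects entirely:

```python
import math
import random

def softmax(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_greedy(logits):
    """Always pick the max-probability token (what temp=0 means in practice)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def pick_sampled(logits, temperature, rng):
    """Sample a token index from the temperature-scaled distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.5, 0.3]  # toy scores for a three-token vocabulary

# Greedy selection gives the same token every time for the same logits.
greedy_picks = [pick_greedy(logits) for _ in range(5)]

# Sampling varies between draws, but is reproducible when the seed is fixed.
rng_a = random.Random(42)
rng_b = random.Random(42)
run_a = [pick_sampled(logits, 1.0, rng_a) for _ in range(20)]
run_b = [pick_sampled(logits, 1.0, rng_b) for _ in range(20)]

print(greedy_picks)
print(run_a == run_b)
```

Even with greedy selection, the logits themselves have to be bit-identical between runs, which is where the numerical control and batching caveats come in.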

dekhn from a decade ago cared a lot about stable outputs. dekhn today thinks sampling from a distribution is a far more practical approach for nearly all use cases. I could see it mattering when the false negative rate of a medical diagnostic exceeded a reasonable threshold.