    3337 points keepamovin | 15 comments
    jll29 ◴[] No.46216933[source]
    AI professor here. I know this page is a joke, but in the interest of accuracy, a terminological comment: we don't call it a "hallucination" when a model complies with what a prompt asked for and produces a prediction exactly as requested.

    Rather, "hallucinations" are spurious replacements of factual knowledge with fictional material caused by the use of a statistical process (the pseudo-random number generator used with the "temperature" parameter of neural transformers): token prediction without meaning representation.

    [typo fixed]
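    For concreteness, the "statistical process" being referred to looks roughly like this (a minimal NumPy sketch with a made-up four-token vocabulary and made-up logits, not taken from any real model):

        import numpy as np

        vocab = ["Paris", "Lyon", "Berlin", "Rome"]   # toy vocabulary (assumption)
        logits = np.array([3.1, 1.2, 0.7, 0.4])       # toy next-token scores (assumption)

        def sample_next_token(logits, temperature, rng):
            # Temperature-scaled softmax followed by a pseudo-random draw.
            scaled = logits / temperature
            probs = np.exp(scaled - scaled.max())
            probs /= probs.sum()
            return int(rng.choice(len(probs), p=probs))

        rng = np.random.default_rng(0)
        print(vocab[sample_next_token(logits, temperature=1.5, rng=rng)])
        # Higher temperature flattens the distribution, so low-probability
        # tokens get drawn more often.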

    replies(12): >>46217033 #>>46217061 #>>46217166 #>>46217410 #>>46217456 #>>46217758 #>>46218070 #>>46218282 #>>46218393 #>>46218588 #>>46219018 #>>46219935 #
    1. articlepan ◴[] No.46217166[source]
    I agree with your first paragraph, but not your second. Models can still hallucinate when temperature is set to zero (i.e., when we always choose the highest-probability token from the model's output token distribution).

    In my mind, hallucination is when some aspect of the model's response should be consistent with reality but is not, and the reality-inconsistent information is not directly attributable to, or deducible from, (mis)information in the pre-training corpus.

    While hallucination can be triggered by setting the temperature high, it can also be the result of many possible deficiencies in model pre- and post-training that result in the model outputting bad token probability distributions.
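    To make the temperature-zero point concrete: greedy decoding is just an argmax, so it is fully deterministic, yet it still returns whatever token the model happens to score highest (a toy sketch with an invented distribution, not output from any real model):

        import numpy as np

        # Hypothetical candidate completions for "The Eiffel Tower opened in ..."
        vocab = ["1889", "1887", "1921"]
        probs = np.array([0.30, 0.55, 0.15])   # assume the model puts most mass on the wrong year

        print(vocab[int(np.argmax(probs))])    # always "1887": deterministic, and still wrong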

    replies(3): >>46217642 #>>46217654 #>>46219141 #
    2. ActivePattern ◴[] No.46217642[source]
    I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus. For frontier models, we don't even have access to the enormous training corpus, so we have no way of verifying whether the model is regurgitating misinformation it saw there or inventing something out of whole cloth.
    replies(1): >>46217736 #
    3. julienreszka ◴[] No.46217654[source]
    That's because of rounding errors.
    replies(1): >>46218631 #
    4. Aurornis ◴[] No.46217736[source]
    > I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus.

    If the LLM is accurately reflecting the training corpus, it wouldn’t be considered a hallucination. The LLM is operating as designed.

    Matters of access to the training corpus are a separate issue.

    replies(4): >>46217795 #>>46218354 #>>46218380 #>>46218561 #
    5. parineum ◴[] No.46217795{3}[source]
    The LLM is always operating as designed. All LLM outputs are "hallucinations".
    replies(1): >>46221127 #
    6. CGMthrowaway ◴[] No.46218354{3}[source]
    > If the LLM is accurately reflecting the training corpus, it wouldn’t be considered a hallucination. The LLM is operating as designed.

    That would mean that there is never any hallucination.

    The point of the original comment was distinguishing between fact and fiction, which an LLM simply cannot do. (It's an unsolved problem among humans, and it spills into the training data.)

    replies(1): >>46218549 #
    7. eMPee584 ◴[] No.46218380{3}[source]
    Not that the internet contained any misinformation or FUD when the training data was collected...

    Also, statements made with certainty about fictitious "honey pot prompts" are a problem; plausible extrapolation from the data should be governed more by internal confidence. Luckily, I believe there are benchmarks for that now.

    8. Aurornis ◴[] No.46218549{4}[source]
    > That would mean that there is never any hallucination.

    No it wouldn’t. If the LLM produces an output that does not match the training data, or claims things that are not in the training data because of pseudorandom statistical processes, then that’s a hallucination. If it accurately represents the training data or the context content, it’s not a hallucination.

    Similarly, if you ask an LLM to tell you something false and the information it provides is false, that’s not a hallucination.

    > The point of the original comment was distinguishing between fact and fiction,

    In the context of LLMs, fact means something represented in the training set. Not factual in an absolute, philosophical sense.

    If you put a lot of categorically false information into the training corpus and train an LLM on it, those pieces of information are “factual” in the context of the LLM output.

    The key part of the parent comment:

    > caused by the use of a statistical process (the pseudo-random number generator

    replies(1): >>46219042 #
    9. Workaccount2 ◴[] No.46218561{3}[source]
    I believe it was a Super Bowl ad for Gemini last year that had a "hallucination" in the ad itself. One of the screenshots of Gemini being used showed this "hallucination", which made the rounds in the news, as expected.

    I want to say it was some fact about cheese that was, in fact, wrong. However, you could also see the source Gemini cited in the ad, and when you went to that source, it was some local farm's 1998-style HTML homepage, and that page contained the incorrect factoid about the cheese.

    10. leecarraher ◴[] No.46218631[source]
    I agree; it's not just the multinomial sampling that causes hallucinations. If that were the case, setting the temperature to 0 and just taking the argmax over the logits would "solve" hallucinations. While round-off error causes some stochasticity, it's unlikely to be the primary cause; rather, it's the lossy compression over the layers that causes it.

    First compression: you create embeddings that need to differentiate N tokens, and the JL lemma gives a bound that modern architectures are well above. At face value, the embeddings could encode the tokens and keep them deterministically distinguishable. But words aren't monolithic; they mean many things and get contextualized by other words. So despite being above the JL bound, the model is still forced into a lossy compression.
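    As a rough back-of-the-envelope (using the standard JL bound d >= 4 ln(N) / (eps^2/2 - eps^3/3) and an assumed vocabulary of 128k tokens; the numbers are illustrative, not taken from any particular model):

        import math

        N = 128_000                                  # assumed vocabulary size
        for eps in (0.5, 0.3, 0.1):                  # allowed pairwise-distance distortion
            d = 4 * math.log(N) / (eps**2 / 2 - eps**3 / 3)
            print(f"eps={eps}: dimension >= {math.ceil(d)}")
        # Roughly 6e2, 1.3e3, and 1e4 dimensions respectively, so at moderate
        # distortion typical embedding widths (4096+) sit well above the bound.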

    Next compression: each layer of the transformer blows the input up into K, V, and Q, then compresses it back to the inter-layer dimension.

    Finally, there is the output layer, which at temperature 0 is deterministic, but the result is heavily path-dependent on how the model got to that token. The space of possible paths is combinatorial, so any non-deterministic behavior elsewhere, including things like round-off, will inflate the likelihood of non-deterministic output. Heck, most models are quantized down to 4 or even 2 bits these days, which is wild!
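    A toy illustration of the round-off point: two nearly tied logits can differ in float32 but collapse to the same value in a lower-precision format, flipping the argmax (hypothetical numbers; float16 here is just a stand-in for any low-precision path):

        import numpy as np

        logits32 = np.array([12.3457, 12.3461], dtype=np.float32)
        logits16 = logits32.astype(np.float16)   # simulate a low-precision compute path

        print(np.argmax(logits32))   # 1
        print(np.argmax(logits16))   # 0 -- both values round to 12.34375 in float16,
                                     #      and the tie resolves to the first index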

    11. CGMthrowaway ◴[] No.46219042{5}[source]
    OK, if everyone else agrees with your semantics, then I agree.
    12. antonvs ◴[] No.46219141[source]
    > In my mind, hallucination is when some aspect of the model's response should be consistent with reality

    By "reality", do you mean the training corpus? Because otherwise, this seems like a strange standard. Models don't have access to "reality".

    replies(1): >>46219522 #
    13. KalMann ◴[] No.46219522[source]
    > Models don't have access to "reality"

    This is an explanation of why models "hallucinate", not a criticism of the provided definition of hallucination.

    replies(1): >>46221939 #
    14. Al-Khwarizmi ◴[] No.46221127{4}[source]
    The LLM is always operating as designed, but humans call its outputs "hallucinations" when they don't align with factual reality, regardless of the reason why that happens and whether it should be considered a bug or a feature. (I don't like the term much, by the way, but at this point it's a de facto standard).
    15. antonvs ◴[] No.46221939{3}[source]
    That's a poor definition, then. It claims that a model is "hallucinating" when its output doesn't match a reference point it can't possibly have accurate information about. How is that a "hallucination" in any meaningful sense?