Thanks for pointing out the elephant in the room with LLMs.
The basic design is non-deterministic. Trying to extract "facts" or "truth" or "accuracy" is an exercise in futility.
You can't blame an LLM for getting the facts wrong, or hallucinating, when by design it doesn't even attempt to store facts in the first place. All it stores are language statistics, boiling down to "given preceding context X, the most statistically likely next words are A, B, or C". The LLM wasn't designed to know or care that outputting "B" would amount to a lie or hallucination, only that it's a statistically plausible next word.
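To put that concretely, here's a toy sketch (just illustrative Python with made-up numbers, nothing to do with how any real model is implemented): all the "model" has is a table of next-word probabilities for a given context, and it samples from that table with no notion of whether the result is true.

    import random

    # Made-up "language statistics": context -> plausible next words with probabilities
    next_word_stats = {
        ("the", "capital", "of", "France", "is"): {"Paris": 0.7, "Lyon": 0.2, "lovely": 0.1},
    }

    def sample_next_word(context, stats):
        # Pick the next word in proportion to its learned probability;
        # nothing here checks whether the completed sentence is factual.
        dist = stats[tuple(context)]
        words, probs = zip(*dist.items())
        return random.choices(words, weights=probs, k=1)[0]

    print(sample_next_word(["the", "capital", "of", "France", "is"], next_word_stats))
    # sometimes "Paris", sometimes "Lyon" -- statistically plausible either way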
Facts, unlike fabulations, require cross-checking against experience beyond the expressions under examination.
But again, LLMs don't even deal in facts, don't store any memory of where training samples came from, and of course have zero personal experience. It's just "he said, she said" put into a training-sample blender and served one word at a time.