Thanks for pointing out the elephant in the room with LLMs.
The basic design is non-deterministic. Trying to extract "facts" or "truth" or "accuracy" is an exercise in futility.
You can't blame an LLM for getting the facts wrong, or hallucinating, when by design it doesn't even attempt to store facts in the first place. All it stores are language statistics, boiling down to "given preceding context X, the most statistically likely next words are A, B, or C". The LLM wasn't designed to know or care that outputting "B" would amount to a lie or hallucination, only that it's a statistically plausible next word.
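To put that concretely, here's a toy sketch (just illustrative Python with made-up numbers, nothing to do with how any real model is implemented): all the "model" has is a table of next-word probabilities for a given context, and it samples from that table with no notion of whether the result is true.

    import random

    # Made-up "language statistics": context -> plausible next words with probabilities
    next_word_stats = {
        ("the", "capital", "of", "France", "is"): {"Paris": 0.7, "Lyon": 0.2, "lovely": 0.1},
    }

    def sample_next_word(context, stats):
        # Pick the next word in proportion to its learned probability;
        # nothing here checks whether the completed sentence is factual.
        dist = stats[tuple(context)]
        words, probs = zip(*dist.items())
        return random.choices(words, weights=probs, k=1)[0]

    print(sample_next_word(["the", "capital", "of", "France", "is"], next_word_stats))
    # sometimes "Paris", sometimes "Lyon" -- statistically plausible either way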
Facts, unlike fabulations, require cross-checking against experience beyond the expressions under examination.
But again, LLMs don't even deal in facts, don't store any memory of where training samples came from, and of course have zero personal experience. It's just "he said, she said" put into a training-sample blender and served one word at a time.