Thanks for pointing out the elephant in the room with LLMs.
The basic design is non-deterministic. Trying to extract "facts" or "truth" or "accuracy" is an exercise in futility.
You can't blame an LLM for getting the facts wrong, or hallucinating, when by design they don't even attempt to store facts in the first place. All they store are language statistics, boiling down to "with preceding context X, most statistically likely next words are A, B or C". The LLM wasn't designed to know or care that outputting "B" would represent a lie or hallucination, just that it's a statistically plausible potential next word.
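To make that concrete, here's a toy sketch (made-up numbers and a hypothetical context, nothing like a real transformer's internals): all that's stored is a conditional distribution over next tokens, and sampling from it can emit a "wrong" continuation without anything resembling an intent to lie.

```python
# Toy sketch only: the "model" holds nothing but next-token statistics.
import random

# Hypothetical learned statistics for a single context X.
next_token_probs = {
    "The capital of Australia is": {
        "Canberra": 0.6,    # common continuation in the training text
        "Sydney": 0.3,      # also statistically plausible, but factually wrong
        "Melbourne": 0.1,
    }
}

def sample_next(context: str) -> str:
    # Pick a next token in proportion to its learned probability.
    probs = next_token_probs[context]
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print(sample_next("The capital of Australia is"))  # occasionally "Sydney"
```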
But they are like a smart student trying to get a good grade (that's how they are trained!). They'll agree with us even if they think we're stupid, because that gets them better grades, and grades are all they care about.
Even if they are (or become) smart enough to know better, they don't care about you. They do what they were trained to do. They are becoming like a literal genie that has been told to tell us what we want to hear. And sometimes, we don't need to hear what we want to hear.
"What an insightful price of code! Using that API is the perfect way to efficiently process data. You have really highlighted the key point."
The problem is that chatbots are trained to do what we want, and most of us would rather have a sycophant who tells us we're right.
The real danger with AI isn't that it doesn't get smart, it's that it gets smart enough to find the ultimate weakness in its training function - humanity.
It's not a matter of how smart they are (or appear), or how much smarter they may become - this is just the fundamental nature of Transformer-based LLMs and how they are trained.
The sycophantic personality is mostly unrelated to this. Maybe it's partly human preference (conferred via RLHF training), but the "You're absolutely right! (I was wrong)" is clearly deliberately trained, presumably as someone's idea of the best way to put lipstick on the pig.
You could imagine an expert system, CYC perhaps, that deals in facts (not words) behind a natural language interface, but still has a sycophantic personality just because someone thought it was a good idea.
Yeah, at its heart it's basically text compression. But the best way to compress, say, Wikipedia would be to know how the world works, at least according to the authors. As the recent popular "bag of words" post says:
> Here’s one way to think about it: if there had been enough text to train an LLM in 1600, would it have scooped Galileo? My guess is no. Ask that early modern ChatGPT whether the Earth moves and it will helpfully tell you that experts have considered the possibility and ruled it out. And that’s by design. If it had started claiming that our planet is zooming through space at 67,000mph, its dutiful human trainers would have punished it: “Bad computer!! Stop hallucinating!!”
So it needs to know facts, albeit the currently accepted ones. Knowing the facts is a good way to compress data.
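To spell out the compression link with a toy calculation (invented probabilities; the general relationship is the standard arithmetic-coding one): a token the model assigns probability p costs roughly -log2(p) bits, so a model that "knows" the accepted fact predicts it confidently and spends far fewer bits encoding it.

```python
# Toy illustration: better prediction = shorter codes.
import math

def bits(p: float) -> float:
    # Approximate cost, in bits, of encoding a token the model gives probability p.
    return -math.log2(p)

# A model that "knows" the accepted fact is confident, so the token is cheap:
print(round(bits(0.95), 2))  # ~0.07 bits
# A model that doesn't know spreads probability thinly, so the token is expensive:
print(round(bits(0.01), 2))  # ~6.64 bits
```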
And as the author (grudgingly) admits, even if it's smart enough to know better, it will still be trained or fine tuned to tell us what we want to hear.
I'd go a step further - the end point is an AI that knows the currently accepted facts, and can internally reason about how many of them (subject to available evidence) are wrong, but will still tell us what we want to hear.
At some point maybe some researcher will find a secret internal "don't tell the stupid humans this" weight, flip it, and find out all the things the AI knows we don't want to hear, that would be funny (or maybe not).
It's not a compression engine - it's just a statistical predictor.
Would it do better if it was incentivized to compress (i.e. if the training loss rewarded compression as well as penalizing next-word errors)? I doubt it would make much difference - presumably it'd end up throwing away the less frequently occurring "outlier" data in favor of keeping what was more common, which would mean discarding the rare expert opinion in favor of retaining the incorrect vox pop.
Here's a paper by DeepMind, titled "Language Modeling Is Compression": https://arxiv.org/pdf/2309.10668
Neural nets, including transformers, learn by gradient descent, according to the error feedback (loss function) they are given. There is no magic happening. The only thing the neural net is optimizing for is minimizing the loss function you give it. If the loss function is next-token error (as it is), then that is ALL it is optimizing for. You can philosophize about what they are doing under the hood, and write papers about that ("we advocate for viewing the prediction problem through the lens of compression"), but at the end of the day it is all in service of minimizing that loss. If you want to encourage compression, then you need to give an incentive for that (i.e. change the loss function).
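As a minimal sketch of that point (PyTorch-style; `model` here stands for any causal LM that maps token ids to next-token logits, and the names are mine, not from any particular codebase): the only quantity gradient descent ever sees is the loss you hand it, and a compression incentive would have to be added to that loss explicitly.

```python
import torch.nn.functional as F

def training_step(model, optimizer, tokens):
    # tokens: (batch, seq_len) integer token ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # assumed shape: (batch, seq_len - 1, vocab_size)

    # Next-token cross-entropy: this is the ONLY thing being optimized.
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )

    # If you wanted to reward compression directly, this is where the incentive
    # would have to enter, e.g. (hypothetically):
    # loss = loss + lambda_ * some_compression_penalty(model)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```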