Thanks for pointing out the elephant in the room with LLMs.
The basic design is non-deterministic. Trying to extract "facts" or "truth" or "accuracy" is an exercise in futility.
I fully expect local models to eat up most other LLM applications; there's no reason for your chat buddy or timer setter to reach out to the internet. But LLMs are pretty good at vibes-based search, and that will always require looking at a bunch of websites, so they should slot exactly into the gap left by search engines becoming unusable.
Google's entire founding theory was that an algorithm could do better than Yahoo hand-picking websites, and PageRank was the demonstration. But IMO that was only possible with a non-adversarial dataset, because you couldn't "attack" Yahoo and friends' processes from the data itself.
The moment PageRank went into production, that changed and the game was up. As long as you use content to judge search ranking, content will be changed, gamed, and abused to increase search rank.
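For anyone who hasn't looked at it in a while, here's roughly what that demonstration amounts to, as a minimal power-iteration sketch on a made-up link graph (the pages, links, and damping factor are all invented for illustration). Everything feeding the computation is supplied by the sites being ranked, which is exactly the surface link farms and SEO went after:

    # Minimal PageRank sketch: power iteration on a toy link graph.
    # The graph and damping factor are made up for the example.
    import numpy as np

    links = {          # page -> pages it links to
        "a": ["b", "c"],
        "b": ["c"],
        "c": ["a"],
        "d": ["c"],    # "d" could be a spam page created just to boost "c"
    }

    pages = sorted(links)
    n = len(pages)
    idx = {p: i for i, p in enumerate(pages)}

    # Column-stochastic transition matrix: M[j, i] = 1/outdegree(i) if i links to j.
    M = np.zeros((n, n))
    for p, outs in links.items():
        for q in outs:
            M[idx[q], idx[p]] = 1.0 / len(outs)

    d = 0.85                              # damping factor
    rank = np.full(n, 1.0 / n)            # start from a uniform distribution
    for _ in range(100):                  # iterate until (effectively) converged
        rank = (1 - d) / n + d * M @ rank

    print(dict(zip(pages, rank.round(3))))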
The very moment it becomes profitable to do the same to LLM "search", it will happen. LLMs are rather vulnerable to "attack", and they will run into the exact same adversarial environment that nullified the effectiveness of PageRank.
This is also orthogonal to whether you believe Google let search get shittier to prop up its ad empire; if you do believe that, LLM "search" will have exactly the same problem.
If you build a credit card fraud model on a dataset that contains no attacks, you will build a rather bad fraud model. The same is true of PageRank and algorithmic search.
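To make the analogy concrete, a toy sketch (all numbers invented): a naive anomaly-style "fraud model" fit only on legitimate transactions flags the crude attack, but waves through an attacker who simply mimics normal spending.

    # Toy fraud "model" trained on attack-free data: flag anything far from normal.
    import numpy as np

    rng = np.random.default_rng(0)
    legit = rng.normal(loc=50.0, scale=10.0, size=10_000)   # amounts seen before any attacker shows up

    mu, sigma = legit.mean(), legit.std()

    def is_suspicious(amount):
        # "Fraud detection": anything more than 3 sigma from normal spending.
        return abs(amount - mu) > 3 * sigma

    print(is_suspicious(500.0))   # True: a naive attack stands out
    print(is_suspicious(55.0))    # False: mimic the training distribution and you sail through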