
An LLM is a lossy encyclopedia

(simonwillison.net)
509 points by tosh | 4 comments

(the referenced HN thread starts at https://news.ycombinator.com/item?id=45060519)
latexr No.45101170
A lossy encyclopaedia should be missing information and be obvious about it, not making it up without your knowledge and changing the answer every time.

When you have a lossy piece of media, such as a compressed sound or image file, you can always see the resemblance to the original and note the degradation as it happens. You never have a clear JPEG of a lamp, compress it, and get a clear image of the Milky Way, then reopen the image and get a clear image of a pile of dirt.
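A minimal sketch of that property in code, using Pillow and NumPy, with a synthetic gradient image standing in for the "clear JPEG of a lamp": heavy compression adds measurable error, but the result stays strongly correlated with the original rather than turning into a picture of something else.

    import io

    import numpy as np
    from PIL import Image

    # Synthetic stand-in image: a smooth 256x256 grayscale gradient.
    x = np.arange(256, dtype=np.uint8)
    original = Image.fromarray(np.tile(x, (256, 1)))

    # Aggressive lossy compression (JPEG quality 5), kept in memory.
    buf = io.BytesIO()
    original.save(buf, format="JPEG", quality=5)
    degraded = Image.open(buf)

    a = np.asarray(original, dtype=float)
    b = np.asarray(degraded, dtype=float)
    print("mean absolute pixel error:", np.abs(a - b).mean())              # non-zero: artefacts
    print("pixel correlation:", np.corrcoef(a.ravel(), b.ravel())[0, 1])   # still close to 1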

Furthermore, an encyclopaedia is something you can reference and learn from without a goal; it lets you peruse information you had no concept of. Not so with LLMs, which you have to query to get an answer.

replies(10): >>45101190 #>>45101267 #>>45101510 #>>45101793 #>>45101924 #>>45102219 #>>45102694 #>>45104357 #>>45108609 #>>45112011 #
gjm11 No.45102219
Lossy compression does make things up. We call them compression artefacts.

In compressed audio these can be things like clicks and boings and echoes and pre-echoes. In compressed images they can be ripply effects near edges and banding in smoothly varying regions, but there are also things like https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres... where one digit is replaced with a nice clean version of a different digit, which is pretty on-the-nose for the LLM failure mode you're talking about.
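A toy one-dimensional version of that "making things up", assuming NumPy and SciPy: keep only the large DCT coefficients of a clean edge, as transform codecs do, and the reconstruction contains ripples above and below the edge that were never in the input.

    import numpy as np
    from scipy.fft import dct, idct

    signal = np.concatenate([np.zeros(32), np.ones(32)])   # a clean 0 -> 1 edge
    coeffs = dct(signal, norm="ortho")
    kept = np.where(np.abs(coeffs) > 0.5, coeffs, 0.0)     # drop small coefficients (the lossy step)
    restored = idct(kept, norm="ortho")

    print("overshoot above 1.0:", restored.max() - 1.0)    # invented detail (ringing) near the edge
    print("undershoot below 0.0:", -restored.min())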

Compression artefacts generally affect small parts of the image or audio or video rather than replacing the whole thing -- but in the analogy, "the whole thing" is an encyclopaedia and the artefacts are affecting little bits of that.

Of course the analogy isn't exact. That would be why S.W. opens his post by saying "Since I love collecting questionable analogies for LLMs,".

replies(3): >>45102280 #>>45102368 #>>45103467 #
jpcompartir No.45102280
Interesting: in the LLM case these compression artefacts get fed into the generating process for the next token, so the errors compound.
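
A toy sketch of the feedback loop described above, with a hypothetical bigram table standing in for a real model: each sampled token is appended to the context and conditions every step after it, so a single early divergence changes the rest of the output.

    import random

    # Hypothetical toy "model": a bigram table, not a real LLM.
    bigram = {
        "the": ["cat", "dog"],
        "cat": ["sat"],
        "dog": ["barked"],
        "sat": ["down"],
        "barked": ["loudly"],
    }

    def generate(context, steps, rng):
        out = list(context)
        for _ in range(steps):
            # Conditioned on the model's own previous output token.
            nxt = rng.choice(bigram.get(out[-1], ["<eos>"]))
            out.append(nxt)
        return out

    # The single 'cat'/'dog' choice at step one determines everything after it.
    print(generate(["the"], 3, random.Random(0)))
    print(generate(["the"], 3, random.Random(1)))
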
replies(1): >>45102750 #
ACCount37 No.45102750
Not really. The whole "inference errors will always compound" idea was popular in GPT-3.5 days, and it seems like a lot of people just never updated their knowledge since.

It was quickly discovered that LLMs are capable of re-checking their own solutions if prompted - and, with the right prompts, are capable of spotting and correcting their own errors at a significantly-greater-than-chance rate. They just don't do it unprompted.
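A minimal sketch of that kind of prompted self-check, in Python; `complete` here is a hypothetical stand-in for whatever model call is used, not a real API, and the prompt wording is illustrative only.

    def answer_with_self_check(question: str, complete) -> str:
        """`complete` is any callable mapping a prompt string to model text."""
        draft = complete(f"Question: {question}\nAnswer step by step.")
        review = complete(
            "Here is a question and a draft answer.\n"
            f"Question: {question}\n"
            f"Draft answer: {draft}\n"
            "Check the draft for mistakes. If you find any, give a corrected answer; "
            "otherwise repeat the draft answer."
        )
        return review

    # Trivial echo "model", just to show the control flow end to end.
    print(answer_with_self_check("What is 17 * 24?", lambda p: f"(model output for: {p[:40]}...)"))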

Eventually, it was found that reasoning RLVR (reinforcement learning with verifiable rewards) consistently gets LLMs to check themselves and backtrack. It was also confirmed that this latent "error detection and correction" capability is present even at the base-model level, but is almost never exposed - not in base models and not in non-reasoning instruct-tuned LLMs.

The hypothesis I subscribe to is that any LLM has a strong "character self-consistency drive". This makes it reluctant to say "wait, no, maybe I was wrong just now", even if latent awareness of "past reasoning looks sketchy as fuck" is already present within the LLM. Reasoning RLVR encourages going against that drive and utilizing those latent error-correction capabilities.

replies(2): >>45102860 #>>45103637 #
Mallowram No.45102860
The problem is that language doesn't produce itself. Re-checking and correcting errors is not relevant. Error minimization is not the fount of survival; remaining variable for tasks is. The lossy encyclopedia is neither here nor there; it's a mistaken path:

"Language, Halliday argues, "cannot be equated with 'the set of all grammatical sentences', whether that set is conceived of as finite or infinite". He rejects the use of formal logic in linguistic theories as "irrelevant to the understanding of language" and the use of such approaches as "disastrous for linguistics"."

replies(1): >>45103516 #
ACCount37 No.45103516
Sorry, what? This is borderline incoherent.
replies(1): >>45103661 #
mallowdram No.45103661
The units themselves are meaningless without context. The point of existence, action, tasks is to solve the arbitrariness in language. Tasks refute language, not the other way around. This may be incoherent as the explanation is scientific, based in the latest conceptualization of linguistics.

CS never solved the incoherence of language, the conduit metaphor paradox. It's stuck behind language's bottleneck, and does so willingly, blind-eyed.

replies(1): >>45103716 #
ACCount37 No.45103716
What? This is even less coherent.

You weren't talking to GPT-4o about philosophy recently, were you?

replies(1): >>45103758 #
mallowdram No.45103758
You'd need to know cutting-edge linguistics and signaling theory well beyond Shannon to parse this, not NLP or engineering reduction. What I've stated is extremely coherent to Systemic Functional Linguists.

Beyond this point engineers actually have to know what signaling is, rather than 'information.'

https://www.sciencedirect.com/science/article/abs/pii/S00033...

Ultimately, engineering chose the wrong approach to automating language, and it sinks the field. It's irreversible.

replies(2): >>45104224 #>>45104778 #
morpheos137 No.45104778
If not language, what training substrate do you suggest? Also, note that strong ideas are expressible coherently. You have an ironic pattern in your comments of getting lost in the very language morass you propose to deprecate. If we don't train models on language, what do we train them on? I have some ideas of my own, but I am interested in whether you can clearly express yours.
replies(1): >>45104909 #
mallowdram No.45104909
Neural/spatial syntax. Analoga of differentials. The code to operate this gets built before the component.

If language doesn't really mean anything, then automating it in geometry is worse than problematic.

The solution is starting over at 1947: measurement not counting.

replies(1): >>45105390 #
morpheos137 No.45105390
The semantic meaning of your words here is non-existent. It is unclear to me how else you can communicate in a text-based forum if not by using words. Since you can't, despite your best effort, I am left to conclude you are psychotic and should probably be banned and seek medical help.
replies(1): >>45105584 #
mallowdram No.45105584
Engineers are so close-minded that you can't see the freight train bearing down on the industry. All to science's advantage in replacing engineers. Interestingly, if you dissect that last entry, I've just made the case that measurement (analog computation) is superior to counting (binary computation) and laid out the strategy for how. All it takes is brains, or an LLM, to decipher what it states.

https://pmc.ncbi.nlm.nih.gov/articles/PMC3005627/

"First, cell assemblies are best understood in light of their output product, as detected by ‘reader-actuator’ mechanisms. Second, I suggest that the hierarchical organization of cell assemblies may be regarded as a neural syntax. Third, constituents of the neural syntax are linked together by dynamically changing constellations of synaptic weights (‘synapsembles’). Existing support for this tripartite framework is reviewed and strategies for experimental testing of its predictions are discussed."

replies(1): >>45105773 #
morpheos137 No.45105773
I 100% agree analog computing would be better at simulating intelligence than binary. Why don't you state that rather than burying it under a mountain of psychobabble?
replies(1): >>45106799 #
mallowdram No.45106799
Listing the conditions and dichotomizing the frameworks of counting/measurement is the farthest thing from psychobabble. Anyone with knowledge of analog knows these terms, and knows enough to know analog doesn't simulate anything. And intelligence isn't what's being targeted.