
An LLM is a lossy encyclopedia

(simonwillison.net)
509 points by tosh | 2 comments

(the referenced HN thread starts at https://news.ycombinator.com/item?id=45060519)
GuB-42 ◴[] No.45101186[source]
There are a lot of parallels between AI and compression.

In fact, the best compression algorithms and LLMs have in common that they work by predicting the next word. Compression algorithms take an extra step, called entropy coding, to efficiently encode the difference between the prediction and the actual data; the better the prediction, the better the compression ratio.

What makes an LLM "lossy" is that it doesn't have the "encode the difference" step.

And yes, it means you can turn an LLM into a (lossless) compression algorithm, and I think a really good one in terms of compression ratio on huge data sets. You can also turn a compression algorithm like gzip into a language model! A terrible one, but the output is better than a random stream of bytes.
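
As a rough sketch of what the gzip-as-a-language-model idea could look like (the function name and the prompt below are made up for illustration): score each candidate next byte by how little it grows the compressed context, and treat "compresses well" as "predicted".

    import gzip

    def gzip_next_byte_scores(context: bytes):
        # Score every possible next byte by how many extra bytes it adds
        # to the gzip-compressed context. Smaller = more "predictable",
        # which makes this a very crude next-byte model.
        base = len(gzip.compress(context))
        return {b: len(gzip.compress(context + bytes([b]))) - base
                for b in range(256)}

    context = b"the quick brown fox jumps over the lazy "
    scores = gzip_next_byte_scores(context)
    best = min(scores, key=scores.get)
    print(repr(chr(best)), scores[best])

Greedily appending the lowest-scoring byte over and over gives the kind of output described above: terrible, but noticeably less random than a random stream of bytes.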

replies(3): >>45101276 #>>45102534 #>>45103227 #
1. arjvik ◴[] No.45101276[source]
With a handy trick called arithmetic coding, you can actually turn an LLM into a lossless compression algorithm!
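
A minimal sketch of that trick, assuming a hypothetical prob_fn(context) that returns the model's next-token distribution (in practice the LLM's softmax output). A real coder would use integer ranges with renormalization rather than the floats below, but the interval-narrowing idea is the same:

    def arithmetic_encode(tokens, prob_fn):
        # Narrow [low, high) once per token to the slice that the actual
        # token occupies in the model's predicted distribution. The final
        # interval's width is the product of the probabilities the model
        # assigned, so a better model needs fewer bits (about -log2(width)).
        low, high = 0.0, 1.0
        for i, tok in enumerate(tokens):
            probs = prob_fn(tokens[:i])  # model's prediction for position i
            span = high - low
            cum = 0.0
            for t in sorted(probs):
                if t == tok:
                    low, high = low + span * cum, low + span * (cum + probs[t])
                    break
                cum += probs[t]
        return (low + high) / 2  # any number in [low, high) identifies the sequence

    # Toy stand-in for an LLM: a uniform distribution over two symbols.
    uniform = lambda context: {"a": 0.5, "b": 0.5}
    print(arithmetic_encode("abba", uniform))  # interval width 1/16 -> about 4 bits

Decoding replays the same model on the growing context and picks, at each step, whichever token's slice contains the encoded number, which is why the encoder and decoder must share exactly the same model.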
replies(1): >>45101585 #
2. vbarrielle ◴[] No.45101585[source]
Indeed, see https://bellard.org/nncp/ for an example.