
262 points rain1 | 1 comments
ljoshua No.44443222
Less a technical comment and more just a mind-blown comment, but I still can’t get over just how much data is compressed into and available in these downloadable models. Yesterday I was on a plane with no WiFi, but had gemma3:12b downloaded through Ollama. Was playing around with it and showing my kids, and we fired history questions at it, questions about recent video games, and some animal fact questions. It wasn’t perfect, but holy cow the breadth of information that is embedded in an 8.1 GB file is incredible! Lossy, sure, but a pretty amazing way of compressing all of human knowledge into something incredibly contained.
exe34 No.44443296
Wikipedia is about 24GB, so if you're allowed to drop 2/3 of the details and make up the missing parts by splicing in random text, 8GB doesn't sound too bad.

To me the amazing thing is that you can tell the model to do something and it will follow simple instructions in plain English, like making a list or writing some Python code to do $x. That's the really amazing part.

bbarnett No.44443455
Not to mention, Language Modeling is Compression https://arxiv.org/pdf/2309.10668

So text Wikipedia at 24G would easily hit 8G with many standard forms of compression, I'd think, if not better. And it would be 100% accurate, with full text and data. Far more usable.

It's so easy for people to not realise how massive 8GB really is in terms of text, especially if you use ASCII instead of UTF.
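
For anyone who wants to poke at the "standard forms of compression" point, here is a minimal sketch using Python's `zlib`. The sample string is a made-up, highly repetitive stand-in (an assumption, not real Wikipedia text), so the ratio it gives is far better than real prose would get. It only illustrates that the round trip is exact, and that plain ASCII text takes the same number of bytes when encoded as UTF-8.

```python
import zlib

# Hypothetical sample standing in for encyclopedia prose (not real Wikipedia data).
# It is very repetitive, so it compresses much better than real text would.
sample = "The quick brown fox jumps over the lazy dog. " * 200

raw = sample.encode("utf-8")
packed = zlib.compress(raw, 9)      # 9 = highest zlib compression level
restored = zlib.decompress(packed)

ratio = len(packed) / len(raw)      # compressed size as a fraction of the original
is_lossless = restored == raw       # lossless: every byte comes back unchanged

# Pure-ASCII text occupies one byte per character in both ASCII and UTF-8.
same_size_ascii_utf8 = len(sample.encode("ascii")) == len(raw)

print(f"raw={len(raw)} bytes, packed={len(packed)} bytes, ratio={ratio:.3f}")
print("lossless round trip:", is_lossless)
print("ASCII and UTF-8 sizes match:", same_size_ascii_utf8)
```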

horsawlarway No.44443590
The 24G is the compressed number.

They host a pretty decent article here: https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia

The relevant bit:

> As of 16 October 2024, the size of the current version including all articles compressed is about 24.05 GB without media.
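
As a rough sanity check on the numbers quoted in this thread, the arithmetic can be sketched like this. The two sizes are the ones stated in the comments (8.1 GB for the gemma3:12b download, 24.05 GB for compressed Wikipedia text without media); whether each figure is in decimal or binary gigabytes is not stated, so treat the result as approximate.

```python
# Sizes as quoted in the thread, in GB (unit convention is an assumption).
MODEL_GB = 8.1               # gemma3:12b download size mentioned in the thread
WIKI_COMPRESSED_GB = 24.05   # compressed Wikipedia text, no media, as of 16 October 2024


def size_ratio(model_gb: float, corpus_gb: float) -> float:
    """Return the model size as a fraction of the corpus size."""
    return model_gb / corpus_gb


ratio = size_ratio(MODEL_GB, WIKI_COMPRESSED_GB)
print(f"model / compressed Wikipedia = {ratio:.2f}")  # prints 0.34
```

So the model file is roughly a third the size of the already-compressed text dump.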

bbarnett No.44443836
Nice link, thanks.

Well, I'll fall back to the position that one is lossy and the other is not.