602 points emrah | 5 comments
1. wtcactus No.43743666
They keep mentioning the RTX 3090 (with 24 GB VRAM), but the model is only 14.1 GB.

Shouldn’t it fit on a 5060 Ti (16 GB), for instance?

replies(3): >>43743691 >>43743768 >>43747505
2. jsnell No.43743691
Memory is needed for more than just the parameters, e.g. the KV cache.
replies(1): >>43743879
3. oktoberpaard No.43743768
With a 128K context length and an 8-bit KV cache, the 27B model occupies 22 GiB on my system. With a smaller context length you should be able to fit it on a 16 GiB GPU.
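The KV-cache figure above can be roughly estimated from the model's attention shape: keys plus values for every layer, KV head, and token position. A minimal sketch (the layer/head numbers below are illustrative placeholders, not the actual Gemma config):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, dtype_bytes: int) -> int:
    """Upper-bound estimate of KV-cache size:
    2 tensors (K and V) x layers x KV heads x head dim x context x element size."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * dtype_bytes

# Illustrative numbers only -- substitute your model's real config.
est = kv_cache_bytes(n_layers=46, n_kv_heads=8, head_dim=128,
                     ctx_len=131_072, dtype_bytes=1)  # 8-bit cache
print(f"~{est / 2**30:.1f} GiB")  # ~11.5 GiB with these placeholder numbers
```

Note this is an upper bound for plain full attention: models using grouped-query attention or sliding-window layers cache less per layer, which is one reason quoted figures for the same context length vary between setups.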
4. cubefox No.43743879
KV = key-value
5. Havoc No.43747505
Just checked: 19 GB with 8K context @ q8 KV, plus another 2.5 GB-ish or so for the OS etc.

...so yeah 3090