
210 points by blackcat201 | 3 comments
1. amoskvin No.45769758
Any hardware recommendations? How much memory do we need for this?
replies(1): >>45770061
2. uniqueuid No.45770061
You will effectively want a 48GB card or more for quantized versions, otherwise you won't have meaningful space left for the KV cache. Blackwell and newer is generally a good idea, since it gets you native hardware support for 4-bit data types (kernels for some recent models took a while to ship for older architectures; gpt-oss, IIRC).
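For a rough sanity check on why the KV cache eats so much VRAM, here's a back-of-the-envelope calculation. The layer/head/context numbers are illustrative placeholders for a mid-size transformer with grouped-query attention, not any specific model's specs:

    # KV-cache size per sequence: 2 (K and V) tensors, stored
    # per layer, per KV head, per head dim, per token.
    def kv_cache_bytes(n_layers, n_kv_heads, head_dim,
                       context_len, dtype_bytes=2):
        return (2 * n_layers * n_kv_heads * head_dim
                * context_len * dtype_bytes)

    # Assumed specs: 48 layers, 8 KV heads, head_dim 128,
    # 128k-token context, fp16 cache (2 bytes/element).
    gib = kv_cache_bytes(48, 8, 128, 131072, 2) / 2**30
    print(f"{gib:.1f} GiB")  # -> 24.0 GiB for one full-length sequence

Even with 4-bit weights, a single long-context sequence at fp16 cache precision can eat half of a 48GB card; that's why quantizing the KV cache itself (8-bit or 4-bit) is often the next lever.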
replies(1): >>45771736
3. samus No.45771736
This is a Mixture-of-Experts model with only 3B activated parameters, so compute is light. But I agree that for the intended usage scenario, VRAM for the KV cache is the real limitation.
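One caveat worth spelling out: with MoE, the *total* parameter count (all experts) still has to sit in VRAM, even though only ~3B are active per token. A quick estimate, using an assumed total size since the thread doesn't state one:

    # Weight-memory estimate for a quantized MoE checkpoint.
    # total_params_b is in billions; all experts must be resident.
    def weight_gib(total_params_b, bits_per_weight):
        return total_params_b * 1e9 * bits_per_weight / 8 / 2**30

    # Hypothetical ~30B-total / 3B-active MoE at 4-bit quantization:
    print(f"{weight_gib(30, 4):.1f} GiB")  # -> ~14.0 GiB of weights

So on a 48GB card, weights at 4-bit leave roughly 30GB+ for the KV cache and activations, which matches the sizing advice above.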