
577 points by simonw | 1 comment
pamelafox ◴[] No.44723651[source]
Alas, my 3-year-old Mac has only 16 GB of RAM and can barely run a browser without running out of memory. It's a work-issued Mac, and we only get upgrades every 4-5 years. I must be content with 8B-parameter models from Ollama (some of which are quite good, like llama3.1:8b).
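(If anyone wants to try this, here's a minimal sketch using the ollama Python client; the prompt is just an example, and it assumes the Ollama server is running and the model has already been pulled.)

    import ollama  # pip install ollama; talks to a locally running Ollama server

    # Chat with a locally hosted 8B model; pull it first with `ollama pull llama3.1:8b`.
    response = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Summarize what model quantization does."}],
    )
    print(response["message"]["content"])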
replies(4): >>44723672 #>>44723697 #>>44723806 #>>44724480 #
GaggiX ◴[] No.44723697[source]
Reasoning models like Qwen3 are even better, and they give you more options; for example, you can choose the 14B model (at the usual Q4_K_M quantization) instead of the 8B model.
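(Rough napkin math for why that fits in 16 GB; the bits-per-weight figures below are approximate averages for llama.cpp-style quants, not exact numbers.)

    # Approximate weight memory: params * bits_per_weight / 8, ignoring the
    # KV cache and runtime overhead, which add a couple more GB in practice.
    def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
        return params_billion * bits_per_weight / 8

    print(approx_weight_gb(8, 4.8))    # ~4.8 GB  (8B at Q4_K_M, ~4.8 bits/weight)
    print(approx_weight_gb(14, 4.8))   # ~8.4 GB  (14B at Q4_K_M)
    print(approx_weight_gb(14, 6.6))   # ~11.6 GB (14B at Q6_K, ~6.6 bits/weight)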
replies(1): >>44723986 #
pamelafox ◴[] No.44723986[source]
Are they quantized more effectively than the non-reasoning models for some reason?
replies(1): >>44724022 #
GaggiX ◴[] No.44724022[source]
There is no difference; you can choose a 6-bit quantization (Q6_K) if you prefer, and at that point it's essentially lossless.
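(For reference, Ollama publishes several quantizations of the same model under different tags; a sketch below, though the exact tag name is my assumption, so check the model's page in the Ollama library first.)

    import ollama

    # Pull and query a higher-precision 6-bit variant; "qwen3:14b-q6_K" is an
    # illustrative tag name, so verify the real tag on the Ollama library page.
    ollama.pull("qwen3:14b-q6_K")
    reply = ollama.chat(
        model="qwen3:14b-q6_K",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply["message"]["content"])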