
577 points by simonw | 1 comment
pamelafox ◴[] No.44723651[source]
Alas, my 3-year-old Mac has only 16 GB of RAM and can barely run a browser without running out of memory. It's a work-issued Mac, and we only get upgrades every 4-5 years. I must be content with 8B-parameter models from Ollama (some of which are quite good, like llama3.1:8b).
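(If anyone wants to try this, here's a minimal sketch using the ollama Python client; the prompt is just an example, and it assumes the Ollama server is running and the model has already been pulled.)

    import ollama  # pip install ollama; talks to a locally running Ollama server

    # Chat with a locally hosted 8B model; pull it first with `ollama pull llama3.1:8b`.
    response = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Summarize what model quantization does."}],
    )
    print(response["message"]["content"])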
replies(4): >>44723672 #>>44723697 #>>44723806 #>>44724480 #
GaggiX ◴[] No.44723697[source]
Reasoning models like Qwen3 are even better, and they give you more options; for example, you can choose the 14B model (at the usual Q4_K_M quantization) instead of the 8B model.
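(Rough napkin math for why that fits in 16 GB; the bits-per-weight figures below are approximate averages for llama.cpp-style quants, not exact numbers.)

    # Approximate weight memory: params * bits_per_weight / 8, ignoring the
    # KV cache and runtime overhead, which add a couple more GB in practice.
    def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
        return params_billion * bits_per_weight / 8

    print(approx_weight_gb(8, 4.8))    # ~4.8 GB  (8B at Q4_K_M, ~4.8 bits/weight)
    print(approx_weight_gb(14, 4.8))   # ~8.4 GB  (14B at Q4_K_M)
    print(approx_weight_gb(14, 6.6))   # ~11.6 GB (14B at Q6_K, ~6.6 bits/weight)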
replies(1): >>44723986 #
pamelafox ◴[] No.44723986[source]
Are they quantized more effectively than the non-reasoning models for some reason?
replies(1): >>44724022 #
GaggiX ◴[] No.44724022[source]
There is no difference; you can choose a 6-bit quantization (Q6_K) if you prefer, and at that point it's essentially lossless.
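(For reference, Ollama publishes several quantizations of the same model under different tags; a sketch below, though the exact tag name is my assumption, so check the model's page in the Ollama library first.)

    import ollama

    # Pull and query a higher-precision 6-bit variant; "qwen3:14b-q6_K" is an
    # illustrative tag name, so verify the real tag on the Ollama library page.
    ollama.pull("qwen3:14b-q6_K")
    reply = ollama.chat(
        model="qwen3:14b-q6_K",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply["message"]["content"])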