
S1: A $6 R1 competitor?

(timkellogg.me)
851 points by tkellogg | 1 comment
leopoldj ◴[] No.42962843[source]
>it can run on my laptop

Has anyone run it on a laptop (unquantized)? The disk size of the 32B model appears to be 80GB. Update: I'm using a 40GB A100 GPU. Loading the model took 30GB of VRAM. I asked a simple question: "How many r in raspberry". After 5 minutes, nothing had been generated beyond the prompt. I'm not sure how the author ran this on a laptop.
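
A rough sketch of the weight-memory math for a 32B-parameter model (standard bytes-per-parameter figures, not from the article; KV cache and activation overhead ignored) shows why the unquantized weights don't fit in 40GB of VRAM but a 4-bit quant can fit on a laptop:

    # Back-of-the-envelope: weight memory for a 32B-parameter model
    # at different precisions (KV cache and activations ignored).
    PARAMS = 32e9

    for label, bytes_per_param in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
        gb = PARAMS * bytes_per_param / 1e9
        print(f"{label:>9}: ~{gb:.0f} GB for the weights alone")

    # fp32      : ~128 GB
    # bf16/fp16 : ~64 GB  -- more than a 40GB A100 can hold without offloading
    # int8      : ~32 GB
    # 4-bit     : ~16 GB  -- small enough for a 24GB GPU or a high-RAM laptop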

replies(1): >>42965317 #
coder543 ◴[] No.42965317[source]
32B models are easy to run on 24GB of RAM at a 4-bit quant.

If you're having trouble, it sounds like you should experiment with some of the existing 32B models that have better documentation on how to run them, but it is entirely plausible to run this on a laptop.

I can run Qwen2.5-Instruct-32B-q4_K_M at 22 tokens per second on just an RTX 3090.
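
For reference, a minimal llama-cpp-python sketch of running a q4_K_M GGUF quant like that (the model file name is a placeholder, and speed will vary with hardware and context size):

    # Minimal sketch: run a 4-bit GGUF quant of a 32B model with llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen2.5-32b-instruct-q4_k_m.gguf",  # placeholder path to the quantized weights
        n_gpu_layers=-1,   # offload all layers to the GPU (reduce if VRAM is tight)
        n_ctx=4096,
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "How many r in raspberry"}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])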

replies(1): >>42966020 #
leopoldj ◴[] No.42966020[source]
My question was about running it unquantized. The author of the article didn't say how he ran it. If he quantized it, then saying he ran it on a laptop is not news.
replies(2): >>42966043 #>>42967382 #
kristianp ◴[] No.42967382[source]
Maybe he has a 64GB laptop. Also, he said he can run it, not that he actually tried it.