
1311 points msoad | 1 comment
TaylorAlexander (No.35394064)
Great to see this advancing! I'm curious whether anyone knows the best repo for running this stuff on an Nvidia GPU with 16GB of VRAM. I ran the official repo with the leaked weights, and the largest model I could run was the 7B-parameter one. Have people found ways to fit the larger models on such a system?
terafo (No.35394117)
I'd assume the 33B model should fit with this (the only repo I know of that implements SparseGPT and GPTQ for LLaMA), though I haven't tried it myself. You can try your luck: https://github.com/lachlansneff/sparsellama
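For context on why quantization matters here, a back-of-the-envelope sketch of weight memory alone (this ignores the KV cache, activations, and framework overhead, so real usage runs higher; the numbers are illustrative, not measured):

```python
# Approximate VRAM needed just to hold the weights of a model with
# `params_billion` parameters stored at `bits_per_weight` bits each.
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (7, 13, 33):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: {weight_gb(params, bits):5.1f} GB")
```

This matches the experience above: 7B at fp16 is ~14 GB, which just squeezes into 16 GB, while 33B at 4-bit GPTQ is ~16.5 GB for weights alone, which is why combining quantization with SparseGPT-style sparsity (pruned weights need not be stored densely) is what would make a 33B model plausible on such a card.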