
1311 points by msoad | 4 comments
1. TaylorAlexander No.35394064
Great to see this advancing! I'm curious if anyone knows the best repo for running this on an NVIDIA GPU with 16GB of VRAM. I ran the official repo with the leaked weights, and the largest model I could run was the 7B-parameter one. I'm curious whether people have found ways to fit the larger models on such a system.
replies(2): >>35394117, >>35394765
2. terafo No.35394117
I'd assume the 33B model should fit with this (the only repo I know of that implements SparseGPT and GPTQ for LLaMA); I haven't tried it personally, though. But you can try your luck: https://github.com/lachlansneff/sparsellama
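
(Aside: a rough sketch of why 4-bit quantization is what makes 33B plausible on a 16GB card. Weight memory is roughly parameter count × bytes per weight; the numbers below cover weights only and ignore activations and the KV cache, which add a few more GB at runtime.)

    # Back-of-envelope VRAM needed for LLaMA weights alone.
    BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

    for params_b in (7, 13, 33, 65):  # billions of parameters
        row = ", ".join(
            f"{fmt}: {params_b * bpp:5.1f} GB"
            for fmt, bpp in BYTES_PER_PARAM.items()
        )
        print(f"LLaMA-{params_b}B -> {row}")

    # LLaMA-33B at 4 bits is ~16.5 GB of weights, right at the edge of a
    # 16GB card -- which is why pruning weights with SparseGPT on top of
    # GPTQ quantization can be what pushes it under the limit.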
3. enlyth No.35394765
https://github.com/oobabooga/text-generation-webui
replies(1): >>35397820
4. TaylorAlexander No.35397820
Looks great, thank you!