Llama.cpp 30B runs with only 6GB of RAM now (github.com)
1311 points | msoad | 31 Mar 23 20:37 UTC | 1 comment
brucethemoose2 | 31 Mar 23 20:46 UTC | No. 35393393
>>35393284 (OP)
Does that also mean 6GB VRAM? And does that include Alpaca models like this?
https://huggingface.co/elinas/alpaca-30b-lora-int4
replies(2): >>35393441 >>35393450
terafo | 31 Mar 23 20:49 UTC | No. 35393441
>>35393393
No (llama.cpp is CPU-only) and no (you need to requantize the model).
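(Editor's note: for context, "requantize" here means converting the weights into llama.cpp's own ggml format rather than reusing a GPTQ int4 checkpoint like the one linked above. A minimal sketch of the workflow as it stood around March 2023, assuming you already have the original 30B weights in `models/30B/`; the script names and the `2` = q4_0 type code reflect the repo at that time and may have changed since:)

```shell
# Build llama.cpp from source
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Convert the original PyTorch checkpoint to ggml f16
# (second argument 1 selects f16 output)
python3 convert-pth-to-ggml.py models/30B/ 1

# Requantize to 4-bit (type 2 = q4_0), which is what lets the
# 30B model run within ~6GB of mapped memory
./quantize ./models/30B/ggml-model-f16.bin ./models/30B/ggml-model-q4_0.bin 2
```

A GPTQ-quantized file such as alpaca-30b-lora-int4 skips the f16 step entirely and uses a different on-disk layout, which is why it cannot be loaded directly and the conversion has to start again from the original weights.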