Llama.cpp 30B runs with only 6GB of RAM now

(github.com)

1311 points msoad | 1 comments | 31 Mar 23 20:37 UTC

1. muyuu [01 Apr 23 10:11 UTC] No.35398974

The ggml-model-f16.bin file for 30B is taking me several hours to generate.

Are the converted files available for download somewhere?