←back to thread

DeepSeek-v3.1

(api-docs.deepseek.com)
776 points wertyk | 1 comments | | HN request time: 0s | source
Show context
danielhanchen ◴[] No.44978800[source]
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good perf for dynamic 2bit (2bit MoE, 6-8bit rest) - can also do SSD offloading but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
zargon ◴[] No.44981373[source]
Thanks for your great work with quants. I would really appreciate UD GGUFs for V3.1-Base (and even more so, GLM-4.5-Base + Air-Base).
replies(1): >>44981700 #
1. danielhanchen ◴[] No.44981700[source]
Thanks! Oh base models? Interesting since I normally do only Instruct models - I can take a look though!