
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points by wertyk | 4 comments
danielhanchen No.44978800
For local runs, I made some GGUFs! For good performance you'll want RAM + VRAM >= 250GB for the dynamic 2-bit quant (2-bit MoE experts, 6-8-bit for everything else). You can also offload to SSD, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
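
If you'd rather serve it over HTTP than chat in the CLI, the same flags should carry over to llama-server (a sketch under the same assumptions; pick whatever port you like):

./llama.cpp/llama-server -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU" --port 8080

The -ot ".ffn_.*_exps.=CPU" override is what makes the combined 250GB budget work: it pins the MoE expert tensors to system RAM so only the dense layers need to fit in VRAM.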

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
1. azinman2 No.44987809
It’d also be great if you guys could do a fine-tune that runs on an 8x80GB A100/H100 node. The H200/B200 configs are harder to come by (and much more expensive).
replies(1): >>44987952 #
2. danielhanchen No.44987952
Unsloth should work on any GPU setup, all the way from the old Tesla T4s to the newer B200s :) We're working on a faster and better multi-GPU version, but using accelerate / torchrun manually + Unsloth should work out of the box!
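
Roughly, the manual launch looks like this (a sketch, not an official recipe; train.py here is a placeholder for your own Unsloth fine-tuning script, and the process count assumes the 8-GPU node you mentioned):

# one process per GPU on a single 8x80GB node
torchrun --nproc_per_node 8 train.py

# or equivalently via accelerate
accelerate launch --num_processes 8 train.py
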
replies(1): >>44987969 #
3. azinman2 No.44987969
I guess I was hoping for you guys to put up these weights. I think they’d be popular for these very large models.

You guys already do a lot for the local LLM community and I appreciate it.

replies(1): >>45000138 #
4. danielhanchen No.45000138
I'll see what I can do :)