
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points by wertyk | 2 comments
danielhanchen ◴[] No.44978800[source]
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good performance with the dynamic 2-bit quant (2-bit MoE layers, 6-8-bit for the rest) - you can also offload to SSD, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
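
In case the -ot flag looks cryptic: the regex ".ffn_.*_exps.=CPU" keeps the MoE expert tensors in system RAM while -ngl 99 pushes everything else to the GPU. If you'd rather have an OpenAI-compatible endpoint than a CLI chat, llama-server should accept the same common flags - roughly something like this (the port and context size are just examples):

```
# serve the same quant over an OpenAI-compatible API;
# MoE expert tensors stay in CPU RAM, the rest goes to the GPU
./llama.cpp/llama-server \
    -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL \
    -ngl 99 --jinja \
    -ot ".ffn_.*_exps.=CPU" \
    --ctx-size 8192 --port 8080
```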

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
diggan ◴[] No.44984274[source]
> More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

Was that document almost exclusively written with LLMs? I looked at it last night (~8 hours ago) and it was riddled with mistakes; the most egregious was that the "Run with Ollama" section had instructions for how to install Ollama, but then the shell commands were actually running llama.cpp, a mistake probably no human would make.

Do you have any plans on disclosing how much of these docs are written by humans vs not?

Regardless, thanks for the continued release of quants and weights :)

replies(2): >>44987926 #>>44997216 #
1. wfn ◴[] No.44997216[source]
> but then the shell commands were actually running llama.cpp, a mistake probably no human would make.

But in the docs I see things like

    cp llama.cpp/build/bin/llama-* llama.cpp
Wouldn't this explain that? (Didn't look too deep)
replies(1): >>45000122 #
2. danielhanchen ◴[] No.45000122[source]
Yes, it's probably the ordering of the docs that's the issue :) I.e. https://docs.unsloth.ai/basics/deepseek-v3.1#run-in-llama.cp... does:

```
# install build dependencies
apt-get update
apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y

# clone and build llama.cpp with CUDA
git clone https://github.com/ggerganov/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build llama.cpp/build --config Release -j --clean-first --target llama-quantize llama-cli llama-gguf-split llama-mtmd-cli llama-server

# copy the built binaries into the llama.cpp folder, so the ./llama.cpp/llama-* commands in the docs work
cp llama.cpp/build/bin/llama-* llama.cpp
```

but the Ollama section is above it and already uses a llama.cpp binary:

```
./llama.cpp/llama-gguf-split --merge \
    DeepSeek-V3.1-GGUF/DeepSeek-V3.1-UD-Q2_K_XL/DeepSeek-V3.1-UD-Q2_K_XL-00001-of-00006.gguf \
    merged_file.gguf
```
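
(The merge is presumably there because Ollama wants a single GGUF file rather than the split shards; once merged, the Ollama side would look roughly like this - the model name is just an example:)

```
# point a Modelfile at the merged single-file GGUF and register it with Ollama
cat > Modelfile <<'EOF'
FROM ./merged_file.gguf
EOF

ollama create deepseek-v3.1 -f Modelfile
ollama run deepseek-v3.1
```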

I'll edit that section to say you have to install llama.cpp first.
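
(And of course the split GGUF shards referenced in the merge command have to be downloaded first - roughly like this with huggingface-cli, assuming the UD-Q2_K_XL quant sits in its own folder inside the repo:)

```
pip install -U "huggingface_hub[cli]"

huggingface-cli download unsloth/DeepSeek-V3.1-GGUF \
    --include "*UD-Q2_K_XL*" \
    --local-dir DeepSeek-V3.1-GGUF
```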