
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points by wertyk | 2 comments
danielhanchen ◴[] No.44978800[source]
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good performance with the dynamic 2-bit quant (2-bit MoE layers, 6-8-bit for the rest) - you can also offload to SSD, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
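
In case the -ot flag looks cryptic: the regex ".ffn_.*_exps.=CPU" keeps the MoE expert tensors in system RAM while -ngl 99 pushes everything else to the GPU. If you'd rather have an OpenAI-compatible endpoint than a CLI chat, llama-server should accept the same common flags - roughly something like this (the port and context size are just examples):

```
# serve the same quant over an OpenAI-compatible API;
# MoE expert tensors stay in CPU RAM, the rest goes to the GPU
./llama.cpp/llama-server \
    -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL \
    -ngl 99 --jinja \
    -ot ".ffn_.*_exps.=CPU" \
    --ctx-size 8192 --port 8080
```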

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
diggan ◴[] No.44984274[source]
> More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

Was that document almost exclusively written with LLMs? I looked at it last night (~8 hours ago) and it was riddled with mistakes; the most egregious was that the "Run with Ollama" section had instructions for how to install Ollama, but then the shell commands were actually running llama.cpp, a mistake probably no human would make.

Do you have any plans on disclosing how much of these docs are written by humans vs not?

Regardless, thanks for the continued release of quants and weights :)

replies(2): >>44987926 #>>44997216 #
1. wfn ◴[] No.44997216[source]
> but then the shell commands were actually running llama.cpp, a mistake probably no human would make.

But in the docs I see things like

    cp llama.cpp/build/bin/llama-* llama.cpp
Wouldn't this explain that? (Didn't look too deep)
replies(1): >>45000122 #
2. danielhanchen ◴[] No.45000122[source]
Yes, it's probably the ordering of the docs that's the issue :) I.e. https://docs.unsloth.ai/basics/deepseek-v3.1#run-in-llama.cp... does:

```
# install build dependencies
apt-get update
apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y

# clone and build llama.cpp with CUDA
git clone https://github.com/ggerganov/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build llama.cpp/build --config Release -j --clean-first --target llama-quantize llama-cli llama-gguf-split llama-mtmd-cli llama-server

# copy the built binaries into the llama.cpp folder, so the ./llama.cpp/llama-* commands in the docs work
cp llama.cpp/build/bin/llama-* llama.cpp
```

but the Ollama section is above it and already uses a llama.cpp binary:

```
./llama.cpp/llama-gguf-split --merge \
    DeepSeek-V3.1-GGUF/DeepSeek-V3.1-UD-Q2_K_XL/DeepSeek-V3.1-UD-Q2_K_XL-00001-of-00006.gguf \
    merged_file.gguf
```
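
(The merge is presumably there because Ollama wants a single GGUF file rather than the split shards; once merged, the Ollama side would look roughly like this - the model name is just an example:)

```
# point a Modelfile at the merged single-file GGUF and register it with Ollama
cat > Modelfile <<'EOF'
FROM ./merged_file.gguf
EOF

ollama create deepseek-v3.1 -f Modelfile
ollama run deepseek-v3.1
```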

I'll edit that section to say you have to install llama.cpp first.
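
(And of course the split GGUF shards referenced in the merge command have to be downloaded first - roughly like this with huggingface-cli, assuming the UD-Q2_K_XL quant sits in its own folder inside the repo:)

```
pip install -U "huggingface_hub[cli]"

huggingface-cli download unsloth/DeepSeek-V3.1-GGUF \
    --include "*UD-Q2_K_XL*" \
    --local-dir DeepSeek-V3.1-GGUF
```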