
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points by wertyk | 4 comments
danielhanchen No.44978800
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good performance with the dynamic 2-bit quant (2-bit MoE layers, 6-8-bit for the rest). You can also offload to SSD, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
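# -hf pulls the quant from the unsloth/DeepSeek-V3.1-GGUF repo; -ngl 99 offloads layers to the GPU
# --jinja applies the model's chat template; -ot ".ffn_.*_exps.=CPU" keeps MoE expert tensors in system RAM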

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

pshirshov No.44979837
By the way, I'm wondering why unsloth (a goddamn Python library) tries to run apt-get with sudo (and fails on my NixOS). Like, how tf are we supposed to use that?
danielhanchen No.44980068
Oh hey, I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence the apt-get call and the llama.cpp build from within a Python shell.

There is a way to convert to Q8_0, BF16, or F16 without compiling llama.cpp; it's enabled if you use `FastModel` rather than `FastLanguageModel`.
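A minimal sketch of that export path, going by the unsloth docs ("your-finetuned-model" is a placeholder; exact signatures can vary by version):

from unsloth import FastModel

# Load the finetuned checkpoint (any HF-style path).
model, tokenizer = FastModel.from_pretrained("your-finetuned-model")

# Q8_0 / BF16 / F16 exports skip the llama.cpp compile step entirely:
model.save_pretrained_gguf("out_gguf", tokenizer, quantization_method="q8_0")

# Q4_K_M needs llama.cpp's quantize binary, which is what triggers the
# apt-get + cmake path:
# model.save_pretrained_gguf("out_gguf", tokenizer, quantization_method="q4_k_m")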

Essentially I try `sudo apt-get`; if that fails, plain `apt-get`; and if that fails too, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`
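The fallback chain, sketched (this mirrors the behaviour described above, not the actual unsloth-zoo source):

import subprocess

DEPS = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

def install_build_deps():
    # Try privileged install first, then unprivileged, then give up.
    for apt in (["sudo", "apt-get"], ["apt-get"]):
        try:
            subprocess.run(apt + ["install", "-y", *DEPS], check=True)
            return
        except (FileNotFoundError, subprocess.CalledProcessError):
            continue
    raise RuntimeError("Could not install build deps; install them manually.")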

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

Balinares No.44983011
I'll venture that whoever is going to fine-tune their own models probably already has llama.cpp installed somewhere, or can install it if required.

Please, please, never silently attempt to mutate the state of my machine. That is not good practice and will break things more often than it helps, because you don't know how the machine is set up in the first place.

danielhanchen No.44984021
Oh yes, before we install llama.cpp we do a PATH environment check, and only if it's not found does the install kick in.
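Something along these lines (illustrative only; the binary names here are assumptions, not the actual check):

import shutil

def llama_cpp_on_path():
    # Skip the install entirely if a llama.cpp binary is already visible.
    return any(shutil.which(b) for b in ("llama-cli", "llama-quantize"))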

But yes, agreed. There won't be any more random package installs, sorry!

Balinares No.44986274
Thanks for the reply! If I can find the time (that's a pretty big if), I'll try to send a PR to help with the packaging.
danielhanchen No.44987981
No worries :)