DeepSeek-v3.1

(api-docs.deepseek.com)

776 points wertyk | 2 comments | 21 Aug 25 19:06 UTC

danielhanchen ◴[21 Aug 25 22:21 UTC] No.44978800[source]▶

For local runs, I made some GGUFs! You need roughly RAM + VRAM >= 250GB for good performance with the dynamic 2-bit quant (2-bit MoE layers, 6-8 bit for the rest). SSD offloading also works, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #

pshirshov ◴[22 Aug 25 00:31 UTC] No.44979837[source]▶

>>44978800 #

By the way, I'm wondering why unsloth (a goddamn python library) tries to run apt-get with sudo (and fails on my nixos). Like how tf we are supposed to use that?

replies(2): >>44980068 #>>44981691 #

danielhanchen ◴[22 Aug 25 01:09 UTC] No.44980068[source]▶

>>44979837 #

Oh hey I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence apt-get and compiling llama.cpp within a Python shell.

There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel`, but not with `FastLanguageModel`.

Essentially I try `sudo apt-get` first; if that fails, plain `apt-get`; and if both fail, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`.

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

replies(5): >>44980567 #>>44980608 #>>44980665 #>>44982700 #>>44983011 #

pxc ◴[22 Aug 25 03:11 UTC] No.44980665[source]▶

>>44980068 #

It seems Unsloth is useful and popular, and you seem responsive and helpful. I'd be down to try to improve this and maybe package Unsloth for Nix as well, if you're up for reviewing and answering questions; seems fun.

Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:

  - see if llama.cpp is on the PATH and already has the requisite features
  - if not, check /etc/os-release to determine distro
  - if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
  - bail, explain the problem to the user, give copy/paste-friendly instructions at the end if we managed to figure out where we're running
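
The fallback steps in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Unsloth's actual code: the function names are hypothetical, `llama-quantize` is assumed to be the binary worth looking for, and the "has the requisite features" check is left out because the thread does not say how to test for it.

```python
import shutil

# Package managers named in the list above, checked in that order.
PACKAGE_MANAGERS = ("apt", "dnf", "yum", "zypper", "pacman")


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def detect_distro(os_release_path="/etc/os-release", which=shutil.which):
    """Return the os-release ID, else the first package manager found, else None."""
    try:
        with open(os_release_path, encoding="utf-8") as fh:
            distro_id = parse_os_release(fh.read()).get("ID")
        if distro_id:
            return distro_id
    except OSError:
        pass  # No readable os-release file: fall back to guessing.
    for manager in PACKAGE_MANAGERS:
        if which(manager):
            return manager
    return None


def plan_llama_cpp(binary="llama-quantize", which=shutil.which,
                   os_release_path="/etc/os-release"):
    """Return a message for the user; never installs or modifies anything."""
    found = which(binary)
    if found:
        return f"Using existing {binary} at {found}"
    distro = detect_distro(os_release_path, which)
    if distro is None:
        return ("Could not find llama.cpp or identify this system; "
                "please build llama.cpp manually.")
    return (f"llama.cpp was not found on PATH (detected '{distro}'). "
            "Install a C/C++ toolchain, cmake, curl and the libcurl headers "
            "with your package manager, then build llama.cpp.")


if __name__ == "__main__":
    print(plan_llama_cpp())
```

The sketch only reports what it found and what the user should do next, which keeps the library away from the package manager entirely.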

Is either sort of change potentially agreeable enough that you'd be happy to review it?

replies(2): >>44980750 #>>44980820 #

danielhanchen ◴[22 Aug 25 03:43 UTC] No.44980820[source]▶

>>44980665 #

As an update, I pushed https://github.com/unslothai/unsloth-zoo/commit/ae675a0a2d20...

(1) Removed and disabled sudo

(2) Installing via apt-get now asks the user for permission via input()

(3) Added an error if building llama.cpp fails, with instructions for compiling llama.cpp manually
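
The three changes listed above might look something like the sketch below. It is not taken from the linked commit: the function name, prompt wording and error text are all hypothetical, and only the package names come from earlier in the thread.

```python
import subprocess

# Build dependencies named earlier in the thread.
APT_PACKAGES = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

MANUAL_HELP = (
    "Could not install the build dependencies automatically. "
    "Please install " + " ".join(APT_PACKAGES) + " (or your distro's "
    "equivalents) yourself and compile llama.cpp manually."
)


def install_build_deps(ask=input, run=subprocess.run):
    """Ask for consent, then run apt-get without sudo; raise with help on failure."""
    command = ["apt-get", "install", "-y", *APT_PACKAGES]
    answer = ask(f"Run `{' '.join(command)}`? [y/N] ")
    if answer.strip().lower() not in ("y", "yes"):
        # No consent: never touch the package manager.
        raise RuntimeError(MANUAL_HELP)
    try:
        run(command, check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        # apt-get is missing (e.g. a non-Debian system) or returned an error.
        raise RuntimeError(MANUAL_HELP) from exc
    return True
```

Passing `ask` and `run` as parameters is just a convenience for testing the flow without actually invoking apt-get.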

replies(1): >>44981782 #

mFixman ◴[22 Aug 25 07:14 UTC] No.44981782[source]▶

>>44980820 #

Maybe it's a personal preference, but I don't want external programs to ever touch my package manager, even with permission. Besides, this will fail loudly for systems that don't use `apt-get`.

I would just ask the user to install the package, and _maybe_ show the command line to install it (but never run it).

replies(3): >>44981868 #>>44982581 #>>44984685 #

solarkraft ◴[22 Aug 25 13:50 UTC] No.44984685[source]▶

>>44981782 #

I like it when software does work for me.

Quietly installing stuff at runtime is shady for sure, but why not if I consent?

replies(1): >>44987963 #

danielhanchen ◴[22 Aug 25 18:27 UTC] No.44987963[source]▶

>>44984685 #

Do you think asking for permission like this is OK, I guess? I might also add a 30-second timer and just bail out if there's no response from the user.
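
One way the timed prompt could work is sketched below, assuming a background thread reads the answer while the caller waits on a queue. The helper name `ask_with_timeout` is hypothetical and this is not what Unsloth ships.

```python
import queue
import threading


def ask_with_timeout(prompt, timeout=30.0, reader=input):
    """Return the user's answer, or None if nothing arrives within `timeout` seconds."""
    answers = queue.Queue()

    def worker():
        try:
            answers.put(reader(prompt))
        except EOFError:
            # Non-interactive stdin: treat it as an empty answer.
            answers.put("")

    # Daemon thread so a reader that never returns cannot keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    try:
        return answers.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == "__main__":
    reply = ask_with_timeout("Install build dependencies with apt-get? [y/N] ")
    if reply is None or reply.strip().lower() not in ("y", "yes"):
        print("No consent given; please compile llama.cpp manually.")
```

A caveat with this approach: after a timeout the reader thread is still blocked on stdin, so it suits a "bail out and exit" flow better than one that keeps prompting afterwards.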
