←back to thread

DeepSeek-v3.1

(api-docs.deepseek.com)
776 points wertyk | 1 comments | | HN request time: 0s | source
Show context
danielhanchen ◴[] No.44978800[source]
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good perf for dynamic 2bit (2bit MoE, 6-8bit rest) - can also do SSD offloading but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
pshirshov ◴[] No.44979837[source]
By the way, I'm wondering why unsloth (a goddamn python library) tries to run apt-get with sudo (and fails on my nixos). Like how tf we are supposed to use that?
replies(2): >>44980068 #>>44981691 #
danielhanchen ◴[] No.44980068[source]
Oh hey I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence apt-get and compiling llama.cpp within a Python shell.

There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel` and not on `FastLanguageModel`

Essentially I try to do `sudo apt-get` if it fails then `apt-get` and if all fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

replies(5): >>44980567 #>>44980608 #>>44980665 #>>44982700 #>>44983011 #
elteto ◴[] No.44980567[source]
Dude, this is NEVER ok. What in the world??? A third party LIBRARY running sudo commands? That’s just insane.

You just fail and print a nice error message telling the user exactly what they need to do, including the exact apt command or whatever that they need to run.

replies(4): >>44980604 #>>44980675 #>>44980823 #>>44983311 #
danielhanchen ◴[] No.44980675[source]
Yes I had that at the start, but people kept complaining they don't know how to actually run terminal commands, hence the shortcut :(

I was thinking if I can do it during the pip install or via setup.py which will do the apt-get instead.

As a fallback, I'll probably for now remove shell executions and just warn the user

replies(2): >>44981039 #>>44982238 #
rfoo ◴[] No.44982238[source]
IMO the correct thing to do to make these people happy, while being sane, is - do not build llama.cpp on their system. Instead, bundle a portable llama.cpp binary along with unsloth, so that when they install unsloth with `pip` (or `uv`) they get it.

Some people may prefer using whatever llama.cpp in $PATH, it's okay to support that, though I'd say doing so may lead to more confused noob users spam - they may just have an outdated version lurking in $PATH.

Doing so makes unsloth wheel platform-dependent, if this is too much of a burden, then maybe you can just package llama.cpp binary and have it on PyPI, like how scipy guys maintain a https://pypi.org/project/cmake/ on PyPI (yes, you can `pip install cmake`), and then depends on it (maybe in an optional group, I see you already have a lot due to cuda shit).

replies(1): >>44982421 #
danielhanchen ◴[] No.44982421[source]
Oh yes I was working on providing binaries together with pip - currently we're relying on pyproject.toml, but once we utilize setup.py (I think), using binaries gets much simpler

I'm still working on it, but sadly I'm not a packaging person so progress has been nearly zero :(

replies(2): >>44982626 #>>44983062 #
ffsm8 ◴[] No.44982626{3}[source]
I think you misunderstood rfoos suggestion slightly.

From how I interpreted it, he meant you could create a new python package, this would effectively be the binary you need.

In your current package, you could depend on the new one, and through that - pull in the binary.

This would let you easily decouple your package from the binary,too - so it'd be easy to update the binary to latest even without pushing a new version of your original package

I've maintained release pipelines before and handled packaging in a previous job, but I'm not particularly into the python ecosystem, so take this with a grain of salt: an approach would be

Pip Packages :

    * Unsloth: current package, prefers using unsloth-llama, and uses path llama-cpp as fallback (with error msg as final fallback if neither exist, promoting install for unsloth-llama)
    * Unsloth-llama: new package which only bundles the llama cpp binary
replies(1): >>44982737 #
danielhanchen ◴[] No.44982737{4}[source]
Oh ok sorry maybe I misunderstood sorry! I actually found my partial work I did for precompiled binaries! https://huggingface.co/datasets/unsloth/precompiled_llama_cp...

I was trying to see if I could pre-compile some llama.cpp binaries then save them as a zip file (I'm a noob sorry) - but I definitely need to investigate further on how to do python pip binaries

replies(1): >>44985855 #
docfort ◴[] No.44985855{5}[source]
https://docs.astral.sh/uv/guides/package/#publishing-your-pa...
replies(1): >>45000136 #
1. danielhanchen ◴[] No.45000136{6}[source]
Oh thanks - we currently use Pypi so pip install works - https://pypi.org/project/unsloth/

But I think similarly for uv we need a setup.py for packaging binaries (more complex)