Most active commenters

danielhanchen(12)
mkl(3)
solarkraft(3)

Popular/hot comments

>>44981782 #

←back to thread

DeepSeek-v3.1

(api-docs.deepseek.com)

Show context

danielhanchen ◴[21 Aug 25 22:21 UTC] No.44978800[source]▶

>>44976764 (OP) #

For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good perf for dynamic 2bit (2bit MoE, 6-8bit rest) - can also do SSD offloading but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #

pshirshov ◴[22 Aug 25 00:31 UTC] No.44979837[source]▶

>>44978800 #

By the way, I'm wondering why unsloth (a goddamn python library) tries to run apt-get with sudo (and fails on my nixos). Like how tf we are supposed to use that?

replies(2): >>44980068 #>>44981691 #

danielhanchen ◴[22 Aug 25 01:09 UTC] No.44980068[source]▶

>>44979837 #

Oh hey I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence apt-get and compiling llama.cpp within a Python shell.

There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel` and not on `FastLanguageModel`

Essentially I try to do `sudo apt-get` if it fails then `apt-get` and if all fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

replies(5): >>44980567 #>>44980608 #>>44980665 #>>44982700 #>>44983011 #

1. pxc ◴[22 Aug 25 03:11 UTC] No.44980665[source]▶

>>44980068 #

It seems Unsloth is useful and popular, and you seem responsive and helpful. I'd be down to try to improve this and maybe package Unsloth for Nix as well, if you're up for reviewing and answering questions; seems fun.

Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:

  - see if llama.cpp is on the PATH and already has the requisite features
  - if not, check /etc/os-release to determine distro
  - if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
  - bail, explain the problem to the user, give copy/paste-friendly instructions at the end of we managed to figure out where we're running

Is either sort of change potentially agreeable enough that you'd be happy to review it?

replies(2): >>44980750 #>>44980820 #

2. danielhanchen ◴[22 Aug 25 03:29 UTC] No.44980750[source]▶

>>44980665 (TP) #

Thanks for the suggestions! Apologies again I'm pretty bad at packaging, so hence the current setup.

1. So I added a `check_llama_cpp` which checks if llama.cpp does exist and it'll use the prebuilt one https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

2. Yes I like the idea of determining distro

3. Agreed on bailing - I was also thinking if doing a Python input() with a 30 second waiting period for apt-get if that's ok? We tell the user we will apt-get some packages (only if apt exists) (no sudo), and after 30 seconds, it'll just error out

4. I will remove sudo immediately (ie now), and temporarily just do (3)

But more than happy to fix this asap - again sorry on me being dumb

replies(1): >>44982623 #

3. danielhanchen ◴[22 Aug 25 03:43 UTC] No.44980820[source]▶

>>44980665 (TP) #

As an update, I pushed https://github.com/unslothai/unsloth-zoo/commit/ae675a0a2d20...

(1) Removed and disabled sudo

(2) Installing via apt-get will ask user's input() for permission

(3) Added an error if failed llama.cpp and provides instructions to manual compile llama.cpp

replies(1): >>44981782 #

4. mFixman ◴[22 Aug 25 07:14 UTC] No.44981782[source]▶

>>44980820 #

Maybe it's a personal preference, but I don't want external programs to ever touch my package manager, even with permission. Besides, this will fail loudly for systems that don't use `apt-get`.

I would just ask the user to install the package, and _maybe_ show the command line to install it (but never run it).

replies(3): >>44981868 #>>44982581 #>>44984685 #

5. danielhanchen ◴[22 Aug 25 07:30 UTC] No.44981868{3}[source]▶

>>44981782 #

Hopefully the solution for now is a compromise if that works? It will show the command as well, so if not accepted, typing no will error out and tell the user on how to install the package

6. lucideer ◴[22 Aug 25 09:41 UTC] No.44982581{3}[source]▶

>>44981782 #

I don't think this should be a personal preference, I think it should be a standard*.

That said, it does at least seem like these recent changes are a large step in the right direction.

---

* in terms of what the standard approach should be, we live in an imperfect world and package management has been done "wrong" in many ecosystems, but in an ideal world I think the "correct" solution here should be:

(1) If it's an end user tool it should be a self contained binary or it should be a system package installed via the package manager (which will manage any ancillary dependencies for you)

(2) If it's a dev tool (which, if you're cloning a cpp repo & building binaries, it is), it should not touch anything systemwide. Whatsoever.

This often results in a README with manual instructions to install deps, but there are many good automated ways to approach this. E.g. for CPP this is a solved problem with Conan Profiles. However that might incur significant maintenace overhead for the Unsloth guys if it's not something the ggml guys support. A dockerised build is another potential option here, though that would still require the user to have some kind of container engine installed, so still not 100% ideal.

replies(1): >>44982765 #

7. mkl ◴[22 Aug 25 09:49 UTC] No.44982623[source]▶

>>44980750 #

It shouldn't install any packages itself. Just print out a message about the missing packages and your guess of the command to install them, then exit. That way users can run the command themselves if it's appropriate or add the packages to their container build or whatever. People set up machines in a lot of different ways, and automatically installing things is going to mess that up.

replies(2): >>44982744 #>>44984736 #

8. danielhanchen ◴[22 Aug 25 10:14 UTC] No.44982744{3}[source]▶

>>44982623 #

Hmmm so I should get rid of the asking / permissions message?

replies(1): >>44982786 #

9. danielhanchen ◴[22 Aug 25 10:17 UTC] No.44982765{4}[source]▶

>>44982581 #

I would like to be in (1) but I'm not a packaging person so I'll need to investigate more :(

(2) I might make the message on installing llama.cpp maybe more informative - ie instead of re-directing people to the docs on manual compilation ie https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-..., I might actually print out a longer message in the Python cell entirely

Yes we're working on Docker! https://hub.docker.com/r/unsloth/unsloth

replies(1): >>44983175 #

10. mkl ◴[22 Aug 25 10:20 UTC] No.44982786{4}[source]▶

>>44982744 #

Yes, since you won't actually need the permissions.

replies(1): >>44982932 #

11. danielhanchen ◴[22 Aug 25 10:49 UTC] No.44982932{5}[source]▶

>>44982786 #

Hmmm I'm worried people will really not get on how to install / compile / use the terminal hmmm hence I thought permissions were like a compromise solution

replies(2): >>44983399 #>>44984762 #

12. lucideer ◴[22 Aug 25 11:26 UTC] No.44983175{5}[source]▶

>>44982765 #

> Yes we're working on Docker!

That will be nice too, though I was more just referring to simply doing something along the lines of this in your current build:

  docker run conanio/gcc11-ubuntu16.04 make clean -C llama.cpp etc etc...

(likely mounting & calling a sh file instead of passing individual commands)

---

Although I do think getting the ggml guys to support Conan (or monkey patching your own llama conanfile in before building) might be an easier route.

replies(1): >>44983919 #

13. segmondy ◴[22 Aug 25 11:53 UTC] No.44983399{6}[source]▶

>>44982932 #

Don't listen to this crowd, these are "technical folks". Most of your audience will fail to figure it out. You can provide an option that llama.cpp is missing and give them an option where you auto install it or they can install it themselves and do manual configuration. I personally won't tho.

replies(2): >>44984057 #>>44987769 #

14. danielhanchen ◴[22 Aug 25 12:45 UTC] No.44983919{6}[source]▶

>>44983175 #

Oh ok I'll take a look at conanfiles as well - sorry I'm not familiar with them!

15. danielhanchen ◴[22 Aug 25 12:56 UTC] No.44984057{7}[source]▶

>>44983399 #

I think for a compromise solution I'll allow the permission asking to install. I'll definitely try investigating pre built binaries though

16. solarkraft ◴[22 Aug 25 13:50 UTC] No.44984685{3}[source]▶

>>44981782 #

I like it when software does work for me.

Quietly installing stuff at runtime is shady for sure, but why not if I consent?

replies(1): >>44987963 #

17. solarkraft ◴[22 Aug 25 13:54 UTC] No.44984736{3}[source]▶

>>44982623 #

This is an edge case optimization at the cost of 95% of users.

replies(1): >>44985650 #

18. solarkraft ◴[22 Aug 25 13:57 UTC] No.44984762{6}[source]▶

>>44982932 #

I think that it is, quite a good one, even:

- Determine the command that has to be run by the algorithm above.

This does most of the work a user would have to figure out what has to be installed on their system.

- Ask whether to run the command automatically.

This allows the “software should never install dependencies by itself” crowd to say no and figure out further steps, while allowing people who just want it to work to get on with their task as quickly as possible (who do you think there are more of?).

I think it would be fine to print out the command and force the user to run it themselves, but it would bring little material gain at the cost of some of your users’ peace (“oh no it failed, what is it this time ...”).

replies(1): >>44987975 #

19. mkl ◴[22 Aug 25 15:15 UTC] No.44985650{4}[source]▶

>>44984736 #

95% of users probably won't be using Linux. Most of those who are will have no problem installing dependencies. There are too many distributions and ways of setting them up for automated package manager use to be the right thing to do. I have never seen a Python package even try.

20. Computer0 ◴[22 Aug 25 18:13 UTC] No.44987769{7}[source]▶

>>44983399 #

Who do you think the audience is here if not technical. We are in a discussion about a model that requires over 250gb of ram to run. I don't know a non-technical person with more than 32gb.

replies(1): >>44991279 #

21. danielhanchen ◴[22 Aug 25 18:27 UTC] No.44987963{4}[source]▶

>>44984685 #

Do you think it's ok for permissioning I guess? I might also add a 30 second timer and just bail out if there's no response from the user

22. danielhanchen ◴[22 Aug 25 18:28 UTC] No.44987975{7}[source]▶

>>44984762 #

Oh ok! I would say 50% of people manually install llama.cpp and the other 50% want it to be automated

23. pxc ◴[22 Aug 25 23:38 UTC] No.44991279{8}[source]▶

>>44987769 #

I think most of the people like this in the ML world are extreme specialists (e.g.: bioinformaticians, statisticians, linguists, data scientists) who are "technical" in some ways but aren't really "computer people". They're power users in a sense but they're also prone to strange bouts of computing insanity and/or helplessness.

↑