
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points by wertyk | 13 comments
danielhanchen No.44978800
For local runs, I made some GGUFs! You need around RAM + VRAM >= 250GB for good performance with the dynamic 2-bit quant (2-bit MoE layers, 6-8-bit for the rest) - you can also do SSD offloading, but it'll be slow.

./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"

More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1

replies(6): >>44979837 #>>44980406 #>>44981373 #>>44982860 #>>44984274 #>>44987809 #
pshirshov No.44979837
By the way, I'm wondering why unsloth (a goddamn Python library) tries to run apt-get with sudo (and fails on my NixOS). Like how tf are we supposed to use that?
replies(2): >>44980068 #>>44981691 #
danielhanchen No.44980068
Oh hey, I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence the apt-get calls and compiling llama.cpp from within Python.

There is a way to convert to Q8_0, BF16, or F16 without compiling llama.cpp, and it's enabled if you use `FastModel` rather than `FastLanguageModel`.

Essentially I try `sudo apt-get`; if that fails, plain `apt-get`; and if everything fails, it just errors out. We need `build-essential cmake curl libcurl4-openssl-dev`. A rough sketch of that chain is below.

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
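
A minimal sketch of that fallback chain (not the actual unsloth-zoo code linked above; the package list comes from this comment, everything else is illustrative):

  import shutil
  import subprocess

  PACKAGES = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

  def apt_install(packages):
      """Try `sudo apt-get install`, then plain `apt-get install`; report success."""
      if shutil.which("apt-get") is None:
          return False  # not a Debian/Ubuntu-style system
      for prefix in (["sudo"], []):
          if prefix and shutil.which(prefix[0]) is None:
              continue  # skip the sudo attempt when sudo itself is absent
          cmd = [*prefix, "apt-get", "install", "-y", *packages]
          if subprocess.run(cmd).returncode == 0:
              return True
      return False

  if not apt_install(PACKAGES):
      raise RuntimeError(f"Could not install {PACKAGES}; please install them manually.")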

replies(5): >>44980567 #>>44980608 #>>44980665 #>>44982700 #>>44983011 #
pxc No.44980665
It seems Unsloth is useful and popular, and you seem responsive and helpful. I'd be down to try to improve this and maybe package Unsloth for Nix as well, if you're up for reviewing and answering questions; seems fun.

Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:

  - see if llama.cpp is on the PATH and already has the requisite features
  - if not, check /etc/os-release to determine distro
  - if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
  - bail, explain the problem to the user, and give copy/paste-friendly instructions if we managed to figure out where we're running (sketched below)
Is either sort of change potentially agreeable enough that you'd be happy to review it?
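
A rough sketch of that detection flow (the `llama-quantize` binary name and all function names here are illustrative assumptions, not Unsloth's actual API):

  import shutil
  import sys

  def llama_cpp_on_path():
      """Step 1: see if a llama.cpp binary is already on the PATH."""
      # A real check would also probe the binary for the required fork's features.
      return shutil.which("llama-quantize") is not None

  def detect_distro():
      """Step 2: read the distro ID from /etc/os-release, if it exists."""
      try:
          with open("/etc/os-release") as f:
              for line in f:
                  if line.startswith("ID="):
                      return line.split("=", 1)[1].strip().strip('"')
      except FileNotFoundError:
          pass
      return None

  def guess_package_manager():
      """Step 3: guess the distro family from package managers on the PATH."""
      for pm in ("apt", "dnf", "yum", "zypper", "pacman"):
          if shutil.which(pm):
              return pm
      return None

  if not llama_cpp_on_path():
      platform = detect_distro() or guess_package_manager()
      msg = "A llama.cpp build with quantization support is required but was not found."
      if platform in ("ubuntu", "debian", "apt"):
          msg += "\nTry: sudo apt-get install build-essential cmake curl libcurl4-openssl-dev"
      sys.exit(msg)  # step 4: bail with copy/paste-friendly instructions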
replies(2): >>44980750 #>>44980820 #
danielhanchen No.44980750
Thanks for the suggestions! Apologies again, I'm pretty bad at packaging, hence the current setup.

1. So I added a `check_llama_cpp`, which checks whether llama.cpp already exists and uses the prebuilt one if so: https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

2. Yes, I like the idea of determining the distro

3. Agreed on bailing - I was also thinking of doing a Python input() with a 30-second waiting period for apt-get, if that's OK? We tell the user we will apt-get some packages (only if apt exists, and no sudo), and after 30 seconds it'll just error out (sketch below)

4. I will remove sudo immediately (i.e. now), and temporarily just do (3)

But I'm more than happy to fix this ASAP - again, sorry for being dumb
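
A minimal sketch of that timed prompt (assumes Unix, since it uses select on stdin; a timeout counts as a decline and errors out, per point 3):

  import select
  import subprocess
  import sys

  PACKAGES = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

  def ask_with_timeout(question, timeout=30.0):
      """Ask a yes/no question on stdin; treat `timeout` seconds of silence as no."""
      print(f"{question} [y/N] (auto-declines in {int(timeout)}s)", flush=True)
      ready, _, _ = select.select([sys.stdin], [], [], timeout)  # Unix-only
      if not ready:
          return False  # timed out: fall through to the error path
      return sys.stdin.readline().strip().lower() in ("y", "yes")

  if ask_with_timeout(f"About to run: apt-get install {' '.join(PACKAGES)}. Proceed?"):
      subprocess.run(["apt-get", "install", "-y", *PACKAGES], check=True)  # no sudo
  else:
      sys.exit(f"Build tools missing. Please run: apt-get install {' '.join(PACKAGES)}")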

replies(1): >>44982623 #
mkl No.44982623
It shouldn't install any packages itself. Just print out a message about the missing packages and your guess of the command to install them, then exit. That way users can run the command themselves if it's appropriate or add the packages to their container build or whatever. People set up machines in a lot of different ways, and automatically installing things is going to mess that up.
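
A print-and-exit sketch of that behaviour (the package list comes from upthread; the apt-get command is just a best-effort guess shown to the user):

  import shutil
  import sys

  PACKAGES = "build-essential cmake curl libcurl4-openssl-dev"

  # Assume we already determined the build dependencies are missing.
  if shutil.which("apt-get"):
      sys.exit(f"Missing build dependencies. Try:\n  sudo apt-get install {PACKAGES}")
  sys.exit(f"Missing build dependencies ({PACKAGES}); install them via your package manager.")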
replies(2): >>44982744 #>>44984736 #
danielhanchen No.44982744
Hmmm so I should get rid of the asking / permissions message?
replies(1): >>44982786 #
mkl No.44982786
Yes, since you won't actually need the permissions.
replies(1): >>44982932 #
danielhanchen No.44982932
Hmmm, I'm worried people really won't get how to install / compile / use the terminal, hence I thought the permission prompt was a compromise solution
replies(2): >>44983399 #>>44984762 #
segmondy No.44983399
Don't listen to this crowd; these are "technical folks". Most of your audience will fail to figure it out. When llama.cpp is missing, you can give them the choice: auto-install it for them, or let them install it themselves and do the manual configuration. I personally won't do the manual route, tho.
replies(2): >>44984057 #>>44987769 #
danielhanchen No.44984057
I think as a compromise I'll keep the permission prompt before installing. I'll definitely try investigating prebuilt binaries though
solarkraft No.44984736
This is an edge-case optimization at the cost of 95% of users.
replies(1): >>44985650 #
solarkraft No.44984762
I think it is - quite a good one, even:

- Determine the command that has to be run by the algorithm above.

This does most of the work a user would otherwise have to do to figure out what has to be installed on their system.

- Ask whether to run the command automatically.

This allows the “software should never install dependencies by itself” crowd to say no and figure out further steps, while allowing people who just want it to work to get on with their task as quickly as possible (and which group do you think is bigger?).

I think it would be fine to print out the command and force the user to run it themselves, but it would bring little material gain at the cost of some of your users’ peace of mind (“oh no it failed, what is it this time ...”).

replies(1): >>44987975 #
mkl No.44985650
95% of users probably won't be using Linux. Most of those who are will have no problem installing dependencies. There are too many distributions and ways of setting them up for automated package manager use to be the right thing to do. I have never seen a Python package even try.
Computer0 No.44987769
Who do you think the audience is here, if not technical? We're in a discussion about a model that requires over 250GB of RAM to run. I don't know a non-technical person with more than 32GB.
replies(1): >>44991279 #
danielhanchen No.44987975
Oh OK! I would say 50% of people manually install llama.cpp and the other 50% want it to be automated.
pxc No.44991279
I think most of the people like this in the ML world are extreme specialists (e.g. bioinformaticians, statisticians, linguists, data scientists) who are "technical" in some ways but aren't really "computer people". They're power users in a sense, but they're also prone to strange bouts of computing insanity and/or helplessness.