./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel` and not on `FastLanguageModel`
Essentially I try to do `sudo apt-get` if it fails then `apt-get` and if all fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`
See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:
- see if llama.cpp is on the PATH and already has the requisite features
- if not, check /etc/os-release to determine distro
- if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
- bail, explain the problem to the user, give copy/paste-friendly instructions at the end of we managed to figure out where we're running
Is either sort of change potentially agreeable enough that you'd be happy to review it?1. So I added a `check_llama_cpp` which checks if llama.cpp does exist and it'll use the prebuilt one https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
2. Yes I like the idea of determining distro
3. Agreed on bailing - I was also thinking if doing a Python input() with a 30 second waiting period for apt-get if that's ok? We tell the user we will apt-get some packages (only if apt exists) (no sudo), and after 30 seconds, it'll just error out
4. I will remove sudo immediately (ie now), and temporarily just do (3)
But more than happy to fix this asap - again sorry on me being dumb