./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel` and not on `FastLanguageModel`
Essentially I try to do `sudo apt-get` if it fails then `apt-get` and if all fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`
See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:
- see if llama.cpp is on the PATH and already has the requisite features
- if not, check /etc/os-release to determine distro
- if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
- bail, explain the problem to the user, give copy/paste-friendly instructions at the end of we managed to figure out where we're running
Is either sort of change potentially agreeable enough that you'd be happy to review it?1. So I added a `check_llama_cpp` which checks if llama.cpp does exist and it'll use the prebuilt one https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
2. Yes I like the idea of determining distro
3. Agreed on bailing - I was also thinking if doing a Python input() with a 30 second waiting period for apt-get if that's ok? We tell the user we will apt-get some packages (only if apt exists) (no sudo), and after 30 seconds, it'll just error out
4. I will remove sudo immediately (ie now), and temporarily just do (3)
But more than happy to fix this asap - again sorry on me being dumb
(1) Removed and disabled sudo
(2) Installing via apt-get will ask user's input() for permission
(3) Added an error if failed llama.cpp and provides instructions to manual compile llama.cpp
I would just ask the user to install the package, and _maybe_ show the command line to install it (but never run it).
That said, it does at least seem like these recent changes are a large step in the right direction.
---
* in terms of what the standard approach should be, we live in an imperfect world and package management has been done "wrong" in many ecosystems, but in an ideal world I think the "correct" solution here should be:
(1) If it's an end user tool it should be a self contained binary or it should be a system package installed via the package manager (which will manage any ancillary dependencies for you)
(2) If it's a dev tool (which, if you're cloning a cpp repo & building binaries, it is), it should not touch anything systemwide. Whatsoever.
This often results in a README with manual instructions to install deps, but there are many good automated ways to approach this. E.g. for CPP this is a solved problem with Conan Profiles. However that might incur significant maintenace overhead for the Unsloth guys if it's not something the ggml guys support. A dockerised build is another potential option here, though that would still require the user to have some kind of container engine installed, so still not 100% ideal.
(2) I might make the message on installing llama.cpp maybe more informative - ie instead of re-directing people to the docs on manual compilation ie https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-..., I might actually print out a longer message in the Python cell entirely
Yes we're working on Docker! https://hub.docker.com/r/unsloth/unsloth
That will be nice too, though I was more just referring to simply doing something along the lines of this in your current build:
docker run conanio/gcc11-ubuntu16.04 make clean -C llama.cpp etc etc...
(likely mounting & calling a sh file instead of passing individual commands)---
Although I do think getting the ggml guys to support Conan (or monkey patching your own llama conanfile in before building) might be an easier route.
Quietly installing stuff at runtime is shady for sure, but why not if I consent?
- Determine the command that has to be run by the algorithm above.
This does most of the work a user would have to figure out what has to be installed on their system.
- Ask whether to run the command automatically.
This allows the “software should never install dependencies by itself” crowd to say no and figure out further steps, while allowing people who just want it to work to get on with their task as quickly as possible (who do you think there are more of?).
I think it would be fine to print out the command and force the user to run it themselves, but it would bring little material gain at the cost of some of your users’ peace (“oh no it failed, what is it this time ...”).