./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
There is a way to convert to Q8_0, BF16, and F16 without compiling llama.cpp; it's enabled if you use `FastModel` rather than `FastLanguageModel`.
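Roughly what that looks like from the Python side - a hedged sketch only: the model name is just an example, and `save_pretrained_gguf` / `quantization_method` are Unsloth's usual export API, which may differ in your version:

```python
from unsloth import FastModel

# Sketch: load any supported model with FastModel (example model name).
model, tokenizer = FastModel.from_pretrained("unsloth/Llama-3.2-1B-Instruct")

# With FastModel, exporting to Q8_0 / BF16 / F16 shouldn't require
# compiling llama.cpp first.
model.save_pretrained_gguf("model-gguf", tokenizer, quantization_method="q8_0")
```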
Essentially I try `sudo apt-get` first; if that fails, plain `apt-get`; and if everything fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`.
See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
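A minimal sketch of that fallback chain (package names from the comment above; the function name here is made up, not the actual helper in unsloth-zoo):

```python
import shutil
import subprocess

PACKAGES = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

def install_build_deps():
    """Try `sudo apt-get`, then plain `apt-get`, otherwise give up with instructions."""
    cmds = []
    if shutil.which("sudo"):
        cmds.append(["sudo", "apt-get", "install", "-y", *PACKAGES])
    cmds.append(["apt-get", "install", "-y", *PACKAGES])
    for cmd in cmds:
        try:
            subprocess.run(cmd, check=True)
            return
        except (OSError, subprocess.CalledProcessError):
            continue  # try the next command, or fall through to the error
    raise RuntimeError(
        "Could not install build dependencies. Please run:\n"
        "  sudo apt-get install " + " ".join(PACKAGES)
    )
```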
You just fail and print a nice error message telling the user exactly what they need to do, including the exact apt command (or whatever) they need to run.
I was thinking about whether I could do it during the pip install, or via setup.py, which would do the apt-get instead.
As a fallback, for now I'll probably remove the shell executions and just warn the user.
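The warn-only fallback could be as simple as this sketch:

```python
import warnings

# No shell execution at all: just tell the user the exact command to run.
warnings.warn(
    "llama.cpp build tools were not found, and Unsloth will not run shell "
    "commands to install them. Please run:\n"
    "  sudo apt-get install build-essential cmake curl libcurl4-openssl-dev"
)
```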
Some people may prefer using whatever llama.cpp is in $PATH, and it's okay to support that, though doing so may lead to more spam from confused noob users - they may just have an outdated version lurking in $PATH.
Doing so makes the unsloth wheel platform-dependent; if that's too much of a burden, then maybe you can just package the llama.cpp binary and put it on PyPI, like how the scipy guys maintain https://pypi.org/project/cmake/ on PyPI (yes, you can `pip install cmake`), and then depend on it (maybe in an optional group - I see you already have a lot due to the cuda shit).
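In setup.py terms, that optional group could look something like this sketch (`unsloth-llama` is the hypothetical helper package discussed further down, not something that exists today):

```python
from setuptools import setup

setup(
    name="unsloth",
    # ... existing metadata ...
    extras_require={
        # Optional: pull in a prebuilt llama.cpp binary instead of compiling it.
        "llama": ["unsloth-llama"],
    },
)
```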
I'm still working on it, but sadly I'm not a packaging person so progress has been nearly zero :(
From how I interpreted it, he meant you could create a new python package; this would effectively be the binary you need.
In your current package, you could depend on the new one, and through that - pull in the binary.
This would let you easily decouple your package from the binary, too - so it'd be easy to update the binary to the latest version even without pushing a new version of your original package.
I've maintained release pipelines before and handled packaging in a previous job, but I'm not particularly into the python ecosystem, so take this with a grain of salt. One approach would be:
Pip packages:
* Unsloth: the current package; prefers using unsloth-llama, and falls back to the llama.cpp in $PATH (with an error message as the final fallback if neither exists, promoting installation of unsloth-llama); see the sketch below
* Unsloth-llama: a new package which only bundles the llama.cpp binary
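A sketch of the lookup order the first bullet describes (the `unsloth_llama` module name and its `binary_path()` helper are hypothetical):

```python
import shutil

def find_llama_quantize() -> str:
    """Prefer the bundled binary, fall back to $PATH, else raise with an install hint."""
    try:
        import unsloth_llama  # hypothetical helper package bundling llama.cpp
        return unsloth_llama.binary_path("llama-quantize")
    except ImportError:
        pass
    found = shutil.which("llama-quantize")
    if found is not None:
        return found  # may be an outdated build lurking in $PATH
    raise RuntimeError(
        "llama-quantize not found. Install the prebuilt binary with: "
        "pip install unsloth-llama"
    )
```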
I was trying to see if I could pre-compile some llama.cpp binaries and then save them as a zip file (I'm a noob, sorry) - but I definitely need to investigate further how to package binaries for pip.
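Zipping binaries is roughly what a wheel already is; a minimal setup.py that ships a pre-compiled binary as package data might look like this (sketch only, package and file names hypothetical):

```python
from setuptools import setup

setup(
    name="unsloth-llama",
    version="0.1.0",
    packages=["unsloth_llama"],
    # Ship the pre-compiled llama.cpp binaries inside the wheel.
    package_data={"unsloth_llama": ["bin/llama-quantize", "bin/llama-cli"]},
    include_package_data=True,
    # Note: the wheel must be tagged per-platform (e.g. manylinux) since it
    # contains native binaries; cibuildwheel usually handles that.
)
```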
* Try to find a prebuilt binary and download it.
* If that fails, see if you can compile from source when a compiler is installed.
* If there is no compiler: prompt to install one via sudo apt and explain why, and also give the option to abort so the user can install a compiler themselves.
This isn't perfect, but it limits the cases where prompting is necessary.
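Putting those three steps together in a rough sketch - the prebuilt URL is a placeholder, and it assumes the llama.cpp sources are already cloned next to the script:

```python
import os
import shutil
import subprocess
import urllib.request

PREBUILT_URL = "https://example.com/llama-cli"  # placeholder, not a real release URL

def obtain_llama_cli() -> str:
    # 1. Try to download a prebuilt binary.
    try:
        urllib.request.urlretrieve(PREBUILT_URL, "llama-cli")
        os.chmod("llama-cli", 0o755)
        return "./llama-cli"
    except OSError:
        pass
    # 2. Compile from source if a compiler is already installed.
    if shutil.which("cc") or shutil.which("gcc") or shutil.which("clang"):
        subprocess.run(["cmake", "-S", "llama.cpp", "-B", "build"], check=True)
        subprocess.run(["cmake", "--build", "build", "--config", "Release"], check=True)
        return "build/bin/llama-cli"
    # 3. No compiler: explain why, ask before touching the system, allow aborting.
    answer = input(
        "No C compiler found; llama.cpp needs one to build llama-cli.\n"
        "Run `sudo apt-get install build-essential cmake`? [y/N] "
    )
    if answer.strip().lower() != "y":
        raise SystemExit("Aborted: install a compiler yourself and re-run.")
    subprocess.run(
        ["sudo", "apt-get", "install", "-y", "build-essential", "cmake"], check=True
    )
    return obtain_llama_cli()
```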
But I think that, similarly, for uv we need a setup.py for packaging binaries (which is more complex).