./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
./llama.cpp/llama-cli -hf unsloth/DeepSeek-V3.1-GGUF:UD-Q2_K_XL -ngl 99 --jinja -ot ".ffn_.*_exps.=CPU"
More details on running + optimal params here: https://docs.unsloth.ai/basics/deepseek-v3.1
There is a way to convert to Q8_0, BF16, F16 without compiling llama.cpp, and it's enabled if you use `FastModel` and not on `FastLanguageModel`
Essentially I try to do `sudo apt-get` if it fails then `apt-get` and if all fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`
See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...
I chose (1) since it was mainly for ease of use for the user - but I agree it's not a good idea sorry!
:( I also added a section to manually compile llama.cpp here: https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-...
But I agree I should remove apt-gets - will do this asap! Thanks for the suggestions :)
I think that you have removed sudo so this is nice, my suggestion is pretty similar to that of pxc (basically determine different distros and use them as that)
I wonder if we will ever get a working universal package manager in linux, to me flatpak genuinely makes the most sense even sometimes for cli but flatpak isn't built for cli unlike snap which both support cli and gui but snap is proprietory.
I agree on handling different distros - sadly I'm not familiar with others, so any help would be appreciated! For now I'm most familiar with apt-get, but would 100% want to expand out!
Interesting will check flatpak out!
I doubt its efficacy here, they might be more useful if you provide a whole jupyter / browser gui though but a lot o f us run it just in cli so I doubt flatpak.
I didn't mean to say that flatpak was the right tool for this job, I seriously don't know too much to comment and so I'd prefer if you could ask someone definitely experienced regarding it.
My reasoning for flatpak was chunking support (that I think is rare in appimage) and easier gpu integration (I think) compared to docker, though my reasoning might be flawed since flatpak isn't mostly used with cli.