Where "supporting" a model doesn't mean what you think it means for cpp
Between that and the long saga of vision models having only partial support (a CLI tool, but no llama-server support; they only fixed all of that very recently), the fact of the matter is that ollama is now moving faster and implementing what people want before llama.cpp does.
And it will finally shut down all the people who kept copy-pasting the same criticism of ollama: "it's just a llama.cpp wrapper, why aren't you using llama.cpp instead?"
I went with my own wrapper around llama.cpp and stable-diffusion.cpp, with an optional fallback to a hosted model if I don't like the local result; even then, the local output makes a good starting point for the hosted model to improve on.
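The routing part is roughly this (a minimal sketch, not my actual setup: the URLs, model name, and helper names are placeholders). It assumes the local side is llama-server, which exposes an OpenAI-compatible /v1/chat/completions endpoint:

```python
import requests

LOCAL_URL = "http://localhost:8080/v1/chat/completions"    # llama-server, OpenAI-compatible API
HOSTED_URL = "https://api.example.com/v1/chat/completions"  # placeholder hosted endpoint
HOSTED_KEY = "..."                                          # hosted API key, if needed

def ask(prompt: str, url: str, api_key: str | None = None) -> str:
    """Send a single-turn chat completion request and return the reply text."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    resp = requests.post(url, headers=headers, json={
        "model": "local-model",  # llama-server serves whatever model it was started with
        "messages": [{"role": "user", "content": prompt}],
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def generate(prompt: str, use_hosted: bool = False) -> str:
    """Local first; optionally hand the local draft to a hosted model to improve on."""
    draft = ask(prompt, LOCAL_URL)
    if not use_hosted:
        return draft
    follow_up = f"{prompt}\n\nHere is a draft answer, improve on it:\n{draft}"
    return ask(follow_up, HOSTED_URL, HOSTED_KEY)
```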
The wrapper also obfuscates any requests sent to the hosted model, because why feed them insight into my use case when I just want to double-check the local AI's algorithmic choices? The ground-truth relationships that function names and variable names imply are my little secret.
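The obfuscation is nothing fancy: just a reversible rename of identifiers before anything leaves the machine, roughly like this (again a sketch with made-up names; a real version would want a proper parser rather than a regex):

```python
import re

def obfuscate(code: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace meaningful identifiers with neutral ones before sending code to a hosted model.

    Returns the scrubbed code plus the mapping needed to translate the reply back.
    """
    mapping = {name: f"sym_{i}" for i, name in enumerate(names)}
    scrubbed = code
    for original, neutral in mapping.items():
        # \b keeps us from renaming substrings of longer identifiers
        scrubbed = re.sub(rf"\b{re.escape(original)}\b", neutral, scrubbed)
    return scrubbed, mapping

def deobfuscate(text: str, mapping: dict[str, str]) -> str:
    """Map the neutral names in the hosted model's reply back to the real ones."""
    for original, neutral in mapping.items():
        text = re.sub(rf"\b{neutral}\b", original, text)
    return text

# usage: scrub before asking the hosted model, restore names in its answer
snippet = "def price_after_loyalty_discount(customer, cart): ..."
scrubbed, mapping = obfuscate(snippet, ["price_after_loyalty_discount", "customer", "cart"])
# send `scrubbed` to the hosted endpoint, then:
# answer = deobfuscate(hosted_reply, mapping)
```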