
343 points LorenDB | 6 comments
tommica No.44002018
Side tangent: why is Ollama frowned upon by some people? I've never really gotten any explanation other than "you should run llama.cpp yourself"
nicman23 No.44002029
llama.cpp was just faster and had more features, that's all
cwillu No.44002169
llama.cpp is the thing doing all the heavy lifting; ollama is just a library wrapper.

It'd be like HandBrake pretending to have implemented all the video processing work itself, when it depends on ffmpeg's libraries for all of that.

diggan No.44004229
> ollama is just a library wrapper.

Was.

This submission is literally about them moving away from being just a wrapper around llama.cpp :)

buyucu No.44005522
No, they are not. The submission uses ggml, which is llama.cpp.
diggan No.44006311
I think you misunderstand how these pieces fit together. llama.cpp is a library that ships with a CLI and some other tooling; ggml is a library; and Ollama has "runners" (like an "execution engine"). Previously, Ollama used llama.cpp (which uses ggml) as its only runner. Eventually, Ollama wrote their own runner (which also uses ggml) for new models (starting with gemma3, maybe?), while still using llama.cpp for the rest (last time I checked, at least).

ggml != llama.cpp, but llama.cpp and Ollama are both using ggml as a library.

cwillu No.44007638
“The llama.cpp project is the main playground for developing new features for the ggml library” --https://github.com/ggml-org/llama.cpp

“Some of the development is currently happening in the llama.cpp and whisper.cpp repos” --https://github.com/ggml-org/ggml

diggan No.44010025
Yeah, those both make sense. ggml was split out of llama.cpp once they realized it could be useful elsewhere, so while llama.cpp is the "main playground", ggml is still used by other projects as well. That doesn't suddenly make llama.cpp the same thing as ggml; not sure why you'd believe that.