The community's obsession with Ollama has done real damage to the field, because it is inefficient compared to vLLM. Many people could get far more tok/s than they think, if only they knew the right tools.
It is important to know about both, though, so you can decide between the two for your use case.
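One practical way to compare the two is to measure tokens/sec yourself rather than guess. Both Ollama and vLLM expose an OpenAI-compatible completions endpoint, so a single harness can benchmark either; the sketch below is a minimal, hedged example — the URLs, port numbers, and model name are assumptions to adapt to your own setup.

```python
import json
import time
import urllib.request


def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput in tokens/sec; guards against a non-positive elapsed time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed_s


def benchmark(base_url: str, model: str, prompt: str) -> float:
    """Send one completion request and compute tok/s from the usage field."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": 256,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        usage = json.load(resp)["usage"]
    elapsed = time.perf_counter() - start
    return tokens_per_second(usage["completion_tokens"], elapsed)


# Example usage against local servers (not run here; ports are the
# common defaults but may differ on your machine):
#   benchmark("http://localhost:11434", "llama3", "Say hello.")  # Ollama
#   benchmark("http://localhost:8000", "llama3", "Say hello.")   # vLLM
```

A single-request number like this understates vLLM's main advantage (continuous batching under concurrent load), so for a fair comparison you would also want to fire many requests in parallel.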
Unfortunately, Ollama and vLLM are therefore incomparable at the moment, because vLLM does not support these models yet.
Whatever those keyword things are, they certainly don't seem to be doing any form of RAG.
> Be kind. Don't be snarky.
> Please don't post shallow dismissals, especially of other people's work.
In my opinion, your comment is not in line with these guidelines, especially the claim that sillytavern is the only LLM frontend that matters. Telling the devs of every LLM frontend except sillytavern that their app doesn't matter seems like exactly the kind of shallow dismissal of other people's work the guidelines warn against.