Gemma 3 QAT Models: Bringing AI to Consumer GPUs

1. Alifatisk ◴[20 Apr 25 14:25 UTC] No.43743973[source]▶

Apart from being lighter than the other models, is there anything else the Gemma models are specifically good at, or better at than other models?

replies(3): >>43744015 #>>43744269 #>>43744286 #

2. itake ◴[20 Apr 25 14:31 UTC] No.43744015[source]▶

>>43743973 (TP) #

Google claims to have better multilingual support, due to tokenizer improvements.

3. nico ◴[20 Apr 25 15:08 UTC] No.43744269[source]▶

>>43743973 (TP) #

They are multimodal. Haven’t tried the QAT one yet, but the Gemma 3 models released a few weeks ago are pretty good at processing images and telling you details about what’s in them.

4. Zambyte ◴[20 Apr 25 15:10 UTC] No.43744286[source]▶

>>43743973 (TP) #

I have found Gemma models are able to produce useful information about more niche subjects that other models like Mistral Small cannot. The cost is that Gemma never really says "I don't know" where other models will, and instead produces false information.

For example, if I ask Mistral Small who I am by name, it will say there is no known notable figure by that name before the knowledge cutoff. Gemma 3 will say I am a well-known <random profession> and make up facts. On the other hand, I have asked both about a local organization in my area that I am involved with, and Gemma 3 could produce useful and factual information, where Mistral Small said it did not know.