Gemma 3 QAT Models: Bringing AI to Consumer GPUs

(developers.googleblog.com)

Show context

rob_c ◴[20 Apr 25 14:01 UTC] No.43743812[source]▶

>>43743337 (OP) #

Given how long between this being released and this community picking up on it... Lol

replies(1): >>43745299 #

1. GaunterODimm ◴[20 Apr 25 17:55 UTC] No.43745299[source]▶

>>43743812 #

2days :/...

replies(1): >>43745387 #

2. rob_c ◴[20 Apr 25 18:09 UTC] No.43745387[source]▶

>>43745299 (TP) #

Given I know people running gemma3 on local devices for over almost a month now this is either a very slow news day or evidence of finger missing the pulse... https://blog.google/technology/developers/gemma-3/

replies(1): >>43745443 #

3. simonw ◴[20 Apr 25 18:17 UTC] No.43745443[source]▶

>>43745387 #

This is new. These are new QAT (Quantization-Aware Training) models released by the Gemma team.

replies(1): >>43745501 #

4. rob_c ◴[20 Apr 25 18:25 UTC] No.43745501{3}[source]▶

>>43745443 #

There's nothing more than an iteration on the topic, gemma3 was smashing local results a month ago and made no waves as it dropped...

replies(1): >>43745530 #

5. simonw ◴[20 Apr 25 18:29 UTC] No.43745530{4}[source]▶

>>43745501 #

Quoting the linked story:

> Last month, we launched Gemma 3, our latest generation of open models. Delivering state-of-the-art performance, Gemma 3 quickly established itself as a leading model capable of running on a single high-end GPU like the NVIDIA H100 using its native BFloat16 (BF16) precision.

> To make Gemma 3 even more accessible, we are announcing new versions optimized with Quantization-Aware Training (QAT) that dramatically reduces memory requirements while maintaining high quality.

The thing that's new, and that is clearly resonating with people, is the "To make Gemma 3 even more accessible..." bit.

replies(1): >>43745596 #

6. rob_c ◴[20 Apr 25 18:37 UTC] No.43745596{5}[source]▶

>>43745530 #

As I've said in my lectures on how to perform 1bit training of QAT systems to build classifiers...

"An iteration on a theme".

Once the network design is proven to work yes it's an impressive technical achievement, but as I've said given I've known people in multiple research institutes and companies using Gemma3 for a month mostly saying they're surprised it's not getting noticed... This is just enabling more users but the none QAT version will almost always perform better...

replies(1): >>43745652 #

7. simonw ◴[20 Apr 25 18:45 UTC] No.43745652{6}[source]▶

>>43745596 #

Sounds like you're excited to see Gemma 3 get the recognition it deserves on Hacker News then.

replies(1): >>43745675 #

8. rob_c ◴[20 Apr 25 18:48 UTC] No.43745675{7}[source]▶

>>43745652 #

No just pointing out the flooding obvious as usual and collecting down votes for it

replies(1): >>43746331 #

9. fragmede ◴[20 Apr 25 20:31 UTC] No.43746331{8}[source]▶

>>43745675 #

Speaking for myself, my downvotes are not because of the content of your arguments, but because your tone is consistently condescending and dismissive. Comments like “just pointing out the flooding obvious” come off as smug and combative rather than constructive.

HN works best when people engage in good faith, stay curious, and try to move the conversation forward. That kind of tone — even when technically accurate — discourages others from participating and derails meaningful discussion.

If you’re getting downvotes regularly, maybe it's worth considering how your comments are landing with others, not just whether they’re “right.”

replies(1): >>43748316 #

10. rob_c ◴[21 Apr 25 03:31 UTC] No.43748316{9}[source]▶

>>43746331 #

My tone only switches once people get uppity. The original comment is on point and accurate, not combative and not insulting (unless the community seriously takes a 'lol'....

Tbh I give up writing that in response to this rant. My polite poke holds and it's non insulting so I'm not going to capitulate to those childish enough to not look inwards.

↑