Been using the 27B QAT model for batch processing 50K+ internal documents. The 128K context window is a game-changer for our legal review pipeline. I just wish token generation were faster: at ~20 tok/s it's still too slow for interactive use compared to Claude Opus.
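For anyone sizing a similar batch job, here's a rough back-of-envelope ETA calculator. The 500-token average output per document and the stream counts are hypothetical placeholders, not numbers from my pipeline:

```python
def batch_eta_days(num_docs, avg_output_tokens, tokens_per_sec, parallel_streams=1):
    """Estimate wall-clock days to generate outputs for a document batch.

    Assumes generation throughput dominates (prompt prefill ignored) and
    that throughput scales linearly with the number of parallel streams.
    """
    total_tokens = num_docs * avg_output_tokens
    seconds = total_tokens / (tokens_per_sec * parallel_streams)
    return seconds / 86400  # seconds per day

# 50K docs, hypothetical 500-token output each, 20 tok/s per stream
print(round(batch_eta_days(50_000, 500, 20), 1))     # single stream -> 14.5
print(round(batch_eta_days(50_000, 500, 20, 8), 1))  # 8 streams -> 1.8
```

At single-stream speed the batch takes about two weeks, which is why parallel request streams (or continuous batching on the server side) matter far more than per-stream tok/s for offline workloads like this.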