slacker news
Gemma 3 QAT Models: Bringing AI to Consumer GPUs (developers.googleblog.com)
602 points | emrah | 1 comment | 20 Apr 25 12:22 UTC
holografix | 20 Apr 25 13:27 UTC | No. 43743631
>>43743337 (OP)
Could 16 GB of VRAM be enough for the 27B QAT version?
replies(5): >>43743634 >>43743704 >>43743825 >>43744249 >>43756253
abawany | 21 Apr 25 20:42 UTC | No. 43756253
>>43743631
I tried the 27b-iat model on a 4090 mobile with 16 GB of VRAM, with mostly default args via llama.cpp, and it didn't fit: it used up the VRAM and spilled about 2 GB into system RAM. Performance in this setup was under 5 tokens/s.
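A rough back-of-envelope calculation shows why this is tight: weights alone for a 27B-parameter model at 4 bits/weight land around 12.6 GiB, before any KV cache or runtime buffers. The sketch below is illustrative only; the exact bits-per-weight of the shipped QAT GGUF and the per-context KV-cache cost are assumptions, not measurements.

```python
# Back-of-envelope VRAM estimate for a 27B-parameter model quantized to
# roughly 4 bits per weight. All figures here are illustrative assumptions.

def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Raw weight storage in GiB; ignores KV cache and runtime buffers."""
    return n_params * bits_per_weight / 8 / 2**30

weights = weight_footprint_gib(27e9, 4.0)
print(f"weights alone: {weights:.1f} GiB")  # ~12.6 GiB

# KV cache, CUDA context, and activation buffers typically add a few more
# GiB on top of the weights, which is consistent with a 16 GiB card filling
# up and spilling ~2 GB into system RAM as described above.
```

Once the model overflows VRAM, layers served from system RAM dominate the runtime, which would explain the sub-5 tokens/s figure.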