Gemma 3 QAT Models: Bringing AI to Consumer GPUs
(developers.googleblog.com)
602 points | emrah | 1 comment | 20 Apr 25 12:22 UTC
holografix [20 Apr 25 13:27 UTC] No.43743631
>>43743337 (OP)
Could 16 GB of VRAM be enough for the 27B QAT version?
replies(5): >>43743634 >>43743704 >>43743825 >>43744249 >>43756253
jffry [20 Apr 25 13:39 UTC] No.43743704
>>43743631
With `ollama run gemma3:27b-it-qat "What is blue"`, GPU memory usage is just a hair over 20 GB, so no, probably not without a nerfed context window.
replies(1): >>43743804
woadwarrior01 [20 Apr 25 13:58 UTC] No.43743804
>>43743704
Indeed, the default context length in ollama is a mere 2048 tokens.
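One way to raise that default is an ollama Modelfile that layers a larger `num_ctx` over the same weights (a sketch; the derived model name is arbitrary, and the VRAM cost of the larger window applies as discussed above):

```
# Modelfile: same gemma3 QAT weights, larger context window
FROM gemma3:27b-it-qat
PARAMETER num_ctx 8192
```

Then build and run it with `ollama create gemma3-27b-8k -f Modelfile` and `ollama run gemma3-27b-8k`. The context length can also be passed per-request via the `num_ctx` field under `options` in ollama's HTTP API.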