Gemma 3 QAT Models: Bringing AI to Consumer GPUs
(developers.googleblog.com)
602 points | emrah | 1 comment | 20 Apr 25 12:22 UTC
holografix [20 Apr 25 13:27 UTC] No.43743631
>>43743337 (OP)
Could 16 GB of VRAM be enough for the 27B QAT version?
replies(5): >>43743634 >>43743704 >>43743825 >>43744249 >>43756253
jffry [20 Apr 25 13:39 UTC] No.43743704
>>43743631
With `ollama run gemma3:27b-it-qat "What is blue"`, GPU memory usage is just a hair over 20 GB, so no, probably not without a nerfed context window.
replies(1): >>43743804
woadwarrior01 [20 Apr 25 13:58 UTC] No.43743804
>>43743704
Indeed, the default context length in ollama is a mere 2048 tokens.
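One way to raise that default is an ollama Modelfile that layers a larger `num_ctx` over the same weights (a sketch; the derived model name is arbitrary, and the VRAM cost of the larger window applies as discussed above):

```
# Modelfile: same gemma3 QAT weights, larger context window
FROM gemma3:27b-it-qat
PARAMETER num_ctx 8192
```

Then build and run it with `ollama create gemma3-27b-8k -f Modelfile` and `ollama run gemma3-27b-8k`. The context length can also be passed per-request via the `num_ctx` field under `options` in ollama's HTTP API.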