slacker news
Gemma 3 QAT Models: Bringing AI to Consumer GPUs
(developers.googleblog.com)
602 points by emrah | 1 comment | 20 Apr 25 12:22 UTC
holografix | 20 Apr 25 13:27 UTC | No. 43743631
>>43743337 (OP)
Could 16 GB of VRAM be enough for the 27B QAT version?
replies(5): >>43743634 >>43743704 >>43743825 >>43744249 >>43756253
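A rough back-of-envelope estimate suggests why 16 GB is borderline for a 27B model at 4-bit. The numbers below (bits per weight, runtime overhead) are illustrative assumptions, not official Gemma 3 figures:

```python
# Back-of-envelope VRAM estimate for a 4-bit quantized model.
# bits_per_weight and overhead_gb are assumptions, not measured values.

def estimate_vram_gb(n_params_billions, bits_per_weight=4.0, overhead_gb=2.0):
    """Weights at the given bit width, plus a flat allowance for
    the KV cache, activations, and runtime buffers."""
    weights_gb = n_params_billions * 1e9 * (bits_per_weight / 8) / 1e9
    return weights_gb + overhead_gb

# 27B parameters at 4 bits: ~13.5 GB of weights alone,
# so ~15.5 GB total under these assumptions - tight on a 16 GB card.
print(f"{estimate_vram_gb(27):.1f} GB")  # → 15.5 GB
```

Under these assumptions the weights barely fit, leaving little headroom for context; longer contexts would push past 16 GB.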
hskalin | 20 Apr 25 14:02 UTC | No. 43743825
>>43743631
With ollama you can offload a few layers to the CPU if they don't fit in VRAM. This costs some performance, of course, but it's much better than the alternative (everything on the CPU).
replies(2): >>43744666 >>43752342
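A minimal sketch of the partial-offload approach, using ollama's `num_gpu` parameter (which controls how many layers are placed on the GPU); the layer count of 40 is an illustrative guess, not a tuned value:

```shell
# Interactively: cap the number of GPU-resident layers; the rest run on the CPU.
ollama run gemma3:27b-it-qat
# then inside the session:
#   /set parameter num_gpu 40

# Or bake it into a Modelfile (layer count is an assumption - tune for your card):
#   FROM gemma3:27b-it-qat
#   PARAMETER num_gpu 40
# ollama create gemma3-offload -f Modelfile
```

Lowering `num_gpu` until the model loads trades tokens/s for fitting in VRAM, which matches the trade-off described above.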
dockerd | 21 Apr 25 14:22 UTC | No. 43752342
>>43743825
Does it work in LM Studio? Loading 27b-it-qat takes up more than 22 GB on a 24 GB Mac.