(developers.googleblog.com)

602 points emrah | 2 comments | 20 Apr 25 12:22 UTC


wtcactus ◴[20 Apr 25 13:33 UTC] No.43743666[source]▶

They keep mentioning the RTX 3090 (with 24 GB VRAM), but the model is only 14.1 GB.

Shouldn’t it fit a 5060 Ti 16GB, for instance?

1. jsnell ◴[20 Apr 25 13:37 UTC] No.43743691[source]▶

Memory is needed for more than just the parameters, e.g. the KV cache.

replies(1): >>43743879 #

2. cubefox ◴[20 Apr 25 14:12 UTC] No.43743879[source]▶

KV = key-value
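
To put a rough number on the KV cache point: a minimal sketch, assuming a standard decoder-only transformer in which every layer caches keys and values for the full context. The function name and the layer, head, and context values are hypothetical placeholders, not Gemma 3's actual configuration, and the estimate ignores activations and runtime overhead.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate KV cache size, assuming full attention in every layer."""
    # Factor of 2: one tensor for keys and one for values per layer.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_len * batch_size


# Hypothetical configuration, chosen only for illustration.
cache = kv_cache_bytes(num_layers=48, num_kv_heads=8, head_dim=128,
                       context_len=8192)
print(f"{cache / 2**30:.2f} GiB")
```

With these made-up numbers the cache comes to 1.5 GiB at 16-bit precision, on top of the weights, and it grows linearly with context length. That is why a model file smaller than the card's VRAM does not by itself guarantee a fit.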

Gemma 3 QAT Models: Bringing AI to Consumer GPUs