(developers.googleblog.com)

602 points emrah | 1 comments | 20 Apr 25 12:22 UTC

api ◴[20 Apr 25 15:32 UTC] No.43744419[source]▶

When I see 32B or 70B models performing similarly to 200+B models, I don’t know what to make of this. Either the latter contain more breadth of information but we have managed to distill latent capabilities to be similar, or the larger models are just less efficient, or the tests are not very good.

replies(2): >>43744582 #>>43744783 #

1. retinaros ◴[20 Apr 25 16:28 UTC] No.43744783[source]▶

>>43744419 #

It's just BS benchmarks. They are all cheating at this point, feeding the test data into the training set. That doesn't mean the LLMs aren't becoming better, but when they all lie...

Gemma 3 QAT Models: Bringing AI to Consumer GPUs