
113 points sethkim | 3 comments
1. incomingpain ◴[] No.44457855[source]
I think this is the big thing that really surprised me.

Llama 4 Maverick is a 128-expert MoE with 17B active parameters, roughly 400B parameters total, about 245GB at a typical 4-bit quant.

Llama 4 Behemoth is a 16-expert MoE with 288B active parameters, roughly 2 trillion parameters total.

I don't have the resources to test these myself, unfortunately, but Meta claims Behemoth beats the best SaaS options on its internal benchmarks.

For comparison, DeepSeek R1 671B is about 404GB on disk, with broadly similar benchmark results.
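
Rough arithmetic behind those on-disk numbers, as a sketch (the bits-per-weight values are assumptions for typical ~4-bit GGUF quants): size is roughly parameters times bits-per-weight divided by 8.

    # On-disk size in GB ~= params (in billions) * bits-per-weight / 8
    # The bpw values are assumptions for common ~4-bit quantizations.
    def approx_gb(params_billions: float, bpw: float) -> float:
        return params_billions * bpw / 8

    print(approx_gb(671, 4.8))  # DeepSeek R1 671B -> ~403 GB
    print(approx_gb(400, 4.9))  # Llama 4 Maverick -> ~245 GB
    print(approx_gb(32, 5.0))   # a 32B model      -> ~20 GB, fits a 24GB card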

But compare the DeepSeek R1 32B distill to any model from 2021 and it's significantly superior.

So model quality keeps increasing while the resources needed keep decreasing. In 5-10 years, do we get an LLM that loads onto a 16-32GB video card and is simply capable of doing it all?

replies(2): >>44458059 #>>44460104 #
2. sethkim ◴[] No.44458059[source]
My two cents here is the classic answer: it depends. If you need general "reasoning" capabilities, I see this as a strong possibility. If you need specific, factual information baked into the weights themselves, you'll need something large enough to store that data.

I think the best of both worlds is a sufficiently capable reasoning model with access to external tools and data that can perform CPU-based lookups for information that it doesn't possess.
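
A minimal sketch of that loop (call_model and lookup_population here are hypothetical stand-ins, not any particular library's API): the model either answers from its own weights or requests a cheap CPU-side lookup.

    # Sketch: small reasoning model + external lookups for facts it
    # doesn't hold in its weights. All names below are stand-ins.
    import json

    def lookup_population(city: str) -> str:
        # Placeholder for a cheap CPU-side lookup (SQLite, search index, ...).
        return json.dumps({"city": city, "population": "placeholder value"})

    TOOLS = {"lookup_population": lookup_population}

    def call_model(messages: list[dict]) -> dict:
        # Stand-in for the local reasoning model: on the user's question it
        # requests a lookup; once the tool result arrives, it answers.
        last = messages[-1]
        if last["role"] == "user":
            return {"tool": "lookup_population", "args": {"city": "Lagos"}}
        return {"content": f"Answer using lookup result: {last['content']}"}

    def run(question: str) -> str:
        messages = [{"role": "user", "content": question}]
        while True:
            reply = call_model(messages)
            if "tool" in reply:  # model wants external data it doesn't possess
                result = TOOLS[reply["tool"]](**reply["args"])
                messages.append({"role": "tool", "content": result})
            else:                # model answered from its own weights/reasoning
                return reply["content"]

    print(run("What is the population of Lagos?"))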

3. ezekiel68 ◴[] No.44460104[source]
"How is our 'Strategic Use of LLM Technology' initiative going, Harris?"

"Sir, I'm delighted to report that the productivity and insights gained outclass anything available from four years ago. We are clearly winning."