
113 points sethkim | 3 comments
1. incomingpain ◴[] No.44457855[source]
I think this is the big thing that really surprised me.

Llama 4 Maverick is a 128-expert MoE with 17B active parameters, roughly 400B parameters total, about 245GB at a typical 4-bit quant.

Llama 4 Behemoth is a 16-expert MoE with 288B active parameters, roughly 2 trillion parameters total.

I don't have the resources to test these myself, unfortunately, but Meta claims Behemoth beats the best SaaS options on its internal benchmarks.

For comparison, DeepSeek R1 671B is about 404GB on disk, with broadly similar benchmark results.
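
Rough arithmetic behind those on-disk numbers, as a sketch (the bits-per-weight values are assumptions for typical ~4-bit GGUF quants): size is roughly parameters times bits-per-weight divided by 8.

    # On-disk size in GB ~= params (in billions) * bits-per-weight / 8
    # The bpw values are assumptions for common ~4-bit quantizations.
    def approx_gb(params_billions: float, bpw: float) -> float:
        return params_billions * bpw / 8

    print(approx_gb(671, 4.8))  # DeepSeek R1 671B -> ~403 GB
    print(approx_gb(400, 4.9))  # Llama 4 Maverick -> ~245 GB
    print(approx_gb(32, 5.0))   # a 32B model      -> ~20 GB, fits a 24GB card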

But compare the DeepSeek R1 32B distill to any model from 2021 and it's significantly superior.

So model quality keeps increasing while the resources needed keep decreasing. In 5-10 years, do we get an LLM that loads onto a 16-32GB video card and is simply capable of doing it all?

replies(2): >>44458059 #>>44460104 #
2. sethkim ◴[] No.44458059[source]
My two cents here is the classic answer: it depends. If you need general "reasoning" capabilities, I see this as a strong possibility. If you need specific, factual information baked into the weights themselves, you'll need something large enough to store that data.

I think the best of both worlds is a sufficiently capable reasoning model with access to external tools and data that can perform CPU-based lookups for information that it doesn't possess.
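
A minimal sketch of that loop (call_model and lookup_population here are hypothetical stand-ins, not any particular library's API): the model either answers from its own weights or requests a cheap CPU-side lookup.

    # Sketch: small reasoning model + external lookups for facts it
    # doesn't hold in its weights. All names below are stand-ins.
    import json

    def lookup_population(city: str) -> str:
        # Placeholder for a cheap CPU-side lookup (SQLite, search index, ...).
        return json.dumps({"city": city, "population": "placeholder value"})

    TOOLS = {"lookup_population": lookup_population}

    def call_model(messages: list[dict]) -> dict:
        # Stand-in for the local reasoning model: on the user's question it
        # requests a lookup; once the tool result arrives, it answers.
        last = messages[-1]
        if last["role"] == "user":
            return {"tool": "lookup_population", "args": {"city": "Lagos"}}
        return {"content": f"Answer using lookup result: {last['content']}"}

    def run(question: str) -> str:
        messages = [{"role": "user", "content": question}]
        while True:
            reply = call_model(messages)
            if "tool" in reply:  # model wants external data it doesn't possess
                result = TOOLS[reply["tool"]](**reply["args"])
                messages.append({"role": "tool", "content": result})
            else:                # model answered from its own weights/reasoning
                return reply["content"]

    print(run("What is the population of Lagos?"))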

3. ezekiel68 ◴[] No.44460104[source]
"How is our 'Strategic Use of LLM Technology' initiative going, Harris?"

"Sir, I'm delighted to report that the productivity and insights gained outclass anything available from four years ago. We are clearly winning."