
172 points marban | 10 comments
InTheArena ◴[] No.40051885[source]
While everyone has focused on Apple's power efficiency on the M-series chips, one thing that has been very interesting is how powerful the unified memory model (memory on-package with the CPU), with its large memory bandwidth, actually is. Hence a lot of people in the local LLaMA community are really going after high-memory Macs.

It's great to see NPUs here with the new Ryzen cores - but I wonder how effective they will be with off-die memory versus the Apple approach.

That said, it's nothing but great to see these capabilities in something other than an expensive NVIDIA card. Local NPUs may really help with deploying more conferencing capabilities at the edge.

Edited - sorry, meant on-package.

replies(8): >>40051950 #>>40052032 #>>40052167 #>>40052857 #>>40053126 #>>40054064 #>>40054570 #>>40054743 #
chaostheory ◴[] No.40052032[source]
What Apple has is theoretically great on paper, but it fails to live up to expectations. What's the point of having the RAM for running an LLM locally when the performance is abysmal compared to running it on even a consumer Nvidia GPU? It’s a missed opportunity that I hope either the M4 or M5 addresses.
replies(8): >>40052327 #>>40052344 #>>40052929 #>>40053695 #>>40053835 #>>40054577 #>>40054855 #>>40056153 #
InTheArena ◴[] No.40052327[source]
The performance of Ollama on my M1 Max is pretty solid - and it does things that my 2070 GPU can't do because of memory.
replies(1): >>40052675 #
1. dangus ◴[] No.40052675[source]
Not that I don’t believe you, but the 2070 is two generations and 5 years old. Maybe a comparison to a 4000 series would be more appropriate?
replies(2): >>40052731 #>>40052773 #
2. Kirby64 ◴[] No.40052731[source]
The M1 Max is also 2 generations old, and ~3 years old at this point. Seems like a fair comparison to me.
replies(2): >>40052845 #>>40052863 #
3. Teever ◴[] No.40052773[source]
Well, you know that it would still be able to do more than a 4000 series GPU from Nvidia, because you can have more system memory in a Mac than you can have video RAM in a 4000 series GPU.
replies(1): >>40052862 #
4. dangus ◴[] No.40052845[source]
The 4000 series was still a much bigger generational leap.

The M3 Max has something like 33% faster overall graphics performance than the M1 Max (average benchmark), while the 4090 is something like 138% faster than the 2080 Ti.

Depending on which 2070 and 4070 models you compare, the difference is similar: close to or exceeding a 100% uplift.

replies(1): >>40055156 #
5. dangus ◴[] No.40052862[source]
Yes, obviously I’m aware that you can throw more RAM at an M-series GPU.

But of course that’s only helpful for specific workflows.

6. talldayo ◴[] No.40052863[source]
Maybe it's controversial, but I don't think comparing 5nm mobile hardware from 2021 against 12nm desktop hardware from 2018 is a fair fight.

And even so, performance-wise, the 2070 wins out by a ~33% margin: https://browser.geekbench.com/opencl-benchmarks

replies(2): >>40053373 #>>40055651 #
7. chessgecko ◴[] No.40053373{3}[source]
For this comparison the generation of chip doesn’t really matter, because the LLM decode (which is the costly step) barely uses any of the compute performance and just needs the model weights to fit in memory.
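
To make that point concrete, here is a rough back-of-the-envelope sketch (not a benchmark). It assumes decode has to read every weight once per generated token, so memory bandwidth sets a ceiling on tokens per second, and the model only runs at all if the weights fit. The function names, the 2 GB overhead allowance, and the example figures (8 GB and 32 GB of memory, 400 GB/s of bandwidth, a 13B model at 4-bit) are illustrative assumptions, not numbers from this thread.

```python
def fits_in_memory(params_billions, bytes_per_param, memory_gb, overhead_gb=2.0):
    """Return True if the weights plus a rough overhead allowance fit in memory.

    overhead_gb is an assumed allowance for KV cache, activations, etc.
    """
    weights_gb = params_billions * bytes_per_param
    return weights_gb + overhead_gb <= memory_gb


def decode_tokens_per_sec_upper_bound(params_billions, bytes_per_param, bandwidth_gb_s):
    """Bandwidth-only ceiling: each token requires reading all weights once."""
    weights_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / weights_gb


# Hypothetical example: 13B parameters quantized to 4 bits (0.5 bytes each).
model = dict(params_billions=13, bytes_per_param=0.5)

print(fits_in_memory(**model, memory_gb=8))    # assumed 8 GB card
print(fits_in_memory(**model, memory_gb=32))   # assumed 32 GB unified memory
print(decode_tokens_per_sec_upper_bound(**model, bandwidth_gb_s=400))
```

Under these assumptions the model does not fit on the 8 GB device no matter how fast its GPU is, which is the situation described upthread. Real throughput will land below the bandwidth ceiling.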
8. whizzter ◴[] No.40055156{3}[source]
Googling power draw, the 4090 goes up to 450 W whilst the 2080 Ti was at 250 W; adjusting for power consumption, the increase is somewhere around 32%. There are some architectural gains and probably optimizations in how the chipset works, but we're not seeing as many amazing generational leaps anymore, regardless of manufacturer/designer.
replies(1): >>40077726 #
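
For anyone checking the arithmetic behind the ~32% figure, a minimal sketch follows. It takes the 138% uplift quoted upthread and the 450 W and 250 W figures from this comment as given, and treats those peak numbers as a stand-in for real power draw, which is a simplification. The function name is made up for illustration.

```python
def perf_per_watt_uplift(perf_uplift_pct, old_watts, new_watts):
    """Percentage gain in performance per watt between two parts."""
    perf_ratio = 1 + perf_uplift_pct / 100   # 138% faster -> 2.38x
    power_ratio = new_watts / old_watts      # 450 W / 250 W -> 1.8x
    return (perf_ratio / power_ratio - 1) * 100


# 2080 Ti -> 4090, using the numbers quoted in the thread.
print(round(perf_per_watt_uplift(138, 250, 450)))  # 32
```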
9. JudasGoat ◴[] No.40055651{3}[source]
I found it interesting that the Apple M3 scored nearly identically to the Radeon 780M. I know the memory bandwidth is slower, but you can add two 32 GB SODIMMs to the AMD APU for short money.
10. dangus ◴[] No.40077726{4}[source]
I’m still seeing over a 100% uplift comparing mobile to mobile on Nvidia products: https://gpu.userbenchmark.com/Compare/Nvidia-RTX-4090-Laptop...

As far as desktop products go, power consumption is irrelevant.