172 points by marban | 3 comments

InTheArena:
While everyone has focused on the power efficiency of Apple's M-series chips, one thing that has been very interesting is how powerful the unified memory model (memory on-package with the CPU, with large bandwidth to that memory) actually is. Hence a lot of people in the local LLaMA community are really going after high-memory Macs.
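
A back-of-envelope sketch of why that bandwidth matters (illustrative numbers, not benchmarks; the helper name is made up): autoregressive decoding has to stream essentially all of the weights from memory for every generated token, so memory bandwidth divided by model size gives a rough ceiling on tokens per second.

    # Rough decode-speed ceiling when inference is memory-bandwidth bound:
    # each generated token reads every weight once.
    def est_tokens_per_sec(bandwidth_gb_s, params_b, bytes_per_param):
        model_gb = params_b * bytes_per_param
        return bandwidth_gb_s / model_gb

    # Illustrative: ~800 GB/s unified memory (M2 Ultra class),
    # 70B params quantized to 4 bits (~0.5 bytes/param -> ~35 GB of weights).
    print(est_tokens_per_sec(800, 70, 0.5))  # ~23 tokens/s upper bound

Real throughput lands below that ceiling, but the proportionality is why high-bandwidth unified memory is attractive for local inference.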

It's great to see NPUs here with the new Ryzen cores - but I wonder how effective they will be with off-die memory versus the Apple approach.

That said, it's nothing but great to see these capabilities in something other than an expensive NVIDIA card. Local NPUs may really help with deploying more inferencing capability at the edge.

Edited: sorry, meant on-package.

chaostheory:
What Apple has is great on paper, but it fails to live up to expectations. What's the point of having the RAM to run an LLM locally when the performance is abysmal compared to running it on even a consumer Nvidia GPU? It's a missed opportunity that I hope either the M4 or M5 addresses.
bearjaws:
It's a 25 W processor; how will it ever live up to a 400 W GPU? Also, you can't even run large models on a single RTX 4090, but you can on M-series laptops with enough RAM.

The fact that a laptop can run 70B+-parameter models is a miracle; that's not what the chip was built to do at all.
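
To put numbers on the capacity point (a sketch; the bytes-per-parameter figures are the usual quantization assumptions, and this counts weights only, excluding KV cache and activations):

    # Weight memory alone for an LLM: params * bytes per param.
    def weights_gb(params_b, bytes_per_param):
        return params_b * bytes_per_param

    print(weights_gb(70, 2.0))  # fp16:  ~140 GB -- hopeless on a 24 GB 4090
    print(weights_gb(70, 0.5))  # 4-bit: ~35 GB  -- still > 24 GB, but fits a 64 GB+ Mac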

wongarsu:
It's a valid comparison in the very limited sense of "I have $2000 to spend on a way to run LLMs, should I get an RTX 4090 for the computer I have or should I get a 24 GB MacBook", or "I have $5000, should I get a 48 GB RTX A6000 or a 96 GB MacBook".

Those comparisons are unreasonable in a sense, but they are implied by statements like the GP's "Hence a lot of people in the local LLaMA community are really going after high-memory Macs".
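
To make that trade-off concrete, here's a small sketch combining capacity with a bandwidth-derived decode ceiling (approximate published specs; real-world throughput will be lower, and the MacBook figure assumes M2 Max-class memory):

    # (memory in GB, memory bandwidth in GB/s) -- approximate specs
    options = {
        "RTX 4090":      (24, 1008),
        "RTX A6000":     (48, 768),
        "96 GB MacBook": (96, 400),   # M2 Max-class unified memory
    }
    model_gb = 35  # 70B model at 4-bit
    for name, (mem_gb, bw_gb_s) in options.items():
        if mem_gb >= model_gb:
            print(f"{name}: fits; decode ceiling ~{bw_gb_s / model_gb:.0f} tok/s")
        else:
            print(f"{name}: 70B@4-bit does not fit in {mem_gb} GB")

The 4090 has by far the most bandwidth but can't hold the model at all; the Mac holds it easily but at a lower throughput ceiling, which is exactly the trade-off being argued about.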

0x457:
I think there are two communities:

- the "hobbyists" with $5k GPUs

- people in the industry who have never used anything other than a Mac, and even if they have, explaining to IT that you need a PC with an RTX A6000 48GB instead of a Mac like literally everyone else in the company is a losing battle.

wongarsu:
There is also an important third group:

- people who work outside Silicon Valley, where the entire company uses Windows centrally managed through Active Directory, and explaining to IT that you need a Mac is an uphill battle. So you just submit your request for an RTX A6000 48GB to be added to your existing workstation.

Those people are the intended target customer of the A6000, and there are a lot of them.

0x457:
While there are many people "who work outside Silicon Valley, where the entire company uses Windows centrally managed through Active Directory, and explaining to IT that you need a Mac is an uphill battle", I think these companies either don't care about AI or get into it by acquiring a startup (that runs on Macs).

Companies that run Windows and AD are too busy making sure you move your mouse every 5 minutes while you're on the clock. At least, that is my experience.

MichaelZuo:
There are some genuine, non-Dilbert-esque uses for AD, where there really isn't a viable, similarly performant alternative.