
172 points | marban | 1 comment
InTheArena ◴[] No.40051885[source]
While everyone has focused on the power efficiency of Apple's M-series chips, one thing that has been very interesting is how powerful the unified memory model (with the memory on-package with the CPU) and its large memory bandwidth actually are. Hence a lot of people in the local LLaMA community are really going after high-memory Macs.

It's great to see NPUs here with the new Ryzen cores, but I wonder how effective they will be with off-package memory versus the Apple approach.

That said, it's nothing but great to see these capabilities in something other than an expensive NVIDIA card. Local NPUs may really help with deploying more inference capability at the edge.

Edited - sorry, meant on-package.

replies(8): >>40051950 #>>40052032 #>>40052167 #>>40052857 #>>40053126 #>>40054064 #>>40054570 #>>40054743 #
v1sea ◴[] No.40053126[source]
edit: I was wrong.
replies(3): >>40053352 #>>40053365 #>>40053618 #
1. oflordal ◴[] No.40053365[source]
You can do that on both HIP and CUDA through e.g. hipHostMalloc and the CUDA equivalent (not officially supported on the AMD APUs, but it works in practice). With a discrete GPU, the GPU will access that memory across PCIe, but on an APU it goes full speed to RAM as far as I can tell.
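As a rough sketch of what the comment above describes, here is the CUDA side of the pattern: allocating mapped (zero-copy) host memory with cudaHostAlloc and letting a kernel access it directly; hipHostMalloc plays the analogous role in HIP. This assumes a CUDA-capable device and is illustrative only, not a statement about what any particular APU officially supports.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Kernel that writes through a device pointer aliasing pinned host memory.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main(void) {
    const int n = 1 << 20;

    // Allocate pinned, GPU-mapped host memory. On a discrete GPU the
    // device reaches this buffer over PCIe; on an APU the access goes
    // straight to system RAM.
    float *host;
    cudaHostAlloc(&host, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; i++) host[i] = 1.0f;

    // Get the device-side alias of the same allocation.
    float *dev;
    cudaHostGetDevicePointer(&dev, host, 0);

    scale<<<(n + 255) / 256, 256>>>(dev, n);
    cudaDeviceSynchronize();

    // The kernel wrote directly into host-visible memory; no cudaMemcpy.
    printf("host[0] = %f\n", host[0]);

    cudaFreeHost(host);
    return 0;
}
```

No explicit cudaMemcpy is needed in either direction, which is exactly why this path is attractive on unified-memory designs where "device" and "host" memory are the same DRAM.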