

486 points dbreunig | 12 comments
eightysixfour ◴[] No.41863546[source]
I thought the purpose of these things was not to be fast, but to be able to run small models with very little power usage? I have a newer AMD laptop with an NPU, and my power usage doesn't change when using the video effects that supposedly run on it, but it goes up when using the Nvidia Studio effects.

It seems like the NPUs are for very optimized models that do small tasks, like eye contact, background blur, autocorrect models, transcription, and OCR. In particular, on Windows, I assumed they were running the full screen OCR (and maybe embeddings for search) for the rewind feature.

replies(7): >>41863632 #>>41863779 #>>41863821 #>>41863886 #>>41864628 #>>41864828 #>>41869772 #
boomskats ◴[] No.41863779[source]
That's especially true because yours is a Xilinx FPGA. The one they've just attached to the latest-gen mobile Ryzens is 5x more capable too.

AMD are doing some fantastic work at the moment; they just don't seem to be shouting about it. This one is particularly interesting: https://lore.kernel.org/lkml/DM6PR12MB3993D5ECA50B27682AEBE1...

edit: not an FPGA. TIL. :'(

replies(5): >>41863852 #>>41863876 #>>41864048 #>>41864435 #>>41865733 #
errantspark ◴[] No.41863852[source]
Wait sorry back up a bit here. I can buy a laptop that has a daughter FPGA in it? Does it have GPIO??? Are we seriously building hardware worth buying again in 2024? Do you have a link?
replies(2): >>41863959 #>>41864293 #
1. eightysixfour ◴[] No.41863959[source]
It isn't as fun as you think - they are set up for specific use cases and quite small. Here's a link to the software page: https://ryzenai.docs.amd.com/en/latest/index.html

The teeny-tiny "NPU," which is actually an FPGA, is 10 TOPS.

Edit: I've been corrected, not an FPGA, just an IP block from Xilinx.

replies(2): >>41864036 #>>41864062 #
2. wtallis ◴[] No.41864036[source]
It's not an FPGA. It's an NPU IP block from the Xilinx side of the company. It was presumably originally developed to be run on a Xilinx FPGA, but that doesn't mean AMD did the stupid thing and actually fabbed an FPGA fabric instead of properly synthesizing the design for their laptop ASIC. Xilinx involvement does not automatically mean it's an FPGA.
replies(2): >>41864064 #>>41864111 #
3. boomskats ◴[] No.41864062[source]
Yes, the one on the Ryzen 7000 chips like the 7840U isn't massive, but that's the last-gen model. The one they've just released with the HX370 chip is estimated at 50 TOPS, which is better than Qualcomm's ARM flagship that this post is about. It's a fivefold improvement in a single generation; it's pretty exciting.

A̵n̵d̵ ̵i̵t̵'̵s̵ ̵a̵n̵ ̵F̵P̵G̵A̵ It's not an FPGA

replies(1): >>41864248 #
4. eightysixfour ◴[] No.41864064[source]
Thanks for the correction, edited.
5. boomskats ◴[] No.41864111[source]
Do you have any more reading on this? How come the XDNA drivers depend on Xilinx' XRT runtime?
replies(2): >>41864232 #>>41864296 #
6. almostgotcaught ◴[] No.41864232{3}[source]
Because XRT has a plugin architecture: XRT <- shim plugin <- kernel driver. The shims register themselves with XRT. The XDNA driver repo houses both the shim and the kernel driver.
replies(1): >>41864611 #
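The layering described above (a core runtime that loads vendor shims, which in turn talk to a kernel driver) can be sketched as a generic plugin registry. Everything below is illustrative: the class and method names are hypothetical, not XRT's actual API.

```python
# Illustrative sketch of a runtime <- shim plugin <- kernel driver
# layering, loosely modeled on the XRT/XDNA description above.
# All names here are hypothetical, not the real XRT interfaces.

class Runtime:
    """Core runtime with no compile-time knowledge of any backend."""

    def __init__(self):
        self._shims = {}

    def register_shim(self, name, shim):
        # Shims register themselves with the runtime when loaded.
        self._shims[name] = shim

    def submit(self, device, workload):
        # Dispatch through the shim that claimed this device family.
        return self._shims[device].run(workload)


RUNTIME = Runtime()


class XdnaShim:
    """Vendor shim: translates runtime calls into kernel-driver calls."""

    def run(self, workload):
        # A real shim would ioctl() into the kernel driver here.
        return f"xdna driver executed: {workload}"


# The shim module registers itself on load, so adding a new backend
# never requires changes to the core runtime.
RUNTIME.register_shim("xdna", XdnaShim())

print(RUNTIME.submit("xdna", "matmul"))
```

This is why a single driver repo can house both halves: the shim is the runtime-facing plugin, and the kernel driver is what the shim ultimately calls into.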
7. almostgotcaught ◴[] No.41864248[source]
> And it's an FPGA.

nope it's not.

replies(1): >>41864925 #
8. wtallis ◴[] No.41864296{3}[source]
It would be surprising and strange if AMD didn't reuse the software framework they'd already built for running AI on that IP block when it's instantiated on an FPGA fabric, now that it's hardened in an ASIC.
replies(1): >>41864630 #
9. boomskats ◴[] No.41864611{4}[source]
Thanks, that makes sense.
10. boomskats ◴[] No.41864630{4}[source]
Well, I'm irrationally disappointed, but thanks. Appreciate the correction.
11. boomskats ◴[] No.41864925{3}[source]
I've just ordered myself a jump to conclusions mat.
replies(1): >>41865072 #
12. almostgotcaught ◴[] No.41865072{4}[source]
Lol, during grad school my advisor would frequently cut me off and try to jump to a conclusion while I was explaining something technical, and often enough he was wrong. So I really did buy him one (off eBay or something). He wasn't pleased.