486 points dbreunig | 5 comments
eightysixfour No.41863546
I thought the purpose of these things was not to be fast, but to be able to run small models with very little power usage? I have a newer AMD laptop with an NPU, and my power usage doesn't change using the video effects that supposedly run on it, but goes up when using the nvidia studio effects.

It seems like the NPUs are for very optimized models that do small tasks, like eye contact, background blur, autocorrect models, transcription, and OCR. In particular, on Windows, I assumed they were running the full-screen OCR (and maybe embeddings for search) for the Recall feature.

replies(7): >>41863632 #>>41863779 #>>41863821 #>>41863886 #>>41864628 #>>41864828 #>>41869772 #
1. eightysixfour No.41863976
The 7940HS shipped before Recall and doesn't support it because it is not performant enough, so that doesn't make sense.

I just gave you a use case: mine in particular uses it for background blur and eye-contact filters with the webcam, and uses essentially no power to do it. If I do the same filters with Nvidia Broadcast, the power usage is dramatically higher.

replies(2): >>41864053 #>>41864126 #
2. wtallis No.41864053
Intel is also about to launch its first desktop processors with an NPU, which falls far short of Microsoft's performance requirements for a "Copilot+ PC". It should still be plenty for webcam use.
3. moffkalast No.41864126
I doubt there's no notable power draw; NPUs in general have always pulled a handful of watts, which should roughly match a modern CPU's idle draw. But it does seem odd that your power usage doesn't change at all; it might be always powered on or something.

Eye-contact filters seem like a horrible thing, autocorrect with a tiny model won't work better than a dictionary, and I doubt these things can come even close to running Whisper for decent voice transcription. Background blur, alright, but that's kind of stretching it. I always figured Zoom/Teams did these things server-side anyway.

And alright, if it's not MS making them do it, then they're just chasing the fad themselves while also shipping subpar hardware. Not sure if that makes it better.

replies(2): >>41864298 #>>41866381 #
4. Dylan16807 No.41864298
> I doubt these things can come even close to running whisper for decent voice transcription.

Whisper runs at almost realtime on a single core of my very old CPU. I'd be very surprised if it couldn't fit on an NPU.
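To pin down what "almost realtime" means, a realtime factor can be computed; the numbers below are hypothetical examples, not measurements from this comment:

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Ratio of audio duration to processing time.

    Values above 1.0 mean transcription runs faster than realtime.
    """
    return audio_seconds / processing_seconds

# Hypothetical: 60 s of audio transcribed in 75 s on one CPU core
# gives a factor just under 1.0, i.e. "almost realtime".
print(realtime_factor(60.0, 75.0))  # 0.8
```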

5. kalleboo No.41866381
> I doubt these things can come even close to running whisper for decent voice transcription

https://github.com/ggerganov/whisper.cpp/pull/566

"The performance gain is more than x3 compared to 8-thread CPU"

And this is on the three-year-old M1 Pro.
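Taking the quoted figure at face value, a more-than-3x gain means the Core ML path finishes in under a third of the 8-thread CPU time. Illustrative arithmetic only; the 30-second clip below is a made-up number, not from the PR:

```python
def accelerated_time(cpu_seconds: float, speedup: float = 3.0) -> float:
    """Processing time implied by a claimed speedup over the CPU path."""
    return cpu_seconds / speedup

# Hypothetical: a clip the 8-thread CPU path transcribes in 30 s
# would take about 10 s at the quoted 3x gain.
print(accelerated_time(30.0))  # 10.0
```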