
486 points dbreunig | 2 comments
eightysixfour ◴[] No.41863546[source]
I thought the purpose of these things was not to be fast, but to be able to run small models with very little power usage? I have a newer AMD laptop with an NPU, and my power usage doesn't change when using the video effects that supposedly run on it, but it goes up when using the Nvidia Studio effects.

It seems like the NPUs are for heavily optimized models that do small tasks, like eye-contact correction, background blur, autocorrect, transcription, and OCR. In particular, on Windows, I assumed they were running the full-screen OCR (and maybe embeddings for search) for the Recall feature.
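
(If you want to sanity-check the power/speed split yourself, here's a minimal sketch using ONNX Runtime; it assumes the onnxruntime-qnn build on a Snapdragon Windows box and a placeholder model.onnx with a typical 224x224 vision input. It only measures latency, so the NPU's win, if any, shows up on a power meter, not the clock.)

    import time
    import numpy as np
    import onnxruntime as ort

    def mean_latency(providers, runs=100):
        # "model.onnx" is a placeholder for whatever small model you test;
        # the input shape below assumes a typical 224x224 vision model.
        sess = ort.InferenceSession("model.onnx", providers=providers)
        inp = sess.get_inputs()[0]
        x = np.random.rand(1, 3, 224, 224).astype(np.float32)
        sess.run(None, {inp.name: x})  # warm-up; graph compilation happens here
        start = time.perf_counter()
        for _ in range(runs):
            sess.run(None, {inp.name: x})
        return (time.perf_counter() - start) / runs

    # QNN targets the Hexagon NPU; ORT falls back to CPU if it's unavailable.
    print("NPU:", mean_latency(["QNNExecutionProvider", "CPUExecutionProvider"]))
    print("CPU:", mean_latency(["CPUExecutionProvider"]))

Watch wall power or battery drain while it loops; the interesting result is the NPU run drawing a few watts less at similar latency.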

replies(7): >>41863632 #>>41863779 #>>41863821 #>>41863886 #>>41864628 #>>41864828 #>>41869772 #
refulgentis ◴[] No.41863821[source]
You're absolutely right, IMO, given what I heard when launching on-device speech recognition on Pixel and, after leaving Google, what I see from e.g. the Apple Neural Engine vs. the CPU when running ONNX stuff.

I'm a bit suspicious of the article's specific conclusion, because it relies on Qualcomm's ONNX stack, and that may be out of date. Also, Android loved talking shit about Qualcomm software engineering.

That being said, it's directionally correct, inasmuch as consumer-hardware AI acceleration claims are near-universally BS unless you're A) writing 1P software or B) someone in the 1P really wants you to take advantage of it.
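
(Concretely, the ANE-vs-CPU comparison looks something like this with ONNX Runtime's CoreML execution provider; a sketch with a placeholder model path, not the article's method. Note that Core ML, not you, decides whether ops actually land on the Neural Engine, which is the 1P-gatekeeping point in practice.)

    import onnxruntime as ort

    # Request Core ML first; ORT silently partitions the graph and runs
    # unsupported ops on the CPU, which is often where the "accelerated"
    # path quietly loses to a plain CPU run.
    sess = ort.InferenceSession(
        "model.onnx",  # placeholder model path
        providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
    )
    print(sess.get_providers())  # shows which providers were actually enabled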

replies(1): >>41864564 #
kristianp ◴[] No.41864564[source]
1P?
replies(1): >>41864574 #
refulgentis ◴[] No.41864574[source]
First party, i.e. Google/Apple/Microsoft.