
486 points dbreunig | 1 comments
isusmelj ◴[] No.41863460[source]
I think the results show that, in general, the compute is not used well. The CPU taking 8.4 ms and the GPU taking 3.2 ms is a very small gap (about 2.6x). I'd expect more like a 10x to 20x difference here. I'd assume that onnxruntime might be the issue. I think some hardware vendors just release the compute units without shipping proper software support yet. Let's see how fast that will change.

Also, people often mistake the reason for an NPU is "speed". That's not correct. The whole point of the NPU is rather to focus on low power consumption. To focus on speed you'd need to get rid of the memory bottleneck. Then you end up designing your own ASIC with its own memory. The NPUs we see in most devices are part of the SoC around the CPU to offload AI computations. It would be interesting to run this benchmark in an infinite loop on the three devices (CPU, NPU, GPU) and measure power consumption. I'd expect the NPU to be lowest and also best in terms of "ops/watt".
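
A rough sketch of what such a measurement loop could look like in Python. Both `run_once` (one inference on the device under test) and `read_power_watts` (whatever power sensor or external meter is available) are hypothetical callables supplied by the caller; no real power-monitoring API is implied here, and how power is actually read is platform-specific.

```python
import time
from typing import Callable


def measure_efficiency(run_once: Callable[[], None],
                       read_power_watts: Callable[[], float],
                       duration_s: float = 10.0) -> dict:
    """Run `run_once` in a loop for roughly `duration_s` seconds and
    estimate average latency, average power, and inferences per joule.

    `run_once` and `read_power_watts` are assumptions: the caller must
    provide them for the device being tested.
    """
    iterations = 0
    energy_joules = 0.0
    start = time.perf_counter()
    last = start
    while True:
        run_once()
        iterations += 1
        now = time.perf_counter()
        # Integrate power over time (rectangle rule): joules = watts * seconds.
        energy_joules += read_power_watts() * (now - last)
        last = now
        if now - start >= duration_s:
            break
    elapsed = last - start
    return {
        "iterations": iterations,
        "avg_latency_ms": 1000.0 * elapsed / iterations,
        "avg_power_w": energy_joules / elapsed if elapsed > 0 else 0.0,
        # Guard against a sensor that reports zero power.
        "inferences_per_joule": (iterations / energy_joules
                                 if energy_joules > 0 else float("inf")),
    }
```

Multiplying `inferences_per_joule` by the number of operations per inference gives ops per joule, which is the same quantity as ops/s per watt. Running this once per device with the same model would give the comparison described above.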

replies(8): >>41863552 #>>41863639 #>>41864898 #>>41864928 #>>41864933 #>>41866594 #>>41869485 #>>41870575 #
AlexandrB ◴[] No.41863552[source]
> Also, people often mistake the reason for an NPU is "speed". That's not correct. The whole point of the NPU is rather to focus on low power consumption.

I have a sneaking suspicion that the real real reason for an NPU is marketing. "Oh look, NVDA is worth $3.3T - let's make sure we stick some AI stuff in our products too."

replies(8): >>41863644 #>>41863654 #>>41865529 #>>41865968 #>>41866150 #>>41866423 #>>41867045 #>>41870116 #
kmeisthax ◴[] No.41863644[source]
You forget "Because Apple is doing it", too.
replies(1): >>41864659 #
rjsw ◴[] No.41864659[source]
I think other ARM SoC vendors like Rockchip added NPUs before Apple, or at least around the same time.
replies(2): >>41864882 #>>41865956 #
acchow ◴[] No.41864882[source]
I was curious, so I looked it up. Apple's first chip with an NPU was the A11 Bionic in Sept 2017. Rockchip's was the RK1808 in Sept 2019.
replies(2): >>41865370 #>>41865486 #
j16sdiz ◴[] No.41865486[source]
Google's TPU was introduced around the same time as Apple's. Basically everybody knew it could be something around that time; they just didn't know exactly how.
replies(1): >>41866672 #
1. Someone ◴[] No.41866672[source]
https://en.wikipedia.org/wiki/Tensor_Processing_Unit#Product... shows the first one is from 2015 (publicly announced in 2016). It also shows they have a TDP of 75+W.

I can’t find a TDP for Apple’s Neural Engine (https://en.wikipedia.org/wiki/Neural_Engine), but the first version shipped in the iPhone 8, which has a 7 Wh battery. A sustained 75 W draw would empty that battery in under six minutes (7 Wh / 75 W ≈ 0.093 h), so these are targeting different markets.