
486 points dbreunig | 1 comments
tromp ◴[] No.41863436[source]
> the 45 trillion operations per second that’s listed in the specs

Such a spec should ideally be accompanied by code demonstrating or approximating the claimed performance. I can't imagine a sports car advertising a 0-100km/h spec of 2.0 seconds where a user is unable to get below 5 seconds.
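
For illustration, here is a rough sketch of what "code approximating the claimed performance" could look like: time a known workload, count its operations, and report the achieved rate as a fraction of the spec. Everything here is an assumption for the sake of the example. The pure-Python `matmul` is only a stand-in workload (a real test would call the vendor's runtime), a multiply-accumulate is counted as 2 operations, and the names `measure_ops_per_sec` and `fraction_of_claim` are made up.

```python
import time

# The "45 trillion operations per second" figure from the quoted spec.
CLAIMED_OPS_PER_SEC = 45e12


def matmul(a, b):
    """Naive matrix multiply; a stand-in workload, not an NPU kernel."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out


def measure_ops_per_sec(n=32, repeats=3):
    """Time an n x n matmul and return achieved ops/s.

    Assumes 2 * n**3 operations per matmul (one multiply and one add per MAC).
    The best of `repeats` runs is used to reduce timing noise.
    """
    a = [[(i + j) % 7 for j in range(n)] for i in range(n)]
    b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    return 2 * n**3 / best


def fraction_of_claim(achieved, claimed=CLAIMED_OPS_PER_SEC):
    """Achieved throughput as a fraction of the advertised figure."""
    return achieved / claimed


if __name__ == "__main__":
    achieved = measure_ops_per_sec()
    print(f"achieved: {achieved:.3e} ops/s")
    print(f"fraction of claimed 45 TOPS: {fraction_of_claim(achieved):.2e}")
```

The point is the shape of the harness rather than the numbers: a spec that ships with something like this lets a user see how far their own workload lands from the headline figure.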

replies(3): >>41863444 #>>41863452 #>>41864522 #
dmitrygr ◴[] No.41863452[source]
Most likely by multiplying the same 128x128 matrix from cache to cache. That gets you perfect MAC utilization with no need to hit memory. It gets you a big number that is not directly a lie: that perf IS attainable, on a useless synthetic benchmark.
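
Some back-of-the-envelope arithmetic shows why a benchmark like that would never need to touch memory. This is only a sketch under stated assumptions: 1-byte (int8) elements, which the comment does not specify, a MAC counted as 2 operations, and three resident matrices (two inputs, one output). The function names are made up for the example.

```python
def matmul_ops(n):
    """Operations in an n x n by n x n matmul, counting each MAC as 2 ops."""
    return 2 * n**3


def working_set_bytes(n, bytes_per_element=1):
    """Bytes for two input matrices plus one output (assumes int8 by default)."""
    return 3 * n * n * bytes_per_element


def seconds_at_peak(n, peak_ops_per_sec=45e12):
    """Time for one matmul if the hardware really sustained the claimed rate."""
    return matmul_ops(n) / peak_ops_per_sec


def ops_per_byte(n, bytes_per_element=1):
    """Operations performed per byte of working set (higher = easier to keep MACs busy)."""
    return matmul_ops(n) / working_set_bytes(n, bytes_per_element)


if __name__ == "__main__":
    n = 128
    print(matmul_ops(n))                # 4194304 ops per matmul
    print(working_set_bytes(n))         # 49152 bytes, i.e. 48 KiB
    print(seconds_at_peak(n))           # roughly 9.3e-08 s, about 93 ns
    print(ops_per_byte(n))              # roughly 85.3 ops per byte
```

Under those assumptions the whole working set is 48 KiB, small enough to stay in on-chip memory, and each byte is reused for about 85 operations, so the MAC units are never waiting on data.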
replies(1): >>41863655 #
1. kmeisthax ◴[] No.41863655[source]
Sounds great for RNNs! /s