305 points todsacerdoti | 3 comments
1. lubesGordi ◴[] No.44062002[source]
Honestly, it's a little surprising that the first optimization he found was something fairly obvious just from using perf. I thought they had discussed the buffer-zeroing issue in the first post? The second optimization was definitely more involved and interesting, but perf still pointed to it. Don't underestimate that tool!
replies(2): >>44062673 #>>44067686 #
2. sounds ◴[] No.44062673[source]
He came at it from the aarch64 perspective on an Apple device. I often see someone spot an "obvious in hindsight" gap because they come from a different background.
3. Sesse__ ◴[] No.44067686[source]
AFAICS, it wasn't “just perf”; it was doing a differential profile between the C and Rust versions, with manual matching up. (perf diff exists, but can't match across the differing symbol names, and few people seem to use it.)
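For anyone wondering what a differential profile with manual matching could look like in practice, here is a minimal Python sketch. It is not the article author's tooling. It assumes you have already pulled per-symbol overhead percentages out of each profile by hand, and every symbol name and number below is a hypothetical placeholder.

```python
def diff_profiles(c_profile, rust_profile, name_map):
    """Compare two profiles using a hand-written symbol mapping.

    c_profile / rust_profile: dict of symbol name -> overhead percentage.
    name_map: dict of C symbol -> Rust symbol, matched manually.
    Returns rows of (c_sym, rust_sym, c_pct, rust_pct, delta),
    sorted by the largest absolute delta first.
    """
    rows = []
    for c_sym, rust_sym in name_map.items():
        # A symbol missing from a profile counts as 0% overhead.
        c_pct = c_profile.get(c_sym, 0.0)
        rust_pct = rust_profile.get(rust_sym, 0.0)
        rows.append((c_sym, rust_sym, c_pct, rust_pct, rust_pct - c_pct))
    rows.sort(key=lambda row: abs(row[4]), reverse=True)
    return rows


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    c_profile = {"c_func_a": 10.0, "c_func_b": 5.0}
    rust_profile = {"rust::func_a": 25.0, "rust::func_b": 5.0}
    name_map = {"c_func_a": "rust::func_a", "c_func_b": "rust::func_b"}

    for c_sym, rust_sym, c_pct, rust_pct, delta in diff_profiles(
        c_profile, rust_profile, name_map
    ):
        print(f"{c_sym} -> {rust_sym}: {c_pct:.1f}% vs {rust_pct:.1f}% ({delta:+.1f})")
```

The hand-maintained `name_map` is the whole point: it stands in for the matching step that differing symbol names prevent from being automatic.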