Surprisingly, I only came across Francesco's blog this month. I stumbled upon the 2021 post "Speeding up atan2f by 50x" while searching for others who have to reimplement trigonometry in SIMD every other year. I've also enjoyed "Beating the L1 cache with value speculation" from the same year, as well as the 2013 Agda sorting example.
Highly recommend checking out the whole archive: https://mazzo.li/archive.html