Ok I answered my own question.
It’s several things:
* Cutting-edge code, not overly concerned with optimization
* Code written by scientists, who aren’t known for being the world’s greatest programmers
* The obsession the research world has with using Python
So it's not surprising that there's a lot of low-hanging fruit left to optimize.
You're completely correct that the speed-sensitive parts are written in lower-level libraries, but another way to phrase that is "Python can go really fast, as long as you don't use Python." It also means ML is effectively limited to methods that already exist and have already been implemented in C++, since anything new written in pure Python would be too slow to compete.
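To make that concrete, here's a rough sketch of the gap I mean. It computes the same Euclidean norm twice: once with an explicit Python loop, and once by handing the loop to C-implemented builtins (`sum`, `map`, `operator.mul`), which I'm using as a standard-library stand-in for a compiled library like NumPy. The function names and the input size are just made up for illustration, and the timings will vary by machine and Python version, so treat it as a way to check the claim yourself rather than a benchmark.

```python
import math
import operator
import random
import timeit


def norm_python_loop(values):
    """Euclidean norm where every multiply/add runs in the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return math.sqrt(total)


def norm_c_builtins(values):
    """Same computation, but the loop itself runs inside C-implemented builtins."""
    return math.sqrt(sum(map(operator.mul, values, values)))


# Hypothetical workload: 100,000 random floats with a fixed seed.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

loop_time = timeit.timeit(lambda: norm_python_loop(data), number=5)
builtin_time = timeit.timeit(lambda: norm_c_builtins(data), number=5)
print(f"Python loop: {loop_time:.4f}s, C builtins: {builtin_time:.4f}s")
```

On CPython I'd expect the second version to come out ahead, and a real array library to widen the gap further, but the point stands either way: the fast path is the one where Python isn't the thing doing the looping.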
There are lots of languages that make good tradeoffs between performance and usability. Python is not one of them. It is, at best, only slightly harder to use than Julia, yet orders of magnitude slower.