So much ink has been spilled on the question of what kinds of type systems help programmers be productive, but there is no such controversy on the performance side.
At this point in our collective journey toward understanding programming languages and what they can do, the evidence is overwhelming that plenty of useful semantics are intrinsically slower than other useful semantics on any particular piece of hardware. What is slow on a CPU may differ from what is slow on a GPU, but some such difference will always exist, and no amount of (feasible) compiling gets around that problem.
Indeed, that's why we have CPUs and GPUs, soon NPUs, and in the future perhaps other types of dedicated processors: precisely because not all semantics can be implemented equally well, no matter how smart you are.
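To make "intrinsically slower semantics" a bit more concrete, here is a minimal sketch in Python. The names `total` and `Loud` are made up for illustration. The point is only that `+` in the loop is resolved at run time from the operand types, so the same source line can mean integer addition, exact rational arithmetic, string concatenation, or arbitrary user code. An implementation has to check which one applies, or prove that it doesn't need to.

```python
from fractions import Fraction


def total(xs, start=0):
    # Each `+` here is dispatched at run time on the operand types
    # (via __add__ / __radd__), so it is not simply a machine add.
    acc = start
    for x in xs:
        acc = acc + x
    return acc


class Loud:
    """Hypothetical type whose addition does something arbitrary."""

    def __init__(self, n):
        self.n = n

    def __add__(self, other):
        # Deliberately not ordinary addition: adds an extra 1 each time.
        return Loud(self.n + other.n + 1)


ints = total([1, 2, 3])                           # plain integers
big = total([2**62, 2**62])                       # no overflow: ints are arbitrary precision
fracs = total([Fraction(1, 3), Fraction(2, 3)])   # exact rationals
strs = total(["a", "b"], start="")                # string concatenation
loud = total([Loud(1), Loud(2)], start=Loud(0)).n # user-defined __add__
```

A C loop over an `int` array promises none of this flexibility, which is a large part of why it is so much easier to compile into tight machine code.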
- The overwhelming majority of new code is written in high-level languages
- High-level languages have continued to close what small performance gaps remain
- There have been no serious efforts to implement a true low-level language for post-Pentium (superscalar) CPUs, let alone the CPUs of today
- Even GPUs and NPUs are mostly programmed in languages that express much the same semantics as languages for CPUs, relying heavily on compiler optimisation
You can easily observe in any cross-language shootout in 2024 that optimized code bases in the various languages still show performance gradients, and we do not live in a world where you can just start slamming out Python code and expect no performance delta against C.
https://prog21.dadgum.com/40.html
Merely smart compilers are amazing; one of the defining characteristics of the software world is that you can be handed these things for free. The "sufficiently smart compiler", however, does not exist, and while there is no mathematical proof that I'm aware of that such compilers are impossible, after at least two decades of people trying to produce them and totally failing, the rational expectation at this point must be that they do not exist.
Microbenchmarks may show an advantage for C - but it's one that is shrinking all the time (and that goes doubly for Java, which was the go-to example in the original "sufficiently smart compiler" conversations - but no longer is, because you can't actually be confident that Java is going to perform worse than C any more). And the overwhelming majority of the time, for real-world business problems, people do just start slamming out Python code, and if anything it tends to perform better.
And conversely even those C programs now rely extremely heavily on compiler smartness to reorder instructions, autovectorise, etc., often producing something quite radically different from what a naive reading of the code would mean - and there is no real appetite for a language that doesn't do this, one with semantics designed to perform well on today's CPUs or GPUs. Which suggests that designing the language semantics for performance is not actually particularly important.
Best of luck in your engineering endeavors if you ever end up in a place you need high performance code. You're going to need a lot of it.