There has been so much ink spilled on the question of what kind of type systems help programmers be productive but there is not such controversy on the performance side.
There has been so much ink spilled on the question of what kind of type systems help programmers be productive but there is not such controversy on the performance side.
You are correct that you can often engineer the performance later. But layout concerns and nominal types for performance are fairly non controversial.
In contrast, you can find Fortran compilers that can substitute your inefficient implementation of some numerical algorithm with different much more performant one.
Python does not attempt to apply anything more complex than that peephole optimizer, whatever that means.
Judging from my experience with Python, that peephole optimizer cannot lift operations on "dictionary with default value" to a regular dictionary operations using conditionals. I had to manually rewrite my program to use conditionals to recover it's speed.