There has been a great deal of ink spilled on the question of which kinds of type systems help programmers be productive, but there is far less controversy on the performance side.
I’m mostly familiar with Haskell, which does type erasure. I believe this means that there is no type checking at run time.
I think dependently typed languages would have the benefit of knowing enough structure at compile time to detect things like dead branches and impossible cases, which lets the compiler optimize by removing code. I’m not sure that all types are erased in such languages, though.
If you want to add two things in Python they could be totally different types, so at runtime the program is likely to have pointers in those registers, and it is going to have to look at the objects and figure out what it has to do to add them, possibly calling the __add__ method on one of the objects. In the obvious implementation you end up fetching the int out of memory when you could have just had it sitting in a register.
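Roughly, the interpreter has to do something like the following on every addition. This is a simplified sketch of the dispatch logic, not CPython's actual implementation, and generic_add is a made-up name:

    def generic_add(a, b):
        # Fast path: both operands really are plain ints.
        if type(a) is int and type(b) is int:
            return a + b
        # Slow path: fall back to the __add__ / __radd__ protocol.
        result = NotImplemented
        if hasattr(type(a), "__add__"):
            result = type(a).__add__(a, b)
        if result is NotImplemented and hasattr(type(b), "__radd__"):
            result = type(b).__radd__(b, a)
        if result is NotImplemented:
            raise TypeError(f"cannot add {type(a).__name__} and {type(b).__name__}")
        return result

All of that type inspection happens before a single machine ADD gets to run.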
Now, many languages play tricks with pointers: since we are not going to fill a 64-bit address space any time soon, 'pointers' with certain bit patterns can host integers and other numbers inside. Still, it is a lot harder to add two of those than it is to just add two values we know are integers.
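Here is a toy simulation of the tagging idea in Python, just to show the shape of it; the bit layout and helper names are made up for illustration, not how any particular VM actually does it:

    def tag_int(n):
        # Low bit set means "this word is really a small integer".
        return (n << 1) | 1

    def is_tagged_int(word):
        return word & 1 == 1

    def add_words(x, y):
        if is_tagged_int(x) and is_tagged_int(y):
            # Untag, add, retag: already more work than one machine ADD
            # on two values known to be integers.
            return tag_int((x >> 1) + (y >> 1))
        # Otherwise at least one word is a real pointer: take the slow path.
        raise TypeError("real pointer involved, take the slow path")

    print(add_words(tag_int(2), tag_int(3)) >> 1)   # 5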
With user-defined types it is pretty much the same, particularly in single-inheritance languages like Java where it is dead obvious that you can access field A of type X by adding a fixed offset to the object pointer. In a language like Python or JavaScript you are looking everything up in a hashtable, even if it is a fast hashtable. (You don't consistently win by using __slots__.)
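In Python you can see the difference in where attributes live. Even with __slots__, access still goes through a descriptor looked up on the class, which is part of why it isn't a consistent win. An illustrative sketch:

    class Point:
        def __init__(self, x, y):
            self.x = x          # lives in the per-object __dict__, a hashtable
            self.y = y

    class SlottedPoint:
        __slots__ = ("x", "y")  # fixed storage slots, no per-object __dict__
        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(1, 2)
    print(p.__dict__)           # {'x': 1, 'y': 2}; every p.x is a dict lookup
    s = SlottedPoint(1, 2)
    print(s.x)                  # slot access, but still a descriptor lookup on the class

Compare that with a Java field, where the offset is fixed once the class is loaded.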
A really advanced compiler/runtime can undo some of this overhead. For instance, it might be clear that in a particular case you are adding x + y where both are always going to be 64-bit ints, so you can specialize that code. You could do this at build time, or at runtime you could observe that a function has been called in a particular context 10,000 times, inline it into the function that calls it, inline that one in turn, and recompile the whole thing with the fastest code, but it is tricky to get right. If an x that isn't an int finds its way in there, you need to despecialize the function without breaking anything.
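A crude sketch of the guard-and-fallback idea; the names and the wrapper are made up here, and a real JIT like PyPy does this with compiled machine code rather than a Python closure:

    def make_specialized_add(generic_add):
        # Returns an add that assumes int + int, guarded so the
        # assumption is checked on every call.
        def specialized_add(x, y):
            if type(x) is int and type(y) is int:
                return x + y          # fast path a JIT would compile to a machine ADD
            # Guard failed: despecialize by falling back to the generic path.
            return generic_add(x, y)
        return specialized_add

    add = make_specialized_add(lambda a, b: a + b)   # generic fallback stands in here
    print(add(2, 3))        # 5, fast path
    print(add("a", "b"))    # "ab", guard fails but nothing breaks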
PyPy shows that Python could be made much faster than it is; I think CPython is going to approach it in speed gradually.