> because it is the kind of optimizing compiler you say it is
What other kind of optimisations are you imagining? I'm not talking about a particular "kind" of optimisation but the entire category. Let's look at two real-world optimisations from opposite ends of the scale to see what I mean:
1. Peephole removal of null sequences. This is a very easy optimisation: if we're going to do X and then do the opposite of X, we can do neither and get the same outcome, which is typically smaller and faster. For example, on a simple stack machine, pushing register R10 and then immediately popping R10 achieves nothing, so we can remove both steps from the resulting program (see the first sketch after this list).
BUT if we've defined everything, this can't work, because it means we're no longer touching the stack here. So a language will often not define such things at all (e.g. not even mentioning the existence of a "stack") and thus permit this optimisation.
2. Idiom recognition of population count. The compiler can analyse some function you've written and conclude that it's actually counting the set bits in a value. Many modern CPUs have a dedicated instruction for exactly that, so the compiler can simply emit that instruction wherever you call your function (see the second sketch after this list).
BUT you wrote this whole complicated function, and if we've defined everything then all its fine details must be reproduced: there must be a function call, maybe a temporary accumulator, a test and an increment in a loop -- all defined, so such an optimisation would be impossible.
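To make (1) concrete, here's a minimal sketch of such a peephole pass. The instruction set and representation (PUSH/POP/ADD over numbered registers) are invented purely for illustration, not taken from any real compiler:

```c
#include <stdio.h>

/* Invented toy instruction set, purely for illustration. */
typedef enum { PUSH, POP, ADD } Op;

typedef struct {
    Op  op;
    int reg;   /* register operand, e.g. 10 for R10 */
} Insn;

/* Remove adjacent PUSH Rn / POP Rn pairs in place; returns the new length.
 * Removing one pair can expose another (PUSH A; PUSH B; POP B; POP A),
 * so we compare against the last instruction already emitted. */
static int peephole(Insn *code, int n) {
    int out = 0;
    for (int i = 0; i < n; i++) {
        if (out > 0 &&
            code[out - 1].op == PUSH &&
            code[i].op == POP &&
            code[out - 1].reg == code[i].reg) {
            out--;                  /* the pair cancels: drop both */
        } else {
            code[out++] = code[i];  /* keep everything else */
        }
    }
    return out;
}

int main(void) {
    Insn prog[] = { {PUSH, 10}, {POP, 10}, {ADD, 3} };
    int n = peephole(prog, 3);
    printf("%d instruction(s) survive\n", n);  /* prints 1: just the ADD */
    return 0;
}
```

Note how cheap this is: the pass only looks at a tiny window of adjacent instructions. It's only legal because the language never promised that those stack operations actually happen.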
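And a hedged illustration of (2): the loop below is the classic "clear the lowest set bit" way of counting bits. Compilers such as Clang (and recent GCC) can recognise this loop shape and, when the target's population-count instruction is enabled (e.g. x86 with -mpopcnt), compile the function down to essentially that one instruction -- whether they actually do depends on the compiler version, optimisation level, and target flags, none of which this source code defines or could define.

```c
#include <stdio.h>

/* Hand-written population count using the "clear the lowest set bit" idiom.
 * With e.g. clang -O2 -mpopcnt the whole loop can become a single POPCNT;
 * that's the compiler's choice, nothing in this source guarantees it. */
static unsigned popcount(unsigned x) {
    unsigned count = 0;
    while (x) {
        x &= x - 1;   /* knock out the lowest set bit */
        count++;
    }
    return count;
}

int main(void) {
    printf("%u\n", popcount(0xF0F0u));  /* prints 8 */
    return 0;
}
```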