The key to the whole thing was that it was a great 32-bit processor; the 64-bit stuff was gravy for many, later.
Apple has done something similar with its CPU transitions (three so far): it only switches when old software runs better on the new chip, even under emulation, than it did on the old one.
AMD64 was also well thought out; it wasn't just a simple "have four more bytes" slapped on 32-bit. Doubling the number of general-purpose registers (from 8 to 16) was noticeable. You took a performance hit going to 64-bit early on because every memory address was twice as wide, so pointer-heavy data took up more cache and memory, but the extra registers usually more than made up for it.
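To put a rough number on the pointer-width cost, here is a minimal sketch. Everything in it is an assumption for illustration: a hypothetical doubly linked list node with one 4-byte integer and two pointers, fields aligned to the pointer size, and a 64-byte cache line. The `node_size` helper is made up for this example and just does the layout arithmetic a typical C compiler would.

```python
def node_size(pointer_bytes: int, payload_bytes: int = 4, num_pointers: int = 2) -> int:
    """Bytes used by a hypothetical node: one payload field, then pointer fields.

    Assumes the pointers must be aligned to their own size, so the payload
    is padded up to the next multiple of pointer_bytes.
    """
    align = pointer_bytes
    # Round the payload up to the pointer alignment (ceiling division).
    padded_payload = -(-payload_bytes // align) * align
    return padded_payload + num_pointers * pointer_bytes


CACHE_LINE = 64  # assumed cache line size in bytes

size32 = node_size(4)  # 4-byte pointers: 4 + 2*4 = 12 bytes
size64 = node_size(8)  # 8-byte pointers: 8 (padded) + 2*8 = 24 bytes

print(size32, CACHE_LINE // size32)  # 12 bytes, 5 whole nodes per cache line
print(size64, CACHE_LINE // size64)  # 24 bytes, 2 whole nodes per cache line
```

Under those assumptions the same node doubles in size, so fewer of them fit in cache. That is the hit the extra registers had to win back.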
This is also where the NX (no-execute) bit entered the x86 world.