The Metrowerks profiler and linker worked together to optimize code locality in the binary, with the focus on PowerPC code. The linker could generate the static call tree, while the profiler could generate a dynamic call tree of what was actually called at run time. The goal was to separate the cold portions of the call tree into parts of the executable that didn't get paged in.
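To make the idea concrete, here is a toy sketch in Python. It is not how the Metrowerks tools were actually implemented. The function names, the call counts, and the `partition_functions` helper are all made up for illustration. It assumes the static call tree is a dict of caller to callees and the profile is a dict of function to call count.

```python
def partition_functions(static_calls, call_counts, entry="main"):
    """Split statically reachable functions into hot and cold lists.

    static_calls: hypothetical static call tree, caller -> list of callees.
    call_counts:  hypothetical profile data, function -> observed call count.
    """
    # Walk the static call tree to find everything that could be called.
    reachable = set()
    stack = [entry]
    while stack:
        fn = stack.pop()
        if fn in reachable:
            continue
        reachable.add(fn)
        stack.extend(static_calls.get(fn, []))

    # Hot: seen at least once in the profile, most frequently called first.
    hot = sorted(
        (f for f in reachable if call_counts.get(f, 0) > 0),
        key=lambda f: (-call_counts[f], f),
    )
    # Cold: statically reachable but never observed at run time.
    cold = sorted(f for f in reachable if call_counts.get(f, 0) == 0)
    return hot, cold


# Invented example data: printing and the about box are never exercised.
static_calls = {
    "main": ["open_document", "print_document", "show_about_box"],
    "open_document": ["parse_file"],
    "print_document": ["render_page"],
}
call_counts = {"main": 1, "open_document": 3, "parse_file": 3}

hot, cold = partition_functions(static_calls, call_counts)
print("hot: ", hot)
print("cold:", cold)
```

A linker could then place the hot list contiguously and push the cold list to a separate region of the executable, so that pages holding only cold code are never touched in a typical run.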
I worked on the Profiler, and I seem to remember that Microsoft was one of the developers that put a lot of effort into using this to optimize the Office suite on the Mac. I remember that the release of Word that used it was snappier.