It would look like a Fourier transform ;)
Zooming in to cartoonish levels might drive the point home a bit. Suppose you have sound waves
|---------|---------|---------|
What is the frequency exactly 1/3 the way between the first two wave peaks? It's a nonsensical question. The frequency relates to the time delta between peaks, and looking locally at a sufficiently small region of time gives no information about that phenomenon.
Let's zoom out a bit. What's the frequency over a longer period of time, capturing a few peaks?
Well...if you know there is only one frequency then you can do some math to figure it out, but as soon as you might be describing a mix of frequencies you suddenly, again, potentially don't have enough information.
That lack of information manifests in a few ways. The exact math (Shannon's theorems?) suggests some things, but the language involved mismatches with human perception sufficiently that people get burned trying to apply it too directly. E.g., a bass beat with a bit of clock skew is very different from a bass beat as far as a careless decomposition is concerned, but it's likely not observable by a human listener.
Not being localized in time means* you look at longer horizons, considering more and more of those interactions. Instead of the beat of a 4/4 song meaning that the frequency changes at discrete intervals, it means that there's a larger, over-arching pattern capturing "the frequency distribution" of the entire song.
*Truly time-nonlocalized sound is of course impossible, so I'm giving some reasonable interpretation.