
How the cochlea computes (2024)

(www.dissonances.blog)
475 points | izhak | 1 comment
edbaskerville ◴[] No.45762928[source]
To summarize: the ear does not do a Fourier transform, but it does do a time-localized frequency-domain transform akin to wavelets (specifically, intermediate between wavelet and Gabor transforms). It does this because the sounds processed by the ear are often localized in time.
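
As a rough illustration of what "intermediate between wavelet and Gabor" could mean, here is a minimal sketch. It is not the article's model: the function `window_seconds`, the exponent `alpha`, and the reference values are all hypothetical. The only idea it encodes is that a Gabor/STFT-style analysis uses the same window duration at every frequency, a wavelet-style analysis uses the same number of cycles at every frequency, and something in between shortens its window with frequency, but more slowly than a wavelet does.

```python
import numpy as np

def window_seconds(freq_hz, alpha, ref_hz=400.0, ref_seconds=0.02):
    """Analysis-window duration at a given center frequency.

    alpha = 0 -> fixed duration at every frequency (Gabor/STFT-like)
    alpha = 1 -> fixed number of cycles at every frequency (wavelet-like)
    0 < alpha < 1 -> somewhere in between (illustrative assumption only)
    """
    freq_hz = np.asarray(freq_hz, dtype=float)
    return ref_seconds * (ref_hz / freq_hz) ** alpha

freqs = np.array([100.0, 400.0, 1600.0])

for name, alpha in [("gabor/stft", 0.0), ("intermediate", 0.5), ("wavelet", 1.0)]:
    durations = window_seconds(freqs, alpha)
    cycles = durations * freqs  # how many periods fit inside the window
    print(name, "durations (s):", durations, "cycles:", cycles)
```

With these made-up reference values, the fixed-duration scheme spans 2 to 32 cycles across the three frequencies, the wavelet-like scheme always spans 8, and the intermediate one spans 4 to 16.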

The article also describes a theory that human speech evolved to occupy an unoccupied space in frequency vs. envelope duration space. It makes no explicit connection between that fact and the type of transform the ear does—but one would suspect that the specific characteristics of the human cochlea might be tuned to human speech while still being able to process environmental and animal sounds sufficiently well.

A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.

patrickthebold ◴[] No.45764582[source]
I think I might be missing something basic, but if you actually wanted to do a Fourier transform on the sound hitting your ear, wouldn't you need to wait your entire lifetime to compute it? It seems pretty clear that's not what is happening, since you can actually hear things as they happen.
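
For reference, the concern here follows from the textbook definition, which integrates over all time. The windowed variant underneath is the standard short-time form; the window function $w$ and the time offset $\tau$ are notation introduced for this note, not symbols from the article.

```latex
% Fourier transform: needs the signal x(t) over the whole time axis
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi i f t}\, dt

% Short-time version: a window w centered at \tau limits the integral
% to a neighborhood of \tau, so it can be evaluated as the signal arrives
X(\tau, f) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-2\pi i f t}\, dt
```

If $w$ is nonzero only over a short interval, the second integral depends only on the samples inside that interval, which is why nothing has to wait for the end of the signal.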
cherryteastain ◴[] No.45764755[source]
Not really. We can create spectrograms [1] for a real-time audio feed without waiting for the end of the recording, by binning the signal into timewise chunks and transforming each chunk separately.

[1] https://en.wikipedia.org/wiki/Spectrogram
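
A minimal sketch of that chunking idea, using only NumPy. The function name `spectrogram`, the frame length, the hop size, and the test tone are assumptions made for illustration. This is a basic short-time Fourier transform and says nothing about how the cochlea itself works.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram built from overlapping, Hann-windowed chunks."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per chunk; each row needs only the frame_len samples of that chunk.
    return np.abs(np.fft.rfft(frames, axis=1))

sample_rate = 8000  # Hz, assumed for the example
t = np.arange(sample_rate) / sample_rate  # one second of samples
# Test tone: 440 Hz for the first half second, 1000 Hz afterwards.
tone = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 1000 * t))

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_early = freqs[spec[0].argmax()]   # strongest frequency bin in the first chunk
peak_late = freqs[spec[-1].argmax()]   # strongest frequency bin in the last chunk
print(spec.shape, peak_early, peak_late)
```

The first chunk peaks near 440 Hz and the last near 1000 Hz, each to within one frequency bin (31.25 Hz at these settings), and every row is available as soon as its 256 samples have arrived.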

IshKebab ◴[] No.45764961[source]
Those use the Short-Time Fourier Transform, which is very much like what the ear does.

https://en.wikipedia.org/wiki/Short-time_Fourier_transform

anyfoo ◴[] No.45766435[source]
Yes, but the article specifically says that it isn't like a short-time Fourier transform either, but more like a wavelet transform, which is different yet again.
IshKebab ◴[] No.45766571[source]
Barely different, though. Obviously nobody is saying it's exactly a Fourier transform or an STFT. But it's very like an STFT (or a wavelet transform).

The article is pretty much "cows aren't actually spheres, guys".

1. kragen ◴[] No.45768760{5}[source]
It's very unlike both of those, as the nice diagrams in the article explain; not only is what it is saying not obvious to you, it is apparently something you actively disbelieve.