
How the cochlea computes (2024)

(www.dissonances.blog)
475 points by izhak | 12 comments
edbaskerville ◴[] No.45762928[source]
To summarize: the ear does not do a Fourier transform, but it does do a time-localized frequency-domain transform akin to wavelets (specifically, intermediate between wavelet and Gabor transforms). It does this because the sounds processed by the ear are often localized in time.

The article also describes a theory that human speech evolved to occupy an unoccupied space in frequency vs. envelope duration space. It makes no explicit connection between that fact and the type of transform the ear does—but one would suspect that the specific characteristics of the human cochlea might be tuned to human speech while still being able to process environmental and animal sounds sufficiently well.

A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.
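A minimal numeric sketch of the distinction the parent comment draws (not from the article; the `cycles=5` choice and function names are illustrative assumptions): a Gabor/STFT-style atom keeps the same envelope width at every frequency, while a Morlet-style wavelet atom holds the number of cycles fixed, so its envelope narrows in time as frequency rises.

```python
import numpy as np

def gabor_sigma(f, sigma=0.01):
    # Gabor/STFT: the Gaussian envelope width is the same at every frequency
    return sigma

def wavelet_sigma(f, cycles=5.0):
    # Morlet-style wavelet: a fixed number of cycles per atom, so the
    # envelope narrows in time as frequency rises
    return cycles / (2.0 * np.pi * f)

for f in (100.0, 1000.0, 10000.0):
    print(f"{f:7.0f} Hz  gabor sigma = {gabor_sigma(f)*1e3:6.2f} ms  "
          f"wavelet sigma = {wavelet_sigma(f)*1e3:6.3f} ms")
```

An "intermediate between wavelet and Gabor" transform, as the article describes, would sit between these two extremes: the window shrinks with frequency, but more slowly than a pure wavelet's.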

replies(12): >>45763026 #>>45763057 #>>45763066 #>>45763124 #>>45763139 #>>45763700 #>>45763804 #>>45764016 #>>45764339 #>>45764582 #>>45765101 #>>45765398 #
1. patrickthebold ◴[] No.45764582[source]
I think I might be missing something basic, but if you actually wanted to do a Fourier transform on the sound hitting your ear, wouldn't you need to wait your entire lifetime to compute it? It seems pretty clear that's not what is happening, since you can actually hear things as they happen.
replies(4): >>45764633 #>>45764755 #>>45764761 #>>45764952 #
2. xeonmc ◴[] No.45764633[source]
You’ll also need to have existed and started listening before the beginning of time, forever and ever. Amen.
3. cherryteastain ◴[] No.45764755[source]
Not really; just as with spectrograms [1], you can analyze a real-time audio feed without waiting for the end of the recording, by binning the signal into timewise chunks.

[1] https://en.wikipedia.org/wiki/Spectrogram
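The binning idea above can be sketched in a few lines of numpy (illustrative only; real spectrogram tools add window functions and overlapping chunks, which this skips):

```python
import numpy as np

def spectrogram(signal, chunk=256):
    # Slice the signal into consecutive timewise chunks and FFT each one;
    # each row of the result is the magnitude spectrum of one chunk.
    n_chunks = len(signal) // chunk
    frames = signal[:n_chunks * chunk].reshape(n_chunks, chunk)
    return np.abs(np.fft.rfft(frames, axis=1))

fs = 8000
t = np.arange(fs) / fs                 # one second of "audio"
sig = np.sin(2 * np.pi * 440 * t)      # a steady 440 Hz tone
spec = spectrogram(sig)
print(spec.shape)                      # → (31, 129): 31 chunks x 129 frequency bins
```

Each chunk is available as soon as its samples have arrived, which is why this works on a live feed.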

replies(1): >>45764961 #
4. bonoboTP ◴[] No.45764761[source]
Yes, for the vanilla Fourier transform you have to integrate from negative to positive infinity. But more practically you can apply a window function with finite temporal support, so you only analyze a part of the signal. Whenever you see a 2D spectrogram image in audio editing software, where the audio engineer can suppress a certain range of frequencies in a certain time period, they use something like this.

It's called the short-time Fourier transform (STFT).

https://en.wikipedia.org/wiki/Short-time_Fourier_transform

replies(1): >>45768746 #
5. IshKebab ◴[] No.45764952[source]
Yes, exactly. This is a classic "no, cats and dogs don't actually rain from the sky" article.

Nobody who knows literally anything about signal processing thought the ear was doing a Fourier transform. Is it doing something like an STFT? Obviously yes, and this article doesn't go against that.

6. IshKebab ◴[] No.45764961[source]
Those use the Short-Time Fourier Transform, which is very much like what the ear does.

https://en.wikipedia.org/wiki/Short-time_Fourier_transform

replies(1): >>45766435 #
7. anyfoo ◴[] No.45766435{3}[source]
Yes, but the article specifically says that it isn't like a short-time Fourier transform either, but more like a wavelet transform, which is different yet again.
replies(1): >>45766571 #
8. IshKebab ◴[] No.45766571{4}[source]
Barely different though. Obviously nobody is saying it's exactly a Fourier transform or a STFT. But it's very like a STFT (or a wavelet transform).

The article is pretty much "cows aren't actually spheres guys".

replies(2): >>45766613 #>>45768760 #
9. anyfoo ◴[] No.45766613{5}[source]
I'd say the title is like that (and I agree with someone else's assessment of it being clickbait-y). I think the actual article does a pretty good job of distinguishing a lot of these transforms, and homing in on which one matches best.

But the title instead makes it sound (pun unintended) as if what the ear does is not about frequency decomposition at all.

replies(1): >>45768406 #
10. jibal ◴[] No.45768406{6}[source]
The fourth sentence in the article is "Vibrations travel through the fluid to the basilar membrane, which remarkably performs frequency separation", with the footnote

"We call this tonotopic organization, which is a mapping from frequency to space. This type of organization also exists in the cortex for other senses in addition to audition, such as retinotopy for vision and somatotopy for touch."

So the cochlea does frequency decomposition but not by performing a FT (https://en.wikipedia.org/wiki/Fourier_transform), but rather by a biomechanical process involving numerous sensors that are sensitive to different frequency ranges ... similar to how we have different kinds (only 3, or in birds and rare humans 4) of cones in the retina that are sensitive to different frequency ranges.

The claim that the title makes it sound like what the ear does is not about frequency decomposition at all is simply false ... that's not what it says, at all.

11. kragen ◴[] No.45768746[source]
Yeah. But a really annoying thing about the STFT is that its temporal resolution is independent of frequency, so you either have to have shitty temporal resolution at high frequencies or shitty frequency resolution at low ones, compared to the human ear. So in Audacity I keep having to switch back and forth between window sizes.
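The tradeoff in the comment above can be made concrete (a sketch, not Audacity's actual code; the sample rate, FFT size, and `q` value are arbitrary assumptions): a fixed-size STFT locks both resolutions for every frequency at once, whereas a constant-Q scheme, closer to what the ear does, lets the window shrink as frequency rises.

```python
fs = 48000  # sample rate (Hz), assumed for illustration

def stft_resolution(n_fft):
    # One FFT size fixes both resolutions for ALL frequencies at once.
    df = fs / n_fft   # frequency resolution (Hz)
    dt = n_fft / fs   # time resolution (s)
    return df, dt

def constant_q_resolution(f, q=17.0):
    # Constant-Q: bandwidth proportional to frequency, so the analysis
    # window shrinks at high frequencies and stretches at low ones.
    df = f / q
    dt = q / f
    return df, dt

print(stft_resolution(1024))            # same tradeoff everywhere
for f in (100.0, 1000.0, 10000.0):
    print(f, constant_q_resolution(f))  # tradeoff adapts per frequency
```

In both cases the product df * dt is bounded below (the time-frequency uncertainty principle); switching window sizes in Audacity just moves you along that tradeoff globally, while a constant-Q or wavelet analysis moves along it per frequency.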
12. kragen ◴[] No.45768760{5}[source]
It's very unlike both of those, as the nice diagrams in the article explain; not only is what it is saying not obvious to you, it is apparently something you actively disbelieve.