
How the cochlea computes (2024)

(www.dissonances.blog)
475 points | izhak
edbaskerville:
To summarize: the ear does not do a Fourier transform, but it does do a time-localized frequency-domain transform akin to wavelets (specifically, intermediate between wavelet and Gabor transforms). It does this because the sounds processed by the ear are often localized in time.
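To make the wavelet-vs-Gabor distinction concrete, here is a minimal numpy sketch (the sample rate, window width, and cycle count are arbitrary choices of mine, not from the article): a Gabor atom keeps the same envelope width at every frequency, while a wavelet atom holds a constant number of cycles, so its envelope shrinks as frequency rises.

    import numpy as np

    fs = 16000                        # sample rate in Hz, arbitrary
    t = np.arange(-0.05, 0.05, 1/fs)  # 100 ms of time, centered on zero

    def gabor_atom(f, sigma=0.005):
        # fixed 5 ms Gaussian envelope at every frequency
        return np.exp(-t**2 / (2*sigma**2)) * np.exp(2j*np.pi*f*t)

    def wavelet_atom(f, cycles=6):
        # envelope width scales as 1/f: always ~6 cycles under the window
        sigma = cycles / (2*np.pi*f)
        return np.exp(-t**2 / (2*sigma**2)) * np.exp(2j*np.pi*f*t)

    for f in (200.0, 2000.0):
        for name, atom in (("gabor", gabor_atom(f)), ("wavelet", wavelet_atom(f))):
            p = np.abs(atom)**2
            p /= p.sum()
            width_ms = np.sqrt(np.sum(p * t**2)) * 1000  # RMS envelope width
            print(f"{name:7s} @ {f:6.0f} Hz: ~{width_ms:.2f} ms")

At 2 kHz the wavelet atom's envelope is ten times narrower in time than at 200 Hz, while the Gabor atom's is unchanged; a cochlea-like transform, as the article describes it, would print numbers between those two extremes.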

The article also describes a theory that human speech evolved to occupy an unoccupied space in frequency vs. envelope duration space. It makes no explicit connection between that fact and the type of transform the ear does—but one would suspect that the specific characteristics of the human cochlea might be tuned to human speech while still being able to process environmental and animal sounds sufficiently well.

A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.

a-dub:
> At high frequencies, frequency resolution is sacrificed for temporal resolution, and vice versa at low frequencies.

this is the time-frequency uncertainty principle. intuitively you can understand it by thinking about wavelength: the more stretched out a waveform is in time, the more of it you need to observe to get a good estimate of its frequency, and the more of it you observe, the less precisely you can say where in time it is.
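here's a quick numpy illustration of that tradeoff (all parameters are mine, picked just to make the numbers legible): window a 440 Hz tone with narrower and narrower Gaussians and watch the measured spectral peak smear out.

    import numpy as np

    fs, f0, N = 8000, 440.0, 1 << 15   # sample rate, tone, FFT size (arbitrary)

    def spectral_width_hz(win_ms):
        # gaussian-window a pure tone, return RMS bandwidth of its spectrum
        t = (np.arange(N) - N/2) / fs
        sigma = win_ms / 1000
        x = np.cos(2*np.pi*f0*t) * np.exp(-t**2 / (2*sigma**2))
        P = np.abs(np.fft.rfft(x))**2
        freqs = np.fft.rfftfreq(N, 1/fs)
        P /= P.sum()
        mean = np.sum(P * freqs)
        return np.sqrt(np.sum(P * (freqs - mean)**2))

    for ms in (100, 10, 1):
        print(f"{ms:4d} ms window -> spectral width ~{spectral_width_hz(ms):7.1f} Hz")

better time localization (the 1 ms window) buys you a roughly 100x blurrier frequency estimate; the product of the two widths is bounded below, which is the uncertainty principle in one line.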

> but it does do a time-localized frequency-domain transform akin to wavelets

maybe it's easier to conceive of this first as an empirically defined filter bank, fit to physiological measurements, rather than jumping straight to some neatly defined set of orthogonal basis functions. additionally, orthogonal basis functions cannot, by definition, capture things like masking effects.
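for the curious, the classic physiologically-fit filter bank of this kind is a set of gammatone filters spaced on the ERB scale (glasberg & moore). a rough numpy sketch; the sample rate, channel count, and frequency range are my choices:

    import numpy as np

    def erb(f):
        # equivalent rectangular bandwidth in Hz (glasberg & moore 1990)
        return 24.7 * (4.37 * f / 1000 + 1)

    def gammatone_ir(fc, fs=16000, dur=0.05, order=4, b=1.019):
        # 4th-order gammatone impulse response at center frequency fc
        t = np.arange(int(dur * fs)) / fs
        g = t**(order - 1) * np.exp(-2*np.pi*b*erb(fc)*t) * np.cos(2*np.pi*fc*t)
        return g / np.sqrt(np.sum(g**2))   # unit-energy normalization

    def erb_space(lo, hi, n):
        # center frequencies spaced uniformly on the ERB-rate scale
        rate = lambda f: 21.4 * np.log10(4.37 * f / 1000 + 1)
        inv = lambda r: (10**(r / 21.4) - 1) * 1000 / 4.37
        return inv(np.linspace(rate(lo), rate(hi), n))

    fbank = [gammatone_ir(fc) for fc in erb_space(100, 6000, 32)]
    # convolving a signal with each channel gives a crude cochleagram:
    # narrow, slow filters at low fc; wide, fast filters at high fc

note there's no orthogonality anywhere: neighboring channels overlap heavily, which is part of why filter-bank models can represent masking-like interactions that an orthogonal basis can't.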

> A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.

(4) size of the animal.

notably: some smaller creatures have ultrasonic vocalization and sensory capability. sometimes this is hypothesized to complement visual perception for avoiding predators, but it also could just have a lot to do with the fact that, well, they have tiny articulators and tiny vocalizations!

Terr_:
> it also could just have a lot to do with the fact that, well, they have tiny articulators and tiny vocalizations!

Now I'm imagining some alien shrew with vocal cords (or a syrinx, or whatever) running the entire length of its body, just so it can emit lower-frequency noises for some reason.

bragr:
Well, without the humorous size difference, this is basically what whales and elephants do for long-distance communication.
Terr_:
I was playing around with a fundamental-frequency calculator [0] to associate certain sizes with hertz, then used a tone generator [1] to get a subjective idea of what it'd sound like. (A back-of-envelope version of that scaling is sketched after the links below.)

Though of course, nature has plenty of other tricks, like how koalas can go down to ~27 Hz. [2]

[0] https://acousticalengineer.com/fundamental-frequency-calcula...

[1] https://www.szynalski.com/tone-generator/

[2] https://www.nature.com/articles/nature.2013.14275
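For a back-of-envelope version of what that calculator is doing, here's the ideal-string formula f0 = sqrt(T/mu) / (2L); the tension and linear density below are made-up illustrative values, not measured vocal-fold parameters:

    import math

    def string_f0(length_m, tension_n=2.0, mu_kg_per_m=0.001):
        # fundamental of an ideal string; T and mu are invented ballparks
        return math.sqrt(tension_n / mu_kg_per_m) / (2 * length_m)

    for L_mm in (2, 20, 200, 2000):
        print(f"L = {L_mm:5d} mm -> f0 ~ {string_f0(L_mm / 1000):8.1f} Hz")

Everything else held fixed, pitch scales as 1/length, which is the whole intuition: millimeter-scale articulators land in the ultrasonic, meter-scale ones in the infrasonic.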

fuzzfactor:
How long would a Dachshund have to be to sound like a 60-kilo Great Dane?
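Taking the same 1/L scaling from the sketch above, and plugging in guessed bark pitches (every number here is a ballpark I made up, not a measurement):

    # toy inversion of f0 proportional to 1/L; all three inputs are guesses
    dachshund_f0 = 180.0   # Hz, assumed typical Dachshund bark pitch
    dane_f0 = 70.0         # Hz, assumed typical Great Dane bark pitch
    dachshund_len = 0.6    # m, assumed nose-to-tail length

    needed_m = dachshund_len * dachshund_f0 / dane_f0
    print(f"~{needed_m:.1f} m of Dachshund")   # ~1.5 m at these guesses

So roughly a meter and a half of Dachshund at those guesses; in this toy model the length matters, not the 60 kilos.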