
How the cochlea computes (2024)

(www.dissonances.blog)
475 points izhak | 7 comments
1. kazinator ◴[] No.45762948[source]
> A Fourier transform has no explicit temporal precision, and resembles something closer to the waveforms on the right; this is not what the filters in the cochlea look like.

Perhaps the ear does something more vaguely analogous to a discrete Fourier transform on samples of data, which is what we do in a lot of signal processing.

In signal processing, we take windowed samples, and do discrete transforms on these. These do give us some temporal precision.

There is a trade off there between frequency and temporal precision, analogous to the Pauli exclusion principle in quantum mechanics. The better we know a frequency, the less precisely we know the timing. Only an infinite, periodic signal has a single precise frequency (or a precise set of harmonics), which shows up as infinitely narrow blips in the frequency domain.
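
For reference, the trade off can be stated precisely as the Gabor limit. Here σ_t and σ_f are symbols introduced for this note (not from the article): the standard deviations of the signal's energy in time (seconds) and in frequency (Hz).

```latex
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}
```

Equality holds only for a Gaussian-shaped envelope, so no choice of window gets below this bound.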

The continuous Fourier transform deals with periodic signals only. We transform an entire function like sin(x) over the entire domain. If that domain is interpreted as time, we are including all of eternity, so to speak, from negative infinite time to positive.

replies(4): >>45763111 #>>45763560 #>>45764192 #>>45766585 #
2. xeonmc ◴[] No.45763111[source]
> analogous to the Pauli exclusion principle

Did you mean the Heisenberg Uncertainty Principle instead? Or is there actually some connection of the Pauli Exclusion Principle to conjugate transforms that I wasn’t aware of?

replies(1): >>45765239 #
3. HarHarVeryFunny ◴[] No.45763560[source]
> There is a trade off there between frequency and temporal precision

Sure, and the FFT isn't inherently biased towards one vs the other. If you take an FFT over a long time window (narrowband spectrogram) then you get good frequency resolution at the cost of time resolution, and vice versa for a short time window (wideband spectrogram).
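
As a rough illustration of that narrowband vs wideband trade off, here is a minimal NumPy sketch. The sample rate, the test tone, the two window lengths, and the helper name `stft_magnitude` are all assumptions picked for illustration; none of them come from TFA.

```python
import numpy as np

def stft_magnitude(x, n_window, hop):
    """Magnitude STFT from Hann-windowed frames.

    Returns an array of shape (n_frames, n_window // 2 + 1).
    """
    window = np.hanning(n_window)
    n_frames = 1 + (len(x) - n_window) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_window] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

sample_rate = 8000                         # Hz, assumed for illustration
t = np.arange(sample_rate) / sample_rate   # one second of samples
x = np.sin(2 * np.pi * 440.0 * t)          # arbitrary test tone

narrowband = stft_magnitude(x, n_window=1024, hop=512)  # long window
wideband = stft_magnitude(x, n_window=64, hop=32)       # short window

for name, spec, n in (("narrowband", narrowband, 1024), ("wideband", wideband, 64)):
    print(
        f"{name}: {spec.shape[0]} frames x {spec.shape[1]} bins, "
        f"bin spacing {sample_rate / n:.2f} Hz, "
        f"window {1000 * n / sample_rate:.0f} ms"
    )
```

The long window gives many closely spaced frequency bins but few time frames; the short window gives the opposite. Same FFT, only the window length differs.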

For speech recognition ideally you'd want to use both since they are detecting different things. TFA is saying that this is in fact what our cochlea filter bank is doing, using different types of filter at different frequency ranges - better frequency resolution at lower frequencies where the formants are (carrying articulatory information), and better time resolution at the high frequencies generated by fricatives where frequency doesn't matter but accurate onset detection is useful for detecting plosives.

4. energy123 ◴[] No.45764192[source]
STFT?
replies(1): >>45773833 #
5. kvakkefly ◴[] No.45765239[source]
They are not connected afaik.
6. jibal ◴[] No.45766585[source]
The ear clearly doesn't operate on "samples of data", it doesn't "take windowed samples" ... there's an ongoing mechanical process.
7. ducttapecrown ◴[] No.45773833[source]
Yesterday there was an article about how the ear works more like a Gabor transform or a wavelet transform than a Fourier transform, both of which are Short Time Fourier Transforms, so yes!