
439 points david927 | 2 comments

What are you working on? Any new ideas which you're thinking about?
daxfohl ◴[] No.44417355[source]
I was hoping to make a piano practice assistant for my kids that would take sheet music in MusicXML format, listen to the microphone stream, and check for things they frequently miss, like rests, dynamics, and consistent tempo.
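One of those checks is easy to sketch deterministically. Assuming note-onset times (in seconds) come from some upstream detector, a tempo-consistency rule can just flag beat intervals that stray too far from the median; the function name, tolerance, and data here are all made up for illustration:

```python
import statistics

def tempo_wobbles(onsets, tolerance=0.15):
    """Return indices of beat intervals that deviate from the median
    interval by more than `tolerance` (as a fraction), i.e. spots
    where the player rushed or dragged."""
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    median = statistics.median(intervals)
    return [i for i, iv in enumerate(intervals)
            if abs(iv - median) / median > tolerance]

# Steady quarter notes at 120 bpm, except one rushed gap:
onsets = [0.0, 0.5, 1.0, 1.35, 1.85]
print(tempo_wobbles(onsets))  # [2]
```

The hard part, as the rest of the comment explains, is getting reliable onsets and notes out of the audio in the first place.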

Surprisingly, the blocker has been identifying notes from the microphone input. I assumed that would be a long-solved problem: just do an FFT and find the peaks of the spectrogram? But apparently that doesn't work well when there are harmonics and reverb and such, and you have to use AI models (Google and Spotify have some) to do it. And so far it still seems to fail if there are more than three notes are played simultaneously.
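For concreteness, here is a minimal sketch of that naive approach: FFT one frame, pick the loudest spectral peaks, map them to note names. All names and thresholds are illustrative. It works on a clean sine wave, but on a real piano the overtones of a single note show up as extra peaks, which is exactly why it breaks down polyphonically:

```python
import numpy as np

def freq_to_note(freq):
    """Map a frequency in Hz to the nearest note name (A4 = 440 Hz)."""
    midi = int(round(69 + 12 * np.log2(freq / 440.0)))
    names = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    return names[midi % 12] + str(midi // 12 - 1)

def naive_notes(frame, sample_rate, n_peaks=3):
    """Return note names for the strongest spectral peaks in one frame."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sample_rate)
    spectrum[freqs < 27.5] = 0.0  # ignore everything below piano A0
    peak_bins = np.argsort(spectrum)[-n_peaks:]
    return sorted({freq_to_note(freqs[b]) for b in peak_bins})

# A pure 440 Hz sine: the naive method gets this right.
sr = 44100
t = np.arange(sr) / sr
print(naive_notes(np.sin(2 * np.pi * 440 * t), sr, n_peaks=1))  # ['A4']
```

Feed it a real piano A4, though, and the peaks at 880, 1320, ... Hz get reported as separate "notes".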

Now I'm baffled how song identification can work, if even identifying notes is so unreliable! Maybe I'm doing something wrong.

replies(3): >>44417420 #>>44417427 #>>44417886 #
1. fxtentacle ◴[] No.44417420[source]
Note detection works OK if you ignore the octave. Otherwise, you need to know the relative strength of the overtones, which is instrument dependent. Some years ago I built a piano training app with an FFT + Kalman filter.
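The octave-ignoring idea can be sketched as a chroma vector: fold all spectral energy into 12 pitch classes, so a note and its octave overtones reinforce the same class instead of registering as different notes. This is a toy illustration (the Kalman-filter smoothing step is not shown), with made-up names:

```python
import numpy as np

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma(frame, sample_rate):
    """Fold FFT magnitudes into a 12-bin pitch-class (chroma) vector."""
    windowed = frame * np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sample_rate)
    audible = freqs >= 27.5  # below piano A0, treat as noise
    pitch_class = (np.round(69 + 12 * np.log2(freqs[audible] / 440.0))
                   % 12).astype(int)
    vec = np.zeros(12)
    np.add.at(vec, pitch_class, mag[audible])  # accumulate per class
    return vec

def strongest_class(frame, sample_rate):
    return NAMES[int(np.argmax(chroma(frame, sample_rate)))]

sr = 44100
t = np.arange(sr) / sr
# A 220 Hz note plus its first overtone an octave up still reads as "A".
tone = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
print(strongest_class(tone, sr))  # 'A'
```

Non-octave overtones (the third harmonic lands a fifth up, etc.) still leak into other classes, which is where the instrument-dependent overtone strengths mentioned above come in.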
replies(1): >>44417668 #
2. daxfohl ◴[] No.44417668[source]
Cool, I'll give it a shot. So far I've just been blindly feeding into the AI and crossing my fingers. I'll try displaying the spectrogram graphically, and I imagine that'll help figure out what the next step needs to be.
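One quick way to eyeball the spectrogram, assuming matplotlib is available; `samples` would be the microphone stream, with a synthetic two-note signal standing in here:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render to file, no display needed
import matplotlib.pyplot as plt

sr = 44100
t = np.arange(2 * sr) / sr
samples = np.sin(2 * np.pi * 262 * t) + np.sin(2 * np.pi * 330 * t)  # ~C4 + E4

plt.specgram(samples, NFFT=4096, Fs=sr, noverlap=2048)
plt.ylim(0, 2000)  # piano fundamentals sit well below 2 kHz
plt.xlabel("time (s)")
plt.ylabel("frequency (Hz)")
plt.savefig("spectrogram.png")
```

Two clean horizontal bands would appear for the synthetic signal; a real piano recording shows the harmonic stacks and reverb smearing that make peak picking hard.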

I was thinking this would be a good project to learn AI stuff, but it seems like most of the work is better off being fully deterministic. Which is maybe the best AI lesson there is. (Though I do still think there's an opportunity to use AI for translating a teacher's notes, e.g. "pay attention to the rest in measure 19", into a deterministic ruleset to monitor during practice.)