My current area of research is in sparse, event-based encodings of musical audio (https://blog.cochlea.xyz/sparse-interpretable-audio-codec-pa...). I'm very interested in decomposing audio signals into a description of the "system" (e.g., room, instrument, vocal tract, etc.) and a sparse "control signal" which describes how and when energy is injected into that system. This toy was a great way to start learning about physical modeling synthesis, which seems to be the next stop in my research journey. I was also pleasantly surprised at what's possible these days writing custom Audio Worklets!
I did notice glitching in the latest Firefox on a Mac, like I'd get running a DAW with too small a buffer. While the tab was open I also got similar crackles and a slightly delayed audio stream when playing YouTube vids in other tabs.
My assumption has been that any physics engine that supports soft-body dynamics would work here, just run at a much higher sampling rate than one would normally use in a gaming scenario. This simulation actually only runs at 22,050 Hz, rather than today's standard 44,100 Hz sampling rate.
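To make that concrete, here's a minimal sketch (not the toy's actual code; `SpringNode`, `tension`, and `damping` are my own names) of the kind of per-sample update a single damped mass-spring node needs at audio rate, using semi-implicit Euler:

```ts
// Minimal sketch of one damped mass-spring node stepped per audio sample.
const SAMPLE_RATE = 22050;
const DT = 1 / SAMPLE_RATE;

interface SpringNode {
  pos: number; // displacement from rest (read this as the audio sample)
  vel: number;
  mass: number;
}

function step(n: SpringNode, tension: number, damping: number): number {
  // Hooke's law restores toward rest; damping bleeds energy so it rings down.
  const force = -tension * n.pos - damping * n.vel;
  n.vel += (force / n.mass) * DT; // semi-implicit Euler: velocity first...
  n.pos += n.vel * DT;            // ...then position, for better stability
  return n.pos;
}
```

Note that this integrator is only stable while sqrt(tension / mass) * DT stays small, and `mass = 0` divides by zero outright, which may be the kind of thing behind the mass-0 crash reported elsewhere in this thread.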
Working with AudioWorklets (https://developer.mozilla.org/en-US/docs/Web/API/AudioWorkle...) has been really cool, and I've been surprised at what's possible, but I _haven't_ yet figured out how to get good feedback about when the custom processor node is "falling behind" in terms of not delivering the next buffer quickly enough.
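For what it's worth, one heuristic (as far as I know there's no official underrun callback in the Web Audio API) is to time each process() call against the real-time budget of one 128-frame render quantum and report overruns to the main thread via the port. A rough sketch, using Date.now() since I'm not sure performance.now() is exposed in every browser's AudioWorkletGlobalScope:

```ts
// Sketch: flag render quanta whose wall-clock cost exceeds their budget.
// Heuristic only; the Web Audio API has no official underrun signal.

// Globals that exist in AudioWorkletGlobalScope, declared for TypeScript:
declare const sampleRate: number;
declare function registerProcessor(name: string, ctor: unknown): void;
declare class AudioWorkletProcessor {
  readonly port: MessagePort;
}

class BudgetedProcessor extends AudioWorkletProcessor {
  process(inputs: Float32Array[][], outputs: Float32Array[][]): boolean {
    const start = Date.now(); // coarse resolution, but always available

    // ... render 128 frames into outputs[0] here ...

    const budgetMs = (128 / sampleRate) * 1000; // ~2.9 ms at 44.1 kHz
    const elapsedMs = Date.now() - start;
    if (elapsedMs > budgetMs) {
      // The AudioWorkletNode on the main thread can listen on its port
      // and log, or tell the synth to degrade gracefully.
      this.port.postMessage({ overBudgetMs: elapsedMs - budgetMs });
    }
    return true; // keep the processor alive
  }
}

registerProcessor("budgeted-processor", BudgetedProcessor);
```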
Now that I understand the basics of how this works, I'd like to use a (much) more efficient version of the simulation as an infinite-dataset generator and try to learn a neural operator, or a NeRF-like model, that, given a spring-mesh configuration, a sparse control signal, and a time, can produce an approximation of the simulation in a parallel and sample-rate-independent manner. This also (maybe) opens the door to spatial audio, such that you could approximate sound-pressure levels at a particular point in time _and_ space. At this point, I'm just dreaming out loud a bit.
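Dreaming in code, the query signature would be roughly this (entirely hypothetical; `MeshConfig` and `SparseEvents` are stand-in types, and no such model exists yet):

```ts
// Hypothetical interface for the learned model.
type MeshConfig = Float32Array; // spring-mesh parameters, flattened
type SparseEvents = Array<{ time: number; node: number; energy: number }>;

// A NeRF-like field: query pressure at any (t, x, y, z) independently.
function pressureAt(
  mesh: MeshConfig,
  control: SparseEvents,
  t: number,
  xyz: [number, number, number]
): number {
  // ... learned network evaluation would go here ...
  return 0;
}

// Rendering at an arbitrary rate is then just independent point queries,
// which is what makes it parallel and sample-rate-independent.
function render(
  mesh: MeshConfig,
  control: SparseEvents,
  rate: number,
  seconds: number,
  xyz: [number, number, number]
): Float32Array {
  const out = new Float32Array(Math.floor(rate * seconds));
  for (let i = 0; i < out.length; i++) {
    out[i] = pressureAt(mesh, control, i / rate, xyz);
  }
  return out;
}
```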
I used to do some web audio and Tone.js work, but later switched to Rust and Glicol for sound synthesis.
For example, this handwritten Dattorro reverb:
https://glicol.org/demo#handmadedattorroreverb
This karplus-stress-tester may also be interesting to you.
https://jackschaedler.github.io/karplus-stress-tester/
In short, I think that to study more powerful physics synthesis, you need to consider this technology stack (the SharedArrayBuffer handoff is sketched below):

- Rust -> WASM
- AudioWorklet
- SharedArrayBuffer
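Here's a minimal sketch of how those three pieces usually fit together: the wasm synth runs in a worker and writes into a ring buffer over a SharedArrayBuffer, and the AudioWorklet only copies samples out. This is not Glicol's actual code; the buffer size and names are made up.

```ts
// Lock-free single-producer/single-consumer ring buffer over a
// SharedArrayBuffer. The SAB is posted once to both the worker and the
// worklet's port at setup time.
const RING_FRAMES = 8192;
const sab = new SharedArrayBuffer(8 + RING_FRAMES * 4); // 2 Int32 indices + samples
const idx = new Int32Array(sab, 0, 2);                  // [0] = read, [1] = write
const ring = new Float32Array(sab, 8, RING_FRAMES);

// Worker side: the wasm synth renders a block, then we publish it.
// (A real version would check for overrun before overwriting unread data,
// and the monotonically growing Int32 indices would wrap after ~13 hours
// at 44.1 kHz; both are fine for a sketch.)
function push(block: Float32Array): void {
  let w = Atomics.load(idx, 1);
  for (const s of block) ring[w++ % RING_FRAMES] = s;
  Atomics.store(idx, 1, w); // publish only after the samples are written
}

// AudioWorklet side, inside process(): drain into the 128-frame output.
function pull(out: Float32Array): void {
  let r = Atomics.load(idx, 0);
  const w = Atomics.load(idx, 1);
  for (let i = 0; i < out.length; i++) {
    out[i] = r < w ? ring[r++ % RING_FRAMES] : 0; // underrun -> emit silence
  }
  Atomics.store(idx, 0, r);
}
```

One caveat: SharedArrayBuffer requires the page to be cross-origin isolated these days (COOP/COEP headers), which matters if you're hosting on something like GitHub Pages.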
Visuals can rely on wgpu. Of course, WebGL is enough in this case IMHO.
If it's purely desktop, you could consider using a physics library from the Bevy ecosystem.
I've written one other AudioWorklet at this point, which just runs "inference" on a single-layer RNN given a pre-trained set of weights: https://blog.cochlea.xyz/rnn.html. It has similarly mediocre performance.
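For reference, the per-sample work in that kind of worklet is roughly this shape (a sketch, not the actual code behind that page; the hidden size, weight layout, and tanh nonlinearity are assumptions):

```ts
// Sketch of single-layer RNN "inference" per audio sample.
const HIDDEN = 64;
const scratch = new Float32Array(HIDDEN); // reused: no allocation per sample

function rnnSample(
  h: Float32Array,    // hidden state (length HIDDEN), updated in place
  x: number,          // input audio sample
  Wh: Float32Array,   // recurrent weights, HIDDEN * HIDDEN, row-major
  wIn: Float32Array,  // input weights, length HIDDEN
  wOut: Float32Array  // output weights, length HIDDEN
): number {
  for (let i = 0; i < HIDDEN; i++) {
    let acc = wIn[i] * x;
    for (let j = 0; j < HIDDEN; j++) acc += Wh[i * HIDDEN + j] * h[j];
    scratch[i] = Math.tanh(acc);
  }
  h.set(scratch);
  let y = 0;
  for (let i = 0; i < HIDDEN; i++) y += wOut[i] * h[i];
  return y;
}
```

That inner loop is HIDDEN^2 multiply-adds per sample, which adds up fast at audio rates; keeping allocations out of process() and moving the math into wasm (possibly with SIMD) are the usual next steps.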
Thanks for all the great tips, and for your work on Glicol!
In Safari (iOS 18.3.1), if you set the Mass slider to 0 and increase Tension, not only does the app crash, but a repeated clicking noise starts and persists even after the tab is closed, and even after Safari itself is closed! Seems to be a Safari bug. I have reproduced it 3x.
If you reproduce this and want the noise to go away, you have to start another app that tries to play sound.
It is also a spring-mass synth, but with MIDI and audio and many more options.