My current area of research is in sparse, event-based encodings of musical audio (https://blog.cochlea.xyz/sparse-interpretable-audio-codec-pa...). I'm very interested in decomposing audio signals into a description of the "system" (e.g., room, instrument, vocal tract, etc.) and a sparse "control signal" which describes how and when energy is injected into that system. This toy was a great way to start learning about physical modeling synthesis, which seems to be the next stop in my research journey. I was also pleasantly surprised at what's possible these days writing custom Audio Worklets!
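For anyone curious what a custom Audio Worklet looks like, here is a rough sketch of the shape of the API (this is just an illustrative Karplus-Strong style pluck, not the code behind the toy above, and the name 'toy-string' is made up):

    // string-processor.js -- load with audioContext.audioWorklet.addModule()
    // A minimal sketch of a custom Audio Worklet: a Karplus-Strong style
    // plucked string, illustrative only.
    class ToyStringProcessor extends AudioWorkletProcessor {
      constructor() {
        super();
        // Delay line sized for roughly 220 Hz at the context's sample rate.
        this.delay = new Float32Array(Math.floor(sampleRate / 220));
        this.pos = 0;
        // Excite the "string" with a burst of noise -- the sparse event
        // that injects energy into the system.
        for (let i = 0; i < this.delay.length; i++) {
          this.delay[i] = Math.random() * 2 - 1;
        }
      }

      process(inputs, outputs) {
        const out = outputs[0][0];
        for (let i = 0; i < out.length; i++) {
          const next = (this.pos + 1) % this.delay.length;
          // Averaging adjacent samples is a crude lowpass; the 0.996 factor
          // slowly damps the string.
          const sample = 0.5 * (this.delay[this.pos] + this.delay[next]) * 0.996;
          this.delay[this.pos] = sample;
          out[i] = sample;
          this.pos = next;
        }
        return true; // keep the processor alive
      }
    }

    registerProcessor('toy-string', ToyStringProcessor);

On the main thread you would load and connect it with something like:

    const ctx = new AudioContext();
    await ctx.audioWorklet.addModule('string-processor.js');
    new AudioWorkletNode(ctx, 'toy-string').connect(ctx.destination);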
My assumption has been that any physics engine capable of soft-body simulation would work for this, just stepped at a much higher sampling rate than a game would normally use. This simulation actually runs at only 22,050 Hz, rather than today's standard 44,100 Hz sampling rate.
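To make the "soft-body solver at audio rate" idea concrete, here is a rough sketch (again, not the project's actual code): a 1-D chain of masses and springs integrated with semi-implicit Euler, stepped once per sample at 22,050 Hz, with one node's displacement read out as the audio signal. All constants are made up, chosen only so the chain stays stable and rings at an audible pitch.

    // Why a soft-body style solver doubles as a synth: a 1-D mass-spring
    // chain, integrated with semi-implicit Euler once per audio sample.
    const SAMPLE_RATE = 22050;           // the 22,050 Hz rate mentioned above
    const N = 32;                        // number of masses in the chain
    const pos = new Float32Array(N);     // displacement of each mass
    const vel = new Float32Array(N);     // velocity of each mass
    const dt = 1 / SAMPLE_RATE;
    const stiffness = 1.5e8;             // spring constant (unit mass assumed)
    const damping = 0.9999;              // per-step velocity damping

    pos[N >> 1] = 1.0;                   // the "pluck": inject energy mid-chain

    function renderBlock(out) {
      for (let i = 0; i < out.length; i++) {
        for (let m = 1; m < N - 1; m++) {
          // Net force from the two neighboring springs (ends are clamped).
          const force = stiffness * (pos[m - 1] - 2 * pos[m] + pos[m + 1]);
          vel[m] = (vel[m] + force * dt) * damping;
        }
        for (let m = 1; m < N - 1; m++) pos[m] += vel[m] * dt;
        out[i] = pos[N >> 2];            // "microphone" at one point on the chain
      }
    }

    renderBlock(new Float32Array(128));  // e.g. fill one 128-sample block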