Insane how much low-hanging fruit there is for audio models right now. A team of two picking things up over a few months can build something that still competes with large players with tons of funding.
This is amazing.
Is it possible to build in a chosen voice, a bit like Eleven Labs does?
...This may be in the git summary; I'm being lazy and asking anyway :=)
Thanks for your work.