448 points lastdong | 8 comments
1. aargh_aargh ◴[] No.45114957[source]
Is there a current, updated list (ideally, a ranking) of the best open weights TTS models?

I'm actually more interested in STT (ASR) but the choices there are rather limited.

replies(4): >>45115421 #>>45116346 #>>45116488 #>>45116531 #
2. xnx ◴[] No.45115421[source]
Click leaderboard in the hamburger menu: https://huggingface.co/spaces/TTS-AGI/TTS-Arena-V2
replies(2): >>45116081 #>>45116291 #
3. prophesi ◴[] No.45116081[source]
Is there a way to filter out hosted models? The top three winners currently are all proprietary as far as I can tell.

edit: Ah, there's a lock icon next to the name of each proprietary model.

4. odie5533 ◴[] No.45116291[source]
That's a highly incomplete comparison
5. odie5533 ◴[] No.45116346[source]
Best TTS: VibeVoice, Chatterbox, Dia, Higgs, F5-TTS, Kokoro, CosyVoice, XTTS-v2.
replies(1): >>45124463 #
6. Uehreka ◴[] No.45116488[source]
Yes: https://huggingface.co/models?pipeline_tag=text-to-speech

Generally if a model is trending on that page, there’s enough juice for it to be worth a try. There’s a lot of subjective-opinion-having in this space, so beyond “is it trending on HF” the best eval is your own ears. But if something is not trending on HF it is unlikely to be much good.
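
For anyone who wants to bookmark or script that listing, here is a minimal sketch that builds the same Hub URL for a given task. The helper name `hub_listing_url` is made up for illustration, and the `sort=trending` query value and the `automatic-speech-recognition` tag are assumptions about how the site's filters are spelled, so check them in the browser before relying on them.

```python
from urllib.parse import urlencode

# Base listing page, taken from the link above.
HUB_MODELS_URL = "https://huggingface.co/models"


def hub_listing_url(pipeline_tag="text-to-speech", sort=None):
    """Build a Hub model-listing URL for one task (hypothetical helper).

    `sort` is passed through untouched; "trending" is an assumed value.
    """
    params = {"pipeline_tag": pipeline_tag}
    if sort:
        params["sort"] = sort
    return f"{HUB_MODELS_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # TTS listing, same as the link above.
    print(hub_listing_url())
    # Assumed tag for the STT/ASR side of the question.
    print(hub_listing_url("automatic-speech-recognition", sort="trending"))
```

This only constructs links; the judging still comes down to listening to the samples yourself.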

7. watsonmusic ◴[] No.45116531[source]
yes the best
8. kroaton ◴[] No.45124463[source]
Unmute.sh (same team as Kokoro) gets slept on, but it's really good.