257 points amrrs | 4 comments
1. DevX101 ◴[] No.41841586[source]
Has anyone compared combined speech-to-text plus TTS against speech-to-speech for creating audio-only interfaces? Particularly curious about latency and quality of audio output.
replies(2): >>41841719 #>>41846399 #
2. amrrs ◴[] No.41841719[source]
Hugging Face has got a TTS leaderboard (arena like lmsys) - https://huggingface.co/spaces/TTS-AGI/TTS-Arena
3. yavorgiv ◴[] No.41846399[source]
Latency from the announcement: https://x.com/_mfelfel/status/1846025183993511965/photo/1
replies(1): >>41850983 #
4. stuxyz ◴[] No.41850983[source]
also here: https://x.com/play_ht/status/1846240712125452469