
448 points lastdong | 3 comments
egorfine ◴[] No.45114627[source]
[deleted - I'm an idiot]
replies(1): >>45114669 #
x187463 ◴[] No.45114669[source]
Whisper is speech-to-text. VibeVoice is text-to-speech.
replies(2): >>45114712 #>>45114850 #
1. mpeg ◴[] No.45114712[source]
There is a text-to-speech version of Whisper, but IMHO the quality is much worse than the demos of this model.
replies(1): >>45114765 #
2. x187463 ◴[] No.45114765[source]
Are you referring to this?

https://github.com/WhisperSpeech/WhisperSpeech

Or is there some OpenAI official Whisper TTS?

replies(1): >>45114838 #
3. mpeg ◴[] No.45114838[source]
Yep, nothing official that I know of, but that one is fairly popular so maybe they were referring to it (although AFAIK it's not frontier?)