VibeVoice: A Frontier Open-Source Text-to-Speech Model
(microsoft.github.io)
448 points
lastdong
| 3 comments |
03 Sep 25 10:44 UTC
egorfine
◴[
03 Sep 25 11:47 UTC
]
No.
45114627
[source]
▶
>>45114245 (OP)
#
[deleted - I'm an idiot]
replies(1):
>>45114669
#
x187463
◴[
03 Sep 25 11:52 UTC
]
No.
45114669
[source]
▶
>>45114627
#
Whisper is speech-to-text. VibeVoice is text-to-speech.
replies(2):
>>45114712
#
>>45114850
#
1.
mpeg
◴[
03 Sep 25 11:59 UTC
]
No.
45114712
[source]
▶
>>45114669
#
There is a text-to-speech version of Whisper, but IMHO the quality is much worse than the demos of this model.
replies(1):
>>45114765
#
2.
x187463
◴[
03 Sep 25 12:07 UTC
]
No.
45114765
[source]
▶
>>45114712 (TP)
#
Are you referring to this?
https://github.com/WhisperSpeech/WhisperSpeech
Or is there some OpenAI official Whisper TTS?
replies(1):
>>45114838
#
3.
mpeg
◴[
03 Sep 25 12:18 UTC
]
No.
45114838
[source]
▶
>>45114765
#
Yep, nothing official that I know of, but that one is fairly popular, so maybe they were referring to it (although AFAIK it's not frontier?)