←back to thread

448 points lastdong | 1 comments | | HN request time: 0.303s | source
Show context
baal80spam ◴[] No.45114646[source]
Wow. I admit that I am not a native speaker, but this looks (or rather, sounds) VERY impressive and I could mistake it for hearing two people talking.
replies(2): >>45114849 #>>45114891 #
1. tracker1 ◴[] No.45114891[source]
Yeah, a lot of the TTS has gotten really impressive in general. Definitely a clear leap from the TTS stuff I worked with for training simulations a bit over a decade ago. Aside: Installing a sound card (unused) on a windows server just to be able to generate TTS was interesting. It was required by the platform, even if it wasn't used for it.

I generally don't like a lot of the AI generated slop that's starting to pop up on YouTube these days... I do enjoy some of the reddit story channels, but have completely stopped with it all now. With the AI stuff, it really becomes apparent with dates/ages and when numbers are spoken. Dates/ages/timelines are just off as far as story generation, and really should be human tweaked. As to the voice gen, saying a year or measurement is just not how English speakers (US or otherwise) speak.