(github.com)

652 points toebee | 2 comments | 21 Apr 25 17:07 UTC

hemloc_io ◴[21 Apr 25 19:22 UTC] No.43755481[source]▶

Very cool!

Insane how much low-hanging fruit there is for audio models right now. A team of two picking things up over a few months can build something that still competes with large players with tons of funding.

replies(3): >>43757397 #>>43758495 #>>43760210 #

1. kreelman ◴[22 Apr 25 02:15 UTC] No.43758495[source]▶

>>43755481 #

This is amazing. Is it possible to build in a chosen voice, a bit like Eleven Labs does? ...This may be on the git summary, being lazy and asking anyway :=) Thanks for your work.

replies(1): >>43759095 #

2. JonathanFly ◴[22 Apr 25 04:33 UTC] No.43759095[source]▶

>>43758495 (TP) #

Yes, see: https://github.com/nari-labs/dia/blob/main/example/voice_clo...

Show HN: Dia, an open-weights TTS model for generating realistic dialogue