652 points toebee | 2 comments
1. jokethrowaway ◴[] No.43756758[source]
Looking forward to trying it. My current go-to solution is E5-F2 (great cloning, decent delivery, OK audio quality, but enough incoherence here and there that you have to do multiple generations).

I've just been massively disappointed by Sesame's CSM: in the Gradio demo on their website it was generating flawless dialogue with amazing voice cloning. When running it locally, the voice cloning performance is awful.

replies(1): >>43758239 #
2. toebee ◴[] No.43758239[source]
Thanks for the interest! We also enjoyed using E5-F2 :) You can try it now on HF Spaces: https://huggingface.co/spaces/nari-labs/Dia-1.6B