(github.com)

49 points LorenDB | 1 comment | 25 Jun 25 17:15 UTC

ipsum2 ◴[25 Jun 25 18:19 UTC] No.44380330[source]▶

>>44379688 (OP) #

I've been using Nvidia's Parakeet model. It's been better than Whisper v3 large, and it's smaller. It only supports English.

https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

replies(3): >>44380380 #>>44381533 #>>44384824 #

nico ◴[25 Jun 25 18:23 UTC] No.44380380[source]▶

>>44380330 #

Does it need a newer GPU? Or can it run on just CPU?

Would it run on a raspberry pi?

replies(4): >>44380459 #>>44380499 #>>44380936 #>>44382292 #

lupusreal ◴[25 Jun 25 22:16 UTC] No.44382292[source]▶

>>44380380 #

The best CPU TTS that can run on something like a Raspberry Pi is Piper. It can do real-time synthesis on a Raspberry Pi, and on a real computer it runs several times faster with negligible performance cost. I use it for 'reading' ebooks when my eyes get tired. The quality is roughly on par with where Mac OS's TTS was ~10 years ago (the last time I used it). You can tell it's TTS, but it's good enough that you can get used to it fairly easily.

https://github.com/rhasspy/piper

replies(2): >>44382342 #>>44382651 #

1. GaggiX ◴[25 Jun 25 22:26 UTC] No.44382342[source]▶

>>44382292 #

They are talking about STT, not TTS. But I agree that, as a TTS, Piper is very good and works nicely on a Raspberry Pi.

DeepSpeech Is Discontinued (2020)