Show HN: Aqua Voice 2 – Fast Voice Input for Mac and Windows

1. fxtentacle ◴[09 Apr 25 20:40 UTC] No.43637679[source]▶

This looks like it'll slurp up all your data and upload it into a cloud. Thanks, no. I want privacy, offline mode and source code for something as crucial to system security as an input method.

"we also collect and process your voice inputs [..] We leverage this data for improvements and development [..] Sharing of your information [..] service providers [..] OpenAI" https://withaqua.com/privacy

replies(7): >>43637923 #>>43638662 #>>43638673 #>>43638808 #>>43639318 #>>43639535 #>>43640415 #

2. FloatArtifact ◴[09 Apr 25 21:00 UTC] No.43637923[source]▶

>>43637679 (TP) #

Local-only inference is an absolute requirement. It's not even really all that accessible if it's online only. I can say this as someone who has logged over 20,000 hours of voice dictation and computer control.

3. pokstad ◴[09 Apr 25 22:20 UTC] No.43638662[source]▶

>>43637679 (TP) #

This should be on the FAQ. I was trying to find out if it was 100% processed locally.

4. jackthetab ◴[09 Apr 25 22:21 UTC] No.43638673[source]▶

>>43637679 (TP) #

Agreed.

This is where I bounce (out of this discussion).

5. thmsmlr ◴[09 Apr 25 22:42 UTC] No.43638808[source]▶

>>43637679 (TP) #

I totally agree. I created BetterDictation (.com) for exactly that reason. Offline was a super important requirement for me.

6. canada_dry ◴[10 Apr 25 00:06 UTC] No.43639318[source]▶

>>43637679 (TP) #

First thing I looked for and read: the FAQ.

No mention of privacy (or on-prem), so I assumed it's 100% cloud.

Non-starter for me. Accuracy is important, but privacy is more so.

Hopefully a service with these capabilities will become available in which the user first completes a brief training session, that session is sent to the cloud to tailor the recognition parameters to their voice and mannerisms, and the result is then loaded locally.

replies(1): >>43650975 #

7. toddmorey ◴[10 Apr 25 00:46 UTC] No.43639535[source]▶

>>43637679 (TP) #

And man, it's another monthly subscription. I'm not mad at them for finding a gap in the market and putting a business around it. I'm mad at Apple for leaving that gap... hopefully built-in voice dictation improves quickly.

replies(2): >>43639713 #>>43650211 #

8. FireBeyond ◴[10 Apr 25 01:19 UTC] No.43639713[source]▶

>>43639535 #

Is there a gap in the market? It's being rapidly filled with the likes of MacWhisper, etc., which offer local-only, one-off pricing.

9. jmcintire1 ◴[10 Apr 25 03:43 UTC] No.43640415[source]▶

>>43637679 (TP) #

Fair point. Offline and local would be ideal, but as it stands we can't run ASR and an LLM locally at the speed required to provide the level of service we want.

Given that we need the cloud, we offer zero data retention; you can see this in the app. Your concern is as much about UX and communications as it is about privacy.

replies(2): >>43641065 #>>43642213 #

10. mrtesthah ◴[10 Apr 25 05:53 UTC] No.43641065[source]▶

>>43640415 #

MacWhisper does realtime system-wide dictation on your local machine (among other things). Just a one-time fee for an app you download -- the way shareware is supposed to be. Of course it doesn't use MoE transcription with 6 models like Aqua Voice, but if you guys expect to be acquired by Apple (that is your exit strategy, right?), you're going to need better guarantees of privacy than "we don't log".

replies(1): >>43642111 #

11. shinycode ◴[10 Apr 25 09:06 UTC] No.43642111[source]▶

>>43641065 #

I downloaded the turbo Whisper model optimized for Mac and created a Python script that captures the mic input and pastes the result. The Python script is LLM-generated and is triggered by pressing a key. That gets me 80% of the functionality for free, all done locally.
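A minimal sketch of what such a script could look like, not the commenter's actual code. It assumes the third-party packages `sounddevice`, `openai-whisper`, `pyperclip`, and `pyautogui`; the comment does not say which libraries or which Mac-optimized Whisper build were used, and the function names here are hypothetical. The hotkey binding is left out; each stage is a plain function so it can be swapped or stubbed.

```python
from functools import lru_cache
from typing import Any, Callable

SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio


def record_audio(seconds: float = 5.0) -> Any:
    """Record from the default microphone (assumes `sounddevice` is installed)."""
    import sounddevice as sd  # third-party, imported lazily

    frames = int(seconds * SAMPLE_RATE)
    audio = sd.rec(frames, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()  # block until the recording has finished
    return audio.flatten()


@lru_cache(maxsize=1)
def _load_model(name: str) -> Any:
    """Load the model once and reuse it (assumes the `openai-whisper` package)."""
    import whisper  # third-party, imported lazily

    return whisper.load_model(name)


def transcribe_audio(audio: Any, model_name: str = "turbo") -> str:
    """Run speech recognition locally and return the recognized text."""
    return _load_model(model_name).transcribe(audio)["text"]


def paste_text(text: str) -> None:
    """Copy text to the clipboard and paste it into the focused app (macOS keys)."""
    import pyautogui  # third-party, imported lazily
    import pyperclip  # third-party, imported lazily

    pyperclip.copy(text)
    pyautogui.hotkey("command", "v")


def dictate_once(
    record: Callable[[], Any] = record_audio,
    transcribe: Callable[[Any], str] = transcribe_audio,
    paste: Callable[[str], None] = paste_text,
) -> str:
    """Record, transcribe, and paste; skip the paste if nothing was recognized."""
    text = transcribe(record()).strip()
    if text:
        paste(text)
    return text
```

Calling `dictate_once()` from whatever hotkey handler you prefer would record five seconds, transcribe on the local machine, and paste the result; nothing in this sketch sends audio over the network.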

12. fxtentacle ◴[10 Apr 25 09:22 UTC] No.43642213[source]▶

>>43640415 #

The problem if you actually need the cloud is that it kind of completely destroys your business model. OpenAI is bleeding money every month because they massively subsidize the hosting cost of their models. But eventually they will have to post a profit. And then, if they know that your product is completely dependent on their API, they can milk you until there are no profits left for you.

And self-hosting real-time streaming LLMs will probably also come out at 50 cents per hour. Justifying a $120/month price for power users is probably going to be very difficult, especially if there are free open-source alternatives.
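Taking the two figures above at face value (both are rough estimates from this comment, not measured costs), the break-even point can be worked out in a couple of lines. The function name and the 30-day month are assumptions made for illustration.

```python
def break_even_hours(monthly_price: float, cost_per_hour: float) -> float:
    """Hours of self-hosted usage per month that cost as much as the subscription."""
    return monthly_price / cost_per_hour


hours_per_month = break_even_hours(monthly_price=120.0, cost_per_hour=0.50)
hours_per_day = hours_per_month / 30  # assuming a 30-day month

print(hours_per_month)  # 240.0
print(hours_per_day)    # 8.0
```

In other words, under these assumptions a user would have to dictate about eight hours every day before $120/month beats paying 50 cents per hour.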

13. pablopeniche ◴[11 Apr 25 03:23 UTC] No.43650211[source]▶

>>43639535 #

"hopefully built in voice dictation improves quickly." I would not hold my breath on that one lol

14. oulipo ◴[11 Apr 25 06:16 UTC] No.43650975[source]▶

>>43639318 #

A similar but offline tool is VoiceInk. It's also open-source, so you can extend it.