
The era of open voice assistants

(www.home-assistant.io)
879 points _Microft | 9 comments
Jarwain ◴[] No.42468180[source]
I'm actually really excited for this!

I noticed recently there weren't any good open source hardware projects for voice assistants with a focus on privacy. There's another project I've been thinking about where I think the privacy aspect is Important, and figuring out a good hardware stack has been a Process. The project I want to work on isn't exactly a voice assistant, but same ultimate hardware requirements

Something I'm kinda curious about: it sounds like they're planning on a sorta batch manufacturing by resellers type of model. Which I guess is pretty standard for hardware sales. But why not do a sorta "group buy" approach? I guess there's nothing stopping it from happening in conjunction

I've had an idea floating around for a site that enables group buys for open source hardware (or 3d printed items), that also acts like or integrates with github wrt forking/remixing

replies(5): >>42468413 #>>42468436 #>>42468945 #>>42469600 #>>42470457 #
pimeys ◴[] No.42470457[source]
I'm also very excited. I've had some ESP32 microphones before, but they were not really able to understand the wake word, sometimes even when it was quiet and you were sitting next to the mic.

This one looks like it can recognize your voice very well, even when music is playing.

Because... when it works, it's amazing. You get that Star Trek wake word (KHUM-PUTER!), you can connect your favorite LLM to it (ChatGPT, Claude Sonnet, Ollama), you can control your home automation with it and it's as private as you want.

I ordered two of these; if they are great, I will order two more. I've been waiting for this product for years, and hopefully it's finally here.

replies(2): >>42472346 #>>42478129 #
nine_k ◴[] No.42472346[source]
As a side note, it always slightly puzzles me when I see "voice interface" and "private" used together. Maybe it takes living alone to issue voice commands and feel some privacy.

(Yes, I do understand that "privacy" here is mostly about not sending it for processing to third parties.)

replies(4): >>42472512 #>>42473295 #>>42474143 #>>42481940 #
1. iteria ◴[] No.42473295[source]
I don't like these interfaces because unless they are button-activated or something, they must be always listening and sending sound from where you are to a 3rd party server. No thanks. Of course this could be happening with my phone, but at least it would have to be a malicious action to record me 24/7.
replies(5): >>42474134 #>>42476228 #>>42476238 #>>42476613 #>>42477336 #
2. pimeys ◴[] No.42474134[source]
How these ESP32 systems work is that you say a wake word to the device itself. The device detects the word on its own, without an internet connection, and wakes up. Once it is awake, it sends your speech to Home Assistant, which either

  - handles it locally, if you have a fast enough computer
  - sends it to Home Assistant Cloud, if you set it up
  - sends it to ChatGPT, Claude Sonnet etc., if you set it up
I'm planning on building a Proxmox rack server next year, so I'm probably going to just handle all the discussions locally. The Home Assistant Cloud is quite private too, at least that's what they say (and they're in the EU, so I think there might be truth in what they say)...
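
A minimal sketch of the flow described above, in Python. Every name here (`Config`, `route_audio`, the field names, the returned labels) is hypothetical and only mirrors the comment's wording; it is not Home Assistant's actual configuration or API. The priority order between the three options is also an assumption, since the comment only says "either".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    # All fields are illustrative assumptions, not real Home Assistant settings.
    local_stt_available: bool = False   # a fast enough local computer exists
    cloud_enabled: bool = False         # Home Assistant Cloud has been set up
    llm_backend: Optional[str] = None   # e.g. "chatgpt" or "claude", if set up


def route_audio(wake_word_detected: bool, config: Config) -> str:
    """Decide where captured speech goes after on-device wake word detection."""
    if not wake_word_detected:
        # No wake word: the device stays asleep and nothing leaves it.
        return "discard"
    if config.local_stt_available:
        return "local"
    if config.cloud_enabled:
        return "home_assistant_cloud"
    if config.llm_backend:
        return f"llm:{config.llm_backend}"
    # Nothing configured to handle the speech.
    return "unhandled"
```

The point of the sketch is the first branch: without a detected wake word, no audio is routed anywhere, regardless of which backends are configured.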
3. ◴[] No.42476228[source]
4. horsawlarway ◴[] No.42476238[source]
I mean... That's not true, though.

The main pitch of a tool like this is that I can absolutely verify it's not true.

I'm currently running a slightly different take on this (ESP32-based devices, with Whisper through Willow inference server, with Willow autocorrect, tied into Home Assistant).

For context, it works completely offline. My modem can literally be unplugged and I can control my smart devices just fine, with my voice. Entirely on my local network, with a couple of cheap devices and a ten year old gaming PC as the server.

My data

5. gregmac ◴[] No.42477336[source]
FWIW that's not even how Alexa or Google Assistant work. Both listen locally for the wake word with onboard processing, and only when they recognize it do they send the audio stream to the server to fully interpret.

You can test this in a couple ways: they'll respond to their wake word when the internet is down (but have an error response). You can also look at the outbound data and see they're not sending continuous traffic.
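
The outbound-data check can be sketched roughly like this, assuming you have already collected per-second outbound byte counts for the device (for example from your router). The function name, the 2000 bytes/second threshold, and the 90% cutoff are all illustrative assumptions, not measured values for any real product.

```python
def looks_like_continuous_streaming(bytes_per_second,
                                    threshold_bytes=2000,
                                    min_active_fraction=0.9):
    """Return True if nearly every sampled second carries audio-sized traffic.

    bytes_per_second: outbound byte counts, one entry per second.
    threshold_bytes: assumed lower bound for a second of compressed audio.
    min_active_fraction: share of seconds that must exceed the threshold.
    """
    if not bytes_per_second:
        return False
    active = sum(1 for b in bytes_per_second if b >= threshold_bytes)
    return active / len(bytes_per_second) >= min_active_fraction
```

A device that only uploads after its wake word should show short bursts over a mostly quiet baseline, so a minute of samples with ten busy seconds would not be flagged, while a minute of uniformly busy seconds would.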

That's not to say the proprietary products couldn't sneakily change this on the fly and record everything, maybe even turning it on for a specific device or account.

replies(1): >>42477560 #
6. mattmaroon ◴[] No.42477560[source]
The developers could do sneaky things with any device that has wifi and a mic.
replies(1): >>42477787 #
7. adrianN ◴[] No.42477787{3}[source]
And yet most people have a phone in their pocket.
replies(2): >>42479297 #>>42479570 #
8. fsflover ◴[] No.42479297{4}[source]
Try to live without it. It's almost impossible. I try to use Librem 5 as a daily driver, with hardware kill switches and GNU/Linux, and it's not always easy.
9. mattmaroon ◴[] No.42479570{4}[source]
Well that’s my point, we’ve already just accepted the risk. Probably more than half of people think their phone is spying on them but carry it anyway.