The era of open voice assistants

(www.home-assistant.io)
878 points _Microft | 54 comments
1. Jarwain ◴[] No.42468180[source]
I'm actually really excited for this!

I noticed recently there weren't any good open source hardware projects for voice assistants with a focus on privacy. There's another project I've been thinking about where I think the privacy aspect is Important, and figuring out a good hardware stack has been a Process. The project I want to work on isn't exactly a voice assistant, but same ultimate hardware requirements

Something I'm kinda curious about: it sounds like they're planning on a sorta batch manufacturing by resellers type of model. Which I guess is pretty standard for hardware sales. But why not do a sorta "group buy" approach? I guess there's nothing stopping it from happening in conjunction

I've had an idea floating around for a site that enables group buys for open source hardware (or 3d printed items), that also acts like or integrates with github wrt forking/remixing

replies(5): >>42468413 #>>42468436 #>>42468945 #>>42469600 #>>42470457 #
2. Brendinooo ◴[] No.42468413[source]
I invested in Mycroft and it flopped. Here’s hoping some others can go where they couldn’t.
replies(5): >>42468709 #>>42469160 #>>42470293 #>>42470771 #>>42473432 #
3. IgorPartola ◴[] No.42468436[source]
A group buy for an existing product makes sense. Want to buy a 24TB Western Digital hard drive? It’s $350. But if you and your 1000 closest friends get together the price can be $275.
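To put numbers on that example, here is a quick sketch of the arithmetic. It assumes one drive per buyer and counts "you and your 1000 closest friends" as 1001 buyers; the function name is made up for illustration.

```python
def group_buy_savings(retail_price, group_price, buyers):
    """Return (savings per unit, savings in percent, total savings for the group)."""
    per_unit = retail_price - group_price
    percent = 100 * per_unit / retail_price
    return per_unit, percent, per_unit * buyers


# Prices from the example above; 1001 buyers is an assumption (one drive each).
per_unit, percent, total = group_buy_savings(350, 275, 1001)
print(f"${per_unit} per drive ({percent:.1f}%), ${total:,} across the group")
```

So the discount works out to a bit over a fifth of the retail price per drive.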

But for a first time unknown product? You get a lot fewer interested parties. Lots of people want to wait for tech reviews and blog posts before committing to it. And group buys being the only way to get them means availability will be inconsistent for the foreseeable future. I don’t want one voice assistant. I want 5-20, one for every space in my house. But I am not prepared to commit to 20 devices of a first run and I am not prepared to buy one and hope I’ll get the opportunity to buy more later if it doesn’t flop. Stability of the supply chain is an important signal to consumers that the device won’t be abandoned.

replies(5): >>42468476 #>>42468813 #>>42469311 #>>42474130 #>>42478401 #
4. bhaney ◴[] No.42468476[source]
> I am not prepared to buy one and hope I’ll get the opportunity to buy more later

As long as this thing works and there's demand for it, I doubt we'll ever run out of people willing to connect an XU316 and some mics to an ESP32-S3 and sell it to you with HA's open source firmware flashed to it, whether or not HA themselves are still willing to.

replies(1): >>42469475 #
5. ◴[] No.42468709[source]
6. ascorbic ◴[] No.42468813[source]
Kickstarter shows that a lot of people feel differently.
replies(1): >>42470842 #
7. interludead ◴[] No.42468945[source]
Your idea about group buys is really intriguing. I wonder if the community might organically set something like that up once there’s enough interest
8. bdavbdav ◴[] No.42469160[source]
I guess the difference here is that HA has a huge community already. I believe the estimate was around 250k installations running actively. I suspect a huge chunk of the HA users venn diagram slice fits within the voice users slice.
replies(1): >>42470242 #
9. esperent ◴[] No.42469311[source]
> But for a first time unknown product? You get a lot fewer interested parties. Lots of people want to wait for tech reviews and blog posts before committing to it.

I used to think so too. But then Kickstarter proved that actually, as long as you have a good advertising style, communicate well, and get lucky, you can get people to contribute literal millions for a product that hasn't even reached the blueprints stage yet.

replies(2): >>42470393 #>>42470848 #
10. Jarwain ◴[] No.42469475{3}[source]
I agree! I mean, just look at the market for Meshtastic devices! So many options! Or devices with WLED pre-installed! It'll take a Lot for ESP32 to go out of style
11. choffee ◴[] No.42469600[source]
Not really sure what the benefit of group buy would be here. Nabu Casa, the company that supports the development of Home Assistant and developed this product, already has a few products they sell. They had this stocked all over the world for the announcement and it sold out. I assume they had already made a few thousand. They will get more stock now and it will sell just like the other things they make. Any profit from this will go back into development of Home Assistant.
replies(1): >>42472047 #
12. balloob ◴[] No.42470242{3}[source]
Our estimates are more than a million active instances https://analytics.home-assistant.io/
replies(1): >>42470953 #
13. bronco21016 ◴[] No.42470293[source]
I think Mycroft was unfortunately just ahead of its time. STT was just becoming good enough but NLU wasn’t quite there yet. Add in that you’re up against Apple, Google, and Amazon, who were able to add integrations like music and subsidize the crap out of their products.

I just think this time around is different. OpenAI's Whisper gives them amazing STT and LLMs can far more easily be adapted for the NLU portion. The hardware is also dirt cheap which makes it better suited to a narrow use case.

14. pimeys ◴[] No.42470457[source]
I'm also very excited. I've had some ESP32 microphones before, but they were not really able to understand the wake word, sometimes even when it was quiet and you were sitting next to the mic.

This one looks like it can recognize your voice very well, even when music is playing.

Because... when it works, it's amazing. You get that Star Trek wake word (KHUM-PUTER!), you can connect your favorite LLM to it (ChatGPT, Claude Sonnet, Ollama), you can control your home automation with it and it's as private as you want.

I ordered two of these; if they are great, I will order two more. I've been waiting for this product for years, and it's hopefully finally here.

replies(2): >>42472346 #>>42478129 #
15. geerlingguy ◴[] No.42470771[source]
IIRC one of the main devs behind this device came from Mycroft.
replies(2): >>42471981 #>>42473759 #
16. IgorPartola ◴[] No.42470842{3}[source]
Kickstarter isn’t a group buy. Similar, but not the same.
17. IgorPartola ◴[] No.42470848{3}[source]
Kickstarter isn't a group buy.
replies(1): >>42470922 #
18. yunohn ◴[] No.42470922{4}[source]
Kickstarter is often basically a group buy. Project owners make MVPs and market/pitch it, get funding from the public, and then commission a large batch run.
replies(1): >>42477313 #
19. emsixteen ◴[] No.42470953{4}[source]
More than a million? It says on the page: "424,548 Active Home Assistant Installations"

Am I missing something? Is it that these are just those you know are sharing details, and you can scale that up by a known percentage? :)

replies(2): >>42471673 #>>42472285 #
20. schnapsidee ◴[] No.42471673{5}[source]
> Analytics in Home Assistant are opt-in and do not reflect the entire Home Assistant userbase. We estimate that a third of all Home Assistant users opt in.
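
Putting the two numbers from this subthread together gives a rough back-of-the-envelope check. This is only a sketch: it assumes the quoted one-third opt-in share is accurate, and the function name is made up for illustration.

```python
def estimate_total_installations(opted_in: int, opt_in_fraction: float) -> int:
    """Scale an opt-in count up to an estimated total user base."""
    if not 0 < opt_in_fraction <= 1:
        raise ValueError("opt_in_fraction must be in (0, 1]")
    return round(opted_in / opt_in_fraction)


# 424,548 is the figure quoted above from the analytics page;
# 1/3 is the opt-in share the page itself estimates.
estimated_total = estimate_total_installations(424_548, 1 / 3)
print(f"{estimated_total:,}")
```

That comes out a little above 1.27 million, which is consistent with "more than a million active instances".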
21. dole ◴[] No.42471981{3}[source]
OP's username checks out.
22. Jarwain ◴[] No.42472047[source]
Heh thus far I've been an excited spectator of Home Assistant, and wasn't aware of Nabu Casa until doing research for a different comment on the thread. I do love and appreciate their model here

I guess the benefits that came to mind are:

  - an alternative crowdsourced route for sourcing hardware, to avoid things like that Raspberry Pi shortage (although if it's due to broader supply chain issues then this doesn't necessarily help)
  - hardware forks! If someone wanted a version with a more powerful ESP32, or a GPS, or another mic, or an enclosure for a battery and charging and all that, took the time to fork the design to add these features, and found X other users interested in the fork to get it produced... (of course I might be betraying my ignorance on how easy it is to set up this sort of alternative manufacturing chain or what unit amounts are necessary to make this kind of forking economical)

23. alias_neo ◴[] No.42472285{5}[source]
I'm a big fan of home assistant, and use it to control a LOT of my home, have done for years, have tonnes of hardware dedicated to and for it, and I've also ordered some of these Voice devices.

I'm also opted OUT of the analytics.

24. nine_k ◴[] No.42472346[source]
As a side note, it always slightly puzzles me when I see "voice interface" and "private" used together. Maybe it takes living alone to issue voice commands and feel some privacy.

(Yes, I do understand that "privacy" here is mostly about not sending it for processing to third parties.)

replies(4): >>42472512 #>>42473295 #>>42474143 #>>42481940 #
25. staunton ◴[] No.42472512{3}[source]
> Yes, I do understand that "privacy" here is mostly about not sending it for processing to third parties.

Then why does it puzzle you?

replies(1): >>42472822 #
26. entropicdrifter ◴[] No.42472822{4}[source]
Because you wouldn't ask it deeply private questions in front of your mom, for instance
replies(3): >>42473020 #>>42475400 #>>42477749 #
27. iteria ◴[] No.42473295{3}[source]
I don't like these interfaces because unless they are button activated or something, they must be always listening and sending sound from where you are to a 3rd party server. No thanks. Of course this could be happening with my phone, but at least it would have to be a malicious action to record me 24/7
replies(5): >>42474134 #>>42476228 #>>42476238 #>>42476613 #>>42477336 #
28. tacticalturtle ◴[] No.42473432[source]
I believe Mycroft was killed in part due to a patent troll:

https://www.theregister.com/AMP/2023/02/13/linux_ai_assistan...

Hopefully the troll is no longer around

replies(1): >>42474797 #
29. robotfelix ◴[] No.42473759{3}[source]
Yep, Mike Hansen was on the live stream launching the new device. He also notably created Rhasspy [1], which is open-source voice assistant software for Raspberry Pi (when connected to a microphone and speaker).

[1] https://rhasspy.readthedocs.io/en/latest/

30. burningChrome ◴[] No.42474130[source]
>> I want 5-20, one for every space in my house.

I don't have a small house, but I'm trying to think why I would need even 5 of these, let alone 20. The majority of the time my family spends together is in the open layout on our main floor where the kitchen flows into the living room with an adjacent sun room off the living room.

I'm genuinely curious why you need so many of these.

I do agree that if you do have a legit use case for so many, buying so many in essentially a first run is a risky thing. Coupled with the ability for this to be supported for more than a fleeting couple of years is also a huge risk.

replies(2): >>42474789 #>>42477304 #
31. pimeys ◴[] No.42474134{4}[source]
How these ESP32 systems work is that you say a wake word to the device itself. It can detect the word without an internet connection; the device itself understands it and wakes up. After the device is woken up, it sends your speech to Home Assistant, which either

  - handles it locally, if you have a fast enough computer
  - sends it to Home Assistant Cloud, if you set it up
  - sends it to ChatGPT, Claude Sonnet, etc., if you set it up

I'm planning on building a Proxmox rack server next year, so I'm probably going to just handle all the discussions locally. The Home Assistant Cloud is quite private too, at least that's what they say (and they're in the EU, so I think there might be truth in what they say)...
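
The flow described above can be sketched as a toy model. This is not Home Assistant's or ESPHome's actual API: the class, field names, and the priority order between the three options are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssistSetup:
    """Hypothetical description of how a Home Assistant install is set up."""
    fast_local_computer: bool = False
    cloud_configured: bool = False
    llm_provider: Optional[str] = None  # e.g. "chatgpt"; illustrative only


def route_audio(wake_word_detected: bool, setup: AssistSetup) -> str:
    """Return where captured speech would go in this toy model."""
    if not wake_word_detected:
        # Wake word detection runs on the device, so nothing is sent anywhere.
        return "stay on device"
    if setup.fast_local_computer:
        return "process locally"
    if setup.cloud_configured:
        return "send to Home Assistant Cloud"
    if setup.llm_provider is not None:
        return f"send to {setup.llm_provider}"
    return "no speech backend configured"


print(route_audio(False, AssistSetup(cloud_configured=True)))
print(route_audio(True, AssistSetup(fast_local_computer=True)))
```

The point of the first branch is the privacy argument: until the wake word is heard on the device, no audio goes to any server, whichever backend is configured.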
32. pimeys ◴[] No.42474143{3}[source]
Private meaning that a big American corporation is not listening and using my voice to either track me or teach their own AI service with it.
33. Jarwain ◴[] No.42474789{3}[source]
Just using where I might want it in my childhood home as an example:

  - master bedroom
  - master bathroom
  - grandma's room
  - my room
  - brother's room
  - upstairs bathroom
  - upstairs loft?
  - office room
  - living room/dining room
  - kitchen/kitchen table/family room
  - garage?

9-14 devices for a 5 person household. May be a stretch since I'm not sure if my grandma could even really use it. Bathroom's a stretch but I'm imagining being in the shower and wanting to note multiple showerthoughts

34. NoNotTheDuo ◴[] No.42474797{3}[source]
I think another part is that there is a failure mechanism on their boards that was recently identified: https://community.openconversational.ai/t/sj-201-sj201-failu...

The short version, from the post, is that there are 4 capacitors that are only rated for 6.3 V, but the power supply is 12 V. Eventually one of these capacitors will fail, causing the board to stop working entirely.
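
As a rough illustration of how far out of spec that is, here is a minimal sketch. It assumes the capacitors see the full supply voltage, which is how the summary above reads; the function name is made up for illustration.

```python
def voltage_stress_ratio(applied_volts: float, rated_volts: float) -> float:
    """Ratio of applied voltage to rated voltage; above 1.0 means over the rating."""
    if rated_volts <= 0:
        raise ValueError("rated_volts must be positive")
    return applied_volts / rated_volts


# Figures from the linked post as summarized above: 6.3 V parts on a 12 V supply.
# Assumption: the capacitors see the full supply voltage.
ratio = voltage_stress_ratio(12.0, 6.3)
print(f"{ratio:.2f}x the rated voltage")
```

That is roughly 1.9 times the rating.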

It would be hard for a company to stay in business when they are fighting a patent troll lawsuit and having to handle returns on every device they sold through kickstarter.

35. xandrius ◴[] No.42475400{5}[source]
There are levels of privacy. Because I'm not going to ask deeply private questions, it doesn't mean that I want everyone to be snooping into what I'm planning to eat tonight.
36. ◴[] No.42476228{4}[source]
37. horsawlarway ◴[] No.42476238{4}[source]
I mean... That's not true, though.

The main pitch of a tool like this is that I can absolutely verify it's not true.

I'm currently running a slightly different take on this (ESP32-based devices, with Whisper through Willow inference server, with Willow autocorrect, tied into Home Assistant).

For context, it works completely offline. My modem can literally be unplugged and I can control my smart devices just fine, with my voice. Entirely on my local network, with a couple of cheap devices and a ten year old gaming PC as the server.

My data

38. IgorPartola ◴[] No.42477304{3}[source]
I have four bedrooms, living/family room, study, office, rumpus room, garage, workshop, and am trying to build out a basement with three more rooms. Each of these rooms has some form of smart lighting or devices like TVs or thermostats that people have a much easier time controlling with voice than phone apps. Granted, this may sound extravagant, but I have a large family so all this space is very well utilized, hence the need for a basement expansion. Again, at $25/room and bought over time the Echo Dots are a really simple way to add very easy to use controls that require almost no user training. We pause the living room TV and “set condition two throughout the fleet” at the end of the day with these devices.
replies(1): >>42478161 #
39. IgorPartola ◴[] No.42477313{5}[source]
A group buy is when you want to buy a bunch of existing product at wholesale prices. Kickstarter is about funding new projects that don’t exist yet. Like if the wholesaler refuses to sell you 1000 video cards, you just give the money back. If you spend the Kickstarter money and can’t land a product there isn’t much you can do for refunds.
replies(1): >>42480396 #
40. gregmac ◴[] No.42477336{4}[source]
FWIW that's not even how Alexa or Google Assistant work. Both listen locally for the wake word with onboard processing, and only when they recognize it do they send the audio stream to the server to fully interpret.

You can test this in a couple ways: they'll respond to their wake word when the internet is down (but have an error response). You can also look at the outbound data and see they're not sending continuous traffic.

Not to say with the proprietary products that they couldn't sneakily change this on the fly and record everything, maybe even turning it on for a specific device or account.

replies(1): >>42477560 #
41. mattmaroon ◴[] No.42477560{5}[source]
The developers could do sneaky things with any device that has wifi and a mic.
replies(1): >>42477787 #
42. jorvi ◴[] No.42477749{5}[source]
You wouldn’t ask your partner deeply private questions in front of your mom either. Not sure how you think it’s a dig against voice assistant privacy.
43. adrianN ◴[] No.42477787{6}[source]
And yet most people have a phone in their pocket.
replies(2): >>42479297 #>>42479570 #
44. ijidak ◴[] No.42478129[source]
I'm trying to understand. Is there an SDK I can use to enhance this? Or is this a package product?

I'm really hoping it's the former. But I don't see any information about how to develop with this.

replies(1): >>42478250 #
45. 8n4vidtmkvmk ◴[] No.42478161{4}[source]
What's condition 2?
replies(1): >>42479882 #
46. pimeys ◴[] No.42478250{3}[source]
Yep, ESPHome SDK. It's all open source and well-documented:

https://esphome.io/

Some notable blog posts, docs and a video on the wake words and voice assistant usage:

https://community.home-assistant.io/t/on-device-wake-word-on...

https://esphome.io/components/voice_assistant.html

https://www.home-assistant.io/voice_control/create_wake_word...

https://www.youtube.com/watch?v=oSKBWtBJyDE

47. darkwater ◴[] No.42478401[source]
There are two types of "group buy". The one that you illustrated, but also one not only focused on saving bucks but also helping small, independent makers/producers to sell their usually more sustainable or more private product (which is also usually more expensive due to the lack of economies of scale).
48. fsflover ◴[] No.42479297{7}[source]
Try to live without it. It's almost impossible. I try to use Librem 5 as a daily driver, with hardware kill switches and GNU/Linux, and it's not always easy.
49. mattmaroon ◴[] No.42479570{7}[source]
Well that’s my point, we’ve already just accepted the risk. Probably more than half of people think their phone is spying on them but carry it anyway.
50. IgorPartola ◴[] No.42479882{5}[source]
It’s a reference to Battlestar Galactica. They would say that phrase to mean that the fleet is on standby. Condition one meant under attack. For us here it means turn off the lights.
51. wlonkly ◴[] No.42480396{6}[source]
It's both, I think, depending on the conventions around the thing being group-bought.

What you describe is unquestionably a group buy, but in, for example, the mechanical keyboards community, a "group buy" is paying the designer of a thing (keyboard, keycap set, etc.) for the expense of third-party production up front. It's really more of a preorder that requires a certain volume to proceed. But regardless, they're called group buys in that hobby.

(With expected mixed results, I should add -- plenty of keyboard "group buys" never come to fruition, and since they're not backed by a Kickstarter-like platform, the money is just gone. The /r/mechanicalkeyboards subreddit has many such stories.)

replies(1): >>42483490 #
52. sangnoir ◴[] No.42481940{3}[source]
> Maybe it takes living alone to issue voice commands and feel some privacy

Perhaps your definition of "private" is more stringent than most people's. Collective privacy exists, for example "The family would appreciate some privacy as they grieve". It is correct to term something "private" when it is shared with your entire household, but no one else.

53. IgorPartola ◴[] No.42483490{7}[source]
Hah I was recently looking at that subreddit and yeah that’s why I don’t like the idea of that kind of group buy. It’s a gamble on everything working out and everyone doing the right things. I also would argue that requesting a known designer/manufacturer to make N of a specific item is different than asking an unknown designer to do so for the first time. Terminology aside, this is my original point: that is a risky way of doing things and communicates to the consumer that the product is unlikely to just get made and be available.
replies(1): >>42483965 #
54. wlonkly ◴[] No.42483965{8}[source]
Absolutely. If one was uncharitable, one might suggest the reason that some of those group-buy-powered companies run their own storefront instead of Kickstarter is so that their customers associate them with "buying" and not with "funding".