The era of open voice assistants

(www.home-assistant.io)


lxe ◴[20 Dec 24 04:42 UTC] No.42468351[source]▶

Here's what I'm looking for in a voice assistant:

- Full privacy: nothing goes to the "cloud"

- Non-shitty microphones and processing: I want to be heard without having to yell, repeat, or correct

- No wake words: it should listen to everything, process it, and understand when it's being addressed. Since everything is private and local, this is now doable

- Conversational: it should understand when I finished talking, have ability to be interrupted, all with low latency

- Non-stupid: it's 2024, and Alexa and Siri and Google are somehow absolutely abysmal at doing even the basics

- Complete: I don't want to use an app to get stuff configured. I want everything to be controlled via voice

replies(5): >>42468394 #>>42468471 #>>42468967 #>>42470013 #>>42471806 #

danparsonson ◴[20 Dec 24 04:55 UTC] No.42468394[source]▶

>>42468351 #

> No wake words: it should listen to everything, process it, and understand when it's being addressed

Even humans struggle with this one - that's what names are for!

replies(2): >>42468438 #>>42481564 #

1. antonyt ◴[20 Dec 24 05:07 UTC] No.42468438[source]▶

>>42468394 #

Yeah, I’m having a hard time imagining how no-wake-word could work in practice.

replies(3): >>42468837 #>>42470838 #>>42473855 #

2. fragmede ◴[20 Dec 24 06:44 UTC] No.42468837[source]▶

>>42468438 (TP) #

after setting up the system, if I say "turn the ceiling lights to 20%", who else would be changing the lights?

But also, post-fix wake word would also be natural if it was recording all the time. "turn on the lights, Google", for instance
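
A minimal sketch of the post-fix idea, assuming the always-on recognizer hands over one finished utterance at a time as text. The `WAKE_WORD` constant and the `extract_postfix_command` helper are hypothetical names for illustration, not part of any real assistant's API:

```python
import re

# Assumption: the name the assistant answers to, taken from the example above.
WAKE_WORD = "google"

# Match "<command>, <wake word>" at the very end of an utterance.
_POSTFIX = re.compile(
    r"^(?P<command>.+?)[,\s]+" + re.escape(WAKE_WORD) + r"[.!?]*$",
    re.IGNORECASE,
)

def extract_postfix_command(utterance):
    """Return the command if the utterance ends with the wake word, else None."""
    match = _POSTFIX.match(utterance.strip())
    return match.group("command").strip() if match else None
```

Here `extract_postfix_command("turn on the lights, Google")` yields the command text, while an utterance that merely mentions the name earlier in the sentence is ignored. The catch is that the system has to buffer the whole utterance before it knows whether it was being addressed.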

replies(2): >>42472751 #>>42476125 #

3. ethbr1 ◴[20 Dec 24 13:18 UTC] No.42470838[source]▶

>>42468438 (TP) #

Like that really annoying friend who jumps in every other sentence with "Well actually..."

replies(1): >>42472167 #

4. marcosdumay ◴[20 Dec 24 15:56 UTC] No.42472167[source]▶

>>42470838 #

I have a coworker who set up an Alexa a year or so ago. I don't know what the issue was, but it would jump into Teams meetings after every noise in his house.

5. TheCoelacanth ◴[20 Dec 24 17:04 UTC] No.42472751[source]▶

>>42468837 #

Someone in a TV show that you're watching?

replies(1): >>42479383 #

6. lukifer ◴[20 Dec 24 19:06 UTC] No.42473855[source]▶

>>42468438 (TP) #

This is one advantage of a system with a constrained set of commands/grammars, as opposed to the Alexa/Siri model of trying to process all arbitrary text while in active mode. It can simply ignore/discard any invocations which don't match those specific grammars (and no need to wait to confirm that the device is awake).

"Computer, turn lights to 50%" -> "turn lights to fifty percent" -> {action: "lights", value: 50}

"My new computer has a really beefy graphics card" -> "has a really beefy graphics card" -> {action: null}
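
The mapping above can be sketched roughly as follows, assuming the speech-to-text step has already produced a transcript. The `GRAMMARS` table, the `NUMBER_WORDS` lookup, and the `parse` function are hypothetical and only cover the single lights command from the example:

```python
import re

# Assumption: a tiny lookup for spoken numbers; a real system would need more.
NUMBER_WORDS = {"ten": 10, "twenty": 20, "fifty": 50, "hundred": 100}

# Assumption: one hand-written grammar per supported command.
GRAMMARS = [
    (re.compile(r"turn (?:the )?lights to (\w+) ?(?:percent|%)"), "lights"),
]

def parse_value(token):
    """Convert '50' or 'fifty' to an int; return None if unrecognized."""
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)

def parse(transcript):
    """Return an action dict if the transcript matches a grammar, else a null action."""
    text = transcript.lower().strip()
    for pattern, action in GRAMMARS:
        match = pattern.search(text)
        if match:
            value = parse_value(match.group(1))
            if value is not None:
                return {"action": action, "value": value}
    # Anything that matches no grammar is simply discarded.
    return {"action": None}
```

Because unmatched speech falls through to the null action, ordinary conversation never triggers anything, which is the property the constrained-grammar approach relies on.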

replies(1): >>42475451 #

7. danparsonson ◴[20 Dec 24 23:21 UTC] No.42476125[source]▶

>>42468837 #

Sure, if the system is set up to only respond to very specific commands that humans would not respond to, I guess that could work. I was thinking more about the other way around, where a person might speak to someone else in the room and be overheard and acted upon - "turn on the lights!" could be a command for the computer controlling the room, or the human standing next to the Christmas tree, for example.

8. joshstrange ◴[21 Dec 24 12:53 UTC] No.42479383{3}[source]▶

>>42472751 #

I’ve never had Alexa control a device via a TV show’s audio, but playing back a video of me testing my home automation (“Alexa, do X”) triggered my lights.

I’d love a no-wake-word world where something locally was always chewing on what you said but I’m not sure how well it would work in practice.

I think it would only take 1-2 instances of it hearing “Hey, who turned off the lights?” in a show and turning off my lights for real (and scaring the crap out of me). Doctor Who isn’t particularly scary, but if I was watching Silence in the Library and that line turned off my lights, I’d be spooked and it would take me a hot minute to realize what happened.
