
The era of open voice assistants

(www.home-assistant.io)
878 points _Microft | 1 comments
lxe ◴[] No.42468351[source]
Here's what I'm looking for in a voice assistant:

- Full privacy: nothing goes to the "cloud"

- Non-shitty microphones and processing: I want to be heard without having to yell, repeat, or correct myself

- No wake words: it should listen to everything, process it, and understand when it's being addressed. Since everything is private and local, this is now doable

- Conversational: it should understand when I've finished talking and let me interrupt it, all with low latency

- Non-stupid: it's 2024, and Alexa, Siri, and Google are somehow absolutely abysmal at doing even the basics

- Complete: I don't want to use an app to get stuff configured. I want everything to be controlled via voice

replies(5): >>42468394 #>>42468471 #>>42468967 #>>42470013 #>>42471806 #
danparsonson ◴[] No.42468394[source]
> No wake words: it should listen to everything, process it, and understand when it's being addressed

Even humans struggle with this one - that's what names are for!

replies(2): >>42468438 #>>42481564 #
antonyt ◴[] No.42468438[source]
Yeah, I’m having a hard time imagining how no-wake-word could work in practice.
replies(3): >>42468837 #>>42470838 #>>42473855 #
1. lukifer ◴[] No.42473855[source]
This is one advantage of a system with a constrained set of commands/grammars, as opposed to the Alexa/Siri model of trying to process arbitrary text while in active mode. It can simply ignore/discard any invocation that doesn't match one of those specific grammars (and there's no need to wait to confirm that the device is awake).

"Computer, turn lights to 50%" -> "turn lights to fifty percent" -> {action: "lights", value: 50}

"My new computer has a really beefy graphics card" -> "has a really beefy graphics card" -> {action: null}

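A rough Python sketch of the two examples above, under some assumptions of mine: the speech-to-text step has already produced a text string, "computer" is the name being listened for, and there is a single lights grammar. The names `extract_command`, `words_to_int` and `parse` are hypothetical, not taken from any real assistant.

```python
import re

NAME = "computer"  # assumed trigger word, taken from the examples above

# Tiny number-word table: just enough for percentages (an illustration only).
_UNITS = ("zero one two three four five six seven eight nine ten eleven twelve "
          "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
_TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()
_NUMBER_WORDS = {w: i for i, w in enumerate(_UNITS)}
_NUMBER_WORDS.update({w: 20 + 10 * i for i, w in enumerate(_TENS)})

# The one grammar this sketch knows: "<verb> [the] light(s) to <value> percent".
_LIGHTS = re.compile(
    r"(?:set|turn|dim) (?:the )?lights? to (?P<value>[a-z0-9 -]+?)\s*(?:percent|%)"
)


def words_to_int(text):
    """Parse '50', 'fifty', 'twenty five' or 'one hundred'; None if not a number."""
    tokens = text.replace("-", " ").split()
    if len(tokens) == 1 and tokens[0].isdecimal():
        return int(tokens[0])
    if tokens in (["hundred"], ["one", "hundred"], ["a", "hundred"]):
        return 100
    if len(tokens) == 1:
        return _NUMBER_WORDS.get(tokens[0])
    if len(tokens) == 2 and tokens[0] in _TENS and tokens[1] in _UNITS[1:10]:
        return _NUMBER_WORDS[tokens[0]] + _NUMBER_WORDS[tokens[1]]
    return None


def extract_command(utterance):
    """Return the words following NAME, or None if NAME was never said."""
    cleaned = re.sub(r"[^a-z0-9% -]+", " ", utterance.lower())
    tokens = cleaned.split()
    if NAME not in tokens:
        return None
    return " ".join(tokens[tokens.index(NAME) + 1:])


def parse(utterance):
    """Map an utterance to an action dict; anything off-grammar is discarded."""
    command = extract_command(utterance)
    if command is not None:
        match = _LIGHTS.fullmatch(command)
        if match:
            value = words_to_int(match.group("value"))
            if value is not None and 0 <= value <= 100:
                return {"action": "lights", "value": value}
    return {"action": None}


print(parse("Computer, turn lights to 50%"))
print(parse("My new computer has a really beefy graphics card"))
```

Because the whole command has to match the grammar, ordinary conversation that happens to contain the name falls through to `{"action": None}` and there is no "awake" state to manage.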
replies(1): >>42475451 #