
The era of open voice assistants

(www.home-assistant.io)
878 points _Microft | 1 comments
lxe ◴[] No.42468351[source]
Here's what I'm looking for in a voice assistant:

- Full privacy: nothing goes to the "cloud"

- Non-shitty microphones and processing: I want to be heard without having to yell, repeat, or correct

- No wake words: it should listen to everything, process it, and understand when it's being addressed. Since everything is private and local, this is now doable

- Conversational: it should understand when I've finished talking and be interruptible, all with low latency

- Non-stupid: it's 2024, and alexa and siri and google are somehow absolutely abysmal at doing even the basics

- Complete: I don't want to use an app to get stuff configured. I want everything to be controlled via voice

replies(5): >>42468394 #>>42468471 #>>42468967 #>>42470013 #>>42471806 #
danparsonson ◴[] No.42468394[source]
> No wake words: it should listen to everything, process it, and understand when it's being addressed

Even humans struggle with this one - that's what names are for!

replies(2): >>42468438 #>>42481564 #
1. lxe ◴[] No.42481564[source]
Wake words are different from "listen to everything until name is called". A wake word is needed for both privacy and technical reasons -- you can't just have alexa beaming everything it hears to amazon. So instead it uses a local lightweight "dumb" system to listen to specific words only.

That's exactly why there are massive latencies between command recognition, processing, and execution.

Imagine if it had sub-ms response to "assistant, add uuh eggs and milk to the shopping list... actually no just eggs sorry"
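
To make the wake word gating concrete, here's a rough sketch of the idea, not how any real assistant is implemented. Text tokens stand in for audio frames, and the names (`gate_by_wake_word`, `WAKE_WORD`, the `<silence>` end marker) are all made up for illustration. The point is only that the cheap local check sees everything, while the heavier stage gets nothing until the wake word opens the gate:

```python
from typing import Iterable, List, Optional

WAKE_WORD = "assistant"  # hypothetical wake word, chosen for illustration


def gate_by_wake_word(
    frames: Iterable[str],
    wake_word: str = WAKE_WORD,
    end_marker: str = "<silence>",
) -> List[List[str]]:
    """Return only the utterances that follow the wake word.

    Each "frame" is a stand-in token for a chunk of audio. Anything heard
    while the gate is closed is dropped locally and never forwarded.
    """
    forwarded: List[List[str]] = []
    current: Optional[List[str]] = None  # None means the gate is closed
    for frame in frames:
        if current is None:
            # The only job of the lightweight "dumb" detector: spot one word.
            if frame == wake_word:
                current = []
        elif frame == end_marker:
            # End of the utterance: hand it off and close the gate again.
            forwarded.append(current)
            current = None
        else:
            current.append(frame)
    if current is not None:
        # The stream ended while the gate was still open.
        forwarded.append(current)
    return forwarded


heard = "private chat <silence> assistant add eggs <silence> more private talk".split()
print(gate_by_wake_word(heard))
```

In this toy version, the heavy recognizer only starts working after the gate opens, which is one plausible source of the delay described above. A no-wake-word design would instead run the full model on every frame.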