> Theoretically, saying, “order an Uber to airport” seems like the easiest way to accomplish the task. But is it? What kind of Uber? UberXL, UberGo? There’s a 1.5x surge pricing. Acceptable? Is the pickup point correct? What would be easier, resolving each of those queries through a computer asking questions, or taking a quick look yourself on the app?
> Another example is food ordering. What would you prefer, going through the menu from tens of restaurants yourself or constantly nudging the AI for the desired option? Technological improvement can only help so much here since users themselves don’t clearly know what they want.
Of course a conversational interface is useless if it just tries to do the same thing as a web UI. That is why it failed a decade ago when it was trendy: the tech was nowhere near clever enough to make it useful. But today, I'd bet the other way round.
Such a dialog is probably nice for a first-time user, but it is a nightmare for a repeat user.
Then it can assume your choices haven't changed and propose a solution that matches your previous choices. And to give the user control, it just needs to explicitly tell the user about the assumptions it made.
In fact, a smart enough system could even see when violating the assumptions could lead to a substantial gain and try convincing the user that it may be a good option this time.
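A rough sketch of that idea in Python, to make it concrete. Everything here is made up for illustration (`Option`, `propose`, the 25% threshold, the prices); it is not how any real ride app works, just one way the "assume the usual, say so, and speak up when a change pays off" logic could look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    ride_type: str   # e.g. "UberGo", "UberXL"
    pickup: str      # hypothetical pickup label
    price: float     # hypothetical quoted price


def propose(previous: Option, available: list[Option], min_saving: float = 0.25) -> str:
    """Propose the user's usual choice, state the assumption, and mention
    an alternative only when it saves at least `min_saving` of the price."""
    # Assumption: "same as last time" means same ride type and pickup point.
    matching = [
        o for o in available
        if o.ride_type == previous.ride_type and o.pickup == previous.pickup
    ]
    if not matching:
        return "Your usual option is not available right now; please pick one."

    default = min(matching, key=lambda o: o.price)
    # Tell the user explicitly which assumption was made.
    message = (
        f"Assuming {default.ride_type} from {default.pickup} as last time, "
        f"at {default.price:.2f}."
    )

    # Only break the assumption when the gain is substantial.
    cheapest = min(available, key=lambda o: o.price)
    if cheapest != default and default.price > 0:
        saving = (default.price - cheapest.price) / default.price
        if saving >= min_saving:
            message += (
                f" Switching to {cheapest.ride_type} from {cheapest.pickup} "
                f"would save {saving:.0%}."
            )
    return message


if __name__ == "__main__":
    last_time = Option("UberGo", "Home", 28.0)
    quotes = [
        Option("UberGo", "Home", 30.0),
        Option("UberXL", "Home", 45.0),
        Option("UberGo", "Corner", 20.0),
    ]
    print(propose(last_time, quotes))
```

The user still gets one short message instead of a round of questions, and the stated assumption is what they would correct if something changed.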
Talking is not very efficient, and it's serial in fixed time. With something visual you can look at whatever you want whenever you want, at your own (irregular) pace.
You will also be able to make changes much faster. You can go to the target form element right away, and you get immediate feedback from the GUI (or from a physical control that you moved - e.g. in cars). If it's talk, you need to wait to have it said back to you, for the same reason that important communication in flight control or the military is always read back. Even humans misunderstand. You can't just talk-and-forget unless you accept errors.
You would need some true intelligence for just some brief spoken requests to work well enough. A (human) butler worked fine for such cases, but even then only the best made it into such high-level service positions, because it required real intelligence to know what your lord needed and wanted, and lots of time with them to gain that experience.
I used to be a reading-the-blog-over-watching-the-video person, but for some things I've come to appreciate the video version. The reason you want the video of the whatever is that in the blog post, what's written down is only what the author thought was important. But I'm not them. I don't know everything they know and I don't see everything they see. I can't do everything they do, but with the video I get everything. When you perform the whatever, the video has every detail, not just the ones you think are important. That bit between step 1 and step 2 that's obvious? It's not obvious to everyone, or mine is broken in a slightly different way, so I really need to see that bit between 1 and 2. Of course, videos get edited and cut, so they don't always have that benefit, but I've grown to appreciate them.