> Theoretically, saying, “order an Uber to airport” seems like the easiest way to accomplish the task. But is it? What kind of Uber? UberXL, UberGo? There’s a 1.5x surge pricing. Acceptable? Is the pickup point correct? What would be easier, resolving each of those queries through a computer asking questions, or taking a quick look yourself on the app?
> Another example is food ordering. What would you prefer, going through the menu from tens of restaurants yourself or constantly nudging the AI for the desired option? Technological improvement can only help so much here since users themselves don’t clearly know what they want.
How many of these inconveniences will you put up with? Any of them, all of them? What price difference makes it worthwhile? What if by traveling a day earlier you save enough money to even pay for a hotel...?
All of that is for just 1 flight; what if there are several alternatives? I can't imagine having a dialogue about this with a computer.
Similarly, long before Waymo, you'd get into a taxi, and tell the human driver you're going to the airport, and they'd take you there. In fact, they'd get annoyed at you if you backseat drove, telling them how to use the blinker and how hard to brake and accelerate.
The thing about conversational interfaces is that we're used to them, because we (well, some of us) interface with other humans fairly regularly, and so it's a fairly baseline level skill to have to exist in the world today. There's a case to be made against them, but since everyone can be assumed to be conversational (though perhaps not in a given language), they're here to stay. Restaurants have menus that customers look at before using the conversation interface to get food, in order to guide the discussion, and that's had thousands of years to evolve, so it might be a local maximum, but it's a pretty good one.
Of course a conversational interface is useless if it tries to just do the same thing as a web UI, which is why it failed a decade ago when it was trendy, because the tech was nowhere near clever enough to make that useful. But today, I'd bet the other way round.
Such a dialog is probably nice for a first-time user, but it is a nightmare for a repeat user.
Amen to that. I guess it would help to get off the IT high horse and have a talk with linguists and philosophers of language. They have been dealing with this shit for centuries now.
Then it can assume your choices haven't changed, and propose a solution that matches your previous choices. And to give the user control, it just needs to explicitly tell the user about the assumptions it made.
In fact, a smart enough system could even see when violating the assumptions could lead to a substantial gain and try convincing the user that it may be a good option this time.
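To make that concrete, here's a rough sketch of what I mean, not a real booking API. Everything in it (`Flight`, `propose`, the `gain_threshold` of 30%) is made up for illustration: pick the cheapest option that matches the remembered preferences, say out loud which assumptions were used, and only pitch a preference-breaking alternative when the saving is substantial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    # Hypothetical record; the fields are illustrative, not a real schema.
    name: str
    price: float
    direct: bool


def propose(flights, prefs, gain_threshold=0.3):
    """Return (choice, assumptions, alternative).

    choice       -- cheapest flight matching the remembered preferences
    assumptions  -- human-readable list of what was assumed, so the user can object
    alternative  -- a cheaper flight that breaks the preferences, offered only
                    if it saves at least `gain_threshold` of the chosen price
    """
    matching = [f for f in flights
                if all(getattr(f, key) == value for key, value in prefs.items())]
    if not matching:
        return None, [], None

    choice = min(matching, key=lambda f: f.price)
    assumptions = [f"assumed {key} = {value}" for key, value in prefs.items()]

    cheapest = min(flights, key=lambda f: f.price)
    saving = (choice.price - cheapest.price) / choice.price if choice.price else 0.0
    alternative = cheapest if saving >= gain_threshold else None
    return choice, assumptions, alternative


flights = [
    Flight("A", 400.0, direct=True),
    Flight("B", 220.0, direct=False),
    Flight("C", 380.0, direct=True),
]
choice, assumptions, alternative = propose(flights, {"direct": True})
```

With this toy data the system would book C, state that it assumed you still want a direct flight, and mention B because the layover saves roughly 40%. Raise the threshold and it stops nagging you about B.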
The booking experience today is granular to help you find a suitable flight to meet all the preferences you’re compiling into an optimal scenario. The experience of AI booking in the future will likely be similar: find that optimal scenario for you once you’re able to articulate your preferences and remember them over time.
I guess there's just no substitute for someone actually doing the work of figuring out the most appropriate HMI for a given task or situation, be it voice controls, touch screens, physical buttons or something else.
Talking is not very efficient, and it's serial in fixed time. With something visual you can look at whatever you want whenever you want, at your own (irregular) pace.
You will also be able to make changes much faster. You can go to the target form element right away, and you get immediate feedback from the GUI (or from a physical control that you moved - e.g. in cars). If it's talk, you need to wait to have it said back to you - same reason as why important communication in flight control or military is always read back. Even humans misunderstand. You can't just talk-and-forget unless you accept errors.
You would need some true intelligence for just some brief spoken requests to work well enough. A (human) butler worked fine for such cases, but even then only the best made it into such high-level service positions, because it required real intelligence to know what your lord needed and wanted, and lots of time with them to gain that experience.
Anecdata: last year my wife and I went on a rail tour through Eastern Europe and god, I wish we had chosen to spend a few hundred euros on a travel agency in retrospect - I can't count just how much time we had to spend researching what kind of rail, bus and public transit tickets you need on which leg, how to create accounts, set up payment and godknowswhat else. Easily took us two days' worth of work and about two dozen individual payment transactions. A professional travel agency can do all the booking via Sabre, Amadeus or whatever...
Who said it cannot be visual? It's still a “conversational” UI if it's a chatbot that writes down its answer.
> Similar reason why many people prefer a blog post over a video.
Well I certainly do, but I also know that we are few and far between in that case. People in general prefer videos over blog posts by a very large margin.
> Talking is not very efficient, and it's serial in fixed time. With something visual you can look at whatever you want whenever you want, at your own (irregular) pace. You will also be able to make changes much faster. You can go to the target form element right away, and you get immediate feedback from the GUI.
Saying “I want to travel to Berlin next Monday” is much faster than fighting with the website's custom datepicker, which will block you until you select a return date, until you realize you need to go back and toggle the “one way trip” button before clicking the calendar, otherwise it doesn't work…
There's a reason why nerds love their terminal: GUIs are just very slow and annoying. They are useful for whatever new thing you're doing, because they're much more discoverable than a CLI, but they're much less efficient.
> If it's talk, you need to wait to have it said back to you - same reason as why important communication in flight control or military is always read back. Even humans misunderstand. You can't just talk-and-forget unless you accept errors.
This is true, but stays true with a GUI, that's why you have those pesky confirmation pop-ups, because as annoying as they are when you know what you're doing, they are necessary to catch errors.
> You would need some true intelligence for just some brief spoken requests to work well enough.
I don't think so. IMO you just need something that emulates intelligence well enough for that particular purpose. And we've seen that LLMs are pretty decent at emulating apparent intelligence, so I wouldn't bet against them on that.
Maybe I'm tired of layovers and I'm willing to pay more for a direct flight this time. Maybe I want a different selection at a restaurant because I'm in the mood for tacos rather than a burrito.
The whole point is that we currently have better, more efficient ways of doing those things, so why would we regress to inferior methods?
To relate to the article - Google Flights is the Keyboard and Mouse - covering 80% of cases very quickly. Conversational is better for when you're juggling more contextual info than what can be represented in a price/departure time/flight duration table. For example, "I'm bringing a small child with me and have an appointment the day before and I really hate the rain".
Rushed comment because I'm working, but I hope you get the gist.
Current flight planning UX is overfit on the 80% and will never cater to the 20%, because the cost/benefit of the development work isn't good.
That's why the “advanced search” is almost always hidden somewhere. And that's also why you can never find the filter you need on an e-shopping website.
How long is it going to take you to get to a device, load the app/webpage, tell it which airport you're flying from and going to and what date, and then start looking at options? You've blown way past the 10 seconds it took for that executive to get a plane flight.
Better is in the eye of the beholder. What's monetarily efficient isn't going to be temporally efficient, and that's true along a lot of other dimensions too.
Point is, there are some people that like having conversations. You may not be one of them. You don't have to be. I'm not taking away your mouse and keyboard. I have those too and won't give them up either. But I also find talking out loud helps my thinking process, though I know that's not everybody.
I used to be a read-the-blog-over-watch-the-video person, but for some things I’ve come to appreciate the video version. The reason you want the video of the whatever is that in the blog post, what’s written down is only what the author thought was important. But I’m not them. I don’t know everything they know and I don’t see everything they see. I can’t do everything they do, but with the video I get everything. When you perform the whatever, the video has every detail, not just the ones you think are important. That bit between step 1 and step 2 that’s obvious? It’s not obvious to everyone, or mine is broken in a slightly different way so that I really need to see that bit between 1 and 2. Of course, videos get edited and cut so they don’t always have that benefit, but I’ve grown to appreciate them.
You can't be serious??
Oh it's 1st of April, my apologies! I almost took it seriously. I should ignore this website on this day.
What's the difference between a blog post and a chatbot answer in terms of how “visual” things are?
But you can, so as long as the interlocutor tells you what assumptions it made, you can correct it if it doesn't match your current mood.
> So yeah, this argument in favor of conversational interfaces sounds at this point more like ideology than logic.
There's no ideology behind the fact that everyone rich enough to afford paying someone to deal with mundane stuff will have someone doing it for them; it's just about convenience. Nobody likes to fight with web UIs for fun; the only reason it has become mainstream is that it's so much cheaper than having a real person working.
Same for Microsoft Word by the way, many people used to have secretaries typing stuff for them, and it's been a massive regression of social status for the upper middle class to have to type things by themselves, it only happened because it was cheaper (in appearance at least).