423 points serjester | 1 comments
simonw No.43535919
Yeah, the "book a flight" agent thing is a running joke now - it was a punchline in the Swyx keynote for the recent AI Engineer event in NYC: https://www.latent.space/p/agent

I think this piece is underestimating the difficulty involved here though. If only it were as easy as "just pick a single task and make the agent really good at that"!

The problem is that if your UI involves human beings typing or talking to you in a human language, there is an unbounded set of ways things could go wrong. You can't test against every possible variant of what they might say. Humans are bad at clearly expressing things, but even worse is the challenge of ensuring they have a concrete, accurate mental model of what the software can and cannot do.

replies(12): >>43536068 #>>43536088 #>>43536142 #>>43536257 #>>43536583 #>>43536731 #>>43537089 #>>43537591 #>>43539058 #>>43539104 #>>43539116 #>>43540011 #
serjester No.43536257
Even in Operator's original demo, the first thing they showed was booking restaurant reservations and ordering groceries. I understand their need to demo something intuitive, but it's still debatable whether these are tasks that most people want delegated to black-box agents.
replies(1): >>43538396 #
ToucanLoucan No.43538396
They don't. I have never once in my life wanted to talk to my smart speaker about what I wanted for dinner. That's not because a smart speaker is or can be creepy, and not because of social anxiety; it's just simpler and more straightforward to open DoorDash on my damn phone and look at a list of restaurants nearby to order from. Or browse a list of products on Amazon to buy. Or just call a restaurant to get a reservation. These tasks are trivial.

And like, as a socially anxious millennial, no, I don't particularly like phone calls. However, I also recognize that, setting my discomfort aside, a direct connection to a human being who can help reason out a problem I'm having is not something easily replaced with a chatbot or an AI assistant. It just isn't. Perfect example: I called a place to make a reservation for myself, my wife and my girlfriend (poly, long story) and found the place didn't usually do reservations on the day in question, but the person did ask when we'd be there. As I was talking to a person, I could provide that information immediately and say "if you don't take reservations don't worry, that's fine," but it was an off-busy hour so we got one anyway. How does an AI navigate that conversation more efficiently than me?

As a techie person I basically spend the entire day interacting with various software to perform various tasks, work related and otherwise. I cannot overstate: NONE of these interactions, not a single one, is improved one iota by turning it into a conversation, verbal or text-based, with my or someone else's computer. By definition it makes basic tasks take longer, every time, without fail.

replies(3): >>43539272 #>>43543816 #>>43547918 #
1. Terr_ No.43543816
Agreed, verbally asking for X might make it easier for Aunt "where's the Any key" Tillie to get a solution, but it doesn't necessarily give a better solution for everyone else.

Or, for that matter, solutions you can trust. Remember the pitch for Amazon Dash buttons, where you press it and it maybe-reorders a product for delivery, instantly and sight-unseen? What if the price changed? What if it's not exactly the same product anymore? Wait, did someone else already press it? Maybe I can get a better deal? etc.

Actually, that spurs a random thought: Perhaps some of these smart-speaker ordering pitches land differently if someone is in a socioeconomic class where they're already accustomed to such tasks being done competently by human office-assistants, nannies, etc. Their default expectation might be higher, and they won't need to invest time pinching pennies like the rest of us.