423 points | serjester
simonw No.43535919
Yeah, the "book a flight" agent thing is a running joke now - it was a punchline in the Swyx keynote for the recent AI Engineer event in NYC: https://www.latent.space/p/agent

I think this piece is underestimating the difficulty involved here though. If only it was as easy as "just pick a single task and make the agent really good at that"!

The problem is that if your UI involves human beings typing or talking to you in a human language, there is an unbounded set of ways things could go wrong. You can't test against every possible variant of what they might say. Humans are bad at clearly expressing things, but even worse is the challenge of ensuring they have a concrete, accurate mental model of what the software can and cannot do.

1. Spooky23 No.43537591
It's no different than the old Amazon button thing. I'm not going to automatically pay whatever price Amazon decides to charge for push-button replenishment of household goods. Especially in those days, when "The World's Biggest" fence would have pretty wild swings in price.

If I were rich enough to have some bot fly me somewhere, I'd have a real-life minion do it for me.