AI agents: Less capability, more reliability, please

(www.sergey.fyi)

423 points serjester | 2 comments | 31 Mar 25 14:45 UTC

simonw ◴[31 Mar 25 15:08 UTC] No.43535919[source]▶

Yeah, the "book a flight" agent thing is a running joke now - it was a punchline in the Swyx keynote for the recent AI Engineer event in NYC: https://www.latent.space/p/agent

I think this piece is underestimating the difficulty involved here though. If only it were as easy as "just pick a single task and make the agent really good at that"!

The problem is that if your UI involves human beings typing or talking to you in a human language, there is an unbounded set of ways things could go wrong. You can't test against every possible variant of what they might say. Humans are bad at clearly expressing things, but even worse is the challenge of ensuring they have a concrete, accurate mental model of what the software can and cannot do.

replies(12): >>43536068 #>>43536088 #>>43536142 #>>43536257 #>>43536583 #>>43536731 #>>43537089 #>>43537591 #>>43539058 #>>43539104 #>>43539116 #>>43540011 #

hansmayer ◴[31 Mar 25 16:18 UTC] No.43536731[source]▶

>>43535919 #

It's so funny when people try to build robots imitating people. I mean part funny, part tragedy of the upcoming bust. The irony being, we would have been better off with an interoperable flight booking API standard which a deterministic headless agent could use to make perfect bookings every single time. There is a reason current user interfaces stem from a scientific discipline once called "Human-Computer Interaction".

replies(3): >>43537033 #>>43537160 #>>43538872 #

jatins ◴[31 Mar 25 16:44 UTC] No.43537033[source]▶

>>43536731 #

But that's the promise of AI, right? You can't put an API on everything for human + technological reasons.

replies(2): >>43537058 #>>43537071 #

dartos ◴[31 Mar 25 16:48 UTC] No.43537071[source]▶

>>43537033 #

You can’t put an API on everything because it’d take a ton of time and money to pull that off.

I can’t think of any technological reason why every digital system can’t have an API (barring security concerns, which would need to be handled case by case).

So instead, we put hundreds of billions of dollars into statistical models hoping they could do it for us.

It’s kind of backwards.

replies(3): >>43537721 #>>43537952 #>>43538391 #

1. datadrivenangel ◴[31 Mar 25 18:12 UTC] No.43537952[source]▶

>>43537071 #

A web page is an Application/Human Interface. Outside of security concerns, companies can make more money if they control the Application/Human Interface, and reduce the risk of a middleman / broker extorting them for margins.

If I run a flight aggregator that has a majority of flight bookings, I can start charging 'rents' by allowing featured/sponsored listings to be promoted above the 'best' result, leading to a prisoner's dilemma where airlines should pay up to their margins to keep market share.

If an AI company becomes the default application human interface, they can do the same thing. Pay OpenAI tribute or be ended as a going concern.

replies(1): >>43542902 #

2. dartos ◴[01 Apr 25 04:38 UTC] No.43542902[source]▶

>>43537952 (TP) #

Using LLMs as a natural language interface is fine.

What I’m saying is that if there were a standard protocol for making travel plans over the internet, we wouldn’t need an AI agent to book a trip.

We could just create great user experiences that expose those APIs like we do for pretty much everything on the web.
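
To make that concrete, here is a rough sketch of what booking against a shared protocol might look like. Everything in it is made up for illustration: `FlightProvider`, `Offer`, `Booking`, `InMemoryProvider`, and `book_cheapest` are hypothetical names, not part of any real airline API. The point is only that once every provider exposes the same two calls, the booking step is a plain deterministic function with no model in the loop.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Offer:
    offer_id: str
    origin: str
    destination: str
    date: str          # ISO 8601 date, e.g. "2025-04-01"
    price_cents: int


@dataclass(frozen=True)
class Booking:
    offer_id: str
    passenger: str


class FlightProvider(Protocol):
    """Hypothetical shared interface, invented for this example."""

    def search(self, origin: str, destination: str, date: str) -> list[Offer]: ...

    def book(self, offer_id: str, passenger: str) -> Booking: ...


class InMemoryProvider:
    """Stand-in for an airline that implements the hypothetical interface."""

    def __init__(self, offers: list[Offer]) -> None:
        self._offers = {o.offer_id: o for o in offers}

    def search(self, origin: str, destination: str, date: str) -> list[Offer]:
        return [
            o for o in self._offers.values()
            if (o.origin, o.destination, o.date) == (origin, destination, date)
        ]

    def book(self, offer_id: str, passenger: str) -> Booking:
        if offer_id not in self._offers:
            raise KeyError(f"unknown offer: {offer_id}")
        return Booking(offer_id, passenger)


def book_cheapest(providers, origin, destination, date, passenger) -> Booking:
    """Deterministic rule: lowest price wins, ties broken by offer_id."""
    candidates = [
        (offer, provider)
        for provider in providers
        for offer in provider.search(origin, destination, date)
    ]
    if not candidates:
        raise LookupError("no matching offers")
    offer, provider = min(
        candidates, key=lambda c: (c[0].price_cents, c[0].offer_id)
    )
    return provider.book(offer.offer_id, passenger)
```

Given the same offers, `book_cheapest` always picks the same one, which is the "perfect bookings every single time" property mentioned upthread. A natural language front end could still sit on top to collect the origin, destination, and date.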
