
39 points suchintan | 1 comments
happyopossum ◴[] No.42742284[source]
Many of the examples given for agents such as this are things I just flat wouldn’t trust an LLM to do - buying something on Amazon for example: Will it pick new or ‘renewed’? Will it select an item that is from a janky looking vendor and may be counterfeit? Will it pick the cheapest option for me? What if multiple colors are offered?

This one example alone has so many branches that would require knowing what’s in my head.

On the flip side, it’s a ridiculously simple task for a human to do for themselves, so what am I truly saving?

Call me when I can ask it to check the professional reviews of X category on N websites (plus YouTube), summarize them for me, and find the cheapest source for the top 2 options in the category that will arrive in Y days or sooner.

That would be useful.

replies(3): >>42742436 #>>42742518 #>>42744388 #
Fnoord ◴[] No.42742436[source]
I got Amazon Prime. If it has Prime, it is a no-brainer. Free return for 30 days. No S&H costs. Only cost is my time.
replies(2): >>42743004 #>>42743859 #
drdaeman ◴[] No.42743004[source]
Yea, but LLMs cannot reason - we've all seen them blurt out complete non-sequiturs, or end up in death loops of pseudo-reasoning (e.g. https://news.ycombinator.com/item?id=42734681 has a few examples). I don't think one should trust an LLM to pick Prime products all the time even if that's very explicitly requested - I'm sure it's possible to minimize errors so it'll do the right thing most of the time, but having a guarantee that it won't pick a non-Prime item sounds impossible. Same for any other task - if there is a way to make a mistake, a mistake will eventually be made.

(Idk if we can trust a human either - brain farts are a thing after all, but at least humans are accountable. Machines are not - at least not at the moment.)

replies(2): >>42743087 #>>42744599 #
1. lyime ◴[] No.42743087[source]
To your last point -- Humans make mistakes too. I asked my EA to order a few things for our office a few days ago, and she ended up ordering things that I did not want. In this case I could have written a better prompt. Even with a better prompt she could still have ordered the unwanted item. This is a reversible decision.

So my point is that while you might get some false positives, it's worth automating as long as many of the decisions are reversible or correctable.

You might not want to use this in all cases, but it's still worthwhile for many, many cases. Whether a use case is worth automating depends on the rate of error you can accept for it.