Just take any example and think about how a human would break it down with a decision tree.
You are building an AI system to respond to your email.
The first agent decides whether the new email should be responded to, yes or no.
If no, it can send it to another LLM call that decides whether to archive it or leave it in the inbox for the human.
If yes, it sends it to a classifier that decides what type of response is required.
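Sketched out in TypeScript, that first part of the tree might look something like this. The `llm()` helper is a made-up stand-in for whatever model API you're actually calling, not a real library:

```typescript
// Made-up helper standing in for your actual model API call.
async function llm(prompt: string): Promise<string> {
  // ...call your provider here...
  return "";
}

// First decision: does this email need a response at all?
async function triage(email: string): Promise<string> {
  const needsReply = await llm(
    `Does this email need a response? Answer only YES or NO.\n\n${email}`
  );

  if (needsReply.trim().toUpperCase() !== "YES") {
    // Second LLM call: archive it, or leave it for the human?
    const verdict = await llm(
      `Should this email be archived or left in the inbox? Answer ARCHIVE or LEAVE.\n\n${email}`
    );
    return verdict.trim().toUpperCase() === "ARCHIVE" ? "archive" : "leave_in_inbox";
  }

  // Classifier call: what type of response is required?
  const kind = await llm(
    `What kind of reply does this email need: brief_ack, sales, or other? One word.\n\n${email}`
  );
  return `respond:${kind.trim().toLowerCase()}`;
}
```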
Maybe some work emails only need something brief, like a “congrats!” reply to all those new-feature-launch announcements you get internally.
Others might be inbound sales emails that need to go out to another system that fetches product-related knowledge to craft a response with the right context, followed by a checker call that makes sure the response follows brand guidelines.
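That sales branch could be a short chain, reusing the made-up `llm()` helper from the sketch above plus a hypothetical `fetchProductDocs()` for the knowledge lookup:

```typescript
// Hypothetical knowledge lookup (e.g. a search over your product docs).
async function fetchProductDocs(query: string): Promise<string> {
  // ...vector search, keyword search, whatever you have...
  return "";
}

// Procedural limit: cap the revision loop so a picky checker can't spin forever.
const MAX_REVISIONS = 2;

async function draftSalesReply(email: string): Promise<string> {
  const context = await fetchProductDocs(email);
  let draft = await llm(
    `Using this product context, draft a reply to the email below.\n\nContext:\n${context}\n\nEmail:\n${email}`
  );

  for (let i = 0; i < MAX_REVISIONS; i++) {
    // Checker call: does the draft follow brand guidelines?
    const verdict = await llm(
      `Does this reply follow our brand guidelines? Answer PASS, or FAIL plus a reason.\n\n${draft}`
    );
    if (verdict.trim().toUpperCase().startsWith("PASS")) break;
    draft = await llm(
      `Revise this reply to address the feedback.\n\nFeedback:\n${verdict}\n\nReply:\n${draft}`
    );
  }
  return draft;
}
```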
The point is that all of these steps are completely hypothetical, but you can imagine how loosely providing a set of instructions, function calls, and procedural limits can classify things reliably and keep the error rate down.
You can do this for any workflow by creatively combining different function calls, recursion, procedural limits, etc. And if you build multiple decision trees/workflows, you can A/B test them and use LLM-as-a-judge to score the performance, especially if you’re working on a task with lots of example outputs.
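The judging loop can stay very simple. A sketch, with the same made-up `llm()` helper; `workflowA` and `workflowB` are whatever trees you've built:

```typescript
// LLM-as-a-judge: score a reply 1-10 given the original email.
async function judge(email: string, reply: string): Promise<number> {
  const raw = await llm(
    `Rate this reply to the email from 1 to 10. Answer with only the number.\n\nEmail:\n${email}\n\nReply:\n${reply}`
  );
  return Number.parseInt(raw.trim(), 10) || 0;
}

// A/B test two workflow variants over a set of example emails.
async function abTest(
  emails: string[],
  workflowA: (email: string) => Promise<string>,
  workflowB: (email: string) => Promise<string>
): Promise<{ a: number; b: number }> {
  let a = 0;
  let b = 0;
  for (const email of emails) {
    a += await judge(email, await workflowA(email));
    b += await judge(email, await workflowB(email));
  }
  // Average score per workflow; higher wins.
  return { a: a / emails.length, b: b / emails.length };
}
```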
As for trusted environments: assume every single LLM call has been hijacked, don’t trust its input/output, and you’ll be good. I put mine in their own Cloudflare Workers where they can’t do any damage beyond giving an odd response to the user.
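In practice, “don’t trust the output” means whitelisting what the rest of the system is allowed to act on. A minimal sketch, again with the made-up `llm()` helper:

```typescript
// Only these actions can ever reach the rest of the system.
const ALLOWED_ACTIONS = new Set(["archive", "leave_in_inbox", "respond"]);

async function safeTriage(email: string): Promise<string> {
  const raw = await llm(
    `Classify this email as archive, leave_in_inbox, or respond. One word.\n\n${email}`
  );
  const action = raw.trim().toLowerCase();
  // If the model was hijacked into saying anything else, the worst case is
  // a harmless default, not an arbitrary action.
  return ALLOWED_ACTIONS.has(action) ? action : "leave_in_inbox";
}
```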