
Building Effective AI Agents

(www.anthropic.com)
543 points Anon84 | 5 comments
simonw No.44302601
This article remains one of the better pieces on this topic, especially since it clearly defines which definition of "AI agents" they are using at the start! They use: "systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks".

I also like the way they distinguish between "agents" and "workflows", and describe a bunch of useful workflow patterns.

I published some notes on that article when it first came out: https://simonwillison.net/2024/Dec/20/building-effective-age...

A more recent article from Anthropic is https://www.anthropic.com/engineering/built-multi-agent-rese... - "How we built our multi-agent research system". I found this one fascinating, I wrote up a bunch of notes on it here: https://simonwillison.net/2025/Jun/14/multi-agent-research-s...

replies(5): >>44302676 #>>44303599 #>>44304356 #>>44305116 #>>44305898 #
smoyer No.44303599
The article on the multi-agent research is awesome. I do disagree with one statement in the Building Effective AI Agents article: building your initial system without a framework sounds nice as an educational endeavor, but the first benefit you get from a good framework is the ability to easily try out different (and cross-vendor) LLMs.
replies(3): >>44305075 #>>44307235 #>>44309698 #
1. XenophileJKO No.44307235
Having built several systems serving massive user bases with LLMs, I think the ability to swap out APIs just isn't the bottleneck... like, ever. It is always the behavioral issues or capability differences between models.

The frameworks usually just add more complexity, obscurity, and API misalignment.

Now, the equation can change IF you are getting a lot of observability, experimentation support, etc. out of the framework. I think we are just reaching the point of utility where it is a real question whether you should use a framework by default.

For example, I built the first version of a product with my own Java code hooking right into an API. I was able to deliver the product quickly, with a clean architecture, observability, etc. Then, once the internal ecosystem had aligned on a framework (one mentioned in the article), a team took up migrating it to Python on that framework. The migration still isn't complete; the framework just introduces a lot of abstraction layers that you have to adapt to your internal systems, your internal observability setup, and everything else the rest of your applications do.

People underestimate that cost. So, by default, to get your V0 product off the ground (if you are not a complete startup), just use the API. That is my advice.
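To make "just use the API" concrete, here is a minimal sketch of a direct call to an OpenAI-style chat completions endpoint with no framework involved (the model name, prompt, and env var are just illustrative placeholders):

    # Minimal "just use the API" sketch: one plain HTTP call, no framework.
    # Model name, prompt, and env var below are placeholders.
    import os
    import requests

    def complete(prompt: str, model: str = "gpt-4o-mini") -> str:
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": model, "messages": [{"role": "user", "content": prompt}]},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    if __name__ == "__main__":
        print(complete("Summarise this ticket in one sentence: ..."))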

replies(3): >>44307883 #>>44307887 #>>44310151 #
2. davedx No.44307883
This aligns with my experience (specifically with langgraph). I actually find it a depressing sign of the times that your prototype was in Java and the "production" version is going to be in Python.

My experience with langgraph is that you spend so much time just fixing stupid runtime type errors, because the state of every graph is a stupid JSON blob with very minimal typing, and it's so hard to figure out how data moves through the system. Combine that with Python's already weak type support, and the fact that you're usually dealing with long-running processes where things break mid- or end-of-process, and development becomes quite awful. AI coding assistants only help so much. Tests are hard to write because these frameworks inevitably lean into the dynamic nature of Python.

I just can't understand why people are choosing to build these huge, complex systems in an untyped language when the only AI or ML involved is API calls... or, very occasionally, some lightweight embeddings.
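To illustrate the typing point, here is a generic sketch (not langgraph's actual API; the state and node names are made up): the same node written against a raw dict blob versus a typed dataclass, where a typo only blows up at runtime in the first case but is caught by a type checker in the second:

    # Generic illustration of the typing complaint (not any framework's real API).
    from __future__ import annotations
    from dataclasses import dataclass

    # Untyped "JSON blob" state: typos and missing keys only fail at runtime,
    # often deep inside a long-running graph execution.
    def summarise_node(state: dict) -> dict:
        docs = state["documnets"]  # typo -> KeyError mid-run
        return {**state, "summary": " ".join(d[:100] for d in docs)}

    # Typed state: the same mistake is flagged by mypy/pyright before anything runs.
    @dataclass
    class ResearchState:
        query: str
        documents: list[str]
        summary: str | None = None

    def summarise_node_typed(state: ResearchState) -> ResearchState:
        state.summary = " ".join(d[:100] for d in state.documents)
        return state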

3. No.44307887
4. IanCal No.44310151
> I think the ability to swap out APIs just isn't the bottleneck... like, ever

It's a massive pain in the arse for testing, though. Checking which out of X things performs best for your use case is quite annoying if you have to maintain X implementations. Having one setup where you just swap out keys and a few variables makes this massively easier.

replies(1): >>44398692 #
5. kfajdsl No.44398692
This is easily solved with a thin wrapper for calling openai/anthropic/google/whatever that presents the same interface across model providers (except for unique capabilities). You don't need a whole framework for this.
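Something like this rough sketch, assuming the official openai and anthropic Python SDKs (the class name and the model IDs in the comments are illustrative, not prescriptive):

    # Rough sketch of a thin cross-provider wrapper; no framework needed.
    # Assumes the official openai and anthropic Python SDKs are installed and
    # that API keys are set via the usual environment variables.
    from openai import OpenAI
    from anthropic import Anthropic

    class LLM:
        """One call signature; the provider is just a constructor argument."""

        def __init__(self, provider: str, model: str):
            self.provider, self.model = provider, model
            self._openai = OpenAI() if provider == "openai" else None
            self._anthropic = Anthropic() if provider == "anthropic" else None

        def complete(self, prompt: str, max_tokens: int = 1024) -> str:
            if self.provider == "openai":
                r = self._openai.chat.completions.create(
                    model=self.model,
                    max_tokens=max_tokens,
                    messages=[{"role": "user", "content": prompt}],
                )
                return r.choices[0].message.content
            if self.provider == "anthropic":
                r = self._anthropic.messages.create(
                    model=self.model,
                    max_tokens=max_tokens,
                    messages=[{"role": "user", "content": prompt}],
                )
                return r.content[0].text
            raise ValueError(f"unknown provider: {self.provider}")

    # Swapping vendors for an eval run is then just a config change, e.g.:
    #   llm = LLM("anthropic", "claude-sonnet-4-20250514")
    #   llm = LLM("openai", "gpt-4o")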