
Building Effective AI Agents

(www.anthropic.com)
543 points by Anon84 | 23 comments
1. AvAn12 ◴[] No.44303213[source]
How do agents deal with task queueing, race conditions, and other issues arising from concurrency? I see lots of cool articles about building workflows of multiple agents - plus what feels like hand-waving around declaring an orchestrator agent to oversee the whole thing. And my mind goes to whether there need to be some serious design considerations and clever glue code. Or does it all work automagically?
replies(7): >>44303413 #>>44303510 #>>44303611 #>>44303637 #>>44303642 #>>44304027 #>>44304092 #
2. cmsparks ◴[] No.44303413[source]
Frankly, it's pretty difficult. Though I've found that the actor model maps really well onto building agents: an instance of an actor = an instance of an agent. Agent-to-agent communication is just tool calling (via MCP or some other RPC).

I use Cloudflare's Durable Objects (disclaimer: I'm biased, I work on MCP + Agent things @ Cloudflare). However, I figure building agents probably maps similarly well onto any actor-style framework.
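
A minimal sketch of that mapping in plain Python (no framework; the in-process mailbox stands in for what would be an MCP tool call or RPC in practice):

    import queue
    import threading
    import time

    class AgentActor:
        """One actor instance = one agent instance; each processes its mailbox sequentially."""

        def __init__(self, name):
            self.name = name
            self.mailbox = queue.Queue()
            threading.Thread(target=self._run, daemon=True).start()

        def send(self, message):
            # Other agents "call" this one by dropping a message in its mailbox;
            # in a real setup this would be an MCP/RPC call, not an in-process queue.
            self.mailbox.put(message)

        def _run(self):
            while True:
                # One message at a time, so the actor's own state never sees concurrent writes.
                self.handle(self.mailbox.get())

        def handle(self, message):
            print(f"{self.name} handling: {message}")

    planner = AgentActor("planner")
    planner.send({"task": "summarize repo"})
    time.sleep(0.1)  # give the daemon thread a moment in this toy example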

replies(1): >>44303617 #
3. simonw ◴[] No.44303510[source]
The standard for "agents" is that tools run in sequence, so there's no need to worry about concurrency. Several models now support parallel tool calls, where the model can say "run these three tools" and your harness can choose to run them in parallel or sequentially before passing the results back to the model as the next step in the conversation.

Anthropic are leaning more into multi-agent setups where the parent agent might delegate to one or more sub-agents which might run in parallel. They use that trick for Claude Code - I have some notes on reverse-engineering that here https://simonwillison.net/2025/Jun/2/claude-trace/ - and expand on that in their write-up of how Claude Research works: https://simonwillison.net/2025/Jun/14/multi-agent-research-s...

It's still _very_ early in figuring out good patterns for LLM tool-use - the models only got really great at using tools in about the past 6 months, so there's plenty to be discovered about how best to orchestrate them.
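
A rough sketch of what that harness choice can look like (the tool names, registry, and tool-call shape here are made up for illustration, not any particular API):

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical registry of tool implementations.
    TOOLS = {
        "search_docs": lambda query: f"results for {query}",
        "read_file": lambda path: f"contents of {path}",
    }

    def run_tool_calls(tool_calls, parallel=True):
        # tool_calls: a list of {"name": ..., "arguments": ...} dicts emitted by the model.
        # Results come back in the same order, ready to append to the conversation.
        def run_one(call):
            return TOOLS[call["name"]](call["arguments"])

        if parallel:
            with ThreadPoolExecutor() as pool:
                return list(pool.map(run_one, tool_calls))
        return [run_one(call) for call in tool_calls]

    # The model said "run these two tools"; the harness decides how.
    results = run_tool_calls([
        {"name": "search_docs", "arguments": "agent orchestration"},
        {"name": "read_file", "arguments": "README.md"},
    ])
    print(results)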

replies(2): >>44304116 #>>44304552 #
4. gk1 ◴[] No.44303611[source]
For coding agents, at least, the emerging pattern is to have the agents use containers to isolate their work and git to review and merge that work neatly.

See for example the container-use MCP server, which combines both: https://github.com/dagger/container-use

That’s for parallelizing coding work… I’m not sure about other kinds of work. I still see people using workflow builder tools like n8n, Zapier, and maybe CrewAI.

5. pyman ◴[] No.44303617[source]
Should the people developing AI agent protocols be exploring decentralised architectures, using technologies like blockchain and peer-to-peer networks to distribute models and data? What are the trade-offs of relying on centralised orchestration platforms owned by large companies like Amazon, Cloudflare or NVIDIA? Thanks
replies(1): >>44304163 #
6. daxfohl ◴[] No.44303637[source]
Nothing works automagically. You still have to build in all the operational characteristics that you would for any traditional system. It's deceptively easy to look at some AI agent demos and think "oh, I can replace my team's huge mess of spaghetti code with a few clever AI prompts!" And it may even work for the first couple use cases. But all that code is there for a reason, and eventually it'll have to be reckoned with. Once you get to the point where you're translating all that code directly into the AI prompt and hoping for no hallucinations, you know you've lost the plot.
replies(1): >>44306135 #
7. nurettin ◴[] No.44303642[source]
If I had to deal with "AI agent concurrency", I would get them to submit their requests to a queue and process those sequentially.
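
Something like this, as a minimal sketch (the handler and the requests are stand-ins):

    import queue
    import threading

    requests = queue.Queue()

    def handle_request(agent_id, request):
        # Stand-in for whatever the request actually does to shared state.
        print(f"processing {request!r} from {agent_id}")

    def worker():
        # Single consumer: requests from any number of agents are applied strictly
        # one at a time, so shared state never sees concurrent writes.
        while True:
            agent_id, request = requests.get()
            handle_request(agent_id, request)
            requests.task_done()

    threading.Thread(target=worker, daemon=True).start()

    # Agents (or their tool-call handlers) just enqueue work.
    requests.put(("agent-1", "update record 42"))
    requests.put(("agent-2", "update record 42"))
    requests.join()
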
8. 0x457 ◴[] No.44304027[source]
I can only speak to the Codex web interface. I had a very detailed refactoring plan for a project that was too long to complete in one go, so I used the "ask" feature to split it up into multiple tasks and group them by which tasks could be executed concurrently.

It split them up the way they would be split up in real life, but in real life there's an assumption that the people working on the tasks are going to communicate with each other. The way it generated tasks resulted in a HUGE loss of context (my plan was hella detailed).

I was willing to spend a few more hours trying to make it work rather than doing the work myself, so I opened another chat and split the plan into multiple sequential tasks, with a detailed prompt for each task (why, what, how, validation, a reminder to update documentation, etc.).

Anyway, an orchestrator might work for some super simple tasks, much smaller than those articles lead you to believe.

9. rdedev ◴[] No.44304092[source]
This is why I am leaning towards making the LLM generate code that operates on tool calls instead of having everything in JSON.

Hugging Face's smolagents library makes the LLM generate Python code where tools are just normal Python functions. If you want parallel tool calls, just prompt the LLM to do so; it should take care of synchronizing everything. Of course there's the whole issue around executing LLM-generated code, but we have a few solutions for that.
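
The gist of the approach, sketched without the library (the tool functions and the "generated" snippet are made up; smolagents handles the prompting and sandboxing for you, and exec() here is only to show the mechanics):

    from concurrent.futures import ThreadPoolExecutor

    # Tools are just ordinary Python functions (stubbed here)...
    def fetch_revenue(company: str, year: int) -> float:
        return 100.0

    def fetch_headcount(company: str, year: int) -> int:
        return 10

    # ...and the model is prompted to emit Python that calls them. Asked to
    # parallelize, it can reach for ordinary stdlib primitives itself:
    generated_code = """
    with ThreadPoolExecutor() as pool:
        revenue = pool.submit(fetch_revenue, "Apple", 2023)
        headcount = pool.submit(fetch_headcount, "Apple", 2023)
        result = revenue.result() / headcount.result()
    """

    # Running model-written code needs real sandboxing in practice.
    namespace = {
        "ThreadPoolExecutor": ThreadPoolExecutor,
        "fetch_revenue": fetch_revenue,
        "fetch_headcount": fetch_headcount,
    }
    exec(generated_code, namespace)
    print(namespace["result"])  # 10.0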

10. svachalek ◴[] No.44304116[source]
I'm not sure we're at "great" yet. Gemini 2.5 Pro fails maybe 50% of the time for me at even generating a syntactically valid tool call.
replies(1): >>44304768 #
11. daxfohl ◴[] No.44304163{3}[source]
That's more of a hobbyist thing I'd say. Corporations developing these things will of course want to use some centralized system that they trust. It's more efficient, they have more control over it, it's easier for average people to use, etc.

A decentralized thing would be more for individuals who want more control and transparency. A decentralized public ledger would make it possible to verify that your agent, the agents it interacts with, and the contents of their interactions have not been altered or compromised in any way, whereas a corporate-owned framework could not provide the same level of assurance.

But technically, there's no advantage I can think of to using a public distributed ledger to manage interactions. Agent tasks are pretty ephemeral, so unlike digital currency, there's not really a need to maintain a complete historical log of every action forever. And as far as providing tools for dealing with race conditions goes, blockchain would be about the least efficient way of creating a mutex imaginable. So technically, just like with non-AI apps, a centralized architecture is always going to be a lot more efficient.

replies(1): >>44304485 #
12. pyman ◴[] No.44304485{4}[source]
Good points. I agree that for most companies using centralised systems offers more advantages because of efficiency, control and user experience, but I wasn't arguing that decentralisation is better technically, just wondering if it might be necessary in the long run.

If agents become more autonomous and start coordinating across platforms owned by different companies, it might make sense to have some kind of shared, trustless layer (maybe not blockchain but something distributed, auditable and neutral).

I agree that agent tasks are ephemeral, but what about long lived multi-agent workflows or contracts between agents that execute over time? In those cases transparency and integrity might matter more.

I don't think it's one or the other. Centralised systems will dominate in the short term, no doubt about that, but if we're serious about agent ecosystems at scale, we might need more open coordination models too.

replies(2): >>44304722 #>>44311360 #
13. jsemrau ◴[] No.44304552[source]
"The standard for "agents" is that tools run in sequence"

I don't think that's correct. The benefit of agents is that they can use tools on the fly, ideally the right tool at the right time.

E.g., "Which number is bigger, 9.11 or 9.9?" -> the agent uses a calculator tool, or "What is the annual 2020-2023 revenue for Apple?" -> a financial statements MCP server.

replies(1): >>44304931 #
14. ◴[] No.44304722{5}[source]
15. simonw ◴[] No.44304768{3}[source]
Are you using Gemini's baked-in API tool-calling mechanism, or are you prompting it and telling it to produce specific XML/JSON?
replies(1): >>44304970 #
16. samtheprogram ◴[] No.44304931{3}[source]
Nothing you said contradicts the quote. When they say in sequence, they don’t mean “in a previously defined order”, they mean “not in parallel”.
17. mediaman ◴[] No.44304970{4}[source]
What do you recommend for this? I've actually had good luck having them create XML, even though you're "supposed" to use the native tool calling with a JSON schema. There seem to be far fewer issues than with getting JSON syntax exactly right.
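
For what it's worth, the XML variant looks roughly like this; the tag names are whatever the prompt asks for (nothing official), and the parsing is plain stdlib:

    import xml.etree.ElementTree as ET

    # Example model output when the prompt asks for tool calls as XML.
    model_output = """
    <tool_call>
      <name>get_weather</name>
      <arg name="city">Berlin</arg>
    </tool_call>
    """

    root = ET.fromstring(model_output.strip())
    tool_name = root.findtext("name")
    args = {arg.get("name"): arg.text for arg in root.findall("arg")}
    print(tool_name, args)  # get_weather {'city': 'Berlin'}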
replies(1): >>44305191 #
18. simonw ◴[] No.44305191{5}[source]
I'm using their native tool calling: https://github.com/simonw/llm-gemini/commit/a7f1096cfbb73301... - it's been working really well for me so far.
19. whattheheckheck ◴[] No.44306135[source]
Then wtf is the point of this?
replies(3): >>44309003 #>>44311070 #>>44313336 #
20. pferde ◴[] No.44309003{3}[source]
That's the neat part - there is none!
21. deadbabe ◴[] No.44311070{3}[source]
Now you’re starting to realize, AI has no real purpose except as a natural language processor for ambiguous unstructured inputs.

Anything an AI agent does that is not that, can be done cheaply and deterministically by some code.

If code can replace humans, it can replace AI.

22. daxfohl ◴[] No.44311360{5}[source]
My hunch would still be no: human agents are able to cooperate without needing to do everything in a global shared record, so I'd expect AI agents would be able to as well. If you (or any other AI agent) feel the need to check that an AI agent did some task, you just verify it "manually", i.e. you add a verification step to the workflow so that your AI agent checks your bank account to verify that the other AI agent actually transferred the sum it said it did, just like in human-to-human interaction (and just like a non-AI automated workflow would do).
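
As a sketch of that kind of trust-but-verify step (the bank API and amounts here are made-up stand-ins):

    class FakeBankAPI:
        # Stand-in for whatever account API the verifying agent can actually read.
        def __init__(self, balance):
            self.balance = balance

        def get_balance(self):
            return self.balance

    def verify_transfer(bank_api, balance_before, expected_amount):
        # Don't trust the other agent's claim; check the observable outcome.
        received = bank_api.get_balance() - balance_before
        if received < expected_amount:
            raise RuntimeError("counterparty agent claimed a transfer that didn't arrive")
        return received

    balance_before = 100.0
    bank = FakeBankAPI(balance=150.0)  # the other agent says it sent 50.0
    print(verify_transfer(bank, balance_before, expected_amount=50.0))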

But, that's just a guess. Maybe the combination of AI and automation adds something special to the mix where a global public ledger becomes more valuable (beyond the hobbyist community) and I'm just not seeing it.

23. daxfohl ◴[] No.44313336{3}[source]
If you're a big software company, not much. If you're a small non-tech business, it could be an easy way to automate some things without hiring a software engineer.