
Building Effective "Agents"

(www.anthropic.com)
763 points by jascha_eng | 8 comments
simonw ◴[] No.42475700[source]
This is by far the most practical piece of writing I've seen on the subject of "agents" - it includes actionable definitions, then splits most of the value out into "workflows" and describes those in depth with example applications.

There's also a cookbook with useful code examples: https://github.com/anthropics/anthropic-cookbook/tree/main/p...

Blogged about this here: https://simonwillison.net/2024/Dec/20/building-effective-age...
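
To give a flavour of the "workflows" side, here's a minimal prompt-chaining sketch in the spirit of those cookbook examples (the model name, prompts and helper function are my own illustration, not copied from the cookbook):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def call(prompt: str) -> str:
        # Each workflow step is a plain function wrapping a single model call.
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.content[0].text

    # Prompt chaining: the output of one step becomes the input to the next,
    # so you can inspect or gate each intermediate result.
    outline = call("Write a three-bullet outline for a short post on rate limiting.")
    draft = call("Expand this outline into two short paragraphs:\n\n" + outline)
    review = call("List any vague or unsupported claims in this draft:\n\n" + draft)
    print(review)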

replies(6): >>42475903 #>>42476486 #>>42477016 #>>42478039 #>>42478786 #>>42479343 #
1. NeutralForest ◴[] No.42476486[source]
Thanks for all the write-ups on LLMs. You're on top of the news, and following your blog makes it way easier to keep track of what's happening and of the existing implementations.
replies(1): >>42481308 #
2. th0ma5 ◴[] No.42481308[source]
Probably the least critical and most myth-pushing content imo.
replies(1): >>42481680 #
3. herecomethefuzz ◴[] No.42481680[source]
> most myth pushing content

Care to elaborate?

replies(1): >>42481986 #
4. th0ma5 ◴[] No.42481986{3}[source]
There are lots of lists of LLM myths out there, e.g. https://masterofcode.com/blog/llms-myths-vs-reality-what-you... Every single post glosses over some aspect of these myths, or posits that they can be controlled or mitigated in some way, with no examples of anyone else finding those solutions applicable to real-world problems in a supportable and reliable way. When pushed, another myth in the neighborhood of those on that list gets offered instead: the system will get better, some classical computing mechanism will make up the difference, the problems aren't so bad, the solution is good enough in some ambiguous way, or people and existing systems are just as bad (when they are not).
replies(1): >>42482144 #
5. simonw ◴[] No.42482144{4}[source]
I've written extensively about myths and misconceptions about LLMs, much of which overlaps with the observations in that post.

Here's my series about misconceptions: https://simonwillison.net/series/llm-misconceptions/

It doesn't seem to me that you're familiar with my work - you seem to be mixing me in with the vast ocean of uncritical LLM boosting content that's out there.

replies(1): >>42483664 #
6. th0ma5 ◴[] No.42483664{5}[source]
I'm thinking of the system you built to watch videos and parse JSON, and the claims that it has general suitability, which is simply dishonest imo. You seem to be confusing me with someone who hasn't been asking you repeatedly to address these kinds of concerns, and the above series is a kind of Potemkin set of things that doesn't intersect with your other work.
replies(2): >>42487527 #>>42490138 #
7. kordlessagain ◴[] No.42487527{6}[source]
> dishonest Potemkin

It's like criticizing a "Hello World" program for not having proper error handling and security protocols. While those are important for production systems, they're not the point of a demonstration or learning example.

Your response seems to take these examples and hold them to the standard of mission-critical systems, which is a form of technical gatekeeping - raising the bar unnecessarily high for what counts as a "valid" technical demonstration.

8. simonw ◴[] No.42490138{6}[source]
You mean this? https://simonwillison.net/2024/Oct/17/video-scraping/

To my surprise, on re-reading that post I found I hadn't mentioned that you need to double-check everything it does. I guess I forgot to mention that at the time because I thought it was so obvious - anyone who's paying attention to LLMs should already know that you can't trust them to reliably extract this kind of information.

I've mentioned that a lot in my other writing. I frequently tell people that the tricky thing about working with LLMs is learning how to make use of a technology that is inherently unreliable.
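
One concrete way to make that workable (a rough sketch, not from that post; the prompt, statement and tolerance are illustrative) is to ask for structured output and then re-check it mechanically against whatever ground truth the source document provides:

    import json
    import anthropic

    client = anthropic.Anthropic()

    def extract_charges(text: str) -> list[dict]:
        # Ask for machine-readable output so the result can be checked in code.
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": "Extract each individual charge (not the total) from this "
                           'statement as a JSON array of {"payee": str, "amount": float}. '
                           "Reply with JSON only.\n\n" + text,
            }],
        )
        return json.loads(response.content[0].text)

    statement = "Hosting $12.00\nDomain $9.50\nTotal $21.50"
    charges = extract_charges(statement)

    # Never trust the extraction as-is: recompute the total and compare it with the
    # figure stated in the source, flagging any mismatch for manual review.
    if abs(sum(item["amount"] for item in charges) - 21.50) > 0.01:
        raise ValueError("extracted amounts disagree with the stated total; check by hand")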

Update: added a new note about reliability here: https://simonwillison.net/2024/Oct/17/video-scraping/#a-note...

Second update: I just noticed that I DID say "You should never trust these things not to make mistakes, so I re-watched the 35 second video and manually checked the numbers. It got everything right." in that post already.

> You seem to be confusing me with someone that hasn't been asking you repeatedly to address these kinds of concerns

Where did you do that?