What to build instead of AI agents

(decodingml.substack.com)

Show context

transcriptase ◴[03 Jul 25 00:58 UTC] No.44450515[source]▶

I love that there are somehow authorities on tech that realistically they could have 1-2 years experience with tops. It’s the reverse of the “seeking coder with 10 years of experience in a 2 year old language” meme.

replies(2): >>44450544 #>>44450763 #

noosphr ◴[03 Jul 25 01:01 UTC] No.44450544[source]▶

>>44450515 #

I've been building what's called ai agents since gpt3 came out. There are plenty of other people who did the same thing. That's five years now. If you can't be an expert after 5 years then there is no such thing as experts.

Of course agents is now a buzzword that means nothing so there is that.

replies(7): >>44450574 #>>44450635 #>>44450669 #>>44451016 #>>44451590 #>>44452288 #>>44454512 #

1. Voloskaya ◴[03 Jul 25 06:39 UTC] No.44452288[source]▶

>>44450544 #

“Agent” involves having agency. Calling the GPT-3 API and asking it to do some classification or whatever else your use case was, would not be considered agentic. Not only were there no tools back then to allow an LLM to carry out a plan of its own, even if you had developed your own, GPT-3 still sucked way too much to trust it with even basic tasks.

I have been working on LLMs since 2017, both training some of the biggest and then creating products around them and consider I have no experience with agents.

replies(2): >>44452529 #>>44453334 #

2. noosphr ◴[03 Jul 25 07:28 UTC] No.44452529[source]▶

>>44452288 (TP) #

All llms still suck too much to trust them with basic tasks without human in the loop. The only people who don't realize this are the ones whose paycheck depends on them not understanding it.

replies(1): >>44452587 #

3. Voloskaya ◴[03 Jul 25 07:38 UTC] No.44452587[source]▶

>>44452529 #

I don't necessarily disagree, my point is more that today you can realistically let an agent do several steps and use several tools, following a plan of it's own, before doing a manual review (e.g. Claude Code followed by a PR review). After all an intern has agency, even if I'm going to double check everything they do.

GPT-3, while being impressive at the time, was too bad to even let it do that, it would break after 1 or 2 steps, so letting it do anything by itself would have been a waste of time where the human in the loop would always have to re-do everything. It's planning ability was too bad and hallucinations way to frequent to be useful in those scenarios.

4. nunodonato ◴[03 Jul 25 09:46 UTC] No.44453334[source]▶

>>44452288 (TP) #

In defense of the previous commenter, I also started with GPT3. I had tool calling and reasoning before chatgpt even came out. So yeah, there was a lot that could be done before the models started integrating it

replies(1): >>44453778 #

5. Voloskaya ◴[03 Jul 25 11:06 UTC] No.44453778[source]▶

>>44453334 #

> I had tool calling and reasoning before chatgpt even came out.

Do you know of any kind of write up (by you or someone else) on this topic? Admittedly I never really spent too much time on this since I was working on pre-training, but I did try to do a few smart things with it and it pretty much failed at every thing, in big part because it wasn't even instruction tuned, so was very much still an autocomplete model.

So would be curious to learn more about how people got it to succeeed at agentic behaviors.

replies(1): >>44456677 #

6. nunodonato ◴[03 Jul 25 16:25 UTC] No.44456677{3}[source]▶

>>44453778 #

I used to vlog my experiments. Not really a very scientific write up on the topic, mostly just ramblings while experimenting cool stuff

↑