How to build a coding agent

1. normie3000 ◴[24 Aug 25 05:51 UTC] No.45001738[source]▶

>>45001051 (OP) #

Why are any of the tools beyond the bash tool required?

Surely listing files, searching a repo, editing a file can all be achieved with bash?

Or is this what's demonstrated by https://news.ycombinator.com/item?id=45001234?

replies(6): >>45001930 #>>45001984 #>>45002135 #>>45002208 #>>45002306 #>>45003670 #

2. zarzavat ◴[24 Aug 25 06:43 UTC] No.45001930[source]▶

>>45001738 (TP) #

Separate tools are simpler than having everything go through bash.

If everything goes through bash then you need some way to separate always safe commands that don't need approval (such as listing files), from all other potentially unsafe commands that require user approval.

If you have listing files as a separate tool then you can also enforce that the agent doesn't list any files outside of the project directory.
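A minimal sketch of both points, not taken from the article: the tool names, the `AUTO_APPROVED_TOOLS` allowlist, and the `list_files` signature are all hypothetical. It assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical allowlist of read-only tools that never need user approval.
AUTO_APPROVED_TOOLS = {"list_files", "read_file"}


def needs_approval(tool_name: str) -> bool:
    """Anything not on the allowlist (e.g. a generic bash tool) asks the user first."""
    return tool_name not in AUTO_APPROVED_TOOLS


def list_files(project_root: str, relative_path: str = ".") -> list[str]:
    """List entries under relative_path, refusing paths that escape project_root."""
    root = Path(project_root).resolve()
    target = (root / relative_path).resolve()
    # resolve() collapses ".." and follows symlinks, so escapes are caught here.
    if not target.is_relative_to(root):
        raise PermissionError(f"{relative_path!r} is outside the project directory")
    return sorted(entry.name for entry in target.iterdir())
```

With a single bash tool, the same policy would require parsing arbitrary shell commands to decide whether they are safe, which is much harder than checking one path argument.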

replies(1): >>45002621 #

3. BenderV ◴[24 Aug 25 06:57 UTC] No.45001984[source]▶

>>45001738 (TP) #

Why do humans need an IDE when we could do anything in a shell? Interfaces give you the information you need at a given moment and the actions you can take.

replies(1): >>45002679 #

4. faangguyindia ◴[24 Aug 25 07:32 UTC] No.45002135[source]▶

>>45001738 (TP) #

>Why are any of the tools beyond the bash tool required?

My best guess is they started out with a limited subset of tools and realised they can just give it bash later.

5. the_mitsuhiko ◴[24 Aug 25 07:43 UTC] No.45002208[source]▶

>>45001738 (TP) #

Technically speaking, you can get away with just a Bash tool, and I had some success with this. It's actually quite interesting to take away tools from agents and see how creative they are with the use.

One of the reasons why you get better performance if you give them the other tools is that there has been some reinforcement learning on Sonnet with all these tools. The model is aware of how these tools work, it is more token-efficient and it is generally much more successful at performing those actions. The Bash tool, for instance, at times gets confused by bashisms, not escaping arguments correctly, not handling whitespace correctly etc.
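To illustrate the escaping and whitespace problem, here is a small sketch using Python's standard `shlex` module. The filename and the `wc -c` command are made-up examples, and this is not how any particular agent builds its commands.

```python
import shlex

filename = "release notes (final).txt"

# Naive interpolation: the shell would split this filename into three words
# (and would also try to interpret the parentheses).
naive = f"wc -c {filename}"

# shlex.quote wraps the argument so the shell sees it as a single word.
quoted = f"wc -c {shlex.quote(filename)}"

# shlex.split mimics POSIX shell word splitting, which makes the difference visible.
print(shlex.split(naive))   # ['wc', '-c', 'release', 'notes', '(final).txt']
print(shlex.split(quoted))  # ['wc', '-c', 'release notes (final).txt']
```

A dedicated tool that takes the path as a structured parameter avoids this class of mistake, because the argument never passes through shell parsing at all.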

replies(3): >>45002420 #>>45002638 #>>45004117 #

6. kissgyorgy ◴[24 Aug 25 08:03 UTC] No.45002306[source]▶

>>45001738 (TP) #

This is explained in 3.2 How to design good tools?

    This saves the LLM from having to do multiple low level clicking and typing and keeps it on track. Help the poor model out, will ya!?

replies(1): >>45002609 #

7. dotancohen ◴[24 Aug 25 08:25 UTC] No.45002420[source]▶

>>45002208 #

  > The Bash tool, for instance, at times gets confused by bashisms, not escaping arguments correctly, not handling whitespace correctly etc.

This was the only informative sentence in the reply. Can you please go on in this manner - it was an important question.

8. normie3000 ◴[24 Aug 25 09:00 UTC] No.45002609[source]▶

>>45002306 #

I'm not sure where this quote is from - it doesn't seem to appear in the linked article.

replies(1): >>45004672 #

9. normie3000 ◴[24 Aug 25 09:02 UTC] No.45002621[source]▶

>>45001930 #

> you need some way to separate always safe commands that don't need approval (such as listing files), from all other potentially unsafe commands that require user approval.

This is a very strong argument for more specific tools, thanks!

10. normie3000 ◴[24 Aug 25 09:05 UTC] No.45002638[source]▶

>>45002208 #

> The model is aware of how these tools work, it is more token-efficient and it is generally much more successful at performing those actions.

Interesting! This didn't seem to be the case in the OP's examples - for instance using a list_files tool and then checking if the json result included README vs bash [ -f README ]

replies(1): >>45004841 #

11. normie3000 ◴[24 Aug 25 09:10 UTC] No.45002679[source]▶

>>45001984 #

To me a better analogy would be: if you're a household of 2 who own 3 reliable cars, why would you need a 4th car with smaller cargo & passenger capacities, higher fuel consumption, worse off-road performance and lower top speed?

12. ghuntley ◴[24 Aug 25 12:16 UTC] No.45003670[source]▶

>>45001738 (TP) #

Yeah, you could get away with a coding agent just using the Bash tool and the Edit tool (tbh somewhat optional but not having it would be highly inefficient). I haven't tried it, but it might struggle with the code search functionality. It would be possible with the right prompting. For example, you could just add an instruction to the prompt such as "If you need to search the source code, use ripgrep with the Bash tool."
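As a rough sketch of what that could look like, the snippet below keeps the hint as a prompt line and builds the kind of ripgrep command the model would be expected to emit. `SEARCH_HINT` and `ripgrep_command` are hypothetical names; the `rg` flags used (`-n` for line numbers, `-F` for a literal pattern) are standard ripgrep options.

```python
import shlex

# Hypothetical line that could be appended to the agent's system prompt.
SEARCH_HINT = "If you need to search the source code, use ripgrep with the Bash tool."


def ripgrep_command(pattern: str, path: str = ".") -> str:
    """Build a shell-safe ripgrep invocation for a literal-string search."""
    # -n prints line numbers, -F treats the pattern as a literal string,
    # and "--" stops a pattern that starts with "-" from being read as a flag.
    return shlex.join(["rg", "-n", "-F", "--", pattern, path])


print(ripgrep_command("def main(", "src"))  # rg -n -F -- 'def main(' src
```

The quoting step matters here for the same reason raised elsewhere in the thread: a search pattern containing spaces or parentheses would otherwise be mangled by the shell.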

replies(1): >>45006289 #

13. ◴[24 Aug 25 13:29 UTC] No.45004117[source]▶

>>45002208 #

14. kissgyorgy ◴[24 Aug 25 14:51 UTC] No.45004672[source]▶

>>45002609 #

ahh, sorry, different article :(

15. the_mitsuhiko ◴[24 Aug 25 15:13 UTC] No.45004841[source]▶

>>45002638 #

> Interesting! This didn't seem to be the case in the OP's examples - for instance using a list_files tool and then checking if the json result included README vs bash [ -f README ]

There is no training on a tool with that name. But it likely also doesn't need training because the parameter is just a path and that's a pretty basic tool.

On the other hand, to know how to execute a bash command, you need to know bash. Bash is a known tool to the Claude models [1] and so is text editing [2]. You're supposed to reference those in the tool listing, but at least from my testing, the moment you call a tool "bash", Claude makes plenty of assumptions about what the point of this thing is.

[1]: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...

[2]: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...

16. normie3000 ◴[24 Aug 25 18:02 UTC] No.45006289[source]▶

>>45003670 #

> Edit tool (tbh somewhat optional but not having it would be highly inefficient)

If you need to edit the source, just use patch with the bash tool.

What's the efficiency issue?