Surely listing files, searching a repo, editing a file can all be achieved with bash?
Or is this what's demonstrated by https://news.ycombinator.com/item?id=45001234?
If everything goes through bash then you need some way to separate always safe commands that don't need approval (such as listing files), from all other potentially unsafe commands that require user approval.
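To illustrate how awkward that separation gets, here is a minimal sketch of an approval gate for a single bash tool. The allowlist, the metacharacter set and the name `needs_approval` are all assumptions made up for this example, not anything taken from a real agent. It is deliberately conservative: anything it cannot parse or does not recognise goes to the user.

```python
import shlex

# Hypothetical allowlist of read-only commands; a real agent would choose its own.
READ_ONLY_COMMANDS = {"ls", "pwd", "cat", "head", "wc"}

# Characters that allow chaining, substitution or redirection in a shell.
# If any of them appears, we stop trying to reason about the command.
SHELL_METACHARACTERS = set(";|&$`<>(){}\n\\")


def needs_approval(command: str) -> bool:
    """Return True unless the command is a plain call to an allowlisted program."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return True
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: treat as unsafe.
        return True
    if not argv:
        return True
    return argv[0] not in READ_ONLY_COMMANDS
```

Note that even this says nothing about which paths the command touches: `cat /etc/passwd` would pass without approval.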
If you have listing files as a separate tool then you can also enforce that the agent doesn't list any files outside of the project directory.
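A rough sketch of what that enforcement could look like when listing files is its own tool. The names `list_files` and `PROJECT_ROOT` are hypothetical, and using the current working directory as the project root is an assumption for the example.

```python
from pathlib import Path

# Assumption for this sketch: the project root is the current working directory.
PROJECT_ROOT = Path.cwd().resolve()


def list_files(relative_path: str = ".", root: Path = PROJECT_ROOT) -> list[str]:
    """List the entries of a directory, refusing anything outside the root."""
    root = root.resolve()
    # resolve() collapses ".." and follows symlinks before the containment check.
    target = (root / relative_path).resolve()
    if not target.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"{relative_path!r} is outside the project directory")
    # Mark directories with a trailing slash so the caller can tell them apart.
    return sorted(
        entry.name + ("/" if entry.is_dir() else "") for entry in target.iterdir()
    )
```

With this shape, `list_files("..")` or an absolute path such as `list_files("/etc")` raises `PermissionError` instead of leaking a listing, and no approval prompt is needed for paths inside the root.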
My best guess is they started out with a limited subset of tools and realised they can just give it bash later.
One of the reasons why you get better performance if you give them the other tools is that there has been some reinforcement learning on Sonnet with all these tools. The model is aware of how these tools work, it is more token-efficient, and it is generally much more successful at performing those actions. The Bash tool, for instance, at times gets confused by bashisms, not escaping arguments correctly, not handling whitespace correctly etc.
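The whitespace problem is easy to reproduce without any model involved. A small sketch, using a made-up filename, of what happens when a path is pasted into a shell command string compared with quoting it or passing it as a structured argument:

```python
import shlex
import subprocess
import sys

filename = "my notes.txt"  # hypothetical file name containing a space

# Naive string building: a POSIX shell would split this into three words.
naive_command = "cat " + filename
naive_argv = shlex.split(naive_command)  # ['cat', 'my', 'notes.txt']

# Quoting the argument keeps it as a single word.
quoted_command = "cat " + shlex.quote(filename)
quoted_argv = shlex.split(quoted_command)  # ['cat', 'my notes.txt']


def echo_argument(value: str) -> str:
    """Pass a value as one argv entry, with no shell involved, and read it back."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

A dedicated tool that takes `path` as a parameter behaves like `echo_argument`: the value arrives intact and the model never has to get the quoting right.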
This saves the LLM from having to do multiple low level clicking and typing and keeps it on track. Help the poor model out, will ya!?
> The Bash tool, for instance, at times gets confused by bashisms, not escaping arguments correctly, not handling whitespace correctly etc.
This was the only informative sentence in the reply. Can you please go on in this manner - it was an important question.
This is a very strong argument for more specific tools, thanks!
Interesting! This didn't seem to be the case in the OP's examples - for instance using a list_files tool and then checking if the json result included README vs bash [ -f README ]
There is no training on a tool with that name. But it likely also doesn't need training because the parameter is just a path and that's a pretty basic tool.
On the other hand, to know how to execute a bash command, you need to know bash. Bash is a known tool to the Claude models [1] and so is text editing [2]. You're supposed to reference those in the tool listing, but at least from my testing, the moment you call a tool "bash", Claude makes plenty of assumptions about what the point of this thing is.
[1]: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...
[2]: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...
If you need to edit the source, just use patch with the bash tool.
What's the efficiency issue?