548 points kmelve | 1 comments
swframe2 ◴[] No.45108930[source]
Preventing garbage just requires that you take into account the cognitive limits of the agent. For example ...

1) Don't ask for a large or complex change. Ask for a plan, then ask the model to implement the plan in small steps and to test each step before starting the next.

2) For really complex steps, ask the model to write code to visualize the problem and solution.

3) If the model fails on a given step, ask it to add logging to the code, save the logs, run the tests, and then review the logs to determine what went wrong. Do this repeatedly until the step works well.

4) Ask the model to look at your existing code and determine how it was designed to implement a task. Sometimes the model will put all of the changes in one file even though your code has a cleaner design that the model doesn't take into account.

I've seen other people blog about their tips and tricks. I do still see garbage results, but not at a rate as high as 95%.
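
As a rough illustration of tip 3 (not from the original comment), here is a minimal Python sketch of the "add logging, save the logs, run the step, review the logs" loop. The step under test, `parse_scores`, and the helper names are hypothetical; only the standard-library `logging`, `tempfile` and `pathlib` modules are assumed.

```python
import logging
import tempfile
from pathlib import Path


def make_step_logger(log_path: Path) -> logging.Logger:
    """Return a logger that saves DEBUG-level records to log_path."""
    logger = logging.getLogger(f"step.{log_path.stem}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep step logs out of the root logger
    if not logger.handlers:
        handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def parse_scores(lines, logger):
    """Hypothetical step under test: parse 'name,score' rows, skipping bad ones."""
    scores = {}
    for lineno, line in enumerate(lines, start=1):
        logger.debug("line %d raw=%r", lineno, line)
        try:
            name, raw = line.split(",")
            scores[name.strip()] = int(raw)
        except ValueError as exc:
            # Record why the row was rejected so the log explains the failure.
            logger.warning("line %d skipped: %s", lineno, exc)
    return scores


def run_step_and_collect_log(lines):
    """Run the step once, then read the saved log back for review."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "parse_scores.log"
        logger = make_step_logger(log_path)
        result = parse_scores(lines, logger)
        # Close the file handler so the log is flushed before reading it.
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        return result, log_path.read_text(encoding="utf-8")
```

Calling `run_step_and_collect_log(["a,1", "b,x"])` returns both the parsed result and the log text, which is the part you would hand back to the model when a step misbehaves.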

replies(20): >>45109085 #>>45109229 #>>45109255 #>>45109297 #>>45109350 #>>45109631 #>>45109684 #>>45109710 #>>45109743 #>>45109822 #>>45109969 #>>45110014 #>>45110639 #>>45110707 #>>45110868 #>>45111654 #>>45112029 #>>45112178 #>>45112219 #>>45112752 #
nostrademons ◴[] No.45109822[source]
I've found that an effective tactic for larger, more complex tasks is to tell it "Don't write any code now. I'm going to describe each of the steps of the problem in more detail. The rough outline is going to be 1) Read this input 2) Generate these candidates 3) apply heuristics to score candidates 4) prioritize and rank candidates 5) come up with this data structure reflecting the output 6) write the output back to the DB in this schema". Claude will then go and write a TODO list in the code (and possibly claude.md if you've run /init), and prompt you for the details of each stage. I've even done this for an hour, told Claude "I have to stop now. Generate code for the finished stages and write out comments so you can pick up where you left off next time" and then been able to pick up next time with minimal fuss.
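
To make the "finish some stages, leave notes for the rest" idea concrete, here is a small Python sketch. It is not what Claude generates; the stage names are taken from the outline above, while the `stage` decorator, the `IMPLEMENTED` registry and the two toy stage bodies are assumptions made up for illustration.

```python
from typing import Callable, Dict, List

# Stage names follow the outline in the comment; the rest is hypothetical.
STAGES: List[str] = [
    "read_input",
    "generate_candidates",
    "score_candidates",
    "rank_candidates",
    "build_output",
    "write_output",
]

# Registry of the stages that have been implemented so far.
IMPLEMENTED: Dict[str, Callable] = {}


def stage(func: Callable) -> Callable:
    """Register a finished stage; reject names that are not in the plan."""
    if func.__name__ not in STAGES:
        raise ValueError(f"{func.__name__} is not a planned stage")
    IMPLEMENTED[func.__name__] = func
    return func


@stage
def read_input(raw: str) -> List[str]:
    # Toy implementation: split whitespace-separated tokens.
    return raw.split()


@stage
def generate_candidates(tokens: List[str]) -> List[str]:
    # Toy implementation: de-duplicate and sort the tokens.
    return sorted(set(tokens))


# TODO(next session): score_candidates, rank_candidates,
# build_output, write_output are still to be specified.


def pending_stages() -> List[str]:
    """List the planned stages that have no implementation yet."""
    return [name for name in STAGES if name not in IMPLEMENTED]


def run_finished_stages(data):
    """Run the pipeline in order, stopping at the first unfinished stage."""
    for name in STAGES:
        if name not in IMPLEMENTED:
            break
        data = IMPLEMENTED[name](data)
    return data
```

`pending_stages()` gives the next session an explicit list of what is left, which plays the same role as the TODO comments described above.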
replies(2): >>45109910 #>>45110652 #
hex4def6 ◴[] No.45109910[source]
FYI: You can force "Plan mode" by pressing shift-tab. That will prevent it from eagerly implementing stuff.
replies(1): >>45110172 #
jaggederest ◴[] No.45110172[source]
> That will prevent it from eagerly implementing stuff.

In theory. In practice, it's not a very secure sandbox and Claude will happily go around updating files if you insist / the prompt is bad / it goes off on a tangent.

I really should just set up a completely sandboxed VM for it so that I don't care if it goes rm -rf happy.

replies(1): >>45110424 #
adastra22 ◴[] No.45110424[source]
Plan mode disables the tools, so I don’t see how it would do that.

A sandboxed devcontainer is worth setting up, though. It lets me run it with --dangerously-skip-permissions

replies(3): >>45110522 #>>45111207 #>>45124163 #
jaggederest ◴[] No.45110522[source]
I don't know either but I've seen it write to files in plan mode. Very confusing.
replies(3): >>45111221 #>>45111254 #>>45113288 #
EnPissant ◴[] No.45111221[source]
That's not possible. You are misremembering.
replies(4): >>45111407 #>>45112419 #>>45112658 #>>45122288 #
1. nomoreofthat ◴[] No.45112419[source]
It’s entirely possible. Claude’s security model for subagents/tasks is incoherent and buggy, far below the standard they set elsewhere in their product, and planning mode can use subagent/tasks for research.

Permission limitations on the root agent have, in many cases, not been propagated to child agents, and the child agents have been able to execute different commands. The documentation is incomplete and unclear, and even where it is clear, it uses a different syntax with different limitations from the one used to configure permissions for the root agent. When you ask Claude itself to generate agent configurations, as is recommended, it will generate permissions that do not exist anywhere in the documentation and may or may not be valid, but no error is emitted if an invalid permission is set. If you ask it to explain, it gets confused by their own documentation and tells you it doesn’t know why it did that. I’m not sure if it’s hallucinating or if the agent-generating agent has access to internal details that are not documented anywhere and that the normal agent can’t see.
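
For readers who want the two failure modes spelled out, here is a minimal Python sketch of the behaviour one would expect instead: child permissions capped by the parent's, and unknown permission names rejected with an error. This is purely illustrative and is not how Claude Code is implemented; the permission names and both functions are hypothetical.

```python
from typing import FrozenSet, Iterable

# Hypothetical permission names for illustration; not Claude Code's real list.
KNOWN_PERMISSIONS: FrozenSet[str] = frozenset({"read", "search", "write", "shell"})


def validate(requested: Iterable[str]) -> FrozenSet[str]:
    """Reject unknown permission names loudly instead of ignoring them."""
    requested = frozenset(requested)
    unknown = requested - KNOWN_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    return requested


def child_permissions(parent: Iterable[str], requested: Iterable[str]) -> FrozenSet[str]:
    """A child agent gets only what it requests AND what its parent holds."""
    return validate(parent) & validate(requested)
```

Under this model, a read-only parent that spawns a child asking for `{"read", "write"}` ends up with a child that can only read, and a misspelled permission fails at configuration time.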

Anthropic is pretty consistently the best in this space in terms of security and product quality. They seem to actually care about doing software engineering properly. (I’ve personally discovered security bugs in several competing products that are more severe and exploitable than what I’m talking about here.) I have a ton of respect for Anthropic. Unfortunately, when it comes to subagents in Claude Code, they are not living up to the standard they have set.