
548 points kmelve | 1 comments
swframe2 ◴[] No.45108930[source]
Preventing garbage just requires that you take into account the cognitive limits of the agent. For example ...

1) Don't ask for a large or complex change. Ask for a plan, then ask the model to implement the plan in small steps and to test each step before starting the next.

2) For really complex steps, ask the model to write code to visualize the problem and solution.

3) If the model fails on a given step, ask it to add logging to the code, save the logs, run the tests and then review the logs to determine what went wrong. Do this repeatedly until the step works well.

4) Ask the model to look at your existing code and determine how it was designed to implement a task. Sometimes the model will put all of the changes in one file even though your code has a cleaner design that the model doesn't take into account.
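
For step 3, the add-logging-then-review loop might look something like the sketch below. It is only an illustration in Python: `parse_price` is a made-up step under test, and the logger name and `step.log` file name are assumptions, not anything a particular model produces.

```python
import logging
import os
import tempfile


def make_logger(log_path):
    """Return a logger that writes DEBUG-level records to log_path."""
    logger = logging.getLogger("step_debug")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records out of the root logger
    handler = logging.FileHandler(log_path, mode="w")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def parse_price(text, logger):
    """Hypothetical step under test: parse '$1,234.50' into a float."""
    logger.debug("input=%r", text)
    cleaned = text.replace("$", "").replace(",", "")
    logger.debug("cleaned=%r", cleaned)
    try:
        value = float(cleaned)
    except ValueError:
        logger.error("could not parse %r", cleaned)
        raise
    logger.debug("value=%r", value)
    return value


def run_step_and_collect_log(text):
    """Run the step, save the log, and return (result_or_None, log_lines)."""
    log_path = os.path.join(tempfile.mkdtemp(), "step.log")
    logger = make_logger(log_path)
    try:
        result = parse_price(text, logger)
    except ValueError:
        result = None
    # Close the file handler so the log is fully written before reading it.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    with open(log_path) as f:
        return result, f.read().splitlines()
```

Calling `run_step_and_collect_log("n/a")` returns `None` plus the saved log lines, which is the part you hand back to the model to review before it tries the step again.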

I've seen other people blog about their tricks and tips. I do still see garbage results but not as high as 95%.

replies(20): >>45109085 #>>45109229 #>>45109255 #>>45109297 #>>45109350 #>>45109631 #>>45109684 #>>45109710 #>>45109743 #>>45109822 #>>45109969 #>>45110014 #>>45110639 #>>45110707 #>>45110868 #>>45111654 #>>45112029 #>>45112178 #>>45112219 #>>45112752 #
nostrademons ◴[] No.45109822[source]
I've found that an effective tactic for larger, more complex tasks is to tell it "Don't write any code now. I'm going to describe each of the steps of the problem in more detail. The rough outline is going to be 1) Read this input 2) Generate these candidates 3) apply heuristics to score candidates 4) prioritize and rank candidates 5) come up with this data structure reflecting the output 6) write the output back to the DB in this schema". Claude will then go and write a TODO list in the code (and possibly claude.md if you've run /init), and prompt you for the details of each stage. I've even done this for an hour, told Claude "I have to stop now. Generate code for the finished stages and write out comments so you can pick up where you left off next time" and then been able to pick up next time with minimal fuss.
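
To make the outline concrete, here is roughly the kind of staged skeleton that prompt is steering toward. This is a Python sketch under loud assumptions: every function name, the `Candidate` type, the length-based scoring heuristic, and the list standing in for the DB are placeholders for illustration, not what Claude actually writes.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    score: float = 0.0


def read_input(raw):
    # Stage 1: read the input. Placeholder: split a raw string into tokens.
    return raw.split()


def generate_candidates(tokens):
    # Stage 2: generate candidates. Placeholder: one per unique token,
    # keeping first-seen order.
    return [Candidate(t) for t in dict.fromkeys(tokens)]


def score_candidates(candidates):
    # Stage 3: apply heuristics. Placeholder heuristic: longer scores higher.
    for c in candidates:
        c.score = float(len(c.text))
    return candidates


def rank_candidates(candidates):
    # Stage 4: prioritize and rank, highest score first.
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def build_output(ranked):
    # Stage 5: build the data structure that reflects the output.
    return [
        {"rank": i + 1, "text": c.text, "score": c.score}
        for i, c in enumerate(ranked)
    ]


def write_output(rows, db):
    # Stage 6: TODO replace with a write to the real DB schema.
    # Here `db` is just a list standing in for a table.
    db.extend(rows)
    return len(rows)


def run_pipeline(raw, db):
    # Each stage is small enough to describe, implement, and test on its own.
    tokens = read_input(raw)
    candidates = generate_candidates(tokens)
    scored = score_candidates(candidates)
    ranked = rank_candidates(scored)
    rows = build_output(ranked)
    return write_output(rows, db)
```

The point of the shape is that each stage has one obvious place for its TODO, so a half-finished session can stop after any stage and resume from the comments.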
replies(2): >>45109910 #>>45110652 #
hex4def6 ◴[] No.45109910[source]
FYI: You can force "Plan mode" by pressing shift-tab. That will prevent it from eagerly implementing stuff.
replies(1): >>45110172 #
jaggederest ◴[] No.45110172[source]
> That will prevent it from eagerly implementing stuff.

In theory. In practice, it's not a very secure sandbox and Claude will happily go around updating files if you insist / the prompt is bad / it goes off on a tangent.

I really should just set up a completely sandboxed VM for it so that I don't care if it goes rm -rf happy.

replies(1): >>45110424 #
adastra22 ◴[] No.45110424[source]
Plan mode disables the tools, so I don’t see how it would do that.

A sandboxed devcontainer is worth setting up though. Lets me run it with --dangerously-skip-permissions

replies(3): >>45110522 #>>45111207 #>>45124163 #
jaggederest ◴[] No.45110522[source]
I don't know either but I've seen it write to files in plan mode. Very confusing.
replies(3): >>45111221 #>>45111254 #>>45113288 #
EnPissant ◴[] No.45111221[source]
That's not possible. You are misremembering.
replies(4): >>45111407 #>>45112419 #>>45112658 #>>45122288 #
1. laborcontract ◴[] No.45112658[source]
No, it is possible. I just got it to write files both using Bash and its Write tools while in plan mode right now.