
Living Dangerously with Claude

(simonwillison.net)
36 points FromTheArchives | 12 comments
1. igor47 ◴[] No.45674278[source]
My approach is to ask Claude to plan anything beyond a trivial change; I review the plan, then let it run unsupervised to execute it. But I guess this still leaves me vulnerable to prompt injection if part of the plan involves accessing external content.
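A minimal sketch of that plan / review / execute split, using the Anthropic Python SDK rather than Claude Code's own plan mode; the model id, prompts, and the example task are placeholders, not anything from the thread:

```python
# Sketch of a plan -> human review -> execute loop via the Anthropic Python SDK.
# Model id, prompts, and the task are placeholders; Claude Code's plan mode
# works differently and is not shown here.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-20250514"  # placeholder model id

def ask(prompt: str) -> str:
    response = client.messages.create(
        model=MODEL,
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

task = "Add retry logic with exponential backoff to fetch_data() in client.py"

# Step 1: ask only for a plan, no code changes yet.
plan = ask(f"Produce a step-by-step plan (no code) for this change:\n{task}")
print(plan)

# Step 2: a human reviews the plan before anything runs unsupervised.
if input("Approve this plan? [y/N] ").strip().lower() == "y":
    # Step 3: only now ask for an implementation of the approved plan.
    patch = ask(f"Implement exactly this approved plan as a unified diff:\n{plan}")
    print(patch)
```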
2. danielbln ◴[] No.45675102[source]
Claude Code offers sandboxing now: https://www.anthropic.com/engineering/claude-code-sandboxing
replies(1): >>45676932 #
3. lacker ◴[] No.45675963[source]
The sandbox idea seems nice, it's just a question of how annoying it is in practice. For example the "Claude Code on the web" sandbox appears to prevent you from loading `https://api.github.com/repos/.../releases/latest`. Presumably that's to prevent you from doing dangerous GitHub API operations with escalated privileges, which is good, but it's currently breaking some of my setup scripts....
replies(1): >>45676023 #
4. simonw ◴[] No.45676023[source]
Is that with their default environment?

I have been running a bunch of stuff in there with a custom environment that allows "*".
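The failure lacker describes is an egress allowlist rejecting a host that isn't on the list, which the "*" custom environment sidesteps. A toy illustration of that allowlist idea in Python; this is not Anthropic's actual sandbox code or configuration format, and the hosts and repo path are placeholders:

```python
# Toy egress-allowlist check with a "*" wildcard.
# Not Anthropic's actual sandbox code or config format.
from fnmatch import fnmatch
from urllib.parse import urlparse

ALLOWED_HOSTS = ["registry.npmjs.org", "pypi.org"]  # restrictive default-style list
# ALLOWED_HOSTS = ["*"]                             # the permissive custom environment

def is_allowed(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return any(fnmatch(host, pattern) for pattern in ALLOWED_HOSTS)

# Placeholder repo path; with the restrictive list this request is blocked,
# with ["*"] it goes through.
print(is_allowed("https://api.github.com/repos/example/example/releases/latest"))
```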

5. js2 ◴[] No.45676932[source]
It's discussed in the linked post.
6. matthewdgreen ◴[] No.45677089[source]
So let me get this straight. You’re writing tens of thousands of lines of code that will presumably go into a public GitHub repository and/or be served from some location. Even if it only runs locally on your own machine, at some point you’ll presumably give that code network access. And that code is being developed (without much review) by an agent that, in our threat model, has been fully subverted by prompt injection?

Sandboxing the agent hardly seems like a sufficient defense here.

replies(1): >>45677537 #
7. simonw ◴[] No.45677537[source]
What is your worst case scenario from this?
8. catigula ◴[] No.45677712[source]
Telling Claude to solve a problem and walking away doesn't mean you solved the problem. You weren't in the loop. You didn't complete any side quests or do anything of note; you merely watched an AGI work.
replies(1): >>45678921 #
9. simonw ◴[] No.45678921[source]
Here's one I did even less work for: https://tools.simonwillison.net/terminal-to-html - prompt and video here: https://simonwillison.net/2025/Oct/23/claude-code-for-web-vi...
10. stuaxo ◴[] No.45680719[source]
I've been thinking about this a bit.

I reckon something like Qubes could work fairly well.

Create a new Qube with control over its network connectivity, do everything there, then copy the work out at the end and destroy it.
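A rough sketch of that disposable-Qube workflow, assuming it is driven from dom0 with the standard qvm-* tools; the qube name, template, and the command run inside the qube are placeholders, and this is an outline rather than a tested script:

```python
# Outline of the create -> work offline -> copy out -> destroy workflow on Qubes,
# driven from dom0 via the standard qvm-* tools. Names and paths are placeholders.
import subprocess

QUBE = "claude-work"  # placeholder name for the throwaway qube

def run(*cmd: str) -> None:
    subprocess.run(cmd, check=True)

# 1. Create a fresh qube and cut off its network access
#    (it could instead be pointed at a restrictive firewall qube).
run("qvm-create", "--class", "AppVM", "--template", "fedora-40", "--label", "red", QUBE)
run("qvm-prefs", QUBE, "netvm", "")

# 2. Do the agent work inside the qube (command is a placeholder).
run("qvm-run", "--pass-io", QUBE, "cd ~/project && run-the-agent-here")

# 3. Copy the finished work out to dom0, then destroy the qube.
with open("result.tar", "wb") as out:
    subprocess.run(
        ["qvm-run", "--pass-io", QUBE, "tar -C ~/project -cf - ."],
        stdout=out, check=True,
    )
run("qvm-shutdown", "--wait", QUBE)
run("qvm-remove", "-f", QUBE)
```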

11. BoredPositron ◴[] No.45680947[source]
Every blog post gets worse. Sorry, but you are in way over your head, and the solution is always brute force with LLMs. You never show learnings; it's always open-ended, always verbose, and tbh how couldn't it be with 10-15 posts per month? You did 5 posts in 3 days... and some of it is literally just repackaging the source with some added one-liners. It's TikTok-level engagement farming.
Every blog post gets worse. Sorry but you are in way over your head and the solution is always bruteforce with LLMs. You never show learnings it's always open ended it's always verbose and tbh how shouldn't it be with 10-15 posts per month. You did 5 posts in 3 days... and for someone it's literally just repacking the source with some added one liners. It's tiktok level engagement farming.