
Living Dangerously with Claude

(simonwillison.net)
200 points FromTheArchives | 1 comments
matthewdgreen ◴[] No.45677089[source]
So let me get this straight. You’re writing tens of thousands of lines of code that will presumably go into a public GitHub repository and/or be served from some location. Even if it only runs locally on your own machine, at some point you’ll presumably give that code network access. And that code is being developed (without much review) by an agent that, in our threat model, has been fully subverted by prompt injection?

Sandboxing the agent hardly seems like a sufficient defense here.

replies(3): >>45677537 #>>45684527 #>>45686450 #
tptacek ◴[] No.45684527[source]
Where did "without much review" come from? I don't see that in the deck.
replies(2): >>45684731 #>>45688191 #
enraged_camel ◴[] No.45684731[source]
Yeah. Personally I have found a workflow that relies heavily on detailed design specs and red/green TDD, followed by code review. And that's fine because that's how I did my work before AI anyway, both at the individual level and at the team level. So really, this is no different than reviewing someone else's PR, aside from the (greatly increased) turnaround time and volume.
replies(3): >>45684813 #>>45684821 #>>45689590 #
1. Leynos ◴[] No.45689590[source]
That's pretty much how I use Codex.