Sandboxing the agent hardly seems like a sufficient defense here.
All three of the projects I described in this talk have effectively zero risk in terms of containing harmful unreviewed code.
DeepSeek-OCR on the Spark? I ran that one in a Docker container, saved some notes on the process and then literally threw away the container once it had finished.
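As a rough illustration of that throwaway-container pattern, here is a minimal Python sketch. It is not the exact setup used for the DeepSeek-OCR project: the helper name `disposable_docker_command`, the `python:3.12-slim` image, and the `/tmp/agent-notes` directory are all assumptions made up for the example. The only Docker features it relies on are the standard `--rm` flag (delete the container when it exits) and a `-v` bind mount so that notes survive after the container is gone.

```python
import shlex


def disposable_docker_command(image, command, notes_dir=None):
    """Build the argv for a `docker run` that discards the container on exit.

    Hypothetical helper for illustration only.
    """
    # --rm tells Docker to remove the container as soon as it stops.
    argv = ["docker", "run", "--rm"]
    if notes_dir is not None:
        # Bind-mount a host directory so notes written to /notes outlive the container.
        argv += ["-v", f"{notes_dir}:/notes"]
    argv.append(image)
    argv += list(command)
    return argv


# Assumed image and paths; swap in whatever the experiment actually needs.
cmd = disposable_docker_command(
    "python:3.12-slim",
    ["python", "-c", "print('hello from a throwaway container')"],
    notes_dir="/tmp/agent-notes",
)
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would actually start the container on a machine with Docker installed. Anything the code does outside the mounted notes directory disappears when the container exits.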
The Pyodide in Node.js one I did actually review, because it's code I execute on a machine that isn't disposable. The initial research ran in a disposable remote container though (Claude Code for web).
The Perl in WebAssembly one? That runs in a browser sandbox. There's effectively nothing bad that can happen there, which is why I like WebAssembly so much.
I am a whole lot more cautious in reviewing code that has real stakes attached to it.