Something I've realized about LLM tool use is that it means that if you can reduce a problem to something that can be solved by an LLM in a sandbox using tools in a loop, you can brute force that problem.
The job then becomes identifying those problems and figuring out how to configure a sandbox for them, what tools to provide and how to define the success criteria for the model.
That still takes significant skill and experience, but it's at a higher level than chewing through that problem using trial and error by hand.
My assembly Mandelbrot experiment was the thing that made this click for me: https://simonwillison.net/2025/Jul/2/mandelbrot-in-x86-assem...
replies(7):