
159 points by jbredeche | 2 comments
xnx:
We are at a weird moment where response latency is high enough that we're anthropomorphizing AI code assistants into employees. We don't talk about image generation this way. With images, it's batching up a few jobs and reviewing the results later. We don't say "I spun up a bunch of AI artists."
laterium:
As a follow-up: how would this workflow feel if LLM generation were instantaneous or cost nothing? What would the new bottleneck be? Running the tests? Network speed? The human reviewer?
simonw:
You can get a glimpse of that by trying one of the wildly performant LLM providers - most notably Cerebras and Groq, or the Gemini Diffusion preview.

I have videos showing Cerebras: https://simonwillison.net/2024/Oct/31/cerebras-coder/ and Gemini Diffusion: https://simonwillison.net/2025/May/21/gemini-diffusion/
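
(If you want to measure this yourself, here's a rough sketch of timing a completion against Groq's OpenAI-compatible endpoint. Assumptions on my part: you have the `openai` Python package, a key in GROQ_API_KEY, and the model name below is just an example, so swap in whatever Groq currently serves.)

    import os
    import time
    from openai import OpenAI

    # Groq exposes an OpenAI-compatible API, so the standard client
    # works if you point base_url at it.
    client = OpenAI(
        base_url="https://api.groq.com/openai/v1",
        api_key=os.environ["GROQ_API_KEY"],
    )

    start = time.monotonic()
    response = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # example model name; check Groq's current list
        messages=[{"role": "user", "content": "Write a function that reverses a string."}],
    )
    elapsed = time.monotonic() - start

    # Rough tokens-per-second figure for the completion.
    tokens = response.usage.completion_tokens
    print(f"{tokens} tokens in {elapsed:.2f}s ({tokens / elapsed:.0f} tok/s)")

At speeds like Cerebras and Groq advertise, the generation time stops dominating and the other costs in the loop (network round trips, running tests, the human reading the diff) become the bottleneck.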