
Development speed is not a bottleneck

(pawelbrodzinski.substack.com)
191 points by flail | 8 comments
thenanyu ◴[] No.45138802[source]
It's completely absurd how wrong this article is. Development speed is 100% the bottleneck.

Just to quote one little bit from the piece regarding Google: "In other words, there have been numerous dead ends that they explored, invalidated, and moved on from. There's no knowing up front."

Every time you change your mind or learn something new and you have to make a course correction, there's latency. That latency is just development velocity. The way to find the right answer isn't to think very hard and miraculously come up with the perfect answer. It's to try every goddamn thing that shows promise. The bottleneck for that is 100% development speed.

If you can shrink your iteration time, then there are fewer meetings trying to determine prioritization. There are fewer discussions and bargaining sessions you need to do. Because just developing the variations would be faster than all of the debate. So the amount of time you waste in meetings and deliberation goes down as well.

If you can shrink your iteration time between versions 2 and 3, between versions 3 and 4, and so on, the advantage compounds over your competitors. You find promising solutions earlier, which lead to new promising solutions earlier. Over an extended period of time, this is how you build a moat.
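
To put rough numbers on that compounding, here's a back-of-the-envelope sketch (the cycle times and the 10%-per-iteration learning factor are made-up assumptions, just to illustrate the shape of the curve):

    import math

    WEEKS_PER_YEAR = 52
    LEARNING_PER_ITERATION = 1.10  # assumption: each shipped iteration compounds learning by 10%

    def iterations_per_year(cycle_weeks):
        """How many full build-measure-correct loops fit in a year."""
        return math.floor(WEEKS_PER_YEAR / cycle_weeks)

    for cycle_weeks in (1, 2, 3):
        n = iterations_per_year(cycle_weeks)
        compounded = LEARNING_PER_ITERATION ** n
        print(f"{cycle_weeks}-week cycle: {n} iterations/year, compounded learning x{compounded:.1f}")

    # 1-week cycle: 52 iterations/year, compounded learning x142.0
    # 2-week cycle: 26 iterations/year, compounded learning x11.9
    # 3-week cycle: 17 iterations/year, compounded learning x5.1

Halving the cycle time doesn't just double the iteration count; under this (admittedly crude) model the gap between you and a slower competitor widens with every release.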

replies(13): >>45139053 #>>45139060 #>>45139417 #>>45139619 #>>45139814 #>>45139926 #>>45140039 #>>45140332 #>>45140412 #>>45141131 #>>45144376 #>>45147059 #>>45154763 #
trjordan ◴[] No.45139053[source]
This article is right insofar as "development velocity" has been redefined to be "typing speed."

With LLMs, you can type so much faster! So we should be going faster! It feels faster!

(We are not going faster.)

But your definition, the right one, is spot on. The pace of learning and decisions is exactly what drives development velocity. My one quibble is that if you want to learn whether something is worth doing, implementing it isn't always the answer. Even within implementation, prototyping vs. production-quality work is different. But yeah, broadly, you need to test and validate as many _ideas_ as possible, in order to make as many correct _decisions_ as possible.

That's one place I'm pretty bullish on AI: using it to explore/test ideas, which otherwise would have been too expensive. You can learn a ton by sending the AI off to research stuff (code, web search, your production logs, whatever), which lets you try more stuff. That genuinely tightens the feedback loop, and you go faster.

I wrote a bit more about that here: https://tern.sh/blog/you-have-to-decide/

replies(4): >>45139232 #>>45139283 #>>45139863 #>>45140155 #
giancarlostoro ◴[] No.45139863[source]
I can agree with this sentiment. It does not matter how insanely good LLMs become if you cannot assess their output quickly enough. You will ALWAYS want a human to verify, validate, and test the software. There could be a ticking time bomb in there somewhere.

Maybe the real skynet will kill us with ticking time bomb software bugs we blindly accepted.

replies(3): >>45140469 #>>45140953 #>>45143958 #
1. ACCount37 ◴[] No.45140953[source]
The threshold of supervision keeps rising - and it's going to keep rising.

GPT-2 was barely capable of writing two lines of code. GPT-3.5 could write a simple code snippet and be right more often than not. GPT-4 was a leap over that, enabling things like "vibe coding" for small, simple projects, and GPT-5 is yet another advancement in the same direction. Each AI upgrade brings forth more capabilities - with every upgrade, the AI can go further before it needs supervision.

I can totally see the amount of supervision an AI needs collapsing to zero within our lifetimes.

replies(2): >>45141183 #>>45145223 #
2. gyrovagueGeist ◴[] No.45141183[source]
In the medium term, I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands off and less supervised.

Because they generate so much code that often passes initial tests, looks reasonable, and fails in nonhuman ways, all in a pretty opinionated style tbh.

I have less context (and need to spend much more effort and supervision time getting up to speed) to fix, refactor, and integrate the solutions than if I were only trusting short, few-line windows at a time.

replies(1): >>45141293 #
3. warkdarrior ◴[] No.45141293[source]
> I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands off and less supervised.

That is because you are trained in the old way of writing code: manually crafting software line by line, slowly, deliberately, thoughtfully. New generations of developers will not use the same workflow as you, just like you do not use the same workflow as folks who programmed with punch cards.

replies(1): >>45141398 #
4. _se ◴[] No.45141398{3}[source]
No, it's because reading code is slower than writing it.

The only way these tools can possibly be faster for non-trivial work is if you don't give enough of a shit about the output to even read it. And if you can do that and still achieve your goal, chances are your goal wasn't that difficult to begin with.

That's why we're now consistently measuring individuals to be slower using these tools even though many of them feel faster.

replies(2): >>45142008 #>>45156360 #
5. mwigdahl ◴[] No.45142008{4}[source]
"Consistently"? Is there more than just the one METR study that's saying this?
replies(1): >>45148718 #
6. daxfohl ◴[] No.45145223[source]
I could see it happening in a year or two, especially in backend. There are only so many different architecture patterns we use, and an LLM will have access to every one that has ever been deployed, every document, every gripe, every research paper, etc.

I mean, I think ultimately the state space in designing a feature is way smaller than that of, say, Go (the game). Maybe a few hundred common patterns and maybe a billion reasonable ways to combine them. I think it's only a matter of time before we ask an LLM to design a feature and it produces five options that are all better than what we'd have come up with.
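
Rough arithmetic behind that hunch (the 300-pattern count is a made-up assumption; ~2x10^170 is the commonly cited count of legal 19x19 Go positions):

    from math import comb

    PATTERNS = 300                 # assumption: "a few hundred common patterns"
    GO_LEGAL_POSITIONS = 2.08e170  # commonly cited count of legal 19x19 Go positions

    # Treat a feature design as picking and combining a handful of patterns.
    for k in (4, 5):
        print(f"choose {k} of {PATTERNS} patterns: ~{comb(PATTERNS, k):.2e} combinations")
    print(f"legal Go positions:            ~{GO_LEGAL_POSITIONS:.2e}")

    # choose 4 of 300 patterns: ~3.31e+08 combinations
    # choose 5 of 300 patterns: ~1.96e+10 combinations
    # legal Go positions:            ~2.08e+170

Even with generous assumptions, the design space is hundreds of orders of magnitude smaller than Go's.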

7. _se ◴[] No.45148718{5}[source]
I have measured it myself within my organization, and I know many peers across companies who have done the same. No, I cannot share the data (I wish I could, truly), but I expect that we will begin to see many of these types of studies emerge before long.

The tools are absolutely useful, but they need to be applied in the right places, and they are decidedly not a silver bullet or general-purpose software engineering tool in the manner that they're being billed at present. We still use them despite our findings, but we use them judiciously and where they actually help.

8. KronisLV ◴[] No.45156360{4}[source]
> No, it's because reading code is slower than writing it.

This feels wrong to me, unless we qualify the statement with: "...if you want the exact same level of understanding of it."

Otherwise, the bottleneck in development would be pull/merge request review, not writing the code in the first place. But almost always it's the other way around: someone works on a feature for 3-5 days, and the pull/merge request does not spend anywhere near the same time in active review. I don't think you need the exact same level of intricate understanding of some code when reviewing it.

It's quite similar with the AI stuff: I often nitpick and want to rework certain bits of code that AI generates (or fix obvious issues with it), but using it for the first version/draft is still easier than trying to approach the issue from zero. Ofc AI won't make you consistently better, but it will remove some of the friction and reduce the cognitive load.