
Development speed is not a bottleneck

(pawelbrodzinski.substack.com)
191 points | flail | 2 comments
thenanyu ◴[] No.45138802[source]
It's completely absurd how wrong this article is. Development speed is 100% the bottleneck.

Just to quote one little bit from the piece regarding Google: "In other words, there have been numerous dead ends that they explored, invalidated, and moved on from. There's no knowing up front."

Every time you change your mind or learn something new and you have to make a course correction, there's latency. That latency is just development velocity. The way to find the right answer isn't to think very hard and miraculously come up with the perfect answer. It's to try every goddamn thing that shows promise. The bottleneck for that is 100% development speed.

If you can shrink your iteration time, then there are fewer meetings trying to determine prioritization. There are fewer discussions and bargaining sessions you need to do. Because just developing the variations would be faster than all of the debate. So the amount of time you waste in meetings and deliberation goes down as well.

If you can shrink your iteration time between versions 2 and 3, between versions 3 and 4, and so on, the advantage compounds over your competitors. You find promising solutions earlier, which lead to new promising solutions earlier. Over an extended period of time, this is how you build a moat.
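The compounding claim can be sketched as a toy model. All the numbers below (cycle times, improvement rate) are invented for illustration, not measured from any real team:

```python
# Toy model of compounding iteration advantage. Hypothetical numbers only.

def iterations_completed(cycle_days: float, horizon_days: int) -> int:
    """Full build-learn-correct loops that fit in the horizon."""
    return int(horizon_days // cycle_days)

HORIZON = 365  # one year
fast = iterations_completed(cycle_days=5, horizon_days=HORIZON)   # 73 loops
slow = iterations_completed(cycle_days=10, horizon_days=HORIZON)  # 36 loops

# If each completed loop compounds accumulated learning by a small
# factor r, the gap between the teams is multiplicative, not additive.
r = 0.02  # hypothetical per-iteration improvement
advantage = (1 + r) ** fast / (1 + r) ** slow
print(f"{fast} vs {slow} loops; relative advantage ~{advantage:.2f}x")
```

Halving the cycle time here roughly doubles the compounded position after a year, which is the "moat" effect described above.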

trjordan ◴[] No.45139053[source]
This article is right insofar as "development velocity" has been redefined to be "typing speed."

With LLMs, you can type so much faster! So we should be going faster! It feels faster!

(We are not going faster.)

But your definition, the right one, is spot on. The pace of learning and decisions is exactly what drives development velocity. My one quibble is that if you want to learn whether something is worth doing, implementing it isn't always the answer. Even within implementation, prototyping is different from a production-quality build. But yeah, broadly, you need to test and validate as many _ideas_ as possible, in order to make as many correct _decisions_ as possible.

That's one place I'm pretty bullish on AI: using it to explore/test ideas, which otherwise would have been too expensive. You can learn a ton by sending the AI off to research stuff (code, web search, your production logs, whatever), which lets you try more stuff. That genuinely tightens the feedback loop, and you go faster.

I wrote a bit more about that here: https://tern.sh/blog/you-have-to-decide/

giancarlostoro ◴[] No.45139863[source]
I can agree with this sentiment. It does not matter how insanely good LLMs become if you cannot assess their output quickly enough. You will ALWAYS want a human to verify, validate, and test the software. There could be a ticking time bomb in there somewhere.

Maybe the real skynet will kill us with ticking time bomb software bugs we blindly accepted.

ACCount37 ◴[] No.45140953[source]
The threshold of supervision keeps rising - and it's going to keep rising.

GPT-2 was barely capable of writing two lines of code. GPT-3.5 could write a simple code snippet, and be right more often than it was wrong. GPT-4 was a leap over that, enabling things like "vibe coding" for small simple projects, and GPT-5 is yet another advancement in the same direction. Each AI upgrade brings forth more capabilities - with every upgrade, the AI can go further before it needs supervision.

I can totally see the amount of supervision an AI needs collapsing to zero within our lifetimes.

gyrovagueGeist ◴[] No.45141183[source]
In the middle term, I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands off and less supervised.

Because they generate so much code that often passes initial tests and looks reasonable, yet fails in nonhuman ways, in a pretty opinionated style tbh.

I have less context to fix, refactor, and integrate the solutions (and need to spend much more effort and supervision time to get up to speed) than if I were only trusting short, few-line windows at a time.

warkdarrior ◴[] No.45141293[source]
> I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands off and less supervised.

That is because you are trained in the old way of writing code: manual crafting of software line by line, slowly, deliberately, thoughtfully. New generations of developers will not use the same workflow as you, just like you do not use the same workflow as folks who programmed punch cards.

_se ◴[] No.45141398[source]
No, it's because reading code is slower than writing it.

The only way these tools can possibly be faster for non-trivial work is if you don't give enough of a shit about the output to even read it. And if you can do that and still achieve your goal, chances are your goal wasn't that difficult to begin with.

That's why we're now consistently measuring individuals to be slower using these tools even though many of them feel faster.

mwigdahl ◴[] No.45142008[source]
"Consistently"? Is there more than just the one METR study that's saying this?
_se ◴[] No.45148718[source]
I have measured it myself within my organization, and I know many peers across companies who have done the same. No, I cannot share the data (I wish I could, truly), but I expect that we will begin to see many of these types of studies emerge before long.

The tools are absolutely useful, but they need to be applied in the right places: they are decidedly not the silver bullet or general-purpose software engineering tool they're being billed as at present. We still use them despite our findings, but we use them judiciously and where they actually help.