
Development speed is not a bottleneck

(pawelbrodzinski.substack.com)
191 points by flail | 29 comments
thenanyu ◴[] No.45138802[source]
It's completely absurd how wrong this article is. Development speed is 100% the bottleneck.

Just to quote one little bit from the piece regarding Google: "In other words, there have been numerous dead ends that they explored, invalidated, and moved on from. There's no knowing up front."

Every time you change your mind or learn something new and you have to make a course correction, there's latency. That latency is just development velocity. The way to find the right answer isn't to think very hard and miraculously come up with the perfect answer. It's to try every goddamn thing that shows promise. The bottleneck for that is 100% development speed.

If you can shrink your iteration time, then there are fewer meetings trying to determine prioritization. There are fewer discussions and bargaining sessions you need to do. Because just developing the variations would be faster than all of the debate. So the amount of time you waste in meetings and deliberation goes down as well.

If you can shrink your iteration time between versions 2 and 3, between versions 3 and 4, etc., the advantage compounds over your competitors. You find promising solutions earlier, which lead to new promising solutions earlier. Over an extended period of time, this is how you build a moat.

replies(13): >>45139053 #>>45139060 #>>45139417 #>>45139619 #>>45139814 #>>45139926 #>>45140039 #>>45140332 #>>45140412 #>>45141131 #>>45144376 #>>45147059 #>>45154763 #
1. trjordan ◴[] No.45139053[source]
This article is right insofar as "development velocity" has been redefined to be "typing speed."

With LLMs, you can type so much faster! So we should be going faster! It feels faster!

(We are not going faster.)

But your definition, the right one, is spot on. The pace of learning and decisions is exactly what drives development velocity. My one quibble is that if you want to learn whether something is worth doing, implementing it isn't always the answer. Even within that, prototyping vs. production-quality implementation is different. But yeah, broadly, you need to test and validate as many _ideas_ as possible, in order to make as many correct _decisions_ as possible.

That's one place I'm pretty bullish on AI: using it to explore/test ideas, which otherwise would have been too expensive. You can learn a ton by sending the AI off to research stuff (code, web search, your production logs, whatever), which lets you try more stuff. That genuinely tightens the feedback loop, and you go faster.

I wrote a bit more about that here: https://tern.sh/blog/you-have-to-decide/

replies(4): >>45139232 #>>45139283 #>>45139863 #>>45140155 #
2. add-sub-mul-div ◴[] No.45139232[source]
I think people are largely split on LLMs based on whether they've reached a point of mastery where they can work nearly as fast as they can think, in which case the tech would slow them down rather than accelerate them.
replies(2): >>45139589 #>>45145091 #
3. skydhash ◴[] No.45139283[source]
Naur's theory of programming has always felt right to me. Once you know everything about the current implementation, planning and decision making can be done really fast, and there's not much time lost on actually implementing prototypes and dead ends (learning with extra steps).

It's very rare to not touch existing code, even when writing new features. Knowing where to do so in advance (and planning so you don't have to do it a lot) is where velocity comes from. AI can't help with that.

replies(2): >>45140154 #>>45145405 #
4. no_wizard ◴[] No.45139589[source]
The verbose approach that Cursor and some other LLM tools have taken really annoys me. I would prefer it to simply give me the results (written out to files, changes to files, or whatever the appropriate medium is) and only let me introspect the verbose steps it took if I ask to.

That's what slows me down with AI tools, and why I ended up sticking with GitHub Copilot, which does not do any of that unless I prompt it to.

replies(3): >>45142018 #>>45142053 #>>45143248 #
5. giancarlostoro ◴[] No.45139863[source]
I can agree with this sentiment. It does not matter how insanely good LLMs become if you cannot assess their output quickly enough. You will ALWAYS want a human to verify, validate, and test the software. There could be a ticking time bomb in there somewhere.

Maybe the real skynet will kill us with ticking time bomb software bugs we blindly accepted.

replies(3): >>45140469 #>>45140953 #>>45143958 #
6. flail ◴[] No.45140154[source]
I wouldn't argue with that part, although there are definitely limits to how big a chunk of a big product a single brain can really grasp technically. And when the number of people involved in "grasping" grows, so does the coordination/communication tax. I digress, though.

We could go with that perception, however, only if we assume that whatever is in the backlog is actually the right thing to build, i.e., if we knew that every feature has value to the customers and (even better) that they are sorted from the most valuable to the least valuable.

In reality, many features have negative value, i.e., they hurt performance, customer satisfaction, or whatever key metric a company employs.

The big question: can we check some of these before we actually develop a fully-fledged feature? The answer, very often, is yes. And if we follow up with an inquiry into how to validate such ideas without development, we will find a way more often than not.

Teresa Torres' Continuous Discovery Habits is an entire book about that :)

One of her recurring patterns is the Opportunity Solution Tree, which is a way of navigating across all the possible experiments to focus on the right ones (and ignore, i.e., not develop, all the rest).

7. ajuc ◴[] No.45140155[source]
It's like the speed of light in different media. It's not that photons slow down; they just hit more stuff and spend more time getting absorbed and re-emitted.

A better developer wastes less time solving the wrong problem.

8. thenanyu ◴[] No.45140469[source]
In most scenarios I can tell you whether I like or dislike a feature much faster than it takes a developer to build it.
replies(1): >>45140922 #
9. k__ ◴[] No.45140922{3}[source]
If it just came down to "the idea guy liking or disliking a feature", things would be quite easy...
replies(1): >>45140980 #
10. ACCount37 ◴[] No.45140953[source]
The threshold of supervision keeps rising - and it's going to keep rising.

GPT-2 was barely capable of writing two lines of code. GPT-3.5 could write a simple code snippet, and be right more often than it was wrong. GPT-4 was a leap over that, enabling things like "vibe coding" for small simple projects, and GPT-5 is yet another advancement in the same direction. Each AI upgrade brings forth more capabilities - with every upgrade, the AI can go further before it needs supervision.

I can totally see the amount of supervision an AI needs collapsing to zero within our lifetimes.

replies(2): >>45141183 #>>45145223 #
11. thenanyu ◴[] No.45140980{4}[source]
Why doesn't it? It doesn't have to be you or me personally; it could be a representative sample of our users.
replies(1): >>45141955 #
12. gyrovagueGeist ◴[] No.45141183{3}[source]
In the medium term, I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands-off and less supervised.

They generate so much code that often passes initial tests, looks reasonable, and fails in nonhuman ways, in a pretty opinionated style tbh.

I have less context (and need to spend much more effort and supervision time to get up to speed) to fix, refactor, and integrate the solutions than if I were only trusting short, few-line windows at a time.

replies(1): >>45141293 #
13. warkdarrior ◴[] No.45141293{4}[source]
> I almost feel less productive using modern GPT-5/Claude Sonnet 4 for software dev than prior models, precisely because they are more hands off and less supervised.

That is because you are trained in the old way of writing code: manually crafting software line by line, slowly, deliberately, thoughtfully. New generations of developers will not use the same workflow as you, just like you do not use the same workflow as the folks who programmed with punch cards.

replies(1): >>45141398 #
14. _se ◴[] No.45141398{5}[source]
No, it's because reading code is slower than writing it.

The only way these tools can possibly be faster for non-trivial work is if you don't give enough of a shit about the output to even read it. And if you can do that and still achieve your goal, chances are your goal wasn't that difficult to begin with.

That's why we're now consistently measuring individuals to be slower when using these tools, even though many of them feel faster.

replies(2): >>45142008 #>>45156360 #
15. cestith ◴[] No.45141955{5}[source]
So if you wait to put together a representative sample of users and gather the data long enough for the numbers to matter, you’ve gated further changes. If you’ve gated further changes for a week, why does it matter that the feature change was done in an hour or a day?
replies(1): >>45142084 #
16. mwigdahl ◴[] No.45142008{6}[source]
"Consistently"? Is there more than just the one METR study that's saying this?
replies(1): >>45148718 #
17. cestith ◴[] No.45142018{3}[source]
I want a merge request with a short, meaningful comment and the diffs just like I’d get from a human. Then I want to be able to discuss the changes if they aren’t exactly what’s needed, just like with a human. I don’t want to have to hold its hand and I don’t want to have to pair program everything with a chatbot. It also needs to be able to show a logic diagram, a data flow diagram, and a dependency tree. If an agent can’t give me that, it’s not really ready to work as a developer for me.
18. DenisM ◴[] No.45142053{3}[source]
LLMs might rely on their own verbosity to carry the conversation in a stable direction.
19. thenanyu ◴[] No.45142084{6}[source]
Releasing it to users does not take a long time. Randomly select 5% of your user base and give them the feature. If your development process were mature, this would be a button you could push in your deployment env.
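A minimal sketch of what that button could look like under the hood (hypothetical feature and user names, not from this thread; hash-based bucketing so each user's assignment stays stable across sessions):

```typescript
import { createHash } from "crypto";

// Deterministically bucket a user into [0, 100) from a stable hash of the
// feature name and user id, so the same user always gets the same variant.
function rolloutBucket(featureName: string, userId: string): number {
  const digest = createHash("sha256")
    .update(`${featureName}:${userId}`)
    .digest();
  // First 4 bytes as an unsigned int, mapped onto 0-99.
  return digest.readUInt32BE(0) % 100;
}

// Enable the feature for roughly `percent` percent of users.
function isFeatureEnabled(feature: string, userId: string, percent: number): boolean {
  return rolloutBucket(feature, userId) < percent;
}

// Example: ship the hypothetical "new-checkout" flow to ~5% of users.
if (isFeatureEnabled("new-checkout", "user-1234", 5)) {
  // render the new experience
}
```

Ramping up is then just changing the percentage, and assignments stay consistent because the hash is deterministic.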
20. daliusd ◴[] No.45143248{3}[source]
So you want Aider, Claude Code or opencode.ai it seems. I use opencode.ai a lot nowadays and am really happy and productive.
replies(2): >>45145124 #>>45162442 #
21. IanCal ◴[] No.45143958[source]
That doesn’t require developer time though.

Also, that time is needed regardless; do you think it's the majority of the time involved in releasing a feature?

22. tharkun__ ◴[] No.45145091[source]
I can't. The LLM (Claude Code really) is just too slow. It is just so slow at doing the things I ask it to do once I'm at the review stage.

Like the initial plan always sounds great and looks great. Then it goes to actually do the changes and proclaims victory after I left it alone doing other stuff, because it takes a while. Then I review what it did and what it didn't do, and I inevitably find that it only did half of what it said it would do, and did half of what it did do incorrectly, despite what it told me it would do.

The use case here is a large code base that needs changes. Not new feature development on a green field (or a green corner of an established product). And it's just so unbearably frustrating. It's like giving the task to a Junior on probation. I tell them something, they go off for 10 minutes and tell me they're done and I look and find seven holes I need to tell them to fix. But they aren't the Junior that picks up stuff and gets better and needs less supervision. Instead it seems like the context gets more and more polluted and the Junior gets closer and closer to failing his probation.

Many grey hairs added recently, because yeah, we also "have to be faster by using AI" now ...

23. tharkun__ ◴[] No.45145124{4}[source]
I really wanted to use Aider. But it's impossible. How do people actually use it?

Like, I gave it access to our code base and wanted to try a very simple bug fix. I only told it to look at the one service I knew needed changes, because it says it works better in smaller code bases. It wanted to send so many tokens to Sonnet that I hit the limits before it even started actually doing any coding.

Instant fail.

Then I just ran Claude Code, gave it the same instructions and I had a mostly working fix in a few minutes (never mind the other fails with Claude I've had - see other comment), but Aider was a huge disappointment for me.

replies(1): >>45152381 #
24. daxfohl ◴[] No.45145223{3}[source]
I could see it happening in a year or two, especially in backend. There are only so many different architecture patterns we use, and an LLM will have access to every one that has ever been deployed, every document, every gripe, every research paper, etc.

I mean, I think ultimately the state space in designing a feature is way smaller than, say, Go (the game). Maybe a few hundred common patterns and maybe a billion reasonable ways to combine them. I think it's only a matter of time before we ask it to design a feature and it produces five options that are all better than what we'd have come up with.

25. himeexcelanta ◴[] No.45145405[source]
Typing syntax and dealing with language issues takes a lot of mental overhead that AI mostly solves in the right hands. It’s not zero!
26. _se ◴[] No.45148718{7}[source]
I have measured it myself within my organization, and I know many peers across companies who have done the same. No, I cannot share the data (I wish I could, truly), but I expect that we will begin to see many of these types of studies emerge before long.

The tools are absolutely useful, but they need to be applied in the right places, and they are decidedly not a silver bullet or general-purpose software engineering tool in the manner that they're being billed at present. We still use them despite our findings, but we use them judiciously and where they actually help.

27. daliusd ◴[] No.45152381{5}[source]
I don't know about Aider; I am not using it because of its lack of MCP support and poor GitHub Copilot support (both are important to me). Maybe that will get better in the future, if it's still relevant. I usually use opencode.ai with Claude Sonnet 4. Sometimes I try switching to different models, e.g. Gemini 2.5 Pro, but Sonnet is more consistent for me.

It would be good to define what "smaller code bases" means. Here is what I am working on: a 10-year-old project full of legacy code, consisting of about 10 services and 10 front-end projects. I have also tried it on a project similar to MUI or Mantine UI, and naturally on many smaller projects. I also tried it on a TypeScript codebase where it failed for me (but it is hard to judge from one attempt). Lastly, I use it on smaller projects. Overall, the question is more about the task than about the code base size: if the task does not involve loading too much context, the code base size might be irrelevant.

28. KronisLV ◴[] No.45156360{6}[source]
> No, it's because reading code is slower than writing it.

This feels wrong to me, unless we qualify the statement with: "...if you want the exact same level of understanding of it."

Otherwise, the bottleneck in development would be pull/merge request review, not writing the code in the first place. But it's almost always the other way around: someone works on a feature for 3-5 days, while the pull/merge request doesn't spend anywhere near that long in active review. I don't think you need the exact same level of intricate understanding of some code when reviewing it.

It's quite similar with the AI stuff: I often nitpick and want to rework certain bits of code that AI generates (or fix obvious issues with it), but using it for the first version/draft is still easier than trying to approach the issue from zero. Of course AI won't make you consistently better, but it will remove some of the friction and reduce the cognitive load.

29. no_wizard ◴[] No.45162442{4}[source]
At the end of the day I want what my job is willing to pay for, which is a few different flavors of AI tools.