I can highly recommend these talks to get your eyes slightly opened to how stuck we are in a local minimum.
Sure, code sweatshops have a very different percentage of the above, but that's a completely different game altogether.
Vibe coding is going to make this so much worse; the tech debt of load-bearing code that no one really understands is going to be immense.
Just to quote one little bit from the piece regarding Google: "In other words, there have been numerous dead ends that they explored, invalidated, and moved on from. There's no knowing up front."
Every time you change your mind or learn something new and you have to make a course correction, there's latency. That latency is your development velocity. The way to find the right answer isn't to think very hard and miraculously come up with the perfect answer. It's to try every goddamn thing that shows promise. The bottleneck for that is 100% development speed.
If you can shrink your iteration time, then there are fewer meetings trying to determine prioritization. There are fewer discussions and bargaining sessions you need to do. Because just developing the variations would be faster than all of the debate. So the amount of time you waste in meetings and deliberation goes down as well.
If you can shrink your iteration time between versions 2 and 3, between versions 3 and 4, etc., the advantage compounds over your competitors. You find promising solutions earlier, which lead to new promising solutions earlier. Over an extended period of time, this is how you build a moat.
I use Python differently because uv made many things faster and less costly. Stuff I used to do in bash is now in Python. Stuff I wouldn't do at all because 3rd-party modules were an incompressible expense, I now do because the cost is low.
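For context, this is the kind of throwaway script uv makes cheap: inline dependency metadata that uv resolves on the fly. A minimal sketch; the file name and URL are just illustrative:

    # /// script
    # dependencies = ["requests"]
    # ///
    # Run with: uv run check_health.py
    import requests

    # The sort of one-off that used to be a curl-in-bash job; now pulling
    # in a 3rd-party module costs effectively nothing.
    resp = requests.get("https://example.com/health")
    print(resp.status_code, resp.elapsed.total_seconds())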
Same with AI.
Every week, there was a small tool I actively chose not to develop, because I knew that automating the thing would save less time than coding it would take.
E.g., I regularly send documents from my hard drive, or forward mails, to a specific email address for accounting. It would be nice to be able to do those in one click. But developing a Nautilus script or Thunderbird extension to save at most a minute a day doesn't make sense.
Except now, with Claude Code, it does. In a week, they paid off. And now I'm racking up the minutes.
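To give a flavor, one of those generated tools is roughly this: a Nautilus script that mails the selected files to accounting. A sketch rather than the actual script; the addresses and the local SMTP relay are hypothetical placeholders:

    #!/usr/bin/env python3
    # Drop into ~/.local/share/nautilus/scripts/ and mark executable.
    import os
    import smtplib
    from email.message import EmailMessage
    from pathlib import Path

    # Nautilus hands the selected files to the script via this env var,
    # newline-separated.
    paths = os.environ.get("NAUTILUS_SCRIPT_SELECTED_FILE_PATHS", "").splitlines()

    msg = EmailMessage()
    msg["From"] = "me@example.com"        # placeholder
    msg["To"] = "accounting@example.com"  # placeholder
    msg["Subject"] = "Documents for accounting"

    for p in filter(None, paths):
        msg.add_attachment(Path(p).read_bytes(),
                           maintype="application", subtype="octet-stream",
                           filename=Path(p).name)

    with smtplib.SMTP("localhost") as s:  # assumes a local relay
        s.send_message(msg)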
Now each week, I'm getting a new tool that is not only saving me minutes, but also reducing context switching. Those turn into hours, which turn into days. These compound.
And of course, getting an MVP or a new feature demo out of the door quickly allows you to get feedback faster.
In general, AI lets you get a shorter feedback loop. Trash bad concepts sooner. Get crucial info faster.
Those do speed up a project.
Research and thinking is always going to be the bottleneck.
But with LLMs I'm not so sure. I feel like I can skip the effort of typing, which is still effort, despite years of coding. I feel like I actually did end up spending quite a lot of time doing trivial nonsense like figuring out syntax errors and version mismatches. With an LLM I can conserve more of my attention on the things that really matter, while the AI sorts out the tedious things.
This in turn means that I can test more things at the top architectural level. If I want to do an experiment, I don't feel a reluctance to actually do it, since I now don't need to concentrate on it, rather I'm just guiding the AI. I can even do multiple such explorations at once.
With LLMs, you can type so much faster! So we should be going faster! It feels faster!
(We are not going faster.)
But your definition, the right one, is spot on. The pace of learning and decisions is exactly what drives development velocity. My one quibble is that if you want to learn whether something is worth doing, implementing it isn't always the answer. Prototyping vs. production-quality implementation is different, even within that. But yeah, broadly, you need to test and validate as many _ideas_ as possible, in order to make as many correct _decisions_ as possible.
That's one place I'm pretty bullish on AI: using it to explore/test ideas, which otherwise would have been too expensive. You can learn a ton by sending the AI off to research stuff (code, web search, your production logs, whatever), which lets you try more stuff. That genuinely tightens the feedback loop, and you go faster.
I wrote a bit more about that here: https://tern.sh/blog/you-have-to-decide/
This is /especially/ true in software in 2025, because most products are SaaS or subscription based, so you have a consistent revenue stream that can cover ongoing development costs, which gives you the necessary runway to iterate repeatedly. Development costs then become relatively stable for a given team size, and the velocity of that team entirely determines how often you can iterate, which determines how quickly you find an optimal solution and derive more value.
With the LLM, I really can spend most of my time on the verification problem.
It's basically the wetware equivalent of page thrashing.
My experience is that I write better code faster by turning off the AI assistants and configuring the IDE to produce suggestions that are as deterministic and fast as possible; that way they become a rapid shorthand. This makes for a fast way of writing code that doesn't lead to mental model thrashing, since the model can be updated incrementally as I go.
The exception is using LLMs to straight up generate a prototype that can be refined. That also works pretty well, and largely avoids the expensive exchanges of information back and forth between human and machine.
This has been my experience as well :/
Depending on your subject matter you might only need an idea or two per 100 LOC generated. So much of what I used to do turns out to be grunt work that was simply pattern matching on simple heuristics, but I can churn out 5-10 good ideas per hour it seems, so I'm definitely rate limited on coding.
Similar to your comment on architectural experiments, one thing I have been observing is that the critical path doesn't go 10x faster, but by multiplexing small incidental ideas I can get a lot more done. E.g., "it would be nice if we had a new set of integration tests that stub this API in some slightly tedious way, go build that".
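For the curious, the kind of slightly tedious stub that's perfect to delegate looks something like this. A minimal sketch using Python's stdlib mocking; invoice_sync.sync_one and billing_client are hypothetical stand-ins for the code under test and the third-party API it calls:

    import unittest
    from unittest.mock import patch

    # Hypothetical module under test; it calls billing_client.get_invoice().
    from invoice_sync import sync_one

    class InvoiceSyncTest(unittest.TestCase):
        # Patch the API at the seam where sync_one uses it.
        @patch("invoice_sync.billing_client")
        def test_missing_invoice_is_skipped(self, mock_billing):
            mock_billing.get_invoice.return_value = None  # API: no such invoice
            self.assertFalse(sync_one("INV-42"))
            mock_billing.get_invoice.assert_called_once_with("INV-42")

    if __name__ == "__main__":
        unittest.main()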
It’s very rare to not touch up code, even when writing new features. Knowing where to do so in advance (and planning to not have to do that a lot) is where velocity is. AI can’t help.
That's what slows me down with AI tools, and why I ended up sticking with GitHub Copilot, which does not do any of that unless I prompt it to.
The current trend in anti-vibe-coding articles is to take whatever the vibe coding maximalists are saying and then stake out the polar opposite position. In this case, vibe coding maximalists are claiming that LLM coding will dramatically accelerate time to market, so the anti-vibe-coding people feel like they need to claim that development speed has no impact at all. Add a dash of clickbait (putting "development speed" in the headline when they mean typing speed) and you get the standard LLM war clickbait article.
Both extremes are wrong, of course. Accelerating development speed is helpful, but it's not the only factor that goes into launching a successful product. If something can accelerate development speed, it will accelerate time to market and turnaround on feature requests.
I also think this mentality appeals to people who have been stuck in slow moving companies where you spend more time in meetings, waiting for blockers from third parties, writing documents, and appeasing stakeholders than you do shipping code. In some companies, you really could reduce development time to 0 and it wouldn't change anything because every feature must go through a gauntlet of meetings, approvals, and waiting for stakeholders to have open slots in their calendars to make progress. For anyone stuck in this environment, coding speed barely matters because the rest of the company moves so slow.
For those of us familiar with faster moving environments that prioritize shipping and discourage excessive process and meetings, development speed is absolutely a bottleneck.
The vibe coders can deliver happy-path results pretty fast, but I've already seen that within 2 months it starts to fall apart quickly and has to be extensively refactored, which ultimately ends up taking more time than if it had been done with quality in mind in the first place.
And supposedly the free market makes companies “efficient and logical”
You can't test or evaluate something that doesn't work yet.
Even CEOs of car companies get fired because they mess this up. Sonos lost a lot of value, and got its CEO fired, because they messed up and couldn't fix it in time.
Speed is not everything. Developing the right features (what users want) and quality are the most important things, but development speed allows you to test features, fix things fast, and course-correct.
Whether you call yourself an engineer, developer, programmer, or even a coder is mostly a localized thing, not an evaluation of expertise.
We're confusing everyone when we pretend a title reflects how good we are at the craft, especially titles we already use to refer to ourselves without judgement. At least use script kiddie or something.
Two/three months to code everything ("It's maximum priority!"), about four to QA, and then about a year to deploy to individual country services by the ops team.
During the test and deploy phases, the developers were just twiddling their thumbs, because ops refused to allow them access and product refused to take in new projects due to the possibility of developers having to go back to the code.
It took the CEO to intervene and investigate the issues, and the CTO's college best friend that was running DevOps was demoted.
Maybe the real skynet will kill us with ticking time bomb software bugs we blindly accepted.
But there are people with great product taste who can know by trying a product whether it meets a real user need - some of these are early-adopter customers, sometimes they are great designers, sometimes PMs. And they really do need to try a product (or prototype) to really know whether it works. I was always frustrated as a junior engineer when the PM would design a feature in a written spec, we would implement it, and then when trying it out before launch, they would want to totally redesign it, often in ways which required either terrible hacks or significant technical design changes to meet the new requirements. But after 15 years of seeing some great ideas on paper fall flat with our users, and noticing that truly exceptional product people could tell exactly what was wrong after the feature was built but before it was released to users, I learned to be flexible about those sorts of rewrites. And it’s exactly that sort of thing that vibecoding can accelerate
When we were writing a compiler at Sycor, there were teams waiting for us to finish our development. We were successful, being about an order of magnitude faster than the effort we replaced.
And just because google cancels products doesn't suggest anything about development speed.
If I were an LLM advocate (having much fun currently with gemini), I would let the criticism roll and make book using LLMs.
My new paradigm is something like:
- write a few paragraphs about what is needed
- have the bot take in the context and produce a prototype solution outside of the main application
- have the bot describe main integration challenges
- do that integration myself — although I’m still somewhat lazy about this and keep trying to have the bot do it after the above steps; it seems to only have maybe 50% success rate
- obviously test thoroughly
Perhaps I've just misunderstood the point, but it seems like a nonsensical argument.
Also, in the past I've done interactive maps and charts for different media organizations, and people would often debate for a considerable amount of time whether to, for example, make a bar or line chart (the actual questions and visualizations themselves were usually more sophisticated).
I remember occasionally suggesting prototyping both options and trying them out, and intuitively that usually struck people as impractical, even though it would often take less time than the discussions and yield more concrete results.
I think those “fall apart in 2 months” kinds of projects will still keep happening, but some of us had that experience and are refining our use of the tools. So I think in the future we will see a broader spread of “percent generated code” and degrees of success
The whole Lean Startup was about figuring out how to validate ideas without actually developing them. And it is as relevant as ever, even with AI (maybe, especially with AI).
In fact, it's enough to look at the appalling rate of product success. We commonly agree that 90% of startups fail. The majority of that cohort have built things that shouldn't have been built at all in the first place. That's utter waste.
If only, instead of focusing on building more, they stopped and reevaluated whether they were building the right thing in the first place. Yet, most startups are completely immersed in the "development as a bottleneck" principle. And I say that from our own experience of 20+ years of helping such companies build their early-stage products. The biggest challenge? Convincing them to build less, validate, learn, and only then go back to further development.
When it comes to existing products, it gets even more complex. The quote from Leah Tharin explicitly mentions waiting weeks/months until they were able to get statistically significant data. What follows is that within that part of experimentation, they were blocked.
Another angle to take a look at it is the fundamental difference in innovation between Edison/Dyson and Tesla.
The first duo was known for "I have not failed. I found 10,000 ways that don't work." They were flailing around with ideas till something eventually clicked.
Tesla, in contrast, would be at the Einstein end of the spectrum with "If I had an hour to solve a problem, I'd spend 55 minutes thinking about the problem and 5 minutes thinking about [or in Tesla's case, making] solutions."
While most of the product companies would be somewhere in between, I'd argue that development is a bottleneck only if we are very close to Edison/Dyson's approach.
But fine, let's take the subset of features / projects that can be tested or somehow validated. In my experience (having worked for 13+ years at companies that prefer to A/B test almost everything), more than half of the tests fail. People might initially think the solution is to have better ideas, cook them longer, do better analysis. That's usually wrong. I've seen PhDs with 20+ years of experience in a given industry (Search) launch experiments, and they still fail.
The solution is to have some sort of "just enough" analysis like user studies, intuition, and business needs, and launch as fast and as many as you can. Therefore, development speed is A bottleneck (there's no Silver Bullet so it's not THE bottleneck).
We could go with that perception, however, only if we assume that whatever is in the backlog is actually the right thing to build. If we knew that every feature has value to the customers and (even better) they are sorted from the most valuable to the least valuable one.
In reality, many features have negative value, i.e., they hurt performance, customer satisfaction, any key metric a company employs.
The big question: can we check some of these before we actually develop a fully-fledged feature? The answer, very often, is positive. And if we follow up with an inquiry about how to validate such ideas without development, we will find a way more often than not.
Teresa Torres' Continuous Discovery Habits is an entire book about that :)
One of her recurring patterns is the Opportunity Solution Tree, which is a way of navigating across all the possible experiments to focus on the right ones (and ignore, i.e., not develop, all the rest).
We have literally one half-hour-long sync meeting a week. The rest is as lightweight as possible, typically averaging below 10 minutes daily with clients (when all the decisions happen on the fly).
I've worked in the corpo world, too, and it is anything but.
We do use vibe coding a lot in prototyping. Depending on the context, we sometimes have a lot of AI-agent-generated code, too.
What's more, because of working on multiple projects, we have a fairly decent pool of data points. And we don't see much of a speed improvement from the perspective of a project (I wrote more on it here: https://brodzinski.com/2025/08/most-underestimated-factor-es...).
However, developers sure report their perception of being more productive. We do discuss how much these perceptions are grounded in reality, though. See this: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o... and this: https://substack.com/home/post/p-172538377
So, I don't think I'm biased toward bureaucratic environments, where developers code in MS Word rather than VS Code.
But these are all just one dimension of the discussion. The other is a simple question: are there ways of validating ideas before we turn them into implemented features/products?
The answer has always been a wholehearted "yes".
If development pace were all that counted, the Googles and Amazons of this world would be beating the crap out of every aspiring startup in any niche the big tech cared about, even remotely. And that simply is not happening.
Incumbents are known to be losing ground, and old-school behemoths that still kick butt (such as IBM) do so because they continuously reinvent their businesses.
Do we always have to build it before we know that it will work (or, in 9 cases out of 10, that it will not work)?
Even more so, do we have to build a fully-fledged version of it to know?
If yes, then I agree, development is the bottleneck.
Development speed absolutely is a bottleneck. But coding speed? Like, typing? Yeah, I can definitely type faster than I can think about code, or anything really (typing at 100wpm is a fun party trick but not super useful in the end). Many times over... Even single-finger typists who peck at the keyboard probably can; auto-complete has existed for a long time...
The effort it takes to implement a feature makes it more likely you think twice before you start.
If the effort goes to zero, so does the thinking.
We will turn from programmers to just LLM customers sooner or later.
Because testing if it works can be done by non-programmers.
Cognitively, these are very different tasks. With the former, we actively drive technical decisions (decide on architecture, implementation details, even naming). The latter offers all these decisions made, and we first need to untangle them all before we can scrutinize the details.
What's more, often AI-generated code results in bigger PRs, which again adds to the cognitive load.
And some developers fall into a rabbit hole of starting another thing while they wait for their agent to produce the code. Adding context switching to an already taxing challenge basically fries brains. There's no way for such a code review to consistently catch the issues.
I see development teams defining healthy routines around working with generated code, especially around limiting context switching, but also taking tasks back to be done by hand.
And don't take that as a complaint. It's a basic behavioral observation. What we say we do is different from what we really do. By the same token, what we say we want is different from what we really want.
At the risk of being a bit sarcastic: we say we want regular exercise to keep fit, but we really want doomscrolling on a sofa with a beer in hand.
In the product development context, we have a very different attitude towards an imagined (hell, even wireframed) solution than an actual working piece of software. So it's kinda obvious we can't get it right on the first attempt.
We can be working in the right direction, and many product teams don't even do that. For them, development speed is only a clock counting down the time remaining before VCs pull the plug.
Check out all of the bullshit “AI” companies that YC is funding.
BigTech is not “losing ground”; all of them are reporting increasing revenues and profits.
GPT-2 was barely capable of writing two lines of code. GPT-3.5 could write a simple code snippet, and be right more often than it was wrong. GPT-4 was a leap over that, enabling things like "vibe coding" for small simple projects, and GPT-5 is yet another advancement in the same direction. Each AI upgrade brings forth more capabilities - with every upgrade, the AI can go further before it needs supervision.
I can totally see the amount of supervision an AI needs collapsing to zero within our lifetimes.
VP of Product put all the pressure on dev teams to deliver all the features against the specs. Then they release the new product/new version with plenty of fanfare.
And then literally no one measures which parts have actually delivered any value. I'd bet a big part of that code added no value, so it's pure waste. Some other parts were actually harmful. They frustrated users, drove key metrics down, or what have you. They are worse than waste.
But no one cared to check. Good product people, and there are precious few of them, would follow up with validation of what worked and what did not. They would argue against "major" releases whenever possible.
And seriously, if Amazon can avoid major releases, almost anyone could.
Suddenly, we might flip the script and have a VP of Product not asking "when will it be done?" but rather trying to figure out what the next most sensible experiments are.
They don't understand that this AI was built decades ago and has been improved on several times over: Compilers & Interpreters. Furthermore, you don't need billion-dollar neural-network supercomputers, just a vanilla laptop.
It's because of how you talk about the job, though. We automate every other kind of "coding" - why can't we automate yours?
Accompanying many early-stage startups in their journey, I see how often the development (which we're responsible for) takes a back seat. Sometimes the pivotal role will be customer support, sometimes it will be business development, and often product management will drive the whole thing.
And there's one more follow-up thought to this observation. Products that achieve success inevitably get into a spiral of gaining more features. That, in turn, makes them more clunky and less usable, and ultimately opens the way for new players who disrupt the niche.
At some point, adding more features in general makes things worse--too complicated, too overwhelming, making it harder to accomplish the core task. And yet, adding new stuff never ceases.
In the long run, the best tactic may actually be to go slower (and stop at some point), but focus on the meaningful changes.
LLMs are a tool that added a new dimension to explore. While I, like many, haven't felt actual gains, others are finding them, and time will allow us to better judge whether those can lead to long-term impacts on the economy.
Just based on what I've been reading and experiencing:
- Short-term POCs can reach the validation stage faster.
- Mature cloud software needs a lot of extra tooling (LLMs don't understand the codebase, there's a lack of places to derive good context from, and so on).
- Anything in between for cloud seems to be hit or miss, where people are mostly trading first-iteration time for more refactoring later down the line.
From another perspective, areas of software where things are a lot more about numbers (cpu time, memory consumption, and so on), may benefit a lot from faster development/coding as the validation phase is either shorter or can be executed in parallel.
The key reality here is that I've been observing higher expectations for deliveries without proof that we actually got better at coding in general. Which means that sacrifices are being made somewhere.
No code -> no software.
Because they generate so much code that often passes initial tests, looks reasonable, and fails in nonhuman ways, in a pretty opinionated style tbh.
I have less context (and need to spend much more effort and supervision time to get up to speed) to fix, refactor, and integrate the solutions than if I were only trusting short, few-line windows at a time.
A question: what if all those activities are to build a feature that will harm user retention or a product no one wants?
A follow-up question: what if we could have known that up front, or there was a simple way to learn that?
Because so often we build stuff that shouldn't have been built in the first place (appalling startup success rate is probably a good enough statistical measure of that). And yes, there are ways to learn that we're building the wrong thing, other than building a fully-fledged version of it.
Other than that the discovery process of what you should build is the hardest and costliest part, the main conclusion from the article seems to be that if you outsource the first iterations to AI via vibe-coding, you will have a much harder time changing and evolving it from there (iterating); to this, I agree.
That is because you are trained in the old way of writing code: manual crafting of software line by line, slowly, deliberately, thoughtfully. New generations of developers will not use the same workflow as you, just like you do not use the same workflow as folks who programmed punch cards.
I like this metaphor. Looking at a map, we may get a pretty good understanding of whether it's a place we'd like to spend time, say, on vacation.
We don't physically go to a place to scrutinize it.
And we don't limit ourselves to maps only. We check reviews, ask friends, and what have you. We do cheap validation before committing to a costly decision.
If we planned vacations the way we build software products, we'd just go there (because the map is not the territory), learn that the place sucks, and then we'd complain that finding good vacation spots is costly and time-consuming. Oh, and we'd mention that traveling is a bottleneck in finding good spots.
This is often a CTO putting pressure on a dev manager when the bottleneck is ops, or product, or putting pressure on product when the bottleneck is dev.
The normal rationalization is that "you should be putting pressure on them".
The actual reason is that they are putting pressure on you as a show of force, rather than actually wanting it to go faster.
This is why the only response to a bad manager is to run away.
The only way these tools can possibly be faster for non-trivial work is if you don't give a shit enough about the output to not even read it. And if you can do that and still achieve your goal, chances are your goal wasn't that difficult to begin with.
That's why we're now consistently measuring individuals to be slower using these tools even though many of them feel faster.
It is about designing good experiments, validating, and learning, so that when we're down to development, we build something that's way more likely to succeed.
The fact that we were advised to build non-technical experiments is but a small part. And with the current AI capabilities, we actually have a new power tool for prototyping that falls neatly into the whole puzzle.
Here's a bit more elaborate argument (sorry for a LinkedIn link): https://www.linkedin.com/posts/pawelbrodzinski_weve-already-...
Disregard parts that explicitly assume that they are relevant only because, in 2013, development was expensive. There are very few parts that you would throw out.
It is telling that, while the article's theme is product management (and its relationship with the pace of development), that context is largely ignored in some comments. It's as if the article's scope was purely what happens within the IDE and/or AI agent of choice.
The whole point is that the perspective necessarily should be broader. Otherwise, we make it a circular argument, really: development is a bottleneck of development.
Well, hard to disagree on that.
Claude crapped out a workable landing page in ~30 seconds of prompting. I updated the copy on the page, total time less than an hour.
The odds of me spending more than an hour just picking a color theme for the page or finding the SVG icons it used is pretty much 100%.
------------
I had a bug in some async code, it hit rarely but often enough it was noticeable. I had narrowed down what file it was in, but after over an hour of staring at the code I wasn't finding it.
Popped into cursor, asked it to look for async bugs in the current file. "You forgot to clean up a resource on this line here."
Bug fixed.
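(For anyone curious, that class of bug usually has this shape. A reconstructed sketch, not the actual code; here a semaphore stands in for the leaked resource:)

    import asyncio

    sem = asyncio.Semaphore(5)  # stands in for a small connection pool

    async def handler(i):
        await sem.acquire()
        try:
            if i % 7 == 0:             # the rare path
                raise ValueError(i)
            await asyncio.sleep(0.01)  # the actual work
        finally:
            # The bug was a missing release on the rare path: each failure
            # leaked one permit, until unrelated requests quietly hung.
            sem.release()

    async def main():
        results = await asyncio.gather(
            *(handler(i) for i in range(50)), return_exceptions=True)
        print(sum(isinstance(r, ValueError) for r in results),
              "rare failures, no leaked permits")

    asyncio.run(main())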
------------
"Here is my nginx config, what is wrong with the block I just added for this new site I'm throwing up?"
------------
"Write a regex to do nnnnnn"
------------
"This page isn't working on mobile, something is wrong, can you investigate and tell me what the issues may be?"
Oh that won't go well, all of the models get super confused about CSS at some point and end up in doom spirals applying incorrect fixes again and again.
> Googles and Amazons of this world would be beating the crap out of every aspiring startup in any niche the big tech cared about, even remotely. And that simply is not happening.
This is already a well explored and understood space, to the extent that big tech cos have at times spun teams off to work independently to gain the advantage of startup-like velocities.
The more infra you have, the more overhead you have. Deploying a company's first service to production is really easy, no infra needed, no dev-ops, just publish.
Deploying the 5th service, eh.
Deploying the 50th service, well, by now you need to have a host of meetings before work even starts, to make sure you aren't duplicating effort and that the libraries you use mesh with the department's strategic technical vision. By the time those meetings are done, a startup will have already put 3 things into prod.
The communication overhead within large orgs is also famously non-linear.
I spent 10 years working at Microsoft, then 3 years at HBO Max (a lean tech company: 200 engineers, amazing dev ops), and now I'm working at startups of various sizes.
At Microsoft, pre-Azure, it could take weeks just to get a machine provisioned to test an idea out on. Actually getting a project up and running in a repo was... hard at times. Build systems were complex, tooling was complex, and you sure as hell weren't getting anything pushed to users without a lot of checks in place. Now many of those checks were in place for damn good reasons, wrongly drawn lines on a map inside Windows are a literal international incident[1], and we had separate localizations for different variants of English around the world. (And I'd argue that Microsoft's agility at deploying software around the entire world at the same time is unmatched; the people I worked with there were amazing at sorting through the cultural and legal problems!)
Also if Google launches a new service and it goes down from too much traffic, it is embarrassing. Everything they do has to be scalable and load balanced, just to avoid bad press. If a startup hits the front page of HN and their website goes down from being too popular, they get to write a follow up blog post about how their announcement was so damn popular their site crashed! (And if they are lucky, hit the front page of HN again!)
The differences in designing for different levels of scale are huge.
At Microsoft it was "expect potentially a billion users." At HBO it was "expect tens of millions of users." At many startups it is "if we hit 10k users we'll turn a profit, and we can figure out how to scale out later."
10K DAU is a load balancer and 3 instances of NodeJS (for rolling updates), each running on a potato of a CPU.
> So, I don't think I'm biased toward bureaucratic environments, where developers code in MS Word rather than VS Code.
I've worked in those environments, and the level of engineering quality can be much higher. The number of bugs that can be hammered out and avoided in spec reviews is huge. Technology designs end up being serviceable for years to decades instead of "until the next rewrite". The actual code tends to flow much faster as well, or at least as fast as it can flow in the large sprawling code bases that exist at big tech companies. At other times, those specs are needed so that one has a path forward while working through messy legacy code bases.
Both styles have their place - sometimes you need to iterate quickly and get lots of code down and see what works, other times it is worth thinking through edge cases, usage scenarios, and performance characteristics. Heck, I've done memory bus calculations for different designs. When you are working at that level you don't just "write code and see what works"; you first spend a few days (or a week!) with some other smart engineers and try to narrow down the potential field of what you should even be trying to do!
[1]https://www.upi.com/Archives/1995/09/09/Microsoft-settles-In...
With a more complex code base (and a less popular tech stack), the perceived gains quickly diminish. Beyond a certain level of tech debt, AI-generated code is utterly useless. It's no surprise that we see people who vibe-coded their products with no technical knowledge whatsoever, and now they call professional engineers to untangle the mess.
A software agency I know well responded to the rise of AI with something along the lines of "Now, we'll have plenty of work cleaning up all that mess!" Admittedly, they always specialized in complex/rescue engineering gigs.
However, the "development as a bottleneck" discussion was set here in a broader context. It's not only how efficiently we are able to deliver bits of functionality, but primarily whether we should be building these things in the first place.
Equally for early-stage startups and established products alike, so many features are built because someone said so. At the end of the day, they don't deliver any value (if we're lucky) or are plainly harmful (if we're out of luck).
In such cases, it would have been better if developers actually sipped coffee and read Hacker News rather than coded/developed/engineered stuff.
The context of the article is product development, with a bias toward the commercial part of the ecosystem. And of course, as any picture painted with broad strokes, some generalizations were inevitable.
As a scientist, you definitely are familiar with the weight (or lack thereof) of anecdotal evidence. Unless the claim is "it can never work" or "it always works," my individual experience is just that--an individual experience.
Or on a smaller scale, what's the last genuine Atlassian success?
Yet, when it comes to product innovation, the momentum is always on the side of the new players. Always has been.
Project management/work organization software? Linear. Async communication? Slack. Social Media? TikTok. One has to be curious how Zoom is doing so well, given that all the big competition actually controls the channels for setting up meetings. Self-publishing? Substack. Even with AI, everyone plays catch-up with Sam Altman, and many of the most prominent companies are newcomers.
We could go on and on.
Yes, Big Techs will survive because they have enough momentum to survive events such as Ballmer-era MS. But that doesn't mean they lead product innovation.
And it's expected. Conflicting priorities, growing bureaucracies, shareholders' expectations, old business lines (and more), all make them less flexible.
Paul Buchheit's stories about Gmail and AdSense are good examples. I was an early Gmail user when it was invitation-only and invitations were sparingly distributed (only as fast as the infrastructure could handle).
So, while I understand the difference in PR costs, it's not like they don't have tools to run smaller experiments.
I agree with the huge bureaucracy cost. On the other hand, they really have (relatively) infinite resources if they care to deploy them. And sometimes they do. And they still fail.
They often fail even when they try a Skunk Works-like approach. Google Wave was famously developed as a corporate Lean Startup (before there was Lean Startup). It was a disaster. Precisely because they did close to zero validation pre-release.
A side note: huge flop though it was (although Buzz and Google+ were bigger), it didn't hurt them long-term in PR or reputation.
An innovative product is one where customers in aggregate are willing to pay more for it than it costs to create and run. Any idiot can sell a bunch of dollar bills for 95 cents.
Going back to the latest batch of YC companies, their value play can easily be duplicated by any company in their vertical, either by throwing a few engineers at it or by creating a statement of work for the consulting company I work for, where I can pull together a few engineers and knock it out in a few months, and they will already have customers to sell it to.
There was one recent YC company (of course, one of the BS AI companies) that was hiring a "founding full stack engineer" for $150K. It looks like they were two non-technical "serial entrepreneurs" without even an MVP that YC threw money at.
You can't imagine how many times some harebrained, underfunded startup reached out to me to be a "CTO" that paid less than I made as a mid-level employee at BigTech, with the promise of Monopoly money "equity".
Go ahead and code as much as you want. Unless you can communicate the utility of that code to a paying customer it has no value or relevance.
People criticize Microsoft's historical fiefdom model, and it had its issues, but it also allowed orgs to find what worked for them and basically run independently. Of course it also had orgs fighting with each other and killing off good products.
Xbox was also a skunk works project at Microsoft (a few good books have been written about it!) and so was Microsoft Band. Xbox succeeded, Band failed for a number of reasons not related to the product or execution itself. (Politics and some historical corporate karma).
IMHO the only company good at deploying infinite resources quickly is Apple. 1 billion developing the first Apple Watch (Microsoft spent under 50 million on two generations of Band!), and then they kept going after the market, even though the first version was kinda meh. In comparison, Google Wear was on-again, off-again for years until they finally took it seriously recently. I'm sure they spent lots of $, but the end result is nowhere near what Apple pulled off.
Nobody wants to believe it, but just try compiling C++ on Windows and again in a Linux VM. Linux in a VM on the same host compiles at least twice as fast. It's insanity. I tried a script that rsyncs the project files to my server from 2013, runs the build, and rsyncs the artifacts back. Running the build on a Xeon 2500 is still far faster with Linux than Windows on my two-year-old i9. Even with the overhead of sending binaries over the internet. Absolutely disgusting.
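The script is tiny; roughly this shape (a sketch, with the host, paths, and build command as placeholders, assuming a cmake-configured build dir on the remote box):

    #!/usr/bin/env python3
    # Sync sources to the Linux box, build there, pull the artifacts back.
    import subprocess

    HOST = "buildbox"          # placeholder ssh host
    REMOTE = "~/build/proj/"   # placeholder remote path

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run("rsync", "-az", "--delete", "--exclude", "build/", "./", f"{HOST}:{REMOTE}")
    run("ssh", HOST, f"cd {REMOTE} && cmake --build build -j$(nproc)")
    run("rsync", "-az", f"{HOST}:{REMOTE}build/", "./build/")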
Move faster and move better (to move faster) are the same thing. You reduce costs by going faster, and with lean you go faster by avoiding time wasters.
And few of us can usefully compare what we do with what Amazon, Google, Facebook, or other giants do.
Good luck on flipping their script. Meanwhile I’ll be over here making book
Certainly there's no simple F(num_lines_changed) value function. There are many other parameters. But to suggest, as many here somehow do, that lines of code touched is independent of effective development is plain ludicrous.
The article uses "development" to refer only to the part where code is generated, while you are saying "development" is the process as a whole.
You both agree that latency in the real-world validation feedback loop leads to longer cycles and fewer promising solutions and that is the bottleneck.
Sonos decided they wanted to centralize their architecture so they could tap into it to make extra surveillance money. They trashed things that worked perfectly and replaced them with a cloudshit architecture that nobody asked for and that _cannot_ deliver the same low-latency, quality experience as before. They could have developed things 1000% faster; they would have just driven off a cliff sooner.
Even if people could write apps instantly, nothing would prevent them from being stupid and greedy.
Like, the initial plan always sounds great and looks great. Then it goes to actually do the changes and proclaims victory after I've left it alone doing other stuff, because it takes a while. Then I review what it did and what it didn't do, and I inevitably find that it only did half of what it said it would do, and did half of what it did do incorrectly, despite what it told me it would do.
The use case here is a large code base that needs changes. Not new feature development on a green field (or a green corner of an established product). And it's just so unbearably frustrating. It's like giving the task to a Junior on probation. I tell them something, they go off for 10 minutes and tell me they're done and I look and find seven holes I need to tell them to fix. But they aren't the Junior that picks up stuff and gets better and needs less supervision. Instead it seems like the context gets more and more polluted and the Junior gets closer and closer to failing his probation.
Many grey hairs added recently, because yeah, we also "have to be faster by using AI" now ...
Like, I gave it access to our code base, wanted to try a very simple bug fix. I only told it to look at one service I knew needed changes, because it says it works better in smaller code bases. It wanted to send so many tokens to sonnet that I hit the limits before it even started actually doing any coding.
Instant fail.
Then I just ran Claude Code, gave it the same instructions and I had a mostly working fix in a few minutes (never mind the other fails with Claude I've had - see other comment), but Aider was a huge disappointment for me.
I mean, I think ultimately the state space in designing a feature is way smaller than, say, go (the game). Maybe a few hundred common patterns and maybe a billion reasonable ways to combine them. I think it's only a matter of time before we ask it to design a feature, and it produces five options that are all better than what we'd have come up with.
Thank you for articulating something I knew but haven't been able to express as eloquently.
It frustrates me to no end to watch half a dozen non-technical bureaucrats argue for days about something that can be tried (and discarded) in a few hours with zero consequences.
"Let's write a position paper so that everyone involved can agree before we do anything."
Noooo! Just do it! See if it works in practice! Validate the marketing! Kick the tyres! Go for a test drive. Just. Get. Behind. The. Wheel.
Strange, I'd been more of the impression that this is an argument from pro vibe-coders. As more data comes in, the "productivity increases" of AI are not showing up as expected. So as people question, how come things are not getting done faster even though you say you are 10x faster at coding? The vibe-coders answer by saying that coding isn't the bottleneck, as opposed to capitulating and saying that maybe they're not that much faster at coding after-all.
How many dinners a day can you have?
You would still rely on alternative proxies, like recommendations or reviews.
I think there's another issue, but it could relate to your first two statements here. Even to try ideas, to explore the space of solutions, you need to have ideas to try. When entering development, you need clarity on what you're trying. It's very hard to make decisions on even a single attempt. I see engineers working a task the entire time simply not sure what the task is really about.
And in a way, the coding agents need even more clarity in what you ask of them to deliver good results.
So even inside of what we consider "development" or "coding", the bottleneck is often: "what am I supposed to do here?" and not so much "I don't know how to do this" or "I have so much to implement".
This becomes obvious as well once you throw in more engineers and you can't break up the work, because you have no clue what so many people could even all do. Knowing what all the needed tasks even are is hard and a big bottleneck.
> If you give them a task and spell it out, they can knock out code for it at a really good pace and wow upper management.
This is so true. I sometimes spend entire days, even weeks, where all I do is provide those types of engineers the clarity to "unblock" them. Yet I always wonder: if I had just spent that time coding myself, I might have gotten more done.
But it's also this that I think bottlenecks development. The set of people who really know what needs to be done, at the level of detail that these developers will need to be told, or that coding agents will need to be told, is very small, and that's your bottleneck: you have like 1 or 2 devs on a project who know what to do, and everyone else needs a Standard Operating Procedure handed to them for every task. And now everyone is always just waiting on the same 2 devs to tell them what to do.
Asking an LLM for code and then reading/reviewing it is a huge speedup for me in a lot of cases, compared to when I would need to write the same thing myself (but I agree it may not work well in big/complex systems... yet).
And I'm with you with a critical view on their all-in move toward AI. It's just what all the VCs do, and it's hard to say who's parroting who in this setup (I think that others are parroting YC, but feel free to challenge me on that).
Having said all that, I wouldn't be surprised if a couple of companies from this year's cohort made it big. If you look at YC's biggest successes year by year, you will often (but not always) find a household name.
Was there anyone who predicted these would be the greatest hits? Of course not! That's the whole point of having an investment portfolio. You can be wrong a lot of times if you secure an early investment in a unicorn every other year or so.
Also, "one recent example" of poor investment decision doesn't invalidate 2 decades of rather successful investment portfolios (as a whole, not individually).
In no way is it a YC defense. I'm very critical of the whole startup funding ecosystem, and they are a prominent player. Yet, if they were consistently stupid with their decisions, they wouldn't exist, let alone be the most desired accelerator out there.
Also, if it's that simple to copy what they do and what the companies in their portfolio do, why wouldn't Google et al. take their almost infinite funds and get the competing offers for non-BS ideas up and running in no time?
I bet that if you had an idea that could pay off thousandfold, you'd get enough eager ears to hear you out in any big tech. And still, it's the makeshift mass of startups that come through with new products.
One has to wonder why things like Shopify, Stripe, Zapier, or Figma did not come from the big tech. Each would have an ideal match. Even if you look at the AI landscape, how come Lovable made such a career? After all, they repackage the AI capabilities rented elsewhere. Somehow, with all the ingenuity of building ChatGPT, OpenAI and the rest didn't get it.
And that's only the things that they have released. I'd bet that there are lots more that never make it to the public.
And I expect no less from Microsoft, by the way. Microsoft is, in fact, a great case in point of how failed releases don't hurt the company's PR long-term. How many failures have they scored trying to catch up with the missed opportunities of the 2000s? Smartphones & tablets, search, music players, social media.
They were late to move the Office to the cloud, and kept pumping dollars into the Explorer/Edge lost cause, too.
I don't know enough details, but Xbox seems more like an outlier than a norm.
Yet they rebounded with Azure and made some good bets with AI, and are doing better than ever. However, we don't see a stream of new product bets coming from them.
Oh, and on Apple: I wouldn't discount the role of cult-like following in repeated product success. None of the other big techs has such a relationship with its user base. You don't see many raving fans of Facebook or Google. And you definitely have millions of people who would buy any new Apple product simply because it is a new Apple product.
It's like Joel Spolsky but on a global scale. In the 2000s, whatever Joel Spolsky touched turned into gold. Stack Overflow? Check. Trello? Check. Was there something unique about these products? Details, sure. But the biggest thing was Joel's leverage.
Having run a highly popular blog for developers, he could instantly reach out to his early adopters. Given that many of the readers were actual fans, they'd jump on the opportunity, whatever it was. So the early traction was not a problem (which was especially crucial for the developers' forum).
Scale that up to the big tech context, and you get Steve Jobs.
A side note: I wonder how long it will take Tim Cook to dismantle that. You can already see cracks.
The tools are absolutely useful, but they need to be applied in the right places, and they are decidedly not a silver bullet or general-purpose software engineering tool in the manner they're being billed at present. We still use them despite our findings, but we use them judiciously and where they actually help.
It would be good to define what "smaller code bases" means. Here is what I am working on: a 10-year-old project full of legacy, consisting of about 10 services and 10 front-end projects. I've also tried it on a project similar to MUI or Mantine UI, and naturally on many smaller projects. I also tried it on a TypeScript codebase, where it failed for me (but it is hard to judge from one attempt). Lastly, I am using it on smaller projects. Overall, the question is more about the task than about code base size. If the task does not involve loading too much context, then code base size might be irrelevant.
https://medium.com/@kazeemibrahim18/the-post-ipo-performance...
I have found that spending more time thinking generally reduces the amount of failed attempts. It's amazing what "thinking hard" beforehand can do to eliminate reprioritization scrambling.
This feels wrong to me, unless we qualify the statement with: "...if you want the exact same level of understanding of it."
Otherwise, the bottleneck in development would be pull/merge request review, not writing the code to do something. But almost always, it's the other way around - someone works on a feature for 3-5 days, the pull/merge request does not really spend the same time in active review. I don't think you need the exact same level of intricate understanding over some code when reviewing it.
It's quite similar with the AI stuff, I often nitpick and want to rework certain bits of code that AI generates (or fix obvious issues with it), but using it for the first version/draft is still easier than trying to approach the issue from zero. Ofc AI won't make you consistently better, but will remove some of the friction and reduce the cognitive load.
This is a first-degree smart argument. It presents a seemingly non-obvious idea that makes sense in retrospect.
However I happen to work at the experimentation team of a hyperscaler so I have a different perspective.
First, we aren't always saturating all of the potential experiments we could be running. The reasons are different, but essentially it takes time and effort to build those experimental features. If that cost trended to 0, we could make sure to have a queue of experiments deep enough.
Also, on our side we need to do development work to support new features and products. We have a backlog long enough to keep us perpetually busy. If dev cost trended to 0, we could always be ready to provide our customers what they need.
Speaking of new products, each one our company comes up with brings extra effort to support on our side, and yet more effort to produce dozens of A/B tests to validate new functionality.
And this is not even talking about ongoing maintenance effort. Bugfixes, upgrades, etc. take a non-trivial amount of effort to keep up with.
And this is only inside our little experimentation team. What about security, reliability, scalability, efficiency... it makes me wonder if OP has experience running products at scale.
Instead I'd like to think that dropping the cost of development by orders of magnitude changes the equation of how we create products.