Measuring the impact of AI on experienced open-source developer productivity

(metr.org)


simonw ◴[10 Jul 25 17:36 UTC] No.44523442[source]▶

Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.

A quarter of the participants saw increased performance, 3/4 saw reduced performance.
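To make the design described above concrete, here is a minimal sketch of per-issue random assignment and a naive per-developer speedup ratio. It is an illustration only, not the paper's analysis method: the function names, the coin-flip assignment, the ratio-of-means metric, and all the timing numbers are assumptions invented for this example.

```python
import random
from statistics import mean

def assign_conditions(issue_ids, seed=0):
    """Randomly label each issue as AI-allowed or AI-disallowed (one coin flip per issue)."""
    rng = random.Random(seed)
    return {issue: rng.choice(["ai", "no_ai"]) for issue in issue_ids}

def speedup(times_by_condition):
    """Ratio of mean no-AI time to mean AI time; a value above 1 means AI was faster."""
    return mean(times_by_condition["no_ai"]) / mean(times_by_condition["ai"])

# Hypothetical completion times in minutes, made up purely for illustration.
developers = {
    "dev_a": {"ai": [50, 70], "no_ai": [40, 60]},
    "dev_b": {"ai": [30, 40], "no_ai": [45, 60]},
}

assignments = assign_conditions(range(15))
results = {name: speedup(times) for name, times in developers.items()}
faster_with_ai = [name for name, s in results.items() if s > 1]
```

With so few issues per developer, a single unusually long task can flip the sign of a ratio like this, which is one reason per-developer results from a study of this size are hard to interpret.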

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.


eightysixfour ◴[10 Jul 25 21:37 UTC] No.44525887[source]▶

>>44523442 #

I have been teaching people at my company how to use AI code tools. The learning curve is way worse for developers, and I have had to come up with some exercises to try to break through the curve. Some seemingly can’t get it.

The short version is that devs want to give instructions instead of asking for the outcome they want. When it doesn’t follow the instructions, they double down by being more precise, which is the worst thing you can do. When non-devs don’t get what they want, they add more detail to the description of the desired outcome.

Once you get past the control problem, then you have a second set of issues for devs where the things that should be easy or hard don’t necessarily map to their mental model of what is easy or hard, so they get frustrated with the LLM when it can’t do something “easy.”

Lastly, devs keep a shitload of context in their heads - the project, what they are working on, application state, etc. - and they need to do that for LLMs too, but they have to repeat themselves often and “be” the external memory for the LLM. Most devs I have taught hate that; they would actually rather have it the other way around, where they get help with context and state but instruct the computer on their own.

Interestingly, the best AI assisted devs have often moved to management/solution architecture, and they find the AI code tools brought back some of the love of coding. I have a hypothesis they’re wired a bit differently and their role with AI tools is actually closer to management than it is development in a number of ways.

replies(2): >>44526796 #>>44527055 #

1. rester324 ◴[11 Jul 25 00:02 UTC] No.44527055[source]▶

>>44525887 #

> Interestingly, the best AI assisted devs have often moved to management/solution architecture, and they find the AI code tools brought back some of the love of coding

This suggests to me, though, that they are bad at coding; otherwise they would have stayed longer. And I can't find anything in your comment that would corroborate the opposite. So what gives?

I am not saying what you say is untrue, but you didn't give us any convincing arguments to believe otherwise.

Also, you didn't define the criteria of getting better. Getting better in terms of what exactly???

replies(2): >>44527914 #>>44528565 #

2. eightysixfour ◴[11 Jul 25 02:32 UTC] No.44527914[source]▶

>>44527055 (TP) #

> This suggests to me, though, that they are bad at coding; otherwise they would have stayed longer.

Or they care about producing value, not just the code, and realized they had more leverage and impact in other roles.

> And I can't find anything in your comment that would corroborate the opposite.

I didn’t try to corroborate the opposite.

Honestly, I don’t care about the “best coders.” I care about people who do their job well. Sometimes that is writing amazing code, but most of the time it isn’t. I don’t have any devs in my company who work in a magical vacuum where they are handed perfectly written tasks, they complete them, and then they do the next one.

If I did, I could replace them with AI faster.

> Also, you didn't define the criteria of getting better. Getting better in terms of what exactly?

Delivery velocity - bug fixes, features, etc. that pass testing/QA and go to prod.

replies(1): >>44528219 #

3. rester324 ◴[11 Jul 25 03:47 UTC] No.44528219[source]▶

>>44527914 #

> Honestly, I don’t care about the “best coders.”

> Interestingly, the best AI assisted devs have often moved to management/solution architecture

Is it just me? Or does it seem to others as well that you pretty much rank these people even at the moment and your first comment contradicts your second comment? Especially when you admit that you rank them based on velocity.

I am not saying you shouldn't do that, but it feels to me like rating road construction workers on the number of potholes fixed, even though it's very possible that the potholes are caused by the sloppy work to begin with.

Not what I would want to do.

replies(1): >>44528847 #

4. qingcharles ◴[11 Jul 25 05:05 UTC] No.44528565[source]▶

>>44527055 (TP) #

I'm not bad at coding. I would say I'm pretty damned good. But coding is a means-to-an-end. I come up with an idea, then I have the long-winded middle bit where I have to write all the code, spin up a DB, create the tables, etc.

LLMs have given me a whole new love of coding, getting rid of the dull grind and letting me write code an order of magnitude quicker than before.

5. eightysixfour ◴[11 Jul 25 06:07 UTC] No.44528847{3}[source]▶

>>44528219 #

> Is it just me? Or does it seem to others as well that you pretty much rank these people even at the moment and your first comment contradicts your second comment?

I think you are reading what you want to read and not what I said, so yes it is you. The most productive, valuable people with developer titles in my organizations are not the ones who write the cleanest, most beautiful, most perfect code. They do all of the other parts of the job well and write solid code.

Following the introduction of AI tools, many of the people in my organization who most effectively learned to use those tools are people who previously chose to move to manager and SA roles.

Not only are these not contradictory, they fit quite well together. People who do the things around coding well, but maybe have to work hard at writing the actual code, are better at using the AI tools than exceptional coders. For my organization, the former are generally more valuable than the latter without AI, and that is increasing as a result of AI.

> I am not saying you shouldn't do that, but it feels to me like rating road construction workers on the number of potholes fixed, even though it's very possible that the potholes are caused by the sloppy work to begin with.

Not if your measurement includes quality testing the pothole repairs, which mine does, as I explicitly called out. I work in industries with extensive, long testing cycles, so we are (imperfectly, of course) able to measure productivity based on things which make it through those cycles.

You are trying very hard to find ways to ignore what I am saying. It is fine if you don’t want to believe me, but these things have been true based on our observations:

A. Great “coders” have a much harder time picking up AI dev tools and using them effectively, and when they see how others use them they will admit that isn’t how they use them. They will revert to their previous habits and give up on the tools.

B. The productivity gains for the people who are good at using the tools, as measured by velocity with a minimum bar for quality (with substantial QA), are very high.

C. We have measured these things to thoroughly understand the ROI and we are accelerating our investment in AI coding tools as a result.

Some caveats I am absolutely willing to make - we are not working on bleeding edge tech doing things no one has ever done before.

We failed to effectively use AI many times before we started to get it right.

There are developers who are slower with the AI code tools than without them.

replies(1): >>44528986 #

6. rester324 ◴[11 Jul 25 06:35 UTC] No.44528986{4}[source]▶

>>44528847 #

I am not convinced.

If what you write were true, then the bug rate of those incredible devs would simply fall to zero at some point, and at that point they would become legends who we all would have heard of by now. So the whole story sounds too fishy for my taste.

It's OK if you want to manage your team this way. Everyone needs some external feedback to confirm their own bias. It seems you found yours and it works for you.

It's just not a good argument in support of AI or AI assisted development.

It's too anecdotal.

And since you are the one who is telling me that you are right, and not others, it makes me even more skeptical about the whole story.
