Measuring the impact of AI on experienced open-source developer productivity

Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.
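For anyone who wants to picture that design, here's a rough sketch of per-issue randomization. This is my own illustration, not the paper's procedure: the fair coin flip per issue, the names `assign_conditions`, `ai_allowed` and `ai_disallowed`, and the made-up issue labels are all assumptions.

```python
import random

def assign_conditions(issues, rng):
    """Independently assign each issue to one of two conditions.

    Assumption: a fair coin flip per issue. The study's actual
    randomization scheme may differ.
    """
    return {issue: rng.choice(["ai_allowed", "ai_disallowed"]) for issue in issues}

rng = random.Random(0)  # fixed seed only so the sketch is reproducible

# 16 participants with about 15 issues each (hypothetical labels).
developers = [f"dev{i}" for i in range(1, 17)]
assignments = {
    dev: assign_conditions([f"{dev}-issue{j}" for j in range(1, 16)], rng)
    for dev in developers
}
```

Because the rule is drawn per issue rather than per developer, each developer ends up acting as their own control.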

A quarter of the participants saw increased performance; three quarters saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is steep enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

I don't even think we know how to do it yet. I revise my whole attitude and all of my beliefs about this stuff every week: I figure out things that seemed really promising don't pan out, I find stuff that I kick myself for not realizing sooner, and it's still this high-stakes game. I still blow a couple of days and wish I had just done it the old-fashioned way, and then I'll catch a run where it's like, fuck, I was never that good, that's the last 5-10% that breaks a PB.

I very much think that these things are going to wind up being massive amplifiers for people who were already extremely sophisticated and then put massive effort into optimizing them and combining them with other advanced techniques (formal methods, top-to-bottom performance orientation).

I don't think this stuff is going to democratize software engineering at all. I think it's going to take the difficulty level so high that it's like back when Dijkstra or Tony Hoare was a fairly typical computer programmer.