←back to thread

690 points dheerajvs | 1 comments | | HN request time: 0.339s | source
Show context
simonw ◴[] No.44523442[source]
Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" v.s. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.

A quarter of the participants saw increased performance, 3/4 saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learing curve.

replies(33): >>44523608 #>>44523638 #>>44523720 #>>44523749 #>>44523765 #>>44523923 #>>44524005 #>>44524033 #>>44524181 #>>44524199 #>>44524515 #>>44524530 #>>44524566 #>>44524631 #>>44524931 #>>44525142 #>>44525453 #>>44525579 #>>44525605 #>>44525830 #>>44525887 #>>44526005 #>>44526996 #>>44527368 #>>44527465 #>>44527935 #>>44528181 #>>44528209 #>>44529009 #>>44529698 #>>44530056 #>>44530500 #>>44532151 #
furyofantares ◴[] No.44523765[source]
> My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

I totally agree with this. Although also, you can end up in a bad spot even after you've gotten pretty good at getting the AI tools to give you good output, because you fail to learn the code you're producing well.

A developer gets better at the code they're working on over time. An LLM gets worse.

You can use an LLM to write a lot of code fast, but if you don't pay enough attention, you aren't getting any better at the code while the LLM is getting worse. This is why you can get like two months of greenfield work done in a weekend but then hit a brick wall - you didn't learn anything about the code that was written, and while the LLM started out producing reasonable code, it got worse until you have a ball of mud that neither the LLM nor you can effectively work on.

So a really difficult skill in my mind is continually avoiding temptation to vibe. Take a whole week to do a month's worth of features, not a weekend to do two month's worth, and put in the effort to guide the LLM to keep producing clean code, and to be sure you know the code. You do want to know the code and you can't do that without putting in work yourself.

replies(4): >>44524369 #>>44524418 #>>44524779 #>>44525480 #
bluefirebrand ◴[] No.44524418[source]
> Take a whole week to do a month's worth of features

Everything else in your post is so reasonable and then you still somehow ended up suggesting that LLMs should be quadrupling our output

replies(1): >>44524826 #
furyofantares ◴[] No.44524826[source]
I'm specifically talking about greenfield work. I do a lot of game prototypes, it definitely does that at the very beginning.
replies(2): >>44524954 #>>44525479 #
bluefirebrand ◴[] No.44524954[source]
Greenfield is still such a tiny percentage of all software work going on in the world though :/
replies(2): >>44525006 #>>44525398 #
1. Filligree ◴[] No.44525398[source]
It’s a tiny percentage of software work because the programming is slow, and setting up new projects is even slower.

It’s been a majority of my projects for the past two months. Not because work changed, but because I’ve written a dozen tiny, personalised tools that I wouldn’t have written at all if I didn’t have Claude to do it.

Most of them were completed in less than an hour, to give you an idea of the size. Though it would have easily been a day on my own.