
691 points dheerajvs
simonw ◴[] No.44523442[source]
Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.
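
For concreteness, here's a minimal sketch of that within-subject randomized design (hypothetical names and values, not the study's actual code):

    import random

    def assign_conditions(issues, seed=0):
        # Randomly assign each issue to an "AI allowed" or "no AI" condition,
        # so every developer works under a mix of both (illustrative only).
        rng = random.Random(seed)
        return {issue: rng.choice(["AI", "no-AI"]) for issue in issues}

    # Each of the 16 developers worked on roughly 15 issues.
    issues = [f"issue-{i}" for i in range(15)]
    print(assign_conditions(issues))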

A quarter of the participants saw increased performance; three quarters saw reduced performance.

One of the top performers with AI was also the developer with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve for AI-assisted development is steep enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.
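
To make "speedup" concrete: one simple way to read it is as the ratio of a developer's completion times without AI to their times with AI (a hedged sketch with made-up numbers, not the paper's actual estimator):

    from statistics import mean

    def speedup(no_ai_hours, ai_hours):
        # Ratio > 1 means the developer was faster with AI; < 1 means slower.
        return mean(no_ai_hours) / mean(ai_hours)

    # Hypothetical completion times (hours) for one developer.
    print(speedup(no_ai_hours=[2.0, 3.5, 1.5], ai_hours=[1.8, 2.5, 1.2]))  # ~1.27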

replies(33): >>44523608 #>>44523638 #>>44523720 #>>44523749 #>>44523765 #>>44523923 #>>44524005 #>>44524033 #>>44524181 #>>44524199 #>>44524515 #>>44524530 #>>44524566 #>>44524631 #>>44524931 #>>44525142 #>>44525453 #>>44525579 #>>44525605 #>>44525830 #>>44525887 #>>44526005 #>>44526996 #>>44527368 #>>44527465 #>>44527935 #>>44528181 #>>44528209 #>>44529009 #>>44529698 #>>44530056 #>>44530500 #>>44532151 #
grey-area ◴[] No.44524005[source]
Well, there are two possible interpretations here of 75% of participants (all of whom had some experience using LLMs) being slower when using generative AI:

1. LLMs have a very steep and long learning curve, as you posit (though note the points from the paper authors in the other reply).

2. Current LLMs are simply not as good a programming assistant as they are sold to be, and people consistently predict and self-report in the wrong direction on how useful they are.

replies(6): >>44524525 #>>44524552 #>>44525186 #>>44525216 #>>44525303 #>>44526981 #
atiedebee ◴[] No.44525216[source]
Let me offer a third (not necessarily true) interpretation:

The developer who has experience using Cursor saw a productivity increase not because he became better at using Cursor, but because he became worse at not using it.

replies(2): >>44525343 #>>44530391 #
card_zero ◴[] No.44525343[source]
Or, one person in 16 has a particular personality, inclined to LLM dependence.
replies(2): >>44525736 #>>44526281 #
runarberg ◴[] No.44525736[source]
Invoking personality in the behavioral sciences is like invoking God in the natural sciences: one can explain anything by appealing to personality, and as such it explains nothing. Psychologists have been trying to make sense of personality for over a century without much success (the best effort so far is the five-factor model, the "Big 5", which ultimately has fairly minor predictive value), which is why most behavioral scientists have learned to simply leave personality to the philosophers and concentrate on much simpler theoretical frameworks.

A much simpler explanation is the one your parent offered. And to many behavioralists it is actually the same explanation, since to a true scotsm... [cough] behavioralist, personality is simply learned habits, so, by Occam's razor, you should omit personality from your model.

replies(2): >>44525860 #>>44528801 #
suddenlybananas ◴[] No.44528801[source]
Behaviorism is a relic of the 1950s.
replies(1): >>44535677 #
runarberg ◴[] No.44535677[source]
Not really a relic. Reinforcement learning is one of the best models of learned behavior we have. In the 1950s, however, cognitive science didn't exist, and behavioralists thought they could explain much more with their model than they actually could, so they oversold the idea, by a lot.
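
As a minimal illustration of reinforcement learning as a model of learned behavior, here is a Rescorla-Wagner-style prediction-error update (an illustrative sketch, nothing from the study or this thread):

    def update_value(value, reward, learning_rate=0.1):
        # Move the estimated value toward the observed reward by a fraction
        # of the prediction error (reward - value).
        return value + learning_rate * (reward - value)

    # A repeatedly rewarded action gains value, i.e. a habit forms.
    v = 0.0
    for _ in range(20):
        v = update_value(v, reward=1.0)
    print(round(v, 3))  # 0.878, approaching 1.0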

Cognitive science was able to explain things like biases, pattern recognition, language, etc., which behavioral science thought it could explain but couldn't. In the 1950s behaviorism was really the only game in town (except for psychometrics, which failed in a much more complete, albeit less spectacular, way than behaviorism), so understandably scientists (and philosophers) went a little overboard with it (kind of like evolutionary biology did in the 1920s).

I think a fairer viewpoint is that behaviorism's heyday in the 1950s has passed, but that it still provides an excellent theoretical framework for some human behavior and, along with cognitive science, is able to explain most of what we know about human behavior.