
688 points dheerajvs | 2 comments
simonw No.44523442
Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.

A quarter of the participants saw increased performance; three quarters saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

ivanovm No.44526996
I find the very popular response of "you're just not using it right" to be a big cop-out for LLMs, especially at the scale we see today. It's hard to think of any other major tech product where it's acceptable to shift so much blame on the user. Typically if a user doesn't find value in the product, we agree that the product is poorly designed/implemented, not that the user is bad. But AI seems somehow exempt from this sentiment.
1. DanielVZ No.44529356
> It's hard to think of any other major tech product where it's acceptable to shift so much blame on the user.

Sorry to be pedantic, but this is really common in tech products: vim, emacs, any second-brain app, IDEs (whose effectiveness depends on learning their features), git, and more.

2. ndsipa_pomu No.44529405
Well, surely vim is easy to use - I started it and haven't stopped using it yet (one day I'll learn how to exit).