
688 points dheerajvs | 1 comment
simonw No.44523442
Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.
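
As a rough illustration of that per-issue design (a sketch, not the paper's actual procedure): the snippet below assumes an independent 50/50 draw for each issue, which the summary above does not specify. The function name, the condition labels, and the issue IDs are all hypothetical.

```python
import random


def assign_conditions(issue_ids, seed=0):
    """Randomly label each issue as AI-allowed or AI-disallowed.

    Assumption: an independent, fair coin flip per issue. The real study
    may have randomized differently.
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    return {
        issue: rng.choice(["ai_allowed", "ai_disallowed"])
        for issue in issue_ids
    }


# Hypothetical developer with 15 issues (the study had about 15 each).
issues = [f"issue-{n}" for n in range(1, 16)]
assignment = assign_conditions(issues)

for issue, condition in assignment.items():
    print(issue, condition)
```

Because the draw happens per issue rather than per developer, every developer acts as their own control, which is what allows a per-developer comparison of AI and no-AI tasks.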

A quarter of the participants saw increased performance, 3/4 saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is steep enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

rafaelmn No.44524515
>My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

Are we still selling the "you are an expert senior developer" meme? I can completely see how, once you are working on a mature codebase, LLMs would only slow you down. Especially one that was not created by an LLM and where you are the expert.

1. bicx No.44524598
I think it depends on the kind of work you're doing, but I use it on mature codebases where I am the expert, and I heavily delegate to Claude Code. Because I know the codebase well, I know exactly how to specify a task I need performed. I set it to work on one task, then I monitor it while personally starting on other work.

I think LLMs shine when you need to write a higher volume of code that extends a proven pattern, quickly explore experiments that require a lot of boilerplate, or have multiple smaller tasks that you can hand to multiple agents in parallel. I've also had success in using LLMs to do a lot of external documentation research in order to integrate findings into code.

If you are fine-tuning an algorithm or doing domain-expert-level tweaks that require a lot of contextual input-output expert analysis, then you're probably better off just coding on your own.

Context engineering has been mentioned a lot lately, but it's not a meme. It's the real trick to successful LLM agent usage. Good context documentation, guides, and well-defined processes (just like with a human intern) will mean the difference between success and failure.