
688 points | dheerajvs
simonw No.44523442
Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants with a mix of previous exposure to AI tools: 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.

A quarter of the participants saw increased performance, 3/4 saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is steep enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

ivanovm No.44526996
I find the very popular response of "you're just not using it right" to be a big copout for LLMs, especially at the scale we see today. It's hard to think of any other major tech product where it's acceptable to shift so much blame on the user. Typically if a user doesn't find value in the product, we agree that the product is poorly designed/implemented, not that the user is bad. But AI seems somehow exempt from this sentiment.
sanderjd No.44527577
> It's hard to think of any other major tech product where it's acceptable to shift so much blame on the user.

Maybe, but it isn't hard to think of developer tools where this is the case. This is the entire history of editor and IDE wars.

Imagine running this same study design with vim. How well would you expect the not-previously-experienced developers to perform in such a study?

fingerlocks No.44528674
No one is claiming 10x perf gains in vim.

It’s just a fun geeky thing to use with a lot of zany customizations. And after two hellish years of memory muscling enough keyboard bindings to finally be productive, you earned it! It’s a badge of pride!

But we all know you’re still fat fingering ggdG on occasion and silently cursing to yourself.

TeMPOraL No.44529110
> No one is claiming 10x perf gains in vim.

Sure they are, or at least were, until the last couple of years. Same thing with Emacs.

It's hard to claim this now, because the entire industry shifted towards webshit and cloud-based practices across the board, and the classical editors just can't keep up with VS Code. Despite the latter introducing LSP, which leveled the playing field wrt. code intelligence itself, the surrounding development process and the ecosystem increasingly demand you use web-based or web-derived tools and practices, which all see a browser engine as a basic building block. Classical editors can't match the UX/DX on that, plus the whole thing breaks basic assumptions about UI that were the source of the "10x perf gains" in vim and Emacs.

Ironically, a lot of the perf gains from AI come from letting you avoid dealing with the brokenness of the current tools and processes, which vim and Emacs are not equipped to handle.

fingerlocks No.44529753
Yeah, I’m in my 40s and have been using vim for decades. Sure, there was an occasional rando stirring up the forums about made-up productivity gains to get some traffic to their blog, but that was it. There has always been pushback from many of the strongest vim advocates that the appeal is not about typing speed or whatever it was they were claiming. It’s just ergonomics and power.

It’s just not comparable to the LLM crazy hype train.

And to belabor your other point, I have treesitter, lsp, and GitHub Copilot agent all working flawlessly in neovim. Ts and lsp are neovim builtins now. And it’s custom built for exactly how I want it to be, and none of that blinking shit or nagging dialog boxes all over VSCode.

I have VSCode and vim open to the same files all day, quite literally side by side, because I work at Microsoft, share my screen often, and there are still people who have violent allergic reactions to a terminal and vim. Vim can do everything VSCode does and it’s not dogshit slow.

Imustaskforhelp No.44532161
I am really curious what your thoughts on zed are. It has a lot of features and is still mostly vim-compatible (from what I know), so you get the same ergonomics and power, but it has sane defaults and I don't need to tinker with it as much as I would with nvim.

It's not that I don't like tinkering. I really enjoy tinkering with config files, but I never could understand nvim personally, since I usually want an LSP / good-enough experience that nvim, LunarVim, etc. couldn't provide without me installing additional software.

fingerlocks No.44536223
I haven’t tried zed and I’m getting old and set in my ways. If it ain’t broke don’t fix it and all that.

So if the claim is that I can get everything I have out of vim, most importantly being unbeatably fast text buffers, and I don’t need a suitcase full of config files, that’s very compelling.

Is that the promise of zed?