(metr.org)

simonw ◴[10 Jul 25 17:36 UTC] No.44523442[source]▶

Here's the full paper, which has a lot of details missing from the summary linked above: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.

This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.

So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.
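
The per-issue randomization described above can be sketched in a few lines. This is only an illustration, not the study's actual procedure: the function name `assign_conditions`, the fair-coin assignment, the condition labels, and the seed are all assumptions. Only the 16 developers and roughly 15 issues each come from the description above.

```python
import random


def assign_conditions(n_developers=16, issues_per_developer=15, seed=0):
    """Hypothetical sketch: flip a fair coin per issue to allow or disallow AI.

    The real study may have used a different randomization scheme; the
    fair coin and the labels below are assumptions for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    assignments = {}
    for dev in range(n_developers):
        assignments[dev] = [
            rng.choice(["ai_allowed", "ai_disallowed"])
            for _ in range(issues_per_developer)
        ]
    return assignments


assignments = assign_conditions()
for dev, conditions in assignments.items():
    # Each developer ends up with some split of AI and no-AI issues.
    print(dev, conditions.count("ai_allowed"), conditions.count("ai_disallowed"))
```

Because the assignment is per issue rather than per developer, each developer acts as their own control, which is what makes a within-developer comparison of AI and no-AI completion times possible.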

A quarter of the participants saw increased performance, 3/4 saw reduced performance.

One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:

> However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.

My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

replies(33): >>44523608 #>>44523638 #>>44523720 #>>44523749 #>>44523765 #>>44523923 #>>44524005 #>>44524033 #>>44524181 #>>44524199 #>>44524515 #>>44524530 #>>44524566 #>>44524631 #>>44524931 #>>44525142 #>>44525453 #>>44525579 #>>44525605 #>>44525830 #>>44525887 #>>44526005 #>>44526996 #>>44527368 #>>44527465 #>>44527935 #>>44528181 #>>44528209 #>>44529009 #>>44529698 #>>44530056 #>>44530500 #>>44532151 #

ivanovm ◴[10 Jul 25 23:53 UTC] No.44526996[source]▶

>>44523442 #

I find the very popular response of "you're just not using it right" to be a big copout for LLMs, especially at the scale we see today. It's hard to think of any other major tech product where it's acceptable to shift so much blame onto the user. Typically if a user doesn't find value in the product, we agree that the product is poorly designed/implemented, not that the user is bad. But AI seems somehow exempt from this sentiment.

replies(15): >>44527074 #>>44527365 #>>44527386 #>>44527577 #>>44527623 #>>44527723 #>>44527868 #>>44528270 #>>44528322 #>>44529356 #>>44529649 #>>44530908 #>>44532696 #>>44533993 #>>44537674 #

viraptor ◴[11 Jul 25 00:05 UTC] No.44527074[source]▶

>>44526996 #

> It's hard to think of any other major tech product where it's acceptable to shift so much blame on the user.

It's completely normal in development. How many years of programming experience do you need for almost any language? How many days/weeks do you need to use debuggers effectively? How long from the first contact with version control until you get git?

I think it's the opposite actually - it's common that new classes of tools in tech need experience to use well. Much less if you're moving to something different within the same class.

replies(5): >>44527384 #>>44528359 #>>44528413 #>>44529459 #>>44530955 #

1. intended ◴[11 Jul 25 04:21 UTC] No.44528359[source]▶

>>44527074 #

> LLMs, especially at the scale we see today

The OP qualifies how the marketing cycle for this product is beyond extreme, and in its own category.

Normal people are being told to worry about AI ending the world, or all jobs disappearing.

Simply saying “the problem is the user”, without acknowledging the degree of hype and expectation setting, is irresponsible.

replies(1): >>44529050 #

2. TeMPOraL ◴[11 Jul 25 06:47 UTC] No.44529050[source]▶

>>44528359 (TP) #

AI marketing isn't extreme - not on the LLM vendor side, at least; the hype is generated downstream of it, for various reasons. And it's not the marketing that's saying "you're using it wrong" - it's other users. So, unless you believe everyone reporting good experience with LLMs is a paid shill, there might actually be some merit to it.

replies(4): >>44529194 #>>44529508 #>>44529573 #>>44538020 #

3. carschno ◴[11 Jul 25 07:11 UTC] No.44529194[source]▶

>>44529050 #

It's called grassroots marketing. It works particularly well in the context of GenAI because it is fed with esoteric and ideological fragments that overlap with common beliefs and political trends. https://en.wikipedia.org/wiki/TESCREAL

Therefore, classical marketing is less dominant, although more present at down-stream sellers.

replies(1): >>44529462 #

4. TeMPOraL ◴[11 Jul 25 07:57 UTC] No.44529462{3}[source]▶

>>44529194 #

Right. Let's take a bunch of semi-related groups I don't like, and make up an acronym for them so any of my criticism can be applied to some subset of those groups in some form, thus making it seem legitimate and not just a bunch of half-assed strawman arguments.

Also, I guess you're saying I'm a paid shill, or have otherwise been brainwashed by marketing of the vendors, and therefore my positive experiences with LLMs are a lie? :).

I mean, you probably didn't mean that, but part of my point is that you see those positive reports here on HN too, from real people who've been in this community for a while and are not anonymous Internet users - you can't just dismiss that as "grassroots marketing".

replies(1): >>44530213 #

5. intended ◴[11 Jul 25 08:04 UTC] No.44529508[source]▶

>>44529050 #

It is extreme, and on the vendor side. The OpenAI non-profit vs. for-profit saga was about profit seeking vs the future of humanity. People are talking about programming 3.0.

I can appreciate that it’s other users who are saying it’s wrong, but that doesn’t escape the point about ignoring the context.

Moreover, it’s unhelpful communication. It gives up acknowledging a mutually shared context, the natural confusion that would arise from the ambiguous, high level hype, and the actual down to earth reality.

Even if you have found a way to make it work, having someone understand your workflow can’t happen without connecting the dots between their frame of reference and yours.

replies(1): >>44530477 #

6. OccamsMirror ◴[11 Jul 25 08:17 UTC] No.44529573[source]▶

>>44529050 #

I think the relentless podcast blitz by OpenAI and Anthropic founders suggests otherwise. They're both keen to confirm that yes, in 5 - 10 years, no one will have any jobs any more. They're literally out there discussing a post employment world like it's an inevitability.

That's pretty extreme.

replies(2): >>44530181 #>>44538091 #

7. disgruntledphd2 ◴[11 Jul 25 09:36 UTC] No.44530181{3}[source]▶

>>44529573 #

Those billions won't raise themselves, you know.

More generally, these execs are talking their book, as they're in low-margin, capital-intensive businesses whose future is entirely dependent on raising a bunch more money, so hype and insane claims are necessary for funding.

Now, maybe they do sort of believe it, but if so, why do they keep hiring software engineers and other staff?

8. carschno ◴[11 Jul 25 09:39 UTC] No.44530213{4}[source]▶

>>44529462 #

> I mean, you probably didn't mean that

Correct, I think you've read too much into it. Grassroots marketing is not a pejorative term, either. Its strategy is to trigger positive reviews about your product, ideally by independent, credible community members, indeed.

That implies that those community members have motivations other than being paid. Ideologies and shared beliefs can be some of them. Being happy about the product is a prerequisite, whatever that means for the individual user.

9. pera ◴[11 Jul 25 10:20 UTC] No.44530477{3}[source]▶

>>44529508 #

It really is, for example here is a quote from AI 2027:

> By early 2030, the robot economy has filled up the old SEZs, the new SEZs, and large parts of the ocean. The only place left to go is the human-controlled areas. [...]

> The new decade dawns with Consensus-1’s robot servitors spreading throughout the solar system. By 2035, trillions of tons of planetary material have been launched into space and turned into rings of satellites orbiting the sun. The surface of the Earth has been reshaped into Agent-4’s version of utopia: datacenters, laboratories, particle colliders, and many other wondrous constructions doing enormously successful and impressive research.

This scenario prediction, which is co-authored by a former OpenAI researcher (now at Future of Humanity Institute), received almost 1 thousand upvotes here on HN and the attention of the NYT and other large media outlets.

If you read that and still don't believe the AI hype is _extreme_ then I really don't know what else to tell you.

https://news.ycombinator.com/item?id=43571851

10. patrakov ◴[11 Jul 25 23:56 UTC] No.44538020[source]▶

>>44529050 #

> And it's not the marketing that's saying "you're using it wrong" - it's other users.

No, it's the non-coding managers who vibe-coded a half-working prototype, not other users. And here, the Dunning-Kruger effect is at play - those non-coding types do not understand that AI is not working for them either.

Full disclosure: I do rely on vibe-coded jq lines in one-off scripts that will definitely not process more data after the single intended use, and this is where AI saves my time.

11. patrakov ◴[12 Jul 25 00:09 UTC] No.44538091{3}[source]▶

>>44529573 #

This was present (in a positive way, though) even in Soviet films for children.

    Позабыты хлопоты,
    Остановлен бег,
    Вкалывают роботы,
    Счастлив человек!

    Worries forgotten,
    The treadmill doesn't run,
    Robots are working,
    Humans have fun!

Measuring the impact of AI on experienced open-source developer productivity