
688 points dheerajvs | 21 comments
1. pera ◴[] No.44524261[source]
Wow, these are extremely interesting results, especially this part:

> This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.

I wonder what could explain such a large difference between estimation/experience vs reality, any ideas?

Maybe our brains are measuring mental effort and distorting our experience of time?

replies(7): >>44524872 #>>44524974 #>>44525239 #>>44525349 #>>44528508 #>>44528626 #>>44530564 #
2. alfalfasprout ◴[] No.44524872[source]
I would speculate that it's because there's been a huge concerted effort to make people want to believe that these tools are better than they are.

The "economic experts" and "ml experts" are in many cases effectively the same group-- companies pushing AI coding tools have a vested interest in people believing they're more useful than they are. Executives take this at face value and broadly promise major wins. Economic experts take this at face value and use this for their forecasts.

This propagates further, and now novices and casual users begin to believe the hype. Eventually it moves the "baseline" expectation much higher, even for experienced engineers.

Unfortunately this is very difficult to capture empirically.

3. longwave ◴[] No.44524974[source]
I also wonder how many of the numerous AI proponents in HN comments are subject to the same effect. Unless they are truly measuring their own performance, is AI really making them more productive?
replies(1): >>44526704 #
4. chamomeal ◴[] No.44525239[source]
It’s funny cause I sometimes have the opposite experience. I tried to use Claude code today to make a demo app to show off a small library I’m working on. I needed it to set up some very boilerplatey example app stuff.

It was fun to watch, it’s super polished and sci-fi-esque. But after 15 minutes I felt braindead and was bored out of my mind lol

5. evanelias ◴[] No.44525349[source]
Here's a scary thought, which I'm admittedly basing on absolutely nothing scientific:

What if agentic coding sessions are triggering a similar dopamine feedback loop as social media apps? Obviously not to the same degree as social media apps, I mean coding for work is still "work"... but there's maybe some similarity in getting iterative solutions from the agent, triggering something in your brain each time, yes?

If that was the case, wouldn't we expect developers to have an overly positive perception of AI because they're literally becoming addicted to it?

replies(5): >>44525418 #>>44525471 #>>44526779 #>>44528433 #>>44532628 #
6. EarthLaunch ◴[] No.44525418[source]
> The LLMentalist Effect: how chat-based Large Language Models replicate the mechanisms of a psychic’s con

https://softwarecrisis.dev/letters/llmentalist/

Plus there's a gambling mechanic: Push the button, sometimes get things for free.

replies(1): >>44526719 #
7. csherratt ◴[] No.44525471[source]
That's my suspicion too.

My issue with calling this a 'negative' thing is that I'm not sure it is. It works off the same hunting / foraging instincts that keep us alive. If you feel addicted to something positive, is that bad?

Social media is negative because it addicts you to mostly low-quality filler content. Content that doesn't challenge you. You are reading shitposts instead of reading a book or doing something better for you in the long run.

One could argue that's true for AI, but I'm not confident enough to make such a statement.

replies(1): >>44525552 #
8. evanelias ◴[] No.44525552{3}[source]
The study found AI caused a "significant slowdown" in developer efficiency though, so that doesn't seem positive!
9. malfist ◴[] No.44526704[source]
How would you even measure your own performance? You can't redo a task while forgetting everything you learned along the way the first time.
replies(1): >>44526806 #
10. lll-o-lll ◴[] No.44526719{3}[source]
This is very interesting and disturbing. We are outsourcing our decision making to an algorithmic “Mentalist” and will reap a terrible reward. I need to wean myself off the comforting teat of the chatbot psychic.
11. jwrallie ◴[] No.44526779[source]
Like the feeling that the command line is always faster than using the GUI? Different ways we engage with a task can change our time perception.

I wish there was a simple way to measure energy spent instead of time. Maybe nature is just optimizing for something else.

12. jwrallie ◴[] No.44526806{3}[source]
You could go the same way as the study, flip a coin to use AI or not, write down the task you just did, the time you thought the task took you and the actual clock time. Repeat and self-evaluate.
replies(1): >>44527279 #
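The self-experiment jwrallie describes could be run with a small script. This is just an illustrative sketch (the log filename and field names are made up, not from the study): flip a coin per task, record your time estimate and the actual clock time, then compare mean estimation error per condition.

```python
import csv
import random
import statistics
from pathlib import Path

LOG = Path("ai_selftest.csv")  # hypothetical log file

def assign_condition(rng=random):
    """Coin flip: use AI on this task or not."""
    return "AI" if rng.random() < 0.5 else "no-AI"

def log_task(name, condition, estimated_min, actual_min, path=LOG):
    """Append one finished task to the CSV log, writing a header on first use."""
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        w = csv.writer(f)
        if is_new:
            w.writerow(["task", "condition", "estimated_min", "actual_min"])
        w.writerow([name, condition, estimated_min, actual_min])

def summarize(rows):
    """Mean estimation error (estimated - actual minutes) per condition.

    Negative values mean you took longer than you thought, i.e. the
    perception-vs-reality gap the study reports.
    """
    errors = {}
    for _task, cond, est, act in rows:
        errors.setdefault(cond, []).append(float(est) - float(act))
    return {cond: statistics.fmean(errs) for cond, errs in errors.items()}
```

For example, `summarize([("a", "AI", 30, 45), ("b", "no-AI", 30, 30)])` would show a negative mean error for the "AI" condition: you felt faster than you were.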
13. malfist ◴[] No.44527279{4}[source]
Sample size of 16 is already hard enough to draw conclusions from. Sample size of 1 is even worse.
replies(2): >>44529007 #>>44537965 #
14. coffeefirst ◴[] No.44528433[source]
This is fascinating and would go a long way to explain why people seem to have totally different experiences with the same machines.
15. fiddlerwoaroof ◴[] No.44528508[source]
I think just about every developer hack turns out this way: static vs dynamic types; keyboard shortcuts vs mice; etc. But I think it’s also possible to over-interpret these findings: using the tools that make your work enjoyable has important second-order effects even if they aren’t the productivity silver bullet everyone claims they are.
16. qingcharles ◴[] No.44528626[source]
Part of it is that I feel I don't have to put as much mental energy into the coding part. I use my mental energy on the design and ideas, then kinda breeze through the coding now with AI at a much lower mental energy state than I would have when I was typing every single character of every line.
17. JelteF ◴[] No.44529007{5}[source]
It's the most representative sample size if you're interested in your own performance though. I really don't care if other people are more productive with AI, if I'm the outlier that's not then I'd want to know.
18. rsynnott ◴[] No.44530564[source]
> I wonder what could explain such large difference between estimation/experience vs reality, any ideas?

This bit I wasn't at all surprised by, because this is _very common_. People who are doing a [magic thing] which they believe in often claim that it is improving things even where it empirically isn't; very, very common with fad diets and exercise regimens, say. You really can't trust subjects' claims of efficacy of something that's being tested on them, or that they're testing on themselves.

And particularly for LLM tools, there is this strong sense amongst many fans that they are The Future, that anyone who doesn't get onboard is being Left Behind, and so forth. I'd assume a lot of users aren't thinking particularly rationally about them.

19. hopeless ◴[] No.44532628[source]
What if agentic coding results in _less_ dopamine than manual coding? Because honestly I think that's more likely and jibes with my experience.

There's no flow state to be achieved with AI tools (at the moment)

replies(1): >>44535780 #
20. evanelias ◴[] No.44535780{3}[source]
With manual coding, the big dopamine hit comes at the end of a task - that's your internal feeling of reward for completing something.

I would think this could contrast with agentic coding, where the AI keeps generating code, and then you iterate on this process to get the AI to fix its mistakes. With normal human code review, it takes longer to get revisions and can feel like a slog. But with AI that's a much tighter loop, so maybe developers feel extra productive from all these dopamine hits from each interaction with the agent.

When manually coding and in flow state I'd think it's a more consistent level of arousal, less spiky. Probably varies by person and coding style though, which might also explain why some people love TDD and others can't stand it?

21. sarchertech ◴[] No.44537965{5}[source]
Sample of 16 is plenty if the effect is big enough.

It’s also not a sample size of 1; it’s a sample size of however many tasks you do, since if you’re trying to discern how AI impacts you, you don’t care about measuring its effect on anyone but yourself.
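The claim that n = 16 suffices for a big effect can be checked with a quick Monte Carlo power simulation of a paired t-test. This is an illustrative sketch with made-up numbers (standardized effect sizes, normal differences), not an analysis of the study's data:

```python
import random
import statistics

def paired_t_power(effect_size, n=16, alpha=0.05, trials=2000, seed=1):
    """Estimate the power of a two-sided paired t-test by simulation.

    Draws `trials` experiments of n paired differences ~ Normal(effect_size, 1)
    and counts how often |t| exceeds the critical value. The critical value
    2.131 is hard-coded for df = 15 (n = 16) at alpha = 0.05.
    """
    rng = random.Random(seed)
    crit = 2.131  # two-sided t critical value for df = 15, alpha = 0.05
    hits = 0
    for _ in range(trials):
        diffs = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        t = statistics.fmean(diffs) / (statistics.stdev(diffs) / n ** 0.5)
        if abs(t) > crit:
            hits += 1
    return hits / trials
```

With these assumptions, a large standardized effect (around 1.0) is detected almost every time at n = 16, while a small effect (around 0.1) is almost never detected, which is the commenter's point: 16 participants is plenty for big effects and hopeless for small ones.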