LLM Inevitabilism

(tomrenner.com)

delichon ◴[15 Jul 25 04:49 UTC] No.44567913[source]▶

>>44567857 (OP) #

If in 2009 you claimed that the dominance of the smartphone was inevitable, it would have been because you were using one and understood its power, not because you were reframing away our free choice for some agenda. In 2025 I don't think you can really be taking advantage of AI to do real work and still see its mass adoption as evitable. It's coming faster and harder than any tech in history. As scary as that is, we can't wish it away.

replies(17): >>44567949 #>>44567951 #>>44567961 #>>44567992 #>>44568002 #>>44568006 #>>44568029 #>>44568031 #>>44568040 #>>44568057 #>>44568062 #>>44568090 #>>44568323 #>>44568376 #>>44568565 #>>44569900 #>>44574150 #

1. afavour ◴[15 Jul 25 05:20 UTC] No.44568040[source]▶

>>44567913 #

Feels somewhat like a self-fulfilling prophecy though. Big tech companies jam “AI” in every product crevice they can find… “see how widely it’s used? It’s inevitable!”

I agree that AI is inevitable. But there’s such a level of groupthink about it at the moment that everything is manifested as an agentic text box. I’m looking forward to discovering what comes after everyone moves on from that.

replies(2): >>44568087 #>>44570793 #

2. XenophileJKO ◴[15 Jul 25 05:29 UTC] No.44568087[source]▶

>>44568040 (TP) #

We have barely extracted the value from the current generation of SOTA models. I would estimate less than 0.1% of the possible economic benefit is currently extracted, even if the tech effectively stood still.

That is what I find so wild about the current conversation and debate. I have Claude Code toiling away building my personal organization software right now that uses LLMs to take unstructured input and create my personal plans/projects/tasks/etc.

replies(1): >>44568159 #

3. WD-42 ◴[15 Jul 25 05:41 UTC] No.44568159[source]▶

>>44568087 #

I keep hearing this over and over. Some LLM toiling away coding personal side projects and utilities. Source code never shared, usually because it’s “too specific to my needs”. This is the code version of slop.

When someone uses an agent to increase their productivity by 10x in a real, production codebase that people actually get paid to work on, that will start to validate the hype. I don’t think we’ve seen any evidence of it, in fact we’ve seen the opposite.

replies(3): >>44568209 #>>44568248 #>>44570410 #

4. XenophileJKO ◴[15 Jul 25 05:50 UTC] No.44568209{3}[source]▶

>>44568159 #

:| I'm an engineer of 30+ years. I think I know good and bad quality. You can't "vibe code" good quality, you have to review the code. However it is like having a team of 20 Junior Engineers working. If you know how to steer a group of engineers, then you can create high quality code by reviewing the code. But sure, bury your head in the sand and don't learn how to use this incredibly powerful tool. I don't care. I just find it surprising that some people have such a myopic perspective.

It is really the same kind of thing.. but the model is "smarter" than a junior engineer usually. You can say something like "hmm.. I think an event bus makes sense here" and the LLM will do it in 5 seconds. The problem is that there are certain behavioral biases that require active reminding (though I think some MCP integration work might resolve most of them, but this is just based on the current Claude Code and Opus/Sonnet 4 models).

replies(4): >>44568238 #>>44568420 #>>44568553 #>>44574632 #

5. twelve40 ◴[15 Jul 25 05:56 UTC] No.44568238{4}[source]▶

>>44568209 #

> it is like having a team of 20 Junior Engineers

lol sounds like a true nightmare. Code is a liability. Faster junior coding = more crap code = more liability.

replies(1): >>44568536 #

6. enjo ◴[15 Jul 25 05:58 UTC] No.44568248{3}[source]▶

>>44568159 #

100% agree. I have so much trouble squaring my experience with the hype and the grandparent post here.

The types of tasks I have been putting Claude Code to work on are iterative changes on a medium complexity code base. I have an extensive Claude.md. I write detailed PRDs. I use planning mode to plan the implementation with Claude. After a bunch of iteration I end up with nicely detailed checklists that take quite a lot of time to develop but look like a decent plan for implementation. I turn Claude (Opus) loose and religiously babysit it as it goes through the implementation.

Less than 50% of the time I end up with something that compiles. Despite spending hundreds of thousands of tokens while Claude desperately throws stuff against the wall trying to make it work.

I end up spending as much time as it would have taken just to write it to get through this process AND then do a meticulous line by line review where I typically find quite a lot to fix. I really can't form a strong opinion about the efficiency of this whole thing. It's possible this is faster. It's possible that it's not. It's definitely very high variance.

I am getting better at pattern matching on things AI will do competently. But it's not a long list and it's not much of the work I actually do in a day. Really the biggest benefit is that I end up with better documentation because I generated all of that to try and make the whole thing actually work in the first place.

Either I am doing something wrong, the work that AI excels at looks very different than mine, or people are just lying.

replies(1): >>44568331 #

7. XenophileJKO ◴[15 Jul 25 06:13 UTC] No.44568331{4}[source]▶

>>44568248 #

1. What are your typical failures? 2. What language and domain are you working in?

I'm kind of surprised. There is certainly a locality bias and an action bias in the model by default, which can partially be mitigated by claude.md instructions (though it isn't great at following them if you have too much instruction there). This can lead to hacky solutions without additional meta-process.

I've been experimenting with different ways for the model to get the necessary context to understand where the code should live and the patterns it should use.

I have used planning mode only a little (I was just out of the country for 3 weeks and not coding, and it had only just become available before I left; it wasn't a requirement in my past experience).

The only BIG thing I want from Claude Code right now is a "Yes, and.." for accepting code edits where I can steer the next step while accepting the code.

8. WD-42 ◴[15 Jul 25 06:30 UTC] No.44568420{4}[source]▶

>>44568209 #

I use LLMs every day. They’ve made me slightly more productive, for sure. But these claims that they “are like 20 junior engineers” just don’t hold up. First off, did we already forget The Mythical Man-Month? Second, like I said, greenfield side projects are one thing. I could vibe code them all day. The large, legacy codebases at work? The ones that have real users and real consequences and real code reviewers? I’m sorry, but I just haven’t seen it work. I’ve seen no evidence that it’s working for anyone else either.

replies(1): >>44572169 #

9. alternatex ◴[15 Jul 25 06:45 UTC] No.44568536{5}[source]▶

>>44568238 #

I've never seen someone put having a high number of junior engineers in a positive light. Maybe with LLMs it's different? I've worked at companies where you would have one senior manage 3-5 juniors and the code was completely unmaintainable. I've done plenty of mentoring myself and producing quality code through other people's inexperienced hands has always been incredibly hard. I wince when I think about having to manage juniors that have access to LLMs, not to mention just LLMs themselves.

replies(1): >>44568639 #

10. OccamsMirror ◴[15 Jul 25 06:47 UTC] No.44568553{4}[source]▶

>>44568209 #

It's definitely made me more productive for admin tasks and things that I wouldn't bother scripting if I had to write it myself. Having an LLM pump out busy work like that is definitely a game changer.

When I point it at my projects though, the outcomes are much less reliable and often quite frustrating.

replies(1): >>44570290 #

11. XenophileJKO ◴[15 Jul 25 06:58 UTC] No.44568639{6}[source]▶

>>44568536 #

Ah.. now you are asking the right questions. If you can't handle 3-5 junior engineers.. then yes, you likely can't get 10-20x speed from an LLM.

However, if you can quickly read code, see and succinctly communicate the more optimal solution, you can easily 10x-20x your ability to code.

I'm beginning to believe it may primarily come down to having the vocabulary and linguistic ability to succinctly and clearly state the gaps in the code.

replies(1): >>44568936 #

12. fzeroracer ◴[15 Jul 25 07:50 UTC] No.44568936{7}[source]▶

>>44568639 #

> However if you can quickly read code, see and succinctly communicate the more optimal solution, you can easily 10x-20x your ability to code.

Do you believe you've managed to solve the most common wisdom in the software engineering industry? That reading code is much harder than writing it? If you have, then you should write up a white paper for the rest of us to follow.

Because every time I've seen someone say this, it's from someone that doesn't actually read the code they're reviewing.

replies(1): >>44569624 #

13. XenophileJKO ◴[15 Jul 25 10:05 UTC] No.44569624{8}[source]▶

>>44568936 #

Harder maybe, slower.. no.

14. liveoneggs ◴[15 Jul 25 12:07 UTC] No.44570290{5}[source]▶

>>44568553 #

https://marketoonist.com/2023/03/ai-written-ai-read.html

15. PleasureBot ◴[15 Jul 25 12:24 UTC] No.44570410{3}[source]▶

>>44568159 #

People have much more favorable interactions with coding LLMs when they are using them for greenfield projects that they don't have to maintain (i.e., personal projects). You can get 2 months of work done in a weekend and then you hit a brick wall because the code is such a gigantic ball of mud that neither you nor the LLM is capable of working on it.

Working with production code is basically jumping straight to the ball of mud phase, maybe somewhat less tangled but usually a much, much larger codebase. It's very hard to describe to an LLM what to even do since you have such a complex web of interactions to consider in most mature production code.

replies(1): >>44572855 #

16. jowea ◴[15 Jul 25 13:13 UTC] No.44570793[source]▶

>>44568040 (TP) #

Big Tech can jam X everywhere and not get actual adoption though, it's not magic. They can nudge people but can't force them to use it. And yes a lot of AI jammed everywhere is getting the Clippy reaction.

replies(1): >>44573076 #

17. throwawayoldie ◴[15 Jul 25 15:26 UTC] No.44572169{5}[source]▶

>>44568420 #

> They’ve made me slightly more productive, for sure

How are you measuring this? Are you actually saying that you _feel_ slightly more productive?

replies(1): >>44573181 #

18. XenophileJKO ◴[15 Jul 25 16:25 UTC] No.44572855{4}[source]▶

>>44570410 #

Maybe the difference is I know how to componentize mature code bases, which effectively limits the scope required for a human (or AI) to edit.

I think it is funny how people act like it is a new problem. If the AI is having trouble with a "ball of mud", don't make mud balls (or learn to carve out abstractions). This cognitive load is impacting everyone working on that codebase. Skilled engineers enable less skilled engineers to flourish by creating code bases where change is easy because the code is modular and self-contained.

I think one sad fact is many/most engineers don't have the skills to understand how to refactor mature code to make it modular. This also means they can't communicate to the AI what kind of refactoring it should make.

Without any guidance Claude will make mud balls because of two tendencies, the tendency to put code where it is consumed and the tendency to act instead of researching.

There are also some second level tendencies that you also need to understand, like the tendency to do a partial migration when changing patterns.

These tendencies are not even unique to the AI, I'm sure we have worked with people like that.

So to counteract these tendencies, just apply your same skills at reading code and understanding when an abstraction is leaky or a method doesn't align with your component boundary. Then you too can have AI building pretty good componentized code.

For example, in my current pet project I have a clear CQRS API, access control proxies, and repositories for data access. Clearly defined service boundaries.

It is easy for me to see when the AI, for example, makes a mistake like not using the data repository or access control, because it has to add an import statement and dependency that I don't want. All I have to do is nudge it in another direction.
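A minimal sketch of how that import-based check could be automated, assuming Python and hypothetical module names (the comment does not say which language or tooling the project uses). The helper `forbidden_imports` is illustrative only: it parses a source file and reports imports that cross a boundary you have declared off-limits, such as a service module importing a database driver directly instead of going through a repository.

```python
import ast


def forbidden_imports(source: str, forbidden: set[str]) -> list[str]:
    """Return imported module names in `source` that match a forbidden prefix.

    `forbidden` holds hypothetical module prefixes that code in this
    component should not depend on directly.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Match the module itself or any of its submodules.
            if any(name == p or name.startswith(p + ".") for p in forbidden):
                found.append(name)
    return sorted(found)


# Hypothetical generated service code that bypasses the repository layer.
generated = "import sqlite3\nfrom app.repositories import tasks\n"
violations = forbidden_imports(generated, {"sqlite3"})
```

Here `violations` lists the direct `sqlite3` import, which is the kind of unwanted dependency a reviewer would push back on. A check like this only catches boundary crossings that show up as imports; it says nothing about whether the abstraction itself is sound.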

19. wavemode ◴[15 Jul 25 16:40 UTC] No.44573076[source]▶

>>44570793 #

The thing a lot of people haven't yet realized is: all those AI features jammed into your consumer products aren't for you. They're for investors.

We saw the same thing with blockchain. We started seeing the most ridiculous attempts to integrate blockchain, by companies where it didn't even make any sense. But it was all because doing so excited investors and boosted stock prices and valuations, not because consumers wanted it.

20. WD-42 ◴[15 Jul 25 16:48 UTC] No.44573181{6}[source]▶

>>44572169 #

I guess I’m not measuring it, really. But I know that in the past I’d do a web search to find patterns or best practices; now the LLM is pretty good at providing that kind of stuff. My Stack Overflow usage has gone way down, for example.

21. pron ◴[15 Jul 25 18:58 UTC] No.44574632{4}[source]▶

>>44568209 #

> However it is like having a team of 20 Junior Engineers working. If you know how to steer a group of engineers, then you can create high quality code by reviewing the code.

You cannot effectively employ a team of twenty junior developers if you have to review all of their code (unless you have like seven senior developers, too).

But this isn't a point that needs to be debated. If it is true that LLMs can be as effective as a team of 20 junior developers, then we should be seeing many people quickly producing software that previously required 20 junior devs.

> but the model is "smarter" than a junior engineer usually

And it is also usually worse than interns in some crucial respects. For example, you cannot trust the models to reliably tell you what you need to know, such as difficulties they've encountered or important insights they've learned and should recognize as worth communicating.