
548 points by kmelve | 1 comment
spicyusername ◴[] No.45114584[source]
I guess we're just going to be in the age of this conversation topic until everyone gets tired of talking about it.

Every one of these discussions boils down to the following:

- LLMs are not good at writing code on their own unless it's extremely simple or boilerplate

- LLMs can be good at helping you debug existing code

- LLMs can be good at brainstorming solutions to new problems

- The code written by LLMs always needs to be carefully reviewed for correctness, style, and design, and then typically edited down, often to half its original size or less

- LLMs' utility is high enough that they are now going to be a standard tool in the toolbox of every software engineer, but they are definitely not replacing anyone at current capability.

- New software engineers are going to suffer the most because they know least how to edit the responses, but this was also true when they wrote their own code with Stack Overflow.

- At senior level, sometimes using LLMs is going to save you a ton of time and sometimes it's going to waste your time. Net-net, it's probably positive, but there are definitely some horrible days where you spend too long going back and forth, when you should have just tried to solve the problem yourself.

sunir ◴[] No.45115537[source]
All true if you one-shot the code.

If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously.

Based on my setup as of today, I’d imagine that by sometime next year that will be normal, and then the conversation will be very different; mostly around cost control. I wouldn’t be surprised if there is a breakout popular agent control flow language by next year as well.

The net is that unsupervised AI engineering isn’t really cheaper, better, or faster than human engineering right now. Does that mean in two years it will be? Possibly.

There will be a lot of optimizations in the message traffic, token usage, foundational models, and also just Moore’s-law improvements in hardware and energy costs.

But really it’s the sophistication of the agent systems that controls quality more than anything. Simply following waterfall (I know, right? Yuck… but it worked) increased code quality tremendously.

I also gave it the SelfDocumentingCode pattern language that I wrote (on WikiWikiWeb) as a code review agent, and quality improved tremendously again.

zarzavat ◴[] No.45119020[source]
> If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously.

Just an hour ago I asked Claude to find bugs in a function and it found 1 real bug and 6 hallucinated bugs.

One of the "bugs" it wanted to "fix" was to revert a change that I had made previously to fix a bug in code it had written.

I just don't understand how people burning tokens on sophisticated multi-agent systems are getting any value from that. These LLMs don't know when they are doing something wrong, and throwing more money at the problem won't make them any smarter. It's like trying to build Einstein by hiring more and more schoolkids.

Don't get me wrong, Claude is a fantastic productivity boost but letting it run around unsupervised would slow me down rather than speed me up.