
548 points kmelve | 2 comments
rhubarbtree ◴[] No.45112846[source]
Does anyone have a link to a video that uses Claude Code to produce clean robust code that solves a non trivial problem (ie not tic tac toe or a landing page) more quickly than a human programmer can write? I don’t want a “demo”, I want a livestream from an independent programmer unaffiliated with any AI company and thus not incentivised to hype.

I want the code to have subsequently been deployed in production and demonstrably robust, without additional work outside of the livestream.

The livestream should include code review, test creation, testing, PR creation.

It should not be on a greenfield project, because nearly all coding is not.

I want to use Claude and I want to be more productive, but my experience to date is that for writing code beyond autocomplete AI is not good enough and leads to low quality code that can’t be maintained, or else requires so much hand holding that it is actually less efficient than a good programmer.

There are lots of incentives for marketing at the grassroots level. I am totally open to changing my mind but I need evidence.

replies(27): >>45112915 #>>45112951 #>>45112960 #>>45112964 #>>45112968 #>>45112985 #>>45112994 #>>45113041 #>>45113054 #>>45113123 #>>45113184 #>>45113229 #>>45113316 #>>45113448 #>>45113465 #>>45113643 #>>45113677 #>>45113802 #>>45114193 #>>45114454 #>>45114485 #>>45114519 #>>45115642 #>>45115900 #>>45116522 #>>45123605 #>>45125152 #
M4v3R ◴[] No.45113316[source]
I've live streamed how I've built a tower defense game over a span of a week entirely using AI. I've also written down all the prompts that were used to create this game, you can read about it here: https://news.ycombinator.com/item?id=44463967

Mind you I've never written a non-trivial game before in my life. It would take me weeks to do this on my own without any AI assistance.

Right now I'm working on a 3d world map editor for Final Fantasy VII that was also almost exclusively vibe-coded. It's almost finished and I plan a write up and a video about it when I'm done.

Now of course you've made so many qualifiers in your post that you'll probably dismiss this as "not production", "not robust enough", "not clean" etc. But this doesn't matter to me. What matters is I manage to finish projects that I would not otherwise if not for the AI coding tools, so having them is a huge win for me.

replies(3): >>45113578 #>>45115504 #>>45119997 #
hvb2 ◴[] No.45113578[source]
> What matters is I manage to finish projects that I would not otherwise if not for the AI coding tools, so having them is a huge win for me.

I think the problem is in your definition of finishing a project.

Can you support said code, can you extend it, are you able to figure out where bugs are when they show up? In a professional setting, the answer to all of those should likely be yes. That's what production code is.

replies(1): >>45113700 #
ffsm8 ◴[] No.45113700{3}[source]
I disagree with your sentiment.

The difference isn't in what finishing a project means; it's the dissonance between what M4v3R and rhubarbtree understand when talking about "nontrivial production" software.

When you're working in enterprise, you usually have multiple stakeholders, each defining sometimes even conflicting requirements for the behavior of your software. And you're required to adhere to these requirements stringently.

That's an environment that's inherently a bad fit for vibe coding.

It can still be used there, too, but you will not get a 2-3x speed up, because the LLM will always introduce minor behavioral changes - which aren't important in M4v3R's scenario, but a complete deal breaker for rhubarbtree.
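
One way to make those minor behavioral changes visible is a characterization test: record the current outputs before the agent touches the code, then fail on any drift. A minimal Python sketch, where `price_with_discount`, the discount rates and the recorded cases are all hypothetical stand-ins for behavior that a real spec would pin down:

```python
def price_with_discount(amount_cents: int, customer_tier: str) -> int:
    """Hypothetical function whose exact behavior stakeholders have signed off on."""
    rates = {"gold": 0.20, "silver": 0.10}
    discount = round(amount_cents * rates.get(customer_tier, 0.0))
    return amount_cents - discount


# Outputs recorded from the current implementation before any AI-assisted change.
RECORDED_CASES = {
    (1000, "gold"): 800,
    (1000, "silver"): 900,
    (1000, "unknown"): 1000,
    (999, "silver"): 899,
}


def find_behavior_drift(func, recorded):
    """Return (args, expected, actual) for every case whose output has changed."""
    drift = []
    for args, expected in recorded.items():
        actual = func(*args)
        if actual != expected:
            drift.append((args, expected, actual))
    return drift
```

Running `find_behavior_drift(price_with_discount, RECORDED_CASES)` after each agent iteration gives an empty list as long as the pinned behavior is intact. It only covers the cases someone thought to record, so it narrows the problem rather than removing it.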

From my own experience, I don't get a speed up at all via Copilot agentic mode (Claude Code is banned at my workplace). But I have had a significant boost in productivity in projects that don't need to adhere to any specific spec - namely projects I do on my own time (with Claude Code right now).

I still use Copilot agentic mode though. While I haven't timed myself, I don't think I'm faster with it whatsoever. It's just less mentally involved in a lot of scenarios, so it's less exhausting - leaving more energy for side projects.

replies(1): >>45114138 #
mattmanser ◴[] No.45114138{4}[source]
I don't believe it's to do with the requirements. I think you'll still hit the same problems if those greenfield projects grow. It's still fundamentally about the code. I think you're missing the difference between a 10/100k+ lines of code professional software vs a quick 3k lines greenfield project.

In a few thousand lines of code you can get away with a massive amount of code bloat, quick hacks and inconsistent APIs. In a program that's anything more than a few thousand lines, you can't. It just becomes too confusing. You have to be deliberate. Code has to follow patterns so the cognitive load is lowered. Stuff has to be split up in a predictable manner.

And there's another problem: sensible and predictable maintenance. Changes and fixes have to be targeted and specific. They have to be written to avoid side effects.

For organisation, it's been a huge effort on everyone's part these last 30 years to achieve that. Make code understandable, by organising it better. From one direction, languages have improved, with authors reducing boilerplate + cross-pollination of ideas between languages like anonymous methods. On the other, it's developers inventing + describing patterns or KISS or the single responsibility principle. Or even seemingly trivial things like choosing predictable folder structures and enforcing indentation rules[1]. I'm starting to feel that's often the main skill a senior dev brings to the table, organising code well.

Better code organisation has made it possible for developers to make larger programs. Code organisation is a need that becomes a big problem if you're not doing it well in large projects, but not really a problem if you're not doing it well in small projects.

And right now, AI isn't very good at code organisation. We might believe that you have to have a mental model of the whole program in your head, something an LLM is just not capable of right now. And I don't know if that's going to turn out to be a solvable problem as it seems like a huge context problem.

For maintenance, I'm not sure. AI seems pretty terrible at it. It often rewrites everything and throws the baby out with the bathwater. Again, it's a context problem.

Both could turn out to be easy to solve for this generation of AI, in the end.

[1] Younger programmers will not believe that even 15/20 years ago it was still a common problem that developers did not bother to indent their code consistently. In my first two jobs I'd regularly hit inconsistently indented code.

replies(1): >>45115433 #
MGriisser ◴[] No.45115433{5}[source]
I personally find Claude Code has no real issues working and producing code in the 40k LoC Ruby on Rails repo I work in nor in the 45k LoC Elixir/Phoenix repo I work in. For the last few months I'd say 99% of all changes I do to both are purely via Claude Code, I almost never use my editor anymore at all. It's common things don't work on the first try or aren't exactly what I want but usually just giving an error to Claude or further instructions will fix it in an iteration or two.

I think the code organization isn't amazing, but it's fine and frankly not that much of a concern to me usually as I'm usually just reading diffs and not digging around in the code much myself.

replies(1): >>45116995 #
1. ffsm8 ◴[] No.45116995{6}[source]
Totally off topic, but the other day I was considering trying out Elixir for a mainly vibe coded project, mainly because I thought the way you can structure code in it should be pretty much optimal for LLM driven development.

I haven't tried it yet, but I thought Elixir's easily implementable static analysis of code could make enforcement whenever the LLM goes off the rails highly useful, and an umbrella architecture would make modularity well established.

Modules could all define their own contexts via nested CLAUDE.md and subagents could be used to give it explicit implementation details.
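
A rough sketch of the nested-context idea, in Python for neutrality. It assumes (this thread does not confirm how Claude Code actually resolves these files) that a module's effective context is every `CLAUDE.md` found from the repo root down to that module's directory. The function name and any directory layout you pass in are hypothetical:

```python
from pathlib import Path


def collect_context_files(repo_root: Path, module_dir: Path, name: str = "CLAUDE.md") -> list[Path]:
    """List context files from repo_root down to module_dir, outermost first."""
    repo_root = repo_root.resolve()
    module_dir = module_dir.resolve()
    # Raises ValueError if module_dir is not inside repo_root.
    relative = module_dir.relative_to(repo_root)

    found = []
    current = repo_root
    # Visit the root itself first, then each directory on the way down.
    for part in ("", *relative.parts):
        if part:
            current = current / part
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
    return found
```

With an umbrella layout, calling this for one app's directory would return the repo-wide file first and that app's own file last, so the most specific instructions come at the end.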

Did you try something like that before, MGriisser? (successfully or not?)

replies(1): >>45122133 #
2. MGriisser ◴[] No.45122133[source]
Unfortunately I don't do anything nearly that sophisticated, I honestly barely even know Elixir, I had just used it a little bit at a previous job and thought it would be a nice choice to try for the web server part of an application I was building.

I mostly use Claude in that repo for controllers, DB access, and front end via heex templates, often with LiveView. I find it can get a bit mixed up with heex stuff occasionally given the weirdness of nested code into the HTML and all that but I think on pure Elixir it usually does a good job.