We'll just keep getting submission after submission talking about how amazing Claude Code is, with zero real-world examples.
It's funny because as I've gotten better as a dev, I've gone backwards through his progression: when I was less experienced I relied on Google; now I just read the docs.
I've been building commercial codebases with Claude for the last few months and almost all of my input is on taste and what defines success. The code itself is basically disposable.
I'm finding this is the case for my work as well. The spec is the secret sauce, the code (and its many drafts) are disposable. Eventually I land on something serviceable, but until I do, I will easily drop a draft and start on a new one with a spec that is a little more refined.
It’s a very different discussion when you’re building a product to sell.
Yes, it knows a lot, can regurgitate things, and can create plausible code (if I have it run builds and fix errors every time it changes a file, which of course eats tokens), but having absolutely no understanding of how time or space works means 90% of its great ideas are nonsensical for UI tasks. Everything needs very careful guidance and supervision, otherwise it decides to do something different instead. For back-end stuff, maybe it's better.
I'm on the fence about its overall utility, but some months, $20/month could almost be worth it just for a tool that can add a ton of debug logging in seconds.
I find it difficult to include examples because a lot of my work is boring backend work on existing closed-source applications. It's hard to share, but I'll give it a go with a few examples :)
----
First example: Our quota detection system (shipped last month) handles configurable threshold detection across billing metrics. The business logic is non-trivial: distinguishing counter vs gauge metrics, handling multiple consumers, and keeping the SQL queries efficient across time windows.
Claude's evolution:
- First pass: completely wrong approach (DB triggers)
- Second pass: right direction, wrong abstraction
- Third pass: a working implementation we could iterate on
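For flavor, here's roughly the shape of the third pass: detection in application code via a windowed SQL query instead of DB triggers. This is a minimal sketch; the table and column names (usage_events, consumer_id, etc.) are invented rather than our actual schema, and it assumes a psycopg-style DB-API connection.

    from datetime import datetime, timedelta, timezone

    # Invented schema for illustration: one row per usage reading.
    CROSSING_QUERY = """
    SELECT consumer_id, metric, MAX(value) AS peak
    FROM usage_events
    WHERE recorded_at >= %(window_start)s
    GROUP BY consumer_id, metric
    HAVING MAX(value) >= %(threshold)s
    """

    def find_crossings(conn, threshold: float, window: timedelta):
        """Return (consumer_id, metric, peak) rows whose usage crossed
        the configured threshold inside the time window."""
        params = {
            "window_start": datetime.now(timezone.utc) - window,
            "threshold": threshold,
        }
        with conn.cursor() as cur:
            cur.execute(CROSSING_QUERY, params)
            return cur.fetchall()

The real version also has to treat counters differently from gauges; this sketch only covers the gauge-style "peak in window" case.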
----
Second example: Sentry monitoring wrapper for cron jobs, a reusable component to help us observe our cron job usage
Claude's evolution:
- First pass: hard-coded the integration into each cron job, a maintainability nightmare
- Second pass: used a wrapper, but the config was all wrong
- Third pass: again, an OK implementation we could iterate on
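Again as a sketch, the wrapper it landed on was shaped roughly like the decorator below, so monitoring lives in one place instead of being hard-coded into each job. It assumes the sentry_sdk Python client; the decorator name, job name, and DSN are all placeholders, and the real config carries per-job settings.

    import functools
    import sentry_sdk

    sentry_sdk.init(dsn="https://examplePublicKey@o0.ingest.sentry.io/0")  # placeholder DSN

    def monitored_cron(job_name: str):
        """Wrap a cron job entry point with Sentry error reporting."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    # Tag the event so failures are filterable per job.
                    with sentry_sdk.push_scope() as scope:
                        scope.set_tag("cron_job", job_name)
                        sentry_sdk.capture_exception(exc)
                    raise
            return wrapper
        return decorator

    @monitored_cron("nightly-invoice-sync")  # hypothetical job name
    def nightly_invoice_sync():
        ...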
----
The "80%" isn't about line count; it's about Claude handling the exploration space while I focus on architectural decisions. I still own every line that ships, but I'm reviewing and directing rather than typing.
This isn't writing boilerplate; it's core billing infrastructure. The difference is that Claude is treated like a very fast junior who needs clear boundaries, rather than being expected to make senior-level architecture decisions.
Two recent production features:
1. *Quota crossing detection system*
- Complex business logic for billing infrastructure
- Detects when usage crosses configurable thresholds across multiple metric types
- Time: 4 days of parallel work vs ~10 days focused without AI
The 3-attempt pattern was clear here:
- Attempt 1: DB trigger approach - wouldn't scale for our requirements
- Attempt 2: SQL detection but wrong interfaces; it misunderstood counter vs gauge metrics (sketched below)
- Attempt 3: Correct abstraction after explaining how values are stored and consumed
2. *Sentry monitoring wrapper for cron jobs*
- Reusable component wrapping all cron jobs with monitoring
- Time: 1 day parallel vs 2 days focused

Nothing glamorous, but they are real-world examples of changes I've deployed to production quicker because of Claude.
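To make the counter vs gauge point from attempt 2 concrete, here's a hypothetical illustration: a counter only ever increases, so a crossing is about growth within the window, while a gauge is a point-in-time reading, so only the latest value matters.

    def counter_crossed(samples: list[float], threshold: float) -> bool:
        # Counters (e.g. total API calls) are cumulative and only increase,
        # so a crossing is about the delta across the time window.
        # samples are readings ordered oldest to newest within the window.
        return samples[-1] - samples[0] >= threshold

    def gauge_crossed(samples: list[float], threshold: float) -> bool:
        # Gauges (e.g. current storage used) are point-in-time readings,
        # so only the most recent sample matters.
        return samples[-1] >= threshold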