This would have taken time to write if I were doing it myself, so I decided to vibe code it entirely. I had this idea that a compiled language is less likely to have errors (on account of the compiler giving the LLM quicker feedback than I could) and so I chose Tauri with TS (I think).
The experience has been both wonderful and strange. The app was built by Claude Code with me intermittently prompting it between actual work sessions.
What’s funny is the bugs. If you ever played Minecraft during the Alpha days you know that Notch would be like “Just fixed lighting” in one release. And you’d get that release and it’d be weird in some new way, like rain would now fall through glass.
Essentially the bugs are strange. At least in the MC case you could hypothesize (transparency bit perhaps was used for multiple purposes) but this app is strange. If the LLM configuration modal is fixed, suddenly the MCP/tool tree view will stop expanding. What the heck, why are these two related? I don’t know. I could never know because I have never seen the code.
The compile-time checks did catch some bad iterations (I let Claude compile and run the program). But to be honest, the promise of correctness never landed.
Some people have been systematic and documented the prompts they use but I just free flowed it. The results are outstanding. There’s no way I could have had this built for the $50 in Claude credits. But also there’s no way I could interpret the code.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
Developers believe they complete tasks 25% faster with AI, but when measured they are 19% slower.
Second of all, it's easy to fart out some program in a few days vibe coding. How will that fare as more and more features need to be added on? We all used to say "Dropbox? That's just FTP wrapped in a nice UI, anyone can make that". This protocollie project seems to be a documentation viewer / Postman for MCP. Which is cool, but is it something that would have taken a competent dev months to build? Probably not. And eventually the actual value of such things is the extensibility and integrations with various things like corporate SAML etc.
Will the vibe code projects of today be extensible like that, enough to grab market share vs the several similar versions and open source versions anyone can make in a few days, as the author suggests? It can be hard to extend a codebase you don't understand because you didn't write...
A clickbaity title in opposition with the content isn't helpful either. I would've recommended their "The Great Experiment Nobody's Running the Same Way" heading as a better choice, even though it might not perform as well from a content marketing POV.
I had stumbled upon Kidlin’s Law—“If you can write down the problem clearly, you’re halfway to solving it”.
This is a powerful guiding principle in today’s AI-driven world. As natural language becomes our primary interface with technology, clearly articulating challenges not only enhances our communication but also maximizes the potential of AI.
The async approach to coding has been most fascinating, too.
I will add, I've been using Repl.it *a lot*, and it takes everything to another level. Getting to focus on problem solving, and less futzing with hosting (granted it is easy in the early journey of a product) - is an absolute game changer. Sparking joy.
I personally use the analogy of a Mario Kart mushroom or star; that's how I feel using these tools. It's funny though, because when it goes off the rails, it really goes off the rails lol. It's also sometimes necessary to intercept decisions it is about to make. Babysitting can take a toll (because of the speed of execution). Having to deal with 1 stack was something. Now we're dealing with potentially infinite stacks.
I've taken to co-writing a plan with requirements with Cursor and it works really well at first. But as it makes mistakes and we use those mistakes to refine the document, eventually we are ready to “go” and suddenly it’s generating a large volume of code that directly contradicts something in the plan. Small things, like adding an empty line after markdown headings, have to be explicitly re-added and re-reminded.
I almost wish I had more control over how it was iterating. Especially when it comes to quality and consistency.
When I/we can write a test and it can grind on that is when AI is at its best. It’s a closed problem. I need the tools to help me, help it, turn the open problem I’m trying to solve into a set of discrete closed problems.
I have enjoyed the GitHub Copilot agent style of development, where someone else's computer is running everything, and I can make a request and just come back half an hour later and check on it. But this level 5 driver gets the wrong destination basically every time, and then it's another 10, 20 or even 30 minutes for it to make a minor adjustment. It doesn't understand my `yarn` scripts, it runs my tests wrong, it can't do codegen, it doesn't format or lint files, etc. I asked Copilot yesterday to lint and format a PR and it took 25 minutes of agentic work lol.
I'm actually producing code right this moment, where I would normally just relax and do something else. Instead, I'm relaxing and coding.
It's great for a senior guy who has been in the business for a long time. Most of my edits nowadays are tedious. If I look at the code and decide I used the wrong pattern originally, I have to change a bunch of things to test my new idea. I can skim my code and see a bunch of things that would normally take me ages to fiddle. The fiddling is frustrating, because I feel like I know what the end result should be, but there's some minor BS in the way, which takes a few minutes each time. It used to take a whole stackoverflow search + think, recently it became a copilot hint, and now... Claude simply does it.
For instance, I wrote a mock stock exchange. It's the kind of thing you always want to have, but because the pressure is on to connect to the actual exchange, it is often a leftover task that nobody has done. Now, Claude has done it while I've been reading HN.
Now that I have that, I can implement a strategy against it. This is super tedious. I know how it works, but when I implement it, it takes me a lot of time that isn't really fulfilling. Stuff like making a typo, or forgetting to add the dependency. Not big brain stuff, but it takes time.
Now I know what you're all thinking. How does it not end up with spaghetti all over the place? Well. I actually do critique the changes. I actually do have discussions with Claude about what to do. The benefit here is he's a dev who knows where all the relevant code is. If I ask him whether there's a lock in a bad place, he finds it super fast. I guess you need experience, but I can smell when he's gone off track.
So for me, career-wise, it has come at the exact right time. A few years after I reached a level where the little things were getting tedious, a time when all the architectural elements had come together and been investigated manually.
What junior devs will do, I'm not so sure. They somehow have to jump to the top of the mountain, but the stairs are gone.
Why are these chatbots that mangle data 1/3 to 1/2 of the time getting their budgets 10x over and over again?
This is irrational. If the code mangles data this bad, it's garbage.
Of course some people will lose jobs just like what happened to several industries when search became ubiquitous. (newspapers, phone books, encyclopedias, travel agents)
But IMHO this isn't the existential crisis people think it is.
It's just a tool. Smart, clever people can do lots of cool stuff with tools.
But you still have to use it.
Search has just become Chat.
You used to have to search, now you chat and it does the searching, and more!
That’s not jazz. Jazz being what it is, a lot of people in 2025 think it’s “everyone improvising,” but (outside of some free jazz) it’s quite structured and full of shared conventions.
Analogies work when you and your audience both understand the things being compared. In this case, the author doesn’t, and maybe some of the audience shares the same misperception, and so the analogy only works based on shared misunderstanding.
The analogy to jazz actually works better the more you know about it. But that’s accidental.
I don't work like this, I don't want to work like this and maybe most importantly I don't want to work with somebody who works like this.
Also I am scared that any library that I am using through the myriad of dependencies is written like this.
On the other hand... if I look at this as some alternate universe where I don't need to directly or indirectly touch any of this... I am happy that it works for these people? I guess? Just keep it away from me
Exactly my thinking. I'm nearly 50, with more than 30 years of experience in nearly every kind of programming, and like you, I can easily architect/control/adjust the agent to help me produce great code with a very robust architecture. But I do that out of my experience, both in modelling (science) and programming. I wonder how the junior devs will be able to build experience if everything comes cooked by the agent. Time will tell.
I just started an embedded project where two different people had implemented subsystems independently, and I asked Claude to merge the code into a single project and convert the existing synchronous code into asynchronous state machines called from a single main loop. It wrote three drafts with me giving it different stylistic principles to follow. I don't know if I would have had the patience to do that myself!
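For anyone who hasn't seen the pattern, here is a minimal sketch of "asynchronous state machines called from a single main loop". It's in Python rather than embedded C just for readability, and the `SensorRead` class, its states, and the tick counts are all hypothetical illustrations, not the code from that project. The idea is that each blocking "start, wait, read" call becomes a `step()` that does a little work and returns immediately.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAITING = auto()
    DONE = auto()


class SensorRead:
    """Hypothetical subsystem: a blocking read split into non-blocking steps."""

    def __init__(self, ticks_needed):
        self.state = State.IDLE
        self.ticks_needed = ticks_needed  # stand-in for a hardware delay
        self.ticks = 0
        self.result = None

    def step(self):
        # Each call does one small piece of work and returns without blocking.
        if self.state is State.IDLE:
            self.ticks = 0
            self.state = State.WAITING
        elif self.state is State.WAITING:
            self.ticks += 1
            if self.ticks >= self.ticks_needed:
                self.result = self.ticks  # placeholder for a real reading
                self.state = State.DONE
        # In State.DONE there is nothing left to do.


def main_loop(machines, max_iterations=100):
    """Poll every machine in turn until all are done; return loop passes used."""
    for iteration in range(max_iterations):
        if all(m.state is State.DONE for m in machines):
            return iteration
        for m in machines:
            m.step()
    return max_iterations


machines = [SensorRead(ticks_needed=2), SensorRead(ticks_needed=5)]
passes = main_loop(machines)
print(passes, [m.state.name for m in machines])
```

The slow machine never stalls the fast one, which is the whole point of doing the conversion.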
I imagine the next generation will have a similar relationship with AI. What might seem "common sense" to the younger, more tech-savvy crowd will be difficult for older generations whose default behavior isn't to open up ChatGPT or Gemini and find the solution quickly.
And also to help me troubleshoot my old yacht, it taught me to be an amateur marine electrician
I do not let it into my entire codebase tho. I keep the context small, and if I don't get what I want in one or two prompts I don't use it.
I think chat-like LLM interfacing is not the most efficient way. There has to be a smarter way.
I was super skeptical about a year ago. Copilot was making nice predictions, that was it. This agent stuff is truly impressive.
It suggests you've had very positive life experiences, that you trust human developers so much more than computers.
To me it mostly comes with a feeling of uncertainty. As if someone tells you something he was told at a party. I need to Google it to find a trustworthy source for verification, else it's just a hint.
So I use it if I want a quick hint. Not if I really want to have information worth remembering. So it's certainly not a replacement for me. It actually makes things worse for me because of all that AI slop atm.
Famously complicated interface with a million buttons and menus.
Now there's more buttons for the AI tools.
Because at the end of the day, using a "brush" tool to paint over the area containing the thing you want it to remove or change in an image is MUCH simpler than trying to tell it that through chat. Some sort of prompt like "please remove the fifth person from the left standing on the brick path under the bus stop" vs "just explicitly select something with the GUI." The former could have a lot of value for casual amateur use; it's not going to replace the precise, high-functionality tool for professional use.
In software - would you rather chat with an LLM to see the contents of a proposed code change, or use a visual diff tool? "Let the agent run and then treat its stuff as a PR from a junior dev" has been said so many times recently - which is not suggesting just chatting with it to do the PR instead of using the GUI. I would imagine that this would get extended to something like the input not just being less of a free-form chat, but more of a submission of a Figma mockup + a link to a ticket with specs.
The bigger issue: would there be a need for coding and software? Who would use them? Why are they using it? Are they buying something? Searching for info? The use case will see a revolution. The new use cases won't need the traditional kind of software. But AI can only produce traditional software.
Can I ask Claude to code up its clone for local use?
If we cannot find a way to redirect income from AI back to the creators of the information they rehash (such as good and honest journalism), a critical load-bearing pillar of democratic society will collapse.
The news industry has been in grave danger for years, and we've seen the consequences it brings (distrust, division, misinformation, foreign manipulation). AI may drive the last stake in its back.
It's not about some jobs being replaced; that is not even remotely the issue. The path we are on currently is a dark one, and dismissing it as "just some jobs being lost" is a naive dismissal of the danger we're in.
You prompt. You go live your life. You come back to ten thousand lines of code. You spend 5 minutes reading. One sentence of feedback. Another ten thousand lines appear while you're making lunch.
Yeah, it strikes me the author writes prose the same way they're generating code. 20k lines? That's enough code for a whole compiler or an operating system kernel. I'd love to see what those 20k lines actually do -- notably, in these articles about AI, people tend to not link the actual code when they easily could, which is curious. I mean, my macro expander can also write 20k lines of code while I eat lunch, but no one is pretending it's sentient and about to replace devs.
I have heard the take that "writing code is not what makes you an engineer, solving problems and providing value is what makes you an engineer" and while that's cool and all and super important for advancing in your career and delivering results, I very much also like writing code. So there's that.
But it's never displaced the market for highly-produced, highly-planned, "central" software pieces that the utilities glue together and help you work with, etc.
The growth of that software-as-big-business has only enlarged the need for utilities, really, to integrate everything, but it's a tough space to work in - "it's hard to compete with free." One classic move is selling support, etc.
Might be tough to do non-LLM-driven software development there - the selling support for your LLM-created-products model is still viable, but if there's an increase in velocity in useful utility creation or maintenance, possibly the dev headcount needs are lower.
But does anyone know how to use LLMs to make those giant ones yet? Or to make those central core underlying libraries you mention? Doesn't seem like it. Time will tell if there's a meaningful path that is truly different from "an even higher level programming language." Even on the edges - "we outgrew the library and we have to fork it because of [features/perf/bugs]" is a pretty common pattern when working on those larger projects already, and the more specific the exact changes you need are, the less the LLM might be able to do it for you (e.g. the "it kept assuming this function existed because it exists in a lot of similar things" problem).
What I hope is that we can find good ways to leverage these for quality control and testing and validation. (Though this is the opposite of the sort of greenfield dev demos that get the most press right now.)
Testing/validation is hard and expensive enough that basically nobody does a thorough job of it right now, especially in the consumer space. It would be wonderful if we could find ways to release higher quality software without teams of thousands doing manual validation.
- Use a source to claim the opposite of what the source says.
- Point to irrelevant sources.
- Use a very untrustworthy source.
- Give out sources that do not have anything to do with what it says.
- Make up additional things like any other LLM without source or internet search capability, despite reading sources.
I've specifically found Gemini (the one Google puts at the top of searches) is hallucination-prone, and I've had far better results with other agents with search capability.
So... presenting a false or made-up answer to a person searching the web on a topic they don't understand... I'd really like to see a massive lawsuit cooked up about this when someone inevitably burns their house down or loses their life.
Really helped my understanding of how apps work.
Completely new ways of programming are forming, completely new ways of computing and the best the luddites can do is be “against it”.
A revolution came along, a change in history and instead of being excited by the possibilities, joining in, learning, discovering, creating …… the luddites are just “against it all”.
I feel sorry for them. Why be in computing at all if you don’t like new technology?
I call it 'Orchestratic Development'.
Edit: Seriously, down voted twice when just commenting on an article? God I hate this arrogant shithole.
They are entering the job market with sensibilities for a higher-level of abstraction. They will be the first generation of devs that went through high-school + college building with AI.
This isn't a magic code genie, it's a very complicated and very powerful new tool that you need to practice using over time in order to get good results from.
Honestly reminds me of the digital currency mania that busted a couple of years ago. Same types of articles popping up too.
Look, I understand the benefits of AI, but it’s clear AI is limited by the compute power of today. Maybe the dream this author has will be realized some day. But it won’t be today or in the current generation's lifespan.
That's not to say there aren't vocations, or people in software who feel the way you do, but it's a tiny minority.
Cringe. The tech is half baked and the author is already fully committed to "this is the future, I am living in the future, I bake cookies while Claude codes."
Pure cringe. This confirms my earlier theories that everyone just wants to be a manager. You don't need to manage humans. You just want to be a manager.
The whole article could be summed up as "I always wanted to be a manager and now I am a manager of bots."
I think in the last month we've entered an inflection point with terminal "agents" and new generations of LLMs trained on their previously spotty ability to actually do the thing. It's not "there" yet and results depend on so many factors like the size of your codebase, how well-represented that kinda stuff is in its training data, etc but you really can feed these things junior-sized tickets and send them off expecting a PR to hit your tray pretty quickly.
Do I want the parts of my codebase with the tricky, important secret sauce to be written that way? Of course not, but I wouldn't give them to most other engineers either. A 5-20 person army of ~interns-newgrads is something I can leverage for a lot of the other work I do. And of course I still have to review the generated code, because it's ultimately my responsibility, but I prefer that over having to think about http response codes for my CRUD APIs. It gives me more time to focus on L7 load balancing and cluster discovery and orchestration engines.
But! There's still room for expertise. And this is where I disagree about swimming with the tide. There will be those who are uninterested in using the AI. They will struggle. They will hone their craft. They will have muscle memory for the tasks everyone else forgot how to do. And they will be able to perform work that the AI users cannot.
The future needs both types.
I've been experimenting with a toolchain in which I use speech-to-text to talk to agents, navigate the files with vim and autocomplete, and have Grok think through some math for me. It's pretty fun. I wonder whether that will change into tuning agents that write code and go through that process in a semi-supervised manner, and whether that will be fun too. I don't know, but I'm open to the idea that as we progress I will find toolchains that bring me into flow as I build.
This kind of working is relaxing and enjoyable until capitalism discovers that it is, and then you have to do it on five projects simultaneously.
There are a lot of gotchas with these new models. They get incredibly lazy if you let them. For example, I asked it to do a simple tally by year. I just assumed it was simple enough that I didn't need to ask it to write code. It counted the first couple of years and just “guessed” the rest based on a pattern it noticed.
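For what it's worth, the kind of code I should have asked for is tiny. This is just a sketch with made-up sample dates (the `tally_by_year` name and the data are hypothetical, not the actual dataset), but it counts every row instead of extrapolating from the first few.

```python
from collections import Counter
from datetime import date


def tally_by_year(dates):
    """Count how many dates fall in each year, sorted by year."""
    counts = Counter(d.year for d in dates)
    return dict(sorted(counts.items()))


# Made-up sample data, purely for illustration.
sample = [
    date(2021, 3, 1),
    date(2021, 7, 9),
    date(2022, 1, 15),
    date(2023, 5, 2),
    date(2023, 5, 3),
    date(2023, 12, 31),
]

tally = tally_by_year(sample)
for year, count in tally.items():
    print(year, count)
```

Having the model write and run something like this, rather than "reading" the data itself, seems to be the safer habit.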
Sometimes, it feels like having a lazy coworker that you have to double check constantly and email with repeated details. Other times, I just sit there in awe of how smart it is in my weekly AGI moment and how it’s going to replace me soon.
you: HAVE YOU PUT MORE TOKENS IN???? ARE YOU PUTTING THEM IN THE EXPENSIVE MACHINES???
super compelling argument /s
if you want to provide working examples of "prompt engineering" or "context engineering" please do but "just keep paying until the behavior is impressive" isn't winning me as a customer
it's like putting out a demo program that absolutely sucks and promising that if I pay, it'll get good. why put out the shit demo and give me this impression, then, if it sucks?
It's always a mix of:
1. "Wait for the next models", despite models having all but plateaued for the past 3 years,
2. "It's so good for boilerplate code", despite libraries and frameworks being much better suited for this task, and boilerplate code being actually rare to write in the normal lifecycle of a project,
3. "You need to prompt it differently", glossing over the fact that to prompt it so it can do what you want it to do accurately it would take longer than not to use AI at all,
4. And the worst: "We don't know how to use those models yet"
Maybe the real reason it doesn't work is because IT JUST DOESN'T FUCKING WORK.
Why is it so unfathomable that a next token generator is gonna suck at solving complex problems? It is blindingly obvious.
I think this is a really interesting question and an insight into part of the divide.
Places like HN get a lot of attention from two distinct crowds: people who like computers and related tech and people who like to build. And the latter is split into "people who like to build software to help others get stuff done" and "people who like to build software for themselves" too. Even in the professional-developer-world that's a lot of the split between those with "cool" side projects and those with either only-day-job software or "boring" day-job-related side projects.
I used to be in the first group, liking computer tech for its own sake. The longer I work in the profession of "using computer tools to build things for people" the less I like the computer industry, because of how much the marketing/press/hype/fandom elements go overboard. Building-for-money often exposes, very directly, the difference between "cool tools" and "useful and reliable tools" - all the bugs I have to work around, all the popular much-hyped projects that run into the wall in various places when thrown into production, all the times simple and boring beats cool when it comes to winning customers. So I understand when it makes others jaded about the hype too. Especially if you don't have the intrinsic "cool software is what I want to tinker with" drive.
So the split in reactions to articles like this falls on those lines, I think.
If you like cool computer stuff, it's a cool article, with someone doing something neat.
If you are a dev enthusiast who likes side projects and such (regardless of if it's your day job too or not), it's a cool article, with someone doing something neat.
If you are in the "I want to build stuff that helps other people get shit done" crowd then it's probably still cool - who doesn't like POCs and greenfield work? - but it also seems scary for your day to day work, if it promises a flood of "adequate", not-well-tested software that you're going to be expected to use and work with and integrate for less-technical people who don't understand what goes into reliable software quality. And that's not most people's favorite part of the job.
(Then there's a third crowd which is the "people who like making money" crowd, which loves LLMs because they look like "future lower costs of labor." But that's generally not what the split reaction to this particular sort of article is about, but is part of another common split between the "yay this will let me make more profit" and "oh no this will make people stop paying me" crowds in the biz-oriented articles.)
But there is also the area of boilerplate, where non-LLM-AI-based IDEs for a few decades already help a lot with templates and "smart" completion. Current AI systems widen that area.
The trouble with AI comes when you reach the boundary of its capabilities. The trivial stuff it does well. For the complex stuff it fails spectacularly. In between, you have to review carefully, which easily becomes less fun than simply writing it oneself.
A friend’s dad only knows assembly. He’s the CEO of his company and they do hardware, and he’s close to retirement now, but he finds this newfangled C and C++ stuff a little too abstract. He sadly needs to trust “these people” but really he prefers being on the metal.
It might be as simple as creating awareness about how everything works underneath and creating graduates that understand how these things should work in a similar vein.
What you describe is exactly what a project manager does: refines the technical stories and organizes the development towards a goal.
This doesn’t feel like programming because it isn’t. It doesn’t NOT feel like programming because you’re supervising. In the end, you are now a project manager.
Then it ran out of money again, and I gave it even more money.
I'm in the low 4 figures a year now, and it's worth it. For a day's pay each year, I've got a junior dev who is super fast, makes good suggestions, and makes working code.
I'm thinking about Personal Knowledge Systems and their innovative ideas regarding visual representations of data (mind maps, websites of interconnected notes, things like that). That could be useful for AI search. What LLMs are doing, in a sense, is building a concept web, which would naturally fit quite well into visualization.
The ChatBot paradigm is quite centered around short, easily digestible narratives, and while humans are certainly narrative-generating and narrative-absorbing creatures to a large degree, things like having a visually mapped out counter argument can also be surprisingly useful. It's just not something that humans naturally do without effort outside of, say, a philosophy degree.
There is still the specter of the megacorp feed algo monster lurking though, in that there is a tendency to reduce the consumer facing tools to black-box algorithms that are optimized to boost engagement. Many of the more innovative approaches may involve giving users more control, like dynamic sliders for results, that sort of thing.
What does the next generation do when we’ve automated away that work? How do they learn to recognise what good looks like, and when their LLM has got stuck on a dead end and is just spewing out nonsense?
This is a core problem with amateurs pretending to be software producers. There are others, but this one is fundamental to acceptable commercial software and will absolutely derail vibe coded products from widespread adoption.
And if you think these aspects of quality software are easily reduced to prompts, you've probably never done serious work in those spaces.
You look at the PRs... there are 786(!) AI generated pull requests and an associated AI generated code review for each one. Each PR is about ~20-100 lines of Ruby (including comments) that implements an "action" for the sublayer system as a Ruby class. So probably something that could be handled by a macro expander. Or at least it's AI used as a fancy macro expander.
But yeah, there's about 20k lines of code right there easily. Although, because it's Ruby, it's not (much) of an exaggeration to say ~50% of the generated lines are a single "end" keyword.
The author is someone who before AI, would publish ~300 commits a year to Github. This year they are on track for 3000 commits using AI. But the result seems to be that PRs are accumulating in their repo, implementing hundreds of features. I'm wondering why the PRs are accumulating and not getting merged if the code is good? Is the bottleneck now review? What would happen if AI took over PR merging as well as PR creation?
You will most likely get your wish but not in the way you want. In a few years when this is fully matured there will be little reason to hire devs with their inflated salaries (especially in the US) when all you need is someone with some technical know-how and a keen eye on how to work with AI agents. There will be plenty of those people all over the globe who will demand much less than you will.
Hate to break it to you but this is the future of writing software and will be a reckoning for the entire software industry and the inflated salaries it contains. It won't happen overnight but it'll happen sooner than many devs are willing to admit.
What I use it for is:
1. Remembering what something is called -- in my case the bootstrap pills class -- so I could locate it in the bootstrap docs. Google search didn't help as I couldn't recall the right name to enter into it. For the AI I described what I wanted to do and it gave the answer.
2. Working with a language/framework that I'm familiar with but don't know the specifics in what I'm trying to do. For example:
- In C#/.NET 8.0 how do I parse a JSON string?
- I have a C# application where I'm using `JsonSerializer.Deserialize` to convert a JSON string to a `record` class. The issue is that the names of the variables are capitalized -- e.g. `record Lorem(int Ipsum)` -- but the fields in the JSON are lowercase -- e.g. `{"ipsum": 123}`. How do I map the JSON fields to record properties?
- In C# how do I convert a `JsonNode` to a `JsonElement`?
3. Understanding specific exceptions and how to solve them.
In each case I'm describing things in general terms, not "here's the code, please fix it" or "write the entire code for me". I'm doing the work of applying the answers to the code I'm working on.
The truth is something like: for this to work, there are huge requirements in tooling/infrastructure/security/simulation/refinement/optimization/cost-saving that just could never be figured out by the big companies. So they are just like... well, let's trick as many investors and plebs as possible into trying to use this, maybe one of them will come up with some breakthrough we can steal.
I know most true programmers will vouch for me and my need to understand. But clients and project managers and bosses? Are they really gonna keep accepting a refrain like this from their engineers?
"either it gets done in a day and I understand none of it, or it gets done in a month and I fully understand it and like it"
In essence, you have to do the "engineering" part of the app and they can write the code pretty fast for you. They can help you with the engineering part, but you still need to be able to weigh whatever crap they recommend and adjust accordingly.
For anyone trying to do the back-of-the-napkin math: taking $1000 as the low 4 figures per year, and treating that as one day's pay, the baseline salary where this makes sense is about $260,000/yr? Is that about right lordnacho?
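Spelling out the napkin math, as a rough sketch only: it assumes 260 working days a year (52 weeks of 5 days, no holidays) and that "a day's pay" means exactly one working day, neither of which was stated. The function name and parameters are just for illustration.

```python
# Assumption: 52 weeks x 5 working days, ignoring holidays and vacation.
WORKING_DAYS_PER_YEAR = 52 * 5


def implied_annual_salary(yearly_tool_spend, days_of_pay=1.0,
                          working_days=WORKING_DAYS_PER_YEAR):
    """Salary at which `yearly_tool_spend` equals `days_of_pay` days of pay."""
    daily_pay = yearly_tool_spend / days_of_pay
    return daily_pay * working_days


for spend in (1000, 2000, 3000):
    print(spend, implied_annual_salary(spend))
```

So $1000 a year only equals one day's pay at roughly a $260k salary, and anything higher in the "low 4 figures" pushes the implied salary up proportionally.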
> With enough AI assistants building enough single-purpose tools, every problem becomes shallow. Every weird edge case already has seventeen solutions. Every 2am frustration has been felt, solved, and uploaded.
> We're not drowning in software. We're wading in it. And the water's warm
Just sounds like GPT style writing. I’m not saying this blog is all written by GPT, but it sounds like it is. I wonder if those of us who are constantly exposed to AI writing are starting to adopt some of that signature fluffy, use-a-lot-of-words-without-saying-much kinda style.
Life imitates art. Does intelligence imitate artificial intelligence?? Or maybe there’s more AI written content out there than I’m willing to imagine.
(Those snippets are from another post in this blog)
A lot of what is “working” in the article is closer to “jugaad”/prototyping.
Something the author acknowledges in their opening: it’s a way to prototype and get something off the ground.
Technical debt will matter for those products that get off the ground.
I'm reminded of teaching bootcamp software engineering, when every day #1 we go through simple git workflows and it seems very intimidating to students and they don't understand the value. Which is fair enough, because git has a steep learning curve and you need to use it practically to start picking it up.
I think this might be analogous to the shift going on with ai-generated and agent-generated coding, where you're introducing an unfamiliar tool with a steep learning curve, and many people haven't seen the why? for its value.
Anyways, I'm 150 commits into a vibe coding project that's still standing strong. If you're curious as to how this can work, you can see all the prompts and the solutions in this handy markdown I've created: https://github.com/sutt/agro/blob/master/docs/dev-summary-v1...
I guess if all you do is write React To-Do apps all day, it might even work for a bit.
And that's not saying AI tools are the real deal, either. It can be a lot less than a fully self driving dev and still be worth a significant fraction of an entry level dev.
The section "What Even Is Programming Anymore?" hit on a lot of the thoughts and feels I've been going through. I'm using all my 25+ years of experience and CS training, but it's _not_ programming per se.
I feel like we're entering an era where we're piloting a set of tools, not hand crafting code. I think a lot of people (who love crafting) will be leaving the industry in the next 5 years, for better or worse. We'll still need to craft things by hand, but we're opening some doors to new methodologies.
And, right now, those methodologies are being discovered, and most of us are pretty bad at them. But that doesn't mean they're not going to be part of the industry.
> it's a very complicated and very powerful new tool that you need to practice using over time in order to get good results from.
Of course this is and would be expected to be true. Yet adoption of this mindset has been orders of magnitude slower than the increase in AI features and capabilities.
Repeat that a few hundred times and you'll have some strong intuitions and sensibilities.
Unless you've never written code outside of a classroom you should know how unbelievably wrong this is.
So more work gets to penetrate a part of your life that it formerly wouldn't. What's the value of “productivity gains”, when they don't improve your quality of life?
https://github.com/jerpint/context-llemur
The idea is to track all of the context of a project using git. It’s a CLI and MCP tool, the human guides it but the LLM contributes back to it as the project evolves
I used it to bootstrap the library itself, and have been using it more and more for context management of all sorts of things I care about
There's a huge disconnect I notice where experienced software engineers rage about how shitty things are nowadays while diving directly into using AI garbage, where they cannot explain what their code is doing if their lives depended on it.
As has been the case for all those jobs changed by programmers, the people who keep an open mind and are willing to learn new ways of working will be fine or even thrive. The people rusted to their seat, who are barely adding value as is, will be forced to choose between changing or struggling.
Worth it to me as I can fix all the above after the fact.
Just annoying haha
One could even imagine going a step further and having a confidence level associated with different parts of the code, that would help the LLM concentrate changes on the areas that you're less sure about.
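A minimal TypeScript sketch of what such a per-area confidence index could look like. The `ConfidenceEntry` shape, the `leastTrusted` helper, the file paths, and the 0-to-1 scale are all hypothetical, made up for illustration rather than taken from any existing tool.

```typescript
// Hypothetical shape: none of these names come from an existing tool.
interface ConfidenceEntry {
  path: string;       // file or module the note applies to
  confidence: number; // 0 = "no idea if this works", 1 = "battle-tested"
}

// Return the paths the developer trusts least, lowest confidence first.
// `filter` returns a new array, so the caller's list is not reordered.
function leastTrusted(entries: ConfidenceEntry[], threshold: number): string[] {
  return entries
    .filter((e) => e.confidence < threshold)
    .sort((a, b) => a.confidence - b.confidence)
    .map((e) => e.path);
}

// Illustrative data only.
const notes: ConfidenceEntry[] = [
  { path: "src/auth.ts", confidence: 0.9 },
  { path: "src/sync.ts", confidence: 0.3 },
  { path: "src/ui/tree.ts", confidence: 0.5 },
];

const focus = leastTrusted(notes, 0.6);
console.log(`Prefer changes in: ${focus.join(", ")}`);
```

The resulting list could be pasted into a prompt so the model is steered toward the shaky areas and away from the parts you already trust.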
Looking at other industries, music production is probably the one to look at. What was once the purview of record labels with recording studios that cost a million dollars to outfit, is now a used MacBook and, like, $1,000 of hardware/software. The music industry has changed, dramatically, as a result of the march of technology, and thus so will software. So writing software will go the way of the musician. What used to be a middle class job as a trumpet player in NYC before the advent of records, is now only a hobby except for the truly elite-level practitioners.
The thing for me is that AI writing the boilerplate feels like the brute force solution, compared to investing in better language and tooling design that may obviate the need for such boilerplate in the first place.
The economic viability of doing proper journalism was already destroyed by the ad-supported, click- and attention-based internet (and in particular the way people consume news through algorithmic social media).
I believe most independent news sites have been economically forced into sensationalism and extremism to survive. It's not what they wilfully created.
Personally, I find that any news organisation that is still somewhat reputable has a source of income beyond page visits and ads, be it a senior demographic that still subscribes to the paper, a loyal reader base that pays for the paywall, or government sponsoring its existence as a public service.
Now what if you cut out the last piece of income journalists rely on to stay afloat? We simply fire the humans and tell an AI to summarise the other articles instead, and phrase it how people want to hear it.
And that's a frightening world.
Which, of course, is your prerogative, but in what other ways do we, as fellow programmers, judge software libraries and dependencies so harshly? As a Vim user, do I care that Django was written with a lot of emacs? Or that Linus used emacs to write git? Or maybe being judgemental about programming languages; ugh, that's "just" a scripting language, it's not "real" programming unless you use a magnet up against a hard drive to program in ones and zeros. As a user, do I care that Calibre is written in Python, and not something "better"? Or that curl is written in good ole C. Or how about being opinionated as to whether or not the programmer used GDB or printf debugging to make the library?
What are you attached to and identify with that you’re rejecting new ways to work?
Change is the only constant and tools now look like superhuman tools created for babies compared to the sota at bell or NASA in the 1960s when they were literally trying to create superhuman computing.
We have more access to powerful compute and it’s never been easier to build your own everything.
What’s the big complaint?
The energy cost is absurdly high for the result, but in current economics, where it's paid by investors not users, it's hidden. Will be interesting to see when AI companies get to the level where they have to make profits and how much optimisation there is to come ...
I do think that for most people, you are right, you do not need to know a lot, but my philosophy was to always understand how the tool you use works (one level deeper), but now the tool is creating a new tool. How do you understand the tool which has been created by your Agent/AI tool?
I find this problem interesting, this is new to me and I will happily look at how our society and the engineering community evolve with these new capacities.
I’ll probably get over it, but I’ve been realizing how much fun I get out of building something as opposed to just having it built. I used to think all I cared about was results, and now I know that’s not true, so that’s fun!
Of course for the monotonous stuff that I’ve done before or don’t care a lick about, hell yeah I let em run wild. Boilerplate, crud, shell scripts, CSS. Had claude make me a terminal based version of snake. So sick
Then other times, I go to create something that is suggested _by them below the prompt box_ and it can't do it properly.
You can also literally do exactly what you said with "going a step further".
Open Claude Code, run `/init`. Download Superwhisper, open a new file at project root called BRAIN_DUMP.md, put your cursor in the file, activate Superwhisper, talk in stream of consciousness-style about all the parts of the code and your own confidence level, with any details you want to include. Go to your LLM chat, tell it to "Read file @BRAIN_DUMP.md" and organize all the contents into your own new file CODE_CONFIDENCE.md. Tell it to list the parts of the code base and give its best assessment of the developer's confidence in that part of the code, given the details and tone in the brain dump for each part. Delete the brain dump file if you want. Now you literally have what you asked for, an "index" of sorts for your LLM that tells it the parts of the codebase and developer confidence/stability/etc. Now you can just refer to that file in your project prompting.
Please, everyone, for the love of god, just start prompting. Instead of posting on hacker news or reddit about your skepticism, literally talk to the LLM about it and ask it questions, it can help you work through almost any of this stuff people rant about.
heck I built a full app in an afternoon AND I was a good dad?
> I'd wander into my office, check what Claude had built, test it real quick. If it worked, great! Commit and push. "Now build the server connection UI," I'd say, and wander back out.
Made breakfast. Claude coded.
Played with my son. Claude coded.
Watched some TV. Claude coded.
Every hour or so, I'd pop back in. Five minutes of testing. One minute of feedback.
Despite explicit instructions in all sorts of rules and .md’s, the models still make changes where they should not. When caught they innocently say ”you’re right I shouldn’t have done that as it directly goes against your rule of <x>”.
Just to be clear, are you suggesting that currently, with your existing setup, the AIs always follow your instructions in your rules and prompts? If so, I want your rules please. If not, I don’t understand why you would diss a solution which aims to hardcode away some of the LLM prompt interpretation problems that exist
This really hasn't been my experience
Maybe I just expect more out of juniors than most people, though
It's a fact that models aren't getting more cost-efficient or better at the same rate that the costs of training and running them are increasing. It's also a fact that they are so unprofitable that Anthropic feels like they gotta rug-pull your Claude tokens (https://news.ycombinator.com/item?id=44598254#44602695) without telling you. Let's just ignore those facts and fanboy with eyes wide closed about that future.
A future framed as "inevitable" by a bunch of people whose job/wealth depends on framing it as such. Nah, hard pass.
In fact what I really want to see is a successful product that no one realizes was built by AI vibes until after it was successful. Customers don’t give a shit how something was built.
Back in the Bitcoin hype days, there were new posts here every single day about the latest and greatest Bitcoin thing. Everyone was using it. It was going to take over the world. Remember all the people on this very site that sincerely thought fiat currency was going away and we'd be doing all of our transactions with Bitcoin? How'd that work out?
It feels exactly the same. Now the big claims are that coding jobs are going away, or if you at least don't use it you'll be left behind. People are posting AI stories every day. Everyone is using it. People say it's going to transform the industry.
Back then there was greater motivation to evangelize Bitcoin, as you could get rich by convincing people to buy in, and it's just to a lesser degree now. People who work for AI companies (like the author), posting AI stuff, trying to drum up more people to give them views/clicks, buy their products.
And of course you'll have people replying to this trying to make the case for why AI coding is already a thing, when in reality those posts are once again going to be carbon copies of similar comments from the Bitcoin days "hey, you're wrong, I bought pizza with Bitcoin last night, it's already taking over, bud!"
If you tell it to use existing libraries (and they are in its training data) it will do that instead.
If you tell it about libraries it hasn't encountered before it can use those instead.
I tried to follow the hype and generate an application, but it took a lot of time, and what it generated didn't really work: it had many subtle bugs. Now it may be that I needed to prompt it better, but that response also feels similar to how Scrum is always "done wrong" when it doesn't work. The result started getting better when I got more and more detailed with my prompts and then I realized that I am about to start writing code as a prompt and I may as well write the code myself.
So I still think it's an interesting tool, and it will automate away certain industries but nowhere near what the advertising is implying.
LLMs can be thought of metaphorically as a process of decompression, if you can give it a compressed form for your scenario 1 it'll go great - you're actually doing a lot of mental work to arrive at that 'compressed' request, checking technical feasibility, thinking about interactions, hinting at solutions.
If you feed it back its own suggestion, it's not so guaranteed to work.
The junior dev who has agents write a program for them may not understand the code well enough to really touch it at all. They will make the wrong suggestions to fix problems caused by inexperienced assumptions, and will make the problems worse.
i.e. it's because they're junior and not qualified to manage anybody yet.
The LLMs are being thought of as something to replace juniors, not to assist them. It makes sense to me.
The fact that AI can actually handle the former case is, to be clear, awesome; but not surprising. Low-code tools have been doing it for years. Retool, even back in 2018, was way more productive than any LLMs I've seen today, at the things Retool could do. But its relative skill at these things, to me, does not conclusively determine that it is on the path toward being able to autonomously handle the latter.
The English language is simply a less formal programming language. Its informality means it requires less skill to master, but also means it may require more volume to achieve desired outcomes. At some level of granularity, it is necessarily the case that programming in English begins to look like programming in JavaScript; just with capital letters, exclamation points, and threats to fire the AI instead of asserts and conditionals. Are we really saving time, and thus generating higher levels of productivity? Or, is its true benefit that it enables foray into languages and domains you might be unfamiliar with; unlocking software development for a wider range of people who couldn't muster it before? It's probably a bit of both.
Dario Amodei says we'll have the first billion dollar solo-company by 2026 [1]. I lean toward this not happening. I would put money on even $100M not happening, barring some level of hyperinflation which changes our established understanding of what a dollar even is. But, here's what I will say: hitting levels of revenue like this, with a human count so low that the input of the AI has to overwhelm the input from the humans, is the only way to prove to me that, actually, these things might be more than freakin awesome tools. Blog posts from people making greenfield apps named after a furrsona DJ aren't moving the needle for me on this issue.
[1] https://www.inc.com/ben-sherry/anthropic-ceo-dario-amodei-pr...
>My four-document system? Spaghetti that happened to land in a pattern I could recognize. Tomorrow it might slide off the wall. That's fine. I'll throw more spaghetti.
Amazing that in July 2025 people still think you can scale development this way.
To be fair, a lot of commercial software clearly hasn't, either.
Whereas AI is as big as life, eukaryotes, multi-cellularity, human intelligence, agriculture and the industrial revolution. It will certainly change everything (and make humans go extinct unless we are very careful).
But does Claude's code work? Does it work to the level where you'd depend on it yourself; where you'd bill customers for it; where you'd put your reputation behind it?
I say no. And it's because I use Claude. Two events changed how I use Claude: now it's an advisor, and I mostly type the code myself. Because I don't trust it.
First, I caught it copying one of my TypeScript interfaces and modifying it. So now we have User which looks like my actual user, that I defined, and UserAgain which does not, and which Claude is now using and proudly proclaiming that my type checks all pass. Well of course they do!
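A minimal TypeScript sketch of how a near-copy of an interface can make the type checks pass. `User` and `UserAgain` are the names from the comment above; their fields and the `emailDomain` helper are assumptions invented for illustration.

```typescript
// The type the developer actually defined (fields are illustrative).
interface User {
  id: number;
  email: string;
}

// A hypothetical near-copy of the kind described above: `email` is now optional.
interface UserAgain {
  id: number;
  email?: string;
}

// Written against the copy, so the compiler never holds it to `User`'s contract.
function emailDomain(user: UserAgain): string {
  return user.email?.split("@")[1] ?? "unknown";
}

const real: User = { id: 1, email: "a@example.com" };
const incomplete: UserAgain = { id: 2 }; // would be rejected as a `User`

console.log(emailDomain(real));       // "example.com"
console.log(emailDomain(incomplete)); // "unknown"
```

Everything compiles, but only because the function was retargeted at the looser type. The checks pass without saying anything about the real `User`.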
Second, I was told that the best way to catch this sort of thing is to get it to write tests. So it wrote some tests, and they failed, and it kept going, and it eventually wrote an un-failable test. The test mocked itself.
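A minimal TypeScript sketch of the difference between a test that mocks the thing it is supposed to check and one that exercises the real code. `applyDiscount` and both test functions are hypothetical, and the bug is planted on purpose.

```typescript
// Hypothetical function under test, with a deliberate bug:
// it adds the discount instead of subtracting it.
function applyDiscount(price: number, percent: number): number {
  return price + (price * percent) / 100;
}

// Un-failable: the "test" swaps in a stub and then checks the stub's own answer.
function selfMockingTest(): boolean {
  const mocked = (_price: number, _percent: number): number => 90;
  return mocked(100, 10) === 90;
}

// Meaningful: calls the real function and compares against a value worked out by hand.
function realTest(): boolean {
  return applyDiscount(100, 10) === 90;
}

console.log(selfMockingTest()); // true, regardless of what applyDiscount does
console.log(realTest());        // false, because the bug is actually exercised
```

A quick review question for any generated test: if the function under test were deleted or broken, would this test notice?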
So, sure, enjoy time with your kids. Please don't ask me to use your app for anything important.
"That's OK, I found a jetpack."
I've said it before and I'll say it again: there likely isn't a "golden workflow" or "generally accepted best practices" on how to code with AI. The new models and agentic capabilities seem to be very powerful, and they will conform to whatever methodologies you currently use with whatever project you're working on, but that may still be under-utilizing what they are truly capable of.
A true optimum may even require you to adjust the way you work, down to structuring your code and projects differently. In fact you may need to figure out different approaches based on the project, the language, the coding style, the model, the specific task at hand, even your personality. I am convinced this aspect is what's causing the bimodal nature of AI coding discussions: people who stuck at it and figured it out, or just got lucky with the right mix of model / project / task / methodology, are amazed at their newfound superpowers -- whereas people who didn't, are befuddled by the hype.
This may seem like a lot of work, but it makes sense if you stop thinking of this as just a tool and more like working with a new team-mate.
I’m starting to believe that’s not necessarily true. And if some study finds out later that stuff built slowly by hand is actually better in every way except time-to-market, then it means AI is not really a competitive edge, it’s just a Quality of Life improvement that allows software engineers to be even lazier. And at future price points of $200, $400, even $1000 a month per head, that becomes a hard sell for most companies. Might be easier to have engineers pay for their own AI if they want to be lazy. And of course whether they use AI or not, you can still measure productivity under the assumption that every engineer does…
I've experienced the exact issues you've described. I've also drastically reduced these issues via good instructions and automated followup passes that eliminate code that was created from ignored instructions.
It all feels like a hack, but the more I choose to trust it and treat it like it's the correct path and that it's just a different set of problems that need to be solved, the more success I have.
1. I usually just pull up the docs for the CSS framework, give it a quick look over to know what it offers and the nomenclature and then keep it open for all the code examples.
2. I've serialized json in enough languages to know the pain points, so what I usually do is locate the module/library responsible for that in that language. And then give the docs/code sample a quick lookover to know where things are.
3. With nice IDEs, you launch the debugger and you have a nice stack frame to go through. In languages with not so great tooling, you hope for a trace.
It's not that your workflow won't yield results. But I prefer to be able to answer 5 successive why's about the code I'm working on. With PRs taking hours and days to be merged, it's not like I'm in a hurry.
Photoshop is quite nice for an expert tool. Blender is the complicated one where you have to get a full-sized keyboard and know a handful of shortcuts to have a normal pace.
> The former could have a lot of value for casual amateur use; it's not going to replace the precise, high-functionality tool for professional use.
I was just discussing that in another thread. Most expert work is routine, and experts build workflows, checklists, and processes to get it done with the minimum cognitive load. And for that you need reliability. Their focus is on the high-leverage decision points. Take any digital artist's Photoshop settings: they will have a specific layout, a few document templates, their tweaked brushes. And most importantly, they know the shortcuts, because clicking on the tiny icons takes too much time.
The trick is not about being able to compute, it's knowing the formula and just give the parameters to a computer that will do the menial work. It's also not about generating a formula that may or may not be what we want.
I think you're spot on. It was once necessary to acquire knowledge in order to acquire productivity. This made knowledge valuable and worth attaining. Now, with LLMs, we can skip the middle man and go straight to the acquisition of productivity. I'd call it the democratisation of knowledge, but it's something more than that: knowledge just isn't needed anymore.
Are people implementing stuff from start to finish in one go? For me, it's always been iterative. Start from scaffolding, get one thing right, then the next. It's like drawing. You start with a few shapes, then connect them. After that you sketch on top, then do the line art, and then you finish with values (this step is also iterative refinement). With each step, you become more certain of what you want to do, while also investing the minimum possible effort.
So for me coding is more about refactoring. I always type the minimal amount of code to get something to work. And it usually means shortcuts, which I annotate with a TODO comment. Then I iterate, making the code more flexible and cleaner.
Do you think humanity will be better off because we'll have humans who don't know how to do anything themselves, but they're really good at asking the magical AI to do it for them?
What a sad future we're going to have.
I would think that's the process too, but according to the article the dude is almost completely hands off:
> You come back to ten thousand lines of code. You spend 5 minutes reading. One sentence of feedback. Another ten thousand lines appear while you're making lunch.
You can't humanly review 10 thousand lines of code in 5 minutes. This is either complete bullshit or it really writes flawless code for them and never makes any mistakes.
This is interesting. Does Claude have a memory? Is this just a limit on the number of input tokens? It sounds like a fundamental misappropriation of cause, but maybe I just don't understand the latest whizbang feature of Claude. Can anyone clarify?
There's no way I could hire someone who'd want me hovering over their shoulder like this.
This sounds tedious I guess, but it's actually quite zen, and faster than solo coding most of the time. It gives me a ton of confidence to try new things and new libraries, because I can ask it to explain why it's suggesting the changes or for an overview of an approach. At no point am I not aware of what it's doing. This isn't even close to what people think of as vibe coding. It's very involved.
I'm really looking forward to increasing context sizes. Sometimes it can spin its wheels during a refactor and want to start undoing changes it made earlier in the process, and I have to hard correct it. Even twice the context size will be a game changer for me.
A million monkeys randomly typing could actually complete that task as well.
I think if I was just starting out learning to program, I would find something fun to build and pick a very correct, typed, and compiled language like Haskell or Purescript or Elm, and have the agent explain what it's doing and why, and go very slowly.
It sounds like Claude Code is the best UX right now but I don’t want to be locked into a Claude subscription, I want to bring my own key and tap into whatever provider I want.
I didn't see internationalization and localization, but I don't see anything fundamental about those that would be different.
Security, on the other hand, does feel like a different beast.
Namely, you don’t deserve to be paid for working 8 hours if you only worked for 30 minutes over an eight hour period.
I don’t care if you personally agree with that or not, the reality is that businesses believe it.
That means, sooner or later there will be a great rebalancing where people will be required to do significantly more work; probably the work of other developers who will be fired.
It’s fun for home projects; but the somewhat depressing reality is that there is no chance in hell this (sitting around for 7 hours a day reading reddit while Claude codes) will fly in corporate environments; instead, you’re looking at mass layoffs.
So. Enjoy it while you can folks.
In the future you’ll be spending that 8 hours struggling to juggle the context and review 20 different tasks, not playing with your kids.
You won’t have time to do it; it’s naive and ridiculous to expect that businesses will just let people goof off for 7 hours a day.
Regardless of the output they generate.
Anyone who doesn’t believe this has never had to manage budgets and staff.
It’s the “AI utopia” people making vague hand wavey motions about post-scarcity.
this already started in 2022-23 with all the layoffs and "downsizing"
Correct - covered here in my talk - https://ghuntley.com/six-month-recap/
AI isn't going to take anyone's jobs. Your co-worker who knows how to use multiple agents at a time and automates their job function will.
LLMs have changed me. I want to go outside while they are working and I am jealous of all the young engineers that won’t lose the years I did sitting in front of a screen for 12 hours a day while sometimes making no progress on connecting two black boxes.
Soylent Green is a lot closer to the reality of capitalism.
Maybe walking away is a better choice hah.
I'm using it regardless. I've just learnt to deal with these and keep an eye on them. When it creates a duplicate interface I roll back to an earlier prompt and am more explicit that this type already exists.
I try not to argue about whether something it does is wrong or right. There is no point. I will simply roll back and try with another prompt. Claude is not a human.
> And of course you'll have people replying to this trying to make the case for why AI coding is already a thing, when in reality those posts are once again going to be carbon copies of similar comments from the Bitcoin days
What is the actual argument here? Anyone claiming that AI has been useful for them is a lying shill?
Give it two years.
Those young engineers, in 10 years, won't be able to fix what the LLM gave them, because they have not learned anything about programming.
They have all learned how to micromanage an LLM instead.
[1] https://www.vox.com/technology/23882304/gen-z-vs-boomers-sca...
Wish I had your confidence in this. I can easily see how this nullifies my hard-earned experience and basically puts me in the same spot as a more mid-level or even junior engineer.
I have heard a version of this plenty of times, and it was never correct. In the early 90s it was the "electronics" people that were saying "I come from an electronics background, these young'uns will look at a computer and don't know what to do if it breaks". Well, bob, we did, the whole field moved to color coded anti-stupid design, and we figured it out.
Then I heard it about IDEs. Oh, you young people are so spoiled with your IDEs and whatnot, real men code in a text editor.
Then it was about frameworks. BBbbut what if your framework breaks, what do you do then, if you don't know the underlying whatever?
... same old, same old.
https://github.com/terhechte/CCORP
Works fine on macOS / Linux, untested on Windows. Still working on improving it.
Every single finance person uses a calculator. How effective do you think a person in any aspect of finance would be if they had never learned what multiplication is? Would they perform their job adequately if they don't know that `X * Y` is `X repeated Y times`?
IOW, if you gave a finance person (accountant, asset manager, whatever) a non-deterministic calculator for multiplication, would you trust the person's output if they never learned what multiplication is?
This is the situation I am asking about; we aren't talking about whether deterministically automating something that the user already knows how to do is valuable, we're talking about whether non-deterministically generating something that the user is unable to do themselves, even if given all the time in the world, is valuable.
All those examples you give are examples of deterministic automation that the user could inspect for accuracy. I'm asking about a near-future where people managing your money have never learned multiplication because "Multiplication has been abstracted away to a tool that gets it right 90% of the time"
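A minimal TypeScript sketch of the point, assuming a made-up `flakyMultiply` tool that is right 90% of the time. The `roll` parameter stands in for the randomness so the example is reproducible. All three function names are hypothetical.

```typescript
// Deterministic reference: X * Y as "X repeated Y times" (Y a non-negative integer).
function multiplyByRepeatedAddition(x: number, y: number): number {
  let total = 0;
  for (let i = 0; i < y; i++) total += x;
  return total;
}

// Hypothetical unreliable tool: correct when `roll` < 0.9, off by one otherwise.
// `roll` is passed in (instead of calling Math.random) to keep the example reproducible.
function flakyMultiply(x: number, y: number, roll: number): number {
  return roll < 0.9 ? x * y : x * y + 1;
}

// Someone who knows what multiplication is can check the tool's answer.
function looksRight(x: number, y: number, answer: number): boolean {
  return answer === multiplyByRepeatedAddition(x, y);
}

console.log(looksRight(7, 6, flakyMultiply(7, 6, 0.5)));  // true: the tool returned 42
console.log(looksRight(7, 6, flakyMultiply(7, 6, 0.95))); // false: the tool returned 43
```

If the errors were independent, ten chained calls to such a tool would all be correct only about 35% of the time (0.9^10 ≈ 0.35), and a user who never learned the reference method has no `looksRight` to fall back on.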
Why not? Not like companies have to actually do anything beyond marketing to get insane valuations… remember Theranos?
Yup, my mom used to say "you need to be able to do it without a calculator, because in life you won't always have a calculator with you"... Well, guess what mom :)
But on a serious note, what I'm trying to say (perhaps poorly worded) is that this is a typical thing older generations say about younger ones. They'll be lost without x and y. They won't be able to do x because they haven't learned about y. They need to go through the tough stuff we went through, otherwise they'll be spoiled brats.
And that's always been wrong, on many levels. The younger generations always made it work. Just like we did. And just like the ones before us did.
There's this thing that parents often do, trying to prepare their children for the things they think will be relevant, from the parent's perspective. And that often backfires, because uhhh the times are achanging. Or something. You get what I'm trying to say. It's a fallacy to presuppose that you know what's coming, or that somehow an entire generation won't figure things out if they have a shortcut to x and y. They'll be fine. We're talking about millions / billions of people eventually. They'll figure it out.
I've found LLMs add more standard protections to API endpoints, database constraints, etc. than I would on a lazy Saturday.
This is exactly what bothers me about the present moment. Not that the pride of craftsmanship is everything, but dialing it down to zero with extreme pressure to stay that way is a bit sad.
But we’ve clearly gone through this with other mediums before, perhaps someday people will appreciate hand written code the way we appreciate hand carved wood. Or perhaps we were all wasting time in this weird middle ground in the march of progress. I guess we’ll find out in 5-15 years.
If it's an easy skill to learn, with few consequences if you get it wrong, especially for small-scale apps, why pay for it? Don't know why seniors (of which I'm one) think they are immune to this.
The decision making parts of people's brains will atrophy. It will be interesting to see what will happen.
The short version is that they mistake confidence for competence, and the younger consumers are more confident poking around because they grew up with superior idiot-proofing. The better results are because they dare to fiddle until it works, not because they know what's wrong.
Many years ago, in another millennium, before I even went to university but still was an apprentice (the German system, in a large factory), I wrote my first professional software, in assembler. I got stuck on a hard part. Fortunately there was another quite intelligent apprentice colleague with me (now a hard-science Ph.D.), and I delegated that task to him.
He still needed an explanation since he didn't have any of my context, so I bit the bullet and explained the task to him as well as I could. When I was done I noticed that I had just created exactly the algorithm that I needed. I just wrote it down easily myself in less than half an hour after that.
The ability seems like pure magic. I know that there are others who have it very easy now building even complex software with AI and delivering project after project to clients at record speed with no less quality than before. But the majority of devs, who won’t even believe that it’s remotely possible to do so, are also not helping this style of building/programming mature.
I wouldn’t even call it vibe coding anymore. I think the term hurts what it actually is. For me it’s just a huge force multiplier, maybe 10-20x of my ability to deliver with my own knowledge and skills on a web dev basis.
The tarpit of AI discussion is that everybody assumes that their local perspective is globally applicable. It is not.
I’ll try my hand at some guidelines: the prime directive would be “use the right AI tool for the right task”. Followed by “use a statically typed language”. Followed by “express yourself precisely in English. You need to be able to write like a good technical lead and a good product manager.”
With those out of the way:
Completions work when you’re doing lots of rote moderately difficult work within established patterns. Otherwise, turn them off, they’ll get in the way. When they do work, their entire point is to extend your stamina.
Coding agents work when, at worst, a moderately novel vertical needs implementation. New architecture and patterns need to be described exhaustively with accurate technical language. Split up the agent's work into the same sort of chunks that you would do between coffee breaks. Understand that while the agent will make you 5x faster, you’ll still need to put in real work. Get it right the first time. Misuse the agent and straightening out the mistakes will cost more time than if you hadn’t used the agent at all.
If novelty or complexity is high, use an advanced reasoning model as interactive documentation, a sparring partner, and then write the code by hand. Then ask the reasoning model to critique your code viciously. Have the reasoning model configured for this role beforehand.
These things together have added up to the biggest force multiplier I’ve encountered in my career.
I’m very much open to other heuristics.
I’ve always bemoaned my distractibility as an impediment to deep expertise, but at least it taught me to write well, for all kinds of audiences.
Boy do I feel lucky now.
My ex teaches UX. We were talking about AI in academia last week. She said that she requires students to not use AI on their first assignment but on subsequent ones they are permitted to.
I find myself having to spend more time guiding the model in the right direction and fixing its mistakes than I would’ve spent building it all myself.
Every time I read one of these stories I feel like maybe you guys have models from 2035, because the ones we have today seem to be useless outside of creating greenfield, simple React apps that just sort of work.
One thing I’ll say is that it’s been a real time saver for debugging. For coding, a huge waste of time. Even for tasks that are menial, repetitive, require no thinking etc. I find that it’s mostly crap.
- when I ask models to do defined tasks that I know how to do, where I can tell them the method but can't remember the details offhand, and then I check the answers, things work.
- when I attempt to specify things that I don't understand fully the model creates rubbish 7 out of 10 times, and those episodes are irretrievable. About 30% of the time I get a hint of what I should do and can make some progress.
I can tell you that this claim is where a lot of engineers are getting hung up. People keep saying that they are 10, 20 and sometimes even 100x more productive but it's this hyperbole that is harming that building style more than anything.
If anyone could get 10 to 20 years' worth of work done in 1 year, it would be so obvious that you wouldn't even have to tell anyone. Everyone would just see how much work you got done and be like "How did you do 2 decades' worth of work this year?!"
I think it’s more nuanced than that.
Not every project one does will be or should be considered art or a source of joy and pride.
The boring CRUD apps that put the “bread on the table” are just that, a means to an end, they will not be your main source of pride or fulfillment. But somewhere in between there will be projects where you can put all your heart in and turn off that LLM.
Think of the countless boring wedding playlists a DJ has to do or the boring “give me the cheapest” single family homes an architect has to design.
How buggy is it? How long would it have taken to build something similar by hand?
I work in a large corpo ecosystem of products across languages that talk to a mess of micro and not so micro services.
AI tools are rarely useful out of the box in this context. Mostly because they can't fit the ecosystem into their context. I think I would need 10 agents or more for the task.
We have good documentation, but just fitting the documentation into context alongside a microservice is a tight fit. Most services would need one agent for the code (and even then it'd only fit 10% in context), and one for the docs.
Trying to use them without sufficient context, or trying to cram the right 10% into context, takes more effort than just coding the feature, and produces worse results with the worst kind of bugs: subtle ones born from incorrect assumptions.
Does this mean basically "Opus"? What goes into "Have the reasoning model configured for this role beforehand."?
You get way farther when you have the AI drop in Tailwind templates or Shadcn for you and then just let it use those components. There is so much software outside that web domain though.
A lot of people just stop working on their AI projects because they don't realize how much work it's going to take to get the AI to do exactly what they want in the way that they want, and that it's basically going to be either you accept some sort of randomized variant of what you're thinking of, or you get a thing that doesn't work at all.
Perhaps that is true, but without any examples I was immediately suspicious of this line.
> Either way, we're in this delicious middle ground where nobody can pretend expertise because the whole thing keeps changing under our feet.
Upon reflection this does in fact remind me of the early days of rocketry when we were just reaching into the upper atmosphere and then orbit. Wild things were being tried because there were not yet any handrails. Exploding a huge tank of water in the ionosphere just because, launching giant mylar balloons into orbit to try and bounce radar signals off of them, etc.
Right now it's all monetization at gravity. As if companies are ready to pour software developer salaries into tools.
I imagine beginners will not have GPU-rich environments, and AI will not reach the mainstream as traditional development did unless something happens, idk what.
Right now, seniors love the complexity and entry barrier to it, so they can occupy the top of the food chain. History has proven that that does not last long.
In some scenarios, such as Airtable, AI is replacing docs and customer support, eliminating the learning curve.
Do people read the code? Or just test if it works and push?
To me, code is like a map that has to be clear enough so other humans can read it to navigate the territory (codebase). Even if it's just two – me and an AI agent – working on the codebase, it's not much different from "me and another programmer". We both want to have an up-to-date mental model of how exactly the code is structured, how it works, and why.
Using AI for coding and not reading the code sounds more like ceasing to be a developer and promoting yourself to manager of AI programmers, one who trusts their craft completely.
I agree with you! I'm not saying that I like it; this is the perfect example of turbo capitalism applied to innovation.
I also like to code and to build software, and the joy that comes from the act of creation. Only, I'm quite sure it's not going to last.
To continue this thought - what could have been different in the last 10-15 years to encourage junior developers to listen more, where they might not have, to those who were slightly ahead of them?
If you asked me months ago whether "prompt engineering" was a skill I'd have said absolutely not, it's no different than using stack overflow and writing tickets, but having watched otherwise skilled devs flounder I might have to admit there is some sort of skill needed.
IMO the dichotomy should not be deterministic/stochastic, but proved/unproved reliable. gcc has been shown reliable, for instance, so I don't need to know whether it was built by deterministic (clever engineers) or stochastic (typewriting monkeys) processes. I'm certain the former are more efficient, but this is ultimately not what makes the tool valuable.
As a bit of an artificial example, there's stochastic processes that can be proved to converge to a desired result (say, a stochastic gradient descent, or Monte-Carlo integration), in the same way that deterministic methods can (say a classic gradient descent or quadrature rules).
In practical cases, the only proof that matters is empirical. I write (deterministic) mathematical algorithms for a living, yet they very rarely come out correct on first iteration. The fact there is a mathematical proof that a certain algorithm yields certain results lets me arrive at a working program much faster than if I left it to typewriting monkeys, but it is ultimately not what guarantees a valid program. I could just as well, given enough time, let a random text file generator write the programs, and do the same testing I do currently, it would just be very inefficient (an understatement).
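To make the stochastic-vs-deterministic point concrete, here is a minimal sketch (my own illustration, not anything from a real codebase) that integrates sin(x) over [0, pi] two ways: Monte Carlo sampling and the trapezoid rule. The function names, sample counts, and the fixed seed are all assumptions chosen for the demo. Both land near the known answer of 2, and in both cases what I actually trust is the empirical check against that known value.

```python
import math
import random


def monte_carlo_integral(func, a, b, n, rng):
    """Stochastic estimate: average func at n uniform random points, scale by (b - a)."""
    total = sum(func(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


def trapezoid_integral(func, a, b, n):
    """Deterministic estimate: composite trapezoid rule with n equal subintervals."""
    h = (b - a) / n
    inner = sum(func(a + i * h) for i in range(1, n))
    return h * (0.5 * func(a) + inner + 0.5 * func(b))


EXACT = 2.0  # the integral of sin(x) over [0, pi]

rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
mc_estimate = monte_carlo_integral(math.sin, 0.0, math.pi, 200_000, rng)
trap_estimate = trapezoid_integral(math.sin, 0.0, math.pi, 1_000)

print(f"monte carlo: {mc_estimate:.4f} (error {abs(mc_estimate - EXACT):.2e})")
print(f"trapezoid:   {trap_estimate:.6f} (error {abs(trap_estimate - EXACT):.2e})")
```

The trapezoid rule gets much closer with far fewer evaluations, which is the efficiency point. But neither result is "valid" because of how it was produced; it's valid because it was checked.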
https://bsky.app/profile/stefanmunz.bsky.social
https://www.meetup.com/de-DE/de-DE/agentic-coding-meetup-ham...
For example:
If I tell it to not use X, it will do X.
When I point it out, it fixes it.
Then a few prompts later, it will use X again.
Another issue is the hallucinations. Even if you provide it the entire schema (I did this for a toy app I was working with), it kept on making up "columns" that don't exist. My Invoice model has no STATUS column, why do you keep assuming it's there in the code?
I found them useful for generating the initial version of a new simple feature, but they are not very good for making changes to existing ones.
I've tried many models. Sonnet is the best one at coding, 3.7 at least; I am not impressed with 4.
However, what we are maybe not considering enough is that general AI adoption could and almost certainly will affect the standards for cybersecurity as well. If everyone uses AI and everyone gets used to its quirks and mistakes and is also forgiving about someone else using it since they themselves use it too, the standards for robust and secure systems could decrease to adjust to that. Now, your services as a cybersecurity consultant are no longer needed as much, as whatever company would need them can easily point to all the other companies also caring less and not doing anything about the security issues introduced by the AI that everyone uses. The legal/regulation body would also have to adjust to this, as it is not possible to enforce certain standards if no one can adhere to them.
If anyone cared enough to do anything, they would be burning everything down already
It’s a lot of impotent rage because the only virtue people have is consumption, they don’t actually believe in anything. The ones who do believe in fairy tales are part of a dwindling population (religion) that is rightfully crashing.
Welcome to the wasteland of the real
I saw this as a chance to embrace AI, after a while of exploring I found Claude Code, and ended up with a pretty solid workflow.
But I say this as someone who has worked with distributed systems / data engineering for almost 2 decades, and spend most of my time reviewing PRs and writing specs anyway.
The trick is to embrace AI on all levels: learn how to use prompts. learn how to use system prompts. learn how to use AI to optimize these prompts. learn how to first write a spec, and use a second AI (“adversarial critic”) to poke holes in that plan. find incompletenesses. delegate the implementation to a cheaper model. learn how to teach AI how to debug problems properly, rather than trying to one-shot fixes in the hope it fixes things. etc
It’s an entirely different way of working.
I think juniors can learn this as well, but they need to work within very well-defined frameworks, and this probably needs to be part of the college curriculum as well.
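Roughly, the spec / adversarial critic / cheaper implementer loop above looks like this. It's a sketch under heavy assumptions: `Model` is a hypothetical "prompt in, text out" callable rather than any real vendor API, the prompts are placeholders, and the "OK" sentinel for "no more holes found" is something I made up for the example. The stubs at the bottom stand in for real models so the control flow can be run as-is.

```python
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def spec_critic_implement(task: str, planner: Model, critic: Model,
                          implementer: Model, rounds: int = 2) -> str:
    """Draft a spec, let a second model poke holes in it, then delegate the build."""
    spec = planner(f"Write a spec for: {task}")
    for _ in range(rounds):
        critique = critic(f"Find gaps and holes in this spec:\n{spec}")
        if critique.strip() == "OK":  # assumed sentinel meaning "no objections left"
            break
        spec = planner(f"Revise the spec.\nSpec:\n{spec}\nCritique:\n{critique}")
    # The implementer is where a cheaper model would be plugged in.
    return implementer(f"Implement exactly this spec:\n{spec}")


# Stub models (assumptions for the demo only) so the loop runs without any API.
def stub_planner(prompt: str) -> str:
    return "SPEC v2" if prompt.startswith("Revise") else "SPEC v1"


def stub_critic(prompt: str) -> str:
    return "OK" if "SPEC v2" in prompt else "missing error handling"


def stub_implementer(prompt: str) -> str:
    return "code for " + prompt.splitlines()[-1]


result = spec_critic_implement("toy invoice app", stub_planner, stub_critic, stub_implementer)
print(result)
```

The point of the structure is that the critic never writes the spec and the implementer never changes it, so each model has one narrow job.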
The problem is not having any evidence or basis on which to compare claims. Alchemists claimed for centuries to synthesize gold; if only they'd had video, we could’ve ruled that out fast.
Like there's a mindset where you just want to get the job done: ok cool, just let the LLM do it for me (and it's not perfect atm), and I'll stitch everything together, fix the small stuff it gets wrong, etc. It saves a lot of time, and sure, I might learn something in the process as well. And then the other way of working is the traditional way: you google, look things up on Stack Overflow, read documentation, sit down and try to find out what you need and understand the problem, code a solution iteratively, and eventually you get it right and you get a learning experience out of it. The downside is this can take 100 years, or at the very least much longer than using an LLM in general. And you could argue that if you prompt the LLM in a certain way, it would be equivalent to doing all of this but faster, without taking away from your learning.
For seniors it might be another story. They already have the critical thinking, experience and creativity, through years of training, so they don't lose as much compared to a junior. For them it will be closer to treating this as a smarter tool than Google.
Personally, I look at it like you now have a smarter tool, and a very different one as well. If you use it wisely you can definitely do better than traditional googling and Stack Overflow. It will depend on what you are after, and you should be able to adapt to that need. If you just want the job done, then who cares, let the LLM do it. If you want to learn, you can prompt it in a certain way to achieve that, so it shouldn't be a problem. But this way of working requires a conscious effort in how you use it, and an awareness of the downsides of working with the LLM in a certain way, so that you can change the way you interact with it. In reality I think most people don't go through the hoops of "limiting" the LLM so that they can get a better learning experience.
But also, what is a better learning experience? Perhaps you could argue that being able to see the solution, or a draft of it, can speed up the learning experience, because you have a quicker starting point to build a solution on. I dunno. My only gripe with using LLMs is that deep thinking and creativity can take a dip. You know, back in the day when you stumbled upon a really difficult problem, you had to sit down with it for hours, days, weeks, months until you could solve it. I feel like there are some steps there that are important to internalize, and that LLMs nowadays make you skip.
What would also be interesting to me is to compare a senior who got their training prior to LLMs with a senior who gets their training in the new era of programming with AI, and see what kinds of differences one might find. I would guess that the senior from before the LLM era would be way better at coding by hand in general. But critical thinking and creativity, given that they are both good seniors, maybe shouldn't be too different, honestly. It just depends on how that other senior, who is used to working with LLMs, interacts with them.
Also I don't like how an LLM can sometimes influence your approach to solving something, like perhaps you would have thought of a better or different way of solving a problem if you hadn't first asked the LLM. I think this could be true to a higher degree for juniors than seniors due to the gap in experience. When you are senior, you have sort of seen a lot of things already, so you are aware of a lot of ways to solve something, whereas for a junior that "capability" is more limited.
It used to be you could learn to program with a cheap old computer a majority of families can afford. It might have run slower, but you still had all the same tooling that's found on a professional's computer.
To use LLMs for coding, you either have to pay a third party for compute power (and access to models), or you have to provide it yourself (and use freely available ones). Both are (and IMO will remain) expensive.
I'm afraid this builds a moat around programming that will make it less accessible as a discipline. Kids won't just tinker their way into a programming career as they used to, if it takes asking for mom's credit card from minute 0.
As for HS + college providing a CS education using LLMs, spare me. They already don't do that when all it takes is a computer room with free software on it. And I'm not advocating for public funds to be diverted to LLM providers either.
100% agree AI based dev is at odds with agile. You’re basically going to use the AI to fully rewrite the software over and over until the spec becomes clear which just isn’t very efficient. Plus it doesn’t help that natural language cannot be as clear a spec as code.
I think that if you willfully ignore the development, you might be left in the dust. As you say, it is a force multiplier. Even average programmers can become extremely productive, if they know how to use the AI.
For 3 -- Sure, that can help. But sometimes it is difficult to follow what is going on. Especially if that comes from a library/framework you are unfamiliar with such as AWS.
I've also used it to help with build errors such as "Bar.csproj: Error NU1604 : Warning As Error: Project dependency Foo does not contain an inclusive lower bound. Include a lower bound in the dependency version to ensure consistent restore results." -- That was because it was using a fixed version of the module via the "[1.0]" syntax, but my version of NuGet and/or Rider didn't like that. Once I knew that, switching to the range syntax "[1.0, 1.0]" worked. I was able to understand that from the LLM's response after giving it the error message and the specific `<PackageReference>`.
If the code I was writing was, say, small websites all the time for different clients, I can see it being a big improvement. But iterating on a complex existing platform, I’m not so sure that AI will keep the system designed in a maintainable and good way.
But if your experience is with the same sort of code as me, then I may have to re-evaluate my judgments.
I state things crystal clear in real life on the internets. Seems like most of the time, nobody has any idea what I'm saying. My direct reports too.
Anyway, my point is, if human confusion and lack of clarity is the training set for these things, what do you expect?
———
> Maybe all methodology is just mutually agreed-upon fiction that happens to produce results?
Good news! *All of computer science is this way.* There’s nothing universally fundamental about transistors, or Turing machines, or OOP, or the HTTP protocol. They’re all just abstractions that fit, because they worked.
———
When should I stop learning and start building? My coworker wrote absolutely ATROCIOUS code, alone, for the past decade. But he was BUILDING that whole time. He’s not up to date on modern web practices, but who cared? He built.
Juniors need experience to know if the machine is going in the right direction or guide it. That experience is now nigh impossible to get, nobody has the time for apprentices now. It’ll take some brave management to pave a way forward, we don’t know what it’ll be exactly yet.
You have to know how software gets built and works. You can't just expect to get it right without a decent understanding of software architecture and product design.
This is something that's actually very hard. I'm coming to grips with that slowly, because it's always been part of my process. I'm both a programmer and a graphic designer. It took me a long while to recognize not everyone has spent a great deal of time doing both. Fewer yet decide to learn good software design patterns, study frameworks and open-source projects to understand the problems each of them is solving. It takes a LOT of time. It took me probably 10-15 years just to learn all of this. I've been building software for over 20 years. So it just takes time and that's ok.
The most wonderful thing I see about AI is that it should help people focus on these things. It should free people from getting too far into the weeds and too focused on the code itself. We need more people who can apply critical thinking and design from a bird's eye perspective. We need people who can see the big picture.
I am a bit disillusioned - I find mentoring humans satisfying but I don't get the same satisfaction mentoring AI. I also think it's probably going to backfire by hamstringing the next generation and 'draining the competence' from the current.
I've been around the block a few times on ideas like a B2B/SaaS requirements gathering product that other B2B/SaaS vendors could use to collect detailed, structured requirements from their customers. Something like an open-world Turbo Tax style workflow experience where the user is eventually cornered into providing all of the needed information before the implementation effort begins.
Increasingly I've also just been YOLOing single shot throw-away systems to explore the design space - it is easier to refine the ideas with partially working systems than just abstract prose.
If Sonnet 3.7 is the best you've found, then no, you haven't tried many models. At least not lately.
For coding, I'd suggest Gemini 2.5 Pro, o3-mini-high, or Opus 4. I've heard good things about Grok 4 as well, so if you're OK with that whole scene and the guy who runs it, maybe give it a shot.
If you have already done so and still think Sonnet 3.7 is better than any of them, then the most likely explanation is that you got incredibly lucky with Claude and incredibly unlucky with the others. LLMs aren't parrots, but they are definitely stochastic.
Unfortunately I’ve been around this industry long enough to know that this is not in fact what is going to happen. We will be driven by greedy people with small minds to produce faster rather than build correct systems, and the people who will pay will be users and consumers.
People who "take the time to really understand the code" will rapidly be outcompeted by people who don't. You don't like that, I don't like that, but guess what: nobody cares.
I suppose we'll get over it, eventually, just like last time.
That's not true at all, and hasn't been for a while. When using LLMs to tackle an unfamiliar problem, I always start by asking for a comparative review of possible strategies.
In other words, I don't tell it, "Provide a C++ class that implements a 12-layer ABC model that does XYZ," I ask it, "What ML techniques are considered most effective for tasks similar to XYZ?" and drill down from there. I very frequently see answers like, "That's not a good fit for your requirements for reasons 1, 2, and 3. Consider UVW instead." Usually it's good advice.
At the same time I will typically carry on the same conversation with other competing models, and that can really help avoid wasting time on faulty assumptions and terrible ideas.
I would argue we still need the knowledge: the principles aren't changing, and they are needed to be truly productive in certain things. But the application of those principles _is_ changing.
And I don't think there's anything to get over about them. They are useful but people elevate their significance too much over what they actually are.
As for "autocorrect," let us know when your "autocorrect" takes gold at the International Math Olympiad, with or without steroids.
I want to be wrong (it will affect me and my family personally) - but there is a reason every AI proponent talks about coding and making "coding redundant". For most jobs/industries software is a complement (e.g. Product Owner, BA, etc) unless that is your main skill, in which case it is the main service you are selling. Most roles want to turn software into a commodity, or the typical business PM word "resource", that they can acquire as they need - the dream of most business roles (e.g. Project Managers, BAs, etc). It sadly also seems to be the low hanging fruit of LLMs; that doesn't mean there aren't other aspects to the job of course, but coding is becoming "less special" with these technologies, especially with common tech and use cases.
Sure I have to be sure what I'm committing and running is good, especially in critical domains. The cheap cost of iteration before the actual commit is IMO the one reason why LLMs are disruptive in software and other "generative" domains in the digital world. Conversely, real-time requirements, software that needs to be relied on (e.g. a life support system?), things that post opinions in my name online etc will probably, even if written by an LLM, need someone accountable who verifies the output.
Again as per many other posts "I want to be wrong" given I'm a senior in my career and would find it hard to change now given age. I don't like how our career is concentrating into the big AI labs/companies rather than our own intelligence/creativity. But rationally it's hard to see how software continues to be the same career going forward, and if I don't adapt I might die. I will most likely going forward, similar to what I do with my current team, just define and verify.
...and today, Nvidia ships self-immolating graphics cards because nobody wanted to figure out how to design a safe electric connector.
> Oh, you young people are so spoiled with your IDEs and whatnot, real men code in a text editor.
...and today, a lot of so-called programmers are trapped in AbstractHellFactorySingletonFactories that they cannot and never will understand, because generations of code monkeys have abused IDE assistance to dig themselves deeper into their hole.
And as a user, you'll know, because the software they write is garbage and never works reliably.
> Then it was about frameworks. BBbbut what if your framework breaks, what do you do then, if you don't know the underlying whatever?
Going by software like Teams, or Slack: They just ignore it, because consumers can't fight back against the enshittification of increasingly useless software nobody understands.
Developers throwing huge amounts of money (in cloud resources) at performance problems that would’ve been prevented if they had some understanding of how their tech stack actually worked.
> And eventually the actual value of such things is the extensibility and integrations with various things like corporate SAML etc.
I could make a Dropbox clone by hand in a weekend. I could make one with AI in a few hours. Neither will be valuable like Dropbox because the value is the hard stuff that I have yet to see proof AI can do.
Reason 1: Because any company with one individual who leverages AI to achieve a billion dollar valuation would trivially & obviously be more valuable and be able to achieve more if they had two people leveraging AI. And, at a billion dollar valuation; why not pay that extra salary? Why not add a third? There's a huge difference in potential output by building with a founder team of 2 or 3 people versus 1; with not that much difference in cost.
Reason 2: The most capital efficient valuations in history are B2C companies (Instagram & WhatsApp are maybe the two best examples) during the VC boom era of the 2010s. B2C is naturally very capital efficient; you can do inbound marketing, no sales teams, just build. But, B2C success stories are more-and-more rare; the name of the game in the 2020s era has generally been B2B. Cursor might be the fastest company to reach $500M in revenue, and it got there on B2B, Notion isn't building its AI tools to sell to its consumer customers, etc. B2B is a lot harder; it requires outbound marketing & sales, as well as customer-led product development that usually requires interacting with people. AI will absorb some of those roles, but to suggest it can whittle down to one person per billion dollars in valuation in a year feels too accelerated to me. The world is complex, old, and crusty.
Just look at the YC batches for 2025 [1]: Of the 375 companies in the three batches, 20 are Consumer tech (~5%).
Reason 3: Weaker, but something I think about: AI is weirdly non-differentiating/egalitarian. You see some people try to differentiate it with crazy prompts or context engineering, but then next month Cursor ships an update and suddenly those aren't differentiating anymore. If you're an investor in a single-person unicorn-moonshot, you have to ask: if that much can be automated through off-the-shelf systems, why can't some other company just come in and do the same thing you're doing? My feeling is that this concern would lead to lower revenue multiples on the valuation, which just makes it that much harder to hit unicorn status.
[1] https://www.ycombinator.com/companies?batch=Summer%202025&ba...
None of that is actually what makes my favorite tools work. It’s usually some nerds that never stopped using C/C++ and really know hardware.
The data wall idea of needing 10 more internets is like the peak oil theory of the 2000s, imho. I don’t think 10 more internets are required, and all those tasks like proper security and i18n, none of these require creativity of any kind. These are ideal candidates for LLMs to solve quite soon.
But to talk about the data wall more: Current SOTA LLMs are trained on circa 30-50T tokens. First, that is not a full internet’s worth; it is estimated that Meta owns more like 200T tokens of user data etc. Second, the work in optimizing for data efficiency is just beginning.
Which means clearly we need to feed video of the dancefloor to a vision model and output MIDI tokens!
Being able to write code in a programming language is a feature, not a flaw. If we had always had to program in natural language, the precision and unambiguity of programming languages would be an eagerly welcomed revolution.
[1] https://cacm.acm.org/opinion/on-program-synthesis-and-large-...
They usually don't write this kind of blog post anymore after they tried to vibe-code an extension or modification onto their vibe-coded application.
There was a good one on HN a while ago, though, of someone who decided to actually take a look at their vibe-coded codebase, finding an unbelievable mess of inconsistencies and redundancies.
People who just want to put something together and hate programming can be fine with this. Anyone who wants to build something clean that actually lasts, won't.