[1] https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
[2] https://futurism.com/companies-fixing-ai-replacement-mistake...
So indeed, IF you are in that case (many years on the same project, with multiple years of experience) then it is not useful; otherwise it might be. This means it might be useful for juniors and for experienced devs who are switching projects. It is a tool like any other: if you have a workflow that you optimized through years of usage, it won't help.
You are welcome to your point of view, but for me, while one agent is finding an obscure bug, I have another agent optimizing or refactoring while I work on something else. It's hard to believe I'm deluded and am actually spending more time on each task.
I think the research does highlight that training is important. I don't throw agents at devs and expect them to be productive.
The challenge with the bubble/not bubble framing is the question of long term value.
If the labs stopped spending money today, they would recoup their costs. Quickly.
There are possible risks (could prices go to zero because of a loss leader?), but I think anthropic and OpenAI are both sufficiently differentiated that they would be profitable/extremely successful companies by all accounts if they stopped spending today.
So the question is: at what point does any of this stop being true?
It makes me perhaps a little sad to say that "I'm showing my age" by bringing up the .com boom/bust, but this feels exactly the same. The late 90s/early 00s were the dawn of the consumer Internet, and all of that tech vastly changed global society and brought you companies like Google and Amazon. It also brought you Pets.com, Webvan, and the bajillion other companies chronicled in "Fucked Company".
You mention Anthropic, which I think is in as good a position as any to be one of the winners. I'm much less convinced about tons of the others. Look at Cursor - they were a first-moving leader, but I know tons of people (myself included) who have cancelled their subscription because there are now better options.
If that is the case, at some point the music is going to stop, and they will either perish or have to crank up their subscription costs.
In other words: it might be useful for people who don't understand the generated code well enough to know that it's incorrect or unmaintainable.
Are those things created by Claude actually making you that much in real money every month? Because the amount of money it would cost to pay someone to create something, and the value that something brings to you once it's made, are largely unrelated.
I use Claude Code exclusively for the initial version of all new features, then I review and iterate. With the Max plan I can have many of these loops going concurrently in git worktrees. I even built a little script to make the workflow better: http://github.com/jarredkenny/cf
As I said above, I don’t think a single AI company is remotely in the black yet. They are driven by speculation and investment, and they need to figure out real quick how they’re going to survive when that money dries up. People are not going to fork out $24k a year for these tools. I don’t think they’ll spend even $10k. People scoff at paying $70+ for internet, a thing we all use basically all the time.
I have found it rather odd that they have targeted individual consumers for the most part. These all seem like enterprise solutions that need to charge large sums and target large companies tbh. My guess is a lot of them think it will get cheaper and easier to provide the same level of service and that they won’t have to make such dramatic increases in their pricing. Time will tell, but I’m skeptical
I stopped writing code by hand almost entirely, and my output (measured in landed PRs) is up 10x.
And when I write code myself then it’s gnarly stuff and I want AI to get out of my way…so I just use Webstorm
Maybe. But that would probably be temporary. The market is sufficiently dynamic that any advantage they have right now probably isn't stable or defensible longer term. Hence the need to keep spending. But what do I know? I'm not a VC.
For the time being, nothing comes close, at least for me.
My assessment so far is that it is well worth it, but only if you're invested in using the tool correctly. It can cause as much harm as it can increase productivity, and I'm quite fearful of how we'll handle this at day-job.
I also think it's worth saying that imo, this is a very different fear than what drives "butts in seats" arguments. I.e. I'm not worried that $Company won't get its value out of the engineer because the bot does the work for them. I'm concerned that the engineer will use the tool poorly and create more work for reviewers who have to deal with high-LOC PRs.
Reviews are difficult and "AI" provides a quick path to slop. I've found my $200 well worth it, but the #1 difficulty I've had is not getting features to work, but getting the output to be scalable and maintainable code.
Sidenote: one of the things I've found most productive is deterministic tooling wrapping the LLM. E.g. robust linters like Rust's Clippy, set to automatically run after Claude Code (via hooks), help bend the LLM away from many bad patterns. It's far from perfect of course, but it's the thing I think we need most atm. Determinism around the spaghetti-chaos-monkeys.
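To make the hooks idea concrete, here is a rough sketch of what such a setup might look like in a Claude Code `.claude/settings.json` (schema from memory, so treat the exact keys, matcher, and command as assumptions and check the hooks docs):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "cargo clippy --all-targets -- -D warnings"
          }
        ]
      }
    ]
  }
}
```

The idea holds regardless of the exact schema: a deterministic linter runs after every file edit, so bad patterns get flagged before they compound.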
If you discuss a plan with CC well upfront, covering all the integration points where things might go off the rails, and perhaps checkpoint the plan in a file before starting a fresh CC session for coding, then CC is usually going to one-shot a 2k-LoC feature uninterrupted, which is very token-efficient.
If the plan is not crystal clear, people end up arguing with CC over this and that. Token usage will be bad.
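A checkpointed plan file doesn't need to be fancy; a sketch along these lines (feature and names are purely illustrative) is enough for a fresh session to pick up:

```markdown
# Plan: CSV export feature (illustrative)

## Integration points
- `ExportController` gains a `format=csv` query param
- Reuses the existing `ReportSerializer`; no schema changes

## Steps
1. Add serializer variant + unit tests
2. Wire up controller param and content-type header
3. Update docs

## Out of scope
- Streaming large exports (follow-up)
```

The "integration points" and "out of scope" sections do most of the work: they are exactly the places where an unguided session tends to go off the rails.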
Once I max out the premium credits, I pay as I go for Gemini 2.5 Pro via OpenRouter, but I always try to one-shot with GPT-4.1 first for regular tasks. If I'm certain a task is asking too much, I use 2.5 Pro to create a Plan.md and then switch to 4.1 to implement it, which works 90% of the time for me (web dev, nothing too demanding).
With the different configurable modes Roo Code adds on top of Cline, I've set up the model defaults so switching between them is zero effort. I've also been playing around with custom rules so Roo can best guess whether it should one-shot with 4.1 or create a plan with 2.5 Pro first, but I haven't nailed that down yet.
By the same token, Windows is mostly a wrapper around Intel, AMD, and now Qualcomm CPUs. Cursor/Windsurf add a lot of useful functionality. So much so that Microsoft's GitHub Copilot is losing market share to these guys.
Please don't say stuff like that.
As a 20-something who was in diapers during the dot-com boom, I really appreciate your insight. Thanks for sticking around on HN!
Now I just find myself exasperated at its choices and constant forgetfulness.
Cursor has a $500mm ARR; your anecdote might be meaningful in the medium term, but so far growth has not slowed down.
It is a lot less trivial than people like yourself make it out to be to get an effective tool chain working, and especially to do it efficiently.
Ah, yes, companies like Amazon.com, eBay, PayPal, Expedia, and Google. Never heard of those losers again. Not to mention those crazy kids at Kozmo foolishly thinking that people would want to have stuff delivered same-day.
The two lessons you should learn from the .com bubble are that the right idea won't save you from bad execution, and that boom markets, especially when investors are hungry for big returns, can stay inflated longer than you think. You can be early to market, have a big share, and still end up like Netscape because Microsoft decided to take the money from under the couch cushions and destroy your revenue stream. That seems especially relevant for AI as long as model costs are high and nobody has a moat: even if you're right on the market, if someone else can train users to expect subsidized low prices for long enough, you'll run out of runway.
Yes, there are (maybe?) four, but they're at the very bottom of the value chain.
Things built on top of them will be higher up the value chain and (in theory anyway) command a larger margin, hence a VC rush into betting on which company actually makes it up the value chain.
I mean, the only successes we see now are coding agents; nothing else has made it up the value chain. Everything else (such as art and literature generation) is still on the bottom rung.
That, by definition alone, is where the smallest margins are!
Cursor’s growth is impressive, but sustained dominance isn’t guaranteed. Distribution, margins, and defensibility still matter and we haven’t seen how durable any of that is once incentives tighten and infra costs stop being subsidized.
Then I'll look through the changes and decide whether they're correct. Sometimes I can just run the code to decide. Any compilation errors are pasted right back into the chat in agent mode.
Once the feature is done, commit the changes. Repeat for each feature.
The only answer that matters is the one to the question "how much more are you making per month from your $200/m spend?"
There were also companies like Sun and Cisco who had real, roaring businesses and lots of revenue that depended on loose start-up purse strings and VC exuberance...
Sun and Cisco both survived the .com bust, but they were never the same, nor did they ever reach their high-water marks again. They were shovel-sellers, much like Amazon and Nvidia in 2025.
I know it's hard to place a value on how much a utility saves a business, but honestly this math is like the piracy math, and we didn't buy it back then either.
Some teenager downloading 20k songs does not mean that they saved $20k[1], nor does it mean that the record labels lost $20k.
In your case, the relevant question is "how much did your revenue increase after you started 10x-ing your utility code?"
[1] Assuming the songs are sold on the market for $1 each.
what? Do you think providers (or their other customers) don’t care about the business implications of a decision like this? All so that cursor can bring their significant customer base to a nearly-indistinguishable competitor?
I'm just worried that I'm doing it wrong.
I decided to try out the agent built into VS Code. It basically matches most of these fly-by-night "agent" IDEs, which are mostly just VS Code forks anyway.
But it's weird, because Microsoft can use Anthropic's API, funnel them revenue, and take a loss on Copilot.
We're all getting this stuff heavily subsidized by either VC money or big corp money.
Microsoft can eat billions in losses on this if they become *the* provider of choice.
This stuff isn't perfect, but this is the worst it'll ever be. In 2 years it'll be able to replace many of us.
Claude 3.7 Sonnet supposedly cost "a few tens of millions of dollars"[1], and they recently hit $4B ARR[2].
Those numbers seem to give a fair bit of room for salaries, and it would be surprising if there wasn't a sustainable business in there.
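As a back-of-envelope check on that claim (taking "a few tens of millions" as roughly $50M, which is an assumption, and ignoring inference/serving costs, salaries, and free-tier subsidies, which is likely where most of the real spend goes):

```python
# Rough ratio of reported ARR to a single flagship model's training cost.
# Figures from the linked articles; "a few tens of millions" assumed as $50M.
training_cost = 50_000_000   # assumed midpoint of "a few tens of millions"
arr = 4_000_000_000          # reported annualized revenue run rate

print(arr // training_cost)  # -> 80
```

Even if training cost is off by 2-3x in either direction, the ratio stays large, which is what makes the "sustainable business in there" argument plausible on its face.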
[1] https://techcrunch.com/2025/02/25/anthropics-latest-flagship...
[2] https://www.theinformation.com/articles/anthropic-revenue-hi...
Here are some nice copilot resources: https://github.com/github/awesome-copilot
Also, I am using tons of markdown documents for planning, results, research... This makes it easy to get new agent sessions (or yourself) up to speed on the context.
2. They don't have a moat. DeepSeek and Kimi are already good enough to destroy any high margins they're hoping to generate from compute.
Just because something is highly useful doesn't mean it's highly profitable. Water is essential to life, but it's dirt cheap in most of the world. Same goes for food.
I'm not the original poster, but regarding workflow, I've found it works better to let the LLM create one instead of imposing my own. My current approach is to have 10 instances generate 10 different plans, then I average them out.
You can actually hire a few excellent devs for very little money. You just can't hire 20k of them and convince them to move to a certain coastal peninsula with high rent and $20 shawarmas, for very little money each.
I am old and I remember when you could make a lot of money offering "Get Your Business On The Information Superhighway" (HTML on Apache) and we're in that stage of LLMadness today, but I suspect it will not last.
Sometimes one model would get stuck in its thinking, and submitting the same question to a different model would resolve the problem.
Don’t be sorry; it shows your true colors. The point stands, and you continue to step around it: Cursor and other tools like it are more than a trivial wrapper, but of course you have never used them, so you have no idea. At least give yourself some exposure before projecting.
Dropbox is still a $5+bn business. Cursor is still growing; will it work out? I don’t know, but lots of folks are seeing value in these tools, and I suspect we have not hit the peak yet with the current generation. I am not sure what a service business like a small-biz website builder has to do with Cursor or other companies in adjacent spaces.
Your characterization of hosting as "a small biz website builder" is revealing. https://finance.yahoo.com/quote/GDDY/ is the one that made it and is now a $24B firm, but there were at least dozens of these companies floating around in the early 2000s.
Why are you so sure Cursor is the new GoDaddy and not the new Tripod? https://www.tripod.lycos.com/
The only person being defensive here is you. My point was simple: tools like Cursor are more than just “wrappers.” Whether it becomes a massive business or not, revenue is growing, and clearly many users find enough value to justify the subscription. You don’t have to like it, but writing it off without firsthand experience just weakens your argument.
At this point, you’re debating a product you haven’t tried, in a market you’re not tracking. Maybe sit this one out unless you have something constructive to say beyond “it’s just a wrapper”.
OP wanted a thing. In the past, they've been OK paying $10k for similar things. Now they're paying $200/month plus a bunch of their time wrangling it, and they're also OK with that.
It seems reasonable to consider that "$10k of value" in very rough terms, which is of course how all value is measured.
Do you also get it to add to its to-do list?
I also find that having the o3 model review the plan helps catch gaps. Do you do the same?
The current situation doesn't sound too good for the "scaling hypothesis" itself.
I find that a combination of local Ollama models, very inexpensive APIs like Moonshot's Kimi, occasional Gemini 2.5 Pro use, and occasionally gemini-cli provides extraordinary value. Am I missing out by not using one or more $200-$300 a month subscriptions? Probably, but I don't care.
That's the problem with most "AI" products/companies that still isn't being answered. Why do people use your tool/service if you don't own the LLM which is most of the underlying "engine"? And further, how do you stay competitive when your LLM provider starts to scale RL with whatever prompting tricks you're doing, making your product obsolete?
The shell of the IDE is open source. It’s true there is some risk on the supply of models and compute, but again, none of those providers (except MSFT, which does not even own any of the SOTA models) have direct competition. OpenAI has Codex, but it’s half-baked and being bundled into ChatGPT. It is in nobody’s interest to cut off Cursor; at this point they are a fairly sustained and large customer. The risk exists but feels pretty far-fetched until someone is actively competing or Cursor gets bought out by OpenAI.
Again, what proof do you have that there is zero complexity, or that most of the value is driven by the sandwich filling? Most of OpenAI’s valuation is driven by the wrapper, ChatGPT, not API usage. I have written a number of integrations with LLM APIs, and while some of it just works, there is a lot of nuance to doing it effectively and efficiently at scale. If it were so simple, why would we not see many other active competitors in this space with massive MAUs?
I don’t want to descend into talking politics, but I want to say that geopolitics, the rising geopolitical ‘south’, etc., is fascinating stuff - much more interesting and entertaining than anything fictional on Netflix or HBO!
It allows you to have CC shoot requests out to o3, 2.5 Pro, and more. I was previously bouncing between different windows to achieve the same thing. With this I can pretty much live in CC, with just an editor open to inspect or manually edit files.
Okay, then their costs should have come down similarly, no? OP said they were a business and that these weren't luxury hobby things but business needs. In which case, it must show up on the bottom line.
I operate as a business myself (self-employed), and I can generally correlate purchases with the bottom line almost immediately for some things (Jetbrains, VPSes for self-hosted git, etc) and correlate it with other things in the near future (certifications, conferences, etc).
The idea that "here is something I recently started paying a non-trivial amount for but it does not reflect on the bottom line" is a new and alien concept to me.
Meanwhile other "wrappers" e.g. in nvim or whatever, don't have this feature, they just have slightly better autocomplete than bare LSP.
Which puts the current valuations I've heard pretty much in the right ballpark. Crazy, but it could make sense.
User experience is definitely worth something, and I think Cursor had the first great code integration, but then there is very little stopping the foundation model companies from coming in and deciding they want to cut out the middleman if so desired.
The irony with Webvan is that they had the right idea, about 15 years too early. Now we have Instacart, DoorDash, etc. You really needed the mobile revolution circa 2010 for it to work.
Pets.com is essentially Chewy (successful pet focused online retailer)
So, neither of those ideas was really terrible in the same vein as, say, Juicero, or outright frauds like Theranos. Overvalued and ill-timed, sure.
I'm an attorney who got pitched the leading legal AI service, and it was nothing but junk... so I'm not sure why you think that's different from what's going on right now.
Roo Code just has a lot more config exposed to the user, which I really appreciate. When I was using Cline I would run into minor irritating quirks that I wished I could change but couldn't, vs. Roo, where the odds are pretty good there's some knob you can turn to modify that part of your workflow.
But the “scaling hypothesis” is the easiest, fastest story to raise money. So it will be leveraged until conclusively broken by the next advancement.
But with productivity software in general, only a few large companies seem to be able to get away with it: the Office suite, CRMs such as Salesforce.
In the graphics world, Maya and 3DS Max. Adobe has been holding on.
Every time I've tried Copilot or Cursor, it's happily gone off and written or rewritten code into a state it seemed very proud of, and which didn't even work, let alone solve the problem I put to it.
Meanwhile, Kiro:
1. Created a requirements document, with user stories and acceptance criteria, so that we could be on the same page about the goals
2. Once I signed off on that, it then created a design document, with code examples, error handling cases, and an architecture diagram, for me to review
3. After that looked good, it set about creating an itemized task list for each step of the implementation, broken down into specific tasks and sub-tasks and including which of the acceptance criteria from step 1 that task addressed
4. I could go through the document task by task, ask it to work on it, and then review the results
At one point, it noticed that the compiler had reported a minor issue with the code it had written, but correctly identified that resolving that issue would involve implementing something that was slated for a future task, so it opted to ignore the issue until the appropriate time.
For once, I found myself using an AI tool that handled the part of the job I hate the most, and am the worst at: planning, diagramming, and breaking down tasks. Even if it hadn't been able to write any working code at all, it already created something useful for me that I could have built off of, but it did end up writing something that worked great.
In case anyone is curious about the files it created, you can see them here: https://github.com/danudey/rust-downloader/pull/4
Note that I'm not really familiar with Rust (as most of the code will demonstrate), so it would probably have been far faster for an experienced Rust programmer to implement this. In my case, though, I just let it do its thing in the background and checked in occasionally to validate it was doing what I expected.
Briefpoint.ai, casely.ai, eve.legal, etc. I work with an attorney who trained his paralegals to use ChatGPT plus some of these drafting tools; he says it's significantly faster than what they could do previously.
> I feel like big VC money goes to solving legal analysis, but I'm seeing a lot of wins with document drafting/templating.
What do you mean "wins?" Like motions won with AI drafted papers? I'm skeptical.
>I work with an attorney who trained his paralegals to use chatgpt + some of these drafting tools, says it's significantly faster than what they could've done previously.
I'd be concerned about malpractice, personally. The case reviews I've seen from Vincent (which is ChatGPT + the entire federal docket) are shocking in how facially wrong they can be. It's one thing for an attorney to use ChatGPT when they do know the law and issues (hasn't seemed to help the various different partners getting sanctioned for filing AI drafted briefs) but to leave the filtering to a paralegal? That's insane, imo.
If you need to repeatedly remind it to do something though, you can store it in claude.md so that it is part of every chat. For example, in mine I have asked it to not invoke git commit but to review the git commit message with me before committing, since I usually need to change it.
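For anyone who hasn't set one up, a CLAUDE.md entry for that kind of standing instruction can be as short as this (wording illustrative):

```markdown
## Git
- Never run `git commit` yourself. Draft the commit message,
  show it to me, and wait for my approval before committing.
```

Because the file is loaded into every session, the reminder survives across chats without being repeated by hand.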
There may be a maximum amount of complexity it can handle. I haven't reached that limit yet, but I can see how it could exist.
I've found, though, that if you can steer it in the right direction, it usually works out okay. It's not particularly good at design, but it is good at writing code. So one thing you can do is write the classes yourself with some empty methods marked `// Todo Claude: implement`, then ask it to implement the methods marked Todo Claude in file foo. This way you get the structure that you want, without having to implement all the details.
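A minimal sketch of that scaffold pattern in Python (the class and method names are made up; the point is that the human fixes the structure and contracts, and the agent fills in the bodies):

```python
# Human-written skeleton: the class layout and method contracts are decided
# up front; each stub is tagged for the agent to implement later.

class DownloadQueue:
    """Orders pending downloads; the agent fills in the tagged stubs."""

    def __init__(self) -> None:
        self._items: list[str] = []

    def enqueue(self, url: str) -> None:
        # TODO Claude: implement (skip duplicates, then append)
        raise NotImplementedError

    def next_batch(self, n: int) -> list[str]:
        # TODO Claude: implement (pop up to n items, oldest first)
        raise NotImplementedError
```

You then prompt something like "implement the methods marked TODO Claude in this file" and review the diff against the contracts you already wrote.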
What kind of things are you having issues with?
I probably mean it less as "I'm too old" and more of "Wow, time really flies".
To me, who started my career in the very late 90s, the .com boom doesn't really seem that long ago. But then I realize that there is about the same amount of time between now and the .com boom, and the .com boom and the Altair 8800, and I think "OK, that was a loooong time ago". It really is true what they say, that the perception of time speeds up the older you get, which is both a blessing and a curse.
Regarding AI, it's a bit fascinating to me to think that we really only had enough data to get generative AI to work in the very recent past, and nearly as soon as we had enough data, the tech appeared. In the past I would have probably guessed that the time between having enough data and the development of AI would have been a lot longer.
I watch the changes on Kilo Code as well (https://github.com/Kilo-Org/kilocode). Their goal is to merge the best from Cline & Roo Code then sprinkle their own improvements on top.
That being said _sometimes_ its analysis is actually correct, so it's not a total miss. Just not something I'm willing to pay for when Ollama and free models exist.
I am not sure why you think your single anecdote is defensible or proves much. My perspective is that the valuations going on right now don’t have multiples that are that wild, especially when compared to the .com bubble.
Kozmo is a great case study: decent demand, terrible unit economics, and zero pricing power. They didn’t just scale too fast, they scaled a structurally unprofitable model. There was no markup, thin margins, and they held inventory without enough throughput.
Many of these companies may fail but it’s a much different environment and the path to profitability is moving a lot quicker.
March 1999: ~$27.7B
Jan 2009: ~$25B (back to $27.7B & rising by Feb)
Huh.
Evidence? Proof? What are you talking about? This is just a discussion between people, not the courtroom melodrama you are making it out to be.
>My perspective is valuations that are going on right now don’t have multiples that are that wild especially if we aren’t compare it to the com bubble.
Okay. I could be equally rude to you, but I won't.
As for valuations, when looking at current VC multiples and equity markets, I don’t see the same bubble from a qualitative perspective. Absolutely there is overhype coming from CEOs in public markets, but there is a lot of value being driven. I don’t believe the giants are going to do well; maybe the infrastructure plays will, but I think we will see a carve-out of a new generation of companies driving the change. Unlike ‘99, I am seeing a lot more startups and products with closer-to-the-ground roadmaps to profitability. In ‘99, so many were running off hopes and dreams.
If you would actually like to converse, I would love to hear your perspective, but if all you can be is mad, please don’t respond. Nobody is having a courtroom drama other than the one playing out in your head.
The goal for investors is to be able to exit their investment for more than they put in.
That doesn't mean the company needs to be profitable at all.
Broadly speaking, investors look for sustainable growth. Think Amazon, when they were spending as much money as possible in the early 2000s to build their distribution network and software and doing anything they possibly could to avoid becoming profitable.
Most of the time, companies (and investors) don't look for profits. Profits are just a way of paying more tax. Instead, the ideal outcome is growing revenue where profitability would be possible, but the excess money is invested in growing more.
Note that this doesn't mean the company is raising money from external sources. Not being profitable doesn't imply that.