Spending Too Much Money on a Coding Agent

1. nickjj ◴[03 Jul 25 17:12 UTC] No.44457178[source]▶

Serious question, how do you justify paying for any of this without feeling like it's a waste?

I occasionally use ChatGPT (free version without logging in) and the amount of times it's really wrong is very high. Oftentimes it takes a lot of prompting and feeding it information from third party sources for it to realize it has incorrect information, and then it corrects itself.

All of these prompts would cost money on a paid plan, right?

I also used Cursor (free trial on their paid plan) for a bit and I didn't find much of a difference. I would say whatever back-end it was using was possibly worse. The code it wrote was busted and over engineered.

I want to like AI, and in some cases it helps me gain insight into something, but I feel like literally 90% of my time is spent on it providing me information that straight up doesn't work. Eventually it might work, but getting there takes a lot of time and effort.

replies(7): >>44457220 #>>44457223 #>>44457236 #>>44457331 #>>44457367 #>>44457386 #>>44458671 #

2. benbayard ◴[03 Jul 25 17:16 UTC] No.44457220[source]▶

>>44457178 (TP) #

I'd try out Cursor with either o3 or Claude 4 Opus. Compared with the free version of ChatGPT, the paid models in Cursor are much better. That's also what this article claims, and it's true in my experience.

3. chis ◴[03 Jul 25 17:17 UTC] No.44457223[source]▶

>>44457178 (TP) #

I can't believe people are still writing comments like this lol how can it be

replies(1): >>44457312 #

4. abdullahkhalids ◴[03 Jul 25 17:17 UTC] No.44457236[source]▶

>>44457178 (TP) #

Depends on how much you use. I use AI to think through code and other problems, and to write the dumb parts of code. Claude definitely works much better than the free offerings. I use OpenRouter [1] and spend only a couple of dollars per month on AI usage. It's definitely worth it.

[1] https://openrouter.ai No affiliation

5. zzzeek ◴[03 Jul 25 17:26 UTC] No.44457312[source]▶

>>44457223 #

I think it's a serious question because something really big is being missed here. There seem to be very different types of developers out there and/or working on very different kinds of codebases. Hypothetically, maybe you have devs or specific contexts where the dev can just write the code really fast, so having to explain it to a bot is more time consuming, vs. devs/contexts where lots of googling and guessing goes on and it's easier to get the AI to just show you how to do it.

I'm actually employer mandated to continue to try/use AI bots / agents to help with coding tasks. I'm sort of getting them to help me, but I'm still really not being blown away, and I still tend not to bother with them for things I'm frequently iterating on. They are more useful when I have to learn some totally new platform/API. Why is that? Do we think there's something wrong with me?

replies(1): >>44457482 #

6. vineyardmike ◴[03 Jul 25 17:29 UTC] No.44457331[source]▶

>>44457178 (TP) #

> Serious question, how do you justify paying for any of this without feeling like it's a waste?

I would invert the question: how can you think it's a waste (for OP) if they're willing to spend $1000/mo on it? This isn't some emotional or fashionable thing, they're tools, so you'd have to assume they derive $1000 of value.

> free version... the amount of times it's really wrong is very high... it takes a lot of prompting and feeding it information from third party

Respectfully, you're using it wrong, and you get what you paid for. The free versions are obviously inferior, because obviously they paywall the better stuff. If OP is spending $50/day, why would the company give you the same version for free?

The original article mentions Cursor. With (paid) Cursor, the tool automatically grabs all the information on behalf of the user. It will grab your code, including grepping to find the right files, and it will grab info from the internet (e.g. up-to-date libraries), and feed that into the model, which can provide targeted diffs to update just select parts of a file.

Additionally, the tools will automatically run compiler/linter/unit tests to validate their work, and iterate and fix their mistakes until everything works. This write -> compile -> unit test -> lint loop is exactly what a human will do.
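To make that loop concrete, here is a minimal Python sketch of the idea, not how Cursor or any other tool actually implements it. Everything in it is hypothetical: `run_loop`, `compiles`, `passes_unit_test`, and `ScriptedModel` are made-up names, the "model" is a stand-in that replays canned attempts, and the only checks are a syntax check and a single unit test (a real agent would also run a linter and the project's own test suite).

```python
from typing import Callable, List, Optional, Tuple

CheckResult = Tuple[bool, str]          # (passed?, message for the model)
Check = Callable[[str], CheckResult]


def compiles(source: str) -> CheckResult:
    """'Compile' step: does the candidate even parse as Python?"""
    try:
        compile(source, "<candidate>", "exec")
        return True, ""
    except SyntaxError as exc:
        return False, f"syntax error: {exc.msg}"


def passes_unit_test(source: str) -> CheckResult:
    """'Unit test' step: run the candidate and test a hypothetical add()."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return False, f"test crashed: {exc!r}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, ""


def run_loop(
    propose: Callable[[str], str],
    checks: List[Tuple[str, Check]],
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Ask for code, validate it, and feed failures back until it passes."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = propose(feedback)
        failure = ""
        for name, check in checks:
            ok, message = check(candidate)
            if not ok:
                failure = f"{name}: {message}"
                break  # later checks are pointless once one has failed
        if not failure:
            return candidate, attempt
        feedback = failure  # the next proposal gets to see what went wrong
    return None, max_attempts


class ScriptedModel:
    """Stand-in for a real model: replays canned attempts in order."""

    def __init__(self, attempts: List[str]) -> None:
        self._attempts = list(attempts)
        self.feedback_seen: List[str] = []

    def __call__(self, feedback: str) -> str:
        self.feedback_seen.append(feedback)
        return self._attempts.pop(0)


model = ScriptedModel([
    "def add(a, b) return a + b",          # does not parse
    "def add(a, b):\n    return a - b\n",  # parses, fails the test
    "def add(a, b):\n    return a + b\n",  # passes both checks
])
code, attempts = run_loop(
    model, [("compile", compiles), ("unit test", passes_unit_test)]
)
```

The part that matters is `feedback`: each failed check is handed back to the next proposal, which is what separates an agent from pasting compiler errors into a chat window by hand. The `max_attempts` cap is a design choice for the sketch, so that a model stuck repeating the same mistake stops instead of looping forever.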

replies(3): >>44457586 #>>44458202 #>>44458256 #

7. ◴[03 Jul 25 17:33 UTC] No.44457367[source]▶

>>44457178 (TP) #

8. jonfw ◴[03 Jul 25 17:35 UTC] No.44457386[source]▶

>>44457178 (TP) #

The AI agents that run on your machine, where they have access to the code and tools to browse/edit it or even run terminal commands, are much more powerful than a simple chatbot.

It took some time for me to learn how to use agents, but they are very powerful once you get the hang of it.

replies(1): >>44457494 #

9. vineyardmike ◴[03 Jul 25 17:45 UTC] No.44457482{3}[source]▶

>>44457312 #

> I'm actually employer mandated to continue to try/use AI bots / agents to help with coding tasks

I think a lot of this comes down to context management. I've found that these tools work worse at my current employer than at my prior one. And I think the reason is context: my prior employer was a startup, where we relied on open source libraries and the codebase was smaller, following public best practices regarding code structure in Golang and Python. My current employer is much bigger, with a massive monorepo of custom written/forked libraries.

The agents are trained on lots of open source code, so popular programming languages/libraries tend to be really well represented, while big internal libraries are a struggle. Similarly, smaller repositories tend to work better than bigger ones, because there is less searching to figure out where something is implemented. I've been trying some coding agents at my current job, and they spend a lot more time searching through libraries to understand how to implement or use something if it relies on an internal library.

I think a lot of these struggles and differences are also present with people, but we tend to discount this struggle because people are generally good at reasoning. Of course, we also learn from each task, so we improve over time, unlike a static model.

10. josefresco ◴[03 Jul 25 17:46 UTC] No.44457494[source]▶

>>44457386 #

> much more powerful than a simple chatbot

Claude Pro + Projects is a good middle ground between the two. Things didn't really "click" for me as a non-developer until I got access to both.

11. klank ◴[03 Jul 25 17:55 UTC] No.44457586[source]▶

>>44457331 #

> This isn't some emotional or fashionable thing, they're tools, so you'd have to assume they derive $1000 of value.

This is not borne out in my personal experience at all. In my experience, both in the physical and software tool worlds, people are incredibly emotional about their tools. There are _deep_ fashion dynamics within tool culture as well. I mean, my god, editors are the prima donna of emotional fashion running roughshod over the developer community for decades.

There was a reason it was called "Tool Time" on Home Improvement.

12. pxc ◴[03 Jul 25 19:05 UTC] No.44458202[source]▶

>>44457331 #

> This isn't some emotional or fashionable thing, they're tools, so you'd have to assume they derive $1000 of value.

If someone spends a lot of money on something but they don't derive commensurate value from that purchase, they will experience cognitive dissonance proportional to that mismatch. But ceasing or reversing such purchases are only some of the possibilities for resolving that dissonance. Another possibility is adjusting one's assessment of the value of that purchase. This can be subconscious and automatic, but it can also involve validation-seeking behaviors like reading positive/affirming product reviews.

In this present era of AI hype, purchase-affirming material is very abundant! Articles, blog posts, interviews, podcasts, HN posts... there's lots to tell people that it's time to "get on board", to "invest in AI" both financially and professionally, etc.

How much money people have to blow on experiments and toys probably makes a big difference, too.

Obviously there are limits and caveats to this kind of distortion. But I think the reality here is a bit more complicated than one in which we can directly read the derived value from people's purchasing decisions.

13. nickjj ◴[03 Jul 25 19:09 UTC] No.44458256[source]▶

>>44457331 #

> Respectfully, you're using it wrong, and you get what you paid for.

I used the paid (free trial) version of Cursor to look at Go code. I used the free version of ChatGPT for topics like Rails, Flask, Python, Ansible and various networking things. These are all popular techs. I wouldn't describe either platform as "good" if we're measuring good by going from an idea to a fully working solution with reasonable code.

Cursor did a poor job. The code it provided was mega over engineered to the point where most of the code had to be thrown away because it missed the big picture. This was after a lot of very specific prompting and iterations. The code it provided also straight up didn't work without a lot of manual intervention.

It also started to modify app code to get tests to pass when in reality the test code was the thing that was broken.

Also it kept forgetting things from 10 minutes ago and repeating the same mistakes. For example when 3 of its solutions didn't work, it started to go back and suggest using the first solution that was confirmed to not work (and it even output text explaining why it didn't work just before).

I feel really bad for anyone trusting AI to write code when you don't already have a lot of experience so you can keep it in check.

So far, at best, I barely find it helpful for learning the basics of something new or picking out some obscure syntax of a tool you don't know well, after giving it a link to the tool's docs and source code.

replies(1): >>44458680 #

14. BeetleB ◴[03 Jul 25 20:00 UTC] No.44458671[source]▶

>>44457178 (TP) #

Try with serious models. Here's what I would suggest:

1. Go to https://aider.chat/docs/leaderboards/ and pick one of the top (but not expensive) models. If unsure, just pick Gemini 2.5 Pro (not Flash).

2. Get API access.

3. Find a decent tool (hint: Aider is very good and you can learn the basics in a few minutes).

4. Try it on a new script/program.

5. (Only after some experience): Read people's detailed posts describing how they use these tools and steal their ideas.

Then tell us how it went.

15. BeetleB ◴[03 Jul 25 20:01 UTC] No.44458680{3}[source]▶

>>44458256 #

> I feel really bad for anyone trusting AI to write code when you don't already have a lot of experience so you can keep it in check.

You definitely should be skilled in your domain to use it effectively.