there is the temptation to just let these things run in our codebases, which I think for some projects is totally fine. For most websites I think this would usually be fine. This is for two reasons: 1) these models have been trained on more websites than probably anything else and 2) if a div/text is off by a little bit then usually there will be no huge problems.
But if you're building something that is mission critical, unless you go super slowly, which again is hard to do because these agents are tempting to go super fast. That is sort of the allure of them: to be able to write sofware super fast.
But as we all know, in some programs you cannot have a single char wrong or the whole program may not work or have value. At least that is how the one I am working on is.
I found that I lost the mental map of the codebase I am working on. Claude Code had done too much too fast.
I found a function this morning to validate futures/stocks/FUT-OPT/STK-OPT symbols where the validation was super basic and terrible that it had written. We had implemented some very strong actual symbol data validation a week or two ago. But that wasn't fully implemented everywhere. So now I need to go back and do this.
Anyways, I think finding where certain code is written would be helpful for sure and suggesting various ways to solve problems. But the separate GUI apps can do that for us.
So for now I am going to keep just using the separate LLM apps. I will also save lots of money in the meantime (which I would gladly spend for a higher quality Claude Code ish setup).