←back to thread

413 points martinald | 1 comments | | HN request time: 0.232s | source
Show context
Volundr ◴[] No.46205257[source]
FTA

> I've had Claude Code write an entire unit/integration test suite in a few hours (300+ tests) for a fairly complex internal tool. This would take me, or many developers I know and respect, days to write by hand.

I have no problem believing that Claude generated 300 passing tests. I have a very hard time believing those tests were all well thought out, consise, actually testing the desired behavior while communicating to the next person or agent how the system under test is supposed to work. I'd give very good odds at least some of those tests are subtly testing themselves (ex mocking a function, calling said function, then asserting the mock was called). Many of them are probably also testing implementation details that were never intended to be part of the contract.

I'm not anti-AI, I use it regularly, but all of these articles about how crazy productive it is skip over the crazy amount of supervision it needs. Yes, it can spit out code fast, but unless your prepared to spend a significant chunk of that 'saved" time CAREFULLY (more carefully than with a human) reviewing code, you've accepted a big drop in quality.

replies(7): >>46205349 #>>46205526 #>>46205624 #>>46206683 #>>46206705 #>>46208955 #>>46214506 #
kllrnohj ◴[] No.46205349[source]
Anecdotes etc etc but the AI tests I've been sent to review have been absolute shit. Stuff like that just calling a function doesn't crash the program. No assertions other than "end of test method reached"

Yes sometimes those tests are necessary, but it seemed to just do it everywhere because it made the code coverage percentage go up. Even though it was useless.

I have also had great experiences with AI cranking out straightforward boilerplate or asking C++ template metaprogramming questions. It's not all negative. But net-net it feels like it takes more work in total to use AI as you have to learn to recognize when it just won't handle the task, which can happen a lot. And you need to keep up with what it did enough to be able to take over. And reading code is harder than writing it.

replies(1): >>46206422 #
1. piperswe ◴[] No.46206422[source]
I’ve seen agents produce plenty of those tests, but recently I’ve seen them generate some actually decent unit tests that I wouldn’t have thought of myself. It’s a bit of a crapshoot