
170 points by anandchowdhary | 1 comment

Continuous Claude is a CLI wrapper I made that runs Claude Code in an iterative loop with persistent context, automatically driving a PR-based workflow. Each iteration creates a branch, applies a focused code change, generates a commit, opens a PR via GitHub's CLI, waits for required checks and reviews, merges if green, and records state into a shared notes file.
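A rough sketch of one iteration, in simplified Python driving the gh CLI and Claude Code's non-interactive mode (the real implementation handles reviews, retries, and failures more carefully):

    import subprocess
    from pathlib import Path

    NOTES = Path("notes.md")  # shared notes file carried across iterations

    def run(*cmd):
        subprocess.run(cmd, check=True)

    def iteration(i: int, goal: str) -> None:
        branch = f"iteration-{i}"
        run("git", "checkout", "-b", branch)

        # Hand the goal plus accumulated notes to Claude Code non-interactively.
        notes = NOTES.read_text() if NOTES.exists() else ""
        run("claude", "-p", f"{goal}\n\nNotes from previous iterations:\n{notes}")

        # Commit whatever changed and open a PR with the GitHub CLI.
        run("git", "add", "-A")
        run("git", "commit", "-m", f"Iteration {i}: {goal}")
        run("git", "push", "-u", "origin", branch)
        run("gh", "pr", "create", "--fill")

        # Wait for required checks; merge only if they pass.
        checks = subprocess.run(["gh", "pr", "checks", "--watch"])
        if checks.returncode == 0:
            run("gh", "pr", "merge", "--squash", "--delete-branch")
            run("git", "checkout", "main")
            run("git", "pull")

        # Record state so the next iteration can pick up where this one left off.
        with NOTES.open("a") as f:
            f.write(f"\n## Iteration {i}\nchecks passed: {checks.returncode == 0}\n")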

This avoids the typical stateless one-shot pattern of current coding agents and enables multi-step changes without losing intermediate reasoning, test failures, or partial progress.

The tool is useful for tasks that require many small, serial modifications: increasing test coverage, large refactors, dependency upgrades guided by release notes, or framework migrations.

Blog post about this: https://anandchowdhary.com/blog/2025/running-claude-code-in-...

apapalns No.45957654
> codebase with hundreds of thousands of lines of code and go from 0% to 80%+ coverage in the next few weeks

I had a coworker do this with Windsurf + manual driving a while back and it was an absolute mess. Awful tests that were unmaintainable and next to useless (too much mocking, testing that the code “works the way it was written”, etc.). Writing a useful test suite is one of the most important parts of a codebase and requires careful, deliberate thought. Without deep understanding of the business logic (which takes time and is often lost after the initial devs move on) you're not gonna get great tests.

To be fair to AI, we hired a “consultant” that also got us this same level of testing so it’s not like there is a high bar out there. It’s just not the kind of problem you can solve in 2 weeks.

simonw No.45958225
I find coding agents can produce very high quality tests if and only if you give them detailed guidance and good starting examples.

Ask a coding agent to build tests for a project that has none and you're likely to get all sorts of messy mocks and tests that exercise internals, when really you want them to exercise the top-level public API of the project.

Give them just a few starting examples that demonstrate how to create a good testable environment without mocking and test the higher-level APIs, and they are much less likely to make a catastrophic mess.
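For example (the project, module, and function names here are hypothetical), a single seed test in this shape shows the agent how to exercise the public API against a real temporary environment instead of mocking internals:

    # tests/test_config.py - a hypothetical starting example handed to the agent
    from myproject import load_config  # top-level public API, not an internal helper

    def test_load_config_reads_real_file(tmp_path):
        # Real temporary file via pytest's tmp_path fixture; no mocks needed.
        cfg_file = tmp_path / "app.toml"
        cfg_file.write_text("[server]\nport = 8080\n")

        cfg = load_config(cfg_file)

        assert cfg["server"]["port"] == 8080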

You're still going to have to keep an eye on what they're doing and carefully review their work though!

btown No.45959436
Has anyone had success with specific prompts to avoid the agent over-indexing on implementation details? For instance, something like: "Before each test case, add a comment justifying the business case for every assumption made here, without regard to implementation details. If this cannot be made succinct, or if there is ambiguity in the business case, the test case should not be generated."
freedomben No.45959647
I've had reasonable success doing something like this, though my current opinion is that it's better to write the first few tests yourself to establish a clear pattern and approach. However, if you don't care that much (which is common with side projects):

Starting point: small-ish codebase, no tests at all:

    > I'd like to add a test suite to this project.  It should follow language best practices.  It should use standard tooling as much as possible.  It should focus on testing real code, not on mocking/stubbing, though mocking/stubbing is ok for things like third party services and parts of the code base that can't reasonably run in a test environment.  What are some design options we could do? Don't write any code yet, present me the best of the options and let me guide you.
    > Ok, I like option number two.  Put the basic framework in place and write a couple of dummy tests.
    > Great, let's go ahead and write some real tests for module X.
and so on. For a project with an existing, mature test suite, it's much easier:

    > I'd like to add a test (or improve a test) for module X.  Use the existing helpers and if you find yourself needing new helpers, ask me about the approach before implementing
I've also found it helpful to put things in AGENTS.md or CLAUDE.md about tests and my preferences, such as:

    - Tests should not rely on sleep to avoid timing issues.  If there is a timing issue, present me with options and let me guide you
    - Tests should not follow an extreme DRY pattern, favor human readability over absolute DRYness
    - Tests should focus on testing real code, not on mocking/stubbing, though mocking/stubbing is ok for things like third party services and parts of the code base that can't reasonably run in a test environment.
    - Tests should not make assumptions about the current running state of the environment, nor should they do anything that isn't cleaned up before completing the test to avoid polluting future tests
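As a hypothetical illustration of the sleep and cleanup preferences (the module and functions here are made up), this is the kind of shape I'm nudging it toward:

    import time
    from myproject.jobs import enqueue, cancel, status  # hypothetical public API

    def wait_until(predicate, timeout=5.0, interval=0.05):
        # Poll with a deadline instead of a fixed sleep: fast when the system is
        # fast, but still bounded so a hang fails the test.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if predicate():
                return True
            time.sleep(interval)
        return False

    def test_job_completes():
        job_id = enqueue("send-welcome-email")
        try:
            assert wait_until(lambda: status(job_id) == "done")
        finally:
            cancel(job_id)  # don't leave state behind to pollute later tests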
I do want to stress that every project and framework is different and has different needs. As you discover the AI doing something you don't like, add it to the prompts or to AGENTS.md/CLAUDE.md. Eventually it will get pretty decent, though never blindly trust it, because a butterfly flapping its wings in Canada sometimes causes it to do unexpected things.