
254 points | mrlesk | 1 comment
mrlesk No.44483531
I threw Claude Code at an existing codebase a few months back and quickly quit: untangling its output was slower than writing from scratch. The fix turned out to be process, not model horsepower.

Iteration timeline

==================

• 50% task success: added README.md + CLAUDE.md so the model knew the project.

• 75%: wrote one markdown file per task; Codex plans, Claude codes.

• 95%+: built Backlog.md, a CLI that turns a high-level spec into those task files automatically (yes, using Claude/Codex to build the tool).
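The spec-to-task-files step can be sketched in a few lines. This is a hypothetical illustration, not Backlog.md's actual implementation: it assumes each task in the spec is a bullet line starting with "- " and writes one markdown file per task.

```python
import re
from pathlib import Path

def spec_to_tasks(spec: str, out_dir: str = "backlog/tasks") -> list[Path]:
    """Split a high-level spec into one markdown file per task.

    Hypothetical sketch: assumes tasks appear as '- ' bullet lines.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    titles = [line[2:].strip() for line in spec.splitlines() if line.startswith("- ")]
    files = []
    for i, title in enumerate(titles, start=1):
        # Slugify the title for a stable, readable filename.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        path = out / f"task-{i:03d}-{slug}.md"
        path.write_text(f"# {title}\n\n## Acceptance criteria\n\n- TODO\n")
        files.append(path)
    return files
```

Each generated file then becomes the unit of work the agent plans and implements against.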

Three-step loop that works for me:

1. Generate tasks: Codex / Claude Opus → self-review.

2. Generate plan: same agent, "plan" mode → tweak if needed.

3. Implement: Claude Sonnet / Codex → review & merge.
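A task file flowing through that loop might look like the following. The format is a hypothetical example, not necessarily Backlog.md's exact schema:

```markdown
# task-042 - Add CSV export

## Description
Users can export the current report as CSV from the dashboard.

## Acceptance criteria
- [ ] "Export CSV" button appears on the report page
- [ ] Output includes a header row and all visible columns

## Implementation plan
(filled in by the planning step before implementation starts)
```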

For simple features I can even run this from my phone: ChatGPT app (Codex) → GitHub app → ChatGPT app → GitHub merge.

Repo: https://github.com/MrLesk/Backlog.md

Would love feedback and happy to answer questions!

mitjam No.44484317
Really love this.

Would love to see an actual end-to-end example video of you creating, planning, and implementing a task using your preferred models and apps.

mrlesk No.44484679
Will definitely do. I am also planning to run a benchmark across various models to see which is most effective at building a full product, starting from a PRD and using Backlog.md to manage tasks.
bazooka5798 No.44484751
I'd love to see OpenRouter connectivity to try non-Claude models for some of the planning parts of the cycle.