
159 points by jbredeche | 3 comments
1. extr (No.45533546)
IMO: I was an early adopter of this pattern, and at this point I've mostly given it up (except where the task is embarrassingly parallel, e.g. adding some bog-standard logging to 6 different folders). It's not just that reviewing is high cognitive overhead: you become biased by seeing the AI's solutions, and it gets harder to catch fundamental problems you would have noticed immediately had you been working inline.

My process now is:

- Dictate what I'm trying to accomplish, using MacWhisper + Parakeet v3, with GPT-5-Mini for cleanup. This is usually 40-50 lines of text. (A sketch of the cleanup step appears after this list.)

- Instruct the agent to explore for a bit and come up with a very concise plan matching my goal. This does NOT mean writing a spec for the work: simply an approach we can describe in < 2 paragraphs. I will propose alternatives and make it defend the approach.

- Authorize the agent to start coding. I turn all edit permissions off and manually approve each change. Often I find myself correcting it with feedback like "Hmm, we already have a structure for that [over here], why don't we use that?" or "If this fails we have bigger problems, no need for exception handling here."

- At the end, I have it review the PR with a slash command, to catch basic errors I might have missed or that only surface now that the work is "complete". (An example command file appears after this list.)

- I instruct it to commit + create a PR using the same tone of voice I used for giving feedback.
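
As a rough sketch of the dictation-cleanup step above: this assumes the OpenAI Python SDK and an OpenAI-compatible endpoint, and the model name, prompt, and function are illustrative stand-ins rather than the commenter's actual setup.

    # Minimal sketch: turn a raw speech-to-text dump into tidy agent instructions.
    # Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set;
    # the model name and prompt are assumptions, not the exact setup above.
    from openai import OpenAI

    client = OpenAI()

    def clean_transcript(raw: str) -> str:
        response = client.chat.completions.create(
            model="gpt-5-mini",  # assumption: any small, fast model works here
            messages=[
                {
                    "role": "system",
                    "content": (
                        "Clean up this dictated text: fix transcription errors, "
                        "punctuation, and filler words. Preserve every technical "
                        "detail. Do not summarize or reorder."
                    ),
                },
                {"role": "user", "content": raw},
            ],
        )
        return response.choices[0].message.content

The cleaned text then becomes the task description handed to the coding agent.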
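
And for the review step: Claude Code reads custom slash commands from Markdown files in .claude/commands/, so the review command might live in a file like .claude/commands/review-pr.md (file name and wording are hypothetical) and be invoked as /review-pr:

    Review the diff of this branch against main. Look for:
    - basic errors: typos, dead code, unused imports, broken references
    - places that duplicate a structure we already have elsewhere
    - problems that only surface now that the change is "complete"
    Report findings as a short list. Do not make any edits.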

I've found I get MUCH better work product out of this, with the benefit that I'm truly "done": I saw every line of code as it was written, I know what went into it, and I can (mostly) defend the decisions. Also, while I have extensive rules set up in my CLAUDE/AGENTS folders, I don't need to rely on them. Correcting via dictation is quick and easy, and you only need to mention something explicitly once for the model to avoid that trap for the rest of the session.

I also make heavy use of conversation rollback. If I need to go off on a little exploration/research tangent, I roll back to before that point and continue the "main thread".

I find that Claude is really the best at this workflow. Codex is great, don't get me wrong, but probably 85% of my coding tasks don't involve tricky logic or long-range dependencies. It's more important for the model to quickly grok my intent and act fast, course-correcting based on my feedback. I absolutely still use Codex/GPT-5-Pro: I'll have Sonnet 4.5 dump a description of the issue, paste it to Codex, let it work until it has an answer, and then roll Sonnet 4.5 back and give it the answer directly, as if from nowhere.

2. foobar10000 (No.45534040)
Did you try adding the Codex CLI as an MCP server, so that Claude calls it over MCP instead of you pasting to it? Something like:

    claude mcp add codex-high -- codex -c model_reasoning_effort="high" -m "gpt-5-codex" mcp-server

I've had good luck with it, and was wondering whether that makes the workflow faster/better.

3. extr (No.45535223)
Yeah, I've looked into that kind of thing. In general I don't love the pattern where one coding agent calls another automatically: it's hard to control, and I don't like how the session "disappears" once the call is done. It can be useful to leave that Codex window open for one more question.

One tool that solves this is RepoPrompt MCP. You can have Sonnet 4.5 set up a call to GPT-5-Pro via the API, and that session stays persisted in another window for you to interact with, branch from, etc.