
514 points mfiguiere | 1 comment
gklitt ◴[] No.43710093[source]
I tried one task head-to-head with Codex o4-mini vs Claude Code: writing documentation for a tricky area of a medium-sized codebase.

Claude Code did great and wrote pretty decent docs.

Codex didn't do well. It hallucinated a bunch of stuff that wasn't in the code, and completely misrepresented the architecture - it started talking about server backends and REST APIs in an app that doesn't have any of that.

I'm curious what went so wrong - feels like possibly an issue with loading in the right context and attending to it correctly? That seems like an area that Claude Code has really optimized for.

I have high hopes for o3 and o4-mini as models so I hope that other tests show better results! Also curious to see how Cursor etc. incorporate o3.

replies(7): >>43710162 #>>43710290 #>>43711286 #>>43713258 #>>43714390 #>>43714966 #>>43716635 #
strangescript ◴[] No.43711286[source]
Claude Code still feels superior. o4-mini has all sorts of issues. o3 is better but at that point, you aren't saving money so who cares.

I feel like people are sleeping on Claude Code for one reason or another. It's not cheap, but it's by far the best, most consistent experience I have had.

replies(3): >>43711411 #>>43711764 #>>43712470 #
ekabod ◴[] No.43711764[source]
"gemini 2.5 pro exp" is superior to Claude Sonnet 3.7 when I use it with Aider [1]. And it is free (with some high limit).

[1]https://aider.chat/

replies(3): >>43711773 #>>43713447 #>>43755725 #
jacooper ◴[] No.43711773[source]
Don't they train on your inputs if you use the free AI Studio API key?
replies(1): >>43711799 #
1. asadm ◴[] No.43711799{3}[source]
Speaking for myself, I am happy to make that trade, as long as I get unrestricted access to the latest one. Heck, most of my code now is written by Gemini anyway haha.