511 points | meetpateltech
johnjwang No.44007301
Some engineers on my team at Assembled and I have been a part of the alpha test of Codex, and I'll say it's been quite impressive.

We’ve long used local agents like Cursor and Claude Code, so we didn’t expect too much. But Codex shines in a few areas:

Parallel task execution: You can batch dozens of small edits (refactors, tests, boilerplate) and run them concurrently without context juggling. It's super nice to run a bunch of tasks at the same time (something that's really hard to do in Cursor, Cline, etc.)

It kind of feels like a junior engineer on steroids: you just need to point it at a file or function, specify the change, and it scaffolds out most of a PR. You still need to do a lot of work to get it production ready, but it's as if you now have an infinite number of junior engineers at your disposal, all working on different things.

Model quality is good, but hard to say it's that much better than other models. In side-by-side tests with Cursor + Gemini 2.5-pro, naming, style and logic are relatively indistinguishable, so quality meets our bar but doesn’t yet exceed it.

1. _bin_ No.44008575
I believe Cursor now supports parallel tasks, no? I haven't done much with it personally but I have buddies who have.

If you want one idiot's perspective, please hyper-focus on model quality. The barrier right now is not tooling, it's the fact that models are not good enough for a large amount of work. More importantly, they're still closer to interns than junior devs: you must give them a ton of guidance, constant feedback, and a very stern eye for them to do even pretty simple tasks.

I'd like to see something with an o1-preview/pro level of quality that isn't insanely expensive, particularly since a lot of programming isn't about syntax (which most SotA models have down pat) but about understanding the underlying concepts, an area in which they remain weak.

At this point I really don't care if the tooling sucks. Just give me really, really good models that don't cost a kidney.