A Research Preview of Codex

(openai.com)

511 points meetpateltech | 1 comments | 16 May 25 15:02 UTC


johnjwang ◴[16 May 25 16:27 UTC] No.44007301[source]▶

Some engineers on my team at Assembled and I have been a part of the alpha test of Codex, and I'll say it's been quite impressive.

We’ve long used local agents like Cursor and Claude Code, so we didn’t expect too much. But Codex shines in a few areas:

Parallel task execution: You can batch dozens of small edits (refactors, tests, boilerplate) and run them concurrently without context juggling. It's super nice to run a bunch of tasks at the same time (something that's really hard to do in Cursor, Cline, etc.)

It kind of feels like a junior engineer on steroids: you just need to point it at a file or function, specify the change, and it scaffolds out most of a PR. You still need to do a lot of work to get it production ready, but it's as if you have an infinite number of junior engineers at your disposal now all working on different things.

Model quality is good, but it's hard to say it's that much better than other models. In side-by-side tests with Cursor + Gemini 2.5-pro, naming, style and logic are largely indistinguishable, so quality meets our bar but doesn’t yet exceed it.

replies(15): >>44007420 #>>44007425 #>>44007552 #>>44007565 #>>44007575 #>>44007870 #>>44008106 #>>44008575 #>>44008809 #>>44009066 #>>44009783 #>>44010245 #>>44012131 #>>44014948 #>>44016788 #

fourside ◴[16 May 25 16:39 UTC] No.44007420[source]▶

>>44007301 #

> You still need to do a lot of work to get it production ready, but it's as if you have an infinite number of junior engineers at your disposal now all working on different things.

One issue with junior devs is that because they’re not fully autonomous, you have to spend a nontrivial amount of time guiding them and reviewing their code. Even if I had easy access to a lot of them, that overhead would pretty quickly become the bottleneck.

Did you find that managing a lot of these virtual devs could get overwhelming, or are they pretty autonomous?

replies(3): >>44007581 #>>44007712 #>>44030398 #

1. tom_m ◴[19 May 25 14:39 UTC] No.44030398[source]▶

>>44007420 #

You also have to provide accurate instructions.

I find that, more often than not, "bugs" aren't about writing code that doesn't compile or doesn't have passing tests. The "bugs" come from not understanding the requirements and what it is you're building.

I'm not entirely sure AI will help with this at all. People are generally bad at describing software and how they want it to work. Their requirements are either inaccurate or omit things entirely.

Yes, though, it would be overwhelming to manage a bunch of AI agents. Context switching, redirecting, and guiding will be very difficult and not everyone's cup of tea.

I'd argue this isn't really a result of AI though. Many people are already in this boat today. The industry is set up this way, with contractors and outsourced devs at a junior level, because of the attraction of cheap labor. Many businesses are attracted to this beyond programming.

One of the questions is going to be: are the cost-per-token economics cheaper? So long as they're cheaper, AI coding agents will have a future. If they prove not to be cheaper (and this could take years to prove out), then I don't think they'll be as popular. I think people will need to go back to the drawing board on how we use AI agents, or use AI for other purposes (like training, education, developer onboarding, code reviews, debugging, etc.)
