
858 points cryptophreak | 2 comments
wiremine No.42936346
I'm going to take a contrarian view and say it's actually a good UI, but it's all about how you approach it.

I just finished a small project where I used o3-mini and o3-mini-high to generate most of the code. I averaged around 200 lines of code an hour, including the business logic and unit tests; the total was around 2,200 lines. So, not a big project, but not a throwaway script either. The code was perfectly fine for what we needed. This is the third time I've done this, and each time I get faster and better at it.

1. I find a "pair programming" mentality is key. I focus on the high-level code and let the model handle the lower-level details. I review all the generated code and provide feedback; blindly accepting it is a terrible approach.

2. Generating unit tests is critical. After I like the gist of some code, I ask for some smoke tests (see the first sketch after this list). Again, peer review the code and adjust as needed.

3. Be liberal with starting a new chat: the models can get easily confused with longer context windows. If you start to see things go sideways, start over.

4. Give it code examples. Don't prompt with English only. (See the second sketch below.)
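
To make point 2 concrete, here's a minimal sketch of the kind of smoke tests I mean, using Python and pytest. The parse_price() helper and the pricing module are hypothetical stand-ins for whatever the model just generated; the point is that each test pins down one obvious behavior and is small enough to review line by line.

    # Hypothetical smoke tests for a model-generated parse_price() helper.
    # Each test checks one obvious behavior, so human review is fast.
    import pytest

    from pricing import parse_price  # hypothetical module under review

    def test_parses_plain_number():
        assert parse_price("19.99") == 19.99

    def test_strips_currency_symbol():
        assert parse_price("$19.99") == 19.99

    def test_rejects_garbage():
        with pytest.raises(ValueError):
            parse_price("not a price")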
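
Similarly for point 4: rather than describing a function in prose, paste a skeleton and ask the model to fill in just the body. A hypothetical example of a prompt excerpt (the names and types are illustrative, not from a real project):

    # Prompt excerpt: a code skeleton handed to the model instead of an
    # English-only description. Everything here is illustrative.
    from dataclasses import dataclass

    @dataclass
    class Order:
        sku: str
        quantity: int
        unit_price: float

    def total_with_discount(orders: list[Order], discount_pct: float) -> float:
        """Sum quantity * unit_price across orders, then apply discount_pct.

        Raise ValueError if discount_pct is outside [0, 100].
        """
        ...  # ask the model to implement just this body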

FWIW, o3-mini was the best model I've seen so far; Sonnet 3.5 New is a close second.

zahlman No.42941403
LoC per hour seems to me like a terrible metric.
esafak No.42942607
Why? Since you are vetting the code it generates, the rate at which you end up with code you accept seems like a good measure of productivity.
59nadir No.42945471
1000 lines of perfectly inoffensive, hard-to-argue-against code that you don't need, because it's not the right solution, is negative velocity. Granted, I don't think that's much worse with LLMs, but I do think it's going to become a growing problem as the cost of creating useless taxonomies and abstractions goes down.

That is to say: I think LLMs are going to make a problem we already had (much) worse.