447 points | crawshaw | 1 comment
kgeist No.43998994
Today I tried "vibe-coding" for the first time using GPT-4o and 4.1. I did it manually - just feeding compilation errors, warnings, and suggestions in a loop via the canvas interface. The file was small, around 150 lines.

It didn't go well. I started with 4o:

- It used a deprecated package.

- After I pointed that out, it didn't update all usages - so I had to fix them manually.

- When I suggested a small logic change, it completely broke the syntax (we're talking "foo() } return )))" kind of broken) and never recovered. I gave it the raw compilation errors over and over again, but it didn't even register the syntax was off - just rewrote random parts of the code instead.

- Then I thought, "maybe 4.1 will be better at coding" (as advertised). But 4.1 refused to use the canvas at all. It just explained what I could change - as in, you go make the edits.

- After some pushing, I got it to use the canvas and return the full code. Except it didn't - it gave me a truncated version of the code with comments like "// omitted for brevity".

That's when I gave up.

Do agents somehow fix this? Because as it stands, the experience feels completely broken. I can't imagine giving this access to bash, sounds way too dangerous.

1. flashgordon No.44048487
I've been doing exactly this "manual" setup (actually, after a while I just wrote a few browser drivers so it is much snappier, but I am getting ahead of myself).

I started with GPT, which gave mediocre results, then switched to Claude, which was a step-function improvement - but it again ground to a halt when complexity got a bit high. The main problem was that past a certain size it did not give good ways to break down your projects.

Then I switched to Gemini. This has blown my mind. I don't even use Cursor etc. Just plain old simple prompts, summarization, and regular refactoring, and it handles itself pretty well. I must have generated 30M tokens so far (in about 3 weeks), with less than 1% of "backtracking" needed. I define backtracking as your context having gone so wonky that you have to start all over again.