Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison

(composio.dev)

483 points mraniki | 4 comments | 31 Mar 25 12:09 UTC

HarHarVeryFunny ◴[31 Mar 25 14:58 UTC] No.43535790[source]▶

I'd like to see an honest attempt by someone to use one of these SOTA models to code an entire non-trivial app. Not a "vibe coding" flappy bird clone or minimal iOS app (call an API to count calories in a photo), but something real - say 10K LOC type of complexity, using best practices to give the AI all the context and guidance necessary. I'm not expecting the AI to replace the programmer - just to be a useful productivity tool when we move past demos and function writing to tackling real world projects.

It seems to me that where we are today, AI is only useful for coding for very localized tasks, and even there mostly where it's something commonplace and where the user knows enough to guide the AI when it's failing. I'm not at all convinced it's going to get much better until we have models that can actually learn (vs pre-trained) and are motivated to do so.

replies(6): >>43535869 #>>43535969 #>>43536042 #>>43536795 #>>43536842 #>>43538608 #

1. redox99 ◴[31 Mar 25 16:28 UTC] No.43536842[source]▶

>>43535790 #

I use Cursor agent mode with Claude on my Next.js frontend and TypeScript GraphQL backend. It's a real, reasonably sized, production app that's a few years old (pre-ChatGPT).

I vibe code the vast majority of features nowadays. I generally don't need to write a single line of code. It often makes some mistakes, but the agent figures out that the tests fail or that it doesn't build, fixes it, and basically "one shots" it after doing its thing.

Only occasionally do I need to write a few lines of code or give it a hint when it gets stuck. But 99% of the code is written by Cursor.

replies(2): >>43537041 #>>43537948 #

2. HarHarVeryFunny ◴[31 Mar 25 16:45 UTC] No.43537041[source]▶

>>43536842 (TP) #

That's pretty impressive - a genuine real-world use case where the AI is doing the vast majority of the work.

3. orange_puff ◴[31 Mar 25 18:12 UTC] No.43537948[source]▶

>>43536842 (TP) #

When you say "vibe code", do you mean the true definition of that term, which is to blindly accept any code generated by the AI, see if it works (maybe agent mode does this) and move on to the next feature? Or do you mean prompt-driven development, where although you are writing basically none of the code, you are still reading every line and maintaining high involvement in the code base?

replies(1): >>43540127 #

4. redox99 ◴[31 Mar 25 21:22 UTC] No.43540127[source]▶

>>43537948 #

Kind of in between. I accept a lot of code without ever seeing it, but I check the critical stuff that could cause trouble. Or stuff that I know the AI is likely to mess up.

Specifically for the front end I mostly vibe code, and for the backend I review a lot of the code.

I will often follow up with prompts asking it to extract something to a function, or to not hardcode something.
