
467 points mraniki | 5 comments
neal_ ◴[] No.43534543[source]
I was using Gemini 2.5 Pro yesterday and it does seem decent. I still think Claude 3.5 is better at following instructions than the new 3.7 model, which just goes ham messing stuff up. Really disappointed by Cursor and the Claude CLI tool; for me they create more problems than they fix. I can't figure out how to use them on any of my projects without them ruining the project and creating terrible tech debt. I really like the way Gemini shows how much context window is left; I think every company should have this.

To be honest, I think there has been no major improvement beyond the original models which gained popularity first. It's just marginal improvements, 10% better or something, and the free models like DeepSeek are actually better IMO than anything OpenAI has. I don't think the market can withstand the valuations of the big AI companies. They have no advantage, their models are worse than free open source ones, and they charge money??? Where is the benefit of their product??

People originally said the models are the moat and the methods are top secret, but it turns out it's pretty easy to reproduce these models, and it's the application layer built on top of the models that is much more specific and has the real moat. People said the models would engulf these applications built on top and just integrate natively.
replies(4): >>43534760 #>>43534894 #>>43535115 #>>43536010 #
martin-t ◴[] No.43536010[source]
Whenever I read about LLMs or try to use them, I feel like I am asleep in a dream where two contradicting things can be true at the same time.

On one hand, you have people claiming "AI" can now do SWE tasks which take humans 30 minutes or 2 hours, and that the time doubles every X months, so by year Y, SW development will be completely automated.

On the other hand, you have people saying exactly what you are saying. Usually that LLMs have issues even with small tasks and that repeated/prolonged use generates tech debt even if they succeed on the small tasks.

These two views clearly can't both be true at the same time. My experience falls in the second category, so I'd like to chalk up the first as marketing hype, but it's confusing how many people who seemingly have nothing to gain from the hype contribute to it.

replies(4): >>43536241 #>>43536654 #>>43537271 #>>43537992 #
1. bitcrusher ◴[] No.43537992[source]
I'm not sure why this is confusing? We're seeing the phenomenon everywhere in culture lately. People WANT something to be true and try to speak it into existence. They also tend to be the people LEAST qualified to speak about the thing they are referencing. It's not marketing hype, it is propaganda.

Meanwhile, the 'experts' are saying something entirely different and being told they're wrong or worse, lying.

I'm sure you've seen it before, but this propaganda, in particular, is the holy grail of 'business people'. The ones who "have a great idea, just need you to do all the work" types. This has been going on since the late 70s, early 80s.

replies(1): >>43541254 #
2. martin-t ◴[] No.43541254[source]
Not necessarily confusing but very frustrating. This is probably the first time I encountered such a wide range of opinions and therefore such a wide range of uncertainty in a topic close to me.

When a bunch of people very loudly and confidently say your profession, something you're very good at, will become irrelevant in the next few years, it makes you pay attention. And when you then can't see what they claim to be seeing, it makes you question whether something is wrong with you or with them.

replies(1): >>43548219 #
3. bitcrusher ◴[] No.43548219[source]
Totally get that; I'm on the older side, so personally I've been down this road quite a few times. We're ALWAYS on the verge of our profession being rugged somehow. RAD tools, Outsourcing, In-sourcing, No-Code, AI/LLM... I used to be curious about why there was overwhelming pressure to eliminate "us", but gave up and just focus on doing good work.
replies(1): >>43552583 #
4. martin-t ◴[] No.43552583{3}[source]
The pressure is simple: money. Competent people are rare and we're not cheap. But it turns out those cheaper, less competent people can't replace us, no matter what tools you give them; there is fundamental complexity to the work we do which they can't handle.

However, I think this time is qualitatively different. This time the rich people who wanna get rid of us are not trying to replace us with other people. This time, they are trying to simulate _us_ using machines. To make "us" faster, cheaper and scalable.

I don't think LLMs will lead to actual AI and their benefit is debatable. But so much money is going into the research that somebody might just manage to build actual AI and then what?

Hopefully, in 10 years we'll all be laughing at how a bunch of billionaires went bankrupt trying to convince the world that autocomplete was AI. But if not, a whole bunch of people will be competing for a much smaller pool of jobs, making us all much, much poorer, while they capture, right into their pockets, all the value that would normally have been produced by us.

replies(1): >>43558333 #
5. bitcrusher ◴[] No.43558333{4}[source]
I agree; I wasn't clear in my previous post. I understand the economic underpinnings. I cannot understand the coupled animus and have stopped trying.