Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison

(composio.dev)

483 points mraniki | 3 comments | 31 Mar 25 12:09 UTC


phkahler ◴[31 Mar 25 13:30 UTC] No.43534852[source]▶

Here is a real coding problem that I might be willing to make a cash-prize contest for. We'd need to nail down some rules. I'd be shocked if any LLM can do this:

https://github.com/solvespace/solvespace/issues/1414

Make a GTK 4 version of Solvespace. We have a single C++ file for each platform - Windows, Mac, and Linux-GTK3. There is also a Qt version on an unmerged branch for reference. The GTK3 file is under 2KLOC. You do not need to create a new version, just rewrite the GTK3 Linux version to GTK4. You may either ask it to port what's there or create the new one from scratch.

If you want to do this for free to prove how great the AI is, please document the entire session. Heck, make a YouTube video of it. The final test is whether I accept the PR or not - and I WANT this ticket done.

I'm not going to hold my breath.

replies(15): >>43534866 #>>43534869 #>>43535026 #>>43535180 #>>43535208 #>>43535218 #>>43535261 #>>43535424 #>>43535811 #>>43535986 #>>43536115 #>>43536743 #>>43536797 #>>43536869 #>>43542998 #

jchw ◴[31 Mar 25 16:19 UTC] No.43536743[source]▶

>>43534852 #

I suspect it probably won't work, although not necessarily because an LLM architecture could never perform this type of work, but rather because it works best when the training set contains an inordinate amount of sample data. I'm actually quite shocked at what they can do in TypeScript and JavaScript, but they're definitely a bit less "sharp" when it comes to stuff outside of that zone in my experience.

The ridiculous amount of data required to get here hints that there is something wrong in my opinion.

I'm not sure if we're totally on the same page, but I understand where you're coming from here. Everyone keeps talking about how transformational these models are, but when push comes to shove, the cynicism isn't out of fear or panic, it's disappointment over and over and over. Like, if we had an army of virtual programmers fixing serious problems for open source projects, I'd be more excited about the possibilities than worried about the fact that I just lost my job. Honest to God. But the thing is, if that really were happening, we'd see it. And it wouldn't have to be forced and exaggerated all the time, it would be plainly obvious, like the way AI art has absolutely flooded the Internet... except I don't give a damn if code is soulless as long as it's good, so it would possibly be more welcome. (The only issue is that it will most likely actually suck when that happens, and rather just be functional enough to get away with, but I like to try to be optimistic once in a while.)

You really make me want to try this, though. Imagine if it worked!

Someone will probably beat me to it if it can be done, though.

replies(5): >>43537512 #>>43538902 #>>43539761 #>>43541786 #>>43552468 #

skydhash ◴[31 Mar 25 17:29 UTC] No.43537512[source]▶

>>43536743 #

> the cynicism isn't out of fear or panic, it's disappointment over and over and over

Very much this. When you criticize LLM marketing, people will say you're a Luddite.

I'd bet that no one actually likes to write code, as in typing into an editor. We know how to do it, and it's easy enough to enter a flow state while doing it. But everyone is trying to write less code by themselves with the proliferation of reusable code, libraries, frameworks, code generators, metaprogramming, ...

I'd be glad if I could have a DAW- or CAD-like interface with a very short feedback loop (the closest is live programming with Smalltalk), so that I don't have to keep visualizing the whole project (it's mentally taxing).

replies(3): >>43538806 #>>43539637 #>>43542982 #

1. galbar ◴[31 Mar 25 20:35 UTC] No.43539637[source]▶

>>43537512 #

>I'd bet that no one actually likes to write code

And you'd be wrong. I, for one, enjoy the process of handcrafting the individual mechanisms of the systems I create.

replies(1): >>43540010 #

2. skydhash ◴[31 Mar 25 21:11 UTC] No.43540010[source]▶

>>43539637 (TP) #

Do you like writing all the if, def, public void, import keywords? That is what I’m talking about. I prefer an IDE for Java and other verbose languages because of the code generation. And I configure my editors for templates and snippets because I don’t like to waste time on entering every single character (and learned vim because I can act on bigger units: words, lines, whole blocks).

I like programming, I do not like coding.

replies(1): >>43541035 #

3. galbar ◴[31 Mar 25 23:16 UTC] No.43541035[source]▶

>>43540010 #

I'm not bothered by if or def. public void can be annoying but it's also fast to type and it doesn't bother me. For import I always try my best to have some kind of autoimport. I too use vim and use macros for many things.

To be honest I'm more annoyed by having to repeat parameters three times in class constructors (args, member declaration and assignment), and I have a macro for it.
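For illustration, the triple repetition described above might look like the following minimal Python sketch. The class names `PointVerbose` and `PointCompact` are hypothetical and not from the thread, and the standard-library `dataclasses` module is shown only as one possible way to avoid the repetition; it is not the macro mentioned here.

```python
from dataclasses import dataclass


class PointVerbose:
    # Each field name appears three times:
    # member declaration, constructor argument, and assignment.
    x: float
    y: float

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


@dataclass
class PointCompact:
    # Each field is declared once; the decorator generates __init__.
    x: float
    y: float


verbose = PointVerbose(1.0, 2.0)
compact = PointCompact(1.0, 2.0)
print((verbose.x, verbose.y) == (compact.x, compact.y))
```

Both classes end up with the same fields; the difference is only how many times each name has to be typed.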

The thing is, most of the time I know what I want to write before I start writing. At that point, writing the code is usually the fastest way to the result I want.

Using LLMs usually requires more writing and iterations; plus waiting for whatever it generates, reading it, understanding it and deciding if that's what I wanted; and then it suddenly goes crazy halfway through a session and I have to start over...
