(composio.dev)

483 points mraniki | 1 comment | 31 Mar 25 12:09 UTC

1. stared [31 Mar 25 14:42 UTC] No.43535624

At this level, it is very contextual: it depends on your tools, prompts, language, libraries, and the whole codebase. For example, in one project where I generate ggplot2 code in R, Claude 3.5 gives way better results than the newer Claude 3.7.

Compare and contrast https://aider.chat/docs/leaderboards/, https://web.lmarena.ai/leaderboard, https://livebench.ai/#/.

Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison