←back to thread

2127 points bakugo | 2 comments | | HN request time: 0.463s | source
Show context
anotherpaulg ◴[] No.43164684[source]
Claude 3.7 Sonnet scored 60.4% on the aider polyglot leaderboard [0], WITHOUT USING THINKING.

Tied for 3rd place with o3-mini-high. Sonnet 3.7 has the highest non-thinking score, taking that title from Sonnet 3.5.

Aider 0.75.0 is out with support for 3.7 Sonnet [1].

Thinking support and thinking benchmark results coming soon.

[0] https://aider.chat/docs/leaderboards/

[1] https://aider.chat/HISTORY.html#aider-v0750

replies(18): >>43164827 #>>43165382 #>>43165504 #>>43165555 #>>43165786 #>>43166186 #>>43166253 #>>43166387 #>>43166478 #>>43166688 #>>43166754 #>>43166976 #>>43167970 #>>43170020 #>>43172076 #>>43173004 #>>43173088 #>>43176914 #
1. billmalarky ◴[] No.43176914[source]
Hi Paul, been following the aider project for about a year now to develop an understanding of how to build SWE agents.

I was at the AI Engineering Summit in NYC last week and met an (extremely senior) staff ai engineer doing somewhat unbelievable things with aider. Shocking things tbh.

Is there a good way to share stories about real-world aider projects like this with you directly (if I can get approval from him)? Not sure posting on public forum is appropriate but I think you would be really interested to hear how people are using this tool at the edge.

replies(1): >>43182570 #
2. tecleandor ◴[] No.43182570[source]
Hope it gets to be public, I love to learn "weird" (or unusual) ways of using tools