Tied for 3rd place with o3-mini-high. Sonnet 3.7 has the highest non-thinking score, taking that title from Sonnet 3.5.
Aider 0.75.0 is out with support for 3.7 Sonnet [1].
Thinking support and thinking benchmark results coming soon.
65% Sonnet 3.7, 32k thinking
64% R1+Sonnet 3.5
62% o1 high
60% Sonnet 3.7, no thinking
60% o3-mini high
57% R1
52% Sonnet 3.5
It's unclear to me how they'll shift to making money while providing almost no enhanced value.
It's not like the web was suddenly just there. It came slowly at first, then everywhere at once, and the money came even later.
Originally, electric generators merely replaced steam generators and brought no additional productivity gains. That only changed when the rest of the processes around them changed.
LLMs might enable some completely new things to be automated that made no sense to automate before, even if humans or computers still have to error-correct the results.