
2127 points bakugo | 1 comments
anotherpaulg ◴[] No.43164684[source]
Claude 3.7 Sonnet scored 60.4% on the aider polyglot leaderboard [0], WITHOUT USING THINKING.

Tied for 3rd place with o3-mini-high. Sonnet 3.7 has the highest non-thinking score, taking that title from Sonnet 3.5.

Aider 0.75.0 is out with support for 3.7 Sonnet [1].

Thinking support and thinking benchmark results coming soon.

[0] https://aider.chat/docs/leaderboards/

[1] https://aider.chat/HISTORY.html#aider-v0750

anotherpaulg ◴[] No.43166754[source]
Using up to 32k thinking tokens, Sonnet 3.7 set SOTA with a 64.9% score.

  65% Sonnet 3.7, 32k thinking
  64% R1+Sonnet 3.5
  62% o1 high
  60% Sonnet 3.7, no thinking
  60% o3-mini high
  57% R1
  52% Sonnet 3.5
mikae1 ◴[] No.43168852[source]
It's clear that progress is incremental at this point. At the same time Anthropic and OpenAI are bleeding money.

It's unclear to me how they'll shift to making money while providing almost no enhanced value.

khafra ◴[] No.43168989[source]
Yudkowsky just mentioned that even if LLM progress stopped right here, right now, there are enough fundamental economic changes to provide us a really weird decade. Even with no moat, if the labs are in any way placed to capture a little of the value they've created, they could make high multiples of their investors' money.
dragonwriter ◴[] No.43170002{3}[source]
With no moat, they aren't placed to capture much value; moats are what stop market competition from driving prices to the zero-economic-profit level. And that's even without further competition from free products made by people who aren't even trying to support themselves in the market you are selling into, which can make even the zero-economic-profit price untenable.
TeMPOraL ◴[] No.43171172{4}[source]
Market competition doesn't work in an instant; even without a moat, there's plenty of money they can capture before it evaporates.

Think of pouring water from a faucet into a sink with an open drain: if the flow rate is high enough, you can fill the sink faster than it drains. Then, when you turn the faucet off, you can still collect plenty of water with a cup or a bucket before the sink fully drains.

1. AJ007 ◴[] No.43172969{5}[source]
The startups that are using API credits seem like the most likely to be able to achieve a good return on capital. There is a pretty clear cost structure and it's much more straightforward whether you are making money or not.

The infrastructure side of things, with tens of billions and probably hundreds of billions going in, may not be fantastic for investors. The return on capital should approach the cost of capital if someone does their job correctly. Add in government investment and subsidies (in China, the EU, the United States) and it becomes extremely difficult to make those calculations. In the long term, I don't think the AI infrastructure (datacenters, fabs) will be overbuilt, but as in the telecom bubble, it is easy to end up in a position where there is a lot of excess capacity and the way you made your bet means getting wiped out.

Of course if you aren't the investor and it isn't your capital, then there is a tremendous amount of money to be made because you have nothing to lose. I've been around a long time, and this is the closest thing I've felt to that inflection point where the web took off.