
Grok 3: Another win for the bitter lesson

(www.thealgorithmicbridge.com)
129 points | kiyanwang | 1 comment
bambax ◴[] No.43112611[source]
This article is weak and just general speculation.

Many people doubt the actual performance of Grok 3 and suspect it has been specifically trained on the benchmarks. And Sabine Hossenfelder says this:

> Asked Grok 3 to explain Bell's theorem. It gets it wrong just like all other LLMs I have asked because it just repeats confused stuff that has been written elsewhere rather than looking at the actual theorem.

https://x.com/skdh/status/1892432032644354192

This suggests that "massive scaling", even enormous, gigantic scaling, doesn't improve intelligence one bit; it may improve scope, flexibility, or coverage, but not "intelligence".

replies(7): >>43112886 #>>43112908 #>>43113270 #>>43113312 #>>43113843 #>>43114290 #>>43115189 #
1. jiggawatts ◴[] No.43113312[source]
People have called LLMs a "blurry picture of the Internet". Improving the focus won't change the subject of the picture, it just makes it sharper. Every photographer knows this!

A fundamentally new approach is needed, such as training AIs in phases: instead of merely training a model to parrot its inputs, a first AI critiques and analyses the inputs, the annotated data is used to train a second model in a second pass, that model critiques the data again, and so on, probably for half a dozen or more iterations. On each round, the model learns not just what it heard, but also an analysis of its veracity, validity, and consistency.
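A rough sketch of what such a phased critique-and-retrain loop might look like, purely illustrative: the training and critique functions below are stubs standing in for real pretraining and inference, not any lab's actual pipeline.

    from dataclasses import dataclass

    @dataclass
    class StubModel:
        generation: int

    def train_model(corpus, generation=0):
        # Placeholder for a full training run on `corpus`.
        return StubModel(generation)

    def critique(model, doc):
        # Placeholder for the model assessing veracity, validity, and consistency.
        return f"[gen {model.generation} assessment of: {doc[:40]}...]"

    def phased_training(raw_corpus, rounds=6):
        """Each round: the current model critiques the data, and the next
        model is trained on the data plus those critiques."""
        corpus = list(raw_corpus)
        model = train_model(corpus)            # round 0: plain imitation of the data
        for r in range(1, rounds + 1):
            annotated = [f"{doc}\n\nCRITIQUE: {critique(model, doc)}" for doc in corpus]
            model = train_model(annotated, generation=r)  # learns data + its assessment
            corpus = annotated
        return model

    final_model = phased_training(["Bell's theorem states ...", "Some web text ..."])

The point of the loop is that each generation trains on data that its predecessor has already judged, so the learned distribution includes the assessments, not just the raw text.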

Notably, something akin to this was done in training DeepSeek, but only in a limited fashion.