S1: A $6 R1 competitor?

(timkellogg.me)
851 points | tkellogg | 1 comment
Aperocky ◴[] No.42950592[source]
For all the hype about thinking models, this feels much more like compression, in the information-theory sense, than a "takeoff" scenario.

There is a finite amount of information stored in any large model. The models are really good at presenting the correct information back, and adding thinking blocks makes them even better at doing that. But there is a cap on that.

Just as you can compress a file by a lot, but only down to a theoretical minimum size before the compression has to become lossy, there is a theoretical maximum to the relevant information a model can produce, regardless of how long it is forced to think.
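The lossless-compression floor the comment alludes to is Shannon entropy. A small sketch (illustrative only; the i.i.d. byte entropy is a floor for compressors that model bytes independently, and zlib can go lower by also exploiting byte order):

```python
import math
import zlib
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the empirical byte distribution: a lower
    bound, in bits per byte, for any lossless compressor that treats
    bytes as independent draws from that distribution."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

repetitive = b"abab" * 4096              # low-entropy input, 16 KiB
h = entropy_bits_per_byte(repetitive)    # two equally likely bytes -> 1.0

compressed = zlib.compress(repetitive, 9)
bits_per_byte = 8 * len(compressed) / len(repetitive)

# zlib lands far below 8 bits/byte here (and even below the 1.0
# independent-byte floor, because it also models the repeating
# *sequence*), but no lossless scheme can beat the true source entropy.
print(h, bits_per_byte)
```

The analogy to models: past the entropy floor you can only discard information, and past a model's stored information, more "thinking" can only rearrange what is already there.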

replies(3): >>42951063 #>>42956052 #>>42960773 #
1. jedbrooke ◴[] No.42960773[source]
My thinking (hope?) is that reasoning models will be more like a calculator: it doesn't have to "remember" all the possible combinations of addition, multiplication, etc. for all the numbers, but can actually compute the results.

As reasoning improves, the models could start from a basic set of principles and build from there. Of course, for facts grounded in reality, RAG would still likely be best, but maybe with enough "reasoning" a model could simulate an approximation of the universe well enough to get to an answer.
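The calculator analogy can be made concrete with a toy contrast (purely illustrative, not a claim about how an LLM works internally): a finite lookup table versus a small procedure built from basic principles.

```python
# "Memorization": a finite table of answers, analogous to facts baked
# into model weights. It cannot answer anything outside its range.
lookup = {(a, b): a + b for a in range(100) for b in range(100)}

def add_by_lookup(a, b):
    return lookup.get((a, b))  # None for unseen inputs, e.g. (150, 7)

# "Computation": grade-school column addition on decimal strings --
# a small fixed procedure covering infinitely many cases the table can't.
def add_digits(x: str, y: str) -> str:
    x, y = x[::-1], y[::-1]  # work least-significant digit first
    digits, carry = [], 0
    for i in range(max(len(x), len(y))):
        s = carry
        s += int(x[i]) if i < len(x) else 0
        s += int(y[i]) if i < len(y) else 0
        digits.append(str(s % 10))
        carry = s // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

print(add_digits("999", "1"))  # "1000" -- no table entry required
```

The hope in the comment is that reasoning traces let a model behave more like `add_digits` and less like `lookup`: a compact procedure generalizing beyond what was memorized.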