To me the amazing thing is that you can tell the model to do something in plain English, like make a list or write some Python code to do $x, and it follows the instructions.
Then you can ask for the same list sorted and get it nearly instantly.
These models have a short time context for now, but they already have a huge “working memory” relative to us.
It is very cool, and it suggests that vastly smarter models will be achieved fairly easily, given new insight.
Our biology has had to work ruthlessly within its biological and ecosystem energy envelope, and with the limited value returned for effort in a pre-internet world without a vast economy.
So biology has never been able to scale, only to get marginally more efficient and effective within tight limits.
Suddenly (in historical and biological terms), energy availability limits have been removed, and the value that work can return has compounded and continues to do so. It is unsurprising that those changes suddenly unlock vast, easily achieved, untapped room for cognitive upscaling.
I don't think your second sentence logically follows from the first.
Relative to us, these models:
- Have a much larger working memory.
- Have much more limited logical reasoning skills.
To some extent, these models are able to use their superior working memories to compensate for their limited reasoning abilities. This can make them very useful tools! But there may well be a ceiling to how far that can go.
When you ask a model to "think about the problem step by step" to improve its reasoning, you are basically just giving it more opportunities to draw on its huge memory bank and try to put things together. But humans are able to reason with orders of magnitude less training data. And by the way, we are out of new training data to give the models.
Only easily accessible text data. We haven't really started using video at scale yet, for example. It also looks like data for specific tasks goes a long way. For example, agentic coding interactions aren't something that has generally been captured on the internet, but capturing interactions with coding agents, combined with the programming knowledge already captured in base training, is resulting in significant performance increases. The amount of specialized data we might need to gather or synthetically generate is perhaps orders of magnitude less than presumed with pure supervised learning systems. And for other applications like industrial automation or robotics, we've barely started capturing all the sensor data that lives in those systems.