
AI 2027

(ai-2027.com)
949 points Tenoke | 1 comments
Vegenoid ◴[] No.43585338[source]
I think we've actually had capable AIs for long enough now to see that this kind of exponential advance to AGI in 2 years is extremely unlikely. The AI we have today isn't radically different from the AI we had in 2023. They are much better at the thing they are good at, and there are some new capabilities that are big, but they are still fundamentally next-token predictors. They still fail at larger-scope, longer-term tasks in mostly the same way, and they are still much worse at learning from small amounts of data than humans. Despite their ability to write decent code, we haven't seen the signs of a runaway singularity that some thought was likely.

I see people saying that these kinds of things are happening behind closed doors, but I haven't seen any convincing evidence of it, and there is an enormous propensity for AI speculation to run rampant.

replies(8): >>43585429 #>>43585830 #>>43586381 #>>43586613 #>>43586998 #>>43587074 #>>43594397 #>>43619183 #
benlivengood ◴[] No.43585830[source]
METR [0] explicitly measures the progress on long-term tasks; it's as steep a sigmoid as the other progress at the moment, with no inflection yet.

As others have pointed out in other threads, RLHF has progressed beyond next-token prediction and modern models are modeling concepts [1].

[0] https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

[1] https://www.anthropic.com/news/tracing-thoughts-language-mod...

replies(2): >>43585918 #>>43586196 #
Vegenoid ◴[] No.43586196[source]
At the risk of coming off like a dolt and being super incorrect: I don't put much stock into these metrics when it comes to predicting AGI. Even if the trend of "length of task an AI can reliably do doubles every 7 months" continues, as they say, that means we're years away from AI that can complete tasks that take humans weeks or months. I'm skeptical that the doubling trend will continue into that timescale. I think there is a qualitative difference between tasks that take weeks or months and tasks that take minutes or hours, a difference that is not reflected by simple quantity. I think many people responsible for hiring engineers are keenly aware of this distinction, because of their experience attempting to choose good engineers based on how they perform in task-driven technical interviews that last only hours.
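
As a back-of-the-envelope check on the "years away" estimate above, here is a minimal Python sketch of the doubling arithmetic. The 7-month doubling period is the figure quoted in the comment; the 1-hour starting horizon, the 40-hour work week, and the 167-hour work month are illustrative assumptions, not numbers taken from the thread.

```python
import math

# Doubling period quoted in the comment above (months per doubling).
DOUBLING_MONTHS = 7.0

# ASSUMPTION (illustrative, not from the thread): the task length, in human
# working hours, that a model can reliably complete today.
ASSUMED_CURRENT_HORIZON_HOURS = 1.0


def months_until(target_hours, current_hours=ASSUMED_CURRENT_HORIZON_HOURS,
                 doubling_months=DOUBLING_MONTHS):
    """Months of steady doubling needed to go from current_hours to target_hours."""
    if target_hours <= 0 or current_hours <= 0:
        raise ValueError("task lengths must be positive")
    doublings = math.log2(target_hours / current_hours)
    # A target at or below the current horizon needs no further doublings.
    return max(0.0, doublings * doubling_months)


if __name__ == "__main__":
    # ASSUMPTION: a 40-hour work week and a roughly 167-hour work month.
    for label, hours in [("one work week", 40.0), ("one work month", 167.0)]:
        months = months_until(hours)
        print(f"{label}: {months:.0f} months (~{months / 12:.1f} years)")
```

Under these assumptions the trend needs roughly three years to reach week-long tasks and a bit over four to reach month-long ones, which is consistent with the "years away" reading. The estimate is only as good as the premise that the doubling continues unchanged, which is exactly what the comment questions.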

Intelligence as humans have it seems like a "know it when you see it" thing to me, and metrics that attempt to define and compare it will always be looking at only a narrow slice of the whole picture. To put it simply, the gut feeling I get based on my interactions with current AI, and how it has developed over the past couple of years, is that AI is missing key elements of general intelligence at its core. While there's lots more room for its current approaches to get better, I think there will be something different needed for AGI.

I'm not an expert, just a human.

replies(2): >>43586698 #>>43586818 #
1. benlivengood ◴[] No.43586698[source]
> I think there is a qualitative difference between tasks that take weeks or months and tasks that take minutes or hours, a difference that is not reflected by simple quantity.

I'd label that difference as long-term planning plus executive function, and wherever that overlaps with or includes delegation.

Most long-term projects are not done by a single human, so delegation almost always plays a big part. To delegate, tasks must be broken down in useful ways. To break down tasks, a holistic model of the goal is needed, one in which components that can be compartmentalized can be identified.

I think a lot of those individual elements are within reach of current model architectures but they are likely out of distribution. How many Gantt charts, project plans, and project manager meetings are in the pretraining datasets? My guess is few; they are internal artifacts that are rarely published. Books and articles touch on the concepts, but I think the models learn best from the raw data; they can probably tell you very well all of the steps of good project management because the descriptions are all over the place. The actual doing of it is farther toward the tail of the distribution.