AGI Is Still 30 Years Away – Ege Erdil and Tamay Besiroglu

(www.dwarkesh.com)

174 points Philpax | 5 comments | 17 Apr 25 16:42 UTC

dicroce ◴[17 Apr 25 17:35 UTC] No.43719918[source]▶

>>43719280 (OP) #

Doesn't even matter. The capabilities of the AI that's out NOW will take a decade or more to digest.

replies(3): >>43719953 #>>43722914 #>>43747545 #

EA-3167 ◴[17 Apr 25 17:37 UTC] No.43719953[source]▶

>>43719918 #

I feel like it's already been pretty well digested and excreted for the most part, now we're into the re-ingestion phase until the bubble bursts.

replies(4): >>43719975 #>>43720000 #>>43720090 #>>43720159 #

1. dicroce ◴[17 Apr 25 17:53 UTC] No.43720159[source]▶

>>43719953 #

Not even close. Software can now understand human language... this is going to mean computers can be a lot more places than they ever could. Furthermore, software can now understand the content of images... eventually this will have a wild impact on nearly everything.

replies(2): >>43720259 #>>43722780 #

2. AstralStorm ◴[17 Apr 25 18:01 UTC] No.43720259[source]▶

>>43720159 (TP) #

Understand? It fails to understand a rephrasing of a math problem a five-year-old can solve... They get much better at training to the test from memory the bigger they get. Likewise, you can get some emergent properties out of them.

Really it does not understand a thing, sadly. It can barely analyze language and spew out a matching response chain.

To actually understand something, it must be capable of breaking it down into constituent parts, synthesizing a solution and then phrasing the solution correctly while explaining the steps it took.

And that's not even something a huge 62B LLM with the notepad chain of thought (like o3, GPT-4.1 or Claude 3.7) can really do properly.

Further, it has to be able to operate at the sub-token level. Say, what happens if I run together truncated versions of words or sentences? Even a chimpanzee can handle that (in sign language).

It cannot do true multimodal IO either. You cannot ask it to respond with at least two matching syllables per word and two pictures of syllables per word, in addition to letters. This is a task a 4-year-old can do.

Prediction alone is not indicative of understanding. Pasting together answers like lego is also not indicative of understanding. (Afterwards ask it how it felt about the task. And to spot and explain some patterns in a picture of clouds.)

3. burnte ◴[17 Apr 25 22:12 UTC] No.43722780[source]▶

>>43720159 (TP) #

It doesn't understand anything; there is no understanding going on in these models. It takes input and generates output based on the statistical math created from its training set. It's Bayesian statistics and vector/matrix math. There is no cogitation or actual understanding.

replies(1): >>43723563 #

4. abletonlive ◴[18 Apr 25 00:00 UTC] No.43723563[source]▶

>>43722780 #

This is an insanely reductionist and mindless regurgitation of what we already know about how the models work. Understanding is a spectrum; it's not binary. We can measurably show that there is, in fact, some kind of understanding.

If you explain a concept to a child, you check for understanding by seeing if the output they produce checks out with your understanding of the concept. You don't peer into their brain and see if there are neurons and consciousness happening.

replies(1): >>43729248 #

5. burnte ◴[18 Apr 25 15:54 UTC] No.43729248{3}[source]▶

>>43723563 #

The method of verification has no bearing on the validity of the conclusion. I don't open a child's head because there are side effects on the functioning of the child post brain-opening. However, I can look into the brain of an AI with no such side effects.

This is an example I saw 2 days ago without even searching. Here ChatGPT is telling someone that it independently ran a benchmark on its MacBook: https://pbs.twimg.com/media/Goq-D9macAApuHy?format=jpg

I'm reasonably sure ChatGPT doesn't have a MacBook, and didn't really run the benchmarks. But it DID produce exactly what you would expect a human to say, which is what it is programmed to do. No understanding, just rote repetition.

I won't post more because there are a billion of them. LLMs are great, but they're not intelligent, they don't understand, and the output still needs to be validated before use. We have a long way to go, and that's ok.
