sejje No.43744995
In the last example (the riddle), I generally assume the AI isn't misreading; rather, it assumes you couldn't have given it the riddle correctly, because it has seen the original already.

I would do the same thing, I think. It's too well-known.

The variation doesn't read like a riddle at all, so it's confusing even to me as a human. I can't find the riddle part. Maybe the AI is confused, too. I think it makes an okay assumption.

I guess it would be nice if the AI asked a follow-up question like "are you sure you wrote down the riddle correctly?" I think it could if instructed to, but right now these models don't generally do that on their own.

Jensson No.43745113
> generally assume the AI isn't misreading; rather, it assumes you couldn't have given it the riddle correctly, because it has seen the original already.

LLMs don't assume; they are text completers. When one sees something that looks almost like a well-known problem, it completes it as that well-known problem. That failure mode is specific to being a text completer, and it is hard to get around.
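
To see the mechanism in its dumbest form, here is a toy bigram completer (a made-up sketch, nothing like a real transformer; the training text and prompt are invented) that has only ever seen one riddle, so any prompt resembling it gets funneled straight back into the memorized continuation:

    # Toy bigram "text completer". It has seen only the canonical riddle,
    # so any prompt that even resembles it is completed with the memorized
    # continuation. Illustration only; training text and prompt invented.
    from collections import defaultdict

    canonical = ("the surgeon says i can't operate on this boy "
                 "he is my son the surgeon is his mother").split()

    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(canonical, canonical[1:]):
        counts[prev][nxt] += 1  # word-bigram frequencies

    def complete(prompt, steps=7):
        # Greedily append the most frequent follower of the last word.
        out = prompt.split()
        for _ in range(steps):
            followers = counts.get(out[-1])
            if not followers:
                break
            out.append(max(followers, key=followers.get))  # argmax, no sampling
        return " ".join(out)

    # A *variation* of the riddle still collapses into the memorized text:
    print(complete("the boy's father is a nurse and the surgeon"))
    # -> the boy's father is a nurse and the surgeon says i can't operate on this boy

A real model is incomparably more sophisticated, but when one continuation dominates the training statistics, the output can collapse the same way.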

monkpit No.43745289
This take misses a key part of how these LLMs are implemented, and I've been struggling to put my finger on exactly what.

In every LLM thread someone chimes in with “it’s just a statistical token predictor”.

I feel this misses the point, and I think it dismisses attention heads and transformers; that's what sits wrong with me every time I see this kind of take.

There _is_ an assumption being made within the model at runtime. Assumption, confusion, uncertainty - one camp might argue that none of these exist in the LLM.

But doesn’t the implementation constantly make assumptions? And what even IS your definition of “assumption” that’s not being met here?

Edit: I guess my point, overall, is: what’s even the purpose of making this distinction anymore? It derails the discussion in a way that’s not insightful or productive.
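
For reference, the attention machinery the "just a token predictor" framing shrugs off is, per head, roughly the computation below; a minimal numpy sketch, with the learned projection matrices omitted and shapes assumed:

    # Scaled dot-product attention: single head, no batching, no masking.
    # Each output row is a context-dependent weighted mixture of the value
    # vectors, recomputed for every input; mechanically the closest thing
    # to the model "making an assumption" at runtime.
    import numpy as np

    def attention(Q, K, V):
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)      # query/key similarity
        w = np.exp(scores - scores.max(-1, keepdims=True))
        w /= w.sum(-1, keepdims=True)      # softmax over keys
        return w @ V                       # weighted mixture of values

    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))            # 5 tokens, 8-dim embeddings
    print(attention(x, x, x).shape)        # (5, 8)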

Jensson No.43746020
> I feel this misses the point and I think it dismisses attention heads and transformers

Those just make it better at completing the text. For very common riddles, that machinery still gets easily overruled by plain text-completion statistics, because the weights for the canonical version are so extremely strong.

The point is that once you understand it's a text completer, it's easy to see why it fails at these. To fix this properly you'd need to make it no longer just complete text, and that is hard to do without breaking it.
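
To put a number on "extremely strong": with made-up logits for two candidate continuations (illustrative values, not from any real model), the softmax gap shows why greedy decoding almost never picks the literal reading:

    import math

    # Hypothetical final-layer logits for two continuations of a
    # near-canonical riddle. Values invented for illustration.
    logits = {"memorized answer": 9.0, "literal answer": 4.0}

    z = sum(math.exp(v) for v in logits.values())
    for name, v in logits.items():
        print(f"{name}: {math.exp(v) / z:.3f}")
    # memorized answer: 0.993
    # literal answer: 0.007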