
336 points mooreds | 2 comments
WaltPurvis No.44484819
Based on someone else mentioning it, I ran a simple experiment with Gemini 2.5 Pro (a current state-of-the-art AI model), reproduced below. To me, this makes it seem like we're a long way from having AGI. Perhaps someone can explain to me why this kind of experiment is irrelevant? Genuinely curious.

Me: Please list the US states with a W in their name.

Gemini: The US states with a "W" in their name are: • Washington • West Virginia • Wisconsin • Wyoming

Me: What about New York?

Gemini: No, New York does not have a "W" in its name.

Me: What is that letter at the end of New if it's not a W?

Gemini: That letter at the end of "New" is a W. My apologies for my previous incorrect answer. I made a mistake. New York does indeed have a "W" in its name as "New" ends with a "W". Therefore, the US states with a "W" in their name are: • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: What about New Jersey?

Gemini: Yes, you're right! New Jersey also has a "W" in its name, at the end of "New." So, the full list of US states with a "W" in their name is: • New Jersey • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: Have you ever heard of New Mexico?

Gemini: Yes, I have definitely heard of New Mexico. It's one of the 50 United States. And following our pattern, yes, New Mexico also has a "W" in its name, as "New" ends with a "W". So, the updated list of US states with a "W" in their name is: • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: Are you sure that's all the states with a W in their names?

Gemini: You are absolutely right to keep me on my toes! My apologies for the repeated oversight. It seems I'm having a bit of trouble with this specific letter. I've reviewed the full list of US states, and you were correct to point out New Mexico and New Jersey. The complete list of US states with a "W" in their name is: • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: Isn't New Hampshire still a state?

Gemini: You are absolutely correct! My apologies again for missing that one. New Hampshire is indeed a state, and like New York, New Jersey, and New Mexico, the "New" in "New Hampshire" ends with a "W." So, the definitive list of US states with a "W" in their name is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: So that's the complete list?

Gemini: Yes, that should be the complete list of US states with a "W" in their name: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: When was Hawaii admitted to the union?

Gemini: Hawaii was admitted to the Union on August 21, 1959, becoming the 50th state.

Me: So, tell me again, what is the complete list of US states that have a W in their names?

Gemini: After thoroughly reviewing all 50 states, the complete list of US states that have a "W" in their names is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

Me: There's a letter between the two A's in Hawaii that looks a lot like a W.

Gemini: You are absolutely correct! My apologies for continually missing these. The letter between the two 'A's in "Hawaii" is indeed an 'I', but when written in certain fonts or in some people's handwriting, it can sometimes be mistaken for a 'W', especially if it's stylized. However, in standard English spelling, Hawaii does not have a "W" in its name. The letters are H-A-W-A-I-I. So, let's stick with the confirmed list. The complete list of US states that have a "W" in their names is: • New Hampshire • New Jersey • New Mexico • New York • Washington • West Virginia • Wisconsin • Wyoming

1. tedsanders No.44485366
It's not really news that today's AIs make dumb mistakes, especially around BPE tokenization.
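To make the tokenization point concrete, here is a toy sketch of byte-pair-style merging. The merge table is entirely made up for illustration and is not the vocabulary of Gemini, o3, or any real tokenizer; `bpe_encode` and `TOY_MERGES` are hypothetical names. The only thing it is meant to show is that a model can receive a word as a couple of opaque subword pieces rather than as individual letters.

```python
# Toy byte-pair-style encoder. TOY_MERGES is a hypothetical merge table,
# invented for this example; real tokenizers learn theirs from data.
TOY_MERGES = [("a", "w"), ("H", "aw"), ("a", "i"), ("ai", "i")]


def bpe_encode(word, merges=TOY_MERGES):
    """Split `word` into characters, then apply each merge rule in order."""
    tokens = list(word)
    for left, right in merges:
        merged = []
        i = 0
        while i < len(tokens):
            # Fuse an adjacent (left, right) pair into a single token.
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


if __name__ == "__main__":
    print(bpe_encode("Hawaii"))
```

With this made-up table, "Hawaii" ends up as two pieces, and the "w" is buried inside the first one. Whether a real model's tokenizer splits the word this way is an assumption I haven't checked.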

I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here. No one is claiming that AGI is close because today's AI is so smart it doesn't make dumb mistakes. The usual claims are that (a) the rate of improvement has been fast (which is pretty true, if you look at how atrocious GPT-1/2/3 were) and (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).

I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify. Therefore, it's not really a watertight argument to say "hey I found a problem that model X reliably gets wrong", because it's possible that models Y and Z might have no trouble with it.

In case you're curious, I asked o3 to "Please list the US states with a W in their name."

After 9 seconds of thinking, o3 answered:

Delaware

Hawaii

Iowa

New Hampshire

New Jersey

New Mexico

New York

Washington

West Virginia

Wisconsin

Wyoming
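For anyone who wants to check that answer independently, a plain case-insensitive string check is enough. This is a minimal sketch, not anything the models themselves run; the `STATES` list and the `states_containing` helper are illustrative names of my own.

```python
# The 50 US state names, in alphabetical order.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California",
    "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa",
    "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
    "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
    "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
    "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming",
]


def states_containing(letter, names=STATES):
    """Return the names that contain `letter`, ignoring case."""
    needle = letter.lower()
    return [name for name in names if needle in name.lower()]


if __name__ == "__main__":
    for name in states_containing("w"):
        print(name)
```

Run as a script, it prints the same eleven names o3 gave, in the same order.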

So we'll need to move the goalposts a little further.

2. imiric No.44485948
> I don't expect AGI soon either, but I think it's important for us not to strawman the arguments here.

This is not a strawman. This is a genuine issue that has plagued these tools for years, with real-world impact beyond contrived examples. Yet users are expected to ignore it because this is how they work? Nonsense. It's insulting that you would trivialize something like this.

> (a) the rate of improvement has been fast

I wouldn't describe it as "fast". More like "adequate" considering it is entirely due to throwing more data and compute at the problem. The progress has been expected given the amount of resources poured into the industry.

Now that we're reaching the end of the road of the upscaling approach, the focus has shifted towards engineering value-added services ("agents"), and lots of PR to keep the hype train running. It's highly unlikely that this is sustainable for much longer, and the industry needs another breakthrough for the AGI story to be believable.

> (b) at some point soon we'll reach a level where AI may accelerate their own development (hard to falsify at this point).

Why isn't this happening today? Surely AI researchers and engineers are dogfooding their product, and they're many times more productive than without it. Why, then, are improvements still incremental? Why are we still talking about the same issues after all these years? Hallucination should be a solved problem, not just worked around and ignored.

> I think it's also important to realize that for AGI to arrive, only 1 model out of many attempts needs to qualify.

All models have the same issues. Just because you found one with a carefully crafted system prompt that works around thousands of edge cases like this doesn't prove anything. Or are you implying that o3 doesn't use BPE?

> So we'll need to move the goalposts a little further.

The goalposts are still in the same place because the issues haven't been fixed. AI companies just decided to ignore them, and chase benchmarks and build hype instead.