Agreed, but I also think that to be called AGI, they should be capable of working through human interfaces rather than needing special interfaces built for them to compensate for their lack of generality.
The catch here isn't the ability to use these interfaces, though. I expect that will be easy. The hard part is that once these interfaces are learned, the scope and search space of what these systems can do becomes vastly larger. And moreover, our expectations will change in how we expect an AGI to handle itself once our way of working with it becomes more human.
Right now we're claiming nascent AGI, but really much of what we're asking these systems to do has been laid out for them: a limited set of protocols and interfaces, and a targeted set of tasks to which we normally apply them. And our expectations are calibrated accordingly. We don't converse with them as we would with a human. Their search space is much smaller. So while they appear AGI-like on specific tasks, I think it's because we're subconsciously grading them on a curve. The narrow ways we have to interact with them prejudice us toward a very low bar.
That said, I agree that video feed and mouse is a terrible protocol for AI. Even so, I wouldn't be surprised if that's what we end up settling on. Long term, it will simply be easier for these bots to learn and adapt to human interfaces than for us to maintain two parallel sets of interfaces for everything, except in specific bot-to-bot cases. It's horribly inefficient, but in my experience efficiency never comes out ahead with each new generation of UIs.