> the Park et al. “Generative Agents” paper is using a different architecture from GPT
It's using a wrapper around GPT-3.5.
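For concreteness, here's a minimal sketch of what "a wrapper" means here. This is not the paper's actual code, just the shape of it: the agent keeps a memory stream, retrieves some memories, and asks the underlying model to plan. `call_llm` is a hypothetical stand-in for whatever chat-completion client you use.

```python
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Stand-in for a GPT-3.5 chat-completion call; replace with a real client."""
    return f"[model response to: {prompt[:40]}...]"


@dataclass
class Agent:
    name: str
    memories: list[str] = field(default_factory=list)

    def observe(self, event: str) -> None:
        # Append observations to the agent's memory stream.
        self.memories.append(event)

    def plan(self, goal: str, k: int = 5) -> str:
        # Naive retrieval: just take the k most recent memories. The paper's
        # wrapper scores memories by recency, importance, and relevance instead.
        context = "\n".join(self.memories[-k:])
        prompt = (
            f"You are {self.name}. Recent memories:\n{context}\n\n"
            f"Make a short plan to achieve: {goal}"
        )
        return call_llm(prompt)


if __name__ == "__main__":
    agent = Agent("Klaus")
    agent.observe("Woke up at 7am.")
    agent.observe("Remembered the essay is due Friday.")
    print(agent.plan("finish the essay draft"))
```

The point is that everything around the model is ordinary bookkeeping; the planning itself is still the LLM completing text.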
> and specifically says that an LLM by itself is not capable of making believable plans (!)
I don't see where it says that, and at any rate I suspect it's false. Be careful not to equivocate between "We couldn't get it to make a plan" and "it cannot make plans". Many people have decided that LLMs are incapable of a thing on the basis of very bad prompts.
> and they warn “We suggest that generative agents should never be a substitute for real human input in studies and design processes.”
I suspect they're referring to generative agents at the current level of skill. I don't think they mean it as "ever, under any circumstances, no matter how capable".
> We have some idea about how to make AI permanently learn to autocomplete about things that weren’t in the training data, but GPT doesn’t have that yet, nor does any other AI to date.
I don't even think humans have that. We simply have a very abstracted library of patterns. "Things that aren't in the training data" don't look like an unusual circumstance; they look like random noise. So long as we can phrase a situation in understandable terms, it's by definition not outside the training data.
> Are you essentially arguing that you think humans are autocomplete and nothing more?
I view it the other way around: I think "autocomplete" is such a generic term that it can fit anything we do. It's like saying humans are "just computers" - like, yes, I think the function our brains evaluate is computable, but that doesn't actually put any restrictions on what it can be. Any worldmodel can be called "autocomplete".
> But the theme I’m seeing in the pro-AI arguments is a talking point that since we don’t fully understand human consciousness and can’t define it in such a way that excludes today’s AI, then GPT is probably AGI already.
To be clear, I think GPT is AGI for other reasons; the arguments about consciousness simply fail to justify excluding it. I think GPT is AGI because when I try to track the development of AI, I evaluate something like: which capabilities do I, a human, have? Which capabilities do I know GPT can simulate? What's left to make them match up? GPT will naturally develop analogues to these capabilities simply as a matter of backprop over a human data corpus; if it fails, it will be due to insufficient scale, inadequate design, inadequate training, etc. So then I look at how it fails. Where is the step where I make a decision, where I introspectively go zig, and GPT goes zag? And in my model, none of the remaining weaknesses and inabilities are things that the transformer architecture cannot represent. My view is that if you got God to do it, He could probably turn GPT 3.5 into an AGI by precisely selecting the right weights and then writing a very thin wrapper. I think the fact that we cannot find those weights is much more down to the training corpus than to anything architectural. When I watch GPT 3.5 reason through a problem, I recognize my internal narrative; conversely, when I make an amusing blooper IRL, I occasionally recognize where my brain autocompleted a pattern a bit too readily.
Of course, the oft-repeated pattern of "GPT will never be able to X", followed the next day by "We present a prompt that gets GPT to do X", also does nothing to dissuade me.
Like, "GPT can't think, it can only follow prompts". Prompts are a few lines of text. GPT is a text generator. Do you really think prompts are going to be, in the long term, the irreducibly human technology that keeps GPT from parity with us? If the only thing keeping GPT from being AGI is the ability to write prompts, then we're one good prompt-generation dataset away from AGI.
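To make that last point concrete, here's a toy illustration, under the assumption that "writing a prompt" is itself just a text-completion task. `call_llm` is again a hypothetical stand-in for whatever completion API you use; nothing here is a real library call.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a GPT-style completion call."""
    return f"[model output for: {prompt[:40]}...]"


def self_prompting_loop(task: str, steps: int = 3) -> str:
    # The model writes a prompt, answers it, then revises its own prompt:
    # prompts are text, and the model generates text.
    prompt = f"Write the best possible prompt for solving this task: {task}"
    result = ""
    for _ in range(steps):
        generated_prompt = call_llm(prompt)
        result = call_llm(generated_prompt)
        prompt = (
            f"Task: {task}\nPrevious attempt: {result}\n"
            "Write an improved prompt that would get a better answer."
        )
    return result


print(self_prompting_loop("summarize the Generative Agents paper"))
```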