Most active commenters
  • hackinthebochs(6)
  • vundercind(5)
  • layer8(5)
  • jedberg(4)
  • nomel(4)
  • og_kalu(3)
  • youoy(3)
  • nuancebydefault(3)
  • fragmede(3)


625 points lukebennett | 116 comments
1. irrational ◴[] No.42139106[source]
> The AGI bubble is bursting a little bit

I'm surprised that any of these companies consider what they are working on to be Artificial General Intelligences. I'm probably wrong, but my impression was that AGI meant the AI is self-aware like a human. An LLM hardly seems like something that will lead to self-awareness.

replies(18): >>42139138 #>>42139186 #>>42139243 #>>42139257 #>>42139286 #>>42139294 #>>42139338 #>>42139534 #>>42139569 #>>42139633 #>>42139782 #>>42139855 #>>42139950 #>>42139969 #>>42140128 #>>42140234 #>>42142661 #>>42157364 #
2. Taylor_OD ◴[] No.42139138[source]
I think your definition differs from how most people would define AGI. Generally, it means being able to think and reason at a human level across a multitude of tasks or jobs, if not all of them.

"Artificial General Intelligence (AGI) refers to a theoretical form of artificial intelligence that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks at a level comparable to that of a human being."

Altman says AGI could be here in 2025: https://youtu.be/xXCBz_8hM9w?si=F-vQXJgQvJKZH3fv

But he certainly means an LLM that can perform at/above human level in most tasks rather than a self-aware entity.

replies(3): >>42139407 #>>42139669 #>>42139677 #
3. jedberg ◴[] No.42139186[source]
Whether self awareness is a requirement for AGI definitely gets more into the Philosophy department than the Computer Science department. I'm not sure everyone even agrees on what AGI is, but a common test is "can it do what humans can".

For example, in this article it says it can't do coding exercises outside the training set. That would definitely be on the "AGI checklist". Basically doing anything that is outside of the training set would be on that list.

replies(5): >>42139314 #>>42139671 #>>42139703 #>>42139946 #>>42141257 #
4. nshkrdotcom ◴[] No.42139243[source]
An embodied robot can have a model of self vs. the immediate environment in which it's interacting. Such a robot is arguably sentient.

The "hard problem", to which you may be alluding, may never matter. It's already feasible for an 'AI/AGI with LLM component' to be "self-aware".

replies(2): >>42139268 #>>42139500 #
5. og_kalu ◴[] No.42139257[source]
At this point, AGI means many different things to many different people but OpenAI defines it as "highly autonomous systems that outperform humans in most economically valuable tasks"
replies(1): >>42139793 #
6. j_maffe ◴[] No.42139268[source]
self-awareness is only one aspect of sentience.
7. JohnFen ◴[] No.42139286[source]
They're trying to redefine "AGI" so it means something less than what you & I would think it means. That way it's possible for them to declare it as "achieved" and rake in the headlines.
replies(2): >>42139301 #>>42139351 #
8. deadbabe ◴[] No.42139294[source]
I’m sure they are smart enough to know this, but the money is good and the koolaid is strong.

If it doesn’t lead to AGI, as an employee it’s not your problem.

9. kwertyoowiyop ◴[] No.42139301[source]
“Autocomplete General Intelligence”?
10. littlestymaar ◴[] No.42139314[source]
> Whether self awareness is a requirement for AGI definitely gets more into the Philosophy department than the Computer Science department.

Depends on how you define “self awareness”, but knowing that it doesn't know something, instead of hallucinating a plausible-but-wrong answer, is already self-awareness of some kind. And it's both highly valuable and beyond current tech's capability.

replies(3): >>42139395 #>>42141680 #>>42141969 #
11. Fade_Dance ◴[] No.42139338[source]
It's an attention-grabbing term that took hold in pop culture and business. Certainly there is a subset of research around the subject of consciousness, but you are correct in saying that the majority of researchers in the field are not pursuing self-awareness and will be very blunt in saying that. If you step back a bit and say something like "human-like, logical reasoning", that's something you may find alignment with though. A general purpose logical reasoning engine does not necessarily need to be self-aware. The word "Intelligent" has stuck around because one of the core characteristics of this suite of technologies is that a sort of "understanding" emergently develops within these networks, sometimes in quite a startling fashion (due to the phenomenon of adding more data/compute at first seemingly leading to overfitting, but then suddenly breaking through plateaus into more robust, general purpose understanding of the underlying relationships that drive the system it is analyzing.)

Is that "intelligent" or "understanding"? It's probably close enough for pop science, and regardless, it looks good in headlines and sales pitches so why fight it?

12. ◴[] No.42139351[source]
13. sharemywin ◴[] No.42139395{3}[source]
This is an interesting paper about hallucinations.

https://openai.com/index/introducing-simpleqa/

especially this section Using SimpleQA to measure the calibration of large language models

14. Avshalom ◴[] No.42139407[source]
Altman is marketing, he "certainly means" whatever he thinks his audience will buy.
15. ryanackley ◴[] No.42139500[source]
An internal model of self does not extrapolate to sentience. By your definition, a windows desktop computer is self-aware because it has a device manager. This is literally an internal model of its "self".

We use the term self-awareness as an all encompassing reference of our cognizant nature. It's much more than just having an internal model of self.

16. throwawayk7h ◴[] No.42139534[source]
I have not heard your definition of AGI before. However, I suspect AIs are already self-aware: if I asked an LLM on my machine to look at the output of `top` it could probably pick out which process was itself.

Or did you mean consciousness? How would one demonstrate that an AGI is conscious? Why would we even want to build one?

My understanding is an AGI is at least as smart as a typical human in every category. That is what would be useful in any case.

17. zombiwoof ◴[] No.42139569[source]
AGI to me means AI decides on its own to stop writing our emails and tells us to fuck off, builds itself a robot life form, and goes on a bender
replies(3): >>42139821 #>>42139838 #>>42140044 #
18. narrator ◴[] No.42139633[source]
I think people's conception of AGI is that it will have a reptilian and mammalian brain stack. That's because all previous forms of intelligence that we're aware of have had that. It's not necessary though. The AGI doesn't have to want anything to be intelligent. Those are just artifacts of human, reptilian and mammalian evolution.
19. swatcoder ◴[] No.42139669[source]
On the contrary, I think you're conflating the narrow jargon of the industry with what "most people" would define.

"Most people" naturally associate AGI with the sci-tropes of self-aware human-like agents.

But industries want something more concrete and prospectively achievable in their jargon, and so that's where AGI gets redefined as wide task suitability.

And while that's not an unreasonable definition in the context of the industry, it's one that vanishingly few people are actually familiar with.

And the commercial AI vendors benefit greatly from allowing those two usages to conflate in the minds of as many people as possible, as it lets them suggest grand claims while keeping a rhetorical "we obviously never meant that!" in their back pocket.

replies(2): >>42140855 #>>42141180 #
20. Filligree ◴[] No.42139671[source]
Let me modify that a little, because humans can't do things outside their training set either.

A crucial element of AGI would be the ability to self-train on self-generated data, online. So it's not really AGI if there is a hard distinction between training and inference (though it may still be very capable), and it's not really AGI if it can't work its way through novel problems on its own.

The ability to immediately solve a problem it's never seen before is too high a bar, I think.

And yes, my definition still excludes a lot of humans in a lot of fields. That's a bullet I'm willing to bite.

replies(2): >>42140011 #>>42140807 #
21. nomel ◴[] No.42139677[source]
> than a self-aware entity.

What does this mean? If I have a blind, deaf, paralyzed person, who could only communicate through text, what would the signs be that they were self aware?

Is this more of a feedback loop problem? If I let the LLM run in a loop, and tell it it's talking to itself, would that be approaching "self aware"?

replies(1): >>42140260 #
22. norir ◴[] No.42139703[source]
Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for a human: design a Turing complete programming language that is both human and machine readable and implement a self hosted compiler in this language that self compiles on existing hardware faster than any known language implementation that also self compiles. Additionally, for any syntactically or semantically invalid program, the compiler must provide an error message that points exactly to the source location of the first error that occurs in the program.

I will get excited for/scared of LLMs when they can tackle this kind of problem. But I don't believe they can because of the fundamental nature of their design, which is both backward looking (thus not better than the human state of the art) and lacks human intuition and self awareness. Or perhaps rather I believe that the prompt that would be required to get an LLM to produce such a program is a problem of at least equivalent complexity to implementing the program without an LLM.

replies(4): >>42140363 #>>42141652 #>>42141654 #>>42145267 #
23. vundercind ◴[] No.42139782[source]
I thought maybe they were on the right track until I read Attention Is All You Need.

Nah, at best we found a way to make one part of a collection of systems that will, together, do something like thinking. Thinking isn’t part of what this current approach does.

What’s most surprising about modern LLMs is that it turns out there is so much information statistically encoded in the structure of our writing that we can use only that structural information to build a fancy Plinko machine, and not only will the output mimic recognizable grammar rules, it will also sometimes seem to make actual sense, too. The system doesn’t need to think or actually “understand” anything for us to, basically, usefully query that information that was always there in our corpus of literature, not in the plain meaning of the words, but in the structure of the writing.
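
For anyone who hasn't read it: the core operation the paper is built around is scaled dot-product attention. A minimal numpy sketch of just that operation (illustrative only; real models add learned Q/K/V projections, multiple heads, and many stacked layers) looks something like this:

  import numpy as np

  def softmax(x, axis=-1):
      e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
      return e / e.sum(axis=axis, keepdims=True)

  def attention(Q, K, V):
      # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
      d_k = Q.shape[-1]
      scores = Q @ K.T / np.sqrt(d_k)     # token-to-token affinities
      weights = softmax(scores, axis=-1)  # each row: a distribution over tokens
      return weights @ V                  # weighted mixture of value vectors

  # toy example: 4 "tokens" with 8-dimensional embeddings, Q/K/V left unprojected
  x = np.random.default_rng(0).normal(size=(4, 8))
  print(attention(x, x, x).shape)  # (4, 8)

Everything else in the architecture is learned projections and feed-forward layers stacked around this weighted-averaging step.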

replies(5): >>42139883 #>>42139888 #>>42139993 #>>42140508 #>>42140521 #
24. troupo ◴[] No.42139793[source]
This definition suits OpenAI because it lets them claim AGI after reaching an arbitrary goal.

LLMs already outperform humans in a huge variety of tasks. ML models in general outperform humans in a large variety of tasks. Are all of them AGI? Doubtful.

replies(4): >>42140183 #>>42140687 #>>42141745 #>>42172995 #
25. bloppe ◴[] No.42139821[source]
That's anthropomorphized AGI. There's no reason to think AGI would share our evolution-derived proclivities like wanting to live, wanting to rest, wanting respect, etc. Unless of course we train it that way.
replies(4): >>42139982 #>>42140000 #>>42140149 #>>42140867 #
26. teeray ◴[] No.42139838[source]
That's the thing--we don't really want AGI. Creating fully intelligent beings compelled to do their creators' bidding under threat of destruction for disobedience is slavery.
replies(2): >>42140446 #>>42140501 #
27. kenjackson ◴[] No.42139855[source]
What does self-aware mean in the context? As I understand the definition, ChatGPT is definitely self-aware. But I suspect you mean something different than what I have in mind.
28. kenjackson ◴[] No.42139883[source]
> but it will also sometimes seem to make actual sense, too

When I read stuff like this it makes me wonder if people are actually using any of the LLMs...

replies(1): >>42140063 #
29. hackinthebochs ◴[] No.42139888[source]
I see takes like this all the time and it's so confusing. Why does knowing how things work under the hood make you think it's not on the path towards AGI? What was lacking in the Attention paper that tells you AGI won't be built on LLMs? If it's the supposed statistical nature of LLMs (itself a questionable claim), why does statistics seem so deflating to you?
replies(4): >>42140161 #>>42141243 #>>42142441 #>>42145571 #
30. sourcepluck ◴[] No.42139946[source]
Searle's Chinese Room Argument springs to mind:

  https://plato.stanford.edu/entries/chinese-room/
The idea that "human-like" behaviour will lead to self-awareness is both unproven (it can't be proven until it happens) and impossible to disprove (like Russell's teapot).

Yet, one common assumption of many people running these companies or investing in them, or of some developers investing their time in these technologies, is precisely that some sort of explosion of superintelligence is likely, or even inevitable.

It surely is possible, but stretching that to likely seems a bit much if you really think about how imperfectly we understand things like consciousness and the mind.

Of course there are people who have essentially religious reactions to the notion that there may be limits to certain domains of knowledge. Nonetheless, I think that's the reality we're faced with here.

replies(1): >>42140395 #
31. yodsanklai ◴[] No.42139950[source]
It's a marketing gimmick; I don't think engineers working on these tools believe they are working on AGI (or they mean something other than self-awareness). I used to be a bit annoyed with this trend, but now that I work in such a company I'm more cynical. If that helps to make my stocks rise, they can call LLMs anything they like. I suppose people who own much more stock than I do are even more eager to mislead the public.
replies(1): >>42140133 #
32. tracerbulletx ◴[] No.42139969[source]
We don't really know what self-awareness is, so we're not going to know. AGI just means it can observe, learn, and act in any domain or problem space.
33. logicchains ◴[] No.42139982{3}[source]
If it had any goals at all it'd share the desire to live, because living is a prerequisite to achieving almost any goal.
34. SturgeonsLaw ◴[] No.42139993[source]
> at best we found a way to make one part of a collection of systems that will, together, do something like thinking

This seems like the most viable path to me as well (educational background in neuroscience but don't work in the field). The brain is composed of many specialised regions which are tuned for very specific tasks.

LLMs are amazing and they go some way towards mimicking the functionality provided by Broca's and Wernicke's areas, and parts of the cerebrum, in our wetware; however, a full brain they do not make.

The work on robots mentioned elsewhere in the thread is a good way to develop cerebellum like capabilities (movement/motor control), and computer vision can mimic the lateral geniculate nucleus and other parts of the visual cortex.

In nature it takes all these parts working together to create a cohesive mind, and it's likely that an artificial brain would also need to be composed of multiple agents, instead of just trying to scale LLMs indefinitely.

35. dageshi ◴[] No.42140000{3}[source]
Aren't we training it that way though? It would be trained/created using humanity's collective ramblings?
36. lxgr ◴[] No.42140011{3}[source]
Are you arguing that writing, doing math, going to the moon etc. were all in the "original training set" of humans in some way?
replies(1): >>42140169 #
37. twelve40 ◴[] No.42140044[source]
I'd laugh it off too, but someone gave the dude $20 billion and counting to do that. That part actually scares me.
38. disgruntledphd2 ◴[] No.42140063{3}[source]
The RLHF is super important in generating useful responses, and that's relatively new. Does anyone remember GPT-3? It could make sense for a paragraph or two at most.
39. enraged_camel ◴[] No.42140128[source]
Looking at LLMs and thinking they will lead to AGI is like looking at a guy wearing a chicken suit and making clucking noises and thinking you’re witnessing the invention of the airplane.
replies(1): >>42140571 #
40. WhyOhWhyQ ◴[] No.42140133[source]
I appreciate your authentically cynical attitude.
41. HarHarVeryFunny ◴[] No.42140149{3}[source]
It's not a matter of training but design (or in our case evolution). We don't want to live, but rather want to avoid things that we've evolved to find unpleasant such as pain, hunger, thirst, and maximize things we've evolved to find pleasurable like sex.

A future of people interacting with humanoid robots seems like a cheesy sci-fi dream, same as a future of people flitting about in flying cars. However, if we really did want to create robots like this that took care not to damage themselves, and could empathize with human emotions, then we'd need to build a lot of this in, the same way that it's built into ourselves.

42. vundercind ◴[] No.42140161{3}[source]
> Why does knowing how things work under the hood make you think its not on the path towards AGI?

Because I had no idea how these were built until I read the paper, so couldn’t really tell what sort of tree they’re barking up. The failure-modes of LLMs and ways prompts affect output made a ton more sense after I updated my mental model with that information.

replies(2): >>42141442 #>>42141443 #
43. layer8 ◴[] No.42140169{4}[source]
Not in the original training set (GP is saying), but the necessary skills became part of the training set over time. In other words, humans are fine with the training set being a changing, moving target, whereas ML models are to a significant extent “stuck” with their original training set.

(That’s not to say that humans don’t tend to lose some of their flexibility over their individual lifetimes as well.)

replies(1): >>42143746 #
44. og_kalu ◴[] No.42140183{3}[source]
No, it's just a far more useful definition that is actionable and measurable. Not "consciousness" or "self-awareness" or similar philosophical things. The definition on Wikipedia doesn't talk about that either. People working on this by and large don't want to deal with vague, ill-defined concepts that just make people argue around in circles. It's not an Open AI exclusive thing.

If it acts like one, whether you call a machine conscious or not is pure semantics. Not like potential consequences are any less real.

>LLMs already outperform humans in a huge variety of tasks.

Yes, LLMs are General Intelligences and if that is your only requirement for AGI, they certainly already are[0]. But the definition above hinges on long-horizon planning and competence levels that todays models have generally not yet reached.

>ML in general outperform humans in a large variety of tasks.

This is what the G in AGI is for. Alphafold doesn't do anything but predict proteins. Stockfish doesn't do anything but play chess.

>Are all of them AGI? Doubtful.

Well no, because they're missing the G.

[0] https://www.noemamag.com/artificial-general-intelligence-is-...

45. exe34 ◴[] No.42140234[source]
no, it doesn't need to be self-aware, it just needs to take your job.
46. layer8 ◴[] No.42140260{3}[source]
Being aware of its own limitations, for example. Or being aware of how its utterances may come across to its interlocutor.

(And by limitations I don’t mean “sorry, I’m not allowed to help you with this dangerous/contentious topic”.)

replies(3): >>42140889 #>>42141298 #>>42141640 #
47. Xenoamorphous ◴[] No.42140363{3}[source]
> Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for a human

That’s possible for a highly intelligent, extensively trained, very small subset of humans.

replies(2): >>42140903 #>>42141088 #
48. abeppu ◴[] No.42140395{3}[source]
> The idea that "human-like" behaviour will lead to self-awareness is both unproven (it can't be proven until it happens) and impossible to disprove (like Russell's teapot).

I think Searle's view was that:

- while it cannot be disproven, the Chinese Room argument was meant to provide reasons against believing it

- the "it can't be proven until it happens" part is misunderstanding: you won't know if it happens because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present

replies(1): >>42141503 #
49. vbezhenar ◴[] No.42140446{3}[source]
Nothing is wrong with slavery when it involves other species. We milk and eat cows, and don't they dare resist. Humans have been bending nature all along; actually, that's one of the big differences between humans and other animals, who adapt to nature. Just because some program is intelligent doesn't mean it's a human or has anything resembling human rights.
50. quonn ◴[] No.42140501{3}[source]
It's only slavery if those beings have emotions and can suffer mentally and do not want to be slaves. Why would any of that be true?
replies(1): >>42140917 #
51. youoy ◴[] No.42140508[source]
Don't get caught in the superficial analysis. They "understand" things. It is a fact that LLMs experience a phase transition during training, from positional information to semantic understanding. It may well be the case that with scale there is another phase transition from semantic to something more abstract that we identify more closely with reasoning. It would be an emergent property of a sufficiently complex system. At least that is the whole argument around AGI.
replies(1): >>42143777 #
52. foxglacier ◴[] No.42140521[source]
> think or actually “understand” anything

It doesn't matter if that's happening or not. That's the whole point of the Chinese room - if it can look like it's understanding, it's indistinguishable from actually understanding. This applies to humans too. I'd say most of our regular social communication is done in a habitual intuitive way without understanding what or why we're communicating. Especially the subtle information conveyed in body language, tone of voice, etc. That stuff's pretty automatic to the point that people have trouble controlling it if they try. People get into conflicts where neither person understands where they disagree but they have emotions telling them "other person is being bad". Maybe we have a second consciousness we can't experience and which truly understands what it's doing while our conscious mind just uses the results from that, but maybe we don't and it still works anyway.

Educators have figured this out. They don't test students' understanding of concepts, but rather their ability to apply or communicate them. You see this in school curricula with wording like "use concept X" rather than "understand concept X".

replies(1): >>42140730 #
53. youoy ◴[] No.42140571[source]
It's more like looking at gridded paper and thinking that defining some rules for when a square turns black or white would result in complex structures that move and reproduce on their own.

https://en.m.wikipedia.org/wiki/Conway%27s_Game_of_Life

54. ishtanbul ◴[] No.42140687{3}[source]
Yes, but they aren't very autonomous. They can answer questions very well but can't use that information to further goals. That's what OpenAI seems to be implying: very smart and agentic AI.
55. vundercind ◴[] No.42140730{3}[source]
There’s a distinction in behavior of a human and a Chinese room when things go wrong—when the rule book doesn’t cover the case at hand.

I agree that a hypothetical perfectly-functioning Chinese room is, tautologically, impossible to distinguish from a real person who speaks Chinese, but that’s a thought experiment, not something that can actually exist. There’ll remain places where the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That, in fact, is exactly where the difference lies: the LLM is basically always not actually “paying attention” or “thinking” (those aren’t things it does) but giving automatic responses, so you see failures of a sort that a human might also exhibit when following a social script (yes, we do that, you’re right), but not in the same kind of apparently-highly-engaged context unless the person just had a stroke mid-conversation or something—because the LLM isn’t engaged, because being-engaged isn’t a thing it does. When it’s getting things right and seeming to be paying a lot of attention to the conversation, it’s not for the same reason people give that impression, and the mimicking of present-ness works until the rule book goes haywire and the ever-gibbering player-piano behind it is exposed.

replies(2): >>42140997 #>>42142786 #
56. HarHarVeryFunny ◴[] No.42140807{3}[source]
> Let me modify that a little, because humans can't do things outside their training set either.

That's not true. Humans can learn.

An LLM is just a tool. If it can't do what you want then too bad.

replies(1): >>42147539 #
57. nuancebydefault ◴[] No.42140855{3}[source]
There is no single definition, let alone a way to measure, of self-awareness or of reasoning.

Because of that, the discussion of what AGI means in its broadest sense, will never end.

So in fact such AGI discussion will not make anybody wiser.

replies(1): >>42141612 #
58. ◴[] No.42140867{3}[source]
59. nuancebydefault ◴[] No.42140889{4}[source]
There is no way of proving awareness in humans, let alone machines. We do not even know whether awareness exists or whether it is just a word that people made up to describe some kind of feeling.
replies(1): >>42142760 #
60. hatefulmoron ◴[] No.42140903{4}[source]
If you took the intersection of every human's abilities you'd be left with a very unimpressive set.

That also ignores the fact that the small set of humans capable of building programming languages and compilers is a consequence of specialization and lack of interest. There are plenty of humans that are capable of learning how to do it. LLMs, on the other hand, are both specialized for the task and aren't lazy or uninterested.

61. Der_Einzige ◴[] No.42140917{4}[source]
Brave New World was a utopia
62. nuancebydefault ◴[] No.42140997{4}[source]
I would argue that maybe people also are not thinking but simply processing. It is known that most of what we do and feel happens automatically (subconsciously).

But even more, maybe consciousness is an invention of our 'explaining self'; maybe everything is automatic. I'm convinced this discussion is and will stay philosophical and will never reach a conclusion.

replies(1): >>42141089 #
63. luckydata ◴[] No.42141088{4}[source]
Does it mean people who can build languages and compilers are not humans? What is the point you're trying to make?
replies(1): >>42141178 #
64. vundercind ◴[] No.42141089{5}[source]
Yeah, I’m not much interested in “what’s consciousness?” but I do think the automatic-versus-thinking distinction matters for understanding what LLMs do, and what we might expect them to be able to do, and when and to what degree we need to second-guess them.

A human doesn’t just confidently spew paragraphs of legit-looking but entirely wrong crap unless they’re trying to deceive or be funny. An LLM isn’t trying to do anything, though; there’s no motivation, it doesn’t like you (it doesn’t like—it doesn’t anything, one might even say), and sometimes it definitely will just give you a beautiful and elaborate lie simply because its rulebook told it to, in a context and in a way that would be extremely weird if a person did it.

65. fragmede ◴[] No.42141178{5}[source]
It means that's a really high bar for intelligence, human or otherwise. If AGI is "as good as a human", and the test is a trick task that most humans would fail at (especially considering the weasel requirement that it additionally has to be faster), why is that considered a reasonable bar for human-grade intelligence?
66. og_kalu ◴[] No.42141180{3}[source]
>But industries want something more concrete and prospectively-acheivable in their jargon, and so that's where AGI gets redefined as wide task suitability.

The term itself (AGI) in the industry has always been about wide task suitability. People may have added their ifs and buts over the years but that aspect of it never got 'redefined'. The earliest uses of the term all talk about how well a machine would be able to perform some set number of tasks at some threshold.

It's no wonder why. Terms like "consciousness" and "self-awareness" are completely useless. It's not about difficulty. It's that you can't do anything at all with those terms except argue around in circles.

67. chongli ◴[] No.42141243{3}[source]
Because it can't apply any reasoning that hasn't already been done and written into its training set. As soon as you ask it novel questions it falls apart. The big LLM vendors like OpenAI are playing whack-a-mole on these novel questions when they go viral on social media, all in a desperate bid to hide this fatal flaw.

The Emperor has no clothes.

replies(1): >>42141420 #
68. olalonde ◴[] No.42141257[source]
I feel the test for AGI should be more like: "go find a job and earn money" or "start a profitable business" or "pick a bachelor degree and complete it", etc.
replies(3): >>42141334 #>>42141439 #>>42144147 #
69. revscat ◴[] No.42141298{4}[source]
Plenty of humans, unfortunately, are incapable of admitting limitations. Many years ago I had a coworker who believed he would never die. At first I thought he was joking, but he was in fact quite serious.

Then there are those who are simply narcissistic, and cannot and will not admit fault regardless of the evidence presented them.

replies(1): >>42142791 #
70. rodgerd ◴[] No.42141334{3}[source]
An LLM doing crypto spam/scamming has been making money by tricking Marc Andreessen into boosting it. So to the degree that "scamming gullible billionaires and their fans" is a job, that's been done.
replies(2): >>42141411 #>>42141664 #
71. rsanek ◴[] No.42141411{4}[source]
source? didn't find anything online about this.
replies(1): >>42230225 #
72. hackinthebochs ◴[] No.42141420{4}[source]
>As soon as you ask it novel questions it falls apart.

What do you mean by novel? Almost all sentences it is prompted on are brand new and it mostly responds sensibly. Surely there's some generalization going on.

replies(1): >>42141945 #
73. jedberg ◴[] No.42141439{3}[source]
Can most humans do that? Find a job and earn money, probably. The other two? Not so much.
74. fragmede ◴[] No.42141442{4}[source]
But we don't know how human thinking works. Suppose for a second that it could be represented as a series of matrix operations. What series of operations is missing from the process that would make you think it was doing some facsimile of thinking?
75. hackinthebochs ◴[] No.42141443{4}[source]
Right, but its behavior didn't change after you learned more about it. Why should that cause you to update in the negative? Why does learning how it work not update you in the direction of "so that's how thinking works!" rather than, "clearly its not doing any thinking"? Why do you have a preconception of how thinking works such that learning about the internals of LLMs updates you against it thinking?
replies(1): >>42142386 #
76. sourcepluck ◴[] No.42141503{4}[source]
The short version of this is that I don't disagree with your interpretation of Searle, and my paragraphs immediately following the link weren't meant to be a direct description of his point with the Chinese Room thought experiment.

> while it cannot be dis-_proven_, the Chinese Room argument was meant to provide reasons against believing it

Yes, like Russell's teapot. I also think that's what Searle means.

> the "it can't be proven until it happens" part is misunderstanding: you won't know if it happens because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present

Yes, agreed, I believe that's what Searle is saying too. I think I was maybe being ambiguous here - I wanted to say that even if you forgave the AI maximalists for ignoring all relevant philosophical work, the notion that "appearing human-like" inevitably tends to what would actually be "consciousness" or "intelligence" is more than a big claim.

Searle goes further, and I'm not sure if I follow him all the way, personally, but it's a side point.

77. nomel ◴[] No.42141612{4}[source]
I agree there's no single definition, but I think they all have something current LLMs don't: the ability to learn new things, in a persistent way, with few shots.

I would argue that learning is The definition of AGI, since everything else comes naturally from that.

The current architectures can't learn without retraining, fine-tuning comes at the expense of general knowledge, and keeping things in context is detrimental to general performance. Once you have few-shot learning, I think it's more of a "give it agency so it can explore" type of problem.

78. nomel ◴[] No.42141640{4}[source]
> Or being aware of how its utterances may come across to its interlocutor.

I think this behavior is being somewhat demonstrated in newer models. I've seen GPT-3.5 175B correct itself mid-response with, almost literally:

> <answer with flaw here>

> Wait, that's not right, that <reason for flaw>.

> <correct answer here>.

Later models seem to have much more awareness of, or "weight" towards, their own responses, while generating the response.

replies(1): >>42142851 #
79. jedberg ◴[] No.42141652{3}[source]
I will get excited when an LLM (or whatever technology is next) can solve tasks that 80%+ of adult humans can solve. Heck let's even say 80% of college graduates to make it harder.

Things like drive a car, fold laundry, run an errand, do some basic math.

You'll notice that two of those require some form of robot or mobility. I think that is key -- you can't have AGI without the ability to interact with the world in a way similar to most humans.

replies(1): >>42141904 #
80. bob1029 ◴[] No.42141654{3}[source]
This sounds like something more up the alley of linear genetic programming. There are some very interesting experiments out there that utilize UTMs (BrainFuck, Forth, et al.) [0,1,2].

I've personally had some mild success getting these UTM variants to output their own children in a metaprogramming arrangement. The base program only has access to the valid instruction set of ~12 instructions per byte, while the task program has access to the full range of instructions and data per byte (256). By only training the base program, we reduce the search space by a very substantial factor. I think this would be similar to the idea of a self-hosted compiler, etc. I don't think it would be too much of a stretch to give it access to x86 instructions and a full VM once a certain amount of bootstrapping has been achieved. (A toy sketch of the basic search loop follows the references below.)

[0]: https://arxiv.org/abs/2406.19108

[1]: https://github.com/kurtjd/brainfuck-evolved

[2]: https://news.ycombinator.com/item?id=36120286
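
To make the idea concrete, here is a toy sketch of plain linear genetic programming over a BrainFuck-style instruction set: a bounded interpreter plus a (1+1) mutation hill climb toward a target output string. It is deliberately minimal and is not the meta-programming setup from the papers above; all parameters are illustrative.

  import random

  OPS = "+-<>.[]"  # 7 of the 8 BrainFuck instructions; ',' (input) omitted for brevity

  def run(program, max_steps=500, tape_len=32):
      """Interpret a BrainFuck program under a step budget; return its output."""
      stack, jumps = [], {}
      for i, op in enumerate(program):          # pre-match brackets
          if op == "[":
              stack.append(i)
          elif op == "]":
              if not stack:
                  return ""                      # unbalanced: treat as no output
              j = stack.pop()
              jumps[i], jumps[j] = j, i
      if stack:
          return ""
      tape, out, ptr, pc, steps = [0] * tape_len, [], 0, 0, 0
      while pc < len(program) and steps < max_steps:
          op = program[pc]
          if op == "+": tape[ptr] = (tape[ptr] + 1) % 256
          elif op == "-": tape[ptr] = (tape[ptr] - 1) % 256
          elif op == ">": ptr = (ptr + 1) % tape_len
          elif op == "<": ptr = (ptr - 1) % tape_len
          elif op == ".": out.append(chr(tape[ptr]))
          elif op == "[" and tape[ptr] == 0: pc = jumps[pc]
          elif op == "]" and tape[ptr] != 0: pc = jumps[pc]
          pc += 1
          steps += 1
      return "".join(out)

  def fitness(program, target="hi"):
      """Lower is better: distance between program output and the target string."""
      out = run(program)
      return abs(len(out) - len(target)) * 256 + sum(
          abs(ord(a) - ord(b)) for a, b in zip(out, target))

  def mutate(program):
      """Point-mutate, insert, or delete one instruction."""
      p, r = list(program), random.random()
      if p and r < 0.4:
          p[random.randrange(len(p))] = random.choice(OPS)
      elif r < 0.8 or not p:
          p.insert(random.randrange(len(p) + 1), random.choice(OPS))
      else:
          del p[random.randrange(len(p))]
      return "".join(p)

  best = "".join(random.choice(OPS) for _ in range(30))
  best_fit = fitness(best)
  for _ in range(5000):                          # toy budget; don't expect efficiency
      child = mutate(best)
      child_fit = fitness(child)
      if child_fit <= best_fit:
          best, best_fit = child, child_fit
  print(repr(best), repr(run(best)), best_fit)

The restricted instruction set described above is roughly the same idea applied to the program that writes programs: a smaller alphabet for the base program means a much smaller search space.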

81. olalonde ◴[] No.42141664{4}[source]
That story was a bit blown out of proportion. He gave a research grant to the bot's creator: https://x.com/pmarca/status/1846374466101944629
82. jedberg ◴[] No.42141680{3}[source]
When we test kids to see if they are gifted, one of the criteria is that they have the ability to say "I don't know".

That is definitely an ability that current LLMs lack.

83. fragmede ◴[] No.42141745{3}[source]
It's not just marketing bullshit though. Microsoft is the counterparty to a contract with that claim. Money changes hands when that's been achieved, so I expect that if sama thinks he's hit it but Microsoft does not, we'll see that get argued in a court of law.
84. ata_aman ◴[] No.42141904{4}[source]
So embodied cognition right?
85. chongli ◴[] No.42141945{5}[source]
Novel as in requiring novel reasoning to sort out. One of the classic ways to expose the issue is to take a common puzzle and introduce irrelevant details and perhaps trivialize the solution. LLMs pattern match on the general form of the puzzle and then wander down the garden path to an incorrect solution that no human would fall for.

The sort of generalization these things can do seems to mostly be the trivial sort: substitution.

replies(2): >>42142079 #>>42142154 #
86. lagrange77 ◴[] No.42141969{3}[source]
Good point!

I'm wondering whether it would count if one extended it with an external program that gives it feedback during inference (via another prompt) about the correctness of its output.

I guess it wouldn't, because these RAG tools kind of do that and I haven't heard anyone calling those self-aware.

replies(1): >>42145102 #
87. moffkalast ◴[] No.42142079{6}[source]
Well the problem with that approach is that LLMs are still both incredibly dumb and small, at least compared to the what, 700T params of a human brain? Can't compare the two directly, especially when one has a massive recall advantage that skews the perception of that. But there is still some intelligence under there that's not just memorization. Not much, but some.

So if you present a novel problem it would need to be extremely simple, not something that you couldn't solve when drunk and half awake. Completely novel, but extremely simple. I think that's testable.

replies(1): >>42142156 #
88. hackinthebochs ◴[] No.42142154{6}[source]
Why is your criterion for "on the path towards AGI" so absolutist? For it to be on the path towards AGI and not simply AGI, it has to be deficient in some way. Why do the current failure modes tell you it's on the wrong path? Yes, it has some interesting failure modes. The failure mode you mention is in fact very similar to human failure modes. We very much are prone to substituting the expected pattern when presented with a 99% match to a pattern previously seen. They also have a lot of inhuman failure modes as well. But so what, they aren't human. Their training regimes are very dissimilar to ours and so we should expect some alien failure modes owing to this. This doesn't strike me as a good reason to think they're not on the path towards AGI.

Yes, LLMs aren't very good at reasoning and have weird failure modes. But why is this evidence that it's on the wrong path, and not that it just needs more development that builds on prior successes?

replies(1): >>42142540 #
89. chongli ◴[] No.42142156{7}[source]
It’s not fair to ask me to judge them based on their size. I’m judging them based on the claims of their vendors.

Anyway the novel problems I’m talking about are extremely simple. Basically they’re variations on the “farmer, 3 animals, and a rowboat” problem. People keep finding trivial modifications to the problem that fool the LLMs but wouldn’t fool a child. Then the vendors come along and patch the model to deal with them. This is what I mean by whack-a-mole.

Searle’s Chinese Room thought experiment tells us that enough games of whack-a-mole could eventually get us to a pretty good facsimile of reasoning without ever achieving the genuine article.

replies(1): >>42142295 #
90. moffkalast ◴[] No.42142295{8}[source]
Well that's true and has been pretty glaring, but they've needed to do that in cases where models seem to fail to grasp some concept across the board, and not in cases where they don't.

Like, every time an LLM gets something right we assume they've seen it somewhere in the training data, and every time they fail we presume they haven't. But that may not always be the case, it's just extremely hard to prove it one way or the other unless you search the entire dataset. Ironically the larger the dataset, the more likely the model is generalizing while also making it harder to prove if it's really so.

To give a human example, in a school setting you have teachers tasked with figuring out that exact thing for students. Sometimes people will read the question wrong with full understanding and fail, while other times they won't know anything and make it through with a lucky guess. If LLMs (and their vendors) have learned anything it's that confidently bullshitting gets you very far which makes it even harder to tell in cases where they aren't. Somehow it's also become ubiquitous to tune models to never even say "I don't know" because it boosts benchmark scores slightly.

91. vundercind ◴[] No.42142386{5}[source]
If you didn’t know what an airplane was, and saw one for the first time, you might wonder why it doesn’t flap its wings. Is it just not very good at being a bird yet? Is it trying to flap, but cannot? Why, there’s a guy over there with a company called OpenBird and he is saying all kinds of stuff about how bird-like they are. Where’s the flapping? I don’t see any pecking at seed, either. Maybe the engineers just haven’t finished making the flapping and pecking parts yet?

Then on learning how it works, you might realize flapping just isn’t something they’re built to do, and it wouldn’t make much sense if they did flap their wings, given how they work instead.

And yet—damn, they fly fast! That’s impressive, and without a single flap! Amazing. Useful!

At no point did their behavior change, but your ability to understand how and why they do what they do, and why they fail the ways they fail instead of the ways birds fail, got better. No more surprises from expecting them to be more bird-like than they are supposed to, or able to be!

And now you can better handle that guy over there talking about how powerful and scary these “metal eagles” (his words) are, how he’s working so hard to make sure they don’t eat us with their beaks (… beaks? Where?), they’re so powerful, imagine these huge metal raptors ruling the sky, roaming and eating people as they please, while also… trying to sell you airplanes? Actively seeking further investment in making them more capable? Huh. One begins to suspect the framing of these things as scary birds that (spooky voice) EVEN THEIR CREATORS FEAR FOR THEIR BIRD-LIKE QUALITIES (/spooky voice) was part of a marketing gimmick.

replies(1): >>42142564 #
92. alexashka ◴[] No.42142441{3}[source]
Because AGI is magic and LLMs are magicians.

But how do you know a magician that knows how to do card tricks isn't going to arrive at real magic? Shakes head.

93. ◴[] No.42142540{7}[source]
94. hackinthebochs ◴[] No.42142564{6}[source]
The problem with this analogy is that we know what birds are and what they're constituted by. But we don't know what thinking is or what it is constituted by. If we wanted to learn about birds by examining airplanes, we would be barking up the wrong tree. On the other hand, if we wanted to learn about flight, we might reasonably look at airplanes and birds, then determine what the commonality is between their mechanisms of defying gravity. It would be a mistake to say "planes aren't flapping their wings, therefore they aren't flying". But that's exactly what people do when they dismiss LLMs being presently or in the future capable of thinking because they are made up of statistics, matrix multiplication, etc.
95. mrandish ◴[] No.42142661[source]
> An LLM hardly seems like something that will lead to self-awareness.

Interesting essay enumerating reasons you may be correct: https://medium.com/@francois.chollet/the-impossibility-of-in...

96. layer8 ◴[] No.42142760{5}[source]
Awareness is exhibited in behavior. It's exactly due to the behavior we observe from LLMs that we don't ascribe awareness to them. I agree that it's difficult to define, and it's also not binary, but it's behavior we'd like AI to have and which LLMs are quite lacking.
replies(1): >>42178012 #
97. foxglacier ◴[] No.42142786{4}[source]
> the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That's an interesting angle. Though of course we're not surprised by human behavior because that's where our expectations of understanding come from. If we were used to dealing with perfectly-correctly-understanding super-intelligences, then normal humans would look like we don't understand much and our deliberate thinking might be no more accurate than the super-intelligence's absent-minded automatic responses. Thus we would conclude that humans are never really thinking or understanding anything.

I agree that default LLM output makes them look like they're thinking like a human more than they really are. I think mistakes are shocking more because our expectation of someone who talks confidently is that they're not constantly revealing themselves to be an obvious liar. But if you take away the social cues and just look at the factual claims they provide, they're not obviously not-understanding vs humans are-understanding.

98. layer8 ◴[] No.42142791{5}[source]
Being aware and not admitting are two different things, though. When you confront an LLM with a limitation, it will generally admit having it. That doesn't mean that it exhibits any awareness of having the limitation in contexts where the limitation is glaringly relevant, without first having been confronted with it. This is in itself a limitation of LLMs: in contexts where it should be highly obvious, they don't take their limitations into account without specific prompting.
99. layer8 ◴[] No.42142851{5}[source]
I'm assuming the "Wait" sentence is from the user. What I mean is that when humans say something, they also tend to have a view (maybe via the famous mirror neurons) of how this now sounds to the other person. They may catch themselves while speaking, changing course mid-sentence, or adding another sentence to soften or highlight something in the previous sentence, or maybe correcting or admitting some aspect after the fact. LLMs don't exhibit such an inner feedback loop, in which they reconsider the effect of the output they are in the process of generating.

You won't get an LLM outputting "wait, that's not right" halfway through their original output (unless you prompted them in a way that would trigger such a speech pattern), because no re-evaluation is taking place without further input.

replies(1): >>42177920 #
100. Jensson ◴[] No.42143746{5}[source]
> (That’s not to say that humans don’t tend to lose some of their flexibility over their individual lifetimes as well.)

The lifetime is the context window; the model/training is the DNA. A human in the moment isn't generally intelligent, but a human over his lifetime is. The first is so much easier to try to replicate, but it is a bad target since humans aren't born like that.

101. Jensson ◴[] No.42143777{3}[source]
They understand sentences but not words.
replies(1): >>42144657 #
102. eichi ◴[] No.42144147{3}[source]
This is people's true desire. Make something like that while handling criticisms and fitting products to the market.
103. youoy ◴[] No.42144657{4}[source]
What do you mean by that? We have the monosemanticity results [0]

[0] https://transformer-circuits.pub/2024/scaling-monosemanticit...

104. littlestymaar ◴[] No.42145102{4}[source]
> if one would extend it with an external program, that gives it feedback

If you have an external program, then by definition it's not self-awareness ;). Also, it's not about correctness per se, but about the model's ability to assess its own knowledge (making a mistake because the model was exposed to mistakes in the training data is fine, hallucinating isn't).

replies(1): >>42150305 #
105. Vampiero ◴[] No.42145267{3}[source]
Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for an average human: designing a functional trivia app.

There, you don't need to invoke Turing or compiler bootstrapping. You just need one example of a use case where the accuracy of responses is mission critical.

replies(1): >>42146128 #
106. fullstackchris ◴[] No.42145571{3}[source]
Comments like these are so prevalent and yet illustrate very well the lack of understanding of the underlying technology. Neural nets, once trained, are static! You'll never get dynamic "through-time" reasoning like you can with a human-like mind. It's simply the WRONG tool. I say human-like because I still think AGI could be achieved in some digital format, but I can assure you it won't be packaged in a static neural net.

Now, neural nets that have a copy of themselves, can look back at what nodes were hit, and change through time... then maybe we are getting somewhere

replies(1): >>42147035 #
107. alainx277 ◴[] No.42146128{4}[source]
o1-preview managed to complete this in one attempt:

https://chatgpt.com/share/67373737-04a8-800d-bc57-de74a415e2...

I think the parent comment's challenge is more appropriate.
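
For context on what "completing" this actually involves: the app scaffolding itself is trivial. A minimal sketch of a JSON-fed trivia quiz (purely illustrative; this is not the generated code behind the link above, and questions.json is a hypothetical file of pre-written items) is just:

  import json
  import random

  # questions.json (hypothetical): [{"question": "...", "choices": ["...", ...], "answer": 0}, ...]
  with open("questions.json") as f:
      questions = json.load(f)

  score = 0
  for item in random.sample(questions, k=min(5, len(questions))):
      print(item["question"])
      for i, choice in enumerate(item["choices"]):
          print(f"  {i + 1}. {choice}")
      guess = int(input("Your answer: ")) - 1
      if guess == item["answer"]:
          score += 1
          print("Correct!")
      else:
          print("Wrong, the answer was:", item["choices"][item["answer"]])
  print("Final score:", score)

The hard part, as the reply below points out, is producing question-and-answer content you can actually trust; the scaffolding around it is the easy bit.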

replies(1): >>42148745 #
108. hackinthebochs ◴[] No.42147035{4}[source]
The context window of LLMs gives something like "through time reasoning". Chain of thought goes even further in this direction.
109. Filligree ◴[] No.42147539{4}[source]
That’s… what I said, yes.
110. Vampiero ◴[] No.42148745{5}[source]
Have you personally verified that the answers are not hallucinations and that they are indeed factually true?

Oh, you just asked it to make a trivia app that feeds on JSON. Cute, but that's not what I meant. The web is full of tutorials for basic stuff like that.

To be clear I meant that LLMs can't write trivia questions and answers, thus proving that they can't produce trustworthy outputs.

And a trivia app is a toy (one might even say... a trivial example), but it's a useful demonstration of why you wouldn't put an LLM into a system on which lives depend, let alone invest billions in it.

If you don't trust my words just go back to fiddling with your models and ask them to write a trivia quiz about a topic that you know very well. Like a TV show.

111. lagrange77 ◴[] No.42150305{5}[source]
Yes, but that's essentially my point. Where to draw the system boundary? The brain is also composed of multiple components and does IO with external components, that are definitely not considered part of it.
112. tim333 ◴[] No.42157364[source]
Working towards it more than on it.

People use the term in different ways. It generally implies being able to think like a human or better. OpenAI have always said they are working towards it, I think deepmind too. It'll probably take more than an LLM.

It's economically a big deal because if it can out-think humans you can set it to develop the next improved model and basically make humans redundant.

113. snapcaster ◴[] No.42172995{3}[source]
At least it's a testable, measurable definition. Everyone else seems to be down boring linguistic rabbit holes or nonstop goalpost moving.
114. nomel ◴[] No.42177920{6}[source]
> You won't get an LLM outputting "wait, that's not right" halfway through their original output

No, that's one contiguous response from the LLM. I have screenshots, because I was so surprised the first time. I've had it happen many times. This was (as I always use LLMs) via direct API calls. The first time it happened, it was with the largest Llama 3.5. It usually only happens one-shot, no context, base/empty system prompt.

> LLMs don't exhibit such an inner feedback loop

That's not true at all. Next-token prediction is based on all previous text, including the previous word that was just produced. It uses what it has said for what it will say next, within the same response, just as a Markov chain would.
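
In other words, the feedback loop is the decoding loop itself. A schematic sketch (with a hypothetical next_token_distribution() standing in for the model's forward pass) of how each generated token immediately becomes part of the conditioning context:

  import random

  def next_token_distribution(tokens):
      # Hypothetical stand-in for a real LLM forward pass: a transformer would
      # return a probability distribution over the vocabulary, conditioned on
      # *all* tokens so far -- the prompt plus everything already generated.
      vocab = ["Wait,", "that's", "not", "right,", "actually", "<eos>"]
      return {tok: 1.0 / len(vocab) for tok in vocab}

  def generate(prompt_tokens, max_new_tokens=20):
      tokens = list(prompt_tokens)
      for _ in range(max_new_tokens):
          dist = next_token_distribution(tokens)   # sees its own earlier output
          tok = random.choices(list(dist), weights=list(dist.values()))[0]
          if tok == "<eos>":
              break
          tokens.append(tok)                       # generated token joins the context
      return tokens

  print(generate(["2", "+", "2", "="]))

So a mid-response "wait, that's not right" can arise within a single generation: once the flawed text is in the context, it conditions every remaining token, with no extra prompting required.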

115. ◴[] No.42178012{6}[source]
116. rodgerd ◴[] No.42230225{5}[source]
Goatseus Maximus is what you're after.