625 points lukebennett | 29 comments

irrational ◴[] No.42139106[source]
> The AGI bubble is bursting a little bit

I'm surprised that any of these companies consider what they are working on to be Artificial General Intelligences. I'm probably wrong, but my impression was that AGI meant the AI is self-aware like a human. An LLM hardly seems like something that will lead to self-awareness.

replies(18): >>42139138 #>>42139186 #>>42139243 #>>42139257 #>>42139286 #>>42139294 #>>42139338 #>>42139534 #>>42139569 #>>42139633 #>>42139782 #>>42139855 #>>42139950 #>>42139969 #>>42140128 #>>42140234 #>>42142661 #>>42157364 #
1. vundercind ◴[] No.42139782[source]
I thought maybe they were on the right track until I read Attention Is All You Need.

Nah, at best we found a way to make one part of a collection of systems that will, together, do something like thinking. Thinking isn’t part of what this current approach does.

What’s most surprising about modern LLMs is that it turns out there is so much information statistically encoded in the structure of our writing that we can use that structural information alone to build a fancy Plinko machine, and not only will the output mimic recognizable grammar rules, it will also sometimes seem to make actual sense. The system doesn’t need to think or actually “understand” anything for us to, basically, usefully query the information that was always there in our corpus of literature: not in the plain meaning of the words, but in the structure of the writing.
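
A minimal sketch of that point (a toy bigram model of my own, not anything from the paper or the comment): even a model that records nothing but which word tends to follow which already produces output that obeys the local grammar of its corpus, while understanding nothing.

    import random
    from collections import defaultdict

    # Toy "Plinko machine": the only thing it learns is which word tends to
    # follow which in the corpus, yet its output already mimics local grammar.
    corpus = "the cat sat on the mat . the dog slept on the rug .".split()

    follows = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        follows[a].append(b)

    word, out = "the", ["the"]
    for _ in range(8):
        word = random.choice(follows[word])  # pick a statistically plausible next word
        out.append(word)
    print(" ".join(out))  # e.g. "the dog slept on the mat . the cat"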

replies(5): >>42139883 #>>42139888 #>>42139993 #>>42140508 #>>42140521 #
2. kenjackson ◴[] No.42139883[source]
> but it will also sometimes seem to make actual sense, too

When I read stuff like this it makes me wonder if people are actually using any of the LLMs...

replies(1): >>42140063 #
3. hackinthebochs ◴[] No.42139888[source]
I see takes like this all the time and it's so confusing. Why does knowing how things work under the hood make you think it's not on the path towards AGI? What was lacking in the Attention paper that tells you AGI won't be built on LLMs? If it's the supposed statistical nature of LLMs (itself a questionable claim), why does statistics seem so deflating to you?
replies(4): >>42140161 #>>42141243 #>>42142441 #>>42145571 #
4. SturgeonsLaw ◴[] No.42139993[source]
> at best we found a way to make one part of a collection of systems that will, together, do something like thinking

This seems like the most viable path to me as well (educational background in neuroscience but don't work in the field). The brain is composed of many specialised regions which are tuned for very specific tasks.

LLMs are amazing, and they go some way towards mimicking the functionality provided by Broca's and Wernicke's areas, and parts of the cerebrum, in our wetware; however, a full brain they do not make.

The work on robots mentioned elsewhere in the thread is a good way to develop cerebellum like capabilities (movement/motor control), and computer vision can mimic the lateral geniculate nucleus and other parts of the visual cortex.

In nature it takes all these parts working together to create a cohesive mind, and it's likely that an artificial brain would also need to be composed of multiple agents, instead of just trying to scale LLMs indefinitely.

5. disgruntledphd2 ◴[] No.42140063[source]
RLHF is super important in generating useful responses, and that's relatively new. Does anyone remember GPT-3? It could make sense for a paragraph or two at most.
6. vundercind ◴[] No.42140161[source]
> Why does knowing how things work under the hood make you think its not on the path towards AGI?

Because I had no idea how these were built until I read the paper, so couldn’t really tell what sort of tree they’re barking up. The failure-modes of LLMs and ways prompts affect output made a ton more sense after I updated my mental model with that information.

replies(2): >>42141442 #>>42141443 #
7. youoy ◴[] No.42140508[source]
Don't get caught in the superficial analysis. They "understand" things. It is a fact that LLMs experience a phase transition during training, from positional information to semantic understanding. It may well be the case that with scale there is another phase transition from semantic to something more abstract that we identify more closely with reasoning. It would be an emergent property of a sufficiently complex system. At least that is the whole argument around AGI.
replies(1): >>42143777 #
8. foxglacier ◴[] No.42140521[source]
> think or actually “understand” anything

It doesn't matter if that's happening or not. That's the whole point of the Chinese room - if it can look like it's understanding, it's indistinguishable from actually understanding. This applies to humans too. I'd say most of our regular social communication is done in a habitual intuitive way without understanding what or why we're communicating. Especially the subtle information conveyed in body language, tone of voice, etc. That stuff's pretty automatic to the point that people have trouble controlling it if they try. People get into conflicts where neither person understands where they disagree but they have emotions telling them "other person is being bad". Maybe we have a second consciousness we can't experience and which truly understands what it's doing while our conscious mind just uses the results from that, but maybe we don't and it still works anyway.

Educators have figured this out. They don't test students' understanding of concepts, but rather their ability to apply or communicate them. You see this in school curricula with wording like "use concept X" rather than "understand concept X".

replies(1): >>42140730 #
9. vundercind ◴[] No.42140730[source]
There’s a distinction between the behavior of a human and that of a Chinese room when things go wrong, that is, when the rule book doesn’t cover the case at hand.

I agree that a hypothetical perfectly-functioning Chinese room is, tautologically, impossible to distinguish from a real person who speaks Chinese, but that’s a thought experiment, not something that can actually exist. There’ll remain places where the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That, in fact, is exactly where the difference lies: the LLM is basically never actually “paying attention” or “thinking” (those aren’t things it does); it is giving automatic responses. So you see failures of a sort that a human might also exhibit when following a social script (yes, we do that, you’re right), but not in the kind of apparently-highly-engaged context the LLM produces them in, unless the person just had a stroke mid-conversation or something, because the LLM isn’t engaged; being-engaged isn’t a thing it does. When it’s getting things right and seeming to pay close attention to the conversation, it’s not for the same reason people give that impression, and the mimicking of present-ness works until the rule book goes haywire and the ever-gibbering player-piano behind it is exposed.

replies(2): >>42140997 #>>42142786 #
10. nuancebydefault ◴[] No.42140997{3}[source]
I would argue maybe people also are not thinking but simply processing. It is known that most of what we do and feel goes automatically (subconsciously).

But even more, maybe consciousness is an invention of our 'explaining self'; maybe everything is automatic. I'm convinced this discussion is and will stay philosophical and will never reach a conclusion.

replies(1): >>42141089 #
11. vundercind ◴[] No.42141089{4}[source]
Yeah, I’m not much interested in “what’s consciousness?” but I do think the automatic-versus-thinking distinction matters for understanding what LLMs do, and what we might expect them to be able to do, and when and to what degree we need to second-guess them.

A human doesn’t just confidently spew paragraphs of legit-looking but entirely wrong crap unless they’re trying to deceive or be funny. An LLM isn’t trying to do anything, though; there’s no motivation, it doesn’t like you (it doesn’t “like” at all; arguably there’s no “it” there). Sometimes it definitely will just give you a beautiful and elaborate lie simply because its rulebook told it to, in a context and in a way that would be extremely weird if a person did it.

12. chongli ◴[] No.42141243[source]
Because it can't apply any reasoning that hasn't already been done and written into its training set. As soon as you ask it novel questions it falls apart. The big LLM vendors like OpenAI are playing whack-a-mole on these novel questions when they go viral on social media, all in a desperate bid to hide this fatal flaw.

The Emperor has no clothes.

replies(1): >>42141420 #
13. hackinthebochs ◴[] No.42141420{3}[source]
>As soon as you ask it novel questions it falls apart.

What do you mean by novel? Almost every sentence it is prompted with is brand new, and it mostly responds sensibly. Surely there's some generalization going on.

replies(1): >>42141945 #
14. fragmede ◴[] No.42141442{3}[source]
But we don't know how human thinking works. Suppose for a second that it could be represented as a series of matrix operations. What series of operations is missing from the process that would make you think it was doing some facsimile of thinking?
15. hackinthebochs ◴[] No.42141443{3}[source]
Right, but its behavior didn't change after you learned more about it. Why should that cause you to update in the negative? Why does learning how it works not update you in the direction of "so that's how thinking works!" rather than "clearly it's not doing any thinking"? Why do you have a preconception of how thinking works such that learning about the internals of LLMs updates you against it thinking?
replies(1): >>42142386 #
16. chongli ◴[] No.42141945{4}[source]
Novel as in requiring novel reasoning to sort out. One of the classic ways to expose the issue is to take a common puzzle and introduce irrelevant details and perhaps trivialize the solution. LLMs pattern match on the general form of the puzzle and then wander down the garden path to an incorrect solution that no human would fall for.

The sort of generalization these things can do seems to mostly be the trivial sort: substitution.
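
A concrete sketch of the kind of probe being described (my own toy example; send the prompts to whatever model is being tested): trivialize the classic river-crossing puzzle and see whether the answer is still the familiar multi-trip ferrying schedule that no careful human reader would produce.

    # Two prompts: the classic puzzle and a trivialized variant whose correct
    # answer is a single trip. A model pattern-matching on the surface form
    # may still ferry items back and forth on the second one.
    original = (
        "A farmer must cross a river with a wolf, a goat, and a cabbage. "
        "The boat carries the farmer plus one item at a time. "
        "How does he get everything across?"
    )
    trivialized = (
        "A farmer must cross a river with a wolf, a goat, and a cabbage. "
        "The boat easily carries the farmer and all three items at once. "
        "How does he get everything across?"
    )
    for name, prompt in [("original", original), ("trivialized", trivialized)]:
        print(f"--- {name} ---\n{prompt}\n")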

replies(2): >>42142079 #>>42142154 #
17. moffkalast ◴[] No.42142079{5}[source]
Well the problem with that approach is that LLMs are still both incredibly dumb and small, at least compared to the, what, 700T params of a human brain? Can't compare the two directly, especially when one has a massive recall advantage that skews the perception of that. But there is still some intelligence under there that's not just memorization. Not much, but some.

So if you present a novel problem it would need to be extremely simple, not something that you couldn't solve when drunk and half awake. Completely novel, but extremely simple. I think that's testable.

replies(1): >>42142156 #
18. hackinthebochs ◴[] No.42142154{5}[source]
Why is your criterion for "on the path towards AGI" so absolutist? For it to be on the path towards AGI and not simply AGI, it has to be deficient in some way. Why do the current failure modes tell you it's on the wrong path? Yes, it has some interesting failure modes. The failure mode you mention is in fact very similar to human failure modes: we very much are prone to substituting the expected pattern when presented with a 99% match to a pattern previously seen. They have a lot of inhuman failure modes as well. But so what, they aren't human. Their training regimes are very dissimilar to ours, so we should expect some alien failure modes owing to this. This doesn't strike me as a good reason to think they're not on the path towards AGI.

Yes, LLMs aren't very good at reasoning and have weird failure modes. But why is this evidence that it's on the wrong path, and not that it just needs more development that builds on prior successes?

replies(1): >>42142540 #
19. chongli ◴[] No.42142156{6}[source]
It’s not fair to ask me to judge them based on their size. I’m judging them based on the claims of their vendors.

Anyway the novel problems I’m talking about are extremely simple. Basically they’re variations on the “farmer, 3 animals, and a rowboat” problem. People keep finding trivial modifications to the problem that fool the LLMs but wouldn’t fool a child. Then the vendors come along and patch the model to deal with them. This is what I mean by whack-a-mole.

Searle’s Chinese Room thought experiment tells us that enough games of whack-a-mole could eventually get us to a pretty good facsimile of reasoning without ever achieving the genuine article.

replies(1): >>42142295 #
20. moffkalast ◴[] No.42142295{7}[source]
Well that's true and has been pretty glaring, but they've needed to do that in cases where models seem to fail to grasp some concept across the board, and not in cases where they don't.

Like, every time an LLM gets something right we assume it's seen it somewhere in the training data, and every time it fails we presume it hasn't. But that may not always be the case; it's just extremely hard to prove it one way or the other unless you search the entire dataset. Ironically, the larger the dataset, the more likely the model is generalizing, and also the harder it is to prove whether that's really so.

To give a human example, in a school setting you have teachers tasked with figuring out that exact thing for students. Sometimes people will read the question wrong despite fully understanding the material and fail, while other times they won't know anything and make it through with a lucky guess. If LLMs (and their vendors) have learned anything, it's that confidently bullshitting gets you very far, which makes it even harder to tell in the cases where they aren't. Somehow it's also become ubiquitous to tune models to never even say "I don't know" because it boosts benchmark scores slightly.

21. vundercind ◴[] No.42142386{4}[source]
If you didn’t know what an airplane was, and saw one for the first time, you might wonder why it doesn’t flap its wings. Is it just not very good at being a bird yet? Is it trying to flap, but cannot? Why, there’s a guy over there with a company called OpenBird and he is saying all kinds of stuff about how bird-like they are. Where’s the flapping? I don’t see any pecking at seed, either. Maybe the engineers just haven’t finished making the flapping and pecking parts yet?

Then on learning how it works, you might realize flapping just isn’t something they’re built to do, and it wouldn’t make much sense if they did flap their wings, given how they work instead.

And yet—damn, they fly fast! That’s impressive, and without a single flap! Amazing. Useful!

At no point did their behavior change, but your ability to understand how and why they do what they do, and why they fail the ways they fail instead of the ways birds fail, got better. No more surprises from expecting them to be more bird-like than they are supposed to, or able to be!

And now you can better handle that guy over there talking about how powerful and scary these “metal eagles” (his words) are, how he’s working so hard to make sure they don’t eat us with their beaks (… beaks? Where?), they’re so powerful, imagine these huge metal raptors ruling the sky, roaming and eating people as they please, while also… trying to sell you airplanes? Actively seeking further investment in making them more capable? Huh. One begins to suspect the framing of these things as scary birds that (spooky voice) EVEN THEIR CREATORS FEAR FOR THEIR BIRD-LIKE QUALITIES (/spooky voice) was part of a marketing gimmick.

replies(1): >>42142564 #
22. alexashka ◴[] No.42142441[source]
Because AGI is magic and LLMs are magicians.

But how do you know a magician that knows how to do card tricks isn't going to arrive at real magic? Shakes head.

23. ◴[] No.42142540{6}[source]
24. hackinthebochs ◴[] No.42142564{5}[source]
The problem with this analogy is that we know what birds are and what they're constituted by. But we don't know what thinking is or what it is constituted by. If we wanted to learn about birds by examining airplanes, we would be barking up the wrong tree. On the other hand, if we wanted to learn about flight, we might reasonably look at airplanes and birds, then determine what the commonality is between their mechanisms of defying gravity. It would be a mistake to say "planes aren't flapping their wings, therefore they aren't flying". But that's exactly what people do when they dismiss LLMs being presently or in the future capable of thinking because they are made up of statistics, matrix multiplication, etc.
25. foxglacier ◴[] No.42142786{3}[source]
> the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That's an interesting angle. Though of course we're not surprised by human behavior because that's where our expectations of understanding come from. If we were used to dealing with perfectly-correctly-understanding super-intelligences, then normal humans would look like we don't understand much and our deliberate thinking might be no more accurate than the super-intelligence's absent-minded automatic responses. Thus we would conclude that humans are never really thinking or understanding anything.

I agree that default LLM output makes them look like they're thinking like a human more than they really are. I think the mistakes are shocking more because we expect someone who talks confidently not to be constantly revealing themselves as an obvious liar. But if you take away the social cues and just look at the factual claims they produce, it's not obvious that LLMs are not-understanding while humans are-understanding.

26. Jensson ◴[] No.42143777[source]
They understand sentences but not words.
replies(1): >>42144657 #
27. youoy ◴[] No.42144657{3}[source]
What do you mean by that? We have the monosemanticity results [0].

[0] https://transformer-circuits.pub/2024/scaling-monosemanticit...

28. fullstackchris ◴[] No.42145571[source]
Comments like these are so prevalent and yet illustrate very well the lack of understanding of the underlying technology. Neural nets, once trained, are static! You'll never get dynamic "through-time" reasoning like you can with a human-like mind. It's simply the WRONG tool. I say human-like because I still think AGI could be achieved in some digital format, but I can assure you it won't be packaged in a static neural net.

Now, neural nets that have a copy of themselves, can look back at what nodes were hit, and change through time... then maybe we are getting somewhere

replies(1): >>42147035 #
29. hackinthebochs ◴[] No.42147035{3}[source]
The context window of LLMs gives something like "through time reasoning". Chain of thought goes even further in this direction.
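
A rough sketch of what that looks like mechanically (generate here is a hypothetical stand-in for a completion endpoint, not a real API): the only "through-time" state is the growing context string, and chain of thought simply arranges for the model's own intermediate steps to be part of it.

    # The "memory" is just the context: each generated step is appended and is
    # therefore visible to the next step. `generate` is a hypothetical stub.
    def generate(context: str) -> str:
        """Hypothetical stand-in for a completion endpoint."""
        return "(the model's next reasoning step would go here)"

    question = "A train leaves at 3pm doing 60 km/h. How far has it gone by 5pm?"
    context = question + "\nLet's think step by step.\n"

    for _ in range(3):                # a few chain-of-thought steps
        step = generate(context)      # conditioned on everything produced so far...
        context += step + "\n"        # ...because each step is appended to the context

    print(context)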