
108 points | bertman | 1 comment
n4r9 ◴[] No.43819695[source]
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I guess LLM advocates might dispute.

Firstly:

> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.

To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Secondly:

> Theories are developed by doing the work and LLMs do not do the work

Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?

replies(6): >>43819742 #>>43821151 #>>43821318 #>>43822444 #>>43822489 #>>43824220 #
ryandv ◴[] No.43821318[source]
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

This idea has already been explored by thought experiments such as John Searle's so-called "Chinese room" [0]; an LLM cannot have a theory about a program, any more than the computer in Searle's "Chinese room" understands "Chinese" by using lookup tables to generate canned responses to an input prompt.

One says the computer lacks "intentionality" regarding the topics that the LLM ostensibly appears to be discussing. Their words aren't "about" anything; they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.

[0] https://en.wikipedia.org/wiki/Chinese_room

replies(6): >>43821648 #>>43822082 #>>43822399 #>>43822436 #>>43824251 #>>43828753 #
TeMPOraL ◴[] No.43822082[source]
Wait, isn't the conclusion to take from the "Chinese room" literally the opposite of what you suggest? I.e. it's the most basic, go-to example of a larger system showing capability (here, understanding Chinese) that is not present in any of its constituent parts individually.

> Their words aren't "about" anything, they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.

That's very much unclear at this point. We don't fully understand how we relate words to concepts and meaning ourselves, but to the extent we do, LLMs are by far the closest implementation of those same ideas in a computer.

replies(4): >>43822153 #>>43822155 #>>43822821 #>>43830055 #
sgt101 ◴[] No.43830055[source]
>We don't fully understand how we relate words to concepts and meaning ourselves,

This is definitely true.

>but to the extent we do, LLMs are by far the closest implementation of those same ideas in a computer

Well - this is half true but meaningless. I mean - we don't understand, so LLMs are as good a bet as anything.

LLMs will confidently tell you that white wine is good with fish, but they have no experience of the taste of wine, or fish, or what it means for one to complement the other. Humans all know what it's like to have fluid in their mouths; they know the taste of food and the feel of the ground under their feet. LLMs have no experience; they exist crystallised and unchanging in an abstract eternal now, so they literally can't understand anything.

replies(2): >>43849672 #>>43855778 #
stevenhuang ◴[] No.43855778[source]
It's incoherent to think the ability to reason requires the reasoner to be able to change permanently. You realize that LLMs do change; their context window and model weights change on every processed token. Not to mention the weights can be saved and persisted, in a sense, via LoRAs.

The belief that LLMs cannot reason may be justifiable for other reasons, just not for the reasons you've outlined.

replies(1): >>43871396 #
sgt101 ◴[] No.43871396{3}[source]
I'm not sure you're right, you know. I think that the way an LLM maintains a conversation is to have the conversational thread fed into an instance of it at every step. You can see this if you run a conversation step by step, then take all of it (including the LLM's responses) apart from the final reply and paste that into a new thread:

https://chatgpt.com/share/6814e827-81cc-8001-a75f-64ed6df5fc...

https://chatgpt.com/share/6814e7fb-f4d0-8001-a503-9c991df832...

If you think about how these things work as services, you can see that this makes sense. The model weights are several GB, so caching the model weights for utilisation by a particular customer is impractical. So if the forward pass does update the model, that update is instantly discarded; what's retained is the conversational text, and that's the bit that's uploaded to the model on each iteration for a new reply. There are hundreds of requests pinging through the data center where the models are used every second, and all of these use the same models.
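
To make the mechanics concrete, here is a minimal sketch of what that looks like from the client's side. Everything in it is hypothetical (call_llm, chat_turn and the message format are placeholders, not any particular vendor's API); it just illustrates the two points above: the client re-sends the whole transcript on every turn, and the reply is a function of that transcript alone, so replaying the transcript in a fresh thread reproduces the continuation.

    # Hypothetical sketch, not a real API: the client keeps the transcript and
    # re-sends ALL of it on every turn; the "model" is a pure function of that
    # transcript and holds no memory of its own between calls.

    def call_llm(messages: list[dict]) -> str:
        """Stand-in for a chat endpoint. A real service runs a forward pass over
        fixed, shared weights; here the reply is just a deterministic function
        of the full transcript, which is the property being illustrated."""
        transcript = " | ".join(f"{m['role']}: {m['content']}" for m in messages)
        return f"reply #{len(messages)} to [{transcript}]"

    def chat_turn(history: list[dict], user_message: str) -> list[dict]:
        """One UI turn: append the user message, upload the whole conversation,
        append the model's reply. Nothing is cached server-side between turns."""
        history = history + [{"role": "user", "content": user_message}]
        reply = call_llm(history)  # the entire conversation goes over the wire
        return history + [{"role": "assistant", "content": reply}]

    # Build a conversation step by step.
    history: list[dict] = []
    history = chat_turn(history, "Is white wine good with fish?")
    history = chat_turn(history, "Why?")

    # The experiment described above: drop the final reply and replay the same
    # transcript in a brand-new thread. Since the earlier turns updated no
    # weights, the continuation is indistinguishable from the original.
    replayed = call_llm(history[:-1])
    assert replayed == history[-1]["content"]

With a real model the replay is only identical up to sampling temperature, but the structural point stands: all of the conversational "memory" lives in the re-sent text, not in the weights.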

But if you believe that there is a reasoning process taking place in the text, then fair enough.