108 points bertman | 1 comment
n4r9 ◴[] No.43819695[source]
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I'd guess LLM advocates might dispute.

Firstly:

> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.

To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Secondly:

> Theories are developed by doing the work and LLMs do not do the work

Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?

replies(6): >>43819742 #>>43821151 #>>43821318 #>>43822444 #>>43822489 #>>43824220 #
ryandv ◴[] No.43821318[source]
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

This idea has already been explored by thought experiments such as John Searle's so-called "Chinese room" [0]; an LLM cannot have a theory about a program, any more than the computer in Searle's "Chinese room" understands "Chinese" by using lookup tables to generate canned responses to an input prompt.

One says the computer lacks "intentionality" regarding the topics that the LLM ostensibly appears to be discussing. Their words aren't "about" anything, they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.

[0] https://en.wikipedia.org/wiki/Chinese_room
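
As a toy sketch of what I mean by "lookup tables" (purely illustrative, my own construction rather than Searle's): a table of canned responses produces plausible output while being "about" nothing at all:

    # Purely illustrative sketch: a "room" that maps input symbols to canned
    # output symbols with no representation of what any of them mean.
    CANNED_RESPONSES = {
        "你好吗？": "我很好，谢谢。",    # "How are you?" -> "I'm fine, thanks."
        "你会说中文吗？": "当然会。",    # "Do you speak Chinese?" -> "Of course."
    }

    def chinese_room(prompt: str) -> str:
        # The operator just matches symbols; swap the table for another language
        # (or for gibberish) and the procedure stays exactly the same.
        return CANNED_RESPONSES.get(prompt, "请再说一遍。")  # "Please say that again."

    print(chinese_room("你好吗？"))  # fluent output, zero intentionality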

replies(6): >>43821648 #>>43822082 #>>43822399 #>>43822436 #>>43824251 #>>43828753 #
CamperBob2 ◴[] No.43821648[source]
You're seriously still going to invoke the Chinese Room argument after what we've seen lately? Wow.

The computer understands Chinese better than Searle (or anyone else) understood the nature and functionality of language.

replies(1): >>43821684 #
ryandv ◴[] No.43821684[source]
You're seriously going to offer this braindead, reddit-tier "argumentation," or rather lack thereof, claiming bewilderment and making zero substantive points?

Wow.

replies(1): >>43821955 #
CamperBob2 ◴[] No.43821955[source]
Yes, because the Chinese Room was a weak test the day it was proposed, and it's a heap of smoldering rhetorical wreckage now. It's Searle who failed to offer any substantive points.

How do you know you're not arguing with an LLM at the moment? You don't... any more than I do.

replies(1): >>43821978 #
ryandv ◴[] No.43821978[source]
> How do you know you're not arguing with an LLM at the moment? You don't.

I wish I was right now. It would probably provide at least the semblance of greater insight into these topics.

> the Chinese Room was a weak test the day it was proposed

Why?

replies(2): >>43822163 #>>43822171 #
nullstyle ◴[] No.43822171{3}[source]
It's a crappy thought experiment, free from the constraints of any reality, and given that these fancy lookup tables understand most languages better than I do, it doesn't hold water. Thought experiments aren't science.
replies(4): >>43822317 #>>43822388 #>>43822574 #>>43823211 #
Jensson ◴[] No.43822388{4}[source]
If the Chinese room tells you "I just left the train, see you in 5 minutes", what do you think the Chinese room is trying to convey? Do you think it knows what it just said? LLMs say such things all the time if you don't RLHF them to stop, so why do you think they wouldn't be just as clueless about other things?
replies(1): >>43823848 #
CamperBob2 ◴[] No.43823848{5}[source]
If you ask an LLM to do some math, what happens is interesting.

Simple arithmetic ("What is 2+2") is obviously going to be well-represented in the training data, so the model will simply regurgitate "4."

For more advanced questions like "What are the roots of 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x = 0?", the model will either yield random nonsense as GPT-4o did, or write a Python script and execute it to return the correct answer(s) as o4-mini-high did: https://chatgpt.com/share/680fb812-76b8-800b-a19e-7469cbcc43...
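
The script it writes needn't be anything fancy; something along these lines (my sketch, assuming numpy is available, not the actual code from the linked chat) gets the numeric roots:

    # Sketch of the kind of script a tool-using model might emit (assumes numpy);
    # not the transcript from the chat linked above.
    import numpy as np

    # 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x = 0, coefficients highest degree first
    coeffs = [14.338, 4.005, 3.332, -99.7, 120, 0]
    roots = np.roots(coeffs)   # eigenvalues of the companion matrix
    print(roots)               # x = 0 is one root, since x factors out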

Now, give the model an intermediate arithmetic problem, one that isn't especially hard but also isn't going to be in-distribution ("If a is 3 and b is 11.4, what is the fourth root of a*b?").

How would YOU expect the operator of a Chinese Room to respond to that?

Here's how GPT-4o responded: https://chatgpt.com/share/680fb616-45e0-800b-b592-789f3f8c58...
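
For reference, the exact value is easy to check (a one-liner of my own, not from the chat):

    # Exact check for "the fourth root of a*b" with a = 3, b = 11.4
    a, b = 3, 11.4
    print((a * b) ** 0.25)   # 34.2 ** 0.25 ≈ 2.4183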

Now, that's not a great answer; it's clearly an imprecise estimate. But it's more or less right, and the fact that it isn't a perfect answer suggests that the model didn't cheat somehow. A similar but easier problem would almost certainly have been answered correctly. Where did that answer come from, if the model doesn't "understand" the math to a nontrivial extent?

If it can "understand" basic high-school math, what else can it "understand?" What exactly are the limits of what a transformer can "understand" without resorting to web search or tool use?

An adherent of Searle's argument is going to have a terrible time explaining phenomena like this... and it's only going to get worse for them over time.

replies(2): >>43823953 #>>43824295 #
Jensson ◴[] No.43823953{6}[source]
> If it can "understand" basic high-school math, what else can it "understand?" What exactly are the limits of what a transformer can "understand" without resorting to web search or tool use?

It is basically a grammar machine: it mostly understands stuff that can be encoded as a grammar. That is extremely inefficient for math, but it can do it, and that gives you a really simple way to figure out what it can and can't do.

Knowing this, LLMs never really surprised me. You can encode a ton of stuff as grammars, but that is still never going to be enough, given how inefficient grammars are at lots of things. But when you have a grammar the size of many billions of bytes, you can do quite a lot with it.

replies(1): >>43824405 #
CamperBob2 ◴[] No.43824405{7}[source]
Let's stick with the Chinese Room specifically for a moment.

1) The operator doesn't know math, but the Chinese books in the room presumably include math lessons.

2) The operator's instruction manual does not include anything about math, only instructions for translation using English and Chinese vocabulary and grammar.

3) Someone walks up and hands the operator the word problem in question, written in Chinese.

Does the operator succeed in returning the Chinese characters corresponding to the equation's roots? Remember, he doesn't even know he's working on a math problem, much less how to solve it himself.

As humans, you and I were capable of reading high-school math textbooks by the time we reached the third or fourth grade. Just being able to read the books, though, would not have taught us how to attack math problems that were well beyond our skill level at the time.

So much for grammar. How can a math problem be solved by someone who not only doesn't understand math, but doesn't even understand the language the question is written in? Searle's proposal only addresses the latter: language can indeed be translated symbolically. Wow, yeah, thanks for that insight. Meanwhile, to arrive at the right answers, an understanding of the math must exist somewhere... but where?

My position is that no, the operator of the Room could not have arrived at the answer to the question that the LLM succeeded (more or less) at solving.

replies(1): >>43825702 #
Jensson ◴[] No.43825702{8}[source]
> Meanwhile, to arrive at the right answers, an understanding of the math must exist somewhere... but where?

In the grammar: you can have grammar rules like "1 + 1 = " must be followed by "2", etc. Then add a lot of dependency rules, such as the "He" in "He did X" depending on something in a previous sentence, and rules by which "1 plus 1" translates to "1 + 1" and "add 1 to 1" is also "1 + 1", and now you have a machine that can do very complex things.

Then you take such a grammar machine and train it on all the text humans have ever written, and it learns a lot of such grammar structures, and can thus parse and solve some basic math problems, since their solutions are part of the grammar it learned.

Such a machine is still unable to solve anything outside of the grammar it has learned. But it is still very useful: pose a question in a way that makes it easy to parse, and that has a lot of the grammar dependencies you know it can handle, and it will almost always output the right response.
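
To make that concrete, here's a toy sketch of my own (nothing to do with how transformers are actually implemented): a few rewrite rules plus a production table already behaves like a tiny "grammar machine" that answers questions inside its grammar and nothing outside it:

    # Toy "grammar machine": normalize phrasings into a canonical form, then
    # answer by pattern lookup. Purely illustrative of the argument above.
    import re

    REWRITE_RULES = [
        (r"(\d+) plus (\d+)", r"\1 + \2"),      # "1 plus 1"   -> "1 + 1"
        (r"add (\d+) to (\d+)", r"\2 + \1"),    # "add 1 to 1" -> "1 + 1"
    ]

    PRODUCTIONS = {
        "1 + 1": "2",   # '"1 + 1 = " must be followed by "2"'
        "2 + 2": "4",
    }

    def grammar_machine(question: str) -> str:
        for pattern, replacement in REWRITE_RULES:
            question = re.sub(pattern, replacement, question)
        # Outside the learned grammar there is simply no rule to fire.
        return PRODUCTIONS.get(question.strip().rstrip("?"), "<no rule>")

    print(grammar_machine("add 1 to 1"))   # -> 2
    print(grammar_machine("2 plus 2"))     # -> 4
    print(grammar_machine("7 plus 5"))     # -> <no rule>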