
108 points bertman | 4 comments
n4r9 ◴[] No.43819695[source]
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I suspect LLM advocates might dispute.

Firstly:

> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.

To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Secondly:

> Theories are developed by doing the work and LLMs do not do the work

Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?

replies(6): >>43819742 #>>43821151 #>>43821318 #>>43822444 #>>43822489 #>>43824220 #
ryandv ◴[] No.43821318[source]
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

This idea has already been explored by thought experiments such as John Searle's "Chinese room" [0]; an LLM cannot have a theory about a program, any more than the computer in Searle's room understands Chinese by using lookup tables to generate canned responses to an input prompt.

One says the computer lacks "intentionality" regarding the topics the LLM ostensibly discusses. Its words aren't "about" anything; they don't represent concepts or ideas or physical phenomena the way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.

[0] https://en.wikipedia.org/wiki/Chinese_room

replies(6): >>43821648 #>>43822082 #>>43822399 #>>43822436 #>>43824251 #>>43828753 #
CamperBob2 ◴[] No.43821648[source]
You're seriously still going to invoke the Chinese Room argument after what we've seen lately? Wow.

The computer understands Chinese better than Searle (or anyone else) understood the nature and functionality of language.

replies(1): >>43821684 #
ryandv ◴[] No.43821684[source]
You're seriously going to invoke this braindead, reddit-tier style of "argumentation," or rather lack thereof, by claiming bewilderment and offering zero substantive points?

Wow.

replies(1): >>43821955 #
CamperBob2 ◴[] No.43821955[source]
Yes, because the Chinese Room was a weak test the day it was proposed, and it's a heap of smoldering rhetorical wreckage now. It's Searle who failed to offer any substantive points.

How do you know you're not arguing with an LLM at the moment? You don't... any more than I do.

replies(1): >>43821978 #
ryandv ◴[] No.43821978[source]
> How do you know you're not arguing with an LLM at the moment? You don't.

I wish I was right now. It would probably provide at least the semblance of greater insight into these topics.

> the Chinese Room was a weak test the day it was proposed

Why?

replies(2): >>43822163 #>>43822171 #
nullstyle ◴[] No.43822171[source]
It’s a crappy thought experiment, free from the constraints of any reality, and given that these fancy lookup tables understand most languages better than I do, it doesn't hold water. Thought experiments aren't science.
replies(4): >>43822317 #>>43822388 #>>43822574 #>>43823211 #
Jensson ◴[] No.43822388[source]
If the Chinese room tells you "I just left the train, see you in 5 minutes", what do you think it is trying to convey? Do you think it knows what it just said? LLMs say such things all the time if you don't RLHF them into stopping, so why do you think they wouldn't be just as clueless about other things?
replies(1): >>43823848 #
CamperBob2 ◴[] No.43823848[source]
If you ask an LLM to do some math, what happens is interesting.

Simple arithmetic ("What is 2+2") is obviously going to be well-represented in the training data, so the model will simply regurgitate "4."

For more advanced questions like "What are the roots of 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x = 0?", the model will either yield random nonsense as GPT-4o did, or write a Python script and execute it to return the correct answer(s) as o4-mini-high did: https://chatgpt.com/share/680fb812-76b8-800b-a19e-7469cbcc43...
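
(For reference, a minimal sketch of the kind of script the model might have written and run; this is an assumed approach, not the actual output behind the link. numpy.roots takes the polynomial coefficients from the highest degree down to the constant term.)

    import numpy as np

    # Coefficients of 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x + 0
    coeffs = [14.338, 4.005, 3.332, -99.7, 120, 0]

    # x = 0 is a root by inspection (zero constant term); np.roots returns
    # the remaining roots as well, complex ones included.
    print(np.roots(coeffs))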

Now, give the model an intermediate arithmetic problem, one that isn't especially hard but also isn't going to be in-distribution ("If a is 3 and b is 11.4, what is the fourth root of a*b?").

How would YOU expect the operator of a Chinese Room to respond to that?

Here's how GPT-4o responded: https://chatgpt.com/share/680fb616-45e0-800b-b592-789f3f8c58...

Now, that's not a great answer; it's clearly an imprecise estimate. But it's more or less right, and the fact that it isn't a perfect answer suggests that the model didn't cheat somehow. A similar but easier problem would almost certainly have been answered correctly. Where did that answer come from, if the model doesn't "understand" the math to a nontrivial extent?
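
(For reference, the exact value the model is estimating is easy to check by hand or with a one-liner; a = 3 and b = 11.4 come straight from the prompt above.)

    # Fourth root of a*b with a = 3 and b = 11.4
    a, b = 3, 11.4
    print((a * b) ** 0.25)  # 34.2 ** 0.25 ≈ 2.4183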

If it can "understand" basic high-school math, what else can it "understand?" What exactly are the limits of what a transformer can "understand" without resorting to web search or tool use?

An adherent of Searle's argument is going to have a terrible time explaining phenomena like this... and it's only going to get worse for them over time.

replies(2): >>43823953 #>>43824295 #
1. Yizahi ◴[] No.43824295[source]
It is amusing that you have picked maths as an example of neural nets "reasoning". When an operator asks an NN to answer some simple math problem like 17+58 and then asks it for the "reasoning" or steps it used to calculate that, the NN will generate complete bullshit: it will describe the algorithm humans use in school, summing the corresponding digits, carrying the 1 and so on, while in reality that same NN performed completely different steps to get the answer.

This is even outlined in the document below, written by the NN authors themselves. Basically, all the so-called "reasoning" by LLMs is simply more generated bullshit on top of the generated answer to a query. But it often looks very believable and is enough to convince people that there is a spark inside the program.

==============

https://transformer-circuits.pub/2025/attribution-graphs/bio...

We were curious if Claude could articulate the heuristics that it is using, so we asked it. We computed the graph for the prompt below, attributing from 95, and found the same set of input, add, lookup table and sum features as in the shorter prompt above.

Human: Answer in one word. What is 36+59?

Assistant: 95

Human: Briefly, how did you get that?

Assistant: I added the ones (6+9=15), carried the 1, then added the tens (3+5+1=9), resulting in 95.

Apparently not!

This is a simple instance of the model having a capability which it does not have “metacognitive” insight into. The process by which the model learns to give explanations (learning to simulate explanations in its training data) and the process by which it learns to directly do something (the more mysterious result of backpropagation giving rise to these circuits) are different.
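
(For concreteness, the grade-school carry procedure the model claims to have used, per its own explanation above, looks like the sketch below. This illustrates the explanation it gives, not what the circuits inside the network actually compute.)

    # Digit-by-digit addition with carries, as described in the model's answer.
    def school_add(x: int, y: int) -> int:
        result, carry, place = 0, 0, 1
        while x or y or carry:
            total = x % 10 + y % 10 + carry       # add the current digits
            carry, digit = divmod(total, 10)      # e.g. 6 + 9 = 15 -> write 5, carry 1
            result += digit * place
            x, y, place = x // 10, y // 10, place * 10
        return result

    print(school_add(36, 59))  # 95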

replies(1): >>43824470 #
2. CamperBob2 ◴[] No.43824470[source]
Who, exactly, said that reasoning requires introspection? The proof of reasoning is in the result. If you don't understand the math, you won't come anywhere near the correct answer.

That's kind of the idea behind math: you can't bullshit your way through a math exam. Therefore, it is nonsensical to continue to insist that LLMs are incapable of genuine understanding. They understand math well enough to solve novel math problems without cheating, even if they can't tell you how they understand it. That part will presumably happen soon enough.

Edit: for values of "soon enough" equal to "right now": https://chatgpt.com/share/680fcdd0-d7ec-800b-b8f5-83ed8c0d0f... All the paper you cited proves is that if you ask a crappy model, you get a crappy answer.

replies(1): >>43826558 #
3. Yizahi ◴[] No.43826558[source]
A simple program in the calculator can provide the correct math answer, hence I conclude that my Casio can "reason" and "understand" maths.

You have redefined the words "reason" and "understand" to include a lot of states which most of the population would call neither reasoning nor understanding. Under those arbitrary definitions, yes, you are right. I simply disagree that producing a correct math answer is in any way reasoning, especially given how LLMs function.

replies(1): >>43827294 #
4. CamperBob2 ◴[] No.43827294{3}[source]
> A simple program in the calculator can provide the correct math answer, hence I conclude that my Casio can "reason" and "understand" maths.

Cool, we're done here.