108 points bertman | 13 comments
n4r9 ◴[] No.43819695[source]
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I guess LLM advocates might dispute.

Firstly:

> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.

To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Secondly:

> Theories are developed by doing the work and LLMs do not do the work

Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?

replies(6): >>43819742 #>>43821151 #>>43821318 #>>43822444 #>>43822489 #>>43824220 #
ryandv ◴[] No.43821318[source]
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

This idea has already been explored by thought experiments such as John Searle's so-called "Chinese room" [0]; an LLM cannot have a theory about a program, any more than the computer in Searle's "Chinese room" understands "Chinese" by using lookup tables to generate canned responses to an input prompt.

One says the computer lacks "intentionality" regarding the topics that the LLM ostensibly appears to be discussing. Their words aren't "about" anything, they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.

[0] https://en.wikipedia.org/wiki/Chinese_room

replies(6): >>43821648 #>>43822082 #>>43822399 #>>43822436 #>>43824251 #>>43828753 #
CamperBob2 ◴[] No.43821648[source]
You're seriously still going to invoke the Chinese Room argument after what we've seen lately? Wow.

The computer understands Chinese better than Searle (or anyone else) understood the nature and functionality of language.

replies(1): >>43821684 #
ryandv ◴[] No.43821684[source]
You're seriously going to invoke this braindead reddit-tier of "argumentation," or rather lack thereof, by claiming bewilderment and offering zero substantive points?

Wow.

replies(1): >>43821955 #
CamperBob2 ◴[] No.43821955[source]
Yes, because the Chinese Room was a weak test the day it was proposed, and it's a heap of smoldering rhetorical wreckage now. It's Searle who failed to offer any substantive points.

How do you know you're not arguing with an LLM at the moment? You don't... any more than I do.

replies(1): >>43821978 #
ryandv ◴[] No.43821978[source]
> How do you know you're not arguing with an LLM at the moment? You don't.

I wish I was right now. It would probably provide at least the semblance of greater insight into these topics.

> the Chinese Room was a weak test the day it was proposed

Why?

replies(2): >>43822163 #>>43822171 #
1. nullstyle ◴[] No.43822171[source]
It’s a crappy thought experiment free from the constraints of any reality, and given that these fancy lookup tables understand most languages better than I do, it doesn’t hold water. Thought experiments aren’t science.
replies(4): >>43822317 #>>43822388 #>>43822574 #>>43823211 #
2. ryandv ◴[] No.43822317[source]
> these fancy lookup tables understand most languages better than I do

I see. So if I gave you a full set of those lookup tables, a whole library full, and a set of instructions for their usage... you would now understand the world's languages?

3. Jensson ◴[] No.43822388[source]
If the Chinese room tells you "I just left the train, see you in 5 minutes", what do you think the Chinese room is trying to convey? Do you think it knows what it just said? LLMs say such things all the time if you don't RLHF them to stop, so why do you think they wouldn't be just as clueless about other things?
replies(1): >>43823848 #
4. psychoslave ◴[] No.43822574[source]
>Thought experiments arent science.

By that standard we should have dropped many of the cutting-edge theories ever produced in science. It took about a century between some of Einstein’s thought experiments and any possibility of challenging them experimentally.

And while Lucretius’ idea of the atom was very different from the one we kept in the standard model, it actually put the concept on the table a couple of thousand years before it could be falsified experimentally.

It looks like you should seriously consider expanding your epistemological knowledge if you want to contribute more relevantly to the topic.

https://bigthink.com/surprising-science/einstein-is-right-ag...

5. emorning3 ◴[] No.43823211[source]
>>Thought experiments aren’t science.<<

Thought experiments provide conclusions based on deductive or inductive reasoning from their starting assumptions.

Thought experiments are proofs.

That's science.

replies(1): >>43839561 #
6. CamperBob2 ◴[] No.43823848[source]
If you ask an LLM to do some math, what happens is interesting.

Simple arithmetic ("What is 2+2") is obviously going to be well-represented in the training data, so the model will simply regurgitate "4."

For more advanced questions like "What are the roots of 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x = 0?", the model will either yield random nonsense as GPT-4o did, or write a Python script and execute it to return the correct answer(s) as o4-mini-high did: https://chatgpt.com/share/680fb812-76b8-800b-a19e-7469cbcc43...
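
(The shared transcript doesn't reproduce the script itself, but a guess at what such a tool call looks like, using numpy's polynomial root finder, would be something along these lines:)

    # Hypothetical sketch of the kind of script the model writes and runs;
    # numpy.roots takes the coefficients in descending order of degree.
    import numpy as np

    # 14.338x^5 + 4.005x^4 + 3.332x^3 - 99.7x^2 + 120x + 0 = 0
    coeffs = [14.338, 4.005, 3.332, -99.7, 120, 0]
    print(np.roots(coeffs))  # x = 0 plus the remaining real/complex roots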

Now, give the model an intermediate arithmetic problem, one that isn't especially hard but also isn't going to be in-distribution ("If a is 3 and b is 11.4, what is the fourth root of a*b?").

How would YOU expect the operator of a Chinese Room to respond to that?

Here's how GPT-4o responded: https://chatgpt.com/share/680fb616-45e0-800b-b592-789f3f8c58...

Now, that's not a great answer; it's clearly an imprecise estimate. But it's more or less right, and the fact that it isn't a perfect answer suggests that the model didn't somehow cheat. A similar but easier problem would almost certainly have been answered correctly. Where did that answer come from, if the model doesn't "understand" the math to a nontrivial extent?
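
(For reference, the exact value is easy to check with a one-liner. This is just a sanity check, not anything the model ran:)

    # Exact value of the fourth root of a*b for a = 3, b = 11.4
    a, b = 3, 11.4
    print((a * b) ** 0.25)  # ~2.4183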

If it can "understand" basic high-school math, what else can it "understand?" What exactly are the limits of what a transformer can "understand" without resorting to web search or tool use?

An adherent of Searle's argument is going to have a terrible time explaining phenomena like this... and it's only going to get worse for them over time.

replies(2): >>43823953 #>>43824295 #
7. Jensson ◴[] No.43823953{3}[source]
> If it can "understand" basic high-school math, what else can it "understand?" What exactly are the limits of what a transformer can "understand" without resorting to web search or tool use?

It is basically a grammar machine: it mostly understands stuff that can be encoded as a grammar. That is extremely inefficient for math, but it can do it, and that gives you a really simple way to figure out what it can and can't do.

Knowing this, LLMs never really surprised me. You can encode a ton of stuff as grammars, but that is still never going to be enough given how inefficient grammars are at lots of things. But when you have a grammar the size of many billions of bytes, you can do quite a lot with it.

replies(1): >>43824405 #
8. Yizahi ◴[] No.43824295{3}[source]
It is amusing that you have picked maths as an example of neural nets "reasoning". When an operator asks an NN to answer some simple math problem like 17+58 and then asks it for the "reasoning" or steps it used to calculate that, the NN will generate complete bullshit: it will describe the algorithm humans use in school, all that summing of corresponding digits, carrying the 1 and so on, while in reality that same NN performed completely different steps to get the result.

This is even outlined in the document below, written by the NN authors themselves. Basically, all the so-called "reasoning" by LLMs is simply more generated bullshit on top of the generated answer to a query. But it often looks very believable, and it is enough to fool people into thinking there is a spark inside the program.

==============

https://transformer-circuits.pub/2025/attribution-graphs/bio...

We were curious if Claude could articulate the heuristics that it is using, so we asked it. We computed the graph for the prompt below, attributing from 95, and found the same set of input, add, lookup table and sum features as in the shorter prompt above.

Human: Answer in one word. What is 36+59?

Assistant: 95

Human: Briefly, how did you get that?

Assistant: I added the ones (6+9=15), carried the 1, then added the tens (3+5+1=9), resulting in 95.

Apparently not!

This is a simple instance of the model having a capability which it does not have “metacognitive” insight into. The process by which the model learns to give explanations (learning to simulate explanations in its training data) and the process by which it learns to directly do something (the more mysterious result of backpropagation giving rise to these circuits) are different.

replies(1): >>43824470 #
9. CamperBob2 ◴[] No.43824405{4}[source]
Let's stick with the Chinese Room specifically for a moment.

1) The operator doesn't know math, but the Chinese books in the room presumably include math lessons.

2) The operator's instruction manual does not include anything about math, only instructions for translation using English and Chinese vocabulary and grammar.

3) Someone walks up and hands the operator the word problem in question, written in Chinese.

Does the operator succeed in returning the Chinese characters corresponding to the equation's roots? Remember, he doesn't even know he's working on a math problem, much less how to solve it himself.

As humans, you and I were capable of reading high-school math textbooks by the time we reached the third or fourth grade. Just being able to read the books, though, would not have taught us how to attack math problems that were well beyond our skill level at the time.

So much for grammar. How can a math problem be solved by someone who not only doesn't understand math, but doesn't even understand the language the question is written in? Searle's proposal only addresses the latter: language can indeed be translated symbolically. Wow, yeah, thanks for that insight. Meanwhile, to arrive at the right answers, an understanding of the math must exist somewhere... but where?

My position is that no, the operator of the Room could not have arrived at the answer to the question that the LLM succeeded (more or less) at solving.

replies(1): >>43825702 #
10. CamperBob2 ◴[] No.43824470{4}[source]
Who, exactly, said that reasoning requires introspection? The proof of reasoning is in the result. If you don't understand the math, you won't come anywhere near the correct answer.

That's kind of the idea behind math: you can't bullshit your way through a math exam. Therefore, it is nonsensical to continue to insist that LLMs are incapable of genuine understanding. They understand math well enough to solve novel math problems without cheating, even if they can't tell you how they understand it. That part will presumably happen soon enough.

Edit: for values of "soon enough" equal to "right now": https://chatgpt.com/share/680fcdd0-d7ec-800b-b8f5-83ed8c0d0f... All the paper you cited proves is that if you ask a crappy model, you get a crappy answer.

replies(1): >>43826558 #
11. Jensson ◴[] No.43825702{5}[source]
> Meanwhile, to arrive at the right answers, an understanding of the math must exist somewhere... but where?

In the grammar: you can have rules like "1 + 1 =" must be followed by "2", etc. Then add a lot of dependency rules, like the "He" in "He did X" depending on something in a previous sentence, and rules such that "1 plus 1" translates to "1 + 1" and "add 1 to 1" is also "1 + 1", and now you have a machine that can do very complex things.

Then you take such a grammar machine and train it on all the text humans have ever written, and it learns a lot of such grammar structures, and can thus parse and solve some basic math problems, since the solutions to them are part of the grammar it learned.

Such a machine is still unable to solve anything outside of the grammar it has learned. But it is still very useful: pose a question in a way that makes it easy to parse, and that has a lot of such grammar dependencies you know it can handle, and it will almost always output the right response.
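
(A deliberately crude sketch of that picture: a handful of hard-coded rewrite rules standing in for the billions of learned ones, nothing like an actual transformer, just to illustrate the in-grammar versus out-of-grammar distinction.)

    # Toy "grammar machine": pattern rewrites plus "must be followed by" rules.
    # It answers whatever its patterns cover and nothing else.
    import re

    REWRITES = [
        (r"(\d+) plus (\d+)", r"\1 + \2"),    # "1 plus 1"   -> "1 + 1"
        (r"add (\d+) to (\d+)", r"\2 + \1"),  # "add 1 to 1" -> "1 + 1"
    ]
    ANSWERS = {"1 + 1": "2", "2 + 2": "4"}    # "1 + 1 =" must be followed by "2"

    def respond(question):
        for pattern, repl in REWRITES:
            question = re.sub(pattern, repl, question)
        return ANSWERS.get(question, "<outside the learned grammar>")

    print(respond("add 1 to 1"))   # -> 2
    print(respond("17 plus 58"))   # -> <outside the learned grammar>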

12. Yizahi ◴[] No.43826558{5}[source]
A simple program in a calculator can provide the correct math answer, hence I conclude that my Casio can "reason" and "understand" maths.

You have redefined the words "reason" and "understand" to include a lot of states which most of the population wouldn't call either reasoning or understanding. Under those arbitrary definitions, yes, you are right. I just disagree that producing a correct math answer can in any way be called reasoning, especially given how LLMs function.

replies(1): >>43827294 #
13. CamperBob2 ◴[] No.43827294{6}[source]
A simple program in a calculator can provide the correct math answer, hence I conclude that my Casio can "reason" and "understand" maths.

Cool, we're done here.