
170 points by PaulHoule | 4 comments
measurablefunc
There is a formal extensional equivalence between Markov chains and LLMs, but the only person who seems to be saying anything about this is Gary Marcus. He constantly makes the point that symbolic understanding cannot be reduced to a probabilistic computation: regardless of how large the graph gets, it will still be missing basic machinery like backtracking (which is built into programming languages like Prolog). I think Gary is right on basically all counts. Probabilistic generative models are fun, but no amount of probabilistic sequence generation can substitute for logical reasoning.
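To make the backtracking point concrete, here's a minimal sketch of the try/fail/undo search that Prolog gives you for free (the 4-queens puzzle is just an illustrative stand-in, not anything from Marcus's writing):

```python
# Minimal backtracking search, the kind of machinery Prolog builds in:
# commit to a choice, recurse, and on failure undo it and try the next.
def queens(n, placed=()):
    """Place n non-attacking queens; placed[r] is the column of row r."""
    if len(placed) == n:
        return placed                      # success: a full assignment
    row = len(placed)
    for col in range(n):
        ok = all(col != c and abs(col - c) != row - r
                 for r, c in enumerate(placed))
        if ok:
            result = queens(n, placed + (col,))
            if result is not None:
                return result              # a deeper choice worked
    return None                            # dead end: backtrack to caller

print(queens(4))  # -> (1, 3, 0, 2)
```

A sampler can emit plausible partial placements, but it has no native notion of "this branch failed, rewind and try another," which is the gap being pointed at.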
Certhas
I don't understand what point you're hinting at.

Either way, I can get arbitrarily good approximations of arbitrary nonlinear differential/difference equations using only linear probabilistic evolution, at the cost of a (much) larger state space. So if you can implement it in a brain or on a computer, there is a sufficiently large probabilistic dynamic that can model it. More really is different.

So I view all deductive ab-initio arguments about what LLMs can/can't do due to their architecture as fairly baseless.

(Note that the "large" here is doing a lot of heavy lifting. You need _really_ large. See https://en.m.wikipedia.org/wiki/Transfer_operator)
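A toy version of that construction, via Ulam's method (a finite-rank approximation of the transfer operator linked above): replace the nonlinear logistic map with a linear Markov chain over a discretized state space. The bin count and sample size below are arbitrary choices for illustration.

```python
import numpy as np

# Approximate the nonlinear logistic map x -> 4x(1-x) on [0, 1] by a
# linear Markov chain over N bins (Ulam's method): estimate transition
# probabilities by pushing sample points from each bin through the map.
N = 200
samples_per_bin = 1000
rng = np.random.default_rng(0)
edges = np.linspace(0.0, 1.0, N + 1)

P = np.zeros((N, N))                      # row-stochastic transition matrix
for i in range(N):
    x = rng.uniform(edges[i], edges[i + 1], samples_per_bin)
    y = 4.0 * x * (1.0 - x)               # the nonlinear step
    j = np.clip((y * N).astype(int), 0, N - 1)
    np.add.at(P[i], j, 1.0)               # count landings per target bin
P /= P.sum(axis=1, keepdims=True)

# From here on everything is linear: a distribution over bins evolves by
# matrix multiplication alone, yet it tracks the nonlinear dynamics.
p = np.zeros(N)
p[N // 4] = 1.0                           # point mass near x = 0.125
for _ in range(3):
    p = p @ P
```

Driving N up makes the approximation as good as you like, which is exactly the "really large" caveat.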

awesome_dude
I think that the difference can be best explained thus:

I guess that you are most likely going to have cereal for breakfast tomorrow, and I guess that it's because it's your favourite.

vs

I understand that you don't like cereal for breakfast, and I understand that you only have it every day because a doctor told you it was the only way for you to start the day in a way that aligns with your health and dietary needs.

Meaning: I can guess based on past behaviour and be right, but understanding the reasoning behind those choices is a whole other ballgame. Further, if we do end up with an AI that actually understands, that would really open up creativity and problem solving.

quantummagic
How are the two cases you present fundamentally different? Aren't they both the same _type_ of knowledge? Why do you attribute "true understanding" to the case of knowing what the doctor said? Why stop there? Isn't true understanding knowing why we trust what the doctor said (all those years of schooling, a presumption of competence, etc.)? And why stop there? Why do we value years of schooling? Understanding can always be taken to a deeper level, but does that mean we didn't "truly" understand earlier? And aren't the data structures needed to encode the knowledge exactly the same in both cases you presented?
awesome_dude
When you ask that question, why don't you just use a corpus of previous answers to get some result?

Why do you need to ask me? Isn't a guess based on past answers good enough?

Or do you understand that you need to know more, that you need to understand the reasoning behind what's missing from that post?

quantummagic
I asked that question in an attempt to not sound too argumentative. It was rhetorical. I'm asking you to consider the fact that there isn't actually any difference between the two examples you provided. They're fundamentally the same type of knowledge. They can be represented by the same data structures.

There's _always_ something missing, left unsaid, in every example; it's the nature of language.

As for your example, the LLM can be trained to know the underlying reasons (doctor's recommendation, etc.). That knowledge is not fundamentally different from the knowledge that someone tends to eat cereal for breakfast. My question to you, was an attempt to highlight that the dichotomy you were drawing, in your example, doesn't actually exist.

awesome_dude
> They're fundamentally the same type of knowledge. They can be represented by the same data structures.

Maybe. Or maybe one is based on correlation, and the other on causation.

quantummagic
What if the causation had simply been that he enjoyed cereal for breakfast?

In either case, the results are the same, he's eating cereal for breakfast. We can know this fact without knowing the underlying cause. Many times, we don't even know the cause of things we choose to do for ourselves, let alone what others do.

On top of which, even if you think the "cause" is that the doctor told him to eat a healthy diet, do you really know the actual cause? Maybe the real cause is that the girl he fancies told him he's not in good enough shape. The doctor telling him how to get in shape is only a correlation; the real cause is his desire to win the girl.

These connections are vast and deep, but they're all essentially the same type of knowledge, representable by the same data structures.
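In case it helps, here's the sort of thing I mean by "same data structures": a toy relation store where the observed habit and the candidate causes are all just triples (the entity and relation names are made up for illustration).

```python
# Toy triple store: correlational facts and causal facts share one shape.
facts = {
    ("bob", "eats_for_breakfast", "cereal"),   # the observed regularity
    ("doctor", "recommended", "cereal"),       # one candidate cause
    ("bob", "wants_to_impress", "girl"),       # a deeper candidate cause
    ("girl", "remarked_on", "bobs_fitness"),
}

def about(entity):
    """All triples mentioning an entity, causal and correlational alike."""
    return {f for f in facts if entity in f}

print(sorted(about("cereal")))
```

Nothing in the representation distinguishes the "cause" edges from the "habit" edge; any such distinction is just one more triple.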

awesome_dude
> In either case, the results are the same, he's eating cereal for breakfast. We can know this fact without knowing the underlying cause. Many times, we don't even know the cause of things we choose to do for ourselves, let alone what others do.

Yeah, no.

Understanding the causation allows the system to provide a better answer.

If they "enjoy" cereal, what about it do they enjoy, and what other possible things could be had for breakfast that would also satisfy that enjoyment?

You'll never find that by looking only at the fact that they have eaten cereal for breakfast.

And the fact that that's not obvious to you is why I can't be bothered going into any more depth on the topic. It's clear that you don't have any understanding of it beyond a superficial glance.

Bye :)