The wall confronting large language models

(arxiv.org)


measurablefunc ◴[03 Sep 25 20:29 UTC] No.45120049[source]▶

There is a formal extensional equivalence between Markov chains & LLMs, but the only person who seems to be saying anything about this is Gary Marcus. He is constantly making the point that symbolic understanding cannot be reduced to a probabilistic computation: regardless of how large the graph gets, it will still be missing basic stuff like backtracking (which is available in programming languages like Prolog). I think that Gary is right on basically all counts. Probabilistic generative models are fun, but no amount of probabilistic sequence generation can be a substitute for logical reasoning.
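
One way to read the equivalence claim, as a rough sketch rather than anything from the comment itself: a next-token model with a fixed context window of K tokens defines a Markov chain whose states are the possible windows. The `next_token_probs` function below is a hypothetical toy stand-in for a model, and the two-token vocabulary and K = 2 are assumptions chosen only to keep the state space printable.

```python
from itertools import product

VOCAB = ("a", "b")   # assumed toy vocabulary
K = 2                # assumed context window length

def next_token_probs(context):
    """Hypothetical stand-in for a model: P(next token | last K tokens)."""
    last = context[-1]
    # Toy rule: favour repeating the most recent token.
    return {tok: (0.7 if tok == last else 0.3) for tok in VOCAB}

def transition_matrix():
    """Build the Markov chain whose states are all length-K contexts."""
    states = list(product(VOCAB, repeat=K))
    index = {s: i for i, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        for tok, p in next_token_probs(s).items():
            nxt = s[1:] + (tok,)          # slide the window by one token
            P[index[s]][index[nxt]] += p
    return states, P

states, P = transition_matrix()
for s, row in zip(states, P):
    print("".join(s), row)
```

The number of states is `len(VOCAB) ** K`, which is why the graph becomes astronomically large for realistic vocabularies and context lengths.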

replies(16): >>45120249 #>>45120259 #>>45120415 #>>45120573 #>>45120628 #>>45121159 #>>45121215 #>>45122702 #>>45122805 #>>45123808 #>>45123989 #>>45125478 #>>45125935 #>>45129038 #>>45130942 #>>45131644 #

Certhas ◴[03 Sep 25 20:48 UTC] No.45120259[source]▶

>>45120049 #

I don't understand what point you're hinting at.

Either way, I can get arbitrarily good approximations of arbitrary nonlinear differential/difference equations using only linear probabilistic evolution at the cost of a (much) larger state space. So if you can implement it in a brain or a computer, there is a sufficiently large probabilistic dynamic that can model it. More really is different.

So I view all deductive ab-initio arguments about what LLMs can/can't do due to their architecture as fairly baseless.

(Note that the "large" here is doing a lot of heavy lifting. You need _really_ large. See https://en.m.wikipedia.org/wiki/Transfer_operator)
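
A minimal sketch of the kind of construction the transfer-operator link describes, under assumptions not in the comment: the nonlinear logistic map x -> 4x(1-x) is replaced by a row-stochastic matrix over `n_bins` cells of [0, 1] (Ulam's method), so that a probability vector evolves by purely linear updates. The choice of map, the bin count, and the names `ulam_matrix` and `step` are illustrative only, and the approximation is coarse at this resolution.

```python
def logistic(x):
    """Nonlinear map on [0, 1]: x -> 4x(1 - x)."""
    return 4.0 * x * (1.0 - x)

def ulam_matrix(n_bins, samples_per_bin=200):
    """Row-stochastic matrix approximating the map's transfer operator."""
    P = [[0.0] * n_bins for _ in range(n_bins)]
    for i in range(n_bins):
        for k in range(samples_per_bin):
            # Evenly spaced sample point inside bin i.
            x = (i + (k + 0.5) / samples_per_bin) / n_bins
            # Index of the bin the point lands in after one step of the map.
            j = min(int(logistic(x) * n_bins), n_bins - 1)
            P[i][j] += 1.0 / samples_per_bin
    return P

def step(dist, P):
    """One linear update of a probability vector: dist' = dist @ P."""
    n = len(P)
    out = [0.0] * n
    for i, mass in enumerate(dist):
        if mass:
            for j in range(n):
                out[j] += mass * P[i][j]
    return out

n = 50
P = ulam_matrix(n)
dist = [1.0 / n] * n      # start from a uniform density
for _ in range(100):
    dist = step(dist, P)
```

Raising `n_bins` refines the approximation at the price of a larger state space, which is the trade-off described above. The evolved vector should put far more mass in the last bin than in a middle one, in line with the map's known invariant density 1/(π√(x(1−x))).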

replies(5): >>45120313 #>>45120341 #>>45120344 #>>45123837 #>>45124441 #

arduanika ◴[03 Sep 25 20:54 UTC] No.45120313[source]▶

>>45120259 #

What hinting? The comment was very clear. Arbitrarily good approximation is different from symbolic understanding.

"if you can implement it in a brain"

But we didn't. You have no idea how a brain works. Neither does anyone.

replies(3): >>45120357 #>>45120411 #>>45121006 #

1. mallowdram ◴[03 Sep 25 21:04 UTC] No.45120411{3}[source]▶

>>45120313 #

We know the healthy brain is unpredictable. We suspect error minimization and prediction are not central tenets. We know the brain creates memory via differences in sharp wave ripples. That it's oscillatory. That it neither uses symbols nor represents. That words are wholly external to what we call thought. The authors deal with molecules which are neither arbitrary nor specific. Yet tumors ARE specific, while words are wholly arbitrary. Knowing these things should offer a deep suspicion of ML/LLMs. They have so little to do with how brains work and the units brains actually use (all oscillation is specific, all stats emerge from arbitrary symbols and worse: metaphors) that mistaking LLMs for reasoning/inference is less lexemic hallucination and more eugenic.

replies(3): >>45120774 #>>45120824 #>>45124688 #

2. Zigurd ◴[03 Sep 25 21:52 UTC] No.45120774[source]▶

>>45120411 (TP) #

"That words are wholly external to what we call thought." may be what we should learn, or at least hypothesize, based on what we see LLMs doing. I'm disappointed that AI isn't more of a laboratory for understanding brain architecture, and precisely what is this thing called thought.

replies(1): >>45121279 #

3. quantummagic ◴[03 Sep 25 21:58 UTC] No.45120824[source]▶

>>45120411 (TP) #

What do you think about the idea that LLMs are not reasoning/inferring, but are rather an approximation of the result? Just like you yourself might have to spend some effort reasoning about how a plant grows in order to answer questions about that subject. When asked, you wouldn't replicate that reasoning; instead you would recall the crystallized representation of the knowledge you accumulated while previously reasoning/learning. The "thinking" in the process isn't modelled by the LLM data, but rather by the code/strategies used to iterate over this crystallized knowledge and present it to the user.

replies(1): >>45121309 #

4. mallowdram ◴[03 Sep 25 22:59 UTC] No.45121279[source]▶

>>45120774 #

The question is how to model the irreducible. And then to concatenate between spatiotemporal neuroscience (the oscillators) and neural syntax (what's oscillating) and add or subtract what the fields are doing to bind that to the surroundings.

5. mallowdram ◴[03 Sep 25 23:03 UTC] No.45121309[source]▶

>>45120824 #

This is the toughest part. We need some kind of analog external that concatenates. It's software, but not necessarily binary; it uses topology to express that analog. It somehow is visual, i.e. you can see it, but at the same time it can be expanded specifically into syntax, the details of which are invisible. Scale invariance is probably key.

6. suddenlybananas ◴[04 Sep 25 07:44 UTC] No.45124688[source]▶

>>45120411 (TP) #

We don't know those things about the brain. I don't know why you keep going around HN making wildly false claims about the state of contemporary neuroscience. We know very very little about how higher order cognition works in the brain.

replies(1): >>45126938 #

7. mallowdram ◴[04 Sep 25 13:17 UTC] No.45126938[source]▶

>>45124688 #

Of course we know these things about the brain, and who said anything about higher order cognition? I'd stay current, you seem to be a legacy thinker. I'll needle drop ONE of the references re: unpredictability and brain health, there are about 30, just to keep you in your corner. The rest you'll have to hunt down, but please stop pretending you know what you're talking about.

Your line of attack, which is to dismiss from a pretend point of certainty rather than from inquiry and curiosity, seems indicative of the cog-sci/engineering problem in general. There's an imposition based in intuition/folk psychology that suffuses the industry. The field doesn't remain curious about new discoveries in neurobiology, which supplants psychology (psychology is being-based, neuro is neural-based). What this does is remove the intent of rhetoric/being and suggest brains built our external communication. The question is how and by what regularities. Cog-sci has no grasp of that in the slightest.

https://pubmed.ncbi.nlm.nih.gov/38579270/

replies(1): >>45137064 #

8. suddenlybananas ◴[05 Sep 25 10:35 UTC] No.45137064{3}[source]▶

>>45126938 #

Your writing reminds me of a schizophrenic.
