
170 points | PaulHoule | 2 comments
measurablefunc ◴[] No.45120049[source]
There is a formal extensional equivalence between Markov chains & LLMs, but the only person who seems to be saying anything about this is Gary Marcus. He is constantly making the point that symbolic understanding cannot be reduced to a probabilistic computation: regardless of how large the graph gets, it will still be missing basic capabilities like backtracking (which is available in programming languages like Prolog). I think that Gary is right on basically all counts. Probabilistic generative models are fun, but no amount of probabilistic sequence generation can be a substitute for logical reasoning.
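To make the backtracking point concrete, here's a minimal sketch in Python (toy data, invented purely for illustration, nothing to do with any real model): a Markov sampler only ever draws the next token and can never revisit a choice, while a Prolog-style backtracking search undoes a failed assignment and tries the next alternative.

    import random

    # Forward-only probabilistic generation: a toy bigram Markov chain.
    # Each step is a single weighted draw; a choice is never revised.
    chain = {
        "the": {"cat": 0.5, "dog": 0.5},
        "cat": {"sat": 1.0},
        "dog": {"ran": 1.0},
    }

    def sample(start, steps):
        word, out = start, [start]
        for _ in range(steps):
            dist = chain.get(word)
            if not dist:
                break
            word = random.choices(list(dist), weights=list(dist.values()))[0]
            out.append(word)
        return " ".join(out)

    # Prolog-style backtracking: graph coloring by depth-first search.
    # On failure the last assignment is undone and the next option tried.
    def color(graph, colors, assignment=None):
        assignment = {} if assignment is None else assignment
        if len(assignment) == len(graph):
            return assignment
        node = next(n for n in graph if n not in assignment)
        for c in colors:
            if all(assignment.get(nb) != c for nb in graph[node]):
                assignment[node] = c
                if color(graph, colors, assignment):
                    return assignment
                del assignment[node]  # backtrack: undo, try the next color
        return None

    print(sample("the", 3))
    print(color({"a": ["b"], "b": ["a", "c"], "c": ["b"]}, ["red", "green"]))

The sampler has no notion of "that branch failed, undo it"; the search does, and that's exactly the capability being pointed at.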
replies(16): >>45120249 #>>45120259 #>>45120415 #>>45120573 #>>45120628 #>>45121159 #>>45121215 #>>45122702 #>>45122805 #>>45123808 #>>45123989 #>>45125478 #>>45125935 #>>45129038 #>>45130942 #>>45131644 #
Anon84 ◴[] No.45120628[source]
There definitely is, but Marcus is not the only one talking about it. For example, we covered this paper in one of our internal journal clubs a few weeks ago: https://arxiv.org/abs/2410.02724
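For anyone who hasn't read it: the core construction there (as I understand it) is that an autoregressive model with a finite vocabulary and a bounded context window is extensionally a finite-state Markov chain over truncated contexts. A toy sketch of that reduction, with a uniform stand-in for the model's next-token distribution and made-up vocabulary/window sizes:

    from itertools import product

    V = ["a", "b"]  # toy vocabulary (illustration only)
    K = 2           # context window length

    def p_next(context):
        # Stand-in for the LLM's next-token distribution given a context;
        # a real model would return softmax(logits) here.
        return {tok: 1.0 / len(V) for tok in V}

    def transition(state):
        # Markov kernel: append a token, then keep only the last K tokens.
        return {(state + (tok,))[-K:]: p for tok, p in p_next(state).items()}

    # The chain's finite state space: all contexts of length 1..K.
    states = [s for n in range(1, K + 1) for s in product(V, repeat=n)]
    for s in states:
        print(s, "->", transition(s))

The catch is that the state space grows like |V|^K, which is why the equivalence is extensional rather than a practical recipe.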
replies(1): >>45122852 #
godelski ◴[] No.45122852[source]
I just want to highlight this comment and stress how big a field ML actually is. I think it's even bigger than most people in ML research realize. It's really unfortunate that the hype has grown so much that, even within the research community, these areas are being overshadowed and even dismissed[0]. It's been interesting watching this evolution and how we're reapproaching symbolic reasoning while avoiding that phrase.

There are lots of people doing theory in ML, and a lot of them are making strides that others stand on (ViT and DDPM are great examples of this). I never expect these works to get into the public eye, as the barrier to entry tends to be much higher[1], but they certainly should be something more ML researchers are looking at.

That is to say: Marcus is far from alone. He's just loud.

[0] I'll never let go of how Yi Tay said "fuck theorists" and then spent his time on Twitter calling the KAN paper garbage instead of making any actual critique. There seem to be too many who are happy to let the black box remain a black box because low-level research has yet to accumulate to the point where it can fully explain an LLM.

[1] You get tons of comments like this (the math being referenced is comparatively basic, even if more advanced than what most people are familiar with): https://news.ycombinator.com/item?id=45052227

replies(2): >>45124050 #>>45125796 #
1. voidhorse ◴[] No.45125796[source]
That linked comment was so eye-opening. It suddenly made sense to me why people who are presumably technical, and thus shouldn't even be entertaining the notion that LLMs reason (and who should further realize that the use and choice of this term was pure marketing strategy), are giving it the time of day. When so many of the enthusiasts can't even get enough math under their belt to understand basic claims, it's no wonder the industry is a complete circus right now.
replies(1): >>45129443 #
2. godelski ◴[] No.45129443[source]
Let me introduce you to one of X's former staff members arguing that there is no such thing as deep knowledge or expertise[0].

I would love to tell you that I don't meet many people working in AI who share this sentiment, but I'd be lying.

And just for fun, here's a downvoted comment of mine, despite my follow-up comments that evidence my point being upvoted[1] (I got a bit pissed in that last one). The point here is that most people don't want to hear the truth; they're just glossing over things. But I think the two biggest things I've learned from the modern AI movement are: 1) gradient descent and scale are far more powerful than I thought, and 2) I now understand how used-car salesmen are so effective, even on people I once thought smart. People love their sycophants...

I swear, we're going to make AGI not by making the AI smarter but by making the people dumber...

[0] https://x.com/yacineMTB/status/1836415592162554121

[1] https://news.ycombinator.com/item?id=45122931