
170 points PaulHoule | 1 comment
measurablefunc ◴[] No.45120049[source]
There is a formal extensional equivalence between Markov chains & LLMs, but the only person who seems to be saying anything about this is Gary Marcus. He constantly makes the point that symbolic understanding cannot be reduced to a probabilistic computation: no matter how large the graph gets, it will still be missing basic machinery like backtracking (which is built into programming languages like Prolog). I think Gary is right on basically all counts. Probabilistic generative models are fun, but no amount of probabilistic sequence generation can substitute for logical reasoning.
replies(16): >>45120249 #>>45120259 #>>45120415 #>>45120573 #>>45120628 #>>45121159 #>>45121215 #>>45122702 #>>45122805 #>>45123808 #>>45123989 #>>45125478 #>>45125935 #>>45129038 #>>45130942 #>>45131644 #
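[Editorial sketch, not from the thread: the backtracking the commenter is pointing at is the Prolog-style depth-first search below. The solver commits to a choice, recurses, and *undoes* the choice on failure; a pure next-token sampler has no native undo step. All names here are illustrative.]

```python
def backtrack(assignment, variables, domains, consistent):
    """Return a complete consistent assignment, or None if none exists."""
    if len(assignment) == len(variables):
        return dict(assignment)
    var = variables[len(assignment)]
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):
            result = backtrack(assignment, variables, domains, consistent)
            if result is not None:
                return result
        del assignment[var]  # undo the choice -- this is the backtracking step
    return None

# Toy constraint problem: three variables that must all take distinct values.
variables = ["x", "y", "z"]
domains = {"x": [0, 1], "y": [0, 1], "z": [0, 1, 2]}

def all_differ(a):
    vals = list(a.values())
    return len(vals) == len(set(vals))

solution = backtrack({}, variables, domains, all_differ)  # {'x': 0, 'y': 1, 'z': 2}
```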
logicchains ◴[] No.45120415[source]
LLMs are not formally equivalent to Markov chains; they're more powerful. Transformers with sufficient chain of thought can solve any problem in P: https://arxiv.org/abs/2310.07923.
replies(1): >>45120484 #
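[Editorial sketch of the intuition behind the cited result, not the paper's construction: each emitted token is fed back as input, so T generation steps give T sequential computation steps even though a single forward pass has fixed depth. Here a toy "model" computes parity one chain-of-thought token at a time; all names are illustrative.]

```python
def forward_pass(transcript, n):
    """One 'decoding step': a fixed function of the visible transcript
    that emits a single token (here, an updated running-parity bit)."""
    thoughts = transcript[n:]                       # tokens emitted so far
    i = len(thoughts)                               # next input bit to fold in
    prev = int(thoughts[-1]) if thoughts else 0
    return str((prev + int(transcript[i])) % 2)

def generate(bits):
    """Run n decoding steps; the last emitted token is the answer."""
    transcript = list(bits)
    n = len(bits)
    for _ in range(n):                              # n chain-of-thought steps
        transcript.append(forward_pass(transcript, n))
    return transcript[-1]

# parity of "1101" (three ones) -> "1"
```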
measurablefunc ◴[] No.45120484[source]
If you think there is a mistake in this argument then I'd like to know where it is: https://markov.dk.workers.dev/.
replies(2): >>45122838 #>>45124408 #
logicchains ◴[] No.45124408{3}[source]
It assumes the LLM only runs once, i.e. it doesn't account for chain of thought, which makes the program not memoryless.
replies(1): >>45124467 #
measurablefunc ◴[] No.45124467{4}[source]
There is no such assumption. You can run/sample the LLM & the equivalent Markov chain as many times as you want, & the logical analysis remains the same b/c the extensional equivalence between the LLM & the Markov chain has nothing to do w/ how many times trajectories are sampled from each one.
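[Editorial sketch of the equivalence being claimed, my reading of the linked argument rather than its actual code: a model with a finite vocabulary and a fixed k-token context window induces a Markov chain whose *state* is the whole window. The next-token distribution is a function of that state alone, so sampling the model and sampling the chain are the same process however many trajectories you draw. All names here are illustrative.]

```python
import random

def llm_next_token_probs(window):
    """Stand-in for a trained model: any deterministic map from a
    k-token window to a distribution over the vocabulary works here."""
    p = (window.count("a") + 1) / (len(window) + 2)
    return {"a": p, "b": 1 - p}

def markov_transition(state):
    """The induced chain: sample a token, shift the window by one."""
    probs = llm_next_token_probs(state)
    token = random.choices(list(probs), weights=list(probs.values()))[0]
    return state[1:] + (token,), token

def sample_trajectory(init_state, steps, seed):
    random.seed(seed)
    state, out = init_state, []
    for _ in range(steps):
        state, token = markov_transition(state)
        out.append(token)
    return out

# Two runs with the same seed traverse identical trajectories, because
# the transition law depends only on the current state (memorylessness).
```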