
214 points optimalsolver | 10 comments
My_Name ◴[] No.45770715[source]
I find that they know what they know fairly well, but if you move beyond that, into what can be reasoned from what they know, they show a profound inability to do so. They are good at repeating their training data, not thinking about it.

The problem, I find, is that they then don't stop, or say they don't know (unless explicitly prompted to do so); they just make stuff up and express it with just as much confidence.

replies(9): >>45770777 #>>45770879 #>>45771048 #>>45771093 #>>45771274 #>>45771331 #>>45771503 #>>45771840 #>>45778422 #
1. usrbinbash ◴[] No.45771503[source]
> They are good at repeating their training data, not thinking about it.

Which shouldn't come as a surprise, considering that this is, at the core of things, what language models do: Generate sequences that are statistically likely according to their training data.
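
For what it's worth, a minimal sketch of what "statistically likely according to their training data" means in practice; the `next_token_distribution` interface here is purely illustrative, not any real library's API:

    import random

    # Toy autoregressive loop: at every step the model assigns a probability
    # to each candidate next token given the full context so far, and one
    # token is sampled from that distribution.
    def generate(model, prompt_tokens, max_new_tokens=20):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            probs = model.next_token_distribution(tokens)  # {token: probability}
            next_token = random.choices(list(probs), weights=list(probs.values()))[0]
            tokens.append(next_token)
        return tokens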

replies(1): >>45772607 #
2. dymk ◴[] No.45772607[source]
This is too much of an oversimplification of how an LLM works. I hope the meme that they are just next-token predictors dies out soon, before it becomes a permanent fixture of incorrect but often-stated “common sense”. They’re not Markov chains.
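
A rough sketch of the distinction (toy classes made up for the comparison, not any particular library): a k-th order Markov chain can only condition on a fixed window of recent tokens via a lookup table, while a transformer-style LM conditions on the entire context through a learned function.

    # Illustrative only: both interfaces below are invented for the contrast.
    class MarkovChain:
        def __init__(self, table, k):
            self.table = table  # tuple of k previous tokens -> next-token distribution
            self.k = k

        def next_token_distribution(self, tokens):
            # Can only see the last k tokens; everything earlier is forgotten.
            return self.table[tuple(tokens[-self.k:])]

    class NeuralLM:
        def __init__(self, network):
            self.network = network  # learned parameters, not a count table

        def next_token_distribution(self, tokens):
            # Conditions on the whole context at once via the learned network.
            return self.network.forward(tokens)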
replies(3): >>45772668 #>>45772674 #>>45780675 #
3. adastra22 ◴[] No.45772668[source]
They are next token predictors though. That is literally what they are. Nobody is saying they are simple Markov chains.
replies(1): >>45775953 #
4. gpderetta ◴[] No.45772674[source]
Indeed, they are next token predictors, but this is a vacuous statement because the predictor can be arbitrarily complex.
replies(1): >>45776178 #
5. dymk ◴[] No.45775953{3}[source]
It’s a uselessly reductive statement. A person at a keyboard is also a next token predictor, then.
replies(3): >>45776192 #>>45776258 #>>45778151 #
6. HarHarVeryFunny ◴[] No.45776178{3}[source]
Sure, but a complex predictor is still a predictor. It would be a BAD predictor if everything it output was not based on "what would the training data say?".

If you ask it to innovate and come up with something not in its training data, what do you think it will do ... it'll "look at" its training data and regurgitate (predict) something labelled as innovative.

You can put a reasoning cap on a predictor, but it's still a predictor.

replies(1): >>45776459 #
7. HarHarVeryFunny ◴[] No.45776192{4}[source]
Yes, but it's not ALL they are.
replies(1): >>45776451 #
8. daveguy ◴[] No.45776258{4}[source]
They are designed, trained, and evaluated by how well they can predict the next token. It's literally what they do. "Reasoning" models just build up additional context of next-token predictions, and RL is used to bias outputs toward ones more appealing to human judges. It's not a meme. It's an accurate description of their fundamental computational nature.
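
For concreteness, a minimal sketch of that training signal, assuming a PyTorch-style model that maps token ids to next-token logits (the names and shapes here are illustrative):

    import torch.nn.functional as F

    # The standard pretraining objective: score the model purely on how well
    # it predicts each next token of the training text (cross-entropy).
    def next_token_loss(model, token_ids):
        inputs = token_ids[:, :-1]    # tokens the model conditions on
        targets = token_ids[:, 1:]    # the "next token" at every position
        logits = model(inputs)        # (batch, seq_len - 1, vocab_size)
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
        )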
9. adastra22 ◴[] No.45778151{4}[source]
Yes. That's not the devastating take-down you think it is. Are you positing that people have souls? If not, then yes: human chain-of-thought is the equivalent of next token prediction.
10. Libidinalecon ◴[] No.45780675[source]
The problem is in adding the word "just" for no reason.

It turns a statement of fact into a type of rhetorical device.

It is the difference between saying "I am a biological entity" and "I am just a biological entity". There are all kinds of connotations that come along for the ride with the latter statement.

Then there is the counter with the romantic statement that "I am not just a biological entity".