
214 points optimalsolver | 1 comments | source
My_Name ◴[] No.45770715[source]
I find that they know what they know fairly well, but if you move beyond that, into what can be reasoned from what they know, they show a profound inability to do so. They are good at repeating their training data, not thinking about it.

The problem, I find, is that they then don't stop or say they don't know (unless explicitly prompted to do so); they just make stuff up and express it with just as much confidence.

replies(9): >>45770777 #>>45770879 #>>45771048 #>>45771093 #>>45771274 #>>45771331 #>>45771503 #>>45771840 #>>45778422 #
usrbinbash ◴[] No.45771503[source]
> They are good at repeating their training data, not thinking about it.

Which shouldn't come as a surprise, considering that this is, at its core, what language models do: generate sequences that are statistically likely according to their training data.
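
A minimal sketch of that loop in Python (the vocabulary and the toy probability function here are made up for illustration; a real model computes the distribution from learned parameters rather than returning a uniform one):

    import random

    VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

    def toy_next_token_probs(context):
        # Stand-in for a trained model: returns P(next token | context).
        # A real LLM derives this distribution from its weights; here it
        # is just uniform over a tiny vocabulary.
        return [1.0 / len(VOCAB)] * len(VOCAB)

    def generate(prompt, max_new_tokens=10):
        tokens = list(prompt)
        for _ in range(max_new_tokens):
            probs = toy_next_token_probs(tokens)           # condition on the full context so far
            nxt = random.choices(VOCAB, weights=probs)[0]  # sample a "statistically likely" next token
            if nxt == "<eos>":
                break
            tokens.append(nxt)
        return tokens

    print(generate(["the", "cat"]))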

replies(1): >>45772607 #
dymk ◴[] No.45772607[source]
This is too much of an oversimplification of how an LLM works. I hope the meme that they are just next token predictors dies out soon, before it becomes a permanent fixture of incorrect but oft-repeated “common sense”. They’re not Markov chains.
replies(3): >>45772668 #>>45772674 #>>45780675 #
adastra22 ◴[] No.45772668[source]
They are next token predictors though. That is literally what they are. Nobody is saying they are simple Markov chains.
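
To make the distinction concrete, here is a toy contrast (names and data invented for illustration): an order-1 Markov chain picks the next word from a count table keyed only on the previous word, while an LLM-style predictor exposes the same next-token interface but conditions on the entire context through learned parameters.

    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat and the dog sat on the rug".split()

    # Order-1 Markov chain: the next-token distribution depends ONLY on
    # the previous token, via a plain count table.
    bigram_counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigram_counts[prev][nxt] += 1

    def markov_next(context):
        counts = bigram_counts[context[-1]]                 # one token of state
        words, weights = zip(*counts.items())
        return random.choices(words, weights=weights)[0]

    def llm_style_next(context):
        # Same interface (context in, next token out), but a real model
        # computes the distribution from the WHOLE context with learned
        # weights. This placeholder merely favors words not yet seen, to
        # show that the entire history is available to it.
        unseen = [w for w in sorted(set(corpus)) if w not in context]
        return random.choice(unseen or sorted(set(corpus)))

    ctx = "the cat sat on".split()
    print("markov:", markov_next(ctx), "| llm-style:", llm_style_next(ctx))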
replies(1): >>45775953 #
dymk ◴[] No.45775953[source]
It’s a uselessly reductive statement. A person at a keyboard is also a next token predictor, then.
replies(3): >>45776192 #>>45776258 #>>45778151 #
adastra22 ◴[] No.45778151[source]
Yes. That's not the devastating takedown you think it is. Are you positing that people have souls? If not, then yes: human chain-of-thought is the equivalent of next token prediction.