Reasoning models reason well, until they don't

(arxiv.org)

214 points optimalsolver | 2 comments | 31 Oct 25 09:23 UTC

1. kordlessagain ◴[31 Oct 25 13:03 UTC] No.45771531[source]▶

>>45769971 (OP) #

What specific reasoning capabilities matter for what real-world applications?

Nobody knows.

Moreover, nobody talks about that because it's boring and non-polarizing. Instead, supposedly smart people post stupid comments that prevent anyone from understanding that this paper is worthless.

The paper is worthless because it has a click-bait title. Blog posts get voted down for that, why not this?

The implicit claim is worthless. Failure to navigate a synthetic graph == failure to solve real world problems. False.

Absolutely no connection to real-world examples. Just losing the model in endless graphs.

replies(1): >>45775451 #

2. wavemode ◴[31 Oct 25 18:58 UTC] No.45775451[source]▶

>>45771531 (TP) #

> The implicit claim is worthless. Failure to navigate a synthetic graph == failure to solve real world problems. False.

This statement is the dictionary definition of attacking a strawman.

Every new model that is sold to us is sold on the basis that it performs better than the old model on synthetic benchmarks. This paper presents a different benchmark that those same LLMs perform much worse on.

You can certainly criticize the methodology if the authors have erred in some way, but I'm not sure why it's hard to understand the relevance of the topic itself. If benchmarks are so worthless then go tell that to the LLM companies.