214 points optimalsolver | 2 comments
1. moritzwarhier No.45771297
From the abstract:

> some even claiming they are capable of generalized reasoning and innovation in reasoning-intensive fields such as mathematics, physics, medicine, and law. However, by more carefully scaling the complexity of reasoning problems, we show existing benchmarks actually have limited complexity

Can someone ELI5 what the definitions of reasoning and complexity are here?

I see they seem to focus on graph problems and representing problems as graph problems. But I didn't completely read the paper or understand it in depth. I skimmed some parts that seem to address this question (e.g. section 5 and the Introduction), but maybe there are simpler definitions that elude me.

Surely they don't mean "computational complexity"?

And what exactly is "reasoning"?

I'm aware of philosophical logic and strict logic that can be applied to natural language arguments.

But have we already agreed on a universal scale that grades answers to questions about the physical world? Or is this about mathematical reasoning?

Mixing all of this together always irks me when it comes to these AI "benchmarks". But apparently people see value in these?

I know my question isn't new.

To me it seems that once we leave the mathematical realm, it quickly becomes fuzzy what correct "reasoning" should be.

People can be convincing and avoid obvious logical fallacies, and still reach wrong conclusions... or conclusions that run counter to assumed goals.

replies(1): >>45771443 #
2. dcre No.45771443
Even in the mathematical/formal realm, the meaning of reasoning is not as clear as it seems. The result of the activity of reasoning may be a formal argument that can be evaluated according to well-defined rules, but the actual process your mind went through to get there is at least as opaque as whatever is going on inside LLMs. It seems likely, as you suggest, that we are going to have to define reasoning in terms of the ability to solve certain classes of problems, while leaving the character of the process unspecified.