←back to thread

214 points optimalsolver | 8 comments | | HN request time: 1.977s | source | bottom
Show context
alyxya ◴[] No.45770449[source]
The key point the paper seems to make is that existing benchmarks have relatively low complexity on reasoning complexity, so they made a new dataset DeepRD with arbitrarily large reasoning complexity and demonstrated that existing models fail at a complex enough problem. Complexity is defined from the complexity of a graph created by modeling the problem as a graph and determining the traversals needed to go from some source node to a target node.

My main critique is that I don't think there's evidence that this issue would persist after continuing to scale models to be larger and doing more RL. With a harness like what coding agents do these days and with sufficient tool use, I bet models could go much further on that reasoning benchmark. Otherwise, if the reasoning problem were entirely done within a single context window, it's expected that a complex enough reasoning problem would be too difficult for the model to solve.

replies(5): >>45771061 #>>45771156 #>>45772667 #>>45775565 #>>45775741 #
1. tomlockwood ◴[] No.45771156[source]
So the answer is a few more trillion?
replies(1): >>45771324 #
2. code_martial ◴[] No.45771324[source]
It’s a worthwhile answer if it can be proven correct because it means that we’ve found a way to create intelligence, even if that way is not very efficient. It’s still one step better than not knowing how to do so.
replies(2): >>45771753 #>>45772739 #
3. tomlockwood ◴[] No.45771753[source]
So we're sending a trillion on faith?
replies(1): >>45771805 #
4. code_martial ◴[] No.45771805{3}[source]
No, that’s not what I said.
replies(1): >>45772216 #
5. tomlockwood ◴[] No.45772216{4}[source]
Why are we sending the trillion?
replies(1): >>45775705 #
6. usrbinbash ◴[] No.45772739[source]
> if it can be proven correct

Then the first step would be to prove that this works WITHOUT needing to burn through the trillions to do so.

7. measurablefunc ◴[] No.45775705{5}[source]
It must be deposited into OpenAI's bank account so that they can then deposit it into NVIDIA's account who can then in turn make a deal w/ OpenAI to deposit it back into OpenAI's account for some stock options. I think you can see how it works from here but if not then maybe one of the scaled up "reasoning" AIs will figure it out for you.
replies(1): >>45778896 #
8. tomlockwood ◴[] No.45778896{6}[source]
I understand perfectly, thank you!!!