
184 points hhs | 1 comments
aabhay ◴[] No.41840024[source]
Using automatic verification plus synthetic data is basically common knowledge among practitioners. But all these organizations have also endlessly explored the different ways to overfit on such data, and the conclusion is the same -- the current model architecture seems to plateau when it comes to multi-step logical reasoning. You either drift too far from your common-knowledge pre-training, or you never come up with the right steps in instances where there's a vast design space.

Think -- why has nobody been able to make an LLM play Go better than AlphaZero while still retaining language capabilities? It certainly would have orders of magnitude more parameters.
