> I don't think there's evidence that this issue would persist after continuing to scale models to be larger and doing more RL
And how much larger do we need to make the models? 2x? 3x? 10x? 100x? How large do they have to get before scaling up somehow solves everything?
Because 2x larger means roughly 2x more memory and compute required: double the cost, or half the capacity. Would people still pay for this tech if it doubled in price? Bear in mind, much of it is already running at a loss even now.
And what if 2x isn't good enough? Would anyone pay for a 10x larger model? Could we even realistically run such models as anything other than a very expensive PoC, and only for a very short time? And who's to say that even 10x will finally solve things? What if we need 40x? Or 100x?
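To put napkin math on it (the baseline size and price below are made-up for illustration; the only real assumption is that dense-transformer inference cost scales roughly linearly with parameter count):

```python
# Napkin math: serving cost if inference scales ~linearly with params.
# Baseline numbers are illustrative assumptions, not anyone's real pricing.

BASE_PARAMS_B = 70            # assumed baseline: a 70B-param model
BASE_COST_PER_M_TOK = 1.00    # assumed baseline: $1 per 1M tokens served

for scale in (2, 3, 10, 40, 100):
    params_b = BASE_PARAMS_B * scale
    vram_gb = params_b * 2    # fp16 weights alone: ~2 bytes/param
    cost = BASE_COST_PER_M_TOK * scale
    print(f"{scale:>3}x -> {params_b:>5.0f}B params, "
          f"~{vram_gb:>5.0f} GB of weights, ~${cost:.2f}/1M tokens")
```

And linear-in-params is the charitable assumption here; long contexts and KV caches only make it worse.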
Oh, and of course: larger models also require more data to train on. And while the Internet is huge, it's still finite. When model sizes grow geometrically, so does the data they demand, and even `sizeof(internet)` eventually runs out ... in fact, it may have done so already [1] [2]
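That one is easy to sketch too. Chinchilla-style scaling says compute-optimal training wants roughly 20 tokens per parameter (Hoffmann et al., 2022); the 50T-token stock of usable text below is my own assumption, since published estimates vary wildly:

```python
# Napkin math: Chinchilla-style compute-optimal training wants roughly
# 20 tokens per parameter (Hoffmann et al., 2022). The stock of usable
# public text below is my assumption; published estimates vary a lot.

TOKENS_PER_PARAM = 20
USABLE_TEXT_TOKENS = 50e12        # assumed: ~50T tokens of usable text

params, scale = 70e9, 1           # assumed 70B-param starting point
while True:
    need = params * scale * TOKENS_PER_PARAM
    print(f"{scale:>3}x model wants ~{need / 1e12:5.1f}T tokens "
          f"({need / USABLE_TEXT_TOKENS:.0%} of the assumed stock)")
    if need > USABLE_TEXT_TOKENS:
        break                     # the internet just ran out
    scale *= 2
```

Even with generous assumptions, about six doublings of today's baseline blows through the entire stock, and that's for a single training run.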
What if we discover that scaling up doesn't even work, because of diminishing returns? Oh wait, looks like we already did: [3]
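And "diminishing returns" isn't hand-waving; it's the literal shape of the published scaling-law fits. A minimal sketch using the Hoffmann et al. (2022) constants (parameter term only):

```python
# The published Chinchilla loss fit (Hoffmann et al., 2022), parameter
# term only -- I'm ignoring the data term B / D**beta for simplicity:
#     L(N) = E + A / N**alpha,  with E=1.69, A=406.4, alpha=0.34
E, A, ALPHA = 1.69, 406.4, 0.34

prev = None
for exp in range(9, 15):          # 1e9 .. 1e14 parameters
    loss = E + A / (10 ** exp) ** ALPHA
    gain = "" if prev is None else f"  (gain from last 10x: {prev - loss:.3f})"
    print(f"N = 1e{exp}: loss = {loss:.3f}{gain}")
    prev = loss
```

Each additional 10x in parameters buys less than half the absolute improvement of the previous 10x, and the irreducible term E never gets crossed. That curve shape is the whole "diminishing returns" story.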
[1]: https://observer.com/2024/12/openai-cofounder-ilya-sutskever...
[2]: https://biztechweekly.com/ai-training-data-crisis-how-synthe...
[3]: https://garymarcus.substack.com/p/confirmed-llms-have-indeed...