Many of his arguments make "logical" sense, but one way to evaluate them is to ask: would they have applied equally well 5 years ago? And would they have predicted that LLMs would never write (average) poetry, solve math problems, or answer common-sense questions about the physical world reasonably well?
Probably. But it turns out scale was all we needed. So yes, maybe this is the exact point where scale stops working and we need to drastically change architectures. But maybe we just need to keep scaling.