I have a paper coming up that I modestly hope will clarify some of this.
The short answer is that LLM training and inference are both obviously ridiculously inefficient and biologically implausible, so there have to be some big optimization wins still on the table.