It begins to fall down when you think in terms of interpolation and movement. If the server had to confirm your every movement, it'd end up very jittery and feel awful as you snap back and forth between where your client thinks you are and where the server thinks you are.
Even the client is kind of guessing (visually) where it is a lot of the time, at least until the next physics or update tick comes in. All of this means the server is going to be doing a hell of a lot of guesswork about the state of the clients.
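
For what it's worth, the usual way around the "confirm every movement" problem is to let the client move immediately and only correct it when the server disagrees. Here's a rough sketch of that idea in Python, assuming 1D movement and a made-up message shape (`Input`, `PredictingClient`, `SPEED` and the `seq` field are all names I've invented for illustration, not from any particular engine):

```python
from dataclasses import dataclass, field

SPEED = 5.0  # units per second; arbitrary value for the sketch


@dataclass
class Input:
    seq: int          # sequence number so the server can acknowledge it
    direction: float  # -1.0, 0.0 or 1.0
    dt: float         # how long the input was held, in seconds


def apply_input(x: float, inp: Input) -> float:
    # The same movement rule is assumed to run on both client and server.
    return x + inp.direction * SPEED * inp.dt


@dataclass
class PredictingClient:
    x: float = 0.0
    next_seq: int = 0
    pending: list = field(default_factory=list)  # inputs not yet acknowledged

    def local_move(self, direction: float, dt: float) -> Input:
        # Apply the input right away instead of waiting for the server.
        inp = Input(self.next_seq, direction, dt)
        self.next_seq += 1
        self.x = apply_input(self.x, inp)
        self.pending.append(inp)
        return inp  # in a real game this would be sent to the server

    def on_server_state(self, server_x: float, last_acked_seq: int) -> None:
        # Drop inputs the server has already processed, take its position
        # as the truth, then replay whatever it has not seen yet.
        self.pending = [i for i in self.pending if i.seq > last_acked_seq]
        self.x = server_x
        for inp in self.pending:
            self.x = apply_input(self.x, inp)
```

If the server agrees with the client, the replay lands on the same spot and nothing visibly changes. If it disagrees (say you hit something server-side), you only get corrected by the difference rather than yanked back to an old position.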
This article helps with reasoning around what a game is doing per-frame: https://gameprogrammingpatterns.com/game-loop.html
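
To make the "guessing between ticks" part concrete, here's a minimal sketch of one common loop shape: updates run at a fixed step, and rendering blends between the last two simulated states. The names (`step_frame`, `lerp`, `FIXED_DT`) are mine, and this is just one way to do it, not necessarily how any given engine does:

```python
FIXED_DT = 1.0 / 60.0  # simulation step in seconds; 60 Hz is an assumption


def step_frame(accumulator, frame_time, update, fixed_dt=FIXED_DT):
    """Run as many fixed-size updates as the elapsed frame time allows.

    Returns the leftover time, the number of updates run, and alpha,
    the fraction of a step we are past the last update (0.0 to 1.0).
    """
    accumulator += frame_time
    steps = 0
    while accumulator >= fixed_dt:
        update(fixed_dt)
        accumulator -= fixed_dt
        steps += 1
    alpha = accumulator / fixed_dt
    return accumulator, steps, alpha


def lerp(prev, curr, alpha):
    # What gets drawn is a blend of the last two simulated positions,
    # which is the visual "guess" made between ticks.
    return prev + (curr - prev) * alpha
```

So with a 0.25s step and a 0.625s frame you'd run two updates and draw halfway between the previous and current state. The thing on screen is rarely exactly what the simulation says.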
Certainly, though, I think in this day and age you could probably do a better job of this on the server for slower games, and I'm sure people are working on it.