It's so much easier to make cheats today than it was, say, 10 years ago.
It's also easier because more and more games are sharing common infrastructure like game engines, as compared to the past. What works in one Unreal game may save you a lot of time developing a cheat for another Unreal game.
These days, many online games encounter serious cheats within the first couple of days of release - if not the day OF release.
Cheating and anti-cheat used to rely a lot on the pure technical parts (like "is something sneaking some reads from the memory the game engine uses to clip models?"), which is ultimately not something you will win as a game developer (DMA/Hardware attacks or even just frame grabbing the eDP or LVDS signal and intercepting the USB HID traffic has been on the market for quite a while).
But implausible actions and results for a player can only be attributed to luck so many times. Do 30 360noscope flick headshots in a row on a brand new account and you can be pretty sure something is wrong.
If we can get plausibility vs. luck sorted out to a degree where the method of cheating no longer matters, that's when the tide turns. Works for pure bots as well. But it's difficult to do, and probably not something every developer is able/willing to develop or invest in.
Anything that makes assumptions about player's skills runs into problems too. For any online PvP game, the skill ceiling will rise with time. What once may have been considered improbable may soon become what's consistent for the top 1% or even 0.1% of the playerbase given a few years.
As well, it can run into problems as rebalancing occurs and new abilities are released.
The only group you'd punish with that is skilled players that lose their account (and create a new one), but if you use a moving skill window they can grow back into their plausibility pretty quickly, and it's a small cost compared to everything else. And you could even mitigate that by making things like the first 10 matches require a different plausibility score than the matches after that.
And with different I don't mean "no scoring at all" or something like that. But a cheater tends to not cheat "a little bit". You might have togglers, but that sticks out like a sore thumb (people don't suddenly lose or gain skill like that). And even if that fails (lots of "cheating a little bit" for example), you've still managed to boot out the obvious persistent cheating.
And that's just with 1 example and 1 scenario. Granted, that bypasses the fact that it is still difficult and doing it broader than one example/scenario is even more difficult, but that's why I ended the previous comment pointing out the difficulty and associated cost, which goes hand in hand with the balancing difficulty you pointed out. Even tribunal-assisted methods (not sure if Riot games still does that) have the same problem.
And - what about experienced players who cheat?
In some scenes, it's actually more often that cheaters are some of the best, most experienced players who have a strong competitive lean and feel they 'deserve' to win, so use cheats to get an edge. It's far more common than you'd think.
That's the problem with any anti-cheat system. It's all the what-ifs. Every single 'clever idea' that has been theorized under the sun has been tried and most have failed.
Experienced players who cheat will still be subject to plausibility. Say there is a normal amount of variance in humans but suddenly some player no longer has variance in their action. That's not plausible at all. Or a player looking at things they cannot see, that might sometimes be a coincidence, but that level of coincidence is not plausible to suddenly change a drastic amount.
Again, this sort of thing doesn't catch all subtle cheaters, but those are also not the biggest issue. It's the generic "runs into a room, beats everyone within 10ms", and "cannot see, but hits anyway all the time" type of cheat you'd want to capture automatically.
A what-if in a tournament or the top 1% of players is such a small set of players, you'd be able to do human observation. Even then someone could cheat, but you're so far outside of the realm of general cheating, I wonder if that's worth including in a system that's mostly beneficial inside the mass market gaming players.
Either way, this sort of detection is usually done in the financial and retail world, and results in highly acceptable rates and results. It's not perfect with a 100% success rate or something like that, but it's pretty successful. Just not something studios or publishers seem to want to invest in. It's much simpler to just buy or licence something (like Easy Anti-Cheat). Broad internal expertise isn't something the markets are rewarding at this point.