The tests I'd delete are the ones that only check that the code is written in a particular way, instead of testing the expected behaviour of the code.
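A rough sketch of the difference, using a made-up `Cart` class (nothing here comes from a real codebase). The first test only passes as long as `total` happens to call a private helper; the second keeps passing through any refactor that still produces the right number.

```python
from unittest import mock


class Cart:
    """Hypothetical example class, used only for illustration."""

    def __init__(self, prices):
        self.prices = list(prices)

    def _subtotal(self):
        return sum(self.prices)

    def total(self, tax_rate):
        return round(self._subtotal() * (1 + tax_rate), 2)


def test_total_calls_subtotal():
    # Coupled to the implementation: breaks if _subtotal is inlined or
    # renamed, even though the result the caller sees is unchanged.
    cart = Cart([10.0, 5.0])
    with mock.patch.object(Cart, "_subtotal", return_value=15.0) as helper:
        cart.total(0.1)
    helper.assert_called_once_with()


def test_total_includes_tax():
    # Coupled to the behaviour: only cares about the value returned.
    assert Cart([10.0, 5.0]).total(0.1) == 16.5
```

The first test is the kind I'd delete; the second is the kind I'd keep.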
Often a flaky test isn't flaky because a well-written test caught something strange failing elsewhere. More often the test reveals that the system is somewhat non-deterministic, just not in a way that does any harm. Once you have multiple levels of abstraction, parallelization, and interdependent behavior, fixing a single test becomes a time-consuming process that is hard to work on (because it's flaky, you can't always reproduce the failure).
If a test fails in CI and the traceback is unclear, many people will re-run it once and let it keep flaking. Obvious flakes around time and other dependencies are much easier to spot and fix, so they get fixed. It's only the weird ones that lead to pain and regret.
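For comparison, this is roughly what the easy kind of fix looks like for a time-related flake: pass the clock in instead of reading it inside the function. `is_expired` and its `now` parameter are hypothetical names for this sketch.

```python
from datetime import datetime, timedelta, timezone


def is_expired(issued_at, ttl, now=None):
    """Return True once at least `ttl` has passed since `issued_at`."""
    if now is None:
        # Production path: fall back to the real clock.
        now = datetime.now(timezone.utc)
    return now - issued_at >= ttl


def test_expiry_with_fixed_clock():
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    ttl = timedelta(minutes=5)
    # Both sides of the boundary are checked with an explicit "now",
    # so the outcome does not depend on how slow the CI machine is.
    assert not is_expired(issued, ttl, now=issued + timedelta(minutes=4, seconds=59))
    assert is_expired(issued, ttl, now=issued + ttl)
```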