Delete tests

(andre.arko.net)

125 points mooreds | 4 comments | 27 Aug 25 11:18 UTC

dcminter ◴[30 Aug 25 04:15 UTC] No.45071880[source]▶

How about you fix the flakey tests?

The tests I'd delete are the ones that just test that the code is written in a particular way instead of testing the expected behaviour of the code.

replies(3): >>45071959 #>>45072007 #>>45072191 #

1. Shank ◴[30 Aug 25 05:51 UTC] No.45072191[source]▶

>>45071880 #

> How about you fix the flakey tests?

Oftentimes a flakey test is flakey not because it was badly written, but because something else strange is failing. Often the test reveals something about the system that is somewhat non-deterministic, but not in a detrimental way. When you have multiple levels of abstraction, parallelization, and interdependent behavior, fixing a single test becomes a time-consuming process that is difficult to work with (because it's flakey, you can't always replicate the failure).

If a test fails in CI and the traceback is unclear, many people will re-run once and let it continue to flake. Obvious flakes around time and other dependencies are much easier to spot and fix, so they are. It's only the weird ones that lead to pain and regret.
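As an illustration of the "obvious" time-related kind of flake and its usual fix, here is a minimal sketch in Python. The function `is_expired` and its `now` parameter are hypothetical names made up for this example; the idea is only that the code under test accepts the current time as an argument, so a test can pin it instead of reading the wall clock.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at, ttl, now=None):
    """Return True once `ttl` has elapsed since `created_at`.

    `now` is injectable (a hypothetical design choice for this sketch) so
    tests do not depend on the real clock.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at >= ttl


# In a test, pin "now" to a fixed instant instead of calling the real clock.
fixed_now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
created = fixed_now - timedelta(minutes=15)

expired_at_15 = is_expired(created, timedelta(minutes=15), now=fixed_now)
expired_at_16 = is_expired(created, timedelta(minutes=16), now=fixed_now)
```

With the clock pinned, the boundary case gives the same answer on every run, regardless of how slow the CI machine is.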

replies(2): >>45073163 #>>45077959 #

2. lexicality ◴[30 Aug 25 09:11 UTC] No.45073163[source]▶

>>45072191 (TP) #

Sounds like it's not actually well written in that case. Either you're testing the wrong output if it's non-deterministic or you have a consistency bug that's corrupting data in production.

replies(1): >>45073266 #

3. dcminter ◴[30 Aug 25 09:32 UTC] No.45073266[source]▶

>>45073163 #

Exactly; a very occasionally flakey test may be tolerable but is almost by definition not well written.

The commonest type I see is one where instead of waiting until expected behaviour is exhibited with a suitable timeout, the test sleeps for some shorter period and then checks to see if the behaviour was exhibited.

These tests not only flake occasionally when the CI server or dev laptop is under unusual load, but worse, accumulate until the test suite is so full of "short" sleeps that the full set of tests takes half an hour to run.

Often the sleeps were seen as being acceptable because the plan was to run the tests in parallel, but then the increased load results in the tests becoming flakey.
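The difference between the two styles can be sketched in Python roughly as follows. Everything here is hypothetical and made up for illustration: `wait_until` is a small polling helper, and `start_background_job` stands in for whatever asynchronous behaviour the test is waiting on. The point is that the robust version returns as soon as the condition holds and only fails after a generous timeout, while the fragile version bets on a fixed sleep.

```python
import threading
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One final check so a result that lands right at the deadline still counts.
    return bool(predicate())


def start_background_job(result, delay):
    """Hypothetical async work: sets result["done"] after `delay` seconds."""
    def work():
        time.sleep(delay)
        result["done"] = True

    thread = threading.Thread(target=work)
    thread.start()
    return thread


def fragile_check(delay, nap):
    # Fixed sleep, then a single check: fails whenever the job outlasts the nap.
    result = {"done": False}
    thread = start_background_job(result, delay)
    time.sleep(nap)
    ok = result["done"]
    thread.join()
    return ok


def robust_check(delay, timeout):
    # Waits only as long as needed, up to a generous upper bound.
    result = {"done": False}
    thread = start_background_job(result, delay)
    ok = wait_until(lambda: result["done"], timeout=timeout)
    thread.join()
    return ok
```

In this sketch a job that takes 0.05s passes `robust_check(0.05, 2.0)` in roughly 0.05s rather than 2s, so a large timeout costs nothing on the happy path. A fixed sleep has to be paid in full on every run and still fails when the machine is slower than expected.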

Once you have dozens of these flaking tests for this or other reasons, it becomes a project in itself to refactor them back to something sane.

Flakey tests should always be fixed immediately unless you're in the middle of an incident or something.

4. Izkata ◴[30 Aug 25 20:59 UTC] No.45077959[source]▶

>>45072191 (TP) #

My favorite flakey test where nothing was actually wrong with the code or test: The system used some of the same settings between development and CI, including the memcached server. The test would fail if one of the devs happened to be using their development site within 15 minutes of the next CI run, because the code would retrieve a nonexistent object from the cache and fail with a really strange error.
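One possible way to avoid this kind of cross-talk, sketched in Python under the assumption that separate cache servers are not an option, is to prefix every cache key with the environment name. `NamespacedCache` is a hypothetical wrapper written for this example, and a plain dict stands in for the shared memcached server; this is not a description of how the system above was actually fixed.

```python
class NamespacedCache:
    """Hypothetical wrapper that prefixes every key with an environment name."""

    def __init__(self, backend, namespace):
        self._backend = backend      # dict standing in for the shared cache server
        self._namespace = namespace  # e.g. "dev" or "ci"

    def _key(self, key):
        return f"{self._namespace}:{key}"

    def set(self, key, value):
        self._backend[self._key(key)] = value

    def get(self, key, default=None):
        return self._backend.get(self._key(key), default)


# Both environments talk to the same server, but under different prefixes.
shared_server = {}
dev_cache = NamespacedCache(shared_server, "dev")
ci_cache = NamespacedCache(shared_server, "ci")

# A developer browsing the dev site populates the cache...
dev_cache.set("user:42", {"name": "dev-only object"})

# ...and the CI run sees a cache miss instead of a reference to an object
# that does not exist in its own database.
ci_value = ci_cache.get("user:42")
```

The CI run then falls through to its own data on a miss, so its result no longer depends on what a developer happened to do in the previous 15 minutes.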
