
Unit tests as documentation

(www.thecoder.cafe)
94 points by thunderbong | 26 comments
1. lucianbr ◴[] No.41872163[source]
One - unit tests explain nothing. They show what the output should be for a given input, but not why, or how you get there. I'm surprised by the nonchalant claim that "unit tests explain code". Am I missing something about the meaning of the English word "explain"?

Two - so any input value outside of those in unit tests is undocumented / unspecified behavior? Documentation can contain an explanation in words, like what relation should hold between the inputs and outputs in all cases. Unit tests by their nature can only enumerate a finite number of cases.

This seems like such an obviously not great idea...

replies(14): >>41872317 #>>41872378 #>>41872470 #>>41872545 #>>41872973 #>>41873690 #>>41873888 #>>41874566 #>>41874890 #>>41874910 #>>41875148 #>>41875681 #>>41875896 #>>41876058 #
2. atoav ◴[] No.41872317[source]
Not sure about this, but I like it the way it is done in the Rust ecosystem.

In Rust, there are two types of comments. Regular ones (e.g. starting with //) and doc-comments (e.g. starting with ///). The latter will land in the generated documentation when you run cargo doc.

And now the cool thing: if you have example code in these doc comments, e.g. to explain how a feature of your library can be used, that example will automatically become part of the tests by default. That means you are unlikely to forget to update these examples when your code changes, and you can use them as tests at the same time by asserting something at the end (which also communicates the outcome to the reader).
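
For readers outside Rust: Python's doctest module works on the same principle, so here is a minimal sketch of the idea (the function is made up). The examples in the docstring render as documentation and run as tests:

    def reciprocal_sum(*xs):
        """Return the sum of the reciprocals of the inputs.

        >>> reciprocal_sum(2, 5)
        0.7
        >>> reciprocal_sum(4)
        0.25
        """
        return sum(1 / x for x in xs)

    if __name__ == "__main__":
        import doctest
        doctest.testmod()  # runs every example embedded in the docstrings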

replies(3): >>41872703 #>>41872805 #>>41874298 #
3. monocasa ◴[] No.41872378[source]
Unit tests can explain nothing. But so can paragraphs of prose.

The benefit of explanations in tests is that running them gets you closer to knowing if any of the explanations have bit rotted.

replies(1): >>41872921 #
4. __MatrixMan__ ◴[] No.41872470[source]
Often, tests are parameterized over lists of cases such that you can document the general case near the code and document the specific cases near each parameter. I've even seen test frameworks that consume an Excel spreadsheet provided by product, so that the test results are literally a function of the requirements.

Would we prefer better docs than some comments sprinkled in strategic places in test files? Yes. Is having them with the tests maybe the best we can do for a certain level of effort? Maybe.

If the alternative is an entirely standalone repository of docs which will probably not be up to date, I'll take the comments near the tests. (Although I don't think this approach lends itself to unit tests.)
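
A minimal sketch of the parameterized-case pattern, using pytest's parametrize (the function under test is a made-up stand-in):

    import pytest

    def accrue_interest(principal: float, rate: float) -> float:
        """Toy function standing in for real application code."""
        return principal * (1 + rate)

    # The general rule is documented once, here; each parameter row below
    # documents one specific case right next to its expected value.
    @pytest.mark.parametrize(
        "principal, rate, expected",
        [
            (100.0, 0.0, 100.0),  # zero rate: balance unchanged
            (100.0, 0.5, 150.0),  # 50% interest on 100
            (0.0, 0.5, 0.0),      # nothing to grow
        ],
    )
    def test_accrue_interest(principal, rate, expected):
        assert accrue_interest(principal, rate) == expected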

5. worldsayshi ◴[] No.41872545[source]
One: Can we test the tests using a somewhat formal specification of the why?

Two: my intuition says that exhaustively specifying the intended input output pairs would only hold marginal utility compared to testing a few well selected input output pairs. It's more like attaching the corners of a sheet to the wall than gluing the whole sheet to the wall. And glue is potentially harder to remove. The sheet is n-dimensional though.

replies(1): >>41872651 #
6. lucianbr ◴[] No.41872651[source]
I really don't understand the "exhaustive specification" thing. How else is software supposed to work but with exhaustive specification? Is the operator + not specified exhaustively? Does your intuition tell you it is enough to give some pairs of numbers and their sums, with no need for words explaining that + computes the algebraic sum of its operands? There are an infinite number of functions of two arguments that pass through a finite number of specified points. Without the words saying what + does, it could literally do anything outside the test cases.

Of course, for + it's relatively easy to intuit what it is supposed to mean. But if I develop a "Joe's interpolation operator", do you think you'll understand it well enough from 5-10 unit tests, and that actually giving you the formula would add nothing? Again I find myself wondering if I'm missing some English knowledge...

Can you imagine understanding the Windows API from nothing but unit tests? I really cannot. No text to explain the concepts of process, memory protection, file system? There is absolutely no way I would get it.
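
To make the finite-points problem concrete, here is a hypothetical sketch: a function that passes every enumerated test for + while computing something else everywhere outside them.

    # Hypothetical "impostor" addition: agrees with + on the tested
    # pairs, arbitrary everywhere else.
    TESTED = {(1, 1): 2, (2, 2): 4, (10, 5): 15}

    def impostor_add(a: int, b: int) -> int:
        return TESTED.get((a, b), 42)

    assert impostor_add(1, 1) == 1 + 1    # every enumerated test passes...
    assert impostor_add(2, 2) == 2 + 2
    assert impostor_add(10, 5) == 10 + 5
    # ...but impostor_add(3, 3) is 42, not 6.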

replies(2): >>41873282 #>>41874180 #
7. lucianbr ◴[] No.41872703[source]
Yeah, combining unit tests and written docs in various ways seems fine. My reading of the article was that the tests are the only documentation. Maybe that was not the intent but just a bad interpretation on my part.

Though some replies here seem to keep arguing for my interpretation, so it's not just me.

replies(1): >>41876048 #
8. chrisweekly ◴[] No.41872805[source]
Does your IDE handle syntax-highlighting and IntelliSense-type enhancements for these unit tests written as doc-comments?
9. mannykannot ◴[] No.41872921[source]
> The benefit of explanations in tests is...

What you appear to have in mind here is the documentation of a test. Any documentation that correctly explains why it matters that the test should pass will likely tell you something about what the purpose of the unit is, how it is supposed to work, or what preconditions must be satisfied in order for it to work correctly, but the first bullet point in the article seems to be making a much stronger claim than that.

The observation that both tests and documentation may fail to explain their subject sheds no light on the question of whether (or to what extent) tests in themselves can explain the things they test.

10. danielovichdk ◴[] No.41872973[source]
This is one of those things that is "by philosophy", and I understand, I think, what you are saying.

I do think that tests should not explain the why, that would be leaking too much detail, but at the same time the why is somewhat the result of the test. A test is documentation of a regression, not of how the code it tests is implemented or why.

The finite number of cases is interesting. You can definitely run single tests with a high number of inputs, which of course is still finite but perhaps closer to a possible way of ensuring validity.

11. __MatrixMan__ ◴[] No.41873282{3}[source]
The thing about Joe's interpolation operator is that Joe doesn't work here anymore but thousands of users are relying on his work and we need to change it such that as few of them scream as possible.

That's the natural habitat for code, not formally specified, but partially functioning in situ. Often the best you can do is contribute a few more test cases towards a decent spec for existing code because there just isn't time to re-architect the thing.

If you are working with code in an environment where spending time improving the specification can be made a prerequisite of whatever insane thing the stakeholders want today... Hang on to that job. For the rest of us, it's a game of which-hack-is-least-bad.

replies(1): >>41875738 #
12. lcall ◴[] No.41873690[source]
At least sometimes, it really helps for a test to say WHY it is done that way. I had a case where I needed to change some existing code, and all the unit tests passed but one. The author was unavailable. It was very unclear whether I should change the test. I asked around. I was about to commit the changes to the code and test when someone came back from vacation and helpfully explained. I hope I added a useful comment.
13. Etherlord87 ◴[] No.41873888[source]
Documentation:

> returns a sum of reciprocals of inputs

Unit Test:

    assert_eq(foo(2, 5), 1/2 + 1/5)
    assert_eq(foo(4, 7), 1/4 + 1/7)
    assert_eq(foo(10, 100, 10000), 0.1101)
14. worldsayshi ◴[] No.41874180{3}[source]
I suspect we're thinking about quite different use cases for our testing code. If the input-output pairs describe a highly technical relationship, I would probably want a more rigorous testing procedure. Possibly proofs.

Most of the tests I write daily are about moving and transforming data in ways that are individually rather trivial, but when features pile up, keeping track of all the requirements is hard, so you want regression tests. But you also don't want a bunch of regression tests that are hard to change when you change requirements, which will happen. So you want a decent amount of simple tests for individually simple requirements that make up a complex whole.

15. readline_prompt ◴[] No.41874298[source]
Doctests are great, aren't they?
replies(1): >>41874460 #
16. TeMPOraL ◴[] No.41874460{3}[source]
IDK, they sound like they overflow the "maximum code" counter and land straight in literate-programming land. I wonder how far you could go writing your whole program as doctests spliced between commentary.
17. tpoacher ◴[] No.41874890[source]
Unit tests are programmatic specification. I'm assuming it is in this manner that the article is referring to them as documentation, rather than as "explanations" per se.

Obviously unit tests cannot enumerate all inputs, but as a form of programmatic specification, neither do they have to.

For the case you mention where a broad relation should hold, there is a special kind of unit test strategy for that: property testing. Though admittedly other aspects of design-by-contract are also better suited here; nobody's claiming that tests are the best or only programmatic documentation strategy.

Finally, there's another kind of unit testing, more appropriately called characterisation testing, as per M. Feathers' book on legacy code. The difference being: unit tests are for developing a feature and ensuring adherence to a spec, whereas characterisation tests are for exploring the actual behaviour of existing code (which may or may not be behaving according to the intended spec). These, then, are definitely tests as programmatic documentation.
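
As a minimal sketch of the property-testing style, using Python's hypothesis library (the function under test is a made-up stand-in):

    from hypothesis import given, strategies as st

    def dedupe(xs: list) -> list:
        """Toy function under test: drop duplicates, keep first occurrences."""
        seen, out = set(), []
        for x in xs:
            if x not in seen:
                seen.add(x)
                out.append(x)
        return out

    # Instead of enumerating cases, state relations that must hold for
    # all inputs and let the framework search for a counterexample.
    @given(st.lists(st.integers()))
    def test_dedupe_is_idempotent(xs):
        assert dedupe(dedupe(xs)) == dedupe(xs)

    @given(st.lists(st.integers()))
    def test_dedupe_preserves_membership(xs):
        assert set(dedupe(xs)) == set(xs)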

18. m000 ◴[] No.41874910[source]
One - Why do you care how you got there? You need to read the code for that. But the tests do explain/document how you can expect the code to work. If the code is unreadable, well, that sucks. But you at least have a programmatic (and hopefully annotated) description of how the code is expected to work, so you have a stable base for rewriting it to be more clear.

Two - Ever heard of code coverage? Type systems/type checkers? Also, there's nothing precluding you from using assertions in the test that make any assumed relations explicit before you actually test anything.
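
A small sketch of that last point (names hypothetical): a leading assertion documents the relation the case relies on before the behaviour under test is exercised.

    def apply_discount(price: float, discount: float) -> float:
        """Toy function under test: subtract discount, never below zero."""
        return max(price - discount, 0.0)

    def test_discount_is_capped_at_full_price():
        price, discount = 80.0, 100.0
        # Make the assumed relation explicit before testing anything:
        assert discount > price, "this case deliberately probes the cap"
        assert apply_discount(price, discount) == 0.0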

19. tmoertel ◴[] No.41875148[source]
In some cases, unit tests can both test and specify the semantics of the system being tested. My favorite example is the ReadP parsing library for Haskell. The source code ends with a short and automatically testable specification of the semantics of the combinators that make up the library. So, in this example, the tests tell you almost everything you need to know about the library.

https://hackage.haskell.org/package/ghc-internal-9.1001.0/do...

20. 8n4vidtmkvmk ◴[] No.41875681[source]
Yes, actually. Sometimes the edge cases that aren't covered by unit tests are undefined behavior. I don't recommend doing this frequently, but sometimes it's hard to know the best way to handle weird edge cases until you gather more use cases, so deliberately not writing a test for such things is a legit strategy IMO. You should probably also add to the method doc comment that invoking with X is not well defined.
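
A sketch of that kind of doc comment (the function and its units are made up):

    def parse_duration(s: str) -> int:
        """Parse a duration like "5m" or "2h" into whole seconds.

        Behaviour for mixed units (e.g. "1h30m") is deliberately left
        undefined and untested until we see real use cases.
        """
        units = {"s": 1, "m": 60, "h": 3600}
        return int(s[:-1]) * units[s[-1]]
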
21. invaderzirp ◴[] No.41875738{4}[source]
What's stopping someone from reading the code, studying it deeply, and then writing down what it does? That's what I do, but I see people struggle with it because they just want to get more tickets done.
replies(2): >>41876363 #>>41876389 #
22. gorgoiler ◴[] No.41875896[source]
For something like this:

  def get_examples(
    source: Path,
    minimum_size: float,
    maximum_size: float,
    total_size: float,
    seed: float = 123,
  ) -> Iterator[Path]:
      …
…it’s pretty obvious what those float arguments are for, but the “source” is just a Path. Is there an example “source” I can look at to see what sort of thing I am supposed to pass there?

Well, you could document that abstractly in the function (“your source must be a directory available via NFS to all devs as well as the build infra”), but you could also use the function in a test and describe it there, and let that be the “living documentation” of which the original author speaks.

Obviously, if this is a top-level function in some open source library with a readthedocs page then it’s good to actually document the function and have a test. If it’s just some internal thing, though, then doc-rot can be more harmful than no docs at all, so the best docs are verified, living docs: the tests.
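
A sketch of what that living documentation could look like, using pytest's tmp_path fixture (the assertions are hypothetical, since get_examples' semantics are only hinted at above):

    def test_get_examples_picks_files_from_source_dir(tmp_path):
        # The "source" is a directory of example files; this test doubles
        # as documentation of what callers are expected to pass.
        source = tmp_path / "examples"
        source.mkdir()
        (source / "small.bin").write_bytes(b"x" * 10)
        (source / "large.bin").write_bytes(b"x" * 1000)

        picked = list(get_examples(source, minimum_size=5.0,
                                   maximum_size=100.0, total_size=50.0))
        assert all(p.parent == source for p in picked)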

(…or make your source an enumeration type so you don’t even need the docs!)

23. the_af ◴[] No.41876048{3}[source]
Combining is what TFA suggests. They even go as far as closing the article with:

> Note also that I’m not suggesting that unit tests should replace any form of documentation but rather that they should complement and enrich it.

24. the_af ◴[] No.41876058[source]
I think the line of thought behind the article is making the tests be like a "living spec". Well written tests (especially those using things like QuickCheck, aka "property testing") will cover more than simply a few edge cases. I don't think many developers know how to write good test cases like this, though, so it becomes a perilous proposition.

Do note TFA doesn't suggest replacing all other forms of documentation with just tests.

25. __MatrixMan__ ◴[] No.41876363{5}[source]
Nothing, sounds like a great plan.

But if you want other people to benefit from it, a good place to put it is right next to a test that will start failing as soon as the code changes in a way that no longer conforms to the spec.

Otherwise those people who just want to get more tickets done will change the code without changing the spec. Or you'll end up working on something else and they'll never even know about your document, because they're accustomed to everybody else's bad habits.

If you're going to be abnormally diligent, you might as well do so in a way that the less diligent can approach gradually: one test at a time.

26. dullcrisp ◴[] No.41876389{5}[source]
The code already says what it does.