Most active commenters

wubrr(18)
the_af(10)
crabbone(5)
invaderzirp(5)
bunderbunder(4)
lucianbr(4)
__MatrixMan__(3)
(3)
rglover(3)
tpoacher(3)

Popular/hot comments

>>41874483 #
>>41872163 #
>>41874705 #
>>41872129 #
>>41872141 #
>>41872317 #
>>41872611 #
>>41877434 #
>>41871879 #
>>41872038 #
>>41872241 #
>>41872618 #

Unit tests as documentation

(www.thecoder.cafe)

1. PaulHoule ◴[17 Oct 24 17:26 UTC] No.41871656[source]▶

Actually every example in the documentation should be backed by a unit test, as in the example is transcluded from the unit test into the docs. Since you often want to show examples that don’t compile in docs you also should be able to write tests for compile errors.

replies(1): >>41871925 #

2. eschneider ◴[17 Oct 24 17:47 UTC] No.41871864[source]▶

>>41871629 (OP) #

Unit tests are a _kind_ of documentation, but are rarely a complete solution to "documenting code". In general, the folks who don't do adequate code documentation are the same folks who don't do adequate unit tests. :/

3. simonw ◴[17 Oct 24 17:48 UTC] No.41871877[source]▶

>>41871629 (OP) #

A trick I use a lot these days is to take the unit tests from an under-documented library, dump them into an LLM and ask it to write me detailed usage documentation.

This works REALLY well. I've even occasionally done some of my own reviewing and editing of those docs and submitted them back to the project. Here's an example: https://github.com/pydantic/jiter/pull/143 - Claude transcript here: https://gist.github.com/simonw/264d487db1a18f8585c2ca0c68e50...

replies(1): >>41872536 #

4. tln ◴[17 Oct 24 17:48 UTC] No.41871879[source]▶

>>41871629 (OP) #

Extracting unit tests from your docs: great!

Somehow extracting your docs from unit tests: might be ok!

Pointing people at unit tests instead of writing docs: not even remotely ok.

Is that really what this guy is advocating??

replies(3): >>41872139 #>>41872279 #>>41875354 #

5. zahlman ◴[17 Oct 24 17:53 UTC] No.41871919[source]▶

>>41871629 (OP) #

This isn't at all a new idea, but it's the first time I've seen it presented with this textbook AI style.

replies(1): >>41872252 #

6. red2awn ◴[17 Oct 24 17:54 UTC] No.41871925[source]▶

>>41871656 #

Better yet, use doc test as featured in Python [1] or Rust [2]. This makes sure your documentation examples are always up-to-date and runnable.

[1]: https://docs.python.org/3/library/doctest.html

[2]: https://doc.rust-lang.org/rustdoc/write-documentation/docume...

7. meindnoch ◴[17 Oct 24 17:56 UTC] No.41871957[source]▶

>>41871629 (OP) #

Is this "article" written by a LLM?

"Tomorrow, you will receive your weekly recap on unit tests."

Please, no.

replies(1): >>41872205 #

8. rglover ◴[17 Oct 24 18:03 UTC] No.41872038[source]▶

>>41871629 (OP) #

Just write the docs. A simple template:

- What is it?

- What does it do?

- Why does it do that?

- What is the API?

- What does it return?

- What are some examples of proper, real world usage (that don't involve foo/bar but instead, real world inputs/outputs I'd likely see)?

replies(3): >>41872068 #>>41872141 #>>41873752 #

9. croes ◴[17 Oct 24 18:06 UTC] No.41872068[source]▶

>>41872038 #

Why is a hard question.

And what should be obvious or it’s still too complex.

replies(1): >>41872144 #

10. Attummm ◴[17 Oct 24 18:11 UTC] No.41872109[source]▶

>>41871629 (OP) #

Unit tests as documentation have proven their worth over the years.

For example this recent feature was added through unit test as documentation.

https://github.com/Attumm/redis-dict/blob/main/extend_types_...

11. danjl ◴[17 Oct 24 18:12 UTC] No.41872129[source]▶

>>41871629 (OP) #

Why just unit tests? Integration tests seem much more valuable as documentation of what the users will do in the app. Unit tests have limited benefits overall, and add a bunch of support time, slowing down development. If you have good (90%+) coverage just from integration tests, you are likely doing 90%+ coverage of the unit tests at the same time, without the extra effort or support burden. You can use the same reasoning to describe the benefits for understanding the code, you get a clear understanding of the important usage cases, plus you get the unit-level "documentation" for free.

replies(6): >>41872167 #>>41872168 #>>41872352 #>>41872618 #>>41876571 #>>41880223 #

12. bluefirebrand ◴[17 Oct 24 18:13 UTC] No.41872139[source]▶

>>41871879 #

> Pointing people at unit tests instead of writing docs: not even remotely ok.

Couldn't agree more

I'm trying to integrate with a team at work that is doing this, and I'm finding it impossible to get a full picture of what their service can do.

I've brought it up with my boss, their boss, nothing happens

And then the person writing the service is angry that everyone is asking him questions about it all the time. "Just go read the tests! You'll see what it does if you read the tests!"

Incredibly frustrating to deal with when my questions are about the business rules for the service, not the functionality of the service

replies(1): >>41873496 #

13. MathMonkeyMan ◴[17 Oct 24 18:13 UTC] No.41872141[source]▶

>>41872038 #

I was going to say that unit tests have the benefit of breaking when the truth changes.

But then I realized that a lot of what makes a set of tests good documentation is comments, and those rot, maybe worse than dedicated documentation.

Keeping documentation up to date is a hard problem that I haven't yet seen solved in my career.

replies(4): >>41872241 #>>41872394 #>>41872424 #>>41874894 #

14. rglover ◴[17 Oct 24 18:14 UTC] No.41872144{3}[source]▶

>>41872068 #

If why is hard it may not need to exist. For example:

"This function exists to generate PDFs for reports and customer documents."

"This endpoint exists to provide a means for pre-flight authorization of requests to other endpoints."

replies(1): >>41872655 #

15. lucianbr ◴[17 Oct 24 18:16 UTC] No.41872163[source]▶

>>41871629 (OP) #

One - unit tests explain nothing. They show what the output should be for a given input, but not why, or how you get there. I'm surprised by the nonchalant claim that "unit tests explain code". Am I missing something about the meaning of the english word "explain"?

Two - so any input value outside of those in unit tests is undocumented / unspecified behavior? A documentation can contain an explanation in words, like what relation should hold between the inputs and outputs in all cases. Unit tests by their nature can only enumerate a finite number of cases.

This seems like such an obviously not great idea...

replies(16): >>41872317 #>>41872378 #>>41872470 #>>41872545 #>>41872973 #>>41873690 #>>41873888 #>>41874566 #>>41874890 #>>41874910 #>>41875148 #>>41875681 #>>41875896 #>>41876058 #>>41878172 #>>41879682 #

16. kbbgl87 ◴[17 Oct 24 18:16 UTC] No.41872164[source]▶

>>41871629 (OP) #

I believe that doctest is the best of both worlds, https://docs.python.org/3/library/doctest.html

17. theLiminator ◴[17 Oct 24 18:16 UTC] No.41872167[source]▶

>>41872129 #

I think unit testing if you're testing in a blackbox manner. Whitebox unit testing tends to be very fragile and nowhere near as valuable as an integration test.

18. smrtinsert ◴[17 Oct 24 18:16 UTC] No.41872168[source]▶

>>41872129 #

If you look for edge cases in integration tests, you will have a combinatorial explosion of integration tests and you will be adding much more work. Unit tests save time, not lose it.

I make this part of my filtering potential companies to work with now. I can't believe how often people avoid doing unit tests.

replies(1): >>41872237 #

19. teivah ◴[17 Oct 24 18:19 UTC] No.41872205[source]▶

>>41871957 #

As the post's author, no, it's not written by an LLM.

The Coder Cafe is a daily newsletter for coders; we go over different topics from Monday to Thursday, and on Friday, there's a recap ;)

20. Etheryte ◴[17 Oct 24 18:19 UTC] No.41872206[source]▶

>>41871629 (OP) #

This is functionally not different from saying your code is your documentation. If it builds, then it's valid, etc. In other words, nonsense. Code, tests and documentation each serve a useful purpose and crucially they each serve a purpose that's distinct from the other ones, but supports them. Code is there to do the thing, tests are there to make sure the thing is done correctly, documentation is for other humans to understand what the thing is and how it's done.

replies(1): >>41872373 #

21. mihaigalos ◴[17 Oct 24 18:21 UTC] No.41872231[source]▶

>>41871629 (OP) #

In TDD, u-tests are called "spec". Pretty much sums it up.

replies(2): >>41872337 #>>41872778 #

22. worik ◴[17 Oct 24 18:22 UTC] No.41872234[source]▶

>>41871629 (OP) #

Unit tests are valuable

But they are also pricy

I am interested in how people prevent unit tests becoming a maintenance burden over time.

I have seen so many projects with legacy failing tests. Any proposal to invest time and money cleaning them up dies on the alter of investing limited resources in developing features that make money

23. danjl ◴[17 Oct 24 18:22 UTC] No.41872237{3}[source]▶

>>41872168 #

That's funny, since I wouldn't code at a place that mandates unit tests. Sure, they have a very minor role, in very specific cases, but I'd say 90% of projects can get 90% of the benefits by writing only integration tests with 90% coverage. If you'd like a more in-depth discussion of why integration testing is better: https://kentcdodds.com/blog/write-tests

24. rglover ◴[17 Oct 24 18:22 UTC] No.41872241{3}[source]▶

>>41872141 #

The only fix for that is discipline. You can't automate away quality. The best people/teams understand that and make good docs a feature requirement, not an afterthought.

My favorite example is Stripe. They've never skimped on docs and you can tell they've made it a core competency requirement for their team.

replies(3): >>41872339 #>>41872469 #>>41876506 #

25. teivah ◴[17 Oct 24 18:24 UTC] No.41872252[source]▶

>>41871919 #

Is there something problematic you think about the style? It's a genuine question.

I wrote a book, and when I created my newsletter, I wanted to have a shift in terms of style because, on the Internet, people don't have time. You can't write a post the same way you write a book. So, I'm following some principles taken here and there. But happy to hear if you have some feedback about the style itself :)

26. teivah ◴[17 Oct 24 18:26 UTC] No.41872279[source]▶

>>41871879 #

No, not replacing documentation is a way to enrich documentation. That being said, that should have been clearer; I will update it.

Thanks, "This guy"

27. atoav ◴[17 Oct 24 18:30 UTC] No.41872317[source]▶

>>41872163 #

Not sure about this, but I like it the way it is done in the Rust ecosystem.

In Rust, there are two types of comments. Regular ones (e.g. starting with //) and doc-comments (e.g. starting with ///). The latter will land in in the generated documentation when you run cargo doc.

And now the cool thing: If you have example code in these doc comments, e.g. to explain how a feature of your library can be used, that script will automatically become part of the tests per default. That means you are unlikey to forget to update these examples when your code changes and you can use them as tests at the same time by asserting something at the end (which also communicates the outcome to the reader).

replies(4): >>41872703 #>>41872805 #>>41874298 #>>41876631 #

28. lucianbr ◴[17 Oct 24 18:31 UTC] No.41872337[source]▶

>>41872231 #

So again, any inputs outside of those exemplified in unit tests are unspecified behaviour? How would this work for mathematical operators for example?

replies(2): >>41872555 #>>41875288 #

29. MathMonkeyMan ◴[17 Oct 24 18:31 UTC] No.41872339{4}[source]▶

>>41872241 #

I wonder if there's some conservation law for "concerted mental effort." As if by spending time and energy on the often exasperating task of keeping documentation relevant, you reduce the time and energy required to comprehend the system.

You're right, it is a matter of culture and discipline. It's much harder to maintain a consistent and legible theory of a software component than it is to wing it with your 1-2 other teammates. Naming things is hard, especially when the names and their meanings eventually change.

30. evil-olive ◴[17 Oct 24 18:32 UTC] No.41872352[source]▶

>>41872129 #

unit vs integration tests is not an either/or. you need both, and in appropriate coverage amounts.

a common way to think about this is called the "test pyramid" - unit tests at the base, supporting integration tests that are farther up the pyramid. [0]

roughly speaking, the X-axis of the pyramid is number of test cases, the Y-axis is number of dependencies / things that can cause a test to fail.

as you travel up the Y-axis, you get more "lifelike" in your testing...but you also generally increase the time & complexity it takes to find the root-cause of a test failure.

many times I've had to troubleshoot a failure in an integration test that is trying to test subsystem A, and it turns out the failure was caused by unrelated flakiness in subsystem B. it's good to find that flakiness...but it's also important to be able to push that testing "down the pyramid" and add a unit test of subsystem B to prevent the flakiness from reoccurring, and to point directly at the problem if it does.

> Unit tests have limited benefits overall, and add a bunch of support time, slowing down development

unit tests, _when done poorly_, have limited benefits, require additional maintenance, and slow down development.

integration tests can also have limited benefits, require additional maintenance, and slow down development time, _when done poorly_.

testing in general, _when done well_, increases development velocity and improves product quality in a way that completely justifies the maintenance burden of the additional code.

0: https://martinfowler.com/articles/practical-test-pyramid.htm...

replies(2): >>41876521 #>>41884613 #

31. wubrr ◴[17 Oct 24 18:33 UTC] No.41872373[source]▶

>>41872206 #

Code as documentation is not nonsense at all. I do think high quality documentation should exist on it's own, but cleanly written and organized, well-commented code that is easy to read and understand is extremely valuable for many reasons. It IS a huge part of the documentation for the technical people that have to maintain the code and/or use it in advanced/specialized ways.

replies(1): >>41872984 #

32. monocasa ◴[17 Oct 24 18:34 UTC] No.41872378[source]▶

>>41872163 #

Unit tests can explain nothing. But so can paragraphs of prose.

The benefit of explanations in tests is that running them gets you closer to knowing if any of the explanations have bit rotted.

replies(1): >>41872921 #

33. Ygg2 ◴[17 Oct 24 18:35 UTC] No.41872394{3}[source]▶

>>41872141 #

> Keeping documentation up to date is a hard problem that I haven't yet seen solved in my career.

Rust doctests. They unite documentation and unit test. Basically documentation that's never so out of sync its assert fail.

34. sbuttgereit ◴[17 Oct 24 18:37 UTC] No.41872424{3}[source]▶

>>41872141 #

Elixir's documentation (ExDoc) & unit testing framework (ExUnit) doesn't solve this problem but provides a facility to ease it a bit.

In the documentation, you can include code examples that, if written a certain way, not only looks good when rendered but can also be tested for their form and documented outputs as well. While this doesn't help with the descriptive text of documentation, at least it can flag you when the documented examples are no longer valid... which can in turn capture your attention enough to check out the descriptive elements of that same area of documentation.

This isn't to say these documentation tests are intended to replace regular unit tests: these documentation tests are really just testing what is easily testable to validate the documentation, the code examples.

Something can be better than nothing and I think that's true here.

35. kubectl_h ◴[17 Oct 24 18:38 UTC] No.41872436[source]▶

>>41871629 (OP) #

I am starting to notice more and more unit tests in my org are written by AI -- I'm guessing usually after the implementation. I know this because I have, guiltily, done it and can tell when someone else has done it as well. I don't think anything can be done about this technically so it probably needs to be something discussed socially within the team.

replies(1): >>41872516 #

36. hitchdev ◴[17 Oct 24 18:41 UTC] No.41872469{4}[source]▶

>>41872241 #

I dont think it is about discipline. Discipline is required if you're duplicating tedious work, not for creativity.

At its core, a good test will take an example and do something with it to demonstrate an outcome.

That's exactly what how to docs do - often with the exact same examples.

Logically, they should be the same thing.

You just need a (non turing complete) language that is dual use - it generates docs and runs tests.

For example:

https://github.com/crdoconnor/strictyaml/blob/master/hitch/s...

And:

https://hitchdev.com/strictyaml/using/alpha/scalar/email-and...

replies(2): >>41875786 #>>41882124 #

37. __MatrixMan__ ◴[17 Oct 24 18:41 UTC] No.41872470[source]▶

>>41872163 #

Often, tests are parameterized over lists of cases such that you can document the general case near the code and document the specific cases near each parameter. I've even seen test frameworks that consume an excel spreadsheet provided by product so that the test results are literally a function of the requirements.

Would we prefer better docs than some comments sprinkled in strategic places in test files? Yes. Is having them with the tests maybe the best we can do for a certain level of effort? Maybe.

If the alternative is an entirely standalone repository of docs which will probably not be up to date, I'll take the comments near the tests. (Although I don't think this approach lends itself to unit tests.)

38. jaredcwhite ◴[17 Oct 24 18:43 UTC] No.41872507[source]▶

>>41871629 (OP) #

I very much disagree with this.

Good code can be documentation, both in the way it's written and structured and obviously in the form of comments.

Good tests simply verify what the author of the test believes the behavior of what is being tested should be. That's it. It's not documentation, it rarely "explains" anything, and any time someone eschews actually writing documentation in the form of good code hygiene and actual docs in favor of just writing tests causes the codebase to suffer.

replies(1): >>41874406 #

39. _thisdot ◴[17 Oct 24 18:44 UTC] No.41872516[source]▶

>>41872436 #

What is wrong with this? Tests involve a lot of hardcoding and mocking. I see this as an excellent use case for AI.

replies(1): >>41872801 #

40. ◴[17 Oct 24 18:45 UTC] No.41872536[source]▶

>>41871877 #

41. worldsayshi ◴[17 Oct 24 18:46 UTC] No.41872545[source]▶

>>41872163 #

One: Can we test the tests using some somewhat formal specification of the why?

Two: my intuition says that exhaustively specifying the intended input output pairs would only hold marginal utility compared to testing a few well selected input output pairs. It's more like attaching the corners of a sheet to the wall than gluing the whole sheet to the wall. And glue is potentially harder to remove. The sheet is n-dimensional though.

replies(1): >>41872651 #

42. viraptor ◴[17 Oct 24 18:47 UTC] No.41872555{3}[source]▶

>>41872337 #

A part of this lives in the spec name, and a part in the assumption that the Devs are not psychos. As in, if you test that sum(a,b) returns a sum of your numbers, the name/description of the test says so. And the second part means that it should hold for all numbers and the exceptions would be tested explicitly - nobody added "if a=5 & b=3 return 'foobar'" to it.

43. benrutter ◴[17 Oct 24 18:52 UTC] No.41872611[source]▶

>>41871629 (OP) #

I've heard "tests are documentation" a lot, and even said it without thinkibg much myself. It sounds good, and I definitely like the idea of it, but I'm not sure it's true. Here's my thinking:

- I've never tried to understand a code base by looking at the tlunit tests first. They often require more in depth understanding (due to things like monkeypatching) than just reading the code. I haven't seen anyone else attempt this either.

- Good documentation is good as far as it aids understanding. This might be a side effect of tests, but I don't think it's their goal. A good test will catch breaks in behaviour, I'd never trade completeness for readability in tests, in docs it's the reverse.

So I think maybe, unit tests are just tests? They can be part of your documentation, but calling them documentation in and of themselves I think is maybe just a category error?

replies(4): >>41875780 #>>41877033 #>>41878206 #>>41879710 #

44. avensec ◴[17 Oct 24 18:52 UTC] No.41872618[source]▶

>>41872129 #

Your point is valid, and some of the dialog in the replies to your comment is also valid. So, I'm just responding to the root of the dialog. What architectures are you working with that suggest higher integration test strategies?

I'd suggest that the balance between Unit Test(s) and Integration Test(s) is a trade-off and depends on the architecture/shape of the System Under Test.

Example: I agree with your assertion that I can get "90%+ coverage" of Units at an integration test layer. However, the underlying system would suggest if I would guide my teams to follow this pattern. In my current stack, the number of faulty service boundaries means that, while an integration test will provide good coverage, the overhead of debugging the root cause of an integration failure creates a significant burden. So, I recommend more unit testing, as the failing behaviors can be identified directly.

And, if I were working at a company with better underlying architecture and service boundaries, I'd be pointing them toward a higher rate of integration testing.

So, re: Kent Dodds "we write tests for confidence and understanding." What layer we write tests at for confidence and understanding really depends on the underlying architectures.

replies(3): >>41877989 #>>41881632 #>>41884604 #

45. lucianbr ◴[17 Oct 24 18:55 UTC] No.41872651{3}[source]▶

>>41872545 #

I really don't understand the "exhaustive specification" thing. How else is software supposed to work but with exhaustive specification? Is the operator + not specified exhaustively? Does your intuition tell you it is enough to give some pairs of numbers and their sums, and no need for some words that explain + computes the algebraic sum of its operands? There are an infinite number of functions of two arguments that pass through a finite number of specified points. Without the words saying what + does, it could literally do anything outside the test cases.

Of course, for + it's relatively easy to intuit what it is supposed to mean. But if I develop a "joe's interpolation operator", you think you'll understand it well enough from 5-10 unit tests, and actually giving you the formula would add nothing? Again I find myself wondering if I'm missing some english knowledge...

Can you imagine understanding the Windows API from nothing but unit tests? I really cannot. No text to explain the concepts of process, memory protection, file system? There is absolutely no way I would get it.

replies(2): >>41873282 #>>41874180 #

46. croes ◴[17 Oct 24 18:55 UTC] No.41872655{4}[source]▶

>>41872144 #

Isn’t that the same as the what?

47. lucianbr ◴[17 Oct 24 19:00 UTC] No.41872703{3}[source]▶

>>41872317 #

Yeah, combining unit tests and written docs in various ways seems fine. My reading of the article was that the tests are the only documentation. Maybe that was not the intent but just a bad interpretation on my part.

Though some replies here seem to keep arguing for my interpretation, so it's not just me.

replies(1): >>41876048 #

48. advisedwang ◴[17 Oct 24 19:08 UTC] No.41872778[source]▶

>>41872231 #

spec and documentation are things different though?

replies(1): >>41877668 #

49. JonChesterfield ◴[17 Oct 24 19:10 UTC] No.41872801{3}[source]▶

>>41872516 #

Generating tests that match the implementation doesn't tell you the implementation is doing the right thing. If it isn't, changing the implementation will break the tests, which in the best case wastes time and in the worst means the bugfix is abandoned.

I deeply hate "regression tests" that turn red when the implementation changes, so you regenerate the tests to match the new implementation and maybe glance at the diff, but the diff is thousands of lines long so really it's not telling you anything other than "something changed".

50. chrisweekly ◴[17 Oct 24 19:11 UTC] No.41872805{3}[source]▶

>>41872317 #

Does your IDE handle syntax-highlighting and intellisense -type enhancements for these unit tests written as doc-comments?

replies(1): >>41888867 #

51. eesmith ◴[17 Oct 24 19:17 UTC] No.41872845[source]▶

>>41871629 (OP) #

I did not agree with most of the advice. Here are some examples:

> Unit tests explain [expected] code behavior

Unit tests rarely evaluate performance, so can't explain why something is O(n) vs O(n^2), or if it was supposed to be one or the other.

And of course the unit tests might not cover the full range of behaviors.

> Unit tests are always in sync with the code

Until you find out that someone introduced a branch in the code, eg, for performance purposes (classic refactor step), and forgot to do coverage tests to ensure the unit tests exercised both branches.

> Unit tests cover edge cases

Note the True Scotsman fallacy there? 'Good unit tests should also cover these cases' means that if it didn't cover those cases, it wasn't good.

I've seen many unit tests which didn't cover all of the edge cases. My favorite example is a Java program which turned something like "filename.txt" into "filename_1.txt", where the "_1" was a sequence number to make it unique, and ".txt" was required.

Turns out, it accepted a user-defined filename from a web form, which could include a NUL character. "\x00.txt" put it in an infinite loop due to it's incorrect error handling of "", which is how the Java string got interpreted as a filename.

> Descriptive test name

With some test systems, like Python's unittest, you have both the test name and the docstring. The latter can be more descriptive. The former might be less descriptive, but easier to type or select.

> Keep tests simple

That should be 'Keep tests understandable'. Also, 'too many' doesn't contribute information as by definition it's beyond the point of being reasonable.

52. mannykannot ◴[17 Oct 24 19:26 UTC] No.41872921{3}[source]▶

>>41872378 #

> The benefit of explanations in tests is...

What you appear to have in mind here is the documentation of a test. Any documentation that correctly explains why it matters that the test should pass will likely tell you something about what the purpose of the unit is, how it is supposed to work, or what preconditions must be satisfied in order for it to work correctly, but the first bullet point in the article seems to be making a much stronger claim than that.

The observation that both tests and documentation may fail to explain their subject sheds no light on the question of whether (or to what extent) tests in themselves can explain the things they test.

53. danielovichdk ◴[17 Oct 24 19:32 UTC] No.41872973[source]▶

>>41872163 #

This is one of those thing that is "by philosophy", and I understand, i think, what you are saying.

I do think that tests should not explain the why, that would be leaking too much detail, but at the same time the why is somewhat the result of the test. A test is a documentation of a regression, not of how code it tests is implemented/why.

The finite number of cases is interesting. You can definitely run single tests with a high number of inputs which of course is still finite but perhaps closer to a possible way of ensuring validity.

54. Etheryte ◴[17 Oct 24 19:33 UTC] No.41872984{3}[source]▶

>>41872373 #

Yes, except this is not what people talk about when they say code is the documentation. What's meant in that context is no documentation and only code, with the idea that you can always read the code if you need to figure something out. Which, of course, is nonsense.

replies(1): >>41873475 #

55. latchkey ◴[17 Oct 24 19:35 UTC] No.41872995[source]▶

>>41871629 (OP) #

Nobody is mentioning this. Tests are for change over time, they are not just for testing the output is the same.

When you have a codebase sitting around rotting for years and you need to go back and refactor things to add a feature or change the behavior, how do you know you aren't breaking some dependent code down the line?

What happens when you upgrade a 3rd party dependency, how do you know it isn't breaking your code? The javascript ecosystem is rife with this. You can't upgrade anything years later or you have to start over again.

Tests are especially important when you've quit your company and someone else is stuck maintaining your code. The only way they can be sure to have all your ingrained knowledge is to have some sort of reliable way of knowing when things break.

Tests are for preventing the next developer from cursing you under their breath.

replies(1): >>41876009 #

56. danielovichdk ◴[17 Oct 24 19:36 UTC] No.41873005[source]▶

>>41871629 (OP) #

Unit tests is documentation of assertions. Hence it documents the result of how the code results to specification.

It's of course not documentation in the sense of a manual to the detail of code it exercises, but it definitely helps if tests are proper crafted.

57. timeon ◴[17 Oct 24 20:08 UTC] No.41873254[source]▶

>>41871629 (OP) #

"// The Coder Cafe"

if it had "///" it could have test in docs: https://doc.rust-lang.org/stable/book/ch14-02-publishing-to-...

58. __MatrixMan__ ◴[17 Oct 24 20:11 UTC] No.41873282{4}[source]▶

>>41872651 #

The thing about Joe's interpolation operator is that Joe doesn't work here anymore but thousands of users are relying on his work and we need to change it such that as few of them scream as possible.

That's the natural habitat for code, not formally specified, but partially functioning in situ. Often the best you can do is contribute a few more test cases towards a decent spec for existing code because there just isn't time to re-architect the thing.

If you are working with code in an environment where spending time improving the specification can be made a prerequisite of whatever insane thing the stakeholders want today... Hang on to that job. For the rest of us, it's a game of which-hack-is-least-bad.

replies(1): >>41875738 #

59. wubrr ◴[17 Oct 24 20:33 UTC] No.41873475{4}[source]▶

>>41872984 #

Nah, that's your own (incorrect) interpretation, the first result of googling 'code as documentation' [0], starts off with:

> Almost immediately I feel the need to rebut a common misunderstanding. Such a principle is not saying that code is the only documentation.

[0] https://martinfowler.com/bliki/CodeAsDocumentation.html

60. alphanumeric0 ◴[17 Oct 24 20:35 UTC] No.41873496{3}[source]▶

>>41872139 #

The code, tests and comments convey what actual business rules are implemented.

While documentation is someone's non-precise natural language expression of what (to the best of their imperfect human capacity) expected the code to implement at the time of writing.

replies(2): >>41873994 #>>41875753 #

61. lcall ◴[17 Oct 24 20:54 UTC] No.41873690[source]▶

>>41872163 #

At least sometimes, it really helps for a test to say WHY it is done that way. I had a case where I needed to change some existing code, and all the unit tests passed but one. The author was unavailable. It was very unclear whether I should change the test. I asked around. I was about to commit the changes to the code and test when someone came back from vacation and helpfully explained. I hope I added a useful comment.

62. alphanumeric0 ◴[17 Oct 24 21:01 UTC] No.41873752[source]▶

>>41872038 #

- What is it? - What does it do? - Why does it do that?

This could all easily fit in the top-level comments of a main() function or the help text of a CLI app.

- What is the API?

This could be gleaned from the code, either by reading it or by generating automatic documentation from it.

- What does it return?

This is commonly documented in function code.

- What are some examples of proper, real world usage (that don't involve foo/bar but instead, real world inputs/outputs I'd likely see)?

This is typically in comments or help text if it's a CLI app.

63. Etherlord87 ◴[17 Oct 24 21:17 UTC] No.41873888[source]▶

>>41872163 #

Documentation:

> returns a sum of reciprocals of inputs

Unit Test:

    assert_eq(foo(2, 5), 1/2 + 1/5)
    assert_eq(foo(4, 7), 1/4 + 1/7)
    assert_eq(foo(10, 100, 10000), 0.1101)

64. bluefirebrand ◴[17 Oct 24 21:28 UTC] No.41873994{4}[source]▶

>>41873496 #

Yes, it is absolutely more valuable to know what the code "should" be doing than to know what the code is doing

Otherwise there is no way to know what is expected behavior or just a mistake built into it by accident

replies(1): >>41875761 #

65. worldsayshi ◴[17 Oct 24 21:50 UTC] No.41874180{4}[source]▶

>>41872651 #

I suspect we're thinking about quite different use cases for our testing code. If the input-output pairs are describing a highly technical relationship I would probably want a more rigorous testing procedure. Possibly proofs.

Most of the tests I write daily is about moving and transforming data in ways that are individually rather trivial, but when features pile up, keeping track of all requirements is hard, so you want regression tests. But you also don't want a bunch of regression tests that are hard to change when you change requirements, which will happen. So you want a decent amount of simple tests for individually simple requirements that make up a complex whole.

66. readline_prompt ◴[17 Oct 24 22:06 UTC] No.41874298{3}[source]▶

>>41872317 #

Doctests are great aren't they?

replies(1): >>41874460 #

67. RangerScience ◴[17 Oct 24 22:22 UTC] No.41874406[source]▶

>>41872507 #

I disagree in a very specific and limited way: good tests show you how to use the code, which can be as simple as just “here’s some typical parameters for this function.”

In more complex situations, good tests also show you the environmental set up - for example, all the various odd database records the code needs or expects.

It’s not everything you’d want out of a doc, but it’s a chunk of it.

replies(2): >>41875127 #>>41875744 #

68. janalsncm ◴[17 Oct 24 22:22 UTC] No.41874409[source]▶

>>41871629 (OP) #

Code and tests tell you what. They don’t tell you why. And if there’s a bug not covered in the tests, neither code nor tests can help you figure that out.

69. TeMPOraL ◴[17 Oct 24 22:29 UTC] No.41874460{4}[source]▶

>>41874298 #

IDK, they sound like they overflow the "maximum code" counter and land up straight in the literate programming land. I wonder how far you could go writing your whole program as doctests spliced between commentary.

replies(1): >>41879065 #

70. bunderbunder ◴[17 Oct 24 22:32 UTC] No.41874483[source]▶

>>41871629 (OP) #

I share this ideal, but also have to gripe that "descriptive test name" is where this falls apart, every single time.

Getting all your teammates to quit giving all their tests names like "testTheThing" is darn near impossible. It's socially painful to be the one constantly nagging people about names, but it really does take constant nagging to keep the quality high. As soon as the nagging stops, someone invariably starts cutting corners on the test names, and after that everyone who isn't a pedantic weenie about these things will start to follow suit.

Which is honestly the sensible, well-adjusted decision. I'm the pedantic weenie on my team, and even I have to agree that I'd rather my team have a frustrating test suite than frustrating social dynamics.

Personally - and this absolutely echoes the article's last point - I've been increasingly moving toward Donald Knuth's literate style of programming. It helps me organize my thoughts even better than TDD does, and it's earned me far more compliments about the readability of my code than a squeaky-clean test suite ever does. So much so that I'm beginning to hold hope that if you can build enough team mass around working that way it might even develop into a stable equilibrium point as people start to see how it really does make the job more enjoyable.

replies(20): >>41874655 #>>41874662 #>>41874705 #>>41875392 #>>41875790 #>>41875904 #>>41875926 #>>41876835 #>>41876977 #>>41877265 #>>41877415 #>>41877434 #>>41877459 #>>41877538 #>>41878062 #>>41878426 #>>41878897 #>>41879455 #>>41879817 #>>41880385 #

71. byyll ◴[17 Oct 24 22:42 UTC] No.41874553[source]▶

>>41871629 (OP) #

Write your unit tests all you want but they are not documentation.

72. tqi ◴[17 Oct 24 22:50 UTC] No.41874611[source]▶

>>41871629 (OP) #

Without further documentation (beyond a descriptive test name), I fear that unit tests inevitably become a kind of Chesterton's Fence...

73. tpoacher ◴[17 Oct 24 22:58 UTC] No.41874655[source]▶

>>41874483 #

Obviously this is slightly implementation dependent but if your tests are accompanied by programmatic documentation (that is output together with the test), doesn't that eliminate the need for a descriptive test name in the first place?

If anything, in this scenario, I wouldn't even bother printing the test names, and would just give them generated identifier names instead. Otherwise, isn't it a bit like expecting git hashes to be meaningful when there's a commit message right there?

replies(1): >>41878135 #

74. zoover2020 ◴[17 Oct 24 22:59 UTC] No.41874662[source]▶

>>41874483 #

Have you considered a linter rule for test names? Both Checkstyle and ESLint did great work for our team

75. wubrr ◴[17 Oct 24 23:05 UTC] No.41874705[source]▶

>>41874483 #

> It's socially painful to be the one constantly nagging people about names, but it really does take constant nagging to keep the quality high.

What do test names have to do with quality? If you want to use it as some sort of name/key, just have a comment/annotation/parameter that succinctly defines that, along with any other metadata you want to add in readable English. Many testing frameworks support this. There's exactly zero benefit toTryToFitTheTestDescriptionIntoItsName.

replies(7): >>41874814 #>>41874867 #>>41875382 #>>41876013 #>>41876871 #>>41876888 #>>41877002 #

76. 6r17 ◴[17 Oct 24 23:23 UTC] No.41874814{3}[source]▶

>>41874705 #

That's not the point of the article. The code should be readable no exception. The only reason we should be ysing x y z are for coordinates ; i should be left for index_what ; same goes for parameters ; they should also contain what unit they are on (not scale, but scale_float) only exception I see are typed languages ; and even then I'm occasionally asked a detail about some obscure parameter that we set up a year ago. I understand it can sound goofy, but the extra effort is made towards other people working on the project, or future self. There is no way I can remember keys or where I left the meaning of those, and there is no justification to just write it down.

Readability of the code makes a lot of it's quality. A working code that is not maintainable will be refactored. A non working cofe that is maintainable will be fixed.

replies(1): >>41881251 #

77. 8note ◴[17 Oct 24 23:31 UTC] No.41874867{3}[source]▶

>>41874705 #

It's important to this article because its claiming that the name is coupled functionally to what the code tests -- that the test will fail if the name is wrong.

I don't know if any test tools that work like that though.

replies(1): >>41875993 #

78. tpoacher ◴[17 Oct 24 23:33 UTC] No.41874890[source]▶

>>41872163 #

Unit tests are programmatic specification. I'm assuming it is in this manner that the article is referring to them as documentation, rather than as "explanations" per se.

Obviously unit tests cannot enumerate all inputs, but as a form of programmatic specification, neither do they have to.

For the case you mention where a broad relation should hold, this is a special kind of unit test strategy, which is property testing. Though admittedly other aspects of design-by-contract are also better suited here; nobody's claiming that tests are the best or only programmatic documentation strategy.

Finally, there's another kind of unit testing, which is more appropriately called characterisation testing, as per M. Feathers book on legacy code. The difference being, unit tests are for developing a feature and ensuring adherence to a spec, whereas characterisation tests are for exploring the actual behaviour of existing code (which may or may not be behaving according to the intended spec). These are definitely then tests as programmatic documentation.

replies(1): >>41879055 #

79. starkparker ◴[17 Oct 24 23:34 UTC] No.41874894{3}[source]▶

>>41872141 #

Not that this solves the hard problem, but there's a simonw post for that: https://simonwillison.net/2018/Jul/28/documentation-unit-tes...

Including screenshots, which a lot of tech writing teams raise as a maintenance burden: https://simonwillison.net/2022/Oct/14/automating-screenshots...

Then there are tools like Doc Detective to inline tests in the docs, making them dependent on each other; if documented steps stop working, the test derived from them fails: https://doc-detective.com/

80. m000 ◴[17 Oct 24 23:35 UTC] No.41874910[source]▶

>>41872163 #

One - Why do you care how you got there? You need to read the code for that. But the tests do explain/document how you can expect the test to work. If the code is unreadable, well that sucks. But you at least have a programmatic (and hopefully annotated) description of how the code is expected to work, so you have a stable base for rewritting it to be more clear.

Two - Ever heard of code coverage? Type systems/type checkers? Also, there's nothing precluding you from using assertions in the test that make any assumed relations explicit before you actually test anything.

81. ssalka ◴[17 Oct 24 23:49 UTC] No.41874992[source]▶

>>41871629 (OP) #

I forget where I heard this, but early in my career someone described unit tests to me as "a contract between you and your code." Which seems largely true – when I write a test, I'm saying "this is how a given function should behave, and that contract should hold true over time." If my future self wants the code to behave differently, so be it, but the contract needs to be amended so that the new code changes are also in agreement with it.

Conversely, if you fail to write a unit test, there is no contract, and the code can freely diverge over time from what you think it ought to be doing.

82. hannasm ◴[17 Oct 24 23:52 UTC] No.41875013[source]▶

>>41871629 (OP) #

I like the idea of this article but I would say that it's actually integration tests that are documentation.

When learning a new codebase, and I'm looking for an example of how to use feature X I would look in the tests first or shortly after a web search.

It seems to me like the second half of this article also undermines the main idea and goal of using unit tests in this way though.

  > Descriptive test name, Atomic, Keep tests simple, Keep tests independent

A unit test that is good at documenting the system needs to be comprehensive, clear and in many cases filled with complexity that a unit test would ignore or hide.

A test with a bunch of mocks, helpers, overrides and assumptions does not help anyone understand things like how to use feature X or the correct way to solve a problem with the software.

There are merits to both kinds of tests in their time and place but good integration tests are really the best ones for documenting and learning.

replies(1): >>41876063 #

83. hombre_fatal ◴[18 Oct 24 00:06 UTC] No.41875095[source]▶

>>41871629 (OP) #

I like how they couldn't be bothered to show examples of this ideal unit test code they think is just as good as documentation, just like people who can't be bothered to write docs.

In reality, except for the most trivial projects or vigilant test writers, tests are too complicated to act as a stand in for docs.

They are usually abstract in an effort to DRY things up such that you don't even get to see all the API in one place.

I'd rather keep tests optimized for testing rather than nerfing them to be readable to end users.

84. bluefirebrand ◴[18 Oct 24 00:11 UTC] No.41875127{3}[source]▶

>>41874406 #

> good tests show you how to use the code

If you can't find examples of how to use the code in the code then why does the code even exist?

replies(2): >>41875986 #>>41877941 #

85. tmoertel ◴[18 Oct 24 00:14 UTC] No.41875148[source]▶

>>41872163 #

In some cases, unit tests can both test and specify the semantics of the system being tested. My favorite example is the ReadP parsing library for Haskell. The source code ends with a short and automatically testable specification of the semantics of the combinators that make up the library. So, in this example, the tests tell you almost everything you need to know about the library.

https://hackage.haskell.org/package/ghc-internal-9.1001.0/do...

86. exabrial ◴[18 Oct 24 00:25 UTC] No.41875202[source]▶

>>41871629 (OP) #

If you want to see how to do this right, go look at the CDI specification for Java.

Every statement in the spec has a corresponding unit test, and it’s unbelievably incredible. Hats of to everyone that worked on this.

87. samatman ◴[18 Oct 24 00:39 UTC] No.41875288{3}[source]▶

>>41872337 #

Induction is a valid form of inference.

88. ◴[18 Oct 24 00:50 UTC] No.41875354[source]▶

>>41871879 #

89. seadan83 ◴[18 Oct 24 00:55 UTC] No.41875382{3}[source]▶

>>41874705 #

What kinds of things would you say are best as annotation vs in the test method name? Would you mind giving a few examples?

Also, are you a fan of nesting test classes? Any opinions? Eg:

Class fibrulatatorTest {

  Class highVoltages{

      Void tooMuchWillNoOp() {}
      Void maxVoltage() {}

} }

replies(2): >>41876463 #>>41881313 #

90. yourapostasy ◴[18 Oct 24 00:56 UTC] No.41875392[source]▶

>>41874483 #

> ...increasingly moving toward Donald Knuth's literate style of programming.

I've been wishing for a long time that the industry would move towards this, but it is tough to get developers to write more than performative documentation that checks an agile sprint box, much less get product owners to allocate time test the documentation (throw someone unfamiliar with the code to do something small with it armed with only its documentation, like code another few necessary tests and document them, and correct the bumps in the consumption of the documentation). Even tougher to move towards the kind of Knuth'ian TeX'ish-quality and -sophistication documentation, which I consider necessary (though perhaps not sufficient) for taming increasing software complexity.

I hoped the kind of deep technical writing at large scales supported by Adobe Framemaker would make its way into open source alternatives like Scribus, but instead we're stuck with Markdown and Mermaid, which have their place but are painful when maintaining content over a long time, sprawling audience roles, and broad scopes. Unfortunate, since LLM's could support a quite rich technical writing and editing delivery sitting on top of a Framemaker-feature'ish document processing system oriented towards supporting literal programming.

91. 1980phipsi ◴[18 Oct 24 00:57 UTC] No.41875397[source]▶

>>41871629 (OP) #

D has documented unit tests.

https://dlang.org/spec/unittest.html#documented-unittests

Nice when combined with CI since you’ll know if you accidentally break your examples.

92. youainti ◴[18 Oct 24 01:50 UTC] No.41875674[source]▶

>>41871629 (OP) #

Something I've been thinking of is that unit tests may now become useful as examples to be input into LLMs. If each function has a couple of tests with appropriate documentation, that may be useful as RAG input.

93. 8n4vidtmkvmk ◴[18 Oct 24 01:51 UTC] No.41875681[source]▶

>>41872163 #

Yes, actually. Sometimes the edge cases that aren't covered by unit tests are undefined behavior. I don't recommend doing this frequently but sometimes it's hard to know the best way to handle weird edge cases until you gather more use cases so deliberately not writing a test for such things is a legit strategy IMO. You should probably also add to the method doc comment that invoking with X is not well defined.

94. lihaoyi ◴[18 Oct 24 01:54 UTC] No.41875694[source]▶

>>41871629 (OP) #

I make heavy use of this idea in many of my open source projects. I've tried a variety of approaches:

* ScalaSql, where the reference docs (e.g. https://github.com/com-lihaoyi/scalasql/blob/main/docs/refer...) are generated by running unit tests (e.g. https://github.com/com-lihaoyi/scalasql/blob/53cbad77f7253f3...)

* uPickle, where the documentation site (https://com-lihaoyi.github.io/upickle/#GettingStarted) is generated by the document-generator which has syntax to scrape (https://github.com/com-lihaoyi/upickle/blob/004ed7e17271635d...) the unit tests without running them (https://github.com/com-lihaoyi/upickle/blob/main/upickle/tes...)

* OS-Lib, where the documentation examples (e.g. https://github.com/com-lihaoyi/os-lib?tab=readme-ov-file#osr...) are largely manually copy-pasted from the unit tests (e.g. https://github.com/com-lihaoyi/os-lib/blob/9e7efc36355103d71...) into the readme.md/adoc

It's a good idea overall to share unit tests and documentation, but there is a lot of subtlety around how it must be done. Unit tests and documentation have many conflicting requirements, e.g.

* Unit tests prefer thoroughness to catch unintuitive edge cases whereas documentation prefers highlighting of key examples and allowing the reader to intuitively interpolate

* Unit tests examples prefer DRY conciseness whereas documentation examples prefer self-contained-ness

* Unit tests are targeted at codebase internal developers (i.e. experts) whereas documentation is often targeted at external users (i.e. non-experts)

These conflicting requirements mean that "just read the unit tests" is a poor substitute for documentation. But there is a lot of overlap, so it is still worth sharing snippets between unit tests and examples. It just needs to be done carefully and with thought given handling the two sets of conflicting requirements

95. valenterry ◴[18 Oct 24 01:58 UTC] No.41875719[source]▶

>>41871629 (OP) #

Okay, I'll go with it: statically defined types are also documentation!

And the method names are equivalent to the test names. Of course, only if you don't wildly throw around exceptions or return null (without indicating it clearly in the type signature).

96. invaderzirp ◴[18 Oct 24 02:00 UTC] No.41875738{5}[source]▶

>>41873282 #

What's stopping someone from reading the code, studying it deeply, and then writing down what it does? That's what I do, but I see people struggle with it because they just want to get more tickets done.

replies(2): >>41876363 #>>41876389 #

97. invaderzirp ◴[18 Oct 24 02:01 UTC] No.41875744{3}[source]▶

>>41874406 #

Erm, this is what docs are for. Like a man page, where it'll give you function signatures and return types and explain what those functions do. Unit tests are not that, and they shouldn't be that, because that's not their purpose.

New rule: if you write a function, you also have to write down what it does, and why. Deal?

98. invaderzirp ◴[18 Oct 24 02:03 UTC] No.41875753{4}[source]▶

>>41873496 #

"Oh yeah, those tests are always flaky. We just rerun them until they pass. Or we turn them off. I mean, Jeff wrote them like three years ago and he quit last year, so..."

I'd rather have the prose. And if it's wrong, then fix it. I'm so tired of these excuses.

99. invaderzirp ◴[18 Oct 24 02:05 UTC] No.41875761{5}[source]▶

>>41873994 #

Especially on teams where deviance has been normalized, and broken things are just expected. I've been bitten both ways before: is this an obvious mistake? Or the lynchpin holding up the house of cards? Of course, if someone had just written some text explaining it, or perhaps a decent commit message instead of just "WIP", maybe we wouldn't have to do archaeology every single time.

100. andoando ◴[18 Oct 24 02:08 UTC] No.41875780[source]▶

>>41872611 #

I definitely do look at tests to see examples of how library is meant to be used. But thats quite different

101. invaderzirp ◴[18 Oct 24 02:09 UTC] No.41875786{5}[source]▶

>>41872469 #

No, you just need to both understand how your system works and then clearly write down what it's doing and why. If projects like Postgres and SQLite and musl libc and the Linux kernel can all do it, I think the CRUD app authors can do it, too. But it's not magic, and another tool won't solve it (source: I've seen a hundred of these tools on scores of teams, and they don't help when people have no clue what's happening and then they don't write anything down).

102. ronnier ◴[18 Oct 24 02:10 UTC] No.41875790[source]▶

>>41874483 #

I’d rather leave a good comment instead of good test names. I mean do both, but a good comment is better imo. All I really care about is comments anymore. Just leave context, clues, and a general idea of what it’s trying to accomplish.

replies(1): >>41876142 #

103. gorgoiler ◴[18 Oct 24 02:36 UTC] No.41875896[source]▶

>>41872163 #

For something like this:

  def get_examples(
    source: Path,
    minimum_size: float,
    maximum_size: float,
    total_size: float,
    seed: float = 123,
  ) -> Iterator[Path]:
      …

…it’s pretty obvious what those float arguments are for but the “source” is just a Path. Is there an example “source” I can look at to see what sort of thing I am supposed to pass there?

Well you could document that abstractly in the function (“your source must be a directory available via NFS to all devs as well as the build infra”) but you could also use the function in a test and describe it there, and let that be the “living documentation” of which the original author speaks.

Obviously if this is a top level function in some open source library with a readthedocs page then it’s good to actually document the function and have a test. If it’s just some internal thing though then doc-rot can be more harmful than no docs at all, so the best docs are therefore verified, living docs: the tests.

(…or make your source an enumeration type so you don’t even need the docs!)

104. hinkley ◴[18 Oct 24 02:37 UTC] No.41875904[source]▶

>>41874483 #

It’s practically a sociology experiment at this point: half of the time when I suggest people force themselves to use a thesaurus whether they think they need it or not, I get upvoted. And half the time I get downvoted until I get hidden.

People grab the first word they think of. And subconsciously they know if they obsess about the name it’ll have an opportunity cost - dropping one or more of the implementation details they’re juggling in their short term memory.

But if “slow” is the first word you think of that’s not very good. And if you look at the synonyms and antonyms you can solidify your understanding of the purpose of the function in your head. Maybe you meant thorough, or conservative. And maybe you meant to do one but actually did another. So now you can not just chose a name but revisit the intent.

Plus you’re not polluting the namespace by recycling a jargon word that means something else in another part of the code, complicating refactoring and self discovery later on.

105. gorgoiler ◴[18 Oct 24 02:44 UTC] No.41875926[source]▶

>>41874483 #

Hah, I swing the other way! If module foo had a function bar then my test is in module test_foo and the test is called test_bar.

Nine times out of ten this is the only test, which is mostly there to ensure the code gets exercised in a sensible way and returns a thing, and ideally to document and enforce the contract of the function.

What I absolutely agree with you on is that being able to describe this contract alongside the function itself is far more preferable. It’s not quite literate programming but tools like Python’s doctest offer a close approximation to interleaving discourse with machine readable implementation:

  def double(n: int) -> int:
    “””Increase by 100%

    >>> double(7)
    14
    “””
    return 2 * n

replies(1): >>41879740 #

106. esafak ◴[18 Oct 24 02:58 UTC] No.41875986{4}[source]▶

>>41875127 #

Code is not always meant to be run by strangers, if it's internal to a team. The interface for customers should be documented.

107. the_af ◴[18 Oct 24 02:59 UTC] No.41875993{4}[source]▶

>>41874867 #

That's not what the article claims at all.

It claims that, in order for tests to serve as documentation, they must follow a set of best practices, one of which is descriptive test names. It says nothing about failing tests when the name of the test doesn't match the actual test case.

Note I'm not saying whether I consider this to be good advice; I'm merely clarifying what the article states.

108. luisgvv ◴[18 Oct 24 03:02 UTC] No.41876009[source]▶

>>41872995 #

I used to work for a company that was in the fintech space and had several scenarios and rules for deciding whether a person could apply for a loan, the interest rate, prime, insurance etc. A lot of the code written was way back from the early 2000s in Visual Basic and migrated improved towards C#.

I didn't knew a thing about how the business operated and the rationale behind the loans and the transactions. The parts of the application that had unit and behavior tests were easy to work on. Everyone dreaded touching the old pieces that didn't have tests.

replies(1): >>41876438 #

109. the_af ◴[18 Oct 24 03:03 UTC] No.41876013{3}[source]▶

>>41874705 #

> What do test names have to do with quality?

The quality of the tests.

If we go by the article, specifically their readability and quality as documentation.

It says nothing about the quality of the resulting software (though, presumably, this will also be indirectly affected).

replies(1): >>41881317 #

110. the_af ◴[18 Oct 24 03:11 UTC] No.41876048{4}[source]▶

>>41872703 #

Combining is what TFA suggests. They even go as far as closing the article with:

> Note also that I’m not suggesting that unit tests should replace any form of documentation but rather that they should complement and enrich it.

111. the_af ◴[18 Oct 24 03:14 UTC] No.41876058[source]▶

>>41872163 #

I think the line of thought behind the article is making the tests be like a "living spec". Well written tests (especially those using things like QuickCheck, aka "property testing") will cover more than simply a few edge cases. I don't think many developers know how to write good test cases like this, though, so it becomes a perilous proposition.

Do note TFA doesn't suggest replacing all other forms of documentation with just tests.

112. Terretta ◴[18 Oct 24 03:14 UTC] No.41876063[source]▶

>>41875013 #

Executable documentation:

Hitchstory is a type-safe StrictYAML python integration testing framework exploring some interesting ideas around this.

https://hitchdev.com/hitchstory/

Example:

https://hitchdev.com/hitchstory/using/behavior/run-single-na...

Source:

https://github.com/hitchdev/hitchstory/blob/master/hitch/sto...

See also the explanation of self-rewriting tests:

https://hitchdev.com/hitchstory/why/rewrite/

113. lallysingh ◴[18 Oct 24 03:40 UTC] No.41876142{3}[source]▶

>>41875790 #

Four test failures in different systems, each named well, will more quickly and accurately point me to my introduced bug than comments in those systems.

Identifiers matter.

114. __MatrixMan__ ◴[18 Oct 24 04:34 UTC] No.41876363{6}[source]▶

>>41875738 #

Nothing, sounds like a great plan.

But if you want other people to benefit from it, a good place to put it is right next to a test that will start failing as soon as the code changes in a way that no longer conforms to the spec.

Otherwise those people who just want to get more tickets done will change the code without changing the spec. Or you'll end up working on something else and they'll never even know about your document, because they're accustomed to everybody else's bad habits.

If you're going to be abnormally diligent, you might as well so in a way that the less diligent can approach gradually: One test at a time.

115. dullcrisp ◴[18 Oct 24 04:40 UTC] No.41876389{6}[source]▶

>>41875738 #

The code already says what it does.

116. matheusmoreira ◴[18 Oct 24 04:44 UTC] No.41876409[source]▶

>>41871629 (OP) #

I did something like this for my language. I built an interpreter and tested it by writing input programs and expected outputs. The test suite feeds the program through the interpreter and compares the actual and expected outputs. It's just like Ruby's executable RSpec. It's so nice. Every time I add a feature, I write an example program for it and that program automatically tests the language, its features, its semantics... With good comments they could conceivably be used to teach the language.

I eventually added support for real unit tests to my test suite as well. I started testing parts of the runtime through them. Those turned out to be a lot messier than I'd hoped. Hopefully I'll be able to improve them over time by applying the principles outlined in the article.

117. latchkey ◴[18 Oct 24 04:49 UTC] No.41876438{3}[source]▶

>>41876009 #

I've got a typescript react component library that integrates 3 different projects together. It gets 36k downloads on npm every month. I started the project 5 years ago.

When I originally wrote it, I knew that I would have to maintain it over time, so I wrote a ton of unit tests that mostly were just snapshots of the html output. I have two choices, running through my relatively complicated example app by hand and verifying things still work, or writing tests. I used this project to prove to myself that tests are indeed valuable.

Over the years, I've made many releases. The 3 projects have been independently upgraded over time. The only way that I would have kept any sanity and been motivated to even work on this project (I no longer even use it myself!), is the fact that it takes almost zero effort to upgrade the dependencies, run the tests and build a release.

If there are too many things to fix, I just wait for the community to eventually submit a PR. The best part is that if they break something, it is easy to spot in the snapshots (or test failures). I can almost accept PR's without having to even read them, just because the tests pass. That's pretty cool.

118. biggc ◴[18 Oct 24 04:55 UTC] No.41876463{4}[source]▶

>>41875382 #

Table tests can enable useful test naming without a bunch of clunky named test functions. I use them most often in Go but I’m sure other languages have support

https://go.dev/wiki/TableDrivenTests

119. BeetleB ◴[18 Oct 24 05:05 UTC] No.41876506{4}[source]▶

>>41872241 #

> The only fix for that is discipline.

The one lesson I have learned over my career: Don't work in teams (or for managers) that rely on discipline to get things done. Every time I've encountered them, they've been using it as an excuse to avoid better processes.

Sure, some counterexamples exist. Chances are, those counterexamples aren't where a given reader of your comment is working.

120. BeetleB ◴[18 Oct 24 05:10 UTC] No.41876521{3}[source]▶

>>41872352 #

> unit vs integration tests is not an either/or. you need both, and in appropriate coverage amounts.

Agreed. But I also agree with the commenter that for documentation purposes, integration tests are an order of magnitude more useful.

> a common way to think about this is called the "test pyramid" - unit tests at the base, supporting integration tests that are farther up the pyramid.

I used to be a believer in that pyramid, but my experience has shown me it depends on the project. Wherever it's feasible (i.e. doesn't involve long test times), I've found integration tests to be far more useful than unit tests. I've had experiences where I'd do a project and have really high unit test coverage, only to unveil fairly trivial bugs. The reverse hasn't happened - if I start a project with solid integration tests, I almost never encounter trivial bugs.

Generally, I now write integration tests and mock away time consuming/resource heavy parts (e.g. network calls, DB calls, etc). Better for documentation. Better for testing.

replies(1): >>41877339 #

121. bni ◴[18 Oct 24 05:26 UTC] No.41876571[source]▶

>>41872129 #

I do Integrunit tests

Rules: Cover a whole functionality, not just a single file/class/function.

Can't use I/O except reading in-codebase test-data files.

Must be fast.

Mock only external technical dependencies, not own code.

122. sedatk ◴[18 Oct 24 05:39 UTC] No.41876631{3}[source]▶

>>41872317 #

I discovered that completely by surprise when I was learning Rust. I wrote code and unit tests, wrote some documentation, and I was blown away when I saw that my documentation also ran as part of the suite. Such a magical moment.

123. ◴[18 Oct 24 06:27 UTC] No.41876835[source]▶

>>41874483 #

124. yen223 ◴[18 Oct 24 06:35 UTC] No.41876871{3}[source]▶

>>41874705 #

Kotlin has an interesting approach to solving this. You can name functions using backticks, and in those backticks you can put basically anything.

So it's common to see unit tests like

  @Test
  fun `this tests something very complicated`() {
    ...
  }

replies(1): >>41877009 #

125. serial_dev ◴[18 Oct 24 06:37 UTC] No.41876888{3}[source]▶

>>41874705 #

Some languages / test tools don’t enforce testNamesLikesThisThatLookStupidForTestDescriptions, and you can use proper strings, so you can just say meaningful requirements with a readable text, like “extracts task ID from legacy staging URLs”.

It looks, feels, and reads much better.

replies(2): >>41877015 #>>41881545 #

126. misja111 ◴[18 Oct 24 06:56 UTC] No.41876977[source]▶

>>41874483 #

Thanks for the hint about Knuth's literate programming! I hadn't heard about it before but it immediately looks great. (For those of us who hadn't heard about it before either, here is a link: https://en.wikipedia.org/wiki/Literate_programming)

About your other point: I have experienced exactly the same. It just seems impossible to instill the belief into most developers that readable tests lead to faster solving of bugs. And by the way, it makes tests more maintainable as well, just like readable code makes the code more maintainable anywhere else.

127. misja111 ◴[18 Oct 24 07:03 UTC] No.41877002{3}[source]▶

>>41874705 #

It's funny, you are asking what test names have to do with quality, and you proceed with mentioning a really bad test name, 'toTryToFitTheTestDescriptionIntoItsName', and (correctly) stating that this has zero benefit.

Just like normal code, test methods should indicate what they are doing. This will help you colleague when he's trying to fix the failing test when you're not around. There are other ways of doing that of course which can be fine as well, such as describing the test case with some kind of meta data that the test framework supports.

But the problem that OP is talking about, is that many developers simply don't see the point of putting much effort into making tests readable. They won't give tests a readable name, they won't give it a readable description in metadata either.

replies(1): >>41881233 #

128. sfn42 ◴[18 Oct 24 07:04 UTC] No.41877009{4}[source]▶

>>41876871 #

You can do that in Java as well. Can't remember if it's exactly the same syntax

replies(2): >>41879165 #>>41882303 #

129. lbreakjai ◴[18 Oct 24 07:07 UTC] No.41877015{4}[source]▶

>>41876888 #

With jest (Amonsts others), you can nest the statements. I find it really useful to describe what the tests are doing:

    describe('The foo service', () => {

      describe('When called with an array of strings', () => {

        describe('And the bar API is down', () => {

          it('pushes the values to a DLQ' () => {
            // test here
          })

          it('logs the error somewhere' () => {
            // test here
          })

          it('Returns a proper error message`, () => {
            // test here
          })
        })
      })
    })

You could throw all those assertions into one test, but they’re probably cheap enough that performance won’t really take a hit. Even if there is a slight impact, I find the reduced cognitive load of not having to decipher the purpose of 'callbackSpyMock' to be a worthwhile trade-off.

replies(1): >>41877837 #

130. sixthDot ◴[18 Oct 24 07:12 UTC] No.41877033[source]▶

>>41872611 #

> Good documentation is good as far as it aids understanding. This might be a side effect of tests, but I don't think it's their goal. A good test will catch breaks in behaviour, I'd never trade completeness for readability in tests, in docs it's the reverse.

The D language standard library uses both. When you generate the documentation from the comments attached to a declaration, the following unittests (they are identified using a special markup, (that is just triple slashes...) are also included.

Example once rendered [0], in the source you see the examples are actually unit tests [1].

[0]: https://dlang.org/phobos/std_algorithm_searching.html#.all

[1]: https://github.com/dlang/phobos/blob/master/std/algorithm/se...

131. jillesvangurp ◴[18 Oct 24 07:14 UTC] No.41877041[source]▶

>>41871629 (OP) #

Tests and documentation are things that I'm increasingly using LLMs for. Writing exhaustive tests and documentation is somewhat tedious work. You do it to help others. Or if you are smart enough to realize it, your future self. But LLMs take the tedium away.

Example: I wrote a little fast api endpoint and realized that having some decent openapi documentation would be nice. Support for that is built in. So, copy paste into chat gpt, "add openapi documentation to these endpoints" paste it back. And then "write me an integration test that exercises these endpoints and tests all the exceptional responses". Simple stuff. But the point is that the generated documentation is pretty good and helpful. It's exhaustive. It documents all the essentials. I could sit down and do it manually for an hour. Or I could just generate it.

I also generated a README for the same project with instructions on how to setup all the tools and do all the key things (run tests, run a dev server, build a docker container, etc. The Dockerfile was generated too. Dockerfiles are also great as documentation artifacts because it precisely defines how to build and run your software.

LLMs are really good at documenting and summarizing things.

132. nissarup ◴[18 Oct 24 07:55 UTC] No.41877238[source]▶

>>41871629 (OP) #

For one narrow definition of documentation, yes.

I'm pretty sure our end-users would get no value out of reading the unit tests in the code of the application they are using.

133. yesbabyyes ◴[18 Oct 24 08:00 UTC] No.41877265[source]▶

>>41874483 #

As a fan of literate programming, I hope this could be a tool in the box for Node.js developers: Testy is basically doctests for Node.js, building off JSDoc's @examples stanza.

I would be honored by anyone checking it out: https://github.com/linus/testy

replies(1): >>41877481 #

134. noobermin ◴[18 Oct 24 08:06 UTC] No.41877300[source]▶

>>41871629 (OP) #

This article is almost jumping the shark. It's fine to say it's in the constellation of things that help document code, but the end where you're starting to limit the tests you can do to highlight an ideal, for example, the "atomic" bullet, that's where you're now doing things for ideology, not really for their utility. A similar thing occurred with function purity, etc, etc.

I don't know who this helps but if you're a young developer, always beware what you read on substack about how you should constrain yourself. Take them with a grain of salt.

135. troupo ◴[18 Oct 24 08:13 UTC] No.41877339{4}[source]▶

>>41876521 #

> for documentation purposes, integration tests are an order of magnitude more useful.

Not just documentation purposes. In almost all cases integration is better than unit tests: they cover the same code paths, they actually test observed behaviour of the app, etc.

Notable exceptions: complex calculations, library functions.

> I've found integration tests to be far more useful than unit tests. I've had experiences where I'd do a project and have really high unit test coverage, only to unveil fairly trivial bugs. The reverse hasn't happened - if I start a project with solid integration tests, I almost never encounter trivial bugs.

If I could upvote this several times, I would :)

136. player1234 ◴[18 Oct 24 08:21 UTC] No.41877372[source]▶

>>41871629 (OP) #

Just read the code.

137. ojkelly ◴[18 Oct 24 08:33 UTC] No.41877415[source]▶

>>41874483 #

Test names are one of those things that are painful because it’s obvious to you as you write it, but there’s an extra hassle to switch gears in your head to describe what the contents of the test is doing.

It is really valuable when they are named well.

I’ve found this is where LLM can be quite useful, they’re pretty good at summarising.

Someday soon I think we’ll see a language server that checks if comments still match what they’re documenting. The same for tests being named accurately.

replies(1): >>41877822 #

138. gus_leonel ◴[18 Oct 24 08:36 UTC] No.41877434[source]▶

>>41874483 #

Test names should be sentences: https://bitfieldconsulting.com/posts/test-names

replies(4): >>41880087 #>>41880687 #>>41881699 #>>41882785 #

139. macspoofing ◴[18 Oct 24 08:55 UTC] No.41877538[source]▶

>>41874483 #

>Getting all your teammates to quit giving all their tests names like "testTheThing" is darn near impossible.

You can do better than "testTheThing".

Have your team (or a working group composed of your team, if your team is too big) put together a set of guidelines on naming conventions for unit test methods. Have your team agree to these conventions (assumption is that the working group would have consulted with rest of team and incorporated their feedback).

Then make that part of the code review checklist (so you aren't the one that is actually enforcing the policy). Do spot checks for the first little while, or empower some individuals to be responsible for that - if you really want to. Do a retrospective after a month or 2 months to see how everyone is doing and see how successful this initiative was.

140. gpmcadam ◴[18 Oct 24 09:24 UTC] No.41877668{3}[source]▶

>>41872778 #

In TDD spec is converted to behaviours that can be ran as automated tests so if the logic of the code changes, it breaks the spec and in turn the tests

Whereas documentation can (inevitably) go stale with no feedback or build failures

141. pydry ◴[18 Oct 24 09:58 UTC] No.41877822{3}[source]▶

>>41877415 #

I've never seen this as a problem. If you're doing TDD and you have a scenario in mind, you describe that scenario in the name of the test.

If you're writing the test after then yeah, maybe it's hard, but that's one of the many reasons why it's probably better to write the test before and align it with the actual feature or bugfix you're intending to implement.

replies(1): >>41881022 #

142. chriswarbo ◴[18 Oct 24 10:01 UTC] No.41877837{5}[source]▶

>>41877015 #

The `describe`/`it` nesting pattern is quite common (I currently use it in Jest and HSpec); but it doesn't solve the social problem. It's common to see tests like:

    describe("foo", () => {
      describe("called with true", () => {
        it("returns 1", () => {
          assert(foo(someComplicatedThing, true) === 1)
        })
      })
      describe("called with false", () => {
        it("returns 12", () => {
          assert(foo(someOtherIndecipherableThing, false) === 12)
        })
      })
    })

It's the same problem as comments that repeat what the code says, rather than what it means, why it's being done that way, etc. It's more annoying in tests, since useless comments can just be deleted, whilst changing those tests would require discovering better names (i.e. investigating what it means, why it's being done that way, etc.). The latter is especially annoying when a new change causes such tests to fail.

Tests with such names are essentially specifying the function's behaviour as "exactly what it did when first written", which is ignoring (a) that the code may have bugs and (b) that most codebases are in flux, as new features get added, things get refactored, etc. They elevate implementation details to the level of specification, which hinders progress and improvement.

replies(2): >>41878364 #>>41881597 #

143. badmintonbaseba ◴[18 Oct 24 10:19 UTC] No.41877941{4}[source]▶

>>41875127 #

There are multiple problems with that:

1. The code uses internal interfaces, not meant to be used by users of the code.

2. The code might not use the high level public interfaces you are interested in. Those interfaces are meant to be used by users, and tested by tests.

Having said that reading the code itself is often fruitful. Not for example usages, but to just learn how the thing is implemented.

144. badmintonbaseba ◴[18 Oct 24 10:26 UTC] No.41877989{3}[source]▶

>>41872618 #

I wouldn't count the coverage of integration tests with the same weight as coverage from unit tests.

Unit tests often cover the same line multiple times meaningfully, as it's much easier to exhaust corner case inputs of a single unit in isolation than in an integration test.

Think about a line that does a regex match. You can get 100% line coverage on that line with a single happy path test, or 100% branch coverage with two tests. You probably want to test a regex with a few more cases than that. It can be straightforward from a unit test, but near impossible from an integration test.

Also integration tests inherently exercise a lot of code, then only assert on a few high level results. This also inflates coverage compared to unit tests.

145. crabbone ◴[18 Oct 24 10:35 UTC] No.41878062[source]▶

>>41874483 #

Bad work practices create bad social dynamics. If someone on the team isn't pulling their weight by being lazy, I don't see a reason to like that person if I'm their team member.

146. crabbone ◴[18 Oct 24 10:46 UTC] No.41878135{3}[source]▶

>>41874655 #

The article addresses this: there's no systematic enforcement of documentation currency. It may accidentally and unbeknownst to its authors become outdated.

Anyways, as already mentioned earlier: unit tests are code and all quality criteria that apply to any other code apply to unit tests too. We expect identifiers used in code to help understand the code no matter if it's the name of a unit test or any other entity in our program.

NB. To me this argument seems as bizarre as disputing washing your hands after using the bathroom. Why would anyone think that they should get a pass on code quality standards when writing unit tests? This just doesn't make sense...

replies(1): >>41935431 #

147. crabbone ◴[18 Oct 24 10:52 UTC] No.41878172[source]▶

>>41872163 #

Often times matching input to output is already enough of a clue for the reader to understand the purpose of the functionality being tested. Often the code being tested is hard to use at all unless an example input is shown to the user. This is often my reason to read the unit tests: the function takes data of a shape that's very loosely defined, and no matter how I arrange those data, the code breaks with unhelpful error messages. This is, of course, also the problem of the function that I'm trying to run, but usually that's outside of the area of my responsibility, where my task is to just "make it work", not fix it. So, in this sense, unit test is a perfectly good tool to do the explanation.

148. crabbone ◴[18 Oct 24 10:57 UTC] No.41878206[source]▶

>>41872611 #

You are confused between "if" and "if and only if":

"Tests are documentation" doesn't imply "documentation is tests". There are many useful tools that are documentation, none of them has to be exclusive.

149. yakshaving_jgt ◴[18 Oct 24 11:27 UTC] No.41878364{6}[source]▶

>>41877837 #

At the end of the day, someone has to shoulder the burden of holding their colleagues to higher standards. I don’t think there’s a technical solution to this social problem.

replies(2): >>41878925 #>>41879458 #

150. nzach ◴[18 Oct 24 11:37 UTC] No.41878426[source]▶

>>41874483 #

>Getting all your teammates to quit giving all their tests names like "testTheThing" is darn near impossible.

Sometimes people are pretty bad a coming up new names, but selecting a good name given some options generally isn't a big problem. So maybe we should create a kind o LLM linter for this situation ?

The prompt could be along the lines:

"Given this function: <FUNCTION>

This unit test: <UNIT TEST>

And these naming considerations: <NAMING GUIDE>

Is the current test name a good option?

What would be some better options?"

I did some quick testing and it seems work to reasonably well. It doesn't create anything mind-blowing but at least it seems to provide some consistent options.

replies(1): >>41879260 #

151. flerchin ◴[18 Oct 24 12:44 UTC] No.41878897[source]▶

>>41874483 #

You know what they say:

Naming things is one of the 2 hardest problems in computer science. The other one being cache invalidation and off by one errors.

replies(1): >>41882902 #

152. chiph ◴[18 Oct 24 12:47 UTC] No.41878925{7}[source]▶

>>41878364 #

This is part of the job of being a team lead or manager. You have a standard, you need to get people to follow it (or consequences..)

153. Piraty ◴[18 Oct 24 13:03 UTC] No.41879055{3}[source]▶

>>41874890 #

this is a good read https://fsharpforfunandprofit.com/posts/property-based-testi...

154. phi-go ◴[18 Oct 24 13:04 UTC] No.41879065{5}[source]▶

>>41874460 #

Doctests are tests they are not part of a release build. Also I don't believe there is a supported way to access code in doctests from the outside.

So (sadly?) no literate programming

155. lmz ◴[18 Oct 24 13:17 UTC] No.41879165{5}[source]▶

>>41877009 #

You can't put spaces in the function name, but you can set a display name for JUnit - https://junit.org/junit5/docs/5.0.3/api/org/junit/jupiter/ap...

replies(1): >>41887546 #

156. Jeff_Brown ◴[18 Oct 24 13:23 UTC] No.41879222[source]▶

>>41871629 (OP) #

First type signatures, then good function names, then tests. Between those three I rarely need comments -- except, perhaps ironically, in the tests, to explain what kind of corner condition I'm testing where.

157. nzach ◴[18 Oct 24 13:27 UTC] No.41879260{3}[source]▶

>>41878426 #

Here is a really bad POC for golang tests, if anyone is interested: https://github.com/nzachow/lmlinter

Right now it just prints the prompt in the terminal.

158. TheSoftwareGuy ◴[18 Oct 24 13:48 UTC] No.41879455[source]▶

>>41874483 #

This is one area where a BDD style framework like catch2[0] really shines, IMO. The way tests are written in this style naturally lends itself to self-documenting each branch

[0]: https://github.com/catchorg/Catch2

159. cle ◴[18 Oct 24 13:48 UTC] No.41879458{7}[source]▶

>>41878364 #

It could also be a symptom of something else, like I’ve seen this happen when someone goes overboard on unit tests and they become so burdensome that other engineers just want to get it out of the way. They may not consciously realize it, but subconsciously they know that it’s BS and so they don’t mind BS names to just move on with actual productive work.

Not saying it’s always the case, but it could be. Higher standards are not always better, they have diminishing returns.

replies(1): >>41881073 #

160. actinium226 ◴[18 Oct 24 14:15 UTC] No.41879682[source]▶

>>41872163 #

Unit tests show expected input and output of a function.

In code that was written without tests, inputs/outputs end up being surprisingly far more spread out than you might think. One of the function inputs might be a struct with 5 members, but the function itself only uses one of those members 50 lines in. If it's OOP, one of the inputs might be a member variable that's set elsewhere, and same for the outputs.

A unit test shows the reader what information is needed for the function and what it produces, without having to read the full implementation.

Also, when you write it, you end up discovering things like what I mentioned above, and then you end up refactoring it to make more sense.

161. actinium226 ◴[18 Oct 24 14:17 UTC] No.41879710[source]▶

>>41872611 #

As an example, when I was doing some complex stuff in pybind11 using callbacks, I heavily relied on their tests to understand how it was supposed to be implemented.

162. hu3 ◴[18 Oct 24 14:20 UTC] No.41879740{3}[source]▶

>>41875926 #

> If module foo had a function bar then my test is in module test_foo and the test is called test_bar.

Same. This is a good 80/20 in my experience.

Testing the happy paths is already very rewarding.

163. nuancebydefault ◴[18 Oct 24 14:31 UTC] No.41879817[source]▶

>>41874483 #

I used to be the weenie who always said we need to define nomenclature before things get silly names (hence very very early to prevent those to become the norm for legacy reasons) and use proper naming for everything, such that code would read like a novel (yes that is a stretch).

But indeed it tended to lead to frustrating social dynamics in stead of happy romcom scripts.

So I gave up on most of it.

That said, my opinion about test code is, it exists to find bugs or steer away from regression. API descriptions, together with a design with some graphs, should be formal and clear enough to understand code usage. I don't want to figure out the usage of fwrite() by going through its elaborate test suite.

164. alfonsodev ◴[18 Oct 24 14:39 UTC] No.41879867[source]▶

>>41871629 (OP) #

I wonder if something like Jupyter notebook would be a cool format to implement this.

165. nuancebydefault ◴[18 Oct 24 15:05 UTC] No.41880087{3}[source]▶

>>41877434 #

In fact that is not very hard to do and provides a great advantage!

166. najork ◴[18 Oct 24 15:16 UTC] No.41880195[source]▶

>>41871629 (OP) #

Unit tests to assert a spec are great, but they don't provide any context as to why the spec was defined as it was. This context is critical when making changes to the spec, so unit tests on their own aren't sufficient in my experience.

167. crabbone ◴[18 Oct 24 15:19 UTC] No.41880223[source]▶

>>41872129 #

Article doesn't say that documentation is unit tests, it says that unit tests are documentation. It never claimed there aren't any other means to document your code.

168. wryoak ◴[18 Oct 24 15:36 UTC] No.41880385[source]▶

>>41874483 #

I encourage my teams to include the project tracker ticket name of the requirement or bug fix in the name. Eg, “XYZ12” -> fn { … } // test for value always being positive

Of course some test libraries make this approach difficult.

169. projectileboy ◴[18 Oct 24 16:02 UTC] No.41880656[source]▶

>>41871629 (OP) #

They express behaviors, yes, but they do not express intent, or any broader context. I’ve been on more than one project where well-meaning developers forbid comments and additional documentation because they felt that the only documentation should be the tests, and the result was always a complete lack of understanding as to why anything worked the way it did.

170. wnmurphy ◴[18 Oct 24 16:06 UTC] No.41880687{3}[source]▶

>>41877434 #

100%. Test names should include the word "should" and "when". Then you get a description of the expected behavior.

171. HelloNurse ◴[18 Oct 24 16:08 UTC] No.41880713[source]▶

>>41871629 (OP) #

Meaningful test names are unimportant compared to sensible test organization and understandable test content: meaningful test fixtures, helper functions etc. to reduce repetitions; no dependencies on test execution order and other capital sins; straightforward and concise patterns like invoking the ordinary validation of object results instead of inspecting them ad-hoc or tables of simple inputs and expected outputs.

Simple tests don't really need descriptive names because what they test should be obvious, while special tests need more than a good name to be documented enough: comments (e.g. why can crazy cases actually occur), links, extra effort to write them clearly, etc.

172. commandersaki ◴[18 Oct 24 16:33 UTC] No.41880993[source]▶

>>41871629 (OP) #

I like examples in documentation and unit tests to be separated like how Go does it.

173. mewpmewp2 ◴[18 Oct 24 16:35 UTC] No.41881022{4}[source]▶

>>41877822 #

Maybe also why TDD is hard for me because I only truly start to think or visualize when I'm writing the actual code. I don't know if it's ADHD or what it is, but writing requirements, tests before hand, is just not my forte. It's like I only get dopamine from when I build something and everything else feels frustrating.

replies(1): >>41882618 #

174. mewpmewp2 ◴[18 Oct 24 16:40 UTC] No.41881073{8}[source]▶

>>41879458 #

It's all a spectrum of trade-offs with different people having different opinions.

There could be some sort of formula to explain this better to determine how much effort to spend on tests vs features and product quality and importance of quality compared to that.

175. tdiff ◴[18 Oct 24 16:43 UTC] No.41881106[source]▶

>>41871629 (OP) #

I doubt anyone who have ever seen complex test fixtures testing complex cases would agree it is better than free-form explanation of intended behaviour.

Human language is just much more dense in terms of amount of conveyed information.

176. wubrr ◴[18 Oct 24 16:55 UTC] No.41881233{4}[source]▶

>>41877002 #

> It's funny, you are asking what test names have to do with quality, and you proceed with mentioning a really bad test name, 'toTryToFitTheTestDescriptionIntoItsName', and (correctly) stating that this has zero benefit.

Not at all. Those kinds of names are like a de-facto standard for the people that try to push this kind of practice. Obviously the example I used is not related to any real test.

> This will help you colleague when he's trying to fix the failing test when you're not around.

Really? Encoding what a test function does in it's name is your recommendation for helping someone understand what the code is doing? There are far better ways of accomplishing this, especially when it comes to tests.

> There are other ways of doing that of course which can be fine as well

'Can be fine as well'? More like 'far superior in every possible way'.

> But the problem that OP is talking about, is that many developers simply don't see the point of putting much effort into making tests readable.

Not at all, making a test readable and trying to encode what it does into it's name are completely separate things.

177. wubrr ◴[18 Oct 24 16:57 UTC] No.41881251{4}[source]▶

>>41874814 #

I'm obviously replying to GP's specific comment on test names. I fail to see how your reply relates to my comment at all.

178. wubrr ◴[18 Oct 24 17:02 UTC] No.41881313{4}[source]▶

>>41875382 #

Like others have already stated/provided examples of[0] - the test function names are generally irrelevant. Many testing frameworks use a single/same test function name, or a completely unnamed function/lambda, while providing any needed context/documentation as params or annotations.

Realistically, many unit tests are far more complicated (in terms of business logic) than functions where names actually matter, like 'remove()', 'sort()', 'createCustomer()', etc. I've worked in several places where people aggressively pushed the 'encode test description in test name' BS, which invariably always leads to names like 'testThatCreatingACustomerFromSanctionedCountryFailsWithErrorX'. It's completely absurd.

> Also, are you a fan of nesting test classes? Any opinions?

It really depends on the framework you're using, but in general nesting of tests is a good thing, and helps with organizing your tests.

[0] https://news.ycombinator.com/item?id=41871629#41877015

replies(1): >>41883124 #

179. wubrr ◴[18 Oct 24 17:02 UTC] No.41881317{4}[source]▶

>>41876013 #

> The quality of the tests.

Very insightful, thanks.

replies(1): >>41882843 #

180. wubrr ◴[18 Oct 24 17:24 UTC] No.41881545{4}[source]▶

>>41876888 #

Yup, I've not actually seen any tool that enforces these kinds of test names. But yeah, trying to encode test description/documentation into it's name is like one of the worst common ways of documenting your tests.

181. wubrr ◴[18 Oct 24 17:29 UTC] No.41881597{6}[source]▶

>>41877837 #

Yeah, it doesn't solve the problem of low quality code/laziness, but it's a better tool/approach for documenting your tests than encoding the description/documentation into it's name.

Encoding such information into the name makes about as much sense as encoding constraints into SQL column names.

182. danjl ◴[18 Oct 24 17:32 UTC] No.41881632{3}[source]▶

>>41872618 #

I'd also include the status of the company. What a startup needs from tests is very different from what an enterprise company needs. If you're searching for product market fit, you need to be able to change things quickly. If you're trying to support a widely used service, you need better test coverage.

replies(1): >>41882616 #

183. wubrr ◴[18 Oct 24 17:39 UTC] No.41881699{3}[source]▶

>>41877434 #

Shallow/meh, article. Demonstrates complete lack of familiarity with many popular testing frameworks/approaches, and proposes a subpar solutions for problems that have already been solved in superior ways.

Your test description/documentation should be sentences, but there is absolutely zero reason to try to encode that into the name of your test function. Not to mention this article then suggests using another tool to decode this function name into a proper sentence for reporting... ok now you completely lost the ability to ctrl+f and jump to the function... terrible advice all around.

Why not just use a testing framework that actually supports free-form sentence descriptions/documentation for your tests?

replies(2): >>41882229 #>>41883316 #

184. larsrc ◴[18 Oct 24 18:28 UTC] No.41882124{5}[source]▶

>>41872469 #

"[docs] will take an example and do something with it to demonstrate an outcome."

No. Good docs will explain the context and choices made and trade-offs and risks and relations etc. All the things you can't read from the code. API docs can to a great degree be auto-generated, but not writing the _why_ is the beginning of the end.

185. bunderbunder ◴[18 Oct 24 18:38 UTC] No.41882229{4}[source]▶

>>41881699 #

If my unit testing framework supports free-form sentence descriptions, I'll use it. But I won't use that feature as a wedge issue. It doesn't bother me all that much to have test functions with names like `test_transaction_fails_if_insufficient_funds_are_available()`. Other features of the test framework might have a much bigger impact on my developer experience.

replies(1): >>41883449 #

186. bunderbunder ◴[18 Oct 24 18:46 UTC] No.41882303{5}[source]▶

>>41877009 #

I don't think you can do it in Java specifically. But once upon a time it was rather popular to write test fixtures for Java code in Groovy, which does let you do it.

replies(1): >>41887548 #

187. nonameiguess ◴[18 Oct 24 19:08 UTC] No.41882482[source]▶

>>41871629 (OP) #

This at least can work reasonably well if you're talking purely about library documentation meant to be consumed by other developers. It does nothing at all to provide documentation to software users who don't read code.

188. avensec ◴[18 Oct 24 19:25 UTC] No.41882616{4}[source]▶

>>41881632 #

Absolutely a great addition!

189. crazygringo ◴[18 Oct 24 19:25 UTC] No.41882618{5}[source]▶

>>41881022 #

I used to be like that sometimes. Then I started realizing I'd get the function 90% complete and discover an edge case and have to start over in a way that could handle the edge case. Sometimes this could happen twice.

Documenting your requirements by writing the tests in advance is of course painful because it forces you to think more upfront about the edge cases in advance. But that's precisely why it saves time in the long run, because then it makes it a lot more likely you can write the function correctly from the start.

190. globnomulous ◴[18 Oct 24 19:42 UTC] No.41882785{3}[source]▶

>>41877434 #

Not just complete sentences, test names should describe in plain English, with no reference to code or variable names, exactly what's being tested and exactly the expected outcome: "when [something happens], [the result is x]"

191. the_af ◴[18 Oct 24 19:48 UTC] No.41882843{5}[source]▶

>>41881317 #

Not sure if you felt I was being snarky, but I wasn't.

The article is discussing the quality of the tests, not quality in general and not the quality of the resulting software.

That was my point.

replies(1): >>41883558 #

192. SAI_Peregrinus ◴[18 Oct 24 19:55 UTC] No.41882902{3}[source]▶

>>41878897 #

Or the async version

There are three hard problems in computer science:

1) Naming things

2) Cachoncurr3)e invalidation

ency

4) Off-by-one errors

replies(1): >>41883658 #

193. the_af ◴[18 Oct 24 20:23 UTC] No.41883124{5}[source]▶

>>41881313 #

> Like others have already stated/provided examples of[0] - the test function names are generally irrelevant. Many testing frameworks use a single/same test function name, or a completely unnamed function/lambda, while providing any needed context/documentation as params or annotations.

I think what you're focusing on is just syntax sugar. Those examples with the 'describe'/'it' pattern are just another way to provide names to test cases, and their goal is exactly the same. If you didn't have this syntactic support, you'd write the function names representing this.

It's exactly the same thing: documenting the test case in the code (so not a separate document), with its name.

The distinction between "comment" and "function name" becomes less relevant once one realizes a function's name is just another comment.

replies(1): >>41883527 #

194. sksxihve ◴[18 Oct 24 20:42 UTC] No.41883316{4}[source]▶

>>41881699 #

When a unit test fails in code I'm working on I don't read the name of the test, I jump to the line in the file for the test and read the code so I never really understood what people find advantageous for this naming convention.

I've worked at companies that required this style naming for tests and it was an unholy mess, and it only works if the unit test is small enough that the name is still a reasonable length which at that point the code should be clear enough to understand what is being tested anyway.

replies(1): >>41883469 #

195. wubrr ◴[18 Oct 24 21:00 UTC] No.41883449{5}[source]▶

>>41882229 #

Pretty much every major language/framework now supports free-form test names/descriptions, including JUnit, which is referenced in the article ^ (again, highlighting the author's ignorance). Just because something doesn't bother you personally doesn't mean it's a good thing to follow, especially when its clearly inferior to other options.

> It doesn't bother me all that much to have test functions with names like `test_transaction_fails_if_insufficient_funds_are_available()

I mean, that's one example where you have one outcome based on one parameter/state. Expand this to a 2-field outcome based on 3 state conditions/parameters and now you have a 100-character long function name.

replies(1): >>41904621 #

196. wubrr ◴[18 Oct 24 21:03 UTC] No.41883469{5}[source]▶

>>41883316 #

Names, descriptions for tests are useful for many purposes, I'll leave it at that.

The point I'm making (and I think you are agreeing with me) is that trying to stuff a test description into a test function name is cumbersome and pointless. There are far better ways of adding descriptions/documentation for unit tests and pretty much every major language/testing framework supports these, nowadays.

197. wubrr ◴[18 Oct 24 21:12 UTC] No.41883527{6}[source]▶

>>41883124 #

> I think what you're focusing on is just syntax sugar. Those examples with the 'describe'/'it' pattern are just another way to provide names to test cases, and their goal is exactly the same.

The goal may be the same/similar, but one of the approaches is clearly superior to the other for multiple reasons (as stated by me and other many times in this comment tree). Also, I don't think you quite understand what 'syntactic sugar' means.

> If you didn't have this syntactic support, you'd write the function names representing this.

It's not any kind of 'syntactic support'. Pretty much every modern language/testing framework supports adding free-form test descriptions and names through various means.

> It's exactly the same thing: documenting the test case in the code (so not a separate document), with its name.

It's very clearly not the same at all lmao. And a test name, test description, other useful test documentation/metadata are also not the same.

> The distinction between "comment" and "function name" becomes less relevant once one realizes a function's name is just another comment.

Huge differences between a function name, a comment, and an annotation. HUGE. Read the other comments in this thread to understand why. If you actually worked in an environment where stuffing a test description into a test name is the preferred approach for a non-trivial amount of time, you'd know that once you get past a certain level of complexity your test names explode to 100+ character monsters, if only to differentiate them from the other tests, testing a different combination of states/inputs and outputs, etc.

replies(1): >>41883736 #

198. wubrr ◴[18 Oct 24 21:15 UTC] No.41883558{6}[source]▶

>>41882843 #

Saying 'stuffing a test description into the function name improves test quality because it improves test quality' is a cyclical, useless statement.

> The article is discussing the quality of the tests, not quality in general and not the quality of the resulting software.

All of my comments in this thread are about unit tests and test quality, not general software quality.

> That was my point.

I still don't see any valid point being made.

replies(1): >>41883747 #

199. Jerrrrrrry ◴[18 Oct 24 21:29 UTC] No.41883658{4}[source]▶

>>41882902 #

this is incredibly unnerving of a comment, thank you.

200. the_af ◴[18 Oct 24 21:37 UTC] No.41883736{7}[source]▶

>>41883527 #

Sorry, I thought you were debating in good faith. I now see the tone of your responses to everyone here.

Good luck with that!

replies(1): >>41888710 #

201. the_af ◴[18 Oct 24 21:39 UTC] No.41883747{7}[source]▶

>>41883558 #

Sorry, I replied to you because I thought you were asking how it affected the final product, and I clarified, in case you had missed it, that it was about the quality of the tests as documentation.

Sorry this whole thing seems to upset you so much. Chill!

replies(1): >>41888757 #

202. temp030033 ◴[19 Oct 24 00:16 UTC] No.41884604{3}[source]▶

>>41872618 #

I work for a huge corp, but the startup rules still apply in most situations because we're doing internal stuff where velocity really matters, and the number of small moving parts make unit tests not very useful.

Unfortunately, integration testing is painful and hardly done here because they keep inventing new bad frameworks for it, sticking more reasonable approaches behind red tape, or raising the bar for unit test coverage. If there were director-level visibility for integration test coverage, would be very different.

203. temp030033 ◴[19 Oct 24 00:18 UTC] No.41884613{3}[source]▶

>>41872352 #

It's not either/or, but integration is more important. As in, if you're going to defer one thing (which happens often), it should be unit tests.

204. sfn42 ◴[19 Oct 24 13:07 UTC] No.41887546{6}[source]▶

>>41879165 #

My bad, it seems I was misremembering

205. sfn42 ◴[19 Oct 24 13:07 UTC] No.41887548{6}[source]▶

>>41882303 #

My bad, it seems I was misremembering

206. wubrr ◴[19 Oct 24 16:19 UTC] No.41888710{8}[source]▶

>>41883736 #

Good faith debating is when you actually try to honestly consider and understand the other side's point. Not when you make dismissive blanket (and factually incorrect) statements while refusing to rationally engage with the counter-arguments.

Tone is generally completely independent of good faith.

Go heal that ego and try again.

replies(2): >>41889559 #>>41890705 #

207. wubrr ◴[19 Oct 24 16:26 UTC] No.41888757{8}[source]▶

>>41883747 #

It's ok if you misunderstood my comment and the context of this comment chain.

No need to get snarky about it and project your own feelings onto others though.

replies(1): >>41889563 #

208. bobbylarrybobby ◴[19 Oct 24 16:45 UTC] No.41888867{4}[source]▶

>>41872805 #

Yes, VSCode with rust-analyzer does

replies(1): >>41941891 #

209. the_af ◴[19 Oct 24 18:22 UTC] No.41889559{9}[source]▶

>>41888710 #

Dismissive?

210. the_af ◴[19 Oct 24 18:24 UTC] No.41889563{9}[source]▶

>>41888757 #

Snarky?

I was actually surprised you reacted with sarcasm.

Why are you trying to pick a fight?

211. seadan83 ◴[19 Oct 24 20:54 UTC] No.41890705{9}[source]▶

>>41888710 #

Hate jumping in, though both of you...

The uninviting tone discourages further discussion. I really appreciated where this convo was going until..

> Read the other comments in this thread to understand why.

This could be restated in a non aggressive way. Eg: 'Other comments go into more details why'

> If you actually worked in an environment

Presumptive statements like this are unhelpful. We need to remember we do not know the context and experiences of others. Usually better to root discussion from ones own experience and not to presume the others perspective.

replies(1): >>41892814 #

212. wubrr ◴[20 Oct 24 04:14 UTC] No.41892814{10}[source]▶

>>41890705 #

You're right, thanks.

213. bunderbunder ◴[21 Oct 24 14:35 UTC] No.41904621{6}[source]▶

>>41883449 #

You seem to be focusing on the most obtuse possible way of structuring descriptive test names for rhetorical purposes. I don't know if that's intentional or not, but either way it's not terribly convincing? If you name tests according to business domain concepts rather than tying it to individual parameters through some formulaic rubric, it's often possible to come up with test names that are both more concise and easier to understand than the specific way of naming tests that you've been consistently steering the conversation back toward throughout this thread.

replies(1): >>41973081 #

214. tpoacher ◴[24 Oct 24 13:45 UTC] No.41935431{4}[source]▶

>>41878135 #

Replying slightly late as I didn't see this on time ...

But I don't get the argument. Perhaps we're talking about different things (or have different mental anchors in mind)?

Giving python unittest syntax as an example, how does having a test called

  def test_myMethod_caseWhereZeroInputsAreGiven(): ...

provide better currency / documentation guarantees than

  def test_case000135():
      """
      Tests: myMethod
      Date: 2024-10-23
      Commit base: edb78go9
      Summary: Tests case where zero inputs are given.
      Details: ...
      """

As for quality criteria, I'm not necessarily claiming the naming scheme is overwhelmingly better, but except in the very limited and unrealistic scenario where you only need to write a single test per method, unwieldy test method names can cause more confusion than clarity, so I don't think index-based naming schemes complemented by documentation headers are 'worse' either. I don't think the standard good variable naming rules apply much here, because in general variable names largely rely on context to be meaningful, and should only be as minimally descriptive as required to understand their role in that context, whereas a test needs to be sufficiently descriptive to describe what is being tested AND provide the necessary context.

I don't think the bathroom analogy is good here either. I'm not arguing for sloppiness or bad code hygiene. A "better" analogy would be someone complaining about people not using the provided towels like people have always been doing, when the initial argument is that the towels in this particular room seem dirty and there's a perfectly fine airdryer "right there". Hence the answer "why does anyone think they get a pass washing their hands on a towel" just sounds bizzare to me, when the topic is appropriateness of sanitation method, not cleanliness itself.

(note: not being argumentative, I hope my tone does not come across as such; if you do bump onto this again and think I'm misinterpreting or missing some finer point entirely, I'd be interested to hear it)

215. chrisweekly ◴[25 Oct 24 03:07 UTC] No.41941891{5}[source]▶

>>41888867 #

Cool, thanks for the response.

I don't understand the downvote for my question (shrug).

216. wubrr ◴[28 Oct 24 16:34 UTC] No.41973081{7}[source]▶

>>41904621 #

> You seem to be focusing on the most obtuse possible way of structuring descriptive test names for rhetorical purposes.

No, I'm focusing on the most realistic and common ways this kind of pattern actually exists (based on my experience).

> If you name tests according to business domain concepts rather than tying it to individual parameters through some formulaic rubric,

While you say I'm focusing on 'the most obtuse possible way...', this kind of comment makes it seem like you haven't focused on any actual way at all. You're speaking in very ambiguous and vague terms, which actually can't be applied and enforced in practice. If you're actually trying to write a suite of unit tests around, say a function, with 3 parameters and multiple possible outcome states - you can't name your function the same for the different combinations of inputs/outputs, and you can't just handwave a 'business domain concepts' name into existence to cover each case - that just turns into an exercise of finding synonyms, abbreviations, and vague generalizations - it doesn't solve the fact that you still need all of the same test cases and they all still need to have unique function names.

You haven't actually thought through what you're proposing here.

↑