An overly aggressive mock can work fine, but break much later

(nedbatchelder.com)

67 points ingve | 2 comments | 16 Nov 25 22:36 UTC

lelandbatey ◴[17 Nov 25 00:41 UTC] No.45949894[source]▶

I feel like the #1 reason mocks break looks nothing like this and instead looks like: you change the internal behaviors of a function/method and now the mocks interact differently with the underlying code, forcing you to change the mocks. Which highlights how awful mocking as a concept is; it is of truly limited usefulness for anything but the most brittle of tests.

Don't test the wrong things; if you care about some precondition, that should be an input. If you need to measure a side effect, that should be an output. Don't tweak global state to do your testing.
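A minimal sketch of "preconditions as inputs, side effects as outputs", assuming Python. The functions `greeting` and `plan_cleanup` are hypothetical names made up for illustration. Instead of patching the clock or the filesystem, the time is passed in and the intended deletions are returned, so a test needs no mock.

```python
from datetime import datetime, timezone


def greeting(now: datetime) -> str:
    """Return a greeting based on the hour of `now`.

    The precondition (the current time) is an explicit input.
    """
    return "Good morning" if now.hour < 12 else "Good afternoon"


def plan_cleanup(filenames: list[str], suffix: str = ".tmp") -> list[str]:
    """Return the files that should be deleted, without deleting anything.

    The side effect is described as an output; a thin caller can perform it.
    """
    return [name for name in filenames if name.endswith(suffix)]


# Tests supply the inputs directly and inspect the outputs directly.
morning = datetime(2025, 11, 17, 9, 0, tzinfo=timezone.utc)
evening = datetime(2025, 11, 17, 18, 0, tzinfo=timezone.utc)
assert greeting(morning) == "Good morning"
assert greeting(evening) == "Good afternoon"
assert plan_cleanup(["a.tmp", "b.txt", "c.tmp"]) == ["a.tmp", "c.tmp"]
```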

replies(3): >>45950085 #>>45951034 #>>45955029 #

bluGill ◴[17 Nov 25 01:24 UTC] No.45950085[source]▶

>>45949894 #

Most of the real world is about manipulating the real world. For algorithms it is fine to say you should depend only on pure inputs and outputs. However, what we care about is that global state is manipulated correctly, so the integration tests that verify that are the important ones. In most cases your algorithm shouldn't be unit tested separately, since it is only used in one place and changes when its users change: there is no point in extra tests. If the algorithm is used in many places, comprehensive unit tests are important, but they get in the way when the algorithm is used only once, because the tests just inhibit changes to the algorithm as requirements change (you have to change the user, the integration tests, and the redundant unit tests).

As such I disagree. Global state is what you should be testing - but you need to be smart about it. How you set up and verify global state matters. Don't confuse global state above with global state of variables; I mean the external state of the program before and after, which means network, file, time, and other IO things.

replies(1): >>45950287 #

refactor_master ◴[17 Nov 25 02:11 UTC] No.45950287[source]▶

>>45950085 #

IO and global state are also just inputs that can be part of arrange-act-assert. Instead of mocking your database call to always return "foo" when the word "SELECT" is in the query, insert a real "foo" in a real test database and perform a real query.
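A small sketch of that arrange-act-assert flow, assuming Python's standard `sqlite3` module and an in-memory database standing in for the "real test database". The `items` table and the functions `find_names` and `make_test_db` are hypothetical, invented for this example.

```python
import sqlite3


def find_names(conn: sqlite3.Connection, prefix: str) -> list[str]:
    """Hypothetical code under test: runs a real query on a real connection."""
    rows = conn.execute(
        "SELECT name FROM items WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    ).fetchall()
    return [row[0] for row in rows]


def make_test_db() -> sqlite3.Connection:
    """Create a fresh, private database so tests do not share state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT NOT NULL)")
    return conn


# Arrange: insert a real "foo" (and a row that should not match).
conn = make_test_db()
conn.executemany("INSERT INTO items (name) VALUES (?)", [("foo",), ("bar",)])

# Act: perform the real query.
result = find_names(conn, "fo")

# Assert: only the matching row comes back.
assert result == ["foo"]
conn.close()
```

Each test gets its own connection from `make_test_db()`, so the database state is an ordinary input rather than something shared between tests.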

Again I've heard "but what if my database/table changes so rapidly that I need the mock so I don't need to change the query all the time", in which case you ought to take a moment to write down what you're trying to accomplish, rather than using mocks to pave over poor architectural decisions. Eventually, the query fails and the mock succeeds, because they were completely unrelated.
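To illustrate "the query fails and the mock succeeds", here is a sketch assuming Python's `unittest.mock` and `sqlite3`. The query, the `items` table, and `first_name` are hypothetical; the column name is misspelled on purpose. The mock-based check passes because the mock never looks at the SQL, while the same call against a real in-memory database raises an error.

```python
import sqlite3
from unittest import mock

# Deliberate typo for the demonstration: the real column is `name`.
QUERY = "SELECT nmae FROM items"


def first_name(conn) -> str:
    """Hypothetical code under test: returns the first value from QUERY."""
    return conn.execute(QUERY).fetchone()[0]


# Mock-based check: succeeds, since the mock returns "foo" for any SQL text.
fake_conn = mock.Mock()
fake_conn.execute.return_value.fetchone.return_value = ("foo",)
mock_result = first_name(fake_conn)
assert mock_result == "foo"

# Real-database check: the same code fails because the query is wrong.
real_conn = sqlite3.connect(":memory:")
real_conn.execute("CREATE TABLE items (name TEXT)")
real_conn.execute("INSERT INTO items (name) VALUES ('foo')")
try:
    first_name(real_conn)
    real_error = None
except sqlite3.OperationalError as exc:
    real_error = str(exc)
finally:
    real_conn.close()

assert real_error is not None
```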

So far I've only seen mocks fail eventually and mysteriously. With setups and DI you can treat things mostly as a black box from a testing point of view, but when mocks are involved you need surgical precision to hit the right target at the right time.

replies(3): >>45950997 #>>45953637 #>>45954957 #

dpark ◴[17 Nov 25 16:15 UTC] No.45954957[source]▶

>>45950287 #

> global state

In my experience global state is the testing bug farm. Tests that depend on global state are usually taking dependencies they aren’t even aware of. Test initializations grow into complex “poke this, prod that, set this value to some magic number” setups that attempt to tame the global state, but as global state grows this becomes more and more difficult and inconsistent. Inter-test dependencies sneak in, parallelism becomes impossible, and engineers start turning off “flaky” tests because they’ve spent hours or days trying to reproduce failures only to eventually give up.

This sort of development is attractive when starting up a project because it’s straightforward and the testing “bang for the buck” is high. But it degrades quickly as the system becomes complex.

> Instead of mocking your database call to always return "foo" when the word "SELECT" is in the query, insert a real "foo" in a real test database and perform a real query.

Consider not sprinkling “select” statements throughout your code instead. This tight coupling makes good testing much more difficult (requiring the “set all the global state” model of testing) but is also just generally not good code structure. The use of SQL is an implementation detail that most of your code shouldn’t need to know about.

A thin layer around the DB interactions gives you a smaller set of code that needs to be tested with state, gives you a scoped surface area for any necessary mocking, makes it much easier if you need to change storage systems, and also gives you a place that you can reason over all the possible DB interactions. This is just good separation of concerns.
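One way such a thin layer could look, as a sketch in Python with the standard `sqlite3` module. All names here (`UserStore`, `SqliteUserStore`, `InMemoryUserStore`, `welcome_line`, the `users` table) are hypothetical and only illustrate the shape: SQL lives in one small class that is tested with real state, and the rest of the code depends on an interface that a simple in-memory stand-in can satisfy.

```python
from __future__ import annotations

import sqlite3
from typing import Protocol


class UserStore(Protocol):
    """Hypothetical interface: all the rest of the code knows about storage."""

    def email_for(self, user_id: int) -> str | None: ...


class SqliteUserStore:
    """Thin layer: the only class here that contains SQL."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)"
        )

    def add(self, user_id: int, email: str) -> None:
        self._conn.execute(
            "INSERT INTO users (id, email) VALUES (?, ?)", (user_id, email)
        )

    def email_for(self, user_id: int) -> str | None:
        row = self._conn.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class InMemoryUserStore:
    """Stand-in for tests of code that should not care about the database."""

    def __init__(self, emails: dict[int, str]) -> None:
        self._emails = dict(emails)

    def email_for(self, user_id: int) -> str | None:
        return self._emails.get(user_id)


def welcome_line(store: UserStore, user_id: int) -> str:
    """Business logic: depends on the interface, not on SQL."""
    email = store.email_for(user_id)
    return f"Welcome, {email}" if email else "Welcome, guest"


# The thin layer is tested with real state (an in-memory SQLite database).
sql_store = SqliteUserStore(sqlite3.connect(":memory:"))
sql_store.add(1, "a@example.com")
assert sql_store.email_for(1) == "a@example.com"
assert sql_store.email_for(2) is None

# Everything else is tested without a database at all.
fake_store = InMemoryUserStore({1: "a@example.com"})
assert welcome_line(fake_store, 1) == "Welcome, a@example.com"
assert welcome_line(fake_store, 2) == "Welcome, guest"
```

Swapping storage systems would then mean writing another class with the same `email_for` method; `welcome_line` and its tests would not change.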

replies(1): >>45955177 #

1. bluGill ◴[17 Nov 25 16:33 UTC] No.45955177[source]▶

>>45954957 #

You didn't use global state in the same sense that I did. I mean global as in state of the entire world, not just the computer you are on.

I tamed our inter-test dependencies by doing things like starting dbus on a different port for each test - now I can test with real dbus in the loop and my tests are fast and isolated. We have a strict rule of what directories we are allowed to write to (embedded system - the others are read only in production), so it is easy to point those to a temp dir. It was some work to set that up, but it tames most of your issues with global state and allows me to verify what really counts: the system works.
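The temp-dir half of that setup can be sketched in Python with the standard `tempfile` and `pathlib` modules. This is an assumption-laden illustration, not the commenter's actual code: `save_setting`, `run_isolated_check`, and the `state` directory are hypothetical, and the dbus part is not shown. The point is that the writable directory is a parameter, so each test can hand the code its own throwaway directory and still exercise real file IO.

```python
import tempfile
from pathlib import Path


def save_setting(state_dir: Path, key: str, value: str) -> Path:
    """Hypothetical code under test: writes one setting file under state_dir."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / f"{key}.conf"
    path.write_text(value, encoding="utf-8")
    return path


def run_isolated_check() -> str:
    """Run the real file IO against a private directory, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        # Each call gets a fresh directory, so tests cannot see each
        # other's files and can run in parallel.
        state_dir = Path(tmp) / "state"
        written = save_setting(state_dir, "volume", "11")
        return written.read_text(encoding="utf-8")


assert run_isolated_check() == "11"
```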

For a CRUD web app your database separation of concerns makes sense. However in my domain we have lots of little data stores and nobody else needs access to that store. As such we put it on each team to develop the separation that makes sense for them - I don't agree with all their decisions, but they get to deal with it.

replies(1): >>45955703 #

2. dpark ◴[17 Nov 25 17:20 UTC] No.45955703[source]▶

>>45955177 (TP) #

Sure. I wasn’t responding to your statements but I can understand what you mean. If your code manipulates the physical world (and not in a “sets this bit on a disk” sort of way) then you have to validate that action. Whether that validation should be decoupled from the business logic that drives that piece really depends on the complexity of the system. At some point you are going to do full integration validation, so it’s a question of balance: how much is tested that way vs. in more isolation.

Tests that work and verify the system works are the Pri0 requirement. Most of the conversations about how best to test are structured for the benefit of people who are struggling with meeting the Pri0 because of maintainability. With enough effort any strategy can work.

> However in my domain we have lots of little data stores and nobody else needs access to that store.

If the little data stores are isolated to small individual areas of code then you probably already have the necessary isolation. Introducing the lightweight data store isolation layer might be useless (or not, context dependent). Now if these individual areas are doing things like handing off result sets to other code then I would have something different to say.
