Don't test the wrong things; if you care about some precondition, that should be an input. If you need to measure a side effect, that should be an output. Don't tweak global state to do your testing.
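A toy sketch of what that means (hypothetical names throughout):

    # Before: reads and mutates globals, so a test has to poke them first.
    PROMO_ACTIVE = True
    audit_log = []

    def apply_discount_global(order_total, order_id):
        total = order_total * 0.9 if PROMO_ACTIVE else order_total
        audit_log.append(order_id)  # hidden side effect
        return total

    # After: the precondition is a parameter, the side effect is a return value.
    def apply_discount(order_total, order_id, promo_active):
        total = order_total * 0.9 if promo_active else order_total
        return total, ("audit", order_id)  # caller decides what to do with it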
I disagree. Global state is exactly what you should be testing - but you need to be smart about it: how you set up and verify global state matters. And don't confuse the global state I mean here with global variables - I mean the external state of the program before and after it runs, which means network, files, time, and other IO.
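For example, with files (a minimal sketch using pytest's tmp_path fixture):

    # Set up real external state, run the real code, verify real external state.
    def save_report(path, lines):
        with open(path, "w") as f:
            f.write("\n".join(lines))

    def test_save_report(tmp_path):  # tmp_path: per-test temp dir from pytest
        out = tmp_path / "report.txt"
        save_report(out, ["a", "b"])
        assert out.read_text() == "a\nb"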
Again, I've heard "but what if my database/table changes so rapidly that I need the mock so I don't have to change the query all the time" - in which case you ought to take a moment to write down what you're actually trying to accomplish, rather than using mocks to pave over poor architectural decisions. Eventually the real query fails while the mock keeps succeeding, because the two were never related in the first place.
So far I've only seen mocks fail eventually and mysteriously. With setups and DI you can treat things mostly as a black box from a testing point of view, but when mocks are involved you need surgical precision to hit the right target at the right time.
Rarely should a mock be “interacting with the underlying code”, because it should be a dead end that returns canned data and makes no other calls.
If your mock is calling back into other code you’ve probably not got a mock but some other kind of “test double”. Maybe a “fake” in Martin Fowler’s terminology.
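Roughly, the difference looks like this (an illustrative sketch, not Fowler's own example):

    # A mock in this sense: a dead end that returns canned data, calls nothing.
    class MockUserStore:
        def get(self, user_id):
            return {"id": user_id, "name": "canned"}

    # A fake: a genuinely working implementation taking a shortcut (in-memory).
    class FakeUserStore:
        def __init__(self):
            self._rows = {}
        def put(self, user):
            self._rows[user["id"]] = user
        def get(self, user_id):
            return self._rows.get(user_id)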
If you have test doubles that are involved in a bunch of calls back and forth between different pieces of code then there’s a good chance you have poorly factored code and your doubles are complex because of that.
Now, I won’t pretend changes don’t regularly break test doubles, but for mocks it’s usually method changes or additions and the fix is mechanical (though annoying). If your mocks are duplicating a bunch of logic, though, then something else is going on.
I haven't seen mocks fail mysteriously. I've seen them fail often, though: requirements change, you update the callers (generally a small number), and suddenly 200 tests fail - and you give up because updating all of them is too hard. Mocks are always about implementation details - sometimes you have no choice, but the more you can test actual behavior the better.
In my experience global state is the testing bug farm. Tests that depend on global state are usually taking dependencies they aren't even aware of. Test initializations grow into complex "poke this, prod that, set this value to some magic number" setups that attempt to tame the global state, but as the global state grows this becomes more and more difficult and inconsistent. Inter-test dependencies sneak in, parallelism becomes impossible, and engineers start turning off "flaky" tests because they've spent hours or days trying to reproduce failures only to eventually give up.
This sort of development is attractive when starting up a project because it’s straightforward and the testing “bang for the buck” is high. But it degrades quickly as the system becomes complex.
> Instead of mocking your database call to always return "foo" when the word "SELECT" is in the query, insert a real "foo" in a real test database and perform a real query.
Instead, consider not sprinkling "select" statements throughout your code. That tight coupling makes good testing much more difficult (it requires the "set all the global state" model of testing), and it's also just generally not good code structure. The use of SQL is an implementation detail that most of your code shouldn't need to know about.
A thin layer around the DB interactions gives you a smaller set of code that needs to be tested with state, a scoped surface area for any necessary mocking, a much easier time if you need to change storage systems, and a single place where you can reason over all the possible DB interactions. This is just good separation of concerns.
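Something like this shape (a sketch over sqlite3; the schema and names are hypothetical):

    from collections import namedtuple

    Order = namedtuple("Order", ["id", "total"])

    class OrderStore:
        # The one class that knows SQL; callers never see a query string.
        def __init__(self, conn):  # e.g. a sqlite3 connection
            self._conn = conn

        def find_by_customer(self, customer_id):
            rows = self._conn.execute(
                "SELECT id, total FROM orders WHERE customer_id = ?",
                (customer_id,),
            ).fetchall()
            return [Order(*row) for row in rows]

Tests that need real state only have to exercise this one class; everything above it can take an OrderStore (or a fake of one) as an input.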
I tamed our inter-test dependencies by doing things like starting dbus on a different port for each test - now I can test with real dbus in the loop and my tests are fast and isolated. We have a strict rule about which directories we are allowed to write to (embedded system - the others are read-only in production), so it is easy to point those at a temp dir. It was some work to set that up, but it tames most of the issues with global state and lets me verify what really counts: the system works.
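The dbus specifics aside, the per-test isolation boils down to something like this (a sketch assuming pytest; APP_DATA_DIR is a made-up knob):

    import socket
    import pytest

    @pytest.fixture
    def free_port():
        # Ask the OS for an unused port so each test gets its own daemon.
        s = socket.socket()
        s.bind(("127.0.0.1", 0))
        port = s.getsockname()[1]
        s.close()
        return port

    @pytest.fixture
    def writable_dir(tmp_path, monkeypatch):
        # Point the one writable directory at a per-test temp dir.
        monkeypatch.setenv("APP_DATA_DIR", str(tmp_path))
        return tmp_path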
For a CRUD web app your database separation of concerns makes sense. However in my domain we have lots of little data stores and nobody else needs access to that store. As such we put it on each team to develop the separation that makes sense for them - I don't agree with all their decisions, but they get to deal with it.
Tests that work and verify the system works are the Pri0 requirement. Most of the conversations about how best to test are structured for the benefit of people who are struggling with meeting the Pri0 because of maintainability. With enough effort any strategy can work.
> However in my domain we have lots of little data stores and nobody else needs access to that store.
If the little data stores are isolated to small individual areas of code then you probably already have the necessary isolation. Introducing the lightweight data store isolation layer might be useless (or not, context dependent). Now if these individual areas are doing things like handing off result sets to other code then I would have something different to say.
A test should not fail when the outputs do not change. In pursuit of this ideal I often end up with fakes (to use Martin Fowler's terms) of varying levels of complexity, but not "mocks" as many folks refer to them [0].
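For instance, asserting on outputs only (a toy illustration):

    from datetime import datetime

    class FakeClock:
        def __init__(self, now):
            self._now = now
        def now(self):
            return self._now

    def greeting(clock):
        return "good morning" if clock.now().hour < 12 else "good afternoon"

    def test_greeting():
        # Fails only if the output changes, not if internals are refactored.
        assert greeting(FakeClock(datetime(2024, 1, 1, 9))) == "good morning"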
[0] - https://docs.python.org/3/library/unittest.mock.html#unittes...
There are some specific cases, such as validating that caching is working as expected, where it can make sense to fully validate every call. Most of the time, though, this is a pointless exercise that serves mostly to make it difficult to maintain tests.
It can sometimes also be useful as part of writing new code, because it can help validate that your mental model for the code is correct. But it’s a nightmare for maintenance and committing over-constrained tests just creates future burden.
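The caching case is roughly the one place where call-count assertions earn their keep (a sketch using unittest.mock):

    from unittest.mock import Mock

    def cached(fetch):
        cache = {}
        def get(key):
            if key not in cache:
                cache[key] = fetch(key)
            return cache[key]
        return get

    def test_fetch_happens_once():
        fetch = Mock(return_value="foo")
        get = cached(fetch)
        assert get("k") == "foo"
        assert get("k") == "foo"
        fetch.assert_called_once_with("k")  # here the call count IS the behavior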
In Fowler's terminology, I think I tend to use Stubs rather than Mocks in most cases.
A particularly complex fake can even be unit-tested, if need be. Of course, if you're writing huge fakes, there's probably something wrong with your architecture, but I feel like good testing practices should give you options even when you're working with poorly architected code.
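E.g., giving the fake its own tiny contract test (illustrative):

    class FakeKVStore:
        # In-memory fake; trivial here, but the pattern scales to complex fakes.
        def __init__(self):
            self._data = {}
        def put(self, key, value):
            self._data[key] = value
        def get(self, key):
            return self._data.get(key)

    def test_fake_roundtrip():
        kv = FakeKVStore()
        kv.put("a", 1)
        assert kv.get("a") == 1
        assert kv.get("missing") is None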