Don't test the wrong things; if you care about some precondition, that should be an input. If you need to measure a side effect, that should be an output. Don't tweak global state to do your testing.
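A minimal sketch of that shape, with a hypothetical discount function: the precondition (the user's tier) is an explicit parameter instead of a module-level global, and the side effect (an audit record) comes back as a return value instead of being appended to shared state.

    # Hypothetical example: precondition as an input, side effect as an output.
    # Nothing here reads or writes module-level state.

    def apply_discount(price_cents, tier):
        """Return (discounted_price_cents, audit_record)."""
        rate = 10 if tier == "gold" else 0  # percent off
        discounted = price_cents * (100 - rate) // 100
        audit = {"event": "discount", "tier": tier, "rate": rate}
        return discounted, audit

    def test_gold_tier_gets_ten_percent_off():
        price, audit = apply_discount(10_000, tier="gold")
        assert price == 9_000
        assert audit["rate"] == 10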
Rarely should a mock be “interacting with the underlying code”, because it should be a dead end that returns canned data and makes no other calls.
If your mock is calling back into other code, you probably have not a mock but some other kind of “test double”, perhaps a “fake” in Martin Fowler’s terminology.
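Sketched with a hypothetical repository interface, the distinction looks roughly like this: the mock is a dead end that hands back canned data, while the fake is a small working implementation the code under test can genuinely call into.

    from unittest.mock import Mock

    # Mock: a dead end. Returns canned data, makes no other calls.
    repo = Mock()
    repo.get_user.return_value = {"id": 1, "name": "Ada"}

    # Fake (Fowler's term): a simplified but real implementation,
    # here an in-memory store that participates in the logic.
    class FakeUserRepo:
        def __init__(self):
            self._users = {}

        def save_user(self, user):
            self._users[user["id"]] = user

        def get_user(self, user_id):
            return self._users[user_id]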
If your test doubles are involved in a bunch of calls back and forth between different pieces of code, there’s a good chance the code is poorly factored and the doubles are complex because of it.
Now, I won’t pretend changes don’t regularly break test doubles, but for mocks it’s usually method changes or additions, and the fix is mechanical (if annoying). If your mocks are duplicating a bunch of logic, though, something else is going on.
A test should not fail when the outputs do not change. In pursuit of this ideal I often end up with fakes (to use Martin Fowler's terms) of varying levels of complexity, but not "mocks" [0] as many folks refer to them.
[0] - https://docs.python.org/3/library/unittest.mock.html#unittes...
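A sketch of what that ideal tends to produce, using a hypothetical in-memory fake: the test asserts on the observable output (the resulting stock level), so refactoring how fulfill_order talks to the inventory cannot fail the test as long as the output stays the same.

    # Hypothetical example: the fake holds state, and the test checks
    # outputs rather than the exact sequence of calls that produced them.

    class FakeInventory:
        def __init__(self):
            self.stock = {}

        def add(self, sku, qty):
            self.stock[sku] = self.stock.get(sku, 0) + qty

        def remove(self, sku, qty):
            self.stock[sku] = self.stock.get(sku, 0) - qty

    def fulfill_order(inventory, sku, qty):
        inventory.remove(sku, qty)  # internals are free to change

    def test_fulfill_reduces_stock():
        inv = FakeInventory()
        inv.add("widget", 5)
        fulfill_order(inv, "widget", 2)
        assert inv.stock["widget"] == 3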
There are some specific cases, such as validating that caching is working as expected, where it can make sense to fully validate every call. Most of the time, though, it's a pointless exercise that mostly serves to make tests harder to maintain.
It can also be useful while writing new code, because it helps validate that your mental model of the code is correct. But it's a nightmare for maintenance, and committing over-constrained tests just creates future burden.
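Caching is the rare case where the call count is itself the behaviour under test, so asserting on calls is the point rather than over-constraint. A sketch with unittest.mock and a hypothetical memoizer:

    from unittest.mock import Mock

    def cached(fn):
        """Hypothetical memoizer for single-argument functions."""
        results = {}
        def wrapper(arg):
            if arg not in results:
                results[arg] = fn(arg)
            return results[arg]
        return wrapper

    def test_cache_avoids_repeat_backend_calls():
        backend = Mock(return_value=42)
        lookup = cached(backend)
        assert lookup("key") == 42
        assert lookup("key") == 42
        # Here the call count *is* the behaviour being validated.
        backend.assert_called_once_with("key")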
In Fowler's terminology, I think I tend to use Stubs rather than Mocks in most cases.
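In that terminology, a stub only supplies canned answers and the test asserts on outputs, while a mock is also used to verify the interaction itself. With unittest.mock the same object can play either role; the difference is what the test asserts. A hypothetical sketch:

    from unittest.mock import Mock

    def describe_user(repo, user_id):
        """Hypothetical code under test."""
        return f"{repo.get_user(user_id)['name']} (#{user_id})"

    repo = Mock()
    repo.get_user.return_value = {"name": "Ada"}

    # Stub usage: canned data in, assertion on the output.
    assert describe_user(repo, 1) == "Ada (#1)"

    # Mock usage: the assertion is about the call itself.
    repo.get_user.assert_called_once_with(1)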