Don't test the wrong things; if you care about some precondition, that should be an input. If you need to measure a side effect, that should be an output. Don't tweak global state to do your testing.
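A toy sketch of what that means (hypothetical names throughout):

    # Before: reads and mutates globals, so a test has to poke them first.
    PROMO_ACTIVE = True
    audit_log = []

    def apply_discount_global(order_total, order_id):
        total = order_total * 0.9 if PROMO_ACTIVE else order_total
        audit_log.append(order_id)  # hidden side effect
        return total

    # After: the precondition is a parameter, the side effect is a return value.
    def apply_discount(order_total, order_id, promo_active):
        total = order_total * 0.9 if promo_active else order_total
        return total, ("audit", order_id)  # caller decides what to do with it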
I disagree. Global state is exactly what you should be testing - but you need to be smart about it: how you set up and verify global state matters. And don't confuse the global state I mean here with global variables - I mean the external state of the program before and after it runs, which means network, files, time, and other IO.
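For example, with files (a minimal sketch using pytest's tmp_path fixture):

    # Set up real external state, run the real code, verify real external state.
    def save_report(path, lines):
        with open(path, "w") as f:
            f.write("\n".join(lines))

    def test_save_report(tmp_path):  # tmp_path: per-test temp dir from pytest
        out = tmp_path / "report.txt"
        save_report(out, ["a", "b"])
        assert out.read_text() == "a\nb"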
Again, I've heard "but what if my database/table changes so rapidly that I need the mock so I don't have to change the query all the time" - in which case you ought to take a moment to write down what you're actually trying to accomplish, rather than using mocks to pave over poor architectural decisions. Eventually the real query fails while the mock keeps succeeding, because the two were never related in the first place.
So far I've only seen mocks fail eventually and mysteriously. With setups and DI you can treat things mostly as a black box from a testing point of view, but when mocks are involved you need surgical precision to hit the right target at the right time.
Rarely should a mock be “interacting with the underlying code”, because it should be a dead end that returns canned data and makes no other calls.
If your mock is calling back into other code you’ve probably not got a mock but some other kind of “test double”. Maybe a “fake” in Martin Fowler’s terminology.
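Roughly, the difference looks like this (an illustrative sketch, not Fowler's own example):

    # A mock in this sense: a dead end that returns canned data, calls nothing.
    class MockUserStore:
        def get(self, user_id):
            return {"id": user_id, "name": "canned"}

    # A fake: a genuinely working implementation taking a shortcut (in-memory).
    class FakeUserStore:
        def __init__(self):
            self._rows = {}
        def put(self, user):
            self._rows[user["id"]] = user
        def get(self, user_id):
            return self._rows.get(user_id)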
If you have test doubles that are involved in a bunch of calls back and forth between different pieces of code then there’s a good chance you have poorly factored code and your doubles are complex because of that.
Now, I won’t pretend changes don’t regularly break test doubles, but for mocks it’s usually method changes or additions and the fix is mechanical (though annoying). If your mocks are duplicating a bunch of logic, though, then something else is going on.
I haven't seen mocks fail mysteriously. I've seen them fail often, though: requirements change, you update the callers (generally a small number), and suddenly 200 tests fail - and you give up because updating all of them is too hard. Mocks are always about implementation details - sometimes you have no choice, but the more you can test actual behavior the better.
In my experience global state is the testing bug farm. Tests that depend on global state are usually taking dependencies they aren't even aware of. Test initializations grow into complex "poke this, prod that, set this value to some magic number" setups that attempt to tame the global state, but as the global state grows this becomes more and more difficult and inconsistent. Inter-test dependencies sneak in, parallelism becomes impossible, and engineers start turning off "flaky" tests because they've spent hours or days trying to reproduce failures only to eventually give up.
This sort of development is attractive when starting up a project because it’s straightforward and the testing “bang for the buck” is high. But it degrades quickly as the system becomes complex.
> Instead of mocking your database call to always return "foo" when the word "SELECT" is in the query, insert a real "foo" in a real test database and perform a real query.
Instead, consider not sprinkling "select" statements throughout your code. That tight coupling makes good testing much more difficult (it requires the "set all the global state" model of testing), and it's also just generally not good code structure. The use of SQL is an implementation detail that most of your code shouldn't need to know about.
A thin layer around the DB interactions gives you a smaller set of code that needs to be tested with state, a scoped surface area for any necessary mocking, a much easier time if you need to change storage systems, and a single place where you can reason over all the possible DB interactions. This is just good separation of concerns.
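Something like this shape (a sketch over sqlite3; the schema and names are hypothetical):

    from collections import namedtuple

    Order = namedtuple("Order", ["id", "total"])

    class OrderStore:
        # The one class that knows SQL; callers never see a query string.
        def __init__(self, conn):  # e.g. a sqlite3 connection
            self._conn = conn

        def find_by_customer(self, customer_id):
            rows = self._conn.execute(
                "SELECT id, total FROM orders WHERE customer_id = ?",
                (customer_id,),
            ).fetchall()
            return [Order(*row) for row in rows]

Tests that need real state only have to exercise this one class; everything above it can take an OrderStore (or a fake of one) as an input.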
I tamed our inter-test dependencies by doing things like starting dbus on a different port for each test - now I can test with real dbus in the loop and my tests are fast and isolated. We have a strict rule about which directories we are allowed to write to (embedded system - the others are read-only in production), so it is easy to point those at a temp dir. It was some work to set that up, but it tames most of the issues with global state and lets me verify what really counts: the system works.
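The dbus specifics aside, the per-test isolation boils down to something like this (a sketch assuming pytest; APP_DATA_DIR is a made-up knob):

    import socket
    import pytest

    @pytest.fixture
    def free_port():
        # Ask the OS for an unused port so each test gets its own daemon.
        s = socket.socket()
        s.bind(("127.0.0.1", 0))
        port = s.getsockname()[1]
        s.close()
        return port

    @pytest.fixture
    def writable_dir(tmp_path, monkeypatch):
        # Point the one writable directory at a per-test temp dir.
        monkeypatch.setenv("APP_DATA_DIR", str(tmp_path))
        return tmp_path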
For a CRUD web app your database separation of concerns makes sense. However in my domain we have lots of little data stores and nobody else needs access to that store. As such we put it on each team to develop the separation that makes sense for them - I don't agree with all their decisions, but they get to deal with it.
Tests that work and verify the system works are the Pri0 requirement. Most of the conversations about how best to test are structured for the benefit of people who are struggling with meeting the Pri0 because of maintainability. With enough effort any strategy can work.
> However in my domain we have lots of little data stores and nobody else needs access to that store.
If the little data stores are isolated to small individual areas of code then you probably already have the necessary isolation. Introducing the lightweight data store isolation layer might be useless (or not, context dependent). Now if these individual areas are doing things like handing off result sets to other code then I would have something different to say.
A test should not fail when the outputs do not change. In pursuit of this ideal I often end up with fakes (to use Martin Fowler's terms) of varying levels of complexity, but not "mocks" as many folks refer to them [0].
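For instance, asserting on outputs only (a toy illustration):

    from datetime import datetime

    class FakeClock:
        def __init__(self, now):
            self._now = now
        def now(self):
            return self._now

    def greeting(clock):
        return "good morning" if clock.now().hour < 12 else "good afternoon"

    def test_greeting():
        # Fails only if the output changes, not if internals are refactored.
        assert greeting(FakeClock(datetime(2024, 1, 1, 9))) == "good morning"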
[0] - https://docs.python.org/3/library/unittest.mock.html#unittes...
There are some specific cases, such as validating that caching is working as expected, where it can make sense to fully validate every call. Most of the time, though, this is a pointless exercise that serves mostly to make it difficult to maintain tests.
It can sometimes also be useful as part of writing new code, because it can help validate that your mental model for the code is correct. But it’s a nightmare for maintenance and committing over-constrained tests just creates future burden.
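The caching case is roughly the one place where call-count assertions earn their keep (a sketch using unittest.mock):

    from unittest.mock import Mock

    def cached(fetch):
        cache = {}
        def get(key):
            if key not in cache:
                cache[key] = fetch(key)
            return cache[key]
        return get

    def test_fetch_happens_once():
        fetch = Mock(return_value="foo")
        get = cached(fetch)
        assert get("k") == "foo"
        assert get("k") == "foo"
        fetch.assert_called_once_with("k")  # here the call count IS the behavior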
In Fowler's terminology, I think I tend to use Stubs rather than Mocks in most cases.
A particularly complex fake can even be unit-tested, if need be. Of course, if you're writing huge fakes, there's probably something wrong with your architecture, but I feel like good testing practices should give you options even when you're working with poorly architected code.
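E.g., giving the fake its own tiny contract test (illustrative):

    class FakeKVStore:
        # In-memory fake; trivial here, but the pattern scales to complex fakes.
        def __init__(self):
            self._data = {}
        def put(self, key, value):
            self._data[key] = value
        def get(self, key):
            return self._data.get(key)

    def test_fake_roundtrip():
        kv = FakeKVStore()
        kv.put("a", 1)
        assert kv.get("a") == 1
        assert kv.get("missing") is None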