For example, in both cases, the tests work best if I test the subject under test as a black box (i.e. interact only with its public interface) but use my knowledge of its internals to identify the weaknesses that will most require testing. In both cases, I want to structure the code so that the subject under test is as isolated as possible - i.e. no complex interactions with global state, no mocking of unrelated modules, and no complex mechanism to reset anything after the test is done. In both cases, I want the test to run fast, ideally instantaneously, so I get immediate results.
The biggest difference is that it's usually harder to write good integration tests because they're interacting with external systems that are generally slower and stateful, so I've got to put extra work into getting the tests themselves to be fast and stateless. But when that works, there's really not much difference at all between a test that tests a single function, and a test that tests a service class with a database dependency.
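To make the "fast and stateless" part concrete, one common trick (not something spelled out above) is to run each database-backed test inside a transaction and roll it back afterwards. A minimal Jest-style sketch, where the `db` handle and its `begin`/`rollback`/`query` methods are hypothetical stand-ins rather than any specific library's API:

```typescript
// Hypothetical database handle; begin/rollback/query are stand-ins for
// whatever your driver or ORM actually exposes.
declare const db: {
  begin(): Promise<void>;
  rollback(): Promise<void>;
  query(sql: string, params?: unknown[]): Promise<unknown[]>;
};

beforeEach(() => db.begin());    // every test starts from a known state
afterEach(() => db.rollback());  // nothing a test writes survives it

it("stores and retrieves an order", async () => {
  await db.query("insert into orders (id, total) values ($1, $2)", ["o1", 42]);
  const rows = await db.query("select total from orders where id = $1", ["o1"]);
  expect(rows).toHaveLength(1);
});
```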
I would write integration/system tests (different, but similar, imo) to test that the black-box integrations with the system work as expected. Generally closer to the "user story" end of things.
I would write unit tests for smaller, targeted things, like making sure the sort method works in various cases, etc. Individual methods, especially ones that don't interact with data outside what is passed into them (functional methods), are good for unit testing.
I've found that well-written integration tests help me catch workflow-level issues (e.g. something changed in a dependency that might be mocked in unit tests).
So while I think good integration tests are the best way to make sure things should ship, I see a lot of value in good unit tests for day-to-day velocity, particularly in code that's being maintained or updated instead of new code.
This is what unit testing was originally described as. Which confirms my belief that unit testing and integration testing have always been the very same thing.
> Individual methods, especially ones that don't interact with data outside what is passed into them (functional methods), are good for unit testing.
Perhaps unit testing has come to mean this, but these kinds of tests are rarely ever worth writing, so it is questionable whether the practice even needs a name. Sometimes it can be helpful to isolate a function like that for the sake of pinning down complex logic or edge cases, but it is likely you'll want to delete this kind of test once you're done. This is where testing brittleness is born.
- Unit test = my code works
- Functional test = my design works
- Integration test = my code is using your 3rd party stuff correctly (databases, etc)
- Factory Acceptance Test = my system works
- Site Acceptance Test = your code sucks, this totally isn't what I asked for!?!
Then there are more "concern-oriented" groupings, like "regression tests", which could fall into any number of the above.
That being said, there's a pretty wide set of opinions on the topic, and that doesn't really seem to change over time.
> these kinds of tests are rarely ever worth writing
I strongly disagree. I find it very helpful to write unit tests for specific implementations of things (like a specific sort, to make sure it works correctly with the various edge cases). Do they get discarded if you completely change the implementation? Sure. But that doesn't detract from the fact that they help make sure the current implementation works the way I say it does.
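For what it's worth, the kind of implementation-specific edge-case tests being described might look something like this. A minimal Jest-style sketch; `insertionSort` here is a hypothetical stand-in, not code from the thread:

```typescript
// Hypothetical implementation under test.
function insertionSort(xs: number[]): number[] {
  const out = [...xs];
  for (let i = 1; i < out.length; i++) {
    const v = out[i];
    let j = i - 1;
    while (j >= 0 && out[j] > v) {
      out[j + 1] = out[j];
      j--;
    }
    out[j + 1] = v;
  }
  return out;
}

describe("insertionSort edge cases", () => {
  it("handles an empty array", () => expect(insertionSort([])).toEqual([]));
  it("handles a single element", () => expect(insertionSort([7])).toEqual([7]));
  it("keeps duplicates", () => expect(insertionSort([3, 1, 3, 2])).toEqual([1, 2, 3, 3]));
  it("leaves a sorted array unchanged", () => expect(insertionSort([1, 2, 3])).toEqual([1, 2, 3]));
});
```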
Sorting mightn't be the greatest example as sorting could quite reasonably be the entire program (i.e. a library).
But if you needed some kind of custom sort function to serve features within a greater application, you are already going to know that your sort function works correctly by virtue of the greater application working correctly. Testing the sort function in isolation is ultimately pointless.
As before, there may be some benefit in writing code to run that sort function in isolation during development to help pinpoint what edge cases need to be considered, but there isn't any real value in keeping that around after development is done. The edge cases you discovered need to be moved up in the abstraction to the greater program anyway.
So ultimately we write tests at a lower level to deal with the combinatorial explosion of possible inputs at the edge.
You should push your tests as far to the edge as possible but no further. If a test at the edge duplicates a test in the middle, delete the test in the middle. But if a test at the edge can't possibly account for everything, you're going to need a test in the middle.
For me, heavy tests implies end-to-end tests, because at that point you're interacting with the whole system including potentially a browser, and that's just going to be slow whichever way you look at it. But just accessing a database, or parsing and sending http requests doesn't have to be particularly slow, at least not compared to the speed at which I develop. I'd expect to be able to run hundreds of those sorts of tests in less than a second, which is fast enough for me.
My unit tests test things that must not work
If it matters, why can't you check? Will your product/app/system not run into these possible input sets eventually?
> So ultimately we write tests at a lower level to deal with the combinatorial explosion of possible inputs at the edge.
Don't you have to write the combinatorial explosion of inputs for the unit tests, too, to test "every possible combination"? If not, and you're only testing a subset, then why not test the whole flow while you're at it?
Now, you could create hundreds of different integration tests for each branch of the computation..., most of which will assert the same final output state, but achieved through different transitions
Or you can make some integration tests which make sure the logic itself is being called, and then only unit test the specific criteria in isolation.
What you're talking about is likely founded in either frontend testing (component tests vs unit tests) or backends which generally have pretty trivial logic complexity. In these cases, just doing an integration test gets it done for the most part, but as soon as you've got multiple stakeholders giving you separate requirements and the consumed inputs get bigger and multiply ... testing via integration tests becomes essentially impossible to do in practice.
I only recently started looking into Quickcheck style libraries in the typescript world, and fast-check is fantastic. Like super high quality. Great support for shrinking in all sorts of cases, very well typed, etc.
Hooking fast-check up to a real database/redis instance has been incredible for finding bugs. Pair it up with some regular ol' case-by-case integration tests for some seriously robust typescript!
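For anyone unfamiliar with the fast-check style, a minimal property test looks roughly like this; `dedupeSort` is a hypothetical stand-in, and the real payoff described above comes from driving stateful things like a database with generated inputs:

```typescript
import fc from "fast-check";

// Hypothetical function under test: de-duplicate and sort numbers ascending.
function dedupeSort(xs: number[]): number[] {
  return Array.from(new Set(xs)).sort((a, b) => a - b);
}

it("dedupeSort returns a sorted, duplicate-free version of its input", () => {
  fc.assert(
    fc.property(fc.array(fc.integer()), (xs) => {
      const out = dedupeSort(xs);
      const sorted = out.every((v, i) => i === 0 || out[i - 1] <= v);
      const noDupes = new Set(out).size === out.length;
      const complete = xs.every((x) => out.includes(x));
      return sorted && noDupes && complete;
    })
  );
});
```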
What you do is write more primitive components and either unit test them, prove them to be correct or make them small enough to be correct by inspection. An integration test is just testing that the interfaces do indeed fit together, it won't normally be close to testing all possible code paths internally.
I think of it like building any other large machine with many inputs. You can't possibly test a car under every conceivable condition. Imagine if someone was like "but wait, did you even test going round a corner at 60mph in the wet with the radio on?!"
Maybe FooSystem will be redesigned to take different inputs, maybe the upstream will change to provide different outputs, maybe responsibility will shift around due to changes in the number of dependencies and it makes sense to vertically integrate some prep to upstream to share it.
Unit tests in these circumstances - and they're the majority of unit tests, IME - can act as a drag on the quality of the system. It's better to test things like this at the component level instead of the unit level.
That said, I think it takes a real knack to figure out the right sort of tests, and it sometimes takes me a couple of attempts to get it right. In that case, being willing to delete or completely rewrite tests that just aren't being useful is important!
I find the problem with trying to move the tests up a level of abstraction is that eventually the code you're writing is probably going to change, and the tests that were useful for development the first time round will probably continue to be useful the second time round as well. So keeping them in place, even if they're really implementation-specific, is useful for as long as that implementation exists. (Of course, if the implementation changes for one with different edge cases, then you should probably get rid of the tests that were only useful for the old implementation.)
Importantly, this only works if the boundaries of the unit are fairly well-defined. If you're implementing a whole new sort algorithm, that's probably the case. But if I was just writing a function that compares two operands, that could be passed to a built-in sort function, I might look to see if there's a better level of abstraction to test at, because I can imagine the use of that compare function being something that changes a lot during refactorings.
Ideally your units/integrations will never change. If they do change, that means the users of your code will face breakage and that's not good citizenry. Life is messy and sometimes you have little choice, but such changes should be as rare as possible.
What is actually likely to change is the little helper functions you create to support the units, like said bespoke sort function. This is where testing can quickly make code fragile and is ultimately unnecessary. If the sort function is more useful than just a helper then you will move it out into its own library and, like before, the sort function will become the entire program and thus the full integration.
If you are concerned that the ORM won't behave as it claims to, you can write tests targeted at it directly. You can then run the same tests against your mock implementation to show that it conforms to the same contract.
But an ORM of any decent quality will already be well tested and shouldn't do unexpected things, so perhaps the worry is for naught?
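The "run the same tests against your mock implementation" idea can be expressed as a shared contract suite. A rough Jest-style sketch, where the repository interface and both implementations are hypothetical (the ORM-backed one is left commented out since it depends on your actual setup):

```typescript
interface User { id: string; name: string; }

interface UserRepository {
  save(user: User): Promise<void>;
  findById(id: string): Promise<User | undefined>;
}

// In-memory 'mock' implementation that must honour the same contract.
class InMemoryUserRepository implements UserRepository {
  private users = new Map<string, User>();
  async save(user: User) { this.users.set(user.id, { ...user }); }
  async findById(id: string) { return this.users.get(id); }
}

// The contract: any conforming repository must pass these tests.
function userRepositoryContract(makeRepo: () => UserRepository) {
  it("returns a saved user by id", async () => {
    const repo = makeRepo();
    await repo.save({ id: "1", name: "Ada" });
    expect(await repo.findById("1")).toEqual({ id: "1", name: "Ada" });
  });

  it("returns undefined for an unknown id", async () => {
    const repo = makeRepo();
    expect(await repo.findById("missing")).toBeUndefined();
  });
}

describe("InMemoryUserRepository", () => userRepositoryContract(() => new InMemoryUserRepository()));
// describe("OrmUserRepository", () => userRepositoryContract(() => new OrmUserRepository(db)));
```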
I think this is what you're saying about moving useful units out into their own library. I agree, and I think it sounds like we'd draw the testing boundaries in similar places, but I don't think it's necessary to move these sorts of units into separate libraries for them to be isolated modules that can be usefully tested.
The sort function is one of the edge cases where how I'd test it would probably depend a lot on the context, but in theory a generic sort function has a very standard interface that I wouldn't expect to change much, if at all. So I'd be quite happy treating it as a unit in its own right and writing a bunch of tests for it. But if it's something really implementation-specific that depends on the exact structure of the thing it's sorting, then it's probably better tested in context. But I'm quite willing to write tests for little helper functions that I'm sure will be quite stable.
The whole of the interface is the unit, as Beck originally defined it. As it is the integration point. Hence why there is no difference between them.
> And most of the units you're writing are probably internal-facing
No. As before, it is a mistake to test internal functions. They are just an implementation detail. I understand that some have taken unit test to mean this, but I posit that as it is foolish to do it, there is no need to talk about it, allowing unit test to refer to its original and much more sensible definition. It only serves to confuse people into writing useless, brittle tests.
> So I'd be quite happy treating it as a unit in its own right
Right, and, likewise, you'd put it in its own package in its own right so that it is available to all sort cases you have. Thus, it is really its own program — and thus would have its own tests.
Sure, yeah, I think we're saying the same thing. A unit is a chunk of code that can act as its own program or library - it has an interface that will remain fairly fixed, and an implementation that could change over time. (Or, a unit is the interface that contains this chunk of code - I don't think the difference between these two definitions is so important here.) You could pull it out into its own library, or you can keep it as a module/file/class/function in a larger piece of software, but it is a self-contained unit.
I think the important thing that I was trying to get across earlier, though, is that this unit can contain other units. At the most maximal scale, the entire application is a single unit made up of multiple sub-units. This is why I think a definition of unit/integration test that is based on whether a unit integrates other units doesn't really make much sense, because it doesn't actually change how you test the code. You still want quick, isolated tests, you still want to test the interface and not the internals (although you should be guided by the internals), and you still want to avoid mocking. So distinguishing between unit tests and integration tests in this way isn't particularly useful.
So `BankAccount` as a class is probably a useful unit boundary: once you've designed the class, you're probably not going to change the interface much, except for possibly adding new methods occasionally. You have a stable boundary there, where in theory you could completely rewrite the internals of the class but the external boundary will stay the same.
`FooSystemFrobnicatorPreparer` sounds much more like an internal detail of some other system, I agree, and its interface could easily be rewritten or the class removed entirely if we decide to prepare our frobnication in a different way. But in that case, maybe the `foo.system.frobnicator` package is the unit we want to test as a whole, rather than one specific internal class inside that package.
I think a lot of good test and system design is finding these natural fault lines where it's possible to create a relatively stable interface that can hide internal implementation details.
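As a concrete illustration of treating `BankAccount` as the unit boundary, the tests talk only to its public methods and never to its internals; the class below is a hypothetical stand-in:

```typescript
class BankAccount {
  private balance = 0;
  deposit(amount: number) {
    if (amount <= 0) throw new Error("deposit must be positive");
    this.balance += amount;
  }
  withdraw(amount: number) {
    if (amount > this.balance) throw new Error("insufficient funds");
    this.balance -= amount;
  }
  getBalance() { return this.balance; }
}

describe("BankAccount", () => {
  it("tracks deposits and withdrawals", () => {
    const acct = new BankAccount();
    acct.deposit(100);
    acct.withdraw(30);
    expect(acct.getBalance()).toBe(70);
  });

  it("rejects overdrafts", () => {
    const acct = new BankAccount();
    expect(() => acct.withdraw(1)).toThrow("insufficient funds");
  });
});
```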
But you can with unit tests?
> Can you test the Python parser on all possible Python programs?
A parser is one of the few cases where unit tests work. Very few people write parsers.
See also my sibling reply here: https://news.ycombinator.com/item?id=45078047
> What you do is write more primitive components and either unit test them, prove them to be correct or make them small enough to be correct by inspection. An integration test is just testing that the interfaces do indeed fit together, it won't normally be close to testing all possible code paths internally.
Ah yes. Somehow "behaviour of unit tests is correct" but "just testing interfaces in just a few integration tests". Funny how that becomes a PagerDuty alert at 3 in the morning because "correct behaviour" in one unit wasn't tested together with "correct behaviour" in another unit.
But when you actually write an actual integration test over actual (or simulated) inputs, suddenly 99%+ of your unit tests become redundant because actually using your app/system as intended covers most of the code paths you could possibly use.
Assuming by mock you mean an alternate implementation (e.g. an in-memory database repository) that relieves dependence on a service that is outside of immediate control, nah. There is no reason to avoid that. That's just an implementation detail and, as before, your tests shouldn't be bothered by implementation details. And since you can run your 'mock' against the same test suite as the 'real thing', you know that it fulfills the same contract as the 'real thing'. Mocks in that sense are also useful outside of testing.
If you mean something more like what is more commonly known as a stub, still no. This is essential for injecting failure states. You don't want to have to actually crash your hard drive to test your code under a hard-drive-crash condition. Failure-case tests are the most important tests you will write, so you will definitely be using these in all but the simplest programs.
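A small sketch of that kind of failure-injecting stub (Jest-style; the `Storage` interface and `saveReport` function are hypothetical):

```typescript
interface Storage {
  write(path: string, data: string): Promise<void>;
}

// Stub that always simulates a full disk.
class FailingStorage implements Storage {
  async write(): Promise<void> {
    throw new Error("ENOSPC: no space left on device");
  }
}

// Hypothetical code under test: should surface a clear error, not crash.
async function saveReport(storage: Storage, report: string): Promise<string> {
  try {
    await storage.write("/reports/latest.txt", report);
    return "saved";
  } catch (err) {
    return `failed: ${(err as Error).message}`;
  }
}

it("reports a storage failure instead of throwing", async () => {
  const result = await saveReport(new FailingStorage(), "quarterly numbers");
  expect(result).toContain("failed: ENOSPC");
});
```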
The failure mode I see much more often is in the other direction: tests that are testing too many units together and need to be lowered down to be more useful. For example, I recently wrote some code that generated intellisense suggestions for a DSL that our users use. Originally, the tests covered a large swathe of that functionality, and involved triggering e.g. lots of keydown events to check what happened when different keys were pressed. These were useful tests for checking that the suggestions box worked as expected, but they made it very difficult to test edge cases in how the suggestions were generated because the code needed to set that stuff up was so involved.
In the end, what I did was lower the tests so I had a bunch of tests for the suggestions-generation function (which was essentially `(input: str, cursor: int) -> Completion[]` and so super easy to test), and a bunch of tests for the suggestions box (which was now decoupled from the suggestions logic, and so also easier to test). I kept some higher-level integration tests, but only very few of them. The result is faster, but also much easier to maintain, with tests that are easier to write and code that's easier to refactor.
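A rough sketch of what those lowered, pure-function tests might look like; `Completion` and `getCompletions` are hypothetical stand-ins for the real `(input, cursor) -> Completion[]` function described above:

```typescript
interface Completion { label: string; }

// Toy implementation: suggest keywords matching the word before the cursor.
function getCompletions(input: string, cursor: number): Completion[] {
  const keywords = ["select", "sort", "sum"];
  const prefix = input.slice(0, cursor).split(/\s+/).pop() ?? "";
  return keywords
    .filter((k) => prefix.length > 0 && k.startsWith(prefix))
    .map((label) => ({ label }));
}

describe("getCompletions", () => {
  it("suggests keywords matching the current prefix", () => {
    expect(getCompletions("so", 2).map((c) => c.label)).toEqual(["sort"]);
  });

  it("suggests nothing when there is no prefix before the cursor", () => {
    expect(getCompletions("sort ", 5)).toEqual([]);
  });
});
```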
It is entirely possible for a sort function to be just one component of the functionality of the larger code base. Sort in specific is something I've written unit tests for.
> As before, there may be some benefit in writing code to run that sort function in isolation during development to help pinpoint what edge cases need to be considered, but there isn't any real value in keeping that around after development is done.
Those edge cases (and normal cases) continue to exist after the code is written. And if you find a new edge case later and need to change the code, then having the previous unit tests in place gives a certain amount of confidence that your changes (for the new case) aren't breaking anything. Generally, the only time I _remove_ unit tests is if I'm changing to a new implementation; when the method being tested no longer exists.