Integration tests are, in a way, the worst of both worlds: they are more complicated than unit tests, they require involved setup, and yet they can’t really guarantee that things work in production.
End-to-end tests, meanwhile, do show whether things work or not. If something fails with an error, your error reporting should be good enough in the first place to show you exactly what is wrong. If something fails without an error but you know it failed, make it fail with an error first by writing another test case. If there was an error but error reporting somehow doesn’t capture it, you have a bigger problem than tests.
At the end of the day, you want certainty that you deliver working software. If it’s too difficult to identify the failure, improve your error reporting system. Giving up that certainty because your error reporting is not good enough seems like a bad tradeoff.
Incidentally, grug-friendly e2e tests absolutely exist: just take your software, exactly as it’s normally built, and run a script that uses it like it would be used in production. This gives you a good enough guarantee that it works. If there is no script, just do it yourself, go through a checklist, write a script later. It doesn’t get more grug than that.
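For illustration, such a script can be tiny. A minimal sketch, assuming a hypothetical binary, port, and health endpoint (all names here are invented):

```python
#!/usr/bin/env python3
"""Grug e2e smoke test: run the software exactly as built, poke it the way
production would. Binary name, port, and endpoint are invented for illustration."""
import subprocess
import sys
import time
import urllib.request

server = subprocess.Popen(["./build/myapp", "--port", "8080"])  # the real artifact
try:
    time.sleep(2)  # crude readiness wait; enough for a sketch
    with urllib.request.urlopen("http://localhost:8080/health", timeout=5) as resp:
        body = resp.read()
    assert b"ok" in body, f"unexpected health response: {body!r}"
    print("smoke test passed")
    sys.exit(0)
finally:
    server.terminate()
    server.wait()
```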
> it's also easy to manually test while developing during QA and UAT.
As I said in the original comment, e2e tests can definitely be manual. Invoke your CLI, curl your API, click around in the GUI. That said, testing comprehensively that way quickly becomes infeasible as your software grows.
I'd say that "works", "works correctly", and "covers all edge cases" are different scenarios in my mind. Taking an exaggerated example: if I build a tax calculator or something else that crunches numbers, I'd have more confidence with a few unit tests matching the output of the main method that does the calculation than with a whole end-to-end test suite. It seems wasteful to run the full flow end to end (log in, click buttons, check that a UI element appears, etc.) just to cover the logical output of the one part that does the serious business logic. A simple e2e suite could be useful as a smoke test to check for regressions, but it needs to stay unspecific, otherwise it will break on minor UX changes, which makes it a pain to maintain.
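To be concrete, a few tests like this sketch would give me that confidence; calculate_tax and the figures are invented for illustration:

```python
# Hypothetical: calculate_tax() is the "main method" holding the business
# logic; the module name and expected figures are made up for this sketch.
from tax import calculate_tax

def test_no_income_no_tax():
    assert calculate_tax(0) == 0

def test_income_within_lowest_bracket():
    assert calculate_tax(10_000) == 1_000

def test_income_spanning_brackets():
    assert calculate_tax(50_000) == 10_000
```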
Sure, you wouldn’t have all possible datasets and scenarios, but you can easily have a few, so that the e2e test fails if the results don’t make sense.
Of course, unit tests for your business logic make sense in this case. Ideally, you would express the tax calculation rules as a declarative dataset and focus your unit tests on that one function that applies the rules to the data; if the rules themselves are wrong, that is now a concern for the legal subject matter experts, not a bug in the app that you would need to bother writing unit tests for.
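A minimal sketch of that shape, with invented bracket figures standing in for the real rules:

```python
# Hypothetical declarative rules: ordered marginal brackets as plain data.
# The figures are invented; real rules would come from the legal experts.
BRACKETS = [
    # (upper bound of bracket, marginal rate)
    (10_000, 0.10),
    (40_000, 0.20),
    (float("inf"), 0.30),
]

def calculate_tax(gross_income: float) -> float:
    """The one function worth unit-testing: applies the rules to the data."""
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if gross_income <= lower:
            break
        tax += (min(gross_income, upper) - lower) * rate
        lower = upper
    return tax
```

If the numbers come out wrong, you either fix the one function (a bug) or fix the dataset (an expert’s job), and the unit tests only need to cover the former.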
However, your unit tests passing is just not a signal you can use for “ship it”. They are a development aid (hence the requirement that they be fast). An e2e test, meanwhile, is that signal. It is not meant to be fast, but when it comes to a release, things can wait a few minutes.
> Ideally any changes are manually tested before releasing, and a bug in one part of the app that's being worked on is not likely to affect a different section, e.g. not necessary to retest the password reset flow if you're working on the home dashboard
That is one can of worms. First, during normal development it is very common to modify something that affects multiple parts of the app; in fact, in a big app it is beyond any human to know exactly what a change affects (ergo, testing). Second, while manual testing is a kind of e2e testing, it stops being feasible as the application grows.
> usually flaky end-to-end tests
Then make them not flaky. It’s amazing what can happen if something stops being treated as an afterthought!
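One common fix, as a sketch: fixed sleeps are a frequent source of flakiness, so replace them with explicit polling. The wait_until helper and its parameters are invented for illustration:

```python
import time

def wait_until(check, timeout=30.0, interval=0.5):
    """Poll `check` until it returns truthy, instead of sleeping a fixed
    amount and hoping the system is ready by then."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout:.1f}s")

# Usage: instead of time.sleep(10) before asserting, wait for the actual
# condition (api and job_id are hypothetical):
# wait_until(lambda: api.get_job_status(job_id) == "done")
```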