Write Tests You Love, Not Hate #6 - Conquering the Fragile Test Problem
In this post of our series, we solve the fragile test problem by decoupling tests from implementation.
In our previous post, we explored the fragile test problem—how changes to the system result in failing tests, even when there is no behavioral change. We pinpointed tight coupling between tests and implementation as the root cause of this problem.
Figure 1 illustrates the problem of tight coupling between test and implementation. For each class on the left-hand side, there is a test class on the right-hand side. If we restructure the implementation and move code between classes or extract code to new classes, the tests fail (as illustrated) even though the behavior did not change. You can find the detailed example here.
To solve the fragile test problem, we need to decouple tests from implementation.
Towards a Loose Coupling of Test and Implementation
In our initial scenario, we divided the `UserService` into multiple sub-services. Instead of testing each service separately, we can keep our `UserServiceTest` to exercise all four services collectively, merely mocking the `UserRepository`. This approach requires only minor modifications to the setup, if any. Tests run with `@SpringBootTest` adjust automatically, so no major changes to the tests are necessary. Although the implementation changes, it remains thoroughly tested.
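To make this concrete, here is a minimal, framework-free sketch of the idea. The class and method names (`UserRegistrationService`, `EmailValidationService`, `InMemoryUserRepository`) are hypothetical, and a hand-rolled in-memory fake stands in for the Mockito/Spring-mocked `UserRepository` so the example is self-contained:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical repository interface; in the real setup this would be
// the Spring-managed UserRepository, mocked in the test.
interface UserRepository {
    void save(String id, String email);
    Optional<String> findEmail(String id);
}

// Hypothetical sub-service: one of the pieces the UserService was split into.
class EmailValidationService {
    boolean isValid(String email) {
        return email != null && email.contains("@");
    }
}

// Entry point of the "unit": it composes the sub-services internally,
// so the test never needs to know how the work is divided between them.
class UserRegistrationService {
    private final UserRepository repository;
    private final EmailValidationService validator = new EmailValidationService();

    UserRegistrationService(UserRepository repository) {
        this.repository = repository;
    }

    boolean register(String id, String email) {
        if (!validator.isValid(email)) {
            return false;
        }
        repository.save(id, email);
        return true;
    }
}

// In-memory fake standing in for the mocked repository.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, String> store = new HashMap<>();

    @Override
    public void save(String id, String email) {
        store.put(id, email);
    }

    @Override
    public Optional<String> findEmail(String id) {
        return Optional.ofNullable(store.get(id));
    }
}

public class UserServiceTestSketch {
    public static void main(String[] args) {
        InMemoryUserRepository repository = new InMemoryUserRepository();
        UserRegistrationService unit = new UserRegistrationService(repository);

        // Assertions target observable behavior, not internal structure:
        // merging or re-splitting the sub-services leaves them green.
        if (!unit.register("u1", "alice@example.com")) throw new AssertionError("happy path");
        if (unit.register("u2", "not-an-email")) throw new AssertionError("invalid email accepted");
        if (!"alice@example.com".equals(repository.findEmail("u1").orElse(null)))
            throw new AssertionError("user not stored");
        System.out.println("all checks passed");
    }
}
```

Note that only the repository, the unit's external dependency, is replaced by a test double; the sub-services are exercised together through the one entry point.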
However, since the implementation was too large and warranted separation, it is likely that the unit test is quite large as well and would benefit from splitting too. Since we've decoupled the test and implementation, it is possible to decide on the best strategy to split up the test, independently of the structure of the implementation.
The example in Figure 3 shows one strategy for dividing the tests. Here, we chose to capture one use case in each test class. Each class comprises tests for the happy path as well as all edge and problem cases that can occur.
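As a sketch of what this split can look like, here are two use-case-oriented test classes, each bundling the happy path with its edge cases. The use cases ("register user", "deactivate user") and all class names are hypothetical, and plain assertions replace a test framework to keep the example self-contained:

```java
// Minimal domain shared by both use-case test classes (hypothetical).
class UserAccount {
    String email;
    boolean active = true;
    UserAccount(String email) { this.email = email; }
}

class UserManager {
    boolean register(UserAccount account) {
        return account.email != null && account.email.contains("@");
    }

    boolean deactivate(UserAccount account) {
        if (!account.active) return false; // already deactivated
        account.active = false;
        return true;
    }
}

// One test class per use case, independent of how UserManager
// is structured internally.
class RegisterUserTest {
    void happyPath() { check(new UserManager().register(new UserAccount("a@b.c"))); }
    void rejectsMissingAtSign() { check(!new UserManager().register(new UserAccount("abc"))); }
    static void check(boolean ok) { if (!ok) throw new AssertionError(); }
}

class DeactivateUserTest {
    void happyPath() {
        UserAccount account = new UserAccount("a@b.c");
        check(new UserManager().deactivate(account) && !account.active);
    }
    void secondDeactivationFails() {
        UserAccount account = new UserAccount("a@b.c");
        UserManager manager = new UserManager();
        manager.deactivate(account);
        check(!manager.deactivate(account));
    }
    static void check(boolean ok) { if (!ok) throw new AssertionError(); }
}

public class UseCaseTestLayout {
    public static void main(String[] args) {
        RegisterUserTest register = new RegisterUserTest();
        register.happyPath();
        register.rejectsMissingAtSign();
        DeactivateUserTest deactivate = new DeactivateUserTest();
        deactivate.happyPath();
        deactivate.secondDeactivationFails();
        System.out.println("all use-case tests passed");
    }
}
```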
All of this may sound plausible. However, it usually triggers a set of objections, mostly introduced with a thoughtful: “But wait...”.
So, let’s have a look at some of the main issues.
That breaks the rules, right?
Some people assume an unwritten (or written) rule:
"For every class `X`, there should be a test class `XTest`."
No such rule exists. The implicit assumption is that a unit always has to be a single class (or even a method). However, a unit can be anything. In our example, it's a set of services that together provide a certain functionality for managing users. Instead of having separate services, you could also have private methods that are neither accessible nor visible from the outside. It's up to the unit how it implements its functionality.
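To illustrate that the unit's internal structure is invisible from the outside, here is a hypothetical sketch of the same functionality implemented as a single class with private helpers. A behavior-level test only touches the public method, so extracting the helpers into separate service classes later would not affect it:

```java
// The same user-management functionality as one class with private
// helpers instead of separate services (hypothetical sketch).
public class UserFunctions {

    // Public behavior: the only thing a behavior-level test relies on.
    public boolean register(String email) {
        return isValidEmail(email) && store(email);
    }

    // Private helpers: the test neither sees nor cares whether these
    // live here or are extracted into their own service classes.
    private boolean isValidEmail(String email) {
        return email != null && email.contains("@");
    }

    private boolean store(String email) {
        // Stub for persistence; always succeeds in this sketch.
        return true;
    }

    public static void main(String[] args) {
        UserFunctions unit = new UserFunctions();
        if (!unit.register("a@b.c")) throw new AssertionError();
        if (unit.register("no-at-sign")) throw new AssertionError();
        System.out.println("behavior is the same regardless of internal structure");
    }
}
```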
And even if such a rule existed, it would be a misguided one. As we've seen above, it would directly lead to tight coupling between test and implementation, and tight coupling is almost always bad.
Isn't that indirect testing?
Let's look at a typical example of indirect testing: testing through the presentation layer. This is done, for example, if the component under test cannot be accessed directly. While this approach helps provide tests in such scenarios, it has the drawback of being very fragile. Any change in the presentation of results can break the test, even though the system behavior is still correct.
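The fragility of such indirect tests can be sketched in a few lines. The domain (`BalanceService`) and presentation layer (`BalancePage`) below are hypothetical; the point is which layer the assertion targets:

```java
// Hypothetical domain service and a presentation layer on top of it.
class BalanceService {
    long balanceInCents(String account) {
        return 12345; // stub: $123.45
    }
}

class BalancePage {
    private final BalanceService service = new BalanceService();

    String render(String account) {
        long cents = service.balanceInCents(account);
        // Formatting details like the currency symbol or separators
        // can change without any change in behavior.
        return String.format("Balance: $%d.%02d", cents / 100, cents % 100);
    }
}

public class IndirectTestingDemo {
    public static void main(String[] args) {
        // Fragile, indirect: couples the test to presentation details.
        // Switching the output to "USD 123.45" would break this check
        // although the underlying behavior is unchanged.
        if (!new BalancePage().render("acc-1").equals("Balance: $123.45"))
            throw new AssertionError("presentation changed");

        // Robust, direct: asserts on the domain value itself.
        if (new BalanceService().balanceInCents("acc-1") != 12345)
            throw new AssertionError("wrong balance");

        System.out.println("both checks passed");
    }
}
```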
In our case, we test the result of the interaction between multiple services, not individual services through another service. As long as the setup code doesn't become too complex and we're not building around our current structure to access a hidden service (which would also break the decoupling of test and implementation), we are fine. However, we need to be aware of the risk of slipping into indirect testing.
In the end, tests should be useful for the teams. If one of the services grows too complex and separate testing becomes more appropriate, go for it. There is no absolute right or wrong, just a set of trade-offs (as always).
It's much harder now to identify the cause of a failing test, isn't it?
With more code, the error could be anywhere. However, if you work in small increments and run tests after every change, this problem won't occur. The small increments make it easy to pinpoint the cause of a failing test—it can only be in the changes made.
Still, sometimes working in small increments is not possible, is difficult, or simply doesn't happen. In such cases, finding the root cause of a failing test becomes generally more difficult. It would be best to address the root cause (like increasing the speed of tests). However, that might not be feasible. So, feel free to test smaller units. Ultimately, tests should be useful for you.
Why not structure your system according to use cases as well?
That's a fair point. Structuring a system according to use cases is a valid (and likely beneficial) strategy. It might mean that, when refactoring is complete, there is once again a 1:1 mapping of test classes to implementation. We are fine with that. However, structuring a system according to use cases is only one strategy. In the end, we want to decouple test and implementation so that they can be changed and developed independently.
In summary:
Tests should be useful. Let’s design them accordingly.
Decoupling test and implementation is merely one effective strategy to achieve this.
Next, we'll explore how to accelerate slow tests through platform abstraction. Stay tuned.
Thank you for reading ❤️