Write Tests You Love, Not Hate #5 - Understanding the Fragile Test Problem
What if tests break with every minor change, even though the behavior of the system hasn't changed? To understand the Fragile Test Problem, we dive into an example and identify the underlying issue.
Sometimes, you might make minor changes in your implementation only to spend days fixing the tests afterward. This is an extreme—and blunt—example of the Fragile Test Problem.
In this case, tests are highly sensitive to changes that don’t affect the actual behavior of the system (often called refactorings). They generate many false positives, making them untrustworthy and unreliable. This, in turn, significantly slows down development.
In this post, we examine the problem and identify one of its root causes.
The Fragile Test Problem
Figure 1 shows a typical test scenario. On the left-hand side, you see the implementation, and on the right-hand side, the tests. We have a UserService that uses a UserRepository. A test called UserServiceTest checks the behavior of the UserService by mocking the UserRepository. So far, so good. Now, imagine that the system evolves, and after several months, the team decides the UserService has become too large and needs to be split.
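To make the starting point concrete, here is a minimal sketch of the Figure 1 setup in Java. Only the names UserService, UserRepository, and UserServiceTest come from the article; the User type, the methods findById and displayNameFor, and the use of JUnit 5 with Mockito are assumptions for illustration.

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical domain type, introduced only for this sketch.
class User {
    final String name;
    User(String name) { this.name = name; }
}

// The dependency of the service under test.
interface UserRepository {
    User findById(long id);
}

// The service under test; the uppercasing logic is a stand-in for real behavior.
class UserService {
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String displayNameFor(long id) {
        return repository.findById(id).name.toUpperCase();
    }
}

// The test mocks the repository and checks the service's behavior.
class UserServiceTest {
    @Test
    void returnsUppercasedDisplayName() {
        UserRepository repository = mock(UserRepository.class);
        when(repository.findById(42L)).thenReturn(new User("alice"));

        UserService service = new UserService(repository);

        assertEquals("ALICE", service.displayNameFor(42L));
    }
}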
They decide to extract multiple services, such as UserProfileService, UserLoginService, and UserNotificationService, from the UserService, as illustrated in Figure 2. However, this change breaks the tests, as indicated by the red color of the UserServiceTest.
To rectify this, the team mocks the behavior of the new services in the UserServiceTest, which then completes successfully (see Figure 3). Yet, the code moved to the new services is now untested.
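Continuing the sketch from above (the shape of the extracted services is again an assumption), the split might look like this: UserService is reduced to delegation, and the patched UserServiceTest mocks the new services instead of the repository.

// The extracted services; UserProfileService now owns the moved logic.
class UserProfileService {
    private final UserRepository repository;

    UserProfileService(UserRepository repository) {
        this.repository = repository;
    }

    String displayNameFor(long id) {
        return repository.findById(id).name.toUpperCase();
    }
}

class UserLoginService { /* details omitted */ }
class UserNotificationService { /* details omitted */ }

// UserService is reduced to wiring and delegation.
class UserService {
    private final UserProfileService profiles;
    private final UserLoginService logins;
    private final UserNotificationService notifications;

    UserService(UserProfileService profiles, UserLoginService logins,
                UserNotificationService notifications) {
        this.profiles = profiles;
        this.logins = logins;
        this.notifications = notifications;
    }

    String displayNameFor(long id) {
        return profiles.displayNameFor(id);
    }
}

// The patched test mocks the new services instead of the repository.
class UserServiceTest {
    @Test
    void delegatesDisplayNameToProfileService() {
        UserProfileService profiles = mock(UserProfileService.class);
        when(profiles.displayNameFor(42L)).thenReturn("ALICE");

        UserService service = new UserService(profiles,
                mock(UserLoginService.class),
                mock(UserNotificationService.class));

        assertEquals("ALICE", service.displayNameFor(42L));
    }
}

Note that this test now only verifies delegation: the mock answers the canned "ALICE", so it passes even though the uppercasing logic that moved into UserProfileService is no longer exercised anywhere.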
Thus, the team must create a new set of tests for each of the new services (see Figure 4) to test their behavior individually.
To make the new tests work, they need to mock the UserRepository and, of course, implement the proper checks and verifications. Finally, the team finishes the new tests, and everything works nicely, as you can see in Figure 5.
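Still under the same assumptions, a dedicated test for the extracted UserProfileService again mocks the UserRepository and checks the moved logic directly; tests for the other two services would follow the same pattern.

// The new test mirrors the original UserServiceTest almost line for line,
// but targets the extracted service.
class UserProfileServiceTest {
    @Test
    void returnsUppercasedDisplayName() {
        UserRepository repository = mock(UserRepository.class);
        when(repository.findById(42L)).thenReturn(new User("alice"));

        UserProfileService profiles = new UserProfileService(repository);

        assertEquals("ALICE", profiles.displayNameFor(42L));
    }
}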
However, life goes on, and after some time, the team concludes that their chosen setup was suboptimal. They decide to make changes once again and restructure their code.
As a result, three of their tests break again, as shown in Figure 6. Since functionality has moved in the implementation, the tests also need to be moved and adjusted.
After a few days of work, the tests are back up and running.
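The post leaves this second restructuring abstract, but the mechanics are always the same. As one purely hypothetical instance, continuing our sketch: if displayNameFor migrates from UserProfileService into a new UserAccountService, both the delegation test and UserProfileServiceTest name a type and a method that no longer exist, so they break at compile time.

// Hypothetical second restructuring: the moved logic moves once more,
// this time into a new UserAccountService.
class UserAccountService {
    private final UserRepository repository;

    UserAccountService(UserRepository repository) {
        this.repository = repository;
    }

    String displayNameFor(long id) {
        return repository.findById(id).name.toUpperCase();
    }
}

// Every test that named the old home of the method now fails to compile
// and must be rewritten rather than merely re-run, e.g. the stub
//
//     when(profiles.displayNameFor(42L)).thenReturn("ALICE");
//
// no longer matches any method on UserProfileService.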
But is this really the outcome we want? Let's consider what happened here. Besides being painful for the team, this episode highlights several problems with common testing practices.
This is no refactoring!
The first issue is that none of the steps described previously qualify as refactorings. Refactoring is defined as a sequence of small changes that preserve the system's observable behavior; performed properly, it keeps the tests passing at all times.
The key principle bears repeating: changes should keep the tests passing at all times. However, this was not the case in our example, where the tests broke on several occasions and required adjustments.
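For contrast, here is a change to the original sketch that does qualify as a refactoring: extracting a private helper inside UserService. The structure changes, the observable behavior does not, and UserServiceTest keeps passing untouched. (As before, the helper name is an invention for illustration.)

// Same behavior as the original UserService, restructured internally;
// no test needs to change.
class UserService {
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String displayNameFor(long id) {
        return normalize(repository.findById(id).name);
    }

    // Extracted helper: a small, behavior-preserving step.
    private String normalize(String name) {
        return name.toUpperCase();
    }
}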
Why is this problematic?
In our scenario, we were essentially forced to write new tests after the initial changes to the UserService and to make significant adjustments following the second change. Consequently, we cannot be sure that the behavior of the system after the changes is identical to its previous state.
The fundamental purpose of our tests is to enable modifications that change the system's structure while guaranteeing that its behavior stays the same.
When we are forced to essentially rewrite the tests, we forfeit this advantage. The outcome is akin to not having any tests at all, except with additional effort involved. We can no longer guarantee that we haven’t introduced bugs or unwanted behaviors.
Tight Coupling
Let’s consider our example from a different perspective. Instead of focusing on implementation and test code, let’s discuss two separate components, A and B.
Upon examining the connections between these two components as depicted in Figure 8, it becomes evident that they are tightly coupled. Each element in Component B is directly linked to at least one element in Component A. Therefore, any change in Component A necessitates a corresponding change in Component B. As we've learned in our Software Engineering 101 course, tight coupling is always a bad idea.
Looking at the problem from this perspective already points to the solution: to solve the Fragile Test Problem, we need to decouple the tests from the implementation. How to achieve this will be explored in the next post.
Thank you for reading ❤️