Unit tests are undeniably a good thing, but you only realize the full benefits of them when you have enough tests that you can make changes with confidence. If you can make a change, run your tests, and be comfortable enough to ship your changes, then you and your team can get work done much faster. More drastic changes to the shared code become feasible. Life gets better.

It makes sense then that teams want to ensure that code is sufficiently covered with tests. Nobody wants to count tests every time they review a PR, so tools are added that check it automatically. It’s then a small step to set a coverage target, and suddenly you have a machine checking every PR for tests. This all makes sense to me, and it was my first instinct too. I don’t recommend this approach any more.

The problem with test coverage tools is that they can’t (at least, can’t yet) measure the quality or value of a test. They instead measure the quantity of code that the tests exercise. This can encourage a misplaced focus on building lots of low-value tests. For example, consider the following piece of relatively standard web service code:

public GetResult get(GetRequest request) {
    return business.get(request);
}

Here is a standard test for this function:

@Test
public void getCallsBusinessLayerGetAndReturnsResult() {
    var request = new GetRequest();
    var expected = new GetResult();
    when(mockBusiness.get(request)).thenReturn(expected);

    var actual = service.get(request);

    verify(mockBusiness).get(request);
    assertEquals(expected, actual);
}

And now imagine thousands of tests like this, all testing very similar functions.

The unit test checks the expected behavior of the service function, but what are the chances of there being a bug in a function this simple? I think it's far more likely that the test is written incorrectly than the service function.

The worst-case scenario is when all of these service functions and all of their tests are generated by copy-paste and modification. Then it becomes much more likely that the wrong values get pasted into the test and the implementation at the same time, so the test passes anyway. Unfortunately, this kind of boilerplate code is almost always produced by copy and paste, because it's fast and easy.
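To make the failure mode concrete, here is a hedged sketch (the class and method names are hypothetical, not from the original code): a `getOrder` handler pasted from `getUser` keeps the wrong delegate call, and because the test was pasted the same way, it asserts against the wrong result and still passes.

```java
// Hypothetical illustration of the copy-paste failure mode: the wrong
// delegate call survives in BOTH the implementation and the pasted test.
class BusinessLayer {
    String getUser(String id)  { return "user:" + id; }
    String getOrder(String id) { return "order:" + id; }
}

class ServiceLayer {
    private final BusinessLayer business = new BusinessLayer();

    String getUser(String id) { return business.getUser(id); }

    // BUG: pasted from getUser and never updated -- delegates to the
    // wrong business method.
    String getOrder(String id) { return business.getUser(id); }
}

public class CopyPasteDemo {
    public static void main(String[] args) {
        ServiceLayer service = new ServiceLayer();

        // The pasted "test" carries the same mistake: it asserts against
        // getUser's result instead of getOrder's, so it passes.
        String actual = service.getOrder("42");
        System.out.println(actual.equals("user:42")
            ? "pasted test passes, bug ships"
            : "pasted test fails");
    }
}
```

A coverage tool reports this method as fully covered, which is exactly the gap between quantity and quality described above.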

You could make an argument that you should avoid architectures that encourage lots of boring boilerplate code. I agree with that idea, but in my experience most teams are not mature enough to design systems that prevent it. It is easy to follow simple service-business-repository patterns blindly, and to be honest, for most software this is good enough.

Again: I do think unit tests are a good thing, and good test coverage is essential to get good value from them, but unit tests also have a cost. Unit tests often need to be changed when code is being changed. If you have lots of low-value tests testing lots of simple methods, you can quickly get overwhelmed trying to make non-trivial changes. Unit tests are supposed to make it safer to go faster… but poorly written tests can do the exact opposite too.
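One way such tests slow you down is by verifying interactions rather than behavior, as the `verify(mockBusiness).get(request)` line above does. The following sketch (a hand-rolled counting fake instead of Mockito; all names are hypothetical) shows a behavior-preserving refactor, adding a cache, that would break any test asserting "the business layer is called once per request":

```java
import java.util.HashMap;
import java.util.Map;

// A minimal fake that counts how often the business layer is invoked.
class CountingBusiness {
    int calls = 0;
    String get(String id) { calls++; return "result:" + id; }
}

class CachingService {
    private final CountingBusiness business;
    private final Map<String, String> cache = new HashMap<>();
    CachingService(CountingBusiness b) { business = b; }

    // After a refactor that adds caching, observable behavior is the
    // same, but the business layer is no longer hit on every request.
    String get(String id) {
        return cache.computeIfAbsent(id, business::get);
    }
}

public class BrittleTestDemo {
    public static void main(String[] args) {
        CountingBusiness business = new CountingBusiness();
        CachingService service = new CachingService(business);

        service.get("a");
        service.get("a"); // served from the cache

        // A behavior assertion still holds...
        assert service.get("a").equals("result:a");
        // ...but an interaction assertion like "called once per request"
        // now fails, even though nothing a caller can observe changed.
        System.out.println("business calls: " + business.calls);
    }
}
```

Tests that pin down observable behavior survive this refactor; tests that pin down call counts do not.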

Of course there is an exception to every rule. If you are writing the software for my bank, or for medical equipment, or for self-driving cars, please enforce 100% coverage and use several other tools to ensure extremely high quality. For most of us, though, not all of the tests are actually worth the effort of writing them, and I don't want to edit the coverage percentage every time I make a change.