Goodhart’s Law says that “when a measure becomes a target, it ceases to be a good measure.”
Frequently when I talk about this, everyone nods their heads and then immediately makes a target out of the measures they’re looking at. Getting this one right is harder than it sounds.
Let’s look at an example. Let’s say we’ve decided we want better code coverage for our tests. We want the confidence of being able to make a change to the code, knowing that if we make a mistake, the tests will catch it. Having a safety net like that has huge value to the organization.
So we might measure code coverage as a percentage. How much of the production code is executed during a test run?
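Under the hood, a coverage tool just records which lines of the production code execute while the tests run. Here’s a toy sketch of that mechanism using Python’s `sys.settrace` (the function names here are made up for illustration; real tools such as coverage.py do this far more robustly):

```python
import sys

def lines_executed(fn, *args):
    """Toy coverage tracer: record (function name, line number)
    pairs that execute while fn(*args) runs."""
    executed = set()

    def tracer(frame, event, arg):
        if event == "line":
            executed.add((frame.f_code.co_name, frame.f_lineno))
        return tracer

    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return executed

# "Production" code with two branches.
def absolute(n):
    if n < 0:
        return -n
    return n

# A "test run" that only exercises the non-negative branch:
hit = lines_executed(absolute, 5)
# 2 of the 3 body lines ran (the "if" and "return n"); "return -n"
# never executed, so this run gives incomplete coverage of absolute().
```

The percentage we report is just the ratio of lines recorded this way to the total executable lines in the codebase.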
That’s a reasonable measure that we can then use to make some decisions. Not as many good decisions as people normally assume, but still a good measure.
The gotcha is that many people will then immediately turn it into a target. If we’re currently at 60% coverage then someone might decide that we need to increase that to 80%.
As a measure, it was great. As a target, it’s horrible: it now encourages people to add tests that increase the coverage numbers, whether or not those tests serve the original goal of having a better safety net.
You might be thinking “nobody would game those numbers in my organization.” They will, and they do. Code coverage, in particular, is gamed everywhere.
Lou Gerstner, former CEO of IBM, famously said: “People don’t do what you expect, they do what you inspect.”
What might it look like if instead of making code coverage a target, we’d left it as a measure?
We might have hypothesized that adopting Test-Driven Development would improve our safety net. So we could try that for a period of time and then look at code coverage as an indicator of whether we were improving or not. Not a target, but an indication of the direction we were moving in.
