In a world where it is acceptable for people to say, “I’m not good at math,” it’s tempting to lean on analytic models as the arbiters of truth.
But like anything else, analytic models can be done poorly. And sometimes, you must look outside your organization to spot the damages.
The Nature of Analytic Insights
Traditional OLAP focuses on the objective aspects of business information. “The person who placed this $40 order is 39 years old and lives in Helena, Montana.” No argument there.
But analytics go beyond simple descriptive assertions. Analytic insights are derived from mathematical models that make inferences or predictions, often using statistics and data mining.1
This brings in the messy world of probability. The result is a different kind of insight: “This person is likely to default on their payment.” How likely? What degree of certainty is needed to turn them away?
When you make a decision based on analytics, you are playing the odds at best. But what if the underlying model is flawed?
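To make "playing the odds" concrete, here is a minimal sketch of how a predicted probability might be turned into an approve-or-decline decision. The function names and the dollar figures are illustrative assumptions, not anything taken from the book, and a real credit decision would weigh far more than two numbers.

```python
def break_even_probability(loss_if_default: float, profit_if_repaid: float) -> float:
    """Probability of default at which approving has zero expected value.

    Both arguments are hypothetical business inputs chosen for illustration.
    """
    return profit_if_repaid / (profit_if_repaid + loss_if_default)


def should_decline(p_default: float, loss_if_default: float, profit_if_repaid: float) -> bool:
    """Decline when the expected loss from approving exceeds the expected profit."""
    expected_loss = p_default * loss_if_default
    expected_profit = (1.0 - p_default) * profit_if_repaid
    return expected_loss > expected_profit


# Illustrative numbers only: a $40 order with an assumed $10 margin.
threshold = break_even_probability(loss_if_default=40.0, profit_if_repaid=10.0)
decline_risky = should_decline(0.30, loss_if_default=40.0, profit_if_repaid=10.0)
decline_safe = should_decline(0.10, loss_if_default=40.0, profit_if_repaid=10.0)
```

Note that the threshold is only as trustworthy as the probability fed into it. If the model behind `p_default` is flawed, the arithmetic still runs and still produces a confident-looking answer.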
Several things can go wrong with the model itself:
- It is a poor fit to the business situation
- It is based on inappropriate proxy metrics
- It uses training data that reinforces past errors or injustices
- It is so complex that it is not understood by those who use it to make decisions
The first step in learning to avoid these problems is knowing what to look for.
Shining a Light on Hidden Damages
In Weapons of Math Destruction, Cathy O’Neil teaches you to identify a class of models that do serious harm. This harm might otherwise go unnoticed, since the negative impacts are often felt outside the organization.2 She calls these models “weapons of math destruction.”
O’Neil defines a WMD as a model with three characteristics:
- Opacity – the workings of the model are not accessible to those it impacts
- Scale – the model has the potential to impact large numbers of people
- Damage – the model is used to make decisions that may negatively impact individuals
Misaligned Incentives
O’Neil’s book is not a blanket indictment of analytics. She points out that analytic models can have wide-ranging benefits. This occurs when everyone’s best interests line up.
For example, as Amazon’s recommendation engine improves, both Amazon and its customers benefit. In this case, the internal incentive to improve lines up with the external benefits.
WMDs occur when these interests conflict. O’Neil finds this to be the case for models that screen job applications. If these models reduce the number of résumés that HR staff must consider, they are deemed “good enough” to use. They may also exclude valid candidates from consideration, but there is no internal incentive to improve them. The fact that they harm outside parties may even go unnoticed.
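The gap between the internal measure and the external harm can be sketched in a few lines. This is a hypothetical illustration: the `Applicant` record, the `qualified` flag, and the sample data are all assumptions, and in practice the ground truth about rejected candidates is rarely available, which is part of the problem.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    qualified: bool       # hypothetical ground truth; rarely known in practice
    passed_screen: bool   # what the screening model decided


def volume_reduction(applicants: list[Applicant]) -> float:
    """Internal measure: share of applications HR no longer has to read."""
    rejected = sum(1 for a in applicants if not a.passed_screen)
    return rejected / len(applicants)


def false_rejection_rate(applicants: list[Applicant]) -> float:
    """External harm: share of qualified applicants the model screened out."""
    qualified = [a for a in applicants if a.qualified]
    if not qualified:
        return 0.0
    return sum(1 for a in qualified if not a.passed_screen) / len(qualified)


# Made-up sample: 4 qualified applicants (2 pass), 6 unqualified (1 passes).
sample = (
    [Applicant(True, True)] * 2
    + [Applicant(True, False)] * 2
    + [Applicant(False, True)] * 1
    + [Applicant(False, False)] * 5
)
internal_metric = volume_reduction(sample)
external_harm = false_rejection_rate(sample)
```

In this made-up sample the model cuts the reading pile by 70 percent, which looks like success from the inside, while screening out half of the qualified applicants. Only the first number is one the organization has a reason to track.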
Untangling Impact from Intent
WMDs can seem insidious, but they are often born of good intentions. O’Neil shows that it is important to distinguish between the business objective and the model itself. It’s possible to have the best of intentions and still produce a model that generates untold damage.
The hand-screening of job applications, for example, has been shown to be inherently biased. Who would argue against “solving” this problem by replacing the manual screening with an objective model?
This may be a noble intention, but O’Neil shows that it fails miserably when the model internalizes the very same biases. Couple that with misaligned incentives for improvement, and the WMD fuels a vicious cycle that can have precisely the opposite of the intended effect.
Learning to Spot Analytic Pitfalls
The first step to avoiding analytics gone awry is to learn what to look for.
“Data scientists all too often lose sight of the folks at the receiving end of the transaction,” O’Neil writes in the introduction. This book is the vaccine that helps prevent that mistake.
If you work in the field of analytics, Weapons of Math Destruction is an essential read.
Notes:
1. OLAP and Analytics are two of the key service areas of a modern BI program. To learn more about what distinguishes them, see The Three Pillars of Modern BI (Feb 9, 2005).
2. But not always. For example, some of the models explored in the book have negative impacts on employees.