Wednesday, January 25, 2017

Avoid the Unintended Consequences of Analytic Models

Cathy O’Neil’s Weapons of Math Destruction is a must-read for analytics professionals and data scientists.

In a world where it is acceptable for people to say, “I’m not good at math,” it’s tempting to lean on analytic models as the arbiters of truth.

But like anything else, analytic models can be done poorly. And sometimes, you must look outside your organization to spot the damage.

The Nature of Analytic Insights

Traditional OLAP focuses on the objective aspects of business information. “The person who placed this $40 order is 39 years old and lives in Helena, Montana.” No argument there.

But analytics go beyond simple descriptive assertions. Analytic insights are derived from mathematical models that make inferences or predictions, often using statistics and data mining.1

This brings in the messy world of probability. The result is a different kind of insight: “This person is likely to default on their payment.” How likely? What degree of certainty is needed to turn them away?
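
To make those questions concrete, here is a minimal Python sketch of how a predicted probability becomes a yes-or-no decision. The function name, the applicant score, and the 0.30 cutoff are all illustrative assumptions, not values drawn from the book or any real credit model.

    # Minimal sketch: turning a model's predicted probability into a decision.
    # The scores and the 0.30 cutoff are hypothetical, chosen for illustration.

    def decide_credit(p_default: float, threshold: float = 0.30) -> str:
        """Approve or decline based on a predicted probability of default.

        The threshold encodes a business policy: how much predicted risk is
        enough to turn an applicant away? The math alone does not answer that.
        """
        return "decline" if p_default >= threshold else "approve"

    # A model scores an applicant at a 35% likelihood of default.
    print(decide_credit(0.35))        # -> decline
    print(decide_credit(0.35, 0.50))  # -> approve, under a more lenient policy

The model supplies the score; the business supplies the threshold, and that choice is where the judgment calls begin.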

When you make a decision based on analytics, you are playing the odds at best. But what if the underlying model is flawed?

Several things can go wrong with the model itself:
  • It is a poor fit to the business situation
  • It is based on inappropriate proxy metrics (see the sketch following this list)
  • It uses training data that reinforces past errors or injustices
  • It is so complex that it is not understood by those who use it to make decisions
And here is the worst news: even if you manage to avoid these pitfalls, a model can seem to be “working” for one area of your business while causing damage elsewhere.
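
As a rough illustration of the proxy-metric pitfall from the list above, the sketch below scores applicants using a convenient stand-in (ZIP code) rather than the thing the business actually cares about. The ZIP codes, rates, and fallback value are fabricated for illustration only.

    # Minimal sketch of the proxy-metric pitfall: the model never measures
    # repayment ability directly, only a convenient stand-in (ZIP code).
    # All figures below are fabricated for illustration.

    historical_defaults = {   # observed default rate by ZIP code (the proxy)
        "59601": 0.22,
        "59602": 0.05,
    }

    def score_applicant(zip_code: str) -> float:
        """Predicted default risk based only on where the applicant lives."""
        return historical_defaults.get(zip_code, 0.10)

    # Two applicants with identical income and payment history get very
    # different scores, because the proxy answers a different question.
    print(score_applicant("59601"))   # 0.22 -> likely declined
    print(score_applicant("59602"))   # 0.05 -> likely approved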

The first step in learning to avoid these problems is knowing what to look for.

Shining a Light on Hidden Damages

In Weapons of Math Destruction, Cathy O’Neil teaches you to identify a class of models that do serious harm. This harm might otherwise go unnoticed, since the negative impacts are often felt outside the organization.2 She calls these models “weapons of math destruction.”

O’Neil defines a WMD as a model with three characteristics:
  • Opacity – the workings of the model are not accessible to those it impacts
  • Scale – the model has the potential to impact large numbers of people
  • Damage – the model is used to make decisions that may negatively impact individuals
The book explores models that have all three of these characteristics. It exposes their hidden effects on familiar areas of everyday life – choosing a college, getting a job, or securing a loan. It also explores their effects on parts of our culture that might not be familiar to the reader, such as sentencing in the criminal justice system.

Misaligned Incentives

O’Neil’s book is not a blanket indictment of analytics. She points out that analytic models can have wide-ranging benefits. This occurs when everyone’s best interests line up.

For example, as Amazon’s recommendation engine improves, both Amazon and their customers benefit. In this case, the internal incentive to improve lines up with the external benefits.

WMDs occur when these interests conflict. O’Neil finds this to be the case for models that screen job applications. If these models reduce the number of résumés that HR staff must consider, they are deemed “good enough” to use. They may also exclude valid candidates from consideration, but there is no internal incentive to improve them. The fact that they harm outside parties may even go unnoticed.

Untangling Impact from Intent

WMDs can seem insidious, but they are often born of good intentions. O’Neil shows that it is important to distinguish between the business objective and the model itself. It’s possible to have the best of intentions, but produce a model that generates untold damage.

The hand-screening of job applications, for example, has been shown to be inherently biased. Who would argue against “solving” this problem by replacing the manual screening with an objective model?

This may be a noble intention, but O’Neil shows that it fails miserably when the model internalizes the very same biases. Couple that with misaligned incentives for improvement, and the WMD fuels a vicious cycle that can have precisely the opposite of the intended effect.
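
To see how that cycle gets started, here is a toy Python sketch using fabricated hiring records and a deliberately naive stand-in for a real classifier. Nothing here comes from the book; it only illustrates how training on past decisions can reproduce the pattern those decisions contain.

    # Toy sketch: a "model" built from biased historical decisions.
    # The records are fabricated; the screening rule is intentionally naive.

    training_data = [
        # (years_experience, school, was_hired)
        (5, "State U", True),
        (2, "State U", True),
        (5, "City College", False),   # rejected despite equal experience
        (8, "City College", False),
    ]

    def screen(candidate_school: str) -> bool:
        """Advance a candidate only if their school was ever hired before."""
        return any(hired for _, school, hired in training_data
                   if school == candidate_school)

    print(screen("State U"))       # True  -- the old pattern is repeated
    print(screen("City College"))  # False -- past exclusion becomes policy

Because the model’s rejections become the next round of “historical” data, the bias it learned is also the bias it reinforces.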

Learning to Spot Analytic Pitfalls

The first step to avoiding analytics gone awry is to learn what to look for.

“Data scientists all too often lose sight of the folks at the receiving end of the transaction,” O’Neil writes in the introduction. This book is the vaccine that helps prevent that mistake.

If you work in the field of analytics, Weapons of Math Destruction is an essential read.


Notes:

1. OLAP and Analytics are two of the key service areas of a modern BI program. To learn more about what distinguishes them, see The Three Pillars of Modern BI (Feb 9, 2005).

2. But not always. For example, some of the models explored in the book have negative impacts on employees.