Friday, December 2, 2016

Is Your Team Losing the Spirit of the Agile Manifesto?

As the adoption of agile BI techniques spreads, it is easy to become wrapped up in methodology and lose sight of the agile spirit.

This year is the 15th anniversary of the Agile Manifesto. Mark the occasion by refocusing on the agile principles.

The Age of “NO”

Fifteen years ago, most business software was developed using rich but complex methodologies. Businesses had spent years refining various waterfall-based methods, which were heavily influenced by two decades of information engineering.

The result seemed to make sense from IT’s point of view, ensuring a predictable process, repeatable tasks, and standard deliverables.

To the people who needed the systems, however, the process was inscrutable.1 It seemed like a bewildering bureaucracy that was managed in a foreign language. They were always being told “no.”

Can we add a field to this screen?
No, you would have had to ask that before the requirements freeze.
Can we change the priority of this report?
No, that decision must be made by the steering committee.
Can we add a value to this domain?
No, that is the province of the modeling group.

The groups were looking at software development from completely different perspectives. They were not really collaborating.

Business people were asking for functionality. IT was beating back the requests by appealing to methodology.

Enter Agile

In 2001, a group of developers met in Colorado to talk about what was working and not working in the world of software development. They produced the Agile Manifesto:

Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.

Kent Beck, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland, Dave Thomas

© 2001, the above authors
This declaration may be freely copied in any form, but only in its entirety through this notice.

The thrust of the manifesto is collaboration between business and technical personnel, with an emphasis on visible business results. That is, while methods are important, results are more important.2

The Agile Manifesto helped refocus software development on the product: functional business applications.

Agile Today: The Danger of Cooptation

Fifteen years later, it is safe to say that “Agile” has permeated the mainstream. This is largely a good thing. But whenever something moves from being a new alternative to wide acceptance, there is a potential dark side. As new ideas spread, they can be misunderstood or corrupted; adoption becomes cooptation.

I frequently see signs of backsliding into the Age of “No.” Even among developers who follow agile-based approaches, there is a tendency to lose sight of the agile principles. Here are two examples from my recent experience:

A team had decided to implement an unusual data model. It could easily return incorrect results if not queried in specific ways. The recommendation was to write some extra documentation explaining the pitfalls, and a short tutorial for developers. The response: “We cannot do that. We are an Agile shop. We cannot produce documentation that is not auto-generated by our tools.”

On another occasion, a team was developing the scope for several work streams. One set of needs clearly required a three-week collaborative activity. This was expressly forbidden. “Our Agile approach requires everything be broken down into two-week sprints.”

In both cases, the response was couched in methodological concerns, with no focus on the business benefit or value.3 

This is precisely the kind of methodological tunnel-vision against which the Agile Manifesto was a reaction.

Keeping the Faith

It is hard to disagree with the agile principles, regardless of your organization’s commitment (or lack thereof) to an agile process.

You can take one simple step to ensure that you are not losing touch with agile principles:

     Whenever you are tempted to say “no,” pause and reflect.

Why are you denying the request? Is it simply based on process? Think about what the person actually wants. Is there value? Is there a way to address the business concern?

Sometimes “no” is the right answer. But always be sure your evaluation places business capabilities and benefits ahead of process and procedure.


Learn More:

Read more posts about applying agile principles to BI and analytics:

Notes

1. These people were often referred to as “users.” Over time, this became a derogatory term.
2. Agile is often misinterpreted as emphasizing speed.
3. Luckily, both of these teams saw fit to make exceptions to their processes, prioritizing business value over method.

Thursday, November 17, 2016

Probability and Analytics: Reactions to 2016 Election Forecasts

Reactions to the 2016 election forecasts suggest we don’t do a good job communicating probability and risk.

In a September 2016 post, I suggested readers check out the discussions of analytic models at FiveThirtyEight. One of the links led to their forecast model for the 2016 presidential election.1

In the past week, I have received quite a bit of email suggesting I should take down the post, given that the model “failed.” For example, one emailer wrote:

How can you continue to promote Nate Silver? The election result proved the analytics wrong.

These reactions expose a real issue with analytics: most people do not understand how to interpret probability.

An analytic failure?

On November 7th, the final prediction of the FiveThirtyEight “Polls Only Model” gave Hillary Clinton a 71% chance of winning. As things turned out, she lost.

Those emailing me were not alone in believing the model failed. The day after the election, there were many stories suggesting FiveThirtyEight and the other aggregators were wrong.2

But were they?


Nate Silver discusses the FiveThirtyEight Model
(If the video above does not play, you can access it here.)

Understanding probability

The FiveThirtyEight model gave Clinton a 71% chance of winning the election. That’s about a 7 in 10 chance. To understand how to interpret this probability, try the following thought experiment:

Suppose you are at Dulles airport, and are about to board a plane. While you are waiting, you are notified that there is a 7 in 10 chance your flight will land safely.  Would you get on the plane? 

I know I wouldn’t.

When the probability of something happening is 70%, the probability of it not happening is 30%. In the case of the airline flight, that’s not an acceptable risk!

Now suppose the flight lands safely.  Was the prediction right?

Maybe, but maybe not.  The plane landed safely, but were the odds with the passengers?  Was there actually a greater danger that was narrowly avoided? Was there no danger at all?

When a single event is assigned a probability, it’s hard to assess whether the assigned probability was “correct.”

Suppose every flight departing Dulles was given a 7 in 10 chance of landing safely, rather than just one. The next day, we check the results and find that all flights landed safely. Was the prediction correct?

In this case, we are able to say that the model was clearly wrong. About 1,800 flights depart Dulles airport each day. The model predicted that thirty percent, or about 540 flights, would not land safely. It clearly missed the mark, and by a wide margin.

Probabilistic predictions are easier to evaluate when they apply to a large number of events.
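
To make this concrete, here is a minimal simulation sketch in Python. The flight counts and outcomes are hypothetical, invented purely for illustration; it contrasts a world where the 7-in-10 forecast is well calibrated with the world we actually observe, where every flight lands safely.

    import random

    # Hypothetical illustration: apply a "7 in 10 chance of landing safely"
    # forecast to every departing flight, then compare two worlds.
    random.seed(42)

    NUM_FLIGHTS = 1800       # roughly the daily departures cited above
    FORECAST_P_SAFE = 0.70   # probability the forecast assigns to each flight

    # World 1: reality matches the forecast; each flight independently
    # has a 70% chance of landing safely.
    calibrated_safe = sum(random.random() < FORECAST_P_SAFE
                          for _ in range(NUM_FLIGHTS))

    # World 2: what we actually observe at a real airport.
    observed_safe = NUM_FLIGHTS

    print(f"Forecast expects about {FORECAST_P_SAFE * NUM_FLIGHTS:.0f} safe landings")
    print(f"Calibrated world: {calibrated_safe} safe landings")
    print(f"Observed world:   {observed_safe} safe landings")

With a single flight, either outcome is consistent with the forecast. Across 1,800 flights, a forecast expecting roughly 1,260 safe landings is plainly refuted when all 1,800 land safely.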

Explaining probability

In the days and weeks leading up to the election, the FiveThirtyEight staff spent a good deal of time trying to put the uncertainty of their forecast in context. As the election drew closer, these explanations became daily warnings:
  • November 6:  A post outlined just how close the race was, and how a standard polling miss of 3% could swing the election.
  • November 7:  An update called a Clinton win “probable but far from certain.”
  • November 8: The final model discussion outlined all the reasons a Clinton win was not a certainty, and explored scenarios that would lead to a loss.
Despite all this, many people were unable to interpret the probabilistic model, and the associated uncertainty.

Avoiding unrealistic expectations

If a research scientist at Yale and the MIT Technology Review misunderstood a probabilistic forecast, how well are people in your business doing?
  • Are people in your business making decisions based on probabilistic models? 
  • Are they factoring an appropriate risk level into their actions?
  • Are you doing enough to help them understand the strength of model predictions?
It’s important that decision makers comprehend the predictive strength of the models they use. And it’s everyone’s responsibility to make sure they understand.

We have a long, long way to go.

Notes:

1. See the post Read (or Listen to) Discussions of Analytic Models.  The model discussion I linked to is: A User’s Guide To FiveThirtyEight’s 2016 General Election Forecast

2. “Aggregators” is a term used by the mainstream press to describe data scientists who build models based on polling data. Here are a few stories that suggested these models were wrong: The Wrap, Vanity Fair, The New Yorker, Quanta Magazine, MIT Technology Review.


Wednesday, September 28, 2016

Read (or Listen to) Discussions of Analytic Models

Organizations often feel their analytics are proprietary, and therefore decline to discuss how their models work. One shining exception is Nate Silver’s FiveThirtyEight.com. The site makes a point of exposing how their models are built. They also discuss their models as part of their elections podcast.

As students in my courses know, FiveThirtyEight.com is a data-driven journalism blog founded by Nate Silver. FiveThirtyEight covers sports, politics, science, and popular culture.

If you are interested in visualization, analytics, or telling stories with data, you will enjoy the site.

Stories on FiveThirtyEight are always shaped by data. And if they develop a model of any kind, that model is openly explained. You may have to comb through footnotes, but it’s always there.

One of the most detailed discussions on the site right now describes their 2016 election forecast model. (With apologies to readers outside the US, this is a very US-centric topic.)

Podcasts

FiveThirtyEight also offers several podcasts, where you can listen to analyst discussions which are driven by data.

Until recently, these conversations rarely delved into the technical realm. On the elections podcast, if Nate Silver or Harry Enten mentioned “long tails,” “blended averages,” or “p-values,” the other hosts jokingly steered the conversation back to analysis.

That practice ended a few weeks ago with the establishment of “Model Talk” episodes. Every second Friday, the model itself is discussed in greater detail. For example, in the 8/26 episode, Silver describes the predictive value of state polls over national polls, and why it is important to build a model where state-by-state probabilities interact.
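
To see why that interaction matters, here is a toy simulation in Python. It is emphatically not FiveThirtyEight’s model; the ten states, the two-point lead, and the error sizes are all invented. It only shows that adding a shared national polling error makes an upset far more likely than treating each state as an independent coin flip.

    import random

    TRIALS = 20_000
    NUM_STATES = 10   # hypothetical states, each leaning toward candidate A

    def upset_rate(shared_error_sd):
        """Fraction of trials where candidate A wins fewer than half the states."""
        upsets = 0
        for _ in range(TRIALS):
            national_error = random.gauss(0, shared_error_sd)  # shared by all states
            wins = 0
            for _ in range(NUM_STATES):
                # A leads by ~2 points; each state also has its own polling error.
                margin = 2.0 + random.gauss(0, 4.0) + national_error
                if margin > 0:
                    wins += 1
            if wins < NUM_STATES / 2:
                upsets += 1
        return upsets / TRIALS

    random.seed(0)
    print(f"Independent states: upset rate = {upset_rate(0.0):.3f}")
    print(f"Correlated states:  upset rate = {upset_rate(4.0):.3f}")

When state errors are independent, they tend to cancel out. When a shared error can move every state in the same direction, the trailing candidate’s chances rise substantially, which is why models that allow state outcomes to move together assign more probability to upsets.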

Here are links to the “Model Talk” discussions to date:

Recommended Reading

I also highly recommend Silver’s book, The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t. If you are interested in analytics, it is a fascinating read.

Sunday, April 24, 2016

Chris Adamson on Modeling Challenges

In a recent interview, the folks at WhereScape asked me some questions about data modeling challenges.



In Business Intelligence, modeling is a social activity. You cannot design a good model alone. You have to go out and talk to people.

As a modeler, your job is to facilitate consensus among all interested parties. Your models need to reflect business needs first and foremost. They must also balance a variety of other concerns — including program objectives, the behavior of your reporting and visualization tools, your data integration tools, and your DBMS.

It’s also important to understand what information resources are available. You need to verify that it is possible to fill the model with actual enterprise data. This means you need to profile and understand potential data sources. If you don’t consider sources of data, your designs are nothing more than wishful thinking.

When considering a non-relational data source, resist the urge to impose structure before you explore it. You’ve got to understand the data before you spend time building a model around it.

Check out the video above, where I discuss these and other topics. For a full-sized version, visit the WhereScape page.