Sunday, April 24, 2016

Chris Adamson on Modeling Challenges

In a recent interview, the folks at WhereScape asked me some questions about data modeling challenges.

In Business Intelligence, modeling is a social activity. You cannot design a good model alone. You have to go out and talk to people.

As a modeler, your job is to facilitate consensus among all interested parties. Your models need to reflect business needs first and foremost. They must also balance a variety of other concerns — including program objectives, the behavior of your reporting and visualization tools, your data integration tools, and your DBMS.

It’s also important to understand what information resources are available. You need to verify that it is possible to fill the model with actual enterprise data. This means you need to profile and understand potential data sources. If you don’t consider sources of data, your designs are nothing more than wishful thinking.
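As a rough illustration of what profiling a potential source can look like, here is a minimal Python sketch. The sample rows, column names, and the `profile_column` helper are all hypothetical, and a real profiling effort would use whatever tooling your environment already provides. The sketch only shows the kind of questions to ask of a source: how many values are missing, and how many distinct values exist.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column of a list-of-dicts sample.

    Hypothetical helper for illustration: reports row count, null/empty
    count, distinct non-null values, and the three most frequent values.
    """
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical sample pulled from a candidate source system.
sample = [
    {"customer_id": "C1", "region": "East"},
    {"customer_id": "C2", "region": ""},
    {"customer_id": "C3", "region": "East"},
    {"customer_id": "C3", "region": "West"},
]

region_profile = profile_column(sample, "region")
id_profile = profile_column(sample, "customer_id")
print(region_profile)
print(id_profile)
```

In this made-up sample, the region column has a missing value and the customer identifier repeats, so neither could be loaded into a model without further questions for the business.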

When considering non-relational data sources, resist the urge to impose structure before you explore them. You’ve got to understand the data before you spend time building a model around it.

Check out the video above, where I discuss these and other topics. For a full-sized version, visit the WhereScape page.

Wednesday, December 23, 2015

What Hollywood Can Teach Analytics Professionals: How to Tell Stories

You might not realize it, but you probably have something in common with the creators of the TV show South Park. 

Analytics yield insights that can have powerful business impact. These insights come from statistics and data mining—processes that are inaccessible to most people. If you want your business to learn and remember, you have to tell a story.

All too often, the communication of an analytic finding reads like a police report: procedural, laden with jargon, and stripped of meaningful business context.

That’s not interesting. People won’t learn from it, and they certainly won’t change their behavior.

How then to get your point across? You need to learn how to tell stories. Data stories.

Trey Parker and Matt Stone know a thing or two about telling a story. They are the creators of South Park, a wildly successful television show which has been on the air for 19 years. Like you, their success depends on telling interesting stories.

In the video clip above, Parker and Stone are speaking to a group of students at NYU on storytelling strategies. Trey tells the students:

We can take these beats, which are basically the beats of your outline, and if the words “and then” belong between those beats, you’re f***ed. Basically. You’ve got something pretty boring.

What should happen between every beat that you’ve written down is either the word “therefore” or “but.”

Data storytellers make this mistake all the time. “We did this…then we tried that…the algorithm showed this…the correlation coefficient is that…our conclusion is...”

This kind of forensic storytelling is boring. It won’t be remembered, and the value of the insight will be lost. Save the procedural detail for an appendix somewhere. People learn from good stories, not lab reports.

As Matt says later in the clip, you need causality to have an interesting story:

But. Because. Therefore.  That gives you the causation between each beat.  And that…that’s a story.

Be sure to watch the entire clip and, if you are so inclined, take some time off for an episode or two of South Park. It just might make you a better data scientist!

The embedded video is from the NY Times ArtsBeat blog post, Hello! Matt Stone and Trey Parker Crash a Class at NYU (September 8, 2011).  Hat tip to Tony Zhou and his Video Essay on F for Fake at the marvelous blog Every Frame a Painting.

Wednesday, July 8, 2015

Create Social Documentation

Documentation is sometimes viewed as a necessary evil. But it doesn't have to be. Here's how to produce documentation that will be used.

Useful documentation gets used -- during all development phases, and by all interested parties.

Burdensome methodologies often expend precious hours producing documentation that is hard to use. Many projects leave behind fat binders of text that hardly anyone will ever open. These examples have given documentation a bad name.

The good news is that documentation can be done right. It does not have to be a drag on project time, it does not have to be a chore to read and review, and it does not have to be something we interact with alone.

Why we need documentation

Documentation is not an after-the-fact explanation of what has been built. Used properly, it is a central component of the entire lifecycle of a BI solution.

Important uses include:

  1. Prior to development: Identify and validate requirements and designs
  2. During development: Specify what to build
  3. After development: Educate business people and support personnel

Of course, there are many other areas in which documentation has value (program planning, governance, change management, etc.). The three uses above are sufficient to illustrate the value of social documentation.

Social Documentation

Useful documentation should be easy to read and discuss. It should also not be burdensome to produce. Three principles shape social documentation.

Social documentation is the focus of collaboration. 

Whenever possible, I recommend to my clients that we use PowerPoint for documentation. Why? Word processors are tailor made for reading, which is a solitary activity. Presentation software is tailor made for collaboration.

Social documentation is easy to navigate. 

Support "random access" rather than "sequential access." Presentation software is great for this; we can easily sort and navigate slides by their titles. This can also be achieved using document maps or outlines.

Social documentation is not prose. 

Each slide in a presentation, or section in a document, should be set up to capture essential information in a consistent format.  This format may be tabular, diagrammatic, or both. Your subject matter will dictate the appropriate format.

But here is the important part:
  • No paragraphs
  • No prose
  • If using PowerPoint: No bullet lists. (They're just a back door to writing paragraphs.)

Uses for social documentation

I find the presentation format excellent for defining program priorities, defining project scope, capturing business requirements, developing top level information architectures, and a variety of other tasks. For specifications, a word processed document with multi-level headers and a document map typically fits the bill.

When documenting business metrics for a dashboard or scorecard, for example, set up a PowerPoint presentation with one slide per metric. Use a standard tabular format to document each metric. This documentation is easy to produce, review and revise, as I will discuss in a moment.
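To make the idea of a standard tabular format concrete, here is a minimal Python sketch of one metric record rendered as a two-column table. The field names (definition, formula, source, owner) and the sample metric are assumptions chosen for illustration; the fields you actually capture will depend on your program. The point is only that every metric gets the same rows, so gaps are easy to spot.

```python
from dataclasses import dataclass, fields


@dataclass
class MetricSlide:
    """One slide's worth of metric documentation (hypothetical field set)."""
    name: str
    definition: str
    formula: str
    source: str
    owner: str


def render(slide):
    """Render the record as a plain two-column table, one row per field."""
    rows = [(f.name.capitalize(), getattr(slide, f.name)) for f in fields(slide)]
    width = max(len(label) for label, _ in rows)
    return "\n".join(f"{label.ljust(width)} | {value}" for label, value in rows)


def missing_fields(slide):
    """List the fields still blank, i.e. questions to raise at the next review."""
    return [f.name for f in fields(slide) if not getattr(slide, f.name).strip()]


# Hypothetical metric; the source system has not been identified yet.
margin = MetricSlide(
    name="Gross Margin",
    definition="Revenue less cost of goods sold",
    formula="revenue - cogs",
    source="",
    owner="Finance",
)

print(render(margin))
print("Still to fill in:", missing_fields(margin))
```

Because each record has the same shape, a blank row stands out in review just as an empty cell would on a slide.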

Where presentation software is not practical, word processors can be used in the same way. Divide the document into sections, activate the contents sidebar, and use a consistent tabular format.

Of course, not all documentation is captured in this manner. For example, we might use social documentation to capture a top level star schema design, then use a modeling tool to produce a detailed design.

Advantages of Social Documentation

This simple approach has numerous advantages.

Frictionless and Comprehensive

During requirements specification, social documentation allows you to capture the necessary information in a frictionless and comprehensive manner. A standard tabular format, for example, ensures the same items are filled in for every entry. The presentation itself is easy to navigate via sections and slide titles.

Engages with the business

Social documentation invites collaboration. Give people a big fat binder and their eyes will cross. Show them 5 or 6 slides that capture the business metrics they care about, and they will give you feedback.

I always have my laptop with me, so if I happen to be in a room with a SME, I can pull it out, flip to the correct slide, and ask a question.

Incidentally, collaboration with the business is one of the cornerstones of the Agile Manifesto.

Reviewed together, rather than in isolation

Ever sent out a fat document for review? If you have, you know the results are not good. Most people will not review it by the deadline. When reminded, they will say, "it looks good." A precious few will provide detailed feedback.

Social documentation transforms this process. A review is conducted by bringing people into a room and reviewing the deck. Any agreed upon changes are made directly to the presentation slides.

The documentation is now ready for the next tasks: guiding development and then serving as the basis for education.

Learn More

For more details on what to document, check out my book Star Schema: The Complete Reference. Detailed descriptions and examples can be found in Chapter 18, “How To Design And Document A Dimensional Model.”

I also discuss documentation of information requirements and business metrics in the course “Business Information and Modern BI.”  Check the sidebar for current offerings.