Tuesday, June 5, 2012

The Conformance Matrix

Conformed dimensions are the linchpins of dimensional models. This post summarizes their use, and describes how to document them in matrix form.

Conformed dimensions: a refresher

Metrics describing different processes can be compared if they are stored in stars which share common dimensions. As I have discussed previously, these compound metrics often turn out to be among the most valuable from a business perspective.
Image by Patrick Hosley
Licensed under CC 2.0

The common dimensions do not have to be physically shared tables.  Each star may reside in a separate database.

As long as the common dimensions, such as "customer" or "product," have the same structure and content, they are said to conform.
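
As a rough illustration of "same structure and content," here is a minimal Python sketch. It takes the strictest reading of the sentence above: two copies of a dimension conform when they have identical columns and identical rows. The table contents, column names, and the conforms function are hypothetical, invented for this example.

```python
# Hypothetical copies of a "product" dimension held in two separate databases.
product_dim_sales = [
    {"sku": "W-1", "product_name": "widget", "category": "hardware"},
    {"sku": "G-1", "product_name": "gadget", "category": "hardware"},
]
product_dim_inventory = [
    {"sku": "G-1", "product_name": "gadget", "category": "hardware"},
    {"sku": "W-1", "product_name": "widget", "category": "hardware"},
]
# A copy whose content has drifted: same columns, different category value.
product_dim_drifted = [
    {"sku": "W-1", "product_name": "widget", "category": "tools"},
    {"sku": "G-1", "product_name": "gadget", "category": "hardware"},
]

def columns(rows):
    """Structure: the set of column names used by the dimension rows."""
    return {name for row in rows for name in row}

def contents(rows):
    """Content: the set of rows, ignoring physical row order."""
    return {tuple(sorted(row.items())) for row in rows}

def conforms(dim_a, dim_b):
    """Strict check (an assumption): identical structure and identical content."""
    return columns(dim_a) == columns(dim_b) and contents(dim_a) == contents(dim_b)

same = conforms(product_dim_sales, product_dim_inventory)
drifted = conforms(product_dim_sales, product_dim_drifted)
```

Here same is True and drifted is False. This strict test does not cover looser forms of conformance, such as one dimension being a summarized version of another.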

Using conformed dimensions, we are able to compare measurements stored in different star schemas through a process called drilling across.
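
Drilling across can be sketched in two phases: summarize each star separately by the common dimension, then merge the result sets on the shared dimension values. The Python below is a minimal sketch under invented assumptions. The fact rows, the measure names, and the functions summarize and drill_across are hypothetical, and a real implementation would normally be a pair of SQL queries whose results are merged by a BI tool.

```python
from collections import defaultdict

# Hypothetical rows from two separate stars; "product" is the conformed dimension.
sales_facts = [
    {"product": "widget", "quantity_sold": 10},
    {"product": "widget", "quantity_sold": 5},
    {"product": "gadget", "quantity_sold": 7},
]
inventory_facts = [
    {"product": "widget", "quantity_on_hand": 40},
    {"product": "gadget", "quantity_on_hand": 14},
    {"product": "gizmo", "quantity_on_hand": 3},
]

def summarize(rows, dimension, measure):
    """Phase 1: aggregate one star's measure by the common dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

def drill_across(left, right):
    """Phase 2: merge two result sets on the dimension value (full outer join)."""
    keys = sorted(set(left) | set(right))
    return [(key, left.get(key), right.get(key)) for key in keys]

sold = summarize(sales_facts, "product", "quantity_sold")
on_hand = summarize(inventory_facts, "product", "quantity_on_hand")
report = drill_across(sold, on_hand)

for product, quantity_sold, quantity_on_hand in report:
    print(product, quantity_sold, quantity_on_hand)
```

The merge is written as a full outer join so that a product appearing in only one star (here, "gizmo") still shows up in the report, with None for the missing measure.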

Planning conformance

Conformed dimensions are therefore the linchpins of the dimensional model. They ensure that each star works on its own, and also works with other stars.

If conformed dimensions are planned in advance, you can implement one star at a time without worrying about incompatibility issues.

This is the core idea behind Ralph Kimball's bus architecture. Conformed dimensions are designed as part of an up-front architecture project. Then, they serve as a semantic "bus." Just as cards plug into the backplane of a PC, fact tables plug into this dimensional bus.

The concept is also important in other architectures. For W.H. Inmon's Corporate Information Factory, conformance allows process comparison within data marts as well as across data marts.

Without conformed dimensions, subject areas become stovepipes. The opportunity to build cross-process metrics is lost. Worse yet, users also develop distrust in the individual data marts. Their reasoning is that if sales and inventory cannot be compared, there must be something wrong with the data.

Documenting conformance

The conformance plan is a central feature of your dimensional model, so of course it must be documented.

Conformed dimensions are best documented in a matrix format, as in the diagram below.

Image from Star Schema: The Complete Reference by Chris Adamson
 (c) 2010 McGraw-Hill.  Used by permission.

The rows of this diagram correspond to fact tables, and the columns are dimensions. A checkmark indicates that the fact table makes use of the associated dimension.

The matrix makes it easy to identify compatibility across fact tables. When two fact tables have a checkmark in the same column, that dimension can be used as the basis for comparing the processes (aka drilling across).
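
The same lookup can be expressed in a few lines of Python. This is a minimal sketch: the fact table and dimension names are hypothetical and are not taken from the illustration, and the matrix is simply a dictionary mapping each fact table to the set of dimensions where it has a checkmark.

```python
# Hypothetical conformance matrix: fact table -> dimensions it references.
matrix = {
    "order_facts": {"day", "customer", "product", "salesperson"},
    "shipment_facts": {"day", "customer", "product", "shipper"},
    "return_facts": {"day", "customer", "product"},
}

def shared_dimensions(matrix, fact_a, fact_b):
    """Return the dimensions checked in both rows, i.e. the drill-across basis."""
    return sorted(matrix[fact_a] & matrix[fact_b])

common = shared_dimensions(matrix, "order_facts", "shipment_facts")
print(common)
```

Orders and shipments can be compared by customer, day, or product, but not by salesperson or shipper, because each of those appears in only one of the two rows.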

Notice that conformed dimensions are depicted with associated "levels."  Salesperson, for example, has three successive levels of conformity: regions, territories, and individual salespeople. (This topic has been covered previously.)

It is possible that a degenerate dimension (a dimension attribute stored within a fact table) may be a conforming dimension. These attributes should also appear on the conformance matrix. In the picture above, order_line may be a degenerate dimension.

Variations on the conformance matrix

The small conformance matrix above focuses on a subject area (sales).  Conformance across subject areas can also be illustrated using a matrix, albeit a larger one.

An enterprise-level conformance matrix is a valuable tool. It is a blueprint that can guide incremental implementation. It also helps break down proprietary attitudes toward data among different groups within your business. One look at the matrix, for example, and it becomes clear that "customer" touches several parts of the business.

Conformance matrices can be produced at different levels of summarization. A more summarized matrix may contain one row per data mart, rather than one per fact table. This may help guide project planning, or simply make an enterprise-level matrix easier to digest.
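
One way to derive the summarized matrix from the detailed one is to take, for each data mart, the union of the dimensions used by its fact tables. The sketch below assumes a hypothetical assignment of fact tables to data marts; all names are invented for illustration.

```python
# Hypothetical detailed matrix: one row per fact table.
fact_matrix = {
    "order_facts": {"day", "customer", "product", "salesperson"},
    "return_facts": {"day", "customer", "product"},
    "shipment_facts": {"day", "customer", "product", "shipper"},
}

# Hypothetical assignment of each fact table to a data mart.
mart_of = {
    "order_facts": "sales",
    "return_facts": "sales",
    "shipment_facts": "fulfillment",
}

def summarize_by_mart(fact_matrix, mart_of):
    """Collapse fact table rows into one row per data mart (union of dimensions)."""
    summary = {}
    for fact_table, dimensions in fact_matrix.items():
        summary.setdefault(mart_of[fact_table], set()).update(dimensions)
    return summary

mart_matrix = summarize_by_mart(fact_matrix, mart_of)
for mart, dimensions in sorted(mart_matrix.items()):
    print(mart, sorted(dimensions))
```

The summary loses detail: a checkmark at the data mart level only says that at least one fact table in the mart uses the dimension.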

Similarly, the conformance matrix can be used to map individual facts to dimensions. Architects use this kind of matrix when they are having trouble identifying discrete fact tables. Performing affinity analysis on this kind of matrix reveals facts that share dimensionality. These may be candidates for inclusion in a single star.
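
A simple form of this affinity analysis is to group facts whose rows in the matrix are identical. The Python below is a sketch under that simplifying assumption (exact matches only). The fact names and dimension names are hypothetical, and grain, timing, and other design considerations still need a human decision.

```python
from collections import defaultdict

# Hypothetical fact-to-dimension matrix: one row per individual fact.
fact_dims = {
    "order_dollars": {"day", "customer", "product", "salesperson"},
    "quantity_ordered": {"day", "customer", "product", "salesperson"},
    "quantity_shipped": {"day", "customer", "product", "shipper"},
    "shipping_cost": {"day", "customer", "product", "shipper"},
}

def group_by_dimensionality(fact_dims):
    """Group facts that reference exactly the same set of dimensions."""
    groups = defaultdict(list)
    for fact, dimensions in fact_dims.items():
        groups[frozenset(dimensions)].append(fact)
    # Sort for a stable, readable result.
    return sorted(sorted(facts) for facts in groups.values())

candidate_stars = group_by_dimensionality(fact_dims)
for facts in candidate_stars:
    print(facts)
```

Each resulting group is a candidate fact table: here the two order facts fall into one group and the two shipment facts into another.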

Support this blog

Pick up a copy of Star Schema: The Complete Reference and you will be helping support this blog. Chapter 5 is completely dedicated to conformed dimensions.

A lot of information on conformed dimensions also appears in this blog:
  • Conformed Dimensions (Nov 15, 2011) discusses different ways dimensions can conform, and introduces the concept of "levels" of conformance.
There is also a category label for posts referencing conformed dimensions.

Matrix Falling image by Patrick Hosley
Licensed under CC 2.0

Conformance matrix illustration is from
Star Schema: The Complete Reference by Chris Adamson,  
 Copyright (c) 2010 by McGraw-Hill. Used by permission.