Tuesday, November 15, 2011

Conformed Dimensions

This second post on conformed dimensions explores different ways in which dimensions can conform. 
There are several flavors of conformed dimensions. Dimensions may be identical, or may share a subset of attributes that conform.
Conformance basics

Conformed dimensions are central to the discipline of dimensional modeling.  The basics of conformance were introduced in a post from earlier this year.  In a nutshell:

  • Measurements of discrete business processes are captured in individual star schemas (e.g. proposals and orders)
  • Some powerful business metrics combine information from multiple processes.  (e.g. fill rate: the ratio of orders to proposals)
  • We construct these metrics through a process called drilling across
  • Drilling across requires dimensions with the same structure and content (e.g. proposals and orders have a common customer dimension)

For a refresher on these concepts, see "Multiple stars and conformed dimensions" (8/15/2011). 

Physically shared dimensions not required 

When two fact tables share the same dimension table, their conformance is a given. Since the shared dimensions are the same table, we know they will support drilling across.

For example, stars for proposals and orders may share the customer dimension table.  This makes it possible to query orders by customer and products by customer, and then merge the results together.

But this process of drilling across does not require shared dimension tables. It works equally well if proposals and orders are in separate data marts in separate databases.

As long as the stars each include dimensions that share the same structure (e.g. a column called customer_name) and content (i.e. the customer values are the same), it will be possible to merge information from the stars. 

Levels of conformance 

It is easy to take this a step further.  We can also observe that there is compatibility between dimensions that are not identical.

If a subset of attributes from two dimension share the same structure and content, they form a sort of “lowest common denominator” across which we can compare data from the stars.

For example, suppose we establish budgets at the monthly level, and track spending at the daily level.  Clearly, days roll up to months.  If designed correctly, it should be possible to compare data from budget and spending stars by month.

The picture below illustrates the conformance of a MONTH and DAY table graphically.  The ring highlights the shared attributes; any of these can be used as the basis for comparing facts in associated fact tables.

In this case, the two conformed dimensions participate in a natural hierarchy.  Months summarize days. The month table is referred to as a “conformed roll-up” of day.

To successfully drill across, the content of the shared attributes must also be the same.  Instances of month names, for example, must be identical in each table -- "January" and "January" conform; "January" and "JAN." do not.

To guarantee conformance of content, the source of the rollup should be the base dimension table. This also simplifies the ETL process, since it need not reach back to the source a second time. 

Other kinds of conformance 

Identical tables and conformed roll-ups are the most common kinds of conformed dimensions.  Other kinds are less common. 

Degenerate dimensions (dimensions that appear in a fact table) may also conform. This is particularly useful with transaction identifiers. 

Overlapping dimensions may share a subset of attributes, but not participate in a hierarchy. This is most common with geographical data. 

More Info 

For more information, see the following posts: 
I also write about conformed dimensions extensively in Star Schema, The Complete Reference.  
If you enjoy this blog, you can help support it by picking up a copy!

Photo by Agnes Periapse, licensed under Creative Commons 2.0