Tuesday, April 27, 2010

Basics: What is a Dimensional Model

In a recent post, I asserted that a star schema is the implementation of a dimensional model in a relational database.  Many readers may have missed this point, because it was embedded in a conversation about the term normalization

Here, then, is a clutter-free discussion the dimensional model.

Activities and Conditions

A dimensional model is a structured framework for measurement.   

Usually, a dimensional model describes a process or activity.  For example, a retailer might design a dimensional model of sales transactions.

A dimensional model may also describe conditions, as measured at predefined intervals. A water utility might design a dimensional model of reservoir levels, measured daily.

Facts and Dimensions

A dimensional model describes measurement of a process (or conditions) through facts and dimensions.

Facts are the measurements.  For the retailer's sales activities, facts that measure each sales transaction include the quantity sold and the price paid.  For the utility's reservoir conditions, a single fact is measured: gallons on hand.

Facts are not useful without context.  For example, "one hundred units sold," is a measurement of quantity sold, but it has no context.  One hundred of what?  When were they sold?  Where?

Dimensions are used to describe the context of facts.  For the retailer, dimension values provide context for each measurement.  These include the date and time of each purchase, the product sold, and the store in which it was purchased. 

For the utility's model of reservoir status, dimensions include the date the measurement was taken, the reservoir or facility where the measurement was taken, and the inspector who measured it.

If you can define a process, facts and dimensions, you have the core of a dimensional model.  Of course, there is a bit more to it than that (the concept of grain is also essential), but that's the basic idea.

Uses

The dimensional model is most famous for its role as the basis for a star schema.  Before we get to that, though, its important to recognize that it has other uses.

First, a dimensional model is an excellent way to describe requirements for an analytic system.  A model of process measurement is far more efficient and flexible than a list of of specific business questions.  A single dimensional model may be able to answer thousands of questions, including some that have not yet been thought of.

A dimensional model is also an excellent tool for planning your data warehouse strategy, and for managing the scope of implementation projects.  This is particularly true when the model is translated to a database design and linked to sources of data.  (I've written about this before.)
 
Stars, Snowflakes, Cubes

A dimensional model can serve as the basis for a database design.  This, of course, is what it is famous for.  (If you can even say it is famous, that is.)
  • When a dimensional model is implemented in a relational database, it is called a star schema (or sometimes a snowflake schema.)  
  • When implemented in a multidimensional database, it is called a cube.  (I touched on this previously, as well.)
If you work with any of these things, you are working with a dimensional model.

- Chris

Image: Slide together Polyhedra by fdecomite
Licensed under Creative Commons 2.0

Monday, April 5, 2010

TDWI Members: Read Chris's Column in Flashpoint

If you are a member of The Data Warehousing Institute, be sure to check out the April 1, 2010 issue of Flashpoint.

My article The Hidden Value of Dimensional Design explains how you can use dimensional design to cultivate a shared understanding of project scope between business and technical personnel.

This publication is only available to TDWI members.

If you are not a member (or if you want to read more on the topic) have a look at this blog post I wrote in 2007:  Drive Warehouse Strategy With A Dimensional Model.

- Chris