Monday, March 26, 2007

Three Data Warehouse Architectures that Use Star Schema

Whether you follow Kimball or Inmon, or you just build stand-alone data marts, this Blog offers practical advice for your star schemas.

Most data warehouses have a data architecture that fits into one of three categories. These styles of data warehousing often use the same terms to mean very different things. The term “data mart,” for example, has very different connotations for proponents of each of the different architectures. As a result, there is a lot of misinformation about the relative merits of each approach.

But dig a little deeper, and it turns out all three architectures share something in common. Each architecture has a place for the star schema – a set of tables designed according to the principles of dimensional modeling.

Three Data Architectures

At the risk of fanning the flames of a debate that is often characterized as “religious,” here is a very brief summary of each of the three styles of data warehouse design.

Corporate Information Factory This approach is advocated by Bill Inmon, Claudia Imhoff, and many others. It involves a central repository of detailed or “atomic,” data, organized in third normal form. Separate departmental data marts reorganize portions of this data for departmental purposes. These data marts may be built using star schema design.

Dimensional Data Warehouse This approach is advocated by Ralph Kimball, Margy Ross and many others. Atomic data from across the enterprise is stored in star-schema format. The star schemas are designed with a set of “conformed dimensions” allowing them to work together. Data marts are not separate entities, but subject areas within this set of star schemas.

Stand-Alone Data Marts Although there are no major proponents of this approach, it is very popular. Stand-alone data marts focus on a particular business process or department, and are not designed in an enterprise context. Stand alone data marts may be built using star schema designs, or follow other approaches.

This information is summarized in the table below, which also includes some alternate terms used to describe each approach.

Keep in mind that this is a very high level summary, and it is here only to illustrate how the star schema factors in. I encourage readers to seek out more information on each of these approaches.

Something in Common: The Star Schema

Despite the furor over which of these architectures makes the most sense, it should be obvious that they have something in common: each has a place for the star schema. And in all cases, the principles of star schema design are the same.

Regardless of the role of the star schema in your architecture, it must be designed carefully. A poor design will lead to issues, whether you are building a corporate information factory or a dimensional data warehouse. A good design, coupled with proper expectations for your data warehouse architecture, will yield success.

Pragmatic, Not Dogmatic

In my first book on data warehouse design, there is a chapter on data warehouse architecture which advocates the dimensional data warehouse approach. But I have designed star schemas that are used successfully in all three architectures, and the principles of dimensional modeling are the same.

Sometimes people are surprised that I am willing to be involved in projects that follow all three approaches, or that I don’t automatically move to throw away imperfect solutions. But the investment of resources in a data warehouse is a business decision. Design principles may factor in, but so do a variety of other factors – factors over which I am not qualified to weigh in.

Instead, I try to be pragmatic. Yes, I have my preferred approach. But what makes the most sense for the business? Is there already a significant investment in a different architecture? Is there simply not funding for the ideal solution?

The information and resources in this blog are meant to help you, regardless of your architecture. If you use the star schema for anything, you will find help here. From time to time, I will look at concerns specific to one style or another, but this will always be clearly spelled out. Most of the time, we will look at principles that have applicability across the board.

Copyright © 2007 Chris Adamson