In your data warehouse, each star corresponds to a business process. Combining facts from different processes can produce powerful compound metrics. The key to making this work is a set of conformed dimensions.
Conformed dimensions are closely associated with Kimball's "Bus Architecture," but are crucial in any scenario that involves dimensional data.
In a dimensional design, each star captures collects measurements that describe a discrete business process.
If we have two measurements that describe different processes, we place them into separate fact tables.
For example, a sales data mart may contain multiple stars:
- Proposal information by Salesperson, Prospect, Product, Proposal and Proposal Date
- Order information by Salesperson, Customer, Product, Proposal, Contract and Order Date
- Shipping information by Salesperson, Customer, Product, Proposal, Contract, Shipment, Shipper, and Shipment Date
- Return information by Salesperson, Customer, Product, Contract, Reason and Return Date
By recording measurements of each process in a different star, we are able to capture information at the most detailed level possible. We can study each of these processes, complete with attendant details, by accessing the appropriate star.
Some of the most powerful measurements actually combine information from multiple processes. These metrics require combining facts from different stars.
In the sales data mart, the ratio of proposals to orders is the "close rate," a powerful indicator that sales managers and executives look at on a regular basis.
Similarly, the ratio of shipments to returns is the "return rate," an essential quality control metric.
When we compare facts from different stars, we don't simply join the fact tables. To do so might cause double-counting of some facts.
Instead we follow a process that Kimball calls drilling across. The drill-across process can be broken down into two phases. In the first phase, each star is queried, and results are aggregated to a common level of detail. In the second phase, these result sets are merged based on their common dimensions.
For example, to compute the return rate by product for August of 2011, we do the following:
- a. Figure out quantity shipped by product for August 2011
b. Figure out quantity returned by product for August 2011
- Merge these amounts based on the common product names and compute the ratio
The key to making all this work is the organization of the dimensions. As you saw in the example above, we used the dimension values to link our results together: product names were used to merge together shipment and return quantities and compute their ratio. This would not have worked if the product dimensions for Shipments and Returns had been different.
This is the basic idea behind conformed dimensions. We say that two dimensions conform if they have the same structure and content. Both our stars, had a product dimension table with a product name attribute, and the product names were specified the same way in each. Sharing a single physical table is one way to ensure conformance, but it is not required.
Two dimensions can also conform if one has a subset of the other's attributes. As long as the common attributes have the same structure and content, they are said to conform.
By planning a set of conformed dimensions, we ensure that fact tables can be used to compare processes. This is important within a single data mart, such as the one above, and it is also important when looking across multiple data marts.
Conformed dimensions are the organizing principle in Kimball's architecture. Conformed dimensions are planned up-front, as a part of a project that establishes a data architecture based on dimensional design. Implementation proceeds once this conformance bus has been planned.
The concept is also important in other architectures. For example, the sales data mart discussed above might be part of Corporate Information Factory architecture. Within this data mart, conformance guarantees we can compare shipments to returns, proposals to orders, and so forth.
More to come
In the coming weeks, I will post more about conformed dimensions. We will look at "levels" of conformance, how to document conformed dimensions, and how different tools work with conformed dimensions. If you have my book, I also encourage you to read Chapters 4 and 5, which look at these concepts in detail.