Friday, July 2, 2010

Price Waterfall Analysis and Dimensional Design

Many businesses rely on information in the data warehouse to conduct price waterfall analysis.  There is no single way to design a schema to support waterfall analysis; your solution will vary according to the waterfall model being used and the availability of data elements.

Waterfall Analysis

Waterfall analysis represents pricing as a series of "leakages" or deductions from list price, usually grouped into major categories, with various intermediate assessments of price.  Target margins are also sometimes included.  The journey from "list price" to "pocket margin" may be expressed in terms of unit prices, average unit prices, or as percentages.

In the diagram below, for example, waterfall analysis represents "list price" as 100%.  Several types of "leakages" appear as percentage reductions.  The various approximations of price are shown in dark blue; leakages are shown in light blue.  Targets appear in red. 

"Invoice Price," for example is what appears on customer orders, and factors in several discounts.  "Net Price" factors in post-order discounts.  "Product Price" factors in sales and marketing costs, and "Pocket Margin" is the end result of the analysis.

Supporting from a Dimensional Model

From a dimensional design perspective, waterfall analysis is similar to profitability analysis.  There is not a single, "best way" to support waterfall analysis. The sophistication of your solution will vary based on the complexity of the waterfall model and the availability of data.

It is best to start by thinking about the different prices and leakages as additive facts, which implies extended amounts.  While prices and leakages can be specified as unit amounts, waterfall analysis is most flexible when it can be conducted across products, time periods or geography.  Storing unit amounts in a dimensional schema does not serve this well.   Instead, think about the waterfall components as facts that represent extended amounts.  These are fully additive, and can be summarized with maximum flexibility.  (Some graphics of waterfall analysis use unit amounts, but these are computed based on volume totals, rather than the reverse.)

A Value Chain

If thought of as additive facts, the various prices and leakages can be associated with one or more processes, each potentially represented by a fact table.  Together, there fact tables form a value chain, related by a set of common dimensions.  By combining the data from these stars, the waterfall can be constructed. 

Next, look at your business's waterfall model itself.  Are the data elements available?  At what level of detail?  Do some represent allocations?  What are the allocation rules?  This will drive the design of stars that capture key components of the waterfall.

In the illustration, the first several columns may be available directly from an order-entry system.  We can envision a single order_facts star, which contains facts capturing these extended amounts for each order line:  list price, volume discounts, corporate discounts, promotion discounts, and invoice price.

The order entry system may also support rebate processing, allowing the next two facts to be stored in the same table:  rebates and net price.  Or, it may be that these are computed at different periodicity and/or in a different system, in which case they may belong in a separate star.  If stored separately, tying them to the original order will allow them to correlate to an order date rather than the time of the rebate, allowing for "lagged" and "unlagged" flavors of analysis. 


As in traditional profitability analysis, some components of waterfall analysis may be allocated.  Where this occurs, you must be sure that there is a definitive source of data.  Alternatively, there must be an agreed upon and consistent set of business rules used to construct allocations.  (Remember that data warehouses publish data; they do not manufacture it.)

The example above requires marketing and advertising costs be available by (or allocated to) quarter, product category and region.  This may be less granular than the data that supported the columns to their left.  Conformed dimensions will ensure that all data can be expressed at a common level of detail.

If there is no source of data or business rule, the desired model cannot and should not be supported by the data warehouse.  It is up to the business to develop the processes and systems to define these data elements, not the data warehouse.


You may also have targets for the various prices, which are specified by planners as percentages.  These will also need a home somewhere.  They are really non-additive facts, and will likely be stored in fact tables that associate them with time periods and plan version.  Alternatively, they may be incorporated into reports by hand.

Pulling it Together

Waterfall analysis reports, like the one above, can be constructed by combining data from the constituent stars and combining it based on the common dimensions (such as products, time, or regions.)  This process is sometimes referred to as "drilling across."

It is also possible to construct a single "derived" star (or cube), in which this consolidation is handled in the ETL process, rather than the reporting process.   This makes the analysis much easier to perform, since much of the hard work is taken care of in advance.

Variation in the Model

Some businesses have models that vary over time or according to products or categories, and may have support for waterfall analysis incorporated directly into operational systems.  This is accomplished on the operational side by setting up a breakdown of each order line according to a set of configurable waterfall components.

Such an operational solution may translate into a dimensional design that represents each transaction line as a series of rows in a single fact table, each line containing a single dollar amount. Each row represents one of the components in the waterfall chart.  A dimension determines which component is represented by each row.

This design approach offers nice flexibility, as the model can be changed without altering the design.  However, the reports themselves may require alteration as the model is refined.  Furthermore, he star itself is limited in its utility; its single fact must be carefully qualified each time it is aggregated.