Chris Adamson’s Blog: The Accumulating Snapshot

Tuesday, March 6, 2007

The Accumulating Snapshot

Although my latest book is about aggregate tables, it covers design basics as well as advanced topics. This excerpt describes how to use an accumulating snapshot design to analyze business processes involving multiple steps.

Excerpted from Mastering Data Warehouse Aggregates, by Chris Adamson.
Used by permission.

Many business processes can be understood in terms of a series of stages or steps that something must undergo. In made-to-order manufacturing, an individual item is ordered, manufactured, quality assured, packaged, and shipped. In banking, a mortgage application is submitted, reviewed by a loan officer, evaluated by an underwriter, and eventually consummated at a closing.

When the individual items that pass through the process are readily identifiable (an individual item being manufactured; a specific mortgage application), and the processing steps or milestones are predictable (the manufacturing process stages, the mortgage approval phases), the status of each item can be tracked by an accumulating snapshot model. Such a model provides a way to understand the status of an individual item at the detail level, the workload at each processing step, and the efficiency with which items move from one stage to the next.

Consider the process by which a mortgage application moves from initial submission to closing. The process begins when an applicant completes a mortgage application with a mortgage officer. A processor then takes over, ensuring that documentation required to evaluate the loan application is gathered. Next, an underwriter reviews the application and supporting documentation, evaluates potential risk, and approves or rejects the application. The amount approved for underwriting may be less than the amount of the original application. Finally, the mortgage becomes legally binding at the time of closing.

Figure 1, below, shows an accumulating snapshot for the mortgage application process. The grain of this fact table is an application. Each application will be represented by a single row in the fact table. The major milestones are represented by multiple foreign key references to the Day dimension: the date of submission, the date approved by the mortgage officer, the date all supporting documentation was complete, the date approved by an underwriter, and the date of closing.

Figure 1: An accumulating snapshot tracking mortgage applications.

Note that it is not necessary to build five Day dimension tables. A single Day table can serve in all these roles; it will simply be aliased at query time or through a series of views. Similarly, the three employees involved in the process are represented by three foreign key references to the Employee dimension table: one for the mortgage officer, one for the processor who assembles the supporting materials, and one for the underwriter.
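As a rough illustration of aliasing a single Day table at query time, here is a minimal sketch using Python's built-in sqlite3 module. The table name mortgage_processing_facts, the columns full_date and closing_date_key, and the sample rows are assumptions made for this example; only day_key and application_submission_date_key come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE day (day_key INTEGER PRIMARY KEY, full_date TEXT);
-- Hypothetical, trimmed-down fact table: two of the five date roles.
CREATE TABLE mortgage_processing_facts (
    application_key INTEGER PRIMARY KEY,
    application_submission_date_key INTEGER REFERENCES day(day_key),
    closing_date_key INTEGER REFERENCES day(day_key)
);
INSERT INTO day VALUES (0, 'unknown'), (1, '2007-01-08'), (2, '2007-02-20');
INSERT INTO mortgage_processing_facts VALUES (101, 1, 2), (102, 2, 0);
""")

# One physical Day table, joined twice under two aliases (one per role).
rows = conn.execute("""
SELECT f.application_key, submitted.full_date, closed.full_date
FROM mortgage_processing_facts AS f
JOIN day AS submitted ON submitted.day_key = f.application_submission_date_key
JOIN day AS closed    ON closed.day_key    = f.closing_date_key
ORDER BY f.application_key
""").fetchall()

for row in rows:
    print(row)
```

The same effect could be had by defining one view per role over the Day table; the aliases simply keep the example short.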

The facts include several dollar amounts. The amount of the original application is the application_amount. The amounts approved by the mortgage officer and by the underwriter are recorded as officer_approved_amount and underwritten_amount, respectively. The amount actually loaned at closing is the closing_amount.

When the application is first submitted, the only date that is known is the application submission date. The appropriate day_key value is assigned to the application_submission_date_key. In order to accommodate the dates that are as yet unknown, the Day dimension must contain a row for "unknown" or "n/a." The key value for this row will be assigned to the remaining dates. The application_amount is set to the requested mortgage amount from the application. The remaining dollar amounts are initialized to zero.
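The initial load described above might look something like the following sketch. The key value 0 for the "unknown" Day row, the function name new_application_row, the application_key column, and the names of the four later date keys are all assumptions for illustration; the amount columns and application_submission_date_key are taken from the text.

```python
UNKNOWN_DAY_KEY = 0  # assumed key of the "unknown" row in the Day dimension


def new_application_row(application_key, submission_day_key, requested_amount):
    """Build the initial fact row for a newly submitted application."""
    return {
        "application_key": application_key,
        "application_submission_date_key": submission_day_key,
        # Milestones not yet reached point at the "unknown" Day row.
        "officer_approval_date_key": UNKNOWN_DAY_KEY,
        "processing_complete_date_key": UNKNOWN_DAY_KEY,
        "underwriting_approval_date_key": UNKNOWN_DAY_KEY,
        "closing_date_key": UNKNOWN_DAY_KEY,
        # Only the requested amount is known at submission time.
        "application_amount": requested_amount,
        "officer_approved_amount": 0,
        "underwritten_amount": 0,
        "closing_amount": 0,
    }


row = new_application_row(101, 20070306, 250_000)
print(row)
```

Pointing unreached milestones at a real Day row, rather than leaving them NULL, means that joins to the Day dimension never drop an in-flight application.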

As the application works its way through the process, the fact table row will be updated. The various date keys are revised as each milestone is reached. The additional facts are filled in as the appropriate milestones are reached.
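What sets the accumulating snapshot apart is that the existing row is updated in place instead of a new row being inserted. A minimal sketch of one such update follows, again using sqlite3. The table name, the officer_approval_date_key column, the "unknown" key of 0, and the helper record_officer_approval are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE mortgage_processing_facts (
    application_key INTEGER PRIMARY KEY,
    application_submission_date_key INTEGER NOT NULL,
    officer_approval_date_key INTEGER NOT NULL DEFAULT 0,  -- 0 = assumed "unknown" Day row
    application_amount NUMERIC NOT NULL,
    officer_approved_amount NUMERIC NOT NULL DEFAULT 0
)""")
conn.execute(
    "INSERT INTO mortgage_processing_facts "
    "(application_key, application_submission_date_key, application_amount) "
    "VALUES (?, ?, ?)",
    (101, 20070306, 250000),
)


def record_officer_approval(conn, application_key, day_key, approved_amount):
    """Revise the date key and fact for one milestone on the existing row."""
    cur = conn.execute(
        "UPDATE mortgage_processing_facts "
        "SET officer_approval_date_key = ?, officer_approved_amount = ? "
        "WHERE application_key = ?",
        (day_key, approved_amount, application_key),
    )
    return cur.rowcount  # number of rows changed


updated = record_officer_approval(conn, 101, 20070309, 240000)
result = conn.execute(
    "SELECT COUNT(*), MAX(officer_approval_date_key), MAX(officer_approved_amount) "
    "FROM mortgage_processing_facts"
).fetchone()
print(updated, result)
```

After the update the table still holds exactly one row for application 101, which preserves the grain of one row per application.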

The fact table also contains a series of lag columns. These represent the number of days the application spends at each stage. Each is a simple calculation based on milestone dates, populated once the application completes the appropriate stage. They allow measurement of processing time.
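A lag can be computed as the difference between two milestone dates. The sketch below is one possible way to do it; the function name lag_days and the variable names are invented for this example, and returning None while a milestone is still unknown is an assumption (the text does not say what a lag column holds before its stage is complete).

```python
from datetime import date
from typing import Optional


def lag_days(start: Optional[date], end: Optional[date]) -> Optional[int]:
    """Days between two milestones, or None while either is still unknown."""
    if start is None or end is None:
        return None
    return (end - start).days


submitted = date(2007, 3, 6)
officer_approved = date(2007, 3, 9)
closing = None  # milestone not yet reached

application_to_approval_lag = lag_days(submitted, officer_approved)
application_to_closing_lag = lag_days(submitted, closing)
print(application_to_approval_lag, application_to_closing_lag)
```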

The fact table also contains a series of quantities, each corresponding to one of the milestones. The value of each quantity is zero or one, depending on whether the application has reached the corresponding milestone. These facilitate workload reporting; summed across many applications, they give the number of applications that have reached each milestone. To make it easier to identify the current status of an application, a status attribute may be added to the application dimension or to the fact table as a degenerate dimension.
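To show how the zero-or-one quantities support workload reporting, here is a small sketch that sums them across applications. The qty_* column names, the table name, and the four sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mortgage_processing_facts (
    application_key INTEGER PRIMARY KEY,
    qty_submitted INTEGER,
    qty_officer_approved INTEGER,
    qty_processing_complete INTEGER,
    qty_underwritten INTEGER,
    qty_closed INTEGER
);
-- Each quantity is 1 if the application has reached that milestone, else 0.
INSERT INTO mortgage_processing_facts VALUES
    (101, 1, 1, 1, 1, 1),
    (102, 1, 1, 1, 0, 0),
    (103, 1, 1, 0, 0, 0),
    (104, 1, 0, 0, 0, 0);
""")

totals = conn.execute("""
SELECT SUM(qty_submitted), SUM(qty_officer_approved),
       SUM(qty_processing_complete), SUM(qty_underwritten), SUM(qty_closed)
FROM mortgage_processing_facts
""").fetchone()

# Applications with complete documentation that are not yet underwritten.
awaiting_underwriting = totals[2] - totals[3]
print(totals, awaiting_underwriting)
```

Subtracting the totals for adjacent milestones gives the number of applications currently sitting between those two steps.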

Using this schema, it is possible to study the lifecycle of a particular application, the workload at each processing step, and the efficiency of the overall process itself.

Excerpted from Mastering Data Warehouse Aggregates, by Chris Adamson.
Copyright (c) 2006 by Wiley Publishing, Inc.
Used by permission.
