Thursday, September 15, 2011

Factless Fact Tables

This post introduces the concept of the factless fact table and the situations where you would model one.

When a fact table does not contain any facts, it is called a factless fact table. There are two types of factless fact tables: those that describe events, and those that describe conditions. Both may play important roles in your dimensional models. 

Factless fact tables for events

Sometimes there seem to be no facts associated with an important business process. Events or activities occur that you wish to track, but you find no measurements. In situations like this, build a standard transaction-grained fact table that contains no facts.

For example, you may be tracking contact events with customers. How often and why we have contact with each customer may be an important factor in focusing retention efforts, targeting marketing campaigns, and so forth.

The factless fact table shown here captures a row each time contact with a customer occurs.  Dimensions represent the date and time of each contact, the customer contacted, and the type of communication (e.g. inbound phone call, outbound phone call, robo-call, email campaign, etc.).

While there are no facts, this kind of star schema is indeed measuring something: the occurrence of events. In this case, the event is a contact. Since the events correspond to the grain of the fact table, a fact is not required; users can simply count rows:

SELECT
    customer_name,
    count (contact_type_key)as "CONTACT COUNT"
  FROM
    contact_facts...

  CUSTOMER_   CONTACT
  NAME        COUNT
  =========== =======
  BURNS, K          8
  HANLEY, S        11
  ROGERS, S         4
  SCANLON, C        8
  SMITH, B          8
  SMITH, M.E.      12


This fragment of SQL reveals that there really is a fact here: the number of events.  Its not necessary to store it, because it coincides with the grain of the fact table.

To acknowledge this, you can optionally add a fact called "contact_count"; it will always contain the constant value 1.

Factless fact tables for conditions

Factless fact tables are also used to model conditions or other important relationships among dimensions. In these cases, there are no clear transactions or events, but the factless fact table will enable useful analysis.

For example, an investment bank assigns a broker to each customer. Since customers may be inactive for periods of time, this relationship may not be visible in transaction-grained fact tables. A factless fact table tracks this important relationship:


Each row in this factless fact table represents a bounded time period during which a broker was assigned to a particular customer.  This kind of factless fact table is used to track conditions, coverage or eligibility.  In Kimball terminology, it is called a "coverage table."

Comparing conditions with events yields interesting business scenarios.  This factless fact table can be compared to one that tracks investment transactions to find brokers who are not interacting with their customers, brokers who conducted transactions with accounts that belong to a different broker, etc.

More on factless fact tables

If you've got questions about factless fact tables, send them to the address in the sidebar of this blog.

You can also read more about factless fact tables in Star Schema: The Complete Reference.  Chapter 12 provides detailed coverage of factless fact table design.