Allocation factors are often seen as the cure-all remedy to avoid double counting. Unfortunately, they are not always realistic. And when they are possible, there is often a better solution.
The back-story: double counting
A dimension bridge allows a single fact row to be linked to two or more rows in a dimension. In a previous post, I showed how this technique allowed two or more salespeople to be linked to a row in an orders fact table.
As I pointed out, the danger is that queries may accidentally count the order dollars repeatedly -- once for each associated salesperson. This is the danger of double-counting.
The potential solution: allocation
An allocation factor can sometimes be used to avoid this danger. To the bridge, we add a column that indicates the "share" of the facts associated with each group member. In queries, before we aggregate facts, we apply this allocation factor.
For example, we can add an allocation factor to the salesperson group bridge:
If two members collaborate on a sale, and split the credit 75/20, there will be two rows in the bridge for their group. One would contain the group key, the first salesperson key, and a .75 allocation factor. The other would contain the group key, the second salesperson key, and a .4 allocation factor. (For sample data, see Figure 9-5 my book.)
Now, when we connect the bridge to the fact table, we can apply the allocation factor:
select sum(order_facts.order_dollars * salesperson_bridge.allocation)
Seems to be the perfect solution, right? Well, not always....
There must be a business rule
An allocation factor only works where one exists. Do not come up with your own rules -- e.g. "split transactions evenly among group members." This may not reflect how the business studies their key metrics.
Allocation factors are often not part of the business. If an insurance claim is tracked in a fact table, it may not make sense to allocate it against multiple parties. If a fact table represents a insepction event, it may not make sense to allocate it against multiple infractions.
There must be a definitive and consistent business rule that can be used. If there is no business rule, then obviously allocation will not work. Note, too, that they are even less likely to be found in situations that call for an attribute bridge.
There should be a source of record
So you have a business rule? You are not out of the woods yet.
Next, you need to make sure you have a source of record for the allocation factor.
For example, many systems that support the sale of securities calculate, and split, broker commissions at the time of sale. These systems provide a nice, clean allocation factor.
If you cannot find a source of record, you should think long and hard about using the "business rule." In all likelyhood, it will be subject to exceptions, and may even change over time.
For example, your business may tell you transactions can be allocated evenly among group participants. You may choose to "hard-code" this rule into your ETL process. But one day, you will encounter a situation where the business wants to "override" this rule for a particular transaction. Or, a new rule is instanitated that one particular salesperson always gets at least 40% of their sales, and so on...
Each time the business changes its mind, ETL rework will be necessary. If you don't have a system of record, think long and hard before hard-coding a business rule.
Consider altering fact table grain
Got a business rule AND a source of record? Fantastic! You are very lucky, and you may not even need an allocation factor at all!
When you have a clear source for an allocation factor, consider skipping the bridge altogether. Instead, restate the grain of your fact table as allocated transactions.
In the orders example, we can change the grain of the fact table from "one row per order line" to "order lines allocated to salespeople."
Now, if two salespeople collaborate on a sale, there will be two rows in the fact table. They will be fully addtitive, with no risk of double counting the facts. No bridge needed!
Note that this works best of the orders are allocated in the source system. Otherwise, you will wind up with splits that are not even (33/33/34) or fractional quantities that do not total to the same amount as the operational system.
More to Come
My inbox is full of additional questions and comments about bridge tables. I promise to post some more about bridged designs in the months to come.
In the meantime, I encourage you to check out Star Schema The Complete Reference, which contains two full chapters on bridge tables. Chapters 9-10 offer a far deeper treatment than I can provide on this blog, with 30 figures, sample instance charts, sample queries and results, etc.
Bridge photo by Branwebs via Creative Commons