Monday, April 25, 2011

Bridge Tables and Many-to-many Relationships

Recent posts about bridge tables generated a lot of interest. Today, I will look at the many-to-many relationships in these solutions. 

These relationships are intended, necessary, and valid parts of database design. Many software tools do not understand this kind of relationship, so I'll also show you how you can eliminate it, if need be.

Many-To-Many

Previously, I described two uses for bridge tables. Both examples involved many-to-many relationships between tables. Many readers may have thought this was an error. In fact, it was intended.

One case involved a situation where facts must relate to multiple dimension members. Specifically, it was possible for two or more salespeople might collaborate on orders captured in a fact table. The bridged solution looked like this:

Notice that there is a many-to-many relationship between the bridge table and the fact table. This can be understood as follows:
  • A given fact may reference multiple rows in the bridge.  This happens when two salespeople collaborate on an order. 
  • A given row in the bridge may reference multiple facts.  This happens when the same group collaborates on more than one order line.
The bridge table allows us to link the repeating dimension to the facts, but as noted in the post, care must be taken not to double count.

A Legal Relationship

In the world of entity-relationship modeling, we are taught that many-to-many relationships may be present in a logical model, but not  in a physical design. But we are not discussing entity-relationship modeling.

In a dimensional model, this kind of relationship is allowed. It is also 100% compatible with the relational model of data.  Despite what you may think, it is also compatible with your relational database management system.

The figure above can be implemented as a physical design, and it will work. Unfortunately, some of our software tools do not understand this. 

Tool Problems

Many of the software products we use in data warehousing were originally developed for transaction processing systems. Because of this lineage, the many-to-many relationship can be a problem.

For example, suppose your DBMS works best if joins between tables are supported by declaration of primary key/foreign key relationships. This is a problem for the model pictured above. Order_facts and salesperson_bridge can be joined using sales_group_key.  But it is not a primary key in either table.

Luckily, this join will work even if it is not supported by any key constraints. Unfortunately, other tools may not be as forgiving.

Some data modeling tools do not permit a many-to-many relationship in a physical model. This presumably reflects their entity-relationship lineage. And if you use a SQL-generating BI tool to create reports, you may find it identifies bridge table as fact tables.

Eliminating The Many-To-Many

If your software products cannot live with a many-to-many relationship, there is a simple solution. You can resolve in the same way an entity-relationship modeler would: create an intersect table.

In the example above, a new table is added that contains one row for each group. An association or intersect table, will link the group to the salespeople. The result looks like this:
For those of us who find humor in data modeling, this is a source of much amusement. The sales_group table consists of a single column. And since the sole column is a surrogate key, the table's entire contents are likely to be a sequence of integers!

Still: problem solved. Only take this additional step if you really must.

More on Bridges

Bridges are powerful but dangerous . To make proper use of them, there is more to understand.  Additional topics include he impact of slowly changing dimensions, the use of allocation factors, how to prevent end users from double counting....the list goes on.

I will post more about bridges, but I encourage you to also read about them in my book, Star Schema The Complete Reference.  The book goes into much more depth on bridge tables than I can do in a blog post - or even a series of posts.  Two full chapters are dedicated to bridge tables.  Check it out!

- Chris


Bridge photo by Branwebs via Creative Commons