Monday, August 2, 2010

FAQ on Star Schema: The Complete Reference

My third book, Star Schema: The Complete Reference, is now available.

I took a year off from work in order to write this book.  It has an immense amount of detail. 

If you’ve enjoyed my other books, my classes, or this blog, please consider supporting this effort.

Use this link to order a copy, or use the links in the sidebar.

Here are answers to some questions I have gotten about the book.

Why another book on Star Schema?

I often want to refer people to something they can read on a particular aspect of dimensional design.  Usually, this is harder than it should be.

There are some great books on star schema, but they are organized into chapters based on business cases (my own prior work included).  You can’t open up a book like that and turn to “the chapter” that covers a particular design technique.

If you want to read about snapshot designs, for example, you’ve got to flip back and forth between chapters about inventory, banking, budgeting, etc.

Also, all books target a particular data warehouse architecture—either Inmon’s "Corporate Information Factory" or Kimball’s dimensional “bus architecture.”  Since most of the principles of dimensional design are universal, this can get in the way.

My aim was to create the missing reference on star schema design, and to make it useful to anyone who works with dimensional data—stars, snowflakes or cubes.

How is this book different?

It’s structured into chapters and sections based on design topics, instead of by industries or business scenarios.  This makes it easy to find everything on a particular topic.

It’s also architecturally neutral.  It provides design techniques and best practices without advocating a specific approach to data warehousing.

It provides deep coverage.  It explains best practices and fully explores the reasoning behind them.  It looks at the impact of each technique on BI and ETL processes, and also explores situations where you may wish to deviate from best practices.

What is special about how the book is organized?

It’s organized into chapters and sections based on design concepts.  If there is a particular topic you want to research, you should be able to find it by scanning the table of contents.

For example, if you want to learn about snapshot designs, you can turn to the chapter that covers them.  You will find a complete explanation of best practices, examples from numerous industries, and references to other books that have more examples.  You won’t need to skip back and forth between chapters that cover various industries.

Does that mean there are no examples?

Not at all!  Example models are used to illustrate every design concept.  In fact, there are over 175 figures illustrating schema designs and sample data.  Each is fully explored in the text.  The difference is that these business cases are used to illustrate design topics, rather than the reverse. 

Is it just for experts?

No!  Beginners can read it cover-to-cover, because it starts with fundamentals and then builds on them.  Experienced readers can skip directly to topics of interest, which is easy because of the book’s organization by design concept.

What is “Architecture-Neutral”?

Dimensional design is central to Ralph Kimball’s dimensional “bus architecture.”  But it also has a place in W.H. Inmon’s “Corporate Information Factory.” And many people use it just to create data marts.

The principles are mostly universal, and that’s what this book focuses on.  If your architecture includes stars, snowflakes or cubes, this book will help you understand, design and use them.

So the book does not talk about architecture?

Three architectures are explained in the book, but it does not argue for one over the others.  Nor does it assume you will be using a particular architecture.  The book is designed to be useful to anyone who uses dimensional data. 

Where can I learn more about these architectures?

Every chapter concludes with a section on "further reading," directing you to the parts of other books where you will find more examples or variations in approach.

The chapter on architecture recommends the following books:

Your previous books have advocated the Kimball approach.  Why not this one?

Most of us do not get to choose our approach anymore.  Most businesses already have a data warehouse architecture in place.  Scrapping it because it does not follow one’s preferred architecture can be an expensive and foolhardy mistake.

I have worked on projects that follow the Kimball approach, the Inmon approach, and others.  I have seen success in all these paradigms.  Like my classes, this book is meant to help you—regardless of your approach.

That said, my background is heavily rooted in the Kimball approach.  But I have been very careful to separate architectural issues from the design principles in this book. 

What do you mean by “deep coverage?”

A comprehensive reference to dimensional design should not leave important issues unexplored.  I have been teaching dimensional design for over fifteen years, and have answered thousands of questions from novice and experienced designers.  This book incorporates that experience. 

A discussion of recursive hierarchies, for example, is incomplete without a close look at the slow change issues you encounter if you build a bridge table.  A discussion of accumulating snapshots is incomplete without exploring non-linear business processes.  You get the idea.
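To make the bridge table idea concrete, here is a minimal sketch in Python with SQLite. It is not taken from the book. The table names (company_bridge, order_fact), the column names, and the sample data are all hypothetical. It builds one bridge row for every ancestor/descendant pair, including a row linking each member to itself, and uses the bridge to roll facts up to any level. It deliberately ignores slow changes, which is exactly where the complications mentioned above begin.

```python
import sqlite3

# Hypothetical parent/child data: company_key -> parent company_key (None = top).
parents = {1: None, 2: 1, 3: 2, 4: 1}

def build_bridge(parents):
    """Return (ancestor_key, descendant_key, levels_removed) rows.

    Every member gets a row pointing at itself (levels_removed = 0)
    plus one row for each ancestor above it.
    """
    rows = []
    for key in parents:
        ancestor, depth = key, 0
        while ancestor is not None:
            rows.append((ancestor, key, depth))
            ancestor = parents[ancestor]
            depth += 1
    return rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company_bridge ("
    "ancestor_key INTEGER, descendant_key INTEGER, levels_removed INTEGER)"
)
conn.execute("CREATE TABLE order_fact (company_key INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO company_bridge VALUES (?, ?, ?)", build_bridge(parents)
)
conn.executemany(
    "INSERT INTO order_fact VALUES (?, ?)",
    [(1, 100), (2, 50), (3, 25), (4, 10)],
)

def rollup(ancestor_key):
    """Total the facts for a company and everything beneath it."""
    row = conn.execute(
        "SELECT SUM(f.amount) "
        "FROM order_fact f "
        "JOIN company_bridge b ON b.descendant_key = f.company_key "
        "WHERE b.ancestor_key = ?",
        (ancestor_key,),
    ).fetchone()
    return row[0]
```

Calling rollup(1) totals the orders for company 1 and all of its subsidiaries, while rollup(3) returns only company 3's own orders because it has no descendants.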

How is each topic covered?

For every design concept, best practices are explained and illustrated with detailed examples.  I also look at what happens when you deviate from the best practices.  This will help you when you encounter a new situation, or when you make a conscious decision to deviate from best practices.

Why not just provide a set of rules?

Developing dimensional designs involves making choices.  Design principles can guide you, but you need to understand why they exist.  The book gives you the best practices, but also explains why.  “Teach a man to fish….”

Why would I deviate from best practices?

There are reasons behind every guideline, but extenuating circumstances may alter the equation.  If you know “what” and “why”, then you can make an informed decision to take a different path.

For example, we are often taught not to “snowflake” dimensions.  The book explains the reasons for this best practice, but also looks at situations where you may want to ignore it.  Your database or BI software may work better if you use a snowflake design.  Ignoring this fact would limit the value of your solution.  My motto is to be pragmatic, not dogmatic.
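For readers who have not seen the two styles side by side, here is a small sketch in Python with SQLite. It is an illustration only, not an example from the book. The table names, column names, and sample rows are all made up. It models the same product dimension once as a flat star dimension and once as a snowflake with brand and category normalized into their own tables, then shows that both answer the same question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star version: one flat dimension; brand and category repeat on each row.
CREATE TABLE product_star (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT,
    brand_name    TEXT,
    category_name TEXT
);

-- Snowflake version: brand and category are normalized into separate tables.
CREATE TABLE category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE brand (
    brand_key    INTEGER PRIMARY KEY,
    brand_name   TEXT,
    category_key INTEGER REFERENCES category(category_key)
);
CREATE TABLE product_snow (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    brand_key    INTEGER REFERENCES brand(brand_key)
);

-- One fact table; product_key values match both versions of the dimension.
CREATE TABLE sales_fact (product_key INTEGER, quantity INTEGER);

INSERT INTO product_star VALUES
    (1, 'Widget', 'Acme', 'Hardware'),
    (2, 'Gadget', 'Acme', 'Hardware'),
    (3, 'Novel', 'Pagecraft', 'Books');
INSERT INTO category VALUES (1, 'Hardware'), (2, 'Books');
INSERT INTO brand VALUES (1, 'Acme', 1), (2, 'Pagecraft', 2);
INSERT INTO product_snow VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'Novel', 2);
INSERT INTO sales_fact VALUES (1, 10), (2, 5), (3, 7), (1, 3);
""")

# Star: a single join from the fact table to the dimension.
STAR_QUERY = """
SELECT p.category_name, SUM(f.quantity)
FROM sales_fact f
JOIN product_star p ON p.product_key = f.product_key
GROUP BY p.category_name
ORDER BY p.category_name
"""

# Snowflake: the same question needs a chain of three joins.
SNOW_QUERY = """
SELECT c.category_name, SUM(f.quantity)
FROM sales_fact f
JOIN product_snow p ON p.product_key = f.product_key
JOIN brand b ON b.brand_key = p.brand_key
JOIN category c ON c.category_key = b.category_key
GROUP BY c.category_name
ORDER BY c.category_name
"""

star_totals = conn.execute(STAR_QUERY).fetchall()
snow_totals = conn.execute(SNOW_QUERY).fetchall()
```

Both queries return the same totals by category. The difference is in the number of joins and in how the tables are stored, which is why the capabilities of your database or BI software can tip the decision one way or the other.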

What happened to The Star Schema Handbook?

That was the original working title of this book.  It was listed under that name as “coming soon” by some online booksellers.  Before it came out, I changed publishers and changed the name.   It’s the same book. 

Is there a class that follows this book?

The book parallels my course on advanced dimensional design, which I teach at The Data Warehousing Institute conference events.

Upcoming events are always listed on the sidebar of this blog. I also teach it onsite, and you can contact TDWI for more information.