Now its time to discuss about SNOWFLAKE Schema, a type of star schema. It is called so because the diagram of the schema resembles a snowflake. Just like in Star Schema, Snowflake schema is represented by centralized fact tables which are connected to multiple dimensions, such that the ER diagram resembles a snowflake in shape. The only difference is that, Snowflake schema further divides the Dimension tables into sub-dimensional tables that represents a Dimensional Hierarchy. So, Snowflake encourages normalization of Dimension tables. In simple, it can be termed as normalized STAR schema.
In a Snowflake Schema, each dimension breaks down into multiple tables in a hierarchial fashion. As a result, all the low level dimension tables are tied to next higher level dimension by means of a natural key, whereas the highest level dimension tied to the FACT table by a Surrogate/Primary KEY.
The main reason behind normalizing dimensions is to eliminate redundancy. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance. So, the more you normalize the dimension table, the more complicated SQL joins must be issued to query them.
Now our concern is about which one needs to be implemented in a particular environment – We cannot just decide which is better over the other. There are lot of factors that are to be considered to know which one to be used for a particular environment. Lets see some of those factors –
1.) Oracle recommends you choose a star schema over a snowflake schema unless you have a clear reason not to. 2.) It also depends on what the query tools being used in an environment. 3.) When a Dimensional table is very large with number of attributes, where it's been difficult for star schema to represent data structure, then the better choice is using a SNOWFLAKE. 4.) If dimension table contains millions of rows, the use of snowflake schemas should significantly improve performance by moving out some data to other table. 5.) General rule of thumb is keep away from snowflake schemas as even though it may save you some space, but it costs a lot in terms of query times.
Finally, we need to determine one thing .. Why we are building a Data Warehouse and for whom its been build for? Yes .. you are right!!! For Analysis purpose and for the Business users. I think this will enough to answer for the query which schema to use? That is the reason why Oracle itself recommended to go with STAR, until and unless you have specific reason for SNOWFLAKE. Even performance might not be a factor to choose between STAR and SNOWFLAKE(I mean it . It should be considerable, not big difference) and other factors should be given more priority.
So we conclude this post here and will discuss another topic in the next post.