Data Warehouse – Let’s build it!!! – Part2

Now our Idea (concept) is to build a Data Warehouse, so the first step is to develop a Conceptual framework. To do so, gather necessary requirements and identify specific scope for your application and develop your mental image into an initial design which is called ‘CONCEPTUAL MODEL‘. This is considered as High-level Design and not a detailed one. It gives the vague idea of your Mental image. It identifies various objects involved in building a data warehouse and the relationship between them. But it doesn’t show how those objects are related to each other.

To make it into a SYSTEM DELIVERABLE, it is necessary to develop a more detailed form from the Conceptual Model. Logical Design is more detailed and shows how the objects(entities) are interrealated to each other. It Provides detailed explaination(defines Attributes) to each Object(entity) and show how they are realated to each other. So a logical design brings relationship between the objects(entities) by means of their attributes.

So the end product of a logical design is –

* a set of entities and attributes corresponding to fact tables and 
  dimension tables.
* a model of operational data from your source into subject-oriented 
  information in your target data warehouse schema.

Once you are ready with the Logical Design, now its time for you to Implement that Logical Design to a Physical form.

The Physical representation of the the Logical Design, is the most detailed data model in Data Warehouse Data Modelling. During this process you translate the expected schemas into actual data structures. So, what actually this physical design contains is –

    * Actual tables
    * Columns and their data types and Size 
    * Integrity Constraints (PK's, FK's..)
    * Other Objects like Indexes, Views, MT's, Synonyms .. etc.,

The way how you group these Objects is called SCHEMA. There are wide variety of ways in which You can arrange schema objects in the schema models designed for data warehousing. The most noted approaches are (1) Normalized Approach and (2) Dimensional Approach. But most data warehouses use a dimensional model.

Dimensional Modeling is a favourite modeling technique in data warehousing. This model should be more flexible than OLTP system, because it should be designed for inconsistent access patterns by the users. Moreover, this model must be extensible and easily adaptable to change, without impacting the current usage. A Data Warehouse, by its nature, is continuosly changing from time-to-time, so the model must be amenable to reflect that change.

So we need to keep all these things in mind while implementing actual design of data warehousing.

A Dimension Model is visually represented as a FACT table surrounded by DIMENSION tables as Shown below –

Each of the Dimension tables Primary Key becomes Foreign Key’s of the related Fact Table in Dimensional Model Designs. All the Foreign Key’s(FK) of the Fact Table constitute a composite Primary Key (PK) for that Fact Table. Thus the FACT table express a Many-to-Many relationship.

This is how a Data Warehouse Schema is represented. There are a number of ways in which a Data Warehouse Schema can be represented. Most important ways are –

* STAR Schema

So, before concluding this post, let us see what we discussed in this post in pictorial form –

Thus we conclude here and discuss in detail about the STAR and SNOW-FLAKE schemas in later posts.

This entry was posted in Data Warehouse. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s