In the introduction, we have seen what the necessity in having a Data Warehouse. Now in this post we move forward to know more about the Data Warehouse.
When it comes to designing a DWH, it is completely different from normal Transactional database (OLTP). In case of Normal operational database, high volumes of updations/ insertions /deletions of data takes place. Besides that, in OLTP, data validation takes place while loading the data into the database. Whereas in contrast, there are only two actions that happens on a Data Warehouse database. They are –
* Loading the data * Retrieving the data
Furthermore, data validations do not takes place while data is being loaded into the data warehouse. Because it was loaded with valid data and no real time validation is necessary. These validation checks will be done at source level or may be at staging areas. Another point is that the data in data warehouse never changes and furthermore it keeps the history of records.
Moreover, data in the data warehouse stored based on Subject rather than application. Which means, a DWH is not a store to keep all irrelevant(unnecessary)data, but keeps history of relevant data, based on the Subject Area that a company concentrates on. So certain data in source databases is irrelevant for analysis.
From the above we can say that – Data in Data Warehouse is –
Subject-based(Data in DWH stored based on Subject Area.) Integrated (Similar data might be represented in different forms by different sources but as the data is moved to the Warehouse, it is transformed into a consistent way.) Non-Volatile (no updations allowed on existing data) Time-Variant (can have snapshot view of data at a given time.)
While designing the Data Warehouse, the only thing we need to consider is how well the data is organized in data warehouse, so that the data is easily accessible for business users for their strategic decisions.
Thus we conclude here and continue in the next post on how to build a Data Warehouse.