Bharat Jain 26
Ankita Golchha 27
Shilpa Kasani 28
Vijay Kumar 29
Tasneem Taj 30
Rachana Kola 31
Data warehousing
A data warehouse is a database used for reporting and analysis.
It focuses on data storage. Essential components of a data warehouse system. A data warehouse can be subdivided into data marts.
Conceptual view.
Unlimited dimensions.
What the data reveals
Inserts & updates
Queries
Processing speed
Space requirements
Database design
ARCHITECTURE
[Figure: data warehouse architecture. Labels: External data sources, Operational systems; Extract, Clean, Transform, Load, Refresh; Data warehouse, Metadata Repository; Serves: OLAP, Reports, Data Mining]
COMPONENTS
3 main systems required:
Source systems
Data staging area
Presentation servers
Operational data:
Internal data
External data
Load manager:
Simple transformation of data to prepare the data for entry.
CONTD..
Warehouse manager:
Analysis of data.
Transformation & merging of data.
Backing up & archiving of data.
Metadata:
Metadata means data about data.
Extraction & loading process.
Warehouse management process.
Query management process.
ETL PROCESS
Extract
Cleansing
Transform
Loading
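The four ETL steps can be sketched as a small pipeline. This is only an illustration: the rows in RAW_ROWS, the field names and the function names are hypothetical, and the order used here (extract, cleanse, transform, load) is an assumption taken from the architecture slide.

```python
from datetime import date

# Hypothetical raw rows, as they might arrive from an operational source system.
RAW_ROWS = [
    {"order_id": "1", "customer": " alice ", "amount": "120.50", "day": "2024-01-05"},
    {"order_id": "2", "customer": "BOB", "amount": "n/a", "day": "2024-01-05"},
    {"order_id": "3", "customer": "Carol", "amount": "75", "day": "2024-01-06"},
]


def extract(source):
    """Extract: copy rows out of the source without changing them."""
    return [dict(row) for row in source]


def cleanse(rows):
    """Cleansing: drop rows whose amount is not a valid number."""
    clean = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue  # reject the bad row
        clean.append(row)
    return clean


def transform(rows):
    """Transform: convert strings to proper types and tidy customer names."""
    return [
        {
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
            "day": date.fromisoformat(row["day"]),
        }
        for row in rows
    ]


def load(rows, warehouse):
    """Loading: append the prepared rows to the warehouse (a list stands in for a table)."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(cleanse(extract(RAW_ROWS))), [])
```

A real load manager would write to database tables; the list is used here only to keep the sketch self-contained.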
IMPORTANT TERMS
Drill down
Roll up
Aggregation
Granularity
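These four terms can be illustrated together. In the sketch below, the SALES rows and the aggregate function are hypothetical: the rows are stored at day granularity, aggregation sums the amounts, rolling up moves to a coarser level (day to month to year), and drilling down is the reverse move back to the finer level.

```python
from collections import defaultdict

# Hypothetical sales facts at the finest granularity used here: one row per day.
SALES = [
    {"year": 2024, "month": 1, "day": 5, "amount": 100.0},
    {"year": 2024, "month": 1, "day": 6, "amount": 50.0},
    {"year": 2024, "month": 2, "day": 1, "amount": 25.0},
]


def aggregate(rows, levels):
    """Aggregation: sum amounts grouped by the given hierarchy levels."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row["amount"]
    return dict(totals)


by_day = aggregate(SALES, ["year", "month", "day"])  # finest granularity
by_month = aggregate(SALES, ["year", "month"])       # roll up: day -> month
by_year = aggregate(SALES, ["year"])                 # roll up: month -> year

# Drill down: start from the yearly total and look at the months behind it.
months_in_2024 = {key: total for key, total in by_month.items() if key[0] == 2024}
```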
The objective of dimensional modeling is to represent a set of business
Dimension Tables.
A fact table is a table that contains the measures of interest.
STAR SCHEMA
The star schema is also called star-join schema, data cube, or multi-dimensional schema.
A star schema classifies the attributes of an event into facts (measured numeric/time data) and descriptive dimension attributes (product id, customer name, sale date) that give the facts a context.
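A minimal star schema can be sketched with Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_customer), the columns and the sample rows are all hypothetical; the point is only that the fact table holds the measures plus keys into the dimension tables, and a query joins them to give the facts a context.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);

-- Fact table: measures (quantity, amount) plus keys into the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    sale_date TEXT,
    quantity INTEGER,
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Notebook');
INSERT INTO dim_customer VALUES (10, 'Asha'), (11, 'Ravi');
INSERT INTO fact_sales VALUES
    (1, 10, '2024-01-05', 3, 30.0),
    (2, 10, '2024-01-05', 1, 50.0),
    (1, 11, '2024-01-06', 2, 20.0);
""")

# Join the fact table to one dimension and aggregate the measure.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
```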
tools, which may anticipate or even require that the data-warehouse schema contain dimension tables
In a snowflake schema, a dimension table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy.
CONTD..
The Time Dimension consists of 2 different hierarchies:
Year, Month, Day
Week, Day
Year is connected to Month, which is then connected to Day. Week is only connected to Day.
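The two hierarchies can be sketched with Python's datetime module. The function names are hypothetical, and the use of ISO week numbers is an assumption (the slides do not say how weeks are numbered). The example dates show why Week connects only to Day: two days can share a week while falling in different months.

```python
from datetime import date


def year_month_day(d):
    """First hierarchy: Year -> Month -> Day."""
    return (d.year, d.month, d.day)


def week_day(d):
    """Second hierarchy: Week -> Day (ISO week number, ISO weekday with Monday = 1)."""
    iso = d.isocalendar()
    return (iso[1], iso[2])


last_of_january = date(2024, 1, 31)
first_of_february = date(2024, 2, 1)

# Same week, different months, so a week cannot be rolled up into a single month.
same_week = week_day(last_of_january)[0] == week_day(first_of_february)[0]
same_month = year_month_day(last_of_january)[1] == year_month_day(first_of_february)[1]
```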
Constellation schema: formed by splitting the original star schema into several star schemas, each of which describes facts at a different level of the dimension hierarchies.
individual PCs. Developed for a standalone environment. Addresses applications requiring only small volumes of warehouse data.
and planning.
applications. Mandates that the organization look not only at past performance but, more importantly, at the future performance of the business. The combined analysis of historical data with future projections is critical to the success of today's corporation.
operational systems. Extremely expensive.
Costs:
Time spent in careful analysis.
Design & implementation.
Hardware costs.
Software costs.
Ongoing support & maintenance.
Conclusion
Data warehousing is necessary to analyze business needs, integrate data from several sources, and model the data appropriately so that business information can be presented in the form of dashboards and reports.