
Data Warehousing Basics I

Basic

C3: Protected

About the Author


Created By: Gughanand Dhananjayan (134701)
Credential Information: 3 years of experience in DW and BI tools
Version and Date: DWBASIC/PPT/1106/1.0

Copyright 2005, Cognizant Academy, All Rights Reserved

Icons Used
Questions

Hands-on Exercise

A Welcome Break

Test Your Understanding

Coding Standards

Reference

Demo

Key Contacts


Data Warehousing Basics I: Overview


Introduction:
This chapter explains the concepts of the Data Warehouse and Data Marts.


Data Warehousing Basics I: Objectives


Objective:
After completing this chapter, you will be able to:
- Explain the Data Warehouse concepts
- Describe the data marts and the architectures


What is an Operational System?


Operational systems are what the name implies: the systems that help you run day-to-day enterprise operations. These are the backbone systems of any enterprise, such as order entry and inventory. Classic examples are airline reservations, credit-card authorizations, and ATM withdrawals. Because of their importance to the organization, operational systems were almost always the first parts of the enterprise to be computerized. Over the years, these systems have been extended, rewritten, enhanced, and maintained to the point that they are completely integrated into the organization.

Characteristics of Operational Systems


The following are the characteristics of an operational system:
- Continuous availability
- Predefined access paths
- Transaction integrity
- High volume of transactions
- Low data volume per query
- Used by operational staff
- Supports day-to-day control operations
- Supports a large number of users


Historical Look at Informational Processing


The goal of Informational Processing is to turn data into information! Why? Because business questions are answered using information and the knowledge of how to apply that information to a given problem.

[Figure: Data -> Information -> Knowledge]


Need for a Separate Informational System


Data: Informational data is distinctly different from operational data in its structure and content.

Processing: Informational processing is distinctly different from operational processing in its characteristics and use of data.


The Information Center


Management requires business information

A request for a report is made to the Information Center

Information Center works on developing the report

Requirements for the report must be clarified


The Information Center (Contd.)


Report provided to analyst

Analyst manipulates data for decision making

Management receives information, but asks: "What took so long?" and "How do I know it's right?"


The Information Center (Contd.)

Too Many Steps Involved!


Tactical Information
[Figure: OLTP Server and Inventory Control System, labeled with Order quantity, Production quantity, and Transported quantity]

- Supports day-to-day control operations
- Transaction processing
- High-performance operational systems
- Fast response time
- Initiates immediate action

Strategic Information
[Figure: data from Production and Inventory, Marketing, Finance, and Payroll]

- Understand business issues
- Analyze trends and relationships
- Analyze problems
- Discover business opportunities
- Plan for the future


Need for Tactical and Strategic Information


[Figure: the OLTP Server holds operational data and supplies tactical information; a periodic refresh feeds the Data Warehouse Server, which supplies strategic information]

Operational data helps the organization meet its operational and tactical requirements for data, while Data Warehouse data helps it meet its strategic requirements for information.

Operational Vs Analytical Systems


Operational                              | Analytical
Primarily primitive                      | Primarily derived
Current; accurate as of now              | Historical; accuracy maintained over time
Constantly updated                       | Less frequently updated
Minimal redundancy                       | Managed redundancy
Highly detailed data                     | Summarized data
Referential integrity                    | Historical integrity
Supports day-to-day business functions   | Supports long-term informational requirements
Normalized design                        | De-normalized design
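
The contrast above can be illustrated with a small sketch. It assumes a hypothetical `orders` table held in an in-memory SQLite database; the table, column names, and rows are invented for illustration and are not part of the course material. The operational access touches one current row through a predefined key, while the analytical query scans many rows and summarizes them over time.

```python
import sqlite3

# Hypothetical order-entry table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
    "order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "ACME", "2005-01-15", 100.0),
        (2, "ACME", "2005-02-03", 250.0),
        (3, "Globex", "2005-02-20", 75.0),
        (4, "Globex", "2005-03-09", 125.0),
    ],
)

# Operational style: predefined access path (the key), one row, updated in place.
with conn:
    conn.execute("UPDATE orders SET amount = 260.0 WHERE order_id = 2")
current_order = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = 2"
).fetchone()

# Analytical style: read many rows and summarize them by month.
monthly_totals = conn.execute(
    "SELECT substr(order_date, 1, 7) AS month, SUM(amount) "
    "FROM orders GROUP BY substr(order_date, 1, 7) ORDER BY month"
).fetchall()

print(current_order)
print(monthly_totals)
```

In practice the analytical query would run against the warehouse rather than the operational table; a single table is used here only to keep the sketch short.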


Data Warehouse: Definition


The Data Warehouse is a:
- Subject-oriented
- Integrated
- Time-variant
- Non-volatile

collection of data in support of management decision processes.


Data Warehouse: Differences from Operational Systems


Operational Systems (for example: Order Entry, Billing, Accounting)
Operational data is organized by specific processes or tasks and is maintained by separate systems.

Data Warehouse (for example: Customer, Usage, Revenue)
Warehoused data is organized by subject area and is populated from many operational systems.


Data Warehouse: Differences from Operational Systems (Contd.)


Operational Systems: Application specific
- Applications and their databases were designed and built separately
- Evolved over long periods of time

Data Warehouse: Integrated
- Integrated from the start
- Designed (or architected) at one time, implemented iteratively over short periods of time


Data Warehouse: Differences from Operational Systems (Contd.)


Operational Systems: Primarily concerned with current data

Data Warehouse: Generally concerned with historical data


Data Warehouse: Differences from Operational Systems (Contd.)


[Figure: the operational database receives inserts, updates, and deletes; the data warehouse receives an initial load followed by incremental loads]

Operational Systems: Constant change
- Updated constantly
- Data changes according to need, not on a fixed schedule

Data Warehouse: Consistent points in time
- Data is added regularly, but loaded data is rarely changed directly
- This does not mean the data warehouse is never updated or never changes
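
A minimal sketch of the difference, under stated assumptions: `operational` and `warehouse` are hypothetical in-memory structures, and a full dated snapshot stands in for a real incremental load, which would normally detect and move only changed rows. The point shown is that the operational store overwrites values in place, while each load appends rows to the warehouse without touching what was loaded before.

```python
from datetime import date

operational = {}   # hypothetical operational store: account -> current balance
warehouse = []     # hypothetical warehouse table: one row per account per load


def load(as_of):
    """Append a dated snapshot of every operational row; earlier rows stay untouched."""
    for account, balance in sorted(operational.items()):
        warehouse.append({"account": account, "balance": balance, "as_of": as_of})


operational["A1"] = 500          # insert in the operational system
load(date(2005, 1, 31))          # initial load

operational["A1"] = 650          # update overwrites the old value in place
operational["B7"] = 90           # a new account appears
load(date(2005, 2, 28))          # later load appends; nothing is rewritten

a1_history = [
    (row["as_of"].isoformat(), row["balance"])
    for row in warehouse
    if row["account"] == "A1"
]
print(operational["A1"])   # only the current value survives operationally
print(a1_history)          # the warehouse keeps both points in time
```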


Data in a Data Warehouse


What about the data in the Data Warehouse?
- Separate DSS database
- Storage of data only; no data is created
- Integrated and scrubbed data
- Historical data
- Read-only (no recasting of history)
- Various levels of summarization
- Metadata
- Subject oriented
- Easily accessible


Data Warehousing Features


The following are the features of Data Warehousing:
- Strategic enterprise-level decision support
- Multi-dimensional view of the enterprise data
- Caters to the entire spectrum of management
- Descriptive, standard business terms
- High degree of scalability
- High analytical capability
- Historical data only


Data Warehouse: Business Benefits


Benefits to business:
- Understand business trends
- Make better forecasting decisions
- Bring better products to market in a timely manner
- Analyze daily sales information and make quick decisions
- Maintain your company's competitive edge


Data Warehouse: Application Areas


The following are some business applications of a Data Warehouse:
- Risk management
- Financial analysis
- Marketing programs
- Profit trends
- Procurement analysis
- Inventory analysis
- Statistical analysis
- Claims analysis
- Manufacturing optimization
- Customer relationship management

Data Marts: Overview


A data mart is a decentralized subset of data, found either in a Data Warehouse or as a standalone subset, designed to support the unique business-unit requirements of a specific decision-support system. Data marts have specific business-related purposes, such as measuring the impact of marketing promotions or measuring and forecasting sales performance.

[Figure: an Enterprise Data Warehouse with two Data Marts]
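
As a rough illustration of "a subset designed for one business purpose", the sketch below derives a small marketing mart from a warehouse table. Everything here is an assumption made for the example: the `warehouse_sales` and `marketing_mart` tables, their columns, and the sample rows are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical enterprise-wide warehouse table covering several departments.
conn.execute(
    "CREATE TABLE warehouse_sales (region TEXT, department TEXT, "
    "sale_month TEXT, promo_code TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?, ?)",
    [
        ("East", "Marketing", "2005-01", "SPRING", 120.0),
        ("East", "Marketing", "2005-01", None, 80.0),
        ("West", "Marketing", "2005-02", "SPRING", 200.0),
        ("West", "Finance", "2005-02", None, 999.0),
    ],
)

# The mart keeps only one department's data, summarized for one question:
# how much was sold with and without a promotion each month?
conn.execute(
    """
    CREATE TABLE marketing_mart AS
    SELECT sale_month,
           COALESCE(promo_code, 'NONE') AS promo,
           SUM(amount) AS total
    FROM warehouse_sales
    WHERE department = 'Marketing'
    GROUP BY sale_month, COALESCE(promo_code, 'NONE')
    """
)
mart_rows = conn.execute(
    "SELECT sale_month, promo, total FROM marketing_mart "
    "ORDER BY sale_month, promo"
).fetchall()
print(mart_rows)
```

The mart holds less data than the warehouse and answers a narrower question, which is what makes it cheaper and quicker to query.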


Data Marts: Features


The following are a few important features of data marts:
- Low cost
- Controlled locally rather than centrally, conferring power on the user group
- Contain less information than the warehouse
- Rapid response
- More easily understood and navigated than an enterprise Data Warehouse
- Within the range of divisional or departmental budgets


Advantages of Data Mart over Data Warehouse


Data mart advantages:
- Typically a single subject area and fewer dimensions
- Limited feeds
- Very quick time to market (30-120 days to pilot)
- Quick impact on bottom-line problems
- Focused user needs
- Limited scope
- Optimum model for DW construction
- Demonstrates ROI
- Allows prototyping


Disadvantages of Data Mart


Data mart disadvantages:
- Does not provide an integrated view of business information.
- Uncontrolled proliferation of data marts results in redundancy.
- A large number of data marts is complex to maintain.
- Scalability issues arise with a large number of users and increased data volume.


Architected Data Warehouse

[Figure: Source systems, Data Staging, Enterprise Data Warehouse with Metadata, and End user access]


Unarchitected Data marts Vs Data Warehouse


Unarchitected Data Marts
[Figure: Source systems, Data marts, and End user access]
Easy to do, but not architected:
- Are the extracts, transformations, integrations, and loads consistent?
- Is the redundancy managed?
- What is the impact on the sources?

Data Warehouse
[Figure: Source systems, Data Staging, Enterprise Data Warehouse with Metadata, and End user access]
Architected:
- Data and results are consistent
- Redundancy is managed
- Detailed history is available for drill-down
- Metadata is consistent

Operational Data Store Definition


The Operational Data Store (ODS) is defined to be a structure that is:
- Integrated
- Subject oriented
- Volatile, where updates can be done
- Current valued, containing data that is a day or perhaps a month old
- Limited to detailed data only


ODS: Needs
An ODS is needed to obtain a system of record that contains the best data that exists in a legacy environment as a source of information. "Best data" means data that is:
- Complete
- Up to date
- Accurate
- In conformance with the organization's information model


ODS: Insulated from OLTP


[Figure: OLTP Server, ODS, and Tactical Analysis]

- ODS data resolves data integration issues.
- Data is physically separated from the production environment to insulate it from the processing demands of reporting and analysis.
- Access to current data is facilitated.



ODS: Data
- Detailed data: records of business events (for example, order capture)
- Data from heterogeneous sources
- Does not store summary data
- Contains current data


ODS: Benefits
The following are the benefits of ODS:
- Integrates the data
- Synchronizes the structural differences in data
- High transaction performance
- Serves the operational and DSS environment
- Transaction-level reporting on current data

[Figure: flat files (sample records 60,5.2,JOHN and 72,6.2,DAVID), a relational database, and Excel files feeding the Operational Data Store]
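
The integration idea can be sketched as follows. The two flat-file records are the samples shown on the slide; the meaning of their fields is not given in the source, so the names `code`, `value`, and `name` are assumptions, as is the second (relational) source and its contents. The sketch shows two things the ODS does: it brings structurally different inputs into one layout, and it is volatile, so newer data overwrites the current value.

```python
import csv
import io

flat_file = "60,5.2,JOHN\n72,6.2,DAVID\n"   # sample records from the slide
# Hypothetical rows from a relational source with a different layout.
relational_rows = [{"name": "John", "code": 61, "value": "5.4"}]


def from_flat_file(text):
    """Parse comma-separated records; field names are assumed, not documented."""
    for code, value, name in csv.reader(io.StringIO(text)):
        yield {"name": name.strip().upper(), "code": int(code), "value": float(value)}


def from_relational(rows):
    """Bring relational rows into the same layout as the flat-file records."""
    for row in rows:
        yield {
            "name": row["name"].upper(),
            "code": int(row["code"]),
            "value": float(row["value"]),
        }


ods = {}   # keyed by name; holds current values only


def upsert(records):
    """Volatile store: a later record for the same key replaces the earlier one."""
    for record in records:
        ods[record["name"]] = record


upsert(from_flat_file(flat_file))
upsert(from_relational(relational_rows))
print(ods)
```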

Operational Data Store: Update schedule

                  | ODS Data                        | Data Warehouse Data
Update schedule   | Daily or at shorter intervals   | Weekly or at longer intervals
Detail of data    | Mostly between 30 and 90 days   | Potentially infinite history
Needs addressed   | Operational needs               | Strategic needs


ODS Vs Data Warehouse Characteristics


What is OLAP
- OLAP tools are used for analyzing data
- They help users gain insight into the organization's data
- They help users carry out multi-dimensional analysis on the available data
- Using OLAP techniques, users can view the data from different perspectives
- Helps in decision making and business planning
- Converts OLTP data into information
- Helps maintain your company's competitive edge


OLAP Terminology
- Drill down and drill up
- Slice and dice
- Multi-dimensional analysis
- What-if analysis
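
The first two terms can be made concrete with a short sketch. The fact rows, dimension names, and helper functions below are hypothetical and written only for this illustration; real OLAP tools provide these operations directly. Drilling up aggregates at a coarser level, drilling down adds a finer dimension, a slice fixes one dimension to a single value, and a dice restricts several dimensions at once.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
facts = [
    (2004, "Q1", "East", "Widget", 10),
    (2004, "Q2", "East", "Gadget", 20),
    (2004, "Q1", "West", "Widget", 30),
    (2005, "Q1", "East", "Widget", 40),
    (2005, "Q1", "West", "Gadget", 50),
]
DIMS = {"year": 0, "quarter": 1, "region": 2, "product": 3}


def aggregate(rows, *dims):
    """Sum sales grouped by the named dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[DIMS[d]] for d in dims)] += row[4]
    return dict(totals)


def select(rows, **criteria):
    """Keep rows whose value for each named dimension is in the allowed set."""
    return [
        row
        for row in rows
        if all(row[DIMS[d]] in allowed for d, allowed in criteria.items())
    ]


by_year = aggregate(facts, "year")                      # drill up (coarser level)
by_year_quarter = aggregate(facts, "year", "quarter")   # drill down (finer level)
east_slice = select(facts, region={"East"})             # slice: fix one dimension
diced = select(facts, region={"East"}, year={2005})     # dice: restrict several
east_by_product = aggregate(east_slice, "product")

print(by_year)
print(by_year_quarter)
print(east_by_product)
print(diced)
```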


Basic Data Warehouse Architecture


[Figure: basic Data Warehouse architecture. Components shown: Operational & External data, Data Staging layer, ODS, Data warehouse, Data Marts, Meta Data Management, Administration, Information Access, Reporting tools, OLAP, Mining, Information Servers, and Web Browsers]

Operational and External Data Layer


- The database of record
- Consists of system-specific reference data and event data
- Source of data for the Data Warehouse
- Contains detailed data
- Continually changes due to updates
- Stores data up to the last transaction

[Figure: the basic Data Warehouse architecture diagram, with the Operational & External Data Layer highlighted]


Data Staging Layer


- Extracts data from operational and external databases
- Transforms the data and loads it into the Data Warehouse
- This includes decoding production data and merging records from multiple DBMS formats

[Figure: the basic Data Warehouse architecture diagram, with the Data Staging layer highlighted]
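
The extract, transform, and load steps listed for the staging layer can be sketched as three small functions. All names and data here are assumptions for illustration: two hypothetical source systems with different record layouts, and an invented code table (`STATUS_CODES`) standing in for "decoding production data".

```python
# Hypothetical extracts from two source systems with different layouts.
orders_system = [
    {"cust_id": "C1", "status": "S", "amt": "100.50"},
    {"cust_id": "C2", "status": "X", "amt": "20.00"},
]
billing_system = [("C1", "Acme Corp"), ("C2", "Globex")]

# Assumed production code table; real codes would come from the source system.
STATUS_CODES = {"S": "Shipped", "X": "Cancelled"}


def extract():
    """Pull raw records from each source as-is."""
    return orders_system, billing_system


def transform(orders, customers):
    """Decode status codes, convert types, and merge the two sources by customer."""
    names = dict(customers)
    return [
        {
            "customer_id": order["cust_id"],
            "customer_name": names.get(order["cust_id"], "UNKNOWN"),
            "status": STATUS_CODES.get(order["status"], "Unknown"),
            "amount": float(order["amt"]),
        }
        for order in orders
    ]


def load(rows, target):
    """Append the cleaned rows to the target table and report how many were loaded."""
    target.extend(rows)
    return len(rows)


warehouse = []
loaded = load(transform(*extract()), warehouse)
print(loaded, warehouse[0])
```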


Data Warehouse Layer


- Stores data used for informational analysis
- Presents summarized data to the end user for analysis
- The nature of the operational data, the end-user requirements, and the business objectives of the enterprise determine the structure

[Figure: the basic Data Warehouse architecture diagram, with the Data Warehouse Layer highlighted]


Meta Data Layer


- Metadata is data about data.
- Stored in a repository.
- Contains all corporate metadata resources: database catalogs and data dictionaries

[Figure: the basic Data Warehouse architecture diagram, with the Meta Data Layer highlighted]
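
One concrete form of "data about data" is a database catalog. As a small sketch, the code below builds a simple data dictionary by reading SQLite's own catalog (`sqlite_master` and `PRAGMA table_info`); the two tables are hypothetical, and a corporate metadata repository would hold far more than column names and types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse tables; only their structure matters here.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("CREATE TABLE revenue (customer_id INTEGER, month TEXT, amount REAL)")


def data_dictionary(connection):
    """Return {table: [(column, declared_type), ...]} read from the catalog."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        table: [
            (col[1], col[2])   # PRAGMA table_info rows: (cid, name, type, ...)
            for col in connection.execute(f"PRAGMA table_info({table})")
        ]
        for table in tables
    }


dictionary = data_dictionary(conn)
print(dictionary)
```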


Process Management Layer


- Scheduler or the high-level job control
- To build and maintain the Data Warehouse and data directory information
- To keep the Data Warehouse up-to-date

[Figure: the basic Data Warehouse architecture diagram, with the Process Management Layer highlighted]
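
High-level job control largely comes down to running jobs in an order that respects their dependencies. The sketch below is one way to express that, assuming Python 3.9 or later for the standard-library `graphlib` module; the job names and the dependency graph are hypothetical and do not come from the course material.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each job maps to the set of jobs that must finish first.
jobs = {
    "extract_sources": set(),
    "stage_and_transform": {"extract_sources"},
    "load_warehouse": {"stage_and_transform"},
    "refresh_data_marts": {"load_warehouse"},
    "update_metadata": {"load_warehouse"},
}


def run_all(job_graph, run_job):
    """Run every job once, never before the jobs it depends on."""
    order = list(TopologicalSorter(job_graph).static_order())
    for job in order:
        run_job(job)
    return order


log = []
order = run_all(jobs, log.append)   # here "running" a job just records its name
print(order)
```

A production scheduler would add timing, retries, and failure handling; only the ordering concern is shown here.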


Information Access Layer


- Interfaced with the Data Warehouse through an OLAP server
- Performs analytical operations and presents data for analysis
- End users generate ad-hoc reports and perform multidimensional analysis using OLAP tools

[Figure: the basic Data Warehouse architecture diagram, with the Information Access Layer highlighted]


Data Warehouse Architecture: Implementation


The following should be considered for a successful implementation of a Data Warehousing solution:

Architecture:
- Open Data Warehousing architecture with common interfaces for product integration
- Data Warehouse database server

Tools:
- Data Modeling tools
- Extraction and Transformation/propagation tools
- Analysis/end-user tools: OLAP and Reporting
- Metadata Management tools


Enterprise Data Warehouse (EDW)


An Enterprise Data Warehouse (EDW):
- Contains detailed as well as summarized data
- Is a separate, subject-oriented database
- Supports detailed analysis of business trends over a period of time
- Is used for short- and long-term business planning and decision making covering multiple business units


EDW- Top Down Approach


[Figure components, in the order shown: Heterogeneous source systems (Source 1, Source 2, Source 3); Common staging interface layer; Staging; Data mart bus architecture layer; Enterprise Data Warehouse; Incremental architected data marts (DM 1, DM 2, DM 3)]


EDW - Top Down Approach: Implementation


An EDW is composed of multiple subject areas, such as Finance, Human Resources, Marketing, Sales, Manufacturing, and so on. In a top down scenario, the entire EDW is architected, and then a small slice of a subject area is chosen for construction. Subsequent slices are constructed, until the entire EDW is complete.


Upsides and Downsides of Top-Down Approach


The upsides to a top-down approach are:
1. Coordinated environment
2. Single point of control and development

The downsides to a top-down approach are:
1. "Cross everything" nature of an enterprise project
2. Analysis paralysis
3. Scope control
4. Time to market
5. Risk and exposure


EDW- Bottom up Approach


[Figure components, in the order shown: Heterogeneous source systems (Source 1, Source 2, Source 3); Common staging interface layer; Staging; Data mart bus architecture layer; Incremental architected data marts (DM 1, DM 2, DM 3); Enterprise Data Warehouse]


EDW- Bottom Up Approach: Implementation


Initially an Enterprise Data Mart Architecture (EDMA) is developed. Once the EDMA is complete, an initial subject area is selected for the first incremental Architected Data Mart (ADM). The EDMA is expanded in this area to include the full range of detail required for the design and development of the incremental ADM.


Upsides and Downsides of Bottom Up Approach


The upsides to a bottom-up approach are:
1. Quick Return On Investment (ROI)
2. Low-risk, low-political-exposure learning and development environment
3. Lower-level, shorter-term political will required
4. Fast delivery
5. Focused problem, focused team
6. Inherently incremental

The downsides to a bottom-up approach are:
1. Multiple team coordination
2. Must have an EDMA to integrate incremental data marts

Allow time for questions from participants


Test Your Understanding


1. What is the difference between a Data Warehouse and a Data Mart?
2. Explain the top-down and bottom-up approaches.
3. What are the differences between an ODS, a Data Warehouse, and an OLTP system?
4. Explain OLAP.


Data Warehousing Basics I: Summary


Operational systems are what the name implies: the systems that help you run day-to-day enterprise operations.

The following are the features of Data Warehousing:
- Strategic enterprise-level decision support
- Multi-dimensional view of the enterprise data
- Caters to the entire spectrum of management
- Descriptive, standard business terms
- High degree of scalability
- High analytical capability
- Historical data only

A data mart is a decentralized subset of data, found either in a Data Warehouse or as a standalone subset, designed to support the unique business-unit requirements of a specific decision-support system.

Data Warehousing Basics I: Summary (Contd.)


Data marts have specific business-related purposes, such as measuring the impact of marketing promotions or measuring and forecasting sales performance.

The Operational Data Store (ODS) is defined to be a structure that is integrated, subject oriented, volatile, current valued, and contains detailed data only.

An Enterprise Data Warehouse (EDW) contains detailed as well as summarized data.

An EDW is composed of multiple subject areas, such as Finance, Human Resources, Marketing, Sales, and Manufacturing.


Data Warehousing Basics I: Source


Data Warehousing Guide - Ralph Kimball

Disclaimer: Parts of the content of this course are based on the materials available from the Web sites and books listed above. The materials that can be accessed from linked sites are not maintained by Cognizant Academy and we are not responsible for the contents thereof. All trademarks, service marks, and trade names in this course are the marks of the respective owner(s).

You have successfully completed Data Warehousing Basics I.
