
REGULATION: 2013 ACADEMIC YEAR: 2017-18

IFET COLLEGE OF ENGINEERING


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
IT6702 - DATA WAREHOUSING AND DATA MINING
UNIT-1 DATA WAREHOUSING - FUNDAMENTALS
100 % Theory
Data warehousing Components – Building a Data warehouse – Mapping the Data
Warehouse to a Multiprocessor Architecture – DBMS Schemas for Decision Support –
Data Extraction, Cleanup and Transformation Tools – Metadata.

Data warehousing
• A data warehouse is a collection of data marts representing historical data from
the different operations of a company.
• It collects data from multiple heterogeneous sources (databases, flat files, text
files, etc.).
• It typically stores a large volume of data covering 5 to 10 years.
• The data is stored in a structure optimized for querying and analysis.
Data warehousing components
• Data sourcing, cleanup, transformation and migration tools
• Metadata repository
• Warehouse/database technology
• Data marts
• Data query, reporting, analysis and mining tools
• Data warehouse administration and management
• Information delivery system
Building a data warehouse
Two kinds of factors drive an organization to build and use a data warehouse.
I. Business factors
• The need to make decisions quickly and correctly using all available data
II. Technical factors
• Incompatibility of operational data stores
• Rapidly changing IT infrastructure
Nine decisions in the design of a data warehouse
• Choosing the subject matter
• Deciding what a fact table represents
• Identifying and conforming the dimensions
• Choosing the facts
• Storing precalculations in the fact table
• Rounding out the dimension tables
• Choosing the duration of the database
• Tracking slowly changing dimensions
• Deciding the query priorities and query models
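Several of these decisions (what a fact table row represents, which dimensions describe it, which facts and precalculations it stores) can be made concrete with a small sketch. It assumes a hypothetical retail sales subject area. The table and column names (sales_fact, date_dim, product_dim) and the sample rows are invented for illustration and are not part of the syllabus. Python's built-in sqlite3 module stands in for a warehouse DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes used to filter and group.
    CREATE TABLE date_dim (
        date_key INTEGER PRIMARY KEY,
        year     INTEGER,
        month    INTEGER
    );
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        name        TEXT,
        category    TEXT
    );
    -- Fact table: one row per product per day (the chosen meaning of a row).
    CREATE TABLE sales_fact (
        date_key    INTEGER REFERENCES date_dim(date_key),
        product_key INTEGER REFERENCES product_dim(product_key),
        units_sold  INTEGER,
        revenue     REAL    -- stored precalculation: units_sold * unit price
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?)",
                 [(20170101, 2017, 1), (20170201, 2017, 2)])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(1, "Pen", "Stationery"),
                  (2, "Notebook", "Stationery"),
                  (3, "Lamp", "Electrical")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                 [(20170101, 1, 10, 50.0),
                  (20170101, 3, 1, 20.0),
                  (20170201, 2, 4, 12.0),
                  (20170201, 3, 2, 40.0)])

def revenue_by_category(connection):
    """Aggregate the fact table, grouped by a dimension attribute."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.revenue)
        FROM sales_fact AS f
        JOIN product_dim AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)

result = revenue_by_category(conn)
print(result)
```

The query joins the fact table to one dimension and aggregates a stored fact, which is the typical shape of a decision-support query.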

IFETCE/CSE/III YEAR /VI SEM/IT6702/DWDM/UNIT-I/FUNDAMENTALS/VER 1.2

Benefits of Data warehousing
• Queries that would be complex in highly normalized databases are easier to
build and maintain in a data warehouse, and running them there reduces the
workload on transaction systems.
• Data warehousing is an efficient way to manage demand for large amounts of
information from many users.
• Data warehouses are designed to perform well with aggregate queries running on
large amounts of data.
• The structure of a data warehouse is easier for end users to navigate, understand
and query than that of relational databases designed primarily to handle many
transactions.
Mapping the Data warehouse to a multi-processor architecture
• The functions of a data warehouse are based on relational database technology.
• The relational database technology is implemented in a parallel manner.
• Types of parallelism
    • Inter-query parallelism – different server threads or processes handle
    multiple requests at the same time.
    • Intra-query parallelism – a single SQL query is decomposed into lower-level
    operations such as scan, join and sort, which are executed in parallel.
• Database architectures for parallel processing
    • Shared-memory architecture – all processors share memory globally, but
    scalability is limited.
    • Shared-disk architecture – each node has its own local memory and all nodes
    share the disks; it supports incremental growth.
    • Shared-nothing architecture – each node has its own memory and disks and
    shares neither; system growth is, in principle, unlimited.
• Parallel DBMS vendors
    • Oracle, IBM, Sybase and Microsoft
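As a rough illustration of intra-query parallelism over shared-nothing style partitions, the sketch below splits one filter-and-sum query into per-partition scans whose partial results are combined at the end. The function names, the hash-partitioning rule and the sample rows are assumptions made for this example and do not describe any vendor's implementation. Threads are used only to show the structure; in CPython they do not give a real speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def hash_partition(rows, num_nodes):
    """Assign each (key, amount) row to a node by hashing its integer key."""
    partitions = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        partitions[key % num_nodes].append((key, amount))
    return partitions

def scan_and_sum(partition, min_amount):
    """Lower-level operation on one partition: filtered scan plus partial sum."""
    return sum(amount for _, amount in partition if amount >= min_amount)

def parallel_total(rows, num_nodes=4, min_amount=0):
    """Run the scan on every partition concurrently, then merge the partial sums."""
    partitions = hash_partition(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(lambda p: scan_and_sum(p, min_amount), partitions))
    return sum(partials)

# Hypothetical data: 100 customers, each with amount = 10 * customer_id.
rows = [(customer_id, customer_id * 10) for customer_id in range(1, 101)]

# Equivalent to: SELECT SUM(amount) FROM rows WHERE amount >= 500
total = parallel_total(rows, num_nodes=4, min_amount=500)
print(total)
```

Inter-query parallelism would instead hand different queries to different workers, each query still running serially.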
DBMS Schemas for Decision Support
• Star schema – a database schema design that expresses a multidimensional view
of data using relational database semantics.
• Bitmapped indexing – an approach to increasing the performance of a relational
DBMS that uses innovative indexing techniques to provide rapid, direct access to
data. Example:
    • SYBASE IQ
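The idea behind bitmapped indexing can be sketched generically: for each distinct value of a column, keep a bitmap with one bit per row, so that a multi-condition filter becomes a bitwise operation. This is a minimal illustration using Python integers as bitmaps. It is not a description of how SYBASE IQ implements its indexes, and the column values are invented.

```python
def build_bitmap_index(values):
    """Return {distinct value: int bitmap}; bit i is set when row i holds that value."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index

def matching_rows(bitmap):
    """Decode a bitmap back into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Two low-cardinality columns of a hypothetical six-row table.
region = ["North", "South", "North", "East", "South", "North"]
status = ["open", "closed", "closed", "open", "open", "open"]

region_index = build_bitmap_index(region)
status_index = build_bitmap_index(status)

# WHERE region = 'North' AND status = 'open' becomes one bitwise AND.
north_open = matching_rows(region_index["North"] & status_index["open"])

# WHERE region = 'North' OR region = 'East' becomes one bitwise OR.
north_or_east = matching_rows(region_index["North"] | region_index["East"])

print(north_open, north_or_east)
```

Bitmaps of this kind suit columns with few distinct values, which are common among dimension attributes.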
Data Extraction, Clean up and transformation tools
• Access to legacy data
    • Data layer for data access and transactions
    • Process layer for current business processes
    • User layer for user interaction
• Transformation engines
    • Informatica, Constellar
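What extraction, cleanup and transformation involve can be shown with a small hand-written sketch. The two source layouts, the field names and the cleanup rules (trim and normalize names, default a missing city, drop duplicates and nameless records) are assumptions chosen for illustration. Commercial transformation engines do the same kind of work through their own interfaces, which are not shown here.

```python
import csv
import io

# Two hypothetical source extracts with different layouts and conventions.
CSV_SOURCE = "cust_id,name,city\n101, alice ,chennai\n102,BOB,\n101, alice ,chennai\n"
PIPE_SOURCE = "103|Carol|Pondicherry\n104||Madurai\n"

def extract():
    """Extraction: read both sources into one common (id, name, city) shape."""
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        yield row["cust_id"], row["name"], row["city"]
    for line in PIPE_SOURCE.splitlines():
        cust_id, name, city = line.split("|")
        yield cust_id, name, city

def clean_and_transform(records):
    """Cleanup and transformation: normalize text, fill defaults, reject bad rows."""
    seen = set()
    for cust_id, name, city in records:
        name = name.strip().title()
        city = city.strip().title() or "Unknown"
        if not name:          # cleanup: reject records with no customer name
            continue
        if cust_id in seen:   # cleanup: drop duplicate keys
            continue
        seen.add(cust_id)
        yield {"cust_id": int(cust_id), "name": name, "city": city}

warehouse_rows = list(clean_and_transform(extract()))
for row in warehouse_rows:
    print(row)
```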
Metadata – Data about the data
• Metadata Interchange Initiative
    • Its goals include a vendor-independent standard for accessing metadata and
    giving users control over managing that access.
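To make "data about the data" concrete, the sketch below records, for each warehouse column, where it came from and how it was transformed. The class, its fields and the sample entries are hypothetical. They illustrate the kind of information a metadata repository holds and do not follow the Metadata Interchange Initiative's actual format.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ColumnMetadata:
    """One repository entry describing a warehouse column (illustrative fields)."""
    warehouse_table: str
    warehouse_column: str
    source_system: str
    source_field: str
    transformation: str

# Hypothetical repository contents.
repository = [
    ColumnMetadata("sales_fact", "revenue", "billing_db", "INV_AMT",
                   "convert paise to rupees"),
    ColumnMetadata("product_dim", "category", "catalog_file", "CAT_CD",
                   "decode category code"),
]

def lineage(table, column):
    """Look up where a warehouse column came from; None if it is not registered."""
    for entry in repository:
        if (entry.warehouse_table, entry.warehouse_column) == (table, column):
            return asdict(entry)
    return None

print(lineage("sales_fact", "revenue"))
```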
