Objective
Introduction
Operational Systems
Warehousing data outside the operational systems
Integrating data from more than one operational system
Differences between transaction and analysis processes
Data is mostly non-volatile
Data saved for longer periods than in transaction systems
Logical transformation of operational data
Structured extensible data model
Data warehouse model aligns with the business structure
Transformation of the operational state information
Objective
The aim of this lesson is to explain the need for, and the importance of, operational systems in a data warehouse.
Introduction
A data warehouse is a collection of data designed to support
management decision-making. Data warehouses contain a wide
variety of data that present a coherent picture of business
conditions at a single point in time.
Development of a data warehouse includes development of
systems to extract data from operational systems plus installation
of a warehouse database system that gives managers flexible
access to the data.
The term data warehousing generally refers to the combination
of many different databases across an entire enterprise.
Operational Systems
Until now, the primary purpose of early database systems
was to meet the needs of operational systems, which are
typically transactional in nature. Classic examples of operational
systems include:
General Ledgers
Accounts Payable
Financial Management
Order Processing
Order Entry
Inventory
Structure
CHAPTER 2
BUILDING A DATA
WAREHOUSE PROJECT
Even though many of the queries and reports that are run
against a data warehouse are predefined, it is nearly impossible
to accurately predict the activity against a data warehouse. The
process of data exploration in a data warehouse takes a business
analyst through previously undefined paths. It is also common
to have runaway queries in a data warehouse that are triggered
by unexpected results or by users' lack of understanding of the
data model. Further, many of the analysis processes tend to be
all-encompassing, whereas the operational processes are well
segmented. A user may decide to explore detail data while
reviewing the results of a report from the summary tables.
After finding some interesting sales activity in a particular
month, the user may join the activity for this month with the
marketing programs that were run during that month
to further understand the sales. Of course, there will be
instances where a user attempts to run a query that tries to
build a temporary table that is a Cartesian product of two tables
containing a million rows each! While an activity like this would
unacceptably degrade an operational system's performance, it is
expected and planned for in a data warehousing system.
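The difference between a runaway query and the intended join can be sketched in a few lines. The example below is only an illustration using Python's built-in sqlite3 module; the table names (sales, programs), the column names, and the row counts are assumptions made for the sketch and do not come from the lesson. Omitting the join condition makes the database pair every sales row with every program row.

```python
import sqlite3

# Hypothetical tables: 300 sales rows and 300 marketing-program rows,
# spread evenly over months 1..12 (25 rows per month in each table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, amount REAL)")
conn.execute("CREATE TABLE programs (month INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(m % 12 + 1, 100.0 + m) for m in range(300)],
)
conn.executemany(
    "INSERT INTO programs VALUES (?, ?)",
    [(m % 12 + 1, f"program-{m}") for m in range(300)],
)


def count_rows(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


# No join condition: every sales row is paired with every program row.
cartesian_rows = count_rows("SELECT COUNT(*) FROM sales, programs")

# Intended analysis: sales for one month joined to that month's programs.
joined_rows = count_rows(
    "SELECT COUNT(*) FROM sales s "
    "JOIN programs p ON s.month = p.month "
    "WHERE s.month = 3"
)

print(cartesian_rows, joined_rows)
```

With two tables of a million rows each, the first query would have to produce a million times a million row pairs, which is why such a query is tolerated only on a system that has been sized for exploratory analysis.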
[Figure: operational systems feeding the data warehouse. Order processing (2-second response time, last 6 months of orders) supplies daily closed orders. Product price/inventory (10-second response time, last 10 price changes) supplies weekly product price/inventory. Marketing (30-second response time) supplies weekly marketing programs. The data warehouse holds the last 5 years of data.]
The data warehouse model needs to be extensible and structured
so that data from different applications can be added whenever
a business case can be made for that data. In most cases a data
warehouse project cannot include data from all
possible applications right from the start. Many
successful data warehousing projects have taken an incremental
approach to adding data from the operational systems and
aligning it with the existing data. They start with the objective
of eventually adding most, if not all, business data to the data
warehouse. Keeping this long-term objective in mind, they
may begin with one or two operational applications that
provide the most fertile data for business analysis. Figure 4
illustrates the extensible architecture of the data warehouse.
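The incremental approach described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a prescribed design: the helper add_source, the table names, the columns, and the sample rows are all hypothetical. The point it tries to show is that a second operational source is added as a new table aligned on shared keys (here product_id and month), without changing the data loaded earlier.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")


def add_source(conn, table, columns, rows):
    """Add one operational source as a new table; existing tables are untouched."""
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    conn.execute(f"CREATE TABLE {table} ({column_sql})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)


def table_names(conn):
    """List the tables currently held in the warehouse."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


# Phase 1: start with the application that offers the most useful data.
add_source(
    warehouse,
    "orders",
    [("order_id", "INTEGER"), ("product_id", "INTEGER"),
     ("month", "TEXT"), ("amount", "REAL")],
    [(1, 10, "2000-01", 50.0), (2, 10, "2000-01", 70.0), (3, 20, "2000-02", 30.0)],
)
tables_after_phase_1 = table_names(warehouse)

# Phase 2: a business case is made for marketing data, so it is added later
# and aligned with the existing data through product_id and month.
add_source(
    warehouse,
    "marketing_programs",
    [("program", "TEXT"), ("product_id", "INTEGER"), ("month", "TEXT")],
    [("spring-sale", 10, "2000-01"), ("clearance", 20, "2000-02")],
)
tables_after_phase_2 = table_names(warehouse)

# The new source can be analysed together with the data loaded in phase 1.
sales_by_program = warehouse.execute(
    "SELECT m.program, SUM(o.amount) "
    "FROM orders o JOIN marketing_programs m "
    "ON o.product_id = m.product_id AND o.month = m.month "
    "GROUP BY m.program ORDER BY m.program"
).fetchall()

print(tables_after_phase_2, sales_by_program)
```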
[Figure 4: extensible architecture of the data warehouse. Order processing (customer orders, product price, available inventory), Product price/inventory (product price, product inventory), and Marketing (customer profile, product price, marketing programs) feed a data warehouse that holds customers, products, orders, product inventory, and product price.]
Operational system:
Processes thousands of millions of transactions daily.
Runs day-to-day operations.
Exists on different machines, dividing university resources.
Provides an instantaneous snapshot of an organization's affairs. Data changes from moment to moment.

Data warehouse:
Processes one transaction daily that contains millions of records.
Deals with a summary of multiple accounts.
Exists on a single machine, providing a centralized resource.
Adds layers of snapshots on a regularly scheduled basis. Data changes only when the DW team updates the warehouse.
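[Figure 6: transformation of operational state information. The order processing system (orders; inventory moving up and down) supplies daily closed orders and a weekly inventory snapshot. The data warehouse holds Orders (Closed), Inventory snapshot 1, and Inventory snapshot 2.]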
Figure 6 illustrates how most of the operational state information cannot be carried over to the data warehouse system.
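One way to picture this transformation is sketched below. It is only an illustration under stated assumptions: the Order class, the status values ("open", "closed"), the product name, and the week labels are hypothetical and not taken from the lesson. Orders that are still changing state stay in the operational system, and only closed orders are extracted. The inventory level, which moves up and down continuously, is recorded as periodic snapshots.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    status: str   # hypothetical operational states: "open" or "closed"
    amount: float


def extract_closed_orders(orders):
    """Keep only orders whose state is final; in-flight state is not carried over."""
    return [(o.order_id, o.amount) for o in orders if o.status == "closed"]


def take_inventory_snapshot(snapshots, label, inventory):
    """Append a copy of the current inventory levels under a snapshot label."""
    snapshots.append((label, dict(inventory)))


operational_orders = [
    Order(1, "closed", 120.0),
    Order(2, "open", 80.0),      # still changing, so it is not loaded
    Order(3, "closed", 45.5),
]
inventory = {"widget": 40}
snapshots = []

take_inventory_snapshot(snapshots, "week-1", inventory)
inventory["widget"] -= 15        # inventory goes down during the week
inventory["widget"] += 5         # and up again; the warehouse never sees these steps
take_inventory_snapshot(snapshots, "week-2", inventory)

closed_orders = extract_closed_orders(operational_orders)
print(closed_orders, snapshots)
```

The warehouse ends up with two layers of inventory data and the closed orders, while the intermediate movements and the open order remain visible only in the operational system.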
De-normalization of Data
Before we consider data model de-normalization in the context
of data warehousing, let us quickly review relational database
concepts and the normalization process. E. F. Codd developed
relational database theory in the late 1960s while he was a
researcher at IBM. Many prominent researchers have made
significant contributions to this model since its introduction.
Today, most of the popular database platforms follow this
model closely. A relational database model is a collection of
two-dimensional tables consisting of rows and columns. In
relational modeling terminology, the tables, rows, and
columns are respectively called relations, tuples, and attributes.
The name of the relational database model is derived from the
term relation for a table. The model further identifies unique
keys for all tables and describes the relationships between tables.
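These ideas can be illustrated with a small sketch using Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical examples. Each table is a relation, each row a tuple, and each column an attribute. The primary key makes every tuple uniquely identifiable, and the foreign key describes the relationship between the two tables. The sketch assumes an SQLite build with foreign-key support, which is the usual case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Two relations; customer_id and order_id are the unique keys.
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(customer_id), "
    "amount REAL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")


def try_insert(sql):
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second tuple with the same key value violates uniqueness.
duplicate_key_ok = try_insert("INSERT INTO customers VALUES (1, 'Duplicate')")

# An order that points at a customer that does not exist breaks the relationship.
orphan_order_ok = try_insert("INSERT INTO orders VALUES (101, 99, 10.0)")

print(duplicate_key_ok, orphan_order_ok)
```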
Normalization is a relational database modeling process where
the relations or tables are progressively decomposed into
smaller relations to a point where all attributes in a relation are
very tightly coupled with the primary key of the relation. Most
data modelers try to achieve the Third Normal Form with all
[Figure: data warehouse with de-normalized data and transformed state. Order processing (customer orders, product price, available inventory), Product price/inventory (product price, product inventory), and Marketing (customer profile, product price, marketing programs) feed a data warehouse that holds customers, products, orders, product inventory, and product price.]