Primer
Chapter 1
Kimball & Ross
Concepts Discussed
Business driven goals
Data warehouse publishing
Major components
Importance of dimensional modeling for the
presentation area
Facts & dimension tables
Myths of dimensional modeling
Pitfalls to avoid
Different Information Worlds
Users of operational systems turn the wheels
of the organization
Users of the data warehouse watch the wheels
of the organization turn
Warehouse users have drastically different
needs than users of operational systems
Returning Themes
We have mountains of data but we cannot access it
We need to slice the data in different ways
Need to make it easy for business users to access
the data
Just show me what is important
It drives me crazy when different people present
the same metrics with different numbers
Fact-based decision making
Goals of Data Warehouse
Make an organization’s information easily
accessible
Present the information in a consistent manner
Be adaptive and resilient to change
Secure and protect information
Serve as a foundation for improved decision
making
Business users must accept the data warehouse if
it is to be useful
Publishing Metaphor
Data warehouse manager is a “publisher” of
the right data
Responsible for publishing data collected
from a variety of sources and edited for
quality and consistency
Components of a Data
Warehouse
Operational source systems
Data staging area
Data presentation area
Data access tools
Data Staging Area
Key structural requirement is that it is off-limits
to business users and does not provide query and
presentation services.
– Correct misspellings, resolve domain conflicts,
deal with missing elements, parse into standard
formats, combine data from multiple sources.
– Normalized structures here are sometimes called an
“enterprise data warehouse”; Kimball considers the
term a misnomer.
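The cleansing steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup tables, field names (`customer_id`, `city`, `gender`, `signup_date`) and the two sample sources are hypothetical, not taken from Kimball & Ross.

```python
from datetime import datetime

# Hypothetical reference data; a real staging area would maintain these tables.
SPELLING_FIXES = {"Chcago": "Chicago", "New Yrok": "New York"}
GENDER_DOMAIN = {"M": "Male", "F": "Female", "1": "Male", "2": "Female"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(raw):
    """Parse a date in any known source format into ISO 8601, or return None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except (ValueError, TypeError):
            continue
    return None


def clean(record):
    """Apply the staging steps to one source record (a dict)."""
    city = (record.get("city") or "").strip()
    gender = str(record.get("gender") or "").strip()
    return {
        "customer_id": record["customer_id"],
        # Correct misspellings; fall back to "Unknown" for missing elements.
        "city": SPELLING_FIXES.get(city, city) or "Unknown",
        # Resolve domain conflicts: each source encodes gender differently.
        "gender": GENDER_DOMAIN.get(gender, "Unknown"),
        # Parse into a standard format.
        "signup_date": parse_date(record.get("signup_date")),
    }


def stage(*sources):
    """Combine records from multiple sources into one cleaned list."""
    return [clean(rec) for source in sources for rec in source]


crm = [{"customer_id": 1, "city": "Chcago", "gender": "M", "signup_date": "03/15/2021"}]
billing = [{"customer_id": 2, "city": None, "gender": 2, "signup_date": "2021-04-01"}]
staged = stage(crm, billing)
```

Nothing here serves queries; the output is meant to be loaded into the presentation area.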
Data Staging Area
Dominated by the simple activities of sorting and
sequential processing.
Normalized data is acceptable, although this
is not the end goal.
Data Presentation
A series of integrated data marts. A data mart
presents the data from a single business process:
one wedge of the overall pie.
Data must be presented, stored and accessed
in dimensional schemas.
Data Presentation
Data marts should not be in normalized form.
They must contain detailed atomic data in
addition to data in summary form, because
the queries are ad hoc and cannot be
predicted.
Data marts share common facts and dimensions,
which are called conformed.
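A small sketch of why conformed dimensions matter: two data marts that share one product dimension can be summarized separately and then lined up on the same attribute. All table contents and names below (`dim_product`, `sales_facts`, `inventory_facts`) are hypothetical.

```python
from collections import defaultdict

# Conformed product dimension shared by both data marts (hypothetical rows).
dim_product = {
    1: {"name": "Espresso", "category": "Coffee"},
    2: {"name": "Green Tea", "category": "Tea"},
}

# Atomic fact rows from two business processes, keyed by the same product_key.
sales_facts = [
    {"product_key": 1, "units_sold": 5},
    {"product_key": 2, "units_sold": 3},
    {"product_key": 1, "units_sold": 2},
]
inventory_facts = [
    {"product_key": 1, "units_on_hand": 40},
    {"product_key": 2, "units_on_hand": 10},
]


def total_by_category(facts, measure):
    """Summarize atomic fact rows by the shared 'category' attribute."""
    totals = defaultdict(int)
    for row in facts:
        category = dim_product[row["product_key"]]["category"]
        totals[category] += row[measure]
    return dict(totals)


sold = total_by_category(sales_facts, "units_sold")
on_hand = total_by_category(inventory_facts, "units_on_hand")

# Because both marts use the same dimension, the results align row by row.
drill_across = {
    category: {
        "units_sold": sold.get(category, 0),
        "units_on_hand": on_hand.get(category, 0),
    }
    for category in sorted(set(sold) | set(on_hand))
}
```

The summaries are computed from the atomic rows, so a different grouping (for example by product name) needs no new data.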
Presentation Area
If it is based on a relational database, it is
called a star schema.
If it is based on a multidimensional database, or
OLAP, the data is stored in cubes.
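A minimal star schema can be sketched with Python's built-in `sqlite3` module: one fact table surrounded by dimension tables, queried by joining and grouping on dimension attributes. The table names, columns and rows are illustrative assumptions only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- The fact table holds numeric measures plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    sales_amount REAL
);
INSERT INTO dim_date VALUES
    (20240105, '2024-01-05', '2024-01'),
    (20240210, '2024-02-10', '2024-02');
INSERT INTO dim_product VALUES
    (1, 'Espresso', 'Coffee'),
    (2, 'Green Tea', 'Tea');
INSERT INTO fact_sales VALUES
    (20240105, 1, 2, 6.0),
    (20240105, 2, 1, 2.5),
    (20240210, 1, 3, 9.0);
""")

# Slice the same facts by a product attribute...
by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

# ...or by a date attribute, without changing the schema.
by_month = conn.execute("""
    SELECT d.month, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()
```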
Data Access Tools
Querying is the whole point of the data warehouse.
Can be as simple as an ad hoc query tool or
as complex as a data mining or a modeling
application.
Parameter-driven analytic operations.
80 to 90 percent of the users are served by canned
applications.
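A canned, parameter-driven application can be as small as a fixed query whose filter values the user supplies. The sketch below assumes a hypothetical `sales` table and a hypothetical `monthly_sales_report` function; it only illustrates the idea.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table standing in for the presentation area."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE sales (region TEXT, month TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', '2024-01', 100.0),
        ('East', '2024-02', 150.0),
        ('West', '2024-01', 80.0);
    """)
    return conn


def monthly_sales_report(conn, region, start_month, end_month):
    """Canned report: the SQL is fixed, the user only picks the parameters."""
    return conn.execute(
        """
        SELECT month, SUM(amount)
        FROM sales
        WHERE region = ? AND month BETWEEN ? AND ?
        GROUP BY month
        ORDER BY month
        """,
        (region, start_month, end_month),
    ).fetchall()


conn = build_demo_db()
report = monthly_sales_report(conn, "East", "2024-01", "2024-02")
```

The user never writes SQL; choosing a region and a date range is enough to run the report.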
Additional Considerations
Metadata
Operational data store