Documente Academic
Documente Profesional
Documente Cultură
• View:
OLTP Current Data
OLAP Heterogeneous data sources
• Access Patterns:
OLTP Short Atomic transactions
OLAP Read only operations
OLTP vs. OLAP
OLTP OLAP
users clerk, IT professional knowledge worker
function day to day operations decision support
DB design application-oriented subject-oriented
data current, up-to-date historical,
detailed, flat relational summarized, multidimensional
isolated integrated, consolidated
usage repetitive ad-hoc
access read/write lots of scans
index/hash on prim. key
unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands hundreds
DB size 100MB-GB 100GB-TB
metric transaction throughput query throughput, response
Why Separate Data Warehouse?
all
0-D(apex) cuboid
Highest Level sum
time,location,supplier
3-D cuboids
time,item,location
time,item,supplier item,location,supplier
4-D(base) cuboid
Lowest Level Sum
time, item, location, supplier
Example of Star Schema
time
time_key item
day Sales Fact Table item_key
day_of_the_week item_name
month brand
quarter time_key type
year supplier_type
item_key
branch_key
branch location
location_key location_key
branch_key
branch_name units_sold street
branch_type city
dollars_sold province_or_state
country
avg_sales
Measures
Example of Snowflake Schema
time
time_key item
day Sales Fact Table item_key supplier
day_of_the_week item_name supplier_key
month brand supplier_type
quarter time_key type
year item_key supplier_key
branch_key
branch location
location_key location_key
branch_key
units_sold street
branch_name
city_key city
branch_type
dollars_sold city_key
avg_sales city
province_or_street
Measures country
Example of Fact Constellation
time
time_key item Shipping Fact Table
day item_key
day_of_the_week Sales Fact Table item_name time_key
month brand
quarter time_key type item_key
year supplier_type
item_key shipper_key
from_location
branch_key
branch location_key location to_location
branch_key dollars_cost
branch_name units_sold location_key
street
branch_type units_shipped
dollars_sold city
province_or_street
avg_sales country shipper
Measures shipper_key
shipper_name
location_key
shipper_type
A Data Mining Query Language, DMQL:
Language Primitives
• Cube Definition (Fact Table):
define cube <cube_name> [<dimension_list>]: <measure_list>
• Dimension Definition ( Dimension Table )
define dimension <dimension_name> as
(<attribute_or_subdimension_list>)
• Special Case (Shared Dimension Tables)
– First time as “cube definition”
– define dimension <dimension_name> as
<dimension_name_first_time> in cube <cube_name_first_time>
Defining a Star Schema in DMQL
all all
Street Day