
Enterprise Data Warehouse: A Technical Perspective

Tony Dalwood
Information Architecture & Management
University of South Australia

IT Structure

[Organisation chart: ISTS (Information Strategy & Technology Services), with the branches Information Strategy and Technical Services and the units Corporate Information Systems, E-Business, Information Architecture & Management, Customer Services, Network Services and Systems Infrastructure.]

Information Architecture & Management (IAM)


Merger of DBA team & Information Integration team in Feb 2006

IAM manages:
- Corporate System Databases (3 DBAs)
- Operational Data Store Management
- Middle Tier Apps
  - Student Portal (myUniSA)
  - Staff Portal (UniSAinfo)
- UniSAinfo Reporting
- EDW

Project Governance

Steering Group

- Includes Directors of ISTS, Planning and Assurance Services (PAS), Student & Academic Services (SAS), Finance
- Director of Planning & Assurance Services
- Dep. Director Information Strategy
- Business Project Manager
- Technical Project Manager
- Senior Officers from PAS, HR, Research, SAS, Finance

Sponsors Group

Reference Group

Project Governance

Project Team
- Business Project Manager (PAS)
- Technical Project Manager (ISTS)
- Design Architect/Dev Team Leader (ISTS)
- Business Analyst (x1.5) (PAS)
- Data Quality Manager (0.5) (PAS)
- Developers (x3 variant) (ISTS)

EDW Project Milestones


- Aug 2004: Business Case submitted by Planning & Assurance Services (PAS) and ISTS to extend current reporting environment to an EDW ($150K)
- Feb 2005: Project Commenced
- Feb-July 2005: Data Gathering Workshops
- Sep-Dec 2005: Technical Research & Proof of Concept (0.5 IT Resource)
- Jan-Feb 2006: External Consultancy (1 IT Resource)
- May 2006: First Star Schema complete (Research Publications) (4 IT Resources)
- July 2006: Three more Star Schemas complete (Research Income, AVCC Data, Research Staff Supervision) (4 IT Resources)
- August 2006: First Soft Production Release (2.5 IT Resources)
- Beyond: Student Data & Finance Data (min 2 IT Resources)

NB: IT Resource not including part time Tech Project Manager

Project Goals
Business
- One Source of the truth
- Performance
- External Data
- Simplicity
- Historical Capability

Technical
- Conformed Dimensions
- Consolidated Facts
- Transformed schema design
- Flexible data sources
- Pre-calculated measures
- Versioning, Snapshot Data

By-Products of an EDW Project

Data Discovery
- What data do we have
- How data is used and maintained
- What is the quality of the data
- How data can be utilised by more of the organisation

Enhanced Collaboration

- Communication within and between business units, system owners and IT

Technical Project Plan


- Warehousing Research
- Proof of Concept exercise
- External Assistance
- Implementation of an Architecture
- Development Standards & Procedures
- Build & Implementation of Stage 1
- Review

Proof of Concept

Validate Warehouse research findings

Proof of Concept covered the following topics:
- Project methodology
- Technical architecture
- Design methodology
- ETL methodology
- MetaData options
- Data Quality approach
- Security implementation options

Project Methodology

Technical Architecture

Inputs into Architecture


- Business Goals
- Existing Reporting Environments
- Technology
- Time
- $$
- Resources/Skills

Data Flow Architecture


[Data flow diagram. Components shown: Source Systems; ODS (Replicated Data Table, Previous Snapshot Table, 24 Hour Snapshot Schema, Previous Snapshot Schema); Diff Process; EDW (Staging Tables, Deltas, Target EDW Tables); External Files; Transform & Load.]

Design Methodology

Dimensional Modelling chosen as the design philosophy


- Star Schemas/Snowflakes
- Facts
- Dimensions
- Measures
- Bridges

History Retention for Slowly Changing Dimensions

- Warehouse records are versioned, i.e. never deleted or overwritten
- Views are used to identify current records
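The versioning idea above can be illustrated with a small sketch. It uses an in-memory SQLite database rather than Oracle, and the table, column and view names (`dim_school`, `school_code`, `dim_school_current`) are hypothetical, not taken from the UniSA warehouse. It assumes "current" means the highest version number for each natural key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_school (
    school_code TEXT    NOT NULL,  -- natural (business) key
    version     INTEGER NOT NULL,  -- 1, 2, 3, ... per natural key
    school_name TEXT    NOT NULL,
    PRIMARY KEY (school_code, version)
);

-- Assumption: the current record is the highest version per natural key.
CREATE VIEW dim_school_current AS
SELECT d.school_code, d.version, d.school_name
FROM dim_school d
WHERE d.version = (SELECT MAX(v.version)
                   FROM dim_school v
                   WHERE v.school_code = d.school_code);
""")

# A change arrives as a new version row; earlier rows are left untouched.
conn.executemany(
    "INSERT INTO dim_school VALUES (?, ?, ?)",
    [
        ("CIS", 1, "Computer Science"),
        ("CIS", 2, "Computer and Information Science"),
        ("EDU", 1, "Education"),
    ],
)

current = conn.execute(
    "SELECT school_code, version, school_name "
    "FROM dim_school_current ORDER BY school_code"
).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM dim_school").fetchone()[0]
```

Reports that need only the latest state query the view, while historical reports query the base table, which still holds all three rows.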

Transformation of Design Source

Transformation of Design Target

ETL Methodology

Scripts vs Tool decision. Tool chosen for the following reasons:

- Already licensed for Oracle Internet Developer Suite, which includes Oracle Warehouse Builder
- Oracle Database environment
- Oracle technical skills
- Visibility of Development Environment
- Auto technical MetaData generation
- Auto and accessible code generation using PL/SQL
- Ability to include custom code
- Integration with Oracle database and related Oracle technology
- Framework for Beginners
- Difficult to evaluate other products without expertise

Smarts & effort go into Modelling and Design; ETL should be a no-brainer

MetaData

- Data about Data
- Oracle Warehouse Builder provides technical metadata
- Business MetaData facility currently restricted to documentation and Cognos catalogs
- Evaluation of MetaData methods to be reviewed at the completion of Stage 1 development

Data Quality

Pre-ETL
- Technical profile to ensure physical design has mapped appropriate data elements
- Business profile of source data to identify data attributes, e.g. data type, patterns, nulls, min, max, outliers

ETL
- Transform to conformed data sets
- Foreign Key checks
- Reporting of anomalies

Post ETL
- Final Business profile to validate transformations of data
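A business profile of a source column, as described for the Pre-ETL step, might look something like the following sketch. It is not the Datiris profiler or any Oracle utility. The function names `pattern_of` and `profile` are hypothetical, and treating "values that do not match the most common pattern" as outliers is a simplifying assumption.

```python
import re
from collections import Counter


def pattern_of(value: str) -> str:
    """Reduce a value to its shape: letters become A, digits become 9."""
    shape = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", shape)


def profile(values):
    """Profile one column: row count, nulls, min, max, patterns, outliers."""
    # Assumption: None and the empty string both count as null.
    non_null = [str(v) for v in values if v is not None and v != ""]
    patterns = Counter(pattern_of(v) for v in non_null)
    most_common = patterns.most_common(1)[0][0] if patterns else None
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "patterns": dict(patterns),
        # Assumption: an outlier is any value off the dominant pattern.
        "outliers": [v for v in non_null if pattern_of(v) != most_common],
    }


report = profile(["AB123", "CD456", None, "EF789", "12-XY", ""])
```

Here `report["outliers"]` contains only `"12-XY"`, the one value that does not follow the two-letters-three-digits shape of the rest of the column.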

Security

Security options implemented are:

Database Layer
- Oracle roles to grant or deny access to database objects based on Business rules
- Oracle views for granular data security where appropriate

User Layer

Access to end user Cognos catalogues/cubes controlled via Cognos security mechanisms and filesystem access

Development Lifecycle

- Business Requirements
- Design Process
  - Logical Design
  - Physical Design
  - Data Mapping
  - Data Profiling

Development Lifecycle

Design & Build ETL Objects & Processes


- Extraction routines
- Diff routines
  - Tag records as Inserts, Updates or Deletes
- Build Staging tables
- Build Target warehouse tables
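A diff routine that tags records as Inserts, Updates or Deletes can be sketched as a comparison of the previous snapshot with the current one. This is a minimal illustration only. It assumes each snapshot fits in memory as a dictionary keyed by primary key, and the name `diff_snapshots` is hypothetical. The project itself generated its routines with Oracle Warehouse Builder and PL/SQL.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key.

    Returns a list of (tag, key, row) tuples where tag is
    "I" (insert), "U" (update) or "D" (delete).
    """
    deltas = []
    for key, row in current.items():
        if key not in previous:
            deltas.append(("I", key, row))   # new in the current snapshot
        elif previous[key] != row:
            deltas.append(("U", key, row))   # present in both, but changed
    for key, row in previous.items():
        if key not in current:
            deltas.append(("D", key, row))   # gone from the current snapshot
    return deltas


previous = {1: {"name": "Ann"}, 2: {"name": "Bob"}, 3: {"name": "Cy"}}
current = {1: {"name": "Ann"}, 2: {"name": "Robert"}, 4: {"name": "Di"}}
deltas = diff_snapshots(previous, current)
```

Unchanged rows (key 1 here) produce no delta, so only the changes reach the staging tables.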

Standard ETL Process

Scheduled Extract/Diff process runs to populate a Diff table in the Staging Area

ETL process then performs a standard set of steps:

- Load Staging from Diff table
- Stamp Staging record according to Diff type (U, D or I)
  - Updated Record: tag staging record as new version of core record
  - Deleted Record: tag staging record as retired record in warehouse
  - Inserted Record: tag staging record to be new record (version 1)
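The stamping step for U, D and I records could be sketched as below. The in-memory `warehouse` dictionary, the `retired` flag and the function name `apply_deltas` are all hypothetical. Recording a delete as an extra retired version, rather than flagging the existing row, is an assumption made here so that no existing record is overwritten.

```python
def apply_deltas(warehouse, deltas):
    """Apply (tag, key, row) deltas to a versioned store.

    warehouse maps each key to a list of version records, oldest first.
    Existing version records are never modified or removed.
    """
    for tag, key, row in deltas:
        versions = warehouse.setdefault(key, [])
        next_version = versions[-1]["version"] + 1 if versions else 1
        if tag in ("I", "U"):
            # Insert gives version 1 for a new key; update adds a new version.
            versions.append(
                {"version": next_version, "data": row, "retired": False}
            )
        elif tag == "D":
            # Assumption: a delete is stored as a further, retired version.
            versions.append(
                {"version": next_version, "data": row, "retired": True}
            )
        else:
            raise ValueError(f"unknown diff type: {tag!r}")
    return warehouse


warehouse = {}
apply_deltas(warehouse, [("I", 4, {"name": "Di"})])
apply_deltas(warehouse, [("U", 4, {"name": "Diana"}), ("D", 4, {"name": "Diana"})])
```

After these two loads, key 4 has three versions: the original insert, the update, and a final retired version.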

Development Lifecycle

Post ETL
- Measures
- Summary data
- Process Flows to execute ETL
- Security views
- End User Layer, e.g. Catalogues

ETL Auditing

- When did a process last run?
- How long did it run for?
- Did it Succeed, Fail or produce Warnings?
- How many records did it alter or insert?
- What were the data exceptions?
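The audit questions above map naturally onto one audit record per process run. The following is a rough sketch only and does not reflect the Oracle Warehouse Builder runtime audit tables. `AuditRecord` and `run_audited` are hypothetical names, and the ETL step is assumed to return a tuple of (rows inserted, rows updated, data exceptions).

```python
import time
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    process: str
    started_at: float               # when did the process last run
    duration_s: float = 0.0         # how long did it run for
    status: str = "RUNNING"         # SUCCESS, WARNING or FAILED
    rows_inserted: int = 0
    rows_updated: int = 0
    exceptions: list = field(default_factory=list)  # data exceptions


def run_audited(process_name, step):
    """Run step() and capture its outcome in an AuditRecord."""
    record = AuditRecord(process=process_name, started_at=time.time())
    start = time.perf_counter()
    try:
        inserted, updated, data_exceptions = step()
        record.rows_inserted = inserted
        record.rows_updated = updated
        record.exceptions = list(data_exceptions)
        # Assumption: any data exception downgrades the run to a warning.
        record.status = "WARNING" if data_exceptions else "SUCCESS"
    except Exception as exc:
        record.status = "FAILED"
        record.exceptions = [repr(exc)]
    record.duration_s = time.perf_counter() - start
    return record


ok = run_audited("load_publications", lambda: (10, 2, []))
warned = run_audited("load_income", lambda: (5, 0, ["row 7: null grant id"]))
```

In practice each record would be written to an audit table so that the questions can be answered by query.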

UniSA EDW Toolset


- Oracle Database
- Oracle Warehouse Builder
- Oracle Workflow
- Oracle Enterprise Manager
- Datiris Data profiler
- Cognos Impromptu/Powerplay
- Whiteboard and lots of A3 Paper!!!

Oracle Database

Options assisting Warehouse implementation


- External tables
- Materialised Views
- Query Rewrite
- Bitmap indexes
- Partitioning
- Star Query optimizer options

Oracle Warehouse Builder

- Provides the design and development environment and framework for the build and deployment of Warehouse objects and transformation processes
- Consists of Design Repository and Runtime components

Oracle Workflow

- Optionally used for job execution with dependency management
- Exists as an optional install with RDBMS
- Run as Client/Server or HTTP browser based application
- Workflow engine is a service on the warehouse database server administered by a workflow schema
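Job execution with dependency management, the role described for Oracle Workflow, can be illustrated in miniature. This sketch does not use or imitate the Oracle Workflow API. It relies on Python's standard `graphlib` module (Python 3.9 or later), and the job names and the function `run_in_dependency_order` are hypothetical.

```python
from graphlib import TopologicalSorter


def run_in_dependency_order(jobs, depends_on):
    """Run jobs so that every prerequisite runs before its dependents.

    jobs:       name -> callable taking no arguments
    depends_on: name -> iterable of prerequisite job names
    Returns (completed, skipped); a job is skipped when a prerequisite
    failed or was itself skipped.
    """
    graph = {name: set(depends_on.get(name, ())) for name in jobs}
    unknown = {dep for deps in graph.values() for dep in deps} - set(jobs)
    if unknown:
        raise ValueError(f"unknown prerequisite jobs: {sorted(unknown)}")

    completed, skipped, failed = [], [], set()
    for name in TopologicalSorter(graph).static_order():
        if graph[name] & failed:
            failed.add(name)        # propagate so its dependents skip too
            skipped.append(name)
            continue
        try:
            jobs[name]()
            completed.append(name)
        except Exception:
            failed.add(name)
    return completed, skipped


log = []
jobs = {
    "load_dimensions": lambda: log.append("dimensions"),
    "load_facts": lambda: log.append("facts"),
    "build_summaries": lambda: log.append("summaries"),
}
depends_on = {
    "load_facts": ["load_dimensions"],
    "build_summaries": ["load_facts"],
}
completed, skipped = run_in_dependency_order(jobs, depends_on)
```

Dimensions load before facts, and summaries are built last. If the dimension load raised an error, the two downstream jobs would be skipped instead of running against incomplete data.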

Oracle Enterprise Manager

- Optionally used as the scheduling option for submitting and monitoring Warehouse Builder processes or workflows
- Base OEM comes with RDBMS
- Optionally run as standalone install or Management Server mode using a web console

Cognos 7.3 Reporting Suite

- Catalogues: Report Developer access layer
- Impromptu: Reporting capability
- Powerplay: Multi-dimensional analysis
- Upfront: Web interface

Oracle Warehouse Builder Demonstration

OWB 10g Release 2 ("Paris")

New Features:
- Design Tool Graphic Interface Improvements
- Built in Slowly Changing Dimension property
- Data Profiling/Quality utilities
- Better Integrated Workflow Engine
- Job Scheduling within OWB via OEM

Project Review

- Sanity Check on whole process, architecture, methodology
  - Business & Technical
- Evaluate ROI
- Quantify metrics on time to deliver
- Proposed Future phases
- Usage Statistics
- Hardware adequacy & capacity

Useful Technical References Links

Oracle Business Intelligence & Technical Sites
- http://www.oracle.com/solutions/business_intelligence/index.html
- http://www.oracle.com/technology/tech/bi/index.html

Rittman Blog
- http://www.rittman.net/

Kimball Tips
- http://www.kimballgroup.com/html/designtips.html

Texts

- Oracle 9iRel2 Data Warehousing (Hobbs)
- Kimball Texts
  - The Data Warehouse Lifecycle Toolkit
  - The Data Warehouse ETL Toolkit

Questions?
