
Data Warehouse

PhD. Eric Manuel Rosales Peña Alfaro

February, 2015
Agenda
 Data Warehouse
 Information Quality Architecture
 Data Warehouse
 Data Mart

 DWh implementation methodology


 Enterprise requirement analysis
 Data Warehouse modelling

30/08/2019 Data Warehouse 2


Agenda
 ETL tools
 Extract
 Transform
 Load

 OLAP
 Architecture
 Operations

30/08/2019 Data Warehouse 3


Agenda
 Technological evaluation
 Guidelines
 Providers´ analysis

30/08/2019 Data Warehouse 4


Dr. Eric Manuel Rosales Peña Alfaro
Agenda
 Stages of a BI project
 Roles and Responsibilities in a BI project
 Self-Assessment Process
 Best Practice Methodologies
 SQLBI Methodology
 ROADMAP

30/08/2019 Methodologies... 6
Stages of a BI project
 Initiation

 Planning

 Execution

 Completion or Finalization

30/08/2019 Methodologies... 7
Stages of a BI project
 Initiation
 Define objectives
• Quantified and aligned with the strategic plan
 Define scope
• What should be delivered?
• How many functional areas will be involved?
• Which functional areas will be involved?
 Define term
• Short, medium or long

30/08/2019 Methodologies... 8
Stages of a BI project
 Initiation
 Define cost
 Define resources
 Define approval criteria: strategic alignment, benefits, risks, implementation time, or a combination of them
 Risk = F(project size, degree of structure, IT knowledge)
 When there are several projects
• Manage the project portfolio: identify new projects, evaluate, prioritise, select, allocate resources

30/08/2019 Methodologies... 9
Stages of a BI project
 Initiation
 Project Portfolio Analysis
• Analyze cost-benefit or risk-benefit
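
As a minimal sketch of the cost-benefit analysis above, the following Python ranks candidate projects by benefit-to-cost ratio and funds them greedily until a budget runs out. The project names, the figures and the greedy selection rule are illustrative assumptions, not part of the original material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str       # hypothetical project name
    cost: float     # estimated cost, in any consistent currency unit
    benefit: float  # estimated benefit over the evaluation horizon


def prioritise(projects, budget):
    """Rank projects by benefit/cost ratio and select greedily within budget."""
    ranked = sorted(projects, key=lambda p: p.benefit / p.cost, reverse=True)
    selected, remaining = [], budget
    for project in ranked:
        if project.cost <= remaining:
            selected.append(project)
            remaining -= project.cost
    return selected


# Illustrative figures only.
portfolio = [
    Project("Sales data mart", cost=100.0, benefit=300.0),
    Project("HR dashboard", cost=50.0, benefit=60.0),
    Project("Enterprise DWh", cost=400.0, benefit=800.0),
]
chosen = prioritise(portfolio, budget=450.0)
print([p.name for p in chosen])
```

A greedy ratio ranking is only one possible selection rule; a real portfolio review would also weigh risk and strategic alignment, as the approval criteria in the Initiation stage suggest.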

30/08/2019 Methodologies... 10
Stages of a BI project
 Planning  define
 What to do?  Activities & tasks  Activities
 How to do it?  sequence of…
 When to do it? Who is responsible for doing it?  select team
 How to control it
 Generate
 Gantt chart
 Critical path
 Project Management
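
The critical path mentioned above can be sketched in Python as the longest path through the activity network. The activity names and durations below are hypothetical, and the function assumes the dependency graph has no cycles.

```python
def critical_path(durations, predecessors):
    """Return (total_duration, path) for the longest path through an
    acyclic activity network.

    durations: dict mapping activity -> duration
    predecessors: dict mapping activity -> list of activities that must finish first
    """
    finish = {}  # earliest finish time per activity
    via = {}     # predecessor on the longest path into each activity

    def earliest_finish(activity):
        if activity in finish:
            return finish[activity]
        start, best = 0, None
        for pred in predecessors.get(activity, []):
            pred_finish = earliest_finish(pred)
            if pred_finish > start:
                start, best = pred_finish, pred
        finish[activity] = start + durations[activity]
        via[activity] = best
        return finish[activity]

    for activity in durations:
        earliest_finish(activity)

    # Walk back from the activity that finishes last.
    last = max(finish, key=finish.get)
    total = finish[last]
    path = []
    while last is not None:
        path.append(last)
        last = via[last]
    return total, path[::-1]


# Hypothetical BI project plan (durations in days).
durations = {"requirements": 10, "modelling": 15, "etl": 20, "reports": 5, "training": 3}
predecessors = {
    "modelling": ["requirements"],
    "etl": ["modelling"],
    "reports": ["modelling"],
    "training": ["reports"],
}
total, path = critical_path(durations, predecessors)
print(total, path)
```

Activities on the returned path have no slack: delaying any of them delays the whole project, which is why they deserve the closest control.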

30/08/2019 Methodologies... 11
Stages of a BI project
 Completion or Finalization

 Evaluate if objectives were accomplished

 Deviations

 Cost and allocated resources tracking

30/08/2019 Methodologies... 12
Stages of a BI project
 Main causes of failure
 Failure to deliver the expected results, which generates
 User and customer dissatisfaction,
 High maintenance costs,
 A poor image of the project participants,
 Loss of competitiveness, etc.
 Poor project management: missing the deadline, exceeding the budget.

30/08/2019 Methodologies... 13
Stages of a BI project
 Main causes of failure
 Poor project management:
 Missing the deadline,
 Exceeding the budget.

30/08/2019 Methodologies... 14
Stages of a BI project
 Key management elements
 The objectives of the project
 The leader of the project
 The team
 Strategic support for the project from senior management
 Resource allocation
 Communication channels
 Control mechanisms

30/08/2019 Methodologies... 15
Roles and Responsibilities in a
BI project
 Organizing the BI project team
 The BI project team must have a specific set of skills to perform the required tasks
 Although each project stage has its own members, from the project management perspective the team is structured as two types of teams:
 The core
 The extended

30/08/2019 Methodologies... 16
Roles and Responsibilities in a
BI project
 The Core team
Role | Main responsibilities
Application development leader | Design and supervise the development of access and analysis applications (e.g. reports, queries, etc.)
BI infrastructure architect | Establish and maintain the technical infrastructure of BI. Generally reports to the strategic architect on the extended team.
Business representative | Participate in modelling sessions, provide data definitions, write test cases, make business decisions, resolve disputes between business units, and improve the quality of the data under the control of the business unit that this role represents.

30/08/2019 Methodologies... 17
Roles and Responsibilities in a
BI project
 The Core team
Role | Main responsibilities
Data Manager | Perform cross-organizational analysis, create logical data models specific to the project and add them to the logical enterprise data model
Data mining expert | Choose and run the data mining tool; must have experience in statistics
Data quality analyst | Ensure the quality of data sources and prepare the data cleansing and selection specifications for the ETL processes
Database Administrator | Design, load, monitor, and tune the BI target databases

30/08/2019 Methodologies... 18
Roles and Responsibilities in a
BI project
 The Core team
Role | Main responsibilities
ETL development leader | Design and supervise the ETL process.
Metadata Administrator | Build or buy (license), enhance, load and maintain the metadata repository
Project Manager | Define, plan, coordinate, monitor and review all project activities; track and report progress; solve technical or administrative problems; train the team; negotiate with suppliers, the company representative and the business sponsor. Has overall responsibility for the project

30/08/2019 Methodologies... 19
Roles and Responsibilities in a
BI project
 The Core team
Role | Main responsibilities
Business expert | Provide business knowledge about the data, processes and requirements.

30/08/2019 Methodologies... 20
Roles and Responsibilities in a
BI project
 The Extended team
Role | Main responsibilities
Application developers | Code reporting programs, write query scripts and develop access and analysis applications
BI support | Train the staff of the company
Company sponsor | Champion the BI initiative and remove business barriers for the BI project team
ETL developers | Code ETL programs and/or prepare instructions for ETL tools
IT Auditor or Quality Analyst | Identify project risks and exposures to the BI project due to lack of internal or external forces

30/08/2019 Methodologies... 21
Roles and Responsibilities in a
BI project
 The Extended team
Role | Main responsibilities
Metadata repository developers | Code the migration programs that load the metadata repository database, and provide metadata reports and an online help function.
Network services staff | Maintain the network environment
Operations staff | Run the processes for the ETL cycles, the access and analysis applications, and the metadata repository
Security officer | Ensure that the security requirements are defined and that the security functions are tested in all tools and databases

30/08/2019 Methodologies... 22
Roles and Responsibilities in a
BI project
 The Extended team
Role | Main responsibilities
Stakeholders (other business representatives or IT managers) | Hold limited responsibilities on the BI project, such as reviewing and ratifying the standards and the organizational and business rules that the BI project team uses or develops
Strategic architect | Manage the technological infrastructure of the organization, including the BI infrastructure
Technical services staff | Maintain the hardware infrastructure and operating systems

30/08/2019 Methodologies... 23
Roles and Responsibilities in a
BI project
 The Extended team
Role | Main responsibilities
Test staff | Test the code created by the ETL, application and metadata repository developers
Tools administrators | Install and maintain development tools and access and analysis tools
Web developers | Design the website and create web pages to display the reports and queries on the intranet, extranet or internet
Web master | Configure the Web server and Web security

30/08/2019 Methodologies... 24
Roles and Responsibilities in a
BI project
 The BI Arbitration Board
 In BI projects, technical or administrative disputes may arise that neither the core team nor the extended team is able to resolve.
 A dispute resolution procedure with clear guidelines should be established. The BI Arbitration Board is created for this purpose.
 BI arbitration boards can be organized in various ways. A BI arbitration board may be a newly created group whose members include the company sponsor, the CTO/CIO, IT managers, the COO, the CFO and the line-of-business directors.
 In some organizations, the BI arbitration board could be an existing committee.

30/08/2019 Methodologies... 25
Self-Assessment Process
 Is the business INTELLIGENT?
 Is the enterprise applying its BI solution well?
 What else does the enterprise need to raise the level of “intelligence” it applies?
 Is the enterprise mature enough in its use of intelligence?

30/08/2019 Methodologies... 26
Self-Assessment Process

If you don't know where you are going, any path will get you there

30/08/2019 Methodologies... 27
Self-Assessment Process
 How about a Business Intelligence Maturity indicator?
 Similar to CMMI, ITIL or COBIT, but not yet backed by a certification

30/08/2019 Methodologies... 28
Self-Assessment Process
 The CMM offers a set of guidelines to improve an organisation's processes within an important area
 A Business Intelligence Maturity indicator may help organisations assess existing enterprise-scale business intelligence implementations and identify potential weak points and improvement strategies.
 A model is needed!

30/08/2019 Methodologies... 29
Self-Assessment Process
 The Data Warehouse Institute´s Business Intelligence
Maturity Model (BIMM)
 The model is generalized
 Rates of evolution may vary!
 Skipping stages is possible but risky
 Requires expert assistance, strong executive commitment,
sizable funding
 Regressing stages is also possible
 Mergers, acquisitions, reorganizations
 New CEO/CIO
 New regulations
30/08/2019 Methodologies... 30
Self-Assessment Process
 The Data Warehouse Institute´s BIMM
 Maturity Model Adoption Curve –six stages

30/08/2019 Methodologies... 31
Self-Assessment Process
 The Data Warehouse Institute´s BIMM
 Gulf

 Executive perceptions of BI

 Proliferation of spreadmarts

 Data quality issues

30/08/2019 Methodologies... 32
Self-Assessment Process
 The Data Warehouse Institute´s BIMM
 Chasm
 Executive perceptions of BI
 Proliferation of spreadmarts, data marts, DWs
 Politics and control
 Architectural inflexibility
 Mental silos
 Unfitted BI tools

30/08/2019 Methodologies... 33
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – BI
Adolescence - Symptoms
 Your BI team moves perpetually from one crisis to the
next
 You have to plead with executives to keep your budget
 Usage of the BI/DW peaked soon after the initial
deployment
 The number of spread-marts continues to grow
 Data quality is still an issue
 Users keep asking IT to develop custom reports

30/08/2019 Methodologies... 34
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – BI
Adolescence - Symptoms
 Executives believe BI is operational reports or power
tools
 Query performance degrades as more users use the
system
 Users don’t know what’s in the data warehouse
 Users forget how to use the BI tools
 It takes too long to deliver new subject areas

30/08/2019 Methodologies... 35
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – local vs
enterprise value

30/08/2019 Methodologies... 36
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – strategic
value and ROI

30/08/2019 Methodologies... 37
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – Analytic
usage

30/08/2019 Methodologies... 38
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – Analytic output

30/08/2019 Methodologies... 39
Self-Assessment Process
 The Data Warehouse Institute´s BIMM vs CMM
Level | CMM | BIMM
1 | Process is unpredictable; varies by individual and team | Infant stage: "Ad hoc" processes
2 | Projects establish best practice policies and procedures (i.e. documented, enforced, trained, and measured); prevents unauthorized changes to schedules and requirements. | Child stage: "Project methodology"
3 | Organization establishes standard policies and procedures (not just best practices); administrators are trained. | Teenager stage: "Development methodology"

30/08/2019 Methodologies... 40
Self-Assessment Process
 The Data Warehouse Institute´s BIMM vs CMM
Level | CMM | BIMM
4 | Establishes performance metrics and acceptable thresholds; risks are known and proactively managed; output is predictable. | Adult stage: "Measure/monitor performance"
5 | Organization focuses on continuous process improvement and reducing defects through evaluation and sharing best practices. | Sage stage: "Continuous improvement"

30/08/2019 Methodologies... 41
Self-Assessment Process
 The Data Warehouse Institute´s BIMM
Business stage | Description | Level | Key question
Operate | Business startup; defining products and developing product and services | Level 0 | Is it happening?
Understand | Internal operations; scorecards are product and business unit focused | Level 1 | What happened?
Change | Customer focused; scorecards are cross-functional or enterprise wide | Level 2 | Why did it happen?
Grow | Market segmentation across business units; target markets drive productivity | Level 3 | What will happen?
Complete | Performance measured against competitors and customer profitability | Level 4 | What is happening?
Lead | Management innovation drives industry standards, practices and strategies | Level 5 | Make it happen!

[Figure axis: DW/BI Maturity]
30/08/2019 Methodologies... 42
Self-Assessment Process
 The Data Warehouse Institute´s BIMM – dimensions of DWh Maturity
 Architectural Governance
 Business Governance
 User Access
 Decision Support
 BI Workload Profile
 Metadata
 Communication and Training
 Data Quality, Data Currency, Data Protection
 These dimensions can SUPPORT or PREVENT the evolution of the EDWh
[Figure axis: Breadth (Dimensions)]

30/08/2019 Methodologies... 43
Self-Assessment Process
 The Data Warehouse Institute´s BIMM
DWM Scorecard (each dimension is scored from 0 to 5)
 Dimensions
 Business Requirements
 BI Workload Profile
 User Access
 Decision Support
 Data Quality
 Data Currency
 Metadata
 Architectural Governance
 Business Governance
 Data Protection
 Communications and Training
 Legend
 The DWh is not aligned with business goals
 The DWh is somewhat aligned with business goals
 The DWh meets business goals
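
A scorecard like the one above could be tabulated with a short Python sketch. The dimension names and the 0 to 5 scale come from the slide; the thresholds that map a score to each legend entry and the sample scores are assumptions made only for illustration.

```python
DIMENSIONS = [
    "Business Requirements", "BI Workload Profile", "User Access",
    "Decision Support", "Data Quality", "Data Currency", "Metadata",
    "Architectural Governance", "Business Governance", "Data Protection",
    "Communications and Training",
]


def classify(score):
    """Map a 0-5 score to a legend entry (thresholds are assumed, not from the slide)."""
    if score <= 1:
        return "not aligned with business goals"
    if score <= 3:
        return "somewhat aligned with business goals"
    return "meets business goals"


def scorecard(scores):
    """Validate the scores and return the legend entry for every dimension."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dimension in DIMENSIONS:
        if not 0 <= scores[dimension] <= 5:
            raise ValueError(f"{dimension}: score must be between 0 and 5")
    return {d: classify(scores[d]) for d in DIMENSIONS}


# Hypothetical assessment: every dimension scores 3 except two.
sample = {d: 3 for d in DIMENSIONS}
sample["Data Quality"] = 1
sample["Data Protection"] = 5
result = scorecard(sample)
for dimension, status in result.items():
    print(f"{dimension}: {status}")
```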
30/08/2019 Methodologies... 44
Self-Assessment Process
 Gartner´s Business Intelligence (BI), Analytics and Performance Management (PM) Maturity Model (BIAPMMM)
 A BI program includes people, skills, processes, metrics and other components, as well as technologies
 As the BI program matures, the architecture will evolve, along with the processes and skills needed to support it

30/08/2019 Methodologies... 45
Self-Assessment Process
 Gartner´s BIAPMMM
 At Level 1 maturity, the use of spreadsheets to gather and analyze data is expensive, provides inconsistent and inaccurate information, and carries a high risk of fraud.
 At Level 3 maturity, many enterprises implement a BI competency center consisting of business users, IT professionals and analysts to share expertise and improve consistency for specific applications or uses of information.

30/08/2019 Methodologies... 46
Self-Assessment Process
 Gartner´s BIAPMMM - recommendations
 Use this maturity model to talk to business managers about the value of increasing the maturity of your BI, PM and analytics program.
 Always seek sponsorship from business or corporate managers for any effort to increase the maturity of your program.

30/08/2019 Methodologies... 47
Self-Assessment Process
 Gartner´s BIAPMMM – WHY??
 Individual projects that prevailed in the past have
created silos of information without always giving
managers the insight they need to make good decisions
 Enterprises cannot enact a strategic approach in one
simple step; it takes time to build all the skills needed
for the right BI and PM program
 Identify the enterprise's current level of maturity
and the level of maturity that the enterprise's
strategic goals require

30/08/2019 Methodologies... 48
Self-Assessment Process
 Gartner´s BIAPMMM - Assumes a portfolio that includes
 Traditional BI applications (such as ad hoc query, reporting, dashboards, online analytical processing (OLAP), data integration and data warehouse),
 Analytic applications (for example, customer service analytics) and
 PM applications (such as for sales).

30/08/2019 Methodologies... 49
Self-Assessment Process
 Gartner´s BIAPMMM

30/08/2019 Methodologies... 50
Self-Assessment Process
 Gartner´s BIAPMMM - Level 1: Unaware
 BI & analytics occur in an ad hoc manner
 Executives and managers ask for information
 Users scramble to provide it with any available operational application
 Users range from skilled analysts to self-appointed “spreadsheet jockeys”
 Results are delivered in spreadsheets designed for one use and stored on someone's PC
 Analytics are embedded

30/08/2019 Methodologies... 51
Self-Assessment Process
 Gartner´s BIAPMMM - Level 1: Unaware
 The enterprise has no information infrastructure
 There are ODBC connections
 There are no defined processes for analytics or decision making, and no performance metrics
 This approach prevails because it costs little to get started

30/08/2019 Methodologies... 52
Self-Assessment Process
 Gartner´s BIAPMMM - Level 1: Unaware – Disadvantages
 Level 1 practices are labor-intensive and duplicative, and therefore expensive overall.
 They do not provide consistent and accurate information.
 They are not audited and carry a high risk of fraud.

30/08/2019 Methodologies... 53
Self-Assessment Process
 Gartner´s BIAPMMM - Level 2: Opportunistic
 Business units undertake every BI, PM or analytics project individually to optimize a process or to help make tactical decisions, and each project or domain has its own information infrastructure, tools, applications and performance measures
 Different applications proliferate across the organization, each with its own team of IT workers, business application users and operational managers
 Teams use data integration tools, analytic capabilities, databases and BI platform capabilities

30/08/2019 Methodologies... 54
Self-Assessment Process
 Gartner´s BIAPMMM - Level 2: Opportunistic
 Delivers value to users quickly, with relevant information and analysis
 Skills become siloed along with the applications and information, so the wider organization cannot benefit from any expertise.

30/08/2019 Methodologies... 55
Self-Assessment Process
 Gartner´s BIAPMMM - Level 3: Standards
 People, processes and technologies start to become coordinated across the enterprise
 A senior executive (from the business side) becomes the enterprise champion for BI, PM & analytics
 Process managers and IT leaders oversee projects across multiple business processes that need to share analysis and decisions
 Users make decisions based on multiple streams of data

30/08/2019 Methodologies... 56
Self-Assessment Process
 Gartner´s BIAPMMM - Level 3: Standards
 Many enterprises implement a BI competency center
consisting of business users, IT professionals and
analysts to share expertise and improve consistency for
specific applications or uses of information
 Technology standards start to emerge, including for
information infrastructure, data warehouses, and BI or
corporate PM platforms, but they are not mandated
 An “inside out” perspective dominates

30/08/2019 Methodologies... 57
Self-Assessment Process
 Gartner´s BIAPMMM - Level 3: Standards
 For the first time, the enterprise starts to lower the overall cost of its BI, PM and analytics efforts through improved coordination and the standardization of technologies
 However, the adaptability of BI, PM and analytic systems remains low, so the enterprise has not yet reached strong economies of scale

30/08/2019 Methodologies... 58
Self-Assessment Process
 Gartner´s BIAPMMM - Level 4: Enterprise
 Top executives such as the CFO or COO become the
program's sponsors
 A framework of performance metrics that links multiple
processes to enterprise goals has been defined
 Corporate and operational executives can see
cause/effect relationships with key activities
 People from analysts to business managers and senior
executives use the same BI, PM and analytic systems

30/08/2019 Methodologies... 59
Self-Assessment Process
 Gartner´s BIAPMMM - Level 4: Enterprise
 An enterprise information architecture guides the
design of new systems
 The enterprise exhibits a high degree of discipline
around BI, PM and analytic projects, with release
oriented program management
 Though BI, PM and analytic efforts have become more
efficient, usage grows and therefore costs remain high
 The enterprise must maintain people with a high level of
skill in many different areas, such as program and
change management
30/08/2019 Methodologies... 60
Self-Assessment Process
 Gartner´s BIAPMMM - Level 5: Transformative
 BI, PM and analytics have become a strategic initiative,
jointly run by the business and IT organization, and
supported and governed at the highest levels of the
organization
 The enterprise has completed its performance metrics
framework and even extended it to include partners and
customers
 An "outside in" perspective now permeates the
measurement system

30/08/2019 Methodologies... 61
Self-Assessment Process
 Gartner´s BIAPMMM - Level 5: Transformative
 All of these stakeholders use the information from BI,
PM and analytics systems to coordinate a response to
changing business conditions across the whole value
chain and to make transformational decisions
 Users come from multiple levels within the
organization, multiple business units and multiple
geographies as well as from customers and partners

30/08/2019 Methodologies... 62
Self-Assessment Process
 Gartner´s BIAPMMM - Level 5: Transformative
 The enterprise has turned legacy applications into services to promote fast, easy integration and reuse
 The enterprise has optimized costs by sharing systems, processes and skills across the organization
 Users can see the enterprise's performance and the factors that contribute to it
 Mergers or acquisitions can reintroduce many of the problems that the enterprise has overcome

30/08/2019 Methodologies... 63
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM)
 Four key dimensions
1. Information quality
2. Master data management
3. Warehousing architecture
4. Analytics
 Five levels of maturity
 20 critical factors

30/08/2019 Methodologies... 64
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM)

30/08/2019 Methodologies... 65
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 1: Initial
 Information Quality: ad-hoc
 Information Management (IM)/Information Quality
Management (IQM) processes are not standardized or
documented during this stage.
 There is no awareness of any information quality (IQ) issues,
therefore no attempts are made to assess or improve
information quality.
 Organisation acts in response only when information quality
problems occur

30/08/2019 Methodologies... 66
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 1: Initial
 Master Data Management: List Provisioning
 There is no systematic and thorough way of ensuring changes
to the master list.
 Defining and maintaining master lists involve significant
meetings and human interaction.
 Data conflicts, deletions, changes, explaining data file
formats, and content details are handled manually.
 Individual applications must understand how to navigate to
the master list.

30/08/2019 Methodologies... 67
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 1: Initial
 Warehousing architecture: Spread-marts & Management Reporting
 Management reports are static reports that are printed and disseminated to employees on a weekly, monthly, or quarterly basis.
 Spread-marts are spreadsheets or desktop databases that function as surrogate data marts.

30/08/2019 Methodologies... 68
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 1: Initial
 Analytics: Analytically Impaired
 The company has some data and management interest in
analytics

30/08/2019 Methodologies... 69
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 2: Repeatable
 Information Quality: Define IP and IQ
 All Information Product (IP) and Information Quality (IQ) requirements have been identified and documented.
 Related information quality dimensions and requirements have been classified.

30/08/2019 Methodologies... 70
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 2: Repeatable
 Master Data Management: Peer-Based Access
 There is hardcoded logic for applications to interact with the list of master data.
 A data model is created to identify each master record uniquely.
 Individual applications take responsibility for maintaining the master list.
 All data and integrity rules are copied to new integrated application systems

30/08/2019 Methodologies... 71
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 2: Repeatable
 Warehousing architecture: Data Marts
 A data mart is an analytical data store that generally focuses on a specific business function within an organisation
 Data marts are tailored to meet the needs of data users.
 Interactive reporting tools such as OLAP and ad hoc query tools are usually used to access the data marts and gain deeper insight.

30/08/2019 Methodologies... 72
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 2: Repeatable
 Analytics: Localized Analytics
 Functional management builds analytics momentum and executives’ interest through applications of basic analytics

30/08/2019 Methodologies... 73
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 3: Defined
 Information Quality: IQM Initiative
 Information quality management is treated as a core business activity and widely implemented across the organisation

30/08/2019 Methodologies... 74
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 3: Defined
 Master Data Management: Centralized Hub Processing
 Everything is centralized during this stage.
 Master reference data, business-oriented data rules, and connected processing are centrally handled.
 Cross-functional or cross-organisation conflicts can be resolved by a data governance process.
 Data accuracy and consistency are guaranteed

30/08/2019 Methodologies... 75
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 3: Defined
 Warehousing architecture: Data Warehouse
 A data warehouse provides interactive reporting and deeper analysis.
 New insights become possible thanks to the ability to query across functional boundaries

30/08/2019 Methodologies... 76
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 3: Defined
 Analytics: Analytical Aspirations
 Executives commit to analytics by aligning resources and setting a timetable to build a broad analytical capability

30/08/2019 Methodologies... 77
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 4: Managed
 Information Quality: IQ Assessment

 Information quality metrics have been developed

 Information quality is being evaluated
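
As one hedged illustration of what information quality metrics might look like, the Python sketch below computes completeness and uniqueness for a field. These two metrics, the record layout and the sample data are assumptions; the slides do not say which metrics EBIM prescribes.

```python
def completeness(records, field):
    """Share of records whose field is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def uniqueness(records, field):
    """Share of distinct values among the non-empty values of a field."""
    values = [r[field] for r in records if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical customer records.
customers = [
    {"id": 1, "email": "a@example.org"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@example.org"},
    {"id": 4, "email": "b@example.org"},
]
email_completeness = completeness(customers, "email")
email_uniqueness = uniqueness(customers, "email")
print(f"completeness={email_completeness:.2f} uniqueness={email_uniqueness:.2f}")
```

Tracking such ratios over time is one way to make the evaluation step repeatable.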

30/08/2019 Methodologies... 78
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 4: Managed
 Master Data Management: Business Rules & Policy
Support
 A process-driven data governance framework exists to
maintain centralized business rules management and
distributed rules processing.
 Organisation has a mature change management process.
 SOA is applied to integrate common business methods and
data across applications.
 There is an automated way to both enforce and undo changes
to master reference data.

30/08/2019 Methodologies... 79
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 4: Managed
 Warehousing architecture: Enterprise Data Warehouse
 The enterprise data warehouse acts as an integration machine that continuously merges all other analytic structures into itself.
 The enterprise data warehouse helps the organisation achieve a single version of the truth

30/08/2019 Methodologies... 80
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 4: Managed
 Analytics: Analytical Company
 Analytic capability draws most of the attention of the company's top executives, so an enterprise-wide analytics capability is being developed.

30/08/2019 Methodologies... 81
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 5: Optimizing
 Information Quality: Single View of Truth
 The sources of information quality problems have been identified.
 There are continuous initiatives to improve the handling of information quality problems.
 In addition, the impact of poor information quality has been calculated.

30/08/2019 Methodologies... 82
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 5: Optimizing
 Master Data Management: Enterprise Data Convergence
 The hub is fully integrated into the application system environment.
 The hub propagates data changes to all the application systems that need the master data.
 Application processing occurs without depending on physical system location and data navigation.

30/08/2019 Methodologies... 83
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 5: Optimizing
 Warehousing architecture: Analytical Services
 The value of the enterprise data warehouse increases as its visibility declines.
 The enterprise data warehouse fades into the background as a business intelligence service.
 Examples of analytical services are interactive extranets, web services, decision engines and so forth

30/08/2019 Methodologies... 84
Self-Assessment Process
 The Enterprise Business Intelligence Maturity Model
(EBIM) – level 5: Optimizing
 Analytics: Analytical Competitors
 The enterprise-wide analytics capability promises the company regular benefits
 The company focuses on continuous analytics review and enhancement

30/08/2019 Methodologies... 85
Best Practice Methodologies
 SQLBI Methodology – A BI Solution is

30/08/2019 Methodologies... 86
Best Practice Methodologies
 A BI solution can be
 Small

 Medium

 Large

30/08/2019 Methodologies... 87
Best Practice Methodologies
 Participants in a BI solution
 User / Customer
 BI Analyst

30/08/2019 Methodologies... 88
Best Practice Methodologies
 SQLBI Methodology with a Microsoft BI Suite
 SQL Server

 SQL Server Integration Services (SSIS)

 SQL Server Analysis Services (SSAS)

 Excel

 ProClarity

30/08/2019 Methodologies... 89
Best Practice Methodologies
 SQLBI Methodology - Kimball methodology
 Facts & dimensions, junk dimensions, degenerate
dimensions
 Star & snowflake schema
 Slowly changing dimension
 Bridge tables (factless fact tables)
 Snapshot vs transactions fact tables
 Updating facts & dimensions
 Natural and surrogate keys
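
Two of the concepts listed above, slowly changing dimensions and surrogate keys, can be sketched together in Python as a Type 2 update: the current row is closed and a new version is appended under a fresh surrogate key. The row layout, field names and sample customer are hypothetical, and an in-memory list stands in for the dimension table.

```python
from datetime import date


def apply_scd2(dimension, natural_key, attributes, effective):
    """Apply a Type 2 change and return the surrogate key that is current afterwards.

    dimension is a list of dict rows with the keys surrogate_key, natural_key,
    attributes, valid_from, valid_to and is_current (an assumed layout).
    """
    current = next(
        (row for row in dimension
         if row["natural_key"] == natural_key and row["is_current"]),
        None,
    )
    if current is not None and current["attributes"] == attributes:
        return current["surrogate_key"]  # nothing changed, keep the version

    if current is not None:
        # Close the old version instead of overwriting it.
        current["valid_to"] = effective
        current["is_current"] = False

    new_key = max((row["surrogate_key"] for row in dimension), default=0) + 1
    dimension.append({
        "surrogate_key": new_key,
        "natural_key": natural_key,
        "attributes": dict(attributes),
        "valid_from": effective,
        "valid_to": None,
        "is_current": True,
    })
    return new_key


customer_dim = []
k1 = apply_scd2(customer_dim, "C-001", {"city": "Puebla"}, date(2015, 1, 1))
k2 = apply_scd2(customer_dim, "C-001", {"city": "Monterrey"}, date(2015, 6, 1))
k3 = apply_scd2(customer_dim, "C-001", {"city": "Monterrey"}, date(2015, 7, 1))
print(k1, k2, k3, len(customer_dim))
```

Fact rows would store the surrogate key that was current when the fact occurred, so history is preserved even though the natural key `C-001` never changes.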

30/08/2019 Methodologies... 90
Best Practice Methodologies
 SQLBI Methodology - Kimball methodology
 Create the data marts using a dimensional modelling technique
 Incrementally add more data marts
 Until they form the data warehouse

30/08/2019 Methodologies... 91
Best Practice Methodologies
 SQLBI Methodology - Kimball methodology

30/08/2019 Methodologies... 92
Best Practice Methodologies
 SQLBI Methodology - Kimball methodology, the DWh

30/08/2019 Methodologies... 93
Best Practice Methodologies
 SQLBI Methodology – Inmon methodology
 CIF – Corporate Information Factory
 A database environment: operational, atomic data warehouse, departmental and individual
 The data warehouse is part of the bigger whole (CIF)

Best Practice Methodologies
 SQLBI Methodology – Inmon methodology

Best Practice Methodologies
 SQLBI Methodology – Inmon methodology
 Three levels of data modelling
 ERD (Entity Relationship Diagram)
 Refines entities, attributes and relationships
 Mid-Level Model (DIS)
 Data Item Sets
 Data sets by department
 Four constructs
1. Primary data groupings
2. Secondary data groupings
3. Connects
4. “type of” data
 Physical data model
 Optimize for performance (de-normalize)

Best Practice Methodologies
 SQLBI Methodology – Inmon methodology

 Create the data warehouse model

 Incrementally, add more BI applications

 Until the customer is satisfied
Best Practice Methodologies
 SQLBI Methodology – Inmon methodology, the DWh

Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Inmon
 Subject-oriented
 Integrated
 Non-volatile, time-variant
 Top-down
 Integration achieved via an assumed enterprise data model
 Characterizes data marts as aggregates
Kimball
 Business-process-oriented
 Bottom-up & evolutionary
 Stresses dimensional model, not E-R
 Integration achieved via conformed dimensions
 Star schemas enforce query semantics
Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Inmon | Kimball
Overall approach | Top-down | Bottom-up
Architectural structure | Enterprise-wide DWh feeds departmental DBs | Data marts model a business process; enterprise consistency is achieved with conformed dimensions
Complexity of method | Quite complex | Fairly simple


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Inmon | Kimball
Data orientation | Subject or data driven | Process oriented
Tools | Traditional (ERDs & DIS) | Dimensional modelling; departs from traditional relational modelling
End user accessibility | Low | High


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Inmon | Kimball
Timeframe | Continuous & discrete | Slowly changing
Methods | Timestamps | Dimension keys


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Inmon | Kimball
Primary audience | IT | End users
Place in the organization | Integral part of the Corporate Information Factory (CIF) | Transformer and retainer of operational data
Objective | Deliver a sound technical solution based on proven methods | Deliver a solution that makes it easy for end users to directly query data and still have reasonable response times


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Favors Kimball | Favors Inmon
Nature of the organization´s decision support requirements | Tactical | Strategic
Data integration requirements | Individual business areas | Enterprise-wide integration
Structure of data | Business metrics, performance measures and scorecards | Non-metric data and data that will be applied to meet multiple and varied information needs


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Favors Kimball | Favors Inmon
Scalability | Need to adapt to highly volatile requirements within a limited scope | Growing scope and changing requirements are critical
Persistency of data | Source systems are relatively stable | High rate of change from source systems
Staffing and skills requirements | Small teams of generalists | Larger team(s) of specialists
Time to deliver | Need for the first DWh application is urgent | Organization´s requirements allow for longer start-up time


Best Practice Methodologies
 SQLBI Methodology – Kimball vs. Inmon
Characteristic | Favors Kimball | Favors Inmon
Cost to deploy | Lower start-up costs, with each subsequent project costing about the same | Higher start-up costs, with lower subsequent project development costs


Best Practice Methodologies
 ROADMAP Methodology – remember…

 A BI application is an engineering project

 Engineering projects of any kind go through six stages
between inception and implementation


Best Practice Methodologies
 ROADMAP Methodology – remember…
1. Justification: An assessment is made of a business
problem or a business opportunity, which gives rise to
the engineering project.
2. Planning: Strategic and tactical plans are developed,
which lay out how the engineering project will be
accomplished.
3. Business Analysis: Detailed analysis of the business
problem or business opportunity is performed, which
provides a solid understanding of the business
requirements for a solution.
Best Practice Methodologies
 ROADMAP Methodology – remember…
4. Design: A product is conceived, which solves the
business problem or enables the business opportunity.
5. Construction: The conceived product is built, which
is expected to provide a return on the development
investment within a predefined time frame.
6. Deployment: The finished product is implemented
(or sold) and its effectiveness is measured, which will
determine whether the solution meets, exceeds or fails
the expected return on investment.



Best Practice Methodologies
 ROADMAP Methodology – remember…spiral



Best Practice Methodologies
 ROADMAP Methodology – remember…spiral in each
subproject



Best Practice Methodologies
 ROADMAP Methodology – BI Development Stages
and Steps
 A BI roadmap is comprised of sixteen development
steps.



Best Practice Methodologies
 ROADMAP Methodology – Justification stage
 Step 1: Business Case Assessment.

 The business problem or business opportunity is defined and
a BI solution is proposed.

 Each BI application release should be cost-justified and
should clearly define the benefits of either solving a business
problem or taking advantage of a business opportunity.


Best Practice Methodologies
 ROADMAP Methodology – Planning stage
 Step 2: Enterprise Infrastructure.

 BI is a cross-organizational decision support solution.

 An enterprise infrastructure must exist or be developed while
the BI applications are developed.


Best Practice Methodologies
 ROADMAP Methodology – Planning stage
 Step 2: Enterprise Infrastructure.
 An enterprise infrastructure has two components:
1. Technical infrastructure which includes hardware,
software, middleware, database management systems,
operating systems, network components, meta data
repository and applications; and
2. Nontechnical infrastructure which includes meta data
standards, data naming standards, enterprise data
architecture (evolving), methodology, guidelines, testing
procedures, change control process, issues management
procedures and dispute resolution procedures



Best Practice Methodologies
 ROADMAP Methodology – Planning stage
 Step 3: Project Planning.
 BI projects are extremely dynamic and changes to scope, staff,
budget, technology, users and sponsors can severely impact
the success of the project.

 Project planning must be detailed

 Actual progress must be closely watched and reported.



Best Practice Methodologies
 ROADMAP Methodology – Business Analysis stage
 Step 4: Project Delivery Requirements.
 Scoping is one of the most difficult tasks for BI applications.
The desire to have everything instantly is difficult to curtail;
however, keeping the scope small is one of the most important
aspects to defining the requirements for each deliverable.

 These requirements should be expected to change throughout
the development cycle as more is learned about the
possibilities and the limitations of the technology.


Best Practice Methodologies
 ROADMAP Methodology – Business Analysis stage
 Step 5: Data Analysis.
 The biggest challenge to all BI projects is the quality of the
source data.
 The bad habits developed over decades are difficult to break,
and it is very difficult and time-consuming to find and correct
the damage resulting from the bad habits.
 Data analysis in the past was confined to one line-of-business
user's view and was never reconciled with other views in the
organization.
 This step will take a significant percentage of time in the
entire project schedule.
Best Practice Methodologies
 ROADMAP Methodology – Business Analysis stage
 Step 6: Application Prototyping.
 Analysis for the functional deliverable(s), formerly called
system analysis, is best done through prototyping.
 Today there are tools and new programming languages that
enable the developers to prove or disprove a concept or idea
relatively quickly.
 Prototyping also allows the users to see the potential and the
limits of the technology. This gives them an opportunity to
adjust their delivery requirements and their expectations.



Best Practice Methodologies
 ROADMAP Methodology – Business Analysis stage
 Step 7: Meta Data Repository Analysis.
 Having more tools means having more technical meta data in
addition to the business meta data, which is usually captured
in a modeling CASE (computer-aided software engineering)
tool.
 This meta data needs to be mapped to other meta data and
stored in a repository.
 Meta data repositories can be purchased or built. In either
case, the requirements for what type of meta data to capture
and store must be documented in a meta model.
 In addition, the requirements for delivering meta data to the
users have to be analyzed.



Best Practice Methodologies
 ROADMAP Methodology – Design stage
 Step 8: Meta Data Repository Design.

 If a meta data repository is purchased, it will most likely have
to be extended with features that are required by your BI
applications.

 If a meta data repository is built, the database has to be
designed based on the meta model developed during the
previous step.


Best Practice Methodologies
 ROADMAP Methodology – Design stage
 Step 9: Database Design.
 One or more databases will be storing the business data in
detailed or aggregated form, depending on the reporting
requirements of the users.

 Not all reporting requirements are strategic, and not all of
them are multidimensional.

 The database design schema must match the access
requirements of the business.


Best Practice Methodologies
 ROADMAP Methodology – Design stage
 Step 10: ETL Design.
 This process is the most complicated process of the entire BI
project; it is also the least glamorous.
 Extract, transform and load (ETL) processing time frames
(batch windows) are typically small.
 The poor quality of the source data usually mandates a lot of
time to run the transformation and cleansing programs.
 It is a challenge for most organizations to finish the ETL
process within the available time frame.



Best Practice Methodologies
 ROADMAP Methodology – Construction stage
 Step 11: ETL Development.
 Many tools are available for this process, some sophisticated
and some simple.
 Depending on the data cleansing and data transformation
requirements developed during the data analysis step, an ETL
tool may or may not be the best solution.
 In either case, preprocessing the data and writing extensions
to the tool capabilities are frequently required.
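
The extract, transform and load sequence can be sketched without any ETL tool, as a minimal illustration. The CSV layout, the cleansing rule (rows without an amount are dropped) and the orders table are assumptions made for this example, not part of the ROADMAP methodology.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with dirty data: padded values, a missing amount,
# inconsistent country codes.
RAW = "order_id,amount,country\n1, 10.50 ,mx\n2,,MX\n3,7,us\n"


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse and standardize the rows."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed cleansing rule: skip rows without an amount
        clean.append((int(row["order_id"]), float(amount), row["country"].strip().upper()))
    return clean


def load(conn, rows):
    """Load: write the cleansed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
loaded = conn.execute(
    "SELECT order_id, amount, country FROM orders ORDER BY order_id"
).fetchall()
```

In a real project each function would grow into its own set of jobs; the point of the sketch is only that cleansing rules found during data analysis end up as code in the transform step.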



Best Practice Methodologies
 ROADMAP Methodology – Construction stage
 Step 12: Application Development.

 Once the prototyping effort has finalized the functional
delivery requirements, true development can begin on either
the same user access and analysis tools, such as OLAP tools, or
on different tools.

 This activity is usually performed in parallel to the meta data
repository and ETL activities.


Best Practice Methodologies
 ROADMAP Methodology – Construction stage
 Step 13: Data Mining.
 Many organizations do not use their BI databases to their
fullest capability.
 In fact, usage is often limited to prewritten reports, some of
them not even new types of reports but replacements of old
reports.
 The real payback for BI applications comes from the business
intelligence hidden in the organization's data, which can only
be discovered with data mining tools.



Best Practice Methodologies
 ROADMAP Methodology – Construction stage
 Step 14: Meta Data Repository Development.

 If the decision is made to build a meta data repository rather
than to buy one, a separate team is usually charged with the
development process.

 This becomes a sizable subproject of the overall BI project.


Best Practice Methodologies
 ROADMAP Methodology – Deployment stage
 Step 15: Implementation.
 Once all components of the BI application are thoroughly
tested, the BI databases and functions are rolled out.

 Users must be trained and the support functions initiated.

 These functions include help desk support, maintenance of
the BI target databases, scheduling and running ETL batch
jobs, performance monitoring and database tuning.


Best Practice Methodologies
 ROADMAP Methodology – Deployment stage
 Step 16: Release Evaluation.
 With an application release concept, it is very important to
benefit from lessons learned on the previous project.
 Any tools, techniques, guidelines and processes that were not
helpful should be reevaluated and adjusted, possibly even
discarded.
 Any missed deadlines, cost overruns, disputes and their
resolutions should be examined.
 Adjustments to the processes should be made before the next
release.



Best Practice Methodologies
 ROADMAP Methodology

 The development steps need not be performed in
sequence; most likely, they will be performed in parallel.

 However, because there is a natural order of progression
from one engineering stage to another, certain
dependencies exist between some of the development
steps.


Best Practice Methodologies
 ROADMAP Methodology



Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks

 Most BI projects have at least three development tracks
running in parallel once the project delivery
requirements have been defined


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks

1. ETL Track, also known as Back-End. The design and
population of the BI target databases are the most
important components of BI projects. The fanciest
OLAP tools in the world will not work if the databases
are not designed properly or if they are populated with
dirty data.


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks

2. Application Track, also known as Front-End.
Value-added data delivery from the BI databases
as well as easy ad hoc (spontaneous) access to the
business data are the key reasons for building the BI
environment.


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks

3. Meta Data Repository Track. Meta data is a
deliverable for every BI application. It can no longer be
shoved aside as documentation. It must serve the users
as a navigation tool for the target databases in the BI
environment.


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks
TRACKS: ETL Development, Application Development, MetaData Repository Development
Stage/Steps
Justification
Step 1 ✓ ✓ ✓
Planning
Step 2 ✓ ✓ ✓
Step 3 ✓ ✓ ✓


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks
TRACKS: ETL Development, Application Development, MetaData Repository Development
Stage/Steps
Business Analysis
Step 4 ✓ ✓ ✓
Step 5 ✓
Step 6 ✓ ✓
Step 7 ✓
Design
Step 8 ✓
Step 9 ✓ ✓
Step 10 ✓
Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks
TRACKS: ETL Development, Application Development, MetaData Repository Development
Stage/Steps
Construction
Step 11 ✓
Step 12 ✓
Step 13 ✓
Step 14 ✓
Deployment
Step 15 ✓ ✓ ✓
Step 16 ✓ ✓ ✓
Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks
 These three tracks can be considered major
subprojects of the BI project.
 Each will have its own team and set of activities after
the project delivery requirements have been
formalized.
 Discoveries made on one track can, and often do,
impact the other tracks



Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks



Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks
 Each development track has a specific deliverable which
contributes to the BI project objectives:
 The ETL track will deliver loaded databases.

 The application track will deliver the reports, queries and ad
hoc tools.

 The meta data repository will deliver the meta data.


Best Practice Methodologies
 ROADMAP Methodology – Parallel Development
Tracks

 Each track moves through the six engineering stages
either together or apart and in parallel, performing the
engineering activities in their specific steps.


Data Warehouse
Implementation Methodology

30/08/2019 Tools and Techniques for BI 144


Data Warehouse
Implementation Methodology

30/08/2019 Data Modelling for BI 145


Data Warehouse
Implementation Methodology
 Analysis Phase
 Convert the collected requirements into a set of
specifications that support the design phase.
 There are 3 main specifications:
1. Business view requirements, which delineate the boundaries of
the information that needs to be part of the DWh
2. Data source requirements, which delineate the boundaries of
the information available in the current data sources
3. Access and end-user requirements, which define how the DWh´s
information will be used


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying business view
requirements
 Reference Framework
 Top-down vision
 Source vision
 DWh vision
 Business queries vision
 Data source Modelling
 DWh Modelling
 Business queries modelling



Data Warehouse
Implementation Methodology
 Analysis Phase – specifying business view
requirements, reference framework, interrelation
among visions
 Top-Down Vision: allows the selection of the CORRECT information for
the DWh
[Figure: interrelation among visions, from left to right.
Data Source Vision (Data Source): tables of Data Source 1, 2, ... n.
DWh Vision (DWh and Data Marts): source data integrated tables, facts tables, dimension tables.
Business Queries Vision (End-Users): end-user local tables for End-User 1, 2, ... n.]
Data Warehouse
Implementation Methodology
 Analysis Phase – specifying business view
requirements, reference framework, Objective
Analysis
[Figure: objective tree.
Profits
 Income: Sales, Prices
 Costs: Fixed, Variable
Lowest level: Promotions, Manufacturing costs]


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying business view
requirements, reference framework, Objective vs
Organizational level Analysis
[Figure: levels of the objective tree matched to organizational levels.
Profits: High direction
Income, Costs: Vice-presidents
Sales, Prices, Fixed, Variable: Divisional Manager
Promotions, Manufacturing costs: Department Manager
Side note on the figure: more dimensions, condensation and addition.]
Data Warehouse
Implementation Methodology
 Analysis Phase – specifying business view
requirements, use of classical techniques for business
analysis
[Figure: value chain diagram ending in Profit.
Support activities: Enterprise's Infrastructure, Human Resource Management, Technological Development, Achievements.
Other labels in the figure: Customer queries, Internal logistics, Service, Analysis, Design, Marketing, Publicity, Promotion, Workforce Management, Operations, Technical literature.]
Data Warehouse
Implementation Methodology
 Analysis Phase – specifying data source vision
 Multiple storage technologies
 Multiple data definitions
 Synonyms
 Homonyms
 Null fields
 Format differences among similar fields of different DBs
 Data type
 Length
 Coding differences
 Duplicates
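
Several of the source problems listed above (synonyms, null fields, differences in data type and length, coding differences, duplicates) can be handled with small harmonization rules. The sketch below is only an illustration: the field names, the gender codes and the six-character customer identifier are hypothetical.

```python
# Hypothetical synonyms: source-specific field name -> common field name
FIELD_SYNONYMS = {
    "cust_id": "customer_id",
    "client_no": "customer_id",
    "sex": "gender",
    "gender_cd": "gender",
}
# Hypothetical coding differences between the two sources
GENDER_CODES = {"M": "M", "F": "F", "1": "M", "2": "F"}


def harmonize(record):
    """Map one source record to the common representation."""
    out = {}
    for field, value in record.items():
        name = FIELD_SYNONYMS.get(field, field)      # resolve synonyms
        if isinstance(value, str):
            value = value.strip() or None            # blank strings become nulls
        out[name] = value
    if out.get("gender") is not None:
        out["gender"] = GENDER_CODES.get(str(out["gender"]))  # unify coding
    if out.get("customer_id") is not None:
        out["customer_id"] = str(out["customer_id"]).zfill(6)  # unify type and length
    return out


def deduplicate(records):
    """Keep the first record seen for each customer_id."""
    seen, unique = set(), []
    for rec in records:
        key = rec.get("customer_id")
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


source_a = [{"cust_id": 42, "sex": "M"}]
source_b = [{"client_no": "000042", "gender_cd": "1"},
            {"client_no": "77", "gender_cd": " "}]
clean = deduplicate([harmonize(r) for r in source_a + source_b])
```

Customer 42 appears in both sources under different names, types and codes; after harmonization the two records are recognized as the same customer and only one is kept.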



Data Warehouse
Implementation Methodology
 Analysis Phase – specifying data source vision –to do
1. Modelling the legacy sources with terms and definitions
(inventories)
2. Transforming these into a common form of representation
3. Cleaning up the names of data elements
4. Selecting the subsets of data elements required for a theme
5. Integrating data sources into theme data models
6. Integrating the data source models into a consolidated data
source
7. Transforming the consolidated data source into a DWh
model


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying data source vision – a point to be
careful with

How can the contents of many data sources be mixed judiciously
while keeping the content organized?
Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)
 The operational DB data model focuses on
 Eliminating redundancies
 Coordinating DB updates
 Supporting transactions

 The DWh data model is oriented towards
 Handling a wide range of queries
 Retrieving information frequently


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 The operational DB data model contains
 Many update rules for consistent handling and for maintaining
referential integrity

 The DWh data model contains
 Few rules, to provide instant access without many joins


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 The operational DB data model is
 Just a product of the DB and developers
 Complex and big

 The DWh data model is
 Understood by end users, for easy and convenient
visibility and data access


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 Operational DBs hold
 Just what is needed for current operations, without supporting
unnecessary information

 DWh data are
 Both operational and historical
 In the order of MB to GB of volume


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 Operational DBs store
 Little derived data, calculating it on the fly

 DWh data stores hold
 Large amounts of derived data to save repetitive
computational effort


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 Operational DBs have
 All the data needed to support their operations

 DWh data stores have
 Stored time-referenced (historical) data


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying DWh vision – DWh model
vs Data Model (DM and Data Management)

 Operational DBs have
 Lightly summarized data for reports

 The DWh
 Pre-computes and stores highly summarized data


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision

Application of facts, called measurements, across
multiple dimensions


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision

 Also contain sub-queries or cut-offs

 They are starting points for the DWH schema definition



Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision

 What are the monthly costs of promotion by media type
in the last two years?

 What was the quarterly budget for each ad agency in the
last four quarters?

 What are the monthly sales by product line for each
district in the last 24 months?


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision
THEME AREAS (derived from reports): SALES, BUDGET, PROMOTION, PRODUCTS, MEDIA, TIME, AGENCY


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision

 What were the sales representative contracts for each of
the last 12 months?

 What were the quarterly sales for each district and
region in the last 8 quarters?

 What were the sales by product line for each district in
the past 24 months (in season only)?


Data Warehouse
Implementation Methodology
 Analysis Phase – specifying Business queries vision

THEME AREAS (derived from reports): CUSTOMER, ORDER, PRODUCTS, SALES, GEOGRAPHY, TIME


Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling

 Traditional modelling techniques

 Entity-Relation

 CASE tools



Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling
 Classify theme areas together with data sources
Theme Area | Data Source Technology | Object
CONTRACT | OLTP | Contract Table
CUSTOMER | OLTP | Customer Table
MARKET | Commercialisation |
ORDER | OLTP |
PRODUCT | Commercialisation/OLTP |
PROMOTION | Commercialisation |
SALES | Commercialisation/OLTP |
SHIPMENT | OLTP |
TIME | Commercialisation |


Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling

 Organize data into facts, dimensions and
measurements, using either a star or a snowflake
schema


Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling, Star
Schema
[Figure: star schema. The Sales fact table sits in the centre and is linked to four dimension tables.]
Sales fact table
 Keys: Time_Key, Product_Key, Store_Key, Location_Key
 Facts: unit_sales, dollar_sales, yens_sales
Dimension tables
 Time: many time attributes
 Product: many product attributes
 Store: many store attributes
 Location: many location attributes
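
The star schema of this slide can be written down as tables and queried, as a minimal sketch using SQLite from Python. The key and fact column names come from the slide; the dimension attributes (year, month, product_line, store_name, district) and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE product_dim  (product_key INTEGER PRIMARY KEY, product_line TEXT);
CREATE TABLE store_dim    (store_key INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, district TEXT);

-- Fact table: one foreign key per dimension plus the numeric facts
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim,
    product_key  INTEGER REFERENCES product_dim,
    store_key    INTEGER REFERENCES store_dim,
    location_key INTEGER REFERENCES location_dim,
    unit_sales   INTEGER,
    dollar_sales REAL,
    yens_sales   REAL
);

INSERT INTO time_dim VALUES (1, 2015, 1), (2, 2015, 2);
INSERT INTO product_dim VALUES (1, 'Toys'), (2, 'Books');
INSERT INTO store_dim VALUES (1, 'Store A');
INSERT INTO location_dim VALUES (1, 'North'), (2, 'South');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 1, 10, 100.0, 12000.0),
    (1, 1, 1, 1,  5,  50.0,  6000.0),
    (2, 2, 1, 2,  3,  30.0,  3600.0);
""")

# Monthly sales by product line for each district: one join per dimension used
rows = conn.execute("""
    SELECT t.year, t.month, p.product_line, l.district, SUM(f.dollar_sales)
    FROM sales_fact f
    JOIN time_dim t     ON t.time_key = f.time_key
    JOIN product_dim p  ON p.product_key = f.product_key
    JOIN location_dim l ON l.location_key = f.location_key
    GROUP BY t.year, t.month, p.product_line, l.district
    ORDER BY t.year, t.month, p.product_line, l.district
""").fetchall()
```

Every business query becomes the same pattern: join the fact table to the dimensions it mentions, group by the dimension attributes and aggregate the facts.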



Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling, Star
Scheme, how to improve performance?
 Defining aggregations of existing fact tables, or adding new
aggregate tables
 Segmenting the fact table so that most of the queries
access only a segment or partition
 Creating separate fact tables
 Creating unique numeric indexes
 Other techniques to improve join performance
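
The first and fourth techniques can be sketched together: an aggregate table is built once from the fact table and given a unique numeric index. The table names, the YYYYMMDD form of time_key and the sample rows are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (time_key INTEGER, product_key INTEGER, dollar_sales REAL);
INSERT INTO sales_fact VALUES
    (20150101, 1, 10.0),
    (20150102, 1, 15.0),
    (20150201, 1,  7.0),
    (20150201, 2,  3.0);

-- Aggregate table: one row per month and product, computed once
CREATE TABLE sales_month_agg AS
    SELECT time_key / 100 AS month_key,      -- integer division: YYYYMMDD -> YYYYMM
           product_key,
           SUM(dollar_sales) AS dollar_sales
    FROM sales_fact
    GROUP BY time_key / 100, product_key;

-- Unique numeric index on the aggregate's key columns
CREATE UNIQUE INDEX ix_sales_month_agg ON sales_month_agg (month_key, product_key);
""")

agg = conn.execute(
    "SELECT month_key, product_key, dollar_sales "
    "FROM sales_month_agg ORDER BY month_key, product_key"
).fetchall()
```

Monthly reports can then read the small aggregate table instead of scanning the detailed fact table; the price is that the aggregate has to be refreshed after each load.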



Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling,
Snowflake Schema
[Figure: snowflake schema. The Sales fact table is linked to dimension tables, some of which are normalized into further tables.]
Sales fact table
 Keys: Time_Key, Product_Key, Store_Key, Location_Key
 Facts: unit_sales, dollar_sales, yens_sales
Dimension tables
 Time (Time_Key)
 Product (Product_Key, Supplier_Key), linked to a Supplier table (Supplier_Key)
 Store (Store_Key)
 Location (Location_Key), linked to Country (Location_Key) and Region (Location_Key) tables
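
The practical difference from the star schema is the extra join through the normalized dimension. The sketch below shows only the Product and Supplier branch of the figure; the attribute names (product_name, supplier_name) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized branch: the supplier is kept in its own table
CREATE TABLE supplier    (supplier_key INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT,
                          supplier_key INTEGER REFERENCES supplier);
CREATE TABLE sales_fact  (product_key INTEGER REFERENCES product_dim,
                          unit_sales INTEGER);

INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product_dim VALUES (10, 'Widget', 1), (11, 'Gadget', 1), (12, 'Gizmo', 2);
INSERT INTO sales_fact VALUES (10, 4), (11, 6), (12, 5);
""")

# Units sold by supplier: fact -> product dimension -> supplier table
by_supplier = conn.execute("""
    SELECT s.supplier_name, SUM(f.unit_sales)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN supplier s    ON s.supplier_key = p.supplier_key
    GROUP BY s.supplier_name
    ORDER BY s.supplier_name
""").fetchall()
```

In a star schema the supplier name would be repeated on every product row and the second join would disappear; the snowflake trades that redundancy for one more join per query.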



Data Warehouse
Implementation Methodology
 Analysis Phase – Data Source & DWh Modelling,
Mixed scheme
 Fact and dimension tables with no normalization in the
star part

 Normalized dimension tables in the snowflake part


Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling
 Query Mold isolate dimensions subject area
 Allowing to find fact and dimension tables

Dimension

Theme area
Dimension

Dimension

30/08/2019 Data Modelling for BI 175


Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling
 What were the sales by product line and territory?

[Figure: query mold for the SALES theme area, with three dimensions.]
 TIME: Annual
 PRODUCT: Product line
 GEOGRAPHY: Territory


Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling
 Business queries consolidation
 Customer sales representatives by contract
 Orders by size by geography
 Income size per customer per year
 Regional sales by quarter
 Product sales per promotion event
 Percentage of shipments by shipment method per year


Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling
 Business queries consolidation
 Product sales by region by quarter
 Sales of competition by region per year
 Demographics by region and district
 Shipping costs by shipping method
 Volume of shipments by destination
 Shipments by shipment method per day


Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling – Star
Model
[Figure: star model built from the consolidated business queries, with nine dimensions.]
 Customer: orders, contracts
 Shipment method: Aerial express, Bus
 Order
 Product: product group, product line, product element
 Time: annual, quarterly, daily
 Geography: country, region, district
 Organization: division, district, seller
 Competition
 Promotion
Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling – Star
Model
From 9 to 6 dimensions

Aggregation exploration: combining two or more dimensions to
reduce their number




Data Warehouse
Implementation Methodology
 Analysis Phase – Business queries modelling – Star
Model
Useful for deepening requirements identification

Penetration of the query
• Complete query (all dimensions)
• Along one or more dimensions
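
The two ways of penetrating a query can be sketched as the same summarization applied to different sets of dimensions. The fact rows and dimension names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure
FACTS = [
    {"year": 2014, "product_line": "Toys", "territory": "North", "sales": 100},
    {"year": 2014, "product_line": "Toys", "territory": "South", "sales": 40},
    {"year": 2015, "product_line": "Books", "territory": "North", "sales": 60},
]


def summarize(facts, dimensions, measure="sales"):
    """Total the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Complete query: all dimensions
complete = summarize(FACTS, ["year", "product_line", "territory"])
# Query along a single dimension
by_line = summarize(FACTS, ["product_line"])
```

Walking through a query one dimension at a time in this way is a simple means of checking with the users which levels of detail they actually need.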

