Analysis Process Designer and Data Mining
Agenda
Exercise
[Figure: analysis process in SAP BW. Data flows from sources through transformations to a
target and yields new insights; selection, update and analysis take place in SAP BW.]
[Figure: SAP BW data flow. Data extraction from disparate data sources; data storage,
consolidation and structuring; preparation and delivery of the information (reports and
analyses). Further labels: "Enhanced Transfer Rules", Analysis Process Repository.]
Data Base Tables: Read data from a database table (hidden with BW 3.5)
Preparation (Step 2):
Transformation (Step 3):
To facilitate the discovery process, the APD provides visualization tools with
which intermediate results can easily be displayed or the quality of the data
can be analyzed:
Display Data: the data at each "process node" can be displayed at any time in tabular
format.
Elementary Statistics: visualization methods for a quick view of the basic
properties and quality of the interim results at each "process node". This functionality
includes histograms, distributions and basic statistical measures such as means,
standard deviations and correlations.
Calculate Intermediate Results: the interim results of each "process node" can
be stored temporarily for performance reasons.
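The kind of measures listed under Elementary Statistics can be illustrated outside SAP BW with a minimal Python sketch. This is not the APD implementation: the function names, the column names `age` and `revenue` and the sample rows are all hypothetical and only show what a mean, a standard deviation and a correlation over the interim results of a node look like.

```python
import math
from statistics import mean, stdev


def elementary_statistics(rows, columns):
    """Return mean, sample standard deviation, min and max per column."""
    stats = {}
    for col in columns:
        values = [row[col] for row in rows]
        stats[col] = {
            "mean": mean(values),
            "stdev": stdev(values),
            "min": min(values),
            "max": max(values),
        }
    return stats


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical interim result of one process node.
rows = [
    {"age": 25, "revenue": 100.0},
    {"age": 35, "revenue": 200.0},
    {"age": 45, "revenue": 300.0},
]

stats = elementary_statistics(rows, ["age", "revenue"])
corr = pearson([r["age"] for r in rows], [r["revenue"] for r in rows])
print(stats)
print("correlation age/revenue:", corr)
```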
At first sight the APD seems to offer functions similar to those performed by ETL
tools in Data Warehousing solutions. Nevertheless, it is important to note that these
are two completely different applications with different objectives.
1. ETL Process
Extraction: data procurement, that is, selection of the relevant data from the source
systems and supply to the data work area.
Loading: bringing the data physically from the work area into the Data
Warehouse.
2. APD Process
In an Analysis Process, existing data in SAP BW (the Data Warehouse) are accessed,
completely new data are created by means of specialized transformations, and only the
new data are written back into the database of SAP BW or an operational system.
[Figure: ETL Process compared with APD Process]
The most important potentials that can be realized with this improved
information are:
Being 100% integrated into SAP BW, the Analysis Process
Designer (incl. Data Mining features) also guarantees that only a single
database is accessed rather than different data tables in different source
systems. This significantly reduces interfacing problems as well as
related issues with data integrity, quality and system performance.
Exercise
Data mining not only provides insights by analyzing past data; it is
also capable of predicting future trends and behaviors.
[Figure: data mining process cycle around the data, with the labels Data Preparation,
Modeling, Evaluation and Deployment.]
Business Understanding
Description of the Business Objectives and Data Mining Goals / Success
Data Understanding
Selection of the data and exploratory analysis (quality, problems, description of selected
data)
Data Preparation
Cleaning, transformation, integration, formatting of the selected data
Modeling
Selection, building, testing and running different models
Evaluation
Approval of the models and assessment of the results (in accordance with the defined
objectives), review of the process
Deployment
Preparation of final reports, presentation, action plans and deployment of results
There are several divergent KDD methodologies / models. Nearly all can
be summarized / synthesized into the following 5 main phases:
Data Mining methods have been available in SAP BW since Release 3.0B:
ABC Classification
Association Analysis
Regression
Decision Tree
Clustering
Training and application of the SAP Data Mining Methods via a separate
Workbench (Data Mining Workbench)
Decision Tree
Description: a tree-like way of representing a collection of hierarchical rules that
lead to a class or value.
Example use: identification of behavior patterns, e.g. churn behavior, satisfaction
analysis, risk analysis.
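The idea of hierarchical rules leading to a class can be sketched in a few lines of Python. The tree below is hand-written for illustration, not trained, and it is not the SAP Data Mining model format; the attributes `complaints` and `contract_months`, the thresholds and the class labels are hypothetical.

```python
# Each inner node tests one attribute against a threshold; each leaf holds a class label.
# Hypothetical churn rules: few complaints -> stays; many complaints and a short contract -> churns.
tree = {
    "attribute": "complaints",
    "threshold": 2,
    "low": {"label": "stays"},          # complaints <= 2
    "high": {                           # complaints > 2
        "attribute": "contract_months",
        "threshold": 12,
        "low": {"label": "churns"},     # contract_months <= 12
        "high": {"label": "stays"},     # contract_months > 12
    },
}


def classify(node, record):
    """Follow the rules from the root down to a leaf and return its class label."""
    while "label" not in node:
        branch = "low" if record[node["attribute"]] <= node["threshold"] else "high"
        node = node[branch]
    return node["label"]


customer = {"complaints": 5, "contract_months": 6}
print(classify(tree, customer))
```

In a real project the tree would be learned from training data; the sketch only shows how a finished tree is applied to a record.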
Clustering
Description: clustering serves to segment data into so-called clusters in such a way
that data of similar content are assigned to one cluster, while the clusters differ
from one another as far as possible.
Example use: an insurance company can use clustering to create customer groups with
respect to income, age, insurance policy and known cases of damage. In this way it is
possible to identify which combinations of certain characteristics often occur
together and form corresponding customer segments.
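As a rough illustration of segmenting customers into clusters, the following sketch uses plain k-means. The source does not say which clustering algorithm SAP BW uses, so k-means is an assumption chosen for brevity; the customer tuples (age, income in thousands) and the deterministic initialization with the first k points are also hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; returns (assignment, centroids)."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        assignment = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Hypothetical customers: (age, yearly income in thousands).
customers = [(25, 30), (27, 32), (26, 31), (55, 80), (58, 85), (57, 82)]
assignment, centroids = kmeans(customers, k=2)
print(assignment)
print(centroids)
```

With real data the attributes would normally be scaled first so that no single characteristic dominates the distance.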
ABC Classification
Description: the ABC classification is a frequently used analysis method for
classifying objects (customers, products or colleagues) on the basis of a certain
measure, such as revenue or profit.
Example use: customers can be classified into three classes (A, B, C) according to
the amount of turnover realized with the company.
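A minimal sketch of an ABC classification by cumulative revenue share follows. The thresholds of 80% and 95% are a common convention assumed here, not values taken from the source, and the function name and customer figures are hypothetical.

```python
def abc_classify(revenue_by_object, a_share=0.80, b_share=0.95):
    """Classify objects into A, B, C by their cumulative share of total revenue.

    Objects are sorted by revenue in descending order. Those within the first
    a_share of the total are class A, those up to b_share are class B, the rest C.
    The threshold values are assumptions for this sketch.
    """
    total = sum(revenue_by_object.values())
    classes = {}
    cumulative = 0.0
    ranked = sorted(revenue_by_object.items(), key=lambda kv: kv[1], reverse=True)
    for name, revenue in ranked:
        cumulative += revenue
        share = cumulative / total
        if share <= a_share:
            classes[name] = "A"
        elif share <= b_share:
            classes[name] = "B"
        else:
            classes[name] = "C"
    return classes


# Hypothetical turnover per customer.
turnover = {"C1": 500, "C2": 250, "C3": 150, "C4": 60, "C5": 40}
classes = abc_classify(turnover)
print(classes)
```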
Association Analysis
Description: association analysis serves to find regularities, above all in business
operations, and to formulate corresponding rules such as "if a customer buys
product A, he also buys products B and C".
Example use: association analysis helps to find, for example, cross-selling
opportunities. The identified rules can be used to arrange associated products
together in a catalogue, supermarket or web shop, or to systematically offer
product C to customers who have already bought product A.
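Rules of the form "if a customer buys A, he also buys B" are usually rated by their support and confidence. The sketch below computes both for a hypothetical list of shopping baskets; it only evaluates a given rule and is not a full rule-mining algorithm or the SAP implementation.

```python
def support(transactions, items):
    """Fraction of transactions that contain all of the given items."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]

rule_support = support(baskets, {"A", "B"})
rule_confidence = confidence(baskets, {"A"}, {"B"})
print("support(A and B):", rule_support)
print("confidence(A -> B):", rule_confidence)
```

A rule with high confidence but very low support rests on few baskets, so both figures are normally checked against minimum values before a rule is used for cross-selling.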
[Figure: ABC Analysis with the APD. Step 1: Select Data (SAP BW); Step 2: Preparation;
Step 3: Transformation (ABC Analysis into classes A, B, C); Step 4: Store/Transfer Data
(SAP BW, CRM, other systems); Step 5: Deploy Data (campaign target groups, e.g. in CRM).]
Exercise