
BANA 104 (3 Units)

Fundamentals of Predictive Analytics

By: Danilo “Sir Dan” Dumantay, MBA, CPA, CGFM, AIF


Predictive Analytics Outline
A. Review of Business Analytics
B. Understanding Predictive Analytics
B1. Practical Illustrations
B2. Introduction, Definition, Evolution
B3. Uses
C. Getting Started with Predictive Analytics
D. Data Preprocessing
E. Understanding Data Mining
F. Using Predictive Tools
G. Interacting With Predictive Analytics Software
(Power BI)
H. Case Study
Understanding Data Mining

1. Introduction of Data Mining
2. Definition of Data Mining
3. Importance of Data Mining
4. Goal of Data Mining
5. Types of Data Mining Algorithm
– Supervised Learning
– Unsupervised Learning
Introduction of Data Mining
Data mining is one of the “10 emerging
technologies that will change the world” listed by
the MIT Technology Review (Larose).

It is no surprise that many firms embrace data
mining in their operations. An article in
Information Systems Management points out that
“data mining has become a widely accepted
process for organizations to enhance their
organizational performance and gain a competitive
advantage.”
Introduction of Data Mining
What is Data Mining in
Business?

• Decision making
• Marketing
• Detecting Fraud
Data mining technology is popular with many
businesses because it allows them to learn
more about their customers, prevent fraud and
identity theft, and make smart marketing
decisions.
Definitions of Data Mining
Data Mining is the analysis step of
Knowledge Discovery in Databases, or KDD.
• It is the core of the KDD process, in which
algorithms explore the data, develop the
model, and discover previously unknown
patterns.
• Algorithm: a process or set of rules to be
followed in calculations or other problem-solving
operations, especially by a computer.
Definitions of Data Mining
Data Mining is the process of discovering
new, hidden or unexpected patterns and
inferring associations in raw data.
Data Mining is a collection of powerful
techniques intended to analyse large
amounts of data.
There is no single Data Mining approach.
Data Mining can employ a range of
methods, either individually or in
combination with each other.
Importance of Data Mining
Data are being generated in enormous
quantities
Data are being collected over long periods
of time
Data are being kept for long periods of
time
Computing power is formidable and
cheap
A variety of Data Mining software is
available
Goal of Data Mining
The overall goal of the data mining process is to
extract information from a data set and transform
it into an understandable structure for further use
and action.

Predictive Analytics
• Discovering meaningful new correlations and patterns
• Discovering trends
• Forecasting

Types of Data Mining Algorithms
Supervised learning
• Classification
• Regression

Unsupervised learning
• Association Analysis
• Sequential Pattern Analysis
• Clustering
• Text Mining/Social Media Sentiment
Analysis
Supervised Learning
• In supervised learning, the correct outputs
(labels) are provided with the data and are used
to train the machine to produce the desired outputs.
• In supervised learning, there is a given data
set, and what the correct output should look
like is already known.
• In supervised learning, there is a
relationship between the input and the
output.
Unsupervised Learning
• In unsupervised learning, no output labels
are provided; instead, the data is
clustered into different groups.
• Unsupervised learning allows us
to approach problems with little or no
idea of what our results should look like.
We can derive structure from data
where we don't necessarily know the
effect of the variables.
Illustration: Supervised vs Unsupervised
Situation:
• A basket full of fresh fruits (apple, banana, cherry,
grape, orange).
• The task is to group fruits of the same type together.

Supervised learning:
• From previous work, you already know the
shape of each fruit, so it is easy to arrange
fruits of the same type in one place.
• Here, your previous work is called training data
in data mining.
• You have already learned from your training data
(i.e., the features of each fruit).
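
The supervised case above can be sketched in a few lines of Python. This is only an illustration, not a method prescribed by the course: the measurements are made up, and the 1-nearest-neighbor rule is one assumed way of "using what you already know" about each fruit's shape.

```python
import math

# Hypothetical training data: (length_cm, width_cm) -> known fruit label.
# The numbers are invented for illustration only.
TRAIN = [
    ((7.5, 7.0), "apple"),
    ((8.0, 7.5), "apple"),
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
    ((2.0, 2.0), "cherry"),
    ((2.2, 2.1), "cherry"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(TRAIN, key=lambda row: math.dist(row[0], features))
    return nearest[1]

print(predict((19.0, 3.8)))  # a long, thin fruit
print(predict((2.1, 1.9)))   # a small, round fruit
```

Because every training example carries its label, a new fruit is simply assigned the label of the most similar known fruit.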
Illustration: Supervised vs Unsupervised, continued
Unsupervised Learning:
• You have no knowledge about the fruits. So, how will
you arrange fruits of the same type?
• To arrange them, select any physical characteristic of
the fruits.
• If color, the fruits will be arranged based on color:
– Red group: apples and cherries.
– Green group: bananas and grapes.
• Then select another physical characteristic, e.g., size:
– Red and big group: apples.
– Red and small group: cherries.
– Green and big group: bananas.
– Green and small group: grapes.
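
The grouping steps above can be mirrored with a small Python sketch. It assumes each fruit is recorded only by its color and size (no fruit names), and the helper `group_by` is a hypothetical name introduced here. Real clustering algorithms usually work on numeric distances; this sketch only reproduces the slide's idea of grouping by shared characteristics.

```python
from collections import defaultdict

# Unlabeled observations: physical characteristics only, no fruit names.
# Colors and sizes follow the example in the slide.
fruits = [
    {"color": "red", "size": "big"},
    {"color": "red", "size": "small"},
    {"color": "green", "size": "big"},
    {"color": "green", "size": "small"},
    {"color": "red", "size": "big"},
]

def group_by(items, keys):
    """Group items that share the same values for the given characteristics."""
    groups = defaultdict(list)
    for item in items:
        groups[tuple(item[k] for k in keys)].append(item)
    return dict(groups)

by_color = group_by(fruits, ["color"])               # first pass: color only
by_color_size = group_by(fruits, ["color", "size"])  # second pass: color and size

print(sorted(by_color))
print(sorted(by_color_size))
```

Adding a second characteristic splits each color group further, which is how the four groups in the illustration arise without anyone naming the fruits in advance.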
Categories of Supervised Learning
Supervised learning problems are categorized into
“classification” and “regression” problems.

• In a classification problem, we predict
results in a discrete output, i.e.,
mapping input variables into discrete
categories.
• In a regression problem, we predict
results within a continuous output, i.e.,
mapping input variables to some
continuous function.
Categories of Supervised Learning, continued
Given: Data about the size of houses on the real
estate market.
• If predicting whether the
house “sells for more or less than the
asking price,” it is a classification
problem.
• If predicting the price of the house, it is a
regression problem because price as a
function of size is a continuous output.
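
The difference between the two problem types lies in the target being predicted, which the following Python sketch tries to make concrete. The listings and prices are hypothetical, and the "predictions" are naive baselines (most common class, average price) chosen only to show a discrete versus a continuous output; they are not real models.

```python
from collections import Counter
from statistics import mean

# Hypothetical listings: (size_sqm, asking_price, sale_price). Values are made up.
houses = [
    (80, 200_000, 210_000),
    (100, 250_000, 245_000),
    (120, 300_000, 320_000),
    (150, 380_000, 390_000),
]

# Classification target: one discrete category per house.
# Assumption: a sale exactly at the asking price is counted as "less".
class_target = ["more" if sale > ask else "less" for _, ask, sale in houses]

# Regression target: one continuous number per house.
reg_target = [sale for _, _, sale in houses]

# Naive baselines, for illustration only.
majority_class = Counter(class_target).most_common(1)[0][0]  # a category
mean_price = mean(reg_target)                                # a number

print(class_target, majority_class)
print(reg_target, mean_price)
```

The same houses feed both problems; only the kind of answer changes.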
Supervised Learning: Classification
1. Classification
– Data mining task of predicting the
value of a categorical variable by
building a model based on one or
more numerical and/or categorical
variables.
– Classify a data item into one or several
predefined classes.
Supervised Learning: Regression
2. Regression
• Data mining task of predicting the value of a
numerical variable by building a model based
on one or more predictors (numerical and
categorical variables).
• Examples:
– Predicting sales amounts of a new product
based on advertising expenditure.
– Predicting wind velocities as a function of
temperature, humidity, air pressure, etc.
– Time series prediction of stock market
indices.
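
As a minimal sketch of the first example, the snippet below fits a straight line to sales versus advertising expenditure using ordinary least squares. The data are invented and deliberately lie on a perfect line; the function name `fit_line` and the single-predictor setup are assumptions made for illustration, not part of the course material.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend and resulting sales (same currency units).
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 60 + intercept  # predicted sales for a spend of 60

print(slope, intercept, predicted)
```

The output is a number on a continuous scale, which is what distinguishes this task from classification.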
Questions?

