Churn prediction
Supervisor: Dr. Sonali Agarwal
Pretam Jayaswal
ISE2013025
Overview
Basic Terms
Motivation
Importance of churn prediction
Selection of domain
Objective
Workflow
My approach
What is Churn
Churn is a word derived from Change and
Turn.
In the context of customer relationship management, it refers
to the discontinuation of a contract by the
customer.
Churn Prediction
Churn prediction is the task of identifying
the customers of a given service provider
who are likely to churn.
Nowadays, more and more companies are
focusing on CRM to prevent churn.
Types of Churn
There are three types of churning customers:
1. Active/Deliberate - The customer decides to
quit his contract and switch to another
service provider.
2. Rational/Incidental - The customer quits the
contract without the aim of switching to a
competitor.
3. Passive/Non-voluntary - The service
provider discontinues the contract itself.
Motivation
KDD Cup 2009 : Customer Relationship prediction
(15th ACM SIGKDD Conference on Knowledge Discovery
and Data Mining) [2]
Banking Sector
Insurance Companies
E-commerce Industry
Telecom Industry etc.
Objective
The objective of this thesis is to predict churning
customers with confidence, i.e. with higher accuracy,
to rate each customer with a churn likelihood, and to
assign them a relative score that identifies the churn
potential of customers in the telecom industry.
Dataset
The dataset that will be used is provided by
Orange Telecom for KDD cup 2009 problem.[2]
Both the training and test sets contain
approximately 50,000 examples.
Workflow
Ensemble approach
Ensembles are a divide-and-conquer approach used
to improve performance.
The main principle behind ensemble methods is that
a group of weak learners can come together to
form a strong learner.
Approach
Our major concern is better prediction, so I will
use an ensemble approach with a voting scheme
that combines multiple learning algorithms and tries
to obtain better predictive performance than
could be obtained from any of the constituent
learning algorithms alone.
My Approach : Step 1
Pre-Processing over Dataset
Traditional data pre-processing approaches are
used to handle data issues such as data cleaning
and reduction.
My Approach : Step 2
Partition the data into subsets.
We divide the entire dataset into M equally sized,
non-overlapping subsamples using an SRSWOR
(simple random sampling without replacement)
scheme.
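The partitioning step can be sketched as follows; this is a minimal illustration (not the thesis code), assuming NumPy and index-based partitioning, with the function name `partition_srswor` chosen here for illustration:

```python
import numpy as np

def partition_srswor(n_rows, m_subsets, seed=0):
    """Split row indices into m_subsets equal, non-overlapping parts (SRSWOR)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)           # draw all indices without replacement
    usable = n_rows - (n_rows % m_subsets)  # drop the remainder so parts are equal
    return np.split(idx[:usable], m_subsets)

# e.g. split a 50,000-row dataset into M = 5 subsamples of 10,000 rows each
parts = partition_srswor(50_000, 5)
```

Because each index appears exactly once across the permutation, the resulting subsamples are guaranteed to be non-overlapping.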
My Approach : Step 3
For each partitioned sample, build a
corresponding classifier (training).
Three supervised classification models will be used:
1) Decision Tree
2) Random Forest
3) Support Vector Machines
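Training the three model types could look like the sketch below, assuming scikit-learn (the slides do not name a library) and using a synthetic dataset as a stand-in for one partitioned subsample:

```python
# Sketch only: scikit-learn is an assumption; make_classification stands in
# for one partitioned subsample of the Orange dataset.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X_train, y_train = make_classification(n_samples=500, random_state=0)

# One instance of each of the three supervised models listed above.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(probability=True, random_state=0),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
```

In the ensemble scheme described here, one such set of classifiers would be trained per partitioned subsample.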
Decision Trees
A decision tree (DT) is a flowchart-like tree
structure, where each internal node denotes a test
on an attribute and each branch represents an
outcome of the test.
Decision trees are "white boxes" in the sense that
the acquired knowledge can be expressed in a
readable form.
Decision trees are quite robust to the presence of
noise in data.
Decision Trees
Decision trees are highly interpretable and
simple to grow.
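The "white box" readability mentioned above can be demonstrated with a small sketch, assuming scikit-learn and its bundled Iris dataset (chosen purely for illustration):

```python
# Illustration of decision-tree interpretability (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text renders the learned tree as human-readable if/else rules.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

The printed rules show exactly which attribute test each internal node applies, which is what makes the acquired knowledge expressible in readable form.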
Random Forest
Random forest uses a bagging approach for
classification.
In random forests, bagging is used in tandem
with random feature selection. Each new
training set is drawn with replacement from the
original training set.
Random Forest
The working of the random forest algorithm is as follows.
1. A random seed is chosen which pulls out at random a
collection of samples from the training dataset while
maintaining the class distribution.
2. With this selected data set, a random set of attributes
from the original data set is chosen based on user-defined
values. All the input variables are not
considered because of the enormous computation and
high chance of overfitting.
Random Forest
3. In a dataset where M is the total number of input
attributes, only R attributes are chosen
at random for each tree, where R < M.
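The R < M attribute sampling corresponds to the `max_features` parameter in scikit-learn's implementation (scikit-learn assumed here; the thesis does not specify a library):

```python
# Sketch: random forest with R < M features per split (scikit-learn assumed).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# max_features caps the attributes considered at each split (R < M);
# "sqrt" picks R = sqrt(M), a common choice. bootstrap=True gives the
# bagged samples; oob_score=True yields the internal error estimate.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            bootstrap=True, oob_score=True, random_state=0)
rf.fit(X, y)
print(rf.oob_score_)  # out-of-bag accuracy estimate
```

The out-of-bag score is one of the "useful internal estimates" that Breiman highlights as an advantage of random forests [6].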
RF advantages
Compared with AdaBoost, random forests have the
following desirable characteristics [6]:
Their accuracy is as good as AdaBoost's and
sometimes better.
They are relatively robust to outliers and noise.
They are faster than bagging or boosting.
They give useful internal estimates of error,
strength, correlation and variable importance.
They are simple and easily parallelized.
My Approach : Step 4
Evaluation of Classifier performance
(Validation).
For validation, cross-validation is used to ensure that
every example from the original dataset has the
same chance of appearing in the training and testing
sets.
k-fold cross-validation is used for training and
validation.
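A minimal sketch of k-fold validation, assuming scikit-learn and k = 5 (the slides do not fix k):

```python
# k-fold cross-validation sketch (scikit-learn assumed, k = 5 for illustration).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=0)

# Each of the 5 folds serves once as the validation set, so every example
# appears in both training and validation roles across the run.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
```

The mean of `scores` would then serve as the performance estimate that feeds the weighting scheme in the next step.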
My Approach : Step 5
Apply the weighting scheme to the classifiers
based on their performance.
The classifiers with higher prediction accuracy
will have higher weights and be more dominant;
poor classifiers will have lower weights.
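One simple way to realize such a weighting is to normalize the validation accuracies so the weights sum to 1; the accuracy values below are hypothetical placeholders:

```python
# Hypothetical validation accuracies for the three classifiers (illustrative).
accuracies = {"decision_tree": 0.78, "random_forest": 0.86, "svm": 0.81}

# Normalize so the weights sum to 1: more accurate models dominate.
total = sum(accuracies.values())
weights = {name: acc / total for name, acc in accuracies.items()}
```

Under this scheme a classifier's influence on the final vote is directly proportional to its measured accuracy.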
My Approach : Step 6
Generating the Collective decision
The final prediction on the test data set is
weighted according to these normalized
weights.
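The equation on the original slide did not survive extraction; the sketch below shows one way such a weighted vote can be computed, assuming binary labels (0 = stay, 1 = churn) and already-normalized weights:

```python
# Weighted majority vote over per-classifier predictions (0 = stay, 1 = churn).
def weighted_vote(predictions, weights):
    """predictions: {name: label}; weights: normalized {name: weight}."""
    score = sum(weights[name] * label for name, label in predictions.items())
    return 1 if score >= 0.5 else 0

# Two of three classifiers predict churn, carrying 0.7 of the total weight.
label = weighted_vote({"dt": 1, "rf": 1, "svm": 0},
                      {"dt": 0.3, "rf": 0.4, "svm": 0.3})
```

The weighted sum `score` can also be read directly as the churn-likelihood rating mentioned in the objective, before thresholding it into a hard label.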
Analysis
The outcomes of the proposed methodology
will be compared with existing state-of-the-art
research work to assess the advantages and
disadvantages of the thesis work.
Parameters
Customer demography
Bill and payment analysis
Call detail records analysis
Customer care/service analysis
References
1. Lu, Ning, et al. "A Customer Churn Prediction Model in Telecom Industry Using
Boosting." (2011): 1-1.
2. Dror, Gideon, et al. "The 2009 Knowledge Discovery and Data Mining
Competition (KDD Cup 2009)." (2011).
3. Bandara, W. M. C., A. S. Perera, and D. Alahakoon. "Churn prediction
methodologies in the telecommunications sector: A survey." Advances in ICT for
Emerging Regions (ICTer), 2013 International Conference on. IEEE, 2013.
4. Lazarov, Vladislav, and Marius Capota. "Churn prediction." Business Analytics
Course (2007).
5. Hung, Shin-Yuan, David C. Yen, and Hsiu-Yu Wang. "Applying data mining to
telecom churn management." Expert Systems with Applications 31.3 (2006): 515-524.
6. Breiman, Leo. "Random forests." Machine learning 45.1 (2001): 5-32.
Thank you