Article info
Keywords:
ANFIS
Data mining
Churn management
Telecom churn prediction
Soft computing
Abstract
Churn management is a critical issue for Global System for Mobile Communications (GSM)
operators, which must develop strategies and tactics to prevent their subscribers from switching to other
GSM operators. The first phase of churn management is profile creation for the subscribers. The profiling
process evaluates call detail data, financial information, calls to customer service, contract details, market
details, and geographic and population data of a given state. In this study, input features are clustered by
the x-means and fuzzy c-means clustering algorithms to put the subscribers into different discrete classes.
An Adaptive Neuro-Fuzzy Inference System (ANFIS) is then used to develop a sensitive prediction model
for churn management from these classes. The first prediction step uses parallel neuro-fuzzy classifiers.
A fuzzy inference system (FIS) then takes the outputs of the neuro-fuzzy classifiers as input to make a
decision about churners' activities.
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
Turkey's Global System for Mobile Communications (GSM) 1800 licenses were
distributed to ARIA and AYCELL, respectively, in 2000. Since then, the GSM
market has been forced to improve its quality of service
(QoS) and customer support. One of the major problems of
GSM operators has been churning customers. Churning means that
subscribers move from one operator to another
because they are dissatisfied with the service. For instance, cost of services, corporate capability, credibility,
customer communication, customer services, roaming and coverage, call quality, billing and
cost of roaming may be reasons to churn (Mozer et al., 2000). Hence
churn management has become an important issue for GSM operators to deal with. Churn management
includes monitoring the intentions of subscribers and offering new alternative campaigns
to improve subscriber expectations and satisfaction.
Quality metrics can be used to determine indicators that identify
inefficiency problems. Metrics of churn management are related
to the quality of network services, operations, and customer services. Mobility of GSM numbers is a
critical metric for determining churners. In Turkey, at the end of 2008, the Telecommunication
Regulation Committee decided that GSM subscribers can move to other
operators while keeping their original GSM numbers. These possible
churner activities should therefore be predicted in advance to prevent
the loss of subscribers to other GSM carriers.
When subscribers are clustered or predicted for the arrangement of campaigns, telecom operators
should focus on demographic data, billing data, contract situations, and number
Corresponding author.
E-mail address: akarahoca@bahcesehir.edu.tr (A. Karahoca).
0957-4174/$ - see front matter 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2010.07.110
Tailored data mining solutions can provide far more useful information to the carrier (Gerpott et al., 2001).
Hung et al. (2006) proposed a data mining solution that uses decision
trees and neural networks to predict churners, and assessed model
performance by LIFT (a measure of how well a model
segments the population) and hit ratio (true positives). Data
mining approaches that predict customer behavior
from call detail records (CDRs) and demographic data are considered in Wei and
Chiu (2002), Yan et al. (2005), and Karahoca et al. (2007).
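To make the LIFT measure mentioned above concrete, the following is a minimal sketch, assuming the common definition of lift as the churn rate among the top-scoring fraction of subscribers divided by the overall churn rate. The function name, the scores and the labels are hypothetical illustrations and are not taken from Hung et al. (2006) or from the data set used in this study.

```python
def lift_at_fraction(scores, labels, fraction=0.1):
    """Lift of the top-scoring `fraction` of subscribers.

    scores: predicted churn scores (higher means more likely to churn).
    labels: 1 for an actual churner, 0 otherwise.
    Assumes at least one churner in `labels`.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    k = max(1, int(round(n * fraction)))
    # Sort subscribers from highest to lowest score and keep the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / n
    return top_rate / base_rate


# Hypothetical toy data: ten subscribers, three of whom churned.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
top20_lift = lift_at_fraction(scores, labels, fraction=0.2)
```

A lift above 1 for a given fraction means that targeting that fraction by model score reaches more churners than targeting the same number of subscribers at random.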
In this study, the main motivation is to find the best data
mining model for churn management according to model
performance measures. We used data sets
obtained from the data warehouse system of a Turkish mobile telecom operator to analyze churner
activities and develop a data mining model.
The remainder of the paper is organized as follows. The data
preparation procedure is summarized in Section 2. The methods used, x-means, fuzzy c-means and
ANFIS, are explained in Section 3, together with the procedure for comparing data mining methods.
The findings of the study are given in Section 4, where benchmarking
methodologies are used to compare the prediction techniques. Conclusions are presented in Section 5.
Attribute               Meaning
City                    Subscriber's city
Age                     Subscriber's age
Occupation              Occupation code of subscriber
Home                    Home city of subscriber
Month-income            Monthly income of subscriber
Credit-limit            Credit limit of subscriber
Avglen-call-month       Average length of calls in last month
Avg-len-call4-6month    Average length of calls in last 4–6 months
Avg-len-sms-month       Number of SMS in last month
Avglensms4-6month       Number of SMS in last 4–6 months
Tariff                  Subscriber tariff
Marriage                Marital status of the subscriber
Child                   Number of child(ren) of the subscriber
Gender                  Gender of the subscriber
MonthExpense            Monthly expense of subscriber
Sublen                  Length of service duration
CustSegment             Customer segment for subscriber
AvgLenCall3month        Average length of calls in last 3 months
TotalSpent              Total expenditure of the subscriber
AvgLenSms3month         Number of sent SMS in last 3 months
GsmLineStatus           Status of subscriber line
Output                  Churn status of subscriber
Table 2
Ranked attributes.
Attribute name    Spearman's rho
Monthexpense      0.4804
Age               0.4154
Marriage          0.3533
Totalspent        0.2847
Month-income      0.2732
Custsegment       0.2477
Sublen            0.2304
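A ranking such as the one in Table 2 can be sketched as follows. This is a minimal illustration, assuming that each attribute is ranked by the absolute value of its Spearman's rho against the churn output; the column values below are hypothetical toy data, not the operator's records, and the helper names are illustrative only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start..end, 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


def rank_attributes(columns, target):
    """Sort attribute names by |rho| against the target, strongest first."""
    scored = {name: spearman_rho(col, target) for name, col in columns.items()}
    return sorted(scored.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy columns for five subscribers.
columns = {
    "Sublen": [3, 1, 5, 2, 4],
    "Age": [2, 1, 4, 3, 5],
    "MonthExpense": [1, 2, 3, 4, 5],
}
target = [0.1, 0.2, 0.3, 0.4, 0.5]
ranking = rank_attributes(columns, target)
```

Because the correlation is computed on ranks, the measure is insensitive to monotone rescaling of the coded attribute values.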
Table 3
Example data set.
Monthly Expense   Age   Marriage   Total spent   Monthly Income   Customer Segment   Sub Length   Output
2                 5     1          1             1                1                  3            0.984
2                 2     3          3             1                5                  1            0.280
1                 4     1          1             1                7                  4            0.808
3                 4     4          3             1                2                  2            0.064
3                 1     3          4             2                2                  4            0.712
3. Methods
According to a previous study (Karahoca et al., 2007),
the churner prediction process can be supported with ANFIS, which appears to
predict subscribers better than other prediction methods.
However, the prediction sensitivity and correctness of ANFIS reached only
85%, and a fixed set of three clusters (loyal, hopeless, lost) was used. To overcome this limitation,
the x-means and fuzzy c-means algorithms are used, respectively, to determine clusters effectively
and so provide better clustered inputs to the prediction model. In this
section, the x-means and fuzzy c-means algorithms are introduced, the ANFIS architecture is explained
in depth, and the benchmarking methodology for the data mining techniques is given.
3.1. x-Means clustering
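$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}, \qquad 1 \le m < \infty$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}, \qquad c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

This iteration will stop when $\max_{ij} \{ | u_{ij}^{(k+1)} - u_{ij}^{(k)} | \} < \varepsilon$, where $\varepsilon$
is a termination criterion between 0 and 1 and $k$ is the iteration step. This procedure converges to a
local minimum or a saddle point of $J_m$. The algorithm is composed of the steps
listed in Algorithm 2 (Dunn, 1973; Bezdek, 1981).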
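The membership and center updates above, together with the stopping rule, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the distance is Euclidean, the initial partition matrix is random, the fuzzifier is restricted to m > 1 so that the exponent 2/(m-1) is defined, and the function names and toy points are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def update_centers(points, u, m):
    """c_j = sum_i u_ij^m x_i / sum_i u_ij^m."""
    n_clusters, dim = len(u[0]), len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[i][j] ** m for i in range(len(points))]
        total = sum(weights)
        centers.append([
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ])
    return centers


def update_memberships(points, centers, m):
    """u_ij = 1 / sum_k (||x_i - c_j|| / ||x_i - c_k||)^(2/(m-1))."""
    power = 1.0 / (m - 1)  # applied to squared distances, hence 1/(m-1)
    u = []
    for x in points:
        d2 = [sq_dist(x, c) for c in centers]
        if any(d == 0.0 for d in d2):
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            u.append([h / sum(hits) for h in hits])
            continue
        u.append([1.0 / sum((dj / dk) ** power for dk in d2) for dj in d2])
    return u


def fuzzy_c_means(points, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    if m <= 1:
        raise ValueError("this sketch requires m > 1")
    rng = random.Random(seed)
    # Random initial partition matrix whose rows sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        u.append([v / s for v in row])
    for _ in range(max_iter):
        centers = update_centers(points, u, m)
        new_u = update_memberships(points, centers, m)
        # Stop when the largest membership change falls below eps.
        change = max(abs(a - b) for r1, r2 in zip(new_u, u) for a, b in zip(r1, r2))
        u = new_u
        if change < eps:
            break
    return centers, u


# Hypothetical one-dimensional toy data with two well-separated groups.
points = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
```

Each row of the returned partition matrix sums to 1, so a subscriber can belong partly to several clusters rather than to exactly one.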
The Improve-Params operation consists of running conventional K-means to convergence. The Improve-Structure operation
finds out if and where new centroids should appear.
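The Improve-Params step can be illustrated by a conventional K-means loop. The sketch below is a minimal version that assumes Euclidean distance and caller-supplied initial centroids; it deliberately leaves out the Improve-Structure step, so it is not a full x-means implementation, and the function name and toy points are hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until assignments are stable."""
    centroids = [list(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: sum((p - q) ** 2 for p, q in zip(x, centroids[j])))
            for x in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [x for x, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


# Hypothetical toy data: two groups of two points each.
toy_points = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]]
final_centroids, final_assignment = kmeans(toy_points, [[0.0, 0.0], [1.0, 1.0]])
```

In x-means, a loop of this kind would be rerun each time the structure step changes the number of centroids.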
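Repeat for t = 1, 2, ...
Step 1: Compute the cluster centers
$$c_i^{(t)} = \frac{\sum_{j=1}^{N} \left( \mu_{i,j}^{(t-1)} \right)^{m} x_j}{\sum_{j=1}^{N} \left( \mu_{i,j}^{(t-1)} \right)^{m}}, \qquad 1 \le i \le n_c.$$
Step 2: Compute the distances $D^2_{ijA}$ as:
$$D^2_{ijA} = (x_j - c_i)^T A \, (x_j - c_i), \qquad 1 \le i \le n_c, \; 1 \le j \le N.$$
Step 3: Update the fuzzy partition matrix:
$$\mu_{i,j}^{(t)} = \frac{1}{\sum_{k=1}^{n_c} \left( D_{ijA} / D_{kjA} \right)^{2/(m-1)}}$$
until $\lVert U^{(t)} - U^{(t-1)} \rVert < \varepsilon$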
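As a small worked illustration of the distance and partition-matrix update in the fuzzy c-means iteration, assume (hypothetically) that $A$ is the identity matrix, $m = 2$, $n_c = 2$, and that a data point $x_j$ lies at distances $D_{1jA} = 1$ and $D_{2jA} = 3$ from the two centers. These numbers are chosen only for illustration.

```latex
% With A = I the A-norm reduces to the squared Euclidean distance:
D^2_{ijA} = (x_j - c_i)^T I \, (x_j - c_i) = \lVert x_j - c_i \rVert^2 .

% With m = 2 the exponent is 2/(m-1) = 2, so
\mu_{1,j} = \frac{1}{(1/1)^2 + (1/3)^2} = \frac{1}{1 + \tfrac{1}{9}} = 0.9,
\qquad
\mu_{2,j} = \frac{1}{(3/1)^2 + (3/3)^2} = \frac{1}{9 + 1} = 0.1 .
```

The two memberships sum to 1, and the nearer center receives the larger share.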
If x is A and y is B then z = f(x, y),
where A and B are fuzzy sets in the antecedent and z = f(x, y) is a crisp
function in the consequent. Usually f(x, y) is a polynomial in the input variables x and y, but it can be
any other function that appropriately describes the output of the system within the fuzzy region
specified by the antecedent of the rule. When f(x, y) is a first-order polynomial, we have the first-order
Sugeno fuzzy model, which was originally proposed in Sugeno and Kang (1988) and Takagi and
Sugeno (1985). When f is a constant, we have the zero-order
Sugeno fuzzy model, which can be viewed either as a special case
of the Mamdani fuzzy inference system (Mamdani & Assilian,
1975), where each rule's consequent is specified by a fuzzy singleton, or as a special case of
Tsukamoto's fuzzy model (Tsukamato, 1979), where each rule's consequent is specified by a membership
function of a step function centered at the constant. Moreover, a
zero-order Sugeno fuzzy model is functionally equivalent to a radial
basis function network under certain minor constraints (Jang, 1993,
1996). Consider a first-order Sugeno fuzzy inference system which
contains two rules:
$$\mu_{A_i}(x) = \frac{1}{1 + \left[ \left( \dfrac{x - c_i}{a_i} \right)^{2} \right]^{b_i}}$$

$$w_i = \mu_{A_i}(x) \, \mu_{B_i}(y), \qquad i = 1, 2.$$

Layer 3: Every node in this layer is fixed and determines a normalized firing strength. It calculates the
ratio of the ith rule's firing strength to the sum of all rules' firing strengths.

$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \qquad i = 1, 2.$$

Layer 4: The nodes in this layer are adaptive and are connected
with the input nodes (of layer 0) and the preceding node of
layer 3. The result is the weighted output of rule i.

$$\bar{w}_i f_i = \bar{w}_i \, (p_i x + q_i y + r_i)$$

where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set.
Parameters in this layer are referred to as the consequent
parameters.

Layer 5: This layer consists of a single node which computes
the overall output as the summation of all incoming signals.

$$\text{Overall output} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
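The layer equations above can be traced with a short forward-pass sketch for a two-rule, two-input first-order Sugeno system. The premise and consequent parameter values below are hypothetical and are not trained values from this study; the function names are illustrative only, and no learning step is shown.

```python
def bell(x, a, b, c):
    """Generalized bell membership function with parameters a, b, c."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    premise:    per rule, a pair ((a, b, c) for A_i, (a, b, c) for B_i).
    consequent: per rule, a triple (p_i, q_i, r_i).
    """
    # Membership grades and firing strengths w_i = mu_Ai(x) * mu_Bi(y).
    w = [bell(x, *mf_a) * bell(y, *mf_b) for mf_a, mf_b in premise]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule outputs w_bar_i * (p_i x + q_i y + r_i).
    rule_out = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output as the sum of all incoming signals.
    return sum(rule_out), w_bar


# Hypothetical parameters: rule 1 is centered at 0, rule 2 at 2.
premise = [((1.0, 2.0, 0.0), (1.0, 2.0, 0.0)),
           ((1.0, 2.0, 2.0), (1.0, 2.0, 2.0))]
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 5.0)]
output, strengths = anfis_forward(1.0, 1.0, premise, consequent)
```

At x = y = 1 both rules fire equally in this toy setting, so the output is the plain average of the two rule outputs f_1 = 2 and f_2 = 5.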
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$

$$\mathrm{TNR} = \frac{TN}{TN + FP}$$

                  Actual class
                  +       -
Predicted   +     TP      FP
            -     FN      TN
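The rates above follow directly from the four cells of the confusion matrix. The sketch below computes them for hypothetical counts. Sensitivity (TPR) and specificity (TNR) use the formulas given above; the definitions used here for precision, TP/(TP + FP), and correctness, (TP + TN)/total, are the usual ones and are an assumption, since this section does not spell them out.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return the four benchmark measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                   # TPR
        "specificity": tn / (tn + fp),                   # TNR
        "precision": tp / (tp + fp),                     # assumed definition
        "correctness": (tp + tn) / (tp + fp + fn + tn),  # assumed: accuracy
    }


# Hypothetical counts for 100 subscribers.
metrics = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
```

For these counts the sketch yields a precision of 0.8 and a correctness of 0.85.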
Table 5
x-Means algorithm: Configuration parameters.
Parameter                                         Value
Requested iterations                              1
Iterations performed                              1
Splits prepared                                   2
Splits performed                                  2
Cutoff factor                                     0.5
Percentage of splits accepted by cutoff factor    0%
Cutoff factor                                     0.5
Cluster centers                                   4 centers
Table 6
Clusters.
Cluster   Mean   Standard deviation
0         0.55   0.31
1         0.29   0.12
2         1.0    0.71
3         0.60   0.11
ing partition matrix. The latter parameter takes its default value of 5 if it is not given by the x-means
algorithm. The function calculates with the standard Euclidean distance norm; the norm-inducing
matrix is an N × N identity matrix. The result of the
partition is collected in structure arrays, from which one can get the partition
matrix, the cluster centers, the squared distances, the number of iterations and the values of the
c-means functional at each iteration step.
Table 7 displays the number of data points in each cluster and their percentages. The distribution of
the data can be seen in Fig. 5.
In Fig. 5, the dots mark the data points and the 'o' symbols mark the cluster centers, which are the
weighted means of the data. The algorithm can
only detect clusters with a circular shape, which is why it cannot really
Table 7
Clustered instances.
Cluster   Clustered instances
0         45
1         12
2         15
3         28
discover the orientation and shape of the cluster at the lower right. The
circles in the contour map are slightly elongated, since the clusters
have an effect on each other. However, the fuzzy c-means algorithm
is a very good initialization tool for more sensitive methods.
Fig. 6 displays the distribution of the churner clusters obtained with the
fuzzy c-means algorithm. Finally, the software draws ellipses using
regularity vector variables; the results shown in Fig. 7 indicate that the desired criteria can be
implemented as chosen.
Fig. 7. Churner cluster data displayed with self-regulating ellipses.
Method                   Sensitivity   Specificity   Precision   Correctness
Training data
Ridor                    0.90          0.91          0.72        0.67
Decision Tree            0.85          0.84          0.73        0.72
ANFIS                    0.86          0.85          0.82        0.81
Fuzzy c-means + ANFIS    0.91          0.93          0.91        0.93

Table 9
Testing results for the methods used.
Method                   Sensitivity   Specificity   Precision   Correctness
Testing data
Ridor                    0.78          0.78          0.72        0.66
DT                       0.75          0.73          0.72        0.71
ANFIS                    0.85          0.88          0.81        0.80
Fuzzy c-means + ANFIS    0.91          0.93          0.92        0.93
Fig. 9 plots the input factors for fuzzy inference and the
output results under the given conditions. The horizontal axis shows the
attributes extracted in Table 2. The fuzzy inference diagram is the composite of all the factor diagrams;
it simultaneously displays all parts of the fuzzy inference process. Information flows sequentially
through the fuzzy inference diagram.
ANFIS creates membership functions for each input variable.
The graphs show the membership functions of the Marital Status, Age, Monthly Expense and Customer
Segment variables. For these properties, the changes of the final (after training) generalized
membership functions with respect to the initial (before training)
generalized membership functions of the input parameters were
examined.
Whenever an input factor has an above-average effect, its membership function shows
considerable deviation from the original curve. We can infer from
the membership functions that these properties have a considerable
effect on the final decision of the churn analysis, since their shapes change significantly.
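One simple way to quantify how far a trained membership function has moved from its initial curve is the largest absolute gap between the two curves over the input range. The sketch below assumes Gaussian membership functions described by a center and a width; the deviation measure, the parameter values and the function names are hypothetical illustrations and are not the procedure used to produce the figures in this paper.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership function value at x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def max_deviation(initial, final, lo, hi, steps=200):
    """Largest absolute gap between two membership curves on [lo, hi].

    initial, final: (center, sigma) pairs before and after training.
    """
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(abs(gaussian_mf(x, *initial) - gaussian_mf(x, *final)) for x in grid)


# Hypothetical parameters on a normalized input range [0, 1].
unchanged = max_deviation((0.5, 0.2), (0.5, 0.2), 0.0, 1.0)
shifted = max_deviation((0.5, 0.2), (0.7, 0.1), 0.0, 1.0)
```

A value near 0 means training left the curve essentially as it was, while a large value corresponds to the visibly reshaped curves discussed above.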
In Figs. 10–13, the vertical axis is the value of the membership function and the horizontal axis
denotes the value of the input factor.
Marital Status is an important indicator for churn management;
it shows considerable deviation from the original Gaussian curve
during the iterative process, as seen in Fig. 10.
Fig. 11 displays the initial and final membership functions. As
expected, Age Group was found to be an important indicator for identifying
churn. In the network, monthly expense is another of the factors that most affect the
final model. The resultant membership function is displayed in
Fig. 12.
The subscriber's customer segment also critically affects the model.
As seen in Fig. 13, the deviation from the original curve is significant. The
attributes represented in Figs. 10–13 have the highest effect on the final
classification; the training process changed their membership functions
significantly, giving their values more emphasis in the final decision.
Using this ANFIS structure, the following results were obtained when
analysing Receiver Operating Characteristics (ROC).
ROC analysis, a well-established technique in diagnostics, was used for model assessment.
Fig. 14 illustrates the ROC curves for the best four methods, namely
fuzzy c-means + ANFIS, ANFIS, RIDOR and Decision Trees. The fuzzy
c-means + ANFIS method is far more accurate where a smaller
false positive rate is critical. In this situation where preventing
Fig. 14. ROC curve for c-means_ANFIS, ANFIS, RIDOR and Decision Trees.
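An ROC curve of the kind used for model assessment here can be traced by sweeping a decision threshold over the predicted churn scores. The sketch below is a minimal illustration with hypothetical scores and labels; the function names are illustrative, it assumes both classes are present, and it is not the tool used to draw Fig. 14.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for idx, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last subscriber sharing this score.
        if idx + 1 == len(ranked) or ranked[idx + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six subscribers, three of whom churned.
roc_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
roc_labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(roc_scores, roc_labels)
area = auc(curve)
```

A curve that rises steeply near the left edge corresponds to a model that finds many churners while keeping the false positive rate small.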