Diabetics Prediction by Using Feature Selection Based on Coefficient of Variation
Simon Fong (1), Justin Liang (1), Suash Deb (2)
(1) Department of Computer Science, University of Macau, Taipa, Macau SAR
(2) Department of Computer Science and Engineering, Cambridge Institute of Technology, Ranchi, India
Abstract
Diabetes has become a prevalent metabolic disease, affecting patients of all age groups and large populations around the world. Early detection facilitates early treatment, which improves the prognosis. In the computational intelligence and medical care literature, different techniques have been proposed for predicting diabetes based on historical records of related symptoms. These efforts share a common goal of improving the accuracy of diabetes prediction models. In addition to the model induction algorithm, feature selection is a significant step: it retains only the relevant attributes for building a quality prediction model. In this paper, a novel and simple feature selection criterion called Coefficient of Variation (CV) is proposed as a filter-based feature selection scheme. Under the CV method, attributes whose data dispersion is too low are disqualified from the model construction process, thereby discarding attributes that would otherwise lead to poor model accuracy. The computation of CV is simple, enabling an efficient feature selection process. Computer simulation experiments on the Pima Indian diabetes dataset are used to compare the performance of CV with other traditional feature selection methods. Superior results by CV are observed.
2013 Elsevier Science. All rights reserved.
Keywords: Diabetes prediction; Classification; Data Mining; Feature Selection.
1. Introduction
Diabetes is a global health concern in both developed and developing countries, and its prevalence is rising. In the UK alone, 2.9 million people were suffering from diabetes mellitus in 2011, constituting 4.45% of the population [1]. By 2025, 5 million people in the UK are projected to be afflicted with diabetes. This incurable metabolic disorder is chronic and characterized by deficiency of insulin secretion or insensitivity of the body tissues to insulin. The former is known as Type-I insulin-dependent diabetes mellitus (IDDM), in which the body fails to produce sufficient insulin due to autoimmune destruction of pancreatic beta-cells. As a result, the patients' body cells may wither because, without this important hormone, they cannot absorb the needed amount of glucose from the bloodstream. The second type, Type-II non-insulin-dependent diabetes mellitus, is usually associated with obesity and lack of bodily exercise. It inevitably leads to insulin treatment, probably life-long. Early detection of diabetes has become vital, and detection techniques have been maturing over the years. However, it is reported that about half of the patients with Type-II diabetes are undiagnosed, and the latency from disease onset to diagnosis may exceed a decade [2, 3]. Therefore, the importance of early prediction and detection of diabetes, which enables timely treatment of hyperglycaemia and related metabolic abnormalities, is escalating.
In light of this motivation, diabetes prediction models are being formulated and developed in the machine-learning research community, with claims of being able to predict blood glucose based on the historical records of diabetes patients and their relevant attributes. One of the most significant works is by Jan Maciejovski [4], who formulated predictive diabetic control by using a group of linear and non-linear programming functions that take variables and constraints into consideration. Another direction related to blood glucose prediction is time-series forecasting [5], which takes into account the measurements of past blood glucose cycles in order
[Proceedings of International Conference on Computing Sciences, WILKES100 ICCS 2013. ISBN: 978-93-5107-172-3. Elsevier Publications, 2013, p. 265.]
to do some short-term blood glucose forecasting. Another popular choice of algorithm for implementing a blood glucose predictor is the artificial neural network [6, 7, 8], which non-linearly maps daily regimens of food, insulin and exercise expenditure as inputs to a predicted output. Although neural network predictors can usually achieve relatively high accuracy (88.8% as in [8]), the model itself is a black box: the decision-making logic is mathematical inference, for example the numeric weights associated with each neuron and the non-linear activation function. Recently some researchers have advocated the applicability of decision trees in predicting diabetic prognosis, such as the batch-training model [9] and the real-time incremental training model [10]. The resultant decision tree takes the form of IF-THEN-ELSE predicate rules, which are descriptive enough for decision support when embedded in a predictor system, as well as for reference and study by clinicians. However, one major drawback of decision trees is the selection of appropriate data attributes or features, which should be general enough to model the historical cases while providing sufficiently high prediction accuracy on unseen cases.
Potentially there exist many factors (so-called features) for the analysis and diagnosis of diabetes; these factors may be direct physiological symptoms or lifestyle habits that contribute to the disease. However, there is no standard rule-of-thumb for deciding which of these factors to include in model induction [11], given that different physicians may have their own opinions. When, for convenience, all the available features are included in model construction, quite often some of them turn out to be insignificant or irrelevant. Consequently the accuracy of the prediction model drops, because the inappropriate features may add randomness to the data or their values may lead to biased results. Although the topic of feature selection has been widely studied, to the best of the authors' knowledge a comprehensive evaluation of feature selection methods pertaining to neural network and decision tree classification has not been done so far. Existing research works focus either on a single classification model, especially the support vector machine (SVM), or on a few feature selection techniques. For instance, the research teams of [12, 13, 14] dedicated their efforts to solving feature selection with an SVM classifier and its variants. Huang et al. [15] studied the diabetes prediction problem with a variety of classifiers, such as the CART decision tree, but only a single feature selection method, ReliefF, was used. In this paper, we propose a novel feature selection method called Coefficient of Variation (CV), which is characterized by simple and efficient computation. In comparison to other popular feature selection methods, which are based on calculating the information gain of, or the correlation among, attributes and the target classes, CV only calculates the ratio of the standard deviation to the mean of each attribute column. The underlying principle is that a good attribute should have data that vary considerably in value and spread adequately over a certain range, in order to characterize a quality prediction model. Conversely, an attribute whose data values diverge insufficiently implies that certain bias may exist in the data; at the least, such an attribute contributes little to the generality of the induced model because of the relatively narrow data range it covers. In the context of stochastic optimization, a model induced from such data would likely fall into a local optimum.
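The CV filtering principle just described can be sketched in a few lines. This is a minimal illustration, not the authors' Weka plug-in, and the cut-off threshold of 0.1 is an arbitrary value chosen for the toy data:

```python
import numpy as np

def cv_scores(X):
    """Coefficient of variation (std / mean) of each attribute column."""
    return X.std(axis=0) / X.mean(axis=0)

def select_by_cv(X, threshold):
    """Disqualify attributes whose data dispersion (CV) is too low."""
    keep = cv_scores(X) > threshold
    return X[:, keep], keep

# Toy data: column 0 varies widely, column 1 barely varies.
X = np.array([[1.0, 10.0],
              [5.0, 10.1],
              [9.0, 10.2]])
X_sel, mask = select_by_cv(X, threshold=0.1)  # only column 0 survives
```

A real pipeline would apply the surviving mask to the training set before model induction.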
The remainder of this paper is structured as follows. Section 2 describes the diabetes dataset and the related data mining techniques to be applied in the experiment, which together constitute the data mining framework. Section 3 reports the results of the experiment, followed by some discussion. Section 4 concludes the paper.
2. Data Mining Framework
This section describes the data used in the experiment, the feature selection algorithms and the model induction algorithms. Although this pipeline is commonly known as knowledge discovery in databases (KDD), the details, especially the attributes, are presented explicitly so that the efficacy of the CV feature selection method that follows can be appreciated.
2.1. Diabetes Dataset
One of the most popularly used datasets for testing machine learning algorithms for diabetes prediction is the Pima Indian diabetes dataset [16]. The dataset is generally challenging for building a highly accurate model because none of the attributes has a profound relation to the predictable class, though these attributes are believed to be subtly related to the disease. The dataset has eight potentially useful attributes or features that describe 768 sample cases, each labelled as diabetic or not. The binary class comprises 500 normal cases and 268 abnormal cases (diagnosed with Pima Indian diabetes), a moderately balanced distribution of 65.10% : 34.90%. The diabetes cases are those diagnosed with diabetes onset within five years. The data are owned by Peter Turney, National Institute of Diabetes and
Digestive and Kidney Diseases, and the database was donated by Vincent Sigillito, Research Center, RMI Group Leader, Applied Physics Laboratory, The Johns Hopkins University, Johns Hopkins Road, Laurel, United States. The eight features are briefly described as follows, together with their acronyms.
Feature 1 preg: Number of times pregnant
Feature 2 plas: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
Feature 3 pres: Diastolic blood pressure (mm Hg)
Feature 4 skin: Triceps skin fold thickness (mm)
Feature 5 insu: 2-Hour serum insulin (mu U/ml)
Feature 6 mass: Body mass index (weight in kg/(height in m)^2)
Feature 7 pedi: Diabetes pedigree function
Feature 8 age: Age (in years)
Class class: binary class variable (0 or 1)
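For readers reproducing the experiment outside Weka, the eight attributes and the class label can be read from the commonly distributed comma-separated form of the dataset. The file name used here is an assumption for illustration:

```python
import csv

# Acronyms of the eight features, in the column order listed above.
FEATURES = ["preg", "plas", "pres", "skin", "insu", "mass", "pedi", "age"]

def load_pima(path="pima-indians-diabetes.csv"):
    """Return the feature rows and 0/1 class labels of the 768 cases."""
    X, y = [], []
    with open(path) as f:
        for row in csv.reader(f):
            X.append([float(v) for v in row[:8]])  # eight numeric features
            y.append(int(row[8]))                  # class: 0 normal, 1 diabetic
    return X, y
```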
2.2. Feature Selection
The dataset has not been pre-processed, cleansed or filtered in our experiment; all 768 instances are used as they originally are in the model induction. Nevertheless, feature selection is applied prior to model induction. The standard feature selection algorithms are those made available by Weka [17], a popular software platform of machine learning algorithms for solving data mining problems, implemented in Java and open-sourced under the GPL license by the University of Waikato, New Zealand. Most of these algorithms are documented in [18], and their implementations are provided by Weka as the built-in functions listed below. Readers who want further information about these algorithms can refer to the user's manual of Weka [17] or the survey [18].
CfsSubsetEval: Evaluates the worth of a subset of attributes by considering the individual predictive ability of
each feature along with the degree of redundancy between them.
ChiSquaredAttributeEval: Evaluates the worth of an attribute by computing the value of the chi-squared
statistic with respect to the class.
InfoGainAttributeEval: Evaluates the worth of an attribute by measuring the information gain with respect to
the class.
PrincipalComponents: Performs a principal components analysis and transformation of the data.
SignificanceAttributeEval: Evaluates the worth of an attribute by computing the Probabilistic Significance as a
two-way function.
CorrelationAttributeEval: Evaluates the worth of an attribute by measuring the correlation (Pearson's) between
it and the class.
SVMAttributeEval: Evaluates the worth of an attribute by using an SVM classifier.
ReliefFAttributeEval: Evaluates the worth of an attribute by repeatedly sampling an instance and considering
the value of the given attribute for the nearest instance of the same and different class.
SymmetricalUncertAttributeEval: Evaluates the worth of an attribute by measuring the symmetrical
uncertainty with respect to the class.
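As a point of comparison, the correlation-based evaluator above (CorrelationAttributeEval) can be mimicked outside Weka by ranking attributes on their absolute Pearson correlation with the class. This is an illustrative re-implementation, not Weka's code:

```python
import numpy as np

def pearson_scores(X, y):
    """Absolute Pearson correlation of each attribute column with the class."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    num = (Xc * yc[:, None]).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

# Toy example: column 0 tracks the class, column 1 is noise.
X = np.array([[1.0, 0.9], [2.0, 0.1], [3.0, 0.8], [4.0, 0.2]])
y = np.array([0.0, 0.0, 1.0, 1.0])
scores = pearson_scores(X, y)
ranking = np.argsort(scores)[::-1]  # best-ranked attribute first
```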
A new feature selection method based on the Coefficient of Variation, called CVAttributeEval, is proposed in this paper. The method is programmed in the Java language as a Weka extension plug-in. It is founded on the belief that a good attribute in a training dataset should have data that vary sufficiently widely across a range of values, so that it is significant in characterizing a useful prediction model. To illustrate this concept visually, the eight attributes of the Pima Indian diabetes dataset are plotted using the Projection Plot in Weka. Since the original attribute values come in various unit scales, e.g. Age or Body mass index is usually a two-digit number while the Diabetes pedigree function has at least three digits, the attributes are first normalized into the fixed range [0, 1]. In the visualization, the attribute values are displayed in seven different scales of color tones, indicating the ranges of values into which the data fall. The extreme values are in solid red and blue, the central values are pale, and those in between are in mild colors. The color chart is shown in Figure 1. The outputs for the eight attributes are shown in Figures 2 (a-h).
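The normalization and seven-tone binning described above can be sketched as follows. The exact mapping of bins to the red-pale-blue chart is an assumption, since the paper does not spell out the bin boundaries:

```python
import numpy as np

def minmax_normalize(col):
    """Rescale an attribute column into the fixed range [0, 1]."""
    lo, hi = col.min(), col.max()
    return (col - lo) / (hi - lo)

def color_bins(col, n_bins=7):
    """Assign each value to one of seven tone bins (0 and 6 are the extremes)."""
    norm = minmax_normalize(col)
    return np.minimum((norm * n_bins).astype(int), n_bins - 1)

# Example on a small column of ages.
age = np.array([21.0, 30.0, 45.0, 60.0, 81.0])
bins = color_bins(age)
```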
Fig. 1. Color chart for displaying the values of attribute data in different hues.
Fig. 2(a). Visualization of data values of attribute preg. Fig. 2(b). Visualization of data values of attribute plas.
Fig. 2(c). Visualization of data values of attribute pres. Fig. 2(d). Visualization of data values of attribute skin.
Fig. 2(e). Visualization of data values of attribute insu. Fig. 2(f). Visualization of data values of attribute mass.
Fig. 2(g). Visualization of data values of attribute pedi. Fig. 2(h). Visualization of data values of attribute age.
(Four of the panels in Fig. 2 are annotated "Low data dispersion".)
As can be observed from the collection of visualizations, the attributes do have a good distribution over the data space, except for plas, pres, mass and age, which are shown in Figures 2 (b), (c), (f) and (h) respectively. Attributes that do not have a far spread in the data space often also carry mediocre data ranges, represented by white-to-grey data points. That is, the data of these attributes do not vary much on the data scale; hence these attributes may not contribute significantly to an accurate prediction model. These visual patterns can be quantified by the CV computation.
Let X be a training dataset with n instances (vectors) whose values are characterized by a total of m attributes or features. An instance is an m-dimensional tuple of the form (x_1, x_2, ..., x_m). Each attribute x_a, where a ∈ [1..m], can be partitioned into subgroups of different classes c ∈ C, where |C| is the total number of prediction target classes.
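Given this notation, with x_{i,a} denoting the value of attribute a in instance i, the CV criterion itself, described in the abstract as the ratio of the standard deviation to the mean of each attribute column, can be written per attribute. The formula below is reconstructed from that textual description, since the original derivation is truncated here:

```latex
\mathrm{CV}_a \;=\; \frac{\sigma_a}{\mu_a},
\qquad
\mu_a \;=\; \frac{1}{n}\sum_{i=1}^{n} x_{i,a},
\qquad
\sigma_a \;=\; \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(x_{i,a}-\mu_a\bigr)^{2}}
```

Attributes whose CV_a falls below a chosen cut-off are then disqualified from model induction.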