Pattern Recognition
journal homepage: www.elsevier.com/locate/pr

School of Computer Science and Technology, Xidian University, Xi'an 710071, PR China
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi'an 710071, PR China

article info
abstract
Article history:
Received 2 February 2012
Received in revised form
26 May 2012
Accepted 7 June 2012
Available online 18 June 2012
For classification problems, real-world data may in practice suffer from two types of noise: attribute noise and class noise. Removing as much of their adverse effects as possible is key to improving recognition performance. In this paper, a formalism algorithm is proposed for classification problems with class noise, which are more challenging than those with attribute noise. The proposed formalism algorithm is based on evidential reasoning (ER) theory, a powerful tool for dealing with uncertain information in multiple attribute decision analysis and many other areas; it may therefore be a more effective alternative for handling noisy label information. A specific algorithm, the Evidential Reasoning based Classification algorithm (ERC), is then derived to recognize human faces under class noise conditions. The proposed ERC algorithm is extensively evaluated on five publicly available face databases with class noise and yields good performance.
© 2012 Elsevier Ltd. All rights reserved.
Keywords:
Face recognition
Class noise
Evidential reasoning
Linear regression classification (LRC)
Sparse representation-based classification (SRC)
1. Introduction
Pattern recognition/classification is a very active topic in machine learning (or artificial intelligence). It assigns input data to one of a given number of categories by means of an algorithm, which is obtained by learning from a training set of instances. Classification is applied in many fields, such as speech recognition, handwriting recognition, document classification, internet search engines, medical image analysis, optical character recognition, and so on. For classification problems, the training set may suffer from two types of noise, attribute noise and class noise [1], which usually decrease classification accuracy. If some training samples are not correctly labeled, the training data contain class noise. Classification under class noise conditions is usually a more challenging problem than classification under attribute noise. The sources of class noise are very diverse, including subjectivity, data-entry errors, and inadequacy of the information [2]. This paper focuses on face classification problems in the presence of class noise.
Many algorithms have been proposed to solve the class noise problem. Some of them are summarized in Table 1.
* Corresponding author at: School of Computer Science and Technology, Xidian University, Xi'an 710071, PR China. Tel.: +86 29 88204310; fax: +86 29 88201023.
E-mail addresses: xiao_dong_wang1975@163.com (X. Wang), lf204310@163.com (F. Liu).
http://dx.doi.org/10.1016/j.patcog.2012.06.005
Table 1
The related algorithms.

Algorithm                      First group   Second group
Nearest neighbor algorithm     [3-7]         -
Decision tree algorithm        [9,10]        [8,11,12]
Probabilistic algorithm        [2]           [13-17]
Ensemble learning algorithm    [20,21]       -
Other algorithm                -             [18,19]
Now, we briefly describe some basic concepts and conclusions of the ER analytical algorithm [38-40]. They are given to cater for classification problems. Let $\Omega = \{C_1, C_2, \ldots, C_K\}$ be a collectively exhaustive and mutually exclusive set of hypotheses; then $\Omega$ is called the frame of discernment. The nonnegative vector $b = (b_1, \ldots, b_K)^T$ is called a belief degree vector (BDV) if $\sum_{i=1}^{K} b_i \le 1$, where $b_i \triangleq b(C_i)$ is the belief degree of the hypothesis $C_i$. If $\sum_{i=1}^{K} b_i = 1$, the BDV $b$ is complete; otherwise, if $\sum_{i=1}^{K} b_i < 1$, the BDV $b$ is incomplete. For classification problems, let $x$ be a sample and let $\Omega = \{C_1, C_2, \ldots, C_K\}$ correspond to the $K$ classes, respectively; $b_i$ represents the belief degree of the hypothesis $C_i$ that $x$ belongs to the $i$-th class. So, a BDV can be interpreted as a soft label. If the BDV is incomplete, $b_\Omega = 1 - \sum_{i=1}^{K} b_i$ is the belief degree of $\Omega$, which corresponds to the hypothesis that $x$ does not belong to any class.
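For example, with $K = 3$ classes, $b = (0.6, 0.3, 0)^T$ is an incomplete BDV: $\sum_{i=1}^{3} b_i = 0.9 < 1$, so $b_\Omega = 0.1$ is the belief that $x$ belongs to none of the three classes, whereas $b = (0.6, 0.4, 0)^T$ would be complete.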
Several BDVs corresponding to the same sample $x$ form the belief rule base (BRB). Let $\{b^1, \ldots, b^N\}$ be a BRB corresponding to the sample $x$; the final conclusion can be combined by the ER analytical algorithm [40,41]:
$$\mu = \left[ \sum_{s=1}^{K} \prod_{i=1}^{N} \left( w_i b^i_s + 1 - w_i \sum_{j=1}^{K} b^i_j \right) - (K-1) \prod_{i=1}^{N} \left( 1 - w_i \sum_{j=1}^{K} b^i_j \right) \right]^{-1}, \quad (1)$$

$$b_k = \frac{\mu \left[ \prod_{i=1}^{N} \left( w_i b^i_k + 1 - w_i \sum_{j=1}^{K} b^i_j \right) - \prod_{i=1}^{N} \left( 1 - w_i \sum_{j=1}^{K} b^i_j \right) \right]}{1 - \mu \prod_{i=1}^{N} (1 - w_i)}, \quad k = 1, 2, \ldots, K. \quad (2)$$
In Eqs. (1) and (2), the activation weights are calculated [41,42] by
$$w_i = \frac{\theta_i}{\sum_{j=1}^{N} \theta_j}, \quad i = 1, 2, \ldots, N. \quad (3)$$
For the illustrative examples, the combined BDVs are: $b_1 = 0.6190$, $b_2 = 0.3810$; $b_1 = 0.5$, $b_2 = 0.5$; and $b_1 = 0.5793$, $b_2 = 0.4207$.
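To make Eqs. (1)-(3) concrete, here is a minimal NumPy sketch of the analytical ER combination; the function name er_combine and its interface (one row per BDV, one relative weight $\theta_i$ per BDV) are our own choices, not the paper's notation. As a sanity check, two fully conflicting complete BDVs with equal weights fuse to $(0.5, 0.5)$, consistent with the second result quoted above.

```python
import numpy as np

def er_combine(bdvs, theta):
    """Combine N belief degree vectors (BDVs) over K classes with the
    analytical ER algorithm, Eqs. (1)-(3). `bdvs` has shape (N, K)."""
    B = np.asarray(bdvs, dtype=float)
    theta = np.asarray(theta, dtype=float)
    w = theta / theta.sum()                    # Eq. (3): activation weights
    S = B.sum(axis=1)                          # total belief of each BDV
    a = 1.0 - w * S                            # 1 - w_i * sum_j b^i_j
    prod_a = np.prod(a)
    prod_k = np.prod(w[:, None] * B + a[:, None], axis=0)  # one product per class
    K = B.shape[1]
    mu = 1.0 / (prod_k.sum() - (K - 1) * prod_a)           # Eq. (1)
    return mu * (prod_k - prod_a) / (1.0 - mu * np.prod(1.0 - w))  # Eq. (2)

# Two fully conflicting, complete BDVs with equal weights fuse to (0.5, 0.5).
print(er_combine([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0]))  # -> [0.5 0.5]
```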
1
3. ERC Algorithm

With increasingly diverse sources of face data, e.g., the internet or surveillance video, class noise is unavoidable. Based on the formalism classification algorithm, in this section we develop a specific algorithm, ERC, for face recognition problems with class noise.
$\ldots, x_{k n_k}\}$.
Step 1.3: Compute the distance $d_k(x_i) = \lVert x_i - \hat{X}_k \hat{a}_k \rVert_2$, $k = 1, 2, \ldots, K$.
Step 1.4: Calculate $\tilde{b}$ according to the formula
$$\tilde{b}_k = \exp\left( -\frac{\gamma\, d_k(x_i)}{\sum_{j=1}^{K} d_j(x_i)} \right), \quad k = 1, 2, \ldots, K, \quad (5)$$
where $\gamma > 0$ is a constant.
Step 1.5: Give activation weights $w_1 = 1 - \rho$ for $b$ and $w_2 = \rho$ for $\tilde{b}$, where $\rho \in (0, 1)$ is a constant.
Step 1.6: Combine $b$ and $\tilde{b}$ according to formulae (1)-(3), yielding the combined BDV $b^i$.
The BDV $b^i$ fuses information that comes from both the class label of $x_i$ and the other training samples, so it can well reduce the adverse effects of class noise; this will be validated by the experiments in Section 4.1. Specifically, if $b$ and $\tilde{b}$ indicate the same class for $x_i$, the combination result $b^i$ will clearly show the class; otherwise, no component of $b^i$ will be significantly greater than the other components. In the latter case, the contribution of $b^i$ to classifying the test sample will be small. In other words, the BDVs distinguish the training samples according to their importance.
The method to generate $b^i$ is similar to many data-cleaning approaches [3-7]. The difference is that the data-cleaning approaches remove the unreliable training samples, whereas our method retains all training samples and generates BDVs. The BDVs represent the belief degrees of the training samples and can therefore be regarded as soft labels. Compared with the data-cleaning approaches, the BDVs retain more information about the training samples, which will later be combined well by the ER analytical algorithm.
The idea of forming $\tilde{b}$ is inspired by the method in [30]. One difference between them is that the proposed method uses the distances between a sample and the class subspaces rather than the distances between samples; the distances between a sample and the class subspaces (LRC [43]) are more suitable for face recognition. The function of the BDV in this paper is similar to that of the basic probability assignment (BPA) in [30]. Another difference is that the BDVs are fixed, whereas the BPAs change with the test samples during the test phase.
Algorithm 1 involves two parameters, $\gamma$ and $\rho$. In Eq. (5), the parameter $\gamma$ is set to 10. The other parameter, $\rho$ in Step 1.5, will be discussed in Section 4.
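For concreteness, the following sketch shows how Steps 1.3-1.6 could be implemented on top of er_combine above, under our reconstruction of Eq. (5); the names class_mats and soft_label, and the clipping of $\tilde{b}$ to a valid (sum $\le 1$) BDV, are our own assumptions rather than the paper's notation.

```python
import numpy as np

def soft_label(x_i, y_i, class_mats, gamma=10.0, rho=0.6):
    """Sketch of Steps 1.3-1.6 of Algorithm 1: fuse the (possibly noisy)
    hard label y_i of sample x_i with subspace-distance evidence."""
    K = len(class_mats)
    # Step 1.3: LRC-style distance to each class subspace,
    # d_k(x_i) = ||x_i - X_k a_k||_2 with a_k the least-squares coefficients.
    d = np.array([np.linalg.norm(x_i - X @ np.linalg.lstsq(X, x_i, rcond=None)[0])
                  for X in class_mats])
    # Step 1.4 (Eq. (5) as reconstructed): distance-based BDV b~.
    b_tilde = np.exp(-gamma * d / d.sum())
    b_tilde /= max(1.0, b_tilde.sum())   # our safeguard: keep sum(b~) <= 1
    # Hard-label BDV b from the given (possibly wrong) class label.
    b = np.zeros(K)
    b[y_i] = 1.0
    # Steps 1.5-1.6: weights w1 = 1 - rho, w2 = rho, combined via Eqs. (1)-(3).
    return er_combine([b, b_tilde], [1.0 - rho, rho])
```

A training sample whose label agrees with its nearest class subspace thus receives a sharply peaked $b^i$, while a mislabeled one receives a flat $b^i$ and little influence, matching the behaviour described above.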
where $\bar{f} = \frac{1}{N} \sum_{i=1}^{N} f_i$. Let $W_{pca}^D = [w_{pca}^1, w_{pca}^2, \ldots, w_{pca}^D]$ be a projective matrix and $x_i^D = (W_{pca}^D)^T f_i$ ($i = 1, 2, \ldots, N$). If
$$d = \min\left\{ D \in \{1, 2, \ldots, 1024\} \;\Big|\; \sum_{i=1}^{N} \lVert x_i^D \rVert_2^2 > 0.99 \sum_{i=1}^{N} \lVert f_i \rVert_2^2 \right\},$$
then $x_i^d$ ($i = 1, 2, \ldots, N$) are the dimensionality-reduced data obtained by PCA, which will be used to test ERC in the numerical experiments. The other experimental designs and corresponding results are given in the following sections. They illustrate that ERC is more robust against class noise than the competing methods.
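As an aside, the 0.99-energy criterion above can be computed with a single SVD. The sketch below assumes the energy ratio is taken on the centered data (the exact centering in the criterion is ambiguous in our copy), and pca_project is a hypothetical helper name.

```python
import numpy as np

def pca_project(F, energy=0.99):
    """Project the columns of F (one 1024-dim face per column) onto the
    smallest PCA dimension d retaining more than `energy` of the energy."""
    Fc = F - F.mean(axis=1, keepdims=True)        # center the samples
    U, s, _ = np.linalg.svd(Fc, full_matrices=False)
    # sum_i ||x_i^D||^2 over the top-D components equals the sum of the
    # top-D squared singular values, so scan the cumulative energy ratio.
    ratio = np.cumsum(s**2) / np.sum(s**2)
    d = int(np.argmax(ratio > energy)) + 1        # smallest D exceeding 0.99
    return U[:, :d].T @ Fc, d
```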
Fig. 1. The first, second and fifth rows show cropped facial images (32 × 32) of the first subject in the AR database, Georgia Tech database and ORL database, respectively. The third and fourth rows show cropped facial images (32 × 32) of the first subject in the JAFFE database. The 64 faces (32 × 32 cropped images) of the first subject of the Extended Yale B database are shown in the sixth to ninth rows.
Table 2

Face database     n1    n2
AR                 7     1
GT                 8     2
JAFFE             11     2
ORL                5     1
Extended Yale B   32     6
Table 3
The average error rate and standard deviation obtained by Algorithm 1.

Face database     Noise ratio   New noise ratio   ρ
AR                0.1429        0.0994±0.0061     0.6
GT                0.25          0.1778±0.0141     0.7
JAFFE             0.1818        0.0591±0.0194     0.6
ORL               0.2           0.1235±0.0160     0.6
Extended Yale B   0.1875        0.0376±0.0051     0.6
Table 4
The average error rate and standard deviation obtained by different algorithms.

Algorithm     AR               GT               JAFFE            ORL              Yale B
ERC           0.2517±0.0206    0.3400±0.0188    0.0307±0.0210    0.1653±0.0297    0.0859±0.0087
  ρ           0.6              0.6              0.6              0.6              0.6
  λ           0.2              0.1              0.1              0.1              0.3
KNNDS         0.4794±0.0191    0.3396±0.0195    0.0334±0.0205    0.2402±0.0297    0.5070±0.0145
LRC           0.3020±0.0215    0.4058±0.0244    0.1309±0.0398    0.2472±0.0325    0.1096±0.0088
RT1           0.6677±0.0219    0.4647±0.0256    0.0683±0.0322    0.4159±0.0394    0.5931±0.0181
RT2           0.5462±0.0187    0.4882±0.0273    0.0788±0.0402    0.3987±0.0402    0.5768±0.0141
RT3           0.7930±0.0185    0.5397±0.0340    0.0670±0.0321    0.5550±0.0501    0.6602±0.0172
SRC           0.3670±0.0176    0.4223±0.0264    0.1501±0.0419    0.2392±0.0271    0.2413±0.0130
Linear SVM    0.2738±0.0235    0.3075±0.0218    0.0262±0.0232    0.1797±0.0297    0.1709±0.0111
Fig. 2. Evolution of the error rate and standard deviation of ERC on the AR database versus ρ. In the experiments, λ = 0.2, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 3. Evolution of the error rate and standard deviation of ERC on the AR database versus λ. In the experiments, ρ = 0.6, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 4. Evolution of the error rate and standard deviation of ERC on the GT database versus ρ. In the experiments, λ = 0.1, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 5. Evolution of the error rate and standard deviation of ERC on the GT database versus λ. In the experiments, ρ = 0.6, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 6. Evolution of the error rate and standard deviation of ERC on the JAFFE database versus ρ. In the experiments, λ = 0.1, and the best error rate and standard deviation of the other algorithms are also shown for comparison.
Fig. 7. Evolution of the error rate and standard deviation of ERC on the JAFFE database versus λ. In the experiments, ρ = 0.6, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 8. Evolution of the error rate and standard deviation of ERC on the ORL database versus ρ. In the experiments, λ = 0.1, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 9. Evolution of the error rate and standard deviation of ERC on the ORL database versus λ. In the experiments, ρ = 0.6, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 10. Evolution of the error rate and standard deviation of ERC on the Extended Yale B database versus ρ. In the experiments, λ = 0.3, and the best error rate and standard deviation of the other algorithms are also shown for comparison.

Fig. 11. Evolution of the error rate and standard deviation of ERC on the Extended Yale B database versus λ. In the experiments, ρ = 0.6, and the best error rate and standard deviation of the other algorithms are also shown for comparison.
Fig. 12. Evolution of the error rate and standard deviation of ERC versus the number of training images with noisy labels. In the experiments, the total number of training images is kept unchanged: there are 8, 8, 12, 6 and 32 training images for the AR database, GT database, JAFFE database, ORL database and Extended Yale B database, respectively.
Fig. 13. Evolution of the error rate and standard deviation of ERC versus the number of training images. In the experiments, the number of training images with noisy labels is kept unchanged: there are 1, 2, 2, 1 and 6 training images with noisy labels for the AR database, GT database, JAFFE database, ORL database and Extended Yale B database, respectively.
Table 5
The average error rate and standard deviation obtained by different algorithms.

Algorithm     AR               GT               JAFFE            ORL              Yale B
ERC           0.2527±0.0195    0.3513±0.0245    0.0291±0.0201    0.1779±0.0298    0.0848±0.0090
KNNDS         0.4807±0.0171    0.3427±0.0288    0.0365±0.0223    0.2480±0.0253    0.5054±0.0123
LRC           0.2958±0.0237    0.4048±0.0303    0.1153±0.0320    0.2562±0.0384    0.1078±0.0080
RT1           0.6768±0.0219    0.4766±0.0305    0.0751±0.0427    0.4171±0.0430    0.5889±0.0179
RT2           0.5471±0.0189    0.4869±0.0326    0.0738±0.0304    0.4078±0.0462    0.5677±0.0169
RT3           0.7914±0.0199    0.5389±0.0247    0.0672±0.0350    0.5668±0.0517    0.6639±0.0142
SRC           0.3738±0.0159    0.4255±0.0294    0.1390±0.0349    0.2535±0.0355    0.2365±0.0107
Linear SVM    0.2700±0.0238    0.3123±0.0236    0.0249±0.0181    0.1822±0.0316    0.1702±0.0142
Table 6
The average training CPU time and standard deviation obtained by different algorithms.

Algorithm     AR               GT               JAFFE            ORL              Yale B
ERC           4.1028±0.0148    2.6719±0.5809    1.0794±0.4571    1.7000±0.6871    74.143±6.2063
RT1           37.746±1.4622    5.2578±0.5314    0.5538±0.2490    1.0944±0.2700    152.31±3.1661
RT2           2.8619±0.0773    0.6897±0.0690    0.0513±0.0071    0.2822±0.0259    5.4703±0.1159
RT3           0.8438±0.0465    0.7900±0.2883    0.0766±0.0253    0.1197±0.0125    4.8945±1.4927
Linear SVM    0.2859±0.0119    0.1059±0.0106    0.0088±0.0101    0.0325±0.0062    0.6977±0.0116
The CPU time of KNNDS, LRC and SRC is not reported in Table 6 because they have no training phase. Although training ERC requires a long CPU time, the computational cost is tolerable. In the test phase, the CPU time of ERC is slightly more than the sum of the CPU times consumed by LRC and SRC, which is consistent with the theoretical analysis of computational complexity in Section 3.4.
Table 7
The average test CPU time and standard deviation obtained by different algorithms.

Algorithm     AR               GT               JAFFE            ORL              Yale B
ERC           13.040±0.3545    5.9425±0.0342    2.0409±0.2936    1.4797±0.0298    189.52±2.1644
KNNDS         0.3475±0.0098    0.5438±0.2047    0.0306±0.0137    0.0541±0.0253    4.9531±0.6369
LRC           1.4369±0.0289    0.4866±0.0055    0.0519±0.0074    0.1391±0.0384    89.866±1.7034
RT1           0.2900±0.0096    0.0966±0.0068    0.0172±0.0047    0.0378±0.0430    2.1516±0.5379
RT2           0.3541±0.0075    0.1563±0.0055    0.0200±0.0071    0.0450±0.0462    3.5078±0.4134
RT3           0.2456±0.0078    0.0856±0.0079    0.0150±0.0044    0.0331±0.0517    0.3906±0.0134
SRC           8.3744±0.0356    4.8069±0.0291    1.9681±0.2936    1.1469±0.0355    51.987±0.9396
Linear SVM    0.5344±0.0063    0.1263±0.0043    0.0084±0.0079    0.0291±0.0316    0.7602±0.0105
5. Conclusion
Acknowledgment
The authors would like to thank the three reviewers for their comments, which helped us greatly to improve this submission. The authors would also like to thank Dr. Zhijie Zhou for his suggestions, which corrected many errors and greatly improved the quality of this paper.
This work was supported in part by the National Natural Science
Foundation of China (No. 60803097, 60970067, 61003198, 61072106,
60971112, 60971128, 61072108), The Fund for Foreign Scholars in
University Research and Teaching Programs (the 111 Project) (No.
B07048), the National Science and Technology Ministry of China (No.
9140A07011810DZ0107, 9140A07021010DZ0131), the Fundamental
Research Funds for the Central Universities (No. JY10000902001,
K50510020001, JY10000902045).
References
[1] X. Zhu, X. Wu, Class noise vs. attribute noise: a quantitative study, Artificial Intelligence Review 22 (2004) 177-210.
[2] C. Brodley, M. Freidl, Identifying mislabeled training data, Journal of Artificial Intelligence Research 11 (1999) 131-167.
[3] B. Dasarathy, Noising around the neighbourhood: a new system structure and classification rule for recognition in partially exposed environments, IEEE Transactions on Pattern Analysis and Machine Intelligence 2 (1980) 67-71.
[4] G. Gates, The reduced nearest neighbor rule, IEEE Transactions on Information Theory 18 (1972) 431-433.
[5] P. Hart, The condensed nearest neighbor rule, IEEE Transactions on Information Theory 14 (1968) 515-516.
[6] F. Angiulli, Fast condensed nearest neighbor rule, in: International Conference on Machine Learning, 2005, pp. 7-11.
[7] D. Wilson, T. Martinez, Instance pruning techniques, in: International Conference on Machine Learning, 1997, pp. 404-411.
[8] G. John, Robust decision trees: removing outliers from databases, in: Proceedings of the First ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1995, pp. 174-179.
[9] T. Denoeux, M. Bjanger, Induction of decision trees from partially classified data using belief functions, in: Proceedings of SMC, 2000, pp. 2923-2928.
[10] P. Vannoorenbergue, T. Denoeux, Handling uncertain labels in multiclass problems using belief decision trees, in: Proceedings of IPMU, 2002.
[11] J. Mingers, An empirical comparison of pruning methods for decision tree induction, Machine Learning 4 (1989) 227-243.
[12] X. Zhu, X. Wu, Q. Chen, Eliminating class noise in large datasets, in: International Conference on Machine Learning, 2003, pp. 920-927.
[13] D. Hawkins, G. McLachlan, High-breakdown linear discriminant analysis, Journal of the American Statistical Association 92 (1997) 136-143.
[14] S. Bashir, E. Carter, High breakdown mixture discriminant analysis, Journal of Multivariate Analysis 93 (2005) 102-111.
[20] X. Zeng, T. Martinez, A noise filtering method using neural networks, in: IEEE International Workshop on Soft Computing Techniques in Instrumentation, Measurement and Related Applications, 2003, pp. 26-31.
[21] I. Guyon, N. Matic, V. Vapnik, Discovering informative patterns and data cleaning, Advances in Knowledge Discovery and Data Mining (1996) 181-203.
[22] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, 1976.
[23] A. Dempster, Upper and lower probabilities induced by a multi-valued mapping, Annals of Mathematical Statistics 38 (1967) 325-339.
[24] T. Denoeux, Z. Younes, F. Abdallah, Representing uncertainty on set-valued variables using belief functions, Artificial Intelligence 174 (2010) 479-499.
[25] E. Côme, L. Oukhellou, T. Denoeux, P. Aknin, Learning from partially supervised data using mixture models and belief functions, Pattern Recognition 42 (2009) 334-348.
[26] T. Denoeux, P. Smets, Classification using belief functions: the relationship between the case-based and model-based approaches, IEEE Transactions on Systems, Man and Cybernetics, Part B 36 (2006) 1395-1406.
[27] T. Denoeux, A neural network classifier based on Dempster-Shafer theory, IEEE Transactions on Systems, Man and Cybernetics, Part A 30 (2000) 131-150.
[28] L. Zouhal, T. Denoeux, An evidence-theoretic k-NN rule with parameter optimization, IEEE Transactions on Systems, Man and Cybernetics, Part C 28 (1998) 263-271.
[29] T. Denoeux, Analysis of evidence-theoretic decision rules for pattern classification, Pattern Recognition 30 (1997) 1095-1107.
[30] T. Denoeux, A k-nearest neighbor classification rule based on Dempster-Shafer theory, IEEE Transactions on Systems, Man and Cybernetics 25 (1995) 804-813.
[31] Y. Bi, J. Guan, D. Bell, The combination of multiple classifiers using an evidential reasoning approach, Artificial Intelligence 172 (2008) 1731-1751.
[32] Y. Bi, S. McClean, T. Anderson, Combining rough decisions for intelligent text mining using Dempster's rule, Artificial Intelligence Review 26 (2006) 191-209.
[33] L. Xu, A. Krzyzak, C. Suen, Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE Transactions on Systems, Man and Cybernetics 22 (1992) 418-435.
[34] M. Masson, T. Denoeux, RECM: relational evidential c-means algorithm, Pattern Recognition Letters 30 (2009) 1015-1026.
[35] M. Masson, T. Denoeux, ECM: an evidential version of the fuzzy c-means algorithm, Pattern Recognition 41 (2008) 1384-1397.
[36] M. Masson, T. Denoeux, Clustering interval-valued data using belief functions, Pattern Recognition Letters 25 (2004) 163-171.
[37] T. Denoeux, M. Masson, EVCLUS: evidential clustering of proximity data, IEEE Transactions on Systems, Man and Cybernetics, Part B 34 (2004) 95-109.
[38] J. Yang, M. Singh, An evidential reasoning approach for multiple-attribute decision making with uncertainty, IEEE Transactions on Systems, Man and Cybernetics 24 (1994) 1-18.
[39] J. Yang, D. Xu, On the evidential reasoning algorithm for multiple attribute decision analysis under uncertainty, IEEE Transactions on Systems, Man and Cybernetics 32 (2002) 289-304.
[40] Y. Wang, J. Yang, D. Xu, Environmental impact assessment using the evidential reasoning approach, European Journal of Operational Research 174 (2006) 1885-1913.
[41] Z. Zhou, C. Hu, J. Yang, D. Xu, D. Zhou, Online updating belief rule based system for pipeline leak detection under expert intervention, Expert Systems with Applications 36 (2009) 7700-7709.
[42] J. Yang, J. Liu, J. Wang, H. Sii, H. Wang, Belief rule-base inference methodology using the evidential reasoning approach-RIMER, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 36 (2006) 266-285.
[43] I. Naseem, R. Togneri, M. Bennamoun, Linear regression for face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010) 2106-2112.
[44] J. Wright, A. Yang, A. Ganesh, S. Sastry, Y. Ma, Robust face recognition via sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (2009) 210-227.
[45] J. Yang, Y. Wang, D. Xu, K. Chin, The evidential reasoning approach for MADA under both probabilistic and fuzzy uncertainties, European Journal of Operational Research 171 (2006) 309-343.
[46] J. Yang, P. Sen, A general multi-level evaluation process for hybrid MADM with uncertainty, IEEE Transactions on Systems, Man, and Cybernetics 24 (1994) 1458-1473.
[47] J. Yang, Rule and utility based evidential reasoning approach for multiattribute decision analysis under uncertainties, European Journal of Operational Research 131 (2001) 31-61.
[48] Z. Zhou, C. Hu, J. Yang, D. Xu, M. Chen, D. Zhou, A sequential learning algorithm for online constructing belief-rule-based systems, Expert Systems with Applications 37 (2010) 1790-1799.
[49] Z. Zhou, C. Hu, D. Xu, M. Chen, D. Zhou, A model for real-time failure prognosis based on hidden Markov model and belief rule base, European Journal of Operational Research 207 (2010) 269-283.
[50] Z. Zhou, C. Hu, J. Yang, D. Xu, D. Zhou, New model for system behavior prediction based on belief rule based systems, Information Sciences 180 (2010) 4843-4846.
[51] Z. Zhou, C. Hu, J. Yang, D. Xu, D. Zhou, Bayesian reasoning approach based recursive algorithm for online updating belief rule based expert system of pipeline leak detection, Expert Systems with Applications 38 (2011) 3937-3943.
[52] Z. Zhou, C. Hu, J. Yang, D. Xu, D. Zhou, Online updating belief-rule-base using the RIMER approach, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, http://dx.doi.org/10.1109/TSMCA.2011.2147312.
[53] T. Sakai, Multiple pattern classification by sparse subspace decomposition, arXiv:0907.5321v2.
[54] S. Li, J. Lu, Face recognition using the nearest feature line method, IEEE Transactions on Neural Networks 10 (1999) 439-443.
[55] B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, Least angle regression, The Annals of Statistics 32 (2004) 407-499.
[56] A. Martinez, R. Benavente, The AR Face Database, CVC Technical Report 24, 1998.
[57] A. Martinez, A. Kak, PCA versus LDA, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 228-233.
[58] Georgia Tech Face Database, http://www.anefian.com/face_reco.htm, 2007.
[59] M. Lyons, J. Budynek, S. Akamatsu, Automatic classification of single facial images, IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (1999) 1357-1362.
[60] F. Samaria, A. Harter, Parameterization of a stochastic model for human face identification, in: Proceedings of the Second IEEE Workshop on Applications of Computer Vision, 1994, pp. 138-142.
[61] A. Georghiades, P. Belhumeur, D. Kriegman, From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 643-660.
[62] K. Lee, J. Ho, D. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 684-698.
[63] http://sparselab.stanford.edu/.
[64] C. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
Xiaodong Wang received the B.S. degree from Harbin Institute of Technology, Harbin, China, in 1998, and the M.S. degree from Inner Mongolia University of Technology, Hohhot, China, in 2007. He is currently working toward the Ph.D. degree in Computer Application Technology at the School of Computer Science and Technology, Xidian University, and the Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xi'an, China. His current research interests include convex optimization, compressive sensing and pattern recognition.

Fang Liu (M'07–SM'07) received the B.S. degree in Computer Science and Technology from Xi'an Jiaotong University, Xi'an, China, in 1984, and the M.S. degree in Computer Science and Technology from Xidian University, Xi'an, in 1995. Currently, she is a Professor with the School of Computer Science, Xidian University, Xi'an, China. She is the author or coauthor of five books and more than 80 papers in journals and conferences. Her research interests include signal and image processing, synthetic aperture radar image processing, multiscale geometry analysis, learning theory and algorithms, optimization problems, and data mining.

L.C. Jiao (SM'89) received the B.S. degree from Shanghai Jiaotong University, Shanghai, China, in 1982, and the M.S. and Ph.D. degrees from Xi'an Jiaotong University, Xi'an, China, in 1984 and 1990, respectively. He is currently a Distinguished Professor with the School of Electronic Engineering, Xidian University, Xi'an, China. His research interests include signal and image processing, natural computation, and intelligent information processing. He has led approximately 40 important scientific research projects and published more than ten monographs and 100 papers in international journals and conferences. He is the author of three books: Theory of Neural Network Systems (Xi'an, China: Xidian University Press, 1990), Theory and Application on Nonlinear Transformation Functions (Xi'an, China: Xidian University Press, 1992), and Applications and Implementations of Neural Networks (Xi'an, China: Xidian University Press, 1996). He is the author or coauthor of more than 150 scientific papers.
Prof. Jiao is a member of the IEEE Xi'an Section Executive Committee, the Chairman of its Awards and Recognition Committee, and an executive committee member of the Chinese Association of Artificial Intelligence.

Jiao Wu (S'09) received the B.S. and M.S. degrees in Applied Mathematics from Shaanxi Normal University, Xi'an, China, in 1999 and 2002, respectively. She is currently working toward the Ph.D. degree in Computer Application Technology at the School of Computer Science and Technology, Xidian University, and the Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xi'an, China. Her research interests include image processing, machine learning, statistical learning theory, and algorithms.