IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

Ordinal Feature Selection for Iris and Palmprint Recognition

Zhenan Sun, Member, IEEE, Libin Wang, and Tieniu Tan, Fellow, IEEE
Abstract: Ordinal measures have been demonstrated to be an effective feature representation model for iris and palmprint recognition. However, ordinal measures are a general concept of image analysis, and numerous variants with different parameter settings, such as location, scale, and orientation, can be derived to construct a huge feature space. This paper proposes a novel optimization formulation for ordinal feature selection with successful applications to both iris and palmprint recognition. The objective function of the proposed feature selection method has two parts, i.e., the misclassification error of intra- and inter-class matching samples and the weighted sparsity of ordinal feature descriptors. The feature selection therefore aims to achieve an accurate and sparse representation of ordinal measures. The optimization is subject to a number of linear inequality constraints, which require that all intra- and inter-class matching pairs be well separated with a large margin. Ordinal feature selection is formulated as a linear programming (LP) problem so that a solution can be efficiently obtained even on a large-scale feature pool and training database. Extensive experimental results demonstrate that the proposed LP formulation is advantageous over existing feature selection methods, such as mRMR, ReliefF, Boosting, and Lasso, for biometric recognition, reporting state-of-the-art accuracy on the CASIA and PolyU databases.

Index Terms: Iris, palmprint, ordinal measures, feature selection, linear programming.

I. INTRODUCTION

IRIS and palmprint texture patterns are accurate biometric modalities with successful applications in personal identification. The success of a texture biometric recognition system heavily depends on its feature analysis model, with which biometric images are encoded, compared and recognized by a computer. It is desirable to develop a feature analysis method that is both discriminating and robust for iris and palmprint biometrics. On one hand, the biometric features should have enough discriminating power to distinguish inter-class samples. On the other hand, intra-class variations of biometric patterns in uncontrolled conditions (e.g., illumination changes, deformation, occlusions, and pose/view changes) should be minimized via robust feature analysis. Therefore, it is a challenging problem to achieve a good balance between inter-class distinctiveness and intra-class robustness.

Manuscript received July 4, 2013; revised January 3, 2014 and May 23, 2014; accepted June 4, 2014. Date of publication July 11, 2014; date of current version July 25, 2014. This work was supported in part by the National Basic Research Program of China under Grant 2012CB316300 and in part by the National Natural Science Foundation of China under Grant 61273272 and Grant 61135002. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Nikolaos V. Boulgouris.

The authors are with the Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing 100190, China (e-mail: znsun@nlpr.ia.ac.cn; lbwang@nlpr.ia.ac.cn; tnt@nlpr.ia.ac.cn).

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. The material includes an appendix of the paper. The total size is 0.18 MB. Contact znsun@nlpr.ia.ac.cn for further questions about this work.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2014.2332396
Generally the problem of feature analysis can be divided
into two sub-problems, i.e. feature representation and
feature selection. Feature representation aims to computationally characterize the visual features of biometric images.
Local image descriptors such as Gabor filters, Local Binary
Patterns and ordinal measures are popular methods for feature
representation of texture biometrics [1]. However, variations
of the tunable parameters in local image filters (e.g. location,
scale, orientation, and inter-component distance) can generate
a large and over-complete feature pool. Therefore feature
selection is usually necessary to learn a compact and effective
feature set for efficient identity authentication. In addition,
feature selection can discover the knowledge related to the
pattern recognition problem of texture biometrics, such as the
importance of various image structures in iris and palmprint
images and the most suitable image operators for identity
authentication.
Our previous work has demonstrated that ordinal measures
(OM) [2] provide a good feature representation for iris [3],
palmprint [4] and face recognition [5]. Ordinal measures are
defined as the relative ordering of a number of regional image
features (e.g. average intensity, Gabor wavelet coefficients,
etc.) in the context of visual image analysis. The basic idea
of OM is to characterize the qualitative image structures
of texture-like biometric patterns. The success of ordinal
representation comes from the texture-like visual biometric
patterns where sharp and frequent intensity variations between
image regions provide abundant ordinal measures for robust
and discriminating description of individual features. Detailed
information on ordinal measures in the context of biometrics, including their definition and their properties of invariance, robustness, distinctiveness, compactness and efficiency, can be found in [2]–[5].
The Multi-lobe Ordinal Filter (MOF), with a number of tunable parameters, was proposed to analyze the ordinal measures of biometric images (Fig. 1) [3]. An MOF has a number of positive and negative lobes, specially designed in terms of distance, scale, orientation, number, and location, so that filtering a biometric image with the MOF measures the ordinal relationship between the image regions covered by the positive and negative lobes.

1057-7149 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Multi-lobe ordinal filters [3].

From Fig. 1 we can see that variations of the parameters in the multi-lobe ordinal filter can lead to an extremely large feature set of ordinal measures. For example, each basic Gaussian lobe in MOF has five parameters, i.e., x-location, y-location, x-scale, y-scale, and orientation. Thus there are in total 10 variables in a di-lobe ordinal filter and 15 tunable parameters in a tri-lobe ordinal filter. Supposing that each variable has ten possible values, the number of all possible di-lobe and tri-lobe ordinal measures in a biometric image is at least on the order of 10^10 and 10^15, respectively. Although ordinal measures are in general good descriptors for biometric feature representation, there are significant differences among various ordinal features in terms of distinctiveness and robustness. Since the primitive image structures vary greatly across different biometric modalities in terms of shape, orientation, scale, etc., there does not exist a generic feature set of ordinal measures that achieves the optimal recognition performance for all biometric modalities.
Even for the same biometric modality, individual differences in visual texture patterns mean that the optimal ordinal features may vary from person to person. Moreover, the redundancy among different ordinal features should be reduced, and it has been shown that a small number of ordinal features is enough to achieve high accuracy in iris [3] and palmprint biometrics [4]. Therefore it is unnecessary to extract all ordinal features, because of the redundancy in the over-complete set of ordinal feature representation. Based on the above analysis, a much smaller subset of ordinal measures, incorporating the characteristics of visual biometric patterns, should be selected from the original feature space as a compact biometric representation for efficient biometric identification.
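As a concrete illustration of the di-lobe case, the toy sketch below (illustrative code, not the authors' implementation; function names and parameter choices are assumptions) builds one positive and one negative Gaussian lobe and binarizes the filtering result into an ordinal code:

```python
import numpy as np

def gaussian_lobe(shape, center, sigma):
    """An isotropic 2-D Gaussian lobe, normalized to unit sum."""
    y, x = np.mgrid[:shape[0], :shape[1]]
    g = np.exp(-((x - center[1]) ** 2 + (y - center[0]) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def dilobe_ordinal_code(img, pos_center, neg_center, sigma):
    """One ordinal measure: 1 if the Gaussian-weighted average
    intensity under the positive lobe exceeds that under the
    negative lobe, else 0."""
    pos = gaussian_lobe(img.shape, pos_center, sigma)
    neg = gaussian_lobe(img.shape, neg_center, sigma)
    return 1 if float(np.sum(img * (pos - neg))) >= 0.0 else 0
```

Varying pos_center, neg_center and sigma over a grid is exactly what inflates the candidate pool described above; an anisotropic lobe would additionally carry separate x- and y-scales and an orientation.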
The remainder of this paper is organized as follows. Section II introduces related work on feature selection in biometrics applications. Section III describes the technical details of the feature selection method based on linear programming. Section IV and Section V present the applications of the proposed LP formulation to iris and palmprint biometrics, respectively. Section VI concludes this paper with some discussions. Preliminary results on linear programming for iris recognition were presented in [18].
II. RELATED WORK

Feature selection is a key problem in pattern recognition and has been extensively studied. However, finding an optimal feature subset is usually intractable, and in most cases only suboptimal feature selection solutions exist [6].
Since no generic feature selection method is applicable to all problems, a number of feature selection methods have been proposed [7]–[13]. These methods employ different optimization functions and search strategies for feature selection. For example, the criteria of Max-Dependency and Max-Relevance are used to formulate the optimization-based feature selection method mRMR [11]. ReliefF is a simple yet efficient feature selection method suitable for problems with strong dependencies between features [12]. ReliefF has been regarded as one of the most successful feature selection strategies because its key idea is to estimate the quality of features according to how well their values distinguish between instances that are near each other [12]. Most research on feature selection focuses on generic pattern classification rather than on specific applications in biometrics. This paper mainly addresses efficient feature selection methods applicable to biometric authentication. Boosting [14] and Lasso [15] have proved to be well-performing feature selection methods in face recognition.
Boosting [14] has become a popular approach for both feature selection and classifier design in biometrics. The boosting algorithm selects a complementary ensemble of weak classifiers in a greedy manner. A reweighting strategy is applied to the training samples to make sure that every newly selected weak classifier performs well on the hard samples that cannot be well classified by the previously selected classifiers. Boosting has achieved good performance in visual biometrics, including both face detection [14] and face recognition [16]. However, boosting cannot guarantee a globally optimal feature set, and an overfitting result may be obtained if the training data is not well designed.
Destrero et al. proposed a regularized machine learning method enforcing sparsity for feature selection in face biometrics based on Lasso regression [15], [17]. The Lasso feature selection aims to solve the following penalized least-squares problem [15], [17]:

    f_L = arg min_f { ||g - A f||_2^2 + 2λ |f|_1 }    (1)

where g is the vector of intra-/inter-class labels (+1 or -1), the components of A indicate the intra- or inter-class matching results of individual features in the training database, f denotes the feature weight vector, and λ is a parameter controlling the balance between the regression errors and the sparsity of the selected features. The objective function includes two parts: the first part, ||g - A f||_2^2, aims to minimize the regression errors, and the second part, 2λ |f|_1, uses L1 regularization to enforce sparsity of the selected features. The optimization problem can be solved by Landweber iteration [15], [17].
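Eqn. 1 can be minimized by iterative soft thresholding, i.e., a Landweber gradient step followed by shrinkage (ISTA). The sketch below is illustrative, not the implementation of [15], [17]; the step size and iteration count are assumptions:

```python
import numpy as np

def lasso_ista(A, g, lam, n_iter=1000):
    """Minimize ||g - A f||_2^2 + 2*lam*|f|_1 by a Landweber
    (gradient) step followed by soft thresholding (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L with L = ||A||_2^2
    f = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ f - g)              # gradient of the quadratic term
        z = f - step * grad                   # Landweber step
        f = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # shrink
    return f
```

With the per-feature matching scores stacked in A and the labels in g, the nonzero entries of the returned f indicate the selected features.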
The L1 regularized sparse representation was evaluated to be better than Boosting for face detection and authentication on small training datasets [15], [17]. However, this approach also has some drawbacks. Firstly, although the optimization problem defined in sparse representation can achieve a global minimum, it is not efficient in implementation due to the non-linear objective function; therefore a three-stage architecture is necessary to solve a large learning problem [17]. Secondly, the squared sum of regression errors defined in the objective function makes the feature selection sensitive to outliers. Thirdly, the class label g can only take the value +1 or -1, so the model cannot generate a maximal
margin. Margin analysis is important to the generalization ability of machine learning algorithms, and the most powerful machine learning methods, e.g., Support Vector Machines and Boosting, are motivated by margins. In addition, the features of the training samples must be normalized to match the class labels, which incurs additional computational cost. Fourthly, the Lasso model is not flexible, so the optimization does not take into account the characteristics of image features and biometric recognition. For example, the L1 regularization term |f|_1 in Eqn. 1 assigns an identical weight to all features, and the discriminative information of each feature is not taken into consideration.
The L1 regularization is a popular technique for feature
selection. For example, Guo et al. proposed a linear programming formulation of feature selection with application to facial
expression recognition [13]. The objective function aims to
minimize misclassification error and the L1 norm of feature
weight.
In summary, both Boosting [14], [16] and Lasso [15], [17]
have limitations in ordinal feature selection and it is desirable
to develop a feature selection method with the following
properties.
1) The feature selection process can be formulated as a simple optimization problem. Here "simple" means that both the objective function and the constraint terms can be defined following a well-established standard optimization problem, so that a global solution of the feature selection problem is easy to obtain.
2) A sparse solution can be achieved in feature selection
so that the selected feature set is compact for efficient
storage, transmission and feature matching.
3) The penalty of misclassification should not be a high-order function of the regression errors, in order to limit the influence of outliers.
4) The model of feature selection should be flexible enough to take into account the characteristics of the biometric recognition problem, so that the genuine and imposter matching results can be well separated from each other and the selected image features are accurate on the training database.
5) The feature selection problem should depend less on the training data and be solvable with a small set of training samples. This requires that the feature selection method circumvent the curse of dimensionality and generalize to practical applications.
This paper proposes a novel feature selection method which
meets all requirements listed above. In our method, the feature selection process of ordinal measures is formulated as
a constrained optimization problem. Both the optimization
objective and the constraints are modeled as linear functions;
therefore linear programming (LP) can be used to efficiently

solve the feature selection problem.

Fig. 2. Problem statement of ordinal feature selection in iris recognition.

The feature units used
for the LP formulation are regional ordinal measures, which are tested on the training dataset to generate both intra- and inter-class matching samples. Our feature selection method aims at finding a compact subset of ordinal features that minimizes biometric recognition errors with a large margin between intra- and inter-class matching samples. The objective function of our LP formulation includes two parts. The first part measures the misclassification errors of training samples failing to follow a large-margin principle. The second part indicates the weighted sparsity of the ordinal feature units. Traditional sparse representation uses the L1 norm to achieve sparsity of feature selection, and all feature components have an identical unit weight in sparse learning. However, we argue that it is better to incorporate some prior information about the candidate feature units into the sparse representation, so that the most individually accurate ordinal measures are given preferential treatment. The linear inequality constraints of the LP optimization problem require that all intra- and inter-class matching results be well separated from each other with a large margin. Slack variables are introduced so that the inequality constraints can accommodate ambiguous and outlier samples.

III. FEATURE SELECTION BASED ON LINEAR PROGRAMMING
The objective of feature selection for biometric recognition
is to select a limited number of feature units from the candidate
feature set (Fig. 2). In this paper, a feature unit is defined as
the regional ordinal encoding result using a specific ordinal
filter on a specific biometric region. We aim to use a machine
learning technique to find the weights of all ordinal feature
units. So that feature selection can also be regarded as a
sparse representation method, i.e. most weight values are zero
and only a compact set of feature units have the weighted
contribution to biometric recognition.
In this paper, ordinal feature selection is formulated as a
constrained optimization problem as follows.


Objective function:

    min  (γ^+/N^+) Σ_{j=1..N^+} ξ_j^+  +  (γ^-/N^-) Σ_{k=1..N^-} ξ_k^-  +  Σ_{i=1..D} P_i w_i    (2)

Subject to:

    Σ_{i=1..D} w_i x_ij^+  ≤  δ^+ + ξ_j^+,    j = 1, 2, ..., N^+    (3)
    Σ_{i=1..D} w_i x_ik^-  ≥  δ^- - ξ_k^-,    k = 1, 2, ..., N^-    (4)
    ξ_j^+ ≥ 0,    j = 1, 2, ..., N^+    (5)
    ξ_k^- ≥ 0,    k = 1, 2, ..., N^-    (6)
    w_i ≥ 0,    i = 1, 2, ..., D    (7)

where D is the number of ordinal features available for feature selection; N^+ and N^- denote the numbers of intra- and inter-class biometric matching pairs in the training database, respectively; w_i is the weight of the i-th ordinal feature in the biometric recognition system; P_i measures the recognition accuracy of the i-th ordinal feature on the training database; x_ij^+ denotes the Hamming distance of the i-th ordinal feature for the j-th intra-class image pair in the training database; x_ik^- denotes the Hamming distance of the i-th ordinal feature for the k-th inter-class image pair; δ^+ and δ^- are two fixed parameters indicating the expected intra- and inter-class biometric matching results, respectively; ξ_j^+ and ξ_k^- are slack variables for intra- and inter-class biometric matching, respectively; and γ^+ and γ^- are constant parameters tuning the importance of intra- and inter-class matching results for the biometric recognition system. The idea of this feature selection method is illustrated in Fig. 3.

Fig. 3. Illustration of the constraint terms of the LP formulation, using face recognition as the example.

The basic idea of the proposed feature selection method is to find a sparse representation of ordinal features under a large-margin principle. On one hand, the intra- and inter-class biometric matching results are expected to be well separated with a large margin. On the other hand, the number of selected ordinal features should be much smaller than the large number of candidates. These two seemingly contradictory requirements are well integrated in our feature selection method.

The objective function of our LP formulation includes two parts, motivated by this basic idea. The first part aims to minimize the misclassification errors of intra- and inter-class matching samples with respect to the expected thresholds δ^+ and δ^-. Since δ^+ and δ^- are defined in our experiments as the mean intra- and inter-class Hamming distances of well-performing ordinal features, a large-margin principle is actually incorporated into the optimization problem. A biometric matching sample failing to meet the large-margin requirement suffers a penalty determined by the distance from its dissimilarity measure to the expected threshold δ^+ or δ^-. A soft-margin technique is adopted by introducing the slack variables ξ_j^+ and ξ_k^- to guarantee that all intra- and inter-class matching results follow the large-margin principle. So the first part of the objective function,
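Eqns. 2–7 assemble directly into a standard-form LP over the stacked variable vector [w, ξ^+, ξ^-]. The sketch below is a toy illustration on synthetic Hamming distances, with scipy's linprog standing in for the CPLEX solver used by the authors; all data and parameter values are assumptions:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
D, Np, Nn = 20, 50, 50            # features, intra-/inter-class pairs

# Synthetic per-feature Hamming distances: intra-class distances center
# on each feature's "quality", inter-class distances center on 0.5.
quality = rng.uniform(0.05, 0.45, D)                 # smaller = better feature
Xp = rng.normal(quality, 0.05, (Np, D)).clip(0, 1)   # x_ij^+
Xn = rng.normal(0.5, 0.05, (Nn, D)).clip(0, 1)       # x_ik^-
P = quality                        # per-feature penalty P_i (e.g. EER)
dp, dn = 0.35, 0.45                # thresholds delta^+, delta^-
gp, gn = 1.0, 1.0                  # gamma^+, gamma^-

# Objective (2) over [w, xi^+, xi^-]:
c = np.concatenate([P, np.full(Np, gp / Np), np.full(Nn, gn / Nn)])
# (3):  Xp @ w - xi^+ <= delta^+     (4): -Xn @ w - xi^- <= -delta^-
A_ub = np.block([
    [Xp, -np.eye(Np), np.zeros((Np, Nn))],
    [-Xn, np.zeros((Nn, Np)), -np.eye(Nn)],
])
b_ub = np.concatenate([np.full(Np, dp), np.full(Nn, -dn)])
# (5)-(7): all variables non-negative.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
w = res.x[:D]
selected = np.flatnonzero(w > 1e-6)   # sparse: few features get weight
```

At a vertex of this LP most w_i are exactly zero, which is the sparsity the objective is designed to produce.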

    (γ^+/N^+) Σ_{j=1..N^+} ξ_j^+  +  (γ^-/N^-) Σ_{k=1..N^-} ξ_k^-,

defines the overall penalty term of the training samples according to the large-margin principle. The constant parameters γ^+ and γ^- set the penalty weights for the misclassification of intra- and inter-class matching samples, respectively, and their values can be tuned according to the application requirements. For example, FRR (False Reject Rate) sensitive applications such as watch-list monitoring can set a larger γ^+, and FAR (False Accept Rate) sensitive applications such as banking can set a larger γ^-. In normal applications, we can set γ^+ = γ^-. In summary, the objective function of the proposed LP feature selection method aims to minimize the misclassification errors and enforce sparsity of the selected ordinal features simultaneously, and the parameters γ^+ and γ^- balance the trade-off between accuracy and sparsity.
The second part of the objective function,

    Σ_{i=1..D} P_i w_i,

enforces weighted sparsity of the ordinal feature units. Sparsity of the ordinal feature units is very important for effective and efficient biometric recognition. Firstly, the objective of biometric recognition is to find a mapping between the most characterizing features and the identity label; sparse learning serves exactly this purpose and makes it possible to discover the intrinsic features of biometric patterns. Secondly, sparsity means that a compact feature set can be used for biometric recognition, i.e., efficient encoding, storage, transmission and comparison of biometric feature templates. The weighted sparsity proposed in this paper is a novel idea in sparse representation. It differs from the existing sparse representation method [15], [17] in that well-performing individual features in the training database are given preferential treatment in sparse learning. Here the weight P_i represents the prior information about each individual ordinal feature in terms


of recognition performance. It may be defined as the Equal Error Rate (EER), the Area Under the ROC Curve (AUC), or the inverse of the Discriminating Index (1/D-index). We analyze and compare these three options for P_i (EER, AUC, 1/D-index) in the experiments on their effectiveness for biometric feature selection. Since the weight w_i of each ordinal measure is constrained to be non-negative, the second part of the objective function approximates an L1 regularization, which helps generate a sparse ordinal feature set after feature selection. The L1 regularization term in sparse representation [15], [17] (Eqn. 1) can be regarded as the special case of

    Σ_{i=1..D} P_i w_i

in which P_i = 1 for all ordinal features. The prior information about each feature is not taken into account in the Lasso method [15], [17], and all features are treated evenly when enforcing sparsity. In our feature selection method, better performing ordinal features are given preferential treatment through the weights P_i, so that a more compact and effective feature set can be selected.
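For concreteness, one common way to estimate a single feature's EER from its intra-class (genuine) and inter-class (imposter) distance samples is sketched below; this helper is illustrative, not the authors' evaluation code:

```python
import numpy as np

def equal_error_rate(genuine, imposter):
    """EER: the error rate at the threshold where the False Reject
    Rate (genuine pairs above threshold) is closest to the False
    Accept Rate (imposter pairs at or below threshold)."""
    thresholds = np.unique(np.concatenate([genuine, imposter]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        frr = np.mean(genuine > t)     # genuine pairs rejected
        far = np.mean(imposter <= t)   # imposter pairs accepted
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer
```

A feature whose genuine and imposter distance distributions barely overlap gets an EER near zero, i.e., a small penalty P_i in the weighted sparsity term.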
The LP formulation is subject to a set of linear inequality constraints. Eqn. 3 and Eqn. 4 require that all intra- and inter-class matching samples in the training database be well separated based on a large-margin principle. In fact, a large number of training samples close to the decision boundaries cannot meet the large-margin principle, and these intra- and inter-class matching results usually cannot be linearly separated. Therefore the slack variables ξ_j^+ and ξ_k^- are introduced into the inequality constraints, which makes our model more flexible and robust. Our LP formulation is actually a soft-margin model, which can adaptively remove the influence of noisy samples and outliers and also generate a larger margin, improving accuracy and generalization performance with the help of the slack variables. Eqn. 7 places a non-negative constraint on the feature weights w = {w_i}. In contrast, w may be negative in the Lasso algorithm [15], [17]. We argue that the non-negative constraint on w is both reasonable and beneficial. Firstly, the target of feature selection is the optimal solution of w, a variable with clear physical meaning: each element of w denotes the contribution of an ordinal feature to the success of biometric recognition. Since we are discussing a feature selection method, each feature should only make a positive contribution to the resulting large-margin classification. Secondly, the second part of the objective function, Σ_{i=1..D} P_i w_i, is equal to a weighted L1 regularization term when w_i is enforced to be non-negative, which leads to a sparse feature selection result. Thirdly, the non-negative constraint on w is beneficial to a stable solution of the LP optimization problem. For example, w_i < 0 would imply, via Eqn. 3 and Eqn. 4, that the intra-class Hamming distances of the i-th ordinal feature are generally larger than its inter-class Hamming distances; such a conclusion contradicts the facts and may introduce instability into the LP learning problem.

The feature selection method proposed in this paper has a different optimization formulation from the existing LP method [13] in terms of the weighted sparsity term in the objective function and the non-negative constraint on the feature weights. Therefore our method is more suitable for learning discriminative, robust and sparse features for biometric recognition.
It should be noted that our LP formulation is flexible, and a number of variants may be generated to meet the requirements of specific feature selection applications. For example, the LP formulation turns into a 0-1 programming problem when w_i is defined as a binary variable (1 or 0), and an additional constraint

    Σ_{i=1..D} w_i = N

may be introduced to control the number of selected ordinal feature components (N) according to practical requirements. We can also add other application-specific requirements to the objective function and constraints; as long as the newly added terms can be expressed as linear functions, our feature selection can still be solved efficiently by linear programming.
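If the binary w_i are relaxed to 0 ≤ w_i ≤ 1, the variant with this cardinality constraint stays a plain LP whose optimum simply puts unit weight on the N cheapest features. A minimal sketch under that relaxation (synthetic penalties, illustrative names):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
D, N = 12, 3
P = rng.uniform(0.0, 1.0, D)      # per-feature penalties (synthetic)

# Relax w_i in {0, 1} to 0 <= w_i <= 1 with sum(w) = N.
res = linprog(c=P,
              A_eq=np.ones((1, D)), b_eq=[N],
              bounds=[(0, 1)] * D)
# With distinct penalties this relaxation is integral: weight 1 goes
# to the N features with the smallest P_i.
chosen = np.flatnonzero(res.x > 0.5)
```

The true 0-1 program would need an integer solver, but for this particular constraint structure the LP relaxation already yields a binary solution.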
Because our feature selection method can be transformed into a standard linear programming model, it can be solved conveniently and efficiently by the Simplex algorithm [19], which is backed by a well-developed theory for obtaining a globally optimal solution. We sort the weights of the features to get the desired number of features. To correct truncation errors, extra classifiers, e.g., Nearest Neighbor (NN) or SVM, can then be used for recognition. Another advantage of LP is that a number of software tools can solve linear programming problems, such as CPLEX [19] and LINDO [20], and state-of-the-art commercial mathematical toolboxes can efficiently solve large-scale linear programming problems with millions of variables. The LP formulation of this paper only involves thousands of variables, so we choose the CPLEX software package provided by IBM [19], which is free of charge for academic research.
There are a number of fast implementations of linear programming. The computational complexity of linear programming based on the interior-point method is O(DN^2) [21], where N is the number of training samples and D is the initial dimension of the feature pool. In comparison, the complexity of GentleBoost is O(dD^2 log D) [22], where d is the number of selected features, and the complexity of Lasso is O(TND) [23], where T is the number of iterations. GentleBoost is efficient in biometric feature selection because a small number (d) of effective features is accurate enough for personal identification. The Lasso algorithm is more time-consuming because it involves matrix operations. The complexity of LP-based feature selection is low for small training databases.
The following two sections demonstrate the effectiveness of the proposed feature selection method for iris and palmprint recognition, respectively. State-of-the-art iris and palmprint recognition methods and representative feature selection methods are evaluated on the CASIA and PolyU biometrics databases for performance comparison, to show the merit of the proposed LP formulation. It should be noted that the main purpose of this paper is to discover the most effective ordinal features for iris and palmprint recognition; this can be regarded as a specific feature selection problem, so our method is not tested on the popular databases for research and evaluation of generic feature selection methods. Two representative generic feature selection methods, i.e., mRMR [11] and ReliefF [12], and two popular feature selection methods in biometrics, i.e., Boosting [14] and Lasso [15], [17], are used for performance comparison.
IV. ORDINAL FEATURE SELECTION FOR IRIS RECOGNITION
Our previous work has demonstrated the effectiveness of ordinal measures for iris recognition, and there is a large number of stable ordinal measures in iris images [3]. However, how to choose the most effective feature set of ordinal measures for reliable iris recognition is still an unsolved problem. In our previous work, a di-lobe and a tri-lobe ordinal filter were jointly used for iris feature extraction [3]. The parameter settings of these ordinal filters are hand-crafted, and the filters are applied to all iris image regions. However, texture characteristics such as the scale, orientation and salient texture primitives of iris patterns vary from region to region, so it is better to employ region-specific ordinal filters for iris feature analysis.
It should be noted that the process of ordinal feature selection does not consider the prior mask information of eyelids, eyelashes and specular reflections. There are mainly two kinds of strategies to deal with the occlusion problem in iris recognition. The first is to segment and exclude occluded regions in iris images and to label these regions with a mask during iris matching. But this needs accurate and efficient iris segmentation; in addition, the size of the iris template doubles, and, more importantly, the computational cost of both iris image preprocessing and iris matching is significantly increased by the mask strategy. So it is more realistic to identify and exclude heavily occluded iris images in the quality assessment stage. The remaining iris images used for feature extraction and matching are less occluded by eyelids and eyelashes, which benefits both the accuracy and the efficiency of iris recognition. This paper aims to learn a common ordinal feature set applicable to the less occluded iris images of all subjects. The feature selection process is independent of any individual- or image-specific prior information such as an iris segmentation mask. We believe the commonly selected feature set should be accurate enough to recognize almost all subjects, because individual- and sample-specific variations have already been taken into consideration in feature selection. We also tried to integrate the occlusion mask into feature selection and feature matching, but observed no improvement of accuracy on state-of-the-art iris image databases, which have usually excluded heavily occluded iris images. We believe the common ordinal features discovered in this paper are valuable for practical iris recognition systems.

This paper mainly focuses on feature analysis; the details of iris image preprocessing can be found in [24].
Iris texture varies from region to region in terms of the scale,
orientation and shape of texture primitives, so region-specific
ordinal filters are needed to achieve the best performance.
Therefore iris images are divided into multiple blocks and
different types of ordinal filters with different parameter
settings are applied to each image block. Feature selection
methods can then be used to find the most effective set of
image blocks together with the most appropriate parameter
settings. In this paper, the preprocessed and normalized
iris image is divided into multiple regions, and a number of
di-lobe and tri-lobe ordinal filters with variable scale, orientation and inter-lobe distance are applied to each region
to generate 47,042 regional ordinal feature units (Fig. 2).
Each feature unit, jointly determined by the spatial
location of the iris region and the corresponding ordinal filter,
consists of 256 ordinal measures, encoded as 256 bits
(32 bytes). The objective of feature selection is to select a
limited number of OM feature units from this candidate
set.
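To make the notion of a regional ordinal feature unit concrete, the following sketch builds a zero-sum tri-lobe ordinal filter and encodes one ordinal measure as a bit. The filter size, lobe spacing and Gaussian lobe shape here are illustrative assumptions, not the paper's exact parameter settings.

```python
import numpy as np

def gaussian_lobe(shape, center, sigma):
    """2-D Gaussian lobe centred at `center` inside a filter of size `shape`."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    g = np.exp(-((x - center[1]) ** 2 + (y - center[0]) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def trilobe_filter(shape=(9, 27), sigma=1.5, spacing=9):
    """Tri-lobe ordinal filter: one positive lobe flanked by two negative
    half-weight lobes, so the coefficients sum to zero (robust to uniform
    illumination changes)."""
    cy, cx = shape[0] // 2, shape[1] // 2
    f = gaussian_lobe(shape, (cy, cx), sigma)
    f -= 0.5 * gaussian_lobe(shape, (cy, cx - spacing), sigma)
    f -= 0.5 * gaussian_lobe(shape, (cy, cx + spacing), sigma)
    return f

def ordinal_bit(region, filt):
    """One ordinal measure: the sign of the filter response, encoded as a bit."""
    return int(np.sum(region * filt) >= 0)
```

Sliding such a filter over 256 sampling positions of an image block would yield the 256-bit (32-byte) code of one feature unit.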
The experimental part of this paper tests and compares
the proposed linear programming (LP) method with four
feature selection methods for ordinal iris feature analysis,
i.e., Boosting [14], [16], Lasso [15], [17], mRMR [11] and
ReliefF [12]. The resulting methods for selecting an effective
set of ordinal measures are named LP-OM, Boost-OM,
Lasso-OM, mRMR-OM and ReliefF-OM respectively. There
are a number of variants of boosting; we tried Adaboost and
Gentleboost in experiments and found that Gentleboost performs
slightly better than Adaboost, so Gentleboost is used in this
paper to represent the boosting category of feature selection
methods. Three iris image datasets in CASIA Iris Image
Database Version 4.0 (CASIA-IrisV4) [25], namely
CASIA-Iris-Thousand, CASIA-Iris-Lamp and CASIA-Iris-Interval,
are used in the experiments. To demonstrate the
advantage of feature selection for visual biometrics,
a randomly selected ordinal feature set with the same number
of feature units is employed as a baseline; this ordinal feature
representation without feature selection is denoted Random-OM.
To further benchmark feature selection in iris recognition,
the state-of-the-art iris recognition methods proposed by
Daugman [26] and Ma et al. [27] are implemented as baseline
algorithms. A number of hand-crafted parameter settings are
tried for these two methods [26], [27] and the best results are
reported in this paper. The idea of sparse representation of iris
features has recently been proposed by Kumar [28] using L1
regularization, so the main feature selection method in [28]
can be represented by Lasso-OM.
CASIA-Iris-Thousand contains 20,000 iris images from
1,000 subjects, collected using the IKEMB-100 camera
produced by IrisKing. The samples in CASIA-Iris-Thousand
are 8-bit gray-level iris images with resolution 640×480, and
the diameter of the iris ring is around 200 pixels. All iris
images in CASIA-Iris-Thousand are compressed to JPEG
format to save storage. The main sources of


intra-class variations in CASIA-Iris-Thousand include illumination changes, motion blur, eyeglasses, specular reflections,
and JPEG compression. Since CASIA-Iris-Thousand is the
largest iris image dataset in the public domain, it is well
suited for studying the uniqueness of iris features and the
practical performance of iris recognition algorithms.

The iris images of the first 25 subjects are used as the
training dataset and the remaining 19,500 iris images of
975 subjects are used to test the performance of the various
feature selection methods. There are in total 500 iris images
of 50 eyes in the training dataset, and they are used to generate
2,250 intra-class and 4,900 inter-class matching samples. We
do not use all possible inter-class matching samples for three
reasons. Firstly, it keeps the balance between the numbers of
positive and negative samples. Secondly, using a subset of
inter-class comparisons minimizes the number of linear
constraints in the linear program, which simplifies the
optimization problem. Thirdly, it reduces the redundancy
among negative samples. Five iris recognition methods,
namely LP-OM, Boost-OM, Lasso-OM, mRMR-OM and
ReliefF-OM, are run on the training dataset to obtain their
most effective feature sets of ordinal measures, and the
selected ordinal feature units are then evaluated on the testing
dataset.

Firstly we investigate the feature selection results of the
proposed LP-OM method. The weights of the 47,042 ordinal
feature units output by feature selection are shown in Fig. 4a.
There are only 26 non-zero components; almost all ordinal
feature units (47,016, or 99.94%) receive zero weight in LP
feature selection. Two conclusions can be drawn from this
observation. Firstly, the fundamental assumption of
high-dimensional ordinal feature selection is satisfied: the
regression function from ordinal measures to individual
identity lies on a low-dimensional manifold, so it is possible
to use statistical inference methods such as linear programming
to derive a compact feature set for efficient iris recognition.
Secondly, the proposed linear programming method achieves
a sparse feature set. The sparsity of our feature selection
method comes from the second part of the objective function,
∑_{i=1}^{D} P_i w_i, together with the non-negative constraints
in Eqn. 7, so that the minimization of ∑_{i=1}^{D} P_i w_i
is approximately equivalent to the minimization of the
L1 regularization term in the Lasso algorithm, which
has a solid theory to guarantee a sparse learning result.

Fig. 4. Sparsity analysis of feature selection methods. (a) The learning result
of linear programming. (b) Iris recognition performance as a function of the
number of selected ordinal feature units.
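As a concrete illustration, the large-margin objective with weighted sparsity described above can be prototyped with a generic LP solver. This is a schematic sketch under stated assumptions (unit margin, per-pair slack variables, a trade-off weight `lam`), not the exact formulation of Eqn. 7.

```python
import numpy as np
from scipy.optimize import linprog

def lp_feature_selection(D_intra, D_inter, P, lam=1.0):
    """Sketch of LP feature selection: minimise misclassification slack plus
    the weighted sparsity term sum_i P_i * w_i, subject to large-margin
    constraints separating intra- and inter-class dissimilarity vectors.

    D_intra, D_inter: (n, D) per-feature dissimilarities of matching pairs.
    Variables: w (D feature weights, >= 0), xi (per-pair slacks, >= 0),
    b (bias, free).
    """
    X = np.vstack([D_intra, D_inter])
    y = np.r_[np.ones(len(D_intra)), -np.ones(len(D_inter))]  # +1 genuine, -1 impostor
    n, d = X.shape
    # objective: sum(xi) + lam * sum(P_i * w_i)
    c = np.r_[lam * np.asarray(P, dtype=float), np.ones(n), 0.0]
    # margin constraint y_j (b - w.x_j) >= 1 - xi_j,
    # rewritten as  y_j * w.x_j - xi_j - y_j * b <= -1
    A = np.hstack([y[:, None] * X, -np.eye(n), -y[:, None]])
    bounds = [(0, None)] * (d + n) + [(None, None)]
    res = linprog(c, A_ub=A, b_ub=-np.ones(n), bounds=bounds, method="highs")
    return res.x[:d], res
```

With dissimilarity vectors whose informative components separate genuine from impostor pairs, the returned weight vector tends to be sparse, mirroring the behaviour shown in Fig. 4a.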
To further investigate the relationship between iris recognition
performance and the number of ordinal feature units, the
discriminating index of the top N ordinal features chosen by
three feature selection methods (LP-OM, Boost-OM,
Lasso-OM) is shown in Fig. 4b. The experimental results
indicate that iris recognition performance saturates as the
number of ordinal feature units increases. This result
demonstrates both the necessity and the feasibility of a sparse
representation of ordinal measures in iris images. Because a
limited number of ordinal features is sufficient to achieve high
accuracy, only the 15 ordinal feature units (i.e., 480 bytes of
ordinal code) with the largest weights are selected to build an
iris recognition system for each feature selection method in the
following experiments.
It is interesting to examine the parameter P_i in the linear
programming formulation. When P is a uniform vector, i.e.,
P_1 = P_2 = · · · = P_D, the term ∑_{i=1}^{D} P_i w_i is
equal to the L1 regularization term in the Lasso algorithm. We
argue that it is better to incorporate the prior information of
each ordinal feature unit into the objective function to enforce
the priority of well-performing ordinal feature units on the
training dataset. In the experiment on CASIA-Iris-Thousand,
four options for P (i.e., P_i = 1/D, P_i = 1/DIndex(OM_i),
P_i = AUC(OM_i), P_i = EER(OM_i)) are tried to learn
different ordinal feature sets for iris recognition. The testing
results of these four settings of P_i are shown in Fig. 5. The
best iris recognition result is clearly achieved when
P_i = 1/DIndex(OM_i), which indicates that the discriminating
index is the most important prior information about each
ordinal feature unit. The results also demonstrate that
incorporating discriminative penalty terms such as EER and
AUC into the feature learning module can significantly improve
biometric recognition accuracy.
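The discriminating-index prior used above can be computed from the per-unit intra- and inter-class matching score distributions with Daugman's decidability formula; a minimal sketch:

```python
import numpy as np

def discriminating_index(intra_scores, inter_scores):
    """Decidability d' between the intra- and inter-class matching score
    distributions: |mu1 - mu2| / sqrt((var1 + var2) / 2)."""
    m1, m2 = np.mean(intra_scores), np.mean(inter_scores)
    v1, v2 = np.var(intra_scores), np.var(inter_scores)
    return abs(m1 - m2) / np.sqrt((v1 + v2) / 2.0)
```

A large d′ means the two score distributions are well separated, so the prior P_i = 1/DIndex(OM_i) gives well-performing feature units a small sparsity penalty.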


Fig. 5. Comparison of different weighting strategies for LP-OM.

Fig. 6. Comparison of four feature selection methods for iris recognition.

Comparison results of the five feature selection methods
and the state-of-the-art iris recognition methods [26], [27] on
the testing dataset of CASIA-Iris-Thousand are shown in
Fig. 6 and Table I. The baseline performance of Random-OM
is also listed in Table I.
A number of conclusions can be drawn from the experimental results.
• Ordinal features are effective for iris recognition. Even
though we randomly select 15 ordinal feature units from
the 47,042 candidates, a good recognition performance
(EER = 2.91%) is achieved on the largest iris dataset in
the public domain.
• The ordinal features automatically selected by machine
learning approaches (Boost-OM, Lasso-OM, mRMR-OM
and LP-OM) perform much better than randomly chosen
ordinal features (Random-OM). It is therefore worthwhile
to adopt feature selection methods to learn a distinctive and
robust ordinal feature set for iris recognition. The feature
selection based iris recognition methods also perform
significantly better than the state-of-the-art methods [26], [27].
Our methods have two advantages: the first is the advantage
of ordinal measures [3] over the iris code [26] and shape
code [27]; the second is the use of feature selection, whereas
the implementations of the state-of-the-art iris recognition
methods rely on hand-crafted feature parameters. To
demonstrate the effectiveness of the proposed feature selection
method, LP is also used to select the best parameter setting
of Gabor filters for iris recognition in Appendix A [37]. The
results show that the learned Gabor filters achieve much
higher accuracy than Gabor filters with hand-crafted
parameters.
• There exist performance differences among the five
feature selection methods. Both mRMR-OM and
ReliefF-OM are generic feature selection methods, and they
perform worse than the feature selection methods proven in
biometrics, such as Boost-OM and Lasso-OM. In general,
global optimization methods such as Lasso-OM and LP-OM
achieve higher accuracy on the testing dataset than greedy
learning methods such as Boost-OM, and LP-OM learns a
better ordinal feature set than Lasso-OM. The experimental
results therefore demonstrate that the proposed linear
programming method achieves the highest accuracy in terms
of EER, discriminating index and AUC. The advantage of our
feature selection method is even more significant in practical
iris recognition applications, where the FAR is usually
required to be smaller than 10^−6. For example, at
FAR = 10^−8, LP-OM achieves a significantly smaller FRR
than Lasso-OM and Boost-OM.
• The computational cost of feature selection is measured in
the Matlab 2011 programming environment on a 2.83 GHz
personal computer. Linear programming is much more
efficient than Lasso and mRMR in feature selection.
Boosting is the fastest method for selecting the top
15 ordinal feature units because of its greedy feature
selection strategy; in contrast, both linear programming
and Lasso provide a global weighting result for all ordinal
feature units, so they are less efficient than boosting.
This paper only tried feature selection with tens of thousands
of variables, so the computational complexity of feature
selection is not critical in the offline training stage. However,
it is possible to introduce millions of variables into the
optimization on a large-scale training database, because more
training data usually benefits pattern recognition. In addition,
it is possible to extend our work to online feature selection
(e.g., person-specific feature selection in forensic applications),
where training time does matter in real-time applications.
• The optimization objective functions of both Lasso and
LP consist mainly of two terms, namely a misclassification
penalty term and a sparsity penalty term, and a parameter λ
can be used to weight the relative importance of these two
terms. It is interesting to investigate the sensitivity of visual
biometric recognition performance to the parameter λ. The
EER of iris recognition as a function of λ for these two
feature selection methods on a cross-validation dataset is
shown in Fig. 7. We can see that Lasso is sensitive to the
setting of λ, whereas LP achieves a comparatively stable
performance under variation of λ.
• It is interesting to investigate the sparsity properties of
Lasso and LP. The results show that linear programming
achieves a much sparser training result, i.e., 26 non-zero
components (LP) vs. 500 non-zero components (Lasso).
LP is therefore advantageous over Lasso for achieving a much
more compact feature representation for iris biometrics.

TABLE I
Comparison of Performance of Iris Recognition Methods on CASIA-Iris-Thousand

Fig. 7. The sensitivity of feature selection methods to the trade-off parameter λ
between accuracy and sparsity.
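The FAR/FRR operating points discussed above can be estimated empirically from matching score samples; a minimal sketch assuming dissimilarity scores (accept when score ≤ threshold). Note that reliably estimating a FAR near 10^−8 in practice requires far more than 10^8 impostor comparisons.

```python
import numpy as np

def frr_at_far(intra, inter, far_target):
    """Empirical FRR at the dissimilarity threshold whose impostor
    acceptance rate (FAR) is approximately far_target."""
    # far_target fraction of impostor (inter-class) scores fall below t
    t = np.quantile(inter, far_target)
    # genuine (intra-class) scores above t are falsely rejected
    return float(np.mean(np.asarray(intra) > t))
```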
Some typical ordinal feature units selected by
mRMR, LP, Lasso and Boost are illustrated in Fig. 8 (the
results of ReliefF are not shown because it performs
much worse than the other feature selection methods). A number
of conclusions can be drawn from the visualization of the
feature selection results.
1) The lower iris image regions adjacent to the pupil
are the most effective for iris recognition because these
regions are rich in iris texture information and are
much less likely to be occluded by eyelids and
eyelashes.
2) Both di-lobe and tri-lobe filters are selected, so they are
complementary for iris recognition. The orientation
of most selected ordinal filters is horizontal because iris
texture is mainly distributed along the circular direction in
iris images, i.e., the horizontal orientation in the normalized
format.
3) There exist some differences among the four feature
selection methods (mRMR, LP, Lasso and Boost) in
terms of the selected ordinal filters and iris image
regions, and these minor differences in the selected features
account for the differences in iris recognition
performance.
The experimental results on CASIA-Iris-Lamp and
CASIA-Iris-Interval are reported in Appendices B and C,
respectively [37].

Fig. 8. Some typical ordinal feature units selected by LP, Lasso, Boost and
mRMR. (a) LP-OM. (b) Lasso-OM. (c) Boost-OM. (d) mRMR-OM.

V. ORDINAL FEATURE SELECTION FOR PALMPRINT RECOGNITION
Palmprint provides a reliable source of information for
automatic personal identification and has wide and important
applications [29], [30]. The richness of the visual information
available in palmprint images, including principal lines, wrinkles,


ridges, minutiae points, singular points and texture, provides
various possibilities for palmprint feature representation and
pattern recognition [29], [30]. A number of feature representation
methods for palmprint recognition have been proposed in
the literature, including geometric structures such as point and
line patterns, global appearance descriptions based on subspace
analysis, and local texture analysis. Competitive code represents
the state-of-the-art performance in palmprint recognition
[29]–[31]. There, each palmprint image region is assumed
to have a dominant line segment whose orientation is regarded
as the palmprint feature. Because the even Gabor filter is well
suited to modeling line segments, it is used to filter the
local image region along six different orientations, obtaining
the corresponding contrast magnitudes. Following the
winner-take-all competitive rule, the index (ranging from 0 to 5) of
the minimum contrast magnitude is represented by three bits,
namely the competitive code [31]. Recently, some improvements
[32], [33] and variants [34] of competitive code have been
reported. This paper attempts to provide a new understanding
of and solution to the problem of palmprint feature analysis using
ordinal measures and linear programming.
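The winner-take-all encoding described above can be sketched as follows; the Gabor parameter values are illustrative assumptions rather than those used in [31].

```python
import numpy as np

def even_gabor(size, theta, sigma=4.0, omega=0.4):
    """Real, even-symmetric Gabor filter oriented at angle theta, with the
    DC component removed (zero mean)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) * np.cos(omega * xr)
    return g - g.mean()

def competitive_code(patch, size=17):
    """Winner-take-all rule: index (0..5) of the minimum filter response
    over six orientations, representable in three bits."""
    responses = [np.sum(patch * even_gabor(size, k * np.pi / 6.0))
                 for k in range(6)]
    return int(np.argmin(responses))
```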
The unique and rich texture information of palmprint images is
useful for personal identification. There are a large number of
irregularly distributed line segments on the palm surface, mainly
constituted by principal lines and wrinkles. The photometric
properties of these line segments differ significantly from
those of non-line regions, so the reflection ratios between
line and non-line regions have a stable ordinal relationship, i.e.,
R(line) < R(non-line). Since the illumination strength
of neighboring palmprint regions is approximately identical,
it can be derived that ordinal measures of intensity
between palmprint regions are robust descriptors for identity
verification. For each palm, the spatial configuration of the line
and non-line image regions used for ordinal measures, including
location, orientation and scale, has a unique layout. The core
idea of ordinal-measure-based palmprint representation is thus to
recover this random layout of ordinal measures for feature
matching.
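The ordinal relationship R(line) < R(non-line) between elongated regions can be illustrated with a pair of orthogonal anisotropic Gaussian lobes; the lobe sizes and placements here are illustrative assumptions.

```python
import numpy as np

def elongated_lobe(shape, center, theta, s_long=6.0, s_short=1.5):
    """Oriented anisotropic Gaussian modelling an elongated, line-like region."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    dx, dy = x - center[1], y - center[0]
    u = dx * np.cos(theta) + dy * np.sin(theta)    # along the lobe
    v = -dx * np.sin(theta) + dy * np.cos(theta)   # across the lobe
    g = np.exp(-(u ** 2 / (2 * s_long ** 2) + v ** 2 / (2 * s_short ** 2)))
    return g / g.sum()

def dilobe_ordinal(patch, c1, c2):
    """Ordinal measure between two orthogonal elongated regions:
    1 if the average intensity around c1 exceeds that around c2, else 0."""
    l1 = elongated_lobe(patch.shape, c1, 0.0)          # horizontal region
    l2 = elongated_lobe(patch.shape, c2, np.pi / 2.0)  # vertical region
    return int(np.sum(patch * l1) >= np.sum(patch * l2))
```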
This paper mainly focuses on feature analysis for palmprint
biometrics; the details of palmprint image preprocessing
can be found in the existing publications [29], [30]. For
palmprint images, the gaps between neighboring fingers can
be used as landmark points to correct the rotation
and scale changes of palmprint images, after which the central
region is cropped as the input for feature analysis. In
this paper, all palmprint images are normalized into a central
ROI with resolution 128×128. Each ordinal filter applied to
the ROI then generates 32×32 = 1,024 bits (128 bytes) of
ordinal code, following the feature extraction routine of most
state-of-the-art palmprint recognition algorithms
[4], [29], [31]–[33]. So if we select N ordinal filters for
palmprint image analysis, the template size for each palmprint
image is 128×N bytes.
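Matching two such packed ordinal codes then reduces to a fractional Hamming distance over the code bits, for example:

```python
import numpy as np

def hamming_distance(code_a, code_b):
    """Fractional Hamming distance between two packed ordinal codes
    (uint8 arrays): fraction of differing bits."""
    diff = np.bitwise_xor(code_a, code_b)
    return np.unpackbits(diff).sum() / (8.0 * code_a.size)
```

For a single 128-byte per-filter code this compares 1,024 bits; identical codes give 0.0 and complementary codes give 1.0.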
Because of the difference between the texture primitives
of iris and palmprint patterns, we need to provide
biometric-modality-specific ordinal filters as the input to
feature selection. Our previous work [4] only tried di-lobe
ordinal filters (Fig. 9a) for palmprint recognition, and the

Fig. 9. Illustration of di-lobe and tri-lobe ordinal filters for palmprint image
analysis. (a) Examples of di-lobe ordinal filters. (b) Examples of tri-lobe
ordinal filters.

results show that ordinal measures between two elongated,
line-like and orthogonal image regions are well suited to
palmprint feature analysis. In this paper we explore tri-lobe
ordinal filters (Fig. 9b) for palmprint feature extraction for
the following reasons.
1) Tri-lobe ordinal filters are expected to be more discriminative and robust than di-lobe filters;
2) A much larger feature space can be generated by tri-lobe
ordinal filters, so feature selection methods can search for
a better solution for palmprint recognition;
3) Di-lobe filters can be regarded as special cases of
tri-lobe filters, so the well-performing di-lobe ordinal
filters from our previous work are also included in the
current development.
To test the proposed feature selection method for palmprint
recognition, the PolyU palmprint image database [35]
is used for performance evaluation. The PolyU Palmprint
Database was collected by a CCD camera-based imaging
device. A subject puts his or her hand on a platform with the
guidance of six pegs, and low-resolution images (75 dpi) are
captured for online processing. The latest PolyU Ver 2.0
contains 7,752 palmprint images from 386 palms. Each palm
has two sessions of images, each with at most 10 images,
and the average time interval between the two sessions is
two months. The lighting conditions and the focus of the
imaging device were changed between the two capture sessions,
which challenges the robustness of recognition algorithms. All
images are 8-bit gray-level images with resolution 384×284.
It is a great challenge to group together intra-class palmprint
images without compromising inter-class distinctiveness.
The latest version of the PolyU Palmprint Database,
PolyU-Palmprint Ver 2.0, has been widely used in the literature,
and most state-of-the-art palmprint recognition methods are
tested and compared on this database. The first version,
PolyU-Palmprint Ver 1.0, only has 600 palmprint images of
100 classes. In this paper we use PolyU-Palmprint Ver 1.0 as
the training dataset and PolyU-Palmprint Ver 2.0 as the testing
dataset.
It should be noted that the palmprint images of PolyU 1.0
are transformed from a small part of images in PolyU 2.0 so
there may exist correlation or overlap between PolyU 1.0 and


Fig. 10. Illustration of the generation of the synthetic training dataset.

PolyU 2.0. It is usually suggested to use independent training
and testing datasets in pattern recognition experiments;
however, this paper still uses PolyU 1.0 for training and
PolyU 2.0 for testing for the following reasons.
• Almost all public palmprint databases, including PolyU
and CASIA, lack a division into training and testing sets of
the kind used in face biometrics, so most palmprint
recognition researchers report the best results tuned on the
whole database. We think it is fair to compare our methods
with state-of-the-art palmprint recognition methods,
considering that PolyU 1.0 is related to only 7.7% of the
palmprint images in PolyU 2.0, and it is better to report
palmprint recognition accuracy on the full PolyU 2.0 for
performance evaluation against the existing methods.
• Our previous work [4] has demonstrated that it is easy to
achieve 100% accuracy on PolyU 1.0 for both competitive
code and ordinal code. So the performance of state-of-the-art
palmprint recognition methods on the independent part of
PolyU 2.0 (excluding all related images in PolyU 1.0) can
be measured and compared with the testing results on
PolyU 2.0.
• The generalization capability of LP-OM is demonstrated
on the CASIA database using the ordinal features trained
on PolyU 1.0 (Appendix D [37]), so it is unnecessary to
insist on the independence between PolyU 1.0 and PolyU 2.0.
Since the PolyU Palmprint Database is collected using a
high-quality sensor and PolyU-Palmprint Ver 1.0 is small in size,
our previous work based on hand-crafted di-lobe ordinal
filters [4] can achieve zero EER on PolyU-Palmprint Ver 1.0.
To learn a robust feature set of ordinal measures, a more
challenging training dataset is constructed by adding noise
and perturbations to PolyU-Palmprint Ver 1.0 (Fig. 10).
The synthetic training dataset finally includes 4,200 palmprint
images of 100 classes.
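A synthetic training variant might be generated along these lines; the exact perturbations (translation range, noise level) are illustrative assumptions, as the paper does not specify them.

```python
import numpy as np

def perturb(img, rng, max_shift=4, noise_sigma=8.0):
    """One synthetic training variant: random cyclic translation plus
    additive Gaussian noise, clipped back to 8-bit gray levels."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(img.astype(float), (int(dy), int(dx)), axis=(0, 1))
    out += rng.normal(0.0, noise_sigma, out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Applying several such perturbations to each original image expands a small dataset into a larger, more challenging training set.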
Firstly, 5,000 tri-lobe ordinal filters are generated with
random parameter settings of location, scale, and orientation
and are tested on the training dataset. The top 500 tri-lobe
ordinal filters with the smallest EER are selected as
the candidate feature pool. Some tri-lobe ordinal filters in
the feature pool are shown in Fig. 9b. We can see that these
ordinal filters are significantly different from the filters used for

Fig. 11. Illustration of selected tri-lobe ordinal filters for palmprint image
analysis. (a) The top 5 tri-lobe ordinal filters selected by linear programming
from the pool of 500 ordinal filters in the first round of feature selection.
(b) The top 2 ordinal filters in the second round of feature selection.

Fig. 12. ROC curves of palmprint recognition methods on the PolyU Palmprint
Image Database.

iris recognition. The proposed linear programming
method is then used to select the top 5 ordinal filters, as shown
in Fig. 11a. The experimental results on the testing dataset
show that the first two tri-lobe ordinal filters alone achieve
state-of-the-art palmprint recognition performance.
It is a grand challenge to search the huge parameter space for
the optimal parameter setting of tri-lobe ordinal filters, because
the design of a tri-lobe ordinal filter involves 15 variables
in total. Although the top 2 tri-lobe ordinal filters selected from
the random filter pool are good enough for palmprint recognition,
the candidate feature pool only has 500 tri-lobe ordinal filters
and better tri-lobe ordinal filters may exist outside it.
Therefore we further generate more tri-lobe ordinal filters
based on the basic profiles of the top 2 tri-lobe ordinal filters
by varying the scale and location parameters of their lobes.
The newly generated tri-lobe ordinal filters are used to train a
better palmprint recognition algorithm in a second round of
feature selection (Fig. 11b).
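The second-round candidate generation can be sketched as a grid of perturbations around a good base profile; the parameter names and step sizes are assumptions for illustration.

```python
import itertools

def second_round_candidates(base_params, scale_deltas=(-2, 0, 2),
                            shift_deltas=(-3, 0, 3)):
    """Enumerate perturbed copies of a good tri-lobe profile by jointly
    varying the scale and the (x, y) location of its lobes."""
    out = []
    for ds, dx, dy in itertools.product(scale_deltas, shift_deltas,
                                        shift_deltas):
        cand = [{"x": lobe["x"] + dx,
                 "y": lobe["y"] + dy,
                 "sigma": max(1, lobe["sigma"] + ds)}
                for lobe in base_params]   # lobe = dict(x=..., y=..., sigma=...)
        out.append(cand)
    return out
```

Each candidate profile would then be instantiated as a tri-lobe filter and scored on the training set, exactly as in the first round.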
The experimental results of the three feature selection
methods on the PolyU Palmprint Image Database Ver 2.0 are
shown in Fig. 12 and Table II. The state-of-the-art palmprint
recognition method based on competitive code [31], its
variants [32]–[34], and our previously proposed di-lobe OM [4]
are used as reference algorithms for performance
comparison. We can see that the top 2 tri-lobe ordinal filters
from the first round of LP feature selection already achieve
a smaller EER than Boost-OM, Lasso-OM, competitive code


TABLE II
Comparison of Performance of Palmprint Recognition Methods on the PolyU Palmprint Image Database

and di-lobe OM. Moreover, LP-OM after the second round
of feature selection achieves the highest accuracy (EER =
6.19×10^−5) with the smallest feature template (256 bytes)
on PolyU Ver 2.0, to the best of our knowledge.
The experimental results on the CASIA Palmprint Image
Database are reported in Appendix D [37].
VI. DISCUSSION AND CONCLUSIONS
This paper has proposed a novel feature selection method,
based on linear programming, to learn the most effective ordinal
features for iris and palmprint recognition. The success of
LP feature selection comes from the incorporation of the
large-margin principle and weighted sparsity rules into the
LP formulation. The LP-based feature selection model can
flexibly integrate prior information about each feature unit
relevant to biometric recognition, such as DI, EER and AUC,
into the optimization procedure. The experimental results have
demonstrated that the proposed LP feature selection method
outperforms mRMR, ReliefF, Boosting and Lasso.
A number of conclusions can be drawn from the study.
• The identity information of visual biometric patterns
comes from the unique structure of ordinal measures. The
optimal parameter settings of local ordinal descriptors vary
from modality to modality, subject to subject, and even region
to region, so it is impossible to develop a common set of
ordinal filters that achieves the best performance for all visual
biometric patterns. Ideally it would be best to select the optimal
ordinal filters encoding individual-specific ordinal measures
via machine learning; however, such a personalized solution is
inefficient in large-scale personal identification applications.
The task of this paper therefore turns to a suboptimal solution:
learning a common ordinal feature set for each biometric
modality, which is expected to work well for most subjects.
• A main contribution of this paper is a novel optimization
formulation for feature selection based on linear programming
(LP). Our expectations on the feature selection results, i.e., an
accurate and sparse ordinal feature set, can be described as a
linear objective function. Such a linear learning model has
three advantages. Firstly, the feature selection model is simple
to build, understand, learn and explain. Secondly, the linear
penalty term is robust against outliers. Thirdly, a linear model
needs only a small number of training samples to achieve a
globally optimal result with good generalization ability.
• Weighted sparsity is proposed in this paper and the
results show that it performs better than traditional sparse
representation methods. It is therefore better to incorporate
prior information about candidate features into the optimization
model in sparse learning.
The intra-class variations in visual biometrics mainly come
from photometric changes (e.g., illumination) and geometric
changes (e.g., pose, deformation). In this paper we have shown
that LP feature selection copes well with sharp photometric
variations and slight geometric variations in iris and palmprint
patterns. Our future work will apply LP feature selection to
other visual biometric traits such as palm vein, finger vein, face
and fingerprint recognition, although additional efforts may
be required to address the sharp geometric variations in face
(pose) and fingerprint (deformation) biometrics. The proposed
linear programming formulation is used for visual biometrics
in this paper, but we believe it is a general feature selection
and sparse representation method applicable to other computer
vision and pattern recognition tasks.
R EFERENCES
[1] A. K. Jain, A. A. Ross, and K. Nandakumar, Introduction to Biometrics.
New York, NY, USA: Springer-Verlag, 2011.
[2] T. Tan and Z. Sun, Ordinal representations for biometrics recognition,
in Proc. 15th Eur. Signal Process. Conf., 2007, pp. 3539.
[3] Z. Sun and T. Tan, Ordinal measures for iris recognition, IEEE Trans.
Pattern Anal. Mach. Intell., vol. 31, no. 12, pp. 22112226, Dec. 2009.
[4] Z. Sun, T. Tan, Y. Wang, and S. Z. Li, Ordinal palmprint representation for personal identification, in Proc. Conf. Comput. Vis. Pattern
Recognit. (CVPR), vol. 1. 2005, pp. 279284.
[5] Z. Chai, Z. Sun, H. Mendez-Vazquez, R. He, and T. Tan, Gabor ordinal
measures for face recognition, IEEE Trans. Inf. Forensics Security,
vol. 9, no. 1, pp. 1426, Jan. 2014.
[6] H. Liu and L. Yu, Toward integrating feature selection algorithms for
classification and clustering, IEEE Trans. Knowl. Data Eng., vol. 17,
no. 4, pp. 491502, Apr. 2005.
[7] L. Yu and H. Liu, Efficient feature selection via analysis of relevance
and redundancy, J. Mach. Learn. Res., vol. 5, pp. 12051224, Oct. 2004.
[8] M. A. Hall, Correlation-based feature selection for discrete and numeric
class machine learning, in Proc. 17th Int. Conf. Mach. Learn., 2000,
pp. 359366.
[9] S. Das, Filters, wrappers and a boosting-based hybrid for feature
selection, in Proc. 18th Int. Conf. Mach. Learn., 2001, pp. 7481.
[10] Q. Song, J. Ni, and G. Wang, A fast clustering-based feature subset
selection algorithm for high-dimensional data, IEEE Trans. Knowl.
Data Eng., vol. 25, no. 1, pp. 114, Jan. 2013.
[11] H. Peng, F. Long, and C. Ding, Feature selection based on mutual
information criteria of max-dependency, max-relevance, and minredundancy, IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 8,
pp. 12261238, Aug. 2005.
[12] R.-S. Marko and K. Igor, Theoretical and empirical analysis of ReliefF
and RReliefF, Mach. Learn. J., vol. 53, nos. 12, pp. 2369, 2003.
[13] G. Guo and C. R. Dyer, Simultaneous feature selection and classifier
training via linear programming: a case study for face expression
recognition, in Proc. Conf. Comput. Vis. Pattern Recognit. (CVPR),
Jun. 2003, pp. 346352.
[14] P. Viola and M. Jones, Robust real-time face detection, Int. J. Comput.
Vis., vol. 57, no. 2, pp. 137154, May 2004.

3934

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 9, SEPTEMBER 2014

[15] A. Destrero, C. De Mol, F. Odone, and A. Verri, "A regularized framework for feature selection in face detection and authentication," Int. J. Comput. Vis., vol. 83, no. 2, pp. 164–177, Jun. 2009.
[16] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 4, pp. 627–639, Apr. 2007.
[17] A. Destrero, C. De Mol, F. Odone, and A. Verri, "A sparsity-enforcing method for learning face features," IEEE Trans. Image Process., vol. 18, no. 1, pp. 188–201, Jan. 2009.
[18] L. Wang, Z. Sun, and T. Tan, "Robust regularized feature selection for iris recognition via linear programming," in Proc. 21st Int. Conf. Pattern Recognit. (ICPR), Nov. 2012, pp. 3358–3361.
[19] (2014, May 23). CPLEX [Online]. Available: http://www.ibm.com
[20] (2014, May 23). LINDO [Online]. Available: http://www.lindo.com
[21] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[22] J. Friedman, T. Hastie, and R. Tibshirani, "Additive logistic regression: A statistical view of boosting," Ann. Statist., vol. 28, no. 2, pp. 337–407, 2000.
[23] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Commun. Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, Nov. 2004.
[24] Z. He, T. Tan, Z. Sun, and X. Qiu, "Toward accurate and fast iris segmentation for iris biometrics," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 9, pp. 1670–1684, Sep. 2009.
[25] (2014, May 23). CASIA Iris Image Database [Online]. Available: http://biometrics.idealtest.org
[26] J. G. Daugman, "High confidence visual recognition of persons by a test of statistical independence," IEEE Trans. Pattern Anal. Mach. Intell., vol. 15, no. 11, pp. 1148–1161, Nov. 1993.
[27] L. Ma, T. Tan, Y. Wang, and D. Zhang, "Efficient iris recognition by characterizing key local variations," IEEE Trans. Image Process., vol. 13, no. 6, pp. 739–750, Jun. 2004.
[28] A. Kumar, T.-S. Chan, and C.-W. Tan, "Human identification from at-a-distance face images using sparse representation of local iris features," in Proc. 5th IAPR Int. Conf. Biometrics (ICB), Mar./Apr. 2012, pp. 303–309.
[29] D. Zhang, W. Zuo, and F. Yue, "A comparative study of palmprint recognition algorithms," ACM Comput. Surveys, vol. 44, no. 1, p. 2, Jan. 2012.
[30] A. Kong, D. Zhang, and M. Kamel, "A survey of palmprint recognition," Pattern Recognit., vol. 42, no. 7, pp. 1408–1418, 2009.
[31] A. W.-K. Kong and D. Zhang, "Competitive coding scheme for palmprint verification," in Proc. 17th Int. Conf. Pattern Recognit. (ICPR), vol. 1, Aug. 2004, pp. 520–523.
[32] F. Yue, W. Zuo, D. Zhang, and K. Wang, "Orientation selection using modified FCM for competitive code-based palmprint recognition," Pattern Recognit., vol. 42, no. 11, pp. 2841–2849, 2009.
[33] W. Zuo, Z. Lin, Z. Guo, and D. Zhang, "The multiscale competitive code via sparse representation for palmprint verification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2010, pp. 2265–2272.
[34] W. Jia, D.-S. Huang, and D. Zhang, "Palmprint verification based on robust line orientation code," Pattern Recognit., vol. 41, no. 5, pp. 1504–1513, May 2008.
[35] (2014, May 23). PolyU Palmprint Database [Online]. Available: http://www.comp.polyu.edu.hk/biometrics/
[36] (2014, May 23). CASIA Palmprint Database [Online]. Available: http://biometrics.idealtest.org
[37] (2014, May 23). Appendix of this Paper [Online]. Available: http://www.cripac.ia.ac.cn/people/znsun/TIP-LPOM.pdf

Zhenan Sun is currently a Professor with the Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China. He received the B.E. degree in industrial automation from the Dalian University of Technology, Dalian, China, in 1999, the M.S. degree in system engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2002, and the Ph.D. degree in pattern recognition and intelligent systems from CASIA in 2006. Since 2006, he has been with NLPR as a Faculty Member. He is a member of the IEEE Computer Society and the IEEE Signal Processing Society. His current research interests include biometrics, pattern recognition, and computer vision, and he has authored or co-authored over 100 technical papers. He is an Associate Editor of the IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY and the IEEE BIOMETRICS COMPENDIUM.

Libin Wang is currently pursuing the Ph.D. degree with the Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. He received the B.Sc. degree in electronic engineering and information science from the University of Science and Technology of China, Hefei, China. His current research interests include biometrics and machine learning.

Tieniu Tan (F'04) is currently a Professor with the Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. He received the B.Sc. degree in electronic engineering from Xi'an Jiaotong University, Xi'an, China, in 1984, and the M.Sc. and Ph.D. degrees in electronic engineering from the Imperial College of Science, Technology and Medicine, London, U.K., in 1986 and 1989, respectively. His current research interests include biometrics, image and video understanding, information hiding, and information forensics. He is a fellow of the International Association for Pattern Recognition and the Chinese Academy of Sciences.
