
ABSTRACT

CONTENTS

1. Introduction
1.1 Motivation: Biometric Security Technology
1.2 Face Recognition

2. Analysis
2.1 Problem Statement
2.2 Literature Survey
2.2.1 Eigenface Method
2.2.2 Neural Network Approach
2.2.3 Stochastic Modeling
2.2.4 Geometrical feature matching
2.2.5 Template Matching
2.2.6 Graph Matching
2.2.7 N-tuple classifiers
2.2.8 Line Edge Map
2.3 Line Edge Map Method
2.3.1 Face Detection
2.3.2 Edge Detectors
2.3.3 Thinning Algorithm
2.3.4 Curve Fitting Algorithm
2.3.5 Hausdorff Distance Algorithm
2.4 Use Case Diagram

3. Design
3.1 Class Relationship Diagram
3.2 Class Diagram
3.3 Sequence Diagram

4. Implementation and Test Results


Conclusion

References

Appendix A User's Manual

Appendix B .NET Framework Setup

Appendix C SQL Server 7.0 Setup

CHAPTER 1
INTRODUCTION

1.1 Motivation: Biometric Security Technology

WHAT IS A BIOMETRIC?

The security field uses three different types of authentication:

• Something you know—a password, PIN, or piece of personal information (such as your
  mother's maiden name);
• Something you have—a card key, smart card, or token (like a Secure ID card); and/or
• Something you are—a biometric.

Of these, a biometric is the most secure and convenient authentication tool. It can't be
borrowed, stolen, or forgotten, and forging one is practically impossible. Biometrics uses an
individual's unique physical or behavioral characteristics to recognize or
authenticate identity. Common physical biometrics include fingerprints; hand or palm
geometry; and retina, iris, or facial characteristics. Behavioral characteristics include signature,
voice (which also has a physical component), keystroke pattern, and gait. Of these classes
of biometrics, technologies for signature and voice are the most developed.

Figure 1 describes the process involved in using a biometric system for security.
Figure 1: How a biometric system works.
(1) Capture the chosen biometric; (2) process the biometric, extract and enroll the
biometric template; (3) store the template in a local repository, or a central repository, or
a portable token such as a smart card; (4) live-scan the chosen biometric; (5) process the
biometric and extract the biometric template; (6) match the scanned biometric against
stored templates; (7) provide a matching score to business applications; (8) record a
secure audit trail of system use.
Fingerprints
Fingerprint recognition looks at the patterns found on a fingertip. There are different
approaches to fingerprint verification. Some emulate the traditional police method of
matching minutiae, others use straight pattern-matching devices, and still others rely on
more unusual techniques such as moiré fringe patterns and ultrasonics. Some verification
approaches can detect when a live finger is presented; others cannot.

Hand geometry
Hand geometry involves analyzing and measuring the shape of the hand. This biometric
offers a good balance of performance characteristics and is relatively easy to use. It is
suitable for locations with many users. It is also appropriate when users access the system
infrequently and/or are perhaps less disciplined in their approach to the system. Accuracy
can be very high if desired. Flexible performance tuning and configuration can
accommodate a wide range of applications.

Retina
A retina-based biometric involves analyzing the layer of blood vessels situated at the back
of the eye. An established technology, this technique involves using a low-intensity light
source through an optical coupler to scan the unique patterns of the retina. Retinal
scanning can be quite accurate but does require the user to look into a receptacle and
focus on a given point. This is not particularly convenient if you wear glasses or are
concerned about having close contact with the reading device.

Iris
An iris-based biometric, on the other hand, involves analyzing features found in the
colored ring of tissue that surrounds the pupil. Iris scanning, undoubtedly the less
intrusive of the eye-related biometrics, uses a fairly conventional camera element and
requires no close contact between the user and the reader. In addition, it has the potential
for higher-than-average template-matching performance. Iris biometrics work with glasses
in place, and iris scanning is one of the few biometrics that can work well in identification mode.

Face
Face recognition analyzes facial characteristics. It requires a digital camera to capture a
facial image of the user for authentication. The facial features are the most important
element in face recognition: the system extracts features from a face image and compares
them with those stored in the database for identification.

Signature
Signature verification analyzes the way a user signs his/her name. Signing features such
as speed, velocity, and pressure are as important as the finished signature's static shape.

Voice
Voice authentication is not based on voice recognition but on voice-to-print
authentication, where complex technology transforms voice into text. Voice biometrics
has the maximum potential for growth, as it requires no new hardware.

USES FOR BIOMETRICS

Security systems use biometrics for two basic purposes:


1. To verify a claimed identity, or
2. To identify the user.
Identification tends to be the more difficult of the two uses because a system must search
a database of enrolled users to find a match (a one-to-many search). The biometric that a
security system employs depends in part on
• What the system is protecting and
• What it is trying to protect against.

Physical access
For decades, many highly secure environments have used biometric technology for entry
access. Today, the primary application of biometrics is in physical security, i.e., controlling
access to secure locations (rooms or buildings). Unlike photo identification cards, which a
security guard must verify, biometrics permits unmanned access control. Biometrics is
useful for high-volume access control. For example, biometrics controlled access of
65,000 people during the 1996 Olympic Games, and Disney World uses a fingerprint
scanner to verify season-pass holders entering the theme park.

Virtual access
For a long time, biometric-based network and computer access were areas often discussed
but rarely implemented. Virtual access is the application that will provide the critical mass
to move biometrics for network and computer access from the realm of science-fiction
devices to regular system components.

Physical lock-downs can protect hardware, and passwords are currently the most popular
way to protect data on a network. Biometrics, however, can increase the ability to protect
data by implementing a more secure key than a password. Biometrics also allows a
hierarchical structure of data protection, making the data even more secure. Passwords supply
a minimal level of protection for network data; biometrics provides the next level. You can even
layer biometric technologies to enhance security further.

E-commerce applications
E-commerce developers are exploring the use of biometrics and smart cards to more
accurately verify a trading party's identity. Some are using biometrics to obtain secure
services over the telephone through voice authentication.

Covert surveillance
One of the more challenging research areas involves using biometrics for covert
surveillance. Using facial and body recognition technologies, researchers hope to use
biometrics to automatically identify known suspects entering buildings or traversing
crowded security areas such as airports. The use of biometrics for covert identification as
opposed to authentication must overcome technical challenges such as simultaneously
identifying multiple subjects in a crowd and working with uncooperative subjects. In
these situations, devices cannot count on consistency in pose, viewing angle, or distance
from the detector.

THE FUTURE OF BIOMETRICS

Although companies are using biometrics for authentication in a variety of situations, the
industry is still evolving and emerging. To both guide and support the growth of
biometrics, the Biometric Consortium formed in December 1995.

Standardization
The biometrics industry includes more than 150 separate hardware and software vendors,
each with their own proprietary interfaces, algorithms, and data structures. Standards are
emerging to provide a common software interface, to allow sharing of biometric
templates, and to permit effective comparison and evaluation of different biometric
technologies.

The BioAPI standard defines a common method for interfacing with a given biometric
application. BioAPI is an open-systems standard developed by a consortium of more than 60
vendors and government agencies. Written in C, it consists of a set of function calls to
perform basic actions common to all biometric technologies, such as

• Enroll user,
• Verify asserted identity (authentication), and
• Discover identity.

Another draft standard is the Common Biometric Exchange File Format, which defines a
common means of exchanging and storing templates collected from a variety of biometric
devices.

Biometric assurance, i.e., the confidence that a biometric device can achieve the intended
level of security, is another active research area. Current metrics for comparing biometric
technologies, such as the crossover error rate and the average enrollment time, are limited
because they lack a standard test bed on which to base their values. Several groups,
including the US Department of Defense's Biometrics Management Office, are
developing standard testing methodologies.

Hybrid technology uses


One of the more interesting uses of biometrics involves combining biometrics with smart
cards and Public-Key Infrastructure (PKI). A major problem with biometrics is how and
where to store the user's template. Because the template represents the user's personal
characteristics, its storage introduces privacy concerns. Furthermore, storing the template
in a centralized database leaves that template subject to attack and compromise. On the
other hand, storing the template on a smart card enhances individual privacy and
increases protection from attack, because individual users control their own templates.

PKI uses public and private-key cryptography for user identification and authentication. It
has some advantages over biometrics: It is mathematically more secure, and it can be
used across the Internet. The main drawback of PKI is the management of the user's
private key. To be secure, the private key must be protected from compromise; to be
useful, it must be portable. The solution to these problems is to store the
private key on a smart card and protect it with a biometric.

1.2 Face Recognition

With the advancement in computer and automated systems, one is seldom surprised to
find such systems applicable to many visual tasks in our daily activities. Automated
systems on production lines inspect goods for our consumption, and law-enforcement
agencies use computer systems to search databases of fingerprint records. Visual
surveillance of scenes, visual feedback for control, and similar tasks are all potential
applications for automated visual systems.

One area that has grown significantly in importance over the past decade is that of
computer face processing in visual scenes. Researchers attempt to teach the computer to
recognize and analyze human faces from images so as to produce an easy and convenient
platform for interaction between humans and computers. Law enforcement can be
improved by automatically recognizing criminals from a group of suspects. Security can
also be reinforced by verifying that the authorized person is physically present.
Moreover, human facial expressions can be analyzed to direct robot motion to perform
certain secondary, or even primary, tasks in our routine work requirements.

For more than a quarter of a century, research has been done in automatic face
recognition. Psychophysicists and neuroscientists have tried to understand why human
beings are able to recognize faces almost effortlessly. Engineers had, and still have, the
dream of face recognition performed fully automatically by computers, with efficiency
comparable to the human ability. This problem has not been solved yet, and scientists
still have a very long way to go to reach this goal.
Face recognition can basically be understood as a complex pattern recognition task.
Thus, most of the techniques that have been applied originate from the field of signal
processing and computer science research.

Probably because of the fast pace of face recognition research and the large number of
parallel approaches, there is no single textbook that can be recommended. However, some
survey articles are useful for getting acquainted with the subject.

Early attempts at face recognition dealt with the problem by describing features in the
image and comparing them with stored data. Several other approaches have applied
correlation with stored feature templates.

Automatic face recognition is a technique that can locate and identify faces automatically
in an image and determine “who is who” from a database. It is gaining more and more
attention in the area of computer vision, image processing and pattern recognition. There
are several important steps involved in recognizing face such as detection, representation
and identification. Based on different representations, various approaches can be grouped
into feature-based and image-based.

Usually each group of researchers uses its own database of manually normalized faces,
but the conditions in these databases are far from the kind of images expected in practice.
In a realistic situation the face may appear anywhere in the image, not necessarily in the
middle, and the background may be cluttered.
Many techniques for face recognition have been developed whose principles span several
disciplines, such as image processing, pattern recognition, computer vision, and neural
networks. The increasing interest in face recognition is mainly driven by application
demands, such as non-intrusive identification and verification for credit cards and
automatic teller machine transactions, non-intrusive access-control to buildings,
identification for law enforcement, etc. Machine recognition of faces yields problems that
belong to the following categories whose objectives are briefly outlined:
1. Face Recognition: Given a test face and a set of reference faces in a database
find the N most similar reference faces to the test face.
2. Face Authentication: Given a test face and a reference one, decide if the test
face is identical to the reference face.

Face recognition has been studied more extensively than face authentication. The two
problems are conceptually different. On one hand, a face recognition system usually
assists a human expert to determine the identity of a test face by computing all similarity
scores between the test face and each human face stored in the system database and by
ranking them. On the other hand, a face authentication system should decide itself if a test
face is assigned to a client (i.e., one who claims his/her own identity) or to an impostor
(i.e., one who pretends to be someone else).

Cognitive psychological studies indicated that human beings recognize line drawings as
quickly and almost as accurately as gray-level pictures. These results might imply that
edge images of objects could be used for object recognition and to achieve similar
accuracy as gray-level images. A novel concept, “faces can be recognized using line edge
map,” is proposed. A compact face feature, Line Edge Map (LEM), is extracted for face
coding and recognition.
The faces were encoded into binary edge maps using the Sobel edge detection algorithm. The
Hausdorff distance was chosen to measure the similarity of the two point sets, i.e., the
edge maps of two faces, because the Hausdorff distance can be calculated without an
explicit pairing of points in their respective data sets. A pre-filtering scheme (two-stage
identification) is used to speed up the searching using a 2D pre-filtering vector derived
from the face LEM.

A feasibility investigation and evaluation for face recognition based solely on face LEM
is conducted, which covers all the conditions of human face recognition, i.e., face
recognition under controlled/ideal condition, varying lighting condition, varying facial
expression, and varying pose.

Chapter 2
Analysis

2.1 Problem Statement

For more than a quarter of a century research has been done in automatic face
recognition. Psychophysicists and neuroscientists have attempted to understand why the
human being is able to handle the face recognition task nearly effortlessly. Engineers had,
and still have, the dream of face recognition performed fully automatically by computers,
with efficiency comparable to the human ability. This problem has not been solved yet, and
scientists still have a very long way to go to reach this goal.

There is increasing application demand, such as non-intrusive identification and
verification for credit cards and automatic teller machine transactions, non-intrusive
access control to buildings, identification for law enforcement, etc. A face recognition
system is very useful for meeting these demands.

Since we are using the LEM method, we will use certain algorithms to fulfill this task. First,
we find the important region of the face in the image. Then we detect the edges in that
region and thin them. The thinned edge map is converted into a line edge map, which is
stored in the database. When an input image arrives, it is processed through the same
steps and finally compared with the images stored in the database.

In short, our problem comprises five main modules (a sketch of this pipeline follows the
list):


- Face Detection
- Edge Detection
- Thinning
- Line Edge Map
- Face Comparison
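
The following is a deliberately crude, runnable sketch of this pipeline in Python (assuming
NumPy and SciPy are available); it illustrates the module flow only and is not the project
implementation. Face detection is reduced to a centre crop, the thinning and line-edge-map
stages (Sections 2.3.3 and 2.3.4) are omitted, and the comparison is a simple pixel-difference
score rather than the Hausdorff-type distance of Section 2.3.5. All function names are
placeholders chosen here.

    import numpy as np
    from scipy import ndimage

    def face_region(gray, fraction=0.8):
        """1. Face Detection (placeholder): keep the central part of the image."""
        h, w = gray.shape
        dh, dw = int(h * (1 - fraction) / 2), int(w * (1 - fraction) / 2)
        return gray[dh:h - dh, dw:w - dw]

    def edge_map(gray, threshold=0.25):
        """2. Edge Detection: Sobel gradient magnitude thresholded to a binary map."""
        gx = ndimage.sobel(gray.astype(float), axis=1)
        gy = ndimage.sobel(gray.astype(float), axis=0)
        magnitude = np.hypot(gx, gy)
        return magnitude > threshold * magnitude.max()

    def dissimilarity(map_a, map_b):
        """5. Face Comparison (placeholder): fraction of differing edge pixels."""
        return np.mean(map_a != map_b)

    def identify(probe_gray, database):
        """database: dict mapping person id -> stored edge map of the same shape."""
        probe_edges = edge_map(face_region(probe_gray))
        return min(database, key=lambda pid: dissimilarity(probe_edges, database[pid]))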

2.2 Literature Survey

Many techniques are available for human face recognition, but the major techniques apply
mostly to frontal faces. The main methods considered for face recognition are eigenface
(eigenfeature), neural network, dynamic link architecture, hidden Markov model,
geometrical feature matching, and template matching. The approaches are analyzed in
terms of the facial representations they use.

2.2.1 Eigenface Method

Eigenface is one of the most thoroughly investigated approaches to face recognition. It is
also known as the Karhunen-Loève expansion, eigenpicture, eigenvector, and principal
component method. Sirovich and Kirby [5] and Kirby et al. [6] used principal component
analysis to represent pictures of faces efficiently. They argued that any face image could be
approximately reconstructed from a small collection of weights for each face and a standard
face picture (eigenpicture). The weights describing each face are obtained by projecting
the face image onto the eigenpictures. Turk and Pentland [7], motivated by the technique of
Kirby and Sirovich, used eigenfaces for face detection and identification.

In mathematical terms, eigenfaces are the principal components of the distribution of
faces, or the eigenvectors of the covariance matrix of the set of face images. The
eigenvectors are ordered by the amount of variation they represent among the faces.
Each face can be represented exactly by a linear combination of the eigenfaces. It can
also be approximated using only the best eigenvectors, those with the largest eigenvalues.
The best M eigenfaces construct an M dimensional space, i.e., the face space. As the
images include a large quantity of background area, the results can be influenced by
background. Grudin [26] showed that the correlation between images of the whole faces
is not efficient for satisfactory recognition performance.
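
As a concrete illustration of the principal-component idea described above, the following
minimal Python sketch computes eigenfaces from a matrix of flattened training faces and
classifies a probe by nearest neighbour in the face space. The array layout and parameter
values are assumptions made for this example, not details from the cited works.

    import numpy as np

    def train_eigenfaces(faces, num_components=20):
        """faces: (n_images, h*w) array of flattened, equally sized grayscale faces."""
        mean_face = faces.mean(axis=0)
        centred = faces - mean_face
        # SVD of the centred data gives the eigenvectors of the covariance matrix
        # (the eigenfaces), ordered by decreasing variance.
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        eigenfaces = vt[:num_components]          # best M eigenfaces (the face space)
        weights = centred @ eigenfaces.T          # projection of each training face
        return mean_face, eigenfaces, weights

    def recognize(probe, mean_face, eigenfaces, weights, labels):
        """Project a probe face and return the label of the nearest training face."""
        w = (probe - mean_face) @ eigenfaces.T
        return labels[np.argmin(np.linalg.norm(weights - w, axis=1))]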

Illumination normalization [6] is usually necessary for the eigenface approach. Zhao and
Yang [32] proposed a new method to compute the covariance matrix using three images
each taken under a different lighting condition, to account for arbitrary illumination effects
if the object is Lambertian. Pentland et al. [8] extended their early work on eigenfaces to
eigenfeatures corresponding to face components, such as eyes, nose, and mouth. They
used a modular eigenspace which was composed of the above eigenfeatures (i.e.,
eigeneyes, eigennose, and eigenmouth). This method would be less sensitive to
appearance changes than the standard eigenface method. In summary, eigenface appears
as a fast, simple, and practical method. However, in general, it does not provide
invariance over changes in scale and lighting conditions.

2.2.2 Neural Network Approach


The attractiveness of neural networks lies in their non-linearity. Hence, the feature
extraction step may be more efficient than with the linear Karhunen-Loève methods. One of
the first artificial neural network (ANN) techniques
used for face recognition is a single layer adaptive network called WISARD which
contains a separate network for each stored individual [9]. The way of constructing a
neural network structure is crucial for successful recognition. It is very much dependent
on the intended application.
• For face detection, multilayer perceptron [10] and convolutional neural network
[11] have been applied.
• For face verification, a multi-resolution pyramid structure, the Cresceptron [12], has been used.

Lawrence et al. [11] proposed a hybrid neural network, which combined local image
sampling, a self-organizing map (SOM) neural network, and a convolutional neural
network. The SOM provides a quantization of the image samples into a topological space
where inputs that are nearby in the original space are also nearby in the output space,
thereby providing dimension reduction and invariance to minor changes in the image
sample. The convolutional network extracts successively larger features in a hierarchical
set of layers and provides partial invariance to translation, rotation, scale, and
deformation. The authors reported 96.2 percent correct recognition on ORL database of
400 images of 40 individuals. The classification time is less than 0.5 second, but the
training time is as long as 4 hours. Lin et al. [13] used probabilistic decision-based neural
network (PDBNN), which inherited the modular structure from its predecessor, a decision
based neural network (DBNN) [14].
The PDBNN can be applied effectively to
1) Face detector: This finds the location of a human face in a cluttered image,
2) Eye localizer: This determines the positions of both eyes in order to generate
meaningful feature vectors, and
3) Face recognizer: A hierarchical neural network structure with non-linear basis
functions and a competitive credit-assignment scheme.

A PDBNN-based biometric identification system has the merits of both neural networks and
statistical approaches, and its distributed computing principle is relatively easy to
implement on a parallel computer. In [13], it was reported that the PDBNN face recognizer
had the capability of recognizing up to 200 people and could achieve up to 96 percent
correct recognition in approximately 1 second. However, when the number of persons
increases, the computation becomes more demanding. In general, neural network
approaches encounter problems when the number of classes (i.e., individuals) increases.
Moreover, they are not suitable for a single-model-image recognition task, because
multiple model images per person are necessary to train the system to an optimal
parameter setting.

2.2.3 Stochastic Modeling

Stochastic modeling of non-stationary vector time series based on hidden Markov models
(HMM) has been very successful for speech applications. Samaria and Fallside [27]
applied this method to human face recognition. Faces were intuitively divided into
regions such as the eyes, nose, mouth, etc., which can be associated with the states of a
hidden Markov model. Since HMMs require a one-dimensional observation sequence and
images are two-dimensional, the images must be converted into either a 1D temporal
sequence or a 1D spatial sequence.

In [28], a spatial observation sequence was extracted from a face image by using a band
sampling technique. Each face image was represented by a 1D vector series of pixel
observation. Each observation vector is a block of L lines and there is an M lines overlap
between successive observations. An unknown test image is first sampled to an
observation sequence. Then, it is matched against every HMM in the model face
database (each HMM represents a different subject). The match with the highest
likelihood is considered the best match, and the relevant model reveals the identity of the
test face. The recognition rate of the HMM approach is 87 percent using the ORL database
consisting of 400 images of 40 individuals. Pseudo 2D HMM [28] was reported to
achieve a 95 percent recognition rate in their preliminary experiments. Its classification
time and training time were not given (believed to be very expensive). The choice of
parameters had been based on subjective intuition.
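
The band-sampling step described above (blocks of L lines with an M-line overlap) is simple
to express directly; the sketch below is an illustration with arbitrary parameter values, not
the configuration used in [28].

    import numpy as np

    def band_sample(image, block_lines=10, overlap_lines=7):
        """Turn a 2-D grayscale image into a 1-D sequence of observation vectors."""
        step = block_lines - overlap_lines
        rows = image.shape[0]
        sequence = []
        for top in range(0, rows - block_lines + 1, step):
            block = image[top:top + block_lines, :]
            sequence.append(block.flatten())   # one observation vector per band
        return np.array(sequence)              # shape: (num_observations, L * width)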

2.2.4 Geometrical feature matching

Geometrical feature matching techniques are based on the computation of a set of
geometrical features from the picture of a face. The fact that face recognition is possible
even at a coarse resolution of 8x6 pixels [17], where the individual facial features are
hardly revealed in detail, implies that the overall geometrical configuration of the face
features is sufficient for recognition. The overall configuration can be described by a
vector representing the position and size of the main facial features, such as the eyes and
eyebrows, nose, mouth, and the shape of the face outline.

One of the pioneering works on automated face recognition by using geometrical features
was done by Kanade [19] in 1973. His system achieved a peak performance of 75
percent recognition rate on a database of 20 people using two images per person, one as
the model and the other as the test image. Goldstein et al. [20] and Kaya and Kobayashi
[18] showed that a face recognition program provided with features extracted manually
could perform recognition apparently with satisfactory results. Brunelli and Poggio [21]
automatically extracted a set of geometrical features from the picture of a face, such as
nose width and length, mouth position, and chin shape. There were 35 features extracted
to form a 35 dimensional vector. The recognition was then performed with a Bayes
classifier. They reported a recognition rate of 90 percent on a database of 47 people.

Cox et al. [22] introduced a mixture-distance technique, which achieved a 95 percent
recognition rate on a query database of 685 individuals. Each face was represented by 30
manually extracted distances. Manjunath et al. [23] used Gabor wavelet decomposition to
detect feature points for each face image, which greatly reduced the storage requirement
for the database. Typically, 35-45 feature points per face were generated. The matching
process utilized the information presented in a topological graphic representation of the
feature points. After compensating for different centroid location, two cost values, the
topological cost, and similarity cost, were evaluated. The recognition accuracy in terms of
the best match to the right person was 86 percent and 94 percent of the correct person's
face was in the top three candidate matches.
In summary, geometrical feature matching based on precisely measured distances
between features may be most useful for finding possible matches in a large database
such as a mug shot album. However, it will be dependent on the accuracy of the feature
location algorithms. Current automated face feature location algorithms do not provide a
high degree of accuracy and require considerable computational time.

2.2.5 Template Matching

A simple version of template matching is that a test image, represented as a two-dimensional
array of intensity values, is compared using a suitable metric, such as the Euclidean
distance, with a single template representing the whole face. There are several
other more sophisticated versions of template matching on face recognition. One can use
more than one face template from different viewpoints to represent an individual's face. A
face from a single viewpoint can also be represented by a set of multiple distinctive
smaller templates [24], [21]. The face image of gray levels may also be properly
processed before matching [25]. In [21], Brunelli and Poggio automatically selected a set
of four features templates, i.e., the eyes, nose, mouth, and the whole face, for all of the
available faces. They compared the performance of their geometrical matching algorithm
and template matching algorithm on the same database of faces, which contains 188
images of 47 individuals.

The template matching was superior in recognition (100 percent recognition rate) to
geometrical matching (90 percent recognition rate) and was also simpler. Since the
principal components (also known as eigenfaces or eigenfeatures) are linear combinations
of the templates in the database, the technique cannot achieve better results than
correlation [21], but it may be less computationally expensive. One drawback of template
matching is its computational complexity. Another problem lies in the description of these
templates. Since the recognition system has to be tolerant to certain discrepancies
between the template and the test image, this tolerance might average out the differences
that make individual faces unique. In general, template-based approaches are considered
a more logical approach than feature matching.
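
For reference, the simplest whole-face template matching scheme described at the start of
this section can be written in a few lines; the sketch below assumes aligned, equally sized
grayscale images and is not the correlation-based setup of [21].

    import numpy as np

    def template_match(test_image, templates):
        """templates: dict mapping person id -> whole-face template of the same shape."""
        distances = {
            person: np.linalg.norm(test_image.astype(float) - tmpl.astype(float))
            for person, tmpl in templates.items()
        }
        return min(distances, key=distances.get)   # identity of the closest template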

2.2.6 Graph Matching

Graph matching is another approach to face recognition. Lades et al. [15] presented a
dynamic link structure for distortion invariant object recognition, which employed elastic
graph matching to find the closest stored graph. Dynamic link architecture is an extension
to classical artificial neural networks. Memorized objects are represented by sparse
graphs, whose vertices are labeled with a multi-resolution description in terms of a local
power spectrum and whose edges are labeled with geometrical distance vectors. Object
recognition can be formulated as elastic graph matching which is performed by stochastic
optimization of a matching cost function.

Wiskott and von der Malsburg [16] extended the technique and matched human faces
against a gallery of 112 neutral frontal view faces. Probe images were distorted due to
rotation in depth and changing facial expression. Encouraging results on faces with large
rotation angles were obtained. They reported recognition rates of 86.5 percent and 66.4
percent for the matching tests of 111 faces of 15 degree rotation and 110 faces of 30
degree rotation to a gallery of 112 neutral frontal views. In general, dynamic link
architecture is superior to other face recognition techniques in terms of rotation invariant;
however, the matching process is computationally expensive.

2.2.7 N-tuple classifiers

Conventional n-tuple systems have the desirable features of super-fast single-pass
training, super-fast recognition, conceptual simplicity, straightforward hardware and
software implementations, and accuracy that is often competitive with other more
complex, slower methods. Due to their attractive features, n-tuple methods have been the
subject of much research. In conventional n-tuple based image recognition systems, the
locations specified by each n-tuple are used to identify an address in a look-up-table. The
contents of this address either use a single bit to indicate whether or not this address was
accessed during training, or store a count of how many times that address occurred.

While the traditional n-tuple classifier deals with binary-valued input vectors, methods
using n-tuple systems with integer-valued inputs have also been developed. Allinson and
Kolcz [3] have developed a method of mapping scalar attributes into bit strings based on
a combination of CMAC and Gray coding methods. This method has the property that for
small differences in the arithmetic values of the attributes, the Hamming distance between
the bit strings is equal to the arithmetic difference. For larger values of the arithmetic
distance, the Hamming distance is guaranteed to be above a certain threshold.

The continuous n-tuple method also shares some similarity at the architectural level with
the single layer look-up perceptron of Tattersall et al [32], though they differ in the way
the class outputs are calculated, and in the training methods used to configure the contents
of the look-up tables (RAMS).

In summary, no existing technique is free from limitations. Further efforts are required to
improve the performances of face recognition techniques, especially in the wide range of
environments encountered in the real world.

2.2.8 Line Edge Map

Cognitive psychological studies indicated that human beings recognize line drawings as
quickly and almost as accurately as gray-level pictures. These results might imply that
edge images of objects could be used for object recognition and to achieve similar
accuracy as gray-level images. The faces were encoded into binary edge maps using the
Sobel edge detection algorithm. The Hausdorff distance was chosen to measure the
similarity of the two point sets, i.e., the edge maps of two faces, because the Hausdorff
distance can be calculated without an explicit pairing of points in their respective data
sets. The modified Hausdorff distance in the formulation of

h(A, B) = (1/N_A) Σ_{a ∈ A} min_{b ∈ B} ||a − b||

was used, as it is less sensitive to noise than the maximum or kth ranked Hausdorff
distance formulations. Takacs argued that the process of face recognition might start at a
much earlier stage and edge images can be used for the recognition of faces without the
involvement of high-level cognitive functions. However, the Hausdorff distance uses only
the spatial information of an edge map without considering the inherent local structural
characteristics inside such a map. A successful object recognition approach might need
to combine aspects of feature-based approaches with template matching methods. A Line
Edge Map (LEM) approach extracts lines from a face edge map as features. This
approach can be considered as a combination of template matching and geometrical
feature matching. The LEM approach not only possesses the advantages of feature-based
approaches, such as invariant to illumination and low memory requirement, but also has
the advantage of high recognition performance of template matching. The above three
reasons together with the fact that edges are relatively insensitive to illumination changes
motivated this research.
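
The modified Hausdorff distance can be written directly from the definition given above;
the following sketch operates on arrays of edge-pixel coordinates and shows the distance
only, not the full LEM/LHD matcher of Section 2.3.

    import numpy as np

    def directed_mhd(a, b):
        """h(A, B) = (1/N_A) * sum over a in A of min over b in B of ||a - b||."""
        dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # pairwise distances
        return dists.min(axis=1).mean()

    def modified_hausdorff(a, b):
        """Symmetric form commonly used for matching two edge maps A and B."""
        return max(directed_mhd(a, b), directed_mhd(b, a))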

2.3 Line Edge Map Method

A novel face feature representation, Line Edge Map (LEM), is proposed here to integrate
the structural information with spatial information of a face image by grouping pixels of
face edge map to line segments. After thinning the edge map, a polygonal line fitting
process is applied to generate the LEM of a face. The LEM representation, which records
only the end points of line segments on curves, further reduces the storage requirement.
Efficient coding of faces is a very important aspect in a face recognition system. LEM is
also expected to be less sensitive to illumination changes due to the fact that it is an
intermediate-level image representation derived from low level edge map representation.
The basic unit of LEM is the line segment grouped from pixels of edge map.

In this study, we explore the information of LEM and investigate the feasibility and
efficiency of human face recognition using LEM. A Line Segment Hausdorff Distance
(LHD) measure is then proposed to match LEMs of faces. LHD has better distinctive
power because it can make use of the additional structural attributes of line orientation,
line-point association, and number disparity in the LEM; that is, two lines with a large
orientation difference are discouraged from matching, and all the points on one line are
constrained to match points on a single corresponding line.

2.3.1 Face Detection

The original algorithm is based on mosaic images of reduced resolution that attempt to
capture the macroscopic features of the human face. It is assumed that there is a
resolution level where the main part of the face occupies an area of about 4x4 cells.
Accordingly, a mosaic image can be created for this resolution level. It is the so called
quartet image. The grey level of each cell equals the average value of the grey levels of
all pixels included in the cell. An abstract model for the face at the resolution level of the
quartet image is depicted in Fig. 1.

The main part of the face corresponds to the region of 4x4 cells having an origin cell
marked by “X”. By subdividing each quartet image cell to 2x2 cells of half dimensions
the octet image results, where the main facial features such as the eyebrows/eyes, the
nostrils/nose and the mouth are detected. Therefore, a hierarchical knowledge-based
system can be designed that aims at detecting facial candidates by establishing rules
applied to the quartet image and subsequently at validating the choice of a facial
candidate by establishing rules applied to the octet image for detecting the key facial
features.

Fig. 1

As can be seen, the underlying idea is very simple and very attractive, because it is close
to our intuition for the human face. However, the implementation is computationally
intensive. The algorithm is applied iteratively for the entire range of possible cell
dimensions in order to determine the best cell dimensions for creating the quartet image
for each person. Another limitation is that only square cells are employed. In order to
avoid the iterative nature of the original method, we estimate the cell dimensions in the
quartet image by processing the horizontal and the vertical profile of the image. Let us
denote by n and m the vertical and the horizontal quartet cell dimensions, respectively.

The horizontal profile of the image is obtained by averaging all pixel intensities in each
image column. By detecting abrupt transitions in the horizontal profile, the significant
local minima are determined; these correspond to the left and right sides of the head.
Accordingly, the quartet cell dimension in the horizontal direction can easily be estimated.

Similarly, the vertical profile of the image is obtained by averaging all pixel intensities in
each image row. The significant local minima in the vertical profile correspond to the
hair, eyebrows, eyes, mouth and chin. It is fairly easy to locate the row where the
eyebrows/eyes appear in the image by detecting the local minimum after the first abrupt
transition in the vertical profile. Searching for the row should be detected. It corresponds
to a significant maximum that occurs below the eyes. Then, the steepest minimum below
the nose tip is associated to the upper lip. By setting the distance between the rows where
the eyes and the upper lip have been found to 2n, the quartet cell dimension in the vertical
direction can be estimated. It is evident that the proposed preprocessing step also
overcomes the drawback of square cells, because the cell dimensions are adapted to each
person separately.
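
A rough sketch of this profile-based estimation is given below. The smoothing width and
the way significant minima are selected are illustrative assumptions, and the final
cell-size formulas are placeholders rather than the exact rules of the method.

    import numpy as np
    from scipy.ndimage import uniform_filter1d
    from scipy.signal import argrelextrema

    def profiles(gray):
        """Horizontal profile: mean of each column; vertical profile: mean of each row."""
        return gray.mean(axis=0), gray.mean(axis=1)

    def significant_minima(profile, smooth=9):
        """Smooth the 1-D profile and return the indices of its local minima."""
        smoothed = uniform_filter1d(profile.astype(float), size=smooth)
        return argrelextrema(smoothed, np.less)[0]

    def estimate_cell_sizes(gray):
        horizontal, vertical = profiles(gray)
        h_minima = significant_minima(horizontal)   # left and right sides of the head
        v_minima = significant_minima(vertical)     # hair, eyebrows/eyes, mouth, chin
        # Placeholder rules: spread the detected spans over the 4 cells of the quartet
        # model (the original method instead sets the eye-to-upper-lip distance to 2n).
        m = max(1, (h_minima[-1] - h_minima[0]) // 4) if len(h_minima) >= 2 else gray.shape[1] // 4
        n = max(1, (v_minima[-1] - v_minima[0]) // 4) if len(v_minima) >= 2 else gray.shape[0] // 4
        return n, m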
Having estimated the quartet cell dimensions, we can describe the facial candidate
detection rules. Since the system remains hierarchical, it is preferable to decide that a face
exists in a scene when there is no actual face than to fail to detect a face that does exist.
The decision whether or not a region of 4x4 cells is a facial candidate is based on:
• The detection of a homogenous region of 2x2 cells in the middle of the model that
is shown in light grey color in fig 1 above.
• The detection of homogeneous connected components having significant length in
the π-shaped region shown in black color in fig 1, or,
• The detection of a beard region shown in dark gray color in fig 1.

Moreover, a significant difference in the average cell intensity between the central 2x2
region and the π-shaped region must be detected. For the sake of completeness, we note
that if there aren’t adequate cells in the vertical direction, the π-shaped region may have a
total length of 12 cells instead of 14 cells. We have found that the above-described rules
are quite successful in detecting facial candidates. Subsequently, eyebrows/eyes,
nostrils/nose and mouth detection rules are developed to validate the facial candidates
determined by the procedure outlined above.

2.3.2 Edge Detectors

The operators described here are those whose purpose is to identify meaningful image
features on the basis of distributions of pixel grey levels. The two categories of operators
included here are:
1. Edge Pixel Detectors - that assign a value to a pixel in proportion to the likelihood that
the pixel is part of an image edge (i.e. a pixel which is on the boundary between
two regions of different intensity values).
2. Line Pixel Detectors - that assign a value to a pixel in proportion to the likelihood that
the pixel is part of an image line (i.e. a dark narrow region bounded on both sides
by lighter regions, or vice-versa).

Detectors for other features can be defined, such as circular arc detectors in intensity
images (or even more general detectors, as in the generalized Hough transform), or planar
point detectors in range images, etc.

Note that the operators merely identify pixels likely to be part of such a structure. To
actually extract the structure from the image it is then necessary to group together image
pixels (which are usually adjacent).

1) Roberts Cross Edge Detector

The Roberts Cross operator performs a simple, quick to compute, 2-D spatial gradient
measurement on an image. It thus highlights regions of high spatial gradient, which often
correspond to edges. In its most common usage, the input to the operator is a greyscale
image, as is the output. Pixel values at each point in the output represent the estimated
absolute magnitude of the spatial gradient of the input image at that point.
How It Works
In theory, the operator consists of a pair of 2×2 convolution masks as shown in Figure 1.
One mask is simply the other rotated by 90°. This is very similar to the Sobel operator.

     Gx              Gy
    +1  0            0 +1
     0 -1           -1  0

Figure 1: Roberts Cross convolution masks

These masks are designed to respond maximally to edges running at 45° to the pixel grid,
one mask for each of the two perpendicular orientations. The masks can be applied
separately to the input image, to produce separate measurements of the gradient
component in each orientation (call these Gx and Gy). These can then be combined
together to find the absolute magnitude of the gradient at each point and the orientation of
that gradient.

The gradient magnitude is given by:


|G| = sqrt(Gx^2 + Gy^2)

Although typically, an approximate magnitude is computed using:


|G| = |Gx| + |Gy|
which is much faster to compute.

The angle of orientation of the edge giving rise to the spatial gradient (relative to the pixel
grid orientation) is given by:
θ = arctan(Gy / Gx) − 3π / 4

In this case, orientation 0 is taken to mean that the direction of maximum contrast from
black to white runs from left to right on the image, and other angles are measured anti-
clockwise from this.

Often, the absolute magnitude is the only output the user sees --- the two components of
the gradient are conveniently computed and added in a single pass over the input image
using the pseudo-convolution operator shown in Figure 2.

P1 P2

P3 P4

Figure 2: Pseudo-convolution mask used to quickly compute the approximate gradient magnitude
Using this mask the approximate magnitude is given by:

|G| = |P1 − P4| + |P2 − P3|

The main reason for using the Roberts cross operator is that it is very quick to compute.
Only four input pixels need to be examined to determine the value of each output pixel,
and only subtractions and additions are used in the calculation. In addition there are no
parameters to set. Its main disadvantages are that since it uses such a small mask, it is
very sensitive to noise. It also produces very weak responses to genuine edges unless they
are very sharp. The Sobel operator performs much better in this respect.
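
The two masks of Figure 1 translate directly into code; the following is a small
illustrative Python sketch using SciPy's convolution and the approximate magnitude
|Gx| + |Gy| given above.

    import numpy as np
    from scipy.ndimage import convolve

    def roberts_cross(gray):
        gx_mask = np.array([[1.0, 0.0],
                            [0.0, -1.0]])
        gy_mask = np.array([[0.0, 1.0],
                            [-1.0, 0.0]])
        gx = convolve(gray.astype(float), gx_mask)
        gy = convolve(gray.astype(float), gy_mask)
        return np.abs(gx) + np.abs(gy)     # approximate gradient magnitude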

2) Sobel Edge Detector

The Sobel operator performs a 2-D spatial gradient measurement on an image and so
emphasizes regions of high spatial gradient that correspond to edges. Typically it is used
to find the approximate absolute gradient magnitude at each point in an input greyscale
image.

How It Works
In theory at least, the operator consists of a pair of 3×3 convolution masks as shown in
Figure 1. One mask is simply the other rotated by 90°. This is very similar to the Roberts
Cross operator.

        Gx                    Gy
    -1   0  +1           +1  +2  +1
    -2   0  +2            0   0   0
    -1   0  +1           -1  -2  -1

Figure 1: Sobel convolution masks

These masks are designed to respond maximally to edges running vertically and
horizontally relative to the pixel grid, one mask for each of the two perpendicular
orientations. The masks can be applied separately to the input image, to produce separate
measurements of the gradient component in each orientation (call these Gx and Gy).
These can then be combined together to find the absolute magnitude of the gradient at
each point and the orientation of that gradient. The gradient magnitude is given by:

|G| = sqrt(Gx^2 + Gy^2)

Although typically, an approximate magnitude is computed using:

|G| = |Gx| + |Gy|
which is much faster to compute.
The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial
gradient is given by:
θ = arctan(Gy / Gx) − 3π / 4
In this case, orientation 0 is taken to mean that the direction of maximum contrast from
black to white runs from left to right on the image, and other angles are measured anti-
clockwise from this.

Often, this absolute magnitude is the only output the user sees --- the two components of
the gradient are conveniently computed and added in a single pass over the input image
using the pseudo-convolution operator shown in Figure 2.

P1 P2 P3

P4 P5 P6

P7 P8 P9

Figure 2: Pseudo-convolution mask used to quickly compute the approximate gradient magnitude

Using this mask the approximate magnitude is given by:

|G| = |(P1 + 2×P2 + P3) − (P7 + 2×P8 + P9)| + |(P3 + 2×P6 + P9) − (P1 + 2×P4 + P7)|

The Sobel operator is slower to compute than the Roberts Cross operator, but its larger
convolution mask smoothes the input image to a greater extent and so makes the operator
less sensitive to noise. The operator also generally produces considerably higher output
values for similar edges compared with the Roberts Cross.

As with the Roberts Cross operator, output values from the operator can easily overflow
the maximum allowed pixel value for image types that only support smallish integer pixel
values (e.g. 8-bit integer images). When this happens the standard practice is to simply set
overflowing output pixels to the maximum allowed value. The problem can be avoided by
using an image type that supports pixel values with a larger range.

Natural edges in images often lead to lines in the output image that are several pixels
wide due to the smoothing effect of the Sobel operator. Some thinning may be desirable
to counter this. Failing that, some sort of hysteresis ridge tracking could be used as in the
Canny operator.
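
For illustration, the Sobel masks of Figure 1 can be applied with a generic convolution
routine; the sketch below returns both the gradient magnitude and the orientation, and any
clipping for small integer image types is left to the caller.

    import numpy as np
    from scipy.ndimage import convolve

    def sobel(gray):
        gx_mask = np.array([[-1.0, 0.0, 1.0],
                            [-2.0, 0.0, 2.0],
                            [-1.0, 0.0, 1.0]])
        gy_mask = np.array([[ 1.0,  2.0,  1.0],
                            [ 0.0,  0.0,  0.0],
                            [-1.0, -2.0, -1.0]])
        gx = convolve(gray.astype(float), gx_mask)
        gy = convolve(gray.astype(float), gy_mask)
        magnitude = np.hypot(gx, gy)         # |G| = sqrt(Gx^2 + Gy^2)
        orientation = np.arctan2(gy, gx)     # edge orientation in radians
        return magnitude, orientation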

3) Canny Edge Detector

The Canny operator was designed to be an optimal edge detector (according to particular
criteria --- there are other detectors around that also claim to be optimal with respect to
slightly different criteria). It takes as input a grey scale image, and produces as output an
image showing the positions of tracked intensity discontinuities.

How It Works
The Canny operator works in a multi-stage process. First of all the image is smoothed by
Gaussian convolution. Then a simple 2-D first derivative operator (somewhat like the
Roberts Cross) is applied to the smoothed image to highlight regions of the image with
high first spatial derivatives. Edges give rise to ridges in the gradient magnitude image.
The algorithm then tracks along the top of these ridges and sets to zero all pixels that are
not actually on the ridge top so as to give a thin line in the output, a process known as
non-maximal suppression. The tracking process exhibits hysteresis controlled by two
thresholds: T1 and T2 with T1 > T2. Tracking can only begin at a point on a ridge higher
than T1. Tracking then continues in both directions out from that point until the height of
the ridge falls below T2. This hysteresis helps to ensure that noisy edges are not broken
up into multiple edge fragments.

The effect of the Canny operator is determined by three parameters --- the width of the
Gaussian mask used in the smoothing phase, and the upper and lower thresholds used by
the tracker. Increasing the width of the Gaussian mask reduces the detector's sensitivity to
noise, at the expense of losing some of the finer detail in the image. The localization error
in the detected edges also increases slightly as the Gaussian width is increased.

Usually, the upper tracking threshold can be set quite high, and the lower threshold quite
low for good results. Setting the lower threshold too high will cause noisy edges to break
up. Setting the upper threshold too low increases the number of spurious and undesirable
edge fragments appearing in the output.

One problem with the basic Canny operator is to do with Y-junctions i.e. places where
three ridges meet in the gradient magnitude image. Such junctions can occur where an
edge is partially occluded by another object. The tracker will treat two of the ridges as a
single line segment, and the third one as a line that approaches, but doesn't quite connect
to, that line segment.
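
The three parameters discussed above (the Gaussian width and the two hysteresis
thresholds) map directly onto existing implementations; below is a hedged usage example
with scikit-image, where the parameter values are illustrative only.

    from skimage.feature import canny

    def canny_edges(gray):
        # sigma: width of the Gaussian smoothing mask;
        # high_threshold / low_threshold: hysteresis thresholds T1 > T2.
        return canny(gray.astype(float),
                     sigma=2.0,
                     low_threshold=0.05,
                     high_threshold=0.20)   # returns a binary edge map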

4) Compass Edge Detector

Compass edge detection is an alternative approach to differential gradient edge detection.
The operation usually outputs two images, one estimating the local edge gradient
magnitude and one estimating the edge orientation of the input image.

How It Works
When using compass edge detection the image is convolved with a set of (in general 8)
convolution masks, each of which is sensitive to edges in a different orientation. For each
pixel the local edge gradient magnitude is estimated with the maximum response of all 8
masks at this pixel location:
|G| = max(|Gi| : i = 1 to n)

where Gi is the response of the mask i at the particular pixel position and n is the number
of convolution masks. The local edge orientation is estimated as the orientation of the
mask that yields the maximum response.
Various masks can be used for this operation; for the following discussion we will use the
Prewitt mask. Two templates out of the set of 8 are shown in Figure 1:

        0°                    45°
    -1  +1  +1           +1  +1  +1
    -1  -2  +1           -1  -2  +1
    -1  +1  +1           -1  -1  +1

Figure 1: Prewitt compass edge detecting templates sensitive to 0° and 45°.

The whole set of 8 masks is produced by taking one of the masks and rotating its
coefficients circularly. Each of the resulting masks is sensitive to another edge orientation
ranging from 0° to 315° in steps of 45°, where 0° corresponds to a vertical edge.

The maximum response |G| for each pixel gives rise to the value of the corresponding
pixel in the output magnitude image. The values for the output orientation image lie
between 1 and 8, depending on which of the 8 masks produced the maximum response.
This edge detection method is also called edge template matching, because a set of edge
templates is matched to the image, each representing an edge in a certain orientation. The
edge magnitude and orientation of a pixel is then determined by the template, which
matches the local area of the pixel the best.

The compass edge detector is an appropriate way to estimate the magnitude and
orientation of an edge. Whereas differential gradient edge detection needs a rather time-
consuming calculation to estimate the orientation from the magnitudes in x- and y-
direction, the compass edge detection obtains the orientation directly from the mask with
the maximum response. The compass operator is limited to (here) 8 possible orientations;
however, experience shows that most direct orientation estimates are not much more
accurate.

On the other hand, the compass operator needs (here) 8 convolutions for each pixel,
whereas the gradient operator needs only 2, one mask being sensitive to edges in the
vertical direction and one to the horizontal direction. The result for the edge magnitude
image is very similar with both methods, provided the same convolving mask is used.
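
A sketch of this procedure with the Prewitt template of Figure 1 is shown below. The set of
eight masks is generated by circularly shifting the eight outer coefficients; the direction
of the shift is an assumption here, since either direction yields the same set of
orientations.

    import numpy as np
    from scipy.ndimage import convolve

    RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

    def rotate_45(mask):
        """Shift the eight outer coefficients of a 3x3 mask one position around the ring."""
        values = [mask[r, c] for r, c in RING]
        rotated = mask.copy()
        for (r, c), v in zip(RING, values[1:] + values[:1]):
            rotated[r, c] = v
        return rotated

    def compass_prewitt(gray):
        mask = np.array([[-1.0,  1.0, 1.0],
                         [-1.0, -2.0, 1.0],
                         [-1.0,  1.0, 1.0]])      # 0-degree Prewitt template
        masks = []
        for _ in range(8):
            masks.append(mask)
            mask = rotate_45(mask)
        responses = np.abs(np.stack([convolve(gray.astype(float), m) for m in masks]))
        magnitude = responses.max(axis=0)          # |G| = max over i of |Gi|
        orientation = responses.argmax(axis=0) + 1 # winning mask index, 1..8
        return magnitude, orientation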

Common Variants

As already mentioned earlier, there are various masks, which can be used for Compass
Edge Detection. The most common ones are shown in Figure 2:
                  0°                   45°

    Sobel:      -1   0  +1           0   1   2
                -2   0  +2          -1   0   1
                -1   0  +1          -2  -1   0

    Kirsch:     -3  -3  +5          -3  +5  +5
                -3   0  +5          -3   0  +5
                -3  -3  +5          -3  -3  -3

    Robinson:   -1   0  +1           0  +1  +1
                -1   0  +1          -1   0  +1
                -1   0  +1          -1  -1   0

Figure 2: Some examples of the most common compass edge detecting
masks, each example showing two masks out of the set of eight.
For every template, the set of all eight masks is obtained by shifting the coefficients of the
mask circularly. The results obtained with the different templates are similar; the main
difference is the different scale of the magnitude image. The advantage of the Sobel and
Robinson masks is that only 4 of the 8 magnitude values must be calculated: since each
pair of masks rotated 180° apart is antisymmetric, each of the remaining four values can
be generated by negating the result of the opposite mask.

5) Zero Crossing Detector

The zero crossing detector looks for places in the Laplacian of an image where the value
of the Laplacian passes through zero --- i.e. points where the Laplacian changes sign.
Such points often occur at `edges' in images --- i.e. points where the intensity of the
image changes rapidly, but they also occur at places that are not as easy to associate with
edges. It is best to think of the zero crossing detector as some sort of feature detector
rather than as a specific edge detector. Zero crossings always lie on closed contours and
so the output from the zero crossing detector is usually a binary image with single pixel
thickness lines showing the positions of the zero crossing points.

The starting point for the zero crossing detector is an image, which has been filtered using
the Laplacian of Gaussian filter. The zero crossings that result are strongly influenced by
the size of the Gaussian used for the smoothing stage of this operator. As the smoothing is
increased then fewer and fewer zero crossing contours will be found, and those that do
remain will correspond to features of larger and larger scale in the image.

How It Works
The core of the zero crossing detector is the Laplacian of Gaussian filter and so
knowledge of that operator is assumed here. As described there, `edges' in images give
rise to zero crossings in the LoG output. For instance, Figure 1 shows the response of a
1-D LoG filter to a step edge in the image.

However, zero crossings also occur at any place where the image intensity gradient starts
increasing or starts decreasing, and this may happen at places that are not obviously
edges. Often zero crossings are found in regions of very low gradient where the intensity
gradient wobbles up and down around zero.

Once the image has been LoG filtered, it only remains to detect the zero crossings. This
can be done in several ways.

The simplest is to simply threshold the LoG output at zero, to produce a binary image
where the boundaries between foreground and background regions represent the locations
of zero crossing points. These boundaries can then be easily detected and marked in
single pass, e.g. using some morphological operator. For instance, to locate all boundary
points, we simply have to mark each foreground point that has at least one background
neighbor.

The problem with this technique is that it will tend to bias the location of the zero crossing
edge to either the light side of the edge, or the dark side of the edge, depending upon
whether it is decided to look for the edges of foreground regions or for the edges of
background regions.
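
The simple thresholding scheme just described can be sketched as follows; the Gaussian
standard deviation is an illustrative value, and the neighbour test is implemented with a
minimum filter.

    import numpy as np
    from scipy.ndimage import gaussian_laplace, minimum_filter

    def zero_crossings(gray, sigma=3.0):
        log = gaussian_laplace(gray.astype(float), sigma=sigma)   # LoG filtering
        foreground = log > 0
        # A foreground pixel with at least one background pixel in its 3x3
        # neighbourhood lies next to a sign change, i.e. a zero crossing.
        has_background_neighbour = minimum_filter(foreground.astype(np.uint8), size=3) == 0
        return foreground & has_background_neighbour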

Figure 1 Response of 1-D LoG filter to a step edge. The left hand graph
shows a 1-D image, 200 pixels long, containing a step edge. The right
hand graph shows the response of a 1-D LoG filter with Gaussian standard
deviation 3 pixels.

A better technique is to consider points on both sides of the threshold boundary, and
choose the one with the lowest absolute magnitude of the Laplacian, which will hopefully
be closest to the zero crossing.
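As an illustrative sketch (not the project's code), the following C# routine combines the two ideas above: it scans a LoG-filtered image, and wherever a non-negative pixel has a negative 4-connected neighbour it marks whichever of the two pixels has the smaller absolute Laplacian value.

using System;

public static class ZeroCrossing
{
    // log: LoG-filtered image as a 2-D array of doubles.
    // Returns a binary map (1 = zero-crossing pixel, 0 = otherwise).
    public static int[,] Detect(double[,] log)
    {
        int rows = log.GetLength(0), cols = log.GetLength(1);
        int[] dr = { -1, 1, 0, 0 };
        int[] dc = { 0, 0, -1, 1 };
        var mark = new int[rows, cols];

        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
            {
                if (log[r, c] < 0) continue;                  // treat LoG >= 0 as "foreground"
                for (int k = 0; k < 4; k++)
                {
                    int nr = r + dr[k], nc = c + dc[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    if (log[nr, nc] >= 0) continue;           // no sign change across this pair

                    // mark whichever side of the boundary is closer to zero
                    if (Math.Abs(log[r, c]) <= Math.Abs(log[nr, nc]))
                        mark[r, c] = 1;
                    else
                        mark[nr, nc] = 1;
                }
            }
        return mark;
    }
}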

Since the zero crossings generally fall between two pixels in the LoG filtered image, an
alternative output representation is an image grid that is spatially shifted half a pixel
across and half a pixel down relative to the original image. Such a representation is
known as a dual lattice. This does not, of course, actually localize the zero crossing any more
accurately. A more accurate approach is to perform some kind of interpolation
to estimate the position of the zero crossing to sub-pixel precision.
The behavior of the LoG zero crossing edge detector is largely governed by the standard
deviation of the Gaussian used in the LoG filter. The higher this value is set, the more
small-scale features are smoothed out of existence, and hence the fewer zero crossings are
produced. This parameter can therefore be set to remove unwanted detail or noise as desired.
The idea that different-sized features become prominent at different smoothing levels is
referred to as `scale'.

6) Line detection

While edges (i.e. boundaries between regions with relatively distinct greylevels) are by
far the most common type of discontinuity in an image, instances of thin lines in an image
occur frequently enough that it is useful to have a separate mechanism for detecting them.
Here we present a convolution based technique, which produces a gradient image
description of the thin lines in an input image. Note that the Hough transform can be used
to detect lines; however, in that case, the output is a parametric description of the lines in
an image.

How It Works
The line detection operator consists of a convolution mask tuned to detect the presence of
lines of a particular width n, at a particular orientation θ. Figure 1 shows a collection of
four such masks, which each respond to lines of single pixel width at the particular
orientation shown.

        (a) horizontal        (b) vertical          (c) +45 degrees       (d) -45 degrees
        -1 -1 -1              -1 +2 -1              -1 -1 +2              +2 -1 -1
        +2 +2 +2              -1 +2 -1              -1 +2 -1              -1 +2 -1
        -1 -1 -1              -1 +2 -1              +2 -1 -1              -1 -1 +2

Figure 1 Four line detection masks which respond maximally to horizontal, vertical, and
oblique (+45 and -45 degree) single pixel wide lines.

If Ri denotes the response of mask i, we can apply each of these masks across an image
and, for any particular point, if |Ri| > |Rj| for all j ≠ i, that point is more likely to contain a
line whose orientation (and width) corresponds to that of mask i. One usually thresholds
Ri to eliminate weak lines corresponding to edges and other features with intensity
gradients of a different scale than the desired line width. In order to find
complete lines, one must join line fragments together, e.g. with an edge tracking
operator.
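To make the mask-response step concrete, the following C# sketch (illustrative only, not the project's EdgeDetector code) convolves an image with the four masks of Figure 1, keeps the strongest absolute response |Ri| at each pixel, and thresholds it.

using System;

public static class LineDetector
{
    static readonly int[][,] Masks =
    {
        new[,] { { -1, -1, -1 }, {  2,  2,  2 }, { -1, -1, -1 } },  // horizontal
        new[,] { { -1,  2, -1 }, { -1,  2, -1 }, { -1,  2, -1 } },  // vertical
        new[,] { { -1, -1,  2 }, { -1,  2, -1 }, {  2, -1, -1 } },  // +45 degrees
        new[,] { {  2, -1, -1 }, { -1,  2, -1 }, { -1, -1,  2 } },  // -45 degrees
    };

    // img: grey-level image as a 2-D array; threshold: minimum |Ri| to accept.
    // Returns 1 where a single-pixel-wide line of some orientation is likely.
    public static int[,] Detect(double[,] img, double threshold)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1);
        var outMap = new int[rows, cols];

        for (int r = 1; r < rows - 1; r++)
            for (int c = 1; c < cols - 1; c++)
            {
                double best = 0.0;
                foreach (var m in Masks)
                {
                    double response = 0.0;
                    for (int i = -1; i <= 1; i++)
                        for (int j = -1; j <= 1; j++)
                            response += m[i + 1, j + 1] * img[r + i, c + j];
                    best = Math.Max(best, Math.Abs(response));   // keep the largest |Ri|
                }
                outMap[r, c] = best > threshold ? 1 : 0;
            }
        return outMap;
    }
}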

2.3.3 Thinning Algorithm

“Thinning” plays an important role in digital image processing and pattern recognition,
since the outcome of thinning largely determines how effectively and efficiently the
distinctive features can be extracted from the images. In image processing and pattern
recognition problems, a digitized binary pattern is normally defined by a matrix in which
each element, called a pixel, is either 1 (a foreground pixel) or 0 (a background pixel).
Thinning is a process that deletes foreground pixels and transforms the pattern into a “thin”
line drawing. The resulting thin image is called the skeleton of the original image.
The thinned image must preserve the basic structure and the connectedness of the original
image.

Skeletonization or thinning is a very important preprocessing step in pattern analysis
applications such as industrial parts inspection, fingerprint recognition, optical character
recognition, and biomedical diagnosis [37]. One advantage of skeletonization is the
reduction of the memory space required for storing the essential structural information
present in a pattern. Moreover, it simplifies the data structure required in pattern analysis.
Most skeletonization algorithms require iterative passes through the whole image, or at least
through each pixel of the object considered. At each pass, a relatively complicated
analysis of each pixel's neighborhood must be performed, which makes the algorithms
time-consuming.

The objective of thinning is to reduce the amount of information in an image pattern to the
minimum needed for recognition. A thinned image helps the extraction of important
features such as end points, junction points, and connections from image patterns. Many
thinning algorithms have therefore been proposed.

Approaches to thinning digital patterns fall into two major categories: iterative
boundary removal algorithms and distance transformation algorithms [38]. Iterative
boundary removal algorithms repeatedly delete pixels on the boundary of a pattern until
only a unit-pixel-width thinned image remains. Distance transformation algorithms are not
appropriate for general applications since they are not robust, especially for patterns with
highly variable stroke directions and thicknesses.

Thinning based on iterative boundary removal can be divided into sequential and parallel
algorithms. In a sequential (serial) method, the value of a pixel at the nth iteration depends
on a set of pixels for some of which the result of the nth iteration is already known. In
parallel processing, the value of a pixel at the nth iteration depends only on the values of the
pixel and its neighbors at the (n - 1)th iteration; thus, all the pixels of the digital pattern
can be thinned simultaneously.

There are two main steps in this thinning algorithm, which are repeated until the obtained
image approaches the medial axis of the original image. In the first step the contour of
the image is computed and marked for deletion (a serial step), and in the
second step the marked contour is deleted (a parallel step). The contour of an
image is formed by the on-pixels found at the outermost positions of the image.

These are the main characteristics of the TA algorithm: i) it maintains connectivity and
preserves the end points; ii) the resulting skeleton approaches the medial axis of the
original image; iii) it is practically immune to noise; and iv) its execution time is very fast
[39].
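For illustration only (the project's combined serial/parallel algorithm [39] differs in detail), the sketch below shows one sub-iteration of the classic Zhang-Suen parallel thinning rule, which follows the same mark-then-delete pattern: boundary pixels satisfying the deletion conditions are first marked, and all marked pixels are then removed simultaneously.

using System;
using System.Collections.Generic;

public static class ThinningSketch
{
    // One Zhang-Suen sub-iteration over a binary image (1 = foreground pixel).
    // 'first' selects which of the two sub-iterations to apply.
    // Returns true if any pixel was deleted, so the caller knows to keep iterating.
    public static bool SubIteration(int[,] img, bool first)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1);
        var marked = new List<int[]>();

        for (int r = 1; r < rows - 1; r++)
            for (int c = 1; c < cols - 1; c++)
            {
                if (img[r, c] != 1) continue;

                // Neighbours P2..P9, clockwise starting from the pixel above.
                int[] p =
                {
                    img[r - 1, c], img[r - 1, c + 1], img[r, c + 1], img[r + 1, c + 1],
                    img[r + 1, c], img[r + 1, c - 1], img[r, c - 1], img[r - 1, c - 1]
                };

                int b = 0;                                   // B(p): number of foreground neighbours
                int a = 0;                                   // A(p): number of 0 -> 1 transitions in P2..P9,P2
                for (int k = 0; k < 8; k++)
                {
                    b += p[k];
                    if (p[k] == 0 && p[(k + 1) % 8] == 1) a++;
                }

                bool cond = first
                    ? p[0] * p[2] * p[4] == 0 && p[2] * p[4] * p[6] == 0
                    : p[0] * p[2] * p[6] == 0 && p[0] * p[4] * p[6] == 0;

                if (b >= 2 && b <= 6 && a == 1 && cond)
                    marked.Add(new[] { r, c });              // mark the contour pixel ...
            }

        foreach (int[] m in marked) img[m[0], m[1]] = 0;     // ... then delete all marks in parallel
        return marked.Count > 0;
    }
}

A driver loop simply alternates the two sub-iterations until neither of them deletes a pixel.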

2.3.4 Curve Fitting Algorithm


I. Dynamic Strip Algorithm

Strip algorithms for curve fitting have recently received much attention because of their
superior speed. As shown in fig. 1, a strip is defined by one critical line and two boundary
lines. The critical line is defined by two reference points, the first and the second data
points (i.e. points O and a in fig. 1) of a curve. Two boundary lines, parallel to the critical
line and at a distance d from it, are then defined; the distance d is commonly called the
error tolerance. These two boundary lines form a strip that restricts the line fitting process.
The curve is then traversed point by point. The process stops and a line segment is
generated when the first point outside the strip is found (e.g. point e in fig. 1). A line
segment is then defined by the points O and c. Point c is used again as the starting point
for the next strip fitting step.

One major problem with the strip algorithm is that if the second reference point is
positioned in such a way that the third point on the curve falls outside the strip, the resulting
line segment will be very short, which is often undesirable. An example is shown as
strip 1 in fig. 2. It can be seen in the same figure that strip 2 is a more desirable strip
because it contains more data points.

From this simple observation, Leung and Yang [40] proposed the Dynamic Strip
Algorithm (DSA), which rotates the strip using the starting point as a pivot. The basic idea
is to rotate the strip so that it encloses as many data points as possible. An example
illustrating the advantage of the Dynamic Strip Algorithm can be seen in fig. 3, where case (a)
shows the best possible strip without rotation while case (b) shows the best
possible strip when rotation is allowed. The orientation of the strip is the only parameter
varied in the Dynamic Strip Algorithm.

Fig. 1: Definition of a strip. The critical line passes through the reference points O and a,
and the two boundary lines lie parallel to it at distance d on either side.

Fig. 2: A badly chosen strip (strip 1) and a properly chosen strip (strip 2).
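The basic strip test reduces to a point-to-line distance check against the tolerance d. The C# sketch below (illustrative, not the project's Dynamic2Strip code) fits a single strip: it walks along the curve from the starting point and stops at the first point whose perpendicular distance from the critical line exceeds d.

using System;
using System.Drawing;

public static class StripFitting
{
    // Returns the index of the last point of 'curve' (starting at 'start') that still
    // lies inside the strip of half-width d around the critical line through
    // curve[start] and curve[start + 1].
    public static int FitStrip(Point[] curve, int start, double d)
    {
        Point o = curve[start];
        Point a = curve[start + 1];

        double dx = a.X - o.X, dy = a.Y - o.Y;       // direction of the critical line
        double len = Math.Sqrt(dx * dx + dy * dy);
        if (len == 0) return start + 1;              // degenerate: identical reference points

        int last = start + 1;
        for (int i = start + 2; i < curve.Length; i++)
        {
            // Perpendicular distance of curve[i] from the critical line.
            double dist = Math.Abs(dy * (curve[i].X - o.X) - dx * (curve[i].Y - o.Y)) / len;
            if (dist > d) break;                     // first point outside the strip ends the segment
            last = i;
        }
        return last;                                 // a line segment joins curve[start] and curve[last]
    }
}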

II. Dynamic Two-Strip Algorithm

The Dynamic Two-Strip algorithm has two stages. In the first stage, a generator called the
Left-Right Strip Generator (LRSG) is employed to find the best fitted LHS and RHS
strips at each data point. In our convention, a point that is traversed before (after) the data
point P is said to be on the RHS (LHS) of P. The computed strips are used to compute the
figure of merit of the data point. In the second stage, a local maximum detection process
is applied to pick out desirable feature points, i.e. points of high curvature. The
approximated curve is one with the feature points connected by straight lines.

Left-Right Strip Generator (LRSG)

LRSG is an extension of the Dynamic Strip Algorithm: the strip is allowed to adjust its
orientation as well as its width dynamically. To simplify the discussion, we assume that the
data points are labeled 0, 1, …, N-1 and are traversed in either clockwise or
counter-clockwise fashion. Let L_i^Left (L_i^Right) and W_i^Left (W_i^Right) be the
length and width of the fitted LHS (RHS) strip at the i-th data point. Initially, a strip with
the minimum width (i.e. W_i = W_min) is used in each direction. When no more data points
can be included in the strip, the ratios

    E_i^Left = L_i^Left / W_i^Left      and      E_i^Right = L_i^Right / W_i^Right

are computed. Intuitively, E_i is a measure of the elongatedness of the strip: the longer or
narrower the strip is, the higher the value of E_i. An elongated strip, i.e. one with a large
E_i, is therefore desirable. The width is then increased by the smallest amount that allows
the strip fitting iteration to resume, i.e. that allows more data points to be included.
This process continues until the maximum allowable width (W_max) of the strip is reached.
The value of W_max can be set arbitrarily large, but the minimum width (W_min) cannot be
set arbitrarily small. This is particularly true if the data are digitized, because the length of a
strip has an upper bound L_max (bounded by the dimension of the screen) and a lower
bound L_min (bounded by the distance between two consecutive vertical or horizontal
pixels), whereas the width of a strip has no lower bound. If we arbitrarily choose W_min to
be less than 1/L_max, no strip of width ≥ 1 can be chosen, since such a strip always gives a
smaller E_i. This can be illustrated by considering
    E_i = L_i / W_i      and      E_i' = L_i' / W_i'

with W_i < 1/L_max and 1 ≤ W_i' < ∞. In this case, we will have E_i > L_i · L_max. Since L_i is
bounded from below by 1, E_i would be greater than L_max. On the other hand, E_i' can be at
most equal to L_i' with W_i' = 1. Since L_i' is bounded from above by L_max, E_i would be
larger than E_i'. Therefore, in the situation where W_i < 1/L_max, no strip of width ≥ 1 will be
chosen and little or no data reduction (noise filtering) is done. In practice, data reduction or
noise filtering is desirable.

The result of the above operation is a collection of the longest possible LHS (RHS)
strips of different widths at each data point. At each side of the data point, only the strip
with the largest E_i is selected.
The LRSG simulates the side detection mechanism. The curvature at a point can then be
determined by the angle subtended by the best fitted left and right strips. In order to
determine whether the i-th data point P_i is a feature point, we define a figure of merit f_i that
measures the worthiness of P_i to be included in the approximation. f_i is defined as

    f_i = E_i^Left · S_i^θ · E_i^Right

where θ is the angle subtended by the best fitted left and right strips and S_i^θ is the angle
acuteness measure at point i:

    S_i^θ = | 180° - θ |,   0 ≤ θ ≤ 360°.

According to this computation, sharper angles give a larger value of S_i^θ. It can be
seen that a sharp angle subtended by long strips results in a large f_i, whereas a blunt
angle subtended by short strips results in a small f_i. The above discussion can be
summarized by the following three steps:

(1) Determine E_i^Left and E_i^Right for all i.
(2) Determine the angle θ subtended by the left and right strips and also the value of S_i^θ.
(3) Determine f_i.
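As a toy numeric illustration of these formulas (values chosen arbitrarily):

using System;

public static class FigureOfMerit
{
    // S = |180 - theta| (angle acuteness) and f = E_left * S * E_right.
    public static double Compute(double eLeft, double eRight, double thetaDegrees)
    {
        double s = Math.Abs(180.0 - thetaDegrees);
        return eLeft * s * eRight;
    }
}

// For example, Compute(12, 9, 120) = 12 * 60 * 9 = 6480, whereas the same strips
// subtending a blunt angle of 170 degrees give only 12 * 10 * 9 = 1080.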

Local Maximum Detection

The local maximum detection process consists of three stages. First, non-local-maximum
points (i.e. points with small f compared with their neighbors) are eliminated
temporarily. The second step is to check whether over-elimination has occurred; if so,
some temporarily eliminated points are added back to the result. The final step is to fit
narrow strips to the remaining points, eliminating points that align approximately on a
straight line. Details of these steps are described in the following.

Non-local-maximum elimination process: basically, this is a process that allows each data
point P_i with a high f_i to eliminate other points that lie in the left and right domains of P_i.
A domain is defined by the area or length covered by the best fitted strip of a point. To
simplify the discussion, the left and right domains of P_i are denoted by D_i^Left and
D_i^Right respectively. The statement that a point Q is in, say, the left domain of P_i is
written as

    Q ∈ D_i^Left.

An ideal case is shown in fig. 3, where points A and B are local maxima since all the other
points between A and B (e.g. C) either have strips subtending an angle of
approximately 180° (fig. 3(a)) or have strips of wider widths together with wider
angles (fig. 3(b)). In these cases, the points between A and B (e.g. point C) are eliminated.

In the algorithm, a point P_j is eliminated if one of the following conditions is satisfied:

(i) there exists m such that

    P_j ∈ D_{j-m}^Left  and  D_j^Left ⊆ D_{j-m}^Left  and  f_j < f_{j-m};

(ii) there exists m such that

    P_j ∈ D_{j+m}^Right  and  D_j^Right ⊆ D_{j+m}^Right  and  f_j < f_{j+m}.

In practice, it was found difficult to obtain complete compliance with the domain
subsetting conditions, i.e. D_j^Left ⊆ D_{j-m}^Left or D_j^Right ⊆ D_{j+m}^Right.
Therefore the conditions are relaxed: the condition D_j^Left ⊆ D_{j-m}^Left is said to hold
if half of the left domain of P_j is covered by D_{j-m}^Left. The same applies to the right
domain.

Another problem that can arise can be understood by considering fig. 4. In fig. 4, if the
lines AB and FG are long enough, the curve BCDF is comparatively insignificant and can
be ignored. On the other hand, if either AB or FG is short, the curve BCDF may be of
significance. The classification can be illustrated by considering the angle at point B. At
point B, the best fitted right strip would be from point B to A. If the line FG is long, the
best fitted left strip of B would be from point B to G. On the other hand, if the line FG is
short, the best fitted left strip may be from B to C, since a narrower strip, which can give a
larger value of f_i, can be used. In the first case the angle subtended by the left and right
strips of point B is obtuse and B will be eliminated, while in the second case the angle is
acute and B will be retained.

Fig. 3: Examples of two local maximum points (A and B) and one weaker point (C) with
its left and right domains, shown for two cases (a) and (b).

For example (see fig. 4), if the best fitted left strip of point A is from A to G, the process
will examine the points in between (e.g. B, C, D and F) before eliminating any of them. If
the lines AB and FG are long enough, all the points in between will have obtuse angles
and are eliminated. Otherwise, those which have acute angles will be retained. For
example, if point B has an acute angle, only the points between A and B will be eliminated
by A; consequently, the left domain of A is reduced to extend from A to B only.

Left Domain of A & Right domain of G

G. F B .A
D C

Figure 4: A possible but undesirable chosen strip (AG).


Bridging process: in the first process, weak and insignificant points are eliminated. In
practice, some weak points may be of significance, so a check is made to determine
whether over-elimination has occurred. In the case of over-elimination, some temporarily
eliminated points are added back to the result. Ideally, neighboring feature points are
bridged together, i.e. the left domain of point A covers point B and the right domain of
point B covers point A. If two selected neighboring points A and B are not bridged, we say
that over-elimination has occurred. Bridges can be broken in the following ways:

(i) A ∉ D_B^Right and B ∉ D_A^Left (see fig. 5(b));

(ii) A ∉ D_B^Right and B ∈ D_A^Left, or A ∈ D_B^Right and B ∉ D_A^Left (see fig. 5(c)).

In either case, additional feature points are sought and the points involved are reexamined
iteratively (or recursively) until all neighboring points are bridged together. The
additional feature points are sought at the end of the shortened domains by selecting
immediate local maximum points in the neighborhood. For example, in fig. 5(b) at the
end of the shortened right domain of B, the process looks for the first local maximum
starting from point C to B. For the shortened left domain of A, the process starts from
point D to A.

In short, the bridging process checks for the termination condition (i.e. all neighboring
points are bridged together) in each iteration. If the condition is satisfied, the process
terminates. Otherwise, additional feature points are sought and the iteration continues.

Strip fitting process: this is a data reduction process that fits narrow strips to the remaining
points. The reason behind this process is that some consecutive feature points may align
approximately on a straight line, and it is desirable to eliminate the points in between. For
example, if points A, B, C and D are chosen as the feature points after the first two
processes, as shown in fig. 5(b), it is desirable to eliminate points C and D and let the
more prominent points A and B represent the curve ADCB. In practice, the process first
locates the most outstanding points, the local maximum points (e.g. A), among the
remaining points as starting points. Then two narrow strips of fixed width (one half of
the minimum width) are fitted to the LHS and RHS of the data point, eliminating any
points within the strips that have smaller values of merit (e.g. C and D). The fitting stops
whenever the last point that can be fitted within the strip is found or a point with a larger
value of merit is met. In either case, the last point examined is not eliminated.

Fig. 5(a): An ideal relationship between two local maximum points (A and B) and their domains.
Fig. 5(b): An example of a bridge broken under condition (i).
Fig. 5(c): An example of a bridge broken under condition (ii).
2.3.5 Hausdorff Distance Algorithm

The Hausdorff distance is a shape comparison metric based on binary images. It is a
distance defined between two point sets. Unlike most shape comparison methods that
build a point-to-point correspondence between a model and a test image, the Hausdorff
distance can be calculated without explicit point correspondence. The Hausdorff distance
for binary image matching is more tolerant to perturbations in the locations of points than
binary correlation techniques, since it measures proximity rather than exact superposition.

The use of the Hausdorff distance for binary image comparison and computer vision was
originally proposed by Huttenlocher and colleagues [41]. In their paper the authors argue
that the method is more tolerant to perturbations in the locations of points than binary
correlation techniques since it measures proximity rather than exact superposition. Unlike
most shape comparison methods, the Hausdorff distance can be calculated without the
explicit pairing of points in their respective data sets, A and B. Furthermore, there is a
natural allowance to compare partial images and the method lends itself to simple and fast
implementation. Formally, given two finite point sets A = {a1, …, ap} and B = {b1, …, bq},
the Hausdorff distance is defined as

    H(A, B) = max( h(A, B), h(B, A) ),

where

    h(A, B) = max_{a ∈ A} min_{b ∈ B} || a - b ||.

In the formulation above, ||·|| is some underlying norm over the point sets A and B. In the
following discussion, we assume that the distance between any two data points is the
Euclidean distance. h(A, B) can be computed trivially in time O(pq) for point sets
of size p and q, respectively, and this can be improved to O((p + q) log(p + q)). The
function h(A, B) is called the directed Hausdorff distance from set A to B. It identifies
the point a ∈ A that is farthest from any point of B and measures the distance from a to its
nearest neighbor in B. In other words, h(A, B) in effect ranks each point of A based on its
distance to the nearest point in B and then uses the largest-ranked such point as the
measure of distance (the most mismatched point of A). Intuitively, if h(A, B) = d, then
each point of A must be within distance d of some point of B, and there is also some point
of A that is exactly at distance d from the nearest point of B.
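As a concrete illustration (a minimal sketch, not the project's HausdorffDistance class), the directed distance h(A, B) and the symmetric H(A, B) can be computed directly from the definitions above; this is the straightforward O(pq) computation.

using System;
using System.Drawing;

public static class HausdorffSketch
{
    // Directed Hausdorff distance h(A, B) = max over a in A of (min over b in B of ||a - b||).
    public static double DirectedH(Point[] A, Point[] B)
    {
        double worst = 0.0;                       // distance of the most mismatched point of A
        foreach (Point a in A)
        {
            double nearest = double.MaxValue;     // distance from a to its nearest neighbour in B
            foreach (Point b in B)
            {
                double dx = a.X - b.X, dy = a.Y - b.Y;
                nearest = Math.Min(nearest, Math.Sqrt(dx * dx + dy * dy));
            }
            worst = Math.Max(worst, nearest);
        }
        return worst;
    }

    // Undirected Hausdorff distance H(A, B) = max( h(A, B), h(B, A) ).
    public static double H(Point[] A, Point[] B)
    {
        return Math.Max(DirectedH(A, B), DirectedH(B, A));
    }
}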

For practical implementations, it is also important (due to occlusion or noise conditions)
to be able to compare portions of shapes rather than requiring exact matches. To handle
such situations, the Hausdorff distance can be naturally extended to find the best partial
distance between sets A and B. To achieve this, while computing h(A, B), one simply has
to rank each point of A by its distance to the nearest point in B and take the Kth ranked
value. This definition provides a nice property: it automatically selects the K “best
matching” points of set A that minimize the directed Hausdorff distance [41].

Realizing that there could be many different ways to define the directed (h(A, B), h(B, A))
and undirected (H(A, B)) distances between two point sets A and B, Dubuisson and
Jain revised the metric and redefined the original h(A, B), proposing an
improved measure, called the modified Hausdorff distance (MHD), which is less sensitive
to noise. Specifically, in their formulation

    h(A, B) = (1 / N_a) Σ_{a ∈ A} min_{b ∈ B} || a - b ||,

where N_a = p, the number of points in set A. In their paper, the authors argue that even the
Kth ranked Hausdorff distance of Huttenlocher presents problems for object matching
under noisy conditions, and conclude that the modified distance proposed above has the
most desirable behavior for real-world applications.

In this work, we adopt the MHD formulation of Dubuisson, and further improve its
performance by introducing the notion of a neighborhood function (N_B^a) and associated
penalties (P). Specifically, we assume that for each point in set A, the corresponding point
in B must fall within a range of a given diameter. This assumption is valid under the
conditions that (i) the input and reference images are normalized by appropriate
preprocessing algorithms, and (ii) the non-rigid transformation is small and localized. Let
N_B^a be the neighborhood of point a in set B, and let the indicator I = 1 if there exists a
point b ∈ N_B^a, and I = 0 otherwise. The complete formulation of the “doubly” modified
Hausdorff distance (M2HD) can now be written as

    d(a, B) = max( I · min_{b ∈ N_B^a} || a - b ||, (1 - I) · P ),

    h(A, B) = (1 / N_a) Σ_{a ∈ A} d(a, B),

    H(A, B) = max( h(A, B), h(B, A) ).

The notion of similarity encoded by this modified Hausdorff distance is that each point of
A be near some point of B and vice versa. It requires, however, that all matching pairs fall
within a given neighborhood of each other in consistency with our initial assumption that
local image transformations may take place. If no matching pair can be found, the present
model introduces a penalty mechanism to ensure that images with large overlap are easily
distinguished as well. As a result, the proposed modified Hausdorff measure (M2HD) is
ideal for applications, such as face recognition, where although overall shape similarity is
maintained, the matching algorithm has to account for small, non-rigid local distortions.
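A minimal sketch of the M2HD formulation above, operating on point sets rather than the image matrices used by the project's M2HD class; N is the neighborhood radius and P the penalty.

using System;
using System.Drawing;

public static class M2HDSketch
{
    // d(a, B): if some b in B lies within radius N of a (the indicator I = 1), return the
    // distance to the nearest such b; otherwise (I = 0) return the penalty P.
    static double D(Point a, Point[] B, double N, double P)
    {
        double nearest = double.MaxValue;
        foreach (Point b in B)
        {
            double dx = a.X - b.X, dy = a.Y - b.Y;
            double dist = Math.Sqrt(dx * dx + dy * dy);
            if (dist <= N) nearest = Math.Min(nearest, dist);   // only points inside the neighbourhood count
        }
        return nearest == double.MaxValue ? P : nearest;        // no neighbour found -> penalty
    }

    // Directed doubly modified distance h(A, B) = (1 / Na) * sum over a in A of d(a, B).
    public static double DirectedM2HD(Point[] A, Point[] B, double N, double P)
    {
        double sum = 0.0;
        foreach (Point a in A) sum += D(a, B, N, P);
        return sum / A.Length;
    }

    // Undirected M2HD: H(A, B) = max( h(A, B), h(B, A) ).
    public static double M2HD(Point[] A, Point[] B, double N, double P)
    {
        return Math.Max(DirectedM2HD(A, B, N, P), DirectedM2HD(B, A, N, P));
    }
}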

2.4 Use Case Diagram

Fig. Use Case Diagram for how the User interacts with the System
(Use cases: Face Detection, Edge Detection (Robert Cross), Convert Image to Binary
Image, Generate Line Edge Map, Save Image to Database, and Find Image using the
directed, modified and doubly modified Hausdorff distances; actors: Trainer and Tester.)

Fig. Use Case Diagram for the System's internal process
CHAPTER 3
DESIGN

3.1 Class Relationship Diagram

Package: FaceRecognitionSystem
Package: FaceRecognitionSystem.MainWin

Package: FaceRecognitionSystem.GUI
Package: FaceRecognitionSystem.CreateDB

Package: FaceRecognitionSystem.StoreDB

Package: FaceRecognitionSystem.Support

Package: FaceRecognitionSystem.Binary
Package: FaceRecognitionSystem.FaceRegion

Package: FaceRecognitionSystem.EdgeDetector

Package: FaceRecognitionSystem.Thinning

Package: FaceRecognitionSystem.Dynamic2Strip
Package: FaceRecognitionSystem.HausdorffDistance

3.2 Class Diagrams


This class is the main GUI class, which opens a form in which the different user-defined
user controls are placed. It uses the other GUI classes to show processing and results.
Class FaceRecognitionSystem.GUI.MainWin : System.Windows.Forms.Form
This class is providing
two important method to
find successor and
predecessor of any point
in image.
Class FaceRecognitionSystem.Support.SuccPredec
This class is used for
conversion between
RGB and HSL models.
It can also set/modify
brightness, saturation
and hue.
Class FaceRecognitionSystem.Support.RGBHSL
This class is used within the RGBHSL class as an external support. It is used for RGB to
HSL and vice versa conversions.
Class FaceRecognitionSystem.Support.HSL
This is an interface, which provides a method for convolution.
Interface FaceRecognitionSystem.EdgeDetector.Convolution
This class is used to
extract main region of
the face from the
images.
Class FaceRecognitionSystem.FaceRegion.FaceRegion
This class is used to
insert data (images) into
database.
Class FaceRecognitionSystem.StoreDB.DataMgmt
This class is used to
create database and table
in that database in SQL
Server 7.
Class FaceRecognitionSystem.CreateDB.CreateDB
This class is for
converting image to
binary image.
Class FaceRecognitionSystem.Binary.BinImage
This class is an
implementation of a
Sobel Edge Detector.
Class FaceRecognitionSystem.EdgeDetector.Sobel
This class is an
implementation of a
Robert Cross Edge
Detector.
Class FaceRecognitionSystem.EdgeDetector.RobertCross
This class will generate
Edge Map of an image.
Class FaceRecognitionSystem.EdgeDetector.EdgeMap
This is an abstract class,
which provides image to
2-D matrix and vice
versa conversion.
Abstract Class FaceRecognitionSystem.EdgeDetector.ImgMatrix
This class is used for performing binary operations on an image-matrix.
Class FaceRecognitionSystem.Thinning.BinMatrix

This class is used to perform the Hit and Miss process on a face image.
Class FaceRecognitionSystem.Thinning.HitAndMiss

This class performs the thinning operation over a binary face image.
Class FaceRecognitionSystem.Thinning.SerialThinning

This class will find pixels with local maximum values and eliminate the other pixels.
Class FaceRecognitionSystem.Dynamic2Strip.LocalMaximum

This class is a left-right strip generator based on the Dynamic Two-Strip algorithm.
Class FaceRecognitionSystem.Dynamic2Strip.LRSG

This class is used to find the Hausdorff distance of an input image to each of the images
stored in the database.
Class FaceRecognitionSystem.HausdorffDistance.HausdorffDistance

This class is an implementation of the Directed Hausdorff Distance algorithm.
Class FaceRecognitionSystem.HausdorffDistance.HD

This class is an implementation of the Modified Hausdorff Distance algorithm.
Class FaceRecognitionSystem.HausdorffDistance.MHD

This class is an implementation of the Doubly Modified Hausdorff Distance algorithm.
Class FaceRecognitionSystem.HausdorffDistance.M2HD

3.3 Sequence Diagram

Fig. Sequence Diagram for Creating Database
Fig. Sequence Diagram for Full Training
Fig. Sequence Diagram for Full Testing
Fig. Sequence Diagram for Step By Step Testing
Fig. Sequence Diagram for Step By Step Training


CHAPTER 4
IMPLEMENTATION

Class FaceRecognitionSystem.GUI.MainWin
public class MainWin : System.Windows.Forms.Form
{
private System.Windows.Forms.MainMenu mainMenu1;
private System.Windows.Forms.MenuItem menuFile;
private System.Windows.Forms.MenuItem menuOpen;
private System.Windows.Forms.MenuItem menuExit;
private System.Windows.Forms.MenuItem menuItem1;
private System.Windows.Forms.MenuItem menuOptions;
private System.Windows.Forms.MenuItem menuTraining;
private System.Windows.Forms.MenuItem menuTesting;
private System.Windows.Forms.MenuItem menuSBSTraining;
private System.Windows.Forms.MenuItem menuFTraining;
private System.Windows.Forms.MenuItem menuSBSTesting;
private System.Windows.Forms.MenuItem menuFTesting;
private System.Windows.Forms.OpenFileDialog openFileDialog;

private System.ComponentModel.Container components = null;


private FaceRecognitionSystem.GUI.ShowProcessing showProcessing1;
private FaceRecognitionSystem.GUI.SBSTraining sbsTraining1;
private System.Windows.Forms.MenuItem menuDatabase;
private System.Windows.Forms.MenuItem menuDBCreate;
private System.Windows.Forms.MenuItem menuHelp;
private System.Windows.Forms.MenuItem menuUse;
private System.Windows.Forms.MenuItem menuAbtUS;
private System.Windows.Forms.TextBox txtWelcome;
private FaceRecognitionSystem.GUI.FullTrainning fullTrainning1;
private FaceRecognitionSystem.GUI.SBSTesting sbsTesting1;
private FaceRecognitionSystem.GUI.FullTesting fullTesting1;
private FaceRecognitionSystem.GUI.ShowResult showResult1;
private System.Windows.Forms.MenuItem menuItem4;
public FaceRecognitionSystem.GUI.PBoxPanel pBoxPanel1;
public System.Windows.Forms.Panel WelcomePanel;
public FaceRecognitionSystem.GUI.ShowInformation showInformation1;

public MainWin()
{
InitializeComponent();
}
protected override void Dispose( bool disposing );
static void Main()
{
Application.Run(new MainWin());
}
}
This is the class from which the Main() method is called. When the application is run, this
form is loaded first, and the other user-defined user controls are placed on it.

Class FaceRecognitionSystem.Support.HSL and


Class FaceRecognitionSystem.Support.RGBHSL

public class HSL


{
double _h;
double _s;
double _l;
double H
double S
double L

public HSL();
}

public class RGBHSL


{
public RGBHSL();
public static Color SetBrightness(double brightness);
public static Color ModifyBrightness(Color c,double brightness);
public static Color SetSaturation(Color c,double Saturation);
public static Color ModifySaturation(Color c,double Saturation);
public static Color SetHue(Color c, double Hue);
public static Color ModifyHue(Color c, double Hue);
public static Color HSL_to_RGB(HSL hsl);
public static HSL RGB_to_HSL (Color c);
}

This class is used as an external support. It converts RGB to HSL and vice versa, and is
also used to set/modify brightness, saturation and hue.
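A short, hypothetical usage example of the documented helpers (the colour value and the adjustment factors are made up for illustration):

using System.Drawing;
using FaceRecognitionSystem.Support;

class RGBHSLExample
{
    static void Main()
    {
        Color c = Color.FromArgb(180, 120, 60);            // an arbitrary sample colour

        HSL hsl = RGBHSL.RGB_to_HSL(c);                    // RGB -> HSL
        Color back = RGBHSL.HSL_to_RGB(hsl);               // HSL -> RGB

        Color brighter = RGBHSL.ModifyBrightness(c, 1.2);  // adjust brightness
        Color softer   = RGBHSL.ModifySaturation(c, 0.5);  // adjust saturation
    }
}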

Class FaceRecognitionSystem.Support.SuccPredec

public class SuccPredec


{
public SuccPredec();
public Point successor(Point x,Point p,int[][] Q);
public Point predecessor(Point x,Point p,int[][] Q);
}

public Point successor(Point x,Point p,int[][] Q)


Encapsulation : public
Return type : Point
Method name : successor
Arguments :
x - reference point with respect to which the successor of current point will be
found from image matrix.
p - current point
Q - image matrix: a matrix of integers representing the image as the intensity value of
each pixel at each node of the matrix.

This method returns a point, which will be a successor point of p with respect to x
from image-matrix Q.

public Point predecessor(Point x,Point p,int[][] Q)


Encapsulation : public
Return type : Point
Method name : predecessor
Arguments :
x - reference point with respect to which the predecessor of current point will be
found from image matrix.
p - current point
Q - image matrix: a matrix of integers representing the image as the intensity value of
each pixel at each node of the matrix.

This method returns a point, which will be a predecessor point of p with respect to
x from image-matrix Q.

Class FaceRecognitionSystem.CreateDB.CreateDB

public class CreateDB


{
string str;
SqlConnection con;
SqlCommand comm;

public CreateDB();
}

str : variable of type string, used as a query string for database.


con : variable of type SqlConnection, used to create connection to database.
comm : variable of type SqlCommand, used to create command stored in string str,
which will be executed in database connected with connection con.
public CreateDB()
Encapsulation : public
Method type : constructor
Method name : CreateDB
Arguments : N/A

This method will create a database named “FaceDB” in SQL Server and then create
a table named “FaceTab” in FaceDB, which is used for storing images into the database
for identification.
Class FaceRecognitionSystem.StoreDB.DataMgmt

public class DataMgmt


{
SqlConnection con;
SqlDataAdapter adap;
SqlCommandBuilder builder;
DataSet dataset;
string insert;

public DataMgmt();
public void insertion(Image OI, Image PI);
public void distroy();
}

con : variable of type SqlConnection, used to create connection to database.


adap : variable of type SqlDataAdapter, used to create data adapter.
builder : variable of type SqlCommandBuilder, used to build command and execute.
dataset : variable of type DataSet, used to insert data into database.
insert : variable of type string, used to store query for inserting data into database.

public DataMgmt()
Encapsulation : public
Method type : constructor
Method name : DataMgmt
Arguments : N/A

This method will create connection to database. And connect dataset to the table
“FaceTab” in the database.

public void insertion(Image OI, Image PI)


Encapsulation : public
Return type : void
Method name : insertion
Arguments :
OI : argument of type Image, is an original image on which processing is done.
PI : argument of type Image, is a processed image.

This method will store OI and PI into the database.

public void distroy()


Encapsulation : public
Return type : void
Method name :distroy
Arguments : N/A

This method will close connection and dispose builder and adapter objects.
Class FaceRecognitionSystem.FaceRegion.FaceRegion

public class FaceRegion


{
public Image FaceReg(Image I2);
private Image ScaleImage (Image image, int width, int height);
private Image MainRegion(Image I,int[] cols, int[] rows);
}

public Image FaceReg(Image I2)


Encapsulation : public
Return type : Image
Method name : FaceReg
Arguments :
I2 – argument of type Image, is an original image from which the main region of
the image will extract.

This function will find the region of the face from the image passed as an
argument and will return the portion of the image found as an image.

private Image ScaleImage (Image image, int width, int height)


Encapsulation : private
Return type : Image
Method name : ScaleImage
Arguments :
image : the original image, which is to be scaled.
width : integer value giving the width of the scaled image.
height : integer value giving the height of the scaled image.

This function will scale image to given size of width and height and return the
scaled image.

private Image MainRegion(Image I,int[] cols, int[] rows)


Encapsulation : private
Return type : Image
Method name : MainRegion
Arguments :
I : original image
cols : integer array, contains x position of left-top and right-bottom points
rows : integer array, contains y position of left-top and right-bottom points

This method will extract the region specified by the two arrays cols and rows from the
image I, convert that region to an image, and return it.

Interface FaceRecognitionSystem.EdgeDetector.Convolution

interface Convolution
{
double[][] Convolve(double[][] X,int[][] Y);
}
double[][] Convolve(double[][] X,int[][] Y)
Encapsulation : public
Return type : double[][]
Method name : Convolve
Arguments :
x : 2-D array of double, which will be convolved.
y : 2-D array of integer, by which x will be convolved.

Class FaceRecognitionSystem.EdgeDetector.ImgMatrix

public abstract class ImgMatrix


{
protected abstract double[][] ImgToMat(Image I);
protected abstract Image MatToImg(double[][] X);
}

protected abstract double[][] ImgToMat(Image I)


Encapsulation : protected
Return type : double[][]
Method name : ImgToMat
Arguments :
I - argument of type Image.

This method will convert image I to matrix of double values filled with the
intensity values of each pixel and return matrix.
protected abstract Image MatToImg(double[][] X)
Encapsulation : protected
Return type : Image
Method name : MatToImg
Arguments :
X - 2-D array of double, an image-matrix.

This method will convert a matrix of double values, filled with the intensity values
of each pixel, to an image and return that image.

Class FaceRecognitionSystem.EdgeDetector.EdgeMap

public class EdgeMap : ImgMatrix, Convolution


{
private double[][] Gx;
private double[][] Gy;
public EdgeMap(int[][] X,int[][] Y,double[][] Z)
public double[][] Magnitude()
public double[][] Angle()
public double[][] Convolve(double[][] X,int[][] Y)
protected override double[][] ImgToMat(Image I)
protected override Image MatToImg(double[][] X)
}
Gx : 2-D array of double, stores x-kernel
Gy : 2-D array of double, stores y-kernel

public EdgeMap(int[][] X,int[][] Y,double[][] Z)


Encapsulation : public
Method type : constructor
Method name : EdgeMap
Arguments :
X – 2-D array of integers, is a kernel
Y – 2-D array of integers, is a kernel
Z – 2-D array of doubles, is an image-matrix

This method will convolve Z with respect to X and Y and store result into Gx and
Gy respectively.

public double[][] Magnitude()


Encapsulation : public
Return type : double[][]
Method name : Magnitude
Arguments : N/A

This method will find magnitude from Gx and Gy and return the result in 2-D
array of double value.

public double[][] Angle()


Encapsulation : public
Return type : double[][]
Method name : Angle
Arguments : N/A

This method will find Angle from Gx and Gy and return result in 2-D array of
double value.

public double[][] Convolve(double[][] X,int[][] Y)


Encapsulation : public
Return type : double[][]
Method name : Convolve
Arguments :
x : 2-D array of double, which will be convolved.
y : 2-D array of integer, by which x will be convolved.

This method will perform convolution of X with respect to Y and return result in
2-D array of double value.

protected override double[][] ImgToMat(Image I)


Encapsulation : protected
Return type : double[][]
Method name : ImgToMat
Arguments :
I – Image to convert
This method will convert an image I into 2-D array of double value filled with
intensity values of each pixel.

protected override Image MatToImg(double[][] X)


Encapsulation : protected
Return type : Image
Method name : MatToImg
Arguments :
X – 2-D array of double, is an image-matrix

This method will convert an image-matrix which contains intensity values of each
pixel to an image and return that image.

Class FaceRecognitionSystem.EdgeDetector.RobertCross

public class RobertCross : EdgeMap


{
int[][] Gx;
int[][] Gy;
public RobertCross()
public Image RCrossED(Image I)
}

Gx : 2-D array of integer, is kernel.


Gy : 2-D array of integer, is kernel.

public RobertCross()
Encapsulation : public
Method type : constructor
Method name : RobertCross
Arguments : N/A

This method will initialize Gx and Gy kernels.

public Image RCrossED(Image I)


Encapsulation : public
Return type : Image
Method name : RCrossED
Arguments :
I – an Image to process

This method will process the image I, extract edges from it, and return the extracted
edge map as an image.
Class FaceRecognitionSystem.EdgeDetector.Sobel

public class Sobel : EdgeMap


{
int[][] Gx;
int[][] Gy;
public Sobel()
public Image SobelED(Image I)
}

Gx : 2-D array of integer, is kernel.


Gy : 2-D array of integer, is kernel.

public Sobel()
Encapsulation : public
Method type : constructor
Method name : Sobel
Arguments : N/A

This method will initialize Gx and Gy kernels.

public Image SobelED(Image I)


Encapsulation : public
Return type : Image
Method name : SobelED
Arguments :
I – an Image to process

This method will process the image I, extract edges from it, and return the extracted
edge map as an image.

Class FaceRecognitionSystem.Binary.BinImage

public class BinImage : EdgeDetector.EdgeMap


{
public BinImage()
public Image BinaryImage(Image I)
}

public Image BinaryImage(Image I)


Encapsulation : public
Return type : Image
Method name : BinaryImage
Arguments :
I – Image to convert into binary.

This method will convert image I to binary based on some predefined threshold.
Class FaceRecognitionSystem.Thinning.BinMatrix

public class BinMatrix : support.SuccPredec


{
public BinMatrix()
public int[][] NOT(int[][] X)
public Image OR(Image X1,Image Y1)
public int[][] AND(int[][] X,int[][] Y)
protected double[][] ImgToMat(Image I)
protected Image MatToImg(double[][] X)
public int[][] ImgMat(Image I)
public Image MatImg(int[][] X)
}

public int[][] NOT(int[][] X)


Encapsulation : public
Return type : int[][]
Method name : NOT
Arguments :
x - 2-D array of integers, is an image-matrix to invert.

This method will invert the image-matrix x and return the resulting 2-D array.

public Image OR(Image X1,Image Y1)


Encapsulation : public
Return type : Image
Method name : OR
Arguments :
X1 - first Image
Y1 – second Image

This method will OR the two images and return the single resulting image.

public int[][] AND(int[][] X,int[][] Y)


Encapsulation : public
Return type : int[][]
Method name : AND
Arguments :
X - first image-matrix
Y - second image-matrix

This method will AND the two image-matrices and return the single resulting matrix.

public int[][] ImgMat(Image I)


Encapsulation : public
Return type : int[][]
Method name : ImgMat
Arguments :
I – Image to convert
This method will convert an image I into a 2-D array of integers filled with the intensity
values of each pixel.

public Image MatImg(int[][] X)


Encapsulation : public
Return type : Image
Method name : MatImg
Arguments :
X – 2-D array of integer, is an image-matrix

This method will convert an image-matrix, which contains intensity values of each
pixel to an image and return that image.

Class FaceRecognitionSystem.Thinning.HitAndMiss

public class HitAndMiss


{
public int[][] HitNMiss(int[][] I,int[][] SE)
}

public int[][] HitNMiss(int[][] I,int[][] SE)


Encapsulation : public
Return type : int[][]
Method name : HitNMiss
Arguments :
I – 2-D array of integers, is an image-matrix.
SE – 2-D array of integers, is a structuring element.

This method will process the image-matrix I with the structuring element SE and return
the resulting image as a 2-D array of integers.

Class FaceRecognitionSystem.Thinning.SerialThinning

public class SerialThinning : BinMatrix


{
private Point First;
private Point Prev;
private int[][] Q;
public SerialThinning()
public SerialThinning(Image I)
private void Deletion(Point p)
public Image Thinning()
private int B(Point p)
private int A(Point p)
}

First : starting point of any loop


Prev : previous point of current point that traversed
Q : 2-D array of integers, is an image-matrix.

public SerialThinning(Image I)
Encapsulation : public
Method type : constructor
Method name : SerialThinning
Arguments :
I – image to thin.

This method will perform initialization of different variables.

private void Deletion(Point p)


Encapsulation : private
Return type : void
Method name : Deletion
Arguments :
P – point to delete

This method will delete point P from image-matrix Q.

public Image Thinning()


Encapsulation : public
Return type : Image
Method name : Thinning
Arguments : N/A

This method will perform thinning operation over image-matrix Q and return
thinned image as a result.

private int B(Point p)


Encapsulation : private
Return type : int
Method name : B
Arguments :
p – current point

This method will return the total number of pixels with value 1 in the neighborhood
of p.

private int A(Point p)


Encapsulation : private
Return type : int
Method name : A
Arguments :
p – current point

This method will return the number of 1-to-0 transitions in the neighborhood of p.


Class FaceRecognitionSystem.Dynamic2Strip.LocalMaximum

public class LocalMaximum : LRSG


{
private int[][] Q1,Q;
private float[][] fi;
public LocalMaximum(Image I, ref PictureBox pb)
public void LMax()
public void NLMElim()
private Point[] strip(Point p)
}

Q1 : 2-D array of type integer, an image-matrix, used to store original input image.
Q : 2-D array of type integer, an image-matrix, used for processing.
fi : 2-D array of type float, used to store calculated value of each pixel.

public LocalMaximum(Image I, ref PictureBox pb)


Encapsulation : public
Method type : constructor
Method name : LocalMaximum
Arguments :
I : an input image.
pb : reference variable to PictureBox.

This method will initialize Q, Q1 and fi and then call the other methods to process the
input image. Finally, it places the processed image into the picture box whose reference
is passed as an argument.

public void LMax()


Encapsulation : public
Return type : void
Method name : LMax
Arguments : N/A

This method will calculate the local maximum value for each dark pixel of the image and
store the result in the matrix fi.

public void NLMElim()


Encapsulation : public
Return type : void
Method name : NLMElim
Arguments : N/A

This method will eliminate all pixels which are not local maximum.

private Point[] strip(Point p)


Encapsulation : private
Return type : Point[]
Method name : strip
Arguments :
p – point, for which strips to be found.

This method will find strips on both sides of a pixel p and return an array containing all
the points within the rectangle generated from the strips.

Class FaceRecognitionSystem.Dynamic2Strip.LRSG

public class LRSG : support.SuccPredec


{
private double LLength,LWidth;
private double RLength,RWidth;
private double angle;
private int[][] Q;
private Point Lpt1,Lpt2,Lpt3,Lpt4;
private Point Rpt1,Rpt2,Rpt3,Rpt4;
private Point start;
private float lm,rm,m1,m2;
public LRSG()
public float f(Point p, int[][] Q1)
private double S()
private double E(Point p, char d)
private void strips(Point p, char d)
private double length(char d)
public float line_slop(Point p, Point z)
public bool check(Point pt1,Point pt2,Point t,Point st,float m1,float m2)
public bool same_side(Point x, float m, Point p, Point t)
public int line_side(Point x, float m, Point p)
public void points12(Point p, Point z, char d)
public void pt12(Point p, Point z, ref Point pt1, ref Point pt2)
public void points34(Point p,char d)
public Point intersectln(Point p, Point pt)
}

LLength : length of the left-side strip.
LWidth : distance between the left-side strips.
RLength : length of the right-side strip.
RWidth : distance between the right-side strips.
angle : angle subtended between the left and right strips.
Q : an image-matrix.
Lpt1, Lpt2, Lpt3, Lpt4 : end points of the two left-side strips.
Rpt1, Rpt2, Rpt3, Rpt4 : end points of the two right-side strips.
start : starting point of the loop.
lm : slope of the left strip.
rm : slope of the right strip.

public LRSG()
Encapsulation : public
Method type : constructor
Method name : LRSG
Arguments : N/A

This method will initialize variables.

public float f(Point p, int[][] Q1)


Encapsulation : public
Return type : float
Method name : f
Arguments :
p – point for which to calculate f.
Q1 – an image-matrix

This method will calculate f for point p.

private double S()


Encapsulation : private
Return type : double
Method name : S
Arguments : N/A

This method will find S for a point.

private double E(Point p, char d)


Encapsulation : private
Return type : double
Method name : E
Arguments :
p – point for which to calculate elongatedness.
d – character indicate the direction as left or right.

This method will calculate elongatedness of a point p in direction d.

private void strips(Point p, char d)


Encapsulation : private
Return type : void
Method name : strips
Arguments :
p – point around which strips are to be found.
d – char indicate the direction.

This method will find strips and store their end points into global variables
Lpt1,Lpt2,Lpt3,Lpt4 and Rpt1,Rpt2,Rpt3,Rpt4 as per direction passed.

private double length(char d)


Encapsulation : private
Return type : double
Method name : length
Arguments :
d – character indicate direction.
This method will find length of a strip based on the direction passed.

public float line_slop(Point p, Point z)


Encapsulation : public
Return type : float
Method name : line_slop
Arguments :
p – first point.
z – second point.

This method will find the slope of the line passing through p and z and return it.

public bool check(Point pt1,Point pt2,Point t,Point st,float m1,float m2)


Encapsulation : public
Return type : bool
Method name : check
Arguments :
pt1 : point on the first left-side strip.
pt2 : point on the second left-side strip.
t : point to be checked.
st : point for which the strips are generated.
m1 : slope of the strips on the left side.
m2 : slope of the line perpendicular to the line with slope m1.

This method will check whether point t lies between the two lines passing through pt1
and pt2 with slope m1, and return true if it does, false otherwise.

public bool same_side(Point x, float m, Point p, Point t)


Encapsulation : public
Return type : bool
Method name : same_side
Arguments :
x : point on a line with slope m.
m : slope of the line.
p, t : points which are to be checked.

This method will check whether points p and t are on the same side of the line passing
through x with slope m.

public int line_side(Point x, float m, Point p)


Encapsulation : public
Return type : int
Method name : line_side
Arguments :
x : point on a line with slope m.
m : slope of the line.
p : point to be checked.
This method will calculate on which side of the line passing through x with slope m the
point p lies, and return the calculated value.

public void points12(Point p, Point z, char d)


Encapsulation : public
Return type : void
Method name : points12
Arguments :
p, z – points on a line.
d – character indication a direction.

This method will find two points on both sides of p which will lie on the two strips
of p.

public void points34(Point p,char d)


Encapsulation : public
Return type : void
Method name : points34
Arguments :
p – last point in the region

This method will find the last two points of the strips.

public Point intersectln(Point p, Point pt)


Encapsulation : public
Return type : Point
Method name : intersectln
Arguments :
p, pt : two points on different lines.

This method will find the intersection of the two lines passing through p and pt with
slopes m1 and m2 respectively.

Class FaceRecognitionSystem.HausdorffDistance.HausdorffDistance

public class HausdorffDistance : Thinning.BinMatrix


{
private Image I;
private float[] dist;
private int P;
private int N;
private Image[] ImgArr;
int ReqImg;
string svr;
public HausdorffDistance(string svraddr)
public Image[] HausdorffDist(Image I1,int choice,int no)
public Image[] HausdorffDist(Image I1, int P1, int N1, int no)
public void distance(int choice)
public int[] sort()
public float max(float A,float B)
public float h(int[][] A,int[][] B,int choice)
}

I : Image passed for searching.


dist : array of float values, stores distance calculated with each image in database.
P : penalty
N : radius of neighborhood
ImgArr : array of images, stores best matches in descending order.
ReqImg : integer, number of requested images.
svr : string, stores server name/address.

public HausdorffDistance(string svraddr)


Encapsulation : public
Method type : constructor
Method name : HausdorffDistance
Arguments :
svraddr : string, used to pass server address.

This method will initialize server address that will be used in processing through
svr.

public Image[] HausdorffDist(Image I1,int choice,int no)


Encapsulation : public
Return type : Image[]
Method name : HausdorffDist
Arguments :
I1 : Image, passed for searching.
choice : integer, identified which algorithm to use for searching.
no : integer, number of best match images to return.

This method will find the distance between the image passed as argument and the images
in the database, and return the requested number of best-matching images as an array of
images.

public Image[] HausdorffDist(Image I1, int P1, int N1, int no)
Encapsulation : public
Return type : Image[]
Method name : HausdorffDist
Arguments :
I1 : Image, passed for searching.
P1 : integer, penalty value passed for calculation.
N1 : integer, radius of neighborhood.
no : integer, number of best match images to return.

This method will find the distance between the image passed as argument and the images
in the database, and return the requested number of best-matching images as an array of
images.

public void distance(int choice)


Encapsulation : public
Return type : void
Method name : distance
Arguments :
choice : integer, identifies the algorithm to use.

This method will calculate the distance of the input image to each image stored in the
database using the selected algorithm and store the results in the dist array.

public int[] sort()


Encapsulation : public
Return type : int[]
Method name : sort
Arguments : N/A

This method will sort the dist array and return sorted array of index.

public float max(float A,float B)


Encapsulation : public
Return type : float
Method name : max
Arguments :
A, B : two float values from which to find max value.

This method will return the maximum value from A and B.

public float h(int[][] A,int[][] B,int choice)


Encapsulation : public
Return type : float
Method name : h
Arguments :
A : 2-D array of integers, is an image-matrix to search.
B : 2-D array of integers, is an image-matrix retrieved from database to compare.
choice : integer, identifies the comparison algorithm.

This method will find distance from A to B.

Class FaceRecognitionSystem.HausdorffDistance.HD

public class HD
{
public float h(int[][] A,int[][] B)
public float max(float[] a)
public float[] min(int[][] A,int[][] B)
public float minimum(Point p,int[][] B)
}

public float h(int[][] A,int[][] B)


Encapsulation : public
Return type : float
Method name : h
Arguments :
A : 2-D array of integers, is an image-matrix to search.
B : 2-D array of integers, is an image-matrix retrieved from database to compare.
This method will find distance from A to B.

public float max(float[] a)


Encapsulation : public
Return type : float
Method name : max
Arguments :
a : an array of float values from which to find max value.

This method will return the maximum value from float array a.
public float[] min(int[][] A,int[][] B)
Encapsulation : public
Return type : float[]
Method name : min
Arguments :
A : image-matrix, which is to be searched.
B : image-matrix, retrieved from database.

This method will find, for each point in A, the minimum distance to the points in B and
return the calculated distances as an array of float values.

public float minimum(Point p,int[][] B)


Encapsulation : public
Return type : float
Method name : minimum
Arguments :
p : point in the image.
B : image-matrix, retrieved from database.

This method will find minimum distance from p to each point in B and returns
calculated distance as a float value.

Class FaceRecognitionSystem.HausdorffDistance.MHD

public class MHD : HD


{
public new float h(int[][] A,int[][] B)
public float avg(float[] m)
}

public new float h(int[][] A,int[][] B)


Encapsulation : public
Return type : float
Method name : h
Arguments :
A : 2-D array of integers, is an image-matrix to search.
B : 2-D array of integers, is an image-matrix retrieved from database to compare.
This method will find distance from A to B.

public float avg(float[] m)


Encapsulation : public
Return type : float
Method name : avg
Arguments :
m : an array of float values.

This method will return the average value of all elements of array m.

Class FaceRecognitionSystem.HausdorffDistance.M2HD

public class M2HD : MHD


{
int P;
int N;
public M2HD()
public M2HD(int P1,int N1)
public new float h(int[][] A,int[][] B)
public float[] ds(int[][] A,int[][] B)
public float d(Point p,int[][] B)
public float max(float a, float b)
public float min(Point p,int[][] B)
}

P : integer, penalty
N : integer, radius of neighborhood.

public M2HD(int P1,int N1)


Encapsulation : public
Method type : constructor
Method name : M2HD
Arguments :
P1 : integer, penalty passed for calculation.
N1 : integer, radius of neighborhood.

This method will initialize local parameters.

public new float h(int[][] A,int[][] B)


Encapsulation : public
Return type : float
Method name : h
Arguments :
A : 2-D array of integers, is an image-matrix to search.
B : 2-D array of integers, is an image-matrix retrieved from database to compare.

This method will find distance from A to B.


public float[] ds(int[][] A,int[][] B)


Encapsulation : public
Return type : float[]
Method name : ds
Arguments :
A : image-matrix to be searched.
B : image-matrix retrieved from the database.

This method will return the distances calculated for all points in A against the
points in B, taking the penalty into account.

public float d(Point p,int[][] B)


Encapsulation : public
Return type : float
Method name : d
Arguments :
p : point in the image.
B : image-matrix retrieved from the database.

This method will find the distance from point p to the points in B and return the
calculated distance, taking the penalty into account.

public float max(float a, float b)


Encapsulation : public
Return type : float
Method name : max
Arguments :
a, b : two float values from which to find the maximum.

This method will return the maximum of a and b.

public float min(Point p,int[][] B)


Encapsulation : public
Return type : float
Method name : min
Arguments :
p : point in the image.
B : image-matrix retrieved from the database.

This method will find the minimum distance from p to the points in B and return it
as a float value.
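
Finally, a hedged sketch of the doubly modified distance, building on the earlier sketches. The exact penalty rule is not spelled out in the method descriptions above, so the sketch assumes a common formulation: each edge point of A searches for a match only inside a square neighborhood of radius N, the fixed penalty P is charged when no edge point of B lies inside that neighborhood, and the per-point values are averaged as in MHD. The class name M2HDSketch is illustrative.

using System;
using System.Drawing;

class M2HDSketch : MHDSketch
{
    private readonly int P;   // penalty charged when no neighbor is found (assumed rule)
    private readonly int N;   // radius of the square search neighborhood

    public M2HDSketch(int P1, int N1) { P = P1; N = N1; }

    public new float h(int[][] A, int[][] B)
    {
        float sum = 0f;
        int count = 0;
        for (int y = 0; y < A.Length; y++)
            for (int x = 0; x < A[y].Length; x++)
                if (A[y][x] != 0)                    // edge point of A (assumed encoding)
                {
                    sum += d(new Point(x, y), B);    // per-point distance with penalty
                    count++;
                }
        return count > 0 ? sum / count : 0f;         // averaged, as in MHD
    }

    // Nearest edge point of B inside the neighborhood of p, or the penalty P
    // when the neighborhood contains no edge point.
    public float d(Point p, int[][] B)
    {
        float best = float.MaxValue;
        for (int y = Math.Max(0, p.Y - N); y <= Math.Min(B.Length - 1, p.Y + N); y++)
            for (int x = Math.Max(0, p.X - N); x <= Math.Min(B[y].Length - 1, p.X + N); x++)
                if (B[y][x] != 0)
                {
                    float dx = p.X - x, dy = p.Y - y;
                    best = Math.Min(best, (float)Math.Sqrt(dx * dx + dy * dy));
                }
        return best == float.MaxValue ? P : best;   // apply the penalty if nothing was found
    }
}
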
CONCLUSION
REFERENCES

[1] Yongsheng Gao and Maylor K.H. Leung, “Face Recognition Using Line Edge Map,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 6, June
2002.

[2] Surendra Gupta and Krupesh Parmar, “A Combined approach of Serial and Parallel
Thinning Algorithm for Binary Face Image,” Computing-2005, Division IV, CSI
Conference, May 2005.

[3] Y. Gao, “Efficiently comparing face images using a modified Hausdorff distance,”
IEE Proc.-Vis. Image Signal Process., vol. 150, no. 6, December 2003.

[4] M.K.H. Leung and Y.H. Yang, “Dynamic Two-Strip Algorithm in Curve Fitting,”
Pattern Recognition, vol. 23, pp. 69-79, 1990.

[5] David S. Bolme, “Elastic Bunch Graph Matching,” Colorado State University, Fort
Collins, Colorado, Summer 2003.

[6] Laurenz Wiskott, “The Role of Topographical Constraints in Face Recognition,”
Pattern Recognition Letters, vol. 20, no. 1, pp. 89-96, 1999.

[7] Daniel L. Swets and John (Juyang) Weng, “Using Discriminant Eigenfeatures for Image
Retrieval,” IEEE Trans. Pattern Anal. Machine Intell., vol. 18, August 1996.

[8] C. Kotropoulos and I. Pitas, “Rule-Based Face Detection in Frontal Views,” Proc.
IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP-97), vol. 4, pp.
2537-2540, Apr. 1997.

[9] P.T. Jackway and M. Deriche, “Scale-space properties of the multiscale
morphological dilation-erosion,” IEEE Trans. Pattern Anal. Machine Intell., vol. 18,
pp. 38-51, Jan. 1996.

[10]L. Sirovich and M. Kirby, “Low-Dimensional Procedure for the Characterisation of
Human Faces,” J. Optical Soc. of Am., vol. 4, pp. 519-524, 1987.

[11]M. Kirby and L. Sirovich, “Application of the Karhunen-Loève Procedure for the
Characterisation of Human Faces,” IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 12, pp. 831-835, Dec. 1990.

[12]M. Turk and A. Pentland, “Eigenfaces for Recognition,” J. Cognitive Neuroscience,
vol. 3, pp. 71-86, 1991.

[13]M.A. Grudin, “A Compact Multi-Level Model for the Recognition of Facial Images,”
PhD thesis, Liverpool John Moores Univ., 1997.

[14]L. Zhao and Y.H. Yang, “Theoretical Analysis of Illumination in PCA-Based Vision
Systems,” Pattern Recognition, vol. 32, pp. 547-564, 1999.

[15]A. Pentland, B. Moghaddam, and T. Starner, “View-Based and Modular Eigenspaces
for Face Recognition,” Proc. IEEE CS Conf. Computer Vision and Pattern
Recognition, pp. 84-91, 1994.

[16]T.J. Stonham, “Practical Face Recognition and Verification with WISARD,” Aspects
of Face Processing, pp. 426-441, 1984.

[17]K.K. Sung and T. Poggio, “Learning Human Face Detection in Cluttered Scenes,”
Computer Analysis of Image and Patterns, pp. 432-439, 1995.

[18]S. Lawrence, C.L. Giles, A.C. Tsoi, and A.D. Back, “Face Recognition: A
Convolutional Neural-Network Approach,” IEEE Trans. Neural Networks, vol. 8, pp.
98-113, 1997.

[19]J. Weng, J.S. Huang, and N. Ahuja, “Learning Recognition and Segmentation of 3D
objects from 2D images,” Proc. IEEE Int'l Conf. Computer Vision, pp. 121-128, 1993.

[20]S.H. Lin, S.Y. Kung, and L.J. Lin, “Face Recognition/Detection by Probabilistic
Decision-Based Neural Network,” IEEE Trans. Neural Networks, vol. 8, pp. 114-132,
1997.

[21]S.Y. Kung and J.S. Taur, “Decision-Based Neural Networks with Signal/Image
Classification Applications,” IEEE Trans. Neural Networks, vol. 6, pp. 170-181,
1995.

[22]F. Samaria and F. Fallside, “Face Identification and Feature Extraction Using Hidden
Markov Models,” Image Processing: Theory and Application, G. Vernazza, ed.,
Elsevier, 1993.

[23]F. Samaria and A.C. Harter, “Parameterisation of a Stochastic Model for Human Face
Identification,” Proc. Second IEEE Workshop Applications of Computer Vision, 1994.

[24]S. Tamura, H. Kawa, and H. Mitsumoto, “Male/Female Identification from 8×6 Very
Low Resolution Face Images by Neural Network,” Pattern Recognition, vol. 29, pp.
331-335, 1996.

[25]Y. Kaya and K. Kobayashi, “A Basic Study on Human Face Recognition,” Frontiers
of Pattern Recognition, S. Watanabe, ed., p. 265, 1972.

[26]T. Kanade, “Picture Processing by Computer Complex and Recognition of Human
Faces,” technical report, Dept. Information Science, Kyoto Univ., 1973.

[27]A.J. Goldstein, L.D. Harmon, and A.B. Lesk, “Identification of Human Faces,” Proc.
IEEE, vol. 59, p. 748, 1971.

[28]R. Brunelli and T. Poggio, “Face Recognition: Features versus Templates,” IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 15, pp. 1042-1052, 1993.

[29]I.J. Cox, J. Ghosn, and P.N. Yianilos, “Feature-Based Face Recognition Using
Mixture-Distance,” Computer Vision and Pattern Recognition, 1996.

[30]B.S. Manjunath, R. Chellappa, and C. von der Malsburg, “A Feature Based Approach
to Face Recognition,” Proc. IEEE CS Conf. Computer Vision and Pattern
Recognition, pp. 373-378, 1992.

[31]B. Takács, “Comparing Face Images Using the Modified Hausdorff Distance,”
Pattern Recognition, vol. 31, pp. 1873-1881, 1998.

[32]C. Kotropoulos, A. Tefas, and I. Pitas, “Frontal Face Authentication Using
Morphological Elastic Graph Matching,” IEEE Trans. Image Processing, vol. 4, no. 9,
pp. 555-560, Apr. 2000.

[33]P.J.M. van Laarhoven and E.H.L. Aarts, Simulated Annealing: Theory and
Applications. Kluwer Academic Publishers, 1987.

[34]R.H.J.M. Otten and L.P.P.P. van Ginneken, The Annealing Algorithm. Kluwer
Academic Publishers, 1989.

[35]Olivier de Vel and Stefan Aeberhard, “Line-Based Face Recognition under Varying
Pose,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no.
10, October 1999.

[36]H.I. Kim, S.H. Lee, and N.I. Cho, “Rotation-invariant face detection using angular
projections,” Electronics Letters, vol. 40, no. 12, June 2004.

[37]Frank Y. Shih and Wai-Tak Wong, “A New Safe-Point Thinning Algorithm Based on
the Mid-Crack Code Tracing,” IEEE Transactions on Systems, Man, and Cybernetics,
vol. 25, no. 2, pp. 370-377, Feb. 1995.

[38]N.H. Han, C.W. La, and P.K. Rhee, “An Efficient Fully Parallel Thinning
Algorithm,” IEEE, 1997.

[39]Edna Lucia Flores, “A Fast Thinning Algorithm,” IEEE, 1998.

[40]M.K. Leung and Y. Yang, “A region based approach for human body motion analysis,”
Pattern Recognition, vol. 20, pp. 321-339, 1987.

[41]D.P. Huttenlocher, G.A. Klanderman, and W.J. Rucklidge, “Comparing Images Using
the Hausdorff Distance,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol.
15, no. 9, pp. 850-863, 1993.
