INDUSTRIAL TRAINING REPORT

on
Face And Eye Detection Using Machine Learning
Submitted as a part of course curriculum for

Bachelor of Technology
in

Computer Science and Engineering

Submitted by

Name of the Student: APOORVA GUPTA

University Roll Number: 1502910027

Department of Computer Science and Engineering

KIET GROUP OF INSTITUTIONS, GHAZIABAD

DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, LUCKNOW

2018-19
Contents

1. Introduction (Organization and Project)
2. Synopsis
3. Overview of E-Shopping
4. Number of Modules
5. Hardware Requirements
6. System Analysis
 Introduction
 System Study
 Needs of the System
 System Planning
 Preliminary Investigation
 Information Gathering
 Structured Analysis
 Feasibility Study
7. Software Requirement Specification: Requirement Analysis and Planning Step
   1. Gantt Chart
   2. System Features
   3. Non-Functional Requirements
   4. Safety Requirements
   5. Security Requirements
8. Software Engineering Paradigm Applied
   1. Spiral Model
9. Design
 Context Level DFD / ER Diagram
   1. 0-Level
   2. Level-1
 Component Diagram
 Deployment Diagram
INTRODUCTION
Machine learning is the idea of learning from examples and experience without being explicitly programmed. Instead of writing code, you feed data to a generic algorithm, and it builds logic based on the data it is given.
For example, one kind of algorithm is a classification algorithm, which can put data into different groups. The same kind of classification algorithm used for face detection can also be used to classify emails as spam or not-spam.
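As a small illustration of this idea (not part of the original project code), a spam/not-spam classifier can be sketched in a few lines of scikit-learn; the tiny training set below is invented purely for the example:

# Minimal sketch: classifying emails as spam / not-spam with scikit-learn.
# The training data below is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting at 10 am tomorrow",
          "free money claim now", "project report attached"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)   # bag-of-words features
clf = MultinomialNB().fit(X, labels)   # learn from the examples

print(clf.predict(vectorizer.transform(["claim your free prize"])))  # likely [1]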

Implementation

The implementation process can be split into two main stages:

1. The classifier training stage
2. The application development stage

During the first stage, the classifier was trained on the preprocessed training data. This was done in a Jupyter notebook (titled "Create and freeze graph"), and can be further divided into the following steps:

1. Load both the training and validation images into memory, preprocessing them as described in the previous section.
2. Implement helper functions: a. get_batch(...).
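The notebook itself is not reproduced in this report, so the helper is only named. A plausible minimal version of get_batch, assuming the images and labels are held in parallel NumPy arrays, might look like this:

import numpy as np

def get_batch(images, labels, batch_size, rng=np.random):
    # Hypothetical helper: sample a random mini-batch for one training step.
    # Assumes images and labels are parallel NumPy arrays of equal length.
    idx = rng.choice(len(images), size=batch_size, replace=False)
    return images[idx], labels[idx]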

Learning algorithms can be used in several fields, from software engineering to investment banking. Learning algorithms can recognize patterns, which can help detect cancer, for example, or we can construct algorithms that make very good guesses about stock price movements in the market.

Some machine learning algorithms address regression and classification problems, using Linear Regression, Logistic Regression, the Naive Bayes classifier, the kNN algorithm, Support Vector Machines (SVMs) and Decision Trees.

Machine learning is a type of technology that aims to learn from experience. For example, as a
human, you can learn how to play chess simply by observing other people playing chess.
In the same way, computers are programmed by providing them with data from which
they learn and are then able to predict future elements or conditions.
Let's say, for instance, that you want to write a program that can tell whether a certain type of
fruit is an orange or a lemon. You might find it easy to write such a program and it will
give the required results, but you might also find that the program doesn't work effectively
for large datasets. This is where machine learning comes into play.
There are various steps involved in machine learning:

1. Collection of data
2. Filtering of data
3. Analysis of data
4. Algorithm training
5. Testing of the algorithm
6. Using the algorithm for future predictions
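As a toy sketch of steps 4-6 for the orange-vs-lemon example above (the two features and the data points are invented for illustration, with collection and filtering assumed done):

# Toy sketch: training, testing and predicting with scikit-learn.
# Features: (weight in grams, skin texture score) - invented for illustration.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X = [[140, 0.2], [130, 0.3], [150, 0.25],   # oranges
     [80, 0.7], [90, 0.8], [85, 0.75]]      # lemons
y = ["orange", "orange", "orange", "lemon", "lemon", "lemon"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=0)
clf = DecisionTreeClassifier().fit(X_train, y_train)   # algorithm training
print(clf.score(X_test, y_test))                       # testing
print(clf.predict([[135, 0.22]]))                      # future prediction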

ORGANIZATION
Machine learning uses different kinds of algorithms to find patterns, and these
algorithms are classified into two groups:

 supervised learning
 unsupervised learning

Supervised Learning
Supervised learning is the science of training a computer to recognize elements
by giving it sample data. The computer then learns from it and is able to predict
future datasets based on the learned data.

For example, you can train a computer to filter out spam messages based on
past information.

Supervised learning has been used in many applications; Facebook, for example, uses it to search images based on a description. You can now search images on Facebook with words that describe the contents of the photo. Since the social networking site already has a database of captioned images, it is able to match the description to features from photos with some degree of accuracy.

There are only two steps involved in supervised learning:

 training
 testing

Some of the supervised learning algorithms include:

 decision trees
 support vector machines
 naive Bayes
 k-nearest neighbor
 linear regression
Unsupervised Learning

Unsupervised learning is when you train your machine with only a set of inputs. The machine is then able to find relationships between the input data and any other data you might want to predict. Unlike supervised learning, where you present a machine with labelled data to train on, unsupervised learning makes the computer find patterns or relationships between different datasets on its own.

Unsupervised learning can be further subdivided into:

 clustering
 association

Clustering: Clustering means grouping data by inherent similarity. For example, you can cluster the shopping habits of consumers and use the groups for advertising, targeting consumers based on their purchases and shopping habits (a k-means sketch of this idea follows the algorithm list below).

Association: Association is where you identify rules that describe large sets of your data. This type of learning can be applied to associating books by author or category, whether motivational, fictional, or educational.

Some of the popular unsupervised learning algorithms include:

 k-means clustering
 hierarchical clustering
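As a hedged sketch of the clustering idea above, k-means can group shoppers by two invented features, visits per month and average basket value:

# Sketch: k-means grouping of shoppers; the data points are invented.
import numpy as np
from sklearn.cluster import KMeans

shoppers = np.array([[2, 15], [3, 18], [2, 20],      # occasional, small baskets
                     [12, 60], [14, 55], [13, 65]])  # frequent, large baskets
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(shoppers)
print(kmeans.labels_)           # cluster assignment per shopper
print(kmeans.cluster_centers_)  # centre of each group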

Unsupervised learning will be an important technology in the near future, because there is a great deal of unfiltered data which has not yet been digitized.
PROJECT

FACE DETECTION

The problem of face recognition is all about face detection. This is a fact that seems quite
bizarre to new researchers in this area. However, before face recognition is possible, one
must be able to reliably find a face and its landmarks. This is essentially a segmentation
problem and in practical systems, most of the effort goes into solving this task. In fact the
actual recognition based on features extracted from these facial landmarks is only a minor
last step.

There are two types of face detection problems:


1) Face detection in images and
2) Real-time face detection

FACE DETECTION IN IMAGES

Figure 1.1: A successful face detection in an image with a frontal view of a human face.

Most face detection systems attempt to extract a fraction of the whole face, thereby eliminating most of the background and other areas of an individual's head, such as hair, that are not necessary for the face recognition task. With static images, this is often done by running a window across the image. The face detection system then judges whether a face is present inside the window (Brunelli and Poggio, 1993). Unfortunately, with static images there is a very large search space of possible locations of a face in an image.

Most face detection systems use an example-based learning approach to decide whether or not a face is present in the window at a given instant (Sung and Poggio, 1994; Sung, 1995). A neural network or some other classifier is trained using supervised learning with 'face' and 'non-face' examples, thereby enabling it to classify an image (the window in a face detection system) as 'face' or 'non-face'. Unfortunately, while it is relatively easy to find face examples, how would one find a representative sample of images which represent non-faces (Rowley et al., 1996)? Therefore, face detection systems using example-based learning need thousands of 'face' and 'non-face' images for effective training. Rowley, Baluja, and Kanade (Rowley et al., 1996) used 1025 face images and 8000 non-face images (generated from 146,212,178 sub-images) for their training set.

1.2 REAL-TIME FACE DETECTION


Real-time face detection involves detecting a face in a series of frames from a video capture device. While the hardware requirements for such a system are far more stringent, from a computer vision standpoint real-time face detection is actually a far simpler process than detecting a face in a static image. This is because, unlike most of our surrounding environment, people are continually moving: we walk around, blink and fidget.
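A minimal sketch of such a real-time detector, assuming OpenCV's standard pre-trained frontal-face cascade (described later in this report) and the default camera:

# Sketch: real-time face detection from a webcam with OpenCV.
# Assumes the standard haarcascade file shipped with opencv-python.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)              # default camera

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()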

1.3 FACE DETECTION PROCESS


Face detection is the process of identifying different parts of human faces, such as eyes, nose and mouth; this process can be achieved using MATLAB code. In this project the author will attempt to detect faces in still images by using image invariants. To do this it is useful to study the greyscale intensity distribution of an average human face. The following 'average human face' was constructed from a sample of 30 frontal-view human faces, of which 12 were from females and 18 from males. A suitably scaled colormap has been used to highlight greyscale intensity differences.

Face detection
All faces in the face database are transformed into face space. Face recognition is then achieved by transforming any given test image into face space and comparing it with the training set vectors. The closest matching training set vector should belong to the same individual as the test image.

A face recognition and detection system is a pattern recognition approach for personal identification, alongside other biometric approaches such as fingerprint, signature and retina recognition. The face is the most common biometric used by humans; applications range from static, mug-shot verification to identification in a cluttered background.
Analysis
We propose open source software to efficiently detect and extract faces from an image using OpenCV, the most popular library for computer vision. Originally written in C and C++, it now provides bindings for Python.

Analysis takes place in the following manner:

For face detection, the algorithm starts at the top left of a picture and moves down across small blocks of data, looking at each block and constantly asking, "Is this a face? ... Is this a face?" Since there are 6,000 or more tests per block, you might have millions of calculations to do, which will grind your computer to a halt. To get around this, OpenCV uses cascades.

Like a series of waterfalls, the OpenCV cascade breaks the problem of detecting faces into multiple stages. For each block, it does a very rough and quick test. If that passes, it does a slightly more detailed test, and so on. The algorithm may have 30-50 of these stages, or cascades, and it will only detect a face if all stages pass. The advantage is that the majority of picture regions will return negative during the first few stages, which means the algorithm won't waste time testing all 6,000 features on them. Instead of taking hours, face detection can now be done in real time.

Though the theory may sound complicated, in practice it is quite easy. The cascades themselves are just a bunch of XML files that contain the OpenCV data used to detect objects. You initialize your code with the cascade you want, and then it does the work.

DESCRIPTION OF TOOLS USED

Machine learning tools make applied machine learning faster and easier:

 Faster: Good tools can automate each step in the applied machine learning process. This means that the time from ideas to results is greatly shortened. The alternative is implementing each capability yourself, from scratch, which can take significantly longer than choosing an off-the-shelf tool.
 Easier: You can spend your time choosing good tools instead of researching and implementing techniques yourself. The alternative is having to be an expert in every step of the process, which requires research, deeper study to understand the techniques, and a higher level of engineering to ensure an efficient implementation.

Machine learning tools are not just implementations of machine learning algorithms. They can be, but they can also provide capabilities that you can use at any step in the process of working through a machine learning problem.

PLATFORM
A machine learning platform provides capabilities to complete a machine learning project from beginning to end: data analysis, data preparation, modeling, and algorithm evaluation and selection.

The platform used in this project is Anaconda, a free and open-source distribution of the Python and R programming languages for data science and machine learning applications (large-scale data processing, predictive analytics, scientific computing) that aims to simplify package management and deployment. Package versions are managed by the package management system conda. The Anaconda distribution is used by over 6 million users, and it includes more than 250 popular data science packages suitable for Windows, Linux, and macOS.

scikit-learn is a Python module for machine learning. scikit-learn offers a number of simple and efficient tools for data mining and data analysis. The basic motivation behind scikit-learn is "For Science!", and as such it is highly accessible and reusable across various contexts. It also builds on well-known data science tools such as NumPy, SciPy, and matplotlib.

LIBRARY
A machine learning library provides capabilities for completing part of a machine learning project. For example, a library may provide a collection of modeling algorithms.

Features of machine learning libraries are:

 They provide a specific capability for one or more steps in a machine learning project.
 The interface is typically an application programming interface requiring
programming.
 They are tailored for a specific use case, problem type or environment.

The library used in this project is the OpenCV library.

Methodology

Object detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted Cascade of Simple Features". It is a machine learning based approach where a cascade function is trained from a large number of positive and negative images. It is then used to detect objects in other images.

Here we will work with face detection. Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them. For this, the Haar features shown in the image below are used. They are just like our convolutional kernel: each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
Now, all possible sizes and locations of each kernel are used to calculate lots of features. (Just imagine how much computation this needs: even a 24x24 window results in over 160,000 features.) For each feature calculation, we need to find the sum of the pixels under the white and black rectangles. To solve this, they introduced the integral image. However large your image, it reduces the calculation of any rectangle sum to an operation involving just four values. It makes things super-fast.
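A short sketch of the integral-image trick (NumPy only; the window size and feature layout are chosen arbitrarily for illustration): any rectangle sum needs just four lookups, and a two-rectangle Haar feature is then one sum minus the other.

# Sketch: rectangle sums from an integral image in four lookups.
import numpy as np

img = np.random.randint(0, 256, (24, 24)).astype(np.int64)   # stand-in window
ii = np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))       # integral image

def rect_sum(ii, x, y, w, h):
    # Sum of pixels in the w-by-h rectangle with top-left corner (x, y).
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# Example two-rectangle Haar feature: difference of two adjacent strips.
feature = rect_sum(ii, 0, 0, 12, 24) - rect_sum(ii, 12, 0, 12, 24)
print(feature)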

But among all these features we calculate, most are irrelevant. For example, consider the image below. The top row shows two good features. The first feature selected seems to focus on the property that the region of the eyes is often darker than the region of the nose and cheeks. The second feature selected relies on the property that the eyes are darker than the bridge of the nose. But the same windows applied to the cheeks or any other place are irrelevant. So how do we select the best features out of 160,000+? This is achieved by AdaBoost.
For this, we apply each and every feature to all the training images. For each feature, we find the best threshold which classifies the faces as positive or negative. Obviously, there will be errors or misclassifications. We select the features with the minimum error rate, which means they are the features that most accurately classify the face and non-face images. (The process is not as simple as this. Each image is given an equal weight in the beginning. After each classification, the weights of misclassified images are increased. Then the same process is repeated: new error rates are calculated, along with new weights. The process continues until the required accuracy or error rate is achieved, or the required number of features is found.)

The final classifier is a weighted sum of these weak classifiers. Each is called weak because it alone cannot classify the image, but together they form a strong classifier. The paper says even 200 features provide detection with 95% accuracy. Their final setup had around 6,000 features. (Imagine a reduction from 160,000+ features to 6,000 features; that is a big gain.)
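As a hedged illustration of this boosting idea (not the Viola-Jones implementation itself), scikit-learn's AdaBoostClassifier combines many depth-1 decision stumps into exactly such a weighted vote; the feature matrix below is random stand-in data:

# Sketch: AdaBoost combining weak "stump" classifiers into a strong one.
# The random matrix stands in for Haar feature values; data is invented.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.RandomState(0)
X = rng.rand(200, 50)                     # 200 windows x 50 "feature" values
y = (X[:, 3] + X[:, 7] > 1).astype(int)   # 1 = face, 0 = non-face (synthetic rule)

# The default weak learner is a depth-1 decision stump (one feature, one threshold).
model = AdaBoostClassifier(n_estimators=200).fit(X, y)
print(model.score(X, y))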

So now you take an image, take each 24x24 window, apply 6,000 features to it, and check whether it is a face or not. Isn't that a little inefficient and time-consuming? Yes, it is, and the authors have a good solution for that.

In an image, most of the area is non-face region. So it is a better idea to have a simple method to check whether a window is a face region; if it is not, discard it in a single shot and don't process it again. Instead, focus on regions where there may be a face. This way, we spend more time checking possible face regions.

For this they introduced the concept of a Cascade of Classifiers. Instead of applying all 6,000 features to a window, the features are grouped into different stages of classifiers and applied one by one. (Normally the first few stages contain far fewer features.) If a window fails the first stage, discard it; we don't consider the remaining features on it. If it passes, apply the second stage of features and continue the process. A window which passes all stages is a face region.

The authors' detector had 6,000+ features in 38 stages, with 1, 10, 25, 25 and 50 features in the first five stages. (The two features in the image above were actually obtained as the best two features from AdaBoost.) According to the authors, on average 10 features out of the 6,000+ are evaluated per sub-window.
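In outline, the cascade's early-rejection logic amounts to a few lines of Python; stage_passes here is a hypothetical stand-in for evaluating one boosted stage on a sub-window:

def stage_passes(stage, window):
    # Hypothetical stand-in: a real stage sums its weak classifiers' weighted
    # votes and compares the total against the stage threshold.
    return sum(w * clf(window) for clf, w in stage["weak"]) >= stage["threshold"]

def is_face(window, stages):
    for stage in stages:
        if not stage_passes(stage, window):
            return False      # rejected early; remaining stages never run
    return True               # a window that passes every stage is a face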

Haar-cascade Detection in OpenCV

OpenCV comes with a trainer as well as a detector. If you want to train your own classifier for any object, such as cars or planes, you can use OpenCV to create one. Full details are given in the Cascade Classifier Training documentation.

Here we will deal with detection. OpenCV already contains many pre-trained classifiers for faces, eyes, smiles, etc. Those XML files are stored in the opencv/data/haarcascades/ folder. Let's create a face and eye detector with OpenCV.

Preparation of the training data

For training a boosted cascade of weak classifiers we need a set of positive samples (containing the actual objects you want to detect) and a set of negative images (containing everything you do not want to detect). The set of negative samples must be prepared manually, whereas the set of positive samples is created using the opencv_createsamples application.

Negative Samples

Negative samples are taken from arbitrary images not containing the objects you want to detect. These negative images, from which the samples are generated, should be listed in a special negative image file containing one image path per line (absolute or relative). Note that negative samples and sample images are also called background samples or background images; the terms are used interchangeably in this document.

The described images may be of different sizes. However, each image should be equal to or larger than the desired training window size (which corresponds to the model dimensions, most often the average size of your object), because these images are used to subsample a given negative image into several image samples of the training window size.

An example of such a negative description file:

Directory structure:
/img

img1.jpg

img2.jpg

bg.txt

File bg.txt:

img/img1.jpg

img/img2.jpg

Your set of negative window samples will be used to tell the machine learning step, boosting in this case, what not to look for when trying to find your objects of interest.
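Such a description file can be written with a few lines of Python; the img/ folder name simply follows the example above:

# Sketch: writing bg.txt with one relative image path per line.
import os

with open("bg.txt", "w") as f:
    for name in sorted(os.listdir("img")):
        if name.lower().endswith((".jpg", ".png")):
            f.write(os.path.join("img", name) + "\n")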

Positive Samples

Positive samples are created by the opencv_createsamples application. They are used by the boosting process to define what the model should actually look for when trying to find your objects of interest. The application supports two ways of generating a positive sample dataset:

1. You can generate a bunch of positives from a single positive object image.
2. You can supply all the positives yourself and only use the tool to cut them out, resize them and put them into the binary format needed by OpenCV.

While the first approach works decently for fixed objects, like very rigid logos, it tends to fail rather quickly for less rigid objects. In that case we suggest using the second approach. Many tutorials on the web even state that 100 real object images can lead to a better model than 1,000 artificially generated positives produced by the opencv_createsamples application.

Using OpenCV's integrated annotation tool

Since OpenCV 3.x the community has been supplying and maintaining an open source annotation tool, used for generating the -info file. The tool can be accessed by the command opencv_annotation if the OpenCV applications were built.

Using the tool is quite straightforward. The tool accepts several required and some optional parameters:

 --annotations (required): path to the annotations txt file where you want to store your annotations, which is then passed to the -info parameter [example - /data/annotations.txt]
 --images (required): path to the folder containing the images with your objects [example - /data/testimages/]
 --maxWindowHeight (optional): if the input image is taller than the resolution given here, resize the image for easier annotation, using --resizeFactor.
 --resizeFactor (optional): factor used to resize the input image when using the --maxWindowHeight parameter.

Note that the optional parameters can only be used together. An example command can be seen below:

opencv_annotation --annotations=/path/to/annotations/file.txt
--images=/path/to/image/folder/

This command will fire up a window containing the first image and your mouse cursor, which is used for annotation. A video on how to use the annotation tool can be found here. Basically, several keystrokes trigger actions. The left mouse button is used to select the first corner of your object; drawing then continues until you are satisfied, and stops when a second left mouse button click is registered. After each selection you have the following choices:

 Pressing c: confirm the annotation, turning the annotation green and confirming it is stored
 Pressing d: delete the last annotation from the list of annotations (easy for removing wrong annotations)
 Pressing n: continue to the next image
 Pressing ESC: exit the annotation software

Finally you will end up with a usable annotation file that can be passed to
the -info argument of opencv_createsamples.
Cascade Training

The next step is the actual training of the boosted cascade of weak
classifiers, based on the positive and negative dataset that was prepared
beforehand.

The command line arguments of the opencv_traincascade application, grouped by purpose:

 Common arguments:
o -data <cascade_dir_name>: Where the trained classifier should be stored. This folder should be created manually beforehand.
o -vec <vec_file_name>: vec-file with positive samples (created by the opencv_createsamples utility).
o -bg <background_file_name>: Background description file. This is the file containing the negative sample images.
o -numPos <number_of_positive_samples>: Number of positive samples used in training for every classifier stage.
o -numNeg <number_of_negative_samples>: Number of negative samples used in training for every classifier stage.
o -numStages <number_of_stages>: Number of cascade stages to be trained.
o -precalcValBufSize <precalculated_vals_buffer_size_in_Mb>: Size of the buffer for precalculated feature values (in MB). The more memory you assign, the faster the training process; however, keep in mind that -precalcValBufSize and -precalcIdxBufSize combined should not exceed your available system memory.
o -precalcIdxBufSize <precalculated_idxs_buffer_size_in_Mb>: Size of the buffer for precalculated feature indices (in MB). The same memory caveat as above applies.
o -baseFormatSave: This argument applies only to Haar-like features. If it is specified, the cascade will be saved in the old format. This is available only for backwards compatibility, to allow users stuck with the old deprecated interface to at least train models using the newer interface.
o -numThreads <max_number_of_threads>: Maximum number of threads to use during training. Note that the actual number of threads used may be lower, depending on your machine and compilation options. By default, the maximum available threads are selected if you built OpenCV with TBB support, which is needed for this optimization.
o -acceptanceRatioBreakValue <break_value>: This argument is used to determine how precisely your model should keep learning and when to stop. A good guideline is to train no further than 10e-5, to ensure the model does not overtrain on your training data. By default this value is set to -1 to disable this feature.

 Cascade parameters:
o -stageType <BOOST(default)>: Type of stages. Only boosted classifiers are supported as a stage type at the moment.
o -featureType <{HAAR(default), LBP}>: Type of features: HAAR - Haar-like features, LBP - local binary patterns.
o -w <sampleWidth>: Width of training samples (in pixels). Must have exactly the same value as used during training sample creation (opencv_createsamples utility).
o -h <sampleHeight>: Height of training samples (in pixels). Must have exactly the same value as used during training sample creation (opencv_createsamples utility).

 Boosted classifier parameters:
o -bt <{DAB, RAB, LB, GAB(default)}>: Type of boosted classifiers: DAB - Discrete AdaBoost, RAB - Real AdaBoost, LB - LogitBoost, GAB - Gentle AdaBoost.
o -minHitRate <min_hit_rate>: Minimal desired hit rate for each stage of the classifier. The overall hit rate may be estimated as (min_hit_rate ^ number_of_stages), [191] §4.1.
o -maxFalseAlarmRate <max_false_alarm_rate>: Maximal desired false alarm rate for each stage of the classifier. The overall false alarm rate may be estimated as (max_false_alarm_rate ^ number_of_stages), [191] §4.1.
o -weightTrimRate <weight_trim_rate>: Specifies whether trimming should be used and its weight. A decent choice is 0.95.
o -maxDepth <max_depth_of_weak_tree>: Maximal depth of a weak tree. A decent choice is 1, the case of stumps.
o -maxWeakCount <max_weak_tree_count>: Maximal count of weak trees for every cascade stage. The boosted classifier (stage) will have as many weak trees (<= maxWeakCount) as needed to achieve the given -maxFalseAlarmRate.

 Haar-like feature parameters:
o -mode <BASIC (default) | CORE | ALL>: Selects the type of Haar feature set used in training. BASIC uses only upright features, while ALL uses the full set of upright and 45-degree rotated features. See [110] for more details.

 Local Binary Patterns parameters: Local Binary Patterns don't have parameters.
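Pulling a few of these arguments together, an example invocation might look like the following (the paths, sample counts and window size are placeholders, not values from this project):

opencv_traincascade -data cascade_dir/ -vec faces.vec -bg bg.txt -numPos 1000 -numNeg 600 -numStages 20 -w 24 -h 24 -featureType HAAR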

After the opencv_traincascade application has finished its work, the trained cascade will be saved as cascade.xml in the -data folder. The other files in this folder are created to allow resuming an interrupted training run, so you may delete them after training completes.

Training is finished and you can test your cascade classifier!

Visualising Cascade Classifiers

From time to time it can be useful to visualise the trained cascade, to see which features it selected and how complex its stages are. For this, OpenCV supplies an opencv_visualisation application. This application has the following commands:

 --image (required): path to a reference image for your object model. This should be an annotation with dimensions [-w,-h] as passed to both the opencv_createsamples and opencv_traincascade applications.
 --model (required): path to the trained model, which should be in the folder supplied to the -data parameter of the opencv_traincascade application.
 --data (optional): if a data folder is supplied (which has to be manually created beforehand), stage output and a video of the features will be stored there.

An example command can be seen below:

opencv_visualisation --image=/data/object.png --model=/data/model.xml --data=/data/result/

Some limitations of the current visualisation tool:

 It only handles cascade classifier models trained with the opencv_traincascade tool, containing stumps as decision trees [default settings].
 The image provided needs to be a sample window with the original model dimensions, passed to the --image parameter.

Example of the HAAR/LBP face model run on a given window of Angelina Jolie, which had the same preprocessing as the cascade classifier files: a 24x24 pixel image, grayscale conversion and histogram equalisation.

A video is made visualising each feature of each stage, and each stage is stored as an image for future validation of the features.
PROJECT CODE

As noted earlier, OpenCV's pre-trained classifiers for faces and eyes (the XML files in the opencv/data/haarcascades/ folder) are used here to build a face and eye detector.

First we need to load the required XML classifiers, and then load our input image (or video) in grayscale mode.

Then we find the faces in the image. If faces are found, the detector returns the positions of the detected faces as Rect(x, y, w, h). Once we get these locations, we can create a region of interest (ROI) for the face and apply eye detection on this ROI (since eyes are always on a face).
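The code listing itself did not survive in this copy of the report, so the sketch below reconstructs the steps just described, following the standard OpenCV Haar-cascade tutorial pattern; the input file name is a placeholder:

# Sketch of the described steps: load cascades, detect faces, detect eyes in
# each face ROI, and draw rectangles. "input.jpg" is a placeholder name.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

img = cv2.imread("input.jpg")                        # input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)         # detection runs on grayscale

faces = face_cascade.detectMultiScale(gray, 1.3, 5)  # list of Rect(x, y, w, h)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    roi_gray = gray[y:y + h, x:x + w]                # search eyes inside the face ROI
    roi_color = img[y:y + h, x:x + w]
    for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(roi_gray):
        cv2.rectangle(roi_color, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)

cv2.imshow("result", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

The scaleFactor (1.3) and minNeighbors (5) values here are common tutorial defaults, not values confirmed by the report; tuning them trades detection speed against missed or false detections.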
The result will look like the figures below: the original picture alongside the picture with the detected face and eyes marked.

Conclusion

The computational models implemented in this project were chosen after extensive research, and the successful testing results confirm that the choices made by the researcher were reliable.

The system with manual face detection and automatic face recognition did not have a recognition accuracy over 90%, due to the limited number of eigenfaces used for the PCA transform. This system was tested under very robust conditions in this experimental study, and it is envisaged that real-world performance will be far more accurate. The fully automated frontal-view face detection system displayed virtually perfect accuracy, and in the researcher's opinion further work need not be conducted in this area.

The fully automated face detection and recognition system was not robust enough to achieve a high recognition accuracy. The only reason for this was that the face recognition subsystem did not display even a slight degree of invariance to scale, rotation or shift errors of the segmented face image, which was one of the system requirements identified in section 2.3. However, if some form of further processing, such as an eye detection technique, were implemented to further normalise the segmented face image, performance would increase to levels comparable to the manual face detection and recognition system. Implementing an eye detection technique would be a minor extension to the implemented system and would not require a great deal of additional research. All other implemented systems displayed commendable results and reflect well on the deformable template and Principal Component Analysis strategies.

The most suitable real-world applications for face detection and recognition systems are mugshot matching and surveillance. There are better techniques, such as iris or retina recognition and face recognition using the thermal spectrum, for user access and user verification applications, since these need a very high degree of accuracy. The real-time automated pose-invariant face detection and recognition system proposed in chapter seven would be ideal for crowd surveillance applications. If such a system were widely implemented, its potential for locating and tracking suspects for law enforcement agencies is immense. The implemented fully automated face detection and recognition system (with an eye detection system) could be used for simple surveillance applications such as ATM user security, while the implemented manual face detection and automated recognition system is ideal for mugshot matching.

Since controlled conditions are present when mugshots are gathered, the frontal-view face recognition scheme should display a recognition accuracy far better than the results obtained in this study, which was conducted under adverse conditions. Furthermore, many of the test subjects did not present an expressionless frontal view to the system; they would probably be more compliant when a 6'5" policeman is taking their mugshot! In mugshot matching applications, perfect recognition accuracy or an exact match is not a requirement. If a face recognition system can reduce the number of images that a human operator has to search through for a match from 10,000 to even 100, it would be of incredible practical use in law enforcement.

The automated vision systems implemented in this thesis did not approach the performance of, nor were they as robust as, a human's innate face recognition system. However, they give an insight into what the future may hold in computer vision.