Project Report
Determining mood from facial expression using Convolutional Neural Networks
Dr. K. Sri Rama Murty (Instructor, EE7390: Pattern Recognition and Machine Learning)
L. Praveen Kumar Reddy (EE14MTECH11005), M. Shanmukh Reddy (EE14RESCH11006)
Abstract:
Keywords: CNNs, pooling, softmax.
Introduction:
Humans interact with each other mostly through speech, and also through body gestures to emphasize a certain part of speech and/or to display emotions. In order to achieve more effective human-computer interaction, recognizing the emotional state of a human from his or her face could prove to be an invaluable tool. With this motivation, we propose a model that can recognise a person's mood based on facial expression.
There has been research work on emotion detection for the past two decades, and various machine learning based models have been proposed in recent years. A brief overview of one such method: in [1], the facial landmarks of every image are extracted first and then used as input to train an SVM (Support Vector Machine) model. This approach does not take spatial relationships/information into account and depends on finding landmarks in every image. We propose a CNN based model with several convolutional layers, each followed by
subsampling/pooling layers to extract features, and then feed the result to a neural network with a softmax activation function.
Figure 1: seven emotions, starting with surprise, sad, neutral, happy, disgust, anger and fear.
Database source: Emotion Lab (http://www.emotionlab.se/resources/kdef)
The following sections of this report are organised as a brief overview of CNNs, followed by the proposed recognition algorithm, results, and then the conclusion.
Convolutional Neural Networks [2]:
The design of a CNN is motivated by the discovery of a visual mechanism in the brain, the visual cortex. The visual cortex contains many cells that are responsible for detecting light in small, overlapping subregions of the visual field, which are called receptive fields. These cells act as local filters over the input space, and the more complex cells have larger receptive fields. The convolution layer in a CNN performs the function that is performed by the cells in the visual cortex. A CNN is a special case of the standard neural network: it consists of one or more convolutional layers, often each with a subsampling layer, which are followed by one or more fully connected layers as in a standard neural network. In a CNN, convolutional layers play the role of feature extractor, but they are not hand designed. Convolution filter kernel
weights are decided on as part of the training process. Convolutional layers are able to extract local features because they restrict the receptive fields of the hidden layers to be local.
Figure 2: An example of Convolutional Neural Network architecture.
Layers of CNN:
By stacking multiple different layers in a CNN, complex architectures are built for classification problems. Four types of layers are most common: convolutional layers, pooling or subsampling layers, nonlinear layers, and fully connected layers.
Fully connected layers are often used as the final layers of a CNN. These layers mathematically sum a weighting of the previous layer's features, indicating the precise mix of ingredients needed to determine a specific target output result. In the case of a fully connected layer, all the elements of all the features of the previous layer get used in the calculation of each element of each output feature.
Recognition Algorithm:
Convolutional layer I: 64 filters, each of size 3*3
Max pooling layer I: 2*2 without any overlap
Convolutional layer II: 128 filters, each of size 3*3
Max pooling layer II: 2*2 without any overlap
Convolutional layer III: 256 filters, each of size 3*3
Max pooling layer III: 2*2 without any overlap
Convolutional layer IV: 256 filters, each of size 3*3
Max pooling layer IV: 2*2 without any overlap
Flattening the outputs and giving them to a layer with 512 nodes (neurons)
Activation layer: activation using the sigmoid function
Output layer with 7 nodes
Activation of the output layer using the softmax function
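The layer stack above could be expressed in Keras roughly as follows. This is a hedged reconstruction rather than our exact training script: Keras' Conv2D offers only "valid" and "same" padding, so "same" stands in for the "full" mode used in the first layer, and the ReLU activations on the convolutional layers are an assumption (the report specifies activations only for the 512-node layer and the output layer):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Input: 128x128 grayscale images, as produced by the preprocessing step.
model = Sequential()
model.add(Conv2D(64, (3, 3), padding='same', activation='relu',
                 input_shape=(128, 128, 1)))                        # conv layer I (padded)
model.add(MaxPooling2D(pool_size=(2, 2)))                           # max pooling layer I
model.add(Conv2D(128, (3, 3), padding='valid', activation='relu'))  # conv layer II
model.add(MaxPooling2D(pool_size=(2, 2)))                           # max pooling layer II
model.add(Conv2D(256, (3, 3), padding='valid', activation='relu'))  # conv layer III
model.add(MaxPooling2D(pool_size=(2, 2)))                           # max pooling layer III
model.add(Conv2D(256, (3, 3), padding='valid', activation='relu'))  # conv layer IV
model.add(MaxPooling2D(pool_size=(2, 2)))                           # max pooling layer IV
model.add(Flatten())
model.add(Dense(512, activation='sigmoid'))  # 512-node layer with sigmoid activation
model.add(Dense(7, activation='softmax'))    # one output node per emotion
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])          # optimizer/loss are assumptions
```

The final feature maps are 6*6*256, so the flattened vector feeding the 512-node layer has 9216 elements.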
The image database with which we trained our model contains 2400 images, covering 35 males and 35 females with 7 facial expressions. Of these 2400 images, we use 1800 for training and 600 for testing. The database images are of size 562*762*3 pixels, which is very large compared with the usual CNN input images. As part of preprocessing, we convert the colour images to grayscale and resize every image to 128*128. In the first convolutional layer the mode of convolution is "full", which means it includes zero padding such that, for an M1*M2 input and an N1*N2 filter, the output of the convolution is (M1+N1-1)*(M2+N2-1). In the remaining convolutional layers the mode of convolution is "valid", which includes no zero padding, so the output size is (M1-N1+1)*(M2-N2+1).
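These two output-size formulas can be checked with a throwaway Python sketch, using the 128*128 input and 3*3 filters from our architecture:

```python
def full_output(m1, m2, n1, n2):
    """'Full' mode: zero padding, output grows by the filter size minus one."""
    return (m1 + n1 - 1, m2 + n2 - 1)

def valid_output(m1, m2, n1, n2):
    """'Valid' mode: no padding, output shrinks by the filter size minus one."""
    return (m1 - n1 + 1, m2 - n2 + 1)

print(full_output(128, 128, 3, 3))   # (130, 130)
print(valid_output(128, 128, 3, 3))  # (126, 126)
```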
The entire model is built using Keras in Python.
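The preprocessing described above (grayscale conversion and resizing to 128*128) could be sketched with Pillow as below. This is an illustrative version rather than our actual script, and the scaling of pixel values to [0, 1] is a common extra step that the report does not specify:

```python
import numpy as np
from PIL import Image

def preprocess(img):
    """Turn a 562x762 colour image into a 128x128 grayscale array for the CNN."""
    gray = img.convert('L')            # colour -> grayscale
    small = gray.resize((128, 128))    # 562*762 -> 128*128
    x = np.asarray(small, dtype='float32') / 255.0  # scale to [0, 1] (assumption)
    return x.reshape(128, 128, 1)      # add the channel axis Keras expects

# Usage (hypothetical filename): x = preprocess(Image.open('some_kdef_image.jpg'))
```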
Results:
As a first cut, we trained our model with only 471 images (the frontal view images) from the dataset, out of which 371 images are used as the training set and the rest (100 images) as the test
set. Our model gave an accuracy of 67.001% within 50 epochs. Training on the entire dataset with five different views, using 1800 images as the training set and 600 as the test set, is still in progress, and we are yet to get the output.
Conclusion:
Acknowledgements:
We would like to thank our course instructor, Dr. K. Sri Rama Murty, for suggesting that we go with CNN based classification rather than depending on facial landmarks in this project. We also thank the Emotion Lab at Karolinska for providing free access to their database.
References:
[1] Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition.
[2] Samer Hijazi, Rishi Kumar, and Chris Rowen. Using Convolutional Neural Networks for Image Recognition.
[3] The Karolinska Directed Emotional Faces. Department of Clinical Neuroscience, Psychology section, Karolinska Institutet.
[4] Keras documentation.