
Emotionally Responsive Game

Guide Dr. Sougata Karmakar

DD312 Systems Approach to Design

Prabhat Kumar 10020532


My sincere thanks to Dr. Sougata Karmakar for guiding me throughout the development process and helping me narrow down the project. I would also like to thank my friend Ashish Arora for helping me choose the right programming language and get the project started. Lastly, I would like to thank Yasser Souri for rescuing the project at critical moments by providing code snippets and helping me understand them.

Contents

Introduction
Objective
Chosen Approach
Face Recognition
Face Detection (Haar cascade, other methods)
Feature Extraction (PCA, Gabor)
Database for Training
Future Work
Related Work
References

The gaming industry is diligently working to make the gaming experience more immersive. Current research is largely focused on deepening the player's involvement in the game. Most games try to evoke emotions in the player, such as empathy, so that the experience becomes more engaging. To push the experience further, however, there should be an emotional dialogue between the player and the computer.

Phase I
Aim: To enhance the gaming experience by making it more immersive through emotional involvement.
Objectives:
- To understand the working of facial recognition systems and image feature extraction.
- To develop a prototype able to detect emotion from facial expressions.

Chosen Approach
The OpenCV library has been chosen over the Matlab image toolbox because OpenCV is faster at real-time image processing than Matlab.[1] Python was chosen as the language because the working environment is much simpler than C++:
- it has many inbuilt libraries
- it is faster
- it is portable
- it has a very active support forum

Face Recognition
Human facial features play a significant role in face recognition and neurophysiological research. Studies have determined that the eyes, mouth, and nose are among the most important features for recognition. The extraction of facial feature points (eyes, nose, mouth) plays an important role in many applications, such as face recognition, face detection, model-based image coding, expression recognition, facial animation, and head-pose determination.[6]


The recognition pipeline: face detection → feature extraction → verification / identification.

Face Detection

The main function of this step is to determine whether human faces appear in a given image, and where they are located. The expected outputs are patches containing each face in the input image. To make the subsequent face recognition system more robust and easier to design, face alignment is performed to normalize the scales and orientations of these patches. Besides serving as pre-processing for face recognition, face detection can be used for region-of-interest detection, retargeting, video and image classification, and so on.

The prototype uses fast face detection based on Haar-like features and the AdaBoost algorithm: the detector locates the face, and the relevant region is cropped for the later stages.

The first two Haar features (top) and their respective tuning curves (bottom), each shown over the average face. The tuning curves show the evidence for face (high) vs. non-face (low). The first tuning curve shows that a dark horizontal region over a bright horizontal region in the center of the window is evidence for a face, and for non-face otherwise. The output of the second filter is bimodal: both a strong positive and a strong negative output are evidence for a face, while an output closer to zero is evidence for non-face.

Other methods

Knowledge-based methods
- Hierarchical knowledge-based method

Feature invariant approaches
- Face detection using color information
- Face detection based on random labeled graph matching

Template matching methods
- Adaptive appearance model
- Horizontal / vertical projection

Appearance-based methods
- Example-based learning for view-based human face detection
- Fast face detection based on the Haar features and the AdaBoost algorithm

Part-based methods
- Face detection based on the generative model framework
- Component-based face detection based on the SVM classifier

Feature Extraction
After the face detection step, human-face patches are extracted from images. Directly using these patches for face recognition has some disadvantages. First, each patch usually contains over 1000 pixels, which is too large to build a robust recognition system on. Second, face patches may be taken from different camera alignments, with different facial expressions and illuminations, and may suffer from occlusion and clutter. To overcome these drawbacks, feature extraction is performed to do information packing, dimension reduction, salience extraction, and noise cleaning. After this step, a face patch is usually transformed into a vector of fixed dimension or a set of fiducial points and their corresponding locations. In some of the literature, feature extraction is included in either face detection or face recognition.

Feature Extraction methods

Holistic-based methods
- Eigenface and Principal Component Analysis
- Fisherface and Linear Discriminant Analysis
- Independent Component Analysis
- Laplacianfaces and nonlinear dimension reduction

Feature-based methods
- Gabor wavelet features with elastic graph matching based methods
- Binary features
- Template-based methods

Part-based methods
- Component-based face recognition
- Robust face recognition via sparse representation
- Person-specific SIFT features for face recognition

Eigenface and Principal Component Analysis

(a) A database with only 10 faces, each face patch of size 100-by-100. Through the computation of the PCA basis we obtain (b) a mean face and (c) 9 eigenfaces (ordered from highest eigenvalue, left to right and top to bottom).
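The computation behind that figure can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's code: it uses the standard small-sample trick of eigendecomposing the n-by-n matrix A Aᵀ instead of the huge pixel-by-pixel covariance matrix, which is why 10 centered faces yield at most 9 eigenfaces.

```python
import numpy as np

def eigenfaces(faces, k):
    """Compute the mean face and top-k eigenfaces from a stack of patches.

    faces: array of shape (n, h, w). Returns (mean_face, eigenfaces)
    with shapes (h, w) and (k, h, w).
    """
    n = faces.shape[0]
    X = faces.reshape(n, -1).astype(float)   # each row: one flattened patch
    mean = X.mean(axis=0)
    A = X - mean                             # center the data
    # Small-sample trick: eigendecompose the n-by-n matrix A A^T.
    vals, vecs = np.linalg.eigh(np.dot(A, A.T))
    order = np.argsort(vals)[::-1][:k]       # largest eigenvalues first
    U = np.dot(A.T, vecs[:, order])          # map back to pixel space
    U /= np.linalg.norm(U, axis=0)           # unit-length eigenfaces
    return mean.reshape(faces.shape[1:]), U.T.reshape((k,) + faces.shape[1:])
```

Projecting a new face patch onto these eigenfaces reduces it from thousands of pixels to k coefficients, which is exactly the dimension reduction motivated above.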

Gabor wavelet features with elastic graph matching based methods

Biological motivation: The simple cells of the visual cortex of mammalian brains are best modeled as a family of self-similar 2D Gabor wavelets.

Gabors selected by Adaboost for each expression. White dots indicate locations of all selected Gabors. Below each expression is a linear combination of the real part of the first 5 Adaboost features selected for that expression. Faces shown are a mean of 10 individuals.[3]

Flowchart of the feature extraction stage for facial images: start → find feature points (x1, y1, …) → take the Gabor wavelet transform coefficients at each point → assemble the feature vector → end.


Support Vector Machine

In machine learning, support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks.

Neural Network

An artificial neural network, often just called a neural network, is a mathematical model inspired by biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases a neural network is an adaptive system that changes its structure during a learning phase. Neural networks are used to model complex relationships between inputs and outputs or to find patterns in data.

Database for training

The SVM method has been chosen for training, as its implementation is easier than that of a neural network.

Prediction using SVM: feature vectors are mapped into the model space, where the hyperplanes of the pre-trained model separate the emotion categories.

An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on.
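Since scikit-learn is among the project's requirements, the training and prediction described above can be sketched with its `SVC` class. The data here is synthetic random placeholder features, not the project's real Gabor vectors; the linear kernel keeps the "widest gap" hyperplane picture literal.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: one feature vector per snapshot, labelled
# with one of the four emotions the prototype detects.
rng = np.random.RandomState(0)
X_train = rng.rand(80, 40)            # 80 snapshots, 40 features each
labels = ["anger", "happiness", "surprise", "neutral"]
y_train = np.repeat(labels, 20)       # 20 snapshots per emotion

# Train a linear-kernel SVM: it finds the separating hyperplanes.
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# A new frame's feature vector is mapped into the same space and
# classified by which side of the hyperplanes it falls on.
prediction = clf.predict(rng.rand(1, 40))
assert prediction[0] in labels
```

In the real system the 80 training vectors would come from the 20 snapshots captured per emotion, as described in the instruction manual below.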

Current product capabilities

- Uses a Haar-cascade system for face detection
- Uses Gabor filters for feature extraction
- Uses a support vector machine for classification
- Can detect four facial emotions: anger, happiness, surprise, neutral
- Response time: 2-3 seconds
- Operating system: Windows only

Instruction manual for using the code

Download link: pression_recognition_code.rar

Requirements:
- Windows only
- Python 2.7
- OpenCV for Python 2.3 or higher
- scikit-learn
- LibSVM 3.11 or higher

Creating a database
From the command prompt, go to the code folder and run the capture script with python. Take snapshots of 20 images for each emotional state of your face. Paste all the images starting with f from the data folder into the data > train folder.

Recognizing facial expression

Run the recognition script with python in the command window. Press the space bar every time you want the software to recognize the emotion.

The Gabor extraction system is slow: it takes approximately 2-3 seconds to predict the results. The memory and processing requirements are also very high, so the system can only be trained for a single individual. The accuracy varies greatly with changes in the illumination of the face.

A lot of further work is required to improve this system. Image processing and machine learning together have huge potential, and a great deal of research is ongoing in this field. Much recent technology is the product of the combination of these two areas, e.g. self-driving cars, Kinect, security systems, and Google Goggles.

Future work
- Improving the algorithm and understanding the concepts more thoroughly
- Studying related work in gaming
- Developing a methodology for improving the gaming experience
- User testing and analysis

Related work in designing emotionally immersive games

The Act (Cecropia, 2007): The player uses a single rotating knob to manipulate a particular behavior or emotion of the PC Edgar, such as sense of humor or courage, with the goal of maintaining the right degree to pass the level.

Super Princess Peach: The player can make the PC, Princess Peach, take on different emotions: Joy gives her the ability of wind, Rage the ability of fire, Calm a healing ability, and Gloom the ability of water through her tears.

Ruben & Lullaby (Opertoon, 2009): The player can direct the argument by shaking the iPhone to make a character mad, or stroke the character on the touch-sensitive screen to calm them down.

Façade (Procedural Arts, 2005): The player, caught up in a feud between a married couple, can type in sentences to converse with them. A natural-language processor deciphers the input, which affects the couple and their opinion of each other.

FIFA 13 (EA Sports): FIFA 13 takes advantage of Microsoft Kinect to let players shout commands to their team to change tactics or formations without pausing the game. The system also listens for players whose frustration gets the better of them: swearing at the referee will influence their decision-making, possibly leading to more bookings. In FIFA 13's career mode, storylines develop if the gamer acquires a reputation for abusing referees.

1. Anthony, "OpenCV vs Matlab vs SimpleCV: a comparison."
2. Wei-Lun Chao, GICE, National Taiwan University, "Face Recognition: A Survey."
3. M. Bartlett, G. Littlewort, I. Fasel, J. Movellan, "Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction," Machine Perception Laboratory, Institute for Neural Computation, University of California, San Diego, CA 92093.
4. R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
5. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed., Springer, 2005.
6. Bhumika G. Bhatt, Zankhana H. Shah, "Face Feature Extraction Techniques: A Survey," Computer Department, BVM Engineering College, Anand, India.
7. "Introduction to Support Vector Machines."