
Blue Book on MoBo: The Motion Bot

Submitted in partial fulfillment for the award of the degree of
BACHELOR OF ENGINEERING in COMPUTER ENGINEERING

by
Shantanu Suhas Ambadkar
Jaishal Rajendra Bansal
Sidharth Subhash Jain

Under the Guidance of
Ms. Sheetal Rathi, Asst. Prof.

Thakur College of Engineering and Technology
Shyamnarayan Marg, Thakur Village, Kandivali (E), Mumbai-400101
Year 2013-2014

CERTIFICATE

This is to certify that Sidharth Jain is a bonafide student of Thakur College of Engineering and Technology, Mumbai. He has satisfactorily completed the requirements of PROJECT-I as prescribed by the University of Mumbai while working on MoBo: The Motion Bot.

Ms. Sheetal Rathi (Guide)

Dr. R. R. Sedamkar (Dean Academics & HOD CMPN)

Dr. B. K. Mishra (Principal)

Internal Examiner (Name and Signature with Date)

External Examiner (Name and Signature with Date)

Thakur College of Engineering and Technology, Kandivali (E), Mumbai-400101.
PLACE: Mumbai
DATE:

ACKNOWLEDGEMENT
We take great pleasure in presenting this project report on MoBo: The Motion Bot and in expressing our deep regard towards those who offered their valuable time and guidance in our hour of need. Completing a project of this kind is teamwork: it draws on technical and non-technical expertise from various sources, and the contribution of experts in the form of know-how and other technical support is of vital importance. We are indebted to our inspiring guide Ms. Sheetal Rathi and our H.O.D. Dr. R. R. Sedamkar, who extended valuable guidance, help and constant encouragement through the various stages of the project. We have great pleasure in offering our sincere thanks to our honorable Principal Dr. B. K. Mishra. Last but not least, we would like to thank our friends, parents and the staff of this college for all the direct and indirect help provided towards the successful completion of this project.

Shantanu Ambadkar

Jaishal Bansal

Sidharth Jain

ABSTRACT
Nowadays robots are controlled by a remote, a cellphone or a direct wired connection. Considering cost and the hardware required, all of these increase complexity, especially for low-end applications. The robot we have designed is different: it does not require any remote or communication module. It is a self-activated robot that drives itself according to the position of the user standing in front of it, copying every movement the user makes. The hardware required is minimal, and hence the robot is low cost and small in size. The proliferation of low-power, low-cost accelerometers in consumer electronics has brought an opportunity to personalize gesture-based interaction. We present MoBo, an efficient personalized gesture recognizer based on a 3-D accelerometer. The core technical components of MoBo include quantization of accelerometer readings, dynamic time warping and template adaptation. Unlike statistical methods, MoBo requires a single training sample and allows users to employ personalized gestures. Our evaluation is based on a large gesture library with over 4000 samples collected from eight users. It shows that MoBo achieves 98.6% accuracy, competitive with statistical methods that require significantly more training samples.

CONTENTS

Chapter No.    Topic                                      Pg. No.
               3-Axis Accelerometer                        8
               Gesture Segmentation                        10
               Working of Module                           16
Chapter 1      Introduction
Chapter 2      Proposed Work and Literature Review
               2.1 Problem Definition                      8
               2.2 Literature Review                       8
               2.3 Gesture Segmentation                    10
Chapter 3      Analysis and Planning
               Hand Gesture Recognition                    12
               3.1 Finding a Hand Gesture                  13
               3.2 Hand Gesture Extraction                 13
               3.3 Using 3-D Images                        14
Chapter 4      Design
               4.1 Project Specification                   16
               4.2 Working of Module                       17
Chapter 5      Conclusions
Appendix A     Abbreviations and Symbols
Appendix B     Definitions

Chapter 1: INTRODUCTION
Interpretation of human gestures by a computer is used for human-machine interaction in the area of computer vision. The main purpose of gesture recognition research is to identify a particular human gesture and convey the information pertaining to that individual gesture. From a corpus of gestures, a specific gesture of interest can be identified, and on that basis a specific command can be given to the machine for execution of an action. The overall aim is to make the computer understand human body language, thereby bridging the gap between machine and human. Hand gesture recognition can be used to enhance human-computer interaction without depending on traditional input devices such as the keyboard and mouse.

Hand gestures are extensively used for telerobotic control applications. Robotic systems can be controlled naturally and intuitively with such telerobotic communication. A prominent benefit of such a system is that it presents a natural way to send geometrical information to the robot, such as: left, right, etc. A robotic hand can be controlled remotely by hand gestures. Research has been carried out in this area for a long time, and several approaches have been developed for sensing hand movements and controlling a robotic hand correspondingly. The glove-based technique is a well-known means of recognizing hand gestures. It utilizes mechanical glove devices with attached sensors that directly measure hand and/or arm joint angles and spatial position. Although glove-based gestural interfaces give more precision, they limit freedom, as they require users to wear a cumbersome patch of devices. Jae-Ho Shin used entropy analysis to extract the hand region from a complex background for a hand gesture recognition system. Robot control has also been performed by fusing hand positioning and arm gestures using a data glove; although this gives more precision, it limits freedom due to the necessity of wearing gloves.

For capturing hand gestures correctly, proper lighting and camera angles are required. The problem of visual hand recognition and tracking is quite challenging. Many early approaches used position markers or colored bands to make the problem of hand recognition easier, but due to their inconvenience they cannot be considered a natural interface for robot control. We have proposed a fast and automatic hand gesture detection and recognition system. This approach of gesture identification on the basis of a recognized hand gesture can be used in any robotic system or machine with a number of specific commands suitable to that system.

Chapter 2: Proposed Work and Literature Review


Gesture recognition has been extensively investigated. The majority of past work has focused on detecting the contour of hand movement, and computer vision techniques in different forms have been extensively explored in this direction (refer [1]: Personal Ubiquitous Computing, 10(5), 285-299, July 2006). For a recent example, VisionWand employs computer vision to recognize the movement of a passive wand with a predefined color pattern. While the most common form requires one or more cameras to capture hand movement, the Wii remote has the camera (an IR sensor) inside the remote and detects motion by tracking the relative movement of IR transmitters mounted on the display. It therefore essentially maps the three-dimensional remote movement onto a planar surface. This translates a gesture into handwriting, lending itself to a rich set of handwriting recognition techniques. Vision-based methods, however, are fundamentally limited by their hardware requirements (i.e., cameras or transmitters) and relatively high computation load.

Smart-glove based solutions have been investigated to recognize very fine gestures, for example finger movement and conformation, instead of hand movement. These solutions require the user to wear a glove tagged with multiple sensors to capture the motion of the fingers and hand at fine granularity (refer [2]: Proc. Int. Gesture Wrkshp. Gesture and Sign Language in Human-Computer Interaction, September 1997). While they often yield impressive accuracy, these solutions are inadequate for spontaneous interaction with consumer electronics and mobile devices because of the high cost of the glove and the high overhead of engagement.

As ultra-low-power, low-cost accelerometers, gyroscopes, and compasses start to appear in consumer electronics and mobile devices, many have recently investigated gesture recognition based on the time series of acceleration, often with additional information from a gyroscope or compass. Signal processing and ad hoc recognition methods have been explored. LiveMove Pro from AiLive provides a gesture recognition library based on the accelerometer in the Wii remote. Unlike MoBo, LiveMove Pro targets user-independent gesture recognition with predefined gesture classifiers and requires 5 to 10 training samples (refer [3]: Wrkshp. Robot and Human Interactive Communication, 2003). No systematic evaluation of the accuracy of LiveMove Pro exists. HMMs, investigated in prior work, are the mainstream method for speech recognition. However, HMM-based methods require extensive training data to be effective. Some authors realized this and attempted to address it by converting two samples into a large set of training data by adding Gaussian noise. While they showed improved accuracy, the effectiveness of this method is likely to be highly limited because it essentially assumes that variations in human gestures are Gaussian (refer [4]: Proc. Int. Symp. Wearable Computers, 178-180, 18-19 October 1999). In contrast, MoBo requires as few as a single training sample for each gesture and delivers competitive accuracy. Another limitation of HMM-based methods is that they often require knowledge of the vocabulary in order to configure the models properly, e.g. the number of states in the model. Therefore, HMM-based methods may suffer when users are allowed to choose gestures freely. Moreover, as we will see in the evaluation section, the evaluation datasets and test procedures used in earlier work did not consider gesture variation over time, so their results can be overly optimistic.

Dynamic time warping (DTW) is the core of MoBo. It was extensively investigated for speech recognition in the 1970s and early 1980s, in particular speaker-dependent speech recognition with a limited vocabulary. Later, HMM-based methods became the mainstream because they are more scalable toward a large vocabulary and can better benefit from a large set of training data. However, DTW is still very effective in coping with limited training data and a small vocabulary, which matches up well with personalized gesture-based interaction on consumer electronics and mobile devices. Wilson and Wilson applied DTW and HMM with the XWand to user-independent gesture recognition. The low accuracies, 72% for DTW and 90% for HMM with seven training samples, render them almost impractical. In contrast, MoBo focuses on personalized, user-dependent gesture recognition and thus achieves much higher recognition accuracy.

2.1 Problem Definition

It is also important to note that the evaluation data set employed in this work is considerably more extensive than in previously reported work. Similar to MoBo, the $1 recognizer was also based on template matching (refer [5]: MEMS Accelerometer Based Nonspecific-User Hand Gesture Recognition, IEEE Sensors Journal, vol. 12, no. 5, May 2012). It is important to note that gestures in that work refer to handwriting on a touch screen, rather than three-dimensional free-hand movement. The $1 recognizer is related to MoBo in multiple aspects. First, although it is also based on template matching, the $1 recognizer may not apply to time series of accelerometer readings, which are subject to temporal dynamics (how fast and forcefully the hand moves), to three-dimensional acceleration data arising from movement with six degrees of freedom, and to the confusion introduced by gravity. Second, its authors concluded that DTW is slower but achieves accuracy similar to the $1 recognizer. We show that with proper quantization, DTW-based MoBo can be extremely efficient. Moreover, MoBo allows DTW to start as soon as the very first point of the time series comes in and to proceed as more points become available (refer [6]: Proc. 3rd Int. Conf. Mobile and Ubiquitous Multimedia, October 2004). As a result, the delay can be masked by the much slower hand movement. Third, three-dimensional gestures may be projected onto a surface as handwriting; therefore, the $1 recognizer can potentially be applied to recognize certain gestures. However, this will suffer from the limitations of vision-based gesture recognition. In contrast, MoBo is completely camera-free and recognizes three-dimensional free-hand movement. Fourth, and most importantly, MoBo and the $1 recognizer are related in their focus on personalization. While that work is mostly focused on recognition accuracy and speed, we are interested in the interaction between MoBo and the personalizing process, which our user studies are designed for. Moreover, we also investigate broader issues that concern accelerometer-based gesture recognition, such as user dependence, tilt, and vocabulary selection.

2.2 Literature Review

A. Sensor Description

The sensing system utilized in our experiments for hand motion data collection is shown in Fig. 2 and is essentially a MEMS 3-axis acceleration sensing chip (as shown in Fig. 1) integrated with data management and Bluetooth wireless data chips. The algorithms described in this report were implemented and run on a PC. Details of the hardware architecture of this sensing system were published by our group in [19] and [20]. The sensing system has also recently been commercialized in a more compact form.

B. System Work Flow

When the sensing system is switched on, the accelerations in three perpendicular directions are detected by the MEMS sensors and transmitted to a PC via the Bluetooth protocol. The gesture motion data then go through a segmentation program which automatically identifies the start and end of each gesture, so that only the data between these terminal points are processed for feature extraction (refer [5]: IEEE Sensors Journal, vol. 12, no. 5, May 2012). Subsequently, the processed data are passed to a comparison program which determines the presented gesture; a minimal sketch of such a comparison step is given below. The work flow of this system is shown in Fig. 2.

[Figure captions: illustration of the components of the sensing system used for hand gesture recognition; workflow of the gesture recognition system using MEMS.]
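As a concrete illustration of such a comparison step, the sketch below shows a minimal, hypothetical recognizer in the spirit of MoBo's DTW-plus-quantization approach described in Section 2.1: accelerometer samples are quantized onto a coarse grid and compared against stored templates with dynamic time warping. The quantization step, distance function and template format are illustrative assumptions, not the actual MoBo parameters.

```python
import numpy as np

def quantize(samples, step=8):
    """Map raw 3-axis accelerometer readings onto a coarse integer grid.

    Quantization suppresses sensor noise and small, non-intrinsic variations
    in how a gesture is performed. The step size is an assumed value."""
    return np.round(np.asarray(samples, dtype=float) / step).astype(int)

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping over 3-axis samples."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # per-sample distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m] / (n + m)                       # length-normalized

def recognize(gesture, templates, step=8):
    """Return the label of the closest stored template (one sample per class)."""
    q = quantize(gesture, step)
    return min(templates,
               key=lambda label: dtw_distance(q, quantize(templates[label], step)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical single-sample templates: one (time x 3) series per class,
    # standing in for real accelerometer recordings.
    templates = {
        "left":  np.cumsum(rng.normal(scale=20, size=(40, 3)), axis=0),
        "right": np.cumsum(rng.normal(scale=20, size=(55, 3)), axis=0),
    }
    probe = templates["left"] + rng.normal(scale=5, size=(40, 3))  # noisy repeat
    print(recognize(probe, templates))   # expected: "left"
```

Because DTW fills its cost matrix column by column, the comparison can keep pace with the incoming samples and finish almost as soon as the gesture ends.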

2.3 GESTURE SEGMENTATION

A. Data Acquisition

To collect reliable hand gesture data for the sensing system, the experimental subject should follow the guidelines below during the data acquisition stage:
1) The sensing device should be held horizontally during the whole data collection process (i.e., with the x-y plane of the sensor chip facing the ground).
2) The time interval between two gestures should be no less than 0.2 seconds, so that the segmentation program can separate each of the gestures in sequential order.
3) The gestures should be performed.

B. Gesture Segmentation

1) Data Preprocessing: Raw data received from the sensors are preprocessed by two processes: a) vertical-axis offsets are removed from the time-sequenced data by subtracting the mean value of a data set from each data point, so that a data set shows zero value on the vertical axis when no acceleration is applied; b) a filter is applied to the data sets to eliminate high-frequency noise. A sketch of these two preprocessing steps is given below.

2) Segmentation: The purpose of the segmentation algorithm is to find the terminal points of each gesture in a data set of a gesture sequence. The algorithm checks various conditions on all the data points and picks out the most likely data points as the gesture termination points. The conditions for determining the gesture terminal points in our algorithm are: a) amplitude of a point (the y-coordinate value of a data point); b) point separation (the difference between the x-coordinates of two points); c) mean value (the mean of the y-coordinates of the points on the left and right sides of a selected point); d) distance from the nearest intersection (how far a selected point is from an intersection point, i.e., a point where the acceleration curve crosses from negative to positive or vice versa); e) sign variation between two successive points. After examining all the points against these five conditions, the terminal points can be generated for the motion data on each axis. Since all five conditions are checked separately on the x- and z-axis acceleration data, two matrices are generated for each gesture sequence.
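The following is a minimal sketch of the two preprocessing operations described above, assuming the raw data arrive as a NumPy array of acceleration samples. The report does not specify the filter, sampling rate or cutoff frequency; the Butterworth low-pass filter and its parameters are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def remove_offset(samples):
    """a) Remove the vertical-axis offset by subtracting the mean of the
    data set, so the series reads zero when no acceleration is applied."""
    samples = np.asarray(samples, dtype=float)
    return samples - samples.mean(axis=0)

def lowpass(samples, fs=100.0, cutoff=5.0, order=4):
    """b) Suppress high-frequency noise with a zero-phase Butterworth
    low-pass filter (fs and cutoff are assumed values)."""
    b, a = butter(order, cutoff / (fs / 2.0), btype="low")
    return filtfilt(b, a, samples, axis=0)

def preprocess(samples):
    return lowpass(remove_offset(samples))

if __name__ == "__main__":
    t = np.linspace(0, 2, 200)
    raw = 0.5 + np.sin(2 * np.pi * 1.5 * t) + 0.2 * np.random.randn(200)
    clean = preprocess(raw[:, None])   # single-axis example
    print(clean.mean(), clean.shape)   # mean is approximately zero
```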


Chapter 3: Analysis and Planning


Hand Gesture Recognition
Human hand gestures are a set of movements of the hand and arm which range from the simple action of pointing at something to the complex ones used to communicate with other people. Understanding and interpreting these movements requires modeling them in both the spatial and temporal domains. The static configuration of the human hand, called hand posture, and its dynamic activities are both vital for human computer interaction. Psychological studies show that a hand gesture consists of three phases: preparation, nucleus, and retraction. The preparatory phase brings the hand from its resting state to the starting posture of the gesture. This phase is sometimes very short and sometimes combined with the retraction phase of the previous gesture. The nucleus contains the main concept and has a definite form. The retraction phase is the resting movement of the hand after completing the gesture; it may be very short or absent if the gesture is succeeded by another gesture. The preparatory and retraction phases are generally short, and the hand movements in them are faster than in the nucleus phase.

Several classifications have been considered for hand gestures in the literature. One taxonomy which is well suited to human computer interaction applications divides hand gestures into three groups: communicative gestures, manipulative gestures, and controlling gestures. Communicative gestures are intended to express an idea or a concept. They are either used together with speech or are a substitute for verbal communication, which requires a highly structured set of gestures such as those defined in sign languages. Manipulative gestures are used for interaction with objects in an environment. They are mostly used for interaction in virtual environments such as teleoperation or virtual assembly systems; however, physical objects can also be manipulated through gesture-controlled robots. Controlling gestures are the group of gestures used to control a system or to point at and locate an object. Finger Mouse is a sample application which detects 2D finger movements and controls mouse movements on the computer desktop.

Analyzing hand gestures is completely application dependent and involves analyzing the hand motion, modeling the hand and arm, mapping the motion features to the model and interpreting the gesture over a time interval. Hand gestures can be divided into two categories: static gestures utilize only spatial information, while dynamic gestures utilize both spatial and temporal information. With static gestures, as the number of predefined gestures increases, the differences between gestures become harder to distinguish. Dynamic gestures are easier and more comfortable to express and a larger number of them can be predefined, but there are difficulties in extracting the proper data from a large amount of meaningless information.

Glove-based techniques and computer vision techniques are the two well-known means of recognizing hand gestures. The first utilizes mechanical glove devices with attached sensors that directly measure hand and/or arm joint angles and spatial position, but glove-based gestural interfaces require users to wear a cumbersome patch of devices. The latter approach uses a set of video cameras and computer vision techniques to interpret gestures, providing a more natural way of interaction. However, since it is troublesome to analyze hand movements and recognize postures from complex images, methods such as putting colored markers on the hands or wearing special types of gloves against a restricted set of backgrounds have widely acknowledged limitations. In this report, we propose a method of hand gesture recognition based on computer vision techniques, without restricting the background or using any markers.

3.1 Finding A Hand Gesture


Gesture finding from an online camera: From the video stream one frame is captured every 1/3 second. The target is to identify the frame that contains the hand gesture shown by the human. For this we search for a frame in which there is no movement. The required frame is identified by comparing three consecutively captured frames. A motion parameter is determined for each frame by counting the total number of mismatched pixels. If the motion parameter is less than a specified threshold value, the frame is considered to have little movement of the object, i.e., the frame contains the hand gesture the user wants to show.

Analysis of the frames to find the frame of interest is done by converting each captured frame into a binary frame. The differences between the newly captured frame and the two previously captured frames are determined and added together to find the motion parameter. Since a binary image has pixel values of one or zero, the XOR function gives the locations where mismatches occur. If frame1, frame2 and frame3 are three matrices containing the frames captured in three consecutive time slots, then:

fr1 = frame1 XOR frame3
fr2 = frame2 XOR frame3
mismatch_matrix = fr1 OR fr2
motion parameter = (sum of all elements of mismatch_matrix) / (r x c)

Here r and c are the number of rows and columns in the image frames. The threshold value is set to 0.01, i.e., if the total number of mismatched pixels is less than 1% of the total pixels in a frame, it is considered the frame of interest. The required frame is forwarded for further processing, as in the sketch below.
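A minimal sketch of this frame-differencing test follows, assuming the frames have already been binarized into NumPy arrays; the 1/3-second capture interval and the 0.01 threshold follow the description above, while the binarization step itself is an assumption.

```python
import numpy as np

MOTION_THRESHOLD = 0.01   # 1% of the pixels, as described above

def motion_parameter(frame1, frame2, frame3):
    """Fraction of pixels that differ between the newest frame and the
    two previous frames (all frames are binary 0/1 arrays of equal shape)."""
    fr1 = np.logical_xor(frame1, frame3)
    fr2 = np.logical_xor(frame2, frame3)
    mismatch = np.logical_or(fr1, fr2)
    r, c = frame3.shape
    return mismatch.sum() / (r * c)

def is_frame_of_interest(frame1, frame2, frame3):
    """True when the scene is (almost) still, i.e. a gesture is being held."""
    return motion_parameter(frame1, frame2, frame3) < MOTION_THRESHOLD

if __name__ == "__main__":
    still = np.zeros((120, 160), dtype=np.uint8)
    still[40:80, 60:100] = 1                      # a static hand-like blob
    moving = np.roll(still, 30, axis=1)           # the same blob, shifted
    print(is_frame_of_interest(still, still, still))    # True
    print(is_frame_of_interest(still, moving, still))   # False
```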

3.2 Hand Gesture Extraction


The frame with a gesture contains extra parts along with the required part of the hand, i.e., background objects, blank spaces, etc. For better results in pattern matching, the unused parts must be removed, so the hand gesture is cropped out of the obtained frame. Cropping the hand gesture from the obtained frame involves three steps.

The first step is to convert the selected frame into a black-and-white image using global thresholding.

The second step is to extract the object of interest from the frame. In our case, the object of interest is the part of the human hand showing the gesture. For this, the extra parts other than the hand are cropped out so that pattern matching can give more accurate results. To crop the extra parts, the row and column numbers from which the object of interest appears are determined. This is done by searching from each side of the binary image and moving inward until the number of white pixels encountered exceeds an offset value. Experimental results show that an offset value set to 1% of the total width gives better noise compensation. If the size of the selected image is m x n, then:

Hor_offset = m/100
Ver_offset = n/100
Min_col = minimum column number where the total number of white pixels exceeds Hor_offset.
Max_col = maximum column number where the total number of white pixels exceeds Hor_offset.
Min_row = minimum row number where the total number of white pixels exceeds Ver_offset.
Max_row = maximum row number where the total number of white pixels exceeds Ver_offset.

The third step is to remove the parts of the hand not used in gesture presentation, i.e., the wrist, arm, etc. Because these extra parts are of variable length in the image frame, pattern matching against the gesture database gives unwanted results, due to the limitations of the database. Therefore, the part of the hand below the wrist needs to be cropped out. Statistical analysis of hand shape shows that, whether we pose a palm or a fist, the width is lowest at the wrist and highest at the middle of the palm. Therefore the extra hand part can be cropped out at the wrist by finding the location of minimum width in the vertical histogram. Figures 3.c and 3.d show the global maxima and cropping points for the hand gestures in figures 3.a and 3.b respectively. The cropping point is calculated as:

Global Maxima = column number where the height of the histogram is highest (i.e., the column number of the global maximum, as shown in figure 3).
Cropping point = column number where the height of the histogram is lowest, such that the cropping point lies between the first column and the column of the Global Maxima.

If the gesture is shown from the opposite side (i.e., with the other hand), then the cropping point is searched for between the column of the Global Maxima and the last column. The direction of the hand is detected using a continuity analysis of the object during hand gesture area determination: the analysis shows whether the object continues from the column of the global maximum towards the first column or towards the last column, i.e., whether the extra hand part is to the left or to the right of the palm. A sketch of this cropping procedure follows.
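Below is a rough sketch of the bounding-box and wrist-cropping steps described above, assuming a binary image in which hand pixels are 1 and the gesture is shown with the forearm on the left. The global-threshold value and the tie-breaking of the histogram minimum are assumptions made for illustration.

```python
import numpy as np

def bounding_box(binary, hor_offset=None, ver_offset=None):
    """Step 2: find Min_row/Max_row/Min_col/Max_col, ignoring rows and
    columns whose white-pixel count does not exceed the 1% offsets."""
    m, n = binary.shape
    hor_offset = hor_offset if hor_offset is not None else m // 100
    ver_offset = ver_offset if ver_offset is not None else n // 100
    col_counts = binary.sum(axis=0)            # white pixels per column
    row_counts = binary.sum(axis=1)            # white pixels per row
    cols = np.where(col_counts > hor_offset)[0]
    rows = np.where(row_counts > ver_offset)[0]
    return rows[0], rows[-1], cols[0], cols[-1]   # Min_row, Max_row, Min_col, Max_col

def crop_wrist(binary):
    """Step 3: cut the forearm at the narrowest column (the wrist) that
    lies at or before the widest column (the middle of the palm)."""
    hist = binary.sum(axis=0)                      # vertical histogram
    global_max = int(np.argmax(hist))              # column of widest part (palm)
    crop = int(np.argmin(hist[: global_max + 1]))  # narrowest column before it
    return binary[:, crop:]                        # keep the palm side only

if __name__ == "__main__":
    img = np.zeros((100, 200), dtype=np.uint8)
    img[40:60, 10:70] = 1        # forearm: moderately thick
    img[47:53, 70:85] = 1        # wrist: the narrowest part
    img[20:80, 85:150] = 1       # palm: the widest blob
    r0, r1, c0, c1 = bounding_box(img)
    hand = crop_wrist(img[r0:r1 + 1, c0:c1 + 1])
    print(hand.shape)            # (60, 80): forearm columns removed
```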

3.3 Using 3-D Images


The precise hand region can be obtained through the fusion of hand geometric features and 3D depth information in real time. Gesture analysis can generally be classified into two categories: one uses 2D image data, the other 3D depth data. Using a 2D camera, many vision-based algorithms have been exploited to realize hand tracking and gesture recognition. However, performance decreases tremendously when confronted with complicated environments, such as complex backgrounds or variable illumination. Therefore a more intuitive way is to use 3D depth data to eliminate the noise. To get useful 3D depth information, one kind of method is to use more than one camera, i.e., stereo vision methods such as [8-11]. An alternative is to use a 3D sensor, which can provide a dense range image. With the depth data captured by a 3D camera, the first step is to detect the coarse hand region. For this purpose, candidate regions can be acquired by pursuing the most frontal connected regions in the depth range image. The obtained hand region usually contains not only the hand but also part of the forearm. A sketch of this coarse detection step is given below.
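The sketch below illustrates one plausible way to pick the most frontal connected region from a depth image and to compute the mean position of the frontal pixels used later as the rough hand center C. The connected-component labelling via scipy and the choice of N are assumptions, not details given in the report.

```python
import numpy as np
from scipy import ndimage

def coarse_hand_region(depth, n_nearest=2000):
    """Keep the pixels whose depth is among the N nearest to the camera,
    then return the largest connected blob among them (the coarse hand)."""
    valid = depth > 0                              # 0 = no measurement
    d_T = np.sort(depth[valid])[min(n_nearest, valid.sum()) - 1]
    frontal = valid & (depth <= d_T)
    labels, count = ndimage.label(frontal)
    if count == 0:
        return np.zeros_like(frontal)
    sizes = ndimage.sum(frontal, labels, index=range(1, count + 1))
    return labels == (int(np.argmax(sizes)) + 1)

def rough_hand_center(mask):
    """Mean position of the selected frontal pixels (the rough hand center C)."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

if __name__ == "__main__":
    depth = np.full((240, 320), 1500.0)            # background at 1.5 m
    depth[100:180, 140:200] = 600.0                # a hand-like blob at 0.6 m
    mask = coarse_hand_region(depth)
    print(rough_hand_center(mask))                 # (169.5, 139.5)
```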


In most HCI environments, the assumption that the depth of the hand is smaller than that of the forearm usually holds. Therefore, the rough hand location can be estimated from statistical information about the pixels with smaller (shallower) depth. Here, we use the mean position of these pixels as the rough hand center, C = (1/N) * sum of p_i over all pixels with d_pi <= d_T, where d_pi is the depth of pixel p_i and d_T is the depth corresponding to the N nearest pixels.

The palm location in the coarse hand region is determined by extracting circle features, for which scale-space feature detection is adopted. The scale-space representation S of image I is obtained by convolving I with a Gaussian kernel g of scale t, S(x, y; t) = g(x, y; t) * I(x, y), where (x, y) are the coordinates of a pixel in the image. The circle features are detected as local maxima in scale space of the square of the normalized Laplacian operator [7], ∇²norm S = t(Sxx + Syy), where S is the scale-space representation. The radius r of a detected circle feature is proportional to its scale. For a planar hand shape, scale-space feature detection is effective at finding the palm area. In practice, the detection is executed over a wide range of scales to find a palm of any size, so there are always many circle features detected in the coarse hand region. To find the exact circle feature for the palm area, the following steps are taken: (1) Among all the extracted circles, only the circle features having a strong response to the detector are retained to form a cluster ball. (2) Select the circle feature bmax with maximum scale tmax in the cluster ball. (3) Denote the center of bmax by P; if the distance between P and C is below a threshold (set to 0.5r empirically, where r is the radius of the circle corresponding to bmax), then bmax is the desired circle feature for the palm. Otherwise, delete bmax from the cluster ball and jump to step (2).

Forearm cutting: Since the palm has been located in the coarse hand region, the remaining problem is to determine the cutting direction and cutting position from the results of the above steps. The forearm cutting can be implemented in the following steps:
1) Determine whether the cutting direction is horizontal or vertical according to the aspect ratio of the coarse hand region, r = w/h, where w and h are the width and height of the bounding rectangle of this region respectively. If r < T, then the forearm is cut along a horizontal direction; otherwise a vertical direction is used. Here, T is a pre-defined threshold, and in our case T = 1.2.
2) The spatial relationship between P and C is used to determine the cutting position, as follows:
(a) If r < T and C is above P, then the forearm is cut at the bottom of the palm circle in the horizontal direction.
(b) If r < T and C is below P, then the forearm is cut at the top of the palm circle in the horizontal direction.
(c) If r >= T and C is to the left of P, then the forearm is cut at the right end of the palm circle in the vertical direction.
(d) If r >= T and C is to the right of P, then the forearm is cut at the left end of the palm circle in the vertical direction.

Finally, a morphological close-open operation is performed on the resulting binary image to obtain a more refined hand region. A sketch of the cutting-direction decision described above is given below.
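The following small sketch encodes rules (a)-(d); the coordinate convention (image y grows downward) and the data structures are assumptions made for illustration, not part of the original method description.

```python
from dataclasses import dataclass

T = 1.2   # aspect-ratio threshold from the text above

@dataclass
class Point:
    x: float
    y: float   # image coordinates: y grows downward (assumption)

def forearm_cut(width, height, palm_center: Point, hand_center: Point, radius):
    """Return (direction, coordinate) of the cut line, following rules (a)-(d).

    width, height  -- bounding rectangle of the coarse hand region
    palm_center    -- P, center of the selected palm circle
    hand_center    -- C, rough hand center (mean of the frontal pixels)
    radius         -- radius of the palm circle
    """
    r = width / height
    if r < T:                                   # tall region -> horizontal cut
        if hand_center.y < palm_center.y:       # (a) C above P: cut below palm
            return "horizontal", palm_center.y + radius
        return "horizontal", palm_center.y - radius   # (b) C below P: cut above
    if hand_center.x < palm_center.x:           # (c) C left of P: cut at right end
        return "vertical", palm_center.x + radius
    return "vertical", palm_center.x - radius   # (d) C right of P: cut at left end

if __name__ == "__main__":
    # Tall region, C below P: cut horizontally at the top of the palm circle.
    print(forearm_cut(80, 200, Point(40, 60), Point(42, 120), 30))  # ('horizontal', 30)
```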


Chapter 4: Design
PROJECT SPECIFICATION

POWER SUPPLY:
  Motor: 9 V
  Sensor (accelerometer): 3.5 V
  Controller: 5 V
CONTROLLER USED: ATmega16 (AVR), 8-bit
SENSOR: ADXL335 accelerometer, three axes (x, y, z)
Speed of robot: 60 rpm
Maximum input channel capacity: 8 inputs
Drives a maximum of four motors.

CIRCUIT DIAGRAM


WORKING OF MODULE
This robot consists of mainly three parts. The first is the sensor, which works as the vision of the robot; we have used an accelerometer as the sensor. A gesture-controlled robot is a robot that is controlled by hand gestures rather than by conventional buttons. The user only needs to wear a small transmitting device on the hand, which includes an accelerometer. This transmits an appropriate command to the robot so that it can do whatever we want. The transmitting device includes a comparator IC for analog-to-digital conversion and an encoder IC (HT12E), which encodes the four-bit data before it is transmitted by an RF transmitter module. At the receiving end, an RF receiver module receives the encoded data and decodes it with a decoder IC (HT12D). This data is then processed by a microcontroller (P89V51RD2) and finally passed to the motor driver to control the motors.

As the user moves a hand in front of the robot, the sensor senses it and sends the corresponding signal for a decision. The output from the accelerometer is gathered for processing by the microcontroller. Based on the sensor output, the controller acts according to the program written inside it and sends the respective signal to the third part, the motors. This last part drives the wheels of the robot. Two DC motors make the movement; to drive them, one motor driver IC is used, which provides sufficient current to the motors. All of this hardware is mounted on a metal chassis. As we move our hand to the right, the robot moves to the right; similarly, it copies all our movements.

The ADXL335 is a small, thin, low-power, complete 3-axis accelerometer with signal-conditioned voltage outputs. The product measures acceleration with a minimum full-scale range of ±3 g. It can measure the static acceleration of gravity in tilt-sensing applications as well as dynamic acceleration resulting from motion, shock, or vibration.

One of the most common inertial sensors is the accelerometer, a dynamic sensor capable of a vast range of sensing. Accelerometers are available that can measure acceleration along one, two, or three orthogonal axes. They are typically used in one of three modes: as an inertial measurement of velocity and position; as a sensor of inclination, tilt, or orientation in 2 or 3 dimensions, as referenced from the acceleration of gravity (1 g = 9.8 m/s2); or as a vibration or impact (shock) sensor. There are considerable advantages to using an analog accelerometer as opposed to an inclinometer such as a liquid tilt sensor: inclinometers tend to output binary information (indicating a state of on or off), so it is only possible to detect when the tilt has exceeded some threshold angle.

Most accelerometers are Micro-Electro-Mechanical Sensors (MEMS). The basic principle of operation behind the MEMS accelerometer is the displacement of a small proof mass etched into the silicon surface of the integrated circuit and suspended by small beams. Consistent with Newton's second law of motion (F = ma), as an acceleration is applied to the device, a force develops which displaces the mass. The support beams act as a spring, and the fluid (usually air) trapped inside the IC acts as a damper, resulting in a second-order lumped physical system, as sketched below. This is the source of the limited operational bandwidth and non-uniform frequency response of accelerometers. For more information, see the reference to Elwenspoek, 1993.
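As a brief illustration of that spring-mass-damper model (standard second-order dynamics, not taken from the report), the proof-mass displacement x under an applied acceleration a obeys:

```latex
% Lumped model of the MEMS proof mass (mass m, damper b, spring k):
\[
  m\,\ddot{x}(t) + b\,\dot{x}(t) + k\,x(t) = m\,a(t),
  \qquad
  \omega_n = \sqrt{k/m},
  \qquad
  \zeta = \frac{b}{2\sqrt{km}},
  \qquad
  x_{\text{static}} \approx \frac{m}{k}\,a .
\]
```

The usable measurement bandwidth is limited to frequencies well below the natural frequency of this system, which is the bandwidth limitation mentioned above.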

The proposed methodology is able to use a live video camera for gesture identification. It sniffs frames of the live video stream at a fixed interval; in our case the frame capture rate for gesture search is 3 frames per second. The proposed technique to control a robotic system using hand gesture display is divided into four subparts: capture a frame containing some gesture presentation; extract the hand gesture area from the captured frame; determine the gesture by pattern matching using the PCA algorithm; and determine the control instruction corresponding to the matched gesture and give that instruction to the specified robotic system. The block diagram above shows the flow of the whole system, i.e., hand gesture identification and robot control. A gesture is captured by taking a snapshot from the continuous video. The captured image is searched for a valid hand gesture. The region showing the gesture is then cropped out and the image is resized to match the gestures in the database. On the basis of the gesture identified by pattern matching, the control instruction is determined from the stored instruction set. The selected instruction, corresponding to the recognized hand gesture, is given to the robot for carrying out the control action. A sketch of the pattern-matching and command-dispatch steps appears below.
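The following is a minimal, hypothetical sketch of those two subparts: gesture images are projected into a PCA (eigen-gesture) space and the nearest database gesture selects a command. The database contents, image size, number of components and command table are illustrative assumptions, not values from the report.

```python
import numpy as np

COMMANDS = {"left": "TURN_LEFT", "right": "TURN_RIGHT",
            "palm": "STOP", "fist": "FORWARD"}       # assumed command table

class PCAGestureMatcher:
    """Project cropped, resized gesture images into PCA space and match a
    probe image to the nearest database gesture."""

    def __init__(self, images, labels, n_components=8):
        X = np.stack([im.ravel().astype(float) for im in images])
        self.mean = X.mean(axis=0)
        # Principal axes from the SVD of the mean-centered database.
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.components = vt[:n_components]
        self.db = (X - self.mean) @ self.components.T
        self.labels = labels

    def match(self, image):
        vec = (image.ravel().astype(float) - self.mean) @ self.components.T
        dists = np.linalg.norm(self.db - vec, axis=1)
        return self.labels[int(np.argmin(dists))]

def control_instruction(image, matcher):
    """Map the recognized gesture to the robot command to be sent."""
    return COMMANDS[matcher.match(image)]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    db_images = [rng.random((32, 32)) for _ in range(4)]   # stand-in database
    matcher = PCAGestureMatcher(db_images, ["left", "right", "palm", "fist"])
    probe = db_images[2] + 0.05 * rng.random((32, 32))     # noisy repeat of "palm"
    print(control_instruction(probe, matcher))             # expected: STOP
```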


Chapter 5: Conclusions
We present MoBo for personalized gesture-based interaction. MoBo employs a single accelerometer so that it can be readily implemented on many commercially available consumer electronics and mobile devices. The core of MoBo includes 1) dynamic time warping (DTW) to measure similarities between two time series of accelerometer readings; 2) quantization for reducing the computation load and suppressing noise and non-intrinsic variations in gesture performance; and 3) template adaptation for coping with gesture variation over time. Its simplicity and efficiency allow implementation on a wide range of devices, including simple 16-bit microcontrollers, as long as an accelerometer is available.

We evaluate the application of MoBo to user-dependent recognition of predefined gestures with over 4000 samples collected from eight users over multiple weeks. Our experiments demonstrate that MoBo achieves 98.6% accuracy starting with only one training sample. This is comparable to the accuracy reported for HMM-based methods [6] with 12 training samples (98.9%). We show that quantization improves recognition accuracy and reduces the computation load. Our evaluation also highlights the challenge that variations over time pose to user-dependent gesture recognition, and the challenge that variations across users pose to user-independent gesture recognition.

We believe MoBo is a first major step toward building technology that facilitates personalized gesture recognition. Its accurate recognition with one training sample is critical to the adoption of personalized gesture recognition in a range of devices and platforms, and it has the potential to enable novel gesture-based navigation and operation of next-generation user interfaces.

In today's digitized world, processing speeds have increased dramatically, with computers advanced to the level where they can assist humans in complex tasks. Yet input technologies seem to cause a major bottleneck in performing some of these tasks, under-utilizing the available resources and restricting the expressiveness of applications. Hand gesture recognition comes to the rescue here. Robotic control depends on accurate hand gesture detection, and hand gesture detection directly depends on lighting quality. Besides the robustness of the system, the proposed method for controlling a robot using hand gestures is very fast. This methodology can be extended to more complex robots in the fields of computer vision and robotics.


References
[1] Kela, J., Korpipää, P., Mäntyjärvi, J., Kallio, S., Savino, G., Jozzo, L., and Marca, D. Accelerometer-based gesture control for a design environment. Personal Ubiquitous Computing, 10(5), 285-299, July 2006.
[2] Hofmann, F. G., Heyer, P., and Hommel, G. Velocity profile based recognition of dynamic gestures with discrete hidden Markov models. Proc. Int. Gesture Wrkshp. Gesture and Sign Language in Human-Computer Interaction, September 1997.
[3] Jang, I. J. and Park, W. B. Signal processing of the accelerometer for gesture awareness on handheld devices. Proc. 12th IEEE Int. Wrkshp. Robot and Human Interactive Communication, 2003.
[4] Perng, J. K., Fisher, B., Hollar, S., and Pister, K. S. J. Acceleration sensing glove (ASG). Proc. Int. Symp. Wearable Computers, 178-180, 18-19 October 1999.
[5] Ruize Xu, Shengli Zhou, and Wen J. Li. MEMS accelerometer based nonspecific-user hand gesture recognition. IEEE Sensors Journal, vol. 12, no. 5, May 2012.
[6] Mäntyjärvi, J., Kela, J., Korpipää, P., and Kallio, S. Enabling fast and effortless customisation in accelerometer based gesture interaction. Proc. 3rd Int. Conf. Mobile and Ubiquitous Multimedia, October 2004.
[7] Analog Devices. Small, Low Power, 3-Axis ±3 g iMEMS Accelerometer, ADXL330 datasheet, 2006.
[8] Cao, X. and Balakrishnan, R. VisionWand: interaction techniques for large displays using a passive wand tracked in 3D. Proc. 16th Annual ACM Symp. User Interface Software and Technology, November 2003.
[9] McInnes, F. R., Jack, M. A., and Laver, J. Template adaptation in an isolated word-recognition system. IEE Proceedings, Vol. 136, Pt. I, No. 2, April 1989.
[10] Myers, C. S. and Rabiner, L. R. A comparative study of several dynamic time-warping algorithms for connected word recognition. The Bell System Technical Journal, 60(7):1389-1409, September 1981.
[11] Nintendo Wii, http://www.nintendo.com/wii/.
[12] Rice Orbit Sensor Platform, http://www.recg.org/orbit/.
[13] Baudel, T. and Beaudouin-Lafon, M. Charade: remote control of objects using free-hand gestures. Communications of the ACM, 36(7), 28-35, July 1993.
[14] Rabiner, L. R. and Juang, B. H. An introduction to hidden Markov models. IEEE ASSP Magazine, pp. 4-15, January 1986.
[15] Salvador, S. and Chan, P. FastDTW: Toward accurate dynamic time warping in linear time and space. Proc. ACM Wkshp. Mining Temporal and Sequential Data, August 2004.
[16] http://seminarprojects.net/t-hand-gesture-for-human-machineinteraction?pid=31421&mode=threaded.
[17] http://pranjalnrobotics.blogspot.in/2012/08/wireless-gesture-controlled-robot-using.html.
[18] Chernick, Michael R. Bootstrap Methods: A Practitioner's Guide. Wiley Series in Probability and Statistics, 1999.

