
• Over recent years, computer vision has started to play a significant role in Human-Computer Interaction (HCI).

• With efficient object tracking algorithms, it is possible to track the motion of a human hand in real time using a simple web camera.

• This presentation discusses the design of a system that tracks the fingertip of the index finger for the purpose of controlling the mouse pointer on the screen.
• A single camera (web camera) is used to track the motion of the fingertip in real time.

• The camera is mounted on top of the computer monitor or hooked onto the laptop screen.
• To move the mouse pointer, the user moves his index finger on a 2D plane (for example on a table). All other fingers are folded to form a fist.

• The underlying tracking algorithm works by segmenting the skin colour from the background. For efficient tracking of the fingertip, the user's other hand must not be in the scene.
• To perform a left click, the user unfolds his thumb and then folds it back.

• Please note that if the hand pose is not as described above (all fingers folded to form a fist except the index finger), then we simply track the hand but do not move the mouse pointer. Similarly, clicks are recognized only if the index finger was spotted in the scene.
• The system consists of three parts:

[Diagram: block overview of the three system parts, with training stages labelled "One Time" and "Continuous"]


• Training allows the system to learn variations of skin colour due to changes in illumination and hand pose.

• A 2D look-up table of Hue and Saturation values (probability density values) from the HSV colour space is computed.
• A CAMSHIFT algorithm is used to detect the hand region, that is, the detection of the hand region is confined to a Region of Interest (ROI).

• Initially, the ROI is set to the entire captured frame.

• The hand region is detected using the H and S values from the 2D look-up table computed during the training phase.

• A threshold operation gives us the binary image B(x, y) of the hand region, where B(x, y) = 1 for pixels in the skin region.


• The centre of the hand region (xc, yc) and its orientation (θ) are calculated using the 0th, 1st, and 2nd order image moments.

• The ROI of the next frame is then computed from this binary image.

• The centre of the ROI for the next frame is set to the centre of the hand region in the current frame.
• The horizontal and vertical lengths (Rx, Ry) of the ROI of the next frame are computed as follows:

Rx = sx · M00,  Ry = sy · M00

where

sx = cos|θ| + 1,  sy = sin|θ| + 1

and M00 is the 0th-order image moment.
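Written out as a small helper, assuming theta and the 0th-order moment M00 come from the moment computation above:

    import math

    def next_roi_size(theta, m00):
        # ROI lengths for the next frame, following the formulas above
        sx = math.cos(abs(theta)) + 1.0
        sy = math.sin(abs(theta)) + 1.0
        return sx * m00, sy * m00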
• To detect the fingertip, the system first determines whether the hand is in the mouse-pointer pose (only the index finger unfolded, the rest folded to form a fist).

• This is done by first cropping the image around the hand region, smoothing it with a Gaussian kernel to reduce noise, and then analyzing the shape of the hand pose.
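For illustration, the cropping and smoothing could look like the following; the 5x5 kernel size and the re-thresholding value are assumptions.

    import cv2

    def preprocess_hand(frame_binary, roi):
        x, y, w, h = roi
        cropped = frame_binary[y:y+h, x:x+w]
        # Gaussian smoothing reduces segmentation noise before shape analysis
        smoothed = cv2.GaussianBlur(cropped, (5, 5), 0)
        # Re-binarise after blurring
        _, clean = cv2.threshold(smoothed, 127, 255, cv2.THRESH_BINARY)
        return clean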

• A simple method is implemented to analyze the shape of the hand.
• The image is converted into a binary image and the system scans it from top to bottom, row by row.

• The system counts the pixels it encounters in each row and tries to match the count with the width of the finger.

• If enough rows are found with a pixel count greater than or equal to the width of the finger, the system proceeds to check whether a fist follows the finger.
• If enough rows with a pixel count greater than the fist width are found, the system reports that the pointing pose is detected.

• Essentially, a finite state machine has been implemented to detect the pointing pose in the image by analyzing the number of pixels in each row.

• The dimensions of the finger and fist are determined during the on-line training process.
• The x coordinate of the fingertip is set to the first row having a pixel count greater than the finger width.

• The y coordinate is set to the centre of the finger.
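A simplified sketch of this row-scanning finite state machine; the finger and fist widths would come from the on-line training, while the minimum row counts used here are assumed values.

    def detect_pointing_pose(binary, finger_width, fist_width,
                             min_finger_rows=10, min_fist_rows=10):
        # Scan the binary hand image top to bottom, counting skin pixels per row.
        state = 'SEARCH_FINGER'
        finger_rows = fist_rows = 0
        tip = None
        for r in range(binary.shape[0]):
            row = binary[r] > 0
            count = int(row.sum())
            if state == 'SEARCH_FINGER':
                if count >= finger_width:
                    if tip is None:
                        cols = row.nonzero()[0]
                        # Fingertip: first sufficiently wide row and its centre column
                        tip = (r, int(cols.mean()))
                    finger_rows += 1
                    if finger_rows >= min_finger_rows:
                        state = 'SEARCH_FIST'
            else:  # SEARCH_FIST: a wider fist region must follow the finger
                if count >= fist_width:
                    fist_rows += 1
                    if fist_rows >= min_fist_rows:
                        return True, tip
        return False, None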


• First, the system proceeds to find out whether the thumb is present in the image.

• For this, a similar finite state machine is implemented, but this time the system scans the image column by column to determine whether the thumb is present.

• Depending on whether the user is left-handed or right-handed, the scanning begins from the left or the right side of the image.
• The system tries to find out whether enough columns with a pixel count greater than the thumb width are present in the image.

• If a sufficient number of such columns is detected, the system checks whether a fist is present in the image.

• Once the fist is also detected, the system declares that a thumb has been detected in the image, which is equivalent to a mouse left-button-down event.
• When the user folds his thumb back, the system generates a mouse button-up event.

• Also note that if the pointing pose was not detected in the image, then mouse clicks are not detected at all.

• The system maintains the status of the mouse button (left button only).
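A comparable sketch for the thumb scan and the button state it drives; the scan direction, widths, and column-count thresholds are assumptions in the same spirit as the row scan above.

    def detect_thumb(binary, thumb_width, fist_height, right_handed=True,
                     min_thumb_cols=8, min_fist_cols=8):
        # Scan columns starting from the side where the thumb is expected to appear.
        h, w = binary.shape
        columns = range(w - 1, -1, -1) if right_handed else range(w)
        state, thumb_cols, fist_cols = 'SEARCH_THUMB', 0, 0
        for c in columns:
            count = int((binary[:, c] > 0).sum())
            if state == 'SEARCH_THUMB':
                if count >= thumb_width:
                    thumb_cols += 1
                    if thumb_cols >= min_thumb_cols:
                        state = 'SEARCH_FIST'
            else:  # SEARCH_FIST: the fist must follow the thumb
                if count >= fist_height:
                    fist_cols += 1
                    if fist_cols >= min_fist_cols:
                        return True
        return False

    class LeftButtonState:
        # Thumb appears -> button down; thumb folds back -> button up.
        def __init__(self):
            self.down = False

        def update(self, pointing_pose, thumb_present):
            if not pointing_pose:
                return None            # clicks are only recognised in the pointing pose
            if thumb_present and not self.down:
                self.down = True
                return 'left_button_down'
            if not thumb_present and self.down:
                self.down = False
                return 'left_button_up'
            return None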
• Once the fingertip is detected, we need to map its coordinates to the coordinates of the mouse pointer on the monitor.

• However, we cannot use the fingertip locations directly due to the following problems:
• Noise from sources such as segmentation error makes it difficult to position the mouse pointer accurately.

• Due to the limited tracking rate, we may get discontinuous fingertip coordinates.

• The difference in resolution between the camera's input image and the monitor makes it difficult to position the mouse pointer accurately.
• The displacement of the detected fingertip is averaged over a few frames, and this average displacement is used to move the mouse cursor on the screen.

• Also, if this displacement is found to be less than a threshold value, the mouse cursor is not moved.
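A small sketch of this smoothing step; the window size, movement threshold, and gain factor are assumed values.

    from collections import deque

    class CursorMapper:
        # Averages the fingertip displacement over the last few frames and
        # ignores movements below a small threshold to suppress jitter.
        def __init__(self, window=5, min_move=2.0, gain=2.0):
            self.history = deque(maxlen=window)
            self.prev = None
            self.min_move = min_move
            self.gain = gain   # scales camera-space motion to screen-space motion

        def update(self, tip):
            if self.prev is None:
                self.prev = tip
                return 0.0, 0.0
            dx, dy = tip[0] - self.prev[0], tip[1] - self.prev[1]
            self.prev = tip
            self.history.append((dx, dy))
            ax = sum(d[0] for d in self.history) / len(self.history)
            ay = sum(d[1] for d in self.history) / len(self.history)
            if (ax * ax + ay * ay) ** 0.5 < self.min_move:
                return 0.0, 0.0
            return ax * self.gain, ay * self.gain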

• This application gives the user an easy way to move the mouse cursor on the screen.

• Since the user is using his hand, the homing time (time to place the hand on the mouse) is reduced a lot.

• The click is implemented with a very easy gesture.

• With more robust fingertip detection, this application can replace the use of a mouse.
• If the background contains colours similar to skin, the algorithm will lose track of the hand or falsely report its location.

• When the camera's height is changed, the system has reported false pose detections. A better way to detect the pointing pose would be to use a machine learning algorithm (for example, a neural network).

• The mouse cursor movement on the screen required more smoothing. Also, the user is not able to cover the entire screen.
