
• Over recent years, computer vision has started to play a significant role in Human-Computer Interaction (HCI).

• With efficient object tracking algorithms, it is possible to track the motion of a human hand in real time using a simple web camera.

• This presentation discusses the design of a system that tracks the fingertip of the index finger for the purpose of controlling the mouse pointer on the screen.
• A single camera (web camera) is used to track the motion of the fingertip in real time.

• The camera is mounted on top of the computer monitor or hooked onto the laptop screen.
• To move the mouse pointer, the user moves his index finger on a 2D plane (for example on a table). All other fingers are folded to form a fist.

• The underlying tracking algorithm works by segmenting the skin colour from the background. For efficient tracking of the fingertip, the user's other hand must not be in the scene.
• To perform a left click, the user unfolds his thumb and then folds it back.

• Please note that if the hand pose is not as described above (all fingers folded to form a fist except the index finger), then we simply track the hand but do not move the mouse pointer. Similarly, clicks are recognized only if the index finger was spotted in the scene.
• The system consists of three parts:

[Diagram: block overview of the three system parts, with training stages labelled "One Time" and "Continuous"]


• Training allows the system to learn variations of skin colour due to changes in illumination and hand pose.

• A 2D look-up table of Hue and Saturation values (probability density values) from the HSV colour space is computed.
• A CAMSHIFT algorithm is used to detect the hand region, that is, the detection of the hand region is confined to a Region of Interest (ROI).

• Initially, the ROI is set to the entire captured frame.

• The hand region is detected using the H and S values from the 2D look-up table computed during the training phase.

• A threshold operation gives us the binary image B(x, y) of the hand region, where B(x, y) = 1 for pixels in the skin region.


• The centre of the hand region (xc, yc) and its orientation (θ) are calculated using the 0th, 1st, and 2nd order image moments.

• The ROI of the next frame is then computed from this binary image.

• The centre of the ROI for the next frame is set to the centre of the hand region in the current frame.
• The horizontal and vertical lengths (Rx, Ry) of the ROI of the next frame are computed as follows:

Rx = sx · M00,  Ry = sy · M00

where

sx = cos|θ| + 1,  sy = sin|θ| + 1

and M00 is the 0th-order image moment.
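Written out as a small helper, assuming theta and the 0th-order moment M00 come from the moment computation above:

    import math

    def next_roi_size(theta, m00):
        # ROI lengths for the next frame, following the formulas above
        sx = math.cos(abs(theta)) + 1.0
        sy = math.sin(abs(theta)) + 1.0
        return sx * m00, sy * m00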
• To detect the fingertip, the system first determines whether the hand is in the mouse-pointer pose (only the index finger unfolded, the rest folded to form a fist).

• This is done by first cropping the image around the hand region, smoothing it with a Gaussian kernel to reduce noise, and then analyzing the shape of the hand pose.
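For illustration, the cropping and smoothing could look like the following; the 5x5 kernel size and the re-thresholding value are assumptions.

    import cv2

    def preprocess_hand(frame_binary, roi):
        x, y, w, h = roi
        cropped = frame_binary[y:y+h, x:x+w]
        # Gaussian smoothing reduces segmentation noise before shape analysis
        smoothed = cv2.GaussianBlur(cropped, (5, 5), 0)
        # Re-binarise after blurring
        _, clean = cv2.threshold(smoothed, 127, 255, cv2.THRESH_BINARY)
        return clean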

• A simple method is implemented to analyze the shape of the hand.
• The image is converted into a binary image and the system scans it from top to bottom, row by row.

• The system counts the pixels it encounters in each row and tries to match the count with the width of the finger.

• If enough rows are found with a pixel count greater than or equal to the width of the finger, the system proceeds to check whether a fist follows the finger.
• If enough rows with a pixel count greater than the fist width are found, the system reports that the pointing pose is detected.

• Essentially, a finite state machine has been implemented to detect the pointing pose in the image by analyzing the number of pixels in each row.

• The dimensions of the finger and fist are determined during the on-line training process.
• The x coordinate of the fingertip is set to the first row having a pixel count greater than the finger width.

• The y coordinate is set to the centre of the finger.
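A simplified sketch of this row-scanning finite state machine; the finger and fist widths would come from the on-line training, while the minimum row counts used here are assumed values.

    def detect_pointing_pose(binary, finger_width, fist_width,
                             min_finger_rows=10, min_fist_rows=10):
        # Scan the binary hand image top to bottom, counting skin pixels per row.
        state = 'SEARCH_FINGER'
        finger_rows = fist_rows = 0
        tip = None
        for r in range(binary.shape[0]):
            row = binary[r] > 0
            count = int(row.sum())
            if state == 'SEARCH_FINGER':
                if count >= finger_width:
                    if tip is None:
                        cols = row.nonzero()[0]
                        # Fingertip: first sufficiently wide row and its centre column
                        tip = (r, int(cols.mean()))
                    finger_rows += 1
                    if finger_rows >= min_finger_rows:
                        state = 'SEARCH_FIST'
            else:  # SEARCH_FIST: a wider fist region must follow the finger
                if count >= fist_width:
                    fist_rows += 1
                    if fist_rows >= min_fist_rows:
                        return True, tip
        return False, None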


• First, the system proceeds to find out whether the thumb is present in the image.

• For this, a similar finite state machine is implemented, but this time the system scans the image column by column to determine whether the thumb is present.

• Depending on whether the user is left-handed or right-handed, the scanning begins from the left or the right side of the image.
• The system tries to find out whether enough columns with a pixel count greater than the thumb width are present in the image.

• If a sufficient number of such columns is detected, the system checks whether a fist is present in the image.

• Once the fist is also detected, the system declares that a thumb has been detected in the image, which is equivalent to a mouse left-button-down event.
• When the user folds his thumb back, the system generates a mouse button-up event.

• Also note that if the pointing pose was not detected in the image, then mouse clicks are not detected at all.

• The system maintains the status of the mouse button (left button only).
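A comparable sketch for the thumb scan and the button state it drives; the scan direction, widths, and column-count thresholds are assumptions in the same spirit as the row scan above.

    def detect_thumb(binary, thumb_width, fist_height, right_handed=True,
                     min_thumb_cols=8, min_fist_cols=8):
        # Scan columns starting from the side where the thumb is expected to appear.
        h, w = binary.shape
        columns = range(w - 1, -1, -1) if right_handed else range(w)
        state, thumb_cols, fist_cols = 'SEARCH_THUMB', 0, 0
        for c in columns:
            count = int((binary[:, c] > 0).sum())
            if state == 'SEARCH_THUMB':
                if count >= thumb_width:
                    thumb_cols += 1
                    if thumb_cols >= min_thumb_cols:
                        state = 'SEARCH_FIST'
            else:  # SEARCH_FIST: the fist must follow the thumb
                if count >= fist_height:
                    fist_cols += 1
                    if fist_cols >= min_fist_cols:
                        return True
        return False

    class LeftButtonState:
        # Thumb appears -> button down; thumb folds back -> button up.
        def __init__(self):
            self.down = False

        def update(self, pointing_pose, thumb_present):
            if not pointing_pose:
                return None            # clicks are only recognised in the pointing pose
            if thumb_present and not self.down:
                self.down = True
                return 'left_button_down'
            if not thumb_present and self.down:
                self.down = False
                return 'left_button_up'
            return None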
• Once the fingertip is detected, we need to map its coordinates to the coordinates of the mouse pointer on the monitor.

• However, we cannot use the fingertip locations directly due to the following problems:
• Noise from sources such as segmentation error makes it difficult to position the mouse pointer accurately.

• Due to the limited tracking rate, we may get discontinuous fingertip coordinates.

• The difference in resolution between the camera's input image and the monitor makes it difficult to position the mouse pointer accurately.
• The displacement of the detected fingertip is averaged over a few frames, and this average displacement is used to move the mouse cursor on the screen.

• Also, if this displacement is found to be less than a threshold value, the mouse cursor is not moved.
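A small sketch of this smoothing step; the window size, movement threshold, and gain factor are assumed values.

    from collections import deque

    class CursorMapper:
        # Averages the fingertip displacement over the last few frames and
        # ignores movements below a small threshold to suppress jitter.
        def __init__(self, window=5, min_move=2.0, gain=2.0):
            self.history = deque(maxlen=window)
            self.prev = None
            self.min_move = min_move
            self.gain = gain   # scales camera-space motion to screen-space motion

        def update(self, tip):
            if self.prev is None:
                self.prev = tip
                return 0.0, 0.0
            dx, dy = tip[0] - self.prev[0], tip[1] - self.prev[1]
            self.prev = tip
            self.history.append((dx, dy))
            ax = sum(d[0] for d in self.history) / len(self.history)
            ay = sum(d[1] for d in self.history) / len(self.history)
            if (ax * ax + ay * ay) ** 0.5 < self.min_move:
                return 0.0, 0.0
            return ax * self.gain, ay * self.gain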

• This application gives the user an easy way to move the mouse cursor on the screen.

• Since the user is using his hand, the homing time (time to place the hand on the mouse) is reduced a lot.

• The click is implemented with a very easy gesture.

• With more robust fingertip detection, this application can replace the use of a mouse.
• If the background contains colours similar to skin, the algorithm will lose track of the hand or falsely report its location.

• When the camera's height is changed, the system has reported false pose detections. A better way to detect the pointing pose would be to use a machine learning algorithm (for example, a neural network).

• The mouse cursor movement on the screen required more smoothing. Also, the user is not able to cover the entire screen.
