ABSTRACT
We present a robust vision-based skin-colour
segmentation method for moving hands in a real-time
application. Segmentation of hands is an important
processing step in gesture recognition applications,
where the general shape and position of the hands are of
interest. In contrast to these approaches, the presented
method concentrates on an accurate segmentation,
which is required for further processing steps in a realtime videoconferencing application. A hand tracking
procedure is applied to improve the segmentation in
terms of accuracy, robustness and processing speed.
Furthermore, the presented approach can handle
difficult situations like contact between the hands or
contact between face and hands. This is important for many
real-time applications, e.g. for the presented
videoconference system, to allow the conferees a natural
behaviour. Moreover, we present an approach for an
automatic initialisation of the skin-colour range to the
specific user. We show experimental results proving the
efficiency and reliability of our approach. The proposed
hand segmentation method is capable of processing TV-sized (CCIR 601, 576x720 pixels) video images in real-time at 25 Hz on a common PC. The presented
approach will support any video processing in visual
media production, where segmentation accuracy and
real-time capability is required.
1 INTRODUCTION
Numerous applications use skin-colour as one of the
basic features for detecting or analysing human face or
hands. They have different aims and different
constraints under which the human face or hands are
being analysed. One crucial point, which is common for
most of the applications in this context, is an accurate
segmentation of human face or hands. Many
applications deal with segmentation of hands, such as
hand sign recognition, human vehicle interaction,
human computer interfaces, but common to all is that a rough segmentation result suffices, since other features are derived from it (refer to Cui (1), Imagawa (2), Guo (3), Zhu (4), Starner (5)). In some hand segmentation approaches marked gloves are used, which are not applicable in video conferencing systems (see Dorfmueller (6)). In other approaches, infrared cameras are used or depth information based on multiple views is exploited, e.g. Sato
(Figure: block diagram of the system: initialisation, i.e. determination of the skin-colour range on the sub-sampled image; hand tracking on the sub-sampled image; skin-colour segmentation on the original image size)
two terms: 1) global skin-colour, representing skin-colour in a general way with large tolerance values, and 2) specific skin-colour, representing the skin-colour of the specific person under certain illumination conditions, described by specific mean values and reduced tolerances.
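The two-term representation can be sketched as follows. This is a minimal illustration, not the paper's implementation; the class name, the Cb/Cr colour components and all numeric mean/tolerance values are assumptions:

```python
from dataclasses import dataclass

@dataclass
class SkinColourRange:
    """Skin-colour range as per-channel mean and tolerance (here: Cb/Cr)."""
    mean: tuple  # (mean_cb, mean_cr)
    tol: tuple   # (tol_cb, tol_cr)

    def contains(self, cb: float, cr: float) -> bool:
        # A pixel is skin-coloured if both components lie within the tolerance
        return (abs(cb - self.mean[0]) <= self.tol[0]
                and abs(cr - self.mean[1]) <= self.tol[1])

# 1) Global skin-colour: large tolerances, covers skin colour in general
global_range = SkinColourRange(mean=(110.0, 150.0), tol=(20.0, 20.0))

# 2) Specific skin-colour: refined mean, reduced tolerances for one user
user_range = SkinColourRange(mean=(114.0, 152.0), tol=(8.0, 7.0))
```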
Hence, an important question arises: how should appropriate skin-colour parameters be determined for a scenario to achieve the best segmentation results? Applying parameters from a general statistical analysis of skin-colour does not lead to optimal segmentation results in the majority of cases. However, they can often serve as good coarse start values from which appropriate parameters are found by slight variation.
One option is to adapt the thresholds manually at the beginning of the segmentation. This is obviously not convenient in terms of usability and user friendliness in the case of a video conferencing system. Therefore, a quasi-automatic method is presented to find suitable parameters. Nevertheless, in the case of extremely dark or extremely bright illumination, an additional manual adjustment is unavoidable. However, we have found that for brighter illumination it is reasonable to choose larger tolerance values than for dark scenes.
The initialisation step is performed on the sub-sampled image for real-time and stability reasons. Besides the desired skin-colour range, it also provides the three centres of gravity of the two hands and the head, which are used as start positions for the bounding boxes. In the first image, a pixelwise skin-colour segmentation is performed. For the initial skin-colour range, threshold values are obtained from a statistical analysis of a number of images representing the global skin-colour cloud. After applying the global thresholds, a rough binary mask is obtained, which is filtered to reduce noise.
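A minimal sketch of this step, assuming a Cb/Cr colour representation and hypothetical global threshold values; the simple majority-vote median filter stands in for whatever noise filter is actually used:

```python
import numpy as np

def segment_skin(cb: np.ndarray, cr: np.ndarray,
                 mean=(110.0, 150.0), tol=(20.0, 20.0)) -> np.ndarray:
    """Pixelwise test against the global skin-colour range -> rough binary mask."""
    mask = ((np.abs(cb - mean[0]) <= tol[0]) &
            (np.abs(cr - mean[1]) <= tol[1]))
    return mask.astype(np.uint8)

def median_filter3(mask: np.ndarray) -> np.ndarray:
    """3x3 median filter to suppress isolated noise pixels in the binary mask."""
    padded = np.pad(mask, 1)
    # Stack the 9 shifted views of the padded mask; for binary data the
    # 3x3 median equals a majority vote (at least 5 of 9 neighbours set).
    stack = np.stack([padded[i:i + mask.shape[0], j:j + mask.shape[1]]
                      for i in range(3) for j in range(3)])
    return (stack.sum(axis=0) >= 5).astype(np.uint8)
```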
The goal of the following process is to determine the blob positions of both hands and the head and to calculate new, more accurate skin-colour threshold values in the distinct areas. Hence, the row and column histograms of the binary image are calculated, which represent the skin-coloured pixel distribution in the horizontal and vertical directions (Fig. 4).
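These projections and the resulting centre of gravity can be sketched as follows (a minimal NumPy illustration; the function names are hypothetical):

```python
import numpy as np

def blob_projections(mask: np.ndarray):
    """Row and column histograms of the binary mask: skin-pixel counts
    per row and per column of the image."""
    row_hist = mask.sum(axis=1)  # one count per row
    col_hist = mask.sum(axis=0)  # one count per column
    return row_hist, col_hist

def centre_of_gravity(mask: np.ndarray):
    """Centre of gravity (cy, cx) of the skin pixels, via the projections."""
    row_hist, col_hist = blob_projections(mask)
    total = row_hist.sum()
    cy = (np.arange(mask.shape[0]) * row_hist).sum() / total
    cx = (np.arange(mask.shape[1]) * col_hist).sum() / total
    return cy, cx
```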
Fig. 10: Contact of hand boxes
If both hands are in contact with each other, the bounding boxes overlap. If the hands come apart, the bounding boxes must obviously be separated, which is again not trivial. To overcome this problem, the following approach has been implemented to separate the bounding boxes when the hands get separated. For each bounding box, preferred directions are defined, e.g. the left-bottom edges for one box and the right-top edges for the other. While the hands are in contact, the boxes are only allowed to move in their preferred directions. Once the hands are no longer connected, the preference is switched off and the movement of the bounding boxes is no longer restricted. Example images of a sequence are presented in the next section.
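The preferred-direction rule might be sketched like this; the box representation, function names and sign convention are assumptions, not the paper's implementation:

```python
def boxes_overlap(a, b) -> bool:
    """a, b: bounding boxes as (x0, y0, x1, y1)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def constrained_move(box, dx, dy, pref, in_contact):
    """Apply a tracked displacement (dx, dy) to a box. While the boxes are
    in contact, keep only the components pointing in the preferred
    direction `pref` (signs: -1 = left/up, +1 = right/down, 0 = free)."""
    if in_contact:
        if pref[0] and dx * pref[0] < 0:
            dx = 0  # suppress movement against the preferred x-direction
        if pref[1] and dy * pref[1] < 0:
            dy = 0  # suppress movement against the preferred y-direction
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
```

For example, a box whose preference is left-bottom (`pref = (-1, 1)`) simply ignores any rightward or upward displacement as long as the hands are still connected.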
If a hand has contact with the face, the following is performed: in addition to the hands, the head blob of the participant resulting from the initialisation phase is tracked as well in the sub-sampled image, using a third bounding box.
7 EXPERIMENTAL RESULTS
The presented methods run in real-time on a standard PC (Pentium IV, 2 GHz) on full TV-resolution video (576x720 pixels at 25 Hz). Hence, all situations, such as different behaviours and gestures, could be tested under real conditions. The following images, extracted from a sequence, show the robustness in several situations and the accuracy of the segmentation.
In Fig. 12, an example is given where the hands contact each other and come apart. After the contact, tracking is still successful and the bounding boxes are separated correctly.
In Fig. 13, a misleading tracking is shown. In this case, the right hand box gets lost after contact of the hand with the face; instead, the face of the person is wrongly tracked. The successful operation of our method is shown in Fig. 14 and Fig. 15 for situations where a single hand, and also both hands, have contact with the face region.
Fig. 13: Contact of hand and head box, hand box is lost
Fig. 15: Contact of both hands and head together, correct tracking (order: left to right)
The image series (Fig. 14) shows that the right hand box still tracks the hand correctly after the contact, using the head box processing method. Despite the robust tracking, it must be noted that our algorithm cannot determine the contours of the objects while they are connected. Only when they are separated does our application make use of the contours determined in the single boxes.
Finally, Fig. 15 gives an example where both hands touch the head at the same time. After separation, each box correctly tracks the corresponding object.
Some assumptions for our skin-colour segmentation method have been made:
- no sudden change of illumination
- long sleeves on the clothes worn
- normal motion speed of the hands while gesticulating
Minor changes in the illumination can easily be handled, since in every new image the actual skin-coloured pixels are determined; based on these pixels, new thresholds can be derived.
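Such a frame-by-frame threshold update could look like the following sketch; the standard-deviation-based tolerance and the `tol_scale` factor are assumptions, not the paper's formula:

```python
import numpy as np

def update_thresholds(cb: np.ndarray, cr: np.ndarray,
                      mask: np.ndarray, tol_scale: float = 2.5):
    """Re-estimate mean and tolerance from the pixels currently classified
    as skin, so the skin-colour range follows minor illumination changes.
    tol_scale (tolerance in standard deviations) is a hypothetical choice."""
    skin = mask.astype(bool)
    skin_cb, skin_cr = cb[skin], cr[skin]
    mean = (skin_cb.mean(), skin_cr.mean())
    tol = (tol_scale * skin_cb.std(), tol_scale * skin_cr.std())
    return mean, tol
```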
The restriction to long sleeves is mainly determined by the size of the bounding boxes: a larger bounding box increases the computational effort, which may result in a lower frame rate under certain circumstances.
9 ACKNOWLEDGEMENT
This work is supported by the Deutsche Forschungsgemeinschaft (DFG) under grant number DD 20 9 11.
REFERENCES