
IEICE TRANS. FUNDAMENTALS, VOL.E92-A, NO.11 NOVEMBER 2009

2858

LETTER

Special Section on Smart Multimedia & Communication Systems

An Implementation of Privacy Protection for a Surveillance Camera Using ROI Coding of JPEG2000 with Face Detection
Mitsuji MUNEYASU a), Senior Member, Shuhei ODANI, Yoshihiro KITAURA,
and Hitoshi NAMBA, Nonmembers

SUMMARY
When a surveillance camera is used, there are cases where privacy protection
should be considered. This paper proposes a new privacy protection method that
automatically degrades the face region in surveillance images. The proposed
method combines ROI coding of JPEG2000 with a face detection method based on
template matching. The experimental results show that the face region can be
detected and hidden correctly.
key words: privacy control, JPEG2000, ROI, template matching

1. Introduction

Surveillance cameras have been used for the prevention and suppression of
crimes. The face of a person captured by a surveillance camera is important
information for identifying that person. However, depending on the application,
there are cases where the face should be treated as secret information [1].
If only the face in the captured image can be hidden, privacy can be protected,
and surveillance cameras may then be set up in areas where they could not be
installed before. A privacy protection method using AdaBoost and the JPEG2000
PSNR layer structure has already been proposed [2]. It is effective for the
privacy protection of surveillance camera images. However, this algorithm
requires training data for the face region, because AdaBoost is a learning
algorithm.
In this paper, a new privacy protection method that automatically degrades the
face region in captured images is proposed. To identify a person's face region,
an outline-based template matching technique is applied, and the face is
modeled by an ellipse so that a template of any size can be generated. An
application of region of interest (ROI) coding in JPEG2000 [3] for hiding the
face region is also proposed. Compared with the algorithm in Ref. [2], the
proposed algorithm is simple: the ROI specification is used directly for hiding
the face region, and no training data is required. The experimental results
show that the proposed technique can detect the face region and degrade it for
various moving pictures. Although the sequences used are different, the hit
ratio for finding the face region is almost the same as that in Ref. [2].
Manuscript received January 26, 2009.
Manuscript revised June 14, 2009.

The authors are with the Faculty of Engineering Science, Kansai University, Suita-shi, 564-8680 Japan.

Presently, with Sharp Corp.

Presently, with Nippon COMSIS Corp.

Presently, with Panasonic Corp.


a) E-mail: muneyasu@ipcku.kansai-u.ac.jp
DOI: 10.1587/transfun.E92.A.2858

2. ROI in JPEG2000 [3]

The aim of ROI coding is to give a specific region high image quality. In the
ROI coding of JPEG2000 Part 1, the max-shift method is adopted. ROI coding is
accomplished by shifting up the DWT coefficients in the ROI so that they lie
above those in the background (BG): the least significant bit (LSB) of the ROI
is shifted above the most significant bit (MSB) of the BG. An advantage of this
method is that the shape of the ROI need not be transmitted to the decoder.
This also makes it possible to specify an ROI of arbitrary shape.
At a low bit rate, the ROI coding of JPEG2000 degrades the quality of the
background. Therefore, if the complement of a target region (its background) is
specified as the ROI, the quality of the target region itself is degraded. The
proposed method exploits this fact.
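The max-shift idea described above can be illustrated with a minimal sketch. It works on a flat list of integer coefficients rather than on a real JPEG2000 codestream, and the function names `maxshift_encode` and `maxshift_decode` are hypothetical, chosen only for this illustration. The scaling value s is taken as the smallest integer with 2^s larger than every background magnitude, so the decoder can separate ROI from BG by magnitude alone, without receiving the ROI shape.

```python
def maxshift_encode(coeffs, roi_mask):
    """Shift ROI coefficients above every background magnitude.

    coeffs:   list of ints (illustrative quantized DWT coefficients)
    roi_mask: list of bools, True where the coefficient belongs to the ROI
    Returns (shifted coefficients, scaling value s).
    """
    bg_max = max((abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi),
                 default=0)
    s = bg_max.bit_length()  # smallest s such that 2**s > bg_max
    shifted = [c * (1 << s) if in_roi else c
               for c, in_roi in zip(coeffs, roi_mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Undo the shift using only s: any magnitude >= 2**s must be ROI."""
    restored = []
    for c in shifted:
        if abs(c) >= (1 << s):
            magnitude = abs(c) >> s
            restored.append(magnitude if c > 0 else -magnitude)
        else:
            restored.append(c)  # background (or a zero ROI coefficient)
    return restored


coeffs = [5, -3, 12, 0, -7, 2]
roi_mask = [True, False, False, True, True, False]
shifted, s = maxshift_encode(coeffs, roi_mask)
restored = maxshift_decode(shifted, s)
```

Because every nonzero ROI coefficient ends up above all BG coefficients, a bit-plane coder that truncates the codestream at a low bit rate discards BG bit planes first, which is the behavior the proposed method relies on.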
3. Proposed Method

The proposed method consists of two processing steps. The first step is the
automatic detection of the face region in a captured moving image. An improved
version of the template matching technique based on the outline of the
object [4] is applied for this step. In the second step, the detected face
region is blurred by using the ROI coding of JPEG2000. The values of the
parameters in the following algorithm were determined experimentally.
The proposed method modifies the technique based on the outline of the
object [4]. This technique is robust to changes in the colors or textures of a
target object caused by lighting conditions and other factors. It measures
similarity on a distance image, in which the value of each pixel is the
distance between that pixel and its nearest edge. The similarity with the
template at each position is obtained by
$$D_T(x) = \frac{1}{|T|} \sum_{e \in T} d_T(x, e), \tag{1}$$
where |T| is the number of pixels on the outline of the template, e is the
position of an outline pixel of the template on the distance image, and
d_T(x, e) is the pixel value of the distance image at e when the template is
placed at x. This formula gives the average distance between each outline pixel
of the template and its nearest edge. A smaller value indicates a better match,
so the position of the object can be identified as the position with the
smallest value.

Copyright © 2009 The Institute of Electronics, Information and Communication Engineers
However, if this method is applied directly to a surveillance camera, the
various background information included in the image causes misdetection.
Moreover, it is hard to fix the template to a single size, since the size of
the object varies over time in a moving image.
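As a minimal sketch of the similarity measure in Eq. (1), the following code averages the distance-image values under the outline pixels of a template placed at a candidate position and selects the candidate with the smallest average. The function names and the representation of the template as a list of pixel offsets are assumptions made for illustration. Candidate positions are assumed to keep the whole template inside the image.

```python
def similarity(distance_image, template_outline, position):
    """Average distance-image value under the template outline (Eq. (1)).

    distance_image:   2D list, distance_image[row][col] = distance to nearest edge
    template_outline: list of (dx, dy) offsets of the outline pixels (the set T)
    position:         (col, row) where the template origin is placed (x in Eq. (1))
    """
    col, row = position
    total = 0
    for dx, dy in template_outline:
        total += distance_image[row + dy][col + dx]
    return total / len(template_outline)


def best_position(distance_image, template_outline, candidates):
    """Return the candidate position with the smallest average distance."""
    return min(candidates,
               key=lambda p: similarity(distance_image, template_outline, p))


# Toy distance image: a vertical edge in column 3.
distance_image = [
    [3, 2, 1, 0, 1],
    [3, 2, 1, 0, 1],
    [3, 2, 1, 0, 1],
]
# Toy template: a vertical line of three outline pixels.
template_outline = [(0, -1), (0, 0), (0, 1)]
best = best_position(distance_image, template_outline, [(1, 1), (2, 1), (3, 1)])
```

In this toy case the template lies exactly on the edge at column 3, so the average distance there is zero and that position is selected.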

3.1 Preprocessing for the Distance Image
One of the problems of the template matching in Ref. [4] is the influence of
the background. To exclude it, the distance image should be generated only from
the edges of the moving object. The preprocessing shown in Fig. 1 is introduced
to achieve this. The input image and the result of this procedure are shown in
Fig. 2.
(A) Calculation of the difference of frames
The differences between the current and previous frames and between the
current and subsequent frames are calculated and binarized. The logical
product (AND) of these binarized difference images is obtained.
(B) Noise removal
If there are more than three pixels with value 1 in the 5 × 5 window centered
on the processing point, the pixel value of the processing point is set to 1;
otherwise, it is set to 0.
(C) Dilation and erosion
Dilation and erosion with 8-neighbors are applied 5 times.
(D) Labeling
The regions are labeled, the number of pixels in each region is counted, and
the largest region is selected.
(E) Hole filling
Each scan line is scanned inward from both sides of the image, and the
horizontal coordinates of the first pixels with value 1 found from each side
are recorded. The pixels between the recorded horizontal coordinates on the
same scan line are set to 1 to fill the holes.
(F) Edge detection
The edge image is obtained from the resulting binarized image.
(G) Transform to the distance image
The edge image is transformed into the distance image. To reduce the amount of
calculation, if there is no edge pixel in the 21 × 21 window centered on the
processing point, the pixel value of the processing point is set to 255.
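Part of the preprocessing above can be sketched as follows. The sketch covers only the frame-difference step (A), the noise removal step (B), and hole filling; dilation and erosion, labeling, edge detection, and the distance transform are omitted. Several details are assumptions, not values from the paper: the binarization threshold of 20, the clipping of the 5 × 5 window at the image border, the counting of the center pixel in the window, and all function names.

```python
def binarized_difference(frame_a, frame_b, threshold):
    """1 where the absolute gray-level difference exceeds the threshold."""
    return [[1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def moving_mask(prev, cur, nxt, threshold=20):
    """Step (A): logical AND of the two binarized frame differences."""
    d_prev = binarized_difference(cur, prev, threshold)
    d_next = binarized_difference(cur, nxt, threshold)
    return [[p & q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(d_prev, d_next)]


def remove_noise(mask, window=5, min_count=4):
    """Step (B): set a pixel to 1 only if more than three 1s lie in its window."""
    height, width = len(mask), len(mask[0])
    half = window // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            count = sum(mask[j][i]
                        for j in range(max(0, y - half), min(height, y + half + 1))
                        for i in range(max(0, x - half), min(width, x + half + 1)))
            out[y][x] = 1 if count >= min_count else 0
    return out


def fill_holes(mask):
    """Hole filling: on each scan line, fill between the leftmost and rightmost 1."""
    out = []
    for row in mask:
        if 1 in row:
            left = row.index(1)
            right = len(row) - 1 - row[::-1].index(1)
            out.append([1 if left <= i <= right else 0 for i in range(len(row))])
        else:
            out.append(list(row))
    return out
```

Taking the AND of the two differences keeps only pixels that changed relative to both neighboring frames, which localizes the moving object in the current frame instead of leaving a trail from the previous one.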

3.2 Template Matching

Fig. 1 Procedure of preprocessing.

Since the size of the target object in the captured image varies for various
reasons, such as the lighting conditions and the camera position, it is hard to
decide on a typical template size, and many sizes and kinds of templates may
need to be prepared.
To solve this problem, we assume that the shape of a human head can be
represented by an ellipse, and a template generation method using the formula
of the ellipse is proposed. This method can produce a template of any size for
the major and minor axes. The formula of the ellipse is given by
$$f(x, y) = \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 \tag{2}$$

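where a and b determine the lengths of the axes of the ellipse (2a for the
minor axis and 2b for the major axis, as used in step (3) below); the outline
of the template is the set of points with f(x, y) = 0.
The procedure of the template matching based on this template generation
method is as follows.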

Fig. 2 Original and resulting images for the preprocessing procedure.

(1) Rough adjustment of the template size
The template is matched on the distance image. Until the whole outline of the
template is included in the region of the distance image whose pixel values
are not 255, the lengths of the axes of the template are reduced by 3 for the
minor axis and by 4 for the major axis, and the matching is repeated.
(2) Detailed adjustment of the template size
Starting from the parameters a and b obtained in the previous step, the
template is matched while the lengths are reduced by 1.5 for the minor axis
and by 2 for the major axis, to obtain the minimum value of the similarity
measure.
(3) Finer adjustment of the template size
Using the parameters a and b determined in the above steps, the template is
matched while the lengths are varied from 2a - 3 to 2a + 3 for the minor axis
and from 2b - 3 to 2b + 3 for the major axis, to obtain the minimum value of
the similarity measure.
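The template generation and the rough adjustment of step (1) can be sketched as below. This is only one possible reading of the procedure: the outline is sampled from the parametric form of the ellipse, the "lengths" are interpreted as the full axis lengths (width for the minor axis, height for the major axis), the template center is assumed to be given, and the function names are hypothetical. Steps (2) and (3) are not covered.

```python
import math


def ellipse_outline(a, b, samples=360):
    """Outline pixel offsets of the ellipse x^2/a^2 + y^2/b^2 = 1."""
    points = set()
    for k in range(samples):
        t = 2 * math.pi * k / samples
        points.add((round(a * math.cos(t)), round(b * math.sin(t))))
    return sorted(points)


def outline_in_valid_region(distance_image, outline, center, far=255):
    """True if every outline pixel is inside the image and not marked 'far'."""
    cx, cy = center
    height, width = len(distance_image), len(distance_image[0])
    for dx, dy in outline:
        x, y = cx + dx, cy + dy
        if not (0 <= x < width and 0 <= y < height):
            return False
        if distance_image[y][x] == far:
            return False
    return True


def rough_adjust(distance_image, center, width, height, far=255):
    """Step (1): shrink the axis lengths by 3 (minor) and 4 (major) until the
    whole outline lies where the distance image is not 'far'.
    Returns the (width, height) found, or None if the template vanishes."""
    while width > 0 and height > 0:
        outline = ellipse_outline(width / 2, height / 2)
        if outline_in_valid_region(distance_image, outline, center, far):
            return width, height
        width -= 3
        height -= 4
    return None


# Toy distance image: value 1 inside an elliptic area, 255 (no nearby edge) outside.
size = 41
toy_image = [[1 if ((x - 20) ** 2) / 81 + ((y - 20) ** 2) / 144 <= 1 else 255
              for x in range(size)] for y in range(size)]
found = rough_adjust(toy_image, (20, 20), 30, 40)
```

Generating the outline from the ellipse formula avoids storing a bank of templates: any pair of axis lengths yields a template on demand.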

Table 1 Numerical evaluation.

         Frame   Hit [%]   False [%]   Mis [%]
seq. 1   136     84        15          1
seq. 2   170     89        10          1
seq. 3   238     96        2           2
Total    544     91        8           1

3.3 Hiding of the Face Region by ROI Coding


Finally, to hide the detected face region, its BG region is specified as the
ROI in JPEG2000 ROI coding. This easily degrades the face region.
The image quality of the face region depends on the bit rate specified for
compression. However, we confirmed experimentally that the face region is
degraded at bit rates under 8 bpp, because the face region is sufficiently
small compared with the BG region. This means that the proposed method works
well in practical situations.
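The inversion of the ROI can be illustrated with a toy model. The sketch below builds a mask that marks everything outside the detected face ellipse as ROI and then imitates a low bit rate by discarding the low bit planes of the non-ROI (face) values. It operates directly on pixel values, not on DWT coefficients, and it is not a JPEG2000 codec; the function names and the number of dropped bit planes are assumptions for illustration only.

```python
def background_roi_mask(width, height, center, a, b):
    """True for pixels outside the face ellipse, i.e. the region coded as ROI."""
    cx, cy = center
    return [[((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 > 1.0
             for x in range(width)]
            for y in range(height)]


def simulate_low_bitrate(values, roi_mask, dropped_bitplanes):
    """Toy model: ROI values are kept, non-ROI values lose their low bit planes."""
    out = []
    for row, mask_row in zip(values, roi_mask):
        out.append([v if in_roi else (v >> dropped_bitplanes) << dropped_bitplanes
                    for v, in_roi in zip(row, mask_row)])
    return out


roi = background_roi_mask(9, 9, center=(4, 4), a=2, b=3)
image = [[200] * 9 for _ in range(9)]
degraded = simulate_low_bitrate(image, roi, dropped_bitplanes=6)
```

In this toy case the background keeps its value of 200 while the face area is coarsened to 192, which mirrors how truncating the codestream removes detail from the non-ROI region first.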
4. Experimental Results

To verify the effectiveness of the proposed technique, it was applied to
captured moving images. A video camera (HDR-SR1, Sony Corp.) at a fixed
position was used for capturing. We assumed that the image contained only one
moving object. VM 8.6 was used as the JPEG2000 codec software in the
implementation of the proposed technique.
We captured three sequences, and the numerical evaluation is shown in Table 1.
In this table, "Frame" indicates the number of frames in each sequence, "Hit"
the hit rate (correct detection of the face), "False" the false alarm rate
(a face detected in an area containing no face), and "Mis" the misdetection
rate (failure to detect the actual face area). As the table shows, an overall
hit rate of over 90% was attained, and good results were obtained.
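The Total row of Table 1 can be cross-checked with a short calculation. The sketch below assumes that the totals are frame-weighted averages of the per-sequence percentages; the paper does not state how the Total row was computed, so this is only a consistency check on the published figures.

```python
# Per-sequence values from Table 1: (frames, hit %, false alarm %, misdetection %)
table1 = {
    "seq. 1": (136, 84, 15, 1),
    "seq. 2": (170, 89, 10, 1),
    "seq. 3": (238, 96, 2, 2),
}


def weighted_totals(rows):
    """Return total frame count and frame-weighted average of each percentage."""
    total_frames = sum(r[0] for r in rows.values())
    averages = [sum(r[0] * r[k] for r in rows.values()) / total_frames
                for k in (1, 2, 3)]
    return total_frames, averages


total_frames, (hit, false_alarm, misdetection) = weighted_totals(table1)
print(total_frames, round(hit), round(false_alarm), round(misdetection))
```

Under this assumption the rounded results agree with the Total row of the table (544 frames, 91%, 8%, and 1%).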
This algorithm detects an elliptic area of the moving object, which causes
false alarms. For privacy protection, we consider a false alarm preferable to
a misdetection, since the false alarms occurred in cases where there was no
face in the frame. From this point of view, the experimental results show that
the false alarm rate is low and the misdetection rate is extremely low.
Therefore, the obtained result is satisfactory, and false alarms are not a
serious problem in these example sequences. The misdetection rate should be
further reduced, which is left for future work.
The processed images (Sequence 1) are shown in Fig. 3. This result shows that
the proposed technique can detect the human face accurately and degrade its
image quality. The technique can also be successfully applied to tracking and
degrading the face region for other persons and movements. As a result, the
privacy protection effect of the proposed technique was confirmed.

Fig. 3 Result of the proposed technique.
Finally, processing one frame took about 6 seconds on an Intel Pentium D
3.20 GHz processor with 2 GB of main memory, so frames cannot be processed in
real time (e.g., within 1/30 second). The algorithm should be refined with
respect to processing time, for example by reconsidering the matching interval
of the template; this is left for future work.
5. Conclusions

This paper has proposed a new privacy protection method that automatically
degrades the face region in captured images. An outline-based template
matching technique with an elliptic template model has been applied to
identify the face region. This method can generate a template of any size and
can also detect the precise position of the face. The ROI coding feature of
JPEG2000 has been adopted for hiding the face region. The experimental results
have shown the effectiveness of the proposed technique for privacy protection.
The parameters of the proposed algorithm were set experimentally, and their
rational tuning for other image sequences is left for future work. In
addition, this algorithm always detects some elliptic area in the frame. If
there is an elliptic object in the background, this may be a serious defect.
It may be avoided by introducing another technique, such as a color histogram;
this is also left for future work.
Acknowledgment
This work was partially supported by Grant-in-Aid for Scientific Research (C), 20560380.
References
[1] K. Ando, O. Watanabe, and H. Kiya, "Partial-scrambling of images encoded by JPEG2000," IEICE Trans. Inf. & Syst. (Japanese Edition), vol.J85-D-II, no.2, pp.282–290, Feb. 2002.
[2] I. Martinez-Ponte, X. Desurmont, J. Meessen, and J.-F. Delaigle, "Robust human face hiding ensuring privacy," Workshop on the Integration of Knowledge, Semantics and Digital Media Technology, Montreux, Switzerland, April 2005. (http://wcam.epfl.ch/publications/cr1094.pdf)
[3] D. Taubman and M.W. Marcellin, JPEG2000: Image compression fundamentals, standards and practice, Kluwer Academic, Boston, 2002.
[4] D. Mochizuki, Y. Yano, T. Hashiyama, and S. Ohkuma, "Pedestrian detection with a vehicle camera using fast template matching based on background elimination and active search," IEICE Trans. Inf. & Syst. (Japanese Edition), vol.J87-D-II, no.5, pp.1094–1103, May 2004.
