were distinguished from rotational movements based on a 3D camera model. Then, motion vectors that did not fit the camera motion estimation were assigned to object clusters.

Both of the above methods detect objects by motion vectors. Inspired by these methods, a new approach is proposed in this paper to improve object detection and tracking with a moving camera. With the proposed approach, the motion vectors of scale-invariant feature points are computed first, and the histogram of the motion vectors is then generated. The maximum value of the histogram is identified as the motion vector of the moving object, and the resulting motion templates are established. Finally, a particle filter algorithm is adopted to track the object.

MOVING OBJECT DETECTION AND TRACKING

The framework proposed for object detection and tracking is presented in Fig. 1. Considering the changes of scale in applications with a moving camera, scale-invariant feature points are extracted first. Then the feature points of the foreground are selected from the original feature points via their motion vectors. Finally, the process of object tracking is accomplished. The front part of this framework represents the input video; the output is the result of the object tracking.

Feature points extraction

The scale of a feature point changes constantly because of the randomness of the moving object and the active camera. In 2004, Lowe proposed the SIFT (scale-invariant feature transform) algorithm,10 which has been widely used in multiple fields, including image registration and object recognition. However, this method is time-consuming, which limits its applications under real-time requirements.

Bay et al. presented a method named SURF (Speeded-Up Robust Features) in 2006,11 in which box filters and integral images were used to establish the scale spaces, followed by feature point detection using the Hessian matrix. Finally, a wavelet transformation method was adopted to generate the feature descriptor. Box filters are used to approximate Gaussian functions, which simplifies the establishment of the scale space. The procedures are shown in Fig. 2.

For any point p = (x, y) in the image I, the Hessian matrix H(p, σ) at p at scale σ is defined as

    H(p, σ) = | Lxx(p, σ)   Lxy(p, σ) |
              | Lxy(p, σ)   Lyy(p, σ) | ,                (1)

where Lxx(p, σ) is the convolution of the Gaussian second-order derivative with the image I at point p, and similarly for Lxy(p, σ) and Lyy(p, σ). Here the second-order Gaussian derivatives are approximated by box filters and can be evaluated very quickly using integral images; with the integral image, the computation needed to establish the scale space is reduced. The determinant of the approximated Hessian matrix is defined as

    det(H) = Dxx Dyy − (w Dxy)^2,                (2)

where w is a weighting factor equal to 0.9. Candidate points are selected from the original points only when the value of the determinant is greater than a threshold T. After generation of the candidate points, non-maximum suppression is performed in a 3 × 3 × 3 neighborhood.

After generation of the key points, the Haar wavelet responses in the horizontal and vertical directions (dx and dy) are summed up over each sub-region. Meanwhile, the absolute
values of the responses (|dx| and |dy|) are also summed. Thus, for each sub-region, a four-dimensional descriptor is obtained:

    v = ( Σdx, Σdy, Σ|dx|, Σ|dy| ).

After all these procedures, a descriptor with 64 elements is generated. SURF feature points are shown in Fig. 3.

It can be seen from Fig. 3 that the feature points include points on both the foreground and the background. In order to detect the object, the background points should be deleted next.

Foreground region detection

As presented in the "Feature points extraction" section, the feature points obtained consist of foreground and background points. Moreover, translation exists when points on the background and the foreground overlap. However, the displacement vectors of the background and the foreground have different directions and magnitudes, and this characteristic can be used to distinguish them.

Assume that N feature points exist in a frame. The points that do not obey the matching rule are deleted from the original feature point set. After this elimination, Nk pairs of feature points are left; therefore, Nk pairs of motion vectors can be generated.12

The motion vectors are then described in the form of a histogram, where the number of matched pairs determines the number of vectors in the discrete histogram. There are two classes of motion vectors: background vectors and foreground vectors. The location of the object is the region of the maximum value in the vector histogram, as shown in Fig. 4(a). The center point of the object is presented in Fig. 4(b) with a black circle, where the locations of the object feature points are (xn, yn).

With the assumption that the distribution of the object feature points resembles a Gaussian function, the window of the moving object is a rectangle whose center point is (x̄, ȳ). The height of the rectangle is set to 2·σy and its width to 2·σx, where σx and σy are the standard deviations of the Gaussian function.

The specific steps are as follows:

(1) the differences in location between the object points (xn, yn) and the center point (x̄, ȳ) are obtained;
(2) the standard deviations of these differences in the two dimensions, x and y, are generated as shown in Eq. (3); and
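The foreground-detection steps described above can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: the function name `object_window`, the histogram bin size `bin_size`, and the use of bin centers as the dominant motion vector are all assumptions made for the sketch.

```python
import numpy as np

def object_window(prev_pts, curr_pts, bin_size=2.0):
    """Sketch of the foreground-detection steps: motion vectors from
    matched feature points -> discrete 2-D histogram -> bin with the
    maximum value -> center point and 2*sigma window of the object.
    (Illustrative only; bin_size is an assumed parameter.)"""
    vectors = curr_pts - prev_pts                    # Nk motion vectors
    bins = np.floor(vectors / bin_size).astype(int)  # quantize (dx, dy) into bins
    # Discrete 2-D histogram: count how many vectors fall in each bin.
    uniq, counts = np.unique(bins, axis=0, return_counts=True)
    mode = uniq[np.argmax(counts)]                   # bin with the maximum value
    dominant = (mode + 0.5) * bin_size               # dominant motion vector (bin center)
    # Points whose motion vector falls in the dominant bin are taken as object points.
    mask = np.all(bins == mode, axis=1)
    obj = curr_pts[mask]
    cx, cy = obj.mean(axis=0)                        # center point (x-bar, y-bar)
    sx, sy = obj.std(axis=0)                         # standard deviations of the differences
    # Rectangle of width 2*sigma_x and height 2*sigma_y around the center.
    return dominant, (cx, cy), (2.0 * sx, 2.0 * sy)
```

Quantizing the vectors before counting makes the histogram robust to small matching noise; the bin width trades off that robustness against how finely the object motion is localized.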
Fig. 4. (a) Histogram of motion vectors; (b) center point of object; (c) center region.
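The tracking stage adopts a particle filter. As a generic sketch of how such a tracker operates (a standard bootstrap filter over a 2-D position with a random-walk motion model and a Gaussian measurement likelihood; the authors' exact state vector and likelihood are not specified here, so these choices are assumptions):

```python
import numpy as np

def particle_filter_track(measurements, init_pos, n_particles=500,
                          motion_std=3.0, meas_std=5.0, seed=0):
    """Generic bootstrap particle filter over a 2-D object position:
    predict with a random-walk motion model, weight each particle by a
    Gaussian likelihood of the current measurement, estimate the state
    as the weighted mean, then resample to avoid weight degeneracy."""
    rng = np.random.default_rng(seed)
    particles = np.tile(np.asarray(init_pos, dtype=float), (n_particles, 1))
    estimates = []
    for z in measurements:
        # Predict: propagate particles with random-walk noise.
        particles += rng.normal(0.0, motion_std, particles.shape)
        # Update: Gaussian likelihood of the measured object position.
        d2 = np.sum((particles - z) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / meas_std ** 2)
        w /= w.sum()
        # Estimate: weighted mean of the particle cloud.
        estimates.append(particles.T @ w)
        # Resample (multinomial) so high-weight particles survive.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)
```

In a full pipeline, the measurement at each frame would be the object center obtained from the motion-vector histogram, and the window size would gate which feature points contribute.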
TABLE I. Results of the two methods.

the drawback. Meanwhile, some satisfactory points are used in the state vectors, which improves the robustness of the method. Experimental results demonstrate the effectiveness and feasibility of the proposed algorithm. Further work on this topic will focus on improving the detection and tracking accuracy in videos with many feature points in the background.
Review of Scientific Instruments is copyrighted by the American Institute of Physics (AIP).
Redistribution of journal material is subject to the AIP online journal license and/or AIP
copyright. For more information, see http://ojps.aip.org/rsio/rsicr.jsp