were distinguished from rotational movements based on a 3D camera model. Then, motion vectors that did not fit the camera motion estimation were assigned to object clusters.

Both of the above methods detect objects by motion vectors. Inspired by these methods, a new approach is proposed in this paper to improve object detection and tracking with a moving camera. With the proposed approach, the motion vectors of scale-invariant feature points are computed first, and the histogram of the motion vectors is then generated. The maximum value of the histogram is identified as the motion vector of the moving object, and the resulting motion templates are established. Finally, a particle filter algorithm is adopted to track the object.

MOVING OBJECT DETECTION AND TRACKING

The framework proposed for object detection and tracking is presented in Fig. 1. Considering the changes of scale in applications with a moving camera, scale-invariant feature points are extracted first. Then the feature points of the foreground are selected from the original feature points via their motion vectors. Finally, the process of object tracking is accomplished. The front part of this framework represents the input video; the output is the result of the object tracking.

Feature points extraction

The scale of a feature point changes constantly because of the randomness of the moving object and the active camera. In 2004, Lowe proposed the SIFT (scale-invariant feature transform) algorithm,10 which has been widely used in multiple fields, including image registration and object recognition. However, this method is time-consuming, which limits its applications under real-time requirements.

Bay et al. presented a method named SURF (Speeded-Up Robust Features) in 2006,11 in which box filters and integral images were used to establish the scale spaces, followed by feature point detection using the Hessian matrix. Finally, a wavelet transformation method was adopted to generate the feature descriptor. Box filters are used to approximate Gaussian functions, which simplifies the establishment of the scale space. The procedures are shown in Fig. 2.

For any point p = (x, y) in the image I, the Hessian matrix H(p, σ) at p at scale σ is defined as

    H(p, σ) = | Lxx(p, σ)   Lxy(p, σ) |
              | Lxy(p, σ)   Lyy(p, σ) | ,                (1)

where Lxx(p, σ) is the convolution of the Gaussian second-order derivative with the image I at point p, and similarly for Lxy(p, σ) and Lyy(p, σ). Here the second-order Gaussian derivatives are approximated by box filters and can be evaluated very quickly using integral images; with the integral image, the computation needed to establish the scale space is reduced. The determinant of the approximated Hessian matrix is defined as

    det(H) = Dxx Dyy − (w Dxy)^2,                (2)

where w is a weighting factor equal to 0.9. Candidate points are selected from the original points only when the value of the determinant is greater than a threshold T. After generation of the candidate points, non-maximum suppression is performed in a 3 × 3 × 3 neighborhood.

After generation of the key points, the Haar wavelet responses in the horizontal and vertical directions (dx and dy) are summed up over each sub-region. Meanwhile, the absolute
values of the responses (|dx| and |dy|) are also summed. Thus, for each sub-region, a four-dimensional descriptor is obtained:

    v = ( Σdx, Σdy, Σ|dx|, Σ|dy| ).

After all these procedures, a descriptor with 64 elements is generated. SURF feature points are shown in Fig. 3.

It can be seen from Fig. 3 that the feature points include points on both the foreground and the background. In order to detect the object, the background points should be deleted next.

Foreground region detection

As presented in the "Feature points extraction" section, the feature points obtained consist of foreground and background points. Moreover, translation exists when points on the background and the foreground overlap. However, the displacement vectors of the background and the foreground have different directions and magnitudes, and this characteristic can be used to distinguish them.

Assume that N feature points exist in a frame. The points that do not obey the matching rule are deleted from the original feature point set. After this elimination, Nk pairs of feature points are left; therefore, Nk pairs of motion vectors can be generated.12

The motion vectors are then described in the form of a histogram, where the number of matched pairs determines the number of vectors in the discrete histogram. There are two classes of motion vectors: background vectors and foreground vectors. The location of the object is the region of the maximum value in the vector histogram, as shown in Fig. 4(a). The center point of the object is presented in Fig. 4(b) with a black circle, where the locations of the object feature points are (xn, yn).

With the assumption that the distribution of the object feature points resembles a Gaussian function, the window of the moving object is a rectangle whose center point is (x̄, ȳ). The height of the rectangle is set to 2·σy and its width to 2·σx, where σx and σy are the standard deviations of the Gaussian function.

The specific steps are as follows:

(1) the differences in location between the object points (xn, yn) and the center point (x̄, ȳ) are obtained;
(2) the standard deviations of these differences in the two dimensions, x and y, are generated as shown in Eq. (3); and
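The foreground-detection steps described above can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: the function name `object_window`, the histogram bin size `bin_size`, and the use of bin centers as the dominant motion vector are all assumptions made for the sketch.

```python
import numpy as np

def object_window(prev_pts, curr_pts, bin_size=2.0):
    """Sketch of the foreground-detection steps: motion vectors from
    matched feature points -> discrete 2-D histogram -> bin with the
    maximum value -> center point and 2*sigma window of the object.
    (Illustrative only; bin_size is an assumed parameter.)"""
    vectors = curr_pts - prev_pts                    # Nk motion vectors
    bins = np.floor(vectors / bin_size).astype(int)  # quantize (dx, dy) into bins
    # Discrete 2-D histogram: count how many vectors fall in each bin.
    uniq, counts = np.unique(bins, axis=0, return_counts=True)
    mode = uniq[np.argmax(counts)]                   # bin with the maximum value
    dominant = (mode + 0.5) * bin_size               # dominant motion vector (bin center)
    # Points whose motion vector falls in the dominant bin are taken as object points.
    mask = np.all(bins == mode, axis=1)
    obj = curr_pts[mask]
    cx, cy = obj.mean(axis=0)                        # center point (x-bar, y-bar)
    sx, sy = obj.std(axis=0)                         # standard deviations of the differences
    # Rectangle of width 2*sigma_x and height 2*sigma_y around the center.
    return dominant, (cx, cy), (2.0 * sx, 2.0 * sy)
```

Quantizing the vectors before counting makes the histogram robust to small matching noise; the bin width trades off that robustness against how finely the object motion is localized.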
Fig. 4. (a) Histogram of motion vectors; (b) center point of object; (c) center region.
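The tracking stage adopts a particle filter. As a generic sketch of how such a tracker operates (a standard bootstrap filter over a 2-D position with a random-walk motion model and a Gaussian measurement likelihood; the authors' exact state vector and likelihood are not specified here, so these choices are assumptions):

```python
import numpy as np

def particle_filter_track(measurements, init_pos, n_particles=500,
                          motion_std=3.0, meas_std=5.0, seed=0):
    """Generic bootstrap particle filter over a 2-D object position:
    predict with a random-walk motion model, weight each particle by a
    Gaussian likelihood of the current measurement, estimate the state
    as the weighted mean, then resample to avoid weight degeneracy."""
    rng = np.random.default_rng(seed)
    particles = np.tile(np.asarray(init_pos, dtype=float), (n_particles, 1))
    estimates = []
    for z in measurements:
        # Predict: propagate particles with random-walk noise.
        particles += rng.normal(0.0, motion_std, particles.shape)
        # Update: Gaussian likelihood of the measured object position.
        d2 = np.sum((particles - z) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / meas_std ** 2)
        w /= w.sum()
        # Estimate: weighted mean of the particle cloud.
        estimates.append(particles.T @ w)
        # Resample (multinomial) so high-weight particles survive.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)
```

In a full pipeline, the measurement at each frame would be the object center obtained from the motion-vector histogram, and the window size would gate which feature points contribute.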
TABLE I. Results of the two methods.

the drawback. Meanwhile, some satisfactory points are used in the state vectors, which improves the robustness of the method. Experimental results demonstrate the effectiveness and feasibility of the proposed algorithm. Further work on this topic will focus on improving the detection and tracking accuracy in videos with many feature points in the background.
Review of Scientific Instruments is copyrighted by the American Institute of Physics (AIP).
Redistribution of journal material is subject to the AIP online journal license and/or AIP
copyright. For more information, see http://ojps.aip.org/rsio/rsicr.jsp