
A Background Subtraction Based Video Object Detecting and Tracking Method

horng@kmit.edu.tw
Abstract

A new method for detecting and tracking motion objects in video image sequences, based on the background subtraction approach, is proposed. The changing regions of the image sequence are detected by background subtraction. Noise is removed by low-pass filtering and downsampling. A new method is proposed to eliminate the changing regions caused by light changes and object shadows. The feature vectors of the detected motion objects are computed and registered; each feature vector includes the position, size, and color (mean and deviation) information of a motion object. The motion objects are tracked by comparing the feature vectors of consecutive images. Experimental results are given.

Keywords: Object tracking, Background subtraction, Motion detection, Computer vision

1. Introduction

With the development of electronic technology, video capturing and image storage devices have become cheaper and more popular. Public areas, including streets, banks, and stores, are monitored by masses of static cameras. These cameras capture a vast amount of image sequences every day. Most of these data record daily routines, so detecting the irregular part of an image sequence is a time-consuming job. Several methods have been proposed to detect the motion objects in an image sequence automatically [1]-[12]. They can be classified into three major categories: the background subtraction based [1]-[5], the temporal difference based [6]-[7], and the probability based [8]-[9] approaches. Each approach has its advantages and shortcomings.

The fundamental process of the background subtraction based approach is to subtract a pre-selected background image from the current image; the subtraction result is then analyzed to find the motion objects. The major disadvantage of this approach is that the background may change slightly with time, so the subtraction result may include additional noise. Wren et al. modeled the noise by using a mixture of Gaussian distributions at each pixel [2]. Later, Oliver et al. proposed the eigen-background model [3], in which the eigen-background is generated by principal component analysis from several static background images. Cheng and Chen applied the discrete wavelet transform to remove the Gaussian noise [1]. However, there are additional factors that give rise to background changes; these factors will be discussed in Section 2.

To avoid the background changes, the temporal difference approach [7] detects motion by taking the absolute difference of consecutive images. Lipton et al. extracted all moving objects by using a temporal difference algorithm [6], applying temporal differencing and adopting many variants of this method. The disadvantage of the temporal difference approach is that it may miss an object that stops moving in the image frame, and it also suffers from problems when tracking multiple objects. The probability approach uses the observed information to obtain a probabilistic classification equation to segment the image; it suffers from high computational complexity, which makes it hard to use in real-time surveillance applications.

In this paper, we adopt the background subtraction based approach. To overcome the problem of background changes, we propose a new method to eliminate them. The background changes are classified into three classes: the Gaussian noise, the light changes, and the object shadows. The properties of each class are discussed, and the changes are eliminated according to their corresponding properties, so that the motion objects can be detected successfully. After motion object detection, we define the feature vector of a motion object and propose a method for object tracking.

The remaining sections are organized as follows. Section 2 introduces the classes of background changes, the removal of the Gaussian noise, and the detection of motion objects. Section 3 presents the proposed feature vector of a motion object, the object tracking method, and the elimination of the light changes and the object shadows. Section 4 gives the experimental results. Conclusions are given in Section 5.

2. Motion Detection
To detect the motion objects, we first choose a background image from the image sequence. Then, the background image is subtracted from each image under process. Theoretically, the nonzero regions in the subtraction result are the locations of the motion objects. In practice, however, many other things are present; we call these unwanted things the background changes as a whole. By experiment, we classify these additional things, apart from the motion objects, into three classes: the noise, the light changes, and the shadows of the motion objects.

The Gaussian noise is present in any image sequence captured by a camera. It is distributed over an image in a random manner and therefore cannot be removed by the background subtraction process. The second class of background changes is the light change. Light is the energy source for images: the charge-coupled device of a camera senses the light photons reflected from the objects in the scene, so light changes are reflected in the image. Typical light sources are sunlight and indoor lamps. An image sequence obtained from an indoor camera may be influenced by the lamps and by sunlight shining through the windows. A lamp may be turned on or off between two consecutive image frames, resulting in a background change, while sunlight may shine on some local regions through the windows, and these regions may move slowly with time. The third class is the shadows of the motion objects. The appearance of a motion object is usually accompanied by its shadow. In practical applications of motion detection we focus on the motion objects themselves, and their shadows are usually unwanted; therefore, we classify them as a type of background change.

The proposed method consists of two major stages: motion detection and object tracking. The Gaussian noise is reduced before the first stage, while the elimination of the light changes and the object shadows is executed after the second stage.

2.1 Background Subtraction

The background subtraction is formally calculated by taking the intensity difference between the current image and the background image. In this paper, we execute this process in a slightly different way: we calculate the absolute difference of each color component separately and then average the three components to get the background subtraction result:

A(x,y,c) = abs( Xn(x,y,c) - X0(x,y,c) ), for all x, y, c
B(x,y) = ( A(x,y,1) + A(x,y,2) + A(x,y,3) ) / 3

where Xn and X0 are the current image and the background image respectively, x and y are the rectangular coordinates, c is the color index (1 for red, 2 for green, 3 for blue), and B(x,y) is the subtraction result. This slight modification prevents cancellation between the three color components: different color components may change in opposite directions, one increasing and another decreasing, which would result in opposite signs and mutual cancellation.

2.2 Noise Reduction

Many methods are applicable to reducing the Gaussian noise in an image. We use a simple method to reduce the noise in the background subtraction result (called the difference image henceforth). The difference image is convolved with an averaging mask, which takes a local average for each pixel, and is then converted into a binary difference image by thresholding. The Gaussian noise can be effectively reduced by this simple process, and the changing regions appear.

2.3 Object Detection

After noise reduction, the changing regions appear. However, the changing regions are usually not closed regions, and some sparse points may be present. To obtain closed changing regions and remove the sparse points, we take the following processing steps.

2.3.1 Decimation

The binary difference image is downsampled by a factor of N, where N is the size of the averaging mask applied in the noise reduction process.

2.3.2 Removing and adding points

We apply the following decision rule to achieve an effect similar to a morphological opening followed by a closing operation:

Input E: the binary difference image;
For each pixel E(m,n),
  Count k: the number of 1s in its 8-neighborhood;
  If (k >= 6) F(m,n) = 1,
  Elseif (k <= 2) F(m,n) = 0,
  Else F(m,n) = E(m,n);
Endfor;
Output F;

2.3.3 Grouping

The resulting binary image of the previous step contains several closed regions, each consisting of a group of pixels. In this step, the binary image is scanned to find a first nonzero pixel; then we find the nonzero pixels connected to it, and these pixels are treated as a group. This process proceeds until all nonzero pixels are grouped. Each group of pixels is treated as a motion object, although some of the groups may be a region of light change or a shadow. These false detections will be deleted at a later stage.
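The subtraction, smoothing, thresholding, and decimation steps of Sections 2.1 through 2.3.1 can be sketched as below. The paper's implementation is in Matlab; this is a minimal NumPy sketch, and the function name `subtract_background`, the mask size, and the threshold value are our own illustrative choices, not values given in the paper.

```python
import numpy as np

def subtract_background(Xn, X0, mask_size=3, thresh=10.0):
    """Sketch of Sections 2.1-2.3.1: per-channel absolute difference,
    channel averaging, box-filter smoothing, thresholding, decimation.
    The mask size N and the threshold are illustrative placeholders."""
    # 2.1: A(x,y,c) = |Xn - X0| per channel, then average the channels
    A = np.abs(Xn.astype(float) - X0.astype(float))
    B = A.mean(axis=2)
    # 2.2: convolve with an N x N averaging mask (plain loops for clarity)
    N = mask_size
    pad = N // 2
    padded = np.pad(B, pad, mode='edge')
    avg = np.empty_like(B)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            avg[i, j] = padded[i:i + N, j:j + N].mean()
    # threshold into a binary difference image
    binary = (avg > thresh).astype(np.uint8)
    # 2.3.1: decimate by the mask size N
    return binary[::N, ::N]

# Synthetic example: a flat background and one bright moving square
X0 = np.full((30, 30, 3), 50, dtype=np.uint8)
Xn = X0.copy()
Xn[6:18, 6:18] = 200
E = subtract_background(Xn, X0)
```

In this synthetic case the bright square survives as a solid block in the decimated binary image `E`, ready for the point-filtering and grouping steps of Sections 2.3.2 and 2.3.3.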

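The 8-neighbor removing/adding rule of Section 2.3.2 and the grouping of Section 2.3.3 can be sketched as follows. The flood-fill grouping and the name `clean_and_group` are our own illustration (in NumPy rather than the paper's Matlab); the decision rule itself follows the paper's pseudocode.

```python
import numpy as np

def clean_and_group(E):
    """Sketch of Sections 2.3.2-2.3.3: the 8-neighbor removing/adding
    rule followed by grouping of connected nonzero pixels."""
    h, w = E.shape
    F = E.copy()
    for m in range(h):
        for n in range(w):
            # k: number of 1s among the 8 nearest neighbors of E(m,n)
            nb = E[max(m - 1, 0):m + 2, max(n - 1, 0):n + 2]
            k = int(nb.sum()) - int(E[m, n])
            if k >= 6:
                F[m, n] = 1
            elif k <= 2:
                F[m, n] = 0
            # else: F(m,n) keeps the value of E(m,n)
    # group 8-connected nonzero pixels; each group is one motion object
    labels = np.zeros((h, w), dtype=int)
    groups = 0
    for m in range(h):
        for n in range(w):
            if F[m, n] and not labels[m, n]:
                groups += 1
                stack = [(m, n)]
                labels[m, n] = groups
                while stack:  # flood fill over the 8-neighborhood
                    y, x = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            yy, xx = y + dy, x + dx
                            if (0 <= yy < h and 0 <= xx < w
                                    and F[yy, xx] and not labels[yy, xx]):
                                labels[yy, xx] = groups
                                stack.append((yy, xx))
    return labels, groups

# Example: a solid 4x4 block survives; an isolated point is removed
E = np.zeros((10, 10), dtype=int)
E[1:5, 1:5] = 1
E[8, 8] = 1
labels, groups = clean_and_group(E)
```

The isolated pixel has no nonzero neighbors (k = 0 <= 2), so the rule removes it, while the solid block is kept and labeled as a single group.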
3. Object Tracking

The motion objects are tracked by managing the motion objects' feature vectors between consecutive frames. This section introduces the feature vector of a motion object defined in this paper, the object tracking method, and the elimination of the false motion detections.

3.1 The Feature Vector of a Motion Object

Each group of pixels detected in the previous stage is treated as a motion object. Thus, the motion object can be segmented from the original image by mapping back. The proposed feature vector consists of 9 entries: the size, the x, y coordinates, and the means and deviations of the R, G, and B color components:

V = [s, x, y, Rm, Gm, Bm, Rd, Gd, Bd];

The size is obtained by counting the number of pixels in the group. The x, y coordinates of each group are obtained by averaging the maximum and minimum values of the corresponding coordinate. The mean value of each color component is the average value of that component in the corresponding original image segment. The deviation of each color component is obtained by averaging the absolute deviations from the mean value.

3.2 The Object Tracking Method

For each frame in the image sequence, we compute the feature vectors of all motion image segments. Then, the feature vectors of consecutive image frames are compared pairwise. We define the distance D between two feature vectors by

D(V1,V2) = mean( abs(V1-V2) );

A pair of feature vectors whose distance is under a given threshold is treated as the feature vectors of the same object. From frame to frame, a feature vector may appear, update, or disappear. A motion object can be tracked by observing the changes of its corresponding feature vector.

3.3 The Elimination of the Light Changes and the Objects' Shadows

The changing region corresponding to a light change has the properties that it appears, remains, and disappears all in the same region, and that it produces only an intensity change during that time. A shadow, on the other hand, may appear or disappear at any time, is always accompanied by a real motion object, and again produces only an intensity change. Based on these observations, the false detections can be eliminated by checking the history of each feature vector.

4. Experimental Results

Our method is programmed in the Matlab language. The experimental image sequences were downloaded from the website http://www.ippr.org.tw/

Figure 1 demonstrates the motion detection technique proposed in this paper. Figure 1(a) is the background image of the image sequence under processing. Figure 1(b) is the current image to be analyzed. Figure 1(c) shows the absolute difference image obtained by the background subtraction. Figure 1(d) is obtained by averaging the three color components of (c); it can be seen that the difference image is noisy. Figure 1(e) is the lowpass filtered image of (d). Figure 1(f) is the binary difference image obtained by thresholding (e). Figure 1(g) shows the downsampled image of (f). Figure 1(h) shows the morphological-like filtered image of (g). Figure 1(i) shows the grouping result. Figure 1(j) shows the corresponding changing regions in the original image.

The feature vectors of the changing image segments are listed in Table I. The origin of the rectangular coordinates is at the upper-left corner of the image. We can observe that K0 is the feature vector of the person in blue pants, L0 is produced by a light change, and M0 is the person in the red coat. The changing regions and their corresponding feature vectors of the consecutive image are given in Figure 1(k) and Table II. We can observe that K1 is the feature vector of the person in blue pants, L1 is produced by a light change, and M1 is the person in the red coat. By comparing Tables I and II, we find that the entries Rm, Gm, Bm, Rd, Gd, and Bd of the corresponding vectors have approximately the same values, while s, x, and y of K and M change in a way that indicates the objects are moving toward the upper-left corner of the image and gradually vanishing; that is, their values are getting smaller. The distances between the feature vectors of the consecutive images are listed in Table III, and from them we can easily find the correspondence between the feature vectors. Figure 1(l) shows the ratio of the current image to the background image in the changing regions. We can see that the regions of the motion objects are colorful, while the region of the light change is gray. After deleting the false changing region, the final result is shown in Figure 1(m).
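The feature-vector construction of Section 3.1 and the distance-based tracking of Section 3.2 can be sketched as below. The helper names (`feature_vector`, `distance`, `match`) and the nearest-neighbor pairing are our own illustration in NumPy; the paper itself only specifies the vector V and the distance D.

```python
import numpy as np

def feature_vector(segment, coords):
    """Sketch of Section 3.1. segment: (n, 3) array of R, G, B values of
    one detected group; coords: (n, 2) array of the pixels' x, y
    coordinates. Returns V = [s, x, y, Rm, Gm, Bm, Rd, Gd, Bd]."""
    s = len(segment)                                   # group size
    x = (coords[:, 0].max() + coords[:, 0].min()) / 2  # mid-extent
    y = (coords[:, 1].max() + coords[:, 1].min()) / 2
    means = segment.mean(axis=0)                       # Rm, Gm, Bm
    devs = np.abs(segment - means).mean(axis=0)        # Rd, Gd, Bd
    return np.concatenate(([s, x, y], means, devs))

def distance(V1, V2):
    """Section 3.2: D(V1, V2) = mean(abs(V1 - V2))."""
    return float(np.abs(V1 - V2).mean())

def match(prev_vectors, curr_vectors, thresh):
    """Pair each current vector with its nearest predecessor when the
    distance is under the threshold; unmatched vectors correspond to
    newly appeared or disappeared objects."""
    pairs = []
    for i, v in enumerate(curr_vectors):
        dists = [distance(v, p) for p in prev_vectors]
        j = int(np.argmin(dists))
        if dists[j] < thresh:
            pairs.append((j, i))
    return pairs

# A two-pixel toy segment, just to exercise the definitions
V = feature_vector(np.array([[10., 20., 30.], [30., 20., 10.]]),
                   np.array([[0, 0], [4, 2]]))
```

The matching threshold is a free parameter; the paper does not state a value, so it would have to be tuned per sequence.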

Figure 1. (a) The background image.

Figure 1. (b) The current image to be analyzed.

Figure 1. (c) The absolute difference image obtained by the background subtraction.

Figure 1. (d) The difference image obtained by averaging the three color components of (c).

Figure 1. (e) The lowpass filtered image of (d).

Figure 1. (f) The binary difference image obtained by thresholding (e).

Figure 1. (g) The downsampled difference image of (f).

Figure 1. (h) The binary difference image obtained by processing (g) with a morphological-like filtering.

Figure 1. (i) The motion objects obtained by grouping the nonzero pixels. The outer frame of each group is also marked.

Figure 1. (j) The changing regions segmented from the original image.

Table I. The feature vectors of the changing image segments in Figure 1(j).

          K0        L0        M0
Rm    89.8648  110.8347   65.1813
Gm   110.5600  114.9766   35.8821
Bm   127.6248  114.0106   51.5767
Rd    24.3632   12.6677   16.5273
Gd    30.5558   12.6971   34.1001
Bd    42.2007   12.5913   23.0638
s     58.0000   53.0000  175.0000
x     17.0000   37.5000   38.5000
y     24.0000   59.5000   49.0000

Figure 1. (k) The changing regions of the consecutive image.

Table II. The feature vectors of the changing image segments in Figure 1(k).

          K1        M1        L1
Rm    93.5592   68.3447  111.0051
Gm   111.0839   38.5920  115.1651
Bm   128.0063   54.8224  114.4611
Rd    24.9295   13.4180   12.1565
Gd    28.7505   31.2776   12.3286
Bd    38.3485   18.9513   12.2329
s     51.0000  175.0000   55.0000
x     17.0000   37.0000   37.5000
y     23.0000   43.0000   59.5000

Table III. The distances between the feature vectors of consecutive frames.

          K0        L0        M0
K1    2.0915   16.4707   42.2829
M1   39.6897   38.0837    2.9626
L1   17.5901    0.4497   39.6072
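As a check on the tracking distance, the K0-K1 entry of Table III can be reproduced from the feature vectors listed in Tables I and II, using the definition D(V1,V2) = mean(abs(V1-V2)) from Section 3.2 (sketched here in NumPy rather than the paper's Matlab):

```python
import numpy as np

# Feature vectors of the person in blue pants, copied from Tables I and
# II in the order [Rm, Gm, Bm, Rd, Gd, Bd, s, x, y] (the distance is
# unaffected by the ordering, provided both vectors use the same one).
K0 = np.array([89.8648, 110.5600, 127.6248, 24.3632, 30.5558, 42.2007,
               58.0, 17.0, 24.0])
K1 = np.array([93.5592, 111.0839, 128.0063, 24.9295, 28.7505, 38.3485,
               51.0, 17.0, 23.0])

# D(V1, V2) = mean(abs(V1 - V2)), as defined in Section 3.2
D = np.abs(K0 - K1).mean()
print(round(D, 4))  # 2.0915, the K0-K1 entry of Table III
```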

Figure 1. (l) The intensity ratio of the current image to the background image in the changing regions.

Figure 1. (m) The final result of the motion detection.

The changing regions of three consecutive frames of the same experiment are shown in Figure 2(a), (b), and (c). A light change is present in all three frames, and a shadow of the person in the red coat appears briefly in frame #2. After deletion of the false changing regions, Figure 2(d), (e), and (f) show that the light change and the shadow are successfully removed.

Figure 2. (a) The changing regions of frame #1 in experiment 1.

Figure 2. (b) The changing regions of frame #2 in experiment 1.

Figure 2. (c) The changing regions of frame #3 in experiment 1.

Figure 2. (d) The final result of frame #1.

Figure 2. (e) The final result of frame #2.

Figure 2. (f) The final result of frame #3.

Another experiment is shown in Figure 3. Figure 3(a), (b), and (c) show the changing regions of three consecutive frames. There is a remaining region of light change, and a shadow accompanies the walking person. Figure 3(d), (e), and (f) show that our method successfully deletes the regions of the light change and the shadow.

Figure 3. (a) The changing regions of frame #1 in experiment 2.

Figure 3. (b) The changing regions of frame #2 in experiment 2.

Figure 3. (c) The changing regions of frame #3 in experiment 2.

Figure 3. (d) The final result of frame #1.

Figure 3. (e) The final result of frame #2.

Figure 3. (f) The final result of frame #3.

5. Conclusions

In this paper, we propose a new method for motion detection and object tracking. The proposed method can reduce noise, detect motion, track objects, and delete the light changes and the shadows. Experimental results verify that the proposed method can be applied to real image sequences and provides satisfactory results.

References

[1] Fang-Hsuan Cheng and Yu-Liang Chen, Real time multiple objects tracking and identification based on discrete wavelet transform, Pattern Recognition 39 (2006), pp. 1126-1139.
[2] C.R. Wren, A. Azarbayejani, T. Darrell, A. Pentland, Pfinder: real-time tracking of the human body, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997), pp. 780-785.
[3] N. Oliver, B. Rosario, A. Pentland, A Bayesian computer vision system for modeling human interactions, Proceedings of the International Conference on Vision Systems '99, 1999, pp. 255-272.
[4] K. Sato, J.K. Aggarwal, Tracking and recognizing two-person interaction in outdoor image sequences, 2001 IEEE Workshop on Multi-Object Tracking, pp. 87-94.
[5] M. Quming, J.K. Aggarwal, Tracking and classifying moving objects from video, Proceedings of the Second IEEE International Workshop on PETS, 2001.
[6] A. Lipton, H. Fujiyoshi, R. Patil, Moving target classification and tracking from real-time video, Proceedings of the 1998 DARPA Image Understanding Workshop.
[7] C. Anderson, P. Burt, G. van der Wal, Change detection and tracking using pyramid transformation techniques, Proceedings of SPIE Intelligent Robots and Computer Vision, vol. 579, 1985, pp. 72-78.
[8] B. Bascle, R. Deriche, Region tracking through image sequences, Proceedings of the IEEE International Conference on Computer Vision, 1995, pp. 302-307.
[9] I. Haritaoglu, D. Harwood, L.S. Davis, W4: real-time surveillance of people and their activities, IEEE Trans. Pattern Anal. Mach. Intell. 22 (8) (2000), pp. 266-280.
[10] N. Paragios, G. Tziritas, Adaptive detection and localization of moving objects in image sequences, Signal Process.: Image Commun. 14 (1999), pp. 277-296.
[11] H.H. Nagel, G. Socher, H. Kollnig, M. Otte, Motion boundary detection in image sequences by local stochastic tests, Proceedings of the European Conference on Computer Vision, vol. 2, 1994, pp. 305-315.
[12] N. Diehl, Object-oriented motion estimation and segmentation in image sequences, IEEE Trans. Image Process. 3 (1990), pp. 1901-1904.
