
Counting Traffic Using Optical Flow Algorithm on Video Footage of a Complex Crossroad

Alvin Abdagic, Omer Tanovic, Abdulah Aksamovic and Senad Huseinbegovic
Department for Automatic Control and Electronics
Faculty of Electrical Engineering, University of Sarajevo
Sarajevo, Bosnia and Herzegovina
Email: alvin.abdagic@etf.unsa.ba

Abstract—A practical solution for counting traffic is analysed in this paper. The goal was to develop a solution that relies more on CPU processing than on placing many complex sensors. This approach should reduce the costs of both deployment and maintenance of such a system. The proposed solution uses video footage from a properly placed camera as input data. Vehicle movement is detected using an optical flow algorithm, and a great deal of post-processing is used to cope with the problems that arise from this method of movement detection. Since video of the entire crossroad is available, it is possible to fully analyse vehicle movement, produce turning movement counts, and efficiently estimate (time varying) origin-destination trip tables if enough crossroads are monitored this way. The solution was developed in Simulink using the Video and Image Processing Toolbox and custom blocks.

Index Terms—crossroad; traffic counting; turning movement counts; origin-destination trip table; cameras; video processing; optical flow

I. INTRODUCTION

Counting traffic is a prerequisite for many aspects of traffic monitoring, management and infrastructure planning. Manual traffic counting, although the most accurate method, is also the most expensive. Nowadays the most common method of counting traffic is by means of specialised sensors such as magnetic loop detectors [1], [2]. Such sensors often require that the road itself be drilled, and once they are placed at a specific location they cannot be relocated. All such solutions also require a lot of wiring. Moreover, they can only count the number of vehicles that come from a specific direction and the number of vehicles that go in a specific direction [3]. If, during a certain time period, vehicles can turn in multiple directions coming from one lane and can reach a certain lane coming from different directions, such sensors cannot directly give us turning movement counts without estimation [4]. Dynamic transportation models are becoming widespread due to increasing access to powerful computing resources, and the use of dynamic traffic assignment (DTA) models in traffic simulation for different planning applications has become quite common [5], [6], [7]. Such models cannot exist without time varying origin-destination (O-D) trip tables. However, these are generally not directly observed and must be estimated from other traffic measurements such as counts. Such estimations have been shown to be extremely difficult to implement without accurate turning movement counts [8]. Therefore, relying on estimated turning movement counts is not an option.

A different approach is to use a camera mounted in a location that gives an overview of the crossroad, to use multiple cameras, or to use lenses with wider viewing angles [9], [10]. The video is then analysed, usually frame by frame. Vehicle detection is almost always done using background subtraction [11], [12]. However, in order to perform background subtraction, the background itself must be known. In real situations the background is not time invariant: different ambient lighting and weather conditions, to name just a few factors, modify the background. This problem is usually solved by using background learning algorithms [13], [14]. All such algorithms share the same problem when used on video footage of a crossroad: vehicles cannot be assumed to be always moving, and traffic congestion is a common event. In such scenarios, stationary or slow moving vehicles corrupt background learning algorithms because they get confused with the background, and additional post-processing is then needed to cope with these issues. In this paper vehicle detection is done in a different manner, which is more suitable for crossroads. Optical flow, computed with the algorithm described in [15], is calculated between every two successive frames. Optical flow is much more computationally complex than background subtraction, but as computers get more powerful every day, it has become feasible to use it. Since optical flow can be implemented as a parallel algorithm, real time performance can be achieved. Using a camera to count traffic at a crossroad has an important advantage over using specialised sensors: vehicles are tracked throughout the entire crossroad, so turning movement counts can be generated directly and time varying O-D trip tables can be estimated very efficiently if enough crossroads are monitored this way. An overview of the realised solution is shown in Figure 1, and the individual steps of the algorithm are described in the following sections.

II. VEHICLE DETECTION

Frame segmentation is performed using the optical flow algorithm. The actual direction of pixel movement is not used, only the intensity of the movement.
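As an illustration of this first step, the sketch below computes a dense optical flow field between two successive frames and keeps only its magnitude. The paper's Simulink implementation uses the Lucas-Kanade method of [15]; here OpenCV's Farneback dense flow is used as a stand-in, so the function and parameter choices are assumptions rather than the authors' exact configuration.

```python
import cv2
import numpy as np

def motion_intensity(prev_gray: np.ndarray, curr_gray: np.ndarray) -> np.ndarray:
    """Return per-pixel movement intensity between two grayscale frames.

    Dense Farneback flow is used here as a stand-in for the Lucas-Kanade
    algorithm referenced in the paper; only the magnitude of the flow
    vectors is kept, matching the paper's use of intensity alone.
    """
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    # flow[..., 0] and flow[..., 1] are the horizontal and vertical components.
    return np.hypot(flow[..., 0], flow[..., 1])
```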

Figure 1: Overview of the solution

Ideally, detection would be better if pixels could be grouped together using movement vectors of equal directions and intensities. In reality, however, the optical flow algorithm does not provide results that are sufficiently accurate for such analysis. Even if it did, such analysis would be overly complex and time consuming. Instead, a different approach is used: pixels are classified as either moving or stationary. Binary thresholding is used in the following manner. First, movement intensities are calculated as the modulus of the complex number associated with each pixel. Then the average of these values is calculated across the entire frame and then over time (as a weighted average of the previous average and the new average). Binary thresholding is performed using this value as the limit. In this manner a logical one or zero is associated with each individual pixel in the current frame, a logical one representing a moving pixel. Using this data as a binary image gives a rather descriptive black and white image. As cars are separated by a noticeable distance from one another, connected moving pixels can be grouped together. Since the optical flow algorithm is not ideal by design, morphological closing is used to fill gaps that should not appear, and a median filter with a 3x3 pixel neighbourhood is used to remove noise introduced during both recording and optical flow calculation. Vehicles are then detected using blob analysis: a group of white pixels (i.e. moving by the criterion above) that can be connected into a single contour is called a blob, and each blob is considered to represent a vehicle. In order to filter out other moving objects in the video, such as pedestrians, filtering based on a minimum blob area is used.
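A minimal sketch of this thresholding and blob-analysis stage is given below. It assumes the motion-intensity image from the previous sketch and uses OpenCV morphology, median filtering and connected-component analysis; the running-average threshold update and the minimum-area value are illustrative choices, not the exact constants used in the Simulink model.

```python
import cv2
import numpy as np

MIN_BLOB_AREA = 400   # assumed value; filters out pedestrians and noise

def detect_vehicles(intensity: np.ndarray, running_avg: float, alpha: float = 0.05):
    """Threshold the motion-intensity image and return vehicle bounding boxes.

    The threshold is a running (weighted) average of the per-frame mean
    intensity, as described in the paper; morphological closing and a 3x3
    median filter clean the binary mask before blob analysis.
    """
    # Update the time-averaged threshold with the current frame's mean intensity.
    running_avg = (1 - alpha) * running_avg + alpha * float(intensity.mean())

    # Binary thresholding: a logical one (255 here) marks a moving pixel.
    moving = (intensity > running_avg).astype(np.uint8) * 255

    # Fill small gaps left by imperfect optical flow, then remove speckle noise.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    moving = cv2.morphologyEx(moving, cv2.MORPH_CLOSE, kernel)
    moving = cv2.medianBlur(moving, 3)

    # Blob analysis: each connected component above the area limit is a vehicle.
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(moving, connectivity=8)
    boxes = []
    for i in range(1, num):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= MIN_BLOB_AREA:
            boxes.append((x, y, x + w, y + h))
    return boxes, running_avg
```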
Using this method of vehicle detection yields results of questionable quality, and several problems arise. First of all, depending on the location of the camera, partial occlusions are possible. When one vehicle is closer to the camera and the camera is not placed in a sufficiently elevated location, another vehicle (farther from the camera) can be partially occluded by the first one. In this case the proposed method of vehicle detection will detect the two vehicles as one. Furthermore, the middle part of a vehicle, i.e. the roof, is in most vehicles a clear surface of highly homogeneous colour. This is a problem for the optical flow algorithm: since all the pixels on the roof look very much alike, the algorithm sometimes concludes that there is no movement in this area, while other parts of the vehicle are sufficiently distinct to be correctly detected. The effect on performance is that cars are sometimes "split in half": the front and the back are detected as moving, the roof is not, and the front and the back get detected as two individual vehicles. Finally, the entire detection algorithm is based on detecting moving objects in the video, so when a vehicle stops moving it can no longer be detected. All of the described problems could be partially solved using additional image and video processing, but this would require a great deal of processing. Therefore, we employ a different idea: vehicle detection is used as it is, and the problems that arise are fixed in a later stage of post-processing. This post-processing no longer deals with images and video but with the locations of detected vehicles.

III. VEHICLE TRACKING

Videos are, after all, sequences of sampled images. High resolutions are necessary for the optical flow algorithm to give acceptable results, so the frame rates of such videos cannot be high, as frame rates almost always drop as the resolution is increased. In such a scenario, identifying the same vehicle in successive frames is an important issue. We considered that a simple approach should yield results of acceptable accuracy, and experiments confirmed this. Vehicles at this point are seen only as rectangles, so a purely geometric calculation is sensible. All vehicles share a common characteristic: the dimension along which the vehicle normally moves is larger than the other one. Because of this, it is important to allow a larger displacement between frames along the larger dimension. In order to achieve this, the following test is used. In each (n+1)-th frame, the centroid of the vehicle is calculated. If this centroid is inside the rectangle that represents the vehicle in the n-th frame, then we consider the two vehicles to be the same.

The coordinate system used for all geometric calculations is similar to the Cartesian coordinate system, with the origin in the upper left corner of the video; the notable difference is that the positive direction of the vertical axis is downward. Let the rectangle that represents the car in the (n+1)-th frame be defined by two points in this coordinate system, the upper left point A(x_A, y_A) and the lower right point B(x_B, y_B), and let the centroid of this rectangle be the point C(x_C, y_C).

The following relations hold:

\[
x_C = \frac{x_A + x_B}{2}, \qquad y_C = \frac{y_A + y_B}{2}. \tag{1}
\]

Let the rectangle that represents the car in the n-th frame be defined by two points in the same coordinate system, the upper left point D(x_D, y_D) and the lower right point E(x_E, y_E). Then the centroid C(x_C, y_C) lies within this rectangle if and only if

\[
(x_D < x_C < x_E) \wedge (y_D < y_C < y_E). \tag{2}
\]

Figure 2: Tracking vehicles

In Figure 2 we can observe how vehicles are tracked between subsequent frames. Solid rectangles are blobs representing vehicles detected in the (n+1)-th frame and dashed rectangles are vehicles detected in the n-th frame. The centroid of the solid red rectangle is inside the dashed red rectangle, therefore the two blobs represent the same vehicle. The same holds for the blue rectangles.

Before the information about tracked vehicles can be used, the issues present after detection must be solved. The previous section described how a single vehicle can mistakenly be detected as two. However, those two segments will continue to move at equal speeds throughout the video. Even if two vehicles (one driving behind another) can move at similar speeds, they will either be far apart or their speed difference will be noticeable during acceleration and deceleration. Grouping car segments can be done using these observations, and is well documented in [16]. Additionally, once car segments are grouped and identified as a single vehicle in the n-th frame, in every subsequent frame both segment centroids will be inside the rectangle that represents the vehicle in the n-th frame.

Vehicle detection is based on movement detection, therefore stationary vehicles cannot be detected. Nevertheless, logic implies that vehicles cannot disappear from a crossroad. Whenever a vehicle detected in the previous frame is not detected in a new frame, we can assume it has stopped moving. Its location is memorised and we wait for the vehicle to continue its movement, when it can be detected again. Only vehicles on the boundary of the video frame are not memorised, as it is possible that they can no longer be seen.

Partial occlusion is somewhat difficult to solve using only the locations of the blobs. Because of this, cameras should be placed so as to minimise such situations. It is important to place the camera so that, at least during a fraction of the time vehicles take to pass through the crossroad, they are fully visible. After they have been identified as separate vehicles, they can be tracked together while still being counted as several. To separate this case from a single car being "split in half", the following observation can be used. When a single car is segmented due to a failure of the optical flow algorithm, it is always separated along the axis orthogonal to the direction of movement. Conversely, if vehicles occlude one another, when they become fully visible they will be separated along the axis of movement. This algorithm for detection of partial occlusion was not tested in this particular solution; instead we ensured that the camera is properly placed so that partial occlusions are minimal.

IV. TURNING MOVEMENT COUNTS

The main advantage of the described method of traffic counting is that vehicles are tracked throughout the crossroad, so turning movement counts (TMC) can be generated without any need for estimation. Given a video feed from a crossroad, it is simple to manually identify all allowed turning movements. Some turning movements can be made in several different ways, i.e. from different and to different lanes. Individual lanes are marked with line segments placed close to the traffic light. Testing whether a certain vehicle was moving along a certain lane amounts to testing whether the rectangle that represents the vehicle has at some time intersected the corresponding line segment. Each allowed turning movement is then defined by two lanes, i.e. two line segments. These turning movements can be grouped together when calculating TMC if they in fact represent the same movement completed in different ways.

Let the line segment that represents a lane be defined by two points in the previously described coordinate system, the upper left point A(x_A, y_A) and the lower right point B(x_B, y_B). Let the rectangle that represents the car be defined by two points in the same coordinate system, the upper left point C(x_C, y_C) and the lower right point D(x_D, y_D). Then the line segment and the rectangle intersect (i.e. the car is in the specified lane) if and only if

\[
(\exists t \in [0,1]) \; \bigl( (x_C \le x_A + t\,(x_B - x_A) \le x_D) \wedge (y_C \le y_A + t\,(y_B - y_A) \le y_D) \bigr). \tag{3}
\]
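To make the tracking test of relations (1) and (2) concrete, a small sketch is given below. It is a simplified re-implementation of the centroid-in-rectangle criterion in Python (the paper's implementation is a set of Simulink blocks), and the ID bookkeeping for stopped vehicles is only a rough illustration of the memorisation idea described above, not the authors' exact logic.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_left, y_top, x_right, y_bottom)

def centroid(box: Box) -> Tuple[float, float]:
    """Relation (1): centroid of an axis-aligned rectangle."""
    xa, ya, xb, yb = box
    return (xa + xb) / 2.0, (ya + yb) / 2.0

def same_vehicle(prev_box: Box, curr_box: Box) -> bool:
    """Relation (2): the centroid of the (n+1)-th frame rectangle must lie
    inside the n-th frame rectangle for the two blobs to be the same vehicle."""
    xc, yc = centroid(curr_box)
    xd, yd, xe, ye = prev_box
    return (xd < xc < xe) and (yd < yc < ye)

def update_tracks(tracks: Dict[int, Box], detections: List[Box], next_id: int):
    """Match new detections to existing tracks; unmatched tracks are kept
    (a stopped vehicle is assumed to reappear later), and unmatched
    detections start new IDs. A rough sketch of the bookkeeping only."""
    updated: Dict[int, Box] = dict(tracks)
    for det in detections:
        match = next((vid for vid, box in tracks.items() if same_vehicle(box, det)), None)
        if match is None:
            match, next_id = next_id, next_id + 1
        updated[match] = det
    return updated, next_id
```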

However, checking whether rule (3) holds requires solving an equation, which is computationally a nontrivial problem. It is important to notice that the value of the variable t in the rule above does not matter in any way, except that it must lie within [0, 1]. With this observation we can simplify the stated rule and construct an equivalent one, specified by relation (4):

\[
\begin{cases}
\max\!\left(\dfrac{x_C - x_A}{x_B - x_A},\; \dfrac{y_C - y_A}{y_B - y_A}\right) \le \min\!\left(\dfrac{x_D - x_A}{x_B - x_A},\; \dfrac{y_D - y_A}{y_B - y_A}\right) \\[2ex]
\max\!\left(\dfrac{x_C - x_A}{x_B - x_A},\; \dfrac{y_C - y_A}{y_B - y_A}\right) \le 1 \\[2ex]
\min\!\left(\dfrac{x_D - x_A}{x_B - x_A},\; \dfrac{y_D - y_A}{y_B - y_A}\right) \ge 0
\end{cases} \tag{4}
\]

Even though relation (4) seems more complicated than (3), it is in fact computationally very simple. It involves only basic mathematical operations (subtraction and division) and comparisons, and it can even be transformed to use only integer arithmetic if required.
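A compact sketch of the test in relation (4) is shown below, together with a simple way the intersection events could be tallied into turning movement counts. The entry/exit pairing helper is an assumption about how the bookkeeping might look; the paper only states that each allowed turning movement is defined by two lane segments.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_left, y_top, x_right, y_bottom)

def segment_hits_box(a: Point, b: Point, box: Box) -> bool:
    """Relation (4): does the lane segment AB intersect the vehicle rectangle?

    The segment is parameterised as A + t*(B - A), t in [0, 1]. The three
    inequalities of (4) amount to checking that the t-interval allowed by the
    x-axis, the t-interval allowed by the y-axis, and [0, 1] share a common
    point; lo and hi track that common interval.
    """
    (xa, ya), (xb, yb) = a, b
    xc, yc, xd, yd = box
    lo, hi = 0.0, 1.0  # running bounds on admissible t
    for p_a, p_b, p_min, p_max in ((xa, xb, xc, xd), (ya, yb, yc, yd)):
        d = p_b - p_a
        if d == 0.0:
            # Segment parallel to this axis: it must already lie inside the slab.
            if not (p_min <= p_a <= p_max):
                return False
            continue
        t0, t1 = sorted(((p_min - p_a) / d, (p_max - p_a) / d))
        lo, hi = max(lo, t0), min(hi, t1)
    return lo <= hi  # some t in [0, 1] satisfies both axes

def count_turning_movement(counts: Dict[Tuple[str, str], int],
                           entry_lane: str, exit_lane: str) -> None:
    """Hypothetical tally: one completed movement from entry_lane to exit_lane."""
    key = (entry_lane, exit_lane)
    counts[key] = counts.get(key, 0) + 1
```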

V. EXPERIMENTAL RESULTS

In order to test as many potentially problematic scenarios as possible, we did not use a video feed from a real world crossroad but instead created animations of a crossroad. Since partial occlusions were not handled, it would have been necessary to place several cameras at an actual crossroad to get good coverage, which is somewhat difficult. The synthetically created scenarios serve as a proof of concept, whereas in practical applications it would be feasible to place a sufficient number of cameras to ensure good coverage.

The results of binary thresholding and the morphological operations are shown in Figure 3b. Red ellipses emphasise areas where stationary vehicles are located but cannot be detected using optical flow, and a blue ellipse emphasises a vehicle that is "split in half" due to the previously described homogeneous roof colour. After post-processing (Figure 3c) we can see that all vehicles are successfully tracked (even the stationary ones) and that the one that was "split in half" is successfully detected as a single vehicle. Additionally, all vehicles are assigned unique IDs so that turning movements can be monitored.

Figure 3: (a) Input video feed; (b) Binary image after thresholding; (c) Tracked vehicles after all post-processing with assigned unique IDs

VI. CONCLUSION

Traffic flow measurement using a camera has shown itself to be a feasible solution. Even though idealised conditions were analysed, the results in a real world scenario should not be different. We have demonstrated that video feeds from inexpensive cameras can be used instead of specialised sensors to monitor traffic at crossroads and provide valuable data for different planning applications and traffic simulations.

The presented solution works in such a way that potential geometrical deformations of the video feed would not influence the results, because cars are detected not by features but by movement. The animation that was created showed a crossroad where no occlusions could happen. The issue that arises in a practical application is partial occlusion of vehicles, but means to solve this issue are proposed in this paper.

It may not be possible to position a camera so that the entire crossroad is visible. In such a situation we can use two or more cameras to cover the entire crossroad, and the input video feed could then be a simple concatenation of the different feeds. During such concatenation it would only be important to align the edges of the feeds, and for this geometrical transformations can be used, since geometric distortions of the feeds do not affect performance.
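As a small illustration of this idea, the sketch below stitches two synchronised camera frames side by side into a single input feed. The equal-height resize is an assumed normalisation step, and any geometric alignment of the overlapping edges is omitted here.

```python
import cv2
import numpy as np

def concatenate_feeds(frame_left: np.ndarray, frame_right: np.ndarray) -> np.ndarray:
    """Stitch two synchronised camera frames side by side into one input feed.

    Frames are resized to a common height before concatenation; further
    geometric transformations for edge alignment are left out of this sketch.
    """
    height = min(frame_left.shape[0], frame_right.shape[0])

    def to_height(frame: np.ndarray) -> np.ndarray:
        scale = height / frame.shape[0]
        return cv2.resize(frame, (int(frame.shape[1] * scale), height))

    return np.hstack((to_height(frame_left), to_height(frame_right)))
```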

When using multiple cameras to cover the crossroad, they can also be arranged in a way that minimises the effects that occlusions could cause.

This solution, with the addition of occlusion handling as described and using multiple video feeds from actual crossroads, will be analysed in future work.

Additionally, weather conditions were considered ideal, whereas rain, snow or limited visibility could be a serious challenge for the optical flow algorithm. A possible solution could be the use of infrared cameras, which would not be as severely affected by weather conditions.

The entire solution was developed and tested in Simulink. This leaves room to further analyse the possibilities of an embedded solution that would not require the presence of a computer.

REFERENCES

[1] R. Bishop, "A survey of intelligent vehicle applications worldwide," Intelligent Vehicles Symposium, 2000 (IV 2000), Proceedings of the IEEE, pp. 25-30, 2000.
[2] M. Mills, "Inductive loop detector analysis," 31st IEEE Vehicular Technology Conference, 1981, vol. 31, pp. 401-411, Apr. 1981.
[3] C.-J. Lan, "Sufficiency of detector information under incomplete configuration for intersection OD estimation," Intelligent Transportation Systems, 2001, Proceedings, 2001 IEEE, pp. 398-403, 2001.
[4] C.-J. Lan, "Adaptive turning flow estimation based on incomplete detector information for advanced traffic management," Intelligent Transportation Systems, 2001, Proceedings, 2001 IEEE, pp. 830-835, 2001.
[5] D. Yu, X. Yin, L. Du, and J. Xie, "Simulation research on dynamic traffic assignment model," Intelligent Computation Technology and Automation, 2009 (ICICTA '09), Second International Conference on, vol. 2, pp. 240-243, Oct. 2009.
[6] L. Zhen-long, "A differential game modeling approach to dynamic traffic assignment and traffic signal control," Systems, Man and Cybernetics, 2003, IEEE International Conference on, vol. 1, pp. 849-855, Oct. 2003.
[7] Y.-M. Chen and D.-Y. Xiao, "Real-time traffic management under emergency evacuation based on dynamic traffic assignment," Automation and Logistics, 2008 (ICAL 2008), IEEE International Conference on, pp. 1376-1380, Sept. 2008.
[8] R. Balakrishna, D. Morgan, et al., "Advances in origin-destination trip table estimation for transportation planning and traffic simulation," European Transport Conference (ETC), 2008.
[9] S.-M. Lee and H. Baik, "Origin-destination (O-D) trip table estimation using traffic movement counts from vehicle tracking system at intersection," IEEE Industrial Electronics, IECON 2006, 32nd Annual Conference on, pp. 3332-3337, Nov. 2006.
[10] S.-M. Lee, H. Baik, and J. Park, "Visual traffic movement counts at intersection and origin-destination (O-D) trip table estimation," Intelligent Transportation Systems Conference, 2007 (ITSC 2007), IEEE, pp. 1108-1113, Sept.-Oct. 2007.
[11] Y.-K. Jung and Y.-S. Ho, "Traffic parameter extraction using video-based vehicle tracking," Intelligent Transportation Systems, 1999, Proceedings, 1999 IEEE/IEEJ/JSAI International Conference on, pp. 764-769, 1999.
[12] T. Ikeda, S. Ohnaka, and M. Mizoguchi, "Traffic measurement with a roadside vision system: individual tracking of overlapped vehicles," Pattern Recognition, 1996, Proceedings of the 13th International Conference on, vol. 3, pp. 859-864, Aug. 1996.
[13] M. Fathy and M. Siyal, "A window-based image processing technique for quantitative and qualitative analysis of road traffic parameters," IEEE Transactions on Vehicular Technology, vol. 47, no. 4, pp. 1342-1349, Nov. 1998.
[14] C. Zhang, S.-C. Chen, M.-L. Shyu, and S. Peeta, "Adaptive background learning for vehicle detection and spatio-temporal tracking," Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia, Proceedings of the 2003 Joint Conference, vol. 2, pp. 797-801, Dec. 2003.
[15] B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," Image Understanding Workshop, 1981, Proceedings, DARPA, pp. 121-130, 1981.
[16] D. Beymer, P. McLauchlan, B. Coifman, and J. Malik, "A real-time computer vision system for measuring traffic parameters," Computer Vision and Pattern Recognition, 1997, Proceedings, 1997 IEEE Computer Society Conference on, pp. 495-501, June 1997.

