

A Vision-Based Approach to Collision Prediction at Traffic Intersections
Stefan Atev, Hemanth Arumugam, Osama Masoud, Ravi Janardan, Senior Member, IEEE, and
Nikolaos P. Papanikolopoulos, Senior Member, IEEE

Abstract: Monitoring traffic intersections in real time and predicting possible collisions is an important first step towards building an early collision-warning system. We present a vision-based system addressing this problem and describe the practical adaptations necessary to achieve real-time performance. Innovative low-overhead collision-prediction algorithms (such as the one using the time-as-axis paradigm) are presented. The proposed system was able to perform successfully in real time on videos of quarter video graphics array (VGA) (320 × 240) resolution under various weather conditions. The errors in target position and dimension estimates in a test video sequence are quantified, and several experimental results are presented.

Index Terms: Collision prediction, machine vision, real-time systems, tracking, traffic control (transportation).

Manuscript received February 20, 2004; revised June 23, 2005. This work was supported by the Minnesota Department of Transportation, the Intelligent Transportation Systems (ITS) Institute at the University of Minnesota, and the National Science Foundation through Grant CMS-0127893. The Associate Editor for this paper was M. Kuwahara. The authors are with the Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, MN 55455 USA (e-mail: atev@cs.umn.edu; hemanth@cs.umn.edu; masoud@cs.umn.edu; janardan@cs.umn.edu; npapas@cs.umn.edu). Digital Object Identifier 10.1109/TITS.2005.858786

I. INTRODUCTION

INTELLIGENT transportation systems (ITS) are built from technologies such as sensing, control, engineering, and computing to solve transportation-related problems. Such systems have evolved from primarily addressing traffic-control problems to current applications that involve better lane design for reduced congestion, systems that help reduce traffic-related fatalities, and better vehicle- and pedestrian-monitoring systems for studying flow patterns in various traffic scenarios. Such systems perform well in steady-state traffic situations like those on a freeway, but perform badly when applied to the highly unsteady flow of a busy traffic intersection.

Traffic-monitoring applications regularly make use of computer-vision principles to model and analyze traffic scenarios. For example, in [11], a contour tracker was used to model each moving vehicle, while the problem of tracking vehicles under challenging conditions on a freeway is addressed in [6] using a feature-based tracking system. Kato et al. [10] have used a hidden Markov model (HMM) for traffic monitoring. Tracking technology has evolved to a point where techniques such as particle filtering [1], [17], [19], the combination of multiple visual cues [13], [23], [25], data association [4], [22], and occlusion handling [14] present new opportunities for research in the area.

Studies have shown that collisions between vehicles at traffic intersections account for nearly a third of all reported crashes in the United States [8], [18], [21]. This has led to considerable interest at the federal level in developing an intelligent low-cost system that can detect and prevent potential collisions in real time. This paper presents the various components of a vision-based system that monitors a traffic intersection and uses the tracking results to predict collisions over a short time interval extending into the future. Our goal is to establish the feasibility of this approach; the specific way in which these predictions can be used to prevent possible traffic accidents, however, is not addressed at this time. As far as we know, this is the first time that computer vision, in conjunction with computational geometry, has been applied to supply a solution to this complex problem.

The rest of this paper is organized as follows: We initially discuss the problem that our system addresses and explain our particular design choices. The techniques used to predict collisions and measure vehicle dimensions are then presented. Finally, the paper presents and evaluates our results on test video sequences.

II. PROBLEM SPECIFICATION AND SYSTEM OVERVIEW

The purpose of our system is to monitor a traffic intersection using a live video feed from one or more cameras over extended periods of time. Possible collisions between tracked vehicles must be detected before they occur so that a timely warning may be issued. It is critical that the system runs in real time and that it adapts to changes in the environment for the duration of the monitoring. In addition, since the system must predict collisions, we need to produce accurate estimates of both the positions and dimensions of vehicles, preferably in real-world units. Initially, we looked at deriving the bounding rectangles (for the overall framework, please see Fig. 1), but we eventually moved to more complex representations (bounding boxes) due to vehicle-size-estimation inaccuracies associated with bounding rectangles.

Appearance-based trackers, such as the kernel-based mean-shift tracker presented in [7], are good at following moving objects, even in the presence of partial occlusions. Unfortunately, such methods usually require a target model to be manually specified and cannot delineate objects accurately. Since we cannot provide target models in advance for the various vehicles that can enter a traffic scene, and because we need the outlines of vehicles to measure their dimensions, we chose to base our target models on connected regions of foreground pixels.

Fig. 1. Initial approach used, based on bounding rectangles.

Connected regions (blobs) extracted from a foreground mask provide us with good object outlines, allow for automatic target identification, and are well suited for real-time systems.

In order to classify foreground pixels, we need a background model of the observed scene. Due to the gradual changes in scene appearance over extended periods of time, we cannot use a static background model. Instead, we use an adaptive background model based on the mixtures-of-Gaussians segmentation method in [24]. The resulting background/foreground classifier adapts well to gradual changes in the monitored outdoor environment and allows for the detection of targets even if they are not moving, a common occurrence at traffic intersections. Occasionally, the background extraction can fail, either because of sudden illumination changes in the scene caused by passing clouds or a camera's gain-control circuitry, or due to minor camera shakes resulting from road vibrations or wind load. We have developed efficient methods for compensating for sudden illumination changes and camera shakes. We have also devised a fast implementation of the method presented in [5] for cleaning up the foreground masks.

Vehicles that move throughout the scene will sometimes occlude other moving objects, or be themselves occluded by static objects such as road-sign poles, traffic lights, etc. Such occlusions cause blobs to merge, split into smaller regions, or disappear completely. These interactions between connected regions can either cause a single target to be visible as more than one blob in the foreground, or cause several targets to be represented as a single blob. The problem can be alleviated by making use of multiple cameras to observe the same intersection; proper camera placement can ensure that vehicles are rarely occluded in the views of all cameras. However, even when multiple cameras are used, it is advantageous to have some means of handling occlusions in a single view. For that reason, we represent targets as sets of regions and introduce a second-level tracker that is capable of handling blob merges and splits. The second-level tracker also makes use of camera-calibration data in order to estimate the position and velocity of vehicles in world coordinates. The calibration method we use is presented in [16] and allows us to accurately map points from the image planes of all cameras to positions on a single world-coordinate ground plane.

It is common practice to use the centroids of connected regions to represent a target's position. Such an approach is suboptimal, because the position of a centroid relative to the ground plane depends on the size and orientation of vehicles and also on the particular camera placement. The last fact complicates multicamera tracking, since the centroids tracked in different camera views do not correspond to the same real-world point. We introduce a method that can identify the centers of vehicular bases on the ground plane given the outlines of the vehicles and the camera-calibration data. The base centers identified by our method correspond to the same real-world point, which allows for the sequential incorporation of the measurements in a target's state vector. The method also produces estimates of the width, length, and height of vehicles that are critical for the collision-prediction system.
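For illustration, the following minimal sketch shows a foreground-extraction front end of the kind described above, using OpenCV's built-in mixture-of-Gaussians background subtractor as a stand-in for the method of [24]; the morphological cleanup is a simplistic substitute for the mask-cleaning method of [5], and the file name and all parameter values are illustrative assumptions rather than the authors' settings.

```python
import cv2
import numpy as np

# Adaptive mixture-of-Gaussians background model (stand-in for [24]).
bg_model = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                              detectShadows=False)

def extract_blobs(frame, min_area=50):
    """Classify foreground pixels and return the connected regions (blobs)."""
    fg_mask = bg_model.apply(frame)                  # per-pixel classification
    kernel = np.ones((3, 3), np.uint8)               # crude mask cleanup,
    fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)  # cf. [5]
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(fg_mask)
    # Label 0 is the background; keep blobs above a minimum area.
    blobs = [(stats[i], centroids[i]) for i in range(1, n)
             if stats[i, cv2.CC_STAT_AREA] >= min_area]
    return fg_mask, blobs

cap = cv2.VideoCapture("intersection.avi")           # hypothetical input feed
ok, frame = cap.read()
while ok:
    mask, blobs = extract_blobs(frame)               # blobs feed the tracker
    ok, frame = cap.read()
```

Each blob's bounding statistics and centroid would then feed the second-level tracker described above.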

Fig. 2. Low-level vision-system components and data flow for a single frame. Lighter lines indicate the use of data from the processing of the previous frame.

Fig. 3. Collision-prediction-system components and data flow. Size and position estimates are provided to the ground-plane tracking component of the
tracking module.
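The paper does not give the ground-plane tracker's equations; as a hedged sketch of the ground-plane tracking component named in Fig. 3, the fragment below assumes a standard constant-velocity Kalman filter over the world-coordinate state [x, y, vx, vy], which is consistent with the constant-velocity prediction used in Section IV-B. The matrices and noise levels are illustrative assumptions.

```python
import numpy as np

dt = 1.0 / 30.0                        # assumed frame interval (30-Hz video)
F = np.array([[1.0, 0.0, dt,  0.0],    # constant-velocity state transition
              [0.0, 1.0, 0.0, dt ],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.0, 0.0],    # only the base center (x, y) is measured
              [0.0, 1.0, 0.0, 0.0]])
Q = 0.05 * np.eye(4)                   # assumed process-noise covariance
R = 0.50 * np.eye(2)                   # assumed measurement-noise covariance

def kf_step(x, P, z):
    """One predict/update cycle for a vehicle's ground-plane state
    x = [x, y, vx, vy], given a measured base center z = [x, y]."""
    x, P = F @ x, F @ P @ F.T + Q                 # predict one frame ahead
    S = H @ P @ H.T + R                           # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    x = x + K @ (z - H @ x)                       # correct with the measurement
    P = (np.eye(4) - K @ H) @ P
    return x, P
```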

The last component of our system is the collision-prediction module. Given the visual measurements of all targets in the scene, the module reports all target pairs that will collide within a time interval of length L in the immediate future, assuming their velocities stay constant. Optionally, the actual time of impact for each target pair that collides within the specified L time units can be reported. We present several methods for predicting pairwise collisions. The most effective method is based on the idea of extruding the two-dimensional (2-D) vehicular bases along a time axis to obtain three-dimensional polytopes that can be tested for overlap efficiently by making use of the Separating Axis Theorem [9].

III. LOW-LEVEL VISION SYSTEM

In this section, we summarize the methods used. Initially, we tried to base our methods on bounding rectangles. This worked in an acceptable fashion, but forced us to assume a known vehicle width, which severely distorted the vehicle-size estimates. We then developed more advanced methods for background-model maintenance, illumination filtering, camera-shake compensation, and noise removal in the foreground mask. Fig. 2 shows the components of the low-level vision system and the data interactions between them. More information about these methods can be found in [2].

IV. COLLISION PREDICTION

Initially, the collision-prediction module was responsible for measuring the bounding rectangles. We eventually evolved the module to measure the base center and dimensions of a target given its outline, and to predict potential collisions between vehicles (Fig. 3). The specific way in which we use object outlines and calibration data for the first task is presented, followed by a description of the collision-detection test.

A. Dimension and Position Measurement

To measure a target's position and dimensions, we fit a three-dimensional box to its outline. The outline is the union of all contour points of regions in the target's blob collection. Assuming for the moment that we know the box's position, dimensions, and orientation, we can extend its edges into lines. Those lines intersect at three distinct vanishing points in the image plane (we also handle lines parallel to the image plane as a special case). The importance of this result is that we can reverse the process: starting with the three vanishing points and the object's outline, we can find some of the edges of the box. The vanishing point in a given direction d ∈ R^3 is also determined. The relevant directions are found by making two assumptions: that the target's orientation coincides with its direction of motion and that the bases of the targets are parallel to the ground plane. The first assumption determines the vanishing points wx and wy in directions parallel to the ground plane: the direction of motion dx and the perpendicular direction dy. The second assumption fixes the third vanishing point wz in the direction dz = [0, 0, 1]^T.

For each vanishing point, we find the two tangent lines to the convex hull of the target's outline. If a vanishing point is inside the outline's hull, it will be ignored in subsequent steps, since all box edges that vanish to that point would be contained in the outline and are thus irrecoverable.
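To make the preceding construction concrete: for a pinhole camera with a 3 × 4 projection matrix, the vanishing point of a world direction d is the image of the point at infinity [d; 0], and the two tangent lines from a vanishing point to the outline pass through the outline points at the extreme bearing angles seen from that point. The sketch below illustrates this geometry under those standard assumptions; it is not the authors' implementation, and the projection matrix P is assumed to come from the calibration method of [16].

```python
import numpy as np

def vanishing_point(P, d):
    """Vanishing point of world direction d (3-vector): the image of the
    point at infinity [d; 0] under the 3x4 projection matrix P."""
    w = P @ np.append(d, 0.0)
    if abs(w[2]) < 1e-9:
        return None        # direction parallel to the image plane (special case)
    return w[:2] / w[2]

def tangent_point_indices(vp, outline):
    """Indices of the two outline points lying on the tangent lines from vp.
    Seen from a point outside the convex hull, the outline subtends an
    angular interval narrower than pi, so the extreme signed bearings
    (measured about the direction towards the outline centroid) mark the
    tangent points; an explicit hull computation is unnecessary."""
    rel = outline - vp                         # vectors from vp to outline points
    ref = outline.mean(axis=0) - vp            # reference direction
    ref /= np.linalg.norm(ref)
    ang = np.arctan2(rel[:, 0] * ref[1] - rel[:, 1] * ref[0], rel @ ref)
    return int(np.argmin(ang)), int(np.argmax(ang))

# Hypothetical usage: wz for the vertical direction dz = [0, 0, 1]^T.
P = np.array([[800.0, 0.0, 320.0,    0.0],
              [0.0, 800.0, 240.0, 2000.0],
              [0.0,   0.0,   1.0,   10.0]])    # assumed calibration matrix
outline = np.array([[100.0, 120.0], [160.0, 115.0],
                    [170.0, 160.0], [105.0, 170.0]])
wz = vanishing_point(P, np.array([0.0, 0.0, 1.0]))
if wz is not None:
    i, j = tangent_point_indices(wz, outline)
```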

There is no need to compute the convex hull, since its vertices are a subset of the outline's vertices. The tangent lines to the hull are the lines through a vanishing point and an outline point that form the least and greatest angles relative to a fixed axis. Fig. 4 shows a sample target, the relevant vanishing points, and the directions that determine them, as well as the tangent lines identified by our algorithm.

Fig. 4. Tangent lines to a target's outline from the wx and wy vanishing points. The directions dx, dy, and dz are indicated as well.

The endpoints of the bounding box edges are found by intersecting the tangent lines. The three cases that we need to consider are shown in Fig. 5. In the last case, we cannot determine the length of two of the edges. This case does not occur if the camera is placed sufficiently high above the road plane; however, even with proper camera placement, some of the dimensions may be unavailable due to partial visibility.1 Partially visible edges will not be used to produce dimension measurements, but will still be used for position measurements, since the tracker requires them for every frame.

The center of a target's base is obtained by taking the midpoint of a line segment connecting two edge endpoints on opposite corners of the box. The real-world lengths of edges in the ground plane can be identified by using their endpoints. The length of vertical edges can be determined by employing the techniques described in [16]. The recovered heights are necessary in order to determine the lengths of edges for which both endpoints are vertically displaced from the ground plane.

1 We consider an edge partially visible if one of its endpoints is within a small distance (3 or 4 pixels) of the image borders.

B. Collision Prediction

Initially, we experimented with bounding rectangles. Since the position, orientation, and length of each vehicle are computed at every time step (vehicle width is assumed known), the input to the collision-detection module is essentially a set of oriented rectangles in a plane. The general problem to be solved by a collision-detection module can then be stated as follows: Given the position, orientation, and size, at each time step, of n oriented rectangles in 2-D, find all possible pairs of rectangles that are within a distance δ in the current and future time steps. Finding the minimum distance between n² pairs of rectangles is a harder problem than just checking whether two rectangles intersect. Since the value δ, the maximum allowed intervehicular distance, is fixed, each rectangle can be increased in size by a value of δ/2 and the problem can be converted to a simpler collision test between pairs of rectangles. It should be noted that the Minkowski sum of the rectangle with a circle of diameter δ/2 in 2-D will give a more accurate representation of the resulting shape, but the resulting geometry will no longer be a rectangle and would greatly affect the choice of the collision-detection technique employed. To keep the geometry simple, a rectangular approximation of the final shape is initially used to represent the resulting shape.

The general problem to be solved by the collision-detection module can then be restated as follows: Given the position, orientation, and size, at each time step, of n oriented rectangles in 2-D, find all possible pairs of rectangles that intersect in the current and future time steps. We then tried various practical computational techniques to detect collisions between rectangles in real time. The problem of detecting collisions in the current frame is solved first.

A direct approach to computing all possible pairs of intersecting rectangles is to test each pair of rectangles for such an intersection. Such a brute-force approach will take O(n²) time to compute all such pairs. When the number of vehicles at any time instant is small (say, < 10), such a brute-force method should be sufficient, and there should be no need to use more sophisticated algorithms. The actual test performed to detect whether two oriented rectangles intersect uses a general polygon-clipping algorithm. This algorithm proceeds by using one of the two rectangles as a clipper and the other as the target rectangle. The edges of the clipper rectangle are assumed to be oriented in a clockwise direction. Each directed edge of the clipper rectangle is used in turn to cut the target rectangle into two pieces, retaining at each step the piece that is to the right of the clipping edge, until all the clipper edges are exhausted. A collision exists if the target rectangle has nonzero area left at the end of the clipping process. The current implementation of this brute-force algorithm in the collision-detection module makes use of the polygon-clipping algorithm found in the vision-something-libraries (VXL).

The brute-force method described above makes an implicit assumption that each rectangle is equally likely to intersect with every other rectangle. But in the problem domain under consideration, each rectangle represents an actual vehicle in the real world and hence moves in some predictable pattern. In order to reduce the number of polygon-clipping computations, the entire plane on which the vehicles move is divided into a grid of m polygonal cells. The preprocessing step consists of assigning each of the input rectangles (vehicles) to one or more of the m polygonal cells, depending on whether they are fully inside or across one or more of these cells. Since there are m cells and n rectangles, the assignment will require a worst-case overhead of O(mn) polygon-rectangle intersection computations. After the preprocessing, the actual rectangle-rectangle intersections are computed using the brute-force approach described earlier, but now doing the tests only among the rectangles within each cell. The expected processing time to compute all possible vehicle collisions is then O((n/m)² m) = O(n²/m). Thus, the total processing time for the current time step is O(nm + n²/m). Hence, the design of the grid-cell size and shape becomes important to achieve better running time than a brute-force method.
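The following sketch, offered as an illustration rather than the authors' VXL-based implementation, combines the two ingredients just described: a Sutherland-Hodgman-style polygon clipper as the pairwise rectangle-rectangle test (the rectangles assumed already inflated by δ/2 and given as counterclockwise 4 × 2 corner arrays), and a uniform square grid as the broad phase so that clipping is attempted only for rectangles that share a cell. The cell size is an illustrative assumption.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

def cross2(u, v):
    """z-component of the 2-D cross product."""
    return u[0] * v[1] - u[1] * v[0]

def clip(poly, a, b):
    """Keep the part of 'poly' to the left of the directed line a->b
    (the mirror image of the clockwise/right-side convention in the text)."""
    out = []
    for i in range(len(poly)):
        p, q = poly[i], poly[(i + 1) % len(poly)]
        sp = cross2(b - a, p - a) >= 0         # p on the kept side?
        sq = cross2(b - a, q - a) >= 0         # q on the kept side?
        if sp:
            out.append(p)
        if sp != sq:                           # segment crosses the clip line
            t = cross2(b - a, a - p) / cross2(b - a, q - p)
            out.append(p + t * (q - p))
    return out

def area(poly):
    """Shoelace area; zero for degenerate leftovers."""
    if len(poly) < 3:
        return 0.0
    pts = np.asarray(poly)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(x @ np.roll(y, -1) - y @ np.roll(x, -1))

def rects_collide(r1, r2):
    """Clip r2 by each edge of r1; nonzero leftover area means overlap."""
    poly = list(r2)
    for i in range(4):
        poly = clip(poly, r1[i], r1[(i + 1) % 4])
        if not poly:
            return False
    return area(poly) > 1e-9

def grid_collisions(rects, cell=10.0):
    """Broad phase: index rectangles by the grid cells their bounding
    boxes touch, then run the pairwise test only within each cell."""
    cells = defaultdict(set)
    for k, r in enumerate(rects):
        lo, hi = r.min(axis=0), r.max(axis=0)
        for cx in range(int(lo[0] // cell), int(hi[0] // cell) + 1):
            for cy in range(int(lo[1] // cell), int(hi[1] // cell) + 1):
                cells[(cx, cy)].add(k)
    hits = set()
    for members in cells.values():
        for i, j in combinations(sorted(members), 2):
            if (i, j) not in hits and rects_collide(rects[i], rects[j]):
                hits.add((i, j))
    return hits
```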

Fig. 5. Three major cases for the tangent-line intersections. Thicker line segments indicate edges whose length can be retrieved. The location of the wy vanishing
point and the direction of motion dx are indicated. In (a) and (c), all three dimensions can be recovered, while in (b), only two are recoverable.

Fig. 6. Rectangles extruded in time. (a) A w × l rectangle at position p0, moving for L time units with velocity v (reaching point pL); and (b) two overlapping parallelepipeds with labeled edges; the vector c connects the centroids of the polytopes.

The goal of such a design should be that the vehicles get distributed uniformly across the cells. This depends on knowledge of the vehicle flow pattern, the density, and their variation with respect to the time of day. One simple way out is to allow the end user, who has knowledge of the traffic-flow patterns, to draw these grids for a specific traffic intersection.

A natural extension of the above fixed grid-based approach is to make the grid adaptive to the conditions of vehicular traffic flow. The goal is to maintain the invariant that the maximum occupancy of any grid cell is given by a predetermined number p. The grid cell is recursively divided into four subcells whenever the above invariant is violated. Vehicles can be added to a grid cell so long as its occupancy remains less than p. When the (p + 1)st vehicle is added to a grid cell, that cell is recursively divided into four subcells and all the p + 1 vehicles are reassigned to the new child cells until the invariant is satisfied.

The above design can be represented as a Quad Tree, with the root of the tree representing a bounding rectangular plane that will contain any given rectangle (vehicle). Each leaf of the tree will have a maximum of p vehicles. In addition to the data required to represent a tree, each node v of the Quad Tree will hold the rectangular quadrant R(v) that it represents. Each leaf of the Quad Tree will additionally hold pointers to the list of rectangles it contains (a maximum of p). Finally, at this stage, we experimented with the popular interval-based approach. Intersection of convex polygons is one of the well-studied problems in computational geometry. The intersection of oriented rectangles is a special case of this problem. It has been shown that the intersection of simple polygons is linear-time transformable to the line-segment intersection testing problem. To find pairs of intersecting rectangles in a plane, the problem is converted into a line-segment overlap problem.

All of the rectangle-intersection algorithms described above are used to compute collisions at a particular time instance. Collision prediction, however, requires the computation of all vehicle pairs that could possibly collide within the next f time steps, provided the predicted vehicle positions are available. While these brute-force or space-based algorithms would suffice for computing collisions in the current time frame, they do not scale readily for solving the following problem: Given the position, orientation, and size, at each time step, of n oriented rectangles in 2-D, find all possible pairs of rectangles that could intersect in f future time steps. In the existing implementation, this is done by using positions predicted by a Kalman filter and repeating the collision-detection process once for each of the f time steps, thus increasing the running time by a factor of f. Algorithms like the interval-based approach can be used instead to exploit the spatial proximity of a vehicle's predicted positions across the time frame and update the data structure incrementally, instead of recomputing the entire data structure.

We then moved to more complex representations and again tried to solve the collision problem. The number of vehicles that can be present at a traffic intersection is fairly limited, so the lower time complexity of advanced collision-detection methods, such as those outlined in [12], does not translate to better run-time performance because of the overhead imposed by preprocessing steps and the use of advanced data structures. Thus, we opted to test all vehicle pairs in a scene for possible collision over a time interval and focused on maximizing the performance of the individual tests.

Our method is based on the idea of extruding the bases of vehicles along a time axis. The extrusion of a rectangle moving from a point p0 to the point pL over L time units is a parallelepiped like the one shown in Fig. 6(a). A collision occurs if and only if the parallelepipeds representing two vehicles overlap. Fig. 6(b) illustrates this and introduces our notation for the following discussion.

Fig. 7. Example illustrating (1). In this particular case, the objects are separated by the axis.

Two convex polytopes are disjoint if there exists an axis on which their projections are disjoint. The Separating Axis Theorem [9] establishes that it is sufficient to test for overlaps on axes that are perpendicular to a face of either polytope or perpendicular to edges from both polytopes. The 15 axes we need to consider are defined by the pairwise cross products of the vectors u1, u2, u3, v1, v2, and v3 in Fig. 6(b). The six pairwise cross products of the vectors u1, u2, v1, and v2 are parallel to the time axis, which eliminates them from further consideration, since objects always overlap on the time axis. An axis a chosen from one of the remaining nine axes separates the two polytopes if the following condition holds:

    2|c · a| > |u1 · a| + |u2 · a| + |u3 · a| + |v1 · a| + |v2 · a| + |v3 · a|.   (1)

The vector c in (1) connects the centers of the two polytopes, as shown in Fig. 6(b). The constant factor of 2 on the left-hand side of (1) accounts for the fact that each polytope's projection on the axis a is symmetric around the projection of its center. Fig. 7 illustrates (1) for an equivalent 2-D example. As soon as an axis is found to separate the two polytopes, we can be sure that no collision between the respective vehicles is possible.

It also turns out that terms in (1) that involve the same three vectors produce the same values, as shown in Table I. This lets us eliminate four more axes from consideration and allows us to reuse some of the intermediate results (the terms indicated by individual letters in Table I) of one axis-overlap test in subsequent tests. Our final implementation of the test requires a total of 44 multiplications, 35 additions/subtractions, and five comparisons. The resulting method for detecting possible collisions over a time interval has an almost negligible impact on our system's running time and requires no extra storage or complicated data structures.

TABLE I
SIMILAR TERMS IN THE SEPARATING-AXIS TEST
All possible values for individual terms in (1). Identical letters indicate identical values for the absolute value of the dot product between the vectors indicated in the first column and the vectors indicated in the first row. For example, from the table, it can be seen that |(u1 × u3) · u2| = |(u2 × v3) · u1| (both represented by an A in the appropriate table cells).

V. RESULTS

Initially, we experimented with bounding rectangles. The collision-detection module contains an implementation of all the aforementioned algorithms, with the option of choosing one at runtime. The choice of algorithm primarily depends on the number of vehicles n at any given instant. For n < 20, a brute-force approach suffices, and the module automatically chooses the brute-force algorithm in that regime. When the number of vehicles becomes greater than 20, the space-based approach is employed. The results for such a sparse traffic intersection are shown in Table II.

TABLE II
COMPUTING THE NUMBER OF NEAR-MISSES IN A SPARSE TRAFFIC INTERSECTION MONITORING 60 VEHICLES OVER 4000 FRAMES FOR DIFFERENT VALUES OF INTERVEHICULAR DISTANCE d

We then implemented the more complex representations involving bounding boxes. Our system performs in real time on current-generation hardware.2 The output of the system is shown in Fig. 8: target bounding boxes are overlaid on the input video to allow for the visual inspection of the tracking results. The foreground mask for each frame is also shown, in order to highlight the behavior of the system with regard to dynamic and static occlusions. No collision warning is issued for the two occluding vehicles, since their bounding boxes are separated.

2 34.6 frames/s on a 2.66-GHz Pentium IV system.
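As a self-contained illustration of the extruded-box test described above, the sketch below sweeps each vehicle's w × l base rectangle along the time axis (Fig. 6) and evaluates (1) on the nine surviving axes; it omits the term-reuse optimization of Table I that brings the operation count down to 44 multiplications, and all vehicle parameters in the example are hypothetical.

```python
import numpy as np

def extrude(center, heading, width, length, vel, horizon):
    """Edge vectors of the parallelepiped swept in (x, y, t) space by a
    w x l base rectangle moving at constant velocity for 'horizon' time
    units, as in Fig. 6(a)."""
    c, s = np.cos(heading), np.sin(heading)
    u1 = np.array([length * c, length * s, 0.0])   # along the motion direction
    u2 = np.array([-width * s, width * c, 0.0])    # across the vehicle
    u3 = np.array([vel[0] * horizon, vel[1] * horizon, horizon])  # time edge
    mid = np.array([center[0], center[1], 0.0]) + 0.5 * u3  # polytope centroid
    return mid, (u1, u2, u3)

def may_collide(box_a, box_b):
    """Evaluate (1) on the nine axes that are not parallel to the time axis;
    if no axis separates the polytopes, a collision within the horizon is
    reported."""
    mid_a, (u1, u2, u3) = box_a
    mid_b, (v1, v2, v3) = box_b
    c = mid_b - mid_a                              # connects the two centroids
    pairs = [(u1, u3), (u2, u3), (v1, v3), (v2, v3),
             (u1, v3), (u2, v3), (u3, v1), (u3, v2), (u3, v3)]
    for p, q in pairs:
        a = np.cross(p, q)
        if np.linalg.norm(a) < 1e-12:              # parallel edges: skip axis
            continue
        if 2.0 * abs(c @ a) > sum(abs(e @ a) for e in (u1, u2, u3, v1, v2, v3)):
            return False                           # separating axis found
    return True

# Hypothetical example: two vehicles on crossing paths over a 3-s horizon.
a = extrude(center=(0.0, -10.0), heading=np.pi / 2, width=1.8, length=4.5,
            vel=(0.0, 8.0), horizon=3.0)
b = extrude(center=(-12.0, 0.0), heading=0.0, width=1.8, length=4.5,
            vel=(9.0, 0.0), horizon=3.0)
print(may_collide(a, b))   # True: the paths cross at nearly the same time
```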

We manually counted the number of cars in 86 regularly spaced video frames. Of the 273 vehicles found, 231 (85%) were correctly identified by our system, 19 (7%) were not identified at all, and 23 (8%) were identified but merged with other targets in the scene. We then manually fit bounding boxes around 15 different vehicles for several consecutive frames, which gave us 292 individual boxes. The root-mean-squared error in the estimated quantities, as well as the average relative error for the three reported vehicle dimensions, are tabulated in Table III.

Fig. 8. Tracking sequence. Two regions are merged in the second and third frame (dynamic occlusion). Another region is split in the third frame (static occlusion). Video of the output is available at http://www.cs.umn.edu/research/airvl/its/.

TABLE III
POSITION- AND DIMENSION-MEASUREMENT ACCURACY
Root-mean-squared errors are reported in centimeters, while average relative errors are reported as percentages. (a) Results from all 15 vehicles; (b) excluding a turning vehicle whose long shadow caused severe errors in the width estimate (18 of 292 samples removed).

The indicated accuracy of our system is significant, given the resolution at which the measurements were taken, and considering that the reprojection error of our camera calibration was as large as 40 cm in the central regions of the image.

The correctness of the collision-prediction algorithm was evaluated using data from real intersections (at Union and Washington in Minneapolis, MN, and at Rice and University in St. Paul, MN). Since we had no real crashes to work on, we focused on the number of near misses. The bounding-boxes method performed significantly better than the methods where bounding rectangles were used, due to the issues with occlusions and the assumption of known width, which distorts the near-miss counts through underestimation of the actual vehicle dimensions.

VI. CONCLUSION

A. Summary

We presented a vision-based system for monitoring traffic intersections that issues warnings about imminent collisions. The sensors used comprise one or more fixed video cameras, calibrated in advance of the system's operation. Instabilities in the input video due to camera shaking, sudden illumination changes, and weather conditions (e.g., light rain) are handled automatically by the low-level vision components of the system. Accurate velocity, position, and size information in real-world units was obtained for each vehicle in the scene by the high-level vision components of the system. The collision-prediction module makes use of these data to predict vehicle trajectories and report possible collisions. The data generated by the system are also suitable for vehicle-classification purposes and for the categorization of the severity of collisions. The system is robust to temporary static and dynamic occlusions, and is capable of handling the stop-and-go situations that occur in the setting of a traffic intersection. The proposed algorithms were implemented on a general-purpose computer system without specialized vision-processing hardware. We were able to achieve real-time performance (greater than a 30-Hz sampling rate) on videos of sufficient resolution.

B. Future Work

The presence of shadows cast by moving objects poses an interesting challenge for future research. Such shadows exacerbate occlusion problems and degrade the quality of the target outlines recovered by connected-region extraction. The detection and elimination of shadows is an area of active research, but it remains to be seen whether a method capable of meeting the stringent real-time performance and quality requirements of our system will be found.

Another area of interest for further research is the extension of the camera-calibration techniques we use to allow for tracking of objects on nonplanar road surfaces. This amounts to developing supervised algorithms for the recovery of road surfaces using common geometric primitives visible in a still image of a traffic intersection.

Finally, we would like to investigate the benefit of using more descriptive vehicle-motion models in order to improve the position and velocity estimates of tracked vehicles. The noise in image-space measurements precludes the use of high-order filters, but a mixture of several high-bias filters combined using a switching Kalman filter may further improve the quality of tracking results.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their valuable comments.

REFERENCES

[1] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174-188, Feb. 2002.
[2] S. Atev, O. Masoud, R. Janardan, and N. Papanikolopoulos, "A collision prediction system for traffic intersections," in Proc. IEEE/RSJ Conf. Intelligent Robots and Systems (IROS), Edmonton, AB, Canada, 2005.
[3] S. Atev, O. Masoud, and N. Papanikolopoulos, "Practical mixtures of Gaussians with brightness monitoring," in Proc. IEEE Conf. Intelligent Transportation Systems (ITSC), Washington, DC, 2004, pp. 423-428.

[4] Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association. New York: Academic, 1987.
[5] A. Bevilacqua, "Effective object segmentation in a traffic monitoring application," in Proc. 3rd Int. Association Pattern Recognition (IAPR) Indian Conf. Computer Vision, Graphics and Image Processing, Ahmedabad, India, 2002, pp. 125-130.
[6] B. Coifman, D. Beymer, P. McLauchlan, and J. Malik, "A real-time computer vision system for vehicle tracking and traffic surveillance," J. Transp. Res., Part C, vol. 6, no. 4, pp. 271-288, 1998.
[7] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 5, pp. 564-577, May 2003.
[8] B. Ferlis, "Analysis of infrastructure-based system concepts: Intersection collision avoidance problem area," 1999. Unpublished FHWA document.
[9] S. Gottschalk, M. C. Lin, and D. Manocha, "OBB-Tree: A hierarchical structure for rapid interference detection," Comput. Graph., vol. 30, no. 3, pp. 171-180, 1996.
[10] J. Kato, T. Watanabe, S. Joga, J. Rittscher, and A. Blake, "An HMM-based segmentation method for traffic monitoring movies," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 9, pp. 1291-1296, Sep. 2002.
[11] D. Koller et al., "Towards robust automatic traffic scene analysis in real-time," in Proc. 12th Int. Conf. Pattern Recognition (ICPR), Jerusalem, Israel, 1994, pp. 126-131.
[12] M. Lin and S. Gottschalk, "Collision detection between geometric models: A survey," in Proc. Institute Mathematics and Applications (IMA) Conf. Mathematics Surfaces, Birmingham, U.K., 1998, pp. 37-56.
[13] S. Lu, D. Metaxas, D. Samaras, and J. Oliensis, "Using multiple cues for hand tracking and model refinement," in Proc. Computer Vision and Pattern Recognition Conf., Madison, WI, 2003, pp. 443-450.
[14] S. Kamijo, Y. Matsushita, K. Ikeuchi, and M. Sakauchi, "Occlusion robust tracking utilizing spatio-temporal Markov random field model," in Int. Conf. Pattern Recognition, Barcelona, Spain, 2000, vol. 1, pp. 140-144.
[15] P. Mahalanobis, "On the generalized distance in statistics," in Proc. Nat. Institute Science India, Bhavnagar, India, 1936, vol. 12, pp. 49-55.
[16] O. Masoud and N. Papanikolopoulos, "Using geometric primitives to calibrate traffic scenes," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS), Sendai, Japan, 2004, pp. 1878-1883.
[17] R. Van der Merwe, J. de Freitas, A. Doucet, and E. Wan, "The unscented particle filter," in Advances Neural Information Processing Systems, Denver, CO, 2000, pp. 584-590.
[18] T. Penney, Intersection Collision Warning System, 1999. Pub. No. FHWA-RD-99-103.
[19] P. Pérez, J. Vermaak, and A. Blake, "Data fusion for visual tracking with particles," Proc. IEEE, vol. 92, no. 3, pp. 495-513, Mar. 2004.
[20] W. Power and J. Schoones, "Understanding background mixture models for foreground segmentation," in Proc. Imaging and Vision Computing New Zealand (IVCNZ), Auckland, New Zealand, 2002, pp. 267-271.
[21] H. Preston, R. Storm, M. Donath, and C. Shankwitz, "Review of Minnesota's rural intersection crashes: Methodology for identifying intersections for Intersection Decision Support (IDS)," Minnesota Dept. Transp., St. Paul, MN, Tech. Rep. MN/RC-2004-31, 2004.
[22] C. Rasmussen and G. Hager, "Probabilistic data association methods for tracking complex visual objects," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, pp. 560-576, Jun. 2001.
[23] J. Sherrah and S. Gong, "Fusion of perceptual cues using covariance estimation," in Proc. British Machine Vision Conf., Nottingham, U.K., 1999, pp. 564-573.
[24] C. Stauffer and W. Grimson, "Adaptive background mixture models for real time tracking," in Proc. Computer Vision and Pattern Recognition (CVPR), Fort Collins, CO, 1999, p. 252.
[25] Y. Wu and T. Huang, "Robust visual tracking by integrating multiple cues based on co-inference learning," Int. J. Comput. Vis., vol. 58, no. 1, pp. 55-71, 2004.

Stefan Atev received the B.A. degree in computer science and mathematics from Luther College, Decorah, IA, in 2003. He is working towards the Ph.D. degree in computer science at the University of Minnesota, Minneapolis.
He has research interests in real-time computer-vision systems, video surveillance, and image processing.

Hemanth Arumugam received the M.S. degree in computer science from the Department of Computer Science at the University of Minnesota, Minneapolis, in 2004.
He has research interests in computational geometry and its applications to computer vision and bioinformatics.

Osama Masoud received the B.S. and M.Sc. degrees in computer science from King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia, in 1992 and 1994, respectively, and the Ph.D. degree in computer science from the University of Minnesota, Minneapolis, in 2000.
He is currently a Research Associate at the Department of Computer Science and Engineering at the University of Minnesota. In the past, he was a Postdoctoral Associate at the same department and served as the Director of Research and Development at Point Cloud Inc., Plymouth, MN. His research interests include computer vision, robotics, transportation applications, and computer graphics.
Mr. Masoud is the recipient of a Research Contribution Award from the University of Minnesota, the Rosemount Instrumentation Award from Rosemount Inc., and the Matt Huber Award for Excellence in Transportation Research. One of his papers (coauthored by N. P. Papanikolopoulos) was awarded the IEEE VTS 2001 Best Land Transportation Paper Award.

Ravi Janardan (M'00-SM'01) received the Ph.D. degree in computer science from Purdue University, West Lafayette, IN, in 1987.
He is Professor of Computer Science and Engineering at the University of Minnesota-Twin Cities. His research interests are in the design and analysis of geometric algorithms and data structures, and their application to problems in a variety of areas, including computer-aided design and manufacturing, transportation, very-large-scale-integration (VLSI) design, bioinformatics, and computer graphics. He has published extensively in these areas.

Nikolaos P. Papanikolopoulos (S'88-M'93-SM'01) was born in Piraeus, Greece, in 1964. He received the Dipl.Ing. degree in electrical and computer engineering from the National Technical University of Athens, Athens, Greece, in 1987, the M.S.E.E. degree in electrical engineering from Carnegie Mellon University (CMU), Pittsburgh, PA, in 1988, and the Ph.D. degree in electrical and computer engineering from CMU in 1992.
Currently, he is a Professor in the Department of Computer Science at the University of Minnesota, Minneapolis, and the Director of the Center for Distributed Robotics. He was a McKnight Land-Grant Professor at the University of Minnesota for the period 1995-1997. He was the recipient of the Kritski Fellowship in 1986 and 1987. His research interests include robotics, sensors for transportation applications, control, and computer vision. He has authored or coauthored more than 190 journal and conference papers in the abovementioned areas (47 refereed journal papers).
Dr. Papanikolopoulos was a finalist for the Anton Philips Award for Best Student Paper in the 1991 IEEE International Conference on Robotics and Automation, the recipient of the NSF Research Initiation and Early Career Development Awards and the Faculty Creativity Award at the University of Minnesota in 1995-1997, and the recipient of the Best Video Award in the 2000 IEEE International Conference on Robotics and Automation. One of his papers (coauthored by O. Masoud) was awarded the IEEE VTS 2001 Best Land Transportation Paper Award. Finally, he received grants from DARPA, DHS, Sandia National Laboratories, NSF, Microsoft, INEEL, U.S. Army, U.S. Air Force, USDOT, MN/DOT, Honeywell, and 3M.
