
Optimal Visual Sensor Network Configuration

Jian Zhao (a), Sen-ching S. Cheung (a), Thinh Nguyen (b)

(a) Center for Visualization and Virtual Environments, University of Kentucky
(b) School of Electrical Engineering and Computer Science, Oregon State University
Abstract
Wide-area visual sensor networks are becoming more and more common. They have a wide range of commercial and military applications, from video surveillance to smart home and from traffic monitoring to anti-terrorism. The design of such a visual sensor network is a challenging problem due to the complexity of the environment, self and mutual occlusion of moving objects, diverse sensor properties and a myriad of performance metrics for different applications. As such, there is a need to develop a flexible sensor-planning framework that can incorporate all the aforementioned modeling details, and derive the sensor configuration that simultaneously optimizes the target performance and minimizes the cost. In this chapter, we tackle this optimal sensor problem by developing a general visibility model for visual sensor networks and solving the optimization problem via Binary Integer Programming (BIP). Our proposed visibility model supports arbitrary-shaped 3D environments and incorporates realistic camera models, occupant traffic models, self occlusion and mutual occlusion. Using this visibility model, two novel BIP algorithms are proposed to find the optimal camera placement for tracking visual tags in multiple cameras. Furthermore, a greedy implementation is proposed to cope with the complexity of
BIP. Extensive performance analysis is performed using Monte-Carlo simulations, virtual environment simulations and real-world experiments.
Key words: sensor placement, smart camera network, visual tags, binary integer
programming
1 Introduction
In recent years we have seen widespread deployment of smart camera networks for a variety of applications. Proper placement of cameras in such a distributed environment is an important design problem. Not only does it determine the coverage of the surveillance, it also has a direct impact on the appearance of objects in the cameras, which dictates the performance of all subsequent computer vision tasks. For instance, one of the most important tasks in a distributed camera network is to identify and track common objects across disparate camera views. It is a difficult problem because image features like corners, scale-invariant feature transform (SIFT) features, contours, or color histograms may vary significantly between different camera views due to disparity, occlusions and variation in illumination. One possible solution is to utilize semantically rich visual features based either on intrinsic characteristics such as faces or gaits, or artificial marks like jersey numbers or special-colored tags. We call the problem of identifying distinctive visual features on an object the Visual Tagging problem.

An early version of this work has appeared in the IEEE Journal of Selected Topics in Signal Processing, Special Issue on Distributed Processing in Vision Networks, Volume 2, Number 4, under the title "Optimal Camera Network Configurations for Visual Tagging" by the same authors.
To properly design a camera network that can accurately identify and understand visual tags, one needs a visual sensor planning tool: a tool that analyzes the physical environment and determines the optimal configuration for the visual sensors so as to achieve specific objectives under a given set of resource constraints. Determining the optimal sensor configuration for a large-scale visual sensor network is technically a very challenging problem. First, visual line-of-sight sensors are susceptible to occlusion by both static and dynamic objects. This is particularly problematic as these networks are typically deployed in urban or indoor environments characterized by complicated topologies, stringent placement constraints and a constant flux of occupant or vehicular traffic. Second, from infra-red to range sensing, from static to pan-tilt-zoom or even robotic cameras, there is a myriad of visual sensors, and many of them have overlapping capabilities. Given a fixed budget with limited power and network connectivity, the choice and placement of sensors become critical to the continuous operation of the visual sensor network. Third, the performance of the network depends highly on the nature of the specific tasks in the application. For example, biometric and object recognition require the objects to be captured at a specific pose; triangulation requires visibility of the same object from multiple sensors; object tracking can tolerate a certain degree of occlusion using a probabilistic tracker.

As such, there is a need to develop a flexible sensor-planning framework that can incorporate all the aforementioned modeling details, and derive the sensor configuration that simultaneously optimizes the target performance and minimizes the cost. Such a tool can allow us to scientifically determine the number of sensors, their positions, their orientations, and the expected outcome before embarking on the actual construction of a costly visual sensor network project.
In this chapter, we propose a novel integer-programming-based framework for determining the optimal visual sensor configuration for 3D environments. Our primary focus is on optimizing the performance of the network for visual tagging. To allow maximum flexibility, we do not impose a particular method for tag detection and simply model it as a generic visual detector. Furthermore, our framework allows users the flexibility to determine the number of views in which the tag needs to be observed, so that a wide variety of applications can be simulated.
This chapter is organized as follows. After reviewing state-of-the-art visual sensor placement techniques in Section 2, we discuss in Section 3 how the performance of a sensor configuration can be measured using a general visibility model. In Section 4, we adapt the general model to the visual tagging problem using the probability of observing a tag from multiple visual sensors. Using this refined model, we formulate in Section 5 the search for the optimal sensor placement as two Binary Integer Programming (BIP) problems: the first formulation, MIN_CAM, focuses on minimizing the number of sensors for a target performance level, and the second one, FIX_CAM, maximizes the performance for a fixed number of sensors. Due to the computational complexity of BIP, we also present a greedy approximation algorithm called GREEDY. Experimental results on these algorithms using both simulations and camera network experiments are presented in Section 6. We conclude the chapter by discussing future work in Section 7.
2 Related work
The problem of finding the optimal camera placement has been studied for a long time. The earliest investigation can be traced back to the art gallery problem in computational geometry. This problem is the theoretical study of how to place cameras in an arbitrary-shaped polygon so as to cover the entire area [1-3]. Although Chvátal has shown in [4] that the upper bound on the number of cameras is ⌊n/3⌋, determining the minimum number of cameras turns out to be an NP-complete problem [5]. While the theoretical difficulties of the camera placement problem are well understood and many approximate solutions have been proposed, few of them can be directly applied to realistic computer vision problems. Camera placement has also been studied in the field of photogrammetry for building accurate 3D models. Various metrics such as visual hull [6] and viewpoint entropy [7] have been developed, and optimization is realized by various types of ad-hoc search and heuristics [8]. These techniques assume very dense placement of cameras and are not applicable to wide-area, wide-baseline camera networks.

Recently, Ramakrishnan et al. propose a framework to study the performance of sensor coverage in wide-area sensor networks [9]. Unlike previous techniques, their approach takes into account the orientation of the object. They develop a metric to compute the probability of observing an object of random orientation from one sensor, and use it to recursively compute the performance for multiple sensors. While their approach can be used to study the performance of a fixed number of cameras, it is not obvious how to extend their scheme to find the optimal number of cameras, or how to incorporate other constraints such as visibility from more than one camera. More sophisticated
modeling pertinent to visual sensor networks has recently been proposed in [10-12]. The sophistication in their visibility models comes at a high computational cost for the optimization. For example, the simulated annealing scheme used in [11] takes several hours to find the optimal placement of four cameras in a room. Other optimization schemes such as hill climbing [10], semi-definite programming [12] and evolutionary approaches [13] all prove to be computationally intensive and prone to local minima.
Alternatively, the optimization can be tackled in the discrete domain. Horster and Lienhart develop a flexible camera placement model by discretizing the space into a grid and denoting the possible placement of a camera as a binary variable over each grid point [14]. The optimal camera configuration is formulated as an integer linear programming problem which can incorporate different constraints and cost functions pertinent to a particular application. Similar ideas were also proposed in [15-17]. While our approach follows a similar optimization strategy, we develop a more realistic visibility model to capture the uncertainty of object orientation and mutual occlusion in 3D environments. Unlike [14], in which the field of view of a camera is modeled as a 2-D fixed-size triangle, ours is based on measuring the image size of the object as observed by a pinhole camera with arbitrary 3-D location and pose. Our motivation is based on the fact that the image size of the object is the key to the success of any appearance-based object identification scheme. While the optimization scheme described in [14] can theoretically be used for triangulating objects, their results, as well as others', are limited to maximizing sensor coverage. Our result, on the other hand, directly tackles the problem of visual tagging, in which each object needs to be visible by two or more cameras. Furthermore, while the BIP formulation can avoid the local minima problem, its complexity
remains NP-complete [18, ch. 8]. As a consequence, these schemes again have difficulties in scaling up to large sensor networks.
3 General Visibility Model
Consider the 3-D environment depicted in Figure 1. Our goal in this section is to develop a general model to compute the visibility of a single tag centered at P in such an environment. We assume that the 3-D environment has vertical walls with piecewise-linear contours. Obstacles are modeled as columns of finite height with polyhedral cross sections. Whether the actual tag is the face of a subject or an artificial object, it is reasonable to model each tag as a small flat surface perpendicular to the ground plane. We further assume that all the tags are of the same square shape with known edge length 2w. Without any specific knowledge of the height of individuals, we assume that the centers of all the tags lie on the same plane parallel to the ground plane. This assumption does not hold in the real world, as individuals are of different heights. Nevertheless, as we will demonstrate in Section 6.1, such height variation does not significantly affect the overall visibility measurements, while the assumption reduces the complexity of our model. While our model restricts the tags to be on the same plane, we place no restriction on the 3-D positions, yaw and pitch angles of the cameras in the visual sensor network.
Given the number of cameras and their placement in the environment, we define the visibility V of a tag using an aggregate measure of the projected size of the tag on the image planes of the different cameras. The projected size of the tag is very important, as the image of the tag has to be large enough to be automatically identified at each camera view. Due to the camera projection of the 3-D world onto the image plane, the image of the square tag can be an arbitrary quadrilateral. While it is possible to precisely calculate the area of this image, it is sufficient to use an approximation for our visibility calculation. Thus, we measure the projected length of the line segment l at the intersection between the tag and the horizontal plane Π. The actual 3-D length of l is 2w, and since the center of the tag always lies on l, the projected length of l is representative of the overall projected size of the tag.

Fig. 1. Three-dimensional visibility model of a tag with orientation v_P; Π is the plane that contains the tag center P. Mutual occlusion is modeled by a maximum blocking angle β and its location φ_s. K describes the obstacles and wall boundaries. Cameras of arbitrary yaw and pitch angles can be placed anywhere in the 3-D environment.
Next we identify the set of random and fixed parameters that affect V. The fact that we have chosen to measure the projected length of l instead of the projected area of the tag greatly simplifies the parametrization of V. Given a camera network, the visibility function of a tag can be parameterized as V(P, v_P, φ_s | w, K), where P, v_P and φ_s are random parameters about the tag, and K and w are fixed environmental parameters. P defines the 2D coordinates of the center of the tag on the plane Π. v_P is the pose vector of the tag. As we assume the tag is perpendicular to the ground plane, the pose vector v_P lies on the Π plane and has a single degree of freedom: the orientation angle with respect to a reference direction. Note that the dependency of V on v_P allows us to model self-occlusion, that is, the tag being occluded by the person who is wearing it. The tag will not be visible to a camera if the pose vector is pointing away from the camera.

While self occlusion can be succinctly captured by a single pose vector, the precise modeling of mutual occlusion can be very complicated: it involves the number of neighboring objects, their distances to the tag, and the positions and orientations of the cameras. In our model, we choose the worst-case approach by considering a fixed-size occlusion angle β at a random position measured from the center of the tag on the Π plane. Mutual occlusion is said to occur if the projection of the line of sight on the Π plane falls within the range of the occlusion angle. In other words, we model the occlusion as a cylindrical wall of infinite height around the tag, partially blocking a fixed visibility angle of β at a random starting position φ_s. w is half of the edge length of the tag, which is a known parameter. The shape of the environment is encapsulated in the fixed parameter set K, which contains a list of oriented vertical planes that describe the boundary wall and obstacles of finite height. It is straightforward to use K to compute whether there is a direct line of sight between an arbitrary point in the environment and a camera. The specific visibility function suitable for visual tagging will be described in Section 4.
To correctly identify and track any visual tag, a typical classification algorithm requires the tag size on the image to be larger than a certain minimum size, though a larger projected size usually does not make much difference. For example, a color-tag detector needs a threshold to differentiate the tag from noise, and a face detector needs a face image large enough to observe the facial features. On the other hand, the information gain does not increase as the projected object size increases beyond a certain value. Therefore, a thresholded version represents our problem much better than the absolute image size. Assuming that this minimum threshold on image size is T pixels, this requirement can be modeled by binarizing the visibility function as follows:

V_b(P, v_P, φ_s | w, K, T) = 1 if V(P, v_P, φ_s | w, K) > T, and 0 otherwise.   (1)
Finally, we define μ, the mean visibility, to be the metric for measuring the average visibility of P over the entire parameter space:

μ = ∫ V_b(P, v_P, φ_s | w, K, T) f(P, v_P, φ_s) dP dv_P dφ_s   (2)

where f(P, v_P, φ_s) is the prior distribution that can incorporate prior knowledge about the environment; for example, if an application is interested in locating faces, the likelihood of the head positions and poses is affected by furnishings and attractions such as television sets and paintings. Except for the most straightforward environments, such as a single camera in a convex environment discussed in [19], Equation (2) does not admit a closed-form solution. Nevertheless, it can be estimated using standard Monte-Carlo sampling and its many variants.
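As a concrete illustration, the following Python sketch estimates μ by drawing samples from the prior f(P, v_P, φ_s). The sampler and the binary visibility routine are placeholders that would have to be supplied by the environment model described above; the uniform prior at the end and all numeric values are illustrative assumptions, not parameters from this chapter.

```python
import math
import random

def estimate_mean_visibility(visibility_b, sample_prior, num_samples=100000):
    """Monte-Carlo estimate of the mean visibility mu in Equation (2).

    visibility_b -- callable returning 0 or 1 for a sampled tag state
                    (P, v_P, phi_s), i.e. the thresholded V_b.
    sample_prior -- callable drawing one (P, v_P, phi_s) from f(P, v_P, phi_s).
    """
    hits = 0
    for _ in range(num_samples):
        P, v_P, phi_s = sample_prior()
        hits += visibility_b(P, v_P, phi_s)
    return hits / num_samples

# Hypothetical uniform prior over a 10 m x 10 m room:
def uniform_prior():
    P = (random.uniform(0.0, 10.0), random.uniform(0.0, 10.0))  # tag center
    v_P = random.uniform(0.0, 2.0 * math.pi)                    # tag orientation
    phi_s = random.uniform(0.0, 2.0 * math.pi)                  # occlusion start
    return P, v_P, phi_s
```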
4 Visibility model for visual tagging
In this section, we present a visibility model for the visual tagging problem. This model is a specialization of the general model in Section 3. The goal is to design a visibility function V(P, v_P, φ_s | w, K) that can measure the performance of a camera network in capturing a tag in multiple camera views. We first present the geometry for the visibility from one camera and then show a simple extension to create V(P, v_P, φ_s | w, K) for an arbitrary number of cameras.
Given a single camera with camera center at C, it is straightforward to see that a tag at P is visible at C if and only if the following four conditions hold:

(1) The tag is not occluded by any obstacle or wall. (Environmental Occlusion)
(2) The tag is within the camera's field of view. (Field of View)
(3) The tag is not occluded by the person wearing it. (Self-Occlusion)
(4) The tag is not occluded by other moving objects. (Mutual Occlusion)

Thus, we define the visibility function for one camera to be the projected length ||l'|| on the image plane of the line segment l across the tag if the above conditions are satisfied, and zero otherwise.
Figure 2 shows the projection of l, delimited by P_l1 and P_l2, onto the image plane Π. Based on the assumptions that all the tag centers have the same elevation and all tag planes are vertical, we can analytically derive the formulae for P'_l1, P'_l2 as

P'_li = C + (⟨v_C, O - C⟩ / ⟨v_C, P_li - C⟩) (P_li - C)   (3)

where ⟨·, ·⟩ denotes the inner product. The projected length ||l'|| is simply ||P'_l1 - P'_l2||.

Fig. 2. Projection of a single tag onto a camera.
After computing the projected length of the tag, we proceed to check the four visibility conditions as follows:

(1) Environmental Occlusion: We assume that environmental occlusion occurs if the line segment connecting the camera center C with the tag center P intersects some obstacle. While such an assumption does not take into account partial occlusion, it is adequate for most visual tagging applications where the tag is much smaller than its distance from the camera. We represent this requirement as the following binary function:

chkObstacle(P, C, K) = 1 if no obstacle intersects the line segment PC, and 0 otherwise.   (4)

Specifically, the obstacles are recorded in K as a set of oriented vertical planes that describe the boundary wall and obstacles of finite height. The intersection between the line of sight PC and each element in K is computed. If there is no intersection within the confined environment, or the points of intersection are higher than the height of the camera, no occlusion occurs due to the environment.
(2) Field of View: Similar to determining environmental occlusion, we declare the tag to be in the field of view if the image P' of the tag center lies within the finite image plane Π. Using a similar derivation as in (3), the image P' is computed as follows:

P' = C + (⟨v_C, O - C⟩ / ⟨v_C, P - C⟩) (P - C)   (5)

We then convert P' to local image coordinates to determine whether P' is indeed within Π. We encapsulate this condition in the binary function chkFOV(P, C, v_C, Π, O), which takes the camera intrinsic parameters, tag location and pose vector as input, and returns a binary value indicating whether the center of the tag is within the camera's field of view.
(3) Self Occlusion: As illustrated in Figure 2, the tag is self-occluded if the angle α between the line of sight to the camera, C - P, and the tag pose v_P exceeds π/2. We represent this condition as a step function U(π/2 - |α|).
(4) Mutual Occlusion: In Section 3, we model the worst-case occlusion using an angle β. As illustrated in Figure 2, mutual occlusion occurs when the tag center or half of the line segment l is occluded. The angle β is subtended at P on the Π plane. Thus, occlusion occurs if the projection of the line of sight C - P onto the Π plane at P falls within the range [φ_s, φ_s + β). We represent this condition using the binary function chkOcclusion(P, C, v_P, φ_s), which returns one for no occlusion and zero otherwise.
Combining ||l'|| with the four visibility conditions, we define the projected length of an oriented tag with respect to camera Γ as I(P, v_P, φ_s | w, K, Γ) as follows:

I(P, v_P, φ_s | w, K, Γ) = ||l'|| · chkObstacle(P, C, K) · chkFOV(P, C, v_C, Π, O) · U(π/2 - |α|) · chkOcclusion(P, C, v_P, φ_s)   (6)

where Γ includes all camera parameters, including Π, O and C. As stated in Section 3, a thresholded version is usually more convenient:

I_b(P, v_P, φ_s | w, K, Γ, T) = 1 if I(P, v_P, φ_s | w, K, Γ) > T, and 0 otherwise.   (7)
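To make these definitions concrete, the sketch below evaluates the thresholded single-camera visibility I_b of Equation (7) in a simplified setting: cameras and tags at the same elevation, a convex room with no interior obstacles (so chkObstacle is always one), and a 2-D pinhole model. The helper names, default focal length, pixel size and angles are illustrative assumptions rather than the implementation used in this chapter.

```python
import math

TWO_PI = 2.0 * math.pi

def _bearing(v):
    """Angle of a 2-D vector, in [0, 2*pi)."""
    return math.atan2(v[1], v[0]) % TWO_PI

def projected_length_px(P, v_P, C, v_C, f=0.008, pixel=5.6e-6, w=0.2):
    """Projected image length (in pixels) of the tag cross-section l.

    2-D pinhole simplification: the tag face is perpendicular to the pose
    angle v_P, so l runs orthogonal to v_P with half-length w (meters).
    """
    perp = (-math.sin(v_P), math.cos(v_P))
    axis = (math.cos(v_C), math.sin(v_C))        # optical axis direction
    side = (-axis[1], axis[0])                   # image-plane direction
    coords = []
    for s in (-1.0, 1.0):
        E = (P[0] + s * w * perp[0], P[1] + s * w * perp[1])
        d = (E[0] - C[0], E[1] - C[1])
        depth = d[0] * axis[0] + d[1] * axis[1]
        if depth <= 0.0:
            return 0.0                           # endpoint behind the camera
        coords.append(f * (d[0] * side[0] + d[1] * side[1]) / depth / pixel)
    return abs(coords[0] - coords[1])

def I_b(P, v_P, phi_s, C, v_C, half_fov=math.pi / 6, beta=math.pi / 4, T=5.0):
    """Thresholded single-camera visibility of Equation (7), assuming a convex
    room (chkObstacle == 1) and coplanar cameras and tags."""
    los = _bearing((C[0] - P[0], C[1] - P[1]))   # line of sight, tag -> camera
    if math.cos(los - v_P) <= 0.0:               # self-occlusion: U(pi/2 - |alpha|)
        return 0
    if (los - phi_s) % TWO_PI < beta:            # worst-case mutual occlusion
        return 0
    off_axis = abs((_bearing((P[0] - C[0], P[1] - C[1])) - v_C + math.pi) % TWO_PI - math.pi)
    if off_axis > half_fov:                      # chkFOV: tag center inside the image
        return 0
    return 1 if projected_length_px(P, v_P, C, v_C) > T else 0
```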
To extend the single-camera case to multiple cameras, we note that the visibility of the tag from one camera does not affect the others, and thus each camera can be treated independently. Assume that the specific application requires a tag to be visible by H or more cameras. The tag at a particular location and orientation is visible if the sum of the I_b(·) values from all the cameras is at least H at that location. In other words, given N cameras Γ_1, Γ_2, ..., Γ_N, we define the threshold visibility function V_b(P, v_P, φ_s | w, K, T) as

V_b(P, v_P, φ_s | w, K, T) = 1 if Σ_{i=1}^{N} I_b(P, v_P, φ_s | w, K, Γ_i, T) ≥ H, and 0 otherwise.   (8)

Using this definition of the visibility function, we can then compute the mean visibility μ as defined in (2) as a measure of the average likelihood of a random tag being observed by H or more cameras. While the specific value of H depends on the application, we will use H = 2 in the sequel for concreteness and without loss of generality.
5 Optimal Camera Placement
The goal of optimal camera placement is to identify, among all possible camera network configurations, the one that maximizes the visibility function given by Equation (8). As Equation (8) does not possess an analytic form, it is very difficult to apply conventional continuous optimization strategies such as variational techniques or convex programming. As such, we follow an approach similar to [14] by finding an approximate solution over a discretization of two spaces: the space of possible camera configurations and the space of tag locations and orientations. Section 5.1 describes the discretization of our parameter spaces. In Sections 5.2 and 5.3, we introduce two BIP formulations, targeting different cost functions, for computing optimal configurations over the discrete environment. A computationally efficient algorithm for solving BIP based on a greedy approach is presented in Section 5.4.
5.1 Discretization of Camera and Tag Spaces
The design parameters for a camera network include the number of cameras, their 3-D locations, as well as their yaw and pitch angles. The number of cameras is either an output discrete variable or a constraint in our formulation. The elevation of the cameras is usually constrained by the environment. As such, our optimization does not search for the optimal elevation but rather has the user input it as a fixed value. For simplicity, we assume that all cameras have the same elevation, but it is a simple change in our code to allow different elevation constraints to be used in different parts of the environment. The remaining 4-D camera space, comprising the 2-D location, yaw and pitch angles, is discretized into a uniform lattice gridC of N_c camera grid points, denoted as {Γ_i : i = 1, 2, ..., N_c}.
The unknown parameters of the tag in computing the visibility function (8) include the location of the tag center P, the pose of the tag v_P and the starting position φ_s of the worst-case occlusion angle. Our assumptions stated in Section 3 have the tag center lie on a 2-D plane and the pose restricted to a 1-D angle with respect to a reference direction. As for occlusion, our goal is to perform a worst-case analysis so that, as long as the occlusion angle is less than the given β defined in Section 3, our solution is guaranteed to work no matter where the occlusion is. As such, a straightforward quantization of the starting position φ_s of the occlusion angle will not work: an occlusion angle of β starting anywhere between grid points would occlude additional views. To simultaneously discretize the space and maintain the guarantee, we select a larger occlusion angle β_m > β and quantize the starting position of the occlusion angle using a step size of Δβ = β_m - β. The occlusion angles considered under this discretization are then {[iΔβ, iΔβ + β_m) : i = 0, ..., N_Δ - 1}, where N_Δ = ⌈2π/Δβ⌉. This guarantees that any occlusion angle less than or equal to β will be covered by one of the discretized occlusion angles. Figure 3 shows an example with β = Δβ = π/4 and β_m = π/2. Combining these three quantities together, we discretize the 4-D tag space into a uniform lattice gridP with N_p tag grid points {Λ_j : j = 1, 2, ..., N_p}.

Fig. 3. Discretization that guarantees an occlusion angle of at most β = π/4 at any position will be covered by one of the discretized cases; the first three cases are [0, π/2), [π/4, 3π/4) and [π/2, π).
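A small numerical check of this covering guarantee, using the illustrative values β = Δβ = π/4 and β_m = π/2 from Figure 3, can be sketched as follows; the helper names and the full-circle window count are our own choices, not code from this chapter.

```python
import math

TWO_PI = 2.0 * math.pi

def occlusion_windows(beta, beta_m):
    """Discretized occlusion windows [i*delta, i*delta + beta_m) covering [0, 2*pi)."""
    delta = beta_m - beta
    n = math.ceil(TWO_PI / delta)
    return [(i * delta, i * delta + beta_m) for i in range(n)]

def covered(phi, beta, windows):
    """True if the interval [phi, phi + beta) lies inside some window (modulo 2*pi)."""
    return any((phi - start) % TWO_PI + beta <= end - start for start, end in windows)

if __name__ == "__main__":
    beta, beta_m = math.pi / 4, math.pi / 2
    wins = occlusion_windows(beta, beta_m)
    # every occlusion of width beta, wherever it starts, falls inside some window
    assert all(covered(k * 0.001, beta, wins) for k in range(6284))
    print(len(wins), "windows are sufficient")
```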
Given a camera grid point Γ_i and a tag grid point Λ_j, we can explicitly evaluate the thresholded single-camera visibility function (7), which we now rename I_b(Λ_j | w, T, K, Γ_i), with Λ_j representing the grid point in the joint space of P, v_P and φ_s; w is the size of the tag; T is the visibility threshold; K is the environmental parameter; and Γ_i is the camera grid point. The numerical values of I_b(Λ_j | w, T, K, Γ_i) will then be used in formulating the cost constraints in our optimal camera placement algorithms.
5.2 MIN_CAM: Minimizing the number of cameras for a target visibility

MIN_CAM estimates the minimum number of cameras that can provide a mean visibility μ equal to or higher than a given threshold μ_t. There are two main characteristics of MIN_CAM. First, μ is computed not on the discrete tag space but on the actual continuous space using Monte-Carlo simulation. As such, the measurement is independent of the discretization. Furthermore, if the discretization of the tag space is done with enough prior knowledge of the environment, MIN_CAM can achieve the target using very few grid points. This is important as the complexity of BIP depends greatly on the number of constraints, which is proportional to the number of grid points. Second, the visual tagging requirements are formulated as constraints rather than as the cost function in the BIP formulation of MIN_CAM. Thus, the solution will guarantee that the chosen tag grid points are visible at two or more cameras. While this is useful for applications where the visual tagging requirement needs to be strictly enforced throughout the environment, it may inflate the number of cameras needed to capture some poorly chosen grid points. Before describing how we handle this problem, we first describe the BIP formulation in MIN_CAM.
We first associate each camera grid point Γ_i in gridC with a binary variable b_i such that

b_i = 1 if a camera is present at Γ_i, and 0 otherwise.   (9)

The optimization problem can be described as the minimization of the number of cameras:

min_{b_i} Σ_{i=1}^{N_c} b_i   (10)

subject to the following two constraints. First, for each tag grid point Λ_j in gridP, we have

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) ≥ 2.   (11)
This constraint represents the requirement of visual tagging that all tags must be visible at two or more cameras. As defined in Equation (7), I_b(Λ_j | w, T, K, Γ_i) measures the visibility of tag Λ_j with respect to the camera at Γ_i. Second, for each camera location (x, y), we have

Σ_{all Γ_i at (x, y)} b_i ≤ 1.   (12)

This set of inequalities guarantees that at most one camera is placed at any spatial location. The optimization problem in (10) with constraints (11) and (12) forms a standard BIP problem.
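For illustration, the sketch below sets up this BIP with the PuLP modeling library rather than the lp_solve package used later in the chapter; Ib is assumed to be a precomputed N_c-by-N_p table of I_b values and same_location groups the camera grid indices that share a spatial position, both hypothetical inputs.

```python
import pulp

def min_cam(Ib, same_location):
    """MIN_CAM BIP of Equations (10)-(12), as a sketch.

    Ib[i][j]      -- 1 if tag grid point j is visible from camera grid point i.
    same_location -- list of index lists, one per spatial (x, y) location.
    Returns the list of selected camera grid indices, or None if infeasible.
    """
    n_c, n_p = len(Ib), len(Ib[0])
    prob = pulp.LpProblem("MIN_CAM", pulp.LpMinimize)
    b = [pulp.LpVariable(f"b_{i}", cat="Binary") for i in range(n_c)]
    prob += pulp.lpSum(b)                                    # objective (10)
    for j in range(n_p):                                     # constraint (11)
        prob += pulp.lpSum(b[i] * Ib[i][j] for i in range(n_c)) >= 2
    for group in same_location:                              # constraint (12)
        prob += pulp.lpSum(b[i] for i in group) <= 1
    status = prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[status] != "Optimal":
        return None
    return [i for i in range(n_c) if b[i].value() > 0.5]
```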
The solution to the above BIP problem obviously depends on the selection of grid points in gridP and gridC. While gridC is usually predefined according to the constraints of the environment, there is no guarantee that a tag at a random location will be visible by two cameras even if there is a camera at every camera grid point. Thus, tag grid points must be placed intelligently: tag grid points away from obstacles and walls are usually easier to observe. On the other hand, focusing only on areas away from the obstacles may produce a subpar result when measured over the entire environment. To balance the two considerations, we solve the BIP repeatedly over a progressively refined gridP over the spatial dimensions until the target μ_t, measured over the entire continuous environment, is satisfied. One possible refinement strategy is to have gridP start from a single grid point at the middle of the environment, and grow uniformly in density within the interior of the environment while remaining at least one interval away from the boundary. If the BIP fails to return a solution, the algorithm randomly removes half of the newly added tag grid points. The iteration terminates when the target μ_t is achieved or all the newly added grid points are removed. The above process is summarized in Algorithm 1.
Input: initial grid points for cameras gridC and tags gridP, target mean visibility μ_t, maximum grid density maxDensity
Output: camera placement camPlace
Set μ = 0, newP = ∅;
while μ < μ_t do
  foreach Γ_i in gridC do
    foreach Λ_j in gridP ∪ newP do
      Calculate I_b(Λ_j | w, T, K, Γ_i);
    end
  end
  Solve newCamPlace = BIP_solver(gridC, gridP ∪ newP, I_b);
  if newCamPlace == ∅ then
    if |newP| == 1 then
      break and return failure;
    end
    Randomly remove half of the elements from newP;
  else
    camPlace = newCamPlace;
    gridP = gridP ∪ newP;
    newP = new grid points created by halving the spatial separation;
    newP = newP \ gridP;
    Calculate μ for camPlace by Monte-Carlo sampling;
  end
end
Algorithm 1: MIN_CAM algorithm
5.3 FIX_CAM: Maximizing the visibility for a given number of cameras

A drawback of MIN_CAM is that it may need a large number of cameras in order to satisfy the visibility of all tag grid points. If the goal is to maximize the average visibility, a sensible way to reduce the number of cameras is to allow a small portion of the tag grid points to be observed by fewer than two cameras. The selection of these tag grid points should be dictated by the distribution of the occupant traffic f(P, v_P, φ_s) used in computing the average visibility as described in Equation (2). FIX_CAM is the algorithm that does precisely that.

We first define a set of binary variables over the tag grid, {x_j : j = 1, ..., N_p}, indicating whether a tag on the j-th tag grid point in gridP is visible at two or more cameras. We also assume a prior distribution {ρ_j : j = 1, ..., N_p}, with Σ_j ρ_j = 1, that describes the probability of having a person at each tag grid point. The cost function, defined to be the average visibility over the discrete space, is given as follows:

max_{b_i} Σ_{j=1}^{N_p} ρ_j x_j   (13)
The relationship between the camera placement variables b_i defined in (9) and the visibility performance variables x_j can be described by the following constraints. For each tag grid point Λ_j, we have

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) - (N_c + 1) x_j < 1   (14)

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) - 2 x_j ≥ 0   (15)
These two constraints effectively define the binary variable x_j. If x_j = 1, Inequality (15) becomes

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) ≥ 2,

which means that a feasible solution of the b_i's must have the tag visible at two or more cameras. Inequality (14) becomes

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) < N_c + 2,

which is always satisfied: the largest possible value of the left-hand side is N_c, corresponding to the case when there is a camera at every grid point and every tag point is observable by two or more cameras. If x_j = 0, Inequality (14) becomes

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) < 1,

which implies that the tag is not visible by two or more cameras. Inequality (15) is always satisfied, as it becomes

Σ_{i=1}^{N_c} b_i · I_b(Λ_j | w, T, K, Γ_i) ≥ 0.
Two additional constraints are needed to complete the formulation. As the cost function focuses only on visibility, we need to constrain the number of cameras to be no more than a maximum number m of cameras:

Σ_{i=1}^{N_c} b_i ≤ m   (16)

We also keep the constraint in (12) to ensure that at most one camera is used at each spatial location.
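A FIX_CAM sketch in the same PuLP style as before differs from MIN_CAM only in the objective and in the indicator constraints (14)-(16); the strict inequality in (14) is implemented as ≤ 0, which is equivalent because the left-hand side is integer-valued. As before, Ib, rho and same_location are assumed, precomputed inputs.

```python
import pulp

def fix_cam(Ib, rho, same_location, m):
    """FIX_CAM BIP of Equations (13)-(16) plus (12), as a sketch.

    Ib[i][j] -- 1 if tag grid point j is visible from camera grid point i.
    rho[j]   -- prior probability of a tag at grid point j (sums to one).
    m        -- maximum number of cameras.
    """
    n_c, n_p = len(Ib), len(Ib[0])
    prob = pulp.LpProblem("FIX_CAM", pulp.LpMaximize)
    b = [pulp.LpVariable(f"b_{i}", cat="Binary") for i in range(n_c)]
    x = [pulp.LpVariable(f"x_{j}", cat="Binary") for j in range(n_p)]
    prob += pulp.lpSum(rho[j] * x[j] for j in range(n_p))        # objective (13)
    for j in range(n_p):
        cover = pulp.lpSum(b[i] * Ib[i][j] for i in range(n_c))
        prob += cover - (n_c + 1) * x[j] <= 0                    # constraint (14)
        prob += cover - 2 * x[j] >= 0                            # constraint (15)
    prob += pulp.lpSum(b) <= m                                   # constraint (16)
    for group in same_location:                                  # constraint (12)
        prob += pulp.lpSum(b[i] for i in group) <= 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    cams = [i for i in range(n_c) if b[i].value() > 0.5]
    return cams, pulp.value(prob.objective)
```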
5.4 GREEDY: A greedy algorithm to speed up BIP

BIP is a well-studied NP-hard combinatorial problem with many heuristic schemes, such as branch-and-bound, already implemented in software libraries such as lp_solve [20]. However, even these algorithms can be quite computationally intensive if the search space is large. In this section, we introduce a simple greedy algorithm, GREEDY, that can be used for both MIN_CAM and FIX_CAM. Besides experimentally showing the effectiveness of GREEDY, we believe that the greedy approach is an appropriate approximation strategy due to the similarity of our problem to the set cover problem.

In the set cover problem, items can belong to multiple sets and the optimization goal is to minimize the number of sets needed to cover all the items. While finding the optimal solution to set covering is NP-hard [21], it has been shown that the greedy approach is essentially the best one can do to obtain an approximate solution [22]. We can draw a parallel between our problem and the set cover problem by considering each tag grid point as an item that belongs to a camera grid point if the tag is visible at that camera. The set cover problem then minimizes the number of cameras needed, which is almost identical to MIN_CAM except that visual tagging requires each tag to be visible by two or more cameras. The FIX_CAM algorithm further allows some of the tag points not to be covered at all. Whether these properties can be incorporated into the framework of set covering remains an open problem; nevertheless, our experimental results demonstrate that the greedy approach is a reasonable solution to our problem. The GREEDY algorithm is described in Algorithm 2.
Input: initial grid points for cameras gridC and tags gridP, target mean visibility μ_t and the maximum number of cameras m
Output: camera placement camPlace
Set U = gridC, V = ∅, W = gridP, camPlace = ∅;
while |V| < μ_t · |gridP| and |camPlace| < m do
  c = the camera grid point in U that maximizes the number of visible tag grid points in W;
  camPlace = camPlace ∪ {c};
  S = the subset of gridP that is visible by two or more cameras in camPlace;
  V = V ∪ S;
  W = W \ S;
  Remove c and all camera grid points in U that share the same spatial location as c;
  if U == ∅ then
    camPlace = ∅;
    return;
  end
end
Output camPlace
Algorithm 2: GREEDY: greedy-search camera placement algorithm
In each round of the GREEDY algorithm, the camera grid point that can see the largest number of tag grid points is selected, and all the tag grid points visible at two or more of the selected cameras are removed. When using GREEDY to approximate MIN_CAM, we no longer need to refine the tag grid to reduce the computational cost; we can start with a fairly dense tag grid and set the camera bound m to infinity. The GREEDY algorithm terminates when the estimated mean visibility reaches the target μ_t. When using GREEDY to approximate FIX_CAM, μ_t is set to one and the GREEDY algorithm terminates when the number of cameras reaches the upper bound m required by FIX_CAM.
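A direct Python transcription of Algorithm 2 is sketched below; it assumes the same precomputed visibility table Ib and same_location grouping used in the earlier sketches, and counts a tag grid point as covered once two or more selected cameras see it.

```python
def greedy_placement(Ib, same_location, mu_target=1.0, m=float("inf")):
    """Greedy approximation of MIN_CAM / FIX_CAM (Algorithm 2), as a sketch.

    Ib[i][j] -- 1 if tag grid point j is visible from camera grid point i.
    Set mu_target < 1 and m = inf to mimic MIN_CAM; set mu_target = 1 and a
    finite m to mimic FIX_CAM.
    """
    n_c, n_p = len(Ib), len(Ib[0])
    location_of = {}
    for group in same_location:
        for i in group:
            location_of[i] = tuple(group)
    available = set(range(n_c))          # camera grid points still usable (U)
    covered = set()                      # tag points seen by >= 2 cameras (V)
    uncovered = set(range(n_p))          # remaining tag points (W)
    chosen = []                          # camPlace
    seen_count = [0] * n_p               # how many chosen cameras see each tag
    while len(covered) < mu_target * n_p and len(chosen) < m:
        if not available:
            return []                    # no feasible placement
        # pick the camera grid point seeing the most still-uncovered tag points
        best = max(available, key=lambda i: sum(Ib[i][j] for j in uncovered))
        chosen.append(best)
        for j in range(n_p):
            seen_count[j] += Ib[best][j]
        newly = {j for j in uncovered if seen_count[j] >= 2}
        covered |= newly
        uncovered -= newly
        # drop the chosen point and all poses sharing its spatial location
        available -= set(location_of.get(best, (best,)))
    return chosen
```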
6 Experimental Results
In this section, we present both simulation and realistic camera network results to demonstrate the proposed algorithms. In Section 6.1, we show various properties of MIN_CAM, FIX_CAM and GREEDY by varying different model parameters. In Section 6.2, we compare the optimal camera configurations computed by our techniques with other camera configurations.
6.1 Optimal camera placement simulation experiments
All the simulations in this section assume a room of dimension 10 m × 10 m with a single obstacle and a square tag with edge length w = 20 cm. For the camera and lens models, we assume a pixel width of 5.6 μm, a focal length of 8 mm and a field of view of 60 degrees. These parameters closely resemble the real cameras that we use in the real-life experiments. The threshold T for visibility is set to five pixels, which we find to be an adequate threshold for our color-tag detector.
6.1.1 Performance of MIN_CAM

We first study how MIN_CAM estimates the minimum number of cameras for a target mean visibility μ_t through tag grid refinement. For simplicity, we keep all the cameras at the same elevation as the tags and assume no mutual occlusion. The target mean visibility is set to μ_t = 0.90 and the algorithm reaches this target in four iterations. The output at each iteration is shown in Figure 4. Figures 4(a) and 4(e) show the first iteration. Figure 4(a) shows the environment with one tag grid point (black dot) in the middle. The camera grid points are restricted to regular intervals along the red boundary of the environment and remain the same for all iterations. The blue arrows indicate the output positions and poses of the cameras from the BIP solver. Figure 4(e) shows the Monte-Carlo simulation results: the figure shows the coverage of the environment by calculating the local average visibility at different spatial locations. The gray level of each pixel represents the visibility of each local region: the brighter the pixel, the higher the probability that it is visible at two or more cameras. The overall mean visibility over the environment is estimated to be 0.4743. Since this is below the target μ_t, the tag grid is refined as shown in Figure 4(b), with the corresponding Monte-Carlo simulation shown in Figure 4(f). As the number of cameras increases from four to eight, μ increases to 0.7776. The next iteration, shown in Figure 4(c), grows the tag grid further. With so many constraints, the BIP solver fails to return a feasible solution. MIN_CAM then randomly discards roughly half of the newly added tag grid points. The discarded grid points are shown as blue dots in Figure 4(d). With fewer grid points and hence fewer constraints, a solution is returned with eleven cameras. The corresponding Monte-Carlo simulation shown in Figure 4(h) gives μ = 0.9107, which exceeds the target threshold, and MIN_CAM terminates.
(a) Iteration 1 (b) Iteration 2 (c) Iteration 3 (d) Iteration 4
(e) μ = 0.4743 (f) μ = 0.7776 (g) μ = 0 (h) μ = 0.9107
Fig. 4. Four iterations of MIN_CAM.
6.1.2 FIX_CAM versus MIN_CAM

In the second experiment, we demonstrate the difference between FIX_CAM and MIN_CAM. Using the same environment as in Figure 4(c), we run FIX_CAM to maximize the performance with eleven cameras. The traffic model ρ_j is set to be uniform. MIN_CAM fails to return a solution under this dense grid and, after randomly discarding some of the tag grid points, outputs μ = 0.9107 using eleven cameras. On the other hand, without any random tuning of the tag grid, FIX_CAM returns a solution with μ = 0.9205; the results are shown in Figures 5(a) and 5(b). When we reduce the number of cameras to ten and rerun FIX_CAM, we obtain μ = 0.9170, which still exceeds the result from MIN_CAM. This demonstrates that we can use FIX_CAM to fine-tune the approximate result obtained by MIN_CAM. The camera configuration and the visibility distribution using ten cameras are shown in Figures 5(c) and 5(d), respectively.
(a) FC: 11 cam (b) FC: μ = 0.9205 (c) FC: 10 cam (d) FC: μ = 0.9170
(e) G: 11 cam (f) G: μ = 0.9245 (g) G: 10 cam (h) G: μ = 0.9199
Fig. 5. Figures 5(a) to 5(d) show the results of using FIX_CAM (FC). Figures 5(e) to 5(h) show the same set of experiments using GREEDY (G) as an approximation to FIX_CAM.
6.1.3 GREEDY Implementation of FIX_CAM

Using the same setup, we repeat our FIX_CAM experiments using the GREEDY implementation. Our algorithm is implemented in MATLAB version 7.0 on a Xeon 2.1 GHz machine with 4 GB of memory. The BIP solver inside the FIX_CAM algorithm is based on lp_solve [20]. We have tested both algorithms using eleven, ten, nine and eight as the maximum number of cameras. While changing the number of cameras does not change the number of constraints, the search space becomes more restrictive as we reduce the number of cameras. As such, it is progressively more difficult to prune the search space, making the solver resemble an exhaustive search. The results are summarized in Table 1. For each run, three numerical values are reported: the fraction of tag points visible to two or more cameras, which is the actual maximized cost function; the running time; and the mean visibility estimated by Monte-Carlo simulations. At eight cameras, GREEDY is 30,000 times faster than lp_solve, with only 3% fewer visible tag points than the exact answer. It is also worth pointing out that lp_solve fails to terminate when we refine the tag grid by halving the step size in each dimension, while GREEDY uses essentially the same amount of time. The placement and visibility maps of the GREEDY algorithm that mirror those from FIX_CAM are shown in the second row of Figure 5.
Table 1
Comparison between lp_solve and GREEDY (fraction of visible tags / running time in seconds / mean visibility μ)

No. cameras   lp_solve: visible / time / μ     GREEDY: visible / time / μ
Eleven        0.99 / 1.20 / 0.9205             0.98 / 0.01 / 0.9245
Ten           0.98 / 46.36 / 0.9170            0.98 / 0.01 / 0.9199
Nine          0.97 / 113.01 / 0.9029           0.97 / 0.01 / 0.8956
Eight         0.96 / 382.72 / 0.8981           0.94 / 0.01 / 0.8761
6.1.4 Elevation of tags and cameras

Armed with an efficient greedy algorithm, we can explore various modeling parameters in our framework. An assumption made in the visibility model is that all the tag centers are in the same horizontal plane. This does not reflect the real world, due to the different heights of individuals. In the following experiment, we examine the impact of the variation in height on the performance of a camera placement. Using the camera placement in Figure 5(g), we simulate five different scenarios: the height of each person is 10 cm or 20 cm taller or shorter than the assumed height, as well as heights randomly drawn from a bi-normal distribution based on U.S. census data [23]. The changes in the average visibility are shown in Table 2. They range from -3.8% to -1.2%, which indicates that our assumption does not have a significant impact on the measured visibility.
Table 2
Effect of height variation on μ

Height model   +20 cm   -20 cm   +10 cm   -10 cm   Random
Change in μ    -3.8%    -3.3%    -1.2%    -1.5%    -1.3%
Next, we consider the elevation of the cameras. In typical camera networks, cameras are usually installed at elevated positions to mitigate occlusion. The drawback of elevation is that the camera covers a smaller field of view compared with the case when the camera is at the same elevation as the tags. By adjusting the pitch angle of an elevated camera, we can selectively move the field of view to various parts of the environment. As we now add one more dimension, the pitch angle, the optimization becomes significantly more difficult and the GREEDY algorithm must be used. Figure 6 shows the results for m = 10 cameras with three different elevations above the plane on which the centers of all the tags are located. As expected, the mean visibility decreases as we raise the cameras. The visibility maps in Figures 6(d), 6(e) and 6(f) show that as the cameras are elevated, the coverage near the boundary drops, but the center remains well covered as the algorithm adjusts the pitch angles of the cameras.
(a) 0.4 m (b) 0.8 m (c) 1.2 m
(d) μ = 0.9019 (e) μ = 0.8714 (f) μ = 0.8427
Fig. 6. Camera planning and Monte-Carlo simulation results when the cameras are elevated 0.4, 0.8 and 1.2 m above the tags.
6.1.5 Mutual Occlusion

We now present simulation results to show how our framework deals with mutual occlusion. Recall that we model occlusion as an occlusion angle of β at the tag. Similar to the experiments on camera elevation, our occlusion model adds an additional dimension to the tag grid and thus we have to resort to the GREEDY algorithm. We would like to investigate how occlusion affects the number of cameras and the camera positions of the output configuration. As such, we use GREEDY to approximate MIN_CAM by identifying the minimum number of cameras needed to achieve a target level of visibility. We use a denser tag grid than before to minimize the difference between the actual mean visibility and that estimated by GREEDY over the discrete tag grid. The tag grid we use is 16 × 16 spatially with 16 different orientations. We set the target to μ_t = 0.8 and test occlusion angles β of 0°, 22.5° and 45°. As explained earlier in Section 5.1, our discretization uses a slightly larger occlusion angle to guarantee the worst-case analysis: we use β_m = 32.5° for β = 22.5° and β_m = 65° for β = 45°. In the Monte-Carlo simulation, we put the occlusion angle at a random position for each sample point. The results are shown in Figure 7. We can see that, even with the number of cameras increasing from six to eight to twelve, the resulting mean visibility still suffers slightly as the occlusion angle increases. Another interesting observation from the visibility maps in Figures 7(d), 7(e) and 7(f) is that the region with perfect visibility, indicated by the white pixels, dwindles as occlusion increases. This is reasonable because it is difficult for a tag to be visible at all orientations in the presence of occlusion.
(a) β = 0°: 6 cameras (b) β = 22.5°: 8 cameras (c) β = 45°: 12 cameras
(d) μ = 0.8006 (e) μ = 0.7877 (f) μ = 0.7526
Fig. 7. As the occlusion angle increases from 0° in Figure 7(a) to 22.5° in Figure 7(b) and 45° in Figure 7(c), the required number of cameras increases from 6 to 8 and 12 when using GREEDY to achieve a target performance of μ_t = 0.8. Figures 7(d) to 7(f) are the corresponding visibility maps.
6.1.6 Realistic Occupant Traffic Distribution

In this last experiment, we show how one can incorporate realistic occupant traffic patterns into the FIX_CAM algorithm. All experiments thus far assume a uniform traffic distribution over the entire tag space: it is equally likely to find a person at each spatial location and at each orientation. This model does not reflect many real-life scenarios. For example, consider a hallway inside a shopping mall: while there are people browsing at the window displays, most of the traffic flows from one end of the hallway to the other. By incorporating an appropriate traffic model, the performance should improve under the same resource constraint. In the FIX_CAM framework, a traffic model can be incorporated into the optimization by using non-uniform weights ρ_j in the cost function (13).

In order to use a reasonable traffic distribution, we employ a simple random walk model to simulate a hallway environment. We imagine that there are openings on either side of the top portion of the environment. At each tag grid point, which is characterized by both the orientation and the position of a walker, we impose the following transition probabilities: a walker has a 50% chance of moving to the next spatial grid point following the current orientation, unless it is obstructed by an obstacle, and has a 50% chance of changing orientation. In the case of changing orientation, there is a 99% chance of choosing the orientation that faces the tag grid point closest to the nearest opening, while the rest of the orientations share the remaining 1%. At those tag grid points closest to the openings, we create a virtual grid point to represent the event of a walker exiting the environment. The transition probabilities from the virtual grid point back to the real tag points near the openings are all equal. The stationary distribution ρ_j is then computed by finding the eigenvector of the transition probability matrix of the entire environment with eigenvalue equal to one [24, ch. 11.3].
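For reference, the stationary distribution of such a chain is the left eigenvector of the transition matrix with eigenvalue one; a NumPy sketch with a small illustrative transition matrix (not the actual hallway chain) is given below.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of a row-stochastic transition matrix P,
    i.e. the left eigenvector with eigenvalue one, normalized to sum to one."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

# Toy 3-state example (not the hallway chain of Section 6.1.6):
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.5, 0.5]])
rho = stationary_distribution(P)   # weights rho_j for the cost function (13)
```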
Figure 8(a) shows this hallway environment. The four hollow circles indicate the tag grid points closest to the openings. The result of the optimization under the constraint of using four cameras is shown in Figure 8(b). Clearly, the optimal configuration favors the heavy-traffic hallway area. If the uniform distribution is used instead, we obtain the configuration in Figure 8(c) and the visibility map in Figure 8(d). The average visibility drops from 0.8395 to 0.7538 because of the mismatch with the traffic pattern.
(a) Random walk (b) μ = 0.8395 (c) Uniform (d) μ = 0.7538
Fig. 8. Figures 8(a) and 8(b) use the specific traffic distribution for optimization and obtain a higher μ compared with using a uniform distribution in Figures 8(c) and 8(d).
6.2 Comparison with other camera placement strategies

In this section, we compare our optimal camera placements with two different placement strategies. The first one is uniform placement: assuming that the cameras are restricted to the boundary of the environment, the most intuitive scheme is to place them at regular intervals along the boundary, each pointing towards the center of the room. The second one is based on the optimal strategy proposed in [14].
To test the differences in visibility models, it would be unfair to use Monte-Carlo simulations, which use the same model as the optimization. As a result, we resort to virtual environment simulations by creating a virtual 3-D environment that mimics the actual 10 m × 10 m room used in Section 6.1. We then insert a random-walking humanoid wearing a red tag. The results are based on the visibility of the tag in two or more cameras. The cameras are set at the same height as the tag and no mutual occlusion modeling is used. The optimization is performed with respect to a fixed number of cameras. To be fair to the scheme in [14], we run their optimization formulation to maximize the visibility from two cameras. The measurements of μ for the three schemes, with the number of cameras varied from five to eight, are shown in Table 3. Our proposed FIX_CAM performs the best, followed by the uniform placement. The scheme in [14] does not perform well because it does not take into account the orientation of the tag. As such, the cameras do not compensate for each other when the tag is in different orientations.
Table 3
μ measurements among the three schemes using virtual simulations

Number of cameras   FIX_CAM           [14]              Uniform placement
5                   0.614 ± 0.011     0.352 ± 0.010     0.522 ± 0.011
6                   0.720 ± 0.009     0.356 ± 0.010     0.612 ± 0.011
7                   0.726 ± 0.009     0.500 ± 0.011     0.656 ± 0.010
8                   0.766 ± 0.008     0.508 ± 0.011     0.700 ± 0.009
We are, however, surprised by how close the uniform placement is to our optimal scheme. Thus, we further test the difference between the two with a real-life experiment that incorporates mutual occlusion. We conduct our real-life experiments indoors in a room 7.6 meters long, 3.7 meters wide, and 2.5 meters high. There are two desks and a shelf along three of the four walls. Seven Unibrain Fire-i400 cameras at an elevation of 1.5 meters with Tokina Varifocal TVR0614 lenses are used. Since they are variable focal-length lenses, we set them at a focal length of 8 mm, with a vertical field of view of 45° and a horizontal field of view of 60°. As the elevation of the cameras is roughly level with the position of the tags, we have chosen a fairly large occlusion angle of β_m = 65° in deriving our optimal placement. Monte-Carlo results for the uniform placement and the optimal placement are shown in Figure 9. For the virtual environment simulation, we insert three randomly walking humanoids and capture 250 frames for measurement. For the real-life experiments, we capture about two minutes of video from the seven cameras, again with three persons walking in the environment. Figures 10 and 11 show the seven real-life and virtual camera views from the uniform placement and the optimal placement, respectively. As shown in Table 4, the optimal camera placement is better than the uniform camera placement under all three evaluation approaches. The three measured μ's for the optimal placement are consistent. The results of the uniform placement have higher variation, most likely because the excessive amount of occlusion makes detection of the color tags less reliable.
Table 4
μ measurements between uniform and optimal camera placements

Method    MC simulations   Virtual simulation   Real-life experiments
Uniform   0.3801           0.4104 ± 0.0153      0.2335 ± 0.0112
Optimal   0.5325           0.5618 ± 0.0156      0.5617 ± 0.0121
(a) Uniform placement: μ = 0.3801 (b) Optimal placement: μ = 0.5325
(c) Uniform placement: μ = 0.3801 (d) Optimal placement: μ = 0.5325
Fig. 9. Camera placement in a real camera network.
Fig. 10. Seven camera views from the uniform camera placement.

Fig. 11. Seven camera views from the optimal camera placement.

7 Conclusions and Future work

In this chapter, we have proposed a framework for modeling, measuring and optimizing the placement of multiple cameras. By using a camera placement metric that captures both self and mutual occlusion in 3-D environments, we have proposed two optimal camera placement strategies that complement each other using grid-based binary integer programming. To deal with the computational complexity of BIP, we have also developed a greedy strategy to approximate both of our optimization algorithms. Experimental results have been presented to verify our model and to show the effectiveness of our approaches.
There are many interesting issues in our proposed framework, and in visual tagging in general, that deserve further investigation. The incorporation of models for different visual sensors, such as omnidirectional and PTZ cameras, or even non-visual sensors and other output devices such as projectors, is certainly a very interesting topic. The optimality of our greedy approach could benefit from detailed theoretical study. Last but not least, the use of visual tagging in other application domains, such as immersive environments and surveillance visualization, should be further explored.
References

[1] J. O'Rourke, Art Gallery Theorems and Algorithms, Oxford University Press, 1987.
[2] J. Urrutia, Art Gallery and Illumination Problems, Elsevier Science, Amsterdam, 1997.
[3] T. Shermer, Recent results in art galleries, Proceedings of the IEEE 80 (9) (1992) 1384-1399.
[4] V. Chvátal, A combinatorial theorem in plane geometry, Journal of Combinatorial Theory, Series B 18 (1975) 39-41.
[5] D. Lee, A. Lin, Computational complexity of art gallery problems, IEEE Transactions on Information Theory 32 (1986) 276-282.
[6] D. Yang, J. Shin, A. Ercan, L. Guibas, Sensor tasking for occupancy reasoning in a camera network, in: 1st Workshop on Broadband Advanced Sensor Networks (BASENETS), IEEE/ICST, 2004.
[7] P.-P. Vazquez, M. Feixas, M. Sbert, W. Heidrich, Viewpoint selection using viewpoint entropy, in: Proceedings of the Vision Modeling and Visualization Conference (VMV01), IOS Press, Amsterdam, 2001, pp. 273-280.
[8] J. Williams, W.-S. Lee, Interactive virtual simulation for multiple camera placement, in: International Workshop on Haptic Audio Visual Environments and their Applications, IEEE, 2006, pp. 124-129.
[9] S. Ram, K. R. Ramakrishnan, P. K. Atrey, V. K. Singh, M. S. Kankanhalli, A design methodology for selection and placement of sensors in multimedia surveillance systems, in: VSSN '06: Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks, ACM Press, New York, NY, USA, 2006, pp. 121-130.
[10] T. Bodor, A. Dremer, P. Schrater, N. Papanikolopoulos, Optimal camera placement for automated surveillance tasks, Journal of Intelligent and Robotic Systems 50 (2007) 257-295.
[11] A. Mittal, L. S. Davis, A general method for sensor planning in multi-sensor systems: Extension to random occlusion, International Journal of Computer Vision 76 (1) (2008) 31-52.
[12] A. Ercan, D. Yang, A. El Gamal, L. Guibas, Optimal placement and selection of camera network nodes for target localization, in: International Conference on Distributed Computing in Sensor Systems, Vol. 4026, IEEE, 2006, pp. 389-404.
[13] E. Dunn, G. Olague, Pareto optimal camera placement for automated visual inspection, in: International Conference on Intelligent Robots and Systems, IEEE/RSJ, 2005, pp. 3821-3826.
[14] E. Horster, R. Lienhart, On the optimal placement of multiple visual sensors, in: VSSN '06: Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks, ACM Press, New York, NY, USA, 2006, pp. 111-120.
[15] M. A. Hasan, K. K. Ramachandran, J. E. Mitchell, Optimal placement of stereo sensors, Optimization Letters 2 (2008) 99-111.
[16] U. M. Erdem, S. Sclaroff, Optimal placement of cameras in floorplans to satisfy task-specific and floor plan-specific coverage requirements, Computer Vision and Image Understanding 103 (3) (2006) 156-169.
[17] K. Chakrabarty, S. Iyengar, H. Qi, E. Cho, Grid coverage of surveillance and target location in distributed sensor networks, IEEE Transactions on Computers 51 (12) (2002) 1448-1453.
[18] G. Sierksma, Linear and Integer Programming: Theory and Practice, Marcel Dekker Inc., 2002.
[19] J. Zhao, S.-C. Cheung, Multi-camera surveillance with visual tagging and generic camera placement, in: International Conference on Distributed Smart Cameras (ICDSC '07), ACM/IEEE, 2007.
[20] Introduction to lp_solve 5.5.0.10, http://lpsolve.sourceforge.net/5.5/.
[21] R. M. Karp, Reducibility among combinatorial problems, Complexity of Computer Computations (1972) 85-103.
[22] U. Feige, A threshold of ln n for approximating set cover, Journal of the ACM 45 (4) (1998) 634-652.
[23] United States Census Bureau, Statistical Abstract of the United States: 1999, 1999.
[24] C. M. Grinstead, J. L. Snell, Introduction to Probability, American Mathematical Society, 1997.