Probabilistic reasoning in high-level vision

L Enrique Sucar and Duncan F Gillies

Imperial College of Science, Technology and Medicine, Department of Computing, 180 Queen's Gate, London SW7 2BZ, UK

Paper received: 29 September 1992; revised paper received: 12 July 1993

Figure 1 Levels of representation in vision. The different features obtained from the low-level processes are fed into the high-level processes that use knowledge of the world for postulating what objects are in the image (recognition)
Model-based vision
• They use simple models. The objects to be recognized can be represented geometrically with few parameters and in a relatively precise way. This applies mainly to man-made objects.
• They process few objects. The number of different objects in the domain is small. This is limited by the memory and processing requirements, especially if the system has to operate in real time. Graph matching algorithms are computationally expensive; indeed, in the worst case subgraph isomorphism is an NP-complete problem.
• They have robust feature extraction. The features obtained through the perception processes are reliable, and an object description can be easily constructed. This usually assumes known illumination conditions, surface properties and camera position, and sometimes the use of other sensors, such as multiple cameras and range finders.

Current model-based systems are limited as general-purpose vision systems by the performance of the segmentation modules and by their restricted representation of world knowledge. These limitations also apply to other domains that do not satisfy the restrictions mentioned above. They are not well suited for natural scenes, such as those required for autonomous navigation in natural environments.

Knowledge-based vision

While model-based systems mainly use analogical models to represent objects, knowledge-based systems use propositional models, and recognition is realized by a process of inference which does not require the construction of an explicit model of the objects to be identified. They are a collection of propositions that represent facts about the objects and their relationships. The interpretation process consists of inferring, from the data and the known facts about the domain, the identity of the objects in the image. A general structure for this type of system is shown in Figure 2.

This is similar to the general framework for model-based vision; the main differences are in the way we represent and use domain knowledge. In the model-based approach the representation is based on exact geometric models. In the knowledge-based case, the models use a symbolic description based on the different knowledge representation techniques developed in artificial intelligence (AI). We can divide a knowledge-based vision system into three processes:

• Feature extraction, which obtains the relevant features from the images and constructs a symbolic description of the objects used for recognition. These are integrated into what we call a symbolic image.
• Knowledge acquisition, which concerns the construction of the model that represents our knowledge about the domain. This is done prior to recognition, and the result is stored in the knowledge-base.
• Inference, which is the process of deducing, from the facts in the knowledge-base and the information in the symbolic image, the identity and location of the relevant objects.

Uncertainty

Our knowledge about the relationship between the image features (evidence) and the objects in the world (hypothesis) is often uncertain. If we knew the structure of the world we observe, we could categorically determine its projection in the image, as is the case in computer graphics; but given an image, the relation becomes probabilistic, which is the case in vision. In both cases we represent the relationship between the world and the image in a causal direction, represented by a directed link from world to image.

Previous work in knowledge-based vision uses representations based on predicate logic, production systems, semantic networks and frames. These systems have logic as their underlying principle, and reasoning is based on logical inference. Thus, handling uncertainty has been done by enhancing the basic
representation and reasoning mechanisms, using some alternative numerical techniques such as Dempster-Shafer theory, certainty factors, fuzzy logic, a combination of them, or ad hoc methods developed for specific applications [5-11]. Most systems base their beliefs, certainty factors or probabilities on statistical data. Only probability provides a clear semantic interpretation, either as a statistical frequency (objective), or as a subjective estimate used traditionally in expert systems.

Probability theory provides an adequate basis for handling uncertainty in high-level vision. It is theoretically and practically appropriate in cases in which we require numerical predictions on verifiable events [12], which is the case in most computer vision systems. Probability theory satisfies the procedural adequacy criteria for a visual knowledge representation proposed by Mackworth [13], namely: soundness, completeness, flexibility, acquisition and efficiency. It also provides a preference ordering and efficient algorithms for a wide range of problems. The purely subjective Bayesian approach has been used in several vision applications [14]; however, there are some difficulties in its application. Ng and Abramson [15] point out three main problems:

1. The difficulty in extracting from the expert(s) the knowledge to assess the required probability distributions.
2. The huge number of probabilities that must be obtained to construct a functioning knowledge-base.
3. The independence assumptions that are required to render the problem tractable.

Rimey and Brown [16] have made some progress towards reducing the large number of probabilities required by incorporating geometric properties of the scene into their computation, but their approach remains application oriented. In contrast, we have based our approach on objective probability, which provides an adequate framework for knowledge representation and reasoning in computer vision, and permits machine estimation of the probabilities.

The evidence comprises the features obtained from the image by domain-independent low-level vision, including uniform regions, contours, colour, 3D information, etc. The hypotheses are the different objects we expect to find in the world, which is usually restricted to a specific domain, although, of course, we can have several knowledge-bases, one for each domain. The knowledge-base relates the features to the objects, through intermediate regions, surfaces and sub-objects, depending upon the complexity of the objects in the domain. It can also include relationships and restrictions between objects, such as topological relations. These relations can also exist between sub-objects or lower-level features. Thus, we can represent our visual KB as a series of hierarchical relationships, starting from the low-level evidence features, to regions, surfaces, sub-objects and objects.

In probabilistic terms these relations correspond to dependency information, i.e. we can think of a series of image features giving evidence for certain objects; or an object causing the appearance of certain features in the image. This dependency information and its causal interpretation can be represented by a probabilistic network. Each node in the network will represent an entity (feature, region, ..., object) or relation (above, surrounding, near, ...) in the domain, and the arcs will represent the dependencies between these entities. This representation makes explicit the dependency information, indicating that a certain object depends on a group of entities and is independent from the others. Thus, the structure of the probabilistic network will represent the qualitative knowledge about the domain. The probability distributions attached to the links will correspond to the strength of their dependencies. This representation reminds us of a semantic network representation, but in contrast, the structure of a probabilistic network has a clear meaning in terms of probabilistic dependencies, and it adds the representation of uncertainty to the relations, which are measured statistically. It also gives a rank ordering, in terms of probabilities, allowing us to select one of several conflicting interpretations of an image.
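To make this concrete, one possible encoding of such a network fragment is sketched below in Python. This is an illustrative sketch only: the variable names, domains and probability values are assumptions, not values from the system described here.

# A visual probabilistic network fragment: each node stores its
# discrete domain, its single parent (None for a root object node),
# and a probability table attached to the incoming link.
network = {
    "object": {
        "values": ["lumen", "none"],
        "parent": None,
        "prior": {"lumen": 0.6, "none": 0.4},
    },
    "region": {                      # feature node, evidence from the image
        "values": ["dark", "bright"],
        "parent": "object",
        # P(region | object): the strength of the dependency on the link
        "cpt": {("dark", "lumen"): 0.8, ("bright", "lumen"): 0.2,
                ("dark", "none"): 0.3, ("bright", "none"): 0.7},
    },
}

# Posterior over the object after observing a dark region (Bayes rule):
prior = network["object"]["prior"]
cpt = network["region"]["cpt"]
joint = {o: prior[o] * cpt[("dark", o)] for o in prior}
z = sum(joint.values())
posterior = {o: p / z for o, p in joint.items()}
print(posterior)   # {'lumen': 0.8, 'none': 0.2}

The rank ordering mentioned above falls out directly: the interpretation with the highest posterior probability is preferred among conflicting hypotheses.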
Figure 5 Example of a multitree for visual recognition. Feldman [2] gives this as an example of a description of ping-pong and golf balls according to his functional model for vision. It is interesting to note that this model, derived from another perspective, is analogous to a probabilistic multitree (the figure is based on Feldman [2], p. 541)
object hypothesis, with common nodes only for representing relations. The low-level evidence or feature nodes will correspond to the information that is obtained directly from the image, so these nodes will be common to several objects, representing conflicting hypotheses. Thus, we have a network in which certain nodes have no parent (objects), other nodes have one parent (intermediate objects/features), and some nodes have multiple parents (features and relations). We will restrict the multiple-parent variables to leaf nodes, that is, nodes with no descendants, and call this type of network a multitree. An example of a multitree is depicted in Figure 5.

Before we can formally define a multitree, some additional definitions are required. Let G = (V, E) be a DAG, and v and u vertices in V. If (u, v) ∈ E then u is a parent of v and v is a child of u. If there is a path from u to v, then u is an ancestor of v and v is a descendant of u. If u has no parents it is called a root, and if u has no children it is called a leaf.

Now we can define a multitree as follows. Let G = (V, E) be a DAG, and v a vertex in V. G is called a multitree if every non-leaf vertex v ∈ V has at most one parent, i.e. the root nodes have no parents, the leaf nodes may have multiple parents, and every other node has exactly one parent.

In a multitree there is one tree that encapsulates the rules for recognition for each object. Each recognition tree consists of a single root that represents the object of interest, intermediate objects/features, and several leaves that constitute the measured parameters. Although this structure could seem somewhat restrictive, it is generally adequate for representing a visual KB, and probability propagation can be done efficiently, as we will show later.

Expressing relational knowledge

The causal relationships represented in the visual probabilistic network we have discussed previously express unary relationships between objects and features, i.e. the presence of a certain object in the world will cause the appearance of some specific features in the image, which are usually assumed independent. So there is a unary relationship between an object O and a feature F. This is represented graphically by the probabilistic network in Figure 6a. However, in many cases the relations between several parts or features are important for the recognition of an object.

Multiple objects in the world are easier to model as the composition of several sub-objects or parts, subdivided recursively until at the lowest level there are simple objects that will generally correspond to a region in the image. An example of this type of representation is the 3D models proposed by Marr [1], in which an object is described by a number of cylindrical parts and their relationships. In this case the recognition of an object is not only dependent on the presence of its subparts, but also on the physical relation between them. We need to represent this dependency of the object on the relation between its components in the network.

A physical relation between the components could, for example, be the distance between two regions or their topological relation (adjacent, above, surrounding, etc.). These relationships are well defined, and so we can say that there is a deterministic link from the components to their relation. However, the instance of a relation extracted from the image could either be caused by the object of interest, or be accidental or illusory. To represent this uncertainty, we consider that there is a probabilistic link from the object to the relation. For the case of one object and a binary relation between two features, the corresponding network is shown in Figure 6b. We represent a probabilistic link by an arrow and functional or deterministic links by a dotted arrow (Figure 6b).

We can extend the representation to include more variables, and in general a relation for n variables can be expressed by an analogous network, as depicted in Figure 7. Assuming that the relation can be uniquely

Figure 6 Expressing relational knowledge. The network in (a) represents a unary relation between O and F, and the network in (b) a binary relation between O and F1, F2, which is expressed through the relational variable R. The deterministic links between F1, F2 and R indicate that the value of R is a deterministic function of F1 and F2. Solid arrow: probabilistic link; dotted arrow: deterministic link

Figure 8 Semi-static recognition. The network in (a) represents the causal relations between an object in several images. This is simplified to the network in (b) for recognition at time t

Figure 9 Dynamic recognition. The network in (a) corresponds to a probabilistic network for representing an object across several images, which is extended to include relations in (b)
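As a concrete (hypothetical) rendering of the construction in Figure 6b: the relation R is computed deterministically from the two measured features, while the link from the object to the relation stays probabilistic, since an observed relation may be accidental rather than caused by the object. All names and numbers below are assumptions for illustration.

def adjacent(f1, f2):
    # Deterministic link: the relation R is a function of F1 and F2.
    return abs(f1["x"] - f2["x"]) <= 1

# P(R = true | O): assumed probability that the relation holds when
# the object is present (True) or absent (False).
p_r_true = {True: 0.9, False: 0.2}

f1, f2 = {"x": 10}, {"x": 11}
r = adjacent(f1, f2)                       # instantiate R from the image
# Likelihood of each object hypothesis given the observed value of R:
likelihood = {o: p if r else 1.0 - p for o, p in p_r_true.items()}
print(likelihood)                          # {True: 0.9, False: 0.2}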
PROBABILITY PROPAGATION

propagation techniques described in this section. The second mode of reasoning has to do with constructing
Each node uses this information to update its local parameters, and to update its posterior probability if required. For example, if nodes D and F were instantiated in the network of Figure 10a, they would send λ messages to B, which would then send a λ message to A and π messages to all of its sons. After the messages reach the root node, they will propagate top-down until they reach the leaf nodes, where the propagation terminates and the network comes to a new equilibrium. So the information propagates through the tree in a single pass, in a time (in parallel) proportional to the diameter of the network.

The previous propagation mechanism only works for trees, so it is restricted to only one cause per variable. An extension for more complex networks, called polytrees, was proposed by Kim and Pearl. In a polytree each node can have multiple parents, as shown in Figure 10b, but it is still a singly connected graph.

Probability propagation in multiply connected networks

Multiply connected networks are those in which there are multiple paths between nodes in the directed graph, implying the formation of loops in the underlying network. If we apply the propagation schemes of the previous section in this type of network, the messages will circulate around these loops, and we will not reach a coherent state of equilibrium.

For a sparse network, a solution is to transform it into a different structure, for which an efficient propagation scheme can be applied. For this, Lauritzen and Spiegelhalter [18] developed a method based on graph theory, which extracts an undirected triangulated graph from the probabilistic network, and creates a tree whose nodes are the cliques of this triangulated graph. Then the probabilities are updated by propagation in the tree of cliques. This technique will work efficiently if the number of variables in each clique is relatively small. It has been applied in MUNIN, a medical expert system for diagnosis and test planning in electromyography [18].

Pearl suggests two other methods for dealing with multiply connected networks: stochastic simulation and conditioning. Stochastic simulation consists of assigning to each non-instantiated variable a random value and computing a probability distribution based on the values of its neighbours. Based on this distribution, a value is selected at random for each variable, and the combined values represent a sample point. The procedure is repeated a number of times, and the posterior probability of each variable is calculated as the percentage of times each possible value appears. Conditioning is based on the fact that if we instantiate a variable, it blocks the paths of which the corresponding node forms part, changing the connectivity of the network. Thus, we can transform a multiply connected graph into a singly connected graph by instantiating a selected group of variables. We assume a set of fixed values for these variables, and do the probability propagation for the singly connected network. This is repeated for all the possible values of the separating variables, averaging the results weighted by their posterior probabilities.

These techniques provide efficient algorithms for probability propagation in certain classes of networks: trees, polytrees and sparse multiply connected networks. But in general we have more complex structures, and it has been shown by Cooper [21] that probabilistic inference is NP-hard.

Probability propagation in multitrees

As we can see in Figure 5, a multitree is not singly connected, i.e. there can be multiple paths between nodes. Algorithms for multiply connected networks are more complex, and only work efficiently for certain kinds of structures. Thus, we need to develop a technique for efficient propagation in multitrees, or else a transformation of the network to make it singly connected. There are two important characteristics of the general structure of a visual PN that help to simplify probability propagation:

1. Probability propagation is usually bottom-up. That is, from the evidence nodes instantiated from the images towards the object nodes whose posterior probability we want to obtain.
2. The leaf nodes will usually correspond to instantiated or evidence variables. These nodes, the feature and relation nodes, will have fixed values obtained from the feature extraction processes of low-level vision.

By making use of these properties and the conditioning technique we described before, we can separate a multitree into a series of independent trees, so that we can do probability propagation independently in each one by extending the technique for probability propagation in probabilistic trees.

Theorem 1. Let G = (V, E) be a multitree, and W ⊆ V the subset of all leaf nodes in V. If every node v ∈ W is instantiated, then the multitree G can be separated into one or more probabilistic networks G1, G2, ..., GN, where each network Gi is a probabilistic tree.

Proof. By conditioning we block the pathways that go through the instantiated nodes. In the case of leaf nodes, this is equivalent to partitioning the network along each such node, and connecting a copy of it to each one of its parents. Given that all the leaves are instantiated, we can partition the network at every leaf node, meaning that each instantiated copy of the node will have only one parent. If every leaf has only one parent and, by definition, every other node in a multitree has at most one parent, then every node in the network has at most one parent. From this it follows that the network is a forest, that is, a DAG where each node has at most one parent. A tree is a particular case of a forest where there is only one root node, and a forest is equivalent to one or more trees.

Figure 11 shows an example of a multitree.
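The construction in the proof of Theorem 1 is easy to state as code. Below is a minimal sketch under an assumed adjacency-list encoding of the graph (node names are illustrative): every instantiated leaf with several parents is replaced by one copy per parent, which leaves a forest of probabilistic trees.

# Decompose a multitree into trees by duplicating each instantiated
# leaf that has multiple parents (conditioning blocks the path through
# an instantiated node, so each copy keeps a single parent).
def decompose(parents):
    """parents: dict mapping node -> list of parent nodes (a multitree)."""
    children = {}
    for node, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(node)
    new_parents = {}
    for node, ps in parents.items():
        if node not in children and len(ps) > 1:   # leaf with multiple parents
            for i, p in enumerate(ps):             # one copy per parent
                new_parents[f"{node}#{i}"] = [p]
        else:
            new_parents[node] = ps
    return new_parents   # every node now has at most one parent: a forest

# Example: feature leaf F shared by two object hypotheses O1 and O2.
print(decompose({"O1": [], "O2": [], "F": ["O1", "O2"]}))
# {'O1': [], 'O2': [], 'F#0': ['O1'], 'F#1': ['O2']}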
Parameter learning in multitrees

We start with only qualitative information from the domain, that is, the structure of the network, and obtain all the prior and conditional probability distributions from real data. In the case of a multitree, we need to estimate the prior probability distributions of each root node P(A_i), denoted by P(A), and the conditional probability distributions of each node given its parent P(B_j|A_i), denoted by P(B|A). We will assume that each node represents a discrete multivalued variable with N possible values, so each P(A) will be stored as an N-dimensional vector and each P(B|A) as an N x M dimensional matrix. Each entry in P(A) and P(B|A) will be stored as an integer ratio. Given the nature of the vision domain and the limited resolution of image acquisition, most variables will be discrete or can be given a discrete approximation in a limited range. In the second case we use a method based on Parzen windows to estimate the probability density function, approximating it by N intervals with constant value.

Given an observation and the actual value of each node, it is trivial to update the system parameters by incrementing the corresponding entries in the prior and conditional probability matrices. If we observe A_k and B_l, and the previous values are P(A_i) = a_i/s and P(B_j|A_i) = b_j/a_i, then the probabilities will be updated by using the following equations:

• prior probabilities:

P(A_i) = (a_i + 1)/(s + 1),  i = k    (13)

P(A_i) = a_i/(s + 1),  i ≠ k    (14)

• conditional probabilities:

P(B_j|A_i) = (b_j + 1)/(a_i + 1),  i = k and j = l    (15)

Previously, we have assumed that all the variables are directly observable from the images, or are provided as feedback by the users in the learning phase. But it could be that some intermediate nodes cannot be measured directly. In this case, we propagate the probabilities in both directions in the tree (from the root and from the leaves) towards the unknown node and obtain a posterior probability for it. The value with the highest probability is assumed to be the observed one, and all the conditional distributions are updated as above. Assuming that at least the root and leaf nodes are instantiated, we combine the bottom-up and top-down propagation mechanisms for probabilistic trees to develop an algorithm for estimating the probability distribution of hidden nodes in a multitree.

Algorithm 2 Learning Hidden Nodes in a Multitree
1. Decompose the network into N probabilistic trees (T_i) by partitioning it in every leaf node with multiple parents.
2. Instantiate all leaf nodes by using the measured features obtained from low-level vision.
3. Instantiate the root node and all possible intermediate nodes from feedback provided by an external agent (usually the expert).
4. For every tree T_i (i = 1, 2, ..., N):
   4.1 For all instantiated nodes B, set λ(B_j) = 1 and π(B_j) = 1 for the instantiated value, and 0 otherwise.
   4.2 From each instantiated node B send a λ message to its parent A and π messages to its children S, calculated as:

   λ_B(A_i) = Σ_j P(B_j|A_i) λ(B_j)    (18)

   π_S(B_j) = α π(B_j) Π_{k≠S} λ_{S_k}(B_j)    (19)

   where α is a normalizing constant and the product runs over B's other children.
   4.3 For each non-instantiated node B that receives λ messages from its children S and a
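The count update of equations (13)-(15) is straightforward to implement. The sketch below keeps the integer counts explicitly; the domain sizes and count values are made-up for illustration.

# Incremental update of P(A) = a_i / s and P(B|A) = b_ij / a_i after
# observing A = k and B = l, following equations (13)-(15): the counts
# a_k, b_kl and the total s are each incremented by one.
N, M = 3, 2                      # assumed sizes of A's and B's domains
a = [4, 3, 3]                    # counts for A; s = sum(a) = 10
b = [[2, 2], [2, 1], [1, 2]]     # counts b[i][j] for B = j given A = i

def update(k, l):
    a[k] += 1                    # P(A_k) -> (a_k + 1) / (s + 1)
    b[k][l] += 1                 # P(B_l | A_k) -> (b_kl + 1) / (a_k + 1)

update(k=0, l=1)
s = sum(a)
print([ai / s for ai in a])      # updated P(A)
print([bj / a[0] for bj in b[0]])  # updated P(B | A_0)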
• They are restricted to predefined structures: trees or polytrees, and there is a guarantee of an optimum solution only for trees.
• They cannot obtain the direction of all the links in the network. Even for a tree, the selection of the root node is arbitrary.
• They cannot find the actual causal relations between variables without additional information or external semantics, i.e. they cannot distinguish genuine causal relationships from spurious correlations.
• They make use of the closed world assumption, considering that all relevant variables are already included in the network.

Due to these limitations, instead of starting from no information about the structure of the network, we use the network derived from the subjective rules. As in the Bayesian subjective approach, we start from subjective knowledge. We initially assume that the group of characteristics is independent given their cause (parent), but we then test this assumption against the data. One way of testing for independence consists of calculating the correlation of pairs of the characteristics. Although a low correlation does not necessarily imply independence, it provides evidence that independence is a reasonable assumption to make; on the other hand, a high correlation indicates that the characteristics are not independent and our assumptions are invalid. If we find a high correlation between a pair of characteristics, we must modify the structure of the probabilistic network. This could be done by introducing a link between the pair of dependent nodes, but that would destroy the tree structure. There are at least three other alternatives:
• Node elimination: eliminate one of the dependent sons, including the subtree rooted at this node. We choose for elimination the variable with the lower discriminatory capability. The concept of probabilistic distance [27] can be used for selecting the redundant node, in which case the node with the lowest value is selected for elimination.
• Node combination: fuse the two dependent variables into one node which includes the information provided by the two dependent nodes, and link this new variable to the parent and sons of the original nodes.
• Node creation: create a new variable which will be an intermediate node between the object and the two dependent features, linking this node to the object (parent) and the features (sons). This new node will, hopefully, make the two dependent variables conditionally independent, removing the requirement that they should be independent given their previous parent (O).

The three alternatives are illustrated in Figure 13. Each one implies an increasing level of complexity. In (1) we simply cut part of the network without any other effect; in (2) we need to recompute the probabilities in the links based on the joint distribution of the two variables; and in (3) we will usually require some external agent to define the new variable, and we also need to obtain the required probabilities. Thus, for node combination we need to return to a parameter learning phase, and for node creation we need to return to the initial knowledge acquisition phase. Following our general principles, we try each alternative in sequential order, testing the system after each modification, until the performance is satisfactory.

For validating the independence assumption we use the correlation coefficients estimated from the sample data available. For this, we apply two different measures of association between random variables: Pearson's correlation coefficient (ρ), which indicates if there is a linear relationship; and Kendall's correlation coefficient (τ), which detects any increasing or decreasing trend present in the data. Although both measures do not correspond directly with dependency, in general they provide a reasonable approximation. So if either of them is above a certain threshold, we consider the variables not independent and we modify the dependency structure of the network accordingly.

Structure refinement algorithm

Based on the dependency and discriminatory tests described in the previous section, we developed an algorithm for refining the structure of a visual KB represented as a probabilistic multitree. The algorithm tests the validity of the network using statistical data, makes the possible improvements automatically, and alerts us to the remaining problems which require external knowledge. The algorithm is the following:

Algorithm 3 Structure Refinement in Probabilistic Multitrees
1. Decompose the network into N probabilistic trees (T_i) by partitioning it in every leaf node with multiple parents. For every tree T_i (i = 1, 2, ..., N) do 2 to 4.
2. Select the top subtree in the tree, defining the root node as O and its sons as F_1, ..., F_n.
3. For each pair F_i, F_j:
   3.1 Obtain the correlation coefficients ρ and τ.
   3.2 If |ρ| < T_ρ and |τ| < T_τ, go to the next pair; otherwise continue.
   3.3 Node elimination – obtain the probabilistic distance (D) for F_i and F_j, and eliminate the node with the lower D. Given that we are restricted to discrete variables and assuming conditional independence, we use the Patrick-Fisher measure for D, which for each feature variable in the two-class case is defined as:

   D = [ Σ_j (P(F_j|C_1)P(C_1) - P(F_j|C_2)P(C_2))² ]^(1/2)

   where F_j is the jth possible value of feature
variable F. Then test the system: if performance is satisfactory go to the next pair, otherwise go to 3.4.
   3.4 Node combination – combine the nodes F_i and F_j into a single node F_k and obtain the conditional distributions P(F_k|O) and P(S_l|F_k), where S_l are the sons of F_i, F_j. Test the system: if performance is satisfactory go to the next pair, otherwise go to 3.5.
   3.5 Node creation – create an intermediate node I by consulting the expert, and obtain the conditional distributions P(F_i|I), P(F_j|I), and P(I|O).
4. Repeat 3 for every subtree in the tree, by selecting each of the child nodes as the parent, recursively, until the leaf nodes are reached.

others. If the tip is not controlled correctly it can be very painful and dangerous to the patient, and could even cause perforations of the colon wall. This is further complicated by the presence of many difficult situations, such as the contraction and movement of the colon, fluid and bubbles that obstruct the view, pockets (diverticula) that can be confused with the lumen, and the paradoxical behaviour produced by the endoscope looping inside the colon.

We are developing an expert system for colonoscopy [28]. Its primary objective is to help the doctor with the navigation of the endoscope inside the colon by controlling the orientation of the tip via the right/left and up/down controls. The control of the translation of the instrument (push/pull) and possible shaft rotation will still be controlled manually by the doctor. As well as a navigation system, it will also serve as an advisory system for novice endoscopists.
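Returning for a moment to Algorithm 3: the core of the per-pair dependency test (steps 3.1-3.3) can be sketched in Python as follows. This is an illustration only; SciPy, the threshold values and the toy data are assumptions, not part of the system described here.

# Per-pair test of Algorithm 3: accept independence if both correlation
# coefficients are below threshold; otherwise escalate from node
# elimination to combination to creation (here only reported).
from scipy.stats import pearsonr, kendalltau

T_RHO, T_TAU = 0.3, 0.3          # assumed thresholds

def check_pair(fi, fj):
    rho, _ = pearsonr(fi, fj)    # linear association
    tau, _ = kendalltau(fi, fj)  # monotone (trend) association
    if abs(rho) < T_RHO and abs(tau) < T_TAU:
        return "independent: keep structure"
    # Escalation order from the text: elimination (by probabilistic
    # distance), then combination, then creation, testing after each.
    return "dependent: try elimination, then combination, then creation"

samples_fi = [1, 2, 3, 4, 5, 6]  # toy measurements of F_i
samples_fj = [2, 1, 4, 3, 6, 5]  # toy measurements of F_j
print(check_pair(samples_fi, samples_fj))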
mating the lumen location. We will review both techniques in the following section.

Low-level processing

Due to the type of illumination of the endoscope, the darkest region in the image generally corresponds to the centre of the colon (lumen), because there is a single light source close to the camera. To obtain this information, Khan [29] has developed a new method for the extraction of the dark region in colon images. He uses a quadtree representation in which the largest quadrant with a certain intensity level and variance is used as a seed region that is extended to cover the lumen area. This technique is based on a quadtree representation of the image and is suitable for parallel implementation in a pyramid architecture. From this process we obtain a dark region in the image with the following features: region size (SIZE), mean intensity level (MEAN), and intensity variance in the region (VAR).

A shape from shading technique suitable for endoscopy was developed by Rashid [30]. His model assumes a point light source very close to the camera, which is a good approximation to a real endoscope. His shape from shading algorithm obtains the relative depth of the colon surface in the image. The depth map we obtain consists of a vector (p, q) per pixel that gives the orientation of the surface at this point with respect to two orthogonal axes (x, y) that are perpendicular to the camera axis (z). The surface orientation can vary from perpendicular to the endoscope's tip (camera), with p = q = 0, to parallel to the camera, with p or q close to infinity. If we assume that the colon has a shape similar to a tube, and that in the image a section of the internal wall of this tube is observed, then a reasonable approximation of the position of the centre of the colon (lumen) will be a function of the direction in which the majority of the p-q vectors are pointing. For this we divide the space of possible values of p-q into a small number of slots and obtain a 2D histogram of this space, which we call a gradient or pq histogram. The largest peak in this histogram will give an estimate of the position of the lumen. So from this process we obtain another estimate of the lumen, consisting of a peak in the pq histogram with the following attributes: number of pixels (NUMBER), distance from the centre (DISTANCE), and the difference in size to the next peak (DIFFERENCE).

The features obtained through these two low-level vision processes are integrated in high-level vision using our probabilistic reasoning techniques. The quantization of the variables results directly from the discrete
nature of the measurement process. For example, since the seed regions are extracted using a regular quadtree decomposition, their size, measured as the pixel length in one dimension, is always a power of 2. Similarly, the mean is quantized into integer values at the intensity resolution of the hardware we are using. The choice of resolution of the probability distributions must be a compromise between the speed of computation required and the accuracy to which the required probabilities are computed. We have not investigated this relationship, but have chosen a resolution which makes full use of the measured data.

Table 1 Summary of the results of image interpretation and advice for 90 colon images (experiment 1)

                                        Number   Percentage
Image interpretation
  Diverticula interpretation correct    40       85.1
  Lumen interpretation correct          21       87.5
Expert's rating
  Correct interpretation                72       80.9
  Not correct interpretation            17       19.1
Predictive score (Brier score)
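The predictive (Brier) score reported in the tables can be computed as the mean squared difference between the posterior probability the system assigns to an interpretation and the actual outcome. A minimal sketch follows; the formula is the standard Brier score definition and the example values are made up, since the excerpt does not give either.

# Brier (predictive) score: mean squared difference between predicted
# probability and outcome (1 if the interpretation was correct, else 0).
def brier(probs, outcomes):
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

posteriors = [0.99, 0.91, 0.10, 0.08]   # e.g. P(lumen) for four images
truth      = [1,    1,    0,    0   ]   # whether lumen was actually there
print(brier(posteriors, truth))         # lower is better; 0 is perfect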
Although its performance was satisfactory for lumen and diverticula recognition, we thought it could still be improved. We were mainly interested in validating the
Table 4 Performance results for different structures (experiment 1). The three column pairs give the number and percentage for each of the three structures compared (the original and the two alternatives)

Expert's rating                Number  %      Number  %      Number  %
Correct interpretation         72      80.8   68      76.4   73      82.0
Not correct interpretation     17      19.1   21      23.5   16      17.9

Figure 18 Probabilistic tree for lumen recognition, combining dark region (2D) and shape from shading (3D) features

points to the structure in Figure 17 as the preferred alternative.

In the case of diverticula, all the correlations are low, giving evidence of independence. So the diverticula subtree should remain with the same structure.

For comparison purposes, we tested the performance of the system with both alternative structures for lumen and diverticula, and the original structure. The performance evaluation was done with a similar sample of

Table 5 Summary of the results of image interpretation for 50 colon images (experiment 2)

Image interpretation                    Number   Percentage
Lumen interpretation correct            34       97.1
NOT-lumen interpretation correct        21       68.8
• The estimates provided by the pq-histogram and dark region are independent.

We have proposed a general framework for representing uncertain knowledge in computer vision. Starting
Figure 19 Interpretation and advice given for different colon images (experiment 2). In each image, the dark region is depicted as a rectangle and the lumen position inferred from the pq-histogram is shown as a double rectangle (the rectangle in the middle is fixed and only shown for reference). Below we show the interpretation as it was given by the system, and the posterior probability of lumen. Interpretations: (a) (P = 0.08); (b) (P = 0.10); (c) lumen (P = 0.99); (d) lumen (P = 0.91)
This framework provides an adequate basis for representing uncertain knowledge in computer vision, especially in complex natural environments. It has been tested in a realistic problem in endoscopy, performing image interpretation with good results. We consider that it can be applied in other domains, providing a coherent basis for developing knowledge-based vision systems.

REFERENCES

1 Marr, D Vision, W H Freeman, New York (1982)
2 Feldman, J A A functional model of vision and space, in Arbib, M A and Hanson, A R (eds.), Vision, Brain and Cooperative Computation, MIT Press, Cambridge, MA (1987) pp 531-562
3 Pentland, A P Introduction: From pixels to predicates, in Pentland, A P (ed.), From Pixels to Predicates, Ablex, Norwood, NJ (1986)
4 Binford, T O Survey of model-based image analysis systems, Int. J. Robotics Res., Vol 1 No 1 (1982) pp 18-64
5 Rao, A R and Jain, R Knowledge representation and control in computer vision systems, IEEE Expert, Vol 3 No 1 (1988) pp 64-79
6 Wang, C and Srihari, S N A framework for object recognition in a visually complex environment and its application to locating address blocks on mail pieces, Int. J. Comput. Vision, Vol 2 (1988) pp 125-151
7 Wesley, L P Evidential knowledge-based computer vision, Opt. Eng., Vol 25 No 3 (1986) pp 363-379
8 Perkins, W A Rule-based interpreting of aerial photographs using the Lockheed expert system, Opt. Eng., Vol 25 No 3 (1986) pp 356-363
9 Ohta, Y Knowledge-based Interpretation of Outdoor Natural Colour Scenes, Pitman, Boston, MA (1985)
10 Hanson, A R and Riseman, E M VISIONS: A computer system for interpreting scenes, in Hanson, A R and Riseman, E M (eds.), Computer Vision Systems, Academic Press, New York (1978) pp 303-333
11 McKeown, D M, Harvey, W A and McDermott, J Rule-based interpretation of aerial imagery, IEEE Trans. PAMI, Vol 7 No 5 (1985) pp 570-585
12 Spiegelhalter, D J Probabilistic reasoning in predictive expert systems, in Kanal, L N and Lemmer, J F (eds.), Uncertainty in Artificial Intelligence, North-Holland, Amsterdam (1986) pp 47-68
13 Mackworth, A Adequacy criteria for visual knowledge representation, in Pylyshyn, Z (ed.), Computational Processes in Human Vision: An Interdisciplinary Perspective, Ablex, Norwood, NJ (1988) pp 464-476
14 Agosta, J M The structure of Bayes networks for visual recognition, in Uncertainty in Artificial Intelligence, Elsevier, Amsterdam (1990) pp 397-405
15 Ng, K and Abramson, B Uncertainty management in expert systems, IEEE Expert, Vol 5 No 2 (1990) pp 29-48
16 Rimey, R D and Brown, C M Where to look next using a Bayesian net: Incorporating geometric relations, Proc. 2nd Euro. Conf. on Comput. Vision, Springer-Verlag, Berlin (1992) pp 542-550
17 Pearl, J On evidential reasoning in a hierarchy of hypotheses, Artif. Intell., Vol 28 (1986) pp 9-15
18 Lauritzen, S L and Spiegelhalter, D J Local computations with probabilities on graphical structures and their application to expert systems, J. Roy. Statist. Soc. B, Vol 50 No 2 (1988) pp 157-224
19 Pearl, J Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, San Mateo, CA (1988)
20 Sucar, L E, Gillies, D F and Gillies, D A Handling uncertainty in knowledge-based computer vision, in Kruse, R and Siegel, P (eds.), Symbolic and Quantitative Approaches to Uncertainty (ECSQAU-91), Springer-Verlag, Berlin (1991) pp 328-332
21 Cooper, G F The computational complexity of probabilistic inference using Bayesian networks, Artif. Intell., Vol 42 (1990) pp 393-405
22 Duda, R O and Hart, P E Pattern Classification and Scene Analysis, Wiley, New York (1973)
23 Spiegelhalter, D J A statistical view of uncertainty in expert systems, in Gale, W A (ed.), Artificial Intelligence and Statistics, Addison-Wesley, Reading, MA (1986) pp 17-55
24 Spiegelhalter, D J and Lauritzen, S L Sequential updating of conditional probabilities on directed graphical structures, Networks, Vol 20 No 5 (1990) pp 579-606
25 Chow, C K and Liu, C N Approximating probability distributions with dependence trees, IEEE Trans. Infor. Theory, Vol 14 No 3 (1968) pp 462-468
26 Sucar, L E, Gillies, D F and Gillies, D A Handling uncertainty in knowledge-based systems using objective probabilities, The World Congress on Expert Systems, Orlando, FL (December 1991) pp 952-960
27 Kittler, J Feature selection and extraction, in Young, T Y and Fu, K S (eds.), Handbook of Pattern Recognition and Image Processing, Academic Press, Orlando, FL (1986)
28 Sucar, L E and Gillies, D F Knowledge-based assistant for colonoscopy, Proc. 3rd Int. Conf. on Ind. and Eng. Applic. of Artif. Intell. and Expert Systems, Vol II, ACM Press, New York (1990) pp 665-672
29 Khan, G N and Gillies, D F A highly parallel shaded image segmentation method, in Dew, P M and Earnshaw, R A (eds.), Parallel Processing for Vision and Display, Addison-Wesley, Wokingham, UK (1989) pp 180-189
30 Rashid, H and Burger, P Differential algorithm for the determination of shape from shading using a point light source, Image & Vision Comput., Vol 10 No 2 (1992) pp 119-127
31 Sucar, L E, Gillies, D F and Rashid, H U Integrating shape from shading in a gradient histogram and its application to endoscope navigation, V Int. Symposium on Artif. Intell., Cancun, Mexico (December 1992)
32 Sucar, L E Probabilistic Reasoning in Knowledge-based Vision Systems, PhD Thesis, Imperial College, London (1992)