COGNITIVE SCIENCE 6, 205-254 (1982)
Connectionist Models and Their Properties

J. A. FELDMAN AND D. H. BALLARD
Computer Science Department
University of Rochester
Rochester, NY 14627
Much of the progress in the fields constituting cognitive science has been based
upon the use of explicit information processing models, almost exclusively
patterned after conventional serial computers. An extension of these ideas to
massively parallel, connectionist models appears to offer a number of
advantages. After a preliminary discussion, this paper introduces a general
connectionist model and considers how it might be used in cognitive science.
Among the issues addressed are: stability and noise-sensitivity, distributed
decision-making, time and sequence problems, and the representation of complex
concepts.
1. INTRODUCTION
Much of the progress in the fields constituting cognitive science has been
based upon the use of concrete information processing models (IPM),
almost exclusively patterned after conventional sequential computers.
There are several reasons for trying to extend IPM to cases where the
computations are carried out by a parallel computational engine with perhaps
billions of active units. As an introduction, we will attempt to motivate the
current interest in massively parallel models from four different
perspectives: anatomy, computational complexity, technology, and the role of
formal languages in science. It is the last of these which is of primary concern
here. We will focus upon a particular formalism, connectionist models
(CM), which is based explicitly on an abstraction of our current understanding
of the information processing properties of neurons.
Animal brains do not compute like a conventional computer. Comparatively
slow (millisecond) neural computing elements with complex,
parallel connections form a structure which is dramatically different from a
high-speed, predominantly serial machine. Much of current research in the
neurosciences is concerned with tracing out these connections and with
discovering how they transfer information. One purpose of this paper is to
suggest how connectionist theories of the brain can be used to produce
testable, detailed models of interesting behaviors. The distributed nature of
information processing in the brain is not a new discovery. The traditional
view (which we shared) is that conventional computers and languages were
Turing universal and could be made to simulate any parallelism (or analog
values) which might be required. Contemporary computer science has sharpened
our notions of what is "computable" to include bounds on time, storage,
and other resources. It does not seem unreasonable to require that
computational models in cognitive science be at least plausible in their
postulated resource requirements.
The critical resource that is most obvious is time. Neurons whose basic
computational speed is a few milliseconds must be made to account for
complex behaviors which are carried out in a few hundred milliseconds
(Posner, 1978). This means that entire complex behaviors are carried out in
less than a hundred time steps. Current AI and simulation programs require
millions of time steps. It may appear that the problem posed here is
inherently unsolvable and that there is an error in our formulation. But recent
results in computational complexity theory (Ja'Ja', 1980) suggest that
networks of active computing elements can carry out at least simple
computations in the required time range. In subsequent sections we present fast
solutions to a variety of relevant computing problems. These solutions
involve using massive numbers of units and connections, and we also address
the questions of limitations on these resources.
Another recent development is the feasibility of building parallel
computers. There is currently the capability to produce chips with 100,000 gates
at a reproduction cost of a few cents each, and the technology to go to
1,000,000 gates/chip appears to be in hand. This has two important
consequences for the study of CM. The obvious consequence is that it is now
feasible to fabricate massively parallel computers, although no one has yet done
so (Fahlman, 1980; Hillis, 1981). The second consequence of this
development is the renewed interest in the basic properties of highly parallel
computation. A major reason why there aren't yet any of these CM machines is
that we do not yet know how to design, assemble, test, or program such
engines. An important motivation for the careful study of CM is the hope
that we will learn more about how to do parallel computing, but we will say
no more about that in this paper.
The most important reason for a serious concern in cognitive science
for CM is that they might lead to better science. It is obvious that the choice
of technical language that is used for expressing hypotheses has a profound
influence on the form in which theories are formulated and experiments
undertaken. Artificial intelligence and articulating cognitive sciences have
made great progress by employing models based on conventional digital
computers as theories of intelligent behavior. But a number of crucial
phenomena such as associative memory, priming, perceptual rivalry, and
the remarkable recovery ability of animals have not yielded to this treat-
ment. A major goal of this paper is to lay a foundation for the systematic
use of massively parallel connectionist models in the cognitive sciences, even
where these are not yet reducible to physiology or silicon.
Over the past few years, a number of investigators in different fields
have begun to employ highly parallel models (idiosyncratically) in their
work. The general idea has been advocated for animal models by Arbib
(1979) and for cognitive models by Anderson (Anderson et al., 1977) and
Ratcliff (1978). Parallel search of semantic memory and various "spreading
activation" theories have become common (though not quite consistent)
parts of information processing modeling. In machine perception research,
massively parallel, cooperative computational theories have become a
dominant paradigm (Marr & Poggio, 1976; Rosenfeld et al., 1976) and
many of our examples come from our own work in this area (Ballard, 1981;
Sabbah, 1981). Scientists looking at performance errors and other non-
repeatable behaviors have not found conventional IPM to be an adequate
framework for their efforts. Norman (1981) has recently summarized argu-
ments from cognitive psychology, and Kinsbourne and Hicks (1979) have
been led to a similar view from a different perspective. It appears to us that
all of these efforts could fit within the CM paradigm outlined here.
One of the most interesting recent studies employing CM techniques is
the partial theory of reading developed in (McClelland & Rumelhart, 1981).
They were concerned with the word superiority effect and related questions
in the perception of printed words, and had a large body of experimental
data to explain. One major finding is that the presence of a printed letter in a
brief display is easier to determine when the letter is presented in the context
of a word than when it is presented alone. The model they developed (cf.
Figure 1) explicitly represents three levels of processing: visual features of
printed letters, letters, and words. The model assumes that there are positive
and negative (circular tipped) connections from visual features to the letters
that they can (respectively, cannot) be part of. The connections between let-
ters and words can go in either direction and embody the constraints of
English. The model assumes that many units can be simultaneously active,
that units form algebraic sums of their inputs and output values propor-
tionally. The activity of a unit is bounded from above and below, has some
memory, and decays with time. All of these features, and several more, are
captured in the abstract unit described in Section 2.
This idea of simultaneously evaluating many hypotheses (here words)
has been successfully used in machine perception for some time (Hanson &
Riseman, 1978). What has occurred to us relatively recently is that this is a
natural mode of computation for widely interconnected networks of active
elements like those envisioned in connectionist models. The generalization
of these ideas to the connectionist view of brain and behavior is that all
important encodings in the brain are in terms of the relative strengths of
synaptic connections. The fundamental premise of connectionism is that
individual neurons do not transmit large amounts of symbolic information.
Instead they compute by being appropriately connected to large numbers of
similar units. This is in sharp contrast to the conventional computer model
of intelligence prevalent in computer science and cognitive psychology.
Figure 1. A few of the neighbors of the node for the letter "t" in the first position in a word,
and their interconnections (McClelland & Rumelhart, 1981).
The fundamental distinction between the conventional and connec-
tionist computing models can be conveyed by the following example. When
one sees an apple and says the phrase "wormy apple," some information
must be transferred, however indirectly, from the visual system to the
speech system. Either a sequence of special symbols that denote a wormy
apple is transmitted to the speech system, or there are special connections to
the speech command area for the words. Figure 2 is a graphic presentation
of the two alternatives. The path on the right described by double-lined
arrows depicts the situation (as in a computer) where the information that a
wormy apple has been seen is encoded by the visual system and sent as an
abstract message (perhaps frequency-coded) to a general receiver in the
speech system which decodes the message and initiates the appropriate
speech act. Notice that a complex message would presumably have to be
transmitted sequentially on this channel, and that each end would have to
learn the common code for every new concept. No one has yet produced a
biologically and computationally plausible realization of this conventional
computer model.
Figure 2. Connectionism vs. symbolic encoding. [Diagram labels: "Assumes some general encoding" (encoder/decoder path); "Assumes individual connections".]
The only alternative that we have been able to uncover is described by
the path with single-width arrows. This suggests that there are (indirect)
links from the units (cells, columns, centers, or what-have-you) that
recognize an apple to some units responsible for speaking the word. The
connectionist model requires only very simple messages (e.g. stimulus
strength) to cross a channel but puts strong demands on the availability of
the right connections. Questions concerning the learning and reinforcement
of connections are addressed in Feldman (1981b).
For a number of reasons (including redundancy for reliability), it is
highly unlikely that there is exactly one neuron for each concept, but the
point of view taken here is that the activity of a small number of neurons
(say 10) encodes a concept like apple. An alternative view (Hinton &
Anderson, 1981) is that concepts are represented by a "pattern of activity" in a
much larger set of neurons (say 1,000) which also represent many other
concepts. We have not seen how to carry out a program of specific modeling in
terms of these diffuse models. One of the major problems with diffuse
models as a parallel computation scheme is cross-talk among concepts. For
example, if concepts using units (10, 20, 30, ...) and (5, 15, 25, ...) were
simultaneously activated, many other concepts, e.g., (20, 25, 30, 35, ...)
would be active as well. In the example of Figure 2, this means that diffuse
models would be more like the shared sequential channel. Although a single
concept could be transmitted in parallel, complex concepts would have to
go one at a time. Simultaneously transmitting multiple concepts that shared
units would cause cross-talk. It is still true in our CM that many related
units will be triggered by spreading activation, but the representation of
each concept is taken to be compact.
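The cross-talk argument can be made concrete with a small sketch. It assumes, purely for illustration, that a diffuse concept counts as active whenever every unit in its pattern is active; the function names and the finite unit ranges are hypothetical and do not come from the paper.

```python
def active_units(*concepts):
    """Union of the units driven by all simultaneously active concepts."""
    return set().union(*concepts)


def spuriously_active(concept, active):
    """Assumed criterion: a concept is active if all of its units are active."""
    return set(concept) <= active


# Finite stand-ins for the open-ended patterns (10, 20, 30, ...) and (5, 15, 25, ...).
a = set(range(10, 101, 10))
b = set(range(5, 100, 10))
c = {20, 25, 30, 35}  # a third concept that was never activated on purpose

active = active_units(a, b)
print(spuriously_active(c, active))  # True: c is switched on by a and b together
print(spuriously_active(c, a))       # False: a alone does not trigger c
```

Under this criterion neither concept alone triggers the third, but the two together do, which is the cross-talk the text describes.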
Most cognitive scientists believe that the brain appears to be massively
parallel and that such structures can compute special functions very well.
But massively parallel structures do not seem to be usable for general
purpose computing and there is not nearly as much knowledge of how to
construct and analyze such models. The common belief (which may well be
right) is that there are one or more intermediate levels of computational
organization layered on the neuronal structure, and that theories of
intelligent behavior should be described in terms of these higher-level languages,
such as Production Systems, Predicate Calculus, or LISP. We have not seen
a reduction (interpreter, if you will) of any higher formalism which has
plausible resource requirements, and this is a problem well worth pursuing.
Our attempts to develop cognitive science models directly in neural
terms might fail for one of two reasons. It may be that there really is an
interpreted symbol system in animal brains. In this case we would hope that
our efforts would break down in a way that could shed light on the nature
of this symbol system. The other possibility is that CM techniques are
directly applicable but we are unable to figure out how to model some
important capacity, e.g., planning. Our program is to continue the CM attack
on problems of increasing difficulty (and to induce some of you to join us)
until we encounter one that is intractable in our terms. There are a number
of problems that are known to be difficult for systems without an interpreted
symbolic representation, including complex concepts, learning, and natural
language understanding. The current paper is mainly concerned with laying
out the formalism and showing how it applies in the easy cases, but we do
address the problem of complex concepts in Section 4. We have made some
progress on the problem of learning in CM systems (Feldman, 1981b) and
are beginning to work seriously on natural language processing and on
higher-level vision. Our efforts on planning and long-term memory
reorganization have not advanced significantly beyond the discursive presentation
in (Feldman, 1980).
We will certainly not get very far in this program without developing
some systematic methods of attacking CM tasks and some building-block
circuits whose properties we understand. A first step towards a systematic
development of CM is to define an abstract computing unit. Our unit is
rather more general than previous proposals and is intended to capture the
current understanding of the information processing capabilities of
neurons. Some useful special cases of our general definition and some
properties of very simple networks are developed in Section 2. Among the key
ideas are local memory, non-homogeneous and non-linear functions, and
the notions of mutual inhibition and stable coalitions.
A major purpose of the rest of the paper is to describe building blocks
which we have found useful in constructing CM solutions to various tasks.
The constructions are intended to be used to make specific models but the
examples in this paper are only suggestive. We present a number of CM
solutions to general problems arising in intelligent behavior, but we are not
suggesting that any of these are necessarily employed by nature. Our notion
of an adequate model is one that accounts for all of the established relevant
findings and this is not a task to be undertaken lightly. We are developing
some preliminary sketches (Ballard & Sabbah, 1981; Sabbah, 1981) for a
serious model of low and intermediate level vision. As we develop various
building blocks and techniques we will also be trying to bury some of the
contaminated debris of past neural modeling efforts. Many of our
constructions are intended as answers to known hard problems in CM computation.
Among the issues addressed are: stability and noise-sensitivity, distributed
decision-making, time and sequence problems, and the representation of
complex concepts. The crucial questions of learning and change in CM
systems are discussed elsewhere (Feldman, 1981b).
2. NEURON-LIKE COMPUTING UNITS
As part of our effort to develop a generally useful framework for
connectionist theories, we have developed a standard model of the individual unit.
It will turn out that a "unit" may be used to model anything from a small
part of a neuron to the external functionality of a major subsystem. But the
basic notion of unit is meant to loosely correspond to an information
processing model of our current understanding of neurons. The particular
definitions here were chosen to make it easy to specify detailed examples of
relatively complex behaviors. There is no attempt to be minimal or
mathematically elegant. The various numerical values appearing in the definitions
are arbitrary, but fixed finite bounds play a crucial role in the development.
The presentation of the definitions will be in stages, accompanied by
examples. A compact technical specification for reference purposes is included
as Appendix A. Each unit will be characterized by a small number of
discrete states plus:
$p$ -- a continuous value in $[-10, 10]$, called potential (accuracy of several digits)
$v$ -- an output value, integers $0 \le v \le 9$
$i$ -- a vector of inputs $i_1, \ldots, i_n$
P-Units
For some applications, we will be able to use a particularly simple kind of
unit whose output $v$ is proportional to its potential $p$ (rounded) when $p > 0$
and which has only one state. In other words
$$p \leftarrow p + \beta \sum_k w_k i_k \qquad [0 \le w_k \le 1]$$
$$v \leftarrow \text{if } p > \theta \text{ then round}(p - \theta) \text{ else } 0 \qquad [v = 0 \ldots 9]$$
where $\beta$, $\theta$ are constants and $w_k$ are weights on the input values. The weights
are the sole locus of change with experience in the current model. Most
often, the potential and output of a unit will be encoding its confidence, and
we will sometimes use this term. The "$\leftarrow$" notation is borrowed from the
assignment statement of programming languages. This notation covers both
continuous and discrete time formulations and allows us to talk about some
issues without any explicit mention of time. Of course, certain other
questions will inherently involve time and computer simulation of any network
of units will raise delicate questions of discretizing time.
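As an illustration of the p-unit rule, here is a minimal discrete-time sketch. The class name `PUnit`, the default values of `beta` and `theta`, and the choice to clip `p` to [-10, 10] after each update are assumptions made for the example, not part of the definition above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PUnit:
    """One-state p-unit: potential p in [-10, 10], output v in 0..9."""
    weights: Sequence[float]  # w_k, each assumed to lie in [0, 1]
    beta: float = 1.0         # scale constant (value assumed)
    theta: float = 0.0        # output threshold (value assumed)
    p: float = 0.0

    def step(self, inputs: Sequence[int]) -> int:
        """Apply p <- p + beta * sum(w_k * i_k) once, then return v."""
        total = sum(w * i for w, i in zip(self.weights, inputs))
        self.p = max(-10.0, min(10.0, self.p + self.beta * total))
        return self.output()

    def output(self) -> int:
        """v <- if p > theta then round(p - theta) else 0, kept within 0..9."""
        if self.p > self.theta:
            return max(0, min(9, round(self.p - self.theta)))
        return 0


unit = PUnit(weights=[0.5, 0.25], theta=1.0)
print(unit.step([4, 4]))  # p becomes 3.0, so v = round(3.0 - 1.0) = 2
```

Because the potential accumulates across calls to `step`, repeated input drives the unit toward its upper bound; this sketch has no decay term.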
The restriction that output take on small integer values is central to
our enterprise. The firing frequencies of neurons range from a few to a few
hundred impulses per second. In the 1/10 second needed for basic mental
events, there can only be a limited amount of information encoded in
frequencies. The ten output values are an attempt to capture this idea. A more
accurate rendering of neural events would be to allow 100 discrete values
with noise on transmission (cf. Sejnowski, 1977). Transmission time is
assumed to be negligible; delay units can be added when transit time needs
to be taken into account.
The p-unit is somewhat like classical linear threshold elements (Minsky
& Papert, 1972), but there are several differences. The potential, p, is a
crude form of memory and is an abstraction of the instantaneous membrane
potential that characterizes neurons; it greatly reduces the noise sensitivity of
our networks. Without local memory in the unit, one must guarantee that
all the inputs required for a computation appear simultaneously at the unit.
One problem with the definition above of a p-unit is that its potential
does not decay in the absence of input. This decay is both a physical
property of neurons and an important computational feature for our highly
parallel models. One computational trick to solve this is to have an
inhibitory connection from the unit back to itself. Informally, we identify the
negative self feedback with an exponential decay in potential which is
mathematically equivalent. With this addition, p-units can be used for
many CM tasks of intermediate difficulty. The Interactive Activation
models of McClelland and Rumelhart can be described naturally with
p-units, and some of our own work (Ballard, 1981) and that of others (Marr
& Poggio, 1976) can be done with p-units. But there are a number of
additional features which we have found valuable in more complex modeling
tasks.
Disjunctive Firing Conditions and Conjunctive Connections
It is both computationally efficient and biologically realistic to allow a unit
to respond to one of a number of alternative conditions. One way to view
this is to imagine the unit having "dendrites" each of which depicts an
alternative enabling condition (Figure 3). For example, one could extend the
network of Figure 1 to allow for several different type fonts activating the same
letter node, with the higher connections unchanged. Biologically, the firing
of a neuron depends, in many cases, on local spatio-temporal summation
involving only a small part of the neuron's surface. So-called dendritic
spikes transmit the activation to the rest of the cell.
Figure 3. Conjunctive connections and disjunctive input sites.
In terms of our formalism, this could be described in a variety of
ways. One of the simplest is to define the potential in terms of the maximum
of the separate computations, e.g.,
$$p \leftarrow p + \beta \max(i_1 + i_2 - \varphi,\; i_3 + i_4 - \varphi,\; i_5 + i_6 - i_7 - \varphi)$$
where $\beta$ is a scale constant as in the p-unit and $\varphi$ is a constant chosen (usually
> 10) to suppress noise and require the presence of multiple active inputs
(Sabbah, 1981). The minus sign associated with $i_7$ corresponds to its being
an inhibitory input.
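The following sketch applies the max-of-sum rule literally to the three input sites of the example. The function name, the dictionary encoding of the inputs, the defaults `beta=0.5` and `phi=11.0`, and the clipping of `p` to [-10, 10] are illustrative assumptions.

```python
def max_of_sum_step(p, i, beta=0.5, phi=11.0):
    """One update p <- p + beta * max(site sums).

    `i` maps the input index (1..7) to its current value in 0..9.
    """
    sites = (
        i[1] + i[2] - phi,         # first conjunctive site
        i[3] + i[4] - phi,         # second conjunctive site
        i[5] + i[6] - i[7] - phi,  # third site, with i_7 inhibitory
    )
    return max(-10.0, min(10.0, p + beta * max(sites)))


quiet = {k: 0 for k in range(1, 8)}
both = {**quiet, 1: 9, 2: 9}    # two strong inputs on the same site
single = {**quiet, 1: 9}        # one strong input on its own

print(max_of_sum_step(0.0, both))    # 3.5: the pair exceeds phi
print(max_of_sum_step(0.0, single))  # -1.0: a lone input cannot raise p
```

With `phi` above 10, no single input in 0..9 can push a site sum above zero, so the potential rises only when several inputs on one site are active together.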
It does not seem unreasonable (given current data, Kuffler & Nicholls,
1976) to model the firing rate of some units as the maximum of the rates at
its active sites. Units whose potential is changed according to the maximum
of a set of algebraic sums will occur frequently in our specific models. One
advantage of keeping the processing power of our abstract unit close to that
of a neuron is that it helps inform our counting arguments. When we
attempt to model a particular function (e.g., stereopsis), we expect to require
that the number of units and connections as well as the execution time
required by the model are plausible.
The max-of-sum unit is the continuous analog of a logical OR-of-AND
(disjunctive normal form) unit and we will sometimes use the latter as an
approximate version of the former. The OR-of-AND unit corresponding to
Figure 3 is:
$$p \leftarrow p + \alpha \,\mathrm{OR}(i_1 \,\&\, i_2,\; i_3 \,\&\, i_4,\; i_5 \,\&\, i_6 \,\&\, (\text{not } i_7))$$
This formulation stresses the importance that nearby spatial connections all
be firing before the potential is affected. Hence, in the above example, $i_3$
and $i_4$ make a conjunctive connection with the unit. The effect of a
conjunctive connection can always be simulated with more units but the number of
extra units may be very large.
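A logical version of the same unit can be sketched as follows. Treating an input as "firing" when its value is greater than zero, counting a satisfied OR as 1, and clipping `p` at 10 are assumptions made for this example; the function name is hypothetical.

```python
def or_of_and_step(p, i, alpha=1.0):
    """One update p <- p + alpha * OR(i1 & i2, i3 & i4, i5 & i6 & not i7).

    `i` maps the input index (1..7) to its value; an input counts as
    firing when its value is greater than zero (assumed convention).
    """
    on = {k: value > 0 for k, value in i.items()}
    fired = (
        (on[1] and on[2])
        or (on[3] and on[4])
        or (on[5] and on[6] and not on[7])
    )
    return min(10.0, p + alpha * (1 if fired else 0))


quiet = {k: 0 for k in range(1, 8)}
print(or_of_and_step(0.0, {**quiet, 3: 4, 4: 2}))        # 1.0: i3 and i4 both fire
print(or_of_and_step(0.0, {**quiet, 3: 4}))              # 0.0: i3 alone is not enough
print(or_of_and_step(0.0, {**quiet, 5: 4, 6: 2, 7: 1}))  # 0.0: i7 vetoes the third site
```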
Q-Units and Compound Units
Another useful special case arises when one suppresses the numerical
potential, p, and relies upon a finite-state set {q} for modeling. If we also identify
each input of i with a separate named input signal, we can get classical finite
automata. A simple example would be a unit that could be started or stopped
from firing.
One could describe the behavior of this unit by a table, with rows
corresponding to states in {q} and columns to possible inputs, e.g.,
            $i_1$ (start)   $i_2$ (stop)
Firing      Firing          Null
Null        Firing          Null
One would also have to specify an output function, giving output values
required by the rest of the network, e.g.,
$$v \leftarrow \text{if } q = \text{Firing then } 6 \text{ else } 0.$$
This could also be added to the table above. An equivalent notation would
be transition networks with states as nodes and inputs and outputs on the
arcs.
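The start-stop unit can be written as a table-driven finite automaton. This sketch assumes that the start signal moves the unit to Firing and the stop signal moves it to Null from either state; the names `TRANSITIONS`, `next_state`, and `output` are hypothetical.

```python
# (current state, input signal) -> next state
TRANSITIONS = {
    ("Firing", "start"): "Firing",
    ("Firing", "stop"): "Null",
    ("Null", "start"): "Firing",
    ("Null", "stop"): "Null",
}


def next_state(q, signal):
    """Look up the successor state for one named input signal."""
    return TRANSITIONS[(q, signal)]


def output(q):
    """v <- if q = Firing then 6 else 0."""
    return 6 if q == "Firing" else 0


q = "Null"
for signal in ("start", "start", "stop"):
    q = next_state(q, signal)
    print(signal, q, output(q))
```

Keeping the transitions in a dictionary mirrors the tabular description, so adding a state or an input signal only means adding rows.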
In order to build models of interesting behaviors we will need to
employ many of the same techniques used by designers of complex
computers and programs. One of the most powerful techniques will be
encapsulation and abstraction of a subnetwork by an individual unit. For example,
a system that had separate motor abilities for turning left and turning right
(e.g., fins) could use two start-stop units to model a turn-unit, as shown in
Figure 4.
Figure 4. A Turn Unit. [Diagram: a "left" start-stop unit that causes motion to the left and a "right" start-stop unit that causes motion to the right.]
Note that the compound unit here has two distinct outputs, where
basic units have only one (which can branch, of course). In general, com-
pound units will differ from basic ones only in that they can have several
distinct outputs.
The main point of this example is that the turn-unit can be described
abstractly, independent of the details of how it is built. For example, using
the tabular conventions described above,
State       Left        Right       Output Values
a gauche    a gauche    a droit     $v_1 = 7$, $v_2 = 0$
a droit     a gauche    a droit     $v_1 = 0$, $v_2 = 8$
where the right-going output being larger than the left could mean that we
have a right-finned robot. There is a great deal more that must be said about
the use of states and symbolic input names, about multiple simultaneous in-
puts, etc., but the idea of describing the external behavior of a system only
in enough detail for the task at hand is of great importance. This is one of
the few ways known of coping with the complexity of the magnitude needed
for serious modeling of biological functions. It is not strictly necessary that
the same formalism be used at each level of functional abstraction and, in
the long run, we may need to employ a wide range of models. For example,
for certain purposes one might like to expand our units in terms of compart-
mental models of neurons like those of (Perkel, 1979). The advantage of
keeping within the same formalism is that we preserve intuition, mathe-
matics, and the ability to use existing simulation programs. With sufficient
care, we can use the units defined above to represent large subsystems
without giving up the notion that each unit can stand for an abstract neuron. The
crucial point is that a subsystem must be elaborated into its neuron-level
units for timing and size calculations, but can (hopefully) be described much
more simply when only its effects on other subsystems are of direct concern.
Units Employing p and q
It will already have occurred to the reader that a numerical value, like our p,
would be useful for modeling the amount of turning to the left or right in
the last example. It appears to be generally true that a single numerical value
and a small set of discrete states combine to provide a powerful yet tractable
modeling unit. This is one reason that the current definitions were chosen.
Another reason is that the mixed unit seems to be a particularly convenient
way of modeling the information processing behavior of neurons, as
generally described. The discrete states enable one to model the effects in neurons
of polypeptide modulators, abnormal chemical environments, fatigue, etc.
Although these effects are often continuous functions of unit parameters,
there are several advantages to using discrete states in our models. Scientists
and laymen alike often give distinct names (e.g., cool, warm, hot) to
parameter ranges that they want to treat differently. We also can exploit a large
literature on understanding loosely-coupled systems as finite-state machines
(Sunshine, 1979). It is also traditional to break up a function into separate
ranges when it is simpler to describe that way. We have already employed all
of these uses of discrete states in our detailed work (Feldman, 1981b; Sabbah,
1981). One example of a unit employing both p and q non-trivially is the
following crude neuron model. This model is concerned with saturation and
assumes that the output strength, v, is something like average firing
frequency. It is not a model of individual action potentials and refractory
periods.
We suppose the distinct states of the unit $q \in \{\text{normal}, \text{recover}\}$. In
normal state the unit behaves like a p-unit, but while it is recovering it
ignores inputs. The following table captures almost all of this behavior.
(incomplete)   $-1 < p < 9$                  $p > 9$                          Output Value
normal         $p \leftarrow p + \sum i$     $p \leftarrow -p$ / recover      $v \leftarrow \alpha p - \theta$
recover        normal                        <impossible>                     $v \leftarrow 0$
Here we have the change from one state to the other depending on the
value of the potential, p, rather than on specific inputs. The recovering state
is also characterized by the potential being set negative. The unspecified
issue is what determines the duration of the recovering state; there are
several possibilities. One is an explicit dishabituation signal like those in
Kandel's experiments (Kandel, 1976). Another would be to have the unit
sum inputs in the recovering state as well. The reader might want to
consider how to add this to the table.
The t hi r d possibility, which we will use f r equent l y, is t o assume t hat
t he pot ent i al , p, decays t owar d zero ( f r om bot h direction~) unless explicitly
changed. Thi s implicit decay p- - p0e -kf can be model ed by sel f i nhi bi t i on; t he
decay const ant , k, det ermi nes t he length of t he r ecover y per i od.
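The table and the decay rule can be read as a small state machine. The following is a minimal sketch under several assumptions that are not in the text: time advances in discrete steps, $\alpha = 1$, $\theta = 0$, $k = 0.5$, decay is applied only while recovering, and the unit returns to the normal state once p rises above -1 (the lower edge of the table's first column). The class name `CrudeNeuron` is hypothetical.

```python
import math


class CrudeNeuron:
    """Two-state unit: 'normal' sums its inputs, 'recover' ignores them."""

    def __init__(self, alpha=1.0, theta=0.0, k=0.5):
        # alpha, theta and k are illustrative values, not taken from the paper.
        self.alpha, self.theta, self.k = alpha, theta, k
        self.p = 0.0        # potential
        self.q = "normal"   # discrete state
        self.v = 0          # output value in 0..9

    def step(self, inputs):
        if self.q == "normal":
            self.p += sum(inputs)            # p <- p + sum of inputs
            if self.p > 9:                   # saturated: set p negative, recover
                self.p = -self.p
                self.q = "recover"
        else:
            self.p *= math.exp(-self.k)      # implicit decay toward zero
            if self.p > -1:                  # back inside the normal range
                self.q = "normal"
        if self.q == "normal":
            raw = self.alpha * self.p - self.theta
            self.v = max(0, min(9, round(raw)))
        else:
            self.v = 0                       # a recovering unit is silent
        return self.v


unit = CrudeNeuron()
trace = [unit.step([4]) for _ in range(9)]
print(trace)
```

With a constant input of 4 the unit fires for two steps, saturates, stays silent while the negative potential decays, and then resumes; a larger k shortens the silent period.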
The general definition of our abstract neural computing unit is just a
formalization of the ideas presented above. To the previous notions of p, v,
and i we formally add
    $\{q\}$: a set of discrete states, $< 10$
and functions from old to new values of these
    $p \leftarrow f(i, p, q)$
    $q \leftarrow g(i, p, q)$
    $v \leftarrow h(i, p, q)$
which we assume, for now, to compute continuously. The form of the f, g,
and h functions will vary, but will generally be restricted to conditionals and
simple functions. There are both biological and computational reasons for
allowing units to respond (for example) logarithmically to their inputs and
we have already seen important uses of the maximum function.
The only other notion that we will need is modifiers associated with
the inputs of a unit. We elaborate the input vector i in terms of received
values, weights, and modifiers:
    $\forall j,\; i_j = r_j \cdot w_j \cdot m_j \qquad j = 1, \ldots, n$
where $r_j$ is the value received from a predecessor $[r = 0 \ldots 9]$; $w_j$
is a changeable weight, unsigned $[0 \le w_j \le 1]$ (accuracy of several
digits); and $m_j$ is a synapto-synaptic modifier which is either 0 or 1.
The weights are the only thing in the system which can change with
experience. They are unsigned because we do not want a connection to change
from excitatory to inhibitory. The modifier or gate simplifies many of our
detailed models. Learning and change will not be treated technically in this
paper, but the definitions are included in the Appendix for completeness
(Feldman, 1981b).
We conclude this section with some preliminary examples of networks
of our units, illustrating the key idea of mutual (lateral) inhibition (Fig. 5).
Mutual inhibition is widespread in nature and has been one of the basic
computational schemes used in modeling. We will present two examples of
how it works to aid intuition as well as to illustrate the notation. The
basic situation is symmetric configurations of p-units which mutually
inhibit one another. Time is broken into discrete intervals for these examples.
The examples are too simple to be realistic, but do contain ideas which we
will employ repeatedly.
Two P-Units Symmetrically Connected
Suppose $w_1 = 1$, $w_2 = -.5$
    $p(t+1) = p(t) + r_1 - (.5) r_2$
    $v = \mathrm{round}(p) \quad [0 \ldots 9]$
    $r_i$ = received
Referring to Figure 5a, suppose the initial input to the unit A.1 is 6, then 2
per time step, and the initial input to B.1 is 5, then 2 per time step. At each
time step, each unit changes its potential by adding the external value ($r_1$)
and subtracting half the output value of its rival. This system will stabilize
to the side of the larger of two instantaneous inputs.
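The update rule above is easy to trace by hand or in a few lines of code. The sketch below is illustrative only and rests on assumptions not stated in the text: the initial input is taken as the initial potential, rounding is round-half-up, outputs are clipped to 0..9, and potentials are clipped to [-10, 10]. The function names are hypothetical.

```python
import math


def output(p):
    """Round half up, then clip to the output range 0..9 (assumed convention)."""
    return max(0, min(9, math.floor(p + 0.5)))


def clip(p, lo=-10.0, hi=10.0):
    """Keep the potential in an assumed bounded range."""
    return max(lo, min(hi, p))


def simulate_pair(p_a=6.0, p_b=5.0, external=2.0, steps=10):
    """Two mutually inhibiting p-units; returns the (v_A, v_B) history."""
    history = [(output(p_a), output(p_b))]
    for _ in range(steps):
        v_a, v_b = output(p_a), output(p_b)
        # Each unit adds its external input and subtracts half its rival's output.
        p_a = clip(p_a + external - 0.5 * v_b)
        p_b = clip(p_b + external - 0.5 * v_a)
        history.append((output(p_a), output(p_b)))
    return history


for t, (v_a, v_b) in enumerate(simulate_pair()):
    print(t, v_a, v_b)
```

Under these assumptions the unit that starts with the larger input saturates at 9 after five steps while its rival is driven to 0, which matches the qualitative claim that the system settles toward the larger input.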
Two Symmetric Coalitions of 2-Units
    $w_1 = 1$
    $w_2 = .5$
    $w_3 = -.5$
    $p(t+1) = p(t) + r_1 + .5(r_2 - r_3)$
    $v = \mathrm{round}(p)$
A, C start at 6; B, D at 5;
A, B, C, D have no external input for t > 1
The connections for this system are shown in Figure 5b. This system
converges faster than the previous example. The idea here is that units A and
C form a "coalition" with mutually reinforcing connections. The competing
units are A vs. B and C vs. D. The last example is the smallest network
depicting what we believe to be the basic mode of operation in connectionist
systems. The faster convergence is not an artifact; the positive feedback
among members of a coalition will generally lead to faster convergence than
in separate competitions. It is the amount of positive feedback rather than
just the size of the coalition that determines the rate of convergence
(Feldman & Ballard, 1982). In terms of Figure 1, this could represent the behavior
of the rival letters A and T in conjunction with the rival words ABLE and
TRAP, in the absence of other active nodes.
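The coalition dynamics can be traced with a short simulation. This is a sketch under assumptions that go beyond the text: A and C reinforce each other, B and D reinforce each other, rivals are A vs. B and C vs. D, rounding is round-half-up, outputs are clipped to 0..9, and potentials to [-10, 10]. The tables `ALLY` and `RIVAL` are hypothetical names for the wiring of Figure 5b.

```python
import math

ALLY = {"A": "C", "C": "A", "B": "D", "D": "B"}    # mutually reinforcing pairs
RIVAL = {"A": "B", "B": "A", "C": "D", "D": "C"}   # mutually inhibiting pairs


def output(p):
    """Round half up, then clip to the output range 0..9 (assumed convention)."""
    return max(0, min(9, math.floor(p + 0.5)))


def simulate_coalitions(steps=6):
    """Returns a list of {unit: output} dictionaries, one per time step."""
    p = {"A": 6.0, "C": 6.0, "B": 5.0, "D": 5.0}
    history = [{u: output(x) for u, x in p.items()}]
    for _ in range(steps):
        v = {u: output(x) for u, x in p.items()}
        # No external input after the start, so only the ally and rival terms remain.
        p = {
            u: max(-10.0, min(10.0, p[u] + 0.5 * (v[ALLY[u]] - v[RIVAL[u]])))
            for u in p
        }
        history.append({u: output(x) for u, x in p.items()})
    return history


for t, state in enumerate(simulate_coalitions()):
    print(t, state)
```

In this sketch the A-C coalition reaches saturation and silences B and D after four steps, driven only by the initial one-unit advantage and the positive feedback between allies.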
Competing coalitions of units will be the organizing principle behind
most of our models. Consider the two alternative readings of the Necker
cube shown in Figure 6. At each level of visual processing, there are mutually
contradictory units representing alternative possibilities. The dashed lines
denote the boundaries of coalitions which embody the alternative
interpretations of the image. A number of interesting phenomena (e.g., priming,
perceptual rivalry, filling, subjective contour) find natural expression in this
formalism. We are engaged in an ongoing effort (Ballard, 1981; Sabbah,
1981) to model as much of visual processing as possible within the
connectionist framework. The next section describes in some detail a variety of
simple networks which we have found to be useful in this effort.
3. NETWORKS OF UNITS
The main restriction imposed by the connectionist paradigm is that no
symbolic information is passed from unit to unit. This restriction makes it
difficult to employ standard computational devices like parameterized functions.
In this section, we present connectionist solutions to a variety of
computational problems. The sections address two principal issues. One is: Can the
networks be connected up in a way that is sufficient to represent the
problem at hand? The other is: Given these connections, how can the networks
exhibit appropriate dynamic behavior, such as making a decision at an
appropriate time?
Using a Unit to Represent a Value
One key to many of our constructions is the dedication of a separate unit to
each value of each parameter of interest, which we term the unit/value
principle. We will show how to compute using unit/value networks and present
arguments that the number of units required is not unreasonable. In this
representation the output of a unit may be thought of as a confidence
measure. Suppose a network of depth units encodes the distance of some object
from the retina. Then if the unit representing depth = 2 saturates, the
network is expressing confidence that the distance is two units. Similarly, the
"G-hidden" node in Figure 6 expresses confidence in its assertion. There is
much neurophysiological evidence to suggest unit/value organizations in
less abstract cortical maps. Examples are edge sensitive units (Hubel &
Wiesel, 1979) and perceptual color units (Zeki, 1980), which are relatively
insensitive to illumination spectra. Experiments with cortical motor control
in the monkey and cat (Wurtz & Albano, 1980) suggest a unit/value
organization. Our hypothesis is that the unit/value organization is widespread,
and is a fundamental design principle.
Figure 6. The Necker Cube.
Although many physical neurons do seem to follow the unit/value
rule and respond according to the reliability of a particular configuration,
there are also other neurons whose output represents the range of some
parameter, and apparently some units whose firing frequency reflects both
range and strength information (Scientific American, 1979). Both of the
latter types can be accommodated within our definition of a unit, but we
will employ only unit/value networks in the remainder of this paper.
In the unit/value representation, much computation is done by table
look-up. As a simple example, let us consider the multiplication of two
variables, i.e., z = xy. In the unit/value formalism there will be units for every
value of x and y that is important. Appropriate pairs of these will make a
conjunctive connection with another unit cell representing a specific value
for the product. Figure 7 shows this for a small set of units representing
values for x and y. Notice that the confidence (expressed as output value)
that a particular product is an answer can be a linear function of the
maximum of the sums of the confidences of its two inputs. A major problem
with function tables (and with CM in general) is the potential combinatorial
explosion in the number of units required for a computation. A naive
approach would demand $N^2$ units to represent all products of numbers from 1
to N. The network of Figure 7 requires many fewer units because each
product is represented only once, another advantage of conjunctive connections.
We could use even fewer units by exploiting positional notation and
replacing each output connection with a conjunction of outputs from units
representing multiples of 1, 10, 100, etc. The question of efficient ways of
building connection networks is treated in detail in Section 4 (cf. also
Hinton, 1981a; 1981b).
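The table look-up idea can be sketched as follows. This is an illustration under assumptions, not the network of Figure 7: x and y range over 1..n, confidences are plain numbers, and the linear function applied to the maximum of the sums is a fixed scale factor of 0.5. The function names are hypothetical.

```python
from collections import defaultdict


def build_product_units(n):
    """Map each distinct product z to the (x, y) pairs that conjoin on it."""
    pairs = defaultdict(list)
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            pairs[x * y].append((x, y))
    return dict(pairs)


def product_confidence(pairs, conf_x, conf_y, scale=0.5):
    """Confidence of each z-unit: scaled maximum over its pairs of conf_x + conf_y."""
    return {
        z: scale * max(conf_x.get(x, 0) + conf_y.get(y, 0) for x, y in xy)
        for z, xy in pairs.items()
    }


pairs = build_product_units(3)
print(len(pairs), "z-units instead of", 3 * 3)

# Strong evidence for x = 2 and moderate evidence for y = 3.
conf = product_confidence(pairs, {2: 8}, {3: 6})
print(max(conf, key=conf.get))
```

Because each product appears only once, the 3 x 3 table needs six z-units rather than nine, and the z-unit for 6 receives the highest confidence when x = 2 and y = 3 are the active inputs.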
Figure 7. Multiplication Units (z = f(x, y) = xy; x-units, y-units, z-units).

Modifiers and Mappings
The idea of function tables (Fig. 7) can be extended through the use of
variable mappings. In our definition of the computational unit, we included a
binary modifier, m, as an option on every connection. As the definition
specifies, if the modifier associated with a connection is zero, the value v
sent along that connection is ignored. Thus the modifier denotes inhibition,
or blocking. There is considerable evidence in nature for synapses on
synapses (Kandel, 1976) and the modifiers add greatly to the computational
simplicity of our networks. Let us start with an initial informal example of
the use of modifiers and mappings. Suppose that one has a model of grass as
green except in California where it is brown (golden), as shown in Figure 8.
Figure 8. Grass is Green connection modified by California.
Here we can see that grass and green are potential members of a coalition
(can reinforce one another) except when the link is blocked. This use is
similar to the cancellation link of Fahlman (1979) and gives a crude idea of how
context can affect perception in our models. Note that in Figure 8 we are
using a shorthand notation. A modifier touching a double-ended arrow
actually blocks two connections. (Sometimes we also omit the arrowheads
when a connection is double-ended.)
Mappings can also be used to select among a number of possible
values. Consider the example of the relation between depth, physical size,
and retinal size of a circle. (For now, assume that the circle is centered on
and orthogonal to the line of sight, that the focus is fixed, etc.) Then there is
a fixed relation between the size of retinal image and the size of the physical
circle for any given depth. That is, each depth specifies a mapping from
retinal to physical size (see Fig. 9). Here we suppose the scales for depth and
the two sizes are chosen so that unit depth means the same numerical size. If
we knew the depth of the object (by touch, context, or magic) we would
know its physical size. The network of Figure 9 allows retinal size 2 to
reinforce physical size 2 when depth = 1 but inhibits this connection for all other
depths. Similarly, at depth 3, we should interpret retinal size 2 as physical
size 8, and inhibit other interpretations. Several remarks are in order. First,
notice that this network implements a function phys = f(ret, dep) that maps
from retinal size and depth to physical size, providing an example of how to
replace functions with parameters by mappings. For the simple case of
looking at one object perpendicular to the line of sight, there will be one
consistent coalition of units which will be stable. The network does something
more, and this is crucial to our enterprise; the network can represent the
consistency relation R among the three quantities: depth, retinal size, and
physical size. It embodies not only the function f, but its two inverse
functions as well (dep = $f_1$(ret, phys), and ret = $f_2$(phys, dep)). (The network
as shown does not include the links for $f_1$ and $f_2$, but these are similar
to those for f.) Most of Section 5 is devoted to laying out networks that embody
theories of particular visual consistency relations.
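A small sketch may help show how one consistency relation yields f and both of its inverses. The scale used here is an assumption chosen only because it agrees with the two cases mentioned in the text (retinal size 2 maps to physical size 2 at depth 1 and to physical size 8 at depth 3): each unit of depth is taken to double the physical size. The value sets and function names are hypothetical.

```python
RET = [1, 2, 4]            # retinal-size units (illustrative)
DEP = [1, 2, 3]            # depth units (illustrative)
PHYS = [1, 2, 4, 8, 16]    # physical-size units (illustrative)


def consistent(ret, dep, phys):
    """Assumed relation R: each unit of depth doubles the physical size."""
    return phys == ret * 2 ** (dep - 1)


def modifier(ret, phys, active_depth):
    """Binary gate on the ret -> phys link: 1 passes the value, 0 blocks it."""
    return 1 if consistent(ret, active_depth, phys) else 0


def physical_size(ret, dep):
    """f: physical sizes still reinforced by ret once depth gates the links."""
    return [p for p in PHYS if modifier(ret, p, dep)]


def depth(ret, phys):
    """f1: depths consistent with a given retinal and physical size."""
    return [d for d in DEP if consistent(ret, d, phys)]


def retinal_size(phys, dep):
    """f2: retinal sizes consistent with a given physical size and depth."""
    return [r for r in RET if consistent(r, dep, phys)]


print(physical_size(2, 1), physical_size(2, 3))
print(depth(2, 8), retinal_size(8, 3))
```

All three functions read off the same set of consistent triples, which is the sense in which the network represents the relation R rather than a single directed function.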
The idea of modifiers is, in a sense, complementary to that of
conjunctive connections. For example, the network of Figure 9 could be
transformed into the following network (Fig. 10). In this network the variables
for physical size, depth, and retinal size are all given equal weight. For
example, physical size = 4 and depth = 1 make a conjunctive connection with
retinal size = 4. Each of the value units in a competing row could be
connected to all of its competitors by inhibitory links and this would tend to
make the network activate only one value in each category. The general
issue of rivalry and coalitions will be discussed in the next two sub-sections.
When should a relation be implemented with modifiers and when
should it be implemented with conjunctive connections? A simple,
non-rigorous answer to this question can be obtained by examining the size of
two sets of units: (1) the number of units that would have to be inhibited by
modifiers; and (2) the number of units that would have to be reinforced
with conjunctive connections. If (1) is larger than (2), then one should
choose modifiers; otherwise choose conjunctive connections. Sometimes
the choice is obvious: to implement the brown Californian grass example of
Figure 8 with conjunctive connections, one would have to reinforce all units
representing places that had green grass! Clearly in this case it is easier to
handle the exception with modifiers. On the other hand, the depth relation
R(phy, dep, ret) is more cheaply implemented with conjunctive connections.
Since our modifiers are strictly binary, conjunctive connections have the
additional advantage of continuous modulation.
To see how the conjunctive connection strategy works in general,
suppose a constraint relation to be satisfied involves a variable x, e.g.,
f(x, y, z, w) = 0. For a particular value of x, there will be triples of values
of y, z, and w that satisfy the relation f. Each of these triples should make a
conjunctive connection with the unit representing the x-value. There could
also be 3-input conjunctions at each value of y, z, w. Each of these four
different kinds of conjunctive connections corresponds to an interpretation of
the relation f(x, y, z, w) = 0 as a function, i.e., $x = f_1(y, z, w)$,
$y = f_2(x, z, w)$, $z = f_3(x, y, w)$, or $w = f_4(x, y, z)$. Of course, these
functions need not be single-valued. This network connection pattern could be
extended to more than four variables, but high numbers of variables would tend
to increase its sensitivity to noisy inputs. Hinton has suggested a special
notation for the situation where a network exactly captures a consistency
relation. The mutually consistent values are all shown to be centrally linked
(Fig. 11). This notation provides an elegant way of presenting the interactions
among networks, but must be used with care. Writing down a triangle diagram does
not insure that the underlying mappings can be made consistent or
computationally well-behaved.
Figure 11. Notation for consistency relations.
Winner-Take-All Networks and Regulated Networks
A very general problem that arises in any distributed computing situation is
how to get the entire system to make a decision (or perform a coherent
action, etc.). Biologically necessary examples of this behavior abound; ranging
from turning left or right, through fight-or-flight responses, to
interpretations of ambiguous words and images. Decision-making is a particularly
important issue for the current model because of its restrictions on information
flow and because of the almost linear nature of the p-units used in many of
our specific examples. Decision-making introduces the notions of stable
states and convergence of networks.
One way to deal with the issue of coherent decisions in a connectionist
framework is to introduce winner-take-all (WTA) networks, which have the
property that only the unit with the highest potential (among a set of
contenders) will have output above zero after some settling time (Fig. 12). There
are a number of ways to construct WTA networks from the units described
above. For our purposes it is enough to consider one example of a WTA
network which will operate in one time step for a set of contenders each of
whom can read the potential of all of the others. Each unit in the network
computes its new potential according to the rule:
    $p \leftarrow$ if $p > \max(i_j, .1)$ then $p$ else $0$.
That is, each unit sets itself to zero if it knows of a higher input. This is fast
and simple, but probably a little too complex to be plausible as the behavior
of a single neuron. There is a standard trick (apparently widely used by
nature) to convert this into a more plausible scheme. Replace each unit
above with two units; one computes the maximum of the competitors'
inputs and inhibits the other. The circuit above can be strengthened by adding
a reverse inhibitory link, or one could use a modifier on the output, etc.
Obviously one could have a WTA layer that got inputs from some set of
competitors and settled to a winner when triggered to do so by some downstream
network. This is an exact analogy of strobing an output buffer in a
conventional computer.
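The one-step rule can be written out directly. The sketch below assumes the contenders' potentials are held in a list and that each unit's inputs $i_j$ are simply the potentials of all the other contenders; the function name and the handling of ties follow from the strict inequality in the rule and are not otherwise specified in the text.

```python
def winner_take_all(potentials, floor=0.1):
    """One-step WTA: a unit keeps p only if p exceeds every rival and the floor."""
    result = []
    for k, p in enumerate(potentials):
        rivals = potentials[:k] + potentials[k + 1:]
        threshold = max(rivals + [floor])      # max(i_j, .1)
        result.append(p if p > threshold else 0)
    return result


print(winner_take_all([3, 7, 5]))
print(winner_take_all([4, 4]))
```

Note that with a strict comparison an exact tie leaves no winner at all, and potentials that never exceed the floor of .1 are all suppressed; a fuller model would need some way to break ties.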
One problem with previous neural modeling attempts is that the
circuits proposed were often unnaturally delicate (unstable). Small changes in
parameter values would cause the networks to oscillate or converge to
incorrect answers. We will have to be careful not to fall into this trap, but would
like to avoid detailed analysis of each particular model for delicacy in this
paper. What appears to be required are some building blocks and
combination rules that preserve the desired properties. For example, the WTA
subnetworks of the last example will not oscillate in the absence of oscillating
inputs. This is also true of any symmetric mutually inhibitory subnetwork.
This is intuitively clear and could be proven rigorously under a variety of
assumptions (cf. Grossberg, 1980). If every unit receives inhibition
proportional to the activity (potential) of each of its rivals, the instantaneous
leader will receive less inhibition and thus not lose its lead unless the inputs
change significantly.
Another useful principle is the employment of lower-bound and
upper-bound cells to keep the total activity of a network within bounds (Fig. 13).
Suppose that we add two extra units, LB and UB, to a network which has
coordinated output. The LB cell compares the total (sum) activity of the
units of the network with a lower bound and sends positive activation
uniformly to all members if the sum is too low. The UB cell inhibits all units
equally if the sum of activity is too high. Notice that LB and UB can be
parameters set from outside the network. Under a wide range of conditions
(but not all), the LB-UB augmented network can be designed to preserve
order relationships among the outputs $v_j$ of the original network while
keeping the sum between LB and UB.
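The effect of the LB and UB cells can be illustrated with a one-shot correction. This sketch makes simplifying assumptions that are not in the text: the correction is applied in a single step rather than through network dynamics, it is exactly large enough to bring the sum to the violated bound, and outputs are not clipped to 0..9 afterwards. The function name is hypothetical.

```python
def regulate(outputs, lb, ub):
    """Shift every output by the same amount so that the sum lies in [lb, ub]."""
    total = sum(outputs)
    n = len(outputs)
    if total < lb:
        shift = (lb - total) / n     # LB cell: uniform positive activation
    elif total > ub:
        shift = (ub - total) / n     # UB cell: uniform inhibition
    else:
        shift = 0.0                  # already within bounds
    return [v + shift for v in outputs]


print(regulate([1, 2, 3], lb=9, ub=12))
print(regulate([5, 6, 7], lb=9, ub=12))
```

Because every unit receives the same shift, the ordering of the outputs is preserved. Clipping the shifted values to a fixed output range could merge distinct values, which is one reason the order-preserving property holds only under some conditions.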
We will often assume that LB-UB pairs are used to keep the sum of
outputs from a network within a given range. This same mechanism also
goes far towards eliminating the twin perils of uniform saturation and
uniform silence which can easily arise in mutual inhibition networks. Thus
we will often be able to reason about the computation of a network
assuming that it stays active and bounded.
Stable Coalitions
For a massively parallel system to actually make a decision (or do
something), there will have to be states in which some activity strongly dominates.
Such stable, connected, high confidence units are termed stable coalitions.
A stable coalition is our architecturally-biased term for the psychological
notions of percept, action, etc. We have shown some simple instances of
stable coalitions, in Figure 5b and the WTA network. In the depth networks
of Figures 9 and 10, a stable coalition would be three units representing
consistent values of retinal size, depth, and physical size. But the general idea is
that a very large complex subsystem must stabilize, e.g., to a fixed
interpretation of visual input, as in Figure 1. The way we believe this to happen is
through mutually reinforcing coalitions which dominate all rival activity
when the decision is required. The simplest case of this is Figure 5b, where
the two units A and C form a coalition which suppresses B and D. Formally,
a coalition will be called stable when the output of all its members is
non-decreasing. Notice that a coalition is not a particular anatomical structure,
but an instantaneously mutually reinforcing set of units, in the spirit of
Hebb's cell assemblies (Jusczyk & Klein, 1980).
What can we say about the conditions under which coalitions will
become and remain stable? We will begin informally with an almost trivial
condition. Consider a set of units {a, b, ...} which we wish to examine as a
possible coalition, $\pi$. For now, we assume that the units in $\pi$ are all p-units
and are in the non-saturated range and have no decay. Thus for each u in $\pi$,
    $p(u) \leftarrow p(u) + \mathrm{Exc} - \mathrm{Inh}$,
where Exc is the weighted sum of excitatory inputs and Inh is the weighted
sum of inhibitory inputs. Now suppose that $\mathrm{Exc}|\pi$, the excitation from the
coalition $\pi$ only, were greater than INH, the largest possible inhibition
receivable by u, for each unit u in $\pi$, i.e.,
    (SC) $\forall u \in \pi;\ \mathrm{Exc}|\pi > \mathrm{INH}$
Then it follows that
    $\forall u \in \pi;\ p(u) \leftarrow p(u) + \delta$ where $\delta > 0$.
That is, the potential of every unit in the coalition will increase. This is not
only true instantaneously, but remains true as long as nothing external
changes (we are ignoring state change, saturation, and decay). This is
because $\mathrm{Exc}|\pi$ continues to increase as the potential of the members of
$\pi$ increases. Taking saturation into account adds no new problems; if all of
the units in $\pi$ are saturated, the change, $\delta$, will be zero, but the
coalition will remain stable.
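Condition (SC) can be checked mechanically from a description of the connections. The sketch below is illustrative and assumes a representation that is not in the text: `weights[u][s]` is the signed weight of the connection from unit s to unit u, `outputs[s]` is the current output of s, and the largest possible inhibition INH is every inhibitory weight multiplied by the maximum output value 9.

```python
V_MAX = 9  # largest output value used in the paper's examples


def satisfies_sc(coalition, weights, outputs):
    """True if, for every u in the coalition, Exc from the coalition exceeds INH."""
    for u in coalition:
        incoming = weights.get(u, {})
        # Excitation received from coalition members only.
        exc_pi = sum(
            w * outputs[s] for s, w in incoming.items() if w > 0 and s in coalition
        )
        # Largest inhibition u could ever receive: all rivals at full output.
        inh_max = sum(-w * V_MAX for w in incoming.values() if w < 0)
        if not exc_pi > inh_max:
            return False
    return True


outputs = {"A": 6, "B": 5, "C": 6, "D": 5}
strong = {"A": {"C": 1.0, "B": -0.5}, "C": {"A": 1.0, "D": -0.5}}
weak = {"A": {"C": 0.5, "B": -0.5}, "C": {"A": 0.5, "D": -0.5}}
print(satisfies_sc({"A", "C"}, strong, outputs))
print(satisfies_sc({"A", "C"}, weak, outputs))
```

The second case uses equal excitatory and inhibitory weights of .5 and fails the test even though such a coalition may still win in practice, which illustrates that (SC) is a sufficient condition and not a necessary one.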
The condition that the excitation from other coalition members alone,
$\mathrm{Exc}|\pi$, be greater than any possible inhibition INH for each unit may
appear to be too strong to be useful. It is certainly true that coalitions can be
stable without condition (SC) being met. The condition (SC) is useful for
model building because it may be relatively easy to establish. Notice that
INH is directly computable from the description of the unit; it is the largest
negative weighted sum possible. If inhibition in our networks is mutual, the
upper-bound possible after a fixed time $\tau$, $\mathrm{INH}_\tau$, will depend on the
current value of potential in each unit u. The simplest case of this is when two
units are "deadly rivals": each gets all its inhibition from the other. In such
cases, it may well be feasible to show that after some time $\tau$, the stable
coalition condition will hold (in the absence of decay, fatigue, and changes
external to the network). Often, it will be enough to show that the coalition has a
stable "frontier," the set of units with outputs to some system under
investigation.
There are a number of interesting properties of the stable coalition
principle. First notice that it does not prohibit multiple stable coalitions nor
single coalitions which contain units which mutually inhibit one another
(although excessive mutual inhibition is precluded). If the units in the
coalition had non-zero decay, the coalition excitation $\mathrm{Exc}|\pi$ would have to
exceed both INH and decay for the coalition to be stable. We suppose that a
stable coalition yields control when its input elements change (fatigue and
explicit resets are also feasible). To model coalitions with changeable inputs,
we add boundary elements, which also have external "Input" and thus
whose condition for being part of a stable coalition, $\pi$, would be:
    $\mathrm{Exc}|\pi + \mathrm{Input} > \mathrm{INH}$.
This kind of unit could disrupt the coalition if its Input went too low. The
mathematical analysis of CM networks and stable coalitions continues to be
a problem of interest. We have achieved some understanding of special
cases (Feldman & Ballard, 1982) and these results have been useful in
designing CM too complex to analyze in closed form.
4. CONSERVING CONNECTIONS
It is currently estimated that there are about $10^{11}$ neurons and $10^{15}$
connections in the human brain and that each neuron receives input from about
$10^3$-$10^4$ other neurons. These numbers are quite large, but not so large as to
present no problems for connectionist theories. It is also important to
remember that neurons are not switching devices; the same signal is
propagated along all of the outgoing branches. For example, suppose some model
called for a separate, dedicated path between all possible pairs of units in
two layers of size N. It is easy to show that this requires $N^2$ intermediate
sites. This means, for example, that there are not enough neurons in the
brain to provide such a cross-bar switch for substructures of a million
elements each. Similarly, there are not enough neurons to provide one to
represent each complex object at every position, orientation, and scale of
visual space. Although the development of connectionist models is in its
perinatal period, we have been able to accumulate a number of ideas on
how some of the required computations can be carried out without excessive
resource requirements. Five of the most important of these are described
below: (1) functional decomposition; (2) limited precision computation; (3)
coarse and coarse-fine coding; (4) tuning; and (5) spatial coherence.
Functional Decomposition
When the number of variables in the function becomes large, the fan-in or
number of input connections could become unrealistically large. For
example, the function t = f(u, v, w, x, y, z) implemented with 100 values of t,
when each of its arguments can have 100 distinct values, would require an
average number of inputs per unit of $10^{12}/10^2$, or $10^{10}$. However, there are
simple ways of trading units for connections. One is to replicate the number
of units with each value. This is a good solution when the inputs can be
partitioned in some natural way as in the vision examples in the next section. A
more powerful technique is to use intermediate units when the computation
can be decomposed in some way. For example, if
$f(u, v, w, x, y, z) = g(u, v) \circ h(w, x, y, z)$, where $\circ$ is some
composition, then separate networks of value units for f(g, h), g(u, v), and
h(w, x, y, z) can be used. The outputs from the g
and h units can be combined in conjunctive connections according to the
composition operator $\circ$ in a third network representing f. An example is the
case of word recognition. Letter-feature units would have to connect to
vastly more word units without the imposition of the intermediate level of
letter units. The letter units limit the ways letter-feature units can appear in
a word.
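The saving from decomposition can be checked with a few lines of arithmetic. The sketch assumes, beyond what the text states, that the intermediate functions g and h are each represented with 100 value units, and that every combination of argument values makes exactly one conjunctive connection. The function name is hypothetical.

```python
def average_fan_in(num_args, values_per_arg, num_outputs):
    """Average conjunctive inputs per output unit for a full function table."""
    return values_per_arg ** num_args // num_outputs


direct = average_fan_in(6, 100, 100)   # t = f(u, v, w, x, y, z)
g_net = average_fan_in(2, 100, 100)    # g(u, v), assumed to take 100 values
h_net = average_fan_in(4, 100, 100)    # h(w, x, y, z), assumed to take 100 values
f_net = average_fan_in(2, 100, 100)    # f(g, h)

print(direct, g_net, h_net, f_net)
```

Under these assumptions the worst fan-in drops from $10^{10}$ to $10^6$ (in the h network); decomposing h further in the same way would reduce it again.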
Limited Precision Computation
In the multiplication example z = xy, the number of z units required is pro-
portional to NxN, even when redundant value units are eliminated, and in
general the number of units could grow exponentially with the number of
arguments. However, there are several refinements which can drastically
reduce the number of required units. One way to do this is to fix the number
of units at the precision required for the computation. Figure 14 shows the
network of Figure 7 modified when less computational accuracy is required.
Figure 14. Modified Multiplication Table using Less Units.
This is the same principle that is incorporated in integer calculations in
a sequential computer: computations are rounded to within the machine's
accuracy. Accuracy is related to the number of bits and the number
representation. The main difference is that since the sequential computer is
general purpose, the number representations are conservative, involving
large numbers of bits. The neural units need only represent sufficient
accuracy for the problem at hand. This will generally vary from network to
network, and may involve very inhomogeneous, special purpose number
representations.
Coarse and Coarse-Fine Coding
Coarse coding is a general technical device for reducing the number of units
needed to represent a range of values with some fixed precision, due to Hin-
ton (1980). As Figure 15a suggests, one can represent a more precise value
as the simultaneous activation of several (here 3) overlapping coarse-valued
units. In general, D simultaneous activations of coarse cells of diameter D
precise units suffice. For a parameter space of dimension k, a range of F
values can be captured by only $F^k / D^{k-1}$ units rather than $F^k$ in the naive
method. The coarse coding trick and the related coarse-fine trick to be
described next both depend on the input at any given time being sparse
relative to the set of all values expressible by the network.
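A minimal sketch of coarse coding follows. It assumes one particular layout that the text does not specify: D arrays of coarse cells of width D, with array i shifted diagonally by i fine steps. The function names are hypothetical. A single input activates one cell per array, and intersecting those D fields recovers the fine value.

```python
from itertools import product


def coarse_code(point, d):
    """Active coarse units for a point: one (array index, cell) pair per array."""
    return {(i, tuple((c + i) // d for c in point)) for i in range(d)}


def decode(active, f, k, d):
    """Brute-force decode: all fine values in range(f)^k whose code matches."""
    return [p for p in product(range(f), repeat=k) if coarse_code(p, d) == active]


def unit_counts(f, k, d):
    """Naive unit count F^k and the coarse-coding estimate F^k / D^(k-1)."""
    return f ** k, f ** k // d ** (k - 1)


if __name__ == "__main__":
    code = coarse_code((4, 7), 3)
    print(sorted(code))            # three active coarse units
    print(decode(code, 9, 2, 3))   # the single fine value they jointly encode
    print(unit_counts(9, 2, 3))
```

The decode step only works for a single (sparse) input; with several simultaneous inputs the active sets merge, which is the sparseness condition stated above.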
The coarse-fine coding technique is useful when the space of values to
be represented has a natural structure which can be exploited. Suppose a set
of units represents a vector parameter v which can be thought of as
partitioned into two components (r,s). Suppose further that the number of units
required to represent the subspace r is $N_r$ and that required to represent s is
$N_s$. Then the number of units required to represent v is $N_r N_s$. It is easy to
construct examples in vision where the product $N_r N_s$ is too close to the upper
bound of $10^{11}$ units to be realistic. Consider the case of trihedral (v) vertices,
an important visual cue. Three angles and two position coordinates are
necessary to uniquely define every possible trihedral vertex. (Two angles
define the types of vertex (arrow, y-joint); the third specifies the rotation of
the joint in space.) If we use 5 degree angle sensitivity and $10^5$ spatial sample
points, the number of units is given by $N_r = 3.6 \times 10^5$ and $N_s = 10^5$ so that
$N_r N_s = 3.6 \times 10^{10}$. How can we achieve the required representation accuracy
with fewer units?
In many instances, one can take advantage of the fact that the actual
occurrence of parameters is sparse. In terms of trihedral vertices, one
assumes that in an image, such vertices will rarely occur in tight spatial
clusters. (If they do, they cannot be resolved as individuals simultaneously.)
Given that simultaneous proximal values of parameters are unlikely, they
can be represented accurately for other computations, without excessive
cost.
The solution is to decompose the space v into two subspaces, r and s,
each with unilaterally reduced resolution.
Instead of $N_r N_s$ units, we represent v with two spaces, one with $N_{r'} N_s$
units where $N_{r'} \ll N_r$ and another with $N_r N_{s'}$ units where $N_{s'} \ll N_s$.
To illustrate this technique with the example of trihedral vertices we
choose
$N_{r'} = 0.01 N_r$ and $N_{s'} = 0.01 N_s$.
Thus the dimensions of the two sets of units are:
$N_{r'} N_s = 3.6 \times 10^8$
and
$N_r N_{s'} = 3.6 \times 10^8$.
The choices result in one set of units which accurately represent the angle
measurements and fire for a specific trihedral vertex anywhere in a fairly
broad visual region, and another set of units which fire only if a general
trihedral vertex is present at the precise position. The coarse-fine technique
can be viewed as replacing the square coarse-valued covering in Figure 15a
with rectangular (multi-dimensional) coverings, like those shown in Figure
16. In terms of our value units, the coarse-fine representation of trihedral
vertices is shown in Figure 15b.
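The arithmetic of the trihedral-vertex example can be checked with a few lines. The unit counts and the 0.01 reduction are the figures quoted in the text; the function name and the integer `factor` argument (100, standing for the 0.01 reduction) are illustrative assumptions.

```python
def coarse_fine_units(n_r, n_s, factor):
    """Unit counts for the full product space and the two reduced spaces.

    n_r, n_s: units needed for the angle and position subspaces.
    factor:   resolution reduction applied to one subspace at a time
              (100 corresponds to the 0.01 reduction used in the text).
    """
    full = n_r * n_s
    coarse_angle_fine_position = (n_r // factor) * n_s
    fine_angle_coarse_position = n_r * (n_s // factor)
    return full, coarse_angle_fine_position, fine_angle_coarse_position


if __name__ == "__main__":
    full, a, b = coarse_fine_units(360_000, 100_000, 100)
    print(f"full: {full:.1e}  reduced: {a:.1e} + {b:.1e}  saving: {full // (a + b)}x")
```

Under these figures the two reduced spaces together need 7.2 x 10^8 units, a fifty-fold saving over the 3.6 x 10^10 units of the full product space.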
Figure 15a. Coarse coding example. In a two-dimensional measurement space, the
presence of a measurement can be encoded by making a single unit in the fine resolution
space have a high confidence value. The same measurement can be encoded by making
overlapping coarse units in three distinct coarse arrays have high confidence values.
Figure 15b. Coarse-angle, fine-position and coarse-position, fine-angle units combine to
yield precise values of all five parameters.
If the trihedral angle enters into another relation, say $R(v, \alpha)$, where
both its angle and position are required accurately, one conjunctively
connects pairs of appropriate units from each of the reduced resolution spaces
to appropriate R-units. The conjunctive connection represents the
intersection of each of its components' fields. Essentially the same mechanism will
suffice for conjoining (e.g.) accurate color with coarse velocity information.
An important limitation of these techniques, however, is that the
input must be sparse. If inputs are too closely spaced, "ghost" firings will
occur. In Figure 16, two sets of overlapping fields are shown, each with
unilaterally reduced resolution. Actual input at points A and B will produce an
erroneous indication of an input at C, in addition to the correct signals. The
sparseness requirement has been shown to be satisfied in a number of
experiments with visual data (Ballard & Kimball, 1981a, 1981b; Ballard &
Sabbah, 1981).
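The ghost effect can be reproduced in a toy two-dimensional setting. The sketch assumes one set of units that is coarse in x and fine in y, and a second set that is fine in x and coarse in y, with coarse cells of width d. The names are hypothetical and the layout is only meant to mirror the situation described for Figure 16.

```python
def encode(points, d):
    """Active units in the two unilaterally reduced-resolution spaces."""
    coarse_x = {(x // d, y) for x, y in points}   # coarse x, fine y
    coarse_y = {(x, y // d) for x, y in points}   # fine x, coarse y
    return coarse_x, coarse_y


def decode(coarse_x, coarse_y, d):
    """Fine (x, y) values signalled by conjoining one unit from each space."""
    return {(x, y)
            for (x, cy) in coarse_y
            for (cx, y) in coarse_x
            if x // d == cx and y // d == cy}


def ghosts(points, d):
    """Decoded values that do not correspond to any actual input."""
    return decode(*encode(points, d), d) - set(points)


if __name__ == "__main__":
    print(ghosts([(1, 2), (2, 1)], 4))    # close inputs: ghost firings appear
    print(ghosts([(1, 2), (9, 10)], 4))   # sparse inputs: none
```

Inputs that fall within one coarse field in both dimensions produce spurious conjunctions; inputs separated by more than a coarse field do not.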
The resolution device involves a units/connections tradeoff, but in
general, the tradeoff is attractive. To see this, consider a unit that receives
input from a network representing a vector parameter v. If n is the number
of places where the output is used, and conjunctive connections are used to
conjoin the D firing units, then Dn synapses are required. Thus if A is the
number of non-coarse coded units to achieve a given acuity, then coarse
coding is attractive when $A / D^{k-1} > Dn$, assuming connections and units are
equally scarce. This result is optimistic in that, when other uses of
conjunctive connections are taken into account, the number of conjunctive units
could be unrealistically large.
Figure 16. Inputs at A & B cause ghosts at C & D.
Tuning
The idea of tuning further exploits networks composed of coarsely- and
finely-grained units. Suppose there are n fine resolution units for a feature A
and n fine resolution units for a feature B. To have explicit units for feature
values AB, $n^2$ units would be required. This is an untenable solution for
large feature spaces (the number of units grows exponentially with the
number of features), so alternatives must be sought. One solution to this
problem is to vary the grain of the AB units so that they are only coarsely
represented. This solution has its attendant disadvantages in that separate
stimuli within the limits of the coarse resolution grain cannot be
distinguished. Also, a set of weak stimuli can be misinterpreted. A better solution
is to have a coarse unit that would respond only to a single saturated unit
within its input range. In that way a collection of weak inputs is not
misinterpreted.
This situation can be achieved by having the units in each finely-tuned
network that are in the field of a coarse unit laterally inhibit each other,
e.g., in the WTA network of Figure 5a. The outputs of these individual
feature units then form disjunctive connections with appropriate coarse
resolution multiple feature units. If m is the grain of the coarse resolution
units along each feature dimension, the number of disjunctions per
coarse unit is $(n/m)^2$. The result of this connection strategy is that a coarse
unit responds with a strength that varies as the strengths of the largest
maximum in the subnetwork of each of the finely-tuned units that correspond to
its field. The response of a coarse-tuned unit is the maximum of the sums of
the conjunctive inputs from the finely tuned units which connect to it. In
terms of Figure 15, a tuned coarse-angle cell would respond only to one
high-confidence pair of angles in its range, and not to several weak ones
(which couldn't correctly appear all at one position). This is a better
property than just having unstructured coarse units and it will be exploited in the
next section, when we deal with perceiving complex objects.
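The difference between an unstructured coarse unit and a tuned one can be sketched as follows. This is a simplification under stated assumptions: activities are numbers in [0, 1], `max` stands in for the effect of lateral inhibition among the fine units, and the firing threshold of 0.9 and all names are hypothetical.

```python
def untuned_response(fine_activities):
    """Unstructured coarse unit: simply sums everything in its field."""
    return sum(fine_activities)


def tuned_response(fine_activities):
    """Tuned coarse unit: follows the strongest single fine unit in its field.

    Assumption: max() approximates the outcome of a winner-take-all
    competition among the fine units.
    """
    return max(fine_activities)


def fires(response, threshold=0.9):
    """Hypothetical firing criterion used only for this illustration."""
    return response >= threshold


if __name__ == "__main__":
    weak = [0.3, 0.3, 0.3, 0.3]      # several weak stimuli in one coarse field
    strong = [0.0, 1.0, 0.0, 0.0]    # one saturated fine unit
    for name, field in (("weak", weak), ("strong", strong)):
        print(name, fires(untuned_response(field)), fires(tuned_response(field)))
```

The summing unit is fooled by the collection of weak inputs, whereas the tuned unit responds only when some single fine unit is near saturation.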
Spatial Coherence
The most serious problem which requires conserving connections is the
representation of complex concepts. The obvious way of representing
concepts (sets of properties) is to dedicate a separate unit to each conjunction of
features. In fact, it first appears that one would need a separate unit for each
combination at each location in the visual field. We will present here a
simple way around the problem of separate units for each location and deal
with the more general problem in the next section.
The basic problem can be readily seen in the example of Figure 17.
Suppose there were one unit each for finally recognizing concepts like
colored circles and squares. Now consider the case when a red circle (at x = 7)
and a blue square (at x = 11) simultaneously appear in the visual field. If the
various "colored figure" units simply summed their inputs, the incorrect
"blue circle" unit would see two active inputs, just like the correct "red
circle" and "blue square" units. This problem is known as cross-talk, and is
always a potential hazard in CM networks. The solution presented in Figure
17 is quite general. Each unit is assumed to have a separate conjunctive
connection site for each position of the visual field. In our example, the correct
units get dual inputs to a single site (and are activated) while the partially
matched units receive separated inputs and are not activated. Only sets of
properties which are spatially coherent can serve to activate concept units.
This example was meant to show how spatial coherence could be used with
conjunctive connections to eliminate cross-talk. There are a number of
additional ways of using spatial coherence, each of which involves different
tradeoffs. These are discussed in the next section, which considers some
sample applications in more detail.
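The red-circle, blue-square example can be written out directly. In this sketch a scene is a mapping from position to the set of properties detected there, and a concept is a set of required properties; this data layout and the function names are assumptions made for illustration.

```python
def summed_unit(concept, scene):
    """Concept unit that pools property inputs with no regard to position."""
    present = set().union(*scene.values())
    return concept <= present


def coherent_unit(concept, scene):
    """Concept unit with one conjunctive site per position of the visual field."""
    return any(concept <= properties for properties in scene.values())


if __name__ == "__main__":
    scene = {7: {"red", "circle"}, 11: {"blue", "square"}}
    for concept in ({"red", "circle"}, {"blue", "square"}, {"blue", "circle"}):
        print(sorted(concept), summed_unit(concept, scene), coherent_unit(concept, scene))
```

The pooled unit reports a blue circle that is not in the scene (cross-talk), while the unit with position-specific conjunctive sites responds only to properties that coincide at one location.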
Figure 17. Spatial coherence on inputs can represent complex concepts without cross-talk.
Solid lines show active inputs and dashed lines (some of the) inactive inputs.
5. APPLICATIONS
This section illustrates the power of the CM paradigm via two groups of ex-
amples. The first shows how the various techniques for conserving connec-
tions can be used in an idealized form of perception of a complex object.
Here the point is that an object has multiple features which are computed in
parallel via the transform methodology. The second group of examples
starts with a relatively simple problem, that of vergence eye movements, to
illustrate motor control using value units. In this example, control is imme-
diate; a visual signal produces an instantaneous output (within the settling
time constants of the units). Extensions of this idea use space as a buffer for
time. For motor output, space allows the incorporation of more complex
motor commands. For speech input, spatial buffering allows for phoneme
recognition based on subsequent information.
These examples were chosen to show that CM can provide a unified
representation for both perception and motor control. This is important
since an animal is hardly ever passively responding to its environment. In-
stead, it seems involved in what Arbib has called a perception-action cycle
(Arbib, 1979). Perceptions result in actions which in turn cause new percep-
tions, and so on. Massive parallelism changes the way the perception-action
cycle is viewed. In the traditional view, one would convert the input to a lan-
guage which uses variables, and then use these variables to direct motor
commands. CM suggests that we think of accomplishing the same actions
via a transformation: sensory input is transformed to (connected to) abstract
representational units, which in turn are transformed to (connected to)
motor units. This will obviously work for reflex actions. The examples are
intended to suggest how more flexible command and control structures can
also be represented by systems of value units.
Object Recognition
The examples of Figures 1 and 6 are representative of the problem of gestalt
perception: that of seeing parts of an image as a single percept (object). An
"object" is indicated by the "simultaneous" appearance of a number of
"visual features" in the correct relative spatial positions. In any realistic
case, this will involve a variety of features at several different levels of
abstraction and complex interaction among them. A comprehensive model
of this process would be a prototype theory of visual perception and is well
beyond the scope of this paper. What we will do here is consider the pre-
requisite task of constructing CM solutions to the problems of detecting
non-punctate visual features and of forming sets of the features which could
help characterize a percept. We will refer throughout to the prototype
problem of detecting Fred's frisbee, which is known to be round, baby-blue, and
moving fairly fast. The development suppresses many important issues such
as hierarchical descriptions, perspective, occlusion, and the integration of
separate fixations, not to mention learning. A brief discussion of how these
might be tackled follows the technical material.
The first problem is to develop a general CM technique for detecting
features and properties of images, given that these features are not usually
detectable at a single point in some retinotopic map. The basic idea is to
find parameters which characterize the feature in question and connect each
retinotopic detector to the parameter values consistent with its detectand.
Consider the problem of detecting lines in an image from short edge
segments. Different lines can be represented by units having different
discrete parameter values, e.g. in the line equation $\rho = x\cos\theta + y\sin\theta$, the
parameters are $\rho$ and $\theta$. Thus edge units at $(x, y, \theta)$ could be connected to
appropriate line units. Note that this example is analogous to the word
recognition example (Fig. 1). Edges are analogous to letters and lines to words. As
in the words-letter example, "top-down" connections allow the existence of
a line to raise the confidence of a local edge. In our line detection example,
lines in the image are high potential (confidence) units in a slope-intercept
$(\theta, \rho)$ parameter space. High confidence edge units produce high confidence
line units by virtue of the network connectivity. This general way of describ-
ing this relationship between parts of an image (e.g., edges) and the
associated parameters (e.g., $\rho, \theta$ for a line) is a connectionist interpretation
of the Hough transform (Duda & Hart, 1972). Since each parameter value is
determined by a large number of inputs, the method is inherently noise-
resistant and was invented for this purpose. A Hough transform network
for circles (like Fred's frisbee) would involve one parameter for size plus
two for spatial location, and exactly this method has been used for tumor
detection in chest radiographs (Kimme et al., 1975). Notice that the circle
parameter space is itself retinotopic in that the centers of circles have
specified locations; this will be important in registering multiple features.
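As a conventional (serial) illustration of the line case, the sketch below lets every edge point vote for each quantized line consistent with it, using the normal form of the line equation given above. It ignores edge orientation, so each point votes at every angle; the quantization parameters and the function name are assumptions.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes for (theta index, rho index) line units.

    Each point (x, y) votes for every quantized line
    rho = x*cos(theta) + y*sin(theta) that passes through it.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes


if __name__ == "__main__":
    # Five edge points on the horizontal line y = 3.
    edge_points = [(0, 3), (10, 3), (20, 3), (30, 3), (40, 3)]
    print(hough_lines(edge_points).most_common(1))
```

In the connectionist reading, each `votes[...]` entry corresponds to a line unit and each increment to an active excitatory link from an edge unit; the unit for theta = 90 degrees, rho = 3 collects input from all five points.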
The Hough transform is a formalism for specifying excitatory links
between units. The general requirements are that part of an image
representation can be represented by a parameter vector a in an image space A and a
feature can be represented by a vector b which is an element of a feature
space B. Physical constraints $f(a, b) = 0$ relate a and b. The space A
represents spatially indexed units, and each individual element $a_k$ is only
consistent with certain elements in the space B, owing to the constraint
imposed by the relation f. Thus for each $a_k$ it is possible to compute the set
$$B_k = \{\, b \mid a_k \text{ and } f(a_k, b) \le \delta_b \,\}$$
where $B_k$ is the set of units in the feature space network B that the $a_k$ unit
must connect to, and the constant $\delta_b$ is related to the quantization in the
space B. Let H(b) be the number of active connections the value unit b
receives from units in A. H(b) is the number of image measurements which
are consistent with the parameter value b. The potential of units in B is
given by $p(b) = H(b) / \sum_b H(b)$. The value p(b) can stand for the confidence
that a segment with feature value b is present in the image. If the
measurement represented by a is realized as groups of units, e.g., $a = (a_1, a_2)$, then
conjunctive connections are required to implement the constraint relation.
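The general formalism can be sketched for a constraint other than lines. The example below uses circles of a known radius, so that a is an edge point and b a candidate center; this choice, the use of the absolute value of f as the residual, and all names are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter
from itertools import product


def compatible_units(a, feature_space, f, delta):
    """B_k: the feature units b whose residual |f(a, b)| is within delta."""
    return [b for b in feature_space if abs(f(a, b)) <= delta]


def confidences(measurements, feature_space, f, delta):
    """p(b) = H(b) / sum_b H(b), where H(b) counts consistent measurements."""
    h = Counter()
    for a in measurements:
        h.update(compatible_units(a, feature_space, f, delta))
    total = sum(h.values())
    return {b: count / total for b, count in h.items()} if total else {}


def circle_constraint(a, b, radius=5):
    """f(a, b) = 0 when point a lies on the circle of given radius centered at b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 - radius ** 2


if __name__ == "__main__":
    centers = list(product(range(20), repeat=2))
    edge_points = [(15, 10), (5, 10), (10, 15), (10, 5), (13, 14), (14, 13)]
    p = confidences(edge_points, centers, circle_constraint, delta=0)
    best = max(p, key=p.get)
    print(best, round(p[best], 3))
```

Each measurement is consistent with many centers, but only the true center accumulates a vote from every measurement, which is why isolated high-confidence units emerge.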
Implementing these networks often results in a set of very sparsely
distributed high-confidence feature space units. In implementations of the
line detection example, only approximately 1% of the units have maximum
confidence values. This figure is also typical of other modalities. In general,
each $a_k$ and the relationship f will not determine a single unit in $B_k$ as in the
line detection example, but there still will be isolated high-confidence units.
Figure 1 shows why this is the case: different $a_k$ letter-feature units connect
to common units in the letter space B.
We have found that parameter spaces combine with the growing body
of knowledge on specific physical constraints to provide a powerful and
robust model for the simultaneous computation of invariant object
properties such as reflectance, curvature, and relative motion (Ballard, 1981).
Of course segmentation must involve ways of associating peaks in
several different feature spaces, and methods for doing this are discussed
presently, but the cornerstone of the techniques is high-confidence units in
the individual-modality feature spaces. In extending the single feature case
to multiple features, the most serious problem is the immense size of the
cross product of the spatial dimensions with those of interesting features
such as color, velocity, and texture. Thus to explain how image-like input
such as color and optical flow are related to abstract objects such as "a
blue, fast-moving thing," it becomes necessary to use all the techniques of
the previous sections.
Even if we assume that there is a special unit for recognizing images of
Fred's frisbee, it cannot be the case that there is a separate one of these units
for each point in the visual field. One weak solution to this kind of problem
was given in Figure 17 of the last section. There could conceivably be a
separate 3-way conjunctive connection on the Fred's frisbee unit for each
position in space. Activation of one conjunct would require the
simultaneous activation of circle, baby-blue, and fairly-fast in the same part of the
visual field. The solution style with separate conjunctions for every point in
space becomes increasingly implausible as we consider more complex
objects with hierarchical and multiple descriptions. The spatially registered
conjunctions would have to be preserved throughout the structure.
The problem of going from a set of descriptors (features) to the object
which is the best match to the set is known in artificial intelligence as the
indexing problem. The feature set is viewed as an index (as in a data base).
There have been several proposed parallel hierarchical network solutions to
the indexing problem (Fahlman, 1979; Hillis, 1981) and these can be mapped
into CM terms. But these designs assume that the network is presented with
sets of descriptors which are already partitioned; precisely the vision
problem we are trying to solve. There are three additional mechanisms that seem
to be necessary, two of which have already been discussed. Coarse coding
and tuning (as discussed in Section 4) make it much less costly to represent
conjunctions. In addition, some general concepts (e.g., blue frisbee) might
be indexed more efficiently through less precise units. The new idea is an
extension of spatial coherence that exploits the fact that the networks respond
to activity that occurs together in time. If there were a way to focus the
activity of the network on one area at a time, only properties detected in that
area would compete to index objects.
Figure 18. Spatial focus unit can gate only input from attended positions.
The obvious way to focus attention on one area of the visual field is
with eye movements, but there is evidence that focus can also be done
within a fixation. The general idea of internal spatial focus is shown in
Figure 18. In this network, the general "baby-blue" unit is configured to
have separate conjunctive inputs for each point in space, like the
blue-square units of Figure 17. The difference is that the second input to the
conjunction comes from a "focus" unit, and this makes a much more general
network. The idea of making a unit (e.g., baby blue) more responsive to
inputs from a given spatial position can be implemented in different ways.
The conjunctive connection at the x = 7 lobe of the baby-blue unit is the
most direct way. But treating this conjunct as a strict AND would mean that
all spatial units would have to be active when there was no focus. An
alternative would be to have the "focus on 7" unit boost the output of the
"baby blue at 7" unit (and all of its rivals) as shown by the dashed line; this
would eliminate the need for separate spatial conjunctions on the baby-blue
unit, but would alter the potential of all the units at the position being
attended. The tradeoffs become even trickier when goal-directed input is
taken into account, but both methods have the same effect on indexing. If
the system has its attention directed only to x = 7, then the only feature units
activated at all will be those whose local representatives are dominant (in
their WTA) at x = 7. In such a case, there would be a time when the only
concept units active in the entire network would be those for x = 7. This
does not "solve" the problem of identifying objects in a visual scene, but it
does suggest that sequentially focusing attention on separate places can help
significantly. There is considerable reason to suppose (Posner, 1978;
Triesman, 1980) that people do this even in tasks without eye movement.
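The gating role of a focus unit can be sketched in a few lines. The data layout (position mapped to property activities), the treatment of "no focus" as attending everywhere, and the function name are assumptions; the sketch covers only the conjunctive-gating variant, not the alternative in which the focus unit boosts the local units.

```python
def property_activity(local, focus=None):
    """Activity of space-independent property units.

    local: {position: {property: activity}} from position-specific units.
    focus: attended position, or None for no spatial focus (assumed here
           to let every position through).
    """
    totals = {}
    for position, properties in local.items():
        if focus is not None and position != focus:
            continue  # conjunct with the focus unit is not satisfied
        for name, activity in properties.items():
            totals[name] = max(totals.get(name, 0.0), activity)
    return totals


if __name__ == "__main__":
    scene = {
        7: {"baby-blue": 0.9, "circle": 0.8, "fairly-fast": 0.7},
        11: {"red": 0.9, "square": 0.8},
    }
    print(sorted(property_activity(scene)))           # everything competes
    print(sorted(property_activity(scene, focus=7)))  # only properties at x = 7
```

With attention on x = 7, only the properties detected there remain active, so only they compete to index an object.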
There are other ways of looking at the network of Figure 18. Suppose
the system had reason to focus on some particular property (e.g.,
baby-blue). If we make bi-directional the links from "focus on x = 7" to
"baby-blue" and "baby-blue at 7," a nice possibility arises. The "focus on 7" unit
could have a conjunctive connection for each separate property at its
position. If, for example, baby-blue was chosen for focus and was the dominant
color at x = 7, then the "focus on x = 7" unit would dominate its rivals. This
suggests another way in which the recognition of complex objects could be
helped by spatial focus. Figure 19 depicts the fairly general situation.
In Figure 19, the units representing baby-blue, circular, and fairly-fast
are assumed to be for the entire visual field and moderately precise. The
dotted arrows to the "Fred's frisbee" node suggest that there might be more
levels of description in a realistic system. The spatial focus links involving
baby-blue are the same as in Figure 18, and are replicated for the other two
properties. Notice that the position-specific sensing units do not have their
potentials affected by spatial focus units, so that the sensed data can remain
intact. The network of Figure 19 can be used in several ways.
If attention has been focused on x = 7 for any reason, the various
space-independent units whose representatives are most active at x = 7 will
become most active, presumably leading to the activation (recognition) of
Fred's frisbee. If a top-down goal of looking for Fred's frisbee (or even just
something baby-blue) is active, then the "focus on x = 7" will tend to defeat
its WTA rivals, leading to the same result. A third possibility is a little more
complicated, but quite powerful. Suppose that a given image, even in
context, activates too many property units so that no objects are effectively
indexed. One strategy would be to systematically scan each area of the visual
field, eliminating confounding activity from other areas. But it is also
possible to be more efficient. If some property unit (say baby-blue) were strongly
activated, the network could focus attention on all the positions with that
property. In this case it is like putting a baby-blue filter in front of the
scene, and should often lead to better convergence in the networks for
shape, speed, etc.
One should compare the network of Figure 17 with Figures 18 and 19.
In the former, parallel co-existing concepts are possible if we assume
delicate arrangements of conjunctive connections. The latter networks are more
robust but use sequentiality to eliminate cross-talk.
Time and Sequence
Connectionist models do not initially appear to be well-suited to
representing changes with time. The network for computing some function can be
made quite fast, but it will be fixed in functionality. There are two quite
different aspects of time variability of connectionist structures. One is
time-varying responses, i.e., long-term modification of the networks (through
changing weights) and short-term changes in the behavior of a fixed
network with time. The second aspect is sequence: the problem of analyzing
inherently sequential input (such as speech) or producing inherently
sequential output (such as motor commands) with parallel models. The problem of
change will be deferred to (Feldman, 1981b). The problem of sequence is
discussed here.
There are a number of biologically suggested mechanisms for
changing the weight ($w_j$) of synaptic connections, but none of them are nearly
rapid enough to account for our ability to hear, read, or speak. The ability
to perceive a time-varying signal like speech or to integrate the images from
successive fixations must be achieved (according to our dogma) by some
dynamic (electrical) activity in the networks. As usual, we will present
computational solutions to the problems of sequence that appear to be
consistent with known structural and performance constraints. These are, again,
too crude to be taken literally but do suggest that connectionist models can
describe the phenomena.
Motor Control of the Eye. To see how the transform notion of
distributed units might work for motor control, we present a simplistic model
of vergence eye movements. (The same idea may be valid for fixations, but
control probably takes place at higher levels of abstraction.) In this model
retinotopic (spatial) units are connected directly to muscle control units.
Each retinotopic unit can, if saturated, cause the appropriate contraction so
that the new eye position is centered on that unit. When several retinotopic
units saturate, each enables a muscle control unit independently and the
muscle itself contracts an average amount.
Figure 19. Spatial focus and indexing.
Figure 20. Distributed Control of Eye Fixations.
Figure 20 shows the idea for a one-dimensional retina. For example,
with units at positions 2, 4, 5, and 6 saturated, the net result is that the
muscle is centered at 17/4 or 4.25. (This idea can be extended to the case where
the retinotopic units have overlapping fields.) This kind of organization
could be extended to more complex movement models such as that of the
organization of the superior colliculus in the monkey (Wurtz & Albano,
1980).
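The averaging in the one-dimensional example can be checked directly. The function name and the convention of returning `None` when no unit is saturated are assumptions for illustration.

```python
def eye_target(saturated_positions):
    """Eye position commanded when several retinotopic units saturate.

    Each saturated unit independently requests a contraction centered on its
    own position; the muscle is assumed to average the requests.
    """
    if not saturated_positions:
        return None  # no saturated unit, no command
    return sum(saturated_positions) / len(saturated_positions)


if __name__ == "__main__":
    print(eye_target([2, 4, 5, 6]))  # 17 / 4
```

For the positions quoted in the text (2, 4, 5, and 6) this gives 17/4 = 4.25.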
Notice that each retinotopic unit is capable of enabling different
muscle control units. The appropriate one is determined by the enabled x-origin
unit which inhibits commands to the inappropriate control units via
modifiers.
One problem with this simple network arises when disparate groups of
retinotopic units are saturated. The present configuration can send the eye
to an average position if the features are truly identical. The network can be
modified with additional connections so that only a single connected
component of saturated units is enabled by using additional object primitives. A
version of this WTA motor control idea has already been used in a computer
model of the frog tectum (Didday, 1976).
There are still many details to be worked out before this could be
considered a realistic model of vergence control, but it does illustrate the basic
idea: local spatially separate sensors have distinct, active connections which
could be averaged at the muscle for fine motor control or be fed to some
intermediate network for the control of more complex behaviors.

Converting Space to Time. Consider the problem of controlling a
simple physical motion, such as throwing a ball. It is not hard to imagine
that in a skilled motor performance unit-groups fire each other in a fixed
succession, leading to the motor sequence. The computational problem is
that there is a unique set of effector units (say at the spinal level) that must
receive input from each group at the right time. Figure 21a depicts a simple
case in which there are two effector units (e1, e2) that must be activated
alternately. The circles marked 1-4 represent units (or groups of units)
which activate their successor and inhibit their predecessor (cf. Delcomyn,
1980). The main point is that a succession of outputs to a single effector set
can be modeled as a sequence of time-exclusive groups representing
instantaneous coordinated signals. Moving from one time step to the next could
be controlled by pure timing for ballistic movements, or by a proprioceptive
feedback signal. There is, of course, an enormous amount more than this to
motor control, and realistic models would have to model force control,
ballistic movements, gravity compensation, etc.
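
The successor-activation and predecessor-inhibition scheme of Figure 21a can be sketched as a discrete-time toy in Python. This is an illustration only, not the authors' model: the function name `run_sequencer`, the start signal that enables group 1, and the wiring of odd-numbered groups to e1 and even-numbered groups to e2 are assumptions made for the example.

```python
def run_sequencer(n_groups=4):
    """Step through a chain of time-exclusive unit groups.

    Returns a list of (group number, effector) pairs, one per time step.
    """
    # Assumption: an external start signal has already enabled group 1.
    active = [True] + [False] * (n_groups - 1)
    trace = []
    while any(active):
        current = active.index(True)
        # Hypothetical wiring: odd-numbered groups drive e1, even ones e2.
        effector = "e1" if (current + 1) % 2 == 1 else "e2"
        trace.append((current + 1, effector))
        successor = current + 1
        if successor < n_groups:
            active[successor] = True  # the active group fires its successor
        active[current] = False       # the successor inhibits its predecessor
    return trace


print(run_sequencer())
```

Advancing on every loop iteration corresponds to the pure-timing (ballistic) case; gating the advance on an external flag would be one way to mimic the proprioceptive-feedback variant.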

The second part of Figure 21 depicts a somewhat fanciful notion of
how a variety of output sequences could share a collection of lower level
response units. The network shown has a single "Dixie" unit which can
start a sequence and which joins in conjunctive connections with each note
to specify its successor. At each time step, a WTA network decides what
note gets sounded. One can imagine adding the rhythm network and
transposition networks to other keys and to other modalities of output.

Figure 21. Mapping Space to Time. (a) Sequence and Suppression. (b) Whistling Dixie.

Converting Time to Space. The sequencer model for skilled movements
was greatly simplified by the assumption that the sequence of activities
was pre-wired. How could one (still crudely, of course) model a situation
like speech perception where there is a largely unpredictable time-varying
computation to be carried out? One solution is to combine the sequencer
model of Figure 21 with a simple vision-like scheme. We assume that speech
is recognized by being sequenced into a buffer of about the length of a
phrase and then is relaxed against context in the way described above for
vision. For simplicity, assume that there are two identical buffers, each
having a pervasive modifier (mj) innervation so that either one can be switched
into or out of its connections. We are particularly concerned with the
process of going from a sequence of potential phonetic features into an
interpreted phrase. Figure 22 gives an idea of how this might happen.

Figure 22. Mapping Time to Space.

Assume that there is a separate unit for each potential feature for each
time step up to the length of the buffer. The network which analyzes sound
is connected identically to each column, but conjunction allows only the
connections to the active column to transmit values. Under ideal
circumstances, at each time step exactly one feature unit would be active. A phrase
would then be laid out on the buffer like an image on the "mind's eye,"
and the analogous kind of relaxation cones (cf. Figure 16) involving
morphemes, words, etc., could be brought to bear. The more realistic case
where sounds are locally ambiguous presents no additional problems. We
assume that, at each time step, the various competing features get varying
activation. Diphone constraints could be captured by (+ or -) links to the
next column as suggested by Figure 22. The result is a multiple possibility
relaxation problem, again exactly like that in visual perception. The fact
that each potential feature could be assigned a row of units is essential to
this solution; we do not know how to make an analogous model for a
sequence of sounds which cannot be clearly categorized and combined. Recall
that the purpose of this example is to indicate how time-varying input could
be treated in connectionist models. The problem of actually laying out
detailed models for language skills is enormous and our example may or
may not be useful in its current form. Some of the considerations that arise
in distributed modeling of language skills are presented in Arbib and
Caplan (1979).
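
As a rough illustration of the buffer idea, the Python sketch below lays a stream of feature activations out in a feature-by-time array and applies one relaxation pass using signed links from each column to the next. Everything specific here is an assumption for the example: the names `fill_buffer` and `relax_once`, the `links` dictionary keyed by (previous feature, feature), the `rate` constant, and the simple additive update are not taken from the paper.

```python
def fill_buffer(feature_stream, n_features):
    """Lay time-varying input out in space: one column per time step."""
    n_steps = len(feature_stream)
    buffer = [[0.0] * n_steps for _ in range(n_features)]
    for t, activations in enumerate(feature_stream):
        # Only the column for the current time step receives values.
        for f, a in enumerate(activations):
            buffer[f][t] = a
    return buffer


def relax_once(buffer, links, rate=0.5):
    """One relaxation pass over the buffer.

    links[(g, f)] is a hypothetical signed weight from feature g in one
    column to feature f in the next column (a stand-in for a diphone link).
    """
    n_features, n_steps = len(buffer), len(buffer[0])
    new = [row[:] for row in buffer]
    for t in range(1, n_steps):
        for f in range(n_features):
            support = sum(links.get((g, f), 0.0) * buffer[g][t - 1]
                          for g in range(n_features))
            new[f][t] = max(0.0, buffer[f][t] + rate * support)
    return new


stream = [[1.0, 0.0], [0.5, 0.5]]          # second step is ambiguous
buf = fill_buffer(stream, n_features=2)
print(relax_once(buf, {(0, 1): 1.0, (0, 0): -1.0}))
```

In this toy run the ambiguous second column is resolved in favor of feature 1, because feature 0 in the first column supports it and suppresses a repeat of itself.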

CONCLUSIONS

The CM paradigm advanced in this paper has been applied successfully only
to relatively low-level tasks. There is no reason, as yet, to be confident that
an intermediate symbolic representation will not be required for modeling
higher cognitive processes. There is, however, the beginning of a collection
of efforts which can be interpreted as attempting CM approaches to higher
level tasks. These include work which explicitly uses parallelism in planning
(Stefik, 1981) and deduction, and work which incorporates more
connectionist architectural notions of value units (Forbus, 1981) and coarse coding
(Garvey, 1981).

We have now completed six years of intensive effort on the
development of connectionist models and their application to the description of
complex tasks. While we have only touched the surface, the results to date
are very encouraging. Somewhat to our surprise, we have yet to encounter a
challenge to the basic formulation. Our attempts to model in detail
particular computations (Ballard & Sabbah, 1981; Sabbah, 1981) have led to a
number of new insights (for us, at least) into these specific tasks. Attempts
like this one to formulate and solve general computational problems in
realistic connectionist terms have proven to be difficult, but less so than we
would have guessed. There appear to be a number of interesting technical
problems within the theory and a wide range of questions about brains and
behavior which might benefit from an approach along the lines suggested in
this paper.

APPENDIX: SUMMARY OF DEFINITIONS AND NOTATION

A unit is a computational entity comprising:

    $\{q\}$: a set of discrete states, fewer than 10
    $p$: a continuous value in $[-10, 10]$, called potential (accuracy of several digits)
    $v$: an output value, an integer with $0 \le v \le 9$
    $i$: a vector of inputs $i_1, \ldots, i_n$

and functions from old to new values of these

    $p \leftarrow f(i, p, q)$
    $q \leftarrow g(i, p, q)$
    $v \leftarrow h(i, p, q)$

which we assume to compute continuously. The form of the f, g, and h
functions will vary, but will generally be restricted to conditionals and
simple functions.

P-Units

For some applications, we will use a particularly simple kind of unit whose
output v is proportional to its potential p (rounded) (when p > 0) and which
has only one state. In other words

    $p \leftarrow p + \beta \sum_k w_k i_k$    $[0 \le w_k \le 1]$
    $v \leftarrow$ if $p > \theta$ then round$(p - \theta)$ else $0$    $[v = 0 \ldots 9]$

where $\beta$, $\theta$ are constants and $w_k$ are weights on the input values.
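
A single p-unit update can be written out directly from these two rules. The sketch below is a minimal illustration, assuming a discrete update step; the function name `p_unit_step` and the default values of `beta` and `theta` are invented for the example, and clamping p to [-10, 10] and v to 0...9 is an assumption read off the value ranges given in the unit definition. Python's built-in `round` rounds halves to the nearest even integer, which may differ from the rounding the authors had in mind.

```python
def p_unit_step(p, inputs, weights, beta=1.0, theta=0.0):
    """One update of a p-unit; returns (new potential, output value)."""
    # p <- p + beta * sum_k w_k * i_k
    p = p + beta * sum(w * i for w, i in zip(weights, inputs))
    # Assumption: the potential saturates at the ends of [-10, 10].
    p = max(-10.0, min(10.0, p))
    # v <- if p > theta then round(p - theta) else 0
    v = round(p - theta) if p > theta else 0
    # Assumption: the output saturates so that it stays in 0..9.
    v = max(0, min(9, v))
    return p, v


print(p_unit_step(0.0, [2, 4], [0.5, 0.25], beta=1.0, theta=1.0))
```

Here the weighted input sum is 2.0, so the potential rises to 2.0 and the output is round(2.0 - 1.0) = 1.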

Conjunctive Connections

In terms of our formalism, conjunctive connections could be described in a
variety of ways. One of the simplest is to define the potential in terms of
the maximum, e.g.,

    $p \leftarrow p + \beta \max(i_1 + i_2 - \varphi,\ i_3 + i_4 - \varphi,\ i_5 + i_6 - i_7 - \varphi)$

where $\beta$ is a scale constant as in the p-unit and $\varphi$ is a constant chosen (usually
> 10) to suppress noise and require the presence of multiple active inputs.
The minus sign associated with $i_7$ corresponds to its being an inhibitory
input. The max-of-sum unit is the continuous analog of a logical OR-of-AND
(disjunctive normal form) unit and we will sometimes use the latter as an
approximate version of the former. The OR-of-AND unit corresponding to
the above is:

    $p \leftarrow p + \alpha\ \mathrm{OR}(i_1 \,\&\, i_2,\ i_3 \,\&\, i_4,\ i_5 \,\&\, i_6 \,\&\, (\mathrm{not}\ i_7))$
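
The two forms can be compared side by side in a short Python sketch. It is an illustration under stated assumptions: the function names `max_of_sum_step` and `or_of_and_step`, the choice phi = 11, and the treatment of any nonzero input as logically true in the OR-of-AND version are all made up for the example.

```python
def max_of_sum_step(p, inputs, beta=1.0, phi=11.0):
    """Max-of-sum update for the three conjunctive sites in the text."""
    i1, i2, i3, i4, i5, i6, i7 = inputs
    sites = (
        i1 + i2 - phi,
        i3 + i4 - phi,
        i5 + i6 - i7 - phi,   # i7 is the inhibitory input
    )
    return p + beta * max(sites)


def or_of_and_step(p, inputs, alpha=1.0):
    """Logical OR-of-AND approximation of the same unit."""
    # Assumption: any nonzero input counts as "true".
    i1, i2, i3, i4, i5, i6, i7 = (bool(x) for x in inputs)
    fired = (i1 and i2) or (i3 and i4) or (i5 and i6 and not i7)
    return p + alpha * (1.0 if fired else 0.0)


both_active = (9, 9, 0, 0, 9, 9, 9)   # site 1 fully active, site 3 inhibited
one_active = (9, 0, 0, 0, 9, 9, 9)    # no site has two uninhibited inputs
print(max_of_sum_step(0.0, both_active), or_of_and_step(0.0, both_active))
print(max_of_sum_step(0.0, one_active), or_of_and_step(0.0, one_active))
```

With phi above the maximum single input value of 9, one active input alone can never push a site above zero, which is the noise-suppression role the text assigns to the constant.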

Winner-take-all (WTA) networks have the property that only the unit
with the highest potential (among a set of contenders) will have output
above zero after some settling time.
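
The WTA property can be illustrated with a small settling loop in Python. This is a hedged sketch, not the network the authors specify: the function name `wta_settle` is hypothetical, potentials are used directly as outputs, and the local rule (a unit drops to zero unless it strictly exceeds every rival it sees) is an assumption chosen because it yields the stated property.

```python
def wta_settle(potentials):
    """Iterate a local suppression rule until the network stops changing."""
    p = list(potentials)
    while True:
        # Each unit keeps its value only if it strictly exceeds all rivals.
        new = [
            pk if all(pk > pj for j, pj in enumerate(p) if j != k) else 0.0
            for k, pk in enumerate(p)
        ]
        if new == p:          # settled: no unit changed on this pass
            return new
        p = new


print(wta_settle([3.0, 7.0, 5.0]))
```

Under this particular rule an exact tie suppresses every tied unit, so a practical network would need some way of breaking ties.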

A coalition will be called stable when the output of all of its members
is non-decreasing.

Change

For our purposes, it is useful to have all the adaptability of networks be
confined to changes in weights. While there is known to be some growth of
new connections in adults, it does not appear to be fast or extensive enough
to play a major role in learning. For technical reasons, we consider very
local growth or decay of connections to be changes in existing connection
patterns. Obviously, models concerned with developing systems would need
a richer notion of change in connectionist networks (cf. von der Malsburg &
Willshaw, 1977). We provide each unit with a memory vector $\mu$ which can be
updated:

    $\mu \leftarrow c(i, p, q, x, w, \mu)$

where $\mu$ is the intermediate-term memory vector, $w$ is the weight vector, $i$, $p$,
and $q$ are as always, and $x$ is an additional single integer input ($0 \le x \le 9$)
which captures the notion of the importance and value of the current
behavior. Instantaneous establishment of long-term memory imprinting would be
equivalent to having $\mu = w$. The assumption is that the consolidation of
long-term changes is a separate process.

We postulate that important, favorable or unfavorable, behaviors can
give rise to faster learning. The rationale for this is given in (Feldman, 1980;
1981a), which also lays out informally our views on how short- and
long-term learning could occur in connectionist networks. A detailed technical
discussion of this material, along the lines of this paper, is presented in
(Feldman, 1981b). Obviously enough, a plausible model of learning and
memory is a prerequisite for any serious scientific use of connectionism.

REFERENCES

Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. Distinctive features, categorical
perception, and probability learning: Some applications of a neural model.
Psychological Review, September 1977, 84(5), 413-451.
Arbib, M. A. Perceptual structures and distributed motor control. COINS (Tech. Rep. 79-11).
University of Massachusetts, Computer and Information Science, and Center for
Systems Neuroscience, June 1979.
Arbib, M. A., & Caplan, D. Neurolinguistics must be computational. The Brain and
Behavioral Sciences, 1979, 2, 449-483.
Ballard, D. H. Parameter networks: Towards a theory of low-level vision. Proceedings of the
7th IJCAI, Vancouver, BC, August 1981.
Ballard, D. H., & Kimball, O. A. Rigid body motion from depth and optical flow (Tech. Rep.
70). New York: University of Rochester, Computer Science Department, in press,
1981. (a)
Ballard, D. H., & Kimball, O. A. Shape and light source direction from shading (Tech. Rep.).
Rochester, NY: University of Rochester, Computer Science Department, in press,
1981. (b)
Ballard, D. H., & Sabbah, D. On shapes. Proceedings of the 7th IJCAI, Vancouver, BC,
August 1981.
Collins, A. M., & Loftus, E. F. A spreading-activation theory of semantic processing.
Psychological Review, November 1975, 82, 407-429.
Delcomyn, F. Neural basis of rhythmic behavior in animals. Science, October 1980, 210,
492-498.
Dell, G. S., & Reich, P. A. Toward a unified model of slips of the tongue. In V. A. Fromkin
(Ed.), Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen, and Hand. New
York: Academic Press, 1980.
Didday, R. L. A model of visuomotor mechanisms in the frog optic tectum. Mathematical
Bioscience, 1976, 30, 169-180.
Duda, R. O., & Hart, P. E. Use of the Hough transform to detect lines and curves in pictures.
Communications of the ACM, January 1972, 15(1), 11-15.
Edelman, G., & Mountcastle, B. The Mindful Brain. Boston, MA: MIT Press, 1978.
Fahlman, S. E. NETL, A System for Representing and Using Real Knowledge. Boston, MA:
MIT Press, 1979.
Fahlman, S. E. The Hashnet interconnection scheme. Computer Science Department,
Carnegie-Mellon University, June 1980.
Feldman, J. A. A distributed information processing model of visual memory (Tech. Rep. 52).
Rochester, NY: University of Rochester, Computer Science Department, 1980.
Feldman, J. A. A connectionist model of visual memory. In G. E. Hinton & J. A. Anderson
(Eds.), Parallel Models of Associative Memory. Hillsdale, NJ: Lawrence Erlbaum
Associates, 1981. (a)
Feldman, J. A. Memory and change in connection networks (Tech. Rep. 96). Rochester, NY:
University of Rochester, Computer Science Department, October 1981. (b)
Feldman, J. A. Four frames suffice (Tech. Rep. 99). Rochester, NY: University of Rochester,
Computer Science Department, in press, 1982.
Feldman, J. A., & Ballard, D. H. Computing with connections (Tech. Rep. 72). Rochester,
NY: University of Rochester, Computer Science Department, 1981; to appear in book
by A. Rosenfeld & J. Beck (Eds.), 1982.
Forbus, K. D. Qualitative reasoning about physical processes. Proceedings of the 7th IJCAI,
Vancouver, BC, August 1981, 326-330.
Freuder, E. C. Synthesizing constraint expressions. Communications of the ACM, November
1978, 21(11), 958-965.
Garvey, T. D., Lowrance, J. D., & Fischler, M. A. An inference technique for integrating
knowledge from disparate sources. Proceedings of the 7th IJCAI, Vancouver, BC,
August 1981, 319-325.
Grossberg, S. Biological competition: Decision rules, pattern formation, and oscillations.
Proc. National Academy of Science USA, April 1980, 77(4), 2238-2342.
Hanson, A. R., & Riseman, E. M. (Eds.). Computer Vision Systems. New York: Academic
Press, 1978.
Hillis, W. D. The connection machine (Computer architecture for the new wave). AI Memo
646, M.I.T., September 1981.
Hinton, G. E. Relaxation and its role in vision. (Ph.D. thesis, University of Edinburgh,
December 1977.)
Hinton, G. E. Draft of Technical Report. La Jolla, CA: University of California at San Diego,
1980.
Hinton, G. E. The role of spatial working memory in shape perception. Proceedings of the
Cognitive Science Conference, Berkeley, CA, August 1981, 56-60. (a)
Hinton, G. E., & Anderson, J. A. (Eds.). Parallel Models of Associative Memory. Hillsdale,
NJ: Lawrence Erlbaum Associates, 1981.
Horn, B. K. P., & Schunck, B. G. Determining Optical Flow. AI Memo 572, AI Lab, MIT,
April 1980.
Hubel, D. H., & Wiesel, T. N. Brain mechanisms of vision. Scientific American, September
1979, 150-162.
Ja'Ja', J., & Simon, J. Parallel algorithms in graph theory: Planarity testing. CS 80-14,
Computer Science Department, Pennsylvania State University, June 1980.
Jusczyk, P. W., & Klein, R. M. (Eds.). The Nature of Thought: Essays in Honor of D. O.
Hebb. Hillsdale, NJ: Lawrence Erlbaum Associates, 1980.
Kandel, E. R. The Cellular Basis of Behavior. San Francisco, CA: Freeman, 1976.
Kimme, C., Sklansky, J., & Ballard, D. Finding circles by an array of accumulators.
Communications of the ACM, February 1975.
Kinsbourne, M., & Hicks, R. E. Functional cerebral space: A model for overflow, transfer
and interference effects in human performance: A tutorial review. In J. Requin (Ed.),
Attention and Performance 7. Hillsdale, NJ: Lawrence Erlbaum Associates, 1979.
Kosslyn, S. M. Images and Mind. Cambridge, MA: Harvard University Press, 1980.
Kuffler, S. W., & Nicholls, J. G. From Neuron to Brain: A Cellular Approach to the
Function of the Nervous System. Sunderland, MA: Sinauer Associates, Inc., Publishers,
1976.
Marr, D. C., & Poggio, T. Cooperative computation of stereo disparity. Science, 1976, 194,
283-287.
McClelland, J. L., & Rumelhart, D. E. An interactive activation model of the effect of context
in perception: Part 1. Psychological Review, 1981.
Minsky, M., & Papert, S. Perceptrons. Cambridge, MA: The MIT Press, 1972.
Norman, D. A. A psychologist views human processing: Human errors and other phenomena
suggest processing mechanisms. Proceedings of the 7th IJCAI, Vancouver, BC, August
1981, 1097-1101.
Perkel, D. H., & Mulloney, B. Calibrating compartmental models of neurons. American
Journal of Physiology, 1979, 235(1), R93-R98.
Posner, M. I. Chronometric Explorations of Mind. Hillsdale, NJ: Lawrence Erlbaum
Associates, 1978.
Prager, J. M. Extracting and labeling boundary segments in natural scenes. IEEE Trans.
PAMI, January 1980, 2(1), 16-27.
Ratcliff, R. A theory of memory retrieval. Psychological Review, March 1978, 85(2), 59-108.
Rosenfeld, A., Hummel, R. A., & Zucker, S. W. Scene labelling by relaxation operations.
IEEE Trans. SMC 6, 1976.
Sabbah, D. Design of a highly parallel visual recognition system. Proceedings of the 7th IJCAI,
Vancouver, BC, August 1981.
Scientific American. The Brain. San Francisco, CA: W. H. Freeman and Company, 1979.
Sejnowski, T. J. Strong covariance with nonlinearly interacting neurons. Journal of
Mathematical Biology, 1977, 4(4), 303-321.
Smith, E. E., Shoben, E. J., & Rips, L. J. Structure and process in semantic memory: A
featural model for semantic decisions. Psychological Review, 1974, 81(3), 214-241.
Stefik, M. Planning with Constraints (MOLGEN: Part I). Artificial Intelligence, 16(2), 1981.
Stent, G. S. A physiological mechanism for Hebb's postulate of learning. Proc. National
Academy of Science USA, April 1973, 70(4), 997-1001.
Sunshine, C. A. Formal techniques for protocol specification and verification. IEEE
Computer, August 1979.
Torioka, T. Pattern separability in a random neural net with inhibitory connections.
Biological Cybernetics, 1979, 34, 53-62.
Triesman, A. M., & Gelade, G. A feature-integration theory of attention. Cognitive
Psychology, 1980, 12, 97-136.
Ullman, S. Relaxation and constrained optimization by local processes. Computer Graphics
and Image Processing, 1979, 10, 115-125.
von der Malsburg, Ch., & Willshaw, D. J. How to label nerve cells so that they can
interconnect in an ordered fashion. Proc. National Academy of Science USA, November 1977,
74(11), 5176-5178.
Wickelgren, W. A. Chunking and consolidation: A theoretical synthesis of semantic networks,
configuring in conditioning, S-R versus cognitive learning, normal forgetting, the
amnesic syndrome, and the hippocampal arousal system. Psychological Review, 1979,
86(1), 44-60.
Wurtz, R. H., & Albano, J. E. Visual-motor function of the primate superior colliculus.
Annual Review of Neuroscience, 1980, 3, 189-226.
Zeki, S. The representation of colours in the cerebral cortex. Nature, April 1980, 284, 412-418.
