Lectures on Controllability and Observability

R. E. KALMAN
Stanford University, Stanford, Calif. 94305, U.S.A.
and
Centre d'Automatique, Ecole Nationale Supérieure des Mines, Paris, FRANCE
Lectures delivered at CENTRO INTERNAZIONALE- MATEMATICO ESTIVO (C.I.M.E.) Seminar on Controllability and Observability, Fondazione Guglielmo Marconi, Pontecchio Marconi (Bologna, ITALY), from July 1 through July 9, 1968.
** This research was s

in part by Grant NGR-05-WLQZA a small travel grai
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
LECTURES ON CONTROLLABILITY AND OBSERVABILITY
R. E. KALMAN (Stanford University)
Course held at Sasso Marconi (Bologna), July 1 to 9, 1968
TABLE OF CONTENTS
Introduction.
Classical and modern dynamical systems.
Standardization of definitions and "classical" results.
Definition of states via Nerode equivalence classes.
Modules induced by linear input/output maps.
Cyclicity and related questions.
Transfer functions.
Abstract construction of realizations.
Construction of realizations.
Theory of partial realizations.
General theory of observability.
Historical notes.
References.
INTRODUCTION

The theory of controllability and observability has been developed, one might almost say reluctantly, in response to problems generated by technological science, especially in areas related to control, communication, and computers. It seems that the first conscious steps to formalize these matters as a separate area of (system-theoretic or mathematical) research were undertaken only as late as 1959, by KALMAN [1960b-c]. There have been, however, many scattered results before this time (see Section 12 for some historical comments and references), and one might confidently assert today that some of the main results have been discovered, more or less independently, in every country which has reached an advanced stage of "development", and it is certain that these same results will be rediscovered again in still more places as other countries progress on the road to development.
With the perspective afforded by ten years of happenings in this field, we ought not hesitate to make some guesses of the significance of what has been accomplished. I see two main trends:

(i) The use of the concepts of controllability and observability to study nonclassical questions in optimal control and optimal estimation theory, sometimes as basic hypotheses securing existence, more often as seemingly technical conditions which allow a sharper statement of results or shorter proofs.

(ii) Interaction between the concepts of controllability and observability and the study of structure of dynamical systems, such as: formulation and solution of the problem of realization, canonical forms, decomposition of systems.

The first of these topics is older and has been studied primarily from the point of view of analysis, although the basic lemma (2.7) is purely algebraic. The second group of topics may be viewed as "blowing up" the ideas inherent in the basic lemma (2.7), resulting in a more and more strictly algebraic point of view.

There is active research in both areas. In the first, attention has shifted from the case of systems governed by finite-dimensional linear differential equations with constant coefficients (where success was quick and total) to systems governed by infinite-dimensional linear differential equations (delay differential equations, classical types of partial differential equations, etc.), to finite-dimensional linear differential equations with time-dependent coefficients, and finally to all sorts and subsorts of nonlinear differential equations. The first two topics are surveyed concurrently by WEISS [1969] while MARKUS [1965] looks at the nonlinear situation.
My own current interest lies in the second stream, and these lectures will deal primarily with it, after a rather hurried overview of the general problem and of the "classical" results.

Let us take a quick look at the most important of these "classical" results. For convenience I shall describe them in system-theoretic (rather than conventional pure mathematical) language. The mathematically trained reader should have no difficulty in converting them into his preferred framework, by digging a little into the references.

In area (i), the most important results are probably those which give more or less explicit and computable results for controllability and observability of certain specific classes of systems. Beyond these, there seem to be two main theorems:
THEOREM A. A real, continuous-time, n-dimensional, constant, linear dynamical system $\Sigma$ has the property "every set of n eigenvalues may be produced by suitable state feedback" if and only if $\Sigma$ is completely controllable.
The central special case is treated in great detail by KALMAN, FALB, and ARBIB [1969, Chapter 2, Theorem 5.10]; for a proof of the general case with background comments, refer to WONHAM [1967]. As a particular case, we have that every system satisfying the hypotheses of the theorem can be "stabilized" (made to have eigenvalues with negative real parts) via a suitable choice of feedback. This result is the "existence theorem" for algorithms used to construct control systems for the past three decades, and yet a conscious formulation of the problem and its mathematical solution go back to about 1963! (See Theorem D below.) The analogous problem for nonconstant linear systems (governed by linear differential equations with variable coefficients) is still not solved.
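To make the statement of Theorem A concrete, here is a minimal sketch for one small case. The double-integrator system, the sign convention u = -Kx for state feedback, and the helper names `closed_loop`, `char_poly_2x2`, and `gain_for` are all assumptions made for illustration; none of them comes from the lectures. For this completely controllable pair, any desired pair of eigenvalues can be produced by choosing the gain.

```python
from fractions import Fraction

def closed_loop(F, G, K):
    """Return F - G K for a single-input system (G a column, K a row)."""
    n = len(F)
    return [[F[i][j] - G[i] * K[j] for j in range(n)] for i in range(n)]

def char_poly_2x2(A):
    """Coefficients (c1, c0) of det(sI - A) = s^2 + c1*s + c0."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (-trace, det)

# Hypothetical example: double integrator x1' = x2, x2' = u.
F = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(0)]]
G = [Fraction(0), Fraction(1)]

def gain_for(lam1, lam2):
    """Gain placing the closed-loop eigenvalues at lam1, lam2.

    With u = -K x the closed-loop matrix is [[0, 1], [-k1, -k2]], whose
    characteristic polynomial is s^2 + k2*s + k1; matching it with
    (s - lam1)(s - lam2) gives k1 = lam1*lam2, k2 = -(lam1 + lam2).
    """
    return [lam1 * lam2, -(lam1 + lam2)]

K = gain_for(Fraction(-1), Fraction(-2))
A = closed_loop(F, G, K)
coeffs = char_poly_2x2(A)   # should describe (s + 1)(s + 2)
```

Both requested eigenvalues have negative real part, so this particular gain also illustrates the "stabilization" special case mentioned in the text.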
THEOREM B. ("Duality Principle") Every problem of controllability in a real, (continuous-time, or discrete-time), finite-dimensional, constant, linear dynamical system is equivalent to a controllability problem in a dual system.

This fact was first observed by KALMAN [1960a] in the solution of the optimal stochastic filtering problem for discrete-time systems, and was soon applied to several problems in system theory by KALMAN [1960b-c]. See also many related comments by KALMAN, FALB, and ARBIB [1969, Chapters 2 and 6]. As a theorem, this principle is not yet known to be valid outside the linear area, but as an intuitive prescription it has been rather useful in guiding system-theoretic research. The problems involved here are those of formulation rather than proof. The basic difficulties seem to point toward algebra and in particular category theory. System-theoretic duality, like the categoric one, is concerned with "reversing arrows". See Section 10 for a modern discussion of these points and a precise version of Theorem B.

Partly as a result of the questions raised by Theorem B and partly because of the algebraic techniques needed to prove Theorem A and related lemmas, attention in the early 1960's shifted toward certain problems of a structural nature which were, somewhat surprisingly at first, found to be related to controllability and observability. The main theorems again seem to be two:
THEOREM C. (Canonical Decomposition) Every real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system may be canonically decomposed into four parts, of which only one part, that which is completely controllable and completely observable, is involved in the input/output behavior of the system.

The proof given by KALMAN [1962] applies to nonconstant systems only under the severe restriction that the dimensions of the subspace of all controllable and all unobservable states is constant on the whole real line.

The result represented by Theorem C is far from definitive, however, since finite-dimensional linear, nonconstant systems admit at least four different canonical decompositions: it is possible and fruitful to dualize the notions of controllability and observability, thereby arriving at four properties, presently called reachability and controllability as well as constructibility* and observability. (See Section 2 for definitions.) Any combination of a property from the first list with a property from the second list gives a canonical decomposition result analogous to Theorem C. The complexity of the situation was first revealed by WEISS and KALMAN [1965]; this paper contributed to a revival of interest (with hopes of success) in the special problems of nonconstant linear systems. Recent progress is surveyed by WEISS [1969].

*WEISS [1969] uses "determinability" instead of constructibility. The new terminology used in these lectures is not yet entirely standard.

Intimately related to the canonical structure theorem, and in fact necessary to fully clarify the phrase "involved in the input/output behavior of the system", is the last basic result:
THEOREM D. (Uniqueness of Minimal Realization) Given the impulse-response matrix W of a real, continuous-time, finite-dimensional, linear dynamical system, there exists a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma_W$ which

(a) realizes W: that is, the impulse-response matrix of $\Sigma_W$ is equal to W;
(b) has minimal dimension in the class of linear systems satisfying (a);
(c) is completely controllable and completely observable;
(d) is uniquely determined (modulo the choice of a basis for its state space) by requirement (a) together with (b) or, independently, by (a) together with (c).

In short, for any W as described above, there is an "essentially unique" $\Sigma_W$ of the same "type" which satisfies (a) through (c).
COROLLARY 1. If W comes from a constant system, there is a constant $\Sigma_W$ which satisfies (a) through (c) and is uniquely determined by (a) + (b) or (a) + (c) (modulo a fixed choice of basis for its state space).

COROLLARY 2. All claims of Corollary 1 continue to hold if "impulse-response matrix of a constant, finite-dimensional system" is replaced by "transfer function matrix of a constant, finite-dimensional system".
The first general discussion of the situation with an equivalent statement of Theorem D is due to KALMAN [1963b, Theorems 7 and 8]. (This paper does not include complete proofs, or even an explicit statement of Corollaries 1 and 2, although they are implied by the general algorithm given in Section 7. An edited version of the original unpublished proof of Theorem D is given in KALMAN, FALB, and ARBIB [1969, Chapter 10, Appendix C].)

These results are of great importance in engineering system theory since they relate methods based on the Laplace transform (using the transfer function of the system) and the time-domain methods based on input/output data (the matrix W) to the state variable (dynamical system) methods developed in 1955-1960. In fact, by Corollary 1 it follows that the two methods must yield identical results; for instance, starting with a constant impulse-response matrix W, property (c) implies that the existence of a stable control law is always assured by virtue of Theorem A. Thus it is only after the development represented by Theorems A-D that a rigorous justification is obtained for the intuitive design methods used in control engineering.

As with Theorem C, certain formulational difficulties arise in connection with a precise definition of a "nonconstant linear dynamical system". Thus, it seems preferable at present to replace in Theorem D "impulse-response matrix W" by "weighting pattern W" (or "abstract input/output map W") and "complete controllability" by "complete reachability". The definitive form of the 1963 theorem evolved through the works of WEISS and KALMAN [1965], YOULA [1966], and KALMAN; a precise formulation and modernized proof of Theorem D in the weighting pattern case was given recently by KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13]. A completely general discussion of what is meant by a "minimal realization" of a nonconstant impulse-response matrix involves many technical complications due to the fact that such a minimal realization does not exist in the class of linear differential equations with "nice" coefficient functions. For the current status of this problem, consult especially DESOER and VARAIYA [1967], SILVERMAN and MEADOWS [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13] and WEISS [1969].
From the standpoint of the present lectures, by far the most interesting consequence of Theorem D is its influence, via efforts to arrive at a definitive proof of Corollary 1, on the development of the algebraic stream of system theory. The first proof of this important result (in the special case of distinct eigenvalues) is that of GILBERT [1963]. Immediately afterwards, a general proof was given by KALMAN [1963b, Section 7]. This proof, strictly computational and linear algebraic in nature, yields no theoretical insight although it is useful as the basis of a computer algorithm.

Using the classical theory of invariant factors, KALMAN [1965a] succeeded in showing that the solution of the minimal realization problem can be effectively reduced to the classical invariant-factor algorithm. This result is of great theoretical interest since it strongly suggests the now standard module theoretic approach, but it does not lead to a simple proof of Corollary 1 and is not a practical method of computation.

The best known proof of Corollary 1 was obtained in 1965 by B. L. Ho, with the aid of a remarkable algorithm, which is equally important from a theoretical and computational viewpoint. The early formulation of the algorithm was described by HO and KALMAN [1966], with later refinements discussed in HO and KALMAN [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 11] and KALMAN [1969c]. Almost simultaneously with the work of B. L. Ho, the basic results were discovered independently also by YOULA and TISSI [1966] and by SILVERMAN [1966]. The subject goes back to the 19th century and centers around the theory of Hankel matrices; however, many of the results just referenced seem to be fundamentally new. This field is currently in a very active stage of development. We shall discuss the essential ideas involved in Sections 8-9. Many other topics, especially Silverman's generalization of the algorithm to nonconstant systems, unfortunately cannot be covered due to lack of time.
Acknowledgment

It is a pleasure to thank C.I.M.E. and its organizers, especially Professors E. Bompiani, E. Sarti, and E. Belardinelli, for arranging a special conference on these topics. The sunny skies and hospitality of Italy, along with Bolognese food, played a subsidiary but vital part in the success of this important gathering of scientists.
1. CLASSICAL AND MODERN DYNAMICAL SYSTEMS

In mathematics the term dynamical system (synonyms: topological dynamics, flows, abstract dynamics, etc.) usually connotes the action of a one-parameter group $T$ (the reals) on a set $X$, where $X$ is at least a topological space (more often, a differentiable manifold) and the action is at least continuous. This setup is physically motivated, but in a very old-fashioned sense. A "dynamical system" as just defined is an idealization, generalization, and abstraction of Newton's world view of the Solar System as described via a finite set of nonlinear ordinary differential equations. These equations represent the positions and momenta of the planets regarded as point masses and are completely determined by the laws of gravitation, i.e., they do not contain any terms to account for "external" forces that may act on the system.

Interesting as this notion of a dynamical system may be (and is!) in pure mathematics, it is much too limited for the study of those dynamical systems which are of contemporary interest. There are at least three different ways in which the classical concept must be generalized:

(i) The time set of the system is not necessarily restricted to the reals;

(ii) A state $x \in X$ of the system is not merely acted upon by the "passage of time" but also by inputs which are or could be manipulated to bring about a desired type of behavior;

(iii) The states of the system cannot, in general, be observed. Rather, the physical behavior of the system is manifested through its outputs which are many-to-one functions of the state.

The generalization of the time set is of minor interest to us here. The notions of input and output, however, are exceedingly fundamental; in fact, controllability is related to the input and observability to the output. With respect to dynamical systems in the classical sense, neither controllability nor observability are meaningful concepts.

A much more detailed discussion of dynamical systems in the modern sense, together with rather detailed precise definitions, will be found in KALMAN, FALB, and ARBIB [1969, Chapter 1].

From here on, we will use the term "dynamical system" exclusively in the modern sense (we have already done so in the Introduction). The following symbols will have a fixed meaning throughout the paper:
(1.1)
$T$ = time set,
$U$ = set of input values,
$X$ = state set,
$Y$ = set of output values,
$\Omega$ = input functions,
$\varphi$ = transition map,
$\eta$ = readout map.
The following assumptions will always apply (otherwise the sets above are arbitrary):

$T$ = an ordered subset of the reals $\mathbf{R}$;
$\Omega$ = class of functions $T \to U$ such that
(i) each function $\omega$ is undefined outside some finite interval $J_\omega \subset T$ dependent on $\omega$;
(ii) if $J_\omega \cap J_{\omega'} = \emptyset$, there is a function $\omega'' \in \Omega$ which agrees with $\omega$ on $J_\omega$ and with $\omega'$ on $J_{\omega'}$.

For most purposes later, $T$ will be equal to $\mathbf{Z}$ = (ordered) abelian group of integers; $U$, $X$, $Y$, $\Omega$ will be linear spaces; "undefined" can be replaced by "equal to 0"; and "functions undefined outside a finite interval" will mean the same as "finite sequences".

The most general notion of a dynamical system for our present needs is given by the following
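(1.3) DEFINITION. A dynamical system $\Sigma$ is a composite object consisting of the maps $\varphi$, $\eta$ defined on the sets $T, U, \Omega, X, Y$ (as above):

$$\varphi: T \times T \times X \times \Omega \to X, \qquad (t; \tau, x, \omega) \mapsto \varphi(t; \tau, x, \omega), \quad \text{undefined whenever } t < \tau;$$

$$\eta: T \times X \to Y, \qquad (t, x) \mapsto \eta(t, x).$$

The transition map $\varphi$ satisfies the following assumptions:

(1.4) $\varphi(t; t, x, \omega) = x$;

(1.5) $\varphi(t; \tau, x, \omega) = \varphi(t; s, \varphi(s; \tau, x, \omega), \omega)$ for all $\tau \le s \le t$;

(1.6) if $\omega = \omega'$ on $[\tau, t]$, then for all $s \in [\tau, t]$, $\varphi(s; \tau, x, \omega) = \varphi(s; \tau, x, \omega')$.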
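The three requirements on the transition map (identity at equal times, composition of transitions, and dependence only on the input over the interval in question) can be checked mechanically on a toy system. The sketch below assumes a hypothetical scalar discrete-time rule x(s+1) = f·x(s) + g·u(s) with integer data; inputs are stored as dictionaries that are "undefined" outside a finite set of times, and an undefined input is treated as 0, as the text allows in the linear case.

```python
def phi(t, tau, x, omega, f=2, g=1):
    """Transition map of the hypothetical system x(s+1) = f*x(s) + g*u(s).

    omega maps finitely many times to input values; a missing time
    means the input is undefined there and is treated as 0.
    """
    if t < tau:
        raise ValueError("phi is undefined for t < tau")
    for s in range(tau, t):
        x = f * x + g * omega.get(s, 0)
    return x

omega = {0: 1, 1: -3, 2: 5}
omega_alt = {0: 1, 1: -3, 2: 5, 7: 100}   # differs from omega only at time 7

# Identity property: no time elapses, the state is unchanged.
same_state = phi(3, 3, 4, omega)

# Composition property: going 0 -> 2 -> 5 equals going 0 -> 5 directly.
direct = phi(5, 0, 4, omega)
two_stage = phi(5, 2, phi(2, 0, 4, omega), omega)

# Inputs agreeing on [0, 5] give the same transition.
with_alt = phi(5, 0, 4, omega_alt)
```

Here `direct`, `two_stage`, and `with_alt` all evaluate to the same state, which is what the axioms require of any admissible transition map.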
The definition of a dynamical system on this level of generality should be regarded only as a scaffolding for the terminology; interesting mathematics begins only after further hypotheses are made. For instance, it is usually necessary to endow the sets $T$, $U$, $\Omega$, $X$, and $Y$ with a topology and then require that $\varphi$ and $\eta$ be continuous.
(1.7) EXAMPLE. The classical setup in topological dynamics may be deduced from our Definition (1.3) in the following way. Let $T = \mathbf{R}$ = reals, regarded as an abelian group under the usual addition and having the usual topology; let $\Omega$ consist only of the nowhere-defined function; let $X$ be a topological space; disregard $Y$ and $\eta$ entirely; define $\varphi$ for all $t, \tau \in T$ and write it as

$$\varphi(t; \tau, x, \omega) = x \cdot (t - \tau),$$

that is, a function of $x$ and $t - \tau$ alone. Check (1.4-5); in the new notation they become

$$x \cdot 0 = x \quad \text{and} \quad x \cdot (s + t) = (x \cdot s) \cdot t.$$

Finally, require that the map $(x, t) \mapsto x \cdot t$ be continuous.
(1.8) INTERPRETATION. The essential idea of Definition (1.3) is that it axiomatizes the notion of state. A dynamical system is informally a rule for state transitions (the function $\varphi$) together with suitable means of expressing the effect of the input on the state and the effect of the state on the output (the function $\eta$). The map $\varphi$ is verbalized as follows: "an input $\omega$, applied to the system $\Sigma$ in state $x$ at time $\tau$ produces the state $\varphi(t; \tau, x, \omega)$ at time $t$." The peculiar definition of an input function $\omega$ is used here mainly for technical convenience; by (1.6) only equivalence classes of inputs agreeing over $[\tau, t]$ enter into the determination of $\varphi(t; \tau, x, \omega)$. "$\omega$ not defined at $t$" means no input acts on $\Sigma$ at time $t$.

The pair $(\tau, x) \in T \times X$ will be called an event of a dynamical system $\Sigma$.
In the sequel, we shall be concerned primarily with systems which are finite-dimensional, linear, and continuous-time or discrete-time. Often these systems will be also real and constant (= stationary or time-invariant). We leave the precise definition of these terms in the context of Definition (1.3) to the reader (consult KALMAN, FALB, and ARBIB [1969, Chapter 1] as needed) and proceed to make some ad hoc definitions without detailed explanation.

The following conventions will remain in force throughout the lectures whenever the linear case is discussed:

(1.9) Continuous-time. $T = \mathbf{R}$, $U = \mathbf{R}^m$, $X = \mathbf{R}^n$, $Y = \mathbf{R}^p$, $\Omega$ = all continuous functions $\mathbf{R} \to \mathbf{R}^m$ which vanish outside a finite interval.

(1.10) Discrete-time. $T = \mathbf{Z}$, $K$ = fixed field (arbitrary), $U = K^m$, $X = K^n$, $Y = K^p$, $\Omega$ = all functions $\mathbf{Z} \to K^m$ which are zero for all but a finite number of their arguments.
Now we have, finally,

(1.11) DEFINITION. A real, continuous-time, n-dimensional, linear dynamical system $\Sigma$ is a triple of continuous matrix functions of time $(F(\cdot), G(\cdot), H(\cdot))$ where

$$F(\cdot): \mathbf{R} \to \{n \times n \text{ matrices over } \mathbf{R}\},$$
$$G(\cdot): \mathbf{R} \to \{n \times m \text{ matrices over } \mathbf{R}\},$$
$$H(\cdot): \mathbf{R} \to \{p \times n \text{ matrices over } \mathbf{R}\}.$$

These maps determine the equations of motion of $\Sigma$ in the following manner:

(1.12)
$$dx/dt = F(t)x + G(t)u(t), \qquad y(t) = H(t)x(t),$$

where $t \in \mathbf{R}$, $x \in \mathbf{R}^n$, $u(t) \in \mathbf{R}^m$, and $y(t) \in \mathbf{R}^p$.
To check that (1.12) indeed makes $\Sigma$ into a well-defined dynamical system in the sense of Definition (1.3), it is necessary to recall the basic facts about finite systems of ordinary linear differential equations with continuous coefficients. Define the map

$$\Phi_F(t, \tau): \mathbf{R} \times \mathbf{R} \to \{n \times n \text{ matrices over } \mathbf{R}\}$$

to be the family of $n \times n$ matrix solutions of the linear differential equation

$$dx/dt = F(t)x, \qquad x \in \mathbf{R}^n,$$

subject to the initial condition

$$\Phi_F(\tau, \tau) = I = \text{unit matrix}, \qquad \tau \in \mathbf{R}.$$

Then $\Phi_F$ is of class $C^1$ in both arguments. It is called the transition matrix of (the system $\Sigma$ whose "infinitesimal" transition matrix is) $F(\cdot)$. From this standard result we get easily also the fact that the transition map of $\Sigma$ is explicitly given by

(1.13)
$$\varphi(t; \tau, x, \omega) = \Phi_F(t, \tau)x + \int_\tau^t \Phi_F(t, s)G(s)\omega(s)\,ds$$

while the readout map is given by

(1.14)
$$\eta(t, x) = H(t)x.$$

It is instructive to verify that $\varphi$ indeed depends only on the equivalence class of $\omega$'s which agree on $[\tau, t]$.
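As a numerical sanity check of the transition-map formula, consider the scalar constant case, where the transition matrix reduces to exp(f·(t − τ)) and the formula reads φ(t; τ, x, ω) = exp(f·(t − τ))·x + ∫ exp(f·(t − s))·g·ω(s) ds over [τ, t]. The sketch below is only an illustration under assumed values (f = −1, g = 2, constant input 1); the midpoint quadrature and the function name `phi` are choices made here, not part of the lectures.

```python
import math

def phi(t, tau, x, omega, f, g, steps=2000):
    """Evaluate the transition map of dx/dt = f*x + g*u (scalar, constant).

    The integral of exp(f*(t - s)) * g * omega(s) over [tau, t] is
    approximated with the midpoint rule.
    """
    h = (t - tau) / steps
    integral = 0.0
    for k in range(steps):
        s = tau + (k + 0.5) * h
        integral += math.exp(f * (t - s)) * g * omega(s) * h
    return math.exp(f * (t - tau)) * x + integral

f, g = -1.0, 2.0
u = lambda s: 1.0                      # hypothetical constant input

value = phi(1.0, 0.0, 3.0, u, f, g)

# Closed form for a constant input c = 1 over an interval of length 1:
# exp(f)*x + g*(exp(f) - 1)/f.
closed_form = math.exp(f) * 3.0 + g * (math.exp(f) - 1.0) / f

# Composition through the intermediate time 0.4 must give the same state.
two_stage = phi(1.0, 0.4, phi(0.4, 0.0, 3.0, u, f, g), u, f, g)
```

The quadrature result agrees with the closed form up to discretization error, and the two-stage evaluation agrees with the direct one, as the composition property of a transition map demands.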
In view of the classical terminology "linear differential equations with constant coefficients", we introduce the nonstandard

(1.15) DEFINITION. A real, continuous-time, finite-dimensional linear dynamical system $\Sigma = (F(\cdot), G(\cdot), H(\cdot))$ is called constant iff all three matrix functions are constant.

In strict analogy with (1.15), we say:

(1.16) DEFINITION. A discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma$ over $K$ is a triple $(F, G, H)$ of $n \times n$, $n \times m$, $p \times n$ matrices over the field $K$. These maps determine the equations of motion of $\Sigma$ in the following manner:

(1.17)
$$x(t + 1) = Fx(t) + Gu(t), \qquad y(t) = Hx(t),$$

where $t \in \mathbf{Z}$, $x \in K^n$, $u(t) \in K^m$, and $y(t) \in K^p$.
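Since the discrete-time definition allows an arbitrary field K, a small simulation over a finite field may help fix ideas. The sketch below assumes the usual discrete-time equations of motion x(t+1) = F x(t) + G u(t), y(t) = H x(t), takes K = GF(5) (integers modulo 5), and uses made-up matrices; the helper names `mat_vec`, `step`, and `simulate` are hypothetical.

```python
P = 5  # K = GF(5); an arbitrary illustrative choice of field

def mat_vec(M, v):
    """Matrix-vector product with entries reduced modulo P."""
    return [sum(a * b for a, b in zip(row, v)) % P for row in M]

def step(F, G, x, u):
    """One step of x(t+1) = F x(t) + G u(t) over GF(P)."""
    Fx = mat_vec(F, x)
    Gu = mat_vec(G, u)
    return [(a + b) % P for a, b in zip(Fx, Gu)]

def simulate(F, G, H, x0, inputs):
    """Apply a finite input sequence; return final state and outputs y(t) = H x(t)."""
    x = list(x0)
    outputs = []
    for u in inputs:
        outputs.append(mat_vec(H, x))
        x = step(F, G, x, u)
    return x, outputs

F = [[1, 2], [0, 3]]   # n = 2
G = [[1], [4]]         # m = 1
H = [[1, 1]]           # p = 1

x_final, ys = simulate(F, G, H, [0, 0], [[1], [0], [2]])
```

All arithmetic stays inside the field, so the same code works for any prime modulus; nothing in the definition depends on K being the reals.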
In the sequel, we shall use the notations $(F, G, -)$ or $(F, -, H)$ to denote systems possessing certain properties which are true for any $H$ or $G$.

Finally, we adopt the following convention, which is already implicit in the preceding discussion:

(1.18) DEFINITION. The dimension n of a dynamical system $\Sigma$ is equal to the dimension of $X_\Sigma$ as a vector space.
2. STANDARDIZATION OF DEFINITIONS AND "CLASSICAL" RESULTS

In this section, we shall be mainly interested in finite-dimensional linear dynamical systems, although the first two definitions will be quite general.

Let $\Sigma$ be an arbitrary dynamical system as defined in Section 1. We assume the following slightly special property: There exists a state $x^*$ and an input $\omega^*$ such that

$$\varphi(t; \tau, x^*, \omega^*) = x^* \quad \text{for all } t, \tau \in T \text{ and } t > \tau.$$

For simplicity, we write $x^*$ and $\omega^*$ as 0. (When $X$ and $\Omega$ have additive structure, 0 will have the usual meaning.) The next two definitions refer to dynamical systems with this extra property.

(2.1) DEFINITION. An event $(\tau, x)$ is controllable iff* there exists a $t \in T$ and an $\omega \in \Omega$ (both $t$ and $\omega$ may depend on $(\tau, x)$) such that

$$\varphi(t; \tau, x, \omega) = 0.$$

In words: an event is controllable iff it can be transferred to 0 in finite time by an appropriate choice of the input function $\omega$. Think of the path from $(\tau, x)$ to $(t, 0)$ as the graph of a function defined over $[\tau, t]$.

*The technical word iff means if and only if.
Consider now a reflection of this graph about $\tau$. This suggests a new definition which is a kind of "adjoint" of the definition of controllability:

(2.2) DEFINITION. An event $(\tau, x)$ is reachable iff there is an $s \in T$ and an $\omega \in \Omega$ (both $s$ and $\omega$ may depend on $(\tau, x)$) such that

$$x = \varphi(\tau; s, 0, \omega).$$

We emphasize: controllability and reachability are entirely different concepts. A striking example of this fact is encountered below in Proposition (4.26).

We shall now review briefly some well-known criteria for and relations between reachability and controllability in linear systems.
(2.3) PROPOSITION. In a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), G(\cdot), -)$, an event $(\tau, x)$ is

(a) reachable if and only if $x \in \operatorname{range} \hat{W}(s, \tau)$ for some $s \in \mathbf{R}$, $s < \tau$, where

(b) controllable if and only if $x \in \operatorname{range} W(\tau, t)$ for some $t \in \mathbf{R}$, $t > \tau$, where

The original proof of (b) is in KALMAN [1960b]; both cases are treated in detail in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 2].

Note that if $G(\cdot)$ is identically zero on $(-\infty, \tau)$ we cannot have reachability, and if $G(\cdot)$ is identically zero on $(\tau, +\infty)$ we cannot have controllability.

For a constant system, the integrals above depend only on the difference of the limits; hence, in particular

$$W(\tau, t) = \hat{W}(2\tau - t, \tau).$$

So we have
(2.4) PROPOSITION. In a real, continuous-time, finite-dimensional, linear, constant dynamical system an event $(\tau, x)$ is reachable for all $\tau$ if and only if it is reachable for one $\tau$; an event is reachable if and only if it is controllable.

From (2.3) one can obtain in a straightforward fashion also the following much stronger result:
(2.5) THEOREM. In a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a state $x$ is reachable (or, equivalently, controllable) at any $\tau \in \mathbf{R}$ if and only if

$$x \in \operatorname{span}(G, FG, \ldots) \subset \mathbf{R}^n.$$

(The span is to be interpreted as the vector space generated by the columns of these matrices.)
A proof of (2.5) may be found in KALMAN, HO, and NARENDRA [1963] and in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 3]. A trivial but noteworthy consequence is the fact that the definition of reachable states of $\Sigma$ is "coordinate-free":

COROLLARY. The set of reachable (or controllable) states of $\Sigma$ in Theorem (2.5) is a subspace of the real vector space $X_\Sigma$, the state space of $\Sigma$.
Very often the attention to individual states is unnecessary and therefore many authors prefer to use the terminology "$\Sigma$ is completely reachable at $\tau$" for "every event $(\tau, x)$, $\tau$ = fixed, $x \in X_\Sigma$ is reachable", or "$\Sigma$ is completely reachable" for "every event in $\Sigma$ is reachable", etc. Thus (2.5), together with the Cayley-Hamilton theorem, implies the

(2.7) BASIC LEMMA. A real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is completely reachable if and only if

(2.8)
$$\operatorname{rank}(G, FG, \ldots, F^{n-1}G) = n.$$

Condition (2.8) is very well-known; it or equivalent forms of it have been discovered, explicitly used, or implicitly assumed by many authors.
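The rank condition of the Basic Lemma is easy to test by machine. The following is a minimal sketch using exact rational arithmetic, so that the rank is not affected by rounding; the function names (`mat_mul`, `rank`, `reachability_matrix`, `completely_reachable`) and the two example input matrices are illustrative assumptions, not taken from the lectures.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def reachability_matrix(F, G):
    """Concatenate the blocks G, FG, ..., F^(n-1) G side by side."""
    n = len(F)
    block = [row[:] for row in G]
    R = [row[:] for row in G]
    for _ in range(n - 1):
        block = mat_mul(F, block)
        R = [r_row + b_row for r_row, b_row in zip(R, block)]
    return R

def completely_reachable(F, G):
    """Rank test: the system (F, G, -) is completely reachable iff rank = n."""
    return rank(reachability_matrix(F, G)) == len(F)

F = [[0, 1], [0, 0]]
G_good = [[0], [1]]    # input enters the second state
G_bad = [[1], [0]]     # input enters the first state only
```

With `G_good` the blocks G and FG are linearly independent and the test succeeds; with `G_bad` the block FG vanishes, the rank drops to 1, and the second state component can never be influenced.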
A trivially equivalent form of (2.7) is given by

(2.9) COROLLARY 1. A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if the smallest F-invariant subspace of $X_\Sigma$ containing (all column vectors of) $G$ is $X_\Sigma$ itself.

A useful variant of the last fact is given by

(2.10) COROLLARY 2. (W. Hahn) A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if there is no nonzero eigenvector of $F$ which is orthogonal to (every column vector of) $G$.

Finally, let us note that, far from being a technical condition, (2.5) has a direct system-theoretic interpretation, as follows:
(2.11) PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ may be written as a direct sum

$$X_\Sigma = X_1 \oplus X_2$$

which induces a decomposition of the equations of motion as (obvious notations)

(2.12)
$$dx_1/dt = F_{11}x_1 + F_{12}x_2 + G_1 u, \qquad dx_2/dt = F_{22}x_2.$$

The subsystem $\Sigma_1 = (F_{11}, G_1, -)$ is completely reachable. Hence a state $x = (x_1, x_2) \in X_\Sigma$ is reachable if and only if $x_2 = 0$.

PROOF. We define $X_1$ to be the set of reachable states of $\Sigma$; by (2.5) this is an F-invariant subspace of $X_\Sigma$. Hence, by finite-dimensionality, $X_1$ is a direct summand in $X_\Sigma$. By construction, every state in $X_1$ is reachable, and (every column vector of) $G$ belongs to $X_1$. The F-invariance of $X_1$ implies that $F_{21} = 0$, which implies the asserted form of the equations of motion.

(2.13) REMARK. Note that $X_2$ is not intrinsically defined (it depends on an arbitrary choice in completing the direct sum). Hence to say that "$(0, x_2)$ is an unreachable (or uncontrollable) state if $x_2 \ne 0$" is an abuse of language. More precisely: the set of all reachable (or controllable) states has the structure of a vector space, but the set of all unreachable (or uncontrollable) states does not have such structure. This fact is important to bear in mind for the algebraic development which follows after this section and also in the definition of observability and constructibility below. In general, the direct sum cannot be chosen in such a way that $F_{12} = 0$.

While condition (2.8) has been frequently used as a technical requirement in the solution of various optimal control problems in the late 1950's, it was only in 1959-60 that the relation between (2.8) and system theoretic questions was clarified by KALMAN [1960b-c] via Definition (2.2) and Propositions (2.5) and (2.11). (See Section 11 for further details.) In other words, without the preceding discussion the use of (2.8) may appear to be artificial, but in fact it is not, at least in problems in which control enters, because, by (2.12), control problems stated for $\Sigma$ are nontrivial only with respect to the intrinsic subspace $X_1$.
The hypothesis "constant" is by no means essential for Proposition (2.11), but we must forego further comments here.

For later purposes, we state some facts here for discrete-time, constant linear systems analogous to those already developed for their continuous-time counterparts. The proofs are straightforward and therefore omitted (or given later, for illustrative purposes).
(2.14) PROPOSITION. A state x of a real, discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is reachable if and only if

(2.15) $$x \in \mathrm{span}(G, FG, \ldots, F^{n-1}G).$$

Thus such a system is completely reachable if and only if (2.8) holds.

(2.16) PROPOSITION. A state x of the system $\Sigma$ described in Proposition (2.14) is controllable if and only if

(2.17) $$x \in \mathrm{span}(F^{-1}G, \ldots, F^{-n}G),$$

where

$$F^{-k}G = \{x: F^k x = g_i, \; g_i = \text{column vector of } G\}.$$

(2.18) PROPOSITION. In a real, discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a reachable state is always controllable and the converse is always true whenever $\det F \neq 0$.
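The difference between reachability and controllability in discrete time can be checked numerically. The sketch below is not part of the original lectures: the helper names (`reach_matrix`, `is_reachable`, `is_controllable`) are hypothetical, exact arithmetic is done with `fractions.Fraction`, and controllability of x is tested through the equivalent condition that $F^n x$ lies in $\mathrm{span}(G, FG, \ldots, F^{n-1}G)$, which is an assumption of this illustration rather than the form (2.17) used in the text.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def hstack(A, B):
    """Place the columns of B to the right of the columns of A."""
    return [ra + rb for ra, rb in zip(A, B)]

def reach_matrix(F, G):
    """The block matrix (G, FG, ..., F^{n-1}G) of (2.15)."""
    block, R = G, G
    for _ in range(len(F) - 1):
        block = mat_mul(F, block)
        R = hstack(R, block)
    return R

def is_reachable(F, G, x):
    """x (an n x 1 column) is reachable iff it lies in the span (2.15)."""
    R = reach_matrix(F, G)
    return rank(hstack(R, x)) == rank(R)

def is_controllable(F, G, x):
    """Assumed equivalent test: F^n x lies in the reachable subspace."""
    y = x
    for _ in range(len(F)):
        y = mat_mul(F, y)
    R = reach_matrix(F, G)
    return rank(hstack(R, y)) == rank(R)

# A singular (nilpotent) F: the state (0, 1) is controllable but not reachable.
F = [[0, 1], [0, 0]]
G = [[1], [0]]
x = [[0], [1]]
print(is_reachable(F, G, x), is_controllable(F, G, x))
```

With this singular F the converse in Proposition (2.18) fails, which is consistent with the hypothesis $\det F \neq 0$.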
Note also that Proposition (2.11) and its proof continue to be correct, without any modification, when "continuous-time" is replaced by "discrete-time".

Now we turn to a discussion of observability. The original definition of observability by KALMAN [1960b, Definition (5.23)] was concocted in such a way as to take advantage of vector-space duality. The conceptual problems surrounding duality are easy to handle in the linear case but are still by no means fully understood in the nonlinear case (see Section 10). In order to get at the main facts quickly, we shall consider here only the linear case and even then we shall use the underlying idea of vector-space duality in a rather ad-hoc fashion. The reader wishing to do so can easily turn our remarks into a strictly dual treatment of facts (2.1)-(2.12) with the aid of the setup introduced in Section 10.
(2.19) DEFINITION. An event $(\tau, x)$ in a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ is unobservable iff

$$H(s)\Phi(s, \tau)x = 0 \quad \text{for all } s \in [\tau, \infty).$$

(2.20) DEFINITION. With respect to the same system, an event $(\tau, x)$ is unconstructible* iff

$$H(\sigma)\Phi(\sigma, \tau)x = 0 \quad \text{for all } \sigma \in (-\infty, \tau].$$

*In the older literature, starting with KALMAN [1960b, Definition (5.23)], it is this concept which is called "observability". By hindsight, the present choice of words seems to be more natural to the writer.

The motivation for the first definition is obvious: the "occurrence" of an unobservable event cannot be detected by looking at the output of the system after time $\tau$. (The definition subsumes $\omega = 0$, but this is no loss of generality because of linearity.) The motivation for the second definition is less obvious but is in fact strongly suggested by statistical filtering theory (see Section 10). In any case, Definition (2.20) complements Definition (2.19) in exactly the same way as Definition (2.1) complements Definition (2.2).
From these definitions, it is very easy to deduce the following criteria:

(2.21) PROPOSITION. In a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ an event $(\tau, x)$ is

(a) unobservable if and only if $x \in \mathrm{kernel}\, M(\tau, t)$ for all $t \in T$, $t > \tau$, where

$$M(\tau, t) = \int_\tau^t \Phi'(s, \tau)H'(s)H(s)\Phi(s, \tau)\, ds;$$

(b) unconstructible if and only if $x \in \mathrm{kernel}\, \hat M(s, \tau)$ for all $s \in T$, $s < \tau$, where

$$\hat M(s, \tau) = \int_s^\tau \Phi'(\sigma, \tau)H'(\sigma)H(\sigma)\Phi(\sigma, \tau)\, d\sigma.$$

PROOF. Part (a) follows immediately from the observation:

$$x \in \mathrm{kernel}\, M(\tau, t) \iff H(s)\Phi(s, \tau)x = 0 \text{ for all } s \in [\tau, t].$$

Part (b) follows by an analogous argument.
REMARK. Let us compare this result with Proposition (2.3), and let us indulge (only temporarily) in abuses of language of the following sort:*

$$(\tau, x) = \text{unreachable} \iff x \in \mathrm{kernel}\, W(\tau, t) \text{ for all } t > \tau$$

and

$$(\tau, x) = \text{observable} \iff x \in \mathrm{range}\, M(\tau, t) \text{ for some } t > \tau.$$

From these relations we can easily deduce the so-called "duality rules"; that is, problems involving observability (or constructibility) are converted into problems involving reachability (or controllability) in a suitably defined dual system. See KALMAN, FALB, and ARBIB [1969, Chapter 2, Proposition (6.12)] and the broader discussion in Section 10.

*All this would be strictly correct if we agreed to replace "direct sum" in Proposition (2.11) and its counterpart (2.25) by "orthogonal direct sum"; but this would be an arbitrary convention which, while convenient, has no natural system-theoretic justification. Reread Remark (2.13).

We will say, by slight abuse of language, that a system is completely observable whenever 0 is the only unobservable state. Thus the Basic Lemma (2.7) "dualizes" to the

(2.23) PROPOSITION. A real, continuous-time or discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, -, H)$ is completely observable if and only if

(2.24) $$\mathrm{rank}\, (H', F'H', \ldots, (F')^{n-1}H') = n.$$
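Criterion (2.24) is the reachability criterion applied to the transposed pair $(F', H')$, which can be seen in a few lines of code. The following is a minimal sketch under that reading; the function names (`observability_rank`, `completely_observable`) are hypothetical and not taken from the lectures, and exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rank(M):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def observability_rank(F, H):
    """Rank of (H', F'H', ..., (F')^{n-1}H'), built as for the dual pair (F', H')."""
    Ft, block = transpose(F), transpose(H)
    columns = [row[:] for row in block]
    for _ in range(len(F) - 1):
        block = mat_mul(Ft, block)
        columns = [a + b for a, b in zip(columns, block)]
    return rank(columns)

def completely_observable(F, H):
    return observability_rank(F, H) == len(F)

F = [[0, 1], [0, 0]]
print(completely_observable(F, [[1, 0]]))  # output sees x1, and x2 through F
print(completely_observable(F, [[0, 1]]))  # output sees only x2
```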
By duality, complete constructibility in a continuous-time system is equivalent to complete observability; in a discrete-time system this is not true in general but it is true when $\det F \neq 0$.

It is easy to see also that (2.11) "dualizes" to:

(2.25) PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time or discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, -, H)$ may be written as a direct sum

$$X_\Sigma = X_1 \oplus X_2$$

and the equations of $\Sigma$ are decomposed correspondingly as

$$dx_1/dt = F_{11}x_1 + F_{12}x_2, \qquad dx_2/dt = F_{22}x_2, \qquad y(t) = H_2 x_2.$$

PROOF. Proceed dually to the proof of Proposition (2.11), beginning with the definition of $X_1$ as the set of all unobservable states of $\Sigma$.

Combining Propositions (2.11) and (2.25) gives Theorem C as in KALMAN [1962].
This completes our survey of the "classical" results related to reachability, controllability, observability, and constructibility. The remaining lectures will be concerned exclusively with discrete-time systems. The main motivation for the succeeding developments will be the algebraic criteria (2.8) and (2.24) as well as a deeper examination of Theorems C and D of the Introduction.
3. DEFINITION OF STATES VIA NERODE EQUIVALENCE CLASSES

A classical dynamical system is essentially the action of the time set T (= reals) on the states X. In other words, the states are acted on by an abelian group, namely (R, + usual definition of addition). This is a trivial fact, but it has deep consequences. A (modern) dynamical system is the action of the inputs $\Omega$ on X; in exact analogy with the classical case, to the abelian structure on T there corresponds an (associative but noncommutative) semigroup structure on $\Omega$. The idea that $\Omega$ always admits such a structure was apparently overlooked until the late 1950's when it became fashionable in automata theory (school of SCHÜTZENBERGER). This seems to be the "right" way of translating the intuitive notion of dynamics into mathematics, and it will be fundamental in our succeeding investigations.
It is convenient to assume from now on, until the end of these lectures, that

(3.1) T = time set = $\mathbf{Z}$ = additive (ordered) group of integers.

Since we shall be only interested in constant systems from here on, we shall adopt the following normalization convention:*

(3.2) No element of $\Omega$ is defined for $t > \tau = 0$.

*In the discrete-time nonconstant case, we would have to deal with $\mathbf{Z}$ copies of $\Omega$, each normalized with respect to a different particular value of $\tau \in \mathbf{Z}$.

In view of (3.2), we can define the "length" $|\omega|$ of $\omega$ by

$$|\omega| = \max\{-t \in \mathbf{Z}: \omega \text{ is not defined for any } s < t\}.$$
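Before defining the semigroup on $\Omega$, we introduce another fundamental notion of dynamics: the (left) shift operator $\sigma_\Omega$, defined for all $\tau \geq 0$ in $\mathbf{Z}$ by

(3.3) $$\sigma_\Omega^\tau: \omega \mapsto \sigma_\Omega^\tau\omega, \qquad (\sigma_\Omega^\tau\omega)(t) = \omega(t + \tau).$$

Note that the definition of $\sigma_\Omega$ is compatible with the normalization (3.2).

If $\omega, \omega' \in \Omega$ and $J_\omega \cap J_{\omega'}$ = empty (where $J_\omega$ denotes the interval on which $\omega$ is defined), we define the join $\omega \vee \omega'$ of $\omega$ and $\omega'$ as the function

(3.4) $$(\omega \vee \omega')(t) = \omega(t) \text{ if } t \in J_\omega, \qquad (\omega \vee \omega')(t) = \omega'(t) \text{ if } t \in J_{\omega'}.$$

When $\Omega$ has an additive structure, then we replace $\omega \vee \omega'$ by $\omega + \omega'$.

(3.5) DEFINITION. There is an associative operation $\circ: \Omega \times \Omega \to \Omega$, called concatenation, defined by

$$\circ: (\omega, \omega') \mapsto \sigma_\Omega^{|\omega'|}\omega \vee \omega'.$$

Note that, by (3.2) through (3.4), $\circ$ is well defined. Note also that the asserted existence of concatenation rests on the fact that $\Omega$ is made up of functions defined over finite intervals in T. We might express the content of (3.5) also as: $\Omega$ is a semigroup with va...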
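The shift, join, and concatenation operations can be made concrete with finite input sequences. The sketch below is an illustration only: it assumes that an input $\omega$ is stored as a Python dict from integer times to values, normalized as in (3.2) so that its last value sits at $t = 0$, and the helper names (`from_values`, `shift`, `join`, `concat`, `values`) are hypothetical.

```python
def from_values(vals):
    """Input with the given values, normalized so the last value sits at t = 0."""
    k = len(vals)
    return {i - k + 1: v for i, v in enumerate(vals)}

def length(w):
    """Number of time instants at which the input is defined."""
    return len(w)

def shift(w, tau):
    """Left shift: the shifted input takes at time t the value w(t + tau)."""
    return {t - tau: v for t, v in w.items()}

def join(w, v):
    """Join of two inputs with disjoint domains of definition."""
    assert not set(w) & set(v), "domains must be disjoint"
    return {**w, **v}

def concat(w, v):
    """Concatenation: shift w left by the length of v, then join with v."""
    return join(shift(w, length(v)), v)

def values(w):
    """Values of the input in order of increasing time."""
    return tuple(w[t] for t in sorted(w))

a, b, c = from_values((1, 2)), from_values((3,)), from_values((4, 5))
print(values(concat(a, b)))                                      # a followed by b
print(concat(concat(a, b), c) == concat(a, concat(b, c)))        # associativity
print(concat(a, b) == concat(b, a))                              # commutativity fails
```

In this representation the nowhere-defined input is the empty dict, and it acts as a neutral element for `concat`.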
In view of (3.5), it is natural to use an abbreviated notation* also for the transition function, as follows:

(3.6) $$x \circ \omega = \varphi(0; -|\omega|, x, \omega).$$

*Observe that $x \circ \omega$ is the strict analog of the notation xt customary in topological dynamics. The action of $\omega$ on x satisfies $x \circ (\omega \circ \nu) = (x \circ \omega) \circ \nu$ in view of (1.5).

Now we come to an important nonclassical concept in dynamical systems, whose evolution was strongly influenced by problems in communications and automata theory: a discrete-time constant input/output map

(3.7) $$f: \Omega \to Y: \omega \mapsto f(\omega) = y(1).$$

We interpret this map as follows: y(1) is the output of some system $\Sigma$ (say, a digital computer) when $\Sigma$ is subjected to the (finite) input sequence $\omega$, assuming that $\Sigma$ is in some fixed initial equilibrium state before the application of $\omega$. This definition automatically incorporates the notions of "discrete-time" as well as "causal" or "dynamics" (the latter because y(t) is not defined for t < 1). However, (3.7) does not clearly imply "constancy" (implicitly, however, this is clear from the normalization assumption (3.2) on $\Omega$). To make the definition more forceful, we extend f to the map

(3.8) $$\bar f: \Omega \to \Gamma = Y \times Y \times \cdots \text{ (infinite cartesian product)}: \omega \mapsto \gamma = (y(1), y(2), \ldots).$$

Interpretation: $\bar f$ gives the output sequence $\gamma = (y(1), y(2), \ldots)$ of the system $\Sigma$ after t = 0 resulting from the application of an input $\omega$ which stops at t = 0.
This definition expresses causality more forcefully and incorporates constancy, provided we define the (left) shift operator $\sigma_\Gamma$ on $\Gamma$ so as to be compatible with (3.3). So, for any $\tau \geq 0$, $\tau \in \mathbf{Z}$, let

$$\sigma_\Gamma^\tau: (y(1), y(2), \ldots) \mapsto (y(\tau + 1), y(\tau + 2), \ldots).$$

Note: the operator $\sigma_\Omega$ "appends" an undefined term at 0, the operator $\sigma_\Gamma$ "discards" the term y(1).

Now dropping the bar over f, we adopt

(3.10) DEFINITION. A discrete-time, constant input/output map (of some underlying dynamical system $\Sigma$) is any map $f: \Omega \to \Gamma$ such that the following diagram

$$\begin{array}{ccc} \Omega & \xrightarrow{\;f\;} & \Gamma \\ \sigma_\Omega \downarrow & & \downarrow \sigma_\Gamma \\ \Omega & \xrightarrow{\;f\;} & \Gamma \end{array}$$

is commutative. We say that f is linear iff it is a K-vector space homomorphism.
It will be convenient to regard (3.10) as the external definition of a dynamical system, in contrast to the internal definition set up in Section 1. Intuitively, we should think of f as a highly idealized kind of experimental data; namely, f incorporates all possible information that could be gained by subjecting the underlying system to experiments in which only input/output data is available. This point of view is related to experimental physics the same way as the classical notion of a dynamical system is related to Newtonian (axiomatic) physics.

The basic question which motivates much of what will follow can now be formulated as follows:
(3.11) PROBLEM OF REALIZATION. Given only the knowledge of f (but of course also of the associated sets, in particular $\Omega$ and $\Gamma$), how can we discover, in a mathematically consistent, rigorous, and natural way, the properties of the system $\Sigma$ which is supposed to underlie the given input/output map f?

This suggests immediately the following fundamental concept:

(3.12) DEFINITION. A fixed dynamical system $\Sigma_0$ (internal definition, as in Section 1) is a realization of a fixed input/output map $f_0$ iff $f_0 = f_{\Sigma_0}$, that is, $f_0$ is identical with the input/output map of $\Sigma_0$.

In view of the notations of Section 1 plus the special convention (3.6), the explicit form of the realization condition is simply that

(3.13) $$f_0(\omega)(1) = \eta_0(x^* \circ \omega) \quad \text{for all } \omega \in \Omega.$$

The symbol $x^*$ stands for an arbitrary equilibrium state in which $\Sigma_0$ remains, by definition, until the application of $\omega$. (Later we simply take $x^*$ to be 0.)
To solve the realization problem, the critical step is to induce a definition of X (of some $\Sigma_0$) from the given $f_0$. It is rather surprising that this step turns out to be trivial, on the abstract level. (On the concrete level, however, there are many unsolved problems in actually computing what X is. In Section 8, we shall solve this problem, too, but only in the linear case.) The essential idea seems to have been published first by NERODE [1958]:
(3.14) DEFINITION. Make the concatenation semigroup $\Omega$ into a monoid by adjoining a neutral element $\emptyset$ (which is the nowhere-defined function). Then $\omega \equiv_f \omega'$ (read: $\omega$ is Nerode equivalent to $\omega'$ with respect to f) iff

$$f(\omega \circ \nu) = f(\omega' \circ \nu) \quad \text{for all } \nu \in \Omega.$$

There are intuitive, physical, historical, and technical reasons (which are scattered throughout the literature and concentrated especially strongly in KALMAN, FALB, and ARBIB [1969]) for using this as the

(3.15) MAIN DEFINITION. The set of equivalence classes under $\equiv_f$, denoted as $X_f = \{(\omega)_f: \omega \in \Omega\}$, is the state set of the input/output map f.
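Nerode equivalence can be explored on a small example by brute force. The sketch below is only an approximation of Definition (3.14): it compares $f(\omega \circ \nu)$ and $f(\omega' \circ \nu)$ for test inputs $\nu$ up to a bounded length, whereas the definition quantifies over all of $\Omega$. Inputs are stored as tuples (last entry at t = 0), so concatenation is tuple concatenation; the example map `parity` and the helper names `words` and `nerode_classes` are assumptions made for illustration.

```python
from itertools import product

def words(alphabet, max_len):
    """All input sequences over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            yield w

def nerode_classes(f, alphabet, max_len, test_len):
    """Group inputs by their responses to all continuations up to test_len.

    Two inputs land in the same group iff f(w + v) agrees for every
    continuation v that is tried (a bounded stand-in for 'all v').
    """
    classes = {}
    for w in words(alphabet, max_len):
        signature = tuple(f(w + v) for v in words(alphabet, test_len))
        classes.setdefault(signature, []).append(w)
    return list(classes.values())

def parity(w):
    """Example input/output value: the sum of the inputs modulo 2."""
    return sum(w) % 2

classes = nerode_classes(parity, (0, 1), max_len=3, test_len=2)
print(len(classes))  # number of states found for the parity map
```

For the parity map the procedure finds two classes (even and odd input sums), which matches the intuition that a one-bit memory suffices to realize it.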
Let us verify immediately that (3.15) makes mathematical sense:

(3.16) PROPOSITION. For each linear, constant input/output map f there exists a dynamical system $\Sigma_f$ such that

(a) $\Sigma_f$ realizes f;

(b) $X_{\Sigma_f} = X_f$.

PROOF. We show how to induce $\Sigma_f$, given f. We define the state set of $\Sigma_f$ by (b). Further, we define the transition function of $\Sigma_f$ by

$$(\omega)_f \circ \omega' = (\omega \circ \omega')_f.$$

We must check that $\circ$ on the left of this definition is well defined (note two different uses of $\circ$!), that is, independent of the representation of x as $(\omega)_f$. This follows trivially from (3.14). Now we define the readout map of $\Sigma_f$ by

$$\eta_f: (\omega)_f \mapsto f(\omega)(1).$$

Again, this map is well defined since we can take $\nu = \emptyset$ as a special case in (3.14). Then

$$f(\omega)(1) = \eta_f((\omega)_f)$$

and the realization condition (3.13) is verified. Hence claim (a) is correct.
(3.19) COMMENTS. In automata theory, $\Sigma_f$ is known as the reduced form of any system which realizes f. Clearly, any two reduced forms are isomorphic, in the set-theoretic sense, since the set $X_f$ is intrinsically defined by f. (This observation is a weak version of Theorem D of the Introduction; here "uniqueness" means "modulo a permutation of the labels of elements in the set $X_f$".) Notice also that $\Sigma_f$ is completely reachable since, by Definition (3.15), every element $x = (\omega)_f$ of $X_f$ is reachable via any element $\omega'$ in the Nerode equivalence class $(\omega)_f$. As to observability of $\Sigma_f$, see Section 10.
4. MODULES INDUCED BY LINEAR INPUT/OUTPUT MAPS

We are now ready to embark on the main topics of these lectures. It is assumed that the reader is conversant with modern algebra (especially: abelian groups, commutative rings, fields, modules, the ring of polynomials in one variable, and the theory of elementary divisors), on the level of, say, VAN DER WAERDEN, LANG [1965], HU [1965] or ZARISKI and SAMUEL [1958, Vol. 1]. The material covered from here on dates from 1965 or later.

(4.1) Standing assumptions until Section 10: All systems $\Sigma = (F, G, H)$ are discrete-time, linear, constant, defined over a fixed field K (but not necessarily finite-dimensional).

Our immediate objective is to provide the setup and proof for the
(4.2) FUNDAMENTAL THEOREM OF LINEAR SYSTEM THEORY. The natural state set $X_f$ associated with a discrete-time, linear, constant input/output map f over a fixed field K admits the structure of a finitely generated module over the ring K[z] of polynomials (with indeterminate z and coefficients in K).

(4.3) COMMENTS. Since the ring K[z] will be seen to be related to the inputs to $\Sigma$, this result has a superficial resemblance to the fact that in an arbitrary dynamical system $\Sigma$ the state set admits the action of a semigroup, namely $\Omega$ (see (3.6) and related footnote). It turns out, however, that this action of $\Omega$ on X, which results from combining the concatenation product in $\Omega$ with the definition of states via Nerode equivalence, is incompatible with the additive structure of $\Omega$ [KALMAN, 1967, Section 3]. Our theorem asserts the existence of an entirely different kind of structure on X. This structure, that of a K[z]-module, is not just a consequence of dynamics, but depends critically on the additive structure on $\Omega$ and on the linearity of f. The relevant multiplication is not (noncommutative) concatenation but (commutative) convolution (because convolution is the natural product in K[z]); dynamics is thereby restated in such a way that the tools of commutative algebra become applicable. In a certain rather definite sense (see also Remark (4.30)), Theorem (4.2) expresses the algebraic content of the method of the Laplace transformation, especially as regards the practices developed in electrical engineering in the U.S. during the 1950's.
The proof of Theorem (4.2) consists in a long sequence of canonical constructions and the verification that everything is well defined and works as needed.

In view of (4.1) and the conventions made in Section 1, $\Omega$ may be viewed as a K-vector space and $\omega(t) = 0$ for almost all $t \in \mathbf{Z}$ and all $\omega \in \Omega$. By convention (3.2), we have assumed also that $\omega(t) = 0$ for all $t > 0$. As a result, we have that:

(a) $\Omega \cong K^m[z]$ as a K-vector space. Let us exhibit the isomorphism explicitly as follows:

(4.4) $$\omega \mapsto \sum_{t \in \mathbf{Z}} \omega(t) z^{-t} \in K^m[z].$$

By (3.2), the sum in (4.4) is always finite. The isomorphism obviously preserves the K-linear structure on $\Omega$. In the sequel, we shall not distinguish sharply between $\omega$ as a function $T \to K^m$ and $\omega$ as an m-vector polynomial.

(b) $\Omega$ is a free K[z]-module with m generators, that is, $\Omega \cong K^m[z]$ also in the K[z]-module sense. In fact, we define the action of K[z] on $\Omega$ by scalar multiplication as

(4.5) $$\cdot: K[z] \times \Omega \to \Omega: (\pi, \omega) \mapsto \pi\cdot\omega = (\pi\omega_1, \ldots, \pi\omega_m)' \qquad (\omega_j \in K[z], \; j = 1, \ldots, m).$$

The product of $\pi$ with the components of the vector $\omega$ is the product in K[z]. We write the scalar product on the left, to avoid any confusion with notation (3.6). It is easy to see that the module axioms are verified; $\Omega$ is obviously free, with generators

(4.6) $$e_j = (0, \ldots, 0, 1, 0, \ldots, 0)' \quad (1 \text{ in the } j\text{-th position}), \qquad j = 1, \ldots, m.$$

(c) On $\Omega$, the action of the shift operator $\sigma_\Omega$ is represented by multiplication by z. This, of course, is the main reason for introducing the isomorphism (4.4) in the first place.
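Steps (a) and (c) can be checked on a scalar example (m = 1). The sketch below assumes that an input is stored as a tuple of its values with the last entry at t = 0, and that a polynomial is stored as a coefficient list with index k holding the coefficient of $z^k$; under (4.4) the coefficient of $z^k$ is then $\omega(-k)$. The names `to_poly`, `shift`, and `mul_by_z` are hypothetical. The value 0 appended by `shift` reflects the convention that, in the linear setting, values not supplied are treated as zero.

```python
def to_poly(values):
    """Isomorphism (4.4) for m = 1: coefficient of z**k is omega(-k)."""
    return list(reversed(values))

def shift(values):
    """Left shift by one step; the new value at t = 0 is taken to be 0."""
    return tuple(values) + (0,)

def mul_by_z(p):
    """Multiply a polynomial (coefficients in increasing powers) by z."""
    return [0] + list(p)

omega = (1, 2, 3)                  # omega(-2) = 1, omega(-1) = 2, omega(0) = 3
print(to_poly(omega))              # coefficients of 3 + 2z + z^2
print(to_poly(shift(omega)) == mul_by_z(to_poly(omega)))
```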
(d) Each element of $\Gamma$ is a formal power series in $z^{-1}$. In fact, (4.4) suggests viewing $z^{-t}$ as an abstract representation of $t \in \mathbf{Z}$; hence we define

(4.7) $$\gamma \mapsto \sum_{t > 0} y(t) z^{-t}.$$

By (3.8) and (4.1), $y(t) \in K^p$ for each $t \geq 1$ and is zero (or not defined) for $t < 1$. In general the sum is taken over infinitely many nonzero terms; there is no question of convergence and the right-hand side of (4.7) is to be interpreted strictly algebraically as a formal power series. Since y(0) is always zero (see (3.8)), we can say also that

(e) $\Gamma$ is isomorphic to the K-vector subspace of $K^p[[z^{-1}]]$ (formal power series in $z^{-1}$ with coefficients in $K^p$) consisting of all power series with 0 first term.

The first nontrivial construction is the following:

(f) $\Gamma$ has the structure of a K[z]-module, with scalar multiplication defined as

(4.8) $$\cdot: K[z] \times \Gamma \to \Gamma: (\pi, \gamma) \mapsto \pi\cdot\gamma = \pi(\sigma_\Gamma)\gamma.$$

This product may be interpreted as the ordinary product of a power series in $z^{-1}$ by a polynomial in z, followed by the deletion of all terms containing no negative powers of z. The verification of the module axioms is straightforward.
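The scalar multiplication (4.8) can be tried out on truncated output sequences. In the sketch below, which is an illustration and not the author's construction, an output sequence is stored as a finite list `gamma` whose entry at index t - 1 is y(t), and a polynomial as a coefficient list in increasing powers of z. Because the list is a truncation of an infinite sequence, the result is only returned for those t at which every needed term is available; the function name `act` is hypothetical.

```python
def act(pi, gamma):
    """Action of the polynomial pi on a truncated output sequence.

    With pi = sum_k pi[k] z^k, the t-th term of pi . gamma is
    sum_k pi[k] * y(t + k): multiply the series by pi and drop every
    term that carries no negative power of z.
    """
    deg = len(pi) - 1
    return [sum(pi[k] * gamma[t + k] for k in range(deg + 1))
            for t in range(len(gamma) - deg)]

gamma = [1, 2, 3, 4]          # y(1), y(2), y(3), y(4)
z = [0, 1]                    # the polynomial z
print(act(z, gamma))          # z acts as the shift that discards y(1)
print(act([1, 1], gamma))     # (1 + z) . gamma
print(act(z, act(z, gamma)) == act([0, 0, 1], gamma))
```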
(g) f is a K[z]-homomorphism. This is an immediate consequence of the fact that f = constant (see (3.10)) and that multiplication by z corresponds to the left shift operators on $\Omega$ and $\Gamma$.

(h) The Nerode equivalence classes of f are isomorphic with $\Omega/\mathrm{kernel}\, f$. This is an easy but highly nontrivial lemma, connecting Nerode equivalence with the module structure on $\Omega$. The proof is an immediate consequence of the formula

(4.9) $$\omega \circ \nu = z^{|\nu|}\cdot\omega + \nu.$$

In fact, by K-linearity of f, (4.9) implies

$$f(\omega \circ \nu) = f(\omega' \circ \nu) \quad \text{for all } \nu \in \Omega$$

if and only if

$$f(z^k\cdot\omega) = f(z^k\cdot\omega') \quad \text{for all } k \geq 0 \text{ in } \mathbf{Z}.$$

The proof of Theorem (4.2) is now complete, since the last lemma identifies $X_f$ as defined by (3.15) with the K[z] quotient module $\Omega/\mathrm{kernel}\, f$. We write elements of the latter as

$$[\omega]_f = \omega + \mathrm{kernel}\, f;$$

then it is clear that $X_f$ as a K[z]-module is generated by $[e_1]_f, \ldots, [e_m]_f$, since $\Omega$ itself is generated by $e_1, \ldots, e_m$ (see (4.6)). Note also that the scalar product in $\Omega/\mathrm{kernel}\, f$ is

(4.10) $$\pi\cdot[\omega]_f = [\pi\cdot\omega]_f.$$

The last product above (that in $\Omega$) has already been defined in (4.5). The reader should verify directly that (4.10) gives a well-defined scalar product.
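Formula (4.9) is easy to confirm on scalar inputs. The sketch below again assumes the tuple representation of inputs (last entry at t = 0, so that concatenation is tuple concatenation) and coefficient lists in increasing powers of z for polynomials; the helper names `to_poly`, `mul_by_z_power`, and `poly_add` are hypothetical.

```python
def to_poly(values):
    """Isomorphism (4.4) for m = 1: coefficient of z**k is omega(-k)."""
    return list(reversed(values))

def mul_by_z_power(p, k):
    """Multiply a polynomial (increasing powers) by z**k."""
    return [0] * k + list(p)

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

omega, nu = (1, 2), (3, 4, 5)
left = to_poly(omega + nu)    # polynomial of the concatenated input
right = poly_add(mul_by_z_power(to_poly(omega), len(nu)), to_poly(nu))
print(left == right)          # both sides of (4.9) agree
```

The check makes visible why concatenation, although noncommutative, becomes expressible through the commutative operations of K[z]: the earlier input is simply moved to higher powers of z.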
(4.11) REMARK. There is a strict duality in the setup used to define f. From the point of view of homological algebra [MAC LANE 1963], this duality looks as follows. Since every free module is projective, the natural map

$$\Omega \to X_f: \omega \mapsto [\omega]_f$$

exhibits $X_f$ as the image of a projective module. On the other hand, there is a bijection between the set $X_f$ and the set $Z_f = f(\Omega)$. $Z_f$ is clearly a K[z]-submodule of $\Gamma$ (with $z\cdot f(\omega) = f(z\cdot\omega)$), and so $X_f$ and $Z_f$ are isomorphic also as K[z]-modules. It is known that $\Gamma$ is an injective module [MAC LANE 1963, page 95, Exercise 2]. So the natural map

$$X_f \to Z_f: [\omega]_f \mapsto f(\omega)$$

exhibits $X_f$ as a submodule of an injective module. This fact is basic in the construction of the "transfer function" associated with f (Section 7), but its full implications are not yet understood at present.

There is an easy counterpart of Theorem (4.2) which concerns a dynamical system given in "internal" form:

(4.12) PROPOSITION. The state set $X_\Sigma$ of every discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ admits the structure of a K[z]-module.

PROOF. By definition (see (1.10)), $X_\Sigma = K^n$ is already a K-vector space. We make it into a K[z]-module by defining

(4.13) $$\cdot: K[z] \times K^n \to K^n: (\pi, x) \mapsto \pi(F)x.$$
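The module action (4.13) amounts to evaluating a polynomial at the matrix F and applying the result to x. A minimal sketch, assuming polynomials are coefficient lists in increasing powers of z and vectors are plain Python lists; the function names `mat_vec` and `act` are hypothetical, and Horner's scheme is used only as a convenient way to evaluate $\pi(F)x$.

```python
def mat_vec(F, x):
    """Matrix-vector product F x."""
    return [sum(a * b for a, b in zip(row, x)) for row in F]

def act(pi, F, x):
    """pi . x = pi(F) x, evaluated by Horner's scheme.

    For pi = p0 + p1 z + ... + pd z^d this computes
    p0 x + F (p1 x + F (p2 x + ...)).
    """
    result = [0] * len(x)
    for c in reversed(pi):
        result = [c * xi + ri for xi, ri in zip(x, mat_vec(F, result))]
    return result

F = [[0, 1], [-2, -3]]
x = [1, 0]
print(act([0, 1], F, x))       # z . x = F x
print(act([2, 3, 1], F, x))    # (z^2 + 3z + 2) . x
```

Here $z^2 + 3z + 2$ is the characteristic polynomial of the example matrix, so by the Cayley-Hamilton theorem it sends every x to zero; this is the kind of torsion behaviour discussed later in this section.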
(4.14) COMMENT. The construction used in the proof of (4.12) is the classical trick of studying the properties of a fixed linear map $F: K^n \to K^n$ via the K[z]-module structure that F induces on $K^n$ by (4.13). In view of the canonical construction of $\Sigma_f$ provided by Proposition (3.16), the state set X can be treated as a K[z]-module irrespective as to whether X is constructed from f ($X = X_f$) or given a priori as part of the specification of $\Sigma$ ($X = X_\Sigma$). Thus the K[z]-module structure on X is a nice way of uniting the "external" and the "internal" definitions of a dynamical system. Henceforth we shall talk about a (discrete-time, linear, constant dynamical) system $\Sigma$ somewhat imprecisely via properties of its associated K[z]-module $X_\Sigma$.
We shall now give some examples of using module-theoretic language to express standard facts encountered before.

(4.15) PROPOSITION. If X is the state-module of $\Sigma$, the map $F_\Sigma$ is given by $X \to X: x \mapsto z\cdot x$.

PROOF. This is obvious from (4.13) if $X = X_\Sigma$. If $X = X_f$, then we find that, by (1.17),

$$x(1) = F_\Sigma x(0) + G u(0);$$

since $x(0) = [\omega]_f$ results from input $\omega$, $x(1)$ results from input $z\cdot\omega + u(0)$, and we get (taking $u(0) = 0$)

$$F_\Sigma [\omega]_f = [z\cdot\omega]_f = z\cdot[\omega]_f.$$

So the assertion is again verified.

Now we can replace Proposition (2.14) by the much more elegant
(4.16) PROPOSITION. A system $\Sigma = (F, G, -)$ is completely reachable if and only if the columns of G generate $X_\Sigma$.

PROOF. The claim is that complete reachability is equivalent to the fact that every element $x \in X_\Sigma$ is expressible as

$$x = \sum_{j=1}^m \pi_j\cdot g_j, \qquad \pi_j \in K[z].$$

In view of (4.15), this is the same as requiring that x be expressible as

$$x = \sum_{j=1}^m \pi_j(F) g_j;$$

this last condition is equivalent to complete reachability by (2.14).

(4.17) COROLLARY. The reachable states of $\Sigma$ are precisely those of the submodule of $X_\Sigma$ generated by (the columns of) G.

REMARK. The statement that "$\Sigma$ is not completely reachable" simply means that $X_\Sigma$ is not generated by those vectors which make up the matrix G in the specification of the input side of the system $\Sigma$. It does not follow that $X_\Sigma$ cannot be finitely generated by some other vectors. In fact, to avoid unnecessary generality, we shall henceforth assume that $X_\Sigma$ is always finitely generated over K[z]. From the system-theoretic point of view, the case when we need infinitely many generators, that is, infinitely many input channels, seems rather bizarre at present.

(4.19) PROPOSITION. The system $\Sigma_f$ is completely reachable.

PROOF. Obvious from the notation: a state $x = [\xi]_f$ is reached by $\xi \in \Omega$.

(4.20) PROPOSITION. The system $\Sigma_f$ is completely observable.

PROOF. Obvious from Lemma (h) above: the output sequence $f(\omega)$ produced from the state $[\omega]_f$ is 0 iff $\omega \in [0]_f$, which says that the only unobservable state of $\Sigma_f$ is $0 \in X_f$.
Let us generalize the last result to obtain a module-theoretic criterion for complete observability. There are two technically different ways of doing this. The first depends on the observation that the "dual" of a submodule (see Corollary (4.17)) is a quotient module. The second defines observability via the "dual" system (F', H', -) associated with (F, -, H).

Consider a dynamical system $\Sigma = (F, -, H)$ and the corresponding K[z]-module $X_\Sigma$ and K-homomorphism $H: X_\Sigma \to Y = K^p$. We can extend H to a K[z]-homomorphism $\bar H: X_\Sigma \to \Gamma$ (look back at (2.8)) by setting

$$\bar H: x \mapsto (Hx, H(z\cdot x), H(z^2\cdot x), \ldots).$$

From Definition (2.19) we see that no nonzero element of the quotient module $X_\Sigma/\mathrm{kernel}\, \bar H$ is unobservable. Hence, by abuse of language, we can say that $X_\Sigma/\mathrm{kernel}\, \bar H$ is the module of observable states of $\Sigma$. Thus we arrive at phrasing the counterparts of (4.16-17) in the following language:

(4.21) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if the quotient module $X_\Sigma/\mathrm{kernel}\, \bar H$ is isomorphic with $X_\Sigma$.

(4.22) COROLLARY. The observable states of $\Sigma$ are to be identified with the elements of the quotient module $X_\Sigma/\mathrm{kernel}\, \bar H$.

(4.23) TERMINOLOGY. The preceding considerations suggest viewing a system $\Sigma$ as essentially the same "thing" as a module X. Strictly speaking, however, knowing $\Sigma = (F, G, H)$ gives us not only $X_\Sigma = K^n$ (see (4.13)) but also a quotient module (over kernel $\bar H$) of a submodule (that generated by G) of $X_\Sigma$, that is,

$$K[z]G / (K[z]G \cap \mathrm{kernel}\, \bar H).$$

If this module is isomorphic with $X_\Sigma$, we say that $X_\Sigma$ is canonical (relative to the given G, H).

To be more precise, let us observe the following stronger version of (4.19-20):
(4.24) CORRESPONDENCE THEOREM. There is a bijective correspondence between K[z]-homomorphisms $f: \Omega \to \Gamma$ and the equivalence class of completely reachable and completely observable systems $\Sigma$ modulo a basis change in $X_\Sigma$.

Detailed discussion of this result is postponed until Section 7.

A stricter observation of the "duality principle" leads to

(4.25) DEFINITION. The K-linear dual of $\Sigma = (F, G, H)$ is $\Sigma^* = (F', H', G')$ (' = matrix transposition). The states of $\Sigma^*$ are called costates of $\Sigma$.

The following fact is an immediate consequence of this definition:

(4.26) PROPOSITION. The state set $X_{\Sigma^*}$ of $\Sigma^*$ may be given the structure of a $K[z^{-1}]$-module, as follows: (i) as a vector space $X_{\Sigma^*}$ is the dual of $X_\Sigma$ regarded as a K-vector space, (ii) the scalar product in $X_{\Sigma^*}$ is defined by

$$(z^{-1}\cdot x^*)(x) = x^*(z\cdot x).$$

(4.26a) REMARK. We cannot define $X_{\Sigma^*}$ as $\mathrm{Hom}_{K[z]}(X_\Sigma, K[z])$ (equal to the K[z]-linear dual of $X_\Sigma$) because every torsion module M over an integral domain D has a trivial D-dual. However, the reader can verify (using the ideas to be developed in Section 6) that $X_{\Sigma^*}$ defined above is isomorphic with $\mathrm{Hom}_{K[z]}(X_\Sigma, K(z)/K[z])$. See BOURBAKI [Algèbre, Chapter 7 (2e éd.), Section 4, No. 8].
Now we verify easily the following dual statements of (4.16-17):

(4.27) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if H' generates $X_{\Sigma^*}$.

(4.28) COROLLARY. The observable costates of $\Sigma$ are precisely the reachable states of $\Sigma^*$, that is, those of the submodule of $X_{\Sigma^*}$ generated by H'.

We have eliminated the abuse of language incurred by talking about "observable states" through introduction of the new notion of "observable costates". The full explication of why this is necessary (as well as natural) is postponed until Section 10.

The preceding simple facts depend only on the notion of a module and are immediate once we recognize the fact that F may be eliminated from statements such as (2.8) by passing to the module induced by F via (4.13). But module theory yields many other, less obvious results as well, which derive mainly from the fact that K[z] is a principal-ideal domain. We recall: an element m of an R-module M (R = arbitrary commutative ring) has torsion iff there is a nonzero $r \in R$ such that $r\cdot m = 0$. If this is not the case, m is free. Similarly, M is said to be a torsion module iff every element of M has torsion. M is a free module if no nonzero element has torsion. If $L \subset M$ is any subset of M, the annihilator $A_L$ of L is the set

$$\{r: r\cdot l = 0 \text{ for all } l \in L\};$$

it follows immediately that $A_L$ is an ideal in R. Note also that the statement "M is a torsion module" does not imply in general that $A_M$ is nontrivial, that is, $A_M \neq 0$. (Counterexample: take an M which is not finitely generated.) Coupling these notions with the special fact that, for us, R = K[z], we get a number of interesting system-theoretic results:

(4.29) PROPOSITION. $\Sigma$ is finite-dimensional if and only if $X_\Sigma$ is a torsion $K[z]$-module.

COROLLARY. If $X_\Sigma$ is free, $\Sigma$ is infinite-dimensional.

PROOF. We recall that "$\Sigma$ = finite-dimensional" is defined to be "$X_\Sigma$ = finite-dimensional as a $K$-vector space"; see (1.18).

Sufficiency. By assumption $X_\Sigma$ is finitely generated by, say, $q$ nonzero elements $x_1, \ldots, x_q$ of $X_\Sigma$ (which are not necessarily the columns of $G$). Hence every $x \in X_\Sigma$ has a representation

$x = \sum_{j=1}^{q} \xi_j\cdot x_j$, $\xi_j \in K[z]$.

Since $K[z]$ is a principal-ideal domain, each of the $A_{x_j}$ is a principal ideal, say, $\psi_j K[z]$ with $\psi_j \in K[z]$. If $X_\Sigma$ is a torsion module, then $\deg\psi_j = n_j > 0$ for all $j = 1, \ldots, q$. For otherwise $\psi_j$ is either zero (and then $x_j$ is free, which is a contradiction) or a unit, which implies $x_j = 0$, contrary to assumption. Hence we can replace each expression $\xi_j\cdot x_j$ by the simpler one $\tilde\xi_j\cdot x_j$, where $\tilde\xi_j = \xi_j \bmod \psi_j$ has degree less than $n_j$, which shows that $X_\Sigma$, as a $K$-module, is generated by the finite set

$\{z^k\cdot x_j\colon k = 0, \ldots, n_j - 1;\ j = 1, \ldots, q\}$.

Necessity. Let $\psi_F$ be the minimal polynomial of the map $F\colon x \mapsto z\cdot x$. If $X_\Sigma$ is finite-dimensional as a $K$-module, $\deg\psi_F > 0$. This means (by the usual definition of the minimal polynomial in matrix theory or more generally in linear algebra) that $\psi_F$ annihilates every $x \in X_\Sigma$, so that $X_\Sigma$ is a torsion $K[z]$-module.
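
The necessity argument can be checked on a small example. The following is a minimal sketch in exact arithmetic, assuming a hypothetical 2 x 2 matrix $F$; it uses the characteristic polynomial (a multiple of the minimal polynomial, by the Cayley-Hamilton theorem), computed with the Faddeev-LeVerrier recursion, which is a technique swapped in here for illustration and is not taken from the text. The names `char_poly` and `apply_poly` are hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def char_poly(F):
    """Coefficients [1, a1, ..., an] of det(zI - F) (Faddeev-LeVerrier recursion)."""
    n = len(F)
    F = [[Fraction(x) for x in row] for row in F]
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]           # M_0 = 0
    for k in range(1, n + 1):
        FM = mat_mul(F, M)
        # M_k = F M_{k-1} + a_{k-1} I
        M = [[FM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        FMk = mat_mul(F, M)
        trace = sum(FMk[i][i] for i in range(n))
        coeffs.append(-trace / k)                        # a_k = -tr(F M_k) / k
    return coeffs

def apply_poly(coeffs, F, x):
    """Return p(F) x for p(z) = coeffs[0] z^n + ... + coeffs[n] (Horner scheme)."""
    n = len(x)
    v = [Fraction(0)] * n
    for c in coeffs:
        Fv = [sum(F[i][k] * v[k] for k in range(n)) for i in range(n)]
        v = [Fv[i] + c * x[i] for i in range(n)]
    return v

F = [[0, 1], [-2, 3]]        # hypothetical example
chi = char_poly(F)           # coefficients of z^2 - 3z + 2
state = [5, 7]               # an arbitrary state x
print(chi, apply_poly(chi, F, state))   # the second list is the zero vector
```

Every state is annihilated by the same polynomial, which is the torsion property used in the proof.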
Notice, from the second half of the proof, that the notion of a minimal polynomial can be extended from $K$-linear algebra to $K[z]$-modules. In fact, the same argument gives us also the well-known

(4.30) PROPOSITION. Every finitely generated torsion module $M$ over a principal-ideal domain $R$ has a nontrivial minimal polynomial $\psi_M$ given by $A_M = \psi_M R$.

(4.31) COROLLARY. If a $K[z]$-module $X$ is finitely generated with $q$ generators and minimal polynomial $\psi_X$, then

dim $X$ (as $K$-vector space) $\le q\cdot\deg\psi_X$.

(4.32) REMARK. The fact that $X_f$ is completely reachable and is therefore generated by $m$ vectors allows us to estimate the dimension of $\Sigma_f$ by (4.31) knowing only $\deg\psi_{X_f}$ but without having computed $X_f$ itself. (Knowing $X_f$ explicitly means knowing $F\colon x \mapsto z\cdot x$, etc.) In other words, the module-theoretic setup considerably enhances the content of Proposition (3.16). Guided by these observations, we shall develop in Section 8 explicit algorithms for calculating dim $\Sigma_f$ directly from $f$ without first having to compute $F$.
(4.33) PROPOSITION. If $X_\Sigma$ is a free $K[z]$-module, no state of $\Sigma$ can be simultaneously reachable and controllable.

PROOF. We recall that "$X_\Sigma$ = free" means that $X_\Sigma$ is (isomorphic to) a finite sum of copies of $K[z]$. Suppose for simplicity that $X_\Sigma = K[z]$. Then $x$ = reachable means that $x = \xi\cdot 1$ for some $\xi \in K[z]$. Similarly, $x$ = controllable means that $z^{|\omega|}\cdot x + \omega\cdot 1 = 0$ for some $\omega \in K[z]$. Hence if $x$ has both properties,

$(z^{|\omega|}\xi + \omega)\cdot 1 = 0$.

This shows that $1$ is annihilated by $z^{|\omega|}\xi + \omega$, the input $\xi$ followed by $\omega$, which contradicts the assumption that $X_\Sigma$ is free.
The most important consequence of Theorem (4.2) is due to the fact that through it we can apply to linear dynamical systems the well-known

(4.34) FUNDAMENTAL STRUCTURE THEOREM FOR FINITELY GENERATED MODULES OVER A PRINCIPAL IDEAL DOMAIN $R$ (Invariant Factor Theorem for Modules). Every such module $M$ with $m$ generators is isomorphic to

(4.35)  $R/\psi_1 R \oplus \cdots \oplus R/\psi_q R \oplus R^s$,

where the $R/\psi_i R$ are quotient rings of $R$ viewed as modules over $R$, the $\psi_i$ (called the invariant factors of $M$) are uniquely determined by $M$ up to units in $R$, $\psi_i \mid \psi_{i-1}$, $i = 2, \ldots, q$, and, as usual, $R^s$ denotes the free $R$-module with $s$ generators; finally, $q + s \le m$.

Various proofs of this theorem are referenced in KALMAN, FALB, and ARBIB [1969, page 270], and one is given later in Section 6.

Note: The divisibility conditions imply that $M$ is a torsion module iff $s = 0$ and then $\psi_M = \psi_1$.
One important consequence of this theorem (others in Section 7) is that it gives us the most general situation when $X_\Sigma$ is not a torsion module. For instance, combining (4.33) with (4.34) we get

(4.36) PROPOSITION. A system $\Sigma$ cannot be simultaneously completely reachable and completely controllable if its $K[z]$-module $X_\Sigma$ has any infinite-dimensional components (i.e., $s > 0$ in (4.35)).

(4.37) REMARK. Although our entire development in this section may be regarded as a deep examination of Proposition (2.14), most of our comments apply equally well to (2.7), since both statements rest on the same algebraic condition (2.8). In fact, the only remaining thing to be "algebraized" is the notion of "continuous-time". We shall not do this here. Once this last step is taken, the algebraization of the Laplace transform (as related to ordinary linear differential equations) will be complete.
5. CYCLICITY AND RELATED QUESTIONS
We recall that an $R$-module $M$ ($R$ = arbitrary ring) is cyclic iff there is an element $m \in M$ such that $M = Rm$. [It would be better to say that such a module is monogenic: generated by one element $m$.] If $M$ is cyclic, the map $R \to M\colon r \mapsto r\cdot m$ is an epimorphism and has kernel $A_m$, the annihilating ideal of $m$. This plus the homomorphism theorem gives the well-known

(5.1) PROPOSITION. Every cyclic $R$-module $M$ with generator $m$ is isomorphic with the quotient ring $R/A_m$, viewed as an $R$-module.

This result is much more interesting when, as in our case, $R$ is not only commutative and a principal-ideal domain, but specifically the polynomial ring $K[z]$. So let $X$ be a cyclic $K[z]$-module with generator $g$ and let $A_g = \psi_g K[z]$, where $\psi_g$ is the minimal or annihilating polynomial of $g$. By commutativity and cyclicity, $A_g = A_X$. Hence $\psi_g$ is a minimal polynomial also for $X$. Write $\psi_g = \psi_X = \psi$. In view of (5.1), $X \cong K[z]/\psi K[z]$. Let us recall some features of the ring $K[z]/\psi K[z]$:

(i) Its elements are the residue classes of polynomials $\pi$ (mod $\psi$), $\pi \in K[z]$. Write these as $[\pi]$ or $[\pi]_\psi$. Multiplication is defined as $[\pi]\cdot[\sigma] = [\pi\sigma]$.

(ii) Each $[\pi]$ is either a unit or a divisor of zero. In fact, $[\pi]$ is a unit iff $(\pi, \psi)$ = greatest common divisor of $\pi$, $\psi$ is a unit in $K[z]$ (that is, $(\pi, \psi) \in K$). Then $\sigma\pi + \tau\psi = 1$ ($\sigma, \tau \in K[z]$), so that $[\tau\psi] = 0$ and $[\sigma]$ is the inverse of $[\pi]$. On the other hand, if $(\pi, \psi) = \theta \neq$ unit in $K[z]$, then both $[\pi]$ and $[\psi/\theta]$ are zero divisors since $[\pi]\cdot[\psi/\theta] = [(\pi/\theta)\psi] = 0$.

(iii) If $\psi$ is a prime in $K[z]$ (that is, an irreducible polynomial with respect to coefficients over the ground field $K$), then by (ii) $K[z]/\psi K[z]$ is a field. This is a very standard construction in algebraic number theory.

Since it is awkward to compute with equivalence classes $[\pi]$, we shall often prefer to work with the standard representative of $[\pi]$, namely a polynomial $\tilde\pi$ of least degree in $[\pi]$. $\tilde\pi$ is uniquely determined by $[\pi]$ and the condition $\deg\tilde\pi < \deg\psi$. Henceforth the tilde will always be used in this sense. The next two assertions are immediate:

(5.2) PROPOSITION. $K[z]/\psi K[z]$ as a $K$-vector space is isomorphic to the $K$-vector space $K^{(n)}[z] = \{\xi \in K[z]\colon \deg\xi < n = \deg\psi\}$. It is also isomorphic to $K^{(n)}[z]$ as a $K[z]$-module, provided we define the scalar product in $K^{(n)}[z]$ as $\pi\cdot\xi = \widetilde{\pi\xi}$.

(5.3) PROPOSITION. If $X_\Sigma$ is cyclic with minimal polynomial $\psi$, then dim $\Sigma = \deg\psi$.
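
Feature (ii) of the ring $K[z]/\psi K[z]$ can be tried out by computation: a class $[\pi]$ is inverted with the extended Euclidean algorithm exactly when $\gcd(\pi, \psi)$ is a constant. The following is a minimal sketch over $K = \mathbb{Q}$; polynomials are stored as coefficient lists with the constant term first, and all function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are constant-term first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def divmod_poly(p, q):
    """Quotient and remainder of p by a nonzero polynomial q."""
    q = trim(q)
    rem = trim([Fraction(c) for c in p])
    quo = [Fraction(0)] * max(len(rem) - len(q) + 1, 1)
    while len(rem) >= len(q):
        shift = len(rem) - len(q)
        c = rem[-1] / q[-1]
        quo[shift] = c
        for i, b in enumerate(q):
            rem[shift + i] -= c * b
        rem = trim(rem)
    return trim(quo), rem

def inverse_mod(pi, psi):
    """Inverse of [pi] in K[z]/psi K[z], or None if [pi] is not a unit."""
    r0, r1 = trim([Fraction(c) for c in pi]), trim([Fraction(c) for c in psi])
    s0, s1 = [Fraction(1)], []               # invariant: r_i = s_i * pi (mod psi)
    while r1:
        quo, rem = divmod_poly(r0, r1)
        r0, r1 = r1, rem
        s0, s1 = s1, add(s0, mul([-c for c in quo], s1))
    if len(r0) != 1:                          # gcd is not a nonzero constant
        return None
    return divmod_poly([c / r0[0] for c in s0], psi)[1]

psi = [-1, 0, 1]                              # psi(z) = z^2 - 1
print(inverse_mod([2, 1], psi))               # z + 2 is prime to psi: a unit
print(inverse_mod([-1, 1], psi))              # z - 1 divides psi: None
```

In the second call $[z - 1]$ is a zero divisor, since $(z - 1)(z + 1) \equiv 0 \pmod{\psi}$.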
Looking back at Theorem (4.34), we see that the most general finitely generated $K[z]$-module is a direct sum of cyclic $K[z]$-modules. By combining (5.3) and (4.34) and using the fact that dimension is additive under direct summing, we can replace (4.31) by the following exact result:

(5.4) PROPOSITION. If $X_\Sigma$ is a torsion module with invariant factors $\psi_1, \ldots, \psi_q$, then

dim $\Sigma = \deg\psi_1 + \cdots + \deg\psi_q$.
A simple but highly useful consequence of cyclicity is the so-called control canonical form [KALMAN, FALB, and ARBIB, 1969, page 44] for a completely reachable pair $(F, g)$ where $g$ is an $n \times 1$ matrix. We shall now proceed to deduce this result.

Observe first that "$(F, g)$ completely reachable" is equivalent to "$g$ generates $X_F$, the module induced by $F$ via (4.13)." Let

$\chi_F(z) = \det(zI - F)$,

then $\chi_F$ is the characteristic (and also the minimal) polynomial for $X_F$. [This is a well-known fact of module theory. See for example KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 7] for detailed discussion.]
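As in KALMAN [1962], consider the vectors

(5.5)  $e_j = \chi^{(n-j+1)}(z)\cdot g$, $j = 1, \ldots, n$, where $\chi^{(j)}(z) = z^{j-1} + \alpha_1 z^{j-2} + \cdots + \alpha_{j-1}$,

in $X_F$; here $\chi_F(z) = z^n + \alpha_1 z^{n-1} + \cdots + \alpha_n$. [For consistency, $\chi_F(z) = \chi^{(n+1)}(z)$.] These vectors are easily seen to be linearly independent over $K$. They generate $X_F$ since the $\chi^{(j)}$ generate $K^{(n)}[z]$ as a $K$-vector space (Proposition (5.2)). Hence $e_1, \ldots, e_n$ are a basis for $X_F$ as a $K$-vector space. With respect to this basis, the $K$-homomorphism $z\colon x \mapsto z\cdot x$ is represented by the matrix

(5.6)  $F = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_n & -\alpha_{n-1} & -\alpha_{n-2} & \cdots & -\alpha_1 \end{bmatrix}$.

This is proved by direct computation. In particular, it is necessary to use the fact that

$z\cdot e_1 = (\chi_F(z) - \alpha_n)\cdot g = -\alpha_n e_n$.

Note that the last row of $F$ in (5.6) consists of the negatives of the coefficients $\alpha_n, \ldots, \alpha_1$ of $\chi_F$. By definition, $g = e_n$. Hence $g$ as a column vector in $K^n$ has the representation

(5.7)  $g = (0, \ldots, 0, 1)'$.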
Conversely, suppose $(F, g)$ have the matrix representation (5.6-7) with respect to some basis in $X_F$. Then (by direct computation) the rank condition (2.8) is satisfied and therefore $(F, g)$ is completely reachable in both the continuous-time and discrete-time cases (Propositions (2.7) and (2.16)). We have now proved:

(5.8) PROPOSITION. The pair $(F, g)$ is completely reachable if and only if there is a basis relative to which $F$ is given by (5.6) and $g$ by (5.7).
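
The passage to the control canonical form can be checked numerically. The sketch below assumes the indexing $\chi^{(j)}(z) = z^{j-1} + \alpha_1 z^{j-2} + \cdots + \alpha_{j-1}$ and $e_j = \chi^{(n-j+1)}(F)g$ (so that $e_n = g$); this indexing, the example pair $(F, g)$ and the function names are assumptions made for illustration. It verifies the relations $Fe_1 = -\alpha_n e_n$ and $Fe_j = e_{j-1} - \alpha_{n-j+1}e_n$, which say that $F$ acts on the basis $e_1, \ldots, e_n$ as a companion matrix.

```python
def mat_vec(F, v):
    """Matrix-vector product with F given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in F]

def canonical_basis(F, g, a):
    """Return [e_1, ..., e_n] with e_j = chi^{(n-j+1)}(F) g.

    a = [1, a1, ..., an] holds the coefficients of the characteristic
    polynomial z^n + a1 z^{n-1} + ... + an, supplied by the caller.
    """
    n = len(F)
    vectors = []                  # chi^{(1)}(F) g, chi^{(2)}(F) g, ...
    v = list(g)                   # chi^{(1)} = 1
    for k in range(1, n + 1):
        vectors.append(v)
        Fv = mat_vec(F, v)        # Horner step: chi^{(k+1)} = z chi^{(k)} + a_k
        v = [Fv[i] + a[k] * g[i] for i in range(n)]
    return vectors[::-1]

def acts_as_companion(F, e, a):
    """Check F e_1 = -a_n e_n and F e_j = e_{j-1} - a_{n-j+1} e_n for j >= 2."""
    n = len(F)
    ok = mat_vec(F, e[0]) == [-a[n] * x for x in e[n - 1]]
    for j in range(2, n + 1):     # j is the 1-based index used in the text
        expected = [e[j - 2][i] - a[n - j + 1] * e[n - 1][i] for i in range(n)]
        ok = ok and mat_vec(F, e[j - 1]) == expected
    return ok

# Hypothetical example: F is triangular, so chi(z) = (z-1)(z-2)(z-3).
F = [[1, 1, 0], [0, 2, 1], [0, 0, 3]]
g = [0, 0, 1]
a = [1, -6, 11, -6]               # z^3 - 6z^2 + 11z - 6
e = canonical_basis(F, g, a)
print(e, acts_as_companion(F, e, a))
```

The relations themselves follow from the Cayley-Hamilton theorem alone; complete reachability of $(F, g)$ is what makes $e_1, \ldots, e_n$ linearly independent and hence a basis.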
(5.9) COROLLARY. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$, $K$ = arbitrary field. There exists an $n$-vector $\ell$ such that $\lambda = \chi_{F - g\ell'}$ if and only if the pair $(F, g)$ is completely reachable.

PROOF. Suppose that $(F, g)$ is completely reachable. With respect to the same basis (5.5) which exhibits the canonical forms (5.6-7), define

$\ell' = (\beta_n - \alpha_n, \ldots, \beta_1 - \alpha_1)$.

Then verify by direct computation that $\lambda = \chi_{F - g\ell'}$.

Conversely, suppose that $(F, g)$ is not completely reachable. Then, recalling Proposition (2.12) (which is an algebraic consequence of (2.8) and hence equally valid for both continuous-time and discrete-time), we have $X = X_1 \oplus X_2$ with dim $X_2 > 0$. Since $X_1$ is an $F$-invariant subspace of $X = K^n$, the polynomial $\chi_{11} = \chi_{F_{11}}$ is independent of the choice of basis in $X$, and the same is true then also for $\chi_{22} = \chi_F/\chi_{11}$, whose degree is dim $X_2 > 0$. (In particular, $\chi_{22}$ does not depend on the arbitrary choice of $X_2$ in satisfying the condition $X = X_1 \oplus X_2$.) In view of (2.12), we have for all $n$-vectors $\ell$,

$\chi_{F - g\ell'} = \chi_{F_{11} - g_1\ell_1'}\,\chi_{22}$.

This contradicts the claim that $\lambda = \chi_{F - g\ell'}$ is true for any $\lambda$ with suitable choice of $\ell$.
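
The "direct computation" in the sufficiency part can be sketched as follows, assuming the pair is already in the canonical form (5.6-7), that is, $F$ is the companion matrix with last row $(-\alpha_n, \ldots, -\alpha_1)$ and $g = (0, \ldots, 0, 1)'$. The helper names and the numerical coefficients are hypothetical. The check relies on the classical fact that such a companion matrix has characteristic polynomial $z^n + \alpha_1 z^{n-1} + \cdots + \alpha_n$.

```python
def companion(coeffs):
    """Companion matrix for z^n + c1 z^{n-1} + ... + cn, coeffs = [c1, ..., cn].

    Ones on the superdiagonal, last row (-cn, ..., -c1).
    """
    n = len(coeffs)
    F = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]
    F[n - 1] = [-c for c in reversed(coeffs)]
    return F

def feedback_row(alpha, beta):
    """Row vector l' = (beta_n - alpha_n, ..., beta_1 - alpha_1)."""
    return [b - a for a, b in zip(reversed(alpha), reversed(beta))]

def closed_loop(F, g, l):
    """Matrix F - g l' (g a column, l' a row)."""
    n = len(F)
    return [[F[i][j] - g[i] * l[j] for j in range(n)] for i in range(n)]

alpha = [-6, 11, -6]      # open loop:   z^3 - 6z^2 + 11z - 6
beta = [3, 3, 1]          # closed loop: z^3 + 3z^2 + 3z + 1 = (z + 1)^3
F = companion(alpha)
g = [0, 0, 1]
l = feedback_row(alpha, beta)
print(l, closed_loop(F, g, l) == companion(beta))
```

Only the last row of $F$ changes under the feedback, which is why the new coefficients $\beta_i$ can be prescribed freely.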
In view of the importance of this last result, we shall rephrase it in purely module-theoretic terms:

(5.11) THEOREM. Let $K$ be an arbitrary field and $X$ a cyclic $K[z]$-module with generator $g$ and minimal polynomial $\chi$ of degree $n$. There is a bijection between $n$-th degree polynomials $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$ and $K$-homomorphisms

$\ell\colon X \to X\colon \chi^{(j)}\cdot g \mapsto \ell_j g$

($j = 1, \ldots, n$ and $\chi^{(j)}$ defined as in (5.5)) such that $\lambda$ is the minimal polynomial for the new module structure induced on $X$ by the map

$z_\ell\colon x \mapsto z\cdot x - \ell(x)$.

Note that $\ell(x)$ in (5.11) corresponds to $g\ell'x$ in (5.10). The map $\ell$ in (5.11) defines a control law for the system $\Sigma = (F, g, -)$ corresponding to the module $X$. The passage from $z$ to $z_\ell$ is the module-theoretic form of the well-known open-loop to closed-loop transformation used in classical linear control theory.
PROOF. Since the vectors $\chi^{(1)}\cdot g, \ldots, \chi^{(n)}\cdot g$ form a basis for $X$, $\ell$ is clearly a well-defined $K$-homomorphism. We treat $\ell$ formally as an element of $K[z]$ (that is, an operator on $X$, where $X$ is a $K$-vector space), by writing $\ell\cdot x = \ell(\xi\cdot g)$, where $\xi$ represents the equivalence class $[\xi] = \{\xi\colon \xi\cdot g = x\}$. Unless identically zero, $\ell$ is never a $K[z]$-homomorphism and therefore does not commute with nonunits in $K[z]$. Define

(5.12)  $\ell_j = \beta_j - \alpha_j$, $j = 1, \ldots, n$.

We prove first that this choice of $\ell$ implies

$\lambda^{(j)}(z_\ell)\cdot g = \chi^{(j)}(z)\cdot g$ for $j = 1, \ldots, n + 1$.

Use induction on $j$. By definition, $\lambda^{(1)}(z_\ell) = \chi^{(1)}(z)$. In the general case,

$\lambda^{(j+1)}(z_\ell)\cdot g = z_\ell\cdot\lambda^{(j)}(z_\ell)\cdot g + \beta_j g = z\cdot\chi^{(j)}(z)\cdot g - \ell(\chi^{(j)}(z)\cdot g) + \beta_j g = \chi^{(j+1)}(z)\cdot g$.

So the open-loop/closed-loop transformation is essentially a change in the canonical basis, provided $X$ is cyclic. It is interesting that the $\chi^{(j)}$ have long been known in Algebra (they are related to the Tschirnhausen transformation discussed extensively by WEBER [1898, §§54, 74, 85, 96]), but their present (very natural) use in module theory seems to be new.

*Theorem (5.11) may be viewed as the central special case of Theorem A of the Introduction. Let us restate the latter in precise form as follows:
(5.13) THEOREM. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$, $K$ = arbitrary field. There exists an $n \times m$ matrix $L$ over $K$ such that $\chi_{F - GL'} = \lambda$ if and only if $(F, G)$ is completely reachable.

For some time, this result had the status of a well-known folk theorem, considered to be a straightforward consequence of (5.9). The latter has been discovered independently by many people. (I first heard of it in 1958, proposed as a conjecture by J. E. Bertram and proved soon afterwards by the so-called root-locus method.) Indeed, the passage from (5.11) to (5.13) is primarily a technical problem. A proof of (5.13) was given by LANGENHOP [1964] and subsequently simplified by WONHAM [1967]. The first proof was (unnecessarily) very long, but the second proof is also unsatisfactory, since it depends on arguments using a splitting field and fails when $K$ is a finite field. We shall use this situation as an excuse to illustrate the power of the module-theoretic approach and to give a proof of (5.13) valid for arbitrary fields.

*The material between these marks was added after the Summer School.

The procedure of LANGENHOP and WONHAM rests on the following fact, of which we give a module-theoretic proof:
(5.14) LEMMA. Let $K$ be an arbitrary but infinite field. Let $F$ be cyclic* and $(F, G)$ completely reachable. Then there is an $m$-vector $a \in K^m$ such that $(F, Ga)$ is also completely reachable.

*Of course, this means that the $K[z]$-module $X_F$ is cyclic (see (4.13)).

We begin with a simple remark, which is also useful in reducing the proof of (5.13) to Lemma (5.18):

(5.15) SUBLEMMA. Every submodule of a cyclic module over a principal-ideal domain is cyclic.

PROOF OF (5.14). We use induction on $m$. The case $m = 1$ is trivial. The general case amounts to the following. Consider the submodule $Y$ of $X = X_F$ generated by the columns $g_1, \ldots, g_{m-1}$ of $G$. In view of (5.15), $Y$ is cyclic. By the inductive hypothesis, we are given the existence of a cyclic generator of $Y$ of the form $g_Y = a_1 g_1 + \cdots + a_{m-1} g_{m-1}$, $a_i \in K$. We must prove: for suitable $\alpha, \beta \in K$ the vector $\alpha g_Y + \beta g_m$ is a cyclic generator for $X$.
By hypothesis, $X$ has an (abstract) cyclic generator $g_X$. By cyclicity we have the representations

$g_Y = \eta\cdot g_X$, $g_m = \mu\cdot g_X$ ($\eta, \mu \in K[z]$).

Hence our problem is reduced to proving the following: for suitable $\alpha, \beta \in K$ the polynomial $\alpha\eta + \beta\mu$ is a unit in $K[z]/\psi K[z]$. This, in turn, is equivalent to proving

(5.16)  $\alpha\eta + \beta\mu \not\equiv 0 \pmod{\theta_i}$, $i = 1, \ldots, r$,

where $\theta_1, \ldots, \theta_r$ are the unique prime factors of $\psi$ in $K[z]$. Let $\tilde\eta_i$, $\tilde\mu_i$ mean the representatives of least degree of the equivalence classes $[\eta]_{\theta_i}$, $[\mu]_{\theta_i}$. Then no pair $(\tilde\eta_i, \tilde\mu_i)$, $i = 1, \ldots, r$, can be zero. For if one is, then $\theta_i \mid \eta$ and $\theta_i \mid \mu$, that is, the submodule $K[z]g_Y + K[z]g_m$ is a proper submodule of $X$, contradicting the fact that $(F, G)$ is completely reachable. If all the $\tilde\mu_i$ are zero, then every $\tilde\eta_i \neq 0$, so $\eta$ is a unit in $K[z]/\psi K[z]$, and $g_Y$ is already a cyclic generator. So let $\alpha = 1$. Then the condition

$\tilde\eta_i + \beta\tilde\mu_i = 0$

eliminates at most $r$ values of $\beta$ from consideration. Since $K$ is infinite by hypothesis, there are always some $\beta$ which satisfy (5.16).
An essential part of the lemma is the stipulation that $a \in K^m$. The hypothesis "$F$ = cyclic, $(F, G)$ = completely reachable" means that $g_X = Ga$ for some $a \in K^m[z]$; that is, the lemma is trivially true for some $a \in K^m[z]$. But since we want $a \in K^m$, there must be interaction between vector-space structure and module structure, and for this reason the lemma is nontrivial. As a matter of fact, the lemma is false when $K$ = finite field. The simplest counterexample is provided when (5.16) rules out a single nonzero value of $\beta$, thereby ruling out every nonzero $\beta$.
(5.17) COUNTEREXAMPLE. Let $K = \mathbb{Z}/2\mathbb{Z}$, that is, the ring of integers modulo the prime ideal $2\mathbb{Z}$. Consider a pair $(F, G)$, $G = [g_1, g_2]$, for which $X_F = X_1 \oplus X_2 \oplus X_3$ (as a $K[z]$-module), where the minimal polynomials $\chi_1$, $\chi_2$, $\chi_3$ of the direct summands are relatively prime, $(\chi_i, \chi_j) = 1$ for $i \neq j$; hence $X$ is cyclic. Notice also that $g_1$ generates $X_1 \oplus X_3$ while $g_2$ generates $X_2 \oplus X_3$. A simple calculation shows that conditions (5.16) are here

$\alpha\cdot 1 + \beta\cdot 0 \not\equiv 0 \pmod{\chi_1}$,
$\alpha\cdot 0 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_2}$,
$\alpha\cdot 1 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_3}$.

These conditions have no solution in $\mathbb{Z}/2\mathbb{Z}$.
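
A concrete instance of this kind of counterexample can be checked by brute force. The matrices below are a hypothetical choice made for illustration (they are not the ones of the text): $F$ is the direct sum of the companion matrices of $z$, $z + 1$ and $z^2 + z + 1$ over $\mathbb{Z}/2\mathbb{Z}$, $g_1$ has components in the first and third summands, and $g_2$ in the second and third. The sketch computes ranks of reachability matrices over $\mathbb{Z}/2\mathbb{Z}$; the function names are hypothetical.

```python
from itertools import product

def mat_vec_mod2(F, v):
    """Matrix-vector product over Z/2Z."""
    return [sum(a * x for a, x in zip(row, v)) % 2 for row in F]

def rank_mod2(vectors):
    """Rank over Z/2Z of a list of 0/1 vectors (XOR elimination on bit masks)."""
    basis = []
    for v in vectors:
        x = int("".join(map(str, v)), 2)
        for b in basis:
            x = min(x, x ^ b)     # clears the leading bit of b in x if it is set
        if x:
            basis.append(x)
    return len(basis)

def reachable_dim(F, columns):
    """Rank of [G, FG, ..., F^{n-1}G] for the given columns of G."""
    n = len(F)
    vecs = []
    for g in columns:
        v = list(g)
        for _ in range(n):
            vecs.append(v)
            v = mat_vec_mod2(F, v)
    return rank_mod2(vecs)

# Hypothetical example: blocks for z, z + 1 and z^2 + z + 1 (pairwise prime).
F = [[0, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 1]]
g1 = [1, 0, 0, 1]
g2 = [0, 1, 0, 1]

full = reachable_dim(F, [g1, g2])
single = {a: reachable_dim(F, [[(a[0] * x + a[1] * y) % 2 for x, y in zip(g1, g2)]])
          for a in product((0, 1), repeat=2)}
print(full, single)
```

The pair $(F, G)$ reaches the whole 4-dimensional space, but no single combination $Ga$ with $a$ in $(\mathbb{Z}/2\mathbb{Z})^2$ does: the sum $g_1 + g_2$ loses its component in the third summand because $1 + 1 = 0$.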
At this point, the following is the situation concerning Theorem (5.13):

(1) Its counterpart, Theorem A of the Introduction, was claimed to be true in the continuous-time case under the hypothesis of complete controllability.

(2) In the discrete-time case (5.13) with the preceding hypothesis Theorem A is false, because of the counterexample: the pair ($F$ = nilpotent, $G = 0$) is completely controllable, but evidently $\chi_{F - GL'}$ is independent of $L$. However, in view of (5.11), Theorem (5.13) might be true also in the discrete-time case if "complete controllability" is replaced by "complete reachability", this modification being immaterial in the continuous-time case.

(3) Because of (5.17), we might expect that a theorem like (5.13) is false for an arbitrary field $K$.

(4) If our general claim that reachability properties are reflected in module-theoretic properties is true, then (5.13) should hold without assumptions concerning $K$, because the principal module-theoretic fact, that $K[z]$ = principal ideal domain, is independent of the specific choice of $K$.

We now proceed to establish Theorem (5.13). That is, special hypotheses on $K$ will turn out to be irrelevant.

PROOF OF (5.13). Necessity is proved exactly as in (5.9). Sufficiency will follow by induction on $m$, once we have proved it in the special case $m = 2$:

(5.18) LEMMA. Let $K$ be an arbitrary field and let $X$ be a $K[z]$-module generated by $g_1$, $g_2$. There is a $K$-homomorphism $\ell$ (of the type defined in (5.11)) such that if $z_\ell = z - \ell$ induces a $K[z_\ell]$-module structure on $X$ then $X$ is cyclic with respect to this structure and is generated by either $g_1 + g_2$ or $g_2$.

PROOF. Let $Y = K[z]g_1$ and $Z = K[z]g_2$.

Case 1. $Y \cap Z = 0$, that is, $X = Y \oplus Z$. In (5.11) take an $\ell$ such that $\ell(x) = 0$ for all $x \in Z$. Replacing $z$ by $z_\ell = z - \ell$ will change the $K[z]$-module structure on $Y$ but preserve that on $Z$. Further, choose $\ell$ so that the new minimal polynomial $\lambda$ on $Y$ is prime to the unchanged minimal polynomial $\chi_Z$ on $Z$. Thus there exist polynomials $\nu$, $\sigma$ such that $\nu\lambda + \sigma\chi_Z = 1$. By hypothesis, every $x \in X$ has the representation

$x = \eta\cdot g_1 + \zeta\cdot g_2$.

Now verify that

$(\eta\sigma\chi_Z + \zeta\nu\lambda)\cdot(g_1 + g_2) = \eta\cdot g_1 + \zeta\cdot g_2$.

Hence $g_1 + g_2$ is indeed a cyclic generator for $X$ as a $K[z_\ell]$-module.
Y ~ = z : W
f:
0.
L2t
W.
37 11-ypothesis,
there i s a c y c l i c i t y of Take same w
E ~ [ z ] such t h a t
w = 5 g2 and therefore, by
q E ~ [ z ] such t h a t
Y,
there is also a
S*g2 = w = tl'gl*
0.
Then if q = m d t (mod and so Z = X.
1 - 1 50%
q
5) we
a r e done because
generates Y ,
I n t h e n o n t r i v i a l case,
u n i t (mod
3) TO show: .
a s a - K[ , z
there i s a suitable new module s t r u c t u r e
on X
such t h a t
X
r , = u n i t (mod l
x,),
X ,
being the minimal poly-
nomial of
1-module.
The main facts we need are the following:

(5.19) SUBLEMMA. Let $\chi$ be a fixed element of $K[z]$ with $\deg\chi = n$, $F_\chi$ the companion matrix of $\chi$ given by (5.6), $X_{F_\chi}$ the cyclic module induced by $F_\chi$, and $g$ a cyclic generator of $X_{F_\chi}$. Then $\eta \in K[z]$ is a unit modulo $\chi$ if and only if $\eta\cdot g$ is also a cyclic generator of $X_{F_\chi}$.

PROOF. Obvious.
(5.20) SUBLEMMA. Same notations as in (5.19). Write $\tilde\eta(z) = \sum_{j=1}^{n} \eta_j\chi^{(j)}(z)$ ($\chi^{(j)}$ defined as in (5.5)). Then $\eta$ is a unit modulo $\chi$ if and only if

(5.21)  $\det[\eta, F_\chi\eta, \ldots, F_\chi^{n-1}\eta] \neq 0$,

where $\eta$ is the column vector

(5.22)  $\eta = (\eta_n, \ldots, \eta_1)'$.

PROOF. Since $\chi^{(1)}, \ldots, \chi^{(n)}$ is a basis for the $K$-vector space of all polynomials of degree $< n$, the $n$-tuple $(\eta_1, \ldots, \eta_n)$ is uniquely determined by $\tilde\eta$. By definition $F_\chi$ is the matrix representing the module operator $z\colon x \mapsto z\cdot x$ relative to the special basis $e_1, \ldots, e_n$ given by (5.5). Similarly, using one of the module axioms, we verify that

$\tilde\eta\cdot g = \sum_{j=1}^{n} \eta_j e_{n-j+1}$;

in other words, the numerical vector (5.22) represents the abstract vector $\tilde\eta\cdot g$ in $X_{F_\chi}$ relative to the same basis $e_1, \ldots, e_n$. Recall that $\tilde\eta\cdot g$ generates $X_{F_\chi}$ iff $(F_\chi, \eta)$ is completely reachable. By (2.7) the latter condition is equivalent to (5.21). The rest follows from (5.19).

(5.23) SUBLEMMA. Same notations as in (5.19) and (5.20). Given any nonzero numerical $n$-vector (5.22), there exists a polynomial $\chi$ such that (5.21) is satisfied.

PROOF. Let $\eta_r$ be the first member of the sequence of numbers $\eta_1, \eta_2, \ldots$ which is nonzero. The first coefficients of $\chi$ are then determined by a rule involving the $\eta_j$ (since all numbers belong to a field, the required values of the $\alpha_i$ exist). Now check by computation that these conditions reduce the matrix in (5.21) to the direct sum of two triangular matrices, each with nonzero elements on its diagonal.

In view of (5.12), it follows from these facts that we can always choose a new $\ell$ such that $\eta_\ell$ = unit mod $\chi_\ell$.
The proof of Case 2 is not yet complete, however, because we must still extend the $K[z_\ell]$-module structure from $Y$ to $X$. This is easy. Write first $Z = W \oplus Z'$ and then $X = Y \oplus Z'$, where the direct sum is now with respect to the $K$-module structure of $X$. Extend $\ell$ from $Y$ to $X$ by setting $\ell(Z') = 0$. Now we have a new minimal polynomial $\chi_\ell$ and $z_\ell$ is defined over $X$. Since $z_\ell = z$ on $Z'$, by (5.12), $\zeta$ is replaced by some $\zeta_\ell$ such that

(5.24)  $\zeta_\ell\cdot g_2 = w = \eta_\ell\cdot g_1$;

that is, our previous representation of $w$ in $W$ induces a similar representation with respect to the new $K[z_\ell]$-module structure on $X$. Since $\eta_\ell$ is a unit modulo $\chi_\ell$, we can write $\sigma\eta_\ell = 1 + \tau\chi_\ell$, with $\sigma, \tau \in K[z_\ell]$. By (5.24), we have, with respect to the $K[z_\ell]$-structure,

$\sigma\zeta_\ell\cdot g_2 = \sigma\eta_\ell\cdot g_1 = g_1$.

This proves that $g_2$ generates both $Y$ and $Z$; that is, $g_2$ is a cyclic generator for $X$ endowed with the $K[z_\ell]$-structure. The proof of Lemma (5.18) is now complete.
It should be clear that Theorem (5.13) is not a purely module-theoretic result, but depends on the interplay between module theory, vector-spaces, and elimination theory (via (5.21)). For instance, the fact that $\ell$ can be extended from $Y$ to $X$, which was needed in the proof of Case 2, is a typical vector-space argument.**

There are many open (or forgotten) results concerning cyclic modules which are of interest in system theory. For instance, it is easy to show that an $n \times n$ real matrix is cyclic iff a certain polynomial $\psi \in \mathbb{R}[z_1, \ldots, z_{n^2}]$ is nonzero at $F$; the polynomial $\psi$ is roughly analogous to the polynomial det in the same ring, but, unlike in the latter case, the general form of $\psi$ does not seem to be known.
We must not terminate this discussion without pointing out another consequence of cyclicity which transcends the module framework. Since $X$ = cyclic with generator $g$ is isomorphic with $K[z]/\chi_g K[z]$, it is clear that $X$ has the structure of this commutative ring also, that is, the product is defined as

$(\xi\cdot g)(\eta\cdot g) = (\xi\eta)\cdot g$.

If $\chi_g$ = irreducible, then $X$ is even a field. Hence, in particular, $X$ has a galois group. No one has ever given a dynamical interpretation of this galois group. In other words, there are obvious algebraic facts in the theory of dynamical systems which have never been examined from the dynamical point of view. For some related comments in the setting of topological semigroups, see DAY and WALLACE [1967].
6. TRANSFER FUNCTIONS

PREAMBLE. There has been a vigorous tradition in engineering (especially in electrical engineering in the United States during 1940-1960) that seeks to phrase all results of the theory of linear constant dynamical systems in the language of the Laplace transform. Textbooks in this area often try to motivate their biased point of view by claiming that "the Laplace transform reduces the analytical problem of solving a differential equation to an algebraic problem". When directed to a mathematician, such claims are highly misleading because the mathematical ideas of the Laplace transform are never in fact used. The ideas which are actually used belong to classical complex function theory: properties of rational functions, the partial-fraction expansion, residue calculus, etc. More importantly, the word "algebraic" is used in engineering in an archaic sense and the actual (modern) algebraic content of engineering education and practice as related to linear systems is very meager. For example, the crucial concept of the transfer function is usually introduced via heuristic arguments based on linearity or "defined" purely formally as "the ratio of Laplace transforms of the output over the input". To do the job right, and to recognize the transfer function as a natural and purely algebraic gadget, requires a drastically new point of view, which is now at hand as the machinery set up in Sections 3-5. The essential idea of our present treatment was first published in KALMAN [1965b].
The first purpose of this section is to give an intrinsically algebraic definition of the transfer function associated with a discrete-time, constant, linear input/output map (see Definition (3.10)). Since the applications of transfer functions are standard, we shall not develop them in detail, but we do want to emphasize their role in relating the classical invariant factor theorem for polynomial matrices to the corresponding module theorem (4.34).

Consider an arbitrary $K[z]$-homomorphism $f\colon \Omega \to \Gamma$ (see the lemma following Theorem (4.2)). Then as a "mathematical object" $f$ is equivalent to the set $\{f(e_j),\ j = 1, \ldots, m;\ e_j \text{ defined by (4.6)}\}$, since

(6.1)  $f(\omega) = \sum_{j=1}^{m} \omega_j\cdot f(e_j)$.

(The scalar product on the right is that in the $K[z]$-module $\Gamma$, as defined in Section 4.) By definition of $\Gamma$, each $f(e_j)$ is a formal power series in $z^{-1}$ with vanishing first term. We shall try to represent these formal power series by ratios of polynomials (which we shall call transfer functions*) and then we can replace formula (6.1) by a certain specially defined product of a ratio of polynomials by a polynomial. Some algebraic sophistication will be needed to find the correct rules of calculations. These "rules" will constitute a rigorous (and simple) version of Heaviside's so-called "calculus". There are no conceptual complications of any sort. (However, we are dodging some difficulties by working solely in discrete-time.)

*This entrenched terminology is rather unenlightening in the present algebraic context.
Let $X_f = \Omega/\text{kernel } f$ be the state set of $f$ regarded as a $K[z]$-module. We assume that $X_f$ is a torsion module with nontrivial minimal polynomial $\psi$. Then, for each $j = 1, \ldots, m$, we have

(6.2)  $\psi\cdot f(e_j) = f(\psi\cdot e_j) = 0$.

By definition of the module structure on $\Gamma$, (6.2) means that the ordinary product of the power series $f(e_j)$ by the polynomial $\psi$ is a (vector) polynomial. Hence (6.2) is equivalent to

(6.2')  $\psi f(e_j) = \theta_j \in K^p[z]$ (notation: no dot = ordinary product).

Intuitively, we can solve this equation by writing

(6.3)  $f(e_j) = \theta_j/\psi$.

There are two ways of making this idea rigorous.

Method 1. Define (6.3) as the formal division of $\theta_j$ by $\psi$ into ascending powers of $z^{-1}$. Check that the coefficient of $z^0$ is always 0. Verify by computation that the power series so obtained satisfies (6.2').

Method 2. Multiply both sides of (6.2') by $z^{-n}$, $n = \deg\psi$. Write $\hat\psi(z^{-1}) = z^{-n}\psi(z)$ and $\hat\theta_j(z^{-1}) = z^{-n}\theta_j(z)$. Then $\hat\psi, \hat\theta_j \in K[z^{-1}] \subset K[[z^{-1}]]$ and (6.2') becomes

$\hat\psi f(e_j) = \hat\theta_j$.

Moreover, the 0-th coefficient of $\hat\psi$ is 1 (because of the convention that the leading coefficient of $\psi$ is 1), hence $\hat\psi$ is a unit in $K[[z^{-1}]]$ and therefore

(6.3')  $f(e_j) = \hat\theta_j\hat\psi^{-1}$.

Note that (6.3) and (6.3') actually give slightly different definitions of $f(e_j)$, depending on whether we use a transfer function with respect to the variable $z$ or $z^{-1}$. (Both notations have been used in the engineering literature.) For us the formalism of Method 1 is preferable. (The calculations of Method 1 can be reduced by Method 2 to the better-known calculations of the inverse in the ring $K[[z^{-1}]]$.) Summarizing, we have the easy but fundamental result:
EXISTENCE OF TRANSFER FUNCTIONS. There is a bijective correspondence between $K[z]$-homomorphisms $f\colon \Omega \to \Gamma$ with minimal polynomial $\psi$ and transfer function matrices of the type

$Z_f = [\theta_1/\psi, \ldots, \theta_m/\psi]$,

where $\theta_j \in K^p[z]$, $\deg\theta_j < \deg\psi$, and $\psi$ is the least common denominator of $Z_f$.
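
The formal division of Method 1 amounts to a simple recursion on the coefficients. The following is a minimal sketch for a scalar numerator, assuming $\psi$ is monic of degree $n$ and $\deg\theta < n$; the function name `expand` and the example are hypothetical. Writing $\psi = z^n + a_1 z^{n-1} + \cdots + a_n$, $\theta = b_1 z^{n-1} + \cdots + b_n$ and $\theta/\psi = \sum_{k \ge 1} y_k z^{-k}$, comparison of coefficients in $\psi\cdot(\theta/\psi) = \theta$ gives $y_k = b_k - \sum_{i=1}^{\min(k-1,\,n)} a_i y_{k-i}$, with $b_k = 0$ for $k > n$.

```python
from fractions import Fraction

def expand(theta, psi, terms):
    """Coefficients y_1, ..., y_terms of theta/psi in powers z^-1, z^-2, ...

    theta = [b1, ..., bn]   coefficients of z^{n-1}, ..., z^0
    psi   = [1, a1, ..., an] monic, coefficients of z^n, ..., z^0
    """
    n = len(psi) - 1
    assert psi[0] == 1 and len(theta) == n
    y = []
    for k in range(1, terms + 1):
        b_k = Fraction(theta[k - 1]) if k <= n else Fraction(0)
        # y_k = b_k - sum_{i=1}^{min(k-1, n)} a_i y_{k-i}
        y.append(b_k - sum(psi[i] * y[k - i - 1]
                           for i in range(1, min(k - 1, n) + 1)))
    return y

# Hypothetical example: theta(z) = z, psi(z) = z^2 - z - 1.
series = expand([1, 0], [1, -1, -1], 6)
print(series)   # coefficients of z^-1, ..., z^-6 (the Fibonacci numbers)
```

Because $\deg\theta < \deg\psi$, the expansion starts at $z^{-1}$, so the coefficient of $z^0$ vanishes as required.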
In many contexts, it is preferable to deal with the $Z_f$ corresponding to $f$ rather than with $f$ itself. Because the correspondence is bijective, it is clear that all objects induced by $f$ are well-defined also for $Z_f$ and conversely. Thus, for instance,

$\Sigma_Z = \Sigma_{f_Z}$, dim $Z$ = dim $\Sigma_Z$ = dim $X_{f_Z}$,

$\psi_Z$ = least common denominator of $Z$ = minimal polynomial of $f_Z$.
REMARK. In view of Propositions (4.20-21), the natural realization of $Z$, namely $X_Z = X_{f_Z}$, is completely reachable as well as completely observable. Not having this fact available before 1960 has caused a great confusion. Questions such as those resolved by Theorem (5.13) tended to be attacked algorithmically, using special tricks amounting to elementary algebraic manipulations of elements of $Z$. Very few theoretical results could be conclusively established by this route until the conceptual foundations of the theory of reachability and observability were developed.

The preceding results may be restated as "rules" whereby the values of $f$ may be computed using $Z$. We have in fact,

(6.6)  $f(\omega) = Z\cdot\omega$,

where $\cdot$ = multiply the polynomial matrix $\psi Z$ consisting of the numerators of $Z$ with $\omega$, reduce to minimal-degree polynomials modulo $\psi$ and then divide formally by $\psi$ as in Method 1 above.

We can also compute the entire output of the system $\Sigma_Z$ (that is, all output values following the application of the first nonzero input value) by the rule

(6.7)  $Z\omega = (\psi Z\omega)/\psi$,

same as above, but do not reduce modulo $\psi$. In this second case, the output sequence will begin with a positive power of $z$. (The coefficients of the positive powers of $z$ are thrown away in the definition of $f$ (see (3.7)) and in the definition of the scalar product in $\Gamma$, in order to secure a simple formula for $X_f = \Omega/\text{kernel } f$.)
Many other applications of transfer functions may be found in KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 10].

It is easy to show that the transfer function associated with the system $\Sigma = (F, G, H)$ is given by

$Z_\Sigma = H(zI - F)^{-1}G$.

(This is just the formal Laplace transform computed from the constant version of (1.12) by setting $z = d/dt$ or from (1.17) by setting $x(t + 1) = zx(t)$.) Probably the simplest way of computing $Z_\Sigma$ is via the formula

(6.8)  $(zI - F)^{-1} = \sum_{j=1}^{n} F^{n-j}\,\psi_F^{(j)}(z)/\psi_F(z)$, $n = \deg\psi_F$,

where $\psi_F$ is the minimal polynomial of the matrix $F$ and the superscript denotes the special polynomials defined in (5.5). The matrix identity (6.8) follows at once from the classical scalar identity [WEBER, 1898]

$\dfrac{\pi(z) - \pi(w)}{z - w} = \sum_{j=1}^{n} w^{n-j}\,\pi^{(j)}(z)$

upon setting $w = F$, $\pi = \psi_F$, and invoking the Cayley-Hamilton theorem.
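
The resolvent formula can be checked at a sample point without inverting any matrix: it is enough to verify that $(zI - F)\,N(z) = \psi_F(z)\,I$, where $N(z)$ is the polynomial numerator. The sketch below assumes the indexing $\psi^{(j)}(z) = z^{j-1} + a_1 z^{j-2} + \cdots + a_{j-1}$ for the special polynomials, so that $N(z) = \sum_{k=0}^{n-1} F^k\,\psi^{(n-k)}(z)$; this indexing, the function names and the example matrix are assumptions made for illustration.

```python
from fractions import Fraction

def leading_parts(a, z):
    """Values [psi^{(1)}(z), ..., psi^{(n+1)}(z)] for a = [1, a1, ..., an].

    psi^{(j)} is the leading part of degree j - 1, and psi^{(n+1)} = psi.
    """
    vals = [Fraction(1)]
    for c in a[1:]:
        vals.append(z * vals[-1] + c)      # Horner step
    return vals

def resolvent_numerator(F, a, z):
    """Return N(z) = sum_k F^k psi^{(n-k)}(z) and the scalar psi(z)."""
    n = len(F)
    vals = leading_parts(a, z)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # F^0
    total = [[Fraction(0)] * n for _ in range(n)]
    for k in range(n):
        coeff = vals[n - k - 1]            # psi^{(n-k)}(z)
        total = [[total[i][j] + coeff * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = [[sum(power[i][m] * F[m][j] for m in range(n)) for j in range(n)]
                 for i in range(n)]
    return total, vals[n]

# Hypothetical example: companion matrix of psi(z) = z^2 - 3z + 2.
F = [[0, 1], [-2, 3]]
a = [1, -3, 2]
z0 = Fraction(5)
N, psi_z0 = resolvent_numerator(F, a, z0)
# Multiply (z0 I - F) by N; the result should be psi(z0) times the identity.
check = [[sum((z0 * (i == m) - F[i][m]) * N[m][j] for m in range(2))
          for j in range(2)] for i in range(2)]
print(N, psi_z0, check)
```

Any monic polynomial that annihilates $F$ may be used in place of the minimal polynomial in this check; the characteristic polynomial of a companion matrix is used here because it can be read off directly.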
Much of classical linear system theory was concerned with computing $Z_f$. In the modern context, this problem "factors" into first solving the realization problem $f \mapsto \Sigma_f$ and then applying formula (6.8). See Sections 8 and 9.

One of the mysterious features of Rule (6.6) (as contrasted with the conventional rule (6.7)) is the necessity of reducing modulo $\psi$. The simplest way of understanding the importance of this aspect of the problem is to show how to relate the module invariant factors occurring in the structure theorem (4.34) to the classical facts concerning the invariant factors of a polynomial matrix.
(6.9) INVARIANT FACTOR THEOREM FOR MATRICES. Let $P$ be a $p \times m$ matrix with elements in an arbitrary principal-ideal domain $R$. Then

(6.10)  $P = A\Lambda B$,

where $A$ and $B$ are $p \times p$ and $m \times m$ matrices (not necessarily unique) with elements in $R$ and $\det A$, $\det B$ units in $R$, while

$\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_q, 0, \ldots, 0)$ with $\lambda_i \in R$

is unique (up to units in $R$) with $\lambda_i \mid \lambda_{i+1}$, $i = 1, \ldots, q - 1$, and $q$ = rank $P$. The $\lambda_i$ are called the invariant factors of $P$.
As anyone would expect, there is a correspondence between the module structure theorem (4.34) and the matrix structure theorem (6.9) and, in particular, between the respective invariant factors $\psi_1, \ldots, \psi_r$ and $\lambda_1, \ldots, \lambda_q$.
Let us sketch the standard proof of this fact following CURTIS and REINER [1962, §13.3], who also give a proof of (6.9).

PROOF OF (4.34). Consider the $R$-homomorphism $\mu: R^m \to M$ given by $e_i \mapsto g_i$, where the $e_i$ are the standard basis elements of $R^m$ and the $g_i$ generate $M$ (recall (4.6)). Clearly, $M \cong R^m/N$, where $N = \operatorname{kernel} \mu$. It can be proved that $N$ is a free submodule of $R^m$, with a basis of at most $m$ elements. Write each basis element of $N$ as $\sum_i p_{ij} e_i$, $p_{ij} \in R$. Apply (6.9) to the $R$-matrix $P = (p_{ij})$, so that $P = A \Lambda B$. Define $f_k = \sum_i a_{ik} e_i$; since $\det A$ is a unit, the $f_k$ form a basis for $R^m$. By (6.10-11), the elements $\lambda_i f_i$ form a basis for $N$. Hence

$M \cong R^m/N \cong \bigoplus_i R f_i / R \lambda_i f_i.$

Then, by "direct sum",

$M \cong \bigoplus_i R/\lambda_i R$

(the summands with $\lambda_i$ a unit being zero). That is, (4.34) holds with $\psi_i = \lambda_i$ and $r = \operatorname{rank} P$.
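By the same type of calculations, we can prove also

(6.12) THEOREM. Let $\lambda_1, \ldots, \lambda_q$ be the invariant factors of $\psi Z$ given by (6.9) and let $(\lambda_i, \psi) = \theta_i$, $i = 1, \ldots, q$. Then the invariant factors of $X_Z$ are

$\psi_i = \psi/\theta_i, \qquad i = 1, \ldots, r,$

where $r$ is the smallest integer such that $\psi \mid \lambda_i$ for $i = r + 1, \ldots, q = \operatorname{rank} \psi Z$.

PROOF. Consider the $K[z]$-epimorphism $\mu: \Omega \to X_Z: \omega \mapsto [\omega]_Z$. Clearly, $\omega \in [0]_Z = \operatorname{kernel} \mu$ iff $Z \cdot \omega = 0$ (see (6.6)). Equivalently, $(\psi Z)\omega \equiv 0 \pmod{\psi}$. Using the representation whose existence is claimed by (6.9), write $\psi Z = C \Lambda D$ ($C$, $\Lambda$, $D$ matrices over $K[z]$). Define $W = D^{-1}Y$, where $Y = \operatorname{diag}(\psi/\theta_1, \ldots, \psi/\theta_q, 1, \ldots, 1)$. Then $\Lambda Y \equiv 0$ and $(\psi Z)W \equiv 0 \pmod{\psi}$, and $W$ has clearly maximal rank among $K[z]$ matrices with this property. So the columns of the matrix $W$ constitute a basis for kernel $\mu$. The rest follows easily, as in the proof of (4.34).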
(6.13) REMARK. The preceding proof remains correct, without any modification, if the representation $\psi Z = C \Lambda D$, $\det C$, $\det D$ = units, is taken in the ring $K[z]/\psi K[z]$ rather than in $K[z]$. The former representation follows trivially from the latter but may be easier to compute.
(6.14) REMARK. Theorem (6.12) shows how to compute the invariant factors of $X_Z$ from those of $\psi Z$. We must define the invariant factors of $Z$ to be the same as those of $X_Z$ (because of the bijective correspondence $Z \leftrightarrow X_Z$). Consistency with (6.12) demands that we write

(6.15)    $\psi_i(Z) = \psi_Z \big/ (\lambda_i(\psi_Z Z), \psi_Z),$

where $\psi_Z$ is defined as in (6.3). In other words, the $\psi_i$ are the denominators of the scalar transfer functions $\lambda_i(\psi_Z Z)/\psi_Z$ after cancellation of all common factors.

Theorems (4.34) and (6.12) do not fully reveal the significance of invariant factors in dynamical systems. Nor is it convenient to deduce all properties of matrix-invariant factors from the representation theorem (6.9). It is interesting that the sharpened results we present below are much in the spirit of the original work of WEIERSTRASS, H. J. S. SMITH, KRONECKER, FROBENIUS, and HENSEL, as summarized in the well-known monograph of MUTH [1899].

(6.16) DEFINITION. Let $A$, $B$ be rectangular matrices over a unique factorization domain $R$. $A \mid B$ (read: $A$ divides $B$) iff there are matrices $V$, $W$ (over $R$, of appropriate sizes) such that $B = VAW$.
This is of course just the usual definition of "divide" in a ring, specialized to the noncommutative ring of matrices. The following result [MUTH 1899, Theorems IIIa-b, p. 52] shows that in case of principal-ideal domains the correspondence between matrices and their invariant factors preserves the divide relation (is "functorial" with respect to "divide"):

(6.17) THEOREM. Let $R$ be a principal-ideal domain. Then $A \mid B$ if and only if $\lambda_i(A) \mid \lambda_i(B)$ for all $i$.

PROOF. Sufficiency. Write the representation (6.10) as

$A = V_1 \Lambda(A) W_1, \qquad B = V_2 \Lambda(B) W_2.$

By hypothesis, there is a $\Delta$ (diagonal) such that $\Lambda(B) = \Delta \Lambda(A)$. Hence

$B = V_2 \Delta V_1^{-1} A\, W_1^{-1} W_2.$

Necessity. This is just the following

(6.18) LEMMA. For an arbitrary unique-factorization domain $R$, $A \mid B$ implies $\lambda_i(A) \mid \lambda_i(B)$.

PROOF. By elementary determinant manipulations, as in MUTH [1899, Theorem II, p. 16-17].

This completes the proof of Theorem (6.17).
(6.19) REMARK. Since (6.9) does not apply (why?) to unique factorization domains, for purposes of using Lemma (6.18) we need WEIERSTRASS' definition of invariant factors: if $\Delta_j(A)$ = greatest common factor of all $j \times j$ minors of a matrix $A$, with $\Delta_0(A) = 1$, then $\lambda_j(A) = \Delta_j(A)/\Delta_{j-1}(A)$. Of course, this definition can be shown to be equivalent (over principal-ideal domains) to that implied by (6.9).
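WEIERSTRASS' definition can be tried out directly on small examples. The sketch below is only an illustration under stated assumptions: it takes $R$ to be the ring of integers, computes every $j \times j$ minor by brute force (practical only for small matrices), and uses function names that are not taken from the text.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Determinant of a square integer matrix by Laplace expansion along row 0."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    total = 0
    for c, entry in enumerate(M[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in M[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

def invariant_factors(A):
    """Invariant factors lambda_j = Delta_j / Delta_(j-1), with Delta_0 = 1,
    where Delta_j is the gcd of all j x j minors of the integer matrix A."""
    p, m = len(A), len(A[0])
    factors, previous = [], 1
    for j in range(1, min(p, m) + 1):
        delta = 0
        for rows in combinations(range(p), j):
            for cols in combinations(range(m), j):
                delta = gcd(delta, det([[A[r][c] for c in cols] for r in rows]))
        if delta == 0:          # all j x j minors vanish: j exceeds rank A
            break
        factors.append(delta // previous)
        previous = delta
    return factors

example = invariant_factors([[2, 4, 4], [-6, 6, 12], [10, -4, -16]])
```

The returned list has length equal to the rank of the matrix, and each entry divides the next, in agreement with the divisibility chain stated in (6.9).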
In analogy with Definition (6.16), let us agree (note inversion!) on

(6.20) DEFINITION. Let $Z_1$, $Z_2$ be transfer-function matrices. $Z_1 \mid Z_2$ (read: $Z_1$ divides $Z_2$) iff there are matrices $V$, $W$ over $K[z]$ such that $Z_1 = V Z_2 W$.

(Note that $Z_1 \mid Z_2$ implies at once $\psi_{Z_1} \mid \psi_{Z_2}$.)

(6.21) THEOREM. $Z_1 \mid Z_2$ if and only if $\psi_i(Z_1) \mid \psi_i(Z_2)$ for all $i$.

PROOF. This is the natural counterpart of Theorem (6.17), and follows from it by a simple calculation using the definition of $\psi_i(Z)$ given by (6.15).
(6.22) DEFINITION. $\Sigma_1 \mid \Sigma_2$ (read: $\Sigma_1$ can be simulated by $\Sigma_2$) iff $X_{\Sigma_1} \mid X_{\Sigma_2}$, that is, iff $X_{\Sigma_1}$ is isomorphic to a submodule of $X_{\Sigma_2}$ [or isomorphic to a quotient module of $X_{\Sigma_2}$].
This definition is also functorially related to the definition of "divide" over a principal-ideal domain $R$ because of the following standard result:

(6.23) THEOREM. Let $R$ be a principal-ideal domain and $X$, $Y$ $R$-modules. Then $Y$ is (isomorphic) to a submodule or quotient module of $X$ if and only if $\psi_i(Y) \mid \psi_i(X)$ for all $i$.

PROOF. Sufficiency. Take both $X$ and $Y$ in canonical form (4.34), with $x_1, \ldots, x_{r(X)}$ generating the cyclic pieces of $X$, and $y_1, \ldots, y_{r(X)}$ (with $y_i = 0$ if $i > r(Y)$) those of $Y$. The assignment $y_i \mapsto (\psi_i(X)/\psi_i(Y))\,x_i$ defines a monomorphism $Y \to X$, that is, exhibits $Y$ as (isomorphic to) a submodule of $X$. Similarly, the assignment $x_i \mapsto y_i$ defines an epimorphism $X \to Y$ exhibiting $Y$ as (isomorphic to) a quotient module of $X$.

Necessity (following BOURBAKI [Algèbre, Chapter 7 (2e ed.), Section 4, Exercise 8]). Let $Y$ be a submodule of $X$. By (4.34), $X \cong L/N$ where $L$, $N$ are free $R$-modules. By a classical isomorphism theorem, $Y$ is isomorphic to a quotient module $M/N$, where $L \supset M \supset N$ and $M$ is free (since submodules of a free module are free). From the last relation, $r(Y) \le r(X)$. Now observe, again using (4.34), that, for any $R$-module $X$ and any $\pi \in R$,

$r(\pi X) < k$ iff $\psi_k(X) \mid \pi,$

and therefore

$R\psi_k(X)$ = ideal generated by $\{\pi : r(\pi X) < k\}$.

Since $\pi Y$ is a submodule of $\pi X$ for all $\pi \in R$, it follows that $R\psi_k(X) \subset R\psi_k(Y)$, and the proof is complete for the case when $Y$ is a submodule of $X$. The proof of the other case is similar.
PROOF.
Immediate from t h e f a c t t h a t
zc
i s a submodule
of
(see Section 7).
Now we can summarize main results of this section as the

(6.25) PRIME DECOMPOSITION THEOREM FOR LINEAR DYNAMICAL SYSTEMS. The following conditions are equivalent:

(i) $Z_1$ divides $Z_2$.
(ii) $\psi_i(Z_1)$ divides $\psi_i(Z_2)$ for all $i$.
(iii) $\Sigma_{Z_1}$ can be simulated by $\Sigma_{Z_2}$.

PROOF. This follows by combining Theorem (6.21) with Theorem (6.23), since $\psi_i(Z) = \psi_i(X_Z)$ by definition.
(6.26) INTERPRETATION. The definition of $Z_1 \mid Z_2$ means, in system-theoretic terms, that the inputs and outputs of the machine whose transfer function is $Z_2$ are to be "recoded": the original input $\omega_1$ is replaced by an input $\omega_2 = B(z)\omega_1$ and the output $\gamma_2$ is replaced by an output $\gamma_1 = A(z)\gamma_2$; with these "coding" operations, $Z_2$ will act like a machine with transfer function $Z_1$. In view of the definition of a transfer function, the equation

$Z_1 = A Z_2 B$

is always satisfied whenever $A$, $B$ are replaced by $\bar A$, $\bar B$ (reduced modulo $\psi_{Z_2}$). This means that the coding operations can be carried out physically given a delay of $d = \deg \psi_{Z_2}$ units of time (or more). No feedback is involved in coding; it is merely necessary to store the $d$ last elements of the input and output sequences. Hence, in view of Theorem (6.25) and Corollary (6.24), we can say that it is possible to alter the dynamical behavior of a system $\Sigma_2$ arbitrarily by external coding involving delay but not feedback if and only if the invariant factors of the desired external behavior ($Z_1$) are divisors of invariant factors of the external behavior ($Z_2$) of the given system.

The invariant factors may be called the PRIMES of linear systems: they represent the atoms of system behavior which cannot be simulated from smaller units using arbitrary but feedback-free coding. In fact, there is a close (but not isomorphic) relationship between the Krohn-Rhodes primes of automata theory (see KALMAN, FALB, and ARBIB [1969, Chapters 7-9]) and ours. A full treatment of this part of linear system theory will be published elsewhere.
7. ABSTRACT THEORY OF REALIZATIONS

The purpose of this short section is to review and expand those portions of the previous discussion which are relevant to the detailed theory of realizations to be presented in Sections 8 and 9. The same issues are examined (from a different point of view) also in KALMAN, FALB, and ARBIB [1969].

Let $f: \Omega \to \Gamma$ be a fixed input/output map. Let us recall the construction of $X_f$ as a set and as carrying a $K[z]$-module structure (Sections 3 and 4). It is clear that (i) $f = \iota_f \circ \mu_f$, where

$\mu_f: \Omega \to X_f: \omega \mapsto [\omega]_f, \qquad \iota_f: X_f \to \Gamma: [\omega]_f \mapsto f(\omega)$

are $K[z]$-homomorphisms, and (ii) $\mu_f$ = epimorphism while $\iota_f$ = monomorphism. We have also seen that

(7.1)    $\mu_f$ epimorphism $\Leftrightarrow$ $X_f$ is completely reachable; $\iota_f$ monomorphism $\Leftrightarrow$ $X_f$ is completely observable.

These facts set up a "functor" between system-theoretic notions and algebra which characterize $X_f$ uniquely. Consequently, it is desirable to replace also our system-theoretic definition of a realization (3.12) by a purely algebraic one:
(7.2) DEFINITION. A realization of a $K[z]$-homomorphism $f: \Omega \to \Gamma$ is any factorization $f = \iota \circ \mu$, that is, any commutative diagram

$\Omega \xrightarrow{\;\mu\;} X \xrightarrow{\;\iota\;} \Gamma, \qquad \iota \circ \mu = f,$

of $K[z]$-homomorphisms. The $K[z]$-module $X$ is called the state module of the realization. A realization is canonical iff it is completely reachable and completely observable, that is, $\mu$ is surjective and $\iota$ is injective.

A realization always exists because we can take $X = \Omega$, $\mu = 1_\Omega$, $\iota = f$.
(7.3) REMARK. It is clear that a realization in the sense of (3.12) can always be obtained from a realization given by (7.2). In fact, define $\Sigma = (F, G, H)$ by

$F$ = $X \to X: x \mapsto z \cdot x$,
$G$ = $\mu$ restricted to the submodule $\{\omega : |\omega| = 1\}$,
$H$ = $\iota$ followed by the projection $\gamma \mapsto \gamma(1)$.

It is easily verified that these rules will define a system with $f_\Sigma = f$. Given any such $\Sigma$, it is also clear that the rules

$\mu: \omega \mapsto \sum_{k \ge 0} F^k G\,\omega_k, \qquad \iota: x \mapsto \sum_{k \ge 1} (HF^{k-1}x)\,z^{-k}$

define a factorization of $f$. Hence the correspondence between (3.12) and (7.2) is bijective.

The quickest way to exploit the algebraic consequences of our definition (7.2) is via the following arrow-theoretic fact:
(7.4) ZEIGER FILL-IN LEMMA. Let $A$, $B$, $C$, $D$ be sets and $\alpha$, $\beta$, $\gamma$, and $\delta$ set maps for which the following diagram commutes:

$\begin{array}{ccc} A & \xrightarrow{\;\alpha\;} & B \\ {\scriptstyle\beta}\downarrow & \swarrow{\scriptstyle\varphi} & \downarrow{\scriptstyle\gamma} \\ C & \xrightarrow{\;\delta\;} & D \end{array}$

If $\alpha$ is surjective and $\delta$ is injective, there exists a unique set map $\varphi$ corresponding to the dashed arrow which preserves commutativity.

This follows by straightforward "diagram-chasing", which proves at the same time the

(7.5) COROLLARY. The claim of the lemma remains valid if "sets" are replaced by "$R$-modules" and "set maps" by "$R$-homomorphisms".

Applying the module version of the lemma twice, we get
(7.6) PROPOSITION. Consider any two canonical realizations of a fixed $f$: the corresponding state-sets are isomorphic as $K[z]$-modules.

Since every $K[z]$-module is automatically also a $K$-vector space, (7.6) shows that the two state sets are $K$-isomorphic, that is, have the same dimension as vector spaces. The fact that they are also $K[z]$-isomorphic implies, via Theorem (4.34), that they have the same invariant factors. We have already employed the convention that (in view of the bijection between $f$ and $Z_f$) the invariant factors of $f$ and $X_f$ are to be identified. In view of (7.6) this is now a general fact, not dependent on the special construction used to get $X_f$. We can therefore restate (7.6) as the

(7.7) ISOMORPHISM THEOREM FOR CANONICAL REALIZATIONS. Any two canonical realizations of a fixed $f$ have isomorphic state modules. The state module of a canonical realization is uniquely characterized (up to isomorphism) by its invariant factors, which may be also viewed as those of $f$.

A simple exercise proves also
(7.8) PROPOSITION. If $X$ is the state module of a canonical realization of $f$, then $\dim X$ (as a vector space) is minimum in the class of all realizations of $f$.

This result has been used in some of the literature to justify the terminology "minimal realization" as equivalent to "canonical realization". We shall see in Section 9 that the two notions are not always equivalent; we prefer to view (7.2) as the basic definition and (7.8) as a derived fact.
(7.9) REMARK. Theorem (7.7) constitutes a proof of the previously claimed (4.24). To be more explicit: if $\Sigma = (F, G, H)$ and $\hat\Sigma = (\hat F, \hat G, \hat H)$ are two triples of matrices defining canonical realizations of the same $f$, then (7.7) implies the existence of a vector-space isomorphism $A: X \to \hat X$ such that

(7.10)    $\hat F = AFA^{-1}, \qquad \hat G = AG, \qquad \hat H = HA^{-1}.$

If we identify $X$ and $\hat X$ then $A$ is simply a basis change and it follows that the class of all matrix triples which are canonical realizations of a fixed $f$ is isomorphic with the general linear group over $X$.

The actual computation of a canonical realization, that is, of the abstract Nerode equivalence classes $[\omega]_f$, requires a considerable amount of applied-mathematical machinery, which will be developed in the next section. The critical hypothesis is the existence of a factorization of $f$ such that $\dim X < \infty$. (This is sometimes expressed by saying that $f$ has finite rank.) Given any such realization, it is possible to obtain a canonical one by a process of reduction. More precisely, we have
(7.11) THEOREM. Every realization of $f$ with state module $X$ contains a subquotient (a quotient of a submodule, or equivalently, a submodule of a quotient) $X^*$ of $X$ which is the state-module of a canonical realization of $f$.

PROOF. The reachable states $X_r = \operatorname{image} \mu$ are a submodule of $X$ and so are the unobservable states $X_o = \operatorname{kernel} \iota$. Hence

$X^* = X_r/(X_r \cap X_o)$

is a subquotient of $X$. It follows immediately that $X^*$ is a canonical state-module for $f$. [The proof may be visualized via the following commutative diagram, where the $j$'s and $p$'s are canonical injections and projections.]
(7.12) REMARK. Since any subquotient of $X$ is isomorphic to a submodule (or a quotient module) of $X$, it follows from Theorem (6.23) that $X$ can be the state module of a realization only if $\psi_i(X_f) \mid \psi_i(X)$ for all $i$ (recall also Corollary (6.24)). This condition, however, is not enough since the $\psi_i$ are invariants of module isomorphisms and not isomorphisms of the commutative diagram (7.2).

The preceding discussion should be kept in mind to gain an overview of the algorithms to be developed in the next sections.
8. CONSTRUCTION OF REALIZATIONS

Now we shall develop and generalize the basic algorithm, originally due to B. L. HO (see HO and KALMAN [1966]), for computing a canonical realization $\Sigma = (F, G, H)$ of a given input/output map $f$. Most of the discussion will be in the language of matrix algebra.

Notations. Here and in Section 9 boldface capital letters will denote block matrices or sequences of matrices; finite block matrices will be denoted by small Greek subscripts on boldface capitals; the elements of such matrices will be denoted by ordinary capitals. This is intended to make the practical aspects of the computations self-evident; no further explanations will be made.

Let $f: \Omega \to \Gamma$ be a given, fixed $K[z]$-homomorphism. Using only the $K$-linearity of $f$, we have that

(8.1)    $f(\omega)(1) = \sum_{k \ge 0} A_{k+1}\,\omega_k,$

where the $A_k$ ($k > 0$) are $p \times m$ matrices over the fixed field $K$. We denote the totality of these matrices by

$\mathbf{A}(f) = (A_1, A_2, \ldots).$

Then it is clear that the specification of a $K[z]$-homomorphism $f$ is equivalent to the specification of its matrix sequence $\mathbf{A}(f)$. Moreover, if $\Sigma$ realizes $f$, (8.1) can be written explicitly as

(8.2)    $f(\omega)(1) = \sum_{k \ge 0} HF^kG\,\omega_k.$

Comparing (8.1) and (8.2) we can translate (3.12) into an equivalent matrix-language

(8.3) DEFINITION. A dynamical system $\Sigma = (F, G, H)$ realizes a sequence $\mathbf{A}$ iff the relation

$A_{k+1} = HF^kG, \qquad k = 0, 1, 2, \ldots$

is satisfied.
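Let us now try to obtain also a matrix criterion for an infinite sequence $\mathbf{A}$ to have a finite-dimensional realization. The simplest way to do that is to first write down a matrix representation for the map $f: \Omega \to \Gamma$. So let us represent $\omega$ as an infinite column vector with elements $(\omega_1(0), \ldots, \omega_m(0), \omega_1(1), \ldots)$. Then $f$ is represented by the infinite block matrix

$\mathbf{H}(\mathbf{A}) = \begin{pmatrix} A_1 & A_2 & A_3 & \cdots \\ A_2 & A_3 & A_4 & \cdots \\ A_3 & A_4 & A_5 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}.$

Classically, $\mathbf{H}(\mathbf{A})$ is known as the (infinite) Hankel matrix associated with $\mathbf{A}$. We denote by $\mathbf{H}_{\mu,\nu}$ the $\mu \times \nu$ block submatrix of $\mathbf{H}$ appearing in the upper left-hand corner of $\mathbf{H}$.

(8.4) PROPOSITION. Let $\Sigma$ be any realization of $\mathbf{A}$. Then $\operatorname{rank} \mathbf{H}_{\mu,\nu}(\mathbf{A}) \le \dim \Sigma$ for all $\mu, \nu \ge 1$.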
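(8.5) COROLLARY. An infinite sequence $\mathbf{A}$ has a finite-dimensional realization only if $\operatorname{rank} \mathbf{H}_{\mu,\nu}(\mathbf{A})$ is constant for all $\mu$, $\nu$ sufficiently large.

PROOF. If $\dim \Sigma = \infty$ the claim of the proposition is vacuous (although formally correct!). Assume therefore that $\dim \Sigma < \infty$ and define from $\Sigma$ the finite block matrices

$\mathbf{R}_\nu = [G,\ FG,\ \ldots,\ F^{\nu-1}G], \qquad \mathbf{O}_\mu = \begin{pmatrix} H \\ HF \\ \vdots \\ HF^{\mu-1} \end{pmatrix}.$

Then

$\mathbf{H}_{\mu,\nu}(\mathbf{A}) = \mathbf{O}_\mu \mathbf{R}_\nu$

by the definition (8.3) of a realization. It is clear that $\operatorname{rank} \mathbf{R}_\nu$ and $\operatorname{rank} \mathbf{O}_\mu$ are at most $n = \dim \Sigma$. Thus our claim is reduced to the standard matrix fact

$\operatorname{rank}(AB) \le \min\{\operatorname{rank} A, \operatorname{rank} B\}.$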
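The rank condition can be observed numerically. The following Python sketch is an illustration only: it builds the finite block Hankel matrices $\mathbf{H}_{\mu,\nu}$ from the leading terms of a sequence and computes their ranks exactly over the rationals. The helper names `hankel` and `rank` are assumptions of this sketch, and the test sequence (the Fibonacci numbers, with $p = m = 1$) is chosen because it is known to be generated by a two-dimensional system.

```python
from fractions import Fraction

def hankel(A, mu, nu):
    """mu x nu block Hankel matrix whose (i, j) block is A_(i+j-1).

    A is a list of p x m matrices with A[0] = A_1; it must contain at least
    mu + nu - 1 terms.
    """
    p, m = len(A[0]), len(A[0][0])
    return [[Fraction(A[i + j][r][c]) for j in range(nu) for c in range(m)]
            for i in range(mu) for r in range(p)]

def rank(M):
    """Rank of a matrix of Fractions by Gaussian elimination."""
    if not M:
        return 0
    M = [row[:] for row in M]
    rk = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, len(M)):
            factor = M[r][col] / M[rk][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

# Scalar example (p = m = 1): the Fibonacci numbers as 1 x 1 matrices.
fib = [[[x]] for x in (1, 1, 2, 3, 5, 8, 13)]
ranks = [rank(hankel(fib, k, k)) for k in (1, 2, 3, 4)]
```

For this sequence the ranks stop growing at 2, which is the behavior the corollary requires of any sequence with a finite-dimensional realization.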
Our next objective is the proof of the converse of the corollary. This can be done in several ways. The original proof is due to HO and KALMAN [1966]; similar results were obtained independently and concurrently by YOULA and TISSI [1966] as well as by SILVERMAN [1966]. Two different proofs are analyzed and compared in KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 11]. All proofs depend on certain finiteness arguments. We shall give here a variant of the proof developed in HO and KALMAN [1969].
(8.6) DEFINITION. The infinite Hankel matrix $\mathbf{H}$ associated with the sequence $\mathbf{A}$ has finite length $\lambda = (\lambda', \lambda'')$ iff one of the following two equivalent conditions holds:

$\lambda' = \min\{\ell' : \operatorname{rank} \mathbf{H}_{\ell',\nu} = \operatorname{rank} \mathbf{H}_{\ell'+\kappa,\nu}$ for all $\kappa, \nu = 1, 2, \ldots\} < \infty,$

$\lambda'' = \min\{\ell'' : \operatorname{rank} \mathbf{H}_{\mu,\ell''} = \operatorname{rank} \mathbf{H}_{\mu,\ell''+\kappa}$ for all $\kappa, \mu = 1, 2, \ldots\} < \infty.$

We say that $\lambda'$ is the row length of $\mathbf{H}$ and $\lambda''$ is the column length of $\mathbf{H}$.

The equivalence of the two conditions is immediate from the equality of the row rank and column rank of a finite matrix. The proof of the following result (not needed in the sequel) is left for the reader as an exercise in familiarizing himself with the special pattern of the elements of a Hankel matrix:
PROPOSITION. For any $\mathbf{A}$, the following inequalities are either both true [$\mathbf{H}$ has finite length] or both false [otherwise]:
The most direct consequence of the finiteness condition given by (8.6) is the existence of a finite-dimensional representation $\mathbf{S}$ and $\mathbf{Z}$ of the shift operator $\sigma_A$ acting on a sequence $\mathbf{A}$. The "operand" will be the Hankel matrix associated with a given $\mathbf{A}$. As we shall see soon, this representation of the shift operator induces a rule for computing the matrix $F$ of a realization of $\mathbf{A}$. This is exactly what we would expect: module theory tells us that, loosely speaking, $F$ corresponds to multiplication by $z$.

DEFINITION. The shift operator $\sigma_A$ on an infinite sequence $\mathbf{A}$ is given by

$\sigma_A^k \mathbf{A} = \sigma_A^k (A_1, A_2, \ldots) = (A_{1+k}, A_{2+k}, \ldots);$

the corresponding shift operator on Hankel matrices is then

$\sigma_H^k: \mathbf{H}(\mathbf{A}) \mapsto \mathbf{H}(\sigma_A^k \mathbf{A}).$

(Of course, $\sigma_H$ is well-defined also on submatrices of a Hankel matrix.)
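(8.8) MAIN LEMMA. A Hankel matrix $\mathbf{H}$ associated with an infinite sequence $\mathbf{A}$ has finite length if and only if the shift operator $\sigma_H$ has finite-dimensional left and right matrix representations. Precisely: $\mathbf{H}$ has finite length $\lambda = (\lambda', \lambda'')$ if and only if there exist $\ell' \times \ell'$ and $\ell'' \times \ell''$ block matrices $\mathbf{S}$ and $\mathbf{Z}$ such that

(8.9)    $\sigma_H^k \mathbf{H}_{\ell',\ell''} = \mathbf{S}^k \mathbf{H}_{\ell',\ell''} = \mathbf{H}_{\ell',\ell''} \mathbf{Z}^k, \qquad k = 0, 1, 2, \ldots,$

and furthermore the minimum size of these matrices satisfying (8.9) is $\lambda' \times \lambda'$ and $\lambda'' \times \lambda''$.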
PROOF. Sufficiency. Take any $\ell'' \times \ell''$ block matrix $\mathbf{Z}$ which satisfies (8.9). Compute the last column of $\mathbf{H}_{\mu,\ell''}\mathbf{Z}$:

(8.10)    $A_{\ell''+j+1} = \sum_{\nu=1}^{\ell''} A_{\nu+j}\,Z_{\nu\ell''}$

for all $j = 0, 1, \ldots$ (where $Z_{\mu\nu}$ is the $(\mu, \nu)$th element block of $\mathbf{Z}$). Relation (8.10) proves that

$\operatorname{rank} \mathbf{H}_{\kappa+1,\ell''} = \operatorname{rank} \mathbf{H}_{\kappa+1,\ell''+1}$ for all $\kappa = 0, 1, \ldots;$

the general case follows by repetition of the same argument. Hence the existence of the claimed $\mathbf{Z}$ implies that the column length $\lambda''$ of $\mathbf{H}$ cannot exceed the size of $\mathbf{Z}$. If actually $\lambda''$ is smaller than the size of the smallest $\mathbf{Z}$ which works in (8.9), we get a contradiction from the necessity part of the proof. The claims concerning $\mathbf{S}$ are proved by a strictly dual argument.

Necessity. By the definition of $\lambda''$, each column of the $(\lambda'' + 1)$th block column of $\mathbf{H}_{\mu,\lambda''+1}$ is linearly dependent on the columns of the preceding block columns of $\mathbf{H}_{\mu,\lambda''+1}$; moreover, this property is true for all integers $\mu$, no matter how large. So there exist $m \times m$ matrices $Z_1, \ldots, Z_{\lambda''}$ such that the relation

(8.11)    $A_{\lambda''+j+1} = \sum_{i=1}^{\lambda''} A_{i+j}\,Z_i$

holds identically for all $j = 0, 1, \ldots$. Now define $\mathbf{Z}$ to be a $\lambda'' \times \lambda''$ block companion matrix of $m \times m$ blocks made up from the $Z_i$ just defined:

$\mathbf{Z} = \begin{pmatrix} 0 & 0 & \cdots & 0 & Z_1 \\ I & 0 & \cdots & 0 & Z_2 \\ 0 & I & \cdots & 0 & Z_3 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I & Z_{\lambda''} \end{pmatrix}.$

The verification of (8.9) is immediate, using (8.11). The existence of a $\lambda' \times \lambda'$ block matrix $\mathbf{S}$ verifying (8.9) follows by a strictly dual argument.

Now we have enough material on hand to prove the strong version of Corollary (8.5):
(8.12) THEOREM. An infinite sequence $\mathbf{A}$ has a finite-dimensional realization of dimension $n$ if and only if the associated Hankel matrix $\mathbf{H}$ has finite length $\lambda = (\lambda', \lambda'')$.

PROOF. Sufficiency. Let $\mathbf{E}_{\lambda''}$ be a $\lambda'' \times 1$ block column matrix whose first block element is an $m \times m$ unit matrix and the other blocks are $m \times m$ zero matrices. Using (8.9) with $\mathbf{Z}$, define

(8.13)    $F = \mathbf{Z}, \qquad G = \mathbf{E}_{\lambda''}, \qquad H = \mathbf{H}_{1,\lambda''}.$

Then, for all $k \ge 0$, compute

$HF^kG = \mathbf{H}_{1,\lambda''}\mathbf{Z}^k\mathbf{E}_{\lambda''} = (\sigma_H^k \mathbf{H}_{1,\lambda''})\mathbf{E}_{\lambda''} = A_{1+k};$

the second step uses (8.9). By definition of $\sigma_A$ and $\mathbf{E}$, the last matrix is just the $(1, 1)$th element of $\mathbf{H}(\sigma_A^k(\mathbf{A}))$, namely $A_{1+k}$. Hence the given (8.13) is a realization of $\mathbf{A}$.

Necessity. This is immediate from Corollary (8.5).
Now we want to attack the problem of finding a canonical realization of $\mathbf{A}$, since the realization given by (8.13) is usually very far from canonical. Our succeeding considerations here and in Section 9 are made more transparent if we digress for a moment to establish another consequence of (8.8). By outrageous abuse of language, we shall say that $\mathbf{A}$ has finite length iff $\mathbf{H}(\mathbf{A})$ has finite length. We note

(8.14) DEFINITION. An infinite sequence $\mathbf{B}$ is an extension of order $N$ of (the initial part of) an infinite sequence $\mathbf{A}$ iff $B_k = A_k$ for $k = 1, \ldots, N$.

(8.15) THEOREM. No infinite sequence of finite length $(\lambda', \lambda'')$ has distinct length-preserving extensions of any order $N \ge \lambda' + \lambda''$.
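PROOF. Suppose $\mathbf{B}$ is a length-preserving extension of order $N$ of $\mathbf{A}$, the length of both sequences being $(\lambda', \lambda'')$, $N \ge \lambda' + \lambda''$. By (8.8), both sequences satisfy relation (8.9), with suitable $\mathbf{S}_A$, $\mathbf{Z}_A$ and $\mathbf{S}_B$, $\mathbf{Z}_B$. The sequence $\mathbf{A}$ is uniquely determined by $\mathbf{S}_A$ acting on $\mathbf{H}_{\lambda',\lambda''}(\mathbf{A})$ from the left and the sequence $\mathbf{B}$ is uniquely determined by $\mathbf{Z}_B$ acting on the matrix $\mathbf{H}_{\lambda',\lambda''}(\mathbf{B})$ from the right. The two matrices $\mathbf{H}_{\lambda',\lambda''}(\mathbf{A})$ and $\mathbf{H}_{\lambda',\lambda''}(\mathbf{B})$ are equal by hypothesis on $N$. Moreover,

$\mathbf{S}_A\mathbf{H}_{\lambda',\lambda''}(\mathbf{A}) = \sigma_H\mathbf{H}_{\lambda',\lambda''}(\mathbf{A})$ and $\mathbf{H}_{\lambda',\lambda''}(\mathbf{B})\mathbf{Z}_B = \sigma_H\mathbf{H}_{\lambda',\lambda''}(\mathbf{B})$

are also equal, since the matrices on the right-hand side depend only on the 2nd, ..., $N$-th member of each sequence. Using only this fact and the associativity of the matrix product, and writing $\mathbf{H}$ for the common matrix $\mathbf{H}_{\lambda',\lambda''}$, we get

$\mathbf{S}_A^k\mathbf{H} = \mathbf{S}_A^{k-1}\mathbf{H}\mathbf{Z}_B = \cdots = \mathbf{H}\mathbf{Z}_B^k$ for all $k \ge 0$,

that is, $\mathbf{A} = \mathbf{B}$.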
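Now we can hope for a realization algorithm which uses only the first $\lambda' + \lambda''$ terms of a sequence of finite length. In fact, we have

(8.16) B. L. HO'S REALIZATION ALGORITHM. Consider any infinite sequence $\mathbf{A}$ of finite length with associated Hankel matrix $\mathbf{H}$. The following steps will lead to a canonical realization of $\mathbf{A}$:

(i) Determine $\lambda'$, $\lambda''$.

(ii) Compute $n = \operatorname{rank} \mathbf{H}_{\lambda',\lambda''}$, and in doing so, determine nonsingular $p\lambda' \times p\lambda'$ and $m\lambda'' \times m\lambda''$ matrices $P$, $Q$ such that

(8.17)    $P\,\mathbf{H}_{\lambda',\lambda''}\,Q = \begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}.$

(iii) Compute

(8.18)    $F = R_n P\,(\sigma_H \mathbf{H}_{\lambda',\lambda''})\,Q\,C^n, \qquad G = R_n P\,\mathbf{H}_{\lambda',\lambda''}\,C^m, \qquad H = R_p\,\mathbf{H}_{\lambda',\lambda''}\,Q\,C^n,$

where $R_p$, $C^m$ are idempotent "editing" matrices corresponding to the operations "retain only the first $p$ rows" and "retain only the first $m$ columns".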
We claim the

(8.19) REALIZATION THEOREM FOR INFINITE SEQUENCES. For any infinite sequence $\mathbf{A}$ whose associated Hankel matrix $\mathbf{H}$ has finite length $(\lambda', \lambda'')$, B. L. Ho's formulas (8.17-18) yield a canonical realization.

PROOF. If $\Sigma$ defined by (8.17-18) is a realization of $\mathbf{A}$, then it is certainly canonical: by (8.4) $\Sigma$ has minimal dimension in the class of all realizations of $\mathbf{A}$ and so it is canonical by (7.8).

The required verification is interesting. First, drop all subscripts. Observe that $\mathbf{H}^\# = QCRP$ is a pseudo-inverse of $\mathbf{H}$, that is, $\mathbf{H}\mathbf{H}^\#\mathbf{H} = \mathbf{H}$. Then, by definition of $F$, $G$, $H$, and $\mathbf{H}^\#$,

$HF^kG = R_p\,\mathbf{H}\,(\mathbf{H}^\#\sigma_H\mathbf{H})^k\,\mathbf{H}^\#\mathbf{H}\,C^m = R_p\,(\sigma_H^k\mathbf{H})\,C^m$

by repeated application of (8.9). The last equation calls for picking out the first $p$ rows and the first $m$ columns of $\sigma_H^k\mathbf{H}$, which is just $A_{1+k}$, as required.

(8.20) COMMENT. This is a considerably sharper result than Theorem (8.12), in two respects:

(i) It is no longer necessary to compute $\mathbf{Z}$: we simply use the matrix $\sigma_H\mathbf{H}_{\lambda',\lambda''}(\mathbf{A})$ which is part of the data of the problem.

(ii) Formulas (8.18) give the desired realization in minimal form: there is no need to reduce (8.13) to a minimal realization (recall here (7.11)).

Notice also that the proof of (8.19) does not require (8.12) but depends (just like the latter) on direct use of (8.8).
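The three steps of B. L. Ho's algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's procedure in detail: the matrices $P$ and $Q$ are obtained here by Gauss-Jordan elimination with exact rational arithmetic (any other reduction to the form $\begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}$ would do), the lengths $\lambda'$, $\lambda''$ are supplied by the caller rather than determined, and all function names are hypothetical.

```python
from fractions import Fraction

def hankel(A, mu, nu, shift=0):
    """mu x nu block Hankel matrix of the sequence shifted `shift` times (A[0] = A_1)."""
    p, m = len(A[0]), len(A[0][0])
    return [[Fraction(A[i + j + shift][r][c]) for j in range(nu) for c in range(m)]
            for i in range(mu) for r in range(p)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def reduce_to_rank_form(M):
    """Return (P, Q, n) with P M Q = [[I_n, 0], [0, 0]], P and Q nonsingular."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    P, Q, n = eye(rows), eye(cols), 0
    while True:
        pivot = next(((i, j) for i in range(n, rows) for j in range(n, cols)
                      if M[i][j] != 0), None)
        if pivot is None:
            return P, Q, n
        i, j = pivot
        M[n], M[i] = M[i], M[n]          # row swap, mirrored in P
        P[n], P[i] = P[i], P[n]
        for X in (M, Q):                 # column swap, mirrored in Q
            for row in X:
                row[n], row[j] = row[j], row[n]
        d = M[n][n]
        M[n] = [x / d for x in M[n]]
        P[n] = [x / d for x in P[n]]
        for r in range(rows):            # clear column n by row operations
            if r != n and M[r][n] != 0:
                f = M[r][n]
                M[r] = [a - f * b for a, b in zip(M[r], M[n])]
                P[r] = [a - f * b for a, b in zip(P[r], P[n])]
        for c in range(n + 1, cols):     # clear row n by column operations
            f = M[n][c]
            if f != 0:
                for X in (M, Q):
                    for row in X:
                        row[c] -= f * row[n]
        n += 1

def ho_realization(A, lam1, lam2):
    """Formulas (8.17-18); needs the first lam1 + lam2 terms of the sequence."""
    p, m = len(A[0]), len(A[0][0])
    H0 = hankel(A, lam1, lam2)
    H1 = hankel(A, lam1, lam2, shift=1)   # sigma_H applied to H0
    P, Q, n = reduce_to_rank_form(H0)
    F = [row[:n] for row in matmul(matmul(P, H1), Q)[:n]]
    G = [row[:m] for row in matmul(P, H0)[:n]]
    Hm = [row[:n] for row in matmul(H0, Q)[:p]]
    return F, G, Hm

def markov(F, G, Hm, k):
    """The matrix H F^k G."""
    X = G
    for _ in range(k):
        X = matmul(F, X)
    return matmul(Hm, X)

seq = [[[x]] for x in (1, 1, 2, 3, 5, 8, 13, 21)]
F, G, Hm = ho_realization(seq, 2, 2)
```

Only the first four terms of `seq` enter the computation; the remaining terms serve to check that $HF^kG$ reproduces the sequence beyond the data actually used.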
An apparently serious limitation of the algorithm (8.16) is the necessity to verify abstractly that "$\mathbf{A}$ has finite length". Of course, this can be done only on the basis of certain special hypotheses on $\mathbf{A}$, given in advance. (Examples: (i) $A_k = 0$ for all $k > q$; (ii) $A_k$ = coefficients of the Taylor expansion of a rational function.) Fortunately, the difficulty is only apparent, for the preceding developments can be sharpened further:
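(8.21) FUNDAMENTAL THEOREM OF LINEAR REALIZATION THEORY. Consider any infinite sequence $\mathbf{A}$ and the corresponding Hankel matrix $\mathbf{H}$. Suppose there exist integers $\ell'$, $\ell''$ such that

(8.22)    $\operatorname{rank} \mathbf{H}_{\ell',\ell''}(\mathbf{A}) = \operatorname{rank} \mathbf{H}_{\ell'+1,\ell''}(\mathbf{A}) = \operatorname{rank} \mathbf{H}_{\ell',\ell''+1}(\mathbf{A}).$

Then there exists a unique extension $\hat{\mathbf{A}}$ of $\mathbf{A}$ of order $\ell' + \ell''$ such that $\lambda'_{\hat A} \le \ell'$ and $\lambda''_{\hat A} \le \ell''$; moreover, applying formulas (8.17-18) with $\lambda' = \ell'$ and $\lambda'' = \ell''$ gives a canonical realization of $\hat{\mathbf{A}}$.

PROOF. Exactly as in the necessity part of the proof of (8.8), condition (8.22) implies the existence of $\mathbf{S}$ and $\mathbf{Z}$ such that

(8.23)    $\sigma_H\mathbf{H}_{\ell',\ell''} = \mathbf{S}\,\mathbf{H}_{\ell',\ell''} = \mathbf{H}_{\ell',\ell''}\,\mathbf{Z}.$

Define an extension $\hat{\mathbf{A}}$ of $\mathbf{A}$ of order $\ell' + \ell''$ by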
any $\ell'$, $\ell''$ satisfying $\ell' + \ell'' = N$. This problem is the topic of the next section.

(8.25) FINAL COMMENT. An essential feature of B. L. Ho's algorithm is that it preserves the block structure of the data of the problem. Of course, one can obtain parallel results by treating $\mathbf{H}_{\ell',\ell''}$ as an ordinary matrix, disregarding its block-Hankel structure. Such a procedure requires looking at a minor of $\mathbf{H}$ of maximum rank, and was described explicitly by SILVERMAN [1966] and SILVERMAN and MEADOWS [1969]. There does not seem to be any obvious computational advantage associated with the second method.
9. THEORY OF PARTIAL REALIZATIONS

In one obvious respect the theory of realizations developed in the previous section is rather unsatisfactory: it is concerned with infinite sequences. From here on we call a system satisfying (8.3) a complete realization, to distinguish it from the practically more interesting case given by

(9.1) DEFINITION. Let $\mathbf{A} = (A_1, A_2, \ldots)$ be an infinite sequence of $p \times m$ matrices over a fixed field $K$. A dynamical system $\Sigma = (F, G, H)$ is a partial realization of order $r$ of $\mathbf{A}$ iff

$A_k = HF^{k-1}G$ for $k = 1, \ldots, r$.

We shall use the same terminology if, instead of an infinite sequence $\mathbf{A}$, we are given merely a finite sequence $\mathbf{A}_s = (A_1, \ldots, A_s)$, $s \ge r$. The reason for this convention will be clear from the discussion to follow. We shall call the first $r$ terms of $\mathbf{A}$ a partial sequence (of order $r$).

The concepts of canonical partial realization and minimal partial realization will be understood in exactly the same sense as for a complete realization. We warn the reader, however, that now these two notions will turn out to be inequivalent, in that minimal partial $\Rightarrow$ canonical partial but not conversely.

Our main interest will be to determine all equivalence classes of minimal partial realizations; in general, a given sequence will have infinitely many inequivalent minimal partial realizations if $r$ is sufficiently small.
According to the Main Theorem (8.21) of the theory of realizations, the minimal partial realization problem has a unique solution whenever the rank condition (8.22) is satisfied. If the length $r$ of the partial sequence is prescribed a priori, it may well happen that (8.22) does not hold. What to do? Clearly, if we have a minimal partial realization $(F, G, H)$ of order $r$ we can extend the partial sequence $\mathbf{A}_r$ on which this realization is based to an infinite sequence canonically realized by $(F, G, H)$ simply by setting

$A_k = HF^{k-1}G$ for $k > r$.

Consequently, we have the preliminary
(9.2) PROPOSITION. The determination of a minimal partial realization for $\mathbf{A}_r$ is equivalent to the determination of all extensions of a partial sequence $\mathbf{A}_r$ such that the extended sequence is

(i) finite-dimensional, and
(ii) its dimension is minimal in the class of all extensions.

It is trivial to prove that finite-dimensional extensions exist for any partial sequence (of finite length). Hence the problem is immediately reduced to determining extensions which have minimal dimension. The solution of this latter problem consists of two steps. First, we show by a trivial argument that the minimal dimension can be bounded from below by an examination of the Hankel array defined by the partial sequence. Second, and this is rather surprising, we show that the lower bound can be actually attained. For further details, especially the characterization of equivalence classes of the minimal partial realizations, see KALMAN [1969c and 1970b].
(9.3) DEFINITION. By the Hankel array $H(\underline{A}_r)$ of a partial sequence $\underline{A}_r$ we mean that $r \times r$ block Hankel matrix whose $(i, j)$th block is $A_{i+j-1}$ if $i + j - 1 \le r$ and undefined otherwise.

In other words, the Hankel array of a partial sequence $\underline{A}_r$ consists of block rows and columns made up of subsequences $A_p, \ldots, A_r$ ($1 \le p \le r$) of $\underline{A}_r$ and blank spaces.

(9.4) PROPOSITION. Let $n_0(\underline{A}_r)$ be the number of rows of the Hankel array of $\underline{A}_r$ which are linearly independent of the rows above them. Then the dimension of a realization of $\underline{A}_r$ is at least $n_0(\underline{A}_r)$.
PROOF. The rank of the Hankel matrix of an infinite sequence $\underline{A}$ is a lower bound on the dimension of any realization of $\underline{A}$, by Proposition (8.4). By Proposition (9.2), it suffices to consider a suitable extension $\underline{A}$ of $\underline{A}_r$. This implies "filling in" the blank spaces in the Hankel array of $\underline{A}_r$. Regardless of how $H(\underline{A}_r)$ is filled in, the rank of the resulting $r \times r$ block Hankel matrix is bounded from below by $n_0(\underline{A}_r)$.
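The row count behind this lower bound can be sketched in code. The following is a minimal illustration, under the assumption that a row of the $i$-th block row is tested for dependence only on the block columns where that block row is defined (the blank spaces are ignored); the function name `hankel_row_bound` and the sample sequences are hypothetical.

```python
import numpy as np

def hankel_row_bound(blocks):
    """Count rows of the incomplete Hankel array that are linearly
    independent of the rows above them (restricted to defined columns)."""
    blocks = [np.atleast_2d(np.asarray(A, dtype=float)) for A in blocks]
    r = len(blocks)
    p = blocks[0].shape[0]           # rows per block
    count = 0
    for i in range(r):               # block row i holds A_{i+1}, ..., A_r
        width = r - i                # number of defined block columns
        # Block rows 0..i, truncated to the columns defined in block row i.
        stacked = np.vstack([np.hstack(blocks[k:k + width])
                             for k in range(i + 1)])
        for q in range(p):
            top = i * p + q          # number of rows above the current row
            rank_above = np.linalg.matrix_rank(stacked[:top]) if top else 0
            if np.linalg.matrix_rank(stacked[:top + 1]) > rank_above:
                count += 1
    return count

bound_zero_tail = hankel_row_bound([0, 0, 1])     # sequence (0, 0, A_3)
bound_fibonacci = hankel_row_bound([1, 1, 2, 3])  # scalar partial sequence
```

Ranks are computed in floating point, so for data that are not small integers a tolerance suited to the problem would have to be chosen.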

By the block symmetry of the Hankel matrix, we would expect to be able to determine $n_0(\underline{A}_r)$ by an analogous examination of the columns of the Hankel array of $\underline{A}_r$, thereby obtaining the same lower bound. This is indeed true. We prefer not to give a direct proof, since the result will follow as a corollary of the Main Theorem (9.7). The critical fact is given by the

(9.5) MAIN LEMMA. For a partial sequence $\underline{A}_r$ define:

$\lambda'(\underline{A}_r)$ = smallest integer such that for $k' > \lambda'$ every row in the $k'$-th block row of $H(\underline{A}_r)$ is linearly dependent on the rows above it.

$\lambda''(\underline{A}_r)$ = smallest integer such that for $k'' > \lambda''$ every column in the $k''$-th block column of $H(\underline{A}_r)$ is linearly dependent on the columns to the left of it.

Every partial sequence $\underline{A}_r$ may be extended to an infinite sequence $\underline{A}$ in at least one way such that the condition

(9.6) rank $H_{\mu,\nu}(\underline{A}) = n_0(\underline{A}_r)$ for all $\mu > \lambda'(\underline{A}_r)$, $\nu > \lambda''(\underline{A}_r)$

is satisfied.
PROOF. The existence of the numbers $\lambda'$, $\lambda''$ is trivial. It suffices to show, for arbitrary $r$, how to select $A_{r+1}$ in such a way that the numbers $\lambda'$, $\lambda''$, and $n_0$ remain constant.

Consider the first row of $A_{r+1}$ and examine in turn all the first rows of the first, second, third, ..., $\lambda'$-th block rows in the Hankel array. If the first row of the first block row is linearly dependent on the rows above it (that is, 0), we fill in the first row of $A_{r+1}$ using this linear dependence (that is, we make the first row of $A_{r+1}$ all zeros). This choice of the first row of $A_{r+1}$ will preserve linear dependencies for the first row of every block row below the second block row, by the definition of the Hankel pattern. If the first row in the first block row is linearly independent of those above (that is, contributes 1 to $n_0(\underline{A}_r)$), we pass to the second block row and repeat the procedure. Eventually the first row of some block row will become linearly dependent on those above it, except when $\lambda' = r$; in that case, choose the first row of $A_{r+1}$ to be linearly dependent on the first rows of $\ldots, A_r$. Repeating this process for the second, third, ... rows of each block row, eventually $A_{r+1}$ is determined without increasing $\lambda'$ or $n_0$.

To complete the proof, we must show that the above definition of $A_{r+1}$ also preserves the value of $\lambda''$. That is, we must show that no new independent columns are produced in the Hankel array of $\underline{A}_r$ when $A_{r+1}$ is filled in. This is verified immediately by noting that the definition of $A_{r+1}$ implies rank conditions on the Hankel blocks $H_{r,1}$, $H_{r+1,1}$, $H_{r,2}$, $\ldots$, $H_{1,r}$, $H_{1,r+1}$.

---------------
*Of course, row linear dependence in the first step does not imply that the corresponding row of $A_{r+1}$ will be all zeros.
With the aid of this simple but subtle observation, the problem is reduced to that covered by the Main Theorem (8.21) of Section 8. We have:

(9.7) MAIN THEOREM FOR MINIMAL PARTIAL REALIZATIONS.* Let $\underline{A}_r$ be a partial sequence. Then:

(i) Every minimal realization of $\underline{A}_r$ has dimension $n_0(\underline{A}_r)$.

(ii) All minimal realizations may be determined with the aid of B. L. Ho's formulas (8.17-18) with $\lambda' = \lambda'(\underline{A}_r)$ and $\lambda'' = \lambda''(\underline{A}_r)$ as given by Lemma (9.5).

(iii) If $r \ge \lambda'(\underline{A}_r) + \lambda''(\underline{A}_r)$ then the minimal realization is unique. Otherwise there are as many minimal realizations as there are extensions of $\underline{A}_r$ satisfying (9.6).

PROOF. By the Main Lemma (9.5), every partial sequence $\underline{A}_r$ has at least one infinite extension which preserves $\lambda'$, $\lambda''$ and $n_0$. So we can apply the Main Theorem (8.21) of the preceding section. It follows that the minimal partial realization is unique if $r \ge \lambda'(\underline{A}_r) + \lambda''(\underline{A}_r)$ (the relevant Hankel matrix can be filled in completely with the available data); in the contrary case, the minimal extensions will depend on the manner in which the matrices $A_{r+1}, \ldots, A_{\lambda'+\lambda''}$ have been determined (subject to the requirement (9.6)).

In view of the theorem, we are justified in calling the integer $n_0(\underline{A}_r)$ the dimension of $\underline{A}_r$.

---------------
*A similar result was obtained simultaneously and independently by T. Tether (Stanford dissertation, 1969).
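The quantities entering the theorem can be computed directly from the partial data. The sketch below is an illustration only: it assumes that dependence is tested on the defined part of the Hankel array, obtains the column quantities by applying the row procedure to the transposed blocks, and uses the uniqueness test $r \ge \lambda' + \lambda''$ from part (iii). The names `row_profile` and `partial_realization_indices` are hypothetical.

```python
import numpy as np

def row_profile(blocks):
    """Return (number of independent rows, index of the last block row
    containing an independent row) for the incomplete Hankel array."""
    r = len(blocks)
    p = blocks[0].shape[0]
    n_indep, last_block_row = 0, 0
    for i in range(r):
        width = r - i  # defined block columns in block row i
        stacked = np.vstack([np.hstack(blocks[k:k + width])
                             for k in range(i + 1)])
        for q in range(p):
            top = i * p + q
            rank_above = np.linalg.matrix_rank(stacked[:top]) if top else 0
            if np.linalg.matrix_rank(stacked[:top + 1]) > rank_above:
                n_indep += 1
                last_block_row = i + 1  # 1-based block row index
    return n_indep, last_block_row

def partial_realization_indices(seq):
    blocks = [np.atleast_2d(np.asarray(A, dtype=float)) for A in seq]
    n0, lam_rows = row_profile(blocks)
    # Columns of the array correspond to rows of the array built
    # from the transposed blocks.
    n1, lam_cols = row_profile([A.T for A in blocks])
    unique = len(blocks) >= lam_rows + lam_cols
    return n0, n1, lam_rows, lam_cols, unique

long_data = partial_realization_indices([1, 1, 2, 3])
short_data = partial_realization_indices([1, 1, 2])
```

In this hypothetical scalar example both data sets give dimension 2, but only the longer one meets the uniqueness condition; with three terms the fourth term may still be chosen freely.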
REMARK. The essential point is that the quantities $n_0$, $\lambda'$, and $\lambda''$ are uniquely determined already from partial data, irrespective of the possible nonuniqueness of the minimal extensions of the partial sequence. We warn, however, that this result does not generalize to all invariants of the minimal realization. For instance, one cannot determine from $\underline{A}_r$ how many cyclic pieces a minimal realization of $\underline{A}_r$ will have: some minimal realizations may be cyclic and others may not [KALMAN 1970b].
Finally, let us note also a second consequence of the Main Theorem:

COROLLARY. Suppose $n_1(\underline{A}_r)$ is the number of independent columns of the Hankel array of $\underline{A}_r$ (defined analogously with $n_0(\underline{A}_r)$). Then $n_1(\underline{A}_r) = n_0(\underline{A}_r)$.

PROOF. If $n_1(\underline{A}_r) > n_0(\underline{A}_r)$ then, using the Main Theorem, we get a contradiction to the fact that the rank of any Hankel matrix of an infinite sequence is a lower bound for the dimension of any realization (Proposition (8.4)). If $n_1(\underline{A}_r) < n_0(\underline{A}_r)$ then, extending $\underline{A}_r$ to any $\underline{A}$, we contradict the fact that rank $H_{\lambda',\lambda''}$ is at least equal to $n_0(\underline{A}_r)$.

In other words, the characteristic property of rank, that counting rank by row or column dependence yields identical results, is preserved even for incomplete Hankel arrays. It is useful to check a simple case which illustrates some of the technicalities of the proof of the Main Lemma.
EXAMPLE. The dimension of $(0, 0, \ldots, 0, A_r)$ is precisely $r \times \rho$ where $\rho$ = rank $A_r$ and $\lambda' = \lambda'' = r$.
10. GENERAL THEORY OF OBSERVABILITY

In this concluding section, we wish to discuss the problem of observability in a rather general setting: we will not assume linearity, at least in the beginning. This is an ambitious program and leads to many more problems than results. Still, I think it is interesting to give some indication of the difficulties which are conceptual as well as mathematical. This discussion can also serve as an introduction to very recent research [KALMAN 1969a, 1970a] on the observability problem in certain classes of nonlinear systems.

The motivation for this section, as indeed for the whole theory of observability, stems from the writer's discovery [KALMAN 1960c] that the problem of (linear) statistical prediction and filtering can be formulated and resolved very effectively by consistent use of dynamical concepts and methods, and that this whole theory is a strict dual of the theory of optimal control of linear systems with quadratic Lagrangian. For those who are familiar with the standard classical theory of statistical filtering (see, for instance, YAGLOM [1962]), we can summarize the situation very simply by saying that

Wiener-Kolmogorov filter + theory of finite-dimensional linear dynamical systems = Kalman filter.

For the latter, the original papers are [KALMAN 1960a, 1963a] and [KALMAN and BUCY 1961].
The reader interested in further details and a modern exposition is referred especially to the monograph of KALMAN [1969b].

We shall examine here only one aspect of this theory (which does not involve stochastic elements): the strict formulation of the "duality principle" between reachability and observability. This principle was formally stated for the first time by KALMAN [1960c], but the pertinent discussion in this paper is limited to the linear case and is somewhat ad hoc. Aided by research progress since 1960, it is now possible to develop a completely general approach to the "duality principle". We shall do this and, as a by-product, we shall obtain a new and strictly deductive proof of the principle in the now classical linear case.

We shall introduce a general notion of the "dual" system, and use it to replace the problem of observability by an equivalent problem of reachability. In keeping with the point of view of the earlier lectures, we shall view a system in terms of its input/output map $f$ and dualize $f$ (rather than $\Sigma$). The constructibility problem will not be of direct interest, since its theory is similar to that of the observability problem.
Let $\Omega$, $\Gamma$ be the same sets as defined in Section 4 and used from then on. We assume that both $\Omega$ and $\Gamma$ are K-vector spaces (K = arbitrary field) and recall the definition of the shift operators $\sigma_\Omega$ and $\sigma_\Gamma$ on $\Omega$ and $\Gamma$ (see (3.10)). We denote both shift operators by $z$ but ignore, until later, the $K[z]$-module structure on $\Omega$ and $\Gamma$.

By a constant (not necessarily linear) input/output map $f\colon \Omega \to \Gamma$ we shall mean any map $f$ which commutes with the shift operators, that is,

$f(z \cdot \omega) = z \cdot f(\omega)$ for all $\omega \in \Omega$.
Let us now formulate the general problem of this section:

(10.1) PROBLEM OF OBSERVABILITY. Given an input/output map $f$, its canonical realization $\Sigma$, and an input sequence $\nu \in \Omega$ applied after $t = 0$. Determine the state $x$ of $\Sigma$ at $t = 0$ from the knowledge of the output sequence of $\Sigma$ after $t = 0$.
This problem cannot be solved in general! To see this, recall that the state set $X_f$ of $f$ may be viewed as a set of functions $\{f(\omega \circ \cdot)\colon \Omega \to \Gamma\}$, since $\omega$ is Nerode-equivalent to $\omega'$ iff $f(\omega \circ \cdot) = f(\omega' \circ \cdot)$. Giving $\nu \in \Omega$ and the corresponding output sequence amounts to giving various values of $f(\omega \circ \cdot)$ (namely those corresponding to the sequences $\emptyset$, $\nu_r$, $z\nu_r + \nu_{r-1}$, $\ldots$), and it may happen that these substitutions do not yield enough values of the function $f(\omega \circ \cdot)$ to determine the function itself. This situation has been recognized for a long time in automata theory, where, in an almost self-explanatory terminology, one says that "$\Sigma$ is initial-state determinable by an infinite multiple experiment (possibly infinitely many different $\nu$'s) but not necessarily by a single experiment (single $\nu$ chosen at will)". See MOORE [1956]. The problem is further complicated by the fact that it may make a difference whether or not we have a free choice of $\nu$. KALMAN, FALB, and ARBIB [1969, Section 6.3] give some related comments.
A further difficulty inherent in the preceding discussion is that the problem is posed on a purely set-theoretic level and does not lend itself to the introduction of more refined structural assumptions. We shall therefore reformulate the problem in such a way as to focus attention on determining those properties of the initial state which can be computed from the combined knowledge of the input and output sequence occurring after $t = 0$. For simplicity, we shall fix the value of $\nu$ at 0 (no loss of generality, since $f$ is not linear). Then the output sequence resulting from $x = [\omega]_f$ after $t = 0$ is given simply as $f(\omega)$.
We shall use the circumflex to denote certain classes of functions from a set into the field $K$. For the moment, this class will be the class of all functions. Thus

$\hat{\Gamma}$ = {all functions $\Gamma \to K$}.

An element $\hat{\gamma}$ in $\hat{\Gamma}$ is simply a "rule" (in practice, a computing algorithm) which assigns to each possible output sequence in $\Gamma$ a number in the field $K$. If $\gamma$ resulted from the state $x = [\omega]_f$, then $\hat{\gamma}(\gamma) = \hat{\gamma}(f(\omega))$ gives the value of a certain function in $\hat{\Omega}$ and, by definition of the state, also the value of a certain function in $\hat{X}$. This suggests the

(10.2) DEFINITION. An element $\hat{x} \in \hat{X}$ is an observable costate iff there is a $\hat{\gamma} \in \hat{\Gamma}$ such that we have identically for all $\omega \in \Omega$

$\hat{x}([\omega]_f) = \hat{\gamma}(f(\omega))$.
In other words, no matter what the initial state $x = [\omega]_f$ is, the value of $\hat{x}$ at $x$ can always be determined by applying the rule $\hat{\gamma}$ to the output sequence $f(\omega)$ resulting from $x$. Note, carefully, that this definition subsumes (i) a fixed choice of the class of functions denoted by the circumflex, and (ii) a fixed input sequence after $t = 0$ (here 0). For certain purposes, it may be necessary to generalize the definition in various ways [KALMAN 1970a], but here we wish to avoid all unessential complications.

According to Definition (10.2), we shall say that a system is completely observable iff every costate is observable. This agrees with the point of view adopted earlier (see Section 4) in an ad hoc fashion. Also, the vague requirement to "determine $x$" used in (10.1) is now replaced by a precise notion which can be manipulated (via the actual definition of the circumflex) to express limitations on the algorithms that we may apply to the output sequence of the system.

The requirement "every costate is observable" can often be replaced by a much simpler one. For instance, if $X$ is a vector space, it is enough to know that "every linear costate is observable" or even just that "every element of some dual basis is an observable costate"; if $X$ is an algebraic variety, it is natural to interpret "complete observability" as "every element of the coordinate ring of $X$ is an observable costate" [KALMAN 1970a].

We can now carry out a straightforward "dualization" of the setup involved in the definition of the input/output map $f\colon \Omega \to \Gamma$. First, we adopt (again with respect to a fixed interpretation of the circumflex):

(10.3) DEFINITION. The dual of an input/output map $f\colon \Omega \to \Gamma$ is the map

$\hat{f}\colon \hat{\Gamma} \to \hat{\Omega}\colon \hat{\gamma} \mapsto \hat{\gamma} \circ f$.

Note that $\hat{f}$ is well-defined, since the circumflex means the class of all functions.

As to the next step, we wish to prove that constancy is inherited under dualization. To do this, we have to induce a definition of the shift operator on $\hat{\Gamma}$ and $\hat{\Omega}$. The only possible definitions are the obvious ones:

$\sigma_{\hat{\Omega}}\colon \hat{\omega} \mapsto \hat{\omega} \circ \sigma_\Omega$, $\sigma_{\hat{\Gamma}}\colon \hat{\gamma} \mapsto \hat{\gamma} \circ \sigma_\Gamma$.

Both of these new shift operators will be denoted by $z^{-1}$. The reason for this notation will become clear later. Now it is easy to verify:

(10.4) PROPOSITION. If $f$ is constant, so is $\hat{f}$.

PROOF. We apply the definitions in suitable sequence:

$\hat{f}(z^{-1} \cdot \hat{\gamma})(\omega) = (z^{-1} \cdot \hat{\gamma})(f(\omega))$ (def. of $\hat{f}$),
$= \hat{\gamma}(z \cdot f(\omega))$ (def. of $\sigma_{\hat{\Gamma}}$),
$= \hat{\gamma}(f(z \cdot \omega))$ ($f$ is constant),
$= \hat{f}(\hat{\gamma})(z \cdot \omega)$ (def. of $\hat{f}$),
$= (z^{-1} \cdot \hat{f}(\hat{\gamma}))(\omega)$ (def. of $\sigma_{\hat{\Omega}}$),

and so we see that $\hat{f}$ commutes with $z^{-1}$ whenever $f$ does.
At this stage, we cannot as yet view $\hat{f}$ as the input/output map of a dynamical system because concatenation is not yet defined on $\hat{\Gamma}$ and therefore $\hat{\Gamma}$ is not yet a properly defined "input set". In other words, it is necessary to check that the notion of time is also inherited under dualization. In general, this does not appear to be possible without some strong limitation on the class $\hat{\Gamma}$. Here we shall look only at the simplest

(10.5) HYPOTHESIS. Every function $\hat{\gamma}$ in $\hat{\Gamma}$ satisfies the finiteness condition: There is an integer $|\hat{\gamma}|$ (dependent on $\hat{\gamma}$) such that for all $\gamma, \delta \in \Gamma$ the condition that $\gamma$ and $\delta$ agree in their first $|\hat{\gamma}|$ terms implies $\hat{\gamma}(\gamma) = \hat{\gamma}(\delta)$.

In other words, we assume that the value of each $\hat{\gamma}$ at $\gamma$ is uniquely determined by some finite portion of the output sequence $\gamma$.

Assuming (10.5), it is immediate that $\hat{\Gamma}$ admits a concatenation multiplication (10.6) which corresponds (at least intuitively) to the usual one defined on $\Omega$.
We can now prove the expected theorem, which may be regarded as the precise form of the "duality" principle:

(10.7) THEOREM. Let $f$ be an arbitrary constant input/output map and $\hat{f}$ its dual. Suppose further that (10.5) holds. Then each observable costate of $X_f$ (relative to $\hat{\Gamma}$ satisfying (10.5)) may be viewed as a reachable state of $\hat{f}$, and conversely.

PROOF. First we determine the Nerode equivalence classes on $\hat{\Gamma}$ induced by $\hat{f}$. By definition, $\hat{\gamma}$ is Nerode-equivalent to $\hat{\delta}$ iff $\hat{f}(\hat{\gamma} \circ \hat{\varepsilon}) = \hat{f}(\hat{\delta} \circ \hat{\varepsilon})$ for all $\hat{\varepsilon} \in \hat{\Gamma}$. Now $\hat{f}$ is linear (!); this follows by direct use of the definition of $\hat{f}$ and (10.6). So $\hat{\gamma} \circ f$ and $\hat{\delta} \circ f$ are equal as elements of $\hat{\Omega}$: they define the same observable costate. In fancier language, the assignment

$[\hat{\gamma}]_{\hat{f}} \mapsto \hat{\gamma} \circ f$

is well defined and constitutes a bijection between the reachable states of $\hat{f}$ and those costates of $X_f$ which are observable relative to the class $\hat{\Gamma}$.
Thus (10.5) is a sufficient condition for the duality principle to hold. However, the fact that the canonical realization of $\hat{f}$ is completely reachable is not quite the same as saying that the canonical realization of $f$ is completely observable because the latter depends on the choice of $\hat{\Gamma}$ and therefore is not an intrinsic property of $f$. Moreover, Theorem (10.7) does not give any indication how "big" $X_{\hat{f}}$ is and it may certainly happen that the observability problem for $f$ is much more difficult than the reachability problem. These matters will be illustrated later by some examples.

Now we deduce the original form of the duality principle from Theorem (10.7). The essential point is that (10.5) holds automatically as a result of linearity.

New definition of the function class: let the circumflex denote the class of all K-linear functions. (All the underlying sets are K-vector spaces, so the definition makes sense.)
The following facts are well known:

(10.9) PROPOSITION. Let * denote duality in the sense of K-vector spaces. Then:

Now we can state the

(10.10) MAIN THEOREM. Suppose $f$ is K-linear, constant, and finite-dimensional. Suppose further that the circumflex means K-linear duality. Then:

(i) $\hat{f}$ is K-linear and constant, that is, a $K[z^{-1}]$-homomorphism (and therefore written as $f^*$), and finite-dimensional.

(ii) The reachable states of $f^*$ are isomorphic with the K-linear dual of $X_f$; hence every costate of $X_f$ is observable.

PROOF. The fact that $f$ is K-linear implies, by (10.3), that $\hat{f}$ is K-linear; the constancy of $f$ always implies that of $\hat{f}$, by Proposition (10.4). (Caution: $\hat{f}$ is not the $K[z]$-linear dual of the $K[z]$-homomorphism $f$, and the construction given here cannot be simplified. See Remark (4.26c).)

To prove the second part, we note that by Proposition (10.9) Hypothesis (10.5) holds and thus $\hat{f} = f^*$ is a well-defined input/output map of a dynamical system. We must prove that the reachable states of $f^*$ are isomorphic with $X_f^*$, the K-linear dual of $X_f$. This amounts to proving that the K-vector space of functions $\hat{\gamma} \circ f$, $\hat{\gamma} \in \hat{\Gamma}$, is isomorphic with the K-vector space $X_f^*$. It suffices to prove: the K-vector space generated by the K-linear functions

(10.11) $\{\hat{x}\colon x \mapsto [f(z^i \cdot x)]_j,\ i = 0, 1, \ldots \text{ and } j = 1, \ldots, m\}$

is isomorphic with $X_f^*$. Suppose that, for fixed $x$, every $\hat{x}$ in (10.11) gives $\hat{x}(x) = 0$. Then $x = 0$, by definition of the Nerode equivalence relation induced by $f$ (recall here the discussion from Section 3). Since $X_f$ is finite-dimensional by hypothesis, it follows from this property of the functions (10.11) that they generate $X_f^*$. Obviously, dim $X_f^*$ = dim $X_f$, so that everything is proved.
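In the finite-dimensional linear case the duality principle has a familiar matrix form, which can be checked numerically. The sketch below is an illustration of that classical matrix statement only, not of the module-theoretic construction used in the proof; it assumes a state-space pair $(F, H)$ in the notation of the triple $(F, G, H)$ used earlier, and the function names and sample matrices are hypothetical.

```python
import numpy as np

def reachability_matrix(F, G):
    """Return [G, FG, ..., F^(n-1) G]."""
    n = F.shape[0]
    cols = [G]
    for _ in range(n - 1):
        cols.append(F @ cols[-1])
    return np.hstack(cols)

def observability_matrix(F, H):
    """Return the stacked rows H, HF, ..., H F^(n-1)."""
    n = F.shape[0]
    rows = [H]
    for _ in range(n - 1):
        rows.append(rows[-1] @ F)
    return np.vstack(rows)

# Hypothetical example system (illustrative values only).
F = np.array([[0.0, 1.0], [-2.0, -3.0]])
H = np.array([[1.0, 0.0]])

obs = observability_matrix(F, H)
dual_reach = reachability_matrix(F.T, H.T)  # dual system: transpose and swap roles
```

The observability matrix of $(F, H)$ is the transpose of the reachability matrix of the transposed pair, so the two rank tests coincide; this is the matrix shadow of interchanging the input and output terminals.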
In other terms, the fact that $f$ is a $K[z]$-homomorphism together with the appropriate definition of the circumflex implies that $\hat{f}$ is a $K[z^{-1}]$-homomorphism.

Since (10.5) holds, we can interpret the dual in a system-theoretic way, as follows: the output of the dual system at $-k$ due to input $\hat{\gamma}$ is given by an assignment which is a linear function defined on the $k$-th term of the input sequence.
(10.12) REMARK. It is essentially a consequence of Proposition (10.9) that $\hat{f}$ turns out to be the same kind of algebraic object as $f$. Note, however, that under duality the input and output terminals are interchanged and $t$ is replaced by $-t$ (hence $z$ by $z^{-1}$). In terms of the pictorial definition of a system, this statement simply amounts to "reversing the directions of the arrows", which is the "right" way to define duality in the most general mathematical context, namely in category theory. We would expect that the duality principles of system theory will eventually become a part of this very general duality theory. This has not happened yet because the correct categories to be considered in the study of dynamical systems have not yet been determined. It is likely that eventually many different categories will have to be looked at in studying dynamical problems.
We shall now present an example which should help to interpret the previous results. We emphasize, however, that the theory sketched here is still in a very rudimentary form.

(10.13) EXAMPLE. Consider the system $\Sigma$ with $X = U = Y = \mathbf{R}$ mod 1, i.e., the interval $[0, 1)$. (1 is to be thought of as identified with 0.) We let $u(t) = 0$. We view $x$ through its binary representation. It is clear from the definition of the system that the output sequence due to any $x$ is precisely the sequence of binary digits $(\xi_1(x), \xi_2(x), \ldots)$.

If $x$ is irrational, infinitely many terms are needed to identify it. Consequently, the $x$'s are isomorphic with the Nerode equivalence classes induced by $f_\Sigma$. So $\Sigma$ cannot be reduced.

Relative to "circumflex = all functions", every costate of $f_\Sigma$ is observable, provided that Hypothesis (10.5) is not satisfied. If it is, then only those costates defined on fixed-length rationals are observable (more precisely, these are functions which depend only on a fixed finite subset of the $\xi_k(x)$'s). Thus: either $\hat{f}_\Sigma$ does not define a dynamical system or not all costates are observable.

Now let us replace the set $[0, 1)$ by its intersection with the rationals. It is clear that there is now a finite algorithm for determining $x$: we simply apply the results of partial realization theory of the previous section. (We take $K = \mathbf{Z}_2$ and the problem is to express $x$ from $(\xi_1(x), \xi_2(x), \ldots)$ as a ratio of polynomials in $\mathbf{Z}_2[z]$, which is always possible since each $x$ is rational.) However, $x$ is not "effectively computable" in the strict sense since there is no way of knowing when the algorithm has stopped. In other words, given an arbitrary costate $\hat{x}$ there exists no fixed rule $\hat{\gamma}$ such that the application of $\hat{\gamma}$ to $\gamma_x$ gives $\hat{x}(x)$ for all $x$. On the other hand, substituting into $\hat{x}$ the results of the partial-realization algorithm will give an approximation to the value of $\hat{x}(x)$ which always converges in a finite (but a priori unknown) number of steps as more values of the output sequence are observed. In short, the costate-determination algorithm has certain pseudo-random elements in it and therefore cannot be described through the machinery of deterministic dynamical systems. (Is there some relation here to the conceptual difficulties of Quantum Mechanics?)
11. HISTORICAL COMMENTS

It is not an exaggeration to say that the entire theory of linear, constant (and here, discrete-time) dynamical systems can be viewed as a systematic development of the equivalent algebraic conditions (2.8) and (2.15).

Of course, the use of modules (over $K[z]$) to study a constant square matrix (see (4.13)) has been "standard" since the 1920's under the influence of E. NOETHER and especially after the publication of the Modern Algebra of VAN DER WAERDEN. Condition (2.15), by itself, must be also quite old. For instance, GANTMAKHER [1959, Vol. 1, p. 203] attributes to KRYLOV [1931] the idea of computing the characteristic polynomial of a square matrix $A$ by choosing a random vector $b$ and computing successively $b$, $Ab$, $A^2 b$, ... until linear dependence is obtained, which yields the coefficients of det $(zI - A)$. (The method will succeed iff $X_A$ is cyclic with generator g.) However, the merger of (4.13) with (2.15), which is the essential idea in the algebraic theory of linear systems, was done explicitly first in KALMAN [1965b].

We shall direct our remarks here mainly to the history of conditions (2.8) and (2.15) as related to controllability. See also earlier comments in KALMAN [1960c, pp. 481, 483, 484] and in KALMAN, HO, and NARENDRA [1963, pp. 210-212]. We will have to bear in mind that the development of modern control theory cannot be separated from the development of the concept of controllability; moreover, the technological problems of the 1950's and even earlier had a major influence on the genesis of mathematical ideas (just as the latter have led to many new technological applications of control in the 1960's).
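The Krylov procedure mentioned above lends itself to a short numerical illustration. This is a sketch under stated assumptions, not a reproduction of any historical algorithm: linear dependence is detected by a least-squares residual with a hypothetical tolerance `tol`, the starting vector is assumed nonzero, and the function name `krylov_polynomial` is invented for the example. When the chosen vector generates the whole space, the returned polynomial coincides with det $(zI - A)$.

```python
import numpy as np

def krylov_polynomial(A, b, tol=1e-10):
    """Coefficients (highest power first) of the monic polynomial of least
    degree k with  A^k b = sum_i c_i A^i b, found from b, Ab, A^2 b, ..."""
    A = np.asarray(A, dtype=float)
    vectors = [np.asarray(b, dtype=float)]
    n = A.shape[0]
    for _ in range(n):
        nxt = A @ vectors[-1]
        K = np.column_stack(vectors)
        coeffs, *_ = np.linalg.lstsq(K, nxt, rcond=None)
        residual = np.linalg.norm(K @ coeffs - nxt)
        if residual <= tol * max(1.0, np.linalg.norm(nxt)):
            # z^k - c_{k-1} z^(k-1) - ... - c_0
            return np.concatenate(([1.0], -coeffs[::-1]))
        vectors.append(nxt)
    raise RuntimeError("no linear dependence detected within n steps")

# Hypothetical example matrix and starting vector.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
b = np.array([0.0, 1.0])
poly = krylov_polynomial(A, b)
```

If the starting vector happens to lie in a proper invariant subspace, the procedure stops early and returns only a proper factor of the characteristic polynomial, which is the cyclicity caveat noted in the text.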
The writer developed the mathematical definition of controllability with applications to control theory, during the first part of 1959. (Unpublished course notes at Johns Hopkins University, 1958/59.) These first definitions were in the form of (2.17) and (2.3). Formal presentations of the results were made in Mexico City (September, 1959, see KALMAN [1960b]), University of California at Berkeley (April, 1960, see KALMAN [1960d]), and Moskva (June, 1960, see KALMAN [1960c]), and in scientific lectures on many other concurrent occasions in the U.S. As far as the writer is aware, a conscious and explicit definition of controllability which combines a control-theoretic wording with a precise mathematical criterion was first given in the above references. There are of course many instances of similar ideas arising in related contexts. Perhaps the comments below can be used as the starting point of a more detailed examination of the situation in a seminar in the history of ideas.

The following is the chain of the writer's own ideas culminating in the publications mentioned above:

(1) In KALMAN [1954] it is pointed out (using transform methods) that continuous-time linear systems can be controlled by a linear discrete-time (sampled-data) controller in finite time.*

---------------
*It is sometimes claimed in the mathematical literature of optimal control theory that this cannot be done with a linear system. This is false; the correct statement is "cannot be done with a linear controller producing control functions which are continuous (and not merely piecewise continuous!) in time." Such a restriction is completely irrelevant from the technological point of view. As a matter of fact, computer-controlled systems have been proposed and built for many years on the basis of linear, time-optimal control.
(2) Transposing the result of KALMAN [1954] from transfer functions to state variables, an algorithm was sketched for the solution of the discrete-time time-optimal control of systems with bounded control and linear continuous-time dynamics [KALMAN, 1957].

(3) As a popularization of the results of the preceding work, the same technique was applied to give a general method for the design of linear sampled-data systems by KALMAN and BERTRAM [1958].

Some background comments concerning these papers are appropriate:

(1) The ideas and method presented in KALMAN [1954] descend directly from earlier (and very well known) engineering research on time-optimal control. (The main references in KALMAN [1954] are: McDONALD [1950], HOPKIN [1951], BOGNER and KAZDA [1954], as well as a research report included in KALMAN [1955].) Although the results of KALMAN [1954] on linear time-optimal control were considered to be new when published, it became clear later that similar ideas were at least implicit in OLDENBOURG and SARTORIUS [1951, §90, p. 219] and in TSYPKIN's work in the early 1950's. The engineering idea of nonlinear time-optimal control goes back, at least, to DOLL [1943] and to OLDENBURGER in 1944, although the latter's work was unfortunately not widely known before 1957. During the same time, there was much interest in the same problems in other countries; see, for instance, FEL'DBAUM [1953] and UTTLEY and HAMMOND [1953]. Mathematical work in these problems probably began with BUSHAW's dissertation [1952] in which, to quote from KALMAN [1955, before equation (40)], "... [it was] rigorously proved that the intuition which led to the formulation of the [engineering] theory [quoted above] was indeed correct." TSIEN's survey [1954] contains a lengthy account of this state of affairs and was read by many. We emphasize: none of this extensive literature contains even a hint of the algebraic considerations related to controllability.
(2-3) The critical insight gained and recorded in KALMAN [1957] is the following: the solution of the discrete-time time-optimal control problem is equivalent to expressing the state as a linear combination of a certain vector sequence (related to control and dynamics) with coefficients bounded by 1 in absolute value, the coefficients being the values of the optimal control sequence. The linear independence of the first $n$ vectors of the sequence guarantees that every point in a neighborhood of zero can be moved to the origin in at most $n$ steps (hence the terminology of "complete controllability"); and the condition for this is identical with (2.17) (stated in KALMAN [1957] and KALMAN and BERTRAM [1958] only for the case det $F \ne 0$ and $m = 1$). A thorough discussion of these matters is found in KALMAN [1960c; see especially Theorem 1, p. 485]. A serious conceptual error in KALMAN [1957] occurred, however, in that complete controllability was not assumed as a hypothesis for the existence of time-optimal control law, but an attempt was made to show that the controllability is almost always complete [Lemma 1]. In fact, this lemma is true, with a small technical modification in the condition. Only much later did it become clear (see the discussion of Theorem D in the Introduction), however, that a dynamical system is always completely controllable (in the nonconstant case, completely reachable) if it is derived from an external description. It was this difficulty, very mysterious in 1957, which led to the development of a formal machinery for the definition of controllability during the next two years. The changing point of view is already apparent in KALMAN and BERTRAM [1958]; the unpublished paper promised there was delayed precisely because the algebraic machinery to prove Theorem D was out of reach in 1957-8. Consult also the findings of the bibliographer RUDOLF [1969].

IN SUMMARY: under the stimulation of the engineering problems of minimal-time optimal control, the researches begun by KALMAN [1954, 1957] and KALMAN and BERTRAM [1958] eventually evolved into what has come to be called the mathematical theory of controllability (of linear systems).
Beginning about 1955, and stimulated by the same engineering problems, PONTRYAGIN and his school in the USSR developed their mathematical theory of optimal control around the celebrated "Maximum Principle". (They were well aware of the survey of TSIEN [1954] mentioned above, and referenced it both in English and in the Russian translation of 1956.) We now know that any theory of control, regardless of its particular mathematical style, must contain ingredients related to controllability. So it is interesting to examine how explicitly the controllability condition appears in the work of PONTRYAGIN and related research.

GAMKRELIDZE [1957, §2; 1958, §1, §2] calls the time-optimal control problem associated with the system

(11.1)  $dx/dt = Ax + bu(t)$

"nondegenerate" iff $b$ is not contained in a proper $A$-invariant subspace of $R^n$. He notes immediately that this is equivalent to

(11.2)  $\det(b, Ab, \ldots, A^{n-1}b) \neq 0$

(i.e., the special case of (2.8) for $m = 1$). He then proves: in the "degenerate" case the problem either reduces to a simpler one or the motion cannot be influenced by the control function $u(\cdot)$.
All this is very close to an explicit definition of controllability. However, in discussing the general case $m > 1$, GAMKRELIDZE [1958, §3, Section 1] defines "nondegeneracy" of the system

(11.3)  $dx/dt = Ax + Bu(t)$

as the condition

(11.4)  $\det(b_i, Ab_i, \ldots, A^{n-1}b_i) \neq 0$ for every column $b_i$ of $B$,

but he does not show that this generalized condition of "nondegeneracy" for (11.3) inherits the interesting characterization proved for "nondegeneracy" in the case of (11.1). In fact, condition (11.4) is much too strong to prove this; the correct condition is (2.8), that is, complete controllability. In other words, in GAMKRELIDZE's work (11.4) plays the role of a technical condition for eliminating "degeneracy" (actually, lack of uniqueness) from a particular optimal control problem and is not explicitly related to the more basic notion of complete controllability. Neither GAMKRELIDZE nor PONTRYAGIN [1958] gives an interpretation of (11.4) as a property of the dynamical system (11.3); both employ (11.4) only in relation to the particular problem of time-optimal control. See also KALMAN [1960c, p. 484]. A similar point of view is taken by LASALLE [1960]; he calls a dynamical system (11.3) satisfying (2.8) "proper" but then goes on to require (11.4) (to assure the uniqueness of the time-optimal controls) and calls such systems "normal".

The assumption of some kind of "nondegeneracy" condition was unavoidable in the early phases of research on the time-optimal control problem. For example, ROSE [1953, pp. 39-58] examines this problem for (11.1); by defining "nondegeneracy" [p. 41] by a condition equivalent to (11.2), he obtains most of GAMKRELIDZE's results in the special case when A has real eigenvalues [Theorem 12]. ROSE uses determinants closely related to the now familiar lemmas in controllability theory but he, too, fails to formulate controllability as a concept independent of the time-optimal control problem.
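The gap between the column-by-column condition (11.4) and complete controllability can be seen on a two-dimensional example. The sketch below is only an illustration under stated assumptions: condition (2.8) is not reproduced in this section, so it is taken here to be the usual rank test, rank [B, AB, ..., A^(n-1)B] = n, and the function names are hypothetical. With A = diag(1, 2) and B the identity matrix, each column of B generates only a one-dimensional invariant subspace, so (11.4) fails, yet the two columns together satisfy the rank test.

```python
from fractions import Fraction


def mat_vec(M, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    M = [[Fraction(x) for x in vec] for vec in vectors]
    r = 0
    width = len(M[0]) if M else 0
    for col in range(width):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            if M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


def krylov_vectors(A, b):
    """The vectors b, Ab, ..., A^(n-1) b."""
    vectors, v = [], list(b)
    for _ in range(len(A)):
        vectors.append(v)
        v = mat_vec(A, v)
    return vectors


def columns(B):
    """Columns of B as a list of vectors."""
    return [[row[j] for row in B] for j in range(len(B[0]))]


def completely_controllable(A, B):
    """Assumed form of (2.8): rank [B, AB, ..., A^(n-1) B] = n."""
    vectors = [v for b in columns(B) for v in krylov_vectors(A, b)]
    return rank(vectors) == len(A)


def columnwise_nondegenerate(A, B):
    """Condition (11.4): det(b_i, A b_i, ..., A^(n-1) b_i) != 0 for every column."""
    return all(rank(krylov_vectors(A, b)) == len(A) for b in columns(B))


A = [[1, 0], [0, 2]]
B = [[1, 0], [0, 1]]
print(completely_controllable(A, B))   # True
print(columnwise_nondegenerate(A, B))  # False
```

For a single column the two tests coincide, which is the case m = 1 covered by (11.2).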
A similar situation exists in the calculus of variations. The [...] to a kind of classification of controllability properties of nonconstant systems. In fact, the standard notion of a normal family of extremals of the calculus of variations is closely related to condition (11.4), suitably generalized via (2.5) to nonconstant systems.* Normality is used in the calculus of variations mainly as a "nondegeneracy" condition. It is important to note that the "nondegeneracy" conditions employed in optimal control and the calculus of variations play mainly the role of eliminating annoying technicalities and simplifying proofs. With suitable formulation, however, the basic results of time-optimal control theory continue to hold without the assumption of complete controllability. The same is not true, however, of the four kinds of theorems mentioned in the Introduction, and therefore these results are more relevant to the story of controllability than the time-optimal control discussed above.

--------------
*The use of the word "normal" by LASALLE [1960] for (11.4) only is accidentally coincident with the earlier use of the word "normal" in the calculus of variations.

There is a considerable body of literature relevant to controllability theory which is quite independent of control theory. For instance, the treatment of a reachability condition in partial differential equations goes back at least to CHOW [1940], but perhaps it is fairer to attribute it to Caratheodory's well-known approach to entropy via the nonintegrability condition. The current status of these ideas as related to controllability is reviewed by WEISS [1969, Section 9]. An independent and very explicit study of reachability is due to ROXIN [1960]; unfortunately, his examples were purely geometric and therefore the paper did not help in clarifying the celebrated condition (2.8). The Wronskian determinant of the classical theory of ordinary differential equations with variable coefficients also has intersections with controllability theory, as pointed out recently with considerable success by SILVERMAN [1966]. Many problems in control theory were misunderstood or even incorrectly solved before the advent of controllability theory. Some of these are mentioned in KALMAN [1963b, Section 9]. For relations with automata theory, see ARBIB [1965].

Let us conclude by stating the writer's current position as to the significance of controllability as a subject in mathematics:
(1) Controllability is basically an algebraic concept. (This claim applies of course also to the nonlinear controllability results obtained via the Pfaffian method.)

(2) The historical development of controllability was heavily influenced by the interest prevailing in the 1950's in optimal control theory. Ultimately, however, controllability is seen as a relatively minor component of that theory.

(3) Controllability as a conceptual tool is indispensable in the discussion of the relationship between transfer functions and differential equations and in questions relating to the four theorems of the Introduction.

(4) The chief current problem in controllability theory is the extension to more elaborate algebraic structures.

For a survey of the historical background of observability, which would take us too far afield here, the reader should consult [...]
12. REFERENCES

Section A: General References

M. A. ARBIB
[1965] A common framework for automata theory and control theory, SIAM J. Contr., 3:206-222.

C. W. CURTIS and I. REINER
[1962] Representation Theory of Finite Groups and Associative Algebras, Interscience-Wiley.

J. M. DAY and A. D. WALLACE
[1967] Multiplication induced in the state space of an act, Math. System Theory, 1:305-314.

C. A. DESOER and P. VARAIYA
[1967] The minimal realization of a nonanticipative impulse response matrix, SIAM J. Appl. Math., 15:754-764.

E. G. GILBERT
[1963] Controllability and observability in multivariable control systems, SIAM J. Control, 1:128-151.

B. L. HO and R. E. KALMAN
[1966] Effective construction of linear state-variable models from input/output functions, Regelungstechnik, 14:545-548.
[1969] The realization of linear, constant input/output maps, I. Complete realizations, SIAM J. Contr., to appear.

S. T. HU
[1965] Elements of Modern Algebra, Holden-Day.

R. E. KALMAN
[1960a] A new approach to linear filtering and prediction problems, J. Basic Engr. (Trans. ASME), 82D:35-45.
[1960b] Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119.
[1960c] On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London.
[1962] Canonical structure of linear dynamical systems, Proc. Nat. Acad. Sci. (USA), 48:596-600.
[1963a] New methods in Wiener filtering theory, Proc. 1st Symp. on Engineering Applications of Random Function Theory and Probability, Purdue University, November 1960, pp. 270-388, Wiley. (Abridged from RIAS Technical Report 61-1.)
[1963b] Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.
[1965a] Irreducible realizations and the degree of a rational matrix, SIAM J. Contr., 13:520-544.
[1965b] Algebraic structure of linear dynamical systems. I. The Module of Σ, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.
[1967] Algebraic aspects of the theory of dynamical systems, in Differential Equations and Dynamical Systems, J. K. Hale and J. P. LaSalle (eds.), pp. 133-146, Academic Press.
[1969a] On multilinear machines, J. Comp. and System Sci., to appear.
[1969b] Dynamic Prediction and Filtering Theory, Springer, to appear.
[1969c] On partial realizations of a linear input/output map, Guillemin Anniversary Volume, Holt, Winston and Rinehart.
[1970a] Observability in multilinear systems, to appear.
The realization of linear, constant, input/output maps. II. Partial realizations, SIAM J. Control, to appear.

R. E. KALMAN and R. S. BUCY
[1961] New results in linear prediction and filtering theory, J. Basic Engr. (Trans. ASME, Ser. D), 83D:95-108.

R. E. KALMAN, P. L. FALB and M. A. ARBIB
[1969] Topics in Mathematical System Theory, McGraw-Hill.

R. E. KALMAN, Y. C. HO and K. NARENDRA
[1963] Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.

C. E. LANGENHOP
[1964] On the stabilization of linear systems, Proc. Amer. Math. Soc., 15:735-742.

S. LANG
[1965] Algebra, Addison-Wesley.

S. MAC LANE
[1963] Homology, Springer.

L. MARKUS
[1965] Controllability of nonlinear processes, SIAM J. Control, 3:78-90.

E. F. MOORE
[1956] Gedanken-experiments on sequential machines, in Automata Studies, C. E. Shannon and J. McCarthy (eds.), pp. 129-153, Princeton University Press.

P. MUTH
[1899] Theorie und Anwendung der Elementartheiler, Teubner, Leipzig.

A. NERODE
[1958] Linear automaton transformations, Proc. Amer. Math. Soc., 9:541-544.

L. M. SILVERMAN
[1966] Representation and realization of time-variable linear systems, Doctoral dissertation, Columbia University.

L. M. SILVERMAN and H. E. MEADOWS
[1969] Equivalent realizations of linear systems, SIAM J. Control, to appear.

H. WEBER
[1898] Lehrbuch der Algebra, Vol. 1, 2nd Edition, reprinted by Chelsea, New York.

L. WEISS
[1969] Lectures on Controllability and Observability, C.I.M.E. Seminar.

L. WEISS and R. E. KALMAN
[1965] Contributions to linear system theory, Intern. J. Engr. Sci., 3:141-171.

W. M. WONHAM
[1967] On pole assignment in multi-input controllable linear systems, IEEE Trans. Auto. Contr., AC-12:660-665.

A. M. YAGLOM
[1962] An Introduction to the Theory of Stationary Random Functions, Prentice-Hall.

D. C. YOULA
[1966] The synthesis of linear dynamical systems from prescribed weighting patterns, SIAM J. Appl. Math., 14:527-549.

D. C. YOULA and P. TISSI
[1966] n-port synthesis via reactance extraction, Part I, IEEE Intern. Convention Record.

O. ZARISKI and P. SAMUEL
[1958] Commutative Algebra, Vol. 1, Van Nostrand.

Section B: References for Section 11

M. A. ARBIB
[1965] A common framework for automata theory and control theory, SIAM J. Contr., 3:206-222.

I. BOGNER and L. F. KAZDA
[1954] An investigation of the switching criteria for higher order contactor servomechanisms, Trans. AIEE, 73 II:118-127.

D. W. BUSHAW
[1952] Differential equations with a discontinuous forcing term, doctoral dissertation, Princeton University.

C. CARATHEODORY
[1933] Über die Einteilung der Variationsprobleme von Lagrange nach Klassen, Comm. Mat. Helv., 5:1-19.

W. L. CHOW
[1940] Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Annalen, 117:98-105.

Automatic control system for vehicles, US Patent 2,463,362.

A. A. FELDBAUM
[1953] Avtomatika i Telemekhanika, 14:712-728.

R. V. GAMKRELIDZE
[1957] On the theory of optimal processes in linear systems (in Russian), Dokl. Akad. Nauk SSSR, 116:9-11.
[1958] The theory of optimal processes in linear systems (in Russian), Izvestia Akad. Nauk SSSR, 22:449-474.

F. R. GANTMAKHER
[1959] The Theory of Matrices, 2 vols., Chelsea.

A. M. HOPKIN
[1951] A phase-plane approach to the compensation of saturating servomechanisms, Trans. AIEE, 70:631-639.

R. E. KALMAN
[1954] Discussion of a paper by Bergen and Ragazzini, Trans. AIEE, 73 II:245-246.
[1955] Analysis and design principles of second and higher order saturating servomechanisms, Trans. AIEE, 74 II:294-310.
[1957] Optimal nonlinear control of saturating systems by intermittent control, IRE WESCON Convention Record, 1 IV:133-135.
[1960b] Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119.
[1960c] On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London.
[1960d] Lecture notes on control system theory (by M. Athans and G. Lendaris), Univ. of Calif. at Berkeley.
[1963b] Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.
[1965b] Algebraic structure of linear dynamical systems. I. The Module of Σ, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.
[1969b] Dynamic Prediction and Filtering Theory, Springer, to appear.

R. E. KALMAN and J. E. BERTRAM
[1958] General synthesis procedure for computer control of single and multi-loop linear systems, Trans. AIEE, 77 II:602-609.

R. E. KALMAN, Y. C. HO and K. NARENDRA
[1963] Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.

A. N. KRYLOV
[1931] On the numerical solution of the equation by which the frequency of small oscillations is determined in technical problems (in Russian), Izv. Akad. Nauk SSSR Ser. Fiz.-Mat., 4:491-539.

J. P. LASALLE
[1960] The time-optimal control problem, Contr. Nonlinear Oscillations, Vol. 5, Princeton Univ. Press.

D. McDONALD
[1950] Nonlinear techniques for improving servo performance, Proc. Nat. Electronics Conf. (USA), 6:400-421.

R. C. OLDENBOURG and H. SARTORIUS
[1951] Dynamik selbsttätiger Regelungen, 2nd edition, Oldenbourg, München.

R. OLDENBURGER
[1957] Optimum nonlinear control, Trans. ASME, 79:527-546.
[1966] Optimal and Self-optimizing Control, MIT Press.

L. S. PONTRYAGIN
[1958] Optimal control processes (in Russian), Uspekhi Mat. Nauk, 14:3-20.

N. J. ROSE
[1953] Theoretical aspects of limit control, Report 459, Stevens Institute of Tech., Hoboken, N.J.

E. ROXIN
[1960] Reachable zones in autonomous differential systems, Bol. Soc. Mat. Mexicana, 5:125-135.

RUDOLF
[1969] On some unpublished works of R. E. Kalman, not to be unpublished.

H. S. TSIEN
[1954] Engineering Cybernetics, McGraw-Hill.

A. M. UTTLEY and P. H. HAMMOND
[1953] The stabilization of on-off controlled servomechanisms, in Automatic and Manual Control, Academic Press.

L. WEISS
[1969] Lectures on Controllability and Observability, C.I.M.E. Seminar.
