Discrete Time Optimal Adaptive Control For Linear Stochastic Systems

TSINGHUA SCIENCE AND TECHNOLOGY
ISSN 1007-0214 16/17 pp105-110

Volume 12, Number 1, February 2007
JIANG Rui$^{1,2}$, LUO Guiming$^{2,**}$
1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;
2. School of Software, Tsinghua University, Beijing 100084, China
Abstract: The least-squares (LS) algorithm has long been used for system modeling. Without any excitation conditions, only the convergence rate of the common LS algorithm can be obtained. This paper analyzes the weighted least-squares (WLS) algorithm and describes its good properties. The WLS algorithm is then used for adaptive control of linear stochastic systems to show that the closed-loop system is globally stable and that the system identification is consistent. Compared with earlier optimal adaptive controllers, this controller does not impose restrictive conditions on the coefficients of the system, such as requiring the first coefficient to be known in advance. Without any persistent excitation conditions, the analysis shows that, under the adaptive control, the closed-loop system is globally stable and the adaptive controller converges to the one-step-ahead optimal controller in some sense.
Key words: stochastic system; weighted least-squares (WLS) algorithm; optimal adaptive control; globally stable
Introduction
Adaptive control plays an increasingly important role in academic research and industrial processes. Problems related to convergence, stability, and optimization have attracted considerable attention over the last few decades. The pioneering work is the self-tuning regulator introduced by Åström and Wittenmark[1], based on the least-squares (LS) estimate. They found that if the parameter estimates converge to any limiting random variable, the limiting control law is optimal.
The strategy for constructing a controller is to estimate the system parameters as accurately as possible while keeping the output of the system below a specified level of variability. This is an optimal adaptive experiment design problem with constraints on the system output. Similar non-adaptive experimental design problems have been considered in the system identification literature.
Recent adaptive control schemes[2-4] require that the system have a stable inverse and satisfy some persistent excitation (PE) conditions. Furthermore, these adaptive controller designs contain a variable determined by the Riccati equation, which takes a large amount of computational time to solve at each step of the control scheme. An optimal adaptive control design based on input matching techniques was proposed in Refs. [5,6]. The advantages of this method are that the Riccati equation is not needed in the optimal adaptive control design and that the minimum phase restriction can be relaxed.
This paper analyzes one-step-ahead control of stochastic systems. Using the weighted least-squares (WLS) algorithm, the convergence of the estimated system parameters is established, and a one-step-ahead adaptive controller is designed and proved to converge to the one-step-ahead optimal controller under certain conditions. The stability of the control system is also analyzed.

Received: 2006-03-10
Supported by the National Natural Science Foundation of China (No. 60474026) and the Asia Research Center at Tsinghua University
** To whom correspondence should be addressed. E-mail: gluo@tsinghua.edu.cn; Tel: 86-10-62795440
1 Self-Convergence of WLS Algorithm
Linear systems can often be described by the following linear regression model:
$$A(z) y_n = z B(z) u_n + C(z) w_n \tag{1}$$
where
$$A(z) = 1 + a_1 z + \cdots + a_p z^p,$$
$$B(z) = \beta + b_1 z + \cdots + b_q z^q,$$
$$C(z) = 1 + c_1 z + \cdots + c_r z^r.$$
$y_n$, $u_n$, and $w_n$ are the system output, input, and noise sequences, while $A(z)$, $B(z)$, and $C(z)$ are polynomials in the backward-shift operator $z$ with unknown coefficients and known upper bounds $p$, $q$, and $r$ on their orders, with $\beta \neq 0$.
The system can be rewritten as
$$y_{n+1} = \theta^T \varphi_n^0 + w_{n+1} \tag{2}$$
Here,
$$\theta = (-a_1, \ldots, -a_p, \beta, b_1, \ldots, b_q, c_1, \ldots, c_r)^T,$$
$$\varphi_n^0 = (y_n, \ldots, y_{n-p+1}, u_n, u_{n-1}, \ldots, u_{n-q}, w_n, \ldots, w_{n-r+1})^T,$$
$$\varphi_n = (y_n, \ldots, y_{n-p+1}, u_n, u_{n-1}, \ldots, u_{n-q}, \hat w_n, \ldots, \hat w_{n-r+1})^T,$$
$$\hat w_n = y_n - \theta_n^T \varphi_{n-1}.$$
First, define the following set of functions:
$$F = \left\{ f(\cdot) : f(\cdot) \text{ is slowly increasing and } \int_M^\infty \frac{dx}{x f(x)} < \infty \text{ for some } M > 0 \right\} \tag{3}$$
A slowly increasing $f(\cdot)$ has to satisfy $f(x^2) = O(f(x))$ for all large $x > 0$.
Then, the recursive WLS algorithm has the following form:
$$\theta_{n+1} = \theta_n + L_n (y_{n+1} - \varphi_n^T \theta_n) \tag{4}$$
$$L_n = \frac{P_n \varphi_n}{a_n^{-1} + \varphi_n^T P_n \varphi_n} \tag{5}$$
$$P_{n+1} = P_n - \frac{P_n \varphi_n \varphi_n^T P_n}{a_n^{-1} + \varphi_n^T P_n \varphi_n} \tag{6}$$
Here the initial values $\theta_0$ and $P_0 = \alpha I$ $(0 < \alpha < 1)$ are chosen arbitrarily, and $\{a_n\}$ is the weighting sequence defined by
$$a_n = \frac{1}{f(r_n)}, \quad r_n = \|P_0^{-1}\| + \sum_{i=0}^{n} \|\varphi_i\|^2,$$
with $f(\cdot)$ being any measurable function in the set $F$ defined by Eq. (3).
For convenience, take
$$a_t^{-1} = O(\log^L r_t), \quad L > 0 \tag{7}$$
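The recursion in Eqs. (4)-(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the weight function `f_weight` with $f(x) = \log^{1+\delta}(e + x)$ is one hypothetical member of the set $F$, the names `wls_step` and `demo` are made up here, and the demo uses a plain regression with independent Gaussian regressors rather than the ARMAX regressor of Eq. (2).

```python
import numpy as np

def f_weight(x, delta=0.5):
    # Hypothetical choice f(x) = log(e + x)^(1 + delta): slowly increasing,
    # and the integral of 1 / (x f(x)) converges for delta > 0.
    return np.log(np.e + x) ** (1.0 + delta)

def wls_step(theta, P, r, phi, y_next, f=f_weight):
    """One WLS update; r is the running sum r_{n-1} on entry and r_n on exit."""
    r = r + float(phi @ phi)                  # r_n = r_{n-1} + ||phi_n||^2
    a_inv = f(r)                              # a_n^{-1} = f(r_n)
    Pphi = P @ phi
    denom = a_inv + float(phi @ Pphi)         # a_n^{-1} + phi^T P phi
    L = Pphi / denom                          # gain, Eq. (5)
    theta = theta + L * (y_next - float(phi @ theta))   # Eq. (4)
    P = P - np.outer(Pphi, Pphi) / denom      # Eq. (6)
    return theta, P, r

def demo(n_steps=2000, seed=0):
    """Assumed toy problem: y = theta^T phi + noise with random regressors."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([0.5, -0.3, 1.0])   # illustrative values only
    alpha = 0.5                               # P_0 = alpha * I, 0 < alpha < 1
    theta = np.zeros(3)
    P = alpha * np.eye(3)
    r = 1.0 / alpha                           # ||P_0^{-1}|| for P_0 = alpha * I
    for _ in range(n_steps):
        phi = rng.normal(size=3)
        y_next = float(theta_true @ phi) + 0.1 * rng.normal()
        theta, P, r = wls_step(theta, P, r, phi, y_next)
    return theta, theta_true
```

Calling `demo()` returns the final estimate together with the assumed true parameter vector, so the estimation error can be inspected directly. Because the weights decay only logarithmically, the estimate in this toy setting behaves much like an ordinary LS estimate with a small bias from the prior $P_0$.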
The analysis of the WLS algorithm is based on the following standard assumptions.
A1 $\{w_n, \mathcal{F}_n\}$ is a martingale difference sequence defined on the basic probability space $(\Omega, \mathcal{F}, P)$ with
$$\sup_n E\{|w_{n+1}|^{r_0} \mid \mathcal{F}_n\} < \infty \ \text{a.s.}, \quad r_0 > 2,$$
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} w_i^2 = R > 0 \ \text{a.s.}$$
A2 $u_n$ is $\mathcal{F}_n$-measurable.
A3 $C^{-1}(z) - \frac{1}{2}$ is strictly positive real, i.e.,
$$\max_{|z|=1} |C(z) - 1| < 1.$$
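The strict positive realness condition in A3 can be checked numerically by sampling the unit circle. The sketch below is an illustration only: the function name `spr_margin` and the coefficient values in the usage note are hypothetical, and a finite grid gives an approximation of the maximum rather than a proof.

```python
import numpy as np

def spr_margin(c, num_points=2048):
    """Approximate max over |z| = 1 of |C(z) - 1| for
    C(z) = 1 + c[0] z + c[1] z^2 + ... (grid search on the unit circle)."""
    c = np.asarray(c, dtype=float)
    w = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    z = np.exp(1j * w)                               # points on |z| = 1
    powers = np.arange(1, len(c) + 1)
    # Row k holds c[k] * z^(k+1); summing the rows gives C(z) - 1.
    vals = (c[:, None] * z[None, :] ** powers[:, None]).sum(axis=0)
    return float(np.abs(vals).max())
```

For example, with the assumed coefficients `c = [0.2, 0.5]` the margin is 0.7, so the condition holds, whereas `c = [1.0, 0.5]` gives a value above 1 and the condition fails.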
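Using the Markov inequality,
$$P(|w_{n+1}|^2 \ge n^{\varepsilon} \mid \mathcal{F}_n) \le \frac{E\{|w_{n+1}|^{r_0} \mid \mathcal{F}_n\}}{n^{\varepsilon r_0 / 2}}.$$
So with Condition A1, it is deduced that
$$\sum_{n=1}^{\infty} P(|w_{n+1}|^2 \ge n^{\varepsilon} \mid \mathcal{F}_n) \le \sum_{n=1}^{\infty} \frac{E\{|w_{n+1}|^{r_0} \mid \mathcal{F}_n\}}{n^{\varepsilon r_0 / 2}} < \infty \ \text{a.s.}, \quad \forall \varepsilon \in \left(\frac{2}{r_0}, 1\right).$$
From the conditional Borel-Cantelli Lemma, it follows that
$$w_{n+1}^2 = O(n^{\varepsilon}) \ \text{a.s.}, \quad \forall \varepsilon \in \left(\frac{2}{r_0}, 1\right) \tag{8}$$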
Lemma 1 Let System (1) satisfy Conditions A1 and A2. Then the WLS algorithm described by Eqs. (4)-(6) has the following properties[7]:
(a)
$$\|P_{n+1}^{-1/2} \tilde\theta_{n+1}\|^2 = O(1) \ \text{a.s.} \tag{9}$$
(b)
$$\sum_{n=1}^{\infty} a_n \left[ (\varphi_n^T \tilde\theta_{n+1})^2 + (\hat w_n - w_n)^2 \right] < \infty \ \text{a.s.} \tag{10}$$
(c)
$$\sum_{n=1}^{\infty} \frac{(\varphi_n^T \tilde\theta_n)^2}{a_n^{-1} + \varphi_n^T P_n \varphi_n} < \infty \ \text{a.s.} \tag{11}$$
where $\tilde\theta_n = \theta - \theta_n$.
Theorem 1 Let System (1) satisfy Conditions A1 and A2. Then the WLS algorithm described by Eqs. (4)-(6) has the following self-convergence properties.
(a) $\theta_n$ converges almost surely to a finite random vector $\bar\theta$ (not necessarily equal to $\theta$) (12)
(b)
$$\|\theta_{n+1} - \theta\|^2 = O\left( \frac{a_n^{-1}}{\lambda_{\min}(n)} \right) \tag{13}$$
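Proof
(a) Substituting Eq. (2) into Eq. (4) gives
$$\begin{aligned} \theta_{n+1} &= \theta_n + L_n [\varphi_n^T \tilde\theta_n + \theta^T (\varphi_n^0 - \varphi_n) + w_{n+1}] \\ &= \theta_0 + \sum_{i=0}^{n} L_i [\varphi_i^T \tilde\theta_i + \theta^T (\varphi_i^0 - \varphi_i) + w_{i+1}] \end{aligned} \tag{14}$$
From Eq. (6), taking the trace of both sides and summing up give
$$\sum_{i=0}^{\infty} \frac{\|P_i \varphi_i\|^2}{a_i^{-1} + \varphi_i^T P_i \varphi_i} = \sum_{i=0}^{\infty} (\mathrm{tr}(P_i) - \mathrm{tr}(P_{i+1})) \le \mathrm{tr}(P_0) < \infty \tag{15}$$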
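The telescoping argument behind this trace bound can be checked numerically: each update of $P$ lowers its trace by exactly $\|P_i \varphi_i\|^2 / (a_i^{-1} + \varphi_i^T P_i \varphi_i)$, so the accumulated sum never exceeds $\mathrm{tr}(P_0)$. The sketch below is illustrative only; the function name `trace_telescoping`, the random regressors, and the arbitrary positive weights are assumptions made for the demonstration.

```python
import numpy as np

def trace_telescoping(steps=200, dim=3, alpha=0.5, seed=0):
    """Run the covariance recursion and accumulate the per-step trace drop."""
    rng = np.random.default_rng(seed)
    P = alpha * np.eye(dim)                    # P_0 = alpha * I
    trace_p0 = float(np.trace(P))
    total = 0.0
    for _ in range(steps):
        phi = rng.normal(size=dim)             # assumed random regressor
        a = rng.uniform(0.05, 1.0)             # arbitrary positive weight a_i
        Pphi = P @ phi
        denom = 1.0 / a + float(phi @ Pphi)
        total += float(Pphi @ Pphi) / denom    # ||P phi||^2 / (a^{-1} + phi^T P phi)
        P = P - np.outer(Pphi, Pphi) / denom   # covariance update
    return total, trace_p0 - float(np.trace(P)), trace_p0
```

The first two returned values agree up to rounding, and both stay below the third, which is $\mathrm{tr}(P_0)$.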
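From Formula (15), Lemma 1c, and the Schwarz inequality, it follows that
$$\begin{aligned} \sum_{i=0}^{n} \|L_i \varphi_i^T \tilde\theta_i\| &= \sum_{i=0}^{n} \frac{\|P_i \varphi_i\| \, |\varphi_i^T \tilde\theta_i|}{a_i^{-1} + \varphi_i^T P_i \varphi_i} \\ &\le \left[ \sum_{i=0}^{n} \frac{\|P_i \varphi_i\|^2}{a_i^{-1} + \varphi_i^T P_i \varphi_i} \sum_{i=0}^{n} \frac{(\varphi_i^T \tilde\theta_i)^2}{a_i^{-1} + \varphi_i^T P_i \varphi_i} \right]^{1/2} < \infty \end{aligned} \tag{16}$$
Hence, $\sum_{i=0}^{n} L_i \varphi_i^T \tilde\theta_i$ converges.
Similarly, $\sum_{i=0}^{n} L_i \theta^T (\varphi_i^0 - \varphi_i)$ also converges a.s., since by Lemma 1b,
$$\sum_{i=0}^{n} \|L_i \theta^T (\varphi_i^0 - \varphi_i)\| \le \|\theta\| \left\{ \sum_{i=0}^{n} \frac{\|P_i \varphi_i\|^2}{a_i^{-1} + \varphi_i^T P_i \varphi_i} \sum_{i=0}^{n} a_i \|\varphi_i^0 - \varphi_i\|^2 \right\}^{1/2} < \infty \tag{17}$$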
As for the last term in Eq. (14), Condition A1 and Eq. (15) lead to
$$\sum_{n=1}^{\infty} E[\|L_n w_{n+1}\|^2 \mid \mathcal{F}_n] < \infty \ \text{a.s.}$$
From Chow's martingale convergence theorem, $\sum_{i=0}^{\infty} L_i w_{i+1}$ also converges a.s., that is,
$$\left\| \sum_{i=0}^{\infty} L_i w_{i+1} \right\| < \infty \tag{18}$$
Therefore, from Eqs. (14) and (16)-(18), $\theta_n$ converges a.s., as desired.
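(b) By Lemma 1a, it follows that
$$\|\tilde\theta_{n+1}\|^2 = O(\lambda_{\max}(P_{n+1})) \tag{19}$$
Also,
$$\lambda_{\min}(P_{n+1}^{-1}) = \lambda_{\min}\left( P_0^{-1} + \sum_{i=0}^{n} a_i \varphi_i \varphi_i^T \right) \ge a_n \lambda_{\min}(n) \tag{20}$$
Since $\lambda_{\min}(P_{n+1}^{-1}) = \dfrac{1}{\lambda_{\max}(P_{n+1})}$, Eqs. (19) and (20) can be combined to give Eq. (13).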
2 Optimal Adaptive Controller and Stability
Because the WLS algorithm converges, $\beta_n$ converges to a definite value $\bar\beta$ (not necessarily equal to $\beta$). Suppose that
$$\bar\beta B(z) + \lambda A(z) \neq 0, \quad \forall z: |z| \le 1, \ \lambda > 0 \tag{21}$$
For System (1), let $\{y_n^*\}$ be a given almost surely bounded reference signal and let $y_{n+1}^*$ be $\mathcal{F}_n$-measurable. The desired controller for System (1) should approximately track the reference signal $y_n^*$ while simultaneously minimizing the control effort. Therefore, with the performance cost
$$J(u_n) = E\{(y_{n+1} - y_{n+1}^*)^2 \mid \mathcal{F}_n\} + \lambda u_n^2, \quad \lambda > 0 \tag{22}$$
the one-step-ahead optimal controller can be obtained[8-10]:
$$u_n^* = \beta (\beta^2 + \lambda)^{-1} (y_{n+1}^* + \beta u_n^* - \theta^T \varphi_n^0) \tag{23}$$
The one-step-ahead adaptive controller for System (1) is defined as
$$u_n = \beta_n (\beta_n^2 + \lambda)^{-1} (y_{n+1}^* + \beta_n u_n - \theta_n^T \varphi_n) \tag{24}$$
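The adaptive control law is written implicitly, with $u_n$ on both sides. Because $\theta_n^T \varphi_n$ already contains the term $\beta_n u_n$, the difference $\theta_n^T \varphi_n - \beta_n u_n$ does not depend on $u_n$, and the law can be solved in closed form. The Python sketch below illustrates this under assumptions: the function names, the argument `phi_no_u` (the regressor with the $u_n$ slot set to zero), and the numbers in the usage note are hypothetical, and `predicted_cost` is a certainty-equivalence version of the cost that uses the estimate in place of the true parameters and omits the noise variance.

```python
import numpy as np

def one_step_ahead_control(theta_hat, phi_no_u, beta_index, y_ref_next, lam):
    """Closed-form solution of the implicit one-step-ahead control law.

    phi_no_u is the regressor with the current-input entry set to 0, so
    theta_hat @ phi_no_u equals theta_n^T phi_n - beta_n u_n.
    """
    beta_hat = float(theta_hat[beta_index])
    rest = float(theta_hat @ phi_no_u)
    return beta_hat * (y_ref_next - rest) / (beta_hat ** 2 + lam)

def predicted_cost(u, theta_hat, phi_no_u, beta_index, y_ref_next, lam):
    """Certainty-equivalence cost: squared predicted tracking error plus
    lam * u^2 (the constant noise-variance term is left out)."""
    beta_hat = float(theta_hat[beta_index])
    y_pred = beta_hat * u + float(theta_hat @ phi_no_u)
    return (y_pred - y_ref_next) ** 2 + lam * u ** 2
```

As a usage example with made-up numbers, `theta_hat = [-0.3, -0.4, 1.0, 0.1, 0.2, 0.5]`, `phi_no_u = [0.8, -0.2, 0.0, 0.5, 0.1, -0.3]`, `beta_index = 2`, `y_ref_next = 1.0`, and `lam = 0.01` give a control value of 1.24/1.01, and perturbing it in either direction increases `predicted_cost`. When the estimate $\beta_n$ is zero the function returns zero, which matches the implicit law.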
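Lemma 2 Define $z_n = \varphi_n^T \tilde\theta_n$. Then
$$\sum_{n=1}^{N} z_n^2 = o(r_N) \tag{25}$$
Proof
Because
$$\begin{aligned} a_i^{-1} + \varphi_i^T P_i \varphi_i &= a_i^{-1} |I + a_i P_i \varphi_i \varphi_i^T| \\ &= a_i^{-1} |P_i (P_i^{-1} + a_i \varphi_i \varphi_i^T)| = a_i^{-1} |P_i P_{i+1}^{-1}|, \end{aligned}$$
where $|\cdot|$ denotes the determinant, then
$$\varphi_i^T P_i \varphi_i = a_i^{-1} \frac{|P_i| - |P_{i+1}|}{|P_{i+1}|} \tag{26}$$
By using Eq. (6), it is easy to get
$$a_i \varphi_i^T P_{i+1} \varphi_i = \frac{\varphi_i^T P_i \varphi_i}{a_i^{-1} + \varphi_i^T P_i \varphi_i} \tag{27}$$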
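Both identities used at this point of the proof follow from the rank-one update of $P$ and can be verified numerically: the determinant relation $\varphi^T P \varphi = a^{-1} (|P| - |P_{+}|) / |P_{+}|$ and the relation $a \varphi^T P_{+} \varphi = \varphi^T P \varphi / (a^{-1} + \varphi^T P \varphi)$, where $P_{+}$ is the updated matrix. The sketch below is a check on randomly generated data; the function name `check_identities`, the matrix size, and the weight value are assumptions for illustration.

```python
import numpy as np

def check_identities(dim=4, a=0.3, seed=0):
    """Return (lhs, rhs) pairs for the determinant identity and the
    updated-quadratic-form identity of the rank-one covariance update."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(dim, dim))
    P = G @ G.T + np.eye(dim)                  # arbitrary positive definite P
    phi = rng.normal(size=dim)
    quad = float(phi @ P @ phi)
    denom = 1.0 / a + quad
    P_next = P - np.outer(P @ phi, P @ phi) / denom
    det_p, det_next = np.linalg.det(P), np.linalg.det(P_next)
    det_pair = (quad, (det_p - det_next) / (a * det_next))
    quad_pair = (a * float(phi @ P_next @ phi), quad / denom)
    return det_pair, quad_pair
```

Each returned pair holds the left and right sides of one identity, which should agree up to floating-point error.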
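From Eqs. (26) and (27),
$$\begin{aligned} \sum_{i=1}^{t} a_i \varphi_i^T P_{i+1} \varphi_i &= \sum_{i=1}^{t} \frac{\varphi_i^T P_i \varphi_i}{a_i^{-1} + \varphi_i^T P_i \varphi_i} = \sum_{i=1}^{t} \frac{|P_i| - |P_{i+1}|}{|P_i|} \\ &= \sum_{i=1}^{t} \int_{|P_{i+1}|}^{|P_i|} \frac{dx}{|P_i|} \le \sum_{i=1}^{t} \int_{|P_{i+1}|}^{|P_i|} \frac{dx}{x} = O(\log r_t) \end{aligned} \tag{28}$$
Then by Eq. (6) and the proof of Theorem 1, $\|P_{i+1} - P_i\| \to 0$ as $i \to \infty$. Therefore,
$$\sum_{i=1}^{t} \varphi_i^T (P_i - P_{i+1}) \varphi_i = o\left( \sum_{i=1}^{t} \|\varphi_i\|^2 \right) \tag{29}$$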
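From Eqs. (28) and (29),
$$\begin{aligned} \sum_{i=1}^{t} \varphi_i^T P_i \varphi_i &= \sum_{i=1}^{t} \varphi_i^T P_{i+1} \varphi_i + \sum_{i=1}^{t} \varphi_i^T (P_i - P_{i+1}) \varphi_i \\ &= O(a_t^{-1}) \sum_{i=1}^{t} a_i \varphi_i^T P_{i+1} \varphi_i + \sum_{i=1}^{t} \varphi_i^T (P_i - P_{i+1}) \varphi_i \\ &= O(a_t^{-1} \log r_t) + o\left( \sum_{i=1}^{t} \|\varphi_i\|^2 \right) = O(a_t^{-1} \log r_t) + o(r_t). \end{aligned}$$
By the definition of $a_t^{-1}$ in Eq. (7),
$$\sum_{i=1}^{t} \varphi_i^T P_i \varphi_i = o(r_t) \tag{30}$$
Then
$$\varphi_n^T P_n \varphi_n = o(r_n) \tag{31}$$
Using Eq. (11), $\sum_{n=1}^{\infty} \dfrac{(\varphi_n^T \tilde\theta_n)^2}{r_n} < \infty$, i.e.,
$$\sum_{n=1}^{\infty} \frac{z_n^2}{r_n} < \infty \tag{32}$$
From Condition A1 and Eq. (8), it is easy to get $n = O(r_n)$. So by the Kronecker Lemma and Eq. (32), Eq. (25) is obtained immediately.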
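Remarks
From Eq. (2),
$$\begin{aligned} y_{n+1} &= \theta^T \varphi_n^0 + w_{n+1} = (\theta - \theta_n)^T \varphi_n + \theta_n^T \varphi_n + \theta^T (\varphi_n^0 - \varphi_n) + w_{n+1} \\ &= z_n + \theta_n^T \varphi_n + \theta^T (\varphi_n^0 - \varphi_n) + w_{n+1} \end{aligned} \tag{33}$$
Therefore, using Eqs. (1) and (24),
$$\begin{aligned} \beta_n A(z) z_n ={}& (\bar\beta B(z) + \lambda A(z)) u_n + \beta_n (C(z) - A(z)) w_{n+1} - \beta_n A(z) y_{n+1}^* \\ &+ (\beta_n - \bar\beta) B(z) u_n - \beta_n A(z) \theta^T (\varphi_n^0 - \varphi_n) \end{aligned} \tag{34}$$
In the same way,
$$\begin{aligned} \beta_n B(z) z_n ={}& (\bar\beta B(z) + \lambda A(z)) y_{n+1} - (\lambda C(z) + \beta_n B(z)) w_{n+1} - \beta_n B(z) y_{n+1}^* \\ &+ (\beta_n - \bar\beta) B(z) y_{n+1} - \beta_n B(z) \theta^T (\varphi_n^0 - \varphi_n) \end{aligned} \tag{35}$$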
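Considering Eq. (21), Eqs. (34) and (35) show that there exists a constant $s \in (0, 1)$ such that
$$\begin{aligned} u_n^2 ={}& O\left( \sum_{i=0}^{n} s^{n-i} z_i^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} w_i^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} [(\beta_n - \bar\beta) u_i]^2 \right) \\ &+ \sum_{k=0}^{r} (w_{n-k} - \hat w_{n-k})^2 + O(1) \quad \text{a.s.} \end{aligned} \tag{36}$$
$$\begin{aligned} y_{n+1}^2 ={}& O\left( \sum_{i=0}^{n} s^{n-i} z_i^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} w_i^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} [(\beta_n - \bar\beta) y_i]^2 \right) \\ &+ \sum_{k=0}^{r} (w_{n-k} - \hat w_{n-k})^2 + O(1) \quad \text{a.s.} \end{aligned} \tag{37}$$
From the above relations it follows that
$$\begin{aligned} \|\varphi_n\|^2 ={}& O\left( \sum_{i=0}^{n} s^{n-i} z_i^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} w_i^2 \right) \\ &+ O\left( \sum_{i=0}^{n} s^{n-i} [(\beta_n - \bar\beta) u_i]^2 \right) + O\left( \sum_{i=0}^{n} s^{n-i} [(\beta_n - \bar\beta) y_i]^2 \right) \\ &+ 2 \sum_{k=0}^{r} (w_{n-k} - \hat w_{n-k})^2 + O(1) \end{aligned} \tag{38}$$
Because $\lim_{n \to \infty} \beta_n = \bar\beta$, then
$$\sum_{n=1}^{N} \|\varphi_n\|^2 = O\left( \sum_{n=1}^{N} z_n^2 \right) + O\left( \sum_{n=1}^{N} w_n^2 \right) + O(1) \tag{39}$$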
Theorem 2 If System (1) satisfies Conditions A1 and A2 and Eq. (21), then the adaptive controller given by Eqs. (4)-(6) and (24) has the following properties:
(a)
$$\limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (u_n^2 + y_n^2) < \infty \ \text{a.s.} \tag{40}$$
(b)
$$\limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (u_n - u_n^*)^2 = C \ \text{a.s.} \tag{41}$$
Proof
(a) Using Eqs. (25) and (39),
$$\begin{aligned} r_N &= r_0 + \sum_{n=1}^{N} \|\varphi_n\|^2 = O\left( \sum_{n=1}^{N} z_n^2 \right) + O\left( \sum_{n=1}^{N} w_n^2 \right) + O(1) \\ &= o(r_N) + O(N) \quad \text{a.s.} \end{aligned} \tag{42}$$
which implies that $r_N = O(N)$ a.s.; then Eq. (40) follows.
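(b) Now consider $\beta_n \neq 0$ and $\beta_n = 0$ separately.
If $\beta_n \neq 0$, Eqs. (23) and (24) can be combined to give
$$\lambda \beta^{-1} (u_n - u_n^*) + \lambda (\beta_n^{-1} - \beta^{-1}) u_n = z_n + \theta^T (\varphi_n^0 - \varphi_n) \tag{43}$$
Then it is easy to obtain
$$\begin{aligned} (u_n - u_n^*)^2 &= \left( \beta \lambda^{-1} \left( z_n + \theta^T (\varphi_n^0 - \varphi_n) - \lambda (\beta_n^{-1} - \beta^{-1}) u_n \right) \right)^2 \\ &\le 3 \beta^2 \lambda^{-2} \left( z_n^2 + (\theta^T (\varphi_n^0 - \varphi_n))^2 + \lambda^2 (\beta_n^{-1} - \beta^{-1})^2 u_n^2 \right) \end{aligned} \tag{44}$$
If $\beta_n = 0$, then $u_n = 0$ by Eq. (24), so
$$(u_n - u_n^*)^2 = \beta^2 \lambda^{-2} (z_n + \theta^T (\varphi_n^0 - \varphi_n))^2.$$
Let $\varepsilon_n = \beta_n - \bar\beta$ and $M = (\bar\beta - \beta)^2$, and make use of $\lim_{n \to \infty} \beta_n = \bar\beta$, so $\lim_{n \to \infty} \varepsilon_n = 0$.
By making use of Lemma 2 and the Kronecker Lemma,
$$\begin{aligned} \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (u_n - u_n^*)^2 \le{}& 3 \beta^2 \lambda^{-2} \Bigg( \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} z_n^2 + \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (\theta^T (\varphi_n^0 - \varphi_n))^2 \\ &+ \lambda^2 \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (\beta_n^{-1} - \beta^{-1})^2 u_n^2 \Bigg) \\ ={}& O\left( \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (\beta_n^{-1} - \beta^{-1})^2 u_n^2 \right) \\ ={}& O\left( \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (\beta_n - \beta)^2 u_n^2 \right) \\ ={}& O\left( \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (\varepsilon_n^2 + M) u_n^2 \right) \\ ={}& O\left( \limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} \varepsilon_n^2 u_n^2 \right) + O(1) = C. \end{aligned}$$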
Remarks
If $\bar\beta = \beta$, then $M = 0$, so
$$\limsup_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} (u_n - u_n^*)^2 = 0,$$
which means the adaptive controller converges to the one-step-ahead optimal controller.
3 Simulation
A simulation was conducted to illustrate the behavior of the WLS estimation.
Example The output error model is given as
$$y_n = \frac{z (\beta + b_1 z)}{1 + a_1 z + a_2 z^2} u_n + \frac{1 + c_1 z + c_2 z^2}{1 + a_1 z + a_2 z^2} w_n \tag{45}$$
The real system parameters are $a_1 = 0.3$ and $a_2 = 0.4$; the remaining parameters $b_1$, $c_1$, $c_2$, and $\beta$ are 0.1, 1, 0.2, and 0.5.
The input signal $\{u_n\}$ was generated by a square-wave generator, and $\{w_n\}$ was an approximate white noise with variance 1. The parameter estimation process with the WLS algorithm over 1000 steps is shown in Fig. 1.
Fig. 1 Identification example
The tracking error between $y_n^*$ and $y_n$ is shown in Fig. 2, where $\{y_n^*\}$ is a reference sequence and $\{y_n\}$ is the output sequence under the adaptive control $\{u_n\}$ defined by Eq. (24).
Fig. 2 WLS tracking error
Consider the system $y_{n+1} = \theta^T \varphi_n + w_{n+1}$, where $\varphi_n = (y_n, y_{n-1}, u_n, u_{n-1}, w_n, w_{n-1})^T$, the noise variance is 0.01, and the $\lambda$ in Eq. (24) is also 0.01.
A common adaptive controller based on the LS algorithm will have a tracking error as shown in Fig. 3.
Fig. 3 LS tracking error
The results in Fig. 2 and Fig. 3 differ for the two algorithms. For $N = 1000$, taking $\sum_{n=0}^{N} (y_n^* - y_n)^2$ as the tracking error, the tracking error of Fig. 2 is 19.0873, while the tracking error of Fig. 3 is 18.5810.
At the same time, the energy of the adaptive controller, defined as $\sum_{n=0}^{N} u_n^2$, differs greatly. The adaptive controller based on the WLS algorithm has $\sum_{n=0}^{N} u_n^2 = 367.3023$, but the common adaptive controller based on the LS algorithm has $\sum_{n=0}^{N} u_n^2 = 4.4024 \times 10^3$.
Thus, the adaptive controller based on the WLS algorithm has good tracking ability, and for nearly the same tracking error, it consumes much less energy than the common adaptive controller based on the LS algorithm.
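The two figures of merit used in this comparison, the summed squared tracking error and the control energy, are simple to compute from recorded sequences. The following Python sketch shows one way to do so; the function names and the short arrays in the usage note are hypothetical and are not the simulation data reported above.

```python
import numpy as np

def tracking_error(y_ref, y):
    """Sum over n of (y_ref[n] - y[n])^2 for two equally long sequences."""
    y_ref = np.asarray(y_ref, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y_ref - y) ** 2))

def control_energy(u):
    """Sum over n of u[n]^2."""
    u = np.asarray(u, dtype=float)
    return float(np.sum(u ** 2))
```

For instance, `tracking_error([1, 2, 3], [1, 1, 1])` evaluates to 5.0 and `control_energy([1, -2, 2])` evaluates to 9.0.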
4 Conclusions
A one-step-ahead stochastic control system was analyzed with parameter identification based on the WLS algorithm. The results show that the WLS algorithm has a self-convergence property, from which several good properties of the adaptive controller follow. The analysis shows that the closed-loop system is globally stable without any PE conditions and that the adaptive controller converges to the one-step-ahead optimal controller in some sense. In the simulation, the WLS-based adaptive controller achieved a tracking error comparable to that of the common LS-based adaptive controller while consuming much less control energy.
References
[1] Åström K J, Wittenmark B. Adaptive Control. MA: Addison-Wesley, 1995.
[2] Hijab O B. The adaptive LQG problem (Part I). IEEE Transactions on Automatic Control, 1983, 28(2): 171-178.
[3] Kumar P R. Optimal adaptive control of linear quadratic Gaussian systems. SIAM Journal on Control and Optimization, 1983, 21(2): 163-178.
[4] Caines P E, Chen H F. Optimal adaptive LQG control for systems with finite state process parameters. IEEE Transactions on Automatic Control, 1985, 30(2): 185-189.
[5] Johnson C R, Tse E. Adaptive implementation of one-step-ahead optimal control via input matching. IEEE Transactions on Automatic Control, 1978, 23(5): 856-872.
[6] Goodwin G C, Johnson C R, Sin K S. Global convergence for adaptive one-step-ahead optimal controllers based on input matching. IEEE Transactions on Automatic Control, 1981, 26(6): 1269-1273.
[7] Guo L. Self-convergence of weighted least-squares with applications to stochastic adaptive control. IEEE Transactions on Automatic Control, 1996, 41(1): 79-89.
[8] Luo Guiming. Optimal adaptive controllers based on LS algorithms. Acta Automatica Sinica, 1996, 8(1): 73-79.
[9] Lo Kueiming, Kimura Hidenori. Optimal adaptive controller for systems with delay. International Journal of Adaptive Control and Signal Processing, 2004, 18(9&10): 799-819.
[10] Lo Kueiming, Zhang D C. Stochastic adaptive one-step-ahead optimal controllers based on input matching. IEEE Transactions on Automatic Control, 2000, 45(5): 980-983.
