Fig. 4. Decomposition of a multistage problem. [The figure shows the information sets η_L(t): x(0), …, x(t), u_L(0), …, u_L(t−1) and η_F(t): x(0), …, x(t), u_F(0), …, u_F(t−1).]
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-26, NO. 2, APRIL 1981
i.e., each player has perfect memory of the state (x) and his own control history. The point here is that given x(t), u_L(t−1), x(t−1), we can in general calculate u_F(t−1) or vice versa [4]. As shown in Fig. 4, the leader at time t can choose his decision u_L(t) based on u_F(t−1) (or the entire past decisions of the follower); thus L essentially imposes a kind of reversed information structure on F. Note that whoever gets to declare his strategy first becomes the leader. This approach requires separate treatments at t = T−1 and 0, by solving u_L(T−1) first, which has a permanently optimal solution, and by considering u_F(−1) to be fixed at zero, as is evident in the works of Basar and Tolwinski. Also, bear in mind that the distinction between closed-loop Stackelberg controls and Stackelberg feedback strategies [1] still exists. With this understanding, the closed-loop Stackelberg strategy for the linear quadratic deterministic problem can be solved using the basic idea discussed in Section III. It is clear that many strategies γ_L are possible due to the enormous flexibility here. For example, L's strategy may punish a nonrational behavior of F for one stage only (as in [4]), two stages, etc., or for the rest of the game (as in [2]). It is thus possible that different γ_L may enjoy various advantages.

VIII. CONCLUSION

In this paper we have identified two reasons why L may be interested in the decision of F. First, knowing F's strategy and his decision may enable L to infer the states of nature [R1)]. Second, F's decision may directly affect L's payoff [R2)]. We then discussed mechanisms by which L can induce F to behave cooperatively. In case R1) the mechanism is to transform F's payoff function so that it looks like L's own. In case R2) it is to directly make any choice of F's strategy other than the cooperative one unpalatable. In either case the crucial requirement is that we have the reversed information structure as defined in Section III. It serves as a unifying ingredient in diverse applications.

ACKNOWLEDGMENT

The authors would like to thank Dr. G. J. Olsder for many insightful comments on this subject.

A New Computational Method for Stackelberg and Min-Max Problems by Use of a Penalty Method

KIYOTAKA SHIMIZU AND EITARO AIYOSHI

Abstract—This paper is concerned with the Stackelberg problem and the min-max problem in competitive systems. The Stackelberg approach is applied to the optimization of two-level systems where the higher level determines the optimal value of its decision variables (parameters for the lower level) so as to minimize its objective, while the lower level minimizes its own objective with respect to the lower level decision variables under the given parameters. Meanwhile, the min-max problem is to determine a min-max solution such that a function maximized with respect to the maximizer's variables is minimized with respect to the minimizer's variables. This problem is also characterized by a parametric approach in a two-level scheme. New computational methods are proposed here; that is, a series of nonlinear programming problems approximating the original two-level problem by application of a penalty method to a constrained parametric problem in the lower level are solved iteratively. It is proved that a sequence of approximated solutions converges to the correct Stackelberg solution, or the min-max solution. Some numerical examples are presented to illustrate the algorithms.

I. INTRODUCTION
The Stackelberg and the min-max solutions differ from them in that Player 2 makes his decision after Player 1 does, i.e., a precedence in decision-making order exists. Therefore, both the Stackelberg and the min-max problems can be represented by hierarchical optimization problems, where a part of the constraints in the upper level problem consists of a parameterized optimization problem in the lower level. Problems of this type can be solved by a parametric approach in principle. However, that approach also causes difficulties in developing an available computational method. Some computational methods for the min-max problem were proposed in [4], [6], [9], [11], where a gradient type algorithm was discussed in [4], [6] and a relaxation method in [9], [11]. In [15], a penalty method was applied to calculate a saddle-point solution. However, papers on the Stackelberg and the min-max solutions are comparatively few, most of them limited to separate constraints cases where the constraints for each player do not depend on another player's strategies.

In this paper, we consider the most general problems with unseparate constraints, and propose new computational methods based on the theory of a barrier method (an interior penalty method [1], [5]). Our method is based on the fact that a series of nonlinear programming problems approximating the original hierarchical one are solved iteratively by applying the barrier method to the lower level problem. It is proved that a sequence of approximate solutions converges to the true solution for the original problem.
II. FORMULATION OF THE STACKELBERG PROBLEM

The Stackelberg problem is formulated as the two-level optimization problem

  min_x f_1(x, ŷ(x))  (3a)
  subject to x ∈ X  (3b)
  g_1(x, ŷ(x)) ≤ 0  (3c)

where ŷ(x) is an optimal solution of the parameterized lower level problem

  f_2(x, ŷ(x)) = min_y f_2(x, y)  (3d)
  subject to y ∈ Y  (3e)
  g_2(x, y) ≤ 0,  (3f)

that is,

  f_2(x, ŷ(x)) ≤ f_2(x, y)  for ∀y ∈ Y satisfying g_2(x, y) ≤ 0.  (2)

A point x* solving the problem (3) is a Stackelberg solution; that is, x* satisfies (3b) and (3c), and

  f_1(x*, ŷ(x*)) ≤ f_1(x, ŷ(x))  for ∀x ∈ X satisfying g_1(x, ŷ(x)) ≤ 0.  (4)

Our approach to the problem (3) begins with replacing the lower level problem by an unconstrained problem based on the theory of a barrier method. Then, the parametric minimization problem (3d)–(3f) is transformed into an unconstrained parametric minimization problem with the augmented objective function P^r combined with the constraint functions. The parametric solution ŷ^r(x), which is regarded as a function of x, can be approximated by an implicit function satisfying a stationarity condition for the resulting unconstrained lower level problem. A sequence of approximated problems will be solved by use of appropriate nonlinear programming techniques.

Let the feasible region Y be given with the vector function h_2 ∈ R^{q_2} as Y = {y | h_2(y) ≤ 0}, and let

  S(x) = {y | h_2(y) ≤ 0, g_2(x, y) ≤ 0}.

We begin with imposing the following assumption:
a) int S(x), the interior of S(x), which is given by

  int S(x) = {y | h_2(y) < 0, g_2(x, y) < 0},

is not empty for any fixed x, and its closure becomes S(x).

Let us define the augmented objective function on X × int S(x), where x is any point in R^{n_1}, as

  P^r(x, y) = f_2(x, y) + r Ω(g_2(x, y), h_2(y)),

where Ω is such a continuous function that

  Ω(g_2(x, y), h_2(y)) ≥ 0  as y ∈ int S(x),
  Ω(g_2(x, y), h_2(y)) → +∞  as y → ∂S(x).

Then, for a fixed parameter r > 0, the lower level problem is approximated by the problem

  min_y P^r(x, y)  subject to y ∈ int S(x)  (5)

in place of the original constrained problem

  min_y f_2(x, y)  subject to y ∈ S(x).  (6)
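One concrete realization of the augmented objective in (5) can be sketched as follows. The inverse barrier used for Ω below is our own choice; the formulation only requires Ω ≥ 0 on int S(x) and Ω → +∞ toward the boundary. The lower-level data are those of Example 1 in Section V.

```python
# One concrete realization of the augmented objective in (5). The inverse
# barrier used for Omega is our own choice; the formulation only requires
# Omega >= 0 on int S(x) and Omega -> +infinity toward the boundary.

def P(r, x, y, f2, constraints):
    """Barrier-augmented lower-level objective; each g satisfies g(x,y) < 0 on int S(x)."""
    omega = 0.0
    for g in constraints:
        v = g(x, y)
        if v >= 0.0:
            return float("inf")      # y lies outside int S(x)
        omega += 1.0 / (-v)          # nonnegative, blows up as g -> 0-
    return f2(x, y) + r * omega

# Lower-level data of Example 1 in Section V:
f2 = lambda x, y: (x + 2.0 * y - 30.0) ** 2
g2 = [lambda x, y: x + y - 20.0,     # x + y - 20 <= 0
      lambda x, y: -y,               # 0 <= y
      lambda x, y: y - 20.0]         # y <= 20
```

For r = 0 the augmented objective reduces to f_2; for r > 0 it exceeds f_2 on the interior and is +∞ outside it.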
In order to apply the theory of a barrier method, we impose the following assumptions.
b) The functions f_2(x, y) and g_2(x, y) are continuous at any (x, y) ∈ R^{n_1} × R^{n_2}, and h_2(y) is continuous at any y ∈ R^{n_2}.
c) The set Y is compact. This implies that S(x) is also compact.

Then, it is assured that the problems (5) and (6) have optimal solutions ŷ^r(x) ∈ int S(x) and ŷ(x) ∈ S(x), respectively, for any fixed x.

Let us consider a sequence of the optimal solutions {ŷ^{r_k}(x)} for the problem (5) in response to a positive parameter sequence {r_k} strictly decreasing to zero. When the parameter x is fixed, it follows directly from the theory of a barrier method (an interior penalty function method [1], [5]) that any accumulation point of the sequence {ŷ^{r_k}(x)} is optimal for the problem (6). On the other hand, when the parameter x varies as a sequence {x^k}, the convergence of {ŷ^{r_k}(x^k)} is left unsettled. So, we prepare the following lemma.

Lemma 1: Let {ŷ^{r_k}(x^k)} be a sequence of the optimal solutions to the problem (5) in response to a sequence {x^k} ⊂ X converging to x̄ and a positive sequence {r_k} strictly decreasing to zero. If assumptions a)–c)
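The behavior of the sequence {ŷ^{r_k}(x)} for fixed x can be seen on a one-dimensional sketch (the problem and the barrier term below are our own illustration, not the paper's): minimize y² subject to 1 − y ≤ 0; the barrier minimizer tends to the constrained minimizer y = 1 as r decreases to zero.

```python
# A one-dimensional illustration of the sequence {yhat^{r_k}(x)} (problem
# and barrier term are our own): minimize y^2 subject to 1 - y <= 0.
# The barrier minimizer approaches the constrained minimizer y = 1 as r -> 0.

def argmin_ternary(f, lo, hi, iters=300):
    # ternary search for the minimizer of a unimodal function on [lo, hi]
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def y_r(r):
    # minimizer of P^r(y) = y^2 + r/(y - 1) on the interior (1, 5]
    return argmin_ternary(lambda y: y * y + r / (y - 1.0), 1.0 + 1e-9, 5.0)

approx = [y_r(r) for r in (1.0, 0.1, 0.01, 0.001)]   # decreasing toward 1
```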
  ∇f̂_1(x) = ∇_x f_1(x, y) − ∇_y f_1(x, y)[∇²_{yy} P^r(x, y)]^{−1} ∇²_{yx} P^r(x, y)  (25)

  ∇ĝ_1(x) = ∇_x g_1(x, y) − ∇_y g_1(x, y)[∇²_{yy} P^r(x, y)]^{−1} ∇²_{yx} P^r(x, y).  (26)

Then, the problem (22) is equivalent to the following problem under the relation (23):

  min_x f_1(x, ŷ^r(x))  (27a)
  subject to h_1(x) ≤ 0  (27b)
  g_1(x, ŷ^r(x)) ≤ 0.  (27c)
From now on, let us try to solve the problem (27) instead of the problem (12). Note, however, that it is impossible to represent the function ŷ^r(x) in an explicit form. This matter causes difficulties in directly applying nonlinear programming to the problem (27). In most iterative methods of NLP, however, the data needed for the computation are the values of the gradients of the objective and the constraint functions, as well as their function values at the current point. Let x' be the current point. Then, in our case, the value y' = ŷ^r(x') coincides with the solution of the problem (5) with x = x', which can be solved easily. By use of this value y', we can evaluate ∇f̂_1(x') and ∇ĝ_1(x') by (25) and (26), along with

  f̂_1(x') = f_1(x', y'),  ĝ_1(x') = g_1(x', y').

Only after the preparation of those data can we apply existing nonlinear programming, such as Zoutendijk's feasible direction method [16], to the problem (27).
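The implicit-gradient formula (25) can be checked numerically on a smooth, hypothetical stand-in Q for P^r, chosen so that the lower-level stationary point ŷ(x) is available in closed form; everything in the sketch below is our illustrative choice, not the paper's data.

```python
# Numerical check of the implicit-gradient formula (25) on a smooth,
# hypothetical stand-in Q for P^r, chosen so the lower-level stationary
# point is available in closed form:
#   f1(x, y) = x^2 + (y - 10)^2,  Q(x, y) = (x + 2y - 30)^2 + y^2,
#   dQ/dy = 0  =>  yhat(x) = 12 - 0.4 x.

def yhat(x):
    return 12.0 - 0.4 * x

def grad_via_25(x):
    # (25): grad f1hat(x) = f1_x - f1_y [Q_yy]^{-1} Q_yx
    y = yhat(x)
    f1_x = 2.0 * x
    f1_y = 2.0 * (y - 10.0)
    Q_yy = 10.0                      # d2Q/dy2
    Q_yx = 4.0                       # d2Q/dydx
    return f1_x - f1_y * (1.0 / Q_yy) * Q_yx

def grad_finite_diff(x, h=1e-6):
    # central difference of f1hat(x) = f1(x, yhat(x))
    f1hat = lambda t: t * t + (yhat(t) - 10.0) ** 2
    return (f1hat(x + h) - f1hat(x - h)) / (2.0 * h)
```

The two gradients agree to finite-difference accuracy, which is exactly the property the iterative NLP methods above rely on.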
IV. AN APPLICATION TO THE MIN-MAX PROBLEM

The min-max problem

  min_x max_{y ∈ Y} f(x, y)

can be represented as the two-level problem

  min_x f(x, ŷ(x))  (28a)
  subject to x ∈ X  (28b)
  g_1(x, ŷ(x)) ≤ 0  (28c)

where ŷ(x) satisfies

  f(x, ŷ(x)) = max_y f(x, y)  (28d)
  subject to y ∈ Y  (28e)
  g_2(x, y) ≤ 0.  (28f)

Applying the barrier method, the lower level maximization is replaced by the augmented problem

  max_y P^r(x, y)  subject to y ∈ int S(x),  (29)

and we obtain the approximated problem

  min_x f(x, ŷ^r(x))  (30a)
  subject to x ∈ X  (30b)
  g_1(x, ŷ^r(x)) ≤ 0  (30c)

where ŷ^r(x) satisfies

  P^r(x, ŷ^r(x)) = max_y P^r(x, y).  (30d)
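The two-level representation (28) can be sketched by brute force (the payoff, the coupling constraint, and the grids below are hypothetical): the distinctive feature is that the follower's feasible set depends on the leader's decision through g_2(x, y) ≤ 0.

```python
# A brute-force sketch of the two-level representation (28) (payoff,
# coupling constraint, and grids are hypothetical): the follower's
# feasible set depends on the leader's decision through g2(x, y) <= 0.

def inner_max(x, ys):
    f = lambda x, y: (x - y) ** 2
    feas = [y for y in ys if y - x - 0.5 <= 1e-12]   # g2(x, y) = y - x - 0.5
    return max(f(x, y) for y in feas)

def outer_min(xs, ys):
    return min(inner_max(x, ys) for x in xs)

xs = [i * 0.05 for i in range(21)]   # x in [0, 1]
ys = [i * 0.05 for i in range(21)]   # y in [0, 1]
value = outer_min(xs, ys)            # the min-max value
```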
ii) lim_{k→∞} P^{r_k}(x^k, ŷ^{r_k}(x^k)) = f(x̄, ŷ(x̄)).

Proof:
i) Setting f̂ = −f in Corollary 1 in Section III, we have this result immediately.
ii) Suppose that P^{r_k}(x^k, ŷ^{r_k}(x^k)) does not converge to f(x̄, ŷ(x̄)). Then there exist a positive number ε and a positive integer K_1 such that

  |P^{r_k}(x^k, ŷ^{r_k}(x^k)) − f(x̄, ŷ(x̄))| ≥ 2ε  for ∀k > K_1.  (31)

Here, consider an open ball B(ŷ(x̄); δ) ⊂ R^{n_2} around ŷ(x̄); then the continuity of f at ŷ(x̄) implies the existence of a positive δ such that

  |f(x̄, y) − f(x̄, ŷ(x̄))| < ε  for ∀y ∈ B(ŷ(x̄); δ).  (32)

Take y^k ∈ S(x^k) for ∀k > K_2 such that y^k → y°. Since y° ∈ int S(x̄), we have the existence of a positive integer K_3 such that
  |P^{r_k}(x^k, y°) − f(x̄, y°)| < ε  for ∀k > K_3.  (33)

Combining these relations, for all sufficiently large k we obtain

  P^{r_k}(x^k, y^k) > f(x̄, y°) − ε > f(x̄, ŷ(x̄)) − 2ε ≥ P^{r_k}(x^k, ŷ^{r_k}(x^k)),  (34)

a contradiction.

Next, suppose that there exists a point x° satisfying (30b) and

  g_1(x°, ŷ(x°)) ≤ 0

such that

  f(x°, ŷ(x°)) < f(x̄, ŷ(x̄)),

and let ε > 0 be defined by

  f(x̄, ŷ(x̄)) − f(x°, ŷ(x°)) = 2ε.  (35)

Then there exist positive integers K° and K̄ such that

  |P^{r_k}(x°, ŷ^{r_k}(x°)) − f(x°, ŷ(x°))| < ε  for ∀k > K°  (36)

  |P^{r_k}(x^k, ŷ^{r_k}(x^k)) − f(x̄, ŷ(x̄))| < ε  for ∀k > K̄.  (37)

Set K = max(K°, K̄). Then, using (36), (35), and (37) in turn, we have the following relations for all k > K:

  P^{r_k}(x°, ŷ^{r_k}(x°)) < f(x°, ŷ(x°)) + ε = f(x̄, ŷ(x̄)) − ε < P^{r_k}(x^k, ŷ^{r_k}(x^k)).

Since we find that (x°, ŷ^{r_k}(x°)) satisfies (30c) in the similar manner to the proof in Theorem 1, this relation contradicts the fact that {x^k} solves the problem (30).

The computational aspect for the problem (30) is similar to that for the problem (12) in Section III, so we omit the computational consideration for the min-max problem.

V. SOME NUMERICAL EXAMPLES

In order to illustrate the convergence properties, we shall give some examples for the Stackelberg problems.

Example 1: As the Stackelberg problem of the unseparate type, we present the following simple problem:

  min_x x² + (ŷ(x) − 10)²
  subject to −x + ŷ(x) ≤ 0
  0 ≤ x ≤ 15

where ŷ(x) solves

  (x + 2ŷ(x) − 30)² = min_y (x + 2y − 30)²
  subject to x + y − 20 ≤ 0
  0 ≤ y ≤ 20.

Table I illustrates the convergence property of the approximated solutions as r_k decreases.

TABLE I
  r_k                                   200     50      10      1       0.1     0.01
  x^{r_k}                               8.430   8.992   9.414   9.730   9.873   9.944
  ŷ^{r_k}(x^{r_k})                      8.425   8.988   9.413   9.729   9.872   9.943
  f_1(x^{r_k}, ŷ^{r_k}(x^{r_k}))        73.55   81.88   88.96   94.74   97.49   98.88
  P^{r_k}(x^{r_k}, ŷ^{r_k}(x^{r_k}))    126.9   44.05   13.63   2.107   0.5589  0.1194

Example 2: As the Stackelberg problem of the separate type, we present the following problem:

  min_x f_1(x, ŷ(x))
  subject to x_1 + 2x_2 ≥ 30
  x_1 + x_2 ≤ 25
  x_2 ≤ 15

where ŷ(x) solves

  (x_1 − ŷ_1(x))² + (x_2 − ŷ_2(x))² = min_y (x_1 − y_1)² + (x_2 − y_2)²
  subject to 0 ≤ y_2 ≤ 10.

Our method exhibits its power, as there exists no other appropriate manner to calculate the Stackelberg solution even for the separate type. Table II illustrates the convergence property in this case, too.
TABLE II
  r_k                                   500     200     50      10      …
  x_1^{r_k}                             18.44   19.03   20.00   20.00   20.00
  x_2^{r_k}                             6.563   5.968   5.000   5.000   5.000
  ŷ_1^{r_k}(x^{r_k})                    6.339   7.288   8.553   9.326   9.775
  ŷ_2^{r_k}(x^{r_k})                    5.173   5.228   4.976   4.821   4.959
  f_1(x^{r_k}, ŷ^{r_k}(x^{r_k}))        291.0   276.0   253.4   234.9   228.7
  P^{r_k}(x^{r_k}, ŷ^{r_k}(x^{r_k}))    564.0   319.8   191.4   133.9   109.5

  True values: x = (20.00, 5.000), ŷ(x̄) = (10.000, 5.000), f_1(x̄, ŷ(x̄)) = 225.0, f_2(x̄, ŷ(x̄)) = 100.0.
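Example 1 is small enough to be checked directly by grid search, independently of the barrier method (the grids and tolerances below are our own choices); the result agrees with the limit suggested by Table I.

```python
# A direct grid-search check of Example 1, independent of the barrier
# method (grids and tolerances are our own choices): for each leader
# decision x the follower's problem is solved exactly on the grid, and
# the leader minimizes subject to g1(x, yhat(x)) = -x + yhat(x) <= 0.

def follower(x, ys):
    feas = [y for y in ys if x + y - 20.0 <= 1e-9]
    return min(feas, key=lambda y: (x + 2.0 * y - 30.0) ** 2)

def leader(xs, ys):
    cand = [x for x in xs if -x + follower(x, ys) <= 1e-9]
    x_star = min(cand, key=lambda x: x * x + (follower(x, ys) - 10.0) ** 2)
    return x_star, follower(x_star, ys)

xs = [i * 0.5 for i in range(31)]    # 0 <= x <= 15
ys = [i * 0.5 for i in range(41)]    # 0 <= y <= 20
x_star, y_star = leader(xs, ys)      # Stackelberg solution on the grid
```

The grid search returns x = 10, ŷ(x) = 10 with f_1 = 100, the limit toward which the barrier iterates in Table I converge.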
VI. CONCLUSION

In this paper we have presented two solution methods for the most general types of the Stackelberg and the min-max problems, both of which are formulated as two-level optimization problems to be solved in principle by a parametric approach. Our methods apply the barrier method to transform the two-level problems into ordinary nonlinear programming problems. Some numerical examples were given to illustrate the proposed algorithms. In view of the fact that there are scarcely any available solution methods so far, we are convinced of the significance of our methods. Nevertheless, there are also troubles in practical computations. For instance, when r is set sufficiently small, the rapid increase or decrease of the function P^r(x, y) in the vicinity of the boundary of the feasible region makes it difficult to obtain an accurate solution to the problem (5) or (29), and the increased sensitivity of P^r(x, y) introduces roundoff errors in the calculation of the gradient of the implicit function. Those troubles have a serious influence on the computational results. It is hoped, however, that we will overcome the above stated troubles, which originate in disadvantages peculiar to the barrier method, with various existing techniques for improvement of the barrier method (SUMT).

ACKNOWLEDGMENT

The authors wish to thank the reviewers for their many constructive comments.
REFERENCES

[1] M. Avriel, Nonlinear Programming. Englewood Cliffs, NJ: Prentice-Hall, 1976.
[2] J. Bram, "The Lagrange multiplier theorem for max-min with several constraints," SIAM J. Appl. Math., vol. 14, no. 4, pp. 665-667, 1966.
[3] J. M. Danskin, The Theory of Max-Min and Its Application to Weapons Allocation Problems. Berlin: Springer-Verlag, 1967.
[4] V. F. Dem'yanov, "Algorithms for some minimax problems," J. Comput. Syst. Sci., vol. 2, no. 4, pp. 342-380, 1968.
[5] A. V. Fiacco and G. P. McCormick, "The sequential unconstrained minimization technique for nonlinear programming," Management Sci., vol. 10, pp. 601-617, 1964.
I. INTRODUCTION

We consider the control of the one-dimensional system described by the Itô equation

  dx_t = m(x_t, u_t) dt + σ(x_t, u_t) dw_t,  t ≥ 0  (1.1)

S = {x_1 < x_2 < ⋯ < x_n} is a finite set of states called the switching set (boundary set in [1]), and Ψ is any specified family of functions ψ: x ↦ ψ(x). A particular policy φ in Φ = Φ(S, Ψ) is obtained by assigning to each x_i ∈ S a function ψ_i ∈ Ψ. Then φ is simply the family of all assignments φ = (ψ_1, …, ψ_n). To simplify notation such φ is usually denoted by (ψ_1, …, ψ_n).

In the following we suppose that a family Φ = Φ(S, Ψ) is given in advance.

Fix a φ = (ψ_1, …, ψ_n) in Φ. The control action (u_t, t ≥ 0) indicated by φ and the resulting state (x_t, t ≥ 0) can now be described. Note that (u_t) and (x_t) are stochastic processes which depend on φ.

For simplicity suppose that the initial state is a switching state, say x_0 = x_i a.s. Then the control action is given by the feedback policy u_t = ψ_i(x_t) for 0 ≤ t < T_1, where T_1 is the first time that x_t reaches another switching state, say x_j ≠ x_i. The control action is now given by u_t = ψ_j(x_t) for T_1 ≤ t < T_2, where T_2 is the first time after T_1 that x_t reaches a new switching state x_l ≠ x_j, when the feedback policy is switched to u_t = ψ_l(x_t), and so on. In brief, as soon as x_t encounters a switching state the control is switched to the corresponding feedback policy. More precisely, define the random variables

  T_0 = 0  a.s.
  T_n = inf{t > T_{n−1} | x_t ∈ S, x_t ≠ x_{T_{n−1}}},  n ≥ 1.  (2.1)
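The switching mechanism just described can be sketched with a minimal Euler–Maruyama simulation. The drift m, diffusion σ, switching set S, and policies ψ_i below are all hypothetical choices for illustration, and on a discrete time grid the hitting of a switching state can only be detected up to a tolerance.

```python
import math
import random

# A minimal Euler-Maruyama sketch of the switching mechanism described
# above. Drift m, diffusion sigma, the switching set S, and the policies
# psi are all hypothetical choices; on the discrete time grid, hitting a
# switching state is detected only up to a tolerance.

def simulate(x0=0.0, T=1.0, dt=1e-3, seed=0):
    rng = random.Random(seed)
    S = [0.0, 1.0]                       # switching set
    psi = {0.0: lambda x: 1.0,           # feedback law tied to state 0.0
           1.0: lambda x: -1.0}          # feedback law tied to state 1.0
    m = lambda x, u: u                   # drift
    sigma = lambda x, u: 0.2             # diffusion
    x, active, t = x0, psi[0.0], 0.0     # start at a switching state
    while t < T:
        u = active(x)
        x += m(x, u) * dt + sigma(x, u) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        for s in S:                      # switch policy on (approximately)
            if abs(x - s) < 1e-2:        # reaching a switching state
                active = psi[s]
        t += dt
    return x
```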