3.1
Dynamic programming
Dynamic programming is a robust approach to solving optimal control problems. The method originated with R. Bellman in the early 1950s. Its basic idea is to consider a family of optimal control problems with different initial times and states, and to establish relationships among these problems via the so-called Hamilton-Jacobi-Bellman equation (HJB, for short). If the HJB equation is solvable (either analytically or numerically), then one can obtain an optimal feedback control by taking the maximizer/minimizer involved in the HJB equation (the so-called verification technique).
For illustration, let us first look at the deterministic case.
3.1.1
Deterministic control
Consider the controlled ordinary differential equation
$$ \dot x(t) = b(t, x(t), u(t)), \quad t \in [0, T], \qquad x(0) = x_0, \tag{3.1} $$
where the control $u(\cdot)$ belongs to $\mathcal{V}[0,T] = \{u(\cdot) : u(\cdot) \text{ is measurable in } [0,T] \text{ and takes values in a given control set } U\}$, and the cost functional
$$ J(u(\cdot)) = \int_0^T f(t, x(t), u(t))\,dt + h(x(T)), \tag{3.2} $$
for some given maps b, f and h. Given certain regularity conditions, the state equation (3.1) admits a unique solution x(·) ∈ C([0, T]; R) and (3.2) is well-defined. The optimal control problem is stated as follows:
Minimize (3.2) over V[0, T ].
Let (s, y) ∈ [0, T) × R, and consider the following control system over [s, T]:
$$ \dot x(t) = b(t, x(t), u(t)), \quad t \in [s, T], \qquad x(s) = y. \tag{3.3} $$
Here, the control u(·) ∈ V[s, T] = {u(·) : u(·) is measurable in [s, T]}. The cost functional is the following:
$$ J(s, y; u(\cdot)) = \int_s^T f(t, x(t), u(t))\,dt + h(x(T)). $$
Define the value function by
$$ V(s, y) = \inf_{u(\cdot) \in \mathcal{V}[s,T]} J(s, y; u(\cdot)), \quad \text{for any } (s, y) \in [0, T) \times R, \qquad V(T, y) = h(y). $$
For any (s, y) ∈ [0, T) × R and any ŝ ∈ [s, T],
$$ V(s, y) = \inf_{u(\cdot) \in \mathcal{V}[s,T]} \left\{ \int_s^{\hat s} f(t, x(t), u(t))\,dt + V(\hat s, x(\hat s)) \right\}. \tag{3.4} $$
Equation (3.4) is referred to as the dynamic programming equation. The result is known as Bellman's principle of optimality.
Proof: Let us denote the right-hand side of the above equation by $\bar V(s, y)$. By definition, for any u(·) ∈ V[s, T],
$$ \bar V(s, y) \le \int_s^{\hat s} f(t, x(t), u(t))\,dt + V(\hat s, x(\hat s)) \le \int_s^{\hat s} f(t, x(t), u(t))\,dt + J(\hat s, x(\hat s); u(\cdot)) = J(s, y; u(\cdot)). $$
Thus, taking the infimum over V[s, T] we get $\bar V(s, y) \le V(s, y)$. Conversely, for any ε > 0, there exists a $u_\varepsilon(\cdot) \in \mathcal{V}[s, T]$ such that
$$ V(s, y) + \varepsilon \ge J(s, y; u_\varepsilon(\cdot)) = \int_s^{\hat s} f(t, x_\varepsilon(t), u_\varepsilon(t))\,dt + J(\hat s, x_\varepsilon(\hat s); u_\varepsilon(\cdot)) \ge \int_s^{\hat s} f(t, x_\varepsilon(t), u_\varepsilon(t))\,dt + V(\hat s, x_\varepsilon(\hat s)) \ge \bar V(s, y), $$
which implies $V(s, y) \ge \bar V(s, y)$ since ε > 0 is arbitrary. We then obtain the desired result.
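A discrete-time analogue makes Bellman's principle easy to test numerically. The sketch below is a toy of our own (hypothetical grid dynamics and costs, not the continuous-time setting above): it computes the value function by backward induction and then checks the principle over a two-step window, i.e. optimizing the first two controls and behaving optimally afterwards yields the same value as being optimal from the start.

```python
import itertools

# Toy deterministic control on a finite grid (all choices hypothetical):
# states x in {-5,...,5}, controls u in {-1,0,1}, dynamics x_{k+1} = x_k + u
# (clipped to the grid), running cost f(x,u) = x^2 + u^2, terminal h(x) = x^2.
X = range(-5, 6)
U = (-1, 0, 1)
T = 4  # horizon, in steps

def step(x, u):
    return max(-5, min(5, x + u))

def f(x, u):
    return x * x + u * u

def h(x):
    return x * x

# Backward induction: V[k][x] = min_u { f(x,u) + V[k+1][step(x,u)] }.
V = [dict() for _ in range(T + 1)]
V[T] = {x: h(x) for x in X}
for k in range(T - 1, -1, -1):
    for x in X:
        V[k][x] = min(f(x, u) + V[k + 1][step(x, u)] for u in U)

# Bellman's principle over two steps: brute-force the first two controls,
# then continue optimally; the result must equal V[k][x].
def two_step_value(k, x):
    best = float("inf")
    for u0, u1 in itertools.product(U, repeat=2):
        x1 = step(x, u0)
        best = min(best, f(x, u0) + f(x1, u1) + V[k + 2][step(x1, u1)])
    return best

assert all(V[k][x] == two_step_value(k, x) for k in range(T - 1) for x in X)
```

The identity holds by exactly the two-sided argument in the proof above: the two-step value is the same infimum, refined one stage at a time.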
Let us make an observation on Bellman's principle. Suppose (x̄(·), ū(·)) is an optimal pair for the initial pair (s, y). Then, for any ŝ ∈ [s, T],
$$ V(s, y) = J(s, y; \bar u(\cdot)) = \int_s^{\hat s} f(t, \bar x(t), \bar u(t))\,dt + J(\hat s, \bar x(\hat s); \bar u(\cdot)) \ge \int_s^{\hat s} f(t, \bar x(t), \bar u(t))\,dt + V(\hat s, \bar x(\hat s)) \ge V(s, y), $$
where the last inequality follows from (3.4). Hence equality holds throughout; in particular, the restriction of (x̄(·), ū(·)) to [ŝ, T] is optimal for the initial pair (ŝ, x̄(ŝ)).
Next, suppose V is smooth. Fix u ∈ U and apply (3.4) with the constant control u(t) ≡ u on [s, ŝ]:
$$ V(s, y) \le \int_s^{\hat s} f(t, x(t), u)\,dt + V(\hat s, x(\hat s)), $$
or
$$ \frac{V(\hat s, x(\hat s)) - V(s, y)}{\hat s - s} + \frac{1}{\hat s - s} \int_s^{\hat s} f(t, x(t), u)\,dt \ge 0, \quad \text{for any } u \in U. $$
Letting ŝ ↓ s, it follows that
$$ V_t + b(t, x, u) V_x + f(t, x, u) \ge 0, \quad \text{for any } u \in U, $$
which results in
$$ V_t + \inf_{u \in U} \{ b(t, x, u) V_x + f(t, x, u) \} \ge 0. $$
On the other hand, for any ε > 0 and 0 ≤ s < ŝ ≤ T with ŝ − s > 0 small enough, there exists a u(·) ≡ u^{ε,ŝ}(·) ∈ V[s, T] such that
$$ V(s, y) + \varepsilon(\hat s - s) \ge \int_s^{\hat s} f(t, x(t), u(t))\,dt + V(\hat s, x(\hat s)). $$
Consequently,
$$ \varepsilon \ge \frac{V(\hat s, x(\hat s)) - V(s, y)}{\hat s - s} + \frac{1}{\hat s - s} \int_s^{\hat s} f(t, x(t), u(t))\,dt = \frac{1}{\hat s - s} \int_s^{\hat s} \big[ V_t(t, x(t)) + b(t, x(t), u(t)) V_x(t, x(t)) + f(t, x(t), u(t)) \big]\,dt $$
$$ \ge \frac{1}{\hat s - s} \int_s^{\hat s} \Big[ V_t(t, x(t)) + \inf_{u \in U} \{ b(t, x(t), u) V_x(t, x(t)) + f(t, x(t), u) \} \Big]\,dt \to V_t(s, y) + \inf_{u \in U} \{ b(s, y, u) V_x(s, y) + f(s, y, u) \}, \quad \text{as } \hat s \downarrow s. $$
In the last limit above, we have used the uniform continuity of the functions b and f, as assumed. Since ε > 0 is arbitrary, this yields $V_t + \inf_{u \in U} \{ b(t, x, u) V_x + f(t, x, u) \} \le 0$, which is the desired reverse inequality. Combining the two inequalities, we arrive at the (first-order) HJB equation
$$ V_t + \inf_{u \in U} \{ b(t, x, u) V_x + f(t, x, u) \} = 0, \qquad V(T, x) = h(x). $$
If the infimum in the HJB equation is achieved at u = ū(t, x), we can substitute this feedback into (3.3) to get x̄(t), and set ū(t) = ū(t, x̄(t)). Then (x̄(·), ū(·)) is an optimal pair; this is the verification technique.
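The verification technique can be seen at work in a concrete linear-quadratic example of our own (not taken from the text): minimize $\int_0^T (x^2 + u^2)\,dt + x(T)^2$ subject to $\dot x = u$. The HJB equation $V_t + \min_u \{ u V_x + x^2 + u^2 \} = 0$ is minimized at $u^* = -V_x/2$, and the ansatz $V(t,x) = p(t)x^2$ reduces it to the Riccati equation $p' = p^2 - 1$, $p(T) = 1$, whose solution is $p \equiv 1$; thus $V(t,x) = x^2$ and the optimal feedback is $u^*(t,x) = -x$. A minimal numerical sketch:

```python
# Illustrative LQ problem: minimize ∫₀ᵀ (x² + u²) dt + x(T)², with ẋ = u.
# HJB minimizer: u* = -V_x/2; ansatz V(t,x) = p(t)x² gives the Riccati ODE
# p' = p² - 1, p(T) = 1, whose solution is p ≡ 1, so V = x² and u*(t,x) = -x.

T, n = 1.0, 1000
dt = T / n

# Integrate the Riccati ODE backward in time from p(T) = 1 (explicit Euler).
p = 1.0
for _ in range(n):
    p -= dt * (p * p - 1.0)      # p(t - dt) ≈ p(t) - dt * p'(t)

# Verification: simulate the closed loop ẋ = -x from x₀ and accumulate the
# cost; it should reproduce V(0, x₀) = x₀².
x0 = 0.7
x, cost = x0, 0.0
for _ in range(n):
    u = -x                       # feedback from the HJB minimizer
    cost += (x * x + u * u) * dt
    x += u * dt
cost += x * x                    # terminal cost h(x(T)) = x(T)²
```

Up to the O(dt) Euler error, `cost` agrees with $x_0^2 = V(0, x_0)$, exactly as the verification argument predicts.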
3.1.2
Stochastic control
We now turn to the stochastic case. On a filtered probability space carrying a standard Brownian motion W(·), consider the controlled stochastic differential equation
$$ dx(t) = b(t, x(t), u(t))\,dt + \sigma(t, x(t), u(t))\,dW(t), \quad t \in [0, T], \qquad x(0) = x_0, \tag{3.5} $$
with the cost functional
$$ J(u(\cdot)) = E\left[ \int_0^T f(t, x(t), u(t))\,dt + h(x(T)) \right]. \tag{3.6} $$
Define
U[0, T] = {u(·) : u(·) is measurable in [0, T] and {F_t}_{t≥0}-adapted}.
The optimal stochastic control problem is stated as follows:
Minimize (3.6) over U[0, T ].
Let (s, y) ∈ [0, T) × R, and consider the following control system over [s, T]:
$$ dx(t) = b(t, x(t), u(t))\,dt + \sigma(t, x(t), u(t))\,dW(t), \quad t \in [s, T], \qquad x(s) = y, \tag{3.7} $$
with the cost functional
$$ J(s, y; u(\cdot)) = E\left[ \int_s^T f(t, x(t), u(t))\,dt + h(x(T)) \right]. $$
Define the value function by
$$ V(s, y) = \inf_{u(\cdot) \in \mathcal{U}[s,T]} J(s, y; u(\cdot)), \quad \text{for any } (s, y) \in [0, T) \times R, \qquad V(T, y) = h(y). $$
As in the deterministic case, Bellman's principle of optimality holds: for any 0 ≤ s ≤ ŝ ≤ T,
$$ V(s, y) = \inf_{u(\cdot) \in \mathcal{U}[s,T]} E\left[ \int_s^{\hat s} f(t, x(t), u(t))\,dt + V(\hat s, x(\hat s)) \right], \tag{3.8} $$
and, when V is smooth, the associated HJB equation is now of second order:
$$ V_t + \inf_{u \in U} \left\{ \frac{1}{2} \sigma^2(t, x, u) V_{xx} + b(t, x, u) V_x + f(t, x, u) \right\} = 0, \qquad V(T, x) = h(x). \tag{3.9} $$
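The second-order HJB equation (3.9) can be solved numerically by marching backward in time with finite differences. The sketch below is again an example of our own: $dx = u\,dt + \sigma\,dW$ with cost $E[\int_0^T (x^2 + u^2)\,dt + x(T)^2]$. Here the infimum in $u$ is explicit, $\min_u \{ u V_x + u^2 \} = -V_x^2/4$ at $u^* = -V_x/2$, and the exact value function is $V(t,x) = x^2 + \sigma^2 (T - t)$, which the scheme should reproduce.

```python
import numpy as np

# Explicit finite differences for the HJB equation (illustrative problem)
#   V_t + ½σ²V_xx - V_x²/4 + x² = 0,  V(T, x) = x²,
# whose exact solution is V(t, x) = x² + σ²(T - t).
sigma, T, L = 0.5, 1.0, 2.0
nx, nt = 81, 500
x = np.linspace(-L, L, nx)
dx, dt = x[1] - x[0], T / nt        # dt < dx²/σ² keeps the scheme stable

V = x ** 2                           # terminal condition V(T, x) = h(x)
for _ in range(nt):                  # march backward from t = T to t = 0
    Vx = (V[2:] - V[:-2]) / (2 * dx)
    Vxx = (V[2:] - 2 * V[1:-1] + V[:-2]) / dx ** 2
    # V(t - dt) = V(t) + dt * inf_u { ½σ²V_xx + uV_x + x² + u² }
    V[1:-1] += dt * (0.5 * sigma ** 2 * Vxx - Vx ** 2 / 4 + x[1:-1] ** 2)
    # quadratic extrapolation at the boundary (exact for quadratic V)
    V[0] = 3 * V[1] - 3 * V[2] + V[3]
    V[-1] = 3 * V[-2] - 3 * V[-3] + V[-4]

err = np.max(np.abs(V - (x ** 2 + sigma ** 2 * T)))   # error at t = 0
```

Because the exact solution is quadratic in x, the central differences and the boundary extrapolation are exact here and `err` is at round-off level. For problems whose value function is genuinely nonsmooth, monotone schemes are needed, and this is precisely where the notion of viscosity solutions below enters.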
3.2
Viscosity solutions
The value function is often not smooth. Thus, one needs to introduce the
notion of viscosity solutions to characterize the value function.
Definition 1 A function v ∈ C([0, T] × R) is called a viscosity subsolution (supersolution) of (3.9) if
$$ v(T, x) \le h(x), \quad \text{for any } x \in R \qquad (\text{resp. } v(T, x) \ge h(x)), $$
and for any φ ∈ C^{1,2}([0, T] × R), whenever v − φ attains a local maximum (minimum) at a point (t̄, x̄) ∈ [0, T) × R, we have
$$ \varphi_t(\bar t, \bar x) + \inf_{u \in U} \left\{ \frac{1}{2} \sigma^2(\bar t, \bar x, u) \varphi_{xx}(\bar t, \bar x) + b(\bar t, \bar x, u) \varphi_x(\bar t, \bar x) + f(\bar t, \bar x, u) \right\} \ge 0 \qquad (\text{resp. } \le 0). $$
A function v is called a viscosity solution of (3.9) if it is both a viscosity subsolution and a viscosity supersolution.
Let us now verify that the value function V is a viscosity solution of (3.9).
First, V is a viscosity supersolution. Let φ ∈ C^{1,2}([0, T] × R) and let V − φ attain a local minimum at (s, y) ∈ [0, T) × R; without loss of generality, V(s, y) = φ(s, y), so that V ≥ φ near (s, y). For any ε > 0 and ŝ > s with ŝ − s small, Bellman's principle (3.8) provides a u(·) ∈ U[s, T] such that
$$ \varepsilon(\hat s - s) + \varphi(s, y) = \varepsilon(\hat s - s) + V(s, y) \ge E\left[ \int_s^{\hat s} f(t, x(t), u(t))\,dt + V(\hat s, x(\hat s)) \right] \ge E\left[ \int_s^{\hat s} f(t, x(t), u(t))\,dt + \varphi(\hat s, x(\hat s)) \right]. $$
By Itô's formula,
$$ E[\varphi(\hat s, x(\hat s))] = \varphi(s, y) + E \int_s^{\hat s} \left[ \varphi_t(t, x(t)) + \frac{1}{2} \sigma^2(t, x(t), u(t)) \varphi_{xx}(t, x(t)) + b(t, x(t), u(t)) \varphi_x(t, x(t)) \right] dt, $$
so that
$$ \varepsilon \ge E\left[ \frac{1}{\hat s - s} \int_s^{\hat s} \left[ \varphi_t + \frac{1}{2} \sigma^2 \varphi_{xx} + b \varphi_x + f \right](t, x(t), u(t))\,dt \right] \ge E\left[ \frac{1}{\hat s - s} \int_s^{\hat s} \left[ \varphi_t(t, x(t)) + \inf_{u \in U} \left\{ \frac{1}{2} \sigma^2(t, x(t), u) \varphi_{xx}(t, x(t)) + b(t, x(t), u) \varphi_x(t, x(t)) + f(t, x(t), u) \right\} \right] dt \right]. $$
Letting ŝ ↓ s and then ε ↓ 0 (using the continuity of the coefficients), we obtain
$$ \varphi_t(s, y) + \inf_{u \in U} \left\{ \frac{1}{2} \sigma^2(s, y, u) \varphi_{xx}(s, y) + b(s, y, u) \varphi_x(s, y) + f(s, y, u) \right\} \le 0, $$
which is the supersolution inequality.
Next, V is a viscosity subsolution. Let V − φ attain a local maximum at (s, y), with V(s, y) = φ(s, y), so that V ≤ φ near (s, y). Fix u ∈ U and take the constant control u(t) ≡ u on [s, ŝ]. By (3.8),
$$ \varphi(s, y) = V(s, y) \le E\left[ \int_s^{\hat s} f(t, x(t), u)\,dt + V(\hat s, x(\hat s)) \right] \le E\left[ \int_s^{\hat s} f(t, x(t), u)\,dt + \varphi(\hat s, x(\hat s)) \right]. $$
Applying Itô's formula as above and dividing by ŝ − s,
$$ 0 \le E\left[ \frac{1}{\hat s - s} \int_s^{\hat s} \left[ \varphi_t(t, x(t)) + \frac{1}{2} \sigma^2(t, x(t), u) \varphi_{xx}(t, x(t)) + b(t, x(t), u) \varphi_x(t, x(t)) + f(t, x(t), u) \right] dt \right] \to \varphi_t(s, y) + \frac{1}{2} \sigma^2(s, y, u) \varphi_{xx}(s, y) + b(s, y, u) \varphi_x(s, y) + f(s, y, u), \quad \text{as } \hat s \downarrow s. $$
Since u ∈ U is arbitrary, taking the infimum over u yields
$$ \varphi_t(s, y) + \inf_{u \in U} \left\{ \frac{1}{2} \sigma^2(s, y, u) \varphi_{xx}(s, y) + b(s, y, u) \varphi_x(s, y) + f(s, y, u) \right\} \ge 0, $$
which is the subsolution inequality. Together with V(T, y) = h(y), this shows that V is a viscosity solution of (3.9).