
Infinite horizon maximum principle for the discounted control problem∗


Petr Veverka
2nd year of PGS, email: veverpe2@fjfi.cvut.cz
Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, CTU
advisor: Prof. RNDr. Bohdan Maslowski, DrSc.

Abstract. In this article, a proof of Pontryagin's maximum principle for the infinite horizon discounted stochastic control problem is given. A sufficient version of the maximum principle is considered, i.e. the maximality and concavity of the associated Hamiltonian function are assumed. The crucial method used is the approximation of the (infinite horizon) backward adjoint process. Finally, some basic examples from finance are given to show the usage of the maximum principle.
Abstrakt. In this article, a proof of Pontryagin's maximum principle for the discounted control problem is given. The theorem is stated as a sufficient condition for optimal control, i.e. concavity and the maximality condition are assumed for the Hamiltonian function. The key step of the proof is the approximation of the solution to the backward (adjoint) stochastic differential equation. Finally, applications of the proved maximum principle to model examples from optimal investment and stochastic finance are presented.

1 Introduction
At present, the field of stochastic control is widely used in many spheres of real life where, in any form, optimal performance is desired. Important examples of its applicability come from domains such as financial mathematics (managing financial portfolios, hedging tasks, defining investment strategies etc.), harvesting of raw materials (finding optimal harvesting scenarios with respect to the current market price of the commodity), engineering (vibration absorbing; optimal regulation of temperature, power etc.) and many others. This shows how important it is to investigate the possibilities of a mathematical description of the problem in question, finding an optimal control to the problem (if possible) or approximating the optimal solution if the optimum is not easy to find, and, of course, in that case, evaluating the error of using this approximate solution.
In this paper, the discounted stochastic control problem is considered. This kind of problem is very popular and plentifully used in the domain of stochastic finance since it leads to maximizing the agent's average discounted utility. The approach to the solution taken here is the maximum principle which, in the deterministic setting, was formulated in the 1950s by the group of L. S. Pontryagin. For diffusions, the maximum principle has been studied by many researchers. The earliest versions of a maximum principle for such processes were given by Kushner [6] and Bismut [7]. Further progress on the subject was subsequently


∗The results of this paper were obtained within my PhD study programme at the Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague.


made by Bensoussan [8], Peng [9], and Cadenillas and Haussmann [10]. Originally, the main technical tool used when considering the maximum principle was the calculus of variations, which was not easy to apply to real examples and was difficult to simulate. This was the reason why the approach via the maximum principle remained rather theoretical and was overshadowed by the dynamic programming approach. The turning point which led to its intensive study was the paper [4] by Pardoux and Peng, who formulated the general problem of BSDEs and proved the existence and uniqueness theorems. BSDEs provide an elegant and easy-to-handle tool to describe the adjoint (shadow price) processes of the control problem and to formulate the maximum principle using the Hamiltonian function. For diffusions with jumps, a necessary maximum principle on the finite time horizon was formulated by Tang and Li [11], whereas sufficient optimality conditions on the finite time horizon were specified by Øksendal, Sulem and Framstad [1].

The paper is organized as follows: in the second section, some known results on FBSDEs with infinite horizon and on stochastic control with finite horizon are provided. The formulation of the discounted problem is given in the third section. The fourth section contains the main result of the paper: the formulation and proof of the sufficient infinite time maximum principle for the discounted problem. The last section gives some examples from finance.

2 Preliminaries
2.1 Basic notation
We are given a basic probability space $(\Omega, \mathcal{F}, P)$ and an $\mathbb{R}^d$-valued standard Wiener process $W = (W_t)_{t \geq 0}$. Let $(\mathcal{F}^W_t)_{t \geq 0}$ be the canonical filtration of $W$, i.e. $\mathcal{F}^W_t = \sigma(W_s;\ s \leq t)$, and let $(\mathcal{F}_t)_{t \geq 0}$ be its UC completion. We denote $\mathcal{F}_\infty = \bigvee_{t \geq 0}\mathcal{F}_t \subset \mathcal{F}$. Further, we simplify the notation $P$-a.s. just to a.s.

2.2 Some known results on FBSDEs in infinite time horizon

When applying the maximum principle to stochastic control problems, the class of Forward-Backward Stochastic Differential Equations (FBSDE for short) naturally arises in the form of a coupled system: the state (forward) equation for the controlled diffusion and the dual (backward) equation for the generalized Lagrange multipliers, as will be seen later in the paper. For that reason, we present here some important results on FBSDEs (without the control process) in infinite time horizon, which can be found in [5].

Let us consider the following fully coupled system of equations

$$dX_t = b(t, X_t, u_t, Y_t, Z_t, \omega)\,dt + \sigma(t, X_t, u_t, Y_t, Z_t, \omega)\,dW_t, \quad \forall t \geq 0 \text{ a.s.}, \qquad X_0 = x \in \mathbb{R}^n, \tag{2.1}$$

$$Y_t = \int_t^{+\infty} h(s, X_s, u_s, Y_s, Z_s)\,ds - \int_t^{+\infty} Z_s\,dW_s, \quad \forall t \geq 0, \text{ a.s.}, \tag{2.2}$$

where $U \subset \mathbb{R}^k$ is closed, and $b, h : \mathbb{R}_+ \times \mathbb{R}^n \times U \times \mathbb{R}^n \times \mathbb{R}^{n \times d} \times \Omega \to \mathbb{R}^n$ and $\sigma : \mathbb{R}_+ \times \mathbb{R}^n \times U \times \mathbb{R}^n \times \mathbb{R}^{n \times d} \times \Omega \to \mathbb{R}^{n \times d}$ are continuous in the variables $(x, y, z)$. In what follows we omit the notation of the dependence on $\omega$. Further, we make the following assumptions.

(H1) $b(\cdot, x, u, y, z)$, $h(\cdot, x, u, y, z)$ and $\sigma(\cdot, x, u, y, z)$ are $(\mathcal{F}_t)$-progressively measurable random processes for all $(x, u, y, z) \in \mathbb{R}^n \times U \times \mathbb{R}^n \times \mathbb{R}^{n \times d}$.

(H2) For any $t, x, x_1, x_2, u, y, y_1, y_2$ and $z$, there exist $\mu_1, \mu_2 \in \mathbb{R}$ such that a.s.

$$\langle x_1 - x_2,\ b(t, x_1, u, y, z) - b(t, x_2, u, y, z)\rangle \leq \mu_1\|x_1 - x_2\|^2, \tag{2.3}$$
$$\langle y_1 - y_2,\ h(t, x, u, y_1, z) - h(t, x, u, y_2, z)\rangle \leq \mu_2\|y_1 - y_2\|^2. \tag{2.4}$$

(H3) For any $t, x, x_1, x_2, u, y, y_1, y_2, z_1, z_2$ there exist $k, k_i, c_i \geq 0$, $i = 1, 2$, such that a.s.

$$\|b(t, x, u, y_1, z_1) - b(t, x, u, y_2, z_2)\| \leq k_1\|y_1 - y_2\| + k_2\|z_1 - z_2\|, \tag{2.5}$$
$$\|h(t, x_1, u, y, z_1) - h(t, x_2, u, y, z_2)\| \leq c_1\|x_1 - x_2\| + c_2\|z_1 - z_2\|, \tag{2.6}$$
$$\|b(t, x, u, 0, 0)\| \leq \|b(t, 0, u, 0, 0)\| + k(1 + \|x\|), \tag{2.7}$$
$$\|h(t, 0, u, y, 0)\| \leq \|h(t, 0, u, 0, 0)\| + \varphi(\|y\|), \tag{2.8}$$

where $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ is a continuous increasing function.

(H4) For any $t, x_i, u, y_i, z_i$, $i = 1, 2$, there exist $k_3, k_4, k_5 \geq 0$ such that a.s.

$$\|\sigma(t, x_1, u, y_1, z_1) - \sigma(t, x_2, u, y_2, z_2)\|^2 \leq k_3\|x_1 - x_2\|^2 + k_4\|y_1 - y_2\|^2 + k_5\|z_1 - z_2\|^2. \tag{2.9}$$

(H5) There exist constants $\lambda \in \mathbb{R}$, $\varepsilon_1, \varepsilon_2 > 0$ such that

$$2\mu_2 + 4c_1^2 + 2c_2^2 + 1 < \lambda < -2\mu_1 - k_3 - k_1\varepsilon_1 - k_2\varepsilon_2 - \big(k_1\varepsilon_1^{-1} + k_4\big) \vee \big(k_2\varepsilon_2^{-1} + k_5\big), \tag{2.10}$$

$$E\int_0^{+\infty} e^{\lambda t}\Big(\|b(t, 0, u_t, 0, 0)\|^2 + \|\sigma(t, 0, u_t, 0, 0)\|^2 + \|h(t, 0, u_t, 0, 0)\|^2\Big)\,dt < +\infty. \tag{2.11}$$

Definition 1: A triple of $\mathcal{F}_t$-adapted processes $(X_t, Y_t, Z_t)_{t \geq 0}$ is called an adapted solution to the FBSDE (2.1)-(2.2) if the following holds:

$$X_t = x + \int_0^t b(s, X_s, u_s, Y_s, Z_s)\,ds + \int_0^t \sigma(s, X_s, u_s, Y_s, Z_s)\,dW_s, \quad \forall t \geq 0, \text{ a.s.}, \tag{2.12}$$
$$Y_t = \int_t^{+\infty} h(s, X_s, u_s, Y_s, Z_s)\,ds - \int_t^{+\infty} Z_s\,dW_s, \quad \forall t \geq 0, \text{ a.s.}, \tag{2.13}$$
$$E\int_0^{+\infty} e^{\lambda t}\big(\|X_t\|^2 + \|Y_t\|^2 + \|Z_t\|^2\big)\,dt < +\infty. \tag{2.14}$$

We denote by $L^2_\lambda(\mathbb{R}_+; \mathbb{R}^n)$ the space of all $\mathbb{R}^n$-valued $\mathcal{F}_t$-adapted processes $(\varphi_t)_{t \geq 0}$ such that

$$\|\varphi\|^2_\lambda = E\int_0^{+\infty} e^{\lambda t}\|\varphi_t\|^2\,dt < +\infty. \tag{2.15}$$

The space $L^2_\lambda(\mathbb{R}_+; \mathbb{R}^{n \times d})$ is defined similarly. The main result in [5], Theorem 3.1, states:

Theorem 2.1. Assume (H1)-(H5) hold. Then there exists a unique solution $(X_t, Y_t, Z_t)_{t \geq 0}$ to the FBSDE (2.1)-(2.2) in $L^2_\lambda(\mathbb{R}_+; \mathbb{R}^n) \times L^2_\lambda(\mathbb{R}_+; \mathbb{R}^n) \times L^2_\lambda(\mathbb{R}_+; \mathbb{R}^{n \times d})$.

Further, we briefly present the construction of the solution of the backward equation in (2.2). The approximation technique was proposed by Pardoux in [3], where he proved that the BSDE with random (possibly infinite) terminal time $\tau$,

$$Y_t = Y_{T \wedge \tau} + \int_t^{T \wedge \tau} h(s, Y_s, Z_s)\,ds - \int_t^{T \wedge \tau} Z_s\,dW_s, \quad \forall T > 0 \text{ a.s.}, \tag{2.16}$$

has a unique solution $(Y_t, Z_t)_{t \geq 0} \in L^2_\lambda(\mathbb{R}_+; \mathbb{R}^n) \times L^2_\lambda(\mathbb{R}_+; \mathbb{R}^{n \times d})$ under the assumptions (H1)-(H5).
The solution to (2.16) is obtained as the limit of the approximation sequence $(Y^n_t, Z^n_t)_{t \geq 0}$, where

$$Y^n_t = E\big[\xi \mid \mathcal{F}_n\big] + \int_t^n \mathbf{1}_{[0,\tau]}(s)\,h(s, Y^n_s, Z^n_s)\,ds - \int_t^n \mathbf{1}_{[0,\tau]}(s)\,Z^n_s\,dW_s, \quad 0 \leq t \leq n,$$
$$Y^n_t = E\big[\xi \mid \mathcal{F}_t\big], \quad \text{for } t \geq n, \tag{2.17}$$

and $\xi$ denotes the prescribed terminal value of the BSDE (equal to $0$ in the infinite horizon case (2.2)).

The process $(Z^n_t)_{t \geq 0}$ is obtained by the classical martingale representation argument. The convergence is with respect to the topology induced by the norm

$$E\Big[\sup_{0 \leq t \leq \tau} e^{\lambda t}\|Y_t\|^2 + \int_0^\tau e^{\lambda t}\|Y_t\|^2\,dt + \int_0^\tau e^{\lambda t}\|Z_t\|^2\,dt\Big], \tag{2.18}$$

for some $\lambda$ given by the properties of the generator $h$.
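
To make the truncation scheme (2.17) concrete, here is a minimal numerical sketch. It rests on deliberately strong simplifying assumptions so that the limit can be checked by hand: a hypothetical generator $h(t, y) = e^{-t} - y$ that does not depend on $z$, terminal value $\xi = 0$ and $\tau = +\infty$, so that $Z \equiv 0$ and the truncated backward equation on $[0, n]$ reduces to the terminal value ODE $-\dot{Y} = h(t, Y)$, $Y_n = 0$, while the exact infinite horizon solution is $Y_t = \int_t^{+\infty} e^{-(s-t)}e^{-s}\,ds = e^{-t}/2$.

```python
import numpy as np

# Horizon-truncation scheme (2.17) in a hand-checkable special case:
# generator h(t, y) = exp(-t) - y (no z-dependence), xi = 0, tau = +infinity.
# Then Z == 0 and Y^n solves the ODE -dY/dt = h(t, Y) on [0, n] with Y^n_n = 0.

def h(t, y):
    return np.exp(-t) - y          # hypothetical generator, chosen for checkability

def truncated_Y0(n, dt=1e-3):
    """Explicit Euler, backward in time, for Y^n_0 (the t <= n branch of (2.17))."""
    y, t = 0.0, float(n)
    while t > 0.0:
        y += h(t, y) * dt          # -dY = h dt: the step from t to t - dt adds h*dt
        t -= dt
    return y

# The exact infinite horizon value is Y_0 = 1/2; the truncation error is e^{-2n}/2.
for n in (1, 2, 4, 8):
    print(n, truncated_Y0(n), abs(truncated_Y0(n) - 0.5))
```

The printed errors decay like $e^{-2n}/2$ (up to the $O(dt)$ Euler error), mirroring the convergence of $(Y^n, Z^n)$ in the norm (2.18).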



2.3 Some known results on stochastic control in finite time horizon

Let $T > 0$ be a finite time horizon and let $X_t$ be a controlled diffusion process in $\mathbb{R}^n$, i.e. $X_t$ is a strong solution to the SDE

$$dX_s = b(s, X_s, u_s)\,ds + \sigma(s, X_s, u_s)\,dW_s, \quad \forall s \in (t, T] \text{ a.s.}, \qquad X_t = x, \tag{2.19}$$

where $(u_s) \in \mathcal{U}(t, x)$ is a $U$-valued control process, $U \subset \mathbb{R}^k$, $\mathcal{U}(t, x)$ is a set of admissible controls, and $b$ and $\sigma$ are two measurable functions ensuring strong existence of the solution to (2.19) for all $(u_s) \in \mathcal{U}(t, x)$.
Furthermore, let $f \in C^1$ and $g \in C^1$ be two functions such that the functional

$$J(t, x, u) = E\Big[\int_t^T f(s, X^{t,x}_s, u_s)\,ds + g(X^{t,x}_T)\Big] \tag{2.20}$$

is meaningful (i.e. it converges), and define the cost (or value) function $v(t, x)$ by

$$v(t, x) = \sup_{u \in \mathcal{U}(t,x)} J(t, x, u). \tag{2.21}$$

The goal is to find a strategy $u^* \in \mathcal{U}(t, x)$ such that

$$v(t, x) = J(t, x, u^*);$$

$u^*$ is called an optimal control.
Let us define the generalized Hamiltonian of the problem by

$$H(t, x, u, y, z) = \langle b(t, x, u), y\rangle + \mathrm{Tr}\big(\sigma'(t, x, u)z\big) + f(t, x, u).$$

We suppose that $H$ is differentiable in $x$ (denoted as $\nabla_x H$) and we consider the following BSDE

$$-dY_t = \nabla_x H(t, X_t, u_t, Y_t, Z_t)\,dt - Z_t\,dW_t, \quad \forall t \in [0, T) \text{ a.s.},$$
$$Y_T = \nabla_x g(X_T) \text{ a.s.} \tag{2.22}$$

Then we can formulate the maximum principle on the finite time horizon, providing sufficient conditions for the optimal strategy $u^*$. For the proof and interesting references see [12].

Theorem 2.2 (Stochastic Pontryagin's maximum principle). Let $\hat{u} \in \mathcal{U}(t, x)$ and let $\hat{X}$ be the associated controlled diffusion process. Further, let us suppose that there exists a solution $(\hat{Y}, \hat{Z})$ to the associated BSDE (2.22) such that

• $H(t, \hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) = \max_{u \in U} H(t, \hat{X}_t, u, \hat{Y}_t, \hat{Z}_t)$, $\forall t \in [0, T]$ a.s.,

• $(x, u) \mapsto H(t, x, u, \hat{Y}_t, \hat{Z}_t)$ is a concave function for all $t$.

Then $\hat{u} = u^*$, i.e. $\hat{u}$ is an optimal control strategy for the stochastic control problem (2.19)-(2.21).

3 Discounted control problem


In this section, we consider the stochastic control problem over the whole $\mathbb{R}_+$. The functional to be maximised is the discounted one as defined below.

The controlled state process $(X_t)_{t \geq 0}$ is a strong solution to the following controlled SDE on $\mathbb{R}_+$:

$$dX_t = b(X_t, u_t)\,dt + \sigma(X_t, u_t)\,dW_t, \quad \forall t \geq 0 \text{ a.s.}, \qquad X_0 = x, \tag{3.1}$$

where the random functions $b, \sigma$ satisfy assumptions (H1)-(H5). The functional considered here is of the form

$$J(x, u) = E\Big[\int_0^{+\infty} e^{-\beta t} f(X_t, u_t)\,dt\Big], \tag{3.2}$$

where $f \in C(\mathbb{R}^n \times U; \mathbb{R})$ is the penalization (or appreciation) function over $\mathbb{R}_+$, assumed such that $J(x, u)$ converges, and $\beta > 0$ is the discount factor.
Further, we define the cost function $v(x)$ by

$$v(x) = \sup_{u \in \mathcal{U}(x)} J(x, u). \tag{3.3}$$

The goal is to find a strategy $u^* \in \mathcal{U}(x)$ such that the supremum in (3.3) is attained at $u^*$, i.e. $v(x) = J(x, u^*)$.

3.1 Hamiltonian of the system

We define a generalized Hamiltonian $H$ associated to the control problem (3.1)-(3.3), $H : \mathbb{R}^n \times U \times \mathbb{R}^n \times \mathbb{R}^{n \times d} \to \mathbb{R}$, by

$$H(x, u, y, z) = \langle b(x, u), y\rangle + \mathrm{Tr}\big(\sigma(x, u)'z\big) + f(x, u) - \beta\langle x, y\rangle, \tag{3.4}$$

where $\langle\cdot,\cdot\rangle$ denotes the inner product in $\mathbb{R}^n$. The Hamiltonian is an analogy to the Lagrange function in the theory of constrained optimization, since the variables $y$ and $z$ can be viewed as "generalized Lagrange multipliers" and the functions $b$ and $\sigma$ as the constraints for the state process $X_t$. We stress that neither the extremal point of $H$ w.r.t. $u$ nor its concavity/convexity w.r.t. $(x, u)$ depends on the last term $-\beta\langle x, y\rangle$.

We suppose that $H$ is differentiable in $x$ (with the gradient denoted as $\nabla_x H$) and we consider the following BSDE

$$Y_t = \int_t^{+\infty} \nabla_x H(X_s, u_s, Y_s, Z_s)\,ds - \int_t^{+\infty} Z_s\,dW_s, \quad \forall t \geq 0 \text{ a.s.} \tag{3.5}$$
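
As a small illustration of how these objects fit together, the following symbolic sketch treats the scalar case $n = d = 1$ with a linear-quadratic specimen of $b$, $\sigma$ and $f$ (an arbitrary choice of ours, not taken from the paper) and computes the driver $\nabla_x H$ of the adjoint BSDE (3.5) together with the first-order condition in $u$:

```python
import sympy as sp

# Symbolic sketch of (3.4)-(3.5) for n = d = 1. The concrete b, sigma, f below
# are hypothetical (linear dynamics, concave quadratic reward), illustration only.
x, u, y, z, beta = sp.symbols('x u y z beta')

b = u - x                        # hypothetical drift
sigma = x / 2                    # hypothetical diffusion coefficient
f = -(x**2 + u**2)               # hypothetical running reward

H = b * y + sigma * z + f - beta * x * y      # generalized Hamiltonian (3.4)

driver = sp.diff(H, x)           # nabla_x H, the driver of the adjoint BSDE (3.5)
u_candidate = sp.solve(sp.diff(H, u), u)[0]   # first-order condition in u

print(driver)                    # -beta*y - 2*x - y + z/2 (up to term ordering)
print(u_candidate)               # y/2
```

Note that, as remarked above, the discount term only adds $-\beta y$ to the driver and leaves the maximizer in $u$ untouched.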

4 Infinite horizon sufficient maximum principle

In this section, the main result is given.

Theorem 4.1 (Sufficient stochastic maximum principle). Let $\hat{u} \in \mathcal{U}(x)$ and let $\hat{X}$ be the associated controlled diffusion process. Further, let us suppose that there exists a solution $(\hat{Y}, \hat{Z})$ to the associated BSDE (3.5) such that

• $H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) = \max_{u \in U} H(\hat{X}_t, u, \hat{Y}_t, \hat{Z}_t)$, $\forall t \in \mathbb{R}_+$, a.s.,

• $(x, u) \mapsto H(x, u, \hat{Y}_t, \hat{Z}_t)$ is a concave function for all $t$.

Then $\hat{u} = u^*$, i.e. $\hat{u}$ is the optimal control strategy for the stochastic control problem (3.1)-(3.3).
Proof: Let us take an arbitrary $u \in \mathcal{U}(x)$ and examine the difference $J(x, \hat{u}) - J(x, u)$. The goal is to show that this quantity is nonnegative. Using the definition of $J(x, u)$ and $H$ we have

$$J(x, \hat{u}) - J(x, u) = E\int_0^{+\infty} e^{-\beta t}\big(f(\hat{X}_t, \hat{u}_t) - f(X_t, u_t)\big)\,dt$$
$$= E\int_0^{+\infty} e^{-\beta t}\big(H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - H(X_t, u_t, \hat{Y}_t, \hat{Z}_t)\big)\,dt + E\int_0^{+\infty} e^{-\beta t}\big\langle b(X_t, u_t) - b(\hat{X}_t, \hat{u}_t), \hat{Y}_t\big\rangle\,dt$$
$$+\ E\int_0^{+\infty} e^{-\beta t}\,\mathrm{Tr}\Big\{\big(\sigma'(X_t, u_t) - \sigma'(\hat{X}_t, \hat{u}_t)\big)\hat{Z}_t\Big\}\,dt + E\int_0^{+\infty} e^{-\beta t}\beta\big(\hat{X}_t - X_t\big)'\hat{Y}_t\,dt. \tag{4.1}$$

The Lebesgue integral over $\mathbb{R}_+$ can be expressed (by Fubini's theorem and the existence of the original integral) as the following limit

$$E\int_0^{+\infty} I_t\,dt = \lim_{T \to +\infty} E\int_0^T I_t\,dt, \tag{4.2}$$

where $I_t$ is the integrand of (4.1).


Further, we define a new pair of solution processes $(\hat{Y}^T_t, \hat{Z}^T_t)$ of the new BSDE on $[0, T]$,

$$-d\hat{Y}^T_t = \nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}^T_t, \hat{Z}^T_t)\,dt - \hat{Z}^T_t\,dW_t, \quad \forall t \in [0, T) \text{ a.s.},$$
$$\hat{Y}^T_T = \nabla_x g(\hat{X}_T) \equiv 0 \text{ a.s.}, \tag{4.3}$$

for a function $g \equiv 0$. We note that the pair $(\hat{Y}^T_t, \hat{Z}^T_t)_{t \in [0,T]}$ can be extended to $\mathbb{R}_+$ by setting $\hat{Y}^T_t = 0$ and $\hat{Z}^T_t = 0$ for $t > T$. This way we obtain an approximation process $(\hat{Y}^T_t, \hat{Z}^T_t)_{t \in \mathbb{R}_+}$ of the solution to the associated BSDE (3.5), exactly in line with Pardoux's approximation (2.17).

Now, using again the fictive function $g \equiv 0$ in the definition of the functional $J(x, u)$ (as an analogy to the finite time horizon case), one can imagine the new functional (or better, the difference of functionals) in (4.2)

$$\tilde{J}(x, \hat{u}) - \tilde{J}(x, u) := \lim_{T \to +\infty} E\Big[\int_0^T I_t\,dt + g(\hat{X}_T) - g(X_T)\Big] \tag{4.4}$$

and further, using the supergradient inequality for concave $C^1$ functions, we have

$$0 = g(\hat{X}_T) - g(X_T) \geq \big(\hat{X}_T - X_T\big)'\nabla_x g(\hat{X}_T) = \big(\hat{X}_T - X_T\big)'\hat{Y}^T_T = \big(\hat{X}_T - X_T\big)'e^{-\beta T}\hat{Y}^T_T,$$

since $\hat{Y}^T_T = 0$ a.s. Applying the Itô formula to the last expression and taking $E(\cdot)$ we arrive at

$$0 = E\big[g(\hat{X}_T) - g(X_T)\big] = E\big[\big(\hat{X}_T - X_T\big)'e^{-\beta T}\hat{Y}^T_T\big] = E\int_0^T \big(\hat{X}_t - X_t\big)'\,d\big(e^{-\beta t}\hat{Y}^T_t\big)$$
$$+\ E\int_0^T e^{-\beta t}\big(\hat{Y}^T_t\big)'\,d\big(\hat{X}_t - X_t\big) + E\int_0^T e^{-\beta t}\,\mathrm{Tr}\Big\{\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\hat{Z}^T_t\Big\}\,dt$$
$$= E\int_0^T e^{-\beta t}\big(\hat{X}_t - X_t\big)'\big(-\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}^T_t, \hat{Z}^T_t)\big)\,dt - E\int_0^T e^{-\beta t}\beta\big(\hat{X}_t - X_t\big)'\hat{Y}^T_t\,dt$$
$$+\ E\int_0^T e^{-\beta t}\,\mathrm{Tr}\Big\{\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\hat{Z}^T_t\Big\}\,dt + E\int_0^T e^{-\beta t}\big\langle b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t), \hat{Y}^T_t\big\rangle\,dt. \tag{4.5}$$

Then, putting (4.5) into (4.4), we get

$$J(x, \hat{u}) - J(x, u) = \lim_{T \to +\infty} E\int_0^T \tilde{I}_t\,dt, \tag{4.6}$$

where

$$\tilde{I}_t = e^{-\beta t}\Big[H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - H(X_t, u_t, \hat{Y}_t, \hat{Z}_t) - \big(\hat{X}_t - X_t\big)'\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}^T_t, \hat{Z}^T_t)$$
$$+\ \big\langle b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t), \hat{Y}^T_t - \hat{Y}_t\big\rangle + \mathrm{Tr}\Big\{\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\big(\hat{Z}^T_t - \hat{Z}_t\big)\Big\}$$
$$+\ \beta\big(\hat{X}_t - X_t\big)'\big(\hat{Y}_t - \hat{Y}^T_t\big)\Big]. \tag{4.7}$$

Now, shifting the function $\nabla_x H$ to the point $(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t)$ we finally deduce that

$$\tilde{I}_t = e^{-\beta t}\Big[H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - H(X_t, u_t, \hat{Y}_t, \hat{Z}_t) - \big(\hat{X}_t - X_t\big)'\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t)$$
$$+\ \big(\hat{X}_t - X_t\big)'\big(\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - \nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}^T_t, \hat{Z}^T_t)\big)$$
$$+\ \big\langle b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t), \hat{Y}^T_t - \hat{Y}_t\big\rangle + \mathrm{Tr}\Big\{\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\big(\hat{Z}^T_t - \hat{Z}_t\big)\Big\}$$
$$+\ \beta\big(\hat{X}_t - X_t\big)'\big(\hat{Y}_t - \hat{Y}^T_t\big)\Big]. \tag{4.8}$$

From the concavity of $H$ in $(x, u)$, we know that

$$H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - H(X_t, u_t, \hat{Y}_t, \hat{Z}_t) - \big(\hat{X}_t - X_t\big)'\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) \geq 0. \tag{4.9}$$

To complete the proof, we have to show that all the remaining terms in (4.8) tend to zero after taking $E(\cdot)$, integrating over $[0, T]$ and sending $T \to +\infty$. We examine the four remaining terms separately.

Second term in (4.8)

To show that

$$\lim_{T \to +\infty} E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\big(\hat{X}_t - X_t\big)'\big(\nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}_t, \hat{Z}_t) - \nabla_x H(\hat{X}_t, \hat{u}_t, \hat{Y}^T_t, \hat{Z}^T_t)\big)\,dt = 0, \tag{4.10}$$

we denote $\nabla_x H(\hat{X}_t, \hat{u}_t, y, z) = h(y, z)$ and apply the Schwarz inequality twice to obtain

$$\Big|E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\big(\hat{X}_t - X_t\big)'\big(h(\hat{Y}_t, \hat{Z}_t) - h(\hat{Y}^T_t, \hat{Z}^T_t)\big)\,dt\Big| \leq E\int_0^{+\infty} e^{-\beta t}\|\hat{X}_t - X_t\|\cdot\|h(\hat{Y}_t, \hat{Z}_t) - h(\hat{Y}^T_t, \hat{Z}^T_t)\|\,dt$$
$$\leq \Big(E\int_0^{+\infty} e^{-\beta t}\|\hat{X}_t - X_t\|^2\Big)^{1/2}\Big(E\int_0^{+\infty} e^{-\beta t}\|h(\hat{Y}_t, \hat{Z}_t) - h(\hat{Y}^T_t, \hat{Z}^T_t)\|^2\Big)^{1/2}. \tag{4.11}$$

The first integral does not depend on $T$ and is finite due to (2.14). The integrand in the second factor can be rewritten as follows:

$$\|h(\hat{Y}_t, \hat{Z}_t) - h(\hat{Y}^T_t, \hat{Z}^T_t)\|^2 \leq 2\big(\|h(\hat{Y}^T_t, \hat{Z}^T_t) - h(\hat{Y}^T_t, \hat{Z}_t)\|^2 + \|h(\hat{Y}^T_t, \hat{Z}_t) - h(\hat{Y}_t, \hat{Z}_t)\|^2\big)$$
$$\leq 2c_2^2\|\hat{Z}^T_t - \hat{Z}_t\|^2 + 2\|h(\hat{Y}^T_t, \hat{Z}_t) - h(\hat{Y}_t, \hat{Z}_t)\|^2. \tag{4.12}$$

Therefore, the integral $2c_2^2\,E\int_0^{+\infty} e^{-\beta t}\|\hat{Z}^T_t - \hat{Z}_t\|^2\,dt$ vanishes as $T \to +\infty$ due to the approximation property of $\hat{Z}^T$, see (2.17) and (2.18). The remaining term has to be treated in a different way. From (2.18) we know that

 
$$\lim_{T \to +\infty} E\Big[\sup_{t \geq 0} e^{-\beta t}\|\hat{Y}^T_t - \hat{Y}_t\|^2\Big] = 0. \tag{4.13}$$

Then there exists a sequence $T_n$, $T_n \nearrow +\infty$ as $n \to +\infty$, such that

$$\sup_{t \geq 0} e^{-\beta t}\|\hat{Y}^{T_n}_t - \hat{Y}_t\|^2 \to 0, \quad \text{as } n \to +\infty, \text{ a.s.}, \tag{4.14}$$

and therefore $\|\hat{Y}^{T_n}_t - \hat{Y}_t\|^2 \to 0$ as $n \to +\infty$ for every fixed $t \geq 0$.


To conclude this step, we observe that

$$\|h(\hat{Y}^{T_n}_t, \hat{Z}_t) - h(\hat{Y}_t, \hat{Z}_t)\|^2 \leq 4\Big[\|h(\hat{Y}^{T_n}_t, \hat{Z}_t) - h(\hat{Y}^{T_n}_t, 0)\|^2 + \|h(\hat{Y}_t, \hat{Z}_t) - h(\hat{Y}_t, 0)\|^2 + \|h(\hat{Y}_t, 0)\|^2 + \|h(\hat{Y}^{T_n}_t, 0)\|^2\Big]$$
$$\leq 8c_2^2\|\hat{Z}_t\|^2 + 16\|h(0, 0)\|^2 + 4\big[\varphi\big(\|\hat{Y}_t\|\big)^2 + \varphi\big(\|\hat{Y}^{T_n}_t\|\big)^2\big], \tag{4.15}$$

which, after taking $E(\cdot)$ and multiplying by the factor $e^{-\beta t}$, is an integrable majorant with respect to Lebesgue integration over $\mathbb{R}_+$. Thus, the whole term in (4.11) vanishes as $n \to +\infty$ due to Fubini's theorem, dominated convergence, the continuity of $h(y, z)$ in $y$, and the approximation properties (2.17) and (2.18).

Third term in (4.8)

To show that

$$\lim_{T \to +\infty} E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\big\langle b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t), \hat{Y}^T_t - \hat{Y}_t\big\rangle\,dt = 0, \tag{4.16}$$

we apply, again, the Schwarz inequality twice, so that we obtain

$$\Big|E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\big\langle b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t), \hat{Y}^T_t - \hat{Y}_t\big\rangle\,dt\Big| \leq E\int_0^{+\infty} e^{-\beta t}\|b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t)\|\cdot\|\hat{Y}^T_t - \hat{Y}_t\|\,dt$$
$$\leq \Big(E\int_0^{+\infty} e^{-\beta t}\|b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t)\|^2\Big)^{1/2}\Big(E\int_0^{+\infty} e^{-\beta t}\|\hat{Y}^T_t - \hat{Y}_t\|^2\Big)^{1/2}. \tag{4.17}$$

Then the second factor on the r.h.s. vanishes as $T \to +\infty$ due to the approximation property of $\hat{Y}^T$. Using Assumptions (H1)-(H5), the first factor on the r.h.s. is bounded, for it holds that

$$\|b(\hat{X}_t, \hat{u}_t) - b(X_t, u_t)\|^2 \leq 2\big(\|b(\hat{X}_t, \hat{u}_t)\|^2 + \|b(X_t, u_t)\|^2\big)$$
$$\leq 2\big(\|b(0, \hat{u}_t)\|^2 + \|b(0, u_t)\|^2\big) + 2k\big(1 + \|\hat{X}_t\|^2 + \|X_t\|^2\big). \tag{4.18}$$

Therefore, we can deduce from Assumptions (H1)-(H5) that (4.16) holds.

Fourth term in (4.8)

To show that

$$\lim_{T \to +\infty} E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\,\mathrm{Tr}\Big[\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\big(\hat{Z}^T_t - \hat{Z}_t\big)\Big]\,dt = 0, \tag{4.19}$$

we again apply the Schwarz inequality twice and obtain

$$\Big|E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\,\mathrm{Tr}\Big[\big(\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\big)\big(\hat{Z}^T_t - \hat{Z}_t\big)\Big]\,dt\Big| \leq E\int_0^{+\infty} e^{-\beta t}\|\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\|\cdot\|\hat{Z}^T_t - \hat{Z}_t\|\,dt$$
$$\leq \Big(E\int_0^{+\infty} e^{-\beta t}\|\sigma'(\hat{X}_t, \hat{u}_t) - \sigma'(X_t, u_t)\|^2\Big)^{1/2}\Big(E\int_0^{+\infty} e^{-\beta t}\|\hat{Z}^T_t - \hat{Z}_t\|^2\Big)^{1/2}. \tag{4.20}$$

Taking into account that

$$\|\sigma(\hat{X}_t, \hat{u}_t) - \sigma(X_t, u_t)\| \leq \|\sigma(\hat{X}_t, \hat{u}_t) - \sigma(X_t, \hat{u}_t)\| + \|\sigma(X_t, \hat{u}_t) - \sigma(X_t, u_t)\|$$
$$\leq k_3\|\hat{X}_t - X_t\| + \|\sigma(X_t, \hat{u}_t) - \sigma(0, \hat{u}_t)\| + \|\sigma(X_t, u_t) - \sigma(0, u_t)\| + \|\sigma(0, \hat{u}_t) - \sigma(0, u_t)\|$$
$$\leq 2k_3\big(\|\hat{X}_t\| + \|X_t\|\big) + \|\sigma(0, \hat{u}_t)\| + \|\sigma(0, u_t)\|, \tag{4.21}$$

it can be easily seen that the fourth term tends to 0 as $T \to +\infty$ by similar arguments as above.

Fifth term in (4.8)

The term

$$E\int_0^{+\infty} \mathbf{1}_{[0,T]}(t)\,e^{-\beta t}\beta\big(\hat{X}_t - X_t\big)'\big(\hat{Y}_t - \hat{Y}^T_t\big)\,dt \tag{4.22}$$

tends to 0 as $T \to +\infty$ by similar arguments as above. Therefore, we deduce that

$$J(x, \hat{u}) - J(x, u) \geq 0, \quad \forall u \in \mathcal{U}(x),$$

and $\hat{u}$ is indeed the optimal control.

5 Examples
5.1 Stochastic control with random time
This example is taken from the book by H. Pham [12] and it illustrates how, under some assumptions, a random horizon control problem can be rewritten as an infinite horizon discounted control problem. Let us consider a stochastic control problem with random terminal time $\tau$ (which can represent a random time of changes in the investor's opportunity set, the time of an exogenous shock to the controlled process $(X^u_t)_{t \geq 0}$ etc.). We lay

$$v(x) = \sup_{u \in \mathcal{U}(x)} E\Big[\int_0^\tau e^{-\beta t} f(X^u_t)\,dt\Big], \tag{5.1}$$

where $\beta > 0$ is the discount factor and $f$ is the investor's utility function.


Furthermore, we assume that $\tau$ is independent of $\mathcal{F}_\infty$, the UC completion of $\bigvee_{t \geq 0}\mathcal{F}^W_t$, and we suppose that $F_\tau(t) = P[\tau \leq t] = P[\tau \leq t \mid \mathcal{F}_\infty] = 1 - e^{-\lambda t}$ for some $\lambda > 0$. Then we can write (keeping in mind the independence of $\tau$ of $\mathcal{F}_\infty$)

$$E\Big[\int_0^\tau e^{-\beta t} f(X^u_t)\,dt\Big] = E\Big[\int_0^{+\infty} \mathbf{1}_{t<\tau}\,e^{-\beta t} f(X^u_t)\,dt\Big] \tag{5.2}$$
$$= E\Big[\int_0^{+\infty} \big(1 - F_\tau(t)\big)e^{-\beta t} f(X^u_t)\,dt\Big] = E\Big[\int_0^{+\infty} e^{-(\beta+\lambda)t} f(X^u_t)\,dt\Big].$$

This way we obtain an infinite time horizon problem with the modified discount factor $\beta + \lambda$.
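
The identity (5.2) is easy to check by simulation. The sketch below assumes, purely for illustration, an uncontrolled state $dX_t = \sigma\,dW_t$ with $X_0 = 0$ and the bounded function $f(x) = \cos x$, and compares Monte Carlo estimates of the two sides (the horizon $T$ truncates the time integrals where the discounting has made them negligible):

```python
import numpy as np

# Monte Carlo check of (5.2): for tau ~ Exp(lambda) independent of W,
#   E[int_0^tau e^{-beta t} f(X_t) dt] = E[int_0^inf e^{-(beta+lambda)t} f(X_t) dt].
# Dynamics and f are illustrative stand-ins: dX = sigma dW, X_0 = 0, f(x) = cos(x).

rng = np.random.default_rng(0)
beta, lam, sigma = 0.5, 1.0, 0.3
T, n_steps, n_paths = 20.0, 2000, 4000      # T chosen so e^{-beta*T} is negligible
dt = T / n_steps
t = np.arange(1, n_steps + 1) * dt          # time grid matching the Euler states

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
X = np.cumsum(sigma * dW, axis=1)           # paths of X at times t_1, ..., t_n
f = np.cos(X)

tau = rng.exponential(1.0 / lam, size=(n_paths, 1))          # independent of X
lhs = (np.exp(-beta * t) * f * (t < tau)).sum(axis=1) * dt   # random horizon side
rhs = (np.exp(-(beta + lam) * t) * f).sum(axis=1) * dt       # discounted side

print(lhs.mean(), rhs.mean())   # the two sample means should nearly coincide
```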

5.2 Production planning problem

The setting of the problem is taken from [13] and it is nothing else than the classical LQ problem, whose solution is well known from e.g. dynamic programming, see [2], Chapter 11. In the Markov setting, it leads to the Riccati ODE. We show that the maximum principle leads to the same result.

Let us consider a factory producing a single good. Let $X(\cdot)$ denote its inventory level as a function of time and $u(\cdot) \geq 0$ the production rate. $\xi$ will denote the constant demand rate, and $x_1$ and $u_1$ the factory-optimal inventory level and production rate, respectively. The inventory process is modelled as the controlled diffusion

$$dX_t = \big(u_t - \xi\big)\,dt + \sigma\,dW_t, \quad \forall t \geq 0 \text{ a.s.}, \tag{5.3}$$

where $\sigma$ is a constant. The aim is to minimize, over non-anticipative $u(\cdot)$, the discounted cost

$$J(x, u) = E\Big[\int_0^{+\infty} e^{-\alpha t}\Big(c\big(u_t - u_1\big)^2 + h\big(X_t - x_1\big)^2\Big)\,dt\Big], \tag{5.4}$$

where $c, h > 0$ are known coefficients for the production cost and the inventory holding cost, respectively, and $\alpha > 0$ is a discount factor.

The minimization task is converted to maximization by taking $-J(x, u)$ as the functional. The Hamiltonian of this control problem is

$$H(x, u, y, z) = (u - \xi)y + \sigma z - c(u - u_1)^2 - h(x - x_1)^2 - \alpha x y, \tag{5.5}$$

which is a concave function in $(x, u)$. The driver of the backward adjoint equation is obtained as the derivative of $H$ w.r.t. $x$, i.e.

$$h(x, u, y, z) = \frac{\partial}{\partial x}H(x, u, y, z) = -2h(x - x_1) - \alpha y, \tag{5.6}$$

and the associated BSDE is

$$Y_t = \int_t^{+\infty}\big(-2h(X_s - x_1) - \alpha Y_s\big)\,ds - \int_t^{+\infty} Z_s\,dW_s, \quad \forall t \geq 0 \text{ a.s.} \tag{5.7}$$

To find the extremal (maximal) point of $H$ we set

$$\frac{\partial}{\partial u}H(x, u, y, z) = y - 2c(u - u_1) = 0, \tag{5.8}$$

which leads to

$$\hat{u}_t = \frac{\hat{Y}_t}{2c} + u_1. \tag{5.9}$$
Further, we will try to find the solution to the BSDE (5.7) in feedback form, which will lead to a Markov control. Let us assume that

$$Y_t = \varphi_t \cdot X_t + \psi_t, \quad \forall t \geq 0 \text{ a.s.}, \tag{5.10}$$

for some deterministic functions $\varphi, \psi \in C^1$. Applying the Itô formula to the process $Y_t$ we get

$$dY_t = \varphi'_t X_t\,dt + \varphi_t\,dX_t + \psi'_t\,dt = \varphi'_t X_t\,dt + \varphi_t\big(u_t - \xi\big)\,dt + \varphi_t\sigma\,dW_t + \psi'_t\,dt. \tag{5.11}$$

Now, comparing the differentials in (5.7) and (5.11), we obtain the set of equations

$$\varphi'_t X_t + \varphi_t\big(u_t - \xi\big) + \psi'_t = 2h(X_t - x_1) + \alpha Y_t = 2h(X_t - x_1) + \alpha\varphi_t X_t + \alpha\psi_t, \tag{5.12}$$
$$\varphi_t\sigma = Z_t. \tag{5.13}$$

It is seen that once we have the functions $\varphi, \psi$, we can compute the processes $Y_t, Z_t$ and finally the optimal control $\hat{u}_t$. To proceed, we express the control $u_t$ from equation (5.12) and compare it to the expression in (5.9), which leads to

$$\frac{\varphi_t X_t + \psi_t}{2c} + u_1 = \frac{1}{\varphi_t}\Big(X_t\big(\alpha\varphi_t + 2h - \varphi'_t\big) + \alpha\psi_t - 2hx_1 - \psi'_t\Big) + \xi. \tag{5.14}$$

Finally, comparing the terms which correspond to the powers of $X_t$, i.e. $X_t^1$ and $X_t^0$, we obtain an ODE system for the functions $\varphi, \psi$ which can be solved without any problems:

$$\frac{1}{2c}\varphi_t = \frac{1}{\varphi_t}\big(\alpha\varphi_t + 2h - \varphi'_t\big), \tag{5.15}$$
$$\frac{1}{2c}\psi_t + u_1 = \frac{1}{\varphi_t}\big(\alpha\psi_t - 2hx_1 - \psi'_t\big) + \xi. \tag{5.16}$$

We see that the first equation is of the Riccati type, as expected.
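
For a numerical illustration of the stationary (infinite horizon) regime, one can set $\varphi'_t = \psi'_t = 0$ in (5.15)-(5.16) and solve the resulting algebraic system. The sketch below does this for arbitrary illustrative parameter values; choosing the negative root of the quadratic equation for $\varphi$ is our assumption, motivated by the requirement that the feedback (5.9) make the closed-loop inventory mean-reverting (and hence the discounted cost finite).

```python
import numpy as np

# Hedged sketch of the stationary regime of (5.15)-(5.16), assuming c, h > 0.
# Rearranging (5.15) gives phi' = alpha*phi + 2h - phi**2/(2c), so a constant
# solution solves phi**2/(2c) - alpha*phi - 2h = 0. We take the negative root:
# then the closed-loop drift u_hat - xi has slope phi/(2c) < 0 in X, i.e. the
# controlled inventory mean-reverts. All numerical values are illustrative.

alpha, c, h = 0.5, 1.0, 2.0
xi, u1, x1 = 1.0, 1.0, 5.0

phi = c * alpha - np.sqrt(c**2 * alpha**2 + 4.0 * c * h)  # stationary root of (5.15)

# (5.16) with psi' = 0 is linear in psi:  psi/(2c) + u1 = (alpha*psi - 2h*x1)/phi + xi
psi = (2.0 * h * x1 / phi + u1 - xi) / (alpha / phi - 1.0 / (2.0 * c))

def u_opt(x):
    """Candidate feedback control from (5.9) and the ansatz (5.10)."""
    return (phi * x + psi) / (2.0 * c) + u1

# Here u1 == xi, so the stationary point of the closed loop is exactly x1:
print(phi, psi, u_opt(x1))   # u_opt(x1) == xi
```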

The author wishes to thank Prof. Maslowski for his help, guidance and encouragement throughout this work.

References

[1] Øksendal B., Sulem A. and Framstad N. C. A sufficient stochastic maximum principle for optimal control of jump diffusions and applications to finance. J. Optimization Theory and Applications, 121, 77-98, 2004. Errata: J. Optimization Theory and Applications, 124, 511-512, 2005.

[2] Øksendal B. Stochastic Differential Equations. An Introduction with Applications, 4th edition. Springer-Verlag, Berlin, 1995. ISBN 3-540-60243-7 (Universitext).

[3] Pardoux E. BSDEs, weak convergence and homogenization of semilinear PDEs. In: Clarke F. H., Stern R. J. (Eds.), Nonlinear Analysis, Differential Equations and Control, Kluwer Academic, Dordrecht, 503-549, 1999.

[4] Pardoux E. and Peng S. G. Adapted solution of a backward stochastic differential equation. Systems & Control Letters, 14, 55-61, 1990.

[5] Yin J. On solutions of a class of infinite horizon FBSDEs. Statistics and Probability Letters, 78, 2412-2419, 2008.

[6] Kushner H. J. Necessary conditions for continuous parameter stochastic optimization problems. SIAM J. Control, 10, 550-565, 1972.

[7] Bismut J.-M. Conjugate convex functions in optimal stochastic control. J. Math. Anal. Appl., 44, 384-404, 1973.

[8] Bensoussan A. Maximum principle and dynamic programming approaches of the optimal control of partially observed diffusions. Stochastics, 9, 169-222, 1983.

[9] Peng S. G. A general stochastic maximum principle for optimal control problems. SIAM J. Control Optim., 28, 966-979, 1990.

[10] Cadenillas A. and Haussmann U. G. The stochastic maximum principle for a singular control problem. Stochastics Stochastics Rep., 49, 211-237, 1994.

[11] Tang S. J. and Li X. J. Necessary conditions for optimal control of stochastic systems with random jumps. SIAM J. Control Optim., 32, 1447-1475, 1994.

[12] Pham H. Continuous-time stochastic control and optimization with financial applications. Stochastic Modelling and Applied Probability 61, Springer-Verlag, Berlin, 2009.

[13] Borkar V. S. Controlled diffusion processes. Probability Surveys, 2, 213-244, 2005.
