
Jan A. Van Casteren

Markov processes, Feller semigroups and evolution equations

– Monograph –

April 29, 2008

Springer

The author dedicates this book to his mathematics
teacher Rudi Hirschfeld on the occasion of his 80th birthday.
Preface

Writing the present book has been a long-term project which emerged more
than five years ago. One of the main sources of inspiration was a mini-course
which the author taught at Monopoli (University of Bari, Italy). This course
was based on the text in [241]. The main theorem of the present book (Theorem 1.39), but phrased in the locally compact setting, was a substantial
part of that course. The title of the conference was International Summer
School on Operator Methods for Evolution Equations and Approximation
Problems, Monopoli (Bari), September 15–22, 2002. The mini-course was entitled “Markov processes and Feller semigroups”. Other papers which can be
considered as predecessors of the present book are [238, 240, 247, 248]. In
this book a Polish state space replaces the locally compact state space of the
more classical literature on the subject. A Polish space is separable and completely
metrizable. Important examples of such spaces are separable Banach and
Fréchet spaces. The generators of the Markov processes or diffusions which
play a central role in the present book could be associated with stochastic
differential equations in a Banach space. In the formulation of our results we
avoid the use of the metric which turns the state space into a completely metrizable space; see e.g. Propositions 3.23 and 8.8. As a rule of thumb we phrase
results in terms of (open) subsets rather than using a metric. As one of the
highlights of the book we mention Theorem 1.39 and everything surrounding
it. This theorem gives an important relationship between the following con-
cepts: probability transition functions with the (strong) Feller property, strong
Markov processes, martingale problems, generators of Markov processes, and
uniqueness of Markov extensions. In this approach the classical uniform topol-
ogy is replaced by the so-called strict topology. A sequence of bounded contin-
uous functions converges for the strict topology if it is uniformly bounded, and
if it converges uniformly on compact subsets. It can be described by means of
a certain family of semi-norms which turns the space of bounded continuous
functions into a sequentially complete locally convex separable vector space.
Its topological dual consists of genuine complex measures on the state space.
This is the main reason that the whole machinery works. The second chapter
contains the proofs of the main theorem. The original proof for the locally
compact case, as exhibited in e.g. [34], cannot just be copied. Since we deal
with a relatively large state space every single step has to be reproved. Many
results are based on Proposition 2.2 which ensures that the orbits of our pro-
cess have the right compactness properties. If we talk about equi-continuity,
then we mean equi-continuity relative to the strict topology: see e.g. Theorem
1.7, Definition 1.16, Theorem 1.18, Corollary 1.19, Proposition 2.4, Corollary
2.4, Corollary 2.10, equation (3.114). In §3.4 a general criterion is given in or-
der that the sample paths of the Markov process are almost-surely continuous.
In addition this section contains a number of results pertaining to dissipativ-
ity properties of its generator: see e.g. Proposition 3.11. A discussion of the
maximum principle is found here: see e.g. Lemma 3.22 and Proposition 3.23.
In Section 3.3 we discuss Korovkin properties of generators. This notion is
closely related to the range property of a generator. In Section 3.5 we discuss
(measurability) properties of hitting times. In Chapters 4 and 5 we discuss
backward stochastic differential equations for diffusion processes. A highlight
in Chapter 4 is a new way to prove the existence of solutions. It is based on
a homotopy argument as explained in Theorem 1 (page 87) in Crouzeix et
al [63]: see Proposition 4.36, Corollary 4.37 and Remark 4.38. A martingale
which plays an important role in Chapter 5 is depicted in formula (5.2). A
basic result is Theorem 5.1. In Chapter 6 we discuss, for a time-homogeneous
process, a version of the Hamilton-Jacobi-Bellman equation. Interesting theorems are the Noether theorems 6.13 and 6.17. In Chapters 7, 8, and 9 the
long time behavior of a recurrent time-homogeneous Markov process is inves-
tigated. Chapter 7 is analytic in nature; it is inspired by the Ph.D. thesis
of Katilova [129]. Chapter 8 describes a coupling technique from Chen and
Wang [55]: see Theorem 8.3 and Corollary 8.4. The problem raised by Chen
and Wang (see §8.3) about the boundedness of the diffusion matrix can be
partially solved by using a Γ2 -condition instead of condition (8.5) in Theorem
8.3 without violating the conclusion in (8.6): see Theorem 8.71 and Example
8.77, Proposition 8.79 and the formulas (8.247) and (8.248). For more details
see Remark 8.41 and inequality (8.149) in Remark 8.53. Furthermore Chapter
8 contains a number of results related to the existence of an invariant σ-
additive measure for our recurrent Markov process. For example, in Theorem
8.8 conditions are given in order that there exist compact recurrent sub-
sets. This property has far-reaching consequences: see e.g. Proposition 8.16,
Theorem 8.18, and Proposition 8.24. Results about uniqueness of invariant
measures are obtained: see Corollary 8.35. The results about recurrent sub-
sets and invariant measures are due to Seidler [207]. Poincaré type inequalities
are proved: see Propositions 8.55 and 8.73, and Theorem 8.18. The results
on the Γ2 -condition are taken from Bakry [16, 17], and Ledoux [144]. In Chap-
ter 9 we collect some properties of relevant martingales. In addition, we prove
the existence and uniqueness of an irreducible invariant measure: see Theorem
9.12 and the results in §9.3. In Theorem 9.25 we follow Kaspi and Mandelbaum
[127] to give a precise relationship between Harris recurrence and recurrence
phrased in terms of hitting times. Theorem 9.36 is the most important one
for readers interested in an existence proof of a σ-additive invariant measure
which is unique up to a multiplicative constant. Assertion (e) of Proposition
9.40 together with Orey’s theorem for Markov chains (see Theorem 9.4) yields
the interesting consequence that, up to multiplicative constants, σ-finite in-
variant measures are unique. In §9.4 Orey’s theorem is proved for recurrent
Markov chains. In the proof we use a version of the bivariate linked forward
recurrence time chain as explained in Lemma 9.50. We also use Nummelin’s
splitting technique: see [162], §5.1 (and §17.3.1). The proof of Orey’s theo-
rem is based on Theorems 9.53 and 9.62. The results of Chapter 9 go back to Meyn
and Tweedie [162] for time-homogeneous Markov chains and Seidler [207] for
time-homogeneous Markov processes.

Interdependence

From the above discussion it is clear how the chapters in this book are related.
Chapter 1 is a prerequisite for all the others except Chapter 7. Chapter 2
contains the proofs of the main results in Chapter 1; it can be skipped at a
first reading. Chapter 3 contains material very much related to the contents
of the first chapter. Chapter 5 is a direct continuation of Chapter 4, and is somewhat
difficult to read and comprehend without the knowledge of the contents of
Chapter 4. Chapter 6 is more or less independent of the other chapters in
Part II. For the most part Chapter 7 is independent of the other chapters: most of
the results are phrased and proved for a finite-dimensional state space. The
chapters 8 and 9 are very much interrelated. Some results in Chapter 8 are
based on results in Chapter 9. In particular this is true for those results which
use the existence of an invariant measure. A complete proof of existence and
uniqueness is given in Chapter 9, Theorem 9.36. As a general prerequisite for
understanding and appreciating this book a thorough knowledge of probability
theory, in particular the concept of the Markov property, combined with a
comprehensive notion of functional analysis is very helpful. On the other hand
most topics are explained from scratch.

Acknowledgement

Part of this work was presented at a Colloquium at the University of Gent,
October 14, 2005, on the occasion of the 65th birthday of Richard Delanghe
and appeared in a very preliminary form in [244]. Some results were also pre-
sented at the University of Clausthal, on the occasion of Michael Demuth’s
60th birthday, September 10–11, 2006, and at a Conference in Marrakesh, Mo-
rocco, “Marrakesh World Conference on Differential Equations and Applica-
tions”, June 15–20, 2006. Some of this work was also presented at a Conference
on “The Feynman Integral and Related Topics in Mathematics and Physics:
In Honor of the 65th Birthdays of Gerry Johnson and David Skoug”, Lincoln,
Nebraska, May 12–14, 2006. Finally, another preliminary version was pre-
sented during a Conference on Evolution Equations, in memory of G. Lumer,
at the Universities of Mons and Valenciennes, August 28–September 1, 2006.
The author also has presented some of this material during a colloquium at
the University of Amsterdam (December 21, 2007), and at the AMS Special
Session on the Feynman Integral in Mathematics and Physics, II, on January
9, 2008, in the Convention Center in San Diego, CA.
The author is obliged to the University of Antwerp (UA) and FWO Flan-
ders (Grant number 1.5051.04N) for their financial and material support. He
was also very fortunate to have discussed part of this material with Karel
in’t Hout (University of Antwerp), who provided some references with a cru-
cial result about a surjectivity property of one-sided Lipschitz mappings: see
Theorem 1 in Crouzeix et al [63]. Some aspects concerning this work, like
backward stochastic differential equations, were discussed during a conversation
with Étienne Pardoux (CMI, Université de Provence, Marseille); the author is
grateful for his comments and advice. The author is indebted to J.-C. Zambrini
(Lisboa) for interesting discussions on the subject and for some references. In
addition, the information and explanation given by Willem Stannat (Technical
University Darmstadt) while he visited Antwerp are gratefully acknowledged.
In particular this is true for topics related to asymptotic stability: see Chap-
ter 8. The author is very much obliged to Natalia Katilova who has given
the ideas of Chapter 7; she is to be considered as a co-author of this chapter.
Finally, this work was part of the ESF program “Global”.

Key words and phrases, subject classification

Some key words and phrases are: backward stochastic differential equation,
parabolic equations of second order, Markov processes, Markov chains, ergod-
icity conditions, Orey’s theorem, theorem of Chacon-Ornstein, invariant mea-
sure, Korovkin properties, maximum principle, Kolmogorov operator, squared
gradient operator, martingale theory.
AMS Subject classification [2000]: 60H99, 35K20, 46E10, 60G46, 60J25.

Antwerp, Jan A. Van Casteren


February 2008
Contents

Part I Strong Markov processes

1 Strong Markov processes on Polish spaces . . . . . . . . . . . . . . . . . 3


1.1 Strict topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Theorem of Daniell-Stone . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.2 Measures on Polish spaces . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1.3 Integral operators on the space of bounded continuous
functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.2 Strong Markov processes and Feller evolutions . . . . . . . . . . . . . . 27
1.2.1 Generators of Markov processes and maximum principles 31
1.3 Strong Markov processes: main result . . . . . . . . . . . . . . . . . . . . . . 35
1.3.1 Some historical remarks and references . . . . . . . . . . . . . . . 40
1.4 Dini’s lemma, Scheffé’s theorem, and the monotone class
theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
1.4.1 Dini’s lemma and Scheffé’s theorem . . . . . . . . . . . . . . . . . . 42
1.4.2 Monotone class theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

2 Strong Markov processes: proof of main result . . . . . . . . . . . . . 47


2.1 Proof of the main result: Theorem 1.39 . . . . . . . . . . . . . . . . . . . . . 47
2.1.1 Proof of item (a) of Theorem 1.39 . . . . . . . . . . . . . . . . . . . 47
2.1.2 Proof of item (b) of Theorem 1.39 . . . . . . . . . . . . . . . . . . . 69
2.1.3 Proof of item (c) of Theorem 1.39 . . . . . . . . . . . . . . . . . . . 72
2.1.4 Proof of item (d) of Theorem 1.39 . . . . . . . . . . . . . . . . . . . 76
2.1.5 Proof of item (e) of Theorem 1.39 . . . . . . . . . . . . . . . . . . . 94
2.1.6 Some historical remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

3 Space-time operators and miscellaneous topics . . . . . . . . . . . . . 99


3.1 Space-time operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.2 Dissipative operators and maximum principle . . . . . . . . . . . . . . . 111
3.3 Korovkin property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
3.4 Continuous sample paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
3.5 Measurability properties of hitting times . . . . . . . . . . . . . . . . . . . 150


3.5.1 Some side remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3.5.2 Some related remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

Part II Backward Stochastic Differential Equations

4 Feynman-Kac formulas, backward stochastic differential


equations and Markov processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.2 A probabilistic approach: weak solutions . . . . . . . . . . . . . . . . . . . . 191
4.3 Existence and Uniqueness of solutions to BSDE’s . . . . . . . . . . . . 194
4.4 Backward stochastic differential equations and Markov
processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

5 Viscosity solutions, backward stochastic differential


equations and Markov processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
5.1 Comparison theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
5.2 Viscosity solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
5.3 Backward stochastic differential equations in finance . . . . . . . . . 248
5.4 Some related remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

6 The Hamilton-Jacobi-Bellman equation and the stochastic


Noether theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
6.2 The Hamilton-Jacobi-Bellman equation and its solution . . . . . . 258
6.3 The Hamilton-Jacobi-Bellman equation and viscosity solutions 267
6.4 A stochastic Noether theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
6.4.1 Classical Noether theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 289
6.4.2 Some problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

Part III Long time behavior

7 On non-stationary Markov processes and Dunford


projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
7.2 Kolmogorov operators and weak∗ -continuous semigroups . . . . . 296
7.3 Kolmogorov operators and analytic semigroups . . . . . . . . . . . . . . 301
7.3.1 Ornstein-Uhlenbeck process . . . . . . . . . . . . . . . . . . . . . . . . . 318
7.4 Ergodicity in the non-stationary case . . . . . . . . . . . . . . . . . . . . . . 357
7.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
8 Coupling methods and Sobolev type inequalities . . . . . . . . . . . 383


8.1 Coupling methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
8.2 Some related stability results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
8.3 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457

9 Miscellaneous topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459


9.1 Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
9.2 Stopping times and time-homogeneous Markov processes . . . . . 463
9.3 Markov Chains: invariant measure . . . . . . . . . . . . . . . . . . . . . . . . . 465
9.3.1 Some definitions and results . . . . . . . . . . . . . . . . . . . . . . . . 465
9.3.2 Construction of an invariant measure . . . . . . . . . . . . . . . . 479
9.4 A proof of Orey’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
9.5 About invariant (or stationary) measures . . . . . . . . . . . . . . . . . . . 552
9.6 Weak and strong solutions to stochastic differential equations . 553

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
Part I

Strong Markov processes


1
Strong Markov processes on Polish spaces

1.1 Strict topology

Throughout this book E stands for a completely metrizable separable
topological space, i.e. E is a Polish space. The Borel field of E is denoted by
E. We write Cb (E) for the space of all complex valued bounded continu-
ous functions on E. The space Cb (E) is equipped with the supremum norm:
∥f∥∞ = sup_{x∈E} |f(x)|, f ∈ Cb (E). The space Cb (E) will be endowed with
a second topology which will be used to describe the continuity properties.
This second topology, which is called the strict topology, is denoted as the
Tβ-topology. The strict topology is generated by the semi-norms of the form pu,
where u varies over H(E), and where pu(f) = sup_{x∈E} |u(x)f(x)| = ∥uf∥∞,
f ∈ Cb (E). Here a function u belongs to H(E) if u is bounded and if for every
real number α > 0 the set {|u| ≥ α} = {x ∈ E : |u(x)| ≥ α} is contained in
a compact subset of E. It is noticed that Buck [44] was the first author who
introduced the notion of strict topology (in the locally compact setting). He
used the notation β instead of Tβ .
Remark 1.1. Let H + (E) be the collection of those functions u ∈ H(E) with
the following properties: u ≥ 0 and for every α > 0 the set {u ≥ α} is a
compact subset of E. Then every function u ∈ H + (E) is bounded, and the
strict topology is also generated by semi-norms of the form {pu : u ∈ H + (E)}.
Every u ∈ H + (E) attains its supremum at some point x ∈ E. Moreover
a sequence (fn)n∈N converges to a function f ∈ Cb(E) for the strict topology
if and only if it is uniformly bounded and for every compact subset K of E
the equality lim_{n→∞} sup_{x∈K} |fn(x) − f(x)| = 0 holds. Since Tβ -convergent se-
quences are Tβ -bounded, from Proposition 1.3 below it follows that a Tβ -
convergent sequence is uniformly bounded. The same conclusion is true for
Tβ -Cauchy sequences. Moreover, a Tβ -Cauchy sequence (fn )n∈N converges to a
bounded function f . Such a sequence converges uniformly on compact subsets
of the space E. Since the space E is polish, it follows that the limit function
f is continuous. Consequently, the space (Cb (E), Tβ ) is sequentially complete.
Observe that continuity properties of functions f ∈ Cb (E) can be formulated
in terms of convergent sequences in E which are contained in compact subsets
of E. The topology of uniform convergence on Cb (E) is denoted by Tu .
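The convergence notion just described can be illustrated numerically. The sketch below is a hedged Python illustration under assumptions not taken from the text: E = ℝ, fn(x) = min(|x|/n, 1), and the weight u(x) = 1/(1 + x²), which belongs to H(ℝ) because every set {u ≥ α} is a closed bounded interval. The sequence (fn) converges to 0 for the strict topology but not for the uniform topology.

```python
import numpy as np

# E = R is a non-compact Polish space.  The functions f_n(x) = min(|x|/n, 1)
# are uniformly bounded by 1 and converge to 0 uniformly on every compact
# subset of R, hence f_n -> 0 for the strict topology T_beta, even though
# ||f_n||_inf = 1 for every n, so there is no uniform convergence.
x = np.linspace(-50.0, 50.0, 20001)          # grid filling the compact [-50, 50]

def f(n, x):
    return np.minimum(np.abs(x) / n, 1.0)

u = 1.0 / (1.0 + x ** 2)                     # u in H(R): each {u >= a} is compact

ns = (1, 5, 25)
sup_norms = [float(f(n, x).max()) for n in ns]          # uniform norms of f_n
seminorms = [float((u * f(n, x)).max()) for n in ns]    # p_u(f_n) = ||u f_n||_inf

print(sup_norms)    # stays at 1.0: no convergence for the uniform topology
print(seminorms)    # decreases to 0: convergence for the strict topology
```

The uniform norms stay at 1 while the semi-norms pu(fn) decrease to 0, in agreement with the description of Tβ-convergence above.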

1.1.1 Theorem of Daniell-Stone

In Proposition 1.3 below we need the following theorem. It says that an
abstract integral is a concrete integral. Theorem 1.2 will be applied with
S = E, H = Cb+ (E), the collection of non-negative functions in Cb (E), and
for I : Cb+ (E) → [0, ∞) we take the restriction to Cb+ (E) of a non-negative linear
functional defined on Cb (E) which is continuous with respect to the strict
topology.
Theorem 1.2 (Theorem of Daniell-Stone). Let S be any set, and let H
be a non-empty collection of functions on S with the following properties:
(1) If f and g belong to H, then the functions f + g, f ∨ g and f ∧ g belong
to H as well;
(2) If f ∈ H and α is a non-negative real number, then αf , f ∧ α, and
(f − α)+ = (f − α) ∨ 0 belong to H;
(3) If f , g ∈ H are such that f ≤ g ≤ 1, then g − f belongs to H.
Let I : H → [0, ∞] be an abstract integral in the sense that I is a mapping
which possesses the following properties:
(4) If f and g belong to H, then I (f + g) = I(f ) + I(g);
(5) If f ∈ H and α ≥ 0, then I (αf ) = αI(f );
(6) If (fn )n∈N is a sequence in H which increases pointwise to f ∈ H, then
I (fn ) increases to I (f ).
Then there exists a non-negative σ-additive measure µ on the σ-field generated
by H, which is denoted by σ(H), such that I(f) = ∫ f dµ for f ∈ H. If there
exists a countable family of functions (fn)n∈N ⊂ H such that I(fn) < ∞ for
all n ∈ N, and such that S = ⋃_{n=1}^∞ {fn > 0}, then the measure µ is unique.

Proof. Define the collection H ∗ of functions on S as follows. A function f :
S → [0, ∞] belongs to H ∗ provided there exists a sequence (fn )n∈N ⊂ H
which increases pointwise to f . Then the subset H ∗ has the properties (1)
and (2) with H ∗ instead of H. Define the mapping I ∗ : H ∗ → [0, ∞] by

I∗(f) = lim_{n→∞} I(fn),   f ∈ H∗,

where (fn )n∈N ⊂ H is a sequence which pointwise increases to f . The defini-
tion does not depend on the choice of the increasing sequence (fn )n∈N ⊂ H.
In fact let (fn )n∈N and (gn )n∈N be sequences in H which both increase to
f ∈ H ∗ . Then by (6) we have

lim_{n→∞} I(fn) = sup_{n∈N} I(fn) = sup_{n∈N} sup_{m∈N} I(fn ∧ gm) = sup_{m∈N} sup_{n∈N} I(fn ∧ gm)
= sup_{m∈N} I(gm) = lim_{m→∞} I(gm).   (1.1)

From (1.1) it follows that I ∗ is well-defined. The functional I ∗ : H ∗ → [0, ∞]
has the properties (4), (5), and (6) (somewhat modified) with H ∗ instead of
H and I replaced by I ∗ . In fact the correct version of (6) for H ∗ reads as
follows:
(6∗) Let (fn)n∈N be a sequence in H∗ which increases pointwise to a function
f . Then f ∈ H ∗ , and I ∗ (fn ) increases to I ∗ (f ).
We also have the following assertion:
(3∗ ) Let f and g ∈ H ∗ be such that f ≤ g. Then I ∗ (f ) ≤ I ∗ (g).
We first prove (3∗ ) if f and g belong to H and f ≤ g. From (6), (3) and (4)
we get

I(g) = sup_{m∈N} I(g ∧ m) = sup_{m∈N} (I(g ∧ m − f ∧ m) + I(f ∧ m))
≥ sup_{m∈N} I(f ∧ m) = I(f).   (1.2)

Here we used the fact that by (3) the functions g ∧ m − f ∧ m, m ∈ N,
belong to H. Next let f and g be functions in H∗ such that f ≤ g. Then there
exist increasing sequences (fn )n∈N and (gn )n∈N in H such that fn converges
pointwise to f ∈ H ∗ and gn to g ∈ H ∗ . Then

I∗(f) = sup_{n∈N} I(fn) ≤ sup_{n∈N} I(fn ∨ gn) = I∗(g).   (1.3)

Next we prove (6∗ ). Let (fn )n∈N be a pointwise increasing sequence in H ∗ ,
and put f = supn∈N fn . Choose for every n ∈ N an increasing sequence
(fn,m )m∈N ⊂ H such that supm∈N fn,m = fn . Define the functions gm , m ∈ N,
by
gm = f1,m ∨ f2,m ∨ · · · ∨ fm,m .
Then gm+1 ≥ gm and gm ∈ H for all m ∈ N. In addition, we have

sup_{m∈N} gm = sup_{m∈N} max_{1≤n≤m} fn,m = sup_{n∈N} sup_{m≥n} fn,m = sup_{n∈N} fn = f.   (1.4)

Hence f ∈ H ∗ . For 1 ≤ n ≤ m the inequalities fn,m ≤ fn ≤ fm hold pointwise,
and hence gm ≤ fm . From (3∗ ) we infer

I∗(f) = sup_{m∈N} I(gm) = sup_{m∈N} I∗(gm) ≤ sup_{m∈N} I∗(fm) ≤ I∗(f),   (1.5)

and thus supm∈N I ∗ (fm ) = I ∗ (f ).


Next we will get closer to measure theory. Therefore we define the col-
lection G of subsets of S by G = {G ⊂ S : 1G ∈ H ∗ }, and the mapping
µ : G → [0, ∞] by µ (G) = I ∗ (1G ), G ∈ G. The mapping µ possesses the
following properties:
(1′) If the subsets G1 and G2 belong to G, then the same is true for the
subsets G1 ∩ G2 and G1 ∪ G2 ;
(2′) ∅ ∈ G;
(3′) If the subsets G1 and G2 belong to G and if G1 ⊂ G2 , then µ(G1 ) ≤ µ(G2 );
(4′) If the subsets G1 and G2 belong to G, then the following strong additivity
holds: µ(G1 ∩ G2 ) + µ(G1 ∪ G2 ) = µ(G1 ) + µ(G2 );
(5′) µ(∅) = 0;
(6′) If (Gn )n∈N is a sequence in G such that Gn+1 ⊃ Gn , n ∈ N, then
⋃_{n∈N} Gn belongs to G and µ(⋃_{n∈N} Gn ) = sup_{n∈N} µ(Gn ).

These properties are more or less direct consequences of the corresponding
properties of I ∗ : (1∗ )–(6∗ ).
Using the mapping µ we will define an exterior or outer measure µ∗
on the collection of all subsets of S. Let A be any subset of S. Then
we put µ∗ (A) = ∞ if for no G ∈ G we have A ⊂ G, and we write
µ∗ (A) = inf {µ (G) : G ∈ G, G ⊃ A}, if A ⊂ G0 for some G0 ∈ G. Then
µ∗ has the following properties:
(i) µ∗ (∅) = 0;
(ii) µ∗ (A) ≥ 0, for all subsets A of S;
(iii) µ∗(A) ≤ µ∗(B) whenever A and B are subsets of S for which A ⊂ B;
(iv) µ∗(⋃_{n=1}^∞ An) ≤ Σ_{n=1}^∞ µ∗(An) for any sequence (An)n∈N of subsets of S.

The assertions (i), (ii) and (iii) follow directly from the definition of µ∗ .
In order to prove (iv) we choose a sequence (An )n∈N , An ⊂ S, such that
µ∗(An) < ∞ for all n ∈ N. Fix ε > 0, and choose for every n ∈ N a subset
Gn of S which belongs to G and which has the following properties: An ⊂ Gn
and µ(Gn) ≤ µ∗(An) + ε2^{−n}. By the equality ⋃_{n∈N} Gn = ⋃_{m∈N} ⋃_{n=1}^m Gn we
see that ⋃_{n∈N} Gn belongs to G. From the properties of an exterior measure
we infer the following sequence of inequalities:
µ∗(⋃_{n∈N} An) ≤ µ∗(⋃_{n∈N} Gn) = µ(⋃_{n∈N} Gn) = sup_{m∈N} µ(⋃_{n=1}^m Gn)
= sup_{m∈N} I∗(1_{⋃_{n=1}^m Gn}) ≤ sup_{m∈N} I∗(Σ_{n=1}^m 1_{Gn}) = sup_{m∈N} Σ_{n=1}^m I∗(1_{Gn})
= sup_{m∈N} Σ_{n=1}^m µ(Gn) ≤ Σ_{n=1}^∞ (µ∗(An) + ε2^{−n}) = Σ_{n=1}^∞ µ∗(An) + ε.   (1.6)

Since ε > 0 was arbitrary we see that µ∗(⋃_{n∈N} An) ≤ Σ_{n=1}^∞ µ∗(An). Hence
assertion (iv) follows.
Next we consider the σ-field D which is associated to the exterior measure
µ∗ , and which is defined by
D = {A ⊂ S : µ∗(D) ≥ µ∗(A ∩ D) + µ∗(Ac ∩ D) for all D ⊂ S}
  = {A ⊂ S : µ(D) ≥ µ∗(A ∩ D) + µ∗(Ac ∩ D) for all D ∈ G with µ(D) < ∞}.   (1.7)

Here we wrote Ac = S \ A for the complement of A in S. The reader is
invited to check the equality in (1.7). According to Carathéodory’s theorem
the exterior measure µ∗ restricted to the σ-field D is a σ-additive measure. We
will prove that D contains G. Therefore pick G ∈ G, and consider for D ∈ G
for which µ(D) < ∞ the equality

µ∗(G ∩ D) + µ∗(Gc ∩ D) = µ(G ∩ D) + inf {µ(U) : U ∈ G, U ⊃ Gc ∩ D}.   (1.8)

Choose h ∈ H∗ such that h ≥ 1_{Gc∩D}. For 0 < α < 1 we have

1_{Gc∩D} ≤ 1_{h>α} ≤ (1/α) h.
Since 1_{h>α} = sup_{m∈N} 1 ∧ (m(h − α)+) we see that the set {h > α} is a
member of G. It follows that I∗(h) ≥ α µ({h > α}) ≥ α µ∗(Gc ∩ D), and
hence

µ∗(Gc ∩ D) ≤ inf {I∗(h) : h ≥ 1_{Gc∩D}, h ∈ H∗}
≤ inf {I∗(1U) : U ⊃ Gc ∩ D, U ∈ G} = µ∗(Gc ∩ D).   (1.9)

From (1.9) the equality

µ∗(Gc ∩ D) = inf {I∗(h) : h ≥ 1_{Gc∩D}, h ∈ H∗}

follows. Next choose the increasing sequences (fn )n∈N and (gn )n∈N in such a
way that the sequence fn increases to 1D and gn increases to 1G . Define the
functions hn , n ∈ N, by

hn = 1D − fn ∧ gn = sup_{m≥n} {(fm − fn) + (fn − fn ∧ gn)}.

Since the functions fm − fn , m ≥ n, and fn − fn ∧ gn belong to H we see that
hn belongs to H ∗ . Hence we get:

∞ > µ(D) = I∗(1D) = I∗(hn) + I∗(fn ∧ gn) = I∗(hn) + I(fn ∧ gn).   (1.10)

In addition we have hn ≥ 1Gc ∩D . Consequently,

µ∗(G ∩ D) + µ∗(Gc ∩ D) ≤ µ(G ∩ D) + inf_{n∈N} I∗(hn)
= µ(G ∩ D) + µ(D) − sup_{n∈N} I(fn ∧ gn)
= µ(G ∩ D) + µ(D) − µ(G ∩ D) = µ(D).   (1.11)
The equality in (1.11) proves that the σ-field D contains the collection G,
and hence that the mapping µ, which originally was defined on G, is in fact the
restriction to G of a genuine measure defined on the σ-field generated by H,
which is again called µ.
We will show the equality I(f) = ∫ f dµ for all f ∈ H. For f ∈ H we have
∫ f dµ = ∫_0^∞ µ({f > ξ}) dξ = ∫_0^∞ I∗(1_{f>ξ}) dξ = sup_{n∈N} 2^{−n} Σ_{j=1}^{n2^n} I∗(1_{f>j2^{−n}})
= sup_{n∈N} I∗(2^{−n} Σ_{j=1}^{n2^n} 1_{f>j2^{−n}}) = I∗(x ↦ ∫_0^∞ 1_{f>ξ}(x) dξ)
= I∗(f) = I(f).   (1.12)
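The first equality in (1.12) is the layer-cake formula ∫ f dµ = ∫_0^∞ µ({f > ξ}) dξ. For a purely atomic measure it can be checked numerically; in the Python sketch below the atoms, the weights and the function f are arbitrary illustrative choices, not data from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
atoms = rng.uniform(0.0, 1.0, size=5)       # locations of five atoms of mu
weights = rng.uniform(0.0, 2.0, size=5)     # mu({atoms[j]}) = weights[j]

f = lambda x: x ** 2 + 0.5                  # a bounded non-negative function

lhs = float(np.sum(f(atoms) * weights))     # int f dmu, computed directly

# int_0^infty mu({f > xi}) dxi; the integrand vanishes for xi >= max f
xi = np.linspace(0.0, f(atoms).max(), 100001)
mu_level = ((f(atoms)[None, :] > xi[:, None]) * weights[None, :]).sum(axis=1)
rhs = float(np.sum(mu_level[:-1] * np.diff(xi)))   # left Riemann sum

print(abs(lhs - rhs))                       # small: the two integrals agree
```

With the seed fixed, lhs and rhs agree up to the discretization error of the Riemann sum, since ξ ↦ µ({f > ξ}) is a step function with five jumps.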
Finally we will prove the uniqueness of the measure µ. Let µ1 and µ2 be two
measures on σ(H) with the property that I(f) = ∫ f dµ1 = ∫ f dµ2 for all
f ∈ H. Under the extra condition in Theorem 1.2 that there exist countably
many functions (fn)n∈N such that I(fn) < ∞ for all n ∈ N and such that
S = ⋃_{n=1}^∞ {fn > 0}, we shall show that µ1(B) = µ2(B) for all B ∈ σ(H).
Therefore we fix a function f ∈ H for which I(f) < ∞. Then the collection
{B ∈ σ(H) : ∫_B f dµ1 = ∫_B f dµ2} is a Dynkin system containing all sets of
the form {g > β} with g ∈ H and β > 0. Fix ξ > 0, β > 0 and g ∈ H. Then
the functions gm,n := min(m(g − β)+ ∧ 1, n(f − ξ)+ ∧ 1), m, n ∈ N, belong
to H. Then we have
Z
µ1[{g > β} ∩ {f > ξ}] = lim_{m→∞} lim_{n→∞} ∫ gm,n dµ1 = lim_{m→∞} lim_{n→∞} I(gm,n)
= lim_{m→∞} lim_{n→∞} ∫ gm,n dµ2 = µ2[{g > β} ∩ {f > ξ}].   (1.13)

Integrating the extreme terms in (1.13) with respect to the Lebesgue
measure dξ shows the equality ∫_{g>β} f dµ1 = ∫_{g>β} f dµ2. It follows that
the collection {B ∈ σ(H) : ∫_B f dµ1 = ∫_B f dµ2} contains all sets of the form
{g > β} where g ∈ H and β > 0. Such a collection of sets is closed under finite
intersection. Hence, by a Dynkin argument, we infer the equality

{B ∈ σ(H) : ∫_B f dµ1 = ∫_B f dµ2} = σ(H).

The same argument applies with (nf) ∧ 1 replacing f. By letting n tend to
∞ this shows the equality

σ(H) = {B ∈ σ(H) : µ1[B ∩ {f > 0}] = µ2[B ∩ {f > 0}]}.   (1.14)
Since the set H is closed under taking finite maxima, I(f ∨ g) ≤ I(f) + I(g) <
∞ whenever I(f) and I(g) are finite, and S = ⋃_{n=1}^∞ {fn > 0} with I(fn) < ∞,
n ∈ N, we see that
µ1(B) = lim_{n→∞} µ1[B ∩ {max_{1≤j≤n} fj > 0}]
= lim_{n→∞} µ2[B ∩ {max_{1≤j≤n} fj > 0}] = µ2(B)   (1.15)

for B ∈ σ(H).
This finishes the proof of Theorem 1.2.
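What Theorem 1.2 asserts can be made concrete in a minimal finite sketch: on a finite set S every positive linear functional I satisfying (4)–(6) is integration against the measure µ({k}) = I(1_{k}), recovered by evaluating I on indicator functions. The weight sequence w in the Python fragment below is a hypothetical choice used only to build such an I.

```python
# Minimal illustration of Theorem 1.2 on the finite set S = {0, ..., 9}:
# an "abstract integral" I, used only through its values on functions,
# coincides with integration against the measure mu({k}) = I(1_{k}).
S = range(10)
w = [0.5, 1.0, 0.25, 2.0, 0.0, 1.5, 0.75, 3.0, 0.1, 0.9]  # hypothetical weights

def I(f):
    # the abstract integral; a black box as far as the theorem is concerned
    return sum(f(s) * w[s] for s in S)

# recover the measure by evaluating I on indicator functions 1_{k}
mu = {k: I(lambda s, k=k: 1.0 if s == k else 0.0) for k in S}

def integral(f):
    # the concrete integral against the recovered measure mu
    return sum(f(s) * mu[s] for s in S)

f = lambda s: s ** 2 + 1.0
print(I(f), integral(f))   # the two integrals agree, as Theorem 1.2 asserts
```

On an infinite Polish space the same conclusion requires the full construction above (the outer measure µ∗ and the Carathéodory σ-field), but the mechanism is the same: I determines µ on a generating family of sets, and µ then reproduces I.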

1.1.2 Measures on polish spaces

Our first proposition says that the identity mapping f ↦ f sends Tβ-bounded
subsets of Cb(E) to ∥·∥∞-bounded subsets.

Proposition 1.3. Every Tβ-bounded subset of Cb(E) is ∥·∥∞-bounded. On
the other hand the identity is not a continuous operator from (Cb(E), Tβ) to
(Cb(E), ∥·∥∞), provided that E itself is not compact.

Proof. Let B ⊂ Cb(E) be Tβ-bounded. If B were not uniformly bounded, then
there exist sequences (fn)n∈N ⊂ B and (xn)n∈N ⊂ E such that |fn(xn)| ≥ n²,
n ∈ N. Put u(x) = Σ_{n=1}^∞ (1/n) 1_{xn}(x). Then the function u belongs to H(E), but

sup_{f∈B} pu(f) ≥ sup_{n∈N} pu(fn) ≥ sup_{n∈N} u(xn) |fn(xn)| ≥ sup_{n∈N} n = ∞.

The latter shows that the set B is not Tβ-bounded. By contraposition it
follows that Tβ-bounded subsets are uniformly bounded.
Next suppose that E is not compact. Then E contains a sequence (xn)n∈N
without accumulation points, and hence lim_{n→∞} u(xn) = 0 for every function
u ∈ H(E). If the imbedding (Cb(E), Tβ) → (Cb(E), Tu) were continuous,
then there would exist a function u ∈ H+(E) such that ∥f∥∞ ≤ ∥uf∥∞
for all f ∈ Cb(E). Let K be a compact subset of E such that 0 ≤ u(x) ≤ 1/2 for
x ∉ K. Since 1 ≤ ∥u∥∞ = u(x0) for some x0 ∈ E, and since by assumption
E is not compact, we see that K ≠ E. Choose an open neighborhood O of
K, O ≠ E, and a function f ∈ Cb(E) such that 1 − 1O ≤ f ≤ 1 − 1K.
In particular, it follows that f = 1 outside of O, and f = 0 on K. Then
1 = ∥f∥∞ ≤ ∥uf∥∞ ≤ sup_{x∉K} |u(x)f(x)| ≤ (1/2)∥f∥∞ ≤ 1/2. Clearly, this is a
contradiction.
This concludes the proof of Proposition 1.3.

The following proposition shows that the dual of the space (Cb(E), Tβ)
coincides with the space of all complex Borel measures on E.

Proposition 1.4. 1. Let µ be a complex Borel measure on E. Then there exists
a function u ∈ H(E) such that |∫ f dµ| ≤ p_u(f) for all f ∈ Cb(E).
2. Let Λ : Cb(E) → C be a linear functional on Cb(E) which is continuous with
respect to the strict topology. Then there exists a unique complex measure µ
on E such that Λ(f) = ∫ f dµ, f ∈ Cb(E).

Proof. 1. Since on a Polish space every bounded Borel measure is inner
regular, there exists an increasing sequence of compact subsets (K_n)_{n∈N}
in E such that |µ|(E \ K_n) ≤ 2^{−2n−2} |µ|(E), n ∈ N. Fix f ∈ Cb(E). Then
we have

  |∫ f dµ| ≤ Σ_{j=0}^∞ |∫_{K_{j+1}\K_j} f dµ| ≤ Σ_{j=0}^∞ ∫_{K_{j+1}\K_j} |f| d|µ|
           ≤ Σ_{j=0}^∞ ‖1_{K_{j+1}\K_j} f‖∞ |µ|(K_{j+1} \ K_j)
           ≤ Σ_{j=0}^∞ ‖1_{K_{j+1}\K_j} f‖∞ |µ|(E \ K_j)
           ≤ Σ_{j=0}^∞ 2^{−2j−2} ‖1_{K_{j+1}\K_j} f‖∞ |µ|(E)
           ≤ Σ_{j=0}^∞ 2^{−2j−2} 2^{j+1} ‖uf‖∞ ≤ ‖uf‖∞,               (1.16)

where u(x) = Σ_{j=1}^∞ 2^{−j} 1_{K_j}(x) |µ|(E).
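The last two estimates in (1.16) rest on a pointwise lower bound for u which
is worth recording (a short verification, with the convention K_0 = ∅):

```latex
% For x \in K_{j+1} \setminus K_j the sets K_i increase, so x \in K_i
% precisely for i \ge j+1, and therefore
u(x) = \sum_{i=j+1}^{\infty} 2^{-i}\,|\mu|(E) = 2^{-j}\,|\mu|(E).
% Hence, for every x \in K_{j+1} \setminus K_j,
|f(x)|\,|\mu|(E) = 2^{j}\,u(x)\,|f(x)| \le 2^{j}\,\|uf\|_{\infty},
% which gives the bound used in (1.16):
\left\|1_{K_{j+1} \setminus K_j} f\right\|_{\infty} |\mu|(E)
  \le 2^{j}\,\|uf\|_{\infty} \le 2^{j+1}\,\|uf\|_{\infty}.
```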
2. We decompose the functional Λ into a combination of four positive
functionals: Λ = (Re Λ)^+ − (Re Λ)^− + i(Im Λ)^+ − i(Im Λ)^−, where the
linear functionals (Re Λ)^+ and (Re Λ)^− are determined by their action on
positive functions f ∈ Cb(E):

  (Re Λ)^+(f) = sup {Re(Λ(g)) : 0 ≤ g ≤ f, g ∈ Cb(E)},  and
  (Re Λ)^−(f) = sup {Re(−Λ(g)) : 0 ≤ g ≤ f, g ∈ Cb(E)}.

Similar expressions can be employed for the action of (Im Λ)^+ and (Im Λ)^−
on functions f ∈ Cb^+(E). Since the complex linear functional Λ : Cb(E) → C
is Tβ-continuous, there exists a function u ∈ H^+(E) such that
|Λ(f)| ≤ ‖uf‖∞ for all f ∈ Cb(E). Then it easily follows that
|(Re Λ)^+(f)| ≤ ‖uf‖∞ for all real-valued functions in Cb(E), and
|(Re Λ)^+(f)| ≤ √2 ‖uf‖∞ for all f ∈ Cb(E) which in general take complex
values. Similar inequalities hold for (Re Λ)^−(f), (Im Λ)^+(f), and
(Im Λ)^−(f). Let (f_n)_{n∈N} be a sequence of functions in Cb^+(E) which
pointwise increases to a function f ∈ Cb^+(E). Then
lim_{n→∞} Λ(f_n) = Λ(f). This can be seen as follows. Put g_n = f − f_n, and
fix ε > 0. Then the sequence (g_n)_{n∈N} decreases pointwise to 0. Moreover
it is dominated by f. Choose a strictly positive real number α in such a way
that α ‖f‖∞ ≤ ε. Then it follows that

  |Λ(g_n)| ≤ ‖u g_n‖∞ = max (‖u 1_{{u≥α}} g_n‖∞, ‖u 1_{{u<α}} g_n‖∞)
           ≤ max (‖u‖∞ ‖1_{{u≥α}} g_n‖∞, α ‖f‖∞) ≤ ε                  (1.17)

for n ≥ N, where N is chosen so large that ‖u‖∞ ‖1_{{u≥α}} g_n‖∞ ≤ ε for
n ≥ N. By Dini's lemma such a choice of N is possible. An application of
Theorem 1.2 then yields the existence of measures µ_j, 1 ≤ j ≤ 4, defined on
the Baire field of E such that (Re Λ)^+(f) = ∫ f dµ_1,
(Re Λ)^−(f) = ∫ f dµ_2, (Im Λ)^+(f) = ∫ f dµ_3, and (Im Λ)^−(f) = ∫ f dµ_4
for f ∈ Cb(E). It follows that
Λ(f) = ∫ f dµ_1 − ∫ f dµ_2 + i ∫ f dµ_3 − i ∫ f dµ_4 = ∫ f dµ for
f ∈ Cb(E). Here µ = µ_1 − µ_2 + iµ_3 − iµ_4 and each measure µ_j,
1 ≤ j ≤ 4, is finite and positive. Since the space E is Polish, the Baire
field coincides with the Borel field, and hence the measure µ is a complex
Borel measure.
This concludes the proof of Proposition 1.4.

The next corollary gives a sequential continuity characterization of linear
functionals which belong to the space (Cb(E), Tβ)*, the topological dual of
the space Cb(E) endowed with the strict topology. We say that a sequence
(f_n)_{n∈N} ⊂ Cb(E) converges for the strict topology to f ∈ Cb(E) if
lim_{n→∞} ‖u(f − f_n)‖∞ = 0 for all functions u ∈ H^+(E). It follows that a
sequence (f_n)_{n∈N} ⊂ Cb(E) converges to a function f ∈ Cb(E) with respect
to the strict topology if and only if this sequence is uniformly bounded and
lim_{n→∞} ‖1_K(f − f_n)‖∞ = 0 for all compact subsets K of E.
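As a concrete illustration (my own example, not from the text, with E = R):
the travelling bump f_n(x) = exp(−(x − n)²) is uniformly bounded by 1 and
converges to 0 uniformly on every compact set, hence converges to 0 for the
strict topology, while ‖f_n‖∞ = 1 for every n, so there is no uniform
convergence on all of E.

```python
import math

def f(n, x):
    # travelling Gaussian bump centred at n; |f_n| <= 1 for all n
    return math.exp(-(x - n) ** 2)

def sup_on_K(n, K=10.0, steps=2001):
    # grid approximation of the sup of f_n over the compact set [-K, K]
    return max(f(n, -K + 2 * K * i / (steps - 1)) for i in range(steps))

sup_K_n50 = sup_on_K(50)      # tiny: the bump has left [-10, 10]
sup_global_n50 = f(50, 50.0)  # the sup over all of R is attained at x = n

print(sup_K_n50 < 1e-6)  # uniform convergence to 0 on the compact set
print(sup_global_n50)    # stays equal to 1.0: no uniform convergence on R
```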
Corollary 1.5. Let Λ : Cb(E) → C be a linear functional. Then the following
assertions are equivalent:

(1) The functional Λ belongs to (Cb(E), Tβ)*;
(2) lim_{n→∞} Λ(f_n) = 0 whenever (f_n)_{n∈N} is a sequence in Cb^+(E) which
    converges to the zero-function for the strict topology;
(3) There exists a finite constant C ≥ 0 such that |Λ(f)| ≤ C ‖f‖∞ for all
    f ∈ Cb(E), and lim_{n→∞} Λ(g_n) = 0 whenever (g_n)_{n∈N} is a sequence
    in Cb^+(E) which is dominated by a sequence (f_n)_{n∈N} in Cb^+(E) which
    decreases pointwise to 0;
(4) There exists a finite constant C ≥ 0 such that |Λ(f)| ≤ C ‖f‖∞ for all
    f ∈ Cb(E), and lim_{n→∞} Λ(f_n) = 0 whenever (f_n)_{n∈N} is a sequence
    in Cb^+(E) which decreases pointwise to 0;
(5) There exists a complex Borel measure µ on E such that Λ(f) = ∫ f dµ for
    all f ∈ Cb(E).

In (3) we say that a sequence (g_n)_{n∈N} in Cb^+(E) is dominated by a
sequence (f_n)_{n∈N} if g_n ≤ f_n for all n ∈ N.

Proof. (1) =⇒ (2). First suppose that Λ belongs to (Cb(E), Tβ)*. Then there
exists a function u ∈ H^+(E) such that |Λ(f)| ≤ ‖uf‖∞ for all f ∈ Cb(E).
Hence, if the sequence (f_n)_{n∈N} ⊂ Cb^+(E) converges to zero for the
strict topology, then lim_{n→∞} ‖uf_n‖∞ = 0, and so lim_{n→∞} Λ(f_n) = 0.
This proves the implication (1) =⇒ (2).
(2) =⇒ (3). Let (f_n)_{n∈N} be a sequence in Cb(E) which converges to 0 for
the uniform topology. From (2) it follows that the sequences
((Re f_n)^+)_{n∈N}, ((Re f_n)^−)_{n∈N}, ((Im f_n)^+)_{n∈N}, and
((Im f_n)^−)_{n∈N} converge to 0 for the strict topology Tβ, and hence
lim_{n→∞} Λ(f_n) = 0. Consequently, the functional Λ : Cb(E) → C is
continuous if Cb(E) is equipped with the uniform topology, and hence there
exists a finite constant C ≥ 0 such that |Λ(f)| ≤ C ‖f‖∞ for all f ∈ Cb(E).
If (f_n)_{n∈N} is a sequence in Cb^+(E) which decreases to 0, then by Dini's
lemma it converges uniformly on compact subsets of E to 0. Moreover, it is
uniformly bounded, and hence it converges to 0 for the strict topology. If
the sequence (g_n)_{n∈N} ⊂ Cb^+(E) is such that g_n ≤ f_n, then the sequence
(g_n)_{n∈N} converges to 0 for the strict topology as well. Assertion (2)
implies that lim_{n→∞} Λ(g_n) = 0.
(3) =⇒ (4). This implication is trivial.
(3) =⇒ (5). The boundedness of the functional Λ, i.e. the inequality
|Λ(f)| ≤ C ‖f‖∞, f ∈ Cb(E), enables us to write Λ in the form
Λ = Λ_1 − Λ_2 + iΛ_3 − iΛ_4 in such a way that Λ_1 = (Re Λ)^+,
Λ_2 = (Re Λ)^−, Λ_3 = (Im Λ)^+, and Λ_4 = (Im Λ)^−. From the definitions of
these functionals (see the proof of assertion 2 in Proposition 1.4)
assertion (3) implies that lim_{n→∞} Λ_j(f_n) = 0, 1 ≤ j ≤ 4, whenever the
sequence (f_n)_{n∈N} ⊂ Cb^+(E) decreases to 0. From Theorem 1.2 we infer
that each functional Λ_j, 1 ≤ j ≤ 4, can be represented by a Borel measure
µ_j: Λ_j(f) = ∫ f dµ_j, 1 ≤ j ≤ 4, f ∈ Cb(E). It follows that
Λ(f) = ∫ f dµ, f ∈ Cb(E), where µ = µ_1 − µ_2 + iµ_3 − iµ_4.
(4) =⇒ (5). From the apparently weaker hypotheses in assertion (4) compared
to (3) we still have to prove that the functionals Λ_j, 1 ≤ j ≤ 4, as
described in the implication (3) =⇒ (5), have the property that
lim_{n→∞} Λ_j(f_n) = 0 whenever the sequence (f_n)_{n∈N} ⊂ Cb^+(E)
decreases pointwise to 0. We will give the details for the functional
Λ_1 = (Re Λ)^+. This suffices because Λ_2 = (Re(−Λ))^+, Λ_3 = (Re(−iΛ))^+,
and Λ_4 = (Re(iΛ))^+. So let the sequence (f_n)_{n∈N} ⊂ Cb^+(E) decrease
pointwise to 0. Fix ε > 0, and choose 0 ≤ g_1 ≤ f_1, g_1 ∈ Cb(E), in such a
way that

  Λ_1(f_1) = (Re Λ)^+(f_1) ≤ Re(Λ(g_1)) + ε/2.                        (1.18)

Then we choose a sequence of functions (u_k)_{k∈N} ⊂ Cb^+(E) such that
g_1 = sup_{n∈N} Σ_{k=1}^n u_k = Σ_{k=1}^∞ u_k (which is a pointwise
increasing limit), and such that u_k ≤ f_k − f_{k+1}, k ∈ N. In Lemma 1.6
below we will show that such a decomposition is possible. Then
g_1 − Σ_{k=1}^n u_k decreases pointwise to 0, and hence by (4) we have

  Re Λ(g_1) ≤ Re Λ(Σ_{k=1}^n u_k) + ε/2,  for n ≥ n_ε.                (1.19)

From (1.18) and (1.19) we infer for n ≥ n_ε the inequality

  Λ_1(f_1) = (Re Λ)^+(f_1) ≤ Re(Λ(g_1)) + ε/2
           ≤ Re Λ(Σ_{k=1}^n u_k) + ε = Σ_{k=1}^n Re Λ(u_k) + ε
           ≤ Σ_{k=1}^n (Re Λ)^+(f_k − f_{k+1}) + ε
           = Σ_{k=1}^n Λ_1(f_k − f_{k+1}) + ε
           = Λ_1(f_1) − Λ_1(f_{n+1}) + ε.                             (1.20)

From (1.20) we deduce Λ_1(f_{n+1}) ≤ ε for n ≥ n_ε, that is, Λ_1(f_n) ≤ ε
for n ≥ n_ε + 1. Since ε > 0 was arbitrary, this shows
lim_{n→∞} Λ_1(f_n) = 0. This is true for the other linear functionals Λ_2,
Λ_3 and Λ_4 as well. As in the proof of the implication (3) =⇒ (5), from
Theorem 1.2 it follows that each functional Λ_j, 1 ≤ j ≤ 4, can be
represented by a Borel measure µ_j: Λ_j(f) = ∫ f dµ_j, 1 ≤ j ≤ 4,
f ∈ Cb(E). It follows that Λ(f) = ∫ f dµ, f ∈ Cb(E), where
µ = µ_1 − µ_2 + iµ_3 − iµ_4.
(5) =⇒ (1). The proof of assertion 1 in Proposition 1.4 then shows that the
functional Λ belongs to (Cb(E), Tβ)*.

Lemma 1.6. Let the sequence (f_n)_{n∈N} ⊂ Cb^+(E) decrease pointwise to 0,
and let 0 ≤ g ≤ f_1 be a continuous function. Then there exists a sequence
of continuous functions (u_k)_{k∈N} such that 0 ≤ u_k ≤ f_k − f_{k+1},
k ∈ N, and such that g = sup_{n∈N} Σ_{k=1}^n u_k = Σ_{k=1}^∞ u_k, which is a
pointwise monotone increasing limit.

Proof. We write g = v_1 = u_1 + v_2 = Σ_{k=1}^n u_k + v_{n+1}, and
v_{n+1} = u_{n+1} + v_{n+2}, where u_1 = g ∧ (f_1 − f_2),
u_{n+1} = v_{n+1} ∧ (f_{n+1} − f_{n+2}), and v_{n+2} = v_{n+1} − u_{n+1}.
Then 0 ≤ v_{n+1} ≤ v_n ≤ f_n. Since the sequence (f_n)_{n∈N} decreases to 0,
the sequence (v_n)_{n∈N} also decreases to 0, and thus
g = sup_{n∈N} Σ_{k=1}^n u_k.
The latter shows Lemma 1.6.
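The recursion in this proof is easy to check pointwise. The following sketch
uses hypothetical test data (f_n(x) = 1/(n(1 + x²)) and g = 0.9 f_1, my own
choice, not from the text) and verifies both 0 ≤ u_k ≤ f_k − f_{k+1} and
Σ_k u_k = g on a few sample points.

```python
def f(n, x):
    # hypothetical decreasing sequence: f_n(x) = 1 / (n (1 + x^2)) -> 0
    return 1.0 / (n * (1.0 + x * x))

def g(x):
    # a continuous function with 0 <= g <= f_1
    return 0.9 * f(1, x)

def decompose(x, terms=60):
    """u_1 = g ^ (f_1 - f_2), u_{n+1} = v_{n+1} ^ (f_{n+1} - f_{n+2}),
    v_{n+2} = v_{n+1} - u_{n+1}, as in the proof of Lemma 1.6."""
    us, v = [], g(x)
    for n in range(1, terms + 1):
        u = min(v, f(n, x) - f(n + 1, x))
        us.append(u)
        v -= u
    return us

xs = [-3.0, -0.5, 0.0, 1.0, 7.0]
ok_bounds = all(
    0.0 <= u <= f(n, x) - f(n + 1, x) + 1e-12
    for x in xs
    for n, u in enumerate(decompose(x), start=1)
)
ok_sum = all(abs(sum(decompose(x)) - g(x)) < 1e-9 for x in xs)
print(ok_bounds, ok_sum)
```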

In the sequel we write M(E) for the complex vector space of all complex
Borel measures on the Polish space E. The space is supplied with the weak
topology σ(M(E), Cb(E)). We also write M^+(E) for the convex cone of all
positive (= non-negative) Borel measures in M(E). The notation M^+_1(E) is
employed for all probability measures in M^+(E), and M^+_{≤1}(E) stands for
all sub-probability measures in M^+(E). We identify the space M(E) and the
space (Cb(E), Tβ)*.
Theorem 1.7. Let M be a subset of M(E) with the property that for every
sequence (Λ_n)_{n∈N} in M there exists a subsequence (Λ_{n_k})_{k∈N} such
that

  lim_{k→∞} sup_{0≤f≤1} Re(i^ℓ Λ_{n_k}(f)) = sup_{0≤f≤1} Re(i^ℓ Λ(f)),  0 ≤ ℓ ≤ 3,

for some Λ ∈ M(E). Then M is a relatively weakly compact subset of M(E) if
and only if it is equi-continuous viewed as a subset of the dual space of
(Cb(E), Tβ).

Proof. First suppose that M is relatively weakly compact. Since the weak
topology on M(E) restricted to compact subsets is metrizable and separable,
the weak closure of M is bounded for the variation norm. Without loss of
generality we may and do assume that M itself is weakly compact. Fix
f ∈ Cb(E), f ≥ 0. Consider the mapping Λ ↦ (Re Λ)^+(f), Λ ∈ M(E). Here we
identify Λ = Λ_µ ∈ (Cb(E), Tβ)* and the corresponding complex Borel measure
µ = µ_Λ given by the equality Λ(g) = ∫ g dµ, g ∈ Cb(E). The mapping
Λ ↦ (Re Λ)^+(f), Λ ∈ M(E), is weakly continuous. This can be seen as
follows. Suppose Λ_n(g) → Λ(g) for all g ∈ Cb(E). Then
(Re Λ_n)^+(f) ≥ Re Λ_n(g) for all 0 ≤ g ≤ f, g ∈ Cb(E), and hence
lim inf_{n→∞} (Re Λ_n)^+(f) ≥ lim inf_{n→∞} Re Λ_n(g) = Re Λ(g). It follows
that lim inf_{n→∞} (Re Λ_n)^+(f) ≥ sup_{0≤g≤f} Re Λ(g) = (Re Λ)^+(f). Since
lim_{n→∞} (Re Λ_n)^+(1) = (Re Λ)^+(1) we also have
lim inf_{n→∞} (Re Λ_n)^+(1 − f) ≥ sup_{0≤g≤1−f} Re Λ(g) = (Re Λ)^+(1 − f).
Hence we see lim sup_{n→∞} (Re Λ_n)^+(f) ≤ (Re Λ)^+(f).

In what follows we write K(E) for the collection of compact subsets of E.


Theorem 1.8. Let M be a subset of M(E). Then the following assertions are
equivalent:

(a) For every sequence (f_n)_{n∈N} ⊂ Cb(E) which decreases pointwise to the
    zero function the equality inf_{n∈N} sup_{µ∈M} ∫ f_n d|µ| = 0 holds;
(b) The equality inf_{K⊂E, K∈K(E)} sup_{µ∈M} |µ|(E \ K) = 0 holds, and
    sup_{µ∈M} |µ|(E) < ∞;
(c) There exists a function u ∈ H^+(E) such that for all f ∈ Cb(E) and for
    all µ ∈ M the inequality |∫ f dµ| ≤ ‖uf‖∞ holds.

Moreover, if M ⊂ M(E) satisfies one of the equivalent conditions (a), (b) or
(c), then M is relatively weakly compact.
Let Λ : Cb(E) → C be a linear functional such that inf_{n∈N} |Λ|(f_n) = 0
for every sequence (f_n)_{n∈N} ⊂ Cb(E) which decreases pointwise to zero.
Here the linear functional |Λ| is defined in such a way that
|Λ|(f) = sup {|Λ(v)| : |v| ≤ f, v ∈ Cb(E)} for all f ∈ Cb^+(E). Then by
Corollary 1.5 there exists a complex Borel measure µ such that
Λ(f) = ∫ f dµ for all f ∈ Cb(E). The positive Borel measure |µ| is such that
|Λ|(f) = ∫ f d|µ| for all f ∈ Cb(E).
Proof. (a) =⇒ (b). By choosing the sequence f_n = n^{−1} 1 we see that
sup_{µ∈M} |µ|(E) < ∞. Next let ρ be a metric on E which turns it into a
Polish space, let (x_n)_{n∈N} be a dense sequence in E, and put
B_{k,n} = {x ∈ E : ρ(x, x_k) ≤ 2^{−n}}. Choose continuous functions
w_{k,n} ∈ Cb(E) such that 1_{B_{k,n}^c} ≤ w_{k,n} ≤ 1_{B_{k,n+1}^c}. Put
v_{ℓ,n} = min_{1≤k≤ℓ} w_{k,n}. Then for every n ∈ N the sequence
ℓ ↦ v_{ℓ,n} decreases pointwise to zero. So for given ε > 0 and for given
n ∈ N there exists ℓ_n(ε) such that ∫ v_{ℓ_n(ε),n} d|µ| ≤ ε 2^{−n} for all
µ ∈ M. It follows that |µ|(∩_{k=1}^{ℓ_n(ε)} B_{k,n}^c) ≤ ε 2^{−n}, and
hence

  |µ|(∪_{n=1}^∞ ∩_{k=1}^{ℓ_n(ε)} B_{k,n}^c) ≤ ε,  µ ∈ M.

Put K(ε) = ∩_{n=1}^∞ ∪_{k=1}^{ℓ_n(ε)} B_{k,n}. Then K(ε) is closed, and thus
complete, and totally ρ-bounded. Hence it is compact. Moreover,
|µ|(E \ K(ε)) ≤ ε for all µ ∈ M. Hence (b) follows from (a).
(b) =⇒ (c). This proof follows the lines of the proof of assertion 1 of
Proposition 1.4. Instead of considering just one measure we now have a
family of measures M.
(c) =⇒ (a). Essentially speaking this is a consequence of Dini's lemma. Here
we use the following fact: if for some µ ∈ M(E) the inequality
|∫ f dµ| ≤ ‖uf‖∞ holds for all f ∈ Cb(E), then we also have
|∫ f d|µ|| ≤ ‖uf‖∞ for all f ∈ Cb(E). Fix α > 0. If (f_n)_{n∈N} is any
sequence in Cb^+(E) which decreases pointwise to zero, then for µ ∈ M we
have the following estimate:

  ∫ f_n d|µ| ≤ ‖u f_n‖∞ = max (‖u 1_{{u≥α}} f_n‖∞, ‖u 1_{{u<α}} f_n‖∞)
             ≤ max (‖u‖∞ sup_{x∈{u≥α}} f_n(x), α sup_{x∈E} f_n(x))
             ≤ max (‖u‖∞ sup_{x∈{u≥α}} f_n(x), α sup_{x∈E} f_1(x)).   (1.21)

Since the set {u ≥ α} is contained in a compact subset of E, from (1.21) and
Dini's lemma we deduce that inf_{n∈N} sup_{µ∈M} ∫ f_n d|µ| ≤ α sup_{x∈E} f_1(x)
for all α > 0. Consequently, (a) follows.
Finally we prove that if M satisfies (c), then M is relatively weakly
compact. First observe that µ ∈ M implies |µ|(E) ≤ ‖u‖∞. So the subset M is
uniformly bounded, and since E is a Polish space, the ball
{µ ∈ M(E) : |µ|(E) ≤ ‖u‖∞} endowed with the weak topology is metrizable and
separable. Therefore, if (µ_n)_{n∈N} is a sequence in M, it contains a
subsequence (µ_{n_k})_{k∈N} such that Λ(f) := lim_{k→∞} ∫ f dµ_{n_k} exists
for all f ∈ Cb(E). Then it follows that |Λ(f)| ≤ ‖uf‖∞ for all f ∈ Cb(E).
Consequently, the linear functional Λ can be represented as a measure:
Λ(f) = ∫ f dµ, f ∈ Cb(E). It follows that the weak closure of the set M is
weakly compact.
Definition 1.9. A family of complex measures M ⊂ M(E) is called tight if it
satisfies one of the equivalent conditions in Theorem 1.8. Let M̃ be a
collection of linear functionals on Cb(E) which are continuous for the
strict topology. Then each Λ ∈ M̃ can be represented by a measure:
Λ(f) = ∫ f dµ_Λ, f ∈ Cb(E). The collection M̃ of linear functionals is
called tight, provided the same is true for the family M = {µ_Λ : Λ ∈ M̃}.

Remark 1.10. In fact if M satisfies (a), then M satisfies Dini's condition
in the sense that a sequence of functions µ ↦ ∫ f_n d|µ| which decreases
pointwise to zero in fact converges uniformly on M. Assertion (b) says that
the family M is tight in the usual sense as it can be found in the standard
literature. Assertion (c) says that the family M is equi-continuous for the
strict topology.

The following corollary says that if for M in Theorem 1.8 we choose a
collection of positive measures, then the family M is tight if and only if
it is relatively weakly compact. Compare these results with Stroock [224].
Corollary 1.11. Let M be a collection of positive Borel measures. Then the
following assertions are equivalent:

(a) The collection M is relatively weakly compact.
(b) The collection M is tight in the sense that sup_{µ∈M} µ(E) < ∞ and
    inf_{K∈K(E)} sup_{µ∈M} µ(E \ K) = 0.
(c) There exists a function u ∈ H^+(E) such that |∫ f dµ| ≤ ‖uf‖∞ for all
    µ ∈ M and for all f ∈ Cb(E).

Remark 1.12. Suppose that the collection M in Corollary 1.11 consists of
probability measures and is closed with respect to the Lévy metric. If M
satisfies one of the equivalent conditions in Corollary 1.11, then it is a
weakly compact subset of P(E), the collection of Borel probability measures
on E.

Proof. Corollary 1.11 follows more or less directly from Theorem 1.8. Let M
be as in Corollary 1.11, and let (f_n)_{n∈N} be a sequence in Cb(E) which
decreases to the zero function. Then observe that the sequence of functions
µ ↦ ∫ f_n d|µ| = ∫ f_n dµ, µ ∈ M, decreases pointwise to zero. Each of
these functions is weakly continuous. Hence, if M is relatively weakly
compact, then Dini's lemma implies that this sequence converges uniformly on
M to zero. It follows that assertion (a) in Corollary 1.11 implies assertion
(a) in Theorem 1.8. So we see that in Corollary 1.11 the following
implications are valid: (a) =⇒ (b) =⇒ (c). If M ⊂ M^+(E) satisfies (c),
then Theorem 1.8 implies that M is relatively weakly compact. This means
that the assertions (a), (b) and (c) in Corollary 1.11 are equivalent.
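Condition (b) can be made concrete with point masses (a hypothetical finite
truncation for illustration, with E = R): the family {δ_n : n ∈ N} escapes
every compact set and so is not tight, while {δ_{1/n} : n ∈ N} is carried by
the compact set [0, 1].

```python
def mass_outside(points, K):
    # sup over the family of mu(E \ [-K, K]), where mu = delta_x, x in points
    return max(1.0 if abs(x) > K else 0.0 for x in points)

escaping = [float(n) for n in range(1, 1001)]    # delta_n, n = 1, ..., 1000
clustering = [1.0 / n for n in range(1, 1001)]   # delta_{1/n}

# every window [-K, K] leaves a point mass of full weight outside, so
# inf_K sup_mu mu(E \ K) = 1 for the escaping family: it is not tight
print(mass_outside(escaping, 999.0))   # 1.0
# the clustering family sits inside the compact set [0, 1]: it is tight
print(mass_outside(clustering, 1.0))   # 0.0
```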

We will also need the following theorem.

Theorem 1.13. Let (µ_n)_{n∈N} ⊂ M(E) be a tight sequence (see Definition
1.9) with the property that Λ(f) := lim_{n→∞} ∫ f dµ_n exists for all
f ∈ Cb(E). Let Φ ⊂ Cb(E) be a family of functions which is equi-continuous
and bounded. Then Λ can be represented as a complex Borel measure µ, and

  lim_{n→∞} sup_{ϕ∈Φ} |∫ ϕ dµ_n − ∫ ϕ dµ| = 0.

Remark 1.14. According to the Arzelà-Ascoli theorem, an equi-continuous and
uniformly bounded family of functions restricted to a compact subset K is
relatively compact in Cb(K).

Proof. The fact that the linear functional Λ can be represented by a Borel
measure follows from Corollary 1.5 and Theorem 1.8. To arrive at a
contradiction, assume that

  lim sup_{n→∞} sup_{ϕ∈Φ} |∫ ϕ dµ_n − ∫ ϕ dµ| > 0.

Then there exist ε > 0, a subsequence (µ_{n_k})_{k∈N}, and a sequence
(ϕ_k)_{k∈N} ⊂ Φ such that

  |∫ ϕ_k dµ_{n_k} − ∫ ϕ_k dµ| > ε,  k ∈ N.                            (1.22)

Choose a compact subset K of E in such a way that

  sup_{ϕ∈Φ} ‖ϕ‖∞ × sup_{n∈N} |µ_n|(E \ K) ≤ ε/16.                     (1.23)

By the Bolzano-Weierstrass theorem for bounded equi-continuous families of
functions, there exists a continuous function ϕ^K ∈ C(K) and a subsequence
of the sequence (ϕ_k)_{k∈N}, which we call again (ϕ_k)_{k∈N}, such that

  lim_{k→∞} sup_{x∈K} |ϕ_k(x) − ϕ^K(x)| = 0.                          (1.24)

By Tietze's extension theorem there exists a continuous function ϕ ∈ Cb(E)
such that ϕ restricted to K coincides with ϕ^K and such that
|ϕ| ≤ 2 sup_{ψ∈Φ} ‖ψ‖∞. From (1.24) it follows that there exists k_ε ∈ N
such that for k ≥ k_ε the inequality

  sup_{n∈N} |µ_n|(E) ‖1_K (ϕ_k − ϕ)‖∞ ≤ ε/8                           (1.25)

holds. From (1.23) and (1.25) we obtain the following estimate:

  |∫ ϕ_k dµ_{n_k} − ∫ ϕ_k dµ|
    ≤ |∫_K (ϕ_k − ϕ) dµ_{n_k} − ∫_K (ϕ_k − ϕ) dµ|
      + |∫_{E\K} (ϕ_k − ϕ) dµ_{n_k} − ∫_{E\K} (ϕ_k − ϕ) dµ|
      + |∫ ϕ dµ_{n_k} − ∫ ϕ dµ|
    ≤ ‖1_K (ϕ_k − ϕ)‖∞ (|µ_{n_k}|(K) + |µ|(K))
      + 4 sup_{ψ∈Φ} ‖ψ‖∞ (|µ_{n_k}|(E \ K) + |µ|(E \ K))
      + |∫ ϕ dµ_{n_k} − ∫ ϕ dµ|
    ≤ 2 ‖1_K (ϕ_k − ϕ)‖∞ sup_{k∈N} |µ_{n_k}|(K)
      + 8 sup_{ψ∈Φ} ‖ψ‖∞ sup_{k∈N} |µ_{n_k}|(E \ K)
      + |∫ ϕ dµ_{n_k} − ∫ ϕ dµ|
    ≤ (3/4) ε + |∫ ϕ dµ_{n_k} − ∫ ϕ dµ|.                              (1.26)

Since lim_{n→∞} |∫ ϕ dµ_n − ∫ ϕ dµ| = 0, the estimate in (1.26) implies

  |∫ ϕ_k dµ_{n_k} − ∫ ϕ_k dµ| < ε                                     (1.27)

for k large enough. The conclusion in (1.27) contradicts our assumption in
(1.22).
This proves Theorem 1.13.

Occasionally we will need the following version of the Banach-Alaoglu
theorem; see e.g. Theorem 7.18. We use the notation
⟨f, µ⟩ = ∫_E f(x) dµ(x), f ∈ Cb(E), µ ∈ M(E). For a proof of the following
theorem we refer to e.g. Rudin [205]. Notice that any Tβ-equi-continuous
family of measures is contained in B_u for some u ∈ H(E). Here B_u is the
collection defined in (1.28) below.

Theorem 1.15. (Banach-Alaoglu) Let u be a function in H(E), and define the
subset B_u of M(E) by

  B_u = {µ ∈ M(E) : |⟨f, µ⟩| ≤ ‖uf‖∞ for all f ∈ Cb(E)}.              (1.28)

Then B_u is σ(M(E), Cb(E))-compact.

Since the space (Cb(E), Tβ) is separable, it follows that for every sequence
(µ_n)_{n∈N} in B_u there exists a measure µ ∈ M(E) and a subsequence
(µ_{n_k})_{k∈N} such that lim_{k→∞} ⟨f, µ_{n_k}⟩ = ⟨f, µ⟩ for all
f ∈ Cb(E).
Instead of "σ(M(E), Cb(E))-convergence" we often write "weak*-convergence",
which is a functional analytic term. In a probabilistic context people
usually write "weak convergence".
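A minimal numerical illustration of weak convergence in this sense (my own
example): on E = R the point masses µ_n = δ_{1/n} converge weakly to δ_0,
since ⟨f, µ_n⟩ = f(1/n) → f(0) for every f ∈ Cb(R).

```python
import math

def pairing_delta(f, x):
    # <f, delta_x>: integrating f against the point mass at x evaluates f
    return f(x)

f = lambda x: math.cos(x) / (1.0 + x * x)  # an arbitrary bounded continuous f
values = [pairing_delta(f, 1.0 / n) for n in (1, 10, 100, 1000)]
limit = pairing_delta(f, 0.0)

print(abs(values[-1] - limit) < 1e-3)  # <f, mu_n> approaches <f, delta_0>
```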

1.1.3 Integral operators on the space of bounded continuous functions

We insert a short digression on operator theory. Let E1 and E2 be two Polish
spaces, and let T : Cb(E1) → Cb(E2) be a linear operator with the property
that its absolute value |T| : Cb(E1) → Cb(E2), determined by the equality

  |T|(f) = sup {|Tg| : |g| ≤ f},  f ∈ Cb(E1), f ≥ 0,

is well-defined and acts as a linear operator from Cb(E1) to Cb(E2).


Definition 1.16. A family of linear operators {Tα : α ∈ A}, where every
Tα ∈ L(Cb(E1), Cb(E2)), is called equi-continuous for the strict topology if
for every v ∈ H(E2) there exists u ∈ H(E1) such that the inequality
‖vTα f‖∞ ≤ ‖uf‖∞ holds for all α ∈ A and for all f ∈ Cb(E1).

So the notion "equi-continuous for the strict topology" has a functional
analytic flavor.
Definition 1.17. A family of linear operators {Tα : α ∈ A}, where every Tα
is a continuous linear operator from Cb(E1) to Cb(E2), is called tight if
for every compact subset K of E2 the family of functionals
{Λα,x : α ∈ A, x ∈ K} is tight in the sense of Definition 1.9. Here the
functional Λα,x : Cb(E1) → C is defined by Λα,x(f) = Tα f(x), f ∈ Cb(E1).
Its absolute value |Λα,x| then has the property that
|Λα,x|(f) = |Tα| f(x), f ∈ Cb(E1).
The following theorem says that a tight family of operators {Tα : α ∈ A} is
equi-continuous for the strict topology and vice versa. Both spaces E1 and
E2 are supposed to be Polish.
Theorem 1.18. Let A be some index set, and let for every α ∈ A the mapping
Tα : Cb(E1) → Cb(E2) be a linear operator, which is continuous for the
uniform topology. Suppose that the family {Tα : α ∈ A} is tight. Then for
every v ∈ H(E2) there exists u ∈ H(E1) such that

  ‖vTα f‖∞ ≤ ‖uf‖∞, for every α ∈ A and for all f ∈ Cb(E1).           (1.29)

Conversely, if the family {Tα : α ∈ A} is equi-continuous in the sense that
for every v ∈ H(E2) there exists u ∈ H(E1) such that (1.29) is satisfied,
then the family {Tα : α ∈ A} is tight.
If the family {Tα : α ∈ A} satisfies (1.29), then the family
{|Tα| : α ∈ A} satisfies the same inequality with |Tα| instead of Tα. The
argument goes in more or less the same way as we will prove the first part
of Proposition 1.28 below. Fix f ∈ Cb(E1), α ∈ A, and x ∈ E2, and let the
functions u ∈ H(E1) and v ∈ H(E2) be such that (1.29) is satisfied. Choose
ϑ ∈ [−π, π] in such a way that

  |v(x) |Tα|(f)(x)| = |v(x)| |Tα|(Re(e^{iϑ} f))(x)
                    ≤ |v(x)| |Tα|((Re(e^{iϑ} f))^+)(x)

  (definition of |Tα|)

  = sup {|v(x)Tα g(x)| : |g| ≤ (Re(e^{iϑ} f))^+}
  ≤ sup {‖ug‖∞ : |g| ≤ (Re(e^{iϑ} f))^+} ≤ ‖uf‖∞.                     (1.30)

From (1.30) we see that the inequality in (1.29) is also satisfied for the
operators |Tα|, α ∈ A.
Corollary 1.19. As in Theorem 1.18 let A be some index set, and let for
every α ∈ A the mapping Tα : Cb(E1) → Cb(E2) be a positivity preserving
linear operator. Then the family {Tα : α ∈ A} is Tβ-equi-continuous if and
only if for every sequence (ψm)m∈N which decreases pointwise to 0, the
sequence {Tα(ψm f) : m ∈ N} decreases pointwise to 0 uniformly in α ∈ A.

Proof (Proof of Corollary 1.19). Choose v ∈ H^+(E2). The proof follows by
considering the family of functionals Λ_{v,α,x} : Cb(E1) → C, α ∈ A,
x ∈ E2, defined by Λ_{v,α,x}(f) = v(x)Tα f(x), f ∈ Cb(E1). If the family
{Tα : α ∈ A} is Tβ-equi-continuous, then the family
{Λ_{v,α,x} : α ∈ A, x ∈ E2} is tight. It then easily follows that
{Λ_{v,α,x}(fm) : α ∈ A, x ∈ E2} converges to 0 uniformly in α ∈ A and
x ∈ E2, provided that the sequence (fm)m∈N decreases pointwise to 0.
Conversely, suppose that for any given v ∈ H^+(E2), and for any sequence of
functions (fm)m∈N ⊂ Cb(E1) which decreases pointwise to 0, the sequence
{Λ_{v,α,x}(fm) : α ∈ A, x ∈ E2}m∈N converges uniformly to 0. Then the
family {Λ_{v,α,x} : α ∈ A, x ∈ E2} is tight: see Definition 1.9.

Proof (Proof of Theorem 1.18). As in Definition 1.17 the functionals Λα,x,
α ∈ A, x ∈ E2, are defined by Λα,x(f) = [Tα f](x), f ∈ Cb(E1). First we
suppose that the family {Tα : α ∈ A} is tight. Let (fn)n∈N ⊂ Cb^+(E1) be a
sequence of continuous functions which decreases pointwise to zero, and let
v ∈ H(E2) be arbitrary. Since the family {Tα : α ∈ A} is tight, it follows
that, for every compact subset K of E2, the collection of functionals
{Λα,x : α ∈ A, x ∈ K} is tight. Then, since the sequence
(fn)n∈N ⊂ Cb^+(E1) decreases pointwise to zero, we have

  lim_{n→∞} sup_{α∈A, x∈K} |Λα,x|(fn) = 0 for every compact subset K of E2.  (1.31)

From (1.31) it follows that lim_{n→∞} sup_{α∈A, x∈K} |v(x)| |Λα,x|(fn) = 0.
Hence the family of functionals {|v(x)| Λα,x : α ∈ A, x ∈ E2} is tight. By
Theorem 1.8 (see Definition 1.9 as well) it follows that there exists a
function u ∈ H(E1) such that

  |v(x) [Tα f](x)| = |v(x)Λα,x(f)| ≤ ‖uf‖∞                            (1.32)

for all f ∈ Cb(E1), for all x ∈ E2 and for all α ∈ A. The inequality in
(1.32) implies the equi-continuity property (1.29).
Next let the family {Tα : α ∈ A} be equi-continuous in the sense that it
satisfies inequality (1.29). Then the same inequality holds for the family
{|Tα| : α ∈ A}; the argument was given just prior to the proof of Theorem
1.18. Let K be any compact subset of E2 and let (fn)n∈N ⊂ Cb^+(E1) be a
sequence which decreases to zero. Then there exists a function u ∈ H(E1)
such that

  sup_{α∈A, x∈K} [|Tα| fn](x) = sup_{α∈A} ‖1_K |Tα| fn‖∞ ≤ ‖ufn‖∞.    (1.33)

From (1.33) it readily follows that
lim_{n→∞} sup_{α∈A, x∈K} [|Tα| fn](x) = 0. By Definition 1.17 it follows
that the family {Tα : α ∈ A} is tight.
This completes the proof of Theorem 1.18.

Theorem 1.20. Let E1 and E2 be two Polish spaces, and let
U : Cb(E1, R) → Cb(E2, R) be a mapping with the following properties:

1. If f1 and f2 ∈ Cb(E1, R) are such that f1 ≤ f2, then U(f1) ≤ U(f2). In
   other words the mapping f ↦ U f, f ∈ Cb(E1, R), is monotone.
2. If f1 and f2 belong to Cb(E1, R), and if α ≥ 0, then
   U(f1 + f2) ≤ U(f1) + U(f2), and U(αf1) = αU(f1).
3. U is unit preserving: U(1_{E1}) = 1_{E2}.
4. If (fn)n∈N ⊂ Cb(E1, R) is a sequence which decreases pointwise to zero,
   then so does the sequence (U(fn))n∈N.

Then for every v ∈ H^+(E2) there exists u ∈ H^+(E1) such that

  sup_{y∈E2} v(y)U(Re f)(y) ≤ sup_{x∈E1} u(x) Re f(x), for all f ∈ Cb(E1),
  and hence
  sup_{y∈E2} v(y)U(|f|)(y) ≤ sup_{x∈E1} u(x)|f(x)|, for all f ∈ Cb(E1).  (1.34)

If the mapping U maps Cb(E1) to L∞(E2, R, E2), then the conclusion about its
continuity as described in (1.34) is still true provided it possesses the
properties (1), (2), (3), and (4) is replaced by

4'. If (fn)n∈N ⊂ Cb(E1, R) is a sequence which decreases pointwise to zero,
    then the sequence (U(fn))n∈N decreases to zero uniformly on compact
    subsets of E2.
Proof. Put

  M^Re_{vU} = {ν ∈ M^+(E1) : ν(E1) = sup_{y∈E2} v(y), and
               Re⟨g, ν⟩ ≤ sup_{y∈E2} v(y)(U(Re g))(y) for all g ∈ Cb(E1)},  and

  M^{|·|}_{vU} = {ν ∈ M^+(E1) : ν(E1) = sup_{y∈E2} v(y), and
               |⟨g, ν⟩| ≤ sup_{y∈E2} v(y)(U(|g|))(y) for all g ∈ Cb(E1)}.   (1.35)

A combination of Theorem 1.8 and its Corollary 1.11 shows that the
collections M^Re_{vU} and M^{|·|}_{vU} are tight. Here we use hypothesis 4.
We also observe that M^Re_{vU} = M^{|·|}_{vU}. This can be seen as follows.
First suppose that ν ∈ M^{|·|}_{vU} and choose g ∈ Cb(E1). Then we have

  ⟨Re g + ‖Re g‖∞, ν⟩ ≤ sup_{y∈E2} v(y)(U(Re g + ‖Re g‖∞))(y)
                      ≤ sup_{y∈E2} (v(y)U(Re g)(y)) + sup_{y∈E2} v(y)‖Re g‖∞
                      = sup_{y∈E2} (v(y)U(Re g)(y)) + ν(E1)‖Re g‖∞.         (1.36)

From (1.36) we deduce Re⟨g, ν⟩ ≤ sup_{y∈E2} (v(y)U(Re g)(y)), and hence
M^{|·|}_{vU} ⊂ M^Re_{vU}. The reverse inclusion is shown by the following
arguments:

  |⟨g, ν⟩| = sup_{ϑ∈[−π,π]} ⟨Re(e^{iϑ}g), ν⟩
           ≤ sup_{ϑ∈[−π,π]} sup_{y∈E2} v(y)U(|Re(e^{iϑ}g)|)(y)
           ≤ sup_{ϑ∈[−π,π]} sup_{y∈E2} v(y)U(|g|)(y) = sup_{y∈E2} v(y)U(|g|)(y). (1.37)

From (1.37) the inclusion M^Re_{vU} ⊂ M^{|·|}_{vU} follows. So from now on
we write M_{vU} = M^Re_{vU} = M^{|·|}_{vU}. Since this collection is tight,
there exists a function u ∈ H^+(E1) such that for all f ∈ Cb(E1) and for
all ν ∈ M_{vU} the inequality Re ∫ f dν ≤ sup_{x∈E1} Re(u(x)f(x)) holds.
The result in Theorem 1.20 follows from the assertions in the following
equalities:

  sup_{y∈E2} v(y)U(Re f)(y) = sup {Re⟨f, ν⟩ : ν ∈ M_{vU}},  and       (1.38)
  sup_{y∈E2} v(y)U(|f|)(y) = sup {|⟨f, ν⟩| : ν ∈ M_{vU}}.             (1.39)

The equality in (1.38) follows from the Hahn-Banach theorem. In the present
situation it says that there exists a linear functional
Λ : Cb(E1, R) → R such that Λ(f) ≤ sup_{y∈E2} v(y)U f(y) for all
f ∈ Cb(E1, R), and

  Λ(1_{E1}) = sup_{y∈E2} v(y)U(1_{E1})(y) = sup_{y∈E2} v(y)1_{E2}(y)
            = sup_{y∈E2} v(y).                                        (1.40)

Let f ∈ Cb(E1, R), f ≤ 0. Then Λ(f) ≤ sup_{y∈E2} v(y)U f(y) ≤ 0. Again
using hypothesis 4 shows that Λ can be identified with a positive Borel
measure on E1, which then belongs to M_{vU}. Consequently, the left-hand
side of (1.38) is less than or equal to its right-hand side. Since the
reverse inequality is trivial, the equality in (1.38) follows. The equality
in (1.39) easily follows from (1.38).
The assertion about a sub-additive mapping U which sends functions in
Cb(E1) to functions in L∞(E2, R, E2) can easily be adapted from the first
part of the proof.
This concludes the proof of Theorem 1.20.
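A simple map satisfying properties 1-4 of Theorem 1.20 is a composition
operator (my own example, with E1 = E2 = R and the hypothetical map
T(y) = y/2): U f = f ∘ T is monotone, additive (hence sub-additive),
positively homogeneous, unit preserving, and maps pointwise decreasing
sequences to pointwise decreasing sequences. The sketch below samples a few
points.

```python
def T(y):
    # a continuous map T : E2 -> E1; U f = f o T then has properties 1-4
    return 0.5 * y

def U(f):
    return lambda y: f(T(y))

f1 = lambda x: min(1.0, abs(x))
f2 = lambda x: 1.0 / (1.0 + x * x)

ys = [-2.0, 0.0, 1.0, 3.0]
monotone = all(U(f2)(y) <= U(lambda x: f2(x) + f1(x))(y) for y in ys)
subadditive = all(
    U(lambda x: f1(x) + f2(x))(y) <= U(f1)(y) + U(f2)(y) for y in ys
)
unit = all(U(lambda x: 1.0)(y) == 1.0 for y in ys)
print(monotone, subadditive, unit)
```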
The results in Proposition 1.21 should be compared with Definition 3.6. We
describe two operators to which the results of Theorem 1.20 are applicable.
Let L be an operator with domain and range in Cb(E), with the property that
for all µ > 0 and all f ∈ D(L) the inequality µf − Lf ≥ 0 implies f ≥ 0.
There is a close connection between this positivity property (i.e. the
positive resolvent property) and the maximum principle: see Definition 3.4
and inequality (3.46). In addition, suppose that the constant functions
belong to D(L), and that L1 = 0. Fix λ > 0, and define the operators
U^1_λ, U^2_λ : Cb(E, R) → L∞(E, R, E) by the equalities (f ∈ Cb(E, R)):

  U^1_λ f = sup_{K∈K(E)} inf_{g∈D(L)} {g ≥ f 1_K : λg − Lg ≥ 0},  and  (1.41)

  U^2_λ f = inf_{g∈D(L)} {g ≥ f : λg − Lg ≥ 0}.                        (1.42)

Here the symbol K(E) stands for the collection of all compact subsets of E.
Observe that, if g ∈ D(L) is such that λg − Lg ≥ 0, then g ≥ 0. This
follows from the maximum principle.
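For intuition about the positive resolvent property (a finite-state sketch
of my own, not from the text): take L = Q, the generator matrix of a
continuous-time Markov chain, with non-negative off-diagonal entries and
zero row sums, so that L1 = 0. Then λI − Q is invertible with a
non-negative inverse, and λ(λI − Q)^{−1}1 = 1.

```python
def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for a small system
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                for k in range(c, n + 1):
                    M[r][k] -= factor * M[c][k]
    return [M[i][n] / M[i][i] for i in range(n)]

# hypothetical 3-state generator: off-diagonal >= 0, row sums 0 (so Q 1 = 0)
Q = [[-1.0, 0.5, 0.5],
     [0.2, -0.7, 0.5],
     [0.3, 0.3, -0.6]]
lam = 2.0
A = [[lam * (i == j) - Q[i][j] for j in range(3)] for i in range(3)]

# the resolvent is positivity preserving: g >= 0 forces f = R(lam) g >= 0
f = solve(A, [1.0, 0.0, 2.0])
print(all(fi >= 0.0 for fi in f))

# zero row sums give (lam I - Q) 1 = lam 1, i.e. lam R(lam) 1 = 1
ones = solve(A, [lam, lam, lam])
print(all(abs(v - 1.0) < 1e-9 for v in ones))
```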
Proposition 1.21. Let the operator L be as above, and let the operators Uλ1
and Uλ2 be defined by (1.41) and (1.42) respectively. Then the following asser-
tions hold true:
(a) Suppose that the operator Uλ1 has the additional property that for every
sequence
¡ 1 ¢ (fn )n∈N ⊂ Cb (E) which decreases pointwise to zero the sequence
Uλ fn n∈N does so uniformly on compact subsets of E. Then for every
u ∈ H + (E) there exists a function v ∈ H + (E) such that

sup u(x)Uλ1 f (x) ≤ sup v(x)f (x), and


x∈E x∈E

sup u(x)Uλ1 |f | (x) ≤ sup v(x) |f (x)| for all f ∈ Cb (E, R). (1.43)
x∈E x∈E

(b) Suppose that the operator Uλ2 has the additional property that for every
sequence
¡ 2 ¢ (fn )n∈N ⊂ Cb (E) which decreases pointwise to zero the sequence
Uλ fn n∈N does so uniformly on compact subsets of E. Then for every
u ∈ H + (E) there exists a function v ∈ H + (E) such that the inequalities
in (1.43) are satisfied with Uλ2 instead of Uλ1 . Moreover, for f ∈ D (Ln ),
µ ≥ 0, and n ∈ N, the following inequalities hold:
n
µn f ≤ Uλ2 (((λ + µ) I − L) f ) , and (1.44)
n
µn kuf k∞ ≤ kv ((λ + µ) I − L) f k∞ . (1.45)

In (1.45) the functions u and v are the same as in (1.43) with Uλ2 replacing
Uλ1 .
The inequality in (1.45) could be used to say that the operator L is Tβ-
dissipative: see inequality (3.14) in Definition 3.5. Also notice that Uλ1 (f) ≤
Uλ2 (f), f ∈ Cb (E, R). It is not clear under what conditions Uλ1 (f) = Uλ2 (f).
In Proposition 1.22 below we will return to this topic. The mapping Uλ1 is
heavily used in the proof of (iii) =⇒ (i) of Theorem 3.10. If the operator L in
Proposition 1.21 satisfies the conditions spelled out in assertion (a), then it is
called sequentially λ-dominant: see Definition 3.6.
Proof. The assertion in (a) and the first assertion in (b) are immediate
consequences of Theorem 1.20. Let f ∈ D(L) be real-valued. The inequality
(1.45) can be obtained by observing that

  Uλ2 (((λ + µ) I − L) f)
  = inf_{g∈D(L)} {g ≥ ((λ + µ) I − L) f : λg − Lg ≥ 0}
  = inf_{g∈D(L)} {g ≥ ((λ + µ) I − L) f : (λ + µ) g − Lg ≥ µg ≥ ((λ + µ) I − L) (µf)}
  = inf_{g∈D(L)} {g ≥ ((λ + µ) I − L) f : λg − Lg ≥ 0, g ≥ µf} ≥ µf.   (1.46)

Repeating the arguments which led to (1.46) will show the inequality in (1.44).
From (1.46) and (1.43) with Uλ2 instead of Uλ1 we obtain

  sup_{x∈E} u(x) (µ^n f)(x) ≤ sup_{x∈E} u(x) Uλ2 (((λ + µ) I − L)^n f) (x)
                            ≤ sup_{x∈E} v(x) ((λ + µ) I − L)^n f (x),   (1.47)

for µ ≥ 0 and f ∈ D (L^n). The inequality in (1.45) is an easy consequence of
(1.47). This concludes the proof of Proposition 1.21.

Proposition 1.22. Let the operator L with domain and range in Cb (E) have
the following properties:
1. For every λ > 0 the range of λI − L coincides with Cb (E), and the inverse
   R(λ) := (λI − L)^{-1} exists as a positivity preserving bounded linear operator
   from Cb (E) to Cb (E). Moreover, 0 ≤ f ≤ 1 implies 0 ≤ λR(λ)f ≤ 1.
2. The equality lim_{λ→∞} λR(λ)f (x) = f (x) holds for every x ∈ E and every
   f ∈ Cb (E).
3. If (fn )n∈N ⊂ Cb (E) is any sequence which decreases pointwise to zero,
then for every λ > 0 the sequence (λR(λ)fn )n∈N decreases to zero as well.
Fix λ > 0, and define the mappings Uλ1 and Uλ2 as in (1.41) and (1.42)
respectively. Then the (in-)equalities
  sup {(µR (λ + µ))^k f : µ > 0, k ∈ N} ≤ Uλ1 (f) ≤ Uλ2 (f)   (1.48)

hold for f ∈ Cb (E, R). Suppose that f ≥ 0. If the function in the left extremity
of (1.48) belongs to Cb (E), then the first two terms in (1.48) are equal. If it
belongs to D(L), then all three quantities in (1.48) are equal.

Proof. First we observe that for every (λ, x) ∈ (0, ∞) × E there exists a
Borel measure B 7→ r (λ, x, B) such that λ r (λ, x, E) ≤ 1, and R(λ)f (x) =
∫_E f (y) r (λ, x, dy), f ∈ Cb (E). This result follows by considering the func-
tional Λλ,x : Cb (E) → C, defined by Λλ,x (f) = R(λ)f (x). In fact

  r (λ, x, B) = sup_{K∈K(E), K⊂B} inf {R(λ)f (x) : f ≥ 1K},   B ∈ E.

This result follows from Corollary 1.5. Often we write

  R(λ) (f 1B) (x) = ∫_B f (y) r (λ, x, dy),   B ∈ E, f ∈ Cb (E).

Observe that, for f ≥ 0 and fixed x ∈ E, the mapping B 7→ R(λ) (f 1B) (x) is a positive Borel measure on E.


Moreover, by Dini’s lemma we see that

  lim_{n→∞} sup_{λ≥λ0} sup_{x∈E} λR(λ)fn (x) = 0,   λ0 > 0,   (1.49)

whenever the sequence (fn)n∈N ⊂ Cb (E) decreases pointwise to zero. From
Theorem 1.18 and its Corollary 1.19 it then follows that the family of operators
{λR(λ) : λ ≥ λ0} is equi-continuous for the strict topology Tβ, i.e. for every
function u ∈ H+(E) there exists a function v ∈ H+(E) such that

  λ ‖u R(λ)f‖∞ ≤ ‖v f‖∞   for all λ ≥ λ0 and all f ∈ Cb (E).   (1.50)

Fix f ∈ Cb (E, R) and λ > 0. Next we will prove the inequality

  Uλ1 (f) ≥ sup {(µR (λ + µ))^k f : µ > 0, k ∈ N}.   (1.51)

A version of this proof will be more or less repeated in (3.137) in the proof of
the implication (iii) =⇒ (i) of Theorem 3.10, with D1 + L instead of L. First
we observe that for g ∈ D(L) we have

  λg(x) − Lg(x) = lim_{µ→∞} µ (g(x) − µR (λ + µ) g(x)),   x ∈ E.   (1.52)

If g ∈ D(L) is such that λg − Lg ≥ 0, then (λ + µ) g − Lg ≥ µg, and hence
g ≥ µR(λ + µ)g for all µ > 0. Conversely, if g ≥ µR(λ + µ)g for all µ > 0, then
µ (g − µR(λ + µ)g) ≥ 0, and by (1.52) we see λg − Lg ≥ 0. So we have the
following equality of subsets:

  {g ∈ D(L) : λg − Lg ≥ 0} = {g ∈ D(L) : g ≥ µR (λ + µ) g for all µ > 0}.   (1.53)

From (1.53) we infer

  {g ∈ D(L) : λg − Lg ≥ 0} = {g ∈ D(L) : g ≥ sup_{µ>0, k∈N} (µR (λ + µ))^k g}.   (1.54)
Let g ∈ D(L) be such that g ≥ f 1K and such that λg − Lg ≥ 0; then (1.54)
implies g ≥ sup_{µ>0, k∈N} (µR (λ + µ))^k (f 1K). Since the operators (µR (λ + µ))^k,
µ > 0, k ∈ N, are integral operators, and bounded Borel measures are inner-
regular (with respect to compact subsets), we obtain

  g ≥ sup_{µ>0, k∈N} (µR (λ + µ))^k f,

and hence

  sup_{K∈K(E)} inf_{g∈D(L)} {g ≥ f 1K : λg − Lg ≥ 0} ≥ sup_{µ>0, k∈N} (µR (λ + µ))^k f.   (1.55)
The inequality in (1.55) implies (1.51), and hence, since the inequality Uλ1 (f) ≤
Uλ2 (f) is obvious, the inequalities in (1.48) follow. Here we employ the fact
that λg − Lg ≥ 0 implies g ≥ 0. Fix a compact subset K of E, and f ≥ 0,
f ∈ Cb (E). If the function g = sup_{µ>0, k∈N} (µR (λ + µ))^k f belongs to Cb (E),
then g ≥ f 1K, and g ≥ µR (λ + µ) g for all µ > 0. Hence it follows that

  sup_{µ>0, k∈N} (µR (λ + µ))^k f ≥ inf {g ≥ f 1K : g ≥ µR (λ + µ) g for all µ > 0, g ∈ Cb (E)}.   (1.56)
Next we show that Tβ-lim_{α→∞} αR(α)f = f. From the assumptions 2 and 3, and
from (1.50), it follows that D(L) = R ((βI − L)^{-1}) is Tβ-dense in Cb (E).
Therefore let g be any function in D(L), and let u ∈ H+(E). Consider, for
α > λ0, the equalities

  f − αR(α)f = f − g − αR(α) (f − g) + g − αR(α)g
             = f − g − αR(α) (f − g) − R(α) (Lg),   (1.57)
and the corresponding inequalities

  ‖u (f − αR(α)f)‖∞ ≤ ‖u (f − g)‖∞ + ‖u αR(α) (f − g)‖∞ + ‖u R(α) (Lg)‖∞
                    ≤ ‖u (f − g)‖∞ + ‖v (f − g)‖∞ + (‖u‖∞ / α) ‖Lg‖∞.   (1.58)

So for given ε > 0 we first choose g ∈ D(L) in such a way that

  ‖u (f − g)‖∞ + ‖v (f − g)‖∞ ≤ (2/3) ε.   (1.59)

Then we choose αε ≥ λ0 so large that (‖u‖∞ / αε) ‖Lg‖∞ ≤ (1/3) ε. From the latter,
(1.58), and (1.59) we conclude:

  ‖u (f − αR(α)f)‖∞ ≤ ε,   for α ≥ αε.   (1.60)
From (1.60) we see that Tβ-lim_{α→∞} αR(α)f = f. So the inequality in (1.56)
implies:

  sup_{µ>0, k∈N} (µR (λ + µ))^k f ≥ inf {g ≥ f 1K : g ≥ µR (λ + µ) g for all µ > 0, g ∈ D(L)},   (1.61)

and consequently Uλ1 (f) ≤ f^λ := sup_{µ>0, k∈N} (µR (λ + µ))^k f. It follows that
f^λ = Uλ1 (f) provided that f and f^λ both belong to Cb (E). If f^λ ∈ D(L),
then f^λ = Uλ1 (f) and f^λ ≥ µR(λ + µ)f^λ for all µ > 0, and consequently
λf^λ − Lf^λ ≥ 0. The conclusion Uλ2 (f) = f^λ is then obvious.
This finishes the proof of Proposition 1.22.
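The set equality (1.53) and the limit relation (1.52) behind it can be sanity-checked on a finite state space, where the resolvent is an explicit matrix inverse. The sketch below is illustrative only and not part of the text; the two-state generator and the function g are arbitrary choices. It verifies that λg − Lg ≥ 0 forces g ≥ µR(λ + µ)g for every µ > 0, and that µR(λ + µ)g returns to g as µ → ∞:

```python
# Illustration (not from the text): a finite-state check of (1.53) for the
# arbitrarily chosen two-state generator L = [[-a, a], [b, -b]] and lam = 1.

a, b, lam = 1.0, 2.0, 1.0

def apply_L(g):
    return [a * (g[1] - g[0]), b * (g[0] - g[1])]

def apply_R(s, g):
    """(s*I - L)^{-1} g in closed form for L = [[-a, a], [b, -b]]."""
    det = s * (s + a + b)
    return [((s + b) * g[0] + a * g[1]) / det,
            (b * g[0] + (s + a) * g[1]) / det]

g = [2.0, 1.5]
Lg = apply_L(g)
assert all(lam * g[i] - Lg[i] >= 0.0 for i in range(2))   # lam*g - L*g >= 0

# ... hence g >= mu*R(lam + mu)*g for every mu > 0 (one side of (1.53))
for mu in (0.5, 1.0, 10.0, 100.0):
    Rg = apply_R(lam + mu, g)
    assert all(mu * Rg[i] <= g[i] + 1e-12 for i in range(2))

# mu*R(lam + mu)*g -> g as mu -> infinity, in the spirit of (1.52)
mu = 1e6
Rg = apply_R(lam + mu, g)
assert all(abs(mu * Rg[i] - g[i]) < 1e-4 for i in range(2))
```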
In the following proposition we see that a multiplicative Borel measure is a
point evaluation.

Proposition 1.23. Let µ be a non-zero Borel measure with the property that
∫ f g dµ = ∫ f dµ · ∫ g dµ for all functions f and g ∈ Cb (E). Then there exists
x ∈ E such that ∫ f dµ = f (x) for all f ∈ Cb (E).
R R
Proof. Since µ ≠ 0 there exists f ∈ Cb (E) such that 0 ≠ ∫ f dµ = ∫ f · 1 dµ =
∫ f dµ · ∫ 1 dµ, and hence 0 ≠ ∫ 1 dµ = (∫ 1 dµ)^2. Consequently, ∫ 1 dµ = 1. Let
f and g be functions in Cb+(E). Then we have

  ∫ f g d|µ| = sup {|∫ h dµ| : |h| ≤ f g, h ∈ Cb (E)}
            = sup {|∫ h1 h2 dµ| : |h1| ≤ f, |h2| ≤ g, h1, h2 ∈ Cb (E)}
            = sup {|∫ h1 dµ| : |h1| ≤ f, h1 ∈ Cb (E)}
              × sup {|∫ h2 dµ| : |h2| ≤ g, h2 ∈ Cb (E)}
            = ∫ f d|µ| · ∫ g d|µ|.   (1.62)

From (1.62) it follows that the variation measure |µ| is multiplicative as well.
Since E is a polish space, the measure |µ| is inner regular. So there exists a
compact subset K of E such that |µ| (E \ K) ≤ 1/2, and hence |µ| (K) > 1/2.
Since |µ| is multiplicative it follows that |µ| (K) = 1 = |µ| (E). It follows that
the multiplicative measure |µ| is concentrated on the compact subset K, and
hence it can be considered as a multiplicative measure on C(K). But then
there exists a point x ∈ K such that |µ| = δx , the Dirac measure at x. So
there exists a constant cx such that µ = cx |µ| = cx δx . Since µ(E) = δx (E) = 1
it follows that cx = 1. This proves Proposition 1.23.
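The mechanism of the proof is transparent when E is a finite set: integration against a measure is a weighted sum, and testing multiplicativity on indicator functions forces every weight to equal its own square, hence to be 0 or 1. A small illustrative check (not part of the text; the weight vectors are made up):

```python
# Illustration (not from the text): on a finite set a Borel measure is a
# weight vector w, and integration is the dot product.  Multiplicativity
# tested on indicators 1_{i} gives w[i] = w[i]^2, so every weight is 0 or 1;
# together with mu(E) = 1 exactly one weight is 1, i.e. mu is a Dirac measure.

def integral(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def is_multiplicative(w, tol=1e-12):
    # Checking products of indicator functions suffices here, since general
    # functions on a finite set are linear combinations of indicators.
    n = len(w)
    indicators = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for f in indicators:
        for g in indicators:
            fg = [fi * gi for fi, gi in zip(f, g)]
            if abs(integral(w, fg) - integral(w, f) * integral(w, g)) > tol:
                return False
    return True

dirac = [0.0, 1.0, 0.0]        # Dirac measure at the second point
mixed = [0.5, 0.5, 0.0]        # a genuine mixture

assert is_multiplicative(dirac)
assert not is_multiplicative(mixed)
```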

1.2 Strong Markov processes and Feller evolutions


In the sequel E denotes a separable complete metrizable topological Hausdorff
space. In other words E is a polish space. The space Cb (E) is the space of all
complex valued bounded continuous functions. The space Cb (E) is not only
equipped with the uniform norm: kf k∞ := supx∈E |f (x)|, f ∈ Cb (E), but
also with the strict topology Tβ . It is considered as a subspace of the bounded
Borel measurable functions L∞ (E), also endowed with the supremum norm.
Definition 1.24. A family {P (s, t) : 0 ≤ s ≤ t ≤ T } of operators defined on
L∞ (E) is called a Feller evolution or a Feller propagator on Cb (E) if it pos-
sesses the following properties:

(i) It leaves Cb (E) invariant: P (s, t)Cb (E) ⊆ Cb (E) for 0 ≤ s ≤ t ≤ T ;
(ii) It is an evolution: P (τ, t) = P (τ, s) ◦ P (s, t) for all τ, s, t for which
     0 ≤ τ ≤ s ≤ t ≤ T, and P (t, t) = I, t ∈ [0, T ];
(iii) It consists of contraction operators: ‖P (s, t)f‖∞ ≤ ‖f‖∞ for all
     0 ≤ s ≤ t ≤ T and for all f ∈ Cb (E);
(iv) It is positivity preserving: f ≥ 0, f ∈ Cb (E), implies P (s, t)f ≥ 0;
(v) For every f ∈ Cb (E) the function (s, t, x) 7→ P (s, t)f (x) is continuous
     on the diagonal of the set {(s, t, x) ∈ [0, T ] × [0, T ] × E : 0 ≤ s ≤ t ≤ T }
     in the sense that for every element (t, x) ∈ (0, T ] × E the equality
     lim_{s↑t, y→x} P (s, t)f (y) = f (x) holds, and for every element (s, x) ∈ [0, T ) × E
     the equality lim_{t↓s, y→x} P (s, t)f (y) = f (x) holds;
(vi) For every t ∈ [0, T ] and f ∈ Cb (E) the function (s, x) 7→ P (s, t)f (x) is
     Borel measurable, and if (sn, xn)n∈N is any sequence in [0, t] × E such that
     sn decreases to s ∈ [0, t], xn converges to x ∈ E, and lim_{n→∞} P (sn, t) g (xn)
     exists in C for all g ∈ Cb (E), then lim_{n→∞} P (sn, t) f (xn) = P (s, t)f (x);
(vii) For every (t, x) ∈ (0, T ] × E and f ∈ Cb (E) the following equality holds:
     lim_{s↑t, s≥τ} P (τ, s) f (x) = P (τ, t) f (x), τ ∈ [0, t).

Remark 1.25. Since the space E is polish, the continuity as described in (v) can
also be described by sequences. So (v) is equivalent to the following condition:
for all elements (t, x) ∈ (0, T ] × E and (s, x) ∈ [0, T ) × E the equalities

  lim_{n→∞} P (sn, t) f (yn) = f (x)  and  lim_{n→∞} P (s, tn) f (yn) = f (x)   (1.63)

hold. Here (sn)n∈N ⊂ [0, t] is any sequence which increases to t, (tn)n∈N ⊂
[s, T ] is any sequence which decreases to s, and (yn)n∈N is any sequence in
E which converges to x ∈ E. If for f ∈ Cb (E) and t ∈ [0, T ] the function
(s, x) 7→ P (s, t)f (x), (s, x) ∈ [0, t] × E, is continuous, then (vi) and (vii) are
satisfied. If the function (s, t, x) 7→ P (s, t) f (x) is continuous on the space
{(s, t, x) ∈ [0, T ] × [0, T ] × E : s ≤ t}, then the propagator P (s, t) possesses
properties (v) through (vii). In Proposition 1.26 we will single out a closely
related property. Its proof is part of the proof of part (b) in Theorem 1.39.

Proposition 1.26. Let the family {P (τ, t) : 0 ≤ τ ≤ t ≤ T } possess
properties (i) through (iv) of Definition 1.24. Suppose that for every f ∈ Cb (E)
the function (τ, t, x) 7→ P (τ, t) f (x) is continuous on the space

{(τ, t, x) ∈ [0, T ] × [0, T ] × E : τ ≤ t} . (1.64)

Then for every f ∈ Cb ([0, T ] × E) the function (τ, t, x) 7→ P (τ, t) f (t, ·) (x)
is continuous on the space in (1.64).

It is noticed that assertions (iii) and (iv) together are equivalent to:

(iii′) If 0 ≤ f ≤ 1, f ∈ Cb (E), then 0 ≤ P (s, t)f ≤ 1, for 0 ≤ s ≤ t ≤ T.

In the presence of (iii), (ii) and (i), property (v) is equivalent to:

(v′) lim_{t↓s} ‖u (P (s, t)f − f)‖∞ = 0 and lim_{s↑t} ‖u (P (s, t)f − f)‖∞ = 0 for all
     f ∈ Cb (E) and u ∈ H(E). So a Feller evolution is in fact Tβ-strongly
     continuous in the sense that, for every f ∈ Cb (E) and u ∈ H(E),

       lim_{(s,t)→(s0,t0), s≤s0≤t0≤t} ‖u (P (s, t) f − P (s0, t0) f)‖∞ = 0,   0 ≤ s0 ≤ t0 ≤ T.   (1.65)

Remark 1.27. Property (vi) is satisfied if for every t ∈ (0, T ] the function
(s, x) 7→ P (s, x; t, E) = P (s, t) 1(x) is continuous on [0, t] × E, and if for
every sequence (sn, xn)n∈N ⊂ [0, t] × E for which sn decreases to s and xn
converges to x, the inequality lim sup_{n→∞} P (sn, t) f (xn) ≥ P (s, t) f (x) holds
for all f ∈ Cb+ (E). Since functions of the form x 7→ P (s, t)f (x), f ∈ Cb (E),
belong to Cb (E), it is also satisfied provided for every f ∈ Cb (E) we have

  lim_{n→∞} P (sn, t) f = P (s, t) f,  uniformly on compact subsets of E.

This follows from the inequality:

  |P (sn, t) f (xn) − P (s, t) f (x)|
    ≤ |P (sn, t) f (xn) − P (s, t) f (xn)| + |P (s, t) f (xn) − P (s, t) f (x)|,

where sn ↓ s, xn → x as n → ∞, and f ∈ Cb (E).


Proposition 1.28. Let {P (s, t) : 0 ≤ s ≤ t ≤ T } be a family of operators
having properties (i) and (ii) of Definition 1.24. Then property (iii′) is equiva-
lent to the properties (iii) and (iv) together.
Moreover, if such a family {P (s, t) : 0 ≤ s ≤ t ≤ T } possesses properties (i),
(ii) and (iii), then it possesses property (v) if and only if it possesses (v′).
Proof. First suppose that the operator P (s, t) : L∞ (E) → L∞ (E) has the
properties (iii) and (iv), and let f ∈ Cb (E) be such that 0 ≤ f ≤ 1. Then by
(iii) and (iv) we have 0 ≤ P (s, t)f (x) ≤ sup_{y∈E} f (y) ≤ 1, and hence (iii′) is
satisfied. Conversely, let f ∈ Cb (E) and x ∈ E. Then by (iii′) the operator
P (s, t) satisfies

  Re P (s, t)f (x) = [P (s, t) Re f ] (x) ≤ sup_{y∈E} Re f (y) ≤ ‖Re f‖∞.   (1.66)

There exists ϑ ∈ [−π, π] such that by (1.66) we have

  |P (s, t)f (x)| = Re [e^{iϑ} P (s, t)f (x)]
                 = P (s, t) Re (e^{iϑ} f) (x) ≤ ‖Re (e^{iϑ} f)‖∞ ≤ ‖f‖∞,

from which (iii) easily follows.


Next, suppose that the family {P (s, t) : 0 ≤ s ≤ t ≤ T } possesses property
(v′). Then, by taking s0 = t0, it clearly has property (v). Fix (s0, t0) ∈ [0, T ] ×
[0, T ] in such a way that s0 ≤ t0. For the converse implication we employ
Theorem 1.18 with the families of operators

  {P (sm, s0) : 0 ≤ sm ≤ sm+1 ≤ s0}  and  {P (t0, tm) : t0 ≤ tm+1 ≤ tm ≤ T}   (1.67)

respectively.
respectively. Let (fn )n∈N be a sequence functions in Cb+ (E) which decreases
pointwise to zero. Then by Dini’s lemma and assumption (v) we know that

lim sup sup P (sm , s0 ) fn (x) = lim sup sup P (t0 , tm ) fn (x) = 0 (1.68)
n→∞ m∈N x∈K n→∞ m∈N x∈K

for all compact subsets K of E. From (1.68) we see that the sequences of
operators in (1.67) are tight. By Theorem 1.18 it follows that they are equi-
continuous. If the pair (s, t) belongs to [0, s0 ] × [t0 , T ], then we write

P (s, t) f − P (s0 , t0 ) f = P (s, t0 ) (P (t0 , t) − I) f + (P (s, s0 ) − I) P (s0 , t0 ) f.


(1.69)
Let u be a function in H(E). Since the first sequence in (1.67) is equi-
continuous, by invoking (1.69) we see that there exists a function v ∈ H(E)
such that the following inequality holds for all m ∈ N and all f ∈ Cb (E):

  ‖u (P (sm, tm) f − P (s0, t0) f)‖∞
    ≤ ‖v (P (t0, tm) − I) f‖∞ + ‖u (P (sm, s0) − I) P (s0, t0) f‖∞.   (1.70)

In order to prove the equality in (1.65) it suffices to show that the right-hand
side of (1.70) tends to zero if m → ∞. By the properties of the functions u
and v it suffices to prove that

lim k1K (P (sm , s0 ) f − f )k∞ = lim k1K (P (t0 , tm ) f − f )k∞ = 0 (1.71)


m→∞ m→∞

for every compact subset K of E and for every function f ∈ Cb (E). The
equalities in (1.71) follow from the sequential compactness of K and (v) which
imply that

lim P (sm , s0 ) f (xm ) = f (x0 ) = lim P (t0 , tm ) f (xm )


m→∞ m→∞

whenever sm increases to s0 , tm decreases to t0 and xm converges to x0 .


This completes the proof of Proposition 1.28.
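The defining properties of a Feller evolution can be checked concretely for a finite-state example. The sketch below is illustrative only and not part of the text: it builds a time-inhomogeneous two-state evolution from the commuting family L(u) = a(u)Q (the rate function a(u) = 1 + u and the matrix Q are arbitrary choices), for which P (s, t) = exp(A(s, t)Q) has a closed form, and verifies the evolution property (ii) and the combined contraction/positivity property (iii′):

```python
import math

# Illustration (not from the text): a toy Feller evolution on the two-point
# space {0, 1}.  Take L(u) = a(u)*Q with a(u) = 1 + u and Q = [[-1, 1], [1, -1]];
# since the operators L(u) commute, P(s, t) = exp(A(s, t)*Q) with A(s, t) the
# integral of a over [s, t], and exp(theta*Q) has the closed form below.

def A(s, t):
    return (t - s) + (t * t - s * s) / 2.0   # integral of 1+u over [s, t]

def P(s, t):
    theta = A(s, t)
    p = (1.0 - math.exp(-2.0 * theta)) / 2.0
    return [[1.0 - p, p], [p, 1.0 - p]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

tau, s, t = 0.1, 0.4, 0.9
lhs, rhs = matmul(P(tau, s), P(s, t)), P(tau, t)

# (ii) evolution property P(tau, t) = P(tau, s) P(s, t), and P(t, t) = I
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
assert P(t, t) == [[1.0, 0.0], [0.0, 1.0]]

# (iii') each P(s, t) is a stochastic matrix: entries in [0, 1], rows sum to 1
M = P(tau, t)
assert all(0.0 <= M[i][j] <= 1.0 for i in range(2) for j in range(2))
assert all(abs(sum(M[i]) - 1.0) < 1e-12 for i in range(2))
```

The continuity properties (v) through (vii) hold here as well, since (s, t) 7→ P (s, t) is jointly continuous in this example.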

Definition 1.29. Let for every (τ, x) ∈ [0, T ] × E a probability measure Pτ,x
on FTτ be given. Suppose that for every bounded random variable Y : Ω → R
the equality

  Eτ,x [Y ◦ ∨t | Ftτ] = Et,X(t) [Y ◦ ∨t]

holds Pτ,x-almost surely for all (τ, x) ∈ [0, T ] × E and for all t ∈ [τ, T ]. Then
the process

  {(Ω, FTτ, Pτ,x), (X(t), τ ≤ t ≤ T), (∨t : τ ≤ t ≤ T), (E, E)}   (1.72)



is called a Markov process. If the fixed time t ∈ [τ, T ] may be replaced with a
stopping time S attaining values in [τ, T ], then the process in (1.72) is called
a strong Markov process. By definition Pτ,∆ (A) = 1A (ω∆) = δω∆ (A). Here
A belongs to F, and ω∆ (s) = ∆ for all s ∈ [0, T ]. If

{(Ω, FTτ , Pτ,x ) , (X(t), τ ≤ t ≤ T ) , (∨t : τ ≤ t ≤ T ) , (E, E)}

is a Markov process, then we write

P (τ, x; t, B) = Pτ,x (X(t) ∈ B), B ∈ E, x ∈ E, τ ≤ t ≤ T, (1.73)

for the corresponding transition function. The operator family (of evolutions,
propagators)
{P (s, t) : 0 ≤ s ≤ t ≤ T }
is defined by
  [P (s, t)f ](x) = Es,x [f (X(t))] = ∫ f (y) P (s, x; t, dy),   f ∈ Cb (E), s ≤ t ≤ T.

Let S : Ω → [τ, T ] be an (Ftτ )t∈[τ,T ] -stopping time. Then the σ-field FSτ is
defined by
FSτ = ∩t∈[τ,T ] {A ∈ FTτ : A ∩ {S ≤ t} ∈ Ftτ } .
Of course, a stochastic variable S : Ω → [τ, T ] is called an (Ftτ )t∈[τ,T ] -stopping
time, provided that for every t ∈ [τ, T ] the event {S ≤ t} belongs to Ftτ .
This is perhaps the right place to explain the compositions F ◦ ∨t, F ◦ ∧t,
and F ◦ ϑt, if F : Ω → C is FT0-measurable, and if t ∈ [0, T ]. Such functions
F are called stochastic variables. If F is of the form F = ∏_{j=1}^n fj (tj, X (tj)),
where the functions fj, 1 ≤ j ≤ n, are bounded Borel functions defined on
[0, T ] × E, then, by definition,

  F ◦ ∨t = ∏_{j=1}^n fj (tj ∨ t, X (tj ∨ t)),   F ◦ ∧t = ∏_{j=1}^n fj (tj ∧ t, X (tj ∧ t)),  and

  F ◦ ϑt = ∏_{j=1}^n fj ((tj + t) ∧ T, X ((tj + t) ∧ T)).   (1.74)

If t is an (Ft0)t∈[0,T]-stopping time, then a similar definition is applied. By the
Monotone Class Theorem, the definitions in (1.74) extend to all FT0-measurable
variables F, i.e. to all stochastic variables. For a discussion on the Monotone
Class Theorem see Subsection 1.4.2.

1.2.1 Generators of Markov processes and maximum principles

We begin with the definition of the generator of a time-dependent Feller
evolution.

Definition 1.30. A family of operators L(t), 0 ≤ t ≤ T, is said to be the
(infinitesimal) generator of a Feller evolution {P (s, t) : 0 ≤ s ≤ t ≤ T } if

  L(s)f = Tβ-lim_{t↓s} (P (s, t)f − f) / (t − s),   0 ≤ s ≤ T.

This means that a function f belongs to D (L(s)) whenever L(s)f :=
lim_{t↓s} (P (s, t)f − f) / (t − s) exists in Cb (E), equipped with the strict topology.
It is the same as saying that the function L(s)f belongs to Cb (E), that the
family of functions {(P (s, t)f − f) / (t − s) : t ∈ (s, T )} is uniformly bounded,
and that the convergence takes place uniformly on compact subsets of E.
Such a family of operators is considered as an operator L with domain in the
space Cb ([0, T ] × E). A function f ∈ Cb ([0, T ] × E) is said to belong to D(L)
if for every s ∈ [0, T ] the function x 7→ f (s, x) is a member of D (L(s)) and
if the function (s, x) 7→ L(s)f (s, ·) (x) belongs to Cb ([0, T ] × E). Instead of
L(s)f (s, ·) (x) we often write L(s)f (s, x). If a function f ∈ D(L) is such that
the function s 7→ f (s, x) is continuously differentiable, then we say that f
belongs to D(1) (L). The time derivative operator ∂/∂s is often written as D1.
Its domain is denoted by D (D1), and hence D(1) (L) = D (D1) ∩ D(L).
Definition 1.31. The family of operators L(s), 0 ≤ s ≤ T , is said to generate
a time-inhomogeneous Markov process

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ τ ) , (∨t : τ ≤ t ≤ T ) , (E, E)} (1.75)

if for all functions u ∈ D(L), for all x ∈ E, and for all pairs (τ, s) with
0 ≤ τ ≤ s ≤ T the following equality holds:
  d/ds Eτ,x [u (s, X(s))] = Eτ,x [∂u/∂s (s, X(s)) + L(s)u (s, ·) (X(s))].   (1.76)
Here it is assumed that the derivatives are interpreted as limits from the right
which converge uniformly on compact subsets of E, and that the differential
quotients are uniformly bounded.
So these derivatives are Tβ -derivatives.
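Equality (1.76) can be tested numerically in a finite-state model, where expectations are matrix-vector products. The sketch below is illustrative only and not part of the text; the rate function a(s) = 1 + s, the matrix Q, and the function u are arbitrary choices. For u independent of time, (1.76) reduces to d/ds E_{τ,x}[u(X(s))] = E_{τ,x}[(L(s)u)(X(s))], which is compared here with a central difference:

```python
import math

# Illustration (not from the text): a finite-difference check of (1.76) for a
# toy two-state family L(s) = a(s)*Q, a(s) = 1 + s, Q = [[-1, 1], [1, -1]],
# whose evolution is P(tau, s) = exp(A(tau, s)*Q) in closed form.

def A(tau, s):
    return (s - tau) + (s * s - tau * tau) / 2.0   # integral of 1+u over [tau, s]

def apply_P(tau, s, g):
    p = (1.0 - math.exp(-2.0 * A(tau, s))) / 2.0
    return [(1.0 - p) * g[0] + p * g[1], p * g[0] + (1.0 - p) * g[1]]

def apply_L(s, g):
    a = 1.0 + s
    return [a * (g[1] - g[0]), a * (g[0] - g[1])]

tau, s, h = 0.0, 0.5, 1e-5
u = [1.0, 3.0]   # a time-independent u, so the du/ds term in (1.76) vanishes

for x in (0, 1):
    lhs = (apply_P(tau, s + h, u)[x] - apply_P(tau, s - h, u)[x]) / (2.0 * h)
    rhs = apply_P(tau, s, apply_L(s, u))[x]
    assert abs(lhs - rhs) < 1e-6
```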
Definition 1.32. By definition the Skorohod space D ([0, T ], E) consists of
all functions from [0, T ] to E which possess left limits in E and are right-
continuous. The Skorohod space D ([0, T ], E∆) consists of all functions from
[0, T ] to E∆ which possess left limits in E∆ and are right-continuous. More
precisely, a path (or function) ω : [0, T ] → E∆ belongs to D ([0, T ], E∆) if it
possesses the following properties:
(a) if ω(t) ∈ E, and s ∈ [0, t], then there exists ε > 0 such that ω(ρ) ∈ E for
    ρ ∈ [0, t + ε], and ω(s) = lim_{ρ↓s} ω(ρ) and ω(s−) := lim_{ρ↑s} ω(ρ) belong to E;
(b) if ω(t) = ∆ and s ∈ [t, T ], then ω(s) = ∆. In other words ∆ is an
    absorbing state.

Observe that the range of ω ∈ D ([0, T ], E) is contained in a totally bounded
subset of E. Such sets are relatively compact. Also observe that the range of
ω ∈ D ([0, T ], E∆) restricted to an interval of the form [0, t] is also totally
bounded provided that ω(t) ∈ E. It follows that paths ω ∈ D ([0, T ], E∆)
restricted to intervals of the form [0, t] have relatively compact range as long
as they have not reached the absorption state ∆, i.e. as long as ω(t) ∈ E.
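A convenient computational model of an E∆-valued càdlàg path is a finite step function: store the jump times and the post-jump values, evaluate right-continuously, and read off left limits. This sketch is illustrative only; the path, the state labels, and the class name are made up:

```python
# Illustration (not from the text): a cadlag step path on [0, T] stored as
# jump times with post-jump values.  omega(t) is the value at the last jump
# time <= t (right-continuity), and the left limit omega(t-) only looks at
# jumps strictly before t.  The cemetery state is labelled "Delta" and is
# absorbing, as in property (b) above.

import bisect

class StepPath:
    def __init__(self, jump_times, values):   # jump_times[0] == 0.0
        self.jump_times, self.values = jump_times, values

    def at(self, t):
        """omega(t): right-continuous evaluation."""
        return self.values[bisect.bisect_right(self.jump_times, t) - 1]

    def left_limit(self, t):
        """omega(t-): value just before time t (t > 0)."""
        return self.values[bisect.bisect_left(self.jump_times, t) - 1]

# jumps at 0.5 and 1.5; the path is absorbed in Delta from time 1.5 on
omega = StepPath([0.0, 0.5, 1.5], ["a", "b", "Delta"])

assert omega.at(0.5) == "b"            # right-continuity at a jump
assert omega.left_limit(0.5) == "a"    # the left limit exists and differs
assert omega.at(2.0) == "Delta"        # Delta is absorbing
```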
The assertions (a), (b), (c), and (d) of the following theorem are well-
known in case E is a locally compact second countable Hausdorff space. In
fact the sample space Ω should depend on τ . This is taken care of by assuming
that the measure Pτ,x is defined on the σ-field FTτ .
Let L be a linear operator with domain D(L) and range R(L) in Cb (E).
The following definition should be compared with Definition 3.4 below, and
with assertion (b) in Proposition 3.11.
Definition 1.33. Let E0 be a subset of E. The operator L satisfies the maxi-
mum principle on E0, provided

  sup_{x∈E0} Re (λf (x) − Lf (x)) ≥ λ sup_{x∈E0} Re f (x),   for all λ > 0 and for all f ∈ D(L).   (1.77)

If L satisfies (1.77) on E0 = E, then the operator L satisfies the maximum
principle of Definition 3.4.
The following definition is the same as the one in Definition 3.14 below.
Definition 1.34. Let E0 be a subset of E. Suppose that the operator L has the
property that for every λ > 0 and for every x0 ∈ E0 we have Re h (x0) ≥ 0
whenever h ∈ D(L) is such that Re (λI − L) h ≥ 0 on E0. Then the operator
L is said to satisfy the weak maximum principle on E0.
The following proposition says that the concepts in the definitions 1.33 and
1.34 coincide, provided 1 ∈ D(L) and L1 = 0.
Proposition 1.35. If the operator L satisfies the maximum principle on E0 ,
then L satisfies the weak maximum principle on E0 . Suppose that the constant
functions belong to D(L), and that L1 = 0. If L satisfies the weak maximum
principle on E0 , then it satisfies the maximum principle on E0 .

Proof. First we observe that (1.77) is equivalent to

  inf_{x∈E0} Re (λf (x) − Lf (x)) ≤ λ inf_{x∈E0} Re f (x),   for all λ > 0 and for all f ∈ D(L).   (1.78)

Hence, if λf − Lf ≥ 0 on E0, then (1.78) implies that Re f (x0) ≥ 0 for all
x0 ∈ E0.
Conversely, suppose that 1 ∈ D(L) and that L1 = 0. Let f ∈ D(L), put
m = inf {Re f (y) : y ∈ E0}, and assume that

  inf_{x∈E0} Re (λf − Lf) (x) > λ inf_{y∈E0} Re f (y) = λm.   (1.79)

Then there exists ε > 0 such that inf_{x∈E0} Re (λf − Lf) (x) ≥ λ (m + ε). Hence,
since L1 = 0, inf_{x∈E0} Re (λI − L) (f − m − ε) (x) ≥ 0. Since the operator L satis-
fies the weak maximum principle, we see Re (f − m − ε) ≥ 0 on E0. This
is equivalent to Re f ≥ m + ε on E0, which contradicts the definition of m.
Hence our assumption in (1.79) is false, and consequently

  inf_{x∈E0} Re (λf − Lf) (x) ≤ λ inf_{y∈E0} Re f (y).   (1.80)

Since (1.80) is equivalent to (1.77) this concludes the proof of Proposition
1.35.
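On a finite state space the maximum principle of Definition 1.33 is easy to verify directly: a generator matrix Q has nonnegative off-diagonal entries and zero row sums, so (Qf)(x0) ≤ 0 at any point x0 where f is maximal, whence λf(x0) − (Qf)(x0) ≥ λ sup f. An illustrative check (not part of the text; Q and the test functions are made up):

```python
# Illustration (not from the text): the maximum principle (1.77) for an
# arbitrarily chosen three-state generator matrix Q (Q-matrix: nonnegative
# off-diagonal entries, zero row sums).  At a maximizing point x0 of f the
# value (Qf)(x0) is a nonnegative combination of differences f(j) - f(x0) <= 0.

Q = [[-2.0, 1.0, 1.0],
     [1.0, -1.0, 0.0],
     [0.0, 2.0, -2.0]]

def apply_Q(f):
    return [sum(Q[i][j] * f[j] for j in range(3)) for i in range(3)]

def max_principle_holds(f, lam):
    Qf = apply_Q(f)
    return max(lam * f[i] - Qf[i] for i in range(3)) >= lam * max(f) - 1e-12

for f in ([1.0, -0.5, 2.0], [0.3, 0.3, 0.3], [-1.0, 0.0, 4.0]):
    for lam in (0.5, 1.0, 10.0):
        # (Qf)(x0) <= 0 at a maximizing point x0
        x0 = max(range(3), key=lambda i: f[i])
        assert apply_Q(f)[x0] <= 1e-12
        assert max_principle_holds(f, lam)
```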

Definition 1.36. Let an operator L, with domain and range in Cb (E), sat-
isfy the maximum principle. Then L is said to possess the global Korovkin
property if there exists λ0 > 0 such that for every x0 ∈ E the subspace
S (λ0, x0), defined by

  S (λ0, x0) = {g ∈ Cb (E) : for every ε > 0 the inequality
      sup {h1 (x0) : (λ0 I − L) h1 ≤ Re g + ε, h1 ∈ D(L)}
      ≥ inf {h2 (x0) : (λ0 I − L) h2 ≥ Re g − ε, h2 ∈ D(L)} is valid},   (1.81)

coincides with Cb (E).

Remark 1.37. Let D be a subspace of Cb (E) with the property that for every
x0 ∈ E the space S(x0), defined by

  S (x0) = {g ∈ Cb (E) : for every ε > 0 the inequality
      sup {h1 (x0) : h1 ≤ Re g + ε, h1 ∈ D}
      ≥ inf {h2 (x0) : h2 ≥ Re g − ε, h2 ∈ D} holds},   (1.82)

coincides with Cb (E). Then such a subspace D could be called a global Ko-
rovkin subspace of Cb (E). In fact the inequality in (1.82) is pretty much the
same as the one in (1.81) in case L = 0.

In what follows the symbol Kσ (E) denotes the collection of σ-compact subsets
of E. The set E0 in the following definition is in practical situations a member
of Kσ (E).
Definition 1.38. Let E0 be a subset of E. Let an operator L, with domain and
range in Cb (E), satisfy the maximum principle on E0. Then L is said to
possess the Korovkin property on E0 if there exists λ0 > 0 such that for every
x0 ∈ E0 the subspace Sloc (λ0, x0, E0), defined by

  Sloc (λ0, x0, E0) = {g ∈ Cb (E) : for every ε > 0 the inequality
      sup_{h1∈D(L)} {h1 (x0) : (λ0 I − L) h1 ≤ Re g + ε on E0}
      ≥ inf_{h2∈D(L)} {h2 (x0) : (λ0 I − L) h2 ≥ Re g − ε on E0} is valid},   (1.83)

coincides with Cb (E).

1.3 Strong Markov processes: main result


The following theorem contains the basic results about strong Markov pro-
cesses on polish spaces, their sample paths, and their generators. Item (a)
says that a Feller evolution (or propagator) can be considered as the one-
dimensional distributions, or marginals, of a strong Markov process. Item (b)
describes the reverse situation: with certain Markov processes we may asso-
ciate Feller propagators. In item (c) the intimate link between unique solutions
to the martingale problem and the strong Markov property is established. Item
(d) contains a converse result: Markov processes can be considered as solutions
to the martingale problem. Finally, in (e) operators which possess unique lin-
ear extensions which generate Feller evolutions are described, and for which
the martingale problem is uniquely solvable. For such operators the martingale
problem is said to be well-posed. A Hunt process is a strong Markov process
which is quasi-left continuous with respect to the minimum completed ad-
missible filtration {Ftτ }τ ≤t≤T . For assertion (a) in the locally compact setting
and a time-homogeneous Feller evolution (i.e. a Feller-Dynkin semigroup) the
reader may e.g. consult R.M. Blumenthal and R.K. Getoor [34].
Theorem 1.39.
(a) Let {P (τ, t) : τ ≤ t ≤ T } be a Feller evolution in Cb (E). Then there exists
    a strong Markov process (in fact a Hunt process)

      {(Ω, FTτ, Pτ,x), (X(t), τ ≤ t ≤ T), (∨t : τ ≤ t ≤ T), (E, E)},   (1.84)

    such that [P (τ, t)f ] (x) = Eτ,x [f (X(t))], f ∈ Cb (E), 0 ≤ τ ≤ t ≤ T. Moreover this
    Markov process is normal (i.e. Pτ,x [X(τ) = x] = 1), is right continuous
    (i.e. lim_{t↓s} X(t) = X(s), Pτ,x-almost surely for τ ≤ s ≤ T ), possesses left
    limits in E on its life time (i.e. lim_{t↑s} X(t) exists in E whenever ζ > s),
    and is quasi-left continuous (i.e. if (τn : n ∈ N) is an increasing sequence
    of (Ft+τ)-stopping times, then X(τn) converges Pτ,x-almost surely to X (τ∞) on
    the event {τ∞ < ζ}, where τ∞ = sup_{n∈N} τn). Here ζ is the life time of the
    process t 7→ X(t): ζ = inf {s > 0 : X(s) = ∆} when X(s) = ∆ for some
    s ∈ [0, T ], and ζ = T otherwise. Put

      Ft+τ = ∩_{s∈(t,T]} Fsτ = ∩_{s∈(t,T]} σ (X(ρ) : τ ≤ ρ ≤ s).   (1.85)

    Let F : Ω → C be a bounded FTs-measurable stochastic variable. Then

      Es,X(s) [F ] = Eτ,x [F | Fsτ] = Eτ,x [F | Fs+τ]   (1.86)

    Pτ,x-almost surely for all τ ≤ s and x ∈ E. Consequently, the process
    defined in (1.84) is in fact a Markov process with respect to the right-closed
    filtrations (Ft+τ)t∈[τ,T], τ ∈ [0, T ]. Moreover, the events {X(t) ∈ E} and
    {X(t) ∈ E, ζ ≥ t} coincide Pτ,x-almost surely for τ ≤ t ≤ T and x ∈ E.
    Even more is true: the process defined in (1.84) is strong Markov with
    respect to the filtrations (Ftτ)t∈[τ,T], τ ∈ [0, T ], in the sense that

      ES,X(S) [F ◦ ∨S] = Eτ,x [F ◦ ∨S | FS+τ]   (1.87)

    for all bounded FT0-measurable stochastic variables F : Ω → C and for
    all (Ft+τ)t∈[τ,T]-stopping times S : Ω → [τ, T ]. The σ-field FS+τ is de-
    fined in Remark 1.40: see (1.91). As Ω the Skorohod space D ([0, T ], E), if
    P (τ, x; t, E) = 1 for all 0 ≤ τ ≤ t ≤ T, or D ([0, T ], E∆) otherwise, may
    be chosen.
(b) Conversely, let

      {(Ω, FTτ, Pτ,x), (X(t), τ ≤ t ≤ T), (∨t : τ ≤ t ≤ T), (E, E)}   (1.88)

    be a strong Markov process which is normal, right continuous, and pos-
    sesses left limits in E on its life time. Put, for x ∈ E and 0 ≤ τ ≤ t ≤ T,
    and f ∈ L∞ ([0, T ] × E, E),

      [P (τ, t)f (t, ·)] (x) = Eτ,x [f (t, X(t))] = ∫ P (τ, x; t, dy) f (t, y),   (1.89)

    where P (τ, x; t, B) = Pτ,x [X(t) ∈ B], B ∈ E. Suppose that the function
    (s, t, x) 7→ P (s, t)f (x) is continuous on the set

      {(s, t, x) ∈ [0, T ] × [0, T ] × E : s ≤ t}

    for all functions f belonging to Cb (E). Then the
    family {P (s, t) : T ≥ t ≥ s ≥ 0} is a Feller evolution. Moreover, for f ∈
    Cb ([0, T ] × E) the function (s, t, x) 7→ P (s, t)f (t, ·) (x) is continuous on
    the same space. The operators ∨t : Ω → Ω, t ∈ [τ, T ], have the property
    that for all (τ, x) ∈ [0, T ] × E the equality X(s) ◦ ∨t = X (s ∨ t) holds
    Pτ,x-almost surely for all t ∈ [τ, T ].
(c) Let the family L = {L(s) : 0 ≤ s ≤ T } be the generator of a Feller evo-
    lution in Cb (E) and let the process in (1.84) be the corresponding
    Markov process. For every f ∈ D(1) (L) and for every (τ, x) ∈ [0, T ] × E,
    the process

      t 7→ f (t, X(t)) − f (τ, X(τ)) − ∫_τ^t (∂/∂s + L(s)) f (s, X(s)) ds   (1.90)

    is a Pτ,x-martingale for the filtration (Ftτ)T≥t≥τ, where each σ-field Ftτ,
    T ≥ t ≥ τ ≥ 0, is (some completion of) σ (X(u) : τ ≤ u ≤ t). In fact the σ-
    field Ftτ may be taken as Ftτ = ∩_{s>t} σ (X(ρ) : τ ≤ ρ ≤ s). It is also possible
    to complete Ftτ with respect to Pτ,µ, given by Pτ,µ (A) = ∫ Pτ,x (A) dµ(x).
    For Ftτ the following σ-field may be chosen:

      Ftτ = ∩_{µ∈P(E)} ∩_{T≥s>t} {Pτ,µ-completion of σ (X(u) : τ ≤ u ≤ s)}.

(d) Conversely, let L = {L(s) : 0 ≤ s ≤ T } be a family of Tβ-densely defined
    linear operators with domain D(L(s)) and range R(L(s)) in Cb (E), such
    that D(1) (L) is Tβ-dense in Cb ([0, T ] × E). Let

      ((Ω, FTτ, Pτ,x) : (τ, x) ∈ [0, T ] × E)

    be a unique family of probability spaces, together with state variables
    (X(t) : t ∈ [0, T ]) defined on the filtered measure space (Ω, (Ftτ)τ≤t≤T),
    with values in the state space (E, E), with the following properties: for all
    pairs 0 ≤ τ ≤ t ≤ T the state variable X(t) is Ftτ-E-measurable; for all
    pairs (τ, x) ∈ [0, T ] × E, Pτ,x [X(τ) = x] = 1; and for all f ∈ D(1) (L) the
    process

      t 7→ f (t, X(t)) − f (τ, X(τ)) − ∫_τ^t (∂/∂s + L(s)) f (s, X(s)) ds

    is a Pτ,x-martingale with respect to the filtration (Ftτ)τ≤t≤T. Then the fam-
    ily of operators L = {L(s) : 0 ≤ s ≤ T } possesses a unique extension

      L0 = {L0 (s) : 0 ≤ s ≤ T },

    which generates a Feller evolution in Cb (E). It is required that the operator
    D1 + L is sequentially λ-dominant in the sense of Definition 3.6; i.e. every
    sequence of functions (ψn)n∈N ⊂ Cb ([0, T ] × E) which decreases pointwise
    to zero is such that the sequence {ψnλ : n ∈ N}, defined by

      ψnλ = sup_{K∈K([0,T]×E)} inf {g ≥ ψn 1K : g ∈ D (D1 + L), (λI − D1 − L) g ≥ 0},

    decreases uniformly on compact subsets of [0, T ] × E to zero. In addition,
    the sample space Ω is supposed to be the Skorohod space D ([0, T ], E∆); in
    particular X(t) ∈ E, τ ≤ s < t, implies X(s) ∈ E.
(e) (Unique Markov extensions) Suppose that the Tβ-densely defined linear
    operator

      D1 + L = {∂/∂s + L(s) : 0 ≤ s ≤ T },

    with domain and range in Cb ([0, T ] × E), possesses the global Korovkin
    property and satisfies the maximum principle, as exhibited in Definition
    1.33. Also suppose that L assigns real functions to real functions. Then
    the family L = {L(s) : 0 ≤ s ≤ T } extends to a unique generator L0 =
    {L0 (s) : 0 ≤ s ≤ T } of a Feller evolution, and the martingale problem is
    well posed for the family of operators {L(s) : 0 ≤ s ≤ T }. Moreover, the
    Markov process associated with {L0 (s) : 0 ≤ s ≤ T } solves the martingale
    problem uniquely for the family L = {L(s) : 0 ≤ s ≤ T }.
    Let E0 be a subset of E which is polish for the relative metric. The same
    conclusion is true with E0 instead of E if the operator D1 + L possesses
    the following properties:
    1. If f ∈ D(1) (L) vanishes on E0, then D1 f + Lf vanishes on E0 as well.
    2. The operator D1 + L satisfies the maximum principle on E0.
    3. The operator D1 + L is positive Tβ-dissipative on E0.
    4. The operator D1 + L is sequentially λ-dominant on E0 for some λ > 0.
    5. The operator D1 + L has the Korovkin property on E0.
    The notion of maximum principle on E0 is explained in the definitions
    1.34 and 1.33: see Proposition 1.35 as well. The concept of Korovkin prop-
    erty on a subset E0 can be found in Definition 1.38. Let (D1 + L) ↾E0
    be the operator defined by D ((D1 + L) ↾E0) = {f ↾E0 : f ∈ D(1) (L)}, and
    (D1 + L) ↾E0 (f ↾E0) = (D1 f + Lf) ↾E0, f ∈ D(1) (L). Then the operator L ↾E0
    possesses a unique linear extension to the generator L0 of a Feller semigroup
    on Cb (E0).
For the notion of T_β-dissipativity the reader is referred to inequality (3.14) in Definition 3.5, and for the notion of sequentially λ-dominant operators see Definition 3.6. In Proposition 1.21 the function ψ_n^λ in assertion (d) is denoted by U_λ^1(ψ_n). The sequential λ-dominance guarantees that the semigroup which can be constructed from the other hypotheses in (d) and (e) is indeed a Feller semigroup: see Theorem 3.10.
Remark 1.40. Notice that in (1.87) we cannot necessarily write

E_{S,X(S)}[F ∘ ∨_S] = E_{τ,x}[F ∘ ∨_S | F_S^τ],

because events of the form {S ≤ t} need not be F_t^τ-measurable, and hence the σ-field F_S^τ is not well-defined. In (1.87) the σ-field F_{S+}^τ is defined by

F_{S+}^τ = ∩_{t≥0} {A ∈ F_T^τ : A ∩ {S ≤ t} ∈ F_{t+}^τ}.   (1.91)

Remark 1.41. Let d : E × E → [0, 1] be a metric on E which turns E into a complete metrizable space, and let ∆ be an isolated point of E^∆ = E ∪ {∆}. The metric d^∆ : E^∆ × E^∆ → [0, 1], defined by

d^∆(x, y) = d(x, y) 1_E(x) 1_E(y) + |1_{∆}(x) − 1_{∆}(y)|,

turns E^∆ into a complete metrizable space. Moreover, if (E, d) is separable, then so is (E^∆, d^∆). We also notice that the function x ↦ 1_E(x), x ∈ E^∆, belongs to C_b(E^∆).
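The definition of d^∆ is easy to exercise in code. In the sketch below, DELTA is a hypothetical sentinel object playing the role of the isolated point ∆, and d is a truncated metric on E = R; none of this is part of the text's formal development.

```python
DELTA = object()  # sentinel playing the role of the cemetery point Delta

def d(x, y):
    # a metric on E = R with values in [0, 1]
    return min(abs(x - y), 1.0)

def d_delta(x, y):
    # d_delta(x, y) = d(x, y) 1_E(x) 1_E(y) + |1_{Delta}(x) - 1_{Delta}(y)|
    in_e_x, in_e_y = x is not DELTA, y is not DELTA
    base = d(x, y) if (in_e_x and in_e_y) else 0.0
    return base + abs((not in_e_x) - (not in_e_y))
```

Every point of E is at d^∆-distance 1 from ∆, which is exactly what makes ∆ an isolated point of (E^∆, d^∆).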

Remark 1.42. Let {P(τ, t) : 0 ≤ τ ≤ t ≤ T} be an evolution family on C_b(E). Suppose that for every sequence of functions (f_n)_{n∈N} which decreases pointwise to zero, lim_{n→∞} P(τ, t)f_n(x) = 0, 0 ≤ τ ≤ t ≤ T. Then there exists a family of Borel measures {B ↦ P(τ, x; t, B) : 0 ≤ τ ≤ t ≤ T} such that

P(τ, t)f(x) = ∫ f(y) P(τ, x; t, dy), f ∈ C_b(E).   (1.92)

This is a consequence of Corollary 1.5. In addition the family

{B ↦ P(τ, x; t, B) : 0 ≤ τ ≤ t ≤ T}

satisfies the equation of Chapman-Kolmogorov:

∫ P(τ, x; s, dz) P(s, z; t, B) = P(τ, x; t, B), 0 ≤ τ ≤ s ≤ t ≤ T, B ∈ E.   (1.93)

Next, for B ∈ E_∆ and 0 ≤ τ ≤ t ≤ T we put

N(τ, x; t, B) = P(τ, x; t, B ∩ E) + (1 − P(τ, x; t, E)) 1_B(∆), x ∈ E, and
N(τ, ∆; t, B) = 1_B(∆).   (1.94)

Then the family {B ↦ N(τ, x; t, B) : 0 ≤ τ ≤ t ≤ T} satisfies the Chapman-Kolmogorov equation on E^∆, N(τ, x; t, E^∆) = 1, and N(τ, ∆; t, E) = 0. Consequently, if B ↦ P(τ, x; t, B) is a sub-probability measure on E, then B ↦ N(τ, x; t, B) is a probability measure on E_∆, the Borel field of E^∆.
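For a finite state space the completion (1.94) is plain bookkeeping, and the Chapman-Kolmogorov equation becomes matrix multiplication. The snippet below uses a hypothetical time-homogeneous sub-stochastic matrix on a two-point space E = {0, 1} purely as an illustration; in particular, completing to E^∆ and composing kernels commute.

```python
import numpy as np

# sub-stochastic one-step kernel on E = {0, 1}: each row sums to at most 1;
# the deficit 1 - P(x, E) is the mass sent to the cemetery point Delta
P = np.array([[0.5, 0.3],
              [0.2, 0.6]])

def complete(P):
    """Adjoin Delta as an absorbing state, as in (1.94):
    N(x, B) = P(x, B & E) + (1 - P(x, E)) 1_B(Delta), N(Delta, B) = 1_B(Delta)."""
    n = P.shape[0]
    N = np.zeros((n + 1, n + 1))
    N[:n, :n] = P
    N[:n, n] = 1.0 - P.sum(axis=1)   # mass lost from E goes to Delta
    N[n, n] = 1.0                    # Delta is absorbing
    return N

N = complete(P)
```

Because ∆ is absorbing, completing first or composing first gives the same two-step kernel: complete(P @ P) equals complete(P) @ complete(P).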

Remark 1.43. Besides the family of (maximum) time operators {∨_t : t ∈ [0,T]} we have the following more or less natural families: {∧_t : t ∈ [0,T]} (minimum time operators), and the time translation or time shift operators {ϑ_t^T : t ∈ [0,T]}. Instead of ϑ_t^T we usually write ϑ_t. The operators ∧_t : Ω → Ω have the basic properties: ∧_s ∘ ∧_t = ∧_{s∧t}, s, t ∈ [0,T], and X(s) ∘ ∧_t = X(s ∧ t), s, t ∈ [0,T]. The operators ϑ_t : Ω → Ω, t ∈ [0,T], have the following basic properties: ϑ_s ∘ ϑ_t = ϑ_{s+t}, s, t ∈ [0,T], and X(s) ∘ ϑ_t = X((s + t) ∧ T) = X(ϑ_{s+t}(0)).
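On a finite discrete time grid the operators ∨_t, ∧_t, and ϑ_t act on a path ω by index arithmetic, which makes the listed identities easy to check mechanically. The representation of paths as Python lists below is an illustrative convention only, not the Skorohod-space setting of the text.

```python
T = 5  # horizon; a path is a list (omega[t] for t = 0, ..., T)

def vee(s, omega):    # maximum time operator: (vee_s omega)(t) = omega(s v t)
    return [omega[max(s, t)] for t in range(T + 1)]

def wedge(s, omega):  # minimum time operator: (wedge_s omega)(t) = omega(s ^ t)
    return [omega[min(s, t)] for t in range(T + 1)]

def theta(s, omega):  # time shift: (theta_s omega)(t) = omega((s + t) ^ T)
    return [omega[min(s + t, T)] for t in range(T + 1)]

def X(t, omega):      # state variable / coordinate mapping
    return omega[t]
```

Composing, e.g., wedge(s, wedge(t, omega)) reproduces wedge(min(s, t), omega), which is the identity ∧_s ∘ ∧_t = ∧_{s∧t}.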

It is clear that if a diffusion process (X_t, Ω, F_t^τ, P_{τ,x}) generated by the family of operators L_τ exists, then for every pair (τ, x) ∈ [0,T] × R^d, the measure P_{τ,x} solves the martingale problem π(τ, x). Conversely, if the family L_τ is given, we can try to solve the martingale problem for all (τ, x) ∈ [0,T] × R^d, find the measures P_{τ,x}, and then try to prove that X_t is a Markov process with respect to the family of measures P_{τ,x}. For instance, if we know that for every pair (τ, x) ∈ [0,T] × R^d the martingale problem π(τ, x) is uniquely solvable, then the Markov property holds, provided that there exist operators ∨_s : Ω → Ω, 0 ≤ s ≤ T, such that X_t ∘ ∨_s = X_{t∨s}, P_{τ,x}-almost surely for τ ≤ t ≤ T and τ ≤ s ≤ T. For the time-homogeneous case see, e.g., [84] or [109]. The martingale problem goes back to Stroock and Varadhan (see [225]). It has found numerous applications in various fields of mathematics. We refer the reader to [147], [136], and [135] for more information about and applications of the martingale problem. In [80] the reader may find singular diffusion equations which possess or which do not possess unique solutions. Consequently, for (singular) diffusion equations without unique solutions the martingale problem is not uniquely solvable.
Examples of (Feller) semigroups can be manufactured by taking a continuous function ϕ : [0, ∞) × E → E with the property that

ϕ(s + t, x) = ϕ(t, ϕ(s, x)), for all s, t ≥ 0 and x ∈ E.

Then the mappings f ↦ P(t)f, with P(t)f(x) = f(ϕ(t, x)), define a semigroup. It is a Feller semigroup if lim_{x→∆} ϕ(t, x) = ∆. An explicit example of such a function, which does not provide a Feller semigroup on C_0(R), is given by

ϕ(t, x) = x / √(1 + 2tx²)

(example due to V. Kolokoltsov). Put u(t, x) = P(t)f(x) = f(ϕ(t, x)). Then

∂u/∂t (t, x) = −x³ ∂u/∂x (t, x).

In fact this (counter-)example shows that solutions to the martingale problem do not necessarily give rise to Feller-Dynkin semigroups. These are semigroups which preserve not only the continuity, but also the property that functions which tend to zero at ∆ are mapped to functions with the same behavior. However, for Feller semigroups we only require that continuous functions with values in [0, 1] are mapped to continuous functions with the same properties. Therefore, it is not necessary to include a hypothesis like (1.95) in item (d) of Theorem 1.39. Here (1.95) reads as follows: for every (τ, s, t, x) ∈ [0,T]³ × E, τ < s < t, the equality

P_{τ,x}[X(t) ∈ E] = P_{τ,x}[X(t) ∈ E, X(s) ∈ E]   (1.95)

holds.
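Reading the flow in Kolokoltsov's example as ϕ(t, x) = x/√(1 + 2tx²) (the normalization consistent with ∂u/∂t = −x³ ∂u/∂x), both the semigroup identity and the failure of the C_0-property can be checked numerically: as x → ±∞, ϕ(t, x) tends to the finite value ±1/√(2t), so P(t)f need not vanish at infinity. The snippet below is an illustration only.

```python
import math

def phi(t, x):
    # flow of the ODE x' = -x^3: phi(t, x) = x / sqrt(1 + 2 t x^2)
    return x / math.sqrt(1.0 + 2.0 * t * x * x)

def Pf(t, f, x):
    # the induced semigroup P(t)f(x) = f(phi(t, x))
    return f(phi(t, x))

f = lambda y: math.exp(-y * y)   # a function in C0(R)
```

For t = 1/2 and very large x one has ϕ(t, x) ≈ 1, so P(t)f(x) ≈ f(1) = e^{−1} ≠ 0: P(t) maps C_b(R) into itself, but not C_0(R) into itself.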
In fact the result as stated is correct, but in case E happens to be locally compact the resulting semigroup need not be a Feller-Dynkin semigroup. This means that the corresponding family of operators maps functions in C_0(E) to bounded continuous functions, but the images need not vanish at ∆. This means that the main result, Theorem 2.5, as stated in [48] is not correct: solutions to the martingale problem can, after having visited ∆, still be alive. In the case of a non-compact space the metric without the Lévy part is not adequate. That is why we have added the Lévy term. The problem is that the limit of the finite-dimensional distributions, given in (2.105) below, on its own need not be a measure, and so there is no way of applying Kolmogorov's extension theorem.

1.3.1 Some historical remarks and references


In [48] we used only the first term in the distance d_L of formula (2.104); this is not adequate in the non-compact case. The reason is that the second term in the right-hand side of the definition of the metric d_L(P_2, P_1) in (2.104) ensures that the limiting "functionals" are indeed probability measures. Here we use a concept due to Lévy: the Lévy metric.
the authors Dorroh and Neuberger also use the strict topology to describe the
behavior of semigroups acting on the space of bounded continuous functions
on a Polish space. In fact the author of the present book was at least partially
motivated by their work to establish a general theory for Markov processes
on Polish spaces. Another motivation is provided by results on bi-topological
spaces as established by e.g. Kühnemund in [140]. Other authors have used
this concept as well, e.g. Es-Sarhir and Farkas in [82]. The notion of “strict
topology” plays a dominant role in Hirschfeld [103]. As already mentioned
Buck [44] was the first author who introduced the notion of strict topology
(in the locally compact setting). He denoted it by β in §3 of [44]. There are
several other authors who used it and proved convergence and approxima-
tion properties involving the strict topology: Buck [43], Prolla [190], Prolla
and Navarro [191], Katsaras [131], Ruess [206], Giles [91], Todd [235], Wells
[253]. This list is not exhaustive: the reader is also referred to Prolla [189],
and the literature cited there. In [250] Varadhan describes a metric on the space D([0,1], R) which turns it into a complete metrizable separable space; i.e. the Skorohod topology turns D([0,1], R) into a Polish space. On the other hand it is by no means necessary that the Skorohod topology is the most natural topology to be used on the space D([0,1], R^d). For example, in [113] Jakubowski employs a quite different topology on this space, and in [114] he elaborates on Skorohod's ideas about sequential convergence of distributions of stochastic processes. After that the S-topology, as introduced by Jakubowski, has been used by several others as well: see the references in [39]. Definition 1.44 below also appears in [39]. Although the definition is confined to R-valued paths, the S-topology extends easily to the finite-dimensional Euclidean space R^d. By V_+ ⊂ D([0,T], R) we denote the space of nonnegative and nondecreasing functions V : [0,T] → [0, ∞), and we set V = V_+ − V_+. Any element V ∈ V_+ determines a unique positive measure dV on [0,T], and V can be equipped with the topology of weak convergence of measures; i.e. the equality lim_{n→∞} ∫_0^T ϕ(s) dV_n(s) = ∫_0^T ϕ(s) dV(s) for all functions ϕ ∈ C([0,T], R) describes the weak convergence of the sequence (V_n)_{n∈N} ⊂ V to V ∈ V. Without loss of generality we may assume that the functions V ∈ V are right-continuous and possess left limits in R.
Definition 1.44. Let (Y^n)_{1≤n≤∞} ⊂ D([0,T], R). The sequence (Y^n)_{n∈N} is said to converge to Y^∞ with respect to the S-topology if for every ε > 0 there exist elements (V^{n,ε})_{1≤n≤∞} ⊂ V such that ‖V^{n,ε} − Y^n‖_∞ ≤ ε, n = 1, ..., ∞, and

lim_{n→∞} ∫_0^T ϕ(s) dV^{n,ε}(s) = ∫_0^T ϕ(s) dV^{∞,ε}(s), for all ϕ ∈ C([0,T], R).

1.4 Dini's lemma, Scheffé's theorem, and the monotone class theorem
The contents of this section are taken from Appendix E in [70]. In this section we formulate and discuss these three theorems.

1.4.1 Dini's lemma and Scheffé's theorem

This subsection is devoted to Dini's lemma and Scheffé's theorem. Another proof of Dini's lemma can be found in Stroock [222], Lemma 7.1.23, p. 146.
Lemma 1.45. (Dini) Let (fn : n ∈ N) be a sequence of continuous functions
on the locally compact Hausdorff space E. Suppose that fn (x) ≥ fn+1 (x) ≥ 0
for all n ∈ N and for all x ∈ E. If limn→∞ fn (x) = 0 for all x ∈ E, then,
for all compact subsets K of E, limn→∞ supx∈K fn (x) = 0. If the function f1
belongs to C0 (E), then limn→∞ supx∈E fn (x) = 0.
Proof. We only prove the second assertion. Fix η > 0 and consider the subset

∩_{n∈N} {x ∈ E : f_n(x) ≥ η}.

Since, by assumption, the function f_1 belongs to C_0(E) and lim_{n→∞} f_n(x) = 0 for all x ∈ E, the intersection ∩_{n∈N} {x ∈ E : f_n(x) ≥ η} is void. As a consequence E = ∪_{n∈N} {f_n < η}. Let ε > 0 and put K = {f_1 ≥ ε}. The subset K is compact. By the preceding argument there exists n_ε ∈ N for which K ⊆ {f_{n_ε} < ε}. For n ≥ n_ε we have 0 ≤ f_n(x) ≤ ε for all x ∈ E.
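Both halves of Dini's lemma can be seen on the locally compact space E = [0, 1) with f_n(x) = x^n: the sequence decreases pointwise to zero and converges uniformly on every compact K = [0, a] with a < 1, but not uniformly on E, consistent with the fact that f_1(x) = x does not belong to C_0([0, 1)). The grid computation below is merely illustrative.

```python
import numpy as np

def f(n, x):
    # f_n(x) = x^n on E = [0, 1): continuous, decreasing in n, pointwise -> 0
    return x ** n

compact_K = np.linspace(0.0, 0.9, 1000)         # a compact subset [0, 0.9] of E
near_boundary = np.linspace(0.0, 0.9999, 1000)  # grid creeping up to the missing endpoint 1

sup_on_K = [f(n, compact_K).max() for n in range(1, 201)]
sup_near_1 = [f(n, near_boundary).max() for n in range(1, 201)]
```

The suprema over the compact set collapse geometrically, while the suprema over the whole of E stay close to 1 for every n.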
In Definition 1.46 and in Theorem 1.50 of this subsection (E, E, m) may be
any measure space with m(B) ≥ 0 for B ∈ E.
Definition 1.46. A collection of functions {f_j : j ∈ J} in L^1(E, E, m) is uniformly L^1-integrable if for every ε > 0 there exists g ∈ L^1(E, E, m), g ≥ 0, for which

sup_{j∈J} ∫_{{|f_j| ≥ g}} |f_j| dm ≤ ε.

Remark 1.47. If the collection {fj : j ∈ J} is uniformly L1 -integrable, and if


{gj : j ∈ J} is a collection for which |gj | ≤ |fj |, m-almost everywhere, for all
j ∈ J, then the collection {gj : j ∈ J} is uniformly L1 -integrable as well.
Remark 1.48. Cauchy sequences in L1 (E, E, m) are uniformly L1 -integrable.
Remark 1.49. Let f ≥ 0 be a function in L^1(R^ν, B, m), where m is the Lebesgue measure. Suppose ∫ f(x) dm(x) = 1 and lim_{n→∞} n^ν f(nx) = 0 for all x ≠ 0. Put f_n(x) = n^ν f(nx), n ∈ N. Then the sequence is not uniformly L^1-integrable. This will follow from Theorem 1.50 below.
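Remark 1.49 can be checked numerically for ν = 1 with f a standard Gaussian density: every f_n(x) = n f(nx) integrates to 1 while f_n(x) → 0 for each x ≠ 0, so ∫|f_n| dm does not converge to ∫|lim_n f_n| dm = 0, and by Theorem 1.50 the sequence cannot be uniformly L^1-integrable. The quadrature below is only an illustration.

```python
import numpy as np

def f(x):
    # standard Gaussian density on R (nu = 1); integrates to 1
    return np.exp(-x * x / 2.0) / np.sqrt(2.0 * np.pi)

x = np.linspace(-50.0, 50.0, 1_000_001)
dx = x[1] - x[0]

def fn(n):
    # the rescaled sequence f_n(x) = n f(n x)
    return n * f(n * x)

def integral(y):
    # trapezoid rule on the fixed grid
    return float((y[:-1] + y[1:]).sum() * dx / 2.0)

integrals = [integral(fn(n)) for n in (1, 5, 25)]
pointwise = [float(n * f(1.0 * n)) for n in (1, 5, 25)]  # f_n(1) = n f(n) at the fixed point x = 1
```

All the integrals stay at 1 while the values at any fixed x ≠ 0 collapse to 0: the mass concentrates at the origin.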

A version of Scheffé's theorem reads as follows. Our proof uses the arguments in the proof of Theorem 3.3.5 (Lieb's version of Fatou's lemma) in Stroock [222], p. 54. Another proof can be found in Bauer [25], Theorem 2.12.4, p. 103.

Theorem 1.50. (Scheffé) Let (f_n : n ∈ N) be a sequence in L^1(E, E, m). If lim_{n→∞} f_n(x) = f(x), m-almost everywhere, then the sequence (f_n : n ∈ N) is uniformly L^1-integrable if and only if

lim_{n→∞} ∫ |f_n(x)| dm(x) = ∫ |f(x)| dm(x).

Proof. Consider the m-almost everywhere pointwise inequality

0 ≤ |f_n − f| + |f| − |f_n| ≤ 2|f|.   (1.96)

First suppose that the sequence {f_n : n ∈ N} is uniformly L^1-integrable. Then, by Fatou's lemma,

∫ |f(x)| dm(x) = ∫ lim inf_{n→∞} |f_n(x)| dm(x) ≤ lim inf_{n→∞} ∫ |f_n(x)| dm(x)

(choose g ∈ L^1(E, m), g ≥ 0, such that sup_n ∫_{{|f_n| ≥ g}} |f_n(x)| dm(x) ≤ 1)

≤ lim inf_{n→∞} ( ∫_{{|f_n| ≥ g}} |f_n(x)| dm(x) + ∫_{{|f_n| ≤ g}} |f_n(x)| dm(x) ) ≤ 1 + ∫ g(x) dm(x).   (1.97)

From (1.97) we see that the function f belongs to L^1(E, m). From Lebesgue's dominated convergence theorem in conjunction with (1.96) and (1.97) we infer

lim_{n→∞} ∫ (|f_n − f| + |f| − |f_n|) dm = 0.   (1.98)

Since the sequence {f_n : n ∈ N} is uniformly L^1-integrable, and since for m-almost all x, lim_{n→∞} f_n(x) = f(x), we see that lim_{n→∞} ∫ |f_n − f| dm = 0. So from (1.98) we get

lim_{n→∞} ∫ |f_n| dm = ∫ |f| dm < ∞.   (1.99)

Conversely, suppose (1.99) holds. Then f belongs to L^1(E, m). Again we may invoke Lebesgue's dominated convergence theorem to conclude (1.98) from (1.96). Using (1.99) once more implies lim_{n→∞} ∫ |f_n − f| dm = 0. An appeal to Remark 1.48 yields the desired result.

1.4.2 Monotone class theorem

Our presentation of the monotone class theorems is taken from Blumenthal and Getoor [34], pp. 5–7. For other versions of this theorem see e.g. Sharpe [208], pp. 364–366. In Theorems 1.52 and 1.53 and in Propositions 1.54 and 1.55 we give closely related versions of this theorem.
Definition 1.51. Let Ω be a set and let S be a collection of subsets of Ω.
Then S is a Dynkin system if it has the following properties:
(a) Ω ∈ S;
(b) if A and B belong to S and if A ⊇ B, then A \ B belongs to S;
(c) if (A_n : n ∈ N) is an increasing sequence of elements of S, then the union ∪_{n=1}^∞ A_n belongs to S.

The following result on Dynkin systems is well-known.


Theorem 1.52. Let M be a collection of subsets of Ω, which is stable under finite intersections. The Dynkin system generated by M coincides with the σ-field generated by M.

Theorem 1.53. Let Ω be a set and let M be a collection of subsets of Ω, which is stable (or closed) under finite intersections. Let H be a vector space of real-valued functions on Ω satisfying:
(i) the constant function 1 belongs to H, and 1_A belongs to H for all A ∈ M;
(ii) if (f_n : n ∈ N) is an increasing sequence of non-negative functions in H such that f = sup_{n∈N} f_n is finite (bounded), then f belongs to H.
Then H contains all real-valued (bounded) functions on Ω that are σ(M)-measurable.

Proof. Put D = {A ⊆ Ω : 1_A ∈ H}. Then by (i) Ω belongs to D and D ⊇ M. If A and B are in D and if B ⊇ A, then B \ A belongs to D. If (A_n : n ∈ N) is an increasing sequence in D, then 1_{∪A_n} = sup_n 1_{A_n} belongs to H by (ii), and hence ∪_n A_n belongs to D. Hence D is a Dynkin system that contains M. Since M is closed under finite intersections, it follows by Theorem 1.52 that D ⊇ σ(M). If f ≥ 0 is measurable with respect to σ(M), then

f = sup_n Σ_{j=1}^{n2^n} 2^{−n} 1_{{f ≥ j2^{−n}}}.   (1.100)

Since the sets {f ≥ j2^{−n}}, j, n ∈ N, belong to σ(M), we see that f belongs to H. Here we employed the fact that σ(M) ⊆ D. If f is σ(M)-measurable, then we write f as a difference of two non-negative σ(M)-measurable functions.
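The dyadic approximation (1.100) can be evaluated directly: for a fixed value y = f(ω) ≥ 0 the inner sum equals min(⌊y·2ⁿ⌋/2ⁿ, n), which increases to y as n → ∞. The helper names below are illustrative.

```python
import math

def dyadic_level_closed(y, n):
    # closed form of sum_{j=1}^{n 2^n} 2^{-n} 1_{{y >= j 2^{-n}}} = min(floor(y 2^n)/2^n, n)
    return min(math.floor(y * 2 ** n) / 2 ** n, float(n))

def dyadic_level(y, n):
    # the literal sum from (1.100), feasible for small n
    count = sum(1 for j in range(1, n * 2 ** n + 1) if y >= j * 2.0 ** (-n))
    return count * 2.0 ** (-n)

def dyadic_sup(y, n_max=20):
    # sup over n of the levels; by (1.100) this recovers y for finite y >= 0
    return max(dyadic_level_closed(y, n) for n in range(1, n_max + 1))
```

The truncation min(·, n) is what keeps each level a finite sum while still letting the supremum reach arbitrarily large values of f.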

The previous theorems, i.e. Theorems 1.52 and 1.53, are used in the following form. Let Ω be a set and let (E_i, E_i)_{i∈I} be a family of measurable spaces, indexed by an arbitrary set I. For each i ∈ I, let S_i denote a collection of subsets of E_i, closed under finite intersections, which generates the σ-field E_i, and let f_i : Ω → E_i be a map from Ω to E_i. In our presentation of the Markov property the spaces E_i are all the same, and the maps f_i, i ∈ I, are the state variables X(t), t ≥ 0. In this context the following two propositions follow.
Proposition 1.54. Let M be the collection of all sets of the form ∩_{i∈J} f_i^{−1}(A_i), A_i ∈ S_i, i ∈ J, J ⊆ I, J finite. Then M is a collection of subsets of Ω which is stable under finite intersections, and σ(M) = σ(f_i : i ∈ I).

Proposition 1.55. Let H be a vector space of real-valued functions on Ω such that:
(i) the constant function 1 belongs to H;
(ii) if (h_n : n ∈ N) is an increasing sequence of non-negative functions in H such that h = sup_n h_n is finite (bounded), then h belongs to H;
(iii) H contains all products of the form Π_{i∈J} 1_{A_i} ∘ f_i, J ⊆ I, J finite, and A_i ∈ S_i, i ∈ J.
Under these assumptions H contains all real-valued (bounded) functions that are measurable with respect to σ(f_i : i ∈ I).

Definition 1.56. Theorems 1.52 and 1.53, and Propositions 1.54 and 1.55
are called the monotone class theorems.

Other theorems and results on integration theory, not explained in this book, can be found in any textbook on the subject. In particular this is true for Fatou's lemma and Fubini's theorem on the interchange of the order of integration. Proofs of these results can be found in Bauer [25] and Stroock [222]. The same references contain proofs of the Radon-Nikodym theorem. This theorem may be phrased as follows.
Theorem 1.57. (Radon-Nikodym) If a finite measure µ on some σ-finite measure space (E, E, m) is absolutely continuous with respect to m, then there exists a function f ∈ L^1(E, E, m) such that µ(A) = ∫_A f(x) dm(x) for all sets A ∈ E.

The measure µ is said to be absolutely continuous with respect to m if m(A) = 0 implies µ(A) = 0, and the measure m is said to be σ-finite if there exists an increasing sequence (E_n : n ∈ N) in E such that E = ∪_{n∈N} E_n and for which m(E_n) < ∞, n ∈ N. A very important application is the existence of conditional expectations. This can be seen as follows.
Corollary 1.58. Let (Ω, F, P) be a probability space, let F_0 be a sub-σ-field of F, and let Y : Ω → [0, ∞] be an F-measurable function (random variable) in L^1(Ω, F, P). Then there exists a function G ∈ L^1(Ω, F_0, P) such that E[Y 1_A] = E[G 1_A] for all A ∈ F_0.

By convention the random variable G is written as G = E[Y | F_0]. It is called the conditional expectation of Y with respect to the σ-field F_0.

Proof. Put m(A) = E[Y 1_A], A ∈ F, and let µ be the restriction of m to F_0. If for some A ∈ F_0, P(A) = 0, then µ(A) = 0. The Radon-Nikodym theorem yields the existence of a function G ∈ L^1(Ω, F_0, P) such that E[Y 1_A] = µ(A) = E[G 1_A] for all A ∈ F_0.
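On a finite probability space with F_0 generated by a partition, the function G produced by the Radon-Nikodym argument is simply the cell-wise average of Y. The toy space, partition, and values below are arbitrary illustrative choices; the point is the defining identity E[Y 1_A] = E[G 1_A] on the generating cells.

```python
# finite probability space Omega = {0,...,5} with uniform P
omega = range(6)
P = {w: 1.0 / 6.0 for w in omega}
Y = {0: 1.0, 1: 4.0, 2: 2.0, 3: 2.0, 4: 0.0, 5: 3.0}

# F0 is the sigma-field generated by the partition {0,1}, {2,3}, {4,5}
partition = [{0, 1}, {2, 3}, {4, 5}]

def expect(Z, A):
    # E[Z 1_A]
    return sum(Z[w] * P[w] for w in A)

# G = E[Y | F0]: on each cell C of the partition, the P-average of Y over C
G = {}
for C in partition:
    avg = expect(Y, C) / sum(P[w] for w in C)
    for w in C:
        G[w] = avg
```

G is constant on each cell (hence F_0-measurable), and it carries the same integrals over F_0-sets as Y.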
2 Strong Markov processes: proof of main result

2.1 Proof of the main result: Theorem 1.39

In the present section we will prove items (a), (b), (c), (d), and (e) of Theorem 1.39. We will need a number of auxiliary results, which can be found in the current section or in Sections 3.1 and 3.2. We will always give the relevant references. We need the following definition.
Definition 2.1. Let {X(t)}_{t∈[0,T]} and {Y(t)}_{t∈[0,T]} be stochastic processes on (Ω, F_T^0, P). The process {X(t)}_{t∈[0,T]} is a modification of {Y(t)}_{t∈[0,T]} if P[X(t) = Y(t)] = 1 for all t ∈ [0,T].

2.1.1 Proof of item (a) of Theorem 1.39

This subsection contains the proof of part (a). It employs Kolmogorov's extension theorem, and it uses the Polish nature of the state space E in an essential way.
Proof (Proof of item (a) of Theorem 1.39). We begin with the proof of the existence of a Markov process (1.84), starting from a Feller evolution: see Definition 1.24. First we assume P(τ, t)1 = 1. Remark 1.42 will be used to prove assertion (a) in case P(τ, t)1 < 1. Temporarily we write Ω = E^{[0,T]}, endowed with the product topology and the product σ-algebra (or product σ-field), which is the smallest σ-field on Ω which renders all coordinate mappings, or state variables, measurable. The state variables X(t) : Ω → E are defined by X(t, ω) = X(t)(ω) = ω(t), ω ∈ Ω, and the maximal mappings ∨_s : Ω → Ω, s ∈ [0,T], are defined by ∨_s(ω)(t) = ω(s ∨ t). Let the family of Borel measures

{B ↦ P(τ, x; t, B) : B ∈ E, (τ, x) ∈ [0,T] × E, t ∈ [τ, T]}

be determined by the equalities:

P(τ, t)f(x) = ∫ f(y) P(τ, x; t, dy), f ∈ C_b(E).   (2.1)

By Kolmogorov's extension theorem there exists a family of probability spaces

(Ω, F_T^τ, P_{τ,x}), (τ, x) ∈ [0,T] × E,

such that

E_{τ,x}[f(X(t_1), ..., X(t_n))] = ∫ ... ∫ f(y_1, ..., y_n) P(τ, x; t_1, dy_1) ... P(t_{n−1}, y_{n−1}; t_n, dy_n),   (2.2)

where τ ≤ t_1 < ... < t_n ≤ T and f ∈ L^∞(E^n, E^{⊗n}). For f ∈ C_b([0,T] × E), 0 ≤ f, and α > 0 given we introduce the following processes:

t ↦ αR(α)f(t, X(t)) = α ∫_t^∞ e^{−α(ρ−t)} P(t, ρ ∧ T)f(ρ ∧ T, ·)(X(t)) dρ
  = α ∫_t^∞ e^{−α(ρ−t)} E_{t,X(t)}[f(ρ ∧ T, X(ρ ∧ T))] dρ, t ∈ [0,T], and   (2.3)

s ↦ P(s, t)f(t, ·)(X(s)) = E_{s,X(s)}[f(t, X(t))], s ∈ [0, t], t ∈ [0,T].   (2.4)

The processes in (2.3) and (2.4) could have been more or less unified by considering the process:

(s, t) ↦ α ∫_t^∞ e^{−α(ρ−t)} P(s, ρ ∧ T)f(ρ ∧ T, ·)(X(s)) dρ = αP(s, t)R(α)f(t, ·)(X(s)), 0 ≤ s ≤ t ≤ T.   (2.5)

Observe that lim_{α→∞} αR(α)f(t, X(t)) = f(t, X(t)), t ∈ [0,T]. Here we use the continuity of the function ρ ↦ P(t, ρ)f(ρ, ·)(X(t)) at ρ = t. In addition, for (τ, x) ∈ [0,T] × E fixed, we have that the family of functionals f ↦ αR(α)f(t, ·)(X(t)), α ≥ 1, t ∈ [τ, T], is P_{τ,x}-almost surely equi-continuous for the strict topology.
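The convergence lim_{α→∞} αR(α)f = f can be illustrated in the simplest deterministic, time-homogeneous special case, where P(ρ)f(x) = f(ϕ(ρ, x)) for the flow ϕ(ρ, x) = x/√(1 + 2ρx²) of Section 1.3 and the resolvent reduces to the Laplace-type integral αR(α)f(x) = α ∫_0^∞ e^{−αρ} f(ϕ(ρ, x)) dρ. The quadrature below is an assumption-laden toy computation, not the construction used in the proof.

```python
import math

def phi(rho, x):
    # deterministic flow of x' = -x^3 (Kolokoltsov's example)
    return x / math.sqrt(1.0 + 2.0 * rho * x * x)

def alpha_resolvent(alpha, g, x, n=100_000, upper=40.0):
    # alpha * integral_0^infty exp(-alpha*rho) g(phi(rho, x)) drho, trapezoid rule
    h = upper / n
    total = 0.5 * (g(phi(0.0, x)) + math.exp(-alpha * upper) * g(phi(upper, x)))
    for k in range(1, n):
        rho = k * h
        total += math.exp(-alpha * rho) * g(phi(rho, x))
    return alpha * total * h

f = lambda y: math.cos(y)   # an arbitrary bounded continuous test function
```

Since the exponential weight α e^{−αρ} concentrates at ρ = 0 as α grows, the error behaves like g'(0)/α, so the approximation to f(x) improves as α increases.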
Our first task will be to prove that for every (τ, x) ∈ [0,T] × E the orbit {(t, X(t)) : t ∈ [τ, T]} is P_{τ,x}-almost surely sequentially compact. Therefore we choose an infinite sequence (ρ_n, X(ρ_n))_{n∈N}, where ρ_n ∈ [τ, T], n ∈ N. This sequence contains an infinite subsequence (s_n, X(s_n))_{n∈N} such that s_n < s_{n+1}, n ∈ N, or an infinite subsequence (t_n, X(t_n))_{n∈N} such that t_n > t_{n+1}, n ∈ N. In the first case we put s = sup_{n∈N} s_n, and in the second case we write t = inf_{n∈N} t_n. In either case we shall prove that there exists a subsequence which is a Cauchy sequence in [τ, T] × E, P_{τ,x}-almost surely, with respect to a compatible uniformly bounded metric. First we deal with the case that t_n decreases to t ≥ τ. Then we consider the stochastic process in (2.4) given by ρ ↦ E_{t,X(t)}[f(ρ, X(ρ))], where f is an arbitrary function in C_b([0,T] × E). By hypothesis on the transition function P(τ, x; t, B) we have

lim_{n→∞} E_{t,X(t)}[f(t_n, X(t_n))] = lim_{n→∞} ∫ P(t, X(t); t_n, dy) f(t_n, y) = f(t, X(t)).   (2.6)
By applying the argument in (2.6) to the process ρ ↦ E_{t,X(t)}[|f(ρ, X(ρ))|²], ρ ∈ [t, T], the Markov property implies

E_{τ,x}[|E_{t,X(t)}[f(ρ, X(ρ))] − f(ρ, X(ρ))|²]
= E_{τ,x}[|f(ρ, X(ρ))|²] + E_{τ,x}[|E_{t,X(t)}[f(ρ, X(ρ))]|²] − 2 Re E_{τ,x}[conj(f(ρ, X(ρ))) E_{t,X(t)}[f(ρ, X(ρ))]]

(Markov property: E_{t,X(t)}[f(ρ, X(ρ))] = E_{τ,x}[f(ρ, X(ρ)) | F_t^τ], P_{τ,x}-almost surely, and Re(conj(z) z) = |z|²)

= E_{τ,x}[|f(ρ, X(ρ))|²] + E_{τ,x}[|E_{t,X(t)}[f(ρ, X(ρ))]|²] − 2 E_{τ,x}[|E_{t,X(t)}[f(ρ, X(ρ))]|²]
= E_{τ,x}[|f(ρ, X(ρ))|²] − E_{τ,x}[|E_{t,X(t)}[f(ρ, X(ρ))]|²].   (2.7)

Applying the argument in (2.6) to the process ρ ↦ E_{t,X(t)}[|f(ρ, X(ρ))|²], ρ ∈ [t, T], and employing (2.7) we obtain:

lim_{n→∞} E_{τ,x}[|E_{t,X(t)}[f(t_n, X(t_n))] − f(t_n, X(t_n))|²] = 0.   (2.8)
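The computation (2.7) rests on the elementary L²-identity E|E[f | G] − f|² = E|f|² − E|E[f | G]|², valid for any conditional expectation. It can be verified exactly on a small two-step chain, with G the σ-field generated by the first coordinate; the joint law and the function below are arbitrary illustrative choices.

```python
from itertools import product

# joint law of (X_s, X_t) on {0, 1, 2} x {0, 1, 2}: p[i][j] = P(X_s = i, X_t = j)
p = [[0.10, 0.05, 0.05],
     [0.05, 0.20, 0.05],
     [0.10, 0.10, 0.30]]
f = [1.0, -2.0, 0.5]          # a function of the terminal state X_t

def E(h):
    # expectation of h(i, j) under the joint law
    return sum(p[i][j] * h(i, j) for i, j in product(range(3), range(3)))

def cond(i):
    # E[f(X_t) | X_s = i], the conditional expectation given the first coordinate
    row = sum(p[i])
    return sum(p[i][j] * f[j] for j in range(3)) / row

lhs = E(lambda i, j: (cond(i) - f[j]) ** 2)
rhs = E(lambda i, j: f[j] ** 2) - E(lambda i, j: cond(i) ** 2)
```

The identity is a consequence of the orthogonality E[(f − E[f | G]) E[f | G]] = 0, which the test also checks.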

Again using (2.6) and invoking (2.8) we see that

lim_{n→∞} f(t_n, X(t_n)) = f(t, X(t))

in the space L²(Ω, F_T^τ, P_{τ,x}). Hence there exists a subsequence, denoted by (f(t_{n_k}, X(t_{n_k})))_{k∈N}, which converges P_{τ,x}-almost surely to f(t, X(t)). Let d : E × E → [0, 1] be a metric on E which turns it into a Polish space, and let (x_j)_{j∈N} be a countable dense sequence in E. The previous arguments are applied to the function f : [0,T] × E → R defined by

f(ρ, x) = Σ_{j=1}^∞ 2^{−j} (d(x_j, x) + |ρ_j − ρ|),   (2.9)

where the sequence (ρ_j)_{j∈N} is a dense sequence in [0,T]. From the previous arguments we see that there exists a subsequence (t_{n_k}, X(t_{n_k}))_{k∈N} such that

lim_{k→∞} f(t_{n_k}, X(t_{n_k})) = f(t, X(t)), P_{τ,x}-almost surely.   (2.10)

It follows that lim_{k→∞} t_{n_k} = t. From (2.10) we also infer that

lim_{k→∞} d(x_j, X(t_{n_k})) = d(x_j, X(t)), P_{τ,x}-almost surely, for all j ∈ N.   (2.11)

Since the sequence (x_j)_{j∈N} is dense in E we see that

lim_{k→∞} d(y, X(t_{n_k})) = d(y, X(t)), P_{τ,x}-almost surely, for all y ∈ E.   (2.12)

The substitution y = X(t) in (2.12) shows that

lim_{k→∞} (t_{n_k}, X(t_{n_k})) = (t, X(t)), P_{τ,x}-almost surely.   (2.13)

Again let f ∈ C_b([0,T] × E) be given. Next we consider the situation where we have an infinite subsequence (s_n, X(s_n))_{n∈N} such that s_n < s_{n+1}, n ∈ N. Put s = sup_{n∈N} s_n, and consider the process

ρ ↦ E_{ρ,X(ρ)}[f(s, X(s))] = ∫ P(ρ, X(ρ); s, dy) f(s, y) = P(ρ, s)f(s, ·)(X(ρ)),   (2.14)

which is a P_{τ,x}-martingale with respect to the filtration (F_ρ^τ)_{τ≤ρ≤s}. Since the process in (2.14) is a martingale we know that the limit

lim_{n→∞} E_{s_n,X(s_n)}[f(s, X(s))]

exists P_{τ,x}-almost surely. We also have

E_{τ,x}[lim_{n→∞} E_{s_n,X(s_n)}[f(s, X(s))]] = lim_{n→∞} E_{τ,x}[E_{s_n,X(s_n)}[f(s, X(s))]]
= lim_{n→∞} E_{τ,x}[E_{τ,x}[f(s, X(s)) | F_{s_n}^τ]] = E_{τ,x}[f(s, X(s))].   (2.15)

As in (2.7) we write

E_{τ,x}[|E_{ρ,X(ρ)}[f(s, X(s))] − f(ρ, X(ρ))|²]
= E_{τ,x}[|E_{ρ,X(ρ)}[f(s, X(s))]|²] + E_{τ,x}[|f(ρ, X(ρ))|²] − 2 Re E_{τ,x}[conj(f(ρ, X(ρ))) E_{ρ,X(ρ)}[f(s, X(s))]].   (2.16)

The expression in (2.16) converges to 0 as ρ ↑ s. Here we used the following identity, valid for g ∈ C_b([0,T] × E):

lim_{ρ↑s} E_{τ,x}[g(ρ, X(ρ))] = lim_{ρ↑s} P(τ, ρ)g(ρ, ·)(x) = P(τ, s)g(s, ·)(x) = E_{τ,x}[g(s, X(s))].   (2.17)

Consequently, the P_{τ,x}-martingale (E_{s_n,X(s_n)}[f(s, X(s))])_{n∈N} converges P_{τ,x}-almost surely and in the space L²(Ω, F_T^τ, P_{τ,x}) to the stochastic variable f(s, X(s)). In addition, the sequence (f(s_n, X(s_n)))_{n∈N} converges in the space L²(Ω, F_T^τ, P_{τ,x}) to the same stochastic variable f(s, X(s)). Then there exists a subsequence (f(s_{n_k}, X(s_{n_k})))_{k∈N} which converges P_{τ,x}-almost surely to f(s, X(s)). Again we employ the function in (2.9) to prove that

lim_{k→∞} X(s_{n_k}) = X(s), P_{τ,x}-almost surely.   (2.18)

The equalities (2.13) and (2.18) show that the orbit {(ρ, X(ρ)) : ρ ∈ [τ, T]} is P_{τ,x}-almost surely a sequentially compact subset of [τ, T] × E. Since the space E is complete metrizable we infer that this orbit is P_{τ,x}-almost surely a compact subset of [τ, T] × E. We still have to show that there exists a modification {X̃(s) : s ∈ [0,T]} of the process {X(s) : s ∈ [0,T]} which possesses left limits, is right-continuous P_{τ,x}-almost surely, and is such that

P(τ, t)f(x) = E_{τ,x}[f(X(t))] = E_{τ,x}[f(X̃(t))], f ∈ C_b(E).   (2.19)

In order to achieve this we begin by using a modified version of the process in (2.3):

t ↦ e^{−αt} R(α)f(t, X(t)) = ∫_t^∞ e^{−αρ} P(t, ρ ∧ T)f(ρ ∧ T, ·)(X(t)) dρ,   (2.20)

for t ∈ [0,T]. The process in (2.20) is a P_{τ,x}-supermartingale with respect to the filtration (F_ρ^τ)_{τ≤ρ≤T}. Since the process in (2.20) is a P_{τ,x}-supermartingale on the interval [τ, T], we deduce that, for t varying over countable subsets of [τ, T], its left and right limits exist P_{τ,x}-almost surely. Then the process in (2.3) shares this property as well. For a detailed argument which substantiates this claim see Propositions 2.4 and 2.5 below. Since the orbit {(ρ, X(ρ)) : ρ ∈ [τ, T]} is P_{τ,x}-almost surely compact, and since the function f belongs to C_b([0,T] × E), we infer that along sequences the process t ↦ f(t, X(t)) possesses P_{τ,x}-almost surely left and right limits. Again an appeal to the function f in (2.9) shows that the limits lim_{s↑t, s∈D} X(s) and lim_{t↓s, t∈D} X(t) exist P_{τ,x}-almost surely for t ∈ (τ, T] and s ∈ [τ, T). Here we wrote D = {k2^{−n} : k ∈ N, n ∈ N} for the collection of non-negative dyadic numbers. A redefinition (modification) X̃(ρ) of the process X(ρ), ρ ∈ [0,T], reads as follows:

X̃(ρ) = lim_{t↓ρ, t∈D∩(ρ,T], t>ρ} X(t), ρ ∈ [0, T); X̃(T) = X(T).   (2.21)

So we obtain the following intermediate important result.

Proposition 2.2. The process {X̃(ρ) : ρ ∈ [0,T]} is continuous from the right and has left limits in E, P_{τ,x}-almost surely. Moreover, its P_{τ,x}-distribution coincides with that of the process {X(ρ) : ρ ∈ [0,T]}. Fix (τ, x) ∈ [0,T] × E and t ∈ [τ, T]. On the event {X̃(t) ∈ E} the orbits {(s, X̃(s)) : s ∈ [τ, t]} are P_{τ,x}-almost surely compact subsets of [τ, T] × E.
Fix 0 ≤ τ ≤ t ≤ T, and let S, S_1 and S_2 be (F̃_t^τ)_{t∈[τ,T]}-stopping times. In what follows we will make use of the following σ-fields:
F̃_t^τ = σ(X̃(ρ) : τ ≤ ρ ≤ t);
F̃_{t+}^τ = ∩_{0<ε≤T−t} σ(X̃(ρ) : τ ≤ ρ ≤ t + ε) = ∩_{0<ε≤T−t} F̃_{t+ε}^τ;   (2.22)
F̃_T^{S,∨} = σ((ρ ∨ S, X̃(ρ ∨ S)) : 0 ≤ ρ ≤ T);   (2.23)
F̃_{S_2}^{S_1,∨} = ∩_{s∈[0,T]} {A ∈ F̃_T^{S_1,∨} : A ∩ {S_2 ≤ s} ∈ F̃_s^0};   (2.24)
F̃_{S_2+}^{S_1,∨} = ∩_{s∈[0,T]} {A ∈ F̃_T^{S_1,∨} : A ∩ {S_2 < s} ∈ F̃_s^0}
  = ∩_{0<ε≤T} ∩_{s∈[0,T−ε]} {A ∈ F̃_T^{S_1,∨} : A ∩ {S_2 ≤ s} ∈ F̃_{s+ε}^0}
  = ∩_{ε>0} F̃_{(S_2+ε)∧T}^{S_1,∨}.   (2.25)

The σ-field in (2.22) is called the right closure of F̃_t^τ, the σ-field in (2.23) is called the σ-field after time S, the σ-field in (2.24) is called the σ-field between time S_1 and time S_2, and finally the one in (2.25) is called the right closure of the one in (2.24).
Proof (Continuation of the proof of assertion (a) of Theorem 1.39). Our most important aim is to prove that the process

((Ω, F̃_T^τ, P_{τ,x}), (X̃(t), τ ≤ t ≤ T), (∨_t : τ ≤ t ≤ T), (E, E))   (2.26)

is a strong Markov process. We begin by proving the following equalities:

E_{s,X̃(s)}[F ∘ ∨_s] = E_{τ,x}[F ∘ ∨_s | F_s^τ]   (2.27)
  = E_{τ,x}[F ∘ ∨_s | F_{s+}^τ] = E_{τ,x}[F ∘ ∨_s | F̃_{s+}^τ].   (2.28)
First we take F of the form F = f(X̃(s)), where f ∈ C_b(E). By an approximation argument it then follows that (2.27) and (2.28) also hold for F = f(X̃(s)) with f ∈ L^∞(E, E). So let f ∈ C_b(E). Since P_{s,y}[X̃(s) = y] = 1 and f(X̃(s)) ∘ ∨_s = f(X̃(s)), we see

E_{s,X̃(s)}[f(X̃(s)) ∘ ∨_s] = E_{s,X̃(s)}[f(X̃(s))] = f(X̃(s)).   (2.29)

Since the stochastic variable f(X̃(s)) is measurable with respect to the σ-field F_{s+}^τ, by (2.29) we also have the P_{τ,x}-almost sure equalities:

E_{τ,x}[f(X̃(s)) ∘ ∨_s | F_{s+}^τ] = E_{τ,x}[f(X̃(s)) | F_{s+}^τ] = f(X̃(s)).   (2.30)

Next we calculate, while using the Markov property of the process t ↦ X(t) and the right-continuity of the function t ↦ P(s, t)f(y), s ∈ [τ, T], y ∈ E,
E_{τ,x}[f(X̃(s)) ∘ ∨_s | F_{s+}^τ] = E_{τ,x}[f(X̃(s)) | F_{s+}^τ]
= lim_{ε↓0} E_{τ,x}[f(X(s + ε)) | F_s^τ] = lim_{ε↓0} E_{s,X(s)}[f(X(s + ε))]
= lim_{ε↓0} P(s, s + ε)f(·)(X(s)) = f(X(s)).   (2.31)

In order to complete the arguments for the proof of (2.27) and (2.28) for F of the form F = f(X̃(s)), f ∈ C_b(E), we have to show the equality f(X̃(s)) = f(X(s)), P_{τ,x}-almost surely. This will be accomplished by the following identities:

E_{τ,x}[|f(X̃(s)) − f(X(s))|²]
= lim_{t↓s} E_{τ,x}[|f(X(t))|²] − 2 lim_{t↓s} Re E_{τ,x}[conj(f(X(s))) f(X(t))] + E_{τ,x}[|f(X(s))|²]

(Markov property for the process t ↦ X(t))

= lim_{t↓s} E_{τ,x}[|f(X(t))|²] − 2 lim_{t↓s} Re E_{τ,x}[conj(f(X(s))) E_{s,X(s)}[f(X(t))]] + E_{τ,x}[|f(X(s))|²]

(relationship between Feller propagator and Markov property of X)

= lim_{t↓s} P(τ, t)|f(·)|²(x) − 2 lim_{t↓s} Re P(τ, s)[conj(f(·)) P(s, t)f(·)](x) + P(τ, s)|f(·)|²(x)
= P(τ, s)|f(·)|²(x) − 2 P(τ, s)[conj(f(·)) f(·)](x) + P(τ, s)|f(·)|²(x) = 0.   (2.32)

From (2.32) we infer that f(X̃(s)) = f(X(s)), P_{τ,x}-almost surely. From (2.30), (2.31), and (2.32) we deduce the equalities in (2.27) and (2.28) for a variable F of the form F = f(X̃(s)), f ∈ C_b(E). An approximation argument then yields (2.27) and (2.28) for f ∈ L^∞(E, E).
In order to prove (2.27) in full generality it suffices, by the Monotone Class Theorem and an approximation argument, to prove the equalities in (2.28) for stochastic variables $F$ of the form $F = \prod_{j=0}^{n} f_j\bigl(\widetilde X(s_j)\bigr)$, where the functions $f_j$, $0\le j\le n$, belong to $C_b(E)$ and where $s = s_0 < s_1 < s_2 < \dots < s_n \le T$. Since the equality $f\bigl(\widetilde X(s)\bigr) = f(X(s))$ holds $\mathbb P_{\tau,x}$-almost surely, it is easy to see, using the equalities (2.30), (2.31), and (2.32), that it suffices to take the variable $F$ of the form $F = \prod_{j=1}^{n+1} f_j\bigl(\widetilde X(s_j)\bigr)$, where as above the functions $f_j$, $1\le j\le n+1$, belong to $C_b(E)$ and where $s < s_1 < s_2 < \dots < s_n < s_{n+1} \le T$. For $n = 0$ we have $\mathbb P_{\tau,x}$-almost surely
$$\mathbb E_{\tau,x}\Bigl[f_1\bigl(\widetilde X(s_1)\bigr)\Bigm|\mathcal F^\tau_s\Bigr] = \lim_{\varepsilon\downarrow0}\mathbb E_{\tau,x}\bigl[f_1(X(s_1+\varepsilon))\bigm|\mathcal F^\tau_s\bigr]$$
(Markov property of the process $X$)
$$= \lim_{\varepsilon\downarrow0}\mathbb E_{s,X(s)}\bigl[f_1(X(s_1+\varepsilon))\bigr] = \lim_{\varepsilon\downarrow0}P(s,s_1+\varepsilon)f_1(X(s)) = \lim_{\varepsilon\downarrow0}P(s,s_1)P(s_1,s_1+\varepsilon)f_1(X(s))$$
$$= P(s,s_1)f_1(X(s)) = P(s,s_1)f_1\bigl(\widetilde X(s)\bigr) = \mathbb E_{s,\widetilde X(s)}\Bigl[f_1\bigl(\widetilde X(s_1)\bigr)\Bigr]. \tag{2.33}$$

The equalities in (2.33) imply (2.27) with $F = f_1\bigl(\widetilde X(s_1)\bigr)$ where $f_1\in C_b(E)$ and $s < s_1 \le T$. Then we apply induction with respect to $n$ to obtain (2.27) for $F$ of the form $F = \prod_{j=1}^{n+1} f_j\bigl(\widetilde X(s_j)\bigr)$ where, as above, the functions $f_j$, $1\le j\le n+1$, belong to $C_b(E)$ and where $s < s_1 < s_2 < \dots < s_n < s_{n+1} \le T$. In fact, using the measurability of $\widetilde X(s_j)$ with respect to the σ-field $\mathcal F^\tau_{s_n+}$, $1\le j\le n$, and the tower property of conditional expectation, we get $\mathbb P_{\tau,x}$-almost surely:
 
$$\mathbb E_{\tau,x}\Biggl[\prod_{j=1}^{n+1} f_j\bigl(\widetilde X(s_j)\bigr)\Biggm|\mathcal F^\tau_s\Biggr]
= \mathbb E_{\tau,x}\Biggl[\prod_{j=1}^{n} f_j\bigl(\widetilde X(s_j)\bigr)\,\mathbb E_{\tau,x}\Bigl[f_{n+1}\bigl(\widetilde X(s_{n+1})\bigr)\Bigm|\mathcal F^\tau_{s_n}\Bigr]\Biggm|\mathcal F^\tau_s\Biggr]$$
(Markov property for $n = 1$)
$$= \mathbb E_{\tau,x}\Biggl[\prod_{j=1}^{n} f_j\bigl(\widetilde X(s_j)\bigr)\,\mathbb E_{s_n,\widetilde X(s_n)}\Bigl[f_{n+1}\bigl(\widetilde X(s_{n+1})\bigr)\Bigr]\Biggm|\mathcal F^\tau_s\Biggr]$$
(induction hypothesis)
$$= \mathbb E_{s,\widetilde X(s)}\Biggl[\prod_{j=1}^{n} f_j\bigl(\widetilde X(s_j)\bigr)\,\mathbb E_{s_n,\widetilde X(s_n)}\Bigl[f_{n+1}\bigl(\widetilde X(s_{n+1})\bigr)\Bigr]\Biggr]$$
$$= \mathbb E_{s,\widetilde X(s)}\Biggl[\prod_{j=1}^{n} f_j\bigl(\widetilde X(s_j)\bigr)\,\mathbb E_{s,\widetilde X(s)}\Bigl[f_{n+1}\bigl(\widetilde X(s_{n+1})\bigr)\Bigm|\mathcal F^s_{s_n}\Bigr]\Biggr]$$
$$= \mathbb E_{s,\widetilde X(s)}\Biggl[\prod_{j=1}^{n} f_j\bigl(\widetilde X(s_j)\bigr)\,f_{n+1}\bigl(\widetilde X(s_{n+1})\bigr)\Biggr]
= \mathbb E_{s,\widetilde X(s)}\Biggl[\prod_{j=1}^{n+1} f_j\bigl(\widetilde X(s_j)\bigr)\Biggr]. \tag{2.34}$$

So (2.34) proves (2.27) for $F = \prod_{j=1}^{n+1} f_j\bigl(\widetilde X(s_j)\bigr)$ where the functions $f_j$, $1\le j\le n+1$, belong to $C_b(E)$, and $s < s_1 < \dots < s_{n+1}$. As remarked above, from (2.30), (2.31), and (2.32) the equality in (2.27) then also follows for all stochastic variables of the form $F = \prod_{j=0}^{n} f_j\bigl(\widetilde X(s_j)\bigr)$ with $f_j\in C_b(E)$ for $0\le j\le n$ and $s = s_0 < s_1 < \dots < s_n \le T$. By the Monotone Class Theorem and approximation arguments it then follows that (2.27) holds for all bounded $\mathcal F^\tau_T$-measurable stochastic variables $F$.
Next we proceed with a proof of the equalities in (2.28). Since $\widetilde{\mathcal F}^\tau_{s+}\subset\mathcal F^\tau_{s+}$, and the variable $\mathbb E_{s,\widetilde X(s)}[F\circ\vee_s]$ is $\widetilde{\mathcal F}^\tau_{s+}$-measurable, it suffices to prove the first equality in (2.28), to wit
$$\mathbb E_{\tau,x}\bigl[F\circ\vee_s \bigm| \mathcal F^\tau_{s+}\bigr] = \mathbb E_{s,\widetilde X(s)}\bigl[F\circ\vee_s\bigr] \tag{2.35}$$
for any bounded $\mathcal F^\tau_T$-measurable stochastic variable $F$. We will not prove the equality in (2.35) directly, but we will show the following ones instead:
$$\mathbb E_{\tau,x}\bigl[F\circ\vee_s \bigm| \mathcal F^\tau_{s+}\bigr] = \mathbb E_{s,X(s)}\bigl[F\circ\vee_s \bigm| \mathcal F^s_{s+}\bigr] = \mathbb E_{s,\widetilde X(s)}\bigl[F\circ\vee_s \bigm| \widetilde{\mathcal F}^s_{s+}\bigr], \tag{2.36}$$
under the condition that the function $(s,x)\mapsto P(s,t)f(x)$ is Borel measurable on $[\tau,t]\times E$ for $f\in C_b(E)$, which is part of (vi) in Definition 1.24. In order to prove the equalities in (2.36) it suffices by the Monotone Class Theorem to take $F$ of the form $F = \prod_{j=0}^{n} f_j\bigl(\widetilde X(s_j)\bigr)$ with $s = s_0 < s_1 < \dots < s_n \le T$, where the functions $f_j$, $0\le j\le n$, are bounded Borel measurable functions. By another approximation argument we may assume that the functions $f_j$, $0\le j\le n$, belong to $C_b(E)$.
An induction argument shows that it suffices to prove (2.36) for $F = f_0\bigl(\widetilde X(s_0)\bigr)\,f_1\bigl(\widetilde X(s_1)\bigr)$ where $s = s_0 < s_1 \le T$, and the functions $f_0$ and $f_1$ are members of $C_b(E)$. The case $f_1 = 1$ was taken care of in the equalities (2.29) and (2.30). Since the variable $f_0\bigl(\widetilde X(s)\bigr)$ is $\mathcal F^s_{s+}$-measurable, the proof of the equalities in (2.36) reduces to the case where $F = f\bigl(\widetilde X(t)\bigr)$ with $\tau < s < t \le T$ and $f\in C_b(E)$. The following equalities show the first equality in (2.36). With $s < s_{n+1} < s_n < t$ and $\lim_{n\to\infty} s_n = s$ we have
$$\mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(t)\bigr)\Bigm|\mathcal F^\tau_{s+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{\tau,x}\bigl[f\bigl(\widetilde X(t)\bigr)\bigm|\mathcal F^\tau_{s_n}\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{s_n,X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[\mathbb E_{s_n,\widetilde X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\lim_{n\to\infty}\mathbb E_{s_n,\widetilde X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]$$
$$= \lim_{n\to\infty}\mathbb E_{s_n,\widetilde X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr] \tag{2.37}$$
$$= \mathbb E_{s,\widetilde X(s)}\Bigl[\lim_{n\to\infty}\mathbb E_{s_n,\widetilde X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]
= \lim_{n\to\infty}\mathbb E_{s,\widetilde X(s)}\Bigl[\mathbb E_{s_n,\widetilde X(s_n)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]$$
$$= \lim_{n\to\infty}\mathbb E_{s,\widetilde X(s)}\Bigl[\mathbb E_{s,\widetilde X(s)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigm|\mathcal F^s_{s_n}\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]
= \mathbb E_{s,\widetilde X(s)}\Bigl[f\bigl(\widetilde X(t)\bigr)\Bigm|\mathcal F^s_{s+}\Bigr]. \tag{2.38}$$
In these equalities we used the fact that the process $\rho\mapsto\mathbb E_{\rho,\widetilde X(\rho)}\bigl[f\bigl(\widetilde X(t)\bigr)\bigr]$, $s<\rho\le t$, is a $\mathbb P_{s,y}$-martingale for $(s,y)\in[0,t)\times E$. The equality in (2.38) implies the first equality in (2.36). The second one can be obtained by repeating the four final steps in the proof of (2.38) with $\widetilde{\mathcal F}^s_{s+}$ instead of $\mathcal F^s_{s+}$. Here we use that the stochastic variable in (2.37) is measurable with respect to the σ-field $\widetilde{\mathcal F}^s_{s+}$, which is smaller than $\mathcal F^s_{s+}$.

In order to deduce (2.35) from (2.36) we will need the full strength of property (vi) in Definition 1.24. In fact, using the representation in (2.37) and the continuity property in (vi) shows (2.35) for $F = f\bigl(\widetilde X(t)\bigr)$, $f\in C_b(E)$. By the previous arguments the full assertion in (2.28) follows. In fact, Proposition 2.4 gives a detailed proof of the equalities in (2.72) below. The equalities in (2.37) then follow from the Monotone Class Theorem.
Next we want to prove that the process $t\mapsto\widetilde X(t)$ possesses the strong Markov property. This means that for any given $\bigl(\widetilde{\mathcal F}^\tau_{t+}\bigr)_{t\in[\tau,T]}$-stopping time $S:\Omega\to[\tau,T]$ we have to prove an equality of the form (see (1.87))
$$\mathbb E_{S,\widetilde X(S)}\bigl[F\circ\vee_S\bigr] = \mathbb E_{\tau,x}\Bigl[F\circ\vee_S\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr], \tag{2.39}$$
and this for all bounded $\mathcal F^\tau_T$-measurable stochastic variables $F$. By the Monotone Class Theorem it follows that it suffices to prove (2.39) for bounded stochastic variables $F$ of the form $F = \prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)$, where the functions $f_j$, $0\le j\le n$, are bounded Borel functions on $[\tau,T]\times E$, and $\tau = s_0 < s_1 < \dots < s_n \le T$. By another approximation argument it suffices to replace the bounded Borel functions $f_j$, $0\le j\le n$, by bounded continuous functions on $[\tau,T]\times E$. By definition the stopping time $S$ is $\widetilde{\mathcal F}^\tau_{S+}$-measurable.
Let us show that $\widetilde X(S)$ is $\widetilde{\mathcal F}^\tau_{S+}$-measurable. Therefore we approximate the stopping time $S$ from above by stopping times $S_n$, $n\in\mathbb N$, of the form
$$S_n = \tau + \frac{T-\tau}{2^n}\left\lceil\frac{2^n\,(S-\tau)}{T-\tau}\right\rceil. \tag{2.40}$$

If $t\in[\tau,T]$, then
$$\{S_n\le t\} = \bigcup_{k=0}^{\left\lfloor 2^n\frac{t-\tau}{T-\tau}\right\rfloor}\left\{\frac{k-1}{2^n}(T-\tau)+\tau < S \le \frac{k}{2^n}(T-\tau)+\tau\right\}, \tag{2.41}$$
and hence $S_n$ is an $\bigl(\mathcal F^\tau_{t+}\bigr)_{t\in[\tau,T]}$-stopping time. Moreover, on the event
$$\left\{\frac{k-1}{2^n}(T-\tau)+\tau < S \le \frac{k}{2^n}(T-\tau)+\tau\right\}$$
the stopping time $S_n$ takes the value $S_n = t_{k,n}$, where $t_{k,n} = \tau + \dfrac{k(T-\tau)}{2^n}$.
Consequently, we have the following equality of events:
$$\left\{S_n = \tau + \frac{k(T-\tau)}{2^n} = t_{k,n}\right\} = \left\{\frac{k-1}{2^n}(T-\tau)+\tau < S \le \frac{k}{2^n}(T-\tau)+\tau\right\},$$
so that for $k \le \dfrac{2^n(t-\tau)}{T-\tau}$, which is equivalent to $t_{k,n}\le t$, the event $\left\{S_n = \tau + \dfrac{k(T-\tau)}{2^n}\right\}$ is $\widetilde{\mathcal F}^\tau_{t+}$-measurable, and on this event the state variable $\widetilde X(S_n) = \widetilde X(t_{k,n})$ is $\widetilde{\mathcal F}^\tau_{t_{k,n}+}$-measurable. As a consequence we see that on the event $\{S_n\le t\}$ the state variable $\widetilde X(S_n)$ is $\widetilde{\mathcal F}^\tau_{t+}$-measurable. Then the space-time variable $\bigl(S_n,\widetilde X(S_n)\bigr)$ is measurable with respect to the σ-field $\widetilde{\mathcal F}^\tau_{S+}$. In addition, we have

$$S \le S_{n+1} \le S_n \le S + \frac{T-\tau}{2^n}, \tag{2.42}$$
and hence the space-time variable $\bigl(S,\widetilde X(S)\bigr)$ is $\widetilde{\mathcal F}^\tau_{S+}$-measurable as well. This proves the equality in (2.39) in case $F = f\bigl(\tau\vee S,\widetilde X(\tau\vee S)\bigr)$ where $f\in C_b([\tau,T]\times E)$. As a preparation for the case $F = \prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)$, where the functions $f_j$, $0\le j\le n$, are bounded Borel functions on $[\tau,T]\times E$, and $\tau = s_0 < s_1 < \dots < s_n \le T$, we first consider the case ($\tau < t \le T$)
$$F = f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)\,1_{\{S\le t\}} = f\bigl(t,\widetilde X(t)\bigr)\,1_{\{S\le t\}} \tag{2.43}$$
where $f\in C_b([\tau,T]\times E)$. On the event $\{S\le t\}$ we approximate the stopping time $S$ from above by stopping times $S_n(t)$, $n\in\mathbb N$, of the form
$$S_n(t) = \tau + \frac{t-\tau}{2^n}\left\lceil\frac{2^n\,(S-\tau)}{t-\tau}\right\rceil. \tag{2.44}$$
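The dyadic discretizations in (2.40) and (2.44) are easy to probe numerically. The following sketch (all names hypothetical, with a small floating-point tolerance in the assertions) checks the chain of inequalities $S \le S_{n+1} \le S_n \le S + (T-\tau)/2^n$ stated in (2.42) for randomly drawn values of $S$.

```python
import math
import random

def dyadic_approximation(S, tau, T, n):
    # S_n = tau + (T - tau)/2^n * ceil(2^n (S - tau)/(T - tau)), as in (2.40)
    return tau + (T - tau) / 2**n * math.ceil(2**n * (S - tau) / (T - tau))

random.seed(0)
tau, T, eps = 0.25, 2.0, 1e-9   # eps absorbs floating-point rounding
for _ in range(1000):
    S = random.uniform(tau, T)
    for n in range(1, 12):
        Sn = dyadic_approximation(S, tau, T, n)
        Sn1 = dyadic_approximation(S, tau, T, n + 1)
        # the chain of inequalities (2.42)
        assert S - eps <= Sn1
        assert Sn1 <= Sn + eps
        assert Sn <= S + (T - tau) / 2**n + eps
```

Since the ceiling produces integers, $S_n$ indeed takes only the dyadic values $t_{k,n}$, which is what the event identity above exploits.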

Then on the event $\{S\le t\}$ we have the following inclusions of σ-fields:
$$\widetilde{\mathcal F}^\tau_{S+}\cap\{S\le t\} = \widetilde{\mathcal F}^\tau_{S\wedge t+}\cap\{S\le t\} \subset \widetilde{\mathcal F}^\tau_{S_{n+1}(t)+}\cap\{S\le t\} \subset \widetilde{\mathcal F}^\tau_{S_n(t)+}\cap\{S\le t\} \tag{2.45}$$
and
$$\bigcap_{n=1}^{\infty}\widetilde{\mathcal F}^\tau_{S_n(t)+}\cap\{S\le t\} = \widetilde{\mathcal F}^\tau_{S+}\cap\{S\le t\}. \tag{2.46}$$
Here we wrote $\mathcal F\cap A_0 = \{A\cap A_0 : A\in\mathcal F\}$ when $\mathcal F$ is any σ-field on $\Omega$ and $A_0\subset\Omega$. Then we have
$$\mathbb E_{\tau,x}\Bigl[f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[f\bigl(t,\widetilde X(t)\bigr)1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[\mathbb E_{\tau,x}\bigl[f\bigl(t,\widetilde X(t)\bigr)1_{\{S\le t\}}\bigm|\widetilde{\mathcal F}^\tau_{S_n(t)+}\bigr]\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{S_n(t),\widetilde X(S_n(t))}\bigl[f\bigl(t,\widetilde X(t)\bigr)\bigr]1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]$$
$$= \lim_{n\to\infty}\mathbb E_{\tau,x}\Bigl[\mathbb E_{S_n(t),\widetilde X(S_n(t))}\bigl[f\bigl(t,\widetilde X(t)\bigr)\bigr]1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\lim_{n\to\infty}\mathbb E_{S_n(t),\widetilde X(S_n(t))}\bigl[f\bigl(t,\widetilde X(t)\bigr)\bigr]1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]$$
(employ (2.46) and the arguments leading to equality (2.36))
$$= \lim_{n\to\infty}\mathbb E_{S_n(t),\widetilde X(S_n(t))}\bigl[f\bigl(t,\widetilde X(t)\bigr)\bigr]1_{\{S\le t\}}
= \mathbb E_{S,\widetilde X(S)}\Bigl[f\bigl(t,\widetilde X(t)\bigr)\Bigm|\widetilde{\mathcal F}^{S,\vee}_{S+}\Bigr]1_{\{S\le t\}}$$
(appeal to (2.35), which relies on property (vi) of Definition 1.24)
$$= \mathbb E_{S,\widetilde X(S)}\Bigl[f\bigl(t,\widetilde X(t)\bigr)\Bigr]1_{\{S\le t\}}. \tag{2.47}$$
From (2.47) and the $\widetilde{\mathcal F}^{S,\vee}_{S+}$-measurability of the stochastic state variable $\bigl(S,\widetilde X(S)\bigr)$ we infer
$$\mathbb E_{\tau,x}\Bigl[f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr] + \mathbb E_{\tau,x}\Bigl[f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)1_{\{S>t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[f\bigl(t,\widetilde X(t)\bigr)1_{\{S\le t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr] + \mathbb E_{\tau,x}\Bigl[f\bigl(S,\widetilde X(S)\bigr)1_{\{S>t\}}\Bigm|\widetilde{\mathcal F}^\tau_{S+}\Bigr]$$
$$= \mathbb E_{S,\widetilde X(S)}\Bigl[f\bigl(t,\widetilde X(t)\bigr)1_{\{S\le t\}}\Bigr] + f\bigl(S,\widetilde X(S)\bigr)1_{\{S>t\}}
= \mathbb E_{S,\widetilde X(S)}\Bigl[f\bigl(t\vee S,\widetilde X(t\vee S)\bigr)\Bigr]. \tag{2.48}$$
Next we consider the case $F = \prod_{j=0}^{n+1} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)$, where the functions $f_j$, $0\le j\le n+1$, are bounded Borel functions on $[\tau,T]\times E$, and $\tau = s_0 < s_1 < \dots < s_{n+1} \le T$. From (2.48) and the $\widetilde{\mathcal F}^{S,\vee}_{S+}$-measurability of the stochastic state variable $\bigl(S,\widetilde X(S)\bigr)$ we obtain (2.39) in case $F = f_0\bigl(\tau,\widetilde X(\tau)\bigr)\,f_1\bigl(s_1,\widetilde X(s_1)\bigr)$, and thus
$$F\circ\vee_S = f_0\bigl(\tau\vee S,\widetilde X(\tau\vee S)\bigr)\,f_1\bigl(s_1\vee S,\widetilde X(s_1\vee S)\bigr).$$
So the cases $n = 0$ and $n = 1$ have been taken care of. The remaining part of the proof uses induction. From (2.48), with the maximum operator $s_n\vee S$ replacing $S$, together with the induction hypothesis we get
 
$$\mathbb E_{\tau,x}\Biggl[\prod_{j=0}^{n+1} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\Biggm|\widetilde{\mathcal F}^\tau_{S+}\Biggr]$$
$$= \mathbb E_{\tau,x}\Biggl[\prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\,\mathbb E_{\tau,x}\Bigl[f_{n+1}\bigl(s_{n+1}\vee S,\widetilde X(s_{n+1}\vee S)\bigr)\Bigm|\widetilde{\mathcal F}^\tau_{s_n\vee S+}\Bigr]\Biggm|\widetilde{\mathcal F}^\tau_{S+}\Biggr]$$
(Markov property for $n = 1$)
$$= \mathbb E_{\tau,x}\Biggl[\prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\,\mathbb E_{s_n\vee S,\widetilde X(s_n\vee S)}\Bigl[f_{n+1}\bigl(s_{n+1}\vee S,\widetilde X(s_{n+1}\vee S)\bigr)\Bigr]\Biggm|\widetilde{\mathcal F}^\tau_{S+}\Biggr]$$
(induction hypothesis)
$$= \mathbb E_{S,\widetilde X(S)}\Biggl[\prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\,\mathbb E_{s_n\vee S,\widetilde X(s_n\vee S)}\Bigl[f_{n+1}\bigl(s_{n+1}\vee S,\widetilde X(s_{n+1}\vee S)\bigr)\Bigr]\Biggr]$$
$$= \mathbb E_{S,\widetilde X(S)}\Biggl[\prod_{j=0}^{n} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\,\mathbb E_{S,\widetilde X(S)}\Bigl[f_{n+1}\bigl(s_{n+1}\vee S,\widetilde X(s_{n+1}\vee S)\bigr)\Bigm|\widetilde{\mathcal F}^{S,\vee}_{s_n\vee S+}\Bigr]\Biggr]$$
$$= \mathbb E_{S,\widetilde X(S)}\Biggl[\mathbb E_{S,\widetilde X(S)}\Biggl[\prod_{j=0}^{n+1} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\Biggm|\widetilde{\mathcal F}^{S,\vee}_{s_n\vee S+}\Biggr]\Biggr]
= \mathbb E_{S,\widetilde X(S)}\Biggl[\prod_{j=0}^{n+1} f_j\bigl(s_j\vee S,\widetilde X(s_j\vee S)\bigr)\Biggr]. \tag{2.49}$$

The strong Markov property of the process $\widetilde X$ follows from (2.49), an approximation argument, and the Monotone Class Theorem.
We still need to redefine our process and probability measures $\mathbb P_{\tau,x}$ on the Skorohod space $D([0,T],E)$, $(\tau,x)\in[0,T]\times E$, in such a way that the distribution of the process $\widetilde X$ is preserved. This can be done by replacing (2.26) with the collection
$$\Bigl\{\bigl(\widetilde\Omega,\widetilde{\mathcal F}^\tau_T,\widetilde{\mathbb P}_{\tau,x}\bigr),\ \bigl(\widetilde X(t),\ \tau\le t\le T\bigr),\ \bigl(\vee_t : \tau\le t\le T\bigr),\ (E,\mathcal E)\Bigr\} \tag{2.50}$$
where $\widetilde\Omega = D([0,T],E)$, and $\widetilde{\mathbb P}_{\tau,x}$ is determined by the equality $\widetilde{\mathbb E}_{\tau,x}[F] = \mathbb E_{\tau,x}[F\circ\pi]$. Here $F:\widetilde\Omega\to\mathbb C$ is a bounded variable which is measurable with respect to the σ-field generated by the coordinate variables $\widetilde X(t):\widetilde\omega\mapsto\widetilde\omega(t)$, $t\in[\tau,T]$, $\widetilde\omega\in\widetilde\Omega$. Notice that the restriction of $\widetilde X(t)$ to $\widetilde\Omega$ is evaluation of $\widetilde\omega\in\widetilde\Omega$ at $t$. The mapping $\pi:\Omega\to\widetilde\Omega$ is defined by $\pi(\omega)(t) = \widetilde X(t,\omega)$, $t\in[0,T]$, $\omega\in\Omega^0$. Here $\Omega^0$ has the property that for all $(\tau,x)\in[0,T]\times E$ its complement in $\Omega$ is $\mathbb P_{\tau,x}$-negligible. We will describe the space $\Omega^0$. Let $D$ be the collection of positive dyadic numbers. For $\Omega^0$ we may choose the space
$$\Omega^0 := \bigl\{\omega\in\Omega : t\mapsto\omega(t),\ t\in D\cap[0,T],\ \text{has left and right limits in } E\bigr\}\ \cap\ \bigl\{\omega\in\Omega : \text{the range}\ \{\omega(t) : t\in D\cap[0,T]\}\ \text{is totally bounded in } E\bigr\}. \tag{2.51}$$
Let $(x_j)_{j\in\mathbb N}$ be a sequence in $E$ which is dense, and let $d$ be a metric on $E\times E$ which turns $E$ into a Polish space. Put $B(x,\varepsilon) = \{y\in E : d(y,x)<\varepsilon\}$. Define, for any finite subset $U = \{t_1,\dots,t_{2n}\}$ of $[0,T]$ with an even number of members, and $\varepsilon>0$, the stochastic variable $H_\varepsilon(U)$ by
$$H_\varepsilon(U)(\omega) = \sum_{j=1}^{n} 1_{\{d(X(t_{2j-1}),X(t_{2j}))\ge\varepsilon\}}(\omega).$$
We also put
$$H_\varepsilon(D\cap[0,T]) = \sup\bigl\{H_\varepsilon(U) : U\subset D\cap[0,T],\ U\ \text{contains an even number of elements}\bigr\}.$$
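For a path observed at finitely many dyadic times, the supremum defining $H_\varepsilon$ can be computed by a greedy scan that always closes the earliest admissible pair $(t_{2j-1},t_{2j})$, by the usual activity-selection exchange argument. A hypothetical sketch, taking $E = \mathbb R$ and $d(x,y) = |x-y|$ as a stand-in for the metric $d$:

```python
def h_eps(values, eps):
    """Maximal number of disjoint ordered pairs (a, b), a < b, with
    |values[b] - values[a]| >= eps; this is H_eps(U) maximized over U."""
    count, start, b = 0, 0, 1
    while b < len(values):
        if any(abs(values[b] - values[a]) >= eps for a in range(start, b)):
            count += 1       # close the earliest admissible pair ending at b ...
            start = b + 1    # ... and continue strictly after it
            b = start + 1
        else:
            b += 1
    return count

assert h_eps([0.0, 1.0, 0.0, 1.0], 1.0) == 2   # two disjoint 1-oscillations
assert h_eps([0.0, 0.4, 0.1, 0.3], 1.0) == 0   # small wiggles do not count
```

A sampled path is compatible with membership in $\Omega^0$ precisely when this count stays finite for every $\varepsilon = 1/n$ and the sampled range is totally bounded.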

Then the subset $\Omega^0$ of $\Omega = E^{[0,T]}$ can be described as follows:
$$\Omega^0 = \bigcap_{n=1}^{\infty}\bigl\{\omega\in\Omega : H_{1/n}(D\cap[0,T])(\omega) < \infty\bigr\}\ \cap\ \bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\Bigl\{\omega\in\Omega : (X(s)(\omega))_{s\in D\cap[0,T]}\subset\bigcup_{j=1}^{n} B(x_j,1/m)\Bigr\}. \tag{2.52}$$
The description in (2.52) shows that the subset $\Omega^0$ is a measurable subset of $\Omega$. In addition we have $\mathbb P_{\tau,x}(\Omega^0) := \mathbb P_{\tau,x}(\Omega^0_\tau) = 1$ for all $(\tau,x)\in[0,T]\times E$. Here
$$\Omega^0_\tau = \bigl\{\omega\in\Omega^0 : \omega(\rho) = \omega(\tau),\ \rho\in D\cap[0,\tau]\bigr\}, \tag{2.53}$$
which may be identified with $\bigl\{\omega\restriction_{[\tau,T]} : \omega\in\Omega^0\bigr\}$, a measurable subset of $\Omega_\tau = E^{[\tau,T]}$.

In order to complete the construction and the proof of assertion (a) in Theorem 1.39 we need to prove the quasi-left continuity of the process $\widetilde X$. So let $(\tau_n)_{n\in\mathbb N}$ be an increasing sequence of $(\mathcal F^\tau_t)_{t\in[\tau,T]}$-stopping times with values in $[\tau,T]$. Put $\tau_\infty = \sup_{n\in\mathbb N}\tau_n$. Let $f$ and $g$ be functions in $C_b^+(E)$, and let $h>0$. Then by the strong Markov property we have for $m\le n$
$$\mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\Bigr]
= \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr]\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr],\ \tau_m+h\ge\tau_\infty\Bigr]
+ \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr],\ \tau_m+h<\tau_\infty\Bigr]$$
(the process $\rho\mapsto\mathbb E_{\rho,\widetilde X(\rho)}\bigl[g\bigl(\widetilde X(s)\bigr)\bigr]$ is a right-continuous $\mathbb P_{\tau,x}$-martingale on $[\tau,s]$)
$$= \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,P(\tau_n,(\tau_m+h)\wedge T)g\bigl(\widetilde X(\tau_n)\bigr),\ \tau_m+h\ge\tau_\infty\Bigr]$$
$$+ \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr],\ \tau_m+h<\tau_\infty\Bigr]. \tag{2.54}$$

Put $L = \lim_{n\to\infty}\widetilde X(\tau_n)$. Upon taking limits, as $n\to\infty$, and employing the fact that the propagator $P(\tau,t)$ is continuous from the left on the diagonal, from (2.54) we obtain:
$$\mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\Bigr]
= \lim_{n\to\infty}\mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,P(\tau_n,\tau_\infty)P(\tau_\infty,(\tau_m+h)\wedge T)g\bigl(\widetilde X(\tau_n)\bigr),\ \tau_m+h\ge\tau_\infty\Bigr]$$
$$+ \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr],\ \tau_m+h<\tau_\infty\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,P(\tau_\infty,(\tau_m+h)\wedge T)g(L),\ \tau_m+h\ge\tau_\infty\Bigr]
+ \mathbb E_{\tau,x}\Bigl[f\bigl(\widetilde X(\tau_m)\bigr)\,\mathbb E_{\tau_m,\widetilde X(\tau_m)}\bigl[g\bigl(\widetilde X((\tau_m+h)\wedge T)\bigr)\bigr],\ \tau_m+h<\tau_\infty\Bigr]. \tag{2.55}$$

Next we let $m\to\infty$ in (2.55) to get
$$\mathbb E_{\tau,x}\Bigl[f(L)\,g\bigl(\widetilde X((\tau_\infty+h)\wedge T-)\bigr)\Bigr] = \mathbb E_{\tau,x}\bigl[f(L)\,P(\tau_\infty,(\tau_\infty+h)\wedge T)g(L)\bigr], \tag{2.56}$$
where we invoked property (vii) of Definition 1.24. Next we let $h$ decrease to zero in (2.56). This yields
$$\mathbb E_{\tau,x}\Bigl[f(L)\,g\bigl(\widetilde X(\tau_\infty)\bigr)\Bigr] = \mathbb E_{\tau,x}\bigl[f(L)\,P(\tau_\infty,\tau_\infty)g(L)\bigr] = \mathbb E_{\tau,x}\bigl[f(L)\,g(L)\bigr]. \tag{2.57}$$
Since $f$ and $g$ are arbitrary in $C_b^+(E)$, the equality in (2.57) implies that
$$\mathbb E_{\tau,x}\Bigl[h\bigl(L,\widetilde X(\tau_\infty)\bigr)\Bigr] = \mathbb E_{\tau,x}\bigl[h(L,L)\bigr] \tag{2.58}$$
for all bounded Borel measurable functions $h\in L^\infty(E\times E,\mathcal E\otimes\mathcal E)$. In particular we may take a bounded continuous metric $h(x,y) = d(x,y)$, $(x,y)\in E\times E$.
From (2.58) it follows that
$$\mathbb E_{\tau,x}\Bigl[d\bigl(L,\widetilde X(\tau_\infty)\bigr)\Bigr] = \mathbb E_{\tau,x}\bigl[d(L,L)\bigr] = 0,$$
and hence
$$L = \lim_{n\to\infty}\widetilde X(\tau_n) = \widetilde X(\tau_\infty),\qquad \mathbb P_{\tau,x}\text{-almost surely}. \tag{2.59}$$
Essentially speaking this proves part (a) of Theorem 1.39 in case we are dealing with conservative Feller propagators, i.e. Feller propagators with the property that $P(s,t)1 = 1$, $0\le s\le t\le T$. In order to be correct, the process, or rather the family of probability spaces in (2.26), has to be replaced with (2.50). This completes the proof of assertion (a) of Theorem 1.39 in case the Feller propagator is phrased in terms of probabilities: $P(\tau,x;t,E) = 1$, $0\le\tau\le t\le T$, $x\in E$. The case $P(s,t)1\le 1$ is treated in the continuation of the present proof.

Proof (Continuation of the proof of (a) of Theorem 1.39 in case of sub-probabilities). We have to modify the proof in case a point of absorption is required. Most of the proof for the case $P(\tau,x;t,E) = 1$ can be repeated with the probability transition function $N(\tau,x;t,B)$, $B\in\mathcal E^\triangle$. This function was defined in (1.94) of Remark 1.42. However, we need to show that the $E^\triangle$-valued process $\widetilde X$ does not reenter the state space $E$ once it has reached the absorption state $\triangle$. This requires an extra argument. We will use a stopping time argument and Doob's optional sampling theorem to achieve this: see Proposition 2.3, in which the transition function $N(\tau,x;t,B)$ is also employed.
For further use we will also need a Skorohod space with a point of absorption $\triangle$. The space $\Omega^{\triangle,0}$ consists of those $\omega\in\bigl(E^\triangle\bigr)^{[0,T]}$ whose restrictions to $D\cap[0,T]$ have left and right limits in $E^\triangle$, and which are such that for some $t = t(\omega)\in[0,T]$ the range $\{\omega(s) : s\in D\cap[0,t']\}$ is totally bounded in $E$ for all $t'\in(0,t)$, and such that $\omega(s) = \triangle$ for $s\in D\cap[t(\omega),T]$. Again using a metric $d^\triangle$ on $E^\triangle\times E^\triangle$ which renders $E^\triangle$ Polish, it can be shown that $\Omega^{\triangle,0}$ is a measurable subset of $\Omega = \bigl(E^\triangle\bigr)^{[0,T]}$. In fact $\Omega^{\triangle,0}$ can be written as
$$\Omega^{\triangle,0} = \bigcup_{r\in D\cap[0,T]}\bigl\{\omega\in\Omega : s\mapsto\omega(s),\ s\in D\cap[0,r],\ \text{has left and right limits in } E\bigr\}$$
$$\cap\ \bigcap_{m=1}^{\infty}\ \bigcup_{\substack{r_1<r_2,\ r_2-r_1<1/m\\ r_1,r_2\in D\cap[0,T]}}\Bigl(\bigl\{\omega\in\Omega : \omega(D\cap[0,r_1])\ \text{is totally bounded in } E\bigr\}\ \cap\ \bigl\{\omega\in\Omega : \omega(s) = \triangle\ \text{for all } s\in D\cap[r_2,T]\bigr\}\Bigr). \tag{2.60}$$


From (2.60) it follows that $\Omega^{\triangle,0}$ is a measurable subset of $\Omega = \bigl(E^\triangle\bigr)^{[0,T]}$. Again it turns out that $\mathbb P_{\tau,x}\bigl(\Omega^{\triangle,0}\bigr) = 1$. This fact follows from Proposition 2.3 and the fact that for all $t\in D\cap[0,T]$
$$\mathbb P_{\tau,x}\bigl[\omega\in\Omega : s\mapsto\omega(s),\ s\in D\cap[0,t],\ \text{has left and right limits in } E,\ \text{and } X(t)\in E\bigr] = \mathbb P_{\tau,x}\bigl[\omega\in\Omega : \omega(t)\in E\bigr]. \tag{2.61}$$
The equality in (2.61) follows in the same way as the corresponding result in the case $P(\tau,x;t,B)$, $B\in\mathcal E$, but now with $N(\tau,x;t,B)$, $B\in\mathcal E^\triangle$. Again the construction which led to the process in (2.50) can be performed to get a strong Markov process of the form
$$\Bigl\{\bigl(\widetilde\Omega,\widetilde{\mathcal F}^\tau_T,\widetilde{\mathbb P}_{\tau,x}\bigr),\ \bigl(\widetilde X(t),\ \tau\le t\le T\bigr),\ \bigl(\vee_t : \tau\le t\le T\bigr),\ \bigl(E^\triangle,\mathcal E^\triangle\bigr)\Bigr\}, \tag{2.62}$$
where $\widetilde\Omega$ is the Skorohod space $D\bigl([0,T],E^\triangle\bigr)$.
Since for functions $f\in C_b(E)$ we have
$$P(\tau,t)f(x) = \int P(\tau,x;t,dy)\,f(y) = \int N(\tau,x;t,dy)\,f(y) \tag{2.63}$$
provided $f(\triangle) = 0$, it follows that the process $\widetilde X$ is quasi-left continuous on its life time $\zeta$. For the definition of $N(\tau,x;t,B)$ see Remark 1.42. In order to be correct, the process, or rather the family of probability spaces in (2.26), has to be replaced with (2.62). The arguments in Proposition 2.3 below then complete the proof of assertion (a) of Theorem 1.39 in case the Feller propagator is phrased in terms of sub-probabilities: $P(\tau,x;t,E)\le 1$, $0\le\tau\le t\le T$, $x\in E$.

In the final part of assertion (a) of Theorem 1.39 we needed the following proposition.
Proposition 2.3. Suppose the transition function $P(\tau,x;t,B)$, which satisfies the Chapman-Kolmogorov equation, consists of sub-probability Borel measures. Let $N(\tau,x;t,B)$, $B\in\mathcal E^\triangle$, be the Feller transition function as constructed in Remark 1.42, which now consists of genuine Borel probability measures on the Borel field $\mathcal E^\triangle$ of $E^\triangle$. As in (2.26) construct the corresponding Markov process
$$\Bigl\{\bigl(\Omega,\widetilde{\mathcal F}^\tau_T,\mathbb P_{\tau,x}\bigr),\ \bigl(\widetilde X(t),\ \tau\le t\le T\bigr),\ \bigl(\vee_t : \tau\le t\le T\bigr),\ \bigl(E^\triangle,\mathcal E^\triangle\bigr)\Bigr\}. \tag{2.64}$$
Fix $(\tau,x)\in[0,T]\times E$ and $t\in[\tau,T]$. Then, on the event $\{X(t)\in E\}$, the orbit $\bigl\{\bigl(s,\widetilde X(s)\bigr) : \tau\le s\le t\bigr\}$ is $\mathbb P_{\tau,x}$-almost surely a relatively compact subset of $[\tau,t]\times E$.


Proof. A proof can be based on a stopping time argument and Doob's optional sampling theorem. Let the life time $\zeta:\Omega\to[0,T]$ be defined by
$$\zeta = \begin{cases}\inf\bigl\{s>0 : \widetilde X(s) = \triangle\bigr\}, & \text{if } \widetilde X(s) = \triangle\ \text{for some } s\le T,\\ T & \text{otherwise.}\end{cases}$$
Then $\zeta$ is an $(\mathcal F_t)_{t\in[\tau,T]}$-stopping time and we have:
$$\mathbb P_{\tau,x}\bigl[\widetilde X(t)\in E\bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb P_{\zeta\wedge t,\widetilde X(\zeta\wedge t)}\bigl[\widetilde X(t)\in E\bigr]\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[\mathbb P_{\zeta\wedge t,\widetilde X(\zeta\wedge t)}\bigl[\widetilde X(t)\in E\bigr],\ \zeta\le t\Bigr] + \mathbb E_{\tau,x}\Bigl[\mathbb P_{\zeta\wedge t,\widetilde X(\zeta\wedge t)}\bigl[\widetilde X(t)\in E\bigr],\ \zeta>t\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[\mathbb P_{\zeta,\widetilde X(\zeta)}\bigl[\widetilde X(t)\in E\bigr],\ \zeta\le t\Bigr] + \mathbb E_{\tau,x}\Bigl[\mathbb P_{t,\widetilde X(t)}\bigl[\widetilde X(t)\in E\bigr],\ \zeta>t\Bigr]$$
$$= \mathbb E_{\tau,x}\Bigl[\mathbb P_{\zeta,\triangle}\bigl[\widetilde X(t)\in E\bigr],\ \zeta\le t\Bigr] + \mathbb P_{\tau,x}\bigl[\widetilde X(t)\in E,\ \zeta>t\bigr]$$
(see Remark 1.42)
$$= \mathbb E_{\tau,x}\bigl[N(\zeta,\triangle;t,E),\ \zeta\le t\bigr] + \mathbb P_{\tau,x}\bigl[\widetilde X(t)\in E,\ \zeta>t\bigr]
= \mathbb P_{\tau,x}\bigl[\widetilde X(t)\in E,\ \zeta>t\bigr]. \tag{2.65}$$
From (2.65) it follows that on the event $\bigl\{\widetilde X(t)\in E\bigr\}$ the orbits $\bigl\{\bigl(s,\widetilde X(s)\bigr) : s\in[\tau,t]\bigr\}$ are $\mathbb P_{\tau,x}$-almost surely contained in compact subsets of $[\tau,t]\times E$.
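The cancellation in (2.65) hinges on $\triangle$ being absorbing for $N$, so that $N(\zeta,\triangle;t,E) = 0$. For a finite state space this is a transparent matrix computation; the following sketch (a hypothetical two-state example, not the construction of Remark 1.42 itself) pads a sub-stochastic matrix $P$ to a stochastic matrix $N$ with an absorbing extra state:

```python
import numpy as np

# A sub-stochastic one-step matrix on E = {0, 1}; state 1 loses mass 0.1.
P = np.array([[0.7, 0.3],
              [0.4, 0.5]])

# Pad with an absorbing cemetery state: N becomes a genuine stochastic matrix.
N = np.zeros((3, 3))
N[:2, :2] = P
N[:2, 2] = 1.0 - P.sum(axis=1)   # the killed mass is sent to the cemetery
N[2, 2] = 1.0                    # once absorbed, the process stays absorbed

Nt = np.linalg.matrix_power(N, 5)                             # five-step probabilities
assert np.allclose(Nt.sum(axis=1), 1.0)                       # N^5 is still stochastic
assert np.allclose(Nt[:2, :2], np.linalg.matrix_power(P, 5))  # restriction to E is P^5
assert Nt[2, :2].sum() == 0.0    # N(cemetery; E) = 0: no return to E after absorption
```

The last assertion is the finite-dimensional shadow of the statement that the first term on the right of (2.65) vanishes.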

In the proof of Proposition 2.5 we need the following result. Notice that in this proposition, as well as in Proposition 2.5, the conservative property (2.66) is employed. Proposition 2.3 contains a result which can be used in the non-conservative situation. The possibility of non-conservativeness plays a role in the proof of item (d) as well: see the inequalities in (2.119) and (2.120), and their consequences.
Proposition 2.4. Let $(\tau,x)$ be an element of $[0,T]\times E$, and assume
$$\mathbb P_{\tau,x}\bigl[X(t)\in E\bigr] = P(\tau,t)1_E(x) = P(\tau,x;t,E) = 1 \tag{2.66}$$
for all $t\in[\tau,T]$. Let $(f_m)_{m\in\mathbb N}$ be a sequence in $C_b^+([\tau,T]\times E)$ which decreases pointwise to zero. Denote by $D$ the collection of positive dyadic numbers. Then the following equality holds $\mathbb P_{\tau,x}$-almost surely:
$$\inf_{m\in\mathbb N}\ \sup_{t\in D\cap[\tau,T]}\ \sup_{s\in D\cap[\tau,t]}\ \mathbb E_{s,X(s)}\bigl[f_m(t,X(t))\bigr] = 0. \tag{2.67}$$
Consequently, the collection of linear functionals $\Lambda_{s,t}: C_b([\tau,T]\times E)\to\mathbb C$ defined by $\Lambda_{s,t}(f) = \mathbb E_{s,X(s)}[f(t,X(t))]$, $f\in C_b([\tau,T]\times E)$, $\tau\le s\le t\le T$, $s,t\in D$, is $\mathbb P_{\tau,x}$-almost surely equi-continuous for the strict topology $T_\beta$.

Let $(s_n,t_n)$ be any sequence in $[\tau,T]\times[\tau,T]$ such that $s_n\le t_n$, $n\in\mathbb N$. Then the collection
$$\bigl\{\Lambda_{s,t} : \tau\le s\le t\le T,\ s,t\in D\ \text{or}\ (s,t) = (s_n,t_n)\ \text{for some } n\in\mathbb N\bigr\}$$
is $\mathbb P_{\tau,x}$-almost surely equi-continuous as well.


Proof. Let $(f_m)_{m\in\mathbb N}\subset C_b^+([\tau,T]\times E)$ be as in Proposition 2.4. For every $m\in\mathbb N$ and $t\in[\tau,T]$ we define the $\mathbb P_{\tau,x}$-martingale $s\mapsto M_{t,m}(s)$, $s\in[\tau,T]$, by $M_{t,m}(s) = \mathbb E_{s\wedge t,X(s\wedge t)}\bigl[f_m(t,X(t))\bigr]$. Then the process
$$s\mapsto\sup_{t\in[\tau,T]}\mathbb E_{s\wedge t,X(s\wedge t)}\bigl[f_m(t,X(t))\bigr] = \sup_{t\in D\cap[\tau,T]}\mathbb E_{s\wedge t,X(s\wedge t)}\bigl[f_m(t,X(t))\bigr]$$
is a $\mathbb P_{\tau,x}$-submartingale. Fix $\eta>0$. By Doob's submartingale inequality we have
$$\eta\,\mathbb P_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}\ \sup_{s\in D\cap[\tau,t]}\mathbb E_{s,X(s)}\bigl[f_m(t,X(t))\bigr]\ge\eta\Bigr]
= \eta\,\mathbb P_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}\ \sup_{s\in D\cap[\tau,T]}M_{t,m}(s)\ge\eta\Bigr]
= \eta\,\mathbb P_{\tau,x}\Bigl[\sup_{s\in D\cap[\tau,T]}\ \sup_{t\in D\cap[\tau,T]}M_{t,m}(s)\ge\eta\Bigr]$$
$$\le \mathbb E_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}M_{t,m}(T)\Bigr] = \mathbb E_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}\mathbb E_{t,X(t)}\bigl[f_m(t,X(t))\bigr]\Bigr] = \mathbb E_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}f_m(t,X(t))\Bigr]. \tag{2.68}$$

Since the orbit $\{(t,X(t)) : t\in D\cap[\tau,T]\}$ is $\mathbb P_{\tau,x}$-almost surely contained in a compact subset of $[\tau,T]\times E$, Dini's lemma implies that $\sup_{t\in D\cap[\tau,T]}f_m(t,X(t))$ decreases to $0$ $\mathbb P_{\tau,x}$-almost surely, which implies
$$\lim_{m\to\infty}\mathbb E_{\tau,x}\Bigl[\sup_{t\in D\cap[\tau,T]}f_m(t,X(t))\Bigr] = 0. \tag{2.69}$$
A combination of (2.68) and (2.69) yields (2.67). So the first part of Proposition 2.4 has been established.

The second assertion follows from (2.67) together with Theorem 1.8. The third assertion follows from the fact that for $f\in C_b^+([\tau,T]\times E)$ and $\tau\le s_n\le t_n\le T$ the inequality
$$\mathbb E_{s_n,X(s_n)}\bigl[f(t_n,X(t_n))\bigr] \le \sup_{t\in D\cap[\tau,T]}\ \sup_{s\in D\cap[\tau,t]}\mathbb E_{s,X(s)}\bigl[f(t,X(t))\bigr]$$
holds $\mathbb P_{\tau,x}$-almost surely. This shows Proposition 2.4.
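The only probabilistic input in (2.68) is Doob's maximal inequality, $\eta\,\mathbb P(\sup_s M(s)\ge\eta)\le\mathbb E[M(T)]$ for a non-negative submartingale $M$. A hypothetical Monte-Carlo illustration with $M(k) = |S_k|$ for a simple symmetric random walk $S$ (the absolute value of a martingale is a submartingale):

```python
import random

random.seed(42)
n_paths, n_steps, eta = 20000, 25, 4.0
exceed, terminal = 0, 0.0
for _ in range(n_paths):
    s, running_max = 0, 0
    for _ in range(n_steps):
        s += random.choice((-1, 1))
        running_max = max(running_max, abs(s))
    exceed += running_max >= eta
    terminal += abs(s)

lhs = eta * exceed / n_paths   # eta * P(sup_k |S_k| >= eta)
rhs = terminal / n_paths       # E[|S_n|]
assert lhs <= rhs              # Doob's maximal inequality, empirically
```

With these parameters the left-hand side sits comfortably below the right-hand side, mirroring the estimate that feeds Dini's lemma in the proof above.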
The next proposition was used in the proof of item (a) of Theorem 1.39.
Proposition 2.5. Let $(\tau,x)\in[0,T]\times E$, and assume the conservative property (2.66). In addition, let $f\in C_b([0,T]\times E)$ and let $((s_n,t_n))_{n\in\mathbb N}$ be a sequence in $[\tau,T]\times[\tau,T]$ such that $s_n\le t_n$, $n\in\mathbb N$, and such that $\lim_{n\to\infty}(s_n,t_n) = (s,t)$. Then the limit
$$\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(t_n,X(t_n))\bigr] = \lim_{n\to\infty}\bigl[P(s_n,t_n)f(t_n,\cdot)\bigr](X(s_n)) = \bigl[P(s,t)f(t,\cdot)\bigr](X(s)) = \mathbb E_{s,X(s)}\bigl[f(t,X(t))\bigr] \tag{2.70}$$
exists $\mathbb P_{\tau,x}$-almost surely. In particular, if $s_n = t_n$ for all $n\in\mathbb N$, then $s = t$ and
$$\lim_{n\to\infty}\mathbb E_{t_n,X(t_n)}\bigl[f(t_n,X(t_n))\bigr] = \lim_{n\to\infty}f(t_n,X(t_n)) = f(t,X(t)),\qquad\mathbb P_{\tau,x}\text{-almost surely}. \tag{2.71}$$
In addition, by taking $t_n = t$ and letting the sequence $(s_n)_{n\in\mathbb N}$ decrease or increase to $s\in[\tau,t]$, it follows that the process $s\mapsto\mathbb E_{s,X(s)}[f(t,X(t))]$ is $\mathbb P_{\tau,x}$-almost surely a left and right continuous martingale. Moreover, the equalities
$$\mathbb E_{\tau,x}\bigl[f(t,X(t))\bigm|\mathcal F^\tau_{s+}\bigr] = \mathbb E_{s,X(s)}\bigl[f(t,X(t))\bigr] = \mathbb E_{\tau,x}\bigl[f(t,X(t))\bigm|\mathcal F^\tau_s\bigr] \tag{2.72}$$
hold $\mathbb P_{\tau,x}$-almost surely.

The equalities in (2.37) then follow from (2.72) together with the Monotone Class Theorem.
Proof. In the proof of Proposition 2.5 we will employ the properties of the process in (2.5) to its full extent. In addition we will use Proposition 2.4, which implies that continuity properties of the process
$$(s,t)\mapsto\alpha\int_t^\infty e^{-\alpha(\rho-t)}\mathbb E_{s,X(s)}\bigl[f(\rho\wedge T,X(\rho\wedge T))\bigr]\,d\rho
= \alpha\int_t^\infty e^{-\alpha(\rho-t)}P(s,\rho\wedge T)f(\rho\wedge T,\cdot)(X(s))\,d\rho$$
$$= \alpha P(s,t)R(\alpha)f(t,\cdot)(X(s))
= \int_0^\infty e^{-\rho}\,\mathbb E_{s,X(s)}\Bigl[f\Bigl(\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T,\ X\Bigl(\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T\Bigr)\Bigr)\Bigr]\,d\rho, \tag{2.73}$$
$0\le s\le t\le T$, $\mathbb P_{\tau,x}$-almost surely carry over to the process
$$(s,t)\mapsto P(s,t)f(t,\cdot)(X(s)) = \mathbb E_{s,X(s)}\bigl[f(t,X(t))\bigr]
= \lim_{\alpha\to\infty}\alpha\int_t^\infty e^{-\alpha(\rho-t)}P(s,\rho\wedge T)f(\rho\wedge T,\cdot)(X(s))\,d\rho$$
$$= \lim_{\alpha\to\infty}\int_0^\infty e^{-\rho}\,P\Bigl(s,\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T\Bigr)f\Bigl(\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T,\cdot\Bigr)(X(s))\,d\rho
= \lim_{\alpha\to\infty}\int_0^\infty e^{-\rho}\,\mathbb E_{s,X(s)}\Bigl[f\Bigl(\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T,\ X\Bigl(\Bigl(t+\frac{\rho}{\alpha}\Bigr)\wedge T\Bigr)\Bigr)\Bigr]\,d\rho. \tag{2.74}$$
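The passage from (2.73) to (2.74) is an Abel limit: for a bounded function $g$ that is right-continuous at $t$ one has $\alpha\int_t^\infty e^{-\alpha(\rho-t)}g(\rho)\,d\rho = \int_0^\infty e^{-u}g(t+u/\alpha)\,du \to g(t)$ as $\alpha\to\infty$. A hypothetical numerical check for a scalar $g$ (midpoint Riemann sum; for smooth $g$ the error is of order $1/\alpha$):

```python
import math

def abel_mean(g, t, alpha, steps=100_000, upper=40.0):
    # int_0^infty exp(-u) g(t + u/alpha) du  via a midpoint Riemann sum;
    # this equals  alpha * int_t^infty exp(-alpha (rho - t)) g(rho) d rho.
    du = upper / steps
    return sum(math.exp(-(k + 0.5) * du) * g(t + (k + 0.5) * du / alpha) * du
               for k in range(steps))

for alpha in (10.0, 100.0, 1000.0):
    assert abs(abel_mean(math.cos, 0.3, alpha) - math.cos(0.3)) < 2.0 / alpha
```

The same averaging, applied along the path $s\mapsto X(s)$ instead of to a scalar function, is what lets the equi-continuity of Proposition 2.4 pass to the limit family below.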
Let $(s_n,t_n)_{n\in\mathbb N}$ be a sequence in $[\tau,T]\times[\tau,T]$ for which $s_n\le t_n$. Put
$$\Lambda_{\alpha,s,t}f = \alpha\int_t^\infty e^{-\alpha(\rho-t)}P(s,\rho\wedge T)f(\rho\wedge T,\cdot)(X(s))\,d\rho.$$
The equality in (2.73), in conjunction with Proposition 2.4, shows that the collection of functionals
$$\bigl\{\Lambda_{\alpha,s,t} : \tau\le s\le t\le T,\ s,t\in D\ \text{or}\ (s,t) = (s_n,t_n)\ \text{for some } n\in\mathbb N,\ \alpha\ge 1\bigr\}$$
is $\mathbb P_{\tau,x}$-almost surely $T_\beta$-equi-continuous. Therefore the family of its limits $\Lambda_{s,t} = \lim_{\alpha\to\infty}\Lambda_{\alpha,s,t}$ inherits the continuity properties from the family
$$\bigl\{\Lambda_{\alpha,s,t} : \tau\le s\le t\le T,\ s,t\in D\ \text{or}\ (s,t) = (s_n,t_n)\ \text{for some } n\in\mathbb N\bigr\},$$
where $\alpha\in(0,\infty)$ is fixed.
We still have to prove that
$$\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(t_n,X(t_n))\bigr] = \mathbb E_{s,X(s)}\bigl[f(t,X(t))\bigr] \tag{2.75}$$
$\mathbb P_{\tau,x}$-almost surely, whenever $f\in C_b([\tau,T]\times E)$ and the sequence $(s_n,t_n)_{n\in\mathbb N}$ in $[\tau,T]\times[\tau,T]$ is such that $\lim_{n\to\infty}(s_n,t_n) = (s,t)$ and $s_n\le t_n$ for all $n\in\mathbb N$. In view of the first equality in (2.20) and the previous arguments it suffices to prove this equality for processes of the form
$$(s,t)\mapsto\alpha\int_t^\infty e^{-\alpha(\rho-t)}\mathbb E_{s,X(s)}\bigl[f(\rho\wedge T,X(\rho\wedge T))\bigr]\,d\rho$$
instead of $(s,t)\mapsto\mathbb E_{s,X(s)}[f(t,X(t))]$. It is easy to see that this convergence reduces to treating the case where, for $\rho\in(\tau,T]$ fixed and for $s_n\to s$, $s\in[\tau,\rho]$,
$$\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr] = \mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigr]. \tag{2.76}$$

Here we distinguish two cases: $s_n$ increases to $s$, and $s_n$ decreases to $s$. In both cases we will prove the equality in (2.76). In the case of an increasing sequence the result follows more or less directly from the martingale property and from the left continuity on the diagonal. In the case of a decreasing sequence we employ the fact that a subspace of the form $\{P(\rho,u)g : u\in(\rho,T],\ g\in C_b(E)\}$ is $T_\beta$-dense in $C_b(E)$. First we consider the situation where $s_n$ increases to $s\in[\tau,\rho]$. Then we have
$$\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr] = \mathbb E_{\tau,x}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^\tau_{s_n}\bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{\tau,x}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^\tau_s\bigr]\Bigm|\mathcal F^\tau_{s_n}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigr]\Bigm|\mathcal F^\tau_{s_n}\Bigr]$$
$$= \mathbb E_{s_n,X(s_n)}\Bigl[\mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigr]\Bigr]
= \bigl(P(s_n,s)\,\mathbb E_{s,\cdot}\bigl[f(\rho,X(\rho))\bigr]\bigr)(X(s_n)). \tag{2.77}$$
In (2.77) we let $n\to\infty$ and use the left continuity of the propagator (see property (v) in Definition 1.24) to conclude
$$\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr] = \mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigr]. \tag{2.78}$$
The equality in (2.78) shows the $\mathbb P_{\tau,x}$-almost sure left continuity of the process $s\mapsto\mathbb E_{s,X(s)}[f(\rho,X(\rho))]$ on the interval $[\tau,\rho]$. Next assume that the sequence $(s_n)_{n\in\mathbb N}$ decreases to $s\in[\tau,\rho]$. Then we get $\mathbb P_{\tau,x}$-almost surely

$$\mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigr] = P(s,\rho)f(\rho,\cdot)(X(s))$$
(employ (vi) of Definition 1.24)
$$= \lim_{n\to\infty}P(s_n,\rho)f(\rho,\cdot)(X(s_n)) = \lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr]
= \mathbb E_{s,X(s)}\Bigl[\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]$$
$$= \lim_{n\to\infty}\mathbb E_{s,X(s)}\Bigl[\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]
= \lim_{n\to\infty}\mathbb E_{s,X(s)}\Bigl[\mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^s_{s_n}\bigr]\Bigm|\mathcal F^s_{s+}\Bigr]$$
(tower property of conditional expectation)
$$= \mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^s_{s+}\bigr]
= \mathbb E_{\tau,x}\Bigl[\mathbb E_{s,X(s)}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^s_{s+}\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]
= \mathbb E_{\tau,x}\Bigl[\lim_{n\to\infty}\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]$$
$$= \lim_{n\to\infty}\mathbb E_{\tau,x}\Bigl[\mathbb E_{s_n,X(s_n)}\bigl[f(\rho,X(\rho))\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]$$
(Markov property)
$$= \lim_{n\to\infty}\mathbb E_{\tau,x}\Bigl[\mathbb E_{\tau,x}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^\tau_{s_n}\bigr]\Bigm|\mathcal F^\tau_{s+}\Bigr]$$
(tower property of conditional expectation)
$$= \mathbb E_{\tau,x}\bigl[f(\rho,X(\rho))\bigm|\mathcal F^\tau_{s+}\bigr]. \tag{2.79}$$

The equality in (2.79) is the same as the first equality in (2.72). The second equality is a consequence of the Markov property with respect to the filtration $(\mathcal F^\tau_t)_{t\in[\tau,T]}$. This completes the proof of Proposition 2.5.

2.1.2 Proof of item (b) of Theorem 1.39

Here we have to prove that Markov processes with certain continuity proper-
ties give rise to Feller evolutions.
Proof (Proof of item (b) in Theorem 1.39). Let the operators $P(\tau,t)$, $\tau\le t$, be as in (1.89). We have to prove that this collection is a Feller evolution. The properties (i), (iii) and (iv) of Definition 1.24 are obvious. The propagator property (ii) is a consequence of the Markov property of the process in (1.88). To be precise, let $f\in C_b(E)$ and $0\le\tau<s<t\le T$. Then we have:
$$P(\tau,s)P(s,t)f(x) = \mathbb E_{\tau,x}\bigl[P(s,t)f(X(s))\bigr] = \mathbb E_{\tau,x}\Bigl[\mathbb E_{s,X(s)}\bigl[f(X(t))\bigr]\Bigr] = \mathbb E_{\tau,x}\Bigl[\mathbb E_{\tau,x}\bigl[f(X(t))\bigm|\mathcal F^\tau_s\bigr]\Bigr] = \mathbb E_{\tau,x}\bigl[f(X(t))\bigr] = P(\tau,t)f(x). \tag{2.80}$$
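On a finite state space the computation (2.80) is the matrix form of the Chapman-Kolmogorov equation: the propagator $P(\tau,t)$ of a time-inhomogeneous chain is the ordered product of its one-step matrices, and $P(\tau,s)P(s,t) = P(\tau,t)$ follows from associativity. A hypothetical sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_stochastic(n=3):
    M = rng.random((n, n)) + 0.1
    return M / M.sum(axis=1, keepdims=True)   # rows sum to one

steps = [random_stochastic() for _ in range(6)]  # one-step matrices, time k -> k+1

def propagator(a, b):
    """P(a, b): ordered product of the one-step matrices from time a to time b."""
    P = np.eye(3)
    for k in range(a, b):
        P = P @ steps[k]
    return P

tau, s, t = 0, 2, 5
assert np.allclose(propagator(tau, s) @ propagator(s, t), propagator(tau, t))
assert np.allclose(propagator(tau, t).sum(axis=1), 1.0)  # conservative: P(tau, t)1 = 1
```

The second assertion is the finite-dimensional analogue of the conservative property $P(\tau,t)1 = 1$ used in the first part of the proof.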

Let $f$ be any function in $C_b(E)$. The continuity of the function $(\tau,t,x)\mapsto P(\tau,t)f(x)$, $0\le\tau\le t\le T$, $x\in E$, implies the properties (v) through (vii) of Definition 1.24. Let $f\in C_b([0,T]\times E)$. In addition we have to prove that the function $(\tau,t,x)\mapsto P(\tau,t)f(t,\cdot)(x)$ is continuous. The proof of this fact requires the following steps:

1. The Feller evolution $\{P(\tau,t) : 0\le\tau\le t\le T\}$ is $T_\beta$-equi-continuous.
2. Define the operators $R(\alpha): C_b([0,T]\times E)\to C_b([0,T]\times E)$, $\alpha>0$, as in (3.6) below:
$$R(\alpha)f(t,x) = \int_t^\infty e^{-\alpha(\rho-t)}P(t,\rho\wedge T)f(\rho\wedge T,\cdot)(x)\,d\rho, \tag{2.81}$$
$f\in C_b([0,T]\times E)$. Then the functions $(\tau,t,x)\mapsto P(\tau,t)[R(\alpha)f(t,\cdot)](x)$, $0\le\tau\le t\le T$, $x\in E$, $\alpha>0$, are continuous for all $f\in C_b([0,T]\times E)$.
3. The family $\{R(\alpha) : \alpha>0\}$ is a resolvent family, and hence the range of $R(\alpha)$ does not depend on $\alpha>0$. The $T_\beta$-closure of its range coincides with $C_b([0,T]\times E)$.

From 3, 1 and 2 it then follows that functions of the form $P(\tau,t)f(t,\cdot)(x)$, $0\le\tau\le t\le T$, $f\in C_b([0,T]\times E)$, are continuous. So we have to prove 1 through 3.
Let (ψ_m)_{m∈N} be a sequence of functions in C^+(E) which decreases point-
wise to zero. Since, by assumption, the functions (τ,t,x) ↦ P(τ,t)ψ_m(x),
m ∈ N, are continuous, the sequence P(τ,t)ψ_m(x) decreases uniformly on
compact subsets to 0. By Theorem 1.18 it follows that the Feller evolution
{P(τ,t) : 0 ≤ τ ≤ t ≤ T} is Tβ-equi-continuous. This proves 1.
Let f ∈ Cb([0,T] × E), and fix α > 0. Then the function P(τ,t)[R(α)f(t,·)]
can be written in the form

    P(τ,t)[R(α)f(t,·)](x) = ∫_t^∞ e^{−α(ρ−t)} P(τ, ρ∧T) f(ρ∧T, ·)(x) dρ,

which by inspection is continuous, because for fixed ρ ∈ [0,T] the function
(τ,x) ↦ P(τ,ρ)f(ρ,·)(x) is continuous. This proves assertion 2.
The family {R(α) : α > 0} is a resolvent family, i.e. it satisfies:

    R(β) = R(α) + (α − β) R(α) R(β),  α, β > 0.                            (2.82)

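As a finite-dimensional sanity check (not part of the book's Polish-space setting; the generator matrix below is an invented illustration), the resolvent identity (2.82) can be verified numerically for R(α) = (αI − G)^{-1}, the resolvent of a Markov generator matrix G; the same computation illustrates the Abel limit lim_{α→∞} αR(α)f = f used below for (2.83).

```python
import numpy as np

# Illustrative 3-state Markov generator (rows sum to zero); any conservative
# generator matrix would do.
G = np.array([[-2.0,  1.0,  1.0],
              [ 0.5, -1.0,  0.5],
              [ 1.0,  2.0, -3.0]])
I = np.eye(3)

def R(a):
    # Resolvent of G: R(a) = (a I - G)^{-1}, the Laplace transform of e^{t G}.
    return np.linalg.inv(a * I - G)

alpha, beta = 1.5, 0.7
# Resolvent identity (2.82): R(beta) = R(alpha) + (alpha - beta) R(alpha) R(beta).
assert np.allclose(R(beta), R(alpha) + (alpha - beta) * R(alpha) @ R(beta))

# Abel limit: alpha R(alpha) f -> f as alpha -> infinity.
f = np.array([1.0, -2.0, 0.5])
assert np.allclose(1e6 * R(1e6) @ f, f, atol=1e-4)
```

In particular, the identity exhibits R(β) as an algebraic combination of R(α) and R(β) itself, which is why the range of R(α) does not depend on α, the point of step 3.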

2.1 Proof of the main result: Theorem 1.39 71

Consequently, the range R(α)Cb([0,T] × E) does not depend on α > 0. Next
fix f ∈ Cb([0,T] × E). Then lim_{α→∞} αR(α)f(t,x) = f(t,x) for all (t,x) ∈
[0,T] × E. By dominated convergence it also follows that

    lim_{α→∞} ∫_{[0,T]×E} αR(α)f(t,x) dµ(t,x)
      = lim_{α→∞} ∫_{[0,T]×E} ∫_0^∞ e^{−ρ} P(t, (t + ρ/α)∧T) f((t + ρ/α)∧T, ·)(x) dρ dµ(t,x)
      = ∫_{[0,T]×E} f(t,x) dµ(t,x),                                        (2.83)

where µ is a complex Borel measure on [0,T] × E. From (2.83) and Corol-
lary 1.5 we see that the space R(α)Cb([0,T] × E) is Tβ-weakly dense in
Cb([0,T] × E). It follows that it is Tβ-dense. Let K be a compact subset
of E. Since the Feller evolution is Tβ-equi-continuous there exists a bounded
function u ∈ H^+([0,T] × E) such that

    sup_{0≤τ≤t≤T} sup_{x∈K} |P(τ,t)f(x)| ≤ ‖uf‖_∞,  f ∈ Cb(E).            (2.84)

Fix ε > 0. For α_0 > 0 and f ∈ Cb([0,T] × E) fixed, there exists a function
g ∈ Cb([0,T] × E) such that

    sup_{s∈[0,T]} sup_{y∈E} | u(s,y) ( f(s,y) − α_0 R(α_0)g(s,y) ) | ≤ ε.  (2.85)

From (2.84) and (2.85) we infer:

    sup_{0≤τ≤t≤T} sup_{x∈K} | P(τ,t)[ f(t,·) − α_0 R(α_0)g(t,·) ](x) |
      ≤ sup_{0≤s≤T} sup_{y∈E} | u(s,y) ( f(s,y) − α_0 R(α_0)g(s,y) ) | ≤ ε.  (2.86)

As a consequence of (2.86) the function (τ,t,x) ↦ P(τ,t)f(t,·)(x) inherits
its continuity properties from functions of the form

    (τ,t,x) ↦ P(τ,t)[R(α_0)f(t,·)](x),  0 ≤ τ ≤ t ≤ T, x ∈ E.

Since the latter functions are continuous, the same is true for the function
P(τ,t)f(t,·)(x).
This concludes the proof of part (b) of Theorem 1.39.
As a corollary we mention the following: its proof follows from the argu-
ments leading to the observation that for all f ∈ Cb ([0, T ] × E) the function
(τ, t, x) 7→ P (τ, t) f (t, ·) (x) is continuous. It will be used in the proof of The-
orem 3.10 below.

Corollary 2.6. Let the family {P(τ,t) : 0 ≤ τ ≤ t ≤ T} be a Feller evolu-
tion in Cb(E). Extend these operators to the space Cb([0,T] × E) by the for-
mula P̃(τ,t)f(τ,x) = P(τ,t)f(t,·)(x), f ∈ Cb([0,T] × E). Then the family
{P̃(τ,t) : 0 ≤ τ ≤ t ≤ T} is again Tβ-equi-continuous. In addition define the
Tβ-continuous semigroup {S(t) : t ≥ 0} on Cb([0,T] × E) by

    S(t)f(τ,x) = P(τ, (τ+t)∧T) f((τ+t)∧T, ·)(x),
    f ∈ Cb([0,T] × E).                                                     (2.87)

Then the semigroup {S(t) : t ≥ 0} is Tβ-equi-continuous.

In the sequel we will not use the notation P̃(τ,t) for the extended Feller
evolution very much: we will simply ignore the difference between P̃(τ,t) and
P(τ,t). For more details on the semigroup defined in (2.87) see (3.5) below.

Proof (Proof of Corollary 2.6.). Let f ∈ Cb([0,T] × E). From the proof of
(b) of Theorem 1.39 (see the very end) we infer that the function (τ,t,x) ↦
P̃(τ,t)f(τ,x) is continuous. Let (ψ_m)_{m∈N} be a sequence of functions in
Cb([0,T] × E) which decreases pointwise to 0. Let u ∈ H^+([0,T] × E). Then
the functions P̃(τ,t)ψ_m(x) also decrease uniformly to 0. From Corollary 1.19
it follows that the family {P̃(τ,t) : 0 ≤ τ ≤ t ≤ T} is Tβ-equi-continuous.
From the representation (2.87) of the semigroup {S(t) : t ≥ 0}, it is also clear
that this semigroup is Tβ-equi-continuous.

2.1.3 Proof of item (c) of Theorem 1.39

In this part and in part (d) of Theorem 1.39 we will see the intimate relation-
ship which exists between solutions to the martingale problem and the corre-
sponding (strong) Markov processes.

Proof (Proof of part (c) of Theorem 1.39.). In the proof of item (c) we will
use the fact that an operator L generates a Feller evolution if and only if it
generates the corresponding Markov process: see Proposition 3.1 below. So
we may assume that the corresponding Markov process is that of part (a)
of Theorem 1.39: see (1.84). Among other things this means that it is right-
continuous and has left limits in E during its lifetime. In addition, it is quasi-
left continuous on its lifetime. Let f ∈ Cb([0,T] × E) belong to the domain of
D_1 + L. We will show that the process in (1.90) is a P_{τ,x}-martingale. To this
end, fix s ∈ [τ,t], and put

    M_{τ,f}(s) = f(s, X(s)) − f(τ, X(τ)) − ∫_τ^s (∂/∂ρ + L(ρ)) f(ρ, ·)(X(ρ)) dρ.

Then by the Markov property we have

    E_{τ,x}[ M_{τ,f}(t) | F^τ_s ] − M_{τ,f}(s)
      = E_{τ,x}[ M_{s,f}(t) | F^τ_s ] = E_{s,X(s)}[ M_{s,f}(t) ]
      = E_{s,X(s)}[ f(t, X(t)) ] − E_{s,X(s)}[ f(s, X(s)) ]
        − E_{s,X(s)}[ ∫_s^t (∂/∂ρ + L(ρ)) f(ρ, ·)(X(ρ)) dρ ]

    (the operator L generates the involved Markov process)

      = E_{s,X(s)}[ f(t, X(t)) ] − E_{s,X(s)}[ f(s, X(s)) ]
        − ∫_s^t (d/dρ) E_{s,X(s)}[ f(ρ, X(ρ)) ] dρ
      = E_{s,X(s)}[ f(t, X(t)) ] − E_{s,X(s)}[ f(s, X(s)) ]
        − E_{s,X(s)}[ f(ρ, X(ρ)) ] |_{ρ=s}^{ρ=t} = 0.                      (2.88)

The equality in (2.88) proves the first part of assertion (c). Proposition 2.7
below proves more than what is claimed in (c) of Theorem 1.39. Therefore
the proof of item (c) in Theorem 1.39 is completed by Proposition 2.7.
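In the time-homogeneous, finite-state situation (a toy stand-in for the setting above; the chain and the function below are invented purely for illustration), the computation in (2.88) reduces to Dynkin's formula E_x[f(X_t)] − f(x) = ∫_0^t E_x[(Lf)(X_s)] ds, which can be checked against the transition semigroup e^{sG}:

```python
import numpy as np

def expm(A, terms=60):
    # Truncated Taylor series for e^A; fine for this small, well-scaled matrix.
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Toy generator of a 3-state chain and a test function f.
G = np.array([[-1.0,  1.0,  0.0],
              [ 0.5, -1.5,  1.0],
              [ 0.25, 0.25, -0.5]])
f = np.array([2.0, -1.0, 3.0])

t, n = 1.0, 2000
ds = t / n
P_step = expm(ds * G)          # one-step transition matrix e^{ds G}
P = np.eye(3)                  # running semigroup e^{s G}
integral = np.zeros(3)
for _ in range(n):             # left-endpoint Riemann sum of E_x[(G f)(X_s)]
    integral += P @ (G @ f) * ds
    P = P @ P_step             # after the loop, P = e^{t G}

# Dynkin's formula: the martingale in (2.88) has expectation zero.
assert np.allclose(P @ f - f, integral, atol=1e-2)
```

The assertion is the expectation-zero statement of (2.88), specialized to a generator without explicit time dependence.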

Proposition 2.7. Let the Markov family of probability spaces be as in (a),
formula (1.84), of Theorem 1.39. Let ∨_t, ∧_t, ϑ_t : Ω → Ω, t ∈ [0,T],
be time transformations with the following respective defining properties:
X(s) ∘ ∨_t = X(s∨t), X(s) ∘ ∧_t = X(s∧t), and X(s) ∘ ϑ_t = X((s+t)∧T),
for all s, t ∈ [0,T]. Let the σ-fields F^{t_1}_{t_2}, 0 ≤ t_1 ≤ t_2 ≤ T, be defined
by F^{t_1}_{t_2} = σ(X(s) : t_1 ≤ s ≤ t_2). Fix t ∈ [0,T]. Then the mapping ∨_t
is F^{t_1∨t}_{t_2∨t}-F^{t_1}_{t_2}-measurable, the mapping ∧_t is
F^{t_1∧t}_{t_2∧t}-F^{t_1}_{t_2}-measurable, and ϑ_t is
F^{(t_1+t)∧T}_{(t_2+t)∧T}-F^{t_1}_{t_2}-measurable.
Fix τ ∈ [0,T], and τ ≤ t_1 ≤ t_2 ≤ T. Let µ be a Borel probability measure
on E, and define the probability measure P_{τ,µ} on F^τ_T by the formula
P_{τ,µ}(A) = ∫_E P_{τ,x}(A) dµ(x), A ∈ F^τ_T. Let (F^{t_1}_{t_2})^{τ,µ} be the
P_{τ,µ}-completion of the σ-field F^{t_1}_{t_2}. Then (P_{τ,µ}-a.s. means
P_{τ,µ}-almost surely)

    (F^{t_1}_{t_2})^{τ,µ} = { A ∈ (F^τ_T)^{τ,µ} : 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} = 1_A, P_{τ,µ}-a.s. },   (2.89)

and

    (F^{t_1}_{t_2+})^{τ,µ} = ⋂_{ε∈(0,T−t_2]} { A ∈ (F^τ_T)^{τ,µ} : 1_A ∘ ∨_{t_1} ∘ ∧_{t_2+ε} = 1_A, P_{τ,µ}-a.s. }.   (2.90)

In addition the following equalities are P_{τ,µ}-almost surely valid for all bounded
stochastic variables F which are (F^τ_T)^{τ,µ}-measurable:

    E_{τ,µ}[ F | F^τ_{t+} ] = E_{τ,µ}[ F | F^τ_t ],                        (2.91)
    E_{τ,µ}[ F | (F^τ_{t+})^{τ,µ} ] = E_{τ,µ}[ F | F^τ_{t+} ].             (2.92)

If the variable F is (F^τ_{t+})^{τ,µ}-measurable, then the equalities

    F = E_{τ,µ}[ F | (F^τ_{t+})^{τ,µ} ] = E_{τ,µ}[ F | F^τ_{t+} ] = E_{τ,µ}[ F | F^τ_t ]   (2.93)

hold P_{τ,µ}-almost surely. If the bounded stochastic variable F is (F^t_T)^{τ,µ}-
measurable, then P_{τ,µ}-almost surely

    E_{τ,µ}[ F | (F^τ_{t+})^{τ,µ} ] = E_{t,X(t)}[F].                       (2.94)

Finally, if F is (F^t_{t+})^{τ,µ}-measurable, then

    F = E_{t,X(t)}[F],  P_{τ,µ}-almost surely.                             (2.95)

In particular such variables are P_{τ,x}-almost surely functions of the space-time
variable (t, X(t)).

Proof. Let F be a bounded F^{s_1}_{s_2}-measurable variable. The measurability
properties of the time operator ∨_t follow from the fact that F ∘ ∨_t is
F^{s_1∨t}_{s_2∨t}-measurable. Similar statements hold for the operators ∧_t and ϑ_t.
The equality

    F^{t_1}_{t_2} = { A ∈ F^τ_T : 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} = 1_A, P_{τ,µ}-a.s. }   (2.96)

is clear, and so the left-hand side is included in the right-hand side of (2.89).
This can be seen as follows. Let A ∈ (F^{t_1}_{t_2})^{τ,µ}. Then there exist subsets
A_1 and A_2 ∈ F^{t_1}_{t_2} such that A_1 ⊂ A ⊂ A_2 and P_{τ,µ}[A_2 \ A_1] = 0.
Then we have

    1_{A_1} − 1_{A_2} = 1_{A_1} − 1_{A_2} ∘ ∨_{t_1} ∘ ∧_{t_2}
      ≤ 1_A − 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} ≤ 1_{A_2} − 1_{A_1} ∘ ∨_{t_1} ∘ ∧_{t_2} = 1_{A_2} − 1_{A_1}.   (2.97)

From (2.97) we see that 1_A = 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} P_{τ,µ}-almost surely, and
hence the left-hand side of (2.89) is included in the right-hand side. Since by
the same argument the σ-field { A ∈ (F^τ_T)^{τ,µ} : 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} = 1_A,
P_{τ,µ}-a.s. } is P_{τ,µ}-complete, and since

    { A ∈ F^τ_T : 1_A ∘ ∨_{t_1} ∘ ∧_{t_2} = 1_A, P_{τ,µ}-a.s. } ⊂ (F^{t_1}_{t_2})^{τ,µ},   (2.98)

we also obtain that the right-hand side of (2.89) is contained in the left-hand
side. The equality in (2.90) is an immediate consequence of (2.89), and the
definition of F^{t_1}_{t_2+}.
By the Monotone Class Theorem and an approximation argument the proof
of (2.91) can be reduced to the case where F = ∏_{j=1}^n f_j(X(t_j)), where
τ ≤ t_1 < ··· < t_k ≤ t < t_{k+1} < ··· < t_n ≤ T, and f_j ∈ Cb(E), 1 ≤ j ≤ n.
Then by properties of conditional expectation and the Markov property with
respect to the filtration (F^τ_t)_{t∈[τ,T]} we have

    E_{τ,µ}[ F | F^τ_{t+} ] = E_{τ,µ}[ ∏_{j=1}^n f_j(X(t_j)) | F^τ_{t+} ]
      = ∏_{j=1}^k f_j(X(t_j)) · E_{τ,µ}[ E_{τ,µ}[ ∏_{j=k+1}^n f_j(X(t_j)) | F^τ_{t_{k+1}} ] | F^τ_{t+} ]

    (Markov property)

      = ∏_{j=1}^k f_j(X(t_j)) · E_{τ,µ}[ g(X(t_{k+1})) | F^τ_{t+} ],       (2.99)

where g(y) = f_{k+1}(y) E_{t_{k+1},y}[ ∏_{j=k+2}^n f_j(X(t_j)) ]. Again we may
suppose that the function g belongs to Cb(E). Then we get, for t < s < t_{k+1},

    E_{τ,µ}[ g(X(t_{k+1})) | F^τ_{t+} ] = E_{τ,µ}[ E_{τ,µ}[ g(X(t_{k+1})) | F^τ_s ] | F^τ_{t+} ]

    (Markov property)

      = E_{τ,µ}[ E_{s,X(s)}[ g(X(t_{k+1})) ] | F^τ_{t+} ]
      = lim_{s↓t} E_{τ,µ}[ E_{s,X(s)}[ g(X(t_{k+1})) ] | F^τ_{t+} ]
      = E_{τ,µ}[ E_{t,X(t)}[ g(X(t_{k+1})) ] | F^τ_{t+} ]
      = E_{t,X(t)}[ g(X(t_{k+1})) ]

    (again Markov property)

      = E_{τ,µ}[ g(X(t_{k+1})) | F^τ_t ].                                  (2.100)

Inserting the result of (2.100) into (2.99), and reverting the arguments which
led to (2.99) with F^τ_t instead of F^τ_{t+}, shows the equality in (2.91) for
F = ∏_{j=1}^n f_j(X(t_j)), where the functions f_j, 1 ≤ j ≤ n, belong to Cb(E).
As mentioned earlier this suffices to obtain (2.91) for all bounded random
variables F which are (F^τ_T)^{τ,µ}-measurable. Here we use the fact that for
any σ-field F ⊂ (F^τ_T)^{τ,µ}, and any bounded stochastic variable F which is
measurable with respect to the P_{τ,µ}-completion of F, an equality of the form
F = E_{τ,µ}[ F | F ] holds P_{τ,µ}-almost surely. This argument also shows that
the equality in (2.92) is a consequence of (2.91). The equalities in (2.93) follow
from the definition of conditional expectation and the equalities (2.91) and
(2.92). The equality in (2.94) also follows from (2.91) and (2.92) together with
the Markov property. Finally, the equality in (2.95) is a consequence of (2.94)
and the definition of conditional expectation.
Altogether this proves Proposition 2.7.

2.1.4 Proof of item (d) of Theorem 1.39

In this subsection we will establish the fact that unique solutions to the mar-
tingale problem yield strong Markov processes.

Proof (Proof of item (d) of Theorem 1.39.). The proof of this result is quite
technical. The first part follows from a well-known theorem of Kolmogorov
on projective systems of measures. In the second part we must show that
the indicated path space has full measure, so that no information is lost. The
techniques used are reminiscent of the material found in, for example, Blumen-
thal and Getoor [34], Theorem 9.4, p. 46. The result in (d) is a consequence
of the Propositions 2.8, 2.9, and 2.11 below.
In (d), as everywhere else in the book, L = {L(s) : 0 ≤ s ≤ T} is considered as
a linear operator with domain D(L) and range R(L) in the space Cb([0,T] × E).
Suppose that the domain D(L) of L is Tβ-dense in Cb([0,T] × E). The prob-
lem we want to address is the following. Give necessary and sufficient condi-
tions on the operator L in order that for every (τ,x) ∈ [0,T] × E there exists
a unique probability measure P_{τ,x} on F^τ_T with the following properties:
(i) For every f ∈ D(L) which is C^{(1)}-differentiable in the time variable the
    process

        f(t, X(t)) − f(τ, X(τ)) − ∫_τ^t (D_1 f + Lf)(s, X(s)) ds,  t ∈ [τ,T],

    is a P_{τ,x}-martingale;
(ii) P_{τ,x}[X(τ) = x] = 1.
Here we suppose that Ω = D([0,∞], E^Δ) is the Skorohod space associated with
E^Δ, as described in Definition 1.32, and F^τ_T is the σ-field generated by the
state variables X(t), t ∈ [τ,T]. The probability measures P_{τ,x} are defined on
the σ-field F^τ_T. The following procedure extends them to F^0_T. If the event A
belongs to F^0_T, then we put P_{τ,x}[A] = E_{τ,x}[1_A ∘ ∨_τ]. The composition
1_A ∘ ∨_τ is defined in (1.74). With this convention in mind the equality in (ii)
may be replaced by
(ii)′ P_{τ,x}[X(s) = x] = 1 for all s ∈ [0,τ].
Let P(Ω) be the set of all probability measures on F^0_T and define the subset
P_{00}(Ω) of P(Ω) by

    P_{00}(Ω) = ⋃_{(τ,x)∈[0,T]×E^Δ} { P ∈ P(Ω) : P[X(τ) = x] = 1,
      and for every f ∈ D(L) ∩ D(D_1) the process

          f(t, X(t)) − f(τ, X(τ)) − ∫_τ^t (D_1 + L)f(s, X(s)) ds,  t ∈ [τ,T],

      is a P-martingale }.                                                 (2.101)

Instead of D(L) ∩ D(D_1) we often write D^{(1)}(L): see the comments following
Definition 1.30. Let (v_j : j ∈ N) be a sequence of continuous functions defined
on [0,T] × E^Δ with the following properties:
(i) v_0 = 1_E, v_1 = 1_{{Δ}};
(ii) ‖v_j‖_∞ ≤ 1, v_j belongs to D^{(1)}(L) = D(L) ∩ D(D_1), and v_j(s,Δ) = 0
    for j ≥ 2;
(iii) The linear span of the v_j, j ≥ 0, is dense in Cb([0,T] × E^Δ) for the
    strict topology Tβ.
In addition let (f_k : k ∈ N) be a sequence in D^{(1)}(L) such that the linear
span of {(f_k, (D_1 + L)f_k) : k ∈ N} is Tβ-dense in the graph G(D_1 + L) :=
{(f, (D_1 + L)f) : f ∈ D^{(1)}(L)} of the operator D_1 + L. Moreover, let
(s_j : j ∈ N) be an enumeration of the set Q ∩ [0,T]. A subset P′(Ω), which is
closely related to P_{00}(Ω), may be described as follows (see (2.52) as well):

    P′(Ω) = ⋂_{n=1}^∞ ⋂_{k=1}^∞ ⋂_{m=0}^∞ ⋂_{(j_1,…,j_{m+1})∈N^{m+1}} ⋂_{0≤s_{j_1}<…<s_{j_{m+1}}≤T}
      { P ∈ P(Ω) : P[ X(s_{j_k}) ∈ E, 1 ≤ k ≤ m+1 ] = P[ X(s_{j_{m+1}}) ∈ E ], and

        ∫ ( f_k(s_{j_{m+1}}, X(s_{j_{m+1}})) − f_k(s_{j_m}, X(s_{j_m})) ) ∏_{k=1}^m v_{j_k}(s_{j_k}, X(s_{j_k})) dP

          = ∫ ( ∫_{s_{j_m}}^{s_{j_{m+1}}} (D_1 + L)f_k(s, X(s)) ds ) ∏_{k=1}^m v_{j_k}(s_{j_k}, X(s_{j_k})) dP }.   (2.102)

Let P(Ω) be the collection of probability measures on F^0_T. For a concise
formulation of the relevant distance between probability measures in P(Ω)
we introduce a kind of Lévy numbers. Let P_1 and P_2 ∈ P(Ω). Then we write,
for Λ ⊂ [0,T], Λ finite or countable,

    L_Λ(P_2, P_1) = lim_{ℓ→∞} inf{ η > 0 :
        P_2[ (X(s))_{s∈Λ} ⊂ ⋃_{j=1}^ℓ B(x_j, 2^{−m}) ]
        ≥ (1 − η 2^{−m}) P_1[ (X(s))_{s∈Λ} ⊂ E ], for all m ∈ N },         (2.103)

where B(x, ε) is a ball in E centered at x and with radius ε > 0. Notice that
in (2.103) lim_{ℓ→∞} may be replaced with inf_{ℓ∈N}. In fact we shall prove
that, if the martingale problem is solvable for the operator L, then the set
P′(Ω) is complete metrizable and separable for the metric d(P_1, P_2) given by

    d_L(P_1, P_2)
      = Σ_{Λ⊂N, |Λ|<∞} 2^{−|Λ|} Σ_{(ℓ_j)_{j∈Λ}} | ∫ ∏_{j∈Λ} 2^{−j−ℓ_j} v_j(s_{ℓ_j}, X(s_{ℓ_j})) d(P_2 − P_1) |
      + Σ_{k=1}^∞ 2^{−k} ( L_{Q∩[0,s_k]}(P_2, P_1) + L_{Q∩[0,s_k]}(P_1, P_2) ).   (2.104)

If a sequence of probability measures (P_n)_{n∈N} converges to P with respect
to the metric in (2.104), then the first term on the right-hand side says that
the finite-dimensional distributions of P_n converge to the finite-dimensional
distributions of P. The second term says that the limit P is indeed a measure,
and that the paths of the process are P-almost surely totally bounded. The
following result should be compared with the comments in 6.7.4 of [225], pp.
167–168. It is noticed that in Proposition 2.8 the uniqueness of the martingale
problem is used to prove the separability.
Proposition 2.8. The set P′(Ω) supplied with the metric d_L defined in
(2.104) is a separable complete metrizable Hausdorff space.
Proof. Let (P_n : n ∈ N) be a Cauchy sequence in (P′(Ω), d_L). Then for every
m ∈ N, for every m-tuple (j_1, …, j_m) in N^m and for every m-tuple
(s_{j_1}, …, s_{j_m}) ∈ Q^m ∩ [0,T]^m the limit
lim_{ℓ→∞} ∫ ∏_{k=1}^m v_{j_k}(s_{j_k}, X(s_{j_k})) dP_{n_ℓ} exists. We shall
prove that for every m ∈ N, for every m-tuple (j_1, …, j_m) in N^m and for
every m-tuple (t_{j_1}, …, t_{j_m}) ∈ [0,T]^m the limit

    lim_{n→∞} ∫ ∏_{k=1}^m u_{j_k}(t_{j_k}, X(t_{j_k})) dP_n                (2.105)

exists for all sequences (u_j)_{j∈N} in Cb([0,T] × E). Since, in addition,

    lim_{n→∞} lim_{m→∞} L_{Q∩[0,s_k]}(P_n, P_m) = lim_{m→∞} lim_{n→∞} L_{Q∩[0,s_k]}(P_n, P_m) = 0,   (2.106)

for all k ∈ N, it follows that the sequence (P_n)_{n∈N} is tight in the sense that
the paths {X(s) : s ∈ Q ∩ [0,s_k]} are P_n-almost surely totally bounded, uni-
formly in n. The latter means that for every ε > 0 there exists n(ε) ∈ N and
integers (ℓ_m(ε))_{m∈N} such that

    P_{n_2}[ (X(s))_{s∈Q∩[0,s_k]} ⊂ ⋃_{j=1}^{ℓ_m(ε)} B(x_j, 2^{−m}) ]
      ≥ (1 − ε 2^{−m}) P_{n_1}[ (X(s))_{s∈Q∩[0,s_k]} ⊂ E ]                 (2.107)

for all n_2, n_1 ≥ n(ε), and for all m ∈ N. By enlarging ℓ_m(ε) we may and do
assume that

    P_n[ (X(s))_{s∈Q∩[0,s_k]} ⊂ ⋃_{j=1}^{ℓ_m(ε)} B(x_j, 2^{−m}) ]
      ≥ (1 − ε 2^{−m}) P_{n(ε)}[ (X(s))_{s∈Q∩[0,s_k]} ⊂ E ],               (2.108)

and

    P_n[ (X(s))_{s∈Q∩[0,s_k]} ⊂ ⋃_{j=1}^{ℓ_m(ε)} B(x_j, 2^{−m}) ]
      ≥ (1 − ε 2^{−m}) P_n[ (X(s))_{s∈Q∩[0,s_k]} ⊂ E ]                     (2.109)

for all n ∈ N. It follows that

    P_n[ (X(s))_{s∈Q∩[0,s_k]} ⊂ ⋂_{m=1}^∞ ⋃_{j=1}^{ℓ_m(ε)} B(x_j, 2^{−m}) ]
      ≥ (1 − ε) P_n[ (X(s))_{s∈Q∩[0,s_k]} ⊂ E ],                           (2.110)

for all n ∈ N. But then there exists, by Kolmogorov's extension theorem, a
probability measure P such that

    lim_{n→∞} ∫ ∏_{k=1}^m u_{j_k}(t_{j_k}, X(t_{j_k})) dP_n = ∫ ∏_{k=1}^m u_{j_k}(t_{j_k}, X(t_{j_k})) dP,   (2.111)

for all m ∈ N, for all (j_1, …, j_m) ∈ N^m and for all (t_{j_1}, …, t_{j_m}) ∈ [0,T]^m.
From the description (2.102) of P′(Ω) it then readily follows that P is a
member of P′(Ω). So the existence of the limit in (2.105) remains to be
verified, together with the following facts: the limit P is a martingale solution,
and D([0,∞], E^Δ) has full P-measure. Let t be in Q ∩ [0,T]. Since, for every
j ∈ N, the process

    v_j(s, X(s)) − v_j(0, X(0)) − ∫_0^s (D_1 + L)v_j(σ, X(σ)) dσ,  s ∈ [0,T],

is a martingale for the measure P_{n_ℓ}, we infer

    ∫ ∫_0^t (D_1 + L)v_j(s, X(s)) ds dP_{n_ℓ} = ∫ v_j(t, X(t)) dP_{n_ℓ} − ∫ v_j(0, X(0)) dP_{n_ℓ},

and hence the limit lim_{ℓ→∞} ∫ ∫_0^t (D_1 + L)v_j(s, X(s)) ds dP_{n_ℓ} exists.
Next let t_0 be in [0,T]. Again using the martingale property we see
be in [0, T ]. Again using the martingale property we see
    ∫ v_j(t_0, X(t_0)) d(P_{n_ℓ} − P_{n_k})
      = ∫ ( ∫_0^t (D_1 + L)v_j(s, X(s)) ds ) d(P_{n_ℓ} − P_{n_k})
        + ∫ v_j(0, X(0)) d(P_{n_ℓ} − P_{n_k})
        − ∫ ( ∫_{t_0}^t (D_1 + L)v_j(s, X(s)) ds ) d(P_{n_ℓ} − P_{n_k}),   (2.112)

where t is any number in Q ∩ [0,T]. From (2.112) we infer

    | ∫ v_j(t_0, X(t_0)) d(P_{n_ℓ} − P_{n_k}) |
      ≤ | ∫ ( ∫_0^t (D_1 + L)v_j(s, X(s)) ds ) d(P_{n_ℓ} − P_{n_k}) |
        + | ∫ v_j(0, X(0)) d(P_{n_ℓ} − P_{n_k}) | + 2|t − t_0| ‖(D_1 + L)v_j‖_∞.   (2.113)

If we let ℓ and k tend to infinity, we obtain

    limsup_{ℓ,k→∞} | ∫ v_j(t_0, X(t_0)) d(P_{n_ℓ} − P_{n_k}) | ≤ 2|t − t_0| ‖(D_1 + L)v_j‖_∞.   (2.114)
Consequently for every s ∈ [0,T] the limit lim_{ℓ→∞} ∫ v_j(s, X(s)) dP_{n_ℓ}
exists. The inequality

    | ∫ v_j(t, X(t)) dP_{n_ℓ} − ∫ v_j(t_0, X(t_0)) dP_{n_ℓ} |
      = | ∫ ∫_{t_0}^t (D_1 + L)v_j(s, X(s)) ds dP_{n_ℓ} |
      ≤ |t − t_0| ‖(D_1 + L)v_j‖_∞

shows that the functions t ↦ lim_{ℓ→∞} ∫ v_j(t, X(t)) dP_{n_ℓ}, j ∈ N, are con-
tinuous. Since the linear span of (v_j : j ≥ 2) is dense in Cb([0,T] × E) for the
strict topology, it follows that for every v ∈ Cb([0,T] × E) and for every
t ∈ [0,T] the limit

    t ↦ lim_{ℓ→∞} ∫ v(t, X(t)) dP_{n_ℓ},  t ∈ [0,T],                       (2.115)

exists and that this limit, as a function of t, is continuous. The following step
consists in proving that for every t_0 ∈ [0,∞) the equality

    lim_{t→t_0} limsup_{ℓ→∞} ∫ |v_j(t, X(t)) − v_j(t_0, X(t_0))| dP_{n_ℓ} = 0   (2.116)

holds. For t > s the following (in)equalities are valid:

    ( ∫ |v_j(t, X(t)) − v_j(s, X(s))| dP_{n_ℓ} )²
      ≤ ∫ |v_j(t, X(t)) − v_j(s, X(s))|² dP_{n_ℓ}
      = ∫ |v_j(t, X(t))|² dP_{n_ℓ} − ∫ |v_j(s, X(s))|² dP_{n_ℓ}
        − 2 Re ∫ ( v_j(t, X(t)) − v_j(s, X(s)) ) v̄_j(s, X(s)) dP_{n_ℓ}
      = ∫ |v_j(t, X(t))|² dP_{n_ℓ} − ∫ |v_j(s, X(s))|² dP_{n_ℓ}
        − 2 Re ∫ ( ∫_s^t (D_1 + L)v_j(σ, X(σ)) dσ ) v̄_j(s, X(s)) dP_{n_ℓ}
      ≤ ∫ |v_j(t, X(t))|² dP_{n_ℓ} − ∫ |v_j(s, X(s))|² dP_{n_ℓ}
        + 2(t − s) ‖(D_1 + L)v_j‖_∞.                                       (2.117)

Hence (2.115) together with (2.117) implies (2.116). By (2.116), we may ap-
ply Kolmogorov's extension theorem to prove that there exists a probability
measure P on Ω′ := (E^Δ)^{[0,T]} with the property that

    ∫ ∏_{k=1}^m v_{j_k}(s_{j_k}, X(s_{j_k})) dP = lim_{n→∞} ∫ ∏_{k=1}^m v_{j_k}(s_{j_k}, X(s_{j_k})) dP_n   (2.118)

holds for all m ∈ N and for all (s_{j_1}, …, s_{j_m}) ∈ [0,T]^m. It then follows
that the equality in (2.118) is also valid for all m-tuples f_1, …, f_m in
Cb([0,T] × E^Δ) instead of v_{j_1}, …, v_{j_m}. This is true because the linear
span of the sequence (v_j)_{j∈N} is Tβ-dense in Cb([0,T] × E^Δ). In addition
we conclude that the processes

    f(t, X(t)) − f(0, X(0)) − ∫_0^t (D_1 + L)f(s, X(s)) ds,

t ∈ [0,T], f ∈ D^{(1)}(L), are P-martingales. We still have to show that
D([0,T], E^Δ) has P-measure 1. From (2.116) it essentially follows that the set
of ω ∈ (E^Δ)^{[0,T]} for which the left and right hand limits exist in E^Δ has
"full" P-measure. First let f ≥ 0 be in Cb([0,T] × E). Then the process

    [G_λ f](t) := E[ ∫_t^∞ e^{−λσ} f(σ∧T, X(σ∧T)) dσ | F^0_t ]

is a P-supermartingale with respect to the filtration (F^0_t)_{t∈[0,T]}. It follows
that the limits lim_{t↑t_0} [G_λ f](t) and lim_{t↓t_0} [G_λ f](t) both exist P-almost
surely for all t_0 ≥ 0 and for all f ∈ Cb([0,T] × E). In particular these limits
exist P-almost surely for all f ∈ D^{(1)}(L). By the martingale property it
follows that, for f ∈ D^{(1)}(L),

    | f(t, X(t)) − λ e^{λt} [G_λ f](t) |
      = | λ e^{λt} E[ ∫_t^∞ e^{−λσ} ( f(σ∧T, X(σ∧T)) − f(t, X(t)) ) dσ | F^0_t ] |
      = | λ e^{λt} E[ ∫_t^∞ e^{−λσ} ( ∫_t^σ (D_1 + L)f(s, X(s)) ds ) dσ | F^0_t ] |
      ≤ λ e^{λt} ∫_t^∞ e^{−λσ} (σ − t) ‖(D_1 + L)f‖_∞ dσ = λ^{−1} ‖(D_1 + L)f‖_∞.   (2.119)
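The last step in (2.119) is the elementary integral, worked out here for convenience (substitute u = σ − t):

```latex
\lambda e^{\lambda t}\int_t^\infty e^{-\lambda\sigma}(\sigma-t)\,d\sigma
  = \lambda\int_0^\infty e^{-\lambda u}\,u\,du
  = \lambda\cdot\frac{1}{\lambda^{2}}
  = \frac{1}{\lambda}.
```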
Consequently, we may conclude that, for all s, t ≥ 0,

    | f(t, X(t)) − f(s, X(s)) |
      ≤ 2λ^{−1} ‖(D_1 + L)f‖_∞ + | λ e^{λt} [G_λ f](t) − λ e^{λs} [G_λ f](s) |.   (2.120)

Again using (2.108), (2.109) and (2.110) it follows that the path

    {X(s) : s ∈ Q ∩ [0,t]}

is P-almost surely totally bounded on the event {X(t) ∈ E}. By separability
and Tβ-density of D^{(1)}(L) it follows that the limits lim_{t↓s} X(t) and
lim_{s↑t} X(s) exist in E P-almost surely for all s, respectively t ∈ [0,T], for
which X(s), respectively X(t), belongs to E. See the arguments which led to
(2.12) and (2.13) in the proof of (a) of Theorem 1.39. Put
Z(s)(ω) = lim_{t↓s, t∈Q∩[0,T]} X(t)(ω). Then, for P-almost all ω, the mapping
s ↦ Z(s)(ω) is well-defined and right-continuous, and it possesses left limits
at those t ∈ [0,T] for which ω(t) ∈ E. In addition we have

    E[ f(s, Z(s)) g(s, X(s)) ] = E[ f(s, X(s+)) g(s, X(s)) ]
      = lim_{t↓s} E[ f(t, X(t)) g(s, X(s)) ] = E[ f(s, X(s)) g(s, X(s)) ],

for all f, g ∈ Cb([0,T] × E) and for all s ∈ [0,T]: see (2.116). But then we
may conclude that X(s) = Z(s) P-almost surely for all s ∈ [0,T]. Hence we
may replace X with Z, and consequently (see the arguments in the proof of
(a) of Theorem 1.39, and see Theorem 9.4 in Blumenthal and Getoor [34],
p. 49)

    P[Ω] = 1, and so P ∈ P′(Ω) = P_{00}(Ω),                                (2.121)

where Ω = D([0,T], E^Δ). For the definition of D([0,T], E^Δ) see Defini-
tion 1.32, and for the definitions of P′(Ω) and P_{00}(Ω) the reader is referred
to (2.102) and (2.101).
We also have to prove the separability. Denote by Convex the collection of all
mappings

    α : P_f(N) × P_f(Q ∩ [0,T]) → Q ∩ [0,1]

which take only finitely many non-zero values and are such that

    Σ_{Λ_0 ∈ P_f(N)} α(Λ_0, Λ) = 1,  Λ ∈ P_f(Q ∩ [0,T]),

and let {w_{Λ_0} : Λ_0 ∈ P_f(N)} be a countable family of functions from
Q ∩ [0,T] to E^Δ such that for every finite subset Λ = {s_{j_1}, …, s_{j_n}} ∈
P_f(Q ∩ [0,T]) the collection

    {(w_{Λ_0}(s_{j_1}), …, w_{Λ_0}(s_{j_n})) : Λ_0 ∈ P_f(N)}

is dense in (E^Δ)^{(s_{j_1},…,s_{j_n})} = (E^Δ)^Λ. For example the value of
w_{Λ_0}(s_{j_ℓ}) could be x_{k_ℓ}, 1 ≤ ℓ ≤ n, where Λ_0 = (k_1, …, k_n). Here
(x_k)_{k∈N} is a dense sequence in E^Δ. The countable collection of probability
measures

    {P_{α,w,Λ} : α ∈ Convex, Λ ∈ P_f(N)}

determined by

    E_{α,w,Λ}[ F( (s, X(s))_{s∈Λ} ) ] = Σ_{Λ_0 ∈ P_f(N)} α(Λ_0, Λ) F( (s, w_{Λ_0}(s))_{s∈Λ} )

is dense in P(Ω) endowed with the metric d_L. Since P′(Ω) is a closed subspace
of P(Ω), it is separable as well.
Finally we observe that X(t) ∈ E, τ < s < t, implies X(s) ∈ E. This
follows from the assumption that the Skorohod space D([0,T], E^Δ) is the
sample space on which we consider the martingale problem: see Definition
1.32. In particular it is assumed that X(s) = Δ, τ < s ≤ t, implies X(t) = Δ,
and L(ρ)f(ρ,·)(X(ρ)) = 0 for s < ρ < t. Consequently, once we have X(s) =
Δ and t ∈ (s,T], then X(t) = Δ, and by contraposition X(t) ∈ E, s ∈ [τ,t),
implies X(s) ∈ E.

Proposition 2.9. Suppose that for every (τ,x) ∈ [0,T] × E the martingale
problem is uniquely solvable. In addition, suppose that there exists λ > 0
such that the operator D_1 + L is sequentially λ-dominant: see Definition
3.6. Define the map F : P′′(Ω) → [0,T] × E by F(P) = (τ,x), where
P ∈ P′′(Ω) is such that P[X(s) = x] = 1 for s ∈ [0,τ]. Then F is a
homeomorphism from the Polish space P′′(Ω) onto [0,T] × E. In fact it fol-
lows that for every u ∈ Cb([0,T] × E) and for every s ∈ [τ,T], the function
(τ,s,x) ↦ E_{τ,x}[u(s, X(s))], 0 ≤ τ ≤ s ≤ T, x ∈ E, is continuous.

Here P′′(Ω) := {P_{τ,x} : (τ,x) ∈ [0,T] × E}.


Proof. Since the martingale problem is uniquely solvable for every (τ,x) ∈
[0,T] × E, the map F is a one-to-one map from the Polish space (P′′(Ω), d_L)
onto [0,T] × E (see Proposition 2.8 and (2.104)). Let, for (τ,x) ∈ [0,T] × E,
the probability P_{τ,x} be the unique solution to the martingale problem:
(i) For every f ∈ D^{(1)}(L) the process

        f(t, X(t)) − f(τ, X(τ)) − ∫_τ^t (D_1 + L)f(s, X(s)) ds,  t ∈ [τ,T],

    is a P_{τ,µ}-martingale;
(ii) The P_{τ,µ}-distribution of X(τ) is the measure µ. If µ = δ_x, then we write
    P_{τ,δ_x} = P_{τ,x}, and P_{τ,x}[X(τ) = x] = 1.
Then, by definition, F(P_{τ,x}) = (τ,x), (τ,x) ∈ [0,T] × E. Moreover, since for
every (τ,x) ∈ [0,T] × E the martingale problem is uniquely solvable, we see
that P′(Ω) = {P_{τ,µ} : (τ,µ) ∈ [0,T] × P(E)}. Here P(E) is the collection of
Borel probability measures on E. This equality of probability spaces can be
seen as follows. If the measure P_{τ,µ} is a solution to the martingale problem,
then it

is automatically a member of P′(Ω). If P is a member of P′(Ω) which starts
at time τ, then by uniqueness of solutions we have:

    P[ A | σ(X(τ)) ] |_{X(τ)=x} = P_{τ,x}[A],  A ∈ F^τ_T.                  (2.122)

In addition, P = P_{τ,µ}, where µ(B) = P[X(τ) ∈ B], B ∈ E. Let ((t_ℓ, x_ℓ))_{ℓ∈N}
be a sequence in [0,T] × E with the property that
lim_{ℓ→∞} d_L(P_{t_ℓ,x_ℓ}, P_{τ,x}) = 0 for some (τ,x) ∈ [0,T] × E. Then for
some stochastic variable ε the orbit {(s, X(s)) : s ∈ (τ−ε, τ+ε)} is totally
bounded P_{t_ℓ,x_ℓ}-almost surely for all t_ℓ and τ simultaneously. It follows that
the sequence {x_ℓ = X(t_ℓ) : ℓ ∈ N} ∪ {x} is contained in a compact subset of E.
Then lim_{ℓ→∞} |v_j(t_ℓ, x_ℓ) − v_j(τ, x)| = 0 for all j ∈ N, where, as above, the
span of the sequence (v_j)_{j≥2} is Tβ-dense in Cb([0,T] × E). It follows that
lim_{ℓ→∞} (t_ℓ, x_ℓ) = (τ, x) in [0,T] × E. Consequently the mapping F is
continuous. Since F is a continuous bijective map from one Polish space

    P′′(Ω) := {P_{τ,x} : (τ,x) ∈ [0,T] × E}                                (2.123)

onto another such space [0,T] × E, its inverse is continuous as well. Among
other things this implies that, for every s ∈ Q ∩ [0,∞) and for every j ≥ 2,
the function (τ,x) ↦ ∫ v_j(s, X(s)) dP_{τ,x} belongs to Cb([0,T] × E). Since
the linear span of the sequence (v_j : j ≥ 2) is Tβ-dense in Cb([0,T] × E),
it also follows that for every v ∈ Cb([0,T] × E) the function
(τ,x) ↦ ∫ v(s, X(s)) dP_{τ,x} belongs to Cb([0,T] × E). Next let s_0 ∈ [0,T] be
arbitrary. For every j ≥ 2 and every s ∈ Q ∩ [0,T], s > s_0, we have by the
martingale property:

    sup_{(τ,x)∈[0,s_0]×E} | E_{τ,x}[ v_j(s, X(s)) ] − E_{τ,x}[ v_j(s_0, X(s_0)) ] |
      = sup_{(τ,x)∈[0,s_0]×E} | ∫_{s_0}^s E_{τ,x}[ (D_1 + L)v_j(σ, X(σ)) ] dσ |
      ≤ (s − s_0) ‖(D_1 + L)v_j‖_∞.                                        (2.124)

Consequently, for every s ∈ [0,T], the function (τ,x) ↦ E_{τ,x}[v_j(s, X(s))],
j ≥ 1, belongs to Cb([0,T] × E). It follows that, for every v ∈ Cb([0,T] × E)
and every s ∈ [0,T], the function (τ,x) ↦ E_{τ,x}[v(s, X(s))] belongs to
Cb([0,T] × E). These arguments also show that the function (τ,s,x) ↦
E_{τ,x}[v(s, X(s))], 0 ≤ τ ≤ s ≤ T, x ∈ E, is continuous for every v ∈
Cb([0,T] × E). The continuity in the three variables (τ,s,x) requires the
sequential λ-dominance of the operator D_1 + L for some λ > 0. The arguments
run as follows. Using the Markov process

    {(Ω, F^τ_T, P_{τ,x}), (X(t) : τ ≤ t ≤ T), (∨_t : τ ≤ t ≤ T), (E, E)}   (2.125)

we define the semigroup {S(ρ) : ρ ≥ 0} as follows:

    S(ρ)f(τ,x) = P(τ, (ρ+τ)∧T) f((ρ+τ)∧T, ·)(x)
               = E_{τ,x}[ f((ρ+τ)∧T, X((ρ+τ)∧T)) ].                        (2.126)

Here (τ,x) ∈ [0,T] × E, ρ ≥ 0, and f ∈ Cb([0,T] × E). Let λ > 0 and
f ∈ Cb([0,T] × E). We want to establish a relationship between the semigroup
{S(ρ) : ρ ≥ 0} and the operator D_1 + L. Therefore we first prove that the
process

    t ↦ e^{−λt} f(t∧T, X(t∧T)) − e^{−λτ} f(τ, X(τ))
        + ∫_τ^t e^{−λρ} (λI − D_1 − L) f(ρ∧T, X(ρ∧T)) dρ,  t ≥ τ,          (2.127)

is a P_{τ,x}-martingale with respect to the filtration (F^τ_t)_{t∈[τ,T]}. Let
τ ≤ s < t ≤ T, and y ∈ E. Then integration by parts shows:
    e^{−λt} f(t, X(t)) − e^{−λs} f(s, X(s)) + ∫_s^t e^{−λρ} (λI − D_1 − L) f(ρ, X(ρ)) dρ
      = e^{−λt} f(t, X(t)) − e^{−λs} f(s, X(s)) + λ ∫_s^t e^{−λρ} f(ρ, X(ρ)) dρ   (2.128)
        − e^{−λt} ∫_s^t (D_1 + L) f(ρ, X(ρ)) dρ − λ ∫_s^t e^{−λρ} ( f(ρ, X(ρ)) − f(s, X(s)) ) dρ.
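The right-hand side of (2.128) regroups into a discounted martingale increment (this regrouping is added here as a reading aid; it is pure algebra, using ∫_s^t λe^{−λρ} dρ = e^{−λs} − e^{−λt}):

```latex
e^{-\lambda t} f(t,X(t)) - e^{-\lambda s} f(s,X(s))
  + \lambda\int_s^t e^{-\lambda\rho} f(\rho,X(\rho))\,d\rho
  - e^{-\lambda t}\int_s^t (D_1+L)f(\rho,X(\rho))\,d\rho
  - \lambda\int_s^t e^{-\lambda\rho}\bigl(f(\rho,X(\rho)) - f(s,X(s))\bigr)\,d\rho
\;=\; e^{-\lambda t}\Bigl( f(t,X(t)) - f(s,X(s)) - \int_s^t (D_1+L)f(\rho,X(\rho))\,d\rho \Bigr).
```

The E_{s,y}-expectation of the bracket on the right vanishes by the martingale property of the process in (1.90), which is exactly how (2.128) is used in the next step.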

Then by the martingale property the P_{s,y}-expectation of the expression in
(2.128) is zero. Employing the Markov property we obtain

    E_{τ,x}[ e^{−λt} f(t, X(t)) − e^{−λτ} f(τ, X(τ))
             + ∫_τ^t e^{−λρ} (λI − D_1 − L) f(ρ, X(ρ)) dρ | F^τ_s ]
      − ( e^{−λs} f(s, X(s)) − e^{−λτ} f(τ, X(τ))
          + ∫_τ^s e^{−λρ} (λI − D_1 − L) f(ρ, X(ρ)) dρ )
    = E_{τ,x}[ e^{−λt} f(t, X(t)) − e^{−λs} f(s, X(s))
               + ∫_s^t e^{−λρ} (λI − D_1 − L) f(ρ, X(ρ)) dρ | F^τ_s ]

    (Markov property)

    = E_{s,X(s)}[ e^{−λt} f(t, X(t)) − e^{−λs} f(s, X(s))
                  + ∫_s^t e^{−λρ} (λI − D_1 − L) f(ρ, X(ρ)) dρ ] = 0,      (2.129)

where in the final step in (2.129) we used the fact that the P_{s,y}-expectation,
y ∈ E, of the expression in (2.128) vanishes. Consequently, the process in
(2.127) is a P_{τ,x}-martingale. From the fact that the process in (2.127) is a
P_{τ,x}-martingale we infer, by taking expectations, that for t ≥ 0

    e^{−λ(t+τ)} E_{τ,x}[ f((t+τ)∧T, X((t+τ)∧T)) ] − e^{−λτ} E_{τ,x}[ f(τ, X(τ)) ]
      + ∫_τ^{t+τ} e^{−λρ} E_{τ,x}[ (λI − D_1 − L) f(ρ∧T, X(ρ∧T)) ] dρ = 0.   (2.130)

The equality in (2.130) is equivalent to

    E_{τ,x}[ f(τ, X(τ)) ] − e^{−λt} E_{τ,x}[ f((t+τ)∧T, X((t+τ)∧T)) ]
      = ∫_τ^{t+τ} e^{−λ(ρ−τ)} E_{τ,x}[ (λI − D_1 − L) f(ρ∧T, X(ρ∧T)) ] dρ.   (2.131)

In terms of the semigroup {S(ρ) : ρ ≥ 0} the equality in (2.131) can be rewrit-
ten as follows:

    f(τ,x) − e^{−λt} S(t)f(τ,x) = ∫_0^t e^{−λρ} S(ρ)(λI − D_1 − L)f(τ,x) dρ.   (2.132)

By letting t → ∞ in (2.132) we see

    f(τ,x) = ∫_0^∞ e^{−λρ} S(ρ)(λI − D_1 − L)f(τ,x) dρ
           = R(λ)(λI − D_1 − L)f(τ,x),                                     (2.133)
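In finite dimensions (again a toy stand-in with an invented generator matrix G), the Laplace-transform definition of R(λ) behind (2.133) can be checked directly: numerically integrating e^{−λρ} S(ρ) = e^{−λρ} e^{ρG} reproduces (λI − G)^{-1}, so that R(λ)(λI − G)f = f.

```python
import numpy as np

def expm(A, terms=60):
    # Truncated Taylor series for e^A (adequate for the tiny steps used below).
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Invented 3-state generator; S(rho) = e^{rho G} is its transition semigroup.
G = np.array([[-1.0,  0.5,  0.5],
              [ 1.0, -2.0,  1.0],
              [ 0.0,  0.5, -0.5]])
lam, drho, n = 2.0, 0.001, 20000      # integrate up to rho = 20 (negligible tail)

S = np.eye(3)
step = expm(drho * G)
Rlam = np.zeros((3, 3))
rho = 0.0
for _ in range(n):                    # left Riemann sum of int e^{-lam rho} S(rho) drho
    Rlam += np.exp(-lam * rho) * S * drho
    S = S @ step
    rho += drho

f = np.array([1.0, 0.0, -1.0])
# (2.133) in this setting: f = R(lam) (lam I - G) f.
assert np.allclose(Rlam @ (lam * f - G @ f), f, atol=2e-2)
```

The tolerance absorbs the left-endpoint quadrature error; the point is that the Laplace transform of the semigroup inverts λI − G, which is the finite-dimensional shadow of (2.133).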

where the definition of $R(\lambda)$, $\lambda > 0$, is self-explanatory. Define the operator $L^{(1)} : D\bigl(L^{(1)}\bigr) = R(\lambda) C_b([0,T]\times E) \to C_b([0,T]\times E)$ by $L^{(1)} R(\lambda) f = \lambda R(\lambda) f - f$, $f \in C_b([0,T]\times E)$. Then by definition we see $\bigl(\lambda I - L^{(1)}\bigr) R(\lambda) f = f$, and thus $R(\lambda)\bigl(\lambda I - L^{(1)}\bigr) R(\lambda) f = R(\lambda) f$, $f \in C_b([0,T]\times E)$. Put $g = \bigl(\lambda I - L^{(1)}\bigr) R(\lambda) f - f$. Then by the resolvent identity we see that $R(\alpha) g = 0$ for all $\alpha > 0$, and hence $S(\rho) g(\tau, x) = E_{\tau,x}\bigl[g\bigl((\rho+\tau)\wedge T,\, X((\rho+\tau)\wedge T)\bigr)\bigr] = 0$ for all $\rho > 0$. By the right-continuity of the process $\rho \mapsto X(\rho)$ we see that $g = 0$. Consequently, $\bigl(\lambda I - L^{(1)}\bigr) R(\lambda) f - f = 0$, $f \in C_b([0,T]\times E)$. If $f \in D^{(1)}(L)$, then (2.133) reads $f = R(\lambda)(\lambda I - D_1 - L) f$, and hence $f \in D\bigl(L^{(1)}\bigr)$ and $\bigl(\lambda I - L^{(1)}\bigr) f = (\lambda I - D_1 - L) f$, or, what amounts to the same, $f \in D\bigl(L^{(1)}\bigr)$ and $L^{(1)} f = D_1 f + L f$. In other words the operator $L^{(1)}$ extends $D_1 + L$. As in (1.41) define the sub-additive mapping $U_\lambda^1 : C_b([0,T]\times E, \mathbb R) \to L^\infty([0,T]\times E, \mathbb R)$ by
\[
U_\lambda^1 f = \sup_{K\in\mathcal K(E)}\ \inf_{g\in D^{(1)}(L)}\bigl\{ g \ge f\mathbf 1_K : \lambda g - D_1 g - L g \ge 0 \bigr\}. \tag{2.134}
\]

Since $L^{(1)}$ extends $D_1 + L$, from (2.134) we get
\[
U_\lambda^1 f \ge \sup_{K\in\mathcal K(E)}\ \inf_{g\in D(L^{(1)})}\bigl\{ g \ge f\mathbf 1_K : \lambda g - L^{(1)} g \ge 0 \bigr\}. \tag{2.135}
\]

Then, as explained in Proposition 1.22, formula (1.48), we have
\[
\sup\bigl\{ (\mu R(\lambda+\mu))^k f ;\ \mu > 0,\ k \in \mathbb N \bigr\} \le U_\lambda^1(f), \qquad f \in C_b([0,T]\times E, \mathbb R). \tag{2.136}
\]
As is indicated in the proof of (iii) $\Rightarrow$ (i) of Theorem 3.10 the following equality also holds:
\[
\sup\bigl\{ (\mu R(\lambda+\mu))^k f ;\ \mu > 0,\ k \in \mathbb N \bigr\} = \sup\bigl\{ e^{-\lambda\rho} S(\rho) f : \rho \ge 0 \bigr\}, \tag{2.137}
\]
where $f \in C_b([0,T]\times E, \mathbb R)$. For this observation the reader is referred to the formulas (3.20), (3.21) and (3.22). Next let $(f_n)_{n\in\mathbb N} \subset C_b([0,T]\times E)$ be a sequence which decreases pointwise to zero. Using the sequential $\lambda$-dominance of the operator $D_1 + L$, the equality in (2.137), and the inequality in (2.136), we see that $\sup_{\rho\ge 0} e^{-\lambda\rho} S(\rho) f_n(\tau, x)$ decreases to zero uniformly on compact subsets of $[0,T]\times E$: see Definition 3.6. From Proposition 1.21 it follows that the semigroup $\{e^{-\lambda\rho} S(\rho) : \rho \ge 0\}$ is $T_\beta$-equi-continuous. In addition, by the arguments above, every operator $S(\rho)$, $\rho \ge 0$, assigns to a function $f \in D^{(1)}(L) = D(D_1) \cap D(L)$ a function $S(\rho) f \in C_b([0,T]\times E)$. By the $T_\beta$-continuity of $S(\rho)$, and by the fact that $D^{(1)}(L)$ is $T_\beta$-dense in $C_b([0,T]\times E)$, the mapping $S(\rho)$ extends to a $T_\beta$-continuous linear operator from $C_b([0,T]\times E)$ to itself. This extension is again denoted by $S(\rho)$. In addition, for $v \in D^{(1)}(L)$, the function $(\tau, \rho, x) \mapsto S(\rho) v(\tau, x)$ is continuous on $[0,T]\times[0,\infty)\times E$; see (2.124). Fix $f \in C_b([0,T]\times E)$. Using the sequential $\lambda$-dominance and its consequence, the $T_\beta$-equi-continuity of the semigroup $\{e^{-\lambda\rho} S(\rho) : \rho \ge 0\}$, we see that the function $(\tau, s, x) \mapsto S(s) f(\tau, x)$ is continuous on $[0,T]\times[0,\infty)\times E$, and hence the same is true for the function $(\tau, s, x) \mapsto E_{\tau,x}[f(s, X(s))]$. Here we again used the $T_\beta$-density of $D^{(1)}(L)$ in $C_b([0,T]\times E)$.
This completes the proof of Proposition 2.9.
Notice that in the proof of the implication (iii) $\Rightarrow$ (i) of Theorem 3.10, arguments very similar to the ones in the final part of the proof of Proposition 2.9 will be employed.
Corollary 2.10. Suppose that the martingale problem is well posed for the
operator D1 + L, and that the operator D1 + L is sequentially λ-dominant for
some λ > 0. Let {(Ω, FTτ , Pτ,x ) : (τ, x) ∈ [0, T ] × E} be the solutions to the
martingale problem. Let the process in (2.125) be the corresponding Markov
process, and let the semigroup {S(ρ) : ρ ≥ 0}, as defined in (2.126), be the
corresponding Feller semigroup. Then this semigroup is Tβ -equi-continuous,
and its generator extends D1 + L.
Proof. From Proposition 1.21 it follows that for some $\lambda > 0$ the semigroup $\{e^{-\lambda\rho} S(\rho) : \rho \ge 0\}$ is $T_\beta$-equi-continuous: see the proof of Proposition 2.9.
Since $S(\rho) = S(T)$ for $\rho \ge T$, we see that the semigroup $\{S(\rho) : \rho \ge 0\}$ itself is $T_\beta$-equi-continuous. Moreover, it is a Feller semigroup in the sense that it consists of $T_\beta$-continuous linear operators and $T_\beta\text{-}\lim_{t\to s} S(t) f = S(s) f$, $f \in C_b([0,T]\times E)$. From the proof of Proposition 2.9 it follows that the generator of the semigroup $\{S(\rho) : \rho \ge 0\}$ extends $D_1 + L$.
This proves Corollary 2.10.
The proof of the following proposition may be copied from Ikeda and Watanabe [109], Theorem 5.1, p. 205.
Proposition 2.11. Suppose that for every $(\tau, x) \in [0,T]\times E$ the martingale problem, posed on the Skorohod space $D\bigl([0,T], E^{\triangle}\bigr)$ as follows:

(i) for every $f \in D^{(1)}(L)$ the process
\[
f(t, X(t)) - f(\tau, X(\tau)) - \int_\tau^t (D_1 + L) f(s, X(s))\,ds, \qquad t \in [\tau, T],
\]
is a $P$-martingale;
(ii) $P(X(\tau) = x) = 1$,

has a unique solution $P = P_{\tau,x}$. Then the process
\[
\bigl\{ (\Omega, \mathcal F^\tau_T, P_{\tau,x}),\ (X(t), \tau \le t \le T),\ (\vartheta_t : \tau \le t \le T),\ (E, \mathcal E) \bigr\} \tag{2.138}
\]
is a strong Markov process with respect to the right-continuous filtration $\bigl(\mathcal F^\tau_{t+}\bigr)_{t\in[\tau,T]}$.
For the definition of $\mathcal F^\tau_{S+}$ the reader is referred to (1.91) in Remark 1.40.
Proof. Fix $(\tau, x) \in [0,T]\times E$, let $S$ be a stopping time, and choose a realization $A \mapsto E_{\tau,x}\bigl[\mathbf 1_A \circ \vartheta_S \,\big|\, \mathcal F^\tau_{S+}\bigr]$, $A \in \mathcal F^\tau_T$. Fix any $\omega \in \Omega$ for which
\[
A \mapsto Q_{s,y}[A] := E_{\tau,x}\bigl[\mathbf 1_A \circ \vartheta_S \,\big|\, \mathcal F^\tau_{S+}\bigr](\omega)
\]
is defined for all $A \in \mathcal F^\tau_T$. Here, by definition, $(s, y) = (S(\omega), \omega(S(\omega)))$. Notice that this construction can be performed for $P_{\tau,x}$-almost all $\omega$. Let $f$ be in $D^{(1)}(L) = D(D_1) \cap D(L)$ and fix $T \ge t_2 > t_1 \ge 0$. Moreover, fix $C \in \mathcal F^\tau_{t_1}$. Then $\vartheta_S^{-1}(C)$ is a member of $\mathcal F^\tau_{t_1\vee S+}$. Put
\[
M_f(t) = f(t, X(t)) - f(\tau, X(\tau)) - \int_\tau^t (D_1 + L) f(s, X(s))\,ds, \qquad t \in [\tau, T].
\]
We have
\[
E_{s,y}\bigl[M_f(t_2)\mathbf 1_C\bigr] = E_{s,y}\bigl[M_f(t_1)\mathbf 1_C\bigr]. \tag{2.139}
\]

We also have
\[
\int \Bigl( f(t_2, X(t_2)) - f(\tau, X(\tau)) - \int_\tau^{t_2} (D_1+L) f(s, X(s))\,ds \Bigr) \mathbf 1_C\,dQ_{s,y} \tag{2.140}
\]
\[
\begin{aligned}
&= E_{\tau,x}\Bigl[ \Bigl( f(t_2\vee S, X(t_2\vee S)) - f(S, X(S)) - \int_\tau^{t_2} (D_1+L) f(s\vee S, X(s\vee S))\,ds \Bigr)(\mathbf 1_C \circ \vartheta_S) \,\Big|\, \mathcal F^\tau_{S+} \Bigr](\omega) \\
&= E_{\tau,x}\Bigl[ \Bigl( f(t_2\vee S, X(t_2\vee S)) - f(S, X(S)) - \int_S^{t_2\vee S} (D_1+L) f(s, X(s))\,ds \Bigr)(\mathbf 1_C \circ \vartheta_S) \,\Big|\, \mathcal F^\tau_{S+} \Bigr](\omega) \\
&= E_{\tau,x}\Bigl[ E_{\tau,x}\Bigl[ f(t_2\vee S, X(t_2\vee S)) - f(S, X(S)) - \int_S^{t_2\vee S} (D_1+L) f(s, X(s))\,ds \,\Big|\, \mathcal F^\tau_{t_1\vee S+} \Bigr]\, \mathbf 1_C \circ \vartheta_S \,\Big|\, \mathcal F^\tau_{S+} \Bigr](\omega). \tag{2.141}
\end{aligned}
\]

By Doob's optional sampling theorem, and right-continuity of paths, the process
\[
f(t\vee S, X(t\vee S)) - f(S, X(S)) - \int_S^{t\vee S} (D_1+L) f(s, X(s))\,ds
\]
is a $P_{\tau,x}$-martingale with respect to the filtration consisting of the $\sigma$-fields $\mathcal F^\tau_{t\vee S+}$, $t \in [\tau, T]$. So from (2.140) we obtain:
\[
\begin{aligned}
&\int \Bigl( f(t_2, X(t_2)) - f(\tau, X(\tau)) - \int_\tau^{t_2} (D_1+L) f(s, X(s))\,ds \Bigr) \mathbf 1_C\,dQ_{s,y} \\
&= E_{\tau,x}\Bigl[ \Bigl( f(t_1\vee S, X(t_1\vee S)) - f(S, X(S)) - \int_S^{t_1\vee S} (D_1+L) f(s, X(s))\,ds \Bigr)(\mathbf 1_C \circ \vartheta_S) \,\Big|\, \mathcal F^\tau_{S+} \Bigr](\omega) \\
&= \int \Bigl( f(t_1, X(t_1)) - f(\tau, X(\tau)) - \int_\tau^{t_1} (D_1+L) f(s, X(s))\,ds \Bigr) \mathbf 1_C\,dQ_{s,y}. \tag{2.142}
\end{aligned}
\]

It follows that, for $f \in D(L)$, the process $M_f(t)$ is a $P_{s,y}$- as well as a $Q_{s,y}$-martingale. Since $P_{s,y}[X(s) = y] = 1$ and since
\[
Q_{s,y}[X(s) = y] = E_{\tau,x}\bigl[\mathbf 1_{\{X(S)=y\}}\circ\vartheta_S \,\big|\, \mathcal F^\tau_{S+}\bigr](\omega)
= E_{\tau,x}\bigl[\mathbf 1_{\{X(S)=y\}} \,\big|\, \mathcal F^\tau_{S+}\bigr](\omega) = \mathbf 1_{\{X(S)=y\}}(\omega) = 1, \tag{2.143}
\]
we conclude that the probabilities $P_{s,y}$ and $Q_{s,y}$ are the same. Equality (2.143) follows because, by definition, $y = X(S)(\omega) = \omega(S(\omega))$. Since $P_{s,y} = Q_{s,y}$, it then follows that
\[
P_{S(\omega), X(S)(\omega)}[A] = E_{\tau,x}\bigl[\mathbf 1_A \circ \vartheta_S \,\big|\, \mathcal F^\tau_{S+}\bigr](\omega), \qquad A \in \mathcal F^\tau_T.
\]
Or, putting it differently:
\[
P_{S, X(S)}\bigl[\mathbf 1_A \circ \vartheta_S\bigr] = E_{\tau,x}\bigl[\mathbf 1_A \circ \vartheta_S \,\big|\, \mathcal F^\tau_{S+}\bigr], \qquad A \in \mathcal F^\tau_T. \tag{2.144}
\]

However this is exactly the strong Markov property.


This concludes the proof of Proposition 2.11.
The following proposition can be proved in the same manner as the Corollary to Theorem 5.1 in Ikeda and Watanabe [109, p. 206].
Proposition 2.12. If an operator family L = {L(s) : 0 ≤ s ≤ T } generates
a Feller evolution {P (s, t) : 0 ≤ s ≤ t ≤ T }, then the martingale problem is
uniquely solvable for L.

Proof. Let $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ be a Feller evolution generated by $L$ and let
\[
\bigl\{ (\Omega, \mathcal F^\tau_T, P_{\tau,x}),\ (X(t), \tau \le t \le T),\ (\vartheta_t : \tau \le t \le T),\ (E, \mathcal E) \bigr\} \tag{2.145}
\]

be the associated strong Markov process (see Theorem 1.39 (a)). If $f$ belongs to $D^{(1)}(L)$, then the process
\[
M_f(t) := f(t, X(t)) - f(\tau, X(\tau)) - \int_\tau^t (D_1 + L) f(s, X(s))\,ds, \qquad t \in [\tau, T],
\]
is a Pτ,x -martingale for all (τ, x) ∈ [0, T ] × E. This can be seen as follows. Fix
T ≥ t2 > t1 ≥ 0. Then
\[
\begin{aligned}
E_{\tau,x}\bigl[M_f(t_2) \,\big|\, \mathcal F^\tau_{t_1}\bigr] - M_f(t_1)
&= E_{\tau,x}\Bigl[ f(t_2, X(t_2)) - \int_{t_1}^{t_2} (D_1+L) f(s, X(s))\,ds \,\Big|\, \mathcal F^\tau_{t_1} \Bigr] - f(t_1, X(t_1)) \\
&\quad\text{(Markov property)} \\
&= E_{t_1, X(t_1)}\Bigl[ f(t_2, X(t_2)) - \int_{t_1}^{t_2} (D_1+L) f(s, X(s))\,ds \Bigr] - f(t_1, X(t_1)) \\
&= E_{t_1, X(t_1)}[f(t_2, X(t_2))] - \int_{t_1}^{t_2} E_{t_1, X(t_1)}\bigl[(D_1+L) f(s, X(s))\bigr]\,ds - f(t_1, X(t_1)) \\
&\quad\text{(see Proposition 3.1 below)} \\
&= E_{t_1, X(t_1)}[f(t_2, X(t_2))] - \int_{t_1}^{t_2} \frac{d}{ds} E_{t_1, X(t_1)}[f(s, X(s))]\,ds - f(t_1, X(t_1)) \\
&= 0. \tag{2.146}
\end{aligned}
\]
Hence from (2.146) it follows that the process $M_f(t)$, $t \ge 0$, is a $P_{\tau,x}$-martingale. Next we shall prove the uniqueness of the solutions of the martingale problem associated to the operator $L$. Let $P^1_{\tau,x}$ and $P^2_{\tau,x}$ be solutions "starting" in $x \in E$ at time $\tau$. We have to show that these probabilities coincide. Let $f$ belong to $D^{(1)}(L)$ and let $S : \Omega \to [\tau, T]$ be an $\bigl(\mathcal F^\tau_{t+}\bigr)_{t\in[\tau,T]}$-stopping time. Then, via partial integration, we infer
\[
\begin{aligned}
&\lambda \int_0^\infty e^{-\lambda t}\Bigl\{ f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) - \int_S^{t+S} (D_1+L) f(\rho\wedge T,\, X(\rho\wedge T))\,d\rho - f(S, X(S)) \Bigr\}\,dt + f(S, X(S)) \\
&= \lambda \int_0^\infty e^{-\lambda t}\Bigl\{ f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) - \int_S^{t+S} (D_1+L) f(\rho\wedge T,\, X(\rho\wedge T))\,d\rho \Bigr\}\,dt \\
&= \lambda \int_0^\infty e^{-\lambda t} f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr)\,dt - \lambda \int_0^\infty e^{-\lambda t} \int_0^t (D_1+L) f\bigl((\rho+S)\wedge T,\, X((\rho+S)\wedge T)\bigr)\,d\rho\,dt \\
&= \lambda \int_0^\infty e^{-\lambda t} f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr)\,dt - \lambda \int_0^\infty \Bigl( \int_\rho^\infty e^{-\lambda t}\,dt \Bigr) (D_1+L) f\bigl((\rho+S)\wedge T,\, X((\rho+S)\wedge T)\bigr)\,d\rho \\
&= \int_0^\infty e^{-\lambda t}\,\bigl[(\lambda I - D_1 - L) f\bigr]\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr)\,dt. \tag{2.147}
\end{aligned}
\]

From Doob's optional sampling theorem together with (2.147) we obtain:
\[
\begin{aligned}
&\int_0^\infty e^{-\lambda t}\, E^1_{\tau,x}\bigl[(\lambda I - D_1 - L) f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr]\,dt \\
&= \lambda \int_0^\infty e^{-\lambda t}\, E^1_{\tau,x}\Bigl[ f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) - \int_S^{t+S} (D_1+L) f(\rho\wedge T,\, X(\rho\wedge T))\,d\rho - f(S, X(S)) \,\Big|\, \mathcal F^\tau_{S+} \Bigr]\,dt + f(S, X(S)) \\
&= f(S, X(S)) \\
&= \lambda \int_0^\infty e^{-\lambda t}\, E^2_{\tau,x}\Bigl[ f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) - \int_S^{t+S} (D_1+L) f(\rho\wedge T,\, X(\rho\wedge T))\,d\rho - f(S, X(S)) \,\Big|\, \mathcal F^\tau_{S+} \Bigr]\,dt + f(S, X(S)) \\
&= \int_0^\infty e^{-\lambda t}\, E^2_{\tau,x}\bigl[(\lambda I - D_1 - L) f\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr]\,dt. \tag{2.148}
\end{aligned}
\]

As in (3.5) below we write:
\[
S(\rho) f(t, x) = P\bigl(t, (\rho+t)\wedge T\bigr) f\bigl((\rho+t)\wedge T, \cdot\bigr)(x), \qquad f \in C_b([0,T]\times E),
\]
$\rho \ge 0$, $(t, x) \in [0,T]\times E$. Then the family $\{S(\rho) : \rho \ge 0\}$ is a $T_\beta$-continuous semigroup. Its resolvent is given by
\[
[R(\lambda) f](\tau, x) = \int_0^\infty e^{-\lambda t}\,\bigl[P\bigl(\tau, (\tau+t)\wedge T\bigr) f\bigl((\tau+t)\wedge T, \cdot\bigr)\bigr](x)\,dt
= \int_0^\infty e^{-\lambda t}\, S(t) f(\tau, x)\,dt, \tag{2.149}
\]

for $x \in E$, $\lambda > 0$, and $f \in C_b([0,T]\times E)$. Let $L^{(1)}$ be its generator. Then, as will be shown in Theorem 3.2 below, $L^{(1)}$ is the $T_\beta$-closure of $D_1 + L$, and
\[
\bigl(\lambda I - L^{(1)}\bigr) R(\lambda) f = f, \quad f \in C_b([0,T]\times E); \qquad
R(\lambda)\bigl(\lambda I - L^{(1)}\bigr) f = f, \quad f \in D\bigl(L^{(1)}\bigr). \tag{2.150}
\]
Since $L^{(1)}$ is the $T_\beta$-closure of $D_1 + L$, the equalities in (2.148) also hold for $L^{(1)}$ instead of $D_1 + L$. Among other things we see that
\[
R\bigl(\lambda I - L^{(1)}\bigr) = C_b([0,T]\times E), \qquad \lambda > 0.
\]

From (2.148), with $L^{(1)}$ instead of $D_1 + L$, (2.149), and (2.150) it then follows that for $g \in C_b([0,T]\times E)$ we have
\[
\begin{aligned}
\int_0^\infty e^{-\lambda t}\, E^1_{\tau,x}\bigl[g\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr]\,dt
&= \int_0^\infty e^{-\lambda t}\,[S(t) g](S, X(S))\,dt \\
&= \int_0^\infty e^{-\lambda t}\, E^2_{\tau,x}\bigl[g\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr]\,dt. \tag{2.151}
\end{aligned}
\]

Since Laplace transforms are unique, $g$ belongs to $C_b([0,T]\times E)$, and paths are right-continuous, we conclude
\[
E^1_{\tau,x}\bigl[g\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr]
= [S(t) g](S, X(S))
= E^2_{\tau,x}\bigl[g\bigl((t+S)\wedge T,\, X((t+S)\wedge T)\bigr) \,\big|\, \mathcal F^\tau_{S+}\bigr], \tag{2.152}
\]
whenever $g$ belongs to $C_b([0,T]\times E)$, $t \in [0,\infty)$, and $S$ is an $\bigl(\mathcal F^\tau_{t+}\bigr)_{t\in[\tau,T]}$-stopping time. The first equality in (2.152) holds $P^1_{\tau,x}$-almost surely and the second $P^2_{\tau,x}$-almost surely. In (2.152) we take for $S$ a fixed time $s \in [\tau, T-t]$ and we substitute $\rho = t+s$. Then we get
\[
E^1_{\tau,x}\bigl[g(\rho, X(\rho)) \,\big|\, \mathcal F^\tau_{s+}\bigr] = [S(\rho-s) g](s, X(s)) = E^2_{\tau,x}\bigl[g(\rho, X(\rho)) \,\big|\, \mathcal F^\tau_{s+}\bigr]. \tag{2.153}
\]
For $s = \tau$ the equalities in (2.153) imply
\[
E^1_{\tau,x}\bigl[g(\rho, X(\rho)) \,\big|\, \mathcal F^\tau_{\tau+}\bigr] = [S(\rho-\tau) g](\tau, X(\tau)) = E^2_{\tau,x}\bigl[g(\rho, X(\rho)) \,\big|\, \mathcal F^\tau_{\tau+}\bigr], \tag{2.154}
\]
and by taking expectations in (2.154) we get
\[
E^1_{\tau,x}\bigl[g(\rho, X(\rho))\bigr] = [S(\rho-\tau) g](\tau, x) = E^2_{\tau,x}\bigl[g(\rho, X(\rho))\bigr], \tag{2.155}
\]

where we used the fact that $X(\tau) = x$ $P^1_{\tau,x}$- and $P^2_{\tau,x}$-almost surely. It follows that the one-dimensional distributions of $P^1_{\tau,x}$ and $P^2_{\tau,x}$ coincide. By induction with respect to $n$ and using (2.153) several times we obtain:
\[
E^1_{\tau,x}\Bigl[\prod_{j=1}^n f_j(t_j, X(t_j))\Bigr] = E^2_{\tau,x}\Bigl[\prod_{j=1}^n f_j(t_j, X(t_j))\Bigr] \tag{2.156}
\]

for $n = 1, 2, \dots$ and for $f_1, \dots, f_n$ in $C_b([0,T]\times E)$. But then the probabilities $P^1_{\tau,x}$ and $P^2_{\tau,x}$ are the same.
This proves Proposition 2.12.

Proposition 2.13. Let L be a densely defined operator for which the mar-
tingale problem is uniquely solvable. Then there exists a unique closed linear
extension L0 of L, which is the generator of a Feller semigroup.

Proof. Existence. Let $\{P_{\tau,x} : (\tau, x) \in [0,T]\times E\}$ be the solution for $L$. Put
\[
[S(t) f](\tau, x) = E_{\tau,x}\bigl[f\bigl((\tau+t)\wedge T,\, X((\tau+t)\wedge T)\bigr)\bigr],
\]
\[
[R(\lambda) f](\tau, x) = \int_0^\infty e^{-\lambda s}\,[S(s) f](\tau, x)\,ds,
\]
\[
L_0(R(\lambda) f) := \lambda R(\lambda) f - f, \qquad f \in C_b([0,T]\times E).
\]

Here $t \in [0,T]$ and $\lambda > 0$ are fixed. Then, as follows from the proof of Theorem 3.2, the operator $L_0$ extends $D_1 + L$ and generates a $T_\beta$-continuous Feller semigroup.
Uniqueness. Let $L_1$ and $L_2$ be closed linear extensions of $L$, which both generate Feller evolutions. Let
\[
\bigl\{ \bigl(\Omega, \mathcal F^\tau_T, P^1_{\tau,x}\bigr),\ (X(t) : t \in [0,T]),\ (\vartheta_t : t \in [0,T]),\ (E, \mathcal E) \bigr\}
\]
respectively
\[
\bigl\{ \bigl(\Omega, \mathcal F^\tau_T, P^2_{\tau,x}\bigr),\ (X(t) : t \in [0,T]),\ (\vartheta_t : t \in [0,T]),\ (E, \mathcal E) \bigr\}
\]
be the corresponding Markov processes. For every $f \in D(L)$, the process
\[
f(t, X(t)) - f(\tau, X(\tau)) - \int_\tau^t (D_1+L) f(s, X(s))\,ds, \qquad t \ge 0,
\]
is a martingale with respect to $P^1_{\tau,x}$ as well as with respect to $P^2_{\tau,x}$. Uniqueness implies $P^1_{\tau,x} = P^2_{\tau,x}$ and hence $L_1 = L_2$.
Proof (Proof of item (d) of Theorem 1.39: conclusion). In this final part of the proof we mainly collect the results which we proved in part (a) of Theorem 1.39 and in the Propositions 2.8, 2.9, 2.10, 2.11, 2.12, and 2.13. The main work we have to do is to organize these matters into a proof of item (d) of Theorem 1.39. More details follow. As in (2.123) let $P^{00}(\Omega) = \{P_{\tau,x} : (\tau, x) \in [0,T]\times E\}$ be the collection of unique solutions to the martingale problem. Then the process
\[
\bigl\{ (\Omega, \mathcal F^\tau_T, P_{\tau,x})_{(\tau,x)\in[0,T]\times E},\ (X(t), t \in [0,T]),\ (\vartheta_t : t \in [0,T]),\ (E, \mathcal E) \bigr\}
\]
is a strong Markov process, and the function $P(\tau, x; t, B)$ defined by
\[
P(\tau, x; t, B) = P_{\tau,x}[X(t) \in B], \qquad 0 \le \tau \le t \le T,\ x \in E,\ B \in \mathcal E,
\]
is a Feller evolution. Here the state variables $X(t) : \Omega \to E^{\triangle}$ are defined by $X(t) = \omega(t)$, $\omega \in \Omega = D\bigl([0,T], E^{\triangle}\bigr)$. The sample path space is supplied with the standard filtration $(\mathcal F^\tau_t)_{\tau \le t \le T}$. The strong Markov property follows from Proposition 2.11. The Feller property is a consequence of Proposition 2.9 (which in turn is based on Proposition 2.8, where completeness and separability of the space $P^{00}(\Omega)$ is heavily used). Its $T_\beta$-continuity and $T_\beta$-equi-continuity is explained in Corollary 2.10 to Proposition 2.11. Define the Feller semigroup $\{S(\rho) : \rho \ge 0\}$ on $C_b([0,T]\times E)$ as in (2.126), and let $L^{(1)}$ be its generator. From Corollary 2.10 we see that $L^{(1)}$ extends the operator $D_1 + L$. Since the martingale problem is uniquely solvable for the operator $L$, it follows that the martingale problem is uniquely solvable for the operator $L^{(1)}$ (but now as a time-homogeneous martingale problem). Therefore, Proposition 2.12 implies that the operator $L^{(1)}$ is the unique extension of $D_1 + L$ which generates a Feller semigroup. It follows that $L^{(1)} - D_1$ is the unique $T_\beta$-extension of $L$ which generates a Feller evolution. This Feller evolution is given by the original solution to the martingale problem: this claim follows from item (a) of Theorem 1.39.
Finally, this completes the proof of item (d) of Theorem 1.39.

2.1.5 Proof of item (e) of Theorem 1.39


In this subsection we will show that under certain conditions, like possessing the Korovkin property, satisfying the maximum principle, and $T_\beta$-equi-continuity, a $T_\beta$-densely defined operator in $C_b(E)$ has a unique extension which generates a (strong) Markov process.
Proof (Proof of item (e) of Theorem 1.39). Let $E_0$ be a subset of $[0,T]\times E$ which is Polish for the relative topology. First suppose that the operator $D_1 + L$ possesses the Korovkin property on $K$. Also suppose that it satisfies the maximum principle on $E_0$. By Proposition 3.17 and its Corollary 3.18 there exists a family of linear operators $\{R(\lambda) : \lambda > 0\}$ such that for all $(\tau_0, x_0) \in E_0$ and $g \in C_b(E_0)$ the following equalities hold:

\[
\begin{aligned}
\lambda R(\lambda) g(\tau_0, x_0)
&= \inf_{h\in D^{(1)}(L)}\ \max_{(\tau,x)\in E_0}\Bigl\{ h(\tau_0, x_0) + \Bigl[ g - \Bigl(I - \frac{1}{\lambda}(D_1+L)\Bigr) h \Bigr](\tau, x) \Bigr\} \\
&= \inf_{h\in D^{(1)}(L)}\Bigl\{ h(\tau_0, x_0) : \Bigl(I - \frac{1}{\lambda}(D_1+L)\Bigr) h \ge g \text{ on } E_0 \Bigr\} \\
&= \sup_{h\in D^{(1)}(L)}\Bigl\{ h(\tau_0, x_0) : \Bigl(I - \frac{1}{\lambda}(D_1+L)\Bigr) h \le g \text{ on } E_0 \Bigr\} \\
&= \sup_{h\in D^{(1)}(L)}\ \min_{(\tau,x)\in E_0}\Bigl\{ h(\tau_0, x_0) + \Bigl[ g - \Bigl(I - \frac{1}{\lambda}(D_1+L)\Bigr) h \Bigr](\tau, x) \Bigr\}. \tag{2.157}
\end{aligned}
\]

As will be shown in Proposition 3.17 the family $\{R(\lambda) : \lambda > 0\}$ has the resolvent property: $R(\lambda) - R(\mu) = (\mu - \lambda) R(\mu) R(\lambda)$, $\lambda > 0$, $\mu > 0$. It also follows that $R(\lambda)(\lambda I - D_1 - L) f = f$ on $E_0$ for $f \in D^{(1)}(L)$. This equality is an easy consequence of the inequalities in (2.157): see Corollary 3.18. Fix $\lambda > 0$ and $f \in C_b([0,T]\times E)$. We will prove that $f = T_\beta\text{-}\lim_{\alpha\to\infty} \alpha R(\alpha) f$. If $f$ is of the form $f = R(\lambda) g$, $g \in C_b(E_0)$, then by the resolvent property we have
\[
\alpha R(\alpha) f - f = \alpha R(\alpha) R(\lambda) g - R(\lambda) g
= \frac{\alpha}{\alpha - \lambda} R(\lambda) g - R(\lambda) g - \frac{\alpha R(\alpha) g}{\alpha - \lambda}. \tag{2.158}
\]
Since $\|\alpha R(\alpha) g\|_\infty \le \|g\|_\infty$, the equality in (2.158) yields
\[
\lim_{\alpha\to\infty} \|\alpha R(\alpha) f - f\|_\infty = 0 \quad\text{for } f \text{ of the form } f = R(\lambda) g,\ g \in C_b(E_0).
\]
Since $g = R(\lambda)(\lambda I - D_1 - L) g$ on $E_0$, $g \in D^{(1)}(L)$, it follows that
\[
\lim_{\alpha\to\infty} \|\alpha R(\alpha) g - g\|_\infty = 0 \quad\text{for } g \in D^{(1)}(L) = D(D_1) \cap D(L). \tag{2.159}
\]

As will be proved in Corollary 3.20 there exists $\lambda_0 > 0$ such that the family $\{\lambda R(\lambda) : \lambda \ge \lambda_0\}$ is $T_\beta$-equi-continuous. Hence for $u \in H^+(E_0)$ there exists $v \in H^+(E_0)$ such that for $\alpha \ge \lambda_0$ we have
\[
\|u\,\alpha R(\alpha) g\|_\infty \le \|v g\|_\infty, \qquad g \in C_b(E_0). \tag{2.160}
\]

Fix $\varepsilon > 0$, and for given $f \in C_b(E_0)$ and $u \in H^+(E_0)$ choose $g \in D^{(1)}(L)$ in such a way that
\[
\|u(f-g)\|_\infty + \|v(f-g)\|_\infty \le \tfrac{2}{3}\varepsilon. \tag{2.161}
\]
Since $D(L)$ is $T_\beta$-dense in $C_b([0,T]\times E)$, such a choice of $g$ is possible. The inequality (2.161) and the identity $\alpha R(\alpha) f - f = \alpha R(\alpha)(f-g) - (f-g) + \alpha R(\alpha) g - g$ yield
\[
\begin{aligned}
\|u(\alpha R(\alpha) f - f)\|_\infty
&\le \|u(\alpha R(\alpha)(f-g))\|_\infty + \|u(f-g)\|_\infty + \|u(\alpha R(\alpha) g - g)\|_\infty \\
&\le \|v(f-g)\|_\infty + \|u(f-g)\|_\infty + \|u(\alpha R(\alpha) g - g)\|_\infty \\
&\le \tfrac{2}{3}\varepsilon + \|u(\alpha R(\alpha) g - g)\|_\infty. \tag{2.162}
\end{aligned}
\]
From (2.159) and (2.162) we infer $T_\beta\text{-}\lim_{\alpha\to\infty} \alpha R(\alpha) f = f$, $f \in C_b(E_0)$. Of course the same arguments apply if $E_0 = [0,T]\times E$. In fact the detailed arguments which prove that the operator $D_1 + L$, confined to $E_0$, extends to the unique generator of a Feller semigroup are found in the proof of Theorem 3.21.
We still have to show that the martingale problem for the operator $L$ restricted to $E_0$ is well posed. Saying that the martingale problem is well posed for $L\restriction_{E_0}$ is the same as saying that the martingale problem is well posed for the operator $(D_1+L)\restriction_{E_0}$. More precisely, if
\[
\bigl\{ (\Omega, \mathcal F^\tau_T, P_{\tau,x})_{(\tau,x)\in[0,T]\times E},\ (X(t), t\in[0,T]),\ (E_0, \mathcal E_0) \bigr\} \tag{2.163}
\]

is a solution to the martingale problem associated to $L\restriction_{E_0}$, then the time-homogeneous family
\[
\Bigl\{ \bigl(\widetilde\Omega, \widetilde{\mathcal F}, P^{(0)}_{\tau,x}\bigr),\ (Y(t), t \ge 0),\ \bigl([0,T]\times E,\ \mathcal B_{[0,T]} \otimes \mathcal E_0\bigr) \Bigr\}_{(\tau,x)\in[0,T]\times E_0} \tag{2.164}
\]

is a solution to the martingale problem associated with $(D_1+L)\restriction_{E_0}$. Here $\widetilde\Omega = [0,T]\times\Omega$, $Y(t)(\tau, \omega) = \bigl((\tau+t)\wedge T,\, X((\tau+t)\wedge T)\bigr)$, $(\tau, \omega) \in [0,T]\times\Omega$, and the measure $P^{(0)}_{\tau,x}$ is determined by the equality
\[
E^{(0)}_{\tau,x}\Bigl[\prod_{j=1}^n f_j(Y(t_j))\Bigr] = E_{\tau,x}\Bigl[\prod_{j=1}^n f_j\bigl((\tau+t_j)\wedge T,\, X((\tau+t_j)\wedge T)\bigr)\Bigr], \tag{2.165}
\]
where the functions $f_j$, $1 \le j \le n$, are bounded Borel measurable functions on $[0,T]\times E_0$, and where $0 \le t_1 < \cdots < t_n$. Conversely, if the measures $P^{(0)}_{\tau,x}$ in (2.164) are known, then those in (2.163) are also determined by (2.165):
\[
E_{\tau,x}\Bigl[\prod_{j=1}^n f_j(t_j, X(t_j))\Bigr] = E^{(0)}_{\tau,x}\Bigl[\prod_{j=1}^n f_j(Y(t_j - \tau))\Bigr], \tag{2.166}
\]

where the functions fj , 1 ≤ j ≤ n, are again bounded Borel measurable func-


tions on [0, T ] × E0 , and where τ ≤ t1 < · · · tn ≤ T . In fact in (2.166) the
functions fj , 1 ≤ j ≤ n, only need to be defined on E0 , and such a function fj
can be identified with a function on [0, T ] × E0 which does not depend on the
time variable: (s, y) 7→ fj (y), (s, y) ∈ [0, T ]×E0 . It follows that instead of con-
sidering the time-inhomogeneous martingale problem associated with L ¹E0
we may consider the time-homogeneous martingale problem associated with
(D1 + L) ¹E0 . However, the martingale problem for the time-homogeneous
case is taken care of in the final part of Theorem 3.21.
So combining the above observations with Theorem 3.21 completes the
proof of item (e) of Theorem 1.39.
Remark 2.14. Prove that the process in (2.3) is indeed a supermartingale.
Remark 2.15. Let $(\psi_m)_{m\in\mathbb N}$ be a sequence in $C_b^+([\tau,T]\times E)$ which decreases pointwise to the zero function. Since the orbit $\bigl\{\bigl(t, \widetilde X(t)\bigr) : t \in [\tau,T]\bigr\}$ is $P_{\tau,x}$-almost surely compact, we know that
\[
\inf_{m\in\mathbb N}\ \sup_{t\in[\tau,T]} \psi_m\bigl(t, \widetilde X(t)\bigr) = 0, \qquad P_{\tau,x}\text{-almost surely.}
\]

2.1.6 Some historical remarks


The Lévy numbers in (2.103) are closely related to the Lévy metric, which in turn is related to approach structures. The definition of the Lévy metric and the Lévy-Prohorov metric can be found in the Encyclopaedia of Mathematics, edited by Hazewinkel [102]. In the area of convergence of measures the Encyclopaedia contains contributions by V. M. Zolotarev. In fact special sections are devoted to the Lévy metric, the Lévy-Prohorov metric, and related topics like convergence of probability measures on complete metrizable spaces. The Lévy metric goes back to Lévy: see [146]. The Lévy-Prohorov metric generalizes the Lévy metric, and has its origin in Prohorov [188]. For completeness we insert the definition of the Lévy-Prohorov metric.
Definition 2.16. Let $(E, d)$ be a metric space with its Borel $\sigma$-field $\mathcal E$. Let $\mathcal P(E)$ denote the collection of all probability measures on the measurable space $(E, \mathcal E)$. For a subset $A \subseteq E$, define the $\varepsilon$-neighborhood of $A$ by
\[
A^\varepsilon := \{x \in E : \text{there exists } y \in A \text{ such that } d(x, y) < \varepsilon\} = \bigcup_{y\in A} B_\varepsilon(y),
\]
where $B_\varepsilon(y)$ is the open ball of radius $\varepsilon$ centered at $y$. The Lévy-Prohorov metric $d_{LP} : \mathcal P(E)^2 \to [0, +\infty)$ is defined by setting the distance between two probability measures $\mu$ and $\nu$ as
\[
d_{LP}(\mu, \nu) = \inf\bigl\{\varepsilon > 0 : \mu(A) \le \nu(A^\varepsilon) + \varepsilon \text{ and } \nu(A) \le \mu(A^\varepsilon) + \varepsilon \text{ for all } A \in \mathcal E\bigr\}. \tag{2.167}
\]

For probability measures we clearly have dLP (µ, ν) ≤ 1.
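For measures with finite support the infimum in (2.167) can be computed by brute force, since only finitely many sets $A$ matter. The following sketch (the three-point space, the metric, and the grid resolution are illustrative choices, not from the text) scans $\varepsilon$ over a grid and checks both inequalities of (2.167) for every subset of the support:

```python
import itertools

# Hypothetical finite metric space: three points on the real line,
# with d(x, y) = |x - y|; mu and nu are probability vectors over `points`.
points = [0.0, 0.5, 2.0]

def d(x, y):
    return abs(x - y)

def eps_neighborhood(A, eps):
    # A^eps = {x : d(x, y) < eps for some y in A}, cf. Definition 2.16.
    return {x for x in points if any(d(x, y) < eps for y in A)}

def mass(w, A):
    # Total mass that the probability vector w assigns to the set A.
    return sum(w[points.index(x)] for x in A)

def levy_prohorov(mu, nu, grid=2000):
    # Scan eps over {1/grid, ..., 1} and return the first eps for which
    # both inequalities in (2.167) hold for every subset A of the support.
    subsets = [set(c) for r in range(len(points) + 1)
               for c in itertools.combinations(points, r)]
    for k in range(1, grid + 1):
        eps = k / grid
        if all(mass(mu, A) <= mass(nu, eps_neighborhood(A, eps)) + eps and
               mass(nu, A) <= mass(mu, eps_neighborhood(A, eps)) + eps
               for A in subsets):
            return eps
    return 1.0

# Point masses far apart realize the maximal distance d_LP = 1.
print(levy_prohorov([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
```

The grid scan only approximates the infimum from above (to within 1/grid), which is adequate to illustrate how the two inequalities interact.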


Some authors omit one of the two inequalities or choose only open or
closed subsets A; either inequality implies the other, but restricting to open
or closed sets changes the metric as defined in (2.167). The Lévy-Prohorov
metric is also called the Prohorov metric. The interested reader should compare the definition of the Lévy-Prohorov metric with that of approach structure
as exhibited in e.g. Lowen [152]. When discussing convergence of measures
and constructing appropriate metrics the reader is also referred to Billingsley
[30], Parthasarathy [186], Zolotarev [261], and others like Bickel, Klaassen,
Ritov and Wellner in [29], appendices A6–A9. A book which uses the notion
of Korovkin set to a great extent is [6]. For applications of Korovkin sets to
ergodic theory see e.g. Marsden and Riemenschneider [155], Nishiraho [166],
Labsker [142], Chapter 7 and 8 in Donner [72], and Krengel [138]. Another
book of interest is [28] edited by Bergelson, March and Rosenblatt. For the
convergence results we also refer to the original book by Korovkin [137]. The
reader also might want to consult (the references in) Bukhalov [45]. In the
terminology of test sets, or Korovkin sets, our space $D^{(1)} = D(D_1) \cap D(L)$ in $C_b([0,T]\times E)$ is a Korovkin set for the resolvent family $(\lambda I - D_1 - L)^{-1}$, $\lambda > 0$.
λ > 0. From the proof of item (e) of Theorem 1.39 it follows that we only need
the Korovkin property for some fixed λ0 > 0: see the definitions 3.13 and 1.36.
In the finite-dimensional setting these Korovkin sets may be relatively small:
see e.g. Özarslan and Duman [180]. Section 5.2 in the recent book on functional analysis by Dzung Minh Ha [96] carries the title "Korovkin's theorem
and the Weierstrass approximation theorem”.
3
Space-time operators and miscellaneous topics

3.1 Space-time operators

In this section we will discuss in more detail the generators of the time-space Markov process (see (1.75)):
\[
\bigl\{ (\Omega, \mathcal F^\tau_T, P_{\tau,x}),\ (X(t) : T \ge t \ge \tau),\ (\vartheta_t : \tau \le t \le T),\ (E, \mathcal E) \bigr\} \tag{3.1}
\]

In Definition 1.30 we have introduced the family of generators of the corresponding Feller evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ given by $P(\tau,t) f(x) = E_{\tau,x}[f(X(t))]$, $f \in C_b(E)$. In fact for any fixed $t \in [0,T]$ we will consider the Feller evolution as an operator from $C_b([0,T]\times E)$ to $C_b([0,t]\times E)$. This is done in the following manner. To a function $f \in C_b([0,T]\times E)$ our Feller evolution assigns the function $(\tau, x) \mapsto P(\tau,t) f(t,\cdot)(x)$. We will also consider the family of operators $L := \{L(t) : t \in [0,T)\}$ as defined in Definition 1.30, which is regarded as a linear operator acting on a subspace of $C_b([0,T]\times E)$. It is called the (infinitesimal) generator of the Feller evolution $\{P(s,t) : 0 \le s \le t \le T\}$ if
\[
L(s) f = T_\beta\text{-}\lim_{t\downarrow s} \frac{P(s,t) f - f}{t - s}, \qquad 0 \le s \le T.
\]
This means that a function $f$ belongs to $D(L(s))$ whenever $L(s) f := \lim_{t\downarrow s} \frac{P(s,t) f - f}{t - s}$ exists in $C_b(E)$, equipped with the strict topology. As explained earlier, such a family of operators is considered as an operator $L$ with domain in the space $C_b([0,T]\times E)$. A function $f \in C_b([0,T]\times E)$ is said to belong to $D(L)$ if for every $s \in [0,T]$ the function $x \mapsto f(s, x)$ is a member of $D(L(s))$ and if the function $(s, x) \mapsto L(s) f(s,\cdot)(x)$ belongs to $C_b([0,T]\times E)$. Instead of $L(s) f(s,\cdot)(x)$ we often write $L(s) f(s, x)$. If a function $f \in D(L)$ is such that the function $s \mapsto f(s, x)$ is differentiable, then we say that $f$ belongs to $D^{(1)}(L)$. We will show that such a generator also generates the corresponding Markov process in the sense of Definition 1.31. For convenience of the reader we repeat here the defining property. A family of operators $L := \{L(s) : 0 \le s \le T\}$ is said to generate a time-inhomogeneous Markov process, as described in (1.75), if for all functions $u \in D(L)$, for all $x \in E$, and for all pairs $(\tau, s)$ with $0 \le \tau \le s \le T$ the following equality holds:
\[
\frac{d}{ds} E_{\tau,x}[u(s, X(s))] = E_{\tau,x}\Bigl[\frac{\partial u}{\partial s}(s, X(s)) + L(s) u(s,\cdot)(X(s))\Bigr]. \tag{3.2}
\]
Our first result says that generators of Markov processes and the corresponding Feller evolutions coincide.
Proposition 3.1. Let the Markov process in (3.1) and the Feller evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ be related by $P(\tau,t) f(x) = E_{\tau,x}[f(X(t))]$, $f \in C_b(E)$. Let $L = \{L(s) : 0 \le s \le T\}$ be a family of linear operators with domain and range in $C_b(E)$. If $L$ is a generator of the Feller evolution, then it also generates the corresponding Markov process. Conversely, if $L$ generates a Markov process, then it also generates the corresponding Feller evolution.
Proof. First suppose that the Feller evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ is generated by the family $L$. Let the function $f$ belong to the domain of $L$ and suppose that $D_1 f$ is continuous on $[0,T]\times E$. Then we have
\[
\begin{aligned}
&E_{\tau,x}\Bigl[\frac{\partial f}{\partial s}(s, X(s)) + L(s) f(s,\cdot)(X(s))\Bigr] \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) + P(\tau,s) L(s) f(s,\cdot)(x) \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) + P(\tau,s)\Bigl[\lim_{h\downarrow 0}\frac{P(s,s+h) f(s,\cdot) - f(s,\cdot)}{h}\Bigr](x) \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) + \lim_{h\downarrow 0} P(\tau,s)\Bigl[\frac{P(s,s+h) f(s,\cdot) - f(s,\cdot)}{h}\Bigr](x) \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) + \lim_{h\downarrow 0}\Bigl[\frac{P(\tau,s+h) f(s,\cdot) - P(\tau,s) f(s,\cdot)}{h}\Bigr](x) \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) - \lim_{h\downarrow 0} P(\tau,s+h)\Bigl[\frac{f(s+h,\cdot) - f(s,\cdot)}{h}\Bigr](x) + \lim_{h\downarrow 0}\Bigl[\frac{P(\tau,s+h) f(s+h,\cdot) - P(\tau,s) f(s,\cdot)}{h}\Bigr](x) \\
&= P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) - P(\tau,s)\frac{\partial f}{\partial s}(s,\cdot)(x) + \lim_{h\downarrow 0}\frac{E_{\tau,x}[f(s+h, X(s+h))] - E_{\tau,x}[f(s, X(s))]}{h} \\
&= \frac{d}{ds} E_{\tau,x}[f(s, X(s))]. \tag{3.3}
\end{aligned}
\]
In (3.3) we used the fact that the function $D_1 f$ is continuous, and its consequence that the difference quotient $\frac{f(s+h, y) - f(s, y)}{h}$ converges as $h \downarrow 0$, uniformly for $y$ in compact subsets of $E$. We also used the fact that the family of operators $\{P(\tau,t) : t \in [\tau,T]\}$ is equi-continuous for the strict topology.
In the second part we have to show that a generator $L$ of a Markov process (3.1) also generates the corresponding Feller evolution. Therefore we fix $s \in [0,T]$ and take $f \in D(L(s)) \subset C_b(E)$. Using the fact that $L$ generates the Markov process in (3.1) we infer for $h \in (0, T-s)$:
\[
\lim_{h\downarrow 0}\frac{P(s,s+h) f(x) - f(x)}{h}
= \frac{d}{dh} P(s,s+h) f(x)\Big|_{h=0}
= \frac{d}{dh} E_{s,x}[f(X(s+h))]\Big|_{h=0}
= E_{s,x}[L(s) f(X(s))] = L(s) f(x). \tag{3.4}
\]
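For a time-homogeneous chain on a finite state space the limit in (3.4) can be checked by hand: there $P(s, s+h) = e^{hQ}$ for a generator matrix $Q$, and the difference quotient $(P(s,s+h)f - f)/h$ converges to $Qf$ as $h \downarrow 0$. A minimal sketch (the two-state rates $a$, $b$ are illustrative, not from the text), approximating the matrix exponential by a fine Euler product:

```python
# Two-state chain with generator Q = [[-a, a], [b, -b]] (illustrative rates).
a, b = 2.0, 3.0
Q = [[-a, a], [b, -b]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transition(h, steps=20000):
    # P(h) = e^{hQ}, approximated by the Euler product (I + (h/steps) Q)^steps.
    E = [[(1.0 if i == j else 0.0) + (h / steps) * Q[i][j]
          for j in range(2)] for i in range(2)]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        P = mat_mul(P, E)
    return P

def apply(M, f):
    return [sum(M[i][j] * f[j] for j in range(2)) for i in range(2)]

f = [1.0, -0.5]
h = 1e-4
quotient = [(apply(transition(h), f)[i] - f[i]) / h for i in range(2)]
Qf = apply(Q, f)
# The difference quotient agrees with Qf up to an O(h) error.
print(max(abs(quotient[i] - Qf[i]) for i in range(2)))
```

The residual error is of order $h\,\|Q^2 f\|/2$, which is exactly the first neglected term of the exponential series.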
To such a Feller evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ we may also associate a semigroup of operators $S(\rho)$ acting on the space $C_b([0,T]\times E)$ and the corresponding resolvent family $\{R(\alpha) : \Re\alpha > 0\}$. The semigroup $\{S(\rho) : \rho \ge 0\}$ is defined by the formula
\[
S(\rho) f(t, x) = P\bigl(t, (\rho+t)\wedge T\bigr) f\bigl((\rho+t)\wedge T, \cdot\bigr)(x)
= E_{t,x}\bigl[f\bigl((\rho+t)\wedge T,\, X((\rho+t)\wedge T)\bigr)\bigr], \tag{3.5}
\]

$f \in C_b([0,T]\times E)$, $(t, x) \in [0,T]\times E$. Notice that the operator $S(\rho)$ does not leave the space $C_b(E)$ invariant: i.e. a function of the form $(s, y) \mapsto f(y)$, $f \in C_b(E)$, will be mapped to a function $S(\rho) f \in C_b([0,T]\times E)$ which really depends on the time variable. Then the resolvent operator $R(\alpha)$, which also acts as an operator on the space $C_b([0,T]\times E)$ of bounded continuous functions on space-time, is given by
\[
\begin{aligned}
R(\alpha) f(t, x) &= \int_t^\infty e^{-\alpha(\rho-t)}\, P(t, \rho\wedge T) f(\rho\wedge T, \cdot)(x)\,d\rho \\
&= \int_0^\infty e^{-\alpha\rho}\, P\bigl(t, (\rho+t)\wedge T\bigr) f\bigl((\rho+t)\wedge T, \cdot\bigr)(x)\,d\rho \\
&= \int_0^\infty e^{-\alpha\rho}\, S(\rho) f(t, x)\,d\rho \\
&= E_{t,x}\Bigl[\int_0^\infty e^{-\alpha\rho}\, f\bigl((\rho+t)\wedge T,\, X((\rho+t)\wedge T)\bigr)\,d\rho\Bigr], \tag{3.6}
\end{aligned}
\]

$f \in C_b([0,T]\times E)$, $(t, x) \in [0,T]\times E$. In order to prove that the family $\{R(\alpha) : \Re\alpha > 0\}$ is indeed a resolvent family it suffices to establish that the family $\{S(\rho) : \rho \ge 0\}$ is a semigroup. Let $f \in C_b([0,T]\times E)$ and fix $0 \le \rho_1, \rho_2 < \infty$. Then this fact is a consequence of the following identities:
\[
\begin{aligned}
S(\rho_1) S(\rho_2) f(t, x)
&= P\bigl(t, (\rho_1+t)\wedge T\bigr)\bigl[y \mapsto S(\rho_2) f\bigl((\rho_1+t)\wedge T, y\bigr)\bigr](x) \\
&= P\bigl(t, (\rho_1+t)\wedge T\bigr)\bigl[y \mapsto P\bigl((\rho_1+t)\wedge T, (\rho_2+\rho_1+t)\wedge T\bigr) f\bigl((\rho_2+\rho_1+t)\wedge T, y\bigr)\bigr](x) \\
&\quad\text{(use the evolution property)} \\
&= P\bigl(t, (\rho_2+\rho_1+t)\wedge T\bigr) f\bigl((\rho_2+\rho_1+t)\wedge T, \cdot\bigr)(x)
= S(\rho_2+\rho_1) f(t, x). \tag{3.7}
\end{aligned}
\]
Let $D_1 : C_b^{(1)}([0,T]) \to C_b([0,T])$ be the time-derivative operator. Then the space-time operator $D_1 + L$ defined by
\[
(D_1 + L) f(t, x) = D_1 f(t, x) + L(t) f(t,\cdot)(x), \qquad f \in D(D_1 + L),
\]
turns out to be the generator of the semigroup $\{S(\rho) : \rho \ge 0\}$. We also observe that once the semigroup $\{S(\rho) : \rho \ge 0\}$ is known, the Feller evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ can be recovered by the formula
\[
P(\tau,t) f(x) = S(t-\tau) f(\tau, x), \qquad f \in C_b(E), \tag{3.8}
\]
where at the right-hand side of (3.8) the function $f$ is considered as the function in $C_b([0,T]\times E)$ given by $(s, y) \mapsto f(y)$.
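On a finite state space the objects above become matrices: the generator is a matrix $Q$, the resolvent is $R(\lambda) = (\lambda I - Q)^{-1} = \int_0^\infty e^{-\lambda t} e^{tQ}\,dt$, and both the first resolvent equation $R(\lambda) - R(\mu) = (\mu - \lambda) R(\mu) R(\lambda)$ and the convergence $\lambda R(\lambda) f \to f$ can be verified directly. A sketch for a two-state chain (the rates are illustrative, not taken from the text):

```python
# Two-state generator Q = [[-a, a], [b, -b]]; R(lam) = (lam*I - Q)^{-1},
# computed with the 2x2 cofactor formula.
a, b = 1.0, 2.0
Q = [[-a, a], [b, -b]]

def resolvent(lam):
    M = [[lam - Q[0][0], -Q[0][1]], [-Q[1][0], lam - Q[1][1]]]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# First resolvent equation: R(lam) - R(mu) = (mu - lam) R(mu) R(lam).
lam, mu = 0.7, 1.9
lhs = [[resolvent(lam)[i][j] - resolvent(mu)[i][j] for j in range(2)]
       for i in range(2)]
rhs = [[(mu - lam) * x for x in row]
       for row in mat_mul(resolvent(mu), resolvent(lam))]
print(max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2)))

# lam * R(lam) f -> f as lam -> infinity (the Abelian limit used in the text):
f = [1.0, -1.0]
big = 1e6
approx = [big * sum(resolvent(big)[i][j] * f[j] for j in range(2))
          for i in range(2)]
print(max(abs(approx[i] - f[i]) for i in range(2)))
```

The second check mirrors the argument around (2.158)-(2.159): the error $\lambda R(\lambda) f - f = R(\lambda) Q f$ is of order $\|Qf\|/\lambda$.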
Theorem 3.2. Let $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ be a Feller propagator. Define the corresponding $T_\beta$-continuous semigroup $\{S(\rho) : \rho \ge 0\}$ as in (3.5), and the resolvent family $\{R(\alpha) : \alpha > 0\}$ as in (3.6). Let $L^{(1)}$ be its generator. Then $\bigl(\alpha I - L^{(1)}\bigr) R(\alpha) f = f$, $f \in C_b([0,T]\times E)$, $R(\alpha)\bigl(\alpha I - L^{(1)}\bigr) f = f$, $f \in D\bigl(L^{(1)}\bigr)$, and $L^{(1)}$ extends $D_1 + L$. Conversely, if the operator $L^{(1)}$ is defined by $L^{(1)} R(\alpha) f = \alpha R(\alpha) f - f$, $f \in C_b([0,T]\times E)$, then $L^{(1)}$ generates the semigroup $\{S(\rho) : \rho \ge 0\}$, and $L^{(1)}$ extends the operator $D_1 + L$.
Proof. By definition we know that
\[
L^{(1)} f = T_\beta\text{-}\lim_{t\downarrow 0}\frac{1}{t}\bigl(S(t) - S(0)\bigr) f, \qquad f \in D\bigl(L^{(1)}\bigr). \tag{3.9}
\]
Here $D\bigl(L^{(1)}\bigr)$ is the subspace of those $f \in C_b([0,T]\times E)$ for which the limit in (3.9) exists. Fix $f \in C_b([0,T]\times E)$ and $\alpha > 0$. Then
\[
\begin{aligned}
\bigl(I - e^{-\alpha t} S(t)\bigr)\int_0^\infty e^{-\alpha\rho} S(\rho) f\,d\rho
&= \int_0^\infty e^{-\alpha\rho} S(\rho) f\,d\rho - \int_0^\infty e^{-\alpha(t+\rho)} S(t) S(\rho) f\,d\rho \\
&= \int_0^\infty e^{-\alpha\rho} S(\rho) f\,d\rho - \int_t^\infty e^{-\alpha\rho} S(\rho) f\,d\rho
= \int_0^t e^{-\alpha\rho} S(\rho) f\,d\rho. \tag{3.10}
\end{aligned}
\]
From (3.10) it follows that $R(\alpha) f \in D\bigl(L^{(1)}\bigr)$, and that $\bigl(\alpha I - L^{(1)}\bigr) R(\alpha) f = f$. Conversely, let $f \in D\bigl(L^{(1)}\bigr)$. Then we have
\[
\begin{aligned}
R(\alpha)\bigl(\alpha f - L^{(1)} f\bigr)
&= R(\alpha)\, T_\beta\text{-}\lim_{t\downarrow 0}\frac{1}{t}\bigl(f - e^{-\alpha t} S(t) f\bigr)
= T_\beta\text{-}\lim_{t\downarrow 0}\frac{1}{t}\bigl(R(\alpha) f - e^{-\alpha t} R(\alpha) S(t) f\bigr) \\
&= T_\beta\text{-}\lim_{t\downarrow 0}\frac{1}{t}\int_0^t e^{-\alpha\rho} S(\rho) f\,d\rho = f. \tag{3.11}
\end{aligned}
\]

The first part of Theorem 3.2 follows from (3.10) and (3.11). In order to show that L^{(1)} extends D1 + L we recall the definition of the generator of a Feller evolution as given in Definition 1.30: L(s)f = Tβ-lim_{t↓s} (P (s, t)f − f) / (t − s). So that if f ∈ D^{(1)}(L), then f ∈ D(L^{(1)}), and L^{(1)} f = D1 f + Lf . Recall that Lf (s, x) = L(s)f (s, ·) (x). Finally, if the operator L0 is defined by L0 R(α)f = αR(α)f − f , f ∈ Cb ([0, T ] × E), then necessarily L0 = L^{(1)}, and hence L0 generates the semigroup {S(ρ) : ρ ≥ 0}.
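In finite dimensions the content of Theorem 3.2 reduces to matrix algebra, which may help fix ideas. The following sketch uses purely hypothetical toy data: a 3-state rate matrix Q stands in for the generator, S(t) = e^{tQ} for the semigroup, and R(α) = (αI − Q)^{−1} for the resolvent; it checks numerically that R(α) is the Laplace transform of the semigroup and that (αI − Q)R(α)f = f.

```python
import numpy as np

# Finite-state sketch of Theorem 3.2.  All data are hypothetical: a 3-state
# rate matrix Q stands in for the generator, S(t) = e^{tQ} for the semigroup,
# R(alpha) = (alpha I - Q)^{-1} for the resolvent.
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])
I = np.eye(3)
f = np.array([1.0, -0.5, 2.0])

def expm(A, terms=60):
    # Taylor series for the matrix exponential; adequate for these small norms.
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

alpha = 2.0
R = np.linalg.inv(alpha * I - Q)

# Laplace-transform representation R(alpha) f = int_0^inf e^{-alpha t} S(t) f dt,
# approximated with the trapezoidal rule on [0, 40] (the tail is negligible).
dt, T_max = 0.01, 40.0
E = expm(dt * Q)                      # one time step of the semigroup
Sf, integral, t = f.copy(), np.zeros(3), 0.0
while t < T_max:
    Sf_next = E @ Sf
    integral += 0.5 * dt * (np.exp(-alpha * t) * Sf
                            + np.exp(-alpha * (t + dt)) * Sf_next)
    Sf, t = Sf_next, t + dt

resolvent_matches = bool(np.allclose(integral, R @ f, atol=1e-3))
identity_holds = bool(np.allclose((alpha * I - Q) @ (R @ f), f))
```

In the Tβ setting of the theorem the same computation is carried out with strict limits instead of finite-dimensional norms; the matrix picture only illustrates the algebra of (3.10) and (3.11).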

Theorem 3.3. Let L be a linear operator with domain D(L) and range R(L) in Cb (E). The following assertions are equivalent:
(i) The operator L is Tβ-closable and its Tβ-closure generates a Feller semigroup.
(ii) The operator L verifies the maximum principle, its domain D(L) is Tβ-dense in Cb (E), it is Tβ-dissipative and sequentially λ-dominant for some λ > 0, and there exists λ0 > 0 such that the range R (λ0 I − L) is Tβ-dense in Cb (E).

In Definitions 3.4–3.6 the notions of maximum principle, dissipativity, and sequential λ-dominance are explained. In the proof we will employ the results of Proposition 1.22.
Definition 3.4. An operator L with domain and range in Cb (E) is said to satisfy the maximum principle if for every f ∈ D(L) there exists a sequence (x_n)_{n∈N} ⊂ E with the following properties:
lim_{n→∞} Re f (x_n) = sup_{x∈E} Re f (x), and lim_{n→∞} Re Lf (x_n) ≤ 0. (3.12)

In assertion (b) of Proposition 3.11 it will be shown that (3.12) is equivalent to the inequality in (3.46).
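On a finite state space the maximum principle is elementary to verify. For a conservative rate matrix Q (off-diagonal entries ≥ 0, zero row sums) one has Qf(x) = Σ_m q_{xm}(f(m) − f(x)), and at a maximizer of Re f every term has non-positive real part. A toy check with a hypothetical Q:

```python
import numpy as np

# Toy check of the maximum principle (Definition 3.4) for a hypothetical
# conservative rate matrix Q: off-diagonal entries >= 0 and zero row sums,
# so Qf(x) = sum_m q_{xm} (f(m) - f(x)).  At a maximizer x_n of Re f every
# term in this sum has non-positive real part.
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])

rng = np.random.default_rng(0)
ok = True
for _ in range(200):
    f = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    n = int(np.argmax(f.real))          # the sup of Re f is attained (E finite)
    ok = ok and (Q @ f).real[n] <= 1e-12
```

Since E is finite here, the sequence (x_n) of Definition 3.4 can be taken constant; on a general Polish space only approximate maximizers are available, which is why the definition is phrased with sequences.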
Definition 3.5. An operator L with domain and range in Cb (E) is called dissipative if
‖λf − Lf‖_∞ ≥ λ ‖f‖_∞, for all λ > 0 and for all f ∈ D(L). (3.13)
An operator L with domain and range in Cb (E) is called Tβ-dissipative if there exists λ0 ≥ 0 such that for every function u ∈ H^+(E) there exists a function v ∈ H^+(E) such that
‖v (λf − Lf)‖_∞ ≥ λ ‖uf‖_∞, for all λ ≥ λ0 and all f ∈ D(L). (3.14)


An operator L with domain and range in Cb (E) is called positive Tβ-dissipative if there exists λ0 > 0 such that for every function u ∈ H^+(E) there exists a function v ∈ H^+(E) for which
sup_{x∈E} v(x) Re (λf (x) − Lf (x)) ≥ λ sup_{x∈E} u(x) Re f (x), (3.15)
for all λ ≥ λ0 and for all f ∈ D(L).


The definition which follows is crucial in proving that an operator L (or its Tβ-closure) generates a Tβ-continuous Feller semigroup. The symbol K(E) stands for the collection of compact subsets of E. The mapping f ↦ U_λ^1(f), f ∈ Cb (E, R), was introduced in (1.41).
Definition 3.6. Let L be an operator with domain and range in Cb (E), and fix λ > 0. For f ∈ Cb (E, R) put
U_λ^1(f) = sup_{K∈K(E)} inf_{g∈D(L)} {g ≥ f 1_K : (λI − L) g ≥ 0} . (3.16)
The operator L is called sequentially λ-dominant if for every sequence (f_n)_{n∈N} which decreases pointwise to zero, the sequence (f_n^λ = U_λ^1(f_n))_{n∈N} defined as in (3.16) possesses the following properties:
1. The function f_n^λ dominates f_n: f_n ≤ f_n^λ, and
2. The sequence (f_n^λ)_{n∈N} converges to zero uniformly on compact subsets of E: lim_{n→∞} sup_{x∈K} f_n^λ(x) = 0 for all K ∈ K(E).

The functions f_n^λ automatically have the first property, provided that the constant functions belong to D(L) and that L1 = 0. The real condition is given by the second property. Some properties of the mapping U_λ^1 : Cb (E, R) → L^∞(E, E, R) were explained in Proposition 1.22. If in Definition 3.6 the mapping U_λ^1 sends Cb (E, R) to itself, then Dini's lemma implies that in (2) uniform convergence on compact subsets of E may be replaced by pointwise convergence on E.
Remark 3.7. Suppose that the operator L in Definition 3.6 satisfies the maximum principle and that (µI − L) D(L) = Cb (E), µ > 0. Then the inverses R(µ) = (µI − L)^{−1}, µ > 0, exist and represent positivity preserving operators. If a function g ∈ D(L) is such that (λI − L) g ≥ 0, then g ≥ 0 and ((λ + µ) I − L) g ≥ µg, µ ≥ 0. It follows that g ≥ µR (λ + µ) g, µ ≥ 0. In the literature functions g ∈ Cb (E) with the latter property are called λ-super-median. For more details see e.g. Sharpe [208]. If the operator L generates a Feller semigroup {S(t) : t ≥ 0}, then a function g ∈ Cb (E) is called λ-super-mean valued if for every t ≥ 0 the inequality e^{−λt} S(t)g ≤ g holds pointwise. In Lemma 9.12 of Sharpe [208] it is shown that, essentially speaking, these notions are equivalent. In fact the proof is not very difficult. It uses the Hausdorff-Bernstein-Widder theorem on the representation of completely monotone functions as Laplace transforms of positive Borel measures on [0, ∞). The equivalence is also implicitly proved in the proof of Theorem 3.10, implication (iii) =⇒ (i): see the (in-)equalities (3.131), (3.132), (3.133), (3.134), and (3.140).
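The two notions of Remark 3.7 can be observed on a finite state space. With a hypothetical rate matrix Q and h ≥ 0, the resolvent image g = R(λ)h is both λ-supermedian and λ-supermean valued; indeed e^{−λt}S(t)R(λ)h = ∫_t^∞ e^{−λs}S(s)h ds ≤ R(λ)h. A sketch:

```python
import numpy as np

# Hypothetical finite-state illustration of Remark 3.7.  For h >= 0 the
# function g = R(lam) h should be lam-supermedian (g >= mu R(lam+mu) g for
# all mu > 0) and lam-supermean valued (e^{-lam t} S(t) g <= g for all t).
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])
I = np.eye(3)

def expm(A, terms=60):
    # Taylor series for the matrix exponential; fine for these small norms.
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

lam = 1.0
h = np.array([0.3, 1.2, 0.5])           # h >= 0
R = lambda a: np.linalg.inv(a * I - Q)
g = R(lam) @ h

supermedian = all(np.all(mu * R(lam + mu) @ g <= g + 1e-10)
                  for mu in (0.1, 1.0, 10.0, 100.0))
supermean = all(np.all(np.exp(-lam * t) * (expm(t * Q) @ g) <= g + 1e-10)
                for t in (0.1, 0.5, 1.0, 2.0, 5.0))
```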

Proof (Proof of Theorem 3.3). (i) =⇒ (ii). Let L̄ be the Tβ-closure of L, which is the Tβ-generator of the semigroup {S(t) : t ≥ 0}. Then R(λI − L̄) = Cb (E), and the inverses of λI − L̄, which we denote by R(λ), exist and satisfy R(λ)f (x) = ∫_0^∞ e^{−λt} S(t)f (x) dt. It follows that
λ Re (R(λ)f (x)) = λ ∫_0^∞ e^{−λt} (S(t) Re f) (x) dt ≤ λ ∫_0^∞ e^{−λt} dt sup_{y∈E} Re f (y) = sup_{y∈E} Re f (y), (3.17)
and hence sup_{x∈E} λ Re (R(λ)f (x)) ≤ sup_{y∈E} Re f (y). The substitution f = λg − L̄g yields:
λ sup_{x∈E} Re g(x) ≤ sup_{y∈E} Re (λg(y) − L̄g(y)), g ∈ D(L̄). (3.18)

In other words, the operator L̄ satisfies the maximum principle, and so does the operator L: see Proposition 3.11, assertion (b), below. Since the operator L is Tβ-dissipative, the resolvent families {R(λ) : λ ≥ λ0}, λ0 > 0, are Tβ-equi-continuous. Hence every operator R(λ) can be written as an integral: R(λ)f (x) = ∫ f (y) r (λ, x, dy), f ∈ Cb (E). For this the reader may consider the arguments in (the proof of) Proposition 1.22. Moreover, for every λ0 > 0 the family {e^{−λ0 t} S(t) : t ≥ 0} is Tβ-equi-continuous, and in addition lim_{t↓0} S(t)f (x) = f (x), f ∈ Cb (E). It then follows that lim_{λ→∞} λR(λ)f (x) = f (x), f ∈ Cb (E). As in the proof of Proposition 1.22 we see that Tβ-lim_{λ→∞} λR(λ)f = f , f ∈ Cb (E): see e.g. (1.58). Let f ≥ 0 belong to Cb (E), and consider the function U_λ^1(f) defined by
U_λ^1(f) = sup_{K∈K(E)} inf_{g∈D(L̄)} {g ≥ f 1_K : λg − L̄g ≥ 0} . (3.19)

In fact this definition is copied from (1.41). As was shown in Proposition 1.22, we have the following equality:
U_λ^1(f) = sup {(µR (λ + µ))^k f : µ > 0, k ∈ N} = sup {e^{−λt} S(t)f : t ≥ 0} . (3.20)
In fact in Proposition 1.22 the first equality in (3.20) was proved. The second equality follows from the representations:
(µR (λ + µ))^k f = (µ^k / (k − 1)!) ∫_0^∞ t^{k−1} e^{−µt} e^{−λt} S(t)f dt, and (3.21)
e^{−λt} S(t)f = Tβ-lim_{µ→∞} e^{−µt} Σ_{k=0}^∞ ((µt)^k / k!) (µR (λ + µ))^k f. (3.22)

A similar argument will be used in the proof of Theorem 3.10, (iii) =⇒ (i): see (3.133) and (3.134). The representation in (3.20) implies that the operator L̄ is sequentially λ-dominant. Altogether this proves the implication (i) =⇒ (ii) of Theorem 3.3.
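For a matrix generator the series in (3.22) can be summed in closed form, since e^{−µt} Σ_k ((µt)^k/k!) A^k = e^{tµ(A−I)} with A = µR(λ+µ). The following toy check (hypothetical Q and f) illustrates that this Yosida-type approximation converges to e^{−λt}S(t)f as µ → ∞:

```python
import numpy as np

# Toy check of (3.22).  For a matrix generator the series can be summed in
# closed form: e^{-mu t} sum_k ((mu t)^k / k!) A^k = e^{t mu (A - I)} with
# A = mu R(lam + mu), and this converges to e^{-lam t} S(t) f as mu grows.
# Q and f are hypothetical data.
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])
I = np.eye(3)
f = np.array([1.0, -0.5, 2.0])

def expm(A, terms=60):
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

lam, t = 1.0, 1.0
target = np.exp(-lam * t) * (expm(t * Q) @ f)        # e^{-lam t} S(t) f

def approx(mu):
    A = mu * np.linalg.inv((lam + mu) * I - Q)       # mu R(lam + mu)
    return expm(t * mu * (A - I)) @ f                # the summed series

errors = [float(np.max(np.abs(approx(mu) - target)))
          for mu in (10.0, 100.0, 1000.0)]
converges = errors[2] < errors[1] < errors[0] and errors[2] < 0.05
```

The error decreases roughly like 1/µ, which matches the first-order character of the Yosida approximation µ(µR(λ+µ) − I) → L̄ − λI.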
(ii) =⇒ (i). As in Proposition 3.11, assertion (a), below, the operator L is Tβ-closable. Let L̄ be its Tβ-closure. Then the operator L̄ is Tβ-dissipative, sequentially λ-dominant, and satisfies the maximum principle. In addition, R(λI − L̄) = Cb (E), λ > 0. Consequently, the inverses R(λ) = (λI − L̄)^{−1}, λ > 0, exist. The formulas in (3.21) and (3.22) can be used to represent the powers of the resolvent operators, and to define the Tβ-continuous semigroup generated by L̄. The λ-dominance is used in a crucial manner to prove that the semigroup represented by (3.22) is a Tβ-equi-continuous semigroup which consists of operators that map bounded continuous functions to bounded continuous functions. For details the reader is referred to the proof of Theorem 3.10, implication (iii) =⇒ (i), where a very similar construction is carried out for a space-time operator L^{(1)} which is the Tβ-closure of D1 + L. In Theorem 3.10 the operator D1 takes derivatives with respect to time, and L generates a Feller evolution.

Proposition 3.8. Let L be a Tβ-closed linear operator with domain and range in Cb (E). Suppose that the operator L satisfies the maximum principle, and is such that R (λI − L) = Cb (E), λ > 0. Then the resolvent family {R(λ) = (λI − L)^{−1} : λ > 0} consists of positivity preserving operators. In addition, suppose that L possesses a Tβ-dense domain, and that the following limits exist: for all (t, x) ∈ [0, ∞) × E and for all f ∈ Cb (E)
e^{−λt} S(t)f (x) = lim_{µ→∞} e^{−µt} Σ_{k=0}^∞ ((µt)^k / k!) (µR (λ + µ))^k f (x), (3.23)
and for all f ∈ D(L) and x ∈ E
lim_{µ→∞} µ (I − µR (λ + µ)) f (x) = λf (x) − Lf (x). (3.24)
Moreover, suppose that the operators R(λ), λ > 0, are Tβ-continuous. Fix f ∈ Cb (E), f ≥ 0, and λ > 0. The following equalities and inequality hold true:

sup_{K∈K(E)} inf_{g∈D(L)} {g ≥ f 1_K : (λI − L) g ≥ 0} (3.25)
= sup_{K∈K(E)} inf_{g∈Cb(E)} {g ≥ f 1_K : g ≥ µR(λ + µ)g for all µ > 0} (3.26)
≥ sup {(µR (λ + µ))^k f : µ > 0, k ∈ N} (3.27)
= sup {e^{−λt} S(t)f : t ≥ 0} . (3.28)

If the function (t, x) ↦ S(t)f (x) is continuous, then the function g = sup {e^{−λt} S(t)f : t ≥ 0} is continuous, realizes the infimum in (3.26), and the expressions (3.25) through (3.28) are all equal.

The following corollary is an immediate consequence of Proposition 3.8.
Corollary 3.9. Suppose that the operator L with domain and range in Cb (E)
be the Tβ -generator of a Feller semigroup {S(t) : t ≥ 0}. Let f ≥ 0 belong to
Cb (E). Then the quantities in (3.25) through (3.28) are equal.
Let g ∈ D(L). By assumption (3.24) we see that λg − Lg ≥ 0 if and only if g ≥ µR (λ + µ) g for all µ > 0. Hence we have
inf_{g∈D(L)} {g ≥ f 1_K : (λI − L) g ≥ 0} = inf_{g∈D(L)} {g ≥ f 1_K : g ≥ µR(λ + µ)g for all µ > 0} . (3.29)
It is not so clear under what conditions we have equality of (3.29) and (3.26). If f ∈ D(L) is such that λf − Lf ≥ 0, then the functions in (3.25) through (3.28) are all equal to f .
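For a matrix generator the equality of (3.27) and (3.28) can be observed numerically. The sketch below (toy Q and f ≥ 0, both hypothetical) computes the two pointwise suprema on finite grids; `dominated` reflects the inequality (3.27) ≤ (3.28), and `close` the approximate equality up to discretization:

```python
import numpy as np

# Numerical sketch of the equality of (3.27) and (3.28) for a hypothetical
# matrix generator Q and f >= 0: the pointwise supremum over t of
# e^{-lam t} S(t) f is compared with the supremum over mu, k of
# (mu R(lam + mu))^k f, both evaluated on finite grids.
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])
I = np.eye(3)
f = np.array([0.0, 0.2, 1.5])
lam = 1.0

def expm(A, terms=60):
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

dt = 0.005
E = expm(dt * Q)
lhs, Sf, t = f.copy(), f.copy(), 0.0    # the t = 0 term gives f itself
while t < 10.0:
    Sf, t = E @ Sf, t + dt
    lhs = np.maximum(lhs, np.exp(-lam * t) * Sf)

rhs = f.copy()                          # the k = 0 term gives f itself
for mu in (1.0, 10.0, 100.0, 1000.0):
    M = mu * np.linalg.inv((lam + mu) * I - Q)
    g = f.copy()
    for _ in range(int(20 * mu) + 20):  # k up to roughly mu * t_max
        g = M @ g
        rhs = np.maximum(rhs, g)

dominated = bool(np.all(rhs <= lhs + 1e-3))   # (3.27) never exceeds (3.28)
close = float(np.max(lhs - rhs)) < 0.02       # equality up to discretization
```

By (3.30) each vector (µR(λ+µ))^k f is a gamma-weighted average of the curve t ↦ e^{−λt}S(t)f, which is why the right-hand supremum can never exceed the left-hand one and approaches it as µ → ∞.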
Proof (Proof of Proposition 3.8). The representation in (3.23) shows that the term in (3.28) is dominated by the one in (3.27). The equality
(µR (λ + µ))^k f = (µ^k / (k − 1)!) ∫_0^∞ t^{k−1} e^{−(λ+µ)t} S(t)f dt, k ≥ 1, (3.30)
shows that the expression in (3.27) is less than or equal to the one in (3.28). Altogether this proves the equality of (3.27) and (3.28). If the function g ∈ D(L) is such that g ≥ f 1_K and (λI − L) g ≥ 0, then ((λ + µ) I − L) g ≥ µg, and hence
g ≥ µR (λ + µ) g ≥ (µR (λ + µ))^k g for all k ∈ N.

Consequently, the term in (3.25) dominates the one in (3.26). It also follows that the expression in (3.26) is greater than or equal to
sup_{K∈K(E)} sup {(µR (λ + µ))^k (f 1_K) : µ > 0, k ∈ N} . (3.31)
Since the operators (µR (λ + µ))^k, µ > 0 and k ∈ N, are Tβ-continuous, the expression in (3.31) is equal to the quantity in (3.27). Next we will show that the expression in (3.26) is less than or equal to the one in (3.25). To this end we choose an arbitrary compact subset K of E. Let g ∈ Cb (E) be a function with the following properties: g ≥ f 1_K, and g ≥ µR (λ + µ) g. Then for η > 0 arbitrarily small and α = α_η > 0 sufficiently large we have αR(α) (g + η) ≥ g1_K ≥ f 1_K. Moreover, the function g_{α,η} := αR(α) (g + η) belongs to D(L) and satisfies

g_{α,η} ≥ µR (λ + µ) g_{α,η} for all µ > 0. (3.32)

Here we employed the fact that D(L) is Tβ-dense in Cb (E). In fact we used the fact that, uniformly on the compact subset K, g + η = lim_{α→∞} αR(α) (g + η). From (3.32) we obtain
(λI − L) g_{α,η} = lim_{µ→∞} µ (I − µR (λ + µ)) g_{α,η} ≥ 0. (3.33)
From (3.33) we obtain the inequality:
inf_{g∈Cb(E)} {g ≥ f 1_K : g ≥ µR (λ + µ) g} ≥ inf_{g∈D(L)} {g ≥ f 1_K : g ≥ µR (λ + µ) g} . (3.34)
The inequality in (3.34) shows that the expression in (3.26) is less than or equal to the one in (3.25). Thus far we showed (3.25) = (3.26) ≥ (3.27) = (3.28). The final assertion, that the (continuous) function in (3.28) realizes the infimum in (3.26), being obvious, this concludes the proof of Proposition 3.8.
In the following theorem (Theorem 3.10) we use the following subspaces of the space Cb ([0, T ] × E):
C^{(1)}_{P,b} = {f ∈ Cb ([0, T ] × E) : all functions of the form (τ, x) ↦ ∫_τ^{τ+ρ} P (τ, σ) f (σ, ·) (x) dσ, ρ > 0, belong to D (D1)}; (3.35)
C^{(1)}_{P,b}(λ) = {f ∈ Cb ([0, T ] × E) : the function (τ, x) ↦ ∫_τ^∞ e^{−λσ} P (τ, σ) f (σ, ·) (x) dσ belongs to D (D1)}. (3.36)

Here λ > 0, and C^{(1)}_{P,b} is a limiting case for λ = 0. The inclusion C^{(1)}_{P,b} ⊂ ∩_{λ0>0} C^{(1)}_{P,b}(λ0) follows from the representation of R (λ0) as a Laplace transform:
R (λ0) f (τ, x) = ∫_0^∞ e^{−λ0 ρ} S(ρ)f (τ, x) dρ
= ∫_0^∞ e^{−λ0 ρ} P (τ, τ + ρ) f (τ + ρ, x) dρ
= λ0 ∫_0^∞ e^{−λ0 ρ} ∫_0^ρ P (τ, τ + σ) f (τ + σ, x) dσ dρ
= λ0 ∫_0^∞ e^{−λ0 ρ} ∫_τ^{τ+ρ} P (τ, σ) f (σ, x) dσ dρ. (3.37)
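The passage to the third line of (3.37) is an integration by parts, or equivalently Fubini's theorem applied to the positive kernel λ0 e^{−λ0 ρ}: writing g(σ) for the bounded measurable function σ ↦ P(τ, τ + σ) f(τ + σ, x), the step reads

```latex
\int_0^\infty e^{-\lambda_0 \rho}\, g(\rho)\, d\rho
  = \int_0^\infty g(\sigma) \int_\sigma^\infty \lambda_0 e^{-\lambda_0 \rho}\, d\rho\, d\sigma
  = \lambda_0 \int_0^\infty e^{-\lambda_0 \rho} \int_0^\rho g(\sigma)\, d\sigma\, d\rho ,
```

since ∫_σ^∞ λ0 e^{−λ0 ρ} dρ = e^{−λ0 σ}.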

From (3.37) we see that if for every ρ > 0 the function
(τ, x) ↦ ∫_0^ρ S(σ)f (τ, x) dσ = ∫_τ^{τ+ρ} P (τ, σ ∧ T ) f (σ ∧ T, x) dσ
belongs to D (D1), then so does the function (τ, x) ↦ R (λ0) f (τ, x), provided that the function ρ ↦ e^{−λ0 ρ} D1 ∫_0^ρ S(σ)f dσ is Tβ-integrable in the space Cb ([0, T ] × E). The other inclusion, i.e. ∩_{λ0>0} C^{(1)}_{P,b}(λ0) ⊂ C^{(1)}_{P,b}, follows from the following inversion formula:
∫_τ^{τ+ρ} P (τ, σ) f (σ, ·) (x) dσ = ∫_0^ρ S(σ)f (τ, x) dσ
= lim_{λ→∞} ∫_0^ρ e^{−σλ} e^{σλ^2 R(λ)} f (τ, x) dσ
= lim_{λ→∞} Σ_{k=0}^∞ (1/k!) ∫_0^ρ (σλ)^k e^{−σλ} (λR(λ))^k f (τ, x) dσ
= lim_{λ→∞} Σ_{k=0}^∞ (λ^{k+1} / (k + 1)!) ∫_0^ρ (σλ)^{k+1} e^{−σλ} (R(λ))^{k+1} f (τ, x) dσ
= lim_{λ→∞} Σ_{k=0}^∞ (λ^{k+1} / ((k + 1)! k!)) ∫_0^ρ (σλ)^{k+1} e^{−σλ} ∫_0^∞ ρ1^k e^{−λρ1} S (ρ1) f (τ, x) dρ1 dσ
= lim_{λ→∞} Σ_{k=0}^∞ ((−1)^k λ^{k+1} / ((k + 1)! k!)) ∫_0^ρ (σλ)^{k+1} e^{−σλ} (∂^k / (∂λ)^k) ∫_0^∞ e^{−λρ1} S (ρ1) f (τ, x) dρ1 dσ
= lim_{λ→∞} Σ_{k=0}^∞ ((−1)^k λ^{k+1} / ((k + 1)! k!)) ∫_0^ρ (σλ)^{k+1} e^{−σλ} (∂^k / (∂λ)^k) R(λ)f (τ, x) dσ, (3.38)

where the limits have to be taken in the Tβ-sense. A similar limit representation is valid for D1 ∫_0^ρ S(ρ)f dρ (τ, x), provided that the family
{(λ^{k+1} / k!) D1 R(λ)^k f : λ > 0, k ∈ N}
is uniformly bounded.

is uniformly bounded. A simpler approach might be to use a complex inversion


formula:
Z ρ
(τ + ρ − σ) P (τ, (τ + σ) ∧ T ) f ((τ + σ) ∧ T, ·) (x)dσ
0
Z ρ Z ρ1 Z ω+i∞
1 1
= S(σ)f (τ, x)dσ dρ1 = eρλ R(λ)f (τ, x)dλ, (3.39)
0 0 2πi ω−i∞ λ2

and to assume that, for ω > 0, the family {λ D1 R(λ)f : Re λ ≥ ω} is uniformly bounded. It is clear that the operator R(λ), Re λ > 0, stands for
R(λ)f (τ, x) = ∫_0^∞ e^{−λρ} S(ρ)f (τ, x) dρ = ∫_0^∞ e^{−λρ} P (τ, (τ + ρ) ∧ T ) f ((τ + ρ) ∧ T, ·) (x) dρ, f ∈ Cb ([0, T ] × E) . (3.40)
0

It is also clear that the family of operators in (3.38) is a once integrated semigroup, and that the family in (3.39) is a twice integrated semigroup. In order to justify the inclusion ∩_{λ0>0} C^{(1)}_{P,b}(λ0) ⊂ C^{(1)}_{P,b} in both approaches we need to know that the functions λ ↦ R(λ)f and λ ↦ D1 R(λ)f are real analytic. For more details on inversion formulas for vector-valued Laplace transforms and integrated semigroups see e.g. Bobrowski [37], Chojnacki [57], Arendt [8], Arendt et al. [9], and Miana [163]. For vector-valued Laplace transforms the reader is also referred to Bäumer and Neubrander [26].
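The role of the factor λ^{−2} in (3.39) can be read off from the Laplace transforms of the once and twice integrated semigroups: for Re λ > 0 one has, at least formally,

```latex
\int_0^\infty e^{-\lambda\rho} \Bigl( \int_0^\rho S(\sigma) f \, d\sigma \Bigr) d\rho
   = \frac{R(\lambda) f}{\lambda},
\qquad
\int_0^\infty e^{-\lambda\rho} \Bigl( \int_0^\rho (\rho-\sigma)\, S(\sigma) f \, d\sigma \Bigr) d\rho
   = \frac{R(\lambda) f}{\lambda^2},
```

so the Bromwich integral in (3.39) inverts the second of these transforms.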
Theorem 3.10. Let L be a linear operator with domain D(L) and range R(L) in Cb ([0, T ] × E). Suppose that there exists λ > 0 such that the operator D1 + L is sequentially λ-dominant in the sense of Definition 3.6. Under such a hypothesis the following assertions are equivalent:
(i) The operator L is Tβ-closable, its Tβ-closure generates a Feller evolution, the operator D1 + L is Tβ-densely defined, and there exists λ0 > 0 such that the subspace C^{(1)}_{P,b}(λ0) is Tβ-dense in Cb ([0, T ] × E).
(ii) The operator D1 + L is Tβ-closable and its Tβ-closure generates a Tβ-continuous Feller semigroup in Cb ([0, T ] × E).
(iii) The operator D1 + L is Tβ-densely defined, is power Tβ-dissipative, satisfies the maximum principle, and there exists λ0 > 0 such that the range of λ0 I − D1 − L is Tβ-dense in Cb ([0, T ] × E).
Here we call the operator D1 + L power Tβ-dissipative if for some λ0 ≥ 0 and for every k ∈ N there exists a Tβ-dense subspace D_k of Cb ([0, T ] × E) such that for every u ∈ H^+ ([0, T ] × E) there exists v ∈ H^+ ([0, T ] × E) for which the following inequality holds:
λ^k ‖uf‖_∞ ≤ ‖v (λI − D1 − L)^k f‖_∞ for all f ∈ D_k and all λ ≥ λ0. (3.41)

If the operator D1 + L is just Tβ-dissipative, then an inequality of the form (3.41) holds with a function v ∈ H^+ ([0, T ] × E) which depends on k. In (3.41) the function v only depends on u (and the operator D1 + L); it depends neither on k, nor on f ∈ D_k or λ ≥ 1. Let the operator L^{(1)} be an extension of D1 + L which generates a Tβ-continuous semigroup {S0(t) : t ≥ 0}, and suppose that D1 + L satisfies (3.41). Then this semigroup is equi-continuous in the sense that for every u ∈ H^+ ([0, T ] × E) there exists v ∈ H^+ ([0, T ] × E) for which the following inequality holds:
‖u S0(t)f‖_∞ ≤ ‖vf‖_∞ for all f ∈ Cb ([0, T ] × E) and all t ∈ [0, ∞). (3.42)

A closely related inequality is the following one:
‖u (λR(λ))^k f‖_∞ ≤ ‖vf‖_∞, λ ≥ 1, f ∈ Cb ([0, T ] × E) . (3.43)


Notice that (3.43) is equivalent to (3.41) provided that the operator L^{(1)} is the Tβ-closure of D1 + L and the ranges of λI − L^{(1)}, λ > 0, coincide with Cb ([0, T ] × E). In fact the semigroup {S0(t) : t ≥ 0} and the resolvent family {R(λ) : λ > 0} are related as follows:
(λR (λ))^k f = (λ^k / (k − 1)!) ∫_0^∞ t^{k−1} e^{−λt} S0(t)f dt, and (3.44)
S0(t)f = Tβ-lim_{λ→∞} e^{−λt} Σ_{k=0}^∞ ((λt)^k / k!) (λR(λ))^k f. (3.45)
The integral in (3.44) has to be interpreted in Tβ-sense. From (3.44) and (3.45) the equivalence of (3.42) and (3.43) easily follows. We also observe that (3.43) is equivalent to the following statement. For every sequence (f_n)_{n∈N} ⊂ Cb ([0, T ] × E) which decreases pointwise to zero it follows that
inf_{n∈N} sup_{λ≥1, k∈N} (λR(λ))^k f_n = 0.

3.2 Dissipative operators and maximum principle


In the following proposition we collect some of the interrelationships which exist between the concepts of closability, dissipativity, and the maximum principle. A reformulation of assertion (f) in Proposition 3.11 can be found in Lemma 7.3 in Chapter 7.
Proposition 3.11. (a1) Suppose that the operator L is dissipative and that its range is contained in the closure of its domain. Then the operator L is closable.
(a2) Suppose that the operator L is Tβ-dissipative and that its range is contained in the Tβ-closure of its domain. Then the operator L is Tβ-closable.
(b) If the operator L satisfies the maximum principle, then
sup_{x∈E} Re (λf (x) − Lf (x)) ≥ λ sup_{x∈E} Re f (x), for all λ > 0 and for all f ∈ D(L). (3.46)
Conversely, if L satisfies (3.46), then the operator L satisfies the maximum principle. The inequality in (3.46) is equivalent to
inf_{x∈E} Re (λf (x) − Lf (x)) ≤ λ inf_{x∈E} Re f (x), for all λ > 0 and for all f ∈ D(L). (3.47)
(c) If the operator L satisfies the maximum principle, then L is dissipative.
(d) If the operator L satisfies the maximum principle, and if f ∈ D(L) is such that λf − Lf ≥ 0 for some λ > 0, then f ≥ 0.
(e) If the operator L is dissipative, then
‖λf − Lf‖_∞ ≥ (Re λ) ‖f‖_∞, for all λ with Re λ > 0 and for all f ∈ D(L). (3.48)
112 3 Space-time operators

(f) The operator L is dissipative if and only if for every f ∈ D(L) there exists a sequence (x_n)_{n∈N} ⊂ E such that lim_{n→∞} |f (x_n)| = ‖f‖_∞, and lim_{n→∞} Re ( \overline{f (x_n)} Lf (x_n) ) ≤ 0.
(g) If the operator L is positive Tβ-dissipative, then it is Tβ-dissipative.
For the definitions of the notions positive Tβ-dissipative and Tβ-dissipative, the reader is referred to Definition 3.5. The same is true for the other notions occurring in Proposition 3.11.
Proof. (a1) Let (f_n)_{n∈N} ⊂ D(L) be any sequence with the following properties: lim_{n→∞} f_n = 0, and g = lim_{n→∞} Lf_n exists in Cb (E). Then we consider
‖(λf_n + g_m) − λ^{−1} L (λf_n + g_m)‖_∞ ≥ ‖λf_n + g_m‖_∞,
where (g_m)_{m∈N} ⊂ D(L) converges to g. First we let n tend to infinity, then λ, and finally m. This limiting procedure results in
lim_{m→∞} ‖g_m − g‖_∞ ≥ lim_{m→∞} ‖g_m‖_∞ = ‖g‖_∞.
Hence g = 0.
(a2) Let (f_n)_{n∈N} ⊂ D(L) be any sequence with the following properties: Tβ-lim_{n→∞} f_n = 0, and g = Tβ-lim_{n→∞} Lf_n exists in Cb (E). Let u ∈ H^+(E) be given and let the function v be as in (3.14). Then we consider
‖v ((λf_n + g_m) − λ^{−1} L (λf_n + g_m))‖_∞ ≥ ‖u (λf_n + g_m)‖_∞, (3.49)
where (g_m)_{m∈N} ⊂ D(L) Tβ-converges to g. First we let n tend to infinity, then λ, and finally m. The result will be
lim_{m→∞} ‖v g_m − v g‖_∞ ≥ lim_{m→∞} ‖u g_m‖_∞ = ‖u g‖_∞,
and hence g = 0.
(b) Let f ∈ D(L), and choose a sequence (x_n)_{n∈N} ⊂ E as in (3.12). Then we have
sup_{x∈E} Re (λf (x) − Lf (x)) ≥ lim_{n→∞} Re (λf (x_n) − Lf (x_n)) ≥ λ sup_{x∈E} Re f (x),
which is the same as (3.46). Suppose that the operator L satisfies (3.46). Then for every λ > 0 we choose x_λ ∈ E such that
λ Re f (x_λ) − Re Lf (x_λ) ≥ λ sup_{x∈E} Re f (x) − 1/λ. (3.50)
From (3.50) we infer:
Re Lf (x_λ) ≤ 1/λ, and (3.51)
sup_{x∈E} Re f (x) ≤ Re f (x_λ) + 1/λ^2 − (1/λ) Re Lf (x_λ). (3.52)
From (3.51) we see that lim sup_{λ→∞} Re Lf (x_λ) ≤ 0, and from (3.52) it follows that lim sup_{λ→∞} Re f (x_λ) = sup_{x∈E} Re f (x). From these observations it is easily seen that (3.46) implies the maximum principle. The substitution f → −f shows that (3.47) is a consequence of (3.46).
(c) Let f ≠ 0 belong to D(L), and choose α ∈ R and a sequence (x_n)_{n∈N} ⊂ E in such a way that 0 < ‖f‖_∞ = lim_{n→∞} Re (e^{iα} f (x_n)) = sup_{x∈E} Re (e^{iα} f (x)), and that lim_{n→∞} Re L (e^{iα} f) (x_n) ≤ 0. Then
‖λf − Lf‖_∞ ≥ lim_{n→∞} Re (e^{iα} (λf − Lf) (x_n)) = lim_{n→∞} (λ Re (e^{iα} f (x_n)) − Re (e^{iα} Lf (x_n))) ≥ λ ‖f‖_∞. (3.53)
The inequality in (3.53) means that L is dissipative in the sense of Definition 3.5.
(d) Let f ∈ D(L) be such that for some λ > 0, λf (x) − Lf (x) ≥ 0 for all x ∈ E. From (3.47) in (b) we see that
λ inf_{x∈E} Im f (x) = λ inf_{x∈E} Re (−if)(x) ≥ inf_{x∈E} Re (λ(−if)(x) − L(−if)(x)) = inf_{x∈E} Im (λf (x) − Lf (x)) = 0. (3.54)
From (3.54) we get Im f ≥ 0. If we apply the same argument to −f instead of f we get Im f ≤ 0. Hence Im f ≡ 0, and so the function f is real-valued. But then we have
0 ≤ inf_{x∈E} (λf (x) − Lf (x)) ≤ λ inf_{x∈E} f (x),
and consequently f ≥ 0.
(e) From the proof below it follows that L is dissipative if and only if for every f ∈ D(L) there exists an element x^* in the dual space of Cb ([0, T ] × E) such that ‖x^*‖ = 1, such that ⟨f, x^*⟩ = ‖f‖_∞, and such that Re ⟨Lf, x^*⟩ ≤ 0. A proof of all this runs as follows. Let L be dissipative. Fix f in D(L) and choose for each λ > 0 an element x^*_λ in the dual space of Cb ([0, T ] × E) in such a way that ‖x^*_λ‖ ≤ 1 and
‖λf − Lf‖_∞ = ⟨λf − Lf, x^*_λ⟩. (3.55)
Choose an element x^* in the intersection ∩_{µ>0} weak^*-closure {x^*_λ : λ > µ}. Since the dual unit ball of Cb ([0, T ] × E) is weak^*-compact, such an element x^* exists. From (3.55) it follows that
Re ⟨Lf, x^*_λ⟩ = λ Re ⟨f, x^*_λ⟩ − ‖λf − Lf‖_∞ ≤ λ ‖f‖_∞ − ‖λf − Lf‖_∞ ≤ 0, λ > 0. (3.56)
Here we used the fact that L is supposed to be dissipative. From (3.55) we also obtain the equality
⟨f, x^*_λ⟩ = ‖f − λ^{−1} Lf‖_∞ + λ^{−1} ⟨Lf, x^*_λ⟩, λ > 0. (3.57)
Since x^* is a weak^* limit point of {x^*_λ : λ > µ} for each µ > 0, it follows from (3.56) and (3.57) that
Re ⟨Lf, x^*⟩ ≤ 0, and (3.58)
⟨f, x^*⟩ = ‖f‖_∞, ‖x^*‖ ≤ 1. (3.59)
Finally pick λ ∈ C with Re λ > 0. From (3.58) and (3.59) we infer
‖λf − Lf‖_∞ ≥ Re ⟨λf − Lf, x^*⟩ = Re (λ ⟨f, x^*⟩) − Re ⟨Lf, x^*⟩ ≥ Re (λ ‖f‖_∞) − 0 = (Re λ) ‖f‖_∞. (3.60)

(f) If L is dissipative and if f ∈ D(L), then there exists a family (x_λ)_{λ>0} ⊂ E such that
|λf (x_λ) − Lf (x_λ)| ≥ λ ‖f‖_∞ − ‖Lf‖_∞ / λ. (3.61)
From (3.61) we infer
λ |f (x_λ)| + ‖Lf‖_∞ ≥ λ ‖f‖_∞ − ‖Lf‖_∞ / λ, (3.62)
and
λ^2 |f (x_λ)|^2 − 2λ Re ( \overline{f (x_λ)} Lf (x_λ) ) + |Lf (x_λ)|^2 ≥ λ^2 ‖f‖_∞^2 − 2 ‖f‖_∞ ‖Lf‖_∞ + ‖Lf‖_∞^2 / λ^2. (3.63)
From (3.62) and (3.63) we easily infer
|f (x_λ)| ≥ ‖f‖_∞ − ‖Lf‖_∞ / λ − ‖Lf‖_∞ / λ^2, (3.64)
and
λ^2 ‖f‖_∞^2 − 2λ Re ( \overline{f (x_λ)} Lf (x_λ) ) + ‖Lf‖_∞^2 ≥ λ^2 ‖f‖_∞^2 − 2 ‖f‖_∞ ‖Lf‖_∞ + ‖Lf‖_∞^2 / λ^2. (3.65)
From (3.65) we get
Re ( \overline{f (x_λ)} Lf (x_λ) ) ≤ ‖f‖_∞ ‖Lf‖_∞ / λ + (1 / (2λ)) (1 − 1/λ^2) ‖Lf‖_∞^2. (3.66)
From (3.66) we obtain lim sup_{λ→∞} Re ( \overline{f (x_λ)} Lf (x_λ) ) ≤ 0. From (3.64) we see lim_{λ→∞} |f (x_λ)| = ‖f‖_∞. By passing to a countable sub-family we see that there exists a sequence (x_n)_{n∈N} ⊂ E such that lim_{n→∞} |f (x_n)| = ‖f‖_∞ and such that the limit lim_{n→∞} Re ( \overline{f (x_n)} Lf (x_n) ) exists and is ≤ 0.
The proof of the converse statement is (much) easier. Fix f ∈ D(L), and let (x_n)_{n∈N} ⊂ E be a sequence such that lim_{n→∞} |f (x_n)| = ‖f‖_∞ and such that the limit lim_{n→∞} Re ( \overline{f (x_n)} Lf (x_n) ) exists and is ≤ 0. Then we have
‖λf − Lf‖_∞^2 ≥ λ^2 |f (x_n)|^2 − 2λ Re ( \overline{f (x_n)} Lf (x_n) ) + |Lf (x_n)|^2 ≥ λ^2 |f (x_n)|^2 − 2λ Re ( \overline{f (x_n)} Lf (x_n) ). (3.67)
From the properties of the sequence (x_n)_{n∈N} and (3.67) we obtain the inequality ‖λf − Lf‖_∞ ≥ λ ‖f‖_∞, λ > 0, f ∈ D(L), which is the same as saying that L is dissipative.
(g) Let the functions u and v ∈ H^+(E) be as in the definition of positive Tβ-dissipativity, let f ∈ D(L), and let λ ≥ λ0. Then we have
‖v (λf − Lf)‖_∞ = sup_{ϑ∈[−π,π]} sup_{x∈E} v(x) Re (λ (e^{iϑ} f)(x) − L (e^{iϑ} f)(x))
(L is positive Tβ-dissipative)
≥ λ sup_{ϑ∈[−π,π]} sup_{x∈E} u(x) Re (e^{iϑ} f)(x) = λ ‖uf‖_∞. (3.68)
The inequality in (3.68) shows the Tβ-dissipativity of the operator L.
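The "only if" direction of assertion (f) is transparent on a finite state space: for a hypothetical Markov generator Q one has, at a maximizer x_n of |f|, Re( \overline{f(x_n)} Qf(x_n) ) = Σ_m q_{x_n m} ( Re(\overline{f(x_n)} f(m)) − |f(x_n)|^2 ) ≤ 0, by the Cauchy-Schwarz bound Re(\overline{f(x_n)} f(m)) ≤ |f(x_n)||f(m)|. A toy check:

```python
import numpy as np

# Toy check of the "only if" part of assertion (f) for a hypothetical Markov
# generator Q: at a maximizer x_n of |f| one has
#   Re( conj(f(x_n)) Qf(x_n) )
#     = sum_m q_{x_n m} ( Re(conj(f(x_n)) f(m)) - |f(x_n)|^2 ) <= 0.
Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])

rng = np.random.default_rng(1)
ok = True
for _ in range(200):
    f = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    n = int(np.argmax(np.abs(f)))       # sup |f| is attained on a finite set
    ok = ok and (np.conj(f[n]) * (Q @ f)[n]).real <= 1e-12
```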

Proof (Proof of Theorem 3.10). (i) =⇒ (ii). Let L̄ be the Tβ-closure of L. Then there exists a Feller evolution {P (s, t) : 0 ≤ s ≤ t ≤ T } such that
(d/dt) P (τ, t) f (t, ·) (x) = P (τ, t) (D1 + L̄(t)) f (t, ·) (x), (3.69)
for all functions f ∈ D^{(1)}(L̄), 0 ≤ τ ≤ t ≤ T , x ∈ E. The functions f ∈ D^{(1)}(L̄) have the property that for every ρ ∈ [0, T ] the following Tβ-limits exist:
(a) L̄(ρ)f (ρ, ·) (x) = Tβ-lim_{h↓0} (P (ρ, ρ + h) f (ρ, ·) (x) − f (ρ, x)) / h.
(b) (∂/∂ρ) f (ρ, x) = Tβ-lim_{h→0} (f (ρ + h, x) − f (ρ, x)) / h.

As indicated, the limits in (a) and (b) have to be interpreted in the Tβ-sense. Moreover, these functions, as functions of the pair (ρ, x), are supposed to be continuous. The equality in (3.69) was introduced in Definition 1.31. However, the reader is also referred to Proposition 3.1, and to equality (3.2). The equality in (3.69) can also be written in integral form:
P (τ, t) f (t, ·) (x) − f (τ, x) = ∫_τ^t P (τ, ρ) ((∂/∂ρ) + L̄(ρ)) f (ρ, ·) (x) dρ (3.70)
for f ∈ D^{(1)}(L̄), 0 ≤ τ ≤ t ≤ T , x ∈ E. The Feller evolution is Tβ-equi-continuous. This means that for every u ∈ H^+ ([0, T ] × E) there exists v ∈ H^+(E) such that the inequality
sup_{0≤τ≤t≤T} sup_{x∈E} |u(τ, x) P (τ, t) f (·) (x)| ≤ sup_{x∈E} |v(x)f (x)| (3.71)
holds for all f ∈ Cb (E). As was explained in Corollary 2.6, the Feller evolution {P̃ (τ, t) : 0 ≤ τ ≤ t ≤ T }, which is the same as {P (τ, t) : 0 ≤ τ ≤ t ≤ T } considered as a family of operators on Cb ([0, T ] × E), is Tβ-equi-continuous as well: see Corollary 1.19. As in (3.5) we define the Tβ-equi-continuous semigroup {S(ρ) : ρ ≥ 0} by
S(ρ)f (t, x) = P (t, (ρ + t) ∧ T ) f ((ρ + t) ∧ T, ·) (x), f ∈ Cb ([0, T ] × E) , (3.72)
where ρ ≥ 0, and (t, x) ∈ [0, T ] × E. Then the semigroup in (3.72) is Tβ-equi-continuous. In fact we have
sup_{(τ,x)∈[0,T]×E} |u(τ, x) S(t)f (τ, x)| ≤ sup_{(τ,x)∈[0,T]×E} |v(x)f (τ, x)|, (3.73)
where u ∈ H^+ ([0, T ] × E) and v ∈ H^+(E) are as in (3.71). Let L^{(1)} be its generator, and let R(λ)f = ∫_0^∞ e^{−λρ} S(ρ)f dρ, f ∈ Cb ([0, T ] × E), be its resolvent. Then we will prove that L^{(1)} is the Tβ-closure of D1 + L, and we will also show the following well-known equalities (compare with (2.150)):
(λI − L^{(1)}) R(λ)f = f, f ∈ Cb ([0, T ] × E) ,
R(λ) (λI − L^{(1)}) f = f, f ∈ D(L^{(1)}). (3.74)

In order to understand the relationship between D1 + L and the Tβ-generator of the semigroup {S(ρ) : ρ ≥ 0} we consider, for h > 0 and λ > 0, the operators L^{(1)}_{λ,h} and ϑ_h L^{(1)}_{λ,h}, which are defined by
L^{(1)}_{λ,h} f (τ, x) = (1/h) (I − e^{−λh} S(h)) f (τ, x) (3.75)
= (1/h) (f (τ, x) − e^{−λh} P (τ, (τ + h) ∧ T ) f ((τ + h) ∧ T, ·) (x))
= (1/h) (f (τ, x) − e^{−λh} P (τ, (τ + h) ∧ T ) f (τ, ·) (x)) − (1/h) e^{−λh} P (τ, (τ + h) ∧ T ) (f ((τ + h) ∧ T, ·) − f (τ, ·)) (x)
= (λI − L^{(1)}) (1/h) ∫_0^h e^{−λρ} S(ρ)f dρ (τ, x)

and

ϑ_h L^{(1)}_{λ,h} f (τ, x) = (1/h) ((I − e^{−λh} S(h)) f) ((τ − h) ∧ T ∨ 0, x) (3.76)
= (1/h) (f ((τ − h) ∧ T ∨ 0, x) − f (τ, x)) − (1/h) (e^{−λh} P ((τ − h) ∧ T ∨ 0, τ ) f (τ, ·) (x) − f (τ, x)).
The operator ϑh : Cb ([0, T ] × E) → Cb ([0, T ] × E) is defined by

ϑh f (τ, x) = f ((τ − h) ∧ T ∨ 0, x) , f ∈ Cb ([0, T ] × E) . (3.77)

Since
L^{(1)}_{λ,h} R(λ)f = R(λ) L^{(1)}_{λ,h} f = (1/h) ∫_0^h e^{−λρ} S(ρ)f dρ, f ∈ Cb ([0, T ] × E) , (3.78)
and L^{(1)} is the Tβ-generator of the semigroup {S(ρ) : ρ ≥ 0}, the equalities in (3.74) follow from (3.78). Since
ϑ_h L^{(1)}_{λ,h} R(λ)f (τ, x) = L^{(1)}_{λ,h} R(λ)f ((τ − h) ∧ T ∨ 0, x) ,
it also follows that
Tβ-lim_{h↓0} ϑ_h L^{(1)}_{λ,h} R(λ)f (τ, x) = (λI − L^{(1)}) R(λ)f (τ, x) . (3.79)

A consequence of (3.79) and the second equality in (3.76) is that
(λI − L^{(1)}) f (τ, x) = lim_{h↓0} ((1/h) (f ((τ − h) ∧ T ∨ 0, x) − f (τ, x)) − (1/h) (e^{−λh} P ((τ − h) ∧ T ∨ 0, τ ) f (τ, ·) (x) − f (τ, x))) (3.80)
= lim_{h↓0} ((1/h) (f (τ, x) − e^{−λh} P (τ, (τ + h) ∧ T ) f (τ, ·) (x)) − (1/h) (f ((τ + h) ∧ T, ·) − f (τ, ·)) (x)). (3.81)
These limits exist in the strict sense, i.e. in the Tβ-topology. If f ∈ D(L^{(1)}), and if f belongs to D (D1), then (3.80) and (3.81) imply that f ∈ D(L̄), that

L̄(f )(τ, x) = lim_{h↓0} (1/h) (P (τ, τ + h) f (τ, ·) (x) − f (τ, x)) = lim_{h↓0} (1/h) (P (τ − h, τ ) f (τ, ·) (x) − f (τ, x)) , (3.82)
and that
L^{(1)} f = L̄f + D1 f. (3.83)
Hence, in principle, the first term on the right-hand side of (3.80) converges to the negative of the time derivative of the function f , and the second to (λI − L̄) f . The following arguments make this more precise. We will need the fact that the subspace C^{(1)}_{P,b} is Tβ-dense in Cb ([0, T ] × E). In order to prove that, under certain conditions, the operator L^{(1)} is the closure of D1 + L, we consider for f ∈ D(L^{(1)}) and 0 ≤ a ≤ b ≤ T the following equality:
∫_a^b ϑ_ρ S(ρ) L^{(1)} f (τ, x) dρ = ∫_a^b S(ρ) L^{(1)} f ((τ − ρ) ∨ 0, x) dρ
= (∂/∂τ) ∫_a^b ϑ_ρ S(ρ) f (τ, x) dρ + P ((τ − b) ∨ 0, τ ) f (τ, x) − P ((τ − a) ∨ 0, τ ) f (τ, x). (3.84)
We first prove the equality in (3.84). Therefore we write
(∂/∂τ) ∫_a^b ϑ_ρ S(ρ) f (τ, x) dρ + P ((τ − b) ∨ 0, τ ) f (τ, x) − P ((τ − a) ∨ 0, τ ) f (τ, x)
= (∂/∂τ) ∫_{τ−b}^{τ−a} S(τ − ρ) f (ρ ∨ 0, x) dρ + P ((τ − b) ∨ 0, τ ) f (τ, x) − P ((τ − a) ∨ 0, τ ) f (τ, x)
(the function f belongs to D(L^{(1)}))
= ∫_{τ−b}^{τ−a} S(τ − ρ) L^{(1)} f (ρ ∨ 0, x) dρ + S(a) f ((τ − a) ∨ 0, x) − S(b) f ((τ − b) ∨ 0, x) + P ((τ − b) ∨ 0, τ ) f (τ, x) − P ((τ − a) ∨ 0, τ ) f (τ, x)
= ∫_a^b S(ρ) L^{(1)} f ((τ − ρ) ∨ 0, x) dρ = ∫_a^b ϑ_ρ S(ρ) L^{(1)} f (τ, x) dρ. (3.85)

The equality in (3.85) shows (3.84). In the same manner the following equality can be proved for λ > 0 and f ∈ D(L^{(1)}):
λ ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) L^{(1)} f dρ = λ D1 ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) f dρ + λ^2 ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) f dρ − λf. (3.86)
As above, let f ∈ D(L^{(1)}). From (3.86) we infer that
L^{(1)} f = Tβ-lim_{λ→∞} (λ D1 ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) f dρ + λ^2 ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) f dρ − λf). (3.87)
If, in addition, f belongs to the domain of D1, then it also belongs to D(L̄), and
L̄f = Tβ-lim_{λ→∞} (λ^2 ∫_0^∞ e^{−λρ} ϑ_ρ S(ρ) f dρ − λf) = Tβ-lim_{λ→∞} (λ^2 ∫_0^∞ e^{−λρ} S(ρ) ϑ_ρ f dρ − λf). (3.88)

The second equality in (3.88) follows from (3.87). So far the result is not conclusive. To finish the proof of the implication (i) =⇒ (ii) of Theorem 3.10 we will use the hypothesis that the space C^{(1)}_{P,b}(λ0) is Tβ-dense for some λ0 > 0. In addition, we will use the following identity for a function f in the domain of the time derivative D1:
λ L^{(1)} ∫_0^∞ e^{−λρ} S(ρ) ϑ_ρ f dρ = λ^2 ∫_0^∞ e^{−λρ} S(ρ) ϑ_ρ f dρ − λf + λ (λI − L^{(1)}) ∫_0^∞ e^{−λρ} S(ρ) (I − ϑ_ρ) f dρ
= λ^2 ∫_0^∞ e^{−λρ} S(ρ) ϑ_ρ f dρ − λf + λ ∫_0^∞ e^{−λρ} S(ρ) ϑ_ρ D1 f dρ. (3.89)

However, this is not the best approach either. The following arguments will show that the Tβ-density of C^{(1)}_{P,b}(λ0) in Cb ([0, T ] × E) entails that D^{(1)}(L) = D(L̄) ∩ D (D1) is a core for the operator L^{(1)}. From (3.83) it follows that D^{(1)}(L) ⊂ D(L^{(1)}). From (3.87), (3.88), and (3.89) we also get D(L^{(1)}) ∩ D (D1) = D(L̄) ∩ D (D1). Fix λ0 > 0 such that the space C^{(1)}_{P,b}(λ0) is Tβ-dense in Cb ([0, T ] × E). Since R (λ0 I − L̄ − D1) = C^{(1)}_{P,b}(λ0), this hypothesis has as a consequence that the range of the operator λ0 I − L̄ − D1 is Tβ-dense in Cb ([0, T ] × E). The Tβ-dissipativity of the operator L^{(1)} then implies that the subspace D(L̄) ∩ D (D1) is a core for the operator L^{(1)}, and consequently, the closure of the operator L̄ + D1 coincides with L^{(1)}. We will show all this. Since the operator L^{(1)} generates a Feller semigroup, the same is true for the closure of L̄ + D1. The range of λ0 I − L̄ − D1 coincides with the subspace C^{(1)}_{P,b}(λ0) defined in (3.36). It is easy to see that
C^{(1)}_{P,b}(λ0) = {f ∈ Cb ([0, T ] × E) : R (λ0) f = ∫_0^∞ e^{−λ0 ρ} S(ρ)f dρ ∈ D (D1)}. (3.90)
If $f \in C^{(1)}_{P,b}(\lambda_0)$, then $f = \left( \lambda_0 I - L^{(1)} \right) R(\lambda_0) f$, where
\[
R(\lambda_0) f \in D\left(L^{(1)}\right) \cap D(D_1) = D(L) \cap D(D_1), \tag{3.91}
\]
as was shown in (3.87) and (3.88). It follows that $f \in C^{(1)}_{P,b}(\lambda_0)$ can be written as
\[
f = (\lambda_0 I - L - D_1) R(\lambda_0) f. \tag{3.92}
\]
By (i) the range of $\lambda_0 I - L - D_1$ is $T_\beta$-dense in $C_b([0,T] \times E)$. The second equality in (3.277) follows from (3.91) and (3.92). Let $f$ belong to the $T_\beta$-closure of the range of $\lambda_0 I - L - D_1$. Then there exists a net $(g_\alpha)_{\alpha \in A} \subset D(L) \cap D(D_1)$, with $(\lambda_0 I - L - D_1) g_\alpha \in C^{(1)}_{P,b}(\lambda_0) \subset C_b([0,T] \times E)$, such that $f = \lim_\alpha (\lambda_0 I - L - D_1) g_\alpha$. From (3.14) we infer that $g = T_\beta\text{-}\lim_\alpha g_\alpha$ exists. Since the $T_\beta$-closed linear operator $L^{(1)}$ extends $L + D_1$, it follows that $L + D_1$ is $T_\beta$-closable. Let $L_0$ be its $T_\beta$-closure. From (3.14) it also follows that $f = (\lambda_0 I - L_0) g$. Since the range of $\lambda_0 I - L - D_1$ is $T_\beta$-dense, we see that $R(\lambda_0 I - L_0) = C_b([0,T] \times E)$. Next let $g \in D\left(L^{(1)}\right)$. Then there exists $g_0 \in D(L_0)$ such that $\left( \lambda_0 I - L^{(1)} \right) g = (\lambda_0 I - L_0) g_0$. Since $L^{(1)}$ extends $L_0$, and since $L^{(1)}$ is dissipative (see (3.53)), it follows that $g = g_0 \in D(L_0)$. In other words, the operator $L_0$ coincides with $L^{(1)}$, and consequently, the operator $L + D_1$ is $T_\beta$-closable, and its closure coincides with $L^{(1)}$, the $T_\beta$-generator of the semigroup $\{S(\rho) : \rho \ge 0\}$. This proves the implication (i) $\Longrightarrow$ (ii) of Theorem 3.10.
(ii) $\Longrightarrow$ (iii). Let $L^{(2)}$ be the closure of the operator $D_1 + L$. From (ii) we know that $L^{(2)}$ generates a $T_\beta$-continuous semigroup $\{S_2(\rho) : \rho \ge 0\}$. Since $D\left(L^{(2)}\right)$ is $T_\beta$-dense, it follows that $D^{(1)}(L) = D(D_1) \cap D(L)$ is $T_\beta$-dense as well. The generator of the $T_\beta$-continuous semigroup $\{S(\rho) : \rho \ge 0\}$, which we denote by $L^{(1)}$, extends $D_1 + L$, and hence it also extends $L^{(2)}$. Since $L^{(2)}$ generates a Feller semigroup, it is dissipative, and so it satisfies (3.53). Let $g \in D\left(L^{(1)}\right)$, and choose $g_0 \in D\left(L^{(2)}\right)$ such that
\[
\left( \lambda_0 I - L^{(1)} \right) g = \left( \lambda_0 I - L^{(1)} \right) g_0 = \left( \lambda_0 I - L^{(2)} \right) g_0.
\]
The inequality in (3.53) implies that $g = g_0 \in D\left(L^{(2)}\right)$, and hence $D\left(L^{(2)}\right) = D\left(L^{(1)}\right)$. Moreover, $L^{(1)}$ extends $L^{(2)}$. Therefore $L^{(2)} = L^{(1)}$. It also follows that the semigroup $\{S_2(\rho) : \rho \ge 0\}$ is the same as $\{S(\rho) : \rho \ge 0\}$. Moreover, there exists $\lambda_0 > 0$ such that the range of $\lambda_0 I - D_1 - L$ is $T_\beta$-dense in $C_b([0,T] \times E)$. In fact this is true for all $\lambda$ with $\Re\lambda > 0$. Finally, we will show that the operator $D_1 + L$ is positive $T_\beta$-dissipative. Let $u \in H^+([0,T] \times E)$, and consider the functionals $f \mapsto u(\tau,x)\, \lambda R(\lambda) f(\tau,x)$, $\lambda \ge \lambda_0 > 0$, $(\tau,x) \in [0,T] \times E$. Since $L^{(1)}$ generates a $T_\beta$-continuous semigroup we know that
\[
\lim_{\lambda\to\infty} \left\| u \left( f - \lambda R(\lambda) f \right) \right\|_\infty = 0. \tag{3.93}
\]

If $(f_m)_{m\in\mathbb{N}} \subset C_b([0,T] \times E)$ decreases pointwise to $0$, then the sequence $(u(\tau,x)\, \lambda R(\lambda) f_m(\tau,x))_{m\in\mathbb{N}}$ also decreases to $0$. By Dini's lemma and (3.93) this convergence is uniform in $\lambda \ge \lambda_0$ and $(\tau,x) \in [0,T] \times E$, because $u \in H^+([0,T] \times E)$. From Theorem 1.8 it follows that there exists a function $v \in H^+([0,T] \times E)$ such that
\[
\| u \lambda R(\lambda) f \|_\infty \le \| v f \|_\infty, \quad f \in C_b([0,T] \times E). \tag{3.94}
\]
Since the operator $L^{(1)}$ sends real functions to real functions, from (3.94) and $u \ge 0$ we derive for $(\sigma,y) \in [0,T] \times E$
\[
\begin{aligned}
\Re\left( u(\sigma,y)\, \lambda R(\lambda) f(\sigma,y) \right) &= u(\sigma,y)\, \lambda R(\lambda) (\Re f)(\sigma,y) \\
&\le u(\sigma,y)\, \lambda R(\lambda) (\Re f)^+(\sigma,y) \\
&\le \sup_{(\tau,x)\in[0,T]\times E} v(\tau,x)\, (\Re f)^+(\tau,x) \\
&\le \sup_{(\tau,x)\in[0,T]\times E} v(\tau,x)\, (\Re f)(\tau,x). \tag{3.95}
\end{aligned}
\]
By the substitution $f = \left( \lambda I - L^{(1)} \right) g$ in (3.95) we obtain:
\[
\lambda \sup_{(\tau,x)\in[0,T]\times E} u(\tau,x)\, \Re g(\tau,x)
\le \sup_{(\tau,x)\in[0,T]\times E} v(\tau,x)\, \Re\left( \lambda g(\tau,x) - L^{(1)} g(\tau,x) \right). \tag{3.96}
\]
Since the operator $L^{(1)}$ extends $D_1 + L$, the inequality in (3.96) displays the fact that the operator $D_1 + L$ is positive $T_\beta$-dissipative.
Altogether, this shows the implication (ii) $\Longrightarrow$ (iii) of Theorem 3.10.
(iii) $\Longrightarrow$ (i). Suppose that we already know that the $T_\beta$-closure of $D_1 + L$ generates a $T_\beta$-continuous semigroup $\{S(\rho) : \rho \ge 0\}$. Then we define the evolution $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ by
\[
P(\tau,t) f(x) = S(t-\tau) \left[ (s,y) \mapsto f(y) \right] (\tau,x), \quad f \in C_b(E). \tag{3.97}
\]
We have to prove that the family $\{P(\tau,t) : 0 \le \tau \le t \le T\}$ is indeed a Feller evolution. First we show that it has the evolution property:
\[
\begin{aligned}
P(\tau,t_1) P(t_1,t) f(x) &= S(t_1-\tau) \left[ (s,y) \mapsto P(t_1,t) f(y) \right] (\tau,x) \\
&= S(t_1-\tau) \left[ (s,y) \mapsto S(t-t_1) f(s,y) \right] (\tau,x) \\
&= S(t_1-\tau) S(t-t_1) \left[ (s,y) \mapsto f(s,y) \right] (\tau,x) \\
&= S(t-\tau) \left[ (s,y) \mapsto f(s,y) \right] (\tau,x) = P(\tau,t) f(x). \tag{3.98}
\end{aligned}
\]
The equality in (3.98) exhibits the evolution property. The continuity of the function $(\tau,t,x) \mapsto P(\tau,t) f(x)$ follows from the continuity of the function $(\tau,t,x) \mapsto S(t-\tau) \left[ (s,y) \mapsto f(y) \right] (\tau,x)$: see (3.97).
Next we prove that the operator $D_1 + L$ is $T_\beta$-closable, and that its closure generates a Feller semigroup. Since the operator $D_1 + L$ is $T_\beta$-densely defined and $T_\beta$-dissipative, it is $T_\beta$-closable: see Proposition 3.11, assertion (a). Let $L^{(1)}$ be its $T_\beta$-closure. Since there exists $\lambda_0 > 0$ such that the range of $\lambda_0 I - D_1 - L$ is $T_\beta$-dense in $C_b([0,T] \times E)$, and since $D_1 + L$ is $T_\beta$-dissipative, it follows that $R\left( \lambda_0 I - L^{(1)} \right) = C_b([0,T] \times E)$. Put $R(\lambda_0) = \left( \lambda_0 I - L^{(1)} \right)^{-1}$, and $R(\lambda) = \sum_{n=0}^\infty (\lambda_0 - \lambda)^n \left( R(\lambda_0) \right)^{n+1}$, $|\lambda - \lambda_0| < \lambda_0$. This series converges in the uniform norm. It follows that $R\left( \lambda I - L^{(1)} \right) = C_b([0,T] \times E)$ for all $\lambda \in \mathbb{C}$ for which $|\lambda - \lambda_0| < \lambda_0$. This procedure can be repeated to obtain $R\left( \lambda I - L^{(1)} \right) = C_b([0,T] \times E)$ for all $\lambda \in \mathbb{C}$ with $\Re\lambda > 0$. Put
\[
S_0(t) f = T_\beta\text{-}\lim_{\lambda\to\infty} e^{-\lambda t} e^{t\lambda^2 R(\lambda)} f, \quad f \in C_b([0,T] \times E). \tag{3.99}
\]
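The Neumann-series extension of the resolvent used above, $R(\lambda) = \sum_{n \ge 0} (\lambda_0 - \lambda)^n R(\lambda_0)^{n+1}$ for $|\lambda - \lambda_0| < \lambda_0$, can be checked numerically in a finite-dimensional stand-in. A minimal sketch (the $3 \times 3$ Markov generator `L` below is an arbitrary illustration, not taken from the text):

```python
import numpy as np

# An arbitrary conservative Markov generator on three states
# (off-diagonal entries >= 0, row sums zero), standing in for L^(1).
L = np.array([[-1.0, 0.4, 0.6],
              [0.3, -0.8, 0.5],
              [0.2, 0.7, -0.9]])
I = np.eye(3)

def R(lam):
    """Resolvent R(lam) = (lam*I - L)^{-1}."""
    return np.linalg.inv(lam * I - L)

lam0, lam = 2.0, 1.2   # |lam - lam0| = 0.8 < lam0, inside the disk of convergence
series = sum((lam0 - lam) ** n * np.linalg.matrix_power(R(lam0), n + 1)
             for n in range(200))
err = np.max(np.abs(series - R(lam)))
print(err)   # the truncated Neumann series reproduces R(lam)
```

Here the contraction estimate $\|R(\lambda_0)\| \le 1/\lambda_0$ guarantees the geometric decay of the terms, exactly as in the argument above.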

Of course we have to prove that the limit in (3.99) exists. For brevity we write $A(\lambda) = \lambda^2 R(\lambda) - \lambda I = L^{(1)} (\lambda R(\lambda))$, and notice that for $f \in D\left(L^{(1)}\right)$ we have $A(\lambda) f = \lambda R(\lambda) L^{(1)} f$, and that
\[
A(\lambda) f = \lambda R(\lambda) L^{(1)} f = R(\lambda) \left( L^{(1)} \right)^2 f + L^{(1)} f, \quad \text{for } f \in D\left( \left( L^{(1)} \right)^2 \right). \tag{3.100}
\]
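The convergence $e^{tA(\lambda)} \to S_0(t)$ behind (3.99) can be observed numerically in finite dimensions, where everything is a matrix and $S_0(t) = e^{tL}$. A sketch (the generator `L` is again an arbitrary example, not from the text):

```python
import numpy as np
from scipy.linalg import expm

# Arbitrary Markov generator standing in for L^(1); the semigroup is expm(t*L).
L = np.array([[-1.0, 0.4, 0.6],
              [0.3, -0.8, 0.5],
              [0.2, 0.7, -0.9]])
I = np.eye(3)

def A(lam):
    """Yosida-type approximation A(lam) = lam^2 R(lam) - lam I."""
    R = np.linalg.inv(lam * I - L)
    return lam ** 2 * R - lam * I

t = 1.0
S_t = expm(t * L)
errs = [np.max(np.abs(expm(t * A(lam)) - S_t)) for lam in (10.0, 100.0, 1000.0)]
print(errs)
```

Since $A(\lambda) - L = L^2 R(\lambda)$, the error decays roughly like $t\,\|L^2\|/\lambda$, matching the $1/\lambda + 1/\mu$ estimate obtained in (3.103) below.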
Let $0 < \lambda < \mu < \infty$. From Duhamel's formula we get
\[
e^{-\lambda t} e^{\lambda t(\lambda R(\lambda))} f - e^{-\mu t} e^{\mu t(\mu R(\mu))} f
= e^{tA(\lambda)} f - e^{tA(\mu)} f
= \int_0^t e^{sA(\lambda)} \left( A(\lambda) - A(\mu) \right) e^{(t-s)A(\mu)} f\, ds. \tag{3.101}
\]
If $f$ belongs to $D\left( \left( L^{(1)} \right)^2 \right)$, then $A(\lambda) f - A(\mu) f = (R(\lambda) - R(\mu)) \left( L^{(1)} \right)^2 f$, and hence the equality in (3.101) can be rewritten as:
\[
e^{-\lambda t} e^{\lambda t(\lambda R(\lambda))} f - e^{-\mu t} e^{\mu t(\mu R(\mu))} f
= \int_0^t e^{sA(\lambda)} (R(\lambda) - R(\mu)) e^{(t-s)A(\mu)} \left( L^{(1)} \right)^2 f\, ds. \tag{3.102}
\]

From (3.102) we infer that for the uniform norm we have:
\[
\begin{aligned}
\left\| e^{-\lambda t} e^{\lambda t(\lambda R(\lambda))} f - e^{-\mu t} e^{\mu t(\mu R(\mu))} f \right\|
&\le \int_0^t \left\| e^{sA(\lambda)} (R(\lambda) - R(\mu)) e^{(t-s)A(\mu)} \left( L^{(1)} \right)^2 f \right\|_\infty ds \\
&\le \int_0^t \left\| e^{sA(\lambda)} \right\| \left\| R(\lambda) - R(\mu) \right\| \left\| e^{(t-s)A(\mu)} \right\| \left\| \left( L^{(1)} \right)^2 f \right\|_\infty ds \\
&\le t \left( \frac{1}{\lambda} + \frac{1}{\mu} \right) \left\| \left( L^{(1)} \right)^2 f \right\|_\infty. \tag{3.103}
\end{aligned}
\]
From (3.103) we infer that for $f \in D\left( \left( L^{(1)} \right)^2 \right)$ the limit
\[
S_0(t) f(\tau,x) = \lim_{\lambda\to\infty} e^{tA(\lambda)} f(\tau,x) \tag{3.104}
\]
exists uniformly in $(\tau,t,x) \in [0,T] \times [0,T] \times E$. The next step consists in showing that the limit in (3.104) exists for $f \in D\left(L^{(1)}\right)$. Let $f \in D\left(L^{(1)}\right)$, and $\lambda > \mu > 0$. Then we have for $\lambda_0 > 0$ sufficiently large:
\[
\begin{aligned}
\left\| e^{tA(\lambda)} f - e^{tA(\mu)} f \right\|
&\le \left\| e^{tA(\lambda)} (f - \lambda_0 R(\lambda_0) f) - e^{tA(\mu)} (f - \lambda_0 R(\lambda_0) f) \right\|
+ \left\| e^{tA(\lambda)} (\lambda_0 R(\lambda_0) f) - e^{tA(\mu)} (\lambda_0 R(\lambda_0) f) \right\| \\
&\le \left( \left\| e^{tA(\lambda)} \right\| + \left\| e^{tA(\mu)} \right\| \right) \left\| R(\lambda_0) L^{(1)} f \right\|
+ \left\| e^{tA(\lambda)} (\lambda_0 R(\lambda_0) f) - e^{tA(\mu)} (\lambda_0 R(\lambda_0) f) \right\| \\
&\le \frac{2}{\lambda_0} \left\| L^{(1)} f \right\|_\infty
+ \left\| e^{tA(\lambda)} (\lambda_0 R(\lambda_0) f) - e^{tA(\mu)} (\lambda_0 R(\lambda_0) f) \right\|_\infty. \tag{3.105}
\end{aligned}
\]

From (3.105) together with (3.104) it follows that (3.104) also holds for $f \in D\left(L^{(1)}\right)$. It remains to be shown that the limit in (3.104) also exists in $T_\beta$-sense, but now for $f \in C_b([0,T] \times E)$. Since the operator $L^{(1)}$ is $T_\beta$-dissipative, there exists, for $u \in H^+([0,T] \times E)$, a function $v \in H^+([0,T] \times E)$ such that for all $\lambda \ge \lambda_0 > 0$ the inequality in (3.14) in Definition 3.5 is satisfied, i.e.
\[
\left\| v \left( \lambda f - L^{(1)} f \right) \right\|_\infty \ge \lambda \| u f \|_\infty, \quad \text{for all } \lambda \ge \lambda_0 \text{ and for all } f \in D\left(L^{(1)}\right). \tag{3.106}
\]
From (3.106) we infer
\[
\lambda \| u R(\lambda) f \|_\infty \le \| v f \|_\infty, \quad f \in C_b([0,T] \times E). \tag{3.107}
\]
Let $f \in C_b([0,T] \times E)$. By the Hausdorff–Bernstein–Widder inversion theorem there exists a unique Borel measurable function $(\tau,t,x) \mapsto \widetilde{S}_0(t) f(\tau,x)$ such that
\[
R(\lambda) f(\tau,x) = \left( \lambda I - L^{(1)} \right)^{-1} f(\tau,x) = \int_0^\infty e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f(\tau,x)\, d\rho, \quad \Re\lambda > 0. \tag{3.108}
\]
For the result in (3.108) see Widder [254], Theorem 16a, page 315. The resolvent property of the mapping $\lambda \mapsto R(\lambda)$, $\lambda > 0$, implies the semigroup property of the mapping $\rho \mapsto \widetilde{S}_0(\rho)$. To be precise we have:
\[
\begin{aligned}
R(\lambda) f - R(\mu) f &= \int_0^\infty \left( e^{-\lambda\rho} - e^{-\mu\rho} \right) \widetilde{S}_0(\rho) f\, d\rho \\
&= (\mu - \lambda) \int_0^\infty \int_0^\rho e^{-\lambda(\rho-s)-\mu s}\, \widetilde{S}_0(\rho - s + s) f\, ds\, d\rho \\
&= (\mu - \lambda) \int_0^\infty \int_s^\infty e^{-\lambda(\rho-s)-\mu s}\, \widetilde{S}_0(\rho - s + s) f\, d\rho\, ds \\
&= (\mu - \lambda) \int_0^\infty \int_0^\infty e^{-\lambda\rho-\mu s}\, \widetilde{S}_0(\rho + s) f\, d\rho\, ds. \tag{3.109}
\end{aligned}
\]
On the other hand we also have
\[
R(\lambda) f - R(\mu) f = (\mu - \lambda) R(\lambda) R(\mu) f
= (\mu - \lambda) \int_0^\infty \int_0^\infty e^{-\lambda\rho-\mu s}\, \widetilde{S}_0(\rho) \widetilde{S}_0(s) f\, d\rho\, ds. \tag{3.110}
\]
Comparing (3.109) and (3.110) shows the equality:
\[
\widetilde{S}_0(\rho + s) f = \widetilde{S}_0(\rho) \widetilde{S}_0(s) f, \quad \rho, s \ge 0, \ f \in C_b([0,T] \times E).
\]

Hence the family $\{\widetilde{S}_0(\rho) : \rho \ge 0\}$ is a semigroup. We have to show that the function $(\tau,t,x) \mapsto \widetilde{S}_0(t) f(\tau,x)$ is a bounded continuous function. This will be done in several steps. First we will prove the following representation for $\widetilde{S}_0(t) f$, $f \in C_b([0,T] \times E)$:
\[
\widetilde{S}_0(t) f = \lim_{\lambda\to\infty} e^{-\lambda t} \sum_{k=0}^\infty \frac{(\lambda t)^k}{k!} (\lambda R(\lambda))^k f
= \lim_{\lambda\to\infty} e^{-\lambda t} e^{\lambda t(\lambda R(\lambda))} f = S_0(t) f, \tag{3.111}
\]
provided that the limit in (3.111) exists, and where $S_0(t)$ is as in (3.99). Let $f \in D\left(L^{(1)}\right)$. Then the function $S_0(t) f$ is the uniform limit of functions of the form $(\tau,t,x) \mapsto e^{tA(\lambda)} f(\tau,x)$, and such functions are continuous in the variables $(\tau,t,x)$: see (3.105). Consequently, the function $S_0(t) f$ inherits this continuity property. Again let $f \in D\left(L^{(1)}\right)$. We will prove that $R(\mu) f = \int_0^\infty e^{-\mu t} S_0(t) f\, dt$, $\mu > 0$. To this end we notice
\[
\begin{aligned}
\int_0^\infty e^{-\mu t} S_0(t) f\, dt
&= \int_0^\infty e^{-\mu t} \lim_{\lambda\to\infty} e^{tA(\lambda)} f\, dt
= \lim_{\lambda\to\infty} \int_0^\infty e^{-\mu t} e^{tA(\lambda)} f\, dt
= \lim_{\lambda\to\infty} (\mu I - A(\lambda))^{-1} f \\
&= \lim_{\lambda\to\infty} \frac{\lambda}{\lambda+\mu} \left( I - \frac{1}{\lambda} L^{(1)} \right) \left( \frac{\lambda\mu}{\lambda+\mu}\, I - L^{(1)} \right)^{-1} f
= \left( \mu I - L^{(1)} \right)^{-1} f. \tag{3.112}
\end{aligned}
\]
From (3.108) and (3.112) we infer the equality
\[
\widetilde{S}_0(t) f = S_0(t) f \quad \text{for } f \in D\left(L^{(1)}\right). \tag{3.113}
\]

Next we will prove that the averages of the semigroup $\{S_0(\rho) : \rho \ge 0\}$ are $T_\beta$-continuous. As a consequence, for $f \in C_b([0,T] \times E)$ the function
\[
(\tau,t,x) \mapsto \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho) f(\tau,x)\, d\rho
\]
is a bounded and continuous function, and the family of operators
\[
\left\{ \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho : 0 < t \le T \right\} \quad \text{is } T_\beta\text{-equi-continuous}. \tag{3.114}
\]

As above we write $A(\lambda) f = \lambda^2 R(\lambda) f - \lambda f$. Two very relevant equalities are:
\[
R(\lambda) f = \left( \lambda I - L^{(1)} \right)^{-1} f
= \int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho + e^{-\lambda t} S_0(t) \left( \lambda I - L^{(1)} \right)^{-1} f
= \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f + e^{-\lambda t} S_0(t) \left( \lambda I - L^{(1)} \right)^{-1} f, \tag{3.115}
\]
and
\[
f = \left( \lambda I - L^{(1)} \right) \int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho + e^{-\lambda t}\, \widetilde{S}_0(t) f. \tag{3.116}
\]
Here we wrote
\[
\int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f = \int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho
\]
to indicate that the operator $f \mapsto \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f$, $f \in C_b([0,T] \times E)$, is a mapping from $C_b([0,T] \times E)$ to itself, whereas it is not so clear what the target space is of the mappings $\widetilde{S}_0(\rho)$, $\rho > 0$. In order to show that the operators $\widetilde{S}_0(t)$, $t \ge 0$, are mappings from $C_b([0,T] \times E)$ into itself, we need the sequential $\lambda$-dominance of the operator $D_1 + L$ for some $\lambda > 0$. Moreover, it follows from this sequential $\lambda$-dominance that the semigroup $\left\{ e^{-\lambda t}\, \widetilde{S}_0(t) : t \ge 0 \right\}$ is $T_\beta$-equi-continuous. Once we know all this, the formula in (3.116) makes sense and is true.
For every measure $\nu$ on the Borel field of $[0,T] \times E$ the mapping $\rho \mapsto \int \widetilde{S}_0(\rho) f\, d\nu$ is a Borel measurable function on the semi-axis $[0,\infty)$. The formula in (3.115) is correct, and poses no problem provided $f \in C_b([0,T] \times E)$. In fact we have
\[
\begin{aligned}
\int_0^\infty e^{-\mu t} e^{-\lambda t} S_0(t) R(\lambda) f\, dt
&= R(\lambda+\mu) R(\lambda) f = \frac{1}{\mu} \left( R(\lambda) - R(\lambda+\mu) \right) f \\
&= \int_0^\infty \frac{1 - e^{-\mu\rho}}{\mu}\, e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho
= \int_0^\infty \int_0^\rho e^{-\mu t}\, dt\, e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho \\
&= \int_0^\infty e^{-\mu t} \int_t^\infty e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho\, dt, \tag{3.117}
\end{aligned}
\]
and hence
\[
e^{-\lambda t} S_0(t) \left( \lambda I - L^{(1)} \right)^{-1} f = e^{-\lambda t} S_0(t) R(\lambda) f
= \int_t^\infty e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho. \tag{3.118}
\]

From (3.118) we infer
\[
\int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho + e^{-\lambda t} S_0(t) \left( \lambda I - L^{(1)} \right)^{-1} f
= \int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho + \int_t^\infty e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho
= \int_0^\infty e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f\, d\rho = \left( \lambda I - L^{(1)} \right)^{-1} f. \tag{3.119}
\]
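The splitting of the Laplace transform at a finite time $t$ in (3.115)/(3.119) can also be verified numerically in a matrix model, where $S_0(t) = e^{tL}$ and the finite integral is approximated by the trapezoidal rule. A sketch (the generator is chosen arbitrarily, as in the previous illustrations):

```python
import numpy as np
from scipy.linalg import expm

L = np.array([[-1.0, 0.4, 0.6],
              [0.3, -0.8, 0.5],
              [0.2, 0.7, -0.9]])   # arbitrary Markov generator; S0(t) = expm(t*L)
I = np.eye(3)
lam, t, h = 2.0, 1.5, 1e-3
R = np.linalg.inv(lam * I - L)

# values of e^{-lam*rho} S0(rho) on the grid, built incrementally
step = expm(h * (L - lam * I))
vals = [I]
for _ in range(int(round(t / h))):
    vals.append(vals[-1] @ step)

# trapezoidal rule for int_0^t e^{-lam*rho} S0(rho) drho
integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# (3.115)/(3.119): the finite integral plus the tail term recombine to R(lam)
lhs = integral + vals[-1] @ R     # vals[-1] = e^{-lam*t} S0(t)
err = np.max(np.abs(lhs - R))
print(err)
```

The residual is of the order of the quadrature error; the two pieces of the Laplace transform recombine to the full resolvent, which is exactly the content of (3.119).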

The equality in (3.115) is the same as the one in (3.119). From the equality in (3.115) it follows that the function $(\tau,t,x) \mapsto \int_0^t e^{-\lambda\rho}\, \widetilde{S}_0(\rho) f(\tau,x)\, d\rho$ is continuous. Next let $g \in D\left(L^{(1)}\right)$ and put $f = \left( \lambda I - L^{(1)} \right) g$. From (3.115) we get:
\[
g - e^{-\lambda t} S_0(t) g = \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f. \tag{3.120}
\]

From (3.120) we infer:
\[
\left\| g - e^{-\lambda t} S_0(t) g \right\|_\infty
= \left\| \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f \right\|_\infty
\le \int_0^t e^{-\lambda\rho} \left\| \widetilde{S}_0(\rho) f \right\|_\infty d\rho
\le \int_0^t e^{-\lambda\rho}\, d\rho\, \left\| \lambda g - L^{(1)} g \right\|_\infty, \tag{3.121}
\]
and hence for $\lambda = 0$ we obtain $\| g - S_0(t) g \|_\infty \le t \left\| L^{(1)} g \right\|_\infty$. This inequality proves the uniform boundedness of the family $\left\{ \frac{1}{t} (g - S_0(t) g) : t > 0 \right\}$. Next let us discuss its convergence. To this end we again employ (3.115), and proceed as follows. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence in $C_b([0,T] \times E)$ which decreases pointwise to the zero-function. Choose the sequence $\left( g_n^\lambda \right)_{n\in\mathbb{N}} \subset D\left(L^{(1)}\right)$ in such a way that $\lambda f_n = \lambda g_n^\lambda - L^{(1)} g_n^\lambda$. Then the sequence $\left( g_n^\lambda \right)_{n\in\mathbb{N}}$ decreases to zero as well. For $t > 0$ we have
\[
g_n^\lambda = e^{-\lambda t} S_0(t) g_n^\lambda + \lambda \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f_n, \tag{3.122}
\]
or, what is equivalent,
\[
e^{\lambda t} g_n^\lambda = S_0(t) g_n^\lambda + \lambda \int_0^t e^{\lambda t - \lambda\rho} S_0(\rho)\, d\rho\, f_n. \tag{3.123}
\]
Since the operator $L^{(1)}$ is $T_\beta$-dissipative, it follows that $\sup_{\lambda \ge T^{-1}} g_n^\lambda$ decreases pointwise to zero. So that, with $\lambda = t^{-1}$, the equality in (3.122) implies
\[
\sup_{0 < t \le T} \frac{1}{t} \int_0^t S_0(\rho)\, d\rho\, f_n \downarrow 0, \quad \text{as } n \to \infty. \tag{3.124}
\]
Consequently, for any fixed $\lambda \in \mathbb{R}$, the family of operators
\[
\left\{ \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho : 0 < t \le T \right\} \tag{3.125}
\]

is $T_\beta$-equi-continuous: see Corollary 1.19. Let $f \in C_b([0,T] \times E)$. We will show that
\[
T_\beta\text{-}\lim_{t\downarrow 0} \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f = f. \tag{3.126}
\]
It suffices to prove (3.126) for $\lambda > 0$. First assume that $f = R(\lambda) g$ belongs to the domain of $L^{(1)}$. Then we have
\[
\begin{aligned}
\frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f
&= \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho) \int_0^\infty e^{-\lambda\sigma} S_0(\sigma)\, d\sigma\, g\, d\rho \\
&= \frac{1}{t} \int_0^t \int_0^\infty e^{-\lambda(\sigma+\rho)} S_0(\sigma+\rho)\, d\sigma\, g\, d\rho
= \frac{1}{t} \int_0^t \int_\rho^\infty e^{-\lambda\sigma} S_0(\sigma)\, d\sigma\, g\, d\rho. \tag{3.127}
\end{aligned}
\]
Since the function $\rho \mapsto \int_\rho^\infty e^{-\lambda\sigma} S_0(\sigma)\, d\sigma\, g$ is continuous for the uniform norm topology on $C_b([0,T] \times E)$, (3.127) implies
\[
\|\cdot\|_\infty\text{-}\lim_{t\downarrow 0} \frac{1}{t} \int_0^t \int_\rho^\infty e^{-\lambda\sigma} S_0(\sigma)\, d\sigma\, g\, d\rho
= \int_0^\infty e^{-\lambda\sigma} S_0(\sigma)\, d\sigma\, g = f, \quad f \in D\left(L^{(1)}\right). \tag{3.128}
\]
Since $D\left(L^{(1)}\right)$ is $T_\beta$-dense in $C_b([0,T] \times E)$, the equi-continuity of the family in (3.125) implies that
\[
T_\beta\text{-}\lim_{t\downarrow 0} \frac{1}{t} \int_0^t e^{-\lambda\rho} S_0(\rho)\, d\rho\, f = f, \quad f \in C_b([0,T] \times E). \tag{3.129}
\]
From the equality in (3.120) together with (3.129) we see that
\[
T_\beta\text{-}\lim_{t\downarrow 0} \frac{g - e^{-\lambda t} S_0(t) g}{t} = f = \lambda g - L^{(1)} g, \quad g \in D\left(L^{(1)}\right). \tag{3.130}
\]
n o
So far we have proved that the semigroup Se0 (t) : t ≥ 0 maps the domain of
L(1) to bounded continuous functions, and that the family in (3.129) consists
of mappings which assign to bounded continuous again bounded bounded
continuous functions. What is not clear, is whether or not the operators Se0 (t),
t ≥ 0, leave the space Cb ([0, T ] × E) invariant. Fix λ > 0, and to every
f ∈ Cb ([0, T ] × E), f ≥ 0, we assign the function f λ defined by
n o
k
f λ = sup (µR (λ + µ)) f : µ > 0, k ∈ N , (3.131)

The reader is invited to compare the function $f^\lambda$ with (1.48) and other results in Proposition 1.22. The arguments which follow are in line with the proof of Proposition 1.22. The function $f^\lambda$ is the smallest $\lambda$-super-median valued function which exceeds $f$. A closely related notion is that of a $\lambda$-super-mean valued function. A function $g : [0,T] \times E \to [0,\infty)$ is called $\lambda$-super-median valued if $e^{-\lambda t}\, \widetilde{S}_0(t) g \le g$ for all $t \ge 0$; it is called $\lambda$-super-mean valued if $\mu R(\lambda+\mu) g \le g$ for all $\mu > 0$. In Lemma 9.12 in Sharpe [208] it is shown that, essentially speaking, these notions are equivalent. In fact the proof is not very difficult. It uses the Hausdorff–Bernstein–Widder theorem on the representation of completely monotone functions as Laplace transforms of positive Borel measures on $[0,\infty)$. The reader is also referred to Remark 3.7 and Definition 3.6.
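One direction of this equivalence is immediate from the Laplace-transform representation of the resolvent; the following one-line computation (a sketch in the notation of the text) shows that a $\lambda$-super-median valued function is $\lambda$-super-mean valued:

```latex
% If e^{-\lambda t}\widetilde{S}_0(t)g \le g for all t \ge 0, then for every \mu > 0:
\mu R(\lambda+\mu)\, g
  = \mu \int_0^\infty e^{-(\lambda+\mu)t}\, \widetilde{S}_0(t)\, g \, dt
  = \mu \int_0^\infty e^{-\mu t}\bigl(e^{-\lambda t}\,\widetilde{S}_0(t)\, g\bigr)\, dt
  \le \mu \int_0^\infty e^{-\mu t}\, g \, dt = g .
```

The converse direction is the part of the argument that requires the Hausdorff–Bernstein–Widder theorem.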
Let $f \in C_b([0,T] \times E)$ be positive. Here we use the representation
\[
e^{-\lambda t}\, \widetilde{S}_0(t) f = \lim_{\mu\to\infty} e^{-\mu t} \sum_{k=0}^\infty \frac{(\mu t)^k}{k!} (\mu R(\lambda+\mu))^k f \le f^\lambda, \tag{3.132}
\]
and hence
\[
\sup_{t > 0} e^{-\lambda t}\, \widetilde{S}_0(t) f \le f^\lambda. \tag{3.133}
\]
Since
\[
(\mu R(\lambda+\mu))^k f = \frac{\mu^k}{(k-1)!} \int_0^\infty t^{k-1} e^{-\mu t} e^{-\lambda t}\, \widetilde{S}_0(t) f\, dt, \tag{3.134}
\]
we see by invoking (3.131) that the two expressions in (3.133) are the same.
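The identity (3.134), expressing powers of the resolvent as Gamma-weighted averages of the semigroup, can be verified numerically in a matrix model where $\widetilde{S}_0(t) = e^{tL}$. A sketch (arbitrary generator, not from the text):

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

L = np.array([[-1.0, 0.4, 0.6],
              [0.3, -0.8, 0.5],
              [0.2, 0.7, -0.9]])   # arbitrary Markov generator; S~0(t) = expm(t*L)
I = np.eye(3)
lam, mu, k = 1.0, 2.0, 3
R = np.linalg.inv((lam + mu) * I - L)
lhs = np.linalg.matrix_power(mu * R, k)

# Riemann-sum approximation of mu^k/(k-1)! * int_0^T t^{k-1} e^{-(lam+mu)t} S(t) dt
h, T = 0.002, 25.0
step = expm(h * (L - (lam + mu) * I))    # one grid step of e^{-(lam+mu)t} S(t)
P = I.copy()
rhs = np.zeros_like(I)
for i in range(int(T / h)):
    rhs += h * (i * h) ** (k - 1) * P
    P = P @ step
rhs *= mu ** k / factorial(k - 1)
err = np.max(np.abs(lhs - rhs))
print(err)
```

Since $\|e^{tL}\|_\infty = 1$ for a conservative Markov generator, the integrand decays like $t^{k-1} e^{-(\lambda+\mu)t}$ and the truncation at $T = 25$ is harmless.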
In order to finish the proof of Theorem 3.10 we need the hypothesis that the operator $D_1 + L$ is sequentially $\lambda$-dominant for some $\lambda > 0$. In fact, let the sequence $(f_n)_{n\in\mathbb{N}} \subset C_b([0,T] \times E)$ converge downward to zero, and select functions $g_n^\lambda$, $n \in \mathbb{N}$, with the following properties:
1. $f_n \le g_n^\lambda$;
2. $g_n^\lambda = \sup_{K \in \mathcal{K}([0,T] \times E)} \inf \left\{ g : g \in D(D_1 + L),\ g \ge f_n 1_K,\ (\lambda I - D_1 - L) g \ge 0 \right\}$;
3. $\lim_{n\to\infty} g_n^\lambda(\tau,x) = 0$ for all $(\tau,x) \in [0,T] \times E$.

In the terminology of (1.41) and Definition 3.6 the functions $g_n^\lambda$ are denoted by $g_n^\lambda = U_\lambda^1(f_n)$, $n \in \mathbb{N}$. Recall that $\mathcal{K}([0,T] \times E)$ denotes the collection of all compact subsets of $[0,T] \times E$. By hypothesis, the sequence as defined in 2 satisfies 1 and 3. Let $K$ be any compact subset of $[0,T] \times E$, and let $g \in D(D_1 + L)$ be such that $g \ge f_n 1_K$ and $(\lambda I - D_1 - L) g \ge 0$. Then we have
\[
\left( (\lambda+\mu) I - L^{(1)} \right) g = \left( (\lambda+\mu) I - D_1 - L \right) g \ge \mu g. \tag{3.135}
\]

From (3.135) and $g \ge f_n 1_K$ we infer
\[
g \ge \mu R(\lambda+\mu) g \ge (\mu R(\lambda+\mu))^k g \ge (\mu R(\lambda+\mu))^k (f_n 1_K), \tag{3.136}
\]
and hence from (3.136) together with (3.131) and (3.133) (which is in fact an equality) we see
\[
\begin{aligned}
g_n^\lambda \ge f_n^\lambda
&= \sup_{K \in \mathcal{K}([0,T] \times E)} \sup \left\{ e^{-\lambda t}\, \widetilde{S}_0(t) (f_n 1_K) : t \ge 0 \right\} \\
&= \sup_{K \in \mathcal{K}([0,T] \times E)} \sup \left\{ (\mu R(\lambda+\mu))^k (f_n 1_K) : \mu > 0,\ k \in \mathbb{N} \right\} \\
&= \sup \left\{ (\mu R(\lambda+\mu))^k f_n : \mu > 0,\ k \in \mathbb{N} \right\}
= \sup \left\{ e^{-\lambda t}\, \widetilde{S}_0(t) f_n : t \ge 0 \right\}. \tag{3.137}
\end{aligned}
\]
Since by hypothesis $\lim_{n\to\infty} g_n^\lambda = 0$, the inequality in (3.137) implies $\lim_{n\to\infty} f_n^\lambda = 0$. It follows that
\[
\lim_{n\to\infty} \sup \left\{ e^{-\lambda t}\, \widetilde{S}_0(t) f_n : t \ge 0 \right\} = 0. \tag{3.138}
\]

From Corollary 1.19 it follows that the family of operators
\[
\left\{ (\mu R(\lambda+\mu))^k : \mu \ge 0,\ k \in \mathbb{N} \right\}
\]
is $T_\beta$-equi-continuous. Hence for every function $u \in H^+([0,T] \times E)$ there exists a function $v \in H^+([0,T] \times E)$ such that
\[
\left\| u (\mu R(\lambda+\mu))^k f \right\|_\infty \le \| v f \|_\infty, \quad f \in C_b([0,T] \times E),\ \mu \ge 0,\ k \in \mathbb{N}. \tag{3.139}
\]
Since
\[
\sup \left\{ e^{-\lambda t}\, \widetilde{S}_0(t) f : t \ge 0 \right\} = \sup \left\{ (\mu R(\lambda+\mu))^k f : \mu \ge 0,\ k \in \mathbb{N} \right\}, \quad f \ge 0, \tag{3.140}
\]
the inequality in (3.139) yields
\[
\left\| u e^{-\lambda t}\, \widetilde{S}_0(t) f \right\|_\infty \le \| v f \|_\infty, \quad f \in C_b([0,T] \times E),\ t \ge 0. \tag{3.141}
\]
Since $D\left(L^{(1)}\right)$ is $T_\beta$-dense, and the operators $\widetilde{S}_0(t)$, $t \ge 0$, are mappings from $D\left(L^{(1)}\right)$ to $C_b([0,T] \times E)$, the $T_\beta$-equi-continuity in (3.141) shows that the operators $\widetilde{S}_0(t)$, $t \ge 0$, are in fact mappings from $C_b([0,T] \times E)$ to itself, and that the family $\left\{ e^{-\lambda t}\, \widetilde{S}_0(t) : t \ge 0 \right\}$ is $T_\beta$-equi-continuous.
Altogether these observations conclude the proof of the implication (iii) $\Longrightarrow$ (i) of Theorem 3.10.

Remark 3.12. The equality in (3.115) shows that the function $g := R(\lambda) f$, where $f \ge 0$ and $f \in C_b([0,T] \times E)$, is $\lambda$-super-mean valued in the sense that an inequality of the form $e^{-\lambda t} S_0(t) g \le g$ holds. Such an inequality is equivalent to $\mu R(\mu+\lambda) g \le g$, $\mu > 0$. For details on such functions and on $\lambda$-excessive functions see Sharpe [208], page 17 and Lemma 9.12, page 45.

3.3 Korovkin property

The following notions and results are used to prove part (e) of Theorem 1.39. We recall the definition of the Korovkin property.

Definition 3.13. Let $E_0$ be a subset of $E$. The operator $L$ is said to possess the Korovkin property on $E_0$ if there exists a strictly positive real number $\lambda_0 > 0$ such that for every $x_0 \in E_0$ the equality
\[
\begin{aligned}
&\inf_{h \in D(L)} \sup_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\} && (3.142) \\
&= \sup_{h \in D(L)} \inf_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\} && (3.143)
\end{aligned}
\]
is valid for all $g \in C_b(E)$.


Let $g \in C_b(E)$ and $\lambda > 0$. The equalities
\[
\begin{aligned}
&\inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\} \\
&= \inf_{h \in D(L)} \sup_{x \in E_0} \left( h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right) \\
&= \inf_{\substack{\Gamma \subset D(L) \\ \#\Gamma < \infty}} \sup_{\substack{\Phi \subset E_0 \\ \#\Phi < \infty}} \min_{h \in \Gamma} \max_{x \in \Phi} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right\} \tag{3.144}
\end{aligned}
\]
show that the Korovkin property could also have been defined in terms of any of the quantities in (3.144). In fact, if $L$ satisfies the (global) maximum principle on $E_0$, i.e. if for every real-valued function $f \in D(L)$ the inequality
\[
\lambda \sup_{x \in E_0} f(x) \le \sup_{x \in E_0} \left( \lambda f(x) - L f(x) \right) \tag{3.145}
\]
holds for all $\lambda > 0$, then the Korovkin property (on $E_0$) does not depend on $\lambda_0 > 0$. In other words, if it holds for one $\lambda_0 > 0$, then it is true for all $\lambda > 0$. This is part of the contents of the following proposition. In fact the maximum principle as formulated in (3.145) is not adequate in the present context. The correct version here is the following one, which is a kind of $\sigma$-local maximum principle.
Definition 3.14. Let $E_0$ be a subset of $E$. Suppose that the operator $L$ has the property that for every $\lambda > 0$ and for every $x_0 \in E_0$ it is true that $h(x_0) \ge 0$ whenever $h \in D(L)$ is such that $(\lambda I - L) h \ge 0$ on $E_0$. Then the operator $L$ is said to satisfy the weak maximum principle on $E_0$.

As we proved in Proposition 1.35, the notions of weak maximum principle and maximum principle coincide, provided $1 \in D(L)$ and $L1 = 0$.
In order to be really useful, the Korovkin property on $E_0$ should be accompanied by the maximum principle on $E_0$. To be useful the global Korovkin property (see Definition 3.15) requires the global maximum principle (see (3.145)). In addition we need the fact that the constant functions belong to $D(L)$ and that $L1 = 0$. If we only know the global maximum principle, in the sense of (3.145), then the global Korovkin property is required:
Definition 3.15. The operator $L$ is said to possess the global Korovkin property if there exists a strictly positive real number $\lambda_0 > 0$ such that for every $x_0 \in E$ the equality
\[
\begin{aligned}
&\inf_{h \in D(L)} \sup_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\} && (3.146) \\
&= \sup_{h \in D(L)} \inf_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\} && (3.147)
\end{aligned}
\]
is valid for all $g \in C_b(E)$.


First we treat the situation of a subset $E_0$ of $E$. The global version is obtained from the one on $E_0$ by replacing the subset $E_0$ with the full state space $E$. Again a resolvent family is obtained. In order to prove the equalities of (3.166) through (3.175) the global maximum principle is used. In fact it is used to show the equalities
\[
\begin{aligned}
&\inf_{h \in D(L)} \sup_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right\} && (3.148) \\
&= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E \right\} && (3.149) \\
&= \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \le g \text{ on } E \right\} && (3.150) \\
&= \sup_{h \in D(L)} \inf_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right\}. && (3.151)
\end{aligned}
\]
In particular, if $g = 0$, and if $L$ satisfies the global maximum principle, then the expressions in (3.148) through (3.151) are all equal to $0$. Put
\[
\lambda_0 R(\lambda_0) g(x_0)
= \inf_{h \in D(L)} \sup_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\}
= \sup_{h \in D(L)} \inf_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda_0} L \right) h \right](x) \right\}. \tag{3.152}
\]
Then $\lambda_0 R(\lambda_0)$ is a linear operator from $C_b(E_0)$ to $C_b(E_0)$.
The following proposition shows that there exists a family of operators $\{R(\lambda) : 0 < \lambda < 2\lambda_0\}$ which has the resolvent property. The operator $\lambda R(\lambda)$ is obtained from (3.152) by replacing $\lambda_0$ with $\lambda$. It is clear that this procedure can be extended to the whole positive real axis. In this way we obtain a resolvent family $\{R(\lambda) : \lambda > 0\}$. The operator $R(\lambda)$ can be written in the form $R(\lambda) = (\lambda I - L_0)^{-1}$, where $L_0$ is a closed linear operator which extends $L$ (in case $E_0 = E$), which satisfies the maximum principle on $E_0$, and which, under certain conditions, generates a Feller semigroup and a Markov process. For convenience we insert the following lemma. It is used for $E_0 = E$ and for $E_0$ a subset of $E$ which is Polish with respect to the relative metric. The condition in (3.155) is closely related to the maximum principle.
Lemma 3.16. Suppose that the constant functions belong to $D(L)$, and that $L1 = 0$. Fix $x_0 \in E$, $\lambda > 0$, and $g \in C_b(E_0)$. Let $E_0$ be any subset of $E$. Then the following equalities hold:
\[
\inf_{h \in D(L)} \sup_{x \in E_0} \left\{ h(x_0) + g(x) - \left( I - \tfrac{1}{\lambda} L \right) h(x) \right\}
= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\}, \tag{3.153}
\]
and
\[
\sup_{h \in D(L)} \inf_{x \in E_0} \left\{ h(x_0) + g(x) - \left( I - \tfrac{1}{\lambda} L \right) h(x) \right\}
= \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \le g \text{ on } E_0 \right\}. \tag{3.154}
\]
If
\[
\inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge 0 \text{ on } E_0 \right\} \ge 0, \tag{3.155}
\]
then
\[
\sup_{x \in E_0} g(x) \ge \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\} \ge \inf_{x \in E_0} g(x), \tag{3.156}
\]
and also
\[
\inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\}
\ge \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \le g \text{ on } E_0 \right\}. \tag{3.157}
\]

First notice that by taking $h = 0$ in the left-hand side of (3.153) we see that the quantity in (3.153) is less than or equal to $\sup_{x \in E_0} g(x)$, and that the quantity in (3.154) is greater than or equal to $\inf_{x \in E_0} g(x)$. However, it is not excluded that (3.153) equals $-\infty$, or that (3.154) equals $+\infty$.
Proof. Upon replacing $g$ with $-g$ we see that the equality in (3.154) is a consequence of (3.153). We put
\[
\alpha_{E_0} = \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\}
\quad \text{and} \quad
\beta_{E_0} = \inf_{h \in D(L)} \sup_{x \in E_0} \left\{ h(x_0) + g(x) - \left( I - \tfrac{1}{\lambda} L \right) h(x) \right\}. \tag{3.158}
\]
First assume that $\beta_{E_0} \in \mathbb{R}$. Let $\varepsilon > 0$. Choose $h_\varepsilon \in D(L)$ in such a way that for $x \in E_0$ we have
\[
h_\varepsilon(x_0) + g(x) - \left( I - \tfrac{1}{\lambda} L \right) h_\varepsilon(x) \le \beta_{E_0} + \varepsilon.
\]
Then
\[
g(x) \le \left( I - \tfrac{1}{\lambda} L \right) h_\varepsilon(x) + \beta_{E_0} + \varepsilon - h_\varepsilon(x_0)
= \left( I - \tfrac{1}{\lambda} L \right) \left( h_\varepsilon - h_\varepsilon(x_0) + \beta_{E_0} + \varepsilon \right)(x). \tag{3.159}
\]
The substitution $\widetilde{h}_\varepsilon = h_\varepsilon - h_\varepsilon(x_0) + \beta_{E_0} + \varepsilon$ in (3.159) yields $\alpha_{E_0} \le \widetilde{h}_\varepsilon(x_0) = \beta_{E_0} + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, we get $\alpha_{E_0} \le \beta_{E_0}$. The same argument with $-n$ instead of $\beta_{E_0} + \varepsilon$ shows $\alpha_{E_0} = -\infty$ if $\beta_{E_0} = -\infty$. Next we assume that $\alpha_{E_0} \in \mathbb{R}$. Again let $\varepsilon > 0$ be arbitrary. Choose a function $h_\varepsilon \in D(L)$ such that $h_\varepsilon(x_0) \le \alpha_{E_0} + \varepsilon$, and $\left( I - \tfrac{1}{\lambda} L \right) h_\varepsilon \ge g$ on $E_0$. Then we have, for $x \in E_0$,
\[
h_\varepsilon(x_0) + g(x) - \left( I - \tfrac{1}{\lambda} L \right) h_\varepsilon(x) \le h_\varepsilon(x_0) \le \alpha_{E_0} + \varepsilon,
\]
and hence $\beta_{E_0} \le \alpha_{E_0} + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, we get $\beta_{E_0} \le \alpha_{E_0}$. Again, the argument can be adapted if $\alpha_{E_0} = -\infty$: replace $\alpha_{E_0} + \varepsilon$ by $-n$, and let $n$ tend to $\infty$. If condition (3.155) is satisfied, then with $m = \inf_{y \in E_0} g(y)$ we have
\[
\alpha_{E_0} \ge \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge \inf_{y \in E_0} g(y) \text{ on } E_0 \right\}
= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) (h - m) \ge 0 \text{ on } E_0 \right\} \ge m. \tag{3.160}
\]
The inequality in (3.160) shows the lower estimate in (3.156). The upper estimate is obtained by taking $h = \sup_{y \in E_0} g(y)$ (a constant function). Next we prove the inequality in (3.157). To this end we observe that the functional $\Lambda^+_{E_0} : C_b(E,\mathbb{R}) \to \mathbb{R}$, defined by
\[
\Lambda^+_{E_0}(g) = \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\}, \tag{3.161}
\]
is sub-additive and positively homogeneous. The latter means that
\[
\Lambda^+_{E_0}(g_1 + g_2) \le \Lambda^+_{E_0}(g_1) + \Lambda^+_{E_0}(g_2), \quad \text{and} \quad \Lambda^+_{E_0}(\alpha g) = \alpha \Lambda^+_{E_0}(g)
\]
for $g_1, g_2, g \in C_b(E,\mathbb{R})$, and $\alpha \ge 0$. Moreover,
\[
-\Lambda^+_{E_0}(-g) = \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \le g \text{ on } E_0 \right\}. \tag{3.162}
\]
It follows that
\[
\Lambda^+_{E_0}(g) + \Lambda^+_{E_0}(-g) \ge \Lambda^+_{E_0}(0)
= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge 0 \text{ on } E_0 \right\} \ge 0. \tag{3.163}
\]
The inequality in (3.157) is a consequence of (3.162) and (3.163).
This completes the proof of Lemma 3.16.
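As a quick sanity check of Lemma 3.16 (an illustration added here, not part of the original argument), take $L = 0$ with $D(L) = C_b(E)$, assume $g \in C_b(E)$, and let $x_0 \in E_0$; then $\left( I - \tfrac{1}{\lambda} L \right) h = h$, and both sides of (3.153) collapse to $g(x_0)$:

```latex
\alpha_{E_0} = \inf \{\, h(x_0) : h \in C_b(E),\ h \ge g \text{ on } E_0 \,\} = g(x_0),
\qquad
\beta_{E_0} = \inf_{h \in C_b(E)} \sup_{x \in E_0} \{\, h(x_0) + g(x) - h(x) \,\} = g(x_0),
```

where in both cases the choice $h = g$ attains the infimum. This value is consistent with the resolvent of the zero operator: $\lambda R(\lambda) g(x_0) = \lambda (\lambda I)^{-1} g(x_0) = g(x_0)$.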
The definition of an operator $L$ satisfying the maximum principle on a subset $E_0$ can be found in Definition 3.14.
Proposition 3.17. Let $0 < \lambda < 2\lambda_0$, let $g \in C_b(E)$, and let $E_0$ be a subset of $E$. Suppose the operator $L$ satisfies the maximum principle on $E_0$. In addition, let the domain of $L$ contain the constant functions, and assume $L1 = 0$. Let $x_0 \in E_0$. Put
\[
\begin{aligned}
\lambda R(\lambda) g(x_0)
&= \lim_{n\to\infty} \inf_{h_0 \in D(L)} \sup_{x_1 \in E_0} \inf_{h_1 \in D(L)} \sup_{x_2 \in E_0} \cdots \inf_{h_n \in D(L)} \sup_{x_{n+1} \in E_0} \\
&\qquad \sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\} && (3.164) \\
&= \lim_{n\to\infty} \sup_{h_0 \in D(L)} \inf_{x_1 \in E_0} \sup_{h_1 \in D(L)} \inf_{x_2 \in E_0} \cdots \sup_{h_n \in D(L)} \inf_{x_{n+1} \in E_0} \\
&\qquad \sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\}. && (3.165)
\end{aligned}
\]
Then the following identities are true:
\[
\begin{aligned}
\lambda R(\lambda) g(x_0)
&= \lim_{n\to\infty} \sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left( \lambda_0 R(\lambda_0) \right)^{j+1} g(x_0) && (3.166) \\
&= \frac{\lambda}{\lambda_0} \sum_{j=0}^\infty \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \left( \lambda_0 R(\lambda_0) \right)^{j+1} g(x_0) && (3.167) \\
&= \lim_{n\to\infty}
\inf_{\substack{h_j \in D(L),\, j \ge 0, \\ (\lambda I - L) h_0 = \frac{\lambda}{\lambda_0} \sum_{j=1}^\infty \left( 1 - \frac{\lambda}{\lambda_0} \right)^{j-1} (\lambda I - L) h_j}}
\max_{\substack{x_j \in E_0 \\ 1 \le j \le n+1}}
\sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\} && (3.168) \\
&= \inf_{h \in D(L)} \max_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \frac{1}{\lambda} L \right) h \right](x) \right\} && (3.169) \\
&= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \frac{1}{\lambda} L \right) h \ge g \text{ on } E_0 \right\} && (3.170) \\
&= \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \frac{1}{\lambda} L \right) h \le g \text{ on } E_0 \right\} && (3.171) \\
&= \sup_{h \in D(L)} \min_{x \in E_0} \left\{ h(x_0) + \left[ g - \left( I - \frac{1}{\lambda} L \right) h \right](x) \right\} && (3.172) \\
&= \lim_{n\to\infty}
\sup_{\substack{h_j \in D(L),\, j \ge 0, \\ (\lambda I - L) h_0 = \frac{\lambda}{\lambda_0} \sum_{j=1}^\infty \left( 1 - \frac{\lambda}{\lambda_0} \right)^{j-1} (\lambda I - L) h_j}}
\min_{\substack{x_j \in E_0 \\ 1 \le j \le n+1}}
\sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\} && (3.173) \\
&= \lim_{n\to\infty} \inf_{h_j \in D(L),\, 0 \le j \le n}\ \max_{x_j \in E_0,\, 1 \le j \le n+1}
\sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\} && (3.174) \\
&= \lim_{n\to\infty} \sup_{h_j \in D(L),\, 0 \le j \le n}\ \min_{x_j \in E_0,\, 1 \le j \le n+1}
\sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\}. && (3.175)
\end{aligned}
\]
Suppose that the operator $L$ possesses the global Korovkin property, and satisfies the maximum principle, as described in (3.145). Put
\[
\begin{aligned}
\lambda R(\lambda) g(x_0)
&= \lim_{n\to\infty} \inf_{h_0 \in D(L)} \sup_{x_1 \in E} \inf_{h_1 \in D(L)} \sup_{x_2 \in E} \cdots \inf_{h_n \in D(L)} \sup_{x_{n+1} \in E} \\
&\qquad \sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\} && (3.176) \\
&= \lim_{n\to\infty} \sup_{h_0 \in D(L)} \inf_{x_1 \in E} \sup_{h_1 \in D(L)} \inf_{x_2 \in E} \cdots \sup_{h_n \in D(L)} \inf_{x_{n+1} \in E} \\
&\qquad \sum_{j=0}^n \left( 1 - \frac{\lambda}{\lambda_0} \right)^j \frac{\lambda}{\lambda_0} \left\{ h_j(x_j) + g(x_{j+1}) - \left( I - \frac{1}{\lambda_0} L \right) h_j(x_{j+1}) \right\}. && (3.177)
\end{aligned}
\]
Then the quantities in (3.166) through (3.175) are all equal to $\lambda R(\lambda) g(x_0)$, provided that the set $E_0$ is replaced by $E$.
In case we deal with the ($\sigma$-local) Korovkin property, the convergence of
\[
(\lambda I - L) h_0 = \frac{\lambda}{\lambda_0} \sum_{j=1}^\infty \left( 1 - \frac{\lambda}{\lambda_0} \right)^{j-1} (\lambda I - L) h_j \tag{3.178}
\]
in (3.168) and (3.173) should be uniform on $E_0$. In case we deal with the global Korovkin property, and the maximum principle in (3.145), the convergence in (3.178) should be uniform on $E$.
Corollary 3.18. Suppose that the operator $L$ possesses the Korovkin property on $E_0$. Then for all $\lambda > 0$ the quantities in (3.169), (3.170), (3.171), and (3.172) are equal for all $x_0 \in E_0$ and all functions $g \in C_b(E_0)$. If $L$ possesses the global Korovkin property, then
\[
\begin{aligned}
&\inf_{h \in D(L)} \max_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right\} && (3.179) \\
&= \inf_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \ge g \text{ on } E \right\} && (3.180) \\
&= \sup_{h \in D(L)} \left\{ h(x_0) : \left( I - \tfrac{1}{\lambda} L \right) h \le g \text{ on } E \right\} && (3.181) \\
&= \sup_{h \in D(L)} \min_{x \in E} \left\{ h(x_0) + \left[ g - \left( I - \tfrac{1}{\lambda} L \right) h \right](x) \right\}. && (3.182)
\end{aligned}
\]
Moreover, for $\lambda > 0$ and $f \in D(L)$, the equality $R(\lambda)(\lambda I - L) f = f$ holds.

Proof. By repeating the result in Proposition 3.17 for all λ1 ∈ (0, 2λ0) instead of λ0 we get these equalities for λ in the interval (0, 4λ0). This procedure can be repeated once more. Induction then yields the desired result. That for λ > 0 and f ∈ D(L) the equality R(λ)(λI − L)f = f holds can be seen by the following arguments. By definition we have

\[
\lambda R(\lambda)\Bigl(I-\frac{1}{\lambda}L\Bigr)f(x_0)=\inf\Bigl\{h(x_0):\Bigl(I-\frac{1}{\lambda}L\Bigr)h\ge\Bigl(I-\frac{1}{\lambda}L\Bigr)f\ \text{on }E_0,\ h\in D(L)\Bigr\}\le f(x_0).\qquad(3.183)
\]

We also have

\[
\lambda R(\lambda)\Bigl(I-\frac{1}{\lambda}L\Bigr)f(x_0)=\sup\Bigl\{h(x_0):\Bigl(I-\frac{1}{\lambda}L\Bigr)h\le\Bigl(I-\frac{1}{\lambda}L\Bigr)f\ \text{on }E_0,\ h\in D(L)\Bigr\}\ge f(x_0).\qquad(3.184)
\]

The stated equality is a consequence of (2.157), (3.183), and (3.184).
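The order-theoretic descriptions (3.180)–(3.181) of the resolvent can be observed concretely on a finite state space, where L is a generator (Q-)matrix and R(λ) = (λI − Q)⁻¹. The sketch below is an illustration added here, not part of the original text; the matrix Q and the data are randomly generated. It checks that h* = λR(λ)g satisfies (I − (1/λ)Q)h* = g exactly, and that every h with (I − (1/λ)Q)h ≤ g lies below h* pointwise, because λ(λI − Q)⁻¹ is positivity preserving.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 5, 2.0

# A random generator (Q-)matrix: nonnegative off-diagonal entries, zero row sums.
Q = rng.random((n, n))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

g = rng.random(n)
I = np.eye(n)
h_star = lam * np.linalg.solve(lam * I - Q, g)   # h* = lambda R(lambda) g

# h* itself is admissible: (I - Q/lam) h* = g exactly.
assert np.allclose((I - Q / lam) @ h_star, g)

# Any h with (I - Q/lam) h <= g satisfies h <= h* pointwise, since
# lam (lam I - Q)^{-1} has nonnegative entries (it is an inverse M-matrix).
for _ in range(100):
    p = rng.random(n)                              # p >= 0
    h = lam * np.linalg.solve(lam * I - Q, g - p)  # then (I - Q/lam) h = g - p <= g
    assert np.all(h <= h_star + 1e-9)

print("sup representation of lambda R(lambda) g verified")
```

So on a finite state space the supremum in (3.181) is attained at h* = λR(λ)g, in line with the corollary.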

Proof (Proof of Proposition 3.17). The equality of each term in (3.164) and (3.165) follows from the Korovkin property on E0 as exhibited in the formulas (3.142) and (3.143) of Definition 3.13, provided that the limit in (3.164) exists. The existence of this limit and its identification are given in (3.166) and (3.167), respectively. For this to make sense we must be sure that the partial sums of the first n + 1 terms of the quantities in (3.164) and (3.166) are equal. In fact a rewriting of the quantity in (3.164) before taking the limit shows that the quantity in (3.174) is also equal to (3.164); i.e.

\[
\inf_{h_0\in D(L)}\,\sup_{x_1\in E_0}\ \inf_{h_1\in D(L)}\,\sup_{x_2\in E_0}\cdots\inf_{h_n\in D(L)}\,\sup_{x_{n+1}\in E_0}\Bigl\{\sum_{j=0}^{n}\cdots\Bigr\}
=\inf_{h_j\in D(L),\,0\le j\le n}\ \max_{x_j\in E_0,\,1\le j\le n+1}\Bigl\{\sum_{j=0}^{n}\cdots\Bigr\}.
\]

In fact the same is true for the corresponding partial sums in (3.165) and (3.175), but with inf instead of sup, and min instead of max. For 0 < λ < 2λ0 we have |λ0 − λ| < λ0. Since

\[
|\lambda_0-\lambda|\,\bigl\|R(\lambda_0)f\bigr\|_\infty\le\Bigl|\frac{\lambda_0-\lambda}{\lambda_0}\Bigr|\,\|f\|_\infty,\qquad f\in C_b(E,\mathbb R),\qquad(3.185)
\]

the sum in (3.167) converges uniformly. The equality of the sum of the first n + 1 terms in (3.164) and (3.166) can be proved as follows. For 1 ≤ k ≤ n we may employ the following identities:

\[
\begin{aligned}
&\inf_{h_0\in D(L)}\,\sup_{x_1\in E_0}\cdots\inf_{h_n\in D(L)}\,\sup_{x_{n+1}\in E_0}\ \sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}\\
&=\inf_{h_0\in D(L)}\,\sup_{x_1\in E_0}\cdots\inf_{h_{n-k}\in D(L)}\,\sup_{x_{n-k+1}\in E_0}\Bigl[\sum_{j=0}^{n-k}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}\\
&\qquad+\frac{\lambda}{\lambda_0}\sum_{j=n-k+1}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\bigl(\lambda_0R(\lambda_0)\bigr)^{j-(n-k)}g(x_{n-k+1})\Bigr].\qquad(3.186)
\end{aligned}
\]
The equality in (3.186) can be proved by induction with respect to k, by repeatedly employing the definition of λ0R(λ0)f, f ∈ Cb(E, R), together with its linearity. Using (3.186) with k = n we get

\[
\begin{aligned}
&\inf_{h_0\in D(L)}\,\sup_{x_1\in E_0}\cdots\inf_{h_n\in D(L)}\,\sup_{x_{n+1}\in E_0}\ \sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}\\
&=\inf_{h_0\in D(L)}\,\sup_{x_1\in E_0}\Bigl[\Bigl\{h_0(x_0)+\frac{\lambda}{\lambda_0}\,g(x_1)-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_0(x_1)\Bigr\}+\frac{\lambda}{\lambda_0}\sum_{j=1}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\bigl(\lambda_0R(\lambda_0)\bigr)^{j}g(x_1)\Bigr]\\
&=\frac{\lambda}{\lambda_0}\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\bigl(\lambda_0R(\lambda_0)\bigr)^{j+1}g(x_0).\qquad(3.187)
\end{aligned}
\]

From the equality of (3.164) and (3.165), together with (3.187), we infer

\[
\lambda R(\lambda)g(x_0)=\lim_{n\to\infty}\frac{\lambda}{\lambda_0}\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\bigl(\lambda_0R(\lambda_0)\bigr)^{j+1}g(x_0).\qquad(3.188)
\]

Notice that by (3.185) the series in (3.188) converges uniformly. Consequently, the equalities of the quantities in (3.164), (3.165), (3.166), (3.167), (3.174), and (3.175) follow, and all these expressions are equal to λR(λ)g(x0). Next let (hj)j∈N ⊂ D(L) be any sequence with the following property:

\[
(\lambda I-L)h_0=\frac{\lambda}{\lambda_0}\sum_{j=1}^{\infty}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j-1}(\lambda I-L)h_j,\qquad(3.189)
\]

where the series in (3.189) converges uniformly. Then by the maximum principle the series \(\frac{\lambda}{\lambda_0}\sum_{j=1}^{\infty}\bigl(1-\frac{\lambda}{\lambda_0}\bigr)^{j-1}h_j\) converges uniformly as well. So it makes sense to write:

\[
h_0=\frac{\lambda}{\lambda_0}\sum_{j=1}^{n+1}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j-1}h'_j,\quad\text{where }h'_j=h_j,\ 1\le j\le n,\ \text{and}\quad h'_{n+1}=\sum_{j=n+1}^{\infty}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j-n-1}h_j.\qquad(3.190)
\]

Again the series in (3.190) converges uniformly. From the representation of h0 in (3.190) we infer the equalities:
\[
\begin{aligned}
&\inf_{h_j\in D(L),\,0\le j\le n}\ \max_{x_j\in E_0,\,1\le j\le n+1}\ \sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}\qquad(3.191)\\
&=\inf_{\substack{h_j\in D(L),\ j\ge 0\\ (\lambda I-L)h_0=\frac{\lambda}{\lambda_0}\sum_{j=1}^{\infty}\left(1-\frac{\lambda}{\lambda_0}\right)^{j-1}(\lambda I-L)h_j}}\ \max_{\substack{x_j\in E_0\\ 1\le j\le n+1}}\ \sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}.\\
&\hspace{10cm}(3.192)
\end{aligned}
\]

Hence the equality of (3.174) and (3.168) follows. A similar argument shows the equality of (3.175) and (3.173). Of course, here we used the equality of (3.191) and (3.192) with inf instead of sup, and max replaced with min and vice versa. So we have equality of the following expressions: (3.164), (3.165), (3.166), (3.167), (3.168), (3.173), (3.174), (3.175). The proof of the fact that these quantities are also equal to (3.169), (3.170), (3.171), and (3.172) is still missing. Therefore we first show that the expression in (3.168) is greater than or equal to (3.169). In a similar manner it is shown that the expression in (3.173) is less than or equal to (3.172): in fact, by applying the inequality (3.168) ≥ (3.169) to −g instead of +g we obtain that (3.173) is less than or equal to (3.172). From the (local) maximum principle it will follow that the expression in (3.169) is greater than or equal to (3.172). As a consequence we will obtain that, with the exception of (3.170) and (3.171), all quantities in Proposition 3.17 are equal. Proving the equality of (3.169) and (3.170), and of (3.171) and (3.172), is a separate issue. In fact the equality of (3.169) and (3.170) follows from equality (3.153) in Lemma 3.16, and the equality of (3.171) and (3.172) follows from equality (3.154) in the same lemma.

(3.168) ≥ (3.169). Fix the subset E0 of E, and let (hj)j∈N ⊂ D(L) be a sequence with the following property:

\[
Lh_0=\lim_{n\to\infty}\frac{\lambda}{\lambda_0}\sum_{j=1}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j-1}Lh_j.
\]

Here the convergence is uniform on E0. In fact each hj may be chosen equal to h0. In (3.168) we choose all xj = x ∈ E0. Then we get
\[
\begin{aligned}
&\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}\Bigl\{h_j(x_j)+\frac{\lambda}{\lambda_0}\,g(x_{j+1})-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h_j(x_{j+1})\Bigr\}\\
&=h_0(x_0)+\sum_{j=1}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}h_j(x)+\frac{\lambda}{\lambda_0}\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}g(x)-h_0(x)\\
&\qquad-\sum_{j=1}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}h_j(x)+\frac{1}{\lambda}L\Bigl[\frac{\lambda}{\lambda_0}\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}h_j\Bigr](x)\\
&=h_0(x_0)+\frac{\lambda}{\lambda_0}\sum_{j=0}^{n}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j}g(x)-\Bigl(I-\frac{1}{\lambda}L\Bigr)h_0(x)\\
&\qquad-\frac{1}{\lambda}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{n+1}L\Bigl[\frac{\lambda}{\lambda_0}\sum_{j=n+1}^{\infty}\Bigl(1-\frac{\lambda}{\lambda_0}\Bigr)^{j-n-1}h_j\Bigr](x).\qquad(3.193)
\end{aligned}
\]

The expression in (3.193) tends to

\[
h_0(x_0)+g(x)-\Bigl(I-\frac{1}{\lambda}L\Bigr)h_0(x)\qquad\text{uniformly on }E_0,\qquad(3.194)
\]

and consequently, since h0 ∈ D(L) may be chosen arbitrarily, we see that (3.168) ≥ (3.169).

(3.173) ≤ (3.172). The proof of this inequality follows the same lines as the proof of (3.168) ≥ (3.169). In fact it follows from the latter inequality by applying it to −g instead of g. The reader is invited to check the details.

(3.169) ≥ (3.172). Consider the mapping Λ+ : Cb(E, R) → [−∞, +∞) defined by

\[
\Lambda^{+}(g)=\inf_{h\in D(L)}\ \sup_{x\in E_0}\Bigl\{h(x_0)+g(x)-\Bigl(I-\frac{1}{\lambda}L\Bigr)h(x)\Bigr\},\qquad(3.195)
\]

where g ∈ Cb(E, R). From the σ-local maximum principle (see Definition 3.14) and inequality (3.156) in Lemma 3.16 it follows that Λ+ attains its values in R. In addition, the functional Λ+ is sub-additive, and the expression in (3.172) is equal to −Λ+(−g). It follows that

\[
\Lambda^{+}(g)+\Lambda^{+}(-g)\ge\Lambda^{+}(0)=\inf_{h\in D(L)}\Bigl\{h(x_0):\Bigl(I-\frac{1}{\lambda}L\Bigr)h\ge0\ \text{on }E_0\Bigr\}\ge0.\qquad(3.196)
\]

In (3.196) we used the σ-local maximum principle: compare with the arguments in (3.163) of the proof of inequality (3.157) in Lemma 3.16.
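The representation (3.188) is the Neumann series for the resolvent, and on a finite state space it can be checked numerically. The sketch below is an illustration added here, not part of the original text; the generator matrix Q and the data are randomly generated. It compares the partial sums (λ/λ0) Σⱼ (1 − λ/λ0)ʲ (λ0R(λ0))^{j+1} g with λR(λ)g computed directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
lam0, lam = 1.0, 1.5          # 0 < lam < 2*lam0, so |1 - lam/lam0| < 1

# A random generator matrix Q and a test function g on n states.
Q = rng.random((n, n))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))
g = rng.random(n)
I = np.eye(n)

R0 = np.linalg.inv(lam0 * I - Q)                 # R(lam0)
target = lam * np.linalg.solve(lam * I - Q, g)   # lam R(lam) g

# Partial sums of (3.188): (lam/lam0) * sum_j (1 - lam/lam0)^j (lam0 R(lam0))^{j+1} g.
acc = np.zeros(n)
term = lam0 * R0 @ g                             # (lam0 R(lam0))^1 g
for j in range(200):
    acc += (1 - lam / lam0) ** j * term
    term = lam0 * R0 @ term                      # raise the power by one
series = (lam / lam0) * acc

assert np.allclose(series, target, atol=1e-10)
print("Neumann series (3.188) matches lam*R(lam)*g")
```

Since ‖λ0R(λ0)‖ ≤ 1 and |1 − λ/λ0| < 1, the terms decay geometrically, which is exactly the uniform convergence asserted after (3.185).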

Proposition 3.19. Suppose that the operator L possesses the Korovkin property on E0. Then for all λ > 0 and f ∈ Cb(E) the quantities in (3.169), (3.170), (3.171), and (3.172) are equal for all x0 ∈ E. These quantities are also equal to

\[
\sup_{v\in H^+(E)}\ \inf_{h\in D(L)}\Bigl\{h(x_0):v\Bigl(I-\frac{1}{\lambda}L\Bigr)h\ge vg\Bigr\}\qquad(3.197)
\]
\[
=\inf_{v\in H^+(E)}\ \sup_{h\in D(L)}\Bigl\{h(x_0):v\Bigl(I-\frac{1}{\lambda}L\Bigr)h\le vg\Bigr\}.\qquad(3.198)
\]

Recall that H+(E) stands for all functions u ∈ H(E), u ≥ 0, with the property that for every α > 0 the level set {u ≥ α} is a compact subset of E. Observe that for every u ∈ H(E) there exists a function u0 ∈ H+(E) such that |u(x)| ≤ u0(x) for all x ∈ E.
Corollary 3.20. Suppose that the operator L possesses the Korovkin property on E0, and is positive Tβ-dissipative on E0. Then the family {λR(λ) : λ ≥ λ0}, as defined in Proposition 3.17, is Tβ-equi-continuous (on E0) for some λ0 > 0.

Proof. We use the representation in (3.171):

\[
\lambda R(\lambda)f(x_0)=\sup_{h\in D(L)}\Bigl\{h(x_0):\Bigl(I-\frac{1}{\lambda}L\Bigr)h\le f\ \text{on }E_0\Bigr\}.\qquad(3.199)
\]

Let u ∈ H+(E) and x0 ∈ E. Since L is supposed to be positive Tβ-dissipative on E0, there exist λ0 > 0 and v ∈ H+(E0) such that

\[
\lambda u(x_0)h(x_0)\le\sup_{x\in E_0}v(x)\bigl(\lambda h(x)-Lh(x)\bigr)\qquad(3.200)
\]

for all h ∈ D(L) which are real-valued and for all λ ≥ λ0. For the precise definition of positive Tβ-dissipativity (on E) see (3.15) in Definition 3.5. From (3.199) and (3.200) we infer:

\[
\begin{aligned}
u(x_0)\,\lambda R(\lambda)f(x_0)&=\sup_{h\in D(L)}\bigl\{u(x_0)h(x_0):\lambda h-Lh\le\lambda f\ \text{on }E_0\bigr\}\\
&\le\sup_{h\in D(L)}\Bigl\{\frac{1}{\lambda}\sup_{x\in E_0}v(x)\bigl(\lambda h(x)-Lh(x)\bigr):\lambda h-Lh\le\lambda f\ \text{on }E_0\Bigr\}\\
&\le\sup_{x\in E_0}v(x)f(x).\qquad(3.201)
\end{aligned}
\]

Since by construction ℜR(λ)f = R(λ)ℜf, (3.201) implies:

\[
\|u\,\lambda R(\lambda)f\|_\infty\le\|vf\|_\infty,\qquad f\in C_b(E_0),\ \lambda\ge\lambda_0.\qquad(3.202)
\]

The conclusion in Corollary 3.20 is a consequence of (3.202).
In the following theorem we wrap up more or less everything we proved so far about an operator with the Korovkin property on a subset E0 of E. Theorem 3.21 and the related observations are used in the proof of item (e) of Theorem 1.39.

Theorem 3.21. Let E0 be a Polish subspace of the Polish space E. Suppose that every function f ∈ Cb(E0) can be extended to a bounded continuous function on E. Let L be a linear operator with domain and range in Cb(E) which assigns the zero function to a constant function. Suppose that the operator L possesses the following properties:

1. Its domain D(L) is Tβ-dense in Cb(E).
2. The operator L assigns real-valued functions to real-valued functions: ℜ(Lf) = Lℜf for all f ∈ D(L).
3. If f ∈ D(L) vanishes on E0, then Lf vanishes on E0 as well.
4. The operator L satisfies the maximum principle on E0.
5. The operator L is positive Tβ-dissipative on E0.
6. The operator L is sequentially λ-dominant on E0 for some λ > 0.
7. The operator L has the Korovkin property on E0.

Let L↾E0 be the operator defined by D(L↾E0) = {f↾E0 : f ∈ D(L)} and L↾E0(f↾E0) = (Lf)↾E0, f ∈ D(L). Then the operator L↾E0 possesses a unique linear extension to the generator L0 of a Feller semigroup {S0(t) : t ≥ 0} on Cb(E0).

In addition, the time-homogeneous Markov process associated to the Feller semigroup {S0(t) : t ≥ 0} serves as the unique solution to the martingale problem associated with L.

Proof. Existence. First we prove that the restriction operator L↾E0 is well-defined and that it is Tβ-densely defined. The fact that it is well-defined follows from 3. In order to prove that it is Tβ-densely defined, we use a Hahn-Banach type argument. Let µ̃ be a bounded Borel measure on E0 such that ⟨f↾E0, µ̃⟩ = ∫_{E0} f dµ̃ = 0 for all f ∈ D(L). Define the measure µ on the Borel field of E by µ(B) = µ̃(B ∩ E0), B ∈ E. Then ⟨f, µ⟩ = 0 for all f ∈ D(L). Since D(L) is Tβ-dense in Cb(E), we infer ⟨f, µ⟩ = 0 for all f ∈ Cb(E). Let f̃ ∈ Cb(E0). Then there exists f ∈ Cb(E) such that f = f̃ on E0, and hence

\[
\bigl\langle\tilde f,\tilde\mu\bigr\rangle=\bigl\langle f\!\restriction_{E_0},\tilde\mu\bigr\rangle=\langle f,\mu\rangle=0.\qquad(3.203)
\]

From (3.203) we see that a bounded Borel measure which annihilates D(L↾E0) also vanishes on Cb(E0). By the theorem of Hahn-Banach, in combination with the fact that every element of the dual of (Cb(E0), Tβ) can be identified with a bounded Borel measure on E0, we see that the subspace D(L↾E0) is Tβ-dense in Cb(E0). Define the family of operators {λR(λ) : λ > 0} as in Proposition 3.17. By properties 4 and 7 such definitions make sense. Moreover, the family {R(λ) : λ > 0} possesses the resolvent property: R(λ) − R(µ) = (µ − λ)R(µ)R(λ), λ > 0, µ > 0. It also follows that R(λ)(λI − D1 − L)f = f on E0 for f ∈ D^(1)(L). This equality is an easy consequence of the inequalities in (2.157): see Corollary 3.18. Fix λ > 0 and f ∈ Cb(E0). If f is of the form f = R(λ)g, g ∈ Cb(E0), then by the resolvent property we have

\[
\alpha R(\alpha)f-f=\alpha R(\alpha)R(\lambda)g-R(\lambda)g=\frac{\alpha}{\alpha-\lambda}R(\lambda)g-R(\lambda)g-\frac{\alpha R(\alpha)g}{\alpha-\lambda}.\qquad(3.204)
\]

Since ‖αR(α)g‖∞ ≤ ‖g‖∞, g ∈ Cb(E0), the equality in (3.204) yields ‖·‖∞-lim_{α→∞}(αR(α)f − f) = 0 for f of the form f = R(λ)g, g ∈ Cb(E0).

Since g = R(λ)(λI − D1 − L)g on E0, g ∈ D^(1)(L), it follows that

\[
\lim_{\alpha\to\infty}\|\alpha R(\alpha)g-g\|_\infty=0\qquad\text{for }g\in D^{(1)}(L)=D(D_1)\cap D(L).\qquad(3.205)
\]

As was proved in Corollary 3.20 there exists λ0 > 0 such that the family {λR(λ) : λ ≥ λ0} is Tβ-equi-continuous. Hence for u ∈ H+(E0) there exists v ∈ H+(E0) such that for α ≥ λ0 we have

\[
\|u\,\alpha R(\alpha)g\|_\infty\le\|vg\|_\infty,\qquad g\in C_b(E_0).\qquad(3.206)
\]

Fix ε > 0, and choose, for given f ∈ Cb(E0) and u ∈ H+(E0), the function g ∈ D(L↾E0) in such a way that

\[
\|u(f-g)\|_\infty+\|v(f-g)\|_\infty\le\tfrac{2}{3}\varepsilon.\qquad(3.207)
\]

Since D(L↾E0) is Tβ-dense in Cb(E0), such a choice of g is possible. The inequality (3.207) and the identity αR(α)f − f = αR(α)(f − g) − (f − g) + αR(α)g − g yield

\[
\begin{aligned}
\|u(\alpha R(\alpha)f-f)\|_\infty&\le\|u\,\alpha R(\alpha)(f-g)\|_\infty+\|u(f-g)\|_\infty+\|u(\alpha R(\alpha)g-g)\|_\infty\\
&\le\|v(f-g)\|_\infty+\|u(f-g)\|_\infty+\|u(\alpha R(\alpha)g-g)\|_\infty\\
&\le\tfrac{2}{3}\varepsilon+\|u(\alpha R(\alpha)g-g)\|_\infty.\qquad(3.208)
\end{aligned}
\]

From (3.205) and (3.208) we infer

\[
T_\beta\text{-}\lim_{\alpha\to\infty}\alpha R(\alpha)f=f,\qquad f\in C_b(E_0).\qquad(3.209)
\]

Define the operator L0 in Cb(E0) as follows. Its domain is given by D(L0) = R(λ)Cb(E0), λ > 0. By the resolvent property the space R(λ)Cb(E0) does not depend on λ > 0, and so D(L0) is well-defined. The operator L0 : D(L0) → Cb(E0) is defined by L0R(λ)f = λR(λ)f − f, f ∈ Cb(E0). If R(λ)f1 = R(λ)f2, f1, f2 ∈ Cb(E0), then R(λ)(f2 − f1) = 0, and by the resolvent property we see that αR(α)(f2 − f1) = 0 for all α > 0. From (3.209) we infer f2 = f1. In other words, the operator L0 is well-defined. Since the operators R(λ), λ > 0, are Tβ-continuous, it follows that the graph of the operator L0 is Tβ-closed. As in the proof of (iii) ⟹ (i) we have, like in (3.111),

\[
\widetilde S_0(t)f=\lim_{\lambda\to\infty}e^{-\lambda t}\sum_{k=0}^{\infty}\frac{(\lambda t)^k}{k!}\bigl(\lambda R(\lambda)\bigr)^kf=\lim_{\lambda\to\infty}e^{-\lambda t}e^{\lambda t\,\lambda R(\lambda)}f=S_0(t)f,\qquad(3.210)
\]

where the operator S̃0(t) is defined by means of the Hausdorff-Bernstein-Widder Laplace inversion theorem (compare with (3.108)):

\[
R(\lambda)f(x)=(\lambda I-L_0)^{-1}f(x)=\int_0^\infty e^{-\lambda\rho}\,\widetilde S_0(\rho)f(x)\,d\rho,\qquad\Re\lambda>0,\ x\in E_0.\qquad(3.211)
\]
For a function f belonging to the space R(λ)Cb(E0) the equality S̃0(t)f = S0(t)f holds: see (3.113). Here S0(t)f is defined as the uniform limit in (3.210). Since the operator L↾E0 is sequentially λ-dominant for some λ > 0, we infer that the family of operators {(µR(λ + µ))^k : µ ≥ 0, k ∈ N} is Tβ-equi-continuous: see (1.43) in Proposition 1.21. For f ∈ R(α)Cb(E0) we have

\[
\sup\bigl\{e^{-\lambda t}S_0(t)f:t\ge0\bigr\}=\sup\bigl\{\bigl(\mu R(\lambda+\mu)\bigr)^kf:\mu\ge0,\ k\in\mathbb N\bigr\}.\qquad(3.212)
\]

From (3.212), combined with the Tβ-equi-continuity and the Tβ-density of D(L↾E0), we see that each operator S0(t) has a Tβ-continuous extension to all of Cb(E0). One way of achieving this is by fixing f ∈ Cb(E0) and considering the family {αR(α)f : α ≥ λ}. Then Tβ-lim_{α→∞} αR(α)f = f. Let u ∈ H+(E0). By the Tβ-equi-continuity of the family {e^{−λt}S0(t) : t ≥ 0} we see that

\[
\lim_{\alpha,\beta\to\infty}\ \sup_{t\ge0}\bigl\|u\,e^{-\lambda t}S_0(t)\bigl(\beta R(\beta)f-\alpha R(\alpha)f\bigr)\bigr\|_\infty=0.\qquad(3.213)
\]

Since the functions (t, x) ↦ S0(t)(αR(α)f)(x), α ≥ λ, are continuous, the same is true for the function (t, x) ↦ [S0(t)f](x), where S0(t)f = Tβ-lim_{α→∞} S0(t)(αR(α)f). Of course, for almost all t ≥ 0 we have S0(t)f(x) = S̃0(t)f(x) for all x ∈ E0. Since

\[
T_\beta\text{-}\lim_{t\downarrow0}\frac1t\bigl(I-e^{-\lambda t}S_0(t)\bigr)R(\lambda)f=f,\qquad f\in C_b(E_0),
\]

we see that the operator L0 generates the semigroup {S0(t) : t ≥ 0}. The continuous extension of S0(t), which was originally defined on R(λ)Cb(E0), to Cb(E0) is again denoted by S0(t). Let f ∈ D(L). Since R(λ)(λf − Lf) = f on E0, we have D(L↾E0) ⊂ D(L0), and

\[
L_0f=L_0R(\lambda)(\lambda I-L)f=\lambda R(\lambda)(\lambda I-L)f-(\lambda I-L)f=\lambda f-\lambda f+Lf=Lf\qquad(3.214)
\]

on E0. From (3.214) we see that the operator L0 extends the operator L↾E0.
Uniqueness of Feller semigroups. Let L1 and L2 be two extensions of the operator L↾E0 which generate Feller semigroups. Let {R1(λ) : λ > 0} and {R2(λ) : λ > 0} be the corresponding resolvent families. Since L1 extends L↾E0 we obtain, for h ∈ D(L),

\[
\lambda_0R_1(\lambda_0)\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h=R_1(\lambda_0)(\lambda_0I-L_1)h=h.\qquad(3.215)
\]

Then by the maximum principle and (3.215) we infer

\[
\begin{aligned}
\sup_{h\in D(L)}\ \inf_{x\in E_0}\Bigl(h(x_0)+g(x)-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h(x)\Bigr)
&\le\sup_{h\in D(L)}\Bigl(h(x_0)+\lambda_0R_1(\lambda_0)\Bigl(g-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h\Bigr)(x_0)\Bigr)\\
&=\sup_{h\in D(L)}\Bigl(h(x_0)+\lambda_0R_1(\lambda_0)g(x_0)-\lambda_0R_1(\lambda_0)\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h(x_0)\Bigr)\\
&=\sup_{h\in D(L)}\bigl(h(x_0)+\lambda_0R_1(\lambda_0)g(x_0)-h(x_0)\bigr)\\
&=\lambda_0R_1(\lambda_0)g(x_0)\\
&\le\inf_{h\in D(L)}\ \sup_{x\in E_0}\Bigl(h(x_0)+g(x)-\Bigl(I-\frac{1}{\lambda_0}L\Bigr)h(x)\Bigr).\qquad(3.216)
\end{aligned}
\]

The same reasoning can be applied to the operator R2(λ0). Since the extremities in (3.216) are equal, we see that R1(λ0) = R2(λ0). Hence we get (λ0 − L1)^{−1} = (λ0 − L2)^{−1}, and consequently L1 = L2.

Of course the same arguments work if E0 = E.


Uniqueness of solutions to the martingale problem. Let L0 be the (unique) extension of L which generates a Feller semigroup {S0(t) : t ≥ 0}, and let

\[
\bigl\{(\Omega,\mathcal F,\mathbb P_x),\ (X(t),\,t\ge0),\ (\vartheta_t,\,t\ge0),\ (E,\mathcal E)\bigr\}
\]

be the corresponding time-homogeneous Markov process with Ex[g(X(t))] = S0(t)g(x), g ∈ Cb(E), x ∈ E, t ≥ 0. Then the family {Px : x ∈ E} is a solution to the martingale problem associated to L. The proof of the uniqueness part follows a pattern similar to the proof of the uniqueness part for linear extensions of L which generate Feller semigroups. We first show that the family of probability measures {Px : x ∈ E} is a solution to the martingale problem associated to the operator L. Let f be a member of D(L) and put

\[
M_f(t)=f(X(t))-f(X(0))-\int_0^tLf(X(s))\,ds.
\]

Then, for t2 > t1, we have

\[
\mathbb E_x\bigl[M_f(t_2)\bigm|\mathcal F_{t_1}\bigr]-M_f(t_1)=\mathbb E_x\bigl[M_f(t_2-t_1)\circ\vartheta_{t_1}\bigm|\mathcal F_{t_1}\bigr]
=\mathbb E_{X(t_1)}\bigl[M_f(t_2-t_1)\bigr],\qquad(3.217)
\]

where in the last step we used the Markov property. Since, in addition, by virtue of the fact that L0, which is an extension of L, generates the semigroup {S0(t) : t ≥ 0}, we have

\[
\begin{aligned}
\mathbb E_z\bigl[M_f(t)\bigr]&=S_0(t)f(z)-f(z)-\int_0^tS_0(u)Lf(z)\,du\\
&=S_0(t)f(z)-f(z)-\int_0^t\frac{\partial}{\partial u}\bigl(S_0(u)f(z)\bigr)\,du\\
&=S_0(t)f(z)-f(z)-\bigl(S_0(t)f(z)-S_0(0)f(z)\bigr)=0,
\end{aligned}
\]

the assertion about the existence of solutions to the martingale problem follows from (3.217). Next we prove uniqueness of solutions to the martingale problem. Its proof resembles the way we proved the uniqueness of extensions of L which generate Feller semigroups. Let {P^(1)_x : x ∈ E} and {P^(2)_x : x ∈ E} be two solutions to the martingale problem for L. Let h ∈ D(L), and consider
\[
\begin{aligned}
&\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\Bigl[h(X(s))-\Bigl(I-\frac1\lambda L\Bigr)h(X(t+s))\Bigm|\mathcal F_s\Bigr]dt\\
&=h(X(s))-\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[(\lambda I-L)h(X(t+s))\bigm|\mathcal F_s\bigr]dt\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[h(X(t+s))\bigm|\mathcal F_s\bigr]dt
+\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[Lh(X(t+s))\bigm|\mathcal F_s\bigr]dt\\
&\qquad\text{(integration by parts)}\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[h(X(t+s))\bigm|\mathcal F_s\bigr]dt
+\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\Bigl[\int_0^tLh(X(\rho+s))\,d\rho\Bigm|\mathcal F_s\Bigr]dt\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[h(X(t+s))\bigm|\mathcal F_s\bigr]dt
+\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\Bigl[\int_s^{t+s}Lh(X(\rho))\,d\rho\Bigm|\mathcal F_s\Bigr]dt\\
&\qquad\text{(martingale property)}\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[h(X(t+s))\bigm|\mathcal F_s\bigr]dt
+\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_x\bigl[h(X(t+s))-h(X(s))\bigm|\mathcal F_s\bigr]dt=0.\qquad(3.218)
\end{aligned}
\]

Fix x0 ∈ E, g ∈ Cb(E), and s > 0. Then from (3.218) it follows that, for h ∈ D(L),

\[
\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{x_0}\bigl[g(X(t+s))\bigm|\mathcal F_s\bigr]dt
=\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{x_0}\Bigl[h(X(s))+g(X(t+s))-\Bigl(I-\frac1\lambda L\Bigr)h(X(t+s))\Bigm|\mathcal F_s\Bigr]dt,
\]

and hence

\[
\Lambda^-(g,X(s),\lambda)\le\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{x_0}\bigl[g(X(t+s))\bigm|\mathcal F_s\bigr]dt\le\Lambda^+(g,X(s),\lambda),\qquad(3.219)
\]

for j = 1, 2, where

\[
\begin{aligned}
\Lambda^+(g,x_0,\lambda)&=\inf_{\substack{\Gamma\subset D(L)\\ \#\Gamma<\infty}}\ \sup_{\substack{\Phi\subset E_0\\ \#\Phi<\infty}}\ \min_{h\in\Gamma}\ \max_{x\in\Phi\cup\{\triangle\}}\Bigl\{h(x_0)+\Bigl[g-\Bigl(I-\frac1\lambda L\Bigr)h\Bigr](x)\Bigr\}\\
&=\inf_{h\in D(L)}\ \sup_{x\in E_0}\Bigl\{h(x_0)+\Bigl[g-\Bigl(I-\frac1\lambda L\Bigr)h\Bigr](x)\Bigr\},\qquad(3.220)
\end{aligned}
\]

and

\[
\begin{aligned}
\Lambda^-(g,x_0,\lambda)&=\sup_{\substack{\Gamma\subset D(L)\\ \#\Gamma<\infty}}\ \inf_{\substack{\Phi\subset E_0\\ \#\Phi<\infty}}\ \max_{h\in\Gamma}\ \min_{x\in\Phi\cup\{\triangle\}}\Bigl\{h(x_0)+\Bigl[g-\Bigl(I-\frac1\lambda L\Bigr)h\Bigr](x)\Bigr\}\\
&=\sup_{h\in D(L)}\ \inf_{x\in E_0}\Bigl\{h(x_0)+\Bigl[g-\Bigl(I-\frac1\lambda L\Bigr)h\Bigr](x)\Bigr\}.\qquad(3.221)
\end{aligned}
\]

We also have

\[
\begin{aligned}
&\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\Bigl[h(X(0))-\Bigl(I-\frac1\lambda L\Bigr)h(X(t))\Bigr]dt\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[h(X(t))\bigr]dt+\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[Lh(X(t))\bigr]dt\\
&\qquad\text{(integration by parts)}\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[h(X(t))\bigr]dt+\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\Bigl[\int_0^tLh(X(\rho))\,d\rho\Bigr]dt\\
&\qquad\text{(martingale property)}\\
&=h(X(s))-\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[h(X(t))\bigr]dt+\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[h(X(t))-h(X(0))\bigr]dt=0,\qquad(3.222)
\end{aligned}
\]

where in the first and final step we used X(0) = z, P^(j)_z-almost surely. In the same spirit as we obtained (3.219) from (3.218) we get
\[
\Lambda^-(g,X(s),\lambda)\le\lambda\int_0^\infty e^{-\lambda t}\,\mathbb E^{(j)}_{X(s)}\bigl[g(X(t))\bigr]dt\le\Lambda^+(g,X(s),\lambda),\qquad(3.223)
\]

for j = 1, 2. Since, by Proposition 3.17 (formulas (3.169) and (3.172)), the identity Λ+(g, x, λ) = Λ−(g, x, λ) is true for g ∈ Cb(E), x ∈ E, λ > 0, we obtain, by putting s = 0, E^(1)_x[g(X(t))] = E^(2)_x[g(X(t))], t ≥ 0, g ∈ Cb(E). We also obtain, P^(1)_x-almost surely,

\[
\mathbb E^{(1)}_x\bigl[g(X(t+s))\bigm|\mathcal F_s\bigr]=\mathbb E^{(1)}_{X(s)}\bigl[g(X(t))\bigr],
\]

and, P^(2)_x-almost surely,

\[
\mathbb E^{(2)}_x\bigl[g(X(t+s))\bigm|\mathcal F_s\bigr]=\mathbb E^{(2)}_{X(s)}\bigl[g(X(t))\bigr],\qquad\text{for }t,s\ge0\text{ and }g\in C_b(E).
\]

It necessarily follows that P^(1)_x = P^(2)_x, x ∈ E. Consequently, the uniqueness of the solutions to the martingale problem for the operator L follows.

This completes the proof of Theorem 3.21.
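The exponential formula (3.210) can also be observed numerically on a finite state space. The sketch below is an illustration added here, not part of the original text; the generator matrix Q is randomly generated. It uses the algebraic identity e^{−λt} e^{λt·λR(λ)} = e^{tλQR(λ)}, where λQR(λ) is the Yosida approximation of Q, and compares it with e^{tQ} for increasing λ.

```python
import numpy as np

def expm_series(A, terms=80):
    """Matrix exponential e^A via truncated power series (adequate for small ||A||)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

rng = np.random.default_rng(2)
n, t = 4, 0.7
Q = rng.random((n, n))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

S_t = expm_series(t * Q)  # the semigroup S0(t) = e^{tQ}

# e^{-lam t} e^{lam t (lam R(lam))} = e^{t lam Q R(lam)}: the Yosida approximants.
errs = []
for lam in (10.0, 100.0, 1000.0):
    R = np.linalg.inv(lam * np.eye(n) - Q)      # R(lam)
    approx = expm_series(t * lam * Q @ R)
    errs.append(np.max(np.abs(approx - S_t)))

assert errs[0] > errs[1] > errs[2]   # error decreases as lambda grows
assert errs[2] < 1e-2
print("Yosida approximants converge:", [f"{e:.1e}" for e in errs])
```

The error decays roughly like 1/λ, in line with the limit λ → ∞ in (3.210).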

3.4 Continuous sample paths

The following Lemma 3.22 and Proposition 3.23 give a general condition which guarantees that the sample paths are Pτ,x-almost surely continuous on their life time.

Lemma 3.22. Let P(τ, x; t, B), 0 ≤ τ ≤ t ≤ T, x ∈ E, B ∈ E, be a sub-Markov transition function. Let (x, y) ↦ d(x, y) be a continuous metric on E × E and put Bε(x) = {y ∈ E : d(y, x) ≤ ε}. Fix t ∈ (0, T]. Then the following assertions are equivalent:

(a) For every compact subset K of E and for every ε > 0 the following equality holds:

\[
\lim_{s_1,s_2\to t,\ \tau<s_1<s_2\le t}\ \sup_{x\in K}\frac{P\bigl(s_1,x;s_2,E\setminus B_\varepsilon(x)\bigr)}{s_2-s_1}=0.\qquad(3.224)
\]

(b) For every compact subset K of E and for every open subset G of E such that G ⊃ K the following equality holds:

\[
\lim_{s_1,s_2\to t,\ \tau\le s_1<s_2\le t}\ \sup_{x\in K}\frac{P(s_1,x;s_2,E\setminus G)}{s_2-s_1}=0.\qquad(3.225)
\]
Proof. (a) ⟹ (b). Let G be an open subset of E and let K be a compact subset of G. Then there exist ε > 0, n ∈ N, and xj ∈ K such that

\[
G\supset\bigcup_{j=1}^{n}B_{2\varepsilon}(x_j)\supset\bigcup_{j=1}^{n}\operatorname{int}\bigl(B_\varepsilon(x_j)\bigr)\supset K.\qquad(3.226)
\]

For any x ∈ K there exists j0, 1 ≤ j0 ≤ n, such that d(x, xj0) < ε, and hence for y ∈ Bε(x) we get d(y, xj0) ≤ d(y, x) + d(x, xj0) < 2ε. It follows that Bε(x) ⊂ G. Consequently, for x ∈ K and τ ≤ s1 < s2 < t we get P(s1, x; s2, E \ G) ≤ P(s1, x; s2, E \ Bε(x)). So (b) follows from (a).

(b) ⟹ (a). Fix ε > 0 and let K be any compact subset of E. Like in the proof of the implication (a) ⟹ (b) we again choose elements xj ∈ K, 1 ≤ j ≤ n, such that K ⊂ ∪_{j=1}^n int(B_{ε/4}(xj)). Let x ∈ K ∩ B_{ε/4}(xj) and y ∈ B_{ε/2}(xj). Then d(y, x) ≤ d(y, xj) + d(xj, x) ≤ ½ε + ¼ε = ¾ε < ε. Suppose that x ∈ K ∩ B_{ε/4}(xj). For τ ≤ s1 < s2 < t it follows that

\[
P\bigl(s_1,x;s_2,E\setminus B_\varepsilon(x)\bigr)\le P\bigl(s_1,x;s_2,E\setminus\operatorname{int}\bigl(B_{\varepsilon/2}(x_j)\bigr)\bigr),
\]

and hence

\[
\sup_{x\in K}P\bigl(s_1,x;s_2,E\setminus B_\varepsilon(x)\bigr)\le\max_{1\le j\le n}\ \sup_{x\in K\cap B_{\varepsilon/4}(x_j)}P\bigl(s_1,x;s_2,E\setminus\operatorname{int}\bigl(B_{\varepsilon/2}(x_j)\bigr)\bigr).\qquad(3.227)
\]

The inequality in (3.227) together with the assumption in (b) easily implies (a). This concludes the proof of Lemma 3.22.
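As an illustration (added here, not part of the original text), one-dimensional Brownian motion satisfies condition (a) of Lemma 3.22 with the Euclidean metric. With P(s1, x; s2, ·) the Gaussian kernel with variance s2 − s1, the standard Gaussian tail estimate gives, uniformly in x,

\[
\frac{P\bigl(s_1,x;s_2,\mathbb R\setminus B_\varepsilon(x)\bigr)}{s_2-s_1}
=\frac{2}{s_2-s_1}\int_\varepsilon^\infty\frac{e^{-y^2/(2(s_2-s_1))}}{\sqrt{2\pi(s_2-s_1)}}\,dy
\le\frac{2}{\varepsilon\sqrt{2\pi}}\cdot\frac{e^{-\varepsilon^2/(2(s_2-s_1))}}{\sqrt{s_2-s_1}}\longrightarrow0
\]

as s2 − s1 ↓ 0, since the exponential factor dominates the power of s2 − s1. By Proposition 3.23 below, this kind of estimate is what produces almost surely continuous sample paths.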
Proposition 3.23. Let P(τ, x; t, B) be a sub-Markov transition function and let the process X(t) be as in (a) of Theorem 1.39. Fix (τ, x) ∈ [0, T] × E. Suppose that for every t ∈ [τ, T], for every compact subset K, and for every open subset G for which G ⊃ K the equality

\[
\lim_{s_1,s_2\uparrow t,\ \tau\le s_1<s_2<t}\ \sup_{y\in K}\frac{P(s_1,y;s_2,E\setminus G)}{s_2-s_1}=0
\]

holds. Then for every t ∈ (τ, T] the equality

\[
\inf_{\varepsilon>0}\ \sup_{t-\varepsilon\le s\le t}d\bigl(X(s),X(t)\bigr)\,\mathbf 1_{[X(t)\in E]}=0
\]

holds Pτ,x-almost surely. Here d : E × E → [0, ∞) is a continuous metric on E × E.


Proof. Put t_{j,n} = t − ε + j2^{−n}ε, 0 ≤ j ≤ 2^n. From Proposition 2.2, with X̃(s) instead of X(s), it follows that it suffices to prove that for every η > 0 the equality

\[
\inf_{\varepsilon>0}\lim_{n\to\infty}\mathbb P_{\tau,x}\Bigl[\max_{1\le j\le2^n}d\bigl(X(t_{j-1,n}),X(t_{j,n})\bigr)\,\mathbf 1_{\{X(t_{j-1,n})\in K\}}\,\mathbf 1_{\{X(t_{j,n})\in K\}}>\eta\Bigr]=0\qquad(3.228)
\]

holds for all compact subsets K of E. We have

\[
\begin{aligned}
&\mathbb P_{\tau,x}\Bigl[\max_{1\le j\le2^n}d\bigl(X(t_{j-1,n}),X(t_{j,n})\bigr)\,\mathbf 1_{\{X(t_{j-1,n})\in K\}}\,\mathbf 1_{\{X(t_{j,n})\in K\}}>\eta\Bigr]\\
&\le\sum_{j=1}^{2^n}\mathbb P_{\tau,x}\bigl[d\bigl(X(t_{j-1,n}),X(t_{j,n})\bigr)\,\mathbf 1_{\{X(t_{j-1,n})\in K\}}\,\mathbf 1_{\{X(t_{j,n})\in K\}}>\eta\bigr]\\
&\qquad\text{(Markov property)}\\
&=\sum_{j=1}^{2^n}\mathbb E_{\tau,x}\Bigl[\mathbb P_{t_{j-1,n},X(t_{j-1,n})}\bigl[d\bigl(X(t_{j-1,n}),X(t_{j,n})\bigr)\,\mathbf 1_{\{X(t_{j,n})\in K\}}>\eta\bigr]\,\mathbf 1_{\{X(t_{j-1,n})\in K\}}\Bigr]\\
&\le\sum_{j=1}^{2^n}\sup_{y\in K}\mathbb P_{t_{j-1,n},y}\bigl[d\bigl(y,X(t_{j,n})\bigr)\,\mathbf 1_{\{X(t_{j,n})\in K\}}>\eta\bigr]\\
&=\sum_{j=1}^{2^n}\sup_{y\in K}P\bigl(t_{j-1,n},y;t_{j,n},K\setminus B_\eta(y)\bigr).\qquad(3.229)
\end{aligned}
\]

The result in Proposition 3.23 follows from (3.229) and Lemma 3.22.

3.5 Measurability properties of hitting times

In this section we study how fast a Markov process reaches a Borel subset B of the state space E. The material is taken from Chapter 2, Section 2.10 in Gulisashvili et al [95]. Fix τ ∈ [0, T]. Throughout this section we will assume that the filtrations (F^τ_t)_{t∈[τ,T]} are right-continuous and Pτ,µ-complete. Right-continuity means that F^τ_{t+} = ∩_{s∈(t,T]} F^τ_s = F^τ_t. By definition the σ-field F^τ_t is Pτ,µ-complete if Pτ,µ-negligible events A belong to F^τ_t. The σ-field F̄^τ_{t+} is the Pτ,µ-completion of a σ-field F^τ_{t+} if and only if for every A ∈ F̄^τ_{t+} there exist events A1 and A2 ∈ F^τ_{t+} such that A1 ⊂ A ⊂ A2 and Pτ,x(A2 \ A1) = 0. It is also assumed that we are in the context of a backward Feller evolution (or propagator) {P(s, t) : 0 ≤ s ≤ t ≤ T} in the sense of Definition 1.24 and the corresponding strong Markov process with state space E:

\[
\Bigl\{(\Omega,\mathcal F^\tau_T,\mathbb P_{\tau,x})_{(\tau,x)\in[0,T]\times E},\ (X(t),\,t\in[0,T]),\ (E,\mathcal E)\Bigr\}.\qquad(3.230)
\]

By P(E) we denote the collection of all Borel probability measures on the space E. For A ∈ F^τ_T and µ ∈ P(E), we put Pτ,µ(A) = ∫ Pτ,x(A) dµ(x). For instance, if µ = δx is the Dirac measure concentrated at x ∈ E, then Pτ,δx = Pτ,x. Let ζ be the first time the process X(t) arrives at the absorption state △:

\[
\zeta=\begin{cases}\inf\{t>0:X(t)=\triangle\}&\text{if }X(t)=\triangle\text{ for some }t\in(0,T],\\[1mm]T&\text{if }X(t)\in E\text{ for all }t\in(0,T).\end{cases}
\]

Definition 3.24. Let (X(t), Pτ,x) be a Markov process on Ω with state space E and sample path space Ω = D([0, T], E^△), and let B be a Borel subset of E^△. Let τ ∈ [0, T), and suppose that S : Ω → [τ, ζ] is an F^τ_t-stopping time. For the process X(t), the entry time of the set B after time S is defined by

\[
D^S_B=\begin{cases}\inf\{t:t\ge S,\ X(t)\in B\}&\text{on }\displaystyle\bigcup_{\tau\le t<T}\{S\le t,\ X(t)\in B\},\\[1mm]\zeta&\text{elsewhere.}\end{cases}\qquad(3.231)
\]

The pseudo-hitting time of the set B after time S is defined by

\[
\widetilde D^S_B=\begin{cases}\inf\{t:t\ge S,\ X(t)\in B\}&\text{on }\displaystyle\bigcup_{\tau<t<T}\{S\le t,\ X(t)\in B\},\\[1mm]\zeta&\text{elsewhere.}\end{cases}\qquad(3.232)
\]

The hitting time of the set B after time S is defined by

\[
T^S_B=\begin{cases}\inf\{t:t>S,\ X(t)\in B\}&\text{on }\displaystyle\bigcup_{\tau\le t<T}\{S<t,\ X(t)\in B\},\\[1mm]\zeta&\text{elsewhere.}\end{cases}\qquad(3.233)
\]

Observe that on the event {S = τ, X(τ) ∈ B} we have D^S_B = τ and D̃^S_B = T^S_B. It is not hard to prove that

\[
\bigcup_{t:\tau\le t<T}\{S\le t,\ X(t)\in B\}=\bigcup_{t:\tau\le t<T}\{S\vee t\le t,\ X(S\vee t)\in B\}\quad\text{and}\quad
\bigcup_{t:\tau<t<T}\{S\le t,\ X(t)\in B\}=\bigcup_{t:\tau<t<T}\{S\vee t\le t,\ X(S\vee t)\in B\}.
\]

We also have

\[
D^S_{B\cup\{\triangle\}}=D^S_B\wedge\zeta,\qquad\widetilde D^S_{B\cup\{\triangle\}}=\widetilde D^S_B\wedge\zeta,\qquad T^S_{B\cup\{\triangle\}}=T^S_B\wedge\zeta.\qquad(3.234)
\]

In addition, we have D^S_B ≤ D̃^S_B ≤ T^S_B. Next we will show that the following equalities hold:

\[
T^S_B=\inf_{\varepsilon>0}\bigl\{D^{(\varepsilon+S)\wedge\zeta}_B\bigr\}=\inf_{r\in\mathbb Q_+}\bigl\{D^{(r+S)\wedge\zeta}_B\bigr\}.\qquad(3.235)
\]

Indeed, on {T^S_B < ζ} the first equality in (3.235) can be obtained by using the inclusion

\[
\{t\ge(\varepsilon+S)\wedge\zeta,\ X(t)\in B\}\subset\{t>S,\ X(t)\in B\}
\]

and the fact that for every t ∈ [τ, T) and ω ∈ {S < t, X(t) ∈ B} there exists ε > 0, depending on ω, such that ω ∈ {(ε + S) ∧ ζ ≤ t, X(t) ∈ B}. Since T^S_B ≤ D^{(ε+S)∧ζ}_B, we see that on the event {T^S_B = ζ} the first equality in (3.235) also holds. The second equality in (3.235) follows from the monotonicity of the entry time D^S_B with respect to S.
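The distinction between the entry time D^S_B and the hitting time T^S_B matters exactly when X(S) already lies in B. A discrete-time sketch (an illustration added here, not part of the original text; the path, the set B, and the time S are made up) makes the difference concrete:

```python
# Entry time vs hitting time after S, for a discrete-time path (illustration only).
def entry_time(path, S, B):
    """First t >= S with path[t] in B (len(path) plays the role of zeta if none)."""
    return next((t for t in range(S, len(path)) if path[t] in B), len(path))

def hitting_time(path, S, B):
    """First t > S with path[t] in B (len(path) plays the role of zeta if none)."""
    return next((t for t in range(S + 1, len(path)) if path[t] in B), len(path))

path = [2, 1, 0, 3, 0, 5]   # X(0), X(1), ...
B = {0}
S = 2                        # a (deterministic) "stopping time"; X(S) = 0 lies in B

assert entry_time(path, S, B) == 2    # D_B^S = S, since X(S) is already in B
assert hitting_time(path, S, B) == 4  # T_B^S: the path must return to B strictly after S
print(entry_time(path, S, B), hitting_time(path, S, B))
```

This also illustrates the first equality in (3.235): shifting S by ε > 0 and taking entry times recovers the hitting time in the limit ε ↓ 0.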
Our next goal is to prove that for the Markov process in (3.230) the entry time D^S_B, the pseudo-hitting time D̃^S_B, and the hitting time T^S_B are stopping times. Throughout the present section, the symbols K(E) and O(E) stand for the family of all compact subsets and the family of all open subsets of the space E, respectively.

The celebrated Choquet capacitability theorem will be used in the proof of the fact that D^S_B, D̃^S_B, and T^S_B are stopping times. We will restrict ourselves to positive capacities and the pavement of the space E by compact subsets. For more general cases, we refer the reader to [73, 160].

Definition 3.25. A function I from the class P(E) of all subsets of E into the extended real half-line R̄+ is called a Choquet capacity if it possesses the following properties:

(i) If A1 and A2 in P(E) are such that A1 ⊂ A2, then I(A1) ≤ I(A2).
(ii) If An ∈ P(E), n ≥ 1, and A ∈ P(E) are such that An ↑ A, then I(An) → I(A) as n → ∞.
(iii) If Kn ∈ K(E), n ≥ 1, and K ∈ K(E) are such that Kn ↓ K, then I(Kn) → I(K) as n → ∞.
Definition 3.26. A function ϕ : K(E) → [0, ∞) is called strongly sub-additive provided that the following conditions hold:

(i) If K1 ∈ K(E) and K2 ∈ K(E) are such that K1 ⊂ K2, then ϕ(K1) ≤ ϕ(K2).
(ii) If K1 and K2 belong to K(E), then

\[
\varphi(K_1\cup K_2)+\varphi(K_1\cap K_2)\le\varphi(K_1)+\varphi(K_2).\qquad(3.236)
\]

The following construction allows one to define a Choquet capacity starting with a strongly sub-additive function. Let ϕ be a strongly sub-additive function satisfying the following additional continuity condition:

(iii) For all K ∈ K(E) and all ε > 0, there exists G ∈ O(E) such that K ⊂ G and ϕ(K′) ≤ ϕ(K) + ε for all compact subsets K′ of G.

For any G ∈ O(E), put

\[
I^*(G)=\sup_{K\in\mathcal K(E);\,K\subset G}\varphi(K).\qquad(3.237)
\]

Next define a set function I : P(E) → R̄+ by

\[
I(A)=\inf_{G\in\mathcal O(E);\,A\subset G}I^*(G),\qquad A\in\mathcal P(E).\qquad(3.238)
\]

It is known that the function I is a Choquet capacity. It is clear that for any G ∈ O(E), I(G) = I*(G). Moreover, it is not hard to see that for any K ∈ K(E), ϕ(K) = I(K).
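A guiding example (added here as an illustration, not part of the original text): take ϕ(K) = µ(K) for a finite Borel measure µ on E. Then ϕ is strongly sub-additive — (3.236) even holds with equality — and condition (iii) is the outer regularity of µ. The construction (3.237)–(3.238) then yields

\[
I^*(G)=\sup_{K\in\mathcal K(E);\,K\subset G}\mu(K),\qquad I(A)=\inf_{G\in\mathcal O(E);\,A\subset G}I^*(G),
\]

so that I-capacitability of a Borel set B amounts to the inner regularity µ(B) = sup{µ(K) : K ⊂ B, K ∈ K(E)}, which indeed holds for finite Borel measures on a Polish space. The capacities in (3.240) below are of a similar nature, but built from entry-time probabilities instead of a measure.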

Definition 3.27. Let ϕ : K(E) → [0, ∞) be a strongly sub-additive function satisfying condition (iii), and let I be the Choquet capacity obtained from ϕ (see formulas (3.237) and (3.238)). A subset B of E is said to be I-capacitable if the following equality holds:

\[
I(B)=\sup\{\varphi(K):K\subset B,\ K\in\mathcal K(E)\}.\qquad(3.239)
\]

Now we are ready to formulate the Choquet capacitability theorem; we will need the following version of it (see, e.g., [73, 69, 160]). For a discussion on capacitable subsets see e.g. Kiselman [133]; see Choquet [58], [27] and [69] as well. For a general discussion on the foundations of probability theory see e.g. [118].

Theorem 3.28. Let E be a Polish space, let ϕ : K(E) → [0, ∞) be a strongly sub-additive function satisfying condition (iii), and let I be the Choquet capacity obtained from ϕ (see formulas (3.237) and (3.238)). Then every analytic subset of E, and in particular every Borel subset of E, is I-capacitable.

The definition of analytic sets can be found in [73, 69]. We will only need the Choquet capacitability theorem for Borel sets, which form a sub-collection of the analytic sets.
Lemma 3.29. Let τ ∈ [0, T], and let {X(t) : t ∈ [τ, T]} be an adapted, right-continuous, and quasi left-continuous stochastic process on the filtered probability space (X(t), F̄^τ_{t+}, Pτ,x)_{t∈[τ,T]}. Suppose that S is an F̄^τ_{t+}-stopping time such that τ ≤ S ≤ ζ. Then, for any t ∈ [τ, T] and µ ∈ P(E), the following functions are strongly sub-additive on K(E) and satisfy condition (iii):

\[
K\mapsto\mathbb P_{\tau,\mu}\bigl[D^S_K\le t\bigr]\quad\text{and}\quad K\mapsto\mathbb P_{\tau,\mu}\bigl[\widetilde D^S_K\le t\bigr],\qquad K\in\mathcal K(E).\qquad(3.240)
\]

We wrote F̄^τ_{t+} to indicate that this σ-field is right-continuous and Pτ,x-complete.

Proof. We have to check conditions (i) and (ii) in Definition 3.26 and also condition (iii) for the set functions in (3.240). Let K1 ∈ K(E) and K2 ∈ K(E) be such that K1 ⊂ K2. Then D^S_{K1} ≥ D^S_{K2}, and hence

\[
\mathbb P_{\tau,\mu}\bigl[D^S_{K_1}\le t\bigr]\le\mathbb P_{\tau,\mu}\bigl[D^S_{K_2}\le t\bigr].
\]

This proves condition (i) for the function K ↦ Pτ,µ[D^S_K ≤ t]. The proof of (i) for the second mapping in (3.240) is similar.

In order to prove condition (iii) for the mapping K ↦ Pτ,µ[D^S_K ≤ t], we use assertion (a) in Lemma 3.34. More precisely, let K ∈ K(E) and Gn ∈ O(E), n ∈ N, be as in Lemma 3.34. Then by part (a) of Lemma 3.34 below (note that part (a) of Lemma 3.34 also holds under the restrictions in Lemma 3.29), we get
154 3 Space-time operators
\[
P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]
\leq \inf_{G\in O(E):\,G\supset K}\ \sup_{K'\in K(E):\,K'\subset G} P_{\tau,\mu}\left[D^{S}_{K'} \leq t\right]
\leq \inf_{n\in\mathbb{N}}\ \sup_{K'\in K(E):\,K'\subset G_n} P_{\tau,\mu}\left[D^{S}_{K'} \leq t\right]
\leq \inf_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{G_n} \leq t\right]
= P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]. \tag{3.241}
\]
It follows from (3.241) that
\[
P_{\tau,\mu}\left[D^{S}_{K} \leq t\right] = \inf_{G\in O(E):\,G\supset K}\ \sup_{K'\in K(E),\,K'\subset G} P_{\tau,\mu}\left[D^{S}_{K'} \leq t\right]. \tag{3.242}
\]
Now it is clear that the equality in (3.242) implies property (iii) for the mapping $K \mapsto P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]$. The proof of (iii) for the mapping $K \mapsto P_{\tau,\mu}\left[\widetilde{D}^{S}_{K} \leq t\right]$ is similar. Here we use part (d) in Lemma 3.34 (note that part (d) of Lemma 3.34 also holds under the restrictions in Lemma 3.29).
Next we will prove that the function $K \mapsto P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]$ satisfies condition (ii). In the proof the following simple relations will be used: for all Borel subsets B₁ and B₂ of E,
\[
D^{S}_{B_1\cup B_2} = D^{S}_{B_1} \wedge D^{S}_{B_2}, \quad\text{and} \tag{3.243}
\]
\[
D^{S}_{B_1\cap B_2} \geq D^{S}_{B_1} \vee D^{S}_{B_2}. \tag{3.244}
\]
By using (3.243) and (3.244) with K₁ ∈ K(E) and K₂ ∈ K(E) instead of B₁ and B₂ respectively, we get:
\[
\begin{aligned}
\left\{D^{S}_{K_1\cup K_2} \leq t\right\} \setminus \left\{D^{S}_{K_2} \leq t\right\}
&= \left(\left\{D^{S}_{K_1} \leq t\right\} \cup \left\{D^{S}_{K_2} \leq t\right\}\right) \setminus \left\{D^{S}_{K_2} \leq t\right\} \\
&= \left\{D^{S}_{K_1} \leq t\right\} \setminus \left\{D^{S}_{K_2} \leq t\right\}
= \left\{D^{S}_{K_1} \leq t\right\} \setminus \left(\left\{D^{S}_{K_1} \leq t\right\} \cap \left\{D^{S}_{K_2} \leq t\right\}\right) \\
&= \left\{D^{S}_{K_1} \leq t\right\} \setminus \left\{D^{S}_{K_1} \vee D^{S}_{K_2} \leq t\right\}
\subset \left\{D^{S}_{K_1} \leq t\right\} \setminus \left\{D^{S}_{K_1\cap K_2} \leq t\right\}.
\end{aligned}
\tag{3.245}
\]
It follows from (3.245) that
\[
P_{\tau,\mu}\left[D^{S}_{K_1\cup K_2} \leq t\right] + P_{\tau,\mu}\left[D^{S}_{K_1\cap K_2} \leq t\right]
\leq P_{\tau,\mu}\left[D^{S}_{K_1} \leq t\right] + P_{\tau,\mu}\left[D^{S}_{K_2} \leq t\right]. \tag{3.246}
\]
Now it is clear that (3.246) implies condition (ii) for the function $K \mapsto P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]$. The proof of condition (ii) for the second function in Lemma 3.29 is similar.

This completes the proof of Lemma 3.29.
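The relations (3.243) and (3.244) are elementary but easy to sanity-check numerically. The following sketch (an illustration of ours, not part of the text: a simple random walk stands in for X(t), a fixed index for the stopping time S, and the path length for "never entered") verifies their discrete analogues on a simulated path:

```python
import random

def entry_time(path, target, start=0):
    """First index k >= start with path[k] in target;
    len(path) (a stand-in for 'never entered') otherwise."""
    return next((k for k in range(start, len(path)) if path[k] in target), len(path))

random.seed(7)
# A simple random walk on the integers plays the role of X(t).
path = [0]
for _ in range(200):
    path.append(path[-1] + random.choice([-1, 1]))

B1, B2 = {3, 4, 5}, {5, 6, 7}
S = 10  # fixed index playing the role of the stopping time S

d1 = entry_time(path, B1, S)
d2 = entry_time(path, B2, S)
d_union = entry_time(path, B1 | B2, S)
d_inter = entry_time(path, B1 & B2, S)

assert d_union == min(d1, d2)   # discrete analogue of (3.243)
assert d_inter >= max(d1, d2)   # discrete analogue of (3.244)
```

Both assertions hold for any path and any pair of target sets: entering the union happens at the earlier of the two entry times, while entering the intersection requires being in both sets at once.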
The next theorem states that under certain restrictions the entry time $D^{S}_{B}$, the pseudo-hitting time $\widetilde{D}^{S}_{B}$, and the hitting time $T^{S}_{B}$ are stopping times. Recall that $\mathcal{F}^{\tau}_{t+}$ denotes the completion of the σ-field
\[
\bigcap_{s\in(t,T]} \sigma\left(X(\rho) : \tau \leq \rho \leq s\right)
\]
with respect to the family of measures $\left\{P_{s,x} : 0 \leq s \leq \tau,\ x \in E\right\}$.
3.5 Measurability properties of hitting times 155

Theorem 3.30. Let τ ∈ [0, T ], and let {X(t) : t ∈ [τ, T ]} be as in Lemma 3.29:

(i) the process X(t) is right-continuous and quasi left-continuous on [0, ζ);
(ii) the σ-fields $\mathcal{F}^{\tau}_{t+}$ are $P_{\tau,x}$-complete and right-continuous for t ∈ [τ, T ] and x ∈ E.

Then for every τ ∈ [0, T ) and every $\mathcal{F}^{\tau}_{t+}$-stopping time S : Ω → [τ, ζ], the stochastic variables $D^{S}_{B}$, $\widetilde{D}^{S}_{B}$, and $T^{S}_{B}$ are $\mathcal{F}^{\tau}_{t+}$-stopping times.

Proof. We will first prove Theorem 3.30 assuming that it holds for all open and all compact subsets of E. The validity of Theorem 3.30 for such sets will be established in Lemmas 3.31 and 3.32 below.

Let B be a Borel subset of E, and suppose that we have already shown that for any ε ≥ 0 the stochastic time $D^{(\varepsilon+S)\wedge\zeta}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. Since
\[
T^{S}_{B} = \inf_{\varepsilon>0,\,\varepsilon\in\mathbb{Q}^{+}} D^{(\varepsilon+S)\wedge\zeta}_{B}
\]
(see (3.235)), we also obtain that $T^{S}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. Therefore, in order to prove that $T^{S}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time, it suffices to show that for every Borel subset B of E the stochastic time $D^{S}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. Since the process t ↦ X(t) is continuous from the right, it suffices to prove the previous assertion with S replaced by (ε + S) ∧ ζ.

Fix t ∈ [τ, T ), µ ∈ P(E), and B ∈ $\mathcal{E}$. By Lemma 3.29 and the Choquet capacitability theorem, the set B is capacitable with respect to the capacity I associated with the strongly sub-additive function $K \mapsto P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]$. Therefore, there exist an increasing sequence Kₙ ∈ K(E), n ∈ ℕ, and a decreasing sequence Gₙ ∈ O(E), n ∈ ℕ, such that
\[
K_n \subset K_{n+1} \subset B \subset G_{n+1} \subset G_n, \quad n \in \mathbb{N}, \quad\text{and}\quad
\sup_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{K_n} \leq t\right] = \inf_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{G_n} \leq t\right]. \tag{3.247}
\]
The arguments in (3.247) should be compared with those in (3.262) below. Put
\[
\Lambda^{\tau,\mu,S}_{1}(t) = \bigcup_{n\in\mathbb{N}} \left\{D^{S}_{K_n} \leq t\right\} \quad\text{and}\quad
\Lambda^{\tau,\mu,S}_{2}(t) = \bigcap_{n\in\mathbb{N}} \left\{D^{S}_{G_n} \leq t\right\}. \tag{3.248}
\]

Then Lemma 3.31 implies $\Lambda^{\tau,\mu,S}_{2}(t) \in \mathcal{F}^{\tau}_{t+}$, and Lemma 3.32 gives $\Lambda^{\tau,\mu,S}_{1}(t) \in \mathcal{F}^{\tau}_{t+}$. Moreover, we have
\[
\Lambda^{\tau,\mu,S}_{1}(t) \subset \left\{D^{S}_{B} \leq t\right\} \subset \Lambda^{\tau,\mu,S}_{2}(t), \tag{3.249}
\]
and
\[
P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{2}(t)\right] = \inf_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{G_n} \leq t\right]
= \sup_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{K_n} \leq t\right] = P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{1}(t)\right]. \tag{3.250}
\]
It follows from (3.249) and (3.250) that $P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{2}(t) \setminus \Lambda^{\tau,\mu,S}_{1}(t)\right] = 0$. By using (3.249) again, we see that the event $\left\{D^{S}_{B} \leq t\right\}$ belongs to the σ-field $\mathcal{F}^{\tau}_{t+}$. Therefore, the stochastic time $D^{S}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. As we have already observed, it also follows that the stochastic time $T^{S}_{B}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time.

A similar argument with $D^{S}_{B}$ replaced by $\widetilde{D}^{S}_{B}$ shows that the stochastic times $\widetilde{D}^{S}_{B}$, B ∈ $\mathcal{E}$, are $\mathcal{F}^{\tau}_{t+}$-stopping times.

This completes the proof of Theorem 3.30.
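The proof turns on the relation $T^{S}_{B} = \inf_{\varepsilon>0,\,\varepsilon\in\mathbb{Q}^{+}} D^{(\varepsilon+S)\wedge\zeta}_{B}$ from (3.235): the hitting time only records visits to B strictly after S, while the entry time may equal S itself when X(S) ∈ B. A minimal discrete-time illustration (ours, with hypothetical helper functions; the path length stands in for "never"):

```python
def entry_time(path, target, start):
    # D: first index k >= start with path[k] in target
    return next((k for k in range(start, len(path)) if path[k] in target), len(path))

def hitting_time(path, target, start):
    # T: first index k > start with path[k] in target (strictly after start),
    # i.e. the discrete analogue of inf over epsilon > 0 of D at start + epsilon
    return next((k for k in range(start + 1, len(path)) if path[k] in target), len(path))

path = [0, 1, 2, 1, 2, 3, 2, 1, 0]
B = {2}
S = 2                                  # path[S] = 2 lies in B already
assert entry_time(path, B, S) == 2     # D_B^S = S: the process starts in B
assert hitting_time(path, B, S) == 4   # T_B^S: first visit to B after S
```

This is why the entry time of a set containing the current state is trivial, whereas the hitting time carries genuine information about the return behaviour of the process.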

Next we will prove two lemmas which have already been used in the proof of Theorem 3.30.

Lemma 3.31. Let S : Ω → [τ, ζ] be an $\mathcal{F}^{\tau}_{t+}$-stopping time, and let G ∈ O(E). Then the stochastic times $D^{S}_{G}$, $\widetilde{D}^{S}_{G}$, and $T^{S}_{G}$ are $\mathcal{F}^{\tau}_{t+}$-stopping times.

Proof. It is not hard to see that
\[
\left\{D^{S}_{G} \leq t < \zeta\right\} = \bigcap_{m\in\mathbb{N}} \left\{D^{S}_{G} < t + \frac{1}{m}\right\} \cap \{t < \zeta\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\tau \leq \rho < t + \frac{1}{m},\ \rho\in\mathbb{Q}^{+}} \left\{S \leq \rho,\ X(\rho) \in G\right\}. \tag{3.251}
\]
We also have
\[
\left\{D^{S}_{G} \leq t\right\} = \left\{D^{S}_{G} \leq t < \zeta\right\} \cup \{\zeta \leq t\} = \left\{D^{S}_{G} \leq t < \zeta\right\} \cup \left\{X(t) = \triangle\right\}. \tag{3.252}
\]
The event on the right-hand side of (3.251) belongs to $\mathcal{F}^{\tau}_{t+}$, and hence by (3.251) and (3.252) the stochastic time $D^{S}_{G}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. The fact that $\widetilde{D}^{S}_{G}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time follows from
\[
\left\{\widetilde{D}^{S}_{G} \leq t < \zeta\right\} = \bigcap_{m\in\mathbb{N}} \left\{\widetilde{D}^{S}_{G} < t + \frac{1}{m}\right\} \cap \{t < \zeta\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\rho\in\left(\tau,\, t+\frac{1}{m}\right)\cap\mathbb{Q}^{+}} \left\{S \leq \rho,\ X(\rho) \in G\right\}
\]
together with
\[
\left\{\widetilde{D}^{S}_{G} \leq t\right\} = \left\{\widetilde{D}^{S}_{G} \leq t < \zeta\right\} \cup \left\{X(t) = \triangle\right\}. \tag{3.253}
\]
The equality (3.235) with G instead of B implies that $T^{S}_{G}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time.
τ
Lemma 3.32. Let S : Ω → [τ, ζ] be an $\mathcal{F}^{\tau}_{t+}$-stopping time, and let K ∈ $K\left(E^{\triangle}\right)$. Then the stochastic times $D^{S}_{K}$, $\widetilde{D}^{S}_{K}$, and $T^{S}_{K}$ are $\mathcal{F}^{\tau}_{t+}$-stopping times.

Proof. First let K be a compact subset of E, and let Gₙ, n ∈ ℕ, be a sequence of open subsets of E with the following properties: K ⊂ Gₙ₊₁ ⊂ Gₙ and $\bigcap_{n\in\mathbb{N}} G_n = K$. Then every stochastic time $D^{S}_{G_n}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time (see Lemma 3.31), and for every µ ∈ P(E) the sequence of stochastic times $D^{S}_{G_n}$, n ∈ ℕ, increases $P_{\tau,\mu}$-almost surely to $D^{S}_{K}$. This implies that the stochastic time $D^{S}_{K}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. The equality (3.235) with K instead of B then shows that $T^{S}_{K}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time. Next we will show the $P_{\tau,\mu}$-almost sure convergence of the sequence $D^{S}_{G_n}$, n ∈ ℕ. Put $D_K = \sup_{n\in\mathbb{N}} D^{S}_{G_n}$. Since $D^{S}_{G_n} \leq D^{S}_{G_{n+1}} \leq D^{S}_{K}$, it follows that $D_K \leq D^{S}_{K}$. By Lemma 3.31, the stochastic times $D^{S}_{G_n}$, n ∈ ℕ, are $\mathcal{F}^{\tau}_{t+}$-stopping times. It follows from the quasi-continuity from the left of the process X(t), t ∈ [0, ζ), that
\[
\lim_{n\to\infty} X\left(D^{S}_{G_n}\right) = X\left(D_K\right) \quad P_{\tau,\mu}\text{-a.s.}
\]
Therefore,
\[
X\left(D_K\right) \in \bigcap_{n} G_n = K \quad P_{\tau,\mu}\text{-a.s.}
\]
Since $D_K \geq S$, we have $D^{S}_{K} \leq D_K$ $P_{\tau,\mu}$-almost surely, and hence $D^{S}_{K} = D_K$ $P_{\tau,\mu}$-almost surely. This establishes the $P_{\tau,\mu}$-almost sure convergence of the sequence $D^{S}_{G_n}$, n ∈ ℕ, to $D^{S}_{K}$.

In order to finish the proof of Lemma 3.32, we will establish that for every µ ∈ P(E) the sequence of stochastic times $\widetilde{D}^{S}_{G_n}$ increases $P_{\tau,\mu}$-almost surely to $\widetilde{D}^{S}_{K}$. Put $\widetilde{D}_K = \sup_{n\in\mathbb{N}} \widetilde{D}^{S}_{G_n}$. Since
\[
\widetilde{D}^{S}_{G_n} \leq \widetilde{D}^{S}_{G_{n+1}} \leq \widetilde{D}^{S}_{K},
\]
it follows that $\widetilde{D}_K \leq \widetilde{D}^{S}_{K}$. By using the fact that the process X(t), t ∈ [0, ζ), is quasi-continuous from the left, we get
\[
\lim_{n\to\infty} X\left(\widetilde{D}^{S}_{G_n}\right) = X\left(\widetilde{D}_K\right) \quad P_{\tau,\mu}\text{-a.s.}
\]
Therefore
\[
X\left(\widetilde{D}_K\right) \in \bigcap_{n} G_n = K \quad P_{\tau,\mu}\text{-a.s.}
\]
Since $\widetilde{D}_K \geq S$, we have $\widetilde{D}^{S}_{K} \leq \widetilde{D}_K$ $P_{\tau,\mu}$-almost surely, and hence $\widetilde{D}^{S}_{K} = \widetilde{D}_K$ $P_{\tau,\mu}$-almost surely. This equality shows that the stochastic time $\widetilde{D}^{S}_{K}$ is an $\mathcal{F}^{\tau}_{t+}$-stopping time.

We still have to consider the case that △ ∈ K. For this we use the equalities $D^{S}_{K_0\cup\{\triangle\}} = D^{S}_{K_0} \wedge \zeta$ and $\widetilde{D}^{S}_{K_0\cup\{\triangle\}} = \widetilde{D}^{S}_{K_0} \wedge \zeta$, together with the fact that a compact subset K of $E^{\triangle}$ is a compact subset of E or is of the form K = K₀ ∪ {△} where K₀ ⊂ E is compact. Observe that on the event {ζ ≥ τ}, ζ is an $\mathcal{F}^{\tau}_{t}$-stopping time.

This completes the proof of Lemma 3.32.

Let us return to the study of standard Markov processes. It was established in Theorem 1.39 item (a) that if P is a transition sub-probability function such that the backward free Feller propagator {P(s, t) : 0 ≤ s ≤ t ≤ T } associated with P is a strongly continuous (backward) Feller propagator, then there exists a standard Markov process as in (3.267) with (τ, x; t, B) ↦ P(τ, x; t, B) as its transition function. Let τ ∈ [0, T ], and let $\left(X(t), \mathcal{F}^{\tau}_{t}, P_{\tau,x}\right)$ be a Markov process. Suppose that S is an $\mathcal{F}^{\tau}_{t+}$-stopping time such that τ ≤ S ≤ ζ. Fix a measure µ ∈ P(E), and denote by $\mathcal{F}^{S,\vee}_{T}$ the completion of the σ-field $\sigma\left(S \vee \rho,\ X\left(S \vee \rho\right) : 0 \leq \rho \leq T\right)$ with respect to the measure µ. The measure µ is used throughout Lemma 3.34 below. The next theorem provides additional examples of families of stopping times which can be used in the formulation of the strong Markov property with respect to families of measures.

Theorem 3.33. Let $\left(X(t), \mathcal{F}^{\tau}_{t}, P_{\tau,x}\right)$ be a standard Markov process as in (3.267), and let B ∈ $\mathcal{E}^{\triangle}$. Then the stopping times $D^{S}_{B}$, $\widetilde{D}^{S}_{B}$, and $T^{S}_{B}$ are measurable with respect to the σ-field $\mathcal{F}^{S,\vee}_{T}$.

Proof. Since the stopping time S attains its values in the interval [τ, ζ], we see that {ζ ≤ ρ} = {ζ ≤ ρ ∨ S} = {X(ρ ∨ S) = △} for all ρ ∈ [τ, T ]. This shows that ζ is measurable with respect to $\mathcal{F}^{S,\vee}_{T}$. By (3.234) we see that $D^{S}_{B\cup\{\triangle\}} = D^{S}_{B} \wedge \zeta$, $\widetilde{D}^{S}_{B\cup\{\triangle\}} = \widetilde{D}^{S}_{B} \wedge \zeta$, and $T^{S}_{B\cup\{\triangle\}} = T^{S}_{B} \wedge \zeta$, and hence it suffices to prove that the stochastic times $D^{S}_{B}$, $\widetilde{D}^{S}_{B}$, and $T^{S}_{B}$ are $\mathcal{F}^{S,\vee}_{T}$-measurable whenever B is a Borel subset of E.

The proof of Theorem 3.33 is based on the following lemma. The same result, with the same proof, is also true with $E^{\triangle}$ instead of E.

Lemma 3.34. Let K ∈ K(E) and τ ∈ [0, T ). Suppose that Gₙ ∈ O(E), n ∈ ℕ, is a sequence such that K ⊂ Gₙ₊₁ ⊂ Gₙ and $\bigcap_{n\in\mathbb{N}} G_n = K$. Then the following assertions hold:

(a) For every µ ∈ P(E), the sequence of stopping times $D^{S}_{G_n}$ increases and tends to $D^{S}_{K}$ $P_{\tau,\mu}$-almost surely.
(b) For every t ∈ [τ, T ], the events $\left\{D^{S}_{G_n} \leq t\right\}$, n ∈ ℕ, are $\mathcal{F}^{S,\vee}_{T}$-measurable, and the event $\left\{D^{S}_{K} \leq t\right\}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable.
(c) For every t ∈ [τ, T ], the events $\left\{T^{S}_{G_n} \leq t\right\}$, n ∈ ℕ, are $\mathcal{F}^{S,\vee}_{T}$-measurable, and the event $\left\{T^{S}_{K} \leq t\right\}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable.
(d) For every µ ∈ P(E), the sequence of stopping times $\widetilde{D}^{S}_{G_n}$ increases and tends to $\widetilde{D}^{S}_{K}$ $P_{\tau,\mu}$-almost surely.
(e) For every t ∈ [τ, T ], the events $\left\{\widetilde{D}^{S}_{G_n} \leq t\right\}$, n ∈ ℕ, are $\mathcal{F}^{S,\vee}_{T}$-measurable, and the event $\left\{\widetilde{D}^{S}_{K} \leq t\right\}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable.

Proof. (a) Fix µ ∈ P(E), and let K ∈ K(E) and Gₙ ∈ O(E), n ∈ ℕ, be as in assertion (a) in the formulation of Lemma 3.34. Put $D_K = \sup_{n\in\mathbb{N}} D^{S}_{G_n}$. Since $S \leq D^{S}_{G_n} \leq D^{S}_{G_{n+1}} \leq D^{S}_{K}$, we always have $S \leq D_K \leq D^{S}_{K}$. Moreover, $D_K$ is a stopping time. By using the quasi-continuity from the left of the process t ↦ X(t) on [τ, ζ) with respect to the measure $P_{\tau,\mu}$, we see that
\[
\lim_{n\to\infty} X\left(D^{S}_{G_n}\right) = X\left(D_K\right) \quad P_{\tau,\mu}\text{-almost surely on } \{D_K < \zeta\}.
\]
Therefore,
\[
X\left(D_K\right) \in \bigcap_{n\in\mathbb{N}} G_n = K \quad P_{\tau,\mu}\text{-almost surely on } \{D_K < \zeta\}. \tag{3.254}
\]
Now by the definition of $D^{S}_{K}$ we have $D_K \geq S$, and (3.254) implies $D^{S}_{K} \leq D_K$ $P_{\tau,\mu}$-almost surely on {D_K < ζ}, and hence $D^{S}_{K} = D_K$ $P_{\tau,\mu}$-almost surely. In the final step we used the inequality $D_K \leq D^{S}_{K}$, which is always true.
(b) Fix t ∈ [τ, T ) and n ∈ ℕ. By the right-continuity of the paths on [0, ζ) we have
\[
\left\{D^{S}_{G_n} \leq t < \zeta\right\} = \bigcap_{m\in\mathbb{N}} \left\{D^{S}_{G_n} < t + \frac{1}{m}\right\} \cap \{t < \zeta\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\rho\in\left[\tau,\, t+\frac{1}{m}\right)} \left\{S \leq \rho,\ X(\rho) \in G_n\right\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\rho\in\left[\tau,\, t+\frac{1}{m}\right)\cap\mathbb{Q}^{+}} \left\{S \vee \rho \leq \rho,\ X\left(S \vee \rho\right) \in G_n\right\}. \tag{3.255}
\]
It follows that
\[
\left\{D^{S}_{G_n} \leq t < \zeta\right\} \in \mathcal{F}^{S,\vee}_{T}, \qquad 0 \leq t \leq T.
\]
By using assertion (a), we see that the events
\[
\left\{D^{S}_{K} \leq t < \zeta\right\} \quad\text{and}\quad \bigcap_{n\in\mathbb{N}} \left\{D^{S}_{G_n} \leq t < \zeta\right\}
\]
coincide $P_{\tau,\mu}$-almost surely. It follows that $\left\{D^{S}_{K} \leq t < \zeta\right\} \in \mathcal{F}^{S,\vee}_{T}$. It also follows that the event $\left\{D^{S}_{K} < \zeta\right\}$ belongs to $\mathcal{F}^{S,\vee}_{T}$. In addition we notice the equalities
\[
\left\{D^{S}_{K} \leq t\right\} = \left\{D^{S}_{K} \leq t < \zeta\right\} \cup \left\{D^{S}_{K} \leq t,\ \zeta \leq t\right\}
= \left\{D^{S}_{K} \leq t < \zeta\right\} \cup \left\{\zeta \leq S \vee t\right\}
= \left\{D^{S}_{K} \leq t < \zeta\right\} \cup \left\{X\left(S \vee t\right) = \triangle\right\}, \tag{3.256}
\]
where we used $D^{S}_{K} \leq \zeta$ and $S \leq \zeta$. From (3.256) we see that events of the form $\left\{D^{S}_{K} \leq t\right\}$, t ∈ [τ, T ], belong to $\mathcal{F}^{S,\vee}_{T}$. Consequently the stopping time $D^{S}_{K}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable. This proves assertion (b).
(c) Since the sets Gₙ are open and the process X(t) is right-continuous, the hitting times $T^{S}_{G_n}$ and the entry times $D^{S}_{G_n}$ coincide. Hence, the first part of assertion (c) follows from assertion (b). In order to prove the second part of (c), we reason as follows. By assertion (b), for every r ∈ ℚ⁺ the stopping time $D^{(r+S)\wedge\zeta}_{K}$ is $\mathcal{F}^{(r+S)\wedge\zeta,\vee}_{T}$-measurable. Our next goal is to prove that for every ε > 0,
\[
\mathcal{F}^{(\varepsilon+S)\wedge\zeta,\vee}_{T} \subset \mathcal{F}^{S,\vee}_{T}. \tag{3.257}
\]
Fix ε > 0 and ρ ∈ [τ, ζ], and put $S_1 = \left(\left(\varepsilon + S\right) \wedge \zeta\right) \vee \rho$. Observe that for ρ, t ∈ [0, T ] we have the following equality of events:
\[
\begin{aligned}
\{S_1 \leq t\} &= \left\{\left(\left(\varepsilon + S\right) \wedge \zeta\right) \vee \rho \leq t\right\}
= \left\{\left(\left(\varepsilon + S\right) \vee \rho\right) \wedge \left(\zeta \vee \rho\right) \leq t\right\} \\
&= \left\{S \vee \left(\rho - \varepsilon\right) \leq t - \varepsilon,\ \rho \leq t\right\} \cup \left\{\zeta \leq S \vee t,\ \rho \leq t\right\} \\
&= \left\{S \vee \left(\rho - \varepsilon\right) \leq t - \varepsilon,\ \rho \leq t\right\} \cup \left\{X\left(S \vee t\right) = \triangle,\ \rho \leq t\right\}.
\end{aligned}
\tag{3.258}
\]
Therefore, the stopping time $S_1 = \left(\left(\varepsilon + S\right) \wedge \zeta\right) \vee \rho$ is $\mathcal{F}^{S,\vee}_{T}$-measurable. Since the process t ↦ X(t) is right-continuous, it follows from Proposition 3.39 that $X\left(S_1\right)$ is $\mathcal{F}^{S,\vee}_{T}$-measurable. This implies inclusion (3.257). Hence,
\[
\mathcal{F}^{(\varepsilon+S)\wedge\zeta,\vee}_{T} \subset \mathcal{F}^{S,\vee}_{T}, \tag{3.259}
\]
and we see that for every ε > 0 the stopping time $D^{(\varepsilon+S)\wedge\zeta}_{K}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable. Since the family $D^{(\varepsilon+S)\wedge\zeta}_{K}$, ε > 0, decreases to $T^{S}_{K}$, the hitting time $T^{S}_{K}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable as well.
(d) Fix µ ∈ P(E), and let K ∈ K(E) and Gₙ ∈ O(E), n ∈ ℕ, be as in assertion (a). Put $\widetilde{D}_K = \sup_{n\in\mathbb{N}} \widetilde{D}^{S}_{G_n}$. Since
\[
\widetilde{D}^{S}_{G_n} \leq \widetilde{D}^{S}_{G_{n+1}} \leq \widetilde{D}^{S}_{K},
\]
we have $\widetilde{D}_K \leq \widetilde{D}^{S}_{K}$. It follows from the quasi-continuity from the left of the process X(t) on [0, ζ) that
\[
\lim_{n\to\infty} X\left(\widetilde{D}^{S}_{G_n}\right) = X\left(\widetilde{D}_K\right) \quad P_{\tau,\mu}\text{-almost surely on } \left\{\widetilde{D}_K < \zeta\right\}.
\]
Therefore,
\[
X\left(\widetilde{D}_K\right) \in \bigcap_{n} G_n = K \quad P_{\tau,\mu}\text{-almost surely on } \left\{\widetilde{D}_K < \zeta\right\}.
\]
Now $\widetilde{D}_K \geq S$ implies that $\widetilde{D}^{S}_{K} \leq \widetilde{D}_K$ $P_{\tau,\mu}$-almost surely on $\left\{\widetilde{D}_K < \zeta\right\}$, and hence $\widetilde{D}^{S}_{K} = \widetilde{D}_K$ $P_{\tau,\mu}$-almost surely on $\left\{\widetilde{D}_K < \zeta\right\}$. As in (a) we get $\widetilde{D}^{S}_{K} = \widetilde{D}_K$ $P_{\tau,\mu}$-almost surely.
(e) Fix t ∈ [τ, T ) and n ∈ ℕ. By the right-continuity of the paths,
\[
\left\{\widetilde{D}^{S}_{G_n} \leq t < \zeta\right\} = \bigcap_{m\in\mathbb{N}} \left\{\widetilde{D}^{S}_{G_n} < t + \frac{1}{m}\right\} \cap \{t < \zeta\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\rho\in\left(\tau,\, t+\frac{1}{m}\right)} \left\{S \leq \rho,\ X(\rho) \in G_n\right\}
= \bigcap_{m\in\mathbb{N}}\ \bigcup_{\rho\in\left(\tau,\, t+\frac{1}{m}\right)\cap\mathbb{Q}^{+}} \left\{S \vee \rho \leq \rho,\ X\left(S \vee \rho\right) \in G_n\right\}. \tag{3.260}
\]
It follows that $\left\{\widetilde{D}^{S}_{G_n} \leq t < \zeta\right\} \in \mathcal{F}^{S,\vee}_{T}$. By using assertion (d), we see that the events $\left\{\widetilde{D}^{S}_{K} \leq t < \zeta\right\}$ and $\bigcap_{n\in\mathbb{N}} \left\{\widetilde{D}^{S}_{G_n} \leq t < \zeta\right\}$ coincide $P_{\tau,\mu}$-almost surely. Therefore, $\left\{\widetilde{D}^{S}_{K} \leq t < \zeta\right\} \in \mathcal{F}^{S,\vee}_{T}$. As in (3.256) we have
\[
\left\{\widetilde{D}^{S}_{K} \leq t\right\} = \left\{\widetilde{D}^{S}_{K} \leq t < \zeta\right\} \cup \left\{\widetilde{D}^{S}_{K} \leq t,\ \zeta \leq t\right\}
= \left\{\widetilde{D}^{S}_{K} \leq t < \zeta\right\} \cup \left\{X\left(S \vee t\right) = \triangle\right\}. \tag{3.261}
\]
This proves assertion (e), and therefore the proof of Lemma 3.34 is complete.

Proof (Proof of Theorem 3.33: continuation). Let us return to the proof of Theorem 3.33. We will first prove that for any Borel set B the entry time $D^{S}_{B}$ is measurable with respect to the σ-field $\mathcal{F}^{S,\vee}_{T}$. Then the same assertion holds for the hitting time $T^{S}_{B}$. Indeed, if $D^{S}_{B}$ is $\mathcal{F}^{S,\vee}_{T}$-measurable for all stopping times S, then for every ε > 0 the stopping time $D^{(\varepsilon+S)\wedge\zeta}_{B}$ is measurable with respect to the σ-field $\mathcal{F}^{(\varepsilon+S)\wedge\zeta,\vee}_{T}$. By using (3.259), we obtain the $\mathcal{F}^{S,\vee}_{T}$-measurability of $D^{(\varepsilon+S)\wedge\zeta}_{B}$. Now (3.235) implies the $\mathcal{F}^{S,\vee}_{T}$-measurability of $T^{S}_{B}$.

Fix t ∈ [τ, T ), µ ∈ P(E), and B ∈ $\mathcal{E}$. By Lemma 3.29, the set B is capacitable with respect to the capacity $K \mapsto P_{\tau,\mu}\left[D^{S}_{K} \leq t\right]$. Notice that the following argument was also employed in the proof of Theorem 3.30. There exist an increasing sequence Kₙ ∈ K(E), n ∈ ℕ, and a decreasing sequence Gₙ ∈ O(E), n ∈ ℕ, such that
\[
K_n \subset K_{n+1} \subset B \subset G_{n+1} \subset G_n, \quad n \in \mathbb{N}, \quad\text{and}\quad
\sup_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{K_n} \leq t\right] = \inf_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{G_n} \leq t\right]. \tag{3.262}
\]
Next we put
\[
\Lambda^{\tau,\mu,S}_{1}(t) = \bigcup_{n\in\mathbb{N}} \left\{D^{S}_{K_n} \leq t\right\} \quad\text{and}\quad
\Lambda^{\tau,\mu,S}_{2}(t) = \bigcap_{n\in\mathbb{N}} \left\{D^{S}_{G_n} \leq t\right\}. \tag{3.263}
\]
The equalities in (3.248), which are the same as those in (3.263), show that the events $\Lambda^{\tau,\mu,S}_{1}(t)$ and $\Lambda^{\tau,\mu,S}_{2}(t)$ are $\mathcal{F}^{S,\vee}_{T}$-measurable. Moreover, we have
\[
\Lambda^{\tau,\mu,S}_{1}(t) \subset \left\{D^{S}_{B} \leq t\right\} \subset \Lambda^{\tau,\mu,S}_{2}(t), \tag{3.264}
\]
and
\[
P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{2}(t)\right] = \inf_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{G_n} \leq t\right]
= \sup_{n\in\mathbb{N}} P_{\tau,\mu}\left[D^{S}_{K_n} \leq t\right] = P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{1}(t)\right]. \tag{3.265}
\]
Now (3.264) and (3.265) give $P_{\tau,\mu}\left[\Lambda^{\tau,\mu,S}_{2}(t) \setminus \Lambda^{\tau,\mu,S}_{1}(t)\right] = 0$. By using (3.264), we see that the event $\left\{D^{S}_{B} \leq t\right\}$ is measurable with respect to the σ-field $\mathcal{F}^{S,\vee}_{T}$. This establishes the $\mathcal{F}^{S,\vee}_{T}$-measurability of the entry time $D^{S}_{B}$ and the hitting time $T^{S}_{B}$. The proof of Theorem 3.33 for the pseudo-hitting time $\widetilde{D}^{S}_{B}$ is similar to that for the entry time $D^{S}_{B}$.

The proof of Theorem 3.33 is thus completed.
Definition 3.35. Fix τ ∈ [0, T ], and let $S_1 : \Omega \to [\tau, T]$ be an $\left(\mathcal{F}^{\tau}_{t}\right)_{t\in[\tau,T]}$-stopping time. A stopping time $S_2 : \Omega \to [\tau, T]$ is called terminal after $S_1$ if $S_2 \geq S_1$, and if $S_2$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable.
The following corollary shows that entry and hitting times of Borel subsets which are comparable are terminal after each other.

Corollary 3.36. Let $\left(X(t), \mathcal{F}^{\tau}_{t}, P_{\tau,x}\right)$ be a standard process, and let A and B be Borel subsets of E with B ⊂ A. Then the entry time $D^{\tau}_{B}$ is measurable with respect to the σ-field $\mathcal{F}^{D^{\tau}_{A},\vee}_{T}$. Moreover, the hitting time $T^{\tau}_{B}$ is measurable with respect to the σ-field $\mathcal{F}^{T^{\tau}_{A},\vee}_{T}$.

Proof. By Theorem 3.33, it suffices to show that the equalities
\[
D^{D^{\tau}_{A}}_{B} = D^{\tau}_{B} \quad\text{and}\quad \widetilde{D}^{T^{\tau}_{A}}_{B} = T^{\tau}_{B} \tag{3.266}
\]
hold $P_{\tau,\mu}$-almost surely for all µ ∈ P(E). The first equality in (3.266) follows from
\[
\bigcup_{\tau \leq s < T} \left\{D^{\tau}_{A} \leq s,\ X(s) \in B\right\} = \bigcup_{\tau \leq s < T} \left\{X(s) \in B\right\},
\]
while the second equality in (3.266) can be obtained from
\[
\bigcup_{\tau < s < T} \left\{T^{\tau}_{A} \leq s,\ X(s) \in B\right\} = \bigcup_{\tau < s < T} \left\{X(s) \in B\right\}.
\]
This proves Corollary 3.36.
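The first equality in (3.266) has a transparent discrete analogue: if B ⊂ A, then every visit to B is a visit to A, so restarting the search for B at the entry time of A finds the same first visit. A small sketch (an illustration of ours only; a finite list stands in for the path and the list length for "never"):

```python
def entry_time(path, target, start=0):
    # First index k >= start with path[k] in target; len(path) means "never".
    return next((k for k in range(start, len(path)) if path[k] in target), len(path))

path = [0, 4, 1, 5, 2, 6, 3, 7]
A, B = {1, 2, 3}, {2, 3}           # B is contained in A
d_A = entry_time(path, A)          # first visit to the larger set A
# Restarting the search for B at d_A changes nothing, since any visit to B
# happens at or after the first visit to A.
assert entry_time(path, B, start=d_A) == entry_time(path, B)
```

The same reasoning lies behind using the pair $\left(D^{\tau}_{A}, D^{\tau}_{B}\right)$ in the strong Markov property below.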



It follows from Corollary 3.36 that the families $\left\{D^{\tau}_{A} : A \in \mathcal{E}\right\}$ and $\left\{T^{\tau}_{A} : A \in \mathcal{E}\right\}$ can be used in the definition of the strong Markov property in the case of standard processes. The next theorem states that the strong Markov property holds for entry times and hitting times of comparable Borel subsets.

Theorem 3.37. Let $\left(X(t), \mathcal{F}^{\tau}_{t}, P_{\tau,x}\right)$ be a standard process, and fix τ ∈ [0, T ]. Let A and B be Borel subsets of E such that B ⊂ A, and let f : [τ, T ] × $E^{\triangle}$ → ℝ be a bounded Borel function. Then the following equalities hold $P_{\tau,x}$-almost surely:
\[
E_{\tau,x}\left[f\left(D^{\tau}_{B}, X\left(D^{\tau}_{B}\right)\right) \,\middle|\, \mathcal{F}^{\tau}_{D^{\tau}_{A}}\right]
= E_{D^{\tau}_{A},\,X\left(D^{\tau}_{A}\right)}\left[f\left(D^{\tau}_{B}, X\left(D^{\tau}_{B}\right)\right)\right] \quad\text{and}
\]
\[
E_{\tau,x}\left[f\left(T^{\tau}_{B}, X\left(T^{\tau}_{B}\right)\right) \,\middle|\, \mathcal{F}^{\tau}_{T^{\tau}_{A}}\right]
= E_{T^{\tau}_{A},\,X\left(T^{\tau}_{A}\right)}\left[f\left(T^{\tau}_{B}, X\left(T^{\tau}_{B}\right)\right)\right].
\]
The first equality holds $P_{\tau,x}$-almost surely on $\left\{D^{\tau}_{A} < \zeta\right\}$, and the second on $\left\{T^{\tau}_{A} < \zeta\right\}$.

Proof. Theorem 3.37 follows from Corollary 3.36 and Remark 3.40.

Definition 3.38. The quadruple
\[
\left\{\left(\Omega, \left(\mathcal{F}^{\tau}_{t+}\right)_{t\in[\tau,T]}, P_{\tau,x}\right),\ \left(X(t),\ t \in [0, T]\right),\ \left(\vee_t,\ t \in [0, T]\right),\ (E, \mathcal{E})\right\} \tag{3.267}
\]
is called a standard Markov process if it possesses the following properties:

1. The process X(t) is adapted to the filtration $\left(\mathcal{F}^{\tau}_{t+}\right)_{t\in[\tau,T]}$, is right-continuous, and possesses left limits in E on its life time.
2. The σ-fields $\mathcal{F}^{\tau}_{t+}$, t ∈ [τ, T ], are right-continuous and $P_{\tau,x}$-complete.
3. The process (X(t) : t ∈ [0, T ]) is strong Markov with respect to the measures {P_{τ,x} : (τ, x) ∈ [0, T ] × E}.
4. The process (X(t) : t ∈ [0, T ]) is quasi left-continuous on [0, ζ).
5. The equalities X(t) ∘ ∨ₛ = X(t ∨ s) hold $P_{\tau,x}$-almost surely for all (τ, x) ∈ [0, T ] × E and for s, t ∈ [τ, T ].

If $\Omega = D\left([0, T], E^{\triangle}\right)$ and X(t)(ω) = ω(t), t ∈ [0, T ], ω ∈ Ω, then parts of items (1) and (2) are automatically satisfied. For brevity we often write (X(t), P_{τ,x}) instead of (3.267).
The following proposition gives an alternative way to describe stopping times which are terminal after another stopping time; see Definition 3.35.

Proposition 3.39. Let $S_1 : \Omega \to [\tau, T]$ be an $\mathcal{F}^{\tau}_{t}$-stopping time, and let the stopping time $S_2 : \Omega \to [\tau, T]$ be such that $S_2 \geq S_1$, and such that for every t ∈ [τ, T ] the event {S₂ > t}, restricted to the event {S₁ < t} = {S₁ ∨ t < t}, only depends on $\mathcal{F}^{t}_{T}$. Then $S_2$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable. If the paths of the process X are right-continuous, the state variable $X\left(S_2\right)$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable as well. It follows that the space-time variable $\left(S_2, X\left(S_2\right)\right)$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable. Similar results are true if the σ-fields $\mathcal{F}^{t}_{T}$ and $\mathcal{F}^{S_1,\vee}_{T}$ are replaced by their $P_{\tau,\mu}$-completions for some probability measure µ on E.

Proof. Suppose that for every t ∈ [τ, T ] the stochastic variable $S_2$ is such that on {S₁ < t} = {S₁ ∨ t < t} the event {S₂ > t} only depends on $\mathcal{F}^{t}_{T}$. Then on {S₁ < t} the event {S₂ > t} only depends on the σ-field generated by the state variables $\left\{X(\rho)\restriction_{\{S_1\vee t<t\}} : \rho \geq t\right\} = \left\{X\left(\rho \vee S_1\right)\restriction_{\{S_1\vee t<t\}} : \rho \geq t\right\}$. Consequently, the event {S₂ > t > S₁} is $\mathcal{F}^{S_1,\vee}_{T}$-measurable. Since
\[
S_2 = S_1 + \int_{\tau}^{T} \mathbf{1}_{\{S_2 > t > S_1\}}\,dt,
\]
we see that $S_2$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable. This argument can be adapted if we only know that for every t ∈ [τ, T ], on the event {S₁ < t}, the event {S₂ > t} only depends on the $P_{\tau,\mu}$-completion of the σ-field generated by the state variables $\left\{X(\rho)\restriction_{\{S_1\vee t<t\}} : \rho \geq t\right\}$ for some probability measure µ on E.

If the process X(t) is right-continuous, and if $S_2$ is a stopping time which is terminal after the stopping time $S_1 : \Omega \to [0, T]$, then the space-time variable $\left(S_2, X\left(S_2\right)\right)$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable. This result follows from the equality in (2.44) with $S_2$ instead of S:
\[
S_{2,n}(t) = \tau + \frac{t-\tau}{2^{n}} \left\lceil \frac{2^{n}\left(S_2 - \tau\right)}{t-\tau} \right\rceil. \tag{3.268}
\]
Notice that the stopping times $S_{2,n}(t)$, n ∈ ℕ, t ∈ (τ, T ], are $\mathcal{F}^{S_1,\vee}_{T}$-measurable, provided that $S_2$ has this property. Moreover, we have $S_2 \leq S_{2,n+1}(t) \leq S_{2,n}(t) \leq S_2 + 2^{-n}(t-\tau)$. It follows that the state variables $X\left(S_{2,n}(t)\right)$, n ∈ ℕ, t ∈ (τ, T ], are $\mathcal{F}^{S_1,\vee}_{T}$-measurable, and that the same is true for $X\left(S_2\right) = \lim_{n\to\infty} X\left(S_{2,n}(t)\right)$.

This completes the proof of Proposition 3.39.
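The dyadic approximation (3.268) can be checked numerically: the ceiling construction approximates S₂ from above, decreases in n, and is accurate to within $2^{-n}(t-\tau)$. A quick sketch (illustrative values of ours only):

```python
import math

def s2n(s2, tau, t, n):
    """Dyadic upper approximation of (3.268):
    tau + (t - tau)/2**n * ceil(2**n * (s2 - tau)/(t - tau))."""
    return tau + (t - tau) / 2**n * math.ceil(2**n * (s2 - tau) / (t - tau))

tau, t, s2 = 0.0, 1.0, 0.3137   # illustrative values only
for n in range(1, 20):
    a, b = s2n(s2, tau, t, n), s2n(s2, tau, t, n + 1)
    # S2 <= S_{2,n+1}(t) <= S_{2,n}(t) <= S2 + 2^{-n} (t - tau)
    assert s2 <= b <= a <= s2 + 2**-n * (t - tau)
```

Each $S_{2,n}(t)$ takes only dyadic values, which is what makes the measurability of the approximating state variables $X\left(S_{2,n}(t)\right)$ tractable.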

Remark 3.40. If in Theorem 3.41 for the sample path space Ω we take the Skorohod space $\Omega = D\left([0, T], E^{\triangle}\right)$, with X(t)(ω) = ω(t), ω ∈ Ω, t ∈ [0, T ], then the process t ↦ X(t), t ∈ [0, T ], is right-continuous, has left limits in E on its life time, and is quasi left-continuous on its life time as well.

Theorem 3.41. Let, as in Lemma 3.29,
\[
\left\{\left(\Omega, \mathcal{F}^{\tau}_{T}, P_{\tau,x}\right),\ \left(X(t),\ t \in [0, T]\right),\ \left(\vee_t,\ t \in [\tau, T]\right),\ (E, \mathcal{E})\right\}
\]
be a standard Markov process with right-continuous paths, which has left limits on its life time, and which is quasi-continuous from the left on its life time. For fixed (τ, x) ∈ [0, T ] × E, the σ-field $\mathcal{F}^{S,\vee}_{T}$ is the completion of the σ-field $\sigma\left(S \vee \rho,\ X\left(S \vee \rho\right) : 0 \leq \rho \leq T\right)$ with respect to the measure $P_{\tau,x}$. If $\left(S_1, S_2\right)$ is a pair of stopping times such that $S_2$ is $\mathcal{F}^{S_1,\vee}_{T}$-measurable and τ ≤ S₁ ≤ S₂ ≤ T, then for all bounded Borel functions f on [τ, T ] × $E^{\triangle}$ the equality
\[
E_{\tau,x}\left[f\left(S_2, X\left(S_2\right)\right) \,\middle|\, \mathcal{F}^{\tau}_{S_1}\right] = E_{S_1,\,X\left(S_1\right)}\left[f\left(S_2, X\left(S_2\right)\right)\right] \tag{3.269}
\]
holds $P_{\tau,x}$-almost surely on {S₁ < ζ}.

First notice that the conditions on S₁ and S₂ are such that S₂ is terminal after S₁; see Definition 3.35. Also observe that the Markov process in (3.267) is quasi-continuous from the left on its life time [0, ζ); compare with Theorem 1.39 item (a). Let A and B be Borel subsets of E such that B ⊂ A. In (3.269) we may put $S_1 = D^{\tau}_{A}$ together with $S_2 = D^{\tau}_{B}$, or $S_1 = T^{\tau}_{A}$ and $S_2 = T^{\tau}_{B}$; see Theorem 3.33 and Corollary 3.36.

Proof. This result is a consequence of the strong Markov property as exhibited in Theorem 1.39 item (a).

3.5.1 Some side remarks

We notice that we have used the following version of the Choquet capacity
theorem.
Theorem 3.42. In a Polish space every analytic set is capacitable.
For a discussion on capacitable subsets see e.g. Kiselman [133]; see Choquet
[58], [27] and [69] as well. For a general discussion on the foundations of
probability theory see e.g. [118].
Without the sequential λ-dominance of the operator $D_1 + L$, the second formula, i.e. the formula in (3.116), poses a difficulty, insofar as it is not clear that the function $e^{-\lambda t}\widetilde{S}_0(t)f$ indeed belongs to $C_b\left([0, T] \times E\right)$. For the moment suppose that the function f ∈ $C_b\left([0, T] \times E\right)$ is such that $\widetilde{S}_0(t)f \in C_b\left([0, T] \times E\right)$. Then equality (3.115) yields:
\[
\left(\lambda I - L^{(1)}\right)^{-1} f
= \int_0^t e^{-\lambda\rho}\,\widetilde{S}_0(\rho)f\,d\rho + e^{-\lambda t} S_0(t) \left(\lambda I - L^{(1)}\right)^{-1} f
= \int_0^t e^{-\lambda\rho}\,\widetilde{S}_0(\rho)f\,d\rho + e^{-\lambda t} \left(\lambda I - L^{(1)}\right)^{-1} \widetilde{S}_0(t)f. \tag{3.270}
\]
0
Rt ¡ ¢
Consequently, the function 0 e−λρ Se0 (ρ)f dρ belongs to D L(1) and the
equality in (3.116) follows from (3.270). Next, let (µm )m∈N be a sequence
in (0, ∞) which increases to ∞, and let (fn )n∈N be sequence in Cb ([0, T ] × E)
which decreases pointwise to the zero-function. From (3.113), (3.270) and
(3.116) we obtain the following equality:
³ ´Z t
(1)
µm R (µm ) fn = λI − L e−λρ S0 (ρ) (µm R (µm ) fn ) dρ
0
+ e−λt S0 (t) (µm R (µm ) fn ) , m, n ∈ N, and (3.271)
Z t
µm R (µm ) R(λ)fn = e−λρ S0 (ρ) (µm R (µm ) fn ) dρ
0
166 3 Space-time operators

+ e−λt S0 (t) (µm R (µm ) R(λ)fn ) , m, n ∈ N. (3.272)

Perhaps it is better to consider the following equalities rather than those in (3.272). They also follow from the equalities in (3.113), (3.115) and (3.116):
\[
R(\lambda)f = \frac{\displaystyle\int_{t_1}^{t_2} \int_0^t e^{-\lambda\rho} S_0(\rho)\,d\rho\, f\,dt}{t_2 - t_1}
+ \frac{\displaystyle\int_{t_1}^{t_2} e^{-\lambda t} S_0(t) R(\lambda) f\,dt}{t_2 - t_1}, \tag{3.273}
\]
and
\[
f = \left(\lambda I - L^{(1)}\right) \frac{\displaystyle\int_{t_1}^{t_2} \int_0^t e^{-\lambda\rho} S_0(\rho)\,d\rho\, f\,dt}{t_2 - t_1}
+ \frac{\displaystyle\int_{t_1}^{t_2} e^{-\lambda t} S_0(t) f\,dt}{t_2 - t_1}. \tag{3.274}
\]
We have to investigate the equalities in (3.273) and (3.274) if for f we choose a function fₙ from a sequence $\left(f_n\right)_{n\in\mathbb{N}}$ which decreases to zero. Then
\[
\inf_{n\in\mathbb{N}}\ \sup_{0\leq t_1<t_2\leq T} \frac{\displaystyle\int_{t_1}^{t_2} e^{-\lambda t} S_0(t) f_n\,dt}{t_2 - t_1}
= \lim_{n\to\infty}\ \sup_{0\leq t_1<t_2\leq T} \frac{\displaystyle\int_{t_1}^{t_2} e^{-\lambda t} S_0(t) f_n\,dt}{t_2 - t_1}. \tag{3.275}
\]
From (3.274) we infer that
\[
\lim_{n\to\infty}\ \sup_{0\leq t_1<t_2\leq T} \frac{\displaystyle\int_{t_1}^{t_2} \int_0^t e^{-\lambda\rho} S_0(\rho)\,d\rho\, f_n\,dt}{t_2 - t_1} = 0. \tag{3.276}
\]
From (3.275) and our extra assumption we see that the limit in (3.275) vanishes, and hence that the semigroup {S₀(t) : t ≥ 0} consists of linear mappings which leave the function space $C_b\left([0, T] \times E\right)$ invariant, and which form a Tβ-equi-continuous family.
The operator $L^{(1)}$ is the Tβ-closure of the operator $D_1 + L$, which is positive Tβ-dissipative. Hence the operator $L^{(1)}$ inherits this property, and so it is positive Tβ-dissipative as well.

The operator $D_1 + L$ is Tβ-densely defined, is Tβ-dissipative, satisfies the maximum principle, and there exists λ₀ > 0 such that the range of $\lambda_0 I - D_1 - L$ is Tβ-dense in $C_b\left([0, T] \times E\right)$. Moreover, we have
\[
C^{(1)}_{P,b} = \bigcap_{\lambda_0>0} C^{(1)}_{P,b}\left(\lambda_0\right)
= \bigcap_{\lambda_0>0} \left\{\left(\lambda_0 I - L - D_1\right)g : g \in D\left(L\right) \cap D\left(D_1\right)\right\}. \tag{3.277}
\]
The second equality in (3.277) follows from (3.91) and (3.92).
Consider, for functions f ∈ $D\left(L^{(1)}\right)$, g ∈ $D\left(L\right) \cap D\left(D_1\right)$, and λ > 0, the equalities:
\[
\begin{aligned}
L^{(1)}f &- Lg - D_1 g \\
&= L^{(1)}f - \lambda R(\lambda) L^{(1)}f + \lambda^2 R(\lambda) f - \lambda f - Lg - D_1 g \\
&= \int_0^\infty e^{-\rho} \left(I - S\left(\lambda^{-1}\rho\right)\right) L^{(1)} f\,d\rho
+ \lambda \int_0^\infty e^{-\rho} \left(S\left(\lambda^{-1}\rho\right) - I\right) (f - g)\,d\rho \\
&\quad + \int_0^\infty \rho e^{-\rho} \left\{ \frac{\left(S\left(\lambda^{-1}\rho\right)\vartheta_{\lambda^{-1}\rho} - I\right)g}{\lambda^{-1}\rho} - Lg \right\} d\rho \\
&\quad + \int_0^\infty \rho e^{-\rho}\, S\left(\lambda^{-1}\rho\right) \left( \frac{\left(I - \vartheta_{\lambda^{-1}\rho}\right)g}{\lambda^{-1}\rho} - D_1 g \right) d\rho \\
&\quad + \int_0^\infty \rho e^{-\rho} \left\{ S\left(\lambda^{-1}\rho\right) D_1 g - D_1 g \right\} d\rho.
\end{aligned}
\tag{3.278}
\]
We recall that the time shift operators $\vartheta$ were defined in (3.77).
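As a consistency check on (3.278) (this bookkeeping is ours, not the text's), note that $\int_0^\infty \rho e^{-\rho}\,d\rho = 1$ and that $\lambda\int_0^\infty e^{-\rho}\left(S\left(\lambda^{-1}\rho\right)-I\right)g\,d\rho = \int_0^\infty \rho e^{-\rho}\,\frac{\left(S\left(\lambda^{-1}\rho\right)-I\right)g}{\lambda^{-1}\rho}\,d\rho$; the operator parts of the last three integrands then telescope:

```latex
% Write S = S(lambda^{-1} rho), theta = vartheta_{lambda^{-1} rho},
% h = lambda^{-1} rho. The leading terms of the last three integrands in
% (3.278) add up as
\frac{\left(S\vartheta - I\right)g}{h} + S\!\left(\frac{\left(I - \vartheta\right)g}{h}\right)
  = \frac{\left(S - I\right)g}{h},
% while the subtracted terms combine to
-\,Lg - S D_1 g + \left(S D_1 g - D_1 g\right) = -\,Lg - D_1 g,
% so that the three integrals together represent exactly
\lambda \int_0^\infty e^{-\rho}\left(S\left(\lambda^{-1}\rho\right) - I\right) g\, d\rho
  \;-\; Lg \;-\; D_1 g .
```

This matches the decomposition of $\lambda^2 R(\lambda) f - \lambda f - Lg - D_1 g$ term by term.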

3.5.2 Some related remarks

In Subsection 2.1.6 we already discussed at some length topics related to Korovkin families and convergence properties of measures. Here we will say something about the maximum principle, the martingale problem, and stopping time arguments.

For more general versions than our Choquet capacitability theorem 3.28 the reader is referred to e.g. [73, 69, 160]. For a discussion on capacitable subsets see e.g. Kiselman [133]; see Choquet [58], [27] and [69] as well. For a general discussion on the foundations of probability theory see e.g. [118]. In [95] the authors also made a thorough investigation of measurability properties of stopping times; however, in that case the underlying state space was locally compact. In [48] the author makes an extensive study of the maximum principle for an unbounded operator with domain and range in the space of continuous functions which vanish at infinity, where the state space is locally compact. As indicated in Chapter 1, an operator L for which the martingale problem is well-posed need not possess a unique extension which is the generator of a Dynkin-Feller semigroup. As indicated by Kolokoltsov in [136], there exist relatively easy counter-examples: see the comments after Theorem 1.39 in §1.3. For the time-homogeneous case see, e.g., [84] or [109]. In fact [84] contains a general result on operators with domain and range in C₀(E) which have unique linear extensions generating a Feller-Dynkin semigroup. The martingale problem goes back to Stroock and Varadhan (see [225]). It has found numerous applications in various fields of mathematics. We refer the reader to [147], [136], and [135] for more information about, and applications of, the martingale problem. In [80] the reader may find singular diffusion equations which possess, or which do not possess, unique solutions. Consequently, for (singular) diffusion equations without unique solutions the martingale problem is not uniquely solvable. Another valuable source of information is Jacob [110, 111, 112]. Other relevant references are papers by Hoh [104, 106, 105, 107]. Some of Hoh's work is also employed in Jacob's books. In fact most of these references discuss the relations between pseudo-differential operators (of order less than or equal to 2), the corresponding martingale problem, and being the generator of a Feller-Dynkin semigroup.
Part II

Backward Stochastic Differential Equations


4
Feynman-Kac formulas, backward stochastic
differential equations and Markov processes

In this chapter we explain the notion of backward stochastic differential equations and its relationship with classical (backward) parabolic differential equations of second order. The chapter contains a mixture of stochastic processes, like Markov processes and martingale theory, and semi-linear partial differential equations of parabolic type. Some emphasis is put on the fact that the whole theory generalizes Feynman-Kac formulas. A new method of proof of the existence of solutions is given. All the existence arguments are based on rather precise quantitative estimates.

4.1 Introduction
This introduction serves as a motivation for the present chapter and also for Chapter 5. Backward stochastic differential equations, in short BSDE's, have been well studied during the last ten years or so. They were introduced by Pardoux and Peng [184], who proved existence and uniqueness of adapted solutions, under suitable square-integrability assumptions on the coefficients and on the terminal condition. They provide probabilistic formulas for solutions of systems of semi-linear partial differential equations, both of parabolic and elliptic type. The interest in this kind of stochastic equations has increased steadily; this is due to the strong connections of these equations with mathematical finance and the fact that they give a generalization of the well-known Feynman-Kac formula to semi-linear partial differential equations. In the present chapter we will concentrate on the relationship between time-dependent strong Markov processes and abstract backward stochastic differential equations. The equations are phrased in terms of a martingale problem, rather than a stochastic differential equation. They could be called weak backward stochastic differential equations. Emphasis is put on existence and uniqueness of solutions. The paper [246] deals with the same subject, but it concentrates on comparison theorems and viscosity solutions. The proof of the existence result is based on a theorem which is related to a homotopy argument, as pointed out by the authors of [63]. It is more direct than the usual approach, which uses, among other things, regularization by convolution products. It also gives rather precise quantitative estimates.
For examples of strong solutions which are driven by Brownian motion
the reader is referred to e.g. section 2 in Pardoux [181]. If the coefficients
x 7→ b(s, x) and x 7→ σ(s, x) of the underlying (forward) stochastic differential
equation are linear in x, then the corresponding forward-backward stochastic
differential equation is related to option pricing in financial mathematics. The
backward stochastic differential equation may serve as a model for a hedging
strategy. For more details on this interpretation see e.g. El Karoui and Quenez
[126], pp. 198–199. A rather recent book on financial mathematics in terms of
martingale theory is the one by Delbaen and Schachermayer [68]. E. Pardoux
and S. Zhang [185] use BSDE’s to give a probabilistic formula for the solution
of a system of parabolic or elliptic semi-linear partial differential equations
with Neumann boundary conditions. In [40] the authors also put BSDE’s at
work to prove a result on a Neumann type boundary problem.
In this chapter we want to consider the situation where the family of
operators L(s), 0 ≤ s ≤ T , generates a time-inhomogeneous Markov process

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)} (4.1)

in the sense that


(d/ds) Eτ,x [f(X(s))] = Eτ,x [L(s)f(X(s))],   f ∈ D(L(s)), τ ≤ s ≤ T.
We consider the operators L(s) as operators on (a subspace of) the space of
bounded continuous functions on E, i.e. on Cb(E) equipped with the supremum
norm ‖f‖∞ = sup_{x∈E} |f(x)|, f ∈ Cb(E), and the strict topology Tβ.
With the operators L(s) we associate the squared gradient operator Γ1 defined
by

Γ1(f, g)(τ, x) = Tβ-lim_{s↓τ} (1/(s − τ)) Eτ,x [(f(X(s)) − f(X(τ))) (g(X(s)) − g(X(τ)))],   (4.2)

for f, g ∈ D(Γ1). Here D(Γ1) is the domain of the operator Γ1. It consists of
those functions f ∈ Cb(E) = Cb(E, C) with the property that the strict limit

Tβ-lim_{s↓τ} (1/(s − τ)) Eτ,x [|f(X(s)) − f(X(τ))|²]   (4.3)

exists. We will assume that D (Γ1 ) contains an algebra of functions in


Cb ([0, T ] × E) which is closed under complex conjugation, and which is Tβ -
dense. These squared gradient operators are also called energy operators: see
e.g. Barlow, Bass and Kumagai [23]. We assume that every operator L(s),
0 ≤ s ≤ T , generates a diffusion in the sense of the following definition.
4.1 Introduction 173

In the sequel it is assumed that the family of operators {L(s) : 0 ≤ s ≤ T}
possesses the property that the space of functions u : [0, T] × E → R for which
the function (s, x) ↦ ∂u/∂s(s, x) + L(s)u(s, ·)(x) belongs to
Cb([0, T] × E) := Cb([0, T] × E; C) is Tβ-dense in the space Cb([0, T] × E).
This subspace of functions is denoted by D(L), and the operator L is defined
by Lu(s, x) = L(s)u (s, ·) (x), u ∈ D(L). It is also assumed that the family
A is a core for the operator L. We assume that the operator L, or that the
family of operators {L(s) : 0 ≤ s ≤ T }, generates a diffusion in the sense of
the following definition.
Definition 4.1. A family of operators {L(s) : 0 ≤ s ≤ T } is said to generate
a diffusion if for every C ∞ -function Φ : Rn → R, with Φ(0, . . . , 0) = 0, and
every pair (s, x) ∈ [0, T ] × E the following identity is valid

L(s) (Φ(f1, . . . , fn)(s, ·)) (x)   (4.4)
= Σ_{j=1}^{n} ∂Φ/∂x_j (f1, . . . , fn)(x) L(s)f_j(x)
+ ½ Σ_{j,k=1}^{n} ∂²Φ/(∂x_j ∂x_k) (f1, . . . , fn)(x) Γ1(f_j, f_k)(s, x)

for all functions f1 , . . . , fn in an algebra of functions A, contained in the


domain of the operator L, which forms a core for L.
Generators of diffusions for single operators are described in Bakry’s lecture
notes [16]. For more information on the squared gradient operator see e.g. [19]
and [17] as well. Put Φ(f, g) = f g. Then (4.4) implies L(s)(f g)(s, ·)(x) =
L(s)f(s, ·)(x) g(s, x) + f(s, x) L(s)g(s, ·)(x) + Γ1(f, g)(s, x), provided that the
three functions f, g and f g belong to A. Instead of using the full strength of
(4.4), i.e. with a general function Φ, we just need it for the product (f, g) ↦ f g:
see Proposition 4.24.
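The defining limit (4.2) can be checked numerically in the most classical special case. The following minimal Python sketch assumes E = R and L(s) = ½ d²/dx², so that X is standard Brownian motion and Γ1(f, g) = f′g′; the concrete choices f(x) = x², g(x) = x, the starting point x = 1 and the small time step are illustrative assumptions, not taken from the text.

```python
import math
import random

random.seed(0)

# Monte Carlo approximation of the defining limit (4.2) for Brownian
# motion started at x0: estimate
#   E[(f(X(tau+h)) - f(x0)) (g(X(tau+h)) - g(x0))] / h
# for a small h, and compare with f'(x0) g'(x0).
f = lambda x: x * x          # illustrative choice, f'(x) = 2x
g = lambda x: x              # illustrative choice, g'(x) = 1

x0, h, n_samples = 1.0, 1e-3, 200_000

acc = 0.0
for _ in range(n_samples):
    xs = x0 + math.sqrt(h) * random.gauss(0.0, 1.0)  # X(tau + h) given X(tau) = x0
    acc += (f(xs) - f(x0)) * (g(xs) - g(x0))

gamma1_estimate = acc / (n_samples * h)
gamma1_exact = 2.0 * x0 * 1.0                        # f'(x0) * g'(x0)
print(gamma1_estimate, gamma1_exact)
```

For these choices the product inside the expectation has mean exactly 2x0·h, so the estimate fluctuates around 2 with the usual Monte Carlo error.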
Remark 4.2. Let m be a reference measure on the Borel field E of E, and let
p ∈ [1, ∞]. If we consider the operators L(s), 0 ≤ s ≤ T , in Lp (E, E, m)-
space, then we also need some conditions on the algebra A of “core” type in
the space Lp (E, E, m). For details the reader is referred to Bakry [16].
By definition the gradient of a function u ∈ D(Γ1) in the direction of v ∈
D(Γ1) is the function (τ, x) ↦ Γ1(u, v)(τ, x). For given (τ, x) ∈ [0, T] × E
the functional v ↦ Γ1(u, v)(τ, x) is linear: its action is denoted by ∇Lu(τ, x).
Hence, for (τ, x) ∈ [0, T] × E fixed, we can consider ∇Lu(τ, x) as an element
in the dual of D(Γ1). The pair

(τ, x) ↦ (u(τ, x), ∇Lu(τ, x))

may be called an element in the phase space of the family L(s), 0 ≤ s ≤ T
(see Jan Prüss [193]), and the process s ↦ (u(s, X(s)), ∇Lu(s, X(s))) will be

called an element of the stochastic phase space. Next let f : [0, T] × E × R ×
D(Γ1) → R be a “reasonable” function, and consider, for 0 ≤ s1 < s2 ≤ T,
the expression:

u(s2, X(s2)) − u(s1, X(s1)) + ∫_{s1}^{s2} f(s, X(s), u(s, X(s)), ∇Lu(s, X(s))) ds
− u(s2, X(s2)) + u(s1, X(s1)) + ∫_{s1}^{s2} (L(s)u(s, X(s)) + ∂u/∂s(s, X(s))) ds   (4.5)

= u(s2, X(s2)) − u(s1, X(s1)) + ∫_{s1}^{s2} f(s, X(s), u(s, X(s)), ∇Lu(s, X(s))) ds
− Mu(s2) + Mu(s1),   (4.6)

where

Mu(s2) − Mu(s1)
= u(s2, X(s2)) − u(s1, X(s1)) − ∫_{s1}^{s2} (L(s)u(s, X(s)) + ∂u/∂s(s, X(s))) ds
= ∫_{s1}^{s2} dMu(s).   (4.7)

Details on the properties of the function f will be given in the theorems 4.26,
4.30, 4.33, 4.34, and 4.42.
The following definition also occurs in Definition 1.29. In Definition 1.29
the reader will find more details about the definitions 4.3 and 4.4. It also
explains the relationship with transition probabilities and Feller propagators.
Definition 4.3. The process

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)} (4.8)

is called a time-inhomogeneous Markov process if


Eτ,x [f(X(t)) | Fsτ] = Es,X(s) [f(X(t))],   Pτ,x-almost surely.   (4.9)

Here f is a bounded Borel measurable function defined on the state space E


and τ ≤ s ≤ t ≤ T .
Suppose that the process X(t) in (4.8) has paths which are right-continuous
and have left limits in E. Then it can be shown that the Markov property
for fixed times carries over to stopping times in the sense that (4.9) may be
replaced with
Eτ,x [Y | FSτ] = ES,X(S) [Y],   Pτ,x-almost surely.   (4.10)

Here S : Ω → [τ, T ] is an Ftτ -adapted stopping time and Y is a bounded


stochastic variable which is measurable with respect to the future (or terminal)

σ-field after S, i.e. the one generated by {X (t ∨ S) : τ ≤ t ≤ T }. For this type


of result the reader is referred to Chapter 2 in Gulisashvili et al [95] and to
item (a) in Theorem 1.39. Markov processes for which (4.10) holds are called
strong Markov processes.
The following definition is, essentially speaking, the same as Definition
1.31. Its relationship with Feller propagators or evolutions (see Chapter 1,
Definition 1.30) is explained in Proposition 3.1 in Chapter 3. The derivatives
and the operators L(s), s ∈ [0, T ], have to be taken with respect to the strict
topology: see Section 1.1.
Definition 4.4. The family of operators L(s), 0 ≤ s ≤ T , is said to generate
a time-inhomogeneous Markov process

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)} (4.11)

if for all functions u ∈ D(L), for all x ∈ E, and for all pairs (τ, s) with
0 ≤ τ ≤ s ≤ T the following equality holds:
(d/ds) Eτ,x [u(s, X(s))] = Eτ,x [∂u/∂s(s, X(s)) + L(s)u(s, ·)(X(s))].   (4.12)

Next we show that under rather general conditions the process s ↦ Mu(s) −
Mu(t), t ≤ s ≤ T, as defined in (4.7) is a Pt,x-martingale. In the following
proposition we write Fst, s ∈ [t, T], for the σ-field generated by X(ρ), ρ ∈ [t, s].
The proof of the following proposition could be based on item (c) in Theorem
1.39 in Chapter 1. For convenience we provide a direct proof based on the
Markov property.
Proposition 4.5. Fix t ∈ [τ, T). Let the function u : [t, T] × E → R be
such that (s, x) ↦ ∂u/∂s(s, x) + L(s)u(s, ·)(x) belongs to Cb([t, T] × E) :=
Cb([t, T] × E; C). Then the process s ↦ Mu(s) − Mu(t) is a Pt,x-martingale
with respect to the filtration (Fst)_{s∈[t,T]}.

Proof. Suppose that T ≥ s2 > s1 ≥ t. In order to check the martingale


property of the process Mu (s) − Mu (t), s ∈ [t, T ], it suffices to prove that
Et,x [Mu(s2) − Mu(s1) | F^t_{s1}] = 0.   (4.13)

In order to prove (4.13) we notice that by the time-inhomogeneous Markov


property:
Et,x [Mu(s2) − Mu(s1) | F^t_{s1}] = Es1,X(s1) [Mu(s2) − Mu(s1)]
= Es1,X(s1) [u(s2, X(s2)) − u(s1, X(s1)) − ∫_{s1}^{s2} (L(s)u(s, X(s)) + ∂u/∂s(s, X(s))) ds]
= Es1,X(s1) [u(s2, X(s2)) − u(s1, X(s1))] − ∫_{s1}^{s2} Es1,X(s1) [L(s)u(s, X(s)) + ∂u/∂s(s, X(s))] ds
= Es1,X(s1) [u(s2, X(s2)) − u(s1, X(s1))] − ∫_{s1}^{s2} (d/ds) Es1,X(s1) [u(s, X(s))] ds
= Es1,X(s1) [u(s2, X(s2)) − u(s1, X(s1))] − Es1,X(s1) [u(s2, X(s2)) − u(s1, X(s1))] = 0.   (4.14)

The equality in (4.14) establishes the result in Proposition 4.5.

As explained in Definition 4.1 it is assumed that the subspace D(L) contains


an algebra of functions which forms a core for the operator L.
Proposition 4.6. Let the family of operators L(s), 0 ≤ s ≤ T , generate a
time-inhomogeneous Markov process

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)} (4.15)

in the sense of Definition 4.4: see equality (4.12). Then the process X(t) has
a modification which is right-continuous and has left limits on its life time.
For the definition of life time see e.g. item (a) in Theorem 1.39. The life time
ζ is defined by
ζ = inf {s > 0 : X(s) = ∆} on the event {X(s) = ∆ for some s ∈ (0, T)},
and ζ = T if X(s) ∈ E for all s ∈ (0, T).   (4.16)
In view of Proposition 4.6 we will assume that our Markov process has left
limits on its life time and is continuous from the right. The following proof is a
correct outline of a proof of Proposition 4.6. If E is just a Polish space it needs
a considerable adaptation. Suppose that E is Polish, and first assume that the
process t ↦ X(t) is conservative, i.e. assume that Pτ,x [X(t) ∈ E] = 1. Then,
by an important intermediate result (see Proposition 2.2 in Chapter 2 and
the arguments leading to it) we see that the orbits {X(ρ) : τ ≤ ρ ≤ T } are
Pτ,x -almost surely relatively compact in E. In case that the process t 7→ X(t)
is not conservative, i.e. if, for some fixed t ∈ [τ, T ], an inequality of the form
Pτ,x [X(t) ∈ E] < 1 holds, then a similar result is still valid. In fact on the
event {X(t) ∈ E} the orbit {X(ρ) : τ ≤ ρ ≤ t} is Pτ,x -almost surely relatively
compact: see Proposition 2.3 in Chapter 2. All details can be found in the proof
of item (a) of Theorem 1.39: see Subsection 2.1.1 in Chapter 2.
Proof. As indicated earlier the argument here works in case the space E is
locally compact. However, the result is true for a Polish space E: see item (a)
in Theorem 1.39.
Let the function u : [0, T ] × E → R belong to the space D(L). Then the
process s 7→ Mu (s) − Mu (t), t ≤ s ≤ T , is a Pt,x -martingale. Let D[0, T ]

be the set of dyadic numbers of the form k 2^{−n} T, k = 0, 1, 2, . . . , 2^n, n ∈ N. By a
classical martingale convergence theorem (see e.g. Chapter II in Revuz and Yor [199])
it follows that the limit lim_{s↑t, s∈D[0,T]} u(s, X(s)) exists Pτ,x-almost
surely for all 0 ≤ τ < t ≤ T and for all x ∈ E. In the same reference it is
also shown that the limit lim_{s↓t, s∈D[0,T]} u(s, X(s)) exists Pτ,x-almost surely for
all 0 ≤ τ ≤ t < T and for all x ∈ E. Since the locally compact space [0, T] × E
is second countable it follows that the exceptional sets may be chosen to be
independent of (τ, x) ∈ [0, T] × E, of t ∈ [τ, T], and of the function u ∈ D(L).
Since by hypothesis the subspace D(L) is Tβ-dense in Cb([0, T] × E) it follows
that the left-hand limit at t of the process s ↦ X(s), s ∈ D[0, T] ∩ [τ, t], exists
Pτ,x-almost surely for all (t, x) ∈ (τ, T] × E. It also follows that the right-hand
limit at t of the process s ↦ X(s), s ∈ D[0, T] ∩ (t, T], exists Pτ,x-almost
surely for all (t, x) ∈ [τ, T) × E. Then we modify X(t) by replacing it with
X(t+) = lim_{s↓t, s∈D[0,T]∩(τ,T]} X(s), t ∈ [0, T), and X(T+) = X(T). It also
follows that the process t ↦ X(t+) has left limits in E.
The hypotheses in the following Proposition 4.7 are the same as those in
Proposition 4.6. The functions u and v belong to D^{(1)}(L) = D(D1) ∩ D(L):
see Definition 1.30.
Proposition 4.7. Let the continuous function u : [0, T ] × E → R be such
that for every s ∈ [t, T ] the function x 7→ u(s, x) belongs to D (L(s)) and
suppose that the function (s, x) 7→ [L(s)u (s, ·)] (x) is bounded and continuous.
In addition suppose that the function s 7→ u(s, x) is continuously differentiable
for all x ∈ E. Then the process s 7→ Mu (s) − Mu (t) is an Fst -martingale with
respect to the probability Pt,x . If v is another such function, then the (right)
derivative of the quadratic co-variation process of the martingales Mu and Mv
is given by:
(d/dt) ⟨Mu, Mv⟩(t) = Γ1(u, v)(t, X(t)).
In fact the following identity holds as well:
Mu(t)Mv(t) − Mu(0)Mv(0)
= ∫_0^t Mu(s)dMv(s) + ∫_0^t Mv(s)dMu(s) + ∫_0^t Γ1(u, v)(s, X(s)) ds.   (4.17)

Here Fst, s ∈ [t, T], is the σ-field generated by the state variables X(ρ), t ≤
ρ ≤ s. Instead of Fs0 we usually write Fs, s ∈ [0, T]. The formula in (4.17) is
known as the integration by parts formula for stochastic integrals.
Proof. We outline a proof of the equality in (4.17). So let the functions u and
v be as in Proposition 4.7. Then we have
Mu(t)Mv(t) − Mu(0)Mv(0)
= Σ_{k=0}^{2^n−1} Mu(k2^{−n}t) (Mv((k + 1)2^{−n}t) − Mv(k2^{−n}t))
+ Σ_{k=0}^{2^n−1} (Mu((k + 1)2^{−n}t) − Mu(k2^{−n}t)) Mv(k2^{−n}t)
+ Σ_{k=0}^{2^n−1} (Mu((k + 1)2^{−n}t) − Mu(k2^{−n}t)) (Mv((k + 1)2^{−n}t) − Mv(k2^{−n}t)).   (4.18)
The first term on the right-hand side of (4.18) converges to ∫_0^t Mu(s)dMv(s),
the second term converges to ∫_0^t Mv(s)dMu(s). Using the identity in (4.7) for
the function u and a similar identity for v we see that the third term on the
right-hand side of (4.18) converges to ∫_0^t Γ1(u, v)(s, X(s)) ds.
This completes the proof of Proposition 4.7.
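The dyadic decomposition used in this proof is an exact algebraic identity at every level n; the limit only enters when the three sums are identified with the two stochastic integrals and the covariation term of (4.17). The following Python sketch verifies the discrete identity, with two Gaussian random walks as illustrative stand-ins for the martingales Mu and Mv.

```python
import random

random.seed(1)

# Telescoping check of the decomposition (4.18): for any two discrete
# paths M, V and any partition,
#   M_n V_n - M_0 V_0 = sum M_k dV + sum V_k dM + sum dM dV
# holds exactly; the three sums approximate the two stochastic integrals
# and <M, V> as the mesh shrinks.
n = 2 ** 10
M = [0.0]
V = [0.0]
for _ in range(n):
    M.append(M[-1] + random.gauss(0.0, 1.0))
    V.append(V[-1] + random.gauss(0.0, 1.0))

lhs = M[-1] * V[-1] - M[0] * V[0]
s1 = sum(M[k] * (V[k + 1] - V[k]) for k in range(n))               # -> int M dV
s2 = sum(V[k] * (M[k + 1] - M[k]) for k in range(n))               # -> int V dM
s3 = sum((M[k + 1] - M[k]) * (V[k + 1] - V[k]) for k in range(n))  # -> <M, V>
print(lhs, s1 + s2 + s3)
```

The two printed numbers agree up to floating-point rounding, whatever the simulated paths are.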

Remark 4.8. The quadratic variation process of the (local) martingale s ↦
Mu(s) is given by the process s ↦ Γ1(u(s, ·), u(s, ·))(s, X(s)), and therefore

Es1,x [ |∫_{s1}^{s2} dMu(s)|² ] = Es1,x [ ∫_{s1}^{s2} Γ1(u(s, ·), u(s, ·))(s, X(s)) ds ] < ∞

under appropriate conditions on the function u. Very informally we may think


of the following representation for the martingale difference:
Mu(s2) − Mu(s1) = ∫_{s1}^{s2} ∇Lu(s, X(s)) dW(s).   (4.19)

Here we still have to give a meaning to the stochastic integral in the right-
hand side of (4.19). If E is an infinite-dimensional Banach space, then W (t)
should be some kind of a cylindrical Brownian motion. It is closely related to
a formula which occurs in Malliavin calculus: see Nualart [168] (Proposition
3.2.1) and [169].

Remark 4.9. It is perhaps worthwhile to observe that for Brownian motion


(W (s), Px ) the martingale difference Mu (s2 ) − Mu (s1 ), s1 ≤ s2 ≤ T , is given
by a stochastic integral:
Mu(s2) − Mu(s1) = ∫_{s1}^{s2} ∇u(τ, W(τ)) dW(τ).

The increment of its quadratic variation process is given by


⟨Mu, Mu⟩(s2) − ⟨Mu, Mu⟩(s1) = ∫_{s1}^{s2} |∇u(τ, W(τ))|² dτ.

Next suppose that the function u solves the equation:


f(s, x, u(s, x), ∇Lu(s, x)) + L(s)u(s, x) + ∂u/∂s(s, x) = 0.   (4.20)

If moreover, u(T, x) = ϕ(T, x), x ∈ E, is given, then we have

u(t, X(t)) = ϕ(T, X(T)) + ∫_t^T f(s, X(s), u(s, X(s)), ∇Lu(s, X(s))) ds − ∫_t^T dMu(s),   (4.21)

with Mu (s) as in (4.7). From (4.21) we get

u(t, x) = Et,x [u(t, X(t))]   (4.22)
= Et,x [ϕ(T, X(T))] + ∫_t^T Et,x [f(s, X(s), u(s, X(s)), ∇Lu(s, X(s)))] ds.

Theorem 4.10. Let u : [0, T ] × E → R be a continuous function with the


property that for every (t, x) ∈ [0, T ] × E the function s 7→ Et,x [u (s, X(s))] is
differentiable and that
(d/ds) Et,x [u(s, X(s))] = Et,x [L(s)u(s, X(s)) + ∂u/∂s(s, X(s))],   t < s < T.

Then the following assertions are equivalent:


(a) The function u satisfies the following differential equation:

L(t)u(t, x) + ∂u/∂t(t, x) + f(t, x, u(t, x), ∇Lu(t, x)) = 0.   (4.23)
(b) The function u satisfies the following type of Feynman-Kac integral equa-
tion:
u(t, x) = Et,x [u(T, X(T)) + ∫_t^T f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ].   (4.24)
(c) For every t ∈ [0, T ] the process
s ↦ u(s, X(s)) − u(t, X(t)) + ∫_t^s f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ

is an Fst -martingale with respect to Pt,x on the interval [t, T ].


(d) For every s ∈ [0, T ] the process
t ↦ u(T, X(T)) − u(t, X(t)) + ∫_t^T f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ

is an FTt -backward martingale with respect to Ps,x on the interval [s, T ].



Remark 4.11. Suppose that the function u is a solution to the following ter-
minal value problem:

L(s)u(s, ·)(x) + ∂u/∂s(s, x) + f(s, x, u(s, x), ∇Lu(s, x)) = 0;
u(T, x) = ϕ(T, x).   (4.25)
Then the pair (u(s, X(s)), ∇Lu(s, X(s))) can be considered as a weak solution
to a backward stochastic differential equation. More precisely, for every s ∈
[0, T ] the process
t ↦ u(T, X(T)) − u(t, X(t)) + ∫_t^T f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ

is an FTt-backward martingale relative to Ps,x on the interval [s, T]. The
symbol ∇Lu(s, x) stands for the functional v ↦ ∇Lu v(s, x) = Γ1(u, v)(s, x),
where Γ1 is the squared gradient operator:

Γ1(u, v)(s, x)   (4.26)
= Tβ-lim_{t↓s} (1/(t − s)) Es,x [(u(s, X(t)) − u(s, X(s))) (v(s, X(t)) − v(s, X(s)))].

Possible choices for the function f are


f(s, x, y, ∇Lu) = −V(s, x) y,   (4.27)
and
f(s, x, y, ∇Lu) = ½ |∇Lu(s, x)|² − V(s, x) = ½ Γ1(u, u)(s, x) − V(s, x).   (4.28)

The choice in (4.27) turns equation (4.25) into the following heat equation:

∂u/∂s(s, x) + L(s)u(s, ·)(x) − V(s, x)u(s, x) = 0;
u(T, x) = ϕ(T, x).   (4.29)

The function v(s, x) defined by the Feynman-Kac formula


v(s, x) = Es,x [ e^{−∫_s^T V(ρ,X(ρ))dρ} ϕ(T, X(T)) ]   (4.30)

is a candidate solution to equation (4.29).


The choice in (4.28) turns equation (4.25) into the following Hamilton-Jacobi-
Bellman equation:

∂u/∂s(s, x) + L(s)u(s, ·)(x) − ½ Γ1(u, u)(s, x) + V(s, x) = 0;
u(T, x) = − log ϕ(T, x),   (4.31)

where − log ϕ(T, x) replaces ϕ(T, x). The function SL defined by the genuine
non-linear Feynman-Kac formula
SL(s, x) = − log Es,x [ e^{−∫_s^T V(ρ,X(ρ))dρ} ϕ(T, X(T)) ]   (4.32)

is a candidate solution to (4.31). Often these “candidate solutions” are viscos-


ity solutions. However, this was the main topic in [246] and is the main topic
in Chapter 5.
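For the model case where X is standard Brownian motion and V is constant, the candidate (4.30) can be evaluated by straightforward Monte Carlo. The following Python sketch does this for the illustrative choices V ≡ ½ and ϕ(T, x) = x (assumptions made for the example, not taken from the text); for them (4.30) reduces to v(s, x) = e^{−(T−s)/2} x.

```python
import math
import random

random.seed(2)

# Monte Carlo evaluation of the Feynman-Kac candidate (4.30) when X is
# standard Brownian motion, V(rho, x) = v0 is constant and phi(T, x) = x.
# For these illustrative choices (4.30) reduces to
#   v(s, x) = exp(-v0 (T - s)) * x.
def feynman_kac(s, x, T, v0, n_paths=20_000, n_steps=20):
    dt = (T - s) / n_steps
    acc = 0.0
    for _ in range(n_paths):
        xt, integral = x, 0.0
        for _ in range(n_steps):
            integral += v0 * dt                      # int_s^T V(rho, X(rho)) drho
            xt += math.sqrt(dt) * random.gauss(0.0, 1.0)
        acc += math.exp(-integral) * xt              # exp(-int V) * phi(T, X(T))
    return acc / n_paths

estimate = feynman_kac(s=0.0, x=1.0, T=1.0, v0=0.5)
exact = math.exp(-0.5) * 1.0
print(estimate, exact)
```

With a non-constant potential the same loop applies unchanged; only the accumulation of `integral` then depends on the simulated path.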

Remark 4.12. Let u(t, x) satisfy one of the equivalent conditions in Theorem
4.10. Put Y (τ ) = u (τ, X(τ )), and let M (s) be the martingale determined by
M (0) = Y (0) = u (0, X(0)) and by
M(s) − M(t) = Y(s) − Y(t) + ∫_t^s f(τ, X(τ), Y(τ), ∇Lu(τ, X(τ))) dτ.

Then the expression ∇Lu(τ, X(τ)) only depends on the martingale part M of
the process s ↦ Y(s). This entitles us to write ZM(τ) instead of ∇Lu(τ, X(τ)).
The interpretation of ZM(τ) is then the linear functional N ↦ (d/dτ)⟨M, N⟩(τ),
where N is a Pt,x-martingale in M²(Ω, F_T^0, Pt,x). Here a process N belongs to
M²(Ω, F_T^0, Pt,x) whenever N is a martingale in L²(Ω, F_T^0, Pt,x). Notice that the
functional ZM(τ) is known as soon as the martingale M ∈ M²(Ω, F_T^0, Pt,x)
is known. From our definitions it also follows that

M(T) = Y(T) + ∫_0^T f(τ, X(τ), Y(τ), ZM(τ)) dτ,

where we used the fact that Y (0) = M (0).

Remark 4.13. Let the notation be as in Remark 4.12. Then the variables Y (t)
and ZM (t) only depend on the space-time variable (t, X(t)), and as a con-
sequence the martingale increments M (t2 ) − M (t1 ), 0 ≤ t1 < t2 ≤ T , only
depend on F_{t2}^{t1} = σ(X(s) : t1 ≤ s ≤ t2). In Section 4.2 we give Lipschitz type
conditions on the function f in order that the BSDE

Y(t) = Y(T) + ∫_t^T f(s, X(s), Y(s), ZM(s)) ds + M(t) − M(T),   τ ≤ t ≤ T,   (4.33)
possesses a unique pair of solutions

(Y, M ) ∈ L2 (Ω, FTτ , Pτ,x ) × M2 (Ω, FTτ , Pτ,x ) .

Here M2 (Ω, FTt , Pt,x ) stands for the space of all (Fst )s∈[t,T ] -martingales in
L2 (Ω, FTt , Pt,x ). Of course instead of writing “BSDE” it would be bet-
ter to write “BSIE” for Backward Stochastic Integral Equation. However,
since in the literature people write “BSDE” even if they mean integral

equations we also stick to this terminology. Suppose that the σ(X(T))-measurable
variable Y(T) ∈ L²(Ω, FTτ, Pτ,x) is given. In fact we will prove
that the solution (Y, M) of the equation in (4.33) belongs to the space
S²(Ω, FTt, Pt,x; Rk) × M²(Ω, FTt, Pt,x; Rk). For more details see Definitions
4.18 and 4.28, and Theorem 4.42.
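To see the backward mechanism of (4.33) at work in the simplest possible setting, the following Python sketch runs an explicit backward-Euler recursion on a binomial-tree approximation of the driving process, with the illustrative linear driver f(s, x, y, z) = −v0·y and terminal condition Y(T) = X(T); all concrete choices are assumptions made for the example. For this driver the solution is Y(t) = e^{−v0(T−t)} X(t), which the recursion reproduces.

```python
import math

# Explicit backward recursion for the BSDE (4.33), illustrative setting:
# X is a symmetric binomial random walk standing in for the Markov
# process, driver f(s, x, y, z) = -v0 * y, terminal condition Y(T) = X(T).
T, v0, x0, n_steps = 1.0, 0.5, 1.0, 50
dt = T / n_steps
dx = math.sqrt(dt)

# terminal layer: node k at step n_steps carries X = x0 + (2k - n_steps) dx
Y = [x0 + (2 * k - n_steps) * dx for k in range(n_steps + 1)]

for i in range(n_steps - 1, -1, -1):
    # explicit scheme: Y_i = (1 - v0 dt) * E[Y_{i+1} | F_i],
    # the conditional expectation being the average of the two children
    Y = [(1.0 - v0 * dt) * 0.5 * (Y[k] + Y[k + 1]) for k in range(i + 1)]

y0_numeric = Y[0]
y0_exact = math.exp(-v0 * T) * x0
print(y0_numeric, y0_exact)
```

The martingale part M is implicit here: it is the compensated difference between Y along the tree and its one-step conditional expectations.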
Remark 4.14. Let M and N be two martingales in M²[0, T]. Then, for 0 ≤ s < t ≤ T,

|⟨M, N⟩(t) − ⟨M, N⟩(s)|² ≤ (⟨M, M⟩(t) − ⟨M, M⟩(s)) (⟨N, N⟩(t) − ⟨N, N⟩(s)),

and consequently

|(d/ds)⟨M, N⟩(s)|² ≤ ((d/ds)⟨M, M⟩(s)) ((d/ds)⟨N, N⟩(s)).

Hence, the inequality

∫_0^T |(d/ds)⟨M, N⟩(s)| ds ≤ ∫_0^T ((d/ds)⟨M, M⟩(s))^{1/2} ((d/ds)⟨N, N⟩(s))^{1/2} ds   (4.34)

follows. The inequality in (4.34) says that the quantity ∫_0^T |(d/ds)⟨M, N⟩(s)| ds
is dominated by the Hellinger integral H(M, N) defined by the right-hand
side of (4.34).
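In discrete time the estimate behind (4.34) is exactly the Cauchy-Schwarz inequality applied to the bracket increments, which the following Python sketch illustrates; the two Gaussian random walks are stand-ins for M and N, invented for the example.

```python
import random

random.seed(3)

# Discrete-time version of the bound behind (4.34): with bracket
# increments computed from path increments, the covariation bound is
# the Cauchy-Schwarz inequality for the increment sequences.
n = 1000
dM = [random.gauss(0.0, 1.0) for _ in range(n)]
dN = [random.gauss(0.0, 2.0) for _ in range(n)]

cov_MN = sum(a * b for a, b in zip(dM, dN))   # <M, N>(T) - <M, N>(0)
var_M = sum(a * a for a in dM)                # <M, M>(T) - <M, M>(0)
var_N = sum(b * b for b in dN)                # <N, N>(T) - <N, N>(0)

print(cov_MN ** 2 <= var_M * var_N)
```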
For a proof of Theorem 4.10 we refer the reader to [246]. We insert a proof here as well.
Proof (Proof of Theorem 4.10). For brevity, only in this proof, we write

F(τ, X(τ)) = f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))).

(a) =⇒ (b). The equality in (b) is the same as the one in (4.22) which is a
consequence of (4.20).
(b) =⇒ (a). We calculate the expression
(∂/∂s) Et,x [u(s, X(s)) + ∫_t^s f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ].

First of all it is equal to

Et,x [∂u/∂s(s, X(s)) + L(s)u(s, X(s)) + F(s, X(s))].   (4.35)
Next we also have by (4.24) in (b):

(∂/∂s) Et,x [u(s, X(s)) + ∫_t^s f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ]
= (∂/∂s) Et,x [Es,X(s) [u(T, X(T)) + ∫_s^T F(τ, X(τ)) dτ] + ∫_t^s F(τ, X(τ)) dτ]

(Markov property)

= (∂/∂s) Et,x [Et,x [u(T, X(T)) + ∫_s^T F(τ, X(τ)) dτ | Fst] + ∫_t^s F(τ, X(τ)) dτ]
= (∂/∂s) Et,x [Et,x [u(T, X(T)) + ∫_t^T F(τ, X(τ)) dτ | Fst]]
= (∂/∂s) Et,x [u(T, X(T)) + ∫_t^T F(τ, X(τ)) dτ] = 0.   (4.36)

From (4.36) and (4.35) we get


Et,x [∂u/∂s(s, X(s)) + L(s)u(s, X(s)) + f(s, X(s), u(s, X(s)), ∇Lu(s, X(s)))]
= 0,   s > t.   (4.37)

Passing to the limit for s ↓ t in (4.37) we obtain:

Et,x [∂u/∂t(t, X(t)) + L(t)u(t, X(t)) + f(t, X(t), u(t, X(t)), ∇Lu(t, X(t)))]
= 0   (4.38)

and, since X(t) = x Pt,x -almost surely, we obtain equality (4.23) in assertion
(a).
(a) =⇒ (c). If the function u satisfies the differential equation in (a), then
from the equality in (4.5) we see that
0 = u(s, X(s)) − u(t, X(t)) + ∫_t^s f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ
− u(s, X(s)) + u(t, X(t)) + ∫_t^s (L(τ)u(τ, X(τ)) + ∂u/∂τ(τ, X(τ))) dτ   (4.39)
= u(s, X(s)) − u(t, X(t)) + ∫_t^s f(τ, X(τ), u(τ, X(τ)), ∇Lu(τ, X(τ))) dτ
− Mu(s) + Mu(t),   (4.40)

where, as in (3.217),

Mu(s) − Mu(t)
= u(s, X(s)) − u(t, X(t)) − ∫_t^s (L(τ)u(τ, X(τ)) + ∂u/∂τ(τ, X(τ))) dτ
= ∫_t^s dMu(τ).   (4.41)

Since the expression in (4.40) vanishes (by assumption (a)) we see that the
process in (c) is the same as the martingale s 7→ Mu (s) − Mu (t), s ≥ t. This
proves the implication (a) =⇒ (c).
The implication (c) =⇒ (b) is a direct consequence of assertion (c) and the
fact that X(t) = x Pt,x -almost surely.
The equivalence of the assertions (a) and (d) is proved in the same manner
as the equivalence of (a) and (c). Here we employ the fact that the process
t 7→ Mu (T ) − Mu (t) is an FTt -backward martingale on the interval [s, T ] with
respect to the probability Ps,x .
This completes the proof of Theorem 4.10.
Remark 4.15. Instead of considering ∇Lu(s, x) we will also consider the bilinear
mapping Z(s) which associates with a pair of local semi-martingales (Y1, Y2)
a process which is to be considered as the right derivative of the co-variation
process ⟨Y1, Y2⟩(s). We write

ZY1(s)(Y2) = Z(s)(Y1, Y2) = (d/ds)⟨Y1, Y2⟩(s).
The function f (i.e. the generator of the backward differential equation)
will then be of the form f(s, X(s), Y(s), ZY(s)); the deterministic phase
(u(s, x), ∇Lu(s, x)) is replaced with the stochastic phase (Y(s), ZY(s)). We
should find an appropriate stochastic phase s ↦ (Y(s), ZY(s)), which we
identify with the process s ↦ (Y(s), MY(s)) in the stochastic phase space
S² × M², such that

Y(t) = Y(T) + ∫_t^T f(s, X(s), Y(s), ZY(s)) ds − ∫_t^T dMY(s),   (4.42)

where the quadratic variation of the martingale MY (s) is given by

d⟨MY, MY⟩(s) = ZY(s)(Y) ds = Z(s)(Y, Y) ds = d⟨Y, Y⟩(s).

This stochastic phase space S2 × M2 plays a role in stochastic analysis very


similar to the role played by the first Sobolev space H 1,2 in the theory of
deterministic partial differential equations.
Remark 4.16. In case we deal with strong solutions driven by standard Brownian
motion the martingale difference MY(s2) − MY(s1) can be written as
∫_{s1}^{s2} ZY(s)dW(s), provided that the martingale MY(s) belongs to the space
M²(Ω, G_T^0, P). Here G_T^0 is the σ-field generated by W(s), 0 ≤ s ≤ T. If
Y(s) = u(s, X(s)), then this stochastic integral satisfies:

∫_{s1}^{s2} ZY(s)dW(s) = u(s2, X(s2)) − u(s1, X(s1)) − ∫_{s1}^{s2} (L(s) + ∂/∂s) u(s, X(s)) ds.   (4.43)

Such stochastic integrals are for example defined if the process X(t) is a
solution to a stochastic differential equation (in Itô sense):

X(s) = X(t) + ∫_t^s b(τ, X(τ)) dτ + ∫_t^s σ(τ, X(τ)) dW(τ),   t ≤ s ≤ T.   (4.44)

Here the matrix (σjk(τ, x))_{j,k=1}^{d} is chosen in such a way that

ajk(τ, x) = Σ_{ℓ=1}^{d} σjℓ(τ, x) σkℓ(τ, x) = (σ(τ, x)σ*(τ, x))jk.

The process W(τ) is Brownian motion or Wiener process. It is assumed that
the operator L(τ) has the form

L(τ)u(x) = b(τ, x) · ∇u(x) + ½ Σ_{j,k=1}^{d} ajk(τ, x) ∂²u/(∂xj ∂xk)(x).   (4.45)

Then from Itô’s formula together with (4.43), (4.44) and (4.45) it follows that

the process ZY (s) has to be identified with σ (s, X(s)) ∇u (s, ·) (X(s)). For
more details see e.g. Pardoux and Peng [184] and Pardoux [181]. The equality
in (4.43) is a consequence of a martingale representation theorem: see e.g.
Proposition 3.2 in Revuz and Yor [199].
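A standard way to produce the process X of (4.44) numerically is the Euler-Maruyama scheme. The following Python sketch does this for the illustrative Ornstein-Uhlenbeck choice b(τ, x) = −x, σ(τ, x) = 1 (these coefficients are assumptions made for the example, not taken from the text) and checks the simulated second moment against the known closed form.

```python
import math
import random

random.seed(4)

# Euler-Maruyama discretization of the forward SDE (4.44) for the
# illustrative choice b(tau, x) = -x, sigma(tau, x) = 1, so that
# L(tau)f(x) = -x f'(x) + (1/2) f''(x).  For this process the second
# moment E_{0,x}[X(t)^2] = x^2 e^{-2t} + (1 - e^{-2t})/2 is known in
# closed form and serves as a check.
def euler_maruyama(x0, t_end, n_steps):
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += -x * dt + math.sqrt(dt) * random.gauss(0.0, 1.0)
    return x

x0, t_end, n_paths = 1.0, 1.0, 20_000
mean_sq = sum(euler_maruyama(x0, t_end, 100) ** 2 for _ in range(n_paths)) / n_paths
exact = x0 ** 2 * math.exp(-2 * t_end) + (1 - math.exp(-2 * t_end)) / 2
print(mean_sq, exact)
```

The discrepancy combines the Monte Carlo error of order n_paths^{-1/2} with the weak discretization bias of order dt.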
Remark 4.17. Backward doubly stochastic differential equations (BDSDEs)
could have been included in the present chapter: see Boufoussi, Mrhardy and
Van Casteren [41]. In our notation a BDSDE may be written in the form:
Y(t) − Y(T) = ∫_t^T f(s, X(s), Y(s), N ↦ (d/ds)⟨M, N⟩(s)) ds
+ ∫_t^T g(s, X(s), Y(s), N ↦ (d/ds)⟨M, N⟩(s)) d←B(s)
+ M(t) − M(T).   (4.46)

Here the expression


∫_t^T g(s, X(s), Y(s), N ↦ (d/ds)⟨M, N⟩(s)) d←B(s)

represents a backward Itô integral. The symbol hM, N i stands for the co-
variation process of the (local) martingales M and N ; it is assumed that this
process is absolutely continuous with respect to Lebesgue measure. Moreover,

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)}



is a Markov process generated by a family of operators L(s), 0 ≤ s ≤ T , and


Ftτ = σ {X(s) : τ ≤ s ≤ t}. The process X(t) could be the (unique) weak or
strong solution to a (forward) stochastic differential equation (SDE):
X(t) = x + ∫_τ^t b(s, X(s)) ds + ∫_τ^t σ(s, X(s)) dW(s).   (4.47)

Here the coefficients b and σ have certain continuity or measurability proper-


ties, and Pτ,x is the distribution of the process X(t) defined as being the
unique weak solution to the equation in (4.47). We want to find a pair
(Y, M ) ∈ S2 (Ω, Ftτ , Pτ,x ) × M2 (Ω, Ftτ , Pτ,x ) which satisfies (4.46).

We first give some definitions. Fix (τ, x) ∈ [0, T ] × E. In the definitions 4.18
and 4.19 the probability measure Pτ,x is defined on the σ-field FTτ . In Defi-
nition 4.28 we return to these notions. The following definition and implicit
results described therein show that, under certain conditions, by enlarging
the sample space a family of processes may be reduced to just one process
without losing the S2 -property.
Definition 4.18. Fix (τ, x) ∈ [0, T] × E. An Rk-valued process Y is said to
belong to the space S²(Ω, FTτ, Pτ,x; Rk) if Y(t) is Ftτ-measurable (τ ≤ t ≤ T)
and if Eτ,x [sup_{τ≤t≤T} |Y(t)|²] < ∞. It is assumed that Y(s) = Y(τ), Pτ,x-almost
surely, for s ∈ [0, τ]. The process Y(s), s ∈ [0, T], is said to belong to the space
S²unif(Ω, FTτ, Pτ,x; Rk) if

sup_{(τ,x)∈[0,T]×E} Eτ,x [sup_{τ≤t≤T} |Y(t)|²] < ∞,

and it belongs to S²loc,unif(Ω, FTτ, Pτ,x; Rk) provided that

sup_{(τ,x)∈[0,T]×K} Eτ,x [sup_{τ≤t≤T} |Y(t)|²] < ∞

for all compact subsets K of E.


If the σ-field Ftτ and the measure Pτ,x are clear from the context we write
S²([0, T], Rk), or sometimes just S².
Definition 4.19. Let the process M be such that the process t ↦ M(t) − M(τ),
t ∈ [τ, T], is a Pτ,x-martingale with the property that the stochastic variable
M(T) − M(τ) belongs to L²(Ω, FTτ, Pτ,x). Then M is said to belong to the
space M²(Ω, FTτ, Pτ,x; Rk). By the Burkholder-Davis-Gundy inequality (see
inequality (4.79) below) it follows that Eτ,x [sup_{τ≤t≤T} |M(t) − M(τ)|²] is finite
if and only if M(T) − M(τ) belongs to the space L²(Ω, FTτ, Pτ,x). Here an
Ftτ-adapted process M(·) − M(τ) is called a Pτ,x-martingale provided that
Eτ,x [|M(t) − M(τ)|] < ∞ and Eτ,x [M(t) − M(τ) | Fsτ] = M(s) − M(τ),
Pτ,x-almost surely, for T ≥ t ≥ s ≥ τ. The martingale difference s ↦
M(s) − M(0), s ∈ [0, T], is said to belong to the space M²unif(Ω, FTτ, Pτ,x; Rk)
if

sup_{(τ,x)∈[0,T]×E} Eτ,x [sup_{τ≤t≤T} |M(t) − M(τ)|²] < ∞,

and it belongs to M²loc,unif(Ω, FTτ, Pτ,x; Rk) provided that

sup_{(τ,x)∈[0,T]×K} Eτ,x [sup_{τ≤t≤T} |M(t) − M(τ)|²] < ∞

for all compact subsets K of E.


From the Burkholder-Davis-Gundy inequality (see inequality (4.79) below) it
follows that the process M(s) − M(0) belongs to M²unif(Ω, FTτ, Pτ,x; Rk) if and
only if

sup_{(τ,x)∈[0,T]×E} Eτ,x [|M(T) − M(τ)|²] = sup_{(τ,x)∈[0,T]×E} Eτ,x [⟨M, M⟩(T) − ⟨M, M⟩(τ)] < ∞.

Here ⟨M, M⟩ stands for the quadratic variation process of the process t ↦
M(t) − M(0).
The notions in Definitions 4.18 and 4.19 will exclusively be used in
case the family of measures {Pτ,x : (τ, x) ∈ [0, T] × E} constitutes the
distributions of a Markov process as defined in Definition 4.3.
Again let the Markov process, with right-continuous sample paths and
with left limits,

{(Ω, FTτ , Pτ,x ) , (X(t) : T ≥ t ≥ 0) , (E, E)} (4.48)

be generated by the family of operators {L(s) : 0 ≤ s ≤ T}: see Definition 4.3,
equality (4.9), and Definition 4.4, equality (4.12).
equality (4.9), and 4.4, equality (4.11).
Next we define the family of operators {Q (t1 , t2 ) : 0 ≤ t1 ≤ t2 ≤ T } by

Q (t1 , t2 ) f (x) = Et1 ,x [f (X (t2 ))] , f ∈ Cb (E) , 0 ≤ t1 ≤ t2 ≤ T. (4.49)

Fix ϕ ∈ D(L). Since the process t ↦ Mϕ(t) − Mϕ(s), t ∈ [s, T], is a Ps,x-
martingale with respect to the filtration (Fts)_{t∈[s,T]}, and X(t) = x Pt,x-almost
surely, the following equality follows:

∫_s^t Es,x [L(ρ)ϕ(ρ, ·)(X(ρ))] dρ + Et,x [ϕ(t, X(t))] − Es,x [ϕ(t, X(t))]
= ϕ(t, x) − ϕ(s, x) − ∫_s^t Es,x [∂ϕ/∂ρ(ρ, X(ρ))] dρ.   (4.50)

The fact that a process of the form t 7→ Mϕ (t) − Mϕ (s), t ∈ [s, T ], is a Ps,x -
martingale follows from Proposition 4.5. In terms of the family of operators

{Q (t1 , t2 ) : 0 ≤ t1 ≤ t2 ≤ T }

the equality in (4.50) can be rewritten as


∫_s^t Q(s, ρ)L(ρ)ϕ(ρ, ·)(x) dρ + Q(t, t)ϕ(t, ·)(x) − Q(s, t)ϕ(t, ·)(x)
= ϕ(t, x) − ϕ(s, x) − ∫_s^t Q(s, ρ) ∂ϕ/∂ρ(ρ, ·)(x) dρ.   (4.51)

From (4.51) we infer that

L(s)ϕ(s, ·)(x) = − lim_{t↓s} [Q(t, t)ϕ(t, ·)(x) − Q(s, t)ϕ(t, ·)(x)] / (t − s).   (4.52)

Equality (4.51) also yields the following result. If ϕ ∈ D(L) is such that
\[
L(\rho)\varphi(\rho,\cdot)(y) = -\frac{\partial\varphi}{\partial\rho}(\rho,y),
\]
then
\[
\varphi(s,x) = Q(s,t)\varphi(t,\cdot)(x) = E_{s,x}\bigl[\varphi(t,X(t))\bigr]. \tag{4.53}
\]
Since 0 ≤ s ≤ t ≤ T are arbitrary, from (4.53) we see
\[
Q(s,t')\varphi(t',\cdot)(x) = Q(s,t)\,Q(t,t')\varphi(t',\cdot)(x), \qquad 0 \le s \le t \le t' \le T,\ x \in E. \tag{4.54}
\]
If in (4.54) we may choose the function ϕ(t′, y) arbitrarily, then the family Q(s,t), 0 ≤ s ≤ t ≤ T, automatically is a propagator in the space C_b(E), in the sense that Q(s,t)Q(t,t′) = Q(s,t′), 0 ≤ s ≤ t ≤ t′ ≤ T. For details on
propagators or evolution families see [95].
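The propagator identity Q(s,t)Q(t,t′) = Q(s,t′) can be checked exactly in the simplest example, the heat propagator of Brownian motion. The following minimal Python sketch (illustrative; the function `heat_Q` and the polynomial representation are our own, not from the text) exploits the fact that Q(s,t)f(x) = E[f(x + W(t) − W(s))] maps quadratic polynomials to quadratic polynomials in closed form.

```python
# Minimal numerical sketch (not from the book): the heat propagator
# Q(s, t)f(x) = E[f(x + W(t) - W(s))] acts on quadratic polynomials
# f(x) = c0 + c1*x + c2*x^2 exactly, since E[(x + Z)^2] = x^2 + (t - s)
# for Z ~ N(0, t - s).  We represent f by its coefficient triple and
# check the propagator identity Q(s, t) Q(t, t') = Q(s, t').

def heat_Q(s, t, coeffs):
    """Return the coefficients of Q(s, t)f for f(x) = c0 + c1 x + c2 x^2."""
    c0, c1, c2 = coeffs
    # E[f(x + Z)] = (c0 + c2 * Var(Z)) + c1 * x + c2 * x^2 with Var(Z) = t - s.
    return (c0 + c2 * (t - s), c1, c2)

f = (1.0, -2.0, 3.0)                               # f(x) = 1 - 2x + 3x^2
lhs = heat_Q(0.5, 2.0, f)                          # Q(s, t') f
rhs = heat_Q(0.5, 1.2, heat_Q(1.2, 2.0, f))        # Q(s, t) Q(t, t') f
print(lhs, rhs)                                    # equal coefficient triples
```

The constant-term update is the only nontrivial part: the Gaussian increment contributes its variance t − s through the quadratic coefficient, and variances add along consecutive time intervals, which is exactly the propagator property.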
Remark 4.20. In the sequel we want to discuss solutions to equations of the form:
\[
\frac{\partial}{\partial t}u(t,x) + L(t)u(t,\cdot)(x) + f\bigl(t, x, u(t,x), \nabla^L u(t,x)\bigr) = 0. \tag{4.55}
\]
For a preliminary discussion on this topic see Theorem 4.10. Under certain
hypotheses on the function f we will give existence and uniqueness results.
Let m be (equivalent to) the Lebesgue measure in R^d. In a concrete situation where every operator L(t) is a genuine diffusion operator in L²(R^d, m) we consider the following backward stochastic differential equation

4.1 Introduction 189

\[
u(s,X(s)) = Y(T,X(T)) + \int_s^T f\bigl(\rho, X(\rho), u(\rho,X(\rho)), \nabla^L u(\rho,X(\rho))\bigr)\,d\rho - \int_s^T \nabla^L u(\rho,X(\rho))\,dW(\rho). \tag{4.56}
\]

Here we suppose that the process t ↦ X(t) is a solution to a genuine stochastic differential equation driven by Brownian motion and with one-dimensional distribution u(t,x) satisfying L(t)u(t,·)(x) = ∂u/∂t(t,x). In fact in that case we will not consider the equation in (4.56), but we will try to find an ordered pair (Y, Z) such that
\[
Y(s) = Y(T) + \int_s^T f\bigl(\rho, X(\rho), Y(\rho), Z(\rho)\bigr)\,d\rho - \int_s^T \langle Z(\rho), dW(\rho)\rangle. \tag{4.57}
\]
If the pair (Y, Z) satisfies (4.57), then u(s,x) = E_{s,x}[Y(s)] satisfies (4.55). Moreover Z(s) = ∇^L u(s, X(s)), P_{s,x}-almost surely. For more details see Section 2 in Pardoux [181].
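As an illustration of the pair (Y, Z) in (4.57), the following hedged sketch solves a discrete-time analogue of the BSDE on a recombining binomial tree. The generator f(s, x, y, z) = −c·y and the terminal value ξ = X(T)² are our own illustrative choices, not from the text; for this linear generator the exact value is Y(0) = e^{−cT}·E[X(T)²] = e^{−cT}·T.

```python
import math

# Hedged sketch (not the book's construction): an explicit backward Euler
# scheme for a discrete analogue of the BSDE (4.57) on a recombining binomial
# tree, with the illustrative generator f(s, x, y, z) = -c*y and terminal
# value xi = X(T)^2.  For this linear f the exact solution is
# Y(0) = exp(-c*T) * E[X(T)^2] = exp(-c*T) * T.

def bsde_binomial(N=200, T=1.0, c=1.0):
    dt = T / N
    h = math.sqrt(dt)                       # one-step increment of the walk
    # terminal layer: node i of layer N corresponds to X = (2*i - N)*h
    Y = [((2 * i - N) * h) ** 2 for i in range(N + 1)]
    for k in range(N - 1, -1, -1):          # backward induction through layers
        Ynew = []
        for i in range(k + 1):
            cond_exp = 0.5 * (Y[i] + Y[i + 1])   # E_k[Y_{k+1}] at node i
            Ynew.append(cond_exp + dt * (-c * cond_exp))
        Y = Ynew
    return Y[0]

Y0 = bsde_binomial()
exact = math.exp(-1.0) * 1.0               # e^{-cT} * T with c = T = 1
print(Y0, exact)
```

The component Z of (4.57) would be recovered on the tree as the discrete martingale increment divided by the increment of the driving walk; it is omitted here because only Y(0) is compared with the closed-form value.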
Remark 4.21. Some remarks follow:
(a) In section 4.2 weak solutions to BSDEs are studied.
(b) In section 7 of [246] and in section 2 of Pardoux [181] strong solutions to
BSDEs are discussed: these results are due to Pardoux and collaborators.
(c) BSDEs go back to Bismut: see e.g. [32].
(d) If
\[
L(s)u(s,x) = \frac12\sum_{j,k=1}^d a_{j,k}(s,x)\,\frac{\partial^2 u}{\partial x_j\,\partial x_k}(s,x) + \sum_{j=1}^d b_j(s,x)\,\frac{\partial u}{\partial x_j}(s,x),
\]
then
\[
\Gamma_1(u,v)(s,x) = \sum_{j,k=1}^d a_{j,k}(s,x)\,\frac{\partial u}{\partial x_j}(s,x)\,\frac{\partial v}{\partial x_k}(s,x).
\]

As a corollary to Theorems 4.10 and 4.34 we have the following result.

Corollary 4.22. Suppose that the function u solves the following system:
\[
\begin{cases}
\dfrac{\partial u}{\partial s}(s,y) + L(s)u(s,\cdot)(y) + f\bigl(s, y, u(s,y), \nabla^L u(s,y)\bigr) = 0;\\[1ex]
u(T, X(T)) = \xi \in L^2(\Omega, \mathcal F_T^\tau, P_{\tau,x}).
\end{cases} \tag{4.58}
\]
Let the pair (Y, M) be a solution to
\[
Y(t) = \xi + \int_t^T f(s, X(s), Y(s), Z_M(s))\,ds + M(t) - M(T), \tag{4.59}
\]
with M(τ) = 0. Then
\[
(Y(t), M(t)) = \bigl(u(t, X(t)),\ M_u(t)\bigr),
\]
where
\[
M_u(t) = u(t,X(t)) - u(\tau,X(\tau)) - \int_\tau^t L(s)u(s,\cdot)(X(s))\,ds - \int_\tau^t \frac{\partial u}{\partial s}(s,X(s))\,ds.
\]

Notice that the processes s ↦ ∇^L u(s, X(s)) and s ↦ Z_{M_u}(s) may be identified, and that Z_{M_u}(s) only depends on (s, X(s)). The decomposition
\[
u(t,X(t)) - u(\tau,X(\tau)) = \int_\tau^t \Bigl(\frac{\partial u}{\partial s}(s,X(s)) + L(s)u(s,\cdot)(X(s))\Bigr)\,ds + M_u(t) - M_u(\tau) \tag{4.60}
\]
splits the process t ↦ u(t,X(t)) − u(τ,X(τ)) into a part which is of bounded variation (i.e. the part which is absolutely continuous with respect to Lebesgue measure on [τ,T]) and a P_{τ,x}-martingale part M_u(t) − M_u(τ) (which in fact is a martingale difference part).

If L(s) = ½Δ, then X(s) = W(s) (standard Wiener process or Brownian motion) and (4.60) can be rewritten as
\[
u(t,W(t)) - u(\tau,W(\tau)) = \int_\tau^t \Bigl(\frac{\partial u}{\partial s}(s,W(s)) + \frac12\Delta u(s,\cdot)(W(s))\Bigr)\,ds + \int_\tau^t \nabla u(s,\cdot)(W(s))\,dW(s), \tag{4.61}
\]
where ∫_τ^t ∇u(s,·)(W(s)) dW(s) is to be interpreted as an Itô integral.
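For the concrete choice u(t,x) = x² (our own illustration, not from the text) the decomposition (4.61) reads u(t,W(t)) = t + M_u(t) with martingale part M_u(t) = W(t)² − t = 2∫₀ᵗ W dW, and it can be tested by simulation. The following Monte Carlo sketch checks the pathwise identity up to discretization error of the Itô sum.

```python
import math, random

# Monte Carlo sketch (illustrative, not from the book): for u(t, x) = x^2 the
# decomposition (4.61) gives the martingale part M_u(t) = W(t)^2 - t, which
# equals the Ito integral 2 * int_0^t W dW.  We approximate that integral by
# a left-endpoint (Ito) Riemann sum along one simulated path.

random.seed(1)
n, T = 20000, 1.0
dt = T / n
w, ito_sum = 0.0, 0.0
for _ in range(n):
    dw = random.gauss(0.0, math.sqrt(dt))
    ito_sum += 2.0 * w * dw            # left-endpoint (Ito) evaluation
    w += dw
print(w * w - T, ito_sum)              # martingale part vs. its Ito-sum form
err = abs((w * w - T) - ito_sum)
```

The pathwise discrepancy equals |Σ(ΔW)² − T|, the deviation of the discrete quadratic variation from T, so it shrinks as the time grid is refined.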
Remark 4.23. Suggestions for further research:
(a) Find “explicit solutions” to BSDEs with a linear drift part. This should
be a type of Cameron-Martin formula or Girsanov transformation.
(b) Treat weak (and strong) solutions to BDSDEs in a manner similar to what
is presented here for BSDEs.
(c) Treat weak (strong) solutions to BSDEs generated by a function f which
is not necessarily of linear growth but, for example, of quadratic growth in
one or both of its entries Y (t) and ZM (t).
(d) Can anything be done if f depends not only on s, x, u(s, x), ∇u (s, x), but
also on L(s)u (s, ·) (x)?
In the following proposition it is assumed that the operator L generates a
strong Markov process in the sense of the definitions 1.30 and 1.31.
Proposition 4.24. Let the functions f, g ∈ D(L) be such that their product f g also belongs to D(L). Then Γ₁(f, g) is well defined, and for (s, x) ∈ [0,T] × E the following equality holds:
\[
L(s)(fg)(s,\cdot)(x) - f(s,x)\,L(s)g(s,\cdot)(x) - L(s)f(s,\cdot)(x)\,g(s,x) = \Gamma_1(f,g)(s,x). \tag{4.62}
\]

Proof. Let the functions f and g be as in Proposition 4.24. For h > 0 we have:
\[
\begin{aligned}
&\bigl(f(X(s+h)) - f(X(s))\bigr)\,\bigl(g(X(s+h)) - g(X(s))\bigr)\\
&\quad = f(X(s+h))\,g(X(s+h)) - f(X(s))\,g(X(s))\\
&\qquad - f(X(s))\,\bigl(g(X(s+h)) - g(X(s))\bigr) - \bigl(f(X(s+h)) - f(X(s))\bigr)\,g(X(s)).
\end{aligned} \tag{4.63}
\]
Then we take expectations with respect to E_{s,x}, divide by h > 0, and pass to the T_β-limit as h ↓ 0 to obtain equality (4.62) in Proposition 4.24.
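The carré du champ identity (4.62) can be sanity-checked numerically. The sketch below (coefficients a, b and test functions chosen by us for illustration) verifies L(fg) − f·Lg − g·Lf = Γ₁(f,g) = a f′g′ for a one-dimensional diffusion operator of the form given in Remark 4.21(d), using central finite differences; the drift term b cancels in the identity, as the algebra predicts.

```python
import math

# Numerical sketch (illustrative, one-dimensional case of Remark 4.21(d)):
# for L u = (1/2) a u'' + b u' the identity (4.62) reads
#     L(fg) - f*Lg - g*Lf = Gamma_1(f, g) = a * f' * g'.
# We verify it at a single point with central finite differences.

def d1(F, x, h=1e-5):    # first derivative, central difference
    return (F(x + h) - F(x - h)) / (2 * h)

def d2(F, x, h=1e-4):    # second derivative, central difference
    return (F(x + h) - 2 * F(x) + F(x - h)) / (h * h)

a = lambda x: 1.0 + x * x            # diffusion coefficient (illustrative)
b = lambda x: math.sin(x)            # drift coefficient (illustrative)
L = lambda F, x: 0.5 * a(x) * d2(F, x) + b(x) * d1(F, x)

f = lambda x: math.exp(0.3 * x)
g = lambda x: math.cos(x)
fg = lambda x: f(x) * g(x)

x0 = 0.7
lhs = L(fg, x0) - f(x0) * L(g, x0) - g(x0) * L(f, x0)
rhs = a(x0) * d1(f, x0) * d1(g, x0)
print(lhs, rhs)
```

This mirrors the proof of Proposition 4.24: the second-order part of L produces the cross term a f′g′, while all first-order (drift) contributions cancel.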

4.2 A probabilistic approach: weak solutions


In this section, and also in Section 4.3, we will study BSDE's on a single probability space. In Section 4.4 and Chapter 5 we will consider Markov families of probability spaces. In the present section we write P instead of P_{0,x}, and similarly for the expectations E and E_{0,x}. Here we work on the interval [0,T]. Since we are discussing the martingale problem and basically only the distributions of the process t ↦ X(t), t ∈ [0,T], the solutions we obtain are of weak type. In case we consider strong solutions we apply a martingale representation theorem (in terms of Brownian motion). In Section 4.4 we will also use this result for probability measures of the form P_{τ,x} on the interval [τ,T]. In this section we consider a pair of F_t = F_t^0-adapted processes (Y, M) ∈ L²(Ω, F_T, P; R^k) × L²(Ω, F_T, P; R^k) such that Y(0) = M(0) and such that
\[
Y(t) = Y(T) + \int_t^T f(s, X(s), Y(s), Z_M(s))\,ds + M(t) - M(T), \tag{4.64}
\]
where M is a P-martingale with respect to the filtration F_t = σ(X(s) : s ≤ t).


In [246] we will employ the results of the present section with P = Pτ,x , where
(τ, x) ∈ [0, T ] × E.
Proposition 4.25. Let the pair (Y, M ) be as in (4.64), and suppose that
Y (0) = M (0). Then
\[
Y(t) = M(t) - \int_0^t f(s, X(s), Y(s), Z_M(s))\,ds, \quad\text{and} \tag{4.65}
\]
\[
Y(t) = E\Bigl[\,Y(T) + \int_t^T f(s, X(s), Y(s), Z_M(s))\,ds \Bigm| \mathcal F_t\Bigr]; \tag{4.66}
\]
\[
M(t) = E\Bigl[\,Y(T) + \int_0^T f(s, X(s), Y(s), Z_M(s))\,ds \Bigm| \mathcal F_t\Bigr]. \tag{4.67}
\]
The equality in (4.65) shows that the process M is the martingale part of the semi-martingale Y.
Proof. The equality in (4.66) follows from (4.64) and from the fact that M is
a martingale. Next we calculate
" Z T #
¯
E Y (T ) + f (s, X(s), Y (s), ZM (s)) ds ¯ Ft
0
192 4 BSDE’s and Markov processes
" Z #
T ¯
= E Y (T ) + f (s, X(s), Y (s), ZM (s)) ds ¯ Ft
t
Z t
+ f (s, X(s), Y (s), ZM (s)) ds
0
Z t
= Y (t) + f (s, X(s), Y (s), ZM (s)) ds
0

(employ (4.64))
Z T
= Y (T ) + f (s, X(s), Y (s), ZM (s)) ds + M (t) − M (T )
t
Z t
+ f (s, X(s), Y (s), ZM (s)) ds
0
Z T
= Y (T ) + f (s, X(s), Y (s), ZM (s)) ds + M (t) − M (T )
0
= M (T ) + M (t) − M (T ) = M (t). (4.68)

The equality in (4.68) shows (4.67). Since


Z T
M (T ) = Y (T ) + f (s, X(s), Y (s), ZM (s)) ds
0

the equality in (4.65) follows.
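The martingale property of M in (4.67) can be verified exactly in a discrete toy model. In the sketch below (our own illustrative analogue on the three-step coin-toss space, with an assumed generator f depending only on (s, X(s))) the conditional expectations are computed by enumerating paths, and the discrete counterpart of M from (4.67) turns out to be a martingale, as Proposition 4.25 asserts.

```python
from itertools import product

# Exact toy verification of Proposition 4.25 (illustrative discrete-time
# analogue, not from the book): on the 3-step coin-toss space, let
#     M(t) = E[ xi + sum_{s<N} f(s, X(s)) | F_t ],
# the discrete counterpart of (4.67) with xi = X(N)^2.  Being a family of
# conditional expectations of one fixed random variable, M is a martingale.

N = 3

def X(path, t):                       # random walk after t tosses
    return sum(path[:t])

def f(s, x):                          # illustrative generator, frozen in (s, x)
    return x * x + s

def total(path):                      # xi + "integral" of f along the path
    return X(path, N) ** 2 + sum(f(s, X(path, s)) for s in range(N))

def cond_exp(t):
    """E[total | F_t] as a dict indexed by the first t tosses."""
    out = {}
    for stub in product((-1, 1), repeat=t):
        tails = [stub + rest for rest in product((-1, 1), repeat=N - t)]
        out[stub] = sum(total(p) for p in tails) / len(tails)
    return out

M = [cond_exp(t) for t in range(N + 1)]
max_dev = max(abs(0.5 * (M[t + 1][stub + (-1,)] + M[t + 1][stub + (1,)]) - m)
              for t in range(N) for stub, m in M[t].items())
print(max_dev)                        # 0 up to rounding: M is a martingale
```

Subtracting the accumulated sum of f from M reproduces the discrete analogue of (4.65), i.e. Y as M minus its finite-variation part.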


In the following theorem we write z = Z_M(s), and y belongs to R^k.

Theorem 4.26. Suppose that there exist finite constants C₁ and C₂ such that
\[
\langle y_2 - y_1,\ f(s,x,y_2,z) - f(s,x,y_1,z)\rangle \le C_1\,|y_2 - y_1|^2; \tag{4.69}
\]
\[
|f(s,x,y,Z_{M_2}(s)) - f(s,x,y,Z_{M_1}(s))|^2 \le C_2^2\,\frac{d}{ds}\langle M_2 - M_1,\ M_2 - M_1\rangle(s). \tag{4.70}
\]

Then there exists a unique pair of adapted processes (Y, M ) such that Y (0) =
M (0) and such that the process M is the martingale part of the semi-
martingale Y :
\[
\begin{aligned}
Y(t) &= M(t) - M(T) + Y(T) + \int_t^T f(s,X(s),Y(s),Z_M(s))\,ds\\
&= M(t) - \int_0^t f(s,X(s),Y(s),Z_M(s))\,ds. 
\end{aligned} \tag{4.71}
\]

The following proof contains just an outline of the proof of Theorem 4.26.
Complete and rigorous arguments are found in the proof of Theorem 4.33: see
Theorem 4.42 as well.

Proof. The uniqueness follows from Corollary 4.32 of Theorem 4.30 below. In the existence part of the proof of Theorem 4.26 we will approximate the function f by Lipschitz continuous functions f_δ, 0 < δ < (2C₁)⁻¹, where each function f_δ has Lipschitz constant δ⁻¹, but at the same time inequality (4.70) remains valid for fixed second variable (in an appropriate sense). It follows that for the functions f_δ (4.70) remains valid and that (4.69) is replaced with
\[
|f_\delta(s,x,y_2,z) - f_\delta(s,x,y_1,z)| \le \frac{1}{\delta}\,|y_2 - y_1|. \tag{4.72}
\]
In the uniqueness part of the proof it suffices to assume that (4.69) holds. In Theorem 4.34 we will see that the monotonicity condition (4.69) also suffices to prove the existence. For details the reader is referred to Propositions 4.35 and 4.36, Corollary 4.37, and to Proposition 4.39. In fact for M ∈ M² fixed, and the function y ↦ f(s,x,y,Z_M(s)) satisfying (4.69), the function y ↦ y − δf(s,x,y,Z_M(s)) is surjective as a mapping from R^k to R^k, and its inverse exists and is Lipschitz continuous with constant 2. The Lipschitz continuity is proved in Proposition 4.36. The surjectivity of this mapping is a consequence of Theorem 1 in [63]. As pointed out by Crouzeix et al the result follows from a non-trivial homotopy argument. A relatively elementary proof of Theorem 1 in [63] can be found for a continuously differentiable function in Hairer and Wanner [98]: see Theorem 14.2 in Chapter IV. For a few more details see Remark 4.38. Let f_{s,x,M} be the mapping y ↦ f(s,x,y,Z_M(s)), and put
\[
f_\delta(s,x,y,Z_M(s)) = f\bigl(s, x, (I - \delta f_{s,x,M})^{-1}(y), Z_M(s)\bigr). \tag{4.73}
\]
Then the functions f_δ, 0 < δ < (2C₁)⁻¹, are Lipschitz continuous with constant δ⁻¹. Proposition 4.39 treats the transition from solutions of BSDE's with generator f_δ with fixed martingale M ∈ M² to solutions of BSDE's driven by f with the same fixed martingale M. Proposition 4.35 contains the passage from solutions (Y, N) ∈ S² × M² of BSDE's with generators of the form (s,y) ↦ f(s,y,Z_M(s)) for any fixed martingale M ∈ M² to solutions of BSDE's of the form (4.71) where the pair (Y, M) belongs to S² × M². By hypothesis the process s ↦ f(s,x,Y(s),Z_M(s)) satisfies (4.69) and (4.70). Essentially speaking, a combination of these observations shows the result in Theorem 4.26.
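The surjectivity-and-Lipschitz claim for y ↦ y − δf can be illustrated numerically. In the sketch below the one-sided Lipschitz function f(y) = y − y³ (monotonicity constant C₁ = 1; our own illustrative choice, not from the text) is inverted by bisection, and the observed Lipschitz constant of the inverse stays below 2 for δ < (2C₁)⁻¹, consistent with the role of (4.73).

```python
# Numerical sketch (illustrative choice of f, not from the book): for the
# one-sided Lipschitz function f(y) = y - y^3 (monotonicity constant C1 = 1)
# and 0 < delta < (2*C1)^{-1}, the map g: y -> y - delta*f(y) is a strictly
# increasing bijection of R whose inverse is Lipschitz with constant at
# most 1/(1 - delta*C1) <= 2, as used around (4.73).

delta = 0.4                                  # < (2*C1)^{-1} = 0.5
f = lambda y: y - y ** 3
g = lambda y: y - delta * f(y)               # 0.6*y + 0.4*y^3, increasing

def g_inv(w, lo=-100.0, hi=100.0):
    """Invert g by bisection (valid since g is strictly increasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ws = [i / 10.0 for i in range(-50, 51)]
ratios = [abs(g_inv(ws[i + 1]) - g_inv(ws[i])) / abs(ws[i + 1] - ws[i])
          for i in range(len(ws) - 1)]
max_ratio = max(ratios)
print(max_ratio)                             # about 1/(1 - delta*C1) = 5/3
```

The largest difference quotient occurs near y = 0, where g′(y) = 1 − δ is smallest, matching the bound 1/(1 − δC₁).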

Remark 4.27. In the literature functions with the monotonicity property are
also called one-sided Lipschitz functions. In fact Theorem 4.26, with f (t, x, ·, ·)
Lipschitz continuous in both variables, will be superseded by Theorem 4.33
in the Lipschitz case and by Theorem 4.34 in case of monotonicity in the
second variable and Lipschitz continuity in the third variable. The proof of
Theorem 4.26 is part of the results in Section 4.3. Theorem 4.42 contains a
corresponding result for a Markov family of probability measures. Its proof is
omitted; it follows the same lines as the proof of Theorem 4.34.

4.3 Existence and Uniqueness of solutions to BSDE’s


The equation in (4.55) can be phrased in a semi-linear setting as follows. Find a function u(t,x) which satisfies the following partial differential equation:
\[
\begin{cases}
\dfrac{\partial u}{\partial s}(s,x) + L(s)u(s,x) + f\bigl(s, x, u(s,x), \nabla^L u(s,x)\bigr) = 0;\\[1ex]
u(T,x) = \varphi(T,x), \quad x \in E.
\end{cases} \tag{4.74}
\]
Here ∇^L f₂(s,x) is the linear functional f₁ ↦ Γ₁(f₁, f₂)(s,x) for smooth enough functions f₁ and f₂. For s ∈ [0,T] fixed, the symbol ∇^L f₂ stands for the linear mapping f₁ ↦ Γ₁(f₁, f₂)(s,·). One way to treat this kind of equation is considering the following backward problem. Find a pair of adapted processes (Y, Z_Y) satisfying
\[
Y(t) - Y(T) - \int_t^T f\bigl(s, X(s), Y(s), Z(s)(\cdot, Y)\bigr)\,ds = M(t) - M(T), \tag{4.75}
\]

where M(s), t₀ < t ≤ s ≤ T, is a forward local P_{t,x}-martingale (for every T > t > t₀). The symbol Z_{Y₁}, Y₁ ∈ S²([0,T], R^k), stands for the functional
\[
Z_{Y_1}(Y_2)(s) = Z(s)\bigl(Y_1(\cdot), Y_2(\cdot)\bigr) = \frac{d}{ds}\langle Y_1(\cdot), Y_2(\cdot)\rangle(s), \qquad Y_2 \in S^2\bigl([0,T], \mathbb R^k\bigr). \tag{4.76}
\]
If the pair (Y, Z_Y) satisfies (4.75), then Z_Y = Z_M. Instead of trying to find the pair (Y, Z_Y) we will try to find a pair (Y, M) ∈ S²([0,T], R^k) × M²([0,T], R^k) such that
\[
Y(t) = Y(T) + \int_t^T f(s, X(s), Y(s), Z_M(s))\,ds + M(t) - M(T).
\]
Next we define the spaces S²([0,T], R^k) and M²([0,T], R^k): compare with Definitions 4.18 and 4.19.
Definition 4.28. Let (Ω, F, P) be a probability space, and let F_t, t ∈ [0,T], be a filtration on F. Let t ↦ Y(t) be a stochastic process with values in R^k which is adapted to the filtration F_t and which is P-almost surely continuous. Then Y is said to belong to the space S²([0,T], R^k) provided that
\[
E\Bigl[\sup_{t\in[0,T]} |Y(t)|^2\Bigr] < \infty.
\]

Definition 4.29. The space of R^k-valued martingales in L²(Ω, F, P; R^k) is denoted by M²([0,T], R^k). So a continuous martingale t ↦ M(t) − M(0) belongs to M²([0,T], R^k) if
\[
E\bigl[|M(T) - M(0)|^2\bigr] < \infty. \tag{4.77}
\]
Since the process t ↦ |M(t)|² − |M(0)|² − ⟨M,M⟩(t) + ⟨M,M⟩(0) is a martingale difference we see that
\[
E\bigl[|M(T) - M(0)|^2\bigr] = E\bigl[\langle M,M\rangle(T) - \langle M,M\rangle(0)\bigr], \tag{4.78}
\]
and hence a martingale difference t ↦ M(t) − M(0) in L²(Ω, F, P; R^k) belongs to M²([0,T], R^k) if and only if E[⟨M,M⟩(T) − ⟨M,M⟩(0)] is finite. By the Burkholder-Davis-Gundy inequality this is the case if and only if
\[
E\Bigl[\sup_{0<t<T} |M(t) - M(0)|^2\Bigr] < \infty.
\]
To be precise, let M(s), t ≤ s ≤ T, be a continuous local L²-martingale taking values in R^k. Put M*(s) = sup_{t≤τ≤s} |M(τ)|. Fix 0 < p < ∞. The Burkholder-Davis-Gundy inequality says that there exist universal finite and strictly positive constants c_p and C_p such that
\[
c_p\,E\bigl[(M^*(s))^{2p}\bigr] \le E\bigl[\bigl(\langle M(\cdot), M(\cdot)\rangle(s)\bigr)^p\bigr] \le C_p\,E\bigl[(M^*(s))^{2p}\bigr], \qquad t \le s \le T. \tag{4.79}
\]
If p = 1, then c_p = ¼, and if p = ½, then c_p = ⅛√2. For more details and a proof see e.g. Ikeda and Watanabe [109].
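The constant c₁ = ¼ in (4.79) can be probed by simulation: for standard Brownian motion on [0,1] the quadratic variation is ⟨W,W⟩(1) = 1, so (4.79) with p = 1 predicts ¼·E[(W*)²] ≤ 1, while Doob's L²-inequality supplies the complementary bound E[(W*)²] ≤ 4·E[W(1)²] = 4. The following Monte Carlo sketch (our own illustration, with an assumed discretization) estimates E[(W*)²].

```python
import math, random

# Monte Carlo sketch (illustrative): for standard Brownian motion on [0, 1]
# we have <W, W>(1) = 1, so the Burkholder-Davis-Gundy inequality (4.79)
# with p = 1 and c_1 = 1/4 predicts (1/4) * E[(W*)^2] <= 1, while Doob's
# L^2-inequality gives the matching upper bound E[(W*)^2] <= 4.

random.seed(7)
n_paths, n_steps, T = 2000, 500, 1.0
dt = T / n_steps
second_moment = 0.0
for _ in range(n_paths):
    w, w_star = 0.0, 0.0
    for _ in range(n_steps):
        w += random.gauss(0.0, math.sqrt(dt))
        w_star = max(w_star, abs(w))   # running maximum M*(t) along the path
    second_moment += w_star ** 2
est = second_moment / n_paths          # estimate of E[(W*)^2]
print(est)
```

The estimate lands comfortably inside the interval [1, 4] dictated by the two inequalities, illustrating that the universal constants are far from tight for Brownian motion.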
The following theorem will be employed to prove continuity of solutions to BSDE's. It also implies that BSDE's as considered by us possess at most unique solutions. The variables (Y, M) and (Y′, M′) attain their values in R^k endowed with its Euclidean inner product ⟨y′, y⟩ = Σ_{j=1}^k y′_j y_j, y′, y ∈ R^k. Processes of the form s ↦ f(s, Y(s), Z_M(s)) are progressively measurable processes whenever the pair (Y, M) belongs to the space mentioned in (4.80) in the next theorem.

Theorem 4.30. Let the pairs (Y, M) and (Y′, M′), which belong to the space
\[
L^2\bigl([0,T]\times\Omega,\ \mathcal F_T^0,\ dt\times P\bigr) \times M^2\bigl(\Omega, \mathcal F_T^0, P\bigr), \tag{4.80}
\]

be solutions to the following BSDE’s:


Z T
Y (t) = Y (T ) + f (s, Y (s), ZM (s)) ds + M (t) − M (T ), and (4.81)
t
Z T
Y 0 (t) = Y 0 (T ) + f 0 (s, Y 0 (s), ZM 0 (s)) ds + M 0 (t) − M 0 (T ) (4.82)
t

for 0 ≤ t ≤ T . In particular this means that the processes (Y, M ) and (Y 0 , M 0 )


are progressively measurable and are square integrable. Suppose that the coef-
ficient f 0 satisfies the following monotonicity and Lipschitz condition. There
exist some positive and finite constants C10 and C20 such that the following
inequalities hold for all 0 ≤ t ≤ T :

\[
\langle Y'(t) - Y(t),\ f'(t, Y'(t), Z_{M'}(t)) - f'(t, Y(t), Z_{M'}(t))\rangle \le (C_1')^2\,|Y'(t) - Y(t)|^2, \quad\text{and} \tag{4.83}
\]
\[
|f'(t, Y(t), Z_{M'}(t)) - f'(t, Y(t), Z_M(t))|^2 \le (C_2')^2\,\frac{d}{dt}\langle M' - M,\ M' - M\rangle(t). \tag{4.84}
\]
Then the pair (Y′ − Y, M′ − M) belongs to
\[
S^2\bigl(\Omega, \mathcal F_T^0, P; \mathbb R^k\bigr) \times M^2\bigl(\Omega, \mathcal F_T^0, P; \mathbb R^k\bigr),
\]
and there exists a constant C′ which depends on C′₁, C′₂ and T such that
\[
\begin{aligned}
&E\Bigl[\sup_{0<t<T} |Y'(t) - Y(t)|^2 + \langle M'-M,\ M'-M\rangle(T)\Bigr]\\
&\quad \le C'\,E\Bigl[|Y'(T) - Y(T)|^2 + \int_0^T |f'(s, Y(s), Z_M(s)) - f(s, Y(s), Z_M(s))|^2\,ds\Bigr]. 
\end{aligned} \tag{4.85}
\]

Remark 4.31. From the proof it follows that for C′ we may choose C′ = 260e^{γT}, where γ = 1 + 2(C′₁)² + 2(C′₂)².

By taking Y(T) = Y′(T) and f(s, Y(s), Z_M(s)) = f′(s, Y(s), Z_M(s)) it also implies that BSDE's as considered by us possess at most unique solutions. A precise formulation reads as follows.

Corollary 4.32. Suppose that the coefficient f satisfies the monotonicity condition (4.83) and the Lipschitz condition (4.84). Then there exists at most one pair (Y, M) ∈ L²([0,T]×Ω, F_T^0, dt×P) × M²(Ω, F_T^0, P) which satisfies the backward stochastic differential equation in (4.81).

Proof (Proof of Theorem 4.30). Put \(\overline Y = Y' - Y\) and \(\overline M = M' - M\). From Itô's formula it follows that
\[
\begin{aligned}
&|\overline Y(t)|^2 + \langle \overline M, \overline M\rangle(T) - \langle \overline M, \overline M\rangle(t)\\
&\quad = |\overline Y(T)|^2 + 2\int_t^T \langle \overline Y(s),\ f'(s, Y'(s), Z_{M'}(s)) - f'(s, Y(s), Z_{M'}(s))\rangle\,ds\\
&\qquad + 2\int_t^T \langle \overline Y(s),\ f'(s, Y(s), Z_{M'}(s)) - f'(s, Y(s), Z_M(s))\rangle\,ds\\
&\qquad + 2\int_t^T \langle \overline Y(s),\ f'(s, Y(s), Z_M(s)) - f(s, Y(s), Z_M(s))\rangle\,ds\\
&\qquad - 2\int_t^T \langle \overline Y(s),\ d\overline M(s)\rangle. 
\end{aligned} \tag{4.86}
\]
Inserting the inequalities (4.83) and (4.84) into (4.86) shows:
\[
\begin{aligned}
&|\overline Y(t)|^2 + \langle \overline M, \overline M\rangle(T) - \langle \overline M, \overline M\rangle(t)\\
&\quad \le |\overline Y(T)|^2 + 2(C_1')^2 \int_t^T |\overline Y(s)|^2\,ds + 2C_2' \int_t^T |\overline Y(s)| \Bigl(\frac{d}{ds}\langle \overline M, \overline M\rangle(s)\Bigr)^{1/2} ds\\
&\qquad + 2\int_t^T |\overline Y(s)|\,|f'(s, Y(s), Z_M(s)) - f(s, Y(s), Z_M(s))|\,ds\\
&\qquad - 2\int_t^T \langle \overline Y(s),\ d\overline M(s)\rangle. 
\end{aligned} \tag{4.87}
\]

The elementary inequalities 2ab ≤ 2C′₂a² + b²/(2C′₂) and 2ab ≤ a² + b², 0 ≤ a, b ∈ R, apply to the effect that
\[
\begin{aligned}
&|\overline Y(t)|^2 + \tfrac12\bigl(\langle \overline M, \overline M\rangle(T) - \langle \overline M, \overline M\rangle(t)\bigr)\\
&\quad \le |\overline Y(T)|^2 + \bigl(1 + 2(C_1')^2 + 2(C_2')^2\bigr)\int_t^T |\overline Y(s)|^2\,ds\\
&\qquad + \int_t^T |f'(s, Y(s), Z_M(s)) - f(s, Y(s), Z_M(s))|^2\,ds\\
&\qquad - 2\int_0^T \langle \overline Y(s),\ d\overline M(s)\rangle + 2\int_0^t \langle \overline Y(s),\ d\overline M(s)\rangle. 
\end{aligned} \tag{4.88}
\]

For a concise formulation of the relevant inequalities we introduce the following functions and the constant γ:
\[
\begin{aligned}
A_{\overline Y}(t) &= E\bigl[|\overline Y(t)|^2\bigr],\\
A_{\overline M}(t) &= E\bigl[\langle \overline M, \overline M\rangle(T) - \langle \overline M, \overline M\rangle(t)\bigr],\\
C(t) &= E\bigl[|f'(t, Y(t), Z_M(t)) - f(t, Y(t), Z_M(t))|^2\bigr],\\
B(t) &= A_{\overline Y}(T) + \int_t^T C(s)\,ds = B(T) + \int_t^T C(s)\,ds, \quad\text{and}\\
\gamma &= 1 + 2(C_1')^2 + 2(C_2')^2. 
\end{aligned} \tag{4.89}
\]

Using the quantities in (4.89) and remembering the fact that the final term in (4.88) represents a martingale difference, the inequality in (4.88) implies:
\[
A_{\overline Y}(t) + \tfrac12 A_{\overline M}(t) \le B(t) + \gamma \int_t^T A_{\overline Y}(s)\,ds. \tag{4.90}
\]

Using (4.90) and employing induction with respect to n yields:
\[
A_{\overline Y}(t) + \tfrac12 A_{\overline M}(t) \le B(t) + \sum_{k=0}^n \int_t^T \frac{\gamma^{k+1}(T-s)^k}{k!}\,B(s)\,ds + \int_t^T \frac{\gamma^{n+2}(T-s)^{n+1}}{(n+1)!}\,A_{\overline Y}(s)\,ds. \tag{4.91}
\]

Passing to the limit for n → ∞ in (4.91) results in:
\[
A_{\overline Y}(t) + \tfrac12 A_{\overline M}(t) \le B(t) + \gamma\int_t^T e^{\gamma(T-s)} B(s)\,ds. \tag{4.92}
\]
Since \(B(t) = A_{\overline Y}(T) + \int_t^T C(s)\,ds\), from (4.92) we infer:
\[
A_{\overline Y}(t) + \tfrac12 A_{\overline M}(t) \le e^{\gamma(T-t)}\Bigl(A_{\overline Y}(T) + \int_t^T C(s)\,ds\Bigr). \tag{4.93}
\]

By first taking the supremum over 0 < t < T and then taking expectations in (4.88) we get:
\[
\begin{aligned}
E\Bigl[\sup_{0<t<T} |\overline Y(t)|^2\Bigr] &\le E\bigl[|\overline Y(T)|^2\bigr] + \bigl(1 + 2(C_1')^2 + 2(C_2')^2\bigr)\int_0^T E\bigl[|\overline Y(s)|^2\bigr]\,ds\\
&\qquad + \int_0^T E\bigl[|f'(s, Y(s), Z_M(s)) - f(s, Y(s), Z_M(s))|^2\bigr]\,ds\\
&\qquad + 2\,E\Bigl[\sup_{0<t<T} \int_0^t \langle \overline Y(s),\ d\overline M(s)\rangle\Bigr]. 
\end{aligned} \tag{4.94}
\]
The quadratic variation of the martingale \(t \mapsto \int_0^t \langle \overline Y(s), d\overline M(s)\rangle\) is given by the increasing process \(t \mapsto \int_0^t |\overline Y(s)|^2\,d\langle \overline M, \overline M\rangle(s)\). From the Burkholder-Davis-Gundy inequality (4.79) we know that
\[
E\Bigl[\sup_{0<t<T}\int_0^t \langle \overline Y(s),\ d\overline M(s)\rangle\Bigr] \le 4\sqrt2\,E\Bigl[\Bigl(\int_0^T |\overline Y(s)|^2\,d\langle \overline M, \overline M\rangle(s)\Bigr)^{1/2}\Bigr]. \tag{4.95}
\]
For more details on the Burkholder-Davis-Gundy inequality, see e.g. Ikeda and Watanabe [109]. Again we use an elementary inequality, 4√2 ab ≤ ¼a² + 32b², and plug it into (4.95) to obtain
\[
\begin{aligned}
E\Bigl[\sup_{0<t<T}\int_0^t \langle \overline Y(s),\ d\overline M(s)\rangle\Bigr] &\le 4\sqrt2\,E\Bigl[\sup_{0<t<T}|\overline Y(t)|\Bigl(\int_0^T d\langle \overline M, \overline M\rangle(s)\Bigr)^{1/2}\Bigr]\\
&\le \tfrac14\,E\Bigl[\sup_{0<t<T}|\overline Y(t)|^2\Bigr] + 32\,E\bigl[\langle \overline M, \overline M\rangle(T)\bigr]. 
\end{aligned} \tag{4.96}
\]

From (4.93) we also infer
\[
\gamma\int_0^T A_{\overline Y}(s)\,ds \le \gamma\int_0^T e^{\gamma(T-s)}\Bigl(A_{\overline Y}(T) + \int_s^T C(\rho)\,d\rho\Bigr)ds = \bigl(e^{\gamma T}-1\bigr)A_{\overline Y}(T) + \int_0^T \bigl(e^{\gamma T} - e^{\gamma(T-\rho)}\bigr)C(\rho)\,d\rho. \tag{4.97}
\]

Inserting the inequalities (4.96) and (4.97) into (4.94) yields:
\[
E\Bigl[\sup_{0<t<T}|\overline Y(t)|^2\Bigr] \le e^{\gamma T} E\bigl[|\overline Y(T)|^2\bigr] + e^{\gamma T}\int_0^T C(s)\,ds + \tfrac12 E\Bigl[\sup_{0<t<T}|\overline Y(t)|^2\Bigr] + 64\,E\bigl[\langle \overline M, \overline M\rangle(T)\bigr]. \tag{4.98}
\]
From (4.93) we also get
\[
E\bigl[\langle \overline M, \overline M\rangle(T)\bigr] = A_{\overline M}(0) \le 2e^{\gamma T}\Bigl(A_{\overline Y}(T) + \int_0^T C(s)\,ds\Bigr) = 2e^{\gamma T}\Bigl(E\bigl[|\overline Y(T)|^2\bigr] + \int_0^T C(s)\,ds\Bigr). \tag{4.99}
\]

A combination of (4.99) and (4.98) results in
\[
E\Bigl[\sup_{0<t<T}|\overline Y(t)|^2\Bigr] \le 258\,e^{\gamma T}\Bigl(E\bigl[|\overline Y(T)|^2\bigr] + \int_0^T C(s)\,ds\Bigr). \tag{4.100}
\]
Adding the right- and left-hand sides of (4.100) and (4.99) proves Theorem 4.30 with the constant C′ given by C′ = 260e^{γT}, where γ = 1 + 2(C′₁)² + 2(C′₂)².
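The passage from (4.90) to (4.93) is a backward Gronwall argument. The following sketch (our own illustration, with assumed numerical values of γ and B) integrates the extremal case, where (4.90) holds with equality, and compares it at t = 0 with the exponential bound B·e^{γ(T−t)} produced by the iteration (4.91).

```python
import math

# Numerical sketch of the backward Gronwall step (4.90) => (4.92)/(4.93)
# (illustrative values): the extremal case
#     A(t) = B + gamma * int_t^T A(s) ds,   A(T) = B,
# satisfies A' = -gamma * A backwards in time, hence
#     A(t) = B * exp(gamma * (T - t)),
# the bound that the iterated-kernel estimate (4.91) converges to.

gamma, B, T, n = 1.7, 0.3, 1.0, 100000
dt = T / n
A = B                                   # start from A(T) = B
for _ in range(n):                      # integrate backwards: A' = -gamma*A
    A += dt * gamma * A
bound = B * math.exp(gamma * T)         # Gronwall bound evaluated at t = 0
print(A, bound)
```

Any A satisfying the inequality (4.90) is dominated by this extremal solution, which is why the exponential factor e^{γ(T−t)} appears in (4.93).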
In Definitions 4.28 and 4.29 the spaces S²([0,T], R^k) and M²([0,T], R^k) are defined.
In Theorem 4.34 we will replace the Lipschitz condition (4.101) in Theorem 4.33 for the function Y(s) ↦ f(s, Y(s), Z_M(s)) with the (weaker) monotonicity condition (4.123). Here we write y for the variable Y(s) and z for Z_M(s). It is noticed that we consider a probability space (Ω, F, P) with a filtration (F_t)_{t∈[0,T]} = (F_t^0)_{t∈[0,T]}, where F_T = F.
Theorem 4.33. Let f : [0,T] × R^k × (M²)* → R^k be Lipschitz continuous in the sense that there exist finite constants C₁ and C₂ such that for any two pairs of processes (Y, M) and (U, N) ∈ S²([0,T], R^k) × M²([0,T], R^k) the following inequalities hold for all 0 ≤ s ≤ T:
\[
|f(s, Y(s), Z_M(s)) - f(s, U(s), Z_M(s))| \le C_1\,|Y(s) - U(s)|, \quad\text{and} \tag{4.101}
\]
\[
|f(s, Y(s), Z_M(s)) - f(s, Y(s), Z_N(s))| \le C_2\Bigl(\frac{d}{ds}\langle M-N,\ M-N\rangle(s)\Bigr)^{1/2}. \tag{4.102}
\]
Suppose that \(E\bigl[\int_0^T |f(s,0,0)|^2\,ds\bigr] < \infty\). Then there exists a unique pair (Y, M) ∈ S²([0,T], R^k) × M²([0,T], R^k) such that
\[
Y(t) = \xi + \int_t^T f(s, Y(s), Z_M(s))\,ds + M(t) - M(T), \tag{4.103}
\]
where Y(T) = ξ ∈ L²(Ω, F_T, R^k) is given and Y(0) = M(0).
For brevity we write
\[
S^2\times M^2 = S^2\bigl([0,T], \mathbb R^k\bigr)\times M^2\bigl([0,T], \mathbb R^k\bigr) = S^2\bigl(\Omega, \mathcal F_T^0, P; \mathbb R^k\bigr)\times M^2\bigl(\Omega, \mathcal F_T^0, P; \mathbb R^k\bigr).
\]

In fact we employ this theorem with the function f replaced with f_δ, 0 < δ < (2C₁)⁻¹, where f_δ is defined by
\[
f_\delta(s, y, Z_M(s)) = f\bigl(s, (I - \delta f_{s,M})^{-1}(y), Z_M(s)\bigr). \tag{4.104}
\]
Here f_{s,M}(y) = f(s, y, Z_M(s)). If the function f is monotone (or one-sided Lipschitz) in the second variable with constant C₁, and Lipschitz in the third variable with constant C₂, then the function f_δ is Lipschitz in y with Lipschitz constant δ⁻¹.
Proof. The proof of the uniqueness part follows from Corollary 4.32.
In order to prove existence we proceed as follows. By induction we define a sequence (Y_n, M_n) in the space S² × M² as follows:
\[
Y_{n+1}(t) = E\Bigl[\,\xi + \int_t^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr], \quad\text{and} \tag{4.105}
\]
\[
M_{n+1}(t) = E\Bigl[\,\xi + \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr]. \tag{4.106}
\]

Then, since the process s ↦ f(s, Y_n(s), Z_{M_n}(s)) is adapted, we have:
\[
\begin{aligned}
&\xi + \int_t^T f(s, Y_n(s), Z_{M_n}(s))\,ds + M_{n+1}(t) - M_{n+1}(T)\\
&\quad = \xi + \int_t^T f(s, Y_n(s), Z_{M_n}(s))\,ds + E\Bigl[\,\xi + \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr]\\
&\qquad - E\Bigl[\,\xi + \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_T\Bigr]\\
&\quad = \xi + \int_t^T f(s, Y_n(s), Z_{M_n}(s))\,ds + E\Bigl[\,\xi + \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr]\\
&\qquad - \xi - \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds\\
&\quad = E\Bigl[\,\xi + \int_0^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr] - \int_0^t f(s, Y_n(s), Z_{M_n}(s))\,ds\\
&\quad = E\Bigl[\,\xi + \int_t^T f(s, Y_n(s), Z_{M_n}(s))\,ds \Bigm| \mathcal F_t\Bigr] = Y_{n+1}(t). 
\end{aligned} \tag{4.107}
\]
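The contraction mechanism behind the scheme (4.105) is already visible in the deterministic case, where the conditional expectation disappears. The sketch below (our own illustration with the assumed Lipschitz generator f(y) = cos y) iterates Y_{n+1}(t) = ξ + ∫ₜᵀ f(Y_n(s)) ds on a time grid and records the successive sup-distances, which decay factorially, mirroring the iterated-kernel bound in (4.91).

```python
import math

# Deterministic sketch of the Picard scheme (4.105) (noise suppressed, so the
# conditional expectation disappears): iterate
#     Y_{n+1}(t) = xi + int_t^T f(Y_n(s)) ds
# for the illustrative Lipschitz generator f(y) = cos(y), and watch the
# successive sup-distances between iterates decay, as in the contraction
# argument of the proof of Theorem 4.33.

T, n_grid, xi = 1.0, 1000, 0.5
dt = T / n_grid
f = lambda y: math.cos(y)

Y = [xi] * (n_grid + 1)                 # zeroth iterate, constant in t
dists = []
for n in range(8):
    Ynew = [0.0] * (n_grid + 1)
    Ynew[n_grid] = xi                   # terminal condition Y(T) = xi
    for k in range(n_grid - 1, -1, -1): # backward Riemann sum for int_t^T
        Ynew[k] = Ynew[k + 1] + dt * f(Y[k + 1])
    dists.append(max(abs(a - b) for a, b in zip(Y, Ynew)))
    Y = Ynew
print(dists)                            # roughly like C * T^n / n!
```

In the stochastic setting the same geometric/factorial decay is captured by the weighted norm ‖·‖_α used later in the proof, with the conditional expectation playing the role of the plain integral here.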

Suppose that the pair (Y_n, M_n) belongs to S² × M². We first prove that the pair (Y_{n+1}, M_{n+1}) is a member of S² × M². To this end we fix α = 1 + C₁² + C₂² ∈ R, where C₁ and C₂ are as in (4.101) and (4.102) respectively. From Itô's formula we get:
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t)|^2 + 2\alpha\int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\\
&\quad = e^{2\alpha T}|Y_{n+1}(T)|^2\\
&\qquad + 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ f(s, Y_n(s), Z_{M_n}(s)) - f(s, Y_n(s), 0)\rangle\,ds\\
&\qquad + 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ f(s, Y_n(s), 0) - f(s, 0, 0)\rangle\,ds\\
&\qquad + 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ f(s, 0, 0)\rangle\,ds - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle. 
\end{aligned} \tag{4.108}
\]

We employ (4.101) and (4.102) to obtain from (4.108):
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t)|^2 + 2\alpha\int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T)|^2 + 2C_2\int_t^T e^{2\alpha s}|Y_{n+1}(s)|\Bigl(\frac{d}{ds}\langle M_n, M_n\rangle(s)\Bigr)^{1/2} ds\\
&\qquad + 2C_1\int_t^T e^{2\alpha s}|Y_{n+1}(s)|\,|Y_n(s)|\,ds\\
&\qquad + 2\int_t^T e^{2\alpha s}|Y_{n+1}(s)|\,|f(s,0,0)|\,ds - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle. 
\end{aligned} \tag{4.109}
\]

The elementary inequalities 2ab ≤ 2C_j a² + b²/(2C_j), a, b ∈ R, j = 0, 1, 2, with C₀ = 1, in combination with (4.109) yield
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t)|^2 + 2\alpha\int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T)|^2 + 2C_2^2\int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \frac12\int_t^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\\
&\qquad + 2C_1^2\int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \frac12\int_t^T e^{2\alpha s}|Y_n(s)|^2\,ds\\
&\qquad + \int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \int_t^T e^{2\alpha s}|f(s,0,0)|^2\,ds\\
&\qquad - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle, 
\end{aligned} \tag{4.110}
\]

and hence by the choice of α from (4.110) we infer:
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t)|^2 + \int_t^T e^{2\alpha s}|Y_{n+1}(s)|^2\,ds + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\\
&\qquad + 2\int_0^T e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T)|^2 + \frac12\int_t^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s) + \frac12\int_t^T e^{2\alpha s}|Y_n(s)|^2\,ds\\
&\qquad + \int_t^T e^{2\alpha s}|f(s,0,0)|^2\,ds + 2\int_0^t e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle. 
\end{aligned} \tag{4.111}
\]

The following steps can be justified by observing that the process Y_{n+1} belongs to the space L²(Ω, F_T^0, P), and that sup_{0≤t≤T} |Y_{n+1}(t)| < ∞ P-almost surely. Stop the process Y_{n+1}(t) at the stopping time τ_N, defined as the first time t ≤ T that |Y_{n+1}(t)| exceeds N. In inequality (4.111) we then replace t by t ∧ τ_N and proceed as below with the stopped processes instead of the processes themselves. Then we use the monotone convergence theorem to obtain inequality (4.114). By the same approximation argument we may assume that \(E\bigl[\int_t^T e^{2\alpha s}\langle Y_{n+1}(s), dM_{n+1}(s)\rangle\bigr] = 0\). Hence (4.111) implies that
" Z Z #
T T
2 2
E e2αt |Yn+1 (t)| + e2αs |Yn+1 (s)| ds + e2αs d hMn+1 , Mn+1 i (s)
t t
"Z #
h i 1 T
2αT 2 2αs
≤e E |Yn+1 (T )| + E e d hMn , Mn i (s)
2 t
"Z #
T
1 2
+ E e2αs |Yn (s)| ds
2 t
"Z #
T
2αs 2
+E e |f (s, 0, 0)| ds < ∞. (4.112)
t

Invoking the Burkholder-Davis-Gundy inequality and applying the equality
\[
\Bigl\langle \int_0^\cdot e^{2\alpha s}\langle Y_{n+1}(s), dM_{n+1}(s)\rangle,\ \int_0^\cdot e^{2\alpha s}\langle Y_{n+1}(s), dM_{n+1}(s)\rangle\Bigr\rangle(t) = \int_0^t e^{4\alpha s}|Y_{n+1}(s)|^2\,d\langle M_{n+1}, M_{n+1}\rangle(s)
\]
to (4.111) yields:
\[
\begin{aligned}
&E\Bigl[\sup_{0<t<T} e^{2\alpha t}|Y_{n+1}(t)|^2\Bigr]\\
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T)|^2\bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\Bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s)|^2\,ds\Bigr]\\
&\qquad + E\Bigl[\int_0^T e^{2\alpha s}|f(s,0,0)|^2\,ds\Bigr] - 2E\Bigl[\int_0^T e^{2\alpha s}\langle Y_{n+1}(s),\ dM_{n+1}(s)\rangle\Bigr]\\
&\qquad + 8\sqrt2\,E\Bigl[\Bigl(\int_0^T e^{4\alpha s}|Y_{n+1}(s)|^2\,d\langle M_{n+1}, M_{n+1}\rangle(s)\Bigr)^{1/2}\Bigr]
\end{aligned}
\]
(without loss of generality assume that \(E\bigl[\int_0^T e^{2\alpha s}\langle Y_{n+1}(s), dM_{n+1}(s)\rangle\bigr] = 0\))
\[
\begin{aligned}
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T)|^2\bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\Bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s)|^2\,ds\Bigr]\\
&\qquad + E\Bigl[\int_0^T e^{2\alpha s}|f(s,0,0)|^2\,ds\Bigr]\\
&\qquad + 8\sqrt2\,E\Bigl[\sup_{0<t<T} e^{\alpha t}|Y_{n+1}(t)|\Bigl(\int_0^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\Bigr)^{1/2}\Bigr]
\end{aligned}
\]
(use 8√2 ab ≤ ½a² + 64b², a, b ∈ R)
\[
\begin{aligned}
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T)|^2\bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\Bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s)|^2\,ds\Bigr]\\
&\qquad + E\Bigl[\int_0^T e^{2\alpha s}|f(s,0,0)|^2\,ds\Bigr] + \frac12 E\Bigl[\sup_{0<t<T} e^{2\alpha t}|Y_{n+1}(t)|^2\Bigr]\\
&\qquad + 64\,E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_{n+1}, M_{n+1}\rangle(s)\Bigr]
\end{aligned}
\]
(apply (4.112))
\[
\begin{aligned}
&\quad \le 65\,e^{2\alpha T} E\bigl[|Y_{n+1}(T)|^2\bigr] + \frac{65}{2} E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\Bigr] + \frac{65}{2} E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s)|^2\,ds\Bigr]\\
&\qquad + 65\,E\Bigl[\int_0^T e^{2\alpha s}|f(s,0,0)|^2\,ds\Bigr] + \frac12 E\Bigl[\sup_{0<t<T} e^{2\alpha t}|Y_{n+1}(t)|^2\Bigr]. 
\end{aligned} \tag{4.113}
\]

From (4.113) it follows that
\[
\begin{aligned}
E\Bigl[\sup_{0<t<T} e^{2\alpha t}|Y_{n+1}(t)|^2\Bigr] &\le 130\,e^{2\alpha T} E\bigl[|Y_{n+1}(T)|^2\bigr] + 130\,E\Bigl[\int_0^T e^{2\alpha s}|f(s,0,0)|^2\,ds\Bigr]\\
&\qquad + 65\,E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n, M_n\rangle(s)\Bigr] + 65\,E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s)|^2\,ds\Bigr] < \infty. 
\end{aligned} \tag{4.114}
\]

From (4.112) and (4.114) it follows that the pair (Y_{n+1}, M_{n+1}) belongs to S² × M².
Another application of Itô's formula shows:
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2 + 2\alpha\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\\
&\qquad + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\\
&\quad = e^{2\alpha T}|Y_{n+1}(T) - Y_n(T)|^2\\
&\qquad + 2\int_t^T e^{2\alpha s}\bigl\langle \triangle Y_n(s),\ f(s, Y_n(s), Z_{M_n}(s)) - f\bigl(s, Y_n(s), Z_{M_{n-1}}(s)\bigr)\bigr\rangle\,ds\\
&\qquad + 2\int_t^T e^{2\alpha s}\bigl\langle \triangle Y_n(s),\ f\bigl(s, Y_n(s), Z_{M_{n-1}}(s)\bigr) - f\bigl(s, Y_{n-1}(s), Z_{M_{n-1}}(s)\bigr)\bigr\rangle\,ds\\
&\qquad - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle, 
\end{aligned} \tag{4.115}
\]
where for brevity we wrote △Y_n(s) = Y_{n+1}(s) − Y_n(s). From (4.101), (4.102), and (4.115) we infer
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2 + 2\alpha\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\\
&\qquad + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T) - Y_n(T)|^2\\
&\qquad + 2C_2\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|\Bigl(\frac{d}{ds}\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\Bigr)^{1/2} ds\\
&\qquad + 2C_1\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|\,|Y_n(s) - Y_{n-1}(s)|\,ds\\
&\qquad - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T) - Y_n(T)|^2 + 2C_2^2\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\\
&\qquad + \frac12\int_t^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\\
&\qquad + 2C_1^2\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds + \frac12\int_t^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\\
&\qquad - 2\int_t^T e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle. 
\end{aligned} \tag{4.116}
\]

Since α = 1 + C₁² + C₂², the inequality in (4.116) implies:
\[
\begin{aligned}
&e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2 + 2\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\\
&\qquad + \int_t^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T) - Y_n(T)|^2 + \frac12\int_t^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\\
&\qquad + \frac12\int_t^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\\
&\qquad - 2\int_0^T e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle\\
&\qquad + 2\int_0^t e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle. 
\end{aligned} \tag{4.117}
\]

Upon taking expectations in (4.117) we see
\[
\begin{aligned}
&e^{2\alpha t} E\bigl[|Y_{n+1}(t) - Y_n(t)|^2\bigr] + 2E\Bigl[\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\Bigr]\\
&\qquad + E\Bigl[\int_t^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr]\\
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac12 E\Bigl[\int_t^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\Bigr]\\
&\qquad + \frac12 E\Bigl[\int_t^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\Bigr]. 
\end{aligned} \tag{4.118}
\]

In particular it follows that
\[
\begin{aligned}
&2E\Bigl[\int_t^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\Bigr] + E\Bigl[\int_t^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr]\\
&\quad \le \frac12 E\Bigl[\int_t^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\Bigr] + \frac12 E\Bigl[\int_t^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\Bigr],
\end{aligned}
\]
provided that Y_{n+1}(T) = Y_n(T). As a consequence we see that the sequence (Y_n, M_n) converges with respect to the norm ‖·‖_α defined by
\[
\Bigl\|\binom{Y}{M}\Bigr\|_\alpha^2 = E\Bigl[\int_0^T e^{2\alpha s}|Y(s)|^2\,ds + \int_0^T e^{2\alpha s}\,d\langle M, M\rangle(s)\Bigr].
\]

Employing a reasoning similar to the one we used to obtain (4.113) and (4.114) from (4.117) we also obtain:
\[
\begin{aligned}
&\sup_{0\le t\le T} e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2\\
&\quad \le e^{2\alpha T}|Y_{n+1}(T) - Y_n(T)|^2 + \frac12\int_0^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\\
&\qquad + \frac12\int_0^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\\
&\qquad - 2\int_0^T e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle\\
&\qquad + 2\sup_{0\le t\le T}\int_0^t e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle. 
\end{aligned} \tag{4.119}
\]

By taking expectations in (4.119), and invoking the Burkholder-Davis-Gundy inequality (4.79) for p = ½, we obtain:
\[
\begin{aligned}
&E\Bigl[\sup_{0\le t\le T} e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2\Bigr]\\
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\Bigr]\\
&\qquad + \frac12 E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\Bigr]\\
&\qquad + 2E\Bigl[\sup_{0\le t\le T}\int_0^t e^{2\alpha s}\langle Y_{n+1}(s) - Y_n(s),\ dM_{n+1}(s) - dM_n(s)\rangle\Bigr]\\
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_n-M_{n-1},\ M_n-M_{n-1}\rangle(s)\Bigr]\\
&\qquad + \frac12 E\Bigl[\int_0^T e^{2\alpha s}|Y_n(s) - Y_{n-1}(s)|^2\,ds\Bigr]\\
&\qquad + 8\sqrt2\,E\Bigl[\Bigl(\int_0^T e^{4\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr)^{1/2}\Bigr]
\end{aligned}
\]
(insert the definition of ‖·‖_α)
\[
\begin{aligned}
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac12\Bigl\|\binom{Y_n - Y_{n-1}}{M_n - M_{n-1}}\Bigr\|_\alpha^2\\
&\qquad + 8\sqrt2\,E\Bigl[\sup_{0\le s\le T} e^{\alpha s}|Y_{n+1}(s) - Y_n(s)|\Bigl(\int_0^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr)^{1/2}\Bigr]
\end{aligned}
\]
(use 8√2 ab ≤ ½a² + 64b², a, b ∈ R)
\[
\begin{aligned}
&\quad \le e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac12\Bigl\|\binom{Y_n - Y_{n-1}}{M_n - M_{n-1}}\Bigr\|_\alpha^2\\
&\qquad + \frac12 E\Bigl[\sup_{0\le s\le T} e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\Bigr]\\
&\qquad + 64\,E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr]. 
\end{aligned} \tag{4.120}
\]

Employing inequality (4.118) (with t = 0) together with (4.120), and the definition of the norm ‖·‖_α, yields the inequality
\[
\begin{aligned}
&E\Bigl[\sup_{0\le t\le T} e^{2\alpha t}|Y_{n+1}(t) - Y_n(t)|^2\Bigr] + 129\,E\Bigl[\int_0^T e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\,ds\Bigr]\\
&\qquad + \frac12 E\Bigl[\int_0^T e^{2\alpha s}\,d\langle M_{n+1}-M_n,\ M_{n+1}-M_n\rangle(s)\Bigr]\\
&\quad \le \frac{131}{2}\,e^{2\alpha T} E\bigl[|Y_{n+1}(T) - Y_n(T)|^2\bigr] + \frac{131}{4}\Bigl\|\binom{Y_n - Y_{n-1}}{M_n - M_{n-1}}\Bigr\|_\alpha^2 + \frac12 E\Bigl[\sup_{0\le s\le T} e^{2\alpha s}|Y_{n+1}(s) - Y_n(s)|^2\Bigr]. 
\end{aligned} \tag{4.121}
\]

(In order to justify the transition from (4.119) to (4.121) like in passing from
inequality (4.111) to (4.114) a stopping time argument might be required.)
Consequently, from (4.121) we get
\[
\begin{aligned}
&\mathbb E\left[ \sup_{0\le t\le T} e^{2\alpha t} |Y_{n+1}(t) - Y_n(t)|^2 \right]
+ \mathbb E\left[ \int_0^T e^{2\alpha s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right]\\
&\quad\le 131\, e^{2\alpha T}\, \mathbb E\left[ |Y_{n+1}(T) - Y_n(T)|^2 \right]
+ \frac{131}{2} \left\| \binom{Y_n - Y_{n-1}}{M_n - M_{n-1}} \right\|_\alpha^2.
\end{aligned}
\tag{4.122}
\]
Since by definition $Y_n(T) = \mathbb E\left[ \xi \mid \mathcal F_T^T \right]$ for all $n \in \mathbb N$, this sequence also converges with respect to the norm $\|\cdot\|_{S^2 \times M^2}$ defined by
\[
\left\| \binom{Y}{M} \right\|_{S^2 \times M^2}^2
= \mathbb E\left[ \sup_{0 < s < T} |Y(s)|^2 \right]
+ \mathbb E\left[ \langle M, M\rangle(T) - \langle M, M\rangle(0) \right],
\]
because
\[
Y_{n+1}(0) = M_{n+1}(0)
= \mathbb E\left[ \xi + \int_0^T f_n(s, Y_n(s), Z_{M_n}(s))\, ds \,\Big|\, \mathcal F_0^0 \right], \quad n \in \mathbb N.
\]
This concludes the proof of Theorem 4.33.


In the following theorem we replace the Lipschitz condition (4.101) of Theorem 4.33 for the function $Y(s) \mapsto f(s, Y(s), Z_M(s))$ with the (weaker) monotonicity condition (4.123). Here we write $y$ for the variable $Y(s)$ and $z$ for $Z_M(s)$.

Theorem 4.34. Let $f : [0,T] \times \mathbb R^k \times (M^2)^* \to \mathbb R^k$ be monotone in the variable $y$ and Lipschitz in $z$. More precisely, suppose that there exist finite constants $C_1$ and $C_2$ such that for any two pairs of processes $(Y, M)$ and $(U, N) \in S^2([0,T], \mathbb R^k) \times M^2([0,T], \mathbb R^k)$ the following inequalities hold for all $0 \le s \le T$:
\[
\langle Y(s) - U(s),\, f(s, Y(s), Z_M(s)) - f(s, U(s), Z_M(s))\rangle \le C_1 |Y(s) - U(s)|^2,
\tag{4.123}
\]
\[
|f(s, Y(s), Z_M(s)) - f(s, Y(s), Z_N(s))| \le C_2 \left( \frac{d}{ds} \langle M - N, M - N\rangle(s) \right)^{1/2},
\tag{4.124}
\]
and
\[
|f(s, Y(s), 0)| \le \overline f(s) + K |Y(s)|.
\tag{4.125}
\]
If $\mathbb E\left[ \int_0^T |\overline f(s)|^2\, ds \right] < \infty$, then there exists a unique pair
\[
(Y, M) \in S^2([0,T], \mathbb R^k) \times M^2([0,T], \mathbb R^k)
\]
such that
\[
Y(t) = \xi + \int_t^T f(s, Y(s), Z_M(s))\, ds + M(t) - M(T),
\tag{4.126}
\]
where $Y(T) = \xi \in L^2(\Omega, \mathcal F_T, \mathbb R^k)$ is given and where $Y(0) = M(0)$.
In order to prove Theorem 4.34 we need the following proposition, the proof of which uses the monotonicity condition (4.123) in an explicit manner.

Proposition 4.35. Suppose that for every $\xi \in L^2(\Omega, \mathcal F_T^0, \mathbb P)$ and every $M \in M^2$ there exists a pair $(Y, N) \in S^2 \times M^2$ such that
\[
Y(t) = \xi + \int_t^T f(s, Y(s), Z_M(s))\, ds + N(t) - N(T).
\tag{4.127}
\]
Then for every $\xi \in L^2(\Omega, \mathcal F_T^0, \mathbb P)$ there exists a unique pair $(Y, M) \in S^2 \times M^2$ which satisfies (4.126).


The following proposition can be viewed as a consequence of Theorem 12.4
in [98]. The result is due to Burrage and Butcher [46] and Crouzeix [64]. The
obtained constants are somewhat different from ours.
Proposition 4.36. Fix a martingale M ∈ M2 , and choose δ > 0 in such
a way that δC1 < 1. Here C1 is the constant which occurs in inequality
(4.123). Choose, for given y ∈ R k e k
³ , the stochastic
´ variable Y (t) ∈ R in
such a way that y = Ye (t) − δf t, Ye (t), ZM (t) . Then the mapping y 7→
³ ´
f t, Ye (t), ZM (t) is Lipschitz continuous with a Lipschitz constant which is
µ ¶
1 δC1
equal to max 1, . Moreover, the mapping y 7→ I − δf (t, y, ZM (t))
δ 1 − δC1
is surjective and has a Lipschitz continuous inverse with Lipschitz constant
1
.
1 − δC1
Proof (Proof of Proposition 4.36). Let the pair $(y_1, y_2) \in \mathbb R^k \times \mathbb R^k$ and the pair of $\mathbb R^k \times \mathbb R^k$-valued stochastic variables $\left( \widetilde Y_1(t), \widetilde Y_2(t) \right)$ be such that the following equalities are satisfied:
\[
y_1 = \widetilde Y_1(t) - \delta f\left( t, \widetilde Y_1(t), Z_M(t) \right)
\quad\text{and}\quad
y_2 = \widetilde Y_2(t) - \delta f\left( t, \widetilde Y_2(t), Z_M(t) \right).
\tag{4.128}
\]
We have to show that there exists a constant $C(\delta)$ such that
\[
\left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right| \le C(\delta) |y_2 - y_1|.
\tag{4.129}
\]
In order to achieve this we will exploit the inequality:
\[
\left\langle \widetilde Y_2(t) - \widetilde Y_1(t),\, f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right\rangle
\le C_1 \left| \widetilde Y_2(t) - \widetilde Y_1(t) \right|^2.
\tag{4.130}
\]
Inserting the equalities in (4.128) into (4.130) results in
\[
\begin{aligned}
&\left\langle y_2 - y_1,\, f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right\rangle
+ \delta \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|^2\\
&\quad\le C_1 |y_2 - y_1|^2
+ 2\delta C_1 \left\langle y_2 - y_1,\, f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right\rangle\\
&\qquad+ C_1 \delta^2 \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|^2.
\end{aligned}
\tag{4.131}
\]
Notice that (4.131) is equivalent to:
\[
\begin{aligned}
\delta \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|^2
&\le C_1 |y_2 - y_1|^2
+ 2\left( \delta C_1 - \frac12 \right) \left\langle y_2 - y_1,\, f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right\rangle\\
&\qquad+ C_1 \delta^2 \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|^2.
\end{aligned}
\tag{4.132}
\]
Put $\alpha = \dfrac{1 - |1 - 2\delta C_1|}{2\delta C_1}$. Notice that, since $1 - \delta C_1 > 0$, the constant $\alpha$ is positive as well, and $\alpha = 1$ provided $2\delta C_1 < 1$. Since $\delta C_1 < 1$ and
\[
2 \left| \left\langle y_2 - y_1,\, f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right\rangle \right|
\le \frac{1}{\alpha\delta} |y_2 - y_1|^2
+ \alpha\delta \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|^2,
\tag{4.133}
\]
the inequality in (4.132) implies
\[
\delta \left| f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right) \right|
\le \max\left( 1, \frac{\delta C_1}{1 - \delta C_1} \right) |y_2 - y_1|.
\tag{4.134}
\]
The Lipschitz constant is given by $C(\delta) = \dfrac1\delta \max\left( 1, \dfrac{\delta C_1}{1 - \delta C_1} \right)$: compare (4.134) and (4.129). The surjectivity of the mapping $y \mapsto y - \delta f(t, y, Z_M(t))$ is a consequence of Theorem 1 in Crouzeix et al [63]. Denote the mapping $y \mapsto f(t, y, Z_M(t))$ by $f_{t,M}$. Then for $0 < 2\delta C_1 < 1$ the mapping $I - \delta f_{t,M}$ is invertible. Since
\[
(I - \delta f_{t,M})^{-1} = I + \delta f\left( t, (I - \delta f_{t,M})^{-1}, Z_M(t) \right),
\]
and since by (4.134) the mapping $y \mapsto f\left( t, (I - \delta f_{t,M})^{-1} y, Z_M(t) \right)$ is Lipschitz continuous with Lipschitz constant $\dfrac1\delta \max\left( 1, \dfrac{\delta C_1}{1 - \delta C_1} \right)$, we see that the mapping $y \mapsto (I - \delta f_{t,M})^{-1} y$ is Lipschitz continuous with constant $\max\left( 2, \dfrac{1}{1 - \delta C_1} \right)$. A somewhat better constant is obtained by again using (4.130), and replacing
\[
f\left( t, \widetilde Y_2(t), Z_M(t) \right) - f\left( t, \widetilde Y_1(t), Z_M(t) \right)
\]
with $\delta^{-1}\left( \widetilde y_2 - \widetilde y_1 - y_2 + y_1 \right)$. Then we see:
\[
|\widetilde y_2 - \widetilde y_1|^2 - \langle \widetilde y_2 - \widetilde y_1,\, y_2 - y_1\rangle \le \delta C_1 |\widetilde y_2 - \widetilde y_1|^2,
\tag{4.135}
\]
and hence
\[
(1 - \delta C_1) |\widetilde y_2 - \widetilde y_1|^2 \le \langle \widetilde y_2 - \widetilde y_1,\, y_2 - y_1\rangle \le |\widetilde y_2 - \widetilde y_1|\, |y_2 - y_1|.
\tag{4.136}
\]
Altogether this proves Proposition 4.36.
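As a one-dimensional illustration of Proposition 4.36 (a sketch of ours, not part of the proof), take the monotone map $f(y) = C_1 y - y^3$, which satisfies (4.123) with constant $C_1$. The map $y \mapsto y - \delta f(y)$ is then strictly increasing, and its inverse can be checked numerically to be Lipschitz with constant $1/(1 - \delta C_1)$, in agreement with (4.136).

```python
C1, delta = 2.0, 0.2           # delta*C1 = 0.4, so 2*delta*C1 < 1
f = lambda y: C1 * y - y ** 3  # monotone: <y-u, f(y)-f(u)> <= C1*(y-u)^2

def resolvent(y, lo=-100.0, hi=100.0, tol=1e-12):
    """Solve y = ytil - delta*f(ytil) for ytil by bisection;
    g(ytil) = (1 - delta*C1)*ytil + delta*ytil**3 is strictly increasing."""
    g = lambda yt: yt - delta * f(yt)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lip = 1.0 / (1.0 - delta * C1)  # inverse Lipschitz bound from (4.136)
ys = [x * 0.1 - 2.0 for x in range(41)]
for y1 in ys:
    for y2 in ys:
        if y1 < y2:
            assert abs(resolvent(y2) - resolvent(y1)) <= lip * (y2 - y1) + 1e-8
```

The design choice of bisection (rather than fixed-point iteration) reflects the fact that $f$ is only one-sided Lipschitz: the forward map is monotone, so root bracketing always works even though $f$ itself is not globally Lipschitz.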


Corollary 4.37. For $\delta > 0$ such that $2\delta C_1 < 1$ there exist processes $Y_\delta$ and $\widetilde Y_\delta \in S^2$ and a martingale $M_\delta \in M^2$ such that the following equalities are satisfied:
\[
Y_\delta(t) = \widetilde Y_\delta(t) - \delta f\left( t, \widetilde Y_\delta(t), Z_M(t) \right)
= Y_\delta(T) + \int_t^T f\left( s, \widetilde Y_\delta(s), Z_M(s) \right) ds + M_\delta(t) - M_\delta(T).
\tag{4.137}
\]

Proof. From Theorem 1 (page 87) in Crouzeix et al [63] it follows that the mapping $y \mapsto y - \delta f(t, y, Z_M(t))$ is a surjective map from $\mathbb R^k$ onto itself, provided $0 < \delta C_1 < 1$. If $y_2$ and $y_1$ in $\mathbb R^k$ are such that $y_2 - \delta f(t, y_2, Z_M(t)) = y_1 - \delta f(t, y_1, Z_M(t))$, then
\[
|y_2 - y_1|^2 = \langle y_2 - y_1,\, \delta f(t, y_2, Z_M(t)) - \delta f(t, y_1, Z_M(t))\rangle \le \delta C_1 |y_2 - y_1|^2,
\]
and hence $y_2 = y_1$. It follows that the continuous mapping $y \mapsto y - \delta f(t, y, Z_M(t))$ has a continuous inverse; denote this inverse by $(I - \delta f_{t,M})^{-1}$. Moreover, for $0 < 2\delta C_1 < 1$, the mapping $y \mapsto f\left( t, (I - \delta f_{t,M})^{-1} y, Z_M(t) \right)$ is Lipschitz continuous with Lipschitz constant $\delta^{-1}$, which follows from Proposition 4.36. The remaining assertions in Corollary 4.37 are consequences of Theorem 4.33, where the Lipschitz condition in (4.101) is used with $\delta^{-1}$ instead of $C_1$. This establishes the proof of Corollary 4.37.

Remark 4.38. The surjectivity of the mapping $y \mapsto y - \delta f(s, y, Z_M(s))$ from $\mathbb R^k$ onto itself follows from Theorem 1 in [63]. The authors use a homotopy argument to prove this theorem for $C_1 = 0$; upon replacing $f(t, y, Z_M(t))$ with $f(t, y, Z_M(t)) - C_1 y$ the result follows in our version. An elementary proof of Theorem 1 in [63], for a continuously differentiable function, can be found in Hairer and Wanner [98]: see Theorem 14.2 in Chapter IV. The author is grateful to Karel in 't Hout (University of Antwerp) for pointing out closely related Runge–Kutta type results and these references.
Proof (Proof of Proposition 4.35). The proof of the uniqueness part follows from Corollary 4.32.
Fix $\xi \in L^2(\Omega, \mathcal F_T^0, \mathbb P)$, and let the martingale $M_{n-1} \in M^2$ be given. Then by hypothesis there exists a pair $(Y_n, M_n) \in S^2 \times M^2$ which satisfies:
\[
Y_n(t) = \xi + \int_t^T f\left( s, Y_n(s), Z_{M_{n-1}}(s) \right) ds + M_n(t) - M_n(T).
\tag{4.138}
\]
Another use of this hypothesis yields the existence of a pair $(Y_{n+1}, M_{n+1}) \in S^2 \times M^2$ which again satisfies (4.138) with $n+1$ instead of $n$. We will prove that the sequence $(Y_n, M_n)$ is a Cauchy sequence in the space $S^2 \times M^2$. Put $\gamma = 1 + 2C_1 + 2C_2^2$. We apply Itô's formula to obtain
\[
\begin{aligned}
&e^{\gamma T} |Y_{n+1}(T) - Y_n(T)|^2 - e^{\gamma t} |Y_{n+1}(t) - Y_n(t)|^2\\
&\quad= \gamma \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
+ 2 \int_t^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(Y_{n+1}(s) - Y_n(s))\rangle\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)\\
&\quad= \gamma \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
+ 2 \int_t^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle\\
&\qquad- 2 \int_t^T e^{\gamma s} \left\langle Y_{n+1}(s) - Y_n(s),\, f(s, Y_{n+1}(s), Z_{M_n}(s)) - f(s, Y_n(s), Z_{M_n}(s)) \right\rangle ds\\
&\qquad+ 2 \int_t^T e^{\gamma s} \left\langle Y_{n+1}(s) - Y_n(s),\, f(s, Y_n(s), Z_{M_n}(s)) - f\left( s, Y_n(s), Z_{M_{n-1}}(s) \right) \right\rangle ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)
\end{aligned}
\]
(employ (4.123) and (4.124))
\[
\begin{aligned}
&\quad\ge \gamma \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
+ 2 \int_t^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle\\
&\qquad- 2C_1 \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
- 2C_2 \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)| \left( \frac{d}{ds} \langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s) \right)^{1/2} ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)
\end{aligned}
\]
(employ the elementary inequality $2ab \le 2a^2 + \frac12 b^2$)
\[
\begin{aligned}
&\quad\ge \left( \gamma - 2C_1 - 2C_2^2 \right) \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
- \frac12 \int_t^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s)\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle.
\end{aligned}
\tag{4.139}
\]

From (4.139) we infer the inequality
\[
\begin{aligned}
&\left( \gamma - 2C_1 - 2C_2^2 \right) \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds
+ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)\\
&\qquad+ e^{\gamma t} |Y_{n+1}(t) - Y_n(t)|^2
+ 2 \int_t^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle\\
&\quad\le e^{\gamma T} |Y_{n+1}(T) - Y_n(T)|^2
+ \frac12 \int_t^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s).
\end{aligned}
\tag{4.140}
\]
By taking expectations in (4.140) we get, since $\gamma = 1 + 2C_1 + 2C_2^2$,
\[
\begin{aligned}
&\mathbb E\left[ \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds \right]
+ \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right]
+ e^{\gamma t}\, \mathbb E\left[ |Y_{n+1}(t) - Y_n(t)|^2 \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |Y_{n+1}(T) - Y_n(T)|^2 \right]
+ \frac12\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s) \right].
\end{aligned}
\tag{4.141}
\]

Iterating (4.141) yields:
\[
\begin{aligned}
&\mathbb E\left[ \int_t^T e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, ds \right]
+ \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right]
+ e^{\gamma t}\, \mathbb E\left[ |Y_{n+1}(t) - Y_n(t)|^2 \right]\\
&\quad\le \sum_{k=1}^n \frac{1}{2^{n-k}}\, e^{\gamma T}\, \mathbb E\left[ |Y_{k+1}(T) - Y_k(T)|^2 \right]
+ \frac{1}{2^n}\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M_1 - M_0, M_1 - M_0\rangle(s) \right]\\
&\quad= \frac{1}{2^n}\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M_1 - M_0, M_1 - M_0\rangle(s) \right],
\end{aligned}
\tag{4.142}
\]
where in the last line we used the equalities $Y_k(T) = \xi$, $k \in \mathbb N$. From the Burkholder–Davis–Gundy inequality with $p = \frac12$ (see (4.79)) together with (4.142) it follows that
\[
\begin{aligned}
&\mathbb E\left[ \max_{0\le t\le T} \int_0^t e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1} - M_n)(s)\rangle \right]\\
&\quad\le 4\sqrt2\, \mathbb E\left[ \left( \int_0^T e^{2\gamma s} |Y_{n+1}(s) - Y_n(s)|^2\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right)^{1/2} \right]\\
&\quad\le 4\sqrt2\, \mathbb E\left[ \sup_{0\le s\le T} e^{\frac12 \gamma s} |Y_{n+1}(s) - Y_n(s)| \times \left( \int_0^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right)^{1/2} \right]
\end{aligned}
\]
(use the elementary inequality $4\sqrt2\, ab \le \frac14 a^2 + 32 b^2$)
\[
\begin{aligned}
&\quad\le \frac14\, \mathbb E\left[ \sup_{0\le s\le T} e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2 \right]
+ 32\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right]\\
&\quad\le \frac14\, \mathbb E\left[ \sup_{0\le s\le T} e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2 \right]
+ \frac{1}{2^{n-5}}\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_1 - M_0, M_1 - M_0\rangle(s) \right].
\end{aligned}
\tag{4.143}
\]

From (4.140) and (4.143) we obtain
\[
\begin{aligned}
&\sup_{0\le t\le T} e^{\gamma t} |Y_{n+1}(t) - Y_n(t)|^2
+ 2 \int_0^T e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle\\
&\quad\le e^{\gamma T} |Y_{n+1}(T) - Y_n(T)|^2
+ \frac12 \int_0^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s)\\
&\qquad+ 2 \sup_{0\le t\le T} \int_0^t e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle.
\end{aligned}
\tag{4.144}
\]

From (4.142) (with $n - 1$ instead of $n$), (4.143), and the fact that $Y_{n+1}(T) = Y_n(T) = \xi$, we infer the inequalities:
\[
\begin{aligned}
&\mathbb E\left[ \sup_{0\le t\le T} e^{\gamma t} |Y_{n+1}(t) - Y_n(t)|^2 \right]\\
&\quad\le \frac12\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s) \right]
+ 2\, \mathbb E\left[ \sup_{0\le t\le T} \int_0^t e^{\gamma s} \langle Y_{n+1}(s) - Y_n(s),\, d(M_{n+1}(s) - M_n(s))\rangle \right]\\
&\quad\le \frac12\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_n - M_{n-1}, M_n - M_{n-1}\rangle(s) \right]
+ \frac12\, \mathbb E\left[ \sup_{0\le s\le T} e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2 \right]\\
&\qquad+ 64\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s) \right]\\
&\quad\le \frac12\, \mathbb E\left[ \sup_{0\le s\le T} e^{\gamma s} |Y_{n+1}(s) - Y_n(s)|^2 \right]
+ \frac{65}{2^n}\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_1 - M_0, M_1 - M_0\rangle(s) \right].
\end{aligned}
\tag{4.145}
\]

From (4.145) we infer the inequality
\[
\mathbb E\left[ \sup_{0\le t\le T} e^{\gamma t} |Y_{n+1}(t) - Y_n(t)|^2 \right]
\le \frac{65}{2^{n-1}}\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_1 - M_0, M_1 - M_0\rangle(s) \right].
\tag{4.146}
\]
(In order to justify the passage from (4.140) to (4.146), as in passing from inequality (4.111) to (4.114), a stopping time argument might be required.) From (4.142) and (4.146) it follows that the sequence $(Y_n, M_n)$ converges in the space $S^2 \times M^2$, and that its limit $(Y, M)$ satisfies (4.126) in Theorem 4.34. This completes the proof of Proposition 4.35.
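In a purely deterministic toy setting (no martingale part, $k = 1$, and a Lipschitz driver of our own choosing) the iteration (4.138) reduces to $Y_{n+1}(t) = \xi + \int_t^T f(Y_n(s))\,ds$. The sketch below, which is an illustration only and not the scheme of the text, exhibits the geometric decay of successive sup-norm differences that drives the Cauchy argument above.

```python
T, xi, N = 1.0, 0.5, 2000            # horizon, terminal value, grid size
dt = T / N
f = lambda y: -0.8 * y + 0.3         # assumed Lipschitz driver (toy choice)

def picard_step(Y):
    """Y_{n+1}(t) = xi + int_t^T f(Y_n(s)) ds on a uniform grid (right-point rule)."""
    Z = [0.0] * (N + 1)
    Z[N] = xi
    for i in range(N - 1, -1, -1):   # integrate backwards from t = T
        Z[i] = Z[i + 1] + f(Y[i + 1]) * dt
    return Z

Y = [0.0] * (N + 1)
sup_diffs = []
for n in range(8):
    Ynew = picard_step(Y)
    sup_diffs.append(max(abs(a - b) for a, b in zip(Ynew, Y)))
    Y = Ynew
# successive sup-norm differences shrink geometrically (contraction factor <= Lip*T = 0.8)
assert all(sup_diffs[i + 1] < 0.9 * sup_diffs[i] for i in range(1, len(sup_diffs) - 1))
```

The observed ratio is bounded by the Lipschitz constant times the horizon, mirroring the factor $\frac12$ per step that the weighted norm $\|\cdot\|_\alpha$ produces in the stochastic case.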

Proposition 4.39. Let the notation and hypotheses be as in Theorem 4.34, and for $\delta > 0$ with $2\delta C_1 < 1$ let the processes $Y_\delta$, $\widetilde Y_\delta \in S^2$ and the martingale $M_\delta \in M^2$ be such that the equalities of (4.137) in Corollary 4.37 are satisfied. Then the family
\[
\left\{ (Y_\delta, M_\delta) : 0 < \delta < \frac{1}{2C_1} \right\}
\]
converges in the space $S^2 \times M^2$ as $\delta$ decreases to $0$, provided that the terminal value $\xi = Y_\delta(T)$ is given.

Let $(Y, M)$ be the limit in the space $S^2 \times M^2$. In fact from the proof of Proposition 4.39 it follows that
\[
\left\| \binom{Y_\delta - Y}{M_\delta - M} \right\|_{S^2 \times M^2} = O(\delta)
\tag{4.147}
\]
as $\delta \downarrow 0$, provided that $\left\| Y_{\delta_2}(T) - Y_{\delta_1}(T) \right\|_{L^2(\Omega, \mathcal F_T^0, \mathbb P)} = O\left( |\delta_2 - \delta_1| \right)$.

Proof (Proof of Proposition 4.39). Let $C_1$ be the constant which occurs in inequality (4.123) in Theorem 4.34, and fix $0 < \delta_2 < \delta_1 < (2C_1)^{-1}$. Our estimates give quantitative bounds in case we restrict the parameters $\delta$, $\delta_1$ and $\delta_2$ to the interval $\left( 0, (4C_1 + 4)^{-1} \right)$. An appropriate choice for the constant $\gamma$ in the present proof turns out to be $\gamma = 6 + 4C_1$ (see e.g. the inequalities (4.149), (4.161), (4.162), and (4.163) below). An appropriate choice for the positive number $a$, which may be a function of the parameters $\delta_1$ and $\delta_2$, in (4.160), (4.161) and subsequent inequalities below is given by $a = (\delta_1 + \delta_2)^{-1}$.
For convenience we introduce the following notation: $\Delta Y(s) = Y_{\delta_2}(s) - Y_{\delta_1}(s)$, $\Delta M(s) = M_{\delta_2}(s) - M_{\delta_1}(s)$, $\Delta \widetilde Y(s) = \widetilde Y_{\delta_2}(s) - \widetilde Y_{\delta_1}(s)$, and $\Delta \widetilde f(s) = \widetilde f_{\delta_2}(s) - \widetilde f_{\delta_1}(s)$, where $\widetilde f_\delta(s) = f\left( s, \widetilde Y_\delta(s), Z_M(s) \right)$. From the equalities in (4.137) we infer
\[
Y_\delta(t) = \widetilde Y_\delta(t) - \delta \widetilde f_\delta(t)
= Y_\delta(T) + \int_t^T \widetilde f_\delta(s)\, ds + M_\delta(t) - M_\delta(T).
\tag{4.148}
\]
First we prove that the family $\left\{ (Y_\delta, M_\delta) : 0 < \delta < (4C_1 + 4)^{-1} \right\}$ is bounded in the space $S^2 \times M^2$. To this end we fix $\gamma > 0$ and apply Itô's formula to the process $t \mapsto e^{\gamma t} |Y_\delta(t)|^2$ to obtain:
\[
\begin{aligned}
&e^{\gamma T} |Y_\delta(T)|^2 - e^{\gamma t} |Y_\delta(t)|^2\\
&\quad= \gamma \int_t^T e^{\gamma s} |Y_\delta(s)|^2\, ds
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dY_\delta(s)\rangle
+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)\\
&\quad= \gamma \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) - \delta \widetilde f_\delta(s) \right|^2 ds
- 2 \int_t^T e^{\gamma s} \left\langle \widetilde Y_\delta(s), \widetilde f_\delta(s) \right\rangle ds
- 2 \int_t^T e^{\gamma s} \left\langle Y_\delta(s) - \widetilde Y_\delta(s), \widetilde f_\delta(s) \right\rangle ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle\\
&\quad= \gamma \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds
+ \left( \gamma\delta^2 + 2\delta \right) \int_t^T e^{\gamma s} \left| \widetilde f_\delta(s) \right|^2 ds
- 2(1 + \gamma\delta) \int_t^T e^{\gamma s} \left\langle \widetilde Y_\delta(s),\, \widetilde f_\delta(s) - f(s, 0, Z_M(s)) \right\rangle ds\\
&\qquad- 2(1 + \gamma\delta) \int_t^T e^{\gamma s} \left\langle \widetilde Y_\delta(s),\, f(s, 0, Z_M(s)) - f(s, 0, 0) \right\rangle ds
- 2(1 + \gamma\delta) \int_t^T e^{\gamma s} \left\langle \widetilde Y_\delta(s),\, f(s, 0, 0) \right\rangle ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle
\end{aligned}
\]
(employ the inequalities (4.123), (4.124), and (4.125) of Theorem 4.34)
\[
\begin{aligned}
&\quad\ge \gamma \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds
+ \left( \gamma\delta^2 + 2\delta \right) \int_t^T e^{\gamma s} \left| \widetilde f_\delta(s) \right|^2 ds
- 2C_1 (1 + \gamma\delta) \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds\\
&\qquad- 2C_2 (1 + \gamma\delta) \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right| \left( \frac{d}{ds} \langle M, M\rangle(s) \right)^{1/2} ds
- 2(1 + \gamma\delta) \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right| |f(s, 0, 0)|\, ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle\\
&\quad\ge \left( \gamma - 2(C_1 + 1)(1 + \gamma\delta) \right) \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds
+ \left( \gamma\delta^2 + 2\delta \right) \int_t^T e^{\gamma s} \left| \widetilde f_\delta(s) \right|^2 ds\\
&\qquad- C_2^2 (1 + \gamma\delta) \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s)
- (1 + \gamma\delta) \int_t^T e^{\gamma s} |f(s, 0, 0)|^2\, ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle.
\end{aligned}
\tag{4.149}
\]

From (4.149) we infer the inequality:
\[
\begin{aligned}
&\left( \gamma - 2(C_1 + 1)(1 + \gamma\delta) \right) \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds
+ \left( \gamma\delta^2 + 2\delta \right) \int_t^T e^{\gamma s} \left| \widetilde f_\delta(s) \right|^2 ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle
+ e^{\gamma t} |Y_\delta(t)|^2\\
&\quad\le e^{\gamma T} |Y_\delta(T)|^2
+ (1 + \gamma\delta) \left( C_2^2 \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) + \int_t^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right).
\end{aligned}
\tag{4.150}
\]
From (4.150) we deduce
\[
\begin{aligned}
&\left( \gamma - 2(C_1 + 1)(1 + \gamma\delta) \right) \mathbb E\left[ \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds \right]
+ \left( \gamma\delta^2 + 2\delta \right) \mathbb E\left[ \int_t^T e^{\gamma s} \left| \widetilde f_\delta(s) \right|^2 ds \right]
+ e^{\gamma t}\, \mathbb E\left[ |Y_\delta(t)|^2 \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]
+ (1 + \gamma\delta) \left( C_2^2\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_t^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right] \right).
\end{aligned}
\tag{4.151}
\]
In particular from (4.151) we see
\[
\begin{aligned}
\mathbb E\left[ \int_t^T e^{\gamma s} \left| \widetilde Y_\delta(s) \right|^2 ds \right]
&\le \frac{e^{\gamma T}}{\gamma - 2(C_1 + 1)(1 + \gamma\delta)}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]
+ \frac{1 + \gamma\delta}{\gamma - 2(C_1 + 1)(1 + \gamma\delta)}\, C_2^2\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right]\\
&\qquad+ \frac{1 + \gamma\delta}{\gamma - 2(C_1 + 1)(1 + \gamma\delta)}\, \mathbb E\left[ \int_t^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right].
\end{aligned}
\tag{4.152}
\]

In addition, from (4.150) we obtain the following inequalities:
\[
\begin{aligned}
&\sup_{0 < t < T} e^{\gamma t} |Y_\delta(t)|^2
+ 2 \int_0^T e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle\\
&\quad\le e^{\gamma T} |Y_\delta(T)|^2
+ 2 \sup_{0 < t < T} \int_0^t e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle\\
&\qquad+ (1 + \gamma\delta) \left( C_2^2 \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) + \int_0^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right),
\end{aligned}
\tag{4.153}
\]
and hence, by using the Burkholder–Davis–Gundy inequality (4.79) for $p = \frac12$:
\[
\begin{aligned}
&\mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |Y_\delta(t)|^2 \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]
+ 2\, \mathbb E\left[ \sup_{0 < t < T} \int_0^t e^{\gamma s} \langle Y_\delta(s), dM_\delta(s)\rangle \right]\\
&\qquad+ (1 + \gamma\delta) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right] \right)\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]
+ 8\sqrt2\, \mathbb E\left[ \left( \int_0^T e^{2\gamma s} |Y_\delta(s)|^2\, d\langle M_\delta, M_\delta\rangle(s) \right)^{1/2} \right]\\
&\qquad+ (1 + \gamma\delta) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right] \right)\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]
+ \frac12\, \mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |Y_\delta(t)|^2 \right]
+ 64\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M_\delta, M_\delta\rangle(s) \right]\\
&\qquad+ (1 + \gamma\delta) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right] \right).
\end{aligned}
\tag{4.154}
\]

From (4.151) and (4.154) we obtain
\[
\begin{aligned}
\mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |Y_\delta(t)|^2 \right]
&\le 130\, e^{\gamma T}\, \mathbb E\left[ |Y_\delta(T)|^2 \right]\\
&\qquad+ 130\, (1 + \gamma\delta) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} |f(s, 0, 0)|^2\, ds \right] \right).
\end{aligned}
\tag{4.155}
\]
(In order to justify the passage from (4.153) to (4.155), as in passing from inequality (4.111) to (4.114), a stopping time argument might be required.)
Next we notice that
\[
\left| \widetilde f_\delta(s) \right|^2
\le 2 \left| \overline f(s) \right|^2 + 2K^2 \left| \widetilde Y_\delta(s) \right|^2 + 2C_2^2\, \frac{d}{ds} \langle M, M\rangle(s),
\tag{4.156}
\]
and hence
\[
\begin{aligned}
2 \left\langle \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s),\, \Delta \widetilde f(s) \right\rangle
&\ge -2 |\delta_2 - \delta_1| \left( \left| \widetilde f_{\delta_2}(s) \right|^2 + \left| \widetilde f_{\delta_1}(s) \right|^2 \right)\\
&\ge -4 |\delta_2 - \delta_1| \left( \left| \overline f(s) \right|^2 + K^2 \left| \widetilde Y_{\delta_2}(s) \right|^2 + K^2 \left| \widetilde Y_{\delta_1}(s) \right|^2 + C_2^2\, \frac{d}{ds} \langle M, M\rangle(s) \right).
\end{aligned}
\tag{4.157}
\]
In a similar manner we also get
\[
\left| \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s) \right|^2
\le 4 \left( \delta_2^2 + \delta_1^2 \right) \left( \left| \overline f(s) \right|^2 + K^2 \left| \widetilde Y_{\delta_2}(s) \right|^2 + K^2 \left| \widetilde Y_{\delta_1}(s) \right|^2 + C_2^2\, \frac{d}{ds} \langle M, M\rangle(s) \right).
\tag{4.158}
\]
Fix $\gamma > 0$, and apply Itô's lemma to the process $t \mapsto e^{\gamma t} |\Delta Y(t)|^2$ to obtain
\[
\begin{aligned}
&e^{\gamma T} |\Delta Y(T)|^2 - e^{\gamma t} |\Delta Y(t)|^2\\
&\quad= \gamma \int_t^T e^{\gamma s} |\Delta Y(s)|^2\, ds
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta Y(s)\rangle
+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)\\
&\quad= \gamma \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) - \delta_2 \widetilde f_{\delta_2}(s) + \delta_1 \widetilde f_{\delta_1}(s) \right|^2 ds
- 2 \int_t^T e^{\gamma s} \left\langle \Delta \widetilde Y(s), \Delta \widetilde f(s) \right\rangle ds\\
&\qquad- 2 \int_t^T e^{\gamma s} \left\langle \Delta Y(s) - \Delta \widetilde Y(s), \Delta \widetilde f(s) \right\rangle ds
+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle\\
&\quad= \gamma \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds
+ \gamma \int_t^T e^{\gamma s} \left| \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s) \right|^2 ds
- 2 \int_t^T e^{\gamma s} \left\langle \Delta \widetilde Y(s), \Delta \widetilde f(s) \right\rangle ds\\
&\qquad- 2\gamma \int_t^T e^{\gamma s} \left\langle \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s),\, \Delta \widetilde Y(s) \right\rangle ds
+ 2 \int_t^T e^{\gamma s} \left\langle \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s),\, \Delta \widetilde f(s) \right\rangle ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle.
\end{aligned}
\tag{4.159}
\]

Employing the inequalities (4.123), (4.157), (4.158), and an elementary one like
\[
2 |\langle y_1, y_2\rangle| \le (a + 1) |y_1|^2 + (a + 1)^{-1} |y_2|^2,
\qquad y_1, y_2 \in \mathbb R^k,\ a > 0,
\tag{4.160}
\]
together with (4.159), we obtain
\[
\begin{aligned}
&e^{\gamma T} |\Delta Y(T)|^2 - e^{\gamma t} |\Delta Y(t)|^2\\
&\quad\ge \left( \gamma - 2C_1 - \frac{\gamma}{a + 1} \right) \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds
- a\gamma \int_t^T e^{\gamma s} \left| \delta_2 \widetilde f_{\delta_2}(s) - \delta_1 \widetilde f_{\delta_1}(s) \right|^2 ds\\
&\qquad- 8\gamma |\delta_2 - \delta_1| \left( \int_t^T e^{\gamma s} \left| \overline f(s) \right|^2 ds + C_2^2 \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right)
- 8\gamma K^2 |\delta_2 - \delta_1| \int_t^T e^{\gamma s} \left( \left| \widetilde Y_{\delta_1}(s) \right|^2 + \left| \widetilde Y_{\delta_2}(s) \right|^2 \right) ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle\\
&\quad\ge \left( \gamma - 2C_1 - \frac{\gamma}{a + 1} \right) \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds\\
&\qquad- 4\gamma \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \left( \int_t^T e^{\gamma s} \left| \overline f(s) \right|^2 ds + C_2^2 \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right)\\
&\qquad- 4\gamma K^2 \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \int_t^T e^{\gamma s} \left( \left| \widetilde Y_{\delta_1}(s) \right|^2 + \left| \widetilde Y_{\delta_2}(s) \right|^2 \right) ds\\
&\qquad+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle.
\end{aligned}
\tag{4.161}
\]

From (4.161) we obtain
\[
\begin{aligned}
&\left( \frac{\gamma a}{a + 1} - 2C_1 \right) \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds
+ e^{\gamma t} |\Delta Y(t)|^2
+ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s)
+ 2 \int_t^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle\\
&\quad\le e^{\gamma T} |\Delta Y(T)|^2
+ 4\gamma \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \left( \int_t^T e^{\gamma s} \left| \overline f(s) \right|^2 ds + C_2^2 \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right)\\
&\qquad+ 4\gamma K^2 \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \int_t^T e^{\gamma s} \left( \left| \widetilde Y_{\delta_1}(s) \right|^2 + \left| \widetilde Y_{\delta_2}(s) \right|^2 \right) ds.
\end{aligned}
\tag{4.162}
\]

From (4.152) and (4.162) we infer
\[
\begin{aligned}
&\left( \frac{\gamma a}{a + 1} - 2C_1 \right) \mathbb E\left[ \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds \right]
+ e^{\gamma t}\, \mathbb E\left[ |\Delta Y(t)|^2 \right]
+ \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s) \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ \gamma_1(\delta_1, \delta_2)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_1}(T)|^2 \right]
+ \gamma_1(\delta_2, \delta_1)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_2}(T)|^2 \right]\\
&\qquad+ \gamma_2(\delta_1, \delta_2) \left( \mathbb E\left[ \int_t^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] + C_2^2\, \mathbb E\left[ \int_t^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] \right),
\end{aligned}
\tag{4.163}
\]
where
\[
\begin{aligned}
\gamma_1(\delta_1, \delta_2) &= 4\gamma K^2 \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \frac{1}{\gamma - 2(C_1 + 1)(1 + \gamma\delta_1)};\\
\gamma_2(\delta_1, \delta_2) &= 4\gamma \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right)
\left( 1 + \frac{K^2 (1 + \gamma\delta_1)}{\gamma - 2(C_1 + 1)(1 + \gamma\delta_1)} + \frac{K^2 (1 + \gamma\delta_2)}{\gamma - 2(C_1 + 1)(1 + \gamma\delta_2)} \right).
\end{aligned}
\tag{4.164}
\]

From (4.162) we also get:
\[
\begin{aligned}
&\sup_{0 < t < T} \left( e^{\gamma t} |\Delta Y(t)|^2 \right)
+ 2 \int_0^T e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle\\
&\quad\le e^{\gamma T} |\Delta Y(T)|^2
+ 4\gamma \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \left( \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds + C_2^2 \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right)\\
&\qquad+ 4\gamma K^2 \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \int_0^T e^{\gamma s} \left( \left| \widetilde Y_{\delta_1}(s) \right|^2 + \left| \widetilde Y_{\delta_2}(s) \right|^2 \right) ds\\
&\qquad+ 2 \sup_{0 < t < T} \int_0^t e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle.
\end{aligned}
\tag{4.165}
\]

In what follows a stopping time argument might be required. From (4.165), (4.152), the Burkholder–Davis–Gundy inequality (4.79) for $p = \frac12$, and (4.163) with $t = 0$ we obtain:
\[
\begin{aligned}
&\mathbb E\left[ \sup_{0 < t < T} \left( e^{\gamma t} |\Delta Y(t)|^2 \right) \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ 4\gamma \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \left( \mathbb E\left[ \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] + C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] \right)\\
&\qquad+ 4\gamma K^2 \left( 2 |\delta_2 - \delta_1| + a \left( \delta_1^2 + \delta_2^2 \right) \right) \mathbb E\left[ \int_0^T e^{\gamma s} \left( \left| \widetilde Y_{\delta_1}(s) \right|^2 + \left| \widetilde Y_{\delta_2}(s) \right|^2 \right) ds \right]
+ 2\, \mathbb E\left[ \sup_{0 < t < T} \int_0^t e^{\gamma s} \langle \Delta Y(s), d\Delta M(s)\rangle \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ \gamma_1(\delta_1, \delta_2)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_1}(T)|^2 \right]
+ \gamma_1(\delta_2, \delta_1)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_2}(T)|^2 \right]\\
&\qquad+ \gamma_2(\delta_1, \delta_2) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] \right)\\
&\qquad+ 8\sqrt2\, \mathbb E\left[ \sup_{0 < t < T} e^{\frac12 \gamma t} |\Delta Y(t)| \left( \int_0^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s) \right)^{1/2} \right]\\
&\quad\le e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ \gamma_1(\delta_1, \delta_2)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_1}(T)|^2 \right]
+ \gamma_1(\delta_2, \delta_1)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_2}(T)|^2 \right]\\
&\qquad+ \gamma_2(\delta_1, \delta_2) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] \right)\\
&\qquad+ \frac12\, \mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |\Delta Y(t)|^2 \right]
+ 64\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s) \right].
\end{aligned}
\tag{4.166}
\]

Consequently, from (4.163) and (4.166) we deduce, as in the proof of inequality (4.155),
\[
\begin{aligned}
\mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |\Delta Y(t)|^2 \right]
&\le 130\, e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ 130 \left( \gamma_1(\delta_1, \delta_2)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_1}(T)|^2 \right] + \gamma_1(\delta_2, \delta_1)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_2}(T)|^2 \right] \right)\\
&\qquad+ 130\, \gamma_2(\delta_1, \delta_2) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] \right).
\end{aligned}
\tag{4.167}
\]
(Again it is noticed that the passage from (4.165) to (4.167) is justified by a stopping time argument; the same argument was used several times, the first time in passing from inequality (4.111) to (4.114).) Another appeal to (4.163) and (4.167) shows:
\[
\begin{aligned}
&\left( \frac{\gamma a}{a + 1} - 2C_1 \right) \mathbb E\left[ \int_t^T e^{\gamma s} \left| \Delta \widetilde Y(s) \right|^2 ds \right]
+ \mathbb E\left[ \sup_{0 < t < T} e^{\gamma t} |\Delta Y(t)|^2 \right]
+ \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle \Delta M, \Delta M\rangle(s) \right]\\
&\quad\le 131\, e^{\gamma T}\, \mathbb E\left[ |\Delta Y(T)|^2 \right]
+ 131 \left( \gamma_1(\delta_1, \delta_2)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_1}(T)|^2 \right] + \gamma_1(\delta_2, \delta_1)\, e^{\gamma T}\, \mathbb E\left[ |Y_{\delta_2}(T)|^2 \right] \right)\\
&\qquad+ 131\, \gamma_2(\delta_1, \delta_2) \left( C_2^2\, \mathbb E\left[ \int_0^T e^{\gamma s}\, d\langle M, M\rangle(s) \right] + \mathbb E\left[ \int_0^T e^{\gamma s} \left| \overline f(s) \right|^2 ds \right] \right).
\end{aligned}
\tag{4.168}
\]

The result in Proposition 4.39 now follows from (4.168) and the continuity of the functions $y \mapsto f(s, y, Z_M(s))$, $y \in \mathbb R^k$. The fact that the convergence of the family $(Y_\delta, M_\delta)$, $0 < \delta \le (4C_1 + 4)^{-1}$, is of order $\delta$ as $\delta \downarrow 0$ follows from the choice of our parameters: $\gamma = 4C_1 + 4$ and $a = (\delta_1 + \delta_2)^{-1}$.

Proof (Proof of Theorem 4.34). The proof of the uniqueness part follows from Corollary 4.32. The existence is a consequence of Theorem 4.33, Proposition 4.39 and Corollary 4.37.

The following result shows that in the monotonicity condition we may always assume that the constant $C_1$ can be chosen as we like, provided we replace the equation in (4.103) by (4.169) and adapt its solution accordingly.

Theorem 4.40. Let the pair $(Y, M)$ belong to $S^2([0,T], \mathbb R^k) \times M^2([0,T], \mathbb R^k)$. Fix $\lambda \in \mathbb R$, and put
\[
(Y_\lambda(t), M_\lambda(t)) = \left( e^{\lambda t} Y(t),\ Y(0) + \int_0^t e^{\lambda s}\, dM(s) \right).
\]
Then the pair $(Y_\lambda, M_\lambda)$ belongs to $S^2 \times M^2$. Moreover, the following assertions are equivalent:

(i) The pair $(Y, M) \in S^2 \times M^2$ satisfies $Y(0) = M(0)$ and
\[
Y(t) = Y(T) + \int_t^T f(s, Y(s), Z_M(s))\, ds + M(t) - M(T).
\]
(ii) The pair $(Y_\lambda, M_\lambda)$ satisfies $Y_\lambda(0) = M_\lambda(0)$ and
\[
Y_\lambda(t) = Y_\lambda(T) + \int_t^T e^{\lambda s} f\left( s, e^{-\lambda s} Y_\lambda(s), e^{-\lambda s} Z_{M_\lambda}(s) \right) ds
- \lambda \int_t^T Y_\lambda(s)\, ds + M_\lambda(t) - M_\lambda(T).
\tag{4.169}
\]

Remark 4.41. Put $f_\lambda(s, y, z) = e^{\lambda s} f\left( s, e^{-\lambda s} y, e^{-\lambda s} z \right) - \lambda y$. If the function $y \mapsto f(s, y, z)$ has monotonicity constant $C_1$, then the function $y \mapsto f_\lambda(s, y, z)$ has monotonicity constant $C_1 - \lambda$. It follows that by reformulating the problem one may always assume that the monotonicity constant is $0$.

Proof (Proof of Theorem 4.40). First notice the equality $e^{-\lambda s} Z_{M_\lambda}(s) = Z_M(s)$: see Remark 4.12. The equivalence of (i) and (ii) follows by considering the equalities in (i) and (ii) in differential form.
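The constant shift claimed in Remark 4.41 can be verified directly from the definitions; the short computation below is ours and uses only (4.123) for $f$:
\[
\begin{aligned}
\langle y - u,\, f_\lambda(s, y, z) - f_\lambda(s, u, z)\rangle
&= e^{\lambda s} \left\langle y - u,\, f\left( s, e^{-\lambda s} y, e^{-\lambda s} z \right) - f\left( s, e^{-\lambda s} u, e^{-\lambda s} z \right) \right\rangle - \lambda |y - u|^2\\
&= e^{2\lambda s} \left\langle e^{-\lambda s}(y - u),\, f\left( s, e^{-\lambda s} y, e^{-\lambda s} z \right) - f\left( s, e^{-\lambda s} u, e^{-\lambda s} z \right) \right\rangle - \lambda |y - u|^2\\
&\le C_1\, e^{2\lambda s} \left| e^{-\lambda s}(y - u) \right|^2 - \lambda |y - u|^2
= (C_1 - \lambda)\, |y - u|^2.
\end{aligned}
\]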

4.4 Backward stochastic differential equations and Markov processes

In this section the coefficient $f$ of our BSDE is a mapping from $[0,T] \times E \times \mathbb R^k \times (M^2)^*$ to $\mathbb R^k$. Theorem 4.42 below is the analogue of Theorem 4.34 with a Markov family of measures $\{\mathbb P_{\tau,x} : (\tau, x) \in [0,T] \times E\}$ instead of a single measure. Put
\[
f_n(s) = f\left( s, X(s), Y_n(s), Z_{M_n}(s) \right),
\]
and suppose that the processes $Y_n(s)$ and $Z_{M_n}(s)$ depend only on the space-time variable $(s, X(s))$. Put $Y(\tau, t) g(x) = \mathbb E_{\tau,x}[g(X(t))]$, $g \in C_b(E)$, and suppose that for every $g \in C_b(E)$ the function $(\tau, x, t) \mapsto Y(\tau, t) g(x)$ is continuous on the set $\{(\tau, x, t) \in [0,T] \times E \times [0,T] : 0 \le \tau \le t \le T\}$. Then it can be proved that the Markov process
\[
\left\{ \left( \Omega, \mathcal F_T^\tau, \mathbb P_{\tau,x} \right),\ (X(t) : T \ge t \ge 0),\ (E, \mathcal E) \right\}
\tag{4.170}
\]
has left limits and is right-continuous: see e.g. item (a) in Theorem 1.39. Theorem 2.22 in [95] contains a similar result in case the state space $E$ is locally compact and second countable. Suppose that the $\mathbb P_{\tau,x}$-martingale $t \mapsto N(t) - N(\tau)$, $t \in [\tau, T]$, belongs to the space $M^2\left( [\tau, T], \mathbb P_{\tau,x}, \mathbb R^k \right)$ (see Definition 4.19). It follows that the quantity $Z_M(s)(N)$ is measurable with respect to $\sigma\left( \mathcal F_{s+}^s, N(s+) \right)$: see the equalities (4.174), (4.175) and (4.176) below. The following iteration formulas play an important role:
\[
\begin{aligned}
Y_{n+1}(t) &= \mathbb E_{t,X(t)}[\xi] + \int_t^T \mathbb E_{t,X(t)}[f_n(s)]\, ds,\\
M_{n+1}(t) &= \mathbb E_{t,X(t)}[\xi] + \int_0^t f_n(s)\, ds + \int_t^T \mathbb E_{t,X(t)}[f_n(s)]\, ds.
\end{aligned}
\]
Then the processes $Y_{n+1}$ and $M_{n+1}$ are related as follows:
\[
Y_{n+1}(T) + \int_t^T f_n(s)\, ds + M_{n+1}(t) - M_{n+1}(T) = Y_{n+1}(t).
\]

Moreover, by the Markov property, the process
\[
\begin{aligned}
t \mapsto M_{n+1}(t) - M_{n+1}(\tau)
&= \mathbb E_{\tau,X(\tau)}\left[ \xi \mid \mathcal F_t^\tau \right] - \mathbb E_{\tau,X(\tau)}[\xi]
+ \mathbb E_{\tau,X(\tau)}\left[ \int_\tau^T f_n(s)\, ds \,\Big|\, \mathcal F_t^\tau \right]
- \mathbb E_{\tau,X(\tau)}\left[ \int_\tau^T f_n(s)\, ds \right]\\
&= \mathbb E_{\tau,X(\tau)}\left[ \xi + \int_\tau^T f_n(s)\, ds \,\Big|\, \mathcal F_t^\tau \right]
- \mathbb E_{\tau,X(\tau)}\left[ \xi + \int_\tau^T f_n(s)\, ds \right]
\end{aligned}
\]
is a $\mathbb P_{\tau,x}$-martingale on the interval $[\tau, T]$ for every $(\tau, x) \in [0,T] \times E$.


In Theorem 4.42 below we replace the Lipschitz condition (4.101) of Theorem 4.33 for the function $Y(s) \mapsto f(s, Y(s), Z_M(s))$ with the (weaker) monotonicity condition (4.179) for the function $Y(s) \mapsto f(s, X(s), Y(s), Z_M(s))$. Sometimes we write $y$ for the variable $Y(s)$ and $z$ for $Z_M(s)$. Notice that the functional $Z_{M_n}(t)$ only depends on $\mathcal F_{t+}^t := \bigcap_{h : T \ge t + h > t} \sigma(X(t + h))$, and that this $\sigma$-field belongs to the $\mathbb P_{t,x}$-completion of $\sigma(X(t))$ for every $x \in E$. This is the case because, by assumption, the process $s \mapsto X(s)$ is right-continuous at $s = t$: see Proposition 4.7. In order to show this we have to prove equalities of the following type:
\[
\mathbb E_{s,x}\left[ Y \mid \mathcal F_{t+}^s \right] = \mathbb E_{t,X(t)}[Y], \quad \mathbb P_{s,x}\text{-almost surely},
\tag{4.171}
\]
for all bounded stochastic variables $Y$ which are $\mathcal F_T^t$-measurable. By the monotone class theorem and density arguments the proof of (4.171) reduces to showing these equalities for $Y = \prod_{j=1}^n f_j(t_j, X(t_j))$, where $t = t_1 < t_2 < \cdots < t_n \le T$, and the functions $x \mapsto f_j(t_j, x)$, $1 \le j \le n$, belong to the space $C_b(E)$. So we consider
\[
\begin{aligned}
\mathbb E_{s,x}\left[ \prod_{j=1}^n f_j(t_j, X(t_j)) \,\Big|\, \mathcal F_{t+}^s \right]
&= f_1(t, X(t))\, \mathbb E_{s,x}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \,\Big|\, \mathcal F_{t+}^s \right]\\
&= f_1(t, X(t)) \lim_{h \downarrow 0,\, 0 < h < t_2 - t} \mathbb E_{s,x}\left[ \mathbb E_{s,x}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \,\Big|\, \mathcal F_{t+h}^s \right] \,\Big|\, \mathcal F_{t+}^s \right]\\
&= f_1(t, X(t)) \lim_{h \downarrow 0,\, 0 < h < t_2 - t} \mathbb E_{s,x}\left[ \mathbb E_{t+h, X(t+h)}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \right] \,\Big|\, \mathcal F_{t+}^s \right]
\end{aligned}
\]
(the function $\rho \mapsto \mathbb E_{\rho, X(\rho)}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \right]$ is right-continuous)
\[
\begin{aligned}
&= f_1(t, X(t))\, \mathbb E_{s,x}\left[ \mathbb E_{t, X(t)}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \right] \,\Big|\, \mathcal F_{t+}^s \right]
= f_1(t, X(t))\, \mathbb E_{t, X(t)}\left[ \prod_{j=2}^n f_j(t_j, X(t_j)) \right]\\
&= \mathbb E_{t, X(t)}\left[ \prod_{j=1}^n f_j(t_j, X(t_j)) \right], \quad \mathbb P_{s,x}\text{-almost surely}.
\end{aligned}
\tag{4.172}
\]

Next suppose that the bounded stochastic variable $Y$ is measurable with respect to $\mathcal F_{t+}^t$. From (4.171) with $s = t$ it follows that $Y = \mathbb E_{t,X(t)}[Y]$, $\mathbb P_{t,x}$-almost surely. Hence such a variable $Y$ only depends on the space-time variable $(t, X(t))$. Since $X(t) = x$ $\mathbb P_{t,x}$-almost surely, it follows that the variable $\mathbb E_{t,x}\left[ Y \mid \mathcal F_{t+}^t \right]$ is $\mathbb P_{t,x}$-almost surely equal to the deterministic constant $\mathbb E_{t,x}[Y]$. A similar argument shows the following result. Let $0 \le s < t \le T$, and let $Y$ be a bounded $\mathcal F_T^s$-measurable stochastic variable. Then the following equality holds $\mathbb P_{s,x}$-almost surely:
\[
\mathbb E_{s,x}\left[ Y \mid \mathcal F_{t+}^s \right] = \mathbb E_{s,x}\left[ Y \mid \mathcal F_t^s \right].
\tag{4.173}
\]
In particular it follows that an $\mathcal F_{t+}^s$-measurable bounded stochastic variable coincides with the $\mathcal F_t^s$-measurable variable $\mathbb E_{s,x}\left[ Y \mid \mathcal F_t^s \right]$ $\mathbb P_{s,x}$-almost surely for all $x \in E$. Hence (4.173) implies that the $\sigma$-field $\mathcal F_{t+}^s$ is contained in the $\mathbb P_{s,x}$-completion of the $\sigma$-field $\mathcal F_t^s$.
In addition, notice that the functional $Z_M(s)$ is defined by
\[
Z_M(s)(N) = \lim_{t \downarrow s} \frac{\langle M, N\rangle(t) - \langle M, N\rangle(s)}{t - s},
\tag{4.174}
\]
where
\[
\langle M, N\rangle(t) - \langle M, N\rangle(s)
= \lim_{n \to \infty} \sum_{j=0}^{2^n - 1} \left( M(t_{j+1,n}) - M(t_{j,n}) \right) \left( N(t_{j+1,n}) - N(t_{j,n}) \right).
\tag{4.175}
\]
For this the reader is referred to the Remarks 4.12, 4.13, 4.15, and to formula (4.76). The symbol $t_{j,n}$ represents the real number $t_{j,n} = s + j 2^{-n}(t - s)$. The limit in (4.175) exists $\mathbb P_{\tau,x}$-almost surely for all $\tau \in [0, s]$. As a consequence the process $Z_M(s)$ is $\mathcal F_{s+}^\tau$-measurable for all $\tau \in [0, s]$. It follows that the process $N \mapsto Z_M(s)(N)$ is $\mathbb P_{\tau,x}$-almost surely equal to the functional $N \mapsto \mathbb E_{\tau,x}\left[ Z_M(s)(N) \mid \sigma\left( \mathcal F_s^\tau, N(s) \right) \right]$, provided that $Z_M(s)(N)$ is $\sigma\left( \mathcal F_{s+}^\tau, N(s+) \right)$-measurable. If the martingale $M$ is of the form $M(s) = u(s, X(s)) + \int_0^s f(\rho)\, d\rho$, then the functional $Z_M(s)(N)$ is automatically $\sigma\left( \mathcal F_{s+}^s, N(s+) \right)$-measurable. It follows that, for every $\tau \in [0, s]$, the following equality holds $\mathbb P_{\tau,x}$-almost surely:
\[
\mathbb E_{\tau,x}\left[ Z_M(s)(N) \mid \sigma\left( \mathcal F_{s+}^\tau, N(s+) \right) \right]
= \mathbb E_{\tau,x}\left[ Z_M(s)(N) \mid \sigma\left( \mathcal F_s^\tau, N(s+) \right) \right].
\tag{4.176}
\]
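The dyadic approximation (4.175) can be illustrated for discrete-time martingales. In the toy setup below (ours, not from the text) $M$ is a simple $\pm 1$ random walk sampled at the dyadic points and $N = 2M$, so every approximating sum equals exactly twice the quadratic variation of $M$, in line with the bilinearity of $\langle\cdot,\cdot\rangle$.

```python
import random

random.seed(7)
n = 10                                                   # dyadic level: 2**n subintervals
eps = [random.choice([-1, 1]) for _ in range(2 ** n)]    # martingale increments

# M(t_{j,n}) as partial sums; N = 2*M, so <M,N> should be twice <M,M>
M = [0] * (2 ** n + 1)
for j in range(2 ** n):
    M[j + 1] = M[j] + eps[j]
N = [2 * m for m in M]

# the dyadic sum appearing in (4.175)
cov = sum((M[j + 1] - M[j]) * (N[j + 1] - N[j]) for j in range(2 ** n))
quad = sum((M[j + 1] - M[j]) ** 2 for j in range(2 ** n))
assert cov == 2 * quad == 2 * 2 ** n   # each squared increment equals 1
```

Here the identity is exact at every level $n$ because the increments are $\pm 1$; for continuous-time martingales the sums only converge almost surely as $n \to \infty$, which is the content of (4.175).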
Moreover, in the next Theorem 4.42 the filtered probability space
\[
\Bigl(\Omega, \mathcal{F}, \bigl(\mathcal{F}^0_t\bigr)_{t\in[0,T]}, P\Bigr)
\]
is replaced with a Markov family of measures
\[
\Bigl(\Omega, \mathcal{F}^\tau_T, \bigl(\mathcal{F}^\tau_t\bigr)_{\tau \le t \le T}, P_{\tau,x}\Bigr), \quad (\tau, x) \in [0,T] \times E.
\]
Its proof follows the lines of the proof of Theorem 4.34 and will not be repeated here. Relevant inequalities which play a dominant role are the following ones: (4.114), (4.122), (4.155), and (4.168). In these inequalities the measure $P_{\tau,x}$ replaces $P$, and the coefficient $f(s, Y(s), Z_M(s))$ is replaced with $f(s, X(s), Y(s), Z_M(s))$. Then (4.177), which is the same as (4.114), is satisfied, and with $\alpha = 1 + C_1^2 + C_2^2$ the following inequalities play a dominant role for the sequence $(Y_n, M_n)$:
\begin{align}
E_{\tau,x}\Bigl[\sup_{\tau < t < T} e^{2\alpha t} \left|Y_{n+1}(t)\right|^2\Bigr]
&\le 130\,e^{2\alpha T} E_{\tau,x}\Bigl[\left|Y_{n+1}(T)\right|^2\Bigr]
+ 130\,E_{\tau,x}\Bigl[\int_\tau^T e^{2\alpha s} \left|f(s,0,0)\right|^2 ds\Bigr] \notag\\
&\quad + 65\,E_{\tau,x}\Bigl[\int_\tau^T e^{2\alpha s}\, d\langle M_n, M_n\rangle(s)\Bigr]
+ 65\,E_{\tau,x}\Bigl[\int_\tau^T e^{2\alpha s} \left|Y_n(s)\right|^2 ds\Bigr] < \infty, \tag{4.177}
\end{align}

and
\begin{align}
&E_{\tau,x}\Bigl[\sup_{\tau \le t \le T} e^{2\alpha t} \left|Y_{n+1}(t) - Y_n(t)\right|^2\Bigr]
+ E_{\tau,x}\Bigl[\int_\tau^T e^{2\alpha s}\, d\langle M_{n+1} - M_n, M_{n+1} - M_n\rangle(s)\Bigr] \notag\\
&\quad \le 131\,e^{2\alpha T} E_{\tau,x}\Bigl[\left|Y_{n+1}(T) - Y_n(T)\right|^2\Bigr]
+ \frac{131}{2} \left\|\begin{pmatrix} Y_n - Y_{n-1} \\ M_n - M_{n-1} \end{pmatrix}\right\|^2_{\tau,x,\alpha}. \tag{4.178}
\end{align}

Compare these inequalities with (4.114) and (4.178). The inequality in (4.178) only plays a direct role in case we are dealing with a Lipschitz continuous generator $f$. In case the generator $f$ is only monotone (or one-sided Lipschitz) in the variable $y$, we need Propositions 4.35, 4.36, 4.39, and Corollary 4.37.

The norm $\left\|\begin{pmatrix} Y \\ M \end{pmatrix}\right\|_{\tau,x,\alpha}$ is defined by:
\[
\left\|\begin{pmatrix} Y \\ M \end{pmatrix}\right\|^2_{\tau,x,\alpha}
= E_{\tau,x}\Bigl[\int_\tau^T e^{2\alpha s} \left|Y(s)\right|^2 ds + \int_\tau^T e^{2\alpha s}\, d\langle M, M\rangle(s)\Bigr].
\]

A proof of these inequalities can be found in [245] and in the proof of Theorem
4.33 in the present Chapter 4.
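The Picard-type iteration behind these estimates can be illustrated in the simplest possible setting. The sketch below works under strong simplifying assumptions that are not in the text: there is no randomness, so the martingale part vanishes and conditional expectations are trivial, and the generator $f(s, y) = -y + s$ is an illustrative choice. It iterates $Y_{n+1}(t) = \xi + \int_t^T f(s, Y_n(s))\,ds$ and exhibits the rapid decay of successive differences that the contraction estimates encode.

```python
import numpy as np

# Picard iteration for the deterministic analogue of the BSDE
#   Y(t) = xi + \int_t^T f(s, Y(s)) ds,
# with the illustrative Lipschitz generator f(s, y) = -y + s.
T, xi = 1.0, 0.5
tgrid = np.linspace(0.0, T, 2001)

def generator(s, y):
    return -y + s

Y = np.zeros_like(tgrid)                     # Y_0 = 0
sup_diffs = []
for n in range(25):
    g = generator(tgrid, Y)
    inc = 0.5 * (g[1:] + g[:-1]) * np.diff(tgrid)   # trapezoidal increments
    head = np.concatenate([[0.0], np.cumsum(inc)])  # \int_0^t g ds
    Y_next = xi + (head[-1] - head)                 # xi + \int_t^T g ds
    sup_diffs.append(np.max(np.abs(Y_next - Y)))
    Y = Y_next

# The exact solution of Y'(t) = Y(t) - t, Y(T) = xi, is
# Y(t) = (xi - T - 1) e^{t-T} + t + 1.
exact = (xi - T - 1.0) * np.exp(tgrid - T) + tgrid + 1.0
print(sup_diffs[0], sup_diffs[-1], np.max(np.abs(Y - exact)))
```

In this deterministic toy case the successive differences decay like $T^n/n!$, which is the classical Picard-Lindelöf rate; the weighted norms above play the analogous role for the full stochastic iteration.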
\textbf{Theorem 4.42.} Let $f : [0,T] \times E \times \mathbb{R}^k \times \bigl(M^2\bigr)^* \to \mathbb{R}^k$ be monotone in the variable $y$ and Lipschitz in $z$. More precisely, suppose that there exist finite constants $C_1$ and $C_2$ such that for any two pairs of processes $(Y, M)$ and $(U, N) \in S^2\bigl([0,T], \mathbb{R}^k\bigr) \times M^2\bigl([0,T], \mathbb{R}^k\bigr)$ the following inequalities hold for all $0 \le s \le T$:
\begin{align}
\langle Y(s) - U(s),\, f(s, X(s), Y(s), Z_M(s)) - f(s, X(s), U(s), Z_M(s))\rangle
&\le C_1 \left|Y(s) - U(s)\right|^2, \tag{4.179}\\
\left|f(s, X(s), Y(s), Z_M(s)) - f(s, X(s), Y(s), Z_N(s))\right|
&\le C_2 \Bigl(\frac{d}{ds}\langle M - N, M - N\rangle(s)\Bigr)^{1/2}, \tag{4.180}
\end{align}
and
\[
\left|f(s, X(s), Y(s), 0)\right| \le \overline{f}(s, X(s)) + K\left|Y(s)\right|. \tag{4.181}
\]
Fix $(\tau, x) \in [0,T] \times E$ and let $Y(T) = \xi \in L^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr)$ be given. In addition, suppose $E_{\tau,x}\bigl[\int_\tau^T \bigl|\overline{f}(s, X(s))\bigr|^2 ds\bigr] < \infty$. Then there exists a unique pair
\[
(Y, M) \in S^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr) \times M^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr)
\]
with $Y(\tau) = M(\tau)$ such that
\[
Y(t) = \xi + \int_t^T f(s, X(s), Y(s), Z_M(s))\, ds + M(t) - M(T). \tag{4.182}
\]
Next let $\xi = E_{T,X(T)}[\xi] \in \bigcap_{(\tau,x) \in [0,T] \times E} L^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}\bigr)$ be given. Suppose that the functions $(\tau, x) \mapsto E_{\tau,x}\bigl[|\xi|^2\bigr]$ and $(\tau, x) \mapsto E_{\tau,x}\bigl[\int_\tau^T \bigl|\overline{f}(s, X(s))\bigr|^2 ds\bigr]$ are locally bounded. Then there exists a unique pair
\[
(Y, M) \in S^2_{\mathrm{loc,unif}}\bigl([\tau,T], \mathbb{R}^k\bigr) \times M^2_{\mathrm{loc,unif}}\bigl([\tau,T], \mathbb{R}^k\bigr)
\]
with $Y(0) = M(0)$ such that equation (4.182) is satisfied.

Again let $\xi = E_{T,X(T)}[\xi] \in \bigcap_{(\tau,x) \in [0,T] \times E} L^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}\bigr)$ be given. Suppose that the functions
\[
(\tau, x) \mapsto E_{\tau,x}\bigl[|\xi|^2\bigr] \quad\text{and}\quad (\tau, x) \mapsto E_{\tau,x}\Bigl[\int_\tau^T \bigl|\overline{f}(s, X(s))\bigr|^2 ds\Bigr]
\]
are uniformly bounded. Then there exists a unique pair
\[
(Y, M) \in S^2_{\mathrm{unif}}\bigl([\tau,T], \mathbb{R}^k\bigr) \times M^2_{\mathrm{unif}}\bigl([\tau,T], \mathbb{R}^k\bigr)
\]
with $Y(0) = M(0)$ such that equation (4.182) is satisfied.


The notations
\begin{align*}
S^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr) &= S^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr) \quad\text{and}\\
M^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr) &= M^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr)
\end{align*}
are explained in Definitions 4.18 and 4.19 respectively. The same is true for the notions
\begin{align*}
S^2_{\mathrm{loc,unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= S^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr),\\
M^2_{\mathrm{loc,unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= M^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr),\\
S^2_{\mathrm{unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= S^2_{\mathrm{unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr), \quad\text{and}\\
M^2_{\mathrm{unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= M^2_{\mathrm{unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr).
\end{align*}

The probability measure $P_{\tau,x}$ is defined on the $\sigma$-field $\mathcal{F}^\tau_T$. Since the existence properties of the solutions to backward stochastic equations are based on explicit inequalities, the proofs carry over to Markov families of measures. Ultimately these inequalities imply that boundedness and continuity properties of the function $(\tau, x) \mapsto E_{\tau,x}[Y(t)]$, $0 \le \tau \le t \le T$, depend on the continuity of the function $x \mapsto E_{T,x}[\xi]$, where $\xi$ is a terminal value function which is supposed to be $\sigma(X(T))$-measurable. In addition, in order to be sure that the function $(\tau, x) \mapsto E_{\tau,x}[Y(t)]$ is continuous, functions of the form $(\tau, x) \mapsto E_{\tau,x}[f(t, u(t, X(t)), Z_M(t))]$ have to be continuous, whenever the following mappings
\[
(\tau, x) \mapsto E_{\tau,x}\Bigl[\int_\tau^T |u(s, X(s))|^2\, ds\Bigr] \quad\text{and}\quad (\tau, x) \mapsto E_{\tau,x}\bigl[\langle M, M\rangle(T) - \langle M, M\rangle(\tau)\bigr]
\]
represent finite and continuous functions.


In the next example we see how the classical Feynman-Kac formula is related to backward stochastic differential equations.

\textbf{Example 4.43.} Suppose that the coefficient $f$ has the special form:
\[
f(t, x, r, z) = c(t, x)r + h(t, x),
\]
and that the process $s \mapsto X^{t,x}(s)$ is a solution to a stochastic differential equation:
\[
\begin{cases}
X^{t,x}(s) - X^{t,x}(t) = \displaystyle\int_t^s b\bigl(\tau, X^{t,x}(\tau)\bigr)\, d\tau + \int_t^s \sigma\bigl(\tau, X^{t,x}(\tau)\bigr)\, dW(\tau), & t \le s \le T;\\[1ex]
X^{t,x}(s) = x, & 0 \le s \le t.
\end{cases}
\]
In that case the BSDE is linear,
\[
Y^{t,x}(s) = g(X^{t,x}(T)) + \int_s^T \bigl[c(r, X^{t,x}(r)) Y^{t,x}(r) + h(r, X^{t,x}(r))\bigr]\, dr - \int_s^T Z^{t,x}(r)\, dW(r),
\]
and hence it has an explicit solution. From an extension of the classical ``variation of constants formula'' (see the argument in the proof of the comparison theorem 1.6 in Pardoux [181]) or by direct verification we get:
\begin{align*}
Y^{t,x}(s) &= g\bigl(X^{t,x}(T)\bigr) e^{\int_s^T c(r, X^{t,x}(r))\, dr}
+ \int_s^T h\bigl(r, X^{t,x}(r)\bigr) e^{\int_s^r c(\alpha, X^{t,x}(\alpha))\, d\alpha}\, dr\\
&\quad - \int_s^T e^{\int_s^r c(\alpha, X^{t,x}(\alpha))\, d\alpha} Z^{t,x}(r)\, dW(r).
\end{align*}

Now we have $Y^{t,x}(t) = E\bigl[Y^{t,x}(t)\bigr]$, so that
\[
Y^{t,x}(t) = E\Bigl[g(X^{t,x}(T))\, e^{\int_t^T c(s, X^{t,x}(s))\, ds} + \int_t^T h\bigl(s, X^{t,x}(s)\bigr) e^{\int_t^s c(r, X^{t,x}(r))\, dr}\, ds\Bigr],
\]
which is the well-known Feynman-Kac formula. Clearly, solutions to backward stochastic differential equations can be used to represent solutions to classical differential equations of parabolic type, and as such they can be considered as a nonlinear extension of the Feynman-Kac formula.
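The Feynman-Kac representation above is easy to test by Monte Carlo simulation. The sketch below makes simplifying assumptions that are not in the example: $b = 0$, $\sigma = 1$ (so $X^{t,x}$ is a Brownian motion started at $x$), constant coefficients $c(t, x) = c$ and $h(t, x) = h$, and the linear terminal function $g(y) = y$, for which the expectation has a closed form.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo illustration of the Feynman-Kac formula, assuming the
# simplest setting: X^{t,x} a Brownian motion started at x, and constant
# coefficients c and h, with terminal function g(y) = y.
t, T, x = 0.0, 1.0, 0.3
c, h = 0.4, 0.7
g = lambda y: y

n_paths = 200_000
XT = x + rng.normal(0.0, np.sqrt(T - t), size=n_paths)   # X^{t,x}(T)
mc = (np.mean(g(XT)) * np.exp(c * (T - t))
      + h * (np.exp(c * (T - t)) - 1.0) / c)

# closed form: E[g(X_T)] = x for g(y) = y, so
closed = x * np.exp(c * (T - t)) + h * (np.exp(c * (T - t)) - 1.0) / c
print(mc, closed)   # the two values agree up to Monte Carlo error
```

With constant $c$ the exponential weights factor out of the expectation, which is why a single terminal sample per path suffices here; for state-dependent $c$ one would simulate full paths and discretize the integrals in the exponents.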

\textbf{Example 4.44.} In this example the family of operators $L(s)$, $0 \le s \le T$, generates a Markov process in the sense of Definition 4.4: see (4.11). For a ``smooth'' function $v$ we introduce the martingales:
\[
M_{v,t}(s) = v(s, X(s)) - v(t, X(t)) - \int_t^s \Bigl(\frac{\partial}{\partial \rho} + L(\rho)\Bigr) v(\rho, X(\rho))\, d\rho. \tag{4.183}
\]
Its quadratic variation part $\langle M_{v,t}\rangle(s) := \langle M_{v,t}, M_{v,t}\rangle(s)$ is given by
\[
\langle M_{v,t}\rangle(s) = \int_t^s \Gamma_1(v, v)(\rho, X(\rho))\, d\rho.
\]

In this example we will mainly be concerned with the Hamilton-Jacobi-Bellman equation as exhibited in (4.184). We have the following result for generators of diffusions; it refines Theorem 2.4 in Zambrini [260]. Observe that $P^{M_{v,t}}_{t,x}$ stands for a Girsanov transformation of the measure $P_{t,x}$.

\textbf{Theorem 4.45.} Suppose that the operator $L = L(s)$ does not depend on $s \in [0,T]$. Let $\chi : (\tau, T] \times E \to [0, \infty]$ be a function such that for all $\tau < t \le T$ and for sufficiently many functions $v$
\[
E^{M_{v,t}}_{t,x}\bigl[\left|\log \chi(T, X(T))\right|\bigr] < \infty.
\]
Let $S_L$ be a (classical) solution to the following Riccati type equation. For $\tau < s \le T$ and $x \in E$ the following identity is true:
\[
\begin{cases}
\dfrac{\partial S_L}{\partial s}(s, x) - \dfrac{1}{2}\Gamma_1(S_L, S_L)(s, x) + L(s)S_L(s, x) + V(s, x) = 0;\\[1ex]
S_L(T, x) = -\log \chi(T, x), \quad x \in E.
\end{cases} \tag{4.184}
\]
Then for any nice real-valued $v(s, x)$ the following inequality is valid:
\[
S_L(t, x) \le E^{M_{v,t}}_{t,x}\Bigl[\int_t^T \Bigl(\frac{1}{2}\Gamma_1(v, v) + V\Bigr)(\tau, X(\tau))\, d\tau\Bigr] - E^{M_{v,t}}_{t,x}\bigl[\log \chi(T, X(T))\bigr],
\]
and equality is attained for the ``Lagrangian action'' $v = S_L$:
\[
S_L(t, x) = -\log E_{t,x}\Bigl[\exp\Bigl(-\int_t^T V(\sigma, X(\sigma))\, d\sigma\Bigr)\, \chi(T, X(T))\Bigr]. \tag{4.185}
\]
The probability $P^{M_{v,t}}_{t,x}$ is determined by the following equality (4.186). For all finite $n$-tuples $t_1, \ldots, t_n$ in $(t, T]$ and all bounded Borel functions $f_j : [t,T] \times E \to \mathbb{R}$, $1 \le j \le n$, we have:
\[
E^{M_{v,t}}_{t,x}\Bigl[\prod_{j=1}^n f_j(t_j, X(t_j))\Bigr]
= E_{t,x}\Bigl[\exp\Bigl(-\frac{1}{2}\int_t^T \Gamma_1(v, v)(\tau, X(\tau))\, d\tau - M_{v,t}(T)\Bigr) \prod_{j=1}^n f_j(t_j, X(t_j))\Bigr]. \tag{4.186}
\]

\emph{Proof.} We merely mention that Theorem 4.45 is fully proved in [242]. It is also proved in Chapter 6: see Theorem 6.3.

It is conjectured that a version where the operators $L(s)$ do depend on $s$ is also true. The operator family $\{L(s) : s \in [0,T]\}$ should be the generator of a diffusion process in the sense of Definition 4.1. In addition, it should generate a Feller evolution in the sense of Theorem 1.39. Moreover, the squared gradient operator should exist in $T_\beta$-sense, i.e. in the sense of (4.2).
5

Viscosity solutions, backward stochastic differential equations and Markov processes

In this chapter we explain the notion of backward stochastic differential equations and their relationship with classical (backward) parabolic differential equations of second order. The chapter contains a combination of stochastic processes, like Markov processes and martingale theory, and semi-linear partial differential equations of parabolic type. Emphasis is put on the fact that the solutions to BSDE's obtained by stochastic methods are often viscosity solutions.

In the literature functions with the monotonicity property are also called one-sided Lipschitz functions. In fact Theorem 4.26, with $f(t, x, \cdot, \cdot)$ Lipschitz continuous in both variables, will be superseded by Theorem 4.33 in the Lipschitz case and by Theorem 4.34 in case of monotonicity in the second variable and Lipschitz continuity in the third variable. The proof of Theorem 4.26 is part of the results in Section 4.3. Theorem 4.42 contains a corresponding result for a Markov family of probability measures. Its proof is omitted; it follows the same lines as the proof of Theorem 4.34.
The notations
\begin{align*}
S^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr) &= S^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr) \quad\text{and}\\
M^2\bigl([\tau,T], P_{\tau,x}, \mathbb{R}^k\bigr) &= M^2\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr)
\end{align*}
are explained in Definitions 4.18 and 4.19 respectively. The same is true for the notions
\begin{align*}
S^2_{\mathrm{loc,unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= S^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr),\\
M^2_{\mathrm{loc,unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= M^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr),\\
S^2_{\mathrm{unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= S^2_{\mathrm{unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr), \quad\text{and}\\
M^2_{\mathrm{unif}}\bigl([0,T], \mathbb{R}^k\bigr) &= M^2_{\mathrm{unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}^k\bigr).
\end{align*}

The probability measure Pτ,x is defined on the σ-field FTτ . Since the exis-
tence properties of the solutions to backward stochastic equations are based

on explicit inequalities, the proofs carry over to Markov families of measures. Ultimately these inequalities imply that boundedness and continuity properties of the function $(\tau, x) \mapsto E_{\tau,x}[Y(t)]$, $0 \le \tau \le t \le T$, depend on the continuity of the function $x \mapsto E_{T,x}[\xi]$, where $\xi$ is a terminal value function which is supposed to be $\sigma(X(T))$-measurable. In addition, in order to be sure that the function $(\tau, x) \mapsto E_{\tau,x}[Y(t)]$ is continuous, functions of the form $(\tau, x) \mapsto E_{\tau,x}[f(t, u(t, X(t)), Z_M(t))]$ have to be continuous, whenever the following mappings
\[
(\tau, x) \mapsto E_{\tau,x}\Bigl[\int_\tau^T |u(s, X(s))|^2\, ds\Bigr] \quad\text{and}\quad (\tau, x) \mapsto E_{\tau,x}\bigl[\langle M, M\rangle(T) - \langle M, M\rangle(\tau)\bigr]
\]
represent finite and continuous functions.

5.1 Comparison theorems

As an introduction to the present section we insert a comparison theorem.


This theorem will also be used to establish the fact that solutions to semi-
linear BSDE’s are in fact viscosity solutions.
\textbf{Theorem 5.1.} Suppose that $Y(T) = \xi \le \xi' = Y'(T)$ $P$-a.s., and $f(t, x, y, z) \le f'(t, x, y, z)$ $dt \times dP$-a.e. Then $Y(t) \le Y'(t)$, $0 \le t \le T$, $P$-a.s., provided that there exists a martingale $N(t)$ such that its covariation process $t \mapsto \langle N, M' - M\rangle(t)$ satisfies
\[
f'(t, X(t), Y(t), Z_{M'}(t)) - f'(t, X(t), Y(t), Z_M(t)) = \frac{d}{dt}\langle N, M' - M\rangle(t). \tag{5.1}
\]
If moreover $Y(0) = Y'(0)$, then $Y(t) = Y'(t)$, $0 \le t \le T$, $P$-a.s. Moreover, if either $P(\xi < \xi') > 0$ or $f(t, y, Z_M(t)) < f'(t, y, Z_M(t))$, $(y, Z_M(t)) \in \mathbb{R} \times M^2$, on a set of positive $dt \times dP$ measure, then $Y(0) < Y'(0)$.

In fact for the martingale $N(t)$ in (5.1) we may choose:
\[
N(t) = \int_0^t \frac{f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))}{\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)}\, \bigl(dM'(s) - dM(s)\bigr), \tag{5.2}
\]
where the derivative $\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)$ stands for the Radon-Nikodym derivative of the quadratic variation process $t \mapsto \langle M' - M, M' - M\rangle(t)$ at $t = s$ (relative to the Lebesgue measure). In the following proposition we collect some properties of the martingale $t \mapsto N(t)$. Among other things it says that the process $t \mapsto N(t)$ is well-defined and continuous provided the martingale $t \mapsto M'(t) - M(t)$ is continuous. It is assumed that there exists a constant $C'$ such that
\[
\left|f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))\right|^2 \le (C')^2\, \frac{d}{ds}\langle M' - M, M' - M\rangle(s), \quad 0 \le s \le T. \tag{5.3}
\]
\textbf{Proposition 5.2.} Suppose that the processes $X(s)$, $Y(s)$, $M'(s)$, and $M(s)$ are such that (5.3) is satisfied for the constant $C'$. In addition suppose that the process $M' - M$ is a martingale belonging to $M^2([0,T], P)$ with the property that the quadratic variation process $s \mapsto \langle M' - M, M' - M\rangle(s)$ is absolutely continuous with respect to the Lebesgue measure. Then the process $t \mapsto N(t)$ is well-defined, is a martingale, and also belongs to $M^2([0,T], P)$. The following inequality is satisfied:
\[
\langle N, N\rangle(t) - \langle N, N\rangle(s) \le (C')^2 \bigl(\langle M' - M, M' - M\rangle(t) - \langle M' - M, M' - M\rangle(s)\bigr). \tag{5.4}
\]
The quadratic variation process $t \mapsto \langle N, N\rangle(t)$ is absolutely continuous relative to the Lebesgue measure. Its Radon-Nikodym derivative $\dfrac{d}{ds}\langle N, N\rangle(s)$ satisfies
\[
\frac{d}{ds}\langle N, N\rangle(s) = \frac{\left|f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))\right|^2}{\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)}. \tag{5.5}
\]
(5.5)
Let $s \mapsto M_1(s)$ and $s \mapsto M_2(s)$ be two martingales with quadratic variation processes $\langle M_1, M_1\rangle$ and $\langle M_2, M_2\rangle$ respectively. Let the Doléans measures $Q_j : \mathcal{F}^0_T \otimes \mathcal{B}_{[0,T]} \to [0, \infty]$, $j = 1, 2$, be determined by
\[
Q_j(A \times [a, b]) = E\bigl[\mathbf{1}_A \bigl(\langle M_j, M_j\rangle(b) - \langle M_j, M_j\rangle(a)\bigr)\bigr], \tag{5.6}
\]
with $A \in \mathcal{F}^0_T$, $0 \le a \le b \le T$, $j = 1, 2$. In addition, let $s \mapsto f_1(s)$ and $s \mapsto f_2(s)$ be predictable processes which belong to $L^2\bigl(\Omega, \mathcal{F}^0_T \otimes \mathcal{B}_{[0,T]}, Q_1\bigr)$ and $L^2\bigl(\Omega, \mathcal{F}^0_T \otimes \mathcal{B}_{[0,T]}, Q_2\bigr)$ respectively. In the proof of Proposition 5.2 we need the following equality:
\[
\Bigl\langle \int_0^{(\cdot)} f_1(s)\, dM_1(s),\ \int_0^{(\cdot)} f_2(s)\, dM_2(s)\Bigr\rangle(t) = \int_0^t f_1(s) f_2(s)\, d\langle M_1, M_2\rangle(s), \tag{5.7}
\]
where $t \in [0,T]$. A definition of the Doléans measure like (5.6), and an equality like (5.7), are given in books on martingale theory, like Williams [255].
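Identity (5.7) can be observed numerically in the simplest case. The sketch below (an illustration only) takes $M_1 = M_2 = W$ a standard Brownian motion, so that $d\langle M_1, M_2\rangle(s) = ds$, and compares the realized covariation of the two discretized stochastic integrals with $\int_0^1 f_1(s) f_2(s)\, ds$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Numerical check of (5.7) with M1 = M2 = W a standard Brownian motion:
# the bracket of \int f1 dW and \int f2 dW should be \int_0^t f1 f2 ds.
n = 2**14
tgrid = np.linspace(0.0, 1.0, n + 1)
dW = rng.normal(0.0, np.sqrt(1.0 / n), size=n)

f1 = tgrid[:-1]                  # f1(s) = s
f2 = np.ones(n)                  # f2(s) = 1
# increments of the two stochastic integrals and their realized covariation
bracket = np.sum((f1 * dW) * (f2 * dW))

exact = 0.5                      # \int_0^1 s * 1 ds
print(bracket, exact)
```

The fluctuation of the realized covariation around the exact value is of order $n^{-1/2}$, consistent with the almost sure convergence of the dyadic sums used earlier in the chapter.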
\emph{Proof (Proof of Proposition 5.2).} Equality (5.7) yields
\begin{align}
\langle N, N\rangle(t)
&= \int_0^t \frac{\left|f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))\right|^2}{\Bigl(\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)\Bigr)^2}\, d\langle M' - M, M' - M\rangle(s) \notag\\
&= \int_0^t \frac{\left|f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))\right|^2}{\Bigl(\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)\Bigr)^2}\, \frac{d}{ds}\langle M' - M, M' - M\rangle(s)\, ds \notag\\
&= \int_0^t \frac{\left|f'(s, X(s), Y(s), Z_{M'}(s)) - f'(s, X(s), Y(s), Z_M(s))\right|^2}{\dfrac{d}{ds}\langle M' - M, M' - M\rangle(s)}\, ds. \tag{5.8}
\end{align}
The equality in (5.5) follows from (5.8). Combining the inequality in (5.3) with (5.8) results in the inequality in (5.4).

If the martingale $s \mapsto M'(s) - M(s)$ is continuous, then so is the martingale $s \mapsto N(s)$, which is obtained as a stochastic integral relative to $d(M' - M)(s)$. This assertion also follows from It\^o calculus for martingales: see e.g. Williams [255].

This completes the proof of Proposition 5.2.

\emph{Proof (Proof of Theorem 5.1).} Following Pardoux [181] we introduce the process $\alpha(t)$, $0 \le t \le T$, by $\alpha(t) = 0$ if $Y(t) = Y'(t)$, and
\[
\alpha(t) = \bigl(Y'(t) - Y(t)\bigr)^{-1} \bigl(f'(t, X(t), Y'(t), Z_{M'}(t)) - f'(t, X(t), Y(t), Z_{M'}(t))\bigr) \tag{5.9}
\]
if $Y(t) \ne Y'(t)$. Then $\alpha(t) \le C_1$ $P$-almost surely. We also introduce the following processes:
\begin{align}
U(t) &= f'(t, X(t), Y(t), Z_M(t)) - f(t, X(t), Y(t), Z_M(t)); \tag{5.10}\\
\overline{Y}(t) &= Y'(t) - Y(t); \tag{5.11}\\
\overline{M}(t) &= M'(t) - M(t); \tag{5.12}\\
\overline{\xi} &= \overline{Y}(T) = Y'(T) - Y(T) = \xi' - \xi. \tag{5.13}
\end{align}
In terms of $\alpha(t)$, $\overline{\xi}$, $U(t)$, and the martingales $N(t)$ and $\overline{M}(t)$ the adapted process $\overline{Y}(t)$ satisfies the following backward integral equation:
\begin{align}
&\overline{Y}(t) - \overline{\xi} \tag{5.14}\\
&= \int_t^T \alpha(s)\overline{Y}(s)\, ds + \int_t^T U(s)\, ds - \overline{M}(T) + \overline{M}(t) + \bigl\langle N, \overline{M}\bigr\rangle(T) - \bigl\langle N, \overline{M}\bigr\rangle(t). \notag
\end{align}
From It\^o calculus and (5.14) it then follows that
\begin{align}
\overline{Y}(t) &= \overline{Y}(T)\, e^{\int_t^T \alpha(\tau)\, d\tau - \frac{1}{2}\langle N,N\rangle(T) + \frac{1}{2}\langle N,N\rangle(t) + N(T) - N(t)} \tag{5.15}\\
&\quad + \int_t^T e^{\int_t^s \alpha(\tau)\, d\tau - \frac{1}{2}\langle N,N\rangle(s) + \frac{1}{2}\langle N,N\rangle(t) + N(s) - N(t)} \bigl(U(s)\, ds - d\overline{M}(s) - \overline{Y}(s)\, dN(s)\bigr). \notag
\end{align}
Since the process $\overline{Y}(t)$ is adapted, and since It\^o integrals with respect to martingales with bounded integrands are martingales, the equality in (5.15) implies:
\begin{align}
\overline{Y}(t) = E\Bigl[\,&\overline{Y}(T)\, e^{\int_t^T \alpha(\tau)\, d\tau - \frac{1}{2}\langle N,N\rangle(T) + \frac{1}{2}\langle N,N\rangle(t) + N(T) - N(t)} \notag\\
&+ \int_t^T e^{\int_t^s \alpha(\tau)\, d\tau - \frac{1}{2}\langle N,N\rangle(s) + \frac{1}{2}\langle N,N\rangle(t) + N(s) - N(t)}\, U(s)\, ds \Bigm| \mathcal{F}_t\Bigr]. \tag{5.16}
\end{align}
Since by hypothesis $\overline{Y}(T) \ge 0$ and $U(s) \ge 0$ for all $s \in [0,T]$, the equality in (5.16) implies $\overline{Y}(t) \ge 0$. The other assertions also follow from representation (5.16).
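The comparison principle can be illustrated in the deterministic special case, where the martingale parts vanish and the BSDE reduces to the backward integral equation $Y(t) = \xi + \int_t^T f(s, Y(s))\,ds$. The sketch below (illustrative choices of generators and terminal values, not from the text) solves two such equations with $f \le f'$ and $\xi \le \xi'$ and checks $Y(t) \le Y'(t)$ on the whole grid.

```python
import numpy as np

# Deterministic illustration of the comparison theorem: with no martingale
# part, the BSDE reduces to Y(t) = xi + \int_t^T f(s, Y(s)) ds. We take
# f <= f' and xi <= xi' and verify Y <= Y' pointwise.
T = 1.0
tgrid = np.linspace(0.0, T, 1001)
dt = tgrid[1] - tgrid[0]

def solve_backward(f, xi):
    Y = np.empty_like(tgrid)
    Y[-1] = xi
    for i in range(len(tgrid) - 1, 0, -1):   # explicit backward Euler
        Y[i - 1] = Y[i] + f(tgrid[i], Y[i]) * dt
    return Y

f  = lambda s, y: -y + np.sin(s)
fp = lambda s, y: -y + np.sin(s) + 0.5       # f' = f + 0.5 >= f
Y  = solve_backward(f, xi=0.2)
Yp = solve_backward(fp, xi=0.4)              # xi' = 0.4 >= xi = 0.2

print(np.all(Yp >= Y))   # prints True: Y'(t) >= Y(t) for all t
```

In this scalar case the difference $D = Y' - Y$ satisfies the recursion $D_{i-1} = (1 - \Delta t)\,D_i + 0.5\,\Delta t$ with $D$ positive at the terminal time, which is exactly the discrete analogue of the exponential representation (5.16).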

The following result can be proved along the same lines as Theorem 5.1. It will be used in the proof of Theorem 5.6 with $V(s) = \dot\varphi(s, X(s)) + L(s)\varphi(s, \cdot)(X(s))$, with $Y(s) = u(s, X(s))$ and $Y'(s) = \varphi(s, X(s))$. In fact the arguments in the proof of Theorem 2.4 of Pardoux [181] inspired our proof of the following theorem.
\textbf{Theorem 5.3.} Fix a time $t \in [0, T)$ and fix a stopping time $\tau$ such that $t < \tau \le T$. Let $V(s)$ be a progressively measurable process such that $E\bigl[\int_t^\tau |V(s)|\, ds\bigr] < \infty$. Let $(Y, M)$ and $(Y', M')$ satisfy the following type of backward stochastic integral equations:
\begin{align*}
Y(s) &= Y(\tau) + \int_s^\tau f(\rho, X(\rho), Y(\rho), Z_M(\rho))\, d\rho + M(s) - M(\tau) \quad\text{and}\\
Y'(s) &= Y'(\tau) + \int_s^\tau V(\rho)\, d\rho + M'(s) - M'(\tau)
\end{align*}
for $t \le s < \tau$. Suppose that $Y(\tau) \le Y'(\tau)$ and
\[
f(s, X(s), Y'(s), Z_{M'}(s)) \le V(s), \quad t \le s \le \tau.
\]
Then $Y(s) \le Y'(s)$, $t \le s \le T$. If
\[
f(s, X(s), Y'(s), Z_{M'}(s)) < V(s)
\]
on a subset of $[t, \tau) \times \Omega$ of strictly positive $ds \times P$-measure, then $Y(t) < Y'(t)$.

\emph{Proof.} Define the stochastic process $f'(s, X(s), y, z)$ by
\[
f'(s, X(s), y, z) = f(s, X(s), y, z) + V(s) - f(s, X(s), Y'(s), Z_{M'}(s)).
\]
The arguments for the proof of Theorem 5.1 now apply with the martingale $N(s)$, $t \le s \le T$, given by
\[
N(s) = \int_t^{s \wedge \tau} \frac{f(\rho, X(\rho), Y(\rho), Z_{M'}(\rho)) - f(\rho, X(\rho), Y(\rho), Z_M(\rho))}{\dfrac{d}{d\rho}\langle M' - M, M' - M\rangle(\rho)}\, \bigl(dM'(\rho) - dM(\rho)\bigr), \tag{5.17}
\]
and the process $\alpha(s)$, $t \le s \le T$, defined by $\alpha(s) = 0$ if $Y(s) = Y'(s)$, and
\[
\alpha(s) = \bigl(Y'(s) - Y(s)\bigr)^{-1} \bigl(f(s, X(s), Y'(s), Z_{M'}(s)) - f(s, X(s), Y(s), Z_{M'}(s))\bigr) \tag{5.18}
\]
if $Y(s) \ne Y'(s)$. The other relevant processes are:
\begin{align}
U(s) &= V(s) - f'(s, X(s), Y'(s), Z_{M'}(s)); \tag{5.19}\\
\overline{Y}(s) &= Y'(s) - Y(s); \tag{5.20}\\
\overline{M}(s) &= M'(s) - M(s); \tag{5.21}\\
\overline{\xi} &= \overline{Y}(\tau) = Y'(\tau) - Y(\tau) = \xi' - \xi. \tag{5.22}
\end{align}
The remaining reasoning follows the lines of the proof of Theorem 5.1.

This completes the proof of Theorem 5.3.

\textbf{Remark 5.4.} If $Y(s) = u(s, X(s))$, $u$ is ``smooth'', and $u(t, x)$ satisfies (4.74), then $Y(s)$ satisfies (4.75), and vice versa. If $f(s, x, y, z)$ only depends on $y \in \mathbb{R}$, then, by the occupation formula,
\[
\int_t^T g(Y(s))\, Z(s)(Y, Y)\, ds = \int_t^T g(Y(s))\, d\langle Y(\cdot), Y(\cdot)\rangle(s) = \int_{\mathbb{R}} \bigl(L^y_T(Y) - L^y_t(Y)\bigr) g(y)\, dy,
\]
where $dy$ is the Lebesgue measure, and $L^y_t(Y)$ is the (density of the) local time of the process $Y(t)$. If $g \equiv 1$ and $Y(s) = u(s, X(s))$, then (4.75) is also equivalent to the following assertion: the process
\[
\exp\Bigl(Y(s) - Y(T) - \int_s^T \Bigl(f(\tau, X(\tau), Y(\tau), Z(\tau)(\cdot, Y)) - \frac{1}{2}\langle Y, Y\rangle(\tau)\Bigr)\, d\tau\Bigr),
\]
$t_0 < t \le s \le T$, is a local backward (exponential) $P_{t,x}$-martingale (for every $T > t > t_0$). The function $f$ depends on $x \in E$, $s \in (t_0, T]$, $y \in \mathbb{R}$, and on the squared gradient operator $(f_1, f_2) \mapsto \Gamma_1(f_1, f_2)$, or, more generally, on the covariance mapping $(Y_1, Y_2) \mapsto \langle Y_1, Y_2\rangle(s)$ of the local semi-martingales $Y_1(s)$ and $Y_2(s)$. In order to introduce boundary conditions it is required to insert in equation (4.75) a term of the form
\[
\int_t^T h(X(s), s, Y(s), Z(s)(\cdot, Y))\, dA(s),
\]
where $A(s)$ is a process which is locally of bounded variation, and which only increases when e.g. $X(s)$ hits the boundary. To be more precise, the equality in (4.75) should be replaced with:
\[
Y(t) - Y(T) - \int_t^T f(s, X(s), Y(s), Z(s)(\cdot, Y))\, ds - \int_t^T h(X(s), s, Y(s), Z(s)(\cdot, Y))\, dA(s) = M(t) - M(T). \tag{5.23}
\]
We hope to come back to this and similar problems in future work. In order to be sure about uniqueness and existence of solutions we will probably need some Lipschitz and linear growth conditions on the function $f$ and some boundedness condition on $\varphi$. For more details on backward stochastic differential equations see e.g. Pardoux and Peng [184] and [181].

5.2 Viscosity solutions


The main result in this section is Theorem 5.6. We begin with some formal
definitions.
\textbf{Definition 5.5.} Fix $t_0 \in [0, T]$, and let
\begin{align*}
F : {}&C([t_0, T] \times E, \mathbb{R}) \times C([t_0, T] \times E, \mathbb{R}) \times C([t_0, T] \times E, \mathbb{R})\\
&\times L\bigl(C^{(0,1)}([t_0, T] \times E, \mathbb{R}),\ C([t_0, T] \times E, \mathbb{R})\bigr) \to C([t_0, T] \times E, \mathbb{R})
\end{align*}
be a function with the following property. If $(t, x)$ is any point in $[t_0, T] \times E$, then for all functions $\varphi$ and $\psi$ belonging to $C([t_0, T] \times E, \mathbb{R})$, for which the four functions
\begin{align}
&(s, y) \mapsto \dot\varphi(s, y), \quad (s, y) \mapsto L(s)\varphi(s, \cdot)(y), \tag{5.24}\\
&(s, y) \mapsto \dot\psi(s, y), \quad\text{and}\quad (s, y) \mapsto L(s)\psi(s, \cdot)(y) \tag{5.25}
\end{align}
belong to $C_b([t_0, T] \times E, \mathbb{R})$, for which the operators $g \mapsto \nabla^L_\varphi(g)$ and $g \mapsto \nabla^L_\psi(g)$ are $T_\beta$-continuous mappings from $D(\Gamma_1)$ to $C_b([t_0, T] \times E)$, and which are such that in case
\[
\dot\varphi(t, x) = \dot\psi(t, x), \quad \Gamma_1(\varphi - \psi, \varphi - \psi)(t, x) = 0, \quad L(t)\varphi(t, x) \le L(t)\psi(t, x), \quad\text{and}\quad \varphi(t, x) = \psi(t, x), \tag{5.26}
\]
it follows that
\[
F\bigl(\dot\varphi, L\varphi, \varphi, \nabla^L_\varphi\bigr)(t, x) \le F\bigl(\dot\psi, L\psi, \psi, \nabla^L_\psi\bigr)(t, x).
\]
Here we wrote
\[
\dot\varphi = \frac{\partial \varphi}{\partial t}, \quad L\varphi(t, x) = [L(t)\varphi(t, \cdot)](x), \quad\text{and}\quad \nabla^L_\varphi g(t, x) = \Gamma_1(\varphi, g)(t, x).
\]
Of course, similar notions are in vogue for the function $\psi$. It is noticed that $\Gamma_1(\varphi - \psi, \varphi - \psi)(t, x) = 0$ if and only if the equality
\[
\nabla^L_\varphi f(t, x) = \nabla^L_\psi f(t, x) \quad\text{holds for all } f \in C^{(0,1)}([0, T] \times E, \mathbb{R}). \tag{5.27}
\]
The proof of this assertion uses the inequality
\[
\left|\Gamma_1(\varphi - \psi, f)(t, x)\right|^2 \le \Gamma_1(\varphi - \psi, \varphi - \psi)(t, x)\, \Gamma_1(f, f)(t, x) \tag{5.28}
\]
together with the identity $\nabla^L_{\varphi - \psi}(f)(t, x) = \Gamma_1(\varphi - \psi, f)(t, x)$. If $f = \varphi - \psi$ we have equality in (5.28). An example of such a function $F$ is:
\[
F(\varphi_1, \varphi_2, \varphi_3, \chi)(t, x) = \varphi_1(t, x) + \varphi_2(t, x) + f\bigl(t, x, \varphi_3(t, x), \chi(t, x)\bigr), \tag{5.29}
\]
where $\chi(t, x)$ is the linear functional $g \mapsto \chi(g)(t, x)$. A viscosity sub-solution for the equation
\[
F\bigl(\dot w, Lw, w, \nabla^L_w\bigr)(t, x) = 0, \quad w(T, x) = g(x), \tag{5.30}
\]
is a continuous function $w$ with the following properties. First of all $w(T, x) \le g(x)$, and if $\varphi : [t_0, T] \times E \to \mathbb{R}$ is any ``smooth'' function (i.e. the functions $\dot\varphi$, $L\varphi$, and $\Gamma_1(\varphi, \varphi)$ belong to $C([t_0, T] \times E, \mathbb{R})$, and the linear mapping $\psi \mapsto \nabla^L_\varphi \psi = \Gamma_1(\psi, \varphi)$ is continuous as well), and if $(t, x)$ is any point in $[t_0, T) \times E$ where the function $w - \varphi$ vanishes and attains a (local) maximum, then
\[
F\bigl(\dot\varphi, L\varphi, w, \nabla^L_\varphi\bigr)(t, x) \ge 0. \tag{5.31}
\]
The function $w$ is a super-solution for equation (5.30) if $w(T, x) \ge g(x)$, and if for any ``smooth'' function $\varphi$ with the property that the function $w - \varphi$ vanishes and attains a (local) minimum at a point $(t, x) \in [t_0, T) \times E$ we have
\[
F\bigl(\dot\varphi, L\varphi, w, \nabla^L_\varphi\bigr)(t, x) \le 0. \tag{5.32}
\]
If a function $w$ satisfies (5.31) as well as (5.32), then $w$ is called a viscosity solution to equation (5.30).
The definition of the space $D(\Gamma_1)$ was given in Section 4.3. The following result says, essentially speaking, that solutions to BSDE's and viscosity solutions to equation (4.74) are intimately related. As in Section 4.1 the family of operators $L(s)$, $0 \le s \le T$, generates a Markov process:
\[
\bigl\{(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}),\ (X(t) : T \ge t \ge 0),\ (\vee_t : T \ge t \ge 0),\ (E, \mathcal{E})\bigr\}. \tag{5.33}
\]
\textbf{Theorem 5.6.} Let the ordered pair $\begin{pmatrix} Y(t) \\ M(t) \end{pmatrix} = \begin{pmatrix} u(t, X(t)) \\ M(t) \end{pmatrix}$ be a solution to the BSDE:
\[
Y(s) = Y(T) + \int_s^T f(\rho, X(\rho), Y(\rho), Z_M(\rho))\, d\rho + M(s) - M(T). \tag{5.34}
\]
Then the function $u(t, x)$ defined by $u(t, x) = E_{t,x}[Y(t)]$ is a viscosity solution to the following equation:
\[
\begin{cases}
\dfrac{\partial u}{\partial s}(s, x) + L(s)u(s, x) + f\bigl(s, x, u(s, x), \nabla^L_u(s, x)\bigr) = 0;\\[1ex]
u(T, x) = \varphi(T, x), \quad x \in E,
\end{cases} \tag{5.35}
\]
provided that the function $u(t, x)$ is continuous.

Notice that the equation in (5.35) is the same as the one in (4.74).
\emph{Proof.} Let the function $\varphi(s, y)$ be ``smooth'' and suppose that $(t, x)$ is a point in $[0, T) \times E$ where the function $u - \varphi$ vanishes and attains a local maximum. This means that there exists a subset of the form $[t, t + \varepsilon] \times U$, where $U$ is an open neighborhood of $x$, such that
\[
\sup_{(s,y) \in [t, t+\varepsilon] \times U} \bigl(u(s, y) - \varphi(s, y)\bigr) = u(t, x) - \varphi(t, x) = 0.
\]
We have to show that
\[
\frac{\partial}{\partial t}\varphi(t, x) + L(t)\varphi(t, \cdot)(x) + f\bigl(t, x, u(t, x), \nabla^L_\varphi(t, x)\bigr) \ge 0, \tag{5.36}
\]
where in (5.31) we have chosen
\[
F\bigl(\dot\varphi, L\varphi, u, \nabla^L_\varphi\bigr)(t, x) = \dot\varphi(t, x) + L(t)\varphi(t, \cdot)(x) + f\bigl(t, x, u(t, x), \nabla^L_\varphi(t, x)\bigr). \tag{5.37}
\]
To arrive at a contradiction, assume that the expression in (5.36) is strictly less than zero:
\[
\frac{\partial}{\partial t}\varphi(t, x) + L(t)\varphi(t, \cdot)(x) + f\bigl(t, x, u(t, x), \nabla^L_\varphi(t, x)\bigr) < 0. \tag{5.38}
\]
Upon shrinking $\varepsilon > 0$ and the open subset $U$ we may and do assume that for all $(s, y) \in [t, t + \varepsilon] \times U$ the inequality
\[
\frac{\partial}{\partial s}\varphi(s, y) + L(s)\varphi(s, \cdot)(y) + f\bigl(s, y, u(s, y), \nabla^L_\varphi(s, y)\bigr) < 0 \tag{5.39}
\]
holds. Define the stopping time $\tau$ by $\tau = \inf\{s \ge t : X(s) \notin U\} \wedge (t + \varepsilon)$. From (4.74) we have:
\[
u(t, X(t)) = u(\tau, X(\tau)) + \int_t^\tau f(\rho, X(\rho), u(\rho, X(\rho)), Z_M(\rho))\, d\rho + M(t) - M(\tau). \tag{5.40}
\]
Let $M_\varphi(s)$ be the martingale associated to the function $\varphi$ as in Proposition 4.7. Then
\[
\varphi(t, X(t)) - \varphi(\tau, X(\tau)) = -\int_t^\tau \Bigl(\frac{\partial}{\partial s}\varphi(s, X(s)) + L(s)\varphi(s, \cdot)(X(s))\Bigr)\, ds + M_\varphi(t) - M_\varphi(\tau).
\]
From the definition of the stopping time $\tau$ it follows that $u(\tau, X(\tau)) \le \varphi(\tau, X(\tau))$. An application of Theorem 5.3 with $V(s) = \dot\varphi(s, X(s)) + L(s)\varphi(s, \cdot)(X(s))$, with $Y(s) = u(s, X(s))$ and $Y'(s) = \varphi(s, X(s))$, then shows $u(t, X(t)) < \varphi(t, X(t))$ $P_{t,x}$-almost surely. Since $u(t, x) = E_{t,x}[u(t, X(t))]$ and also $\varphi(t, x) = E_{t,x}[\varphi(t, X(t))]$, this yields a contradiction. This means that our assumption (5.38) is false, and hence the function $u(t, x)$ is a viscosity sub-solution to equation (4.74), which is the same as (5.35). In the same manner one shows that $u(t, x)$ is also a viscosity super-solution to (4.74).

Altogether this completes the proof of Theorem 5.6.

\textbf{Proposition 5.7.} Let the pair $(Y, M)$ be a solution to (5.34) in Theorem 5.6. Suppose that the pair $(Y, M)$ belongs to the space $S^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}\bigr) \times M^2_{\mathrm{loc,unif}}\bigl(\Omega, \mathcal{F}^\tau_T, P_{\tau,x}; \mathbb{R}\bigr)$ (see Definitions 4.18 and 4.19). In addition, suppose that the Markov process in (5.33) is strong Feller. Then the function $(t, x) \mapsto u(t, x) := E_{t,x}[Y(t)]$ is continuous on $[0, T] \times E$. This is a consequence of the strong Feller property and the following equalities:
\begin{align}
u(t, x) &= E_{t,x}[u(T, X(T))] + \int_t^T E_{t,x}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\, ds \notag\\
&= E_{t,x}[u(T, X(T))] + \int_t^T E_{t,x}\bigl[E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\bigr]\, ds. \tag{5.41}
\end{align}

\textbf{Definition 5.8.} Let $\{Y(t) : t \in [\tau, T]\}$ be the difference of a sub-martingale relative to $P_{\tau,x}$ and an increasing process in $L^1(\Omega, \mathcal{F}, P_{\tau,x})$. Then the process $\{Y(t) : t \in [\tau, T]\}$ is said to be of class (DL) if the collection
\[
\{Y(S) : \tau \le S \le T,\ S \text{ a stopping time}\}
\]
is uniformly integrable.

Notice that an increasing process in $L^1(\Omega, \mathcal{F}, P)$ is automatically of class (DL), and that the same is true for a martingale. In addition, notice that in our case the process $Y(t)$, $0 \le t \le T$, which satisfies
\[
Y(t) = Y(T) + \int_t^T f(s, X(s), Y(s), Z_M(s))\, ds + M(t) - M(T), \tag{5.42}
\]
where the pair $(Y, M)$ belongs to $S^2 \times M^2(\Omega, \mathcal{F}^\tau_T, P_{\tau,x})$, is automatically of class (DL) in the space $L^1(\Omega, \mathcal{F}^\tau_T, P_{\tau,x})$. The reason is that a martingale is automatically of class (DL), and the same is true for a process of the form $t \mapsto \int_\tau^t f(s, X(s), Y(s), Z_M(s))\, ds$, $\tau \le t \le T$.
In the proof we will employ a technique which is also used in the proof of the Doob-Meyer decomposition theorem. It states that a local right-continuous sub-martingale $\widetilde{Y}(t)$ of class (DL) can be written in the form $\widetilde{Y}(t) = M(t) + A(t)$, where $t \mapsto M(t)$ is a right-continuous local martingale, and $t \mapsto A(t)$ is a predictable increasing process. For details see e.g. Protter [192], Theorems 12 and 13 in Chapter 3. For another account see Karatzas and Shreve [121], Theorem 4.10. Another proof can be found in Rao [197]. In [249] Van Neerven gives a detailed account of the proof in [121]. In addition, in the proof of the Doob-Meyer decomposition theorem Van Neerven uses the following version of the Dunford-Pettis theorem.
\textbf{Theorem 5.9 (Dunford-Pettis).} If $(Y_n)_{n \in \mathbb{N}}$ is a uniformly integrable sequence of random variables, then there exist an integrable random variable $Y$ and a subsequence $(Y_{n_k})_{k \in \mathbb{N}}$ such that $\text{weak-}\lim_{k \to \infty} Y_{n_k} = Y$, i.e., for all bounded random variables $\xi$ the following equality holds:
\[
\lim_{k \to \infty} E[\xi Y_{n_k}] = E[\xi Y].
\]
For a proof of this version of the Dunford-Pettis theorem the reader is referred to [118]. From general arguments in integration theory and functional analysis it then follows that the variable $Y$ can be written as the $P$-almost sure limit of appropriately chosen convex combinations of the sequence $\{Y_{n_k} : k \ge \ell\}$, and this for all $\ell \in \mathbb{N}$. In other words, there exists a sequence $\widetilde{Y}_\ell = \sum_{k=\ell}^{N_\ell} \alpha_{\ell,k} Y_{n_k}$ in $L^1(\Omega, \mathcal{F}, P)$ with $\alpha_{\ell,k} \ge 0$ and $\sum_{k=\ell}^{N_\ell} \alpha_{\ell,k} = 1$, such that $L^1\text{-}\lim_{\ell \to \infty} \widetilde{Y}_\ell = Y$ and such that $\lim_{\ell \to \infty} \widetilde{Y}_\ell = Y$ $P$-almost surely.
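The need for convex combinations after extracting a weakly convergent subsequence can be seen in a standard model example that is not in the text: the Rademacher functions $r_n(x) = \operatorname{sign}(\sin(2^n \pi x))$ on $[0,1]$ converge weakly to $0$ in $L^2$, while $|r_n| = 1$ pointwise, so no subsequence converges in norm; Cesàro averages (one choice of convex combinations) do converge to the weak limit in norm.

```python
import numpy as np

# Model illustration of the convex-combination step: Rademacher functions
# r_n(x) = sign(sin(2^n pi x)) converge weakly to 0 in L^2([0, 1]) while
# |r_n| = 1 pointwise; Cesaro averages converge to 0 in L^2 norm.
n_grid = 2**14
x = (np.arange(n_grid) + 0.5) / n_grid       # midpoints of dyadic cells

N = 10
R = np.stack([np.sign(np.sin(2.0**n * np.pi * x)) for n in range(1, N + 1)])
cesaro = np.cumsum(R, axis=0) / np.arange(1, N + 1)[:, None]
l2 = np.sqrt(np.mean(cesaro**2, axis=1))     # L^2 norms of the averages

print(l2[0], l2[-1])   # 1 for a single r_n, about 1/sqrt(N) for the average
```

By orthonormality of the $r_n$, the $L^2$ norm of the $N$-term average is exactly $1/\sqrt{N}$, which is what the discrete computation reproduces.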
\emph{Proof (Proof of Proposition 5.7).} By the strong Feller property it suffices to show that the process $\rho \mapsto f(\rho, X(\rho), u(\rho, X(\rho)), Z_M(\rho))$ only depends on the pair $(\rho, X(\rho))$. In other words, we have to show that the functional $Z_M(\rho)$ only depends on $(\rho, X(\rho))$. We will verify this claim. Therefore we introduce the processes
\begin{align}
Y_j(t) &= E_{t,X(t)}\Bigl[Y\Bigl(\frac{\lceil 2^j t\rceil}{2^j} \wedge T\Bigr)\Bigr] \quad\text{and} \notag\\
A_j(t) &= \sum_{0 \le k < 2^j t} E_{k2^{-j},X(k2^{-j})}\Bigl[Y\Bigl(\frac{k+1}{2^j} \wedge T\Bigr) - Y\Bigl(\frac{k}{2^j} \wedge T\Bigr)\Bigr], \tag{5.43}
\end{align}
$j \in \mathbb{N}$, $t \in [0, T]$. Fix $0 \le t_1 < t_2 \le T$. From (5.43) we see that the increment $A_j(t_2) - A_j(t_1)$ is measurable relative to the $\sigma$-field generated by $X(k2^{-j})$, $t_1 \le k2^{-j} < t_2$, $k \in \mathbb{N}$. Next, let $(\tau, x) \in [0, T) \times E$. Since $Y(s) = u(s, X(s))$,

$0 \le s \le T$, we are eligible to apply the Markov property to infer that $P_{\tau,x}$-almost surely
\begin{align}
Y_j(t) &= E_{\tau,x}\Bigl[Y\Bigl(\frac{\lceil 2^j t\rceil}{2^j} \wedge T\Bigr) \Bigm| \mathcal{F}^\tau_t\Bigr] \quad\text{and} \notag\\
A_j(t) &= A_j(\tau) + \sum_{2^j \tau \le k < 2^j t} E_{\tau,x}\Bigl[Y\Bigl(\frac{k+1}{2^j} \wedge T\Bigr) - Y\Bigl(\frac{k}{2^j} \wedge T\Bigr) \Bigm| \mathcal{F}^\tau_{k2^{-j}}\Bigr]. \tag{5.44}
\end{align}
Next we show that the process $t \mapsto Y_j(t) - A_j(t) + A_j(\tau)$ is a $P_{\tau,x}$-martingale. Let $0 \le t_1 < t_2 \le T$, and notice that the variables $Y_j(t_1)$ and $A_j(t_1) - A_j(\tau)$ are $\mathcal{F}^\tau_{t_1}$-measurable. We employ (5.43) and (5.44) to obtain
\begin{align}
&E_{\tau,x}\bigl[Y_j(t_2) - A_j(t_2) + A_j(\tau) \bigm| \mathcal{F}^\tau_{t_1}\bigr] - Y_j(t_1) + A_j(t_1) - A_j(\tau) \notag\\
&= E_{\tau,x}\bigl[Y_j(t_2) - Y_j(t_1) - A_j(t_2) + A_j(t_1) \bigm| \mathcal{F}^\tau_{t_1}\bigr] \notag\\
&= E_{\tau,x}\Bigl[E_{\tau,x}\Bigl[Y\Bigl(\frac{\lceil 2^j t_2\rceil}{2^j} \wedge T\Bigr) \Bigm| \mathcal{F}^\tau_{t_2}\Bigr] - E_{\tau,x}\Bigl[Y\Bigl(\frac{\lceil 2^j t_1\rceil}{2^j} \wedge T\Bigr) \Bigm| \mathcal{F}^\tau_{t_1}\Bigr] \notag\\
&\qquad\quad - \sum_{2^j t_1 \le k < 2^j t_2} E_{\tau,x}\Bigl[Y\Bigl(\frac{k+1}{2^j} \wedge T\Bigr) - Y\Bigl(\frac{k}{2^j} \wedge T\Bigr) \Bigm| \mathcal{F}^\tau_{k2^{-j}}\Bigr] \Bigm| \mathcal{F}^\tau_{t_1}\Bigr] \notag
\end{align}
(tower property of conditional expectations)
\begin{align}
&= E_{\tau,x}\Bigl[Y\Bigl(\frac{\lceil 2^j t_2\rceil}{2^j} \wedge T\Bigr) - Y\Bigl(\frac{\lceil 2^j t_1\rceil}{2^j} \wedge T\Bigr) \notag\\
&\qquad\quad - \sum_{2^j t_1 \le k < 2^j t_2} \Bigl(Y\Bigl(\frac{k+1}{2^j} \wedge T\Bigr) - Y\Bigl(\frac{k}{2^j} \wedge T\Bigr)\Bigr) \Bigm| \mathcal{F}^\tau_{t_1}\Bigr]
= E_{\tau,x}\bigl[0 \bigm| \mathcal{F}^\tau_{t_1}\bigr] = 0. \tag{5.45}
\end{align}

From (5.45) it follows that for every pair $(\tau, x) \in [0, T) \times E$ the processes $t \mapsto Y_j(t) - A_j(t) + A_j(\tau)$, $j \in \mathbb{N}$, are $P_{\tau,x}$-martingales relative to the filtration $(\mathcal{F}^\tau_t)_{t \in [\tau, T]}$. Put $M_j(t) = Y_j(t) - A_j(t)$. Then the process $t \mapsto M_j(t) - M_j(\tau)$, $t \in [\tau, T]$, is a $P_{\tau,x}$-martingale, and
\[
Y_j(t) - Y_j(\tau) = A_j(t) - A_j(\tau) + M_j(t) - M_j(\tau). \tag{5.46}
\]
In (5.46) we let $j$ tend to $\infty$ and, if necessary, pass to a subsequence, to obtain
\[
Y(t) - Y(\tau) = \int_\tau^t f(s, X(s), Y(s), Z_M(s))\, ds + M(t) - M(\tau), \tag{5.47}
\]

where in L1 (Ω, Ftτ , Pτ,x ) and Pτ,x -almost surely


Z t Nn
X
f (s, X(s), Y (s), ZM (s)) ds = lim αn,k (Ajk (t) − Ajk (τ )) , and
τ n→∞
k=n
Nn
X
M (t) − M (τ ) = lim αn,k (Mjk (t) − Mjk (τ )) , (5.48)
n→∞
k=n
PNn
where αn,k ≥ 0 and k=n αn,k = 1. For all this see the comments following
Theorem 5.9. It follows that Pτ,x -almost surely, the variables
$$\int_{t_1}^{t_2} f(s, X(s), Y(s), Z_M(s))\,ds, \qquad \tau\le t_1<t_2\le T, \tag{5.49}$$
are 𝓕^{t_1}_{t_2}-measurable. Consequently, for almost every s ∈ [τ, T], the variable f(s, X(s), Y(s), Z_M(s)) is P_{τ,x}-almost surely measurable relative to σ(s, X(s)). Since the paths of the process X are continuous from the right, it follows that for almost all s ∈ [0, T] the variable f(s, X(s), Y(s), Z_M(s)) is 𝓕^t_{s+}-measurable for all 0 ≤ t < s. If 0 ≤ t < s ≤ T, then by the strong Markov property relative to the filtration (𝓕^t_{s+})_{s∈[t,T]} (see Theorem 1.39) we have
$$E_{t,x}\Bigl[E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\Bigr] = E_{t,x}\Bigl[E_{t,x}\bigl[f(s, X(s), Y(s), Z_M(s))\bigm|\mathcal F^t_{s+}\bigr]\Bigr] = E_{t,x}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr], \tag{5.50}$$
and
$$E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr] = E_{t,x}\bigl[f(s, X(s), Y(s), Z_M(s))\bigm|\mathcal F^t_s\bigr] = E_{t,x}\bigl[f(s, X(s), Y(s), Z_M(s))\bigm|\mathcal F^t_{s+}\bigr] = f(s, X(s), Y(s), Z_M(s)), \quad P_{t,x}\text{-almost surely.} \tag{5.51}$$

From (5.34) and (5.51) we infer
$$Y(t) = Y(T) + \int_t^T E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\,ds + M(t) - M(T), \tag{5.52}$$
and hence by (5.50) from (5.52) we get
$$u(t,x) = E_{t,x}[Y(t)] = E_{t,x}\bigl[u(T, X(T))\bigr] + \int_t^T E_{t,x}\Bigl[E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\Bigr]\,ds. \tag{5.53}$$

As a consequence, the strong Feller property implies that the function
$$(t,x) \mapsto E_{t,x}\Bigl[E_{s,X(s)}\bigl[f(s, X(s), Y(s), Z_M(s))\bigr]\Bigr], \tag{5.54}$$
0 ≤ t ≤ s ≤ T, x ∈ E, is continuous. From (5.53) and (5.54) we then infer that the function (t, x) ↦ u(t, x) is continuous.
This conclusion completes the proof of Proposition 5.7.
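In the decoupled case, where f does not depend on (Y, Z_M), formula (5.53) reduces to the linear Feynman–Kac representation and can be checked by Monte Carlo. A sketch under illustrative assumptions only: Brownian motion as the Markov process, f ≡ 1, and terminal function u(T, y) = y²:

```python
import numpy as np

rng = np.random.default_rng(1)
T, t, x = 1.0, 0.25, 0.7
n_paths = 200_000

# Markov process: Brownian motion started at x at time t (illustrative choice).
W_T = x + np.sqrt(T - t) * rng.standard_normal(n_paths)

# Decoupled version of (5.53): u(t,x) = E[u(T, X(T))] + int_t^T E[f] ds,
# with u(T, y) = y**2 and f = 1, so u(t,x) = x**2 + (T - t) + (T - t).
u_mc = (W_T**2).mean() + (T - t) * 1.0
u_exact = x**2 + 2.0 * (T - t)
assert abs(u_mc - u_exact) < 0.02
print(u_mc, u_exact)
```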

5.3 Backward stochastic differential equations in finance


In [60] the authors M.G. Crandall, L.C. Evans, and P.L. Lions study properties
of viscosity solutions of Hamilton-Jacobi equations. In [182] E. Pardoux uses
viscosity solutions in the study of backward stochastic differential equations
and semi-linear parabolic equations. In [125] and in [126] the authors employ
backward stochastic differential equations to study American option pricing. We would like to give an introduction to this kind of stochastic differential equations and the corresponding parabolic partial differential equations. As a rule the operator L generates a d-dimensional diffusion. For instance, if L = ½∆, then the corresponding diffusion is Brownian motion. To some extent a solution to a BSDE
corresponding to a semilinear parabolic partial differential equation general-
izes the (classical) Feynman-Kac formula. We also mention that Bismut [31]
was perhaps the first to consider backward stochastic differential equations.
Most of the material presented in this section is taken from El Karoui and
Quenez [126] and El Karoui, Pardoux and Quenez [125]. First we describe a
model of assets and hedging strategies. There is a non-risky asset (the money market or bond) S⁰(t), and there are n risky assets S^j(t), 1 ≤ j ≤ n. The process S⁰(t) satisfies the differential equation

dS⁰(t) = S⁰(t)r(t)dt, where r(t) is the short-term interest rate.

The other assets satisfy a linear stochastic differential equation (SDE) of the form
$$dS^j(t) = S^j(t)\Bigl[b^j(t)\,dt + \sum_{k=1}^n \sigma_{jk}(t)\,dW^k(t)\Bigr],$$
which is driven by a standard Wiener process W(t) = (W¹(t), …, Wⁿ(t))^*, defined on a filtered space (Ω, (𝓕_t)_{0≤t≤T}, P). It is assumed that (𝓕_t)_{0≤t≤T} is generated by the Wiener process. Generally speaking the coefficients r(t), b^j(t), σ_{jk}(t) are supposed to be bounded predictable processes with values in R. We also write σ_j(t) = (σ_{jk}(t))_{k=1}^n. The matrix [σ_{jk}(t)]_{j,k=1}^n is called the volatility matrix. To ensure the absence of arbitrage opportunities in the market, it is assumed that there exists an n-dimensional bounded predictable vector process ϑ(t) such that
$$b(t) - r(t)\mathbf 1 = \sigma(t)\vartheta(t), \qquad dt\otimes P\text{-almost surely.}$$


The vector 1 is the column vector with all entries equal to 1, and ϑ(t) is called the risk premium vector. It is assumed that σ(t) has full rank. Consider a small investor, whose actions do not affect the market prices, and who can decide at time t ∈ [0, T] what amount of the wealth V(t) to invest in the j-th stock, 1 ≤ j ≤ n. Of course his decisions are only based on the current information 𝓕_t; i.e. π(t) = (π¹(t), …, πⁿ(t))^* and π⁰(t) = V(t) − Σ_{j=1}^{n} π^j(t) are predictable processes. The process π(t) is called the portfolio process. The existence of such a risk premium process ϑ(t) guarantees that the model is arbitrage free. Let us make this precise, beginning with some definitions.
Definition 5.10. (a) A progressively measurable Rⁿ-valued process
$$\pi = \bigl\{(\pi_1(t),\dots,\pi_n(t))^* : 0\le t\le T\bigr\}$$
with the property
$$\int_0^T |\pi^*(t)\sigma(t)|^2\,dt + \int_0^T |\pi^*(t)(b(t)-r(t)\mathbf 1)|\,dt < \infty, \quad P\text{-almost surely,}$$
is called a portfolio process.
(b) Put γ(t) = exp(−∫₀ᵗ r(τ)dτ), and define for a given portfolio π(t) the process M^π(t) by
$$M^\pi(t) = \int_0^t \gamma(s)\pi^*(s)\bigl[\sigma(s)\,dW(s) + (b(s)-r(s)\mathbf 1)\,ds\bigr], \qquad 0\le t\le T; \tag{5.55}$$
it is called the discounted gains process. A portfolio π(t) is called tame if there exists a real constant q^π such that M^π(t) ≥ q^π, 0 ≤ t ≤ T, P-almost surely.
(c) A tame portfolio π(t) that satisfies
$$P[M^\pi(T)\ge 0] = 1 \quad\text{and}\quad P[M^\pi(T) > 0] > 0$$
is called an arbitrage opportunity (or “free lunch”). A market M is called arbitrage free if no such portfolios exist in it.
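For constant coefficients the discounted gains process (5.55) can be simulated directly with an Euler scheme and its mean compared with the closed-form value π(b − r)(1 − e^{−rT})/r. All parameter values below (one risky asset, constant r, b, σ, π) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
r, b, sigma, pi = 0.03, 0.08, 0.2, 1.0   # illustrative constants, one risky asset
T, n_steps, n_paths = 1.0, 250, 100_000
dt = T / n_steps

# Euler scheme for the discounted gains process (5.55):
#   M^pi(t) = int_0^t gamma(s) pi [sigma dW(s) + (b - r) ds],  gamma(s) = exp(-r s).
t_grid = np.arange(n_steps) * dt
gamma = np.exp(-r * t_grid)
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
M_T = (gamma * pi * (sigma * dW + (b - r) * dt)).sum(axis=1)

# For constant coefficients E[M^pi(T)] = pi (b - r)(1 - exp(-r T)) / r.
expected = pi * (b - r) * (1.0 - np.exp(-r * T)) / r
assert abs(M_T.mean() - expected) < 5e-3
print(M_T.mean(), expected)
```

The positive drift (b − r) > 0 shows why tameness alone does not forbid profit on average; it is the change to the martingale measure Q of Theorem 5.11 that removes the drift.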
The following theorem shows the relevance of the existence of a risk premium process ϑ(t).

Theorem 5.11. (i) If M is arbitrage free, then there exists a progressively measurable process ϑ : [0, T] × Ω → Rⁿ, called the market price of risk (or price of relative risk) process, such that
$$b(t) - r(t)\mathbf 1 = \sigma(t)\vartheta(t), \quad 0\le t\le T, \ P\text{-almost surely.}$$
(ii) Conversely, if such a price of risk process exists and satisfies, in addition to the above requirements,
$$\int_0^T |\vartheta(t)|^2\,dt < \infty, \quad P\text{-almost surely, and} \tag{5.56}$$
$$E\Bigl[\exp\Bigl(-\int_0^T \vartheta^*(t)\,dW(t) - \frac12\int_0^T |\vartheta(t)|^2\,dt\Bigr)\Bigr] = 1, \tag{5.57}$$
then M is arbitrage free.


From Novikov’s condition (see Proposition 3.5.12 in Karatzas and Shreve [119]) it follows that conditions (5.56) and (5.57) are satisfied if
$$E\Bigl[\exp\Bigl(\frac12\int_0^T |\vartheta(t)|^2\,dt\Bigr)\Bigr] < \infty;$$
in particular this is the case if |ϑ(t)| is uniformly bounded in (t, ω) ∈ [0, T] × Ω. It is noticed that under (5.57) the process W(t) + ∫₀ᵗ ϑ(s)ds is a Brownian motion with respect to the martingale measure Q which has Radon–Nikodym derivative
$$\frac{dQ}{dP} := \exp\Bigl(-\int_0^T \vartheta^*(t)\,dW(t) - \frac12\int_0^T |\vartheta(t)|^2\,dt\Bigr).$$
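For bounded, in particular constant, ϑ condition (5.57) can be checked by direct Monte Carlo: the Radon–Nikodym density has expectation one. A sketch with an illustrative constant risk premium:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, T, n_paths = 0.4, 1.0, 400_000   # constant risk premium (illustrative)

# For constant theta the stochastic integral is theta * W(T), so the density in
# (5.57) is exp(-theta W(T) - theta^2 T / 2); Novikov's condition holds trivially.
W_T = np.sqrt(T) * rng.standard_normal(n_paths)
density = np.exp(-theta * W_T - 0.5 * theta**2 * T)

assert abs(density.mean() - 1.0) < 5e-3
print(density.mean())
```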

For more details the reader is referred to Karatzas [120], and to Karatzas and Shreve [122]. Another valuable source of information is Kleinert [134], Chapter 20. Following Harrison and Pliska [100], a strategy (V(t), π(t)) is called self-financing if the wealth process V(t) = Σ_{j=0}^{n} π^j(t) obeys the equality
$$V(t) = V(0) + \int_0^t \sum_{j=1}^n \pi^j(s)\,\frac{dS^j(s)}{S^j(s)},$$

or, equivalently, if it satisfies the linear stochastic differential equation
$$dV(t) = r(t)V(t)\,dt + \pi^*(t)\bigl(b(t)-r(t)\mathbf 1\bigr)\,dt + \pi^*(t)\sigma(t)\,dW(t) = r(t)V(t)\,dt + \pi^*(t)\sigma(t)\bigl[dW(t) + \vartheta(t)\,dt\bigr]. \tag{5.58}$$
Often the left-hand side of (5.58) contains an additional term dK(t), where the process K(t) is adapted, increasing and right-continuous, with K(0) = 0 and K(T) < ∞, P-almost surely. This process is called the cumulative consumption process. A pair (V(t), π(t)) satisfying (5.58) is called a self-financing trading strategy. There is a one-to-one correspondence between pairs (x, π(t)) and pairs (V(t), π(t)) with V(0) = x which satisfy (5.58).
Definition 5.12. A hedging strategy against a contingent claim ξ ∈ L² is a self-financing strategy (V(t), π(t)) such that V(T) = ξ and
$$E\Bigl[\int_0^T |\sigma^*(t)\pi(t)|^2\,dt\Bigr] < \infty.$$
Theorem 5.13. An attainable square integrable contingent claim ξ is replicated by a unique hedging strategy (V(t), π(t)); i.e. there exists a unique solution (V(t), π(t)) to equation (5.58) such that V(T) = ξ.

The following theorem elaborates on this statement.

Theorem 5.14. Any square integrable contingent claim is attainable; i.e. the market is complete. In other words, for every square integrable stochastic variable ξ there exists a unique pair (X(t), π(t)) such that E[∫₀ᵀ |σ^*(t)π(t)|² dt] < ∞ and such that
$$dX(t) = r(t)X(t)\,dt + \pi^*(t)\sigma(t)\bigl(\vartheta(t)\,dt + dW(t)\bigr), \qquad X(T) = \xi. \tag{5.59}$$
The process X(t) represents the price of the claim at time t, given by the closed formula X(t) = E[H^t(T)ξ | 𝓕_t], where H^t(s), t ≤ s ≤ T, is the deflator process starting at time t, which satisfies
$$dH^t(s) = -H^t(s)\bigl[r(s)\,ds + \vartheta^*(s)\,dW(s)\bigr]; \qquad H^t(t) = 1. \tag{5.60}$$
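For one risky asset with constant r, b, σ, the deflator formula X(0) = E[H⁰(T)ξ] can be tested by Monte Carlo against the Black–Scholes price of a call ξ = (S(T) − K)⁺, with risk premium ϑ = (b − r)/σ. A sketch; all parameter values are illustrative assumptions:

```python
import numpy as np
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

rng = np.random.default_rng(4)
S0, K, r, b, sigma, T = 100.0, 100.0, 0.03, 0.08, 0.2, 1.0
theta = (b - r) / sigma                      # constant risk premium
n_paths = 1_000_000

# Under P the stock has drift b; the deflator H^0(T) combines discounting with
# the Girsanov density, as in (5.60)-(5.61).
W_T = sqrt(T) * rng.standard_normal(n_paths)
S_T = S0 * np.exp((b - 0.5 * sigma**2) * T + sigma * W_T)
H_T = np.exp(-r * T - theta * W_T - 0.5 * theta**2 * T)
price_mc = (H_T * np.maximum(S_T - K, 0.0)).mean()

# Black-Scholes reference price of the call.
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
price_bs = S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

assert abs(price_mc - price_bs) < 0.15
print(price_mc, price_bs)
```

Note that the physical drift b enters the simulated paths but cancels against the density inside H⁰(T); this is exactly the content of the second equality in (5.61).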

Remark 5.15. Suppose that the process t ↦ X(t) satisfies equation (5.59). By Itô calculus it follows that the process {H⁰(t)X(t) : 0 ≤ t ≤ T} is a stochastic integral such that
$$d\bigl(H^0(\cdot)X(\cdot)\bigr)(t) = H^0(t)\bigl\{\pi^*(t) - X(t)\vartheta^*(t)\bigr\}\,dW(t).$$
Classical results about solutions to the linear SDE (5.60) with bounded coefficients yield the (uniform) boundedness of the martingale H⁰(t) in L²; moreover, the process (H⁰(t)X(t) : 0 ≤ t ≤ T) is uniformly integrable. It follows that
$$H^0(t)X(t) = E\bigl[H^0(T)\xi \bigm| \mathcal F_t\bigr], \quad\text{or, equivalently,}\quad X(t) = E\bigl[H^t(T)\xi \bigm| \mathcal F_t\bigr].$$
The closed form of the deflator process,
$$H^t(s) = \exp\Bigl(-\Bigl\{\int_t^s r(\tau)\,d\tau + \int_t^s \vartheta^*(\tau)\,dW(\tau) + \frac12\int_t^s |\vartheta(\tau)|^2\,d\tau\Bigr\}\Bigr),$$
leads to a more classical formulation of the price of the contingent claim:
$$X(t) = E\Bigl[\exp\Bigl(-\Bigl\{\int_t^T r(\tau)\,d\tau + \int_t^T \vartheta^*(\tau)\,dW(\tau) + \frac12\int_t^T |\vartheta(\tau)|^2\,d\tau\Bigr\}\Bigr)\xi \Bigm| \mathcal F_t\Bigr] = E_{\mathbb Q}\Bigl[\exp\Bigl(-\int_t^T r(\tau)\,d\tau\Bigr)\xi \Bigm| \mathcal F_t\Bigr], \tag{5.61}$$
where exp(−∫ₜᵀ r(τ)dτ) is the discount factor over the time interval [t, T] and the measure Q is the risk-adjusted probability measure defined by the Radon–Nikodym derivative with respect to P:
$$\frac{dQ}{dP} = \exp\Bigl(-\Bigl\{\int_0^T \vartheta^*(\tau)\,dW(\tau) + \frac12\int_0^T |\vartheta(\tau)|^2\,d\tau\Bigr\}\Bigr).$$

Proof (Proof of Theorem 5.14). First we prove uniqueness. Let the pair (X(t), π(t)), where X(t) is adapted and π(t) is predictable, satisfy equation (5.59). Let the process H⁰(t) satisfy the differential equation exhibited in (5.60). Then
$$d\bigl(H^0(\cdot)X(\cdot)\bigr)(t) = H^0(t)\bigl\{\pi^*(t) - X(t)\vartheta^*(t)\bigr\}\,dW(t).$$
As explained in the previous remark, it follows that
$$X(t) = E_{\mathbb Q}\Bigl[\exp\Bigl(-\int_t^T r(\tau)\,d\tau\Bigr)\xi \Bigm| \mathcal F_t\Bigr].$$
This shows that the process X(t) is uniquely determined. But once X(t) is uniquely determined, the same is true for the process π(t). Corollary 9.1 in Chapter 9 implies that the process W(t) + ∫₀ᵗ ϑ(s)ds is a Brownian motion with respect to the measure Q, and from (5.59)
$$X(t) - X(0) - \int_0^t r(\tau)X(\tau)\,d\tau = \int_0^t \pi^*(s)\sigma(s)\bigl(dW(s) + \vartheta(s)\,ds\bigr).$$
Let (X₁(t), π₁(t)) and (X₂(t), π₂(t)) be two solutions to the equation in (5.59). Then
$$X_1(T) - X_1(t) - \int_t^T r(\tau)X_1(\tau)\,d\tau = X_2(T) - X_2(t) - \int_t^T r(\tau)X_2(\tau)\,d\tau$$
$$= \xi - E_{\mathbb Q}\Bigl[\exp\Bigl(-\int_t^T r(\tau)\,d\tau\Bigr)\xi \Bigm| \mathcal F_t\Bigr] - \int_t^T r(\tau)\,E_{\mathbb Q}\Bigl[\exp\Bigl(-\int_\tau^T r(s)\,ds\Bigr)\xi \Bigm| \mathcal F_\tau\Bigr]\,d\tau$$
$$= \int_t^T \pi_1^*(s)\sigma(s)\bigl(dW(s) + \vartheta(s)\,ds\bigr) = \int_t^T \pi_2^*(s)\sigma(s)\bigl(dW(s) + \vartheta(s)\,ds\bigr).$$
Hence,
$$\int_t^T \bigl(\pi_1^*(s) - \pi_2^*(s)\bigr)\sigma(s)\bigl(dW(s) + \vartheta(s)\,ds\bigr) = 0, \qquad 0\le t\le T. \tag{5.62}$$
Thus E_Q[∫₀ᵀ |σ^*(τ)(π₁(τ) − π₂(τ))|² dτ] = 0, and since σ(τ) has full rank, the equality π₁(t) = π₂(t) holds λ × Q-almost surely. Here we wrote λ for the Lebesgue measure on [0, T]. Since the Q-negligible sets coincide with the P-negligible sets, we get π₁(t) = π₂(t) for λ × P-almost all (t, ω) ∈ [0, T] × Ω.
Next we prove the existence. Define the process Y(t), 0 ≤ t ≤ T, by
$$Y(t) = E_{\mathbb Q}\Bigl[\exp\Bigl(-\int_0^T r(\tau)\,d\tau\Bigr)\xi \Bigm| \mathcal F_t\Bigr].$$
The process t ↦ Y(t) is a Q-martingale, and since the process t ↦ W(t) + ∫₀ᵗ ϑ(s)ds is a Q-Brownian motion, there exists by a martingale representation theorem a predictable process π̃(t) such that
$$Y(T) - Y(t) = \int_t^T \tilde\pi^*(s)\bigl(dW(s) + \vartheta(s)\,ds\bigr). \tag{5.63}$$

From (5.63) we easily infer that
$$dY(t) = \tilde\pi^*(t)\bigl(dW(t) + \vartheta(t)\,dt\bigr).$$
Next, put
$$\pi^*(t) = \exp\Bigl(\int_0^t r(\tau)\,d\tau\Bigr)\tilde\pi^*(t)\sigma(t)^{-1} \quad\text{and}\quad X(t) = \exp\Bigl(\int_0^t r(\tau)\,d\tau\Bigr)Y(t).$$
Then we have X(T) = ξ, and
$$dY(t) = \exp\Bigl(-\int_0^t r(\tau)\,d\tau\Bigr)\pi^*(t)\sigma(t)\bigl(dW(t) + \vartheta(t)\,dt\bigr),$$
and hence
$$dX(t) = r(t)X(t)\,dt + \exp\Bigl(\int_0^t r(\tau)\,d\tau\Bigr)dY(t) = r(t)X(t)\,dt + \pi^*(t)\sigma(t)\bigl(dW(t) + \vartheta(t)\,dt\bigr). \tag{5.64}$$

This proves the existence of a solution to equation (5.59). Altogether this completes the proof of Theorem 5.14.

5.4 Some related remarks


In this section we will explain the relevance of backward stochastic differential equations (BSDEs). We will also mention that Bismut was the first to discuss BSDEs: see [31] and [32]. Of course BSDEs were popularized by Pardoux and coworkers; see e.g. [184, 185, 181, 183]. For the close connection between
BSDEs and hedging strategies in financial mathematics the reader is referred
to e.g. El Karoui et al [125], and [126]. Another paper related to obstacles,
and therefore also to hedging strategies, is the reference [124]. For some more
explanation the reader is also referred to §6 in [241]. An important area of
mathematics and its applications where backward problems play a central
role is control theory: see e.g. Soner [215]. In the finite-dimensional setting
the paper by Crandall et al [61] is very relevant for understanding the notion
of viscosity solutions. Viscosity solutions which are not necessarily continuous also play a central role in applied fields like dislocation theory; see e.g. Barles et al [21] and Barles [22].
6
The Hamilton-Jacobi-Bellman equation and the stochastic Noether theorem

In this chapter we prove that the Lagrangian action, which may be phrased in
terms of a non-linear Feynman-Kac formula, coincides under rather generous
hypotheses with the unique viscosity solution to the Hamilton-Jacobi-Bellman
equation. The method of proof is based on martingale theory and Jensen's inequality. A version of the stochastic Noether theorem is proved, as well as its complex companion.

6.1 Introduction
We start this chapter by pointing out that Zambrini and coworkers [259, 260, 2, 3, 230, 231, 232, 233, 59] have developed a kind of transition scheme to pass from classical stochastic calculus (with non-reversible processes) to physical real-time (reversible) quantum mechanics and vice versa. An important tool in this connection is the so-called Noether theorem. In fact, in Zambrini's words, reference [260] contains the first concrete application of this theorem. In [260] the author formulates a theorem like Theorem 6.3 below; he also uses so-called "Bernstein diffusions" (see e.g. [65]) for the "Euclidean Born interpretation" of quantum mechanics. The Bernstein diffusions are related to solutions of
$$\Bigl(\frac{\partial}{\partial t} - \bigl(K_0\,\dot+\,V\bigr)\Bigr)\eta(t,x) = 0, \quad\text{and of}\quad \Bigl(\frac{\partial}{\partial t} + \bigl(K_0\,\dot+\,V\bigr)\Bigr)\eta^*(t,x) = 0.$$
In the present chapter we prove a version of the stochastic Noether theorem in terms of the carré du champ operator and ideas from stochastic control: see Theorem 6.13, which should be compared with Theorem 2.4 in [260]. The operator K₀ generates a diffusion in the following sense: for every C^∞-function Φ : Rⁿ → R with Φ(0, …, 0) = 0, the following identity is valid:
$$K_0\bigl(\Phi(f_1,\dots,f_n)\bigr) = \sum_{j=1}^n \frac{\partial\Phi}{\partial x_j}(f_1,\dots,f_n)\,K_0 f_j - \frac12\sum_{j,k=1}^n \frac{\partial^2\Phi}{\partial x_j\partial x_k}(f_1,\dots,f_n)\,\Gamma_1(f_j,f_k) \tag{6.1}$$

for all functions f₁, …, fₙ in a rich enough algebra of functions A, contained in the domain of the generator K₀, as described below. The condition Φ(0, …, 0) = 0 may be omitted in case the constant function 1 belongs to the domain of the operator K₀.
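The identity (6.1) can be checked symbolically for the model case K₀ = −½∆ in one space variable, for which Γ₁(f, g) = f′g′; this particular choice of K₀ and of the test functions is an assumption made purely for illustration:

```python
import sympy as sp

x, u, v = sp.symbols('x u v')

# Model case: K0 = -(1/2) d^2/dx^2 and Gamma_1(f, g) = f' g' (illustrative).
K0 = lambda g: -sp.diff(g, x, 2) / 2
Gamma1 = lambda g, h: sp.diff(g, x) * sp.diff(h, x)

f1, f2 = sp.sin(x), sp.exp(x) - 1           # test functions
Phi = u * v + u**3                          # C^infty function with Phi(0, 0) = 0
comp = Phi.subs({u: f1, v: f2})

# Right-hand side of (6.1): first-order terms minus 1/2 * second-order terms.
rhs = (sp.diff(Phi, u).subs({u: f1, v: f2}) * K0(f1)
       + sp.diff(Phi, v).subs({u: f1, v: f2}) * K0(f2)
       - sp.Rational(1, 2) * (sp.diff(Phi, u, 2).subs({u: f1, v: f2}) * Gamma1(f1, f1)
                              + 2 * sp.diff(Phi, u, v).subs({u: f1, v: f2}) * Gamma1(f1, f2)
                              + sp.diff(Phi, v, 2).subs({u: f1, v: f2}) * Gamma1(f2, f2)))

assert sp.simplify(K0(comp) - rhs) == 0
print("identity (6.1) verified for K0 = -Laplacian/2")
```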

Hypotheses on the generator and the algebra A

We will assume that the constant functions belong to D(K₀), and that K₀1 = 0. The algebra A has to be “large” enough. To be specific, we assume that the operator K₀ is a space-time operator with domain in C_b([0, T] × E), and that A is a core for the operator K₀, which means that the T_β-closure of its graph {(ϕ, K₀ϕ) : ϕ ∈ A} is again the graph of a T_β-closed operator, which we keep denoting by K₀. In addition, it is assumed that A is stable under composition with C^∞-functions of several variables that vanish at the origin. Moreover, in order to obtain some nice results a rather technical condition is required: whenever (fₙ : n ∈ N) is a sequence in A that converges to f with respect to the T_β-topology in C_b([0, T] × E) × C_b([0, T] × E), and whenever Φ : R → R is a C^∞-function, vanishing at 0, with bounded derivatives of all orders (including the order 0), then one may extract a subsequence (Φ(f_{n_k}) : k ∈ N) that converges to Φ(f) in C_b([0, T] × E), while the sequence (K₀Φ(f_{n_k}) : k ∈ N) converges in C_b([0, T] × E). Notice that all functions of the form e^{−ψ}f, ψ, f ∈ A, belong to A. Also notice that the required properties of A depend on the generator K₀. In fact we will assume that the algebra A is also large enough for all operators of the form f ↦ e^ψ K₀(e^{−ψ}f), where ψ belongs to A. The operator K₀ is supposed to be T_β-closed when viewed as an operator acting on functions in C_b([0, T] × E).
Remark 6.1. Let ds be the Lebesgue measure on [0, T]. If there exists a reference measure m on the Borel field 𝓔 of E, and if we want to work in the spaces L^p([0, T] × E, ds × m), 1 ≤ p < ∞, then it is assumed that K₀ has dense domain in L^p([0, T] × E, ds × m) for each 1 ≤ p < ∞. In addition, it is assumed that A is a subalgebra of D(K₀) which possesses the following properties (cf. Bakry [16]). It is dense in L^p([0, T] × E, ds × m) for all 1 ≤ p < ∞, and it is a core for K₀, provided K₀ is considered as a densely defined operator in such a space. The latter means that the algebra A consists of functions in D(K₀), with K₀ viewed as an operator in L^p([0, T] × E, ds × m). The same is true for the space C_b([0, T] × E) = C_b([0, T] × E, C), but then relative to the strict topology. In addition, it is assumed that A is stable under composition with C^∞-functions of several variables that vanish at the origin. Moreover, as indicated above, in order to obtain some nice results a more technical condition is required. Whenever (fₙ : n ∈ N) is a sequence in A that converges to f with respect to the graph norm of K₀ (in L²([0, T] × E, ds × m)), and whenever Φ : R → R is a C^∞-function, vanishing at 0, with bounded derivatives of all orders (including the order 0), then there exists a subsequence (Φ(f_{n_k}) : k ∈ N) that converges to Φ(f) in C_b([0, T] × E), while the sequence (K₀Φ(f_{n_k}) : k ∈ N) converges in C_b([0, T] × E) and also in L¹(E, m) to K₀Φ(f).

Some additional comments

From (6.1) we see that
$$-e^{\psi}K_0\bigl(e^{-\psi}f\bigr) = \Bigl(K_0\psi + \frac12\Gamma_1(\psi,\psi)\Bigr)f - K_0 f - \Gamma_1(\psi, f), \quad\text{and} \tag{6.2}$$
$$K_0(\varphi\psi) = (K_0\varphi)\psi + \varphi(K_0\psi) - \Gamma_1(\varphi,\psi) \tag{6.3}$$
for ϕ, ψ ∈ A and f ∈ D(K₀). We have the result in Theorem 4.45 for generators of diffusions. For the notion of the squared gradient operator (carré du champ opérateur) see equality (6.9). The operator K₀ acts on the space and time variables, while the squared gradient operator Γ₁ only acts on the space variable; its action depends on the time coordinate. The symbol D₁ stands for the operator D₁ = ∂/∂t. Fix T > t₀ ≥ 0. In the remainder of the present chapter we work in continuous function spaces like C_b((t₀, T] × E) and sometimes in C((t₀, T] × E). If we write D(D₁ − K₀) for the domain of the operator D₁ − K₀, then the corresponding space should be specified. In fact the space C_b((t₀, T] × E) is endowed with the strict topology T_β, and also with that of uniform convergence. The operator D₁ − K₀ is considered as the generator of the semigroup {S(ρ) : ρ ≥ 0} defined by
$$S(\rho)f(\tau, x) = P\bigl(\tau, (\rho+\tau)\wedge T\bigr)f\bigl((\rho+\tau)\wedge T, \cdot\bigr)(x) = E_{\tau,x}\bigl[f\bigl((\rho+\tau)\wedge T, X((\rho+\tau)\wedge T)\bigr)\bigr]. \tag{6.4}$$

Here {P(s, t) : 0 ≤ s ≤ t ≤ T} is the Feller propagator generated by the operator −K₀: see Definition 1.31 and also Definition 1.30. The formula in (6.5) is the same as formula (2.87) in Chapter 2. It follows that for t + ρ ≤ T we have
$$S(\rho)P(\tau+\rho, t+\rho)f(\tau, x) = P(\tau, t+\rho)f(t+\rho, \cdot)(x) = S(t+\rho)f(\tau, x), \tag{6.5}$$
where f ∈ C_b(E). Notice that in (6.5) the operator S(ρ) acts on the function (s, y) ↦ P(τ+ρ, t+ρ)f(s, ·)(y) and that S(t+ρ) acts on the function (s, y) ↦ f(y). The process
$$\bigl\{(\Omega, \mathcal F_T^\tau, P_{\tau,x}), \ (X(t) : T\ge t\ge 0), \ (\vee_t : T\ge t\ge 0), \ (E, \mathcal E)\bigr\} \tag{6.6}$$
is the strong Markov process generated by −K₀; it is supposed to have continuous paths. In the space C(E) the operator K₀ is considered as a local operator in the sense that a function f ∈ C(E) belongs to its domain if there

exists a function g ∈ C ((τ, T ) × E) such that for every open subset U of E


together with every compact subset K of U we have
¯ ¯
¯ f (τ, x) − Eτ,x [f (τ + h, X(τ + h)) : τU > τ + h] ¯¯
lim sup ¯ g(τ, x) −
h↓0 (τ,x)∈[0,T −h]×K ¯ h ¯

= 0.

Here τU is the first exit time from U : τU = inf {t > 0 : X(t) ∈ E \ U }. We


write g = −K0 f . From Proposition 1.6 in [70] page 9 it follows that the
constant function 1 belongs to the domain of K0 and that K0 1 = 0, provided
K0 is time-independent.
Remark 6.2. In a more classical context, e.g. in L^p-spaces, the operator K₀ can often be considered as a differential operator in the “distributional” sense. In a physical context the operators K₀(s), s ∈ [0, T], are considered as self-adjoint operators in L²(E, m). It is noticed that there exists a close relationship between the viscous Burgers equation (in an open subset of R^d)
$$-\frac{\partial U}{\partial t} + U\cdot\nabla U - \frac12\Delta U = \nabla V,$$
and the Hamilton-Jacobi-Bellman equation. If we write the vector field U in the form U = ∇ϕ, then the function ϕ satisfies
$$-\frac{\partial\varphi}{\partial t} + \frac12\nabla\varphi\cdot\nabla\varphi - \frac12\Delta\varphi = V + \text{constant}.$$
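In one space dimension the link in Remark 6.2 can be verified symbolically: differentiating the Hamilton–Jacobi–Bellman expression in x and substituting U = ∂ϕ/∂x reproduces the viscous Burgers equation with force ∂V/∂x. A sketch; the reduction to one dimension is an illustrative assumption:

```python
import sympy as sp

t, x = sp.symbols('t x')
phi = sp.Function('phi')(t, x)
V = sp.Function('V')(t, x)

# HJB-type expression from Remark 6.2: -phi_t + (1/2) phi_x^2 - (1/2) phi_xx - V
# (the additive constant drops out after differentiating in x).
hjb = (-sp.diff(phi, t) + sp.Rational(1, 2) * sp.diff(phi, x)**2
       - sp.Rational(1, 2) * sp.diff(phi, x, 2) - V)

U = sp.diff(phi, x)
burgers = (-sp.diff(U, t) + U * sp.diff(U, x)
           - sp.Rational(1, 2) * sp.diff(U, x, 2) - sp.diff(V, x))

# d/dx of the HJB expression is exactly the Burgers expression for U = phi_x.
assert sp.simplify(sp.diff(hjb, x) - burgers) == 0
print("Burgers equation recovered from HJB via U = grad(phi)")
```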

6.2 The Hamilton-Jacobi-Bellman equation and its solution

In this section we will mainly be concerned with the Hamilton-Jacobi-Bellman equation as exhibited in (6.7). We have the following result for generators of diffusions; it refines Theorem 2.4 in Zambrini [260]. Its proof is contained in the proof of Theorem 6.8.
Theorem 6.3. Let χ : (t₀, T] × E → [0, ∞] be a function such that
$$E^{M_{v,t}}_{t,x}\bigl[\lvert\log\chi(T, X(T))\rvert\bigr], \qquad v\in D(D_1-K_0),$$
is finite for t₀ < t ≤ T. Here T > t₀ ≥ 0 are fixed times and
$$\bigl\{(\Omega, \mathcal F_T^\tau, P_{\tau,x}), \ (X(t) : T\ge t\ge 0), \ (\vee_t : T\ge t\ge 0), \ (E, \mathcal E)\bigr\}$$
is the strong Markov process generated by −K₀. Let S_L be a solution to the following Riccati type equation, called the Hamilton-Jacobi-Bellman equation: for t₀ < s ≤ T and x ∈ E,
$$\begin{cases}\displaystyle -\frac{\partial S_L}{\partial s}(s,x) + \frac12\Gamma_1(S_L, S_L)(s,x) + K_0 S_L(s,x) - V(s,x) = 0;\\[1ex] S_L(T,x) = -\log\chi(T,x), \quad x\in E.\end{cases} \tag{6.7}$$
Then for any real-valued v ∈ D(D₁ − K₀) the following inequality is valid:
$$S_L(t,x) \le E^{M_{v,t}}_{t,x}\Bigl[\int_t^T \Bigl(\frac12\Gamma_1(v,v) + V\Bigr)(\tau, X(\tau))\,d\tau\Bigr] - E^{M_{v,t}}_{t,x}\bigl[\log\chi(T, X(T))\bigr], \tag{6.8}$$
and equality is attained for the “Lagrangian action” v = S_L.
By definition E_{t,x}[Y] is the expectation, conditioned at X(t) = x, of the random variable Y, which is measurable with respect to the information from the future, i.e. with respect to σ{X(s) : s ≥ t}. The measure P^{M_{v,t}}_{t,x} is defined in equality (6.10) below. Put η_χ(t, x) = exp(−S_L(t, x)), where S_L satisfies (6.7). From (6.1) it follows that
$$\Bigl(\frac{\partial}{\partial t} - \bigl(K_0\,\dot+\,V\bigr)\Bigr)\eta_\chi(t,x) = 0,$$
provided that K₀1(t, x) = 0 for all (t, x) ∈ [0, T] × E.
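The exponential (Cole–Hopf type) transform η_χ = exp(−S_L) turning the nonlinear equation (6.7) into a linear equation can be checked symbolically for the model case K₀ = −½∆ in one variable with Γ₁(f, f) = (f′)²; this choice of K₀ is an illustrative assumption:

```python
import sympy as sp

t, x = sp.symbols('t x')
S = sp.Function('S')(t, x)
V = sp.Function('V')(t, x)

# Model case: K0 = -(1/2) d^2/dx^2 and Gamma_1(S, S) = (S_x)^2.
K0 = lambda g: -sp.Rational(1, 2) * sp.diff(g, x, 2)

# Hamilton-Jacobi-Bellman expression from (6.7).
hjb = -sp.diff(S, t) + sp.Rational(1, 2) * sp.diff(S, x)**2 + K0(S) - V

eta = sp.exp(-S)
linear = sp.diff(eta, t) - (K0(eta) + V * eta)

# (d/dt - (K0 + V)) e^{-S} = e^{-S} * (HJB expression), so the HJB equation
# for S is equivalent to the linear equation for eta.
assert sp.simplify(linear - eta * hjb) == 0
print("linearization of (6.7) by eta = exp(-S) verified")
```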
Fix a function v : (t₀, T] × E → R in D(D₁ − K₀), where, as above, D₁ = ∂/∂t is differentiation with respect to t. Let the process
$$\bigl\{(\Omega, \mathcal F, P_{t,x}), \ ((q_v(t), t) : t\ge 0), \ (\vee_t : t\ge 0), \ (\mathbb R^+\times E, \mathcal B_{\mathbb R^+}\otimes\mathcal E)\bigr\}$$
be the Markov process generated by the operator −K_v + D₁, where K_v is defined by K_v(f)(t, x) = K₀f(t, x) + Γ₁(v, f)(t, x). Here 𝓑_{R⁺} denotes the Borel field of R⁺, and by Γ₁(v, f)(t, x) we mean
$$\Gamma_1(v,f)(t,x) = \lim_{s\downarrow t}\frac{1}{s-t}\,E_{t,x}\bigl[\bigl(v(s, X(s)) - v(t, X(t))\bigr)\bigl(f(s, X(s)) - f(t, X(t))\bigr)\bigr]. \tag{6.9}$$

We also believe that the following version of the Cameron-Martin formula is true. For all finite n-tuples t₁, …, tₙ in (0, ∞) the identity in (6.11) is valid:
$$E^{M_{v,t}}_{t,x}\Bigl[\prod_{j=1}^n f_j\bigl(t_j+t, X(t_j+t)\bigr)\Bigr] \tag{6.10}$$
$$= E_{t,x}\Bigl[\exp\Bigl(-\frac12\int_t^T \Gamma_1(v,v)(\tau, X(\tau))\,d\tau - M_{v,t}(T)\Bigr)\prod_{j=1}^n f_j\bigl(t_j+t, X(t_j+t)\bigr)\Bigr]$$
$$= E_{t,x}\Bigl[\prod_{j=1}^n f_j\bigl(t_j+t, q_v(t_j+t)\bigr)\Bigr], \tag{6.11}$$

where the E_{t,x}-martingale M_{v,t}(s), s ≥ t, is given by
$$M_{v,t}(s) = v(s, X(s)) - v(t, X(t)) + \int_t^s \Bigl(-\frac{\partial}{\partial\tau} + K_0\Bigr)v(\tau, X(\tau))\,d\tau. \tag{6.12}$$
Its quadratic variation part ⟨M_{v,t}⟩(s) := ⟨M_{v,t}, M_{v,t}⟩(s) is given by
$$\langle M_{v,t}\rangle(s) = \int_t^s \Gamma_1(v,v)(\tau, X(\tau))\,d\tau. \tag{6.13}$$
The equality in (6.10) serves as a definition of the measure P^{M_{v,t}}_{t,x}(·), and the equality in (6.11) is a statement.
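For Brownian motion (−K₀ = ½∆) and the linear function v(x) = θx, the martingale (6.12) reduces to M_{v,t}(s) = θ(X(s) − X(t)) and Γ₁(v, v) = θ², so (6.10)–(6.11) become the classical Cameron–Martin formula: weighting by exp(−θ(X(T) − X(t)) − ½θ²(T − t)) turns Brownian motion into Brownian motion with drift −θ. A Monte Carlo sketch under these illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, t, u, T, x = 0.5, 0.0, 0.6, 1.0, 0.0
n_paths = 400_000

# Brownian motion on [t, T] started at x; evaluate a test function at time u.
Z1 = rng.standard_normal(n_paths)
Z2 = rng.standard_normal(n_paths)
X_u = x + np.sqrt(u - t) * Z1
X_T = X_u + np.sqrt(T - u) * Z2

f = lambda y: np.cos(y)

# Left-hand side of (6.10): weight by exp(-theta^2 (T-t)/2 - M_{v,t}(T)).
weight = np.exp(-0.5 * theta**2 * (T - t) - theta * (X_T - x))
lhs = (weight * f(X_u)).mean()

# Right-hand side (6.11): the drifted process q_v(u) = x + W(u-t) - theta (u-t).
rhs = f(x + np.sqrt(u - t) * Z1 - theta * (u - t)).mean()

assert abs(lhs - rhs) < 0.01
print(lhs, rhs)
```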
The proof of Theorem 6.3 can be found in [240]; Theorem 6.3 is superseded by the second inequality in assertion (i) of Theorem 6.8.
Next, let χ : [t, T] × E → [0, ∞] be as in Theorem 6.3. In what follows we write D₁ = ∂/∂t; we also write D₁ϕ = ϕ̇. What is the relationship between the following expressions?
$$\sup_{\Phi\in D(D_1-K_0)}\Bigl\{\Phi(t,x) : -\dot\Phi + K_0\Phi + \frac12\Gamma_1(\Phi,\Phi) \le V, \ \Phi(T,\cdot) \le -\log\chi(T,\cdot)\Bigr\}; \tag{6.14}$$
$$-\log E_{t,x}\Bigl[\exp\Bigl(-\int_t^T V(\sigma, X(\sigma))\,d\sigma\Bigr)\chi(T, X(T))\Bigr]; \tag{6.15}$$
$$\inf_{v\in D(D_1-K_0)}\Bigl\{E^{M_{v,t}}_{t,x}\Bigl[\int_t^T\Bigl(\frac12\Gamma_1(v,v)+V\Bigr)(\tau, X(\tau))\,d\tau\Bigr] - E^{M_{v,t}}_{t,x}\bigl[\log\chi(T, X(T))\bigr]\Bigr\}; \tag{6.16}$$
$$\inf_{\Phi\in D(D_1-K_0)}\Bigl\{\Phi(t,x) : -\dot\Phi + K_0\Phi + \frac12\Gamma_1(\Phi,\Phi) \ge V, \ \Phi(T,\cdot) \ge -\log\chi(T,\cdot)\Bigr\}. \tag{6.17}$$

In order that everything works appropriately we need the following definition and lemma.

Definition 6.4. The potential V : [0, T] × E → R satisfies the Miyadera perturbation condition, provided that
$$\lim_{s\downarrow 0}\ \sup_{(\tau,x)\in[0,T-s]\times E} E_{\tau,x}\Bigl[\int_\tau^{\tau+s} V_-(\rho, X(\rho))\,d\rho\Bigr] = \lim_{s\downarrow 0}\ \sup_{(\tau,x)\in[0,T-s]\times E} \int_\tau^{\tau+s} P(\tau,\rho)V_-(\rho,\cdot)(x)\,d\rho < 1. \tag{6.18}$$
For more information on Miyadera perturbations the reader is referred to e.g. Räbiger et al. [195, 196].

Lemma 6.5. Suppose that
$$\alpha := \lim_{s\downarrow 0}\ \sup_{(\tau,x)\in[0,T-s]\times E}\int_\tau^{\tau+s} P(\tau,\rho)V_-(\rho,\cdot)(x)\,d\rho < 1. \tag{6.19}$$
Then
$$\sup_{(\tau,x)\in[0,T]\times E} E_{\tau,x}\Bigl[\exp\Bigl(\int_\tau^T V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr] < \infty. \tag{6.20}$$

Proof. Choose n ∈ N so large that
$$\alpha_n := \sup_{(\tau,x)\in[0,T]\times E}\int_\tau^{(n\tau+T)/(n+1)} P(\tau,\rho)V_-(\rho,\cdot)(x)\,d\rho < 1. \tag{6.21}$$
By (6.18) such a choice is possible. For fixed τ ∈ [0, T] we choose a subdivision of the interval [τ, T] in such a way that
$$\tau = \tau_0 < \tau_1 < \cdots < \tau_n < \tau_{n+1} = T, \quad\text{where}\quad \tau_j = \frac{n+1-j}{n+1}\,\tau + \frac{j}{n+1}\,T.$$
Notice that τ_{k+1} − τ_k = (T − τ)/(n + 1) ≤ T/(n + 1). Then by the Markov property we have
$$E_{\tau,x}\Bigl[\exp\Bigl(\int_\tau^T V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr] = E_{\tau,x}\Bigl[\prod_{j=0}^{n}\exp\Bigl(\int_{\tau_j}^{\tau_{j+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr]$$
$$= E_{\tau,x}\Bigl[\prod_{j=0}^{n-1}\exp\Bigl(\int_{\tau_j}^{\tau_{j+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)\,E_{\tau_n, X(\tau_n)}\Bigl[\exp\Bigl(\int_{\tau_n}^{\tau_{n+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr]\Bigr]$$

(by induction)

$$\le \prod_{k=0}^{n}\ \sup_{y\in E}\ E_{\tau_k, y}\Bigl[\exp\Bigl(\int_{\tau_k}^{\tau_{k+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr]. \tag{6.22}$$
We also have
$$E_{\tau_k, y}\Bigl[\exp\Bigl(\int_{\tau_k}^{\tau_{k+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr] = 1 + \sum_{\ell=1}^{\infty}\frac{1}{\ell!}\,E_{\tau_k, y}\Bigl[\Bigl(\int_{\tau_k}^{\tau_{k+1}} V_-(\rho, X(\rho))\,d\rho\Bigr)^{\ell}\Bigr]$$
$$= 1 + \sum_{\ell=1}^{\infty} E_{\tau_k, y}\Bigl[\ \idotsint\limits_{\tau_k<\rho_1<\cdots<\rho_\ell<\tau_{k+1}}\ \prod_{j=1}^{\ell} V_-(\rho_j, X(\rho_j))\,d\rho_\ell\cdots d\rho_1\Bigr]$$

(again by the Markov property)

$$= 1 + \sum_{\ell=1}^{\infty} E_{\tau_k, y}\Bigl[\ \idotsint\limits_{\tau_k<\rho_1<\cdots<\rho_{\ell-1}<\tau_{k+1}}\ \prod_{j=1}^{\ell-1} V_-(\rho_j, X(\rho_j))\ E_{\rho_{\ell-1}, X(\rho_{\ell-1})}\Bigl[\int_{\rho_{\ell-1}}^{\tau_{k+1}} V_-(\rho_\ell, X(\rho_\ell))\,d\rho_\ell\Bigr]\,d\rho_{\ell-1}\cdots d\rho_1\Bigr]$$
$$\le \sum_{\ell=0}^{\infty}\Bigl(\ \sup_{(\rho,z)\in[\tau_k,\tau_{k+1}]\times E} E_{\rho,z}\Bigl[\int_\rho^{\tau_{k+1}} V_-(s, X(s))\,ds\Bigr]\Bigr)^{\ell}$$

(notice the inequality τ_{k+1} ≤ ρ + (T − τ)/(n + 1))

$$\le \sum_{\ell=0}^{\infty}\alpha_n^{\ell} = \frac{1}{1-\alpha_n}, \tag{6.23}$$
where in the final step of (6.23) we used (6.21). From (6.22) and (6.23) we obtain (6.20).
This completes the proof of Lemma 6.5.
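The mechanism of Lemma 6.5 is the elementary bound e^x ≤ 1/(1 − x), 0 ≤ x < 1, applied on each of the n + 1 subintervals. For the trivial deterministic case V₋ ≡ c (an illustrative assumption, not the general setting of the lemma) the estimate (6.22)–(6.23) becomes explicit and can be checked numerically:

```python
import math

c, T = 2.0, 1.0          # constant negative part V_- = c (illustrative)

# Choose n with alpha_n = cT/(n+1) < 1, as in (6.21).
n = 2
alpha_n = c * T / (n + 1)
assert alpha_n < 1

# On each subinterval the exponential moment is bounded by the geometric
# series 1/(1 - alpha_n); multiplying over the n+1 subintervals gives (6.20).
lhs = math.exp(c * T)                       # E[exp(int_0^T V_- ds)] here
bound = (1.0 / (1.0 - alpha_n)) ** (n + 1)
assert lhs <= bound
print(lhs, bound)
```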

We also have to insert the standard Feynman-Kac formula and its properties related to the strict topology. In addition, we have to discuss matters like stability and consistency of families of Kato type or Miyadera type potentials. More precisely, let (V_k)_{k∈N} be a sequence of potentials which satisfies, uniformly in k, a condition like (6.19). Under what consistency (or convergence) conditions can we be sure that the corresponding perturbed evolutions {P_{V_k}(s, t) : 0 ≤ s ≤ t ≤ T}, k ∈ N, converge to an evolution of the form {P_V(s, t) : 0 ≤ s ≤ t ≤ T}? In addition, we want this convergence to behave in such a way that the operators P_V(s, t), 0 ≤ s ≤ t ≤ T, map bounded continuous functions to bounded continuous functions, provided the same is true for each of the operators P_{V_k}(s, t), k ∈ N, 0 ≤ s ≤ t ≤ T.
Theorem 6.6. Let the Feller evolution {P(s, t) : 0 ≤ s ≤ t ≤ T} be the transition probabilities of the Markov process in (6.6). Let V : [0, T] × E → R be a Miyadera type potential function with the following properties:
(i) Its negative part satisfies (6.19).
(ii) For every k, ℓ ∈ N and f ∈ C_b([0, T] × E), the function
$$(\tau, x, t) \mapsto E_{\tau,x}\Bigl[\int_\tau^{(\tau+t)\wedge T} V_{k,\ell}(\rho, X(\rho))\,f(\rho, X(\rho))\,d\rho\Bigr]$$
is continuous. Here V_{k,ℓ} = (V ∧ ℓ) ∨ (−k).
(iii) The following equalities hold for all compact subsets K of E:
$$\lim_{\ell\to\infty}\ \sup_{(\tau,x)\in[0,T]\times K} E_{\tau,x}\Bigl[\int_\tau^T 0\vee(V-\ell)(\rho, X(\rho))\,d\rho\Bigr] = 0, \quad\text{and}$$
$$\lim_{k\to\infty}\ \sup_{(\tau,x)\in[0,T]\times K} E_{\tau,x}\Bigl[\int_\tau^T 0\vee(-V-k)(\rho, X(\rho))\,d\rho\Bigr] = 0. \tag{6.24}$$
(iv) The function V satisfies
$$\sup_{(\tau,x)\in[0,T]\times E} E_{\tau,x}\Bigl[\int_\tau^T |V(\rho, X(\rho))|\,d\rho\Bigr] < \infty.$$
Then the functions
$$(\tau, x, t) \mapsto E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T} V(\rho, X(\rho))\,d\rho\Bigr)f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr], \qquad f\in C_b(E), \tag{6.25}$$
are bounded and continuous.
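The proof of Theorem 6.6 rests on two elementary properties of the truncations V_{k,ℓ} = (V ∧ ℓ) ∨ (−k): they are squeezed between −V₋ and V₊, and the error satisfies |V − V_{k,ℓ}| ≤ 0 ∨ (V − ℓ) + 0 ∨ (−V − k). Both can be checked numerically on arbitrary sampled values (the sample below is only an illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
V = 10.0 * rng.standard_normal(1_000_000)   # arbitrary sample of potential values
k, ell = 3.0, 5.0

V_kl = np.maximum(np.minimum(V, ell), -k)   # (V ∧ ell) ∨ (-k)

# Squeeze property used at the start of the proof: -V_- <= V_{k,l} <= V_+.
assert np.all(V_kl >= -np.maximum(-V, 0.0))
assert np.all(V_kl <= np.maximum(V, 0.0))

# Error bound used in (6.28): |V - V_{k,l}| <= 0 ∨ (V - ell) + 0 ∨ (-V - k).
err = np.abs(V - V_kl)
bound = np.maximum(V - ell, 0.0) + np.maximum(-V - k, 0.0)
assert np.all(err <= bound + 1e-12)
print("truncation bounds verified")
```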

Remark 6.7. Suppose that the functions in (6.24) are continuous; i.e. suppose that for every k ∈ N the functions
$$(\tau, x) \mapsto E_{\tau,x}\Bigl[\int_\tau^T 0\vee(V-k)(\rho, X(\rho))\,d\rho\Bigr] \quad\text{and}\quad (\tau, x) \mapsto E_{\tau,x}\Bigl[\int_\tau^T 0\vee(-V-k)(\rho, X(\rho))\,d\rho\Bigr]$$
are continuous. Then (iii) is a consequence of (iv). From (iv) it follows that the pointwise limits in (6.24) are zero. By Dini's lemma this convergence occurs uniformly on compact subsets of [0, T] × E. Also observe that the expressions in (6.24) decrease monotonically with increasing ℓ and k, respectively.

Proof. Let f ∈ C_b(E) be such that ‖f‖_∞ ≤ 1. First we notice that −V₋ ≤ V_{k,ℓ} ≤ V₊, and hence |V − V_{k,ℓ}| ≤ |V|. It follows that
$$E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T} V(\rho, X(\rho))\,d\rho\Bigr)f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr] - E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T} V_{k,\ell}(\rho, X(\rho))\,d\rho\Bigr)f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr]$$
$$= \int_0^1 E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T}\bigl\{(1-s)V + sV_{k,\ell}\bigr\}(\rho, X(\rho))\,d\rho\Bigr)\int_\tau^{(\tau+t)\wedge T}(V - V_{k,\ell})(\rho, X(\rho))\,d\rho\ f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr]\,ds,$$
and hence
$$\Bigl|E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T} V(\rho, X(\rho))\,d\rho\Bigr)f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr] - E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T} V_{k,\ell}(\rho, X(\rho))\,d\rho\Bigr)f\bigl(X((\tau+t)\wedge T)\bigr)\Bigr]\Bigr|$$
$$\le \int_0^1 E_{\tau,x}\Bigl[\exp\Bigl(-\int_\tau^{(\tau+t)\wedge T}\bigl\{(1-s)V + sV_{k,\ell}\bigr\}(\rho, X(\rho))\,d\rho\Bigr)\Bigl|\int_\tau^{(\tau+t)\wedge T}(V - V_{k,\ell})(\rho, X(\rho))\,d\rho\Bigr|\Bigr]\,ds\,\|f\|_\infty$$
$$\le E_{\tau,x}\Bigl[\exp\Bigl(\int_\tau^{(\tau+t)\wedge T} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigl|\int_\tau^{(\tau+t)\wedge T}(V - V_{k,\ell})(\rho, X(\rho))\,d\rho\Bigr|\Bigr]$$
$$\le \Bigl(E_{\tau,x}\Bigl[\exp\Bigl(\frac{2m+2}{2m+1}\int_\tau^{(\tau+t)\wedge T} V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr]\Bigr)^{(2m+1)/(2m+2)}\Bigl(E_{\tau,x}\Bigl[\Bigl|\int_\tau^{(\tau+t)\wedge T}(V - V_{k,\ell})(\rho, X(\rho))\,d\rho\Bigr|^{2m+2}\Bigr]\Bigr)^{1/(2m+2)}. \tag{6.26}$$
In (6.26) we choose m so large that
$$\sup_{(s,y)\in[0,T]\times E} E_{s,y}\Bigl[\exp\Bigl(\frac{2m+2}{2m+1}\int_s^T V_-(\rho, X(\rho))\,d\rho\Bigr)\Bigr] < \infty. \tag{6.27}$$
From Lemma 6.5 it follows that such a choice of m is possible: see (6.19) and (6.21). From the Markov property we infer
\[
\frac{1}{(2m+2)!}\,E_{\tau,x}\Big|\int_\tau^{(\tau+t)\wedge T}(V-V_{k,\ell})(\rho,X(\rho))\,d\rho\Big|^{2m+2}
\le \frac{1}{(2m+2)!}\,E_{\tau,x}\Big[\Big(\int_\tau^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho,X(\rho))\big|\,d\rho\Big)^{2m+2}\Big]
\]
\[
= E_{\tau,x}\Big[\ \idotsint\limits_{\tau<\rho_1<\cdots<\rho_{2m+2}<(\tau+t)\wedge T}\ \prod_{j=1}^{2m+2}\big|(V-V_{k,\ell})(\rho_j,X(\rho_j))\big|\,d\rho_{2m+2}\cdots d\rho_1\Big]
\]
\[
= E_{\tau,x}\Big[\ \idotsint\limits_{\tau<\rho_1<\cdots<\rho_{2m+1}<(\tau+t)\wedge T}\ \prod_{j=1}^{2m+1}\big|(V-V_{k,\ell})(\rho_j,X(\rho_j))\big|
\times E_{\rho_{2m+1},X(\rho_{2m+1})}\Big[\int_{\rho_{2m+1}}^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho_{2m+2},X(\rho_{2m+2}))\big|\,d\rho_{2m+2}\Big]\,d\rho_{2m+1}\cdots d\rho_1\Big]
\]
\[
\le E_{\tau,x}\Big[\ \idotsint\limits_{\tau<\rho_1<\cdots<\rho_{2m+1}<(\tau+t)\wedge T}\ \prod_{j=1}^{2m+1}\big|(V-V_{k,\ell})(\rho_j,X(\rho_j))\big|\,d\rho_{2m+1}\cdots d\rho_1\Big]
\sup_{(s,y)\in[\tau,(\tau+t)\wedge T]\times E}E_{s,y}\Big[\int_s^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho,X(\rho))\big|\,d\rho\Big]
\]

(use induction)

\[
\le E_{\tau,x}\Big[\int_\tau^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho_1,X(\rho_1))\big|\,d\rho_1\Big]
\Big(\sup_{(s,y)\in[\tau,(\tau+t)\wedge T]\times E}E_{s,y}\Big[\int_s^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho,X(\rho))\big|\,d\rho\Big]\Big)^{2m+1}
\]
\[
\le E_{\tau,x}\Big[\int_\tau^{(\tau+t)\wedge T}\big|(V-V_{k,\ell})(\rho_1,X(\rho_1))\big|\,d\rho_1\Big]
\Big(\sup_{(s,y)\in[\tau,(\tau+t)\wedge T]\times E}E_{s,y}\Big[\int_s^{(\tau+t)\wedge T}\big|V(\rho,X(\rho))\big|\,d\rho\Big]\Big)^{2m+1}
\]
\[
\le \Big(E_{\tau,x}\Big[\int_\tau^{(\tau+t)\wedge T}0\vee(V-\ell)(\rho_1,X(\rho_1))\,d\rho_1\Big]
+ E_{\tau,x}\Big[\int_\tau^{(\tau+t)\wedge T}0\vee(-V-k)(\rho_1,X(\rho_1))\,d\rho_1\Big]\Big)
\times\Big(\sup_{(s,y)\in[\tau,(\tau+t)\wedge T]\times E}E_{s,y}\Big[\int_s^{(\tau+t)\wedge T}\big|V(\rho,X(\rho))\big|\,d\rho\Big]\Big)^{2m+1}. \tag{6.28}
\]
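The combinatorial step behind (6.28) — that $\frac{1}{n!}\big(\int_a^b g(\rho)\,d\rho\big)^n$ equals the iterated integral of $\prod_j g(\rho_j)$ over the ordered region $a<\rho_1<\cdots<\rho_n<b$ — can be checked numerically. A minimal sketch; the test function $g$, the interval, and the Monte Carlo parameters are arbitrary choices, not taken from the text:

```python
import math
import random

def ordered_simplex_integral(g, a, b, n, samples=200_000, seed=1):
    """Monte Carlo estimate of the integral of g(r1)*...*g(rn)
    over the ordered region a < r1 < ... < rn < b."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        pt = [rng.uniform(a, b) for _ in range(n)]
        if all(pt[i] < pt[i + 1] for i in range(n - 1)):  # keep ordered draws only
            total += math.prod(g(r) for r in pt)
    return (b - a) ** n * total / samples

# For g(r) = r on [0, 1] with n = 3 the exact value is (1/2)^3 / 3!.
estimate = ordered_simplex_integral(lambda r: r, 0.0, 1.0, 3)
```

With 200 000 draws the estimate agrees with $(1/2)^3/3!\approx 0.0208$ to about two decimal places; the factor $1/(2m+2)!$ produced this way is exactly what makes the bound in (6.28) summable.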
From (6.26), (6.27), (6.28), and assumptions (iii) and (iv) it follows that, uniformly on compact subsets of $[0,T]\times E$, the following equality holds:
\[
E_{\tau,x}\Big[\exp\Big(-\int_\tau^{(\tau+t)\wedge T}V(\rho,X(\rho))\,d\rho\Big)f\big(X((\tau+t)\wedge T)\big)\Big]
= \lim_{k\to\infty}\lim_{\ell\to\infty}E_{\tau,x}\Big[\exp\Big(-\int_\tau^{(\tau+t)\wedge T}V_{k,\ell}(\rho,X(\rho))\,d\rho\Big)f\big(X((\tau+t)\wedge T)\big)\Big]. \tag{6.29}
\]
In order to finish the proof of Theorem 6.6 we need to establish the continuity of the function
\[
(\tau,x,t)\mapsto E_{\tau,x}\Big[\exp\Big(-\int_\tau^{(\tau+t)\wedge T}V_{k,\ell}(\rho,X(\rho))\,d\rho\Big)f\big(X((\tau+t)\wedge T)\big)\Big]. \tag{6.30}
\]
By expanding the exponential in (6.30) and using the Markov property together with assumption (ii), the continuity of the function in (6.30) follows. More precisely, we have
\[
E_{\tau,x}\Big[\exp\Big(-\int_\tau^{(\tau+t)\wedge T}V_{k,\ell}(\rho,X(\rho))\,d\rho\Big)f\big(X((\tau+t)\wedge T)\big)\Big]
= \sum_{n=0}^\infty\frac{(-1)^n}{n!}E_{\tau,x}\Big[\Big(\int_\tau^{(\tau+t)\wedge T}V_{k,\ell}(\rho,X(\rho))\,d\rho\Big)^n f\big(X((\tau+t)\wedge T)\big)\Big]
\]
\[
= E_{\tau,x}\big[f\big(X((\tau+t)\wedge T)\big)\big]
+ \sum_{n=1}^\infty(-1)^n\idotsint\limits_{\tau<\rho_1<\cdots<\rho_n<(\tau+t)\wedge T}
E_{\tau,x}\Big[\prod_{j=1}^nV_{k,\ell}(\rho_j,X(\rho_j))\,f\big(X((\tau+t)\wedge T)\big)\Big]\,d\rho_n\cdots d\rho_1
\]

(Markov property)

\[
= E_{\tau,x}\big[f\big(X((\tau+t)\wedge T)\big)\big]
+ \sum_{n=1}^\infty(-1)^n\idotsint\limits_{\tau<\rho_1<\cdots<\rho_{n-1}<(\tau+t)\wedge T}
E_{\tau,x}\Big[\prod_{j=1}^{n-1}V_{k,\ell}(\rho_j,X(\rho_j))
\]
\[
\qquad\times E_{\rho_{n-1},X(\rho_{n-1})}\Big[\int_{\rho_{n-1}}^{(\tau+t)\wedge T}V_{k,\ell}(\rho_n,X(\rho_n))\,d\rho_n\,f\big(X((\tau+t)\wedge T)\big)\Big]\Big]\,d\rho_{n-1}\cdots d\rho_1. \tag{6.31}
\]

Notice that by assumption (ii) the function
\[
(\rho,t,y)\mapsto E_{\rho,y}\Big[\int_{\rho_{n-1}}^{(\tau+t)\wedge T}V_{k,\ell}(\rho_n,X(\rho_n))\,d\rho_n\,f\big(X((\tau+t)\wedge T)\big)\Big] \tag{6.32}
\]
\[
= E_{\rho,y}\Big[\int_{\rho_{n-1}}^{(\tau+t)\wedge T}V_{k,\ell}(\rho_n,X(\rho_n))\,E_{\rho_n,X(\rho_n)}\big[f\big(X((\tau+t)\wedge T)\big)\big]\,d\rho_n\Big]
\]
is continuous.

By induction with respect to $n$ it follows that each term in the right-hand side of (6.31) is continuous. The series in (6.31) being uniformly convergent yields the continuity of the functions in (6.25). This concludes the proof of Theorem 6.6.
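The Feynman–Kac expectation $E_{\tau,x}\big[\exp\big(-\int V(\rho,X(\rho))\,d\rho\big)f(X(\cdot))\big]$ that the theorem controls can also be approximated by straightforward path simulation. A minimal sketch for the special case where $X$ is one-dimensional Brownian motion and $V$ is constant, so an exact value is available for comparison; the choice of process, potential and all numerical parameters are illustrative assumptions, not part of the text:

```python
import math
import random

def feynman_kac(f, V, x0, t, steps=200, paths=2_000, seed=7):
    """Monte Carlo estimate of E_x0[exp(-int_0^t V(B_s) ds) f(B_t)]
    for standard one-dimensional Brownian motion, via an Euler scheme."""
    rng = random.Random(seed)
    dt, total = t / steps, 0.0
    for _ in range(paths):
        x, integral = x0, 0.0
        for _ in range(steps):
            integral += V(x) * dt                # left-point rule for the time integral
            x += rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        total += math.exp(-integral) * f(x)
    return total / paths

# For V = 1/2 and f = 1 the expectation equals exp(-t/2) exactly.
estimate = feynman_kac(lambda y: 1.0, lambda y: 0.5, x0=0.0, t=1.0)
```

For non-constant $V$ the same routine converges at the usual Monte Carlo rate, which is one concrete reading of the truncation argument $V_{k,\ell}\to V$ above.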

6.3 The Hamilton-Jacobi-Bellman equation and viscosity solutions

A result which is somewhat more general than Theorem 6.3 reads as follows. As above, we work in the space $C_b((t_0,T]\times E)$, where $T>t_0\ge0$ is fixed.
Theorem 6.8. (i) The following inequalities are valid:
\[
\sup_{\Phi\in D(D_1-K_0)}\Big\{\Phi(t,x) : -\dot\Phi + K_0\Phi + \frac12\Gamma_1(\Phi,\Phi)\le V,\ \Phi(T,\cdot)\le-\log\chi(T,\cdot)\Big\}
\]
\[
\le -\log E_{t,x}\Big[\exp\Big(-\int_t^TV(\sigma,X(\sigma))\,d\sigma\Big)\chi(T,X(T))\Big]
\]
\[
\le \inf_{v\in D(D_1-K_0)}\Big\{E_{t,x}^{M_{v,t}}\Big[\int_t^T\Big(\frac12\Gamma_1(v,v)+V\Big)(\tau,X(\tau))\,d\tau\Big] - E_{t,x}^{M_{v,t}}\big[\log\chi(T,X(T))\big]\Big\}
\]
\[
\le \inf_{\Phi\in D(D_1-K_0)}\Big\{\Phi(t,x) : -\dot\Phi + K_0\Phi + \frac12\Gamma_1(\Phi,\Phi)\ge V,\ \Phi(T,\cdot)\ge-\log\chi(T,\cdot)\Big\}.
\]

(ii) If the function $S_L$ defined by the non-linear Feynman-Kac formula
\[
S_L(t,x) = -\log E_{t,x}\Big[\exp\Big(-\int_t^TV(\sigma,X(\sigma))\,d\sigma\Big)\chi(T,X(T))\Big] \tag{6.33}
\]
belongs to $D(D_1-K_0)$, then the above four quantities are equal. Moreover, the function $S_L$ satisfies the Hamilton-Jacobi-Bellman equation (6.7). The same is true if the expressions in (6.14) and in (6.17) are equal.
(iii) In general the function in (6.33) is a viscosity solution of the Hamilton-Jacobi-Bellman equation (6.7). This means that if $(t,x)$ belongs to $(t_0,T]\times E$ and if $\varphi\in D(D_1-K_0)$ has the property that
\[
[S_L-\varphi](t,x) = \sup\{[S_L-\varphi](s,y) : (s,y)\in[t,T]\times E\},
\]
then
\[
[-\dot\varphi + K_0\varphi](t,x) + \frac12\Gamma_1(\varphi,\varphi)(t,x) \le V(t,x). \tag{6.34}
\]
It also means that if $(t,x)$ belongs to $(t_0,T]\times E$ and if $\varphi\in D(D_1-K_0)$ has the property that
\[
[S_L-\varphi](t,x) = \inf\{[S_L-\varphi](s,y) : (s,y)\in[t,T]\times E\},
\]
then
\[
[-\dot\varphi + K_0\varphi](t,x) + \frac12\Gamma_1(\varphi,\varphi)(t,x) \ge V(t,x). \tag{6.35}
\]
(iv) If for all $(t,x)\in(t_0,T]\times E$ the expression
\[
E_{t,x}\Big[\exp\Big(-\int_t^TV(\sigma,X(\sigma))\,d\sigma\Big)\chi(T,X(T))\Big]
\]
is strictly positive, then the following equality is valid:
\[
-\log E_{t,x}\Big[\exp\Big(-\int_t^TV(\sigma,X(\sigma))\,d\sigma\Big)\chi(T,X(T))\Big] \tag{6.36}
\]
\[
= \inf_{v\in D(D_1-K_0)}\Big\{E_{t,x}^{M_{v,t}}\Big[\int_t^T\Big(\frac12\Gamma_1(v,v)+V\Big)(\tau,X(\tau))\,d\tau\Big] - E_{t,x}^{M_{v,t}}\big[\log\chi(T,X(T))\big]\Big\}. \tag{6.37}
\]

(v) Let $S$ be a viscosity solution to (6.7). Suppose that for every $(t,x)\in(t_0,T]\times E$ there exist functions $\varphi_1$ and $\varphi_2\in D(K_0)$ such that
\[
(S-\varphi_1)(t,x) = \sup_{y\in E,\ T>s>t}(S-\varphi_1)(s,y), \quad\text{and} \tag{6.38}
\]
\[
(S-\varphi_2)(t,x) = \inf_{y\in E,\ T>s>t}(S-\varphi_2)(s,y). \tag{6.39}
\]
Then $S=S_L$. More precisely, in the presence of (6.38) and (6.39) the four quantities in assertion (i) are equal.
Notice that the formula in (6.33) is the same as formula (4.32) in Chapter 4. The main difference is notational: in Chapter 4 and the other chapters we write $L(s)$ instead of $-K_0(s)$. The notation $K_0 = \{K_0(s) : 0\le s\le T\}$ refers to a self-adjoint unperturbed (or free) Hamiltonian, often written $H_0$, which is usually given by $H_0 = -\dfrac{\hbar^2}{2m}\Delta$. The Schrödinger equation is then given by $(H_0+V)\psi = i\hbar\dfrac{\partial\psi}{\partial t}$. Here $V$ stands for a potential function, which belongs to a certain Kato type class. In mathematics the normalized Planck constant $\hbar$ and the particle mass $m$ are often set equal to $1$.
Remark 6.9. It would be nice to have explicit, and easy to check, conditions on the function $V$ which guarantee the strict positivity of the expression
\[
E_{t,x}\Big[\exp\Big(-\int_t^TV(\sigma,X(\sigma))\,d\sigma\Big),\ X(T)\in B\Big],
\]
where $B$ is any compact subset of $E$. Another problem which poses itself is the following: what can be done if in equation (6.7) the expression $\Gamma_1(S_L,S_L)$ is replaced with $(\Gamma_1(S_L,S_L))^p$, $p>0$? If $0<p<1$, then the equation can probably be treated by the use of branching processes: see e.g. Etheridge [83] or Dawson and Perkins [67].
Remark 6.10. Another point of concern is the Novikov condition, which is required to be sure that processes of the form
\[
t\mapsto\exp\Big(-M(t)-\frac12\langle M,M\rangle(t)\Big) \quad\text{and} \tag{6.40}
\]
\[
t\mapsto\exp\Big(-M(t)-\frac12\langle M,M\rangle(t)\Big)\big(M(t)+\langle M,M\rangle(t)\big) \tag{6.41}
\]
are martingales. The Novikov condition reads as follows. Let $M(t)$ be a martingale, and suppose that $E\big[\exp\big(\frac12\langle M,M\rangle(t)\big)\big]$ is finite for all $t\ge0$. Then the process in (6.40) is a martingale. So, strictly speaking, we have to assume in the sequel that the Novikov condition is satisfied, i.e. all the expectations ($x\in E$, $t_0\le t<s\le T$)
\[
E_{t,x}\Big[\exp\Big(\frac12\int_t^s\Gamma_1(\varphi,\varphi)(\tau,X(\tau))\,d\tau\Big)\Big]
\]
are supposed to be finite; otherwise we will only get local martingales. For more details on the Novikov condition see e.g. Revuz and Yor [199], Corollary 1.16, page 309.
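For the simplest martingale $M(t)=\sigma B(t)$, with $\langle M,M\rangle(t)=\sigma^2t$, the Novikov condition clearly holds and the process in (6.40) has constant expectation $1$. A small simulation sketch of this normalization; the values of $\sigma$, $t$ and the sample size are arbitrary choices:

```python
import math
import random

def exp_martingale_mean(sigma, t, paths=100_000, seed=3):
    """Estimate E[exp(-sigma*B_t - 0.5*sigma^2*t)]; for a true
    exponential martingale this equals 1 for every t."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(paths):
        b_t = rng.gauss(0.0, math.sqrt(t))  # B_t is N(0, t) distributed
        acc += math.exp(-sigma * b_t - 0.5 * sigma ** 2 * t)
    return acc / paths
```

When the exponential moment in the Novikov condition fails, such simulated averages typically drift below $1$, signalling a strict local martingale.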
Remark 6.11. Another problem is about the uniqueness of the viscosity solution of equation (6.7). In order to address this problem we use a technique which is related to the methods used in Dynkin and Kuznetsov [77], p. 26 ff., and [76], p. 1969 ff. Among other things we tried the method of "doubling the number of variables", as advertised in [85], page 547, but it did not work out so far. We also tried (without success) the jet bundle technique in Crandall, Ishii and Lions [62]. To be precise, we use a martingale technique combined with sub- and super-solutions: see assertion (v) of Theorem 6.8.
First we insert the following proposition.

Proposition 6.12. (i) The operator $D_1-K_0-V$ extends to a generator of a semigroup $\exp(s(D_1-K_0-V))$, $s\ge0$, given by
\[
\exp(s(D_1-K_0-V))\Phi(t,x) = E_{t,x}\Big[\exp\Big(-\int_t^{s+t}V(\tau,X(\tau))\,d\tau\Big)\Phi(s+t,X(s+t))\Big]. \tag{6.42}
\]
(ii) Let the function $S_L(t,x)$ be given by
\[
S_L(t,x) = -\log E_{t,x}\Big[\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau\Big)\chi(T,X(T))\Big].
\]
Then the following identity is valid:
\[
[\exp(s(D_1-K_0-V))\exp(-S_L)](t,x) = \exp(-S_L(t,x)). \tag{6.43}
\]
(iii) Let $\Psi:(t_0,T]\times E\to\mathbb R$ be a function belonging to $D(D_1-K_0)$, and let $V_0:(t_0,T]\times E\to\mathbb R$ be a function for which ($(s,y)\in(t_0,t]\times E$)
\[
\Psi(s,y) = -\log E_{s,y}\Big[\exp\Big(-\int_s^tV_0(\tau,X(\tau))\,d\tau - \Psi(t,X(t))\Big)\Big]. \tag{6.44}
\]
Then $-D_1\Psi + K_0\Psi + \frac12\Gamma_1(\Psi,\Psi) = V_0$ on $(t_0,t]$.


Remark. Suppose that the Feller propagator $\{P(s,t) : 0\le s\le t\le T\}$ has an integral kernel $p_0(s,x;t,y)$ which is continuous on
\[
\{(\tau,x;t,y)\in[0,T]\times E\times[0,T]\times E : 0\le\tau<t\le T\}, \tag{6.45}
\]
and hence, for any bounded Borel measurable function $f:E\to[0,\infty)$, we have $P(\tau,t)f(x) = \int_E p_0(\tau,x;t,y)f(y)\,dm(y)$, where $m$ is a non-negative Radon measure on $E$. Define the measures $\mu^{t,y}_{\tau,x}$ on the $\sigma$-field generated by $X(s)$, $\tau\le s<t\le T$, by
\[
\mu^{t,y}_{\tau,x}(A) = E_{\tau,x}\big[p_0(s,X(s);t,y)\,1_A\big],
\]
where $A$ belongs to the $\sigma$-field generated by $X(\rho)$, $\tau\le\rho<t_0$, with $t_0\in(\tau,t)$ fixed. By the $P_{\tau,x}$-martingale property of the process $s\mapsto p_0(s,X(s);t,y)$, $\tau\le s<t$, the measure $\mu^{t,y}_{\tau,x}$ is well defined and can be extended to the $\sigma$-field generated by $X(s)$, $\tau\le s<t$. The latter can be done via the classical Kolmogorov extension theorem. The integral kernel of the operator $\exp(s(D_1-K_0-V))$ is given by the Feynman-Kac formula:
\[
\exp(s(D_1-K_0-V))(\tau,x;t,y) = \int\exp\Big(-\int_{s+\tau}^{s+t}V(\rho,X(\rho))\,d\rho\Big)\,d\mu^{s+t,y}_{s+\tau,x}.
\]
Under appropriate conditions on $V$, the integral kernel of the operator $\exp(s(D_1-K_0-V))$ is again continuous on the space mentioned in (6.45). Details for time-independent functions $V$ and time-homogeneous Markov processes on second countable locally compact spaces can be found in e.g. [70].

Proof (Proof of Proposition 6.12). (i) Let $s_1$ and $s_2$ be positive real numbers, and let $\Phi$ be a non-negative Borel measurable function defined on $[0,\infty)\times E$. Then we have:
\[
[\exp(s_1(D_1-K_0-V))\exp(s_2(D_1-K_0-V))\Phi](t,x)
= E_{t,x}\Big[\exp\Big(-\int_t^{s_1+t}V(\tau,X(\tau))\,d\tau\Big)\{\exp(s_2(D_1-K_0-V))\Phi\}(s_1+t,X(s_1+t))\Big]
\]
\[
= E_{t,x}\Big[\exp\Big(-\int_t^{s_1+t}V(\tau,X(\tau))\,d\tau\Big)E_{s_1+t,X(s_1+t)}\Big[\exp\Big(-\int_{s_1+t}^{s_2+s_1+t}V(\tau,X(\tau))\,d\tau\Big)\Phi(s_2+s_1+t,X(s_2+s_1+t))\Big]\Big]
\]

(Markov property)

\[
= E_{t,x}\Big[\exp\Big(-\int_t^{s_1+t}V(\tau,X(\tau))\,d\tau\Big)\exp\Big(-\int_{s_1+t}^{s_2+s_1+t}V(\tau,X(\tau))\,d\tau\Big)\Phi(s_2+s_1+t,X(s_2+s_1+t))\Big]
\]
\[
= E_{t,x}\Big[\exp\Big(-\int_t^{s_2+s_1+t}V(\tau,X(\tau))\,d\tau\Big)\Phi(s_2+s_1+t,X(s_2+s_1+t))\Big]
= [\exp((s_1+s_2)(D_1-K_0-V))\Phi](t,x). \tag{6.46}
\]
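The semigroup law in (6.46) is easy to visualize in a finite-dimensional model: for a time-independent potential, replace $K_0$ by a discretized $-\frac12\Delta$ on a grid and $V$ by a diagonal matrix, so that $e^{s_1A}e^{s_2A}=e^{(s_1+s_2)A}$ for $A=-(K_0+V)$ can be checked directly. A sketch under these assumptions (the grid size, spacing and potential are arbitrary choices; the matrix exponential is a plain Taylor series):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(A, terms=60):
    # Matrix exponential by truncated Taylor series (adequate for the small norms used here).
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]  # term = A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def generator(n=8, h=0.5):
    # A = -(K0 + V) with K0 = -(1/2) * discrete Laplacian and V(x) = x^2 / 2.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        x = (i - n // 2) * h
        A[i][i] = -1.0 / h ** 2 - 0.5 * x * x
        if i > 0:
            A[i][i - 1] = 0.5 / h ** 2
        if i < n - 1:
            A[i][i + 1] = 0.5 / h ** 2
    return A

A = generator()
scale = lambda s, M: [[s * v for v in row] for row in M]
lhs = matmul(expm(scale(0.3, A)), expm(scale(0.7, A)))   # e^{0.3 A} e^{0.7 A}
rhs = expm(scale(1.0, A))                                # e^{(0.3 + 0.7) A}
```

The two matrices agree entrywise up to the Taylor truncation error, which is the discrete shadow of the Markov-property computation above.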

Next we prove assertion (ii):
\[
[\exp(s(D_1-K_0-V))\exp(-S_L)](t,x)
= E_{t,x}\Big[\exp\Big(-\int_t^{s+t}V(\tau,X(\tau))\,d\tau\Big)\exp\big(-S_L(s+t,X(s+t))\big)\Big]
\]
\[
= E_{t,x}\Big[\exp\Big(-\int_t^{s+t}V(\tau,X(\tau))\,d\tau\Big)E_{s+t,X(s+t)}\Big[\exp\Big(-\int_{s+t}^TV(\tau,X(\tau))\,d\tau\Big)\chi(T,X(T))\Big]\Big]
\]

(Markov property)

\[
= E_{t,x}\Big[\exp\Big(-\int_t^{s+t}V(\tau,X(\tau))\,d\tau\Big)\exp\Big(-\int_{s+t}^TV(\tau,X(\tau))\,d\tau\Big)\chi(T,X(T))\Big]
\]
\[
= E_{t,x}\Big[\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau\Big)\chi(T,X(T))\Big] = \exp(-S_L(t,x)). \tag{6.47}
\]
This proves assertion (ii).


(iii) From (6.1) and the proof of assertion (ii) of Proposition 6.12 it follows that
\[
-D_1\Psi + K_0\Psi + \frac12\Gamma_1(\Psi,\Psi) = e^{\Psi}(D_1-K_0)e^{-\Psi}
= e^{\Psi}(D_1-K_0-V_0)e^{-\Psi} + V_0
= e^{\Psi}\lim_{s\downarrow0}\frac1s\big(\exp(s(D_1-K_0-V_0))-I\big)e^{-\Psi} + V_0 = V_0, \tag{6.48}
\]
where we used the invariance $\exp(s(D_1-K_0-V_0))e^{-\Psi} = e^{-\Psi}$, $0<s<t$. This proves assertion (iii).
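The log-transform identity used in (6.48), $-D_1\Psi + K_0\Psi + \frac12\Gamma_1(\Psi,\Psi) = e^{\Psi}(D_1-K_0)e^{-\Psi}$, can be checked numerically in the model case $K_0=-\frac12\,\partial^2/\partial x^2$, $\Gamma_1(f,g)=\partial_xf\,\partial_xg$. A sketch; the test function $\Psi(t,x)=tx^2$ and the evaluation point are illustrative assumptions:

```python
import math

# Test function Psi(t, x) = t * x^2 and its exact partial derivatives.
psi = lambda t, x: t * x * x
psi_t = lambda t, x: x * x
psi_x = lambda t, x: 2.0 * t * x
psi_xx = lambda t, x: 2.0 * t

def left_side(t, x):
    # -D1 Psi + K0 Psi + (1/2) Gamma1(Psi, Psi), with K0 = -(1/2) d^2/dx^2.
    return -psi_t(t, x) - 0.5 * psi_xx(t, x) + 0.5 * psi_x(t, x) ** 2

def right_side(t, x, h=1e-5):
    # e^Psi (D1 - K0) e^{-Psi}, with derivatives taken by central differences.
    u = lambda s, y: math.exp(-psi(s, y))
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h ** 2
    return math.exp(psi(t, x)) * (u_t + 0.5 * u_xx)
```

Both sides agree to finite-difference accuracy; this cancellation is exactly what turns the linear Feynman–Kac semigroup into a solution of the non-linear Hamilton-Jacobi-Bellman equation.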

Proof (Proof of Theorem 6.8). (i) The first and the final inequality in (i) follow from the non-linear Feynman-Kac formula. For $\Phi\in D(D_1-K_0)$ we have, with $V_\Phi = -\dot\Phi + K_0\Phi + \frac12\Gamma_1(\Phi,\Phi)$,
\[
\Phi(t,x) = -\log E_{t,x}\Big[\exp\Big(-\int_t^TV_\Phi(\tau,X(\tau))\,d\tau\Big)\exp\big(-\Phi(T,X(T))\big)\Big].
\]
The second inequality of (i) is a consequence of Jensen's inequality, and should be compared with the arguments in Zambrini [260], who used ideas from Fleming and Soner: see Chapter VI in [86]. The reader is also referred to Sheu [211] and to [240]. The inequality we have in mind is the following one:
\[
-\log E_{t,x}^{M_{v,t}}[\exp(-\varphi)] \le E_{t,x}^{M_{v,t}}[\varphi], \tag{6.49}
\]
with equality only if $\varphi$ is constant $P_{t,x}$-almost surely. We apply (6.49) to the stochastic variable $\varphi = \varphi_v$, given by
\[
\varphi_v = -\int_t^T\Big[\frac12\Gamma_1(v,v) + V\Big](\tau,X(\tau))\,d\tau - M_{v,t}(T) - \log\chi(T,X(T)). \tag{6.50}
\]
We also notice that the following processes are $P_{t,x}$-martingales on the interval $[t,T]$:
\[
\exp\Big(-\frac12\langle M_{v,t}\rangle(s) - M_{v,t}(s)\Big) \quad\text{and} \tag{6.51}
\]
\[
\exp\Big(-\frac12\langle M_{v,t}\rangle(s) - M_{v,t}(s)\Big)\big(\langle M_{v,t}\rangle(s) + M_{v,t}(s)\big). \tag{6.52}
\]
By Jensen's inequality we have
\[
E_{t,x}^{M_{v,t}}\Big[\frac12\langle M_{v,t}\rangle(T) + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]

(the process in (6.52) is a $P_{t,x}^{M_{v,t}}$-martingale)

\[
= E_{t,x}^{M_{v,t}}\Big[-\frac12\langle M_{v,t}\rangle(T) - M_{v,t}(T) + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]

(here we apply Jensen's inequality)

\[
\ge -\log E_{t,x}^{M_{v,t}}\Big[e^{\frac12\langle M_{v,t}\rangle(T) + M_{v,t}(T) - \int_t^TV(\tau,X(\tau))\,d\tau + \log\chi(T,X(T))}\Big]
\]

(definition of the probability measure $P_{t,x}^{M_{v,t}}$)

\[
= -\log E_{t,x}\Big[\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau + \log\chi(T,X(T))\Big)\Big].
\]

(ii) The assertion in (ii) immediately follows from (i).
(iii) Let $(t,x)$ belong to $(t_0,T]\times E$, and let $\varphi$ be as in (6.34). Then we have
\[
[-\dot\varphi + K_0\varphi - V](t,x) + \frac12\Gamma_1(\varphi,\varphi)(t,x)
= e^{\varphi(t,x)}\big[(D_1-K_0-V)e^{-\varphi}\big](t,x)
\]
\[
= \exp(\varphi(t,x))\lim_{s\downarrow0}\frac1s\big[(\exp(sD_1-sK_0-sV)-I)\exp(-\varphi)\big](t,x)
\]
\[
= e^{\varphi(t,x)}\liminf_{s\downarrow0}\frac1s\Big(\big[\exp(sD_1-sK_0-sV)\big\{e^{S_L-\varphi}e^{-S_L}\big\}\big](t,x) - e^{S_L(t,x)-\varphi(t,x)}e^{-S_L(t,x)}\Big)
\]
\[
= e^{\varphi(t,x)}\liminf_{s\downarrow0}\frac1s\Big(\big[\exp(sD_1-sK_0-sV)\big\{e^{S_L-\varphi}e^{-S_L}\big\}\big](t,x)
- \sup_{(\sigma,y)\in[t,T]\times E}e^{S_L(\sigma,y)-\varphi(\sigma,y)}\,e^{-S_L(t,x)}\Big)
\]
\[
\le e^{\varphi(t,x)}\liminf_{s\downarrow0}\frac1s\Big(\Big[\exp(sD_1-sK_0-sV)\Big\{\sup_{(\sigma,y)\in[t,T]\times E}e^{S_L-\varphi}(\sigma,y)\,e^{-S_L}\Big\}\Big](t,x)
- \sup_{(\sigma,y)\in[t,T]\times E}e^{S_L(\sigma,y)-\varphi(\sigma,y)}\,e^{-S_L(t,x)}\Big)
\]
\[
\le \sup\big\{e^{S_L-\varphi}(\sigma,y) : (\sigma,y)\in[t,T]\times E\big\}\,\exp(\varphi(t,x))\,
\liminf_{s\downarrow0}\frac1s\Big(\big[\exp(sD_1-sK_0-sV)e^{-S_L}\big](t,x) - e^{-S_L(t,x)}\Big)
\]
\[
= e^{S_L(t,x)-\varphi(t,x)}\,e^{\varphi(t,x)}\cdot0 = 0. \tag{6.53}
\]
The latter equality follows because, for $s>0$, the equality
\[
[\exp(sD_1-sK_0-sV)\exp(-S_L)](t,x) = \exp(-S_L(t,x))
\]
is valid: see Proposition 6.12, assertion (ii). The reverse inequality (6.35) follows in a similar manner.
(iv) In view of assertion (i) we only have to prove that the expression in (6.37) is less than or equal to the one in (6.36). To this end we consider ($v\in D(D_1-K_0)$)
\[
E_{t,x}^{M_{v,t}}\Big[\frac12\int_t^T\Gamma_1(v,v)(\tau,X(\tau))\,d\tau + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
= E_{t,x}^{M_{v,t}}\Big[\frac12\langle M_{v,t}\rangle(T) + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]

(the process in (6.52) is a martingale)

\[
= E_{t,x}^{M_{v,t}}\Big[-\frac12\langle M_{v,t}\rangle(T) - M_{v,t}(T) + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]

(definition of the martingale $s\mapsto M_{v,t}(s)$)

\[
= E_{t,x}^{M_{v,t}}\Big[-\frac12\int_t^T\Gamma_1(v,v)(\tau,X(\tau))\,d\tau - v(T,X(T)) + v(t,X(t))
- \int_t^T(-D_1+K_0)v(\tau,X(\tau))\,d\tau + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]
\[
= E_{t,x}^{M_{v,t}}\Big[\int_t^T\Big\{(D_1-K_0)v(\tau,X(\tau)) - \frac12\Gamma_1(v,v)(\tau,X(\tau))\Big\}\,d\tau
+ \int_t^TV(\tau,X(\tau))\,d\tau + v(t,X(t)) - v(T,X(T)) - \log\chi(T,X(T))\Big]
\]

($(t,X(t)) = (t,x)$, $P_{t,x}$-almost surely)

\[
= v(t,x) + E_{t,x}^{M_{v,t}}\Big[\int_t^T\Big\{(D_1-K_0)v - \frac12\Gamma_1(v,v)\Big\}(\tau,X(\tau))\,d\tau
+ \int_t^TV(\tau,X(\tau))\,d\tau - v(T,X(T)) - \log\chi(T,X(T))\Big]
\]
\[
= v(t,x) + E_{t,x}^{M_{v,t}}\Big[\int_t^T\exp(v(\tau,X(\tau)))\big[(-D_1+K_0+V)\exp(-v)\big](\tau,X(\tau))\,d\tau
- v(T,X(T)) - \log\chi(T,X(T))\Big]
\]
\[
= v(t,x) + E_{t,x}\Big[\exp\Big(-\frac12\langle M_{v,t}\rangle(T) - M_{v,t}(T) + \int_t^TV(\tau,X(\tau))\,d\tau\Big)\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau\Big)
\]
\[
\qquad\times\Big(\int_t^T\exp(v(\tau,X(\tau)))\big[(-D_1+K_0+V)\exp(-v)\big](\tau,X(\tau))\,d\tau - v(T,X(T)) - \log\chi(T,X(T))\Big)\Big]
\]

(apply the equality in (6.2) with $f=1$)

\[
= v(t,x) + E_{t,x}\Big[\exp\Big(\int_t^T\exp(v(\tau,X(\tau)))\big[(-D_1+K_0+V)\exp(-v)\big](\tau,X(\tau))\,d\tau\Big)
\exp\big(v(t,X(t)) - v(T,X(T))\big)\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau\Big)
\]
\[
\qquad\times\Big(\int_t^T\exp(v(\tau,X(\tau)))\big[(-D_1+K_0+V)\exp(-v)\big](\tau,X(\tau))\,d\tau - v(T,X(T)) - \log\chi(T,X(T))\Big)\Big]. \tag{6.54}
\]

Choose $w\in D(D_1-K_0)$ and define for $s>0$ the function $v_s$ by
\[
\exp(-v_s) = \frac1s\int_0^s\exp(\sigma(D_1-K_0-V))\exp(-w)\,d\sigma.
\]
Then
\[
\exp(v_s)(-D_1+K_0+V)\exp(-v_s) = \frac{(I-\exp(s(D_1-K_0-V)))\exp(-w)}{\int_0^s\exp(\sigma(D_1-K_0-V))\exp(-w)\,d\sigma}.
\]
So from (6.54) we obtain for $w$ in the domain of $D_1-K_0$ and $s>0$ the equality
\[
E_{t,x}^{M_{v_s,t}}\Big[\frac12\int_t^T\Gamma_1(v_s,v_s)(\tau,X(\tau))\,d\tau + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big]
\]
\[
= v_s(t,x) + E_{t,x}\Big[\exp\Big(\int_t^T\frac{[(I-\exp(s(D_1-K_0-V)))\exp(-w)](\tau,X(\tau))}{\int_0^s[\exp(\sigma(D_1-K_0-V))\exp(-w)](\tau,X(\tau))\,d\sigma}\,d\tau\Big)
\exp\big(v_s(t,X(t)) - v_s(T,X(T))\big)\exp\Big(-\int_t^TV(\tau,X(\tau))\,d\tau\Big)
\]
\[
\qquad\times\Big(\int_t^T\frac{[(I-\exp(s(D_1-K_0-V)))\exp(-w)](\tau,X(\tau))}{\int_0^s[\exp(\sigma(D_1-K_0-V))\exp(-w)](\tau,X(\tau))\,d\sigma}\,d\tau - v_s(T,X(T)) - \log\chi(T,X(T))\Big)\Big]. \tag{6.55}
\]

Upon letting $w\in D(D_1-K_0)$ tend to the function $S_L$ in an appropriate manner, we obtain, by invoking Proposition 6.12, the inequality
\[
\inf\Big\{E_{t,x}^{M_{v,t}}\Big[\frac12\int_t^T\Gamma_1(v,v)(\tau,X(\tau))\,d\tau + \int_t^TV(\tau,X(\tau))\,d\tau - \log\chi(T,X(T))\Big] : v\in D(D_1-K_0)\Big\} \le S_L(t,x).
\]
This proves assertion (iv). The "appropriate manner" should be such that $w_n\to S_L$ implies
\[
T_\beta\text{-}\lim_{n\to\infty}e^{s(D_1-K_0-V)}e^{-w_n} = e^{s(D_1-K_0-V)}e^{-S_L} = e^{-S_L}. \tag{6.56}
\]
In order that this procedure works, the semigroup $\big\{e^{s(D_1-(K_0\dot+V))} : s\ge0\big\}$ should be continuous for the strict topology. This is true provided the unperturbed semigroup $\big\{e^{s(D_1-K_0)} : s\ge0\big\}$ is continuous for the strict topology, and the potential function satisfies a Miyadera type boundedness condition, as explained in Definition 6.4 and the corresponding Khas'minskii Lemma 6.5.
(v) Here we use a martingale approach together with the idea of germs of a function. We will prove the following inequalities:
\[
S(t,x) \le \sup\{\varphi_1(t,x) : V_{\varphi_1}\le V,\ \varphi_1(T,\cdot)\le S_L(T,\cdot)\} \tag{6.57}
\]
\[
\le \inf\{\varphi_2(t,x) : V_{\varphi_2}\ge V,\ \varphi_2(T,\cdot)\ge S_L(T,\cdot)\} \le S(t,x), \tag{6.58}
\]
where $V_\varphi = -\dot\varphi + K_0\varphi + \frac12\Gamma_1(\varphi,\varphi)$. In view of assertion (i) in Theorem 6.8 we then infer $S=S_L$. Fix $(t,x)\in(t_0,T]\times E$. Let $\varphi_1\in D(D_1-K_0)$ be such that
\[
(S-\varphi_1)(t,x) = \sup\{(S-\varphi_1)(s,y) : y\in E,\ T\ge s\ge t\}.
\]
We notice that the processes $M_{\varphi,t}$ and $M_{S_L,t}$, defined by
\[
M_{\varphi,t}(s) = \exp\Big(-\int_t^sV_\varphi(\tau,X(\tau))\,d\tau - \varphi(s,X(s)) + \varphi(t,X(t))\Big), \quad\text{and}
\]
\[
M_{S_L,t}(s) = \exp\Big(-\int_t^sV(\tau,X(\tau))\,d\tau - S_L(s,X(s)) + S_L(t,X(t))\Big),
\]
$t\le s\le T$, are $P_{t,x}$-martingales. The latter assertion follows from the Markov property together with the Feynman-Kac formula: see (6.43), which also holds with $V_\varphi$ instead of $V$ and with $\varphi$ replacing $S_L$. Let $P_{t,x}^{M_{\varphi,t}}$ denote the probability measure defined by $P_{t,x}^{M_{\varphi,t}}(A) = E_{t,x}[M_{\varphi,t}(s_2)1_A]$, $s_2\ge s_1$, where $A$ is $\mathcal F^t_{s_1}$-measurable. Since $S$ is a viscosity sub-solution we see that $V_{\varphi_1}(t,x)\le V(t,x)$. Fix $\varepsilon>0$ and choose $\delta>0$ in such a way that, for some neighborhood $U$ of $x$ in $E$, the inequality $V_{\varphi_1}(s,y)\le V(s,y)+\frac12\varepsilon$ is valid for $(s,y)\in U\times[t,t+\delta]$. Here we use the continuity of $V(s,y)$ in $y=x$ and its right continuity in $s=t$. Then we choose a family of germs of "smooth" functions $(U_\alpha,\varphi_\alpha)$, $\alpha\in A$, with the following properties:

(a) $\bigcup U_\alpha\supseteq[t,T]\times E$, i.e. the family $U_\alpha$, $\alpha\in A$, forms an open cover of the set $[t,T]\times E$;
(b) for every $\alpha,\beta\in A$, $\varphi_\alpha=\varphi_\beta$ on $U_\alpha\cap U_\beta$;
(c) for every $\alpha\in A$ there exists $(t_\alpha,x_\alpha)\in U_\alpha$ such that $(S-\varphi_\alpha)(s,y)\le(S-\varphi_\alpha)(t_\alpha,x_\alpha)$, for $(s,y)\in U_\alpha$ and $t_\alpha\le s$;
(d) for every $\alpha\in A$, the inequality $V_{\varphi_\alpha}\le V+\frac12\varepsilon$ is valid on $U_\alpha$;
(e) if $(t,x)$ belongs to $U_\alpha$, then $(S-\varphi_\alpha)(t,x)\le0$;
(f) if $(T,y)$ belongs to $U_\alpha$, then $\varphi_\alpha(T,y)\le S(T,y)+\frac12\varepsilon(T-t) = S_L(T,y)+\frac12\varepsilon(T-t)$.

Since $S$ is a viscosity sub-solution, property (d) is in fact a consequence of (c); we will need (d). Then we define the function $\psi_1:[t,T]\times E\to\mathbb R$ by $\psi_1(s,y)=\varphi_\alpha(s,y)$ for $(s,y)\in U_\alpha$. Then, on $U_\alpha$, $V_{\psi_1}=V_{\varphi_\alpha}\le V+\frac12\varepsilon$. We write $V_1 = V_{\psi_1}$. By assertion (iii) of Proposition 6.12
\[
\Psi_1^\varepsilon(s,y) := \psi_1(s,y) - \frac12\varepsilon(T-s) - \frac12\varepsilon(T-t) \tag{6.59}
\]
\[
= -\log E_{s,y}\Big[\exp\Big(-\int_s^T\Big(V_1(\tau,X(\tau)) - \frac12\varepsilon\Big)\,d\tau\Big)\exp\Big(-\Big(\psi_1(T,X(T)) - \frac12\varepsilon(T-t)\Big)\Big)\Big]
\]
\[
= -\log E_{s,y}\Big[\exp\Big(-\int_s^TV_{\Psi_1^\varepsilon}(\tau,X(\tau))\,d\tau - \Psi_1^\varepsilon(T,X(T))\Big)\Big].
\]
Then
\[
\Psi_1^\varepsilon(s,y) \le -\log E_{s,y}\Big[\exp\Big(-\int_s^TV(\tau,X(\tau))\,d\tau - S_L(T,X(T))\Big)\Big] = S_L(s,y),
\]
and hence $\psi_1(t,x)\le S_L(t,x)+\varepsilon(T-t)$. By construction we also have $S(t,x)\le\psi_1(t,x)$. Consequently $S(t,x)\le S_L(t,x)+\varepsilon(T-t)$. Since $\varepsilon>0$ is arbitrary we see $S(t,x)\le S_L(t,x)$. In fact, since $V_{\Psi_1^\varepsilon}\le V$, and since $\Psi_1^\varepsilon(T,y)\le S_L(T,y)$, we see that
\[
S(t,x) \le \sup\{\varphi_1(t,x) : V_{\varphi_1}\le V,\ \varphi_1(T,\cdot)\le S_L(T,\cdot)\}.
\]
A similar argument shows the inequality
\[
S(t,x) \ge \inf\{\varphi_2(t,x) : V_{\varphi_2}\ge V,\ \varphi_2(T,\cdot)\ge S_L(T,\cdot)\}.
\]
To be precise, again we fix $\varepsilon>0$, and let $\varphi_2\in D(D_1-K_0)$ be a function such that $(S-\varphi_2)(t,x) = \inf\{(S-\varphi_2)(s,y) : (s,y)\in[t,T]\times E\}$. We choose $\delta>0$ and a neighborhood $U$ of $x$ in such a way that $V_{\varphi_2}(s,y)\ge V(s,y)-\frac12\varepsilon$ for $(s,y)\in U\times[t,t+\delta]$. Then we choose a family of germs of "smooth" functions $(U_\alpha,\varphi_\alpha)$, $\alpha\in A$, with the following properties:

(a) $\bigcup U_\alpha\supseteq[t,T]\times E$, i.e. the family $U_\alpha$ forms an open cover of the set $[t,T]\times E$;
(b) $\alpha,\beta\in A$ implies $\varphi_\alpha=\varphi_\beta$ on $U_\alpha\cap U_\beta$;
(c) for every $\alpha\in A$ there exists $(t_\alpha,x_\alpha)\in U_\alpha$ such that $(S-\varphi_\alpha)(s,y)\ge(S-\varphi_\alpha)(t_\alpha,x_\alpha)$, for $(s,y)\in U_\alpha$ and $t_\alpha\le s$;
(d) for every $\alpha\in A$, the inequality $V_{\varphi_\alpha}\ge V-\frac12\varepsilon$ is valid on $U_\alpha$;
(e) if $(t,x)$ belongs to $U_\alpha$, then $(S-\varphi_\alpha)(t,x)\ge0$;
(f) if $(T,y)$ belongs to $U_\alpha$, then $\varphi_\alpha(T,y)\ge S(T,y)-\frac12\varepsilon(T-t) = S_L(T,y)-\frac12\varepsilon(T-t)$.

Since $S$ is a viscosity super-solution, property (d) is in fact a consequence of (c). Then we define the function $\psi_2:[t,T]\times E\to\mathbb R$ by $\psi_2(s,y)=\varphi_\alpha(s,y)$ for $(s,y)\in U_\alpha$. Then, on $U_\alpha$, $V_{\psi_2}=V_{\varphi_\alpha}\ge V-\frac12\varepsilon$. We write $V_2 = V_{\psi_2}$. As above, assertion (iii) of Proposition 6.12 implies
\[
\Psi_2^\varepsilon(s,y) := \psi_2(s,y) + \frac12\varepsilon(T-s) + \frac12\varepsilon(T-t) \ge S_L(s,y). \tag{6.60}
\]
By construction we have $S(t,x)\ge\psi_2(t,x)$, and hence $S_L(t,x)\le\Psi_2^\varepsilon(t,x)\le\psi_2(t,x)+\varepsilon(T-t)\le S(t,x)+\varepsilon(T-t)$. Since $\varepsilon>0$ is arbitrary we infer $S_L(t,x)\le S(t,x)$. In fact, since $V_{\Psi_2^\varepsilon}\ge V$, and since $\Psi_2^\varepsilon(T,y)\ge S_L(T,y)$, we see that
\[
S(t,x) \ge \inf\{\varphi_2(t,x) : V_{\varphi_2}\ge V,\ \varphi_2(T,\cdot)\ge S_L(T,\cdot)\}.
\]
In the meantime we have also proved that the four quantities in assertion (i) are equal.

6.4 A stochastic Noether theorem

The following theorem may be called the stochastic Noether theorem: cf. Zambrini [260], Proposition 2.3 and Theorem 2.4. For a discussion and formulation of the classical (deterministic) Noether theorem, which in fact can be considered as the second constant of motion for a mechanical system, the reader is referred to Thieullen and Zambrini [232], pages 300-302, and [233], page 423. In §6.4.1 we also give a short formulation of this theory.
Theorem 6.13. Let $T$ be a differentiable function which only depends on time. As above, the operator $D_1$ stands for $D_1 = \dfrac{\partial}{\partial t}$. Suppose that the functions $\varphi$, $w$, and $T$ satisfy the following identities:

(a) $\dfrac{dT}{dt}\,K_0f = K_0\Gamma_1(f,w) - \Gamma_1(K_0f,w) - \Gamma_1\Big(f,\dfrac{\partial w}{\partial t}-\varphi\Big)$ for all functions $f\in D(K_0-D_1)$ for which $\Gamma_1(K_0f,w)$ exists as well.
(b) $\dfrac{\partial\varphi}{\partial t} - K_0\varphi = \Gamma_1(V,w) + \dfrac{\partial(TV)}{\partial t}$.

Put
\[
H(f) = \frac{\partial f}{\partial t} - (K_0\dot+V)f, \qquad
N(f) = \Gamma_1(f,w) + T\frac{\partial f}{\partial t} - \varphi f, \qquad
D(f) = \frac{\partial f}{\partial t} - \Gamma_1(\sigma_L,f) - K_0f. \tag{6.61}
\]
Suppose that the function $\sigma_L$ satisfies:

(c) $\Big(D\varepsilon - \dfrac{\partial V}{\partial t}\Big)T = \Gamma_1\Big(\dfrac{\partial\sigma_L}{\partial t}+\varepsilon,\,w\Big)$, where $\varepsilon = -K_0\sigma_L - \dfrac12\Gamma_1(\sigma_L,\sigma_L) + V$.

Write $n := -\Gamma_1(\sigma_L,w) + \varepsilon T - \varphi$. The following assertions hold true.

(i) If $H(f)=0$, then $H(N(f))=0$ as well. More generally, $H(N_0(f)) = N_0(H(f))$ for appropriately chosen functions $f$, so the operators $H$ and $N_0$ commute. For the definition of $N_0$ see (6.63) below.
(ii) $Dn = 0$.
(iii) The process $t\mapsto n(t,X(t))$ is a martingale with respect to the probability measures
\[
A\mapsto E_{t_0,x}\Big[\exp\Big(-M_{\sigma_L,t_0}(t) - \frac12\int_{t_0}^t\Gamma_1(\sigma_L,\sigma_L)(s,X(s))\,ds\Big)1_A\Big],
\]
where, as in (6.12), $M_{f,t_0}(t)$ is given by
\[
M_{f,t_0}(t) = f(t,X(t)) - f(t_0,X(t_0)) + \int_{t_0}^t(K_0-D_1)f(s,X(s))\,ds. \tag{6.62}
\]

Remark 6.14. The operator $N_0$ is defined by
\[
N_0(f) = \Gamma_1(f,w) + T(K_0\dot+V)f - \varphi f. \tag{6.63}
\]
The proof of assertion (i) shows that the operators $H$ and $N_0$ commute: $H(N_0f) = N_0(Hf)$, $f\in D(H)\cap D(N_0)$, $Hf\in D(N_0)$, and $N_0f\in D(H)$. The following proposition shows a situation where (c) is satisfied.

Proposition 6.15. Suppose $S_L$, the minimal Lagrangian action, belongs to the domain of $D_1-K_0$. Here $D_1 = \dfrac{\partial}{\partial t}$. Set $\sigma_L = S_L$ in Theorem 6.13. Then (c) is satisfied; more precisely, $D\varepsilon = D_1V$ and $D_1\sigma_L + \varepsilon = 0$.
Proof (Proof of Proposition 6.15). Notice that
\[
\varepsilon = -K_0S_L - \frac12\Gamma_1(S_L,S_L) + V = -D_1S_L, \tag{6.64}
\]
and hence
\[
D\varepsilon = -D(D_1S_L) = -\frac{\partial^2S_L}{\partial t^2} + \Gamma_1\Big(S_L,\frac{\partial S_L}{\partial t}\Big) + K_0\frac{\partial S_L}{\partial t}
= -\frac{\partial^2S_L}{\partial t^2} + \frac12\frac{\partial}{\partial t}\Gamma_1(S_L,S_L) + \frac{\partial}{\partial t}K_0(S_L)
\]
\[
= \frac{\partial}{\partial t}\Big(-\frac{\partial S_L}{\partial t} + \frac12\Gamma_1(S_L,S_L) + K_0S_L\Big) = \frac{\partial V}{\partial t}. \tag{6.65}
\]
Proposition 6.15 easily follows from (6.64) and (6.65).
The equality in (6.66) below will be used in the proof of Theorem 6.13.

Lemma 6.16. For all appropriate functions $f$, $w$, $T$, and $\varphi$ the following identity is true:
\[
-\frac12\Gamma_1(f,f)\frac{\partial T}{\partial t} - \frac12\Gamma_1\big(\Gamma_1(f,f),w\big) + \Gamma_1\big(f,\Gamma_1(f,w)\big)
\]
\[
= \frac12\Big(K_0(f^2)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(f^2),w\big) - K_0\Gamma_1(f^2,w) + \Gamma_1\Big(f^2,\frac{\partial w}{\partial t}-\varphi\Big)\Big)
\]
\[
- f\Big(K_0(f)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(f),w\big) - K_0\Gamma_1(f,w) + \Gamma_1\Big(f,\frac{\partial w}{\partial t}-\varphi\Big)\Big). \tag{6.66}
\]

Proof (Proof of Lemma 6.16). The equality $-\frac12\Gamma_1(f,f) = \frac12K_0(f^2) - fK_0f$ together with $\Gamma_1(fK_0f,w) = f\Gamma_1(K_0f,w) + (K_0f)\Gamma_1(f,w)$ yields
\[
-\frac12\Gamma_1(f,f)\frac{\partial T}{\partial t} - \frac12\Gamma_1\big(\Gamma_1(f,f),w\big) + \Gamma_1\big(f,\Gamma_1(f,w)\big)
\]
\[
= \frac12\Big\{K_0(f^2)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(f^2),w\big) - K_0\Gamma_1(f^2,w)\Big\}
- f\Big\{K_0f\,\frac{\partial T}{\partial t} + \Gamma_1(K_0f,w) - K_0\Gamma_1(f,w)\Big\}
\]
\[
+ \frac12K_0\Gamma_1(f^2,w) - fK_0\Gamma_1(f,w) - (K_0f)\Gamma_1(f,w) + \Gamma_1\big(f,\Gamma_1(f,w)\big)
\]

($\frac12K_0\Gamma_1(f^2,w) = K_0(f\Gamma_1(f,w)) = (K_0f)\Gamma_1(f,w) - \Gamma_1(f,\Gamma_1(f,w)) + fK_0\Gamma_1(f,w)$)

\[
= \frac12\Big\{K_0(f^2)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(f^2),w\big) - K_0\Gamma_1(f^2,w)\Big\}
- f\Big\{K_0f\,\frac{\partial T}{\partial t} + \Gamma_1(K_0f,w) - K_0\Gamma_1(f,w)\Big\}. \tag{6.67}
\]
Since $\Gamma_1(f^2,\psi) = 2f\Gamma_1(f,\psi)$, equality (6.66) in Lemma 6.16 follows from (6.67).
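The first identity in the proof, $-\frac12\Gamma_1(f,f) = \frac12K_0(f^2) - fK_0f$, is the defining property of the carré du champ operator; in the model case $K_0=-\frac12\,\partial^2/\partial x^2$, $\Gamma_1(f,g)=f'g'$ it can be verified numerically. A minimal sketch; the test function $f=\sin$ and the evaluation point are arbitrary choices:

```python
import math

f, f_prime = math.sin, math.cos  # test function with a known derivative

def K0(g, x, h=1e-5):
    # K0 = -(1/2) d^2/dx^2 via a central second difference.
    return -0.5 * (g(x + h) - 2.0 * g(x) + g(x - h)) / h ** 2

x = 0.9
left = -0.5 * f_prime(x) ** 2                               # -(1/2) Gamma1(f, f)
right = 0.5 * K0(lambda y: f(y) ** 2, x) - f(x) * K0(f, x)  # (1/2) K0(f^2) - f K0 f
```

The agreement, up to finite-difference error, reflects the fact that the second-order part of $K_0$ is exactly what $\Gamma_1$ measures.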
Proof (Proof of Theorem 6.13). (i) We calculate:
\[
H(Nf) - N(Hf)
= H(\Gamma_1(w,f)) + H\Big(T\frac{\partial f}{\partial t}\Big) - H(\varphi f) - \Gamma_1(Hf,w) - T\frac{\partial}{\partial t}(Hf) + \varphi Hf
\]
\[
= \frac{\partial}{\partial t}\Gamma_1(f,w) - (K_0\dot+V)\Gamma_1(f,w)
+ \frac{\partial}{\partial t}\Big(T\frac{\partial f}{\partial t}-\varphi f\Big) - (K_0\dot+V)\Big(T\frac{\partial f}{\partial t}\Big) + (K_0\dot+V)(\varphi f)
\]
\[
- \Gamma_1\Big(\frac{\partial f}{\partial t},w\Big) + \Gamma_1(K_0f,w) + \Gamma_1(Vf,w)
- T\Big(\frac{\partial^2f}{\partial t^2} - \frac{\partial}{\partial t}\big((K_0\dot+V)f\big)\Big) + \varphi\frac{\partial f}{\partial t} - \varphi(K_0\dot+V)f
\]
\[
= \Gamma_1\Big(f,\frac{\partial w}{\partial t}-\varphi\Big) - K_0\Gamma_1(f,w) + \Gamma_1(K_0f,w)
+ \Big(\frac{\partial T}{\partial t} - K_0T\Big)\frac{\partial f}{\partial t} + \Gamma_1\Big(T,\frac{\partial f}{\partial t}\Big)
+ \Big(-\frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + T\frac{\partial V}{\partial t}\Big)f
\]
\[
= \Gamma_1\Big(\frac{\partial w}{\partial t}-\varphi,f\Big) - K_0\Gamma_1(f,w) + \Gamma_1(K_0f,w) \tag{6.68}
\]
\[
+ \frac{\partial T}{\partial t}(K_0f) - (K_0T)K_0f + \Gamma_1(T,K_0f) + \Gamma_1(T,Vf)
+ \Big(-\frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + T\frac{\partial V}{\partial t} + V\frac{\partial T}{\partial t}\Big)f
\]

($T$ only depends on $t$)

\[
= \Gamma_1\Big(\frac{\partial w}{\partial t}-\varphi,f\Big) - K_0\Gamma_1(f,w) + \Gamma_1(K_0f,w) + \frac{\partial T}{\partial t}(K_0f)
+ \Big(-\frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + T\frac{\partial V}{\partial t} + V\frac{\partial T}{\partial t}\Big)f = 0, \tag{6.69}
\]
where in (6.68) we employed the identity $\dfrac{\partial f}{\partial t} = (K_0\dot+V)f$. The equality in (6.69) follows by our assumptions (a) and (b).
(ii) We compute
\[
D(n) = \frac{\partial n}{\partial t} - \Gamma_1(\sigma_L,n) - K_0n
\]
\[
= -\frac{\partial}{\partial t}\Gamma_1(\sigma_L,w) + \frac{\partial}{\partial t}(\varepsilon T) - \frac{\partial\varphi}{\partial t}
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) - \Gamma_1(\sigma_L,\varepsilon T) + \Gamma_1(\sigma_L,\varphi)
+ K_0\Gamma_1(\sigma_L,w) - K_0(\varepsilon T) + K_0\varphi
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t},w\Big) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}\Big) + \varepsilon\frac{\partial T}{\partial t} + \frac{\partial\varepsilon}{\partial t}T - \frac{\partial\varphi}{\partial t}
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) - \Gamma_1(\sigma_L,\varepsilon)T + \Gamma_1(\sigma_L,\varphi)
+ K_0\Gamma_1(\sigma_L,w) - K_0(\varepsilon)T + K_0\varphi
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+V,w\Big) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big) + \varepsilon\frac{\partial T}{\partial t} + \frac{\partial\varepsilon}{\partial t}T - \frac{\partial\varphi}{\partial t}
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) - \Gamma_1(\sigma_L,\varepsilon)T
+ K_0\Gamma_1(\sigma_L,w) - K_0(\varepsilon)T + K_0\varphi + \Gamma_1(V,w)
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+V,w\Big) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big) + V\frac{\partial T}{\partial t}
- \Big(K_0\sigma_L+\frac12\Gamma_1(\sigma_L,\sigma_L)\Big)\frac{\partial T}{\partial t} + \frac{\partial\varepsilon}{\partial t}T - \frac{\partial\varphi}{\partial t}
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) - \Gamma_1(\sigma_L,\varepsilon)T
+ K_0\Gamma_1(\sigma_L,w) - K_0(\varepsilon)T + K_0\varphi + \Gamma_1(V,w)
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+V,w\Big) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big) + \frac{\partial(VT)}{\partial t}
- \Big(K_0\sigma_L+\frac12\Gamma_1(\sigma_L,\sigma_L)\Big)\frac{\partial T}{\partial t} + \frac{\partial(\varepsilon-V)}{\partial t}T - \frac{\partial\varphi}{\partial t}
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) - \Gamma_1(\sigma_L,\varepsilon)T
+ K_0\Gamma_1(\sigma_L,w) - K_0(\varepsilon)T + K_0\varphi + \Gamma_1(V,w)
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+\varepsilon,w\Big) + \Big(D(\varepsilon)-\frac{\partial V}{\partial t}\Big)T
- \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big)
- \frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + \frac{\partial(VT)}{\partial t}
- \Big(K_0\sigma_L+\frac12\Gamma_1(\sigma_L,\sigma_L)\Big)\frac{\partial T}{\partial t} - \Gamma_1\Big(K_0\sigma_L+\frac12\Gamma_1(\sigma_L,\sigma_L),w\Big)
+ \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big) + K_0\Gamma_1(\sigma_L,w)
\]
\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+\varepsilon,w\Big) + \Big(D(\varepsilon)-\frac{\partial V}{\partial t}\Big)T
- (K_0\sigma_L)\frac{\partial T}{\partial t} - \Gamma_1(K_0\sigma_L,w) + K_0\Gamma_1(\sigma_L,w) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big)
- \frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + \frac{\partial(VT)}{\partial t}
- \frac12\Gamma_1(\sigma_L,\sigma_L)\frac{\partial T}{\partial t} - \frac12\Gamma_1\big(\Gamma_1(\sigma_L,\sigma_L),w\big) + \Gamma_1\big(\sigma_L,\Gamma_1(\sigma_L,w)\big)
\]

(employ Lemma 6.16 with $f=\sigma_L$)

\[
= -\Gamma_1\Big(\frac{\partial\sigma_L}{\partial t}+\varepsilon,w\Big) + \Big(D(\varepsilon)-\frac{\partial V}{\partial t}\Big)T \tag{6.70}
\]
\[
- (K_0\sigma_L)\frac{\partial T}{\partial t} - \Gamma_1(K_0\sigma_L,w) + K_0\Gamma_1(\sigma_L,w) - \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big)
- \frac{\partial\varphi}{\partial t} + K_0\varphi + \Gamma_1(V,w) + \frac{\partial(VT)}{\partial t}
\]
\[
+ \frac12\Big(K_0(\sigma_L^2)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(\sigma_L^2),w\big) - K_0\Gamma_1(\sigma_L^2,w) + \Gamma_1\Big(\sigma_L^2,\frac{\partial w}{\partial t}-\varphi\Big)\Big)
\]
\[
- \sigma_L\Big(K_0(\sigma_L)\frac{\partial T}{\partial t} + \Gamma_1\big(K_0(\sigma_L),w\big) - K_0\Gamma_1(\sigma_L,w) + \Gamma_1\Big(\sigma_L,\frac{\partial w}{\partial t}-\varphi\Big)\Big).
\]
Substituting the equalities (a), (b) and (c) in (6.70) shows (ii), i.e. $D(n)=0$.
(iii) Let $f$ be a function in the domain of $K_0-D_1$. As in equation (6.12) of §6.1, the $P_{t_0,x}$-martingale $M_{f,t_0}(t)$, $t\ge t_0$, is given by the equation in (6.62). Let $f$ and $g$ be two functions in $D(K_0-D_1)$. Then the quadratic co-variation $\langle M_{f,t_0},M_{g,t_0}\rangle(t)$ of $M_{f,t_0}(t)$ and $M_{g,t_0}(t)$ is given by
\[
\langle M_{f,t_0},M_{g,t_0}\rangle(t) = \int_{t_0}^t\Gamma_1(f,g)(\tau,X(\tau))\,d\tau. \tag{6.71}
\]
By (ii) $D(n)=0$, and hence $(K_0-D_1)n = -\Gamma_1(\sigma_L,n)$. It follows that
\[
n(t,X(t)) - n(t_0,X(t_0)) = M_{n,t_0}(t) - \int_{t_0}^t(K_0-D_1)n(\tau,X(\tau))\,d\tau
= M_{n,t_0}(t) + \int_{t_0}^t\Gamma_1(\sigma_L,n)(\tau,X(\tau))\,d\tau. \tag{6.72}
\]
Let $f$ be a function in $D(K_0-D_1)$. From Itô's formula we obtain:
\[
\exp\Big(M_{f,t_0}(t) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(t)\Big)\,n(t,X(t)) - n(t_0,X(t_0))
\]
\[
= \int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,n(s,X(s))\,dM_{f,t_0}(s)
\]
\[
- \frac12\int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,n(s,X(s))\,d\langle M_{f,t_0},M_{f,t_0}\rangle(s)
\]
\[
+ \frac12\int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,n(s,X(s))\,d\langle M_{f,t_0},M_{f,t_0}\rangle(s)
\]
\[
+ \int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,dM_{n,t_0}(s)
\]
\[
- \int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,(K_0-D_1)n(s,X(s))\,ds
\]
\[
+ \int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(s)\Big)\,\Gamma_1(f,n)(s,X(s))\,ds
\]

(employ (6.72))

\[
= P_{t_0,x}\text{-martingale} + \int_{t_0}^t\exp\Big(M_{f,t_0}(s) - \frac12\langle M_{f,t_0}\rangle(s)\Big)\,\big(\Gamma_1(f,n)+\Gamma_1(\sigma_L,n)\big)(s,X(s))\,ds, \tag{6.73}
\]
where we wrote $\langle M_{f,t_0}\rangle(s) = \langle M_{f,t_0},M_{f,t_0}\rangle(s)$. Suppose $\Gamma_1(f+\sigma_L,n)=0$. From (6.73) it follows that the process
\[
t\mapsto\exp\Big(M_{f,t_0}(t) - \frac12\langle M_{f,t_0},M_{f,t_0}\rangle(t)\Big)\,n(t,X(t))
= \exp\Big(M_{f,t_0}(t) - \frac12\int_{t_0}^t\Gamma_1(f,f)(s,X(s))\,ds\Big)\,n(t,X(t))
\]
is a $P_{t_0,x}$-martingale. So, with $f=-\sigma_L$, assertion (iii) of Theorem 6.13 follows.

The following theorem can be considered as a complex version of the Noether theorem: see Theorem 6.13 above and Theorem 3.1 in Zambrini [260]. It has a physical interpretation: $N(t)$, defined by
\[
N(t)f = i\Gamma_1(f,w) - T(t)(K_0\dot+V)f - \varphi f,
\]
is called a Noether observable.

Theorem 6.17. Let the functions $T$, $w$, and $\varphi$ be related as in (a′) and (b′) below:

(a′) $\dfrac{dT}{dt}\,K_0f = K_0\Gamma_1(f,w) - \Gamma_1(K_0f,w) + i\Gamma_1\Big(f,\dfrac{\partial w}{\partial t}-\varphi\Big)$ for all functions $f$ belonging to $D(K_0-D_1)$ for which $\Gamma_1(K_0f,w)$ makes sense as well.
(b′) $\dfrac{\partial\varphi}{\partial t} - iK_0\varphi = -\Gamma_1(V,w) - \dfrac{\partial(TV)}{\partial t}$.

Then the operators $N(t)$ and $\dfrac{\partial}{i\partial t}-(K_0\dot+V)$ commute.

Suppose $\int K_0f\,dm = 0$ for $f\in D(K_0)\cap L^1(E,m)$. Then the adjoint $N(t)^*$ is given by
\[
N(t)^*f = i\Gamma_1(f,w) - 2i(K_0w)f - T(t)(K_0\dot+V)f - \varphi f.
\]
Hence the self-adjoint operator $\dfrac{\partial}{i\partial t}-(K_0\dot+V)$ also commutes with the operators $N(t)+N(t)^*$ and $N(t)-N(t)^*$.
Proof. Let $f$ be a "smooth enough" function. Then a calculation yields:
\[
N(t)\Big(\frac{\partial}{i\partial t}-(K_0\dot+V)\Big)f - \Big(\frac{\partial}{i\partial t}-(K_0\dot+V)\Big)N(t)f
\]
\[
= i\Gamma_1\Big(\frac{\partial f}{i\partial t}-(K_0\dot+V)f,\;w\Big) - T(t)(K_0\dot+V)\Big(\frac{\partial f}{i\partial t}-(K_0\dot+V)f\Big) - \varphi\Big(\frac{\partial f}{i\partial t}-(K_0\dot+V)f\Big)
\]
\[
- \Big(\frac{\partial}{i\partial t}-(K_0\dot+V)\Big)\big(i\Gamma_1(f,w) - T(t)(K_0\dot+V)f - \varphi f\big)
\]
\[
= \Gamma_1\Big(\frac{\partial f}{\partial t},w\Big) - i\Gamma_1(K_0f,w) - i\Gamma_1(Vf,w) - \frac1iT(t)K_0\frac{\partial f}{\partial t} - \frac1iT(t)V\frac{\partial f}{\partial t}
+ T(t)(K_0\dot+V)^2f - \frac1i\varphi\frac{\partial f}{\partial t} + \varphi K_0f + \varphi Vf
\]
\[
- \frac{\partial}{\partial t}\Gamma_1(f,w) + \frac1i\frac{\partial}{\partial t}\big(T(t)(K_0\dot+V)f\big) + \frac1i\frac{\partial(\varphi f)}{\partial t}
+ i(K_0\dot+V)\Gamma_1(f,w) - T(t)(K_0\dot+V)^2f - K_0(\varphi f) - \varphi Vf
\]
\[
= \Gamma_1\Big(\frac{\partial f}{\partial t},w\Big) - i\Gamma_1(K_0f,w) - iV\Gamma_1(f,w) - if\Gamma_1(V,w)
- \frac1iT(t)K_0\frac{\partial f}{\partial t} - \frac1iT(t)V\frac{\partial f}{\partial t} - \frac1i\varphi\frac{\partial f}{\partial t} + \varphi K_0f
\]
\[
- \Gamma_1\Big(\frac{\partial f}{\partial t},w\Big) - \Gamma_1\Big(f,\frac{\partial w}{\partial t}\Big) + \frac1i\frac{\partial T(t)}{\partial t}K_0f + \frac1i\frac{\partial T(t)}{\partial t}Vf
+ \frac1iT(t)\frac{\partial V}{\partial t}f + \frac1iT(t)K_0\frac{\partial f}{\partial t} + \frac1iT(t)V\frac{\partial f}{\partial t}
\]
\[
+ \frac1i\frac{\partial\varphi}{\partial t}f + \frac1i\varphi\frac{\partial f}{\partial t} + iK_0\Gamma_1(f,w) + iV\Gamma_1(f,w) - (K_0\varphi)f + \Gamma_1(f,\varphi) - \varphi K_0f
\]
\[
= -i\Gamma_1(K_0f,w) - if\Gamma_1(V,w) - \Gamma_1\Big(f,\frac{\partial w}{\partial t}-\varphi\Big) + \frac1i\frac{\partial T(t)}{\partial t}K_0f
+ \frac1i\frac{\partial(T(t)V)}{\partial t}f + \frac1i\frac{\partial\varphi}{\partial t}f + iK_0\Gamma_1(f,w) - (K_0\varphi)f
\]
\[
= \frac1i\frac{\partial T(t)}{\partial t}K_0f - \frac1iK_0\Gamma_1(f,w) + \frac1i\Gamma_1(K_0f,w) - \Gamma_1\Big(f,\frac{\partial w}{\partial t}-\varphi\Big)
+ \frac1if\Big\{\Gamma_1(V,w) + \frac{\partial(T(t)V)}{\partial t} + \frac{\partial\varphi}{\partial t} + \frac1iK_0\varphi\Big\}. \tag{6.74}
\]
The result in Theorem 6.17 follows from the assumptions (a′) and (b′).
Corollary 6.18. Suppose that the functions w, T (which only depends on t), and ψ (which only depends on the space variable, not on the time t) possess the following properties:

(a′) The set of functions f for which the equality

  (dT/dt) K0 f = K0 Γ1(f, w) − Γ1(K0 f, w) + Γ1(f, K0 w + ψ)

makes sense and is valid is dense in the space L2(E × [t0, T], dm × dt).
(b′) The following equality is valid:

  ∂²w/∂t² + K0² w + K0 ψ = −Γ1(V, w) − ∂(T V)/∂t.

Put

  N(t)f = iΓ1(f, w) − T(t)(K0 ∔ V)f − (∂w/∂t + iK0 w + iψ) f,

where f ∈ D(K0 ∔ V). Then N(t) commutes with ∂/(i∂t) − (K0 ∔ V).

Proof. Set ϕ = ∂w/∂t + iK0 w + iψ in Theorem 6.17. Then

K0 Γ1(f, w) − Γ1(K0 f, w) + iΓ1(f, ∂w/∂t − ϕ)
  = K0 Γ1(f, w) − Γ1(K0 f, w) + Γ1(f, K0 w + ψ) = (dT/dt) K0 f.    (6.75)

This shows (a′) of Theorem 6.17. Since ∂ψ/∂t = 0, we see that (b′) of Theorem 6.17 is satisfied as well. This proves the corollary.

The following proposition isolates the properties of the function w.

Proposition 6.19. Suppose that the function w has property (a) of Theorem 6.13, or (a′) of Theorem 6.17, or (a′) of its Corollary 6.18. Then, for all functions f, g ∈ D(D1 − K0), the following identity is true:

(dT/dt) Γ1(f, g) + Γ1(Γ1(f, g), w) = Γ1(Γ1(f, w), g) + Γ1(f, Γ1(g, w)).    (6.76)

Remark 6.20. Let χ be a smooth enough function. From the proof of Proposition 6.19 it follows that the mapping

f ↦ (dT/dt)(K0 f) − K0(Γ1(f, w)) + Γ1(K0 f, w) + Γ1(f, χ)

is a derivation if and only if (6.76) is satisfied for all functions f and g in a large enough subalgebra of D(D1 − K0).

Proof. Let f and g be functions in D(D1 − K0) whose product fg also belongs to D(D1 − K0). We write

χ = ∂w/∂t − ϕ,  χ = (1/i)(∂w/∂t − ϕ),  or  χ = −K0 w − ψ,

as the case may be. Then

(dT/dt) K0(fg) − K0 Γ1(fg, w) + Γ1(K0(fg), w) + Γ1(fg, χ)

= (dT/dt)((K0 f) g − Γ1(f, g) + f(K0 g)) − K0(Γ1(f, w) g + f Γ1(g, w))
  + Γ1((K0 f) g − Γ1(f, g) + f(K0 g), w) + f Γ1(g, χ) + Γ1(f, χ) g

= (dT/dt)((K0 f) g − Γ1(f, g) + f(K0 g))
  − (K0 Γ1(f, w)) g + Γ1(Γ1(f, w), g) − Γ1(f, w)(K0 g)
  − (K0 f) Γ1(g, w) + Γ1(f, Γ1(g, w)) − f(K0 Γ1(g, w))
  + Γ1(K0 f, w) g + (K0 f) Γ1(g, w) − Γ1(Γ1(f, g), w)
  + Γ1(f, w) K0 g + f Γ1(K0 g, w) + f Γ1(g, χ) + Γ1(f, χ) g

= ((dT/dt)(K0 f) − K0 Γ1(f, w) + Γ1(K0 f, w) + Γ1(f, χ)) g
  + f ((dT/dt)(K0 g) − K0 Γ1(g, w) + Γ1(K0 g, w) + Γ1(g, χ))
  − (dT/dt) Γ1(f, g) + Γ1(Γ1(f, w), g) + Γ1(f, Γ1(g, w)) − Γ1(Γ1(f, g), w).    (6.77)

An application of either (a) of Theorem 6.13, or (a′) of Theorem 6.17, or (a′) of Corollary 6.18 then yields (6.76) in Proposition 6.19.

Remark 6.21. Let the functions T, w and ψ satisfy (a′) and (b′) of Corollary 6.18. Put χ = K0 w + ψ. Then the triple (T, w, χ) satisfies:

(a) (dT/dt) K0 f = K0 Γ1(f, w) − Γ1(K0 f, w) + Γ1(f, χ) (for f in a dense subspace of L2(E, m));
(b) ∂²w/∂t² + K0 χ = −Γ1(V, w) − ∂(T V)/∂t;
(c) ∂(χ − K0 w)/∂t = 0.

In order to find Noether observables the equations (a), (b) and (c) have to be integrated simultaneously. Proposition 6.19 simplifies this somewhat in the sense that one first tries to find w, and then χ. The couple (w, χ) also has to satisfy (b). Notice that in case E = R^d and

K0 f = −(1/2) Σ_{j,k=1}^{d} a_{j,k} ∂²f/(∂x_j ∂x_k),

then Γ1(f, g) = Σ_{j,k=1}^{d} a_{j,k} (∂f/∂x_j)(∂g/∂x_k). Upon choosing linear functions f and g we see that w has to satisfy:

(dT/dt) a_{j,k} + Σ_{ℓ,m=1}^{d} a_{ℓ,m} (∂a_{j,k}/∂x_m)(∂w/∂x_ℓ)
  = 2 Σ_{ℓ,m=1}^{d} a_{j,ℓ} a_{k,m} ∂²w/(∂x_ℓ ∂x_m)
    + Σ_{ℓ,m=1}^{d} a_{j,m} (∂a_{k,ℓ}/∂x_m)(∂w/∂x_ℓ)
    + Σ_{ℓ,m=1}^{d} a_{k,m} (∂a_{j,ℓ}/∂x_m)(∂w/∂x_ℓ).

It follows that the matrix with entries ∂²w/(∂x_ℓ ∂x_m) is, up to a first order perturbation, (1/2)(dT/dt) times the inverse of the matrix (a_{ℓ,m})_{ℓ,m=1}^{d}.
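This closing observation is easy to test numerically when the coefficients a_{j,k} are constant: the derivative terms drop out and the displayed equation is solved exactly by the Hessian ∂²w/(∂x_ℓ ∂x_m) = (1/2)(dT/dt)(a⁻¹)_{ℓ,m}. The following sketch is an illustration only (not from the text); the matrix A and the value of dT/dt are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
A = B @ B.T + 4 * np.eye(4)         # a constant, symmetric, positive definite (a_{j,k})

dT_dt = 3.0                         # dT/dt, taken constant for the check
W = 0.5 * dT_dt * np.linalg.inv(A)  # candidate Hessian of w

# with constant a_{j,k} the first-order terms vanish and the equation
# reduces to (dT/dt) a_{j,k} = 2 sum_{l,m} a_{j,l} a_{k,m} W_{l,m}
lhs = dT_dt * A
rhs = 2.0 * A @ W @ A
print(np.abs(lhs - rhs).max())
```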

6.4.1 Classical Noether theorem

Let Q (= E) be the configuration manifold of a classical dynamical system. The paths are C²-maps q : t ↦ q(t), t ∈ I := [t0, u]. The Lagrangian is written as (q, q̇, t) ↦ L(q, q̇, t), where q̇ ∈ TQ, the tangent bundle of Q. For simplicity we assume here Q = R³; then TQ may be identified with Q. We assume an external force of the form F = −∇V, where V is a scalar potential. Then L = (1/2)|q̇|² − V(q, t). The action functional S, defined on a domain D(S) ⊆ C²([t0, u], Q), is given by

S(q(·); t0, u) = ∫_{t0}^{u} L(q(s), q̇(s), s) ds.

Hamilton’s least action principle says that among all regular trajectories between two fixed configurations q(t0) = q0 and q(u) = q1, the physical motion q is a critical point of the action S, i.e. its variational (= its Gâteaux) derivative in any smooth direction δq vanishes: δS(q)(δq) = 0. Equivalently, q solves the Euler-Lagrange equations in Q:

(d/dt)(∂L/∂q̇) = ∂L/∂q.

For the Hamilton-Jacobi theory one adds an initial or final boundary condition: S(q0) = S0(q0) or S(q1) = Su(q1). Noether’s theorem is the second most important theorem of classical Lagrangian mechanics. Let Uα : Q × I → Q × I be a given one-parameter (α ∈ R) local group of transformations of the (q, t)-space: (q, t) ↦ (Q(q, t; α), τ(q, t; α)). The functions Q and τ are supposed to be C² in their variables, and Q(q, t; 0) = q, τ(q, t; 0) = t. Therefore

Q(q, t; α) = q + αX(q, t) + o(α);
τ(q, t; α) = t + αT(q, t) + o(α).    (6.78)
The pair (X(q, t), T(q, t)) is called the tangent vector field of the family {Uα}, and (T, X) its infinitesimal generator. The action S is said to be divergence invariant if there exists a C²-function Φ such that, for all α > 0 small enough, the equality

S(q(·); t′0, t′1) = S(Q(·); τ′0, τ′1) − α ∫_{t′0}^{t′1} (dΦ/dt)(q(t), t) dt + o(α)    (6.79)

holds for any C²-trajectory q(·) in D(S) and for any time interval [t′0, t′1] in [t0, u]. Noether’s theorem says that for a divergence invariant Lagrangian action the expression

((∂L/∂q̇) X + (L − (∂L/∂q̇) q̇) T − Φ)(q(t), t)
is constant. The first factor p = ∂L/∂q̇ defines the momentum observable, and the second one the energy −H = L − (∂L/∂q̇) q̇. According to E. Cartan the Noether constant can be considered as the central geometrical object of classical Hamiltonian mechanics.
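The conservation statement can be checked numerically. For time translations (X = 0, T = 1, Φ = 0) the Noether constant above reduces to the energy H = (∂L/∂q̇)q̇ − L = (1/2)|q̇|² + V(q). The sketch below is an illustration only (not part of the text); the potential V(q) = q²/2 and the step size are arbitrary choices. It integrates the Euler-Lagrange equation q̈ = −V′(q) with a Runge-Kutta scheme and monitors the drift of H.

```python
# Noether constant for time translations: the energy H = p^2/2 + V(q)
# stays constant along solutions of the Euler-Lagrange equation.

def V(q):
    return 0.5 * q * q          # illustrative potential

def dV(q):
    return q

def rk4_step(q, p, dt):
    """One classical RK4 step for q' = p, p' = -V'(q)."""
    def f(q, p):
        return p, -dV(q)
    k1q, k1p = f(q, p)
    k2q, k2p = f(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = f(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = f(q + dt * k3q, p + dt * k3p)
    q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return q, p

q, p, dt = 1.0, 0.0, 1.0e-3
H0 = 0.5 * p * p + V(q)         # the Noether constant (up to sign)
drift = 0.0
for _ in range(10_000):         # integrate over t in [0, 10]
    q, p = rk4_step(q, p, dt)
    drift = max(drift, abs(0.5 * p * p + V(q) - H0))

print("maximal energy drift:", drift)
```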

6.4.2 Some problems

We want to mention some problems which are related to this and earlier chapters. As we proved in Chapter 2, Theorem 1.39 remains true if the space E is a Polish space, and if Cb(E) is the space of all bounded continuous functions on E. Instead of the topology of uniform convergence we consider the strict topology. This topology is generated by semi-norms of the form f ↦ sup_{x∈E} |u(x)f(x)|, f ∈ Cb(E). The functions u ≥ 0 have the property that for every α > 0 the set {u ≥ α} is compact (or is contained in a compact subset of E). The functions u need not be continuous.
Problem 6.22. Is there a relationship with work done by Eberle [78, 79, 80]?
In [4] the authors Altomare and Attalienti take a somewhat different point of view. Their state space is still second countable and locally compact. They take a bounded continuous function w : E → (0, ∞) and then consider the space C0w(E) consisting of those functions f ∈ C(E) with the property that the function wf belongs to C0(E). The space C0w(E) is supplied with the norm ‖f‖w = ‖wf‖∞, f ∈ C0w(E). They study the semigroup P^w(t)f := w⁻¹P(t)(wf), where P(t), t ≥ 0, is a Feller semigroup. Properties of P(t) are transferred to ones of P^w(t) and vice versa. Using these weighted continuous function spaces the authors prove some new results on the well-posedness of the Black-Scholes equation in a weighted continuous function space: see [5], and see Chapter 4 for more on this in the usual case. In [164] Mininni and Romanelli estimate the trend coefficient in the Black-Scholes equation. That paper is somewhat complementary to what we do in Chapter 4.
Problem 6.23. Is it possible to rephrase Theorem 1.39 for reciprocal Markov
processes and diffusions?
Martingales should then be replaced by differences of forward and backward martingales. A stochastic process (M(t) : t ≥ 0) on a probability space (Ω, F, P) is called a backward martingale if E[M(t) | Fs] = M(s), P-almost surely, where t < s, and Fs is the σ-field generated by the information from the future: Fs = σ(X(u) : u ≥ s). Of course we assume that M(t) belongs to L1(Ω, F, P), t ≥ 0.
Let (Ω, F, P) be a probability space. An E-valued process (X(t) : 0 ≤ t ≤ 1) is called reciprocal if for any 0 ≤ s < t ≤ 1 and every pair of events A ∈ σ(X(τ) : τ ∈ (s, t)), B ∈ σ(X(τ) : τ ∈ [0, s] ∪ [t, 1]) the equality

P[A ∩ B | X(s), X(t)] = P[A | X(s), X(t)] · P[B | X(s), X(t)]    (6.80)

is valid. By D we denote the set

D = {(s, x, t, B, u, z) : (x, z) ∈ E × E, 0 ≤ s < t < u ≤ 1, B ∈ E}.    (6.81)

A function P : D → [0, ∞) is called a reciprocal probability distribution or a Bernstein probability if the following conditions are satisfied:

(i) the mapping B ↦ P(s, x, t, B, u, z) is a probability measure on E for any (x, z) ∈ E × E and for any 0 ≤ s < t < u ≤ 1;
(ii) the function (x, z) ↦ P(s, x, t, B, u, z) is E ⊗ E-measurable for any 0 ≤ s < t < u ≤ 1;
(iii) for every pair C, D ∈ E, every (x, y) ∈ E × E, and all 0 ≤ s < t < u < v ≤ 1 the following equality is valid:

∫_D P(s, x, u, dξ, v, y) P(s, x, t, C, u, ξ) = ∫_C P(s, x, t, dη, v, y) P(t, η, u, D, v, y).

Then the following theorem is valid for E = Rν (see Jamison [115]).


Theorem 6.24. Let P(s, x, t, B, u, y) be a reciprocal transition probability function and let µ be a probability measure on E ⊗ E. Then there exists a unique probability measure Pµ on F with the following properties:

(1) With respect to Pµ the process (X(t) : 0 ≤ t ≤ 1) is reciprocal;
(2) For all (A, B) ∈ E ⊗ E the equality Pµ[X0 ∈ A, X1 ∈ B] = µ(A × B) is valid;
(3) For every 0 ≤ s < t < u ≤ 1 and for every A ∈ E the equality Pµ[X(t) ∈ A | X(s), X(u)] = P(s, X(s), t, A, u, X(u)) is valid.

For more details see Thieullen [228] and [229]. An example of a reciprocal Markov probability can be constructed as follows; it is a kind of pinned Markov process. Let {(Ω, F, Px), (X(t) : t ≥ 0), (ϑt : t ≥ 0), (E, E)} be a (strong) time-homogeneous Markov process, and suppose that for every t > 0 and every x ∈ E the probability measure B ↦ Px[X(t) ∈ B] has a Radon-Nikodym derivative p0(t, x, y) with respect to some reference measure dy. Also suppose that p0(t, x, y) is strictly positive and continuous on (0, ∞) × E × E. Put

p(s, x, u, ξ, v, y) = p0(u − s, x, ξ) p0(v − u, ξ, y) / p0(v − s, x, y),  0 ≤ s < u < v,

and put P(s, x, u, B, v, y) = ∫_B p(s, x, u, ξ, v, y) dξ. Then P is a reciprocal Markov probability.
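For one-dimensional Brownian motion, with p0(t, x, y) = (2πt)^{−1/2} exp(−(y − x)²/(2t)), this construction produces the Brownian bridge. The following sketch is an illustration only (not from the text; the time points and the pinning values x, y are arbitrary choices); it checks numerically that ξ ↦ p(s, x, u, ξ, v, y) is a probability density, i.e. property (i) of a Bernstein probability.

```python
import math

def p0(t, x, y):
    """Gaussian transition density of one-dimensional Brownian motion."""
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def p(s, x, u, xi, v, y):
    """Pinned (bridge) kernel of the text: p0(u-s,x,xi) p0(v-u,xi,y) / p0(v-s,x,y)."""
    return p0(u - s, x, xi) * p0(v - u, xi, y) / p0(v - s, x, y)

s, u, v = 0.0, 0.4, 1.0
x, y = -0.3, 0.7

# trapezoidal rule on a wide grid; the bridge density is concentrated
# between x and y, so [-10, 10] captures essentially all of the mass
n, a, b = 4000, -10.0, 10.0
h = (b - a) / n
total = 0.5 * (p(s, x, u, a, v, y) + p(s, x, u, b, v, y))
for j in range(1, n):
    total += p(s, x, u, a + j * h, v, y)
total *= h

print("integral of the bridge kernel:", total)
```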

Conclusion

This chapter is a reworked version of [242]. One of the main results of the present chapter is contained in Theorem 6.8. The method of proof is based on martingale methods. For more information on viscosity solutions the reader is referred to [62]. Another feature of the present chapter is the statement and proof of a generalized Noether theorem (Theorem 6.13) and its complex companion (Theorem 6.17). The proofs are of a computational character; they only depend on the properties of the generator of the diffusion and the corresponding carré du champ operator. They imitate and improve results obtained by Zambrini in [260]. Moreover the results solve problems posed in [240] (Problem 4, Theorem 16, pp. 257-258) and in §2 of [238]. In particular see Problem 4 and the question prior and related to the suggested Theorem 6 on pp. 48-50 of [238]. The present chapter is a substantial extension of [239].
Part III

Long time behavior


7
On non-stationary Markov processes and
Dunford projections

The aim of this chapter is to present some criteria for checking ergodicity of time-continuous finite or infinite Markov chains in the sense that µ̇(t) = K(t)µ(t), where every K(t), t ∈ R, is a weak∗-closed linear Kolmogorov operator on the space of complex Borel measures M(E) on a complete metrizable separable Hausdorff space E. The obtained results are valid
in the non-stationary case and can be used as reliable and valuable tools
to establish ergodicity. Some theoretical approximation results are given as well. The present chapter was initiated by some results in the Ph.D. thesis of Katilova [129]: see [243] as well. What in the present chapter is called
σ(M(E), Cb(E))-convergence, or σ(M(E), Cb(E))-topology, in the probability literature is often referred to as weak convergence, or weak topology. In functional analytic terms these notions should be called weak∗-convergence, or weak∗-topology. Here “weak∗” refers to the pre-dual space of M(E), which is the space Cb(E) endowed with the strict topology. In order to avoid misunderstandings we sometimes write “σ(M(E), Cb(E))” instead of “weak” (probabilistic notion) or “weak∗” (functional analytic notion). Nevertheless, we will employ the notations “weak∗” and “σ(M(E), Cb(E))” interchangeably; we will write e.g. “weak∗-continuous semigroup” where, strictly speaking, we mean “σ(M(E), Cb(E))-continuous semigroup”.

7.1 Introduction

Let E be a complete metrizable topological space which is separable with Borel field E: in other words E is a Polish space. By M(E) we denote the vector space of all complex Borel measures on E, supplied with the total variation norm:

Var(µ) = sup { Σ_{j=1}^{n} |µ(Bj)| : {B1, …, Bn} is a partition of E }.    (7.1)

In view of inequality (3) in Theorem 7.7 in Section 7.2 (except in Example 7.32) we will not use the total variation norm, but the following equivalent one:

‖µ‖ = sup {|µ(B)| : B ∈ E},  µ ∈ M(E).    (7.2)

In fact we have ‖µ‖ ≤ Var(µ) ≤ 4‖µ‖. In the other sections and in Example 7.32 the symbol Var(µ), µ ∈ M(E), stands for the total variation norm of the measure µ. Let f be a bounded Borel function and µ a measure in M(E); instead of ∫_E f dµ we often write ⟨f, µ⟩. By hypothesis the family K(t), t ∈ R, is a family of linear operators with domain and range in M(E) which are σ(M(E), Cb(E))-closed. This means that if (µn)n∈N is a sequence in D(K(t)), the domain of K(t), for which there exist Borel measures µ and ν ∈ M(E) such that, for all f ∈ Cb(E), lim_{n→∞} ⟨f, µn⟩ = ⟨f, µ⟩ and lim_{n→∞} ⟨f, K(t)µn⟩ = ⟨f, ν⟩, then µ belongs to D(K(t)) and K(t)µ = ν. Instead of σ(M(E), Cb(E))-closed we usually write weak∗-closed.
An important example of a weak∗ -closed linear operator is the adjoint of an
operator with domain and range in Cb (E). We consider a continuous system
of the form:
µ̇(t) = K(t)µ(t), −∞ < t < ∞, (7.3)
where each K(t) is a weak∗ -closed linear operator on M (E).
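The norm equivalence ‖µ‖ ≤ Var(µ) ≤ 4‖µ‖ noted earlier is easy to test when µ has finitely many atoms, since both norms can then be computed by brute force. The following sketch is an illustration only (the atoms are arbitrary complex numbers, not from the text).

```python
from itertools import combinations

# a complex measure on E = {0,...,4}, given by its atoms
atoms = [1.0 + 0.5j, -0.75j, -0.6 + 0.2j, 0.3 - 0.9j, 0.45 + 0.1j]

# Var(mu): for an atomic measure the supremum over partitions is attained
# by the partition into singletons, so Var(mu) = sum of |mu({x})|
var = sum(abs(a) for a in atoms)

# ||mu|| = sup_B |mu(B)|: brute force over all 2^5 subsets B of E
norm = 0.0
for r in range(len(atoms) + 1):
    for B in combinations(range(len(atoms)), r):
        norm = max(norm, abs(sum(atoms[j] for j in B)))

print(norm, var)     # norm <= var <= 4 * norm
```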
Definition 7.1. Let K be a weak∗-closed linear operator on M(E).

(a) An eigenvalue µ of K is called dominant if lim_{t→∞} ‖e^{tK}(I − P)‖ = 0. Here P is the Dunford projection on the generalized eigenspace corresponding to µ, i.e.

P = (1/(2πi)) ∫_γ (λI − K)⁻¹ dλ,

where γ is a (small) positively oriented circle around µ; the disc centered at µ and with circumference γ does not contain other eigenvalues.
(b) An eigenvalue µ of K is called critical if it is dominant and the zero space of K − µI is one-dimensional.
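In finite dimensions the Dunford projection of Definition 7.1 can be approximated by discretizing the contour integral over γ. The sketch below is an illustration only; the 2 × 2 matrix K, with eigenvalues 0 and −1, is an arbitrary example, and the computed P can be checked to be idempotent and to commute with K.

```python
import numpy as np

# K has eigenvalues 0 and -1; the eigenvalue 0 is dominant for e^{tK}
K = np.array([[0.0, 1.0],
              [0.0, -1.0]])

# Dunford projection P = (1/(2 pi i)) * contour integral of (lambda I - K)^{-1}
# over a small positively oriented circle around the eigenvalue mu = 0
mu, r, n = 0.0, 0.5, 400
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
lam = mu + r * np.exp(1j * theta)
dlam = 1j * r * np.exp(1j * theta) * (2.0 * np.pi / n)

P = sum(np.linalg.inv(l * np.eye(2) - K) * dl for l, dl in zip(lam, dlam))
P = P / (2j * np.pi)

print(np.round(P.real, 6))   # the spectral projection onto ker(K)
```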
We consider the simplex P(E) ⊂ M(E) consisting of all Borel probability measures on E:

P(E) = {µ ∈ M(E) : µ(E) = 1 and µ(B) ≥ 0 for all Borel subsets B of E},    (7.4)

and the subspace M0(E) of co-dimension one in M(E):

M0(E) = {µ ∈ M(E) : µ(E) = 0}.    (7.5)

7.2 Kolmogorov operators and weak∗-continuous semigroups
Under appropriate conditions on the family K(t), t ≥ t0, a solution to the equation in (7.3), i.e. a solution to

(d/dt) ⟨f, µ(t)⟩ = ⟨f, K(t)µ(t)⟩,  t0 ≤ t < ∞, f ∈ Cb(E),    (7.6)

where µ(t0) ∈ P(E) is given, can be written in the form:

µ(t) = X(t, t0) µ(t0),  t0 ≤ t < ∞;    (7.7)

the operator-valued function X(t, t0) satisfies the following differential equation in weak∗-sense:

(∂/∂t) X(t, t0) = K(t) X(t, t0).    (7.8)

It is an evolution family in the sense that X(t, t2) X(t2, t1) = X(t, t1), t ≥ t2 ≥ t1 ≥ t0, and X(t, t) = I. We also assume that weak∗-lim_{t↓s} X(t, s)µ = µ, i.e.

lim_{t↓s} ⟨f, X(t, s)µ⟩ = ⟨f, µ⟩ for all f ∈ Cb(E) and µ ∈ M0(E).

Suppose now that for every t the operator K(t) is Kolmogorov or, what is the same, has the Kolmogorov property. This means that for the operator K(t) the following formulas are valid:

ℜ(K(t)µ)(E) = ℜ⟨1, K(t)µ⟩ = 0 for all µ ∈ P(E), and    (7.9)
ℜ⟨f, K(t)µ⟩ ≥ 0 for all (f, µ) ∈ Cb⁺(E) × P(E) for which supp(f) ∩ supp(ℜµ) = ∅.    (7.10)

Here Cb⁺(E) is the convex cone of all nonnegative functions in Cb(E). Unfortunately this notion is too weak for our purposes. We therefore need a modification of the notion of (sub-)Kolmogorov operator, which we label sectorial sub-Kolmogorov operator; it is somewhat stronger than (7.10).
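When E is a finite set, M(E) ≅ C^n and K is a matrix; the Kolmogorov property (7.9)-(7.10) then amounts to saying that the columns of K sum to zero and that the off-diagonal entries of K are nonnegative, i.e. K is the transpose of a rate (Q-)matrix. The following sketch is an illustration only (the rates are arbitrary choices, not from the text); it verifies these two properties and the fact that the solution of µ̇ = Kµ stays in P(E).

```python
import numpy as np

# transpose of a 3-state rate matrix: columns sum to 0 and
# off-diagonal entries are >= 0 -- the finite-state Kolmogorov property
K = np.array([[-1.0, 2.0, 0.5],
              [ 0.4, -3.0, 0.5],
              [ 0.6, 1.0, -1.0]])

assert np.allclose(K.sum(axis=0), 0.0)        # <1, K mu> = 0, cf. (7.9)
assert (K - np.diag(np.diag(K)) >= 0).all()   # positivity off the diagonal, cf. (7.10)

def expm(A, terms=40):
    """Matrix exponential via its power series (adequate for small matrices)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out += term
    return out

mu0 = np.array([0.2, 0.5, 0.3])               # a probability measure on E
mu1 = expm(K) @ mu0                           # solves mu' = K mu at t = 1

print(mu1, mu1.sum())                         # still a probability measure
```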
Definition 7.2. Let K be a linear operator with domain and range in M(E). Suppose that its graph G(K) := {(µ, Kµ) : µ ∈ D(K)} is closed in the product space (M(E), ‖·‖) × (M(E), σ(M(E), Cb(E))). Here σ(M(E), Cb(E)) stands for the weak∗-topology which M(E) receives from its pre-dual space Cb(E). The operator K is called a sub-Kolmogorov operator if for every µ ∈ D(K) the equality

sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)}
  = inf_{ε>0} sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, ℜ⟨f, Kµ⟩ ≤ ε, f ∈ Cb(E)}    (7.11)

holds.
The sub-Kolmogorov operator K is called sectorial if there exists a finite constant C such that the inequality

|λ| sup {|⟨f, µ⟩| : |f| ≤ 1, f ∈ Cb(E)} ≤ C sup {|⟨f, λµ − Kµ⟩| : |f| ≤ 1, f ∈ Cb(E)}    (7.12)

holds for all µ ∈ D(K) and for all λ ∈ C with ℜλ > 0.

The following definition should be compared with the corresponding definition


in Definition 3.5: see (3.13). In fact these two notions are equivalent: this is
a consequence of assertion (f) in Proposition 3.11. Lemma 7.3 is in fact a
rewording of assertion (f) in the latter proposition.
Lemma 7.3. Let L be a linear operator with domain and range in Cb (E).
The following assertions are equivalent:
(i) For every λ > 0 and for every f ∈ D(L) the following inequality holds:

λ ‖f‖∞ ≤ ‖λf − Lf‖∞;    (7.13)

(ii) For every ε > 0 the following inequality holds for all f ∈ D(L):

sup {|f(x)| : x ∈ E} ≤ sup {|f(x)| : ℜ( f̄(x) Lf(x) ) ≤ ε}.    (7.14)

Definition 7.4. An operator L with domain and range in Cb(E) is said to be dissipative if for every f ∈ D(L) and every ε > 0 the following identity holds:

sup {|f(x)| : x ∈ E} = sup {|f(x)| : ℜ( f̄(x) Lf(x) ) ≤ ε, x ∈ E}.    (7.15)

An operator L with domain and range in Cb(E) which satisfies the maximum principle is called sectorial if there exists a constant C such that for all λ ∈ C with ℜλ > 0 the inequality

|λ| ‖f‖∞ ≤ C ‖(λI − L)f‖∞ holds for all f ∈ D(L).    (7.16)

Notice that the notion of dissipativeness is equivalent to the following one: for every f ∈ D(L) there exists a sequence (xn)n∈N ⊂ E such that

lim_{n→∞} |f(xn)| = ‖f‖∞ and lim_{n→∞} ℜ( f̄(xn) Lf(xn) ) ≤ 0.    (7.17)

From (7.17) it follows that the present notion of “being dissipative” coincides with the notion in Chapter 3: see §3.2. In particular, the reader is referred to (3.13) in Definition 3.5, and to assertion (f) in Proposition 3.11. Apparently, the conditions in Remarks 7.5 and 7.6 below are not easily verifiable (or they might not be satisfied in interesting cases, where the operators K, respectively L, generate analytic semigroups).
Remark 7.5. Suppose that there exists 0 < γ < π/2 such that for every µ ∈ D(K) and every ε > 0 there exists a function f ∈ Cb(E), 0 ≤ |f| ≤ 1, such that Var(µ) ≤ |⟨f, µ⟩| + ε, and such that there exists ϑ(µ) ∈ R satisfying π ≥ |ϑ(µ)| ≥ γ + π/2 and

⟨f, Kµ⟩ / ⟨f, µ⟩ = (|⟨f, Kµ⟩| / |⟨f, µ⟩|) e^{iϑ(µ)}.

Then (7.12) is satisfied with C determined by C sin γ = 1.

Remark 7.6. Similarly, let L be an operator with domain and range in Cb(E). Suppose that there exists 0 < γ < π/2 such that for every f ∈ D(L), 0 ≤ |f| ≤ 1, and every ε > 0 there exists x ∈ E such that ‖f‖∞ = |f(x)| and such that there exists ϑ(x) ∈ R satisfying π ≥ |ϑ(x)| ≥ γ + π/2 and

Lf(x) / f(x) = (|Lf(x)| / |f(x)|) e^{iϑ(x)}.

Then the operator L is sectorial in the sense that

‖λf − Lf‖∞ ≥ sin γ · |λ| ‖f‖∞,  f ∈ D(L).

How does one check conditions like those in (7.11) and (7.12)? We first analyze the right-hand side of (7.11). Let E = E⁺_{ℜµ} ∪ E⁻_{ℜµ} be the Hahn decomposition of E corresponding to the Jordan decomposition of the measure ℜµ. Then E⁺_{ℜµ} ∩ E⁻_{ℜµ} = ∅, (ℜµ)⁺(E⁻_{ℜµ}) = (ℜµ)⁻(E⁺_{ℜµ}) = 0, and if B ∈ E is a subset of E⁺_{ℜµ}, then (ℜµ)(B) ≥ 0. In other words the signed measure ℜµ is positive on E⁺_{ℜµ}; similarly the signed measure −ℜµ is positive on E⁻_{ℜµ}. In addition we have

sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)} = ℜµ(E⁺_{ℜµ}) = sup {ℜµ(C) : C ⊂ E⁺_{ℜµ}, C compact}.    (7.18)

Let Cn, n ∈ N, be a sequence of compact subsets of E⁺_{ℜµ} and let On, n ∈ N, be a sequence of open subsets of E such that Cn ⊂ E⁺_{ℜµ} ⊂ On, and such that

lim_{n→∞} ℜ⟨1_{Cn}, Kµ⟩ = lim_{n→∞} ℜ⟨1_{On}, Kµ⟩ = ℜ⟨1_{E⁺_{ℜµ}}, Kµ⟩.    (7.19)

In addition suppose that

lim_{n→∞} ⟨1_{On} − 1_{Cn}, |ℜµ|⟩ = lim_{n→∞} ⟨1_{On} − 1_{Cn}, |ℜKµ|⟩ = 0.    (7.20)

Here the measures |ℜµ| and |ℜKµ| stand for the variation measures of ℜµ and ℜKµ respectively. Suppose that ℜ⟨1_{E⁺_{ℜµ}}, Kµ⟩ ≤ 0. Let fn, n ∈ N, be a sequence of functions in Cb(E) with the property that 1_{Cn} ≤ fn ≤ 1_{On}. Then from (7.19) and (7.20) it follows that

ℜµ(E⁺_{ℜµ}) = ℜ⟨1_{E⁺_{ℜµ}}, µ⟩ = lim_{n→∞} ℜ⟨fn, µ⟩, and    (7.21)
0 ≥ ℜ⟨1_{E⁺_{ℜµ}}, Kµ⟩ = lim_{n→∞} ℜ⟨fn, Kµ⟩.    (7.22)

Under this assumption the (in)equalities in (7.18), (7.21), and (7.22) show that the left-hand side of (7.11) is less than or equal to its right-hand side. Since the converse inequality is trivial, equality (7.11) holds.

In order to establish an inequality like the one in (7.12) it suffices to exhibit a Borel measurable function g : E → C with the following properties: |g| = 1, the expression ⟨g, µ⟩ ⟨g, Kµ⟩ is a negative real number, and

Var(µ) = sup {|⟨f, µ⟩| : |f| ≤ 1, f ∈ Cb(E)} = ⟨g, µ⟩.

Next let K = L∗, where L is a closed linear operator with domain and range in Cb(E). Suppose that ℜLf ≤ 0 on C whenever C is a compact subset of E and f ∈ D(L) is such that 1_C ≤ f ≤ 1. Then the operator K satisfies (7.11).

Next let µ ∈ D(K) be such that ℜ⟨1, Kµ⟩ ≤ 0, and suppose

sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)}
  = inf_{ε>0} sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, ℜ⟨1 − f, Kµ⟩ ≥ −ε, f ∈ Cb(E)}.    (7.23)

Then the equality in (7.11) follows from (7.23).


Suppose that for every positive measure µ ∈ D(K) the following inequal-
ities are satisfied: Kµ(E) ≤ 0 and µ(B) = 0 implies Kµ(B) ≥ 0. Then,
for every measure µ ∈ D(K) there exists a Borel subset E + of E such that
Kµ (E + ) ≤ 0, provided that in the Jordan-decomposition µ = µ+ − µ− the
measure µ+ belongs to D(K). This fact follows from the next observation. Let
E = E + ∪ E − be the Hahn-decomposition of E corresponding to the Jordan-
decomposition of the measure µ. Then E + ∩ E − = ∅, µ+ (E − ) = µ− (E + ) = 0
and hence, by the new hypotheses,
¡ ¢ ¡ ¢ ¡ ¢
Kµ E + = Kµ+ (E) − Kµ− E + − Kµ+ E − ≤ 0.

For more details on Hahn-Jordan decompositions see e.g. Chapter 14 in Za-


anen [257]. The following theorem is the main motivation to introduce (sub-
)Kolmogorov operators K.
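On a finite state space the Hahn decomposition used above is elementary, and the identity (7.18) becomes a finite maximization. The following sketch is an illustration only (the atoms are arbitrary choices, not from the text), for a real measure with finitely many atoms.

```python
# a signed (real) measure on E = {0,...,4}, given by its atoms
atoms = [0.7, -0.2, 0.1, -0.5, 0.3]

# Hahn decomposition: E+ carries the nonnegative part of mu, E- the rest
E_plus = [j for j, a in enumerate(atoms) if a >= 0]
E_minus = [j for j, a in enumerate(atoms) if a < 0]

mu_Eplus = sum(atoms[j] for j in E_plus)

# sup {<f, mu> : 0 <= f <= 1}: since the map f -> <f, mu> is linear,
# the supremum is attained at an indicator, here f = 1_{E+} as in (7.18);
# brute force over all subsets confirms this
best = max(
    sum(atoms[j] for j in range(5) if (mask >> j) & 1)
    for mask in range(2 ** 5)
)

print(E_plus, mu_Eplus, best)
```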
Theorem 7.7. Let K be a sub-Kolmogorov operator as in Definition 7.2. Then, for every λ > 0 and µ ∈ D(K), the following inequalities hold:

λ sup_{B∈E} ℜµ(B) ≤ sup_{B∈E} ℜ((λI − K)µ)(B);    (7.24)
λ inf_{B∈E} ℜµ(B) ≥ inf_{B∈E} ℜ((λI − K)µ)(B);    (7.25)
λ sup_{B∈E} |µ(B)| ≤ sup_{B∈E} |((λI − K)µ)(B)|.    (7.26)

Proof (Proof of Theorem 7.7). First we notice the equality:

sup_{B∈E} ℜµ(B) = sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)}.

Assertion (7.25) is a consequence of (7.24): apply (7.24) with −µ replacing µ. Assertion (7.26) also follows from (7.24), by noticing that

|⟨f, µ⟩| = sup_{ϑ∈[−π,π]} ℜ⟨f, e^{iϑ}µ⟩,

and then applying (7.24) to the measures e^{iϑ}µ, ϑ ∈ [−π, π]. The inequality in (7.24) remains to be shown. Fix µ ∈ M(E) and f ∈ Cb(E), 0 ≤ f ≤ 1. Then we have

λ ℜ⟨f, µ⟩ = ℜ⟨f, (λI − K)µ⟩ + ℜ⟨f, Kµ⟩.    (7.27)

From (7.27) we get

λ sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E), ℜ⟨f, Kµ⟩ ≤ ε}
  ≤ sup {ℜ⟨f, (λI − K)µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E), ℜ⟨f, Kµ⟩ ≤ ε} + ε
  ≤ sup {ℜ⟨f, (λI − K)µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)} + ε.    (7.28)

Employing equality (7.11) in Definition 7.2 and (7.28) we get

λ sup {ℜ⟨f, µ⟩ : f ∈ Cb(E), 0 ≤ f ≤ 1}
  = λ inf_{ε>0} sup {ℜ⟨f, µ⟩ : 0 ≤ f ≤ 1, ℜ⟨f, Kµ⟩ ≤ ε, f ∈ Cb(E)}
  ≤ sup {ℜ⟨f, (λI − K)µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)}.    (7.29)

Assertion (7.24) follows from (7.29). This concludes the proof of Theorem 7.7.

7.3 Kolmogorov operators and analytic semigroups

In the present section we recall some properties of weak∗ -continuous bounded


analytic semigroups acting on M (E). These results have their counterparts
for strongly continuous bounded analytic semigroups.
Theorem 7.8. Suppose, in addition to the fact that K is a sectorial sub-Kolmogorov operator, that there exists λ0 > 0 such that (λ0 I − K)D(K) = M(E). Then for every real-valued function f ∈ Cb(E) and every µ ∈ D(K) with values in R the expression ⟨f, Kµ⟩ is real. Assume that the graph of the operator K is σ(M(E), Cb(E))-closed, and that the same is true for all operators µ ↦ 1_C Kµ, µ ∈ D(K), where C is any compact subset of E. Here the measure 1_C Kµ is defined by the equality ⟨f, 1_C Kµ⟩ = ∫_C f dKµ, f ∈ Cb(E). Moreover, there exists a finite constant C such that for every λ ∈ C with ℜλ > 0 the following assertions hold:

(1) (λI − K)D(K) = M(E).
(2) Let µ ∈ D(K) be a real-valued measure on E. Then

|λ| sup {|⟨f, µ⟩| : 0 ≤ f ≤ 1, f ∈ Cb(E)} ≤ sup {|⟨f, (λI − K)µ⟩| : 0 ≤ f ≤ 1, f ∈ Cb(E)}.    (7.30)

(3) The inequality

|λ| sup {|⟨f, µ⟩| : |f| ≤ 1, f ∈ Cb(E)} ≤ C sup {|⟨f, (λI − K)µ⟩| : |f| ≤ 1, f ∈ Cb(E)}    (7.31)

holds for all measures µ ∈ D(K).
(4) Suppose that the function x ↦ (λI − K)⁻¹δx, x ∈ E, is Borel measurable. Let µ be a bounded Borel measure on E. Then the following equality holds:

λ ∫ (λI − K)⁻¹δx dµ(x) = λ (λI − K)⁻¹µ.    (7.32)

Proof (Proof of Theorem 7.8). First we will show the following assertion: if a function f ∈ Cb(E) and a measure µ ∈ D(K) are real-valued, then the expression ⟨f, Kµ⟩ belongs to R. For this purpose we choose measures νλ ∈ M(E), λ > 0, such that λµ = (λI − K)νλ. Then λ(iµ) = (λI − K)(iνλ). By (7.24) in Theorem 7.7 we have for B ∈ E

−λ ℑνλ(B) = λ ℜ(iνλ(B)) ≥ inf_{B∈E} ℜ((λI − K)(iνλ))(B) = inf_{B∈E} λ ℜ(iµ(B)) = 0.    (7.33)

From (7.33) it follows that ℑνλ(B) ≤ 0 for all B ∈ E. By the same procedure with −µ instead of µ we see ℑνλ(B) ≥ 0 for all B ∈ E. Hence we get ℑνλ(B) = 0 for all B ∈ E, or, what is the same, the measures νλ = λR(λ)µ, λ > 0, take their values in the reals. From (7.26) it follows that ‖1_C λR(λ)Kµ‖ ≤ ‖Kµ‖, λ > 0, for every compact subset C of E. Let (Ck)k∈N be an increasing sequence of compact subsets of E such that lim_{k→∞} |Kµ|(Ck) = |Kµ|(E). By the theorem of Banach-Alaoglu, which states that the closed unit ball of a dual Banach space is weak∗-compact, it follows that there exist a double sequence {λk,n : k, n ∈ N}, where for every fixed k the sequence λk,n tends to ∞ as n → ∞, and measures νk ∈ M(E) such that

lim_{n→∞} ⟨f, 1_{Ck} K(λk,n R(λk,n)µ)⟩ = lim_{n→∞} ⟨f, 1_{Ck} λk,n R(λk,n) Kµ⟩ = ⟨f, νk⟩,    (7.34)

f ∈ Cb(E). Since λk,n R(λk,n)µ − µ = R(λk,n)Kµ, inequality (7.26) implies

λk,n ‖λk,n R(λk,n)µ − µ‖ ≤ ‖Kµ‖,

and we see that

lim_{n→∞} ‖λk,n R(λk,n)µ − µ‖ = 0.    (7.35)

From (7.34) and (7.35) it follows that the pair (µ, νk) belongs to the closure of G(1_{Ck} K) in the space (M(E), ‖·‖) × (M(E), σ(M(E), Cb(E))). Since by assumption the subspace G(1_{Ck} K) is closed for this topology, we see that νk = 1_{Ck} Kµ, and hence 1_{Ck} Kµ, being the σ(M(E), Cb(E))-limit of a sequence of real measures, is itself a real-valued measure. Since ⟨f, Kµ⟩ = lim_{k→∞} ⟨f, 1_{Ck} Kµ⟩ we see that Kµ is a real measure.

(1) As a second step we prove assertion (1), i.e. we show that for every λ ∈ C with ℜλ > 0 the equality (λI − K)D(K) = M(E) holds. Therefore we put

R(λ0) = (λ0 I − K)⁻¹, and R(λ) = Σ_{k=0}^{∞} (λ0 − λ)^k R(λ0)^{k+1}.

By the inequality (7.26) this series converges for λ in the open disc

{λ ∈ C : C|λ − λ0| < λ0}.

Moreover, for such λ we have (λI − K)R(λ) = I (and R(λ)(λI − K) is the identity on D(K)). Next, consider the subset of C defined by

{λ ∈ C : ℜλ > 0, (λI − K)D(K) = M(E)}.    (7.36)

Then the set in (7.36) is open and closed in the half-plane {λ ∈ C : ℜλ > 0}. Hence it coincides with the half-plane {λ ∈ C : ℜλ > 0}. It follows that there exists a family of bounded linear operators R(λ), ℜλ > 0, such that R(λ) = (λI − K)⁻¹. Note that in this construction we equipped the space M(E) with the norm ‖µ‖ = sup {|⟨f, µ⟩| : 0 ≤ f ≤ 1}. Altogether this proves (1).
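The Neumann series for the resolvent used in this step can be tested in finite dimensions: for a matrix K and a point λ0 in the resolvent set, the series Σ_{k≥0} (λ0 − λ)^k R(λ0)^{k+1} reproduces (λI − K)⁻¹ for λ close to λ0. The following numpy sketch is an illustration only; the matrix and the points λ0, λ are arbitrary choices.

```python
import numpy as np

K = np.array([[-1.0, 2.0, 0.5],
              [ 0.4, -3.0, 0.5],
              [ 0.6, 1.0, -1.0]])   # an arbitrary example with spectrum in Re(lambda) <= 0

lam0, lam = 2.0, 1.6 + 0.3j         # lam lies in a small disc around lam0
I = np.eye(3)
R0 = np.linalg.inv(lam0 * I - K)    # R(lam0)

# Neumann series R(lam) = sum_k (lam0 - lam)^k R0^{k+1}
R = np.zeros((3, 3), dtype=complex)
power = R0.astype(complex)
for k in range(200):
    R += (lam0 - lam) ** k * power
    power = power @ R0

err = np.abs(R - np.linalg.inv(lam * I - K)).max()
print("series vs direct inverse:", err)
```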
(2) We fix f ∈ Cb(E) with 0 ≤ f ≤ 1, and µ ∈ D(K) with µ(B) ∈ R, B ∈ E. Then ⟨f, Kµ⟩ belongs to R, and for an appropriate choice of ϑ ∈ [−π/2, π/2] we have

|λ| ⟨f, µ⟩ = ℜ⟨f, λe^{iϑ}µ⟩ = ℜ⟨f, (λI − K)(e^{iϑ}µ)⟩ + ℜ⟨f, K(e^{iϑ}µ)⟩
  = ℜ⟨f, (λI − K)(e^{iϑ}µ)⟩ + cos ϑ ⟨f, Kµ⟩.    (7.37)

From (7.37) and equality (7.11) in Definition 7.2 we infer:

|λ| sup {⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)}
  = |λ| inf_{ε>0} sup {⟨f, µ⟩ : 0 ≤ f ≤ 1, ⟨f, Kµ⟩ ≤ ε, f ∈ Cb(E)}
  ≤ inf_{ε>0} sup {ℜ⟨f, (λI − K)(e^{iϑ}µ)⟩ + cos ϑ ⟨f, Kµ⟩ : 0 ≤ f ≤ 1, ⟨f, Kµ⟩ ≤ ε, f ∈ Cb(E)}
  ≤ inf_{ε>0} (sup {|⟨f, (λI − K)µ⟩| : 0 ≤ f ≤ 1, f ∈ Cb(E)} + ε).    (7.38)

From (7.38) we infer

|λ| sup {⟨f, µ⟩ : 0 ≤ f ≤ 1, f ∈ Cb(E)} ≤ sup {|⟨f, (λI − K)µ⟩| : 0 ≤ f ≤ 1, f ∈ Cb(E)}.    (7.39)

The conclusion in (7.30) of item (2) of Theorem 7.8 now follows by applying (7.39) to the real measures µ and −µ.
(3) The inequality in (7.31) is the same as (7.12) in Definition 7.2.

(4) Let µ be a bounded Borel measure on E, and let λ > 0. We want to show the equality in (7.32). Therefore we put νx = λ(λI − K)⁻¹δx, x ∈ E, so that νx ∈ D(K) and (λI − K)νx = λδx. Since the operator K is σ(M(E), Cb(E))-closed we see

λµ = λ ∫ δx dµ(x) = ∫ (λI − K)νx dµ(x) = (λI − K) ∫ νx dµ(x),    (7.40)

and consequently,

λ(λI − K)⁻¹µ = ∫ νx dµ(x) = ∫ λ(λI − K)⁻¹δx dµ(x).    (7.41)

The equality in (7.41) is the same as the one in (7.32). The final step in (7.40) can be justified as follows. We choose double sequences {xj,n : n ∈ N, 1 ≤ j ≤ Nn} ⊂ E and {Cj,n : n ∈ N, 1 ≤ j ≤ Nn} ⊂ E such that

⟨f, µ⟩ = lim_{n→∞} ⟨f, µn⟩ and ⟨f, ∫ νx dµ(x)⟩ = lim_{n→∞} ⟨f, νn⟩, f ∈ Cb(E),    (7.42)

where ⟨f, µn⟩ = Σ_{j=1}^{Nn} µ(Cj,n) f(xj,n) and ⟨f, νn⟩ = Σ_{j=1}^{Nn} µ(Cj,n) ∫ f dν_{xj,n}. Here we employ the Borel measurability of the function x ↦ (λI − K)⁻¹δx. As a consequence of (7.42) we infer that

σ(M(E), Cb(E))-lim_{n→∞} νn = ∫ νx dµ(x), and    (7.43)
σ(M(E), Cb(E))-lim_{n→∞} (λI − K)νn = λ · σ(M(E), Cb(E))-lim_{n→∞} µn = λµ.

Since the graph of the operator K is σ(M(E), Cb(E))-closed, the equalities in (7.43) imply that the measure B ↦ ∫ νx(B) dµ(x), B ∈ E, belongs to D(K) and that ∫ (λI − K)νx dµ(x) = (λI − K)(∫ νx dµ(x)), which is the same as (7.40).

This completes the proof of assertion (4), and also of Theorem 7.8.
Corollary 7.9. Let the sectorial sub-Kolmogorov operator $K$ in Theorem 7.7 have the additional property that for some $\lambda_0 \in \mathbb{C}$ with $\Re\lambda_0 > 0$ the range of $\lambda_0 I - K$ coincides with $M(E)$. Then for all $\lambda \in \mathbb{C}$ with $\Re\lambda > 0$ the operator $(\lambda I - K)^{-1}$ exists as a bounded linear operator which is defined on all of $M(E)$, and which satisfies
\[
|\lambda|\,\mathrm{Var}\left((\lambda I - K)^{-1}\mu\right) \le C\,\mathrm{Var}(\mu), \qquad \Re\lambda > 0,\ \mu \in M(E). \qquad (7.44)
\]
Here $\mathrm{Var}(\mu)$ stands for the total variation norm of the measure $\mu$; it satisfies $\mathrm{Var}(\mu) = \sup\{|\langle f, \mu\rangle| : |f| \le 1\}$.
7.3 Kolmogorov operators and analytic semigroups 305

Proof (Proof of Corollary 7.9). From assertions (1) and (2) in Theorem 7.8 it follows that the inverse operators $(\lambda I - K)^{-1}$, $\Re\lambda > 0$, exist as continuous linear operators. Then the inequality in (7.31) implies that
\[
|\lambda|\,\mathrm{Var}(\mu) \le C\,\mathrm{Var}((\lambda I - K)\mu), \qquad \Re\lambda > 0,\ \mu \in M(E). \qquad (7.45)
\]
The inequality in (7.44) follows from the inequality in (7.45). The representation of the operator $e^{tK}$ given in (7.47) is explained in (the proof of) Theorem 7.50 (see equality (7.284)).

Proposition 7.10. Operators $K$ which have weak$^*$-dense domain and which satisfy (7.44) generate weak$^*$-continuous analytic semigroups
\[
\left\{e^{tK} : |\arg t| \le \alpha\right\}, \quad\text{for some } 0 < \alpha < \frac{\pi}{2}.
\]
The operators $t^{\ell+1}e^{tK}$ and $(-t)^\ell K^\ell e^{tK}$, $t > 0$, $\ell \in \mathbb{N}$, have the representations
\[
\frac{t^{\ell+1}}{(\ell+1)!}\, e^{tK} = \frac{1}{2\pi i}\int_{\omega - i\infty}^{\omega + i\infty}\left(e^{t\lambda} + e^{-t\lambda} - 2\right)(\lambda I - K)^{-\ell-2}\,d\lambda, \qquad (7.46)
\]
and
\[
\frac{(-t)^\ell}{(\ell+1)!}\, K^\ell e^{tK} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2\xi}{\xi^2}\left\{I - 2i\xi(2i\xi I - tK)^{-1}\right\}^\ell\left(2i\xi(2i\xi I - tK)^{-1}\right)^2 d\xi, \qquad (7.47)
\]
respectively. Consequently, with $C(0) = \sup\left\{|\lambda|\left\|(\lambda I - K)^{-1}\right\| : \Re\lambda > 0\right\}$ and with $C_1(0) = \sup\left\{\left\|I - \lambda(\lambda I - K)^{-1}\right\| : \Re\lambda > 0\right\}$, the following inequality holds:
\[
\frac{\left\|t^\ell K^\ell e^{tK}\right\|}{\ell!} \le (\ell+1)\,C(0)^2\,C_1(0)^\ell, \qquad t \ge 0,\ \ell \in \mathbb{N}. \qquad (7.48)
\]
For $\ell = 0$ formula (7.46) can be rewritten as:
\[
e^{tK} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2\xi}{\xi^2}\left(2i\xi(2i\xi I - tK)^{-1}\right)^2 d\xi. \qquad (7.49)
\]
The formula in (7.49) can be used to define the semigroup $e^{tK}$, $t \ge 0$. For $\ell = 1$ formula (7.47) reduces to
\[
-tKe^{tK} = \frac{2}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2\xi}{\xi^2}\left\{\left(2i\xi(2i\xi I - tK)^{-1}\right)^2 - \left(2i\xi(2i\xi I - tK)^{-1}\right)^3\right\} d\xi. \qquad (7.50)
\]

Proof. From Cauchy's theorem it follows that the right-hand side of (7.46) multiplied by $\dfrac{(\ell+1)!}{t^{\ell+1}}$ is equal to
\[
\frac{(\ell+1)!}{2\pi i\,t^{\ell+1}}\int_{\omega-i\infty}^{\omega+i\infty} e^{t\lambda}(\lambda I - K)^{-\ell-2}\,d\lambda
= \frac{(\ell+1)!}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty} e^{\lambda}(\lambda I - tK)^{-\ell-2}\,d\lambda. \qquad (7.51)
\]
Integration by parts shows that the right-hand side of (7.51) does not depend on $\ell \in \mathbb{N}$, and hence
\[
\frac{(\ell+1)!}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty} e^{\lambda}(\lambda I - tK)^{-\ell-2}\,d\lambda
= \frac{1}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty} e^{\lambda}(\lambda I - tK)^{-2}\,d\lambda. \qquad (7.52)
\]
The right-hand side of (7.52) is the inverse Laplace transform at $s = 1$ of the function $s \mapsto s e^{stK}$, and thus it is equal to $e^{tK}$. This shows (7.46). Since $I - \lambda(\lambda I - K)^{-1} = -K(\lambda I - K)^{-1}$, the equality in (7.46) entails:
\[
\begin{aligned}
\frac{(-t)^\ell}{(\ell+1)!}\,K^\ell e^{tK}
&= \frac{1}{2\pi i\,t}\int_{\omega-i\infty}^{\omega+i\infty}\frac{e^{t\lambda}+e^{-t\lambda}-2}{\lambda^2}\left(I - \lambda(\lambda I - K)^{-1}\right)^\ell \lambda^2(\lambda I - K)^{-2}\,d\lambda\\
&= \frac{1}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty}\frac{e^{\lambda}+e^{-\lambda}-2}{\lambda^2}\left(I - \lambda(\lambda I - tK)^{-1}\right)^\ell \lambda^2(\lambda I - tK)^{-2}\,d\lambda, \qquad (7.53)
\end{aligned}
\]
and hence (7.47) follows. The inequality in (7.48) follows immediately from (7.47). The equalities in (7.49) and (7.50) are easy consequences of (7.46) and (7.47) respectively.
Altogether this proves Proposition 7.10.
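As a quick numerical sanity check of the representation (7.49), one can take $K$ to be multiplication by a scalar $k < 0$, so that $e^{tK} = e^{tk}$, and approximate the integral by a Riemann sum. The sketch below is ours (not from the text) and illustrates only the scalar case:

```python
import numpy as np

def semigroup_via_sine_formula(t, k, xi_max=400.0, n=400_000):
    # Approximate (1/pi) * Int sin(xi)^2/xi^2 * (2i xi (2i xi - t k)^{-1})^2 dxi,
    # the scalar version of (7.49); it should reproduce e^{t k}.
    dx = 2 * xi_max / n
    xi = -xi_max + dx * (np.arange(n) + 0.5)   # midpoint grid, avoids xi = 0
    r = 2j * xi / (2j * xi - t * k)            # scalar analogue of 2i xi (2i xi I - tK)^{-1}
    return np.sum((np.sin(xi) / xi) ** 2 * r ** 2).real * dx / np.pi

print(semigroup_via_sine_formula(1.0, -1.0), np.exp(-1.0))
```

Replacing the scalar resolvent factor by a matrix resolvent gives the analogous check for matrix generators; the integrand decays like $\xi^{-2}$, so the truncation error is of order $1/\xi_{\max}$.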
Lemma 7.11. Suppose that for $\Re\lambda > 0$ the operator $\lambda I - K$ has a bounded inverse defined on $M(E)$. Suppose that $C(0)$ defined by
\[
C(0) := \sup\left\{\left\|\lambda(\lambda I - K)^{-1}\right\| : \Re\lambda > 0\right\} \qquad (7.54)
\]
is finite. Let $0 < \alpha < \frac{1}{2}\pi$ be such that $2C(0)\sin\left(\frac{1}{2}\alpha\right) < 1$. Then for $\lambda \in \mathbb{C}$ with the property that $|\arg(\lambda)| < \frac{1}{2}\pi + \alpha$ the operator $\lambda I - K$ has a bounded inverse with the property that
\[
|\lambda|\left\|(\lambda I - K)^{-1}\right\| \le C(\alpha), \qquad |\arg(\lambda)| \le \frac{1}{2}\pi + \alpha,
\]
where
\[
C(\alpha) := \sup\left\{\left\|\lambda(\lambda I - K)^{-1}\right\| : |\arg(\lambda)| \le \frac{1}{2}\pi + \alpha\right\}. \qquad (7.55)
\]
If $0 \le 2\sin\left(\frac{1}{2}\alpha\right)C(0) < 1$, then $C(\alpha) < \infty$, and
\[
C(\alpha) \le \frac{C(0)}{1 - 2\sin\left(\frac{1}{2}\alpha\right)C(0)}. \qquad (7.56)
\]
In addition, the analytic semigroup $e^{sK}$, $|\arg(s)| \le \alpha$, can be defined by the same formula as employed in (7.49):
\[
e^{sK}\mu = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2\xi}{\xi^2}\,(2i\xi)^2(2i\xi I - sK)^{-2}\mu\,d\xi, \qquad \mu \in M(E),\ |\arg(s)| \le \alpha, \qquad (7.57)
\]
and hence
\[
\left\|e^{sK}\right\| \le C(\alpha)^2 \le \frac{C(0)^2}{\left(1 - 2\left|\sin\left(\frac{1}{2}\alpha\right)\right|C(0)\right)^2}, \qquad |\arg(s)| \le \alpha. \qquad (7.58)
\]

Proof. Fix $\lambda \in \mathbb{C}$ with $\Re\lambda > 0$, and observe the equality
\[
\lambda\left(\lambda I - e^{-i\alpha}K\right)^{-1}
= e^{i\alpha}\lambda(\lambda I - K)^{-1}\left(I - \left(1 - e^{i\alpha}\right)\lambda(\lambda I - K)^{-1}\right)^{-1}
= e^{i\alpha}\lambda(\lambda I - K)^{-1}\sum_{j=0}^{\infty}\left(\left(1 - e^{i\alpha}\right)\lambda(\lambda I - K)^{-1}\right)^j. \qquad (7.59)
\]
The inequality in (7.56) then follows from (7.59). The equality in (7.57) follows from (7.49) and the fact that the vector-valued functions in the right-hand side and the left-hand side of (7.57) are holomorphic in $s$ on an open neighborhood of the indicated sector in $\mathbb{C}$.
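For a scalar $K = k$ the identity (7.59) is just a geometric series, since $|1 - e^{i\alpha}| = 2\sin\left(\frac{1}{2}\alpha\right)$, and it can be confirmed numerically. The following sketch (names ours, a scalar illustration only) compares a truncated series against the closed-form left-hand side $\lambda\left(\lambda - e^{-i\alpha}k\right)^{-1}$:

```python
import cmath

def resolvent_rotated(lam, k, alpha, terms=60):
    # Right-hand side of (7.59) for scalars:
    # e^{i a} * lam/(lam - k) * sum_j ((1 - e^{i a}) * lam/(lam - k))^j
    r = lam / (lam - k)                      # scalar analogue of lam (lam I - K)^{-1}
    z = (1 - cmath.exp(1j * alpha)) * r      # series converges when |z| < 1
    return cmath.exp(1j * alpha) * r * sum(z ** j for j in range(terms))

lam, k, alpha = 1.0, -1.0, 0.3
direct = lam / (lam - cmath.exp(-1j * alpha) * k)   # lam (lam I - e^{-i a} K)^{-1}
print(abs(direct - resolvent_rotated(lam, k, alpha)))
```

The convergence condition $|z| < 1$ is exactly the hypothesis $2\sin\left(\frac{1}{2}\alpha\right)C(0) < 1$ of the lemma in this scalar setting.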
Proposition 7.12. The powers of the resolvent operators $(\lambda I - K)^{-j-1}$ have the representation
\[
\lambda^{j+1}(\lambda I - K)^{-j-1}\mu = \frac{\left(\lambda e^{-i\alpha}\right)^{j+1}}{j!}\int_0^\infty s^j e^{-se^{-i\alpha}\lambda}\,e^{se^{-i\alpha}K}\mu\,ds, \qquad j \in \mathbb{N}, \qquad (7.60)
\]
where $0 < \alpha < \frac{1}{2}\pi$ if $\Im\lambda \ge 0$ and $\Re\lambda > 0$, and $0 > \alpha > -\frac{1}{2}\pi$ if $\Im\lambda \le 0$ and $\Re\lambda > 0$. Next choose $0 \le \alpha' < \alpha < \frac{1}{2}\pi$ in such a way that $0 \le 2\sin\left(\frac{1}{2}\alpha\right)C(0) < 1$. In addition the following estimate holds for all $j \in \mathbb{N}$ and for all $\lambda \in \mathbb{C}$ with $|\arg(\lambda)| \le \frac{1}{2}\pi + \alpha' < \frac{1}{2}\pi + \alpha$:
\[
|\lambda|^{j+1}\left\|(\lambda I - K)^{-j-1}\right\|
\le \frac{1}{\left(\cos\left(|\arg\lambda| - \alpha\right)\right)^{j+1}}\,\frac{C(0)^2}{\left(1 - 2\sin\left(\frac{1}{2}\alpha\right)C(0)\right)^2}
\le \frac{1}{\left(\sin\left(\alpha - \alpha'\right)\right)^{j+1}}\,\frac{C(0)^2}{\left(1 - 2\sin\left(\frac{1}{2}\alpha\right)C(0)\right)^2}. \qquad (7.61)
\]

Proof. Let C(α) be as in (7.55) and suppose C(α) < ∞. Then the measure
−iα
ese K µ has the representation:
308 7 On non-stationary Markov processes and Dunford projections
Z ∞
−iα 1 sin2 ξ ¡ iα ¢2 ¡ iα ¢−2
ese K
µ= 2
2e iξ 2e iξI − sK µ dξ. (7.62)
π −∞ ξ

From (7.62) and (7.57) the following estimate is obtained:


° −iα ° C(0)2
° se K °
°e ° ≤ C(α)2 ≤ ¡ ¯ ¯¢2 , s > 0, (7.63)
1 − 2C(0) ¯sin 12 α¯

where C(0) is defined in (7.55). From (7.63) and (7.60) we see that the follow-
ing estimate holds for all j ∈ N and for all λ ∈ C with |arg(λ)| ≤ 12 π + α0 <
1
2 π + α:
° ° 1 C(0)2
j+1 ° −j−1 °
|λ| °(λI − K) °≤ j+1 ¡ ¡ ¢ ¢2 .
(cos (|arg λ| − α)) 1 − 2 sin 12 α C(0)
(7.64)
It is clear that (7.64) implies (7.61).
This completes the proof of Proposition 7.12.
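With $\alpha = 0$ and a scalar $k < 0$, the representation (7.60) reduces to the elementary Gamma-integral identity $\lambda^{j+1}(\lambda - k)^{-j-1} = \frac{\lambda^{j+1}}{j!}\int_0^\infty s^j e^{-s\lambda} e^{sk}\,ds$, which a simple midpoint rule confirms. The sketch below (names ours) is a scalar illustration only, not the operator-valued statement:

```python
import math

def resolvent_power_via_laplace(lam, k, j, h=1e-4, s_max=60.0):
    # Midpoint rule for (lam^{j+1}/j!) * Int_0^inf s^j e^{-s lam} e^{s k} ds,
    # the alpha = 0, scalar case of (7.60).
    total, s = 0.0, 0.5 * h
    while s < s_max:
        total += s ** j * math.exp(-s * lam) * math.exp(s * k) * h
        s += h
    return lam ** (j + 1) / math.factorial(j) * total

lhs = (2.0 / (2.0 - (-1.0))) ** 3        # lam^{j+1} (lam - k)^{-j-1} with lam=2, k=-1, j=2
print(lhs, resolvent_power_via_laplace(2.0, -1.0, 2))   # both close to 8/27
```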

Proposition 7.13. Let the constants $C_0$ and $C_1$ be such that $C_1 \ge 1$ and
\[
\left\|t^\ell K^\ell e^{tK}\right\| \le (\ell+1)!\,C_0^2\,C_1^\ell, \qquad t \ge 0,\ \ell \in \mathbb{N}. \qquad (7.65)
\]
Then the following inequality is valid:
\[
|\lambda|\left\|(\lambda I - K)^{-1}\right\| \le \frac{27\cdot 6}{4\sqrt{35}}\,C_0^2\,C_1, \qquad \Re\lambda > 0. \qquad (7.66)
\]
Proof. Suppose that $|\arg(s)| \le \alpha$ where $\alpha$ satisfies $0 \le 2C_1\sin\left(\frac{1}{2}\alpha\right) < 1$. Then the measure $e^{sK}\mu$ can be written as
\[
e^{sK}\mu = e^{(s-|s|)K}e^{|s|K}\mu = \sum_{\ell=0}^{\infty}\frac{\left(\frac{s}{|s|} - 1\right)^\ell}{\ell!}\,(|s|K)^\ell e^{|s|K}\mu, \qquad (7.67)
\]
and the representation (7.67) together with (7.65) implies the inequality:
\[
\left\|e^{sK}\right\| \le \frac{C_0^2}{\left(1 - 2C_1\sin\left(\frac{1}{2}\alpha\right)\right)^2}, \qquad |\arg(s)| \le \alpha, \qquad (7.68)
\]
provided $0 < 2C_1\sin\left(\frac{1}{2}\alpha\right) < 1$. Again the representation in (7.60) is available. The inequality in (7.61) is replaced with
\[
|\lambda|^{j+1}\left\|(\lambda I - K)^{-j-1}\right\|
\le \frac{1}{\left(\cos\left(|\arg\lambda| - \alpha\right)\right)^{j+1}}\,\frac{C_0^2}{\left(1 - 2\sin\left(\frac{1}{2}\alpha\right)C_1\right)^2}
\le \frac{1}{\left(\sin\left(\alpha - \alpha'\right)\right)^{j+1}}\,\frac{C_0^2}{\left(1 - 2\sin\left(\frac{1}{2}\alpha\right)C_1\right)^2}, \qquad (7.69)
\]

provided $|\arg(\lambda)| \le \frac{1}{2}\pi + \alpha' < \frac{1}{2}\pi + \alpha$. Since $2(j+3)C_1 > j+1$, the angle $\alpha$ can be chosen in such a way that $2(j+3)C_1\sin\left(\frac{1}{2}\alpha\right) = j+1$ to obtain the estimate (note that $C_1 \ge 1$ and take $\alpha' = 0$):
\[
\begin{aligned}
|\lambda|^{j+1}\left\|(\lambda I - K)^{-j-1}\right\|
&\le \frac{1}{4}\,\frac{(j+3)^{j+3}\,2^{j+1}\,C_1^{j+1}\,(j+3)^{j+1}}{(j+1)^{j+1}\left(4C_1^2(j+3)^2 - (j+1)^2\right)^{(j+1)/2}}\,C_0^2\,C_1^{j+1}\\
&\le \frac{1}{4}\,\frac{(j+3)^{j+3}\,2^{j+1}\,(j+3)^{j+1}}{(j+1)^{j+1}\left(4(j+3)^2 - (j+1)^2\right)^{(j+1)/2}}\,C_0^2\,C_1^{j+1} \qquad (7.70)
\end{aligned}
\]
for $\Re\lambda > 0$. If $j = 0$, (7.70) reduces to (7.66).


Remark 7.14. The assumption that C1 ≥ 1 in the inequalities in (7.65) is not
too surprising. In fact from (7.65) it follows that C1 ≥ 1 (by the spectral
mapping theorem).
Corollary 7.15. Let the operator $L$ be the generator of a $T_\beta$-continuous Feller semigroup in $C_b(E)$. Suppose that $L$ is sectorial. Then its adjoint $K = L^*$ is a sectorial sub-Kolmogorov operator as in Definition 7.2. Moreover, the graph of the operator $K$ is weak$^*$-closed and $K$ generates a weak$^*$-continuous bounded analytic semigroup on $M(E)$.
Proof. As in Theorem 7.7, $K$ has the additional property that for some $\lambda_0 \in \mathbb{C}$ with $\Re\lambda_0 > 0$ the range of $\lambda_0 I - K$ coincides with $M(E)$. Then for all $\lambda \in \mathbb{C}$ with $\Re\lambda > 0$ the operator $(\lambda I - K)^{-1}$ exists as a bounded linear operator which is defined on all of $M(E)$. Since $L$ is sectorial it follows that for the operator $L$ the following inequality holds for all $\lambda \in \mathbb{C}$ with $\Re\lambda \ge 0$ and for all $f \in D(L)$:
\[
|\lambda|\,\|f\|_\infty \le C\,\|(\lambda I - L)f\|_\infty. \qquad (7.71)
\]
Of course, from (7.71) we see:
\[
|\lambda|\left\|(\lambda I - L)^{-1}f\right\|_\infty \le C\,\|f\|_\infty, \qquad f \in C_b(E),\ \Re\lambda > 0. \qquad (7.72)
\]
From (7.72) we obtain, by duality,
\[
|\lambda|\,\mathrm{Var}\left((\lambda I - K)^{-1}\mu\right) \le C\,\mathrm{Var}(\mu), \qquad \mu \in M(E),\ \Re\lambda > 0. \qquad (7.73)
\]
We still have to prove that the operator $K$ is a sub-Kolmogorov operator. This can be achieved as follows. Let $\Re\mu$ be the real part of the measure $\mu \in D(K)$. Then there exists a Borel subset $E^+_{\Re\mu}$ on which $\Re\mu$ is a positive measure and which has the property that
\[
\sup\{\Re\langle f, \mu\rangle : 0 \le f \le 1\} = \Re\mu\left(E^+_{\Re\mu}\right). \qquad (7.74)
\]
Choose compact subsets $C_n$ and open subsets $O_n$ of $E$ such that $C_n \subset E^+_{\Re\mu} \subset O_n$, and such that $\lim_{n\to\infty}|\Re\mu|(O_n \setminus C_n) = 0$. Since $\mu \in D(K)$ we get
\[
\Re\mu\left(E^+_{\Re\mu}\right) = \Re\left\langle \mathbf{1}_{E^+_{\Re\mu}}, \mu\right\rangle
= \lim_{\lambda\to\infty}\Re\left\langle \mathbf{1}_{E^+_{\Re\mu}}, \lambda(\lambda I - K)^{-1}\mu\right\rangle
= \lim_{\lambda\to\infty}\lim_{n\to\infty}\Re\left\langle f_n, \lambda(\lambda I - K)^{-1}\mu\right\rangle
= \lim_{\lambda\to\infty}\lim_{n\to\infty}\Re\left\langle \lambda(\lambda I - L)^{-1}f_n, \mu\right\rangle, \qquad (7.75)
\]

where $\mathbf{1}_{C_n} \le f_n \le \mathbf{1}_{O_n}$, $f_n \in C_b(E)$. Fix $\lambda > 0$ and consider the function $g_{\lambda,n} := \lambda(\lambda I - L)^{-1}f_n$, which satisfies $0 \le g_{\lambda,n} \le 1$. Moreover, we have
\[
\begin{aligned}
\Re\left\langle \lambda(\lambda I - L)^{-1}f_n, K\mu\right\rangle
&= \Re\left\langle \lambda L(\lambda I - L)^{-1}f_n, \mu\right\rangle
= \Re\left\langle \lambda^2(\lambda I - L)^{-1}f_n - \lambda f_n, \mu\right\rangle\\
&= \Re\left\langle \mathbf{1}_{E\setminus O_n}\,\lambda^2(\lambda I - L)^{-1}f_n, \mu\right\rangle
+ \Re\left\langle \mathbf{1}_{O_n\setminus C_n}\left(\lambda^2(\lambda I - L)^{-1}f_n - \lambda f_n\right), \mu\right\rangle\\
&\quad+ \Re\left\langle \mathbf{1}_{C_n}\left(\lambda^2(\lambda I - L)^{-1}f_n - \lambda f_n\right), \mu\right\rangle. \qquad (7.76)
\end{aligned}
\]
Since the measure $\Re\mu$ is negative on $E\setminus O_n$ (this set is disjoint from $E^+_{\Re\mu}$) and the function $g_{\lambda,n}$ is nonnegative, the first term on the right-hand side of (7.76) is less than or equal to zero. The function $g_{\lambda,n}$ satisfies $g_{\lambda,n} \le 1$ and the measure $\Re\mu$ is positive on $C_n$, and hence the third term in (7.76) is less than or equal to zero as well. Here we also used the fact that $f_n = 1$ on $C_n$. The middle term in the right-hand side of (7.76) is dominated by
\[
2\lambda\left\langle \mathbf{1}_{O_n\setminus C_n}\,g_{\lambda,n}, |\Re\mu|\right\rangle \le 2\lambda\,|\Re\mu|(O_n \setminus C_n). \qquad (7.77)
\]
Inserting (7.77) in (7.76) and using the fact that the first and the third term of the right-hand side of (7.76) are dominated by 0 shows the inequality:
\[
\Re\left\langle \lambda(\lambda I - L)^{-1}f_n, K\mu\right\rangle \le 2\lambda\,|\Re\mu|(O_n \setminus C_n). \qquad (7.78)
\]
Since $\lim_{n\to\infty}|\Re\mu|(O_n\setminus C_n) = 0$, from (7.74), (7.75), and (7.78) we infer that the operator $K$ is a sub-Kolmogorov operator: see Definition 7.2.

Remark 7.16. In fact in Section 7.2 we will need an inequality of the form
\[
|\lambda|\,\mathrm{Var}(\mu) \le C\,\mathrm{Var}((\lambda I - K)\mu), \qquad \Re\lambda > 0,\ \mu \in M(E). \qquad (7.79)
\]
In the presence of (7.79) the operator $K$ generates a bounded analytic semigroup; see Theorem 7.50 below. This is the case if $K = L^*$, where $L$ is an operator with domain and range in $C_b(E)$ with the property that
\[
|\lambda|\,\|f\|_\infty \le C\,\|(\lambda I - L)f\|_\infty, \qquad \Re\lambda > 0,\ f \in D(L).
\]


The following theorem is related to a similar result for continuous function


spaces rather than for measures by Cerrai (see [49] and Appendix B in [50]).
In Kühnemund (see [140]) the reader may find a generalization of such a
result in the context of so-called bi-continuous semigroups. The notion of
strongly continuous semigroup is replaced with bi-continuity in the sense that
the convergence of semigroups is always assumed with respect to the topology
τ , whereas the boundedness is always meant in the norm sense. The notion
of (infinitesimal) generator is also adapted: for τ -generators convergence is
considered in the τ -sense, and boundedness is phrased in terms of the norm.
In the present situation the Banach space is the space of all bounded signed
measures on E endowed with the variation norm and the topology τ is the
weak∗ -topology. A related paper is [74] by Dorroh and Neuberger. A result
which includes Theorem 7.17 below is formulated in Bratteli and Robinson
[42] as Theorem 3.1.10 page 171.
Theorem 7.17. Let $K$ be a weak$^*$-closed linear operator with weak$^*$-dense domain in $M(E)$. Suppose that $K$ possesses the sub-Kolmogorov property in the sense of Definition 7.2. Fix $\lambda_0 > 0$ and suppose that for every $x \in E$ there exists a measure $\mu_x^{\lambda_0}$ such that
\[
\lambda_0\delta_x = (\lambda_0 I - K)\mu_x^{\lambda_0}. \qquad (7.80)
\]
Then there exists a weak$^*$-continuous semigroup $S(t) := e^{tK}$, $t \ge 0$, such that
\[
\lim_{t\downarrow 0}\frac{\left\langle f, \left(e^{tK} - I\right)\mu\right\rangle}{t} = \langle f, K\mu\rangle, \qquad \text{for all } f \in C_b(E) \text{ and } \mu \in D(K).
\]
From Theorem 7.7 it follows that the measures $\mu_x^{\lambda_0}$, $x \in E$, are sub-probability measures. If $\langle \mathbf{1}, K\mu\rangle = 0$ for all $\mu \in D(K)$, then these measures are probability measures.
Proof. We will show that our assumptions imply the conditions set forth in Theorem 3.1.10 of [42]. Assertion (3) of Theorem 7.7 implies
\[
\lambda\,\|\mu\| \le \|(\lambda I - K)\mu\|, \qquad \lambda > 0,\ \mu \in D(K), \qquad (7.81)
\]
where $\|\mu\|$ denotes the norm of $\mu$ as defined in (7.2). The inequality in (7.81) is the first condition which is required to apply Theorem 3.1.10. Let $\mu$ be a measure in $M(E)$. Then by (7.80) we have
\[
\lambda_0\mu = \int_E \lambda_0\delta_y\,d\mu(y) = \int_E (\lambda_0 I - K)\mu_y^{\lambda_0}\,d\mu(y) = (\lambda_0 I - K)\int_E \mu_y^{\lambda_0}\,d\mu(y),
\]
and so the range of $\lambda_0 I - K$ coincides with $M(E)$. Hence, the result in Theorem 7.17 follows from Theorem 3.1.10 in [42].
Since the operators $K(t)$, $t \ge t_0$, in equation (7.6) are supposed to have the Kolmogorov property, the evolution family $X(t,s)$, $t \ge s \ge t_0$, consists of Markov operators in the sense that $\langle f, X(t,t_0)\mu\rangle \ge 0$ whenever $f \in C_b(E)$ is non-negative and $\mu$ belongs to $P(E)$; in addition, $\langle \mathbf{1}, X(t,t_0)\mu\rangle = 1$ for $\mu \in P(E)$. Since all operators $K(t)$ are Kolmogorov it follows that $X(t,t_0)$ is Markov for all $t \ge t_0$. This can be seen by the following approximation argument. Fix $t_0 < T$ and put
\[
K_n(t) = K\left(t_0 + (T - t_0)\,2^{-n}\left\lfloor 2^n\,\frac{t - t_0}{T - t_0}\right\rfloor\right) = K(\varphi_n(t)), \qquad (7.82)
\]
where
\[
\varphi_n(t) = t_0 + (T - t_0)\,2^{-n}\left\lfloor 2^n\,\frac{t - t_0}{T - t_0}\right\rfloor.
\]
Then $K_n(t) = K\left(t_0 + (T - t_0)j2^{-n}\right)$ for
\[
t_0 + (T - t_0)\frac{j}{2^n} \le t < t_0 + (T - t_0)\frac{j+1}{2^n}.
\]
Solutions to the system $\dot\mu(t) = K(t)\mu(t)$, $t_0 \le t \le T$, are approximated by solutions to the equation:
\[
\dot\mu_n(t) = K_n(t)\mu_n(t), \qquad t_0 \le t \le T. \qquad (7.83)
\]
A solution to (7.83) can be written in the form $\mu_n(t) = X_n(t,t_0)\mu_n(t_0)$, with
\[
X_n(t,s) = e^{(t - t_{\ell,n})K(t_{\ell,n})}\prod_{j=k+1}^{\ell-1} e^{(t_{j+1,n} - t_{j,n})K(t_{j,n})}\; e^{(t_{k+1,n} - s)K(t_{k,n})}, \qquad (7.84)
\]
where $t_0 \le s \le t \le T$, $t_{j,n} = t_0 + (T - t_0)\frac{j}{2^n}$, $0 \le j \le 2^n$, $t_{k,n} \le s < t_{k+1,n}$, and $t_{\ell,n} \le t < t_{\ell+1,n}$. We also need Duhamel's formula:
\[
(X_n(t,t_0) - X_m(t,t_0))\mu = \int_{t_0}^t X_n(t,s)(K_n(s) - K_m(s))X_m(s,t_0)\mu\,ds. \qquad (7.85)
\]
In (7.85) we let $m \to \infty$ and use weak$^*$-convergence to obtain:
\[
(X_n(t,t_0) - X(t,t_0))\mu = \int_{t_0}^t X_n(t,s)(K_n(s) - K(s))X(s,t_0)\mu\,ds. \qquad (7.86)
\]
Of course, we assume that the sequences
\[
\int_{t_0}^t X_n(t,s)K_n(s)X_m(s,t_0)\mu\,ds \quad\text{and}\quad \int_{t_0}^t X_n(t,s)K_m(s)X_m(s,t_0)\mu\,ds
\]
converge in weak$^*$-sense to
\[
\int_{t_0}^t X_n(t,s)K_n(s)X(s,t_0)\mu\,ds \quad\text{and}\quad \int_{t_0}^t X_n(t,s)K(s)X(s,t_0)\mu\,ds
\]
respectively as $m \to \infty$.
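The dyadic freezing in (7.82)-(7.84) can be visualized in the simplest finite-dimensional situation, with $M(E)$ replaced by probability vectors on a two-point state space and $K(t)$ by a time-dependent $2\times 2$ rate matrix whose columns sum to zero. The sketch below (all names, and the choice $K(t) = c(t)K_0$, are ours) builds the product (7.84) over the dyadic grid and checks that it preserves mass and positivity and converges as $n$ grows:

```python
import math

def expm_rate(h):
    # Closed form of exp(h*K0) for the 2x2 rate matrix K0 = [[-1, 1], [1, -1]]
    # (columns sum to zero, so mass is conserved): exp(h*K0) = I + (1 - e^{-2h})/2 * K0.
    w = (1.0 - math.exp(-2.0 * h)) / 2.0
    return [[1.0 - w, w], [w, 1.0 - w]]

def matvec(A, v):
    return [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]

def X_n(t, t0, T, n, c, mu):
    # Dyadic piecewise-constant approximation in the spirit of (7.82)-(7.84):
    # freeze K(s) = c(s)*K0 at the left endpoint of each dyadic subinterval.
    dt = (T - t0) / 2 ** n
    s = t0
    while s < t - 1e-12:
        h = min(dt, t - s)
        mu = matvec(expm_rate(c(s) * h), mu)
        s += h
    return mu

c = lambda s: 1.0 + s                        # a hypothetical time-dependent rate
mu10 = X_n(1.0, 0.0, 1.0, 10, c, [0.9, 0.1])
exact = matvec(expm_rate(1.5), [0.9, 0.1])   # exact here: all K(s) commute, Int_0^1 c = 3/2
print(mu10, exact, sum(mu10))
```

Because the matrices $c(s)K_0$ commute in this toy case, the exact propagator is $\exp\left(\int_0^1 c(s)\,ds\,K_0\right)$; in general no such closed form exists and the ordered product (7.84) is the natural substitute.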

Theorem 7.18. Let the sequences $\{K_n(t) : n \in \mathbb{N}\}$ and $\{X_n(t,t_0) : n \in \mathbb{N}\}$ be as in (7.82) and in (7.84). Suppose that for all $m \in \mathbb{N}$ and all $t_0 \le t_1 \le t_2 \le T$ the measure $X_m(t_2,t_0)\mu$ belongs to $D(K(t_1))$ for all measures $\mu \in M(E)$. Also suppose that for every probability measure $\mu \in M(E)$ the family of measures $\{K_n(t)X_m(t,t_0)\mu : t_0 \le t \le T,\ 1 \le n \le m\}$ is $T_\beta$-equi-continuous, i.e. there exists a function $u \in H(E)$ such that
\[
\sup_{t_0\le t\le T}\,\sup_{n\le m}\left|\langle f, K_n(t)X_m(t,t_0)\mu\rangle\right| \le \|uf\|_\infty, \qquad f \in C_b(E). \qquad (7.87)
\]
(a) Then $X(t,t_0)\mu := \|\cdot\|\text{-}\lim_{n\to\infty}X_n(t,t_0)\mu$ exists and $\mu(t) := X(t,t_0)\mu$ satisfies $\dot\mu(t) = K(t)\mu(t)$, provided that for all $t_0 < s \le T$
\[
\lim_{t\uparrow s}\|(K(t) - K(s))\mu\| = 0 \qquad (7.88)
\]
for all measures $\mu \in \bigcap_{s-h<t<s}D(K(t))$ for some $h > 0$.
(b) Suppose that for every $s, t \in [t_0,T]$, $s \le t$, the sequence $\{X_n(t,s) : n \in \mathbb{N}\}$ is uniformly weak$^*$-continuous, and that for all measures $\mu \in \bigcap_{s-h<t<s}D(K(t))$ the following equality holds:
\[
\text{weak}^*\text{-}\lim_{t\uparrow s}K(t)\mu = K(s)\mu. \qquad (7.89)
\]
Then $X(t,t_0)\mu := \text{weak}^*\text{-}\lim_{n\to\infty}X_n(t,t_0)\mu$, $\mu \in M(E)$, exists and $\mu(t) := X(t,t_0)\mu$ satisfies $\dot\mu(t) = K(t)\mu(t)$.
For more details on $T_\beta$-equi-continuous families of measures see Theorem 1.8. The sequence $\{X_n(t,s) : n \in \mathbb{N}\}$ is called uniformly weak$^*$-continuous if for every function $f \in C_b(E)$ and every measure $\mu \in M(E)$ the sequence of continuous functions $(s,t) \mapsto \langle f, X_n(t,s)\mu\rangle$, $t_0 \le s \le t \le T$, $n \in \mathbb{N}$, is uniformly continuous. See Remark 1.14 as well.
Let $u \ge 0$ be a function in $H(E)$; i.e. for every $\alpha > 0$ the set $\{u \ge \alpha\}$ is contained in a compact subset of $E$. In the proof we apply the Banach-Alaoglu theorem to the effect that the collection of measures
\[
B_u = \bigcap_{f\in C_b(E)}\{\mu \in M(E) : |\langle f, \mu\rangle| \le \|uf\|_\infty\} \qquad (7.90)
\]
is $\sigma(M(E), C_b(E))$-compact: see Theorem 1.15. As a consequence, we see that every sequence in the collection $B_u$ defined in (7.90) has a $\sigma(M(E), C_b(E))$-convergent subsequence. Here we use the fact that the space $C_b(E)$ endowed with the strict topology is separable; i.e. $C_b(E)$ contains a $T_\beta$-dense countable subset.

Proof. By hypothesis (7.87) both terms in the right-hand side of Duhamel's formula (7.85) are $T_\beta$-equi-continuous. So there exists a function $u \in H(E)$ such that
\[
\sup_{t_0\le t\le T}\,\sup_{n\le m}\left|\left\langle f, \int_{t_0}^t X_n(t,s)K_n(s)X_m(s,t_0)\mu\,ds\right\rangle\right| \le \|uf\|_\infty \quad\text{and} \qquad (7.91)
\]
\[
\sup_{t_0\le t\le T}\,\sup_{n\le m}\left|\left\langle f, \int_{t_0}^t X_n(t,s)K_m(s)X_m(s,t_0)\mu\,ds\right\rangle\right| \le \|uf\|_\infty \qquad (7.92)
\]
for all $f \in C_b(E)$. By the Banach-Alaoglu theorem we may assume that through a subsequence $(m_j)$ the weak$^*$ limit in the right-hand side of (7.85) exists for all $t_0 \le t \le T$, and that therefore the weak$^*$ limit of the sequence $X_{m_j}(t,t_0)\mu$ exists as well for all $t_0 \le t \le T$. Again employing the $T_\beta$-equi-continuity condition in (7.87) we may assume that, for every $n \in \mathbb{N}$, the weak$^*$ limit $\text{weak}^*\text{-}\lim_{j\to\infty}K_n(s)X_{m_j}(s,t_0)\mu$ exists. Since, in addition, the operators $X_n(t,s)$ are continuous for the weak$^*$-topology, we let $m \to \infty$ along an appropriate subsequence and use weak$^*$-convergence to obtain:
\[
(X_n(t,t_0) - X(t,t_0))\mu = \int_{t_0}^t X_n(t,s)(K_n(s) - K(s))X(s,t_0)\mu\,ds. \qquad (7.93)
\]
Our extra hypothesis (7.88) then completes the proof of assertion (a) of Theorem 7.18. The assumption that for every $s, t \in [t_0,T]$, $s \le t$, the sequence $\{X_n(t,s) : n \in \mathbb{N}\}$ is uniformly weak$^*$-continuous, together with $\text{weak}^*\text{-}\lim_{t\uparrow s}K(t)\mu = K(s)\mu$, completes the proof of assertion (b) of Theorem 7.18 as well.

Remark 7.19. Under $T_\beta$-equi-continuity conditions the sequences $X_m(s,t_0)\mu$ and $K_m(s)X_m(s,t_0)\mu$ possess subsequences which converge in weak$^*$-sense for all $t_0 \le s \le T$. The Kolmogorov property of the operator function $K(t)$ entails that solutions $\mu_n(t)$ of (7.83) are non-negative, i.e. $\langle f, \mu_n(t)\rangle \ge 0$ for $f \ge 0$, $f \in C_b(E)$, and take their values in the simplex $P(E)$ for each initial condition $\mu(t_0) \in P(E)$. The latter is true because if $\langle \mathbf{1}, \mu_n(t_0)\rangle = 1$, then $\langle \mathbf{1}, \mu_n(t)\rangle = 1$ for all $T \ge t \ge t_0$. Consequently, the mappings $\mu_n(t_0) \mapsto \mu_n(t)$, $t \ge t_0$, leave the simplex $P(E)$ invariant, provided that $\mu_n(t)$ is a solution to (7.83). Passing to the limit in (7.83) yields the desired result. This passage can be justified under certain conditions. If the function $\mu_n(t)$ satisfies (7.83), then
\[
\mu_n(t) = \mu_n(t_0) + \int_{t_0}^t K_n(s)\mu_n(s)\,ds.
\]

Example 7.20. Let $K$ be a weak$^*$-closed Kolmogorov operator with weak$^*$-dense domain, and let $p(t,x)$ be a Borel measurable strictly positive function defined on $[t_0,T]\times E$ with the property that for every $x \in E$ the function $t \mapsto p(t,x)$ is continuous. Define the families of operators $K_1(t)$ and $K_2(t)$, $t \in [t_0,T]$, by
\[
K_1(t)\mu(B) = \int_B p(t,x)\,(K\mu)(dx) \quad\text{and}\quad K_2(t)\mu(B) = \int_B K(p(t,\cdot)\mu)(dx).
\]
Suppose that $K$ has the following property, which is somewhat stronger than the standard Kolmogorov property of Definition 7.2: for every $\mu \ge 0$, $\mu \in D(K)$, and every $B \in \mathcal{E}$ for which $\mu(B) = 0$ we have $K\mu(B) \ge 0$. Then the operators $K_1(t)$ and $K_2(t)$ share this stronger Kolmogorov property. Fix $\lambda_0 > 0$ and suppose that for every $x \in E$ there exists a measure $\mu_x^{t,\lambda_0}$ such that $\lambda_0\delta_x = (\lambda_0 - p(t,\cdot)K)\mu_x^{t,\lambda_0}$. Then the operator $K_1(t)$ generates a weak$^*$-continuous semigroup: see Theorem 7.17. If for every $x \in E$ there exists a measure $\nu_x^{t,\lambda_0}$ such that $\lambda_0\,p(t,\cdot)\delta_x = (\lambda_0 - p(t,\cdot)K)\nu_x^{t,\lambda_0}$, then the measure $\mu_x^{t,\lambda_0}$ defined by the equality $\nu_x^{t,\lambda_0} = p(t,\cdot)\mu_x^{t,\lambda_0}$ satisfies $\lambda_0\delta_x = (\lambda_0 - Kp(t,\cdot))\mu_x^{t,\lambda_0}$. Hence, by Theorem 7.17 the operator $K_2(t)$ generates a weak$^*$-continuous semigroup in $M(E)$. If the function $(t,x)\mapsto p(t,x)$ is uniformly bounded, then the results of (a) in Theorem 7.18 are applicable for the family $K_1(t)$. If the domains of the operators $K_2(t)$ do not depend on $t \in [t_0,T]$, then the results of (b) in Theorem 7.18 are applicable for the family $K_2(t)$.

Example 7.21. A better example is a family of operators $K(t)$, $t \ge 0$, which are adjoints of operators $L(t)$ with domain and range in $C_b(E)$, i.e. $K(t) = L(t)^*$, which generate a time-dependent strong Markov process
\[
\{(\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x}),\ (X(t) : t \ge \tau),\ (E, \mathcal{E})\}
\]
such that
\[
\frac{\partial}{\partial t}\,\mathbb{E}_{\tau,x}[f(t,X(t))] = \mathbb{E}_{\tau,x}[(D_1 + L(t))f(t,X(t))], \qquad f \in D(D_1)\cap D(L(t)),
\]
where $0 \le \tau < t < \infty$. The operator $D_1$ stands for the derivative with respect to time: see Definition 1.31. We put $Y(\tau,t)f(x) = \mathbb{E}_{\tau,x}[f(X(t))]$, $f \in C_b(E)$, and $X(t,\tau)\mu = Y(\tau,t)^*\mu$, $\mu \in M(E)$. This means that $\langle Y(\tau,t)f, \mu\rangle = \langle f, X(t,\tau)\mu\rangle$, $f \in C_b(E)$, $\mu \in M(E)$. Put $P(\tau,x;t,B) = \mathbb{P}_{\tau,x}[X(t) \in B]$, $0 \le \tau \le t < \infty$, $B \in \mathcal{E}$. Then
\[
Y(\tau,t)f(x) = \int f(y)\,P(\tau,x;t,dy), \qquad f \in C_b(E),\ 0 \le \tau \le t < \infty. \qquad (7.94)
\]
Hence,
\[
\langle f, X(t,\tau)\mu\rangle = \int\!\!\int f(y)\,P(\tau,x;t,dy)\,d\mu(x), \qquad f \in C_b(E),\ 0 \le \tau \le t < \infty. \qquad (7.95)
\]

It is assumed that for every $t \ge 0$ the operator $L(t)$ generates a bounded analytic Feller semigroup $e^{sL(t)}$, $|\arg s| \le \alpha(t)$. In addition, assume that the operator $K(t) = L(t)^*$ has a spectral gap of width $2\omega(t)$, and that $|\lambda|\left\|(\lambda I - L(t))^{-1}\right\| \le c(t)$ for $\Re\lambda \ge -\omega(t)$, $\lambda \ne 0$. It follows that the operators $L(t)$ generate analytic semigroups $e^{sL(t)}$ where $s \in \mathbb{C}$ belongs to a sector with a positive angle of opening. Then it follows that there exist a constant $c(t)$ and an angle $\frac{1}{2}\pi < \beta(t) < \pi$ such that
\[
|\lambda|\left\|(\lambda I - L(t))^{-1}\right\| \le c(t), \qquad \text{for all } \lambda \in \mathbb{C} \text{ with } |\arg(\lambda)| \le \beta(t). \qquad (7.96)
\]
For a proof see Theorem 7.50 and its corollaries 7.51 and 7.52. Let $e^{sL(t)}$, $s \ge 0$, be the (analytic) semigroup generated by the operator $L(t)$. Then the (unbounded) inverse of the operator $-L(t)$ is given by the strong integral $f \mapsto \int_0^\infty e^{sL(t)}f\,ds$. From (7.225) it follows that for $\mu \in M_0(E)$ and $\Re\lambda > 0$ the inequality
\[
|\lambda|\left|\left\langle g, \left(\lambda I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)^{-1}\mu\right\rangle\right| \le \|g\|_\infty\,\mathrm{Var}(\mu) \qquad (7.97)
\]
holds whenever the function $g$ is of the form $g = \lambda f - L(t)f$, with $f \in D(L(t))$. Here $M_0(E)$ is the space of all complex Borel measures $\mu$ on $E$ with the property that $\mu(E) = 0$: see (7.5). Suppose that $\mathrm{Var}\left(e^{sL(t)^*}\mu\right) \le c(t)e^{-2\omega(t)s}\,\mathrm{Var}(\mu)$ for all $\mu \in M_0(E)$ and $s \ge 0$. Then for $\Re\lambda \ge \omega(t)$, $g \in C_0(E)$ and $\mu \in M_0(E)$ we have
\[
(\lambda - 2\omega(t))\left\langle g, \left((\lambda - 2\omega(t))I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)^{-1}\mu\right\rangle
= (\lambda - 2\omega(t))\int_0^\infty\left\langle g, e^{-s\left((\lambda - 2\omega(t))I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)}\mu\right\rangle ds, \qquad (7.98)
\]
and hence, if $|\lambda - 2\omega(t)| \le 2\omega(t)$, we have
\[
\begin{aligned}
|\lambda - 2\omega(t)|&\left|\left\langle g, \left((\lambda - 2\omega(t))I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)^{-1}\mu\right\rangle\right|\\
&\le |\lambda - 2\omega(t)|\int_0^\infty\left|\left\langle g, e^{-s\left((\lambda - 2\omega(t))I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)}\mu\right\rangle\right| ds\\
&\le |\lambda - 2\omega(t)|\int_0^\infty e^{-s(\Re\lambda - 2\omega(t))}\,\mathrm{Var}\left(e^{sL(t)^*}|_{M_0(E)}\mu\right) ds\,\|g\|_\infty\\
&\le c(t)\,|\lambda - 2\omega(t)|\int_0^\infty e^{-s(\Re\lambda - 2\omega(t))}e^{-2s\omega(t)}\,ds\,\mathrm{Var}(\mu)\,\|g\|_\infty\\
&= c(t)\,\frac{|\lambda - 2\omega(t)|}{\Re\lambda}\,\|g\|_\infty\,\mathrm{Var}(\mu) \le 2c(t)\,\|g\|_\infty\,\mathrm{Var}(\mu). \qquad (7.99)
\end{aligned}
\]

In view of (7.96), (7.97) and (7.99) it makes sense to consider the largest $\omega(t)$ with the property that for all functions $g \in C_0(E)$ and all Borel measures $\mu \in M_0(E)$ the complex-valued function
\[
\lambda \mapsto \lambda\left\langle g, \left(\lambda I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)^{-1}\mu\right\rangle
\]
extends to a bounded holomorphic function on all half-planes of the form
\[
\{\lambda \in \mathbb{C} : \Re\lambda > -2\omega'(t)\}
\]
with $\omega'(t) < \omega(t)$. It follows that there exists a constant $c(t)$ such that for all functions $g \in C_b(E)$ and $\mu \in M_0(E)$ the following inequality holds:
\[
|\lambda|\left|\left\langle g, \left(\lambda I|_{M_0(E)} - L(t)^*|_{M_0(E)}\right)^{-1}\mu\right\rangle\right| \le c(t)\,\|g\|_\infty\,\mathrm{Var}(\mu), \qquad \Re\lambda \ge -\omega(t).
\]
The following definition is to be compared with Definitions 7.33 and 8.54 (in Chapter 8).
Definition 7.22. The number $2\omega(t)$ is called the $M(E)$-spectral gap of the operator $L(t)^*$. It is also called the uniform or $L^\infty$-spectral gap of the operator $L(t)$.
Next let $P(\tau,x;t,B)$ be the transition probability function of the process
\[
\{(\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x}),\ (X(t) : t \ge \tau),\ (E, \mathcal{E})\}
\]
generated by the operators $L(t)$. Suppose that, for every $\tau \in (0,\infty)$ and every Borel probability measure $\mu$ on $E$, the following condition is satisfied:
\[
\lim_{t\to\infty}\frac{c(t)}{\omega(t)}\int_E \mathrm{Var}\left(\frac{\partial}{\partial t}P(\tau,x;t,\cdot)\right) d\mu(x) = 0.
\]
Let $\mu$ be any Borel probability measure on $E$. Put $\mu(t) = Y(\tau,t)^*\mu$, where
\[
Y(\tau,t)f(x) = \mathbb{E}_{\tau,x}[f(X(t))], \qquad f \in C_b(E).
\]
Then $\dot\mu(t) = L(t)^*\mu(t)$. Moreover,
\[
\lim_{t\to\infty}\frac{c(t)}{\omega(t)}\,\mathrm{Var}(\dot\mu(t)) = 0.
\]
We will show this. With the above notation we have:

Var (µ̇(t))
½¯ ¯ ¾
¯d ¯
¯ ¯
= sup ¯ hf, µ(t)i¯ : f ∈ Cb (E), kf k∞ = 1
dt
½¯ ¯ ¾
¯∂ ¯
¯ ¯
= sup ¯ hY (τ, t) f, µi¯ : f ∈ Cb (E), kf k∞ = 1
∂t
½¯ Z Z ¯ ¾
¯∂ ¯
= sup ¯ ¯ ¯
f (y)P (τ, x; t, dy) dµ(x)¯ : f ∈ Cb (E), kf k∞ = 1
∂t
½¯Z E E Z ¯ ¾
¯ ∂ ¯
¯
= sup ¯ f (y) ¯
P (τ, x; t, dy) dµ(x)¯ : f ∈ Cb (E), kf k∞ = 1
E ∂t E
318 7 On non-stationary Markov processes and Dunford projections
µ Z ¶ Z µ ¶
∂ ∂
= Var P (τ, x; t, ·) dµ(x) ≤ Var P (τ, x; t, ·) dµ(x).
∂t E E ∂t
(7.100)

If the probability measure $B \mapsto P(\tau,x;t,B)$ has density $p(\tau,x;t,y)$, then the total variation of the measure $B \mapsto \frac{\partial}{\partial t}P(\tau,x;t,B)$ is given by
\[
\mathrm{Var}\left(\frac{\partial}{\partial t}P(\tau,x;t,\cdot)\right) = \int_E\left|\frac{\partial}{\partial t}p(\tau,x;t,y)\right| dy. \qquad (7.101)
\]
If there exists a unique $P(E)$-valued function $t \mapsto \pi(t)$ such that $L(t)^*\pi(t) = 0$, then the system $\dot\mu(t) = L(t)^*\mu(t)$ is ergodic. This assertion follows from Theorem 7.36 below.

7.3.1 Ornstein-Uhlenbeck process

The simplest example of this kind of process is the following one.

Example 7.23. In this example we consider the generator $L := \frac{1}{2}\Delta - x\cdot\nabla$ of the so-called Ornstein-Uhlenbeck process in $C_b\left(\mathbb{R}^d\right)$: see Theorem 1.19, assertion (d), in section E of Demuth et al [70]. There exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ together with an $\mathbb{R}^d$-valued Gaussian process $\{X(s) : s \ge 0\}$, called the Ornstein-Uhlenbeck process, such that $\mathbb{E}(X(s)) = 0$ and such that
\[
\begin{aligned}
\mathbb{E}(X_j(s_1)X_k(s_2)) &= \frac{1}{2}\exp(-(s_1+s_2))\left(\exp\left(2\min(s_1,s_2)\right) - 1\right)\delta_{j,k} \qquad (7.102)\\
&= \frac{1}{2}\left(\exp(-|s_1 - s_2|) - \exp(-(s_1+s_2))\right)\delta_{j,k}, \qquad (7.103)
\end{aligned}
\]
for all $s_1, s_2 \ge 0$, and for $1 \le j,k \le d$. Put $X^x(t) = \exp(-t)x + X(t)$. Then the process $\{X^x(t) : t \ge 0\}$ is the Ornstein-Uhlenbeck process with initial velocity $x$. Let $f : \mathbb{R}^d \to \mathbb{C}$ be a bounded Borel measurable function. Then $\mathbb{E}[f(X^x(t))]$ is given by
\[
\mathbb{E}[f(X^x(t))] = \int f\left(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\right)\frac{\exp\left(-|y|^2\right)}{(\sqrt\pi)^d}\,dy.
\]

Moreover, the Ornstein-Uhlenbeck process is a strong Markov process. This is also true for Brownian motion and for the oscillator process. Its integral kernel $p_0(t,x,y)$ is given by
\[
p_0(t,x,y) = \frac{1}{\left(1 - e^{-2t}\right)^{d/2}}\exp\left(-\frac{e^{-2t}|x|^2 + e^{-2t}|y|^2 - 2e^{-t}\langle x,y\rangle}{1 - e^{-2t}}\right).
\]
The semigroup in $C_b\left(\mathbb{R}^d\right)$ is given by
\[
[\exp(tL)f](x) = \int p_0(t,x,y)f(y)\exp\left(-|y|^2\right)\frac{dy}{(\sqrt\pi)^d}
= \frac{1}{(\sqrt\pi)^d}\int f\left(\exp(-t)x + \sqrt{1 - \exp(-2t)}\,y\right)\exp\left(-|y|^2\right) dy.
\]

Its invariant measure is determined by taking the limit:
\[
\lim_{t\to\infty}[\exp(tL)f](x) = \frac{1}{\pi^{d/2}}\int f(y)\,e^{-|y|^2}\,dy.
\]
For more details the reader is referred to e.g. Simon [213]. The joint distributions of the processes (see Theorem 1.19.(d) of [70])
\[
\{X(t) : t \ge 0\} \quad\text{and}\quad \left\{e^{-t}B\left(\left(e^{2t} - 1\right)/2\right) : t \ge 0\right\}
\]
coincide. The process $\{X(t) : t \ge 0\}$ also possesses the same law (i.e. joint distribution) as the process $\left\{\int_0^t \exp(-(t-s))\,dB(s) : t \ge 0\right\}$.
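In dimension $d = 1$ the two expressions above for $[\exp(tL)f](x)$, the kernel form with $p_0(t,x,y)$ and the substitution form, can be compared by direct quadrature. For $f(y) = y^2$ both should equal $e^{-2t}x^2 + \frac{1}{2}\left(1 - e^{-2t}\right)$, since $\int y^2 e^{-y^2}\,dy/\sqrt\pi = \frac{1}{2}$. A sketch (function names ours):

```python
import math

def ou_semigroup_kernel(f, t, x, n=4001, ymax=8.0):
    # [exp(tL)f](x) = Int p0(t,x,y) f(y) exp(-y^2) dy / sqrt(pi), d = 1
    h, total, s = 2 * ymax / (n - 1), 0.0, math.exp(-t)
    for i in range(n):
        y = -ymax + i * h
        p0 = (1 - s*s) ** -0.5 * math.exp(-(s*s*x*x + s*s*y*y - 2*s*x*y) / (1 - s*s))
        total += p0 * f(y) * math.exp(-y*y) * h
    return total / math.sqrt(math.pi)

def ou_semigroup_subst(f, t, x, n=4001, ymax=8.0):
    # [exp(tL)f](x) = Int f(e^{-t}x + sqrt(1 - e^{-2t}) y) exp(-y^2) dy / sqrt(pi)
    h, total = 2 * ymax / (n - 1), 0.0
    for i in range(n):
        y = -ymax + i * h
        total += f(math.exp(-t)*x + math.sqrt(1 - math.exp(-2*t))*y) * math.exp(-y*y) * h
    return total / math.sqrt(math.pi)

t, x = 0.7, 1.3
f = lambda y: y * y
closed = math.exp(-2*t) * x * x + (1 - math.exp(-2*t)) / 2
print(ou_semigroup_kernel(f, t, x), ou_semigroup_subst(f, t, x), closed)
```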
The semigroup generated by $L$ is not a bounded analytic one. This can be seen by rewriting the expression for $\lambda R(\lambda) = \lambda(\lambda I - L)^{-1}$, $\Re\lambda > 0$. For convenience we write:
\[
q(s,x,y) = \frac{1}{\left(1 - s^2\right)^{d/2}}\exp\left(-\frac{s^2|x|^2 + |y|^2 - 2s\langle x,y\rangle}{1 - s^2}\right)
= \frac{1}{\left(1 - s^2\right)^{d/2}}\exp\left(-\frac{|y - sx|^2}{1 - s^2}\right). \qquad (7.104)
\]
Then we have $\lim_{t\to\infty}q\left(e^{-t},x,y\right) = \exp\left(-|y|^2\right)$, and also
\[
\begin{aligned}
p_0(t,x,y)\,e^{-|y|^2} &= q\left(e^{-t},x,y\right), \qquad t > 0,\\
\frac{\partial}{\partial y_j}q(s,x,y) &= -\frac{2(y_j - sx_j)}{1 - s^2}\,q(s,x,y), \quad\text{and}\\
\frac{\partial^2}{(\partial y_j)^2}q(s,x,y) &= -\frac{2}{1 - s^2}\,q(s,x,y) + \frac{4(y_j - sx_j)^2}{\left(1 - s^2\right)^2}\,q(s,x,y). \qquad (7.105)
\end{aligned}
\]
From the equalities in (7.105) we get:
\[
\begin{aligned}
\frac{1}{2}\Delta_y q(s,x,y) &+ d\,q(s,x,y) + \langle y, \nabla_y q(s,x,y)\rangle\\
&= -d\,\frac{s^2}{1 - s^2}\,q(s,x,y) + \frac{2s}{\left(1 - s^2\right)^2}\left\{s|x|^2 + s|y|^2 - \left(1 + s^2\right)\langle y,x\rangle\right\}q(s,x,y)\\
&= -s\,\frac{\partial}{\partial s}q(s,x,y) = \left.\frac{\partial}{\partial t}q\left(e^{-t},x,y\right)\right|_{e^{-t} = s}. \qquad (7.106)
\end{aligned}
\]
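The identity (7.106), namely that $q$ satisfies $\frac{1}{2}\Delta_y q + d\,q + \langle y, \nabla_y q\rangle = -s\,\partial q/\partial s$, can be spot-checked numerically in dimension $d = 1$ by comparing finite-difference approximations of both sides (step sizes chosen ad hoc; a sanity check, not a proof):

```python
import math

def q(s, x, y):
    # q(s, x, y) = (1 - s^2)^{-1/2} exp(-(y - s x)^2 / (1 - s^2)) for d = 1, as in (7.104)
    return (1 - s*s) ** -0.5 * math.exp(-(y - s*x) ** 2 / (1 - s*s))

def lhs(s, x, y, h=1e-4):
    # (1/2) d^2q/dy^2 + q + y dq/dy via central differences (d = 1)
    qyy = (q(s, x, y + h) - 2 * q(s, x, y) + q(s, x, y - h)) / h ** 2
    qy = (q(s, x, y + h) - q(s, x, y - h)) / (2 * h)
    return 0.5 * qyy + q(s, x, y) + y * qy

def rhs(s, x, y, h=1e-6):
    # -s dq/ds via a central difference
    return -s * (q(s + h, x, y) - q(s - h, x, y)) / (2 * h)

print(lhs(0.4, 0.8, -0.3), rhs(0.4, 0.8, -0.3))  # the two sides should agree closely
```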

Let $f \in D(L)$, and let $\mu_0$ be the Borel measure on $\mathbb{R}^d$ which has density $\pi^{-d/2}\exp\left(-|y|^2\right)$ with respect to the Lebesgue measure. Then integration by parts yields:
\[
\begin{aligned}
\lambda&\int_0^\infty e^{-\lambda t}e^{tL}f(x)\,dt\\
&= \left.\left(1 - e^{-\lambda t}\right)e^{tL}f(x)\right|_{t=0}^{t=\infty} - \int_0^\infty\left(1 - e^{-\lambda t}\right)e^{tL}Lf(x)\,dt\\
&= \langle f, \mu_0\rangle - \int_0^\infty\int_{\mathbb{R}^d}\left(1 - e^{-\lambda t}\right)q\left(e^{-t},x,y\right)Lf(y)\,\frac{dy}{\pi^{d/2}}\,dt\\
&= \langle f, \mu_0\rangle - \int_0^\infty\int_{\mathbb{R}^d}\left(1 - e^{-\lambda t}\right)q\left(e^{-t},x,y\right)\left(\frac{1}{2}\Delta f(y) - \langle y, \nabla f(y)\rangle\right)\frac{dy}{\pi^{d/2}}\,dt
\end{aligned}
\]
(apply again integration by parts)
\[
\begin{aligned}
&= \langle f, \mu_0\rangle - \int_0^\infty\int_{\mathbb{R}^d}\left(1 - e^{-\lambda t}\right)\left(\frac{1}{2}\Delta_y q\left(e^{-t},x,y\right) + d\,q\left(e^{-t},x,y\right) + \left\langle y, \nabla_y q\left(e^{-t},x,y\right)\right\rangle\right)f(y)\,\frac{dy}{\pi^{d/2}}\,dt\\
&= \langle f, \mu_0\rangle - \int_0^\infty\int_{\mathbb{R}^d}\left(1 - e^{-\lambda t}\right)\left(\frac{-d}{e^{2t} - 1} + \frac{2e^{2t}\left\{|x|^2 + |y|^2 - \left(e^t + e^{-t}\right)\langle y,x\rangle\right\}}{\left(e^{2t} - 1\right)^2}\right)q\left(e^{-t},x,y\right)f(y)\,\frac{dy}{\pi^{d/2}}\,dt
\end{aligned}
\]
(make the substitution $y = e^{-t}x + \sqrt{1 - e^{-2t}}\,y'$)
\[
= \langle f, \mu_0\rangle - \int_{\mathbb{R}^d}\int_0^\infty\frac{1 - e^{-\lambda t}}{e^{2t} - 1}\left(-d + 2|y'|^2 - 2\sqrt{e^{2t} - 1}\,\langle y',x\rangle\right)
\exp\left(-|y'|^2\right)f\left(e^{-t}x + \sqrt{1 - e^{-2t}}\,y'\right)dt\,\frac{dy'}{\pi^{d/2}}. \qquad (7.107)
\]
By the same token we get
\[
\lambda\int_0^\infty e^{-\lambda t}e^{tL}f(x)\,dt
= f(x) + \lim_{\eta\downarrow 0}\int_{\mathbb{R}^d}\int_\eta^\infty\frac{e^{-\lambda t}}{e^{2t} - 1}\left(-d + 2|y'|^2 - 2\sqrt{e^{2t} - 1}\,\langle y',x\rangle\right)
\exp\left(-|y'|^2\right)f\left(e^{-t}x + \sqrt{1 - e^{-2t}}\,y'\right)dt\,\frac{dy'}{\pi^{d/2}}. \qquad (7.108)
\]
From (7.107) we infer
\[
\begin{aligned}
\Bigl|\lambda&\int_0^\infty e^{-\lambda t}e^{tL}f(x)\,dt - \langle f, \mu_0\rangle\Bigr|\\
&\le \int_{\mathbb{R}^d}\left|\int_0^\infty\left(1 - e^{-\lambda t}\right)\left(\frac{-d}{e^{2t} - 1} + \frac{2e^{2t}\left\{|x|^2 + |y|^2 - \left(e^t + e^{-t}\right)\langle y,x\rangle\right\}}{\left(e^{2t} - 1\right)^2}\right)q\left(e^{-t},x,y\right)dt\right|\frac{dy}{\pi^{d/2}}\,\|f\|_\infty\\
&\le \int_{\mathbb{R}^d}\int_0^\infty\left|1 - e^{-\lambda t}\right|\left|\frac{-d}{e^{2t} - 1} + \frac{2e^{2t}\left\{|x|^2 + |y|^2 - \left(e^t + e^{-t}\right)\langle y,x\rangle\right\}}{\left(e^{2t} - 1\right)^2}\right|q\left(e^{-t},x,y\right)dt\,\frac{dy}{\pi^{d/2}}\,\|f\|_\infty
\end{aligned}
\]
(make the substitution $y = e^{-t}x + \sqrt{1 - e^{-2t}}\,y'$)
\[
= \int_{\mathbb{R}^d}\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{e^{2t} - 1}\left|-d + 2|y'|^2 - 2\sqrt{e^{2t} - 1}\,\langle y',x\rangle\right|\exp\left(-|y'|^2\right)dt\,\frac{dy'}{\pi^{d/2}}\,\|f\|_\infty
\]
(make the substitution $y' = y/\sqrt{2}$)
\[
\begin{aligned}
&= \int_{\mathbb{R}^d}\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{e^{2t} - 1}\left|-d + |y|^2 - \sqrt{2}\sqrt{e^{2t} - 1}\,\langle y,x\rangle\right|\exp\left(-\tfrac{1}{2}|y|^2\right)dt\,\frac{dy}{(2\pi)^{d/2}}\,\|f\|_\infty\\
&\le \int_{\mathbb{R}^d}\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{e^{2t} - 1}\left|-d + |y|^2\right|\exp\left(-\tfrac{1}{2}|y|^2\right)dt\,\frac{dy}{(2\pi)^{d/2}}\,\|f\|_\infty\\
&\qquad+ \sqrt{2}\int_{\mathbb{R}^d}\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{\sqrt{e^{2t} - 1}}\,|\langle y,x\rangle|\exp\left(-\tfrac{1}{2}|y|^2\right)dt\,\frac{dy}{(2\pi)^{d/2}}\,\|f\|_\infty\\
&\le 2d\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{e^{2t} - 1}\,dt\,\|f\|_\infty + \frac{2}{\sqrt\pi}\int_0^\infty\frac{\left|1 - e^{-\lambda t}\right|}{\sqrt{e^{2t} - 1}}\,dt\,|x|\,\|f\|_\infty. \qquad (7.109)
\end{aligned}
\]
We will also estimate the absolute value of the quantity:
322 7 On non-stationary Markov processes and Dunford projections
\[
\begin{aligned}
&\int q(0,x,y)\, f(y)\, \frac{dy}{\pi^{d/2}} - \int q\bigl(e^{-t},x,y\bigr)\, f(y)\, \frac{dy}{\pi^{d/2}}\\
&= \int\int_t^\infty \Bigl(-d + \frac{2e^{2s}\bigl\{|x|^2+|y|^2-(e^s+e^{-s})\langle y,x\rangle\bigr\}}{e^{2s}-1}\Bigr)\, \frac{1}{e^{2s}-1}\, q\bigl(e^{-s},x,y\bigr)\, ds\, f(y)\, \frac{dy}{\pi^{d/2}}\\
&= \int\int_t^\infty \Bigl\{-d + 2\bigl\{|y'|^2 - \sqrt{e^{2s}-1}\,\langle y',x\rangle\bigr\}\Bigr\}\, \frac{1}{e^{2s}-1}\, \exp\bigl(-|y'|^2\bigr)\, f\Bigl(e^{-s}x + \sqrt{1-e^{-2s}}\,y'\Bigr)\, ds\, \frac{dy'}{\pi^{d/2}}\\
&= \int\int_t^\infty \Bigl\{-d + |y|^2 - \sqrt{2\bigl(e^{2s}-1\bigr)}\,\langle y,x\rangle\Bigr\}\, \frac{1}{e^{2s}-1}\, \exp\Bigl(-\tfrac12|y|^2\Bigr)\, f\Bigl(e^{-s}x + \sqrt{\tfrac{1-e^{-2s}}{2}}\,y\Bigr)\, ds\, \frac{dy}{(2\pi)^{d/2}}. \tag{7.110}
\end{aligned}
\]
Here we used the equality in (7.106):
\[
\Bigl(-d + \frac{2e^{2s}\bigl\{|x|^2+|y|^2-(e^s+e^{-s})\langle y,x\rangle\bigr\}}{e^{2s}-1}\Bigr)\, \frac{1}{e^{2s}-1}\, q\bigl(e^{-s},x,y\bigr) = \frac{\partial}{\partial s}\, q\bigl(e^{-s},x,y\bigr). \tag{7.111}
\]
From (7.110) we obtain the following estimate in the same manner as we got the inequality in (7.109):
\[
\Bigl|\int q(0,x,y)\, f(y)\, \frac{dy}{\pi^{d/2}} - \int q\bigl(e^{-t},x,y\bigr)\, f(y)\, \frac{dy}{\pi^{d/2}}\Bigr|
\le 2d \int_t^\infty \frac{ds}{e^{2s}-1}\, \|f\|_\infty + \frac{2}{\sqrt\pi}\int_t^\infty \frac{ds}{\sqrt{e^{2s}-1}}\, |x|\, \|f\|_\infty. \tag{7.112}
\]
Suppose $y \ne x$. In addition, the substitution $s = e^{-t}$ shows the equality:
\[
\begin{aligned}
\lambda\int_0^\infty e^{-\lambda t}\, p_0(t,x,y)\, e^{-|y|^2}\, dt &= \lambda\int_0^1 s^{\lambda-1}\, q(s,x,y)\, ds\\
&= d\int_0^1 \frac{\bigl(1-s^\lambda\bigr) s}{1-s^2}\, q(s,x,y)\, ds\\
&\quad - \int_0^1 \frac{2\bigl(1-s^\lambda\bigr)}{1-s^2}\, \frac{s|x|^2 + s|y|^2 - \bigl(1+s^2\bigr)\langle x,y\rangle}{1-s^2}\, q(s,x,y)\, ds. \tag{7.113}
\end{aligned}
\]
From (7.113) we infer
\[
\begin{aligned}
&\lambda\int_0^\infty\int f(y)\, e^{-\lambda t}\, p_0(t,x,y)\, e^{-|y|^2}\, \frac{dy}{\pi^{d/2}}\, dt\\
&= d\int_0^1 \frac{\bigl(1-s^\lambda\bigr)s}{1-s^2}\int f(y)\, q(s,x,y)\, \frac{dy}{\pi^{d/2}}\, ds
 - \int_0^1 \frac{2\bigl(1-s^\lambda\bigr)}{1-s^2}\int \frac{s|x|^2+s|y|^2-\bigl(1+s^2\bigr)\langle x,y\rangle}{1-s^2}\, f(y)\, q(s,x,y)\, \frac{dy}{\pi^{d/2}}\, ds
\end{aligned}
\]
(make the substitution $y = sx + \sqrt{1-s^2}\,y'$)
\[
\begin{aligned}
&= d\int_0^1 \frac{\bigl(1-s^\lambda\bigr)s}{1-s^2}\int f\Bigl(sx+\sqrt{1-s^2}\,y'\Bigr)\, e^{-|y'|^2}\, \frac{dy'}{\pi^{d/2}}\, ds\\
&\quad - \int_0^1 \frac{2\bigl(1-s^\lambda\bigr)}{1-s^2}\int f\Bigl(sx+\sqrt{1-s^2}\,y'\Bigr)\Bigl(s|y'|^2 - \sqrt{1-s^2}\,\langle x,y'\rangle\Bigr)\, e^{-|y'|^2}\, \frac{dy'}{\pi^{d/2}}\, ds\\
&= \int_0^1 \frac{s\bigl(1-s^\lambda\bigr)}{1-s^2}\int f\Bigl(sx+\sqrt{1-s^2}\,y'\Bigr)\bigl(d - 2|y'|^2\bigr)\, e^{-|y'|^2}\, \frac{dy'}{\pi^{d/2}}\, ds\\
&\quad + \int_0^1 \frac{2\bigl(1-s^\lambda\bigr)}{\sqrt{1-s^2}}\int f\Bigl(sx+\sqrt{1-s^2}\,y'\Bigr)\, \langle x,y'\rangle\, e^{-|y'|^2}\, \frac{dy'}{\pi^{d/2}}\, ds. \tag{7.114}
\end{aligned}
\]
Let C(t,s), t ≥ s, t, s ∈ R, be a family of d × d matrices with real entries, with the following properties:
(a) C(t,t) = I, t ∈ R (I stands for the identity matrix).
(b) The identity C(t,s)C(s,τ) = C(t,τ) holds for all real numbers t, s, τ for which t ≥ s ≥ τ.
(c) The matrix-valued function (t,s,x) ↦ C(t,s)x is continuous as a function from the set {(t,s) ∈ R × R : t ≥ s} × R^d to R^d.
Define the backward propagator Y_C on C_b(R^d) by Y_C(t,s)f(x) = f(C(t,s)x), x ∈ R^d, and f ∈ C_b(R^d). Then Y_C is a backward propagator on the space C_b(R^d), which is σ(C_b(R^d), M(R^d))-continuous. Here the symbol M(R^d) stands for the vector space of all signed measures on R^d.
Let W(t) be standard m-dimensional Brownian motion on (Ω, F_t, E), and let σ(ρ) be a deterministic continuous function which takes its values in the space of d × m matrices. Put Q(ρ) = σ(ρ)σ(ρ)*. Another interesting example is the following:
\[
\begin{aligned}
Y_{C,Q}(s,t)f(x) &= \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\Bigl(C(t,s)x + \Bigl(\int_s^t C(t,\rho)\,Q(\rho)\,C(t,\rho)^*\, d\rho\Bigr)^{1/2} y\Bigr)\, dy\\
&= \mathbb{E}\Bigl[f\Bigl(C(t,s)x + \int_s^t C(t,\rho)\,\sigma(\rho)\, dW(\rho)\Bigr)\Bigr]. \tag{7.115}
\end{aligned}
\]

where A is an arbitrary d×d matrix, and where Q(ρ) = σ(ρ)σ(ρ)∗ is a positive-


definite d × d matrix.
¡ ¢Then the propagators YC,Q and YC,S are backward
propagators on Cb Rd . We will prove this.
324 7 On non-stationary Markov processes and Dunford projections

Next suppose that the forward propagator C on Rd consists of contrac-


tive operators, i.e. C(t, s)C(t, s)∗ ≤ I (this inequality is to be taken in
matrix sense). Choose a family S (t, s) of square d × d-matrices such that

C(t, s)C(t, s)∗ + S (t, s) S (t, s) = I, and put
\[
Y_{C,S}(s,t)f(x) = \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\bigl(C(t,s)x + S(t,s)y\bigr)\, dy. \tag{7.116}
\]
In fact the example in (7.116) is a special case of the example in (7.115), provided Q(ρ) is given by the following limit:
\[
Q(\rho) = \lim_{h\downarrow 0}\frac{I - C(\rho,\rho-h)\,C(\rho,\rho-h)^*}{h}. \tag{7.117}
\]
If Q(ρ) is as in (7.117), then
\[
S(t,s)\,S(t,s)^* = I - C(t,s)\,C(t,s)^* = \int_s^t C(t,\rho)\, Q(\rho)\, C(t,\rho)^*\, d\rho.
\]

The following auxiliary lemma will be useful. Condition (7.118) is satisfied if the three pairs (C₁,S₁), (C₂,S₂), and (C₃,S₃) satisfy C₁C₁* + S₁S₁* = C₂C₂* + S₂S₂* = C₃C₃* + S₃S₃* = I. It also holds if C₂ = C(t₂,t₁), and
\[
S_j S_j^* = \int_{t_{j-1}}^{t_j} C(t_j,\rho)\, \sigma(\rho)\sigma(\rho)^*\, C(t_j,\rho)^*\, d\rho, \quad j = 1, 2, \quad\text{and}\quad
S_3 S_3^* = \int_{t_0}^{t_2} C(t_2,\rho)\, \sigma(\rho)\sigma(\rho)^*\, C(t_2,\rho)^*\, d\rho.
\]

Lemma 7.24. Let C₁, S₁, C₂, S₂, and C₃, S₃ be d × d matrices with the following properties:
\[
C_3 = C_2 C_1, \quad\text{and}\quad C_2 S_1 S_1^* C_2^* + S_2 S_2^* = S_3 S_3^*. \tag{7.118}
\]
Let f ∈ C_b(R^d), and put
\[
Y_{1,2} f(x) = \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\bigl(C_1 x + S_1 y\bigr)\, dy; \tag{7.119}
\]
\[
Y_{2,3} f(x) = \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\bigl(C_2 x + S_2 y\bigr)\, dy; \tag{7.120}
\]
\[
Y_{1,3} f(x) = \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\bigl(C_3 x + S_3 y\bigr)\, dy. \tag{7.121}
\]
Then Y_{1,2} Y_{2,3} = Y_{1,3}.

Proof. Let the matrices C_j and S_j, 1 ≤ j ≤ 3, be as in (7.118). First we assume that the matrices S₁ and C₂ are invertible, and we put A₃ = S₁⁻¹C₂⁻¹S₃ and A₂ = S₁⁻¹C₂⁻¹S₂. Then, using the equalities in (7.118), we see A₃A₃* = I + A₂A₂*. We choose a d × d matrix A such that A*A = I + A₂*A₂, and we put D = (A*)⁻¹A₂*A₃. Then we have A₃*A₃ = I + D*D. Let f ∈ C_b(R^d). Let the vectors (y₁,y₂) ∈ R^d × R^d and (y,z) ∈ R^d × R^d be such that
\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} A_3 & -A_2 A^{-1} \\ 0 & A^{-1} \end{pmatrix}\begin{pmatrix} y \\ z \end{pmatrix}. \tag{7.122}
\]
Since
\[
A_2 A_2^*\bigl(I + A_2 A_2^*\bigr)^{-1} = A_2\bigl(I + A_2^* A_2\bigr)^{-1}A_2^*,
\]
we obtain det(I + A₂A₂*) = det(I + A₂*A₂). Hence the absolute value of the determinant of the matrix in the right-hand side of (7.122) can be rewritten as:
\[
\Bigl|\det\begin{pmatrix} A_3 & -A_2 A^{-1} \\ 0 & A^{-1}\end{pmatrix}\Bigr|^2 = \bigl|\det A_3\bigr|^2\, (\det A)^{-2} = \frac{\det\bigl(A_3 A_3^*\bigr)}{\det\bigl(A^* A\bigr)} = \frac{\det\bigl(I + A_2 A_2^*\bigr)}{\det\bigl(I + A_2^* A_2\bigr)} = 1. \tag{7.123}
\]
From (7.122) and (7.123) it follows that the corresponding volume elements satisfy dy₁ dy₂ = dy dz. We also have
\[
|y_1|^2 + |y_2|^2 = |y|^2 + |z - Dy|^2. \tag{7.124}
\]
Employing the substitution (7.122) together with the equalities dy₁ dy₂ = dy dz and (7.124), and applying Fubini's theorem, we obtain:
\[
\begin{aligned}
Y_{1,2} Y_{2,3} f(x) &= \frac{1}{(2\pi)^d}\iint e^{-\frac12(|y_1|^2+|y_2|^2)}\, f\bigl(C_2 C_1 x + C_2 S_1 y_1 + S_2 y_2\bigr)\, dy_1\, dy_2\\
&= \frac{1}{(2\pi)^d}\iint e^{-\frac12(|y|^2+|z-Dy|^2)}\, f\bigl(C_3 x + S_3 y\bigr)\, dy\, dz\\
&= \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\bigl(C_3 x + S_3 y\bigr)\, dy = Y_{1,3} f(x) \tag{7.125}
\end{aligned}
\]
for all f ∈ C_b(R^d). If the matrices S₁ and C₂ are not invertible, then we replace C₁ with C_{1,ε} = e^{-ε}C₁ and S_{1,ε} satisfying C_{1,ε}C_{1,ε}* + S_{1,ε}S_{1,ε}* = I and lim_{ε↓0} S_{1,ε} = S₁. We take S_{2,ε} = e^{-ε}S₂ instead of S₂. In addition, we choose the matrices C_{2,ε}, ε > 0, in such a way that C_{2,ε}C_{2,ε}* + S_{2,ε}S_{2,ε}* = I and lim_{ε↓0} C_{2,ε} = C₂.
This completes the proof of Lemma 7.24.
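For a quadratic f the Gaussian integrals in (7.119)-(7.121) can be evaluated in closed form, which gives a quick sanity check of Lemma 7.24: with f(x) = |x|² one finds Y f(x) = |Cx|² + tr(SS*), so condition (7.118) turns the composition identity into a purely algebraic statement. The following numerical sketch is ours and not part of the monograph.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
C1, C2 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
S1, S2 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
C3 = C2 @ C1
# choose S3 with S3 S3* = C2 S1 S1* C2* + S2 S2*, i.e. condition (7.118)
M = C2 @ S1 @ S1.T @ C2.T + S2 @ S2.T
w, V = np.linalg.eigh(M)                      # M is symmetric positive semidefinite
S3 = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def Y(C, S, x):
    # Y f(x) for f(y) = |y|^2:  (2 pi)^{-d/2} \int e^{-|y|^2/2} |C x + S y|^2 dy = |C x|^2 + tr(S S*)
    return np.sum((C @ x) ** 2) + np.trace(S @ S.T)

x = rng.standard_normal(d)
# Y_{1,2} Y_{2,3} f(x): Y_{2,3} f(z) = |C2 z|^2 + tr(S2 S2*), then apply Y_{1,2}
lhs = np.sum((C2 @ C1 @ x) ** 2) + np.trace(C2 @ S1 @ S1.T @ C2.T) + np.trace(S2 @ S2.T)
rhs = Y(C3, S3, x)                            # Y_{1,3} f(x)
print(abs(lhs - rhs))                         # agrees up to rounding
```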
Proposition 7.25. Put X^{τ,x}(t) = C(t,τ)x + ∫_τ^t C(t,ρ)σ(ρ) dW(ρ). Then the process X^{τ,x}(t) is Gaussian. Its expectation is given by E[X^{τ,x}(t)] = C(t,τ)x, and its covariance matrix has entries
\[
\mathbb{P}\text{-}\mathrm{cov}\bigl(X_j^{\tau,x}(s), X_k^{\tau,x}(t)\bigr) = \Bigl(\int_s^t C(t,\rho)\, Q(\rho)\, C(t,\rho)^*\, d\rho\Bigr)_{j,k}. \tag{7.126}
\]
Let {(Ω, F, P_{τ,x}), (X(t), t ≥ 0), (R^d, B_d)} be the corresponding time-inhomogeneous Markov process. Then this process is generated by the family of operators L(t), t ≥ 0, where
\[
L(t)f(x) = \frac12\sum_{j,k=1}^d Q_{j,k}(t)\, D_j D_k f(x) + \langle \nabla f(x), A(t)x\rangle. \tag{7.127}
\]
Here the matrix-valued function A(t) is given by $A(t) = \lim_{h\downarrow 0} \dfrac{C(t+h,t) - I}{h}$. The semigroup e^{sL(t)}, s ≥ 0, is given by
\[
\begin{aligned}
e^{sL(t)}f(x) &= \mathbb{E}\Bigl[f\Bigl(e^{sA(t)}x + \int_0^s e^{(s-\rho)A(t)}\sigma(t)\, dW(\rho)\Bigr)\Bigr]\\
&= \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, f\Bigl(e^{sA(t)}x + \Bigl(\int_0^s e^{\rho A(t)}\,Q(t)\,e^{\rho A(t)^*}\, d\rho\Bigr)^{1/2} y\Bigr)\, dy\\
&= \int p(s,x,y;t)\, f(y)\, dy, \tag{7.128}
\end{aligned}
\]
where, with $Q_{A(t)}(s) = \int_0^s e^{\rho A(t)}\,Q(t)\,e^{\rho A(t)^*}\, d\rho$, the integral kernel p(s,x,y;t) is given by
\[
p(s,x,y;t) = \frac{1}{(2\pi)^{d/2}\sqrt{\det Q_{A(t)}(s)}}\, \exp\Bigl(-\tfrac12\Bigl\langle \bigl(Q_{A(t)}(s)\bigr)^{-1}\bigl(y - e^{sA(t)}x\bigr),\ y - e^{sA(t)}x\Bigr\rangle\Bigr).
\]
If all eigenvalues of the matrix A(t) have strictly negative real part, then the measure
\[
B \mapsto \frac{1}{(2\pi)^{d/2}}\int e^{-\frac12|y|^2}\, 1_B\Bigl(\Bigl(\int_0^\infty e^{\rho A(t)}\,Q(t)\,e^{\rho A(t)^*}\, d\rho\Bigr)^{1/2} y\Bigr)\, dy
\]
defines an invariant measure for the semigroup e^{sL(t)}, s ≥ 0.
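The covariance $Q_{A(t)}(s) = \int_0^s e^{\rho A}\,Q\,e^{\rho A^*}\,d\rho$ appearing in the kernel p(s,x,y;t) solves the Lyapunov differential equation $\frac{d}{ds}Q_A(s) = Q + A\,Q_A(s) + Q_A(s)A^*$, which is how such covariances are often computed in practice. A hedged sketch of this check, ours and not from the monograph (it uses SciPy's matrix exponential and a midpoint rule for the integral):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 0.5], [0.0, -2.0]])     # eigenvalues with strictly negative real part
sigma = np.array([[1.0, 0.0], [0.3, 0.7]])
Q = sigma @ sigma.T

def Q_A(s, n=2000):
    # Q_A(s) = \int_0^s e^{rho A} Q e^{rho A^*} d rho, midpoint rule with n panels
    h = s / n
    rhos = (np.arange(n) + 0.5) * h
    return sum(expm(r * A) @ Q @ expm(r * A).T for r in rhos) * h

s = 1.0
QA = Q_A(s)
lhs = expm(s * A) @ Q @ expm(s * A).T        # d/ds Q_A(s), computed directly
rhs = Q + A @ QA + QA @ A.T                  # Lyapunov right-hand side
print(np.max(np.abs(lhs - rhs)))             # small quadrature error only
```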

From Remark 7.31 below it follows that our theory is not directly applicable
to the Ornstein-Uhlenbeck process as exhibited in Proposition 7.25. Therefore
we will modify the example in the next proposition.
Proposition 7.26. Let the R^d-valued process X(t) be a solution to the following stochastic differential equation:
\[
X(t) = C(t,\tau)X(\tau) + \int_\tau^t C(t,\rho)\, F\bigl(\rho, X(\rho)\bigr)\, d\rho + \int_\tau^t C(t,\rho)\, \sigma\bigl(\rho, X(\rho)\bigr)\, dW(\rho). \tag{7.129}
\]
Under appropriate conditions on the functions F and σ the equation in (7.129) has a unique weak solution. More precisely, it is assumed that x ↦ σ(t,x) is Lipschitz continuous with a constant which depends continuously on t, and that for some strictly positive continuous functions k₁(t), k₂(t), and k₃(t), and strictly positive finite constants ε > 0, α > 0, the following inequality holds for all y, z ∈ R^d:
\[
\Bigl\langle F(t, y+z), \frac{y}{|y|}\Bigr\rangle \le -k_1(t)\, |y|^{1+\varepsilon} + k_2(t)\, |z|^\alpha + k_3(t). \tag{7.130}
\]
It is also assumed that the functions y ↦ F(t,y) and y ↦ σ(t,y) are locally Lipschitz, i.e., for every compact subset K of R^d there exists a continuous function t ↦ C_K(t) such that for all y₁ and y₂ ∈ K the inequalities
\[
|F(t,y_2) - F(t,y_1)| \le C_K(t)\, |y_2 - y_1|, \quad\text{and}\quad |\sigma(t,y_2) - \sigma(t,y_1)| \le C_K(t)\, |y_2 - y_1| \tag{7.131}
\]
hold. The corresponding Markov process
\[
\bigl\{\bigl(\Omega, \mathcal{F}_\tau^t, \mathbb{P}_{\tau,x}\bigr), \bigl(X(t), t \ge 0\bigr), \bigl(\mathbb{R}^d, \mathcal{B}_d\bigr)\bigr\}
\]
is generated by the time-dependent linear differential operators L(t), given by
\[
L(t)f(x) = \langle \nabla f(x), A(t)x + F(t,x)\rangle + \frac12\sum_{j,k=1}^d D_j D_k f(x)\, a_{j,k}(t,x), \tag{7.132}
\]
where
\[
A(t) = \lim_{h\downarrow 0}\frac{C(t+h,t) - C(t,t)}{h}, \quad\text{and}\quad a_{j,k}(t,x) = \sum_{\ell=1}^d \sigma_{j,\ell}(t,x)\, \sigma_{k,\ell}(t,x).
\]
It is assumed that the operator A(t) satisfies ⟨A(t)y, y⟩ ≤ 0, y ∈ R^d. Moreover, let X^{τ,x}(t), t ≥ τ, be a solution to (7.129) with X(τ) = x. Then
\[
\mathbb{E}_{\tau,x}\Bigl[\prod_{j=1}^n f_j\bigl(X(t_j)\bigr)\Bigr] = \mathbb{E}\Bigl[\prod_{j=1}^n f_j\bigl(X^{\tau,x}(t_j)\bigr)\Bigr],
\]
where E is the expectation with respect to the distribution of Brownian motion. In addition,
\[
\frac{\partial}{\partial t}\,\mathbb{E}_{\tau,x}\bigl[f(X(t))\bigr] = \mathbb{E}_{\tau,x}\bigl[L(t)f(X(t))\bigr].
\]
Proof. Fix a C¹-function φ: R^d → [0,∞) such that ∫_{R^d} φ(y) dy = 1 and supp(φ) ⊂ {|y| ≤ 1}. Moreover, assume that φ is symmetric in the sense that φ(y) = φ(−y), y ∈ R^d. This property implies e.g. ∫_{R^d} yφ(y) dy = 0. In addition, let ε_n, n ∈ N, be a sequence of positive real numbers such that 0 < ε_{n+1} ≤ ε_n ≤ 1, n ∈ N, and such that lim_{n→∞} ε_n = 0. Let the process t ↦ Y(t) be such that E[sup_{τ≤t≤T} |Y(t)|] < ∞, and define the processes s ↦ F_n(s, Y(s)), n ∈ N, by
\[
F_n\bigl(s, Y(s)\bigr) = \int_{\mathbb{R}^d} F\bigl(s, Y(s) - \varepsilon_n y\bigr)\, \varphi(y)\, dy = \int_{\mathbb{R}^d} F(s,y)\, \varphi\Bigl(\frac{Y(s)-y}{\varepsilon_n}\Bigr)\, \frac{dy}{\varepsilon_n^d}.
\]
Then the functions F_n have properties similar to F, and for each fixed n the functional Y(·) ↦ F_n(t, Y(t)) is globally Lipschitz continuous. Instead of looking at the equation in (7.129) we consider the sequence of equations (n ∈ N):
\[
X_n(t) = C(t,\tau)X_n(\tau) + \int_\tau^t C(t,\rho)\, F_n\bigl(\rho, X_n(\rho)\bigr)\, d\rho + \int_\tau^t C(t,\rho)\, \sigma\bigl(\rho, X_n(\rho)\bigr)\, dW(\rho). \tag{7.133}
\]
Assuming that the equation in (7.133) has a solution X_n(t), we write
\[
Z_n(t) = \int_\tau^t C(t,\rho)\, \sigma\bigl(\rho, X_n(\rho)\bigr)\, dW(\rho) \quad\text{and}\quad Y_n(t) = X_n(t) - Z_n(t). \tag{7.134}
\]
In terms of Y_n(t) and Z_n(t) the equation in (7.133) reads as follows (notice that Z_n(τ) = 0):
\[
Y_n(t) = C(t,\tau)Y_n(\tau) + \int_\tau^t C(t,\rho)\, F_n\bigl(\rho, Y_n(\rho)+Z_n(\rho)\bigr)\, d\rho. \tag{7.135}
\]
Moreover, from (7.130) it follows that
\[
\Bigl\langle F_n\bigl(t, Y_n(t)+Z_n(t)\bigr), \frac{Y_n(t)}{|Y_n(t)|}\Bigr\rangle \le -k_1(t)\, |Y_n(t)|^{1+\varepsilon} + k_2(t)\, |Z_n(t)|^\alpha + k_3(t) + \varepsilon_n. \tag{7.136}
\]
From our hypotheses it follows that
\[
\begin{aligned}
\frac{d}{dt}|Y_n(t)| &= \Bigl\langle \frac{d}{dt}Y_n(t), \frac{Y_n(t)}{|Y_n(t)|}\Bigr\rangle
= \Bigl\langle A(t)Y_n(t), \frac{Y_n(t)}{|Y_n(t)|}\Bigr\rangle + \Bigl\langle F_n\bigl(t, Y_n(t)+Z_n(t)\bigr), \frac{Y_n(t)}{|Y_n(t)|}\Bigr\rangle\\
&\le -k_1(t)\, |Y_n(t)|^{1+\varepsilon} + k_2(t)\, |Z_n(t)|^\alpha + k_3(t) + \varepsilon_n. \tag{7.137}
\end{aligned}
\]
From (7.134) we also see:
\[
Z_n(t) = \int_\tau^t C(t,\rho)\, \sigma\bigl(\rho, Y_n(\rho)+Z_n(\rho)\bigr)\, dW(\rho). \tag{7.138}
\]
Applying Hölder's inequality to (7.137) shows
\[
\frac{d}{dt}\,\mathbb{E}_{\tau,x}\bigl[|Y_n(t)|\bigr] \le -k_1(t)\bigl(\mathbb{E}_{\tau,x}\bigl[|Y_n(t)|\bigr]\bigr)^{1+\varepsilon} + k_2(t)\, \mathbb{E}_{\tau,x}\bigl[|Z_n(t)|^\alpha\bigr] + k_3(t) + \varepsilon_n. \tag{7.139}
\]
Next put y_{1,n}(t) = E_{τ,x}[|Y_n(t)|], and let y_{2,n}(t) be any continuously differentiable positive function with the following properties: y_{2,n}(τ) ≥ y_{1,n}(τ) = |x|, and
\[
\dot{y}_{2,n}(t) \ge -k_1(t)\, y_{2,n}(t)^{1+\varepsilon} + k_2(t)\, \mathbb{E}_{\tau,x}\bigl[|Z_n(t)|^\alpha\bigr] + k_3(t) + \varepsilon_n. \tag{7.140}
\]
Then from (7.137), (7.139), and Lemma 7.27 below we obtain y_{2,n}(t) ≥ y_{1,n}(t), t ≥ τ.
A martingale solution to equation (7.129) can be found as follows. First find a (weak) solution X₀(t), t ≥ τ ≥ 0, to the equation
\[
X_0(t) = C(t,\tau)X_0(\tau) + \int_\tau^t C(t,\rho)\, \sigma\bigl(\rho, X_0(\rho)\bigr)\, dW(\rho). \tag{7.141}
\]
Then choose F₀(t,y) in such a way that F(t,y) = σ(t,y)F₀(t,y). After that we define the finite-dimensional distributions of the process X_F(t) as follows. First we introduce the process ζ(t,τ), t ≥ τ:
\[
\zeta(t,\tau) = \int_\tau^t F_0\bigl(\rho, X_0(\rho)\bigr)\, dW(\rho) - \frac12\int_\tau^t \bigl|F_0\bigl(\rho, X_0(\rho)\bigr)\bigr|^2\, d\rho. \tag{7.142}
\]
Then the process t ↦ e^{ζ(t,τ)}, t ≥ τ, is a local martingale with respect to the filtration F_t^{W,τ} := σ(W(ρ): τ ≤ ρ ≤ t), t ≥ τ, generated by Brownian motion, and which is such that X₀(τ) = x, P-almost surely. This means that if E[e^{ζ(t,τ)} | X₀(τ) = x] = 1, then the process t ↦ e^{ζ(t,τ)}, t ≥ τ, is a martingale with respect to the measure A ↦ P[A | X(τ) = x], A ∈ F_t^{W,τ}. The finite-dimensional distributions of the process X_F(t), t ≥ τ, are given by the Girsanov formula:
\[
\mathbb{E}_{\tau,x}\bigl[f\bigl(X_F(t_1),\dots,X_F(t_n)\bigr)\bigr] = \mathbb{E}\bigl[e^{\zeta(t,\tau)}\, f\bigl(X_0(t_1),\dots,X_0(t_n)\bigr)\ \big|\ X_0(\tau) = x\bigr]. \tag{7.143}
\]
Here we assume that the function f: R^d × ⋯ × R^d (n times) → R is a bounded Borel function, and τ ≤ t₁ < ⋯ < t_n ≤ t. In order to prove that equality (7.143) determines the distribution of the process X_F(t), t ≥ τ, we have to show that the martingale problem for the family of operators L(t), t ≥ τ, in (7.127) is well-posed. Therefore, we apply Itô's formula to obtain:
\[
\begin{aligned}
&e^{\zeta(t,\tau)} f\bigl(X_0(t)\bigr) - e^{\zeta(\tau,\tau)} f\bigl(X_0(\tau)\bigr)\\
&= \int_\tau^t e^{\zeta(\rho,\tau)}\, f\bigl(X_0(\rho)\bigr)\, \bigl\langle F_0\bigl(\rho, X_0(\rho)\bigr), dW(\rho)\bigr\rangle
 + \int_\tau^t e^{\zeta(\rho,\tau)}\, \bigl\langle \nabla f\bigl(X_0(\rho)\bigr), \sigma\bigl(\rho, X_0(\rho)\bigr)\, dW(\rho)\bigr\rangle\\
&\quad + \int_\tau^t e^{\zeta(\rho,\tau)}\, \bigl\langle \nabla f\bigl(X_0(\rho)\bigr), A(\rho)X_0(\rho)\bigr\rangle\, d\rho
 + \int_\tau^t e^{\zeta(\rho,\tau)}\, \bigl\langle \nabla f\bigl(X_0(\rho)\bigr), \sigma\bigl(\rho, X_0(\rho)\bigr)F_0\bigl(\rho, X_0(\rho)\bigr)\bigr\rangle\, d\rho\\
&\quad + \frac12\sum_{j,k=1}^d \int_\tau^t e^{\zeta(\rho,\tau)}\, \bigl(\sigma\bigl(\rho, X_0(\rho)\bigr)\sigma\bigl(\rho, X_0(\rho)\bigr)^*\bigr)_{j,k}\, D_j D_k f\bigl(X_0(\rho)\bigr)\, d\rho\\
&= \int_\tau^t e^{\zeta(\rho,\tau)}\, f\bigl(X_0(\rho)\bigr)\, \bigl\langle F_0\bigl(\rho, X_0(\rho)\bigr), dW(\rho)\bigr\rangle
 + \int_\tau^t e^{\zeta(\rho,\tau)}\, \bigl\langle \nabla f\bigl(X_0(\rho)\bigr), \sigma\bigl(\rho, X_0(\rho)\bigr)\, dW(\rho)\bigr\rangle\\
&\quad + \int_\tau^t e^{\zeta(\rho,\tau)}\, L(\rho)f\bigl(X_0(\rho)\bigr)\, d\rho. \tag{7.144}
\end{aligned}
\]
It follows that the process
\[
t \mapsto e^{\zeta(t,\tau)} f\bigl(X_0(t)\bigr) - f\bigl(X_0(\tau)\bigr) - \int_\tau^t e^{\zeta(\rho,\tau)}\, L(\rho)f\bigl(X_0(\rho)\bigr)\, d\rho
\]
is a martingale with respect to the measure A ↦ P[A | X(τ) = x], A ∈ F_t^{W,τ}, provided that E[e^{ζ(t,τ)} | X(τ) = x] = 1. Hence under the latter condition it follows that the process t ↦ X_F(t) is a P_{τ,x}-martingale. Essentially speaking this proves that the martingale problem for the operators L(t), t ≥ τ, possesses solutions. In order to establish the Markov property we need the uniqueness of solutions. The uniqueness of solutions can be achieved as follows. Let t ↦ X¹(t) and X²(t), t ≥ τ, be solutions to equation (7.129). Put Z^j(t) = ∫_τ^t σ(ρ, X^j(ρ)) dW(ρ), and Y^j(t) = X^j(t) − Z^j(t). Then Z^j(t) = ∫_τ^t σ(ρ, Y^j(ρ) + Z^j(ρ)) dW(ρ), and
\[
Y^j(t) = C(t,\tau)\, Y^j(\tau) + \int_\tau^t F\bigl(\rho, Y^j(\rho) + Z^j(\rho)\bigr)\, d\rho. \tag{7.145}
\]
Let K be a compact subset of [τ,∞) × R^d, and define the stopping times τ_K^j, j = 1, 2, and τ_K by
\[
\tau_K^j = \inf\bigl\{t > \tau: \bigl(t, X^j(t)\bigr) \in \bigl([\tau,\infty)\times\mathbb{R}^d\bigr)\setminus K\bigr\} \quad\text{and}\quad \tau_K = \min\bigl(\tau_K^1, \tau_K^2\bigr).
\]
Then on the event {τ_K > t}, by the local Lipschitz property of the functions F and σ we have (see (7.131))
\[
\begin{aligned}
\frac{d}{dt}\bigl|Y^2(t) - Y^1(t)\bigr| &= \Bigl\langle \frac{d}{dt}\bigl(Y^2(t) - Y^1(t)\bigr), \frac{Y^2(t)-Y^1(t)}{|Y^2(t)-Y^1(t)|}\Bigr\rangle\\
&= \Bigl\langle A(t)\bigl(Y^2(t)-Y^1(t)\bigr), \frac{Y^2(t)-Y^1(t)}{|Y^2(t)-Y^1(t)|}\Bigr\rangle\\
&\quad + \Bigl\langle F\bigl(t, Y^2(t)+Z^2(t)\bigr) - F\bigl(t, Y^1(t)+Z^1(t)\bigr), \frac{Y^2(t)-Y^1(t)}{|Y^2(t)-Y^1(t)|}\Bigr\rangle\\
&\le C_K(t)\bigl(\bigl|Y^2(t)-Y^1(t)\bigr| + \bigl|Z^2(t)-Z^1(t)\bigr|\bigr)\\
&= C_K(t)\Bigl(\bigl|Y^2(t)-Y^1(t)\bigr| + \Bigl|\int_\tau^t \bigl(\sigma\bigl(\rho, X^2(\rho)\bigr) - \sigma\bigl(\rho, X^1(\rho)\bigr)\bigr)\, dW(\rho)\Bigr|\Bigr). \tag{7.146}
\end{aligned}
\]
From inequality (7.186) in Lemma 7.30 and (7.146) we infer:
\[
\bigl|Y^2(t)-Y^1(t)\bigr| \le \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, e^{\int_\tau^t C_K(\rho)\, d\rho} + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \bigl|Z^2(\rho)-Z^1(\rho)\bigr|\, d\rho. \tag{7.147}
\]

Inequality (7.147) on the event {τ_K > t} entails:
\[
\begin{aligned}
&\bigl|Y^2(t)-Y^1(t)\bigr|\, 1_{\{\tau_K>t\}}\\
&\le \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, 1_{\{\tau_K>t\}}\, e^{\int_\tau^t C_K(\rho)\, d\rho} + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \bigl|Z^2(\rho)-Z^1(\rho)\bigr|\, 1_{\{\tau_K>t\}}\, d\rho\\
&= \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, 1_{\{\tau_K>t\}}\, e^{\int_\tau^t C_K(\rho)\, d\rho}\\
&\quad + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl|\int_\tau^\rho \bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, dW(\rho')\Bigr|\, 1_{\{\tau_K>t\}}\, d\rho\\
&= \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, 1_{\{\tau_K>t\}}\, e^{\int_\tau^t C_K(\rho)\, d\rho}\\
&\quad + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl|\int_\tau^{\rho\wedge\tau_K} \bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|\, 1_{\{\tau_K>t\}}\, d\rho\\
&\le \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, 1_{\{\tau_K>\tau\}}\, e^{\int_\tau^t C_K(\rho)\, d\rho}\\
&\quad + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl|\int_\tau^{\rho\wedge\tau_K} \bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|\, 1_{\{\tau_K>\tau\}}\, d\rho. \tag{7.148}
\end{aligned}
\]

It follows that
\[
\begin{aligned}
&\sup_{\tau\le s\le t}\bigl|Y^2(s)-Y^1(s)\bigr|\, 1_{\{\tau_K>s\}}
\le \bigl|Y^2(\tau)-Y^1(\tau)\bigr|\, 1_{\{\tau_K>\tau\}}\, e^{\int_\tau^t C_K(\rho)\, d\rho}\\
&\quad + \int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl|\int_\tau^{\rho\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|\, 1_{\{\tau_K>\tau\}}\, d\rho, \tag{7.149}
\end{aligned}
\]
and hence, by the elementary inequalities $2ab \le a^2+b^2$ and $(a+b)^2 \le 2a^2+2b^2$, $a, b \in \mathbb{R}$,
\[
\begin{aligned}
&\sup_{\tau\le s\le t}\bigl|Y^2(s)-Y^1(s)\bigr|^2\, 1_{\{\tau_K>s\}}\\
&\le 2\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\int_\tau^t\int_\tau^t e^{\int_{\rho_1}^t C_K(\rho')\, d\rho'}\, C_K(\rho_1)\, e^{\int_{\rho_2}^t C_K(\rho')\, d\rho'}\, C_K(\rho_2)\\
&\qquad \times \Bigl|\int_\tau^{\rho_1\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|\\
&\qquad \times \Bigl|\int_\tau^{\rho_2\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|\, 1_{\{\tau_K>\tau\}}\, d\rho_1\, d\rho_2\\
&\le 2\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + \int_\tau^t\int_\tau^t e^{\int_{\rho_1}^t C_K(\rho')\, d\rho'}\, C_K(\rho_1)\, e^{\int_{\rho_2}^t C_K(\rho')\, d\rho'}\, C_K(\rho_2)\\
&\qquad \times \Bigl(\Bigl|\int_\tau^{\rho_1\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|^2\\
&\qquad\quad + \Bigl|\int_\tau^{\rho_2\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|^2\Bigr)\, 1_{\{\tau_K>\tau\}}\, d\rho_1\, d\rho_2\\
&\le 2\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl(e^{\int_\tau^t C_K(\rho')\, d\rho'}-1\Bigr)\\
&\qquad \times \Bigl|\int_\tau^{\rho\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|^2\, 1_{\{\tau_K>\tau\}}\, d\rho. \tag{7.150}
\end{aligned}
\]

The fact that the process
\[
\rho \mapsto \int_\tau^{\rho\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')
\]
is a martingale with respect to Brownian motion entails the equality
\[
\begin{aligned}
&\mathbb{E}\Bigl[\Bigl|\int_\tau^{\rho\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|^2\, 1_{\{\tau_K>\tau\}}\Bigr]\\
&\quad= \mathbb{E}\Bigl[\int_\tau^{\rho\wedge\tau_K}\bigl|\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr|^2\, 1_{\{\tau_K>\rho'\}}\, d\rho'\, 1_{\{\tau_K>\tau\}}\Bigr]. \tag{7.151}
\end{aligned}
\]

By taking expectations in (7.150) and using (7.151) we get
\[
\begin{aligned}
&\mathbb{E}\Bigl[\sup_{\tau\le s\le t}\bigl|Y^2(s)-Y^1(s)\bigr|^2\, 1_{\{\tau_K>s\}}\Bigr]\\
&\le 2\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl(e^{\int_\tau^t C_K(\rho')\, d\rho'}-1\Bigr)\\
&\qquad \times \mathbb{E}\Bigl[\Bigl|\int_\tau^{\rho\wedge\tau_K}\bigl(\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr)\, 1_{\{\tau_K>\rho'\}}\, dW(\rho')\Bigr|^2\, 1_{\{\tau_K>\tau\}}\Bigr]\, d\rho\\
&\le 2\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl(e^{\int_\tau^t C_K(\rho')\, d\rho'}-1\Bigr)\\
&\qquad \times \mathbb{E}\Bigl[\int_\tau^{\rho\wedge\tau_K}\bigl|\sigma\bigl(\rho', X^2(\rho')\bigr)-\sigma\bigl(\rho', X^1(\rho')\bigr)\bigr|^2\, 1_{\{\tau_K>\rho'\}}\, d\rho'\Bigr]\, d\rho
\end{aligned}
\]
(employ the local Lipschitz property of the function x ↦ σ(ρ,x) with Lipschitz constant C̃_K(ρ))
\[
\begin{aligned}
&\le 2\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\int_\tau^t e^{\int_\rho^t C_K(\rho')\, d\rho'}\, C_K(\rho)\, \Bigl(e^{\int_\tau^t C_K(\rho')\, d\rho'}-1\Bigr)\\
&\qquad \times \mathbb{E}\Bigl[\int_\tau^{\rho\wedge\tau_K}\widetilde{C}_K^2(\rho')\, \bigl|X^2(\rho')-X^1(\rho')\bigr|^2\, 1_{\{\tau_K>\rho'\}}\, d\rho'\Bigr]\, d\rho\\
&\le 2\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}\\
&\quad + 2\Bigl(e^{\int_\tau^t C_K(\rho)\, d\rho}-1\Bigr)^2\, \mathbb{E}\Bigl[\int_\tau^{t\wedge\tau_K}\widetilde{C}_K^2(\rho)\, \bigl|X^2(\rho)-X^1(\rho)\bigr|^2\, 1_{\{\tau_K>\rho\}}\, d\rho\Bigr]. \tag{7.152}
\end{aligned}
\]

From the Burkholder-Davis-Gundy inequality for p = 2 we obtain:
\[
\begin{aligned}
&\mathbb{E}\Bigl[\sup_{\tau\le s\le t}\bigl|Z^2(s)-Z^1(s)\bigr|^2\, 1_{\{\tau_K>s\}}\Bigr]\\
&= \mathbb{E}\Bigl[\sup_{\tau\le s\le t}\Bigl|\int_\tau^{s\wedge\tau_K}\bigl(\sigma\bigl(\rho, X^2(\rho)\bigr)-\sigma\bigl(\rho, X^1(\rho)\bigr)\bigr)\, 1_{\{\tau_K>\rho\}}\, dW(\rho)\Bigr|^2\, 1_{\{\tau_K>s\}}\Bigr]\\
&\le 4\,\mathbb{E}\Bigl[\int_\tau^{t\wedge\tau_K}\bigl|\bigl(\sigma\bigl(\rho, X^2(\rho)\bigr)-\sigma\bigl(\rho, X^1(\rho)\bigr)\bigr)\, 1_{\{\tau_K>\rho\}}\bigr|^2\, d\rho\, 1_{\{\tau_K>\tau\}}\Bigr]\\
&\le 4\,\mathbb{E}\Bigl[\int_\tau^{t\wedge\tau_K}\widetilde{C}_K^2(\rho)\, \bigl|X^2(\rho)-X^1(\rho)\bigr|^2\, 1_{\{\tau_K>\rho\}}\, d\rho\, 1_{\{\tau_K>\tau\}}\Bigr]. \tag{7.153}
\end{aligned}
\]
Next we estimate the expectation of $\max_{\tau\le s\le t}\bigl|X^K(s)\bigr|^2$, where
\[
X^K(t) = \bigl(Y^2(t)-Y^1(t) + Z^2(t)-Z^1(t)\bigr)\, 1_{\{\tau_K>t\}} = Y^K(t) + Z^K(t). \tag{7.154}
\]
Here the notations Y^K(t) and Z^K(t) are self-explanatory. Put
\[
u_K(s) = \mathbb{E}\Bigl[\sup_{\tau\le\rho\le s}\bigl|X^2(\rho)-X^1(\rho)\bigr|^2\, 1_{\{\tau_K>\rho\}}\Bigr].
\]
From (7.152) and (7.153) we then obtain:
\[
\begin{aligned}
u_K(t) &\le 4\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}
 + 2\Bigl(2\Bigl(e^{\int_\tau^t C_K(\rho)\, d\rho}-1\Bigr)^2+1\Bigr)\int_\tau^t \widetilde{C}_K^2(\rho)\, u_K(\rho)\, d\rho\\
&= \psi(t) + \chi(t)\int_\tau^t c_1(\rho)\, u_K(\rho)\, d\rho, \tag{7.155}
\end{aligned}
\]
where
\[
\psi(t) = 4\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}, \quad
\chi(t) = 2\Bigl(2\Bigl(e^{\int_\tau^t C_K(\rho)\, d\rho}-1\Bigr)^2+1\Bigr), \quad\text{and}\quad c_1(t) = \widetilde{C}_K^2(t).
\]

From inequality (7.185) in Lemma 7.30 and (7.155) we then obtain:
\[
u_K(t) \le \psi(t) + \chi(t)\int_\tau^t e^{\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'}\, c_1(\rho)\, \psi(\rho)\, d\rho. \tag{7.156}
\]
Since the functions t ↦ c₁(t) and t ↦ ψ(t) are increasing, from (7.156) we infer
\[
\begin{aligned}
u_K(t) &\le \psi(t)\, e^{\chi(t)\int_\tau^t c_1(\rho)\, d\rho}
= 4\,\mathbb{E}\bigl[\bigl|Y^2(\tau)-Y^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho}\, e^{\chi(t)\int_\tau^t c_1(\rho)\, d\rho}\\
&= 4\,\mathbb{E}\bigl[\bigl|X^2(\tau)-X^1(\tau)\bigr|^2\, 1_{\{\tau_K>\tau\}}\bigr]\, e^{2\int_\tau^t C_K(\rho)\, d\rho + \chi(t)\int_\tau^t c_1(\rho)\, d\rho}. \tag{7.157}
\end{aligned}
\]
If X²(τ) = X¹(τ) P-almost surely, then (7.157) implies X²(t) = X¹(t) on the event {τ_K > t}. Since K is an arbitrary compact subset of [τ,∞) × R^d, the latter proves that the stochastic differential equation (7.129) in Proposition 7.26 is uniquely solvable in the space S²_loc = S²_loc(R^d) consisting of continuous semimartingales X with the property that E[sup_{τ≤s≤t} |X(s)|²] is finite, provided, of course, that the solutions to equation (7.129) belong to the space S²_loc.
This completes the proof of Proposition 7.26.
Lemma 7.27. Fix τ ≤ T, and let g: [τ,T] × R → R be a continuous function which is continuously differentiable in the second variable. In addition, let C: [τ,T] → R be a measurable function. Let the R-valued continuous functions t ↦ y₂(t) and t ↦ y₁(t), τ ≤ t ≤ T, satisfy the following differential inequalities:
\[
\dot{y}_1(t) \le -g\bigl(t, y_1(t)\bigr) + C(t), \quad \tau\le t\le T, \quad\text{and} \tag{7.158}
\]
\[
\dot{y}_2(t) \ge -g\bigl(t, y_2(t)\bigr) + C(t), \quad \tau\le t\le T. \tag{7.159}
\]
If y₂(τ) ≥ y₁(τ), then y₂(t) ≥ y₁(t), τ ≤ t ≤ T.

Proof. Put Φ(t) = y₂(t) − y₁(t), and
\[
\Psi(t) = \exp\Bigl(\int_\tau^t\int_0^1 D_2 g\bigl(\rho, (1-s)y_1(\rho)+sy_2(\rho)\bigr)\, ds\, d\rho\Bigr).
\]
Then Ψ(t) > 0, and
\[
\begin{aligned}
\frac{d}{dt}\bigl(\Phi(t)\Psi(t)\bigr) &= \Bigl(\dot\Phi(t) + \Phi(t)\int_0^1 D_2 g\bigl(t, (1-s)y_1(t)+sy_2(t)\bigr)\, ds\Bigr)\Psi(t)\\
&= \Bigl(\dot{y}_2(t)-\dot{y}_1(t) + \bigl(y_2(t)-y_1(t)\bigr)\int_0^1 D_2 g\bigl(t, (1-s)y_1(t)+sy_2(t)\bigr)\, ds\Bigr)\Psi(t)\\
&= \bigl(\dot{y}_2(t) + g\bigl(t, y_2(t)\bigr) - \dot{y}_1(t) - g\bigl(t, y_1(t)\bigr)\bigr)\Psi(t) \ge 0, \tag{7.160}
\end{aligned}
\]
where in inequality (7.160) we used (7.158) and (7.159). Hence we get
\[
\Phi(t)\Psi(t) \ge \Phi(\tau)\Psi(\tau) = \Phi(\tau) = y_2(\tau)-y_1(\tau) \ge 0, \tag{7.161}
\]
and thus y₂(t) − y₁(t) = Φ(t) ≥ 0.
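Lemma 7.27 is the standard comparison argument for differential inequalities that was applied to (7.139)-(7.140) above. A quick numerical illustration, our own construction rather than anything from the monograph, with g(t,y) = y³ and C(t) = 1 + sin t: a curve whose derivative falls strictly below −g + C stays below one whose derivative stays strictly above it.

```python
import math

def euler(rhs, y0, tau, T, n):
    # forward Euler for y' = rhs(t, y); returns the whole trajectory
    h = (T - tau) / n
    t, y, ys = tau, y0, [y0]
    for _ in range(n):
        y += h * rhs(t, y)
        t += h
        ys.append(y)
    return ys

g = lambda t, y: y ** 3
C = lambda t: 1.0 + math.sin(t)
# y1' <= -g + C (here: -g + C - 0.2),  y2' >= -g + C (here: -g + C + 0.2), same start
y1 = euler(lambda t, y: -g(t, y) + C(t) - 0.2, 0.5, 0.0, 5.0, 5000)
y2 = euler(lambda t, y: -g(t, y) + C(t) + 0.2, 0.5, 0.0, 5.0, 5000)
print(all(b >= a for a, b in zip(y1, y2)))   # True: y2(t) >= y1(t) throughout
```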

Lemma 7.28. Let y: [τ,∞) → [0,∞) be a solution to the following ordinary differential equation:
\[
\dot{y}(t) = C(t) - k(t)\, y(t)^{1+\varepsilon}, \quad t\ge\tau. \tag{7.162}
\]
It is assumed that the functions C(t) and k(t) are strictly positive and continuous, that ε > 0, that the quotient γ := γ(t) = C(t)/k(t) does not depend on t, and that ∫_τ^∞ k(ρ) dρ = ∞. Then lim_{t→∞} y(t) = γ^{1/(1+ε)}, and sup_{t≥τ} y(t) < ∞. In addition, the following inequality holds for t > τ:
\[
\bigl|y(t) - \gamma^{1/(1+\varepsilon)}\bigr| \le \Bigl(\varepsilon\int_\tau^t k(\rho)\, d\rho\Bigr)^{-1/\varepsilon}, \quad t>\tau. \tag{7.163}
\]
The importance of inequality (7.163) lies in the fact that it contains no reference to the initial value y(τ) of the solution t ↦ y(t). The inequality in (7.163) seems somewhat nicer and stronger than inequality (2.15) in Goldys and Maslowski [92].
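The bound (7.163) is easy to probe numerically: it must hold uniformly over initial values, with the same right-hand side for a solution started far above the limit and one started at zero. The following sketch is ours, not from the monograph, with ε = 1 and k(t) = 1 + t/2, C(t) = 2k(t) (so γ = 2 is constant, as the lemma requires).

```python
def solve_ode(C, k, eps, y0, tau, T, n=200000):
    # forward Euler for y' = C(t) - k(t) * y^{1+eps}; returns (t, y) samples
    h = (T - tau) / n
    t, y, out = tau, y0, []
    for _ in range(n):
        y += h * (C(t) - k(t) * y ** (1.0 + eps))
        t += h
        out.append((t, y))
    return out

eps = 1.0
k = lambda t: 1.0 + 0.5 * t
C = lambda t: 2.0 * (1.0 + 0.5 * t)          # gamma = C/k = 2, independent of t
gamma_eta = 2.0 ** (1.0 / (1.0 + eps))       # gamma^{1/(1+eps)} = sqrt(2)
worst = -1e9
for y0 in (0.0, 10.0):                       # bound (7.163) is uniform in y(tau)
    for t, y in solve_ode(C, k, eps, y0, 0.0, 4.0):
        K = t + 0.25 * t * t                 # \int_0^t k(rho) d rho
        worst = max(worst, abs(y - gamma_eta) - (eps * K) ** (-1.0 / eps))
print(worst)                                  # nonpositive up to discretization error
```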
Remark 7.29. As in the proof of Lemma 7.28 put η = 1/(1+ε). From (7.169) in the proof of Lemma 7.28 we see that y(τ) > γ^η implies y(τ) > y(t) > γ^η, and that y(t) decreases to its limit. We also see that y(τ) < γ^η entails y(τ) < y(t) < γ^η, and that y(t) increases to its limit. If the integral
\[
\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds \tag{7.164}
\]
increases to ∞ with t, then lim_{t→∞} y(t) = γ^η. Notice that the integrals in (7.164) tend to ∞ whenever the functions k(t) and C(t) are constant. In order that lim_{t→∞} y(t) = γ^{1/(1+ε)} one needs the fact that the integral
\[
\int_\tau^\infty k(\rho)^{\frac{1}{1+\varepsilon}}\, C(\rho)^{\frac{\varepsilon}{1+\varepsilon}}\, d\rho = \gamma^{\frac{\varepsilon}{1+\varepsilon}}\int_\tau^\infty k(\rho)\, d\rho
\]
diverges. If it converges, then the limit lim_{t→∞} y(t) still exists, but it is not equal to γ^η. Moreover, the limit depends on the initial value. If y(τ) < γ^η, then equality (7.169) implies y(τ) < y(t) < γ^η for all t ≥ τ, and y(t) increases to its limit. If y(τ) = γ^η, then y(t) = γ^η, t ≥ τ.
Proof. For brevity we write η = 1/(1+ε). We introduce the function φ(t), t ≥ τ, defined by
\[
\varphi(t) = \bigl(\gamma^\eta - y(t)\bigr)\, \exp\Bigl((1+\varepsilon)\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr). \tag{7.165}
\]

We differentiate the function in (7.165) to obtain
\[
\dot\varphi(t) = \varphi(t)\, \frac{\frac{d}{dt}\bigl(\gamma^\eta - y(t)\bigr)}{\gamma^\eta - y(t)} + \varphi(t)\, (1+\varepsilon)\, k(t)^\eta\int_0^1\bigl((1-s)C(t)^\eta + s\, k(t)^\eta y(t)\bigr)^\varepsilon\, ds, \tag{7.166}
\]
and hence
\[
\begin{aligned}
\bigl(\gamma^\eta - y(t)\bigr)\dot\varphi(t) &= \Bigl(\frac{d}{dt}\bigl(\gamma^\eta - y(t)\bigr)\Bigr)\varphi(t) + \varphi(t)\, (1+\varepsilon)\bigl(C(t)^\eta - k(t)^\eta y(t)\bigr)\int_0^1\bigl((1-s)C(t)^\eta + s\, k(t)^\eta y(t)\bigr)^\varepsilon\, ds\\
&= \Bigl(\frac{d}{dt}\bigl(\gamma^\eta - y(t)\bigr)\Bigr)\varphi(t) - \varphi(t)\int_0^1 \frac{\partial}{\partial s}\bigl((1-s)C(t)^\eta + s\, k(t)^\eta y(t)\bigr)^{1+\varepsilon}\, ds\\
&= \Bigl(\frac{d}{dt}\bigl(\gamma^\eta - y(t)\bigr)\Bigr)\varphi(t) - \varphi(t)\bigl(k(t)\, y(t)^{1+\varepsilon} - C(t)\bigr) \tag{7.167}
\end{aligned}
\]
(y(t) satisfies equation (7.162))
\[
= \Bigl(\frac{d}{dt}\bigl(\gamma^\eta - y(t)\bigr)\Bigr)\varphi(t) + \varphi(t)\, \dot{y}(t) = \Bigl(\frac{d}{dt}\,\gamma^\eta\Bigr)\varphi(t) = 0, \tag{7.168}
\]
where we used the fact that γ(t) does not depend on t ≥ τ. Consequently, from (7.168) it follows that the function φ(t) does not depend on t ≥ τ. From the definition of φ (see (7.165)) we see that
\[
\begin{aligned}
y(t) - \gamma^\eta &= \bigl(y(\tau) - \gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&= \bigl(y(\tau) - \gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl((1-s)\gamma^\eta + s\, y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr). \tag{7.169}
\end{aligned}
\]
Suppose τ < t. From (7.169) we see that y(τ) > γ^η implies y(τ) > y(t) > γ^η, and that y(t) decreases to its limit. We also see that y(τ) < γ^η entails y(τ) < y(t) < γ^η, and that y(t) increases to its limit. If y(τ) ≤ γ^η, then equality (7.169) implies y(t) ≤ γ^η for all t ≥ τ. Even more is true:
\[
\begin{aligned}
0 \le \gamma^\eta - y(t) &= \bigl(\gamma^\eta - y(\tau)\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&\le \gamma^\eta\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl((1-s)\gamma^\eta + s\, y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&\le \gamma^\eta\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\, (1-s)^\varepsilon\, \gamma^{\varepsilon\eta}\, d\rho\, ds\Bigr)
= \gamma^\eta\, \exp\Bigl(-\gamma^{\varepsilon\eta}\int_\tau^t k(\rho)\, d\rho\Bigr). \tag{7.170}
\end{aligned}
\]
Next we put
\[
\Phi_\varepsilon(\tau,\rho) = (1+\varepsilon)\int_0^1\int_\tau^\rho k(\rho')\bigl((1-s)\gamma^\eta + s\, y(\rho')\bigr)^\varepsilon\, d\rho'\, ds. \tag{7.171}
\]

If the function y(t) solves equation (7.162), then (7.169) implies
\[
\bigl(y(\rho) - \gamma^\eta\bigr)\, e^{\Phi_\varepsilon(\tau,\rho)} = y(\tau) - \gamma^\eta, \tag{7.172}
\]
and consequently we get
\[
\begin{aligned}
e^{\varepsilon\Phi_\varepsilon(\tau,t)} - 1 &= \int_\tau^t \frac{\partial}{\partial\rho}\, e^{\varepsilon\Phi_\varepsilon(\tau,\rho)}\, d\rho
= \varepsilon(1+\varepsilon)\int_\tau^t\int_0^1 k(\rho)\bigl((1-s)\gamma^\eta + s\, y(\rho)\bigr)^\varepsilon\, e^{\varepsilon\Phi_\varepsilon(\tau,\rho)}\, ds\, d\rho\\
&= \varepsilon(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl(\gamma^\eta e^{\Phi_\varepsilon(\tau,\rho)} + s\bigl(y(\tau)-\gamma^\eta\bigr)\bigr)^\varepsilon\, d\rho\, ds. \tag{7.173}
\end{aligned}
\]

If y(τ) > γ^η, then (7.173) implies
\[
e^{\varepsilon\Phi_\varepsilon(\tau,t)} = 1 + \varepsilon(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl(\gamma^\eta e^{\Phi_\varepsilon(\tau,\rho)} + s\bigl(y(\tau)-\gamma^\eta\bigr)\bigr)^\varepsilon\, d\rho\, ds
\ge 1 + \varepsilon\int_\tau^t k(\rho)\, d\rho\, \bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon. \tag{7.174}
\]
From (7.174) we infer
\[
e^{\Phi_\varepsilon(\tau,t)} \ge \Bigl(1 + \varepsilon\int_\tau^t k(\rho)\, d\rho\, \bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon\Bigr)^{1/\varepsilon} \ge \Bigl(\varepsilon\int_\tau^t k(\rho)\, d\rho\Bigr)^{1/\varepsilon}\bigl(y(\tau)-\gamma^\eta\bigr). \tag{7.175}
\]
From (7.172) with ρ = t together with (7.175) we see that
\[
0 \le y(t) - \gamma^\eta \le \Bigl(\varepsilon\int_\tau^t k(\rho)\, d\rho\Bigr)^{-1/\varepsilon}. \tag{7.176}
\]

If 0 ≤ y(τ) < γ^η we proceed as follows. Again we use (7.173) to obtain
\[
\begin{aligned}
e^{\varepsilon\Phi_\varepsilon(\tau,t)} &= 1 + \varepsilon(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl(\gamma^\eta e^{\Phi_\varepsilon(\tau,\rho)} - s\bigl(\gamma^\eta - y(\tau)\bigr)\bigr)^\varepsilon\, d\rho\, ds\\
&\ge 1 + \varepsilon(1+\varepsilon)\int_\tau^t\int_0^1 k(\rho)\bigl((1-s)\bigl(\gamma^\eta - y(\tau)\bigr)\bigr)^\varepsilon\, ds\, d\rho
= 1 + \varepsilon\int_\tau^t k(\rho)\, d\rho\, \bigl(\gamma^\eta - y(\tau)\bigr)^\varepsilon. \tag{7.177}
\end{aligned}
\]
Hence we see
\[
e^{\Phi_\varepsilon(\tau,t)} \ge \Bigl(1 + \varepsilon\int_\tau^t k(\rho)\, d\rho\, \bigl(\gamma^\eta - y(\tau)\bigr)^\varepsilon\Bigr)^{1/\varepsilon} \ge \Bigl(\varepsilon\int_\tau^t k(\rho)\, d\rho\Bigr)^{1/\varepsilon}\bigl(\gamma^\eta - y(\tau)\bigr). \tag{7.178}
\]
From (7.169) with γ^η > y(τ) together with (7.178) we then get
\[
\gamma^\eta - y(t) = \bigl(\gamma^\eta - y(\tau)\bigr)\, e^{-\Phi_\varepsilon(\tau,t)} \le \Bigl(\varepsilon\int_\tau^t k(\rho)\, d\rho\Bigr)^{-1/\varepsilon}. \tag{7.179}
\]
Inequality (7.163) in Lemma 7.28 now follows from (7.176) and (7.179).
If y(τ) > γ^η, then (7.169) implies y(τ) > y(ρ) > γ^η, ρ > τ, and
\[
\begin{aligned}
y(t) - \gamma^\eta &= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl((1-s)\gamma^\eta + s\, y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&\ge \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl((1-s)y(\rho) + s\, y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_\tau^t k(\rho)\, y(\rho)^\varepsilon\, d\rho\Bigr). \tag{7.180}
\end{aligned}
\]
Hence from (7.180) we obtain, with ρ instead of t:
\[
y(\rho) \ge \gamma^\eta + \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_\tau^\rho k(\rho')\, y(\rho')^\varepsilon\, d\rho'\Bigr), \tag{7.181}
\]
and hence
\[
(1-s)\gamma^\eta + s\, y(\rho) \ge \gamma^\eta + s\bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_\tau^\rho k(\rho')\, y(\rho')^\varepsilon\, d\rho'\Bigr). \tag{7.182}
\]
Again using (7.180) and (7.182) we then obtain:
\[
\begin{aligned}
0 \le y(t) - \gamma^\eta &= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)^\eta\bigl((1-s)C(\rho)^\eta + s\, k(\rho)^\eta y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\bigl((1-s)\gamma^\eta + s\, y(\rho)\bigr)^\varepsilon\, d\rho\, ds\Bigr)\\
&\le \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl\{-(1+\varepsilon)\int_0^1\int_\tau^t k(\rho)\Bigl(\gamma^\eta + s\bigl(y(\tau)-\gamma^\eta\bigr)\, e^{-(1+\varepsilon)\int_\tau^\rho k(\rho')y(\rho')^\varepsilon\, d\rho'}\Bigr)^\varepsilon\, d\rho\, ds\Bigr\} \tag{7.183}
\end{aligned}
\]
(use the elementary inequality $(\gamma^\eta + a)^\varepsilon \ge 2^{(\varepsilon-1)\wedge 0}\bigl(\gamma^{\eta\varepsilon} + a^\varepsilon\bigr)$, a > 0, ε > 0)
\[
\begin{aligned}
&\le \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \gamma^{\eta\varepsilon}\int_\tau^t k(\rho)\, d\rho\Bigr)\\
&\qquad \times \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon\int_0^1\int_\tau^t k(\rho)\, s^\varepsilon\, e^{-\varepsilon(1+\varepsilon)\int_\tau^\rho k(\rho')y(\rho')^\varepsilon\, d\rho'}\, d\rho\, ds\Bigr)\\
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \gamma^{\eta\varepsilon}\int_\tau^t k(\rho)\, d\rho\Bigr)\\
&\qquad \times \exp\Bigl\{-\bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon\, 2^{(\varepsilon-1)\wedge 0}\int_\tau^t k(\rho)\, e^{-\varepsilon(1+\varepsilon)\int_\tau^\rho k(\rho')y(\rho')^\varepsilon\, d\rho'}\, d\rho\Bigr\}\\
&\le \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \gamma^{\eta\varepsilon}\int_\tau^t k(\rho)\, d\rho\Bigr)\\
&\qquad \times \exp\Bigl(-2^{(\varepsilon-1)\wedge 0}\, \bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon\int_\tau^t k(\rho)\, e^{-\varepsilon(1+\varepsilon)\, y(\tau)^\varepsilon\int_\tau^\rho k(\rho')\, d\rho'}\, d\rho\Bigr)\\
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \gamma^{\eta\varepsilon}\int_\tau^t k(\rho)\, d\rho\Bigr)\\
&\qquad \times \exp\Bigl\{-\frac{2^{(\varepsilon-1)\wedge 0}}{\varepsilon(1+\varepsilon)}\Bigl(\frac{y(\tau)-\gamma^\eta}{y(\tau)}\Bigr)^\varepsilon\Bigl(1 - e^{-\varepsilon(1+\varepsilon)\, y(\tau)^\varepsilon\int_\tau^t k(\rho')\, d\rho'}\Bigr)\Bigr\}
\end{aligned}
\]
(use the elementary equality $1 - e^{-a} = \int_0^1 a\, e^{-sa}\, ds$, a ≥ 0)
\[
\begin{aligned}
&= \bigl(y(\tau)-\gamma^\eta\bigr)\, \exp\Bigl(-(1+\varepsilon)\, 2^{(\varepsilon-1)\wedge 0}\, \gamma^{\eta\varepsilon}\int_\tau^t k(\rho)\, d\rho\Bigr)\\
&\qquad \times \exp\Bigl\{-2^{(\varepsilon-1)\wedge 0}\, \bigl(y(\tau)-\gamma^\eta\bigr)^\varepsilon\int_\tau^t k(\rho')\, d\rho'\int_0^1 e^{-\varepsilon(1+\varepsilon)\, s\, y(\tau)^\varepsilon\int_\tau^t k(\rho')\, d\rho'}\, ds\Bigr\}
\end{aligned}
\]

Lemma 7.30. Let φ(t), c₁(t), χ(t) and ψ(t) be nonnegative continuous functions on the interval [τ,∞) such that φ(t) ≤ ψ(t) + χ(t)∫_τ^t c₁(ρ)φ(ρ) dρ, t ≥ τ. Then
\[
\varphi(t) \le \psi(t) + \chi(t)\int_\tau^t \sum_{j=0}^{n-1}\frac{\Bigl(\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'\Bigr)^j}{j!}\, c_1(\rho)\, \psi(\rho)\, d\rho
 + \chi(t)\int_\tau^t \frac{\Bigl(\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'\Bigr)^n}{n!}\, c_1(\rho)\, \varphi(\rho)\, d\rho. \tag{7.184}
\]
From (7.184) it follows that:
\[
\varphi(t) \le \psi(t) + \chi(t)\int_\tau^t \sum_{j=0}^\infty \frac{\Bigl(\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'\Bigr)^j}{j!}\, c_1(\rho)\, \psi(\rho)\, d\rho
= \psi(t) + \chi(t)\int_\tau^t e^{\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'}\, c_1(\rho)\, \psi(\rho)\, d\rho. \tag{7.185}
\]
If ψ(t) = χ(t)φ(τ) + χ(t)∫_τ^t z(ρ) dρ, then (7.185) implies:
\[
\begin{aligned}
\varphi(t) &\le \chi(t)\varphi(\tau) + \chi(t)\int_\tau^t z(\rho)\, d\rho + \chi(t)\int_\tau^t e^{\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'}\, \chi(\rho)\, c_1(\rho)\Bigl(\varphi(\tau) + \int_\tau^\rho z(\rho'')\, d\rho''\Bigr)\, d\rho\\
&= \chi(t)\varphi(\tau)\, e^{\int_\tau^t \chi(\rho)c_1(\rho)\, d\rho} + \chi(t)\int_\tau^t e^{\int_\rho^t \chi(\rho')c_1(\rho')\, d\rho'}\, z(\rho)\, d\rho. \tag{7.186}
\end{aligned}
\]
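With constant ψ, χ and c₁, the integral inequality of Lemma 7.30 can be tested against a closed form: if φ(t) = ψ + χc₁∫_τ^t φ(ρ) dρ holds with equality, then φ(t) = ψ e^{χc₁(t−τ)}, and (7.185) holds with equality as well. The following fixed-point sketch is ours, not from the monograph; it solves the integral equation by Picard iteration and compares with the Gronwall bound.

```python
import math

psi, chi, c1, tau, T, n = 2.0, 1.5, 0.7, 0.0, 2.0, 4000
h = (T - tau) / n
ts = [tau + i * h for i in range(n + 1)]

# solve phi(t) = psi + chi * c1 * \int_tau^t phi(rho) d rho by Picard iteration
phi = [psi] * (n + 1)
for _ in range(60):
    integral, new = 0.0, [psi]
    for i in range(1, n + 1):
        integral += 0.5 * h * (phi[i - 1] + phi[i])   # trapezoid rule
        new.append(psi + chi * c1 * integral)
    phi = new

# bound (7.185) with constant data collapses to psi * e^{chi c1 (t - tau)}
for t, p in zip(ts, phi):
    bound = psi * math.exp(chi * c1 * (t - tau))
    assert abs(p - bound) < 1e-3
print("phi matches the Gronwall bound; phi(T) =", phi[-1])
```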

Remark 7.31. If the matrix A(t) ≠ 0, then the operator L(t) does not generate an analytic semigroup. This means that our theory is not directly applicable to the example in Proposition 7.25. We could make C(t,τ) state-dependent, and A(t) as well. One way of doing this is by taking a unique solution to a stochastic differential equation:
\[
dX(t) = b\bigl(t, X(t)\bigr)\, dt + \sigma\bigl(t, X(t)\bigr)\, dW(t), \quad t\ge\tau\ge 0, \tag{7.187}
\]
and then defining the family C(t,τ,X(τ)) by X(t) = C(t,τ,X(τ)). The functional C(t,τ,X(τ)) then depends on the σ-field generated by X(τ) and F_t^τ = σ(W(ρ): τ ≤ ρ ≤ t). The evolution Y(τ,t) is then defined by
\[
Y(\tau,t)f(x) = \mathbb{E}\bigl[f(X(t))\ \big|\ X(\tau)=x\bigr] = \mathbb{E}_{\tau,x}\bigl[f\bigl(C(t,\tau,X(\tau))\bigr)\bigr], \quad t\ge\tau.
\]
In this general setup we no longer have explicit formulas. Moreover, this choice of C(t,τ,x) does not give any A(t), because the function t ↦ X(t) = C(t,τ,X(τ)) is not differentiable in a classical sense. Of course, it satisfies (7.187).
We want to study the processes $t \mapsto X(t)$, $s \mapsto X^{t,A(t)}(s)$ and $t \mapsto X_0^{\tau,A(\tau)}(t)$, which are solutions to the following stochastic integral equations, and their inter-relationships:

\[
X(t) = C(t,\tau)X(\tau) + \int_\tau^t C(t,\rho)\,\sigma(\rho, X(\rho))\,dW(\rho),
\]
\[
X^{t,A(t)}(s) = e^{sA(t)} X^{t,A(t)}(0) + \int_0^s e^{(s-\rho)A(t)}\,\sigma\bigl(t, X^{t,A(t)}(\rho)\bigr)\,dW(\rho), \quad\text{and}
\]
\[
X_0^{\tau,A(\tau)}(t) = e^{(t-\tau)A(\tau)} X_0^{\tau,A(\tau)}(\tau) + \int_\tau^t e^{(\rho-\tau)A(\tau)}\,\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho). \tag{7.188}
\]

For $t \ge s \ge \tau$ the matrix family $C(t,\tau)$ satisfies $C(t,\tau) = C(t,s)C(s,\tau)$, and the matrix family $A(\tau)$ is defined by
\[
A(\tau) = \lim_{h \downarrow 0} \frac{C(\tau+h, \tau) - I}{h}.
\]
In differential form the stochastic integral equations in (7.188) read as follows:

\[
dX(t) = A(t)X(t)\,dt + \sigma(t, X(t))\,dW(t); \tag{7.189}
\]
\[
dX^{t,A(t)}(s) = A(t)X^{t,A(t)}(s)\,ds + \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s), \quad\text{and} \tag{7.190}
\]
\[
dX_0^{\tau,A(\tau)}(t) = A(\tau)\Bigl(X_0^{\tau,A(\tau)}(t) - \int_\tau^t e^{(\rho-\tau)A(\tau)}\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho)\Bigr)dt
+ e^{(t-\tau)A(\tau)}\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t). \tag{7.191}
\]
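For illustration only, equation (7.189) can be integrated numerically with an Euler-Maruyama scheme; the matrix $A(t)$, the diffusion coefficient $\sigma(t,x)$, and the time grid below are hypothetical choices, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def A(t):                        # example drift matrix (an arbitrary choice)
    return np.array([[-1.0, t], [0.0, -2.0]])

def sigma(t, x):                 # example state-dependent diffusion matrix
    return 0.3 * np.eye(2) * (1.0 + 0.1 * np.tanh(x[0]))

tau, T, n = 0.0, 1.0, 1000
dt = (T - tau) / n
X = np.array([1.0, -0.5])        # X(tau)

# Euler-Maruyama discretization of dX = A(t) X dt + sigma(t, X) dW
for k in range(n):
    t = tau + k * dt
    dW = rng.normal(scale=np.sqrt(dt), size=2)
    X = X + A(t) @ X * dt + sigma(t, X) @ dW

print(X)
```

Since the chosen $A(t)$ has eigenvalues with negative real part, the simulated path stays close to the origin.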

We will consider the following exponential martingale

\[
\mathcal{E}_\tau(t) = \exp\Bigl(-\int_\tau^t b(\rho, X(\rho))\,dW(\rho) - \frac12 \int_\tau^t |b(\rho, X(\rho))|^2\,d\rho\Bigr), \tag{7.192}
\]

and its companions

\[
\mathcal{E}^{t,A(t)}(s) = \exp\Bigl(-\int_0^s b\bigl(t, X^{t,A(t)}(\rho)\bigr)\,dW(\rho) - \frac12 \int_0^s \bigl|b\bigl(t, X^{t,A(t)}(\rho)\bigr)\bigr|^2\,d\rho\Bigr), \tag{7.193}
\]

and

\[
\mathcal{E}_0^{\tau,A(\tau)}(t) = \exp\Bigl(-\int_\tau^t b\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho) - \frac12 \int_\tau^t \bigl|b\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\bigr|^2\,d\rho\Bigr). \tag{7.194}
\]

Instead of $\mathcal{E}_\tau(t)$ we write $\mathcal{E}(t)$. Put


\[
M^\tau(t) = \int_\tau^t b(\rho, X(\rho))\,dW(\rho), \qquad M^{t,A(t)}(s) = \int_0^s b\bigl(t, X^{t,A(t)}(\rho)\bigr)\,dW(\rho),
\]
and
\[
M_0^{\tau,A(\tau)}(t) = \int_\tau^t b\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho). \tag{7.195}
\]

Then by Itô calculus we have:

\[
d\mathcal{E}_\tau(t) = -\mathcal{E}_\tau(t)\,dM^\tau(t), \qquad d\mathcal{E}^{t,A(t)}(s) = -\mathcal{E}^{t,A(t)}(s)\,dM^{t,A(t)}(s),
\]
\[
d\mathcal{E}_0^{\tau,A(\tau)}(t) = -\mathcal{E}_0^{\tau,A(\tau)}(t)\,dM_0^{\tau,A(\tau)}(t). \tag{7.196}
\]

Let $f: \mathbb{R}^d \to \mathbb{C}$ be a $C^2$-function. Again employing Itô calculus shows:

\[
df(X(t)) = \sum_{k=1}^d D_k f(X(t))\,dX_k(t) + \frac12 \sum_{j,k=1}^d Q_{j,k}(t, X(t))\,D_j D_k f(X(t))\,dt
\]
\[
= \Bigl( \bigl\langle \nabla f(X(t)), A(t)X(t) \bigr\rangle + \frac12 \sum_{j,k=1}^d Q_{j,k}(t, X(t))\,D_j D_k f(X(t)) \Bigr) dt
+ \bigl\langle \nabla f(X(t)), \sigma(t, X(t))\,dW(t) \bigr\rangle. \tag{7.197}
\]

By the same token we get

\[
df\bigl(X^{t,A(t)}(s)\bigr) = \sum_{k=1}^d D_k f\bigl(X^{t,A(t)}(s)\bigr)\,dX_k^{t,A(t)}(s)
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(t, X^{t,A(t)}(s)\bigr)\,D_j D_k f\bigl(X^{t,A(t)}(s)\bigr)\,ds
\]
\[
= \Bigl( \bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), A(t)X^{t,A(t)}(s) \bigr\rangle
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(t, X^{t,A(t)}(s)\bigr)\,D_j D_k f\bigl(X^{t,A(t)}(s)\bigr) \Bigr) ds
+ \bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s) \bigr\rangle. \tag{7.198}
\]

In addition we have, again by Itô calculus,

\[
df\bigl(X_0^{\tau,A(\tau)}(t)\bigr) = \sum_{k=1}^d D_k f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\,dX_{0,k}^{\tau,A(\tau)}(t)
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,D_j D_k f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\,dt
\]
\[
= \Bigl( \bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), A(\tau)X_0^{\tau,A(\tau)}(t) \bigr\rangle
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,D_j D_k f\bigl(X_0^{\tau,A(\tau)}(t)\bigr) \Bigr) dt
\]
\[
+ \bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), \sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t) \bigr\rangle
- \Bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), A(\tau)\int_\tau^t e^{(\rho-\tau)A(\tau)}\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho) \Bigr\rangle dt. \tag{7.199}
\]
We also need the covariation processes:

\[
\bigl\langle \mathcal{E}(\cdot), f(X(\cdot)) \bigr\rangle(t), \qquad \bigl\langle \mathcal{E}^{t,A(t)}(\cdot), f\bigl(X^{t,A(t)}(\cdot)\bigr) \bigr\rangle(s), \quad\text{and}\quad \bigl\langle \mathcal{E}_0^{\tau,A(\tau)}(\cdot), f\bigl(X_0^{\tau,A(\tau)}(\cdot)\bigr) \bigr\rangle(t). \tag{7.200}
\]

The covariation process $\langle \mathcal{E}(\cdot), f(X(\cdot)) \rangle(t)$ is determined by

\[
d\bigl\langle \mathcal{E}(\cdot), f(X(\cdot)) \bigr\rangle(t) = -\mathcal{E}(t)\,\bigl\langle \nabla f(X(t)), \sigma(t, X(t))\,b(t, X(t)) \bigr\rangle\,dt. \tag{7.201}
\]

The covariation process $\bigl\langle \mathcal{E}^{t,A(t)}(\cdot), f\bigl(X^{t,A(t)}(\cdot)\bigr) \bigr\rangle(s)$ is determined by

\[
d\bigl\langle \mathcal{E}^{t,A(t)}(\cdot), f\bigl(X^{t,A(t)}(\cdot)\bigr) \bigr\rangle(s)
= -\mathcal{E}^{t,A(t)}(s)\,\bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,b\bigl(t, X^{t,A(t)}(s)\bigr) \bigr\rangle\,ds. \tag{7.202}
\]
Likewise the covariation process $\bigl\langle \mathcal{E}_0^{\tau,A(\tau)}(\cdot), f\bigl(X_0^{\tau,A(\tau)}(\cdot)\bigr) \bigr\rangle(t)$ is determined by

\[
d\bigl\langle \mathcal{E}_0^{\tau,A(\tau)}(\cdot), f\bigl(X_0^{\tau,A(\tau)}(\cdot)\bigr) \bigr\rangle(t)
= -\mathcal{E}_0^{\tau,A(\tau)}(t)\,\bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), \sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,b\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr) \bigr\rangle\,dt. \tag{7.203}
\]

Next we calculate the stochastic differentials of the processes

\[
\mathcal{E}(t)f(X(t)), \qquad \mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr) \quad\text{and}\quad \mathcal{E}_0^{\tau,A(\tau)}(t)f\bigl(X_0^{\tau,A(\tau)}(t)\bigr).
\]

Using Itô calculus, the equality in (7.192), and the first equality in (7.195) and in (7.196), in conjunction with (7.197) and (7.201), we obtain

\[
d\bigl(\mathcal{E}(t)f(X(t))\bigr) = \bigl(d\mathcal{E}(t)\bigr) f(X(t)) + \mathcal{E}(t)\,df(X(t)) + d\bigl\langle \mathcal{E}(\cdot), f(X(\cdot)) \bigr\rangle(t)
\]
\[
= -\mathcal{E}(t)f(X(t))\,b(t, X(t))\,dW(t)
+ \mathcal{E}(t)\Bigl( \bigl\langle \nabla f(X(t)), A(t)X(t) \bigr\rangle + \frac12 \sum_{j,k=1}^d Q_{j,k}(t, X(t))\,D_j D_k f(X(t)) \Bigr) dt
\]
\[
+ \mathcal{E}(t)\bigl\langle \nabla f(X(t)), \sigma(t, X(t))\,dW(t) \bigr\rangle
- \mathcal{E}(t)\bigl\langle \nabla f(X(t)), \sigma(t, X(t))\,b(t, X(t)) \bigr\rangle\,dt
\]
\[
= -\mathcal{E}(t)f(X(t))\,b(t, X(t))\,dW(t) + \mathcal{E}(t)\bigl\langle \nabla f(X(t)), \sigma(t, X(t))\,dW(t) \bigr\rangle
+ \mathcal{E}(t)\,L_b(t)f(X(t))\,dt, \tag{7.204}
\]

where, with $Q(t,x) = \sigma(t,x)\sigma(t,x)^*$, we wrote

\[
L_b(t)f(x) = \bigl\langle \nabla f(x), A(t)x - \sigma(t,x)\,b(t,x) \bigr\rangle + \frac12 \sum_{j,k=1}^d Q_{j,k}(t,x)\,D_j D_k f(x). \tag{7.205}
\]
Put
\[
Q_{t,A(t)}(s) = \int_0^s e^{\rho A(t)}\,\sigma(t,x)\sigma(t,x)^*\,e^{\rho A(t)^*}\,d\rho = \int_0^s e^{\rho A(t)}\,Q(t,x)\,e^{\rho A(t)^*}\,d\rho,
\]
and
\[
Y(\tau, t)f(x) = \mathbb{E}\bigl[\mathcal{E}_\tau(t)f(X(t)) \bigm| X(\tau) = x\bigr]. \tag{7.206}
\]
Then
\[
Y(\tau, s)\,Y(s, t)f(x) = Y(\tau, t)f(x), \qquad f \in C_b(E),\; x \in E,\; \tau \le s \le t. \tag{7.207}
\]

Next we calculate the stochastic differential of the process
\[
s \mapsto \mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr).
\]
More precisely, upon employing Itô calculus, the martingale in (7.193), and the second equality in (7.195) and in (7.196), in conjunction with (7.198) and (7.202), we obtain
\[
d\Bigl(\mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr)\Bigr)
= \bigl(d\mathcal{E}^{t,A(t)}(s)\bigr) f\bigl(X^{t,A(t)}(s)\bigr) + \mathcal{E}^{t,A(t)}(s)\,df\bigl(X^{t,A(t)}(s)\bigr)
+ d\bigl\langle \mathcal{E}^{t,A(t)}(\cdot), f\bigl(X^{t,A(t)}(\cdot)\bigr) \bigr\rangle(s)
\]
\[
= -\mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr)\,b\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s)
+ \mathcal{E}^{t,A(t)}(s)\Bigl( \bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), A(t)X^{t,A(t)}(s) \bigr\rangle
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(t, X^{t,A(t)}(s)\bigr)\,D_j D_k f\bigl(X^{t,A(t)}(s)\bigr) \Bigr) ds
\]
\[
+ \mathcal{E}^{t,A(t)}(s)\bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s) \bigr\rangle
- \mathcal{E}^{t,A(t)}(s)\bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,b\bigl(t, X^{t,A(t)}(s)\bigr) \bigr\rangle\,ds
\]
\[
= -\mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr)\,b\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s)
+ \mathcal{E}^{t,A(t)}(s)\bigl\langle \nabla f\bigl(X^{t,A(t)}(s)\bigr), \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s) \bigr\rangle
+ \mathcal{E}^{t,A(t)}(s)\,L_b(t)f\bigl(X^{t,A(t)}(s)\bigr)\,ds, \tag{7.208}
\]

where $L_b(t)$ is as in (7.205).


In a quite similar manner we obtain the stochastic differential of the pro-
cess ³ ´
τ,A(τ ) τ,A(τ )
t 7→ E0 (t)f X0 (t) .

Upon employing Itô calculus, the equality in (7.194), and the third martingale
in (7.195) and in (7.196), in conjunction with (7.199) and (7.203) we get
\[
d\Bigl(\mathcal{E}_0^{\tau,A(\tau)}(t)f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\Bigr)
= \bigl(d\mathcal{E}_0^{\tau,A(\tau)}(t)\bigr) f\bigl(X_0^{\tau,A(\tau)}(t)\bigr) + \mathcal{E}_0^{\tau,A(\tau)}(t)\,df\bigl(X_0^{\tau,A(\tau)}(t)\bigr)
+ d\bigl\langle \mathcal{E}_0^{\tau,A(\tau)}(\cdot), f\bigl(X_0^{\tau,A(\tau)}(\cdot)\bigr) \bigr\rangle(t)
\]
\[
= -\mathcal{E}_0^{\tau,A(\tau)}(t)f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\,b\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t)
+ \mathcal{E}_0^{\tau,A(\tau)}(t)\Bigl( \bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), A(\tau)X_0^{\tau,A(\tau)}(t) \bigr\rangle
+ \frac12 \sum_{j,k=1}^d Q_{j,k}\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,D_j D_k f\bigl(X_0^{\tau,A(\tau)}(t)\bigr) \Bigr) dt
\]
\[
+ \mathcal{E}_0^{\tau,A(\tau)}(t)\bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), \sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t) \bigr\rangle
- \mathcal{E}_0^{\tau,A(\tau)}(t)\Bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), A(\tau)\int_\tau^t e^{(\rho-\tau)A(\tau)}\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho) \Bigr\rangle dt
- \mathcal{E}_0^{\tau,A(\tau)}(t)\bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), \sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,b\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr) \bigr\rangle\,dt
\]
\[
= -\mathcal{E}_0^{\tau,A(\tau)}(t)f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\,b\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t)
+ \mathcal{E}_0^{\tau,A(\tau)}(t)\bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), \sigma\bigl(\tau, X_0^{\tau,A(\tau)}(t)\bigr)\,dW(t) \bigr\rangle
- \mathcal{E}_0^{\tau,A(\tau)}(t)\Bigl\langle \nabla f\bigl(X_0^{\tau,A(\tau)}(t)\bigr), A(\tau)\int_\tau^t e^{(\rho-\tau)A(\tau)}\sigma\bigl(\tau, X_0^{\tau,A(\tau)}(\rho)\bigr)\,dW(\rho) \Bigr\rangle dt
+ \mathcal{E}_0^{\tau,A(\tau)}(t)\,L_b(\tau)f\bigl(X_0^{\tau,A(\tau)}(t)\bigr)\,dt, \tag{7.209}
\]

where $L_b(\tau)$ is as in (7.205):

\[
L_b(\tau)f(x) = \bigl\langle \nabla f(x), A(\tau)x - \sigma(\tau,x)\,b(\tau,x) \bigr\rangle + \frac12 \sum_{j,k=1}^d Q_{j,k}(\tau,x)\,D_j D_k f(x). \tag{7.210}
\]
Next let $s \mapsto X^{t,A(t)}(s)$ be the solution to the stochastic integral equation

\[
X^{t,A(t)}(s) = e^{sA(t)}X^{t,A(t)}(0) + \int_0^s e^{(s-\rho)A(t)}\sigma\bigl(t, X^{t,A(t)}(\rho)\bigr)\,dW(\rho), \tag{7.211}
\]

which is equivalent to

\[
dX^{t,A(t)}(s) = A(t)X^{t,A(t)}(s)\,ds + \sigma\bigl(t, X^{t,A(t)}(s)\bigr)\,dW(s), \tag{7.212}
\]

which is the same as the second equation in (7.188) and which in differential form is given in (7.190). In terms of the exponential martingale $s \mapsto \mathcal{E}^{t,A(t)}(s)$ defined in (7.193) the semigroup $e^{sL_b(t)}$, $s \ge 0$, is given by:

\[
e^{sL_b(t)}f(x) = \mathbb{E}\Bigl[\mathcal{E}^{t,A(t)}(s)f\bigl(X^{t,A(t)}(s)\bigr) \Bigm| X^{t,A(t)}(0) = x\Bigr]. \tag{7.213}
\]
We also want to give conditions in order that for every $\mu \in M(\mathbb{R}^d)$ the following limit relation holds:

\[
\lim_{t \to \infty} \operatorname{Var}\bigl(L_b(t)^*\, Y(\tau,t)^* \mu\bigr) = 0. \tag{7.214}
\]

We suppose that the coefficients $b(t,x) = b(t)$ and $\sigma(t,x) = \sigma(t)$ depend only on time. Then the (formal) adjoint of the operator $L_b(t)$ can be written as follows:

\[
L_b(t)^* f(x) = -\bigl\langle \nabla f(x), A(t)x - \sigma(t)b(t) \bigr\rangle - \operatorname{tr}(A(t))\,f(x) + \frac12 \sum_{j,k=1}^d Q_{j,k}(t)\,D_j D_k f(x). \tag{7.215}
\]

Put $Q_C(\tau,t) = \int_\tau^t C(t,\rho)\,\sigma(\rho)\sigma(\rho)^*\,C(t,\rho)^*\,d\rho$. Then for the evolution family $Y(\tau,t)$ we have:

\[
Y(\tau,t)f(x) = \frac{1}{(2\pi)^{d/2}} \int e^{-\frac12|y|^2}\, f\Bigl(C(t,\tau)x - \int_\tau^t C(t,\rho)\sigma(\rho)b(\rho)\,d\rho - \bigl(Q_C(\tau,t)\bigr)^{1/2} y\Bigr)\,dy
\]
\[
= \frac{1}{(2\pi)^{d/2} \det\bigl(Q_C(\tau,t)\bigr)^{1/2}} \int \exp\biggl(-\frac12 \Bigl| Q_C(\tau,t)^{-1/2}\Bigl(C(t,\tau)x - \int_\tau^t C(t,\rho)\sigma(\rho)b(\rho)\,d\rho - y\Bigr) \Bigr|^2\biggr) f(y)\,dy. \tag{7.216}
\]

Next suppose that the coefficients $b(t) = b(t,x)$ and $\sigma(t) = \sigma(t,x)$ depend only on the time $t$, and put

\[
g_s(x) = \frac{1}{(2\pi)^{d/2} \det\bigl(Q_{t,A(t)}(s)\bigr)^{1/2}} \exp\Bigl(-\frac12 \bigl\langle Q_{t,A(t)}(s)^{-1} x, x \bigr\rangle\Bigr).
\]

Then
\[
\widehat{g_s}(\xi) = \exp\Bigl(-\frac12 \bigl\langle Q_{t,A(t)}(s)\,\xi, \xi \bigr\rangle\Bigr),
\]
and hence by the Fourier inversion formula

\[
e^{sL_b(t)}f(x) = \mathbb{E}\Bigl[\exp\Bigl(-\langle b(t), W(s)\rangle - \frac12 |b(t)|^2 s\Bigr)\, f\Bigl(e^{sA(t)}x + \int_0^s e^{(s-\rho)A(t)}\sigma(t)\,dW(\rho)\Bigr)\Bigr]
\]
\[
= \frac{1}{(2\pi)^d}\,\mathbb{E}\Bigl[\exp\Bigl(-\langle b(t), W(s)\rangle - \frac12 |b(t)|^2 s\Bigr) \int \exp\Bigl(i\Bigl\langle \xi,\, e^{sA(t)}x + \int_0^s e^{(s-\rho)A(t)}\sigma(t)\,dW(\rho)\Bigr\rangle\Bigr)\,\widehat{f}(\xi)\,d\xi\Bigr]
\]
\[
= \frac{1}{(2\pi)^d} \int \exp\Bigl(i\Bigl\langle \xi,\, e^{sA(t)}x - \int_0^s e^{(s-\rho)A(t)}\,d\rho\,\sigma(t)b(t)\Bigr\rangle\Bigr)
\exp\Bigl(-\frac12 \Bigl\langle \int_0^s e^{(s-\rho)A(t)}\sigma(t)\sigma(t)^* e^{(s-\rho)A(t)^*}\,d\rho\,\xi, \xi\Bigr\rangle\Bigr)\,\widehat{f}(\xi)\,d\xi
\]
\[
= \frac{1}{(2\pi)^d} \int \exp\Bigl(i\Bigl\langle \xi,\, e^{sA(t)}x - \int_0^s e^{\rho A(t)}\,d\rho\,\sigma(t)b(t)\Bigr\rangle\Bigr)
\exp\Bigl(-\frac12 \Bigl\langle \int_0^s e^{\rho A(t)}\sigma(t)\sigma(t)^* e^{\rho A(t)^*}\,d\rho\,\xi, \xi\Bigr\rangle\Bigr)\,\widehat{f}(\xi)\,d\xi
\]
\[
= \frac{1}{(2\pi)^d} \int \exp\Bigl(i\Bigl\langle \xi,\, e^{sA(t)}x - \int_0^s e^{\rho A(t)}\,d\rho\,\sigma(t)b(t)\Bigr\rangle\Bigr)\,\widehat{g_s * f}(\xi)\,d\xi
= g_s * f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\,d\rho\,\sigma(t)b(t)\Bigr)
\]
\[
= \frac{1}{(2\pi)^{d/2} \det\bigl(Q_{t,A(t)}(s)\bigr)^{1/2}} \int \exp\biggl(-\frac12 \Bigl|Q_{t,A(t)}(s)^{-1/2}\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr)\Bigr|^2\biggr) f(y)\,dy
\]
\[
= \frac{1}{(2\pi)^{d/2}} \int e^{-\frac12|y|^2}\, f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - \bigl(Q_{t,A(t)}(s)\bigr)^{1/2} y\Bigr)\,dy. \tag{7.217}
\]
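The covariance $Q_{t,A(t)}(s)$ appearing in (7.217) can be computed by quadrature. A minimal numerical sketch, with a hypothetical pair $(A, \sigma)$ not taken from the text: it verifies the identity $A\,Q(s) + Q(s)A^* = e^{sA}\sigma\sigma^* e^{sA^*} - \sigma\sigma^*$, which follows by integrating $\frac{d}{d\rho}\bigl(e^{\rho A}\sigma\sigma^* e^{\rho A^*}\bigr)$ from $0$ to $s$:

```python
import numpy as np

A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # hypothetical A(t)
S = np.array([[0.4, 0.0], [0.1, 0.3]])     # hypothetical sigma(t)
C = S @ S.T                                # sigma sigma*

w, V = np.linalg.eig(A)                    # A is diagonalizable here
Vinv = np.linalg.inv(V)
def expA(r):                               # e^{rA} via the eigendecomposition
    return np.real((V * np.exp(r * w)) @ Vinv)

s, n = 1.5, 2000
rho = np.linspace(0.0, s, n + 1)
f = np.array([expA(r) @ C @ expA(r).T for r in rho])
Q = 0.5 * (f[:-1] + f[1:]).sum(axis=0) * (s / n)   # trapezoid rule for Q(s)

lhs = A @ Q + Q @ A.T
rhs = expA(s) @ C @ expA(s).T - C
print(np.max(np.abs(lhs - rhs)))           # small quadrature error
```

The same quadrature gives the Gaussian covariance used in the last two lines of (7.217).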

From the representation in (7.217) it follows that the operator $L_b(t)$ does not generate a bounded analytic semigroup. In fact, if $f \in C_b(\mathbb{R}^d)$ is such that its first and second derivatives are also continuous and bounded, then we have

\[
L_b(t)e^{sL_b(t)}f(x) = e^{sL_b(t)}L_b(t)f(x) = L_b(t)\,g_s * f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\,d\rho\,\sigma(t)b(t)\Bigr)
\]
\[
= \int L_b(t)^* g_s\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - \cdot\Bigr)(y)\, f(y)\,dy
\]
\[
= -\sum_{j=1}^d \int \frac{\partial g_s}{\partial y_j}\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr)\Bigl(\sum_{k=1}^d A_{j,k}(t)\,y_k - \sum_{\ell=1}^d \sigma_{j,\ell}(t)b_\ell(t)\Bigr) f(y)\,dy
\]
\[
- \operatorname{tr}(A(t)) \int g_s\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr) f(y)\,dy
+ \frac12 \sum_{j,k=1}^d Q_{j,k}(t) \int \frac{\partial^2 g_s}{\partial y_j \partial y_k}\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr) f(y)\,dy
\]
\[
= -\sum_{j=1}^d \int \frac{\partial g_s(y)}{\partial y_j}\Bigl(A(t)\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr) - \sigma(t)b(t)\Bigr)_j\, f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr)\,dy
\]
\[
- \operatorname{tr}(A(t)) \int g_s(y)\, f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr)\,dy
+ \frac12 \sum_{j,k=1}^d Q_{j,k}(t) \int \frac{\partial^2 g_s(y)}{\partial y_j \partial y_k}\, f\Bigl(e^{sA(t)}x - \int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - y\Bigr)\,dy. \tag{7.218}
\]

All terms in (7.218) are uniformly bounded in $x$ except the very first one, which grows like a constant times $|x|$. In order that the operator $L_b(t)$ generate a bounded analytic semigroup it is necessary and sufficient that $\sup_{s>0}\bigl\| sL_b(t)e^{sL_b(t)} \bigr\| < \infty$ and $\sup_{s>0}\bigl\| e^{sL_b(t)} \bigr\| < \infty$.
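For a matrix generator the condition $\sup_{s>0}\|sLe^{sL}\| < \infty$ can be inspected directly. In the sketch below $A$ is an arbitrary Hurwitz matrix (a hypothetical stand-in for the bounded part of the generator; its eigenvalues lie in the open left half-plane), so both suprema are finite:

```python
import numpy as np

A = np.array([[-1.0, 4.0], [0.0, -2.0]])   # hypothetical Hurwitz matrix
w, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)
def expA(s):                               # e^{sA} via the eigendecomposition
    return np.real((V * np.exp(s * w)) @ Vinv)

# s ||A e^{sA}|| is small for small s, peaks near s ~ 1, and decays to 0.
svals = np.logspace(-3, 2, 200)
norms = [s * np.linalg.norm(A @ expA(s), 2) for s in svals]
print(max(norms))
```

For the operator $L_b(t)$ itself the first term of (7.218) destroys this boundedness, which is exactly the point made above.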
Suppose that the real parts of the eigenvalues of the matrix $A(t)$ are strictly negative. From (7.217) it follows that the measure

\[
B \mapsto \frac{1}{(2\pi)^{d/2}} \int e^{-\frac12|y|^2}\, \mathbf{1}_B\Bigl(-\lim_{s\to\infty}\int_0^s e^{\rho A(t)}\sigma(t)b(t)\,d\rho - \lim_{s\to\infty}\bigl(Q_{t,A(t)}(s)\bigr)^{1/2} y\Bigr)\,dy
\]

serves as an invariant measure for the semigroup $e^{sL_b(t)}$, $s \ge 0$. Using the processes $X(t)$, $X^{\tau,A(\tau)}(t)$, and $X_0^{\tau,A(\tau)}(t)$, $t \ge \tau$, we introduce the filtered probability spaces $(\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x})$ and $\bigl(\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x}^{(0)}\bigr)$. Here the $\sigma$-field $\mathcal{F}_t^\tau$, $\tau \le t$, is generated by the variables $W(\rho)$, $\tau \le \rho \le t$. Let the variable $F$ be $\mathcal{F}_t^\tau$-measurable. Then we put

\[
\mathbb{E}_{\tau,x}[F] = \mathbb{E}\bigl[\mathcal{E}_\tau(t)F \bigm| X(\tau) = x\bigr]. \tag{7.219}
\]
On the other hand the definition of $\mathbb{P}_{\tau,x}^{(0)}$ is more of a challenge. First we take $F$, measurable with respect to $\mathcal{F}_t^\tau$, of the form $F = \prod_{j=1}^n f_j\bigl(X^{\tau,A(\tau)}(t_j)\bigr)$. Then we put

\[
\mathbb{E}_{\tau,x}^{(0)}[F] = \mathbb{E}\Bigl[\mathcal{E}^{\tau,A(\tau)}(t) \prod_{j=1}^n f_j\bigl(X^{\tau,A(\tau)}(t_j)\bigr) \Bigm| X(\tau) = x\Bigr]
= \mathbb{E}\Bigl[\mathcal{E}_0^{\tau,A(\tau)}(t) \prod_{j=1}^n f_j\bigl(X_0^{\tau,A(\tau)}(t_j)\bigr) \Bigm| X(\tau) = x\Bigr]. \tag{7.220}
\]

Example 7.32. Another not too artificial example is the adjoint of an operator of the form

\[
L(t) = \frac12 \sum_{j,k=1}^d a_{j,k}(t,x)\frac{\partial^2}{\partial x_j \partial x_k} + \sum_{j=1}^d b_j(t,x)\frac{\partial}{\partial x_j},
\]

defined on a dense subspace of the space $C_0(\mathbb{R}^d)$, i.e. the space of all continuous functions which vanish at infinity. The least that is required of the square matrix $(a_{j,k}(t,x))_{j,k=1}^d$ is that it be invertible, symmetric and positive-definite. We also observe that, for such a choice of the coefficients $a_{j,k}(t,x)$, the operator $L(t)$ satisfies the following maximum principle. For any function $f \in C_0(\mathbb{R}^d)$ belonging to the domain $D(L(t))$ there exists a point $(x_0, y_0) \in \mathbb{R}^d \times \mathbb{R}^d$ such that $\sup\bigl\{|f(x) - f(y)| \,;\, (x,y) \in \mathbb{R}^d \times \mathbb{R}^d\bigr\} = |f(x_0) - f(y_0)|$, and such that the next inequality holds:

\[
\Re\Bigl(\overline{\bigl(f(x_0) - f(y_0)\bigr)}\,\bigl(L(t)f(x_0) - L(t)f(y_0)\bigr)\Bigr) \le 0. \tag{7.221}
\]

Since the function $(x,y) \mapsto |f(x) - f(y)|$ attains its maximum at $(x_0, y_0)$, it follows that $\nabla f(x_0) = \nabla f(y_0) = 0$. It also follows that the function $x \mapsto \Re\bigl(\overline{(f(x_0) - f(y_0))}\,(f(x) - f(y_0))\bigr)$ attains its maximum at $x_0$. Hence, by inequality (7.229) below we see that

\[
\Re\Bigl(\overline{\bigl(f(x_0) - f(y_0)\bigr)}\,L(t)f(x_0)\Bigr)
= \Re\Bigl(L(t)\Bigl(\overline{\bigl(f(x_0) - f(y_0)\bigr)}\,\bigl(f(\cdot) - f(y_0)\bigr)\Bigr)(x_0)\Bigr) \le 0. \tag{7.222}
\]

By the same token we also have:

\[
-\Re\Bigl(\overline{\bigl(f(x_0) - f(y_0)\bigr)}\,L(t)f(y_0)\Bigr) \le 0. \tag{7.223}
\]

From (7.221) we infer, for $\alpha \in \mathbb{C}$, $\lambda \ge 0$, and $f \in D(L(t))$, the string of inequalities:

\[
4\,\bigl\| \lambda(f - \alpha\mathbf{1}) - L(t)f \bigr\|_\infty^2
\ge \sup_{(x,y)\in\mathbb{R}^d\times\mathbb{R}^d} \bigl| \lambda\bigl(f(x) - f(y)\bigr) - L(t)f(x) + L(t)f(y) \bigr|^2
\]
\[
= \sup_{(x,y)\in\mathbb{R}^d\times\mathbb{R}^d} \Bigl\{ |\lambda|^2 |f(x) - f(y)|^2
- 2\lambda\,\Re\Bigl(\overline{\bigl(f(x) - f(y)\bigr)}\,\bigl(L(t)f(x) - L(t)f(y)\bigr)\Bigr) + \bigl|L(t)f(x) - L(t)f(y)\bigr|^2 \Bigr\}
\]
\[
\ge |\lambda|^2 |f(x_0) - f(y_0)|^2 - 2\lambda\,\Re\Bigl(\overline{\bigl(f(x_0) - f(y_0)\bigr)}\,\bigl(L(t)f(x_0) - L(t)f(y_0)\bigr)\Bigr) + \bigl|L(t)f(x_0) - L(t)f(y_0)\bigr|^2
\]
\[
\ge |\lambda|^2 |f(x_0) - f(y_0)|^2 = |\lambda|^2 \sup_{(x,y)\in\mathbb{R}^d\times\mathbb{R}^d} |f(x) - f(y)|^2
\ge |\lambda|^2 \inf_{\alpha\in\mathbb{C}} \|f - \alpha\mathbf{1}\|_\infty^2. \tag{7.224}
\]

From the inequalities in (7.224) we obtain, for $\Re\lambda \ge 0$ and $f \in D(L(t))$:

\[
2 \inf_{\alpha\in\mathbb{C}} \bigl\| \lambda f - L(t)f - \alpha\mathbf{1} \bigr\|_\infty \ge |\lambda| \inf_{\alpha\in\mathbb{C}} \|f - \alpha\mathbf{1}\|_\infty. \tag{7.225}
\]

A similar argument shows that for $\lambda \ge 0$ and $f \in D(L(t))$ we also have:

\[
\|\lambda f - L(t)f\|_\infty \ge \lambda \|f\|_\infty, \tag{7.226}
\]

provided that for every function $f \in C_0(\mathbb{R}^d)$ belonging to the domain $D(L(t))$ there exists a point $x_0 \in \mathbb{R}^d$ such that $\sup\{|f(x)| \,;\, x \in \mathbb{R}^d\} = |f(x_0)|$, and such that $\Re\bigl(\overline{f(x_0)}\,L(t)f(x_0)\bigr) \le 0$. In fact the operators $L(t)$ satisfy the maximum principle in the sense that $\Re(L(t)f(x_0)) \le 0$ whenever $f \in D(L(t))$ and $x_0 \in \mathbb{R}^d$ is such that $\Re f(x_0) = \sup_{x\in\mathbb{R}^d} \Re f(x)$. One way of seeing this directly runs as follows. Let $f \in D(L(t))$ and let $x_0 \in \mathbb{R}^d$ be such that $\Re f(x_0) = \sup_{x\in\mathbb{R}^d} \Re f(x)$. Then $\Re\langle x - x_0, \nabla f(x_0)\rangle = 0$, and thus, for all $x \in \mathbb{R}^d$,

\[
\Re f(x) = \Re f(x_0) + \langle x - x_0, \nabla \Re f(x_0) \rangle
+ \int_0^1 (1-s) \sum_{j,k=1}^d (x_j - x_{0,j})(x_k - x_{0,k})\, \frac{\partial^2 \Re f}{\partial x_j \partial x_k}\bigl((1-s)x_0 + sx\bigr)\,ds
\]
\[
= \Re f(x_0) + \int_0^1 (1-s) \sum_{j,k=1}^d (x_j - x_{0,j})(x_k - x_{0,k})\, \frac{\partial^2 \Re f}{\partial x_j \partial x_k}\bigl((1-s)x_0 + sx\bigr)\,ds. \tag{7.227}
\]

From (7.227) and the fact that the function $\Re f$ attains its maximum at $x_0$ we see that

\[
\Re \int_0^1 (1-s) \sum_{j,k=1}^d (x_j - x_{0,j})(x_k - x_{0,k})\, \frac{\partial^2}{\partial x_j \partial x_k} f\bigl((1-s)x_0 + sx\bigr)\,ds \le 0. \tag{7.228}
\]
From the inequality in (7.228) it easily follows that the Hessian $D^2 \Re f(x_0)$, which is the matrix with entries $\frac{\partial^2}{\partial x_j \partial x_k}\Re f(x_0)$, is negative-definite: i.e. it is symmetric and its eigenvalues are less than or equal to $0$. Since the matrix $a(t,x_0) := (a_{j,k}(t,x_0))_{j,k=1}^d$ is positive-definite (i.e. its eigenvalues are non-negative and the matrix is symmetric), the functions $b_j(t,x)$, $1 \le j \le d$, are real-valued, and $\nabla\Re f(x_0) = 0$, we infer that

\[
\Re L(t)f(x_0) = \frac12 \sum_{j,k=1}^d a_{j,k}(t,x_0)\frac{\partial^2 \Re f}{\partial x_j \partial x_k}(x_0) + \sum_{j=1}^d b_j(t,x_0)\frac{\partial \Re f}{\partial x_j}(x_0)
= \frac12 \operatorname{trace}\bigl(a(t,x_0)\,D^2\Re f(x_0)\bigr)
= \frac12 \operatorname{trace}\Bigl(\sqrt{a(t,x_0)}\,D^2\Re f(x_0)\,\sqrt{a(t,x_0)}\Bigr) \le 0. \tag{7.229}
\]
The matrix $\sqrt{a(t,x_0)}$ is a positive-definite matrix whose square equals $a(t,x_0)$. In addition, we used in (7.229) the fact that the identity

\[
\sum_{j,k=1}^d a_{j,k}(t,x_0)\frac{\partial^2}{\partial x_j \partial x_k}\Re f(x_0)
\]

can be interpreted as $\operatorname{trace}\bigl(a(t,x_0)\,D^2\Re f(x_0)\bigr)$. It follows that the operators $L(t)$ generate analytic semigroups $e^{sL(t)}$, where $s \in \mathbb{C}$ belongs to a sector whose angle opening may be chosen independently of $t$, provided that

\[
\sup_{t>0}\,\sup_{s>0}\,\sup_{x\in\mathbb{R}^d}\, s\,\Bigl|\frac{\partial}{\partial s} P_{L(t)}(s,x,\cdot)\Bigr|\bigl(\mathbb{R}^d\bigr) < \infty.
\]

Here the Markov transition function $P_{L(t)}(s,x,B)$, $(s,x) \in [0,\infty)\times\mathbb{R}^d$, $B \in \mathcal{B}_{\mathbb{R}^d}$, $t \ge 0$, is determined by the equality

\[
e^{sL(t)}f(x) = \int_{\mathbb{R}^d} f(y)\,P_{L(t)}(s,x,dy), \qquad f \in C_b(\mathbb{R}^d).
\]

For the reason why, see the inequality in (7.100) and the equality in (7.101).
Then it follows that there exist a constant $C$ and an angle $\frac12\pi < \beta < \pi$, again independent of $t$, such that

\[
|\lambda|\,\bigl\| (\lambda I - L(t))^{-1} \bigr\| \le C, \qquad \text{for all } \lambda \in \mathbb{C} \text{ with } |\arg(\lambda)| \le \beta. \tag{7.230}
\]

For a proof see Theorem 7.50 and its Corollaries 7.51 and 7.52. Let $e^{sL(t)}$, $s \ge 0$, be the (analytic) semigroup generated by the operator $L(t)$. Then the (unbounded) inverse of the operator $-L(t)$ is given by the strong integral $f \mapsto \int_0^\infty e^{sL(t)}f\,ds$. From (7.225) it follows that for $\mu \in M_0(\mathbb{R}^d)$ and $\lambda > 0$ the inequality

\[
\lambda\,\Bigl| \Bigl\langle g, \bigl(\lambda I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)^{-1} \mu \Bigr\rangle \Bigr| \le \|g\|_\infty \operatorname{Var}(\mu) \tag{7.231}
\]

holds whenever the function $g$ is of the form $g = \lambda f - L(t)f$, with $f \in D(L(t))$.


Here $M_0(\mathbb{R}^d)$ is the space of all complex Borel measures $\mu$ on $\mathbb{R}^d$ with the property that $\mu(\mathbb{R}^d) = 0$. Suppose that $\operatorname{Var}\bigl(e^{sL(t)^*}\mu\bigr) \le c(t)e^{-2\omega(t)s}\operatorname{Var}(\mu)$ for all $\mu \in M_0(\mathbb{R}^d)$ and $s \ge 0$. Then for $\Re\lambda \ge \omega(t)$, $g \in C_0(\mathbb{R}^d)$ and $\mu \in M_0(\mathbb{R}^d)$ we have

\[
(\lambda - 2\omega(t))\,\Bigl\langle g, \bigl((\lambda - 2\omega(t))I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)^{-1}\mu \Bigr\rangle
= (\lambda - 2\omega(t)) \int_0^\infty \Bigl\langle g, e^{-s\bigl((\lambda-2\omega(t))I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)}\mu \Bigr\rangle\,ds, \tag{7.232}
\]

and hence, if $|\lambda - 2\omega(t)| \le 2\omega(t)$, we have

\[
|\lambda - 2\omega(t)|\,\Bigl| \Bigl\langle g, \bigl((\lambda-2\omega(t))I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)^{-1}\mu \Bigr\rangle \Bigr|
\le |\lambda - 2\omega(t)| \int_0^\infty \Bigl| \Bigl\langle g, e^{-s\bigl((\lambda-2\omega(t))I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)}\mu \Bigr\rangle \Bigr|\,ds
\]
\[
\le |\lambda - 2\omega(t)| \int_0^\infty e^{-s(\Re\lambda - 2\omega(t))}\,\operatorname{Var}\Bigl(e^{sL(t)^*|_{M_0(\mathbb{R}^d)}}\mu\Bigr)\,ds\,\|g\|_\infty
\le c(t)\,|\lambda - 2\omega(t)| \int_0^\infty e^{-s(\Re\lambda - 2\omega(t))} e^{-2s\omega(t)}\,ds\,\operatorname{Var}(\mu)\,\|g\|_\infty
\]
\[
= c(t)\,\frac{|\lambda - 2\omega(t)|}{\Re\lambda}\,\|g\|_\infty \operatorname{Var}(\mu) \le 2c(t)\,\|g\|_\infty \operatorname{Var}(\mu). \tag{7.233}
\]

In view of (7.230), (7.231) and (7.233) it makes sense to consider the largest $\omega(t)$ with the property that for all functions $g \in C_0(\mathbb{R}^d)$ and all Borel measures $\mu \in M_0(\mathbb{R}^d)$ the complex-valued function

\[
\lambda \mapsto \lambda\,\Bigl\langle g, \bigl(\lambda I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)^{-1}\mu \Bigr\rangle
\]

extends to a bounded holomorphic function on all half-planes of the form $\{\lambda \in \mathbb{C} : \Re\lambda > -2\omega'(t)\}$ with $\omega'(t) < \omega(t)$. It follows that there exists a constant $c(t)$ such that for all functions $g \in C_b(E)$ and $\mu \in M_0(\mathbb{R}^d)$ the following inequality holds:

\[
|\lambda|\,\Bigl| \Bigl\langle g, \bigl(\lambda I|_{M_0(\mathbb{R}^d)} - L(t)^*|_{M_0(\mathbb{R}^d)}\bigr)^{-1}\mu \Bigr\rangle \Bigr| \le c(t)\,\|g\|_\infty \operatorname{Var}(\mu), \qquad \Re\lambda \ge -\omega(t).
\]

The following definition should be compared with Definitions 7.22 and 8.54 in Chapter 8.

Definition 7.33. The number $2\omega(t)$ is called the $M(E)$-spectral gap of the operator $L(t)^*$.
Next let $P(\tau,x;t,B)$ be the transition probability function of the process

\[
\bigl\{ (\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x}),\; (X(t) : t \ge \tau),\; \bigl(\mathbb{R}^d, \mathcal{B}\bigr) \bigr\}
\]

generated by the operators $L(t)$. Suppose that, for every $\tau \in (0,\infty)$ and every Borel probability measure $\mu$ on $\mathbb{R}^d$, the following condition is satisfied:

\[
\lim_{t\to\infty} \frac{c(t)}{\omega(t)} \int_{\mathbb{R}^d} \operatorname{Var}\Bigl(\frac{\partial}{\partial t} P(\tau,x;t,\cdot)\Bigr)\,d\mu(x) = 0.
\]

Let $\mu$ be any Borel probability measure on $\mathbb{R}^d$. Put $\mu(t) = Y(\tau,t)^*\mu$, where $Y(\tau,t)f(x) = \mathbb{E}_{\tau,x}[f(X(t))]$, $f \in C_0(\mathbb{R}^d)$. Then $\dot{\mu}(t) = L(t)^*\mu(t)$. Moreover,

\[
\lim_{t\to\infty} \frac{c(t)}{\omega(t)} \operatorname{Var}\bigl(\dot{\mu}(t)\bigr) = 0.
\]

We will show this. With the above notation we have:

\[
\operatorname{Var}\bigl(\dot\mu(t)\bigr)
= \sup\Bigl\{ \Bigl|\frac{d}{dt}\langle f, \mu(t)\rangle\Bigr| : f \in C_0(\mathbb{R}^d),\; \|f\|_\infty = 1 \Bigr\}
= \sup\Bigl\{ \Bigl|\frac{\partial}{\partial t}\bigl\langle Y(\tau,t)f, \mu\bigr\rangle\Bigr| : f \in C_0(\mathbb{R}^d),\; \|f\|_\infty = 1 \Bigr\}
\]
\[
= \sup\Bigl\{ \Bigl|\frac{\partial}{\partial t}\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(y)\,P(\tau,x;t,dy)\,d\mu(x)\Bigr| : f \in C_0(\mathbb{R}^d),\; \|f\|_\infty = 1 \Bigr\}
= \sup\Bigl\{ \Bigl|\int_{\mathbb{R}^d} f(y)\,\frac{\partial}{\partial t}\int_{\mathbb{R}^d} P(\tau,x;t,dy)\,d\mu(x)\Bigr| : f \in C_0(\mathbb{R}^d),\; \|f\|_\infty = 1 \Bigr\}
\]
\[
= \operatorname{Var}\Bigl(\frac{\partial}{\partial t}\int_{\mathbb{R}^d} P(\tau,x;t,\cdot)\,d\mu(x)\Bigr)
\le \int_{\mathbb{R}^d} \operatorname{Var}\Bigl(\frac{\partial}{\partial t} P(\tau,x;t,\cdot)\Bigr)\,d\mu(x). \tag{7.234}
\]

If the probability measure $B \mapsto P(\tau,x;t,B)$ has a density $p(\tau,x;t,y)$, then the total variation of the measure $B \mapsto \frac{\partial}{\partial t}P(\tau,x;t,B)$ is given by

\[
\operatorname{Var}\Bigl(\frac{\partial}{\partial t} P(\tau,x;t,\cdot)\Bigr) = \int_{\mathbb{R}^d} \Bigl|\frac{\partial}{\partial t} p(\tau,x;t,y)\Bigr|\,dy.
\]

If there exists a unique P (E)-valued function t 7→ π(t) such that L(t)∗ π(t) =
0, then the system L(t)∗ µ(t) = µ̇(t) is ergodic. This assertion follows from
Theorem 7.36 below.
In order to perform some explicit computations we next assume that $d = 1$. It is assumed that the coefficient $a(t,x)$ is strictly positive on $\mathbb{R}$. Moreover, by hypothesis we assume that there exists a function $B(t,x)$ such that $b(t,x) = a(t,x)\frac{\partial B}{\partial x}(t,x)$ and such that $\int_{-\infty}^{x} e^{-2B(t,\eta)}\,d\eta < \infty$. The adjoint $K(t)$ of $L(t)$ acts on a subspace of the dual space of $C_0(\mathbb{R})$, which may be identified with the space of all complex Borel measures on $\mathbb{R}$. Formally, $K(t)\mu$ is given by

\[
K(t)\mu = \frac12 \frac{\partial^2}{\partial x^2}\bigl(a(t,\cdot)\mu\bigr) - \frac{\partial}{\partial x}\bigl(b(t,\cdot)\mu\bigr).
\]
Let the time-dependent measure $\mu(t)$ have the property that $K(t)\mu(t) = 0$. Then the family of measures $\mu(t)$ has density $\varphi(t,x)$ given by

\[
\varphi(t,x) = C_1(t)\,\frac{e^{2B(t,x)}}{a(t,x)} + C_2(t) \int_a^x \frac{e^{2B(t,x) - 2B(t,\eta)}}{a(t,x)}\,d\eta, \tag{7.235}
\]

where t 7→ Cj (t), j = 1, 2, are some functions which only depend on time. In


order to be sure that for every t the measure µ(t) belongs to M (R) and is
non-trivial we make additional hypotheses on the coefficients. If both integrals
Z ∞ 2B(t,x) Z ∞ Z x 2B(t,x)−2B(t,η)
e e
dx and dη dx (7.236)
∞ a(t, x) −∞ a a(t, x)

are finite, then the function x 7→ ϕ(t, x) belongs to L1 (R)


R ∞no matter how
the constants C1 (t) and C2 (t) are chosen. The requirement −∞ ϕ(t, x)dx = 1

does not make them unique. We have uniqueness of solutions in $M(\mathbb{R})$ to the eigenvalue problem $K(t)\mu(t) = 0$ and $\mu(t,\mathbb{R}) = 1$ provided either one of the following conditions is satisfied:

\[
\int_{-\infty}^{\infty} \frac{e^{2B(t,x)}}{a(t,x)}\,dx < \infty \quad\text{and}\quad \int_{-\infty}^{\infty} \int_a^x \frac{e^{2B(t,x)-2B(t,\eta)}}{a(t,x)}\,d\eta\,dx = \infty, \quad\text{or} \tag{7.237}
\]
\[
\int_{-\infty}^{\infty} \frac{e^{2B(t,x)}}{a(t,x)}\,dx = \infty \quad\text{and}\quad \int_{-\infty}^{\infty} \int_a^x \frac{e^{2B(t,x)-2B(t,\eta)}}{a(t,x)}\,d\eta\,dx < \infty. \tag{7.238}
\]

In the cases (7.237) and (7.238) we have respectively

\[
\mu(t,B) = C_1(t) \int_B \frac{e^{2B(t,x)}}{a(t,x)}\,dx \qquad\text{and}\qquad
\mu(t,B) = C_2(t) \int_B \int_a^x \frac{e^{2B(t,x)-2B(t,\eta)}}{a(t,x)}\,d\eta\,dx,
\]

where the constants $C_1(t)$ and $C_2(t)$ are chosen in such a way that the total
mass $\mu(t,\mathbb{R}) = 1$. The operators $L(t)$ generate a diffusion in the sense that there exists a time-inhomogeneous Markov process

\[
\bigl\{ (\Omega, \mathcal{F}_t^\tau, \mathbb{P}_{\tau,x}),\; (X(t) : t \ge \tau),\; (\mathbb{R}, \mathcal{B}) \bigr\}
\]

such that

\[
\frac{\partial}{\partial s}\,\mathbb{E}_{\tau,x}[f(X(s))] = \mathbb{E}_{\tau,x}[L(s)f(X(s))], \qquad f \in D(L(s)),
\]
where $0 \le \tau < s < \infty$. We put $Y(\tau,t)f(x) = \mathbb{E}_{\tau,x}[f(X(t))]$, $f \in C_b(\mathbb{R})$. Then, under appropriate conditions on the coefficients $a(t,x)$ and $b(t,x)$, the operators $Y(\tau,t)$ leave the space $C_0(\mathbb{R})$ invariant, and hence the adjoint operators $Y(\tau,t)^*$ are mappings from $M(\mathbb{R})$ to $M(\mathbb{R})$. For a given probability measure $\mu(\tau)$ the measure-valued function $\mu(t) := Y(\tau,t)^*\mu(\tau)$ satisfies

\[
\frac{\partial}{\partial t}\langle f, \mu(t)\rangle = \frac{\partial}{\partial t}\bigl\langle f, Y(\tau,t)^*\mu(\tau)\bigr\rangle = \frac{\partial}{\partial t}\bigl\langle Y(\tau,t)f, \mu(\tau)\bigr\rangle
= \int \frac{\partial}{\partial t}\, Y(\tau,t)f(x)\,\mu(\tau,dx)
\]
\[
= \int \frac{\partial}{\partial t}\,\mathbb{E}_{\tau,x}[f(X(t))]\,\mu(\tau,dx)
= \int \mathbb{E}_{\tau,x}[L(t)f(X(t))]\,\mu(\tau,dx)
= \bigl\langle Y(\tau,t)L(t)f, \mu(\tau)\bigr\rangle = \bigl\langle f, L(t)^*\mu(t)\bigr\rangle.
\]

Let $f \in D(L(t))$. From (7.225) it follows that for all $\lambda \in \mathbb{C}$ with $\Re\lambda \ge 0$ the following inequality holds:

\[
\inf_{\alpha\in\mathbb{C}} |\lambda|\,\|f - \alpha\mathbf{1}\|_\infty \le 2 \inf_{\alpha\in\mathbb{C}} \bigl\| (\lambda I - L(t))f - \alpha\mathbf{1} \bigr\|_\infty. \tag{7.239}
\]

If $\lim_{t\to\infty} \frac{c(t)}{\omega(t)} \operatorname{Var}\bigl(L(t)^*\mu(t)\bigr) = 0$, then the equation $L(t)^*\mu(t) = \dot\mu(t)$ is ergodic, provided that $\operatorname{Var}\bigl(e^{sL(t)^*}\mu\bigr) \le c(t)e^{-2s\omega(t)}\operatorname{Var}(\mu)$ for all $\mu \in M_0(E)$. This assertion follows from Theorem 7.36 below, by observing that the dual of the space $C_0(\mathbb{R})$ endowed with the quotient norm $\|f\| := \inf_{\alpha\in\mathbb{C}} \|f - \alpha\mathbf{1}\|_\infty$ is the space $M_0(\mathbb{R})$.
For explicit formulas for invariant measures for (certain) Ornstein-Uhlenbeck
semigroups we refer the reader to Da Prato and Zabczyk [66] Theorems 11.7
and 11.11, and to Metafune et al [159]. For some recent regularity and smooth-
ing results see Bogachev et al [38].
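As a concrete instance of case (7.237), one may consider a one-dimensional Ornstein-Uhlenbeck generator. This numerical sketch (the parameter values are arbitrary choices) takes $a(t,x) = \sigma^2$ constant and $b(t,x) = -\theta x$, so that $B(t,x) = -\theta x^2/(2\sigma^2)$ and the normalized solution of $K(t)\mu(t) = 0$ is the centered Gaussian density with variance $\sigma^2/(2\theta)$:

```python
import numpy as np

theta, sig2 = 1.5, 0.8                  # hypothetical parameters
x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]

# phi(x) proportional to e^{2B(x)} / a(x) with B(x) = -theta x^2 / (2 sig2)
phi = np.exp(-theta * x**2 / sig2) / sig2
phi = phi / (phi.sum() * dx)            # normalize: integral of phi = 1

var = (x**2 * phi).sum() * dx           # variance of the invariant density
print(var, sig2 / (2 * theta))          # both approximately 0.2667
```

This matches the well-known invariant measure of the Ornstein-Uhlenbeck process, a special case of the explicit formulas above.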

7.4 Ergodicity in the non-stationary case


We begin with a relevant definition.
Definition 7.34. The system (7.6) is called ergodic, if there exists a unique
solution π(t) to the equation K(t)π(t) = 0, with π(t) ∈ P (E), such that
\[
\lim_{t\to\infty} \operatorname{Var}\bigl(\mu(t) - \pi(t)\bigr) = 0 \tag{7.240}
\]

for all solutions µ(t) ∈ P (E) to the equation µ̇(t) = K(t)µ(t).


Remark 7.35. Fix $t \in \mathbb{R}$ and let $K(t)$ be a Kolmogorov operator with $0$ as an isolated point in its spectrum. Then $0$ is a dominant eigenvalue of $K(t)$; let $P(t): M(E) \to M(E)$ be the Dunford projection on the generalized eigenspace corresponding to the eigenvalue $0$ with eigenvector $\pi(t) \in P(E)$. If the eigenvalue $0$ has multiplicity $1$, then $P(t)$ projects the space $M(E)$ onto the one-dimensional subspace $\mathbb{C}\pi(t)$. Since $0$ is a dominant eigenvalue of $K(t)$, a key spectral estimate of the following form is valid:

\[
\Bigl| \bigl\langle f, e^{sK(t)}(I - P(t))\mu \bigr\rangle \Bigr| \le c(t)e^{-2\omega(t)s}\,\|f\|_\infty \operatorname{Var}(\mu), \qquad f \in C_b(E),\; \mu \in M(E), \tag{7.241}
\]

where $\omega(t)$ is strictly positive, $\|f\|_\infty$ is the supremum norm of $f \in C_b(E)$, $\operatorname{Var}(\mu)$ is the total variation norm of $\mu \in M(E)$, and $c(t)$ is some finite constant.
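In finite dimensions the spectral estimate (7.241) can be seen concretely. The sketch below (the generator matrix $K$ is a hypothetical example, not from the text) computes the stationary distribution $\pi$, the spectral gap, and the total-variation norm of $e^{tK}(\mu - \pi)$ at a few times; the norm decays at the rate given by the gap:

```python
import numpy as np

# Generator of a 3-state Markov process acting on measures (columns sum to 0).
K = np.array([[-2.0, 1.0, 0.5],
              [ 1.5, -1.0, 0.5],
              [ 0.5,  0.0, -1.0]])

w, V = np.linalg.eig(K)
Vinv = np.linalg.inv(V)
def expK(t):                              # e^{tK} via the eigendecomposition
    return np.real((V * np.exp(t * w)) @ Vinv)

pi = np.real(V[:, np.argmin(np.abs(w))])  # eigenvector for eigenvalue 0
pi = pi / pi.sum()                        # K pi = 0, total mass 1
gap = -max(np.real(l) for l in w if abs(l) > 1e-9)

mu = np.array([1.0, 0.0, 0.0])            # initial probability measure
tv = [np.abs(expK(t) @ (mu - pi)).sum() for t in (0.5, 1.0, 2.0, 4.0)]
print(gap, tv)                            # tv decays roughly like e^{-gap*t}
```

Here the Dunford projection $P$ of (7.241) is simply $\mu \mapsto \mu(E)\,\pi$, so $e^{tK}(\mu - \pi) = e^{tK}(I - P)\mu$.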
A P (E)-valued function π(t) for which K(t)π(t) = 0 is called a stationary
or invariant P (E)-valued function of the system in (7.6). In addition to (7.6)
we assume that the continuous function π(t) with values in P (E) satisfies
K(t)π(t) = 0, and we suppose that this function is uniquely determined.
Theorem 7.36. Let the function $t \mapsto \mu(t)$ satisfy (7.6); i.e. $\dot\mu(t) = K(t)\mu(t)$, $t > t_0$, or more precisely $\frac{d}{dt}\langle f, \mu(t)\rangle = \langle f, K(t)\mu(t)\rangle$, $f \in C_b(E)$. In addition, suppose that there exist strictly positive functions $t \mapsto \omega(t)$ and $t \mapsto c(t)$ possessing the following properties:

(i) For every $t \ge t_0$ there exists a real number $\lambda$ with $\Re\lambda > -\omega(t)$ such that
\[
(\lambda I - K(t))\bigl(D(K(t))\bigr) = M(E); \tag{7.242}
\]
(ii) The following identity holds true:
\[
\lim_{t\to\infty} \frac{c(t)}{\omega(t)} \operatorname{Var}\bigl(\dot\mu(t)\bigr) = \lim_{t\to\infty} \frac{c(t)}{\omega(t)} \operatorname{Var}\bigl(K(t)\mu(t)\bigr) = 0; \tag{7.243}
\]
(iii) The inequality
\[
|\lambda| \operatorname{Var}(\mu) \le c(t) \operatorname{Var}\bigl(\lambda\mu - K(t)\mu\bigr) \tag{7.244}
\]
holds for all $\mu \in D(K(t))$ and all $\lambda \in \mathbb{C}$ with $\Re\lambda > -\omega(t)$.

Then there exists a $P(E)$-valued function $t \mapsto \pi(t)$ such that
\[
\lim_{t\to\infty} \operatorname{Var}\bigl(\mu(t) - \pi(t)\bigr) = 0,
\]
and such that $K(t)\pi(t) = 0$; i.e. the system in (7.6) is ergodic.


Remark 7.37. The inequality in (7.244) is only required on the union of the right half-plane $\{\lambda \in \mathbb{C} : \Re\lambda > 0\}$ and the circular disc $\{\lambda \in \mathbb{C} : |\lambda| \le \omega(t)\}$. This will follow from the proof of Theorem 7.36.

Remark 7.38. Let $g \in C_b(E)$ be such that

\[
\Bigl| \bigl\langle g, \bigl(-K(t)|_{M_0(E)}\bigr)^{-1}\mu \bigr\rangle \Bigr| \le \frac{c(t)}{\omega(t)}\,\|g\|_\infty \operatorname{Var}(\mu), \qquad \mu \in M_0(E), \tag{7.245}
\]

where the constants $c(t)$ and $\omega(t)$ satisfy (7.243). Then

\[
\lim_{t\to\infty} \bigl\langle g, \mu(t) - \pi(t) \bigr\rangle = 0. \tag{7.246}
\]

If the collection of functions $g$ satisfying (7.245) for an appropriate choice of $c(t)$ and $\omega(t)$ satisfying (7.243) is dense in $C_b(E)$, then (7.246) holds for all $g \in C_b(E)$.
The following proposition has some independent interest; it says that an operator which has the properties (i) and (iii) of Theorem 7.36 generates a bounded analytic weak$^*$-continuous semigroup in $M_0(E)$ with exponential decay. For $\omega > 0$ we define the open subset $\widetilde\Pi_\omega$ of $\mathbb{C}$ by $\widetilde\Pi_\omega = \{\lambda \in \mathbb{C} : \Re\lambda > 0\} \cup \{\lambda \in \mathbb{C} : |\lambda| < \omega\}$.

Proposition 7.39. Let $K$ be a sectorial sub-Kolmogorov operator for which there exist constants $\omega$ and $c$ such that $(\lambda I - K)D(K) = M(E)$ for some $\lambda \in \widetilde\Pi_\omega$ and such that

\[
|\lambda| \operatorname{Var}(\mu) \le c \operatorname{Var}(\lambda\mu - K\mu) \tag{7.247}
\]

for all $\lambda \in \widetilde\Pi_\omega$ and all $\mu \in D(K)$. Then the operator $K$ generates a weak$^*$-continuous bounded analytic semigroup $\bigl\{ e^{tK} : |\arg(t)| \le \alpha \bigr\}$. On the range of the operator $K$ this analytic semigroup has exponential decay as $t \to \infty$.

Proof (Proof of Proposition 7.39). We consider the subset $\Pi_\omega$ of $\widetilde\Pi_\omega$ defined by

\[
\Pi_\omega = \bigl\{ \lambda \in \widetilde\Pi_\omega : \lambda \neq 0,\; (\lambda I - K)D(K) = M(E) \bigr\}. \tag{7.248}
\]

First suppose that $\lambda_0$ belongs to $\Pi_\omega$. Put $R(\lambda_0) = (\lambda_0 I - K)^{-1}$, and define the operators $R(\lambda)$, $|\lambda - \lambda_0| < c^{-1}|\lambda_0|$, by $R(\lambda) = \sum_{j=0}^\infty (\lambda_0 - \lambda)^j R(\lambda_0)^{j+1}$. From (7.247) it follows that the operators $R(\lambda)$ are well defined and that $(\lambda I - K)R(\lambda) = I$ for $\lambda \in \mathbb{C}$ such that $|\lambda - \lambda_0| < c^{-1}|\lambda_0|$. Hence the set $\Pi_\omega$ is an open subset of the punctured set $\widetilde\Pi_\omega \setminus \{0\}$. Next let $\lambda_n$, $n \in \mathbb{N}$, be a sequence in $\Pi_\omega$ with limit $\lambda_0$ in the punctured open subset $\widetilde\Pi_\omega \setminus \{0\}$. For $n \in \mathbb{N}$ so large that $|\lambda_0 - \lambda_n| < c^{-1}|\lambda_n|$ we have

\[
(\lambda_0 I - K) \sum_{j=0}^\infty (\lambda_n - \lambda_0)^j R(\lambda_n)^{j+1} = I,
\]

where we wrote $R(\lambda_n) = (\lambda_n I - K)^{-1}$. It follows that the punctured set $\Pi_\omega \setminus \{0\}$ is open and closed in the connected punctured open set $\widetilde\Pi_\omega \setminus \{0\}$. Since the latter is topologically connected and since by assumption $\Pi_\omega$ is non-empty, it follows that for every $\lambda \in \widetilde\Pi_\omega$, $\lambda \neq 0$, the range of the operator $\lambda I - K$ coincides with $M(E)$. As above we put $R(\lambda) = (\lambda I - K)^{-1}$, $\lambda \in \widetilde\Pi_\omega$. Inequality (7.247) implies that $|\lambda|\,\bigl\|(\lambda I - K)^{-1}\bigr\| \le c$, $\lambda \in \widetilde\Pi_\omega$. From the arguments in the proofs of Theorem 7.49 and Corollary 7.48 it follows that the resolvent $R(\lambda)$ extends to a sectorial region of the form $\Pi_{\omega,\beta} := \widetilde\Pi_\omega \cup \{\lambda \in \mathbb{C} : |\arg(\lambda)| \le \beta\}$, where $\frac12\pi < \beta < \pi$, and the norm of the resolvent $R(\lambda)$ satisfies an estimate of the form:

\[
|\lambda|\,\|R(\lambda)\| \le c', \qquad \lambda \in \Pi_{\omega,\beta}. \tag{7.249}
\]
Put
Z Z
1 −1 1 1 −1
P = (λI − K) dλ and A = − (λI − K) dλ.
2πi |λ|=ω 2πi |λ|=ω λ
(7.250)
Then we have
Z
1 −1
KP = (λI − (λI − K)) (λI − K) dλ
2πi |λ|=ω
Z Z
1 −1 1 −1
= λ (λI − K) dλ − (λI − K) (λI − K) dλ = 0,
2πi |λ|=ω 2πi |λ|=ω

and
Z
1 1 −1
KA = (λI − K − λI) (λI − K) dλ
2πi |λ|=ω λ
360 7 On non-stationary Markov processes and Dunford projections
Z Z
1 1 1 −1
= dλI − (λI − K) dλ = I − P. (7.251)
2πi |λ|=ω λ 2πi |λ|=ω

It follows that $R(K)$, the range of $K$, is weak$^*$-closed and that $I - P$ is a continuous linear projection from $M(E)$ onto $R(K)$ with null space $R(P) = N(K)$. From Theorem 7.17 it follows that $K$ generates a weak$^*$-continuous sub-Kolmogorov semigroup $\{e^{tK} : t \ge 0\}$ in $M(E)$. By (7.249) we see that this semigroup is analytic. Since the set $\Pi_{\omega,\beta}$ contains a half-plane of the form $\{\lambda \in \mathbb{C} : \Re\lambda \ge -\omega_0\}$, where $\omega > \omega_0 > 0$, the representation in (7.255) with $\omega' = -\omega_0$ and $\ell = 1$ can be used to show the exponential decay of the semigroup $\{e^{tK} : t \ge 0\}$ on the range of $K$.

This completes the proof of Proposition 7.39.
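The contour integrals in (7.250) can be evaluated numerically for a matrix. In this sketch $K$ is a hypothetical $3\times 3$ matrix with eigenvalues $\{0, -1, -3\}$; the trapezoid rule on the circle $|\lambda| = \omega$ converges rapidly, and the resulting $P$ and $A$ satisfy $KP = 0$, $P^2 = P$ and $KA = I - P$ as in (7.251):

```python
import numpy as np

K = np.array([[0.0,  1.0, 0.0],
              [0.0, -1.0, 2.0],
              [0.0,  0.0, -3.0]])       # spectrum {0, -1, -3}
I = np.eye(3)

omega, n = 0.5, 1024                    # |lambda| = omega separates 0 from -1, -3
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
lam = omega * np.exp(1j * theta)
dlam = 1j * lam * (2.0 * np.pi / n)     # d(lambda) for the trapezoid rule

P = np.zeros((3, 3), dtype=complex)
A = np.zeros((3, 3), dtype=complex)
for l, dl in zip(lam, dlam):
    R = np.linalg.inv(l * I - K)        # resolvent (lambda I - K)^{-1}
    P += R * dl / (2j * np.pi)
    A -= R * dl / (l * 2j * np.pi)

print(np.max(np.abs(K @ P)), np.max(np.abs(K @ A - (I - P))))
```

Because the integrands are analytic in an annulus around the contour, the equispaced trapezoid rule converges geometrically, so both printed residuals are at machine-precision level.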

Suppose that $|\lambda|\operatorname{Var}(\mu) \le c\operatorname{Var}(\lambda\mu - K\mu)$ for $\Re\lambda \ge -\omega$, $\mu \in D(K)$. Then the operator $I - P$ can be written as

\[
I - P = K\,\frac{1}{2\pi i} \int_{-\omega - i\infty}^{-\omega + i\infty} \frac1\lambda\,(\lambda I - K)^{-1}\,d\lambda = -K \int_0^\infty e^{sK}\,ds. \tag{7.252}
\]

On the range of $K$ (which coincides with the range of $I - P$) the operator $A$ has the representation:

\[
A = \frac{1}{2\pi i} \int_{-\omega - i\infty}^{-\omega + i\infty} \frac1\lambda\,(\lambda I - K)^{-1}\,d\lambda = -\int_0^\infty e^{sK}(I - P)\,ds. \tag{7.253}
\]

On the space $M(E)$ the operator $e^{tK}$ can be represented by

\[
\frac{t^\ell}{\ell!}\,e^{tK} = \frac{1}{2\pi i} \int_{\omega' - i\infty}^{\omega' + i\infty} e^{t\lambda}\,(\lambda I - K)^{-\ell - 1}\,d\lambda, \qquad \omega' > 0,\; \ell \ge 1. \tag{7.254}
\]

On the range of $K$ the operator $e^{tK}$ has the representation:

\[
\frac{t^\ell}{\ell!}\,e^{tK} = \frac{1}{2\pi i} \int_{\omega' - i\infty}^{\omega' + i\infty} e^{t\lambda}\,(\lambda I - K)^{-\ell - 1}\,d\lambda, \qquad \omega' > -\omega,\; \ell \ge 1. \tag{7.255}
\]

Notice that by (7.251) the operator $K$ has a bounded inverse on its range. It follows that the function $\lambda \mapsto (\lambda I - K)^{-1}$ restricted to $R(K)$ is holomorphic in a neighborhood of $\lambda = 0$.
° °
Remark 7.40. We may say that see that the condition supt>0 °tLetL ° < ∞
is kind of an analytic maximum principle.analytic maximum principle. In
this remark only, suppose that E is locally compact and second count-
able. Let L be the generator of a Feller-Dynkin
¯ ¯ semigroup.
° ° Fix t > 0 and
choose x ∈ E in such a way that ¯etL f (x)¯ = °etL f °∞ . Then we have
³ ´
< etL f (x)tLetL f (x) ≤ 0. Next assume that the operator L is such that the
corresponding Feller-Dynkin semigroup has an integral p(t, x, y) with respect
7.4 Ergodicity in the non-stationary case 361

to a reference Rmeasure dm(y). This means that the semigroup etL is given
by etL f (x) = p (t, x, y) f (y)dm(y). Then L generates a bounded analytic
semigroup if and only if
\[
\sup_{t>0}\,\sup_{x\in E}\int_E \Bigl|\,t\,\frac{\partial p}{\partial t}(t,x,y)\Bigr|\,dm(y)
= \sup_{t>0}\,\sup_{x\in E}\int_E \bigl|\,t\,Lp(t,\cdot,y)(x)\bigr|\,dm(y) < \infty.
\]

This is the case if and only if for some α ∈ (0, ½π) an inequality of the form
\[
\sup_{t\in\mathbb{C}\colon |\arg(t)|\le\alpha}\ \sup_{x\in E}\int_E |p(t,x,y)|\,dm(y) < \infty
\]

holds.
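As a numerical illustration of this kernel criterion, consider the one-dimensional heat semigroup; the Gaussian kernel, the grid, and the truncation below are our own illustrative choices, not part of the text. By scaling, the L¹-norm in y of t ∂p/∂t is one and the same constant for every t > 0 and every x, so the double supremum is finite:

```python
import numpy as np

# For the 1-d heat kernel p(t, x, y) = (4*pi*t)^(-1/2) * exp(-(x-y)^2/(4t))
# one has t * dp/dt = p * ((x-y)^2/(4t) - 1/2).  By the scaling y = x + sqrt(t)*z
# the L^1-norm in y of t * dp/dt does not depend on t or x at all.
def t_dp_dt_l1(t, x):
    y = np.linspace(x - 60.0 * np.sqrt(t), x + 60.0 * np.sqrt(t), 400001)
    p = np.exp(-((x - y) ** 2) / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)
    f = np.abs(p * ((x - y) ** 2 / (4.0 * t) - 0.5))
    # trapezoidal rule on the truncated line
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y)))

vals = [t_dp_dt_l1(t, x) for t in (0.1, 1.0, 7.0) for x in (-2.0, 0.0, 3.0)]
print(vals)  # nine (numerically) identical values, so the double supremum is finite
```

The common value is E|U² − ½| for a centered Gaussian U with variance ½, roughly 0.48.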

For the moment we only suppose that the operator K generates a bounded analytic weak∗-continuous semigroup on M(E). Let γ_r : [−½π, ½π] → C, 0 < r < ∞, be the parametrization of the semicircle γ_r(ϑ) = re^{iϑ}, −½π ≤ ϑ ≤ ½π. Then by Cauchy's theorem the following equality of sums of integrals holds for 0 < r < R < ∞:
\[
\frac{1}{\pi i}\int_{-iR}^{-ir}\frac{1}{\lambda}(\lambda I-K)^{-1}\,d\lambda
+\frac{1}{\pi i}\int_{\gamma_r}\frac{1}{\lambda}(\lambda I-K)^{-1}\,d\lambda
+\frac{1}{\pi i}\int_{ir}^{iR}\frac{1}{\lambda}(\lambda I-K)^{-1}\,d\lambda
=\frac{1}{\pi i}\int_{\gamma_R}\frac{1}{\lambda}(\lambda I-K)^{-1}\,d\lambda. \tag{7.256}
\]

By using the parametrizations ξ ↦ −iξ, R > ξ > r, and ξ ↦ iξ, r < ξ < R, and letting R tend to ∞ we obtain:
\[
\frac{2}{\pi}\int_r^\infty \bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi
=\frac{1}{\pi i}\int_{\gamma_r}\frac{1}{\lambda}(\lambda I-K)^{-1}\,d\lambda. \tag{7.257}
\]

It follows that
\[
\begin{aligned}
\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi
&=(-K)\,\frac{2}{\pi}\int_r^\infty\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi\\
&=\frac{1}{\pi i}\int_{\gamma_r}\frac{1}{\lambda}\,(\lambda I-K-\lambda I)\,(\lambda I-K)^{-1}\,d\lambda\\
&=\frac{1}{\pi i}\int_{\gamma_r}\frac{1}{\lambda}\,d\lambda\;I-\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\,d\lambda\\
&=I-\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\,d\lambda. \tag{7.258}
\end{aligned}
\]

From (7.258) we also obtain:


362 7 On non-stationary Markov processes and Dunford projections
\[
\begin{aligned}
K\Bigl(\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi-I\Bigr)
&=\frac{1}{\pi i}\int_{\gamma_r}1\,d\lambda\;I-\frac{1}{\pi i}\int_{\gamma_r}\lambda\,(\lambda I-K)^{-1}\,d\lambda\\
&=\frac{2r}{\pi}\,I-\frac{1}{\pi i}\int_{\gamma_r}\lambda\,(\lambda I-K)^{-1}\,d\lambda. \tag{7.259}
\end{aligned}
\]

We formulate these results in the form of a proposition.


Proposition 7.41. Put
\[
Q_r=\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi
\quad\text{and}\quad
P_r=\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\,d\lambda. \tag{7.260}
\]
Then I = Q_r + P_r, R(P_r) ⊂ D(K), and
\[
K\Bigl(I-\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi\Bigr)
=KP_r=\frac{1}{\pi i}\int_{\gamma_r}\lambda\,(\lambda I-K)^{-1}\,d\lambda-\frac{2r}{\pi}\,I. \tag{7.261}
\]
Moreover, the following inequality is valid:
\[
\|KP_r\|\le r\sup_{\vartheta\in[-\frac12\pi,\frac12\pi]}\bigl\|re^{i\vartheta}\bigl(re^{i\vartheta}I-K\bigr)^{-1}\bigr\|+\frac{2r}{\pi}. \tag{7.262}
\]
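Proposition 7.41 can be sanity-checked in a finite-dimensional toy model; the 2×2 matrix K below (eigenvalues 0 and −1), the quadrature rule, and the truncation of the improper integral are our own illustrative choices, not part of the text:

```python
import numpy as np

# Finite-dimensional stand-in for K: eigenvalues 0 and -1, so N(K) is
# nontrivial and e^{tK} is trivially bounded analytic.  We check I = Q_r + P_r.
K = np.array([[0.0, 1.0], [0.0, -1.0]])
I2 = np.eye(2)
r = 0.05

# Q_r = (2/pi) * \int_r^infty (-K)(xi^2 I + K^2)^{-1} dxi
# (midpoint rule on log-spaced nodes, truncated at xi = 1e4)
xs = np.logspace(np.log10(r), 4.0, 20001)
Q = np.zeros((2, 2))
for a, b in zip(xs[:-1], xs[1:]):
    m = 0.5 * (a + b)
    Q += (b - a) * (-K) @ np.linalg.inv(m * m * I2 + K @ K)
Q *= 2.0 / np.pi

# P_r = (1/(pi i)) * \int_{gamma_r} (lambda I - K)^{-1} dlambda over the
# right semicircle gamma_r(theta) = r e^{i theta}, |theta| <= pi/2
ths = np.linspace(-0.5 * np.pi, 0.5 * np.pi, 4001)
P = np.zeros((2, 2), dtype=complex)
for a, b in zip(ths[:-1], ths[1:]):
    lam = r * np.exp(0.5j * (a + b))
    P += 1j * lam * (b - a) * np.linalg.inv(lam * I2 - K)
P /= np.pi * 1j

print(np.max(np.abs(Q + P.real - I2)))  # I = Q_r + P_r up to quadrature error
print(P.real)                           # already close to [[1, 1], [0, 0]]
```

Up to quadrature and truncation error, Q_r + P_r reproduces the identity, and for small r the operator P_r is already close to the Dunford projection onto N(K) along R(K).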

Definition 7.42. A linear operator Q : M(E) → M(E) is called sequentially weak∗-closed if its graph {(µ, Qµ) : µ ∈ M(E)} is sequentially weak∗-closed in M(E) × M(E). This means that for any sequence (µ_n)_{n∈N} which converges to µ for the σ(M(E), C_b(E))-topology, and for which the sequence (Qµ_n)_{n∈N} converges to ν ∈ M(E) with respect to the σ(M(E), C_b(E))-topology, the equality ν = Qµ follows.

In the following proposition we collect a number of alternative ways to represent the operators Q and P. Recall that the projection operator P is called a Dunford projection.

Proposition 7.43. Let K be a sub-Kolmogorov operator which generates a weak∗-continuous semigroup {e^{sK} : s ≥ 0} in M(E). Put R(λ) = (λI − K)^{-1}, ℜλ > 0. The following assertions are true:
1. Suppose that the weak∗-limit
\[
Q\mu = \sigma(M(E),C_b(E))\text{-}\lim_{\lambda\downarrow0}\,(-K)R(\lambda)\mu
\]
exists for all µ ∈ M(E). In addition, suppose that the operator Q is sequentially weak∗-closed. Then Q is a projection from M(E) onto the weak∗-sequential closure of the space R(K). Its zero space is N(K), and the projection P = I − Q onto N(K) is given by
\[
P\mu = \sigma(M(E),C_b(E))\text{-}\lim_{\lambda\downarrow0}\,\lambda R(\lambda)\mu.
\]

2. Suppose that the weak∗-limit
\[
Q\mu = \sigma(M(E),C_b(E))\text{-}\lim_{t\uparrow\infty}\,(-K)\int_0^t e^{sK}\mu\,ds
\]
exists for all µ ∈ M(E). In addition, suppose that the operator Q is sequentially weak∗-closed. Then Q is a projection from M(E) onto the weak∗-sequential closure of the space R(K). Its zero space is N(K), and the projection P = I − Q onto N(K) is given by
\[
P\mu = \sigma(M(E),C_b(E))\text{-}\lim_{t\to\infty}\,e^{tK}\mu,
\]
provided that σ(M(E), C_b(E))-lim_{t→∞} Ke^{tK}µ = 0 for all µ ∈ D(K).


3. Suppose that the semigroup generated by K is bounded and analytic. In addition, assume that the weak∗-limit
\[
Q\mu = \sigma(M(E),C_b(E))\text{-}\lim_{r\downarrow0}\,\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\,d\xi
\]
exists for all µ ∈ M(E), and suppose that the operator Q is sequentially weak∗-closed. Then Q is a projection from M(E) onto the weak∗-sequential closure of the space R(K). Its zero space is N(K), and the projection P = I − Q onto N(K) is given by
\[
P\mu = \sigma(M(E),C_b(E))\text{-}\lim_{r\downarrow0}\,\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda,\qquad \mu\in M(E).
\]
Here γ_r is the curve γ_r(ϑ) = re^{iϑ}, −½π ≤ ϑ ≤ ½π.


4. Suppose that 0 is an isolated point of the spectrum of K and that the following inequality holds for a finite constant C, for all µ ∈ D(K) and for all λ ∈ C in a (small) disc around 0:
\[
|\lambda|\,\mathrm{Var}(\mu) \le C\,\mathrm{Var}(\lambda\mu - K\mu). \tag{7.263}
\]
Then the range of K is weak∗-closed, and M(E) = R(K) + N(K). More precisely, put
\[
Q\mu = \frac{1}{2\pi i}\int_{\widetilde\gamma_r}\frac{1}{\lambda}\,(-K)(\lambda I-K)^{-1}\mu\,d\lambda,
\quad\text{and}\quad
P\mu = \frac{1}{2\pi i}\int_{\widetilde\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda,
\]
where µ ∈ M(E). Here γ̃_r stands for the full circle: γ̃_r(ϑ) = re^{iϑ}, −π ≤ ϑ ≤ π, where r is chosen so small that for |λ| ≤ r the inequality in (7.263) holds. Then Q is a weak∗-continuous projection mapping from M(E) onto R(K), and P = I − Q is a weak∗-continuous projection mapping from M(E) onto N(K). Moreover, I = Q + P.

Remark 7.44. If the operator K is the weak∗-generator of a bounded analytic semigroup e^{tK}, t ≥ 0, then the families {e^{tK} : t ≥ 0} and {tKe^{tK} : t ≥ 0} are uniformly bounded. It follows that lim_{t→∞} Var(Ke^{tK}µ) = 0, and hence the assumptions of assertion (3) entail those of (2). The identity
\[
\lambda R(\lambda)\mu = \lambda\int_0^\infty e^{-\lambda t}e^{tK}\mu\,dt = \int_0^\infty e^{-t}e^{\lambda^{-1}tK}\mu\,dt
\]
shows that assertion (1) is a consequence of (2). Finally, by residue calculus and the hypothesis in assertion (4) we also have
\[
\frac{1}{2\pi i}\int_{\widetilde\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda
= \sigma(M(E),C_b(E))\text{-}\lim_{\lambda\downarrow0}\,\lambda R(\lambda)\mu.
\]
It follows that the conditions in assertion (4) imply those of (1).
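In a finite-dimensional toy model the Abel limit of assertion (1) and the semigroup limit of assertion (2) can be compared numerically; the 2×2 matrix and the parameter values below are our own illustrative choices, not part of the text:

```python
import numpy as np

# Toy model: K has eigenvalues 0 and -1, so both limits should give the
# projection P onto N(K) along R(K), which here is [[1, 1], [0, 0]].
K = np.array([[0.0, 1.0], [0.0, -1.0]])
I2 = np.eye(2)

# Abel limit: lim_{lambda -> 0} lambda (lambda I - K)^{-1}
lam = 1e-8
P_abel = lam * np.linalg.inv(lam * I2 - K)

# Semigroup limit: lim_{t -> infty} e^{tK}, computed via K = S diag(w) S^{-1}
w, S = np.linalg.eig(K)
t = 50.0
P_semigroup = ((S * np.exp(t * w)) @ np.linalg.inv(S)).real

print(P_abel)        # ~ [[1, 1], [0, 0]]
print(P_semigroup)   # ~ [[1, 1], [0, 0]]
```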

Remark 7.45. Let (µ, ν) ∈ M(E) × M(E) be such that there exists a sequence (µ_n)_{n∈N} ⊂ M(E) together with a sequence (λ_n)_{n∈N} ⊂ (0, ∞) which decreases to 0 as n tends to ∞ such that
\[
(\mu,\nu) = \sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\bigl(\mu_n,\ \mu_n-\lambda_n R(\lambda_n)\mu_n\bigr).
\]
Then it is assumed that the graph of the operator Q contains the pair (µ, ν). Let the sequence (µ_n, µ_n − λ_n R(λ_n)µ_n) tend to (µ, ν) for the weak∗-topology. First we show that Qν = Qµ. By assumption we know that
\[
\sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\bigl(\mu-\lambda_n R(\lambda_n)\mu\bigr)
= \sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\bigl(-KR(\lambda_n)\mu\bigr) = Q\mu. \tag{7.264}
\]
We also have:
\[
\sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\bigl(\mu_n-\mu-\lambda_n R(\lambda_n)(\mu_n-\mu)\bigr) = \nu-Q\mu. \tag{7.265}
\]
Since µ_n converges to µ in the weak∗ sense, the equality in (7.265) implies:
\[
\sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\bigl(-\lambda_n R(\lambda_n)(\mu_n-\mu)\bigr) = \nu-Q\mu. \tag{7.266}
\]
In addition, we have
\[
\lim_{n\to\infty}\mathrm{Var}\bigl(K\lambda_n R(\lambda_n)(\mu_n-\mu)\bigr)
= \lim_{n\to\infty}\mathrm{Var}\bigl(\bigl(\lambda_n^2 R(\lambda_n)-\lambda_n\bigr)(\mu_n-\mu)\bigr) = 0, \tag{7.267}
\]
and hence, since the operator K is sequentially weak∗-closed, we infer
\[
K(\nu-Q\mu) = 0.
\]
But we also have N(K) = N(Q) and thus Q(ν − Qµ) = 0. Since Q² = Q we see Qν = Qµ. Fix N ∈ N. Using the equalities λ_n R(λ_n)(ν − Qν) = ν − Qν and Qν = Qµ we obtain the identities:
\[
\begin{aligned}
\frac{1}{N+1}\Bigl(\mu_n-\mu-\bigl(\lambda_n R(\lambda_n)\bigr)^{N+1}(\mu_n-\mu)\Bigr)-(\nu-Q\nu)
&=\frac{1}{N+1}\sum_{j=0}^N\bigl(\lambda_n R(\lambda_n)\bigr)^j\Bigl\{\bigl(I-\lambda_n R(\lambda_n)\bigr)(\mu_n-\mu)-(\nu-Q\nu)\Bigr\}\\
&=\frac{1}{N+1}\sum_{j=0}^N\bigl(\lambda_n R(\lambda_n)\bigr)^j\Bigl\{\bigl(I-\lambda_n R(\lambda_n)\bigr)(\mu_n-\mu)-(\nu-Q\mu)\Bigr\}.
\end{aligned}
\]
Hence, if we assume from the start that
\[
\sigma(M(E),C_b(E))\text{-}\lim_{n\to\infty}\,\lambda_n R(\lambda_n)\mu_n = 0
\]
whenever λ_n ↓ 0 and σ(M(E), C_b(E))-lim_{n→∞} µ_n = 0, then R(Q) is the weak∗-closure of R(K).

Proof (Proof of Proposition 7.43). Proof of assertion (1). Let µ ∈ M(E). First we notice the equalities µ + KR(λ)µ = λR(λ)µ ∈ D(K), and lim_{λ↓0} K(λR(λ)µ) = lim_{λ↓0} (λ²R(λ)µ − λµ) = 0. The latter limit is taken with respect to the variation norm. In addition, we see that Pµ := σ(M(E), C_b(E))-lim_{λ↓0} λR(λ)µ exists. Since the graph of K is sequentially weak∗-closed, it follows that Pµ belongs to D(K) and KPµ = 0. Hence, we see that the measure µ − Qµ belongs to N(K). Consequently, if Qµ = 0, then µ = µ − Qµ ∈ N(K). If Kµ = 0, then
\[
Q\mu = \lim_{\lambda\downarrow0}\,(-K)\bigl(\lambda R(\lambda)\mu\bigr) = -\lim_{\lambda\downarrow0}\,\lambda R(\lambda)K\mu = 0.
\]
The previous arguments show the equalities of spaces: (I − Q)M(E) = N(K) = N(Q). It follows that Q(I − Q) = 0, and thus Q = Q². From the definition of Q it follows that R(Q), the range of Q, is contained in the sequential weak∗-closure of R(K). Conversely, let ν = σ(M(E), C_b(E))-lim_{n→∞} Kµ_n, where (µ_n)_{n∈N} is a sequence in D(K). Then Q(Kµ_n − ν) = Q(Kµ_n) − ν + ν − Qν = Kµ_n − ν + ν − Qν, which converges for the weak∗-topology to ν − Qν. It follows that the pair (0, ν − Qν) belongs to the sequential weak∗-closure of the graph of Q, and consequently ν = Qν.
Proof of assertion (2). In the proof of this assertion we use the identity µ + K∫_0^t e^{sK}µ ds = e^{tK}µ instead of µ + KR(λ)µ = λR(λ)µ. Then we let t tend to ∞.

Proof of assertion (3). In the proof of this assertion we employ the identity
\[
\mu+\frac{2}{\pi}\,K\int_r^\infty\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\,d\xi
=\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda.
\]
Then we let r > 0 tend to 0.
Proof of assertion (4). Here we have the identity:
\[
\mu+\frac{1}{2\pi i}\int_{\widetilde\gamma_r}\frac{1}{\lambda}\,K(\lambda I-K)^{-1}\mu\,d\lambda
=\frac{1}{2\pi i}\int_{\widetilde\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda.
\]
Hence, here we have
\[
Q\mu=\frac{1}{2\pi i}\int_{\widetilde\gamma_r}\frac{1}{\lambda}\,(-K)(\lambda I-K)^{-1}\mu\,d\lambda,
\quad\text{and}\quad
P\mu=\frac{1}{2\pi i}\int_{\widetilde\gamma_r}(\lambda I-K)^{-1}\mu\,d\lambda.
\]
Essentially speaking this proves assertion (4).
This completes the proof of Proposition 7.43.

In all these cases we prove that (I − Q)M(E) = N(K) = N(Q), and Q(Kµ) = K(Qµ) = Kµ for µ ∈ D(K). Consequently, Q² = Q. If Qµ = 0, then µ = µ − Qµ ∈ N(K), and hence N(Q) ⊂ N(K). Conversely, if µ ∈ D(K) is such that Kµ = 0, then the definition of Q implies Qµ = 0.
Theorem 7.46. Suppose that the operator K generates a bounded analytic weak∗-continuous semigroup on M(E), and that for every f ∈ C_b(E) and µ ∈ M(E) the integral
\[
\frac{2}{\pi}\int_0^\infty\Bigl\langle f,\ (-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\Bigr\rangle\,d\xi \tag{7.268}
\]
exists as an improper Riemann integral. Suppose that for every µ ∈ M(E) the family of measures {λ(λI − K)^{-1}µ : ℜλ > 0} is T_β-equi-continuous. Then for every µ ∈ M(E), the functional
\[
f\mapsto\int_0^\infty\Bigl\langle f,\ (-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\Bigr\rangle\,d\xi,\qquad f\in C_b(E), \tag{7.269}
\]
is continuous on (C_b(E), T_β), and hence it can be identified with a measure. In addition, it is assumed that for every f ∈ C_b(E) the equality
\[
\lim_{n\to\infty}\int_0^\infty\Bigl\langle f,\ (-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu_n\Bigr\rangle\,d\xi
=\int_0^\infty\Bigl\langle f,\ (-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\Bigr\rangle\,d\xi
\]
holds whenever (µ_n : n ∈ N) is a sequence in M(E) which converges with respect to the σ(M(E), C_b(E))-topology to a measure µ ∈ M(E), i.e.
\[
\lim_{n\to\infty}\langle g,\mu_n\rangle=\langle g,\mu\rangle \quad\text{for all } g\in C_b(E).
\]
For µ ∈ M(E) let Qµ denote the measure corresponding to the functional:
\[
f\mapsto\frac{2}{\pi}\int_0^\infty\Bigl\langle f,\ (-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\mu\Bigr\rangle\,d\xi=\langle f,Q\mu\rangle.
\]
Then for every µ ∈ M(E) the measure µ − Qµ belongs to D(K) and K(µ − Qµ) = 0. Moreover, R(Q) is the weak∗-sequential closure of R(K), and Q² = Q. In addition I − Q sends positive measures to positive measures, and R(Q) ∩ N(Q) = {0}. If ⟨1, Kµ⟩ = 0 for all µ ∈ D(K), then I − Q sends the convex set of probability measures on E to itself.

Proof (Proof of Theorem 7.46). As in Proposition 7.41 we introduce the operators
\[
Q_r=\frac{2}{\pi}\int_r^\infty(-K)\bigl(\xi^2 I+K^2\bigr)^{-1}\,d\xi
\quad\text{and}\quad
P_r=\frac{1}{\pi i}\int_{\gamma_r}(\lambda I-K)^{-1}\,d\lambda. \tag{7.270}
\]
Then I = Q_r + P_r. Notice that, for given µ ∈ M(E), the collection {λ(λI − K)^{-1}µ : ℜλ > 0} is T_β-equi-continuous. As a consequence we see that the functional in (7.269) belongs to M(E). The proof of Theorem 7.46 can be completed as the proof of Proposition 7.43.

Proof (Proof of Theorem 7.36). Let µ(t) be as in (7.6), and let π(t) satisfy K(t)π(t) = 0. It follows that µ̇(t) = K(t)µ(t) belongs to M_0(E). Since the spectrum of the operator K(t)|_{M_0(E)} is contained in the complement of a sector of the form
\[
\{\lambda\in\mathbb{C}:\Re\lambda\ge-\omega(t),\ |\arg(\lambda+\omega(t))|\le\beta\}
\]
with ½π < β < π, we have:
\[
\begin{aligned}
(I-P(t))\mu(t)
&=\frac{1}{2\pi i}\int_{-\omega(t)-i\infty}^{-\omega(t)+i\infty}\frac{1}{\lambda}\,K(t)\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)\\
&=\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{-\omega(t)+i\xi}\Bigl(\bigl(-\omega(t)+i\xi\bigr)I|_{M_0(E)}-K(t)|_{M_0(E)}\Bigr)^{-1}
\bigl(K(t)|_{M_0(E)}\bigr)\bigl(\mu(t)-\pi(t)\bigr)\,d\xi \tag{7.271}\\
&=\bigl(K(t)|_{M_0(E)}\bigr)^{-1}\bigl(K(t)|_{M_0(E)}\bigr)\bigl(\mu(t)-\pi(t)\bigr)\\
&=\mu(t)-\pi(t). \tag{7.272}
\end{aligned}
\]
From (7.272) we see that P(t)µ(t) = π(t) and hence K(t)P(t)µ(t) = 0. Using (7.244) and (7.271) we obtain the following norm estimate:
\[
\begin{aligned}
\mathrm{Var}\bigl((I-P(t))\mu(t)\bigr)
&\le\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{1}{|-\omega(t)+i\xi|}\,\mathrm{Var}\Bigl(\bigl(\bigl(-\omega(t)+i\xi\bigr)I|_{M_0(E)}-K(t)|_{M_0(E)}\bigr)^{-1}
\bigl(K(t)|_{M_0(E)}\bigr)\bigl(\mu(t)-\pi(t)\bigr)\Bigr)\,d\xi\\
&\le\frac{c(t)}{2\pi}\int_{-\infty}^{\infty}\frac{1}{|-\omega(t)+i\xi|^2}\,d\xi\;\mathrm{Var}\Bigl(\bigl(K(t)|_{M_0(E)}\bigr)\bigl(\mu(t)-\pi(t)\bigr)\Bigr)\\
&=\frac{c(t)}{2\omega(t)}\,\mathrm{Var}\bigl(K(t)\mu(t)\bigr)=\frac{c(t)}{2\omega(t)}\,\mathrm{Var}\bigl(\dot\mu(t)\bigr). \tag{7.273}
\end{aligned}
\]
The estimate in (7.273) together with (7.243) entails the following result:
\[
\lim_{t\to\infty}\mathrm{Var}\bigl((I-P(t))\mu(t)\bigr)=\lim_{t\to\infty}\mathrm{Var}\bigl(\mu(t)-\pi(t)\bigr)=0.
\]
Essentially speaking this proves Theorem 7.36.

Remark 7.47. In this remark we give an alternative representation of the operator P(t). Since the measure K(t)µ(t) belongs to M_0(E), from (7.244) it follows that
\[
\frac{1}{2\pi i}\int_{\omega(t)-i\infty}^{\omega(t)+i\infty}\frac{1}{\lambda}\,K(t)\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)=0, \tag{7.274}
\]
and hence, by Cauchy's theorem,
\[
\begin{aligned}
(I-P(t))\mu(t)
&=\frac{1}{2\pi i}\int_{-\omega(t)-i\infty}^{-\omega(t)+i\infty}\frac{1}{\lambda}\,K(t)\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)
-\frac{1}{2\pi i}\int_{\omega(t)-i\infty}^{\omega(t)+i\infty}\frac{1}{\lambda}\,K(t)\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)\\
&=-\frac{1}{2\pi i}\int_{\{|\lambda|=\omega(t)\}}\frac{1}{\lambda}\,K(t)\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)\\
&=\frac{1}{2\pi i}\int_{\{|\lambda|=\omega(t)\}}\frac{1}{\lambda}\,d\lambda\,\mu(t)
-\frac{1}{2\pi i}\int_{\{|\lambda|=\omega(t)\}}\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t)\\
&=\mu(t)-\frac{1}{2\pi i}\int_{\{|\lambda|=\omega(t)\}}\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t),
\end{aligned}
\]
and consequently,
\[
P(t)\mu(t)=\frac{1}{2\pi i}\int_{\{|\lambda|=\omega(t)\}}\bigl(\lambda I-K(t)\bigr)^{-1}\,d\lambda\,\mu(t). \tag{7.275}
\]
From residue calculus it follows that P(t)µ(t) = lim_{λ↓0} λ(λI − K(t))^{-1}µ(t). Since the operator K(t) has the Kolmogorov property, we see that for λ > 0 the operator λ(λI − K(t))^{-1} sends positive measures to positive measures, and hence P(t)µ(t) is a positive Borel measure. By the same argument ⟨1, P(t)µ(t)⟩ = 1.

The following corollary is applicable if K(t) = L(t)^*, where the operators satisfy the analytic maximum principle. The latter means that
\[
\sup_{s>0}\,\sup_{t>0}\,\bigl\|sK(t)e^{sK(t)}\bigr\|<\infty,
\]
and that the operators L(t) satisfy the maximum principle. Such densely defined operators in C_b(E) generate bounded analytic semigroups e^{sL(t)} where s belongs to a sector with angle opening independent of t. It follows that the operators λI − L(t) are invertible for all λ ∈ C with |arg λ| ≤ β, with ½π < β < π, and where for some constant C (independent of t) the inequality |λ|‖(λI − L(t))^{-1}‖ ≤ C holds for λ ∈ C with |arg λ| ≤ β.
Corollary 7.48. Let K(t), t ≥ 0, be a family of generators of weak∗-continuous semigroups in M(E) with the property that the operators e^{sK(t)}, s ≥ 0, t ≥ 0, map positive measures to positive measures, and each operator K(t) has the property that
\[
|\lambda|\,\mathrm{Var}(\mu)\le C\,\mathrm{Var}\bigl(\lambda\mu-K(t)\mu\bigr),\qquad \Re\lambda>0,\ \mu\in D(K(t)), \tag{7.276}
\]
where C is a constant which does not depend on t. Suppose that the constants ω(t) and c(t) are such that one of the following conditions
\[
\mathrm{Var}\bigl(e^{sK(t)}\mu\bigr)\le c(t)e^{-2\omega(t)s}\,\mathrm{Var}(\mu),\quad\text{for } s>0,\ \text{or} \tag{7.277}
\]
\[
|\lambda|\,\mathrm{Var}(\mu)\le c(t)\,\mathrm{Var}\bigl(\lambda\mu-K(t)\mu\bigr),\quad\text{for all } \lambda\in\mathbb{C} \text{ such that } |\lambda|\le\omega(t) \tag{7.278}
\]
is satisfied for all µ ∈ M_0(E) ∩ D(K(t)). Let t ↦ µ(t) be a solution to the equation µ̇(t) = K(t)µ(t), t ≥ 0, with µ(t) ∈ P(E). If (7.243) is satisfied, then the system µ̇(t) = K(t)µ(t) is ergodic, provided that there exists a unique function π(t) ∈ P(E) such that K(t)π(t) = 0.
Proof. There exists ½π < β < π such that |λ|‖(λI − K(t))^{-1}‖ ≤ C for λ ∈ C with |arg λ| ≤ β, with C independent of t: see Theorem 7.49 and Corollaries 7.51 and 7.52. For µ(t) − π(t) ∈ M_0(E) ∩ D(K(t)) such that K(t)(µ(t) − π(t)) = µ̇(t), and for λ ∈ C such that |λ − 2ω(t)| ≤ ω(t)/(2c(t)) and ℜλ ≥ ω(t), we have
\[
\mu(t)-\pi(t)=\int_0^\infty e^{-s(\lambda I-2\omega(t)I-K(t))}\Bigl(\bigl(\lambda-2\omega(t)\bigr)\bigl(\mu(t)-\pi(t)\bigr)-\dot\mu(t)\Bigr)\,ds,
\]
and hence for such λ
\[
\mathrm{Var}\bigl(\mu(t)-\pi(t)\bigr)
\le c(t)\,\frac{|\lambda-2\omega(t)|}{\Re\lambda}\,\mathrm{Var}\bigl(\mu(t)-\pi(t)\bigr)+\frac{c(t)}{\Re\lambda}\,\mathrm{Var}\bigl(\dot\mu(t)\bigr)
\le\frac12\,\mathrm{Var}\bigl(\mu(t)-\pi(t)\bigr)+\frac{c(t)}{\omega(t)}\,\mathrm{Var}\bigl(\dot\mu(t)\bigr). \tag{7.279}
\]
An easy application of Theorem 7.36 then completes the proof of Corollary 7.48.
Families of semigroups {e^{sK(t)} : s ≥ 0}, t ≥ t_0, which satisfy (b) of the following theorem are called uniformly bounded and uniformly holomorphic families of operator semigroups: cf. Blunck [36]. The next result will be used with A(t) = 2ωI|_{M_0(E)} + K(t)|_{M_0(E)}: see Corollary 7.53 below.

Theorem 7.49. Let A(t), t ≥ t_0, be a family of closed linear operators, each of which has a dense domain in a Banach space (X, ‖·‖). Suppose that, for every t ≥ t_0 and for every λ ∈ C with ℜλ > 0, the inverses (λI − A(t))^{-1} exist and are bounded. Then the following assertions are equivalent:

(a) sup_{t≥t_0} sup_{ℜλ>0} |λ| ‖(λI − A(t))^{-1}‖ < ∞;
(b) sup_{s>0} sup_{t≥t_0} ‖sA(t)e^{sA(t)}‖ < ∞ and sup_{s>0} sup_{t≥t_0} ‖e^{sA(t)}‖ < ∞.

Proof. Most standard proofs for one generator A can be adapted to include a family of operators A(t), t ≥ t_0: see e.g. [237], page 84, or [187], Theorem 5.2 and formula (5.16). Another thorough discussion can be found in Chapter II, Section 4 of Engel and Nagel [81].

It is also a consequence of the following theorem. For convenience, and because we need to keep track of the constants, an outline of the proof is included.
Theorem 7.50. Let K be the generator of a strongly continuous semigroup with the property that for λ ∈ C with ℜλ > 0 the inverse (λI − K)^{-1} exists as a bounded linear operator. Then the following assertions are true:

(i) If, for some finite constant C, the inequality
\[
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\| \le C \quad\text{holds for all } \lambda\in\mathbb{C} \text{ with } \Re\lambda>0, \tag{7.280}
\]
then
\[
\bigl\|e^{tK}\bigr\| \le \frac{e}{2}\,C^2 \quad\text{and}\quad \bigl\|tKe^{tK}\bigr\| \le eC^2(1+C) \quad\text{for all } t>0. \tag{7.281}
\]
(ii) If there exist finite constants C_1 and C_2 such that
\[
\bigl\|e^{tK}\bigr\| \le C_1 \quad\text{and}\quad \bigl\|tKe^{tK}\bigr\| \le C_2, \quad\text{for all } t>0, \tag{7.282}
\]
then
\[
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\| \le C \quad\text{holds for all } \lambda\in\mathbb{C} \text{ with } \Re\lambda>0. \tag{7.283}
\]
Here the constant C is given by C = 2(C_2e + 1)\Bigl(C_1 + \dfrac{e}{\sqrt{2\pi}}\,C_2\Bigr).
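The implication (ii) ⇒ (i) of Theorem 7.50, including the explicit constant C, can be sanity-checked numerically; the non-normal 2×2 matrix and the finite grids below are our own illustrative choices, so the computed suprema are only grid approximations:

```python
import numpy as np

# Hypothetical non-normal generator K with spectrum {-1, -2}.  Estimate
# C1 = sup ||e^{tK}||, C2 = sup ||t K e^{tK}|| on a grid, form
# C = 2 (C2 e + 1) (C1 + e C2 / sqrt(2 pi)), and confirm the resolvent
# bound |lambda| ||(lambda I - K)^{-1}|| <= C on a grid in Re(lambda) > 0.
K = np.array([[-1.0, 5.0], [0.0, -2.0]])
I2 = np.eye(2)
w, S = np.linalg.eig(K)
Sinv = np.linalg.inv(S)

def expm_tK(t):
    # e^{tK} via the eigendecomposition K = S diag(w) S^{-1}
    return (S * np.exp(t * w)) @ Sinv

ts = np.logspace(-3, 3, 2000)
C1 = max(np.linalg.norm(expm_tK(t), 2) for t in ts)
C2 = max(np.linalg.norm(t * K @ expm_tK(t), 2) for t in ts)
C = 2.0 * (C2 * np.e + 1.0) * (C1 + np.e * C2 / np.sqrt(2.0 * np.pi))

worst = 0.0
for re in np.logspace(-3, 3, 60):
    for im in np.linspace(-1e3, 1e3, 201):
        lam = re + 1j * im
        worst = max(worst, abs(lam) * np.linalg.norm(
            np.linalg.inv(lam * I2 - K), 2))

print(C1, C2, C, worst)
```

In this example the resolvent bound holds with a wide margin; the constant C from the theorem is not claimed to be sharp.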


Proof. Assertion (i) follows from the representations:
\[
te^{tK}=\frac{1}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty}\lambda^2 e^{t\lambda}\,(\lambda I-K)^{-2}\,\frac{1}{\lambda^2}\,d\lambda; \tag{7.284}
\]
\[
\frac12\,t^2Ke^{tK}=\frac{1}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty}\lambda^3 e^{t\lambda}\,(\lambda I-K)^{-3}\,\frac{1}{\lambda^2}\,d\lambda
-\frac{1}{2\pi i}\int_{\omega-i\infty}^{\omega+i\infty}\lambda^2 e^{t\lambda}\,(\lambda I-K)^{-2}\,\frac{1}{\lambda^2}\,d\lambda, \tag{7.285}
\]
together with the choice ω = 1/t.
The proof of assertion (ii) is somewhat more delicate. At first we fix t_0 > 0 and we consider t > 0 with the property that
\[
|t-t_0| \le \frac{t_0}{C_2e+1}. \tag{7.286}
\]
We notice the inequality
\[
t \ge \frac{C_2e}{C_2e+1}\,t_0, \tag{7.287}
\]
whenever t satisfies (7.286). Moreover, for n ≥ 0 we have the representation
\[
e^{tK}=\sum_{\ell=0}^n\frac{(t-t_0)^\ell}{\ell!}\,K^\ell e^{t_0K}+\frac{1}{n!}\int_{t_0}^t(t-s)^n\,K^{n+1}e^{sK}\,ds. \tag{7.288}
\]
The remainder term in (7.288) can be estimated as follows:
\[
\begin{aligned}
\Bigl\|\frac{1}{n!}\int_{t_0}^t(t-s)^n\,K^{n+1}e^{sK}\,ds\Bigr\|
&\le\frac{1}{n!}\,\Bigl|\int_{t_0}^t|t-s|^n\,\frac{(n+1)^{n+1}}{s^{n+1}}\,\Bigl\|\Bigl(\frac{sK}{n+1}\,e^{\frac{sK}{n+1}}\Bigr)^{\!n+1}\Bigr\|\,ds\Bigr|\\
&\le\frac{(n+1)^{n+1}C_2^{n+1}}{n!}\,\Bigl|\int_{t_0}^t\frac{|t-s|^n}{(\min(t,t_0))^{n+1}}\,ds\Bigr|\\
&\le\frac{(n+1)^{n+1}C_2^{n+1}}{(n+1)!}\,\frac{|t-t_0|^{n+1}}{(\min(t,t_0))^{n+1}}
\end{aligned}
\]
(employ (7.286) and (7.287))
\[
\le\frac{(n+1)^{n+1}}{(n+1)!}\,\frac{1}{e^{n+1}}
\]
(use Stirling's formula: (n+1)! ≥ √(2π(n+1)) e^{−n−1}(n+1)^{n+1})
\[
\le\frac{1}{\sqrt{2\pi(n+1)}}. \tag{7.289}
\]

This inequality clearly shows that the remainder term converges to 0 uniformly for t and t_0 satisfying |t − t_0| ≤ t_0/(C_2e + 1). From (7.288) we see that for t ∈ C chosen in such a way that |t − t_0| ≤ t_0/(C_2e + 1) the semigroup e^{tK} can be represented as:
\[
e^{tK}=e^{t_0K}+\sum_{\ell=1}^\infty\frac{(t-t_0)^\ell}{\ell!}\,K^\ell e^{t_0K}. \tag{7.290}
\]
From (7.290) it follows that
\[
\bigl\|e^{tK}-e^{t_0K}\bigr\|
\le\sum_{\ell=1}^\infty\frac{|t-t_0|^\ell}{\ell!}\,\bigl\|K^\ell e^{t_0K}\bigr\|
=\sum_{\ell=1}^\infty\Bigl(\frac{1}{C_2e+1}\Bigr)^{\!\ell}\,\frac{\ell^\ell}{\ell!}\,\Bigl\|\Bigl(\frac{t_0K}{\ell}\,e^{\frac{t_0K}{\ell}}\Bigr)^{\!\ell}\Bigr\|
\le\sum_{\ell=1}^\infty\frac{\ell^\ell}{(C_2e+1)^\ell\,\ell!}\,C_2^\ell
\]
(again we employ Stirling's formula ℓ! ≥ √(2πℓ) e^{−ℓ}ℓ^ℓ)
\[
\le\sum_{\ell=1}^\infty\frac{(C_2e)^\ell}{(C_2e+1)^\ell}\,\frac{1}{\sqrt{2\pi\ell}}
\le\frac{1}{\sqrt{2\pi}}\sum_{\ell=1}^\infty\Bigl(\frac{C_2e}{C_2e+1}\Bigr)^{\!\ell}
=\frac{e}{\sqrt{2\pi}}\,C_2. \tag{7.291}
\]
Consequently, by our assumption ‖e^{t_0K}‖ ≤ C_1, for all t_0 > 0, we get
\[
\bigl\|e^{tK}\bigr\| \le C_1+\frac{e}{\sqrt{2\pi}}\,C_2, \tag{7.292}
\]
whenever t ∈ C is chosen in such a way that (7.286) is satisfied for some t_0 > 0. If we choose ⅓π > α > 0 in such a way that 1/(2C_2e + 2) = sin(½α), and if |arg(t)| ≤ α, then t satisfies |t − t_0| ≤ t_0/(C_2e + 1) with t_0 = |t|. Hence the norm of e^{tK} satisfies (7.292). For λ ∈ C such that −½π + ½α < arg(λ) < ½π + ½α we have:
\[
(\lambda I-K)^{-1}=e^{-\frac{i}{2}\alpha}\int_0^\infty\exp\Bigl(-\lambda e^{-\frac{i}{2}\alpha}sI+e^{-\frac{i}{2}\alpha}sK\Bigr)\,ds, \tag{7.293}
\]

and hence
\[
\begin{aligned}
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\|
&\le|\lambda|\int_0^\infty\Bigl|\exp\Bigl(-\lambda e^{-\frac{i}{2}\alpha}s\Bigr)\Bigr|\,\Bigl\|\exp\Bigl(e^{-\frac{i}{2}\alpha}sK\Bigr)\Bigr\|\,ds\\
&\le|\lambda|\int_0^\infty\exp\Bigl(-|\lambda|\cos\bigl(\arg(\lambda)-\tfrac12\alpha\bigr)s\Bigr)\,ds\,\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr)\\
&=\frac{1}{\cos\bigl(\arg(\lambda)-\frac12\alpha\bigr)}\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr). \tag{7.294}
\end{aligned}
\]
By the same token we also get, for λ ∈ C such that −½π − ½α < arg(λ) < ½π − ½α,
\[
(\lambda I-K)^{-1}=e^{\frac{i}{2}\alpha}\int_0^\infty\exp\Bigl(-\lambda e^{\frac{i}{2}\alpha}sI+e^{\frac{i}{2}\alpha}sK\Bigr)\,ds, \tag{7.295}
\]
and hence
\[
\begin{aligned}
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\|
&\le|\lambda|\int_0^\infty\Bigl|\exp\Bigl(-\lambda e^{\frac{i}{2}\alpha}s\Bigr)\Bigr|\,\Bigl\|\exp\Bigl(e^{\frac{i}{2}\alpha}sK\Bigr)\Bigr\|\,ds\\
&\le|\lambda|\int_0^\infty\exp\Bigl(-|\lambda|\cos\bigl(\arg(\lambda)+\tfrac12\alpha\bigr)s\Bigr)\,ds\,\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr)\\
&=\frac{1}{\cos\bigl(\arg(\lambda)+\frac12\alpha\bigr)}\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr). \tag{7.296}
\end{aligned}
\]
From (7.294) and (7.296) we infer:
\[
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\|
\le\frac{1}{\cos\bigl(|\arg(\lambda)|-\frac12\alpha\bigr)}\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr), \tag{7.297}
\]
for −½π − ½α < arg(λ) < ½π + ½α. Inequality (7.283) in Theorem 7.50 follows from (7.297) with λ ∈ C such that |arg(λ)| < ½π.
This completes the proof of Theorem 7.50.

An inspection of the proof of assertion (ii) in Theorem 7.50, in particular inequality (7.297), yields the following result, which says that the resolvent family of a bounded analytic semigroup is bounded in a sector with an opening which is larger than the open right half-plane.

Corollary 7.51. Let the hypotheses and notation be as in Theorem 7.50. Choose the angle ⅓π > α > 0 in such a way that sin(½α) = 1/(2C_2e + 2). Choose 0 ≤ β < ½α. Then
\[
|\lambda|\,\bigl\|(\lambda I-K)^{-1}\bigr\|\le\frac{1}{\sin\bigl(\frac12\alpha-\beta\bigr)}\Bigl(C_1+\frac{e}{\sqrt{2\pi}}\,C_2\Bigr),\qquad|\arg\lambda|\le\tfrac12\pi+\beta. \tag{7.298}
\]

The result in Corollary 7.51 extends to uniformly bounded and uniformly analytic semigroups. Notice that (7.300) is equivalent to an inequality of the form (t ≥ t_0):
\[
|\lambda|\,\bigl\|(\lambda I-A(t))^{-1}\bigr\|\le C,\qquad\text{for }\lambda\in\mathbb{C}\text{ with }\Re\lambda>0, \tag{7.299}
\]
where the constants C and C_1, C_2 are related in an explicit manner: see Theorem 7.50.
Corollary 7.52. Let A(t), t ≥ t_0, be a family of closed densely defined linear operators. Suppose there exist finite constants C_1 and C_2 such that
\[
\bigl\|e^{sA(t)}\bigr\|\le C_1 \quad\text{and}\quad \bigl\|sA(t)e^{sA(t)}\bigr\|\le C_2,\qquad\text{for all } s>0 \text{ and for all } t\ge t_0. \tag{7.300}
\]
Choose 0 < α < ⅓π in such a way that sin(½α) = 1/(2C_2e + 2). Fix 0 ≤ β < ½α. Then, for all t ≥ t_0, the inequality
\[
|\lambda|\,\bigl\|(\lambda I-A(t))^{-1}\bigr\|\le C(\beta) \tag{7.301}
\]
is true for all λ ∈ C with |arg λ| ≤ ½π + β. Here the constant C(β) is given by C(β) = \dfrac{1}{\sin(\frac12\alpha-\beta)}\Bigl(C_1+\dfrac{e}{\sqrt{2\pi}}\,C_2\Bigr).
In the following corollary we use Theorem 7.49 and Corollary 7.52 with A(t) = 2ωI|_{M_0(E)} + K(t)|_{M_0(E)}.

Corollary 7.53. Let the function t ↦ µ(t) solve the equation:
\[
\dot\mu(t)=K(t)\mu(t),\qquad \mu(t)\in P(E).
\]
Suppose that lim_{t→∞} Var(µ̇(t)) = 0, and that there exists ω > 0 such that
\[
c:=\sup_{s,t>0}\,\Bigl\|s\bigl(2\omega I+K(t)\bigr)e^{s(2\omega I+K(t))}\bigr|_{M_0(E)}\Bigr\|<\infty. \tag{7.302}
\]
If, in addition, there exists only one continuous function t ↦ π(t) with values in P(E) such that K(t)π(t) = 0, then lim_{t→∞} Var(µ(t) − π(t)) = 0.

Notice that the operator (2ωI + K(t))e^{s(2ωI+K(t))} is a mapping from M_0(E) to M_0(E).
Proof. An appeal to Corollary 7.52 together with the hypothesis in inequality (7.302) shows that there exists a finite constant c_1 such that
\[
|\lambda|\,\Bigl\|\bigl(\lambda I|_{M_0(E)}-(2\omega I+K(t))|_{M_0(E)}\bigr)^{-1}\Bigr\|\le c_1\qquad\text{for all }\lambda\text{ with }\Re\lambda>0. \tag{7.303}
\]
The latter result follows in fact from the theory of families of uniformly holomorphic semigroups (the inequality (7.303) is uniform in t > t_0). Consequently, we obtain:
\[
|\lambda-2\omega|\,\Bigl\|\bigl(\lambda I|_{M_0(E)}-(2\omega I+K(t))|_{M_0(E)}\bigr)^{-1}\Bigr\|\le 3c_1
\]
for all λ with ℜλ ≥ ω.
The result in Corollary 7.53 then follows from Theorem 7.36.
Examples of operators L which generate analytic Feller semigroups can be
found in Taira [227]. Other valuable sources of information are Metafune,
Pallara and Wacker [158] and Taira [226].

7.5 Conclusions

In this chapter we discussed some properties of the fundamental operator of the non-stationary, or time-dependent, continuous system (7.6). Moreover, in some particular cases, when we deal with a family of Kolmogorov operators K(t), we introduce and prove some efficient criteria for checking ergodicity (Theorem 7.36). This is done by using the Dunford projection on the eigenspace corresponding to the critical eigenvalue 0 of K(t).

The properties of the families of semigroups {e^{sK(t)} : s ≥ 0}, t ≥ t_0, are examined in detail in Theorem 7.49 and Theorem 7.50 as well as in Corollary 7.51 and Corollary 7.52. The results obtained allow us to present Corollary 7.53, providing the ergodicity of a non-stationary system in terms of bounded analytic semigroups. In addition, in §8.2 we discuss a rather general situation in which we have a spectral gap: see e.g. Proposition 8.73. Some of this work was based on ideas and concepts of Katilova [128, 129, 130]. What follows next can be found in [243].
Theorem 7.54 is inspired by ideas in Nagy and Zemanek: see [165]. The result can also be found in the Ph.D. thesis of Katilova: see [129], Theorem 8.9.
Theorem 7.54. Let M be a bounded linear operator in a Banach space X. By definition the subspace X_0 of X is the ‖·‖-closure of the vector sum of the range and zero-space of I − M: X_0 = \overline{R(I-M)+N(I-M)}^{\,\|\cdot\|}. Suppose that the spectrum of M is contained in the open unit disc union {1}. The following assertions are equivalent:

(i) sup_{|λ|<1} ‖(1 − λ)(I − λM)^{-1}x‖ < ∞ for every x ∈ X_0;
(ii) sup_{n∈N} ‖M^n x‖ < ∞ and sup_{n∈N} (n + 1)‖M^n(I − M)x‖ < ∞ for every x ∈ X_0;
(iii) sup_{t>0} ‖e^{t(M−I)}x‖ < ∞ and sup_{t>0} ‖t(M − I)e^{t(M−I)}x‖ < ∞ for every x ∈ X_0;
(iv) There exists ½π < α < π such that for all x ∈ X_0:
\[
\sup\Bigl\{|\lambda|\,\bigl\|(\lambda I-(M-I))^{-1}x\bigr\| : -\alpha<\arg(\lambda)<\alpha\Bigr\}<\infty;
\]
(v) There exists ½π < α < π such that for all x ∈ X_0:
\[
\sup\Bigl\{\bigl\|(I-M)\bigl((\lambda+1)I-M\bigr)^{-1}x\bigr\| : -\alpha<\arg(\lambda)<\alpha\Bigr\}<\infty;
\]
(vi) For every x ∈ X_0 the following limits exist:
\[
Px := \lim_{n\to\infty}M^n x
\quad\text{and}\quad
(I-P)x = \lim_{\substack{re^{i\vartheta}\to1\\ 0<r<1}}(I-M)\bigl(I-re^{i\vartheta}M\bigr)^{-1}x;
\]
(vii) For every x ∈ X_0 the following limit exists:
\[
(I-P)x := \lim_{\substack{re^{i\vartheta}\to1\\ 0<r<1}}(I-M)\bigl(I-re^{i\vartheta}M\bigr)^{-1}x.
\]
Moreover, if M satisfies one of the conditions (i) through (vii), then
\[
X_0 = \overline{R(I-M)}^{\,\|\cdot\|} + N(I-M).
\]

Remark 7.55. The Banach-Steinhaus theorem implies that in (i) through (v)
in Theorem 7.54 the vector norms may be replaced with the operator norm
restricted to X0 ; i.e. the operator M must be restricted to X0 . These assertions
(i) through (v) are also equivalent if X0 is replaced with the space X. This
fact will be used in Definition 7.58.
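Conditions (i) and (ii) of Theorem 7.54 can be probed numerically in a toy model; the 2×2 matrix M below, with spectrum {1, 1/2} (so that X_0 = X = R²), and the finite grids are our own illustrative choices:

```python
import numpy as np

# Toy operator M with spectrum {1, 1/2}, contained in the open unit disc
# union {1}; here R(I - M) + N(I - M) spans all of R^2, so X0 = X.
M = np.array([[1.0, 1.0], [0.0, 0.5]])
I2 = np.eye(2)

# Condition (ii): power-boundedness and the Ritt-type bound (n+1)||M^n (I-M)||
powers = [np.linalg.matrix_power(M, n) for n in range(200)]
b_pow = max(np.linalg.norm(P, 2) for P in powers)
b_ritt = max((n + 1) * np.linalg.norm(P @ (I2 - M), 2)
             for n, P in enumerate(powers))

# Condition (i): sup_{|lambda| < 1} ||(1 - lambda)(I - lambda M)^{-1}||
b_res = 0.0
for rr in np.linspace(0.0, 0.999, 100):
    for th in np.linspace(-np.pi, np.pi, 181):
        lam = rr * np.exp(1j * th)
        b_res = max(b_res, abs(1 - lam) * np.linalg.norm(
            np.linalg.inv(I2 - lam * M), 2))

print(b_pow, b_ritt, b_res)  # all three suprema are finite (and modest)
```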

Conditions (a) and (b) of the following corollary from [9] are satisfied if the space X is reflexive. The closed range condition in (c) has been used by Lin in [150] and in [151]; in the latter reference he also tied it up with Doeblin's condition.

Corollary 7.56. Let M be a bounded linear operator in a Banach space (X, ‖·‖). As in Theorem 7.54 let X_0 be the closure in X of the subspace R(I − M) + N(I − M). Suppose that, for 0 < λ < 1, the inverse operators (I − λM)^{-1} exist and are bounded, and that sup_{0<λ<1} (1 − λ)‖(I − λM)^{-1}‖ < ∞. If one of the following conditions:

(a) the zero space of the operator (I − M)^{**}, which is a subspace of the bidual space X^{**}, is in fact a subspace of X;
(b) the σ(X^*, X)-closure of R((I − M)^*) coincides with its ‖·‖-closure;
(c) the range of I − M is closed in X;

is satisfied, then the space X_0 coincides with X, and hence all assertions in Theorem 7.54 are equivalent with X replacing X_0.
Remark 7.57. If sup_{n∈N} ‖M^n‖ < ∞, then sup_{0<λ<1} (1 − λ)‖(I − λM)^{-1}‖ < ∞.

Definition 7.58. An operator M which satisfies the equivalent conditions (i)


– (v) of Theorem 7.54 with the space X replacing X0 is called an analytic
operator.

Proof (Proof of Corollary 7.56). If the range of I − M is closed, then by the closed range theorem the range of (I − M)^* is weak^*-closed, and hence (c) implies (b). We will prove that (a) as well as (b) implies X_0 = X. First we assume (a) to be satisfied. Pick x ∈ X, and consider
\[
x = (I-M)(I-\lambda M)^{-1}x + (1-\lambda)M(I-\lambda M)^{-1}x = x - x_\lambda + x_\lambda, \tag{7.304}
\]
where x_λ = (1 − λ)M(I − λM)^{-1}x. Then sup_{0<λ<1} ‖x_λ‖ < ∞, and consequently the family x_λ, 0 < λ < 1, has a point of adherence x^{**} in X^{**}; i.e. x^{**} belongs to the σ(X^{**}, X^*)-closure of the subset {x_λ : 1 − η < λ < 1}, and this for every 0 < η < 1. Fix x^* ∈ X^*. Then
\[
\bigl|\bigl\langle(1-\lambda)M(I-\lambda M)^{-1}x,\ (I-M)^*x^*\bigr\rangle\bigr|
=\bigl|\bigl\langle(1-\lambda)(I-M)(I-\lambda M)^{-1}x,\ M^*x^*\bigr\rangle\bigr|
\le(1-\lambda)\bigl\|(I-M)(I-\lambda M)^{-1}x\bigr\|\,\|M^*x^*\|. \tag{7.305}
\]
Since sup_{0<λ<1} (1 − λ)‖(I − λM)^{-1}‖ < ∞, the identity
\[
(I-M)(I-\lambda M)^{-1}=\frac{1}{\lambda}\Bigl(I-(1-\lambda)(I-\lambda M)^{-1}\Bigr)
\]
yields that sup_{0<λ<1} ‖(I − M)(I − λM)^{-1}‖ < ∞. Consequently (7.305) implies
\[
\bigl\langle x^{**},\ (I-M)^*x^*\bigr\rangle=\lim_{\lambda\uparrow1}\bigl\langle x_\lambda,\ (I-M)^*x^*\bigr\rangle
=\lim_{\lambda\uparrow1}\bigl\langle(1-\lambda)(I-M)(I-\lambda M)^{-1}x,\ M^*x^*\bigr\rangle=0.
\]
Hence x^{**} annihilates R((I − M)^*) and so it belongs to the zero space of the operator (I − M)^{**}. By assumption this zero space is a subspace of X. We infer that the vector x can be written as x = x − x_1 + x_1, where x_1 is a member of N(I − M), and where x − x_1 belongs to the weak closure of the range of I − M. However this weak closure is the same as the norm-closure of R(I − M). Altogether this shows X = X_0 = ‖·‖-closure of R(I − M) + N(I − M).

Next we assume that (b) is satisfied. Let x_0^* be an element of X^* which annihilates X_0; i.e. which has the property that ⟨x, x_0^*⟩ = 0 for all x ∈ X_0. Then x_0^* annihilates R(I − M), and hence it belongs to the zero-space of (I − M)^*. Since x_0^* also annihilates the zero-space of I − M, it belongs to the weak^*-closure of R((I − M)^*). By assumption (b), we see that x_0^* is a member of its norm-closure; i.e. x_0^* belongs to the intersection N((I − M)^*) ∩ \overline{R((I-M)^*)}^{\,\|\cdot\|}. We will show that x_0^* = 0. By the Hahn-Banach theorem [97] it then follows that X_0 = X. Since x_0^* belongs to the ‖·‖-closure of R((I − M)^*), it follows that
\[
x_0^* = \|\cdot\|\text{-}\lim_{\lambda\uparrow1}\,(I-M)^*\bigl(I-\lambda M^*\bigr)^{-1}x_0^*. \tag{7.306}
\]
To see this we first suppose that x_0^* = (I − M)^* x_1^*. Then
\[
(I-M)^*x_1^*-(I-M)^*\bigl(I-\lambda M^*\bigr)^{-1}(I-M)^*x_1^*
=(1-\lambda)M^*\bigl(I-\lambda M^*\bigr)^{-1}(I-M)^*x_1^*. \tag{7.307}
\]
Since the family M^*(I − λM^*)^{-1}(I − M)^* x_1^*, 0 < λ < 1, is bounded, we see that (7.306) is a consequence of (7.307) provided x_0^* belongs to the range of (I − M)^*. By the uniform boundedness of the family (I − M)^*(I − λM^*)^{-1}, 0 < λ < 1, the same conclusion is true if x_0^* belongs to the closure of the range of (I − M)^*. Since, in addition, x_0^* is a member of N((I − M)^*), it follows that x_0^* = 0.
This completes the proof of Corollary 7.56.
Proof (Proof of Theorem 7.54). (i) =⇒ (ii). Fix 0 < r < 1. The following representations from Lyubich [153] are being used:
\[
(n+1)M^n=\frac{1}{2\pi i}\int_{|\lambda|=r}(1-\lambda)^2(I-\lambda M)^{-2}\,\frac{d\lambda}{\lambda^{n+1}(1-\lambda)^2}; \tag{7.308}
\]
\[
\begin{aligned}
\frac12(n+1)(n+2)M^n(I-M)
&=\frac{1}{2\pi i}\int_{|\lambda|=r}(1-\lambda)^2(I-M)(I-\lambda M)^{-3}\,\frac{d\lambda}{\lambda^{n+1}(1-\lambda)^2}\\
&=\frac{1}{2\pi i}\int_{|\lambda|=r}(1-\lambda)^2(I-\lambda M)^{-2}\,\frac{d\lambda}{\lambda^{n+2}(1-\lambda)^2}
-\frac{1}{2\pi i}\int_{|\lambda|=r}(1-\lambda)^3(I-\lambda M)^{-3}\,\frac{d\lambda}{\lambda^{n+2}(1-\lambda)^2}. \tag{7.309}
\end{aligned}
\]
Put C := sup{ ‖(1 − λ)(I − λM)^{-1}|_{X_0}‖ : |λ| < 1 }. From (7.308) we infer
\[
(n+1)\,\|M^n|_{X_0}\|\le\frac{C^2}{r^n}\,\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{d\vartheta}{|1-re^{i\vartheta}|^2}=\frac{C^2}{r^n}\,\frac{1}{1-r^2}. \tag{7.310}
\]
The choice r² = n/(n + 2) yields
\[
\|M^n|_{X_0}\|\le\frac23\,eC^2. \tag{7.311}
\]
In the same spirit from (7.309) we obtain
\[
\frac12(n+1)(n+2)\,\|M^n(M-I)|_{X_0}\|\le\bigl(C^2+C^3\bigr)\,\frac{1}{r^{n+1}}\,\frac{1}{1-r^2}.
\]
The choice r² = (n + 1)/(n + 3) yields the inequality:
\[
(n+1)\,\|M^n(M-I)|_{X_0}\|\le\frac{4e}{3}\bigl(C^2+C^3\bigr).
\]
This proves the implication (i) =⇒ (ii).
(ii) $\Longrightarrow$ (iii). The representations (see Nagy and Zemanek [165])
\[
e^{t(M-I)}=e^{-t}\sum_{k=0}^{\infty}\frac{t^k}{k!}M^k\quad\text{and}\quad t(M-I)e^{t(M-I)}=e^{-t}\sum_{k=0}^{\infty}\frac{t^{k+1}}{k!}M^k(M-I)
\]
show that (iii) is a consequence of (ii).
(iii) $\Longrightarrow$ (iv). This is a (standard) result in analytic operator semigroup theory: see e.g. Van Casteren [237], Chapter 5, Theorem 5.1.
(iv) $\Longrightarrow$ (v). The equality
\[
(I-M)\left((\lambda+1)I-M\right)^{-1}=I-\lambda\left(\lambda I-(M-I)\right)^{-1}
\]
shows the equivalence of (iv) and (v).
(v) $\Longrightarrow$ (i). Fix $x\in X_0$. The choice $\lambda=-1+e^{-i\vartheta}=-2i\sin\left(\tfrac12\vartheta\right)e^{-\frac12 i\vartheta}$, $|\vartheta|\le2\alpha$, yields the boundedness of the function
\[
\vartheta\mapsto(I-M)\left(I-e^{i\vartheta}M\right)^{-1}x
\]
on the interval $[-\alpha,\alpha]$. Since, for $|\lambda|=1$, $\lambda\ne1$, the function
\[
\lambda\mapsto(I-M)(I-\lambda M)^{-1}x
\]
is continuous, it follows that this function is bounded on the unit circle. The maximum modulus theorem shows that it is bounded on the unit disc, which is assertion (i).
(i) $\Longrightarrow$ (vi). Fix $x\in X_0$. For $0<r<1$ and $\vartheta\in\mathbb{R}$ we also have
\[
\left(I-P\left(re^{i\vartheta}\right)\right)(I-M)x
=\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1-r^2}{1-2r\cos(\vartheta-t)+r^2}\,(I-M)\left(I-e^{it}M\right)^{-1}(I-M)x\,dt
=(I-M)\left(I-re^{i\vartheta}M\right)^{-1}(I-M)x. \tag{7.312}
\]
In (7.312) we use the continuity of the boundary function
\[
e^{it}\mapsto(I-M)\left(I-e^{it}M\right)^{-1}(I-M)x \tag{7.313}
\]
to show that
\[
\lim_{re^{i\vartheta}\to1,\ 0\le r<1}\left(I-P\left(re^{i\vartheta}\right)\right)(I-M)x=(I-P)(I-M)x=(I-M)x \tag{7.314}
\]
exists, and that $I-P$ is a bounded projection on $X_0$. From (i) it follows that the function $\lambda\mapsto(I-M)(I-\lambda M)^{-1}x$ is uniformly bounded on the unit disc, and hence that the limit in (7.314) exists for all $y$ in the closure of $R(I-M)$. In addition, for such vectors $y$ we have $(I-P)y=y$. The limit in (7.314) trivially exists for $x\in X$ such that $Mx=x$. We conclude that the limit exists for all $x\in X_0$, because $x=(I-P)x+Px$, where $(I-P)x$ belongs to the closure of the range of $I-M$ and where
\[
Px=x-(I-P)x=x-\lim_{\lambda\uparrow1}(I-M)(I-\lambda M)^{-1}x=\lim_{\lambda\uparrow1}(1-\lambda)M(I-\lambda M)^{-1}x. \tag{7.315}
\]

From (7.315) it follows that $(I-M)Px=0$. In addition, from (ii), which is equivalent to (i), we see that $\lim_{n\to\infty}M^ny=0$ for all $y$ in the range of $I-M$; here we use the boundedness of the sequence $(n+1)M^n(I-M)$, $n\in\mathbb{N}$. The boundedness of the sequence $M^n$, $n\in\mathbb{N}$, then yields $\lim_{n\to\infty}M^ny=0$ for $y\in R(I-P)$, because the range of $I-M$ is dense in the range of $I-P$. An arbitrary $x\in X_0$ can be written as $x=(I-P)x+Px$. From the previous arguments it follows that $\lim_{n\to\infty}M^nx=Px$. Altogether this shows the implication (i) $\Longrightarrow$ (vi), provided we show the continuity of the function in (7.313) in the sense that $\lim_{t\to0}(I-M)\left(I-e^{it}M\right)^{-1}(I-M)x=(I-M)x$. However, this follows from the identity
\[
(I-M)\left(I-e^{it}M\right)^{-1}(I-M)x-(I-M)x=\left(e^{it}-1\right)(I-M)\left(I-e^{it}M\right)^{-1}Mx,
\]
together with the uniform boundedness (in $0<|t|\le\pi$) of the family of operators $(I-M)\left(I-e^{it}M\right)^{-1}$. In the latter we use the implication (v) $\Longrightarrow$ (i).
The implication (vi) $\Longrightarrow$ (vii) being trivial, it remains to be shown that (vii) implies (i). For this purpose we fix $x\in X_0$ and consider the continuous function on the closed unit disc defined by
\[
F(\lambda)x:=\begin{cases}(I-M)(I-\lambda M)^{-1}x&\text{for }|\lambda|\le1,\ \lambda\ne1,\\[1ex]
(I-P)x=\displaystyle\lim_{\substack{\lambda\to1\\|\lambda|<1}}(I-M)(I-\lambda M)^{-1}x&\text{for }\lambda=1.\end{cases}
\]
From (vii) it follows that the function $F(\lambda)x$ is well-defined and continuous, hence bounded. The Banach-Steinhaus theorem then implies (i).
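The Abel limit (7.315) can be observed numerically in a finite-dimensional toy case. The matrix and vector below are hypothetical choices for this sketch: $M$ has fixed space spanned by $(1,0)$ and a contractive part, and the limit recovers the component of $x$ fixed by $M$.

```python
import numpy as np

# Hypothetical example: M fixes span{(1,0)} and contracts the rest.
M = np.array([[1.0, 0.0], [0.0, 0.5]])
I = np.eye(2)
x = np.array([1.0, 1.0])

# (7.315): P x = lim_{lambda -> 1} (1 - lambda) M (I - lambda M)^{-1} x.
for lam in (0.9, 0.99, 0.9999):
    Px = (1.0 - lam) * (M @ np.linalg.solve(I - lam * M, x))
    print(lam, Px)  # approaches [1, 0], the part of x fixed by M
```

The same limit can be read off analytically here: the second coordinate equals $(1-\lambda)\cdot\tfrac12/(1-\tfrac12\lambda)$, which tends to $0$ as $\lambda\uparrow1$.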
8 Coupling methods and Sobolev type inequalities

In this chapter we begin with a discussion of a coupling method by Chen and


Wang. We want to establish a spectral gap related to solutions of stochastic
differential equations: see Theorem 8.3. In addition we want to include re-
sults which do not depend on the matrix σ(t, x) (diffusion coefficient) which
is such that a(t, x) = σ(t, x)σ(t, x)∗ . We have a Poincaré inequality in mind:
see Proposition 8.55, and Definition 8.56. Related inequalities are (tight) log-
arithmic Sobolev inequalities: see Definition 8.59, and Proposition 8.60.

8.1 Coupling methods

In this section we want to apply a coupling method to prove the following theorem, which is due to Chen and Wang: see [53], Theorem 4.13. The operator $L=(L_t)_{t\ge0}$ is of the form
\[
L_tf(x)=\frac12\sum_{i,j=1}^d a_{i,j}(t,x)\frac{\partial^2f(x)}{\partial x_i\,\partial x_j}+\sum_{i=1}^d b_i(t,x)\frac{\partial f(x)}{\partial x_i}. \tag{8.1}
\]
The matrix $a(t,x)=\left(a_{i,j}(t,x)\right)_{i,j=1}^d$ is supposed to be positive definite. The functions $x\mapsto a_{i,j}(t,x)$, $t\ge0$, belong to $C^2\left(\mathbb{R}^d\right)$, and the functions $(t,x)\mapsto a_{i,j}(t,x)$ are continuous. In addition, $b(t,x)$ is of the form
\[
b_i(t,x)=\frac12\sum_{j=1}^d\left(a_{i,j}(t,x)\frac{\partial V(t,x)}{\partial x_j}+\frac{\partial a_{i,j}(t,x)}{\partial x_j}\right). \tag{8.2}
\]
Here for every $t\ge0$ the function $x\mapsto V(t,x)$ is a member of $C^2\left(\mathbb{R}^d\right)$ and has the property that $Z(t):=\int e^{V(t,x)}\,dx<\infty$; moreover, the function $(t,x)\mapsto V(t,x)$ is continuous on $[0,\infty)\times\mathbb{R}^d$. Let $\mu_t$ be the probability measure with density $Z(t)^{-1}e^{V(t,x)}$ with respect to the $d$-dimensional Lebesgue measure.
Then $\mu_t$ is an invariant measure for $L_t$ and for the semigroup $e^{sL_t}$ generated by $L_t$, provided such a semigroup exists. Let us check this. Let $L_t^*$ be the (formal) adjoint of $L_t$. We notice
\[
2L_t^*f(x)=\sum_{i,j=1}^d\frac{\partial^2}{\partial x_i\,\partial x_j}\left(a_{i,j}(t,x)f(x)\right)-2\sum_{i=1}^d\frac{\partial}{\partial x_i}\left(b_i(t,x)f(x)\right)
\]
\[
=\sum_{i,j=1}^d a_{i,j}(t,x)\frac{\partial^2f(x)}{\partial x_i\,\partial x_j}+2\sum_{i,j=1}^d\frac{\partial a_{i,j}(t,x)}{\partial x_i}\frac{\partial f(x)}{\partial x_j}-2\sum_{i=1}^d b_i(t,x)\frac{\partial f(x)}{\partial x_i}
+\left(\sum_{i,j=1}^d\frac{\partial^2a_{i,j}(t,x)}{\partial x_i\,\partial x_j}-2\sum_{i=1}^d\frac{\partial b_i(t,x)}{\partial x_i}\right)f(x),
\]
and hence
\[
2L_t^*\left(e^{V(t,\cdot)}\right)(x)
=e^{V(t,x)}\sum_{i,j=1}^d a_{i,j}(t,x)\frac{\partial^2V(t,x)}{\partial x_i\,\partial x_j}
+e^{V(t,x)}\sum_{i,j=1}^d a_{i,j}(t,x)\frac{\partial V(t,x)}{\partial x_i}\frac{\partial V(t,x)}{\partial x_j}
\]
\[
+2e^{V(t,x)}\left(\sum_{i,j=1}^d\frac{\partial a_{i,j}(t,x)}{\partial x_i}\frac{\partial V(t,x)}{\partial x_j}-\sum_{i=1}^d b_i(t,x)\frac{\partial V(t,x)}{\partial x_i}\right)
+e^{V(t,x)}\left(\sum_{i,j=1}^d\frac{\partial^2a_{i,j}(t,x)}{\partial x_i\,\partial x_j}-2\sum_{i=1}^d\frac{\partial b_i(t,x)}{\partial x_i}\right). \tag{8.3}
\]
From (8.3) in conjunction with (8.2) we see $L_t^*\left(e^{V(t,\cdot)}\right)=0$, and consequently
\[
Z(t)\int L_tf\,d\mu_t=\int L_tf(x)\,e^{V(t,x)}\,dx=\int f(x)\,L_t^*\left(e^{V(t,\cdot)}\right)(x)\,dx=0.
\]
Note that we used the symmetry of the matrix $a(t,x)=\left(a_{i,j}(t,x)\right)_{i,j=1}^d$.
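In dimension one this invariance can be checked numerically. The sketch below uses the hypothetical, time-independent choice $a(x)=1$ and $V(x)=-x^2$, so that (8.2) gives $b(x)=-x$ and $Lf=\tfrac12f''-xf'$; the integral $\int Lf\,d\mu$ should vanish for smooth $f$.

```python
import numpy as np

# Hypothetical 1-d, time-independent data: a(x) = 1, V(x) = -x^2.
# By (8.2), b(x) = (1/2)(a V' + a') = -x, so L f = (1/2) f'' - x f'.
x = np.linspace(-8.0, 8.0, 160001)
dx = x[1] - x[0]
w = np.exp(-x**2)           # un-normalised invariant density e^{V(x)}

f = x**2                    # test observable
f1 = 2.0 * x                # f'
f2 = np.full_like(x, 2.0)   # f''
Lf = 0.5 * f2 - x * f1      # generator applied to f

Z = np.sum(w) * dx                   # normalising constant Z
integral = np.sum(Lf * w) * dx / Z   # approximates the integral of Lf against mu
print(integral)  # close to 0: mu is invariant for L
```

Any other smooth $f$ with moderate growth gives the same outcome, since $e^{V}$ kills the boundary terms in the integration by parts.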
In Theorem 8.3 below we consider the time-homogeneous case, i.e. the operator $L$ does not depend on the time $t$. It is not clear how to obtain such a result in the time-dependent case. It is assumed that the coefficients $a(x)$ and $b(x)$ are such that the martingale problem is uniquely solvable for $L$, and that the corresponding Markov process is irreducible in the sense that the transition probability measures $B\mapsto P(t,x,B)$, $B\in\mathcal{B}_{\mathbb{R}^d}$, $t>0$, $x\in\mathbb{R}^d$, are equivalent, i.e. all of them have the same null-sets. In fact this is a stronger notion than the standard notion of irreducibility. However, if all functions of the form $(t,x)\mapsto P(t,x,B)$, $B\in\mathcal{E}$, are continuous, then the two notions coincide: see Lemma 8.2 below.

Definition 8.1. A time-homogeneous Markov process with state space $E$ and probability transition function $P(t,x,\cdot)$ is called irreducible if $P(t,x,U)>0$ for all $(t,x)\in(0,\infty)\times E$ and all non-empty open subsets $U$ of $E$.

Lemma 8.2. Let $(t,x,B)\mapsto P(t,x,B)$ be a transition probability function with the property that for every $(t,B)\in(0,\infty)\times\mathcal{E}$ the function $x\mapsto P(t,x,B)$ is lower semi-continuous. Then all measures $P(t,x,\cdot)$, $(t,x)\in(0,\infty)\times E$, are equivalent if and only if $P(t,x,U)>0$ for every non-void open subset $U$ and every $(t,x)\in(0,\infty)\times E$.

Proof. First suppose that for every non-void open subset $U$ the quantity $P(t,x,U)$ is strictly positive for all pairs $(t,x)\in(0,\infty)\times E$. Let $(t_0,x_0,B)\in(0,\infty)\times E\times\mathcal{E}$ be such that $P(t_0,x_0,B)=0$. Fix $s\in(0,t_0)$. Then $0=P(t_0,x_0,B)=\int P(s,y,B)\,P(t_0-s,x_0,dy)$, and hence the function $y\mapsto P(s,y,B)$ is $P(t_0-s,x_0,\cdot)$-almost everywhere zero. Assume that there exist $y_0\in E$ and $\varepsilon>0$ such that $P(s,y_0,B)>\varepsilon>0$, and put $U_\varepsilon=\{y\in E:P(s,y,B)>\varepsilon\}$. By lower semi-continuity $U_\varepsilon$ is a non-void open subset of $E$. Moreover,
\[
0=P(t_0,x_0,B)=\int P(s,y,B)\,P(t_0-s,x_0,dy)\ge\int_{U_\varepsilon}P(s,y,B)\,P(t_0-s,x_0,dy)\ge\varepsilon P(t_0-s,x_0,U_\varepsilon)>0, \tag{8.4}
\]
where in the final step of (8.4) we used our initial hypothesis. This contradiction shows that $P(s,y,B)=0$ for all $y\in E$ whenever $P(t_0,x_0,B)=0$; consequently all measures of the form $P(t,x,\cdot)$, $(t,x)\in(0,\infty)\times E$, have the same null-sets, i.e. they are equivalent.
Next assume that all measures $P(t,x,\cdot)$, $(t,x)\in(0,\infty)\times E$, are equivalent, and assume that for some non-empty open subset $U$ of $E$ the quantity $P(t,x,U)=0$. Then, by the equivalence of the measures, we may and will assume that $x\in U$ and that we may choose $t>0$ as close to zero as we please. By the normality of the process we have $1=\lim_{t\downarrow0}P(t,x,U)=0$. Again we end up with a contradiction.
This completes the proof of Lemma 8.2.

Let (X(t), Px ) be a Markov process with the Feller property. Among other
things this implies that limt↓0 Px [X(t) ∈ U ] = 1 for all open subsets U of E,
and for all x ∈ U . If all probability measures B 7→ P (t, x, B) = Px [X(t) ∈ B],
B ∈ E, have the same null-sets, then the corresponding time-homogeneous
Markov process (with the Feller property) is irreducible in the sense of Defini-
tion 8.1. To this end, assume that there exists a non-void open subset U of E
such that P (t, x, U ) = 0. Since all measures P (t, x, ·), (t, x) ∈ (0, ∞) × E have
the same negligible sets, we may and will assume that x ∈ U and t is as close
to zero as we please. Since limt↓0 Px [X(t) ∈ U ] = 1, this leads to a contra-
diction, and hence our Markov process is irreducible, provided all transition
probability measures P (t, x, ·) have the same null-sets.
Theorem 8.3. Suppose that there exists $a>0$ such that $\langle a(x)\xi,\xi\rangle\le a|\xi|^2$ for all $x,\xi\in\mathbb{R}^d$. Let $a(x)=\sigma(x)\sigma(x)^*$ and put
\[
-\gamma=\sup_{x\ne y\in\mathbb{R}^d}\frac{\operatorname{trace}\left(\left(\sigma(x)-\sigma(y)\right)\left(\sigma(x)-\sigma(y)\right)^*\right)+2\langle b(x)-b(y),x-y\rangle}{|x-y|^2}. \tag{8.5}
\]
Then the following inequality holds for all globally Lipschitz functions $f:\mathbb{R}^d\to\mathbb{R}$, all $x\in\mathbb{R}^d$, and all $t\ge0$:
\[
e^{tL}|f|^2(x)-\left|e^{tL}f(x)\right|^2\le\frac{a\left(1-e^{-\gamma t}\right)}{\gamma}\,e^{tL}|\nabla f|^2(x). \tag{8.6}
\]
If $\gamma=0$, then $\dfrac{1-e^{-\gamma t}}{\gamma}$ is to be interpreted as $t$.
In the next corollary we write
\[
\lambda_{\min}(a)=\inf\left\{\langle a(x)\xi,\xi\rangle:(x,\xi)\in\mathbb{R}^d\times\mathbb{R}^d,\ |\xi|=1\right\}. \tag{8.7}
\]
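As a numerical illustration of (8.6), take the hypothetical one-dimensional data $\sigma=1$ (so $a=1$) and $b(x)=-x$; then (8.5) gives $\gamma=2$, and $e^{tL}$ is the Ornstein-Uhlenbeck semigroup, whose transition law is known explicitly. With $f(x)=x$ the two sides of (8.6) can be compared by Monte Carlo; for this particular $f$ they in fact coincide.

```python
import numpy as np

# Hypothetical data: d = 1, sigma = 1 (a = 1), b(x) = -x, so gamma = 2 by (8.5),
# and e^{tL} is the Ornstein-Uhlenbeck semigroup.  Take f(x) = x.
rng = np.random.default_rng(0)
t, x0, n = 0.7, 1.3, 400000

# Exact OU transition law: X(t) = e^{-t} x0 + N(0, (1 - e^{-2t})/2).
var = (1.0 - np.exp(-2.0 * t)) / 2.0
X = np.exp(-t) * x0 + np.sqrt(var) * rng.standard_normal(n)

lhs = np.mean(X**2) - np.mean(X)**2          # e^{tL}|f|^2(x0) - |e^{tL}f(x0)|^2
rhs = (1.0 - np.exp(-2.0 * t)) / 2.0 * 1.0   # (a/gamma)(1 - e^{-gamma t}) e^{tL}|f'|^2(x0)
print(lhs, rhs)  # for this f the bound (8.6) holds with equality
```

For other Lipschitz $f$ the left-hand side stays below the right-hand side, which is the content of the theorem.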

Corollary 8.4. In addition to the hypotheses in Theorem 8.3 suppose that $\gamma>0$. Then the diffusion generated by $L$ is mixing in the sense that $\lim_{t\to\infty}\int\left|e^{tL}f\right|^2d\mu=\left|\int f\,d\mu\right|^2$, and the spectral gap of $L$ satisfies
\[
\operatorname{gap}(L)\ge\gamma\,\frac{\lambda_{\min}(a)}{a}. \tag{8.8}
\]
Proof. Let $\mu$ be the invariant measure corresponding to the generator $L$. The fact that the diffusion generated by $L$ is ergodic follows from results in Chen and Wang [55]: see Theorem 8.8 below. The mixing property is a consequence of assertion (ii) in Theorem 8.8. Since
\[
e^{tL}|f|^2-\left|e^{tL}f\right|^2=\int_0^t e^{(t-s)L}\,\Gamma_1\!\left(e^{sL}f,e^{sL}f\right)ds, \tag{8.9}
\]
we see that
\[
\int|f|^2\,d\mu-\int\left|e^{tL}f\right|^2d\mu=\int_0^t\!\!\int\Gamma_1\!\left(e^{sL}f,e^{sL}f\right)d\mu\,ds, \tag{8.10}
\]
where we used the $L$-invariance of the measure $\mu$ several times. From (8.10) it follows that $\lim_{t\to\infty}\int\left|e^{tL}f\right|^2d\mu$ exists; it is the mixing property that identifies this limit with $\left|\int f\,d\mu\right|^2$. The equality in (8.9) is an immediate consequence of equality (8.136) in the proof of Theorem 8.3 below. Here we wrote
\[
\Gamma_1(f,g)=\langle a\nabla f,\nabla g\rangle=\sum_{i,j=1}^d a_{i,j}\frac{\partial f}{\partial x_i}\frac{\partial g}{\partial x_j}. \tag{8.11}
\]
By taking the limit as $t\to\infty$ in (8.10) we obtain
\[
\int|f|^2\,d\mu-\left|\int f\,d\mu\right|^2=\int_0^\infty\!\!\int\Gamma_1\!\left(e^{sL}f,e^{sL}f\right)d\mu\,ds. \tag{8.12}
\]
The result in Corollary 8.4 is a consequence of (8.6), (8.7), (8.11), and (8.12).

In the proof of Corollary 8.4 we used a result on ergodicity. The following


result can be found as Theorem 4 in [156]. It is applicable in our situation.
For its proof we refer the reader to Stettner [221] and Seidler [207]. A general
discussion about this kind of properties can be found in Maslowski and Seidler
[156]. For convenience we insert an outline of a proof. We need the following
definition: compare with property (a) in Proposition 8.11 below.
Definition 8.5. Let D be a subspace of Cb (E). It is said that D almost sepa-
rates compact and closed sets, if for every compact subset K and closed subset
F such that K ∩ F = ∅ there exist a constant α > 0 and a function u ∈ D
such that α ≤ u(x) − u(y) for all x ∈ K and all y ∈ F .

Remark 8.6. If the linear subspace $D$ contains the constant functions and is closed under taking finite maxima, then $D$ almost separates compact and closed subsets if and only if for every closed subset $F$ of $E$ and every $x\in E\setminus F$ there exists a function $u\in D$ such that $u(x)>\sup_{y\in F}u(y)$. Let $F$ be a closed subset of $E$. First suppose that $D$ almost separates compact subsets not intersecting $F$. Since the singleton $\{x\}$, $x\in E\setminus F$, is compact, there exist a function $u\in D$ and a constant $\alpha>0$ such that $\alpha\le u(x)-u(y)$ for all $y\in F$. Then $u(x)\ge\alpha+\sup_{y\in F}u(y)>\sup_{y\in F}u(y)$. Conversely, let $K$ and $F$ be a compact and a closed subset of $E$ which do not intersect. Suppose that for every $x\in K$ there exists a function $u_x\in D$ such that $u_x(x)>\sup_{y\in F}u_x(y)$. Then, by subtracting the constant $\alpha_x=\sup_{y\in F}u_x(y)$, we see that $v_x:=u_x-\alpha_x$ satisfies $v_x(x)>0\ge\sup_{y\in F}v_x(y)$. By compactness there exist finitely many functions $v_j:=v_{x_j}$, $1\le j\le N$, such that
\[
\max_{1\le j\le N}v_j(x)\ge\alpha>0\ge\sup_{y\in F}\max_{1\le j\le N}v_j(y),\quad x\in K, \tag{8.13}
\]
where $\alpha=\inf_{x\in K}\max_{1\le j\le N}v_j(x)$, which is a strictly positive real number. It follows that $0<\alpha\le\max_{1\le j\le N}v_j(x)-\max_{1\le j\le N}v_j(y)$, $x\in K$, $y\in F$.

Definition 8.7. Consider the Markov process in (8.14) below. Let the family of time-translation operators $(\vartheta_t)_{t\ge0}$ have the property that $X(s)\circ\vartheta_t=X(s+t)$ $P_x$-almost surely for all $x$, and be such that $\vartheta_{s+t}=\vartheta_s\circ\vartheta_t$ for all $s,t\in[0,\infty)$. Its tail or asymptotic $\sigma$-field $\mathcal{T}$ is defined by $\mathcal{T}=\bigcap_{t>0}\vartheta_t^{-1}\mathcal{F}$. An event $A$ belongs to $\mathcal{T}$ if and only if for every $t>0$ there exists an event $A_t\in\mathcal{F}$ such that $A=\vartheta_t^{-1}A_t$, or, what amounts to the same, $\mathbf{1}_A=\mathbf{1}_{A_t}\circ\vartheta_t$. In fact we may assume that $A_t\in\mathcal{T}$, the reason being that $A=\vartheta_{s+t}^{-1}A_{s+t}=\vartheta_t^{-1}\left(\vartheta_s^{-1}A_{s+t}\right)$, and hence for $A_t$ we may choose $A_t=\bigcap_{s>0}\vartheta_s^{-1}A_{s+t}$.

Theorem 8.8. Let
\[
\left\{(\Omega,\mathcal{F},P_x)_{x\in E},\ (X(t),\ t\ge0),\ (E,\mathcal{E})\right\} \tag{8.14}
\]
be a time-homogeneous Markov process on a Polish space $E$ with a transition probability function $P(t,x,\cdot)$, $t\ge0$, $x\in E$, which is conservative in the sense that $P(t,x,E)=1$ for all $t\ge0$ and $x\in E$. Assume that the process $X(t)$ is strong Feller in the sense that for all Borel subsets $B$ of $E$ the function $(t,x)\mapsto P(t,x,B)$ is continuous on $(0,\infty)\times E$. In addition, suppose that all measures $B\mapsto P(t,x,B)$, $B\in\mathcal{E}$, $t>0$, $x\in E$, are equivalent, and that the process has an invariant probability measure $\mu$. Finally, suppose that the domain of the generator $L$ of the Markov process almost separates compact and closed subsets. Then the following assertions are true:
(i) For every $f\in L^1(E,\mu)$ and every $x\in E$ the equality
\[
\lim_{t\to\infty}\frac1t\int_0^t f(X(s))\,ds=\int_E f\,d\mu \tag{8.15}
\]
holds $P_x$-almost surely.
(ii) For every $x\in E$ the following equality holds:
\[
\lim_{t\to\infty}\operatorname{Var}\left(P(t,x,\cdot)-\mu\right)=0. \tag{8.16}
\]
In particular, both assertions (i) and (ii) imply that the invariant measure $\mu$ is unique.
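Assertion (i) can be watched on a simulated path. The sketch below uses the hypothetical Ornstein-Uhlenbeck example ($b(x)=-x$, $\sigma=1$), whose invariant measure $\mu$ is $N(0,\tfrac12)$; the time average of $f(X(s))$ with $f(x)=x^2$ should approach $\int f\,d\mu=\tfrac12$.

```python
import numpy as np

# Hypothetical 1-d example: OU process dX = -X dt + dW, invariant law N(0, 1/2).
# Birkhoff's theorem (8.15): time averages of f(X(s)) tend to the mu-integral of f.
rng = np.random.default_rng(1)
h, n = 0.02, 500000                              # step size; total time 10000
a1 = np.exp(-h)                                  # exact one-step mean factor
sd = np.sqrt((1.0 - np.exp(-2.0 * h)) / 2.0)     # exact one-step noise scale

x, acc = 0.0, 0.0
for xi in rng.standard_normal(n):
    x = a1 * x + sd * xi    # exact sampling of the OU transition
    acc += x * x            # f(x) = x^2
time_avg = acc / n

print(time_avg)  # approaches 1/2, the mu-integral of x^2
```

The one-step recursion samples the exact transition law, so no discretisation bias enters; only the finite time horizon limits the accuracy.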

Remark 8.9. In Theorem 9.36 it will be shown that the Markov process (8.14) admits a $\sigma$-finite invariant measure provided that this process satisfies the conditions of Theorem 8.8, is recurrent, and the vector sum $R(L)+N(L)$ is dense in $C_b(E)$ for the strict topology. In addition, in Corollary 9.43 a condition will be formulated which implies that this invariant measure is in fact finite, and hence may be taken to be a probability measure.

The equality in (8.15) is known as the strong law of large numbers, or as Birkhoff's pointwise ergodic theorem. In (8.16) $\operatorname{Var}(\nu)$ stands for the variation norm of the measure $\nu$. The property in (ii) is stronger than the weak and strong mixing properties. If the process in (8.14) has property (ii), then it is said to be ergodic. There exist stronger notions of ergodicity: see e.g. [52]. The property in (ii) is closely related to the fact that in the present situation the tail $\sigma$-field is trivial. Mixing properties are heavily used in ergodic theory: see e.g. [162]. Suppose that there exist a (reference) measure $m$ on $E$ and a measurable function $(t,x,y)\mapsto p(t,x,y)$, $(t,x,y)\in(0,\infty)\times E\times E$, which is strictly positive and such that for every $(t,x,B)\in(0,\infty)\times E\times\mathcal{E}$ the equality $P(t,x,B)=\int_B p(t,x,y)\,dm(y)$ holds. Then $P(t,x,B)=0$ if and only if $m(B)=0$, and so all measures $P(t,x,\cdot)$ have the same null-sets. For a proof of Theorem 8.8 the reader is referred to the cited literature. We will also include a proof, which is based on work by Seidler [207].

Lemma 8.10 below says that property (ii) in Theorem 8.8 is stronger than the strong mixing property, which can be phrased as follows: for every $f$ and $g\in L^2(E,\mu)$ we have
\[
\lim_{t\to\infty}E_\mu\left[f(X(t))\,g(X(0))\right]=\lim_{t\to\infty}\int_E e^{tL}f(x)\,g(x)\,d\mu(x)=\int_E f\,d\mu\int_E g\,d\mu. \tag{8.17}
\]
Here $E_\mu[F]=\int_E E_x[F]\,d\mu(x)$, $F\in L^\infty(\Omega,\mathcal{F})$. Notice that by the Cauchy-Schwarz inequality and by the $L$-invariance of the probability measure $\mu$ we have
\[
\left(\int\left|e^{tL}f(x)\,g(x)\right|d\mu(x)\right)^2\le\int\left|e^{tL}f(x)\right|^2d\mu(x)\cdot\int|g(x)|^2\,d\mu(x)
\le\int e^{tL}|f|^2\,d\mu\cdot\int|g|^2\,d\mu=\int|f|^2\,d\mu\cdot\int|g|^2\,d\mu<\infty
\]
whenever $f$, $g\in L^2(E,\mu)$.


Lemma 8.10. Suppose that $\mu$ is an $L$-invariant probability measure which, for each $x\in E$, satisfies (8.16) in Theorem 8.8. Then
\[
\lim_{t\to\infty}e^{tL}f(x)\,e^{tL}g(x)=\int f\,d\mu\cdot\int g\,d\mu\quad\text{and}\quad
\lim_{t\to\infty}\int e^{tL}f(x)\cdot e^{tL}g(x)\,d\mu(x)=\int f(x)\,d\mu(x)\int g(x)\,d\mu(x) \tag{8.18}
\]
for all $f$ and $g\in C_b(E)$.

Proof. Let the functions $f$ and $g$ belong to $C_b(E)$. The second equality in (8.18) is a consequence of the first one and Lebesgue's dominated convergence theorem. The first equality is a consequence of (8.16) in Theorem 8.8 together with the following estimate:
\[
\left|e^{tL}f(x)\cdot e^{tL}g(x)-\int f\,d\mu\cdot\int g(y)\,d\mu(y)\right|
\le\left|e^{tL}f(x)-\int f(y)\,d\mu(y)\right|\cdot\left|e^{tL}g(x)\right|
+\left|\int f(y)\,d\mu(y)\right|\cdot\left|e^{tL}g(x)-\int g(y)\,d\mu(y)\right|
\]
\[
=\left|\int f(y)\,P(t,x,dy)-\int f(y)\,d\mu(y)\right|\cdot\left|e^{tL}g(x)\right|
+\left|\int f(y)\,d\mu(y)\right|\cdot\left|\int g(y)\,P(t,x,dy)-\int g(y)\,d\mu(y)\right|
\le2\,\|f\|_\infty\|g\|_\infty\operatorname{Var}\left(P(t,x,\cdot)-\mu\right). \tag{8.19}
\]
The right-hand side of (8.19) together with (8.16) completes the proof of Lemma 8.10.
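The convergence (8.16) can be made concrete in the hypothetical Ornstein-Uhlenbeck example, where $P(t,x,\cdot)=N\!\left(e^{-t}x,\ \tfrac12(1-e^{-2t})\right)$ and $\mu=N(0,\tfrac12)$ are explicit, so the variation norm can be computed on a grid.

```python
import numpy as np

# Hypothetical OU example: explicit transition densities, mu = N(0, 1/2).
grid = np.linspace(-6.0, 6.0, 20001)
dx = grid[1] - grid[0]

def gauss(m, v):
    # density of N(m, v) evaluated on the grid
    return np.exp(-(grid - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

x0 = 2.0
mu = gauss(0.0, 0.5)
# variation norm of P(t, x0, .) - mu for increasing t
tv = [np.sum(np.abs(gauss(np.exp(-t) * x0,
                          (1.0 - np.exp(-2.0 * t)) / 2.0) - mu)) * dx
      for t in (0.5, 1.0, 2.0, 4.0)]
print(tv)  # decreases towards 0, as in (8.16)
```

The decay is exponential here, reflecting the spectral gap of the Ornstein-Uhlenbeck generator.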
In case the Markov process in Theorem 8.8 originates from a Feller-Dynkin semigroup with a locally compact state space, the following proposition is automatically true. In case we are dealing with a Polish state space, we need the extra condition that the domain of the generator has the property described in (a) of Proposition 8.11 below. This property says that, up to any $\varepsilon>0$, the domain of $L$ separates disjoint compact and closed sets. In the locally compact setting, for a strong Markov process originating from a Feller-Dynkin semigroup, it is only required that $C_0(E)$ has this property, which is automatically the case. Since by assumption $P(t,x,E)=1$ there is no need to consider $E^\triangle$: see the final part of assertion (a) of Theorem 1.39.
Proposition 8.11. Let $K$ be a compact subset of $E$ and let $U$ be an open subset of $E$ such that $K\subset U$. Let $\tau_{U^c}$ be the hitting time of $E\setminus U$: $\tau_{U^c}=\inf\{s>0:X(s)\in E\setminus U\}$. Assume that the generator $L$ has the following separation property:
(a) For every $x\in K$ there exists a function $u_x\in D(L)$ such that $u_x(x)>\sup_{y\in U^c}u_x(y)$.
Then
\[
\lim_{t\downarrow0}\sup_{x\in K}P_x\left[\tau_{U^c}\le t\right]=0. \tag{8.20}
\]

In Proposition 8.13 below we will give an alternative formulation of (8.20).

Proof. Since $K$ is compact, and since the domain of $L$ contains the constant functions, there exist finitely many functions $u_j\in D(L)$, $1\le j\le N$, and a constant $\alpha>0$ such that
\[
0<\alpha\le\inf_{x\in K}\max_{1\le j\le N}u_j(x)-\sup_{y\in U^c}\max_{1\le j\le N}u_j(y). \tag{8.21}
\]
To see this the reader is referred to the arguments leading to (8.13). Choose the constant $\alpha>0$ and the functions $u_j\in D(L)$ satisfying (8.21). Then for $x\in K$ and $1\le j\le N$ we have
\[
\left(u_j(x)-\sup_{y\in U^c}u_j(y)\right)P_x\left[\tau_{U^c}\le t\right]
\le E_x\left[u_j(X(0))-u_j\left(X\left(\tau_{U^c}\right)\right),\ \tau_{U^c}\le t\right]
\]
\[
=E_x\left[u_j(X(t))-u_j\left(X\left(\tau_{U^c}\right)\right),\ \tau_{U^c}\le t\right]
+E_x\left[u_j(X(0))-u_j(X(t)),\ \tau_{U^c}\le t\right] \tag{8.22}
\]
\[
=E_x\left[u_j(X(t))-u_j\left(X\left(\tau_{U^c}\wedge t\right)\right)-\int_{\tau_{U^c}\wedge t}^tLu_j(X(s))\,ds,\ \tau_{U^c}\wedge t<t\right]
+E_x\left[\int_{\tau_{U^c}}^tLu_j(X(s))\,ds,\ \tau_{U^c}\le t\right]
+E_x\left[u_j(X(0))-u_j(X(t)),\ \tau_{U^c}\le t\right]
\]
\[
=E_x\left[E_x\left[u_j(X(t))-u_j\left(X\left(\tau_{U^c}\wedge t\right)\right)-\int_{\tau_{U^c}\wedge t}^tLu_j(X(s))\,ds\ \Big|\ \mathcal{F}_{\tau_{U^c}\wedge t}\right],\ \tau_{U^c}\wedge t<t\right]
+E_x\left[\int_{\tau_{U^c}}^tLu_j(X(s))\,ds,\ \tau_{U^c}\le t\right]
+E_x\left[u_j(X(0))-u_j(X(t)),\ \tau_{U^c}\le t\right]
\]
(Doob's optional sampling theorem)
\[
=E_x\left[\int_{\tau_{U^c}}^tLu_j(X(s))\,ds,\ \tau_{U^c}\le t\right]
+E_x\left[u_j(X(0))-u_j(X(t)),\ \tau_{U^c}\le t\right]
\le t\sup_{y\in E}Lu_j(y)+E_x\left[\left|u_j(X(0))-u_j(X(t))\right|\right]. \tag{8.23}
\]
The choice of $\alpha>0$ together with (8.23) shows
\[
\alpha\sup_{x\in K}P_x\left[\tau_{U^c}\le t\right]
\le t\max_{1\le j\le N}\sup_{y\in E}Lu_j(y)+\max_{1\le j\le N}\sup_{x\in K}E_x\left[\left|u_j(X(0))-u_j(X(t))\right|\right]. \tag{8.24}
\]
We also notice the inequalities ($1\le j\le N$):
\[
\left(E_x\left[\left|u_j(X(0))-u_j(X(t))\right|\right]\right)^2
\le E_x\left[\left|u_j(X(0))-u_j(X(t))\right|^2\right]
=2u_j(x)\left(u_j(x)-E_x\left[u_j(X(t))\right]\right)+E_x\left[u_j(X(t))^2\right]-u_j(x)^2
=2u_j(x)\left(u_j(x)-e^{tL}u_j(x)\right)+e^{tL}\left|u_j\right|^2(x)-u_j(x)^2. \tag{8.25}
\]
Since the semigroup $\left\{e^{tL}:t\ge0\right\}$ is $T_\beta$-continuous, from (8.25) we infer that
\[
\lim_{t\downarrow0}\sup_{x\in K}E_x\left[\left|u_j(X(0))-u_j(X(t))\right|\right]=0,\quad1\le j\le N. \tag{8.26}
\]
From (8.24) and (8.26) it follows that
\[
\lim_{t\downarrow0}\sup_{x\in K}P_x\left[\tau_{U^c}\le t\right]=0, \tag{8.27}
\]
which is (8.20). This concludes the proof of Proposition 8.11.
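The small-time behaviour in (8.20) can be seen in simulation for a hypothetical example: Brownian motion started at the centre of $U=(-1,1)$, with $K=\{0\}$. The discretised supremum below slightly underestimates the true exit probability, but the decay as $t\downarrow0$ is clearly visible.

```python
import numpy as np

# Hypothetical example: Brownian motion, K = {0}, U = (-1, 1).
# Estimate P_0[tau_{U^c} <= t] for decreasing t by direct path simulation.
rng = np.random.default_rng(2)
n, h = 5000, 1e-3            # number of paths and time step

probs = []
for t in (0.5, 0.1, 0.02):
    steps = int(round(t / h))
    # Brownian increments, cumulated into paths of length `steps`
    W = np.cumsum(np.sqrt(h) * rng.standard_normal((n, steps)), axis=1)
    probs.append(float(np.mean(np.abs(W).max(axis=1) >= 1.0)))
print(probs)  # decreases towards 0 as t decreases
```

By the reflection principle the exact values decay like $e^{-1/(2t)}$ up to polynomial factors, which is why the last probability is essentially zero.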

Remark 8.12. Suppose that in Proposition 8.11 the state space $E$ is second countable and locally compact. In this case there exists a function $u\in C_0(E)$ such that $\mathbf{1}_K\le u\le\mathbf{1}_U$; note that $u=0$ on $U^c$. Then we use the time-homogeneous strong Markov property to rewrite (8.22) as follows (for $x\in K$):
\[
P_x\left[\tau_{U^c}\le t\right]
=E_x\left[u(X(t)),\ \tau_{U^c}\le t\right]+E_x\left[1-u(X(t)),\ \tau_{U^c}\le t\right]
\]
\[
=E_x\left[u(X(t))-u\left(X\left(\tau_{U^c}\right)\right),\ \tau_{U^c}\le t\right]
+E_x\left[u\left(X\left(\tau_{U^c}\right)\right),\ \tau_{U^c}\le t\right]
+E_x\left[1-u(X(t)),\ \tau_{U^c}\le t\right]
\]
\[
=E_x\left[E_{X\left(\tau_{U^c}\right)}\left[u\left(X\left(t-\tau_{U^c}\right)\right)-u(X(0))\right],\ \tau_{U^c}\le t\right]
+E_x\left[1-u(X(t)),\ \tau_{U^c}\le t\right]
\]
\[
\le\sup_{y\notin U}\sup_{s\in[0,t]}E_y\left[u(X(s))-u(X(0))\right]+E_x\left[\left|u(X(0))-u(X(t))\right|\right]
\le\sup_{s\in[0,t]}\sup_{y\in E}\left(e^{sL}u(y)-u(y)\right)+e^{tL}\left|u(x)-u\right|(x). \tag{8.28}
\]
Since $e^{tL}u-u$ converges to zero uniformly on $E$ as $t\downarrow0$, the proof of Proposition 8.11 can be finished as in the non-locally compact case.
Remark 9.13 shows that assertion (i) in Proposition 8.13 automatically holds when the state space $E$ is second countable and locally compact.
Proposition 8.13. Let $d$ be a metric on $E$ which is compatible with its Polish topology. Then the following assertions are equivalent:
(i) For every compact subset $K$ and every open subset $U$ of $E$ such that $K\subset U$ the equality in (8.20) holds, i.e. $\lim_{t\downarrow0}\sup_{x\in K}P_x\left[\tau_{U^c}\le t\right]=0$, where $\tau_{U^c}$ stands for the first hitting time of the complement of $U$, which is also called the first exit time from $U$.
(ii) For every compact subset $K$ of $E$ and every $\eta>0$ the following equality holds:
\[
\lim_{t\downarrow0}\sup_{x\in K}P_x\left[\sup_{0<s\le t}d(X(s),x)\ge\eta\right]=0. \tag{8.29}
\]
(iii) For every compact subset $K$ and every open subset $U$ of $E$ such that $K\subset U$, and for every sequence $(t_n)_{n\in\mathbb{N}}\subset(0,\infty)$ which decreases to 0, there exists a sequence of open subsets $(U_n)_{n\in\mathbb{N}}$ such that $U_n\supset K$, $n\in\mathbb{N}$, with the property that
\[
\lim_{n\to\infty}\sup_{x\in U_n}P_x\left[\tau_{U^c}\le t_n\right]=0. \tag{8.30}
\]
(iv) For every compact subset $K$ of $E$, every $\eta>0$, and every sequence $(t_n)_{n\in\mathbb{N}}\subset(0,\infty)$ which decreases to 0, there exists a sequence of open subsets $(U_n)_{n\in\mathbb{N}}$ such that $U_n\supset K$, $n\in\mathbb{N}$, with the property that
\[
\lim_{n\to\infty}\sup_{x\in U_n}P_x\left[\sup_{0<s\le t_n}d(X(s),x)\ge\eta\right]=0. \tag{8.31}
\]

The result in Proposition 8.13 resembles the result in Proposition 3.23: see the
proof of Lemma 3.22. In [207] Seidler employs assumption (8.29) to a great
extent. Assertions (iii) and (iv) of Proposition 8.13 show that hypothesis (A5)
is in fact a consequence of (A4) in Seidler [207]. Proposition 2.4 in [207] then
shows that there exists a compact recurrent subset whenever there exists a
point x0 ∈ E with the property that every open subset containing x0 is
recurrent.

Proof. As already indicated, the proof is in the spirit of the proof of Lemma 3.22. Also note that, since $U_n\supset K$, the implications (iii) $\Longrightarrow$ (i) and (iv) $\Longrightarrow$ (ii) are trivially true.

(i) $\Longrightarrow$ (ii). Let $K$ be a compact subset of $E$ and fix $\eta>0$. By compactness there exist finitely many points $x_1,\dots,x_n\in K$ such that the balls $B\left(x_i,\tfrac14\eta\right)=\left\{y\in E:d(y,x_i)<\tfrac14\eta\right\}$ cover $K$. Put $K_i=K\cap\overline{B\left(x_i,\tfrac14\eta\right)}$ and $U_i=B\left(x_i,\tfrac12\eta\right)$, so that each $K_i$ is compact and $K_i\subset U_i$. For $x\in K_i$ we have $U_i\subset\{y\in E:d(y,x)<\eta\}$, and hence the event $\left\{\sup_{0<s\le t}d(X(s),x)\ge\eta\right\}$ is contained in $\left\{\tau_{U_i^c}\le t\right\}$. Consequently,
\[
\sup_{x\in K}P_x\left[\sup_{0<s\le t}d(X(s),x)\ge\eta\right]\le\max_{1\le i\le n}\sup_{x\in K_i}P_x\left[\tau_{U_i^c}\le t\right], \tag{8.32}
\]
and by assertion (i), applied to the pairs $K_i\subset U_i$, the right-hand side of (8.32) tends to 0 as $t\downarrow0$. This proves (ii).

(ii) $\Longrightarrow$ (i). Let the compact subset $K$ and the open subset $U$ of $E$ be such that $K\subset U$. By compactness there exist points $x_1,\dots,x_n$ in $K$ and strictly positive numbers $\eta_1,\dots,\eta_n$ such that
\[
K\subset\bigcup_{j=1}^n\left\{y\in E:d(y,x_j)<\eta_j\right\}\subset\bigcup_{j=1}^n\left\{y\in E:d(y,x_j)<2\eta_j\right\}\subset U. \tag{8.33}
\]
Put $V=\bigcup_{j=1}^n\left\{y\in E:d(y,x_j)<2\eta_j\right\}$. Then
\[
U^c\subset V^c=\bigcap_{j=1}^n\left\{y\in E:d(y,x_j)\ge2\eta_j\right\}. \tag{8.34}
\]
Let $y\in V^c$ and $x\in K$ be arbitrary. By (8.33) there exists $j_x\in\{1,\dots,n\}$ such that $d(x,x_{j_x})<\eta_{j_x}$. It follows that
\[
2\eta_{j_x}\le d(y,x_{j_x})\le d(y,x)+d(x,x_{j_x})<d(y,x)+\eta_{j_x},
\]
and hence $d(y,x)>\eta_{j_x}$. Put $\eta=\min_{1\le j\le n}\eta_j$. Consequently, from (8.34) we infer $U^c\subset\bigcap_{x\in K}\{y\in E:d(y,x)>\eta\}$, and hence for all $x\in K$ the event $\{\tau_{U^c}<t\}$ is contained in $\left\{\sup_{0<s<t}d(X(s),x)>\eta\right\}$. Putting these observations together shows
\[
\sup_{x\in K}P_x\left[\tau_{U^c}<t\right]\le\sup_{x\in K}P_x\left[\sup_{0<s<t}d(X(s),x)>\eta\right], \tag{8.35}
\]
and hence by (8.35) assertion (i) follows from (ii).

Fix $\eta>0$, and put $U_n=\left\{x\in E:d(x,K)<2^{-n}\eta\right\}$. In the proofs of the implications (i) $\Longrightarrow$ (iii) and (ii) $\Longrightarrow$ (iv) we take this sequence $(U_n)_{n\in\mathbb{N}}$.

(i) $\Longrightarrow$ (iii). Let $(t_n)_{n\in\mathbb{N}}$ be a sequence which decreases to 0. Since $K$ is a compact subset of $U$, it follows that $U_n\subset U$ for $n$ sufficiently large. Assume that the limit in (8.30) does not vanish. Then there exist $\delta>0$ and a subsequence $(t_{n_k})_{k\in\mathbb{N}}$ together with a sequence $(x_k)_{k\in\mathbb{N}}$, $x_k\in U_{n_k}$, such that
\[
P_{x_k}\left[\tau_{U^c}\le t_{n_k}\right]>\delta. \tag{8.36}
\]
Since $x_k\in U_{n_k}$ there exists $x_k'\in K$ such that $d\left(x_k,x_k'\right)<2^{-n_k}\eta$. By compactness of $K$ (and metrizability) there exists a subsequence $\left(x_{k_\ell}'\right)_{\ell\in\mathbb{N}}$ which converges to some $x_0\in K$. Then by the triangle inequality
\[
d\left(x_{k_\ell},x_0\right)\le d\left(x_{k_\ell},x_{k_\ell}'\right)+d\left(x_{k_\ell}',x_0\right)\le2^{-n_{k_\ell}}\eta+d\left(x_{k_\ell}',x_0\right). \tag{8.37}
\]
From (8.37) it follows that the set $K':=\{x_0\}\cup\left\{x_{k_\ell}:\ell\in\mathbb{N}\right\}$ is compact. From (8.36) we see that
\[
\delta\le\sup_{x\in K'}P_x\left[\tau_{U^c}\le t_{n_{k_\ell}}\right]. \tag{8.38}
\]
From assertion (i) it follows that the right-hand side of (8.38) converges to 0 as $\ell\to\infty$. Since the latter is a contradiction, we see that assertion (iii) is a consequence of (i).
The proof of the implication (ii) $\Longrightarrow$ (iv) follows the same lines; the details are left to the reader.
This completes the proof of Proposition 8.13.

Proposition 8.14. Let the notation and hypotheses be as in Proposition 8.11. In particular $\tau_{U^c}$ is the first hitting time of the complement of the open set $U$, and $K$ is a compact subset of $U$. Let $g\in L^\infty\left([0,\infty)\times E,\ \mathcal{B}_{[0,\infty)}\otimes\mathcal{E}\right)$ and $t>0$ be fixed. Then the following assertions are true:
(i) The following functions are continuous on $U$: $x\mapsto E_x\left[g(t,X(t)),\ \tau_{U^c}>t\right]$ and $x\mapsto E_x\left[g\left(\tau_{U^c},X\left(\tau_{U^c}\right)\right)\right]$.
(ii) Let $K$ be a compact subset of $U$. Then the family of measures
\[
\left\{B\mapsto P_x\left[\left(\tau_{U^c},X\left(\tau_{U^c}\right)\right)\in B\right]:x\in K\right\}
\]
is tight. Here $B$ varies over the Borel subsets of $[0,\infty)\times E$.
(iii) The function $x\mapsto P_x\left[\tau_{U^c}<\infty\right]$ is lower semi-continuous.
In assertion (iii) the subset $U$ may be an arbitrary Borel subset. In the proof we use the fact that $s+\tau_{U^c}\circ\vartheta_s$ decreases to $\tau_{U^c}$ $P_x$-almost surely as $s$ decreases to 0.
Remark 8.15. A proof similar to the proof of (i) shows that the function x 7→
Px [τU c = ∞] is continuous on U as well.

Proof. For brevity we write $\tau=\tau_{U^c}$. Let $s\in(0,t)$ be arbitrary (small) and $x\in K$, where $K$ is a fixed compact subset of $U$.
(i) We have
\[
E_x\left[E_{X(s)}\left[g(t,X(t-s)),\ \tau>t-s\right]\right]-E_x\left[g(t,X(t)),\ \tau>t\right]
=E_x\left[g(t,X(t-s))\circ\vartheta_s,\ \tau\circ\vartheta_s>t-s\right]-E_x\left[g(t,X(t)),\ \tau>t\right]
\]
\[
=E_x\left[g(t,X(t-s))\circ\vartheta_s,\ \tau\circ\vartheta_s>t-s,\ \tau>s\right]
+E_x\left[g(t,X(t-s))\circ\vartheta_s,\ \tau\circ\vartheta_s>t-s,\ \tau\le s\right]
-E_x\left[g(t,X(t)),\ \tau>t\right]
\]
(on the event $\{\tau>s\}$ the equality $s+\tau\circ\vartheta_s=\tau$ holds $P_x$-almost surely)
\[
=E_x\left[g(t,X(t-s))\circ\vartheta_s,\ \tau\circ\vartheta_s>t-s,\ \tau\le s\right]. \tag{8.39}
\]
From (8.39) we infer
\[
\left|E_x\left[E_{X(s)}\left[g(t,X(t-s)),\ \tau>t-s\right]\right]-E_x\left[g(t,X(t)),\ \tau>t\right]\right|
\le\left\|g(t,\cdot)\right\|_\infty P_x\left[\tau\le s\right]. \tag{8.40}
\]
By Proposition 8.11 the right-hand side of (8.40) converges to zero as $s\downarrow0$, uniformly on compact subsets of $U$. Since, by the strong Feller property, the functions $x\mapsto E_x\left[E_{X(s)}\left[g(t,X(t-s)),\ \tau>t-s\right]\right]$, $s\in(0,t)$, are continuous, we infer that the function $x\mapsto E_x\left[g(t,X(t)),\ \tau>t\right]$ is continuous as well.
Let $h\in L^\infty(E,\mathcal{E})$. We will use the continuity on $U$ of functions of the form $x\mapsto E_x\left[h(X(t)),\ \tau>t\right]$ in the proof of the continuity on $U$ of the function $x\mapsto E_x\left[g\left(\tau_{U^c},X\left(\tau_{U^c}\right)\right)\right]$. Let $x\in K$. We consider the following difference:
\[
E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right]\right]
\]
\[
=E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]
-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right],\ \tau>s\right]
-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right],\ \tau\le s\right]
\]
(Markov property)
\[
=E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]
-E_x\left[g(s+\tau,X(\tau))\circ\vartheta_s,\ \tau\circ\vartheta_s<\infty,\ \tau>s\right]
-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right],\ \tau\le s\right]
\]
\[
=E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]
-E_x\left[g\left(s+\tau\circ\vartheta_s,X\left(s+\tau\circ\vartheta_s\right)\right),\ s+\tau\circ\vartheta_s<\infty,\ \tau>s\right]
-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right],\ \tau\le s\right]
\]
(on the event $\{\tau>s\}$ the equality $s+\tau\circ\vartheta_s=\tau$ holds $P_x$-almost surely)
\[
=E_x\left[g(\tau,X(\tau)),\ \tau\le s\right]
-E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right],\ \tau\le s\right]. \tag{8.41}
\]
By the strong Feller property the functions
\[
x\mapsto E_x\left[E_{X(s)}\left[g(s+\tau,X(\tau)),\ \tau<\infty\right]\right],\quad s>0,
\]
are continuous. From (8.20) in Proposition 8.11 together with (8.41) we see that, uniformly on the compact subset $K$, these functions converge to $x\mapsto E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]$ as $s\downarrow0$. Consequently, since $K$ is an arbitrary compact subset of $U$, the function $x\mapsto E_x\left[g(\tau,X(\tau)),\ \tau<\infty\right]$ is continuous on $U$.
(ii) In order to prove that the family of $P_x$-distributions, $x\in K$, of the space-time variable $(\tau,X(\tau))$ is tight, by assertion (a) of Theorem 1.8 it suffices to prove that for every sequence of bounded continuous functions $(f_n:n\in\mathbb{N})\subset C_b([0,\infty)\times E)$ which decreases pointwise to zero we have
\[
\lim_{n\to\infty}\sup_{x\in K}E_x\left[f_n(\tau,X(\tau)),\ \tau<\infty\right]=0. \tag{8.42}
\]
By Dini's lemma and by assertion (i), the equality in (8.42) follows from the pointwise equality
\[
\lim_{n\to\infty}E_x\left[f_n(\tau,X(\tau)),\ \tau<\infty\right]=0. \tag{8.43}
\]
The equality in (8.43) follows from Lebesgue's dominated convergence theorem.
(iii) By the Markov property we have the equalities
\[
P_x\left[\tau<\infty\right]=\sup_{s>0}P_x\left[s+\tau\circ\vartheta_s<\infty\right]=\sup_{s>0}E_x\left[P_{X(s)}\left[\tau<\infty\right]\right]. \tag{8.44}
\]
Functions of the form $x\mapsto E_x\left[g(X(s))\right]$, where $g$ is a bounded Borel function, are continuous, and hence by (8.44) the function $x\mapsto P_x\left[\tau<\infty\right]$ is lower semi-continuous. The same argument works in case $\tau$ is the hitting time of a Borel subset of $E$.
This completes the proof of Proposition 8.14.

Under the hypotheses of the equivalent properties in Proposition 8.13 it will be shown that there exists a compact recurrent subset, provided all open subsets are recurrent. More precisely, we have the following result.

Proposition 8.16. Suppose that there exists a point $x_0\in E$ such that every open neighborhood of $x_0$ is recurrent, and suppose that the equivalent properties in Proposition 8.13 are satisfied. In addition, suppose that all probability measures $B\mapsto P(t,x,B)$, $(t,x)\in(0,\infty)\times E$, are equivalent. Then there exists a compact recurrent subset. In fact, the following assertion is true. Fix $t_0>0$, and let $K$ be a compact subset of $E$ with the property that $P(t_0,x_0,K)>0$ and $x_0\notin K$. Then $K$ is recurrent.
8.1 Coupling methods 397

Proof. Let x0 be as in Proposition 8.16. Fix t0 > 0, and let K be a compact
subset of E with the property that P (t0 , x0 , K) > 0, and x0 ∈/ K. By inner
regularity of the measure B 7→ P (t0 , x0 , B) such a compact subset K exists. We
shall prove that K is recurrent. Let τK be the first hitting time of K and let
(U` )`∈N be a sequence of open neighborhoods of x0 with respective first hitting
times τ (`) , ` ∈ N. We suppose that this sequence forms a neighborhood base of
x0 , and that U `+1 ⊂ U` , where U `+1 stands for the closure of U`+1 . We assume
that U` ∩ K = ∅. For every ` ∈ N we define the following sequence of stopping
times: τ1(`) = τ (`) , and

    τn+1(`) = inf { s > τn(`) + 2t0 : X(s) ∈ U` } .                  (8.45)

Since the open subset U` is recurrent, the hitting times τn(`) are finite Px -
almost surely for all x ∈ E, and for all n ∈ N. As in the proof of Lemma 8.22
below we introduce the following sequence of events:

    A`n = { τn(`) ≤ τn(`) + τK ◦ ϑτn(`) ≤ τn(`) + t0 } = { τK ◦ ϑτn(`) ≤ t0 } ,   (8.46)

`, n ∈ N. Then A`n ∈ Fτn+1(`) , and we have

    Px [ A`n | Fτn(`) ] = Ex [ PX(τn(`)) [τK ≤ t0 ] ] ≥ inf y∈U ` Py [τK ≤ t0 ] .   (8.47)

By assertion (i) in Proposition 8.14 we see that the function y 7→ Py [τK ≤ t0 ] =
1 − Py [τK > t0 ] is continuous at y = x0 . From (8.47) it then follows that

    Px [ A`n | Fτn(`) ] ≥ (1/2) Px0 [τK ≤ t0 ] ≥ (1/2) Px0 [X (t0 ) ∈ K] = (1/2) P (t0 , x0 , K) > 0   (8.48)
for ` ≥ `0 . From the generalized Borel-Cantelli lemma (or the Borel-Cantelli-
Lévy lemma) it then follows that Px [ Σ∞n=1 1A`n = ∞ ] = 1, ` ≥ `0 , and
hence the compact subset K is recurrent. For a precise formulation of the
Borel-Cantelli-Lévy lemma the reader is referred to Shiryayev [212], Corollary
2, page 486.
This completes the proof of Proposition 8.16.

The following result shows that Proposition 8.16 also holds for Markov chains.
Proposition 8.17. Let

    {(Ω, F, Px ) , (X(n), n ∈ N) , (ϑn , n ∈ N) , (E, E)}            (8.49)

be a Markov chain with the property that all Borel measures B 7→ P (1, x, B) =
Px [X(1) ∈ B], x ∈ E, are equivalent. In addition suppose that for every Borel
subset B the function x 7→ P (1, x, B) is continuous. Let there exist a point
398 8 Coupling and Sobolev inequalities

x0 ∈ E such that every open neighborhood of x0 is recurrent. Then there exists
a compact recurrent subset. In fact, the following assertion is true. Let K be
a compact subset of E \ {x0 } with the property that P (1, x0 , K) > 0, and
x0 ∈/ K. Then K is recurrent, i.e. Px [τK1 < ∞] = 1 for all x ∈ E.
Here τK1 = inf {k ≥ 1 : k ∈ N, X(k) ∈ K}.
Proof. The proof can be copied from the proof of Proposition 8.16 with t0 = 1
and τK replaced by τK1 . A similar convention is used for the hitting times of
the open neighborhoods U` of x0 . Also notice that Px [τK1 ≤ 1] = P (1, x, K),
and that the function x 7→ P (1, x, K) is continuous.
These arguments suffice to complete the proof of Proposition 8.17.
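The mechanism of Proposition 8.17 is easy to visualize for a chain on a finite state space. The following sketch (illustrative only; the transition matrix and the set K are made up and not taken from the book) iterates the kernel restricted to E \ K, which yields the tails Px [τK1 > n]; when every row puts mass at least q on K these tails are dominated by (1 − q)n, so K is recurrent.

```python
# Hypothetical 3-state chain; every row places mass >= 0.2 on K = {2}.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
]
K = {2}

def tail_of_hitting_time(P, K, n):
    """Return the list x -> P_x[tau_K^1 > n], obtained by iterating the
    kernel restricted to the complement of K (a 'taboo' kernel)."""
    states = range(len(P))
    q = [1.0] * len(P)                      # P_y[tau_K^1 > 0] = 1
    for _ in range(n):
        q = [sum(P[x][y] * q[y] for y in states if y not in K)
             for x in states]
    return q

tails = tail_of_hitting_time(P, K, 50)
q_min = min(P[x][y] for x in range(len(P)) for y in K)
print(max(tails), (1.0 - q_min) ** 50)      # tail is dominated by (1-q)^n
```

The same computation, applied to a neighborhood base of x0 in place of K, mirrors the role played by the hitting times τ (`) in the proof above.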
Next we collect some of the results proved so far. The existence of a compact
recurrent subset will also be used when we prove the existence of a σ-finite
invariant Radon measure: see Theorem 9.36.
Theorem 8.18. As in Theorem 8.8 let

    {(Ω, F, Px )x∈E , (X(t), t ≥ 0) , (E, E)}                        (8.50)

be a time-homogeneous Markov process on a Polish space E with a transition
probability function P (t, x, ·), t ≥ 0, x ∈ E, which is conservative in the sense
that P (t, x, E) = 1 for all t ≥ 0 and x ∈ E. Assume that the process X(t)
is strong Feller in the sense that for all Borel subsets B of E the function
(t, x) 7→ P (t, x, B) is continuous on (0, ∞) × E. In addition, suppose that all
measures B 7→ P (t, x, B), B ∈ E, t > 0, x ∈ E, are equivalent. Suppose that
there exists x0 ∈ E with the property that all open neighborhoods of x0 are
recurrent. In addition assume that the generator L of the process almost sepa-
rates points and closed subsets, in the sense that for every x ∈ U with U open
there exists a function v ∈ D(L) such that v(x) > supy∈E\U v(y). Then there
exists a compact subset A which is recurrent. Moreover, every Borel subset B
for which P (t0 , x0 , B) > 0 for some (t0 , x0 ) ∈ (0, ∞) × E is recurrent.
Theorem 9.31 and its companion Theorem 9.33 show that under the hypothe-
ses of Theorem 8.18 a Borel subset is recurrent if and only if it is Harris
recurrent. For the notion of the almost separation property the reader may
want to see Remark 8.6 following Definition 8.5: see Proposition 8.11 as well.
Proof. In assertion (i) of Proposition 8.11 it is shown that the almost separa-
tion property implies the very relevant property (8.20), which is somewhat
strengthened in Proposition 8.13. Using this property we see that the function
x 7→ Px [τA ≤ t] is continuous on E \ A where A is compact. From the proof of
Proposition 8.16 it follows that there exists a recurrent compact subset. From
Lemma 8.22 below we see that all Borel subsets B for which P (t0 , x0 , B) > 0
for some (t0 , x0 ) ∈ (0, ∞) are recurrent.
This completes the proof of Theorem 8.18.

Again we consider the time-homogeneous Markov process (8.14) in Theorem


8.8. Theorem 9.31 and its companion Theorem 9.33 show that in the context
of a strong Markov process with the strong Feller property the collection
of recurrent Borel subsets coincides with the collection of Harris recurrent
subsets provided that all measures B 7→ P (t, x, B), B ∈ E, (t, x) ∈ (0, ∞)×E,
are equivalent.
Definition 8.19. Let A be a Borel subset of E, and τA its first hitting time:
τA = inf {s > 0 : X(s) ∈ A}. The subset A is called recurrent if

Px [τA < ∞] = 1 for all x ∈ E.

The subset A is called Harris recurrent provided

    Px [ ∫0∞ 1A (X(s)) ds = ∞ ] = 1 for all x ∈ E.                   (8.51)

Definition 8.20. Let µ be an invariant measure for the Markov process in


(8.14). Then the Markov process is µ-Harris recurrent provided every Borel
subset A for which µ(A) > 0 is Harris recurrent. Suppose that all measures
B 7→ P (t, x, B), B ∈ E, (t, x) ∈ (0, ∞) × E, are equivalent. Then the corre-
sponding Markov process is called Harris recurrent if every Borel subset for
which P (1, x0 , B) > 0 for some x0 ∈ E is Harris recurrent.

The following theorem says among other things that, if the Markov process
possesses a finite invariant measure µ, then there exists a compact recurrent
subset K of E such that µ(K) > 0. It is closely related to Theorem 2.1 in
Seidler [207]. An adapted version will be employed in the proof of Theorem
9.36 in Chapter 9: see (9.189)–(9.206). In particular the σ-finiteness will be at
stake: see the arguments after the (in-)equalities (9.173) and (9.188). Another
variant can be found in Theorem 8.39 below.
Theorem 8.21. Let the Markov process have right-continuous sample paths,
be strong Feller, and irreducible. Let K ⊂ E be a compact subset which is
non-recurrent. Then

    sup ∫0∞ P (t, x, K) dt < ∞, and µ(K) = 0                         (8.52)
    x∈E

for all finite invariant measures µ. If, in addition, P (1, x0 , K) > 0 for some
x0 ∈ E, then

    Px [sup {t ≥ 0 : X(t) ∈ K ′ } < ∞] = 1, and lim P (t, x, K ′ ) = 0   (8.53)
                                                t→∞

for all x ∈ E and all compact subsets K ′ .

Proof. Let K be a non-recurrent compact subset of E. We begin by showing
that

    sup ∫0∞ P (t, x, K) dt < ∞.                                      (8.54)
    x∈E

The proof of (8.54) follows the same pattern as the corresponding proof by
Seidler in [207], who in turn follows Khasminskii [101]. Let τ be the first
hitting time of K. Since K is non-recurrent there exists y0 ∈/ K such that

    Py0 [τ = ∞] = Py0 [X(t) ∈/ K for all t ≥ 0] > 0.

By Remark 8.15 which follows Proposition 8.14 the function x 7→ Px [τ = ∞]


is continuous on E \ K. Hence there exists an open neighborhood V of y0 such
that
α := inf Px [τ = ∞] > 0. (8.55)
x∈V

Fix t0 > 0 arbitrary, and choose y ∈ K. Then by the Markov property we
have

    Py [ ∫0∞ 1K (X(t)) dt < t0 ]
      = Ey [ ω 7→ PX(t0 )(ω) [ ∫0t0 1K (X(t)(ω)) dt + ∫0∞ 1K (X(t)) dt < t0 ] ]
      ≥ Ey [ ω 7→ PX(t0 )(ω) [ ∫0t0 1K (X(t)(ω)) dt < t0 , X(t) ∈/ K for all t ≥ 0 ] ]
      ≥ Py [ ∫0t0 1K (X(t)) dt < t0 , X(t) ∈/ K for all t ≥ t0 ]
      ≥ Py [ ∫0t0 1K (X(t)) dt < t0 , X (t0 ) ∈ V, X(t) ∈/ K for all t ≥ t0 ]
      ≥ Ey [ PX(t0 ) [τ = ∞] , X (t0 ) ∈ V ]

    (apply (8.55), the definition of α)

      ≥ αP (t0 , y, V ) ≥ α inf P (t0 , x, V ) =: q > 0,             (8.56)
                            x∈K

where we used the irreducibility of our Markov process, and the continuity of
the function x 7→ P (t0 , x, V ). Hence we infer

    sup Py [ ∫0∞ 1K (X(t)) dt ≥ t0 ] ≤ 1 − q.                        (8.57)
    y∈K

Put

    κ = inf { t > 0 : ∫0t 1K (X(s)) ds ≥ t0 } = inf { t > 0 : ∫0t 1K (X(s)) ds = t0 } .   (8.58)
Then κ is a stopping time relative to the filtration (Ft )t≥0 , because X(s) is
Ft -measurable for all 0 ≤ s ≤ t. Moreover, by right-continuity of the process
t 7→ X(t) it follows that X(κ) ∈ K on the event {κ < ∞}. Let y ∈ E. By
induction we shall prove that

    Py [ ∫0∞ 1K (X(t)) dt > kt0 ] ≤ (1 − q)k−1 ,  k ∈ N, k ≥ 1.      (8.59)

To this end we put

    αk = sup Px [ ∫0∞ 1K (X(s)) ds ≥ kt0 ] .                         (8.60)
         x∈K

If x belongs to K, then by the Markov property we have:

    Px [ ∫0∞ 1K (X(s)) ds > (k + 1)t0 ] = Px [ ∫κ∞ 1K (X(s)) ds > kt0 , κ < ∞ ]
      = Ex [ PX(κ) [ ∫0∞ 1K (X(s)) ds > kt0 ] , κ < ∞ ]
      = Ex [ PX(κ) [ ∫0∞ 1K (X(s)) ds > kt0 ] , ∫0∞ 1K (X(s)) ds ≥ t0 ]
      ≤ α1 αk .                                                      (8.61)

From (8.61) and induction we infer

    sup Px [ ∫0∞ 1K (X(s)) ds ≥ kt0 ]
    x∈K
      ≤ (α1 )k = ( sup Px [ ∫0∞ 1K (X(s)) ds ≥ t0 ] )k ≤ (1 − q)k ,  (8.62)
                   x∈K

where in the final step of (8.62) we employed (8.57). If y ∈ E is arbitrary,


then we proceed as follows:

    Py [ ∫0∞ 1K (X(s)) ds > (k + 1)t0 ]
      = Py [ ∫κ∞ 1K (X(s)) ds > kt0 , κ < ∞ ]
      = Ey [ PX(κ) [ ∫0∞ 1K (X(s)) ds > kt0 ] , κ < ∞ ]
      ≤ (1 − q)k Py [κ < ∞] ≤ (1 − q)k .                             (8.63)

The inequality in (8.63) implies the inequality in (8.59). To show the first part
of (8.52) we observe that for x ∈ E we have

    ∫0∞ P (t, x, K) dt = Ex [ ∫0∞ 1K (X(s)) ds ]
      ≤ Σ∞k=1 kt0 Px [ (k − 1)t0 < ∫0∞ 1K (X(s)) ds ≤ kt0 ]
      ≤ t0 + Σ∞k=2 kt0 Px [ ∫0∞ 1K (X(s)) ds > (k − 1)t0 ]
      ≤ t0 + t0 Σ∞k=2 k(1 − q)k−2 = t0 (1 + 1/q + 1/q²) < ∞.         (8.64)

The first part of (8.52) is indeed a consequence of (8.64).
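The explicit constant t0 (1 + 1/q + 1/q²) in (8.64) comes from the series evaluation Σ∞k=2 k(1 − q)k−2 = (1 + q)/q² = 1/q + 1/q². The following quick check (illustrative only) confirms this evaluation numerically for a few values of q.

```python
from math import isclose

def series(q, n_terms=10_000):
    """Partial sum of sum_{k>=2} k (1-q)^(k-2) for 0 < q < 1; it converges
    to (1+q)/q^2 = 1/q + 1/q^2, the constant appearing in (8.64)."""
    return sum(k * (1.0 - q) ** (k - 2) for k in range(2, n_terms))

for q in (0.1, 0.3, 0.5, 0.7):
    print(q, series(q), 1.0 / q + 1.0 / q ** 2)
```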


In fact from (8.64) we also obtain µ(K) = 0 for any finite invariant measure
µ. Let µ be an invariant probability measure. That µ(K) = 0 can be seen by
the following (standard) arguments:
    µ(K) = (1/T ) ∫0T µ(K) dt = (1/T ) ∫0T ∫E P (t, y, K) dµ(y) dt
         = ∫E ( (1/T ) ∫0T P (t, y, K) dt ) dµ(y) ≤ (1/T ) sup ∫0∞ P (t, x, K) dt
                                                         x∈E
         ≤ (t0 /T ) (1 + 1/q + 1/q²) .                               (8.65)

Since T > 0 is arbitrary (8.65) implies µ(K) = 0.


Next assume that the compact subset K has the additional property that
P (1, x0 , K) > 0. Let K ′ be an arbitrary compact subset. We want to prove
that
    Px [sup {t ≥ 0 : X(t) ∈ K ′ } < ∞] = 1.                          (8.66)

Put h(x) = ∫0∞ P (t, x, K) dt. Then by (8.64) the function h belongs to L∞ (E, E).
The function h is also lower semi-continuous, because the functions x 7→
P (t, x, K), t > 0, are continuous. Moreover, it is strictly positive, by the fact
that for all t > 0 and all x ∈ E, P (t, x, K) > 0. Put Hn = {h > n−1 }.
Then there exists m ∈ N such that K ′ ⊂ Hm . Fix x ∈ E, denote by σ the first
hitting time of K ′ , and let σ(k) be the first hitting time of K ′ after time k, i.e.
σ(k) = k + σ ◦ ϑk . Taking into account that X (σ(k)) ∈ K ′ ⊂ Hm Px -almost
surely on the event {σ(k) < ∞} we obtain:
    (1/m) Px [σ(k) < ∞] ≤ Ex [h (X (σ(k))) , σ(k) < ∞]
      = Ex [ ∫0∞ P (s, X (σ(k)) , K) ds, σ(k) < ∞ ]
      = ∫0∞ Ex [ PX(σ(k)) [X(s) ∈ K] , σ(k) < ∞ ] ds
      = ∫0∞ Ex [ Px [ X (s + σ(k)) ∈ K | Fσ(k) ] , σ(k) < ∞ ] ds
      = ∫0∞ Ex [ 1K (X (s + σ(k))) , σ(k) < ∞ ] ds
      = Ex [ ∫σ(k)∞ 1K (X (s)) ds, σ(k) < ∞ ]

    (σ(k) ≥ k on the event {σ(k) < ∞})

      ≤ Ex [ ∫k∞ 1K (X(s)) ds, σ(k) < ∞ ]
      ≤ ∫k∞ P (s, x, K) ds.                                          (8.67)

The sequence of events {σ(k) < ∞}, k ∈ N, decreases. From (8.67) it follows
that its intersection has Px -measure zero, and hence that its complement has
full Px -measure. This means that for Px -almost all ω there exists k ∈ N such
that σ(k)(ω) = ∞, which is precisely (8.66). From (8.66) we also readily
infer limt→∞ P (t, x, K ′ ) = 0, because the process X(t) visits K ′ only up to
some Px -almost surely finite time.
This completes the proof of Theorem 8.21.
A stopping time of the form inf { t > 0 : ∫0t 1K (X(s)) ds > 0 } is called the
penetration time of K: compare with (8.58).
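In discrete time both the penetration time and the stopping time κ from (8.58) only involve the occupation time of K along the path up to the present, which is why they are stopping times. A small sketch (illustrative; the path and the set K are made up):

```python
def occupation_time(path, K, t):
    """Occupation time of K up to (and excluding) time t along a path."""
    return sum(1 for s in range(t) if path[s] in K)

def penetration_time(path, K):
    """First t at which the occupation time of K becomes positive; a
    discrete analogue of the penetration time."""
    return next((t for t in range(1, len(path) + 1)
                 if occupation_time(path, K, t) > 0), None)

def kappa(path, K, t0):
    """First t at which the occupation time reaches t0, the discrete
    analogue of kappa in (8.58)."""
    return next((t for t in range(1, len(path) + 1)
                 if occupation_time(path, K, t) >= t0), None)

path = [0, 1, 1, 2, 1, 2, 2, 0, 2]       # a fixed sample path
K = {2}
print(penetration_time(path, K), kappa(path, K, 3))
```

Note that for t0 = 1 the time κ coincides with the penetration time, matching the two descriptions of κ in (8.58).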
Lemma 8.22. Let the hypotheses and notations be as in Theorem 8.8. Sup-
pose that there exists a compact subset K which is recurrent. Then all Borel
subsets B with the property that P (t0 , x0 , B) > 0 for some pair (t0 , x0 ) ∈
(0, ∞) × E (or, equivalently, P (t, x, B) > 0 for all pairs (t, x) ∈ (0, ∞) × E)
are recurrent.

Proof. Let B ∈ E be such that P (t, x, B) > 0 for some (all) pairs (t, x) ∈
(0, ∞)×E. Let τB be the (first) hitting time of B: τB = inf {t > 0 : X(t) ∈ B}.
We need to show that Px [τB < ∞] = 1 for all x ∈ E. By our assumptions we
have

    inf Px [τB ≤ 1] ≥ inf Px [X(1) ∈ B] = inf P (1, x, B) =: q > 0.  (8.68)
    x∈K               x∈K                 x∈K

Let τ be the first hitting time of K, and define a sequence of hitting times of
K as follows:

τ1 = τ, and τn+1 = inf {t > τn + 2 : X(t) ∈ K} = τn +2+τ ◦ϑτn +2 . (8.69)

Then, for any n ∈ N, τn < ∞ and X (τn ) ∈ K Px -almost surely for all x ∈ E.
Put

An = {τn ≤ τn + τB ◦ ϑτn ≤ τn + 1} = {τB ◦ ϑτn ≤ 1, τn < ∞} . (8.70)



The events in (8.70) should be compared with similar ones in (8.46). Then
An ∈ Fτn +1 ⊂ Fτn+1 , and we have, with q as in (8.68),

    Σ∞n=1 Px [ An | Fτn ] = Σ∞n=1 Px [ {τB ◦ ϑτn ≤ 1} | Fτn ]
      = Σ∞n=1 PX(τn ) [τB ≤ 1] ≥ Σ∞n=1 inf Py [τB ≤ 1]
                                       y∈K
      ≥ Σ∞n=1 q = ∞, Px -almost surely                               (8.71)

for all x ∈ E. Therefore by the generalized Borel-Cantelli lemma (see e.g.


Shiryayev [212] Corollary VII 5.2) Px -almost all ω belong to some An for
some n ∈ N. However, if ω ∈ An , then τB (ω) ≤ τn + 1 < ∞.
This concludes the proof of Lemma 8.22.
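The way the generalized Borel-Cantelli lemma is used above can be mimicked by simulation. The sketch below is illustrative only: independent trials stand in for the conditional probabilities, which in the proof are bounded below by the same constant q. Since the conditional success probabilities sum to infinity, almost every path contains a success, and the fraction of paths with none decays like (1 − q)n.

```python
import random

random.seed(12345)                       # fixed seed for reproducibility
q, n_attempts, n_paths = 0.1, 200, 10_000

hit = 0
for _ in range(n_paths):
    # each attempt succeeds with probability q, independently
    if any(random.random() < q for _ in range(n_attempts)):
        hit += 1
hit_fraction = hit / n_paths
print(hit_fraction)                      # (1-0.1)^200 is about 7e-10, so close to 1
```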

The following result is a reformulation of Lemma 8.22 for Markov chains with
values in E. Its proof can be copied from the proof of Lemma 8.22.
Lemma 8.23. Let the notation and hypotheses be as in Proposition 8.17. Sup-
pose that there exists a compact subset K which is recurrent. Then all Borel
subsets B with the property that P (t0 , x0 , B) > 0 for some pair (t0 , x0 ) ∈
(0, ∞) × E (or, equivalently, P (t, x, B) > 0 for all pairs (t, x) ∈ (0, ∞) × E)
are recurrent.
For the notion of a Harris recurrent subset, the reader is referred to Definition
8.19. The following result follows merely from the recurrence properties of our
Markov process. These recurrence properties were established in Lemma 8.22.
The existence of a finite invariant probability measure is not required.
Proposition 8.24. Let the hypotheses and notation be as in Theorem 8.8, ex-
cept that the existence of an invariant probability measure is not required. Assume
that there exists a compact recurrent subset. Then every non-empty open sub-
set U of E is Harris recurrent.

Proof. Let U be any open subset of E. Suppose ∅ 6= U 6= E. Since our
Markov process is recurrent, there exists a pair (t0 , x0 ) ∈ (0, ∞) × E such that
P (t0 , x0 , U ) > 0. Let the compact subset K of U be such that P (t0 , x0 , K) ≥
(1/2) P (t0 , x0 , U ) > 0. From Lemma 8.22 we infer that the compact subset K
is recurrent. Let τU c be the hitting time of E \ U . From (8.20) in Proposition
8.11 we see that there exists q > 0 such that supy∈K Py [τU c ≤ q] < 1/2. Then
we see that inf y∈K Py [τU c > q] ≥ 1/2. Let τ = τK be the first hitting time of
K. Then by recurrence Px [τ < ∞] = 1, x ∈ E. Instead of τU c we write σ. We
define the double sequence of hitting times of K and E \ U :

τ1 = τ, σn = τn + σ ◦ ϑτn , τn+1 = σn + τ ◦ ϑσn . (8.72)

In addition we introduce the events Qn = {σn − τn > q}. For every y ∈ E we
have:

    Σ∞n=1 Py [ Qn | Fτn ] = Σ∞n=1 Py [ σ ◦ ϑτn > q | Fτn ]
      = Σ∞n=1 PX(τn ) [σ > q] ≥ Σ∞n=1 inf x∈K Px [σ > q]
      ≥ Σ∞n=1 1/2 = ∞, Py -almost surely.                            (8.73)

From (8.73) and the generalized Borel-Cantelli lemma (again see e.g. Shiryayev
[212] Corollary VII 5.2) we infer

    Py [ lim sup Qn ] = 1,  y ∈ E.                                   (8.74)
          n→∞

Since 1U (X(t)) = 1 on the event {τn ≤ t < σn }, (8.74) implies

    Py [ ∫0∞ 1U (X(t)) dt = ∞ ] = 1,  y ∈ E.

In other words the open subset U is Harris recurrent.


This completes the proof of Proposition 8.24.
© ª
Definition 8.25. Let {F_{t1}^{t2} : 0 ≤ t1 ≤ t2 < ∞} be a collection of σ-fields on
Ω such that for every fixed t1 ∈ [0, ∞) the collection (F_{t1}^{t2})_{t2 ≥ t1} is a filtration,
and such that for every fixed t2 ∈ (0, ∞) the collection (F_{t1}^{t2})_{t1 ≤ t2} is also a
filtration. A family of stochastic variables A (t1 , t2 ) : Ω → R, 0 ≤ t1 ≤ t2 < ∞,
is called an additive process relative to the collection {F_{t1}^{t2} : 0 ≤ t1 ≤ t2 < ∞}
if it possesses the following properties:
1. the equality A (t1 , t2 ) + A (t2 , t3 ) = A (t1 , t3 ) holds for all 0 ≤ t1 ≤ t2 ≤ t3 ;
2. for every 0 ≤ t1 ≤ t2 the stochastic variable A (t1 , t2 ) is F_{t1}^{t2}-measurable.
In case of a time-homogeneous Markov process, as in Theorem 8.8, an additive
process A (t) : Ω → R, 0 ≤ t < ∞, is called a time-homogeneous additive
process relative to the collection {Ft : 0 ≤ t < ∞} if it possesses the following
properties:
1. the equality A (s) + A (t − s) ◦ ϑs = A (t) holds Px -almost surely for all
0 ≤ s ≤ t;
2. for every t ≥ 0 the stochastic variable A (t) is Ft -measurable.
If in the above definitions the plus signs are replaced with multiplication
signs, then the corresponding processes are called multiplicative and time-
homogeneous multiplicative respectively.

Instead of time-homogeneous additive process we usually just say additive


process; a similar convention is adopted in case of multiplicative processes.
If A (t1 , t2 ) is an additive process, then exp (A (t1 , t2 )) is a multiplicative
process.
In fact there is a relationship between these two notions. Let t 7→ A(t) be an
additive process in the time-homogeneous case. Then it can also be considered
as an additive process of two variables by writing A (t1 , t2 ) = A (t2 − t1 ) ◦ ϑt1 ,
0 ≤ t1 ≤ t2 < ∞.
Let f : [0, ∞) × E → R be a Borel measurable function with the property
that ∫0t |f (s, X(s))| ds < ∞ Px -almost surely for all x ∈ E. Then the pro-
cess (t1 , t2 ) 7→ Af (t1 , t2 ) = ∫t1t2 f (ρ, X(ρ)) dρ is an additive process. In the
time-homogeneous case, and if the function f only depends on the state vari-
able, then the process t 7→ Af (t) := ∫0t f (X(ρ)) dρ is a (time-homogeneous)
additive process. Let τ : Ω → [0, ∞] be a terminal stopping time in the
sense that for every pair (t1 , t2 ), 0 ≤ t1 < t2 < ∞, the event {t1 < τ ≤ t2 }
is F_{t1}^{t2}-measurable. Then the process (t1 , t2 ) 7→ M (t1 , t2 ), 0 ≤ t1 ≤ t2 < ∞,
defined by M (t1 , t2 ) = 1 − 1{t1 <τ ≤t2 } is a multiplicative process. If τ is a
time-homogeneous terminal stopping time, then the process t 7→ 1{τ >t} is a
multiplicative process. This fact follows from the observation that s + τ ◦ ϑs = τ
Px -almost surely on the event {τ > s}: the latter is just the notion of a (time-
homogeneous) terminal stopping time. Examples of terminal stopping times
are first entry and hitting times of Borel subsets. In the presence of a Markov
process like (8.14) in Theorem 8.8, for F_{t1}^{t2} we may take the universal
completion of the right closure of σ (X(s) : t1 ≤ s ≤ t2 ). An important prop-
erty used here is the fact that the corresponding Markov process has
right-continuous paths (or orbits).
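In discrete time the additive functional Af and the shift operators can be written down directly. The sketch below (illustrative, with a made-up path and function f) checks the additivity relation A(s) + A(t − s) ◦ ϑs = A(t) by splitting the defining sum.

```python
def A(f, path, t):
    """Additive functional A_f(t) = sum_{s<t} f(path[s]) along a path."""
    return sum(f(path[s]) for s in range(t))

def shift(path, s):
    """The shift theta_s drops the first s steps of the path."""
    return path[s:]

f = lambda x: x * x
path = [3, 1, 4, 1, 5, 9, 2, 6]

s, t = 3, 7
lhs = A(f, path, s) + A(f, shift(path, s), t - s)   # A(s) + A(t-s) o theta_s
rhs = A(f, path, t)                                 # A(t)
print(lhs, rhs)
```

The multiplicative counterpart is obtained by replacing the sums with products, or equivalently by passing to exp(Af ), as in the remark above.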
Let µ be a Radon measure on E which is σ-finite. In the following proposi-
tion we write Eµ [F ] = ∫E Ex [F ] dµ(x) for any stochastic variable F : Ω → R
for which Eµ [|F |] < ∞, or F ≥ 0.
The existence of a σ-finite Radon measure under the recurrence hypotheses
of Theorem 9.36 will be proved in Chapter 9.
Proposition 8.26. Let the hypotheses and notation be as in Theorem 8.8,
except that the invariant measure µ is not necessarily finite, but is allowed to
be a σ-finite Radon measure. Let (A(t))t≥0 and (B(t))t≥0 be additive processes
such that Eµ [|A1 |] < ∞, and 0 < Eµ [B1 ] < ∞. Then the equality

    Px [ lim (At / Bt ) = Eµ [A1 ] / Eµ [B1 ] ] = 1                  (8.75)
          t→∞

holds for all x ∈ E. Moreover, the equality

    lim ( Ex [At ] / Ex [Bt ] ) = Eµ [A1 ] / Eµ [B1 ]                (8.76)
    t→∞

holds for µ-almost all x ∈ E.
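For a finite-state chain the second limit can be checked deterministically: Ex [An ] = Σ_{j<n} (P j a)(x), and as n → ∞ the ratio Ex [An ] / Ex [Bn ] approaches µ(a)/µ(b) for the invariant measure µ, which is the discrete analogue of (8.76). The sketch below is illustrative only; the chain and the functions a, b are made up.

```python
P = [
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.3, 0.3, 0.4],
]
a = [1.0, 0.0, 2.0]   # A_n = sum_{j<n} a(X(j))
b = [1.0, 1.0, 3.0]   # B_n = sum_{j<n} b(X(j)), with b > 0

def apply_P(P, f):
    """One step of the transition operator: (Pf)(x) = sum_y P(x,y) f(y)."""
    return [sum(P[x][y] * f[y] for y in range(len(f))) for x in range(len(f))]

def expected_functional(P, f, x, n):
    """E_x[sum_{j<n} f(X(j))], computed from the iterates P^j f."""
    total, Pjf = 0.0, list(f)
    for _ in range(n):
        total += Pjf[x]
        Pjf = apply_P(P, Pjf)
    return total

# invariant probability mu via (deterministic) power iteration
mu = [1.0 / 3.0] * 3
for _ in range(2000):
    mu = [sum(mu[x] * P[x][y] for x in range(3)) for y in range(3)]

n = 20_000
ratio = expected_functional(P, a, 0, n) / expected_functional(P, b, 0, n)
target = sum(m * v for m, v in zip(mu, a)) / sum(m * v for m, v in zip(mu, b))
print(ratio, target)
```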



The proof of Proposition 8.26 is copied from the proof of Proposition 5.5
in Seidler [207]. Some of the techniques are borrowed from Azema et al [10],
section II.2, and the Chacon-Ornstein theorem as exhibited in Krengel [138].
In the proof of Proposition 8.26 we need some definitions and terminology,
which we collect next.
Definition 8.27. Let µ be a σ-finite Borel measure on E. An operator
S : L1 (E, µ) → L1 (E, µ) is called a positive operator or positivity pre-
serving operator if f ≥ 0 µ-almost everywhere implies Sf ≥ 0 µ-almost
everywhere. It is called a contraction operator if ∫E |Sf | dµ ≤ ∫E |f | dµ for
all f ∈ L1 (E, µ). The operator S ∗ : L∞ (E, µ) → L∞ (E, µ) is defined by
the equality ∫E (Sf ) g dµ = ∫E f (S ∗ g) dµ for all f ∈ L1 (E, µ) and all
g ∈ L∞ (E, µ). Since the measure µ is σ-finite, the dual space of L1 (E, µ)
is identified with L∞ (E, µ). Notice that S ∗ gn decreases to 0 whenever gn
decreases pointwise to 0. A function (or, more precisely, a class of functions)
h ∈ L∞ (E, µ) is called harmonic if S ∗ h = h. A non-negative function h for
which h ≥ S ∗ h is called superharmonic. A superharmonic function h is called
strictly superharmonic on a subset A of E provided h > S ∗ h on A. A subset
B ∈ E is called S-absorbing if Sf ∈ L1 (B, µ) for all f ∈ L1 (B, µ).
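For a finite state space the objects in Definition 8.27 can be written out explicitly. The sketch below is a finite-state illustration (the matrix is made up, and the adjoint formula is derived for this finite setting, not quoted from the book): with an invariant probability µ for a stochastic matrix P, the operator (Sf )(x) = Σz f (z)P (x, z) is a positive contraction on L1 (µ), its adjoint satisfies the duality ∫ (Sf ) g dµ = ∫ f (S∗g) dµ, and the constant function is harmonic, S∗1 = 1, precisely because µ is invariant.

```python
P = [
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
    [0.5, 0.2, 0.3],
]

# invariant probability mu via (deterministic) power iteration
mu = [1.0 / 3.0] * 3
for _ in range(1000):
    mu = [sum(mu[x] * P[x][y] for x in range(3)) for y in range(3)]

def S(f):
    """(Sf)(x) = sum_z f(z) P(x, z), acting on L^1(mu)."""
    return [sum(f[z] * P[x][z] for z in range(3)) for x in range(3)]

def S_star(g):
    """Adjoint obtained from the duality relation in this finite setting:
    (S*g)(z) = (1/mu(z)) sum_x mu(x) g(x) P(x, z)."""
    return [sum(mu[x] * g[x] * P[x][z] for x in range(3)) / mu[z]
            for z in range(3)]

f, g = [1.0, -2.0, 5.0], [0.5, 2.0, -1.0]
lhs = sum(S(f)[x] * g[x] * mu[x] for x in range(3))       # int (Sf) g dmu
rhs = sum(f[z] * S_star(g)[z] * mu[z] for z in range(3))  # int f (S*g) dmu
print(lhs, rhs, S_star([1.0, 1.0, 1.0]))
```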
The following decomposition theorems, 8.28, 8.29, and 8.30, can be found in
Krengel [138], theorems 1.3, 1.5, and 1.6 in Chapter 3. The decomposition of
E into a conservative part C and its complement D, the dissipative part, is
called the Hopf decomposition. The results are also applicable to the measure
space (Ω, F, Pµ ) instead of (E, E, µ), where µ is a σ-finite invariant Radon
measure on E. Since µ(C) = Pµ [X(0) ∈ C], C ∈ E, the measure µ is σ-finite
if and only if Pµ is so.
Theorem 8.28. Let S be a positive contraction on L1 (E, µ). Then there ex-
ists a decomposition of E into disjoint sets C and D which are determined
uniquely modulo µ by:
(C1) If h is superharmonic, then h = S ∗ h on C;
(D1) There exists a bounded superharmonic function h0 which is strictly super-
harmonic on D.
The function h0 may be constructed in such a way that limn→∞ (S ∗ )n h0 = 0
on D, and h0 = 0 on C.

Theorem 8.29. Let S be a positive contraction on L1 (E, µ). Let C and D be
the subsets as described in Theorem 8.28. Then the decomposition of E into
the disjoint sets C and D is also determined uniquely modulo µ by:
(C2) For all h ≥ 0, h ∈ L∞ (E, µ), the sum Σ∞n=0 (S ∗ )n h = ∞ on the subset
C ∩ {Σ∞n=0 (S ∗ )n h > 0};
(D2) There exists a function hD ∈ L∞ (E, µ), hD ≥ 0, for which {hD > 0} = D,
and Σ∞n=0 (S ∗ )n hD ≤ 1.

Theorem 8.30. Let S be a positive contraction on L1 (E, µ). Let C and D be
the subsets as described in Theorem 8.28. Then the decomposition of E into
the disjoint sets C and D is also determined uniquely modulo µ by:
(C3) For all functions f ≥ 0, f ∈ L1 (E, µ), the sum Σ∞n=0 S n f = ∞ on the
subset C ∩ {Σ∞n=0 S n f > 0};
(D3) For all functions f ≥ 0, f ∈ L1 (E, µ), Σ∞n=0 S n f < ∞ on D.

Definition 8.31. Let S : L1 (E, µ) → L1 (E, µ) be a positive contraction. The
decomposition of E into the disjoint union of C and D = E \ C as determined
by one of the theorems 8.28, 8.29 or 8.30 is called the Hopf decomposition of E
relative to S. The subset C is called the conservative part of S, and D is called
the dissipative part. The operator S is called conservative if µ (E \ C) = 0.

The following result can be found in Skorohod [214]: see Theorem 5 and its
corollary in Chapter 1, §1.
Theorem 8.32. Let µ be a non-zero invariant σ-finite measure on E, and
put Pµ [A] = ∫ Px [A] dµ(x), A ∈ F. Then Pµ is a σ-finite measure on F. Put
I = {A ∈ F : Pµ [ϑ1−1 A △ A] = 0}. Assume that all probability measures of the
form B 7→ P (1, x, B), x ∈ E, are equivalent. Then the following assertions
are true:
(a) If B ∈ E is such that P (1, x, B) = 1 for µ-almost all x ∈ B, then either
µ(B) = 0 or µ (E \ B) = 0.
(b) Suppose that the stochastic variable Y ∈ L1 (Ω, F, Pµ ) possesses the follow-
ing property: Y = Y ◦ ϑ1 Pµ -almost everywhere. Then for all n ∈ N the
equality
        EX(n) [Y ] = Ex [ Y | Fn ]                                   (8.77)
holds Px -almost surely for µ-almost all x ∈ E. The equality Ex [Y ] =
Ex [EX(n) [Y ]] holds µ-almost everywhere for all n ∈ N, including n = 0.
Moreover, the equality Y = EX(0) [Y ] holds Pµ -almost everywhere.
(c) Events in I are Pµ -trivial in the sense that either Pµ [R] = 0 or Pµ [Ω \ R] =
0.
(d) Let Y ∈ L1 (Ω, F, Pµ ) be a stochastic variable with the property that Y =
Y ◦ ϑ1 Pµ -almost everywhere. Then Y is zero Pµ -almost everywhere if
µ(E) = ∞, and constant Pµ -almost everywhere if µ is finite.

Remark 8.33. Assertion (b) of Theorem 8.32 only uses the invariance of the σ-
finite measure µ. The other assertions also use the fact that all measures of the
form B 7→ P (1, x, B), B ∈ E, x ∈ E, are equivalent.

Proof. Let (An )n∈N be an increasing sequence in E such that µ (An ) < ∞
for all n ∈ N and E = ∪n∈N An . Put Ωn = {X(0) ∈ An }. Then Ωn ⊂ Ωn+1 ,
Ω = ∪n∈N Ωn , and Pµ [Ωn ] = µ (An ). This shows that the measure Pµ is
σ-finite.

(a) Let B ∈ E be such that P (1, x, B) = 1 for µ-almost all x ∈ B, and


assume that µ(B) > 0. Then P (1, x, E \ B) = 0 for µ-almost all x ∈ B. Since
µ(B) > 0, there exists at least one x0 ∈ B such that P (1, x0 , E \ B) = 0. Since
all measures of the form C 7→ P (1, x, C), x ∈ E, are equivalent, it follows that
P (1, y, E \ B) = 0 for µ-almost all y ∈ E. Consequently, Pµ [E \ B] = 0. This
proves assertion (a).
(b) First observe that by the Markov property, and by the invariance of
the measure µ, we have

    Eµ [|Y ◦ ϑn − Y ◦ ϑn+1 |] = Eµ [ EX(n) [|Y − Y ◦ ϑ1 |] ]
      = Eµ [ EX(n−1) [|Y − Y ◦ ϑ1 |] ] = · · · = Eµ [ EX(0) [|Y − Y ◦ ϑ1 |] ]
      = Eµ [|Y − Y ◦ ϑ1 |] .                                         (8.78)

From (8.78) we infer by induction that Y = Y ◦ ϑn Pµ -almost everywhere for
all n ∈ N. Let A ∈ Fn , and consider the (in-)equalities:

    0 = Eµ [|Y − Y ◦ ϑn |] = ∫E Ex [|Y − Y ◦ ϑn |] dµ(x)
      ≥ ∫E | Ex [(Y − Y ◦ ϑn ) 1A ] | dµ(x)
      = ∫E | Ex [Y 1A ] − Ex [ Ex [ Y ◦ ϑn | Fn ] 1A ] | dµ(x)

    (Markov property)

      = ∫E | Ex [Y 1A ] − Ex [ EX(n) [Y ] 1A ] | dµ(x)
      = ∫E | Ex [ (Y − EX(n) [Y ]) 1A ] | dµ(x).                     (8.79)
From (8.79) we see that Ex [ Y | Fn ] = EX(n) [Y ] Px -almost surely for µ-almost
all x ∈ E, and n ∈ N, n ≥ 1. The latter is the same as saying that for µ-almost
all x ∈ E the process n 7→ EX(n) [Y ] is a Px -martingale. It also proves (8.77)
in assertion (b) for n ∈ N, n ≥ 1. By putting A = Ω in (8.79) we infer
Ex [Y ] = Ex [EX(n) [Y ]] µ-almost everywhere on E. In order to complete the
proof of assertion (b) we need to show the equality Y = EX(0) [Y ] Pµ -almost
everywhere. Since the process n 7→ EX(n) [Y ] is a Px -martingale we see that
its limit exists Px -almost surely for µ-almost all x ∈ E. Moreover this limit
is Pµ -almost surely equal to Y . We shall prove that this limit is also equal to
EX(0) [Y ] Pµ -almost everywhere. Therefore we consider for −∞ < α < β < ∞
the quantity

    Pµ [α < Y < β] = lim Pµ [ α < EX(n) [Y ] < β ]
                     n→∞

(employ the invariance of µ)


                   = lim Pµ [ α < EX(0) [Y ] < β ]
                     n→∞
                   = Pµ [ α < EX(0) [Y ] < β ] .                     (8.80)

Since −∞ < α < β < ∞ are arbitrary, the equalities in (8.80) yield Y =
EX(0) [Y ] Pµ -almost everywhere. This completes the proof of assertion (b).
(c) Let R be a member of I. Then 1R = 1R ◦ ϑ1 Pµ -almost everywhere.
Since µ is an invariant measure we also get 1R = 1R ◦ϑn Pµ -almost everywhere
for all n ∈ N. In addition, an application of assertion (b) yields

1R = EX(0) [1R ] = PX(0) [R] Pµ -almost everywhere. (8.81)

Put B = {x ∈ E : Px [R] = 1}. From (8.81) we see R = {X(0) ∈ B}, and


hence 1R = 1B (X(0)), Pµ -almost everywhere. We also see that Ω \ R =
{X(0) ∈ E \ B}. It follows that µ(B) = Pµ [R] and µ (E \ B) = Pµ [Ω \ R].
Assume Pµ [R] = µ(B) > 0. Let x0 ∈ B be any point for which 1R ◦ ϑ1 = 1R
Px0 -almost surely. Since R belongs to I, and µ(B) > 0, the latter equality
holds for µ-almost all x0 ∈ B. Then

    P (1, x0 , B) = P (1, x0 , {x ∈ E : Px [R] = 1})
      = Px0 [ PX(1) [R] = 1 ] = Px0 [ PX(0) [R] ◦ ϑ1 = 1 ]
      = Px0 [1R ◦ ϑ1 = 1] = Px0 [1R = 1] = Px0 [R] = 1,              (8.82)

where in the final equality of (8.82) we used the fact that x0 ∈ B. It follows
that for µ-almost all x0 ∈ B we have P (1, x0 , B) = 1. From assertion (a) we
then infer that µ (E \ B) = 0. But then Pµ [Ω \ R] = 0. This shows assertion
(c).
(d) Let Y ∈ L1 (Ω, F, Pµ ) be such that Y = Y ◦ ϑ1 Pµ -almost everywhere.
Since Y ∧ 0 = (Y ◦ ϑ1 ) ∧ 0 = (Y ∧ 0) ◦ ϑ1 Pµ -almost everywhere we may
assume without loss of generality that Y ≥ 0. Let m be the µ-essential supre-
mum of Y . If m = ∞, then we consider the Pµ -invariant event {Y > n}. Observe
that Pµ [Y > n] > 0, and so by (c) its complement has Pµ -measure zero. In
other words Y > n Pµ -almost everywhere. Since this is true for all n ∈ N we
see Y = ∞ Pµ -almost everywhere. Since µ is non-zero and Y ∈ L1 (Ω, F, Pµ )
this is a contradiction. So we assume that m < ∞. If ξ < m we have Pµ [Y > ξ] >
0, and hence by (c) and the Pµ -invariance of the event {Y > ξ} it follows that
Pµ [Y ≤ ξ] = 0. Thus we see Y ≥ ξ Pµ -almost everywhere on Ω. Since ξ < m
is arbitrary we obtain Y ≥ m Pµ -almost everywhere on Ω. By definition we
have m ≥ Y Pµ -almost everywhere on Ω. Consequently Y = m Pµ -almost
everywhere on Ω. If µ(E) = Pµ [X(0) ∈ E] = Pµ [Ω] = ∞, then necessarily
Y = m = 0 Pµ -almost everywhere. If µ(E) < ∞, then Y = m Pµ -almost
everywhere, where m is a finite constant.
This completes the proof of Theorem 8.32.
Definition 8.34. Subsets B ∈ E with the property that P (t, x, B) = 1 for
µ-almost all x ∈ B are called µ-invariant subsets. Subsets B ∈ E with the
property that P (t, x, B) = 1 for all x ∈ B are called invariant subsets. Events
R ∈ F with the property that R = ϑt−1 R are called invariant events; events R
with the property that Pµ [ϑt−1 R △ R] = 0 for all t > 0 are called Pµ -invariant
events. For the notion of tail σ-fields in F the reader is referred to Definition
8.7.
Proof (Proof of Proposition 8.26). We begin by putting

    M = { Σ∞j=1 B1 ◦ ϑj = ∞ } ,

and note that M is obviously ϑ1 -invariant, so either

    Pµ [M ] = 0, or Pµ [Ω \ M ] = 0.

This fact follows from Theorem 8.32, assertion (c). But Eµ [B1 ] > 0 implies

    Eµ [ Σ∞j=1 B1 ◦ ϑj ] = Σ∞j=1 Eµ [B1 ] = ∞

(the measure Pµ is ϑ1 -invariant), so the possibility Pµ [M ] = 0 is excluded.


Define a positive contraction T : L1 (Pµ ) → L1 (Pµ ) by u 7→ u ◦ ϑ1 . Let V be
a Borel subset such that µ(V ) < ∞ and which satisfies

    Px [ ∫0∞ 1V (X(s)) ds = ∞ ] = 1 for all x ∈ E,                   (8.83)

and set v = ∫01 1V (X(s)) ds. The existence of such a set V is guaranteed by
Proposition 8.24 and the fact that the measure µ is a regular Radon measure.
Then v ∈ L1 (Pµ ) because

    Eµ [v] = ∫ v dPµ = ∫01 ∫E P (s, y, V ) dµ(y) ds = µ(V ) < ∞,
0 E

and hence we have

    Σ∞j=0 T j v = ∫0∞ 1V (X(s)) ds = ∞, Pµ -almost everywhere.       (8.84)

This means that the operator T is conservative (cf. [138], Theorem 1.6 Chapter
3; see Theorem 8.30 and Definition 8.31), and by the Chacon-Ornstein theorem
and the Neveu-Chacon identification theorem (see e.g. Krengel [138], theorems
2.7 and 3.4 Chapter 3) we obtain that

    lim ( Σnj=0 T j A1 / Σnj=0 T j B1 ) = lim (An / Bn ) = Eµ [A1 ] / Eµ [B1 ]  Pµ -almost everywhere.   (8.85)
    n→∞                                 n→∞

Now, exactly the same procedure as in [10] applies, and hence we see that the
discrete time result (8.85) implies that Pµ [Ω \ C] = 0, where

    C = { lim (At / Bt ) = Eµ [A1 ] / Eµ [B1 ] } .
          t→∞

So there exists N ∈ E with µ(N ) = 0 and Px [C] = 1 for all x ∈/ N . Let y ∈ E
be arbitrary; then

    Py [C] = Ey [1C ◦ ϑ1 ] = Ey [ EX(1) [1C ] ] = ∫E Pz [C] P (1, y, dz)
           = ∫E\N Pz [C] P (1, y, dz) = 1,

since P (1, y, N ) = 0 by the fact that all


R measures B 7→ P (t, y, B), B ∈ E,
(t, y) ∈ (0, ∞) × E, are equivalent, and E P (1, z, N ) dµ(z) = µ(N ) = 0.
This proves equality (8.75) in Proposition 8.26.
In order to prove equality (8.76) we introduce a positivity preserving
contraction mapping S : L1(E, µ) → L1(E, µ): Sf(x) = ∫_E f(z) P(1, x, dz),
f ∈ L1(E, µ). As in the proof of equality (8.75) let V be a Borel subset of
E such that µ(V) < ∞ and such that (8.83) is satisfied. Put h(x) = Ex[v] =
∫_0^1 P(s, x, V) ds. Then h ∈ L1(E, µ) and

    Σ_{n=0}^∞ S^n h(x) = ∫_0^∞ P(s, x, V) ds = ∞,  x ∈ E.    (8.86)

Hence the contraction mapping S is conservative. Let A be the σ-field of
S-absorbing subsets (cf. Krengel [138], Definition 1.7, Chapter 3, and Definition
8.27). The equivalence of the transition probabilities of our Markov process in
(8.14) easily implies that µ is trivial on A, i.e. µ(A) = 0 or µ(E \ A) = 0 for
all A ∈ A. Define the functions f and g by f(x) = Ex[A_1] and g(x) = Ex[B_1].
Then f, g ∈ L1(E, µ), g ≥ 0, and hence by the Chacon-Ornstein theorem we
have

    lim_{N→∞} Σ_{n=0}^N S^n f / Σ_{n=0}^N S^n g = ∫_E f dµ / ∫_E g dµ = Eµ[A_1] / Eµ[B_1]

µ-almost everywhere on { x ∈ E : Σ_{n=0}^∞ S^n g(x) > 0 }.    (8.87)

Notice that

    Sf(x) = ∫_E Ez[A_1] P(1, x, dz) = Ex[ E_{X(1)}[A_1] ] = Ex[A_1 ◦ ϑ_1].    (8.88)
 
8.1 Coupling methods 413

We know that Pµ[ Σ_{j=0}^∞ B_1 ◦ ϑ_j < ∞ ] = 0, and thus we also have

    Px[ Σ_{j=0}^∞ B_1 ◦ ϑ_j < ∞ ] = 0  for µ-almost all x ∈ E.

Therefore

    Σ_{n=0}^∞ S^n g(x) = Σ_{n=0}^∞ Ex[B_1 ◦ ϑ_n] = ∞    (8.89)

for µ-almost all x ∈ E, and hence (8.87) yields that the equality

    lim_{n→∞} Ex[A_n] / Ex[B_n] = Eµ[A_1] / Eµ[B_1]

holds for µ-almost all x ∈ E. Again the proof can be completed as in Azema
et al [10].
Altogether this completes the proof of Proposition 8.26.

In Corollary 8.35 we establish the uniqueness of σ-finite invariant measures.


Corollary 8.35. Let the assumptions and notation be as in Proposition 8.26.
Let µ1 and µ2 be two σ-finite non-trivial invariant measures. Then up to a
finite strictly positive constant these two measures coincide.

Proof. Let (B(t))_{t≥0} be an additive process such that 0 < E_{µ1}[B(1)] < ∞
and 0 < E_{µ2}[B(1)] < ∞. Let f ∈ L1(E, µ1) ∩ L1(E, µ2). From Proposition
8.26 we infer that

    ∫_E f dµ1 / E_{µ1}[B(1)] = ∫_E f dµ2 / E_{µ2}[B(1)],

and hence

    ∫_E f dµ2 = ( E_{µ2}[B(1)] / E_{µ1}[B(1)] ) ∫_E f dµ1.    (8.90)

The asserted uniqueness follows from (8.90) and the density of L1(E, µ1) ∩
L1(E, µ2) in each of L1(E, µ1) and L1(E, µ2).

Corollary 8.36. Let the assumptions and notation be as in Proposition 8.26.
Then the following assertions are valid:

(a) The Markov process in (8.14) is µ-Harris recurrent, that is, the equality
    Px[ ∫_0^∞ 1_A(X(s)) ds = ∞ ] = 1 holds for all x ∈ E and for all A ∈ E for
    which µ(A) > 0.
(b) Suppose µ(E) = ∞. Then lim_{t→∞} (1/t) ∫_0^t f(X(s)) ds = 0 Px-almost
    surely for all x ∈ E and all f ∈ L1(E, µ).
(c) Suppose µ(E) < ∞. Then lim_{t→∞} (1/t) ∫_0^t f(X(s)) ds = ∫_E f dµ / µ(E)
    Px-almost surely for all x ∈ E and all f ∈ L1(E, µ).

Proof. (a) Assume that there exist z ∈ E and A ∈ E with µ(A) > 0 such
that ∫_0^∞ 1_A(X(s)) ds < ∞ on an event Ω′ with Pz(Ω′) > 0. We will arrive at
a contradiction. Let V ∈ E be such that µ(V) < ∞ and (8.83) are satisfied.
Then by assumption

    lim_{t→∞} ∫_0^t 1_A(X(s)) ds / ∫_0^t 1_V(X(s)) ds = 0  Pz-almost surely on Ω′.    (8.91)

However, according to (8.75) in Proposition 8.26 the limit in (8.91) should be

    Eµ[ ∫_0^1 1_A(X(s)) ds ] / Eµ[ ∫_0^1 1_V(X(s)) ds ]
      = ∫_0^1 ∫_E Ex[1_A(X(s))] dµ(x) ds / ∫_0^1 ∫_E Ex[1_V(X(s))] dµ(x) ds
      = ∫_0^1 ∫_E P(s, x, A) dµ(x) ds / ∫_0^1 ∫_E P(s, x, V) dµ(x) ds = µ(A) / µ(V).    (8.92)

Since µ(A) > 0 and µ(V ) < ∞ the equality in (8.92) leads to a contradiction.
Hence assertion (a) follows.
(b) Fix ε > 0, x ∈ E, and f ∈ L1(E, µ), f ≥ 0. Since µ(E) = ∞ and
µ is σ-finite there exists a subset B ∈ E such that µ(B) < ∞ and
∫_E f dµ / µ(B) < ε/2. By (8.75) of Proposition 8.26 there exists a random
variable t_ε which is Px-almost surely finite such that

    ∫_0^t f(X(s)) ds / ∫_0^t 1_B(X(s)) ds ≤ ∫_E f dµ / µ(B) + ε/2 ≤ ε/2 + ε/2 = ε  for all t ≥ t_ε.    (8.93)

Since

    (1/t) ∫_0^t f(X(s)) ds ≤ ∫_0^t f(X(s)) ds / ∫_0^t 1_B(X(s)) ds

assertion (b) follows from (8.93).


(c) This assertion is an immediate consequence of Proposition 8.26.
Altogether this completes the proof of Corollary 8.36.
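Assertion (c) is the ergodic theorem for a finite invariant measure: time averages along a trajectory converge to the space average ∫_E f dµ / µ(E). Its discrete skeleton is easy to observe numerically. The following sketch uses a hypothetical two-state chain (the transition matrix and the function f are illustrative choices, not taken from the text):

```python
import random

# Hypothetical two-state Markov chain; P and f are illustrative.
P = [[0.9, 0.1],
     [0.2, 0.8]]
f = [1.0, 3.0]

# Invariant distribution solving pi P = pi:  pi = (2/3, 1/3).
pi = [2.0 / 3.0, 1.0 / 3.0]
space_avg = pi[0] * f[0] + pi[1] * f[1]      # = 5/3

random.seed(0)
x, total, n = 0, 0.0, 200_000
for _ in range(n):
    total += f[x]
    x = 0 if random.random() < P[x][0] else 1
time_avg = total / n

print(abs(time_avg - space_avg) < 0.05)
```

The chain mixes quickly (second eigenvalue 0.7), so the empirical average over 200 000 steps sits well within the tolerance.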

In the proof of Proposition 8.37 below we need Theorem 9.4 of Chapter 9. It
is taken from Jamison and Orey [117], Theorem 1 and Lemma 3. A result like
Lemma 3 can also be found in Meyn and Tweedie [162], Theorem 18.1.2. The
result is called Orey's convergence theorem. Let ν be a measure on E. The
measures P(t)*ν, t ≥ 0, are defined by B 7→ ∫_E P(t, x, B) dν(x). The following
proposition should be compared with Theorem 9.4. For the notion of "Harris
recurrence" of Markov chains see Definition 9.5 in Chapter 9. Definitions
8.19 and 8.20 contain the corresponding notions for continuous-time Markov
processes.

Proposition 8.37. Let the hypotheses and notation be as in Proposition 8.26.
Let µ be a σ-finite invariant measure. Then the Markov chain (X(n) : n ∈ N)
is µ-Harris recurrent, and

    lim_{t→∞} Var(P(t)*µ2 − P(t)*µ1) = 0    (8.94)

for all probability measures µ1 and µ2 on E.

Remark 8.38. The proof of Proposition 8.37 yields a slightly stronger result
than (8.94). In fact by (8.110) we have

    lim_{t→∞} ∫∫_{E×E} Var(P(t, x, ·) − P(t, y, ·)) dµ1(x) dµ2(y) = 0.    (8.95)

It is clear that the result in (8.95) is stronger than (8.94). Moreover, the
function t 7→ ∫∫_{E×E} Var(P(t, x, ·) − P(t, y, ·)) dµ1(x) dµ2(y) decreases, so
that (8.95) follows once we know it for any sequence (t_n : n ∈ N) which
increases to ∞. Put

    (αR(α))^n 1_B(x) = ∫_0^∞ (α^n t^{n−1} / (n − 1)!) e^{−αt} P(t, x, B) dt = Px ⊗ π_0[X(T_n) ∈ B],

where α > 0, n ∈ N, and x ∈ E. Here the process (T_n : n ∈ N) consists of the
jump times of a Poisson process

    { (Λ, G, π_t)_{t≥0}, (N(t), t ≥ 0), (ϑ_t^P : t ≥ 0), [0, ∞) }

which has intensity α_0, and which is independent of the strong Markov process

    { (Ω, F, Px)_{x∈E}, (X(t), t ≥ 0), (ϑ_t, t ≥ 0), (E, E) }.

For more details see (9.114), (9.117), and Lemma 9.55 in Chapter 9. Again let
µ1 and µ2 be probability measures on E. Fix α_0 > 0. Then, under the conditions
of Proposition 8.37 we have

    lim_{α↓0} ∫∫_{E×E} Var( αR(α)1_(·)(x) − αR(α)1_(·)(y) ) dµ1(x) dµ2(y)
      = lim_{n→∞} ∫∫_{E×E} Var( (α_0 R(α_0))^n 1_(·)(x) − (α_0 R(α_0))^n 1_(·)(y) ) dµ1(x) dµ2(y)
      = 0.    (8.96)
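The weight α^n t^{n−1} e^{−αt}/(n − 1)! in the displayed formula for (αR(α))^n is the Erlang(n, α) density, that is, the law of the n-th jump time T_n of a Poisson process with intensity α. A quick numerical sanity check (parameter values illustrative) that this kernel integrates to one, so that (αR(α))^n 1 = 1:

```python
import math

# Erlang(n, alpha) density: law of the n-th Poisson jump time T_n.
def erlang_density(t, n, alpha):
    return alpha ** n * t ** (n - 1) * math.exp(-alpha * t) / math.factorial(n - 1)

alpha, n, dt = 2.0, 3, 1e-3          # illustrative values
# midpoint rule on [0, 30]; the tail beyond 30 is negligible (~ e^{-60})
total = sum(erlang_density((k + 0.5) * dt, n, alpha) * dt for k in range(30_000))
print(abs(total - 1.0) < 1e-4)
```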

Proof. The proof follows the lines of Duflo et al [75], which reduces the proof
to the corresponding result for discrete-time Markov chains: see Jamison and
Orey [116]. Strictly speaking, in [75] the authors only consider a locally
compact state space, but changing to a Polish space does not affect their
proof. Nevertheless we will repeat the arguments.

First notice that the process (X(n) : n ∈ N) is a Markov chain with transition
probability function (x, B) 7→ P(1, x, B), (x, B) ∈ E × E. Since all these
measures are equivalent the chain (X(n) : n ∈ N) is aperiodic: see Proposition
9.2 in Chapter 9 and the comments preceding it. We will check that it is Harris
recurrent. Let µ be the invariant measure, choose an arbitrary B ∈ E for
which 0 < µ(B) < ∞, and put

    R = { Σ_{n=1}^∞ 1_B(X(n)) = ∞ }.    (8.97)

Then ϑ_1^{−1}R ⊂ R, i.e. the event R is ϑ_1-invariant. Hence we either have
Pµ[R] = 0 or Pµ[Ω \ R] = 0: see Theorem 8.32 assertion (c). The mapping
T : L1(Ω, F, Pµ) → L1(Ω, F, Pµ) defined by T u = u ◦ ϑ_1 is a conservative
positive contraction, and 1_B(X(1)) ∈ L1(Ω, F, Pµ). Hence

    Σ_{n=0}^∞ T^n 1_B(X(1)) = Σ_{n=1}^∞ 1_B(X(n)) ∈ {0, ∞}  Pµ-almost everywhere.    (8.98)

If Pµ[R] = 0, then

    0 = Eµ[ Σ_{n=1}^∞ 1_B(X(n)) ] = Σ_{n=1}^∞ ∫_E P(n, x, B) dµ(x) = Σ_{n=1}^∞ µ(B).    (8.99)

Since µ(B) > 0 the equality in (8.99) is a contradiction. It follows that
Pµ[Ω \ R] = 0, and hence there exists a subset N ∈ E such that µ(N) = 0 and
Py[Ω \ R] = 0 for all y ∈ E \ N. So for y ∈ E \ N we have Py[R] = 1. Since
µ(N) = 0 and µ is invariant, we see that ∫_E P(1, z, N) dµ(z) = µ(N) = 0,
and hence P(1, z, N) = 0 for µ-almost all z ∈ E. Since µ is non-trivial, this
implies that P(1, z, N) = 0 for at least one z ∈ E. Since all the measures
B 7→ P(1, z, B), z ∈ E, are equivalent, we see that P(1, z, N) = 0 for all
z ∈ E. Furthermore, for x ∈ E arbitrary, we infer

    Px[R] = Ex[1_R ◦ ϑ_1] = Ex[ Ex[1_R ◦ ϑ_1 | F_1] ]

(Markov property)

    = Ex[ E_{X(1)}[1_R] ] = ∫_E Py[R] P(1, x, dy)

(employ P(1, x, N) = 0)

    = ∫_{E\N} Py[R] P(1, x, dy)

(for y ∈ E \ N the equality Py[R] = 1 holds)

    = ∫_{E\N} P(1, x, dy) = P(1, x, E \ N) = P(1, x, E) = 1.    (8.100)

From (8.100) we get Px[R] = 1 for all x ∈ E. Consequently, the Markov
chain (X(n) : n ∈ N) is Harris recurrent and aperiodic: see Definition 9.5 and
Proposition 9.2 in Chapter 9 and the comments preceding it. From Theorem
9.4, which is Orey's convergence theorem, we obtain

    lim_{n→∞} Var(P(n, x, ·) − P(n, y, ·)) = 0  for all x, y ∈ E.    (8.101)

Next our aim is to establish the triviality of the tail σ-field of the Markov
process (X(t) : t ≥ 0). For the notion of tail σ-field see Definition 8.7. Let
A ∈ I, the tail σ-field. Then for every t ≥ 0 there exists a tail event A_t ∈ I
such that 1_A = 1_{A_t} ◦ ϑ_t (see Definition 8.7). So for x ∈ E we have

    Px[A] = Ex[1_A] = Ex[1_{A_t} ◦ ϑ_t] = Ex[ Ex[1_{A_t} ◦ ϑ_t | F_t] ]
          = Ex[ E_{X(t)}[1_{A_t}] ] = ∫_E Pz[A_t] P(t, x, dz).    (8.102)

By taking t = n → ∞ in (8.101) and employing (8.102) we see that Px[A] =
Py[A] for all x, y ∈ E and A ∈ I. Since A_t ∈ I we see that the function
x 7→ Px[A_t] is constant. From (8.102) it follows that this constant equals the
constant function x 7→ Px[A]. By the martingale convergence theorem we see

    Px[A] = P_{X(n)}[A] = Px[A | F_n] → 1_A  Px-almost surely as n → ∞.    (8.103)

Equality (8.103) implies that either Px[A] = 1 for all x ∈ E or Px[A] = 0 for
all x ∈ E. This proves the triviality of the tail σ-field I. In order to complete
the proof of Proposition 8.37 we proceed as in the proof of Theorem II.4 of
Duflo and Revuz [75], who follow Blackwell and Freedman [33], Theorem 2.
Put F^t = ϑ_t^{−1}F. Then the arguments of Duflo and Revuz read as follows.
First, let B ∈ F and let m be a probability measure on E. Then we have

    Pm[A ∩ B] − Pm[A] Pm[B] = ∫_A (1_B − Pm[B]) dPm
                            = ∫_A ( Pm[1_B | F^t] − Pm[B] ) dPm,

and hence

    sup_{A∈F^t} | Pm[A ∩ B] − Pm[A] Pm[B] | ≤ sup_{A∈F^t} ∫_A | Pm[B | F^t] − Pm[B] | dPm
                                            = ∫ | Pm[B | F^t] − Pm[B] | dPm.    (8.104)

By the backward martingale convergence theorem (see e.g. Doob [141], Theorem
4.2) the limit

    lim_{n→∞} Pm[B | F^{t_n}] = lim_{n→∞} ∫_E Px[B | F^{t_n}] dm(x)    (8.105)

exists Pm-almost surely and in L1(Ω, F, Pm) for all sequences (t_n)_{n∈N} which
increase to ∞. The limit in (8.105) is measurable relative to the tail σ-field I.
Since the tail σ-field is trivial, this means that lim_{t→∞} Pm[B | F^t] = Pm[B],
Pm-almost surely. From (8.104) we see that

    lim_{t→∞} sup_{A∈F^t} | Pm[A ∩ B] − Pm[A] Pm[B] | = 0.    (8.106)

Let (x, y) ∈ E × E, and A_0 ∈ E. We apply (8.106) with m = (1/2)(δ_x + δ_y),
A = {X(t) ∈ A_0}, and B = {X(0) = x} or B = {X(0) = y}. Then we obtain

    lim_{t→∞} sup_{A_0∈E} | P(t, x, A_0) − P(t, y, A_0) | = 0.    (8.107)

Since

    Var(P(t, x, ·) − P(t, y, ·)) ≤ 2 sup_{A∈E} | P(t, x, A) − P(t, y, A) |    (8.108)

equality (8.107) implies

    lim_{t→∞} Var(P(t, x, ·) − P(t, y, ·)) = 0.    (8.109)

Next let µ1 and µ2 be two probability measures on E. Then

    Var( ∫_E P(t, x, ·) dµ1(x) − ∫_E P(t, y, ·) dµ2(y) )
      = Var( ∫∫_{E×E} (P(t, x, ·) − P(t, y, ·)) dµ1(x) dµ2(y) )
      ≤ ∫∫_{E×E} Var(P(t, x, ·) − P(t, y, ·)) dµ1(x) dµ2(y),    (8.110)

and hence by equality (8.107) and inequality (8.110) we obtain

    lim_{t→∞} Var( ∫_E P(t, x, ·) dµ1(x) − ∫_E P(t, y, ·) dµ2(y) ) = 0.    (8.111)

Since equality (8.111) is equivalent to (8.94) this completes the proof of
Proposition 8.37.
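For a chain with finitely many states the convergence (8.94), and the pointwise version (8.101), can be computed exactly by iterating the transition matrix. A minimal sketch with a hypothetical two-state kernel (values illustrative), checking that the total variation distance between the rows of P^n decreases to 0:

```python
# Hypothetical two-state transition kernel.
P = [[0.9, 0.1],
     [0.2, 0.8]]

def matmul(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Pn = [[1.0, 0.0], [0.0, 1.0]]
tv = []
for _ in range(60):
    Pn = matmul(Pn, P)
    # total variation distance between the rows P(n, 0, .) and P(n, 1, .)
    tv.append(sum(abs(Pn[0][j] - Pn[1][j]) for j in range(2)))

print(tv[0] > tv[-1] and tv[-1] < 1e-8)
```

The distance between the two rows shrinks by the factor 0.7 (the second eigenvalue of P) at every step, mirroring Orey's theorem in this toy setting.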
The following theorem is another version of Theorem 8.21.

Theorem 8.39. Let the Markov process have right-continuous sample paths,
be strong Feller, and irreducible. Let A be a recurrent compact subset of the
state space E, and K ⊂ E any compact subset. Then there exists a closed
neighborhood K_ε with K in its interior such that for h > 0

    sup_{x∈E} ∫_0^∞ Px[X(t) ∈ K_ε, h + τ_A ◦ ϑ_h > t] dt < ∞.    (8.112)

Proof. Without loss of generality we may and shall assume that K ⊃ A;
otherwise replace K by K ∪ A. By the arguments following (9.190) in the proof
of Theorem 9.36 there exists ε > 0 such that

    sup_{y∈E} Ey[ ∫_0^{h+τ_A◦ϑ_h} 1_{K_ε}(X(ρ)) dρ ] < ∞,    (8.113)

where K_ε is an ε-neighborhood of K. This completes the proof of Theorem
8.39.

By definition we have

    R_A(0)f(x) = ∫_0^∞ Ex[f(X(ρ)), τ_A > ρ] dρ = Ex[ ∫_0^{τ_A} f(X(ρ)) dρ ]    (8.114)

for those Borel measurable functions f for which the integrals in (8.114) exist.
As a corollary to Theorem 8.39 we have the following result. The proof follows
by observing that τ_A ≤ h + τ_A ◦ ϑ_h and using the definition of R_A(0)f: see
(8.114).
Corollary 8.40. Let the hypotheses and notation be as in Theorem 8.39. In
addition, let K be a compact subset of E and h > 0. Then there exists a
bounded function f ∈ C_b(E), 1_K ≤ f ≤ 1, such that

    sup_{y∈E} R_A(0)f(y) = sup_{y∈E} Ey[ ∫_0^{τ_A} f(X(ρ)) dρ ]
                        ≤ sup_{y∈E} Ey[ ∫_0^{h+τ_A◦ϑ_h} f(X(ρ)) dρ ] < ∞.    (8.115)

Let f be as in (8.115). Then there exists a constant C_f such that for all
g ∈ C_b(E) the following inequality holds:

    sup_{y∈E} R_A(0)(|g| f)(y) ≤ C_f ‖g‖_∞.    (8.116)

Proof. Let Kε be as in Theorem 8.39 and choose f ∈ Cb (E) in such a way


that 1K ≤ f ≤ 1Kε . Then f satisfies (8.115), and (8.116) is satisfied with Cf
given by the right-hand side of (8.115). Altogether this completes the proof
of Corollary 8.40.

Remark 8.41. Of course, the estimate in (8.8) in Corollary 8.4 gives an
interesting lower bound for gap(L) only in case λ_min(a) > 0; we always have
λ_min(a) ≥ 0. Condition (8.5) and the finiteness of a in Theorem 8.3 can be
replaced by a Γ2-condition without violating the conclusion in (8.6). In fact
a condition of the form Γ2(f, f) ≥ γ Γ1(f, f), f ∈ A, yields a stronger result:
see Theorem 8.71 and Example 8.77, Proposition 8.79 and the formulas (8.247)
and (8.248). It is also noticed that in the presence of an operator L as
described in (8.1), and the corresponding squared gradient operator

    Γ1(f, g)(x) = (1/2) Σ_{i,j=1}^d a_{i,j}(x) (∂f(x)/∂x_i)(∂g(x)/∂x_j),    (8.117)

the standard Euclidean distance is not necessarily the "natural" distance for
problems related to the presence of a spectral gap. In fact the more adapted
distance d_L or d_{Γ1} is probably given by the following formula:

    d_L(x, y) = sup{ |f(x) − f(y)| : Γ1(f, f) ≤ 1, f ∈ D(Γ1) }.    (8.118)

One of the tools used in estimates related to coupling methods is finding the
correct metric on R^d × R^d which serves as a "prototype" estimate. The reader
should compare this observation with comments and techniques used by Chen
and Wang in e.g. [53, 54, 55]. In Lemma 8.50 below the standard Euclidean
distance is used (as in [53]). In fact, it might be more appropriate to use the
distance presented in (8.118). As remarked earlier, this technique might lead
to geometric considerations related to Γ2-calculus.
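In one dimension, with Γ1(f, f)(x) = a(x) f′(x)², the supremum in (8.118) is realized along |f′| = a^{−1/2}, so d_L(x, y) = ∫_x^y a(s)^{−1/2} ds. A small numerical sketch; the coefficient a(x) = 1 + x² is an illustrative choice for which the integral is an inverse hyperbolic sine:

```python
import math

# Intrinsic distance d_L(x, y) = integral_x^y a(s)**(-1/2) ds for the
# illustrative coefficient a(s) = 1 + s**2 (then the integral is arcsinh).
def d_L(x, y, n=100_000):
    # midpoint rule for the integral of a(s)**(-1/2)
    h = (y - x) / n
    return sum((1.0 + (x + (k + 0.5) * h) ** 2) ** -0.5 for k in range(n)) * h

approx = d_L(0.0, 2.0)
exact = math.asinh(2.0) - math.asinh(0.0)
print(abs(approx - exact) < 1e-6)
```

Note how the intrinsic distance grows only logarithmically here, while the Euclidean distance grows linearly, which is the kind of discrepancy the remark alludes to.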
The proof of Theorem 8.3 will be based on coupling arguments. In the present
situation we will consider unique weak solutions to the following stochastic
differential equation in R^d × R^d:

    (X(t), Y(t))^T = (X(s), Y(s))^T + ∫_s^t (σ(ρ, X(ρ)), σ(ρ, Y(ρ)))^T dW(ρ) + ∫_s^t (b(ρ, X(ρ)), b(ρ, Y(ρ)))^T dρ.    (8.119)

Of course this equation is a natural analog of an equation of the form

    X(t) = X(s) + ∫_s^t σ(ρ, X(ρ)) dW(ρ) + ∫_s^t b(ρ, X(ρ)) dρ.    (8.120)
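A discrete sketch of the coupling (8.119) with the choice α = I_d discussed below: both components are driven by the same Brownian increments. The Euler-Maruyama scheme here uses illustrative coefficients b(x) = −x and constant σ = 1, not taken from the text; with these choices the noise cancels in the difference and the two solutions contract exponentially:

```python
import math, random

# Illustrative coefficients for the coupled equation.
def b(x):
    return -x

SIGMA = 1.0

random.seed(1)
dt, steps = 0.01, 1000            # simulate on [0, 10]
x, y = 5.0, -5.0
for _ in range(steps):
    dw = random.gauss(0.0, math.sqrt(dt))   # shared Brownian increment dW
    x += b(x) * dt + SIGMA * dw
    y += b(y) * dt + SIGMA * dw

# With constant sigma the increment dW cancels in x - y, which therefore
# contracts deterministically like (1 - dt)**steps ~ e**(-10).
print(abs(x - y) < 1e-3)
```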
In equation (8.119) the column vector (X(s), Y(s))^T can be prescribed, and in
(8.120) we may prescribe X(s). Let us introduce the coupling operator L̃ as
follows:

    L̃_s f(x, y) = (1/2) Σ_{i,j=1}^d a_{i,j}(s, x, x) ∂²f(x, y)/∂x_i∂x_j + (1/2) Σ_{i,j=1}^d a_{i,j}(s, x, y) ∂²f(x, y)/∂x_i∂y_j
                 + (1/2) Σ_{i,j=1}^d a_{i,j}(s, y, x) ∂²f(x, y)/∂y_i∂x_j + (1/2) Σ_{i,j=1}^d a_{i,j}(s, y, y) ∂²f(x, y)/∂y_i∂y_j
                 + Σ_{i=1}^d b_i(s, x) ∂f(x, y)/∂x_i + Σ_{i=1}^d b_i(s, y) ∂f(x, y)/∂y_i.    (8.121)

Here the matrix a(s, x, y) = (a_{i,j}(s, x, y))_{i,j=1}^d is given by

    a_{i,j}(s, x, y) = (σ(s, x)σ(s, y)*)_{i,j} = Σ_{k=1}^d σ_{i,k}(s, x) σ_{j,k}(s, y).

It follows that the diffusion matrix ã(s, x, y) of the operator L̃_s and the
drift vector b̃(s, x, y) are given by, respectively,

    ã(s, x, y) = ( σ(s,x)σ(s,x)*  σ(s,x)σ(s,y)* )
                 ( σ(s,y)σ(s,x)*  σ(s,y)σ(s,y)* )

               = ( σ(s,x)    0    ) ( I_d  I_d ) ( σ(s,x)*    0     )
                 (   0    σ(s,y)  ) ( I_d  I_d ) (   0     σ(s,y)*  ),    (8.122)

and

    b̃(s, x, y) = ( b(s, x) )
                 ( b(s, y) ).    (8.123)
Here I_d is the d × d identity matrix. Notice that

    ( I_d  I_d )   ( α  β ) ( α*  α* )
    ( I_d  I_d ) = ( α  β ) ( β*  β* ),

where the d × d matrices α and β are chosen in such a way that αα* + ββ* = I_d.
The stochastic differential equation in (8.119) corresponds to the choice α = I_d
(and β = 0). We also assume that the corresponding martingale problem is
well-posed. In the present context the corresponding martingale problem reads
as follows. For every pair (x, y) ∈ R^d × R^d, and s ≥ 0, find a probability
measure P_{s,x,y} on the path space of R^d × R^d-valued trajectories which makes
the process

    f(t, X(t), Y(t)) − f(s, X(s), Y(s)) − ∫_s^t L̃f(ρ, X(ρ), Y(ρ)) dρ    (8.124)

a P_{s,x,y}-martingale with respect to the filtration determined by the Brownian
motion {W(s) : s ≥ 0}. Moreover, we want the probability measure P_{s,x,y} to
be such that P_{s,x,y}[X(s) = x, Y(s) = y] = 1. For more details on the
martingale problem the reader is referred to e.g. items (c) and (d) of Theorem
1.39. Saying that the martingale problem is well posed for the operator L̃ is
equivalent to saying that the stochastic differential equation in (8.119) has
unique weak solutions. If the coefficients σ(s, x) and b(s, x) are such that the
equation in (8.119) has unique strong solutions, then it possesses unique weak
solutions, and hence the martingale problem is well-posed for L̃. For more
details the reader is referred to §9.6 in Chapter 9. Let the pair (X(t), Y(t))
be a unique weak solution to the coupled stochastic differential equation
(8.119) starting at time s in (X(s), Y(s)) = (x, y) in R^d × R^d. Then we
define the stopping time τ by

    τ = inf{ t > 0 : X(t) = Y(t) },

if there exists t ∈ (0, ∞) such that X(t) = Y(t). If no such finite t exists, then
we write τ = ∞. The following theorem can be found in Chen and Wang [53]
as Theorem 3.1. Their proof uses an approximation argument. As Chen and
Wang indicate, it is also a consequence of Theorems 6.1.3, 8.1.3 (and 10.1.1)
in Stroock and Varadhan [223].

Theorem 8.42. Suppose that the martingale problem is well posed for the
operator L̃, or, what is equivalent, suppose that the pair (σ(s, x), b(s, x))
possesses unique weak solutions. Let P_{s,x,y} be the unique solution to the
martingale problem starting at the pair (x, y). Then X(t) = Y(t) P_{s,x,y}-almost
surely on the event {τ ≤ t}.
The proof of Theorem 8.42 will be given after Remark 8.49 below.
The following definition is taken from Stroock and Varadhan [223], Chapter 8.
The connection with the well-posedness of the martingale problem will be
explained in §9.6 in Chapter 9. In particular, the martingale problem is
well-posed for the operator L if the pair (σ(t, y), b(t, y)), t ≥ 0, y ∈ R^d,
satisfies Itô's uniqueness condition from every point (s, x) ∈ [0, ∞) × R^d.
In fact this is the theorem of Watanabe and Yamada [252].
Definition 8.43. Let (s, x) ∈ [0, ∞) × R^d. The pair (σ(t, y), b(t, y)), t ≥ s,
y ∈ R^d, is said to possess at most one weak solution from (s, x) if and only if
for every probability space (Ω, F, P), every non-decreasing family {F_t : t ≥ 0}
of sub-σ-fields of F, and every triple β : [0, ∞) × Ω → R^d, ξ : [0, ∞) × Ω → R^d,
and η : [0, ∞) × Ω → R^d such that (Ω, F_t, P; β(t)) is a d-dimensional Brownian
motion, and the equations

    ξ(t) = x + ∫_s^{s∨t} σ(ρ, ξ(ρ)) dβ(ρ) + ∫_s^{s∨t} b(ρ, ξ(ρ)) dρ,  t ≥ 0,

and

    η(t) = x + ∫_s^{s∨t} σ(ρ, η(ρ)) dβ(ρ) + ∫_s^{s∨t} b(ρ, η(ρ)) dρ,  t ≥ 0,

hold P-almost surely, then ξ(t) = η(t) P-almost surely. Instead of saying that
the pair possesses a "unique weak solution from (s, x)", it is also customary to
say that for the pair (σ(s, x), b(s, x)) Itô's uniqueness condition is satisfied
from (s, x), or after s starting from x.
The following definition specializes items (c) and (d) in Theorem 1.39 to the
case of the differential operator L = {L_t : t ≥ 0} as exhibited in (8.1).

Definition 8.44. Let the operator L be given by (8.1), and let

    Ω = (R^d)^{[0,∞)}, and X(t)(ω) = X(t, ω) = ω(t), ω ∈ Ω, t ≥ 0.

Put F_t^s = σ(X(ρ) : s ≤ ρ ≤ t), 0 ≤ s ≤ t < ∞, and F = σ(X(s) : s ≥ 0).
The martingale problem is said to be well-posed for the operator L starting
from (s, x) ∈ [0, ∞) × R^d if there exists a unique probability measure P on
(Ω, F) with the following properties:

(a) P[X(t) = x : 0 ≤ t ≤ s] = 1.
(b) For every f ∈ C_0(R^d) the process t 7→ f(X(t)) − f(X(s)) − ∫_s^t L_ρ f(X(ρ)) dρ
    is a P-martingale with respect to the filtration (F_t^s)_{t≥s}.
Let (Ω, Fts , P)t≥s be a filtered probability space, and let (t, ω) 7→ X(t, ω) be
a progressively measurable process. There are several equivalent formulations
for the process X possessing properties (a) and (b) on some probability space.
The reader is referred to e.g. Theorem 4.2.1 in Stroock and Varadhan [223].
We begin by defining a progressively measurable process.
Definition 8.45. Let (Ω, F_t^s)_{t≥s} be a filtered space, and let (E, E) be a
measurable space. Let X : [s, ∞) × Ω → E be a process (or just a function).
The process X is called progressively measurable if for every t_1, t_2 with
s ≤ t_1 < t_2 < ∞ the function X : [t_1, t_2] × Ω → E is B_{[t_1,t_2]} × F_{t_2}^s-E-measurable.
The symbol B_{[t_1,t_2]} stands for the Borel field of the interval [t_1, t_2].
If E is a topological space with Borel field E, and if X is right-continuous,
then X is progressively measurable relative to the filtration (Ω, F_{t+}^s)_{t≥s}.
Here F_{t+}^s = ∩_{ρ>t} F_ρ^s.

Definition 8.46. Let (Ω, F_t^s, P)_{t≥s} be a filtered probability space and let the
progressively measurable process X(t) have the properties in (a) and (b) of
Definition 8.44 relative to the present filtered probability space. Then X(t) is
called an Itô process on (Ω, F_t^s, P)_{t≥s} with covariance matrix a(t, x) and
drift vector b(t, x), (t, x) ∈ [0, ∞) × R^d.

In fact the same definition can be used if the coefficients a(t) and b(t) are
progressively measurable processes.
The following theorem says that an Itô process after a stopping time is again
an Itô process. It is the same as Theorem 6.1.3 in Stroock and Varadhan [223];
S_d stands for the symmetric d × d matrices with real entries.

Theorem 8.47. Let (Ω, F_t^s, P) be a filtered probability space, and let
a : [s, ∞) × Ω → S_d and b : [s, ∞) × Ω → R^d be bounded progressively
measurable functions. Moreover, let X : [s, ∞) × Ω → R^d be an Itô process
with covariance a and drift b, and let τ : Ω → [s, ∞) be an (F_t^s)_{t≥s}-stopping
time. Suppose that the process t 7→ X(t) is right-continuous and P-almost surely
continuous. Let ω 7→ Q_ω be the regular conditional probability distribution
corresponding to the conditional probability A 7→ P[A | F_τ^s]. Then there exists
a P-null set N such that t 7→ X(t) is an Itô process on [τ(ω), ∞) relative to
Q_ω, ω ∉ N.
For a proof of Theorem 8.47 we refer the reader to Stroock and Varadhan
[223]. The function (ω, A) 7→ Q_ω(A), ω ∈ Ω, A ∈ F_s = σ(X(ρ) : ρ ≥ s),
possesses the following properties:

(a) For every B ∈ F_s the function ω 7→ Q_ω[B] is F_τ^s-measurable;
(b) For every A ∈ F_τ^s and B ∈ F_s the following equality holds:

    P[A ∩ B] = ∫_A Q_ω[B] dP;    (8.125)

(c) There exists a P-negligible event N such that Q_ω[A(ω)] = 1 for all ω ∉ N.

In item (c) we write A(ω) = ∩{ A : A ∋ ω, A ∈ F_τ^s }, ω ∈ Ω. Property (c)
expresses the regularity of the conditional probability Q_ω. Property (b) is a
quantitative property pertaining to the definition of conditional expectation,
and (a) is a qualitative property defining conditional expectation.
The following theorem appears as Theorem 8.1.3 in Stroock and Varadhan [223].

Theorem 8.48. Let a and b be bounded Borel measurable functions which attain
their values in S_d and R^d respectively. Define the matrix functions ã and b̃
as in (8.122) and (8.123). Then the coefficients σ and b satisfy Itô's
uniqueness condition starting from (s, y) if and only if any solution P̃ to the
martingale problem relative to the operator L̃ from (s, y, y) has the property
that P̃[X(t) = Y(t), t ≥ s] = 1. Here the processes X(t) and Y(t) attain their
values in R^d and are such that for all f ∈ C_0²(R^d × R^d) the process

    t 7→ f(X(t), Y(t)) − f(X(s), Y(s)) − ∫_s^t L̃_ρ f(X(ρ), Y(ρ)) dρ,  t ≥ s,

is a P̃-martingale after s relative to the filtration determined by the σ-fields
F_t^s = σ((X(ρ), Y(ρ)) : ρ ∈ [s, t]).

Remark 8.49. In both Theorems 8.47 and 8.48 the bounded progressively
measurable processes t 7→ a(t) and t 7→ b(t) may be replaced with locally
bounded Borel measurable functions from [0, ∞) × R^d to S_d and R^d
respectively. Of course the processes a(t) and b(t) then have to be read as
a(t, X(t)) and b(t, X(t)) respectively. This is a consequence of Theorem 10.1.1
in Stroock and Varadhan [223].

Proof (Proof of Theorem 8.42). The result in Theorem 8.42 is a consequence of
Theorem 8.47 in conjunction with Theorem 8.48. In fact Theorem 8.47 reduces
stopping at the stopping time τ to stopping at a fixed time of the form τ(ω),
where ω ∈ Ω is fixed. Since at time τ(ω) we have X(τ(ω)) = Y(τ(ω)), Theorem
8.48 shows that the coupling is successful (i.e. X(t)(ω) = Y(t)(ω) Q_ω-almost
surely for P_{s,x,y}-almost all ω) in case the pair (σ(t, x), b(t, x)) consists of
bounded functions and admits unique weak solutions. It then follows that
X(t) = Y(t) P_{s,x,y}-almost surely on the event {τ ≤ t}. In formulas the
arguments read as follows. From Theorem 8.47 we have

    P[X(t) = Y(t) | F_τ^s] 1_{τ≤t, X(s)=x, Y(s)=y}
      = P_{τ,X(τ),Y(τ)}[X(t) = Y(t)] 1_{τ≤t, X(s)=x, Y(s)=y}.    (8.126)

From Theorem 8.48 and (8.126) we get

    P_{s,x,y}[X(t) = Y(t), τ ≤ t]
      = P[X(t) = Y(t), X(s) = x, Y(s) = y, τ ≤ t]
      = E[ P[X(t) = Y(t) | F_τ^s], X(s) = x, Y(s) = y, τ ≤ t ]
      = E[ P_{τ,X(τ),Y(τ)}[X(t) = Y(t)], X(s) = x, Y(s) = y, τ ≤ t ]
      = E[1, X(s) = x, Y(s) = y, τ ≤ t] = P[X(s) = x, Y(s) = y, τ ≤ t]
      = P_{s,x,y}[τ ≤ t].    (8.127)

From (8.127) we infer that X(t) = Y(t) P_{s,x,y}-almost surely on the event
{τ ≤ t}. Remark 8.49 takes care of locally bounded coefficients.

In the following lemma we suppose that the operator L is time-independent.

Lemma 8.50. Let the process (X(t), Y(t)) be a coupling of the L-diffusion
process. If there exists γ ∈ R such that

    E_{x,y}[ |X(t) − Y(t)|² ] ≤ |x − y|² e^{−γt}    (8.128)

for all t ≥ 0 and all (x, y) ∈ R^d × R^d, then

    |∇e^{tL}f|² ≤ e^{−γt} e^{tL}|∇f|²    (8.129)

for all functions f ∈ C¹(R^d) with a bounded gradient which is uniformly
continuous.

Proof. Let τ be the coupling time of the processes X(t) and Y(t) solving the
coupled stochastic differential equation (8.119), and let f ∈ C_b(R^d) have a
uniformly bounded gradient ∇f. Then by the Cauchy-Schwarz inequality and
(8.128) we have

    |e^{tL}f(x) − e^{tL}f(y)|² / |x − y|²
      = | E_{x,y}[ (f(X(t)) − f(Y(t))) / |X(t) − Y(t)| · |X(t) − Y(t)| / |x − y|, τ > t ] |²
      ≤ E_{x,y}[ |f(X(t)) − f(Y(t))|² / |X(t) − Y(t)|², τ > t ] E_{x,y}[ |X(t) − Y(t)|² / |x − y|², τ > t ]
      ≤ e^{−γt} E_{x,y}[ |f(X(t)) − f(Y(t))|² / |X(t) − Y(t)|², τ > t ]
      = e^{−γt} E_{x,y}[ | ∫_0^1 ⟨ ∇f((1 − s)Y(t) + sX(t)), (X(t) − Y(t)) / |X(t) − Y(t)| ⟩ ds |², τ > t ]
      ≤ e^{−γt} E_{x,y}[ ∫_0^1 |∇f((1 − s)Y(t) + sX(t))|² ds, τ > t ].    (8.130)

Next fix ε > 0, and choose δ > 0 in such a way that |y − z| ≤ δ implies
|∇f(y)|² ≤ |∇f(z)|² + ε². Then from (8.130) we obtain:
    |e^{tL}f(x) − e^{tL}f(y)|² / |x − y|²
      ≤ e^{−γt} E_{x,y}[ ∫_0^1 |∇f((1 − s)Y(t) + sX(t))|² ds, |Y(t) − X(t)| ≤ δ ]
        + e^{−γt} E_{x,y}[ ∫_0^1 |∇f((1 − s)Y(t) + sX(t))|² ds, |Y(t) − X(t)| > δ ]
      ≤ e^{−γt} E_{x,y}[ |∇f(X(t))|², |Y(t) − X(t)| ≤ δ ]
        + e^{−γt} ‖∇f‖²_∞ P_{x,y}[ |Y(t) − X(t)| > δ ] + e^{−γt} ε²
      ≤ e^{−γt} E_{x,y}[ |∇f(X(t))|², |Y(t) − X(t)| ≤ δ ]
        + e^{−γt} (1/δ²) ‖∇f‖²_∞ E_{x,y}[ |Y(t) − X(t)|² ] + e^{−γt} ε²

(use (8.128))

      ≤ e^{−γt} E_{x,y}[ |∇f(X(t))|² ] + e^{−2γt} (1/δ²) ‖∇f‖²_∞ |y − x|² + e^{−γt} ε².    (8.131)

In (8.131) we let y tend to x to obtain:

    |∇e^{tL}f|² ≤ e^{−γt} E_{x,x}[ |∇f(X(t))|² ] + e^{−γt} ε².    (8.132)

Since E_{x,x}[g(X(t))] = e^{tL}g(x), g ∈ C_b(R^d), and ε > 0 is arbitrary, the
conclusion in Lemma 8.50 follows from (8.132).
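The gradient estimate (8.129) can be checked in closed form for the one-dimensional Ornstein-Uhlenbeck generator L = (1/2) d²/dx² − θx d/dx, for which (8.128) holds with γ = 2θ. With f(x) = sin x both sides of (8.129) are explicit Gaussian expectations; the parameter values below are illustrative:

```python
import math

# For the OU semigroup, X(t) | X(0)=x is Gaussian with mean m*x and
# variance s2, where m = e^{-theta t}, s2 = (1 - e^{-2 theta t})/(2 theta).
# Then  e^{tL} sin (x) = e^{-s2/2} sin(m x)  and
#       e^{tL}|f'|^2 (x) = E cos^2(X(t)) = (1 + e^{-2 s2} cos(2 m x))/2.
theta, t = 1.0, 0.7
m = math.exp(-theta * t)
s2 = (1.0 - math.exp(-2.0 * theta * t)) / (2.0 * theta)

ok = True
for i in range(-50, 51):
    x = 0.1 * i
    lhs = (m * math.exp(-s2 / 2.0) * math.cos(m * x)) ** 2   # |(e^{tL} f)'|^2
    rhs = math.exp(-2.0 * theta * t) * 0.5 * (1.0 + math.exp(-2.0 * s2) * math.cos(2.0 * m * x))
    ok = ok and lhs <= rhs + 1e-12                           # (8.129), gamma = 2*theta
print(ok)
```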

We conclude this section with a proof of Theorem 8.1.

Proof (Proof of Theorem 8.1). Put h(x, y) = |x − y|². From the representation
in (8.121) of the operator L̃, which now does not depend on t, we see that

    L̃h(x, y) = trace( (σ(x) − σ(y))(σ(x) − σ(y))* ) + 2⟨b(x) − b(y), x − y⟩.    (8.133)

From Itô's formula and (8.119) we get


    |X(t) − Y(t)|² = |X(0) − Y(0)|²
      + 2 Σ_{k,ℓ=1}^d ∫_0^t (X_k(s) − Y_k(s)) (σ_{k,ℓ}(X(s)) − σ_{k,ℓ}(Y(s))) dW_ℓ(s)
      + 2 Σ_{k=1}^d ∫_0^t (X_k(s) − Y_k(s)) (b_k(X(s)) − b_k(Y(s))) ds
      + Σ_{i,k=1}^d ∫_0^t (σ_{i,k}(X(s)) − σ_{i,k}(Y(s)))² ds
    = |X(0) − Y(0)|²
      + 2 Σ_{k,ℓ=1}^d ∫_0^t (X_k(s) − Y_k(s)) (σ_{k,ℓ}(X(s)) − σ_{k,ℓ}(Y(s))) dW_ℓ(s)
      + 2 ∫_0^t ⟨X(s) − Y(s), b(X(s)) − b(Y(s))⟩ ds
      + ∫_0^t trace( (σ(X(s)) − σ(Y(s)))(σ(X(s)) − σ(Y(s)))* ) ds.    (8.134)
Put φ(t) = E_{x,y}[ |X(t) − Y(t)|² ]. Then (8.134) and the definition of γ in
(8.5) show that φ′(t) ≤ −γφ(t). It follows that φ(t) ≤ φ(0)e^{−γt}. From Lemma
8.50, in particular from (8.129), we see that

    |∇e^{tL}f|² ≤ e^{−γt} e^{tL}|∇f|²,    (8.135)
for all functions f ∈ C_b(R^d) with a bounded uniformly continuous gradient.
Let f ∈ C_b(R^d) be such a function. Then from (8.135) we infer

    e^{tL}|f|² − |e^{tL}f|² = ∫_0^t (∂/∂s)( e^{sL} |e^{(t−s)L}f|² ) ds
      = ∫_0^t e^{sL} ⟨ a∇e^{(t−s)L}f, ∇e^{(t−s)L}f ⟩ ds    (8.136)
      ≤ a ∫_0^t e^{sL} |∇e^{(t−s)L}f|² ds
      ≤ a ∫_0^t e^{−(t−s)γ} e^{sL} e^{(t−s)L} |∇f|² ds
      = a ((1 − e^{−γt}) / γ) e^{tL}|∇f|².    (8.137)
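The Gronwall step used above, that φ′(t) ≤ −γφ(t) forces φ(t) ≤ φ(0)e^{−γt}, can be illustrated by a discrete sketch; the nonnegative remainder r(t) below is an arbitrary illustrative choice making the differential inequality strict:

```python
import math

# Discrete Gronwall illustration: phi' = -gamma*phi - r with r >= 0, so
# phi' <= -gamma*phi, and the bound phi(0)*exp(-gamma*t) must hold.
gamma, dt, steps = 1.5, 1e-3, 4000
phi, ok = 2.0, True
for k in range(steps):
    t = k * dt
    r = 0.3 * (1.0 + math.sin(t)) ** 2     # nonnegative forcing (illustrative)
    phi += (-gamma * phi - r) * dt         # Euler step
    ok = ok and phi <= 2.0 * math.exp(-gamma * (t + dt)) + 1e-9
print(ok)
```

Since 1 − γ dt ≤ e^{−γ dt}, the discrete iterates stay below the exponential envelope at every step, which is exactly the comparison argument used in the proof.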

8.2 Some related stability results


Let L be the generator of a T_β-diffusion which, by definition, is a
time-homogeneous Markov process

    { (Ω, F_t^0, Px), (X(t), t ≥ 0), (ϑ_t, t ≥ 0), (E, E) }.

Let Γ1 be the corresponding squared gradient operator. With f ∈ D(L) we
associate the martingale t 7→ M_f(t) defined by
    M_f(t) = f(X(t)) − f(X(0)) − ∫_0^t Lf(X(s)) ds.

For more details on squared gradient operators see e.g. Bakry [16] and [17].
Then for f, g ∈ D(L) we have

    ⟨M_f, M_g⟩(t) = ∫_0^t Γ1(f, g)(X(s)) ds.    (8.138)

Denote by {e^{tL} : t ≥ 0} the semigroup generated by L.
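Identity (8.138) can be tested by Monte Carlo for Brownian motion, where L = (1/2) d²/dx² and Γ1(f, f) = |f′|². With f(x) = x² and X(0) = 0 the martingale is M_f(t) = X(t)² − t, and (8.138) predicts E[M_f(t)²] = E ∫_0^t 4X(s)² ds = 2t². A sketch (sample sizes illustrative):

```python
import math, random

# E[M_f(t)^2] equals the expected quadratic variation int_0^t 4 X(s)^2 ds,
# whose exact value is 2*t**2 for Brownian motion started at 0.
random.seed(2)
t, paths = 1.0, 20_000
acc = 0.0
for _ in range(paths):
    x = random.gauss(0.0, math.sqrt(t))    # X(t) ~ N(0, t), exact sampling
    acc += (x * x - t) ** 2                # M_f(t)^2 = (X(t)^2 - t)^2
estimate = acc / paths
print(abs(estimate - 2.0 * t * t) < 0.25)
```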
¡ ¢
Theorem 8.51. Let f ∈ D(L²). Then the following identities hold for ρ, t ≥ 0
and x ∈ E:

    f(X(ρ + t)) − E_{X(ρ)}[f(X(t))]    (8.139)
      = M_f(ρ + t) − M_f(ρ) + ∫_ρ^{ρ+t} { M_{e^{(ρ+t−σ)L}Lf}(σ) − M_{e^{(ρ+t−σ)L}Lf}(ρ) } dσ
      = M_f(ρ + t) − M_f(ρ) + ∫_0^t { M_{e^{(t−σ)L}Lf}(ρ + σ) − M_{e^{(t−σ)L}Lf}(ρ) } dσ,

and

    Ex[ |f(X(ρ + t)) − E_{X(ρ)}[f(X(t))]|² ]
      = Ex[ |f(X(ρ + t))|² ] − Ex[ |E_{X(ρ)}[f(X(t))]|² ]
      = e^{(ρ+t)L}|f|²(x) − e^{ρL}|e^{tL}f|²(x)
      = Ex[ ∫_0^t Γ1( e^{(t−σ)L}f, e^{(t−σ)L}f )(X(ρ + σ)) dσ ]
      = ∫_0^t e^{(ρ+σ)L} Γ1( e^{(t−σ)L}f, e^{(t−σ)L}f )(x) dσ.    (8.140)
Remark 8.52. In (8.139) we need the fact that f ∈ D(L²). In (8.140) the
hypothesis f ∈ D(L) suffices.

Proof. First we prove the equality in (8.139). To this end we write:

    M_f(ρ + t) − M_f(ρ) + ∫_ρ^{ρ+t} { M_{e^{(ρ+t−σ)L}Lf}(σ) − M_{e^{(ρ+t−σ)L}Lf}(ρ) } dσ
    = M_f(ρ + t) − M_f(ρ)
      + ∫_ρ^{ρ+t} { e^{(ρ+t−σ)L}Lf(X(σ)) − e^{(ρ+t−σ)L}Lf(X(ρ)) − ∫_ρ^σ e^{(ρ+t−σ)L}L²f(X(σ1)) dσ1 } dσ
    = M_f(ρ + t) − M_f(ρ) + ∫_ρ^{ρ+t} e^{(ρ+t−σ)L}Lf(X(σ)) dσ − ∫_ρ^{ρ+t} e^{(ρ+t−σ)L}Lf(X(ρ)) dσ
      − ∫_ρ^{ρ+t} ∫_{σ1}^{ρ+t} e^{(ρ+t−σ)L}L²f(X(σ1)) dσ dσ1
    = M_f(ρ + t) − M_f(ρ) + ∫_ρ^{ρ+t} e^{(ρ+t−σ)L}Lf(X(σ)) dσ + ∫_ρ^{ρ+t} (∂/∂σ) e^{(ρ+t−σ)L}f(X(ρ)) dσ
      + ∫_ρ^{ρ+t} ∫_{σ1}^{ρ+t} (∂/∂σ) e^{(ρ+t−σ)L}Lf(X(σ1)) dσ dσ1
    = M_f(ρ + t) − M_f(ρ) + ∫_ρ^{ρ+t} e^{(ρ+t−σ)L}Lf(X(σ)) dσ + f(X(ρ)) − e^{tL}f(X(ρ))
      + ∫_ρ^{ρ+t} ( Lf(X(σ1)) − e^{(ρ+t−σ1)L}Lf(X(σ1)) ) dσ1
    = f(X(ρ + t)) − f(X(ρ)) − ∫_ρ^{ρ+t} Lf(X(s)) ds + ∫_ρ^{ρ+t} e^{(ρ+t−σ)L}Lf(X(σ)) dσ
      + f(X(ρ)) − E_{X(ρ)}[f(X(t))] + ∫_ρ^{ρ+t} ( Lf(X(σ1)) − e^{(ρ+t−σ1)L}Lf(X(σ1)) ) dσ1
    = f(X(ρ + t)) − E_{X(ρ)}[f(X(t))].    (8.141)
The equality in (8.139) is the same as the one in (8.141). The proof of (8.140)
is much more difficult. We will employ the equalities in (8.138) and (8.139) to
obtain it. From (8.139) we get
\begin{align*}
&\left|f(X(\rho+t)) - E_{X(\rho)}\left[f(X(t))\right]\right|^2\\
&= \left|M_f(\rho+t) - M_f(\rho) + \int_0^t \left\{M_{e^{(t-\sigma)L}Lf}(\rho+\sigma) - M_{e^{(t-\sigma)L}Lf}(\rho)\right\}d\sigma\right|^2\\
&= \left|M_f(\rho+t) - M_f(\rho)\right|^2\\
&\quad + 2\Re\left(\overline{\left(M_f(\rho+t) - M_f(\rho)\right)}\int_0^t \left\{M_{e^{(t-\sigma)L}Lf}(\rho+\sigma) - M_{e^{(t-\sigma)L}Lf}(\rho)\right\}d\sigma\right)\\
&\quad + \left|\int_0^t \left\{M_{e^{(t-\sigma)L}Lf}(\rho+\sigma) - M_{e^{(t-\sigma)L}Lf}(\rho)\right\}d\sigma\right|^2. \tag{8.142}
\end{align*}
For brevity we write
\[
\triangle_\rho M_g(\sigma) = M_g(\rho+\sigma) - M_g(\rho), \qquad g \in D(L). \tag{8.143}
\]
Next we use the fact that processes of the form $\sigma \mapsto \triangle_\rho M_g(\sigma)$, $g \in D(L)$, are $P_x$-martingales with respect to the filtration $\left\{\mathcal{F}^{\,\rho}_{\rho+\sigma} : \sigma > 0\right\}$. From this together with (8.138) and (8.142) we obtain, using the notation in (8.143):
\begin{align*}
&E_x\left[\left|f(X(\rho+t)) - E_{X(\rho)}\left[f(X(t))\right]\right|^2\right]\\
&= E_x\left[\left|\triangle_\rho M_f(t)\right|^2\right] + 2\Re\, E_x\left[\overline{\triangle_\rho M_f(t)}\int_0^t \triangle_\rho M_{e^{(t-\sigma)L}Lf}(\sigma)\,d\sigma\right]\\
&\quad + E_x\left[\int_0^t\int_0^t \overline{\triangle_\rho M_{e^{(t-\rho_1)L}Lf}(\rho_1)}\,\triangle_\rho M_{e^{(t-\rho_2)L}Lf}(\rho_2)\,d\rho_1\,d\rho_2\right]\\
&= E_x\left[\left|\triangle_\rho M_f(t)\right|^2\right] + 2\Re\left(E_x\left[\int_0^t \overline{\triangle_\rho M_f(\sigma)}\,\triangle_\rho M_{e^{(t-\sigma)L}Lf}(\sigma)\,d\sigma\right]\right)\\
&\quad + E_x\left[\int_0^t\int_0^t \overline{\triangle_\rho M_{e^{(t-\rho_1)L}Lf}(\rho_1\wedge\rho_2)}\,\triangle_\rho M_{e^{(t-\rho_2)L}Lf}(\rho_1\wedge\rho_2)\,d\rho_1\,d\rho_2\right]
\end{align*}
(employ (8.138) several times)
\begin{align*}
&= E_x\left[\int_0^t \Gamma_1\left(\overline{f}, f\right)(X(\rho+\sigma))\,d\sigma\right]\\
&\quad + 2\Re\left(E_x\left[\int_0^t\int_0^\sigma \Gamma_1\left(\overline{f},\, e^{(t-\sigma)L}Lf\right)(X(\rho+\sigma_1))\,d\sigma_1\,d\sigma\right]\right)\\
&\quad + E_x\left[\int_0^t\int_0^t\int_0^{\rho_1\wedge\rho_2} \Gamma_1\left(\overline{e^{(t-\rho_1)L}Lf},\, e^{(t-\rho_2)L}Lf\right)(X(\rho+\sigma))\,d\sigma\,d\rho_1\,d\rho_2\right]\\
&= E_x\left[\int_0^t \Gamma_1\left(\overline{f}, f\right)(X(\rho+\sigma))\,d\sigma\right]\\
&\quad + 2\Re\left(E_x\left[\int_0^t\int_{\sigma_1}^t \Gamma_1\left(\overline{f},\, e^{(t-\sigma)L}Lf\right)(X(\rho+\sigma_1))\,d\sigma\,d\sigma_1\right]\right)\\
&\quad + E_x\left[\int_0^t\int_\sigma^t\int_\sigma^t \Gamma_1\left(\overline{e^{(t-\rho_1)L}Lf},\, e^{(t-\rho_2)L}Lf\right)(X(\rho+\sigma))\,d\rho_1\,d\rho_2\,d\sigma\right]
\end{align*}
(the operator $\Gamma_1$ is bilinear)
\begin{align*}
&= E_x\left[\int_0^t \Gamma_1\left(\overline{f}, f\right)(X(\rho+\sigma))\,d\sigma\right]\\
&\quad + 2\Re\left(E_x\left[\int_0^t \Gamma_1\left(\overline{f},\, \int_{\sigma_1}^t e^{(t-\sigma)L}Lf\,d\sigma\right)(X(\rho+\sigma_1))\,d\sigma_1\right]\right)\\
&\quad + E_x\left[\int_0^t \Gamma_1\left(\overline{\int_\sigma^t e^{(t-\rho_1)L}Lf\,d\rho_1},\, \int_\sigma^t e^{(t-\rho_2)L}Lf\,d\rho_2\right)(X(\rho+\sigma))\,d\sigma\right]\\
&= E_x\left[\int_0^t \Gamma_1\left(\overline{f}, f\right)(X(\rho+\sigma))\,d\sigma\right]\\
&\quad + 2\Re\left(E_x\left[\int_0^t \Gamma_1\left(\overline{f},\, e^{(t-\sigma_1)L}f - f\right)(X(\rho+\sigma_1))\,d\sigma_1\right]\right)\\
&\quad + E_x\left[\int_0^t \Gamma_1\left(\overline{e^{(t-\sigma)L}f - f},\, e^{(t-\sigma)L}f - f\right)(X(\rho+\sigma))\,d\sigma\right]\\
&= E_x\left[\int_0^t \Gamma_1\left(\overline{e^{(t-\sigma)L}f},\, e^{(t-\sigma)L}f\right)(X(\rho+\sigma))\,d\sigma\right]. \tag{8.144}
\end{align*}
The equality in (8.144) yields (8.140) for $f \in D\left(L^2\right)$. Since $D\left(L^2\right)$ is $T_\beta$-dense in $D(L)$, we infer (8.140) for $f \in D(L)$.
An easier proof of equality (8.140) reads as follows. We calculate:
\begin{align*}
\frac{\partial}{\partial\sigma}\left\{e^{(\rho+\sigma)L}\left|e^{(t-\sigma)L}f\right|^2\right\}
&= e^{(\rho+\sigma)L}L\left|e^{(t-\sigma)L}f\right|^2\\
&\quad - e^{(\rho+\sigma)L}\left\{\overline{Le^{(t-\sigma)L}f}\, e^{(t-\sigma)L}f + \overline{e^{(t-\sigma)L}f}\, Le^{(t-\sigma)L}f\right\}\\
&= e^{(\rho+\sigma)L}\,\Gamma_1\left(\overline{e^{(t-\sigma)L}f},\, e^{(t-\sigma)L}f\right). \tag{8.145}
\end{align*}
In (8.145) we used the identity
\[
L\left(\overline{f}g\right) = \left(L\overline{f}\right)g + \overline{f}\,Lg + \Gamma_1\left(\overline{f}, g\right). \tag{8.146}
\]
The equality in (8.146) is true for $f$, $g \in D(L)$ such that $\overline{f}g \in D(L)$. From (8.145) we obtain:
\begin{align*}
\int_0^t e^{(\rho+\sigma)L}\,\Gamma_1\left(\overline{e^{(t-\sigma)L}f},\, e^{(t-\sigma)L}f\right)d\sigma
&= \int_0^t \frac{\partial}{\partial\sigma}\left\{e^{(\rho+\sigma)L}\left|e^{(t-\sigma)L}f\right|^2\right\}d\sigma\\
&= e^{(\rho+t)L}|f|^2 - e^{\rho L}\left|e^{tL}f\right|^2. \tag{8.147}
\end{align*}
The equality in (8.147) implies (8.140).
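The identity (8.147) only uses the definition of the squared gradient operator, so it holds for any Markov generator. The following sketch verifies it numerically (with $\rho = 0$ and a real function $f$) for a hypothetical 3-state generator matrix $Q$; the matrix, the function and the time horizon are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state generator matrix Q (off-diagonal >= 0, rows sum to 0).
Q = np.array([[-2.0, 1.5, 0.5],
              [ 1.0, -1.0, 0.0],
              [ 0.3, 0.7, -1.0]])
f = np.array([0.3, 1.2, -0.5])   # an arbitrary real function on the state space
t = 0.7

def gamma1(g):
    # squared gradient in the normalization Gamma1(g, g) = L(g^2) - 2 g Lg
    return Q @ (g * g) - 2.0 * g * (Q @ g)

# left-hand side of (8.147) with rho = 0: e^{tL}|f|^2 - |e^{tL} f|^2
lhs = expm(t * Q) @ (f * f) - (expm(t * Q) @ f) ** 2

# right-hand side: int_0^t e^{sigma L} Gamma1(e^{(t-sigma)L} f, ...) dsigma,
# approximated with the trapezoidal rule
sigmas = np.linspace(0.0, t, 2001)
vals = np.array([expm(s * Q) @ gamma1(expm((t - s) * Q) @ f) for s in sigmas])
h = sigmas[1] - sigmas[0]
rhs = h * (0.5 * (vals[0] + vals[-1]) + vals[1:-1].sum(axis=0))

print(np.max(np.abs(lhs - rhs)))  # close to 0
```

Note also that $\Gamma_1(f, f) \ge 0$ entrywise, since $\Gamma_1(f, f)(x) = \sum_y Q(x, y)\left(f(y) - f(x)\right)^2$ for a chain.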

Remark 8.53. Suppose that there exist constants $c > 0$ and $\gamma \in \mathbb{R}$ such that
\[
\Gamma_1\left(\overline{e^{\rho L}f},\, e^{\rho L}f\right) \le c\,e^{-\gamma\rho}\, e^{\rho L}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.148}
\]
From (8.147) and (8.148) with $\rho = 0$ we obtain:
\begin{align*}
e^{tL}|f|^2 - \left|e^{tL}f\right|^2
&= \int_0^t e^{\rho L}\,\Gamma_1\left(\overline{e^{(t-\rho)L}f},\, e^{(t-\rho)L}f\right)d\rho\\
&\le \int_0^t e^{\rho L}\left\{c\,e^{-(t-\rho)\gamma}\, e^{(t-\rho)L}\,\Gamma_1\left(\overline{f}, f\right)\right\}d\rho\\
&= \frac{c}{\gamma}\left(1 - e^{-t\gamma}\right)e^{tL}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.149}
\end{align*}
The inequality in (8.149) is the same as inequality (4.14) in Theorem 4.13 of Chen and Wang [53]. If $\mu$ is an invariant probability for the operator $L$, then (8.149) implies
\[
\lim_{t\to\infty}\int e^{tL}\left|f - e^{tL}f(x)\right|^2(x)\,d\mu(x)
= \int |f|^2\,d\mu - \lim_{t\to\infty}\int \left|e^{tL}f\right|^2 d\mu
\le \frac{c}{\gamma}\int \Gamma_1\left(\overline{f}, f\right)d\mu, \tag{8.150}
\]
provided $\gamma > 0$. From (8.150) it follows that the $L^2$-spectral gap of $L$ is bounded from below by $\gamma/c$. The inequality in (8.150) can be considered as a spectral gap or Poincaré inequality: compare with Definition 8.56 below.

The following definition is to be compared with Definitions 7.22 and 7.33.

Definition 8.54. Let $\mu$ be the unique invariant measure of the generator $L$ of a diffusion with associated squared gradient operator $\Gamma_1$. Then the $L^2(\mu)$-spectral gap of the operator $L$ is defined by the equality
\[
2\,\mathrm{gap}(L) = \inf\left\{\int \Gamma_1\left(\overline{f}, f\right)d\mu : f \in D(L),\ \int f\,d\mu = 0,\ \int |f|^2\,d\mu = 1\right\}. \tag{8.151}
\]

Proposition 8.55. Let the measure $\mu$ and $\mathrm{gap}(L) > 0$ be as in Definition 8.54. Then $\gamma \in (0, \infty)$ satisfies $\gamma \le 2\,\mathrm{gap}(L)$ if and only if the following inequality holds for all $t > 0$ and for all $f \in C_b(E)$:
\[
\int\left|e^{tL}f\right|^2 d\mu - \left|\int e^{tL}f\,d\mu\right|^2 \le e^{-t\gamma}\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right). \tag{8.152}
\]
The inequality in (8.152) holds for all $t > 0$ and $f \in C_b(E)$ if and only if the inequality
\[
\int \Gamma_1\left(\overline{f}, f\right)d\mu \ge \gamma\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right) \tag{8.153}
\]
holds for all $f \in D(L)$.

Definition 8.56. An inequality of the form (8.153) is called a Poincaré or a spectral gap inequality of $L^2(E, \mu)$-type.
Notice that by invariance of the measure $\mu$ we have $\int e^{tL}f\,d\mu = \int f\,d\mu$, and $\int Lf\,d\mu = 0$, $f \in D(L)$. Also notice that, since $\mu$ is a probability measure, the decomposition $f = \left(f - \int f\,d\mu\right) + \int f\,d\mu$ splits the function $f$ into two orthogonal functions (one of them being the constant $\int f\,d\mu$) in the space $L^2(E, \mu)$. Hence we have
\begin{align*}
\inf_{\alpha\in\mathbb{C}}\int |f - \alpha|^2\,d\mu
&= \int \left|f - \int f\,d\mu\right|^2 d\mu
= \int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\\
&= \frac12 \iint |f(x) - f(y)|^2\,d\mu(x)\,d\mu(y). \tag{8.154}
\end{align*}
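The variance identities in (8.154) can be checked directly on a discrete measure. The sketch below uses a hypothetical three-point measure $\mu$ and function $f$; all numerical values are illustrative.

```python
import numpy as np

# hypothetical three-point probability measure mu and function f
mu = np.array([0.2, 0.5, 0.3])
f = np.array([1.0, -0.4, 2.5])

mean = np.sum(f * mu)
variance = np.sum(np.abs(f) ** 2 * mu) - np.abs(mean) ** 2

# the infimum over alpha of int |f - alpha|^2 dmu is attained at alpha = int f dmu
best = np.sum(np.abs(f - mean) ** 2 * mu)

# double-integral form: (1/2) iint |f(x) - f(y)|^2 dmu(x) dmu(y)
double = 0.5 * np.sum(mu[:, None] * mu[None, :]
                      * np.abs(f[:, None] - f[None, :]) ** 2)

# any other alpha does worse than the mean
worse = np.sum(np.abs(f - 0.1) ** 2 * mu)
print(variance, best, double, worse > best)
```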
Remark 8.57. If the probability measure $\mu$ is invariant under the semigroup generated by $L$, then
\begin{align*}
\int \Gamma_1\left(\overline{f}, g\right)d\mu
&= \int L\left(\overline{f}g\right)d\mu - \int \left(L\overline{f}\right)g\,d\mu - \int \overline{f}\,(Lg)\,d\mu\\
&= -\int \left(L\overline{f}\right)g\,d\mu - \int \overline{f}\,(Lg)\,d\mu
= -\int \overline{(L + L^*)f}\cdot g\,d\mu, \tag{8.155}
\end{align*}
where $L^*$ is the adjoint of the operator $L$ in the space $L^2(E, \mu)$. From (8.155) we infer
\begin{align*}
2\,\mathrm{gap}(L)
&= \inf\left\{\int \Gamma_1\left(\overline{f}, f\right)d\mu : \int f\,d\mu = 0,\ \int |f|^2\,d\mu = 1\right\}\\
&= \inf\left\{-\int \overline{(L + L^*)f}\cdot f\,d\mu : \int f\,d\mu = 0,\ \int |f|^2\,d\mu = 1\right\},
\end{align*}
and hence the number $2\,\mathrm{gap}(L)$ is the bottom of the spectrum of the operator $-(L + L^*)$ in the space $\left\{f - \int f\,d\mu : f \in L^2(E, \mu)\right\}$, which is the orthogonal complement of the subspace consisting of the constant functions in $L^2(E, \mu)$. In particular, if $L = L^*$, then $\mathrm{gap}(L)$ is the gap in the spectrum of $-L$ between $0$ and $[\mathrm{gap}(L), \infty) \cap \sigma_\mu(L)$. Here $\sigma_\mu(L)$ denotes the spectrum of $L$ as an operator in the space $L^2(E, \mu)$. In fact it would have been better to write $\mathrm{gap}(L + L^*)$ instead of $2\,\mathrm{gap}(L)$. Of course $0$ is an eigenvalue of $L$ and the constant functions are the corresponding eigenvectors.
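For a finite-state generator the quantity $2\,\mathrm{gap}(L)$ of Remark 8.57 is directly computable: with $D = \mathrm{diag}(\mu)$ one has $\int \Gamma_1(f, f)\,d\mu = -f^T\left(DQ + Q^T D\right)f$, so $2\,\mathrm{gap}(L)$ is the second generalized eigenvalue of $-(DQ + Q^T D)$ relative to $D$. A minimal sketch, assuming a hypothetical 3-state generator $Q$ (all numbers are illustrative):

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical 3-state generator Q with invariant probability pi (left null-vector).
Q = np.array([[-1.0, 0.7, 0.3],
              [ 0.2, -0.6, 0.4],
              [ 0.1, 0.5, -0.6]])
w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()
D = np.diag(pi)

# int Gamma1(f, f) dpi = -f^T (D Q + Q^T D) f, a symmetric quadratic form
S = -(D @ Q + Q.T @ D)

# generalized eigenvalues of S f = lambda D f: the smallest is 0 (constants);
# the second smallest equals 2 gap(L) in the sense of Definition 8.54
lam = eigh(S, D, eigvals_only=True)
two_gap = lam[1]

# Poincare inequality (8.153) with gamma = 2 gap(L) for a random test function
rng = np.random.default_rng(0)
f = rng.standard_normal(3)
variance = np.sum(f**2 * pi) - np.sum(f * pi) ** 2
energy = f @ S @ f
print(lam[0], two_gap)
```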
Proof (Proof of Proposition 8.55). Suppose $\gamma > 0$ is such that (8.152) is satisfied for all $t > 0$ and for all $f \in C_b(E)$. Then we subtract $\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2$ from both sides of (8.152) and divide by $t > 0$ to obtain:
\[
\frac1t\left(\int\left|e^{tL}f\right|^2 d\mu - \int |f|^2\,d\mu\right) \le \frac{e^{-\gamma t} - 1}{t}\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right). \tag{8.156}
\]
In (8.156) we let $t \downarrow 0$ to obtain:
\[
\int\left(Lf\cdot\overline{f} + f\,\overline{Lf}\right)d\mu \le -\gamma\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right), \tag{8.157}
\]
or what amounts to the same:
\[
\int\left(L|f|^2 - \Gamma_1\left(\overline{f}, f\right)\right)d\mu \le -\gamma\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right). \tag{8.158}
\]
Since by invariance $\int L|f|^2\,d\mu = 0$, from (8.158) we infer (8.153), and hence $\gamma \le 2\,\mathrm{gap}(L)$.

For the converse statement we consider, for $f \in D(L)$ and $\gamma > 0$ such that $\gamma \le 2\,\mathrm{gap}(L)$, the function
\begin{align*}
\varphi(t) = \int\left|e^{tL}f - \int e^{tL}f\,d\mu\right|^2 d\mu
&= \int\left|e^{tL}f\right|^2 d\mu - \left|\int e^{tL}f\,d\mu\right|^2\\
&= \int\left|e^{tL}f\right|^2 d\mu - \left|\int f\,d\mu\right|^2. \tag{8.159}
\end{align*}
Then from (8.153) we infer
\begin{align*}
\varphi'(t) &= \int\left(L\left(e^{tL}\overline{f}\right)e^{tL}f + e^{tL}\overline{f}\,L e^{tL}f\right)d\mu\\
&= \int L\left|e^{tL}f\right|^2 d\mu - \int \Gamma_1\left(e^{tL}\overline{f},\, e^{tL}f\right)d\mu\\
&= -\int \Gamma_1\left(e^{tL}\overline{f},\, e^{tL}f\right)d\mu\\
&\le -\gamma\left(\int\left|e^{tL}f\right|^2 d\mu - \left|\int e^{tL}f\,d\mu\right|^2\right) = -\gamma\varphi(t), \tag{8.160}
\end{align*}
and hence $\varphi(t) \le e^{-\gamma t}\varphi(0)$, which is the same as (8.152).

Since it is easy to see that $\gamma \le 2\,\mathrm{gap}(L)$ if and only if inequality (8.153) holds for all $f \in D(L)$, this proves Proposition 8.55.
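The decay estimate (8.152) only uses the definition of $\Gamma_1$, invariance, and the Poincaré inequality, so it can be illustrated on a finite chain with $\gamma = 2\,\mathrm{gap}(L)$. A sketch under that assumption (the generator $Q$ below is a hypothetical example):

```python
import numpy as np
from scipy.linalg import expm, eigh

# hypothetical 3-state generator Q and its invariant probability mu
Q = np.array([[-1.0, 0.7, 0.3],
              [ 0.2, -0.6, 0.4],
              [ 0.1, 0.5, -0.6]])
w, vl = np.linalg.eig(Q.T)
mu = np.real(vl[:, np.argmin(np.abs(w))])
mu = mu / mu.sum()
D = np.diag(mu)

# gamma = 2 gap(L): second generalized eigenvalue of -(DQ + Q^T D) f = lambda D f
gamma = eigh(-(D @ Q + Q.T @ D), D, eigvals_only=True)[1]

f = np.array([1.0, -0.4, 2.5])

def variance(g):
    return np.sum(g**2 * mu) - np.sum(g * mu) ** 2

# check the exponential decay of the variance, i.e. inequality (8.152)
ok = all(variance(expm(t * Q) @ f) <= np.exp(-gamma * t) * variance(f) + 1e-12
         for t in np.linspace(0.0, 3.0, 31))
print(ok)
```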

Definition 8.58. Let $\mu$ be an invariant probability measure and let $f \ge 0$ be a Borel measurable function which is not $\mu$-almost everywhere zero. Then the entropy of $f$ with respect to $\mu$ is defined by
\[
\mathrm{Ent}(f) = \int f\log f\,d\mu - \int f\,d\mu\,\log\int f\,d\mu = \int f\log\frac{f}{\int f\,d\mu}\,d\mu. \tag{8.161}
\]

Definition 8.59. Let $\mu$ be a probability measure. A logarithmic Sobolev inequality takes the form
\[
\mathrm{Ent}\left(|f|^2\right) \le A\int |f|^2\,d\mu + \frac1\lambda\int \Gamma_1\left(\overline{f}, f\right)d\mu \tag{8.162}
\]
for all $f$ in a large enough subalgebra $\mathcal{A}$ of $C_b(E)$. Here $A \ge 0$ and $\lambda > 0$ are constants. If the constant $A$ can be chosen to be $A = 0$, then (8.162) is called a tight logarithmic Sobolev inequality.

Here $\mathrm{Ent}\left(|f|^2\right)$ is defined in Definition 8.58. The following proposition gives a relationship between tight logarithmic Sobolev inequalities and the Poincaré inequality: see Definition 8.56 and inequality (8.153).

Proposition 8.60. Suppose that $L$ satisfies a logarithmic Sobolev inequality with constants $A$ and $\lambda > 0$, and suppose that $L$ satisfies a Poincaré inequality with a constant $\gamma > 0$. Then $L$ satisfies a tight logarithmic Sobolev inequality.

In the proof we use an inequality which we owe to Rothaus. It is given in Proposition 8.82 below as inequality (8.252).

Proof. Let $f \in \mathcal{A}$ and put $\widehat{f} = f - \int f\,d\mu$. A combination of inequality (8.252) and Poincaré's inequality yields:
\begin{align*}
\mathrm{Ent}\left(|f|^2\right) &\le 2\int \left|\widehat{f}\right|^2 d\mu + \mathrm{Ent}\left(\left|\widehat{f}\right|^2\right)\\
&\le (2 + A)\int \left|\widehat{f}\right|^2 d\mu + \frac1\lambda\int \Gamma_1\left(\overline{f}, f\right)d\mu
\end{align*}
(invoke Poincaré's inequality with constant $\gamma > 0$)
\[
\le \left(\frac{2+A}{\gamma} + \frac1\lambda\right)\int \Gamma_1\left(\overline{f}, f\right)d\mu. \tag{8.163}
\]
The inequality in (8.163) is a tight logarithmic Sobolev inequality.

Definition 8.61. Let $\mu$ be a probability measure. A Sobolev inequality of order $p > 2$ has the form
\[
\left(\int |f|^p\,d\mu\right)^{2/p} \le A\int |f|^2\,d\mu + \frac1\lambda\int \Gamma_1\left(\overline{f}, f\right)d\mu \tag{8.164}
\]
for all $f$ in a large enough subalgebra $\mathcal{A}$ of $C_b(E)$. Here, as in Definition 8.59, $A \ge 0$ and $\lambda > 0$ are constants.

In the following proposition we see that a tight logarithmic Sobolev inequality implies the Poincaré inequality.

Proposition 8.62. Suppose that in (8.162) the constant $A = 0$. Then the inequality in (8.153) is satisfied with $\gamma = 2\lambda$, and hence $\lambda \le \mathrm{gap}(L)$.

Proof. Insert $f = 1 + \varepsilon g$, $\varepsilon > 0$, $g \in C_b(E, \mathbb{R})$, into (8.162), and divide by $\varepsilon^2$, to obtain
\begin{align*}
0 &\ge \frac{\lambda}{\varepsilon^2}\int (1 + \varepsilon g)^2 \log\frac{(1 + \varepsilon g)^2}{\int (1 + \varepsilon g)^2\,d\mu}\,d\mu - \int \Gamma_1(g, g)\,d\mu\\
&= \frac{\lambda}{\varepsilon^2}\int (1 + \varepsilon g)^2 \log\frac{1 + 2\varepsilon g + \varepsilon^2 g^2}{1 + 2\varepsilon\int g\,d\mu + \varepsilon^2\int g^2\,d\mu}\,d\mu - \int \Gamma_1(g, g)\,d\mu
\end{align*}
($\log(1 + x) = x - \frac12 x^2 + O\left(x^3\right)$ for $x \to 0$; expand the logarithm, multiply by $(1 + \varepsilon g)^2$, and collect the terms up to order $\varepsilon^2$)
\[
= 2\lambda\left(\int g^2\,d\mu - \left(\int g\,d\mu\right)^2\right) - \int \Gamma_1(g, g)\,d\mu + O(\varepsilon). \tag{8.165}
\]
From (8.165) we infer
\[
2\lambda\left(\int g^2\,d\mu - \left(\int g\,d\mu\right)^2\right) \le \int \Gamma_1(g, g)\,d\mu. \tag{8.166}
\]
From (8.166) and the bilinearity of $\Gamma_1$ it follows that
\[
2\lambda\left(\int |g|^2\,d\mu - \left|\int g\,d\mu\right|^2\right) \le \int \Gamma_1\left(\overline{g}, g\right)d\mu \tag{8.167}
\]
for $g \in D(L)$ which take complex values. By employing Proposition 8.55 this proves Proposition 8.62.
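The Taylor expansion behind (8.165) can be tested numerically: for $f = 1 + \varepsilon g$ the quantity $\mathrm{Ent}\left((1+\varepsilon g)^2\right)/\varepsilon^2$ should approach $2\left(\int g^2\,d\mu - \left(\int g\,d\mu\right)^2\right)$ as $\varepsilon \downarrow 0$. A small sketch with a hypothetical discrete measure $\mu$ and function $g$:

```python
import numpy as np

# hypothetical discrete probability measure mu and real function g
mu = np.array([0.25, 0.4, 0.35])
g = np.array([0.8, -1.1, 0.5])

def ent(h):
    # Ent(h) = int h log h dmu - (int h dmu) log(int h dmu), as in Definition 8.58
    m = np.sum(h * mu)
    return np.sum(h * np.log(h) * mu) - m * np.log(m)

variance = np.sum(g**2 * mu) - np.sum(g * mu) ** 2

eps = 1e-4
ratio = ent((1.0 + eps * g) ** 2) / eps**2
print(ratio, 2.0 * variance)  # the ratio approaches 2 * variance as eps -> 0
```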

A combination of Propositions 8.60 and 8.62 yields the following corollary.

Corollary 8.63. Suppose that the operator $L$ satisfies a logarithmic Sobolev inequality. Then $L$ satisfies a tight logarithmic Sobolev inequality if and only if it satisfies a Poincaré inequality.

In the following proposition we see that a Sobolev inequality combined with a Poincaré inequality yields a Sobolev inequality with constant $A = 1$. In the proof we employ inequality (8.251) in Proposition 8.82 below.
Proposition 8.64. Suppose that the operator $L$ (or in fact the corresponding squared gradient operator $\Gamma_1$) satisfies a Sobolev inequality of order $p > 2$ with constants $A$ and $\lambda$: see inequality (8.164) in Definition 8.61. In addition suppose that $L$ satisfies a Poincaré inequality of the form (8.153) with constant $\gamma > 0$. Then $L$ satisfies a Sobolev inequality of order $p > 2$ with constants $A = 1$ and $\lambda'$ satisfying $\displaystyle\frac{1}{\lambda'} = (p - 1)\left(\frac{A}{\gamma} + \frac1\lambda\right)$.
Proof. Let $f \in \mathcal{A}$. An appeal to inequality (8.251) in Proposition 8.82 yields the following inequalities:
\begin{align*}
\left(\int |f|^p\,d\mu\right)^{2/p}
&\le \left|\int f\,d\mu\right|^2 + (p - 1)\left(\int\left|f - \int f\,d\mu\right|^p d\mu\right)^{2/p}\\
&\le \left|\int f\,d\mu\right|^2 + (p - 1)A\int\left|f - \int f\,d\mu\right|^2 d\mu\\
&\quad + \frac{p-1}{\lambda}\int \Gamma_1\left(\overline{f - \int f\,d\mu},\ f - \int f\,d\mu\right)d\mu\\
&\le \left|\int f\,d\mu\right|^2 + (p - 1)\left(\frac{A}{\gamma} + \frac1\lambda\right)\int \Gamma_1\left(\overline{f}, f\right)d\mu. \tag{8.168}
\end{align*}
The claim in Proposition 8.64 follows from (8.168).

The following theorem says that the entropy defined in terms of an invariant
probability measure has exponential decay for t → ∞ provided that L satisfies
a tight logarithmic Sobolev inequality.
Theorem 8.65. Let $\lambda > 0$. The following assertions are equivalent.

(i) For all functions $f \in \mathcal{A}$ the following inequality holds:
\[
\lambda\,\mathrm{Ent}\left(|f|^2\right) \le \int \Gamma_1\left(\overline{f}, f\right)d\mu. \tag{8.169}
\]
(ii) For all functions $f \in \mathcal{A}$ the following inequality holds:
\[
\mathrm{Ent}\left(e^{tL}|f|^2\right) \le e^{-2\lambda t}\,\mathrm{Ent}\left(|f|^2\right). \tag{8.170}
\]

The proof is based on the equalities (see (8.174) below):
\begin{align*}
\frac{d}{dt}\mathrm{Ent}\left(e^{tL}|f|^2\right)
&= -\frac12\int \frac{\Gamma_1\left(e^{tL}|f|^2,\, e^{tL}|f|^2\right)}{e^{tL}|f|^2}\,d\mu\\
&= -2\int \Gamma_1\left(\left(e^{tL}|f|^2\right)^{1/2},\, \left(e^{tL}|f|^2\right)^{1/2}\right)d\mu.
\end{align*}

Proof. Let $f \in \mathcal{A}$. We calculate
\[
\frac{d}{dt}\mathrm{Ent}\left(e^{tL}|f|^2\right) = \frac{d}{dt}\left(\int e^{tL}|f|^2 \log\frac{e^{tL}|f|^2}{\int e^{tL}|f|^2\,d\mu}\,d\mu\right)
\]
($\mu$ is $L$-invariant)
\begin{align*}
&= \frac{d}{dt}\left(\int e^{tL}|f|^2 \log\left(e^{tL}|f|^2\right)d\mu\right)\\
&= \int L e^{tL}|f|^2\,\log\left(e^{tL}|f|^2\right)d\mu + \int L e^{tL}|f|^2\,d\mu\\
&= \int L e^{tL}|f|^2\,\log\left(e^{tL}|f|^2\right)d\mu. \tag{8.171}
\end{align*}
Put $h = e^{tL}|f|^2$. We will rewrite the expression in (8.171) as follows. First we notice the equality:
\[
(Lf_1)\,f_2 = L(f_1 f_2) - f_1\,Lf_2 - \Gamma_1(f_1, f_2) \tag{8.172}
\]
for appropriately chosen $f_1$ and $f_2$. Hence, by $L$-invariance of $\mu$ we have
\[
\int (Lf_1)\,f_2\,d\mu = -\int f_1\,Lf_2\,d\mu - \int \Gamma_1(f_1, f_2)\,d\mu. \tag{8.173}
\]
From (8.171) and (8.173) with $f_1 = e^{tL}|f|^2 = h$ and $f_2 = \log\left(e^{tL}|f|^2\right) = \log h$ we get, by using the transformation properties of the squared gradient operator $\Gamma_1$,
\begin{align*}
\frac{d}{dt}\mathrm{Ent}\left(e^{tL}|f|^2\right)
&= -\int h\,L\log h\,d\mu - \int \Gamma_1(h, \log h)\,d\mu\\
&= -\int h\,\frac{Lh}{h}\,d\mu + \frac12\int h\,\frac{\Gamma_1(h, h)}{h^2}\,d\mu - \int \frac{\Gamma_1(h, h)}{h}\,d\mu\\
&= -\frac12\int \frac{\Gamma_1(h, h)}{h}\,d\mu = -2\int \Gamma_1\left(h^{1/2}, h^{1/2}\right)d\mu\\
&= -2\int \Gamma_1\left(\left(e^{tL}|f|^2\right)^{1/2},\, \left(e^{tL}|f|^2\right)^{1/2}\right)d\mu. \tag{8.174}
\end{align*}
If assertion (i) is true, then (8.174) implies:
\[
\frac{d}{dt}\mathrm{Ent}\left(e^{tL}|f|^2\right) \le -2\lambda\,\mathrm{Ent}\left(e^{tL}|f|^2\right),
\]
and consequently $\mathrm{Ent}\left(e^{tL}|f|^2\right) \le e^{-2\lambda t}\,\mathrm{Ent}\left(|f|^2\right)$, which is assertion (ii).

Conversely, if (ii) holds true, then we have
\[
\frac{\mathrm{Ent}\left(e^{tL}|f|^2\right) - \mathrm{Ent}\left(|f|^2\right)}{t} \le \frac{e^{-2\lambda t} - 1}{t}\,\mathrm{Ent}\left(|f|^2\right). \tag{8.175}
\]
In (8.175) we let $t \downarrow 0$ and we use (8.174) to obtain
\[
\lambda\,\mathrm{Ent}\left(|f|^2\right) \le \int \Gamma_1(|f|, |f|)\,d\mu \le \int \Gamma_1\left(\overline{f}, f\right)d\mu. \tag{8.176}
\]
The proof of the inequality $\Gamma_1(|f|, |f|) \le \Gamma_1\left(\overline{f}, f\right)$ is given in the proof of Lemma 8.74: see (8.226), (8.227), and (8.228). From (8.176) assertion (i) follows.
Proposition 8.66. Fix $A \ge 0$ and $\lambda > 0$, and let $\mu$ be an invariant probability measure. The following assertions are equivalent:

(i) For all $f \in \mathcal{A}$ the logarithmic Sobolev inequality in (8.162) holds.
(ii) There exists $p \in (1, \infty)$ such that
\[
\mathrm{Ent}\left(f^p\right) \le A\int f^p\,d\mu + \frac{p^2}{4\lambda}\int f^{p-2}\,\Gamma_1(f, f)\,d\mu = A\int f^p\,d\mu - \frac{p^2}{4\lambda(p-1)}\int f^{p-1}\,Lf\,d\mu \tag{8.177}
\]
for $f \in \mathcal{A}$, $f \ge 0$.
(iii) For all $p \in (1, \infty)$ the inequality
\[
\mathrm{Ent}\left(f^p\right) \le A\int f^p\,d\mu + \frac{p^2}{4\lambda}\int f^{p-2}\,\Gamma_1(f, f)\,d\mu = A\int f^p\,d\mu - \frac{p^2}{4\lambda(p-1)}\int f^{p-1}\,Lf\,d\mu \tag{8.178}
\]
holds for $f \in \mathcal{A}$, $f \ge 0$.

Proof. The proof follows by observing that $\Gamma_1(\varphi(f), \varphi(f)) = \left(\varphi'(f)\right)^2\,\Gamma_1(f, f)$ for smooth $\varphi$ and all $f \in \mathcal{A}$. The choice $\varphi(f) = f^{p/2}$ shows that (i) implies (ii). The choice $\varphi(f) = f^{q/p}$ shows that (ii) implies (iii) with $q$ instead of $p$. Finally, the choice $p = 2$ shows the implication (iii) $\Rightarrow$ (i).

The following result is taken from Bakry [17].

Theorem 8.67. Let $A \ge 0$ and $\lambda > 0$ be two constants, and let $p \in (1, \infty)$. Let the functions $p(t)$ and $m(t)$ be determined by the equalities:
\[
\frac{p(t) - 1}{p - 1} = e^{4\lambda t}, \quad\text{and}\quad m(t) = A\left(\frac1p - \frac{1}{p(t)}\right). \tag{8.179}
\]
Then the following assertions are equivalent:

(i) The logarithmic Sobolev inequality (8.162) is satisfied with constants $A$ and $\lambda$;
(ii) For all $t > 0$ and $f \in L^p(E, \mu)$ the following inequality holds:
\[
\left\|e^{tL}f\right\|_{p(t)} \le e^{m(t)}\,\|f\|_p. \tag{8.180}
\]
Notice that for $A = 0$ we have $\left\|e^{tL}f\right\|_{p(t)} \le \|f\|_p$, and hence the mapping $f \mapsto e^{tL}f$ is contractive from $L^p(E, \mu)$ to $L^{p(t)}(E, \mu)$.
Proof. (i) $\Rightarrow$ (ii). Fix $t_0 \in (0, \infty)$. Without loss of generality we may and do assume that
\[
\left\|e^{tL}f\right\|_{p(t)}^{p(t)} = \int \left(e^{tL}f\right)^{p(t)}d\mu \le 1, \qquad 0 \le t \le t_0. \tag{8.181}
\]
Otherwise we divide $f \ge 0$ by $\sup\left\{\left\|e^{tL}f\right\|_{p(t)} : t \in [0, t_0]\right\}$. Define the function $g(t)$, $t > 0$, by
\[
g(t) = \exp\left(-A\left(\frac1p - \frac{1}{p(t)}\right)\right)\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)^{1/p(t)}. \tag{8.182}
\]
Then a calculation shows the equality:
\begin{align*}
&g'(t)\exp\left(A\left(\frac1p - \frac{1}{p(t)}\right)\right)\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)^{1-1/p(t)}\\
&= -A\,\frac{p'(t)}{p(t)^2}\int \left(e^{tL}f\right)^{p(t)}d\mu\\
&\quad + \frac{p'(t)}{p(t)^2}\left\{\mathrm{Ent}\left(\left(e^{tL}f\right)^{p(t)}\right) + \int \left(e^{tL}f\right)^{p(t)}\log\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)d\mu\right\}\\
&\quad + \int \left(e^{tL}f\right)^{p(t)-1} L e^{tL}f\,d\mu. \tag{8.183}
\end{align*}
From assertion (iii) in Proposition 8.66 with $p(t)$ instead of $p$ we see
\begin{align*}
\mathrm{Ent}\left(\left(e^{tL}f\right)^{p(t)}\right)
&\le A\int \left(e^{tL}f\right)^{p(t)}d\mu - \frac{1}{4\lambda}\,\frac{p(t)^2}{p(t) - 1}\int \left(e^{tL}f\right)^{p(t)-1} L e^{tL}f\,d\mu\\
&= A\int \left(e^{tL}f\right)^{p(t)}d\mu - \frac{p(t)^2}{p'(t)}\int \left(e^{tL}f\right)^{p(t)-1} L e^{tL}f\,d\mu. \tag{8.184}
\end{align*}
Then (8.184) together with (8.183) shows:
\begin{align*}
&g'(t)\exp\left(A\left(\frac1p - \frac{1}{p(t)}\right)\right)\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)^{1-1/p(t)}\\
&\le -A\,\frac{p'(t)}{p(t)^2}\int \left(e^{tL}f\right)^{p(t)}d\mu\\
&\quad + \frac{p'(t)}{p(t)^2}\left\{A\int \left(e^{tL}f\right)^{p(t)}d\mu - \frac{p(t)^2}{p'(t)}\int \left(e^{tL}f\right)^{p(t)-1} L e^{tL}f\,d\mu\right.\\
&\qquad\qquad\left. + \int \left(e^{tL}f\right)^{p(t)}\log\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)d\mu\right\}\\
&\quad + \int \left(e^{tL}f\right)^{p(t)-1} L e^{tL}f\,d\mu\\
&= \frac{p'(t)}{p(t)^2}\int \left(e^{tL}f\right)^{p(t)}\log\left(\int \left(e^{tL}f\right)^{p(t)}d\mu\right)d\mu \le 0, \tag{8.185}
\end{align*}
where we used (8.181). From (8.185) it follows that $g'(t) \le 0$, $t \in [0, t_0]$, and hence $g(t) \le g(0)$, which shows inequality (8.180) in assertion (ii). Since $t_0 \in (0, \infty)$ is arbitrary, assertion (ii) follows from (i).

(ii) $\Rightarrow$ (i). It suffices to prove assertion (i) in case $\int f^p\,d\mu = 1$. Again let the function $g$ be defined by (8.182). Now we use $g'(0)$ to show that (i) is a consequence of (ii). In fact from assertion (ii) we get $g(t) \le g(0)$, $t \ge 0$, and hence $g'(0) \le 0$. From (8.183) for $t = 0$ and the fact that $\int f^p\,d\mu = 1$ we see that inequality (8.177) in assertion (ii) of Proposition 8.66 follows.
Proposition 8.68. Suppose that there exist constants $c > 0$ and $\gamma \in \mathbb{R}$ such that
\[
0 \le e^{tL}\left|e^{\rho L}f\right|^2 - \left|e^{(\rho+t)L}f\right|^2 \le c\,e^{-\rho\gamma}\,e^{\rho L}\left\{e^{tL}|f|^2 - \left|e^{tL}f\right|^2\right\} \tag{8.186}
\]
for all $f \in C_b(E)$, $\rho$, $t \ge 0$. Then the inequality
\[
\Gamma_1\left(\overline{e^{\rho L}f},\, e^{\rho L}f\right) \le c\,e^{-\rho\gamma}\,e^{\rho L}\,\Gamma_1\left(\overline{f}, f\right) \tag{8.187}
\]
holds for all $\rho \ge 0$ and all $f \in C_b(E)$. Consequently, the inequality in (8.149) holds. Conversely, if (8.187) holds, then the inequality in (8.186) holds as well. Consequently, if (8.186) or (8.187) is valid, then the inequality in (8.149) holds.

Proof. Let $\rho \ge 0$, and $0 \le t \le s$. Then by (8.186) with $e^{(s-t)L}f$ instead of $f$ we get
\begin{align*}
e^{tL}\left|e^{(\rho+s-t)L}f\right|^2 - \left|e^{(\rho+s)L}f\right|^2
&\le c\,e^{-\gamma\rho}\,e^{(\rho+t)L}\left|e^{(s-t)L}f\right|^2 - c\,e^{-\gamma\rho}\,e^{\rho L}\left|e^{sL}f\right|^2\\
&= c\,e^{-\gamma\rho}\,e^{\rho L}\left(e^{tL}\left|e^{(s-t)L}f\right|^2 - \left|e^{sL}f\right|^2\right). \tag{8.188}
\end{align*}
We divide the terms in (8.188) by $t > 0$ and let $t$ tend to zero to obtain:
\begin{align*}
&\Gamma_1\left(\overline{e^{(s+\rho)L}f},\, e^{(s+\rho)L}f\right)\\
&= L\left|e^{(s+\rho)L}f\right|^2 - \overline{Le^{(\rho+s)L}f}\,e^{(\rho+s)L}f - \overline{e^{(\rho+s)L}f}\,Le^{(\rho+s)L}f\\
&\le c\,e^{-\gamma\rho}\,e^{\rho L}\left(L\left|e^{sL}f\right|^2 - \overline{Le^{sL}f}\,e^{sL}f - \overline{e^{sL}f}\,Le^{sL}f\right)\\
&= c\,e^{-\gamma\rho}\,e^{\rho L}\,\Gamma_1\left(\overline{e^{sL}f},\, e^{sL}f\right). \tag{8.189}
\end{align*}
In order to obtain (8.189) we again employed (8.146). In (8.189) we let $s$ tend to zero to get:
\begin{align*}
\Gamma_1\left(\overline{e^{\rho L}f},\, e^{\rho L}f\right)
&= L\left|e^{\rho L}f\right|^2 - \overline{Le^{\rho L}f}\,e^{\rho L}f - \overline{e^{\rho L}f}\,Le^{\rho L}f\\
&\le c\,e^{-\gamma\rho}\,e^{\rho L}\left(L|f|^2 - \overline{Lf}\,f - \overline{f}\,Lf\right)\\
&= c\,e^{-\gamma\rho}\,e^{\rho L}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.190}
\end{align*}
Notice that (8.190) coincides with the inequality in (8.187). As in Remark 8.53 we see that (8.190) yields
\begin{align*}
e^{tL}|f|^2 - \left|e^{tL}f\right|^2
&= \int_0^t e^{\rho L}\,\Gamma_1\left(\overline{e^{(t-\rho)L}f},\, e^{(t-\rho)L}f\right)d\rho\\
&\le \int_0^t e^{\rho L}\left\{c\,e^{-(t-\rho)\gamma}\,e^{(t-\rho)L}\,\Gamma_1\left(\overline{f}, f\right)\right\}d\rho\\
&= \frac{c}{\gamma}\left(1 - e^{-t\gamma}\right)e^{tL}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.191}
\end{align*}
Notice that (8.191) is the same as (8.149).

Next suppose that (8.187) holds. Then by (8.140) with $e^{(t-\sigma)L}f$ instead of $f$ we obtain
\begin{align*}
e^{tL}\left|e^{\rho L}f\right|^2 - \left|e^{tL}e^{\rho L}f\right|^2
&= \int_0^t e^{\sigma L}\,\Gamma_1\left(\overline{e^{(t-\sigma)L}e^{\rho L}f},\, e^{(t-\sigma)L}e^{\rho L}f\right)d\sigma\\
&\le c\,e^{-\gamma\rho}\,e^{\rho L}\int_0^t e^{\sigma L}\,\Gamma_1\left(\overline{e^{(t-\sigma)L}f},\, e^{(t-\sigma)L}f\right)d\sigma\\
&= c\,e^{-\gamma\rho}\,e^{\rho L}\left(e^{tL}|f|^2 - \left|e^{tL}f\right|^2\right). \tag{8.192}
\end{align*}
The inequality in (8.192) is the same as the one in (8.186).

Altogether this proves Proposition 8.68.

In the following lemma we want to establish conditions in order that the inequality (8.186), or the equivalent one (8.187), is satisfied.

Lemma 8.69. Let $f \in C_b(E)$, fix $t > 0$, and put for $\rho \ge 0$
\begin{align*}
u(\rho) &= e^{-\gamma\rho}\,e^{\rho L}v(0) = e^{-\gamma\rho}\,e^{\rho L}\left(e^{tL}|f|^2 - \left|e^{tL}f\right|^2\right), \tag{8.193}\\
v(\rho) &= e^{tL}\left|e^{\rho L}f\right|^2 - \left|e^{(\rho+t)L}f\right|^2, \quad\text{and}\\
w(\rho) &= e^{tL}\,\Gamma_1\left(\overline{e^{\rho L}f},\, e^{\rho L}f\right) - \Gamma_1\left(\overline{e^{(\rho+t)L}f},\, e^{(\rho+t)L}f\right). \tag{8.194}
\end{align*}
Suppose $w(\rho) \ge \gamma v(\rho)$, $\rho \ge 0$. Then $u(\rho) \ge v(\rho)$, $\rho \ge 0$.

Proof. A calculation shows:
\[
u'(\rho) - v'(\rho) = Lu(\rho) - Lv(\rho) + w(\rho) - \gamma u(\rho), \qquad \rho \ge 0. \tag{8.195}
\]
Inserting the inequality $w(\rho) \ge \gamma v(\rho)$ in (8.195) shows:
\[
u'(\rho) - v'(\rho) \ge Lu(\rho) - Lv(\rho) - \gamma\left(u(\rho) - v(\rho)\right), \qquad \rho \ge 0. \tag{8.196}
\]
From (8.196) we see
\[
u'(\rho) - v'(\rho) = Lu(\rho) - Lv(\rho) - \gamma\left(u(\rho) - v(\rho)\right) + p(\rho), \qquad \rho \ge 0, \tag{8.197}
\]
where $p(\rho) \ge 0$. Then we have
\[
u(\rho) - v(\rho) = e^{-\gamma\rho}\,e^{\rho L}\left(u(0) - v(0)\right) + \int_0^\rho e^{-\gamma(\rho-\sigma)}\,e^{(\rho-\sigma)L}\,p(\sigma)\,d\sigma \ge 0. \tag{8.198}
\]
This proves Lemma 8.69.

We observe that $w(\rho) \ge \gamma v(\rho)$ for all $\rho \ge 0$ if and only if
\[
e^{tL}\,\Gamma_1\left(\overline{g}, g\right) - \Gamma_1\left(\overline{e^{tL}g},\, e^{tL}g\right) \ge \gamma\left(e^{tL}|g|^2 - \left|e^{tL}g\right|^2\right) \tag{8.199}
\]
for all functions $g$ of the form $g = e^{\rho L}f$, $\rho \ge 0$. The following lemma gives conditions in order that the inequality (8.199) is satisfied.
Lemma 8.70. Suppose that
\[
L\,\Gamma_1\left(\overline{g}, g\right) - 2\Re\left(\Gamma_1\left(\overline{Lg},\, g\right)\right) \ge \gamma\,\Gamma_1\left(\overline{g}, g\right) \tag{8.200}
\]
for all functions $g$ of the form $g = e^{\rho L}f$, $\rho \ge 0$. Then the inequality in (8.199) is satisfied for such functions.

Proof (Proof of Lemma 8.70). We write
\begin{align*}
&e^{tL}\,\Gamma_1\left(\overline{g}, g\right) - \Gamma_1\left(\overline{e^{tL}g},\, e^{tL}g\right)\\
&= \int_0^t e^{\rho L}\left(L\,\Gamma_1\left(\overline{e^{(t-\rho)L}g},\, e^{(t-\rho)L}g\right) - 2\Re\left(\Gamma_1\left(\overline{Le^{(t-\rho)L}g},\, e^{(t-\rho)L}g\right)\right)\right)d\rho\\
&\ge \gamma\int_0^t e^{\rho L}\,\Gamma_1\left(\overline{e^{(t-\rho)L}g},\, e^{(t-\rho)L}g\right)d\rho = \gamma\left(e^{tL}|g|^2 - \left|e^{tL}g\right|^2\right). \tag{8.201}
\end{align*}
The inequality in (8.201) proves Lemma 8.70.

The bilinear mapping $(f, g) \mapsto \Gamma_2(f, g)$, $f$, $g \in \mathcal{A}$, where
\[
\Gamma_2(f, g) = L\,\Gamma_1(f, g) - \Gamma_1(Lf, g) - \Gamma_1(f, Lg) \tag{8.202}
\]
is called the first iterated squared gradient operator. The inequality in (8.200) says that
\[
\Gamma_2\left(\overline{g}, g\right) \ge \gamma\,\Gamma_1\left(\overline{g}, g\right), \qquad g \in \mathcal{A}. \tag{8.203}
\]
The following result can also be found as Lemma 1.2 and Lemma 1.3 in Ledoux [144]; the proofs go back to Bakry [13, 14].
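For the one-dimensional Ornstein-Uhlenbeck generator $Lf = f'' - xf'$ (an illustrative choice, not taken from the text) the operators $\Gamma_1$ and $\Gamma_2$ can be computed symbolically; in the normalization of (8.202) one finds $\Gamma_2(f, f) - 2\,\Gamma_1(f, f) = 4\left(f''\right)^2 \ge 0$, so (8.203) holds with $\gamma = 2$. A sketch of this computation:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)

# one-dimensional Ornstein-Uhlenbeck generator (illustrative): Lg = g'' - x g'
L = lambda g: sp.diff(g, x, 2) - x * sp.diff(g, x)

# squared gradient in the normalization of (8.146): Gamma1(g, h) = L(gh) - g Lh - h Lg
G1 = lambda g, h: sp.expand(L(g * h) - g * L(h) - h * L(g))

g1 = G1(f, f)                              # equals 2 f'(x)^2
G2 = sp.expand(L(g1) - 2 * G1(L(f), f))    # iterated operator (8.202) with equal arguments

# Gamma2 - 2 Gamma1 = 4 f''(x)^2 >= 0, so (8.203) holds with gamma = 2
residual = sp.simplify(G2 - 2 * g1 - 4 * sp.diff(f, x, 2) ** 2)
print(residual)  # 0
```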
Theorem 8.71. Let $\gamma \in \mathbb{R}$. The following assertions are equivalent:

(i) The following inequality holds for all $f \in \mathcal{A}$:
\[
\Gamma_2\left(\overline{f}, f\right) - \gamma\,\Gamma_1\left(\overline{f}, f\right) \ge 0. \tag{8.204}
\]
(ii) For every $t \ge 0$ and $f \in \mathcal{A}$ the following inequality holds:
\[
\Gamma_1\left(\overline{e^{tL}f},\, e^{tL}f\right) \le e^{-\gamma t}\,e^{tL}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.205}
\]
(iii) For every $t \ge 0$ and $f \in \mathcal{A}$ the following inequality holds:
\[
\left(\Gamma_1\left(\overline{e^{tL}f},\, e^{tL}f\right)\right)^{1/2} \le e^{-\frac12\gamma t}\,e^{tL}\left(\Gamma_1\left(\overline{f}, f\right)^{1/2}\right). \tag{8.206}
\]
(iv) The following inequality holds for all $f \in \mathcal{A}$:
\begin{align*}
\Gamma_2\left(\overline{f}, f\right) - \gamma\,\Gamma_1\left(\overline{f}, f\right)
&\ge \frac{\Gamma_1\left(\Gamma_1\left(\overline{f}, f\right),\, \Gamma_1\left(\overline{f}, f\right)\right)}{4\,\Gamma_1\left(\overline{f}, f\right)}\\
&= \Gamma_1\left(\Gamma_1\left(\overline{f}, f\right)^{1/2},\, \Gamma_1\left(\overline{f}, f\right)^{1/2}\right). \tag{8.207}
\end{align*}

Notice that the inequality in (8.204) is the same as the one in (8.203). For more details on the iterated squared gradient operators see e.g. Bakry [13, 14, 16, 17, 15], Bakry and Ledoux [19], Ledoux [144, 145], and Rothaus [203, 202, 204].
Remark 8.72. Let $\Psi_1$, $\Psi_2 : \mathbb{R}^n \to \mathbb{R}$ be smooth, i.e. $C^{(2)}$-functions, and $F = (f_1, \ldots, f_n)$ a vector in $\mathcal{A}^n$. In the proof we employ the following equality:
\[
\Gamma_1\left(\Psi_1(F), \Psi_2(F)\right) = \sum_{i,j=1}^n X_i^{(1)} X_j^{(2)}\,\Gamma_1\left(f_i, f_j\right) \tag{8.208}
\]
where $X_i^{(k)} = \dfrac{\partial}{\partial x_i}\Psi_k(F)$, $1 \le i \le n$, $k = 1, 2$. With $\Psi_1(g) = \Psi_2(g) = g^2$, $g = \Gamma_1\left(\overline{f}, f\right)^{1/2}$, this shows the equality sign in (8.207). Compare (8.208) and (8.215) below. In the implication (i) $\Rightarrow$ (iv) we also need the Hessian of a function $f$. The Hessian $H(f)$ of $f$ is the bilinear mapping defined in (8.213) below. Its main transformation property is given in (8.214).

Proof. The implication (iii) $\Rightarrow$ (ii) follows from the Cauchy-Schwarz inequality in conjunction with (8.206). In fact
\[
\left|e^{tL}(gh)\right|^2 \le e^{tL}|g|^2 \cdot e^{tL}|h|^2 \le \|g\|_\infty^2\, e^{tL}|h|^2
\]
applied with $g = 1$ and $h = \Gamma_1\left(\overline{f}, f\right)^{1/2}$ shows that (iii) $\Rightarrow$ (ii).

(ii) $\Rightarrow$ (i). Subtracting the left-hand side from the right-hand side of (8.205), dividing by $t > 0$, and letting $t \downarrow 0$ yields:
\[
(L - \gamma)\,\Gamma_1\left(\overline{f}, f\right) - \Gamma_1\left(\overline{Lf}, f\right) - \Gamma_1\left(\overline{f}, Lf\right) \ge 0. \tag{8.209}
\]
However, the inequality in (8.209) is equivalent to (8.204).


(iv) =⇒ (iii). We fix f ∈ A and t > 0, and we define the function Φ(s),
s ∈ [0, t], by
8.2 Some related stability results 445
½³ ³ ´´1/2 ¾
1
Φ(s) = e− 2 γs esL Γ1 e(t−s)L f , e(t−s)L f . (8.210)

Rt
Then we want to show Φ(t) ≥ Φ(0). Since Φ(t) − Φ(0) = 0
Φ0 (s)ds, it suffices
to prove that Φ0 (s) ≥ 0. Therefore we calculate:

Φ0 (s)
µ ¶ ·³ ³ ´´1/2 ¸
− 12 γs+sL 1 (t−s)L (t−s)L
=e L− γ Γ1 e f, e f
2
· ³ ³ ´´1/2 ¸
− 12 γs+sL ∂ (t−s)L (t−s)L
+e Γ1 e f, e f
∂s
µ ¶ ·³ ³ ´´1/2 ¸
− 12 γs+sL 1 (t−s)L (t−s)L
=e L− γ Γ1 e f, e f
2
"
1 1 1
− e− 2 γs+sL ¡ ¡ ¢¢1/2
2 Γ e (t−s)L f , e(t−s)L f
1
#
³ ³ ´ ³ ´´
(t−s)L (t−s)L (t−s)L (t−s)L
Γ1 Le f, e f + Γ1 e f , Le f
µ ¶ ·³ ³ ´´1/2 ¸
1 1
= e− 2 γs+sL L − γ Γ1 e(t−s)L f , e(t−s)L f
2
"
1 1 1
− e− 2 γs+sL ¡ ¡ ¢¢1/2
2 Γ e(t−s)L f , e(t−s)L f
1
#
³ ³ ´ ³ ´´
(t−s)L (t−s)L (t−s)L (t−s)L
LΓ1 e f, e f − Γ2 e f, e f
µ ¶ ·³ ³ ´´1/2 ¸
1 1
= e− 2 γs+sL L − γ Γ1 e(t−s)L f , e(t−s)L f
2
" ¡ ¢ #
1 − 1 γs+sL LΓ1 e(t−s)L f , e(t−s)L f
− e 2
¡ ¡ ¢¢1/2
2 Γ1 e(t−s)L f , e(t−s)L f
" ¡ ¢ #
1 − 1 γs+sL Γ2 e(t−s)L f , e(t−s)L f
+ e 2 ¡ ¡ ¢¢1/2
2 Γ e(t−s)L f , e(t−s)L f
1

¡ ¢ ¡ ¡ ¢¢1/2
(L g 2 = 2gLg + Γ1 (g, g) with g = Γ1 e(t−s)L f , e(t−s)L f )
µ ¶ ·³ ³ ´´1/2 ¸
− 12 γs+sL 1 (t−s)L (t−s)L
=e L− γ Γ1 e f, e f
2
· ³ ³ ´´1/2 ¸
1 − 1 γs+sL (t−s)L (t−s)L
− e 2 2L Γ1 e f, e f
2
446 8 Coupling and Sobolev inequalities
 ³ ¡ ¢1/2 ¡ ¢1/2 ´ 
(t−s)L
1 − 1 γs+sL  Γ1 Γ1 e f , e(t−s)L f , Γ1 e(t−s)L f , e(t−s)L f
− e 2 ¡ ¡ ¢¢1/2 
2 Γ1 e(t−s)L f , e(t−s)L f
" ¡ ¢ #
1 − 1 γs+sL Γ2 e(t−s)L f , e(t−s)L f
+ e 2 ¡ ¡ ¢¢1/2
2 Γ e(t−s)L f , e(t−s)L f
1

¡ ¢ ¡ ¡ ¢¢1/2
(Γ1 g 2 , g 2 = 4g 2 Γ1 (g, g) with g = Γ1 e(t−s)L f , e(t−s)L f )
" ¡ ¡ ¢ ¡ ¢¢ #
1 − 1 γs+sL Γ1 Γ1 e(t−s)L f , e(t−s)L f , Γ1 e(t−s)L f , e(t−s)L f
=− e 2 ¡ ¡ ¢¢3/2
2 4 Γ1 e(t−s)L f , e(t−s)L f
" ¡ ¢ ¡ ¢#
1 − 1 γs+sL Γ2 e(t−s)L f , e(t−s)L f − γΓ1 e(t−s)L f , e(t−s)L f
+ e 2 ¡ ¡ ¢¢1/2 .
2 Γ e(t−s)L f , e(t−s)L f
1
(8.211)

Put g = e(t−s)L f . In order that the expression in (8.211) is positive it suffices


to prove the inequality:
1
Γ1 (g, g) (Γ2 (g, g) − γΓ1 (g, g)) ≥ Γ1 (Γ1 (g, g) , Γ1 (g, g)) . (8.212)
4
The inequality in (8.212) is a consequence of assertion (iv).
The implication (i) $\Rightarrow$ (iv) remains to be shown. Here we use the fact that $L$ generates a diffusion, so we start from (8.204), i.e. from $\Gamma_2\left(\overline{f}, f\right) - \gamma\,\Gamma_1\left(\overline{f}, f\right) \ge 0$ for all $f \in \mathcal{A}$. Without loss of generality we assume that the function $f$ is real-valued. For a given function $f \in \mathcal{A}$ we introduce its Hessian $H(f)$ as the bilinear form:
\[
H(f)(g, h) = \frac12\left[\Gamma_1\left(\Gamma_1(f, g), h\right) + \Gamma_1\left(\Gamma_1(f, h), g\right) - \Gamma_1\left(f, \Gamma_1(g, h)\right)\right], \tag{8.213}
\]
$g$, $h \in \mathcal{A}$. Let $\Psi : \mathbb{R}^n \to \mathbb{R}$ be a smooth function, and let $F = (f_1, \ldots, f_n)$ be a vector in $\mathcal{A}^n$. Put $X_i = \dfrac{\partial}{\partial x_i}\Psi(F)$ and $X_{i,j} = \dfrac{\partial^2}{\partial x_i\,\partial x_j}\Psi(F)$, $1 \le i, j \le n$. Then a cumbersome calculation shows:
\begin{align*}
\Gamma_2\left(\Psi(F), \Psi(F)\right) &= \sum_{i,j=1}^n X_i X_j\,\Gamma_2\left(f_i, f_j\right) + 2\sum_{i,j,k=1}^n X_i X_{j,k}\,H(f_i)\left(f_j, f_k\right)\\
&\quad + \sum_{i,j,k,\ell=1}^n X_{i,j} X_{k,\ell}\,\Gamma_1\left(f_i, f_k\right)\Gamma_1\left(f_j, f_\ell\right). \tag{8.214}
\end{align*}
A similar but much simpler calculation shows:
\[
\Gamma_1\left(\Psi(F), \Psi(F)\right) = \sum_{i,j=1}^n X_i X_j\,\Gamma_1\left(f_i, f_j\right). \tag{8.215}
\]
If the function $F \mapsto \Psi(F)$ varies among all real polynomials of second order, then the function
\[
\left(X_1, \ldots, X_n;\ \left(X_{i,j}\right)_{i,j=1}^n\right) \mapsto \Gamma_2\left(\Psi(F), \Psi(F)\right) - \gamma\,\Gamma_1\left(\Psi(F), \Psi(F)\right) \tag{8.216}
\]
is a positive polynomial. We may apply this for $n = 2$, $f_1 = f$, $f_2 = g$, and the function $\Psi(f, g)$ chosen in such a way that $X_2 = X_{1,1} = X_{2,2} = 0$. Then from (8.214), (8.215) and (8.216) we get:
\[
X_1^2\left(\Gamma_2(f, f) - \gamma\,\Gamma_1(f, f)\right) + 4 X_1 X_{1,2}\,H(f)(f, g) + 2 X_{1,2}^2\left(\Gamma_1(f, g)^2 + \Gamma_1(f, f)\,\Gamma_1(g, g)\right) \ge 0 \tag{8.217}
\]
for all $X_1$ and $X_{1,2} \in \mathbb{R} \setminus \{0\}$. Then we choose $\Psi(f, g)$ in such a way that $X_1\left(\Gamma_2(f, f) - \gamma\,\Gamma_1(f, f)\right) = -2 X_{1,2}\,H(f)(f, g)$. Then from (8.217) we infer:
\[
4\left(H(f)(f, g)\right)^2 \le 2\left(\Gamma_2(f, f) - \gamma\,\Gamma_1(f, f)\right)\left(\Gamma_1(f, g)^2 + \Gamma_1(f, f)\,\Gamma_1(g, g)\right). \tag{8.218}
\]
Since $2 H(f)(f, g) = \Gamma_1\left(\Gamma_1(f, f), g\right)$, (8.218) implies
\[
\left(\Gamma_1\left(\Gamma_1(f, f), g\right)\right)^2 \le 2\left(\Gamma_2(f, f) - \gamma\,\Gamma_1(f, f)\right)\left(\Gamma_1(f, g)^2 + \Gamma_1(f, f)\,\Gamma_1(g, g)\right)
\]
(use $\Gamma_1(f, g)^2 \le \Gamma_1(f, f)\,\Gamma_1(g, g)$)
\[
\le 4\left(\Gamma_2(f, f) - \gamma\,\Gamma_1(f, f)\right)\Gamma_1(f, f)\,\Gamma_1(g, g). \tag{8.219}
\]
Choosing $g = \Gamma_1(f, f)$ and employing (8.219) entails (8.207) with the real function $f$ instead of a complex function $f \in \mathcal{A}$. By splitting a complex function into its real and imaginary parts we see that (8.207) follows for all $f \in \mathcal{A}$.

This completes the proof of the implication (i) $\Rightarrow$ (iv), and concludes the proof of Theorem 8.71.

Proposition 8.73. Suppose that (8.200) is satisfied for all functions $g \in D(L)$. Then the equivalent inequalities (8.186) and (8.187) in Proposition 8.68 are satisfied with $c = 1$. If $\gamma > 0$, then the operator $L$ has a spectral gap $\ge \gamma$.

Proof (Proof of Proposition 8.73). If (8.200) is satisfied for all functions $g \in D(L)$, then by Lemma 8.70 the inequality (8.199) is satisfied for all functions $g \in D(L)$. Lemma 8.69 implies that
\[
e^{-\gamma\rho}\,e^{\rho L}\left(e^{tL}|f|^2 - \left|e^{tL}f\right|^2\right) \ge e^{tL}\left|e^{\rho L}f\right|^2 - \left|e^{(\rho+t)L}f\right|^2. \tag{8.220}
\]
Proposition 8.68 and (8.220) show that the equivalent inequalities (8.186) and (8.187) in Proposition 8.68 are satisfied with $c = 1$. Hence we obtain the inequality in (8.149) with $c = 1$:
\begin{align*}
e^{tL}|f|^2 - \left|e^{tL}f\right|^2
&= \int_0^t e^{\rho L}\,\Gamma_1\left(\overline{e^{(t-\rho)L}f},\, e^{(t-\rho)L}f\right)d\rho\\
&\le \int_0^t e^{\rho L}\left\{e^{-(t-\rho)\gamma}\,e^{(t-\rho)L}\,\Gamma_1\left(\overline{f}, f\right)\right\}d\rho\\
&= \frac1\gamma\left(1 - e^{-t\gamma}\right)e^{tL}\,\Gamma_1\left(\overline{f}, f\right). \tag{8.221}
\end{align*}
Let $\mu$ be an invariant probability measure such that $\lim_{t\to\infty} e^{tL}f(x) = \int f\,d\mu$ for all $f \in C_b(E)$ and $x \in E$. The existence and uniqueness of such an invariant probability measure is guaranteed by Orey's convergence theorem: see Theorem 9.4, and also (8.101). It is required that the Markov process is Harris recurrent. The monotonicity property in Lemma 9.61 of Chapter 9 implies that this limit exists by letting $t > 0$ tend to $\infty$ instead of $n \in \mathbb{N}$. Then by integrating (8.221) against $\mu$ and taking the limit as $t \to \infty$, we find
\[
\gamma\left(\int |f|^2\,d\mu - \left|\int f\,d\mu\right|^2\right) \le \int \Gamma_1\left(\overline{f}, f\right)d\mu. \tag{8.222}
\]
From (8.222) and Definition 8.54 the claim in Proposition 8.73 readily follows.

The inequality (8.230) below is a consequence of equality (8.140) in Theorem 8.51. For convenience we insert a (short) proof here as well. The inequality in (8.231) employs the full power of Theorem 8.71. The proof of Theorem 8.75 requires the following lemma.
Lemma 8.74. Suppose that the constant $\gamma$ satisfies the inequality in (8.229) in Theorem 8.75 below. Let $f \in \mathcal{A}$, and $s \ge 0$. Then the following inequality holds:
\[
\frac{\Gamma_1\left(e^{sL}|f|^2,\, e^{sL}|f|^2\right)}{e^{sL}|f|^2} \le e^{-\gamma s}\,e^{sL}\left\{\frac{\Gamma_1\left(|f|^2, |f|^2\right)}{|f|^2}\right\}. \tag{8.223}
\]
In addition, the following inequality holds:
\[
\frac{\Gamma_1\left(|f|^2, |f|^2\right)}{|f|^2} \le 4\,\Gamma_1\left(\overline{f}, f\right). \tag{8.224}
\]
If in (8.224) the function $f$ is real-valued, then this inequality is in fact an equality.
Proof. From inequality (8.206) in assertion (iii) of Theorem 8.71 we infer
\begin{align*}
\Gamma_1\left(e^{sL}|f|^2,\, e^{sL}|f|^2\right)^{1/2}
&\le e^{-\frac12\gamma s}\,e^{sL}\left(\Gamma_1\left(|f|^2, |f|^2\right)^{1/2}\right)\\
&= e^{-\frac12\gamma s}\,e^{sL}\left\{|f|\cdot\left(\frac{\Gamma_1\left(|f|^2, |f|^2\right)}{|f|^2}\right)^{1/2}\right\}
\end{align*}
(Cauchy-Schwarz inequality)
\[
\le e^{-\frac12\gamma s}\left(e^{sL}|f|^2\right)^{1/2}\left(e^{sL}\left\{\frac{\Gamma_1\left(|f|^2, |f|^2\right)}{|f|^2}\right\}\right)^{1/2}. \tag{8.225}
\]
The inequality in (8.223) easily follows from (8.225). The inequality in (8.224) follows from the transformation rules of the squared gradient operator $\Gamma_1$. More precisely, with $f = u + iv$, $u$ and $v$ the real and imaginary parts of $f$, we have
\[
\Gamma_1\left(|f|^2, |f|^2\right) = 4u^2\,\Gamma_1(u, u) + 8uv\,\Gamma_1(u, v) + 4v^2\,\Gamma_1(v, v), \tag{8.226}
\]
and
\[
4|f|^2\,\Gamma_1\left(\overline{f}, f\right) = 4\left(u^2 + v^2\right)\left(\Gamma_1(u, u) + \Gamma_1(v, v)\right). \tag{8.227}
\]
Since
\[
2uv\,\Gamma_1(u, v) \le 2\left|u\sqrt{\Gamma_1(v, v)}\right|\cdot\left|v\sqrt{\Gamma_1(u, u)}\right| \le u^2\,\Gamma_1(v, v) + v^2\,\Gamma_1(u, u), \tag{8.228}
\]
the inequality in (8.224) readily follows from (8.226), (8.227) and (8.228).

This completes the proof of Lemma 8.74.

Theorem 8.75. Suppose that the constant γ ∈ R satisfies one of the equivalent conditions in Theorem 8.71 for the operator L, i.e.
\[
\Gamma_2\!\left(f,\overline{f}\right) \ge \gamma\,\Gamma_1\!\left(f,\overline{f}\right) \quad \text{for all } f \in \mathcal{A}. \tag{8.229}
\]
Then the following inequalities hold for f ∈ A and t ≥ 0:
\[
e^{tL}|f|^2 - \left|e^{tL}f\right|^2 \le \frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\!\left(\Gamma_1\!\left(f,\overline{f}\right)\right), \quad \text{and} \tag{8.230}
\]
\[
e^{tL}\!\left(|f|^2 \log |f|^2\right) - e^{tL}|f|^2\, \log\!\left(e^{tL}|f|^2\right) \le 4\,\frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\!\left(\Gamma_1\!\left(f,\overline{f}\right)\right). \tag{8.231}
\]
450 8 Coupling and Sobolev inequalities

The inequality in (8.230) can be called a pointwise Poincaré inequality. It is a
consequence of assertion (ii) of Theorem 8.71. The inequality in (8.231) may
be called a logarithmic Sobolev inequality. Its proof is based on assertion
(iii) in Theorem 8.71, which is a consequence of assertion (iv). It is clear that
assertion (iv) is an improvement of our basic assumption (8.229).
Proof. Let f ∈ A and t > 0. Then we have
\begin{align*}
e^{tL}|f|^2 - \left|e^{tL}f\right|^2
&= \int_0^t \frac{\partial}{\partial s}\, e^{sL}\left|e^{(t-s)L}f\right|^2 ds \\
&= \int_0^t e^{sL}\left\{ L\left|e^{(t-s)L}f\right|^2 - \left(L e^{(t-s)L}f\right) \overline{e^{(t-s)L}f} - e^{(t-s)L}f\, \overline{L e^{(t-s)L}f} \right\} ds \\
&= \int_0^t e^{sL}\, \Gamma_1\!\left(e^{(t-s)L}f,\, \overline{e^{(t-s)L}f}\right) ds. \tag{8.232}
\end{align*}
We employ (8.205) in assertion (ii) of Theorem 8.71 and use the identity in
(8.232) to obtain:
\begin{align*}
e^{tL}|f|^2 - \left|e^{tL}f\right|^2
&\le \int_0^t e^{-\gamma(t-s)}\, e^{sL} e^{(t-s)L}\, \Gamma_1\!\left(f,\overline{f}\right) ds \\
&= \frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\, \Gamma_1\!\left(f,\overline{f}\right). \tag{8.233}
\end{align*}
The inequality in (8.233) is the same as the one in (8.230).
The inequality in (8.233) is the same as the one in (8.230).
The proof of inequality (8.231) is similar, be it (much) more sophisticated.
In fact we write:
\begin{align*}
&e^{tL}\!\left(|f|^2 \log |f|^2\right) - e^{tL}|f|^2\, \log\!\left(e^{tL}|f|^2\right) \\
&= \int_0^t \frac{\partial}{\partial s}\left\{ e^{sL}\!\left( \left(e^{(t-s)L}|f|^2\right) \log\!\left(e^{(t-s)L}|f|^2\right)\right)\right\} ds \\
&= \int_0^t e^{sL}\Big\{ L\left(\left(e^{(t-s)L}|f|^2\right) \log\!\left(e^{(t-s)L}|f|^2\right)\right)
- \left(L e^{(t-s)L}|f|^2\right) \log\!\left(e^{(t-s)L}|f|^2\right) \\
&\qquad\qquad - \left(e^{(t-s)L}|f|^2\right) L \log\!\left(e^{(t-s)L}|f|^2\right) \Big\}\, ds \\
&= \int_0^t e^{sL}\left\{ \Gamma_1\!\left(e^{(t-s)L}|f|^2,\, \log\!\left(e^{(t-s)L}|f|^2\right)\right)\right\} ds \\
&= \int_0^t e^{sL}\left\{ \frac{\Gamma_1\!\left(e^{(t-s)L}|f|^2,\, e^{(t-s)L}|f|^2\right)}{e^{(t-s)L}|f|^2} \right\} ds. \tag{8.234}
\end{align*}
An appeal to inequality (8.223) in Lemma 8.74, combined with the equality
in (8.234), yields:
\begin{align*}
&e^{tL}\!\left(|f|^2 \log |f|^2\right) - e^{tL}|f|^2\, \log\!\left(e^{tL}|f|^2\right) \\
&\le \int_0^t e^{-\gamma(t-s)}\, e^{sL} e^{(t-s)L}\!\left\{ \frac{\Gamma_1\!\left(|f|^2,|f|^2\right)}{|f|^2} \right\} ds \\
&= \frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\!\left\{ \frac{\Gamma_1\!\left(|f|^2,|f|^2\right)}{|f|^2} \right\} \\
&\le 4\,\frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\!\left(\Gamma_1\!\left(f,\overline{f}\right)\right). \tag{8.235}
\end{align*}
The inequality (8.235) shows (8.231) and completes the proof of Theorem 8.75.
The following theorem contains some sufficient conditions for an operator
L to possess a spectral gap in L2 (E, µ), where µ is an invariant probability
measure on the Borel field E of E.
Theorem 8.76. Let L be the generator of a diffusion process with transition
probability function P (t, x, ·), t ≥ 0, x ∈ E. Suppose that the following conditions are satisfied:

(a) Γ1 and Γ2 satisfy Γ₂(f, f̄) ≥ γ Γ₁(f, f̄) for all f ∈ A.
(b) All probability measures B 7→ P (t, x, B), B ∈ E, with (t, x) ∈ (0, ∞) × E
are equivalent, in the sense that they have the same null-sets.
(c) The operator L has an invariant probability measure µ.

If in (a) γ > 0, then the spectral gap of L, gap(L), in L2 (E, µ) satisfies
gap(L) ≥ γ.
Proof. By invoking (8.230) in Theorem 8.75 we have
\[
e^{tL}|f|^2 - \left|e^{tL}f\right|^2 \le \frac{1-e^{-\gamma t}}{\gamma}\, e^{tL}\!\left(\Gamma_1\!\left(f,\overline{f}\right)\right), \qquad f \in \mathcal{A}. \tag{8.236}
\]
From (8.236) and the invariance of the measure µ we get
\[
\int |f|^2\, d\mu - \int \left|e^{tL}f\right|^2 d\mu \le \frac{1-e^{-\gamma t}}{\gamma} \int \Gamma_1\!\left(f,\overline{f}\right) d\mu, \qquad f \in \mathcal{A}. \tag{8.237}
\]
Suppose that γ > 0. The recurrence of the underlying Markov process in
conjunction with Orey's convergence theorem (see the arguments in the proof
of Proposition 8.73) shows the following inequality upon letting t tend to ∞ in
(8.237):
\[
\int |f|^2\, d\mu - \left| \int f\, d\mu \right|^2 \le \frac{1}{\gamma} \int \Gamma_1\!\left(f,\overline{f}\right) d\mu, \qquad f \in \mathcal{A}. \tag{8.238}
\]
The assertion in Theorem 8.76 then follows from (8.238) and the definition of
the L2-spectral gap.

Example 8.77. Next let E = Rd, and let L be the differential operator
\[
Lf = \frac12 \sum_{j,k=1}^d a_{j,k}\, \partial_j \partial_k f + \sum_{j=1}^d b_j\, \partial_j f, \qquad f \in C_b^{(2)}\!\left(\mathbb{R}^d\right), \tag{8.239}
\]
where ∂j f = ∂f/∂xj, 1 ≤ j ≤ d. It is assumed that the coefficients aj,k and
bj, 1 ≤ j ≤ d, 1 ≤ k ≤ d, are space dependent and twice continuously
differentiable. Then the corresponding squared gradient operator is given by
\[
\Gamma_1(f,g) = \sum_{j,k=1}^d a_{j,k}\, \partial_j f \cdot \partial_k g. \tag{8.240}
\]
Let f and g be functions in C^{(3)}(Rd). We want to simplify an expression of
the form
\[
L\Gamma_1(f,g) - \Gamma_1(Lf,g) - \Gamma_1(f,Lg). \tag{8.241}
\]
Notice that if f = g, then (8.241) is the same as (8.200). In order to rewrite
(8.241) we need the following proposition. This proposition is also valid for
general diffusion operators L.

Proposition 8.78. Let the functions f, g and h belong to C^{(3)}(Rd). Then
the following identities hold:
\begin{align*}
L(fgh) &= (Lf)gh + f(Lg)h + fg(Lh) \\
&\quad + \Gamma_1(f,g)\,h + f\,\Gamma_1(g,h) + g\,\Gamma_1(f,h), \quad \text{and} \\
\Gamma_1(fg,h) &= f\,\Gamma_1(g,h) + g\,\Gamma_1(f,h). \tag{8.242}
\end{align*}
Proposition 8.79. Let the functions f and g belong to C^{(3)}(Rd). Then
\[
L\Gamma_1(f,g) - \Gamma_1(Lf,g) - \Gamma_1(f,Lg)
= \sum_{j,k=1}^d \left\{ L a_{j,k} - \sum_{n=1}^d a_{n,k}\, \partial_n b_j - \sum_{n=1}^d a_{n,j}\, \partial_n b_k \right\} \partial_j f\, \partial_k g. \tag{8.243}
\]

Proof (Proof of Proposition 8.79). First we rewrite
\begin{align*}
L\Gamma_1(f,g) &= \sum_{j,k=1}^d L\left(a_{j,k}\, \partial_j f \cdot \partial_k g\right) \\
&= \sum_{j,k=1}^d \bigl\{ (L a_{j,k})\, \partial_j f \cdot \partial_k g + a_{j,k}\, (L \partial_j f)\, \partial_k g + a_{j,k}\, \partial_j f\, (L \partial_k g) \\
&\qquad\quad + \Gamma_1(a_{j,k}, \partial_j f)\, \partial_k g + \Gamma_1(a_{j,k}, \partial_k g)\, \partial_j f + a_{j,k}\, \Gamma_1(\partial_j f, \partial_k g) \bigr\} \\
&= \sum_{j,k=1}^d (L a_{j,k})\, \partial_j f \cdot \partial_k g \\
&\quad + \sum_{j,k=1}^d \sum_{n,m=1}^d a_{j,k} a_{n,m}\, \partial_n \partial_m \partial_j f \cdot \partial_k g
+ \sum_{j,k=1}^d \sum_{n=1}^d a_{j,k} b_n\, \partial_n \partial_j f \cdot \partial_k g \\
&\quad + \sum_{j,k=1}^d \sum_{n,m=1}^d a_{j,k} a_{n,m}\, \partial_j f \cdot \partial_n \partial_m \partial_k g
+ \sum_{j,k=1}^d \sum_{n=1}^d a_{j,k} b_n\, \partial_j f \cdot \partial_n \partial_k g \\
&\quad + \sum_{j,k=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n a_{j,k} \cdot \partial_m \partial_j f \cdot \partial_k g
+ \sum_{j,k=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n a_{j,k} \cdot \partial_m \partial_k g \cdot \partial_j f \\
&\quad + \sum_{j,k=1}^d \sum_{n,m=1}^d a_{j,k} a_{n,m}\, \partial_n \partial_j f \cdot \partial_m \partial_k g. \tag{8.244}
\end{align*}

We also rewrite
\begin{align*}
\Gamma_1(Lf, g) &= \Gamma_1\!\left( \sum_{j,k=1}^d a_{j,k}\, \partial_j \partial_k f + \sum_{j=1}^d b_j\, \partial_j f,\; g \right) \\
&= \sum_{j,k=1}^d \Gamma_1(a_{j,k}\, \partial_j \partial_k f,\, g) + \sum_{j=1}^d \Gamma_1(b_j\, \partial_j f,\, g) \\
&= \sum_{j,k=1}^d a_{j,k}\, \Gamma_1(\partial_j \partial_k f,\, g) + \sum_{j,k=1}^d \Gamma_1(a_{j,k},\, g)\, \partial_j \partial_k f \\
&\quad + \sum_{j=1}^d b_j\, \Gamma_1(\partial_j f,\, g) + \sum_{j=1}^d \Gamma_1(b_j,\, g)\, \partial_j f \\
&= \sum_{j,k=1}^d \sum_{n,m=1}^d a_{j,k} a_{n,m}\, \partial_n \partial_j \partial_k f \cdot \partial_m g
+ \sum_{j,k=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n a_{j,k} \cdot \partial_j \partial_k f \cdot \partial_m g \\
&\quad + \sum_{j=1}^d \sum_{n,m=1}^d b_j\, a_{n,m}\, \partial_n \partial_j f \cdot \partial_m g
+ \sum_{j=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n b_j \cdot \partial_j f \cdot \partial_m g. \tag{8.245}
\end{align*}
j=1 n,m=1

By the same token we have
\begin{align*}
\Gamma_1(f, Lg) &= \Gamma_1\!\left( f,\; \sum_{j,k=1}^d a_{j,k}\, \partial_j \partial_k g + \sum_{j=1}^d b_j\, \partial_j g \right) \\
&= \sum_{j,k=1}^d a_{j,k}\, \Gamma_1(f,\, \partial_j \partial_k g) + \sum_{j,k=1}^d \Gamma_1(a_{j,k},\, f)\, \partial_j \partial_k g \\
&\quad + \sum_{j=1}^d b_j\, \Gamma_1(f,\, \partial_j g) + \sum_{j=1}^d \Gamma_1(b_j,\, f)\, \partial_j g \\
&= \sum_{j,k=1}^d \sum_{n,m=1}^d a_{j,k} a_{n,m}\, \partial_n f \cdot \partial_m \partial_j \partial_k g
+ \sum_{j,k=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n a_{j,k}\, \partial_m f \cdot \partial_j \partial_k g \\
&\quad + \sum_{j=1}^d \sum_{n,m=1}^d b_j\, a_{n,m}\, \partial_n f \cdot \partial_m \partial_j g
+ \sum_{j=1}^d \sum_{n,m=1}^d a_{n,m}\, \partial_n b_j \cdot \partial_m f \cdot \partial_j g. \tag{8.246}
\end{align*}

From (8.244), (8.245) and (8.246) we infer:
\begin{align*}
&L\Gamma_1(f,g) - \Gamma_1(Lf,g) - \Gamma_1(f,Lg) \\
&= \sum_{j,k=1}^d (L a_{j,k})\, \partial_j f\, \partial_k g
- \sum_{j=1}^d \sum_{n,k=1}^d a_{n,k}\, \partial_n b_j \left\{ \partial_j f\, \partial_k g + \partial_k f\, \partial_j g \right\} \\
&= \sum_{j,k=1}^d \left\{ L a_{j,k} - \sum_{n=1}^d a_{n,k}\, \partial_n b_j - \sum_{n=1}^d a_{n,j}\, \partial_n b_k \right\} \partial_j f\, \partial_k g \\
&= \sum_{j,k=1}^d L_b(A)_{j,k}\, \partial_j f\, \partial_k g, \tag{8.247}
\end{align*}
where L_b(C) is the matrix with entries
\[
L_b(C)_{j,k} = L c_{j,k} - \sum_{n=1}^d c_{n,k}\, \partial_n b_j - \sum_{n=1}^d c_{n,j}\, \partial_n b_k. \tag{8.248}
\]

Here C is the matrix with entries cj,k and b stands for the column vector with
components bj . The symbol Lb can be considered as a mapping which assigns
to a square matrix consisting of functions again a square matrix consisting of
functions. The operator L is the original differential operator given in (7.132).
If we want to check an inequality like (8.186) or, what is equivalent, (8.187),
then it is probably better to consider the corresponding stochastic differential
equations.
Corollary 8.80. Fix γ ∈ R. Suppose that the inequality
\[
\sum_{j,k=1}^d \left\{ L a_{j,k} - \sum_{n=1}^d a_{n,k}\, \partial_n b_j - \sum_{n=1}^d a_{n,j}\, \partial_n b_k \right\} \lambda_j \overline{\lambda_k}
\ge \gamma \sum_{j,k=1}^d a_{j,k}\, \lambda_j \overline{\lambda_k} \tag{8.249}
\]
holds for all complex vectors (λ1, ..., λd) ∈ Cd. Then
\[
\Gamma_1\!\left(e^{\rho L}f,\, \overline{e^{\rho L}f}\right) \le e^{-\rho \gamma}\, e^{\rho L}\, \Gamma_1\!\left(f, \overline{f}\right) \tag{8.250}
\]
for all f ∈ D(L) and ρ ≥ 0.


Remark 8.81. The inequality in (8.249) says that in the sense of matrices the
inequality Lb(A) ≥ γA holds. Here we used the notation of (8.248), and
A is the symmetric matrix with entries aj,k, 1 ≤ j, k ≤ d.
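The matrix condition of Remark 8.81 is easy to test numerically. The following Python sketch (an illustration, not part of the text; the Ornstein-Uhlenbeck data below are an assumption chosen for concreteness) evaluates L_b(A) for a constant diffusion matrix and checks the matrix inequality via eigenvalues. For L = ½Δ − x·∇ one has a = I and b(x) = −x, so La_{j,k} = 0, ∂_n b_j = −δ_{nj}, and L_b(A) = 2I; hence (8.249) holds with γ = 2.

```python
import numpy as np

def Lb_matrix(La, A, Jb):
    """Matrix L_b(A) of (8.248): L_b(A)_{jk} = L a_{jk} - sum_n a_{nk} d_n b_j - sum_n a_{nj} d_n b_k.
    La : (d,d) array of (L applied to a_{jk}); A : (d,d) symmetric diffusion matrix;
    Jb : Jacobian of the drift, Jb[n, j] = d_n b_j."""
    T = Jb.T @ A              # T[j, k] = sum_n (d_n b_j) a_{nk}
    return La - T - T.T

def satisfies_8_249(La, A, Jb, gamma):
    """Check the matrix inequality L_b(A) >= gamma * A of Remark 8.81."""
    return np.linalg.eigvalsh(Lb_matrix(La, A, Jb) - gamma * A).min() >= -1e-10

# Ornstein-Uhlenbeck generator L = (1/2) Delta - x . grad (illustrative assumption):
# a = I is constant (so La = 0) and b(x) = -x has Jacobian -I; then L_b(A) = 2I.
d = 3
A, La, Jb = np.eye(d), np.zeros((d, d)), -np.eye(d)
print(satisfies_8_249(La, A, Jb, gamma=2.0))   # True: (8.249) holds with gamma = 2
print(satisfies_8_249(La, A, Jb, gamma=2.5))   # False: gamma = 2 is sharp here
```

For non-constant coefficients one would, in the same spirit, evaluate La_{j,k}, A and the Jacobian of b at sample points and check positive semidefiniteness pointwise.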
For some of our applications we will need the following somewhat technical
proposition. The result is due to Rothaus (see [204]) and the inequality in
(8.252) is named after him. A proof of the inequality (8.252) can be found
in Deuschel and Stroock [71]. Another proof can be found in Bakry [16]; for
completeness we insert an outline of a proof.

Proposition 8.82. Let µ be a probability measure on the Borel field of E and
let f ∈ Cb (E). Fix p ≥ 2. Then the following inequalities hold:
\[
\left( \int |f|^p\, d\mu \right)^{2/p} \le \left| \int f\, d\mu \right|^2
+ (p-1) \left( \int \left| f - \int f\, d\mu \right|^p d\mu \right)^{2/p}, \tag{8.251}
\]
and
\[
\mathrm{Ent}\!\left(|f|^2\right) \le 2 \int \left| f - \int f\, d\mu \right|^2 d\mu
+ \mathrm{Ent}\!\left( \left| f - \int f\, d\mu \right|^2 \right). \tag{8.252}
\]
Proof. Put f̂ = f − ∫ f dµ. By homogeneity we may assume that f is of the form
f = 1 + tg, where the function g is such that ∫ ℜg dµ = 0 and ∫ |g|² dµ = 1,
and where t ≥ 0. Then f − ∫ f dµ = tg, and ∫ |1 + tg|² dµ = 1 + t². Put
\[
F_1(t) = \left( \int |1+tg|^p\, d\mu \right)^{2/p} - (p-1)\, t^2 \left( \int |g|^p\, d\mu \right)^{2/p}, \quad \text{and} \tag{8.253}
\]
\[
F_2(t) = \int |1+tg|^2 \log |1+tg|^2\, d\mu - \left(1+t^2\right) \log\!\left(1+t^2\right) - t^2 \int |g|^2 \log |g|^2\, d\mu. \tag{8.254}
\]
We will show that F₁(t) ≤ 1 and F₂(t) ≤ 2t², t ≥ 0. The inequality in (8.251) is a
consequence of F₁(t) ≤ 1, and similarly (8.252) follows from F₂(t) ≤ 2t². We
will use the following representations:
\[
F_k(t) = F_k(0) + t\, F_k'(0) + \int_0^t (t-s)\, F_k''(s)\, ds, \qquad k = 1, 2. \tag{8.255}
\]
Then
\[
F_1'(t) = 2 \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 1} \int |1+tg|^{p-2} \left( \Re g + t |g|^2 \right) d\mu
- 2(p-1)\, t \left( \int |g|^p\, d\mu \right)^{2/p}. \tag{8.256}
\]
From (8.178) we infer:
\begin{align*}
F_1''(t) &= 2 \left( \frac2p - 1 \right) \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 2}
\left( \int |1+tg|^{p-2} \left( \Re g + t |g|^2 \right) d\mu \right)^2 \\
&\quad + 2 \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 1} \int |1+tg|^{p-2}\, |g|^2\, d\mu \\
&\quad + 2(p-2) \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 1} \int |1+tg|^{p-4} \left( \Re g + t |g|^2 \right)^2 d\mu \\
&\quad - 2(p-1) \left( \int |g|^p\, d\mu \right)^{2/p} \\
&\le 2 \left( \frac2p - 1 \right) \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 2}
\left( \int |1+tg|^{p-2} \left( \Re g + t |g|^2 \right) d\mu \right)^2 \\
&\quad + 2(p-1) \left( \int |1+tg|^p\, d\mu \right)^{\frac2p - 1} \int |1+tg|^{p-2}\, |g|^2\, d\mu \\
&\quad - 2(p-1) \left( \int |g|^p\, d\mu \right)^{2/p}. \tag{8.257}
\end{align*}
In (8.257) we apply Hölder's inequality to obtain:
\[
\int |1+tg|^{p-2}\, |g|^2\, d\mu \le \left( \int |1+tg|^p\, d\mu \right)^{1-\frac2p} \left( \int |g|^p\, d\mu \right)^{2/p}. \tag{8.258}
\]
As conjugate exponents we used p/(p−2) and p/2. From (8.258) and (8.257) we
then infer F₁″(t) ≤ 0. Since F₁′(0) = 0, the representation (8.255) with k = 1 implies
F₁(t) ≤ F₁(0) = 1.

Next we calculate the first and second derivative of t 7→ F₂(t):
\begin{align*}
F_2'(t) &= 2 \int \left( \Re g + t |g|^2 \right) \log\!\left( 1 + 2t \Re g + t^2 |g|^2 \right) d\mu \\
&\quad + 2 \int \left( \Re g + t |g|^2 \right) d\mu - 2t \log\!\left(1+t^2\right) - 2t - 2t \int |g|^2 \log |g|^2\, d\mu \\
&= 2 \int \left( \Re g + t |g|^2 \right) \log\!\left( 1 + 2t \Re g + t^2 |g|^2 \right) d\mu \\
&\quad - 2t \log\!\left(1+t^2\right) - 2t \int |g|^2 \log |g|^2\, d\mu. \tag{8.259}
\end{align*}
Its second derivative is given by
\begin{align*}
F_2''(t) &= 2 \int |g|^2 \log \frac{1 + 2t \Re g + t^2 |g|^2}{|g|^2}\, d\mu
+ 4 \int \frac{\left( \Re g + t |g|^2 \right)^2}{|1+tg|^2}\, d\mu \\
&\quad - 2 \log\!\left(1+t^2\right) - \frac{4t^2}{1+t^2} \\
&\le 2 \int |g|^2 \log \frac{1 + 2t \Re g + t^2 |g|^2}{|g|^2}\, d\mu + 4 \int |g|^2\, d\mu
- 2 \log\!\left(1+t^2\right) - \frac{4t^2}{1+t^2} \\
&= 2 \int |g|^2 \log \frac{1 + 2t \Re g + t^2 |g|^2}{|g|^2}\, d\mu + \frac{4}{1+t^2} - 2 \log\!\left(1+t^2\right). \tag{8.260}
\end{align*}
Since the function x 7→ log x, x > 0, is concave, and the measure B 7→
∫_B |g|² dµ is a probability measure, Jensen's inequality implies ∫ |g|² log h dµ ≤
log ∫ |g|² h dµ. Applying this inequality to h = |1 + tg|²/|g|² = (1 + 2tℜg + t²|g|²)/|g|²
in (8.260) shows
\[
F_2''(t) \le 2 \log \int \left( 1 + 2t \Re g + t^2 |g|^2 \right) d\mu + \frac{4}{1+t^2} - 2 \log\!\left(1+t^2\right) = \frac{4}{1+t^2}. \tag{8.261}
\]
Since F₂(0) = 0 = F₂′(0), it follows from the representation in (8.255) that
\[
F_2(t) \le \int_0^t (t-s)\, \frac{4}{1+s^2}\, ds \le 2t^2. \tag{8.262}
\]
From (8.262) the inequality in (8.252) follows. This concludes the proof of
Proposition 8.82.
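Rothaus's inequality (8.251) can be sanity-checked numerically on discrete probability measures. The following Python sketch (illustrative only; random real-valued instances of f, µ and p are an assumption of the example) evaluates both sides of (8.251) and verifies that the difference is nonnegative; for p = 2 the two sides coincide.

```python
import numpy as np

rng = np.random.default_rng(0)

def rothaus_gap(f, mu, p):
    """RHS minus LHS of (8.251) for a discrete probability measure mu;
    a nonnegative value means the inequality holds for this instance."""
    mean = np.sum(f * mu)
    lhs = np.sum(np.abs(f) ** p * mu) ** (2.0 / p)
    rhs = abs(mean) ** 2 + (p - 1) * np.sum(np.abs(f - mean) ** p * mu) ** (2.0 / p)
    return rhs - lhs

for trial in range(1000):
    n = int(rng.integers(2, 8))
    mu = rng.random(n); mu /= mu.sum()      # random probability measure
    f = rng.normal(size=n)                  # random real-valued function
    p = float(rng.uniform(2.0, 6.0))
    assert rothaus_gap(f, mu, p) >= -1e-12
print("inequality (8.251) verified on 1000 random instances")
```

For p = 2 the gap vanishes identically, in accordance with the computation ∫|f|² dµ = |∫f dµ|² + ∫|f − ∫f dµ|² dµ.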

8.3 Notes
The result in Theorem 8.3 is taken from Chen and Wang [53] Theorem 4.13. In
[53] Chen and Wang wonder whether the condition that there exists a constant
a > 0 such that ⟨a(x)ξ, ξ⟩ ≤ a |ξ|² for all x, ξ ∈ Rd is really necessary to arrive
at a Poincaré inequality. This problem is not solved in Theorem 8.75. However,
inequality (8.204) gives a condition in terms of the iterated squared gradient
operator Γ2 and Γ1 which guarantees a pointwise Poincaré type inequality:
see inequality (8.230) in Theorem 8.75.
As mentioned earlier in [16] and [17] Bakry gives much more information
on (iterates) of squared gradient operators. The squared gradient operator
was introduced by Roth [201] as a tool to study Markov processes. For that
matter this is still an important tool: see e.g. Carlen and Stroock [47], Qian
[194], Aida [1], Mazet [157], Barlow, Bass and Kumagai [23], and Wang [251].
Of course the main inspirators for promoting and studying the subject of
(iterated) squared gradient operators were and still are Émery and Bakry: see
e.g. [13, 14, 15, 16, 17, 18].
In the abstract of [143] Ledoux writes “In the line of investigation of the
works by D. Bakry and M. Emery ([18]) and O. S. Rothaus ([202, 204]) we
study an integral inequality behind the “Γ2 ” criterion of D. Bakry and M.
Emery (see previous reference) and its applications to hypercontractivity of
diffusion semigroups. With, in particular, a short proof of the hypercontractiv-
ity property of the Ornstein-Uhlenbeck semigroup, our exposition unifies in a
simple way several previous results, interpolating smoothly from the spectral
gap inequalities to logarithmic Sobolev inequalities and even true Sobolev
inequalities. We examine simultaneously the extremal functions for hyper-
contractivity and logarithmic Sobolev inequalities of the Ornstein-Uhlenbeck
semigroup and heat semigroup on spheres.”
It seems that these phrases are still in place. In fact the techniques of
(iterated) squared gradient operators can also be applied in the infinite-
dimensional setting: see e.g. Wang [251].
9
Miscellaneous topics

In this chapter we collect some well-known and not so well-known results about
martingales and stopping times. We also prove the existence and uniqueness
of invariant measures.

9.1 Martingales
In this section we recall some interesting facts about martingales. This mate-
rial is taken from [241].
(1) Let (Ω, F, P) be a probability space, and let (Ft : t ≥ 0) be a filtration
on Ω; i.e. s < t implies Fs ⊆ Ft ⊆ F. Suppose that F is the σ-field
generated by Ft , t ≥ 0. Moreover, let Y belong to L1 (Ω, F, P). Put
M (t) = E[Y | Ft ]. Then the process t 7→ M (t) is the standard example of a closed
martingale. This martingale is closed, because Y = L1 - limt→∞ M (t). This
limit is also a P-almost sure limit.
(2) Let (Ω, F, P) be a probability space and let W (t) : Ω → Rd be Brownian
motion starting at zero. Then the process W (t), t ≥ 0, is a martingale.
The same is true for the process t 7→ |W (t)|² − dt.
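A necessary condition for the martingale property of t 7→ |W(t)|² − dt is that its expectation is constant in t (equal to 0, since the process starts at 0). A small Monte Carlo sketch (Python; the dimension and sample sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

# E[|W(t)|^2 - d t] should vanish for every t; here W(t) ~ N(0, t I_d).
d, n_paths = 3, 200_000
for t in (0.5, 1.0, 2.0):
    W_t = rng.normal(scale=np.sqrt(t), size=(n_paths, d))
    m = np.mean(np.sum(W_t**2, axis=1) - d * t)
    print(f"t={t}: E[|W(t)|^2 - d t] ~ {m:+.4f}")   # close to 0
    assert abs(m) < 0.1
```

This only checks a consequence of the martingale property; the full property E[M(t) | Fs] = M(s) would require conditioning on the path up to time s.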
(3) Let (Ω, F, P) be a probability space and let W (t) : Ω → Rd be Brownian
motion. Let {H(t) : t ≥ 0} be a predictable process. This means that H(t)
is Ft -measurable for each t ≥ 0, and that the mapping (t, ω) 7→ H(t, ω) is
measurable with respect to the σ-field generated by
{ 1A ⊗ 1(s,t] : A Fs -measurable, s < t }.
Suppose that E[∫₀ᵗ |H(s)|² ds] < ∞ for all t > 0. Then the process
t 7→ ∫₀ᵗ H(s) dW (s) is a martingale in L2 (Ω, F, P). If we only assume
that the expressions ∫₀ᵗ |H(s)|² ds are finite P-almost surely for all t > 0,
then this process is a local martingale. A process t 7→ M (t) is called a
local martingale, if there exists a sequence of stopping times Tn , n ∈ N,

which increases to ∞, such that every process t 7→ M (t ∧ Tn ) is a genuine


martingale. A similar notion is available for local sub-martingales, local
super-martingales, and processes which are locally of bounded variation. A
process X(t) ∈ L1 (Ω, F, P) with the property that E[X(t) | Fs ] ≥ X(s),
P-almost surely for t > s, is called a sub-martingale, and a process
with E[X(t) | Fs ] ≤ X(s), P-almost surely for t > s, is called a super-
martingale. A process X(t) ∈ L1 (Ω, F, P) is of bounded variation on the
interval [0, T ] if
\[
\sup \left\{ \sum_{j=0}^{N-1} \left| X(t_{j+1}) - X(t_j) \right| : 0 = t_0 < t_1 < \cdots < t_{N-1} < t_N = T \right\}
\]
is finite. Doob-Meyer’s decomposition theorem says that every local sub-


martingale X(t) of class DL (locally) can be decomposed as a sum X(t) =
M (t) + A(t), where M (t) is a local martingale and A(t) is an increasing
process. By definition the process t 7→ X (t ∧ T ) is of class DL if the
collection {X(τ ) : τ ≤ T , τ stopping time} is uniformly integrable.
(4) Let M1 (t) and M2 (t) be two martingales in L2 (Ω, F, P). Then there
exists a process of bounded variation ⟨M1 (·), M2 (·)⟩ (t), the covariation
process of M1 (t) and M2 (t), such that the process t 7→ M1 (t)M2 (t) −
⟨M1 (·), M2 (·)⟩ (t) is an L1 -martingale. A similar result is true for local
martingales. Suppose Mj (t) = ∫₀ᵗ Hj (s) dW (s), j = 1, 2, where H1 (t) and H2 (t)
are predictable processes for which ∫₀ᵗ |Hj (s)|² ds is P-almost surely finite
for all t ≥ 0 and for j = 1, 2. Then ⟨M1 (·), M2 (·)⟩ (t) = ∫₀ᵗ H1 (s)H2 (s) ds.
Instead of ⟨M1 (·), M2 (·)⟩ (t) we often write ⟨M1 , M2 ⟩ (t); if M1 (t) =
M2 (t) = M (t) we also write ⟨M ⟩ (t) = ⟨M, M ⟩ (t).
(5) Exponential martingales. Suppose that M (t) and N (t) are martingales.
Then the process
\[
t \mapsto \mathcal{E}(-N)(t) := \exp\left( -N(t) - \tfrac12 \langle N, N \rangle (t) \right)
\]
is a martingale, provided Novikov's condition, i.e. E[exp(½⟨N, N⟩(t))] <
∞, is satisfied for all t ≥ 0. In addition, the process
\[
t \mapsto \exp\left( -N(t) - \tfrac12 \langle N, N \rangle (t) \right) \left( M(t) + \langle N, M \rangle (t) \right) \tag{9.1}
\]
is a martingale. If M (t) = N (t) for all t ≥ 0, then the martingale in (9.1)
is the same as the second one in
\[
t \mapsto \exp\left( -M(t) - \tfrac12 \langle M, M \rangle (t) \right) \quad \text{and} \tag{9.2}
\]
\[
t \mapsto \exp\left( -M(t) - \tfrac12 \langle M, M \rangle (t) \right) \left( M(t) + \langle M, M \rangle (t) \right). \tag{9.3}
\]

The factor E(−N) can be considered as a risk adjustment factor, M (t)
can be interpreted as the volatility (fluctuation, diffusion part), and
⟨N, M⟩(t) is the drift or trend of the process. Define the exponential
measure QN by QN [A] = E[E(−N)(T) 1A], A ∈ FT. Let EN denote the corresponding
expectation. The process M + ⟨N, M⟩ is then a local martingale
with respect to the measure QN. This follows from Itô calculus in the
following manner. First notice that dE(−N)(t) = −E(−N)(t) dN(t), and hence,
for 0 ≤ t1 < t2 ≤ T we have
\begin{align*}
&\mathcal{E}(-N)(t_2)\left( M(t_2) + \langle N, M \rangle (t_2) \right)
- \mathcal{E}(-N)(t_1)\left( M(t_1) + \langle N, M \rangle (t_1) \right) \\
&= -\int_{t_1}^{t_2} \mathcal{E}(-N)(s)\left( M(s) + \langle N, M \rangle (s) \right) dN(s) \\
&\quad + \int_{t_1}^{t_2} \mathcal{E}(-N)(s)\left( dM(s) + d\langle N, M \rangle (s) \right)
- \int_{t_1}^{t_2} \mathcal{E}(-N)(s)\, d\langle N, M \rangle (s) \\
&= -\int_{t_1}^{t_2} \mathcal{E}(-N)(s)\left( M(s) + \langle N, M \rangle (s) \right) dN(s)
+ \int_{t_1}^{t_2} \mathcal{E}(-N)(s)\, dM(s). \tag{9.4}
\end{align*}
As a consequence of (9.4) we see that the process
\[
t \mapsto \mathcal{E}(-N)(t)\left( M(t) + \langle N, M \rangle (t) \right) \tag{9.5}
\]
is a (local) P-martingale. Here we use the fact that stochastic integrals
with respect to martingales are (local) martingales. If the expectations
E[e^{½⟨N,N⟩(T)}], E[e^{½⟨N,N⟩(T)} ⟨N,N⟩(T)], and E[e^{½⟨N,N⟩(T)} ⟨M,M⟩(T)]
are finite, then the stochastic integrals in (9.4) are genuine martingales.
This follows from the equalities:
\begin{align*}
&\mathbb{E}_N\!\left[ M(t_2) + \langle N, M \rangle (t_2) \mid \mathcal{F}_{t_1} \right]
- \left( M(t_1) + \langle N, M \rangle (t_1) \right) \\
&= \mathbb{E}_N\!\left[ M(t_2) + \langle N, M \rangle (t_2) - \left( M(t_1) + \langle N, M \rangle (t_1) \right) \,\middle|\, \mathcal{F}_{t_1} \right] \\
&= \frac{1}{\mathcal{E}(-N)(t_1)}\, \mathbb{E}\!\left[ \mathcal{E}(-N)(T)\left( M(t_2) + \langle N, M \rangle (t_2) \right)
- \mathcal{E}(-N)(T)\left( M(t_1) + \langle N, M \rangle (t_1) \right) \,\middle|\, \mathcal{F}_{t_1} \right]
\end{align*}
(the process E(−N)(t) is a martingale)
\begin{align*}
&= \frac{1}{\mathcal{E}(-N)(t_1)}\, \mathbb{E}\!\left[ \mathcal{E}(-N)(t_2)\left( M(t_2) + \langle N, M \rangle (t_2) \right)
- \mathcal{E}(-N)(t_1)\left( M(t_1) + \langle N, M \rangle (t_1) \right) \,\middle|\, \mathcal{F}_{t_1} \right] = 0,
\end{align*}
where we used the martingale property of the process in (9.5).
Corollary 9.1. Let N (t) be a martingale for which Novikov's condition
is satisfied. Put (Radon-Nikodym derivative)
\[
\frac{dQ}{dP} = \exp\left( -N(T) - \tfrac12 \langle N, N \rangle (T) \right).
\]
Suppose that W (t) = M (t) is Brownian motion with respect to P. Then
W (t) + ⟨N, W⟩(t) is Brownian motion with respect to Q. In particular,
if N (t) = ∫₀ᵗ b(s) dW (s), then W (t) + ∫₀ᵗ b(s) ds, 0 ≤ t ≤ T, is Brownian
motion with respect to Q.
(6) Let H(t) be a predictable process, and let M (t) be a martingale. Suppose
\[
\mathbb{E}\left[ \int_0^t |H(s)|^2\, d\langle M, M \rangle (s) \right] < \infty, \qquad t > 0. \tag{9.6}
\]
Then the stochastic integral t 7→ ∫₀ᵗ H(s) dM (s) is well defined (as an Itô
integral). Moreover, it is a martingale and the equality
\[
\mathbb{E}\left[ \left| \int_0^t H(s)\, dM(s) \right|^2 \right]
= \mathbb{E}\left[ \int_0^t |H(s)|^2\, d\langle M, M \rangle (s) \right]
\]
is valid. If H1 (t) and H2 (t) are predictable processes which satisfy (9.6),
then
\[
\mathbb{E}\left[ \int_0^t H_1(s)\, dM(s) \cdot \int_0^t H_2(s)\, dM(s) \right]
= \mathbb{E}\left[ \int_0^t H_1(s)\, H_2(s)\, d\langle M, M \rangle (s) \right]. \tag{9.7}
\]
For Hj (t) = 1_{Aj} ⊗ 1_{(uj ,∞)}(t), Aj ∈ F_{uj}, j = 1, 2, the equality in (9.7)
is readily established; for linear combinations of such indicator functions
(i.e. for simple processes) the result also follows easily. A density argument
will do the rest.
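The Itô isometry preceding (9.7) can be illustrated with M = W and H(s) = W(s), for which both sides equal ∫₀ᵗ s ds = t²/2. A Monte Carlo sketch (Python; the grid and sample sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# E[(int_0^t H dW)^2] = E[int_0^t H(s)^2 ds] with H = W; for t = 1 both sides are 1/2.
n_paths, n_steps, t = 20_000, 200, 1.0
dt = t / n_steps
dW = rng.normal(scale=np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1) - dW               # W at the left endpoint of each step
stoch_int = np.sum(W * dW, axis=1)           # Euler approximation of int_0^t W dW
lhs = np.mean(stoch_int**2)
rhs = np.mean(np.sum(W**2, axis=1) * dt)     # approximates E[int_0^t W(s)^2 ds]
print(lhs, rhs)                              # both ~ 0.5, up to Monte Carlo error
```

The use of the left endpoint in the Riemann sums mirrors the predictability of H required by the Itô integral.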
(7) Suppose that L generates a Feller semigroup with corresponding Markov
process
{(Ω, F, Px ) , (X(t) : t ≥ 0) , (ϑt : t ≥ 0) , (E, E)} .
Let f be a function in D(L). Then the process
\[
t \mapsto M_f(t) := f(X(t)) - f(X(0)) - \int_0^t Lf(X(s))\, ds
\]
is a martingale.
(8) Let L generate the semigroup etL , t ≥ 0. Suppose that the marginals of
the corresponding Markov process have a density:
\[
e^{tL} f(x) = \mathbb{E}_x\left[ f(X(t)) \right] = \int p_0(t, x, y)\, f(y)\, dm(y)
\]
for some reference measure m. Then the process s 7→ p0 (t − s, X(s), y) is
a Px -martingale on the half open interval [0, t).
(9) Let L be the second order differential operator
\[
Lf = b \cdot \nabla f + \frac12 \sum_{j,k=1}^d a_{jk}\, \frac{\partial^2 f}{\partial x_j\, \partial x_k}.
\]
Then for C²-functions f1 , f2 we have
\[
\langle M_{f_1}, M_{f_2} \rangle (t) = \int_0^t \Gamma_1(f_1, f_2)(X(s))\, ds,
\]
where
\[
\Gamma_1(f_1, f_2)(x) = \sum_{j,k=1}^d a_{jk}(x)\, \frac{\partial f_1(x)}{\partial x_j}\, \frac{\partial f_2(x)}{\partial x_k}.
\]
The operator (f1 , f2) 7→ Γ1 (f1 , f2) is called the squared gradient operator,
or in French, the opérateur carré du champ. The process ⟨M_{f1}, M_{f2}⟩(t)
is called the (quadratic) covariation process of the local martingales M_{f1}
and M_{f2}.
(10) Itô's formula. Let
\[
X(t) = M(t) + A(t) = (M_1(t), \ldots, M_d(t)) + (A_1(t), \ldots, A_d(t))
\]
be a continuous semi-martingale, where Mj (t), 1 ≤ j ≤ d, are local martingales,
and where the processes Aj (t), 1 ≤ j ≤ d, are locally of bounded
variation. Let f : Rd → C be a C²-function. Then
\[
f(X(t)) = f(X(0)) + \int_0^t \nabla f(X(s))\, dM(s) + \int_0^t \nabla f(X(s))\, dA(s)
+ \frac12 \sum_{j,k=1}^d \int_0^t \frac{\partial^2 f}{\partial x_j\, \partial x_k}(X(s))\, d\langle M_j, M_k \rangle (s).
\]
We notice that ⟨Xj , Xk⟩(t) = ⟨Mj , Mk⟩(t). Itô's formula says that, under
the action of C²-functions, local semi-martingales are preserved. In other
words, if X(t) = M(t) + A(t) is a local semi-martingale (i.e. a sum of a
local martingale and a process which is locally of bounded variation), and
if f : Rd → R is a C²-function, then the process t 7→ f(X(t)) is again a
local semi-martingale.
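For f(x) = x² and X = W (one-dimensional Brownian motion), Itô's formula reduces to W(t)² = 2∫₀ᵗ W dW + t, the extra t coming from the quadratic variation term. The following Python sketch (an illustration with an assumed Euler discretization) checks this pathwise:

```python
import numpy as np

rng = np.random.default_rng(3)

# Ito's formula for f(x) = x^2:  f(W(t)) = 2 int_0^t W dW + t.
# On a grid, W(t)^2 - (2 sum W dW + t) equals the quadratic-variation error sum(dW^2) - t.
n_paths, n_steps, t = 100, 10_000, 1.0
dW = rng.normal(scale=np.sqrt(t / n_steps), size=(n_paths, n_steps))
W_left = np.cumsum(dW, axis=1) - dW          # W at the left endpoint of each step
W_t = np.sum(dW, axis=1)                     # W(t)
ito_rhs = 2 * np.sum(W_left * dW, axis=1) + t
err = np.max(np.abs(W_t**2 - ito_rhs))
print("max pathwise error:", err)            # small; tends to 0 as n_steps grows
```

The error is exactly the deviation of the discrete quadratic variation from t, which vanishes as the mesh of the grid tends to 0.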

9.2 Stopping times and time-homogeneous Markov


processes
Next we explain the strong Markov property. Since the sample paths t 7→ X(t),
t ≥ 0, are right continuous Px-almost surely, our Markov process is a strong
Markov process. Let S : Ω → [0, ∞] be a stopping time, meaning that for every t ≥ 0
the event {S ≤ t} belongs to Ft. This is the same as saying that the process
t 7→ 1_{[S≤t]} is adapted. Let FS be the natural σ-field associated with the
stopping time S, i.e.
\[
\mathcal{F}_S = \bigcap_{t \ge 0} \left\{ A \in \mathcal{F} : A \cap \{ S \le t \} \in \mathcal{F}_t \right\}.
\]

Define ϑS (ω) by ϑS (ω) = ϑ_{S(ω)} (ω). Consider FS as the information from the
past, σ(X(S)) as information from the present, and

σ {X(t) ◦ ϑS : t ≥ 0} = σ {X(t + S) : t ≥ 0}

as the information from the future. The time-homogeneous Markov property


can be expressed as follows:
\[
\mathbb{E}_x\left[ f(X(s+t)) \mid \mathcal{F}_s \right]
= \mathbb{E}_x\left[ f(X(s+t)) \mid \sigma(X(s)) \right]
= \mathbb{E}_{X(s)}\left[ f(X(t)) \right], \tag{9.8}
\]
Px-almost surely for all f ∈ Cb (E) and for all s and t ≥ 0. The strong Markov
property can be expressed as follows:
\[
\mathbb{E}_x\left[ Y \circ \vartheta_S \mid \mathcal{F}_S \right] = \mathbb{E}_{X(S)}\left[ Y \right], \quad \text{Px-almost surely} \tag{9.9}
\]

on the event {S < ∞}, for all bounded random variables Y , for all stopping
times S, and for all x ∈ E. One can prove that under the “cadlag” property
events like {X(S) ∈ B, S < ∞}, B Borel, are FS -measurable. The passage
from (9.9) to (9.8) is easy: put Y = f (X(t)) and S(ω) = s, ω ∈ Ω. The other
way around is much more intricate and uses the cadlag property of the process
{X(t) : t ≥ 0}. In this procedure the stopping time S is approximated by a
decreasing sequence of discrete stopping times (Sn = 2−n d2n Se : n ∈ N). The
equality
Ex [Y ◦ ϑSn |FSn ] = EX(Sn ) [Y ] , Px -almost surely,
is a consequence of (2) for a fixed time. Let n tend to infinity in (9.2) to
obtain (9.9). The “strong Markov property” can be extended to the “strong
time dependent Markov property”:
£ ¯ ¤
Ex Y (S + T ◦ ϑS , ϑS ) ¯ FS (ω) = E ¡ ¢ [ω 0 7→ Y (S(ω) + T (ω 0 ) , ω 0 )] ,
X S(ω)
(9.10)
Px -almost surely on the event {S < ∞}. Here Y : [0, ∞)×Ω → C is a bounded
random variable. The cartesian product [0, ∞)×Ω is supplied with the product
field B[0,∞) ⊗ F; B[0,∞) is the Borel field of [0, ∞) and F is (some extension
of) σ (X(u) : u ≥ 0). Important stopping times are “hitting times”, or times
related to hitting times:
© ª
TU = inf s > 0 : X(s) ∈ E 4 \ U , and
½ Z s ¾
S = inf s > 0 : 1E\U (X(u))du > 0 ,
0

where U is some open (or Borel) subset of E 4 . This kind of stopping times
have the extra advantage of being terminal stopping times, i.e. t + S ◦ ϑt = S
Px -almost surely on the event {S > t}. A similar statement holds for the
hitting time TU . The time S is called the penetration time of E \ U . Let
p : E → [0, ∞) be a Borel measurable function. Stopping times of the form
9.3 Markov Chains: invariant measure 465
½ Z s ¾
¡ ¢
Sξ = inf s > 0 : p X(u) du > ξ
0

serve as a stochastic time change, because they enjoy the equality: Sξ + Sη ◦


ϑSξ = Sξ+η , Px -almost surely on the event {Sξ < ∞}. As a consequence op-
erators of the form S(ξ)f (x) := Ex [f (X (Sξ ))], f a bounded Borel function,
possess the semigroup property. Also notice that S0 = 0, provided that the
function p is strictly positive.
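Hitting times such as T_U can be simulated. For one-dimensional Brownian motion and U = (−1, 1), the classical identity E_x[T_U] = 1 − x² (the solution of ½u″ = −1 with u(±1) = 0) gives E_0[T_U] = 1. A hedged Monte Carlo sketch (Python; grid parameters are illustrative, and the discrete monitoring of the boundary introduces a small positive bias):

```python
import numpy as np

rng = np.random.default_rng(11)

# First exit time of Brownian motion from U = (-1, 1), started at 0: E_0[T_U] = 1.
n_paths, n_steps, dt = 1000, 10_000, 1e-3
W = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_paths, n_steps)), axis=1)
exited = np.abs(W) >= 1.0
# index of the first grid point outside U; paths that never exit are capped at the horizon
first = np.where(exited.any(axis=1), exited.argmax(axis=1) + 1, n_steps)
mean_T = np.mean(first * dt)
print("Monte Carlo E_0[T_U] ~", mean_T)     # ~ 1, up to discretization and sampling error
```

The probability that a path has not exited by the horizon t = 10 is of order e^{−π²·10/8} and is negligible for this illustration.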

9.3 Markov Chains: invariant measure


Some of what follows is taken from [56] and [162]. One of the motivations to
study time-homogeneous Markov chains is the fact that Monte Carlo meth-
ods sample a given multivariate distribution π by constructing a suitable
Markov chain with the property that its limiting, invariant distribution, is
the target distribution π. In most problems of interest, the distribution π is
absolutely continuous and, as a result, the theory of MCMC (Markov Chain
Monte Carlo) methods is based on that of Markov chains on continuous state
spaces outlined, for example, in [162] and [171]. Reference [234] is the funda-
mental reference for drawing the connections between this elaborate Markov
chain theory and MCMC methods. Basically, the goal of the analysis is to
specify conditions under which the constructed Markov chain converges to
the invariant distribution, and conditions under which sample path averages
based on the output of the Markov chain satisfy a law of large numbers and
a central limit theorem.

9.3.1 Some definitions and results

A Markov chain is a sequence of random variables (or state variables) X =


{X(i) : i ∈ N} together with a transition probability function (x, B) 7→
P (x, B), x ∈ E, B ∈ E. The evolution of the Markov chain on a space E is
governed by the transition kernel
\[
P(x, B) = \mathbb{P}\left[ X(i+1) \in B \mid X(i) = x,\, \mathcal{F}_{i-1} \right]
= \mathbb{P}\left[ X(i+1) \in B \mid X(i) = x \right], \qquad (x, B) \in E \times \mathcal{E}, \tag{9.11}
\]
where the second equality embodies the time-homogeneous Markov property that
the distribution of each succeeding state in the sequence, given the current
and the past states, depends only on the current state. Note that F_{i−1} represents
the σ-field generated by the variables {X(j) : 0 ≤ j ≤ i − 1}. In fact, a
complete description of a time-homogeneous Markov chain is given by
\[
\{ (\Omega, \mathcal{F}, \mathbb{P}_x),\; (X(i),\, i \in \mathbb{N}),\; (\vartheta_i,\, i \in \mathbb{N}),\; (E, \mathcal{E}) \}, \tag{9.12}
\]
where
\[
\mathbb{P}_x[X(1) \in B] = \mathbb{P}\left[ X(1) \in B \mid X(0) = x \right]
= \mathbb{P}\left[ X(i+1) \in B \mid X(i) = x \right] = P(x, B) = P(1, x, B). \tag{9.13}
\]

The operators ϑi , i ∈ N, are time shift operators: ϑi ◦ ϑj = ϑ_{i+j}, i, j ∈ N.
Moreover, X(i) ◦ ϑj = X(i + j), Px-almost surely for all x ∈ E and all i, j ∈ N.
A convenient way to express the Markov property goes as follows:
\[
\mathbb{P}_x\left[ X(i+1) \in B \mid \mathcal{F}_i \right] = \mathbb{P}_{X(i)}\left[ X(1) \in B \right],
\qquad (x, B) \in E \times \mathcal{E}, \; i \in \mathbb{N}.
\]
If in (8.14) we confine the time [0, ∞) to the discrete time N, then we get a
Markov chain with a not necessarily discrete state space. The Markov chain
obtained from (8.14) is called a skeleton of the time-homogeneous Markov
process in continuous time. The transition kernel is thus the distribution of
X(i + 1) given that X(i) = x. The nth step ahead transition kernel is given
by
\[
P(n, x, B) = P^n(x, B) = \int_E P(x, dy)\, P^{(n-1)}(y, B), \tag{9.14}
\]
where
\[
B \mapsto P^{(1)}(x, B) = P(x, B) = P(1, x, B), \qquad B \in \mathcal{E}, \tag{9.15}
\]
is a probability measure on E, the Borel field of the state space E. In fact
the Markov property of the time-discrete process in (9.12) is equivalent to the
following Chapman-Kolmogorov equation:
\[
P(n+m, x, B) = P^{n+m}(x, B) = \int_E P^n(x, dy)\, P^m(y, B), \qquad n, m \in \mathbb{N}, \; x \in E. \tag{9.16}
\]
Instead of the skeleton {X(i) : i ∈ N} we could have taken a skeleton of the
form
\[
\{ X(\delta i) : i \in \mathbb{N} \}, \qquad \delta > 0. \tag{9.17}
\]
Again we get a Markov chain, and the results of Meyn and Tweedie can
be used. However, note that hitting times phrased in terms of a skeleton in
general are larger than the original hitting times. On the other hand, in our
setup the paths of the Markov process are continuous from the right, and so
in principle our Markov process can be approximated by skeletons of the form
(9.17).
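On a finite state space the Chapman-Kolmogorov equation (9.16) is simply the multiplicativity of matrix powers, P^{n+m} = P^n P^m. A small sketch (the 3-state kernel below is a hypothetical example, not taken from the text):

```python
import numpy as np

# A hypothetical transition kernel on a 3-point state space; rows sum to 1.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Chapman-Kolmogorov (9.16): P^(n+m) = P^n P^m, here with n = 2 and m = 5.
P2 = P @ P
P5 = np.linalg.matrix_power(P, 5)
print(np.allclose(np.linalg.matrix_power(P, 7), P2 @ P5))   # True
```

The integral ∫_E P^n(x, dy) P^m(y, B) in (9.16) becomes the matrix product Σ_y (P^n)_{xy} (P^m)_{yB} in this discrete setting.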
The goal is to find conditions under which the nth iterate of the transition
kernel converges to the invariant measure or distribution π as n → ∞. The
invariant distribution is one that satisfies
\[
\pi(B) = \int_E P(x, B)\, d\pi(x). \tag{9.18}
\]

The invariance condition states that if X(i) is distributed according to π,
then all subsequent elements of the chain are also distributed as π. Markov
chain samplers are invariant by construction and therefore the existence of
the invariant distribution does not have to be checked.
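On a finite state space, (9.18) reads πP = π, so an invariant distribution can be computed as a normalized left eigenvector of the kernel with eigenvalue 1. A sketch (the kernel below is a hypothetical example):

```python
import numpy as np

# Hypothetical transition kernel on a 3-point state space.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Left eigenvector of P with eigenvalue 1, i.e. eigenvector of P^T; normalize to a probability.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()
print(pi)                        # invariant distribution
print(np.allclose(pi @ P, pi))   # True: the invariance condition (9.18)
```

Dividing by the sum also fixes the sign of the eigenvector, since the Perron eigenvector has entries of one sign.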
A Markov chain is reversible, or satisfies the detailed balance condition,
if there exists a reference measure m on E such that the transition function
P (x, B) can be written as P (x, B) = ∫_B p(x, y) dm(y), where the integral
kernel p(x, y) satisfies
\[
f(x)\, p(x, y) = f(y)\, p(y, x) \tag{9.19}
\]
for a Borel measurable function f (·) which is the Radon-Nikodym derivative
of some Borel measure B 7→ π(B), B ∈ E. If this condition holds, it can be
shown that π is an invariant measure: see e.g. Tierney [234]. To verify this we
evaluate the right-hand side of (9.18):
\begin{align*}
\int P(x, B)\, d\pi(x) &= \int \left\{ \int_B p(x, y)\, dm(y) \right\} f(x)\, dm(x) \\
&= \int_B \left\{ \int f(x)\, p(x, y)\, dm(x) \right\} dm(y) \\
&= \int_B \left\{ \int p(y, x)\, f(y)\, dm(x) \right\} dm(y) \\
&= \int_B f(y)\, dm(y) = \pi(B).
\end{align*}

A minimal requirement on the Markov chain for it to satisfy a law of large
numbers is the requirement of π-irreducibility. This means that the chain is
able to visit all sets with strictly positive probability under π from any starting
point in E. Formally, a Markov chain is said to be π-irreducible if for every
x ∈ E,
\[
\pi(A) > 0 \;\Longrightarrow\; \mathbb{P}\left[ X(i) \in A \mid X(0) = x \right] > 0, \quad \text{for some } i \ge 1. \tag{9.20}
\]
The property in (9.20) can also be phrased in terms of the hitting time of A:
τ_A = min {m ≥ 1 : X(m) ∈ A}. If X(m) ∉ A for all m ∈ N, m ≥ 1, then we
put τ_A = ∞. An equivalent way to state (9.20) then runs as follows: if A ∈ E
is such that π(A) > 0, then Px [τ_A < ∞] > 0 for all x ∈ E. If the space E is
connected and the function p(x, y) is positive and continuous, then the Markov
chain with transition probability function given by P (x, B) = ∫_B p(x, y) dm(y)
and invariant probability measure π is π-irreducible.
In our case another important property of the Markov chain is its aperi-
odicity, which ensures that the chain does not cycle through a finite num-
ber of sets. A Markov chain is aperiodic if there exists no partition of
E = (D0 , D1 , . . . , Dp−1 ) for some p ≥ 2 such that for all i ∈ N
Z
£ ¤ £ ¤
P X(i) ∈ Di mod (p) |X(0) ∈ D0 = Px X(i) ∈ Di mod (p) dµ0 (x) = 1,
D0
(9.21)
468 9 Miscellaneous topics

for some initial probability distribution µ_0. If the probability measure µ_0 and
the partition (D_0, . . . , D_{p−1}) did have the property spelled out in (9.21),
then there exists a state x_0 ∈ D_0 such that

    P(i, x_0, D_{i mod p}) = P_{x_0}[X(i) ∈ D_{i mod p}] = 1, for all i ∈ N.   (9.22)

It follows that not all probability measures B 7→ P (i, x0 , B), i ∈ N, i ≥ 1,


have the same null-sets. So we have the following result.
Proposition 9.2. Let the time-homogeneous Markov chain in (9.12) have
a transition probability function P (i, x, B), i ∈ N, x ∈ E, B ∈ E, where
P (x, B) = P (1, x, B), and P (0, x, B) = 1B (x). Suppose that all probability
measures B 7→ P (i, x, B), i ≥ 1, i ∈ N, x ∈ E, have the same negligible sets.
Then the Markov chain in (9.12) is aperiodic.
These definitions allow us to state the following results from [234] which form
the basis for Markov chain Monte Carlo methods, and other asymptotic re-
sults. The first of these results gives conditions under which a strong law of
large numbers holds and the second gives conditions under which the proba-
bility density of the nth iterate of the Markov chain converges to its unique,
invariant density.
Theorem 9.3. Suppose {X(i), Px }x∈E is a π-irreducible time-homogeneous
Markov chain with transition kernel P (x, B) = P (1, x, B) and invariant prob-
ability distribution π. Then π is the unique invariant distribution of P (x, B)
and for all π-integrable real-valued functions h,
    (1/n) Σ_{i=1}^n h(X(i)) → ∫ h(x) dπ(x)  as n → ∞,  P_x-almost surely.   (9.23)
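For a finite, irreducible chain the convergence in (9.23) can be observed directly by simulation. The transition matrix and the function h below are invented for illustration; π is obtained as the normalized left Perron eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative irreducible, aperiodic 3-state transition matrix.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Invariant distribution pi: normalized left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

h = np.array([1.0, -2.0, 5.0])            # an arbitrary (pi-integrable) function

# Run the chain and form the ergodic average (1/n) sum_{i=1}^n h(X(i)) of (9.23).
n = 100_000
x, total = 0, 0.0
for _ in range(n):
    x = rng.choice(3, p=P[x])
    total += h[x]

print("ergodic average:", total / n, "  integral of h dpi:", h @ pi)
```

For this fast-mixing chain the two printed numbers agree to roughly the Monte Carlo accuracy O(1/√n).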

If the invariant measure is σ-finite but not finite, then the limit in (9.23) is
zero. That is why irreducible Markov chains with a (unique) σ-finite invariant
measure which is not finite are called null-recurrent.
For ergodicity results in null recurrent Markov chains, like the theorem of
Chacon-Ornstein for quotients of time averages as in (9.23), the reader is
referred to Krengel [138]. Recurrent Markov chains with a finite invariant
measure are called positive recurrent. There is a close relationship between
expectations of (first) return times and invariant measures. In the discrete
state space setting we have the following. Put Ty = inf {m ≥ 1 : X(m) = y},
y ∈ E, and write µ_{x,y} = E_x[T_y]. Then the following equality holds:

    π(y) = lim_{n→∞} P^n(x, {y}) = 1/µ_{y,y}.                              (9.24)

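On a finite state space the identity (9.24) can be verified exactly: the expected return times E_y[T_y] are obtained by first-step analysis, solving (I − Q)m = 1 where Q is P with column y set to zero, so that m(x) = E_x[T_y]. The chain below is invented for illustration.

```python
import numpy as np

# Illustrative irreducible, aperiodic chain on {0, 1, 2}.
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.5, 0.2, 0.3]])
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

# First-step analysis: m(x) = E_x[T_y] solves m = 1 + Q m,
# where Q equals P with column y set to zero.
for y in range(3):
    Q = P.copy()
    Q[:, y] = 0.0
    m = np.linalg.solve(np.eye(3) - Q, np.ones(3))
    # Kac's theorem (9.24): pi(y) = 1 / mu_{y,y}, with mu_{y,y} = E_y[T_y] = m(y).
    assert abs(pi[y] - 1.0 / m[y]) < 1e-10
print("Kac's theorem verified; pi =", pi)
```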
The result in (9.24) is called Kac’s theorem: see Theorem 10.2.2 in Meyn
and Tweedie [162]. For more details the reader is referred to the literature:
Norris [167] and Karlin and Taylor [123]. Some older work can be found in

Orey [177], Kingman and Orey [132], and [117]. The following theorem of
Orey, or Orey's convergence theorem, can be found in Meyn and Tweedie
[162], Theorems 13.3.3 and 18.1.2. For the claim in (9.25) the positivity of the
recurrent Markov chain is not required. It suffices to have a σ-finite invariant
measure, which is guaranteed by a result due to Foguel [88] for irreducible
chains with a recurrent compact subset: see Theorem 2.2 in Seidler [207]. The
existence of a σ-finite Borel measure is also proved in Chapter 10 of the new
version of the book by Meyn and Tweedie [162]. The assertion as written is
proved in Duflo and Revuz [75], who use a method developed by Blackwell
and Freedman [33], who in turn rely on a result by Orey [175] which states a
result like (9.25) for point measures µi = δxi , i = 1, 2. The following theorem
was used in the proof of Proposition 8.37. In [127], Theorem 1 and Lemma 1,
Kaspi and Mandelbaum establish a close relationship between recurrence and
Harris recurrence. A similar result for the fine topology was found by Azéma
et al. in [11]: see Proposition IV.4.
Theorem 9.4. Suppose {X(n), Px }x∈E is an irreducible time-homogeneous
aperiodic Markov chain with transition kernel P (x, B) = P (1, x, B), which is
Harris recurrent. Then for all probability measures µ1 and µ2 on E
    lim_{n→∞} ∬ Var(P^n(x, ·) − P^n(y, ·)) dµ_1(x) dµ_2(y) = 0,            (9.25)

where Var denotes the total variation norm. If the Markov chain is positive
Harris recurrent, then for µ_2 the invariant probability measure π may be
chosen; its existence follows from positive recurrence. Then the following
equality holds for all probability measures µ_1 on E:
    lim_{n→∞} Var( ∫ P^n(x, ·) dµ_1(x) − π(·) ) = 0.                       (9.26)
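For a finite chain (where irreducibility already implies positive Harris recurrence) the total variation convergence in (9.25) and (9.26) with point masses µ_i = δ_{x_i} can be made visible directly. The birth-death matrix below is an illustrative example.

```python
import numpy as np

# Illustrative aperiodic birth-death chain on {0, 1, 2}.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

def tv(mu, nu):
    # Total variation norm of the signed measure mu - nu on a finite space.
    return np.abs(mu - nu).sum()

for n in [1, 5, 20, 50]:
    Pn = np.linalg.matrix_power(P, n)
    # max over starting points x of Var(P^n(x, .) - pi(.)), cf. (9.26) with mu_1 = delta_x
    print(n, max(tv(Pn[x], pi) for x in range(3)))
```

The printed distances decrease geometrically (here the second-largest eigenvalue of P is 1/2), which also illustrates the geometric ergodicity discussed further below.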

Let B ∈ E. The proof of Theorem 9.4 is based on, among other things, the
decomposition of the event {X(n) ∈ B} over the first and the last entrance
time, or entry time, to A prior to the time n:
    P_x[X(n) ∈ B] = P_x[X(n) ∈ B, τ_A^1 ≥ n]
      + Σ_{j=1}^{n−1} Σ_{k=1}^{j} E_x[ E_{X(k)}[ P_{X(j−k)}[X(n − j) ∈ B, τ_A^1 ≥ n − j], X(j − k) ∈ A ],
                                       τ_A^1 ≥ k, X(k) ∈ A ]
      = P_x[X(n) ∈ B, τ_A^1 ≥ n]
      + Σ_{j=1}^{n−1} Σ_{k=1}^{j} E_x[ E_{X(k)}[ P_{X(j−k)}[X(n − j) ∈ B, τ_A^1 ≥ n − j], X(j − k) ∈ A ],
                                       τ_A^1 = k ],                        (9.27)

where in the final step of (9.27) we used the following equality of events:
{τ_A^1 ≥ k, X(k) ∈ A} = {τ_A^1 = k}, k ∈ N, k ≥ 1. The formula in (9.27) can
be found in Meyn and Tweedie [162], formula (13.39). Its proof is an easy
consequence of the Markov property. The entrance time τ_A^1 = τ_A^{1,1} is defined
in (9.35) of Theorem 9.8: τ_A^1 = inf{n ≥ 1 : X(n) ∈ A}. In terms of functions
the equality in (9.27) reads:
    E_x[f(X(n))] = E_x[f(X(n)), τ_A^1 ≥ n]
      + Σ_{j=1}^{n−1} Σ_{k=1}^{j} E_x[ E_{X(k)}[ E_{X(j−k)}[f(X(n − j)), τ_A^1 ≥ n − j], X(j − k) ∈ A ],
                                       τ_A^1 = k ],    f ∈ C_b(E).         (9.28)

Definition 9.5. The Markov chain (X(n) : n ∈ N) in (9.12) is called positive
Harris recurrent if there exists an invariant probability measure on E relative
to X, and if (x, B) ∈ E × E satisfies P(x, B) > 0, then

    P_x[ Σ_{n=1}^∞ 1_B(X(n)) = ∞ ] = 1.

A further strengthening of the conditions is required to obtain a central limit
theorem for sample-path averages. A key requirement is that of an ergodic
chain, i.e., a chain that is irreducible, aperiodic and positive Harris recurrent:
for a definition of the latter, see [234] and Meyn and Tweedie [162]. In addition,
one needs the notion of geometric ergodicity. An ergodic Markov chain with
invariant distribution π is geometrically ergodic if there exists a non-negative
real-valued Borel function x ↦ C(x) and a constant 0 < r < 1 such that
Var(P^n(x, ·) − π(·)) ≤ C(x) r^n for all n ∈ N, and such that ∫ C(x) dπ(x) < ∞.
The authors of [51] show that if the Markov chain is ergodic, has invariant
probability distribution π, and is geometrically ergodic, then for all functions
h ∈ L²(E, π) and any initial distribution, the distribution of
√n (ĥ_n − ∫ h(x) dπ(x)) converges weakly to a normal distribution with mean
zero and variance σ_h² ≥ 0 as n → ∞. Here ĥ_n = (1/n) Σ_{i=1}^n h(X(i)), and
σ_h² = Var h(X(0)) + 2 Σ_{k=1}^∞ Cov[h(X(0)), h(X(k))], the variance and
covariances being computed with X(0) distributed according to π. The
following theorem discusses the problem of the existence of an invariant
measure. It is taken from Meyn and Tweedie [162], Theorem 10.0.1. It is
supposed that all measures B ↦ P(1, x, B), B ∈ E, x ∈ E, have the same
null-sets. Put E_+ = {A ∈ E : P(1, x_0, A) > 0}. Irreducibility is meant in the
sense that P_x[τ_A < ∞] > 0 for all x ∈ E, and all subsets A ∈ E_+. In our
setting we may assume that irreducibility can be phrased in terms of
reachability of any open subset with positive probability from any starting
point and in as short a time as we please: see Lemma 8.2. It is not clear what
is the exact analog of (9.32) in case we are working with continuous-time
processes like in (8.14). It is quite possible that in that case Dynkin's formula
plays a central role. Let
{(Ω, F, Px ) , (X(t), t ≥ 0) , (ϑt , t ≥ 0) , (E, E)} be a time-homogeneous strong
Markov process, and let A be a Borel subset of E with hitting time τA . For
λ > 0 we have Dynkin’s formula:
    ∫_0^∞ e^{−λs} E_x[f(X(s))] ds − ∫_0^∞ e^{−λs} E_x[f(X(s)), τ_A > s] ds
        = E_x[ e^{−λτ_A} E_{X(τ_A)}[ ∫_0^∞ e^{−λs} f(X(s)) ds ] ].         (9.29)

If we use the resolvent notation

    R(λ)f(x) = ∫_0^∞ e^{−λs} e^{sL} f(x) ds,  and
    R_A(λ)f(x) = ∫_0^∞ e^{−λs} e^{sL_A} f(x) ds,                           (9.30)

then the equality in (9.29) can be rewritten as:

    R(λ)f(x) − R_A(λ)f(x) = E_x[ e^{−λτ_A} R(λ)f(X(τ_A)) ].                (9.31)
The semigroup {e^{sL_A} : s ≥ 0} is defined by e^{sL_A} f(x) = E_x[f(X(s)), τ_A > s],
f ∈ L^∞(E, E), s ≥ 0, x ∈ E. This semigroup need not be strongly continuous.
It lives on A^c = E \ A.
Theorem 9.6. Let the time-homogeneous Markov chain X be recurrent and
have a Polish space as state space. Then it admits, up to multiplicative con-
stants, a unique σ-finite invariant measure π. Let A ∈ E be such that
P_x[τ_A < ∞] = 1 for π-almost all x ∈ E. The measure π satisfies:

    π(B) = ∫_A E_x[ Σ_{i=1}^{τ_A^1} 1_B(X(i)) ] dπ(x),   B ∈ E.            (9.32)

The invariant measure is finite (rather than merely σ-finite) if there exists a
compact subset C such that sup_{x∈C} E_x[τ_C^1] < ∞. Moreover,

    π(E) = ∫_A E_x[τ_A^1] dπ(x) = ∫_C E_x[τ_C^1] dπ(x).

In (9.32) τ_A^1 stands for the first hitting time of the Borel subset A:

    τ_A^1 = min{m ≥ 1 : X(m) ∈ A} = 1 + τ_A ∘ ϑ_1,

where another stopping time τ_A also plays a relevant role:

    τ_A = min{m ≥ 0 : X(m) ∈ A}.
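On a finite state space formula (9.32) can be tested exactly. The expected number of visits to z during an excursion from A, E_x[#{1 ≤ i ≤ τ_A^1 : X(i) = z}], equals the (x, z) entry of (I − PD)^{−1}P, where D is multiplication by 1_{E\A}. The chain below and the choice A = {0} are invented for illustration.

```python
import numpy as np

# Illustrative irreducible chain; A = {0} is a recurrent set.
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4],
              [0.4, 0.4, 0.2]])
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

A = [0]
D = np.diag([0.0 if x in A else 1.0 for x in range(3)])   # multiplication by 1_{E\A}

# V(x, z) = E_x[ #{1 <= i <= tau_A^1 : X(i) = z} ]
#         = sum_{n>=1} P_x[X(n) = z, X(1), ..., X(n-1) not in A] = ((I - P D)^{-1} P)(x, z)
V = np.linalg.solve(np.eye(3) - P @ D, P)

# Formula (9.32) with B = {z}: pi(z) = sum_{x in A} pi(x) V(x, z).
lhs = sum(pi[x] * V[x] for x in A)
assert np.allclose(lhs, pi)
print("formula (9.32) verified:", lhs)
```

Summing over z also recovers the identity π(E) = ∫_A E_x[τ_A^1] dπ(x) stated in the theorem.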

In [162] Meyn and Tweedie discuss “petite” and “small” sets. Theorem 9.6
follows from a combination of the following theorems and propositions in [162]:

Theorem 10.0.1 (in which ψ-irreducibility and “petite sets” play a crucial role),
a rephrasing of assertion (i) of Proposition 5.2.4 in terms of petite sets, which
is in fact the same as assertion (i) of Proposition 5.5.4, and assertion (ii) in
Theorem 6.2.5 (which states that in a topological Markov chain all compact
subsets are “petite”). Meyn and Tweedie use the following terminology. Let
B ↦ ϕ(B) be a finite measure on E. A Markov chain is called ϕ-irreducible
if for every set A ∈ E with ϕ(A) > 0 the quantity P_x[τ_A < ∞] is strictly
positive for all x ∈ E. In our case we may take ψ(A) = ϕ(A) = P(1, x_0, A),
A ∈ E. The assumption that all measures of the form A ↦ P(t_0, x_0, A) are
equivalent makes the choice of x_0 ∈ E irrelevant. Let a := (a_k)_{k∈N} be a
sequence of non-negative real numbers which add up to one. Then we define
the function (x, A) ↦ K_a(x, A), (x, A) ∈ E × E, by
K_a(x, A) = Σ_{k=0}^∞ a_k P^k(x, A). Denote by P(N) the collection of positive
sequences which add up to 1. A subset A ∈ E is called “petite” if there exists
a sequence a ∈ P(N) and a non-trivial measure ν_a such that
K_a(x, B) ≥ ν_a(B) for all B ∈ E and all x ∈ A. If we can find a of the form
a_k = δ_n(k), k ∈ N, and a corresponding non-trivial measure ν_n, for some
n ∈ N, then A is called “small”; this means that
P^n(x, B) = K_{δ_n}(x, B) ≥ ν_n(B), B ∈ E, with ν_n a non-trivial measure. It
is shown there that for Markov chains which are ψ-irreducible and aperiodic
the collection of “small sets” coincides with the collection of “petite sets”. The Markov chain
X is called topological, or a Markov T -chain, if the space E is a complete
metrizable (locally compact) Hausdorff space, and the function x 7→ P (x, B)
is lower semi-continuous for every B ∈ E. In fact the authors assume that for
every B ∈ E the function x 7→ P (x, B) dominates a strictly positive lower
semi-continuous function, whenever it itself is strictly positive. Observe that
by the Markov property
    P_x[τ_A < ∞] = P_x[ ⋃_{n=1}^∞ {X(n) ∈ A} ] = E_x[ P_{X(1)}[ ⋃_{n=1}^∞ {X(n − 1) ∈ A} ] ],

and hence, by the strong Feller property, the function x 7→ Px [τA < ∞] is in
fact continuous. In the results mentioned above, the local compactness
does not play a role.
As a corollary to (the proof of) Theorem 9.6 we have the following result.
Corollary 9.7. Let the notation and assumptions be as in Theorem 9.6. Then
the following equality holds for f ∈ L1 (E, π):
    lim_{n→∞} ∫_{E\A} E_x[f(X(n)), τ_A ≥ n] dπ = 0.                        (9.33)

A result corresponding to Theorem 9.6 of the discrete-time case reads as
follows.
Theorem 9.8. Let {(Ω, F, P_x), (X(t), t ≥ 0), (ϑ_t, t ≥ 0), (E, E)} be a strong
Markov process with right-continuous paths. Let π be a σ-finite invariant
measure with the following property: if π_1 is another invariant measure such
that π_1(B) ≤ π(B), B ∈ E, and π_1(A) = π(A) for some A ∈ E for which
P_x[τ̃_A < ∞] = 1 for π-almost all x ∈ A, then π_1 = π. Then the following
equality holds for all f ∈ L¹(E, π), and for all Borel subsets A with the
property that π(A) < ∞ and P_x[τ_A^{1,h} < ∞] = 1 for all x ∈ A, and all h > 0:

    ∫_A E_x[ Σ_{k=1}^{τ_A^{1,h}/h} f(X(kh)) ] dπ(x) = ∫ f(x) dπ(x).        (9.34)

In (9.34) the stopping times τ_A^{1,h}, h > 0, are defined by

    τ_A^{1,h} = inf{ℓh : ℓ ∈ N, ℓ ≥ 1, X(ℓh) ∈ A} = h + τ_A^h ∘ ϑ_h,       (9.35)

where τ_A^h = inf{ℓh : ℓ ∈ N, ℓ ≥ 0, X(ℓh) ∈ A}. If there exists A ∈ E with the
property that ∫_A E_x[τ_A^{1,h}] dπ(x) < ∞ for some h > 0, then the invariant
measure π is finite, and h · π(E) = ∫_A E_x[τ_A^{1,h}] dπ(x).

If h = 1 we write τ_A^1 instead of τ_A^{1,1}: see formula (9.27) above. The proof of
Theorem 9.8 is completely analogous to that of Theorem 9.6. Instead of the op-
erator T, given by T f(x) = E_x[f(X(1))] = ∫ f(y) P(x, dy), we now introduce
the operators T_h, h > 0, T_h f(x) = E_x[f(X(h))] = ∫ f(y) P(h, x, dy), where
P(t, x, B) is the probability transition function. We also need the operator
T_{A,h} defined by

    T_{A,h} f(x) = E_x[f(X(h)), τ_A^h ≥ h] = ∫ f(y) P_A(h, x, dy),

where P_A(h, x, B) = P_x[X(h) ∈ B, τ_A^h ≥ h]. Again the proof yields the fol-
lowing corollary.
Corollary 9.9. Let the notation and assumptions be as in Theorem 9.8. Then
the following equality holds for f ∈ L1 (E, π):
    lim_{n→∞} ∫_{E\A} E_x[f(X(nh)), τ_A ≥ nh] dπ(x) = 0.                   (9.36)

Proof (Proof of Theorem 9.6). Let A ∈ E be as in Theorem 9.6. We introduce
two operators T and T_A, defined respectively by

    T f(x) = E_x[f(X(1))],  and  T_A f(x) = E_x[f(X(1)), τ_A ≥ 1],  f ∈ C_b(E).
                                                                           (9.37)

Notice that T_A f = T f − 1_A T f = 1_{E\A} T f, so that T_A f = 0 on A. Then by
induction with respect to n we see

    Σ_{k=1}^n 1_A T T_A^{k−1} f + 1_{E\A} T_A^n f = f + Σ_{k=1}^n (T − I) T_A^{k−1} f,   f ∈ C_b(E).   (9.38)
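On a finite state space the operators T and T_A are matrices (T = P acting on column vectors, and T_A = DP where D is multiplication by 1_{E\A}), so the induction identity (9.38) can be checked numerically. The transition matrix and the set A below are invented for illustration.

```python
import numpy as np

# Functions on a finite space are column vectors; T f(x) = sum_y P(x, y) f(y).
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4],
              [0.4, 0.4, 0.2]])
I3 = np.eye(3)
onesA = np.diag([1.0, 0.0, 0.0])          # multiplication by 1_A, A = {0}
onesAc = I3 - onesA                       # multiplication by 1_{E\A}
T = P
TA = onesAc @ T                           # T_A f = 1_{E\A} T f

# Identity (9.38):
# sum_{k=1}^n 1_A T T_A^{k-1} + 1_{E\A} T_A^n = I + sum_{k=1}^n (T - I) T_A^{k-1}
n = 7
TAk = I3                                  # running power T_A^{k-1}, starting with k = 1
lhs, rhs = np.zeros((3, 3)), I3.copy()
for _ in range(n):
    lhs += onesA @ T @ TAk
    rhs = rhs + (T - I3) @ TAk
    TAk = TA @ TAk
lhs += onesAc @ TAk                       # TAk now equals T_A^n
assert np.allclose(lhs, rhs)
print("identity (9.38) verified for n =", n)
```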

Hence, since π is an invariant measure, the equality in (9.38) implies

    Σ_{k=1}^n ∫_A T T_A^{k−1} f(x) dπ + ∫_{E\A} T_A^n f(x) dπ(x) = ∫_E f(x) dπ(x),   f ∈ L¹(E, π).   (9.39)
Let f ∈ L^∞(E, E) ∩ L¹(E, π), f ≥ 0. By the assumption that P_x[τ_A < ∞] = 1
for π-almost all x ∈ E, we have f = lim_{n→∞}(f − T_A^n f), π-almost everywhere,
and hence we obtain

    ∫_E f dπ = ∫_E lim_{n→∞}(f − T_A^n f) dπ = ∫_E lim inf_{n→∞}(f − T_A^n f) dπ

(Fatou's lemma)

    ≤ lim inf_{n→∞} ∫_E (I − T_A^n) f dπ = lim inf_{n→∞} ∫_E Σ_{k=1}^n (I − T_A) T_A^{k−1} f dπ

(employ the identity T_A = T − 1_A T)

    = lim inf_{n→∞} ∫_E Σ_{k=1}^n (I − T + 1_A T) T_A^{k−1} f dπ

(the measure π is T-invariant)

    = lim inf_{n→∞} Σ_{k=1}^n ∫_A T T_A^{k−1} f dπ = Σ_{k=1}^∞ ∫_A T T_A^{k−1} f dπ

    = ∫_A E_x[ Σ_{k=1}^{τ_A^1} f(X(k)) ] dπ(x).                            (9.40)

The inequality in (9.40) shows that

    ∫_E f dπ ≤ ∫_A E_x[ Σ_{k=1}^{τ_A^1} f(X(k)) ] dπ(x).                   (9.41)

Of course, in (9.41) we assumed P_x[τ_A < ∞] = 1, x ∈ E. On the other hand
the equality in (9.38) yields:

    Σ_{k=1}^n ∫_A T T_A^{k−1} f dπ
      ≤ Σ_{k=1}^n ∫_A T T_A^{k−1} f dπ + ∫_{E\A} T_A^n f dπ
      = ∫_E f dπ + ∫_E Σ_{k=1}^n (T − I) T_A^{k−1} f dπ
      = ∫_E f dπ.                                                          (9.42)

From (9.42) we get, by letting n → ∞ and using the Markov property several
times,

    ∫_A E_x[ Σ_{k=1}^{τ_A^1} f(X(k)) ] dπ(x) ≤ ∫_E f dπ.                   (9.43)

Combining (9.43) and (9.41) shows the equality:

    ∫_A E_x[ Σ_{k=1}^{τ_A^1} f(X(k)) ] dπ(x) = ∫_E f dπ.                   (9.44)

The equality in (9.44) completes the proof of Theorem 9.6.

Remark 9.10. If in Theorem 9.6 we only assume π to be sub-invariant in the
sense that ∫_E T f(x) dπ(x) ≤ ∫_E f(x) dπ(x), f ∈ L¹(E, π), f ≥ 0, then for such
functions we have

    Σ_{k=1}^n ∫_A T T_A^{k−1} f(x) dπ + ∫_{E\A} T_A^n f(x) dπ(x) ≤ ∫_E f(x) dπ(x).   (9.45)

Proof (Proof of Corollary 9.7). The equality in (9.39) can be rewritten as
follows:

    ∫_A E_x[ Σ_{k=1}^{τ_A^1 ∧ n} f(X(k)) ] dπ(x) + ∫_{E\A} E_x[f(X(n)), τ_A ≥ n] dπ(x)
        = ∫_E f(x) dπ(x).                                                  (9.46)

The equality in (9.44) together with (9.46) yields the result in Corollary 9.7.
To establish this we need once more the fact that P_x[τ_A < ∞] = 1 for π-almost
all x ∈ E.
This completes the proof of Corollary 9.7.

In the following corollary we give a result similar to the one in Theorem 9.6,
but here we do not necessarily assume that Px [τA < ∞] = 1 for π-almost all
x ∈ E.
Corollary 9.11. Define the measures π_1 and π_∞ by the equalities:

    ∫_E f(x) dπ_1(x) = inf_{ℓ∈N} sup_{n∈N} ∫_A E_x[ Σ_{k=1}^{τ_A^1 ∧ n} T^ℓ f(X(k)) ] dπ(x)
                     = inf_{ℓ∈N} sup_{n∈N} ∫_A E_x[ Σ_{k=1}^{τ_A^1 ∧ n} f(X(k + ℓ)) ] dπ(x),  and   (9.47)

    ∫_E f(x) dπ_∞(x) = sup_{ℓ∈N} inf_{n∈N} ∫_{E\A} E_x[ T^ℓ f(X(n)), τ_A ≥ n ] dπ(x)
                     = sup_{ℓ∈N} inf_{n∈N} ∫_{E\A} E_x[ f(X(n + ℓ)), τ_A ≥ n ] dπ(x),        (9.48)

where the function f ≥ 0 belongs to L¹(E, π). Then the measures π_1 and π_∞
are T-invariant, and they split the measure π:

    ∫_E f dπ = ∫_E f dπ_1 + ∫_E f dπ_∞,   f ∈ L¹(E, π).                    (9.49)

If P_x[τ_A < ∞] = 1 for π-almost all x ∈ E, then π_∞ = 0 and π_1 = π.

Since f ≥ 0, the infima and suprema in (9.47) and (9.48) are in fact limits.
This observation follows from the equality in (9.50) and the invariance of the
measure π.
Proof. From (9.46) we get:

    ∫_A E_x[ Σ_{k=1}^{τ_A^1 ∧ n} f(X(k + ℓ)) ] dπ(x) + ∫_{E\A} E_x[f(X(n + ℓ)), τ_A ≥ n] dπ(x)
      = ∫_A E_x[ Σ_{k=1}^{τ_A^1 ∧ n} T^ℓ f(X(k)) ] dπ(x) + ∫_{E\A} E_x[T^ℓ f(X(n)), τ_A ≥ n] dπ(x)
      = ∫_E T^ℓ f(x) dπ(x) = ∫_E f(x) dπ(x).                               (9.50)

The splitting in (9.49) follows from (9.50). If P_x[τ_A < ∞] = 1 for π-almost all
x ∈ E, then Corollary 9.7 yields π_∞ = 0, and hence π_1 = π.
This completes the proof of Corollary 9.11.
In what follows we establish, in the continuous-time setting, an analog of
Theorem 9.6.
Theorem 9.12. Let the time-homogeneous Markov process X be recurrent
and have a Polish space as state space. Then it admits, up to multiplicative
constants, a unique σ-finite invariant measure π. Let A ∈ E be such that
P_x[τ_A < ∞] = 1 for π-almost all x ∈ E. In addition suppose that a separation
property like the one in Proposition 8.11 is satisfied:
(a) For every x ∈ E \ A^r and some constant α > 0 there exists a function
    u ∈ D(L) such that u(x) − u(y) ≥ α for all y ∈ A^r.
Let π be a σ-finite invariant measure and let f ∈ L¹(E, π) be such that
1_{A^r} L ∫_0^∞ e^{sL_A} |f| ds ∈ L¹(E, π). Then the following equalities hold:

    ∫ f dπ = ∫_{A^r} ( L E_{(·)}[ ∫_0^{τ_A} f(X(s)) ds ](x) + f(x) ) dπ(x)
           = ∫_{A^r} ( L ∫_0^∞ e^{sL_A} f(x) ds + f(x) ) dπ(x).            (9.51)

The invariant measure is finite (rather than merely σ-finite) if there exists a
compact subset C such that sup_{x∈C} L E_{(·)}[τ_C^1](x) < ∞. Moreover,

    π(E) = ∫_{A^r} L E_{(·)}[τ_A](x) dπ(x) + π(A^r)
         = ∫_{C^r} L E_{(·)}[τ_C](x) dπ(x) + π(C^r).

Here A^r stands for the collection of regular points of A:

    A^r = {x ∈ A : P_x[τ_A = 0] = 1}.

From Blumenthal's zero-one law we know that P_x[τ_A = 0] = 0 or 1. Since the
paths are right-continuous it follows that A^r ⊂ Ā, where Ā is the (topological)
closure of A. The formula which lies at the basis of Theorem 9.12 is the
following:

    1_{A^r} L ∫_0^t e^{sL_A} f ds + 1_{A^r} f + e^{tL_A} f = f + L ∫_0^t e^{sL_A} f ds.   (9.52)

The discrete analog of formula (9.52) is the formula in (9.38). The formula in
(9.52) is based on the equality Lf − L_A f = 1_{A^r} Lf, f ∈ D(L), which in turn
is a consequence of hypothesis (a) in Theorem 9.12. In addition we have

    L_A ∫_0^t e^{sL_A} f ds = e^{tL_A} f − f,   on E \ A^r.                (9.53)
The semigroup {e^{sL_A} : s ≥ 0} is defined by:

    e^{sL_A} f(x) = E_x[f(X(s)), τ_A > s],   f ∈ C_b(E).                   (9.54)

Its generator L_A is pointwise defined by

    L_A f(x) = lim_{t↓0} ( e^{tL_A} f(x) − f(x) 1_{E\A^r}(x) ) / t          (9.55)

for all functions f ∈ C_b(E) for which these limits exist for all x ∈ E. Note
that e^{tL_A} f(x) = L_A f(x) = 0 for x ∈ A^r. The semigroup e^{sL_A} lives on E \ A^r.
We will need the following lemma: it resembles Proposition 8.11.

Lemma 9.13. Let x ∈ E \ A^r. Then the following equalities hold:

    lim_{t↓0} P_x[τ_A ≤ t] / t = 0,  and                                   (9.56)
    (L − L_A) f(x) = 0   for f ∈ C_b(E).                                   (9.57)

In addition: (L − L_A) f = 1_{A^r} Lf, and consequently L_A f = 1_{E\A^r} Lf, for
f ∈ D(L).
Proof. Let α > 0 be as in the separation property (a) of Theorem 9.12, and
choose the function u ∈ D(L) in such a way that α ≤ u(x) − u(y) for all
y ∈ A^r. Then we have

    α P_x[τ_A < t] ≤ E_x[u(X(0)) − u(X(τ_A)), τ_A < t]
                   = E_x[u(X(0)) − u(X(τ_A ∧ t)), τ_A ∧ t < t]
                   = −E_x[ ∫_0^{τ_A ∧ t} Lu(X(s)) ds, τ_A < t ] ≤ t ‖Lu‖_∞ P_x[τ_A < t].   (9.58)

From (9.58) equality (9.56) in Lemma 9.13 follows immediately. We use this
to prove that (L − L_A) f(x) = 0. To this end we write

    |e^{tL} f(x) − e^{tL_A} f(x)| = |E_x[f(X(t))] − E_x[f(X(t)), τ_A > t]|
                                 = |E_x[f(X(t)), τ_A ≤ t]| ≤ ‖f‖_∞ P_x[τ_A ≤ t].   (9.59)

The equality in (9.57) follows from (9.56) and (9.59). Finally, let f ∈ D(L).
From (9.57) it follows that (L − L_A) f(y) = 0 for y ∈ E \ A^r. If y ∈ A^r, then
L_A f(y) = 0. Consequently (L − L_A) f = 1_{A^r} Lf.
This concludes the proof of Lemma 9.13.
The following lemma shows that equality (9.53) holds.
Lemma 9.14. The equality in (9.53) holds for f ∈ Cb (E).
Proof. The proof follows a standard procedure. We write

    e^{hL_A} ∫_0^t e^{sL_A} f ds (x) − ∫_0^t e^{sL_A} f(x) ds
      = ∫_0^t e^{(s+h)L_A} f(x) ds − ∫_0^t e^{sL_A} f(x) ds
      = ∫_h^{t+h} e^{sL_A} f(x) ds − ∫_0^t e^{sL_A} f(x) ds
      = ∫_t^{t+h} e^{sL_A} f(x) ds − ∫_0^h e^{sL_A} f(x) ds.               (9.60)

Upon dividing (9.60) by h and sending h to zero we obtain the equality in
(9.53).
This completes the proof of Lemma 9.14.

The following lemma shows that equality (9.52) holds.


Lemma 9.15. The equality in (9.52) holds for all f ∈ C_b(E) with the property
that ∫_0^t e^{sL_A} f ds belongs to D(L) for all t > 0.

Proof. Since for x ∈ A^r the term e^{tL_A} f(x) vanishes, the equality in (9.52)
is automatically true for x ∈ A^r. By Lemma 9.13 we have 1_{E\A^r} L = L_A,
and for x ∈ E \ A^r we need to establish the equality
e^{tL_A} f(x) = f(x) + L_A ∫_0^t e^{sL_A} f(x) ds. However, this is the content of
Lemma 9.14. Hence, the equality in (9.52) follows for f ∈ C_b(E).
This shows the claim in Lemma 9.15.

Proof (Proof of Theorem 9.12). By a density argument and linearity it suffices
to prove the equality in (9.51) for f ∈ L^∞(E, E) ∩ L¹(E, π) with the property
that f ≥ 0 and that the function 1_{A^r} L ∫_0^∞ e^{sL_A} f ds belongs to L¹(E, π).
The equality of the first and the second quantity in (9.51) follows from the
definition of the operators e^{sL_A}, s ≥ 0. Furthermore we have that for x ∈ A^r
the function t ↦ [ L ∫_0^t e^{sL_A} f ds ](x) increases. For this assertion we use the
equality (L − L_A) g = 1_{A^r} Lg established in Lemma 9.13. For g ≥ 0, g ∈ D(L),
and x ∈ A^r we have

    (L − L_A) g(x) = lim_{h↓0} ( e^{hL} g(x) − e^{hL_A} g(x) ) / h
                   = lim_{h↓0} ( E_x[g(X(h))] − E_x[g(X(h)), τ_A > h] ) / h
                   = lim_{h↓0} E_x[g(X(h)), τ_A ≤ h] / h ≥ 0.              (9.61)

From (9.61) we see that g ≥ 0 implies 1_{A^r} Lg = (L − L_A) g ≥ 0. The proof of
Theorem 9.12 can now be completed in the same way as we proved Theorem
9.6.

The following corollary is similar to Corollary 9.7, which follows from the proof
of Theorem 9.6.
Corollary 9.16. Let the notation and assumptions be as in Theorem 9.12.
Then the following equality holds for all f ∈ L¹(E, π) with the property that
the function 1_{A^r} L ∫_0^∞ e^{sL_A} f ds belongs to L¹(E, π) as well:

    lim_{t→∞} ∫_{E\A^r} E_x[f(X(t)), τ_A ≥ t] dπ(x) = 0.                   (9.62)

9.3.2 Construction of an invariant measure

In this subsection we will give a construction of a σ-finite invariant measure,
unique up to a multiplicative constant, provided the hypotheses of Theorem
9.12 are fulfilled.

We will use Dynkin’s formula, and we will employ resolvent techniques: see
(9.31). We will begin with establishing a number of relevant formulas, which
we collect in Proposition 9.21 below. In what follows we employ the following
notation:
    R(λ)f(x) = (λI − L)^{−1} f(x) = ∫_0^∞ e^{−λs} e^{sL} f(x) ds
             = ∫_0^∞ e^{−λs} E_x[f(X(s))] ds,                              (9.63)

    R_A(λ)f(x) = (λI − L_A)^{−1} f(x)
               = E_x[ ∫_0^{τ_A} e^{−λs} f(X(s)) ds ]
               = ∫_0^∞ e^{−λs} E_x[f(X(s)), τ_A > s] ds
               = ∫_0^∞ e^{−λs} e^{sL_A} f(x) ds,                           (9.64)

    P_A(λ) = (L − λI) R_A(λ) + I = 1_{A^r} ((L − λI) R_A(λ) + I)
           = (λI − L) H_A(λ)R(λ) = 1_{A^r} (λI − L) H_A(λ)R(λ),            (9.65)

where

    H_A(λ)f(x) = E_x[e^{−λτ_A} f(X(τ_A))],   x ∈ E \ A^r,
    H_A(λ)f(x) = f(x),                       x ∈ A^r.                      (9.66)

The equalities in (9.65) follow from equality (9.104) in Proposition 9.20.


Lemma 9.17. Let f ∈ D(L) be such that H_A(λ)f ∈ D(L). Then the equalities
in (9.65) yield

    P_A(λ)(L − λI) f = (L − λI) H_A(λ)f,  and                              (9.67)

    λR(λ) { (e^{−λ′h} e^{hL} − I) R_A(λ′) + ∫_0^h e^{−λ′s} e^{sL} ds } (L − λ′I) f   (9.68)
      = λR(λ) (e^{−λ′h} e^{hL} − I) H_A(λ′) f                              (9.69)
      = λ (λR(λ) − I) ∫_0^h e^{−λ′s} e^{sL} ds H_A(λ′) f − λ′ λR(λ) ∫_0^h e^{−λ′s} e^{sL} ds H_A(λ′) f,

for h > 0, λ′ > 0, and λ > 0.

Proof. The equality in (9.67) is an immediate consequence of (9.65). The
equality in (9.68) is a consequence of the following identities:

    R_A(λ′)(L − λ′I) f + f = H_A(λ′) f,  and                               (9.70)

    ∫_0^h e^{−λ′s} e^{sL} (L − λ′I) f ds = e^{−λ′h} e^{hL} f − f.          (9.71)

The equalities in (9.70) and (9.71) hold for f ∈ D(L), λ′ > 0, and h > 0. The
equality in (9.69) is closely related to (9.67). This can be seen as follows:

    λR(λ) { (e^{−λ′h} e^{hL} − I) R_A(λ′) + ∫_0^h e^{−λ′s} e^{sL} ds } (L − λ′I) f
      = λR(λ) { (L − λ′I) ∫_0^h e^{−λ′s} e^{sL} ds R_A(λ′) + ∫_0^h e^{−λ′s} e^{sL} ds } (L − λ′I) f
      = λ (L − λ′I) R(λ) ∫_0^h e^{−λ′s} e^{sL} ds { R_A(λ′)(L − λ′I) + I } f
      = λ (L − λ′I) R(λ) ∫_0^h e^{−λ′s} e^{sL} ds H_A(λ′) f.               (9.72)

Since LR(λ) = λR(λ) − I, equality (9.68) is a consequence of (9.72). In order
to obtain (9.72) we also used the identity

    (L − λ′I) ∫_0^h e^{−λ′s} e^{sL} f ds = e^{−λ′h} e^{hL} f − f,   f ∈ C_b(E).

This completes the proof of Lemma 9.17.

Next, fix h > 0, λ > 0, µ ∈ M(A), and f ∈ C_b(E). Here M(A) is the
space of those (complex) measures µ on (E, E) which are concentrated on A; i.e.
|µ|(E \ A) = 0 (see the notation introduced prior to Theorem 1.7). We will also
need the following stopping times, operators, and functionals:

    τ_A^h = inf{kh : k ≥ 0, k ∈ N, X(kh) ∈ A},                             (9.73)
    τ_A^{1,h} = inf{kh : k ≥ 1, k ∈ N, X(kh) ∈ A} = h + τ_A^h ∘ ϑ_h,       (9.74)
    L^h(λ)f(x) = (1/h)(e^{−λh} e^{hL} − I) f(x) = (1/h)(e^{−λh} E_x[f(X(h))] − f(x))
               = (1/h) E_x[e^{−λh} f(X(h)) − f(X(0))],                     (9.75)
    H_A^h(λ)f(x) = E_x[ e^{−λτ_A^h} f(X(τ_A^h)) ],                         (9.76)
    R_A^h(λ)f(x) = h E_x[ Σ_{k=0}^{τ_A^h/h − 1} e^{−λkh} f(X(kh)), τ_A^h ≥ h ]
                 = h Σ_{k=0}^∞ E_x[ e^{−λkh} f(X(kh)), τ_A^h ≥ (k + 1)h ], (9.77)
    P_A^h(λ)(f) = L^h(λ) H_A^h(λ) L^h(λ)^{−1} f = L^h(λ) R_A^h(λ) f + f,   (9.78)
    Λ_{A,µ}^h(λ)(f) = ∫ L^h(λ) H_A^h(λ) L^h(λ)^{−1} f(x) dµ(x).            (9.79)

The second equality in (9.78) follows from equality (9.80) in Proposition 9.18
below.
Instead of Λ_{A,δ_{x_0}}^h(λ) we write Λ_{A,x_0}^h(λ) when µ = δ_{x_0} is the Dirac
measure at x_0. Instead of L^h(0) we write L^h. For the stopping times τ_A^h and
τ_A^{1,h} see (9.35) in Theorem 9.8. Put

    τ_A = inf{s > 0 : X(s) ∈ A} = inf_{h>0} τ_A^{1,h} = lim_{h↓0} τ_A^{1,h}.

Proposition 9.18. The following identity holds:

    H_A^h(λ) = I + R_A^h(λ) L^h(λ),   h > 0, λ > 0.                        (9.80)

Moreover, the equality in (9.80) is equivalent to equality (9.94) below. In ad-
dition, H_A^h(λ) R_A^h(λ) = 0, and hence H_A^h(λ)² = H_A^h(λ). If P_x[τ_A^h < ∞] = 1
for all x ∈ E, then equality (9.80) holds for λ = 0. If A is recurrent in the
sense that P_x[τ_A < ∞] = 1 for all x ∈ E, then H_A(0) = I + R_A(0)L, and
H_A(0)² = H_A(0), where H_A(0)f(x) = E_x[f(X(τ_A)), τ_A < ∞].
Proof. Let us check the equality in (9.80). To this end we consider:

    R_A^h(λ) L^h(λ) f(x)
      = h Σ_{k=0}^∞ E_x[ e^{−λkh} L^h(λ)f(X(kh)), τ_A^h ≥ (k + 1)h ]
      = Σ_{k=0}^∞ E_x[ e^{−λkh} { e^{−λh} E_{X(kh)}[f(X(h))] − f(X(kh)) }, τ_A^h ≥ (k + 1)h ]
      = Σ_{k=0}^∞ e^{−λ(k+1)h} E_x[ E_{X(kh)}[f(X(h))], τ_A^h ≥ (k + 1)h ]
        − Σ_{k=0}^∞ e^{−λkh} E_x[ f(X(kh)), τ_A^h ≥ (k + 1)h ]

(employ the strong Markov property, observing that the event {τ_A^h ≥ (k + 1)h}
is F_{kh}-measurable)

      = Σ_{k=0}^∞ e^{−λ(k+1)h} E_x[ f(X((k + 1)h)), τ_A^h ≥ (k + 1)h ]
        − Σ_{k=0}^∞ e^{−λkh} E_x[ f(X(kh)), τ_A^h ≥ (k + 1)h ]
      = Σ_{k=0}^∞ e^{−λkh} E_x[ f(X(kh)), τ_A^h ≥ kh ] − f(x)
        − Σ_{k=0}^∞ e^{−λkh} E_x[ f(X(kh)), τ_A^h ≥ (k + 1)h ]
      = Σ_{k=0}^∞ e^{−λkh} E_x[ f(X(kh)), τ_A^h = kh ] − f(x)
      = E_x[ e^{−λτ_A^h} f(X(τ_A^h)) ] − f(x)
      = H_A^h(λ)f(x) − f(x).                                               (9.81)

The equality in (9.81) is the same as (9.80).
If x ∈ A, then P_x[τ_A^h = 0] = 1 and so R_A^h(λ)f(x) = 0. Since X(τ_A^h) ∈ A
it follows that, for λ > 0,

    H_A^h(λ) R_A^h(λ) f(x) = E_x[ e^{−λτ_A^h} R_A^h(λ)f(X(τ_A^h)) ] = 0.   (9.82)

The assertions for h = λ = 0 follow by taking limits with respect to λ ↓ 0 and
h ↓ 0 in the corresponding equality (9.80).
This completes the proof of Proposition 9.18.
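Proposition 9.18 can be sanity-checked on a finite state space with h = 1, where L¹(λ)f = e^{−λ}Pf − f and the operators H_A¹(λ) and R_A¹(λ) are computed from their defining series (truncated, which is harmless because of the factor e^{−λk}). The chain, the set A, and the value of λ below are invented for illustration.

```python
import numpy as np

# Finite-chain sketch of (9.80) with h = 1, so L^1(lambda) f = e^{-lambda} P f - f.
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4],
              [0.4, 0.4, 0.2]])
lam = 0.5
I3 = np.eye(3)
dA = np.diag([1.0, 0.0, 0.0])             # multiplication by 1_A, A = {0}
D = I3 - dA                               # multiplication by 1_{E\A}

Llam = np.exp(-lam) * P - I3              # L^1(lambda)

# R_A^1(lambda) f(x) = sum_{k>=0} e^{-lam k} E_x[f(X(k)), tau_A^1 >= k + 1], with
# tau_A^1 the hitting time tau_A^h of (9.73) for h = 1 (first k >= 0 with X(k) in A);
# H_A^1(lambda) f(x) = E_x[e^{-lam tau_A^1} f(X(tau_A^1))].  Both as truncated series.
RA = np.zeros((3, 3))
HA = dA.copy()                            # tau_A^1 = 0 when the chain starts in A
term = D.copy()                           # matrix of D (P D)^k, k = 0
for k in range(200):
    RA += np.exp(-lam * k) * term
    HA += np.exp(-lam * (k + 1)) * term @ P @ dA
    term = term @ P @ D

# Proposition 9.18, equation (9.80): H_A(lambda) = I + R_A(lambda) L(lambda),
# and the complementary assertion H_A(lambda) R_A(lambda) = 0.
assert np.allclose(HA, I3 + RA @ Llam)
assert np.allclose(HA @ RA, np.zeros((3, 3)))
print("identities of Proposition 9.18 verified")
```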
Proposition 9.19. The following equalities hold for f ∈ C_b(E), λ ≥ 0, and
h > 0:

    L^h(λ) R_A^h(λ) f(x) + f(x)
      = E_x[ Σ_{k=0}^{h^{−1}τ_A^{1,h} − 1} e^{−λkh} f(X(kh)), τ_A^h = 0 ]
      = E_x[ Σ_{k=0}^{h^{−1}τ_A^{1,h} − 1} e^{−λkh} f(X(kh)), τ_A^h = 0 ] P_x[τ_A^h = 0]
      = E_x[ Σ_{k=0}^{h^{−1}τ_A^{1,h} − 1} e^{−λkh} f(X(kh)) ] P_x[τ_A^h = 0]
      = e^{−λh} e^{hL} E_{(·)}[ Σ_{k=0}^{h^{−1}τ_A^h − 1} e^{−λkh} f(X(kh)), τ_A^h ≥ h ](x) · P_x[τ_A^h = 0]
        + f(x) · P_x[τ_A^h = 0]
      = ( (1/h) e^{−λh} e^{hL} R_A^h(λ) f(x) + f(x) ) P_x[τ_A^h = 0]
      = ( L^h(λ) R_A^h(λ) f(x) + f(x) ) P_x[τ_A^h = 0]
      = lim_{n→∞} Σ_{k=0}^n e^{−λ(k+1)h} e^{hL} E_{(·)}[ f(X(kh)), τ_A^h ≥ (k + 1)h ](x) · P_x[τ_A^h = 0]
        + f(x) · P_x[τ_A^h = 0].                                           (9.83)

In addition, the following assertions are true:

(a) The following formula holds:

        (L^h(λ) R_A^h(λ) + I)² = 1_A (L^h(λ) R_A^h(λ) + I).                (9.84)

    This formula is also valid with λ = 0 provided that A is h-recurrent in
    the sense that P_x[τ_A^h < ∞] = 1 for all x ∈ E. If A is recurrent, i.e. if
    P_x[τ_A < ∞] = 1 for all x ∈ E, then the operator P_A(0) = L R_A(0+) + I
    is a projection operator.
(b) For f ≥ 0 the function P_A^h(λ)f is non-negative, and the function λ ↦
    P_A^h(λ)f(x) increases when λ decreases. In addition, by the fifth equality
    in (9.83) it follows that with h = h_n = 2^{−n} h_0 the sequence
    n ↦ P_A^{2^{−n}h_0}(λ) increases, where h_0 > 0 and λ ≥ 0 are fixed.

The equalities in (9.83) are the same as those in (9.93). They will be employed
to prove that the invariant measure we will introduce is σ-finite.
Proof. We use the definitions of the operators Lh (λ) and RA
h
(λ) to obtain

Lh (λ)RAh
(λ)f (x) + f (x)
X∞
¡ ¢ £ ¤
= e−λkh e−λh ehL − I E(·) f (X(kh)) , τAh ≥ (k + 1)h (x) + f (x)
k=0
X∞
£ ¤
= e−λ(k+1)h ehL E(·) f (X(kh)) , τAh ≥ (k + 1)h (x)
k=0

X £ ¤
− e−λkh Ex f (X(kh)) , τAh ≥ (k + 1)h + f (x)
k=0

X £ £ ¤¤
= e−λ(k+1)h Ex EX(h) f (X(kh)) , τAh ≥ (k + 1)h
k=0

X £ ¤
− e−λkh Ex f (X(kh)) , τAh ≥ (k + 1)h + f (x)
k=0

(Markov property)


X £ ¤
= e−λ(k+1)h Ex f (X (k + 1) h) , τAh ◦ ϑh ≥ (k + 1)h
k=0

X £ ¤
− e−λkh Ex f (X(kh)) , τAh ≥ (k + 1)h + f (x)
k=0

X £ ¤
= e−λkh Ex f (X (kh)) , h + τAh ◦ ϑh ≥ (k + 1)h
k=1
9.3 Markov Chains: invariant measure 485

X £ ¤
− e−λkh Ex f (X(kh)) , τAh ≥ (k + 1)h + f (x)
k=0

X h i
= e−λkh Ex f (X (kh)) , τA1,h ≥ (k + 1)h
k=1

X £ ¤ £ ¤
− e−λkh Ex f (X(kh)) , τAh ≥ (k + 1)h + f (x)Px τAh = 0
k=1

Pk2
(a sum of the form k=k1 αk is interpreted as 0 if k2 < k1 )
 
1,h
h−1 τA −1
 X  £ ¤
= Ex  e−λkh f (X(kh)) + f (x)Px τAh = 0
k=(h−1 τA )
h ∨1

 
1,h
h−1 τA −1
 X  £ ¤
= Ex  e−λkh f (X(kh)) , τAh = 0 + f (x)Px τAh = 0
k=(h−1 τA )
h ∨1

 
1,h
h−1 τA −1
 X 
+ Ex  e−λkh f (X(kh)) , τAh ≥ h
k=(h−1 τA )
h ∨1

© ª
(on the event τAh ≥ h the equality τA1,h = τAh holds Px -almost surely)
 1,h

h−1 τA −1
X £ ¤
= Ex  e−λkh f (X(kh)) , τAh = 0 + f (x)Px τAh = 0
k=1

 1,h

h−1 τA −1
X
= Ex  e−λkh f (X(kh)) , τAh = 0 . (9.85)
k=0
The equality in (9.85) shows the first equality in (9.83). The second and third equality follow from the equality $P_x\left[\tau_A^h = 0\right] = 1$, which is true if and only if $x \in A$. The fourth equality in (9.83) is a consequence of the Markov property in conjunction with the third equality. The fifth equality just follows from the definition of the operator $R_A^h(\lambda)$. The sixth equality follows from Lebesgue's dominated convergence theorem, or from the monotone convergence theorem if $f \ge 0$.
(a) In order to prove assertion (a) in Proposition 9.19 we consider
\begin{align*}
\left(L^h(\lambda)R_A^h(\lambda) + I\right)^2 &= \left(L^h(\lambda)R_A^h(\lambda) + I\right)L^h(\lambda)R_A^h(\lambda) + L^h(\lambda)R_A^h(\lambda) + I\\
&= L^h(\lambda)\left(R_A^h(\lambda)L^h(\lambda) + I\right)R_A^h(\lambda) + L^h(\lambda)R_A^h(\lambda) + I
\end{align*}
(apply the equality in (9.80) and the final assertion in Proposition 9.18)
\begin{align*}
&= L^h(\lambda)H_A^h(\lambda)R_A^h(\lambda) + L^h(\lambda)R_A^h(\lambda) + I\\
&= L^h(\lambda)R_A^h(\lambda) + I. \tag{9.86}
\end{align*}
By taking limits as $\lambda \downarrow 0$ and $h \downarrow 0$ in $\left(L^h(\lambda)R_A^h(\lambda) + I\right)^2 = L^h(\lambda)R_A^h(\lambda) + I$ the final conclusion in assertion (a) of Proposition 9.19 follows. Since $\mathbf{1}_A(x) = P_x\left[\tau_A^h = 0\right]$, the final equality in (9.84) follows from (9.83).
(b) Observe that by the equalities in (9.83) and the definition of the operator $R_A^h(\lambda)$ we have
\begin{align*}
L^h(\lambda)R_A^h(\lambda) + I &= \mathbf{1}_A\left(L^h(\lambda)R_A^h(\lambda) + I\right) = \mathbf{1}_A\left(\frac{e^{-\lambda h}e^{hL} - I}{h}R_A^h(\lambda) + I\right)\\
&= \frac{1}{h}\,\mathbf{1}_A\left(e^{-\lambda h}e^{hL}R_A^h(\lambda) + I\right). \tag{9.87}
\end{align*}
From (9.87) it follows that for $f \ge 0$ the function $P_A^h(\lambda)f$ is non-negative, and that the function $\lambda \mapsto P_A^h(\lambda)f(x)$ increases when $\lambda$ decreases. In addition, by the fifth equality in (9.83) it follows that for $h = 2^{-n}h_0$ the sequence $n \mapsto P_A^{2^{-n}h_0}(\lambda)$ increases, where $h_0 > 0$ and $\lambda \ge 0$ are fixed. The equality in (9.86) and the latter observations complete the proof of Proposition 9.19.
Fix $\mu \in P(E)$. An attempt to define the invariant measure $\pi$ goes as follows. It is determined by the functional
\begin{align*}
\Lambda_{A,\mu}\colon f &\mapsto \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\int \left(\lambda I - L^h(\lambda)\right)H_A^h(\lambda)\left(\lambda I - L^h(\lambda)\right)^{-1} f(x)\,d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\int L^h(\lambda)\left(H_A^h(\lambda) - I\right)L^h(\lambda)^{-1} f(x)\,d\mu(x) + \int f(x)\,d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\int L^h(\lambda)R_A^h(\lambda)f(x)\,d\mu(x) + \int f(x)\,d\mu(x)\\
&= \lim_{h\downarrow 0}\int L^h R_A^h(0)f(x)\,d\mu(x) + \int f(x)\,d\mu(x). \tag{9.88}
\end{align*}
In (9.88) we employed equality (9.80). Let us try to check the $L$-invariance of the functional in (9.88). To this end we fix $f \in D(L)$. Then $Lf = T_\beta\text{-}\lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)f$, and
\begin{align*}
\Lambda_{A,\mu}\left(Lf\right) &= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\Lambda_{A,\mu}^h(\lambda)\left(L^h(\lambda)f\right) = \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\int L^h(\lambda)H_A^h(\lambda)f\,d\mu\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{h}\int_A \left(E_x\left[e^{-\lambda h}H_A^h(\lambda)f(X(h))\right] - H_A^h(\lambda)f(x)\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{h}\int_A \left(E_x\left[e^{-\lambda\tau_A^{1,h}}f\left(X\left(\tau_A^{1,h}\right)\right)\right] - E_x\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)
\end{align*}
(for $x \in E\setminus A$ the equality $\tau_A^{1,h} = \tau_A^h$ holds $P_x$-almost surely)
\begin{align*}
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{h}\int_E \left(E_x\left[e^{-\lambda\tau_A^{1,h}}f\left(X\left(\tau_A^{1,h}\right)\right)\right] - E_x\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{h}\int_E \left(E_x\left[e^{-\lambda\tau_A^{1,h}}f\left(X\left(\tau_A^h\right)\circ\vartheta_h\right)\right] - E_x\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{e^{-\lambda h}}{h}\int_E \left(E_x\left[e^{-\lambda\tau_A^h\circ\vartheta_h}f\left(X\left(\tau_A^h\right)\circ\vartheta_h\right)\right] - E_x\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{e^{-\lambda h}}{h}\int_E \left(E_x\left[E_{X(h)}\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right] - E_x\left[e^{-\lambda\tau_A^h}f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\frac{1}{h}\int_E \left(E_x\left[E_{X(h)}\left[f\left(X\left(\tau_A^h\right)\right)\right]\right] - E_x\left[f\left(X\left(\tau_A^h\right)\right)\right]\right)d\mu(x)\\
&= \lim_{h\downarrow 0}\frac{1}{h}\int_E \left(e^{hL} - I\right)H_A^h f(x)\,d\mu(x). \tag{9.89}
\end{align*}
Hence, if $\mu$ were an invariant Borel measure, then the expression in (9.89) would vanish. So the expression for $\Lambda_{A,\mu}$ does not automatically lead to an invariant measure: there is a problem with the invariance, although (9.88) does yield a measure. In order to take care of that problem we will assume that for every $f \in C_b(E)$ there exists a sequence of strictly positive real numbers $(\lambda_n)_{n\in\mathbb{N}}$, which decreases to zero, and is such that $Pf := T_\beta\text{-}\lim_{n\to\infty}\lambda_n R\left(\lambda_n\right)f$ exists for all $f \in C_b(E)$. Notice that $R(P) = N(L)$ and that, by the resolvent identity, $P^2 = P$, i.e. $P$ is a projection onto the zero-space of the operator $L$. Fix $x_0 \in A^r$. Then, in general, the formula
\begin{align*}
f &\mapsto \lim_{n\to\infty}\lambda_n\left(\lambda_n I - L\right)H_A\left(\lambda_n\right)R\left(\lambda_n\right)^2 f\left(x_0\right)\\
&= \lim_{n\to\infty}\left(\lambda_n\left(\lambda_n I - L\right)\left(H_A\left(\lambda_n\right) - I\right)R\left(\lambda_n\right)^2 f\left(x_0\right) + \lambda_n R\left(\lambda_n\right)f\left(x_0\right)\right)\\
&= \lim_{n\to\infty}\left(L - \lambda_n\right)R_A\left(\lambda_n\right)\left(\lambda_n R\left(\lambda_n\right)f\right)\left(x_0\right) + \lim_{n\to\infty}\lambda_n R\left(\lambda_n\right)f\left(x_0\right)\\
&= LR_A(0+)Pf\left(x_0\right) + Pf\left(x_0\right) \tag{9.90}
\end{align*}
does not provide an invariant measure either. Suppose e.g. that $N(L)$ consists of the constant functions. Then by taking $f = 1$ in (9.92) below we have
\begin{align*}
LR_A(0+)\mathbf{1}(x) + \mathbf{1}(x) &= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)H_A^h(\lambda)L^h(\lambda)^{-1}\mathbf{1}(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{1 - e^{-\lambda h}}\left(I - e^{-\lambda h}e^{hL}\right)H_A^h(\lambda)\mathbf{1}(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{1}{1 - e^{-\lambda h}}\,E_x\left[e^{-\lambda\tau_A^h} - e^{-\lambda\tau_A^{1,h}}\right]\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0}\frac{\lambda}{1 - e^{-\lambda h}}\cdot\frac{E_x\left[e^{-\lambda\tau_A^h} - e^{-\lambda\tau_A^{1,h}},\, \tau_A^{1,h} > \tau_A^h\right]}{\lambda}\\
&= \lim_{h\downarrow 0}\frac{1}{h}\,E_x\left[\tau_A^{1,h} - \tau_A^h,\, \tau_A^{1,h} > \tau_A^h\right]
\end{align*}
($E_x\left[\tau_A^h\right] = 0$ for $x \in A$, and $P_x\left[\tau_A^{1,h} = \tau_A^h\right] = 1$ for $x \in E\setminus A$)
\begin{align*}
&= \lim_{h\downarrow 0}\frac{1}{h}\,\mathbf{1}_A(x)\,E_x\left[\tau_A^{1,h}\right]\\
&= \lim_{h\downarrow 0}\frac{1}{h}\,\mathbf{1}_A(x)\,E_x\left[h + \tau_A^h\circ\vartheta_h\right]
\end{align*}
(Markov property)
\begin{align*}
&= \lim_{h\downarrow 0}\frac{1}{h}\,\mathbf{1}_A(x)\,e^{hL}E_{(\cdot)}\left[\tau_A^h\right](x) + 1\\
&= \mathbf{1}_A(x)\,LE_{(\cdot)}\left[\tau_A\right](x) + 1, \tag{9.91}
\end{align*}
where $\tau_A = \inf\left\{\tau_A^h : h > 0\right\}$. If possible choose $x_0 \in A$ in such a way that $LE_{(\cdot)}\left[\tau_A\right]\left(x_0\right) = \infty$. Then $LR_A(0+)\mathbf{1}\left(x_0\right) + \mathbf{1}\left(x_0\right) = \infty$. It follows that for $f \in C_b(E)$, $f \ge 0$, the expression $LR_A(0+)Pf\left(x_0\right) + Pf\left(x_0\right)$ is either $\infty$, in case $Pf \ne 0$, or $0$, in case $Pf = 0$. Observe that, under the hypothesis that the space $N(L)$ consists of the constant functions, $Pf$ is a constant $\ge 0$.
The reader is cautioned that the symbol $LR_A(0+)f(x) + f(x)$ is shorthand notation for the following limit:
\begin{align*}
LR_A(0+)f(x) + f(x) &= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)R_A^h(\lambda)f(x) + f(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)\left(H_A^h(\lambda) - I\right)L^h(\lambda)^{-1}f(x) + f(x)\\
&= \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)H_A^h(\lambda)L^h(\lambda)^{-1}f(x). \tag{9.92}
\end{align*}
The symbols $L^h(\lambda)$, $R_A^h(\lambda)$, and $H_A^h(\lambda)$ are explained in (9.79). Since $PLf = 0$, it follows that the ``measure'' determined by (9.90) is $L$-invariant, but not necessarily $\sigma$-finite; the expression in (9.92) is either $0$ or $\infty$. If the measure $\pi$ is $\sigma$-finite, then there exist functions $f \in C_b(E)$, $f \ge 0$, such that $0 < \int f\,d\pi < \infty$.
Suppose that for every sequence (fn : n ∈ N) which decreases pointwise to
zero the sequence (sup0<λ<1 λR(λ)fn : n ∈ N) decreases to the zero-function
uniformly on compact subsets. Then the family {λR(λ) : 0 < λ < 1} is Tβ -
equi-continuous, and Tβ - limλ↓0 λR(λ)f = P f exists for all f ∈ Cb (E), pro-
vided that the vector sum R(L) + N (L) is Tβ -dense in Cb (E). Some of the
formulas we need are the following ones:
\begin{align*}
&L^h(\lambda)R_A^h(\lambda)f(x) + f(x)\\
&\quad= E_x\left[\sum_{k=0}^{h^{-1}\tau_A^{1,h}-1} e^{-\lambda kh}f(X(kh))\right]\cdot P_x\left[\tau_A^h = 0\right]\\
&\quad= e^{-\lambda h}e^{hL}\,E_{(\cdot)}\left[\sum_{k=0}^{h^{-1}\tau_A^h-1} e^{-\lambda kh}f(X(kh)),\, \tau_A^h \ge h\right](x)\cdot P_x\left[\tau_A^h = 0\right] + f(x)\,P_x\left[\tau_A^h = 0\right], \tag{9.93}\\[1ex]
&\left(H_A^h(\lambda) - I\right)L^h(\lambda)^{-1} = R_A^h(\lambda), \tag{9.94}\\[1ex]
&-R_A^h(\lambda)L^h(\lambda) = I - e^{-\lambda h}e^{hL}H_A^h(\lambda), \quad H_A^h(\lambda) = \left(H_A^h(\lambda)\right)^2, \quad\text{and} \tag{9.95}\\[1ex]
&\lim_{h\downarrow 0}\left(L^h R_A^h(0)L^h f\left(x_0\right) + L^h f\left(x_0\right)\right) = \lim_{h\downarrow 0} L^h H_A^h(0)f\left(x_0\right) = 0, \quad x_0 \in E\setminus A^r. \tag{9.96}
\end{align*}
We also write $P_A^h(\lambda) = L^h(\lambda)R_A^h(\lambda) + I$. Then
\[
P_A^h(\lambda)L^h(\lambda) = L^h(\lambda)\left(R_A^h(\lambda)L^h(\lambda) + I\right) = L^h(\lambda)H_A^h(\lambda). \tag{9.97}
\]
Observe that the equalities in (9.93) are proved in Proposition 9.19: see (9.83). Notice that (9.80) is equivalent to (9.94), and that the equalities in (9.81) prove this equality. The second equality in (9.97) is a consequence of (9.94). Put
\[
P_A(0)f = LR_A(0+)f + f = \lim_{h\downarrow 0}\lim_{\lambda\downarrow 0} L^h(\lambda)R_A^h(\lambda)f + f.
\]
Then from (9.97) we infer informally that $P_A(0)Lf = LH_A(0)f$, $f \in D(L)$. More precisely, for $f \in D(L)$ and $\lambda > 0$ we have
\[
\lambda R(\lambda)P_A(0)Lf = \lambda LR(\lambda)H_A(0)f = \lambda\left(\lambda R(\lambda) - I\right)H_A(0)f. \tag{9.98}
\]
The expression in (9.98) is uniformly bounded in $\lambda > 0$, and converges uniformly to zero when $\lambda \downarrow 0$. Some other ideas will be proposed next. Let $\mu \ge 0$ be any positive measure on $E$. Then we define the measure $\pi$ via the functional:
\[
f \mapsto \lim_{\lambda\downarrow 0}\lambda\int R(\lambda)P_A(0)f\,d\mu, \quad f \ge 0,\ f \in C_b(E). \tag{9.99}
\]
Then for $f \in D(L)$ and $\mu$ a bounded Borel measure we have
\begin{align*}
\lim_{\lambda\downarrow 0}\lambda\int R(\lambda)P_A(0)Lf\,d\mu &= \lim_{\lambda\downarrow 0}\lambda\int R(\lambda)LH_A(0)f\,d\mu\\
&= \lim_{\lambda\downarrow 0}\lambda\int\left(\lambda R(\lambda) - I\right)H_A(0)f\,d\mu = 0. \tag{9.100}
\end{align*}

If $\mu$ is a bounded probability measure which is concentrated on $A^r \subset A$, then the expression in (9.99) can be employed to define a non-trivial invariant measure $\pi$, so that
\[
\int f\,d\pi = \lim_{\lambda\downarrow 0}\lambda\int R(\lambda)P_A(0)f\,d\mu.
\]
The invariance follows from (9.100). The existence follows from the assumption that the subspace $R(L) + N(L)$ is $T_\beta$-dense in $C_b(E)$. The non-triviality follows from the fact that $\int \mathbf{1}\,d\pi \ge \int \mathbf{1}\,d\mu = 1$: compare with (9.91). The $\sigma$-finiteness follows from the assumption that the subset $A$ is recurrent, i.e. $P_x\left[\tau_A < \infty\right] = 1$ for all $x \in E$, together with (9.92). Suppose $x \in A^r$. Then the limits in (9.92) are in fact suprema, provided the numbers $h$ are taken of the form $2^{-n}h_0$, $h_0 > 0$ fixed, and $n \to \infty$. Moreover, the expression in (9.92) vanishes for $x \in A$. In addition, we need the fact that
\[
\tau_A = \inf_{h_0 > 0}\lim_{n\to\infty}\tau_A^{1,2^{-n}h_0} = \inf_{h_0 > 0}\inf_{n\in\mathbb{N}}\tau_A^{1,2^{-n}h_0} = \inf\left\{s > 0 : X(s) \in A\right\}. \tag{9.101}
\]
If $A$ is an open subset, then in (9.101) we may fix $h_0$; e.g. $h_0 = 1$ will do. As throughout this book we assume that the paths are $P_x$-almost surely right-continuous. In order to finish the arguments we need Choquet's capacity theorem, which states that for $x \in A^r$ the stopping time $\tau_A$ can be approximated from above by hitting times of compact subsets $K$ of $A$, and from below by hitting times of open subsets:
\[
\inf_{K\subset A,\ K\ \mathrm{compact}}\tau_K = \tau_A = \sup_{U\supset A,\ U\ \mathrm{open}}\tau_U, \quad P_\mu\text{-almost surely}. \tag{9.102}
\]
For more details see §3.5. In particular, see the proof of Theorem 3.30; the equality in (3.247) is quite relevant.
In the remaining part of this subsection the operators $L$ and $L_A$ have to be interpreted in the pointwise sense. For $x \in E$ we have
\[
Lf(x) = \lim_{t\downarrow 0}\frac{e^{tL}f(x) - f(x)}{t} = \lim_{t\downarrow 0}\frac{1}{t}\left(E_x\left[f(X(t)) - f(X(0))\right]\right),
\]
and
\begin{align*}
L_A f(x) &= \lim_{t\downarrow 0}\frac{e^{tL_A}f(x) - f(x)\mathbf{1}_{E\setminus A^r}(x)}{t}\\
&= \lim_{t\downarrow 0}\frac{1}{t}\left(E_x\left[f(X(t)),\, \tau_A > t\right] - E_x\left[f(X(0)),\, \tau_A > 0\right]\right). \tag{9.103}
\end{align*}
As a consequence $L_A f(x) = 0$ for $x \in A^r$.
Proposition 9.20. For $\lambda > 0$ and $x \in E$ the equality
\[
\left(\lambda I - L\right)H_A(\lambda)f(x) = \mathbf{1}_{A^r}(x)\left(\lambda I - L\right)H_A(\lambda)f(x) \tag{9.104}
\]
holds for $f \in C_b(E)$ with the property that $H_A(\lambda)f$ belongs to the pointwise domain of $L$. Moreover, the function $R_A(\lambda)f$ belongs to the (pointwise) domain of $L$ if and only if the same is true for the function $H_A(\lambda)R(\lambda)f$.
Proof. On $E\setminus A^r$ the equality in (9.104) follows from Lemma 9.13, equality (9.56): see Proposition 9.45 below as well. More precisely, for $x \in E\setminus A^r$, $\lambda > 0$ and $h > 0$ we have
\begin{align*}
&\left(I - e^{-\lambda h}e^{hL}\right)H_A(\lambda)f(x)\\
&\quad= E_x\left[e^{-\lambda\tau_A}f\left(X\left(\tau_A\right)\right)\right] - e^{-\lambda h}\,E_x\left[E_{X(h)}\left[e^{-\lambda\tau_A}f\left(X\left(\tau_A\right)\right)\right]\right]
\end{align*}
(Markov property)
\[
= E_x\left[e^{-\lambda\tau_A}f\left(X\left(\tau_A\right)\right)\right] - E_x\left[e^{-\lambda h - \lambda\tau_A\circ\vartheta_h}f\left(X\left(h + \tau_A\circ\vartheta_h\right)\right)\right]
\]
(on the event $\left\{\tau_A > h\right\}$ the equality $h + \tau_A\circ\vartheta_h = \tau_A$ holds $P_x$-almost surely)
\[
= E_x\left[e^{-\lambda\tau_A}f\left(X\left(\tau_A\right)\right) - e^{-\lambda h - \lambda\tau_A\circ\vartheta_h}f\left(X\left(h + \tau_A\circ\vartheta_h\right)\right),\, \tau_A \le h\right]. \tag{9.105}
\]
The equality in (9.104) now follows from Lemma 9.13, equality (9.56).
From the strong Markov property Dynkin's formula follows:
\[
R(\lambda)f(x) - R_A(\lambda)f(x) = H_A(\lambda)R(\lambda)f(x), \quad f \in C_b(E),\ x \in E, \tag{9.106}
\]
or equivalently,
\[
f = \left(\lambda I - L\right)\left(R_A(\lambda)f + H_A(\lambda)R(\lambda)f\right). \tag{9.107}
\]
Let $f \in C_b(E)$. Hence, from (9.107) it follows that the function $R_A(\lambda)f$ belongs to the (pointwise) domain of $L$ if and only if the same is true for the function $H_A(\lambda)R(\lambda)f$.
This completes the proof of Proposition 9.20.

If A ∈ E, then τA denotes its first hitting time.


Proposition 9.21. Suppose that the Borel subset $A$ possesses the almost separation property as defined in Definition 8.5 with $D = D(L)$. The following identity holds for all $\lambda > 0$ and $f \in C_b(E)$:
\[
R(\lambda)f - R_A(\lambda)f = R(\lambda)\left(L - L_A\right)R_A(\lambda)f + R(\lambda)\left(\mathbf{1}_{A^r}f\right). \tag{9.108}
\]
Let $f \in C_b(E)$ and $\lambda > 0$ be such that the function
\[
x \mapsto R_A(\lambda)f(x) = \left(\lambda I - L_A\right)^{-1}f(x) = E_x\left[\int_0^{\tau_A} e^{-\lambda s}f\left(X(s)\right)ds\right] \tag{9.109}
\]
belongs to the pointwise domain of the operator $L$. Then the following equality holds:
\begin{align*}
&\left(L - L_A\right)E_{(\cdot)}\left[\int_0^{\tau_A} e^{-\lambda s}f(X(s))\,ds\right](x)\\
&\quad= \left(\lambda I - L\right)E_{(\cdot)}\left[e^{-\lambda\tau_A}\left(\lambda I - L\right)^{-1}f\left(X\left(\tau_A\right)\right)\right](x) - \mathbf{1}_{A^r}(x)f(x). \tag{9.110}
\end{align*}
Proof. The equality in (9.110) is just a rewriting of (9.108). The equality in (9.108) can be obtained by noticing that
\[
\left(\lambda I - L_A\right)R_A(\lambda)f = \mathbf{1}_{E\setminus A^r}f. \tag{9.111}
\]
The equality in (9.111) follows from the following identities:
\begin{align*}
&R_A(\lambda)f(x) - e^{-\lambda\delta}e^{\delta L_A}R_A(\lambda)f(x)\\
&\quad= \int_0^\infty e^{-\lambda s}e^{sL_A}f(x)\,ds - \int_\delta^\infty e^{-\lambda s}e^{sL_A}f(x)\,ds = \int_0^\delta e^{-\lambda s}e^{sL_A}f(x)\,ds. \tag{9.112}
\end{align*}
After dividing by $\delta > 0$ and letting $\delta$ tend to $0$ we see that (9.111) follows. The equality in (9.108) then follows from $L - L_A = \left(\lambda I - L_A\right) - \left(\lambda I - L\right)$ together with Dynkin's formula (9.106).
This completes the proof of Proposition 9.21.

The following theorem yields the existence of an invariant $\sigma$-finite Borel measure provided that there exists a compact recurrent subset $A$. It is assumed that the set $A^r$, i.e. the set of regular points of $A$, coincides with $A$. Let the operator $H_A\colon C_b([0,\infty)\times E)\to C_b(E)$ be defined by
\[
H_A g(x) = E_x\left[g\left(\tau_A, X\left(\tau_A\right)\right)\right] = E_x\left[g\left(\tau_A, X\left(\tau_A\right)\right),\, \tau_A < \infty\right]. \tag{9.113}
\]
The equality in (9.113) is the same as (9.253) in Lemma 9.42 below. Recall that a Markov process with transition function $P(t,x,B)$ is strong Feller whenever every function $x \mapsto P(t,x,B)$, $(t,B) \in (0,\infty)\times\mathcal{E}$, is continuous. The following result reduces the existence of an invariant measure for the Markov process given by (8.14) to that of a Markov chain. In fact our approach is inspired by results due to Azéma, Kaplan-Duflo and Revuz [12, 11, 10]. Basically, the process $t \mapsto X(t)$ is replaced by the chain $(n,\omega,\lambda) \mapsto X\left(T_n(\lambda),\omega\right)$, $(n,\omega,\lambda) \in \mathbb{N}\times\Omega\times\Lambda$, where $(n,\lambda) \mapsto T_n(\lambda)$, $(n,\lambda) \in \mathbb{N}\times\Lambda$, is the sequence of jump times of an independent Poisson process of intensity $\alpha_0 > 0$:
\[
\left\{\left(\Lambda, \mathcal{G}, \pi_t\right)_{t\ge 0},\, \left(N(t),\, t \ge 0\right),\, \left(\vartheta^P_t : t \ge 0\right),\, [0,\infty)\right\}. \tag{9.114}
\]
This means that $T_n = \inf\left\{t > 0 : N(t) \ge n\right\}$, $n \in \mathbb{N}$. The process $n \mapsto T_n$ can be realized as a random walk: $T_n = \sum_{k=1}^n Z_k$. Here $\left(Z_k : k \in \mathbb{N},\ k \ge 1\right)$ is a sequence of independent variables, each exponentially distributed with parameter $\alpha_0$.
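The random-walk realization of the jump times lends itself to a direct simulation. The following sketch (not part of the original text; the function names are ours, and the trajectory is a hypothetical stand-in) generates $T_n = \sum_{k=1}^n Z_k$ from i.i.d. exponential increments and evaluates a given path at these times, which is exactly the subordination $n \mapsto X\left(T_n(\lambda),\omega\right)$ used below.

```python
import numpy as np

def jump_times(n, alpha0, rng):
    """T_n = Z_1 + ... + Z_n with Z_k i.i.d. exponential of parameter
    alpha0: the jump times of a Poisson process of intensity alpha0."""
    return np.cumsum(rng.exponential(1.0 / alpha0, size=n))

def subordinated_chain(times, path):
    """Evaluate a trajectory t -> X(t) at the jump times T_1, T_2, ...,
    producing the chain n -> X(T_n)."""
    return [path(t) for t in times]

rng = np.random.default_rng(0)
T = jump_times(5, alpha0=2.0, rng=rng)
# A stand-in trajectory (hypothetical); in the setting of the text,
# X(t, omega) would be a sample path of the Markov process.
Y = subordinated_chain(T, path=lambda t: int(t))
```

Since the increments are independent exponentials, the empirical mean gap converges to $1/\alpha_0$, which gives a quick sanity check on the construction.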
Lemma 9.22. The process given by
\[
\left\{\left(\Omega\times\Lambda, \mathcal{F}\otimes\mathcal{G}, P_x\otimes\pi_0\right),\, \left(X\left(T_n(\lambda),\omega\right),\, n \in \mathbb{N}\right),\, \left(\vartheta^P_n(\lambda),\, n \in \mathbb{N}\right),\, (E,\mathcal{E})\right\} \tag{9.115}
\]
is a Markov chain. Its transition kernel is given by
\[
P_x\otimes\pi_0\left[X\left(T_1\right) \in B\right] = \int_0^\infty \alpha_0 e^{-\alpha_0 t}P(t,x,B)\,dt, \tag{9.116}
\]
provided that $T_1$ is exponentially distributed with parameter $\alpha_0$.
The $P_x\otimes\pi_0$-distribution of the state variable $X\left(T_n\right)$ can be expressed in terms of the $P_x$-distribution of the process $X(t)$:
\begin{align*}
P_x\otimes\pi_0\left[X\left(T_n\right) \in B\right] &= \int_0^\infty \alpha_0\frac{\left(\alpha_0 t\right)^{n-1}}{(n-1)!}e^{-\alpha_0 t}P(t,x,B)\,dt\\
&= \int_0^\infty \alpha_0\frac{\left(\alpha_0 t\right)^{n-1}}{(n-1)!}e^{-\alpha_0 t}\,P_x\left[X(t) \in B\right]dt = \left(\alpha_0 R\left(\alpha_0\right)\right)^n\mathbf{1}_B(x). \tag{9.117}
\end{align*}
In (9.115) we have $X(t)\circ\vartheta^P_m(\omega,\lambda) = X\left(t + T_m(\lambda),\omega\right)$, $m \in \mathbb{N}$, $(\omega,\lambda) \in \Omega\times\Lambda$. Relative to $\pi_0$ the variables $T_n - T_m$ and $T_{n-m} - T_0 = T_{n-m}$, $n > m$, have the same distributions, and the measure $\pi_t$ is the measure $\pi_0$ translated over time $t$, i.e.
\[
\int_\Lambda F\left(T_1 - T_0, \ldots, T_n - T_{n-1}\right)d\pi_t = \int_\Lambda F\left(T_1 - T_0 + t, \ldots, T_n - T_{n-1} + t\right)d\pi_0,
\]
where $F\colon [0,\infty)^n \to \mathbb{R}$ is any bounded Borel measurable function. It follows that the probability measures $\left(\pi_{T_m(\lambda)} : m \in \mathbb{N},\ \lambda \in \Lambda\right)$ satisfy ($n > m$):
\begin{align*}
\int_\Lambda f\left(T_n - T_m\right)d\pi_{T_m(\lambda)} &= \int_\Lambda f\left(\left(T_n - T_m\right)(\lambda') + T_m(\lambda)\right)d\pi_0(\lambda')\\
&= \int_\Lambda f\left(T_{n-m}(\lambda') + T_m(\lambda)\right)d\pi_0(\lambda'). \tag{9.118}
\end{align*}
Proof. For brevity we write $\widetilde{P}_x = P_x\otimes\pi_0$, with corresponding expectation $\widetilde{E}_x$. Put $Y_k = X\left(T_k\right)$, $k \in \mathbb{N}$, and let $f_j\colon E \to \mathbb{R}$, $1 \le j \le n+1$, be bounded Borel measurable functions. In order to show that the process in (9.115) is a Markov chain we have to prove the equality:
\[
\widetilde{E}_x\left[\prod_{j=1}^{n+1} f_j\left(Y_j\right)\right] = \widetilde{E}_x\left[\prod_{j=1}^{n} f_j\left(Y_j\right)\widetilde{E}_{Y_n}\left[f_{n+1}\left(Y_1\right)\right]\right]. \tag{9.119}
\]
Employing Fubini's theorem the right-hand side of (9.119) can be rewritten as
\begin{align*}
&\widetilde{E}_x\left[\prod_{j=1}^{n} f_j\left(Y_j\right)\widetilde{E}_{Y_n}\left[f_{n+1}\left(Y_1\right)\right]\right]\\
&= \int_\Lambda d\pi_0(\lambda)\int_\Lambda d\pi_0(\lambda')\,E_x\left[\prod_{j=1}^{n} f_j\left(X\left(T_j(\lambda)\right)\right)E_{X\left(T_n(\lambda)\right)}\left[f_{n+1}\left(X\left(T_1(\lambda')\right)\right)\right]\right]
\end{align*}
(the process in (8.14) is a Markov process)
\[
= \int_\Lambda d\pi_0(\lambda)\int_\Lambda d\pi_0(\lambda')\,E_x\left[\prod_{j=1}^{n} f_j\left(X\left(T_j(\lambda)\right)\right)f_{n+1}\left(X\left(T_n(\lambda) + T_1(\lambda')\right)\right)\right]
\]
(the variables $T_{n+1} - T_n$ and $T_1$ have the same $\pi_0$-distribution)
\[
= \int_\Lambda d\pi_0(\lambda)\int_\Lambda d\pi_0(\lambda')\,E_x\left[\prod_{j=1}^{n} f_j\left(X\left(T_j(\lambda)\right)\right)f_{n+1}\left(X\left(T_n(\lambda) + T_{n+1}(\lambda') - T_n(\lambda')\right)\right)\right]
\]
(the variables $T_{n+1} - T_n$ and $T_j$, $1 \le j \le n$, are $\pi_0$-independent)
\begin{align*}
&= \int_\Lambda d\pi_0(\lambda)\,E_x\left[\prod_{j=1}^{n} f_j\left(X\left(T_j(\lambda)\right)\right)f_{n+1}\left(X\left(T_n(\lambda) + T_{n+1}(\lambda) - T_n(\lambda)\right)\right)\right]\\
&= \int_\Lambda d\pi_0(\lambda)\,E_x\left[\prod_{j=1}^{n} f_j\left(X\left(T_j(\lambda)\right)\right)f_{n+1}\left(X\left(T_{n+1}(\lambda)\right)\right)\right]\\
&= \widetilde{E}_x\left[\prod_{j=1}^{n+1} f_j\left(Y_j\right)\right]. \tag{9.120}
\end{align*}
The equality in (9.120) proves the Markov chain property of the process in (9.115).
Next we show equality (9.116). To this end we write
\[
P_x\otimes\pi_0\left[X\left(T_1\right) \in B\right] = \int_\Lambda P\left(T_1(\lambda), x, B\right)d\pi_0(\lambda) = \int_0^\infty \alpha_0 e^{-\alpha_0 t}P(t,x,B)\,dt. \tag{9.121}
\]
In the final step in (9.121) we used the exponential distribution of the variable $T_1$ with parameter $\alpha_0 > 0$.
The equalities in (9.120) and (9.121) complete the proof of Lemma 9.22.
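For a finite state space the kernel in (9.116) has a closed matrix form: if the semigroup is generated by a conservative rate matrix $Q$, then $R\left(\alpha_0\right) = \left(\alpha_0 I - Q\right)^{-1}$, so the one-step kernel of the chain $n \mapsto X\left(T_n\right)$ is $\alpha_0\left(\alpha_0 I - Q\right)^{-1}$, with $n$-step law $\left(\alpha_0 R\left(\alpha_0\right)\right)^n$ as in (9.117). A minimal numerical sketch, not from the text, with a hypothetical three-state generator:

```python
import numpy as np

# Hypothetical conservative generator Q on a 3-point state space
# (off-diagonal rates >= 0, rows summing to 0, so P(t, x, E) = 1).
Q = np.array([[-1.0, 0.7, 0.3],
              [ 0.4, -0.9, 0.5],
              [ 0.2, 0.6, -0.8]])
alpha0 = 2.0

# One-step kernel of the chain n -> X(T_n): alpha0 * R(alpha0), where
# R(alpha0) = (alpha0*I - Q)^{-1}, cf. (9.116).
K = alpha0 * np.linalg.inv(alpha0 * np.eye(3) - Q)

# Law of X(T_n) started at x: the n-th power (alpha0 R(alpha0))^n, cf. (9.117).
K3 = np.linalg.matrix_power(K, 3)
```

Because $Q\mathbf{1} = 0$, both $K$ and its powers are row-stochastic, reflecting the conservativeness of the transition function.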
Lemma 9.23. Put
\[
N(t,\lambda) = \sum_{n=0}^\infty n\,\mathbf{1}_{\left[T_n(\lambda), T_{n+1}(\lambda)\right)}(t) = \#\left\{k \ge 1 : T_k(\lambda) \le t\right\} = \max\left\{k \ge 0 : T_k(\lambda) \le t\right\}. \tag{9.122}
\]
Suppose that the variables $T_{k+1} - T_k$, $k \in \mathbb{N}$, are $\pi_0$-independent and identically exponentially distributed random variables with parameter $\alpha_0$, attaining their values in $[0,\infty)$. Then with respect to $\pi_0$ the process $N(t)$, $t \ge 0$, is a Poisson process of intensity $\alpha_0$ and with jump times $T_n$.
Proof. Fix $k \in \mathbb{N}$ and $t > 0$. Then we have:
\begin{align*}
\pi_0\left[N(t) = k\right] &= \pi_0\left[T_k \le t < T_{k+1}\right] = \pi_0\left[0 \le t - T_k < T_{k+1} - T_k\right]\\
&= \int_\Lambda d\pi_0\int_{t-T_k}^\infty \alpha_0 e^{-\alpha_0 s}\,ds\,\mathbf{1}_{\{T_k < t\}}\\
&= \int_\Lambda d\pi_0\, e^{-\alpha_0\left(t - T_k\right)}\mathbf{1}_{\{T_k \le t\}} = \frac{\left(\alpha_0 t\right)^k}{k!}e^{-\alpha_0 t}. \tag{9.123}
\end{align*}
By writing $T_k = \sum_{j=1}^k\left(T_j - T_{j-1}\right)$, and using the independence of the increments $T_j - T_{j-1}$, $1 \le j \le k$, the ultimate equality in (9.123) can be proved by induction with respect to $k$, using the exponential distribution of $T_j - T_{j-1}$.
This completes the proof of Lemma 9.23.
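The identity (9.123) can also be checked by Monte Carlo simulation: sampling the gaps $T_j - T_{j-1}$ as i.i.d. exponentials and counting the jumps up to time $t$ should reproduce the Poisson probabilities. A small sketch, with hypothetical parameters of our choosing:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(42)
alpha0, t, trials, k = 1.5, 2.0, 200_000, 2

# Jump times T_j as cumulative sums of i.i.d. Exp(alpha0) gaps; 30 gaps
# suffice, since P[N(2) >= 30] is negligible for alpha0 * t = 3.
T = np.cumsum(rng.exponential(1.0 / alpha0, size=(trials, 30)), axis=1)
N_t = (T <= t).sum(axis=1)          # N(t) = #{j >= 1 : T_j <= t}, cf. (9.122)

empirical = np.mean(N_t == k)
poisson_pmf = (alpha0 * t) ** k / factorial(k) * exp(-alpha0 * t)  # (9.123)
```

With $2\cdot 10^5$ trials the empirical frequency agrees with the Poisson probability to within a few standard errors.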
Lemma 9.24. Let the process $\left(T_k : k \in \mathbb{N}\right)$ be the process of jump times of a Poisson process
\[
\left\{\left(\Lambda, \mathcal{G}, \pi_n\right)_{n\in\mathbb{N}},\, \left(N(t),\, t \ge 0\right),\, \left(\vartheta_t : t \ge 0\right),\, \mathbb{N}\right\}.
\]
Let the initial measure $\pi_0$ be exponentially distributed with parameter $\alpha_0 > 0$. Let $B$ be a Borel subset of $[0,\infty)$ of Lebesgue measure $\infty$. Then
\[
\pi_0\left[\cap_{n\in\mathbb{N}}\cup_{m\ge n}\left\{T_m \in B\right\}\right] = 1. \tag{9.124}
\]
Proof. Put $B_n = B\cap(n, n+1]$, and $E_n = \cup_{k=1}^\infty\left\{T_k \in B_n\right\}$. Let $\mathcal{H}_t$ be the $\sigma$-field generated by $\left(N(s) : s \le t\right)$. Since the event $\{$there is a jump in $B_n\}$ contains the event $\{$the first jump after $n$ occurs in $B_n\}$, we have
\[
\pi_0\left[E_n \mid \mathcal{H}_n\right] \ge \pi_0\left[n + T_1\circ\vartheta_n \in B_n \mid \mathcal{H}_n\right]
\]
(Markov property of the process $N(t)$)
\[
= \pi_{N(n)}\left[n + T_1 \in B_n\right] = \pi_0\left[T_1 \in B_n - n\right], \tag{9.125}
\]
where in the ultimate equality in (9.125) we used the fact that the distribution of the first jump time of a Poisson process does not depend on the initial position. From (9.125) it follows that
\[
\pi_0\left[E_n \mid \mathcal{H}_n\right] \ge \pi_0\left[T_1 \in B_n - n\right] = \int_{B_n - n}\alpha_0 e^{-\alpha_0 t}\,dt \ge \alpha_0 e^{-\alpha_0}\int_{B_n - n}1\,dt = \alpha_0 e^{-\alpha_0}m\left(B_n\right), \tag{9.126}
\]
where $m\left(B_n\right)$ is the Lebesgue measure of $B_n$. Since the variables $T_k$ are stopping times relative to the process $t \mapsto N(t)$, the events $E_n$ are $\mathcal{H}_{n+1}$-measurable, and hence an application of the generalized Borel-Cantelli theorem yields $\pi_0\left[\cap_n\cup_{m\ge n}E_m\right] = 1$. Since $\cap_n\cup_{m\ge n}E_m \subset \cap_n\cup_{m\ge n}\left\{T_m \in B\right\}$, the equality in (9.124) follows.
This completes the proof of Lemma 9.24.
The following theorem appears as Theorem 1 in Kaspi and Mandelbaum [127].
Theorem 9.25. Let the strong Markov process be as in (8.14) in Theorem 8.8. Suppose that this time-homogeneous Markov process on the Polish space $E$ has a transition probability function $P(t,x,\cdot)$, $t \ge 0$, $x \in E$, which is conservative in the sense that $P(t,x,E) = 1$ for all $t \ge 0$ and $x \in E$. In addition, assume that the process $X(t)$ is strong Feller. Then the following assertions are equivalent:
(a) There exists a non-zero $\sigma$-finite Borel measure $\mu$ such that for all $B \in \mathcal{E}$, $\mu(B) > 0$ implies $P_x\left[\int_0^\infty \mathbf{1}_B(X(t))\,dt = \infty\right] = 1$ for all $x \in E$.
(b) There exists a non-zero $\sigma$-finite Borel measure $\nu$ such that for all $B \in \mathcal{E}$, $\nu(B) > 0$ implies $P_x\left[\tau_B < \infty\right] = 1$ for all $x \in E$.
Here $\tau_B = \inf\left\{t > 0 : X(t) \in B\right\}$ is the first hitting time of $B$. Moreover, $\left\{\tau_B < \infty\right\} = \cup_{t>0}\left\{X(t) \in B\right\}$. The measure $\mu$ in assertion (a) could be called a Harris recurrence measure, and the measure $\nu$ in assertion (b) could be called a recurrence measure. In the proof of Theorem 9.25 we need Lemma 9.29 below.
Remark 9.26. In (9.145) below we will see that the measure
\[
\mu(B) = \int_E E_x\left[\int_0^\infty e^{-t}\mathbf{1}_B(X(t))\,dt\right]d\nu(x) \tag{9.127}
\]
conforms to assertion (a), provided $\nu$ conforms to (b). If $\nu$ is given by $\nu(B) = P\left(t_0, x_0, B\right)$, $B \in \mathcal{E}$, for some fixed $\left(t_0, x_0\right) \in (0,\infty)\times E$, then $\mu$ is given by $\mu(B) = e^{t_0}\int_{t_0}^\infty e^{-s}P\left(s, x_0, B\right)ds$.

Remark 9.27. If all measures $B \mapsto P(s,x,B)$, $B \in \mathcal{E}$, $(s,x) \in (0,\infty)\times E$, are equivalent, and if any (all) of these measures serves as a recurrence measure, then for $\nu$ we may also choose one of these transition probabilities. Fix $\left(t_0, x_0\right) \in (0,\infty)\times E$. In fact, if all such measures are equivalent, and $\nu(B) = P\left(t_0, x_0, B\right)$, $B \in \mathcal{E}$, then the measure $\mu$ in (9.127) is given by
\begin{align*}
\mu(B) &= \int_0^\infty e^{-t}\int_E E_y\left[\mathbf{1}_B(X(t))\right]P\left(t_0, x_0, dy\right)dt\\
&= \int_0^\infty e^{-t}\,E_{x_0}\left[E_{X\left(t_0\right)}\left[\mathbf{1}_B(X(t))\right]\right]dt
\end{align*}
(Markov property)
\begin{align*}
&= \int_0^\infty e^{-t}\,E_{x_0}\left[\mathbf{1}_B\left(X\left(t + t_0\right)\right)\right]dt\\
&= e^{t_0}\int_{t_0}^\infty e^{-t}\,E_{x_0}\left[\mathbf{1}_B\left(X(t)\right)\right]dt. \tag{9.128}
\end{align*}
From (9.128) it easily follows that $\mu$ is also equivalent to the measure $B \mapsto P\left(t_0, x_0, B\right)$, $B \in \mathcal{E}$. Indeed, let $B \in \mathcal{E}$ be such that $\mu(B) = 0$. Then there exists $(t,x) \in (0,\infty)\times E$ such that $P(t,x,B) = 0$. By equivalence we see $P\left(t_0, x_0, B\right) = 0$.
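The last step of (9.128) is the substitution $s = t + t_0$; the resulting identity $\int_0^\infty e^{-t}g\left(t + t_0\right)dt = e^{t_0}\int_{t_0}^\infty e^{-s}g(s)\,ds$ holds for any bounded measurable $g$, here played by $t \mapsto E_{x_0}\left[\mathbf{1}_B(X(t))\right]$. A quick numerical check, with a hypothetical bounded stand-in for $g$:

```python
import numpy as np

t0 = 0.7
g = lambda t: np.cos(t) ** 2   # bounded stand-in for t -> E_x0[1_B(X(t))]

# Riemann sums on a fine grid; e^{-60} makes the truncation error negligible.
s = np.linspace(0.0, 60.0, 600_001)
h = s[1] - s[0]
lhs = np.sum(np.exp(-s) * g(s + t0)) * h
rhs = np.exp(t0) * np.sum(np.where(s >= t0, np.exp(-s) * g(s), 0.0)) * h
```

Both sums approximate the same integral, so they agree up to the discretization error of the grid.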

Let $\alpha \ge 0$. We also have a need for $\alpha$-excessive functions.
Definition 9.28. A non-negative function $f\colon E \to [0,\infty)$ is called $\alpha$-excessive if $t \mapsto E_x\left[e^{-\alpha t}f(X(t))\right]$ increases to $f(x)$ for all $x \in E$ as $t \downarrow 0$. If $\alpha = 0$, then $f$ is called excessive.
Let $f\colon E \to [0,\infty)$ be an $\alpha$-excessive function, and let $0 \le t_1 < t_2 < \infty$. The (in)equalities
\begin{align*}
&E_x\left[e^{-\alpha t_2}f\left(X\left(t_2\right)\right) \,\middle|\, \mathcal{F}_{t_1}\right] - e^{-\alpha t_1}f\left(X\left(t_1\right)\right)\\
&\quad= e^{-\alpha t_2}E_{X\left(t_1\right)}\left[f\left(X\left(t_2 - t_1\right)\right)\right] - e^{-\alpha t_1}f\left(X\left(t_1\right)\right)\\
&\quad= e^{-\alpha t_1}\left(e^{-\alpha\left(t_2 - t_1\right)}E_{X\left(t_1\right)}\left[f\left(X\left(t_2 - t_1\right)\right)\right] - f\left(X\left(t_1\right)\right)\right) \le 0 \tag{9.129}
\end{align*}
show that the process $t \mapsto e^{-\alpha t}f(X(t))$ is a $P_x$-super-martingale relative to the filtration $\left(\mathcal{F}_t\right)_{t\ge 0}$. From the (super-)martingale convergence theorem it then follows that $\lim_{t\to\infty}e^{-\alpha t}f(X(t))$ exists $P_x$-almost surely for all $x \in E$. Let $\tau\colon \Omega \to [0,\infty]$ be any stopping time such that $P_x[\tau < \infty] = 1$. Then it also follows that
\begin{align*}
E_x\left[\lim_{t\to\infty}e^{-\alpha t}f(X(t))\right] &\le \lim_{t\to\infty}E_x\left[e^{-\alpha t}f(X(t))\right]\\
&= \lim_{t\to\infty}E_x\left[e^{-\alpha t}f(X(t)),\, \tau \le t\right]\\
&= \lim_{t\to\infty}E_x\left[E_x\left[e^{-\alpha t}f(X(t)) \,\middle|\, \mathcal{F}_{\tau\wedge t}\right],\, \tau \le t\right]
\end{align*}
(Doob's optional sampling theorem for super-martingales)
\begin{align*}
&\le \lim_{t\to\infty}E_x\left[e^{-\alpha(\tau\wedge t)}f(X(\tau\wedge t)),\, \tau \le t\right]\\
&= E_x\left[e^{-\alpha\tau}f(X(\tau)),\, \tau < \infty\right]. \tag{9.130}
\end{align*}
If $\tau_A$ denotes the first hitting time of $A \in \mathcal{E}$, and $\alpha > 0$, then the function $x \mapsto E_x\left[e^{-\alpha\tau_A}\right]$ is $\alpha$-excessive, and the function $x \mapsto P_x\left[\tau_A < \infty\right]$ is excessive. These assertions follow from the Markov property, and the fact that $t + \tau_A\circ\vartheta_t$ decreases to $\tau_A$ when $t \downarrow 0$. Recall that $\tau_A = \inf\left\{s > 0 : X(s) \in A\right\}$.
Lemma 9.29. Let $\nu$ be a recurrence measure for the Markov process, and let $L \ge 0$ be an increasing right-continuous additive process on $\Omega$ such that $L(0+) = L(0) = 0$. Then either $L(\infty) := \lim_{t\to\infty}L(t) = \infty$ $P_x$-almost surely for all $x \in E$, or $L(\infty) = 0$ $P_\nu$-almost surely. These assertions are mutually exclusive.
The defining property of an adapted additive process $t \mapsto L(t)$ is the equality $L(s) + L(t)\circ\vartheta_s = L(s+t)$, which should hold $P_x$-almost surely for all $x \in E$ and for all $s, t \ge 0$. For more details on the notion of additive processes see Definition 8.25. For our purpose the relevant additive processes are given by $L(t) = \int_0^t \mathbf{1}_B(X(s))\,ds$ with $B \in \mathcal{E}$. Let $t \mapsto L(t)$ be an increasing positive additive process, and fix $\varepsilon > 0$. Suppose that $L(0+) = 0$, and define the stopping time $\tau_\varepsilon$ by
\[
\tau_\varepsilon = \inf\left\{t > 0 : L(t) > \varepsilon\right\}. \tag{9.131}
\]
Then the function $x \mapsto P_x\left[\tau_\varepsilon < \infty\right]$ is excessive. This can be seen as follows. First observe that
\begin{align*}
t + \tau_\varepsilon\circ\vartheta_t &= \inf\left\{t + s : s > 0,\ L(s)\circ\vartheta_t > \varepsilon\right\}\\
&= \inf\left\{t + s : s > 0,\ L(t) + L(s)\circ\vartheta_t > \varepsilon + L(t)\right\}\\
&= \inf\left\{t + s : s > 0,\ L(t+s) > \varepsilon + L(t)\right\}\\
&= \inf\left\{s > t : L(s) > \varepsilon + L(t)\right\}, \tag{9.132}
\end{align*}
which decreases to
\[
\inf\left\{s > 0 : L(s) > \varepsilon + L(0+)\right\} = \inf\left\{s > 0 : L(s) > \varepsilon\right\} = \tau_\varepsilon \tag{9.133}
\]
when $t \downarrow 0$. From (9.132) together with (9.133) and the Markov property it follows that
\[
E_x\left[P_{X(t)}\left[\tau_\varepsilon < \infty\right]\right] = P_x\left[t + \tau_\varepsilon\circ\vartheta_t < \infty\right] \uparrow P_x\left[\tau_\varepsilon < \infty\right] \tag{9.134}
\]
when $t \downarrow 0$. From (9.134) we see that the function $x \mapsto P_x\left[\tau_\varepsilon < \infty\right]$ is excessive.
Proof (Proof of Lemma 9.29). Fix $\varepsilon > 0$ and define $\tau_\varepsilon$ as in (9.131). By the right-continuity of the process $s \mapsto L(s)$ we obtain
\begin{align*}
\lim_{t\downarrow s}E_x\left[P_{X(t)}\left[\tau_\varepsilon < \infty\right]\right] &= \lim_{t\downarrow s}P_x\left[t + \tau_\varepsilon\circ\vartheta_t < \infty\right]\\
&= P_x\left[s + \tau_\varepsilon\circ\vartheta_s < \infty\right] = E_x\left[P_{X(s)}\left[\tau_\varepsilon < \infty\right]\right]. \tag{9.135}
\end{align*}
From (9.135) it follows that we may, and shall, assume that the super-martingale $t \mapsto P_{X(t)}\left[\tau_\varepsilon < \infty\right]$ is right-continuous. We have the $P_x$-almost sure equality of events:
\[
\left\{t < \tau_\varepsilon < \infty\right\} = \left\{\tau_\varepsilon > t,\ \tau_\varepsilon\circ\vartheta_t < \infty\right\}. \tag{9.136}
\]
Conditioning (9.136) on $\mathcal{F}_t$ and employing the Markov property yields:
\[
P_x\left[t < \tau_\varepsilon < \infty \mid \mathcal{F}_t\right] = \mathbf{1}_{\{\tau_\varepsilon > t\}}P_x\left[\tau_\varepsilon\circ\vartheta_t < \infty \mid \mathcal{F}_t\right] = \mathbf{1}_{\{\tau_\varepsilon > t\}}P_{X(t)}\left[\tau_\varepsilon < \infty\right]. \tag{9.137}
\]
Next we let $t \uparrow \infty$ in (9.137) to obtain:
\[
0 = \mathbf{1}_{\{\tau_\varepsilon = \infty\}}\lim_{t\to\infty}P_{X(t)}\left[\tau_\varepsilon < \infty\right], \quad P_x\text{-almost surely}. \tag{9.138}
\]
Consider the sets
\[
F_\varepsilon = \left\{x \in E : P_x\left[\tau_\varepsilon < \infty\right] = 0\right\}, \quad\text{and}\quad G_{\varepsilon,\delta} = \left\{x \in E : P_x\left[\tau_\varepsilon < \infty\right] > \delta\right\},\ \delta > 0.
\]
First assume $\nu\left(E\setminus F_\varepsilon\right) > 0$. Then $\nu\left(G_{\varepsilon,\delta}\right) > 0$ for some $\delta > 0$. Since $\nu$ is a recurrence measure, it follows that $\limsup_{t\to\infty}\mathbf{1}_{G_{\varepsilon,\delta}}(X(t)) = 1$ $P_x$-almost surely, and hence $\lim_{t\to\infty}P_{X(t)}\left[\tau_\varepsilon < \infty\right] \ge \delta$ $P_x$-almost surely. Thus (9.138) implies $P_x\left[\tau_\varepsilon = \infty\right] = 0$, which is equivalent to $P_x\left[\tau_\varepsilon < \infty\right] = 1$. Consequently,
\[
\nu\left(E\setminus F_\varepsilon\right) > 0 \implies \tau_\varepsilon < \infty\ P_x\text{-almost surely for all } x \in E. \tag{9.139}
\]
Next assume that $\nu\left(F_\varepsilon\right) > 0$. Let $\tau_{F_\varepsilon}$ be the (first) hitting time of $F_\varepsilon$: $\tau_{F_\varepsilon} = \inf\left\{s > 0 : X(s) \in F_\varepsilon\right\}$. Then, since $\nu$ is a recurrence measure, we have $P_x\left[\tau_{F_\varepsilon} < \infty\right] = 1$. From (9.130) with $\tau = \tau_{F_\varepsilon}$, $\alpha = 0$, and $f(x) = P_x\left[\tau_\varepsilon < \infty\right]$ we see that $\lim_{t\to\infty}P_{X(t)}\left[\tau_\varepsilon < \infty\right] = 0$ $P_x$-almost surely for all $x \in E$, and so $\tau_\varepsilon = \infty$ $P_x$-almost surely for all $x \in E$. We repeat the latter conclusion:
\[
\nu\left(F_\varepsilon\right) > 0 \implies \tau_\varepsilon = \infty\ P_x\text{-almost surely for all } x \in E. \tag{9.140}
\]
There are two mutually exclusive possibilities:
(i) either there exists $\varepsilon > 0$ such that $\nu\left(E\setminus F_\varepsilon\right) > 0$, and (9.139) holds for some $\varepsilon > 0$;
(ii) or $\nu\left(F_\varepsilon\right) > 0$ for every $\varepsilon > 0$, and (9.140) holds for all $\varepsilon > 0$.
If (9.139) holds for some $\varepsilon > 0$, then for such $\varepsilon > 0$ the equality $P_x\left[\tau_\varepsilon < \infty\right] = 1$ holds for all $x \in E$. Then we proceed as follows. By induction we introduce the following sequence of stopping times:
\[
\eta_0 = 0, \quad \eta_1 = \tau_\varepsilon, \quad\text{and for } n \ge 1\colon\quad \eta_n = \eta_{n-1} + \tau_\varepsilon\circ\vartheta_{\eta_{n-1}} = \inf\left\{s > \eta_{n-1} : L(s) > \varepsilon + L\left(\eta_{n-1}\right)\right\}. \tag{9.141}
\]
From (9.141) it follows that $\left\{\eta_n < \infty\right\} \subset \left\{L(\infty) > n\varepsilon\right\}$, and hence for all $n \ge 1$ and $x \in E$ we have
\[
P_x\left[\eta_n < \infty\right] \le P_x\left[L(\infty) > n\varepsilon\right]. \tag{9.142}
\]
In addition, by the strong Markov property we have
\[
P_x\left[\eta_n < \infty\right] = P_x\left[\eta_{n-1} < \infty,\ \tau_\varepsilon\circ\vartheta_{\eta_{n-1}} < \infty\right] = E_x\left[P_{X\left(\eta_{n-1}\right)}\left[\tau_\varepsilon < \infty\right],\ \eta_{n-1} < \infty\right]. \tag{9.143}
\]
Since $P_y\left[\tau_\varepsilon < \infty\right] = 1$ for all $y \in E$, by induction with respect to $n \in \mathbb{N}$ (9.143) yields $P_x\left[\eta_n < \infty\right] = 1$ for all $x \in E$ and all $n \in \mathbb{N}$. From (9.142) we then infer $P_x\left[L(\infty) = \infty\right] = 1$ for all $x \in E$. This is the first alternative in Lemma 9.29. If, on the other hand, (9.140) holds for all $\varepsilon > 0$, then we have
\[
P_\nu\left[L(\infty) = 0\right] = \lim_{\varepsilon\downarrow 0}P_\nu\left[L(\infty) < \varepsilon\right] = \lim_{\varepsilon\downarrow 0}P_\nu\left[\tau_\varepsilon = \infty\right] = 1. \tag{9.144}
\]
The equality (9.144) yields the second alternative of Lemma 9.29.
Altogether this completes the proof of Lemma 9.29.
Now we are ready to prove Theorem 9.25.
Proof (Proof of Theorem 9.25). The implication (a) $\implies$ (b) follows with $\nu = \mu$. Let $\nu$ be such that assertion (b) holds with the measure $\nu$. Then we will prove that (a) holds with
\[
\mu(B) = E_\nu\left[\int_0^\infty e^{-t}\mathbf{1}_B(X(t))\,dt\right] = \int_E E_x\left[\int_0^\infty e^{-t}\mathbf{1}_B(X(t))\,dt\right]d\nu(x). \tag{9.145}
\]
Let $B \in \mathcal{E}$ be such that $\mu(B) > 0$, where $\mu$ is as in (9.145). Put $L(t) = \int_0^t \mathbf{1}_B(X(s))\,ds$. Then $L(\infty) = 0$ cannot hold $P_\nu$-almost surely. From Lemma 9.29 it follows that $L(\infty) = \infty$ $P_x$-almost surely for all $x \in E$.
This completes the proof of Theorem 9.25.
The following theorem is the Markov chain analogue of Theorem 9.25. Its proof can be adapted from the proof of Theorem 9.25, using Lemma 9.29 with $L(k) = \sum_{j=1}^k \mathbf{1}_{\{X(j)\in B\}}$, where $B \in \mathcal{E}$. The time $\tau_\varepsilon$ is replaced by $\tau_1 = \inf\left\{k \ge 1 : L(k) \ge 1\right\}$. The equalities in (9.145) are replaced with e.g.
\[
\mu(B) = (1-r)\sum_{k=1}^\infty r^{k-1}\,E_\nu\left[\mathbf{1}_B(X(k))\right] = (1-r)\sum_{k=1}^\infty r^{k-1}\int_E E_x\left[\mathbf{1}_B(X(k))\right]d\nu(x), \tag{9.146}
\]

for some $0 < r < 1$. A version of the following theorem was first proved by Meyn and Tweedie in [161], Theorem 1.1. In fact Theorem 9.31 is a consequence of Proposition 9.1.1 in Meyn and Tweedie [162], which reads as follows.
Proposition 9.30. Suppose that some subset $B \in \mathcal{E}$ has the following property: for every $x \in B$ the equality $P_x\left[\tau_B^1 < \infty\right] = 1$ holds. Then
\[
P_x\left[\sum_{k=1}^\infty \mathbf{1}_B(X(k)) = \infty\right] = P_x\left[\tau_B^1 < \infty\right], \quad\text{for all } x \in E. \tag{9.147}
\]
Proof. Put $\tau_B^0 = \inf\left\{n \ge 0 : X(n) \in B\right\}$,
\[
\tau_B^1 = 1 + \tau_B^0\circ\vartheta_1 = \inf\left\{n \ge 1 : X(n) \in B\right\}, \quad\text{and}\quad \tau_B^k = \inf\left\{n > \tau_B^{k-1} : X(n) \in B\right\},\ k \ge 2.
\]
Then $\tau_B^k = \tau_B^{k-1} + \tau_B^1\circ\vartheta_{\tau_B^{k-1}}$, and hence by the strong Markov property and our assumption on $\tau_B^1$ we have
\begin{align*}
P_x\left[\sum_{\ell=1}^\infty \mathbf{1}_B(X(\ell)) \ge k+1\right] &= P_x\left[\tau_B^{k+1} < \infty\right] = P_x\left[\tau_B^1\circ\vartheta_{\tau_B^k} < \infty,\ \tau_B^k < \infty\right]\\
&= E_x\left[P_{X\left(\tau_B^k\right)}\left[\tau_B^1 < \infty\right],\ \tau_B^k < \infty\right]. \tag{9.148}
\end{align*}
Assuming that $P_y\left[\tau_B^1 < \infty\right] = 1$, $y \in B$, (9.148) implies
\[
P_x\left[\sum_{\ell=1}^\infty \mathbf{1}_B(X(\ell)) \ge k+1\right] = P_x\left[\tau_B^{k+1} < \infty\right] = P_x\left[\tau_B^k < \infty\right]. \tag{9.149}
\]
By induction with respect to $k$ we see that (9.149) implies
\[
P_x\left[\sum_{\ell=1}^\infty \mathbf{1}_B(X(\ell)) \ge k\right] = P_x\left[\tau_B^1 < \infty\right], \quad\text{for all } x \in E. \tag{9.150}
\]
Let $k$ tend to $\infty$ to obtain (9.147) from (9.150), which completes the proof of Proposition 9.30.
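For a finite-state chain the quantity $P_x\left[\tau_B^1 < \infty\right]$ appearing in (9.147) can be computed by first-step analysis: $h(x) = \sum_y P(x,y)\left(\mathbf{1}_B(y) + \mathbf{1}_{B^c}(y)h(y)\right)$, a linear system on $B^c$. The sketch below (with a hypothetical transition matrix of our choosing, not from the text) illustrates that for an irreducible chain these hitting probabilities are identically $1$, the situation in which both sides of (9.147) equal $1$.

```python
import numpy as np

def hitting_prob(P, B):
    """h(x) = P_x[tau_B^1 < infty] for a finite-state chain, by first-step
    analysis: on B^c, h solves h = P[Bc,B] 1 + P[Bc,Bc] h."""
    n = P.shape[0]
    B = sorted(B)
    Bc = [i for i in range(n) if i not in B]
    g = np.linalg.solve(np.eye(len(Bc)) - P[np.ix_(Bc, Bc)],
                        P[np.ix_(Bc, B)].sum(axis=1))
    target = np.zeros(n)
    target[B] = 1.0
    target[Bc] = g
    # One step from any x, including x in B (tau_B^1 only counts times k >= 1).
    return P @ target

P = np.array([[0.1, 0.6, 0.3],
              [0.5, 0.2, 0.3],
              [0.3, 0.3, 0.4]])   # irreducible, hypothetical
h = hitting_prob(P, B={0})
```

Because the chain is irreducible with a finite state space, every state reaches $B$ with probability one, so `h` is the all-ones vector.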
Theorem 9.31. Let
\[
\left\{\left(\Omega, \mathcal{F}, P_x\right)_{x\in E},\, \left(X(k),\, k \in \mathbb{N}\right),\, \left(\vartheta_k,\, k \in \mathbb{N}\right),\, (E,\mathcal{E})\right\} \tag{9.151}
\]
be a Markov chain with probability transition function $P(x,B)$ which is conservative in the sense that $P(x,E) = 1$ for all $x \in E$. Then the following assertions are equivalent:
(a) There exists a non-zero $\sigma$-finite Borel measure $\mu$ on $E$ such that for all $B \in \mathcal{E}$, $\mu(B) > 0$ implies $P_x\left[\sum_{k=1}^\infty \mathbf{1}_B(X(k)) = \infty\right] = 1$ for all $x \in E$.
(b) There exists a non-zero $\sigma$-finite Borel measure $\nu$ such that for all $B \in \mathcal{E}$, $\nu(B) > 0$ implies $P_x\left[\tau_B^1 < \infty\right] = 1$ for all $x \in E$.
Here $\tau_B^1 = \inf\left\{k \ge 1 : X(k) \in B\right\}$.
Proof. Again the implication (a) $\implies$ (b) is evident with $\nu = \mu$. Fix $0 < r < 1$. Repeating the arguments in the proof of Theorem 9.25, the reverse implication can be proved with $\mu$ given by e.g.
\[
\mu(B) = (1-r)\sum_{k=1}^\infty r^{k-1}\int_E P_x\left[X(k) \in B\right]d\nu(x), \quad B \in \mathcal{E}, \tag{9.152}
\]
provided that $\nu$ is a measure which accommodates assertion (b). However, using Proposition 9.30 we see that assertion (a) follows from (b) with $\mu = \nu$, where $\nu$ conforms to assertion (b).
This completes the proof of Theorem 9.31.
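For a finite-state chain the measure in (9.152) can be computed in closed form, since $\sum_{k\ge 1} r^{k-1}\nu P^k = \nu P\left(I - rP\right)^{-1}$. The sketch below (hypothetical $P$, $\nu$ and $r$, not taken from the text) checks the closed form against partial sums of the series and verifies that $\mu$ is again a probability measure when $\nu$ is.

```python
import numpy as np

# Hypothetical 3-state transition matrix P and initial law nu.
P = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
nu = np.array([1.0, 0.0, 0.0])
r = 0.5

# mu(B) = (1 - r) sum_{k>=1} r^{k-1} (nu P^k)(B), cf. (9.152); summing the
# geometric series gives the closed form (1 - r) * nu P (I - r P)^{-1}.
mu = (1 - r) * nu @ P @ np.linalg.inv(np.eye(3) - r * P)

# Partial sums of the series converge to the same vector.
mu_series = (1 - r) * sum(r ** (k - 1) * nu @ np.linalg.matrix_power(P, k)
                          for k in range(1, 60))
```

Since each $\nu P^k$ has total mass $1$ and the geometric weights $(1-r)r^{k-1}$ sum to $1$, the total mass of $\mu$ is $1$ as well.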
Remark 9.32. Suppose that all probability measures $B \mapsto P_x\left[X(1) \in B\right] = P(x,B)$, $x \in E$, are equivalent, and that (b) is satisfied with $B \mapsto P\left(x_0, B\right)$. Then (a) holds with the same measure. To see this, consider $\nu(B) = P_{x_0}\left[X(1) \in B\right] = P\left(x_0, B\right)$. Then by the Markov property $\mu$ in (9.152) is given by
\begin{align*}
\mu(B) &= (1-r)\sum_{k=1}^\infty r^{k-1}\int_E P_x\left[X(k) \in B\right]d\nu(x)\\
&= (1-r)\sum_{k=1}^\infty r^{k-1}\,E_{x_0}\left[P_{X(1)}\left[X(k) \in B\right]\right]\\
&= (1-r)\sum_{k=1}^\infty r^{k-1}\,P_{x_0}\left[X(k+1) \in B\right], \tag{9.153}
\end{align*}
and hence, if $\mu(B) = 0$, then $E_{x_0}\left[P_{X(1)}\left[X(1) \in B\right]\right] = P_{x_0}\left[X(2) \in B\right] = 0$. Thus we see that $P_{X(1)}\left[X(1) \in B\right] = 0$, $P_{x_0}$-almost surely. Therefore there exists at least one $x \in E$ such that $P(x,B) = P_x\left[X(1) \in B\right] = 0$. Since, by assumption, all measures $B \mapsto P(y,B)$, $y \in E$, are equivalent, it follows that $P\left(x_0, B\right) = 0$.
The following theorem reduces (Harris) recurrence problems for time-continuous Markov processes with the Feller property and sample space $\Omega$ to Markov chains on a larger sample space $\Omega\times\Lambda$, where the continuous time is replaced with the jump-time process
\[
\left\{\left(\Lambda, \mathcal{G}, \pi_0\right),\, \left(T_n,\, n \in \mathbb{N}\right),\, \left(\vartheta^P_n,\, n \in \mathbb{N}\right)\right\}
\]
of a Poisson process. Here the variable $T_n$ has $\pi_0$-distribution function
\[
t \mapsto \pi_0\left[T_n \le t\right] = \pi_0\left[N(t) \ge n\right] = \sum_{k=n}^\infty \frac{\left(\alpha_0 t\right)^k}{k!}e^{-\alpha_0 t}.
\]

Theorem 9.33. Let

{(Ω, F, Px), (X(t), t ≥ 0), (ϑt, t ≥ 0), (E, E)}   (9.154)

be a Markov process with the strong Feller property and with a conservative probability transition function P(t, x, B), (t, x, B) ∈ [0, ∞) × E × E. Suppose that all Borel measures B 7→ P(t, x, B), (t, x) ∈ (0, ∞) × E, are equivalent, i.e. have the same negligible sets. Let the Markov chain

{(Ω × Λ, F ⊗ G, Px ⊗ π0), (X(Tn(λ), ω), n ∈ N), (ϑ_n^P(λ), n ∈ N), (E, E)}   (9.155)
be as in (9.115) of Lemma 9.22. Then the following assertions are equivalent:
(a) The Markov process in (9.154) is Harris recurrent in the sense that for any Borel subset B for which P(t0, x0, B) > 0 for some (t0, x0) ∈ (0, ∞) × E the equality

Px [∫_0^∞ 1_{X(t)∈B} dt = ∞] = 1   (9.156)

holds for all x ∈ E.
(b) The Markov process in (9.154) is recurrent in the sense that for any Borel subset B for which P(t0, x0, B) > 0 for some (t0, x0) ∈ (0, ∞) × E the equality

Px [τB < ∞] = 1   (9.157)

holds for all x ∈ E.
(c) The Markov chain in (9.155) is Harris recurrent in the sense that for any Borel subset B for which P(t0, x0, B) > 0 for some (t0, x0) ∈ (0, ∞) × E the equality

Px ⊗ π0 [Σ_{k=0}^∞ 1_{X(Tk)∈B} = ∞] = 1   (9.158)

holds for all x ∈ E.
(d) The Markov chain in (9.155) is recurrent in the sense that for any Borel subset B for which P(t0, x0, B) > 0 for some (t0, x0) ∈ (0, ∞) × E the equality

Px ⊗ π0 [τ_B^1 < ∞] = 1   (9.159)

holds for all x ∈ E.
Here τB = inf {t > 0 : X(t) ∈ B}, and τ_B^1 = inf {n ≥ 1 : X(Tn) ∈ B}.

Proof. First we observe that the measures B 7→ Px ⊗ π0 [X(Tn) ∈ B], n ∈ N, n ≥ 1, and x ∈ E, are equivalent to the measures B 7→ P(t, x, B) = Px [X(t) ∈ B], (t, x) ∈ (0, ∞) × E. The reason for this equivalence is the following equality:
504 9 Miscellaneous topics
Px ⊗ π0 [X(Tn) ∈ B] = ∫_0^∞ α0 (α0 t)^{n−1} / (n − 1)! · e^{−α0 t} P(t, x, B) dt   (9.160)

which can be found in (9.117). Now we are ready to prove Theorem 9.33.
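For the reader's convenience we recall where (9.160) comes from; the following is a sketch (the variable Tn is the n-th jump time of a Poisson process with intensity α0, so it has a Gamma(n, α0) density):

```latex
% T_n has density \alpha_0\frac{(\alpha_0 t)^{n-1}}{(n-1)!}\,e^{-\alpha_0 t} on (0,\infty), whence
P_x\otimes\pi_0\left[X(T_n)\in B\right]
  = \int_0^\infty \alpha_0\frac{(\alpha_0 t)^{n-1}}{(n-1)!}\,e^{-\alpha_0 t}\,P(t,x,B)\,dt .
% For n=1 the right-hand side is \alpha_0 R(\alpha_0)\mathbf 1_B(x).  Writing the
% Gamma(n,\alpha_0) density as an (n-1)-fold convolution of exponential densities
% and using the Chapman--Kolmogorov identity one obtains, by induction on n,
P_x\otimes\pi_0\left[X(T_n)\in B\right]
  = \bigl(\alpha_0 R(\alpha_0)\bigr)^{n}\mathbf 1_B(x),
% which is the identity (9.117) used repeatedly in the proof of Theorem 9.33.
```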
(a) ⇐⇒ (b). This equivalence is a consequence of Theorem 9.25 with µ(B) = ν(B) = P(t0, x0, B), B ∈ E.
(c) ⇐⇒ (d). This equivalence is a consequence of Theorem 9.31 with ν(B) = Px ⊗ π0 [X(T1) ∈ B], and µ = ν, or

µ(B) = (1 − r) Σ_{k=1}^∞ r^{k−1} Px ⊗ π0 [X(T_{k+1}) ∈ B]
     = (1 − r) Σ_{k=1}^∞ r^{k−1} (α0 R(α0))^{k+1} 1_B(x).   (9.161)

For this result the reader is referred to the equalities (9.117) and (9.153), and to Theorem 9.31. Since the measures ν and µ in (9.161) are equivalent to the measure B 7→ P(t0, x0, B), assertions (c) and (d) are equivalent, with the measure B 7→ P(t0, x0, B) as reference measure.
(d) =⇒ (b). From the definitions of the stopping times τB and τ_B^1 we obtain the following Px0 ⊗ π0-almost sure inclusion of events:

{τB < ∞} ⊃ {T_{τ_B^1} < ∞} = {τ_B^1 < ∞},   (9.162)

and hence

Px0 [τB < ∞] = Px0 ⊗ π0 [τB < ∞] ≥ Px0 ⊗ π0 [τ_B^1 < ∞] = 1.   (9.163)

Assertion (b) is a consequence of (9.163).
(a) =⇒ (c). Let A ∈ E be such that (α0 R(α0))^n 1_A(x0) > 0 for some n ∈ N, n ≥ 1, which by assumption is equivalent to Px0 [X(t0) ∈ A] = P(t0, x0, A) > 0. Let ω ∈ Ω and put Bω = {t ≥ 0 : X(t, ω) ∈ A}. By assumption (a) we know that Px [∫_0^∞ 1_A(X(t)) dt = ∞] = 1 for all x ∈ E. Hence it follows that the Lebesgue measure of Bω is ∞ for Px-almost all ω ∈ Ω and for all x ∈ E. An application of equality (9.124) in Lemma 9.24 in the penultimate equality in (9.164) below yields:

Px ⊗ π0 [∩_n ∪_{m≥n} {X(Tm) ∈ A}]
= ∫_Ω dPx(ω) ∫_Λ dπ0(λ) lim sup_{n→∞} 1_{X(Tn(λ))∈A}(ω)
= ∫_Ω dPx(ω) ∫_Λ dπ0(λ) lim sup_{n→∞} 1_{Tn∈Bω}(λ)
= ∫_Ω dPx(ω) ∫_Λ dπ0(λ) · 1 = 1.   (9.164)

From (9.164) assertion (c) follows.
This completes the proof of Theorem 9.33.
Lemma 9.34. Let the notation and hypotheses be as in Theorem 9.25. Sup-
pose that all measures B 7→ P (t, x, B), (t, x) ∈ (0, ∞) × E are equivalent, and
that B is recurrent whenever P (t, x, B) > 0 for some pair (t, x) ∈ (0, ∞) × E.
Then all Borel subsets B for which P (t, x, B) > 0 for some pair (t, x) ∈
(0, ∞) × E are recurrent for the chain described in (9.115) of Lemma 9.22.
Lemma 9.35. Let (e^{tL} : t ≥ 0) be the semigroup associated to the Markov process in (8.14). Put for α > 0 and f ∈ Cb(E)

R(α)f(x) = ∫_0^∞ e^{−αt} e^{tL} f(x) dt = ∫_0^∞ e^{−αt} Ex [f(X(t))] dt,   (9.165)

and fix α0 > 0. Then a σ-finite Radon measure is invariant for the operator L if and only if it is invariant for the semigroup (e^{tL} : t ≥ 0), if and only if it is invariant for the bounded operator α0 R(α0).
Proof. Let the positive σ-finite Radon measure µ be such that ∫ Lf dµ = 0 for f ∈ D(L) ∩ L1(E, µ). Then we have for f ∈ L1(E, µ)

∫ α0 R(α0) f dµ − ∫ f dµ = ∫ L R(α0) f dµ = 0.   (9.166)

From (9.166) we infer ∫ α0 R(α0) f dµ = ∫ f dµ, f ∈ L1(E, µ). By the resolvent equation, it then follows that

∫ R(α)f dµ = ∫ R(α0) f dµ + (α0 − α) ∫ R(α0) R(α)f dµ
           = (1/α0) ∫ f dµ + ((α0 − α)/α0) ∫ R(α)f dµ.   (9.167)

From the equality in (9.167) we see that α ∫ R(α)f dµ = ∫ f dµ, f ∈ L1(E, µ), and α > 0. Since

e^{tL} f = Tβ-lim_{α→∞} e^{−αt} e^{αt(αR(α))} f,  f ∈ Cb(E),

it follows that ∫ e^{tL} f dµ = ∫ f dµ, f ∈ L1(E, µ) ∩ Cb(E).
Next suppose that ∫ e^{tL} f dµ = ∫ f dµ, f ∈ L1(E, µ) ∩ Cb(E). Then

α ∫ R(α)f dµ = α ∫ ∫_0^∞ e^{−αt} e^{tL} f dt dµ = α ∫_0^∞ e^{−αt} ∫ e^{tL} f dµ dt = ∫ f dµ.   (9.168)

From (9.168) we see that α ∫ R(α)f dµ = ∫ f dµ, α > 0. Then we also have

∫ α (αR(α) − I) f dµ = 0,  f ∈ D(L) ∩ L1(E, µ).   (9.169)

By letting α tend to ∞ in (9.169) we see ∫ Lf dµ = 0, f ∈ D(L) ∩ L1(E, µ).
This completes the proof of Lemma 9.35.
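As a simple illustration of Lemma 9.35 (an example added here for orientation, not taken from the surrounding text), consider the heat semigroup on R^d, where Lebesgue measure plays the role of the invariant σ-finite Radon measure:

```latex
% Heat semigroup: L = \tfrac12\Delta and e^{tL}f(x) = \int_{\mathbb R^d} p_t(x,y)\,f(y)\,dy
% with the Gaussian kernel p_t(x,y) = (2\pi t)^{-d/2}\,e^{-|x-y|^2/(2t)}.
% Invariance of Lebesgue measure for the semigroup:
\int_{\mathbb R^d} e^{tL}f(x)\,dx
  = \int_{\mathbb R^d}\!\int_{\mathbb R^d} p_t(x,y)\,f(y)\,dy\,dx
  = \int_{\mathbb R^d} f(y)\,dy ,
% because \int p_t(x,y)\,dx = 1 by the symmetry p_t(x,y) = p_t(y,x).
% Invariance for the generator: for f \in C_c^\infty(\mathbb R^d),
\int_{\mathbb R^d} Lf(x)\,dx = \tfrac12\int_{\mathbb R^d} \Delta f(x)\,dx = 0
% by integration by parts, in accordance with the lemma.
```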
Moreover, suppose that for every g ∈ Cb([0, ∞) × E) and every compact subset K of E the limit

lim_{λ↓0} λR(λ)HA g(x) exists uniformly for x ∈ K.   (9.170)

Theorem 8.18 gives sufficient conditions in order that the Markov process in (9.154) possesses a compact recurrent subset.
Theorem 9.36. Suppose that there exists a compact recurrent subset A, and suppose that the Markov process in (9.154) is irreducible and strong Feller. In addition, suppose that all measures B 7→ P(t, x, B), x ∈ E, t > 0, are equivalent. Then there exists a non-trivial σ-finite invariant measure π, and the vector sum R(L) + N(L) is dense in Cb(E) for the strict topology. In fact the measure π has the property that f ∈ Cb(E), f ≥ 0, f ≠ 0, implies ∫ f dπ > 0. Moreover, π(B) = 0 if and only if P(t, x, B) = 0 for all pairs (equivalently, for some pair) (t, x) ∈ (0, ∞) × E. Finally, the measure π is unique up to a multiplicative constant.

Remark 9.37. From the proof it follows that for every compact subset K there exists an open subset Kε ⊃ K, and hence a function fK ∈ Cb(E), such that

1_K ≤ fK ≤ 1_{Kε},  and  ∫ fK dπ < ∞,   (9.171)

where π is the invariant measure. It also follows that

E = ∪_{K compact} {fK > 0}.

Since the space E is second countable, the family {fK : K compact} in (9.171) may be chosen countable, while still satisfying E = ∪_{n∈N} {f_{Kn} > 0}. This can be seen as follows. The second countability implies that there exists a sequence of open subsets (Un)_{n∈N} such that for every compact subset K of E there exists a countable subset (U_{K,k})_{k∈N} ⊂ (Un)_{n∈N} such that {fK > 0} = ∪_{k∈N} U_{K,k}. For every n ∈ N we choose a compact subset Kn such that Un ⊂ {f_{Kn} > 0}. We only take into account those open subsets Un for which such a compact subset Kn exists. Then the sequence (f_{Kn})_{n∈N} is such that E = ∪_{n∈N} {f_{Kn} > 0}.

Here, the space Cb (E) is supplied with the strict topology. A sequence (fn )n∈N
converges with respect to the strict topology if it is uniformly bounded and
if it converges to a function f ∈ Cb (E) uniformly on compact subsets of the
space E. The symbol R(L) stands for the range of L, and N (L) stands for
the null space of L.
Proof (Proof of Theorem 9.36.). We sketch a proof. Fix h > 0, λ > 0, µ ∈ M(A), and f ∈ Cb(E). Here M(A) is the space of those (complex) measures µ on E which are concentrated on A; i.e. |µ| (E \ A) = 0. We will also need the following stopping times: τ_A^h, which was defined in (9.73), and τ_A^{1,h}, which was defined in (9.74). The operators Lh(λ), H_A^h(λ), and R_A^h(λ) were defined in (9.75), (9.76), and (9.79), respectively.
Therefore we will rewrite the equality:

∫_0^h e^{−λ′s} e^{sL} ds {(L − λ′I) R_A(λ′) + I} f(x)
= {(e^{−λ′h} e^{hL} − I) R_A(λ′) + ∫_0^h e^{−λ′s} e^{sL} ds} f(x).   (9.172)

The expression in (9.172) is equal to

(e^{−hλ′} e^{hL} − I) R_A(λ′) f(x) + ∫_0^h e^{−λ′s} e^{sL} f(x) ds
= e^{−hλ′} Ex [R_A(λ′) f(X(h))] − R_A(λ′) f(x) + ∫_0^h e^{−λ′s} e^{sL} f(x) ds
= e^{−hλ′} Ex [EX(h) [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]] − Ex [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]
  + Ex [∫_0^h e^{−λ′s} f(X(s)) ds]

(Markov property)

= Ex [∫_0^{τA◦ϑh} e^{−λ′(h+ρ)} f(X(h + ρ)) dρ] − Ex [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]
  + Ex [∫_0^h e^{−λ′s} f(X(s)) ds]
= Ex [∫_h^{h+τA◦ϑh} e^{−λ′ρ} f(X(ρ)) dρ] − Ex [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]
  + Ex [∫_0^h e^{−λ′s} f(X(s)) ds]

(τA is a terminal stopping time: on {τA > h} the equality h + τA ◦ ϑh = τA holds Px-almost surely)

= Ex [∫_h^{h+τA◦ϑh} e^{−λ′ρ} f(X(ρ)) dρ, τA ≤ h]
  + Ex [∫_h^{τA} e^{−λ′ρ} f(X(ρ)) dρ, τA > h] − Ex [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]
  + Ex [∫_0^h e^{−λ′s} f(X(s)) ds, τA ≤ h] + Ex [∫_0^h e^{−λ′s} f(X(s)) ds, τA > h]
= Ex [∫_{τA}^{h+τA◦ϑh} e^{−λ′ρ} f(X(ρ)) dρ, τA ≤ h]
= Ex [e^{−λ′τA} ∫_0^{h−τA+τA◦ϑ_{h−τA}◦ϑ_{τA}} e^{−λ′ρ} f(X(ρ + τA)) dρ, τA ≤ h]

(strong Markov property)

= Ex [e^{−λ′τA} EX(τA) [∫_0^{h−τA+τA◦ϑ_{h−τA}} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h]   (9.173)

= Ex [e^{−λ′h} EX(τA) [∫_0^{τA◦ϑ_{h−τA}} e^{−λ′ρ} f(X(ρ + h − τA)) dρ], τA ≤ h]
  + Ex [e^{−λ′τA} EX(τA) [∫_0^{h−τA} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h]

(Markov property once more)

= Ex [e^{−λ′h} EX(τA) [EX(h−τA) [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]], τA ≤ h]
  + Ex [e^{−λ′τA} EX(τA) [∫_0^{h−τA} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h].   (9.174)

It is perhaps useful to explain the way the expectations in (9.174) have to be understood. The second term should be read as follows:

Ex [e^{−λ′τA} EX(τA) [∫_0^{h−τA} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h]
= Ex [ω 7→ e^{−λ′τA(ω)} EX(τA)(ω) [∫_0^{h−τA(ω)} e^{−λ′ρ} f(X(ρ)) dρ] 1_{τA≤h}(ω)],

where X(τA)(ω) = X(τA(ω))(ω) = X(τA(ω), ω). The first term in (9.174) has to be interpreted in the following manner:

Ex [e^{−λ′h} EX(τA) [EX(h−τA) [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]], τA ≤ h]
= Ex [ω 7→ e^{−λ′h} EX(τA(ω),ω) [ω′ 7→ EX(h−τA(ω),ω′) [∫_0^{τA} e^{−λ′ρ} f(X(ρ)) dρ]] 1_{τA≤h}(ω)].

The equality in (9.173) and (9.174) will be used to prove the existence and uniqueness (up to scalar multiples) of an invariant measure. A crucial role will be played by Proposition 9.40.
The equality in (9.172) will also be used to prove that the invariant measure π is strictly positive in the sense that ∫ f dπ > 0 whenever f ∈ Cb(E) is such that f ≥ 0 and f ≠ 0. This claim follows from the first equality in (9.223) in Proposition 9.39 together with the first inequality in (9.209) in Lemma 9.38 below. Here we also need the irreducibility of the Markov process X. So let f ≥ 0, f ≠ 0, f ∈ Cb(E) ∩ L1(E, E, π). Then, from the first equality in (9.223) we see:

h ∫_E f(x) dπ(x)
= ∫_E Ex [EX(τA) [∫_0^{h−τA+τA◦ϑ_{h−τA}} f(X(ρ)) dρ], τA ≤ h] dπ(x)
≥ ∫_E Ex [EX(τA) [∫_0^{½h} f(X(ρ)) dρ], τA ≤ ½h] dπ(x)
≥ inf_{y∈A} Ey [∫_0^{½h} f(X(ρ)) dρ] ∫_E Px [τA ≤ ½h] dπ(x)
= Ey0 [∫_0^{½h} f(X(ρ)) dρ] ∫_E Px [τA ≤ ½h] dπ(x)   (9.175)

for some y0 ∈ A. By irreducibility we have

Ey0 [∫_0^{½h} f(X(ρ)) dρ] > 0.   (9.176)

The combination of the first inequality in (9.209) in Lemma 9.38 and (9.175) shows that ∫_E f(x) dπ(x) > 0 whenever f ≥ 0, f ≠ 0, f ∈ Cb(E) ∩ L1(E, E, π): see (9.175). As a consequence the measure π is strictly positive in the sense that π(O) > 0 for every non-empty open subset O of E. In addition, we have π(B) = 0, B ∈ E, if and only if P(t, x, B) = 0 for some (t, x) ∈ (0, ∞) × E. If P(t0, x0, B) = 0 for some (t0, x0) ∈ (0, ∞) × E, then P(t, x, B) = 0 for all (t, x) ∈ (0, ∞) × E, and hence π(B) = ∫ P(t, x, B) dπ(x) = 0. Conversely, suppose B ∈ E is such that π(B) = 0. Then ∫ P(t0, x, B) dπ(x) = 0 (by invariance). Since by the strong Feller property the function x 7→ P(t0, x, B) is continuous, it follows from the strict positivity of the measure π that P(t0, x, B) = 0 for some x ∈ E. Since all the measures B 7→ P(t0, x, B), x ∈ E, are equivalent, it follows that P(t0, x0, B) = 0.
First let us embark on the existence of the invariant measure π. We will use a Hahn-Banach argument to obtain such a measure. Recall that τ_A^{1,h} = inf {s ≥ h : X(s) ∈ A} = h + τA ◦ ϑh, where τA = inf {s > 0 : X(s) ∈ A}. Since the compact subset A is recurrent we see that

Px [τ_A^{1,h} < ∞] = Px [τA ◦ ϑh < ∞] = Ex [PX(h) [τA < ∞]] = Ex [1] = 1,   (9.177)

and hence the stopping time τ_A^{1,h} is finite Px-almost surely for all x ∈ E. Define the operator QA : C(A) → C(A) by

QA f(x) = Ex [f(X(τ_A^{1,h}))] = Ex [EX(h) [f(X(τA))]],  f ∈ C(A).   (9.178)
By the strong Feller property of the Markov process X(t) it follows that the operator QA in (9.178) is a positivity preserving linear mapping from C(A) to C(A). Moreover, QA 1 = 1. Fix x0 ∈ E. By the Hahn-Banach extension theorem there exists a positive linear functional Λx0 : C(A) → R such that for f ∈ C(A), f ≥ 0,

lim inf_{r↑1} (1 − r) Σ_{k=0}^∞ r^k Q_A^k f(x0) ≤ Λx0(f) ≤ lim sup_{r↑1} (1 − r) Σ_{k=0}^∞ r^k Q_A^k f(x0).   (9.179)

To obtain Λx0, apply the analytic version of the theorem of Hahn-Banach to the functional:

f 7→ inf_{g∈C(A), g≥0} lim sup_{r↑1} (1 − r) Σ_{k=0}^∞ r^k (Q_A^k (f + g)(x0) − Q_A^k g(x0)).   (9.180)

From (9.179) it follows that Λx0(1_A) = 1. From the Hahn-Banach theorem it also follows that the second inequality in (9.179) holds for all f ∈ C(A). Consequently, we have

Λx0 (f − QA f) ≤ lim sup_{r↑1} (1 − r) Σ_{k=0}^∞ r^k Q_A^k (I − QA) f(x0)
= lim sup_{r↑1} ((1 − r) f(x0) − (1 − r)^2 Σ_{k=0}^∞ r^k Q_A^k f(x0)) = 0.   (9.181)

From (9.181) we infer Λx0 (f − QA f ) ≤ 0. The latter inequality is also true


for −f instead of f , and hence the functional Λx0 is QA -invariant. Since the
subset A is compact, by the Riesz representationRtheorem the functional Λx0
can be represented by a measure πx0 : Λx0 (f ) = A f (x)d dπx0 , f ∈ C(A). In
order to see the uniqueness we use Orey’s theorem 9.4. First we introduce
the sequence of stopping times: τAk+1,h = τAk,h + τA1,h ◦ ϑτ k,h . with reference
h ³ ´ i A

measure B 7→ Px X τA1,h ∈ B . We need the fact that all measures of the


form
h ³ ´ i h h ³ ´ ii
Px X τAk+1,h ∈ B = Ex PX (τ k,h ) X τA1,h ∈ B
A
9.3 Markov Chains: invariant measure 511
h £ ¤i
= Ex EX (τ k,h ) PX(h) [X (τA ) ∈ B] , k ∈ N, (9.182)
A

are equivalent. Suppose that B is such that the very first member in (9.182) vanishes. Then there exists y = X(τ_A^{k,h}) ∈ A such that

Ey [PX(h) [X(τA) ∈ B]] = 0.   (9.183)

Since all measures of the form B 7→ P(h, y, B), y ∈ E, are equivalent, (9.183) implies that the quantity in (9.183) vanishes for all y ∈ E. It follows that

PX(τ_A^{ℓ,h}) [X(τ_A^{1,h}) ∈ B] = 0   (9.184)

for all ℓ ∈ N. As a consequence we see that the process k 7→ X(τ_A^{k,h}) is Harris recurrent relative to the measure B 7→ Py [X(τ_A^{k,h}) ∈ B], B ∈ E. Then Orey's theorem yields that for all pairs of probability measures (µ1, µ2) on the Borel field of A the following limit vanishes (see (9.25) in Theorem 9.4):

lim_{n→∞} ∫∫ Var (Q_A^n(x, ·) − Q_A^n(y, ·)) dµ1(x) dµ2(y) = 0.   (9.185)

Consequently, we see that QA-invariant probability measures on the Borel field of A are unique. We call such an invariant measure πA. The existence was established using the Hahn-Banach theorem. It then follows that for all f ∈ C(A), uniformly for x ∈ A,

lim_{r↑1} ((1 − r) Σ_{k=0}^∞ r^k Q_A^k f(x) − ∫_A f dπA) = 0.   (9.186)

Assertions (b), (c), (d), and (e) in Proposition 9.40 then show the existence and uniqueness (up to scalar multiples) of e^{tL}-invariant measures on the Borel field of E.
Next we prove that the invariant measure π on E, the existence of which is established by Proposition 9.40, is in fact a σ-finite and strictly positive invariant Radon measure which is equivalent to the measures B 7→ P(t, x, B). This will be the subject of the remaining part of the proof.
The σ-finiteness of the measure π follows from Lemma 9.38. More precisely, put

A_{m,n} = {x ∈ E : Px [τA ≤ m] > 1/n},  m, n ∈ N.   (9.187)

Then E = ∪_{n,m∈N} A_{m,n}. Since by Lemma 9.38 ∫_E Px [τA ≤ m] dπ(x) < ∞, it follows that π(A_{m,n}) < ∞ for all m, n ∈ N \ {0}.
From (9.241) in assertion (f) of Proposition 9.40 and (9.173) it follows that for f ∈ Cb(E), f ≥ 0,

h ∫_E f(x) dπ(x) ≤ sup_{y∈A} Ey [∫_0^{h+τA◦ϑh} f(X(ρ)) dρ] ∫_E Px [τA ≤ h] dπ(x).   (9.188)
From (9.173) and (9.188) we will infer that the measure π is σ-finite, and that it is a Radon measure. In the proof of this result we will adapt the proof of Theorem 8.21 in Chapter 8. In particular the inequality in (8.52) is relevant. The precise arguments run as follows. Let K be a compact subset of E such that A ⊂ K. Then there exists ε0 > 0 with the property that

sup_{y∈A} Py [X(t) ∉ Kε for all t ∈ [h, h + τA ◦ ϑh)] > 0   (9.189)

for all 0 < ε < ε0. Below we will show that under the hypotheses of Theorem 9.36 the inequality in (9.189) is indeed satisfied: see (9.206). Here Kε := {x ∈ E : d(x, K) ≤ ε} stands for an ε-neighborhood of K; d denotes a compatible metric on the Polish space E. We are going to show that

sup_{y∈E} Ey [∫_0^{h+τA◦ϑh} 1_{Kε}(X(ρ)) dρ] < ∞   (9.190)

for some ε > 0. Let τε be the first hitting time of Kε . From (9.189) it follows
that for every ε ∈ (0, ε0 ) there exists yε ∈ A such that

Pyε [τε ◦ ϑh ≥ τA ◦ ϑh ] = Pyε [X(t) ∈


/ Kε for all t ∈ [h, h + τA ◦ ϑh )] > 0.
£ ¤ (9.191)
The function y 7→ Py [τε ◦ ϑh ≥ τA ◦ ϑh ] = Ey PX(h) [τε ≥ τA ] is continuous,
and so there exists a neighborhood Vε of yε such that

αε := inf Px [τε ◦ ϑh ≥ τA ◦ ϑh ] > 0. (9.192)


x∈Vε

and such that

inf P (t0 , x, Vε ) > 0 (9.193)


x∈Kε

for some fixed but arbitrary t0 > h. In (9.193) we used the irreducibility of
the Markov process and the continuity of the function x 7→ P (t0 , x, Vε ) for
ε > 0. If necessary we choose a smaller neighborhood Vε of yε and a smaller
ε, which we are entitled to do, because (9.191) holds for every ε ∈ (0, ε0 ).
Choose y ∈ Kε. Then by the Markov property we have

Py [∫_0^{h+τA◦ϑh} 1_{Kε}(X(t)) dt < t0]
≥ Py [∫_0^{h+τA◦ϑh} 1_{Kε}(X(t)) dt < t0, t0 < h + τA ◦ ϑh]
= Ey [ω 7→ PX(t0)(ω) [∫_0^{t0} 1_{Kε}(X(t)(ω)) dt + ∫_0^{h+τA◦ϑh(ω)−t0} 1_{Kε}(X(t)) dt < t0] 1_{t0<h+τA◦ϑh}(ω)]
≥ Ey [ω 7→ PX(t0)(ω) [∫_0^{t0} 1_{Kε}(X(t)(ω)) dt < t0, X(t) ∉ Kε for all t ∈ [0, h + τA ◦ ϑh(ω) − t0)] 1_{t0<h+τA◦ϑh}(ω)]
≥ Py [∫_0^{t0} 1_{Kε}(X(t)) dt < t0, X(t) ∉ Kε for all t ∈ [t0, h + τA ◦ ϑh), h + τA ◦ ϑh > t0]
≥ Py [∫_0^{t0} 1_{Kε}(X(t)) dt < t0, X(t0) ∈ Vε, X(t) ∉ Kε for all t ∈ [t0, h + τA ◦ ϑh), h + τA ◦ ϑh > t0]
≥ Ey [PX(t0) [τε ◦ ϑh ≥ τA ◦ ϑh], X(t0) ∈ Vε]

(apply (9.192), the definition of αε)

≥ αε P(t0, y, Vε) ≥ αε inf_{x∈Kε} P(t0, x, Vε) =: q > 0,   (9.194)

where we used the irreducibility of our Markov process, and the continuity of the function x 7→ P(t0, x, Vε). Hence we infer

sup_{y∈Kε} Py [∫_0^{h+τA◦ϑh} 1_{Kε}(X(t)) dt ≥ t0] ≤ 1 − q.   (9.195)

Put

κε = inf {t > h : ∫_0^t 1_{Kε}(X(s)) ds ≥ t0} = inf {t > h : ∫_0^t 1_{Kε}(X(s)) ds = t0}.   (9.196)

Then κε is a stopping time relative to the filtration (Ft)_{t≥0}, because X(s) is Ft-measurable for all 0 ≤ s ≤ t. Moreover, by right-continuity of the process t 7→ X(t) it follows that X(κε) ∈ Kε on the event {κε < ∞}. Let y ∈ A. By induction we shall prove that

Py [∫_0^{h+τA◦ϑh} 1_{Kε}(X(t)) dt > k t0] ≤ (1 − q)^{k−1},  k ∈ N, k ≥ 1.   (9.197)

To this end we put


"Z #
h+τA ◦ϑh
αk = sup Ex 1Kε (X(s)) ds ≥ kt0 . (9.198)
x∈Kε 0

If x belongs to Kε , then by the Markov property we have:


"Z #
h+τA ◦ϑh
Px 1Kε (X(s)) ds > (k + 1)t0
0
"Z #
h+τA ◦h
= Px 1Kε (X(s)) ds > kt0 , κε ≤ h + τA ◦ ϑh
κε
· ·Z ∞ ¸ ¸
= Ex PX(κε ) 1Kε (X(s)) ds > kt0 , κε ≤ h + τA ◦ ϑh
" "Z0 #
h+τA ◦ϑh
= Ex PX(κε ) 1Kε (X(s)) ds > kt0 ,
0
Z #
h+τA ◦ϑh
1Kε (X(s)) ds ≥ t0
0

≤ α1 αk . (9.199)
From (9.199) and induction we infer
"Z #
h+τA ◦ϑh
sup Px 1Kε (X(s)) ds ≥ kt0
x∈Kε 0
à "Z #!k
h+τA ◦ϑh
≤ α1k = sup 1Kε (X(s)) ds ≥ t0 ≤ (1 − q)k , (9.200)
x∈Kε 0

where in the final step of (9.200) we employed (9.195). If y ∈ E is arbitrary, then we proceed as follows:

Py [∫_0^{h+τA◦ϑh} 1_{Kε}(X(s)) ds > (k + 1) t0]
= Py [∫_{κε}^{h+τA◦ϑh} 1_{Kε}(X(s)) ds > k t0, κε ≤ h + τA ◦ ϑh]
= Ey [PX(κε) [∫_0^{h+τA◦ϑh} 1_{Kε}(X(s)) ds > k t0], κε ≤ h + τA ◦ ϑh]
≤ (1 − q)^k Py [κε ≤ h + τA ◦ ϑh] ≤ (1 − q)^k.   (9.201)

The inequality in (9.201) implies the one in (9.197). To show (9.190), for ε > 0 small enough, we observe that for x ∈ E we have
Ex [∫_0^{h+τA◦ϑh} 1_{Kε}(X(s)) ds]
≤ Σ_{k=1}^∞ k t0 Px [(k − 1) t0 < ∫_0^{h+τA◦ϑh} 1_{Kε}(X(s)) ds ≤ k t0]
≤ t0 + t0 Σ_{k=2}^∞ k Px [∫_0^{h+τA◦ϑh} 1_{Kε}(X(s)) ds > (k − 1) t0]
≤ t0 + t0 Σ_{k=2}^∞ k (1 − q)^{k−2} = t0 (1 + 1/q + 1/q²) < ∞.   (9.202)

The inequality in (9.190) is indeed a consequence of (9.202). In other words, for every compact subset K of E there exists an ε-neighborhood Kε ⊃ K such that (9.190) is satisfied. It follows that the functional f 7→ ∫ f dπ, f ∈ Cb(E), f ≥ 0, can be represented as a Radon measure. Since E is a Polish space, it also follows that the measure π is σ-finite.
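The numerical value appearing in (9.202) comes from a standard geometric-series computation; with s = 1 − q it reads:

```latex
\sum_{k=2}^{\infty} k\,s^{\,k-2}
  = \sum_{j=0}^{\infty} (j+2)\,s^{\,j}
  = \frac{2}{1-s} + \frac{s}{(1-s)^{2}}
  = \frac{2}{q} + \frac{1-q}{q^{2}}
  = \frac{1+q}{q^{2}}
  = \frac{1}{q} + \frac{1}{q^{2}},
% so that t_0 + t_0\sum_{k\ge 2} k(1-q)^{k-2}
%   = t_0\Bigl(1 + \frac1q + \frac1{q^2}\Bigr),
% as asserted in (9.202).
```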
In order to complete the proof of (9.188) we have to verify the inequality in (9.189). By assuming that

sup_{y∈A} Py [X(t) ∉ Kε for all t ∈ [h, h + τA ◦ ϑh)] = sup_{y∈A} Py [τε ◦ ϑh ≥ τA ◦ ϑh] = 0   (9.203)

we will arrive at a contradiction. If (9.203) holds, then for all y ∈ A we have

0 = Py [τε ◦ ϑh ≥ τA ◦ ϑh] = Ey [PX(h) [τε ≥ τA]],   (9.204)

and hence, since all measures B 7→ P(h′, y, B), B ∈ E, h′ > 0, are equivalent, we infer from (9.204) that

Py [h′ + τε ◦ ϑ_{h′} ≥ h′ + τA ◦ ϑ_{h′}] = 0   (9.205)

for all h′ > 0. In (9.205) we let h′ ↓ 0 to obtain

Py [τε ≥ τA] = 0   (9.206)

for all y ∈ A. Choose y ∈ A^r: since X(τA) ∈ A^r Px-almost surely on {τA < ∞} and A is recurrent, such points y exist. Then τA = 0 Py-almost surely. From (9.206) we get Py [τε = 0] = 0, which is manifestly a contradiction, because y is an interior point of Kε.
The proof of (9.190) follows the same pattern as the corresponding proof
by Seidler in [207], who in turn follows Khasminskii [101]. Let τ be the first
hitting time of K. Since K is non-recurrent there exists y0 ∈ / K such that

Py0 [τ = ∞] = Py0 [X(t) ∉ K for all t ≥ 0] > 0.
There is one other issue to be settled, namely whether the subspace R(L) + R1 is Tβ-dense in Cb(E). To this end we consider a Tβ-continuous linear functional Λ : Cb(E) → R which annihilates the subspace R(L) + R1. Suppose that Λ ≠ 0. Then Λ can be represented as a measure on E, and since Λ(1) = 0, by scaling we may and will assume that Λ(f) can be written as Λ(f) = ∫ f dµ1 − ∫ f dµ2, f ∈ Cb(E), where µ1 and µ2 are probability measures on E. Then, since ∫ Lf dµ1 − ∫ Lf dµ2 = 0, it follows that

∫_E f(x) dµ1(x) − ∫_E f(x) dµ2(x) = ∫_E e^{nL} f(x) dµ1(x) − ∫_E e^{nL} f(x) dµ2(x)
= ∫∫_{E×E} (e^{nL} f(x) − e^{nL} f(y)) dµ1(x) dµ2(y),  n ∈ N, f ∈ Cb(E).   (9.207)

In (9.207) we let n → ∞, and we use Orey's theorem to conclude that ∫_E f(x) dµ1(x) − ∫_E f(x) dµ2(x) = 0, f ∈ Cb(E). It follows that Λ(f) = 0, f ∈ Cb(E), contradicting the assumption Λ ≠ 0. Consequently, by the Hahn-Banach theorem we infer that the subspace R(L) + R1 is Tβ-dense in Cb(E).
By construction and (9.190) it follows that for every compact subset K of E there exists a function fK ∈ Cb(E) such that 1_K ≤ fK ≤ 1_{Kε} and ∫ fK dπ < ∞. Hence, the open subset {fK > 0} has σ-finite π-measure. Let the sequence of open subsets (Un)_{n∈N} be as in Remark 9.37. Consequently, each open subset Un for which there exists a compact subset Kn with Un ⊂ {f_{Kn} > 0} has σ-finite π-measure. Since by Remark 9.37 such open subsets cover E, it follows that the measure π is σ-finite. This is another argument showing that the invariant e^{tL}-measure π is σ-finite; a previous argument was based on Lemma 9.38.
Altogether this completes the proof of Theorem 9.36.
In the proof of Proposition 9.40 below we need the following lemma. The proof requires the equalities in (9.240), which are the same as those in (9.173) and (9.174).

Lemma 9.38. Let A be a compact subset which is recurrent with first hitting time τA. Let πE be any non-negative invariant Radon measure on E. Then ∫_E Px [τA ≤ m] dπE(x) < ∞ for every m ∈ (0, ∞). Put

C(h/2, πE) = (2/h) ∫_E Px [τA ≤ ½h] dπE(x).   (9.208)

Moreover, for 0 < m < ∞ and α > 0 the following inequalities hold:

0 < ∫_E Px [τA ≤ m] dπE(x) ≤ (m + h) C(h/2, πE),  and   (9.209)

α ∫_E Ex [e^{−ατA}] dπE(x) ≤ (αh + 1) C(h/2, πE).   (9.210)
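To see how the bound (9.209) follows from the iteration inequality (9.217) established further down, one may argue as follows (a sketch; here k denotes the smallest integer with ½kh ≥ m):

```latex
% Choose k = \lceil 2m/h \rceil, so that m \le \tfrac12 kh and k \le \tfrac{2m}{h} + 1.
% By (9.217),
\int_E P_x\left[\tau_A \le m\right]d\pi_E(x)
  \le \int_E P_x\left[\tau_A \le \tfrac12 kh\right]d\pi_E(x)
  \le (k+1)\int_E P_x\left[\tau_A \le \tfrac12 h\right]d\pi_E(x)
  \le \Bigl(\frac{2m}{h} + 2\Bigr)\cdot\frac{h}{2}\,C\bigl(\tfrac h2,\pi_E\bigr)
  = (m+h)\,C\bigl(\tfrac h2,\pi_E\bigr).
```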

Proof. Since A is compact and πE is a Radon measure, there exists a bounded continuous function f such that 1_A ≤ f ≤ 1, and such that ∫_E f dπE < ∞. The first equality in (9.240) yields:

(e^{−hλ′} e^{hL} − I) R_A(λ′) f(x) + ∫_0^h e^{−λ′s} e^{sL} f(x) ds
= Ex [e^{−λ′τA} EX(τA) [∫_0^{h−τA+τA◦ϑ_{h−τA}} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h],   (9.211)
and so we get by invariance of the measure πE:

∫_0^h e^{−λ′s} ds ∫_E f(x) dπE(x) = ∫_0^h e^{−λ′s} ∫_E e^{sL} f(x) dπE(x) ds
= ∫_E (I − e^{−hλ′} e^{hL}) R_A(λ′) f(x) dπE(x)
  + ∫_E Ex [e^{−λ′τA} EX(τA) [∫_0^{τ_A^h} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h] dπE(x)
= (1 − e^{−λ′h}) ∫_E R_A(λ′) f(x) dπE(x)
  + ∫_E Ex [e^{−λ′τA} EX(τA) [∫_0^{τ_A^h} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h] dπE(x)
≥ ∫_E Ex [e^{−λ′τA} EX(τA) [∫_0^{½h} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ ½h] dπE(x),   (9.212)

where for brevity we wrote

τ_A^h(ω, ω′) = h − τA(ω) + τA ◦ ϑ_{h−τA(ω)}(ω′),   (9.213)

which indicates the first time of hitting A strictly after h − τA(ω). Notice that on the event {τA ≤ ½h} the inequalities τ_A^h ≥ ½h + τA ◦ ϑ_{h−τA} ≥ ½h hold. In (9.212) we let λ′ ↓ 0 to get:

h ∫_E f(x) dπE(x) ≥ ∫_E Ex [EX(τA) [∫_0^{½h} f(X(ρ)) dρ], τA ≤ ½h] dπE(x)
≥ inf_{y∈A} Ey [∫_0^{½h} f(X(ρ)) dρ] · ∫_E Px [τA ≤ ½h] dπE(x).   (9.214)
Assuming that inf_{y∈A} Ey [∫_0^{½h} f(X(ρ)) dρ] = 0 leads to a contradiction, as we shall see momentarily. Since the function y 7→ Ey [∫_0^{½h} f(X(ρ)) dρ] is continuous and A is compact, our assumption implies that for some y0 ∈ A the following equality holds for all 0 < h′ < ½h:

Ey0 [∫_0^{h′} f(X(ρ)) dρ] = ∫_0^{h′} e^{ρL} f(y0) dρ = 0.   (9.215)

Dividing all members of (9.215) by h′ > 0 and letting h′ tend to 0, we obtain f(y0) = 0. Here we employ the Tβ-continuity of the function t 7→ e^{tL} f(y0), which follows from the Tβ-continuity of the semigroup t 7→ e^{tL}. Since 1_A ≤ f ≤ 1 and y0 ∈ A we have a contradiction. Thus we have inf_{y∈A} Ey [∫_0^{½h} f(X(ρ)) dρ] > 0. In combination with (9.214) this yields that ∫_E Px [τA ≤ ½h] dπE(x) < ∞. By induction with respect to k it follows that

∫_E Px [0 < τA ≤ ½kh] dπE(x) ≤ k ∫_E Px [0 < τA ≤ ½h] dπE(x),   (9.216)

and hence

∫_E Px [τA ≤ ½kh] dπE(x) ≤ (k + 1) ∫_E Px [τA ≤ ½h] dπE(x) < ∞.   (9.217)

Let us show (9.216). Since on events of the form {τA > s} we have s + τ ◦ ϑs
Px -almost surely, we have
· ¸
1
Px 0 < τA ≤ (k + 1)h
2
· ¸ · ¸
1 1 1
= Px 0 < τA ≤ kh + Px kh < τA ≤ (k + 1)h
2 2 2
· ¸ · ¸
1 1 1
= Px 0 < τA ≤ kh + Px 0 < τA ◦ ϑ 21 kh ≤ h, kh < τA
2 2 2

(Markov property)
· ¸ · · ¸ ¸
1 1 1
= Px 0 < τA ≤ kh + Ex PX ( 1 kh) 0 < τA ≤ h , kh < τA
2 2 2 2
· ¸ · · ¸¸
1 1
≤ Px 0 < τA ≤ kh + Ex PX ( 1 kh) 0 < τA ≤ h (9.218)
2 2 2

Since the positive measure πE is etL -invariant from (9.218) we deduce


Z · ¸
1
Px 0 < τA ≤ (k + 1)h dπE (x)
E 2
Z · ¸ Z · · ¸¸
1 1
≤ Px 0 < τA ≤ kh dπE (x) + Ex PX ( 1 kh) 0 < τA ≤ h dπE (x)
E 2 E 2 2
9.3 Markov Chains: invariant measure 519
Z· ¸ Z · ¸
1 1 1
= Px 0 < τA ≤ kh dπE (x) + e 2 khL P(·) 0 < τA ≤ h (x) dπE (x)
E 2 E 2

(etL -invariance for t = 12 kh)


Z· ¸ Z · ¸
1 1
= Px 0 < τA ≤ kh dπE (x) + Px 0 < τA ≤ h dπE (x). (9.219)
E 2 E 2
Thus (9.216) follows by induction from (9.219). The inequality in (9.209) follows from (9.217). Since

Ex [e^{−ατA}] = α ∫_0^∞ Px [τA ≤ s] e^{−αs} ds,

the inequality in (9.210) follows from (9.209). Suppose that the invariant measure πE is non-trivial. Then there remains to show that the quantity ∫_E Px [τA ≤ m] dπE(x) is strictly positive for 0 < m < ∞. For m ↑ ∞ the quantity ∫_E Px [τA ≤ m] dπE(x) increases:

∫_E Px [τA ≤ m] dπE(x) ↑ ∫_E Px [τA < ∞] dπE(x) = ∫_E 1 dπE > 0.   (9.220)
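Combining the Laplace-transform identity just displayed with (9.209) gives (9.210) by a direct computation (a sketch):

```latex
\alpha\int_E E_x\left[e^{-\alpha\tau_A}\right]d\pi_E(x)
  = \alpha^2\int_0^\infty e^{-\alpha s}\int_E P_x\left[\tau_A\le s\right]d\pi_E(x)\,ds
  \le \alpha^2\int_0^\infty e^{-\alpha s}\,(s+h)\,C\bigl(\tfrac h2,\pi_E\bigr)\,ds
  = \Bigl(\alpha^2\cdot\frac{1}{\alpha^2}
        + \alpha^2\cdot\frac{h}{\alpha}\Bigr)\,C\bigl(\tfrac h2,\pi_E\bigr)
  = (1+\alpha h)\,C\bigl(\tfrac h2,\pi_E\bigr),
% using \int_0^\infty s\,e^{-\alpha s}\,ds = \alpha^{-2}
% and \int_0^\infty e^{-\alpha s}\,ds = \alpha^{-1}.
```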

Assume, to arrive at a contradiction, that for some m ∈ (0, ∞) the integral ∫_E Px [τA ≤ m] dπE(x) vanishes. Then by invariance we have

∫_E Px [m < τA ≤ 2m] dπE(x)
= ∫_E Px [m + τA ◦ ϑm ≤ 2m, τA > m] dπE(x)
= ∫_E Px [τA ◦ ϑm ≤ m, τA > m] dπE(x)
≤ ∫_E Ex [PX(m) [τA ≤ m]] dπE(x)

(the measure πE is e^{mL}-invariant)

= ∫_E Px [τA ≤ m] dπE(x) = 0.   (9.221)

Repeating the arguments in (9.221) then shows

∫_E Px [τA < ∞] dπE(x)
≤ ∫_E Px [τA ≤ m] dπE(x) + Σ_{k=1}^∞ ∫_E Px [km < τA ≤ (k + 1)m] dπE(x) = 0,   (9.222)

which contradicts the non-triviality of the measure πE.
Finally, the conclusion in (9.217) completes the proof of Lemma 9.38.

Proposition 9.39. Let πE be an invariant Radon measure. If the function f ≥ 0 belongs to L1(E, E, πE), then

h ∫_E f(x) dπE(x)
= ∫_E Ex [EX(τA) [∫_0^{h−τA+τA◦ϑ_{h−τA}} f(X(ρ)) dρ], τA ≤ h] dπE(x)
= ∫_E Ex [∫_{τA}^{h+τA◦ϑh} f(X(ρ)) dρ, τA ≤ h] dπE(x),   (9.223)

and

lim_{λ′↓0} ∫_E λ′ R_A(λ′) f(x) dπE(x) = inf_{λ′>0} ∫_E λ′ R_A(λ′) f(x) dπE(x) = 0.   (9.224)

First assume that the function f is such that the function R_A(0)f is uniformly bounded. Since the Markov process is irreducible this is true whenever f is replaced by a function of the form f 1_U, where U is an appropriate open neighborhood of a given compact subset: see (8.116) in Corollary 8.40.
Proof. Let f ∈ L1(E, E, πE) ∩ Cb(E), and let πE be an e^{tL}-invariant Radon measure. For the proof we need the equality in (9.240). From that equality in conjunction with the invariance property of the measure πE we obtain:

(e^{−hλ′} − 1) ∫_E R_A(λ′) f(x) dπE(x) + (1 − e^{−λ′h})/λ′ ∫_E f(x) dπE(x)
= ∫_E Ex [e^{−λ′τA} EX(τA) [∫_0^{τ_A^h} e^{−λ′ρ} f(X(ρ)) dρ], τA ≤ h] dπE(x)
= ∫_E Ex [∫_{τA}^{h+τA◦ϑh} e^{−λ′ρ} f(X(ρ)) dρ, τA ≤ h] dπE(x),   (9.225)

where τ_A^h = h − τA + τA ◦ ϑ_{h−τA}: see (9.213). Upon letting λ′ ↓ 0 we get

h ∫_E f(x) dπE(x)
= h lim_{λ′↓0} ∫_E λ′ R_A(λ′) f(x) dπE(x)
  + ∫_E Ex [EX(τA) [∫_0^{τ_A^h} f(X(ρ)) dρ], τA ≤ h] dπE(x)
= h lim_{λ′↓0} ∫_E λ′ R_A(λ′) f(x) dπE(x)
  + ∫_E Ex [∫_{τA}^{h+τA◦ϑh} f(X(ρ)) dρ, τA ≤ h] dπE(x).   (9.226)
Next in (9.240) we let λ′ tend to zero to obtain the pointwise equality:

(e^{hL} − I) R_A(0) f(x) + ∫_0^h e^{sL} f(x) ds
= Ex [EX(τA) [∫_0^{h−τA+τA◦ϑ_{h−τA}} f(X(ρ)) dρ], τA ≤ h]
= Ex [∫_{τA}^{h+τA◦ϑh} f(X(ρ)) dρ, τA ≤ h].   (9.227)

From (9.226) and (9.227) we see that the function (I − e^{hL}) R_A(0)f belongs to L1(E, E, πE), and that

∫_E (I − e^{hL}) R_A(0)f(x) dπE(x) = h lim_{λ′↓0} ∫_E λ′ R_A(λ′) f(x) dπE(x)
= h inf_{λ′>0} ∫_E λ′ R_A(λ′) f(x) dπE(x).   (9.228)

The fact that in (9.224) and in (9.228) we may replace the limit by an infimum is due to the fact that the function λ′ 7→ λ′ ∫_E R_A(λ′) f(x) dπE(x) decreases as λ′ ↓ 0.
This claim follows from the resolvent property of the family {R_A(λ) : λ > 0} and the invariance of the measure πE. The arguments read as follows. Let λ′ > λ″ > 0. Then by the resolvent equation we have:

λ′ R_A(λ′) − λ″ R_A(λ″) = (λ′ − λ″) (I − λ′ R_A(λ′)) R_A(λ″).   (9.229)

For g ∈ L1(E, E, πE), g ≥ 0, we also have

∫_E λ′ R_A(λ′) g(x) dπE(x) ≤ ∫_E λ′ R(λ′) g(x) dπE(x) = ∫_E g(x) dπE(x).   (9.230)

From (9.229) and (9.230) the monotonicity of the function

λ′ 7→ λ′ ∫_E R_A(λ′) f(x) dπE(x)
easily follows. We shall prove that this limit vanishes, and consequently the result in (9.224) follows. Therefore, for m > 0 arbitrary, we consider the following decomposition of the function λR_A(λ)f(x):

λR_A(λ)f(x) = λEx [∫_0^{τA} e^{−λρ} f(X(ρ)) dρ]   (9.231)
= λEx [∫_0^{(τA−m)∨0} e^{−λρ} f(X(ρ)) dρ] + λEx [∫_{(τA−m)∨0}^{τA} e^{−λρ} f(X(ρ)) dρ].

From the Markov property we infer

λEx [∫_0^{(τA−m)∨0} e^{−λρ} f(X(ρ)) dρ]
= λEx [∫_0^{(τA−m)∨0} e^{−λρ} f(X(ρ)) PX(ρ) [τA > m] dρ].   (9.232)
We also infer, again using the Markov property,

\[
\lambda E_x\Bigl[\int_{(\tau_A-m)\vee 0}^{\tau_A} e^{-\lambda\rho} f(X(\rho))\,d\rho\Bigr]
= \lambda E_x\Bigl[\int_{(\tau_A-m)\vee 0}^{\tau_A} e^{-\lambda\rho} f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\,d\rho\Bigr].
\tag{9.233}
\]

In both equalities (9.232) and (9.233) we used the P_x-almost sure equality ρ + τ_A ∘ ϑ_ρ = τ_A on the event {τ_A > ρ}. Next we estimate the expression in (9.232):

\[
\begin{aligned}
\lambda E_x\Bigl[\int_0^{(\tau_A-m)\vee 0} e^{-\lambda\rho} f(X(\rho))\, P_{X(\rho)}[\tau_A > m]\,d\rho\Bigr]
&= \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[f(X(\rho))\, P_{X(\rho)}[\tau_A > m],\ \tau_A > \rho + m\bigr]\,d\rho \\
&\le \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[f(X(\rho))\, P_{X(\rho)}[\tau_A > m]\bigr]\,d\rho \\
&= \lambda R(\lambda)\bigl(f(\cdot)\, P_{(\cdot)}[\tau_A > m]\bigr)(x).
\end{aligned}
\tag{9.234}
\]

The expression in (9.233) can be rewritten and estimated as follows:

\[
\lambda E_x\Bigl[\int_{(\tau_A-m)\vee 0}^{\tau_A} e^{-\lambda\rho} f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\,d\rho\Bigr]
= \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\, f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\bigr]\,d\rho
\]

(Markov property and ρ + τ_A ∘ ϑ_ρ = τ_A on {τ_A > ρ}, P_x-almost surely)

\[
\begin{aligned}
&= \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[E_{X(\rho)}\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\, f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\bigr],\ \tau_A > \rho\bigr]\,d\rho \\
&\le \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[E_{X(\rho)}\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\, f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\bigr]\bigr]\,d\rho.
\end{aligned}
\tag{9.235}
\]

Employing the invariance of the measure π_E in the inequality in (9.234) shows

\[
\begin{aligned}
\lambda \int_E E_x\Bigl[\int_0^{(\tau_A-m)\vee 0} e^{-\lambda\rho} f(X(\rho))\, P_{X(\rho)}[\tau_A > m]\,d\rho\Bigr]\,d\pi_E(x)
&\le \lambda \int_E R(\lambda)\bigl(f(\cdot)\, P_{(\cdot)}[\tau_A > m]\bigr)(x)\,d\pi_E(x) \\
&= \int_E f(x)\, P_x[\tau_A > m]\,d\pi_E(x).
\end{aligned}
\tag{9.236}
\]

A similar estimate for the term in (9.233) is somewhat more involved, but it really uses the recurrence of the set A. Again using the invariance of the measure π_E for the expression in (9.235) yields:

\[
\begin{aligned}
&\lambda \int_E E_x\Bigl[\int_{(\tau_A-m)\vee 0}^{\tau_A} e^{-\lambda\rho} f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\,d\rho\Bigr]\,d\pi_E(x) \\
&\le \lambda \int_0^\infty e^{-\lambda\rho} \int_E E_x\bigl[E_{X(\rho)}\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\, f(X(\rho))\, P_{X(\rho)}[\tau_A \le m]\bigr]\bigr]\,d\pi_E(x)\,d\rho \\
&= \lambda \int_0^\infty e^{-\lambda\rho} \int_E E_x\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\bigr]\, f(x)\, P_x[\tau_A \le m]\,d\pi_E(x)\,d\rho \\
&= \int_E \lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\bigr]\,d\rho\; f(x)\, P_x[\tau_A \le m]\,d\pi_E(x) \\
&\le \bigl(1 - e^{-\lambda m}\bigr) \int_E f(x)\, P_x[\tau_A \le m]\,d\pi_E(x).
\end{aligned}
\tag{9.237}
\]

In the final step of (9.237) we used the fact that τ_A < ∞ P_x-almost surely for all x ∈ E. As a consequence of this we have

\[
\lambda \int_0^\infty e^{-\lambda\rho}\, E_x\bigl[1_{[(\tau_A-m)\vee 0,\,\tau_A)}(\rho)\bigr]\,d\rho
= E_x\Bigl[\lambda \int_{(\tau_A-m)\vee 0}^{\tau_A} e^{-\lambda\rho}\,d\rho\Bigr] \le 1 - e^{-\lambda m},
\]

showing the final step in (9.237).

From (9.231), (9.236), and (9.237) we deduce:

\[
\lambda \int_E R_A(\lambda) f(x)\,d\pi_E(x)
\le \int_E f(x)\, P_x[\tau_A > m]\,d\pi_E(x)
+ \bigl(1 - e^{-\lambda m}\bigr) \int_E f(x)\, P_x[\tau_A \le m]\,d\pi_E(x).
\tag{9.238}
\]

Here m > 0 is arbitrary. Let ε > 0 be arbitrary. First we choose m > 0 so large that the first term on the right-hand side of (9.238) is ≤ ½ε. Then we choose λ > 0 so small that the second term in (9.238) is ≤ ½ε as well. As a consequence we see that (9.224) in Proposition 9.39 follows. Together with (9.226), (9.227), and (9.228) this completes the proof of Proposition 9.39.

In the following proposition we establish a strong link between Q_A-invariant measures on A and e^{tL}-invariant measures on E. In particular it follows that invariant measures on E are unique whenever this is the case on A.

Proposition 9.40. Let the Borel probability measure π_A on A and the measure π_E on E be related as follows. For all functions f ∈ L¹(E, E, π_E) the equality

\[
\int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} f(X(\rho))\,d\rho\Bigr]\,d\pi_A(x) = h \int_E f\,d\pi_E
\tag{9.239}
\]

holds. Then the following assertions are true:

(a) Let f ∈ C_b(E), and λ′ ≥ 0. The following equalities hold (see (9.173)):

\[
\begin{aligned}
e^{-h\lambda'} \bigl(e^{hL} - I\bigr) R_A(\lambda') f(x) + \int_0^h e^{-\lambda' s} e^{sL} f(x)\,ds
&= E_x\Bigl[e^{-\lambda'\tau_A} E_{X(\tau_A)}\Bigl[\int_0^{h-\tau_A+\tau_A\circ\vartheta_{h-\tau_A}} e^{-\lambda'\rho} f(X(\rho))\,d\rho\Bigr],\ \tau_A \le h\Bigr] \\
&= E_x\Bigl[\int_{\tau_A}^{h+\tau_A\circ\vartheta_h} e^{-\lambda'\rho} f(X(\rho))\,d\rho,\ \tau_A \le h\Bigr].
\end{aligned}
\tag{9.240}
\]

(b) The measure π_A is Q_A-invariant if and only if π_E is e^{tL}-invariant for all t ≥ 0.
(c) If the Q_A-invariant measure π_A on the Borel field of A is given, then (9.239) can be used to define the invariant measure π_E on E.
(d) If the e^{tL}-invariant measure π_E on the Borel field of E is given, then (9.240) together with the equality (9.223) of Proposition 9.39 can be used to define the invariant measure π_A on the Borel field of A.
(e) If there exists only one Q_A-invariant probability measure π_A, then the e^{tL}-invariant measure π_E is unique up to multiplicative constants.
(f) If π_E is an invariant measure on E, and f belongs to L¹(E, E, π_E), then the following inequality holds:

\[
h \Bigl|\int_E f\,d\pi_E\Bigr| \le \sup_{y\in A} E_y\Bigl[\int_0^{h+\tau_A\circ\vartheta_h} |f(X(\rho))|\,d\rho\Bigr] \int_E P_x[\tau_A \le h]\,d\pi_E(x).
\tag{9.241}
\]

Let π_E be an e^{tL}-invariant measure. Notice that, with λ′ = 0, the equalities in (9.240) together with (9.239) entail the equality:

\[
\int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} f(X(\rho))\,d\rho\Bigr]\,d\pi_A(x)
= \int_E E_x\Bigl[\int_{\tau_A}^{\tau_A^{1,h}} f(X(\rho))\,d\rho,\ \tau_A \le h\Bigr]\,d\pi_E(x)
= h \int_E f\,d\pi_E.
\tag{9.242}
\]

If f ≥ 0 belongs to C_b(E), and if π_E is a positive measure on E, then we use the first equality in (9.242) to associate to π_E a Borel measure π_A on A. Since the invariant probability measures on A are unique, it follows that the invariant measures on E are unique as well.
Proof. (a) The equalities in (9.240) follow from the equalities in (9.173) and (9.174).
(b) Let the measures π_A on A and π_E be related as in (9.239). Then for t ≥ 0 and f ∈ L¹(E, E, π_E) we have

\[
h \int_E e^{tL} f\,d\pi_E
= \int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} e^{tL} f(X(\rho))\,d\rho\Bigr]\,d\pi_A(x)
= \int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} E_{X(\rho)}[f(X(t))]\,d\rho\Bigr]\,d\pi_A(x)
\]

(Markov property)

\[
= \int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} f(X(\rho+t))\,d\rho\Bigr]\,d\pi_A(x)
= \int_A E_x\Bigl[\int_t^{t+\tau_A^{1,h}} f(X(\rho))\,d\rho\Bigr]\,d\pi_A(x).
\tag{9.243}
\]

We differentiate both sides of (9.243) with respect to t to obtain

\[
h \int_E e^{tL} L f\,d\pi_E
= \int_A E_x\bigl[f\bigl(X\bigl(t+\tau_A^{1,h}\bigr)\bigr)\bigr]\,d\pi_A(x) - \int_A E_x[f(X(t))]\,d\pi_A(x)
= \int_A Q_A e^{tL} f(x)\,d\pi_A(x) - \int_A e^{tL} f(x)\,d\pi_A(x).
\tag{9.244}
\]

By setting t = 0 in (9.244) we see that π_A is Q_A-invariant if and only if π_E is e^{tL}-invariant (or L-invariant). This proves assertion (b).
(c) Let π_A be a (finite) Borel measure on A, and define the measure π_E by the equality in (9.239). If π_A is an invariant measure on A, then by assertion (b) π_E is e^{tL}-invariant. This proves assertion (c).
(d) Let M_E(λ′) be the space of all continuous functions g on E for which there exists a function f ∈ C_b(E) such that g(x), x ∈ E, can be written as in (9.240). Let M_A(λ′) be the subspace of C(A) consisting of the restrictions to A of the functions g ∈ M_E(λ′). Let π_E be a positive Radon measure on E with the property that for some finite constant C the inequality ∫_E g(x) dπ_E(x) ≤ C sup_{x∈A} g(x) holds for all g ∈ M_E(0). Notice that in case a function g ∈ M_A(0) has two extensions g₁ and g₂ in M_E(0), then ∫_E g₁(x) dπ_E(x) = ∫_E g₂(x) dπ_E(x). Define the functional Λ̃_A : M_A(0) → R by Λ̃_A(g) = ∫_E g(x) dπ_E(x). By the observation above Λ̃_A is well-defined, and by assumption Λ̃_A(g) ≤ C sup_{x∈A} g(x), g ∈ M_E(0). By the Hahn-Banach extension theorem in combination with the Riesz representation theorem there exists a measure π_A on the Borel field of A such that ∫_A g(x) dπ_A(x) = ∫_E g(x) dπ_E(x), g ∈ M_A(0), and ∫_A g(x) dπ_A(x) ≤ C sup_{x∈A} g(x) for all g ∈ C(A). Next, let π_E be any non-negative e^{tL}-invariant Radon measure on E. Then Lemma 9.38 implies ∫_E P_x[τ_A ≤ h] dπ_E(x) < ∞.
(e) Let π_E^{(1)} and π_E^{(2)} be two Radon measures on E which are e^{tL}-invariant. Then the construction in (d) yields finite measures π_A^{(1)} and π_A^{(2)} on the Borel field of A such that the equality

\[
\int_A E_x\Bigl[\int_0^{\tau_A^{1,h}} f(X(\rho))\,d\rho\Bigr]\,d\pi_A^{(j)}(x) = h \int_E f\,d\pi_E^{(j)}
\tag{9.245}
\]

is satisfied for all functions f ∈ L¹(E, E, π_E^{(j)}), j = 1, 2: see (9.239). Then (9.245) implies that the measures π_A^{(1)} and π_A^{(2)} are Q_A-invariant. By uniqueness, they are constant multiples of each other. It follows that the measures π_E^{(1)} and π_E^{(2)} are scalar multiples of each other. This completes the proof of item (e).
(f) The inequality in (9.241) is a consequence of the first equality in (9.230) in Proposition 9.39, and the fact that X(τ_A) ∈ A P_x-almost surely.
Altogether this completes the proof of Proposition 9.40.
Let π_E be an invariant Borel measure on E, let f ≥ 0 be a function in C_b(E), and put f_α(x) = f(x) E_x[e^{−ατ_A}]. In the following proposition we show that the functions R_A(α)f_α are very appropriate to approximate functions of the form R_A(0)f. In many aspects they can be used to play the role of R_A(α)f for α > 0 small. If f belongs to C_b(E), then R_A(α)f_α is a member of L¹(E, E, π_E), where π_E is an invariant measure.

Proposition 9.41. In several aspects the functions R_A(α)f_α, α > 0, f ∈ C_b(E), play the role of the functions R_A(α)f, f ∈ C_b(E).

Proof. We introduce the function f_α, α > 0, by f_α(x) = f(x) E_x[e^{−ατ_A}]. Observe that

\[
\begin{aligned}
R_A(\alpha) f_\alpha(x)
&= E_x\Bigl[\int_0^{\tau_A} f(X(\rho))\, e^{-\alpha\rho}\, E_{X(\rho)}\bigl[e^{-\alpha\tau_A}\bigr]\,d\rho\Bigr] \\
&= \int_0^\infty E_x\bigl[f(X(\rho))\, e^{-\alpha\rho}\, E_{X(\rho)}\bigl[e^{-\alpha\tau_A}\bigr],\ \tau_A > \rho\bigr]\,d\rho
\end{aligned}
\]

(Markov property)

\[
= \int_0^\infty E_x\bigl[f(X(\rho))\, e^{-\alpha\rho - \alpha\tau_A\circ\vartheta_\rho},\ \tau_A > \rho\bigr]\,d\rho
\]

(τ_A is a terminal stopping time: ρ + τ_A ∘ ϑ_ρ = τ_A on the event {τ_A > ρ})

\[
= E_x\Bigl[\int_0^{\tau_A} f(X(\rho))\,d\rho\; e^{-\alpha\tau_A}\Bigr],
\tag{9.246}
\]

and consequently lim_{α↓0} R_A(α)f_α(x) = R_A(0)f(x). Here we employed the recurrence of the set A. We also see that the functions R_A(α)f_α are members of L¹(E, E, π_E):

\[
\alpha \int_E R_A(\alpha) f_\alpha(x)\,d\pi_E(x) \le \alpha \int_E R(\alpha) f_\alpha(x)\,d\pi_E(x) \le \int_E f_\alpha(x)\,d\pi_E(x) < \infty.
\tag{9.247}
\]
The following equality was used before in the proof of Proposition 9.39, and can be found in (9.240):

\[
\begin{aligned}
\bigl(e^{-h\alpha} e^{hL} - I\bigr) R_A(\alpha) f_\alpha(x) + \int_0^h e^{-\alpha s} e^{sL} f_\alpha(x)\,ds
&= E_x\Bigl[e^{-\alpha\tau_A} E_{X(\tau_A)}\Bigl[\int_0^{h-\tau_A+\tau_A\circ\vartheta_{h-\tau_A}} e^{-\alpha\rho} f_\alpha(X(\rho))\,d\rho\Bigr],\ \tau_A \le h\Bigr] \\
&= E_x\Bigl[\int_{\tau_A}^{h+\tau_A\circ\vartheta_h} e^{-\alpha\rho} f_\alpha(X(\rho))\,d\rho,\ \tau_A \le h\Bigr].
\end{aligned}
\tag{9.248}
\]

From (9.248) we see that the family {R_A(α)f_α : α > 0} is uniformly π_E-integrable, because it fulfills:

\[
\begin{aligned}
-E_x\Bigl[\int_{\tau_A}^{h+\tau_A\circ\vartheta_h} f(X(\rho))\,d\rho,\ \tau_A \le h\Bigr]
&\le -E_x\Bigl[\int_{\tau_A}^{h+\tau_A\circ\vartheta_h} e^{-\alpha\rho} f_\alpha(X(\rho))\,d\rho,\ \tau_A \le h\Bigr] \\
&\le \bigl(I - e^{-\alpha h} e^{hL}\bigr) R_A(\alpha) f_\alpha(x)
\le \int_0^h e^{-\alpha s} e^{sL} f_\alpha(x)\,ds \le \int_0^h e^{sL} f(x)\,ds.
\end{aligned}
\tag{9.249}
\]

Next we consider the identity:

\[
\bigl(I - e^{hL}\bigr) R_A(0) f - \bigl(I - e^{hL}\bigr) R_A(\alpha) f_\alpha
= \bigl(I - e^{hL}\bigr) R_A(0) f - \bigl(I - e^{-\alpha h} e^{hL}\bigr) R_A(\alpha) f_\alpha
+ \bigl(1 - e^{-\alpha h}\bigr) e^{hL} R_A(\alpha) f_\alpha.
\tag{9.250}
\]

All terms in (9.250) belong to L¹(E, E, π_E). The difference of the first two terms on the right-hand side of (9.250) converges pointwise to zero as α ↓ 0, and by (9.249) it is uniformly π_E-integrable; in fact this difference lies between two functions in L¹(E, E, π_E). The third term on the right-hand side of (9.250) decreases pointwise to zero for α sufficiently small: from equality (9.229) it follows that the family {αR_A(α)f : 0 < α ≤ α₀} decreases to zero for α₀ > 0 sufficiently small. Since f_α ≤ f it follows that the family {αR_A(α)f_α : 0 < α ≤ α₀} converges to the zero function in L¹(E, E, π_E). Since the measure π_E is invariant it follows that the third term on the right-hand side of (9.250) converges to the zero function in L¹(E, E, π_E). As a consequence we obtain the equality

\[
\int_E \bigl(I - e^{hL}\bigr) R_A(0) f(x)\,d\pi_E(x) = 0.
\tag{9.251}
\]

An appeal to (9.226) together with (9.251) proves the identity in (9.230) in Proposition 9.39. It also shows (9.246), provided the function f is such that the function R_A(0)f is bounded. But once the equality in (9.230) holds for functions with the latter property, it holds for all functions f ∈ L¹(E, E, π_E). This can be established by a density argument.
In order to finish the proof of equality (9.223) it suffices to establish (9.224). Without loss of generality we assume that f ≥ 0, f ∈ L¹(E, E, π_E). Then equality (9.226) holds for such functions f. From equality (9.230) it then follows that

\[
\lim_{\lambda'\downarrow 0} \int_E \lambda' R_A(\lambda') f(x)\,d\pi_E(x)
= \inf_{\lambda' > 0} \int_E \lambda' R_A(\lambda') f(x)\,d\pi_E(x) = 0.
\tag{9.252}
\]

We still have to prove the equality in (9.252) in a rigorous manner. However, this follows from (9.224) in Proposition 9.39.
The following lemma was used in the proof of Theorem 9.36 in order to prove
the existence of an invariant measure.
Lemma 9.42. Let A be a compact recurrent subset of E such that A^r = A, i.e. the collection of its regular points coincides with A itself. Put

\[
H_A g(x) = E_x\bigl[g(\tau_A, X(\tau_A))\bigr] = E_x\bigl[g(\tau_A, X(\tau_A)),\ \tau_A < \infty\bigr].
\tag{9.253}
\]

Here g : [0, ∞) × E → R is any bounded continuous function. Then the following assertions hold true:

(a) Suppose that for every such function g the limit lim_{λ↓0} λR(λ)H_A g(x) exists uniformly on compact subsets of E. Then the family {λR(λ)H_A : λ > 0} is T_β-equi-continuous. In particular, it follows that for every compact subset K of E there exists a function v ∈ H([0, ∞) × E) such that

\[
\sup_{x\in K}\ \sup_{\lambda>0} \bigl|\lambda R(\lambda) H_A g(x)\bigr| \le \sup_{(s,x)\in[0,\infty)\times E} |v(s,x)\, g(s,x)|.
\tag{9.254}
\]

(b) Suppose that for every compact subset K of E the following equality holds:

\[
\inf_{u\in N(L),\ v\in D(L)}\ \sup_{x\in K}\ \sup_{\lambda>0} \bigl|\lambda R(\lambda)\bigl(H_A g - u - Lv\bigr)(x)\bigr| = 0.
\tag{9.255}
\]

Let g be any function in C_b([0, ∞) × E). Then the limit

\[
P H_A g(y) = \lim_{\lambda\downarrow 0} \lambda R(\lambda) H_A g(y)
\tag{9.256}
\]

exists uniformly on compact subsets of E, and P H_A g belongs to N(L). Consequently H_A g = P H_A g + (I − P) H_A g decomposes the function H_A g into two functions: P H_A g, which belongs to N(L), and (I − P) H_A g, which belongs to the T_β-closure of R(L).
(c) Suppose that for every function g ∈ C_b([0, ∞) × E) the limit

\[
P H_A g(x) = \lim_{\lambda\downarrow 0} \lambda R(\lambda) H_A g(x) \quad\text{exists uniformly on compact subsets of } E.
\tag{9.257}
\]

If x₀ ∈ E and h > 0, then lim_{λ↓0} λR(λ)P_{(·)}[τ_A ≤ h](x₀) > 0.

Recall that a function v belongs to H([0, ∞) × E) provided that for every α > 0 the set {(s, x) ∈ [0, ∞) × E : v(s, x) ≥ α} is contained in a compact subset of [0, ∞) × E. In particular it follows that, uniformly on any compact subset of E, lim_{t→∞} v(t, x) = 0.
Proof. (a) Let (g_n : n ∈ N) be any sequence of functions in C_b([0, ∞) × E) which decreases to zero pointwise on [0, ∞) × E. Then H_A g_n decreases pointwise to 0 on E. By Dini's lemma it decreases to zero uniformly on compact subsets of E. Define the functions G_n : [0, ∞] × E → R by

\[
G_n(\lambda, x) =
\begin{cases}
\lambda R(\lambda) H_A g_n(x), & 0 < \lambda < \infty,\ x \in E;\\[1ex]
\displaystyle\lim_{\alpha\downarrow 0} \alpha R(\alpha) H_A g_n(x), & \lambda = 0,\ x \in E;\\[1ex]
\displaystyle\lim_{\alpha\to\infty} \alpha R(\alpha) H_A g_n(x) = H_A g_n(x), & \lambda = \infty,\ x \in E.
\end{cases}
\tag{9.258}
\]

Then the sequence (G_n : n ∈ N) defined in (9.258) consists of continuous functions which converge pointwise on [0, ∞] × E to zero. By Dini's lemma this convergence is uniform on [0, ∞] × K, where K is any compact subset of E. It follows that for all compact subsets K of E

\[
\sup_{x\in K}\ \sup_{\lambda>0} \lambda R(\lambda) H_A g_n(x) \quad\text{decreases to } 0 \text{ as } n \text{ tends to } \infty.
\]

By Corollary 1.19 in Chapter 1 it follows that such a family is T_β-equi-continuous. The inequality in (9.254) is a consequence of this equi-continuity. For more details see §1.1.
(b) For u ∈ N(L) and λ > 0 we have λR(λ)u = u, and for v ∈ D(L) we have λR(λ)Lv = λ(λR(λ) − I)v. It follows that lim_{λ↓0} λR(λ)(u + Lv) = u uniformly on E. By assumption (9.255) we then see that lim_{λ↓0} λR(λ)H_A g(x) exists uniformly on compact subsets of E. This shows assertion (b).
(c) Let K be a compact subset of E. By assertion (a) there exists a function v ∈ H([0, ∞) × E) such that (9.254) is satisfied. In particular it follows that

\[
\sup_{x\in K}\ \sup_{\lambda>0} \bigl|\lambda R(\lambda) H_A g(x)\bigr| \le \sup_{(s,x)\in[0,\infty)\times E} |v(s,x)\, g(s)|
\tag{9.259}
\]

for all functions g ∈ C_b([0, ∞)). In particular we may choose continuous functions g_m satisfying 1_{[m,∞)} ≤ g_m ≤ 1_{[m−1,∞)}. Then by (9.259) we have for x ∈ K and λ > 0

\[
\lambda R(\lambda) P_{(\cdot)}[\tau_A > m](x) \le \lambda R(\lambda) E_{(\cdot)}[g_m(\tau_A)](x)
\le \sup_{(s,y)\in[0,\infty)\times E} |v(s,y)\, g_m(s)|.
\tag{9.260}
\]

From the properties of the functions v and g_m it follows that

\[
\inf_{m\in\mathbb{N}}\ \sup_{x\in K}\ \sup_{\lambda>0} \lambda R(\lambda) P_{(\cdot)}[\tau_A > m](x) = 0.
\tag{9.261}
\]

Contrary to the assertion in (c), assume that for some x ∈ E and some h > 0 we have lim_{λ↓0} λR(λ)P_{(·)}[τ_A ≤ h](x) = 0. Then by the Markov property we also have

\[
\begin{aligned}
\lambda R(\lambda) P_{(\cdot)}[h < \tau_A \le 2h](x)
&= \lambda \int_0^\infty e^{-\lambda s}\, e^{sL} P_{(\cdot)}[\tau_A\circ\vartheta_h \le h,\ \tau_A > h](x)\,ds \\
&= \lambda \int_0^\infty e^{-\lambda s}\, E_x\bigl[E_{X(s)}\bigl[P_{X(h)}[\tau_A \le h],\ \tau_A > h\bigr]\bigr]\,ds \\
&\le \lambda \int_0^\infty e^{-\lambda s}\, E_x\bigl[E_{X(s)}\bigl[P_{X(h)}[\tau_A \le h]\bigr]\bigr]\,ds \\
&= \lambda \int_0^\infty e^{-\lambda s}\, E_x\bigl[P_{X(s+h)}[\tau_A \le h]\bigr]\,ds \\
&= \lambda e^{\lambda h} \int_h^\infty e^{-\lambda s}\, E_x\bigl[P_{X(s)}[\tau_A \le h]\bigr]\,ds.
\end{aligned}
\tag{9.262}
\]

Hence by (9.262) and by assumption we see that

\[
\lim_{\lambda\downarrow 0} \lambda R(\lambda) P_{(\cdot)}[h < \tau_A \le 2h](x) = 0.
\]

Consequently, we obtain

\[
\lim_{\lambda\downarrow 0} \lambda R(\lambda) P_{(\cdot)}[\tau_A \le m](x) = 0 \quad\text{for all } m > 0.
\tag{9.263}
\]

Since the set A is recurrent we have for m ∈ N, m > 0,

\[
1 = \lambda R(\lambda) P_{(\cdot)}[\tau_A < \infty](x)
= \lambda R(\lambda) P_{(\cdot)}[\tau_A \le m](x) + \lambda R(\lambda) P_{(\cdot)}[\infty > \tau_A > m](x).
\tag{9.264}
\]

The second term on the right-hand side of (9.264) converges to 0 uniformly in λ > 0 as m → ∞. For every fixed m the first term on the right-hand side of (9.264) converges to 0 as λ ↓ 0: see (9.263). These two observations contradict the equality in (9.264). It follows that for every x ∈ E and every h > 0 we have lim_{λ↓0} λR(λ)P_{(·)}[τ_A ≤ h](x) > 0.
Altogether this completes the proof of Lemma 9.42.
Corollary 9.43. Let the hypotheses and notation be as in Theorem 9.36. Suppose that there exists a recurrent subset A such that sup_{x∈A} E_x[h + τ_A ∘ ϑ_h] < ∞ for some h > 0. Then the invariant measure constructed in the proof of Theorem 9.36 is finite. Here τ_A is the first hitting time of the subset A.

Proof. Corollary 9.43 follows from inequality (9.188) in the proof of Theorem 9.36 with f = 1: see Definition 8.19 as well.

Suppose that x ∈ A^r. Then the limits in (9.92) are in fact suprema, provided the numbers h are taken of the form 2^{−n}h₀, h₀ > 0 fixed, and n → ∞. Moreover, the expression in (9.92) vanishes for x ∈ A.
Proposition 9.44. Let the (embedded) Markov chain (X(n) : n ∈ N) be Har-
ris recurrent. Then the strict closure, i.e. the Tβ -closure, of R(L) + R1 co-
incides with Cb (E). If, in addition, the family {λR(λ) : λ > 0} is Tβ -equi-
continuous, then the chain (X(n) : n ∈ N) is positive Harris recurrent.

Proof. Let µ = µ₂ − µ₁ be a difference of positive Borel measures such that ∫ Lf dµ = 0 for all f ∈ D(L), and such that ∫ 1 dµ = 0. Then ∫ 1 dµ₁ = ∫ 1 dµ₂, and since the chain (X(n) : n ∈ N) is Harris recurrent we know that

\[
\lim_{\lambda\downarrow 0} \Bigl(\int \lambda R(\lambda) f\,d\mu_2 - \int \lambda R(\lambda) f\,d\mu_1\Bigr) = 0.
\tag{9.265}
\]

Since ∫ Lg dµ = 0 for all g ∈ D(L), we have

\[
\int f\,d\mu = \int f\,d\mu_2 - \int f\,d\mu_1
= \lim_{\lambda\downarrow 0} \Bigl(\int \lambda R(\lambda) f\,d\mu_2 - \int \lambda R(\lambda) f\,d\mu_1\Bigr) = 0.
\tag{9.266}
\]

From (9.266) we conclude that µ = 0. From the Hahn-Banach theorem it then follows that R(L) + R1 is T_β-dense in C_b(E).
If, in addition, the family {λR(λ) : λ > 0} is T_β-equi-continuous, then we define the invariant measure π by

\[
\int f\,d\pi = \lim_{\lambda\downarrow 0} \lambda R(\lambda) f(x_0).
\tag{9.267}
\]

The limit in (9.267) exists for f ∈ LD(L) + R1, and for f ∈ R(L) it vanishes. Since the chain (X(n) : n ∈ N) is Harris recurrent we know that the limit in (9.267) does not depend on the choice of x₀. Since the family {λR(λ) : λ > 0} is T_β-equi-continuous, the limit in (9.267) also exists for all f in C_b(E), which is the T_β-closure of R(L) + R1. In addition, this limit defines a probability measure on E, again by this T_β-equi-continuity.
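The recipe (9.267) can be made concrete for an irreducible chain on a finite state space: with R(λ) = (λI − Q)^{-1} the vector λR(λ)f converges, as λ ↓ 0, to the constant vector with value π(f), independently of the starting state x₀. The following sketch is an illustrative assumption-laden toy, not part of the text; the chain and the function f are arbitrary.

```python
import numpy as np

# pi(f) = lim_{lam -> 0} lam R(lam) f(x0) with R(lam) = (lam I - Q)^{-1}:
# every starting state x0 gives the same value, the invariant mean of f.
rng = np.random.default_rng(2)
Q = rng.uniform(0.5, 2.0, (5, 5))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

# invariant probability pi: left null vector of Q, normalized
M = np.vstack([Q.T, np.ones(5)])
pi = np.linalg.lstsq(M, np.r_[np.zeros(5), 1.0], rcond=None)[0]

f = rng.uniform(0.0, 1.0, 5)
for lam in [1e-2, 1e-4, 1e-6]:
    approx = lam * np.linalg.solve(lam * np.eye(5) - Q, f)
    # all coordinates (starting states) agree with pi(f) up to O(lam)
    assert np.allclose(approx, pi @ f, atol=100 * lam)
```

The independence of x₀ is exactly what Harris recurrence delivers in the general Polish-space setting above.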

Proposition 9.45. Let the hypotheses and notation be as in Proposition 9.21. Let f belong to the domain of L, and suppose that

\[
L f(x) = L E_{(\cdot)}\bigl[f(X(\tau_A))\bigr](x), \qquad x \in A^r.
\tag{9.268}
\]

Then the function x ↦ E_x[e^{−λτ_A} f(X(τ_A))] belongs to the pointwise domain of L, and the following equality holds:

\[
(\lambda I - L)\, E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr](x) = 0, \qquad x \in E \setminus A^r,
\tag{9.269}
\]

and

\[
\lim_{\lambda\downarrow 0} \Bigl\{(\lambda I - L)\, E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr](x) - (\lambda I - L) f(x)\Bigr\} = 0, \qquad x \in A^r.
\tag{9.270}
\]

Note: neither equality (9.268) nor (9.270) is automatically satisfied. In fact condition (9.268) is a kind of Wentzell type boundary condition. Let f ∈ C_b(E) be such that (9.268) is satisfied. Then the function x ↦ H_A(0)f(x) = E_x[f(X(τ_A))] is L-harmonic on E \ A^r, it coincides with f on A^r, and in addition the functions Lf and LH_A(0)f coincide on the same set. We introduce the Wentzell subspace D(L_A^W) of D(L) by:

\[
D\bigl(L_A^W\bigr) = \bigl\{f \in D(L) : f \text{ satisfies equality (9.268)}\bigr\}.
\tag{9.271}
\]

Proof. First we observe that for x ∈ E \ A^r and g in the pointwise domain of L we have

\[
(L - L_A) g(x)
= \lim_{t\downarrow 0} \Bigl\{\frac{E_x[g(X(t))] - g(x)}{t} - \frac{E_x[g(X(t)),\ \tau_A > t] - g(x)}{t}\Bigr\}
= \lim_{t\downarrow 0} \frac{E_x[g(X(t)),\ \tau_A \le t]}{t} = 0,
\tag{9.272}
\]

where in the final step of (9.272) we employed Lemma 9.13. We also have

\[
\begin{aligned}
(\lambda I - L_A) H_A(\lambda) f(x)
&= (\lambda I - L_A)\, E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr](x) \\
&= \lim_{\delta\downarrow 0} \frac{E_x\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr] - E_x\bigl[e^{-\delta\lambda}\, E_{X(\delta)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr],\ \tau_A > \delta\bigr]}{\delta}
\end{aligned}
\]

(employ the Markov property)

\[
= \lim_{\delta\downarrow 0} \frac{E_x\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr] - E_x\bigl[e^{-\delta\lambda - \lambda\tau_A\circ\vartheta_\delta}\, f(X(\delta + \tau_A\circ\vartheta_\delta)),\ \tau_A > \delta\bigr]}{\delta}
\]

(on the event {τ_A > δ} the equality δ + τ_A ∘ ϑ_δ = τ_A holds)

\[
= \lim_{\delta\downarrow 0} \frac{E_x\bigl[e^{-\lambda\tau_A} f(X(\tau_A)),\ \tau_A \le \delta\bigr]}{\delta} = 0, \qquad\text{for } x \in E \setminus A^r.
\tag{9.273}
\]

In the final step of (9.273) we again used Lemma 9.13. An application of (9.272) and (9.273) to the function g(x) = H_A(λ)f(x) shows the validity of (9.269) for x ∈ E \ A^r.
Next we treat the (important) case that x ∈ A^r. Since the process t ↦ e^{−λt} f(X(t)) − f(X(0)) + ∫_0^t e^{−λs} (λI − L)f(X(s)) ds is a P_x-martingale, we get

\[
\begin{aligned}
E_x\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr] - f(x)
&= E_x\bigl[e^{-\lambda\tau_A} f(X(\tau_A)) - f(X(0))\bigr] \\
&= -E_x\Bigl[\int_0^{\tau_A} e^{-\lambda s} (\lambda I - L) f(X(s))\,ds\Bigr] \\
&= -\int_0^\infty e^{-\lambda s}\, E_x\bigl[(\lambda I - L) f(X(s)),\ \tau_A > s\bigr]\,ds \\
&= -R_A(\lambda)(\lambda I - L) f(x).
\end{aligned}
\tag{9.274}
\]

We also have:

\[
(\lambda I - L)\, E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A)) - f(X(0))\bigr](x)
= \bigl(L f(x) - L E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A))\bigr](x)\bigr)\, P_x[\tau_A = 0].
\tag{9.275}
\]

In (9.275) we let λ tend to zero to obtain:

\[
\lim_{\lambda\downarrow 0} (\lambda I - L)\, E_{(\cdot)}\bigl[e^{-\lambda\tau_A} f(X(\tau_A)) - f(X(0))\bigr](x)
= \bigl(L f(x) - L E_{(\cdot)}\bigl[f(X(\tau_A))\bigr](x)\bigr)\, P_x[\tau_A = 0].
\tag{9.276}
\]

Here we use the fact that the subset A is recurrent, i.e. P_x[τ_A < ∞] = 1, x ∈ E. So the following equality remains to be shown:

\[
L f(x) = L E_{(\cdot)}\bigl[f(X(\tau_A))\bigr](x), \qquad x \in A^r.
\]

However, this is assumption (9.268).
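For a finite-state chain the key identity (9.274), E_x[e^{−λτ_A} f(X(τ_A))] − f(x) = −R_A(λ)(λI − L)f(x), is pure linear algebra: with L = Q, A = {0} and B = E \ A, both sides reduce to solves with the matrix λI − Q_BB. The check below is a hypothetical 4-state illustration; the generator and f are arbitrary choices.

```python
import numpy as np

# Identity (9.274) on a finite state space: L = Q, A = {0}, B = {1, 2, 3}.
rng = np.random.default_rng(3)
Q = rng.uniform(0.5, 2.0, (4, 4))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

A, B = [0], [1, 2, 3]
f = rng.uniform(0.0, 1.0, 4)
lam = 0.7
K = lam * np.eye(3) - Q[np.ix_(B, B)]

# H_A(lam)f(x) = E_x[e^{-lam tau_A} f(X(tau_A))] solves K h = Q_BA f_A on B
h = np.linalg.solve(K, Q[np.ix_(B, A)] @ f[A])
# R_A(lam)g on B is K^{-1} g_B with g = (lam I - L) f
g = lam * f - Q @ f
assert np.allclose(h - f[B], -np.linalg.solve(K, g[B]))
```

The assertion holds exactly (up to rounding) because both sides are the same rational function of the matrix entries.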

9.4 A proof of Orey’s theorem


In this section we will prove Orey’s theorem as formulated in Theorem 9.4. We
will employ the formulas (9.27) and (9.28). First we will define an accessible
atom.

Definition 9.46. Let

{(Ω, F, P), (X(n), n ∈ N), (ϑ_n, n ∈ N), (E, E)}

be a time-homogeneous Markov process with a Polish state space E. Let (x, B) ↦ P(x, B) be the corresponding probability transition function. A Borel subset A is called an atom if the measure B ↦ P(x, B), x ∈ A, does not depend on x ∈ A. It is called an accessible atom if it is an atom such that P(x, A) > 0, x ∈ A.

Lemma 9.47. Let A be an accessible atom and let x₁ and x₂ belong to A. Then the measures P_{x₁} and P_{x₂} coincide.

Proof. Let F_n be a stochastic variable of the form F_n = ∏_{j=1}^n f_j(X(j)), where the functions f_j : E → R, 1 ≤ j ≤ n, are bounded non-negative Borel functions. By the monotone class theorem it suffices to prove the equality E_{x₁}[F_n] = E_{x₂}[F_n]. We will prove this equality by induction with respect to n. For n = 1, the equality E_{x₁}[F₁] = E_{x₂}[F₁] follows from the definition of atom:

\[
E_{x_1}[F_1] = \int_0^\infty P(x_1, \{f_1 \ge \xi\})\,d\xi
= \int_0^\infty P(x_2, \{f_1 \ge \xi\})\,d\xi = E_{x_2}[F_1].
\tag{9.277}
\]

Next we consider

\[
E_{x_1}[F_{n+1}] = E_{x_1}\bigl[F_n\, E_{x_1}\bigl[f_{n+1}(X(n+1)) \mid \mathcal{F}_n\bigr]\bigr]
\]

(Markov property)

\[
= E_{x_1}\bigl[F_n\, E_{X(n)}[f_{n+1}(X(1))]\bigr]
\]

(induction hypothesis)

\[
= E_{x_2}\bigl[F_n\, E_{X(n)}[f_{n+1}(X(1))]\bigr]
\]

(once again the Markov property)

\[
= E_{x_2}[F_{n+1}].
\tag{9.278}
\]

So from (9.277) and (9.278) the statement in Lemma 9.47 follows.
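For a chain on a finite state space an atom is simply a set of states whose rows of the transition matrix coincide; all n-step distributions started inside the atom then agree, which is the finite-dimensional content of Lemma 9.47. The matrix below is an arbitrary toy example, not taken from the text.

```python
import numpy as np

# Rows 0 and 1 of P are identical, so A = {0, 1} is an atom; it is accessible
# since P(x, A) = 0.2 + 0.5 = 0.7 > 0 for x in A.
P = np.array([[0.2, 0.5, 0.3],
              [0.2, 0.5, 0.3],
              [0.6, 0.1, 0.3]])

Pn = np.eye(3)
for _ in range(10):
    Pn = Pn @ P
    # the laws of X(n) under P_{x1} and P_{x2}, x1, x2 in A, coincide
    assert np.allclose(Pn[0], Pn[1])
```

The induction in the proof above is exactly the statement that this equality of one-step rows propagates to all finite-dimensional distributions.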

Let A be an atom. Then we write E_A[F] = E_x[F], x ∈ A. A similar notation is in vogue for P_A. From (9.28) together with Lemma 9.47 we deduce the equality (x ∈ E, f ∈ L^∞(E, E)):

\[
E_x[f(X(n))]
= E_x\bigl[f(X(n)),\ \tau_A^1 \ge n\bigr]
+ \sum_{j=1}^{n-1} \sum_{k=1}^{j} P_x\bigl[\tau_A^1 = k\bigr]\, P_A[X(j-k) \in A]\, E_A\bigl[f(X(n-j)),\ \tau_A^1 \ge n-j\bigr].
\tag{9.279}
\]

In addition we have

\[
\begin{aligned}
\sum_{j=1}^{n-1} P_A[X(j) \in A]\, E_A\bigl[f(X(n-j)),\ \tau_A^1 \ge n-j\bigr]
&= \sum_{j=1}^{n-1} E_A\bigl[E_A\bigl[f(X(n-j)),\ \tau_A^1 \ge n-j\bigr],\ X(j) \in A\bigr] \\
&= \sum_{j=1}^{n-1} E_A\bigl[E_{X(j)}\bigl[f(X(n-j)),\ \tau_A^1 \ge n-j\bigr],\ X(j) \in A\bigr]
\end{aligned}
\]

(Markov property)

\[
= \sum_{j=1}^{n-1} E_A\bigl[f(X(n)),\ j + \tau_A^1\circ\vartheta_j \ge n,\ X(j) \in A\bigr]
= E_A[f(X(n))].
\tag{9.280}
\]

Put

\[
\begin{aligned}
a_x(k) &= P_x\bigl[\tau_A^1 = k\bigr], & u_A(k) &= P_A[X(k) \in A],\\
p_{A,f}(k) &= E_A\bigl[f(X(k)),\ \tau_A^1 = k\bigr], & \overline{p}_{A,f}(k) &= E_A\bigl[f(X(k)),\ \tau_A^1 \ge k\bigr].
\end{aligned}
\tag{9.281}
\]

From (9.279), (9.280), and (9.281) we infer

\[
E_x[f(X(n))] - E_A[f(X(n))]
= E_x\bigl[f(X(n)),\ \tau_A^1 \ge n\bigr] + (a_x * u_A - u_A) * \overline{p}_{A,f}(n-1).
\tag{9.282}
\]

Definition 9.48. Let n ↦ p(n) be a probability distribution on N \ {0}. Define the function u : N ∪ {−1} → [0, 1] as in (9.294) in Theorem 9.53 below. Then the function u is called the renewal function of the distribution p.

The following proposition says that the function u_A is the renewal function corresponding to the distribution p_{A,1}.

Proposition 9.49. Let the functions n ↦ p_{A,1}(n) and n ↦ u_A(n) be defined as in (9.281). Then the function u_A is the renewal function corresponding to the distribution p_{A,1}.

Proof. To see this we introduce the hitting times τ_A^k, k a positive integer, as follows: τ_A^{k+1} = inf{ℓ > τ_A^k : X(ℓ) ∈ A}, with τ_A^0 = 0. Then it is easy to show that τ_A^{k₁+k₂} = τ_A^{k₁} + τ_A^{k₂} ∘ ϑ_{τ_A^{k₁}}. Moreover, by the strong Markov property the variables τ_A^{k+1} − τ_A^k = τ_A^1 ∘ ϑ_{τ_A^k} are identically P_A-distributed and P_A-independent. Then the following identities hold:

\[
\begin{aligned}
u_A(n) = P_A[X(n) \in A]
&= \sum_{k=1}^\infty P_A\bigl[\tau_A^k = n\bigr]
= \sum_{k=1}^\infty P_A\Bigl[\sum_{j=0}^{k-1} \bigl(\tau_A^{j+1} - \tau_A^{j}\bigr) = n\Bigr] \\
&= \sum_{k=1}^\infty P_A\Bigl[\sum_{j=0}^{k-1} \tau_A^1\circ\vartheta_{\tau_A^{j}} = n\Bigr]
= \sum_{k=1}^\infty p_{A,1}^{*k}(n).
\end{aligned}
\tag{9.283}
\]

From (9.283) we see that the sequences p_{A,1}(n) and u_A(n) are related in the same way as the sequences p(n) and u(n) in (9.294) of Theorem 9.53 below.
This completes the proof of Proposition 9.49.

Under appropriate conditions we will prove that every term on the right-hand side of (9.282) tends to 0 as n → ∞. In order to obtain such a result we will use some renewal theory together with a coupling argument. Suppose that the atom A is recurrent and that the distribution p(n) = p_{A,1}(n) = P_A[τ_A^1 = n] is aperiodic, i.e. it satisfies (9.284). Then the right-hand side of (9.282) converges to zero as n → ∞. This result is a consequence of Theorem 9.53 below.
We need the following lemma.
Lemma 9.50. Let a, b and p be probability distributions on N. Suppose that p(0) = 0 and that p is aperiodic, i.e. suppose

\[
\mathrm{g.c.d.}\{n \ge 1,\ n \in \mathbb{N},\ p(n) > 0\} = 1.
\tag{9.284}
\]

Let {S₀, S₁, S₂, ...} and {S₀′, S₁′, S₂′, ...} be sequences of positive integer valued random variables with the following properties:
(a) Each random variable S_j and S_j′, j ≥ 1, has the same distribution p(k).
(b) The variables S₀ and S₀′ are independent: S₀ has distribution a(k), and S₀′ has distribution b(k).
(c) The variables {S₀, S₁, S₂, ...} are mutually independent, and the same is true for the sequence {S₀′, S₁′, S₂′, ...}.
(d) The variables S_j and S_k′ are independent for all j and k ∈ N.
Let G_n be the σ-field generated by the couples {(S₀, S₀′), ..., (S_n, S_n′)}, and let T(n) be the G_n-stopping time defined by

\[
T(n) = \inf\Bigl\{m \ge 0 : \sum_{j=0}^m S_j \ge n+1 \ \text{and}\ \sum_{j=0}^m S_j' \ge n+1\Bigr\}.
\tag{9.285}
\]

Let n ↦ V⁺(n) = (V_a⁺(n), V_b⁺(n)) be the bivariate linked forward recurrence time chain which links the processes n ↦ V_a⁺(n) = Σ_{j=0}^{T(n)} S_j − n and n ↦ V_b⁺(n) = Σ_{j=0}^{T(n)} S_j′ − n. Then the process n ↦ V⁺(n) satisfies:

\[
V^+(n+1) =
\begin{cases}
V^+(n) - (1,1), & \text{on } \{V_a^+(n) \ge 2\} \cap \{V_b^+(n) \ge 2\},\\
\bigl(S_{1+T(n)},\ V_b^+(n) - 1\bigr), & \text{on } \{V_a^+(n) = 1\} \cap \{V_b^+(n) \ge 2\},\\
\bigl(V_a^+(n) - 1,\ S_{1+T(n)}'\bigr), & \text{on } \{V_a^+(n) \ge 2\} \cap \{V_b^+(n) = 1\},\\
\bigl(S_{1+T(n)},\ S_{1+T(n)}'\bigr), & \text{on } \{V_a^+(n) = 1\} \cap \{V_b^+(n) = 1\}.
\end{cases}
\tag{9.286}
\]

Let P((i,j),(k,ℓ)), ((i,j),(k,ℓ)) ∈ (N \ {0})² × (N \ {0})², be the probability transition function of the process n ↦ V⁺(n). Then P((i,j),(k,ℓ)) is given by

\[
\begin{aligned}
P((i,j),(i-1,j-1)) &= 1, & i > 1,\ j > 1;\\
P((1,j),(k,j-1)) &= p(k), & k \ge 1,\ j > 1;\\
P((i,1),(i-1,k)) &= p(k), & i > 1,\ k \ge 1;\\
P((1,1),(i,j)) &= p(i)\,p(j), & i \ge 1,\ j \ge 1,
\end{aligned}
\tag{9.287}
\]

and the other transitions vanish. Put

\[
\tau_{1,1} = \inf\bigl\{n \in \mathbb{N} : V^+(n) = (1,1)\bigr\}.
\tag{9.288}
\]

Then P[τ_{1,1} < ∞] = 1, and the following coupling equalities hold:

\[
\sum_{j=0}^{T(\tau_{1,1})} S_j = \sum_{j=0}^{T(\tau_{1,1})} S_j' = \tau_{1,1} + 1.
\tag{9.289}
\]

As a consequence we have the following proposition.


Proposition 9.51. Put X*(n) = Σ_{j=0}^n S_j − Σ_{j=0}^n S_j′. The equality in (9.289) says that the process n ↦ X*(n) returns to zero in a finite time τ* = T(τ_{1,1}) with P-probability 1, no matter what its initial distribution is. In other words the process n ↦ X*(n) is recurrent.
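The content of Lemma 9.50 and Proposition 9.51, namely that two independent renewal sequences with a common aperiodic increment law meet at a common renewal epoch in finite time, can be probed by simulation. The increment law p and the delay laws in the sketch below are arbitrary illustrative choices, not data from the text.

```python
import numpy as np

# Two independent delayed renewal sequences with common aperiodic increment
# law p on {1, 2, 3}; a simultaneous renewal epoch is the coupling event.
rng = np.random.default_rng(4)
support, probs = np.array([1, 2, 3]), np.array([0.2, 0.5, 0.3])  # gcd = 1

def renewal_epochs(first, n_steps):
    # the set {S_0 + ... + S_j : j <= n_steps} of renewal epochs
    return set(np.cumsum(np.r_[first, rng.choice(support, n_steps, p=probs)]).tolist())

for _ in range(200):
    ea = renewal_epochs(rng.integers(1, 5), 500)   # delay S_0 ~ a
    eb = renewal_epochs(rng.integers(1, 5), 500)   # delay S_0' ~ b
    assert ea & eb, "renewal processes failed to couple"
```

With aperiodic increments the density of common epochs is strictly positive, so over a horizon of a few hundred steps a common epoch is found in every run with overwhelming probability.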

Proof (of Lemma 9.50). Fix (i, j) ∈ (N \ {0}) × (N \ {0}), and choose M ∈ N so large that

\[
\mathrm{g.c.d.}\{M \ge n \ge 1,\ n \in \mathbb{N},\ p(n) > 0\} = 1.
\tag{9.290}
\]

A number M for which (9.290) holds can be found using Bézout's identity. Suppose that the distribution n ↦ p(n) has period d. Then there exist positive integers s_j ≥ 1, 1 ≤ j ≤ N, such that p(s_j) > 0 and such that g.c.d.(s₁, ..., s_N) = d. Then there exist integers k_j, 1 ≤ j ≤ N, such that Σ_{j=1}^N k_j s_j = d. By renumbering we may assume that k_j ≥ 1 for 1 ≤ j ≤ N₁, and k_j ≤ −1 for N₁ + 1 ≤ j ≤ N. Then we choose M ≥ Σ_{j=1}^{N₁} s_j.
In fact one may consider the smallest integer k ≥ 1 which can be written as k = Σ_{j=1}^N k_j s_j, where N ∈ N, k_j ∈ Z, and p(s_j) > 0. Then one proves k = d by using the fact that Z is a Euclidean domain. More precisely, let k ≥ 1 be the smallest positive integer which can be written as k = Σ_{j=1}^N k_j s_j. Write s_j = q_j k + r_j with 0 ≤ r_j < k and q_j ≥ 0. Then r_j = s_j − q_j k ∈ R = {Σ_{j=1}^N ℓ_j s_j : ℓ_j ∈ Z}. Since 0 ≤ r_j < k and k is the smallest positive element of R, we infer r_j = 0. It follows that k is a divisor of s_j, 1 ≤ j ≤ N. Since d ∈ R, d divides k. Since, in addition, g.c.d.(s₁, ..., s_N) = d, we infer k = d. So we obtain Bézout's identity: d = Σ_{j=1}^N k_j s_j for certain positive integers s_j with p(s_j) > 0 and certain integers k_j, 1 ≤ j ≤ N.
If the sequence {s_j : p(s_j) > 0} is aperiodic, then we choose d = 1 in the above remarks.
Fix (i₀, j₀) ∈ (N \ {0}) × (N \ {0}), and choose M so large that (i₀, j₀) belongs to the square {1, ..., M} × {1, ..., M}, and that M ≥ Σ_{j=1}^{N₁} k_j s_j, where 1 = Σ_{j=1}^{N₁} k_j s_j − Σ_{j=N₁+1}^{N} (−k_j) s_j with k_j ≥ 1, 1 ≤ j ≤ N₁, and −k_j ≥ 1, N₁ + 1 ≤ j ≤ N, in Bézout's identity. Consider the paths in the square {1, ..., M} × {1, ..., M} along which each one-step transition has strictly positive probability: such a transition probability is either 1 (along a diagonal from north-east to south-west) or some p(k) > 0, leading from a point on one of the "edges" {(1, j) : 1 ≤ j ≤ M} or {(i, 1) : 1 ≤ i ≤ M} of the square to the horizontal line {(k, j − 1) : 1 ≤ k ≤ M} or the vertical line {(i − 1, k) : 1 ≤ k ≤ M}, respectively. Let τ_{1,1} be defined as in (9.288) with S₀ having distribution δ_{i₀} and S₀′ having distribution δ_{j₀}. By (9.290) P-almost all paths pass through (1, 1) after a finite time passage, and consequently we obtain

\[
\begin{aligned}
&\lim_{n\to\infty}\ \lim_{N'\to\infty} P\bigl[\cup_{k=1}^n \bigl\{V^+(k) = (1,1)\bigr\} \,\big|\, S_j \le M,\ S_j' \le M,\ 0 \le j \le N'\bigr] \\
&= \lim_{n\to\infty}\ \lim_{N'\to\infty} P\bigl[\cup_{k=1}^n \{\tau_{1,1} = k\} \,\big|\, S_j \le M,\ S_j' \le M,\ 0 \le j \le N'\bigr] = 1.
\end{aligned}
\tag{9.291}
\]

Notice that the limit in (9.291), as N′ tends to ∞, can be interpreted as the construction of the measure P conditioned on the event

\[
\bigcap_{j=0}^\infty \bigl\{S_j \le M,\ S_j' \le M\bigr\}.
\]

The existence of this "conditional probability" follows from Kolmogorov's extension theorem in conjunction with the assumption that for each 0 ≤ j₁ < j₂, (j₁, j₂) ∈ N × N, the pairs (S_{j₁}, S_{j₁}′) and (S_{j₂}, S_{j₂}′) are P-independent. The collection of bounded paths along which the process V⁺(n) moves with strictly positive probability and which miss the diagonal throughout their lifetime eventually dies out, i.e. this event is negligible. The reason for this is that at each time step the transition probability of such a path is either 1 or else one of the quantities p(s_j), 1 ≤ j ≤ N, where N is the number occurring in Bézout's identity, and that the transition probabilities different from one occur infinitely often. The P-negligibility then follows from the theorem of dominated convergence. The other paths end up in (1, 1) in finite time. In (9.291) we let M tend to ∞ to obtain P[τ_{1,1} < ∞] = 1. But then we see

\[
P_{i_0,j_0}\bigl[\cup_{n=1}^\infty \bigl\{V^+(n) = (1,1)\bigr\}\bigr] = 1,
\tag{9.292}
\]

where P_{i₀,j₀}[A] = P[A | (S₀, S₀′) = (i₀, j₀)], A ∈ F. Since the pair (i₀, j₀) ∈ (N \ {0}) × (N \ {0}) is arbitrary, from (9.292) we get

\[
\sum_{i,j} a(i)\, b(j)\, P_{i,j}\bigl[\cup_{n=1}^\infty \bigl\{V^+(n) = (1,1)\bigr\}\bigr] = 1.
\tag{9.293}
\]

If now τ_{1,1} is defined as in (9.288), then (9.293) implies P[τ_{1,1} < ∞] = 1.
This completes the proof of Lemma 9.50.

Remark 9.52. For the equality in (9.291) see the argument in §10.3.1 of Meyn
and Tweedie [162] as well.

The following result appears as Theorem 18.1.1 in Meyn and Tweedie [162].
Theorem 9.53. Let a, b and p be probability distributions on N, and let u :
N ∪ {−1} → [0, ∞] be the renewal function corresponding to n 7→ p(n), defined
by u(−1) = 0, u(0) = 1, and for n ≥ 1
\[
u(n) = \sum_{j=0}^{\infty} p^{j*}(n)
= \delta_0(n) + p(n)
+ \sum_{j=2}^{\infty}\ \sum_{\substack{k_1,\dots,k_j;\ 0 \le k_i \le n;\\ \sum_{i=1}^{j} k_i = n}} p(k_1)\cdots p(k_j).
\tag{9.294}
\]
Suppose that p is aperiodic, i.e. suppose
\[
\mathrm{g.c.d.}\left\{ n \ge 1,\ n \in \mathbb{N},\ p(n) > 0 \right\} = 1.
\tag{9.295}
\]
Then
\[
\lim_{n\to\infty} \left| a * u(n) - b * u(n) \right| = 0, \quad\text{and}
\tag{9.296}
\]
\[
\lim_{n\to\infty} \left| a * u - b * u \right| * \overline{p}(n) = 0,
\tag{9.297}
\]
where \overline{p}(n) = \sum_{k \ge n+1} p(k).
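Numerically, the renewal function of (9.294) is most easily obtained from the renewal recursion u(n) = Σ_{k≤n} p(k) u(n − k) (valid when p(0) = 0), and the convergence (9.296) can then be observed directly. A minimal sketch in Python; the distributions a, b and p below are illustrative choices, not taken from the text:

```python
# Renewal-theoretic illustration of Theorem 9.53 (all distributions are
# illustrative).  u is computed by the renewal recursion; the difference
# |a*u(n) - b*u(n)| tends to zero because p is aperiodic.
N = 400
p = {2: 0.5, 3: 0.5}        # increment law; gcd{2, 3} = 1 and p(0) = 0
a = {0: 1.0}                # first initial (delay) distribution
b = {1: 0.3, 4: 0.7}        # second initial distribution

u = [0.0] * (N + 1)
u[0] = 1.0
for n in range(1, N + 1):   # u(n) = sum_k p(k) u(n - k), since p(0) = 0
    u[n] = sum(pk * u[n - k] for k, pk in p.items() if k <= n)

def conv(w, n):
    """(w * u)(n) for a finitely supported initial distribution w."""
    return sum(wk * u[n - k] for k, wk in w.items() if k <= n)

early = abs(conv(a, 10) - conv(b, 10))
late = abs(conv(a, N) - conv(b, N))
print(early, late)          # late is essentially zero; both tend to 1/mean(p)
```

Both a ∗ u(n) and b ∗ u(n) converge to 1/mean(p) = 0.4 here, so their difference vanishes, in line with (9.296).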
In the proof of Theorem 9.4 the result in Theorem 9.53 will be applied with
a(k) = P_x[τ_A^1 = k], p(k) = p_{A,1}(k) = P_A[τ_A^1 = k], b(k) = δ_0(k), and, conse-
quently, u(k) = P_A[X(k) ∈ A] = u_A(k). Notice that k 7→ u_A(k) is the renewal
function of the distribution p_{A,1}(k).
We follow the proof of Theorem 18.1.1 in Meyn and Tweedie [162].
Proof. Let {S_0, S_1, S_2, . . .} and {S'_0, S'_1, S'_2, . . .} be sequences of positive
integer-valued random variables with the properties (a), (b), (c) and (d) of Lemma
9.50:
(a) Each random variable S_j, and S'_j, j ≥ 1, has the same distribution p(k).
(b) The variables S_0 and S'_0 are independent: S_0 has distribution a(k), and S'_0
has distribution b(k).
(c) The variables {S_0, S_1, S_2, . . .} are mutually independent, and the same is
true for the sequence {S'_0, S'_1, S'_2, . . .}.
(d) The variables S_j and S'_k are P-independent for all pairs (j, k) ∈ N × N.
We put W_j = S_j − S'_j, and X^*(n) = \sum_{j=0}^{n} (S_j − S'_j). Notice that the variables
W_j and −W_j, j ∈ N, j ≥ 1, have the same distributions. The distribution of
W_0 = S_0 − S'_0 is determined by the distributions a of S_0 and b of S'_0, and the
fact that S_0 and S'_0 are independent. We also introduce the indicator variables
Z_a(n) and Z_b(n), n ∈ N:

\[
Z_a(n) =
\begin{cases}
1 & \text{if } \sum_{i=0}^{j} S_i = n \text{ for some } j \ge 0; \\[4pt]
0 & \text{elsewhere.}
\end{cases}
\tag{9.298}
\]
Hence Z_a(n) = 1_{\bigcup_{j=0}^{\infty} \{\sum_{i=0}^{j} S_i = n\}}. The indicator process Z_b(n) is defined
similarly, but with S'_j instead of S_j. Then P[Z_a(n) = 1] = a ∗ u(n), and
P[Z_b(n) = 1] = b ∗ u(n). The coupling time of the renewal processes is de-
fined by
\[
T_{a,b} = \min\left\{ n \in \mathbb{N} : n \ge 1,\ \sum_{i=0}^{j} S_i = n = \sum_{i=0}^{j} S'_i \text{ for some } j \in \mathbb{N} \right\}.
\tag{9.299}
\]

We also have
\[
T_{a,b} = \min\left\{ \sum_{i=0}^{j} S_i : j \ge 1,\ X^*(j) = 0 \right\}.
\tag{9.300}
\]
Let T^*_{a,b} be defined by T^*_{a,b} = \inf\{j \ge 1 : X^*(j) = 0\}. Then T_{a,b} = \sum_{j=0}^{T^*_{a,b}} S_j =
\sum_{j=0}^{T^*_{a,b}} S'_j. From Proposition 9.51 it follows that the coupling time T_{a,b} is finite
P-almost surely. Based on this property we will prove the equalities in (9.296)
and (9.297). Therefore we put
\[
Z_{a,b}(n) =
\begin{cases}
Z_a(n), & \text{if } n < T_{a,b}; \\
Z_b(n), & \text{if } n \ge T_{a,b}.
\end{cases}
\tag{9.301}
\]

Then we have
\begin{align*}
|a * u(n) - b * u(n)| &= \left| P[Z_a(n) = 1] - P[Z_b(n) = 1] \right| \\
&= \left| P[Z_{a,b}(n) = 1] - P[Z_b(n) = 1] \right| \\
&= \big| P[Z_{a,b}(n) = 1,\ T_{a,b} > n] + P[Z_{a,b}(n) = 1,\ T_{a,b} \le n] \\
&\qquad - P[Z_b(n) = 1,\ T_{a,b} > n] - P[Z_b(n) = 1,\ T_{a,b} \le n] \big| \\
&= \big| P[Z_a(n) = 1,\ T_{a,b} > n] + P[Z_b(n) = 1,\ T_{a,b} \le n] \\
&\qquad - P[Z_b(n) = 1,\ T_{a,b} > n] - P[Z_b(n) = 1,\ T_{a,b} \le n] \big| \\
&\le \max\left( P[Z_a(n) = 1,\ T_{a,b} > n],\ P[Z_b(n) = 1,\ T_{a,b} > n] \right) \\
&\le P[T_{a,b} > n]. \tag{9.302}
\end{align*}

Since P [Ta,b < ∞] = 1, the inequality in (9.302) yields the equality in (9.296).
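The coupling bound (9.302) can also be checked by simulation: run the two renewal sequences side by side, record the first index at which their partial sums agree (the coupling time of (9.299)), and compare the difference of the renewal probabilities with the tail P[T_{a,b} > n]. A rough Monte Carlo sketch with illustrative distributions:

```python
import random
random.seed(7)

def draw(dist):
    """Sample from a finitely supported distribution {value: probability}."""
    r, acc = random.random(), 0.0
    for k, pk in dist.items():
        acc += pk
        if r < acc:
            return k
    return k  # guard against floating-point rounding

p = {1: 0.6, 2: 0.4}        # aperiodic increment law (illustrative)
a = {0: 1.0}                # delay of the first renewal sequence
b = {2: 1.0}                # delay of the second renewal sequence
n, trials = 30, 20000
hits_a = hits_b = tail = 0

for _ in range(trials):
    ca, cb = draw(a), draw(b)      # partial sums of S_j and S'_j (same index j)
    ea, eb = {ca}, {cb}            # renewal epochs of the two sequences
    T = None                       # coupling time T_{a,b} of (9.299)
    while min(ca, cb) <= n:
        if ca == cb and T is None:
            T = ca                 # first common partial sum
        ca += draw(p); ea.add(ca)
        cb += draw(p); eb.add(cb)
    hits_a += n in ea              # the event {Z_a(n) = 1}
    hits_b += n in eb              # the event {Z_b(n) = 1}
    tail += T is None or T > n     # the event {T_{a,b} > n}

diff, tl = abs(hits_a - hits_b) / trials, tail / trials
print(diff, tl)                    # empirically diff <= tl, as in (9.302)
```

Up to Monte Carlo noise, the empirical difference stays below the empirical tail probability, which is exactly the content of (9.302).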
Next we consider the backward recurrence chains V_a^-(n) and V_b^-(n) for the
renewal processes of the sequences {S_0, S_1, S_2, . . .} and {S'_0, S'_1, S'_2, . . .}, defined
respectively by
\[
V_a^-(n) = \min\left\{ n - \sum_{j=0}^{k} S_j : \sum_{j=0}^{k} S_j \le n \right\}
= \min\left\{ n - \sum_{j=0}^{k} S_j : \sum_{j=0}^{k} S_j \le n < \sum_{j=0}^{k+1} S_j \right\},
\]
and
\[
V_b^-(n) = \min\left\{ n - \sum_{j=0}^{k} S'_j : \sum_{j=0}^{k} S'_j \le n \right\}
= \min\left\{ n - \sum_{j=0}^{k} S'_j : \sum_{j=0}^{k} S'_j \le n < \sum_{j=0}^{k+1} S'_j \right\}.
\tag{9.303}
\]
It follows that there exists a random non-negative integer K_a(n) which satisfies
\sum_{j=0}^{K_a(n)} S_j \le n < n + 1 \le \sum_{j=0}^{K_a(n)+1} S_j, and hence V_a^-(n) = n - \sum_{j=0}^{K_a(n)} S_j.
For the moment fix 0 ≤ m ≤ n. Since the variables {S_0, S_1, S_2, . . .} are mutu-
ally independent, S_0 has distribution a(k), and the others have distribution
p(k), we have
\begin{align*}
P\left[ V_a^-(n) = m \right]
&= \sum_{k=0}^{\infty} P\left[ n - \sum_{j=0}^{k} S_j = m,\ \sum_{j=0}^{k+1} S_j \ge n + 1 \right] \\
&= \sum_{k=0}^{\infty} P\left[ \sum_{j=0}^{k} S_j = n - m,\ S_{k+1} \ge m + 1 \right] \\
&= \sum_{k=0}^{\infty} P\left[ \sum_{j=0}^{k} S_j = n - m \right] P\left[ S_{k+1} \ge m + 1 \right] \\
&= \sum_{k=0}^{\infty} a * p^{*k}(n - m)\, \overline{p}(m), \tag{9.304}
\end{align*}
where, with a notation we employed earlier, \overline{p}(m) = \sum_{j=m+1}^{\infty} p(j). Of course,
for V_b^-(n) we have a similar distribution with b instead of a. From (9.304)
and a similar expression for P\left[ V_b^-(n) = m \right] we also infer
\begin{align*}
\sup_{A \subset \mathbb{N}} \left| P\left[ V_a^-(n) \in A \right] - P\left[ V_b^-(n) \in A \right] \right|
&= \frac{1}{2} \sum_{m=0}^{\infty} \left| P\left[ V_a^-(n) = m \right] - P\left[ V_b^-(n) = m \right] \right| \\
&= \frac{1}{2} \sum_{m=0}^{n} \left| a * u(n - m)\overline{p}(m) - b * u(n - m)\overline{p}(m) \right| \\
&= \frac{1}{2} \left| a * u - b * u \right| * \overline{p}(n). \tag{9.305}
\end{align*}
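Formula (9.304) for the law of the backward recurrence time V_a^-(n) can be cross-checked by simulation. The sketch below uses illustrative distributions (with a = δ_0, so that a ∗ u = u) and compares the exact expression u(n − m) p̄(m) with an empirical histogram:

```python
import random
random.seed(1)

def draw(dist):
    """Sample from a finitely supported distribution {value: probability}."""
    r, acc = random.random(), 0.0
    for k, pk in dist.items():
        acc += pk
        if r < acc:
            return k
    return k

p = {1: 0.5, 3: 0.5}            # illustrative increment law, p(0) = 0
a = {0: 1.0}                    # a = delta_0, so a * u = u
n, trials = 25, 50000

# Exact side: u by the renewal recursion, then P[V_a^-(n) = m] = u(n-m) pbar(m).
u = [0.0] * (n + 1)
u[0] = 1.0
for t in range(1, n + 1):
    u[t] = sum(pk * u[t - k] for k, pk in p.items() if k <= t)
pbar = lambda m: sum(pk for k, pk in p.items() if k >= m + 1)
exact = [u[n - m] * pbar(m) for m in range(n + 1)]

# Simulated side: V_a^-(n) = n minus the last renewal epoch <= n.
counts = [0] * (n + 1)
for _ in range(trials):
    t = draw(a)
    last = t
    while t <= n:
        last = t
        t += draw(p)
    counts[n - last] += 1
emp = [c / trials for c in counts]

max_dev = max(abs(e, ) if False else abs(e - s) for e, s in zip(exact, emp))
print(max_dev)                  # small: formula and simulation agree
```

The exact probabilities sum to 1, as they must, and the maximal deviation from the empirical frequencies is of the order of the Monte Carlo error.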
It also follows that on the event A_{a,b}(n) defined by
\[
A_{a,b}(n) = \left\{ T_{a,b} = \sum_{j=0}^{T^*_{a,b}} S_j \le n \right\}
\]
the P-distributions of V_a^-(n) and V_b^-(n) coincide. This is a consequence of
the strong Markov property of the process \left\{ \left( \sum_{i=0}^{j} S_i,\ \sum_{i=0}^{j} S'_i \right) : j \in \mathbb{N} \right\}:

\begin{align*}
&P\left[ V_a^-(n) \in A,\ A_{a,b}(n) \right] \\
&= \sum_{k=0}^{\infty} P\left[ n - \sum_{j=0}^{T^*_{a,b}+k} S_j \in A,\ \sum_{j=0}^{T^*_{a,b}+k} S_j \le n < \sum_{j=0}^{T^*_{a,b}+k+1} S_j \right] \\
&= \sum_{k=0}^{\infty} E\left[ P\left[ n - \sum_{j=0}^{T^*_{a,b}+k} S_j \in A,\ \sum_{j=0}^{T^*_{a,b}+k} S_j \le n < \sum_{j=0}^{T^*_{a,b}+k+1} S_j \,\Big|\, \mathcal{G}_{T^*_{a,b}} \right] \right]
\end{align*}
(strong Markov property together with the definition of T_{a,b} = \sum_{i=0}^{T^*_{a,b}} S_i, and
the fact that the variables S_j and S'_j, j ≥ 1, have the same distribution)
\begin{align*}
&= \sum_{k=0}^{\infty} E\left[ P\left[ n - \sum_{j=0}^{T^*_{a,b}+k} S'_j \in A,\ \sum_{j=0}^{T^*_{a,b}+k} S'_j \le n < \sum_{j=0}^{T^*_{a,b}+k+1} S'_j \,\Big|\, \mathcal{G}_{T^*_{a,b}} \right] \right] \\
&= \sum_{k=0}^{\infty} P\left[ n - \sum_{j=0}^{T^*_{a,b}+k} S'_j \in A,\ \sum_{j=0}^{T^*_{a,b}+k} S'_j \le n < \sum_{j=0}^{T^*_{a,b}+k+1} S'_j \right] \\
&= \sum_{k=0}^{\infty} P\left[ n - \sum_{j=0}^{k} S'_j \in A,\ \sum_{j=0}^{k} S'_j \le n < \sum_{j=0}^{k+1} S'_j,\ \sum_{j=0}^{T^*_{a,b}} S'_j \le n \right] \\
&= P\left[ V_b^-(n) \in A,\ A_{a,b}(n) \right]. \tag{9.306}
\end{align*}
Here we wrote \mathcal{G}_n = \sigma\left( \left( S_j, S'_j \right) : 0 \le j \le n \right), and
\[
\mathcal{G}_{T^*_{a,b}} = \bigcap_{n=0}^{\infty} \left\{ A \in \mathcal{G} : A \cap \left\{ T^*_{a,b} \le n \right\} \in \mathcal{G}_n \right\}.
\]

From (9.306) we infer
\begin{align*}
&\left| P\left[ V_a^-(n) \in A \right] - P\left[ V_b^-(n) \in A \right] \right| \\
&= \big| P\left[ V_a^-(n) \in A,\ A_{a,b}(n) \right] + P\left[ V_a^-(n) \in A,\ \Omega \setminus A_{a,b}(n) \right] \\
&\qquad - P\left[ V_b^-(n) \in A,\ A_{a,b}(n) \right] - P\left[ V_b^-(n) \in A,\ \Omega \setminus A_{a,b}(n) \right] \big| \\
&= \left| P\left[ V_a^-(n) \in A,\ \Omega \setminus A_{a,b}(n) \right] - P\left[ V_b^-(n) \in A,\ \Omega \setminus A_{a,b}(n) \right] \right| \\
&\le P\left[ \Omega \setminus A_{a,b}(n) \right] \le P\left[ \sum_{j=0}^{T^*_{a,b}} S_j \ge n \right]. \tag{9.307}
\end{align*}
From (9.305), (9.306) and (9.307) we deduce
\[
\left| a * u - b * u \right| * \overline{p}(n) \le 2\, P\left[ \sum_{j=0}^{T^*_{a,b}} S_j \ge n \right]. \tag{9.308}
\]

Since by Proposition 9.51 the process X^*(n) is recurrent, and hence
\[
P\left[ T_{a,b} = \sum_{j=0}^{T^*_{a,b}} S_j < \infty \right] = 1,
\]
it follows from (9.308) that \lim_{n\to\infty} |a * u - b * u| * \overline{p}(n) = 0. However, this is
the same as equality (9.297).
This completes the proof of Theorem 9.53.
Before we complete the proof of Theorem 9.4 we insert some definitions which are
taken from [162]. Let {X(n), P_x}_{x∈E} be a Markov chain with the property that
all measures B 7→ P(1, x, B) = P_x[X(1) ∈ B], x ∈ E, are equivalent. Fix
x_0 ∈ E. We say that the Markov chain is recurrent if for all subsets B ∈ E
with P(1, x_0, B) > 0 we have P_x\left[ \tau_B^1 < \infty \right] > 0.
Definition 9.54. A subset C ∈ E is called small if there exists m ∈ N, m ≥ 1,
and a non-trivial positive Borel measure νm such that the inequality

P (m, x, B) ≥ νm (B) (9.309)

holds for x ∈ C and all B ∈ E.


544 9 Miscellaneous topics

The following definition also occurs in formula (9.21).

Definition 9.55. A Markov chain {X(n), P_x}_{n∈N,x∈E} is called aperiodic if
there exists no partition E = (D_0, D_1, . . . , D_{p−1}) of E for some p ≥ 2 such that
for all i ∈ N
\[
P\left[ X(i) \in D_{i \bmod p} \mid X(0) \in D_0 \right]
= \int_{D_0} P_x\left[ X(i) \in D_{i \bmod p} \right] d\mu_0(x) = 1,
\tag{9.310}
\]
for some initial probability distribution μ_0.
A Markov chain {X(n), P_x}_{n∈N,x∈E} having initial distribution μ_0 is called
periodic if there exist p ≥ 2 and a partition E = (D_0, D_1, . . . , D_{p−1}) such
that (9.310) holds. The largest p for which (9.310) holds is called the period
of the Markov chain.
Let {X(n), P_x}_{n∈N,x∈E} be an aperiodic P(1, x_0, ·)-irreducible Markov chain.
If there exists a ν_1-small set A with ν_1(A) > 0, then the Markov chain
{X(n), P_x}_{n∈N,x∈E} is called strongly aperiodic.
The following theorem is proved in [162]: see theorems 5.2.1 and 5.2.2.
Theorem 9.56. Let {X(n), Px }n∈N,x∈E be a P (1, x0 , ·)-irreducible Markov
chain. Then for any A ∈ E with P (1, x0 , A) > 0 there exists m ∈ N, m ≥ 1,
together with a νm -small set C ⊂ A with P (1, x0 , C) > 0 such that νm (C) > 0.

Remark 9.57. Suppose that all measures B 7→ P (1, x, B), B ∈ E, x ∈ E, are


equivalent, then analyzing the proof of Theorem 5.2.1 in [162] shows that in
Theorem 9.56 we may choose m = 3.

The following corollary is an immediate consequence of Definition 9.55 and


Theorem 9.56.
Corollary 9.58. Let {X(n), Px }n∈N,x∈E be a P (1, x0 , ·)-irreducible aperi-
odic Markov chain. Then there exists m ∈ N such that the skeleton chain
{X(mn), Px }n∈N,x∈E is strongly aperiodic, and P (1, x0 , ·)-irreducible.

Remark 9.59. In fact the skeleton chain {X(mn), P_x}_{n∈N,x∈E} is P(m, x_0, ·)-
irreducible, provided that the chain {X(n), P_x}_{n∈N,x∈E} is P(1, x_0, ·)-
irreducible, and all measures of the form B 7→ P(1, x, B), x ∈ E, are equivalent.
Suppose that B ∈ E is such that P(m, x_0, B) = 0. Then
\[
0 = P(m, x_0, B) = \int P(m - 1, x_0, dy)\, P(1, y, B). \tag{9.311}
\]
From (9.311) we see that P(1, y, B) = 0 for P(m − 1, x_0, ·)-almost all y ∈ E.
Since P(m − 1, x_0, E) = 1, it follows that P(1, y, B) = 0 for at least one
y ∈ E. But then P(1, x_0, B) = 0 because all measures of the form B 7→
P(1, x, B), x ∈ E, are equivalent.
The following theorem is a consequence of Proposition 8.17, Lemma 8.23, and
Theorem 9.56.

Theorem 9.60. Let
\[
\left\{ (\Omega, \mathcal{F}, P_x),\ (X(n),\ n \in \mathbb{N}),\ (\vartheta_n,\ n \in \mathbb{N}),\ (E, \mathcal{E}) \right\}
\tag{9.312}
\]
be a Markov chain with the property that all Borel measures B 7→ P(1, x, B) =
P_x[X(1) ∈ B], x ∈ E, are equivalent. In addition suppose that for every Borel
subset B the function x 7→ P(1, x, B) is continuous. Let there exist a point
x_0 ∈ E such that every open neighborhood of x_0 is recurrent. Then there exists
a compact recurrent subset, and all Borel subsets B for which P(1, x_0, B) > 0
are recurrent in the sense that P_x\left[ \tau_B^1 < \infty \right] = 1 for all x ∈ B. If, moreover,
the Markov chain in (9.312) is aperiodic, then there exist an integer m ∈ N,
m ≥ 1, and a compact m-small set A such that ν_m(A) > 0.
Here the measure ν_m satisfies P(m, x, B) ≥ ν_m(B) for all B ∈ E and all
x ∈ A.

Proof. The first two assertions are consequences of respectively Proposition


8.17 and Lemma 8.23. The final assertion is a consequence of Theorem 9.56,
and the fact that Borel measures on a Polish space are inner-regular.

Among other things the following lemma reduces the proof of Orey’s theo-
rem for arbitrary irreducible aperiodic Markov chains to that for arbitrary
irreducible strongly aperiodic Markov chains.
Lemma 9.61. Let μ_1 and μ_2 be probability measures on E. Then the sequence
n 7→ \iint \mathrm{Var}\left( P(n, x, \cdot) - P(n, y, \cdot) \right) d\mu_1(x)\, d\mu_2(y) is monotone decreasing.

Proof. Fix the pair (x, y) ∈ E × E. The expression Var(P(n + 1, x, ·) − P(n + 1, y, ·))
can be rewritten as follows:
\begin{align*}
&\mathrm{Var}\left( P(n+1, x, \cdot) - P(n+1, y, \cdot) \right) \\
&= \sup\left\{ \left| \int \left( P(n+1, x, dz) - P(n+1, y, dz) \right) f(z) \right| : \|f\|_\infty \le 1 \right\} \\
&= \sup\left\{ \left| \iint \left( P(n, x, dw) - P(n, y, dw) \right) P(1, w, dz)\, f(z) \right| : \|f\|_\infty \le 1 \right\}
\end{align*}
(notice that \left| \int P(1, w, dz) f(z) \right| \le \|f\|_\infty, w ∈ E)
\[
\le \mathrm{Var}\left( P(n, x, \cdot) - P(n, y, \cdot) \right). \tag{9.313}
\]
The inequality in (9.313) yields Lemma 9.61.
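For a finite state space, Lemma 9.61 reduces to the statement that the total variation distance between the rows of successive matrix powers is non-increasing; this is easy to verify directly (the stochastic matrix below is an arbitrary illustration):

```python
# Lemma 9.61 for a finite chain: n -> Var(P(n, x, .) - P(n, y, .)) is
# non-increasing.  The 3x3 stochastic matrix is an arbitrary illustration.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.2, 0.4]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def var_dist(row1, row2):
    """Var(mu - nu) = sup_{|f| <= 1} |int f d(mu - nu)| = sum_i |mu_i - nu_i|."""
    return sum(abs(u - v) for u, v in zip(row1, row2))

Pn = P
dists = []
for _ in range(12):
    dists.append(var_dist(Pn[0], Pn[1]))  # rows are P(n, x, .) and P(n, y, .)
    Pn = matmul(Pn, P)
print(dists[0], dists[-1])                # the sequence decreases
```

The monotonicity is exactly the contraction argument of (9.313): composing with one more step of the kernel cannot increase the variation norm.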


We will also use the Nummelin splitting of general (Harris) recurrent chains.
This splitting technique is taken from [162], §5.1 and §17.3.1. With a strongly
aperiodic irreducible chain it associates a split chain with an accessible atom.
Let the Markov chain (9.312) have the properties described in Theorem 9.60.
Then the Markov chain {X(n), P_x}_{n∈N,x∈E} is aperiodic: see Proposition
9.2. From Corollary 9.58 it follows that there exists m ∈ N, m ≥ 1, such
that the skeleton Markov chain {X(mn), P_x}_{n∈N,x∈E} is strongly aperiodic.
Theorem 9.60 yields the existence of a compact recurrent subset C such
that P(1, x_0, C) > 0, together with a probability measure ν on E with
ν(C) = 1, such that the following minorization condition is satisfied for some δ > 0:
\[
P(m, x, B) \ge \delta 1_C(x)\, \nu(B), \quad \text{for all } x \in E \text{ and all } B \in \mathcal{E}.
\tag{9.314}
\]
In the presence of a subset C and a constant m ∈ N such that (9.314) holds
for some probability measure ν with ν(C) = 1, we will construct a split chain
{X̌(n) = (X(mn), Y(n)), P̌_{x,ε}}_{n∈N, x∈E, ε=0 or 1}. The m-step Markov chain
{X(mn), P_x}_{n∈N,x∈E} is strongly aperiodic, and it may be split to form a new
chain with an accessible atom C × {1}. Momentarily we will explain how the
construction of this splitting can be performed.
In order to distinguish the new split Markov chain from the old skeleton
chain we introduce some new notation. We let the sequence of random
variables (Y(n), n ∈ N) attain the values zero and one. The value of
Y(n) indicates the level of the split m-skeleton at time mn. The split chain
{X̌(n) = (X(mn), Y(n)), P̌_{x,ε}}_{n∈N, x∈E, ε=0 or 1} can be described in the fol-
lowing manner. Following Meyn and Tweedie [162] we write {X̌(n) = x_i} =
{X(n) = x, Y(n) = i}, x ∈ E, i = 0 or i = 1. The new state space Ě is given
by Ě = E × {0, 1}; Ě is the Borel field of Ě. The σ-field F̌_{k,ℓ} stands for
\[
\check{\mathcal{F}}_{k,\ell} = \sigma\left( X(j_1),\ Y(j_2) : 0 \le j_1 \le k,\ 0 \le j_2 \le \ell \right).
\]
Let λ be any Borel measure on E; then λ is split as a measure λ^* on Ě in
the following fashion. Let A ∈ E and put A_0 = A × {0}, and A_1 = A × {1}.
Then the marginal measures of λ^* are given by
\[
\left.
\begin{aligned}
\lambda^*(A_0) &= (1 - \delta)\lambda(A \cap C) + \lambda(A \cap (E \setminus C)), \\
\lambda^*(A_1) &= \delta\lambda(A \cap C).
\end{aligned}
\right\}
\tag{9.315}
\]

Notice the equality λ^*(A_0 ∪ A_1) = λ(A), and λ^*(A_0) = λ(A) when A is a
subset of E \ C. In other words only subsets of C are split by this construction.
The splitting of the skeleton {X(nm), P_x}_{n∈N,x∈E} is carried out as follows.
Define the split kernel P̌(m, x_i, A), x_i ∈ Ě, A ∈ Ě, by
\[
\begin{aligned}
\check{P}(m, x_0, \cdot) &= P(m, x, \cdot)^*, && x_0 \in E_0 \setminus C_0; \\
\check{P}(m, x_0, \cdot) &= \frac{P(m, x, \cdot)^* - \delta\nu^*(\cdot)}{1 - \delta}, && x_0 \in C_0; \\
\check{P}(m, x_1, \cdot) &= \nu^*(\cdot), && x_1 \in E_1.
\end{aligned}
\tag{9.316}
\]
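For a finite state space the split kernel (9.316) can be written out explicitly and its marginal behaviour checked numerically. In the sketch below the kernel P, the small set C, the measure ν and the constant δ are illustrative choices satisfying the minorization (9.314) with m = 1; the check verifies that integrating the split kernel against δ_x^* and summing over the two levels recovers P(x, ·):

```python
# Nummelin splitting for a finite chain.  P, C, nu and delta are illustrative
# choices satisfying P(x, .) >= delta 1_C(x) nu(.); states in C are split into
# levels 0 and 1 per (9.315), and the split kernel follows (9.316) with m = 1.
states = [0, 1, 2]
C = {0, 1}
nu = [0.5, 0.5, 0.0]            # nu(C) = 1
delta = 0.2
P = [[0.4, 0.4, 0.2],
     [0.3, 0.5, 0.2],
     [0.3, 0.3, 0.4]]

def split_measure(lam):
    """lambda -> lambda^* on E x {0, 1}, per (9.315)."""
    m0 = [(1 - delta) * lam[x] if x in C else lam[x] for x in states]
    m1 = [delta * lam[x] if x in C else 0.0 for x in states]
    return m0, m1

def split_kernel(x, lvl):
    """Row of the split kernel at (x, lvl), per (9.316)."""
    if lvl == 1:
        return split_measure(nu)
    if x in C:
        row = [(P[x][y] - delta * nu[y]) / (1 - delta) for y in states]
    else:
        row = P[x][:]
    return split_measure(row)

# Marginal identity: the split chain projects back onto the original chain.
err = 0.0
for x in states:
    d0, d1 = split_measure([1.0 if y == x else 0.0 for y in states])
    for B in states:
        total = sum(d0[z] * sum(split_kernel(z, 0)[l][B] for l in (0, 1))
                    for z in states)
        total += sum(d1[z] * sum(split_kernel(z, 1)[l][B] for l in (0, 1))
                     for z in states)
        err = max(err, abs(total - P[x][B]))
print(err)
```

The computation mirrors the two cases x ∉ C and x ∈ C treated in the proof of Theorem 9.62(a): the δν^* term subtracted at level 0 over C is exactly restored by the level-1 contribution.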
On E_1 = E × {1} the distribution of the split chain is also determined by
prescribing the following conditional expectations:
\begin{align*}
&\check{E}\left[ \prod_{j=1}^{m} f_j(X(nm+j)),\ Y(n) = 1 \,\Big|\, \check{\mathcal{F}}_{nm,n-1};\ X(nm) = x \right] \\
&= \check{E}\left[ \prod_{j=1}^{m} f_j(X(j)),\ Y(0) = 1 \,\Big|\, X(0) = x \right] \\
&= \delta E_x\left[ \prod_{j=1}^{m} f_j(X(j))\, r(x, X(m)) \right], \tag{9.317}
\end{align*}
where the Borel measurable function (x, y) 7→ r(x, y) is the Radon-Nikodym
derivative:
\[
r(x, y) = 1_C(x)\, \frac{\nu(dy)}{P(m, x, dy)}. \tag{9.318}
\]
By putting f_j = 1, 1 ≤ j ≤ m − 1, in (9.317) we see that
\begin{align*}
&\check{E}\left[ f_m(X((n+1)m)),\ Y(n) = 1 \,\Big|\, \check{\mathcal{F}}_{nm,n-1};\ X(nm) = x \right] \\
&= \delta E_x\left[ f_m(X(m))\, r(x, X(m)) \right] = \delta 1_C(x) \int f_m(y)\, d\nu(y). \tag{9.319}
\end{align*}

By taking f_m = 1 in (9.319) we get
\[
\check{P}\left[ Y(n) = 1 \,\big|\, \check{\mathcal{F}}_{nm,n-1};\ X(nm) = x \right] = \delta 1_C(x). \tag{9.320}
\]
By Bayes' rule applied to (9.319) and (9.320) we obtain
\[
\check{E}\left[ f(X((n+1)m)) \,\big|\, \check{\mathcal{F}}_{nm,n};\ X(nm) = x,\ Y(n) = 1 \right] = \int f(y)\, d\nu(y). \tag{9.321}
\]
Let f_j, 0 ≤ j ≤ N, be bounded Borel functions on E, and let the numbers
ε_j, 0 ≤ j ≤ N, be equal to 0 or 1. From the tower property of conditional
expectations, the Markov property of the process
\[
\left\{ \left( \check{\Omega}, \check{\mathcal{F}}, \check{P}_{(x,i)} \right)_{(x,i)\in E\times\{0,1\}},\ \left( \check{X}(n),\ n \ge 0 \right),\ \left( \check{E}, \check{\mathcal{E}} \right) \right\},
\tag{9.322}
\]
and (9.321) we infer, with
\[
F_{n+1} = \prod_{j=0}^{N} f_j\left( X((n+1)m + j) \right)\, \delta_{\varepsilon_j}\left( Y(j+n+1) \right),
\]

that
\begin{align*}
\check{E}_{x,1}&\left[ F_{n+1} \,\big|\, \check{\mathcal{F}}_{nm,n} \right] \\
&= \check{E}\left[ F_{n+1} \,\big|\, \check{\mathcal{F}}_{nm,n};\ X(nm) = x,\ Y(n) = 1 \right] \\
&= \check{E}\left[ \check{E}\left[ F_{n+1} \,\big|\, \check{\mathcal{F}}_{(n+1)m,n+1} \right] \,\Big|\, \check{\mathcal{F}}_{nm,n};\ X(nm) = x,\ Y(n) = 1 \right] \\
&= \check{E}\left[ \check{E}\left[ F_{n+1} \,\big|\, \sigma\left( X((n+1)m),\ Y(n+1) \right) \right] \,\Big|\, \check{\mathcal{F}}_{nm,n};\ X(nm) = x,\ Y(n) = 1 \right] \\
&= \int \check{E}\left[ F_{n+1} \,\big|\, \sigma(Y(n+1));\ X((n+1)m) = y \right] d\nu(y) \tag{9.323} \\
&= \int \check{E}_{y,\varepsilon_0}\left[ \prod_{j=0}^{N} f_j(X(j))\, \delta_{\varepsilon_j}(Y(j)) \right] d\nu(y). \tag{9.324}
\end{align*}

The equality in (9.323) yields the P̌_{x,1}-independence of the following two σ-
fields, given that Y(n) = 1: F̌_{nm,n} = σ(X(i), Y(j) : 0 ≤ i ≤ nm, 0 ≤ j ≤ n)
and σ(X(i), Y(j) : i ≥ (n + 1)m, j ≥ n + 1).
From (9.323) it also follows that for f ≥ 0 and Borel measurable, k ∈ N,
k ≥ 1, and ε = 0 or 1,
\begin{align*}
&\check{E}_{x,1}\left[ f(X((n+1)m + k))\, \delta_\varepsilon(Y((n+1)m + k)) \,\big|\, \check{\mathcal{F}}_{nm,n} \right] \\
&= \check{E}\left[ f(X((n+1)m + k))\, \delta_\varepsilon(Y((n+1)m + k)) \,\big|\, \check{\mathcal{F}}_{nm,n};\ X(nm) = x,\ Y(n) = 1 \right] \\
&= \int E_y\left[ f(X(k)) \right] d\nu(y). \tag{9.325}
\end{align*}

From (9.325) we infer by taking the expectation with respect to Ě_{x,1} that
\[
\check{E}_{x,1}\left[ f(X((n+1)m + k))\, \delta_\varepsilon(Y((n+1)m + k)) \right] = \int E_y\left[ f(X(k)) \right] d\nu(y),
\tag{9.326}
\]
and consequently, the subset C × {1} serves as an atom for the split chain
\[
\left\{ (X(mn), Y(n)),\ \check{P}_{x,\varepsilon} \right\}_{(x,\varepsilon) \in E \times \{0,1\}}.
\tag{9.327}
\]

It is assumed that the process in (9.327) is a time-homogeneous Markov chain
with transition function P̌(nm, x_i, A), n ∈ N, x_i = (x, i) ∈ E × {0, 1}. Here
the “first-step” transition function P̌(m, x_i, A) is given by (9.316). By the
Markov property it then follows that the transition function P̌(nm, x_i, A)
satisfies the Chapman-Kolmogorov equation, i.e. the equality
\[
\int \check{P}(jm, x_i, dy_j)\, \check{P}(km, y_j, A) = \check{P}((j+k)m, x_i, A)
\tag{9.328}
\]
holds for all x_i ∈ Ě and A ∈ Ě and j, k ∈ N. Compare all this with the
Markov chain in (9.322).
The following theorem appears as Theorem 5.1.3 in Meyn and Tweedie.

Theorem 9.62. Let δ > 0, the probability measure ν, and m ∈ N, m ≥ 1,
be as in (9.314). Let ϕ be a σ-finite measure on E. Suppose the function
P̌(nm, x_i, A) serves as a transition function for the Markov process in (9.327).
In particular the Chapman-Kolmogorov identity (9.328) is satisfied. Then the
following assertions hold:
(a) The chain {X(nm), P_x}_{n∈N,x∈E} is the marginal chain of
\[
\left\{ \check{X}(nm) = (X(nm), Y(n)),\ \check{P}_{x,i} \right\}_{n \in \mathbb{N},\ (x,i) \in E \times \{0,1\}},
\tag{9.329}
\]
in the sense that the equality
\[
\int_E P(km, x, A)\, d\lambda(x) = \int_{\check{E}} \check{P}(km, y_i, A_0 \cup A_1)\, d\lambda^*(y_i)
\tag{9.330}
\]
holds for all Borel measures λ, all A ∈ E and all k ∈ N.
(b) If the chain in (9.329) is ϕ^*-irreducible, then the chain {X(nm), P_x} is
ϕ-irreducible.
(c) If the chain {X(nm), P_x}_{n∈N,x∈E} is ϕ-irreducible with ϕ(C) > 0, then the
split chain in (9.329) is ν^*-irreducible, and C × {1} is an accessible atom
for the split chain (9.329).
For the definition of an accessible atom the reader is referred to Definition 9.46.

Proof. (a) It suffices to prove (9.330) with λ = δ_x, the Dirac measure at
x ∈ E. We will employ induction with respect to k. First assume that k = 1.
By (9.315), (9.316) and the equality ν(E \ C) = 0, for x ∈ E \ C we have
\[
\int d\delta_x^*(y_i)\, \check{P}(m, y_i, A_0 \cup A_1) = \check{P}(m, x_0, A_0 \cup A_1)
= P(m, x, \cdot)^*(A_0 \cup A_1) = P(m, x, A).
\tag{9.331}
\]

Next let x ∈ C. Again by employing (9.315) and (9.316) we infer
\begin{align*}
&\int d\delta_x^*(y_i)\, \check{P}(m, y_i, A_0 \cup A_1) \\
&= \int_{E \times \{0\}} d\delta_x^*(y_i)\, \check{P}(m, y_i, A_0 \cup A_1)
+ \int_{E \times \{1\}} d\delta_x^*(y_i)\, \check{P}(m, y_i, A_0 \cup A_1) \\
&= (1 - \delta)\check{P}(m, x_0, A_0 \cup A_1) + \delta\check{P}(m, x_1, A_0 \cup A_1) \\
&= (1 - \delta)\, \frac{P(m, x, \cdot)^*(A_0 \cup A_1) - \delta\nu^*(A_0 \cup A_1)}{1 - \delta}
+ \delta\nu^*(A_0 \cup A_1) \\
&= P(m, x, \cdot)^*(A_0 \cup A_1) = P(m, x, A). \tag{9.332}
\end{align*}
The equalities (9.331) (for x ∈ E \ C) and (9.332) (for x ∈ C) yield assertion
(a) for k = 1 and λ = δ_x. From Fubini's theorem assertion (a) is then also
true for any bounded measure λ.
Next we assume that the equality in (9.330) holds for 1 ≤ k ≤ n. First we
notice that
\[
\int_{\check{E}} \lambda^*(dx_i)\, \check{P}(m, x_i, \cdot)
= \left( \int_E \lambda(dx)\, P(m, x, \cdot) \right)^{\!*}.
\tag{9.333}
\]
Using the Chapman-Kolmogorov equation for the probability transition func-
tion P̌(km, x_i, A) in combination with (9.333) and induction then shows
\[
\int_{\check{E}} \lambda^*(dx_i)\, \check{P}(nm, x_i, \cdot)
= \left( \int_E \lambda(dx)\, P(nm, x, \cdot) \right)^{\!*}.
\tag{9.334}
\]
Here we need the Chapman-Kolmogorov identity (9.328) for A of the form
B_0 ∪ B_1 with B ∈ E. For k = n + 1 we then have
\begin{align*}
\int \lambda(dx)\, P((n+1)m, x, A)
&= \int \lambda(dx) \int P(nm, x, dy)\, P(m, y, A) \\
&= \int_{\check{E}} \left( \int_E \lambda(dx)\, P(nm, x, \cdot) \right)^{\!*}(dy_j)\, \check{P}(m, y_j, A_0 \cup A_1)
\end{align*}
(apply equality (9.334))
\[
= \int_{\check{E}} \int_{\check{E}} \lambda^*(dx_i)\, \check{P}(nm, x_i, dy_j)\, \check{P}(m, y_j, A_0 \cup A_1)
\]
(Chapman-Kolmogorov (9.328))
\[
= \int \lambda^*(dx_i)\, \check{P}((n+1)m, x_i, A_0 \cup A_1).
\tag{9.335}
\]

The assertion in (a) follows from (9.335). Assertion (b) follows from (a) with
ϕ instead of λ. In order to prove (c) we observe that C × {1} is an atom for the
Markov chain in (9.329), which is a consequence of the ultimate equality in
(9.316). If ϕ(C) > 0, then from the minorization property in (9.314) it follows
that the split chain (9.329) is ν^*-irreducible, and that C × {1} is an accessible
atom.
Altogether this completes the proof of Theorem 9.62.

Next we prove Orey's theorem, i.e. we prove Theorem 9.4.

Proof (Proof of Theorem 9.4). We distinguish three cases:
(i) The irreducible recurrent chain {X(n), P_x}_{x∈E} contains an accessible
atom.
(ii) The irreducible recurrent chain is strongly aperiodic.
(iii) The irreducible recurrent chain is aperiodic.
In case the irreducible recurrent chain contains an accessible atom A we use
formula (9.282) to obtain:
\[
\left| E_x[f(X(n))] - E_A[f(X(n))] \right|
\le \|f\|_\infty \left( P_x\left[ \tau_A^1 \ge n \right]
+ (a_x * u_A - u_A) * \overline{p}_{A,1}(n-1) \right).
\tag{9.336}
\]
Here f ∈ C_b(E) is arbitrary, and the sequences are a_x(n) = P_x\left[ \tau_A^1 = n \right],
u_A(n) = P_A[X(n) ∈ A], and p_{A,f}(n) is chosen as in (9.281). In fact
\[
p_{A,f}(k) = E_A\left[ f(X(k)),\ \tau_A^1 = k \right], \quad\text{and}\quad
\overline{p}_{A,f}(k) = E_A\left[ f(X(k)),\ \tau_A^1 \ge k \right].
\tag{9.337}
\]

Let n tend to ∞ in (9.336). Since P_x\left[ \tau_A^1 < \infty \right] = 1 the first term on the right-
hand side of (9.336) tends to zero uniformly in f provided that ‖f‖_∞ ≤ 1. The
equality (9.297) in Theorem 9.53 yields that the second term on the right-hand
side of (9.336) tends to zero, again uniformly in f provided ‖f‖_∞ ≤ 1. As a
consequence we see that
\[
\lim_{n\to\infty} \mathrm{Var}\left( P(n, x, \cdot) - P(n, A, \cdot) \right)
= \lim_{n\to\infty} \sup\left\{ \left| E_x[f(X(n))] - E_A[f(X(n))] \right| : \|f\|_\infty \le 1 \right\} = 0.
\tag{9.338}
\]
By the triangle inequality and the dominated convergence theorem the equal-
ity (9.25) in Theorem 9.4 is a consequence of (9.338). This proves assertion
(i) in the beginning of this proof.
Next we will prove (9.25) in Theorem 9.4 in case the recurrent Markov
chain {X(n), P_x}_{x∈E} is strongly aperiodic. This will be a consequence of
Nummelin's splitting technique, and the fact that for Markov chains with an
accessible atom Orey's theorem holds: see the arguments following equality
(9.338). If the chain {X(n), P_x}_{n∈N,x∈E} is strongly aperiodic, then we know
that inequality (9.314) holds with m = 1 for some recurrent subset C and a
probability measure ν on E with ν(C) = 1 (and P(1, x_0, C) > 0). Using this
subset C and this measure ν we may construct the split chain in (9.329) with
marginal chain {X(n), P_x}_{n∈N,x∈E} (i.e. (9.330) is satisfied), and for which
C × {1} is an accessible atom. These claims follow from assertions (b) and (c)
in Theorem 9.62. Since the subset C × {1} is an accessible atom for the split
chain in (9.329), we know that Orey's theorem holds for the split chain. The
latter is a consequence of assertion (i), which is a consequence of (9.338). Let
x and y ∈ E. Then we infer

Var (P (n, x, ·) − P (n, y, ·))


≤ 2 sup |P (n, x, A) − P (n, y, A)|
A∈E
¯Z Z ¯
¯ ¡ ¢ ∗ ¯
¯
= 2 sup ¯ P̌ (n, xi , A0 ∪ A1 ) − P (n, yj , A0 ∪ A1 ) dδx (xi ) dδy (yj )¯¯

A∈E Ě×Ě
552 9 Miscellaneous topics
ZZ
¯ ¯
≤2 sup ¯P̌ (n, xi , A0 ∪ A1 ) − P (n, yj , A0 ∪ A1 )¯ dδx∗ (xi ) dδy∗ (yj )
A∈E
Z ZĚ×Ě
¡ ¢
≤2 Var P̌ (n, xi , ·) − P (n, yj , ·) dδx∗ (xi ) dδy∗ (yj ) . (9.339)
Ě×Ě

By assertion (i), applied to the split chain in (9.329) (with m = 1) the final
term in (9.339) converges to zero. By dominated convergence and (9.339) we
see that
ZZ
lim Var (P (n, x, ·) − P (n, y, ·)) dλ1 (x)dλ2 (y) = 0. (9.340)
n→∞ E×E

The equality in (9.340) shows that Orey’s theorem holds for strongly aperiodic
recurrent.
To finish the proof of Theorem 9.4 we suppose that {X(n), P_x}_{n∈N,x∈E} is
an aperiodic recurrent chain. By assertion (ii), which has been proved now,
Orey's theorem holds for irreducible recurrent strongly aperiodic chains. By
Corollary 9.58 there exists m ∈ N such that the skeleton
\[
\left\{ (X(mn), P_x) : n \in \mathbb{N},\ x \in E \right\}
\]
is strongly aperiodic. Since Orey's theorem holds for such chains, an applica-
tion of Lemma 9.61 yields the result that Orey's theorem holds for all irre-
ducible, recurrent, aperiodic Markov chains.
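In its simplest finite-state incarnation, Orey's theorem says that for an irreducible aperiodic stochastic matrix the rows of P^n merge in total variation. The following sketch (with an illustrative matrix) makes this visible:

```python
# Orey's theorem in its simplest finite-state form: for an irreducible
# aperiodic stochastic matrix the n-step laws from any two starting points
# merge in total variation.  The matrix is an illustrative example.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.3, 0.3, 0.4]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Pn = P
for _ in range(200):            # compute P^201
    Pn = matmul(Pn, P)
gap = max(sum(abs(u - v) for u, v in zip(Pn[x], Pn[y]))
          for x in range(3) for y in range(3))
print(gap)                      # essentially zero: the rows have merged
```

This is of course only the elementary case; the point of the chapter is that the same merging of n-step laws persists for irreducible, recurrent, aperiodic chains on a general state space.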

9.5 About invariant (or stationary) measures


In this section we collect some references to work related to the existence
of invariant or stationary measures for Markov processes. In this context we
have to mention Harris [99] who proved the existence of a σ-finite invariant
measure for recurrent irreducible Markov chains. Let P (x, B) be a probability
transition function which preserves the bounded continuous functions on a
Polish space E. Suppose that P is irreducible (i.e. for every x ∈ E and for
every non-void open subset O, P^n(x, O) > 0 for some n ∈ N, n ≥ 1), and
topologically recurrent (i.e. for every x ∈ E and every open neighborhood O
of x the equality P_x\left[ \bigcup_{n=1}^{\infty} \{X(n) \in O\} \right] = 1 holds). Here
\[
\left\{ (\Omega, \mathcal{F}, P_x)_{x \in E},\ (X(n),\ n \in \mathbb{N}),\ (\vartheta_k,\ k \in \mathbb{N}),\ (E, \mathcal{E}) \right\}
\]
is the Markov chain with transition function (x, B) 7→ P(x, B), (x, B) ∈ E × E.
Harris proved that for a discrete state space E there exists a σ-finite invariant
measure, and Orey [175, 176, 177] was the first to prove that, in the presence of
a finite invariant measure, \lim_{n\to\infty} \int_E \int f(y) P^n(x, dy)\, d\mu(x) = 0 for all finite real Borel
measures μ on E such that μ(E) = 0. The original results by Harris and Orey
for discrete positive recurrent chains were improved and generalized by Jami-
son and Orey [116], and Kingman and Orey [132] to Markov chains with a
more general state space, and for null-recurrent chains. In [173, 174] Nummelin
and Tuominen discuss geometric ergodicity properties, and so do Tuominen
and Tweedie in [236]. For a general discussion on Markov chains and their
limit theorems see e.g. the books by Nummelin [171], Revuz [198], and Orey
[178]. The new version of Meyn and Tweedie [162] also contains a wealth of
information. It explains splitting (due to Nummelin [170]) and (dependent)
coupling techniques (due to Ornstein [179]), and several limit properties as
well as asymptotic behavior of Markov chains. In addition, it discusses geomet-
ric ergodic chains, certain functional central limit theorems, and laws of large
numbers. All these topics are explained for discrete time Markov processes
with an arbitrary state space. Moreover, each of the 19 chapters of [162] is
concluded with a section, entitled Commentary, which contains bibliographic
notes and relevant observations. Azema, Duflo and Revuz apply skeleton tech-
niques to pass from discrete time limit theorems to continuous time limits: see
e.g. [11, 12, 10]. In the proof of Proposition 8.26 we applied the same methods.
Our approach uses the techniques of Seidler [207] (propositions 5.7 and 5.9)
in combination with Orey’s theorem for Markov chains on a compact space.
For more details the reader is referred to the comments following Theorem
9.3, and to the Notes, pp. 319–320, in Supplement, Harris processes, Special
functions, Zero-two law, written by Antoine Brunel in [138].

9.6 Weak and strong solutions to stochastic differential


equations

In this section we discuss weak and strong solutions to stochastic differential


equations.
References

1. Shigeki Aida, Uniform positivity improving property, Sobolev inequalities, and


spectral gaps, J. Funct. Anal. 158 (1998), no. 1, 152–185.
2. S. Albeverio, J. Rezende, and J.-C. Zambrini, Probability and quantum sym-
metries 2, (in preparation), 2001.
3. S. Albeverio, J. Rezende, and J.-C. Zambrini, Probability and quantum sym-
metries. II. The theorem of Noether in quantum mechanics, J. Math. Phys. 47
(2006), no. 6, 062107, 61 pages.
4. F. Altomare and A. Attalienti, Degenerate evolution equations in weighted
continuous function spaces, Markov processes and the Black-Scholes equation
- Part I, preprint University of Bari, 2002.
5. , Degenerate evolution equations in weighted continuous function spaces,
Markov processes and the Black-Scholes equation - Part II, preprint University
of Bari, 2002.
6. Francesco Altomare and Michele Campiti, Korovkin-type approximation the-
ory and its applications, de Gruyter Studies in Mathematics, vol. 17, Walter
de Gruyter & Co., Berlin, 1994, Appendix A by Michael Pannenberg and Ap-
pendix B by Ferdinand Beckhoff.
7. C. Andrieu and G. Fort, Explicit control of subgeometric ergodicity, web page:
http://www.tsi.enst.fr/ gfort/biblio.html., 2005.
8. Wolfgang Arendt, Vector-valued Laplace transforms and Cauchy problems, Is-
rael J. Math. 59 (1987), no. 3, 327–352. MR MR920499 (89a:47064)
9. Wolfgang Arendt, Charles J. K. Batty, Matthias Hieber, and Frank Neubran-
der, Vector-valued Laplace transforms and Cauchy problems, Monographs in
Mathematics, vol. 96, Birkhäuser Verlag, Basel, 2001.
10. J. Azéma, M. Kaplan-Duflo, and D. Revuz, Mesure invariante sur les classes
récurrentes des processus de Markov, Z. Wahrscheinlichkeitstheorie und Verw.
Gebiete 8 (1967), no. 3, 157–181.
11. Jacques Azéma, Marie Kaplan-Duflo, and Daniel Revuz, Récurrence fine des
processus de Markov, Ann. Inst. H. Poincaré Sect. B (N.S.) 2 (1965/1966),
185–220.
12. , Mesure invariante et théorèmes ergodiques sur les classes récurrentes
des processus de Markov, C. R. Acad. Sci. Paris Sér. A-B 262 (1966), A1247–
A1249.
13. D. Bakry, Transformations de Riesz pour les semi-groupes symétriques. I.


Étude de la dimension 1, Séminaire de probabilités, XIX, 1983/84, Lecture
Notes in Math., vol. 1123, Springer, Berlin, 1985, pp. 130–144.
14. , Transformations de Riesz pour les semi-groupes symétriques. II. Étude
sous la condition Γ2 ≥ 0, Séminaire de probabilités, XIX, 1983/84, Lecture
Notes in Math., vol. 1123, Springer, Berlin, 1985, pp. 145–174.
15. , Inégalités de Sobolev faibles: un critère Γ2 , Séminaire de Probabilités,
XXV, Lecture Notes in Math., vol. 1485, Springer, Berlin, 1991, pp. 234–261.
16. D. Bakry, L’hypercontractivité et son utilisation en théorie de semigroupes,
Lecture Notes in Math., vol. 1581, pp. 1–114, Springer Verlag, Berlin, 1994, P.
Bernard (editor).
17. , Functional inequalities for Markov semigroups, Probability measures
on groups: recent directions and trends, Tata Inst. Fund. Res., Mumbai, 2006,
pp. 91–147.
18. D. Bakry and Michel Émery, Diffusions hypercontractives, Séminaire de prob-
abilités, XIX, 1983/84, Lecture Notes in Math., vol. 1123, Springer, Berlin,
1985, pp. 177–206.
19. D. Bakry and M. Ledoux, A logarithmic Sobolev form of the Li-Yau parabolic
inequality, Revista Mat. Iberoamericana 22 (2006), 683–702, to appear.
20. V. Bally, É Pardoux, and L. Stoica, Backward stochastic differential equations
associated to a symmetric Markov process, Potential Analysis 22 (2005), no. 1,
17 – 60.
21. G. Barles, P. Cardaliaguet, O. Ley, and R. Monneau, Global existence results
and uniqueness for dislocation equations, SIAM J. Math. Anal. (2008).
22. Guy Barles, Discontinuous viscosity solutions of first-order Hamilton-Jacobi
equations: a guided visit, Nonlinear Anal. 20 (1993), no. 9, 1123–1134.
23. M. T. Barlow, R. F. Bass, and T. Kumagai, Note on the equivalence of parabolic
Harnack inequalities and heat kernel estimates,
http://www.math.ubc.ca/˜barlow/preprints/, 2005.
24. Richard F. Bass and Takashi Kumagai, Symmetric Markov chains on Z^d with
unbounded range, Trans. Amer. Math. Soc. 360 (2008), 2041–2075.
25. Heinz Bauer, Probability theory and elements of measure theory, Academic
Press Inc. [Harcourt Brace Jovanovich Publishers], London, 1981, Second edi-
tion of the translation by R. B. Burckel from the third German edition, Prob-
ability and Mathematical Statistics.
26. Boris Bäumer and Frank Neubrander, Laplace transform methods for evolution
equations, Confer. Sem. Mat. Univ. Bari (1994), no. 258-260, 27–60, Swabian-
Apulian Meeting on Operator Semigroups and Evolution Equations (Italian)
(Ruvo di Puglia, 1994).
27. Jean-Paul Benzécri, Théorie des capacités (d’après G. Choquet), Séminaire
Bourbaki, Vol. 3, Soc. Math. France, Paris, 1995, pp. Exp. No. 120, 217–227.
28. V. Bergelson, P. March, and J. Rosenblatt (eds.), Convergence in ergodic the-
ory and probability, Ohio State University Mathematical Research Institute
Publications, 5, Walter de Gruyter & Co., Berlin, 1996, Papers from the Third
Conference held at The Ohio State University, Columbus, Ohio, June 1993,
Dedicated to Louis Sucheston.
29. Peter J. Bickel, Chris A. J. Klaassen, Ya’acov Ritov, and Jon A. Wellner, Effi-
cient and adaptive estimation for semiparametric models, Johns Hopkins Series
in the Mathematical Sciences, Johns Hopkins University Press, Baltimore, MD,
1993.
References 557
30. Patrick Billingsley, Convergence of probability measures, second ed., Wiley Se-
ries in Probability and Statistics: Probability and Statistics, John Wiley &
Sons Inc., New York, 1999, A Wiley-Interscience Publication.
31. Jean-Michel Bismut, Contrôle des systèmes linéaires quadratiques: applications
de l’intégrale stochastique, Séminaire de Probabilités, XII (Univ. Strasbourg,
Strasbourg, 1976/1977), Springer, Berlin, 1978, pp. 180–264.
32. , Mécanique aléatoire, Lecture Notes in Mathematics, vol. 866, Springer-
Verlag, Berlin, 1981, With an English summary.
33. David Blackwell and David Freedman, The tail σ-field of a Markov chain and
a theorem of Orey, Ann. Math. Statist. 35 (1964), no. 3, 1291–1295.
34. R.M. Blumenthal and R.K. Getoor, Markov processes and potential theory,
Pure and Applied Mathematics: A series of monographs and textbooks, vol. 29,
Academic Press, New York, 1968.
35. Sönke Blunck, Analyticity and discrete maximal regularity on L^p-spaces, J.
Funct. Anal. 183 (2001), no. 1, 211–230.
36. , Some remarks on operator-norm convergence of the Trotter product
formula on Banach spaces, J. Funct. Anal. 195 (2002), no. 2, 350–370.
37. Adam Bobrowski, On the Yosida approximation and the Widder-Arendt rep-
resentation theorem, Studia Math. 124 (1997), no. 3, 281–290.
38. Vladimir I. Bogachev, Nicolai V. Krylov, and Michael Röckner, Elliptic equa-
tions for measures: regularity and global bounds of densities, J. Math. Pures
Appl. (9) 85 (2006), no. 6, 743–757.
39. B. Boufoussi and J. van Casteren, An approximation result for a nonlinear
Neumann boundary value problem via BSDEs, Stochastic Process. Appl. 114
(2004), no. 2, 331–350.
40. B. Boufoussi and J. A. van Casteren, An approximation result for solutions to
semilinear PDEs with Neumann boundary conditions via BSDEs, submitted,
2002.
41. Brahim Boufoussi, Jan Van Casteren, and N. Mrhardy, Generalized backward
doubly stochastic differential equations and SPDEs with nonlinear Neumann
boundary conditions, Bernoulli 13 (2007), no. 2, 423–446.
42. Ola Bratteli and Derek W. Robinson, Operator algebras and quantum statistical
mechanics. 1, second ed., Texts and Monographs in Physics, Springer-Verlag,
New York, 1987, C*- and W*-algebras, symmetry groups, decomposition of
states. MR MR887100 (88d:46105)
43. R. C. Buck, Approximation properties of vector valued functions, Pacific J.
Math. 53 (1974), 85–94.
44. R. Creighton Buck, Bounded continuous functions on a locally compact space,
Michigan Math. J. 5 (1958), 95–104.
45. A. V. Bukhvalov, Order-bounded operators in vector lattices and spaces of
measurable functions, Mathematical analysis, Vol. 26 (Russian), Itogi Nauki i
Tekhniki, Akad. Nauk SSSR Vsesoyuz. Inst. Nauchn. i Tekhn. Inform., Moscow,
1988, Translated in J. Soviet Math. 54 (1991), no. 5, 1131–1176, pp. 3–63, 148.
46. K. Burrage and J. C. Butcher, Stability criteria for implicit Runge-Kutta meth-
ods, SIAM J. Numer. Anal. 16 (1979), no. 1, 46–57.
47. Eric A. Carlen and Daniel W. Stroock, An application of the Bakry-Emery
criterion to infinite-dimensional diffusions, Séminaire de Probabilités, XX,
1984/85, Lecture Notes in Math., vol. 1204, Springer, Berlin, 1986, pp. 341–348.
48. Jan A. Van Casteren, On martingales and Feller semigroups, Results Math.
21 (1992), 274–288.
49. Sandra Cerrai, A Hille-Yosida theorem for weakly continuous semigroups,
Semigroup Forum 49 (1994), no. 3, 349–367.
50. , Second order PDE’s in finite and infinite dimension, Lecture Notes
in Mathematics, vol. 1762, Springer-Verlag, Berlin, 2001, A probabilistic ap-
proach.
51. K. S. Chan and Johannes Ledolter, Monte Carlo EM estimation for time series
models involving counts, J. Amer. Statist. Assoc. 90 (1995), no. 429, 242–252.
52. Mu-Fa Chen, Eigenvalues, inequalities, and ergodic theory, Probability and its
Applications (New York), Springer-Verlag London Ltd., London, 2005.
53. Mu-Fa Chen and Feng-Yu Wang, Estimation of spectral gap for elliptic opera-
tors, Trans. Amer. Math. Soc. 349 (1997), 1239–1267.
54. , Cheeger’s inequalities for general symmetric forms and existence cri-
teria for spectral gap, Ann. Probab. 28 (2000), no. 1, 235–257.
55. Mu-Fa Chen and Ying-Zhe Wang, Algebraic convergence of Markov chains,
Ann. Appl. Probab. 13 (2003), no. 2, 604–627.
56. Siddhartha Chib, Handbook of computational statistics (volume i) concepts and
fundamentals, vol. I, ch. MCMC Technology, pp. 71–102, Springer-Verlag, Hei-
delberg, 2004.
57. Wojciech Chojnacki, On the equivalence of a theorem of Kisyński and the Hille-
Yosida generation theorem, Proc. Amer. Math. Soc. 126 (1998), no. 2, 491–497.
58. Gustave Choquet, La naissance de la théorie des capacités: réflexion sur une
expérience personnelle, C. R. Acad. Sci. Sér. Gén. Vie Sci. 3 (1986), no. 4,
385–397.
59. Kai Lai Chung and Jean-Claude Zambrini, Introduction to random time and
quantum randomness, Monographs of the Portuguese Mathematical Society,
McGraw-Hill, Lisbon, 2001.
60. M.G. Crandall, L.C. Evans, and P.L. Lions, Some properties of viscosity solu-
tions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc. 282 (1984), no. 2,
487–502.
61. M.G. Crandall, H. Ishii, and P.L. Lions, User’s guide to viscosity solutions of
second order partial differential equations, Bull. Amer. Math. Soc. (N.S.) 27
(1992), 1–67.
62. Michael G. Crandall, Hitoshi Ishii, and Pierre-Louis Lions, User’s guide to
viscosity solutions of second order partial differential equations, Bull. Amer.
Math. Soc. (N.S.) 27 (1992), no. 1, 1–67.
63. M. Crouzeix, W. H. Hundsdorfer, and M. N. Spijker, On the existence of so-
lutions to the algebraic equations in implicit Runge-Kutta methods, BIT 23
(1983), no. 1, 84–91.
64. Michel Crouzeix, Sur la B-stabilité des méthodes de Runge-Kutta, Numer.
Math. 32 (1979), no. 1, 75–82.
65. Ana Bela Cruzeiro and Jean-Claude Zambrini, Malliavin calculus and Eu-
clidean quantum mechanics. I. Functional calculus, J. Funct. Anal. 96 (1991),
no. 1, 62–95.
66. Giuseppe Da Prato and Jerzy Zabczyk, Stochastic equations in infinite dimen-
sions, Encyclopedia of Mathematics and its Applications, vol. 44, Cambridge
University Press, Cambridge, 1992.
67. Donald A. Dawson and Edwin A. Perkins, Measure-valued processes and renor-
malization of branching particle systems, Stochastic partial differential equa-
tions: six perspectives, Math. Surveys Monogr., vol. 64, Amer. Math. Soc.,
Providence, RI, 1999, pp. 45–106.
68. Freddy Delbaen and Walter Schachermayer, The mathematics of arbitrage,
Springer Finance, Springer, Berlin, New York, 2006.
69. Claude Dellacherie and Paul-André Meyer, Probabilities and potential, North-
Holland Mathematics Studies, vol. 29, North-Holland Publishing Co., Amster-
dam, 1978.
70. M. Demuth and J. A. van Casteren, Stochastic spectral theory of Feller
operators: a functional integration approach, Birkhäuser Verlag, Basel, 2000.
71. J.-D. Deuschel and D. W. Stroock, Large deviations, AMS/Chelsea Series,
American Mathematical Society, Providence, 2000.
72. Klaus Donner, Extension of positive operators and Korovkin theorems, Lecture
Notes in Mathematics, vol. 904, Springer-Verlag, Berlin, 1982.
73. Joseph L. Doob, Classical potential theory and its probabilistic counterpart,
Classics in Mathematics, Springer-Verlag, Berlin, 2001, Reprint of the 1984
edition.
74. J. R. Dorroh and J. W. Neuberger, Lie generators for semigroups of transfor-
mations on a Polish space, Electron. J. Differential Equations (1993), No. 01,
approx. 7 pp. (electronic only).
75. M. Duflo and D. Revuz, Propriétés asymptotiques des probabilités de transition
des processus de Markov récurrents, Ann. Inst. H. Poincaré Sect. B (N.S.) 5
(1969), 233–244.
76. E. B. Dynkin and S. E. Kuznetsov, Linear additive functionals of superdiffu-
sions and related nonlinear p.d.e, Trans. Amer. Math. Soc. 348 (1996), no. 5,
1959–1987.
77. , Solutions of Lu = u^α dominated by L-harmonic functions, J. Anal.
Math. 68 (1996), 15–37.
78. A. Eberle, Weak Sobolev spaces and Markov uniqueness of operators, C. R.
Acad. Sci. Paris Sér. I Math. 320 (1995), no. 10, 1249–1254.
79. , Girsanov-type transformations of local Dirichlet forms: an analytic
approach, Osaka J. Math. 33 (1996), no. 2, 497–531.
80. , Uniqueness and non-uniqueness of semigroups generated by singular
diffusion operators, Lecture Notes in Math., vol. 1718, Springer Verlag, Berlin,
1999.
81. Klaus-Jochen Engel and Rainer Nagel, One-parameter semigroups for linear
evolution equations, Graduate Texts in Mathematics, vol. 194, Springer-Verlag,
New York, 2000, With contributions by S. Brendle, M. Campiti, T. Hahn, G.
Metafune, G. Nickel, D. Pallara, C. Perazzoli, A. Rhandi, S. Romanelli and R.
Schnaubelt.
82. Abdelhadi Es-Sarhir and Bálint Farkas, Positivity of perturbed Ornstein-
Uhlenbeck semigroups on C_b(H), Semigroup Forum 70 (2005), 208–224.
83. Alison M. Etheridge, An introduction to superprocesses, University Lecture
Series, vol. 20, American Mathematical Society, Providence, RI, 2000.
84. S.N. Ethier and T.G. Kurtz, Markov processes, characterization and conver-
gence, Wiley Series in Probability and Statistics, John Wiley and Sons, New
York, 1985.
85. Lawrence C. Evans, Partial differential equations, Graduate Studies in Math-
ematics, vol. 19, American Mathematical Society, Providence, RI, 1998.
86. W.H. Fleming and H. M. Soner, Controlled Markov processes and viscosity
solutions, Applications of Mathematics, vol. 25, Springer Verlag, Berlin, 1993.
87. Shaul R. Foguel, The ergodic theory of Markov processes, Van Nostrand Math-
ematical Studies, No. 21, Van Nostrand Reinhold Co., New York, 1969.
88. S.R. Foguel, Limit theorems for Markov processes, Transaction of the American
Mathematical Society 121 (1966), no. 1, 200–209.
89. G. Fort and G. O. Roberts, Subgeometric ergodicity of strong Markov processes,
Ann. Appl. Probab. 15 (2005), no. 2, 1565–1589.
90. Dariusz Gatarek and Benjamin Goldys, On invariant measures for diffusions
on Banach spaces, Potential Analysis 7 (1997), no. 2, 533–553.
91. Robin Giles, A generalization of the strict topology, Trans. Amer. Math. Soc.
161 (1971), 467–474.
92. B. Goldys and B. Maslowski, Uniform exponential ergodicity of stochastic dis-
sipative systems, Czechoslovak Math. J. 51(126) (2001), no. 4, 745–762.
93. , Exponential ergodicity for stochastic reaction-diffusion equations,
Stochastic Partial Differential Equations and Applications VII, Chapman &
Hall, 2005, pp. 115–132.
94. , Lower estimates of transition densities and bounds on exponential er-
godicity for stochastic PDE’s, Ann. Probab. 34 (2006), no. 4, 1451–1496.
95. A. Gulisashvili and J. A. van Casteren, Non-autonomous Kato classes and
Feynman-Kac propagators, World Scientific, Singapore, 2006.
96. Dzung Minh Ha, Functional Analysis Volume 1: A gentle introduction, Matrix
Editions, 2007, Student-friendly but rigorous book aimed primarily at third or
fourth year undergraduates. The pace is slow but thorough, with an abundance
of motivations, examples, and counterexamples. Arguments used within proofs
are explicitly cited, with references to where they were first proved. Many
examples have solutions that make use of several results or concepts, so that
students can see how various techniques can blend together into one.
97. Wolfgang Hahn, Über die Anwendung der Methode von Ljapunov auf Differen-
zengleichungen, Math. Ann. 136 (1958), 430–441.
98. E. Hairer and G. Wanner, Solving ordinary differential equations. II, Springer
Series in Computational Mathematics, vol. 14, Springer-Verlag, Berlin, 1991,
Stiff and differential-algebraic problems.
99. T. E. Harris, The existence of stationary measures for certain Markov processes,
vol. 2, pp. 113–124, University of California Press, Los Angeles, 1956.
100. J.M. Harrison and S.R. Pliska, Martingales and stochastic integrals in the the-
ory of continuous trading, Stochastic Process. Appl. 11 (1981), no. 3, 215–260.
101. R. Z. Has′minskiĭ, Ergodic properties of recurrent diffusion processes and
stabilization of the solution of the Cauchy problem for parabolic equations
(Russian), Teor. Verojatnost. i Primenen 5 (1960), 196–214.
102. M. Hazewinkel (ed.), Encyclopaedia of mathematics. Supplement. Vol. III,
Kluwer Academic Publishers, Dordrecht, 2001.
103. Rudolf A. Hirschfeld, Riots, Nieuw Archief voor Wiskunde 22 (1974), no. 3,
1–43.
104. W. Hoh, The martingale problem for a class of pseudo-differential operators,
Math. Ann. 300 (1994), no. 1, 121–147.
105. Walter Hoh, Feller semigroups generated by pseudo-differential operators,
Dirichlet forms and stochastic processes (Beijing, 1993), de Gruyter, Berlin,
1995, pp. 199–206.
106. , Pseudodifferential operators with negative definite symbols and the
martingale problem, Stochastics Stochastics Rep. 55 (1995), no. 3-4, 225–252.
107. , Pseudo differential operators with negative definite symbols of variable
order, Rev. Mat. Iberoamericana 16 (2000), no. 2, 219–241.
108. Chii-Ruey Hwang, Shu-Yin Hwang-Ma, and Shuenn-Jyi Sheu, Accelerating Dif-
fusions, Annals of Appl. Prob. 15 (2005), no. 2, 1433–1444.
109. N. Ikeda and S. Watanabe, Stochastic differential equations and diffusion pro-
cesses, 2 ed., North-Holland Mathematical Library, vol. 24, North-Holland,
Amsterdam, 1998.
110. N. Jacob, Pseudo differential operators and Markov processes. Vol. I, Imperial
College Press, London, 2001, Fourier analysis and semigroups.
111. , Pseudo differential operators and Markov processes. Vol. II, Imperial
College Press, London, 2002, Markov processes.
112. , Pseudo differential operators and Markov processes. Vol. III, Imperial
College Press, London, 2005, Markov processes and applications.
113. Adam Jakubowski, A non-Skorohod topology on the Skorohod space, Electron.
J. Probab. 2 (1997), no. 4, 21 pp. (electronic).
114. , Skorokhod’s Ideas in Probability Theory, ch. From convergence of
functions to convergence of stochastic processes. On Skorokhod’s sequential
approach to convergence in distribution, pp. 179–194, Institute of Mathematics,
National Academy of Sciences of Ukraine, Kiev, 2000.
115. B. Jamison, Reciprocal processes, Z. Wahrscheinlichkeitstheor. Verw. Gebiete
30 (1974), 65–86.
116. Benton Jamison and Steven Orey, Markov chains recurrent in the sense of
Harris, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 8 (1967), no. 1, 41–
48.
117. Benton Jamison, Steven Orey, and William Pruitt, Convergence of weighted
averages of independent random variables, Z. Wahrscheinlichkeitstheorie und
Verw. Gebiete 4 (1965), 40–44.
118. Olav Kallenberg, Foundations of modern probability, second ed., Probability
and its Applications (New York), Springer-Verlag, New York, 2002.
119. I. Karatzas and S.F. Shreve, Brownian motion and stochastic calculus, vol. 113,
Springer Verlag, Berlin, 1991, Graduate Texts in Mathematics.
120. Ioannis Karatzas, Lectures on the mathematics of finance, American Mathe-
matical Society, Providence, RI, 1997. MR 98h:90001
121. Ioannis Karatzas and Steven E. Shreve, Brownian motion and stochastic cal-
culus, second ed., Graduate Texts in Mathematics, vol. 113, Springer-Verlag,
New York, 1991. MR MR1121940 (92h:60127)
122. , Methods of mathematical finance, Springer-Verlag, New York, 1998.
123. Samuel Karlin and Howard M. Taylor, A first course in stochastic processes,
second ed., Academic Press [A subsidiary of Harcourt Brace Jovanovich, Pub-
lishers], New York-London, 1975.
124. N. El Karoui, C. Kapoudjian, É. Pardoux, S. G. Peng, and M. C. Quenez,
Reflected solutions of backward SDE’s, and related obstacle problems for PDE’s,
Ann. Probab. 25 (1997), no. 2, 702–737.
125. N. El Karoui, É. Pardoux, and M. C. Quenez, Reflected backward SDEs and
American options, Numerical methods in finance (L. C. G. Rogers and D. Talay,
eds.), Publ. Newton Inst., Cambridge Univ. Press, Cambridge, 1997, pp. 215–
231.
126. N. El Karoui and M.C. Quenez, Imperfect markets and backward stochastic dif-
ferential equations, Numerical methods in finance, Publ. Newton Inst., Cam-
bridge Univ. Press, Cambridge, 1997, pp. 181–214.
127. Haya Kaspi and Avi Mandelbaum, On Harris recurrence in continuous time,
Mathematics of Operations Research 19 (1994), no. 1, 211–222.
128. N. Katilova, Markov chains: Ergodicity in time-discrete cases, to appear 2008.
129. , On ergodicity and stability properties of finite Markov chains and their
financial interpretations, Mathematics, University of Antwerp, Middelheimlaan
1, 2020 Antwerp, March 2004.
130. , On Markov and Kolmogorov matrices and their relationship with an-
alytic operators, New Zealand J. Math. 34 (2005), no. 1, 43–60.
131. A. K. Katsaras, On the strict topology in the nonlocally convex setting. II, Acta
Math. Hungar. 41 (1983), no. 1-2, 77–88.
132. J.F.C. Kingman and S. Orey, Ratio limit theorems for Markov chains, Proc.
Amer. Math. Soc. 15 (1964), 907–910.
133. Christer O. Kiselman, Plurisubharmonic functions and potential theory in sev-
eral complex variables, Development of mathematics 1950–2000, Birkhäuser,
Basel, 2000, pp. 655–714.
134. Hagen Kleinert, Path integrals in quantum mechanics, statistics, polymer
physics, and financial markets, World Scientific Publishing Co. Inc., River
Edge, NJ, 2003.
135. Vassili N. Kolokoltsov, Measure-valued limits of interacting particle systems
with k-nary interactions. II. Finite-dimensional limits, Stoch. Stoch. Rep. 76
(2004), no. 1, 45–58. MR MR2038028 (2004k:60268)
136. , On Markov processes with decomposable pseudo-differential generators,
Stoch. Stoch. Rep. 76 (2004), no. 1, 1–44. MR MR2038027 (2005b:60193)
137. P. P. Korovkin, Linear operators and approximation theory, Translated from
the Russian ed. (1959). Russian Monographs and Texts on Advanced Mathe-
matics and Physics, Vol. III, Gordon and Breach Publishers, Inc., New York,
1960.
138. Ulrich Krengel, Ergodic theorems, de Gruyter Studies in Mathematics, vol. 6,
Walter de Gruyter & Co., Berlin, 1985, With a supplement by Antoine Brunel.
139. Nicolai V. Krylov and Boris L. Rozovskii, Stochastic evolution equations,
Stochastic Differential Equations: Theory and Applications, A Volume in
Honor of Professor Boris L. Rozovskii (Peter H. Baxendale and Sergey V. Lo-
totsky, eds.), Interdisciplinary Mathematical Sciences, vol. 2, World Scientific,
Singapore, 2007, A Volume in Honor of Professor Boris L. Rozovskii, pp. 1–70.
140. F. Kühnemund, A Hille-Yosida theorem for bi-continuous semigroups, Semi-
group Forum 67 (2003), no. 2, 205–225.
141. Joseph L. Doob, Stochastic processes, John Wiley & Sons, Inc., New York,
London, 1953.
142. L. G. Labsker, Korovkin sets in a Banach space for sets of linear functionals,
Mat. Zametki 31 (1982), no. 1, 93–112, 159.
143. M. Ledoux, On an integral criterion for hypercontractivity of diffusion semi-
groups and extremal functions, J. Funct. Anal. 105 (1992), no. 2, 444–465.
144. Michel Ledoux, The geometry of Markov diffusion generators, Ann. Fac. Sci.
Toulouse Math. (6) 9 (2000), no. 2, 305–366, Probability theory.
145. , Spectral gap, logarithmic Sobolev constant, and geometric bounds, Sur-
veys in differential geometry. Vol. IX, Surv. Differ. Geom., IX, Int. Press,
Somerville, MA, 2004, pp. 219–240.
146. Paul Lévy, Théorie de l’addition des variables aléatoires, Gauthier-Villars,
Paris, 1937.
147. Thomas M. Liggett, Interacting particle systems, Classics in Mathematics,
Springer-Verlag, Berlin, 2005, Reprint of the 1985 original.
148. Michael Lin, Semi-groups of Markov operators, Boll. Un. Mat. Ital. (4) 6 (1972),
20–44.
149. , Strong ratio limit theorems for Markov processes, Ann. Math. Statist.
43 (1972), 569–579.
150. , On the uniform ergodic theorem. II, Proc. Amer. Math. Soc. 46 (1974),
217–225.
151. , Quasi-compactness and uniform ergodicity of Markov operators, Ann.
Inst. H. Poincaré Sect. B (N.S.) 11 (1975), no. 4, 345–354 (1976).
152. R. Lowen, Approach spaces, Oxford Mathematical Monographs, The Claren-
don Press Oxford University Press, New York, 1997, The missing link in the
topology-uniformity-metric triad, Oxford Science Publications.
153. Y. Lyubich, Spectral localization, power boundedness and invariant subspaces
under Ritt’s type condition, Studia Mathematica 134 (1999), no. 2, 153–167.
154. Yong-Hua Mao, Strong ergodicity for Markov processes by coupling methods, J.
Appl. Probab. 39 (2002), no. 4, 839–852.
155. M. J. Marsden and S. D. Riemenschneider, Korovkin theorems for integral
operators with kernels of finite oscillation, Canad. J. Math. 26 (1974), 1390–
1404.
156. Bohdan Maslowski and Jan Seidler, Invariant measures for nonlinear SPDE’s:
uniqueness and stability, Archivum Mathematicum (BRNO) 34 (1998), 153–
172.
157. Olivier Mazet, A characterization of Markov property for semigroups with in-
variant measure, Potential Anal. 16 (2002), no. 3, 279–287.
158. G. Metafune, D. Pallara, and M. Wacker, Feller semigroups on R^N, Semigroup
Forum 65 (2002), no. 2, 159–205. MR MR1911723 (2003i:35170)
159. G. Metafune, J. Prüss, A. Rhandi, and R. Schnaubelt, The domain of the
Ornstein-Uhlenbeck operator on an L^p-space with invariant measure, Ann.
Scuola Norm. Sup. Pisa Cl. Sci. (5) (2002), no. 1, 471–485.
160. Paul-A. Meyer, Probability and potentials, Blaisdell Publishing Co. Ginn and
Co., Waltham, Mass.-Toronto, Ont.-London, 1966.
161. S. P. Meyn and R. L. Tweedie, Generalized resolvents and Harris recurrence of
Markov processes, Doeblin and modern probability (Blaubeuren, 1991), Con-
temp. Math., vol. 149, Amer. Math. Soc., Providence, RI, 1993, pp. 227–250.
162. , Markov chains and stochastic stability, Communications and Control
Engineering Series, Springer-Verlag London Ltd., London, 1993, (new version
September 2005:
http://probability.ca/MT/).
163. Pedro J. Miana, Uniformly bounded limit of fractional homomorphisms, Proc.
Amer. Math. Soc. 133 (2005), no. 9, 2569–2575 (electronic).
164. Rosa Maria Mininni and Silvia Romanelli, Martingale estimating functions for
Feller diffusion processes generated by degenerate elliptic operators, J. Concr.
Appl. Math. 1 (2003), no. 3, 191–216.
165. Béla Nagy and Jaroslav Zemánek, A resolvent condition implying power bound-
edness, Studia Math. 134 (1999), no. 2, 143–151.
166. Toshihiko Nishishiraho, Korovkin sets and mean ergodic theorems, J. Convex
Anal. 5 (1998), no. 1, 147–151.
167. J. R. Norris, Markov chains, Cambridge Series in Statistical and Probabilistic
Mathematics, vol. 2, Cambridge University Press, Cambridge, 1998, Reprint
of 1997 original.
168. D. Nualart, The Malliavin calculus and related topics, Probability and its Ap-
plications (New York), Springer-Verlag, New York, 1995.
169. , Analysis on Wiener space and anticipating stochastic calculus, Lec-
tures on probability theory and statistics (Saint-Flour, 1995), Lecture Notes
in Math., vol. 1690, Springer, Berlin, 1998, pp. 123–227.
170. E. Nummelin, A splitting technique for Harris recurrent Markov chains, Z.
Wahrsch. Verw. Gebiete 43 (1978), no. 4, 309–318.
171. Esa Nummelin, General irreducible Markov chains and nonnegative operators,
Cambridge Tracts in Mathematics, vol. 83, Cambridge University Press, Cam-
bridge, 1984.
172. Esa Nummelin and Elja Arjas, A direct construction of the R-invariant measure
for a Markov chain on a general state space, The Annals of Probability 4
(1976), no. 4, 674–679.
173. Esa Nummelin and Pekka Tuominen, Geometric ergodicity of Harris recurrent
Markov chains with applications to renewal theory, Stochastic Process. Appl.
12 (1982), no. 2, 187–202.
174. , The rate of convergence in Orey’s theorem for Harris recurrent Markov
chains with applications to renewal theory, Stochastic Process. Appl. 15 (1983),
no. 3, 295–311.
175. S. Orey, Recurrent Markov chains, Pacific J. Math. 9 (1959), 805–827.
176. Steven Orey, An ergodic theorem for Markov chains, Z. Wahrscheinlichkeits-
theorie Verw. Gebiete 1 (1962), 174–176.
177. , Potential kernels for recurrent Markov chains, J. Math. Anal. Appl.
8 (1964), 104–132.
178. , Lecture notes on limit theorems for Markov chain transition proba-
bilities, Van Nostrand Reinhold Co., London, 1971, Van Nostrand Reinhold
Mathematical Studies, No. 34.
179. Donald S. Ornstein, Random walks. I, II, Trans. Amer. Math. Soc. 138 (1969),
1–43; ibid. 138 (1969), 45–60.
180. M. Ali Özarslan and Oktay Duman, MKZ type operators providing a better
estimation on [1/2, 1), Canad. Math. Bull. 50 (2007), no. 3, 434–439.
181. É. Pardoux, Backward stochastic differential equations and viscosity solutions
of systems of semilinear parabolic and elliptic PDEs of second order, Stochas-
tic analysis and related topics, VI (Geilo, 1996), Progr. Probab., vol. 42,
Birkhäuser Boston, Boston, MA, 1998, pp. 79–127. MR 99m:35279
182. , Backward stochastic differential equations and viscosity solutions of
systems of semilinear parabolic and elliptic PDEs of second order, Progress in
Probability, vol. 42, pp. 79–127, Birkhäuser Verlag, Basel, 1998.
183. , BSDEs, weak convergence and homogenization of semilinear PDEs,
Nonlinear analysis, differential equations and control (Montreal, QC, 1998),
Kluwer Acad. Publ., Dordrecht, 1999, pp. 503–549.
184. É. Pardoux and S. G. Peng, Adapted solution of a backward stochastic differ-
ential equation, Systems Control Lett. 14 (1990), no. 1, 55–61.
185. É. Pardoux and S. Zhang, Generalized BSDEs and nonlinear Neumann bound-
ary value problems, Probab. Theory Related Fields 110 (1998), no. 4, 535–558.
186. K. R. Parthasarathy, Introduction to probability and measure, Texts and Read-
ings in Mathematics, vol. 33, Hindustan Book Agency, New Delhi, 2005, Cor-
rected reprint of the 1977 original.
187. A. Pazy, Semigroups of linear operators and applications to partial differential
equations, Springer-Verlag, New York, 1983.
188. Yu. V. Prohorov, Convergence of random processes and limit theorems in prob-
ability theory, Teor. Veroyatnost. i Primenen. 1 (1956), 177–238.
189. João B. Prolla, Approximation of vector valued functions, North-Holland Pub-
lishing Co., Amsterdam, 1977, North-Holland Mathematics Studies, Vol. 25,
Notas de Matemática, No. 61. [Notes on Mathematics, No. 61].
190. , The Weierstrass-Stone theorem in absolute valued division rings,
Indag. Math. (N.S.) 4 (1993), no. 1, 71–78.
191. João B. Prolla and Samuel Navarro, Approximation results in the strict topol-
ogy, Ann. Math. Blaise Pascal 4 (1997), no. 2, 61–82.
192. Philip E. Protter, Stochastic integration and differential equations, Stochas-
tic Modelling and Applied Probability, vol. 21, Springer-Verlag, Berlin, 2005,
Second edition. Version 2.1, Corrected third printing.
193. Jan Prüss, Maximal regularity for evolution equations in L^p-spaces, Conf.
Semin. Mat. Univ. Bari (2002), no. 285, 1–39 (2003).
194. Zhongmin Qian, A comparison theorem for an elliptic operator, Potential Anal.
8 (1998), no. 2, 137–142.
195. Frank Räbiger, Abdelaziz Rhandi, and Roland Schnaubelt, Perturbation and
an abstract characterization of evolution semigroups, J. Math. Anal. Appl. 198
(1996), no. 2, 516–533.
196. Frank Räbiger, Roland Schnaubelt, Abdelaziz Rhandi, and Jürgen Voigt, Non-
autonomous Miyadera perturbations, Differential Integral Equations 13 (2000),
no. 1-3, 341–368.
197. K. Murali Rao, On decomposition theorems of Meyer, Math. Scand. 24 (1969),
66–78.
198. D. Revuz, Markov chains, North-Holland Publishing Co., Amsterdam, 1975,
North-Holland Mathematical Library, Vol. 11.
199. D. Revuz and M. Yor, Continuous martingales and Brownian motion, third
ed., Springer-Verlag, Berlin, 1999.
200. G.O. Roberts and J.S. Rosenthal, Quantitative bounds for convergence rates
of continuous time Markov processes, Electronic Journal of Probability 1
(1996), no. 9, 1–21.
201. Jean-Pierre Roth, Formule de représentation et troncature des formes de
Dirichlet sur R^m, Séminaire de Théorie du Potentiel de Paris, No. 2 (Univ.
Paris, Paris, 1975–1976), Springer, Berlin, 1976, pp. 260–274. Lecture Notes in
Math., Vol. 563.
202. O. S. Rothaus, Diffusion on compact Riemannian manifolds and logarithmic
Sobolev inequalities, J. Funct. Anal. 42 (1981), no. 1, 102–109.
203. , Logarithmic Sobolev inequalities and the spectrum of Schrödinger op-
erators, J. Funct. Anal. 42 (1981), no. 1, 110–120.
204. , Hypercontractivity and the Bakry-Emery criterion for compact Lie
groups, J. Funct. Anal. 65 (1986), no. 3, 358–367.
205. Walter Rudin, Functional analysis, second ed., International Series in Pure and
Applied Mathematics, McGraw-Hill Inc., New York, 1991.
206. Wolfgang Ruess, On the locally convex structure of strict topologies, Mathema-
tische Zeitschrift 153 (1977), no. 2, 179–192.
207. Jan Seidler, Ergodic behaviour of stochastic parabolic equations, Czechoslovak
Math. J. 47 (1997), no. 122, 277–316.
208. M. Sharpe, General theory of Markov processes, Pure and Applied Math., vol.
133, Academic Press, New York, 1988.
209. L. A. Shepp, Symmetric random walk, Trans. Amer. Math. Soc. 104 (1962),
144–153.
210. , Recurrent random walks with arbitrarily large steps, Bull. Amer. Math.
Soc. 70 (1964), 540–542.
211. S.J. Sheu, Stochastic control and principal eigenvalue, Stochastics 11 (1984),
no. 3-4, 191–211.
212. A. N. Shiryayev, Probability, Graduate Texts in Mathematics, vol. 95, Springer-
Verlag, New York, 1984, Translated from the Russian by R. P. Boas.
213. B. Simon, Functional integration and quantum physics, Pure and Applied
Mathematics, vol. 86, Academic Press, [Harcourt Brace Jovanovich, Publish-
ers], New York-London, 1979.
214. A. V. Skorokhod, Asymptotic methods in the theory of stochastic differential
equations, Translations of Mathematical Monographs, vol. 78, American Math-
ematical Society, Providence, RI, 1989, Translated from the Russian by H. H.
McFaden.
215. Halil Mete Soner, Controlled Markov processes, viscosity solutions and appli-
cations to mathematical finance, Viscosity solutions and applications (Monte-
catini Terme, 1995), Lecture Notes in Math., vol. 1660, Springer, Berlin, 1997,
pp. 134–185. MR 98i:49032
216. Frank Spitzer, Principles of random walk, The University Series in Higher
Mathematics, D. Van Nostrand Co., Inc., Princeton, N.J.-Toronto-London,
1964.
217. Wilhelm Stannat, On the validity of the log-Sobolev inequality for symmetric
Fleming-Viot operators, Ann. Probab. 28 (2000), no. 2, 667–684.
218. , On the Poincaré inequality for infinitely divisible measures, Potential
Anal. 23 (2005), no. 3, 279–301.
219. , Stability of the optimal filter via pointwise gradient estimates, Stochas-
tic partial differential equations and applications—VII, Lect. Notes Pure Appl.
Math., vol. 245, Chapman & Hall/CRC, Boca Raton, FL, 2006, pp. 281–293.
220. Lukasz Stettner, On the existence and uniqueness of invariant measure for
continuous time Markov processes, Technical Report LCDS #86-18, Brown
University, Lefschetz Center for Dynamical Systems, Providence, RI, April
1986. The paper attempts to find fairly general conditions under which the
existence and uniqueness of an invariant measure is guaranteed; the results
obtained are new or at least slightly generalize known theorems. The author
introduces the terminology weak, strong Harris, and strong recurrence. Two
sections concern general standard processes; the other restricts attention to
Feller or strong Feller standard processes. Three examples illustrate possible
unpleasant situations one can meet in the general theory.
221. , Remarks on ergodic conditions for Markov processes on Polish spaces,
Bull. Polish Acad. Sci. Math. 42 (1994), 103–114.
222. Daniel W. Stroock, A concise introduction to the theory of integration, third
ed., Birkhäuser Boston Inc., Boston, MA, 1999.
223. Daniel W. Stroock and S. R. Srinivasa Varadhan, Multidimensional diffusion
processes, Classics in Mathematics, Springer-Verlag, Berlin, 2006, Reprint of
the 1997 edition.
224. D.W. Stroock, Probability theory, an analytic view, Cambridge University
Press, Cambridge, 2000.
References 567
225. D.W. Stroock and S.R. Srinivasa Varadhan, Multidimensional diffusion pro-
cesses, Grundlehren der Mathematischen Wissenschaften [Fundamental Prin-
ciples of Mathematical Sciences], vol. 233, Springer-Verlag, 1979.
226. Kazuaki Taira, On the existence of Feller semigroups with boundary conditions,
Memoirs of the AMS, no. 475, American Mathematical Society, Providence, RI,
USA, October 1993.
227. , Analytic Feller semigroups, Conf. Semin. Mat. Univ. Bari (1997),
no. 267, ii+29. MR 1612061 (99e:60167)
228. M. Thieullen, Second order stochastic differential equations and non-Gaussian
reciprocal diffusions, Probab. Theory Related Fields 97 (1993), 231–257.
229. , Reciprocal diffusions and symmetries, Stochastics Stochastics Rep. 65
(1998), 41–77.
230. M. Thieullen and J. C. Zambrini, Probability and quantum symmetries. I. The
theorem of Noether in Schrödinger’s Euclidean quantum mechanics, Ann. Inst.
H. Poincaré Phys. Théor. 67 (1997), no. 3, 297–338.
231. , Symmetries in the stochastic calculus of variations, Probab. Theory
Related Fields 107 (1997), no. 3, 401–427.
232. M. Thieullen and J.C. Zambrini, Probability and quantum symmetries. I. The
theorem of Noether in Schrödinger’s Euclidean quantum mechanics, Ann. Inst.
H. Poincaré Phys. Théor. 67 (1997), no. 3, 297–338.
233. , Symmetries in the stochastic calculus of variations, Probab. Theory Re-
lated Fields 107 (1997), 401–427.
234. Luke Tierney, Markov chains for exploring posterior distributions (with discus-
sion), Annals of Statistics 22 (1994), no. 4, 1701–1762.
235. Christopher Todd, Stone-Weierstrass theorems for the strict topology, Proc.
Amer. Math. Soc. 16 (1965), 654–659.
236. Pekka Tuominen and Richard L. Tweedie, Subgeometric rates of convergence
of f -ergodic Markov chains, Adv. in Appl. Probab. 26 (1994), no. 3, 775–798.
237. Jan A. Van Casteren, Generators of strongly continuous semigroups, Research
Notes in Mathematics, vol. 115, Pitman, 1985, Pitman Advanced Publishing
Program.
238. , Some problems in stochastic analysis and semigroup theory, Semi-
groups of operators: theory and applications (Newport Beach, CA, 1998),
Progr. Nonlinear Differential Equations Appl., vol. 42, Birkhäuser, Basel, 2000,
pp. 43–60.
239. , Viscosity solutions and the Hamilton-Jacobi-Bellmann equation, Pro-
ceedings, Third International Conference on Applied Mathematics and Engi-
neering Sciences, Casablanca, October 23, 24 and 25, CIMASI’2000, CIMASI,
Ecole Hassania des Travaux Publics, BP:8108, Oasis Route d’El Jadida, Km 7
Casablanca - Morocco, 2000, (on CD-rom).
240. , Feynman-Kac semigroups, martingales and wave operators, J. Korean
Math. Soc. 38 (2001), no. 2, 227–274.
241. , Markov processes and Feller semigroups, Conf. Semin. Mat. Univ. Bari
(2002), no. 286, 1–75 (2003).
242. , The Hamilton-Jacobi-Bellman equation and the stochastic Noether
theorem, Evolution equations: applications to physics, industry, life sciences
and economics (Levico Terme, 2000), Progr. Nonlinear Differential Equations
Appl., vol. 55, Birkhäuser, Basel, 2003, pp. 375–401.
243. , Analytic operators, in Liber Amicorum voor Nico Temme CWI, CWI,
Amsterdam, Netherlands, May 2005, pp. 8–12.
244. , Backward stochastic differential equations and Markov processes, Liber
Amicorum, Richard Delanghe: een veelzijdig wiskundige (Gent) (F Brackx and
H. De Schepper, eds.), Academia Press, University of Gent, 2005, pp. 199–239.
245. , Feynman-Kac formulas, backward stochastic differential equations and
Markov processes, Preprint University of Antwerp, Technical Report 2006-16,
2007.
246. , Viscosity solutions, backward stochastic differential equations and
Markov processes, Preprint University of Antwerp, Technical Report 2006-15,
submitted to “Integration: Mathematical Theory and Applications”, 2007.
247. , Feynman-Kac formulas, backward stochastic differential equations
and Markov processes, Functional Analysis and Evolution Equations (H. Amann,
W. Arendt, M. Hieber, F. Neubrander, S. Nicaise, and J. von Below, eds.),
vol. XX, Birkhäuser, 2008, Proceedings Conference August 28–September 1,
2006, pp. 83–111.
248. , Viscosity solutions, backward stochastic differential equations and
Markov processes, Integration: Mathematical Theory and Applications 1
(2008), no. 2, to appear.
249. Jan van Neerven, The Doob-Meyer decomposition theorem, Electronically:
http://fa.its.tudelft.nl/seminar/seminar2003-2004/lecture3.pdf, 2004, Seminar
Lectures Technical University Delft: Lecture 3.
250. S. R. S. Varadhan, Stochastic processes, Courant Lecture Notes in Mathemat-
ics, vol. 16, Courant Institute of Mathematical Sciences, New York, 2007.
251. Fengyu Wang, Functional Inequalities, Markov Semigroups and Spectral Theory,
Mathematics Monograph Series, vol. 4, Elsevier Science, Amsterdam, London,
New York, 2005, part of: The Science Series of the Contemporary Elite Youth.
252. Shinzo Watanabe and Toshio Yamada, On the uniqueness of solutions of
stochastic differential equations. II, J. Math. Kyoto Univ. 11 (1971), 553–563.
253. James Wells, Bounded continuous vector-valued functions on a locally compact
space, Michigan Math. J. 12 (1965), 119–126.
254. David Vernon Widder, The Laplace Transform, Princeton Mathematical Series,
v. 6, Princeton University Press, Princeton, N. J., 1946.
255. David Williams, Probability with martingales, Cambridge Mathematical Text-
books, Cambridge University Press, Cambridge, 1991.
256. Liming Wu, Uniformly integrable operators and large deviations for Markov
processes, J. Funct. Anal. 172 (2000), no. 2, 301–376.
257. A. C. Zaanen, Introduction to operator theory in Riesz spaces, Springer-Verlag,
Berlin and Heidelberg GmbH & Co., 1997.
258. Radu Zaharopol, Invariant probabilities of Markov-Feller operators and their
supports, Frontiers in Mathematics, Birkhäuser, Basel, Boston, 2005.
259. J.-C. Zambrini, A special time-dependent quantum invariant and a general the-
orem on quantum symmetries, Proceedings of the second International Work-
shop Stochastic Analysis and Mathematical Physics: ANESTOC’96 (Singa-
pore) (R. Rebolledo, ed.), World Scientific, 1998, Workshop: Viña del Mar,
Chile, 16–20 December 1996, pp. 197–210.
260. Jean-Claude Zambrini, A special time-dependent quantum invariant and a gen-
eral theorem on quantum symmetries, Stochastic analysis and mathematical
physics (Viña del Mar, 1996), World Sci. Publ., River Edge, NJ, 1998, pp. 197–
210.
261. V. M. Zolotarev, Probability metrics, Teor. Veroyatnost. i Primenen. 28 (1983),
no. 2, 264–287.
Index
Ar , 477, 479, 490 Pτ,x -distribution, 51
C(1)P,b , 108, 110 E+ , 470
C(1)P,b (λ), 108–110, 119, 120, 166 K(E), 23, 104, 152
HhA (λ), 489
HA (λ), 485
I-capacitable subset, 153 M2 , 186, 195, 200, 226, 235
LRA (0+) + I, 488 M2loc,unif , 187, 235
L1 -integrable M2unif , 187
uniform, 42 O(E), 152
L2 -martingale, 459 S2 , 194, 200, 235
L2 -spectral gap, 432, 451 S2 -property, 186
L2 -spectral gap inequality, 432 S2 × M2 , 194, 196, 199, 212, 216, 230,
Lh (λ), 484, 489 231, 244, 245
L∞ -spectral gap, 317, 354 S2loc,unif , 186, 235
M (E)-spectral gap, 316, 317, 354 S2loc , 335
M0 , 353 S2unif , 186
M0 (E), 296, 297, 368 Tβ -Cauchy sequence, 3
RhA (λ), 484, 489
S-topology, 41 Tβ -continuous Feller semigroup, 104
T , 473
Tβ -continuous semigroup, 92, 110, 391
T -invariant measure, 476
Tβ -convergence, 11
Th , 473
Tβ -convergent sequence, 12
TA,h , 473
ZY , 194 Tβ -dense, 26
Γ1 = squared gradient operator, 172 Tβ -derivatives, 32
Γ2 -condition, 419 Tβ -dissipative, 23
Γ2 -criterion, 458 Tβ -dissipative operator, 103, 105, 106,
λ-dominance, 87, 106 110, 112, 119
sequential, 103 positive, 104
λ-dominant operator, 106, 110 Tβ -dissipativity, 38
sequentially, 37, 104 Tβ -dual of Cb (E), 13
λ-super-median function, 104 Tβ -dual space of Cb (E), 9
(Ftτ )t∈[τ,T ] -stopping time, 31 Tβ -equi-continuity, 87
Tβ -equi-continuous, 65, 68, 70–72, 87, formal, 347
94, 95, 100, 313 almost separating generator, 398
Tβ -equi-continuous evolution, 116 almost separating subspace, 387, 388,
Tβ -equi-continuous family, 48, 141, 143, 390, 398, 476
144, 366, 367, 489 analytic maximum principle, 360, 369
Tβ -equi-continuous family of measures, analytic operator, 377
18, 313 analytic semigroup, 298, 305, 307, 353,
Tβ -equi-continuous family of operators, 379
25 bounded, 301
Tβ -equi-continuous semigroup, 105, 106, generator of, 353
116, 166 weak∗ -continuous bounded, 301
Tβ -generator, 116 aperiodic Markov chain, 467, 468
Tβ -generator of a Feller semigroup, 107 approach structure, 98
Tβ -limit, 172 approximating sequence of stopping
Tβ -sequentially complete, 3 times, 57, 58
Tβ -strongly continuous, 29 arbitrage free, 249, 250
µ-invariant subset, 410, 411 arbitrage free portfolio process, 249
∇Lu v (s, x), 180 arbitrage opportunity, 249
∇Lu (τ, x), 173 Arzela-Ascoli theorem, 16
π-irreducible Markov chain, 467 asset, 248
σ-field non-risky, 248
right closed, 52 risky, 248
σ-field after a stopping time, 52 asymptotic σ-field, 387
σ-field associated with stopping time,
463 backward doubly stochastic differential
σ-field between stopping times, 52 equation, 185
σ-finite invariant measure, 406, 407, backward martingale, 179, 290
468, 469, 471, 479, 492 backward martingale convergence
unique, 413 theorem, 417
σ-finite measure, 45 backward propagator, 323
σ-local maximum principle, 140 Backward Stochastic Differential
σ (M (E), Cb (E))”-convergence, 18 Equation, 171
ϕ-irreducible Markov chain, 472 Baire field versus Borel field, 11
ϑ1 -invariant subset, 416 Banach-Alaoglu
ζ, 176 Theorem of, 302
ζ = life time, 150, 158 Banach-Steinhaus
ζ: life time of process, 35 theorem of, 381
dL (x, y), 420 BDSDE, 185
dΓ1 (x, y), 420 Bernstein diffusion, 255
(infinitesimal) generator of Feller Bernstein probability, 291
evolution, 31 bi-continuous semigroup, 311
bi-topological space, 41
absolutely continuous measure, 45 bilinear mapping Z(t), 184
absorbing subset, 407 Black-Scholes equation, 290
additive measure, 406 Blumenthal’s zero-one law, 477
additive process, 405, 406 Bolzano-Weierstrass theorem, 17
time-homogeneous, 405, 406 Borel measure, 13
adjoint evolution family, 356 Borel probability measure, 13
adjoint of operator Borel-Cantelli lemma
generalized, 397, 404, 405 compact recurrent subset, 492
Borel-Cantelli-Lévy lemma, 397 compact subset
bottom of spectrum, 433 relatively weakly compact, 15
bounded analytic semigroup, 349, 350, comparison theorem, 171, 236
375 complex Noether theorem, 292
weak∗ -continuous, 309 complex version of Noether theorem,
bounded continuous function space, 3 285
bounded relative to the variation norm, conditional expectation, 75
14 conditional probability, 424
Brownian motion, 253 conservative Feller propagator, 63
Brownian motion, 178, 190, 191, 248, conservative part, 407, 408
250, 252, 323, 327, 329, 333, 459 conservative process, 176
BSDE, 171, 172, 181, 188, 191, 193, conservative propagator, 65
195, 196, 225, 231, 235, 236, 238, consumption process, 250
239, 242, 243, 248, 253 contingent claim, 251
linear, 231 contingent strategy, 250
strong solution to, 189 continuous orbit, 148
weak, 171 continuous sample path, 148
weak solution to, 180, 189 contraction operator, 407, 439
BSDE with drift, 190 conservative, 412
Buck, 3, 41 convergence for the strict topology
Burgers’ equation, 258 versus uniform convergence, 12
Burkholder-Davis-Gundy inequality, convergence of measures
186, 187, 195, 198, 202, 206, 214, weak, 41
219, 223, 334 convex function, 457
convolution product, 172
Cameron-Martin formula, 259 coupled stochastic differential equation,
Capacitability theorem of Choquet, 152, 421
153, 165 coupling method, 383, 420
capacitable subset, 161, 165, 167 coupling of diffusion processes, 425
Caratheodory’s theorem, 7 coupling operator, 420
carré du champ operateur covariance mapping, 240
See squared gradient operator, 255 covariance matrix, 325, 423
carré du champ operator, 292 covariation process, 344, 463
carré du champ opérateur: See squared quadratic, 463
gradient operator, 257 critical eigenvalue, 296
central limit theorem, 465, 470 cylindrical Brownian motion, 178
Chacon-Ornstein theorem, 411, 412
Chapman-Kolmogorov equation, 39, 64, Daniell-Stone
466 Theorem of, 4
Choquet capacitability theorem, 152, deflator process, 251
153, 155, 165, 167 derivative of quadratic variation
Choquet capacity, 152 process, 236
Choquet’s capacity theorem, 490 deterministic Noether theorem, 279
classical Noether theorem, 279 diffusion, 172
closable operator, 111 generator of, 173
co-variation process, 177, 184, 185, 236 diffusion matrix, 421
compact orbit diffusion part, 461
almost sure, 51 diffusion process, 39
diffusion with mixing property, 386 ergodic process, 388
Dini’s lemma, 11, 12, 15, 16, 25, 30, 42, ergodic system, 318, 355, 357, 358, 369,
66, 104, 263, 396 375
Dirac measure, 27 ergodic theory, 98
discounted gains process, 249 ergodicity results, 468
dissipative operator, 103, 111–115, 298 Euler-Lagrange equation, 289
dissipative part, 407, 408 event, 57
dissipative relative to Tβ , 38 evolution, 31, 341
distribution Feller, 27
finite dimensional, 329 evolution family, 39, 297, 311, 348, 356
divergence free action, 289 adjoint, 356
Doeblin’s condition, 376 exponential decay, 358
Doléans measure, 237 exponential martingale, 342, 347, 460,
dominant eigenvalue, 357 461
dominant eigenvector, 296 exponentially distributed variable, 493
dominated sequence, 10 exterior measure, 6
dominated sequences, 11 σ-field to, 6
Doob’s optional sampling theorem, 63,
64, 89, 91, 391
family of functionals
Doob’s submartingale inequality, 66
tight, 15
Doob’s theorem, 63
Doob-Meyer decomposiiton theorem, family of measures
245 tight, 15
Doob-Meyer decomposition theorem, Fatou’s lemma, 45
460 Feller evolution, 27, 35, 36, 38, 47, 55,
drift part, 461 69, 70, 72, 90, 94, 99–102, 115,
drift vector, 421, 423 116, 150, 175, 233, 262
Duhamel’s formula, 312, 314 generator of, 31, 110
Dunford projection, 296, 357, 362, 375 Feller propagator, 27, 35, 53, 63, 64,
Dunford-Pettis theorem, 245 102, 150, 174, 175, 257, 270
Dynkin system, 8, 44 conseravative, 63
Dynkin’s formula, 471, 480, 491, 492 Feller semigroup, 38, 40, 93, 94, 96, 101,
102, 119, 142, 145, 257
eigenvalue problem, 356 Tβ -continuous, 104
energy, 290 Feynman-Kac integral equation, 179
energy operator, 172 Feynma-Kac integral equation, 179
entrance time, 469, 470 Feynman-Kac formula, 171, 180, 231,
entropy of function, 434, 437 232, 248, 255, 262, 270, 277
entry time, 151, 154, 160, 162, 406, 469 non-linear, 181, 272
equi-continuous family filtration, 37
for the strict topology, 16 right closed, 36
equi-continuous family of operators, 30 final value problem, 180
equi-continuous for the strict topology, finite-dimensional distribu-
18 tion, 329
equi-continuous versus weakly compact fluctuation, 461
subset, 13 formula
ergodic diffusion, 386 Girsanov, 329
ergodic Markov chain, 470 forward propagator, 324
ergodic Markov process, 295 forward SDE, 172, 186
forward stochastic differential equation, generator of weak∗ -continuous analytic
172, 186 semigroup, 360
Fourier inverse formula, 348 generator of weak∗ -continuous bounded
Fubini’s theorem, 45, 325 analytic semigroup, 358
function generator of weak∗ -continuous
transtion, 31 semigroup, 315, 361, 362, 369
function space geometrically ergodic Markov chain,
bounded continuous, 3 470
functional and measure, 13 germ of function, 277
Girsanov formula, 329
Gaussian process, 325 Girsanov transformation, 190, 232
generalized Borel-Cantelli lemma, 397, global Korovkin property, 34, 37, 135,
404, 405 136
generator global Korovkin subspace, 34
almost separating, 398 global maximum principle, 131
generator of d-dimensional diffusion, Gâteaux derivative, 289
248
generator of a Feller-Dynkin semigroup, Hölder’s inequality, 329
360
Hahn decomposition, 299, 300
generator of a Markov process, 233
Hahn-Banach theorem, 22, 378
generator of analytic semigroup, 316,
Hahn-Jordan decomposition, 300
341, 353
Hamilton’s least action principle, 289
generator of bounded analytic
Hamilton-Jacobi theory, 289
semigroup, 310, 358, 363, 369
Hamilton-Jacobi-Bellman equation,
generator of bounded analytic
180, 232, 255, 258
weak∗ -continuous semigroup, 366
harmonic function, 407
generator of BSDE, 184, 193, 196
Harris recurrent Markov chain, 469
generator of diffusion, 173, 255, 257,
Harris recurrent Markov process, 399
258, 432, 446, 451
Harris recurrent subset, 398, 399, 404,
generator of diffusion process, 452
405
generator of Feller evolution, 37, 99, Hausdorff-Bernstein-Widder Laplace
Hausdorff-Bernstein-Widder Lapalce
103, 106, 110
inversion theorem, 143
generator of Feller evolu-
Hausdorff-Bernstein-Widder theorem,
tion:infinitesimal, 99
104, 143
generator of Feller semigroup, 110, 316,
319, 462 hedging strategy, 172, 251, 254
generator of Feller-Dynkin semigroup, Hellinger integral, 182
168 Hessian, 352, 444, 446
generator of Markov process, 176, 186, hitting time, 151, 152, 154, 160–162,
187, 232, 242, 326, 327, 354, 390 394, 397, 399, 402–404, 406, 464,
generator of semigroup, 102, 144, 428 466, 467, 471, 490, 491
generator of space-time process, 99 homeomorphism, 83
generator of strong Markov process, 35 homotopy argument, 172, 193, 212
generator of time-dependent Markov Hopf decomposition, 407, 408
process, 315, 317 Hunt process, 35
generator of time-homogeneous Markov hypercontractivity, 458
process, 427
generator of time-inhomogeneous increasing process
Markov process, 32, 100 predictable, 245
infinitesimal generator of Feller Kato condition, 262
evolution, 31, 99 Khas’minski lemma, 277
inner regular measure, 10 Kolmogorov operator, 295, 297, 300,
inner-regular measure, 25 311, 312, 314, 315, 357, 368, 375
integral operators, 25 Kolmogorov’s extension theorem, 40,
integrated semigroups, 110 47, 48, 76, 79, 81, 270
integration by parts formula, 177 Korovkin family, 167
invariant density, 468 Korovkin property, 34, 38, 94, 135
invariant distribution, 465, 466, 468 global, 34, 37, 135, 136
invariant event, 411 local, 136
invariant function, 357 Korovkin property on subset, 38, 95
invariant measure, 319, 326, 350, 374, Korovkin set, 98
384, 386, 388, 389, 399, 402, 406, Korovkin subspace
409, 410, 415, 416, 432–434, 438, global, 34
448, 451, 459, 465–470, 473, 474,
476, 486, 487, 490 Lévy, 97
σ-finite, 388, 398, 406–408, 469, 471, Lévy metric, 16, 41
472, 476, 477, 479, 484, 492 Lévy number, 77, 97
finite, 471, 477 Lévy-Prohorov metric, 97
unique σ-finite, 413 Lagrangian action, 233, 280
invariant measure and expectation of Lagrangian, 289
return times, 468 Laplace transform, 92, 104, 108
invariant measure vector valued, 110
σ-finite, 468 law, 319
invariant probability measure, 388 law of large numbers, 467
invariant subset, 410 Lemma
inverse Laplace transform, 306 Dini’s, 11, 25, 30
irreducible Markov chain, 467 Khasminski, 277
irreducible Markov chain with compact life time, 64, 72, 150, 158, 164, 176
recurrent subset, 469 life time of process, 35
irreducible Markov process, 384, 385, linear SDE, 250
399, 400, 470 Lipschitz condition, 181
irreducible time-homogeneous aperiodic Lipschitz constant, 209, 211, 212
Markov chain, 469 Lipschitz continuous function, 327
Itô integral, 190 Lipschitz function, 193, 195, 196, 199,
Itô’s formula, 196, 201, 204, 217, 330, 209, 211, 212, 226, 229, 327, 328,
426, 461 331, 386
Itô’s lemma, 220, 238, 251, 343–346 locally, 327
Itô’s uniqueness condition, 422, 424 one-sided, 193
iterated squared gradient operator, 420, local exponential martingale, 240
443 local Korovkin property, 136, 137,
iterated squared gradient operator Γ2 , 140–142
443, 444, 458 local martingale, 194, 195, 329, 459,
Itô’s formula, 284, 463 461, 463
backward, 240
Jensen inequality, 255, 272, 273, 457 right-continuous, 245
joint distribution, 319 local maximum principle, 139, 142
Jordan decomposition, 299, 300 local semi-martingale, 184, 240, 463
jump process of Poisson process, 415 local time
density of, 240 time-homogeneous, 174, 398
logarithmic Sobolev inequality, 383, markov process, 174, 193
434–436, 438, 439, 450, 458 Markov process with Feller property,
tight, 383, 434, 435 385
lwa of large numbers, 465 Markov process with left limits, 35
Lyubisch representation, 378 Markov property, 39, 45, 49, 52, 53, 61,
69, 70, 72, 74, 75, 85, 90, 175, 246,
marginal of Markov process, 462 261, 466, 472, 475, 485
marginal of strong Markov process, 35 strong, 60
Markov T -chain, 472 markov property, 466
Markov chain, 397, 404, 415, 416, Markov transition function, 353
465–467, 493, 494 martingale, 37, 50, 56, 66, 68, 79, 81,
µ-Harris recurrent, 415 85, 86, 89, 167, 171, 175–177, 179,
π-irreducible, 467 182, 184, 191, 192, 197, 209, 211,
ϕ-irreducible, 472 216, 226, 228, 232, 235–239, 246,
aperiodic, 416, 417, 467, 468 251, 253, 255, 270, 273, 277, 280,
Harris recurrent, 416, 417, 469 285, 290, 329, 330, 333, 345, 346,
null-recurrent, 468 409, 421, 427, 430, 459, 460, 462
positive recurrent, 468, 470 backward, 179
recurrent, 468 exponential, 342
time-homogeneous, 465, 468 martingale convergence theorem, 417
topological, 472 backward, 417
Markov chain sampler, 467 martingale problem, 35, 38–40, 72, 77,
Markov chain satisfying the detailed 83, 87, 88, 93, 94, 96, 142, 145,
balance condition, 467 146, 167, 168, 171, 191, 330, 384,
Markov chain with recurrent compact 421, 424
subset time-inhomogeneous, 97
irreducible, 469 well-posed, 35, 422
Markov operator, 312 martingale property, 81, 84
Markov process, 31, 38, 39, 41, 47, martingale representation theorem, 185,
64, 84, 87, 88, 90, 94, 99–101, 253
142, 145, 150–152, 158, 171, 172, martingale solution, 329
174–176, 186, 187, 191, 225, 226, martingale:local, 459
228, 231, 235, 236, 244, 262, 277, maximal mapping, 47
327, 330, 384, 458, 462 maximum operator, 59
µ-Harris recurrent, 399 maximum principle, 22, 23, 33, 34, 37,
generator of, 232 38, 94, 103–106, 110, 111, 113,
Harris recurrent, 399, 448 131, 135, 136, 138, 145, 166, 167,
irreducible, 384 350, 352, 369
irreducible strong Feller, 418 analytic, 360, 369
life time of, 35 weak, 33
normal, 35 maximum principle on a subset, 134
normal strong, 36 maximum principle on subset, 38, 95
quasi-left continuous, 35 maximum time operator, 39
reciprocal, 290 measure
standard, 163 T -invariant, 476
strong, 31, 52, 56 σ-finite, 45
strong Feller, 244, 388, 492 absolutely continuous, 45
strong Markov, 35 exterior, 6
inner regular, 10, 25 Noether theorem
invariant, 326 deterministic, 289
outer, 6 Novikov condition, 269, 460
measure theory, 5 Novikov’s condition, 250
metric null-recurrent Markov chain, 468
Lévy, 16
metric on E, 38 occupation formula, 240
metric on E 4 , 38 once integrated semigroup, 110
minimum time operator, 39 one-dimensional distribution of strong
mixing property, 388 Markov process, 35
modification, 47, 51, 176, 177 one-sided Lipschitz function, 193, 200,
right-continuous with left limits, 51 229, 235
modified version, 51 operator
momentum observable, 290 (sub-)Kolmogorov, 300
Monotone Class Theorem, 31 analytic, 377
monotone class theorem, 44, 45, 53, 55, positivity preserving, 24
56, 60, 67, 74, 227 sequentially weak∗ closed, 362
monotone function, 193, 195, 196, 199, operator:time derivative, 32
200, 208, 225, 226, 229, 235 operators with unique Markov
monotone mapping, 21 extensions, 35
monotonicity condition, 193 option pricing, 172
monotonicity constant, 225 opérateur carré du champ, 463
Markov process orbit, 48, 51, 66, 97, 176
pinned, 291 sequentially compact, 51
multiplicative Borel measure, 27 Orey’s convergence theorem, 414, 417,
multiplicative process, 405 448, 451
time-homogeneous, 405 Orey’s theorem, 417
Myadera perturbation condition, 260 Ornstein-Uhlenbeck process, 318, 326
Myadera potential, 261, 262 Ornstein-Uhlenbeck semigroup, 357
oscillator process, 318
negative-definite matrix, 352 outer measure, 6
Neumann boundary condition, 172
Nevue-Chacon identification theorem, parabolic differential equation, 171
411 partial differential equations of
no-arbitrage, 248, 249 parabolic type, 235
Noether theorem particle mass, 269
stochatic, 255 pavement, 152
Noether constant, 290 penetration time, 464
Noether observable, 285, 288 petite subset, 472
Noether theorem, 255, 292 phase, 184
complex version of, 285 stochastic, 184
deterministic, 279, 289 phase space, 173
stocahstic, 279 stocahstic, 174
non-conservative process, 176 pinned Markov process, 291
non-linear Feynman-Kac formula, 181, Planck’s constant, 269
272 Poincaré inequality, 383, 432, 434–436,
non-risky asset, 248 458
normal Markov process, 35 pointwise, 450
normal strong Markov process, 36 point evaluation, 27
pointwise convergence, 5, 12 progressively measurable process, 423,
pointwise defined operator, 490 424
pointwise ergodic theorem of Birkhoff, progressively mesurable process, 423
388 Prohorov metric = Lévy-Prohorov
pointwise generator of semigroup, 477 metric, 98
pointwise limit, 10, 13 projection operator, 487
pointwise Poincaré inequality, 450 propagator, 31, 62, 188
Poisson process, 415 backward, 323
jump process of, 415 Feller, 27
Polish space, 3, 10, 15, 18, 27, 28, 41, pseudo-hitting time, 151, 152, 154
47, 49, 63, 83, 141, 295
portfolio process, 249 quadratic co-variation process, 177
arbitrage free, 249 quadratic covariation process, 428
tame, 249 quadratic variation process, 178, 184,
positive Tβ -dissipative, 141, 166 187, 198, 232, 237, 260, 460
positive Tβ -dissipative operator, 38, derivative of, 236
104, 112, 142, 166 quasi-left continuous Markov process,
positive capacity, 152 35, 61, 64, 72
positive contraction operator, 407, 408, quasi-left continuous process, 157, 159,
411, 412, 416 160, 164, 165
positive homogeneous functional, 134
Radon-Nikodym derivative, 237, 250,
positive operator, 407
252, 291
positive recurrent Markov chain,
Radon-Nikodym theorem, 46
468–470
reciprocal Markov probability distribu-
positive resolvent property, 22
tion, 291
positive-definite matrix, 323, 350, 383
reciprocal Markov process, 290
positive-definitive matrix, 352
reciprocal probability distribution, 291
positivity preserving operator, 24, 407
recurrence property, 404
power Tβ -dissipative operator, 110
recurrent Markov chain, 469, 471
predictable process, 237, 245, 249, 252,
Harris, 469
459, 460, 462
recurrent Markov process, 388, 399,
probability measure 404, 451
Borel, 13 recurrent subset, 392, 397–399, 403, 490
problem comapct, 398, 404
martingale, 40 compact, 404
process Harris, 398, 399, 404, 405
Gaussian, 325 recurrent, 404
life time of, 35 redefinition of process, 60
Markov, 31 regular point, 477
Ornstein-Uhlenbeck, 318 relatively compact range
oscillator, 318 paths with, 33
redeefinition of, 60 relatively compact subset, 33
strong Markov, 31 relatively weakly compact subset, 15
wealth, 250 relatively weakly compact subset versus
process of bounded variation, 460 tight, 16
process of class (DL), 244, 245, 460 resolvent equation, 95
product σ-field, 47 resolvent family, 70, 105, 106, 111, 132,
product topology, 47 142–144, 373, 471, 480
resolvent identity, 86 weak∗ -continuous bounded analytic,
resolvent operator 301
powers of, 106 sequential λ-dominance, 38, 103
reversible Markov chain, 467 sequentially λ-dominant operator, 23,
Riccati type equation, 258 37, 38, 83, 87, 104, 142
Riccati type equation, 232 sequentially compact orbit, 48, 51
right closed σ-field, 52 sequentially compact path, 48
right closure of σ-field, 52 sequentially weak∗ closed operator, 362,
right closure of a σ-field, 52 365
right-closed filtration, 36 sequentially weak∗ -closed operator, 362,
right-continuity of evolution, 52 363
right-continuity of propagator, 52 simple process, 462
right-continuous filtration, 150 skeleton of Markov process, 466
right-continuous Markov process, 35 skeleton of time-homogeneous Markov
risk adjustment factor, 461 process, 466
risk premium vector, 249 Skorohod space, 32, 36, 37, 60, 63, 64,
risk process, 249 76, 83, 88, 164
risk-adjusted measure, 252 Skorohod topology, 41
risky asset, 248 small subset, 472
Rothaus inequality, 455 Sobolev inequality, 458
Runge-Kutta type result, 212 Sobolev inequality of order p, 435, 436
Sobolev space, 184
space of bounded continuous functions,
sample path of strong Markov process, 27
35 space time process
Scheffé’s theorem, 42, 43 generator of, 99
SDE, 248, 250, 326 space-time operator, 102
forward, 172, 186 space-time space, 101
linear, 250, 251 space-time variable, 57, 74, 164, 225
sectorial generator of Feller semigroup, spectral estimate, 357
309 spectral gap, 316, 383, 386, 420, 447
sectorial operator, 298, 299 spectral gap inequality, 432, 458
sectorial sub-Kolmogorov operator, 297, split measure, 476
301, 304, 309, 358 squared gradient operator, 172, 173,
self-financing hedging strategy, 250, 251 180, 190, 233, 240, 242, 257, 292,
self-financing strategy, 250 419, 427, 432, 438, 449, 452, 458,
semi-linear equation, 194 463
semi-linear partial differential equation, iterated, 419, 458
171 squared gradient operator Γ2
semi-martingale, 191, 192, 335, 463 iterated, 420, 443
semigroup, 84, 86, 166, 347, 465, 471, standard Markov process, 158, 163, 164
477 state space, 47
Tβ -equi-continuous, 105, 106 state variable, 37, 47
analytic, 305 stochastic, 59
bi-continuous, 311 stationary function, 357
bounded analytic, 350 stochastic differential, 345, 346
Feller, 40 stochastic differential equation, 248,
Feller-Dynkin, 40 326, 335, 341, 342, 347, 383, 421
Ornstein-Uhlenbeck, 357 coupled, 425
stochastic integral, 461, 462 sub-Kolmogorov operator, 297, 300,
stochastic Noether theorem, 255, 279 309–311
stochastic phase, 184 sectorial, 297
stochastic phase space, 174, 184 sub-Markov transition function, 148,
stochastic state variable, 59 149
stochastic time change, 465 sub-martingale, 244, 245, 460
stochastic variable, 31 sub-solution
stopping time, 31, 36, 51, 57, 64, viscosity, 277
154–160, 162, 164, 167, 174, 202, submartingale, 66
208, 216, 220, 224, 243, 244, 330, subset
390, 397, 400, 403, 421, 459, 463, petite, 472
464, 471, 473, 481, 488 smalls, 472
approximating sequence, 57, 58 weakly compact, 14
terminal, 406 super-martingale, 460
terminal after another, 162–165 local, 460
strategy super-solution
self-financing, 250 viscosity, 279
strict limit, 172 superharmonic function, 407
strict topological dual, 11 strictly, 407
strict topology, 3, 11, 15, 27, 32, 41, 65, supermartingale, 51, 81, 97
80, 99, 117, 172, 175, 257, 295, 388 surjective function, 209, 211, 212
strictly convergent sequence, 11, 12 surjective mapping, 193
strictly equi-continuous family, 16
strictly superharmonic function, 407 tail σ-field, 387, 388, 411, 417, 418
strong Feller Markov process, 398 tame portfolio process, 249
strong Feller process, 472 tangent vector field, 289
strong Feller property, 244, 245, 248, terminal σ-field, 175
472 terminal stopping time, 162–164, 406,
strong Feller semigroup, 395 464
strong law of large numbers, 388 terminal value problem, 180
strong Markov process, 31, 35, 52, 72, Theorem
76, 175, 190, 257–259, 291, 318, Doob’s optional sampling, 391
415, 471 Theorem
marginal of, 35 backward martingale convergence,
one-dimensional distribution of, 35 417
strong Markov process with respect to central limit, 470
right-closed filtration, 36 Chacon-Ornstein, 412
strong Markov property, 56, 60, 94, 158, Choquet’s capacity, 490
163, 165, 247, 463, 464, 491 comparison, 236
with respect to hitting times, 163 Doob-Meyer decomposition, 245, 460
strong mixing property, 389 Dunford-Pettis, 245
strong solution to BSDE, 189, 191 Kolmogorov’s extension, 40, 47
strong time-dependent Markov martingale convergence, 417
property, 464 monotone class, 31
strongly sub-additive function, 152 Neveu-Chacon identification, 411
Stroock, 16 of Arzela-Ascoli, 16
sub-additive functional, 134, 140 of Banach-Alaoglu, 18, 302, 313, 314
sub-additive mapping, 22, 86 of Banach-Steinhaus, 376
sub-invariant measure, 475 of Bolzano-Weierstrass, 17
    of Caratheodory, 7
    of Chacon-Ornstein, 411, 468
    of Daniell-Stone, 4
    of Fubini, 325
    of Hahn-Banach, 22, 142, 378
    of Orey, 469
    of Radon-Nikodym, 45, 46
    of Scheffé, 42, 43
    Orey's convergence, 417, 448
Orey's convergence theorem, 414, 469
pointwise ergodic theorem of, 388
Tietze
    extension theorem of, 17
Tietze's extension theorem, 17
tight family of functionals, 15
tight family of measures, 15
tight family of operators, 19, 30
tight family of operators versus equi-continuous family, 19
tight logarithmic Sobolev inequality, 435
tight Sobolev inequality, 434–437
time derivative operator, 32
time shift operator, 167, 466
time transformation, 73
time translation operator, 39, 387, 466
time-dependent Markov process, 326
time-dependent measure, 355
time-homogeneous Markov process, 39, 388, 399
    skeleton of, 466
time-homogeneous Markov property, 464
time-homogeneous strong Markov process, 391
time-homogeneous terminal stopping time, 406
time-inhomogeneous Markov process, 174, 175, 356
    generator of, 32, 100
time-inhomogeneous martingale problem, 97
topological dual relative to the strict topology, 11
topology
    Skorohod, 41
    strict, 3, 41, 295
    uniform, 4
    weak, 295
    weak∗, 295
total variation, 355, 357
totally bounded orbit, 78
totally bounded path, 78
totally bounded subset, 33, 63
tower property of conditional expectation, 54
transition function, 31, 48, 158, 174
transition probability function, 354, 385, 398, 465, 466
twice integrated semigroup, 110
uniform boundedness principle, 378
uniform topology, 4, 257
uniformly L1-integrable, 42
uniformly bounded and uniformly holomorphic family of semigroups, 370, 374
uniformly weak∗-equi-continuous, 313
unique Markov extension, 37, 38, 94, 167
unique Markov extensions
    operators with, 35
unique measure, 4, 8
unique weak solutions, 421
unique weak solutions to stochastic differential equations, 420
variation measure, 27, 299
variation norm, 295
viscosity solution, 235, 236, 241, 242, 254, 255
viscosity solutions, 171, 181
viscosity sub-solution, 242, 277, 278
viscosity super-solution, 242, 279
volatility, 461
volatility matrix, 248
weak BSDE, 171
weak convergence, 18
weak convergence of measures, 41
weak maximum principle, 33
weak solution, 186, 327, 329, 421
weak solution to BSDE, 180, 189–191
weak topology, 295
weak∗-compact, 113
weak∗-continuous analytic semigroup, 305
weak∗-continuous bounded analytic semigroup, 301, 309
weak∗-continuous semigroup, 311
weak∗-convergence, 312
weak∗-generator of bounded analytic semigroup, 364
weak∗-topology, 295
weakly compact subset, 14
weakly continuous, 14
wealth process, 250
well-posed martingale problem, 35, 421, 422
Wiener process, 190, 248
