Documente Academic
Documente Profesional
Documente Cultură
Random Media
Signal Processing
and Image Synthesis
Mathematical Economics and Finance
Stochastic Optimization
Stochastic Control
Stochastic Models in Life Sciences
Stochastic Modelling
and Applied Probability
(Formerly:
Applications of Mathematics)
64
Edited by B. Rozovski
G. Grimmett
Advisory Board M. Hairer
I. Karatzas
F.P. Kelly
A. Kyprianou
Y. Le Jan
B. ksendal
G. Papanicolaou
E. Pardoux
E. Perkins
Eckhard Platen
Nicola Bruti-Liberati
Numerical Solution
of Stochastic
Differential Equations
with Jumps in Finance
Eckhard Platen
Nicola Bruti-Liberati (19752007)
School of Finance and Economics
Department of Mathematical Sciences
University of Technology, Sydney
PO Box 123
Broadway NSW 2007
Australia
eckhard.platen@uts.edu.au
Managing Editors
Boris Rozovski
Division of Applied Mathematics
Brown University
182 George St
Providence, RI 02912
USA
rozovsky@dam.brown.edu
Geoffrey Grimmett
Centre for Mathematical Sciences
University of Cambridge
Wilberforce Road
Cambridge CB3 0WB
UK
g.r.grimmett@statslab.cam.ac.uk
ISSN 0172-4568
ISBN 978-3-642-12057-2
e-ISBN 978-3-642-13694-8
DOI 10.1007/978-3-642-13694-8
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2010931518
Mathematics Subject Classification (2010): 60H10, 65C05, 62P05
Springer-Verlag Berlin Heidelberg 2010
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Cover design: deblik
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
VI
Preface
simplied schemes. The nal part of the research monograph raises questions
on numerical stability and discusses powerful martingale representations and
variance reduction techniques in the context of derivative pricing.
The book does not claim to be a complete account of the state of the
art of the subject. Rather it attempts to provide a systematic framework for
an understanding of the basic concepts and tools needed when implementing simulation methods for the numerical solution of SDEs. In doing so the
book aims to follow up on the presentation of the topic in Kloeden & Platen
(1999) where no jumps were considered and no particular eld of application motivated the numerical methods. The book goes signicantly beyond
Kloeden & Platen (1999). It is covering many new results for the approximation of continuous solutions of SDEs. The discrete time approximation of
SDEs with jumps represents the focus of the monograph. The reader learns
about powerful numerical methods for the solution of SDEs with jumps. These
need to be implemented with care. It is directed at readers from dierent elds
and backgrounds.
The area of nance has been chosen to motivate the methods. It has been
also a focus of research by the rst author for many years that culminated
in the development of the benchmark approach, see Platen & Heath (2006),
which provides a general framework for modeling risk in nance, insurance and
other areas and may be new to most readers. The book is written at a level
that is appropriate for a reader with an engineers or similar undergraduate
training in mathematical methods. It is readily accessible to many who only
require numerical recipes.
Together with Nicola Bruti-Liberati we had for several years planned a
book to follow on the book with Peter Kloeden on the Numerical Solution of
Stochastic Dierential Equations, which rst appeared in 1992 at Springer
Verlag and helped to develop the theory and practice of this eld. Nicolas
PhD thesis was written to provide proofs for parts of such a book. It is very
sad that Nicola died tragically in a trac accident on 28 August 2007. This
was an enormous loss for his family and friends, his colleagues and the area
of quantitative methods in nance.
The writing of such a book was not yet started at the time of Nicolas
tragic death. I wish to express my deep gratitude to Katrin Platen, who
then agreed to typeset an even more comprehensive book than was originally
envisaged. She carefully and patiently wrote and revised several versions of
the manuscript under dicult circumstances. The book now contains not only
results that we obtained with Nicola on the numerical solution of SDEs with
jumps, but also presents methods for exact simulation, parameter estimation,
ltering and ecient variance reduction, as well as the simulation of hedge
ratios and the construction of martingale representations.
I would like to thank several colleagues for their collaboration in related
research and valuable suggestions on the manuscript, including Kevin Burrage, Leunglung Chan, Kristoer Glover, David Heath, Des Higham, Hardy
Hulley, Constantinos Kardaras, Peter Kloeden, Uwe K
uchler, Herman Lukito,
Preface
VII
Eckhard Platen
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
XV
1
1
16
23
26
34
38
45
53
57
59
61
61
63
78
99
105
113
123
136
Contents
Stochastic Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1 Introduction to Wagner-Platen Expansions . . . . . . . . . . . . . . . . .
4.2 Multiple Stochastic Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3 Coecient Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4 Wagner-Platen Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5 Moments of Multiple Stochastic Integrals . . . . . . . . . . . . . . . . . . .
4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
187
187
195
202
206
211
230
233
233
245
252
266
271
Regular Strong It
o Approximations . . . . . . . . . . . . . . . . . . . . . . . .
7.1 Explicit Regular Strong Schemes . . . . . . . . . . . . . . . . . . . . . . . . . .
7.2 Drift-Implicit Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.3 Balanced Implicit Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.4 Predictor-Corrector Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.5 Convergence Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
309
309
316
321
326
331
346
Contents
XI
347
347
350
355
356
359
361
362
368
375
388
389
389
393
397
404
409
413
417
10 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10.1 Kalman-Bucy Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10.2 Hidden Markov Chain Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10.3 Filtering a Mean Reverting Process . . . . . . . . . . . . . . . . . . . . . . . .
10.4 Balanced Method in Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10.5 A Benchmark Approach to Filtering in Finance . . . . . . . . . . . . .
10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
419
419
424
433
447
456
475
477
477
481
491
495
497
504
507
507
514
517
522
XII
Contents
523
523
529
530
533
534
543
548
569
14 Numerical Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14.1 Asymptotic p-Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14.2 Stability of Predictor-Corrector Methods . . . . . . . . . . . . . . . . . . .
14.3 Stability of Some Implicit Methods . . . . . . . . . . . . . . . . . . . . . . . .
14.4 Stability of Simplied Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
571
571
576
583
586
590
591
591
595
601
606
616
621
627
635
637
637
645
658
669
677
694
697
697
712
720
734
744
753
Contents
XIII
It has been mentioned in the Preface that the material of this book has been
arranged in a way that should make it accessible to as wide a readership as
possible. Prospective readers will have dierent backgrounds and objectives.
The following four groups are suggestions to help use the book more eciently.
(i) Let us begin with those readers who aim for a sucient understanding,
to be able to apply stochastic dierential equations with jumps and appropriate simulation methods in their eld of application, which may not
be nance. Deeper mathematical issues are avoided in the following suggested sequence of reading, which provides a guide to the book for those
without a strong mathematical background:
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8
12.1 12.2
Chapter 14
16.1
XVI
(ii) Engineers, quantitative analysts and others with a more technical background in mathematical and quantitative methods who are interested in
applying stochastic dierential equations with jumps, and in implementing ecient simulation methods or developing new schemes could use the
book according to the following suggested owchart. Without too much
emphasis on proofs the selected material provides the underlying mathematics.
Chapter 1
Chapter 2
Chapter 4
Chapter 5
Chapter 11
12.1 12.2
Chapter 14
Chapter 16
XVII
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 16
XVIII
Chapter 2
Chapter 3
Chapter 5
Chapter 9
Chapter 10
Chapter 11
12.1 12.2
Chapter 14
Chapter 15
Chapter 16
Chapter 17
Basic Notation
mean of X
2
X
, Var(X)
variance of X
Cov(X, Y )
covariance of X and Y
inf{}
sup{}
max(a, b) = a b
maximum of a and b
min(a, b) = a b
minimum of a and b
(a) = max(a, 0)
maximum of a and 0
x
x = (x , x , . . . , x )
|x|
A = [ai,j ]k,d
i,j=1
det (A)
determinant of a matrix A
A1
inverse of a matrix A
(x, y)
N = {1, 2, . . .}
XX
Basic Notation
innity
(a, b)
[a, b]
closed interval a x b in
= (, )
+ = [0, )
d
sample space
empty set
AB
AB
A\B
E = \{0}
without origin
[X, Y ]t
[X]t
n! = 1 2 . . . n
factorial of n
[a]
i.i.d.
a.s.
almost surely
f
rst derivative of f :
f
second derivative of f :
f : Q1 Q 2
u
xi
k
xi
there exists
FX ()
distribution function of X
fX ()
X ()
characteristic function of X
1A
Basic Notation
XXI
N ()
()
gamma function
(; )
(mod c)
modulo c
ltration
E(X)
expectation of X
E(X | A)
P (A)
probability of A
P (A | B)
probability of A conditioned on B
element of
not element of
not equal
approximately equal
ab
limN
lim inf N
lim supN
()
unit matrix
sgn(x)
sign of x
L2T
B(U )
smallest sigma-algebra on U
ln(a)
natural logarithm of a
MM
Merton model
MMM
XXII
Basic Notation
GIG
GH
generalized hyperbolic
VG
variance gamma
GOP
EWI
ODE
SDE
PDE
PIDE
I ()
K ()
i
l
i!
l!(il)!
combinatorial coecient
C k (Rd , R)
CPk (Rd , R)
C, C1 , . . ., C,
. . . represent nite positive real
Letters such as K, K1 , . . ., K,
constants that can vary from line to line. All these constants are assumed to
be independent of the time step size .
Key features of advanced models in many areas of application with uncertainties are often event-driven. In nance and insurance one has to deal
with events such as corporate defaults, operational failures or insured accidents. By analyzing time series of historical data, such as prices and other
nancial quantities, many authors have argued in the area of nance for
the presence of jumps, see Jorion (1988) and Ait-Sahalia (2004) for foreign
exchange and stock markets, and Johannes (2004) for short-term interest
rates. Jumps are also used to generate the short-term smile eect observed
in implied volatilities of option prices, see Cont & Tankov (2004). Furthermore, jumps are needed to properly model credit events like defaults and
credit rating changes, see for instance Jarrow, Lando & Turnbull (1997). The
short rate, typically set by a central bank, jumps up or down, usually by
some quarters of a percent, see Babbs & Webber (1995). Models for the dynamics of nancial quantities specied by stochastic dierential equations
(SDEs) with jumps have become increasingly popular. Models of this kind
can be found, for instance, in Merton (1976), Bjork, Kabanov & Runggaldier
(1997), Due, Pan & Singleton (2000), Kou (2002), Sch
onbucher (2003),
Glasserman & Kou (2003), Cont & Tankov (2004) and Geman & Roncoroni
(2006). The areas of application of SDEs with jumps go far beyond nance. Other areas of application include economics, insurance, population dynamics, epidemiology, structural mechanics, physics, chemistry and
biotechnology. In chemistry, for instance, the reactions of single molecules
or coupled reactions yield stochastic models with jumps, see, for instance,
Turner, Schnell & Burrage (2004), to indicate just one such application.
Since only a small class of jump diusion SDEs admits explicit solutions,
it is important to construct discrete-time approximations. The focus of this
monograph is the numerical solution of SDEs with jumps via simulation. We
consider pathwise scenario simulation, for which strong schemes are used, and
Monte Carlo simulation, for which weak schemes are employed. Of course,
there exist various alternative methods to Monte Carlo simulation that we only
consider peripherally in this book when it is related to the idea of discrete-time
XXIV
numerical approximations. These methods include Markov chain approximations, tree-based, and nite dierence methods. The class of SDEs considered
here are those driven by Wiener processes and Poisson random measures.
Some authors consider the smaller class of SDEs driven by Wiener processes
and homogeneous Poisson processes, while other authors analyze the larger
class of SDEs driven by fairly general semimartingales. The class of SDEs
driven by Wiener processes and Poisson jump measures with nite intensity
appears to be large enough for realistic modeling of the dynamics of quantities
in nance. Here continuous trading noise and a few single events model the
typical sources of uncertainty. Furthermore, stochastic jump sizes and stochastic intensities, can be conveniently covered by using a Poisson jump measure.
As we will explain, there are some numerical and theoretical advantages when
modeling jumps with predescribed size. The simulation of some Levy process
driven dynamics will also be discussed. The development of a rich theory on
simulation methods for SDEs with jumps, similar to that established for pure
diusion SDEs in Kloeden & Platen (1992), is still under way. This book aims
to contribute to this theory motivated by applications in the area of nance.
However, challenging problems in insurance, biology, chemistry, physics and
many other areas can readily apply the presented numerical methods.
We consider discrete-time approximations of solutions of SDEs constructed
on time discretizations (t) , with maximum step size (0, 0 ), with
0 (0, 1). We call a time discretization regular if the jump times, generated by the Poisson measure, are not discretization times. On the other
hand, if the jump times are included in the time discretization, then a jumpadapted time discretization is obtained. Accordingly, discrete-time approximations constructed on regular time discretizations are called regular schemes,
while approximations constructed on jump-adapted time discretizations are
called jump-adapted schemes.
Discrete-time approximations can be divided into two major classes: strong
approximations and weak approximations, see Kloeden & Platen (1999). We
say that a discrete-time approximation Y , constructed on a time discretization (t) , with maximum step size > 0, converges with strong order at
time T to the solution X of a given SDE, if there exists a positive constant
C, independent of , and a nite number 0 (0, 1), such that
E(|X T Y
T |) C ,
(0.0.1)
for all (0, 0 ). From the denition of the strong error on the left hand side
of (0.0.1) one notices that strong schemes provide pathwise approximations of
the original solution X of the given SDE. These methods are therefore suitable
for problems such as ltering, scenario simulation and hedge simulation, as
well as the testing of statistical and other quantitative methods. In insurance,
the area of dynamic nancial analysis is well suited for applications of strong
approximations.
XXV
|E(g(X T )) E(g(Y
T ))| C ,
(0.0.2)
2(+1)
(0.0.4)
K
E
Y
(0.0.5)
X
E (g(X T )) E g Y
T
T
T
for each (0, 0 ).
2(+1)
Since the set of Lipschitz continuous functions includes the set CP
(d ,
XXVI
and pure jump SDEs. It also includes some discussions on parameter estimation and ltering as well as their relation to strong approximation methods.
Finally, the third part, which is composed of Chaps. 11 up to 17, introduces
weak approximations for Monte Carlo simulation and discusses ecient implementations of weak schemes and numerical stability. Here the simulation
of hedge ratios, ecient variance reduction techniques, Markov chain approximations and nite dierence methods are discussed in the context of weak
approximation.
The monograph Kloeden & Platen (1992) and its printings in (1995) and
(1999) aimed to give a reasonable overview on the literature on the numerical
solution of SDEs via simulation methods. Over the last two decades the eld
has grown so rapidly that it is no longer possible to provide a reasonably fair
presentation of the area. This book is, therefore, simply presenting results that
the authors were in some form involved with. There are several other lines of
research that may be of considerable value to those who have an interest in
this eld. We apologize to those who would have expected other interesting
topics to be covered by the book. Unfortunately, this was not possible due to
limitations of space.
For further reading also in areas that are related but could not be covered we may refer the reader to various well-written books, including Ikeda
& Watanabe (1989), Niederreiter (1992), Elliott, Aggoun & Moore (1995),
Milstein (1995a), Embrechts, Kl
uppelberg & Mikosch (1997), Bj
ork (1998),
Karatzas & Shreve (1998), Mikosch (1998), Kloeden & Platen (1999), Shiryaev
(1999), Bielecki & Rutkowski (2002), Borodin & Salminen (2002), J
ackel
(2002), Joshi (2003), Sch
onbucher (2003), Shreve (2003a, 2003b), Cont &
Tankov (2004), Glasserman (2004), Higham (2004), Milstein & Tretjakov
(2004), Achdou & Pironneau (2005), Brigo & Mercurio (2005), Elliott & Kopp
(2005), Klebaner (2005), McLeish (2005), McNeil, Frey & Embrechts (2005),
Musiela & Rutkowski (2005), ksendal & Sulem (2005), Protter (2005), Chan
& Wong (2006), Delbaen & Schachermayer (2006), Elliott & van der Hoek
(2006), Malliavin & Thalmaier (2006), Platen & Heath (2006), Seydel (2006),
Asmussen & Glynn (2007), Lamberton & Lapeyre (2007) and Jeanblanc, Yor
& Chesney (2009).
The book has been used as reference for the Masters in Quantitative Finance and the PhD program at the University of Technology in Sydney, as
well as for courses and workshops that the rst author has presented in various
places.
The formulas in the book are numbered according to the chapter and
section where they appear. Assumptions, theorems, lemmas, denitions and
corollaries are numbered sequentially in each section. The most common notations are listed at the beginning and an Index of Keywords is given at the
end of the book. Some readers may nd the Author Index at the end of the
book useful. Each chapter nishes with some Exercises with Solutions given
in Chap. 18. These are aimed to support the study of the material.
We conclude this brief survey with the remark that the practical application and theoretical understanding of numerical methods for stochastic dierential equations with jumps are still under development. This book shall stimulate interest and further work on such methods. The Bibliographical Notes
at the end of this research monograph may be of assistance.
1
Stochastic Dierential Equations with Jumps
Stochastic dierential equations (SDEs) with jumps provide the most exible,
numerically accessible, mathematical framework that allows us to model the
evolution of nancial and other random quantities over time. In particular,
feedback eects can be easily modeled and jumps enable us to model events.
This chapter introduces SDEs driven by Wiener processes, Poisson processes
and Poisson random measures. We also discuss the Ito formula, the FeymanKac formula and the existence and uniqueness of solutions of SDEs. These
tools and results provide the basis for the application and numerical solution
of stochastic dierential equations with jumps.
,...,Xti
(1.1.1)
We set the time set to the interval T = [0, ) if not otherwise stated. On
some occasions the time set may become the bounded interval [0, T ] for T
(0, ) or a set of discrete time points {t0 , t1 , t2 , . . .}, where t0 < t1 < t2 < . . ..
One can distinguish between various classes of stochastic processes according to specic properties. First, we aim to identify stochastic processes that
are suitable as basic building blocks for realistic models. From a quantitative
point of view it is essential to select stochastic processes that allow explicit
formulas or at least fast and ecient numerical methods for calculating moments, probabilities, option prices or other quantities. In the following we set
usually the dimension d equal to one. However, most concepts and results we
will introduce generalize to the multi-dimensional case.
Stationary Process
Important information about a stochastic process is provided by the mean
(t) = E(Xt )
(1.1.2)
v(t) = Var(Xt ) = E (Xt (t))2
(1.1.3)
(1.1.4)
for s, t T .
The concept of a market equilibrium has been very important in economics and nance. The corresponding class of stochastic processes for modeling equilibria is that of stationary processes since they allow us to express
probabilistically a form of equilibrium. For instance, interest rates, dividend
rates, ination rates, volatilities, consumption rates, hazard rates and credit
spreads are likely to be modeled by stationary processes since they typically
exhibit in reality some type of equilibrium.
Denition 1.1.2. We say that a stochastic process X = {Xt , t 0} is stationary if its joint distributions are all invariant under time displacements,
that is if
(1.1.5)
FXt1 +h ,Xt2 +h ,...,Xtn +h = FXt1 ,Xt2 ,...,Xtn
for all h > 0, ti 0, i {1, 2, . . . , n} and n N .
The random values Xt of a stationary process X have the same distribution
for all t T . Therefore, means, variances and covariances satisfy the equations
(t) = (0),
v(t) = v(0)
and
C(s, t) = c(t s)
(1.1.6)
Fig. 1.1.1. Sample path for the Vasicek interest rate model, T = 10
such a probability space we consider dynamics, typically of a nancial market model, that is based on the observation of a continuous time stochastic
vector process X = {X t d , t 0}, d N . We denote by At the time t
information set, which is the sigma-algebra generated by the events that are
known at time t 0, see Shiryaev (1984). Our interpretation of At is that
it represents the information available at time t, which is obtained from the
observed values of the vector process X up to time t. More precisely, it is the
sigma-algebra
At = {X s : s [0, t]}
generated from all observations of X up to time t. Since information is not
lost, the increasing family
A = {At , t 0}
of information sets At satises, for any sequence 0 t1 < t2 < . . . < of
observation times, the relation At1 At2 . . . A = t0 At .
For technical reasons one introduces the information set At as the augmented sigma-algebra of At for each t 0. It is augmented by every null
set in A such that it belongs to A0 , and also to each At . One also says
that At is complete. Dene At+ = >0 At+ as the sigma-algebra of events
immediately after t [0, ). The family A = {At , t 0} is called right continuous if At = At+ holds for every t 0. Such a right-continuous family
A = {At , t 0} of information sets one calls a ltration. Such a ltration
can model the evolution of information as it becomes available over time. We
dene for our purposes A as the smallest sigma-algebra that contains A
= t0 At . From now on, if not stated otherwise, we always assume in this
book a ltered probability space (, A, A, P ) to be given.
Any right-continuous stochastic process Y = {Yt , t 0} generates its natural ltration AY = {AYt , t 0}, which is the sigma-algebra generated by Y
up to time t. For a given model with a vector process X we typically set
A = AX with At = AX
t .
If for a process Z = {Zt , t 0} and each time t 0 the random variable
X
= {AX
Zt is AX
t -measurable, then Z is called adapted to A
t , t 0}. The
history of the process Z until time t is then covered by the information set AX
t .
,
which
means
that
it
includes
The completeness of the information set AX
t
all null events, allows us to conclude that for two random variables Z1 and
Z2 , where Z1 = Z2 almost surely (a.s.) and Z1 is AX
t -measurable, Z2 is also
-measurable.
AX
t
Conditional Expectations
The notion of conditional expectation is central to many of the concepts that
arise in applications of stochastic processes and also in nance. The mean value
or expectation E(X) is the coarsest estimate that we have for an integrable
random variable X, that is, for which E(|X|) < , see Shiryaev (1984). If
we know that some event A has occurred we may be able to improve on this
estimate. For instance, suppose that the event A = { : X() [a, b]}
has occurred. Then in evaluating our estimate of the value of X we need only
to consider corresponding values of X in [a, b] and weight them according to
their likelihood of occurrence, which thus becomes the conditional probability
given this event, see Shiryaev (1984).
The resulting estimate is called the conditional expectation of X given the
event A and is denoted by E(X| A). For a continuous random variable X with
a density function fX the corresponding conditional density is
0
for x < a or b < x
fX (x | A) =
,
R b fX (x)
for
x
[a,
b]
f (s) ds
a
E(X | A) =
x fX (x) dx
x fX (x | A) dx = a b
,
f (x) dx
a X
(1.1.7)
E(X | S)() dP () =
X() dP (),
(1.1.8)
Q
for all Q S. The Radon-Nikodym theorem, see Shiryaev (1984), guarantees the existence and uniqueness of the random variable E (X | S). Note
that E (X | S) is a random variable dened on the coarser probability space
(, S, P ) and thus on (, A, P ). However, X is usually not a random variable
on (, S, P ), but when it is we have
(1.1.9)
E X S = X,
which is the case when X is S-measurable.
The following results are important when handling the evolution of stochastic processes in conjunction with an evolving information structure, as is often
the case in nance.
For nested sigma-algebras S T A and an integrable random variable
X we have the iterated conditional expectations
(1.1.10)
E E X T S = E X S
almost surely. Since most equations and relations we will formulate in this
book hold almost surely, we will typically suppress these words from now on.
When X is independent of the events in S we have
E X S = E(X).
(1.1.11)
(1.1.12)
to those of ordinary in
S ,
(1.1.13)
(1.1.14)
(1.1.15)
if X Y a.s.
The conditional expectation E(X | S) is in some sense obtained by smoothing X over the events in S. Thus, the ner the information set S, the more
E(X | S) resembles the random variable X.
Wiener Process
In contrast to stationary processes we consider now stochastic processes with
stationary independent increments. These basic processes have mathematical properties that make them suitable as fundamental building blocks in
stochastic modeling and, thus, in nancial modeling. The random increments
Xtj+1 Xtj , j {0, 1, . . . , n1}, of these processes are independent for any
sequence of time instants t0 < t1 < . . . < tn in [0, ) for all n N . If t0 = 0
is the smallest time instant, then the initial value X0 and the random increment Xtj X0 for any other tj [0, ) are also required to be independent.
Additionally, the increments Xt+h Xt are assumed to be stationary, that is
Xt+h Xt has the same distribution as Xh X0 for all h > 0 and t 0.
The most important continuous process with stationary independent increments is the Wiener process. Bachelier was the rst who employed, already
in 1900, such a mathematical object in his modeling of asset prices at the
Paris Bourse, see Bachelier (1900) or Davis & Etheridge (2006). He did this
even before Einstein used an equivalent mathematical construct that we call
now the Wiener process or Brownian motion, see Einstein (1905).
Denition 1.1.3. We dene the standard Wiener process W = {Wt ,
t 0} as an A-adapted process with Gaussian stationary independent increments and continuous sample paths for which
W0 = 0,
(t) = E(Wt ) = 0,
Var(Wt Ws ) = t s
(1.1.16)
The Wiener process is a continuous time stochastic process with independent Gaussian distributed increments generating continuous sample paths.
This process is also known as Brownian motion, since it can model the motion of a pollen grain under the microscope, as observed by Robert Brown in
the early 19th century.
In Fig. 1.1.2 we plot a sample path of a standard Wiener process. To
visualize more of its probabilistic properties we display 20 such trajectories in
Fig. 1.1.3
The Wiener process has fundamental mathematical properties and is used
as a basic building block in many applications, in particular in nance. The
Wiener process is the most basic stochastic process that allows us to model
continuous uncertainty, as it arises for instance as trading noise. Note that
(1.1.18)
for t 0, s [0, t] and j {1, 2, . . . , m}. Moreover, one has the additional
property that
(t s) for k = j
j
j
k
k
E Wt Ws
Wt Ws
(1.1.19)
As =
0 otherwise
for t 0, s [0, t] and j, k {1, 2, . . . , m}.
Poisson Process
In nance and insurance one also needs to be able to model various random
events, such as defaults, catastrophes or operational failures. There exists
a piecewise constant stochastic process with independent increments which
counts events. This is the Poisson process which can be conveniently used to
model event driven uncertainty.
Denition 1.1.4. A Poisson process N = {Nt , t 0} with intensity > 0
is a piecewise constant process with stationary independent increments with
initial value N0 = 0 such that Nt Ns is Poisson distributed with intensity
ts , that is, with probability
P (Nt Ns = k) =
for k {0, 1, . . .}, t 0 and s [0, t].
e (ts) ( (t s))k
k!
(1.1.20)
(1.1.21)
v(t) = Var(Nt ) = E (Nt (t))2 = t
(1.1.22)
for t 0.
In Fig. 1.1.4 we plot a graph for a Poisson process with intensity = 20.
According to (1.1.21), we should expect on average 20 events to happen during
the time period [0, 1], which is about the case for this trajectory. We emphasize that also the Poisson process is not a stationary process conditioned on
its known initial value. However, its increments have the same Poisson distribution over time intervals of the same length and are in this sense stationary.
A Poisson process N counts events and generates an increasing sequence
of jump times 1 , 2 , . . . related to each event that it counts. Thus, Nt equals
the number of events that occurred up until time t 0. For t 0 let the time
Nt denote the last time that Nt made a jump, that is,
k = inf{t 0 : Nt k}
(1.1.23)
for k N , t 0.
In some applications, as in the modeling of defaults, the intensity t that
a certain type of event occurs may depend on the time t 0. This leads to a
time transformed Poisson process N = {Nt , t 0}, where
k
t
t
exp s z dz
dz
z
s
P (Nt Ns = k) =
(1.1.24)
k!
10
Nt
(1.1.25)
k=1
k P (Nt = k),
(1.1.26)
k=1
where the above probabilities are expressed in (1.1.24). For instance, if one
chooses 1 = 1 and k = 0 for k {2, 3, . . .}, then this simple process allows
us to model the credit worthiness Ct of an obligor at time t by setting
Ct = 1 Yt .
The credit worthiness may start at time zero with C0 = 1. It declines to zero at
the time when the rst default arises. The expected value E(Ct ) of the above
credit worthiness at time t equals by (1.1.26) and (1.1.24) the expression
t
E(Ct ) = P (Nt = 0) = exp
z dz
0
for t 0.
Compound Poisson Process
One often needs to dierentiate between certain types of events, for instance,
those with dierent recovery rates at a default. A corresponding intensity for
the occurrence of each type of possible event is then required.
To construct a process that models sequences of dierent types of events,
we consider a Poisson process N with intensity > 0, together with a sequence
of independent identically distributed (i.i.d.) random variables 1 , 2 , . . . that
are independent of N . Here
F1 (z) = P (1 z)
(1.1.27)
11
Yt =
Nt
(1.1.28)
k=1
12
The following simple, but important insurance model involves the above
compound Poisson process Y . The Cramer-Lundberg model for the risk reserve
Xt at time t of an insurance company can be described in the form
Xt = X0 + c t Yt ,
(1.1.29)
13
(1.1.30)
(1.1.31)
(1.1.32)
(A) =
0
(1.1.33)
(A) (A)
e
!
(1.1.34)
14
and has been used in recent years by a number of researchers in nancial and
insurance modeling, see, for instance, Barndor-Nielsen & Shephard (2001),
Geman, Madan & Yor (2001), Eberlein (2002) and Kou (2002). It includes
as special cases the Poisson process and the Wiener process. These processes
have stationary independent increments and enjoy a number of convenient
mathematical properties.
Denition 1.1.5. A stochastic process X = {Xt , t 0} with X0 = 0 a.s.
is called a Levy process if
1. X is a process with independent increments, where Xt Xs for 0 s <
t < is independent of Xr with 0 r s;
2. X has stationary increments, which means that Xt Xs has for 0 s <
t < the same distribution as Xts ;
P
3. X is continuous in probability, that is Xs = limts Xt for s [0, ).
A popular asset price model is the exponential Levy model S = {St , t
[0, )}, where
(1.1.35)
St = S0 exp{Xt }
denotes the asset price at time t 0 with S0 > 0 and X is a given Levy process. Some exponential Levy models have been already studied, for instance,
in Samuelson (1955), Black & Scholes (1973) and Merton (1973b, 1976). Note
that, since the asset price here is an exponential, it is guaranteed to stay
positive, which is a desirable property of an asset price.
If X is a Levy process, then it can be shown that it can be decomposed
into the form
|v|<1
|v|1
(1.1.36)
for t 0, X0 , see, for instance, Protter (2005). Here W = {Wt , t 0}
is a standard Wiener process. Furthermore, p is a Poisson measure as dened previously. For any set A E the process p (A) = {p (A, [0, t]), t 0}
is a Poisson process independent of W with intensity (A), where (dv) is
called a Levy measure. This measure characterizes the jumps of a Levy process
and is dened on E assuming that (1.1.32) holds. Note that for disjoint sets
A, B E the resulting Poisson processes p (A) and p (B) are independent.
The decomposition (1.1.36) shows that a Levy process is a superposition of
a constant trend, a scaled Wiener process, and some jump process with stationary independent increments.
A Levy process is fully characterized by the two parameters ,
together with the very exible Levy measure . In the case = 0 and
(A) = 0 for all A E, the resulting Levy process X is deterministic and
has at time t the value Xt = t. If (A) = 0 for all A E, then the Levy
process is a transformed Wiener process of the form Xt = t + Wt . This
is the only possible form that a continuous Levy process can have. As an
15
16
that these Wiener processes, Poisson processes and Poisson measures are Aadapted and that their increments over a period (s, t] are independent of As
for t 0, s [0, t].
(1.2.3)
for all t 0.
An example of an (A, P )-martingale is a Wiener process W = {Wt , t 0}
on a ltered probability space (, A, A, P ).
There are many other continuous time stochastic processes that form martingales. For example, using again the standard Wiener process W it can be
shown that the process
X = Xt = Wt2 t, t 0
17
(1.2.4)
and
E(|Xt |) <
(1.2.5)
()
X s E X t As
(1.2.6)
(1.2.8)
18
qt = N t t
(1.2.9)
(1.2.10)
19
Stopping Times
Observable random times arise in many ways in nancial modeling. For instance, times of default, the rst time of reaching a critical threshold or the
early exercise time of an American option are such random times. Let us make
their denition precise, and consider a ltered probability space (, A, A, P )
as introduced above.
Denition 1.2.3. A random variable : [0, ) is called a stopping
time with respect to the ltration A if for all t 0 one has
{ t} At .
(1.2.11)
The relation (1.2.11) expresses the fact that the event { t} is At measurable and thus observable at time t. The information set associated
with a stopping time is dened as
A = {A A : A { t} At
for t 0}.
(1.2.12)
It represents the information available before and at the stopping time . For
example, the rst time
(a) = inf{t 0 : Wt = a}
(1.2.13)
for all t 0.
A stopping time is called predictable, if A is predictable. This means, A
is generated by left-continuous stochastic processes with right hand limits. A
stopping time that is not predictable is called inaccessible. For example, the
jump times of a Poisson process are inaccessible. However, the rst hitting time
(a), given in (1.2.13), of the level a by the Wiener process W is predictable.
20
and
E(Y | A ) = E(Y | A )
(1.2.15)
E E(Y | A ) A = E(Y | A ).
(1.2.16)
(1.2.17)
sup |Xs |
s[0,t]
p
p1
21
p
E(|Xt |p )
(1.2.19)
for t 0.
If X = {Xt , t 0} is a right continuous supermartingale, then it can be
shown, see Doob (1953), that for any > 0 it holds
P sup Xt A0 E X0 A0 + E max(0, X0 ) A0 . (1.2.20)
t0
Standard Inequalities
We list here some standard inequalities, see Ash (1972) and Ikeda & Watanabe
(1989), that we will use later repeatedly. Consider a sequence of pairs of real
numbers {(ai , bi ), i {1, . . . , n}}, with n N . Then one can formulate the
Cauchy-Schwarz inequality
n
2
ai b i
i=1
n
a2i
i=1
n
2
ai
i=1
n
b2i
(1.2.21)
i=1
n
a2i .
(1.2.22)
i=1
Let 1 < p < , 1 < q < and (1/p) + (1/q) = 1, then we have the H
older
inequality
n
n
n
p1
q1
ai b i
|ai |p
|bi |q
(1.2.23)
i=1
and
i=1
n
p
ai
i=1
i=1
p1
n
api .
(1.2.24)
i=1
12
12
2
f d
g 2 d
(1.2.25)
f gd
22
2
f d ()
f 2 d.
(1.2.26)
p1
q1
|f |p d
|g|q d .
(1.2.27)
f gd
p
p1
f
d
()
|f |p d.
(1.2.28)
0 (t) (t) + C
(s)ds
(1.2.29)
t0
for t [t0 , T ] and C > 0. Then the following version of the Gronwall inequality
holds
(t) (t) + C
eC(ts) (s)ds.
(1.2.30)
t0
t
(t) (t) +
f (s) (s)ds
(1.2.31)
t0
for t [t0 , T ], then the following alternative Gronwall inequality can be used
t
t
(t) (t) +
f (s) (s) exp
f (u) du ds.
(1.2.32)
t0
f (u) du .
(1.2.33)
t0
23
(1.2.35)
1
E(|X|).
a
(1.2.36)
1
Var(X).
a2
(1.2.37)
(1.3.1)
with small time steps of length h > 0, such that 0 = t0 < t1 < t2 < . . ..
Thus, we have the discretization times tk = k h for k {0, 1, . . .}. The specic
structure of the time discretization is not essential for the denition below, as
long as the maximum time step size vanishes. There is no need to have the
time discretization equidistant. However, it makes our presentation simpler.
For a given stochastic process X the quadratic variation process [X] =
{[X]t , t 0} is dened as the limit in probability as h 0 of the sums of
squared increments of the process X, provided this limit exists and is unique,
see Jacod & Shiryaev (2003) and Protter (2005). More precisely, we have at
time t the quadratic variation
P
(1.3.2)
(Xtk Xtk1 )2 .
(1.3.3)
nt = max{k N : tk t},
(1.3.4)
[X]h,t =
k=1
(1.3.5)
24
for t 0.
For a continuous, square integrable (A, P )-martingale X, a new continuous
(A, P )-martingale Y = {Yt , t 0} is obtained by setting
Yt = (Xt )2 [X]t
(1.3.6)
nt
(1.3.7)
k=1
for t 0 and h > 0, see (1.3.4). More precisely, at time t 0 we obtain the
covariation
P
(1.3.8)
[Z1 , Z2 ]t = lim [Z1 , Z2 ]h,t ,
h0
(t) = lim (t h)
h0+
(1.3.9)
the almost sure left hand limit of (t) at time t (0, ). The jump size (t)
at time t is then obtained as
(t) = (t) (t)
(1.3.10)
for t (0, ).
In the case of a pure jump process p = {pt , t 0} the corresponding
quadratic variation is obtained as
(ps )2
(1.3.11)
[p]t =
0st
Zi (s)
25
(1.3.12)
0<st
for t 0 and i {1, 2}. We assume that the sum in (1.3.12) is almost surely
nite. The covariation [Z1 , Z2 ]t of Z1 and Z2 at time t is then dened as
(Z1 (s)) (Z2 (s))
(1.3.13)
[Z1 , Z2 ]t = [Z1c , Z2c ]t +
0<st
for t 0, as long as the quantities involved remain almost surely nite. This
also means that the quadratic variation of a process Z1 equals the quadratic
variation [Z1c ]t of its continuous part plus the sum of the squares of its jumps,
that is
(Z1 (s))2
(1.3.14)
[Z1 ]t = [Z1c ]t +
0<st
for t 0.
Local Martingales
As we will see, in continuous time nance certain stochastic processes become
martingales when properly stopped but are not true martingales.
Denition 1.3.1. A stochastic process X = {Xt , t 0} is an (A, P )-local
martingale if there exists an increasing sequence (n )nN of stopping times,
a.s.
that may depend on X, such that limn n = and each stopped process
X n = {Xtn = Xtn , t 0}
(1.3.15)
E([X]T ) <
(1.3.16)
26
1.4 It
o Integral
It
o Integral as Gains from Trade
The notion of a stochastic integral is highly relevant and also very natural
in nance because it can be interpreted as gains from trade. Let an investor
choose his or her trading strategy as a piecewise constant buy and hold allocation process = {(t), t [0, T ]} with (t) = (tk ) units of shares held at
time t [tk , tk+1 ), k {0, 1, . . .} and tk = kh for h > 0. The value of a share
is assumed to evolve according to the asset price process X = {Xt , t 0}.
The gains from trade over the period [0, t] can then be expressed in the form
(s) dXs =
0
nt
k=1
(1.4.1)
1.4 It
o Integral
27
where
nt = max{k N : tk t}
(1.4.2)
is the integer index of the latest discretization time before and including t.
For a left continuous, predictable stochastic process = {(t), t 0} as
integrand with
T
(s)2 d[X]t <
(1.4.3)
0
t
nt
P
(s) dXs = lim
(tk1 ) {Xtk Xtk1 }
(1.4.4)
h0
k=1
t
nt
P
Ws dWs = lim
Wtk1 (Wtk Wtk1 )
h0
k=1
nt
1
Wt2k Wt2k1 (Wtk Wtk1 )2
= lim
h0 2
P
k=1
nt
1
1 2 1 2
Wt W0 lim
(Wtk Wtk1 )2 .
h0 2
2
2
k=1
For the quadratic variation of the standard Wiener processes in (1.3.3) we have
[W ]t = t, see (1.3.5), and in addition W0 = 0, see (1.1.16). Consequently, the
value of the above double Wiener integral is
t
1
1
1
1
(1.4.5)
Ws dWs = Wt2 [W ]t = Wt2 t,
2
2
2
2
0
which is clearly less than 12 Wt2 which would not be expected under classical integration rules. This already indicates that ad hoc conclusions from analogies
with deterministic calculus may fail in a stochastic setting. There exist significant dierences between the rules for It
o integration and those for classical
28
Yt = y0 +
es ds +
0
fs dWs
(1.4.6)
|es | ds <
(1.4.7)
for all t 0 a.s. The second integral is an Ito integral with respect to the
Wiener process W , see (1.4.4), where we assume that
|fs |2 ds <
(1.4.8)
E(ft2 ) dt < .
0
(1.4.10)
1.4 It
o Integral
29
The Ito integral exhibits the following important properties which we will
exploit frequently throughout the book:
1. Linearity: For T 0, t [0, T ], s [0, t], Z1 , Z2 L2T , and As -measurable
square integrable random variables A and B it holds
t
(A Z1 (u) + B Z2 (u)) dWu1 = A
Z1 (u) dWu1 + B
Z2 (u) dWu1 .
s
(1.4.11)
2. Local martingale property: For predictable with
T
(u)2 du <
(1.4.12)
t
a.s. for all T [0, ) the It
o integral I,W = {I,W (t) = 0 (s)dWs ,
t 0} forms an (A, P )-local martingale.
3. Martingale property: Assume that I,W (t) forms a square integrable process, then it is a square integrable (A, P )-martingale if and only if
T
2
(u) du <
(1.4.13)
E
0
for all T 0.
4. Correlation property: For T 0, t [0, T ], independent Wiener processes
W 1 and W 2 and Z1 , Z2 L2T it holds that
t
t
i
j
E
Z1 (u) dWu
Z2 (u) dWu As
0
t E Z (u) Z (u) A du for i = j
1
2
s
0
=
0
otherwise
(1.4.14)
Z1 (u) dWu1 ,
Z2 (u) du = 0.
(1.4.16)
0
Note that some of the above employed conditions can be weakened, see
Protter (2005).
30
Semimartingales
The most general class of stochastic processes that we will consider in this
book is that of semimartingales, see Protter (2005). It is a very rich class of
processes which appears to be closed under various operations, and turns out
to be sucient for the modeling of most quantitative problems that appear
in nance and insurance.
In a ltered probability space (, A, A, P ) a semimartingale is an Aadapted, right-continuous stochastic process X = {Xt , t 0} with left hand
limits, where Xt can be expressed as a sum of the form
Xt = X0 + At + Mt
(1.4.17)
for all t
0. Here A = {At , t 0} is a process of nite total variation, that is
nt
|Atk Atk1 | < , with nt given in (1.4.2), and M = {Mt , t 0}
limh0 k=1
is an (A, P )-local martingale.
If A is predictable, then X is called a special semimartingale and the decomposition (1.4.17) is unique. For an A-adapted, right-continuous stochastic
process X = {Xt , t 0} with left hand limits let
a.s.
Xt = lim Xt
0
denote the almost sure left hand limit of X at time t 0. Such a process may
jump, and we denote by
(1.4.18)
Xt = Xt Xt
its jump size at time t 0. If in (1.4.17) the discontinuous part
Adt =
As
0st
Ms
0st
of M are almost surely nite, then each of the processes A and M can be
decomposed into a continuous and discontinuous part, being
At = Act + Adt
(1.4.19)
Mt = Mtc + Mtd
(1.4.20)
and
for t 0, respectively.
For example, the Wiener process W = {Wt , t 0}, given in Denition 1.1.3, is a semimartingale. Here the decomposition (1.4.17) is simply such
that X0 = 0, At = 0 and Mt = Mtc = Wt . Also a Poisson process N with
1.4 It
o Integral
31
d
v(p (dv, ds) (dv) ds),
(1.4.21)
Mt =
E
(1.4.22)
and the predictable nite total variation part is continuous and equals
At = Act = t +
v (dv) ds
(1.4.23)
0
|v|1
I,X (t) =
h0
nt
(1.4.24)
k=1
(1.4.25)
32
does not vanish for all t 0. In this case the jumps of the It
o integral I,X (t)
are obtained as
I,X (t) = I,X (t) I,X (t) = (t) Xt
(1.4.26)
for t 0. For example, if N is a Poisson process, then at the kth jump time
k we have Nk = Nk Nk1 = 1 for k {1, 2, . . .}. The Ito integral for
an integrand = {(t), t 0} takes then the form
I,N (t) =
(s) dNs =
0
Nt
(k ) Nk =
k=1
Nt
(k )
(1.4.27)
k=1
for
t 0. Another example is that of a nite pure jump process X = {Xt =
0st Xs , t 0}, which jumps at the jump times 1 , 2 , . . . of a counting
process p = {pt , t 0} with corresponding jump size Xk = c(k, k ) at the
o
kth jump time k . We then obtain for an integrand = {(t), t 0} the It
integral
I,X (t) =
(s) dXs =
0
pt
(k ) Xk =
k=1
pt
(k ) c(k, k ) (1.4.28)
k=1
for t 0. Here
we assume that the termsinvolved are almost surely nite, such
pt
pt
(k ) Xk and k=1
Xk almost surely converge to
that the sums k=1
a nite value for all t 0.
It
o Integral for Jump Measures
In the case when jump sizes are continuously distributed and total jump intensities may not be nite, one may use the Ito integral with respect to a
Poisson measure as introduced in Sect. 1.1. Assume that p (dv, dt) is the
Poisson measure on E [0, ) with intensity measure q (dv, dt) = (dv) dt,
satisfying condition (1.1.32).
For a family ((v))vE of almost surely nite adapted processes (v) =
{(v, t), t 0} with v E the It
o integral
I,p (t) =
0
(1.4.29)
I,p (t) =
0
1.4 It
o Integral
33
for all t 0. This means that if at a jump time the Poisson measure p
generates an event with mark v, then the change of the value of the corresponding It
o integral is given by the value (v, ), the integrand for the
mark v just before the jump time.
For the case when E = (0, ) is an interval with (0, ), where
1 for v E
(v) =
0 otherwise,
we obtain for the special case (v, t) = 1 the It
o integral
I1,p (t) =
p (dv, ds) = Nt
0
for t 0. Here
(1.4.31)
(1.4.32)
1
[, X c ]t
2
(1.4.33)
for t 0 and and X as in (1.4.24), see Protter (2005). Here X c denotes the
continuous part of X and [, ]t the covariation, see (1.3.8). For a Stratonovich
or symmetric integral of with respect to X we write
t
J,X (t) =
(s) dXs
(1.4.34)
0
and have
P
h0
nt
((tk ) + (tk1 ))
k=1
(Xtk Xtk1 )
(1.4.35)
34
1.5 It
o Formula
One-dimensional Continuous It
o Formula
To be able to handle functions of solutions to stochastic processes, a chain
rule, as it is known in classical calculus, is needed. Let X = {Xt , t 0} be a
continuous stochastic process characterized by the It
o dierential
dXt = et dt + ft dWt
(1.5.1)
u(t, Xt )
dWt
x
(1.5.2)
for t 0. To derive this important formula one can use a Taylor expansion for
increments over small time steps, where a second order term remains due to
the nonvanishing quadratic variation of the Wiener process, see, for instance,
Protter (2005) or Platen & Heath (2006).
When using the Stratonovich or symmetric integral we characterize, the
dynamics (1.5.1) in the form
dXt = et dt + ft dWt ,
the classical chain rule emerges for u(t, Xt ), that is
u(t, Xt )
u(t, Xt )
u(t, Xt )
+ et
dWt
du(t, Xt ) =
dt + ft
t
x
x
(1.5.3)
u(t, Xt )
u(t, Xt )
1 2 u(t, Xt )
dt +
dXt +
d[X]t
t
x
2
x2
(1.5.4)
1.5 It
o Formula
35
Multi-dimensional Continuous It
o Formula
For multi-factor nancial models the following more general version of the
Ito formula is needed. Consider a d-dimensional vector process e = {et =
(e1t , . . . , edt )T , t 0} with predictable components ek , k {1, 2, . . . , d}. Assume that
|ekz | dz <
(1.5.5)
almost surely for all k {1, 2, . . . , d}. The d m-matrix valued process F =
i,j
{F t = [Fti,j ]d,m
with
i,j=1 , t 0} is assumed to have predictable elements F
Fzi,j
2
dz <
(1.5.6)
m
Ftk,j dWtj
(1.5.7)
j=1
d
m
d
u
2
u
u
1
i,j k,j
+
ekt
+
F
F
=
dt
t
t
t
xk
2 j=1
xi xk
k=1
m
d
Fti,j
j=1 i=1
i,k=1
u
dWtj ,
xi
(1.5.8)
for t 0 with initial value u(0, X01 , X02 , . . . , X0d ), where the partial derivatives
of the function u are evaluated at (t, Xt1 , Xt2 , . . . , Xtd ), which we suppressed
in our notation.
We can rewrite the multi-dimensional It
o formula (1.5.8) also in the form
du(t, Xt1 , Xt2 , . . . , Xtd ) =
m
m
u
u
1 2u
i
dt +
dX
+
d[X i , X k ]t
t
i
i xk
t
x
2
x
i=1
i,k=1
(1.5.9)
36
for all t 0 using covariations, see (1.3.8), which holds for continuous vector
processes X.
The Stratonovich version of this chain rule has the form
du(t, Xt1 , Xt2 , . . . , Xtd ) =
m
u
u
dt +
dXti
i
t
x
i=1
(1.5.10)
(1.5.11)
(1.5.12)
(1.5.13)
denotes the ith component of the continuous part of Xti and Xti,d of the pure
jump part. All jumps of Xti are expressed by
Xti,d =
Xsi
(1.5.14)
s[0,t]
for all t 0 and i {1, 2, . . . , }. In (1.5.13) M i,c denotes a continuous (A, P )local martingale and Ai,d a continuous process of nite total variation.
For a twice continuously dierentiable function u : [0, ) ,
with continuous rst partial derivative with respect to time and continuous
second partial derivatives with respect to the spatial variables, the following
It
o formula holds
du(t, Xt1 , . . . , Xt ) =
u(t, Xt1 , . . . , Xt ) dt +
u(t, Xt1 , . . . , Xt ) dXti,c
i
t
x
i=1
2
1
u(t, Xt1 , . . . , Xt ) d[M i,c , M k,c ]t
2
xi xk
i,k=1
+ u(t, Xt1 , . . . , Xt )
(1.5.15)
1.5 It
o Formula
37
(1.5.16)
1 2
c
c
d (exp{Xt }) = exp{Xt } g + + (e 2) dt + dWt + (e 1) dqt
2
m
k=1
k
bi,k
t dWt +
r=m+1
(1.5.19)
for t 0 and i {1, 2, . . . , d}, where ai , bi,k and ci,r are predictable processes
and the mark space is given as E = \{0}. Then for a function u : [0, )
d , which is assumed to be dierentiable with respect to t and twice
dierentiable with respect to x, the It
o formula has the form
38
du(t, Xt1 , . . . , Xtd )
d
d
m
1 i,k j,k 2 u(t, Xt1 , . . . , Xtd )
dt
bt bt
+
2 i,j=1
xi xj
k=1
m
d
bi,k
t
k=1 i=1
r=m+1
1
d
u(t, Xt
, . . . , Xt
) prr (dv, dt)
(1.5.20)
39
(1.6.1)
t
Yt = y0 +
a(s, Ys ) ds +
b(s, Ys ) dWs .
(1.6.2)
0
Of course, we need to assume that both integrals on the right hand side of
(1.6.2) exist. It is sucient to request that
t
|a(s, Ys )| ds +
|b(s, Ys )|2 ds <
0
for all t 0. Note that the SDE (1.6.1) is only a shorthand notation for the
integral equation (1.6.2). The existence and uniqueness of a solution of an
SDE is not trivially given, and needs to be ensured, as will be discussed later.
In addition to the initial value, not only the drift coecient, the diusion
coecient and the jump coecient need to be given for certain SDEs, additionally also the behavior of its solution at certain boundaries may have to
be dened when establishing the uniqueness of the solution of the SDE. For
instance, absorption or reection has to be declared for certain SDEs at the
level zero. However, in many cases this is not necessary since there is no ambiguity. In nance, asset prices are typically absorbed at zero which excludes
obvious arbitrage opportunities.
Examples of One-Factor Continuous Asset Price Models
Let us list, together with the already mentioned examples, a few one-dimensional diusion processes that have been applied in the literature of asset price
modeling:
The linearly transformed Wiener process, which shifts the trajectories
shown in Fig. 1.1.3 to a positive initial level and scales time, is an example of
40
(1.6.4)
The BS model was suggested in Samuelson (1955, 1965a) and used in Black
& Scholes ((1973)). Despite the fact that this model became the standard
nancial market model, it has some shortcomings. For instance, it does not
generate a random, uctuating volatility, which is usually observed in practice.
The geometric Ornstein-Uhlenbeck (GOU) model can be shown to have
drift coecient
a(s, x) = x (1 ln(x))
(1.6.5)
and diusion coecient
b(s, x) =
2 x.
(1.6.6)
This asset price model permits equilibrium type dynamics for its logarithm
around a straight line. It was used, for instance, in F
ollmer & Schweizer (1993),
Platen & Rebolledo (1996) and Fleming & Sheu (1999). A disadvantage of
this model is again that it does not accommodate a uctuating volatility.
The constant elasticity of variance (CEV) model introduced in Cox (1975),
see also Schroder (1989), has drift
a(s, x) = x r
(1.6.7)
b(s, x) = x
(1.6.8)
a(s, x) = exp{ s}
41
(1.6.9)
exp{ s} x
(1.6.10)
with initial scaling parameter > 0 and net growth rate > 0. This model
generates a rather realistic, uctuating volatility and never hits zero. In particular, its volatility dynamics reect those of observed equity index volatilities
which increase when the index falls and vice versa, known as the leverage
eect, see Black (1976).
Examples of One-Factor Interest Rate Models
A large variety of interest rate models has been developed that are formed by
solutions of SDEs. In the following we mention several one-factor interest rate
models by only specifying their drift and diusion coecients.
One of the simplest stochastic interest rate models arises if the Wiener
process is linearly transformed by a deterministic, time dependent drift coecient a(s, x) = as and a deterministic time dependent diusion coecient
b(s, x) = bs . This leads to the Merton model, see Merton (1973a). It can also
be interpreted as some specication of the continuous time version of the HoLee model, see Ho & Lee (1986). Under this model the interest rate may not
remain positive.
A widely used interest rate model is the Vasicek model, see Vasicek (1977),
or more generally the extended Vasicek model, with linear drift coecient
xs x) and deterministic diusion coecient b(s, x) = bs . Here
a(s, x) = s (
s and bs are deterministic functions of time. Also this model yields a
s , x
Gaussian interest rate and, thus, allows negative rates.
In Black (1995) it was suggested that one considers the nonnegative value
of an interest rate, similar to an option value, which only takes the positive
part of an underlying quantity. The Black model results, when using a Vasicek
model u = {ut , t [0, T ]} to model an underlying shadow interest rate ut .
The interest rate is then obtained by its truncation xs = (us )+ = max(0, us ).
Such type of interest rate model, which allows the consideration of low interest
rate regimes, has been studied, for instance, in Gorovoi & Linetsky (2004) and
Miller & Platen (2005).
Cox, Ingersoll & Ross (1985) suggested the well-known CIR model, which
is sometimes called square root (SR) process and uses a process with linear
xs x) and square root diusion coecient of
drift coecient a(s, x)= s (
s and bs are deterministic functions of
the form b(s, x) = bs x. Here s , x
s > 2 b2s . This model has the desirable feature that it excludes
time with s x
negative interest rates. Furthermore, it yields equilibrium dynamics. Unfortunately, when calibrated to market data, it shows a number of deciencies
which mainly concern the possible shapes of the, so-called, forward rate or
yield curves.
42
Longsta
s x) and b(s, x) = bs x.
obtained by setting a(s, x) = s ( x
A rather general model is the Hull-White model, see Hull & White (1990).
xs x) and power diusion
It has linear mean-reverting drift a(s, x) = s (
coecient b(s, x) = bs xq for some choice of exponent q 0. Obviously, this
structure includes several of the above models. For instance, in the case q = 0
the Hull-White model reduces to the extended Vasicek model.
The Sandmann-Sondermann model, see Sandmann & Sondermann (1994),
was motivated by the aim to consider annual, continuously compounded interest rates. It has drift a(s, x) = (1 ex )(as 12 (1 ex ) b2s ) and diusion
coecient b(s, x) = (1 ex )cs .
An alternative interest rate model was proposed in Platen (1999b), which
suggests a drift a(s, x) = (x as )(cs x) and a 32 power diusion coecient
3
of the type b(s, x) = bs |x cs | 2 . This model provides a reasonable reection
of the interest rate drift and diusion coecient as estimated from market
data in Ait-Sahalia (1996).
As can be seen from these examples one can, in principle, choose quite general functions for the drift and diusion coecients in SDEs to form meaningful models of asset prices, interest rates and other nancial quantities. These
coecient functions then characterize, together with the initial conditions, the
dynamics of the process in an elegant and compact way. Note that in some
cases it is necessary to determine also the behavior of the solution of an SDE
at its boundaries. For instance, reection or absorption has to be declared
43
which may not be automatically clear from the drift and diusion coecients
of an SDE alone.
Continuous Vector SDEs
For the modeling of nancial markets, multi-dimensional solutions of SDEs
are needed. We recall that W = {W t = (Wt1 , Wt2 , . . . , Wtm ) , t 0} is an
m-dimensional standard Wiener process with components W 1 , W 2 , . . ., W m .
Given a d-dimensional vector function a : [0, ) d d and a dmmatrix function b : [0, ) d dm , then we can form the d-dimensional
vector stochastic dierential equation
dX t = a(t, Xt ) dt + b(t, Xt ) dW t
(1.6.11)
t
Xt = X0 +
a(s, Xs ) ds +
b(s, Xs ) dW s ,
(1.6.12)
0
for t 0, where the integrals are dened componentwise. Thus, the ith component of (1.6.12) is then given by the SDE
Xti = X0i +
ai (s, Xs ) ds +
0
=1
(1.6.13)
for t 0 and i {1, 2, . . . , d}. Note that the drift and diusion coecients
of each component can depend on all other components, which allows us to
model a wide range of possible feedback mechanisms.
Before applying the above type of model, questions concerning the existence and uniqueness of solutions of vector SDEs must be answered. For this
we refer to existence and uniqueness results, to be formulated later in Sect.1.9,
see also Krylov (1980) and Protter (2005).
SDEs with Poissonian Jumps
It is important to be able to incorporate into a model event driven uncertainty
expressed by jumps. This arises, for instance, when modeling credit risk, insurance risk or operational risk. Again, an SDE will allow the modeling of
feedback eects in the presence of jumps and in dependence on the level of
the state variable itself.
Let W = {Wt , t 0} denote a standard Wiener process and N =
{Nt , t 0} a Poisson process with intensity > 0. A scalar SDE for an
asset price Xt could then take the form
dXt = a(t, Xt ) dt + b(t, Xt ) dWt + c(t, Xt ) dNt
(1.6.14)
44
for t 0, with initial value X0 > 0. Here the jump coecient c(, ) determines
the jump size at an event. One often calls the resulting solution to process X
of the SDE (1.6.14) an It
o process with jumps or a jump diusion.
In a nancial market several interacting quantities need to play a role.
Let us denote m independent standard Wiener processes by W 1 , . . . , W m .
Furthermore, let N 1 , . . . , N n denote n independent Poisson processes with
corresponding intensities 1 (t, X t ), . . ., n (t, X t ) at time t. As indicated,
these intensities may depend on time and also on the state vector X t =
(Xt1 , Xt2 , . . ., Xtd ) . The model for the components of the state vector is
described by the following system of SDEs
dXti = ai (t, X t ) dt +
m
k=1
n
(1.6.15)
=1
for t 0 with X0i and i {1, 2, . . . , d}. To guarantee the existence and
uniqueness of a solution of the system of SDEs (1.6.15) the drift coecients
ai (, ), diusion coecients bi,k (, ) and jump coecients ci, (, ) need to
satisfy appropriate conditions. Such conditions will be outlined in Sect. 1.9,
see also Ikeda & Watanabe (1989) and Protter (2005).
SDEs Driven by Jump Measures
When events with randomly distributed jump sizes are modeled, it is appropriate to use SDEs driven by Poisson measures. Instead of Poisson processes, Poisson measures p (, ), {1, 2, . . . , n}, will now be employed, see
Sect. 1.1. Leaving the other terms in the SDE (1.6.15) unchanged, we consider
an SDE for the ith component of X t in the form
dXti = ai (t, X t ) dt +
m
k=1
=1
(1.6.16)
for t 0 with Xti , i {1, 2, . . . , d}. Here the (i, )th jump coecient may
not only be a function of time and other components, but may also depend on
the mark v E, see (1.1.30). It is important to note that satises condition
(1.1.32) for {1, 2, . . . , n}. One can model by the SDE (1.6.16) dynamics
involving innitely many small jumps per unit of time, for instance, those that
arise for some Levy processes. The intensity for those jumps that impact the
ith component is controllable via the jump coecients which depend on the
mark v. Even though the Poisson jump measure p is modeled in a standard
way without depending on state variables, its impact can depend, in a very
exible manner, on the actual values of all components of the SDE. To recover
the SDE (1.6.15) for jump diusions we may choose
(dv) = 1{v[0,]} dv,
45
and may leave the jump coecients independent of the marks. SDEs that are
driven by jump measures have been used in nancial modeling, for instance,
in Bj
ork, Kabanov & Runggaldier (1997) in a more classical framework or
Christensen & Platen (2005) in a rather general setup.
(1.7.1)
for t 0 one can write the explicit solution for (1.7.1) in the form
t
1
1
Xt = t X0 +
a2 (s) s ds +
b2 (s) s dWs
0
(1.7.2)
t
1
a2 (s) s ds
E(Xt ) = t X0 +
(1.7.3)
E (Xt E(Xt ))
=
0
for t 0.
b2 (s) t 1
s
2
ds
(1.7.4)
46
Ornstein-Uhlenbeck Process
An important example for a solution of the above linear SDE is the OrnsteinUhlenbeck process, which is given by (1.7.2) when the drift a1 (t) Xt + a2 (t) =
(
x Xt ) is linear and mean reverting, with given constant speed of adjustment and reference level x
. The diusion coecient b2 (t) = is, in this
case, constant. For the linear SDE
dXt = (
x Xt ) dt + dWt
(1.7.5)
for t 0 with initial value X0 one has by (1.7.2) the explicit solution
t
Xt = X0 exp{ t} + x
(1 exp{ t}) +
exp{ (t s)} dWs (1.7.6)
0
for t 0.
The solution (1.7.6) can be interpreted as that of the Vasicek (1977) interest rate model. The reference level x
is the long-term average value of the
interest rate. The parameter characterizes the magnitude of the uctuations
of the interest rate. The speed of adjustment parameter determines how fast
a shock to the interest rate loses its impact. From (1.7.3) and (1.7.4) it follows
2
that the mean and the variance converge over time to x
and 2 , respectively.
Continuous Linear SDE
We now consider a linear SDE
dXt = (a1 (t) Xt + a2 (t)) dt + (b1 (t) Xt + b2 (t)) dWt
(1.7.7)
t
X t = t X 0 +
(a2 (s) b1 (s) b2 (s)) s1 ds +
b2 (s) s1 dWs ,
0
(1.7.8)
where
t = exp
t
t
1
b1 (s) dWs
a1 (s) b21 (s) ds +
2
0
0
(1.7.9)
(1.7.10)
47
(1.7.13)
for t 0 with deterministic initial value X0 > 0. The solution of this SDE
is also known as the Black-Scholes model, see Black & Scholes (1973) and
Merton (1973b). It was originally mentioned in Osborne (1959) and Samuelson
(1965b) as potential asset price model. Its key advantages are that it generates
strictly positive dynamics, and it has an explicit solution
t
1 t 2
Xt = X0 exp
rs ds
ds +
s dWs
(1.7.14)
2 0 s
0
0
for t 0 and permits easy manipulations. Obviously, its logarithm ln(Xt ) is
Gaussian with mean
t
s2
E(ln(Xt )) = ln(X0 ) +
rs
ds
(1.7.15)
2
0
and variance
E
2
=
ln(Xt ) E(ln(Xt ))
s2 ds.
(1.7.16)
(1.7.17)
and variance
2
v(t) = P (t) ((t))2 = (X0 ) exp {2 r t} exp{ 2 t} 1
for t 0.
Furthermore, for q we can obtain the qth moment of Xt as
2
q
q
E ((Xt ) ) = (X0 ) exp q r
(1 q) t
2
for t 0.
(1.7.18)
(1.7.19)
48
m
B t X t + t dWt ,
(1.7.20)
=1
where A, B 1 , B 2 , . . ., B m are dd-matrix valued and , 1 , 2 , . . ., m ddimensional vector valued deterministic functions of time. We have an explicit
solution of (1.7.20) in the form
t
m
m
t
1
B s s ds +
1
s
Xt = t X0 +
s
s s dWs
0
=1
=1
(1.7.21)
for t 0. Here t is the d d fundamental matrix at time t with 0 = I,
where I denotes the identity matrix. The fundamental matrix process satises
the matrix SDE
m
B t t dWt .
(1.7.22)
d t = At t dt +
=1
(1.7.23)
m
B t
(t)
t
t
(t)
B t
t
t
dt, (1.7.24)
=1
with initial conditions (0) = E (X 0 ) and P (0) = E X 0 X
0 , assuming
that these are nite.
If the matrices A, B 1 ,B 2 , . . ., B m commute, that is, if
At B t = B t At
and B t B kt = B kt B t
(1.7.25)
for all k, {1, 2, . . . , m} and t 0, then an explicit solution of the fundamental matrix SDE (1.7.22) can be obtained in the form
!
m
m
t
t
1 2
Bs
B s dWs ,
t = exp
As
ds +
(1.7.26)
2
0
0
=1
=1
49
S0j
bj,k
for i = j
t
0 otherwise
(1.7.29)
d
B kt S t dWtk
(1.7.30)
k=1
(1.7.31)
k=1
50
GARCH, NGARCH type models. In Nelson (1990) it was pointed out that
these models, when appropriately scaled lead to a continuous time diusion
limit that satises a pair of SDEs. The resulting ARCH diusion model follows
the dynamics
(1.7.32)
dSt = St (a dt + t dWt )
and the squared volatility satises the SDE
t
dt2 = k (
2 t2 ) dt + t2 dW
(1.7.33)
and their (j, k)th squared volatilities can be modeled with the dynamics
m
2
2
2
2
j,k
j,k
d tj,k = j,k
(1.7.35)
=1
for t 0, S0j > 0, 0j,k 0, j, k {1, 2, . . . , d}, m {1, 2, . . .}. Here ajt , j,k
t ,
(
tj,k )2 and tj,k, are deterministic functions of time, and W 1 , W 2 , . . . are
independent standard Wiener processes. The above model for the volatility is
the solution of a multi-dimensional linear SDE which we will discuss in further
detail in the next chapter.
Mertons Jump Diusion Model
Let us now discuss some SDEs with jumps that have explicit solutions. Merton
(1976) modeled the dynamics of a stock price St by an SDE of the type
dSt = St (a dt + dWt + dYt )
(1.7.36)
for t 0 and S0 > 0. In the above Merton model the process Y = {Yt , t 0}
denotes a compound Poisson process, see (1.1.28), where
Yt =
Nt
(1.7.37)
k=1
and thus
dYt = Nt +1 dNt
(1.7.38)
51
(1.7.39)
at time k , k {1, 2, . . .}. The dynamics specied in the SDE (1.7.36) leads,
for S, to the kth jump increment
Sk = Sk Sk = Sk Yk = Sk k
at time k . This means that we have
Sk = Sk (k + 1).
(1.7.40)
Sk
= k + 1
Sk
(1.7.41)
Obviously,
(1.7.42)
Equivalently, we may request that the jump ratio is nonnegative, that is, by
(1.7.41) and (1.7.40)
Sk
0
(1.7.43)
Sk
compensates the jumps of Y that arise
for all k {1, 2, . . .}. The quantity t
until time t so that the compensated compound Poisson process
Y = Yt = Yt t, t 0
(1.7.44)
forms an (A, P )-martingale. We can rewrite the SDE (1.7.36) in the form
dt + St dWt + St dYt
dSt = St (a + )
(1.7.45)
for t 0. The last term on the right hand side of the SDE (1.7.45) forms
a martingale. The expected return of S over a period of length equals
(a + ).
The explicit solution of the SDE (1.7.36) has the form
52
St = S0 exp
"
Nt
1 2
a t + Wt
(k + 1)
2
(1.7.46)
k=1
(1.7.47)
gs ds +
0
s dWs
0
"
Nt
(k + 1) (1.7.48)
k=1
53
Hence, by similar arguments as above one obtains for the second moment
P (t) = E(Xt2 )
the ODE
dP (t) = P (t) 2 gt + t2 + (exp{2 ct } 1) dt
(1.7.52)
for t 0 with P (0) = exp{2 Z0 }. One obtains then directly the variance of
Xt using (1.7.52) and (1.7.50).
Exponential L
evy Models
An important generalization of Mertons jump diusion model (1.7.46) is given
by an exponential of the form
St = S0 exp{Xt }.
(1.7.53)
|v|<1
|v|1
(1.7.54)
for t 0, where W is a standard Wiener process and p is a Poisson measure
with Levy measure (dv) so that (1.1.32) is satised. By application of the
It
o formula (1.5.15) one obtains for the resulting exponential Levy model the
SDE
#
1 2
v (dv) dt + dWt
dSt = St +
2
|v|<1
+
(exp{v} 1) p (dv, dt)
(1.7.55)
for t 0. Assume that all expressions on the right hand sides of (1.7.54)
and (1.7.55) exist. Then inserting formula (1.7.54) into (1.7.53) provides an
explicit solution for the SDE (1.7.55).
Models of this type have been proposed and studied by various authors,
for instance, by Madan & Seneta (1990), Madan & Milne (1991), Eberlein &
Keller ((1995)), Barndor-Nielsen & Shephard (2001), Miyahara & Novikov
(2002) and Carr, Geman, Madan & Yor (2003).
54
one could also use a multi-dimensional mark space without any complication.
We dene on E[0, T ] an A-adapted Poisson measure p (dv, dt), with intensity
measure (dv, dt) = (dv)dt, T [0, ). To keep our notation transparent
we assume at this stage that the total intensity
= (E) <
(1.8.1)
of p is nite. This is also the case that one can numerically handle if one
focuses on single events. Thus, p = {p (t) = p (E, [0, t]), t [0, T ]} is a
stochastic process that counts the number of jumps occurring in the time interval [0, T ]. The Poisson random measure p (dv, dt) generates a sequence of
pairs {(i , i ), i {1, 2, . . . , p (T )} }, where {i [0, T ], i {1, 2, . . . , p (T )} }
is a sequence of increasing nonnegative random variables representing the
jump times of a standard Poisson process with intensity , and {i E, i
{1, 2, . . . , p (T )} } is a sequence of independent, identically distributed random variables. Here i is distributed according to (dv)
= F (dv). Therefore,
Xt = X0 +
0
b(s, X s ) dW s +
0
p (t)
a(s, X s ) ds +
c(i , X i , i ), (1.8.3)
i=1
55
This stochastic integral allows us to model rather general jump behavior. The
only restriction we impose on the jump component is the niteness of the
total intensity as requested in condition (1.8.1). By using a Poisson measure
there is no real need for relying on this condition other than being able to
deal with nite sums as in (1.8.3).
A Special Case
Often we will consider in our discussions on numerical schemes the special
case with coecient functions a(t, x) = x, b(t, x) = x and c(t, x, v) = x v.
Then the SDE (1.8.2) reduces to
(1.8.5)
dXt = Xt dt + dWt + v p (dv, dt) ,
E
p (t)
1
Xt = X0 e( 2
)t+Wt
(i + 1),
(1.8.6)
i=1
(1.8.7)
where 1 and 2 are deterministic functions of time and 1{} is the indicator
function dened by
56
1{vA} =
1 for v A
0 for v
/A
(1.8.8)
2 (s)
1 (s)
2 (s,Xs )
1 (s,Xs )
which allows the modeling of the eect of a stochastic intensity. Note that in
this case the coecient function c is, in general, discontinuous in the state
variable and, therefore, could not be Lipschitz. The above specications can
also be used for advanced credit risk models with multiple obligors and correlated intensities, as discussed in Sch
onbucher (2003). In the fast growing
energy markets, quantities such as electricity prices are often described by
rather complex jump diusion SDEs of the type (1.8.2); see, for instance,
Geman & Roncoroni (2006) for a class of jump diusion models that capture
the typical mean reversion feature of electricity prices.
Other important examples of jump diusions of the form (1.8.2) arise in the
pricing and hedging of complex interest rate derivatives in a jump diusion
term structure environment. We refer to Bj
ork et al. (1997) for the HeathJarrow-Morton framework with jumps and to Glasserman & Kou (2003) for
a LIBOR market model with jumps, see also Bruti-Liberati, NikitopoulosSklibosios & Platen ((2009)) for term structure models with jumps under the
benchmark approach.
For example, let us mention a specic LIBOR market model with jumps
proposed in Samuelides & Nahum (2001) for pricing short-term interest rate
derivatives. Given a set of equidistant tenor dates T1 , . . . , Td+1 , with Ti+1
Ti = for i {1, . . . , d}, the components of the vector X t = (Xt1 , . . . , Xtd )
represent discrete compounded forward rates at time t, maturing at tenor
dates T1 , . . . , Td , respectively. In this case the model is driven by one Wiener
process, m = 1, and two Poisson processes, r = 2. The diusion coecient is
specied as b(t, x) = x, with a d-dimensional vector of positive numbers,
and the jump coecient is given by c(t, x) = x, where is a d 2-matrix
with i,1 > 0 and i,2 < 0, for i {1, . . . , d}. In this way the rst jump process generates upward jumps, while the second one creates downward jumps.
Moreover, the marks are set to i = 1 so that the two driving jump processes
are standard Poisson processes. A classical no-arbitrage restriction on the
evolution of forward rates under the Td+1 -forward measure, see Bjork et al.
(1997) and Glasserman & Kou (2003), imposes a particular nonlinear form on
the drift coecient a(t, x). Its ith component is given by
ai (t, x1 , . . . , xd ) =
57
d
j
"
xj
j
1
j,1 x
1
+
1 + xj
1 + xj
j=i+1
j=i+1
+ 2
d
d
"
1 + j,2
j=i+1
x
1 + xj
,
(1.8.9)
for i {1, . . . , d}, where 1 and 2 denote the intensities of the two Poisson processes. A complex nonlinear drift coecient, as the one in (1.8.9), is
a typical feature of LIBOR market models. This makes the application of
numerical techniques essential for the pricing and hedging of complex interest rate derivatives. We refer to Glasserman & Merener (2003b) for numerical
approximations of jump diusion LIBOR market models and to Chap. 3 for
more details on models with jumps.
58
(1.9.3)
for all t [0, T ] and x d . Note that the linear growth conditions can
usually be derived from the corresponding Lipschitz conditions.
Moreover, we assume that the initial value X 0 is A0 -measurable with
E(|X 0 |2 ) < .
(1.9.4)
Theorem 1.9.3. Suppose that the coecient functions a(), b() and c()
of the SDE (1.8.2) satisfy the Lipschitz conditions (1.9.2), the linear growth
conditions (1.9.3) and the initial condition (1.9.4). Then the SDE (1.8.2)
admits a unique strong solution. Moreover, the solution X t of the SDE (1.8.2)
satises the estimate
(1.9.5)
E
sup |X s |2 C 1 + E(|X 0 |2 )
0sT
1.10 Exercises
59
For certain applications, the Lipschitz condition (1.9.2) on the jump coefcient c is too restrictive. For instance, for modeling state-dependent intensities, as discussed in Sect. 1.8, it is convenient to use jump coecients that are
not Lipschitz continuous. Athreya, Kliemann & Koch (1988) provide some results on the existence and uniqueness of the solution of the SDE (1.8.2) when
the jump coecient c is not Lipschitz continuous. The Yamada condition, see
Ikeda & Watanabe (1989), provides another condition for SDEs that do not
satisfy Lipschitz type conditions. In Cherny (2000) it is shown how the uniqueness of strong solutions of SDEs may break down when Lipschitz conditions
are violated. However, if one species at certain boundaries the behavior of
the solution of an SDE it can typically be made unique. In the mentioned
literature one nds also the notion of a weak solution of an SDE. We note
that if an SDE has a strong solution, then it has also a weak solution. It is
beyond the scope of this book to go deeper into these issues.
Moment Estimates
It is convenient to have the following moment estimates for SDEs with jumps
when estimating errors in discrete-time approximations.
Theorem 1.9.4. Suppose that the coecient functions a(), b() and c() of
the SDE (1.8.2) satisfy the Lipschitz conditions (1.9.2) and the linear growth
conditions (1.9.3). Moreover, let for n {1, 2, . . .}
E(|X 0 |2n ) < .
Then the solution X t of (1.8.2) satises
2n
E
sup |Xs |
C 1 + E |X0 |2n
(1.9.6)
(1.9.7)
0sT
1.10 Exercises
1.1. Determine for a geometric Brownian motion Zt = Z0 exp{ t+ Wt } the
o formula, where W is a
It
o dierential for Zt and ln(Zt ) by the use of the It
standard Wiener process.
1.2. Consider two transformed Wiener processes with Yt1 = a1 t + b1 Wt1 and
Yt2 = a2 t + b2 Wt2 , where W 1 and W 2 are two independent standard Wiener
processes. What is the Ito dierential for Yt1 Yt2 ?
60
1.3. Calculate the covariation between a standard Wiener process and its
square.
1.4. For a process X = {Xt , t [0, )} with Xt = Wt + Nt , where W
is a standard Wiener process and N a Poisson process with intensity > 0,
characterize the stochastic dierential of its exponential when , > 0.
1.5. For the sum Xt = a Nt1 + b Nt2 , where N 1 and N 2 are two independent
Poisson processes with intensity > 0, compute the stochastic dierential of
the exponential.
1.6. Compute the mean and variance as functions of time of the standard
Ornstein-Uhlenbeck process with initial value Xt0 = 1.
1.7. Solve explicitly the scalar SDE
1
dXt = Xt dt + Xt dWt1 + Xt dWt2 ,
2
where W 1 and W 2 are independent standard Wiener processes.
1.8. Let N = {Nt , t [0, )} be a Poisson process with intensity > 0 and
W a standard Wiener process. For the process Z = {Zt , t [0, )} with
Zt = a t + b Wt + c Nt
compute the SDE for X = {Xt = exp{k Zt }, t [0, )}, where a, b, c, k > 0.
1.9. For the process X in Exercise 1.8 compute the expectation
(t) = E(Xt ).
1.10. For a Poisson process with intensity > 0 derive a formula for its
second moment at time t > 0.
1.11. Compute the rst moment for a compound Poisson process with intensity > 0 and U (0, 1) distributed i.i.d. marks.
1.12. What is the probability for a compound Poisson process with intensity
> 0 and U (0, 1) distributed i.i.d. marks to have no jump until time t > 0?
2
Exact Simulation of Solutions of SDEs
61
62
(1995a), J
ackel (2002) and Glasserman (2004). Many discrete-time simulation
methods have been developed over the years. However, some SDEs can be
problematic in terms of discrete-time approximation via simulation. Therefore, it is necessary to understand and avoid the problems that may arise
during the simulation of solutions of such SDEs. For illustration, let us consider a family of SDEs of the form
dXt = a(Xt )dt +
2Xt dWt
(2.1.1)
Xt dWt ,
(2.1.2)
dYt =
63
2 /4
Yt dt + dWt .
2Yt
2
2
(2.1.3)
64
in mind since most of the literature is primarily concerned with the modeling
via SDEs.
Inverse Transform Method
The following well-known inverse transform method can be applied for the
generation of a continuous random variable Y with given probability distribution function FY . From a uniformly distributed random variable 0 < U < 1,
we obtain an FY distributed random variable y(U ) by realizing that
so that
U = FY (y(U )),
(2.2.1)
y(U ) = FY1 (U ).
(2.2.2)
Here FY1 denotes the inverse function of FY . More generally, one can still set
y(U ) = inf{y : U FY (y)}
(2.2.3)
65
. . ., FYd |Y1 ,Y2 ,...,Yd1 . Then the inverse transform method can be applied to
each conditional transition distribution function one after the other. This also
shows that it is sucient to characterize explicitly in a model just the marginal
and conditional transition distribution functions.
Note also that nonparametrically described transition distribution functions are sucient for application of the inverse transform method. Of course,
explicitly known parametric distributions are preferable for a number of practical reasons. They certainly reduce the complexity of the problem itself by
splitting it into a sequence of problems. Further, we will list below important
examples of explicitly known multi-variate transition densities and distribution functions.
Copulas
Each multi-variate distribution function has its, so called copula, which characterizes the dependence structure between the components. Roughly speaking,
the copula is the joint density of the components when they are each transformed into U (0, 1) distributed random variables. Essentially, every multivariate distribution has a corresponding copula. Conversely, each copula can
be used together with some given marginal distributions to obtain a corresponding multi-variate distribution function. This is a consequence of Sklars
theorem, see for instance McNeil et al. (2005).
If, for instance, Y Nd (, ) is a Gaussian random vector, then the
copula of Y is the same as the copula of X Nd (0, ), where 0 is the
zero vector and is the correlation matrix of Y. By the denition of the
d-dimensional Gaussian copula we obtain
Ga
C
= P (N (X1 ) u1 , . . . , N (Xd ) ud ) = N (N 1 (u1 ), . . . , N 1 (ud )),
(2.2.4)
where N denotes the standard univariate normal distribution function and
N denotes the joint distribution function of X. Hence, in two dimensions we
obtain
N 1 (u1 )
N 1 (u2 )
(s21 2s1 s2 + s22 )
1
Ga
C (u1 , u2 ) =
exp
2(1 2 )
2(1 2 )1/2
ds1 ds2 ,
(2.2.5)
1/
CCl = (u
,
1 + . . . + ud d + 1)
0,
(2.2.6)
66
Laplace-Stieltjes transforms of distribution functions on + . If F is a distribution function on + satisfying F (0) = 0, then the Laplace-Stieltjes transform
can be expressed by
F (t) =
etx dF (x), t 0.
(2.2.7)
0
exp
p(s, x; t, y) =
, (2.2.9)
2(t s)
(2(t s))d/2 det
for t [0, ), s [0, t] and x, y d . Here is a normalized covariance
matrix. Its copula is the Gaussian copula (2.2.4), which is simply derived from
the corresponding multi-variate Gaussian density. In the bivariate case with
correlated Wiener processes this transition probability simplies to
p(s, x1 , x2 ; t, y1 , y2 ) =
2(t s) 1 2
2
(y1 x1 ) 2(y1 x1 )(y2 x2 ) + (y2 x2 )2
exp
,
2(t s) (1 2 )
(2.2.10)
67
Fig. 2.2.1. Bivariate transition density of the two-dimensional Wiener process for
xed time step = 0.1, x1 = x2 = 0.1 and = 0.8
above inverse transform method by rst using the Gaussian distribution for
generating the increments of the rst Wiener process component. Then one
can condition the Gaussian distribution for the second component on these
outcomes.
In Fig. 2.2.1 we illustrate the bivariate transition density of the twodimensional Wiener process for the time increment = t s = 0.1, initial
values x1 = x2 = 0.1 and correlation = 0.8. One can also generate dependent
Wiener processes that have a joint distribution with a given copula.
Transition Density of a Multi-dimensional Geometric
Brownian Motion
The multi-dimensional geometric Brownian motion is a componentwise exponential of a linearly transformed Wiener process. Given a vector of correlated
Wiener processes W with the transition density (2.2.9) we consider the following transformation
S t = S 0 exp{at + BW t },
(2.2.11)
68
Then the transition density of the above dened geometric Brownian motion has the following form
p(s, x; t, y) =
(2(t
s))d/2
&d
det i=1 bi yi
(2.2.13)
(ln(y) ln(x) a(t s)) B 1 1 B 1 (ln(y) ln(x) a(t s))
exp
2(t s)
for t [0, ), s [0, t] and x, y d+ . Here the logarithm is understood componentwise. In the bivariate case this transition density takes the particular
form
p(s, x1 , x2 ; t, y1 , y2 ) =
2(t s) 1 2 b1 b2 y1 y2
(ln(y1 ) ln(x1 ) a1 (t s))2
exp
2(b1 )2 (t s)(1 2 )
(ln(y2 ) ln(x2 ) a2 (t s))2
exp
2(b2 )2 (t s)(1 2 )
(ln(y1 ) ln(x1 ) a1 (t s))(ln(y2 ) ln(x2 ) a2 (t s))
exp
,
b1 b2 (t s)(1 2 )
for t [0, ), s [0, t] and x1 , x2 , y1 , y2 + , where [1, 1].
In Fig. 2.2.2 we illustrate the bivariate transition density of the twodimensional geometric Brownian motion for the time increment = t s =
0.1, initial values x1 = x2 = 0.1, correlation = 0.8, volatilities b1 = b2 = 2
and growth parameters a1 = a2 = 0.1.
69
(2(1
det
(y xe(ts) ) 1 (y xe(ts) )
exp
, (2.2.14)
2(1 e2(ts) )
e2(ts) ))d/2
2 1
1 2
2
2 !
y1 x1 e(ts) + y2 x2 e(ts)
exp
2 1 e2(ts) (1 2 )
!
y1 x1 e(ts) y2 x2 e(ts)
exp
,
1 e2(ts) (1 2 )
p(s, x1 , x2 ; t, y1 , y2 ) =
e2(ts)
(2.2.15)
(2.2.16)
&d
(2(1 e2(ts) ))d/2 det i=1 yi
(ln(y) ln(x)e(ts) ) 1 (ln(y) ln(x)e(ts) )
exp
,
2(1 e2(ts) )
for t [0, ), s [0, t] and x, y d+ , d {1, 2, . . .}. In the bivariate case
the transition density of the multi-dimensional OU-process is of the form
p(s, x1 , x2 ; t, y1 , y2 ) =
2 1
1
e2(ts)
1 2 y1 y2
2
2 !
ln(y1 ) ln(x1 )e(ts) + ln(y2 ) ln(x2 )e(ts)
exp
2 1 e2(ts) (1 2 )
!
ln(y1 ) ln(x1 )e(ts) ln(y2 ) ln(x2 )e(ts)
exp
, (2.2.17)
1 e2(ts) (1 2 )
70
Fig. 2.2.3. Bivariate transition density of the two-dimensional geometric OUprocess for = 0.1, x1 = x2 = 0.1 and = 0.8
XY
;
0F 1
2 4(t s)2
m1
4
1
det(Y)
X +Y
=
etr
2(t s)
(2(t s))m(m+1)/2 det(X)
XY
,
(2.2.18)
I (m1)/2
4(t s)2
71
see Donati-Martin, Doumerc, Matsumoto & Yor (2004). Here etr{} denotes
the elementwise exponential of the trace of a matrix and I is a special function of a matrix argument with index dened by
I (z) =
(det z)/2
0 F 1 ((m + 1)/2 + , z).
m ((m + 1)/2 + )
(2.2.19)
m+1
m (x) =
etr{}(det())x 2 d.
(2.2.20)
>0
The transition density of the Wishart process, when it starts at time zero
at X = 0 for being at time t (0, ) in Y 0, can be written as
det(Y)(m1)/2
Y
etr
p (0, 0; t, Y) = (2 t)m/2
.
(2.2.21)
2t
m ( 2 )
Herz (1955) derived a representation of the non-central Wishart density
(m)
in terms of a Bessel function of matrix argument A . The advantage of this
representation is that it also accounts for correlation. It has the form
1
(2.2.22)
(2(t s) det())/2
1 (X + Y)
1 Y 1 X
(m)
etr
,
det(Y)(m1)/2 A(m1)/2
2(t s)
4(t s)2
p (s, X; t, Y) =
1
1
(1)
A
(z)
=
(tr(z)) det(z)j ,
A(2)
j=0 j! ( + j + 1) +2j+1/2
(2.2.23)
(1)
where A (z) = j=0 (z)j /(j! ( + j + 1)). Here () denotes the wellknown gamma function.
Hence, with the use of (2.2.22) and (2.2.23) we obtain for m = 2 the
following transition density for the 2 2 Wishart process of dimension = 2:
1 2
0 F3 3 , 3 , 1; K
72
where
K=
1
(x22 y11 2 + x12 y12 2 x12 y11 x22 y12 x22 y21
108(s t)2 (2 1)2
x12 y22 + x12 y21 + x21 (y12 + (y11 + y21 y22 )) + x22 y22
+ x11 (y11 + (y12 y21 + y22 )))
and
1
t =
4
c2u
du
su
(2.2.27)
(2.2.28)
exp
p(s, x; t, y) =
I
2 st t x st
2 t
t
for 0 s < t < and x, y (0, ), where I () is the modied Bessel
function of the rst kind with index = 2 1.
Transition Density of a Matrix SR-process
For the multi-dimensional SR-process, given in terms of a d m matrix,
we have an analytic transition density that can be derived from (2.2.18) or
(2.2.22) in the form
Y
;
(t),
p (s), X
ss
st
(2.2.29)
p(s, X; t, Y) =
st
73
b2
(1 exp{
ct})
4
cs0
(2.2.30)
and st = s0 exp{
ct} for t [0, ), s0 > 0, c < 0 and b
= 0.
As an example, we write down the transition density of the 2 2 matrix
SR-process of dimension = 2. This density follows from (2.2.24) and it can
be expressed by
p(s, x11 , x12 , x21 , x22 ; t, y11 , y12 , y21 , y22 )
8
c2 s20 ec(2s+t) 0 F3 13 , 23 , 1; K
=
b4 (ecs ect )2 (1 2 ) e2ct (y11 y22 y12 y21 )
2
cect (x11 (x12 + x21 ) + x22 ) 2
cecs (y11 (y12 + y21 ) + y22 )
exp
b2 (ecs ect ) (2 1)
'
c4 e2c(s+t) (x12 x21 x11 x22 ) (y12 y21 y11 y22 )
cosh 8
,
(2.2.31)
b8 (ecs ect )4 (2 1)2
where
K=
4
c2 ec(s+t)
2
2
27b4 (ecs ect ) (2 1)
x22 y21 x21 y22 + x21 y12 + x12 (y21 + (y11 + y12 y22 ))
+ x22 y22 + x11 y22 2 (y12 + y21 ) + y11 .
The above transition density looks complex. However, it has the great advantage of providing an explicit formula, which is extremely valuable in many
quantitative investigations.
Transition Density of Multi-dimensional L
evy Processes
Levy processes are a special class of processes which have independent and
stationary increments, see Sect. 1.7. It turns out that the distributions of
the independent increments of Levy processes are innitely divisible, see
Cont & Tankov (2004). A widely used class of distributions for the increments with this property are members of the family of generalized hyperbolic
(GH) distributions. It is always possible to construct a Levy process so that
the value of the increment of the process over a xed time interval has a given
GH distribution. The GH density can be expressed as
74
(
1
1
1
K( d ) ( + (x ) (x ))( + ) e(x)
2
p(x) = c
(
( + (x )
(x ))( +
( d2 )
)
(2.2.32)
where
c=
() ( + 1 )(d/2)
,
(2)d/2 det()1/2 K
(2.2.33)
K1 ()
,
1
(2.2.34)
where
1 1 2 1 + 2 2 1 2 1 x1 + 2 x1 2 x2 + 1 x2
,
2 1
(
= (12 + 22 1 22 + 2 ) B ,
A=
B=
and
c=
12 22 1 +22 2 +
12
1 2 K
1
.
(2.2.35)
75
subject to Madan & Seneta (1990), Eberlein & Keller (1995), Bertoin (1996),
Protter & Talay (1997), Barndor-Nielsen & Shephard (2001), Geman et al.
(2001), Eberlein (2002), Kou (2002), Rubenthaler (2003), Cont & Tankov
(2004), Jacod, Kurtz, Meleard & Protter (2005) and Kl
uppelberg, Lindner &
Maller (2006).
Other Explicit Transition Densities
There is still a wider range of one-dimensional Markov processes with explicit transition densities than those mentioned so far. For instance, Craddock & Platen (2004) consider generalized square root processes, see also
Platen & Heath (2006). These are diusion
processes X = {Xt , t [0, )}
76
(ii)
a(x) =
x
, > 0,
1 + 2 x
)
exp (x+y)
2 xy
t
x xy
p(0, x; t, y) =
+
I1
+ t (y) ;
y
2
t
1 + 2 x t
(iii)
1+3 x
,
a(x) =
2(1 + x)
2 xy
cosh
2 xy
t
p(0, x; t, y) =
1 + y tanh
t
y t (1 + x)
(x + y)
exp
;
t
(iv)
)
1 5
1
,
a(x) = 1 + tanh + ln(x) , =
2
2 2
p(0, x; t, y) =
2
2 xy
2 xy
x
I
+ e2 y I
y
t
t
(v)
(vi)
(vii)
exp{ x+y
t }
;
(1 + exp{2 } x ) t
1
a(x) = + x,
2
(t + 2 x) y exp{ x}
t
(x + y)
p(0, x; t, y) = cosh
exp
;
t
t
4
yt
1
a(x) = + x tanh( x),
2
2 xy
cosh
cosh( y)
t
t
(x + y)
exp
p(0, x; t, y) =
;
yt
t
4
cosh( x)
1
a(x) = + x coth( x),
2
2 xy
sinh
sinh( y)
t
t
(x + y)
exp
p(0, x; t, y) =
;
yt
t
4
sinh( x)
(viii)
77
t }
y 2 I
p(0, x; t, y) =
y 2 I
;
t
t
2 t sin(ln( x))
(ix)
a(x) = x coth
x
2
sinh( y2 )
(x + y)
exp
p(0, x; t, y) =
sinh( x2 )
2 tanh( 2t )
)
exp{ 2t }
xy
x
I1
+
(y)
;
exp{t} 1 y
sinh( 2t )
(x)
a(x) = x tanh
x
2
cosh( y2 )
(x + y)
exp
p(0, x; t, y) =
cosh( x2 )
2 tanh( 2t )
)
exp{ 2t }
xy
x
I1
+ (y) .
exp{t} 1 y
sinh( 2t )
Here I is the modied Bessel function of the rst kind with index , see
Platen & Heath (2006) and (2.2.19).
The rst drift above in (i) is that of the well-known squared Bessel process.
However, the other drifts are mostly from previously unknown scalar diusion processes. It is obvious that some multi-dimensional versions of most of
these processes can be constructed. It is an area of ongoing research that
identies natural dependence structures between dierent components of diffusions with explicit transition densities. One can, of course, always start from
a given copula and generate dependent diusions with given marginal transition distributions. An easy case is obtained, when all components of the
multi-dimensional diusion are independent. In general, each component of a
multi-dimensional diusion can come from various types of conditional transition distribution functions. This gives a great variety of multi-dimensional
models with paths that can be exactly simulated via the inverse transform
method.
Illustration of the Inverse Transform Method
Let us illustrate simulation via the inverse transform method for the following simple example of a two-dimensional standard OU-process. The marginal
78
transition distribution function of this process for its rst component can be
represented as
1
FY1 (s, x1 , t, y1 ) =
1 e2(ts)
1 2
2
!
#
)
1
(ts)
2(ts)
1e
(2.2.36)
1 erf x1 e
2
'
'
!
2(ts) 1
e
x21
(ts)
2(ts)
+e
1e
erf
x1
x21
2(e2(ts) 1)
+ y1 e
ts
x1
'
1 e2(ts)
erf
x1 e(ts) y1
x1 e(ts) y1
2 1 e2(ts)
! *
,
for x1 , y1 and t > s. Recall that erf{} denotes the well-known error function, see Abramowitz & Stegun (1972). Additionally, the conditional transition distribution function of the second component given the rst component is
'
1
1
FY2 |Y1 (s, x1 , x2 , t, y1 , y2 ) =
(2.2.37)
2 (1 e2(ts) )(1 2 )
#(
(1 e2(ts) )(1 2 ) x2 x1 ets (y2 y1 )
'
'
*
(x x ets (y y ))2
2
1
2
1
,
2 erf
2(e2(ts) 1)(1 2 )
(x2 x1 ets (y2 y1 ))
(1 e2(ts) )(1 2 )
79
dimensional distribution function directly rather than using the inversetransform method. For instance, the increment of the multi-dimensional
Wiener process can be simulated from the following exact relation
X t+ X t Nd (0, ),
(2.3.1)
80
(2.3.2)
(2.3.3)
For details on how to sample conveniently from the non-central Wishart distribution we refer to Gleser (1976).
The increments of a Levy processes constructed from a GH distribution
can be obtained by
X t+ X t GHd (, , , , , )
(2.3.4)
(2.3.5)
X t+ X t + W + W AZ.
Here Z Nk (0, Ik ), where W 0 is a non-negative scalar generalized inverse
Gaussian (GIG) random variable that is independent of Z. Here I k denotes a
k-dimensional unit matrix. A is a d k matrix and and are d-dimensional
vectors. Hence, the conditional distribution of increment X t+ X t given
W = w is conditionally Gaussian Nd ( + W , w), where = AA .
81
+ N i+1 ,
(2.3.6)
82
(2.3.7)
1, W
2, . . . , W
d are transformed scalar Wiener prosuch that its components W
t
t
t
cesses. In vector notation such a d-dimensional transformed Wiener process
can be expressed by the linear transform
t = at + BW t ,
W
(2.3.8)
where a = (a1 , a2 , . . . , ad ) is a d-dimensional vector and B is a d mmatrix and W t = (Wt1 , Wt2 , . . . , Wtm ) is an m-dimensional standard Wiener
is such that
process. Note that the kth component of W
tk = ak t +
W
m
bk,i Wti ,
(2.3.9)
i=1
i+1 ,
t
N
W
i+1 = W ti + a +
(2.3.10)
i+1
for ti = i, i {0, 1, . . .} with > 0. Here the random vector N
Nd (0, ) is a d-dimensional Gaussian vector with = BB , which is independent for each i {0, 1, . . .}.
We display in Fig. 2.3.1 (b) a trajectory of a two-dimensional transformed
= {W
t = (W
1, W
2 ) , t [0, 10]} with correlated compoWiener process W
t
t
nents. The two components of this process are as follows
t1 = Wt1 ,
W
2 = W 1 +
W
t
t
(2.3.11)
1
2 Wt2 ,
(2.3.12)
for t [0, 10] with correlation = 0.8. In Fig. 2.3.1 (a) we showed the corresponding independent Wiener paths Wt1 and Wt2 for t [0, 10]. We note
t2 in Fig. 2.3.1 (b). The
t1 and W
the expected strong similarity between W
can also be expressed using matrix multwo-dimensional Wiener process W
t = BW t , where
tiplication as W
1
0
B=
(2.3.13)
1 2
and W t = (Wt1 , Wt2 ) .
83
+ N i+1 ,
(2.3.14)
Id 0 . . . 0
0 Id . . . 0
(2.3.15)
Im Id = . . . . .
.. .. .. ..
0 0 . . . Id
Moreover, similarly to the vector case, we are able to dene a transformed
= {W
t , t [0, )} using the above matrix Wiener
matrix Wiener process W
process W as follows
t = M t + 1W t,
(2.3.16)
W
2
where M is a d m matrix and 1 and 2 are nonsingular d d and
m m matrices, respectively. Values of such a matrix stochastic process can
be obtained at times ti = i by the following recursive computation
W0 = 0
i+1 ,
t
N
W
i+1 = W ti + M +
(2.3.17)
2
2
1 . . . 1,m
1
1,1 1 1,2
2
2
2
2,1
1 2,2
1 . . . 2,m
1
2 1 =
(2.3.18)
,
..
..
..
..
.
.
.
.
2
2
2
1 m,2
1 . . . m,m
1
m,1
1 d
2 m
]i,j and 2 = [i,j
]i,j .
where 1 = [i,j
Let us illustrate this matrix valued stochastic process for a 2 2 matrix
, which
case. In Fig. 2.3.2 we display a transformed matrix Wiener process W
was obtained from a standard 22 matrix Wiener process W by the following
transformation
84
Fig. 2.3.2. 2 2 matrix Wiener process with independent elements in rows and
dependent elements in columns
Fig. 2.3.3. 2 2 matrix Wiener process with both correlated rows and columns
t = 1W tI ,
W
(2.3.19)
t = 1W t,
W
2
85
(2.3.20)
(2.3.21)
W (ti+1 ) = W (ti ) +
where the vector N i+1 Nd (0, I) is formed by independent standard Gaussian random variables. Obviously, it is possible to apply dierent time changes
to dierent elements of the vector W . For instance to prepare the representation of Ornstein-Uhlenbeck processes, let us dene
j (t) =
b2j 2cj t
(e
1)
2cj
(2.3.22)
for t [0, ), bj > 0, cj > 0 and j {1, 2, . . . , d}, see (2.2.30). Then the
elements of the vector W (t) are such that
Wj j (ti+1 ) Wj j (ti ) N (0, j (ti+1 ) j (ti )),
(2.3.23)
Wj,k
Wj,k
N (0, j,k (ti+1 ) j,k (ti )) ,
j,k (ti+1 )
j,k (ti )
(2.3.25)
86
where Wk,j
= 0, ti = i, i {0, 1, . . .} and j {1, 2, . . . , d}, k
k,j (0)
{1, 2, . . . , m}. Again, we may, for instance, dene the (j, k)-th time transformation by
b2j,k 2c t
e j,k 1
(2.3.26)
j,k (t) =
2cj,k
for t [0, ), bj,k > 0, cj,k > 0, and j {1, 2, . . . , d}, k {1, 2, . . . , m}.
In order to obtain the time changed matrix Wiener process with correlated
elements we can use the formula (2.3.16).
In Fig. 2.3.4 we display a matrix time changed Wiener process for d =
m = 2 with the covariance matrix I 1 , where 1 is as
in (2.3.13), = 0.8
and the parameters in the time change equal bj,k = 2 and cj,k = 1 for
j, k = {1, 2}. That is, the same time change is applied to each of the elements
by the relation
of this matrix Wiener process. Namely, we construct W
(t) = 1 W (t) I .
W
(2.3.27)
whose rows
In this case we obtain a matrix time changed Wiener process W
have independent elements, while columns have dependent elements.
Multi-dimensional OU-Processes
Let us now consider multi-dimensional Ornstein-Uhlenbeck (OU)-processes,
covering both vector and matrix valued OU-processes, see Sect. 1.7. We will
here construct the multi-dimensional OU-process as a time changed and scaled
multi-dimensional Wiener process. Note that given the following two functions
st = exp{ct} and
(t) =
b2 2ct
(e 1)
2c
87
(2.3.28)
for t [0, ), b, c > 0, the scalar OU-process Y = {Yt , t } can be represented in terms of a time changed and scaled scalar Wiener process, that
is
(2.3.29)
Yt = st W(t) ,
where W = {W , 0} is a standard Wiener process in -time. By Itos
formula we obtain
dYt = W(t) dst + st dW(t) =
b
Yt
cst dt + st dW
t
st
st
t,
= cYt dt + b dW
(2.3.30)
t , with W
denoting a standard Wiener process in t-time.
where dW(t) = sbt dW
Thus, by a straightforward time change and an application of the It
o formula
one obtains a mean-reverting OU-process out of a basic Wiener process.
It is also straightforward to obtain a vector OU-process by
Y t = st W (t) ,
(2.3.31)
(2.3.32)
that is,
for j {1, 2, . . . , d} and t 0. The generalization to a matrix OU-process is
obvious. The construction of this process starts by forming a d m matrix
time changed Wiener process and then scaling each element of this matrix by
a function sj,k
t for j {1, 2, . . . , d} and k {1, 2, . . . , m}. Hence, the elements
of such a matrix can be expressed by
j,k
Ytj,k = sj,k
t Wj,k (t)
(2.3.33)
(2.3.34)
88
(2.3.35)
for t [0, ) and i {1, 2, . . . , }. Furthermore, we can form the sum of the
squared OU-processes, that is,
Yt =
(Xti )2
(2.3.36)
i=1
b2 2c(Xti )2 dt + 2b
Xti dWti
i=1
(2.3.37)
i=1
for t [0, ). In order to simplify the above SDE we introduce another Wiener
= {W
t , t [0, )} dened as
process W
t =
W
s =
dW
0
i=1
Xi
s dWsi
Ys
(2.3.38)
equals
for t [0, ). It can be shown that the quadratic variation of W
]t =
[W
t
(Xsi )2
ds = t.
Ys
0 i=1
(2.3.39)
is a standard
Hence, by the Levy theorem, see Theorem 1.3.3, we see that W
Wiener process. Therefore, we obtain an equivalent SDE for the square root
process Y in the form
t
dYt = (b2 2cYt ) dt + 2b Yt dW
89
(2.3.40)
for t [0, ) with Y0 = i=1 (xi0 )2 . Note that this process is a SR-process
of dimension {1, 2, . . .}. It is well-known that for = 1 the value Yt can
reach zero and is reected at this boundary. For {2, 3, . . .} the process
never reaches zero for Y0 > 0, see Revuz & Yor (1999).
Matrix Valued Squares of OU-Processes
Kendall (1989) and Bru (1991) studied the matrix generalization for squares
of OU-processes. Denote by X t a m matrix solution of the SDE
dX t = cX t dt + b dW t ,
(2.3.41)
St)
(2.3.43)
90
convenient way for the case when the dimension of this process is an integer
{1, 2, . . .}. This process can be described by the following SDE
(
(2.3.44)
dX = d + 2 |X | dW
for [0 , ) with X0 = x 0, where W = {W , [0 , )} is a
standard Wiener process in -time starting at the initial -time, = 0 , at
zero. This means, for [0 , ) one has the increment in the quadratic
variation of W as
[W ] [W ]0 = 0
for all [0 , ). Furthermore, if we x the behavior of X at zero as
reection, then the absolute sign under the square root in (2.3.44) can be
removed, and X remains nonnegative and has a unique strong solution, see
Revuz & Yor (1999).
It is a useful fact that for {1, 2, . . .} and x 0 the dynamics of a
BESQx process X can be expressed as the sum of the squares of independent
Wiener processes W 1 , W 2 , . . . , W in -time, which start at time = 0 in
w1 , w2 , . . . w , respectively, such that
x=
(wk )2 .
(2.3.45)
k=1
k=1
(wk + Wk )2
(2.3.46)
91
(wk + Wk )dWk
(2.3.47)
k=1
(wk )2 = x.
(2.3.48)
k=1
Furthermore, by setting
dW = |X | 2
1
(wk + Wk )dWk
(2.3.49)
k=1
we satisfy with (2.3.46) the SDE (2.3.44). Note that we have for W the
quadratic variation
[W ] =
0
1 k
(w + Wsk )2 ds = .
Xs
(2.3.50)
k=1
(2.3.51)
+
and initial matrix s0 = W
0 W 0 for t , where W t is the value at time
t 0 of a m matrix Wiener process. Ito calculus applied to the relation
(2.3.51) results in the following SDE
dS t = Idt + dW
t W t + W t dW t ,
(2.3.52)
t expressed by
where I is the m m identity matrix. It can be shown that W
t=
dW
St
1
W
t dW t
(2.3.53)
St
is the inverse of the matrix S t .
positive square root of S t , while
Note also that
92
= dW W t
dW
t
t
=
dW
t Wt
St
1
dW
t Wt
( 1
S
= dW
t
t Wt
St
St
1
1
(2.3.54)
t + dW
St
dS t = Idt + S t dW
(2.3.55)
t
for t + .
In Fig. 2.3.2 we showed the trajectories of the elements of a 2 2 matrix
Wiener process. Now, in Fig.2.3.7 we plot a 22 Wishart process of dimension
= 2 obtained from the Wiener process in Fig. 2.3.2. Recall that the matrix
Wiener process in this example was obtained by assuming the covariance
matrix I 1 , where 1 is as in (2.3.13).
SR-Processes via Squared Bessel Processes
Using squared Bessel processes one can derive SR-processes by certain transformations. For this reason let c : [0, ) and b : [0, ) be given
deterministic functions of time. We introduce the exponential
t
st = s0 exp
cu du
(2.3.56)
0
93
(t) = (0) +
1
4
t
0
b2u
du
su
(2.3.57)
for t [0, ) and s0 > 0 dependent on t-time. Note that we have an explicit
representation for the function (t) in the case of constant parameters bt =
b
= 0 and ct = c
= 0, where
(t) = (0) +
b2
(1 exp{
ct})
4
cs0
(2.3.58)
for t [0, ) and s0 > 0. Furthermore, if (0) = 4bcs0 , this function simply
equals
b2
(t) =
exp{
ct}
(2.3.59)
4
cs0
for t [0, ), s0 > 0, b
= 0 and c
= 0. We show the function (t) in Fig. 2.3.8
2
for the choice of b = 1, c = 0.05, s0 = 20 and (0) = b = 0.25. The
4
cs0
b2
and X(0) = 4
cs0 this expected value simplies to
E(X(t) |A(0) ) =
for t [0, ).
b2
exp{
ct}
4
cs0
(2.3.61)
94
Given a squared Bessel process X of dimension > 0 and using our previous notation we introduce the SR-process Y = {Yt , t 0} of dimension > 0
with
(2.3.62)
Yt = st X(t)
in dependence on time t 0, see also Delbaen & Shirakawa (2002).
Further, by (2.3.44), (2.3.62), (2.3.56) and (2.3.57) and the It
o formula we
can express (2.3.62) in terms of the SDE
dYt =
4
b2t + ct Yt dt + bt
Yt dUt
(2.3.63)
4st
dW(t) .
b2t
Note that Ut forms by the Levy theorem, see Theorem 1.3.3, a Wiener process,
since
t
4sz
[U ]t =
d(z) = t.
(2.3.64)
2
0 bz
The same time-change formula applies in the more general matrix case.
Given a Wishart process X it can be shown, that the matrix square root
process can be obtained from the Wishart process by the following transformation
(2.3.65)
Y t = st X (t) ,
where st and (t) are as in (2.3.56) and (2.3.57), respectively. By (2.3.55),
(2.3.65), (2.3.56) and (2.3.57) and the It
o formula we can express (2.3.65) in
terms of the matrix SDE
2
bt
(2.3.66)
bt I + ct Y t dt +
Y t dU t + dU
Yt
dY t =
t
4
2
(
t
for t [0, ), Y 0 = s0 X (0) and where dU t = 4s
dW (t) is the dierential
b2
t
95
96
dat
(2.3.67)
dt
for t [0, ) with a0 [0, ). More precisely, we dene the process R =
{Rt , t [0, )} such that
(2.3.68)
R t = Y t + at I
at =
(2.3.69)
bt
t + dW
R t At ,
R t At d W
dRt = b2t I + At ct At + ct Rt dt +
t
4
2
(2.3.70)
for t [0, ). Here At denotes the matrix of the derivatives of the type
(2.3.67) for the shifts of each element. In principle, we applied here the It
o
formula to the equation (2.3.69).
Multi-dimensional Geometric Ornstein-Uhlenbeck Processes
The Ito formula provides a general tool to generate a world of exact solutions
of SDEs based on functions of the solutions of those processes we have already
considered. As an example, let us generate explicit solutions for a geometric
OU-process. Here each element of a matrix valued OU-process is simply exponentiated. That is, denoting by Y t = [Ytj,k ]d,m
j,k=1 the corresponding d m
matrix geometric OU-process value at time t and by X t = [Xtj,k ]d,m
j,k=1 the
d m matrix OU-process value. We obtain the elements of the matrix Y t by
Ytj,k = exp{Xtj,k }
(2.3.71)
97
Levy processes, see Sect. 1.7. In principle, one can substitute the Wiener processes by some Levy processes.
Since Levy processes have independent stationary increments, it is possible to construct paths of a wide range of d-dimensional Levy processes
L = {Lt , t 0} at given discretization times ti = i, i {0, 1, . . .}, with xed
time step size > 0. The distribution of the Levy increments Lti+1 Lti ,
however, must be innitely divisible for the process L to be the transition
distribution of a Levy process, see Sect. 2.2. One example of a family of innitely divisible distributions is the generalized hyperbolic (GH) distribution,
see for instance McNeil et al. (2005). This family of distributions yields variance gamma (VG) and normal inverse Gaussian (NIG) processes as special
cases.
Simulation of the d-dimensional VG and NIG processes results from their
representation as subordinated vector Wiener processes with drift. That is,
Lt = aVt + BW Vt ,
(2.3.72)
(2.3.73)
98
i+1 ,
Vi+1 N
(2.3.74)
St
(2.3.75)
dS t = Idt + S t dLt + dL
t
for t + . In order to simulate this Wishart process, which may be driven by
the VG-process L, we rst need to simulate a m matrix VG-process. Afterwards we obtain the m m VG-Wishart process of dimension {1, 2, . . .}
+
by setting S t = L
t Lt , for t .
In Fig. 2.3.12 we show a trajectory of a gamma process, which is always
nondecreasing. Here we have chosen = 1. Moreover, in Fig. 2.3.13 we display
a 22 matrix VG-process, with parameters M = 0 and the covariance matrix
I 1 , where 1 = B is as in (2.3.13) with = 0.8. The subordinator is
here chosen to be the gamma process illustrated in Fig 2.3.12. Additionally,
in Fig. 2.3.14 we display the corresponding trajectory of the resulting 2 2
Wishart process of dimension = 2.
The subordination methodology can be widely applied to generate trajectories of other matrix Levy processes, for instance, matrix Levy OU-processes
and matrix Levy-ane processes.
99
100
Fig. 2.3.14. Wishart process driven by the matrix VG-process in Fig. 2.3.13
Multi-dimensional It
o Formula Application
We consider an m-dimensional Wiener process W = {W t = (Wt1 , . . . , Wtm ) ,
t [0, )}, a d-dimensional drift coecient vector function a : [0, T ] d
d and a dm-matrix diusion coecient function b : [0, T ]d dm . In
this framework we assume that we have already a general family of explicitly
solvable d-dimensional SDEs given as
dX t = a(t, X t )dt + b(t, X t )dW t ,
(2.4.1)
for t [0, ), X 0 d . This means that the kth component of (2.4.1) equals
dXtk = ak (t, X t )dt +
m
(2.4.2)
j=1
(2.4.3)
The expression for its pth component, resulting from the application of the
It
o formula, satises the SDE
d
d
m
p
p
2 p
U
U
U
1
+
dYtp =
dt (2.4.4)
ai
+
bi,l bj,l
i
t
x
2
x
i xj
i=1
i,j=1
l=1
m
d
l=1 i=1
bi,l
U p
dWtl ,
xi
101
for p {1, 2, . . . , k}, where the terms on the right-hand side of (2.4.4) are
evaluated at (t, X t ). Obviously, also the paths of the solution of the SDE
(2.4.4) can be exactly simulated since X ti can be obtained at all discretization
points and by (2.4.3) the vector Y ti is just a function of the components of
the vector X ti .
Multi-dimensional Linear SDEs
In practice, the multi-dimensional It
o formula turns out to be a useful tool if
one wants to construct solutions of certain multi-dimensional SDEs in terms
of known solutions of other SDEs. Let us illustrate this with an example
involving linear SDEs, where we rely on the results of Sect. 1.7. Recall from
(1.7.20) that a d-dimensional linear SDE can be expressed in the form
dX t = (At X t + t ) dt +
m
B lt X t + lt dWtl
(2.4.5)
l=1
for t 0 with X 0 d . Here A, B 1 , B 2 , . . . , B m are deterministic d d matrix functions of time and , 1 , 2 , . . ., m are deterministic d-dimensional
vector functions of time. It is possible to express according to (1.7.21) the
solution of (2.4.5) in the form
t
m
m
t
1
l l
1 l
l
s
B s s ds +
s s dWs .
s
Xt = t X0 +
0
l=1
l=1
(2.4.6)
Here t is the d d fundamental matrix satisfying 0 = I and the homogeneous matrix SDE
d t = At t dt +
m
B lt t dWtl ,
(2.4.7)
l=1
see (1.7.22). Unfortunately, it is not possible to solve (2.4.6) explicitly for its
fundamental solution in its general form. However, if the matrices A, B 1 , B 2 ,
. . . , B m are constant and commute, that is if
AB l = B l A and
BlBk = Bk Bl
(2.4.8)
for all k, l {1, 2, . . . , m}, then the explicit expression for the fundamental
matrix solution is given as
!
m
m
1 l 2
l
l
(2.4.9)
A
(B ) t +
B Wt ,
t = exp
2
l=1
l=1
102
(2.4.10)
k=1
for t [0, ) and j {1, 2, . . . , d}, where aj and bj,k are deterministic functions of time, see (1.7.13) and (1.7.31). Here W k , k {1, 2, . . . , d}, denote independent standard Wiener processes. To represent this SDE in matrix form
d
we introduce the diagonal matrix At = [Ai,j
t ]i,j=1 with
j
at for i = j
Ai,j
=
(2.4.11)
t
0 otherwise
and diagonal matrix B kt = [Btk,i,j ]di,j=1 with
j,k
bt for i = j
k,i,j
=
Bt
0
otherwise
(2.4.12)
for t [0, ). Consequently, we obtain for the jth asset price the explicit
solution
!
d
d
t
t
1 j,k 2
j
j
j
j,k
k
(bt ) ds +
bs dWs
as
(2.4.14)
St = S0 exp
2
0
0
k=1
k=1
for t [0, ) and j {1, 2, . . . , d}. When taking the following exponential
elementwise, the explicit solution of (2.4.10) can be expressed as
!
d
d
t
t
1 k 2
k
k
Bs
S t = S 0 exp
B s dWs
As
ds +
(2.4.15)
2
0
0
k=1
k=1
for t 0, see (1.7.31). If the appreciation rates and volatilities are piecewise
constant, then we can simulate exact solutions. In the case where these parameters are time dependent, one can generate in a straightforward manner
103
almost exact solutions by using, for instance, the trapezoidal rule. The main
advantage of the multi-dimensional Black-Scholes model, which made it so
popular, is that it has an explicit solution for the entire market dynamics and
allows a wide range of easy calculations.
For a two-dimensional Black-Scholes model with B 1 and B 2 we obtain the
following exact solution
1 2
1
1
1
a1 b 1 t + b 1 W t ,
(2.4.16)
St = S0 exp
2
1
a2 b22 t + b2 (Wt1 + 1 2 Wt2 ) , (2.4.17)
St2 = S02 exp
2
for t [0, ). The two components of the trajectory of this two-dimensional
model are illustrated in Fig. 2.4.1 for the parameter choice S01 = S02 = 1,
a1 = a2 = 0.1, b1 = b2 = 0.2 and = 0.8.
Multi-dimensional Linear Diusion Model
The continuous time limits of some popular time series models in nance can
be described by a multi-dimensional ARCH diusion model, see Sect. 1.7. The
squared volatilities of this particular model emerge by assuming l to be zero
in (2.4.5) for all l {1, 2, . . . , m}.
We obtain an exact solution for the above multi-dimensional linear diusion process using (2.4.6) in the following form
104
t
Xt = t X0 +
1
ds
,
s
s
(2.4.18)
1
X ti = ti X 0 +
tk+1 tk+1 + tk tk
.
(2.4.19)
2
k=0
and vector
=
1 x
1
2 x
2
.
(2.4.21)
2
dWt2
(2.4.22)
, (2.4.23)
for t [0, ) with X01 , X02 > 0. Note that the matrices A, B 1 and B 2 commute, hence we can use the expression (2.4.9) for determining the fundamental
matrix t . In the considered two-dimensional example the logarithm of this
matrix equals
1 12 12 t + 1 Wt1
0
ln( t ) =
,
0
2 12 22 t + 2 Wt1 + 1 2 Wt2
(2.4.24)
105
1 2
1
1
exp
1 + 1 tk+1 1 Wtk+1
X 0 + 1 x
1
2
2
k=0
*
1 2
1
+ exp
1 + 1 tk 1 Wtk
,
(2.4.25)
2
1 2
1
2
2
= exp
2 2 ti + 2 (Wti + 1 Wti )
2
#
i1
1 2
2
1
2
2
X 0 + 2 x
exp 2 + 2 tk+1 2 (Wtk+1 + 1 Wtk+1 )
2
2
2
k=0
*
1 2
+ exp
2 + 2 tk 2 (Wt1k + 1 2 Wt2k )
,
(2.4.26)
2
Xt,2
i
for ti = i, i {0, 1, . . .}. Given that we use a highly accurate numerical approximation for calculating the integrals in (2.4.25) and (2.4.26) we can simulate trajectories of X 1 and X 2 almost exactly, that means with any desired
accuracy. This can be done by using the trapezoidal rule when approximating
the time integrals in (2.4.18) and a suciently ne time discretization.
In Fig. 2.4.2 we display a trajectory of a two-dimensional linear diusion
process with the correlation parameter = 0.8, initial values X01 = X02 = 1
2 = 0.5, 1 = 1 = 1 and 2 = 2 = 2. One notes in this gure the
and x
1 = x
similarities. On the other hand there are, of course dierences in the paths of
the two processes.
We refer to Kloeden & Platen (1999) for a list of other specic examples of
explicitly solvable multi-dimensional SDEs. The above models can be further
generalized by using some time changes and applying time changed Levy
processes. For instance, the above Black-Scholes model can be generalized by
subordination to multi-dimensional SDEs driven by Levy processes, yielding
exponential Levy market models, see Barndor-Nielsen & Shephard (2001),
Geman et al. (2001) and Eberlein (2002).
106
Fig. 2.4.2. Two correlated linear diusion processes obtained by setting = 0.8,
1 = x
2 = 0.5, 1 = 1 = 1 and 2 = 2 = 2
X01 = X02 = 1, x
second component can be conditioned on the rst one, and then simulated
exactly or almost exactly. This can eventually be continued to further components. Let us illustrate and describe this concept by applying it to the
multi-dimensional Heston model, see Heston (1993) and Sect. 1.6.
One-dimensional Heston Model
The Heston model is probably the most popular stochastic volatility model
used in nance. The reason why it is used so widely in practice is that Heston
(1993) was able to derive an exact formula for European call and put option
prices under this model. This underlines the relevance of the current chapter
with its focus on exact solutions.
The Heston model can be expressed by a two-dimensional SDE. Its rst
component models the stock price St while its second component characterizes
the squared volatility Vt of the underlying stock. This system of SDEs is
typically written in the form
.
dSt = rSt dt + Vt St dWt1 + 1 2 dWt2 ,
(2.5.1)
dVt = ( Vt )dt +
Vt dWt1 ,
(2.5.2)
for t [0, ). Here W 1 and W 2 are two independent Wiener processes, and
represents the correlation parameter. Furthermore, r denotes the short rate,
107
the speed of adjustment, the average squared volatility and the volatility
of the squared volatility.
There is a wide literature on approximating the Heston model by simulation. Recent studies include Broadie & Kaya (2006), Smith (2007) and
Andersen (2008) who discuss exact or almost exact simulation techniques for
the Heston model. For the purpose of simulation it is convenient to employ
the logarithmic transformation Xt = ln(St ) instead of using (2.5.1) directly.
Therefore, by the It
o formula we obtain the SDE
.
1
dXt = r Vt dt + Vt dWt1 + 1 2 dWt2 ,
2
dVt = ( Vt ) dt +
Vt dWt1 ,
(2.5.3)
for t [0, ).
In Broadie & Kaya (2006), given Vti , i {0, 1, . . .}, the value Vti + is
sampled directly from the non-central chi-square distribution with degrees
of freedom and non-centrality parameter
=
4e
Vt .
2 (1 e ) i
That is
Vti + =
2 (1 e ) 2
(,) ,
4
(2.5.4)
2
where = 4
2 . Here (, ) is sampled from the non-central chi-square distribution function
k
x2 ; +2k
exp 2 2
2
1
FX (x) =
,
(2.5.5)
k!
+2k
2
k=0
where
(u; a) =
ta1 exp{t} dt
is the incomplete gamma function for u 0, a > 1. Note that the non-central
chi-square distribution arises as a scalar version of the non-central Wishart distribution. Sampling from the non-central chi-square distribution is discussed,
for instance, in Glasserman (2004). The resulting simulation method for V is
exact.
In order to obtain an exact scheme for the simulation of the asset price
process in the Heston model we note that for ti = i, i {0, 1, . . .}, we obtain
ti+1
Vti+1 = Vti +
ti
Hence,
( Vu )du +
ti+1
Vu dWu1 .
ti
(2.5.6)
108
ti+1
ti
ti+1
Vu du .
(2.5.7)
ti
Additionally,
ti+1
ti+1
1
Vu dWu1 + 1 2
Vu dWu2 .
r Vu du+
2
ti
ti
ti
(2.5.8)
Substituting (2.5.7) into (2.5.8) we obtain
ti+1
1
Vu du
Xti+1 = Xti + r + (Vti+1 Vti ) +
2
ti
ti+1
+ 1 2
Vu dWu2 .
(2.5.9)
Xti+1 = Xti +
ti+1
ti
ti+1
Vu dWu2 , given the path generated
ti
t
by Vt , is conditionally normal with mean zero and variance tii+1 Vu du, because
V is independent of the Brownian motion W 2 , that is
ti+1
ti+1
2
Vu dWu N 0,
Vu du .
(2.5.10)
Furthermore, the distribution of
ti
ti
t
Broadie & Kaya (2006) discuss in detail how to obtain explicitly tii+1 Vu du.
t
However, it is also possible to approximate tii+1 Vu du given the path of the
process V . This approximation can be achieved with high accuracy by a
quadrature formula such as the trapezoidal rule, as previously discussed, which
results in an ecient almost exact simulation technique by conditioning for
the Heston model.
Multi-dimensional Heston Model with Independent Prices
Let us now dene a particular version of the multi-dimensional Heston model
given via the following SDE
dS t = At rdt + B t dW t .
(2.5.11)
Here S = {S t = (St1 , St2 , . . . , Std ) , t [0, )} is a vector process and At =
d
[Ai,j
t ]i,j=1 is a diagonal matrix process with elements
i
St for i = j
i,j
At =
(2.5.12)
0 otherwise.
Additionally, r = (r1 , r2 , . . . , rd ) is a d-dimensional vector and W = {W t =
(Wt1 , Wt2 , . . . , Wtd ) , t [0, )} is a d-dimensional vector of correlated Wiener
processes. Moreover, B t = [Bti,j ]di,j=1 is a matrix process with elements
Bti,j =
ti,i for i = j
0
otherwise.
109
(2.5.13)
ti+1 (
1 ti+1 1,1
Xt1i+1 = Xt1i + r1
u1,1 dWu1 (2.5.16)
u du +
2 ti
ti
ti+1 (
1 ti+1 2,2
2
2
Xti+1 = Xti + r2
u2,2 dWu1
u du +
2 ti
ti
ti+1 (
+ 1 2
u2,2 dWu2 ,
ti
where
ti+1
ti
ti+1
ti
ti+1
ti
(
u1,1 dWu1
N 0,
u2,2 dWu1 N
0,
ti
ti+1
(
u2,2 dWu2 N
u1,1 du
ti
ti+1
ti+1
0,
ti
(2.5.17)
u2,2 du
(2.5.18)
u2,2 du .
(2.5.19)
110
t
t
Given that we can approximate almost exactly tii+t u1,1 du and tii+1 u2,2 du
it is possible to simulate this two-dimensional Heston model very accurately.
We have discussed in Sect. 2.2 and 2.3 how to simulate the matrix SR-process
, which applies here. The terms in (2.5.17) and (2.5.18) can be generated
as corresponding Gaussian random variables.
Multi-dimensional Heston Model with Correlated Prices
Let us now consider another multi-dimensional version of the Heston model,
which allows correlation of the volatility vector with the vector asset price
process S. We dene this generalization by the system of SDEs
.
(2.5.20)
dS t = At r dt + B t C dW 1t + D dW 2t
d t = (a E t ) dt + F
B t dW 1t ,
(2.5.21)
1 2i for i = j
0
otherwise,
(2.5.23)
(2.5.25)
dt1,1
dt2,2
(
1,1
dt + 1 t1,1 dWt1,1 ,
= a1 b 1 t
(
2,2
= a2 b 2 t
dt + 2 t2,2 dWt1,2 ,
111
(2.5.26)
(2.5.27)
u1,1 du
Xti+1 = Xti + r1 +
1
1
2
ti
ti+1 (
(
+ 1 21
u1,1 dWu2,1 ,
ti
ti+1
b
2 2,2
1
2 2
ti+1 t2,2
u2,2 du
2
i
2
2
2
ti
ti+1 (
(
+ 1 22
u2,2 dWu2,2 .
Xt2i+1 = Xt2i + r2 +
ti
Here,
ti+1
ti
ti+1
(
u1,1 dWu2,1 N 0,
(
2,2
2,2
u dWu N 0,
ti
ti+1
u1,1 du ,
ti
ti+1
(2.5.30)
u2,2 du
(2.5.31)
ti
du
and
du.
Numerical errors are not propagated.
u
u
ti
ti
Another Heston Model
There exist alternative ways to construct multi-dimensional Heston type models. Two possible generalizations are discussed in Gourieroux & Sufana (2004)
112
and Da Fonseca, Grasselli & Tebaldi (2008). For instance, Gourieroux & Sufana (2004) consider a d-dimensional stochastic volatility model of the form
dS t = At r1 dt + t dW 2t ,
(2.5.32)
d t = + M t + t M dt + t dW 1t Q + Q d(W 1t ) t ,
for t [0, ). Here, S = {S t = (St1 , St2 , . . . , Std ) , t [0, )} is a vector
asset price process and t = [ti,j ]di,j=1 is a matrix squared volatility process.
d
Moreover, elements of the matrix At = [Ai,j
t ]i,j=1 are dened by (2.5.12) and
2,1
2
2
1 = (1, 1, . . . , 1) . Here, W = {W t = (Wt , Wt2,2 , . . . , Wt2,d ) , t [0, )}
is a vector of independent Wiener processes. In the second equation , M and
Q are d d parameter matrices with being an invertible matrix. In order
to preserve strict positivity and the mean reverting feature of the volatility,
the matrix M is assumed to be negative semi-denite, while satises
=
(2.5.33)
ti+1 (
1 ti+1 1,1
1 ti+1 1,2
Xt1i+1 = Xt1i + r
u du
u du +
u1,1 dWu2,1
2 ti
2 ti
ti
ti+1 (
+
u1,2 dWu2,2 ,
ti
Xt2i+1 = Xt2i
1
2
ti+1
+
ti
ti+1
u2,1 du
ti
u2,2 dWu2,2 .
1
2
ti+1
ti+1
u2,2 du +
ti
ti
(
u2,1 dWu2,1
113
Here,
ti+1
ti
ti+1
ti
ti+1
ti
ti+1
(
u1,1 dWu2,1 N 0,
(
1,2
2,2
u dWu N 0,
ti
ti+1
(
u2,1 dWu2,1 N 0,
ti
ti+1
(
2,2
2,2
u dWu N 0,
ti
ti+1
ti
ti+1
u1,1 du
(2.5.38)
u1,2 du
(2.5.39)
u2,1 du
(2.5.40)
u2,2 du
(2.5.41)
ti
(2.6.2)
for t [0, ). Here W 1 and W 2 are two independent Wiener processes, and
represents the correlation between the noises driving the return process and
the volatility process. In the given case we can simulate the squared volatility
process V almost exactly by approximating again some time integral via the
trapezoidal rule. Here the exact representation is of the form
114
1 2
1
Vti = exp
ti + Wti
2
i1
tk+1
1 2
Vt0 +
exp + s Ws1 ds , (2.6.3)
2
tk
k=0
1
ti + Wti
Vti = exp
2
#
i1
1 2
1
Vt0 +
exp
+ tk Wtk
2
2
k=0
*
1 2
1
+ exp
+ tk+1 Wtk+1
2
(2.6.4)
for ti = i, i {0, 1, . . .}. Note that this simulation requires sampling from
a Gaussian law in contrast to the simulation of the squared volatility process
in the Heston model, where we sampled from the non-central chi-square distribution. It is now straightforward to correlate the squared volatility process
V with the asset price process S.
In order to simulate the asset price process S we consider its discounted
version St = SS0t , where St0 = exp{rt} is the savings account, for t [0, ).
t
This results in the following SDE
.
dSt = Vt St dWt1 + 1 2 dWt2 ,
(2.6.5)
for t [0, ). Additionally, we obtain by the It
o formula
.
1
d ln(St ) = Vt dt + Vt dWt1 + 1 2 dWt2 ,
(2.6.6)
2
(t))) turns out to be that
for t [0, ). Equation (2.6.6) for ln(St ) = ln(S(
= {ln(S(
(t))), t [0, )}
of a time changed, drifted Wiener process ln(S)
with SDE of the form
.
1 + 1 2 dW
(t))) = 1 d (t) + dW
2
(2.6.7)
d ln(S(
(t)
(t) ,
2
Vs ds,
(t) = (0) +
(2.6.8)
115
(2.6.10)
d
for t [0, ). Here, S = {S t = (St1 , St2 , . . . , Std ) , t [0, )}, At = [Ai,j
t ]i,j=1
is a diagonal matrix with elements as in (2.5.12), r = (r1 , r2 , . . . , rn ) . Furthermore, B t = [Bti,j ]di,j=1 is a diagonal matrix with elements
i
Vt for i = j
Bti,j =
(2.6.11)
0 otherwise.
(2.6.13)
dSt = t dt + St t dWt ,
(2.6.14)
116
for t [0, ), with t = 0 exp{t}, for initial scaling parameter 0 > 0 and
net growth rate > 0. This is equivalent to writing
(
(t) ,
(2.6.15)
dSt = 4d(t) + 2 St dW
for t [0, ), which is the SDE of a squared Bessel process X = {X(t) =
St , t 0} of dimension = 4 in -time. Here, one has
(t) =
0
(exp{t} 1) ,
4
(2.6.16)
= {W
, 0} is a standard Wiener process in -time.
and the process W
The normalized GOP under the stylized MMM is given by the ratio
Yt =
St
.
t
(2.6.17)
By the It
o formula it follows that the process Y = {Yt , t [0, T ]} is a square
root process of dimension four with SDE
dYt = (1 Yt ) dt +
Yt dWt
(2.6.18)
1
[ Y ]t = t,
4
(2.6.19)
for t [0, ). Additionally, the volatility under the stylized MMM is of the
form
1
t = ,
(2.6.20)
Yt
for t [0, ).
Hence, the value S at time t of the discounted GOP can be expressed in
the form
St = Yt t
(2.6.21)
for t [0, ). Therefore, the GOP when expressed in units of the domestic
currency can be represented by the product
St = St0 St = St0 Yt t
t
(2.6.22)
for t [0, ), where St0 = exp{ 0 rs ds} is the domestic savings account with
r = {rt , t [0, )} denoting the short rate process.
117
(2.6.23)
where ti = 0i exp{ i t} and Sii (t) = exp{ri t}, i {0, 1, . . . , d}. Here i is
the ith net growth rate and ri is the short rate for the ith currency for i
{0, 1, . . . , d}. The ith normalized GOP satises then the SDE
(
(2.6.24)
dYti = (1 i Yti ) dt + Yti dWti
for t [0, ), with Y0i > 0. Here W = {W t = (Wt1 , Wt2 , . . . , Wtd+1 ) ,
t [0, )} is a vector of correlated Wiener processes.
Let us now illustrate the simulation of a two-currency market based on
the stylized MMM. We start with the simulation of two time changed 4 2
]4,2 and W 22 (t) = [W2,i,j
]4,2 ,
matrix Wiener processes W 11 (t) = [W1,i,j
1 (t) i,j=1
2 (t) i,j=1
resulting from the same set of Gaussian random numbers. The elements of
such a matrix are as in (2.3.25). The time changes are here
i (t) =
0i
exp{ i t} 1 ,
i
4
(2.6.25)
for i {1, 2}. In the next step we construct two 4 2 matrix Wiener processes
1
2
2
1
W
1 (t) = I44 W 1 (t) 22 and W 2 (t) = I44 W 2 (t) 22 , for t [0, )
where the elements of are as in (2.3.13). As a result of this transformation
1 and W
2 with correlated columns
we obtain the 4 2 matrix processes W
1 is 1 -time
and independent rows. Additionally, the 4 2 matrix process W
2 is 2 -time changed. Now we construct a
changed and the matrix process W
with elements
new 4 2 matrix process W
1,i,j for j = 1
wi,j + W
i,j
1 (t)
t =
W
(2.6.26)
2,i,j for j = 2,
wi,j + W
2 (t)
where i {1, 2, 3, 4}. The trajectory of the eight elements of this 4 2 matrix
process are displayed in Fig. 2.6.1. Here we assumed the starting values of
the components to be wi,j = 12 for i {1, 2, 3, 4}, j = 1, 2 and 01 = 0.04,
02 = 0.05, 1 = 0.05 and 2 = 0.06.
The trajectory of a time changed Wishart process is then obtained by the
W
t = W
t , for t [0, ), see (2.3.51). We illustrate this 2 2
formula S
t
matrix process in Fig.2.6.2. Note that the diagonal elements of this matrix are
118
Fig. 2.6.1. Trajectory of the 4 2 matrix time changed Wiener process with independent components in the columns and correlated components in the rows
119
Fig. 2.6.3. Trajectory of two related SR-processes modeling the normalized GOP
for two dierent currencies
2,1,1 +
St2,2 = w1,2 + W
2 (t)
2,1,2
1 2 W
2 (t)
2
2,2,1 +
+ w2,2 + W
2 (t)
2,2,2
1 2 W
2 (t)
2,3,1 +
+ w3,2 + W
2 (t)
2,3,2
1 2 W
2 (t)
2,4,1 +
+ w4,2 + W
2 (t)
2,4,2
1 2 W
2 (t)
2
2
2
,
(2.6.27)
4
4
2,2
i,1 2
i,2 2
where S01,1 =
=
i=1 (w ) and S0
i=1 (w ) . These two processes
model the discounted GOP in units of the two respective currencies. Let us
also illustrate the normalized GOP for these two currencies. For this part
of the simulation it is sucient to work with the above two squared Bessel
processes. We construct, according to (2.6.23), two related SR-processes by
setting
S1,1
S2,2
and Yt2 = t 2 ,
(2.6.28)
Yt1 = t 1
t
t
for t [0, ). These two processes are illustrated with their trajectories in
Fig. 2.6.3.
Multi-dimensional Generalized MMM
Let us now generalize the MMM with a time change by introducing a market
activity process m = {mt , t [0, )}. That is, we will use time = { (t), t
120
[0, )}, which we call market time, to model random variations in market
activity. The market time process is given by the relation
t
ms ds,
(2.6.29)
(t) = (0) +
0
Y ( (t)) dW ( (t)),
(2.6.31)
for t [0, ). This SDE can be rewritten for Yt = Y ( (t)) in t-time in the
following way
(2.6.32)
dYt = (1 Yt )mt dt + Yt mt dWt ,
for t [0, ), where W is a standard Wiener process in t-time.
Now, let the market activity process m be modeled by a linear diusion
process as considered in Sect. 2.4, that is
t,
mt ) dt + mt dW
dmt = (m
(2.6.33)
121
Fig. 2.6.4. Two related market times 1 and 2 obtained from the linear diusions
in Fig. 2.4.2
(2.6.36)
for i {1, 2} to include the eect of random market activity. This leads to
two correlated squared Bessel processes in two correlated times analogous to
(2.6.26)(2.6.27) given by
2
2
1,1,1 + w2,1 + W
1,2,1
St1,1 = w1,1 + W
1 (t)
1 (t)
2
2
1,3,1 + w4,1 + W
1,4,1 ,
+ w3,1 + W
1 (t)
1 (t)
2,1,1 +
St2,2 = w1,2 + W
2 (t)
2,1,2
1 2 W
2 (t)
(2.6.37)
2
2,2,1 +
+ w2,2 + W
2 (t)
2,2,2
1 2 W
2 (t)
2,3,1 +
+ w3,2 + W
2 (t)
2,3,2
1 2 W
2 (t)
2,4,1 +
+ w4,2 + W
2 (t)
2,4,2
1 2 W
2 (t)
2
2
2
,
(2.6.38)
4
4
2,2
i,1 2
i,2 2
where S01,1 =
=
i=1 (w ) and S0
i=1 (w ) . We display the entire
Wishart matrix process in -time in Fig. 2.6.5, whose diagonal elements are as
122
Fig. 2.6.6. Trajectory of two correlated SR-processes modeling the normalized GOP
for two dierent currencies under the generalized MMM
(1 (t))
(2 (t))
for t [0, ). These two processes are illustrated in Fig. 2.6.6. Moreover,
in Fig. 2.6.7 we display the quadratic variations when multiplied by 4 of the
123
Fig. 2.6.7. Quadratic variation4 of the square root of the SR-process from
Fig. 2.6.6
square root of the SR-processes from Fig. 2.6.6. These should approximately
be equal to the corresponding -times already displayed in Fig. 2.6.4, since
1
[ Y i ]t = i (t)
4
(2.6.40)
for i {1, 2}. The visualization conrms the usefulness of the presented almost
exact simulation method for the generalized MMM.
The above presented exact and almost exact simulation methods for multidimensional diusions lead to highly accurate scenario simulations. Since no
recursive scheme is involved the minor errors are not propagated and the
methods are also reliable over long periods of time. The numerical stability problems that we will highlight in Chap. 14 for recursive discrete-time
approximations of SDEs are avoided. We remark that the eect of jumps
on the above considered type of dynamics can be introduced via a jumpadapted time discretization, see Platen (1982a) and will be discussed in
Sect. 8.6.
124
integro dierential equations (PIDEs) when jumps are involved. The link between the conditional expectations and respective PDEs can be interpreted as
an application of the, so-called, Feynman-Kac formula. Below we formulate
the Feynman-Kac formula in various ways. For a wide range of models this
formula provides the PDEs and PIDES that characterize corresponding pricing functions of derivatives. Furthermore, this section presents some results
on transition probability densities, changes of measures, the Bayes rule, the
Girsanov transformation and the Black-Scholes option pricing formula. These
are useful when searching for analytic formulas of derivative prices and other
functionals.
SDE for Some Factor Process
We consider a xed time horizon T (0, ) and a d-dimensional Markovian
factor process X t,x = {X t,x
s , s [t, T ]}, which satises the vector SDE
t,x
dX t,x
s = a(s, X s ) ds +
m
k
bk (s, X t,x
s ) dWs
(2.7.1)
k=1
d
for s [t, T ] with initial value X t,x
t = x at time t [0, T ]. The process
1
m
W = {W t = (Wt , . . ., Wt ) , t [0, T ]} is an m-dimensional standard
Wiener process on a ltered probability space (, A, A, P ). The process X t,x
has a drift coecient a(, ) and diusion coecients bk (, ), k {1, 2, . . . , m}.
In general, a = (a1 , . . . , ad ) and bk = (b1,k , . . . , bd,k ) , k {1, 2, . . . , m},
represent vector valued functions on [0, T ] d into d , and we assume that
a pathwise unique solution of the SDE (2.7.1) exists. The components of the
SDE (2.7.1) can be the factors in a nancial market model. In the following
it will not matter which pricing approach one chooses. The task will be to
evaluate simply some conditional expectations.
Terminal Payo
Let us discuss the case of a European option, where we have a terminal payo
d
H(X t,x
T ) at the maturity date T with some given payo function H :
[0, ) such that
(2.7.2)
E(|H(X t,x
T )|) < .
We can then introduce the pricing function u : [0, T ] d [0, ) as the
conditional expectation
(2.7.3)
u(t, x) = E H(X t,x
T ) At
for (t, x) [0, T ] d . The Feynman-Kac formula for this payo refers to
the fact that under sucient regularity on a, b1 , . . . , bm and H the function
u : (0, T ) d [0, ) satises the PDE
125
u(t, x) i
u(t, x)
+
a (t, x)
t
xi
i=1
d
L0 u(t, x) =
d
m
1 i,j
2 u(t, x)
+
b (t, x) bk,j (t, x)
2
xi xk
j=1
i,k=1
=0
(2.7.4)
(2.7.5)
t,x
r(s, X t,x
s ) ds H(X T )
exp
t
exp
!
r(s, X t,x
s ) ds
H(X t,x
T ) At
(2.7.6)
126
(2.7.7)
(2.7.8)
exp
t
t,x
r(s, X t,x
s ) ds H(X T )+
exp
t,x
r(z, X t,x
z ) dz g(s, X s ) ds,
T
t,x
r(s, X s ) ds H(X t,x
u(t, x) = E exp
T )
t
+
t
exp
r(z, X t,x
z ) dz
t
g(s, X t,x
s ) ds At
(2.7.9)
for (t, x) [0, T ] d . As will follow below, this pricing function satises the
PDE
(2.7.10)
L0 u(t, x) + g(t, x) = r(t, x) u(t, x)
for (t, x) (0, T ) d with terminal condition
u(T, x) = H(x)
(2.7.11)
127
m
k
bk (s, X t,x
s ) dWs
k=1
j=1
j
cj (v, s, X t,x
s ) pj (dv, ds)
(2.7.12)
(2.7.13)
(2.7.14)
(2.7.15)
d
i=1
ai (t, x)
d
m
u(t, x)
1 i,j
2 u(t, x)
+
b (t, x) bk,j (t, x)
i
x
2
xi xk
j=1
i,k=1
/
u(t, x)
u(s, x1 + c1,j (v, s, x), . . . , xd + cd,j (v, s, x))
+
+
t
E
j=1
0
u(s, x1 , . . . , xd ) j (dv).
(2.7.16)
Here we abuse slightly our notation by writing for u(s, (x1 , . . . , xd ) ) also
u(s, x1 ,. . . , xd ). Note that due to the jumps an extra integral term appears
in (2.7.16) as a consequence of the Ito formula with jumps, see (1.5.15) and
(1.5.20). Of course, also here some boundary conditions may have to be added
to complete the characterization of the function u via the PIDE.
128
(2.7.17)
t,x
t,x
t
r(s, X s ) ds
u(t, x) = E H( , X t ) exp
t,x
g(s, X s ) exp
s
t
r(z, X t,x
u ) dz ds At
(2.7.18)
for (t, x) .
For the formulation of the resulting PIDE of the function u we use the
operator L0 given in (2.7.16). Under sucient regularity of , a, b1 , . . ., bm ,
o
c1 , . . . , c , H, g, 1 , . . ., and r one can show by application of the It
formula (1.5.15) and some martingale argument that the pricing function u
satises the PIDE
L0 u(t, x) + g(t, x) = r(t, x) u(t, x)
(2.7.19)
(2.7.20)
for (t, x) ((0, T ] )\. This result links the functional (2.7.18) to the
PIDE (2.7.19)(2.7.20) and is often called a Feynman-Kac formula.
Generalized Feynman-Kac Formula
For a rather general situation, where = (0, T ) and assuming no jumps,
that is c1 = . . . = c = 0 and t = T , let us now formulate sucient conditions that ensure that the Feynman-Kac formula holds, see Heath & Schweizer
(2000) and Platen & Heath (2006).
129
(A) The drift coecient a and diusion coecients bk , k {1, 2, . . . , m}, are
assumed to be on [0, T ] locally Lipschitz-continuous in x, uniformly
in t. That is, for each compact subset 1 of there exists a constant
K 1 < such that
|a(t, x) a(t, y)| +
m
(2.7.21)
k=1
and
P (X t,x
s for all s [t, T ]) = 1.
(2.7.23)
(C) There exists an increasing sequence (n )nN of bounded, open and connected domains of such that
n=1 n = , and for each n N the
PDE
(2.7.24)
L0 un (t, x) + g(t, x) = r(t, x) un (t, x)
has a unique solution un in the sense of Friedman (1975) on (0, T ) n
with boundary condition
un (t, x) = u(t, x)
(2.7.25)
130
(C2) For each n N the functions a and bb are uniformly Lipschitzcontinuous on [0, T ] (n n ).
(C3) For each n N the function b(t, x)b(t, x) is uniformly elliptic on d
for (t, x) [0, T ] n , that is, there exists a n > 0 such that
y b(t, x) b(t, x) y n |y|2
(2.7.26)
for all y d .
(C4) For each n N the functions r and g are uniformly H
older-continuous
n and an exponent
on [0, T ](n n ), that is, there exists a constant K
qn > 0 such that
n |x y|qn
|r(t, x) r(t, y)| + |g(t, x) g(t, y)| K
(2.7.27)
T
u(t, X t )
i,k
E
b (t, X t )
dt <
xi
0
for all i {1, 2, . . . , d} and k {1, 2, . . . , m}. This condition ensures that
the process u(, X ) is a martingale and that the PDE (2.7.19)(2.7.20) has a
unique solution.
Note that in the case when local Lipschitz continuity is not guaranteed, one
may have to specify particular boundary conditions to obtain an appropriate
description of the pricing function. This is a consequence of the fact that strict
local martingales may drive the factor dynamics. These need extra care when
dening the behavior of PDE solutions at boundaries.
Kolmogorov Equations
When the drift coecient a() and diusion coecient b() of the solution of a
scalar SDE without jumps are appropriate functions, then the corresponding
transition probability density p(s, x; t, y) satises a certain PDE. This is the
Kolmogorov forward equation or Fokker-Planck equation
1 2 2
p(s, x; t, y)
+
{a(t, y) p(s, x; t, y)}
b (t, y) p(s, x; t, y) = 0,
2
t
y
2 y
(2.7.28)
for (s, x) xed. However, p(s, x; t, y) satises also the Kolmogorov backward
equation
p(s, x; t, y) 1 2
p(s, x; t, y)
2 p(s, x; t, y)
+ a(s, x)
+ b (s, x)
= 0,
s
x
2
x2
(2.7.29)
131
for (t, y) xed. Obviously, the initial condition for both PDEs is given by the
Dirac delta function
for y = x
p(s, x; s, y) = (y x) =
(2.7.30)
0 for y
= x,
where
(y x) dy = 1
(2.7.31)
for given x. In the case of an SDE with jumps (2.7.28) and (2.7.29) become
corresponding PIDEs. Of course, the Kolmogorov equations have their obvious
multi-dimensional counterparts. In Sect. 2.2 we have given various examples
of explicit transition probability densities. The PDEs and PIDEs that we
obtain via the Feynman-Kac formula are often called Kolmogorov backward
equations.
Stationary Densities
Let us consider solutions of SDEs which provide models for nancial quantities
that can evolve into some equilibrium. Obviously, we restrict then the class
of processes that we consider. For example, such equilibria can be modeled
by the Ornstein-Uhlenbeck process or the square root process, see Chap. 1.
The corresponding transition probability densities converge over long periods
of time towards corresponding stationary densities, see Sect. 2.2.
More precisely, for a diusion process that permits some equilibrium its
stationary density p(y) is dened as the solution of the integral equation
p(s, x; t, y) p(x) dx
p(y) =
for t [0, ), s [0, t] and y . This means, if one starts with the stationary density, then one obtains again the stationary density as the probability
density of the process after any given time period. A stationary solution of
an SDE is, therefore, obtained when the corresponding process starts with its
stationary density.
It is important to know when the solution of an SDE has a stationary
density. One can identify the stationary density p by noting that it satises the
corresponding stationary, or time-independent, Kolmogorov forward equation,
see (2.7.28). In the case without jumps this stationary Kolmogorov forward
equation reduces to the ordinary dierential equation (ODE)
1 d2 2
d
(a(y) p(y))
b (y) p(y) = 0
2
dy
2 dy
(2.7.32)
with drift a(x) = a(s, x) and diusion coecient b(x) = b(s, x). Consequently,
it is necessary that equation (2.7.32) is satised to ensure that a diusion has
132
p(y) dy = 1.
(2.7.33)
One can identify for a large class of scalar stationary diusion processes
the analytic form of their stationary density p(y). It is straightforward to
check that the explicit expression
y
C
a(u)
exp 2
du
(2.7.34)
p(y) = 2
2
b (y)
y0 b (u)
satises the equation (2.7.32) for y with some xed value y0 . Here y0
is some appropriately chosen point so that (2.7.34) makes sense. The constant
C can be obtained from the normalization condition (2.7.33).
Ergodicity of a Process
Ergodicity is a concept that is somehow related to stationarity. It does not
require the process to start with an initial random variable that already has
the stationary density of the process. Ergodicity can be conveniently dened
for processes with stationary densities. A process X = {Xt , t [0, )} is
called ergodic if it has a stationary density p and
1 T
f (Xt ) dt =
f (x) p(x) dx,
(2.7.35)
lim
T T 0
y0
s(x) dx =
y0
s(x) dx =
(2.7.37)
and
1
dx <
s(x) b2 (x)
133
(2.7.38)
T
m
(ti )2 dt <
(2.7.39)
0
i=1
almost surely, let us assume that the strictly positive Radon-Nikodym derivative process = { (t), t [0, T ]}, where
t
1 t
(t) = exp
s dW s
s ds <
(2.7.40)
2 0 s
0
o formula (1.5.8)
almost surely for t [0, T ], is an (A, P )-martingale. By the It
it follows from (2.7.40) that
(t) = 1
i=1
(s) si dWsi
(2.7.41)
(2.7.43)
P (A) = E( (T ) 1A ) = E (1A )
(2.7.44)
by setting
for A AT . Here 1A is the indicator function for A and E means expectation
with respect to P .
134
Note that P is not just a measure but also a probability measure because
P () = E( (T )) = E (T ) A0 = (0) = 1
(2.7.45)
as a result of the martingale property of .
Bayes Rule
It is useful to be able to change the probability measure when taking conditional expectations. The following Bayes rule establishes a relationship between conditional expectations with respect to dierent equivalent probability
measures.
Assume for an equivalent probability measure P that the corresponding strictly positive Radon-Nikodym derivative process is an (A, P )martingale. Then for any given stopping time [0, T ] and any A measurable random variable Y , satisfying the integrability condition
E (|Y |) < ,
(2.7.46)
E ( ) Y As
E Y As =
E ( ) As
(2.7.47)
W (t) = W t +
s ds
(2.7.48)
135
!
1 T
E exp
s ds
< .
(2.7.49)
2 0 s
This condition is fullled, for instance, for the Black-Scholes model. However,
it is not satised for the minimal market model, which will be outlined in
Sect. 3.6.
Black-Scholes Formula
The following famous Black-Scholes pricing formula for a European call option
with geometric Brownian motion as the dynamics for the underlying risky
security can be obtained in various ways. These include applications of the
Feynman-Kac formula, the Bayes rule and the Girsanov transformation.
Let the underlying risky security St satisfy the SDE
dSt = rt St dt + t St dW (t)
(2.7.50)
for t [0, T ] with initial value S0 > 0. Here W denotes a standard Wiener
process under the risk neutral probability measure P . The deterministic functions of time rt and t denote the interest rate and volatility, respectively.
Furthermore, in this simple two-asset market there exists a savings account
Bt , which accrues the deterministic interest rate such that
dBt = rt Bt dt
(2.7.51)
(2.7.54)
136
(2.7.55)
for S (0, ). Furthermore, it can be shown that the option pricing function
V (t, St ) satises the seminal Black-Scholes formula
V (t, St ) = St N (d1 (t)) K
with
d1 (t) =
ln
St
K
Bt
N (d2 (t))
BT
(2.7.56)
T
+ t rs + 12 s2 ds
(
T 2
s ds
t
'
and
d2 (t) = d1 (t)
s2 ds
t
for t [0, T ), see Black & Scholes (1973). Here N () denotes the standard
Gaussian distribution function.
2.8 Exercises
2.1. Derive the joint transition density for two correlated drifted Wiener processes with constant correlation (1, 1) and constant drifts 1 , 2 .
2.2. For the two-dimensional process in Exercise 2.1, write down a corresponding SDE driven by two independent Wiener processes W 1 and W 2 .
2.3. Write down the transition density of a two-dimensional geometric Brownian motion that is a martingale, where the driving Wiener processes are
correlated with parameter (1, 1) and each of them having the volatility
b > 0.
2.4. For the two-dimensional geometric Brownian motion in Exercise 2.3 write
down the SDEs for its components driven by two independent Wiener processes W 1 and W 2 .
2.5. Consider the following Merton model SDE with jumps
dXt = a Xt dt + b Xt dWt + c Xt dNt ,
for t 0 with X0 > 0, where W is a standard Wiener process independent of
the Poisson process N , which has intensity > 0. Verify the Feynman-Kac
formula for the functional
u(t, x) = E H(XT ) Xt = x
for (t, x) [0, T ] + .
2.8 Exercises
137
2.6. Derive the rst moment of a square root process with constant parameters
c > 0, b < 0 and dimension > 2 satisfying the SDE
2
c + b Yt dt + c Yt dWt
dYt =
4
for t [0, ) and Y0 > 0, where W is a Wiener process.
2.7. Prove that the ARCH diusion model for squared volatility
d|t |2 = (2 |t |2 ) dt + |t |2 dWt
has an inverse gamma density as stationary density.
2.8. Show that the squared volatility of the model
d|t |2 = |t |2 (2 |t |2 ) dt + |t |3 dWt
has an inverse gamma density.
2.9. Compute the stationary density for the squared volatility for the Heston
model
d|t |2 = (2 |t |2 ) dt + |t | dWt .
3
Benchmark Approach to Finance and
Insurance
This chapter introduces a unied continuous time framework for nancial and
insurance modeling. It is applicable to portfolio optimization, derivative pricing, actuarial pricing and risk measurement when security price processes are
modeled via SDEs with jumps. It follows the benchmark approach developed
in Platen & Heath (2006). The jumps allow for the modeling of event risk in
nance, insurance and other areas. The natural benchmark for asset allocation
and the natural numeraire for pricing is represented by the best performing,
strictly positive portfolio. This is shown to be the growth optimal portfolio
(GOP) which maximizes expected growth. Any nonnegative portfolio, when
expressed in units of the GOP, turns out to be a supermartingale. This fundamental property leads to real world pricing which identies the minimal
replicating price. An equivalent risk neutral probability measure need not
exist under the benchmark approach, which provides signicant freedom for
modeling when compared to the classical approach.
(3.1.1)
139
140
and
hks ds <
(3.1.2)
almost surely for t 0 and k {1, 2, . . . , dm}. The kth counting process pk
leads to the kth jump martingale q k = {qtk , t 0} with stochastic dierential
1
dqtk = dpkt hkt dt hkt 2
(3.1.3)
for k {1, 2, . . . , dm} and t 0. It is assumed that the above jump martingales do not jump at the same time. They represent the compensated,
normalized sources of event driven traded uncertainty. The conditional variance of the increment of the kth source of event driven traded uncertainty
over a time interval of length equals
2
k
(3.1.4)
E qt+
qtk At =
for all t 0, k {1, 2, . . . , dm} and [0, ). This is similar to what
we obtain for a Wiener process. Note that we normalize the compensated
counting process by hkt to make the drivers of the event risk similar to
Wiener processes when the intensity is high. Then, in the latter case, there
is not much dierence between modeling via event driven uncertainty with
high intensity or via some Wiener process. In this manner we can avoid the
modeling of innite intensity jump martingales.
The evolution of traded uncertainty is modeled by the vector process of
t1 , . . . , W
tm , qt1 , . . . , qtdm ) ,
independent (A, P )-martingales W = {W t = (W
1
1
m
m
dStj
j
St
ajt
dt +
d
141
bj,k
t
dWtk
(3.1.5)
k=1
j
for t 0 with initial value S0j > 0 and j {1, 2, . . . , d}. Recall that St
j
denotes the value of the process S just before time t, which is dened as the
left hand limit at time t, see (1.3.9). We emphasize that this SDE is driven by
t1 , . . . , Wtm = W
tm and the jump martingales
the Wiener processes Wt1 = W
km
k
for k {m + 1, . . . , d}, t 0.
W t = qt
We assume that the short rate process r, the appreciation rate processes aj , the generalized volatility processes bj,k and the intensity processes
hk take almost surely nite values and are predictable, j {1, 2, . . . , d},
k {1, 2, . . . , d m}. They are assumed to be such that a unique strong
solution of the system of SDEs (3.1.5) exists. To model the limited liability
of companies, which means to ensure nonnegativity for each primary security
account, we need to make the following assumption.
Assumption 3.1.1
The condition
bj,k
t
(
hkm
t
(3.1.6)
(3.1.7)
142
for t 0 and j {0, 1, . . . , d}. For k {1, 2, . . . , m}, the quantity tk denotes
the market price of risk with respect to the kth Wiener process W k . If k
{m + 1, . . . , d}, then tk can be interpreted as the market price of event risk
with respect to the counting process pkm . The market prices of risk play a
central role and determine the risk premia that risky securities attract. For
j = 0 in (3.1.8) we denote by St0 the savings account, which is locally riskless
with b0,k
= 0 for all k {1, 2, . . . , d} and t 0.
t
Portfolios
The vector process S = {S t = (St0 , . . . , Std ) , t 0} characterizes the evolution of all primary security accounts. We say that a predictable stochastic
o integral I,W (t)
process = { t = (t0 , . . . , td ) , t 0} is a strategy if the It
of the corresponding gains from trade exists, see Sect. 1.4. The jth component tj of t denotes the number of units of the jth primary security account
held at time t 0 in the portfolio S , j {0, 1, . . . , d}. For a strategy we
denote by St the value of the corresponding portfolio process at time t, when
measured in units of the domestic currency. Thus, we set
St =
d
tj Stj
(3.1.9)
j=0
d
tj dStj
(3.1.10)
j=0
for all t 0. This means, instantaneous changes in the portfolio value are
only due to instantaneous changes in the primary security accounts. We emphasize that is assumed to be a predictable process and we consider only
self-nancing portfolios.
j
,t
= 1.
143
(3.2.2)
j=0
1
d
, . . . , ,t
) we obtain from
In terms of the vector of fractions ,t = (,t
(3.1.10), (3.1.8) and (3.2.1) the SDE
rt dt +
dSt = St
(3.2.3)
,t bt ( t dt + dW t )
t1 , . . . , dW
tm , dqt1 , . . . , dqtmd ) . Note by (3.1.3)
for t 0, where dW t = (dW
(
j
,t
bj,k
>
hkm
t
t
(3.2.4)
j=1
m
d
k=1 j=1
d
k=m+1
j
k
,t
bj,k
t dWt
ln1 +
d
j=1
j
,t
(
bj,k
hkm
(t
dWtk
t
km
ht
2
m
d
d
1
j
j
k
gt = rt +
,t
bj,k
,t
bj,k
t t
t
2 j=1
j=1
(3.2.5)
(3.2.6)
k=1
(
d
d
j,k
b
j
j
hkm
(t
,t
bj,k
,t
tk hkm
+ ln1 +
+
t
t
t
km
j=1
k=m+1 j=1
ht
d
for t 0. Note that for the rst sum on the right hand side of (3.2.6) a
unique maximum exists because it is a quadratic form with respect to the
fractions. Careful inspection of the terms in the second sum reveals that,
in general, a unique maximum growth rate only exists if the market prices of
event risks are less than the square roots of the corresponding jump intensities,
see Christensen & Platen (2005). This important insight leads to the following
crucial assumption on the market prices of event risk.
144
Assumption 3.2.1
the inequality
(3.2.7)
for k {1, 2, . . . , m}
tk
k
(3.2.8)
ct =
tk
for k {m + 1, . . . , d}
km 1
k
1t (ht
t
for t 0. Note that a very large jump intensity with km
1 causes the
ht
corresponding component ckt to approach the market price of event risk tk for
given t 0 and k {m + 1, . . . , d}.
We now dene for t 0 the fractions
1
,t = (1 ,t , . . . , d ,t ) = c
(3.2.9)
t bt
of a particular strictly positive portfolio S , which we will identify below as
the growth optimal portfolio (GOP). By (3.2.3) and (3.2.8) it follows that St
satises the SDE
rt dt + c
dSt = St
t ( t dt + dW t )
=
St
rt dt +
m
tk (tk dt + dWtk )
k=1
d
tk
k=m+1
1 tk (hkm
) 2
t
1
(tk
dt +
dWtk )
(3.2.10)
145
The proof of the following result is given in Platen & Heath (2006).
Corollary 3.2.3
Under the Assumptions 3.1.1, 3.1.2 and 3.2.1 the portfolio process S = {St , t 0} satisfying (3.2.10) is a GOP.
As shown in Platen & Heath (2006) the GOP is in many ways the best
performing portfolio of the given investment universe. Let us briey mention
one of its most striking properties, which is the fact that the path of the GOP
outperforms in the long run the path of any other portfolio almost surely.
Theorem 3.2.4. The GOP S has almost surely the largest long term
growth rate in comparison with that of any other strictly positive portfolio S ,
that is,
ST
ST
1
1
ln
lim sup ln
lim
(3.2.11)
T T
S0
S0
T T
almost surely.
The proof of this result is given in Platen & Heath (2006).
We mention that it will be shown in Sect. 3.4 that a global, well diversied
portfolio appears to be a good proxy for the GOP. For instance, the MSCI
world accumulation index is such a tradable portfolio that one can use as a
reasonable proxy for the GOP for various purposes, as we will demonstrate
later.
(3.3.1)
146
dSt =
m
k=1
d
tj Stj bj,k
St tk dWtk
t
j=1
d
k=m+1
d
tk
1 (
St
tj St
bj,k
tk dWtk
t
km
j=1
ht
(3.3.2)
for t 0.
The SDE (3.3.2) governs the dynamics of a benchmarked portfolio. For
example, by (3.1.8) and (3.2.10) the benchmarked savings account St0 satises
the SDE
d
0
dSt0 = St
tk dWtk
(3.3.3)
k=1
for t 0.
Supermartingale Property
Note that the right hand side of (3.3.2) is driftless. Thus, a nonnegative benchmarked portfolio S forms an (A, P )-local martingale, see Ansel & Stricker
(1994). This provides for nonnegative S the most important property of our
market, which is the following supermartingale property.
Theorem 3.3.1. Any nonnegative benchmarked portfolio process S is an
(A, P )-supermartingale, that is,
St E S At
(3.3.4)
for all bounded stopping times [0, ) and t [0, ].
A proof of this theorem can be found in Platen & Heath (2006). It essentially states that the GOP is truly best performing so that any other nonnegative portfolio when denominated in units of the GOP can trend only
downwards or has at most no trend. The benchmark that satises the supermartingale property (3.3.4) is also called the numeraire portfolio, see Long
(1990) and Becherer (2001). We emphasize that the fundamental fact that
nonnegative benchmarked portfolios are supermartingales, holds even in general semimartingale markets as long as a nite numeraire portfolio exists, see
Platen (2004a).
Strong Arbitrage
The classical idea of pricing by the exclusion of some standard form of arbitrage dominated, for several decades, theory and practice in derivative pricing,
see Ross (1976) and Harrison & Kreps (1979).
147
148
stopping time [0, ). By the Law of the Minimal Price applied to the
payo H, its fair price UH (t) at time t [0, ] is then the minimal possible
price, which is given by the real world pricing formula
H
At .
UH (t) = St E
(3.3.5)
S
This formula represents an absolute pricing rule. It makes pricing an investment decision and values the payo relative to the best performing portfolio
under the real world expectation. We emphasize that its numeraire is the
GOP, which is the best performing portfolio. The expectation is taken under the real world probability measure. By the supermartingale property we
know that any other replicating self-nancing portfolio process can only be
more expensive than the fair price process. In a competitive market the real
world pricing formula provides the economically correct price.
Risk Neutral Pricing
As shown in Platen & Heath (2006), real world pricing is equivalent to risk
neutral pricing as long as, the candidate Radon-Nikodym derivative
(t) =
St0 S0
St S00
(3.3.6)
for the putative risk neutral probability measure forms an (A, P )-martingale,
see also Sect. 2.7. In this case the risk neutral pricing formula results from
(3.3.5) by the Bayes rule in the form
(T ) St0
H
0
At .
H At = St E
(3.3.7)
UH (t) = E
(t) ST0
ST0
Here E denotes the expectation under the risk neutral pricing measure P ,
with dP
dP |AT = (T ).
In the case when the benchmarked savings account has a downward trend,
which means it is a strict supermartingale, (t) does not form a martingale
and risk neutral prices can be more expensive than fair prices. For more details
on these issues we refer to Platen & Heath (2006).
A simple derivative is the fair zero coupon bond. It pays one unit of the
domestic currency at the given maturity date T [0, ). By the real world
3.4 Diversication
149
pricing formula (3.3.5) the price P (t, T ) at time t for the fair zero coupon
bond is given by the conditional expectation
St
At
P (t, T ) = E
(3.3.8)
ST
for t [0, T ], T [0, ). In the case of a deterministic short rate rt , the
supermartingale property of the benchmarked savings account St0 yields by
(3.3.8) that
!
T
ST0
At P (t, T )
P (t, T ) = exp
rs ds E
(3.3.9)
St0
t
T
with P (t, T ) = exp t rs ds denoting the savings bond. The latter would
be the zero coupon bond price under classical risk neutral pricing.
In reality one observes that the savings account, when benchmarked by a
global equity market index trends systematically downwards in the long run
and is more realistically modeled by a strict supermartingale than a martingale. The above benchmark approach can accommodate this fact. The classical
risk neutral approach ignores any trends arising in reality and represents a
form of relative pricing as opposed to the absolute pricing of the real world
pricing formula. As soon as a derivative has been priced and traded, risk
neutral pricing allows us to calibrate models consistently with other derivatives that may be perfectly hedged. Still, all these risk neutral prices may be
signicantly higher than the corresponding minimal prices obtained through
absolute pricing via the real world pricing formula, as we will demonstrate in
Sect. 3.6. If the derivatives industry follows risk neutral pricing, then it is possible that large classes of derivatives can be overpriced on a macro-economic
scale.
Finally, we remark that the actuarial pricing formula, see B
uhlmann & Platen
(2003), arises from (3.3.7) when H is independent of ST , yielding
(3.3.10)
UH (t) = P (t, T ) E(H At ),
with the fair zero coupon bond P (t, T ) as discount factor. This can be interpreted as a form of absolute pricing. Thus, real world pricing unies actuarial
and derivative pricing and makes both an investment decision.
3.4 Diversication
We have seen that there is no strong arbitrage in our general market setting.
The only signicant free benet that an investment strategy can bring is
through diversication, as will be explained below.
This section considers diversied portfolios in a sequence of markets. It
presents the Diversication Theorem, see Platen (2005b), which is model independent and states that diversied portfolios approximate the GOP.
150
Sequence of Markets
We continue to rely on a ltered probability space (, A, A, P ). Continuous
traded uncertainty is represented by independent standard Wiener processes
k = {W
k , t [0, )} for k N . Event driven traded uncertainty is modW
t
eled by counting processes pk = {pkt , t 0} characterized by corresponding
predictable, strictly positive intensity processes hk = {hkt , t 0} for k N .
We dene the kth jump martingale q k = {qtk , t 0} as in (3.1.3), for k N .
In what follows, we consider a sequence of markets indexed by the number
d N of risky primary security accounts. For a given integer d, the corresponding market comprises d + 1 primary security accounts. To be precise
j
(t) satwe write for the jth primary security account in the jth market S(d)
isfying an SDE as given in (3.1.8). The primary security accounts include
0
0
= {S(d)
(t), t 0}, whose value at time t is given by
a savings account S(d)
t
0
the exponential S(d)
(t) = exp 0 rs ds for t 0. Here r = {rt , t 0} denotes an adapted short rate process, which we assume, for simplicity, to be
the same in each market. We include d nonnegative, risky primary security
j
j
= {S(d)
(t), t 0}, j {1, 2, . . . , d}, in the dth market,
account processes S(d)
1, W
2, . . . , W
m and
each of which can be driven by the Wiener processes W
1
2
dm
. Here m = [d] denotes the largest
the jump martingales q , q , . . . , q
integer not exceeding d with [0, 1] a xed real number. In the dth market we have then the traded uncertainty driven by the d-dimensional vector
1, . . . , W
m , q 1 , . . . , q dm ) , t 0}. Obviously, if
process W = {W t = (W
t
t
t
equals one, then we model no jumps in our markets. Of course, there may
exist many other sources of nontraded uncertainty in our market, which we
do not need to specify for our purposes.
In the dth market a given strategy and the volatilities and market prices
of risk depend typically on d. For simplicity, we shall suppress these dependencies in our notation and only mention it when required.
=
For the dth market we assume that there exists a unique GOP S(d)
(t), t 0}, satisfying the SDE (3.2.10) where we x the initial value to
{S(d)
S(d)
(0) = 1.
(3.4.1)
S(d)
(t) =
(t)
S(d)
S(d)
(t)
(3.4.2)
3.4 Diversication
151
tk bj,k
t
j,k
(d) (t) =
tk
j,k
k
t
t
km
ht
(3.4.3)
for k {1, 2, . . . , m}
for k {m + 1, . . . , d}
(3.4.4)
(t) =
tj S(d)
(t) (d)
(t) dWtk ,
(3.4.5)
dS(d)
k=1 j=0
(t) as
and for strictly positive S(d)
dS(d)
(t) = S(d)
(t)
d
d
j
j,k
,t
(d)
(t) dWtk
(3.4.6)
k=1 j=0
for t 0, using the notation (3.2.1) for fractions, where we suppress the dependence on d.
For the above SDEs and the market model to make sense, the specic
generalized volatilities need to be nite and the benchmarked primary security
accounts have to remain nonnegative. To secure these properties we make the
following realistic assumption.
Assumption 3.4.1
pose that
T
0
d
2
j,k
T <
(d)
(t) dt K
(3.4.7)
k=1
152
and
tk <
(
< hkm
t
(3.4.8)
(
hkm
t
(3.4.9)
quence (S(d)
)dN of strictly positive portfolio processes S(d)
a sequence of
diversied portfolios if some constants K1 , K2 (0, ) and K3 N exist,
independently of d, such that for d {K3 , K3 + 1, . . .} the inequality
K2
j
(3.4.10)
,t 1 +K
d2 1
holds almost surely for all j {0, 1, . . . , d} and t 0.
Note that in (3.4.10) the strategy depends on d. Essentially, we require
that the fraction that a diversied portfolio holds in a given primary security
1
account vanishes slightly faster than the order d 2 as d tends to innity.
We introduce for all t 0, d N and k {1, 2, . . . , d} the kth total specic
volatility for the dth market in the form
k
(t) =
(d)
d
j,k
|(d)
(t)|.
(3.4.11)
j=0
Depending on k, the kth total specic volatility represents the sum of the
absolute values of the specic generalized volatilities with respect to the kth
traded uncertainty.
The following regularity property of a sequence of markets ensures that
each of the independent sources of traded uncertainty inuences only a restricted set of benchmarked primary security accounts.
Denition 3.4.3. A sequence of markets is called regular if there exists a
constant K4 (0, ), independent of d, such that
2
k
(3.4.12)
K4
E
(d) (t)
for all t 0, d N and k {1, 2, . . . , d}.
Note that in the special case when each benchmarked risky primary security account is independent from the other ones, then the market is regular.
3.4 Diversication
153
Diversication Theorem
Consider for given d N in the dth market a strictly positive portfolio process
2
d
d
j
j,k
R(d)
(t) =
,t
(d)
(t)
(3.4.13)
k=1
j=0
S(d)
(t) = S(d)
(0)
(3.4.14)
R(d)
(t) = 0
(3.4.15)
(t) = S(d)
(0) S(d)
(t)
(3.4.16)
S(d)
tracking rate R(d) (t) vanishes for all t 0. We formalize this simple fact by
the following denition.
Denition 3.4.4. For a sequence of markets we call a sequence of strictly
lim P R(d)
(t) > = 0
(3.4.17)
d
for all t 0.
Now, we can state the Diversication Theorem proved in Platen (2005b).
Theorem 3.4.5. For a regular sequence of markets, each sequence
(S(d)
)dN of diversied portfolios is a sequence of approximate GOPs.
This result is highly relevant for the practical applicability of the benchmark approach. In particular, it allows us to approximate the GOP by a diversied market index, without the need of an exact parameter estimation and
subsequent calculation of the fractions of the GOP. In practice it is not a realistic task to estimate risk premia, since even under the simple Black-Scholes
model it takes several hundred years to obtain any reasonable estimate for its
154
drift parameter, see DeMiguel, Garlappi & Uppal (2009). We emphasize that
the above statement on diversication is very robust since it is model independent. The following examples, which use exact or almost exact simulations,
illustrate the robustness of the diversication eect stated in Theorem 3.4.5
and are taken from Platen & Rendek (2009b).
Diversication in Various Markets
We simulate over T = 35 years, similar as in Platen & Rendek (2009b), the
driftless benchmarked trajectories of d + 1 = 50 primary security accounts
according to some given dynamics. For simplicity, we set the interest rate to
0
zero. Thus, the inverse of the benchmarked savings account S(d)
(t) equals the
(d)
The product of the GOP with the jth benchmarked primary security account
yields the value of the primary security account in domestic currency, that is
j
j
S(d)
(t) = S(d)
(t) S(d)
(t),
(3.4.18)
j {1, 2, . . . , d}. For simplicity we start each of the above mentioned price
j
processes at S(d)
(0) = 1 at time zero. We will plot the rst twenty primary
security account prices in a rst gure. In a separate second gure we will
compare the GOP, the market capitalization weighted index (MCI) containing
one stock of each primary security account, with the equi-value weighted index
(EWI), which always keeps the same fraction for each stock.
We study below these three portfolios for three dierent market models.
These examples will demonstrate that a diversied portfolio does not need an
extremely large number of primary security accounts to become a reasonable
approximate GOP. In practice, the number of investable primary security
accounts is very large and reaches more than 10,000. This allows us to expect
that a broadly diversied market index is likely to be a good proxy of the
GOP.
First, let us illustrate the fundamental phenomenon of diversication by
simulating, in a Black-Scholes market, the mentioned diversied portfolios. By
following our discussion in Sect.2.4 this can be performed exactly without any
error. For simplicity, the benchmarked primary security accounts are assumed
to be independent with constant volatilities of magnitude 0.2. In Fig. 3.4.1
we display the resulting values of the rst twenty asset prices. Note that we
rst simulated driftless benchmarked primary security accounts. From the
benchmarked savings account we obtained the GOP. This then gave us the
primary security accounts, see (3.4.18).
In Fig. 3.4.2 we show the corresponding simulated GOP, the MCI and the
EWI. In this case the EWI approximates rather well the GOP. The MCI seems
to be still reasonably diversied and represents a good proxy for the GOP.
The Diversication Theorem appears to be applicable in the given situation.
3.4 Diversication
155
Fig. 3.4.2. GOP, MCI and EWI for the Black-Scholes model
156
Fig. 3.4.4. GOP, MCI and EWI for the Heston model
security accounts. Here the EWI provides a good proxy for the GOP. Also the
MCI represents a reasonable approximation of the GOP.
As the third example we simulate the benchmarked primary security accounts via a multi-dimensional ARCH diusion as described in Sect. 2.6. We
use the same squared volatility process for all benchmarked primary security
accounts, where we set in (2.6.2) = 3.4, = 0.04, = 2 and V0 = 0.04.
For simplicity, the driving noises of the benchmarked asset prices are chosen
to be independent from each other and from that of the volatility process.
3.4 Diversication
157
Fig. 3.4.5. Primary security accounts under the ARCH diusion model
Fig. 3.4.6. GOP, MCI and EWI under the ARCH diusion model
In Fig. 3.4.5 we show the rst twenty of the resulting risky primary security
accounts. Fig. 3.4.6 displays the corresponding GOP, EWI and MCI based on
49 risky primary security accounts. Also here the EWI appears to be a good
proxy of the GOP. Again in this case the MCI comes reasonably close to the
GOP.
In all the above examples we note that the primary security accounts have
realistic common uctuations. These model the general market risk and are
generated by the GOP. The benchmarked risky primary security accounts
158
model the specic market risk of the respective risky asset. The EWI is the
simplest diversied portfolio and approximates well in all our examples the
GOP. If we would double the number of risky primary security accounts in
the above simulations, then the EWI would become an even better proxy of
the GOP. The MCI is still a good proxy of the GOP. However, its fractions
become in some cases quite large, which can work against the diversication
eect.
The above simulation experiments and more extensive simulation studies
in Platen & Rendek (2009c) illustrate under dierent models the power and
robustness of the Diversication Theorem, which identies diversied portfolios as proxies for the GOP without any particular modeling assumptions on
the market dynamics. The diversication phenomenon, known since ancient
times, turns out to be very robust. One simply has to make sure that only
small fractions are invested in each of the primary security accounts to exploit the diversication eect. Only the minor regularity property (3.4.12) is
required to ensure that specic market risks diversify properly when forming
a diversied portfolio to approximate the GOP.
159
(3.5.1)
k
k
,
(3.5.2)
t t dt + dWt
rt dt +
dSt = St
k=1
(3.5.3)
k
k
s
rs +
ds +
(3.5.4)
s dWs ,
St = exp
2
0
0
k=1
k=1
d
tj,k dWtk ,
(3.5.5)
k=1
for all j {0, 1, . . . , d} and t 0, with S0j = S0j . Here we have set tj,k =
j,k
(t), see (3.4.4), for all j {0, 1, . . . , d}, k {1, 2, . . . , d} and t 0. As be(d)
fore, we choose W m+1 , . . . , W d to be compensated, normalized jump martingales with corresponding intensity processes h1 , . . . , hdm , respectively. From
(3.5.5) we obtain, via the It
o formula for the jth benchmarked primary security account, the explicit expression
160
Stj
S0j
1
exp
2
exp
t
m
0 k=1
t
d
j,k 2
ds
k=1
sj,k dWsk
! d
(
"
km
j,k
s
hs
ds
0 k=m+1
k=m+1
j,k
k
1 ( l
(3.5.6)
hkm
k
l=1
pkm
t
"
for each j {0, 1, . . . , d} and all t 0. Here (lk )lN denotes the sequence
of jump times of the kth counting process pk for events of kth type, k
{m + 1, . . . , d}.
Under the benchmark approach the benchmarked primary security accounts Stj , j {0, 1, . . . , d}, are the pivotal objects of study. The savings
account St0 together with the benchmarked primary security accounts are
sucient to specify the entire investment universe. More precisely, the ratio
S0
St = St0 , for all t 0, see (3.3.1), derives the GOP in terms of the savt
ings account and the benchmarked savings account. The product Stj = Stj St
expresses each primary security account in terms of the corresponding benchmarked primary security account and the GOP for each j {1, . . . , d} and all
t 0.
Dene the processes | j | = {|tj |, t 0} for j {0, 1, . . . , d}, by setting
8
9m
9 j,k 2
j
.
(3.5.7)
|t | = :
t
k=1
j = {W
tj , t
We also introduce the aggregate continuous noise processes W
[0, )} for j {0, 1, . . . , d}, dened by
m
t j,k
s
tj =
W
(3.5.8)
dWsk .
j
|
|
s
k=1 0
By Levys Theorem for the characterization of the Wiener process, see The j is a Wiener process for each j {0, 1, . . . , d}.
orem 1.3.3, it follows that W
1 , . . ., W
d can be correlated. Further 0, W
Note that the Wiener processes W
more, we enforce Assumption 3.1.2, such that the generalized volatility matrix
d
bt = [bj,k
t ]j,k=1 becomes for all t 0 invertible. Recall by (3.4.4) that
j,k
k
bj,k
t = t t
(3.5.9)
(3.5.10)
161
(3.5.13)
1 t j2
j
Stj,c = S0j exp
|s | ds
|sj | dW
s
2 0
0
and its compensated jump part
! d
pkm
d
t
"
j,k
j,d
j,k
km
h
1
St = exp t
hkm
k=m+1
k=m+1
(3.5.14)
(3.5.15)
for each j {0, 1, . . . , d} and all t 0. The specic model for the benchmarked
primary security accounts, which we now discuss, diers from the one we will
discuss in the next section in terms of how the continuous processes (3.5.14)
are modeled. The jump processes (3.5.15) will be, for simplicity, chosen to be
the same in both cases.
The Merton Model
The Merton model (MM) has emerged as the standard market model when
including event risk in asset dynamics. We describe here a modication of
the jump diusion model introduced in Merton (1976), see (1.7.36). Each
benchmarked primary security account can be expressed as the product of a
driftless geometric Brownian motion and an independent jump martingale.
Therefore, it is itself a martingale. The MM arises if one assumes that all
parameter processes, that is, the short rate, the volatilities and the jump
intensities, are constant. Therefore, in addition to (3.5.11) and (3.5.12) we set
rt = r and tj,k = j,k for each j {0, 1, . . . , d}, k {1, 2, . . . , m} and t 0.
In this case (3.5.14) can be written as
162
t j2
j,c
j
j j
St = S0 exp | | | | Wt
2
(3.5.16)
for each j {0, 1, . . . , d} and all t 0. In this special case, the benchmarked
primary security accounts are martingales since they are the products of driftless geometric Brownian motions and compensated Poisson processes. The
model is similar to the one introduced in Samuelson (1965b), which eventually became the Black-Scholes model with jumps by Merton (1976). The MM
is sometimes also called the Merton jump diusion model.
By Assumption 3.5.1 and relations (3.5.13)(3.5.15), the benchmarked savings account S0 exhibits no jumps. Furthermore, S0 is a driftless geometric
Brownian motion and, thus, a continuous martingale. Consequently, with this
specication of the market, the Radon-Nikodym derivative process (t) in
(3.3.6) is an (A, P )-martingale. Therefore, the standard risk neutral pricing
approach could be used for derivative pricing under the MM with the risk
neutral pricing formula (3.3.7). While not advocating the MM as an accurate
description of observed market behavior, its familiarity makes it useful for
illustrating real world pricing under the benchmark approach instead of risk
neutral pricing, which in this case delivers the same results. Furthermore, a
great advantage of this model is that one obtains for many derivatives explicit
formulas, which is extremely useful in practice.
Zero Coupon Bonds
We rst consider a standard default-free zero coupon bond, paying one unit
of the domestic currency at its maturity date T [0, ). According to the
real world pricing formula (3.3.5), the value of the zero coupon bond at time
t is given by the expression
!
T
1
1
A = 0 E exp
rs ds ST0 At
(3.5.17)
P (t, T ) = St E
t
ST
St
t
for all t [0, T ]. Since S0 is an (A, P )-martingale and rs is constant we obtain
1
(3.5.18)
P (t, T ) = exp{r(T t)} 0 E ST0 At = exp{r(T t)}
St
for all t [0, T ]. In other words, we obtain the usual bond pricing formula
determined by the deterministic savings account. This is fully in line with
the results that one obtains under risk neutral pricing, see Harrison & Kreps
(1979) and Sect. 2.7.
Forward Contracts
For the moment let us x j {0, 1, . . . , d}, T [0, ) and t [0, T ]. Consider
now a forward contract with the delivery of one unit of the jth primary security
163
account at the maturity date T , which is written at time t [0, T ]. The value
of the forward contract at the time t when it is written is dened to be zero.
According to the real world pricing formula (3.3.5) the forward price F j (t, T )
at time t [0, T ] for this contract is then determined by the relation
F j (t, T ) STj
St E
(3.5.19)
At = 0.
ST
By (3.5.17), solving this equation yields the forward price
St E STj At
1
Stj
j
=
E
STj At
F (t, T ) =
P (t, T ) Stj
St E S1 At
(3.5.20)
if Stj > 0 for given t [0, T ]. In the case when Stj = 0 we have also F j (t, T ) =
0.
In the MM case, with reference to (3.5.16), the same argument, which established that the benchmarked savings account is a continuous martingale,
also applies to the driftless geometric Brownian motion Sj,c , while the compensated Poisson process Sj,d is a jump martingale. Consequently, Sj is the
product of independent martingales, and hence itself an (A, P )-martingale.
Together with (3.5.18) this enables us to write the forward price (3.5.20) as
F j (t, T ) = Stj exp{r(T t)}
(3.5.21)
for all t [0, T ]. Thus, in the MM case we recover the standard risk neutral
formula for the forward price, see for instance Musiela & Rutkowski (2005).
Asset-or-Nothing Binaries
Binary options can be regarded as basic building blocks for complex derivatives. This has been exploited in the valuation of exotic options, when a complex payo can be decomposed into a series of binaries, see Ingersoll (2000)
and Buchen (2004). Let us x j {0, 1, . . . , d} and consider a derivative contract on the jth primary security account with maturity T and strike K + .
We x k {m + 1, . . . , d} and assume that j,k
= 0 and j,l = 0, for each
l {m + 1, . . . , d} with l
= k. In other words, we assume that the jth primary
security account responds only to the (k m)th jump process. This does not
aect the generality of our calculations below, but it does result in analytic
expressions.
The derivative contract under consideration is an asset-or-nothing binary
on the jth primary security account. At its maturity T it pays its holder one
unit of the jth primary security account if this is greater than the strike K,
and nothing otherwise. According to the real world pricing formula (3.3.5),
its value is given by
164
Aj,k (t, T, K) =
St
1{S j K}
T
STj
At
S
T
Stj
j At
j
S
E
1
0
0
1
{ST K(ST ) ST } T
Stj
Stj
= j,c E 1{Sj,c g(pkm pkm )S0 }
t
T
T
T
St
pkm
pkm
t
T
j,k
STj,c At
exp j,k hkm (T t) 1
hkm
(hk (T t))n
exp j,k hkm (T t)
exp hkm (T t)
n!
n=0
n j
j,k
St
j,c At
j,c
1
S
E
1
0
{ST g(n)ST } T
j,c
hkm St
(3.5.22)
n
j,k
j,k
km (T t)
exp
r
+
h
1
St0 Stj,d
hkm
(3.5.23)
for all n N , see Hulley et al. (2005).
Equation (3.5.22) yields the following explicit formula:
(hkm (T t))n
exp hkm (T t)
Aj,k (t, T, K) =
n!
n=0
g(n) =
n
j,k
j,k
km
h
(T t) 1
Stj N (d1 (n)) (3.5.24)
exp
km
h
for all t [0, T ], where
j,k
j
ln 1
S
hkm
ln Kt + r + j,k hkm + n
+
T t
d1 (n) =
0,j
2
1
0,j (T t)
2
T t
(3.5.25)
for each n N . Here N () is the Gaussian distribution function. In (3.5.25)
we employ the following notation
i,j =
| i |2 2 i,j | i | | j | + | j |2
(3.5.26)
for i, j {0, 1, . . . , d}, where i,j is the correlation between the Wiener pro j.
i and W
cesses W
165
Bond-or-Nothing Binaries
Now let us price a bond-or-nothing binary, which pays the strike K + at
maturity T , in the case that the jth primary security account at time T is not
less than K, where j {0, 1, . . . , d} is still xed. As before, let us assume that
the jth primary security account only responds to the kth jump martingale
W k , where k {m + 1, . . . , d} is xed.
At its maturity the bond-or-nothing binary under consideration pays its
holder the strike K if the value of the jth primary security account is in excess
of this amount and nothing otherwise. The real world pricing formula (3.3.5)
then yields
K
j,k
B (t, T, K) = St E 1{S j K} At
T
S
T
= K P (t, T )
KSt E
1
1{S j <K}
T
ST
At
0
S
T
At
1 0
0S
j
ST >K 1 ST
ST0
T
1
0
= K P (t, T ) K exp{r (T t)} 0 E 1 0
j,c ST
ST >g(pkm
pkm
)1 S
t
T
T
St
S0
= K P (t, T ) K t0 E
St
At
(hkm (T t))n
exp hkm (T t)
n!
n=0
1
0
A
S
0 E 1 0
(3.5.27)
t
T
j,c
ST >g(n)1 S
T
St
km
(hkm (T t))n
N (d2 (n))
exp h
(T t)
1
n!
n=0
=
exp{hkm (T t)}
n=0
(hkm (T t))n
K exp{r(T t)} N (d2 (n))
n!
(3.5.28)
166
ln
Stj
K
+
j,k
ln(1
)
hkm
j,k
km
h
+n
r+
T t
d2 (n) =
= d1 (n)
0,j
0,j
1
2
0,j 2
(T t)
T t
T t
(3.5.29)
+
j
K
S
STj K
T
j,k
cT,K (t) = St E
At = St E 1{STj K} S At
S
T
(3.5.30)
(hkm (T t))n
km
j,k
km (T t)
(t)
=
exp{h
(T
t)}
h
cj,k
exp
T,K
n!
n=0
n
j,k
1
Stj N (d1 (n)) K exp{r(T t)}N (d2 (n)) (3.5.31)
hkm
for all t [0, T ], where d1 (n) and d2 (n) are given by (3.5.25) and (3.5.29),
respectively, for each n N .
The explicit formula (3.5.31) corresponds to the original pricing formula
for a European call on a stock whose price follows a jump diusion, as given in
Merton (1976). The only dierence is that there the jump ratios were taken to
be independent log-normally distributed, while in our case they are constant.
Furthermore, this formula can be used to price an option to exchange the jth
primary security account for the ith primary security account. In that case,
the option pricing formula obtained instead of (3.5.31) is a generalization of
that given in Margrabe (1978).
Defaultable Zero Coupon Bonds
We have incorporated default risk in our market model. This allows us to study
the pricing of credit derivatives. Here we consider the canonical example of
167
km
1
At = St E
At E 1 km
P
(t, T ) = St E
{1
>T } At
ST
ST
= P (t, T )P (pkm
= 0 At )
(3.5.32)
T
for all t [0, T ]. Note that the second equality above follows from the independence of the GOP and the underlying Poisson process, see (3.5.2).
Equation (3.5.32) shows that the price of the defaultable bond can be
expressed as the product of the price of the corresponding default-free bond
and the conditional probability of survival. In our setup the latter may be
further evaluated as
= 0 At = E 1{pkm =0} 1{pkm pkm =0} At
P pkm
T
t
t
T
km
p
=
0
= 1{pkm =0} P pkm
At
t
T
t
= 1{pkm =0} E
t
exp
hkm
s
t
!
ds At
(3.5.33)
for all t [0, T ]. One has to combine (3.5.32) and (3.5.33) with (3.5.18) to
obtain an explicit pricing formula for the defaultable bond. Note that the expression obtained by combining (3.5.32) and (3.5.33) is similar to the familiar
risk neutral pricing formula for the price of a defaultable zero coupon bond in a
simple reduced form model for credit risk, see Sch
onbucher (2003). In (3.5.32)
and (3.5.33), however, only the real world probability measure is playing a
role. The crucial advantage of the benchmark approach in such a situation is
that one avoids the undesirable challenge of distinguishing between real world
default probabilities, as determined by historical data and economic reasoning, and putative risk neutral default probabilities, as calibrated by observed
credit spreads and suggested by credit rating agencies. If in reality the existence of an equivalent risk neutral probability measure cannot be reasonably
conrmed, then formally computed risk neutral prices can be substantially
larger than real world prices, see (3.3.9), which will be demonstrated in the
next section. Risk neutral pricing is a form of relative pricing and ignores all
trends. An entire class of derivatives can be massively overpriced using classical risk neutral pricing, as it indeed happened in 2007/2008 with subprime
credit default instruments. Real world pricing represents a form of absolute
168
pricing. The following section will apply a realistic and still parsimonious
model under the benchmark approach to the same derivatives as discussed
above, where explicit formulas will emerge.
(3.6.1)
for all t 0 with 0j > 0. We refer to j as the net growth rate of the jth primary security account, for j {0, 1, . . . , d}. The jth primary security account
can be interpreted as savings account of the jth currency. Next, we dene the
jth square root process Y j = {Ytj , t 0} for j {0, 1, . . . , d} through the
system of SDEs
(
(3.6.2)
dYtj = 1 j Ytj dt + Ytj dWtj
for each j {0, 1, . . . , d} and all t 0, with Y0j =
1
.
j0 S0j
Here W 0 , W 1 , . . . , W d
are, for simplicity, independent Wiener processes. The continuous parts Stj,c
of the benchmarked primary security accounts (3.5.14) are modeled in terms
of these square root processes by setting
Stj,c =
1
tj
169
(3.6.3)
Ytj
for each j {0, 1, . . . , d} and all t 0. Since (3.6.3) combined with (3.5.13)
and (3.5.14) represents a version of the multi-dimensional stylized MMM for
benchmarked primary security accounts, we shall henceforth refer to it as
MMM in this section.
As previously mentioned, between jumps the benchmarked primary security accounts are inverted time transformed squared Bessel processes of dimension four. More precisely, dene the continuous strictly increasing functions
j : + + for j {0, 1, . . . , d} by setting
j (t) = j (0) +
1
4
sj ds
(3.6.4)
Stj,c
(3.6.5)
for each j {0, 1, . . . , d} and all t 0. It then follows, see Sect. 2.6, that X j
is a squared Bessel process of dimension four in -time, so that S1j,c is a time
transformed squared Bessel process for each j {0, 1, . . . , d}.
Zero Coupon Bond and Forward Price
Under the MMM the benchmarked savings account is driftless and nonnegative and, thus, a local martingale. Hence by Lemma 1.3.2 (i) it is a supermartingale. It is, when normalized, also the candidate for the Radon-Nikodym
derivative process for the putative risk neutral measure, see Sect.3.3. However,
according to Revuz & Yor (1999) the candidate Radon-Nikodym derivative is
under the MMM a strict supermartingale and not an (A, P )-martingale. Consequently, risk neutral derivative pricing is impossible under the MMM, and
we shall resort to the more general real world pricing. Platen & Heath (2006)
showed that the MMM is attractive for a number of reasons. Its modest number of parameters and explicitly known transition density makes it a practical
tool. It conveniently illustrates some dierences between the benchmark approach and the classical risk neutral approach.
To simplify the notation let us set
jt =
1
j
St (j (t) j (T ))
(3.6.6)
170
T
1
ST0 At
rs ds At
E
P (t, T ) = E exp
0
St
t
!
T
1 0
1 exp t
rs ds At
= E exp
(3.6.7)
2
t
for all t [0, T ], which uses the real world pricing formula (3.3.5) and a
moment property of squared Bessel processes, see Platen (2002) and (3.6.6).
According to (3.6.5), Sj,c is an inverted time transformed squared Bessel
process of dimension four, while S j,d is an independent jump martingale, as
before in the case of the MM. Thus, we obtain, similar to the previous section,
for the jth primary security account
1
1
1 j
1
j
j,c
j,d
E ST At = j,c E ST At
E ST At = 1 exp t
2
Stj
St
Stj,d
(3.6.8)
for all t [0, T ], by using (3.6.6). Putting (3.5.20) together with (3.6.7) and
(3.6.8) gives for the forward price the formula
!
T
1 exp 12 jt
j
j
1 0 E exp
rs ds At
(3.6.9)
F (t, T ) = St
1 exp
2
for all t [0, T ]. This demonstrates that the forward price of a primary
security account is a tractable quantity under the MMM.
Asset-or-Nothing Binaries
As we have just seen, calculating the price of a payo written on a primary
security account requires the evaluation of a double integral involving the transition density of a two-dimensional process. This is a consequence of choosing the GOP as numeraire. Closed form derivative pricing formulas can be
obtained in many cases for the MM, but in the case of the MMM this is
more dicult, because the joint transition probability densities of two squared
Bessel processes are, in general, dicult to describe, see Bru (1991). A natural
response to this is to obtain the derivative price numerically by Monte Carlo
simulation, as will be described in Chap. 11. However, to give the reader a
feeling for the types of formulas that emerge from applying real world pricing
under the MMM, we have assumed, for simplicity, that the processes S0 and
Sj,c are independent, which is also a reasonable assumption in many practical
171
(hkm (T t))n
exp hkm (T t)
n!
n=0
n
j,k
exp j,k hkm (T t)
1
hkm
(T ) (t)
1
0
0
; jt , 0t exp jt
Stj G0,4
(3.6.10)
g(n)
2
for all t [0, T ], k {m + 1, . . . , d}. Here G0,4 (x; , ) equals the probability P ( ZZ x) for the ratio ZZ of a non-central chi-square distributed
random variable Z 2 (0, ) with degrees of freedom zero and non-centrality
parameter > 0, and a non-central chi-square distributed random variable
Z 2 (4, ) with four degrees of freedom and non-centrality parameter ,
see Hulley et al. (2005). By implementing this special function one obtains the
pricing formula given in (3.6.10), see Johnson, Kotz & Balakrishnan (1995).
Bond-or-Nothing Binaries
Subject to the assumption that ST0 and STj,c are independent, we can combine
(3.5.27), (3.5.23) and (3.6.7), to obtain the bond-or-nothing binary
K
j,k
B (t, T, K) = St E 1{S j K} At
T
ST
1 0
= K exp{r(T t)} 1 exp t
2
(hkm (T t))n
K exp{r(T t)}
exp hkm (T t)
n!
n=0
j (T ) j (t) 0 j
1
; t , t exp 0t
G0,4
g(n)
2
(hkm (T t))n
n!
n=0
j
0
K exp{r(T t)} 1 G0,4 (j (T ) j (t))g(n); t , t
(3.6.11)
exp{hkm (T t)}
172
for all t [0, T ], see Hulley et al. (2005). For the second equality in (3.6.11),
we have again used the fact that
exp{hkm (T t)}
n=0
(hkm (T t))n
n!
+
j
S
K
T
At
cj,k
T,K (t) = St E
ST
#
km
n
(T
t))
(h
=
exp{hkm (T t)}
exp j,k hkm (T t)
n!
n=0
n
1 j
j,k
j
j
0
St G0,4 (0 (T ) 0 (t)) g(n); t , t exp t
1
2
hkm
*
j (T ) j (t) 0 j
; t , t
K exp{r(T t)} 1 G0,4
(3.6.12)
g(n)
for all t [0, T ], where g(n) is given by (3.5.23), for each n N and jt in
(3.6.6).
Savings Bond and Fair Zero Coupon Bond
Let us now compare the MM and the MMM when pricing an extremely long
dated zero coupon bond using historical data to calibrate the models. When
we set for a defaultable bond the recovery rate to one and the default intensity
to zero we obtain a fair zero coupon bond P (t, T ) as priced in (3.5.17) and
(3.6.7) for the respective model.
We interpret now the US dollar savings account discounted world equity
index, displayed in Fig. 3.6.1, as GOP. Its logarithm is shown in Fig. 3.6.2
with a trendline. Under both models, the MM and the MMM, we assumed
deterministic interest rates. Therefore, the historical US three months treasury
bill rate is here interpreted as a deterministic interest rate. The resulting
benchmarked savings bond
173
!
T
1
P (t, T ) = exp
rs ds
St
t
we display in Fig.3.6.3 as the upper graph. One notes its systematic downward
trend. This trend is a consequence of the equity premium, which in the long
run gives the world equity index a larger expected growth than the savings
account. The savings bond invests in the deterministic savings account and
delivers $1 at maturity, as shown in the upper graph of Fig. 3.6.4. This is the
bond price that would result under the MM.
We now use the MMM to model the discounted GOP as a time transformed
squared Bessel process of dimension four. Here the net growth rate can be
estimated from the slope of the trendline in Fig. 3.6.2 with 0 = 0.052 and the
scaling parameter emerges as 00 = 0.1828. Then the resulting fair zero coupon
bond P (t, T ), see (3.6.7), is shown in Fig. 3.6.4 in the lower graph. Its price
is for the long term to maturity signicantly cheaper than that of the savings
174
bond. Both their benchmarked values are shown in Fig. 3.6.3. One notes in
this gure that the trajectory of the benchmarked fair zero coupon is lower
and reects more realistically that of a martingale, whereas the trajectory of
the benchmarked savings bond can be better interpreted as that of a strict
supermartingale, see Platen (2009).
Under the MMM the fair zero coupon bond describes the minimal price
process that can replicate the payo of $1 at maturity. Also, the savings bond
is replicating the payo of $1 at maturity in a self-nancing way. However, this
price process is not fair and, therefore, not the minimal price process that can
achieve this replication. To illustrate the economic reasoning why the fair zero
coupon bond can be cheaper than the savings bond, we display in Fig.3.6.5 the
logarithms of both bonds. One notices that the fair zero coupon bond simply
exploits for a long time the superior long term growth of the equity index and
only close to maturity it slides nally over into the savings bond. Reasonable
nancial planning has always been suggesting such kind of behavior in life cy-
175
176
(3.7.1)
(3.7.2)
177
In the following we consider the simple binomial model to generate randomness. This is probably one of the simplest ways to model randomness
evolving over time. We model the uncertain value of the benchmarked security S at time t = > 0 by only allowing the two possible benchmarked
values (1 + u)S0 and (1 + d)S0 , such that
(3.7.3)
P S = (1 + u) S0 = p
P S = (1 + d) S0 = 1 p,
and
the value u = SSS0 can be interpreted as positive return and for a downward
move d =
0
S
S
0
S
(3.7.5)
(3.7.6)
178
d
.
ud
(3.7.7)
Therefore, we obtain from (3.7.5) and (3.7.7) at time t = 0 the European call
option price
d
S0H = S0H = (1 + u) S0 K
,
(3.7.8)
ud
where S0 = 1. Note that the option price increases with increasing u and
decreases with increasing d.
Hedging
Let us determine the hedge ratio for the above derivative. Assuming a selfnancing fair portfolio, we need by (3.7.2) to hold at time t = 0 in benchmarked terms the value
S0H = 00 + 01 S0
(3.7.9)
and at time t = the value
H
= 00 + 01 S .
S
(3.7.10)
S
=H
.
(3.7.11)
H
S
S0H
,
S S0
which is satised for the case of an upward move when S = (1 + u)S0 with
(1 + u) S0 K
d
(1 + u) S0 K
ud
(1 + u) S0 K
01 =
=
(3.7.12)
u S0
(u d) S0
and for a downward move, that is for S = (1 + d) S0 , with
d
(1 + u) S0 K
ud
(1 + u) S0 K
01 =
=
.
d S0
(u d) S0
(3.7.13)
Important to note is that both values for the hedge ratio in (3.7.12) and
(3.7.13) are the same. This shows that the real world price (3.7.5) provides
a perfect hedge for both scenarios. Fair pricing is performed in a way that
the perfect replication of contingent claims in this market is secured. The
179
(3.7.14)
which involves the two-point distributed random variable {d, u}, where
P ( = u) = p
(3.7.15)
P ( = d) = 1 p.
(3.7.16)
and
It then follows that the variance of and thus
2
S
E
1
S0
S
0
S
equals
A0 = (u d)2 p (1 p).
S
A0 = u d.
E
1
S0
(3.7.17)
180
8
9
2
9
9 1 S
= : E
1
S0
)
A0 = u d
(3.7.18)
the binomial volatility of the benchmarked security under the given binomial
model. Under the BS model we know from the Black-Scholes formula, see
Sect.2.7, that the option price increases for increasing volatility, which appears
to be appropriate from a risk management point of view. Due to the formula
2
( )
S0H = (1 + u) S0 K
u (u d)
there is still substantial freedom to choose for increasing binomial volatility
the returns u and d such that the binomial option price does not increase.
This possibility is somehow not satisfactory and counter intuitive. It results
from the simplicity of the binomial model. Thus, one has to be careful and
should avoid to read too much into the binomial model.
Multi-Period Binomial Model
We consider now a multi-period binomial model. The benchmarked price of
the stock at time t = n satises for an upward move the equation
Sn = S(n1) (1 + u)
(3.7.19)
with probability
d
ud
and for a downward move the relation
p=
Sn = S(n1) (1 + d)
(3.7.20)
(3.7.21)
of a
these events a binomial tree as in Fig. 3.7.1. The benchmarked value S2
portfolio at time T = 2 is given as
0
1
S2
S2
= 2
+ 2
(3.7.22)
with S2 taking the above described three possible values. By using our previous result in (3.7.5) we can compute the option price at time t = for the
case when S = (1 + u) S0 , which follows as
181
+ d
+ u
H
S
+ (1 + u)(1 + d) S0 K
.
= (1 + u)2 S0 K
ud
ud
(3.7.23)
Similarly, we obtain for S = (1 + d)S0 at time t = the option price
+ d
+ u
H
S
+ (1 + d)2 S0 K
.
= (1 + u) (1 + d) S0 K
ud
ud
(3.7.24)
One can now interpret the values (3.7.23) and (3.7.24) as those of a benchmarked payo at time t = . This allows us to obtain, similar as before, the
corresponding fair benchmarked price at time t = 0 as the expectation
+ d
+ u
S0H = p
(1 + u)2 S0 K
+ (1 + u)(1 + d) S0 K
ud
ud
+ d
+ u
+ (1 + d)2 S0 K
+ (1 p) (1 + u) (1 + d) S0 K
.
ud
ud
(3.7.25)
Also for this fair price a corresponding perfect hedging strategy can be described.
More generally, by using backward recursion and the well-known binomial probabilities, see Cox, Ross & Rubinstein (1979), one can show that the
following Cox-Ross-Rubinstein (CRR) type binomial option pricing formula
holds for a benchmarked European call payo with maturity T = i and
where
benchmarked strike K,
182
S0H = E
=
i
k=0
= S0
ST K
+
A0
+
i!
i
k=k
i!
pk (1 p)ik (1 + u)k (1 + d)ik
k ! (i k) !
i
k=k
i!
pk (1 p)ik .
k ! (i k) !
(3.7.26)
Here k denotes the rst integer k for which (1 + u)k (1 + d)ik S0 > K.
Note that this is not exactly the CRR binomial option pricing formula,
which was originally derived for discounted securities under the risk neutral
probability measure. Here we have obtained an option pricing formula that
refers to benchmarked securities under the real world probability measure.
Let us denote by
tN (tk , i, p) =
i
k=k
i!
pk (1 p)ik
k ! (i k) !
(3.7.27)
(3.7.28)
(3.7.29)
for k {1, 2, . . . , i}. Similarly, as in (3.7.13) one obtains the hedge ratio
1
=
k
ik
k
=k
(i k) !
p(ik) (1 p) ,
! (i k ) !
(3.7.30)
where kk denotes the smallest integer for which (1+u) (1+d)ik Sk > K.
183
(3.7.31)
The binomial model is a very convenient but also a rather simple model.
It allows an easy calculation of European and also American option prices.
However, due to its simplicity it has severe short comings that we will demonstrate later on. For instance, such trees have numerical stability problems that
closely resemble those of explicit weak schemes, as we will discuss in Chap. 17.
Approximating the Black-Scholes Price
In the following, we indicate how the binomial model approximates asymptotically the Black-Scholes model in a weak sense when the time step size
converges to zero. In principle, the random walk that is performed on the
binomial tree converges in a weak sense to a limiting process as the time
step size decreases to zero. This limiting process turns out to be a geometric
Brownian motion.
To see the indicated asymptotics of the above binomial model, let us x u
and d by setting
(3.7.32)
ln(1 + u) =
and
ln(1 + d) =
(3.7.33)
u = exp 1
(3.7.34)
and
d = exp 1 .
(3.7.35)
Note that the binomial volatility was introduced in (3.7.18) and we have
for small 1 by (3.7.34)(3.7.35) approximately
1
u d .
=
(3.7.36)
(3.7.37)
Yn+
= Yn
+ Yn
n
(3.7.38)
184
d
ud
and
P (n = d) = 1 p =
u
,
ud
(3.7.39)
(3.7.40)
Yn+
Yn
+ Yn
n ,
(3.7.41)
where we use an independent random variable n with
exp{ } 1
P (n = 1) = p =
exp{ } exp{ }
(3.7.42)
and
P (n = 1) = 1 p.
Note that
and
E
Yn+
Yn
An = 0
E Yn+
Yn
2
2 2
2 2
Yn
.
An = Yn
(3.7.43)
(3.7.44)
ln( SK0 ) + 12 2 T
and
d2 = d1
T.
(3.7.46)
(3.7.47)
(3.7.48)
Here N () is the standard Gaussian distribution function. Note that we consider here benchmarked securities, which is dierent to the standard version
of the Black-Scholes formula described in Sect. 2.7.
3.8 Exercises
185
The important observation here is that the binomial tree method does not
involve any simulation. The nodes of the tree reect all possible paths of the
underlying random walk.
Finally, we remark that the pricing of American style options on binomial
and other trees can be conveniently performed. At rst, one builds the tree in
a forward manner as described above. Then one uses a backward algorithm to
obtain from the maturity date backward the price of the contingent claim at
each of the nodes. In particular, at each node one decides which is the greater,
the price that has been obtained under the assumption that one continues to
hold the American style derivative to the next period, or the price that results
when one exercises it at this node.
3.8 Exercises
3.1. In a continuous nancial market derive the form of the growth rate gt of
a strictly positive portfolio S satisfying the SDE
d
d
j
j,k
dSt = St rt dt +
,t bt (tk dt + dWtk ) .
k=1 j=1
3.2. For a nonnegative portfolio St with SDE as in Exercise 3.1 and the
growth optimal portfolio St satisfying the SDE
d
k
k
k
dSt = St
rt dt +
t (t dt + dWt ) ,
k=1
St
St
integrable does this SDE, in general, imply that S is a submartingale, martingale or supermartingale?
3.3. Calculate the SDE for the logarithm of the discounted GOP under the
MMM.
3.4. Derive the SDE of the square root for the discounted GOP under the
MMM.
3.5. Derive the SDE for the normalized GOP Yt =
t = 0 exp
s ds .
S
t
t
4
Stochastic Expansions
Xt = Xt0 +
a(Xs ) ds +
t0
b(Xs ) dWs
(4.1.1)
t0
187
188
4 Stochastic Expansions
t
1 2
2
f (Xt ) = f (Xt0 ) +
a(Xs ) f (Xs ) + b (Xs ) 2 f (Xs ) ds
x
2
x
t0
t
= f (Xt0 ) +
L0 f (Xs ) ds +
L1 f (Xs ) dWs ,
(4.1.2)
t0
t0
1 2
+ b2 2
x 2 x
(4.1.3)
and
.
(4.1.4)
x
Obviously, for the special case f (x) x we have L0 f = a and L1 f = b, for
which the representation (4.1.2) reduces to (4.1.1). Since the representation
(4.1.2) still relies on integrands that are functions of processes we now apply
the It
o formula (4.1.2) to the functions f = a and f = b in (4.1.1) to obtain
t
s
L0 a(Xz ) dz +
L1 a(Xz ) dWz ds
a(Xt0 ) +
Xt = Xt0 +
L1 = b
t0
t
b(Xt0 ) +
+
t0
t0
ds + b(Xt0 )
dWs + R2
t0
t0
s
L1 a(Xz ) dWz ds
t0
L0 b(Xz ) dz dWs +
+
t0
t0
(4.1.5)
t0
t
s
R2 =
L0 a(Xz ) dz ds +
L b(Xz ) dz +
t0
t0
t0
= Xt0 + a(Xt0 )
t0
t0
Xt = Xt0 + a(Xt0 )
t0
dWs + L1 b(Xt0 )
ds + b(Xt0 )
t0
189
dWz dWs + R3
t0
t0
(4.1.6)
with remainder term
t
s
L0 a(Xz ) dz ds +
R3 =
t0
t0
t0
s
L1 a(Xz ) dWz ds
t0
L b(Xz ) dz dWs +
+
t0
t0
t0
t0
t0
+
t0
t0
t0
t
ds = t t0 ,
t0
t
dWs = Wt Wt0 ,
t0
s
dWz dWs =
t0
t0
1
(Wt Wt0 )2 (t t0 ) ,
2
see (4.1.6). The remainder term, here R3 , consists of the next following multiple It
o integrals with nonconstant integrands. A Wagner-Platen expansion
can be interpreted as a generalization of both the It
o formula and the classical
deterministic Taylor formula. As we have seen, it emerges from an iterated
application of the It
o formula.
Generalized Wagner-Platen Expansion
Let us now expand the changes of a function value with respect to the underlying process X itself. From the It
o formula it follows for a twice continuously
dierentiable function f : [0, T ] the representation
1 t 2
+
f (s, Xs ) d [X]s
(4.1.7)
2 t0 x2
190
4 Stochastic Expansions
for t [t0 , T ], where Xt satises the SDE (4.1.1). Here the quadratic variation
[X]t has the form
t
[X]t = [X]t0 +
b2 (Xs ) ds
(4.1.8)
t0
s 2
f (t0 , Xt0 ) +
f (t, Xt ) = f (t0 , Xt0 ) +
f (z, Xz ) dz
2
t
t0
t0 t
+
t0
t
+
t0
f (t0 , Xt0 ) +
x
+
t0
1
2
2
f (z, Xz ) dXz +
x t
+
t0
t0
2
f (t0 , Xt0 ) +
x2
s
t0
1 2
f
(z,
X
)
d
[X]
z
z ds
2 x2 t
2
f (z, Xz ) dz
t x
s
t0
t0
3
f (z, Xz ) dXz +
x3
= f (t0 , Xt0 ) +
2
f (z, Xz ) dXz +
x2
t
t0
1 3
f
(z,
X
)
d
[X]
z
z dXs
2 x3
3
f (z, Xz ) dz
t x2
s
t0
1 4
f
(z,
X
)
d
[X]
z
z d [X]s
2 x4
f (t0 , Xt0 ) (t t0 ) +
f (t0 , Xt0 ) (Xt Xt0 )
t
x
1 2
f (t0 , Xt0 ) [X]t [X]t0 + Rf (t0 , t),
2
2 x
(4.1.9)
where the remainder term Rf (t0 , t) accommodates the terms with nonconstant
integrands. The generalized Wagner-Platen expansion is similar to a Taylor
expansion but signicantly dierent in that it includes various terms that
would not otherwise arise in deterministic calculus. One could continue to
expand in (4.1.9) those terms that appear in the remainder term by application
of the It
o formula. This would lead, for instance, to an expansion of the
form
191
f (t0 , Xt0 ) (t t0 ) +
f (t0 , Xt0 ) (Xt Xt0 )
t
x
1 2
f (t0 , Xt0 ) [X]t [X]t0
2
2 x
t
s
2
f (t0 , Xt0 )
dXz ds
+
x t
t0 t0
t
s
1 2
f
(t
,
X
)
d [X]z ds
+
0
t0
2 x2 t
t0 t0
t
s
2
+
f (t0 , Xt0 )
dz dXs
t x
t0 t0
t
s
2
+ 2 f (t0 , Xt0 )
dXz dXs
x
t0 t0
t
s
1 3
+
f
(t
,
X
)
d [X]z dXs
0
t0
2 x3
t0 t0
t
s
1 3
+
f (t0 , Xt0 )
dz d [X]s
2 t x3
t0 t0
t
s
1 3
+
f
(t
,
X
)
dXz d [X]s
0
t0
2 x3
t0 t0
t
s
1 4
f (t0 , t)
+
f (t0 , Xt0 )
d [X]z d [X]s + R
4 x4
t0 t0
+
(4.1.10)
t
a(Xs ) ds +
b(Xs ) dWs
(4.1.11)
Xt = Xt0 +
t0
t0
for t [t0 , T ], where the second integral in (4.1.11) is the symmetric stochastic
integral, also known as Stratonovich integral, and the coecients a and b
192
4 Stochastic Expansions
are suciently smooth real valued functions, such that a unique solution of
(4.1.11) exists. Note that when
1
bb ,
2
a=a
(4.1.12)
the It
o equation (4.1.1) and the Stratonovich equation (4.1.11) are equivalent
and have, under appropriate assumptions, the same unique strong solutions.
The solution of a Stratonovich SDE transforms formally according to the
classical chain rule, see Kloeden & Platen (1999). Thus for any continuously
dierentiable function f : we have
f (Xs ) ds +
f (Xs ) dWs
a(Xs )
b(Xs )
f (Xt ) = f (Xt0 ) +
x
x
t0
t0
t
= f (Xt0 ) +
L0 f (Xs ) ds +
L1 f (Xs ) dWs ,
(4.1.13)
t0
t0
(4.1.14)
L1 = b
.
x
(4.1.15)
and
t
s
0
1
Xt = Xt0 +
L a(Xz ) dz +
L a(Xz ) dWz ds
a(Xt0 ) +
t0
t
b(Xt0 ) +
+
t0
t0
dWs + R4 ,
L1 a(Xz ) dWz ds
t0
L b(Xz ) dz dWs +
t0
(4.1.16)
t0
dWs
t0
t0
ds + b(Xt0 )
t
s
L0 a(Xz ) dz ds +
R4 =
t0
L1 b(Xz ) dWz
t0
t
t0
t0
L0 b(Xz ) dz +
t0
= Xt0 + a(Xt0 )
t0
t0
t0
193
t
ds + b(Xt0 )
dWs
Xt = Xt0 + a(Xt0 )
t0
t0
s
dWz dWs + R5 ,
+ L1 b(Xt0 )
t0
t
s
R5 =
L0 a(Xz ) dz ds +
t0
t0
t0
t0
L1 a(Xz ) dWz ds
t0
L b(Xz ) dz dWs +
+
t0
t0
t0
t0
t0
+
t0
t0
t0
194
4 Stochastic Expansions
Nt =
Ns
(4.1.18)
s(0,t]
for t [0, T ]. This is a special case of the Ito formula for semimartingales that
exhibit jumps. We can formally write the equation (4.1.19) as the SDE
(4.1.21)
f (Ns ) dNs
(4.1.22)
f (Nt ) = f (N0 ) +
(0,t]
f (N0 ) dNs +
f (Ns ) dNs dNs
= f (N0 ) +
1
1
2
(0,t]
(0,t]
(0,s2 )
for t [0, T ]. One notes that multiple stochastic integrals with respect to the
counting process N naturally arise. It is straightforward to prove by induction,
as in Engel (1982), Platen (1984) and Studer (2001), that
dNs = Nt
(0,t]
dNs1 dNs2 =
(0,t]
(0,s1 )
1
Nt (Nt 1),
2!
1
Nt (Nt 1) (Nt 2),
3!
Nt
for Nt n
n
. . . dNsn1 dNsn =
0
otherwise
(4.1.23)
(0,s1 )
(0,s2 )
...
(0,t]
(0,s1 )
dNs1
(0,sn )
(4.1.24)
195
f (N0 ) dNs +
f (N0 ) dNs dNs +R
3 (t)
f (Nt ) = f (N0 )+
1
2
(0,t]
with
(0,t] (0,s2 )
3 (t) =
R
(0,t]
(0,s2 )
f (Ns1 )
dNs1 dNs2 dNs3 .
(0,s3 )
3 (t),
f (Nt ) = f (N0 ) + f (N0 )
+ f (N0 )
+R
1
2
where
f (0) = f (1) f (0),
f (N0 ) =
f (N0 ) = f (2) 2 f (1) + f (0).
This means, we have for a general function f of a counting process or pure
jump process N the expansion
1
3 (t).
Nt (Nt 1) + R
2
(4.1.25)
The above calculations demonstrate that one can expand elegantly functions
of pure jump processes. It is interesting to note that for those with Nt
3 (t) equals zero, and the truncated expansion, when
3 the remainder term R
neglecting the remainder term, is then exact. These are important observations
that can be very useful in practice.
The above type of expansions can be generalized also for jump diusions
and for functions on processes that involve several counting processes, as we
will discuss later.
196
4 Stochastic Expansions
(4.2.1)
dX t = a
(t, X t )dt + b(t, X t ) dWt + c(t, X t , v)
p (dv, dt), (4.2.2)
E
a
(t, x) = a(t, x) + c(t, x, v) (dv),
(4.2.3)
(4.2.4)
(4.2.5)
197
l((0, 1, 1, 0, 2)) = 5
n((0, 1, 1)) = 1
n((0, 1, 1, 0, 2)) = 2
s((0, 1, 1)) = 1
s((0, 1, 1, 0, 2)) = 1
(0, 1, 1) = (0, 1)
(0, 1, 1, 0, 2) = (0, 1, 1, 0)
(0, 1, 1) = (1, 1)
(4.2.6)
(4.2.7)
H(0) =
|g(s, )|ds
g:E
!
<
H(1) =
g:E
|g(s, v, )| (dv)ds
2
H(j) =
|g(s, )|2 ds
g:E
!
<
!
< ,
(4.2.8)
198
4 Stochastic Expansions
g( )
when l = 0 and = v
I
[g()]
dz
when
l 1 and jl = 0
,z
I [g()], =
j
l
I [g()]
,z p (dvs() , dz) when l 1 and jl = 1,
E
(4.2.9)
where g() = g(, v1 , . . . , vs() ). As previously, by z we denote the left-hand
limit of z. However, for a multi-index has another meaning as described
earlier. For simplicity, when it is not strictly necessary, here and in the sequel
we may omit the dependence of the integrand process g on one or more of
the variables v1 , . . . , vs() that express the dependence on the marks of the
Poisson jump measure.
Note that by denition (4.2.9), whenever we have an integration with
respect to the Poisson measure p , we take the left-continuous modication of
the integrand. Therefore, since we have required that the integrand is adapted,
its left-continuous modication will be predictable, see for instance Klebaner
(2005). Thus, the corresponding stochastic integral with respect to the Poisson
measure p is well dened.
As dened in (4.2.9), in a multi-index the components that equal 0
refer to an integration with respect to time, the components that equal j
{1, 2, . . . , m} refer to an integration with respect to the jth component of the
Wiener process, while the components that equal 1 refer to an integration
with respect to the Poisson measure p . For instance, for m = 2 one has
z3
I(0,1,1) [g()], =
z2
I(2,0) [g()], =
and
I(1,1) [g()], =
z2
(4.2.10)
199
g( )
I [g()],z dz
I [g()], =
I [g()],z dWzjl
when l = 0 and = v
when l 1 and jl = 0
when l 1 and jl {1, . . . , m}
when l 1 and jl = 1,
E
(4.2.11)
where g() = g(, v1 , . . . , vs() ). Note that the multiple stochastic integral
I [g()], , dened previously in (4.2.9), involves integrations with respect
to the Poisson jump measure p , while the compensated multiple stochastic integral I [g()], is dened in terms of integrations with respect to the
compensated Poisson measure p . For instance, for m = 2 one obtains
I(0,1,1) [g()], =
z3
z2
I(1,1) [g()], =
z2
2s()
1
H,i .
(4.2.12)
i=1
The terms H,i denote multiple stochastic integrals of the adapted process
g() that use as integrators the time, the Wiener processes, the Poisson jump
measure p and the intensity measure . A complete description of these terms
could be given by dening recursively a new type of multiple stochastic integral. Then by using the relationship (4.2.1) together with (4.2.9) and (4.2.11),
the terms H,i could be recursively characterized. However, since this characterization requires again rather technical notation, we omit for simplicity this
obvious but complex characterization. Instead, we provide the following two
illustrative examples.
200
4 Stochastic Expansions
I(0,1,1) [g()], =
z3
z2
z3
z2
=
z3
z2
z3
H(0,1,1),1 =
z2
z2
I(1,1) [g()], =
g(z1 , v1 , v2 ) p (dv1 , dz1 )
p (dv2 , dz2 )
E
z2
=
E
z2
z2
+
z2
= I [g()], +
2s()
1
i=1
,i .
H
(4.2.13)
201
z2
I(1,1) [g()], =
g(z1 , v1 , v2 ) p (dv1 , dz1 )p (dv2 , dz2 )
z2
=
z2
+
E
z2
z2
+
E
(1,1),1 + H
(1,1),2 + H
(1,1),3 .
= I(1,1) [g()], + H
For instance, when = (0, 1, 1) one has
z3
I(0,1,1) [g()], =
z3
z2
z2
z3
+
z2
(0,1,1),1 ,
= I(0,1,1) [g()], + H
so that
(0,1,1),1 =
H
z3
z2
(4.2.14)
202
4 Stochastic Expansions
Wt0 = t
(4.2.15)
Wtj I,t =
i=0
i=1
(4.2.16)
for all t 0.
It is interesting to note that multiple It
o integrals with respect to one
Wiener process are related to Hermite polynomials. In particular, it follows
that
I(j), = Wj
1 j 2
W
I(j,j), =
2
1 j 3
W 3 Wj
I(j,j,j), =
3!
1 j 4
W 6 (Wj )2 + 3 2
I(j,j,j,j), =
4!
I(j,j,j,j,j), =
1 j 5
W 10 (Wj )3 + 15 2 Wj
5!
(4.2.17)
203
(4.3.1)
d
ci (t, x, v)
i=1
f (t, x, u)
xi
(4.3.2)
1
we denote the set
partial derivatives x
i f (t, x, u), i {1, 2, . . . , d}. With L
of functions for which
f t, x + c (t, x, v) , u f t, x, u 2
(4.3.3)
is (dv)-integrable for all t [0, T ], x d and u E s .
Some Operators
Let us now dene the following operators for a function f (t, x, u) Lk :
L0 f (t, x, u) =
d
f (t, x, u) +
ai (t, x) i f (t, x, u)
t
x
i=1
d
m
1 i,j
2
b (t, x)bl,j (t, x) i l f (t, x, u) (4.3.4)
2
x x
j=1
i,l=1
Lk f (t, x, u) =
d
i=1
bi,k (t, x)
f (t, x, u),
xi
(4.3.5)
(4.3.6)
for all t [0, T ], x d and u E s . Note that the operator in (4.3.6) adds
an extra dependence v E on the mark components. Let us also dene the
operator
204
4 Stochastic Expansions
0 f (t, x, u) = L0 f (t, x, u) +
L
E
f (t, x + c(t, x, v), u) f (t, x, u) (dv)
d
f (t, x, u) +
a
i (t, x) i f (t, x, u)
t
x
i=1
d
m
1 i,j
2
b (t, x)bl,j (t, x) i l f (t, x, u)
2
x x
j=1
i,l=1
+
f (t, x + c(t, x, v), u) f (t, x, u)
E
(4.3.7)
for l() = 0,
f (t, x)
j1
f (t, x, u) = L f (t, x, u1 , . . . , us() )
for l() 1, j1 {0, 1, . . . , m}
L1 f (t, x, u , . . . , u
)
for l() 1, j1 = 1.
1
s()
us()
(4.3.8)
By u = (u1 , . . . , us() ) we denote a vector u E s() . Note that the dependence
on u in (4.3.8) is introduced by the repeated application of the operator
(4.3.6). Additionally, we assume that the coecients of the SDE (1.8.2) and
the function f satisfy the smoothness and integrability conditions needed for
the operators in (4.3.8) to be well dened, see also Remark 4.4.2.
If we choose the identity function f (t, x) = x, then for the case d = m = 1
we can write the examples
f(1,0) (t, x, u) = L1
u a(t, x) = a(t, x + c(t, x, u)) a(t, x),
f(0,1) (t, x) = L0 b(t, x)
=
and
2 2
1
205
1
f(1,1) (t, x, u) = L1
u2 c(t, x, u )
f (t, x)
L
0 f (t, x, u1 , . . . , us() )
f (t, x, u) =
Lj1 f (t, x, u1 , . . . , us() )
1
Lus() f (t, x, u1 , . . . , us() )
for
for
for
for
l() = 0,
l() 1, j1 = 0
l() 1, j1 {1, . . . , m}
l() 1, j1 = 1.
(4.3.9)
Here we assume again that the coecients of the SDE (1.8.2) and the function
f satisfy the smoothness and integrability conditions needed for the operators
in (4.3.9) to be well dened. For illustration, if we choose the identity function
f (t, x) = x, then for the case d = m = 1 we can formulate the examples
(t, x)
f(1,0) (t, x, u) = L1
u a
=a
(t, x + c(t, x, u)) a
(t, x)
= a(t, x + c(t, x, u)) a(t, x)
0 b(t, x)
f(0,1) (t, x) = L
=
1
2
b(t, x) + a(t, x)
b(t, x) + (b(t, x))2 2 b(t, x)
t
x
2
x
1
2
b(t, x) + a
(t, x)
b(t, x) + (b(t, x))2 2 b(t, x)
t
x
2
x
b(t, x) (dv)
+
b(t, x + c(t, x, v)) b(t, x) c(t, x, v)
x
E
and
f(1,1) (t, x, u) = f(1,1) (t, x, u).
(4.3.10)
206
4 Stochastic Expansions
(4.3.11)
Then the remainder set consists of all next following multi-indices with respect
to the given hierarchical set.
Let us give an example of a hierarchical set and the corresponding remainder set: When m = 1, we will later see that the hierarchical set AE ,
corresponding to the popular Euler scheme, is of the form
AE = {v, (1), (0), (1)}.
(4.3.12)
(4.4.1)
dX t = a(t, X t ) dt + b(t, X t ) dW t + c(t, X t , v) p (dv, dt),
E
B(A)
(4.4.3)
B(A)
assuming that the function f and the coecients of the SDE (4.4.1) are suciently smooth and integrable such that the arising coecient functions f and
f are well dened and the corresponding multiple stochastic integrals exist.
207
Note that in (4.4.2) and (4.4.3) we have, for simplicity, neglected that f
and f are vectors and suppressed their dependence on u E s() in our notation. We will also continue to do so in the following when no misunderstanding
is possible.
Remark 4.4.2. For instance, if the function f and the coecients a, b and c
of the SDE (4.4.1) are 2(l() + 1) times continuously dierentiable, uniformly
bounded with uniformly bounded derivatives, then all coecient functions
and multiple stochastic integrals in the expansions (4.4.2) and (4.4.3) are well
dened.
The proof of the Wagner-Platen expansion is again based on an iterated
application of the It
o formula. Since it is formally the same as in Platen
(1982a, 1982b) and analogous to the one in Kloeden & Platen (1999), it is
here omitted.
By choosing as function f the identity function f (t, x) = x, we can represent the solution X = {X t , t [0, T ]} of the SDE (4.4.1) by the WagnerPlaten expansion
I [f (, X )], +
I [f (, X )],
(4.4.4)
X =
A
B(A)
(4.4.5)
B(A)
X = X +a(, X )
dz+b(, X )
dWz +
c(, X , v) p (dv, dz)+R
with remainder
208
4 Stochastic Expansions
R=
L1 a(z, Xz ) dWz ds
L a(z, Xz ) dz ds +
L1
v a(z, Xz ) p (dv, dz) ds
L0 b(z, Xz ) dz dWs +
L1
v b(z, Xz ) p (dv, dz) dWs
L0 c(z, Xz , v) dz p (dv, ds)
+
+
L1
u c(z, Xz , v) p (dv, dz) p (du, ds).
(4.4.6)
209
1 2
1
2
+ a a a + (a ) + b b a + b a
+ b2 (a a + 3 a a
2
2
1
+ (b )2 + b b a + 2 b b a + b4 a(4) I(0,0,0)
4
1
1
+ a a b + a b + b b b + b2 b + b2 a b + 2 a b
2
2
2
1 2 (4)
+a b + (b ) + b b b + 2 b b b + b b
I(0,0,1)
2
1
+ a (b a + b a ) + b2 (b a + 2 b a + b a ) I(0,1,0)
2
1 2
2
+ a (b ) + b b + b (b b + 2 b b + b b ) I(0,1,1)
2
1
+ b a a + (a )2 + b b a + b2 a I(1,0,0)
2
1 2
+b ab + a b + bb b + b b
I(1,0,1)
2
+ b (a b + a b) I(1,1,0) + b (b )2 + b b I(1,1,1) + R6 .
(4.4.7)
This is a useful formula since it will allow us to readily establish various
numerical schemes later.
The most important property of the Wagner-Platen expansion is that it
allows a function of a process to be expanded as the sum of a nite number of
multiple It
o integrals with constant integrands. Like the deterministic Taylor
expansion, it can be exibly used for the approximation of increments of solutions of SDEs over small time intervals. If one needs to exploit martingale
properties, as it is often the case in nance, then one should use a WagnerPlaten expansion because the It
o stochastic integrals, which are martingales,
vanish when taking conditional expectations. However, if there are advantages from a less complex structure under the Stratonovich calculus, then a
Stratonovich type stochastic expansion is appropriate, see Kloeden & Platen
(1999).
Examples of Wagner-Platen Expansions
Let us give three specic examples for Wagner-Platen expansions:
1. First, we consider a Vasicek interest rate model
r rt ) dt + dWt
drt = (
(4.4.8)
210
4 Stochastic Expansions
t3
+ 2
6
t2
Ws ds
0
s2
(4.4.9)
(4.4.10)
+ 2
t2
= I(1) I(0,0) = I(0,0,1) +I(0,1,0) +I(1,0,0)
2
and
1
(Wt )2 t = I(0) I(1,1) = I(0,1,1) + I(1,0,1) + I(1,1,0) .
2
Therefore, it follows from (4.4.11) that
t2
t3
2
St = S0 1 + a t + Wt + a2 + a t Wt +
(Wt )2 t + a3
2
2
6
t2
2
2 t
2
3 1
3
(Wt ) t +
(Wt ) 3 t Wt + R6 .
+ a Wt + a
2
2
6
t
dXt = dt + 2
211
Xt dWt
for t [0, T ] with X0 > 0. Here the Wagner-Platen expansion with hierarchical set
A = { M1 : () 2},
is obtained as
Xt = X0 + ( 1) t + 2
1
X 0 Wt +
X0
(4.4.12)
s dWs + (Wt )2 + R.
0
(4.5.1)
see (4.2.1), we will prove corresponding estimates for the case of the multiple
stochastic integrals I [g()], .
The rst moment of a multiple It
o integral vanishes if it has at least
one integration with respect to a component of the Wiener process or the
compensated Poisson measure. This is the case when only using compensated
Poisson measures and not all of the components of the multi-index are
equal to zero, which is equivalent to the condition ()
= n(), where n()
denotes the number of zeros in . The following lemma is a straightforward
generalization of a similar statement in Kloeden & Platen (1999).
Lemma 4.5.1
Let Mm \ {v} with ()
= n(), g H and and
be two stopping times with 0 T , almost surely, then
E I [g()], A = 0.
(4.5.2)
212
4 Stochastic Expansions
2
l()n() l()+n()1
F = E
sup I [g()],s A 4
V,z,s() dz,
s
(4.5.3)
and
F
=E
s() V,z,s() dz,
sup |I [g()],s | A 4l()n() l()+n()1 K
s
2
(4.5.4)
where
V,z,s() =
...
E
E sup |g(t, v 1 , . . . , v s() )|2 A (dv 1 ) . . . (dv s() ) <
tz
(4.5.5)
= 1 (4 + T ).
for z [, ], and K
2
Proof:
l().
1. Let us assume that l() = 1 and = (0). By the Cauchy-Schwarz inequality we have the estimate
s
2
s
(s )
g(z)
dz
|g(z)|2 dz.
(4.5.6)
Therefore, we obtain
F(0) = E
sup
s
2
g(z) dz
sup (s )
s
= E ( )
E
=
|g(z)|2 dz A
|g(z)| dz A
2
A
|g(z)|2 dz A
E |g(z)|2 |A dz
213
2
E sup |g(t)| A dz
tz
= 4l()n() l()+n()1
V,z,s() dz,
(4.5.7)
where the interchange between expectation and integral holds by the A measurability of and an application of Fubinis theorem.
2. When l() = 1 and = (j) with j {1, 2, . . . , m}, we rst observe that
the process
t
g(s) dWsj , t [, T ]
(4.5.8)
I [g()],t , t [, T ] =
F = E
sup
g(z) dWz A
s
4 E
=4 E
=4
2
A
g(z) dWzj
|g(z)|2 dz A
E |g(z)|2 |A dz
E sup |g(t)| A dz
tz
= 4l()n() l()+n()1
V,z,s() dz.
(4.5.9)
Here again the interchange between expectation and integral holds by the
A -measurability of and Fubinis theorem.
3. Let us now consider the case with l() = 1 and = (1). The process
t
I [g()],t , t [, T ] =
g(s, v) p (dv, ds), t [, T ]
(4.5.10)
is by (4.2.1) and (4.2.8) a martingale. Then, by Doobs inequality and the
isometry for Poisson type jump martingales, we obtain
214
4 Stochastic Expansions
sup
s
F(1) = E
4 E
A
|g(z, v)| (dv) dz A
E |g(z, v)|2 |A (dv) dz
=4
sup |g(t, v)|2 A (dv) dz
E
=4
2
g(z, v) p (dv, dz)
A
=4 E
2
g(z, v) p (dv, dz)
tz
l()n()
l()+n()1
V,z,s() dz,
(4.5.11)
since s() = 1. This shows that the result of Lemma 4.5.2 holds for
l() = 1.
4. Now, let l() = n+1, where = (j1 , . . . , jn+1 ) and jn+1 = 0. By applying
the Cauchy-Schwarz inequality we obtain
s
2
F = E
I [g()],z dz A
sup
s
sup (s )
s
= E ( )
E
E
E
2
2
|I [g()],z | dz A
2
|I [g()],z | dz A
2
sup |I [g()],s | dz A
s
= E
|I [g()],z |2 dz A
dz sup |I [g()],s |2 A
s
2
sup |I [g()],s | A .
s
(4.5.12)
215
F 2 4l()n() l()+n()1
V,z,s() dz
= 4l()n() l()+n()1
V,z,s() dz,
(4.5.13)
where the last line holds since l() = l() + 1, n() = n() + 1 and
s() = s().
5. Let us now consider the case when l() = n + 1, where = (j1 , . . . , jn+1 )
and jn+1 {1, 2, . . . , m}. The process
I [g()],t , t [, T ]
(4.5.14)
is again a martingale. Therefore, by Doobs inequality and It
os isometry
we obtain
s
2
jn+1
F = E
I [g()],z dWz A
sup
s
4 E
I [g()],z
= 4E
4E
2
jn+1
dWz
|I [g()],z |2 dz A
2
sup |I [g()],s | dz A
s
= 4 E ( ) sup |I [g()],s |2
s
4E
A
A
sup |I [g()],s |2 A .
(4.5.15)
s
F 4 4l()n() l()+n()1
V,z,s() dz
= l()+n()1 4l()n()
V,z,s() dz,
(4.5.16)
216
4 Stochastic Expansions
2
s()
s()
F = E
I [g(, v
sup
)],z p (dv
, dz) A
s
E
4 E
=4 E
4E
s()
2
s()
|I [g(, v
)],z | (dv
) dz A
= 4 E ( )
A
s()
2
s()
sup |I [g(, v
)],s | (dv
) dz A
s
2
s()
s()
I [g(, v
)],z p (dv
, dz)
sup |I [g(, v s() )],s |2 (dv s() ) A
s
s()
2
E sup |I [g(, v
)],s | A (dv s() ). (4.5.18)
s
F 4 4l()n() l()+n()1
= 4l()n() l()+n()1
V,z,s() dz,
(4.5.19)
2
(1)
=E
sup
g(z, v) p (dv, dz) A
F
s
=E
s
2
sup
g(z, v) p (dv, dz) +
g(z, v) (dv)dz
s
A
217
2
2 E sup I(1) [g()], A
s
2
sup I(0) [ g(, v)(dv)],
s
+2E
A .
(4.5.20)
By applying the estimate of the already obtained assertion (4.5.3) and the
Cauchy-Schwarz inequality, we have
2
F(1) 8
E sup |g(t, v)| A (dv)dz
E
tz
+2
2
sup g(t, v)(dv)
tz
A dz
4K
V,z,s() dz
s()
= 4l()n() l()+n()1 K
V,z,s() dz,
(4.5.21)
2
s()
s()
sup
I [g(, v
)],z p (dv
, dz) A
F = E
s
E
sup
s
2E
sup
s
+2E
)],z p (dv
I [g(, v
sup |I [g(, v
E
E
s()
)],z (dv
+2
s()
E
E
A (dv s() ).
A
2
)dz
)],s |
sup |I [g()],s |
s
s()
A (dv s() )
2
, dz)
s
2
s()
I [g(, v
s()
A
(4.5.22)
218
4 Stochastic Expansions
84
l()n()
l()+n()1
s()
K
E
s()
+ 2 T 4l()n() l()+n()1 K
s()
4l()n() l()+n()1 K
E
V,z,s() dz,
(4.5.23)
F = E I [g()], A
C1 ( )
q l()+n()s() +s()1
and
2q
F = E | I [g()], |
C2 ( )q
A
l()+n()s() +s()1
where
V,z,s() =
...
E
E |g(z, v1 , . . . , vs() )|2q
A (dv1 ) . . . (dvs() ) <
for z [, ].
Proof:
1. Let us rst prove the assertion for a multi-index of length one, which
means l() = 1. When l() = 1 and = (0), by applying the H
older
inequality we obtain
F(0)
= E
A
2q
g(z)dz
( )2q1
= C( )q
219
E |g(z)|2q A dz
(4.5.26)
l()+n()s() +s()1
E |g(z)|2q A dz.
Note that here and in the following we denote by C any nite positive
constant independent of ( ).
2. For l() = 1 and = (j), with j {1, 2 . . . , m}, we obtain, see Krylov
(1980), Corollary 3, page 80,
2q
(j)
j
F = E
g(z)dWz A
E |g(z)|2q A dz
l()+n()s() +s()1
(4.5.27)
E |g(z)|2q A dz.
x =
g(z, v)p (dv, dz).
E
By applying It
os formula to |x |2q together with the Holder inequality,
we obtain
2q
|xz + g(z, v)|2q |xz |2q p (dv, dz)
|x | =
(2
2q1
1)
+ 22q1
(4.5.28)
E |x |2q |A (22q1 1)
E |xz |2q |A dz
+ 22q1
E |g(z, v)|2q |A (dv) dz,
(4.5.29)
where we recall that = (E) is the total intensity. Moreover, let us note
that, by the rst line of (4.5.28) and the properties of the Poisson jump
measure, we obtain
220
4 Stochastic Expansions
E |x | |A =
2q
E
E
|xz + g(z, v)|2q |xz |2q A (dv) dz,
which proves the continuity of E |x |2q |A as a function of . Therefore,
by applying the Gronwall inequality (1.2.33) to (4.5.29), we obtain
2q
2q
g(z, v)p (dv, dz) A
E |x | |A = E
E
exp (22q1 1) ( ) 22q1
exp (22q1 1) T 22q1
= C ( )q l()+n()s() +s()1
(4.5.30)
2q
(1)
F
= E
g(z, v)p (dv, dz)
g(z, v)(dv) dz A
E
22q1 E
+ E
2q
g(z, v)p (dv, dz)
2q
g(z, v)(dv) dz
A
A
C ( )
+
2q
E g(z, v)(dv)
E
A dz
C ( )q l()+n()s() +s()1
2q
221
4. Let us now consider the case l() = n + 1 and = (j1 , . . . , jn+1 ), with
jn+1 = 0. By (4.5.26) and the inductive hypothesis, we get
2q
F = E
I [g()],s ds A
( )2q1
A ds
E |I [g()],s |2q
C ( )2q1 ( )q
l()+n()s() +s()1
V,z,s() dz ds
= C ( )
q l()+n()s() +s()1
V,z,s() dz,
(4.5.31)
where the last line holds since l() = l() + 1, n() = n() + 1 and
s() = s().
5. When l() = n + 1 and = (j1 , . . . , jn+1 ), with jn+1 {1, 2, . . . , m}, by
(4.5.27) and the inductive hypothesis, we obtain
2q
F = E
I [g()],s dWsjn+1 A
2 (2q 1) ( )
q
q1
C ( )
q1
( )
2q
E |I [g()],s | A ds
q l()+n()s() +s()1
V,z,s() dz ds
= C ( )q
l()+n()s() +s()1
V,z,s() dz,
where the last line holds since l() = l() + 1, n() = n() and
s() = s().
6. Finally, let us suppose that l() = n + 1 and = (j1 , . . . , jn+1 ), with
jn+1 = 1. By (4.5.30) and the inductive hypothesis, we obtain
222
4 Stochastic Expansions
F = E
C
2q
A
E |I [g(, vs() )],s |2q A (dvs() ) ds
C ( )q l()+n()s() +s()1
C ( )
q l()+n()s() +s()1
V,z,s() dz,
(4.5.32)
where the last line holds since l() = l() + 1, n() = n() and
s() = s() + 1, and this completes the proof of the assertion (4.5.24).
The assertion (4.5.25) can be shown similarly by induction on l(). The
case of l() = 1 with = (1) has already been shown in (4.5.30). The case of
l() = n + 1, with = (j1 , . . . , jn+1 ) and jn+1 = 1, can be derived by using
(4.2.1), the H
older inequality, the inductive hypothesis, (4.5.31) and (4.5.32).
This completes the proof of Lemma 4.5.3.
Expectations of Products
Let us introduce some notation needed for the following lemma. For any z ,
let us denote by [z] the integer part of z. Moreover, for a given integer p N
we will use the set Ap of multi-indices = (j1 , . . . , jl ) of length l p with
components ji {1, 0}, for i {1, . . . , l}. For a given function h : [, ]
d , to be dened in the lemma below, and a multi-index Ap , we
consider the coecient function f dened in (4.3.9), with f (t, x) = h(t, x).
0 0
For instance, if = (1, 0, 0), then f (t, Xt ) = L1
v L L h(t, Xt ).
p,2p
([, ] d , ) the set of
Furthermore, for p N , we denote by C
functions that are p times continuously dierentiable with respect to time
and 2p times continuously dierentiable with respect to the spatial variables.
By using Lemma 4.5.3 we prove the following result similar to that in
Liu & Li (2000).
], and let and
Lemma 4.5.4
Consider Mm , p = l() [ l()+n()
2
be two stopping times with being A -measurable and 0 T almost
surely. Moreover, dene the process Z = {Zt = h(t, X t ), t [, ]}, with
positive constant
h C p,2p ([, ] d , ). Assume that there exist a nite
K such that for any Ap we have the estimate E f (t, X t )2 |A K
a.s. for t [, ]. Additionally, consider an adapted process g() H , where
223
and
(4.5.34)
m
0
Z = Z +
L h(z, X z ) dz +
Li h(z, X z ) dWzi
i=1
L1
v h(z, X z ) p (dv, dz),
I [g()], =
I [g()],z p (dv, dz)
I [g()],z (dv) dz.
0
L h(z, Xz ) I [g()],z
Z I [g()], =
h(z, Xz )
i=1
I [g()],z (dv) dz
Li h(z, X z ) I [g()],z dWzi
h(z, X z )I [g()],z
+ L1
v h(z, X z ) I [g()],z
+ L1
v h(z, X z ) I [g()],z p (dv, dz).
224
4 Stochastic Expansions
E h(z, X z ) I [g()],z (dv)|A dz
E Z I [g()], |A =
E
h(z, X z ) + L1
v h(z, X z )
I [g()],z |A (dv) dz
L0 h(z, X z ) I [g()],z |A dz
E
L1
h(z,
X
)
I
[g()]
|A
(dv)dz
z
,z
v
C ( )n+1
0
+
E L h(z, X z ) I [g()],z |A dz,(4.5.35)
z
0 0
C (z )n+1 +
E L L h(z1 , X z1 ) I [g()],z1 |A dz1 ,
for z [, ].
By using (4.5.36) in (4.5.35), we obtain
(4.5.37)
E Z I [g()], |A C ( )n+1
z2
0 0
+
E L L h(z1 , X z1 ) I [g()],z1 |A dz1 dz2 .
By applying again this procedure p 2 times and using the CauchySchwarz inequality we obtain
225
E Z I [g()], |A
C ( )n+1
z2
+
...
E fp (z1 , X z1 )I [g()],z1 |A dz1 . . . dzp
C ( )n+1
z2 -
. 12
E fp (z1 , X z1 )2 |A
+
...
-
. 12
2
E I [g()],z1 |A
dz1 . . . dzp ,
(4.5.38)
where p is dened as the multi-index with length p and all zeros, which
means l(p ) = n(p ) = p. Therefore, fp (t, X t ) denotes the stochastic process resulting from applying p times the operator L0 , see (4.3.7),
to h(t, X t ). Note that C denotes here a dierent constant from that in
(4.5.35).
Finally, by applying the result (4.5.24) of Lemma 4.5.3, with q = 1, to the
last
and considering that one has
the estimates
term in (4.5.38)
E fp (t, X t )2 |A K < and E g(t, v)2 |A K(v) a.s. for t [, ],
with K(v) being (dv)-integrable, we obtain
|E Z I [g()], |A C ( )n+1
+K
z2
...
( )
l()+n()
2
dz1 . . . dzp
C ( )l() ,
which proves the case of = (j1 , . . . , jl+1 ) with jl+1 = 1.
2. If = (j1 , . . . , jl+1 ) with jl+1 = 0, we can write
I [g()],z dz.
I [g()], =
0
i=1
Li h(z, X z ) I [g()],z dWzi
+
L1
v h(z, X z ) I [g()],z p (dv, dz).
226
4 Stochastic Expansions
E L0 h(z, X z ) I [g()],z |A dz.
z2 -
. 12
E f p (z1 , X z1 )2 |A
E Z I [g()], |A C ( )n+1 + . . .
. 12
-
2
E I [g()],z1 |A
dz1 . . . dzp
C ( )l() ,
where f p (t, X t ) denotes the stochastic process resulting from applying p
times the operator L0 , see (4.3.4), to h(t, X t ).
3. Let us nally consider the case = (j1 , . . . , jl+1 ) with jl+1 = (j) and
j {1, 2, . . . , m}. Here, we have
I [g()], =
I [g()],z dWzj .
0
i=1
Li h(z, X z ) I [g()],z dWzi
+
E
L1
v h(z, X z ) I [g()],z p (dv, dz).
E L0 h(z, X z ) I [g()],z |A dz,
227
To prove the estimate (4.5.34) we need only to check the case of l() = n+1
with = (j1 , . . . , jn+1 ) and jn+1 = 1.
By using the product rule of stochastic calculus one obtains
0
L h(z, X z ) I [g()],z dz
Z I [g()], =
Li h(z, X z ) I [g()],z dWzi
i=1
h(z, X z )I [g()],z + L1
+
v h(z, X z ) I [g()],z
+ L1
v h(z, X z ) I [g()],z p (dv, dz).
By the properties of It
o integrals and the induction hypothesis we obtain
E h(z, X z ) + L1
E Z I [g()], |A =
v h(z, X z )
E
I [g()],z |A (dv) dz
L0 h(z, X z ) I [g()],z |A dz
E
C (
L1
h(z,
X
)
I
[g()]
|A
(dv)
dz
z
,z
v
)n+1
0 h(z, X z ) I [g()],z |A dz, (4.5.39)
E L
228
4 Stochastic Expansions
q l()+n()s() +s()
= E h( ) | I [g()], | A C2 ( )
,
(4.5.41)
where the positive constants C1 and C2 do not depend on ( ).
F
Proof:
s().
2q
F = E h( ) I [g()], A
. 12 -
4q . 12
-
2
E I [g()], A
E h( ) A
C ( )q
l()+n()
.
Note that here and in the following we denote again by C any positive
constant that does not depend on ( ).
2. Let us consider the case s() = s {1, 2, . . .}.
By the relationship (4.2.12) and some straightforward estimates we obtain
1
2s()
2q
2q
F C E h( ) |I [g()], | A +
E h( ) |H,i | A ,
i=1
(4.5.42)
where terms H,i are described in Remark 4.2.1.
Since s() = s {1, 2, . . .}, we can write = 1 ! (1) ! 2 ! (1) ! . . . !
(1) ! s+1 , where s(j ) = 0 for j {1, 2, . . . , s + 1} and ! denotes the
concatenation operation on multi-indices dened in (4.2.7).
Thus,
p ( )
I1 [g()],i ,
I1 (1) [g()], =
i=p ()+1
where i , with i {1, 2, . . . , p (T )}, are the jump times generated by the
Poisson measure. Similarly, one can show that
229
p ( )
I [g()], =
i=p ()+1
ns
2q
ns+1
E h( )
I1 [g()],i I2 ;i ,i+1 . . . Is+1 ;i+s1 , A
i=1
P p ([, )) = n
e
( )
(n s + 1)
2q1
ns
n
( )
n!
. 12
E h( )2 A
ns+1
i=1
. 14 -
-
16q . 18
8q
E I2 ;i ,i+1 A
E | I1 [g()],i | A
1
-
2s+3 q . 2s+2
. . . E Is+1 ;i+s1 ,
A
( )
Ce
(n s + 1)
2q1
ns
ns+1
( )q
n
( )
n!
Ps+1
i=1 l(i )+n(i )
i=1
C ( )q(l()+n()s()) e( ) ( )s
ns
(n s + 1)2q
n ( )ns
n!
C ( )q(l()+n()s())+s() .
Note that the last line of (4.5.43) follows by
(4.5.43)
230
4 Stochastic Expansions
(n s + 1)2q
ns
n ( )ns
n!
j
( )
2q
()
(j + 1) (j + s) . . . (j + 1)
j!
j=0
s
j
2q+s ( )
(j + s)
K
j!
j=0
j
j
( )
(
)
+
j 2q+s
K
j!
j!
j=0
j=0
= K e( ) 1 + B2q+s (( ))
K e( ) ,
where we denote by Bn (x) the Bell polynomial of degree n, see Bell (1934).
To complete the proof we have to show that the bound (4.5.43) also holds
for the other terms in (4.5.42). Note that, as shown in Remark 4.2.1, the
terms H,i , for i {1, . . . , 2s() 1}, involve multiple stochastic integrals
with the same multiplicity l() of the multiple stochastic integrals I and
replace some of the integrations with respect to the Poisson jump measure p with integrations with respect to time and to the intensity measure
. Therefore, by following the steps above, one can establish the bound
(4.5.43) for all terms in (4.5.42). This completes the proof of Lemma 4.5.5.
4.6 Exercises
4.1. Use the Wagner-Platen expansion with time increment h > 0 and Wiener
process increment Wt0 +h Wt0 in the expansion part to expand the increment
Xt0 +h Xt0 of a geometric Brownian motion at time t0 , where
dXt = a Xt dt + b Xt dWt .
4.2. Expand the geometric Brownian motion from Exercise 4.1 such that all
double integrals appear in the expansion part.
4.3. For multi-indices = (0, 0, 0, 0), (1, 0, 2), (0, 2, 0, 1) determine , ,
() and n().
4.4. Write out in full the multiple It
o stochastic integrals I(0,0),t , I(1,0),t , I(1,1),t
and I(1,2),t .
4.6 Exercises
231
3
,
E (I(1,0), )2 =
3
E(I(1,0), I(1), ) =
2
.
2
4.7. For the case d = m = 1 determine the Ito coecient functions f(1,0) and
f(1,1,1) .
4.8. Which of the following sets of multi-indices are not hierarchical sets:
, {(1)}, {v, (1)}, {v, (0), (0, 1)}, {v, (0), (1), (0, 1)} ?
4.9. Determine the remainder sets that correspond to the hierarchical sets
{v, (1)} and {v, (0), (1), (0, 1)}.
4.10. Determine the truncated Wagner-Platen expansion at time t = 0 for
the solution of the It
o SDE
dXt = a(t, Xt ) dt + b(t, Xt ) dWt
using the hierarchical set A = {v, (0), (1), (1, 1)}.
4.11. In the notation of Sect. 4.2 where the component 1 denotes in a multiindex a jump term, determine for = (1, 0, 1) its length (), its number
of zeros and its number of time integrations.
4.12. For the multi-index = (1, 0, 1) derive in the notation of Sect.4.2 .
5
Introduction to Scenario Simulation
In this chapter we introduce scenario simulation methods for SDEs. We consider pathwise converging simulation methods that use discrete-time approximations. This chapter follows classical ideas, as described in Kloeden & Platen
(1999), and examines dierent numerical schemes. It also considers some examples of typical problems that can be handled by the simulation of approximating discrete-time trajectories.
(5.1.1)
for t [t0 , T ] with X(t0 ) = x0 . Deterministic ordinary dierential equations occur frequently in nancial and economic models. Even when such a
solution can be found, it may be too complicated to be calculated exactly.
It can often be more convenient or computationally ecient to evaluate a
solution numerically by using a discrete-time approximation. Practical necessity has lead to the development of numerical methods for calculating
such discrete-time approximations to the solutions of initial value problems
E. Platen, N. Bruti-Liberati, Numerical Solution of Stochastic Dierential
Equations with Jumps in Finance, Stochastic Modelling
and Applied Probability 64, DOI 10.1007/978-3-642-13694-8 5,
Springer-Verlag Berlin Heidelberg 2010
233
234
for ODEs. The most widely applicable and commonly used are the methods
in which the continuous time dierential equation is replaced by a discretetime dierence equation, generating values Y1 , Y2 , . . . , Yn , . . . to approximate
X(t1 ; t0 , x0 ), X(t2 ; t0 , x0 ), . . ., X(tn ; t0 , x0 ), . . . at given discrete times
t0 < t1 < t2 < . . . < tn < . . . .
These approximations should become accurate if the time increments n =
tn+1 tn for n {0, 1, . . .} are made suciently small.
As background for the construction of discretization methods we shall
review in this section basic dierence methods used for ODEs and discuss
their convergence and numerical stability.
Euler Method
The simplest dierence method for the approximate solution of the ODE
(5.1.1) is the Euler method
Yn+1 = Yn + a(Yn )
(5.1.2)
(5.1.3)
is generally not zero for n {0, 1, . . .}. This dierence is called the local
discretization error for the nth time step. The global discretization error at
time tn+1 is dened as
en+1 = X(tn+1 ; t0 , x0 ) Yn+1 .
(5.1.4)
This is the error obtained with respect to the exact solution X(tn+1 ; t0 , x0 ) of
the ODE (5.1.1). Theoretically the errors given in (5.1.3) and (5.1.4) assume
that we can perform all arithmetic calculations exactly. In practice, however,
a computer is restricted to a nite number of decimal places when performing
calculations. It needs to round o all excess decimal places, thus, introducing
a roundo error, which we shall denote by rn+1 for the nth time step.
Estimating the Local Discretization Error
The key to estimating the size of the discretization errors is the deterministic
Taylor formula. For a twice continuously dierentiable function Xt it follows
that
1 d2 X n 2
dXtn
+
Xtn+1 = Xtn +
(5.1.5)
dt
2 ! dt2
with some n satisfying tn < n < tn+1 . For Xt = X(t; tn , Yn ), the solution
of the ODE with Xtn = Yn has then, according to (5.1.1), the representation
1 d2 X n 2
.
2 ! dt2
235
(5.1.6)
Setting Xtn = Yn , and subtracting (5.1.2) from (5.1.6) we nd that the local
discretization error (5.1.3) has the form
ln+1 =
1 d2 X n 2
.
2 ! dt2
d2 X
If we knew that | dt2n | < M for all t in the given interval [t0 , T ], then we
would have the estimate
1
M 2
(5.1.7)
|ln+1 |
2!
for any discretization time interval [tn , tn+1 ] with t0 tn < tn+1 T . One
d2 X
can obtain a bound on | dt2n | by using the fact that
da(Xt )
a(Xt ) dXt
a(Xt )
d2 X t
d dXt
=
=
=
a(Xt ).
=
dt2
dt dt
dt
x
dt
x
a
If a and x
are continuous and all solutions Xt under consideration remain
in some closed and bounded set C, then we can use the bound
a(x)
,
M = max a(x)
X
where the maximum is taken over x C. The above bound M will usually
be a substantial overestimate. Still, from (5.1.7) we can see that the local
discretization error for the Euler method (5.1.2) is of order 2 .
Estimating the Global Discretization Error
To estimate the global discretization error we assume, for simplicity, that the
drift function a(x) satises a uniform Lipschitz condition
|a(x) a(y)| K |x y|
with some constant K < for all x, y and that the time discretization is
equidistant with step size > 0. Applying the Taylor formula (5.1.5) to the
solution Xt = X(t; t0 , x0 ) we have the representation (5.1.6) with Xtn
= Yn ,
in general. Subtracting (5.1.2) from (5.1.6) we get
en+1 = en + (a(Xtn ) a(Yn )) +
1 d2 Xtn 2
.
2 dt2
2
(5.1.8)
1
M 2
2
236
1
2
1
M 2
2
(1 + K )n 1
(1 + K ) 1
(5.1.9)
M 2 .
Since
(1 + K )n enK
we obtain
1 nK
M
(e
.
1)
2
K
Consequently, the global discretization error for the Euler method (5.1.2)
satises the estimate
|en+1 |
|en+1 |
1 K(T t0 )
M
(e
1)
2
K
(5.1.10)
237
238
2 R 1
M
1 K(T t0 )
e
+
|en+1 |
1
2
K
K
instead of (5.1.10). This illustrates that for very small time step size the
1
term with the reciprocal
dominates the estimation term. The above global
error estimate represents in some sense the worst case scenario. However, this
bound is rather indicative of the cumulative eect of the roundo error, since
for smaller more calculations are required to reach the given end time.
The presence of roundo errors means, in practice, that there is a minimum
step size min for each ODE approximation, below which one cannot improve
the accuracy of the approximations calculated by way of the simple Euler
method.
Higher Order and Implicit Methods
For introductory purposes let us say that a discrete-time approximation Y
with time step size > 0 has the order of convergence > 0 for a given ODE
if there exists a constant C < such that
e T C .
(5.1.11)
The Euler method has for an ODE the order of convergence = 1. To obtain
a more accurate discrete-time approximation we need to use other methods
with higher order of convergence.
For the Euler method we simply froze the right hand side of the ODE at the
value a(Yn ) of the last approximation point for each discretization subinterval
tn t < tn+1 . One should obtain a more accurate approximation if one could
use the average of the values at both end points of the discretization interval.
In this case we obtain the trapezoidal method
Yn+1 = Yn +
1
(a(Yn ) + a(Yn+1 )) .
2
(5.1.12)
239
This is called an implicit scheme because the unknown quantity Yn+1 appears
on both sides of equation (5.1.12). More generally, we have the family of
implicit Euler schemes
Yn+1 = Yn + a(Yn+1 ) + (1 ) a(Yn ) ,
(5.1.13)
where [0, 1] is the degree of implicitness. In the case = 1, we have the
fully implicit Euler scheme. Unfortunately, in general, the value Yn+1 cannot
be obtained algebraically. To overcome this diculty we could use the Euler
method (5.1.2) to approximate the Yn+1 term on the right hand side of the
scheme (5.1.12). This then leads to the modied trapezoidal method
Yn+1 = Yn +
1
a(Yn ) + a(Yn+1 ) ,
2
(5.1.14)
which is also known as the Heun method. It is a simple example of a predictorcorrector method with the predictor Yn+1 inserted into the corrector equation
to give the next iterate Yn+1 .
Both the trapezoidal and the modied trapezoidal methods have local
discretization errors of third order. This can be veried by using the classical
Taylor formula with a third order remainder term for the expansion at a(Yn+1 )
or a(Yn+1 ) about Yn . The global discretization error for both methods (5.1.12)
and (5.1.14) is of second order, which is again one order less than the local
discretization error.
With the standard arithmetic of a computer we repeat the previous numerical experiment for the modied trapezoidal method (5.1.14). Fig. 5.1.4
indicates that the resulting global error for the modied trapezoidal method
is proportional to 2 for 23 , whereas that of the Euler method in
Fig. 5.1.3 was proportional to .
Further higher order dierence methods can be systematically derived by
using more accurate approximations of the right hand side of the ODE (5.1.1)
over each discretization subinterval tn t tn+1 . The resulting schemes
are called one-step methods as they involve only the values Yn and Yn+1 in
addition to tn and . Explicit one-step methods are usually written in the
form
(5.1.15)
Yn+1 = Yn + (Yn , ) ,
for some function (x, ), which is called the increment function. It is a pth
order method if its global discretization error is bounded by some constant
times the pth power of . We will show below that if the functions a and
240
Fig. 5.1.4. Log-log plot for global error for modied trapezoidal method
are suciently smooth, then a (p+1)th order local discretization error implies
a pth order global discretization error. For example, the Euler method (5.1.2)
is a rst order one-step scheme with (x, ) = a(x) and the Heun method
(5.1.14) is a rst order one-step scheme with
1
(x, ) =
a(x) + a(x + a(x) ) .
2
The increment function cannot be chosen completely arbitrarily. For instance, it should be consistent with the dierential equation, that is, it should
satisfy the limit
lim (x, ) = a(x),
0
if the values calculated from (5.1.15) are to converge to the solution of the
ODE. The Euler method is obviously consistent and the Heun method is
consistent when a(x) is continuous.
Extrapolation
One can sometimes obtain higher order accuracy from a one-step scheme by
the method of extrapolation. For example, suppose we use the Euler scheme
T
on the interval t [0, T ]. If XT is
(5.1.2) with N equal time steps = N
the true value at time T and YN () the corresponding value from the Euler
scheme, then we have
YN () = XT + eT + O(2 ),
(5.1.16)
where we have written the global truncation error as eT +O(2 ). If, instead,
we use the Euler scheme with 2N time steps of equal length
2 , then we have
1
1
= XT + eT + O(2 ).
Y2N
(5.1.17)
2
2
241
We can eliminate the leading error term eT from (5.1.16) and (5.1.17) by
forming the linear combination
1
YN () + O(2 ).
XT = 2 Y2N
(5.1.18)
2
Thus we have a second order approximation
1
ZN () = 2 Y2N
YN ()
2
(5.1.19)
for XT using only the rst order Euler scheme. Of course, this requires repeating the Euler scheme calculations using half the original time step size.
However, for a complicated ODE it may involve fewer and simpler calculations than a second order one-step scheme would require. This extrapolation
method is known as Richardson extrapolation. It can also be applied to more
general one-step schemes.
Truncated Taylor Methods
A one-step dierence method with local discretization error of order p + 1 is
readily suggested by the classical Taylor formula with (p + 1)th order remainder term
1 dp Xtn p
dp+1 Xn p+1
dXtn
1
+ ... +
+
,
p
dt
p ! dt
(p + 1)! dtp+1
(5.1.20)
where tn < n < tn+1 and = tn+1 tn , for p N . One can apply this Taylor
expansion to a solution Xt of the ODE (5.1.1) if the drift function a(x) and
its derivatives of orders up to and including p are continuous, as this assures
that Xt is (p + 1) times continuously dierentiable. Indeed, from (5.1.1) and
the chain rule one obtains by repeatedly dierentiating a(Xt ) the derivatives
dXt
d2 Xt
dt = a (Xt ), dt2 = a (Xt ), . . .. Evaluating these terms at Yn and omitting
the remainder term in (5.1.20) we obtain a one-step method for Yn+1 , which
we shall call the pth order truncated Taylor method. This method, obviously,
has a local discretization error of order p + 1. Under appropriate conditions
it can be shown to have global discretization error of order p. The rst order
truncated Taylor method is then just the Euler method (5.1.2), the second
order truncated Taylor method is
Xtn+1 = Xtn +
1
(a (Yn ) a(Yn )) 2
2!
and the third order truncated Taylor method has the form
Yn+1 = Yn + a(Yn ) +
Yn+1 = Yn + a(Yn ) +
+
(5.1.21)
1
(a (Yn ) a(Yn )) 2
2!
1
a (Yn ) a2 (Yn ) + (a (Yn ))2 a(Yn ) 3 .
3!
(5.1.22)
242
(5.1.23)
(5.1.24)
1 2
a (x) (a(x))2 2
2
neglecting higher order terms. Hence, subtracting (5.1.23) with this expansion
for evaluated at (Yn , ) from the third order truncated Taylor method
(5.1.22) we obtain the expression
1
a (Yn ) a(Yn ) 2
(Yn , ) (1 ) a(Yn ) +
2!
1 1
1
2
a (Yn ) (a(Yn ))2 3 + (a (Yn ))2 a(Yn ) 3 ,
+
2 3
6
243
= en + ( (Yn , ) (Xtn , )) + (Xtn , ) + Xtn Xtn+1 ,
where the very last term is the local discretization error. From the global
Lipschitz condition
| (x , ) (x, )| K (|x x| + | |)
(5.1.27)
and using the assumption that the local discretization error is of order p + 1
we obtain
|en+1 | = |en | + K |en | + D p+1
= (1 + K ) |en | + D p+1 .
Here D is some positive constant, and it follows that
D K(T t0 )
e
|en |
1 p
K
on the interval [t0 , T ]. Under appropriate conditions, the global discretization
error is thus of order p if the local discretization error is of order p + 1.
244
Fig. 5.1.5. Log-log plot for global error the Euler method
Numerical Stability
We may still encounter diculties when trying to implement a dierence
method which is known to be convergent. For example, the dierential equation
dXt
= Xt
(5.1.28)
dt
has exact solution Xt = x0 et , which converges rapidly to zero. For this
dierential equation the Euler method with constant time step size ,
Yn+1 = (1 ) Yn ,
has exact iterates
Yn = (1 )n Y0 .
If we choose > 24 for = 33 these iterates oscillate, as shown in Fig. 5.1.5,
instead of converging to zero like the exact solutions of (5.1.28). This is a
simple example of a numerical instability, which, in this particular case, can
be overcome by taking a time step < 24 , see Fig. 5.1.5. For some other
methods or ODEs the numerical instabilities may persist no matter how small
one makes . The structure of some methods can make them intrinsically
unstable, causing small errors such as roundo errors to grow rapidly and
ultimately rendering the computation useless.
The idea of numerical stability of a one-step method is that errors will
remain bounded with respect to an initial error for any ODE of type (5.1.1)
with right hand side a(x) satisfying a Lipschitz condition. To be specic, let
Yn and Yn denote the approximations obtained by the method (5.1.26) when
starting in Y0 and Y0 , respectively. We say that a discrete-time approximation
is numerically stable if there exist nite positive constants and M such that
|Yn Yn | M |Y0 Y0 |
(5.1.29)
245
(5.1.30)
for any two solutions y and y of (5.1.26) corresponding to any time discretization with < a .
It is easy to see that the Euler method is asymptotically numerically stable
for the ODE (5.1.28) with a 24 and = 32. For larger step sizes the
method may fail to be asymptotically numerically stable. The numerical stability of a method has to have priority over potential higher order convergence
properties. We will see that the numerical problems we discussed for solving
ODEs arise in a similar fashion for SDEs.
246
However, one can use also natural or physical random numbers, see Pivitt
(1999), and even observed random returns, log-returns or increments from
observed asset prices. The latter allows us to perform historical simulations,
that is, to generate reasonably realistic paths of securities and portfolios.
If such simulation is focussed on extreme scenarios, then it is called stress
testing. In insurance, the area of dynamic nancial analysis is attracting increasing attention, see Kaufmann, Gadmer & Klett (2001). The combination
of the benchmark approach with scenario simulation of entire market models
appears to be rather promising. In the following we will give an introduction
to scenario simulation, which aims to approximate the path of a solution of
a given SDE by the path of a discrete-time approximation.
Discrete-Time Approximation
Consider a given discretization 0 = t0 < t1 < < tn < < tN = T < of
the time interval [0, T ]. We shall approximate a process X = {Xt , t [0, T ]}
satisfying the one-dimensional SDE
dXt = a(t, Xt ) dt + b(t, Xt ) dWt
(5.2.1)
on t [0, T ] with initial value X0 . For simplicity, we focus in this introduction on the case of SDEs without jumps.
Let us rst introduce one of the simplest, reasonable discrete-time approximations, which is the Euler scheme or Euler-Maruyama scheme, see
Maruyama (1955). An Euler scheme is a continuous time stochastic process
Y = {Yt , t [0, T ]} satisfying the recursive algorithm
(5.2.2)
Yn+1 = Yn + a(tn , Yn ) (tn+1 tn ) + b(tn , Yn ) Wtn+1 Wtn ,
for n {0, 1, . . . , N 1} with initial value
Y0 = X0 ,
(5.2.3)
Yn = Ytn
(5.2.4)
max
n{0,1,...,N1}
(5.2.6)
the maximum step size. For much of this book we shall consider, for simplicity,
equidistant time discretizations with
tn = n
(5.2.7)
247
T
for n = = N
and some integer N large enough so that (0, 1).
The sequence (Yn )n{0,1,...,N } of values of the Euler approximation (5.2.2)
at the instants of the time discretization t0 , t1 , . . . , tN can be recursively computed. We need to generate the random increments
Wn = Wtn+1 Wtn ,
(5.2.8)
and variance
E (Wn ) = 0
(5.2.9)
E (Wn )2 = n .
(5.2.10)
For the increments (5.2.8) of the Wiener process we shall use a sequence of
independent Gaussian random numbers.
For describing a numerical scheme we will typically use the abbreviation
f = f (tn , Yn )
(5.2.11)
(5.2.12)
248
Yt = Ynt
(5.2.14)
(5.2.15)
(5.2.16)
(5.2.17)
for t [0, T ] with initial value X0 > 0. This is also called the Black-Scholes
SDE. The geometric Brownian motion X has drift coecient
a(t, x) = a x
(5.2.18)
b(t, x) = b x.
(5.2.19)
249
Scenario Simulation
Knowing the solution (5.2.20) of the SDE (5.2.17) explicitly, gives us the
possibility of comparing the Euler approximation Y , see (5.2.12), with the
exact solution X and hence calculate the appropriate errors.
To simulate a trajectory of the Euler approximation for a given time discretization we start from the initial value Y0 = X0 , and proceed recursively
by generating the next value
Yn+1 = Yn + a Yn n + b Yn Wn
(5.2.21)
(5.2.22)
250
Fig. 5.2.1. Euler approximations for = 0.25 and = 0.0625 and exact solution
for Black-Scholes SDE
can, for instance, make the Euler approximation (5.2.21) negative. This is
not consistent with our aim of modeling positive asset prices by a BlackScholes model. According to (5.2.23) these prices have to stay nonnegative.
The possibility of generating negative asset prices is a quite serious defect in
the Euler method when directly applied to the Black-Scholes dynamics. For
extremely small step sizes this phenomenon becomes highly unlikely, however,
it is never excluded and one may have to consider alternative ways of avoiding
such a situation. One way is demonstrated in Sect. 2.1. However, for other
SDEs this way is not available.
For Black-Scholes dynamics the potential negativity can be easily avoided
if one simulates rst the logarithm of the geometric Brownian motion by a
corresponding Euler scheme and then derives the approximate value of X
by taking the exponential of the approximated logarithm. One notes that it
is worth rst carefully analyzing the underlying model dynamics and only
afterwards choosing an appropriate way of simulating the desired scenarios.
By simulating transforms of a model, as discussed in Chap. 2 when using
simple transformed Wiener processes, it is possible to make the simulation
highly accurate. Furthermore, such an approach helps to respect conservation
laws, such as the nonnegativity of asset prices.
We emphasize that one should be careful in interpreting the simulation of
an approximate discrete-time scenario as being suciently accurate. It may
in certain cases signicantly dier from the exact solution due to potential
numerical instabilities, as will be discussed in Chap. 14. Later we will address some problems that result from numerical instabilities which can make
a simulation useless. Small errors always occur naturally. However, if these
are uncontrolled they can propagate and may result in prohibitive errors.
251
Types of Approximation
So far we have not specied a criterion for the classication of the overall
accuracy of a discrete-time approximation with respect to vanishing time step
size. Such a criterion should reect the main goal of a simulation. It turns out
that there are two basic types of tasks that are typical for the simulation of
solutions of SDEs.
The rst type, which we have discussed previously, is called scenario simulation and applies in situations, where a pathwise approximation is required.
For instance, scenario simulation is needed for judging the overall behavior of
modeled quantities, for hedge simulations for dynamic nancial analysis and
for the testing of calibration methods and statistical estimators, see Chap. 9.
The second type is called Monte Carlo simulation and focuses on approximating probabilities and thus expectations of functionals of the approximated
process. For instance, via Monte Carlo methods one simulates probabilities,
moments, prices of contingent claims, expected utilities and risk measures.
The second type is highly relevant for many quantitative methods in nance
and insurance. It is generally used when quantities cannot be analytically determined. The rst type of tasks is relevant for verifying quantitative methods
and extracting information from models and the observed data, as is the case
in ltering, see Chap. 10.
In this and the following chapters we concentrate on the rst type of
tasks, the scenario simulation, which we relate to the strong approximation
of solutions of SDEs. Later we will separately study in Chap. 11 Monte Carlo
methods, which we will then relate to the weak approximation of SDEs.
Order of Strong Convergence
In general, one does not know much about the solution of a given SDE. However, one can try to use a scenario simulation to discover some of its properties.
Theoretically one can estimate and in some cases also practically calculate the
error of an approximation using the following absolute error criterion. For a
given maximum step size it is dened as the expectation of the absolute
dierence between the discrete-time approximation YT and the solution XT
of the SDE at the time horizon T , that is
(5.2.24)
() = E XT YT .
This provides some measure for the pathwise closeness between approximation
and exact value at the end of the time interval [0, T ] and its dependence on
the maximum step size of the time discretization.
While the Euler approximation is the simplest, useful discrete-time approximation, it is, generally, not very ecient numerically. We shall therefore
derive and investigate other discrete-time approximations in the following sections. In order to assess and classify dierent discrete-time approximations,
we introduce their order of strong convergence.
252
d
d
m
1 k,j ,j
+
ak
+
b b
,
k
k
t
x
2
x x
j=1
k=1
(5.3.1)
k,=1
+
ak
t
xk
d
L0 =
(5.3.2)
k=1
and
Lj = L j =
d
xk
(5.3.3)
m
1 j k,j
L b
2 j=1
(5.3.4)
k=1
bk,j
253
for k {1, 2, . . . , d}. Here the functions ak and bk,j depend on the time and
the state variable. In addition, similar as to Chap. 4, we shall abbreviate, for
the nth discretization interval, multiple It
o integrals by
tn+1
s2
I(j1 ,...,j ) =
...
dWsj11 . . . dWsj ,
(5.3.5)
tn
tn
tn+1
s2
J(j1 ,...,j ) =
...
dWsj11 . . . dWsj ,
tn
(5.3.6)
tn
(5.3.7)
for all t [0, T ]. In (5.3.5) and (5.3.6) we suppress the dependence on tn and
. Similarly, as introduced in (5.2.11), we will use in the following schemes the
abbreviation f = f (tn , Yn ), for n {0, 1, . . .} and any function f dened on
[0, T ]d . Furthermore, we will usually not explicitly mention the initial value
Y 0 = X 0 . These conventions simplify the formulae for the various schemes
that we will discuss later on.
If we refer in this section to the order of convergence of a scheme, then
we mean the order of strong convergence of the corresponding discrete-time
approximation. Finally, we suppose that the vector process X under consideration satises the It
o SDE
dX t = a(t, X t ) dt +
m
bj (t, X t ) dWtj ,
(5.3.8)
j=1
m
bj (t, X t ) dWtj
(5.3.9)
j=1
for t [0, T ] with X 0 d and adjusted drift a, see (4.1.11) and Kloeden &
Platen (1999).
Euler Scheme
Let us begin with the Euler scheme (5.2.21), which has been already mentioned
in the previous section. It represents the simplest strong Taylor approximation
and, as we shall see, attains the order of strong convergence = 0.5.
In the one-dimensional case, d = m = 1, the Euler scheme has the form
254
Yn+1 = Yn + a + b W,
(5.3.10)
= tn+1 tn
(5.3.11)
where
is the length of the time discretization subinterval [tn , tn+1 ] and we write in
the following
(5.3.12)
W = Wn = Wtn+1 Wtn
for the N (0, ) independent Gaussian distributed increment of the Wiener
process W on [tn , tn+1 ]. Note that we use here and in the following our abbreviation, where we suppress in the coecient functions the dependence on
time tn and approximate value Yn of the last discretization point. We also
suppress the dependence of the random Wiener process increments on n.
In the multi-dimensional case with scalar noise m = 1 and d {1, 2, . . .},
the kth component of the Euler scheme is given by
k
= Ynk + ak + bk W,
Yn+1
(5.3.13)
for k {1, 2, . . . , d}, where the drift and diusion coecients are d-dimensional
vectors a = (a1 , . . ., ad ) and b = (b1 , . . ., bd ) .
For the general multi-dimensional case with d, m {1, 2, . . .} the kth
component of the Euler scheme has the form
k
= Ynk + ak +
Yn+1
m
bk,j W j .
(5.3.14)
j=1
Here
W j = Wtjn+1 Wtjn
(5.3.15)
is the N (0, ) independent Gaussian distributed increment of the jth component of the m-dimensional standard Wiener process W on [tn , tn+1 ], and
W j1 and W j2 are independent for j1
= j2 . The diusion coecient b =
[bk,j ]d,m
k,j=1 is here a dm-matrix.
The Euler scheme (5.3.10) corresponds to a truncated Wagner-Platen expansion, see (4.4.2) or (4.4.7), containing only the time and Wiener integrals
of multiplicity one. Assuming Lipschitz and linear growth conditions on the
coecients a and b, we will see that the Euler approximation is of strong
order = 0.5.
In special cases, this means for specic SDEs, the Euler scheme may actually sometimes achieve a higher order of strong convergence. For example,
when the noise is additive, that is when the diusion coecient is a dierentiable, deterministic function of time and has the form
b(t, x) = bt
(5.3.16)
for all (t, x) [0, T ] d , then the Euler scheme has typically the order of
strong convergence = 1.0.
255
Fig. 5.3.1. Log-log plot of the absolute error () for an Euler scheme against
ln()
The Euler scheme gives reasonable numerical results when the drift and
diusion coecients are nearly constant and the time step size is suciently
small. In general however, and if possible, the use of more sophisticated higher
order schemes is recommended, which we will introduce below.
A Simulation Example
Let us check experimentally the order = 0.5 of strong convergence for the
Euler scheme in our standard example. We suppose that X satises the BlackScholes SDE
dXt = Xt dt + Xt dWt
(5.3.17)
for t [0, T ] with initial value X0 > 0, appreciation rate > 0 and volatility
> 0.
In this case X is a geometric Brownian motion and we compare the Euler
approximation with the exact solution
2
(5.3.18)
XT = X0 exp
T + WT ,
2
see (5.2.20). In the corresponding log-log plot we show in Fig. 5.3.1 for the
Euler scheme the logarithm of the absolute error ln(()) against ln(). We
observe in this gure how the absolute error () = E(|XT YN |) behaves
for increasing time step size. We have used the following default parameters:
X0 = 1, = 0.06, = 0.2 and T = 1. Furthermore, we generated 5000
simulations, where the time step size ranges from 0.01 to 1. One observes
in Fig. 5.3.1 an almost perfect line. We omit any condence intervals since
they are quite small and can be further reduced by increasing the number of
simulations.
256
Regressing the log of the absolute error on the log of the size of the time
step size provides the following tted line
ln(()) = 3.86933 + 0.46739 ln().
The empirical order of strong convergence is with about 0.47 close to
the theoretical strong order of = 0.5 for the Euler scheme.
Milstein Scheme
We shall now introduce a scheme, the Milstein scheme, which turns out to
be of strong order = 1.0. It is obtained by including one additional term
from the Wagner-Platen expansion (4.4.7). This scheme was rst suggested in
Milstein (1974).
The Milstein scheme has in the one-dimensional case with d = m = 1 the
form
1
Yn+1 = Yn + a + b W + b b (W )2 .
(5.3.19)
2
Recall that we suppress here in the coecient functions the dependence on tn
and Yn . The additional term in the Milstein scheme, when compared with the
Euler scheme, marks the point of divergence of stochastic numerical analysis
from deterministic numerics. In some sense one can regard the Milstein scheme
as the proper stochastic generalization of the deterministic Euler scheme for
the strong convergence criterion. As we will show later, it gives the same order
= 1.0 of strong convergence as the Euler scheme does in the deterministic
case.
In the multi-dimensional case with m = 1 and d {1, 2, . . .} the kth
component of the Milstein scheme is given by
d
bk 1
k
(W )2 .
= Ynk + ak + bk W +
b
(5.3.20)
Yn+1
x
2
=1
In the general multi-dimensional case with d, m {1, 2, . . .} the kth component of the Milstein scheme has the form
k
= Ynk + ak +
Yn+1
m
bk,j W j +
j=1
m
(5.3.21)
j1 ,j2 =1
m
j=1
bk,j W j +
m
j1 ,j2 =1
(5.3.22)
257
Fig. 5.3.2. Log-log plot of the absolute error against log-step size for Milstein
scheme
tn+1
s1
tn
dWsj21 dWsj12 ,
(5.3.23)
appearing above, are equal. They cannot be simply expressed in terms of the
increments W j1 and W j2 of the components of the Wiener process and
some eort has to be made to generate or approximate these, as we will discuss
later. However, in the case j1 = j2 we have
2
1
1
(W j1 )2
W j1 ,
I(j1 ,j1 ) =
and J(j1 ,j1 ) =
(5.3.24)
2
2
for j1 {1, 2, . . . , m}, see (4.2.16).
Let us generate for the Milstein scheme the corresponding log-log plot of
the logarithm of the absolute error against the logarithm of the step size for
the previous example (5.3.17). We can see in Fig. 5.3.2 the log-absolute error
for a geometric Brownian motion when approximated with a Milstein scheme
by using the same default parameters as in Fig. 5.3.1.
Regressing the log of the absolute error on the log of the time step size we
obtain the following tted line
ln(()) = 4.91021 + 0.95 ln().
The experimentally determined estimate for the order of strong convergence
is about 0.95 and is thus close to the theoretical value of = 1.0.
Diusion Commutativity Condition
For some important dynamics in nancial modeling the corresponding SDEs
have special structural properties, which allow the Milstein scheme to be sim-
258
plied. This avoids in some cases the use of double integrals of the type (5.3.23)
that involve dierent Wiener processes. For instance, in the case with additive
noise, that is when the diusion coecients depend at most on time t and not
on the variable x, the Milstein scheme (5.3.19) reduces to the Euler scheme
(5.3.10).
Another important special case is that of commutative noise for which the
diusion matrix satises the diusion commutativity condition
Lj1 bk,j2 (t, x) = Lj2 bk,j1 (t, x)
(5.3.25)
for all j1 , j2 {1, 2, . . . , m}, k {1, 2, . . . , d} and (t, x) [0, T ]d . For instance, the diusion commutativity condition is for asset prices satised if one
simulates systems of Black-Scholes SDEs, see (1.7.25) and (1.7.30). Systems
of SDEs with additive noise also satisfy the diusion commutativity condition
(5.3.25). Obviously, the diusion commutativity condition is satised if the
given SDE is driven by a single Wiener process only.
It turns out that the Milstein scheme under the diusion commutativity
condition can be written as
k
= Ynk + ak +
Yn+1
m
1 j1 k,j2
L b W j1 W j2
2 j ,j =1
m
bk,j W j +
j=1
(5.3.26)
for k {1, 2, . . . , d}. This scheme does not use any double integrals as given
in (5.3.23).
A Black-Scholes Example
A typical example is given by the simulation of correlated Black-Scholes dynamics with d = m = 2, where the rst risky security satises the SDE
/
0
dXt1 = Xt1 r dt + 1 (1 dt + dWt1 ) + 2 (2 dt + dWt2 )
1 1 2
( ) + (2 )2 dt + 1 dWt1 + 2 dWt2
= Xt1 r +
2
and the second risky security is given by the SDE
/
0
dXt2 = Xt2 r dt + (1 1,1 ) (1 dt + dWt1 )
1 1 1 1,1
+
dt + (1 1,1 ) dWt1
= Xt2 r + (1 1,1 )
2
2
for t [0, T ]. Here W 1 and W 2 are independent Wiener processes and r,
1 , 2 , 1,1 > 0 are assumed to be strictly positive constants. Obviously, we
obtain
L1 b1,2 =
2
k=1
1,2
1
1 2
b
=
=
bk,2 k b1,1 = L2 b1,1
t
xk
x
2
bk,1
k=1
and
L1 b2,2 =
2
k=1
259
2,2
b =0=
bk,2 k b2,1 = L2 b2,1
k
x
x
2
bk,1
k=1
such that the diusion commutativity condition (5.3.25) is satised. The corresponding Milstein scheme is then of the form
1 1 2
1
( ) + (2 )2 + 1 W 1 + 2 W 2
= Yn1 + Yn1 r +
Yn+1
2
1
1
+ (1 )2 (W 1 )2 + (2 )2 (W 2 )2 + 1 2 W 1 W 2 (5.3.27)
2
2
1
2
Yn+1
= Yn2 + Yn2
r + (1 1,1 ) (1 + 1,1 ) + (1 1,1 ) W 1
2
1
+ (1 1,1 )2 (W 1 )2 .
(5.3.28)
2
Note that if the diusion commutativity condition (5.3.25) were not satised,
then multiple stochastic integrals with respect to the two Wiener processes
would appear. This kind of situation will be discussed later. We remark that
the scheme (5.3.27)(5.3.28) may produce negative trajectories, which could
create severe problems for certain applications. As shown in Sect. 2.1, there
exist alternative methods that keep the approximate security prices under the
Black-Scholes model positive.
A Square Root Process Example
As another example let us consider the simulation of a real valued square root
process X of dimension , see (2.1.2), with SDE
1
Xt dt + Xt dWt
(5.3.29)
dXt =
4
for t [0, T ] and X0 = 1 for > 2. Since there is only one driving Wiener
process the noise is commutative. The Milstein scheme here has according to
(5.3.19), (5.3.20) or (5.3.26) the form
1
1
Yn + Yn W +
W 2 .
(5.3.30)
Yn+1 = Yn +
4
4
This scheme generates, for the parameters = 4, = 0.048, T = 100 and
step size = 0.05 the trajectory exhibited in Fig. 5.3.3. Note that the above
scheme (5.3.30) can generate a negative trajectory, which may cause problems
in applications. Alternative methods that preserve nonnegativity have been
described in Chap. 2. Furthermore, we remark that the strong convergence results, as given in Kloeden & Platen (1999), require Lipschitz continuity of the
diusion coecient in (5.3.29), which is not given. Still the scheme performs
quite well when the trajectory does not come too close to zero.
260
+
(
)
J(j
j
j
p
j
,p
j
j
,p
j
1
2
2
1
1 ,j2 )
2 1 2
p
1
+
2 j2 + j2 ,r j2 ,r 2 j1 + j1 ,r , (5.3.31)
j1 ,r
2 r=1 r
where
p =
p
1 1
1
2
12 2 r=1 r2
(5.3.32)
and j , j,p , j,r and j,r are independent N (0, 1) Gaussian distributed random
variables with
1
j = W j
(5.3.33)
for j {1, 2, . . . , m}, r {1, . . . p} and p {1, 2, . . .}. The number of terms p
p
and J(j1 ,j2 ) .
in the expansion (5.3.31) inuences the dierence between J(j
1 ,j2 )
It has been shown in Kloeden & Platen (1999) that if one chooses
p = p()
(5.3.34)
261
for some positive constant K, then one obtains the order of strong convergence
= 1.0 for the Milstein scheme with the above approximation of the double
stochastic integral (5.3.23).
An alternative way to approximate these multiple stochastic integrals is
the following:
We apply the Euler scheme to the SDE which describes the dynamics of
the required multiple stochastic integrals with a time step size , smaller than
the original time step size . Since the Euler scheme has strong order of
convergence = 0.5, by choosing = 2 , the strong order = 1.0 of the
scheme is preserved, see Milstein (1995a) and Kloeden (2002).
Another way to obtain the desired double Wiener integrals is the technique
of Gaines & Lyons (1994) who suggest sampling the distribution of the Levy
area I(j1 ,j2 ) I(j2 ,j1 ) , j1
= j2 , conditional on W j1 and W j2 , which is
known. Unfortunately, this technique does not apply for m > 2.
Strong Order 1.5 Taylor Scheme
There are certain simulation tasks that require more accurate schemes than
the Milstein scheme. For instance, if one wants to study the distribution of
extreme log-returns one needs higher order schemes to capture reasonably
well the behavior of the trajectory in the tails of the log-return distribution.
In general, we obtain more accurate strong Taylor schemes by including further multiple stochastic integrals from the Wagner-Platen expansion into the
scheme. Each of these multiple stochastic integrals contains additional information about the sample path of the driving Wiener process. The necessity
of the inclusion of further multiple stochastic integrals to achieve higher order convergence is a fundamental feature of the numerical analysis of SDEs
compared to that of deterministic ODEs. For ODEs it was enough to have
sucient smoothness in the drift coecient to obtain higher order schemes.
By adding more terms from the expansion (4.4.7) to the Milstein scheme
one obtains, as in Platen & Wagner (1982), in the autonomous one-dimensional
case d = m = 1 the strong order 1.5 Taylor scheme
1
b b (W )2 + a b Z
2
1 2
1 2
1
2
+
a a + b a + a b + b b {W Z}
2
2
2
1
1
(W )2 W.
(5.3.35)
+ b b b + (b )2
2
3
Yn+1 = Yn + a + b W +
tn+1
s2
dWs1 ds2 .
(5.3.36)
Z = I(1,0) =
tn
tn
262
One can show that Z is Gaussian distributed with mean E(Z) = 0, variance E((Z)2 ) = 13 3 and covariance E(Z W ) = 12 2 .
Note that a pair of appropriately correlated normally distributed random
variables (W, Z) can be constructed from two independent N (0, 1) standard Gaussian distributed random variables U1 and U2 by means of the linear
transformation
1 3
1
2
Z =
W = U1 ,
U1 + U2 .
(5.3.37)
2
3
All multiple stochastic integrals that appear in (4.4.7) and which are also built
into (5.3.35) can be expressed in terms of , W and Z. In particular, the
last term in (5.3.35) contains the triple Wiener integral
1 1
(W 1 )2 W 1 ,
I(1,1,1) =
(5.3.38)
2 3
as described by the corresponding Hermite polynomial given in (4.2.17).
In the multi-dimensional case with d {1, 2, . . .} and m = 1 the kth
component of the strong order 1.5 Taylor scheme is given by
k
Yn+1
= Ynk + ak + bk W
1 1 k
L b {(W )2 } + L1 ak Z
2
1
+ L0 bk {W Z} + L0 ak 2
2
1 1 1 k 1
2
(W ) W.
+ L L b
2
3
+
(5.3.39)
In the general multi-dimensional case with d, m {1, 2, . . .} the kth component of the strong order 1.5 Taylor scheme takes the form
k
Yn+1
= Ynk + ak +
m
1 0 k 2
L a
2
j=1
m
j1 ,j2 =1
m
j1 ,j2 ,j3 =1
263
Approximate Multiple It
o Integrals
As with the general multi-dimensional version of the Milstein scheme (5.3.21),
we also have to approximate some multiple stochastic integrals involving different components of the Wiener process. Otherwise, it is dicult to implement the above schemes in practice for m > 1, because there are no explicit
expressions for the multiple stochastic integrals directly available. We continue the approach taken in (5.3.31), which is based on a Brownian bridge
type expansion of the Wiener process, see Kloeden & Platen (1999).
Let j , j,1 , . . . , j,p , j,1 , . . . , j,p , j,p and j,p be independent standard
Gaussian random variables. Then for j, j1 , j2 , j3 {1, 2, . . . , m} and some
p {1, 2, . . .} we set
I(j) = W j =
j ,
with
I(j,0) =
1
j + aj,0
2
(5.3.41)
aj,0 =
p
2 1
j,r 2
r=1 r
where
p =
p j,p ,
p
1 1
1
2
,
12 2 r=1 r2
1
(W j )2 ,
I(j,j) =
I(0,j) = W j I(j,0) ,
2
1 1
(W j )2 W j ,
I(j,j,j) =
2 3
1
1
p
p
1 1
(j1 ,r j2 ,r j1 ,r j2 ,r ) ,
2 r=1 r
and
p
p
= J(j
I(j
1 ,j2 ,j3 )
1 ,j2 ,j3 )
1
1{j1 =j2 } I(0,j3 ) + 1{j2 =j3 } I(j1 ,0)
2
with
1
1
1
p
p
p
bj1 j2 j3
= j1 J(0,j
+ aj1 ,0 J(j
+
J(j
,j
)
1 ,j2 ,j3 )
2 ,j3 )
2
3
2
2
3
3
3
1 p
p
p
Aj1 ,j2 Cj2 ,j1 + 2 Djp1 ,j2 ,j3
2 j2 Bj1 ,j3 + 2 j3
2
with
264
p
1
5
2 2 2
r,=1
1
[j , (j3 ,+r j1 ,r j1 ,r j1 ,+r )
l( + r) 2
+ j2 , (j1 ,r j3 ,+r + j1 ,r j3 ,+r )]
+
p
1
1
5
2 2 2
=1 r=1
1
[j , (j1 ,r j3 ,r + j3 ,r j1 ,r )
r( r) 2
j2 , (j1 ,r j3 ,r j1 ,r j3 ,r )]
+
p
2p
1
2 2
5
2
=1 r=+1
1
[j , (j3 ,r j1 ,r j1 ,r j3 ,r )
r(r ) 2
+ j2 , (j1 ,r j3 ,r + j1 ,r j3 ,r )] ,
where, for r > p we set j,r = 0 and j,r = 0 for all j {1, 2, . . . , m}. Here
1{A} denotes the indicator function for the event A.
One can show, see Kloeden & Platen (1999), that if one chooses p so that
p = p()
K
2
(5.3.42)
for an appropriate positive constant K, then the scheme (5.3.40) with the
above approximations of multiple stochastic integrals converges with strong
order = 1.5.
We remark, as mentioned earlier, that one can alternatively take a very
ne time discretization of each time interval [tn , tn+1 ] and simply approximate
the multiple It
o integrals as suggested by the denition of the It
o integral.
Strong Order 2.0 Taylor Scheme
We now use the Stratonovich-Taylor expansion from Kloeden & Platen (1999),
and retain those terms which are needed to obtain a strong order 2.0 scheme.
In the one-dimensional case d = m = 1 with time independent a and b the
strong order 2.0 Taylor scheme has the form
Yn+1 = Yn + a + b W +
1
b b (W )2 + b a Z
2!
1
a a 2 + a b {W Z}
2
1
1
+ b (b b ) (W )3 + b b (b b ) (W )4
3!
4!
+
(5.3.43)
The Gaussian random variables W and Z here are the same as given in
(5.3.37), where Z is dened in (5.3.36).
265
p
=
J(0,1,1)
p
=
J(1,1,0)
1 ,
p
Z = J(1,0)
=
1
1 + a1,0 ,
2
1 2 2 1
1 3
p
1 a21,0 + 2 1 b1 2 B1,1
,
3!
4
3
1 2 2
1
1 3
p
p
1
2 1 b1 + 2 B1,1
2 a1,0 1 + 2 C1,1
,
3!
2
4
3
1 2 2 1
1
1 3
p
1 + a21,0
2 1 b1 + 2 a1,0 1 2 C1,1
(5.3.44)
3!
4
2
4
with
a1,0 =
p
1
1
1,r 2 p 1,p ,
2
r
r=1
p =
)
b1 =
p
1
1,r +
2 r=1 r2
p =
p
B1,1
and
p
=
C1,1
p
1 1
1
2
,
12 2 r=1 r2
p 1,p ,
p
1 1
2
2
,
180 2 r=1 r4
p
1 1 2
2
+ 1,r
=
4 2 r=1 r2 1,r
p
1
r
1
l
1,r 1,
1,r 1, .
2 2 r,l=1 r2 l2 l
r
r=l
Here 1 , 1,r 1,r , 1,p and 1,p denote for r {1, . . ., p} and p {1, 2, . . .}
independent standard Gaussian random variables.
266
1 1 k
L b (W )2 + L1 ak Z
2!
1 0 k 2
L a + L0 bk {W Z}
2
1
1
+ L1 L1 bk (W )3 + L1 L1 L1 bk (W )4
3!
4!
(5.3.45)
In the general multi-dimensional case with d, m {1, 2, . . .} the kth component of the strong order 2.0 Taylor scheme has the form
k
Yn+1
= Ynk + ak +
m
m
1 0 k 2 k,j
L a +
b W j + L0 bk,j J(0,j) + Lj ak J(j,0)
2
j=1
j1 ,j2 =1
j1
0 k,j2
+L L b
m
j1 ,j2 ,j3 =1
j1
j2 k
m
(5.3.46)
267
The resulting schemes are similar to Runge-Kutta schemes for ODEs, however,
it must be emphasized that they cannot simply be constructed as heuristic
generalizations of deterministic Runge-Kutta schemes. The extra terms in the
Ito formula that occur additionally to the typical terms in the deterministic
chain rule make it necessary to develop an appropriate class of derivative-free
schemes.
Explicit Strong Order 1.0 Schemes
In a strong Taylor approximation derivatives of various orders of the drift
and diusion coecients must be evaluated at each step. This makes the
construction and implementation of such schemes complicated.
Various rst order derivative-free schemes can be obtained from the Milstein scheme (5.3.21) simply by replacing the derivatives in the Milstein
scheme by the corresponding dierences. The inclusion of these dierences
requires the computation of supporting values of the coecients at additional
supporting points. An example is given by the following scheme, which is due
to Platen (1984) and called the Platen scheme. In the one-dimensional case
with d = m = 1 it takes the form
1
b(n , n ) b (W )2
(5.4.1)
Yn+1 = Yn + a + b W +
2
with supporting value
n = Yn + a + b
(5.4.2)
(5.4.4)
j = Y n + a + bj
(5.4.6)
n
k
= Ynk +ak +
Yn+1
m
268
1 k,j
n + bk,j W j
b
n ,
2 j=1
m
k
Yn+1
= Ynk + ak +
(5.4.7)
m
bj W j .
(5.4.8)
j=1
Here a denotes again the adjusted Stratonovich drift (5.3.4). In practice, the
above schemes turn out to be convenient since they provide the strong order
= 1.0 that is achieved by the Milstein scheme and avoid the computation
of derivatives.
In the log-log plot shown in Fig. 5.4.1 we display the absolute error for a
geometric Brownian motion approximated by the rst order Platen scheme
using the same default parameters from the example of the previous section
for the Euler and Milstein schemes in Figs. 5.3.1 and 5.3.2.
Regressing the log of the absolute error on the log of the time step size
results in the t:
ln(()) = 4.71638 + 0.946112 ln().
The order of strong convergence is close to the theoretical value of = 1.
In Burrage (1998) a two stage Runge-Kutta method of strong order = 1.0
has been proposed that for m = d = 1 has in the autonomous case the
form
1
+
b(Yn ) + 3 b(Yn ) W
Yn+1 = Yn + a(Yn ) + 3 a(Yn )
4
4
(5.4.9)
269
with
2
Yn = Yn + (a(Yn ) + b(Yn ) W ).
3
(5.4.10)
This method minimizes the leading error term within a class of rst order
Runge-Kutta methods and is, therefore, of particular interest.
Explicit Strong Order 1.5 Schemes
One can also construct derivative-free schemes of order = 1.5 by replacing
the derivatives in the strong order 1.5 Taylor scheme by corresponding nite
dierences.
In the autonomous, one-dimensional case d = m = 1 an explicit strong
order 1.5 scheme has been proposed in Platen (1984) in the form
1
a(+ ) a( ) Z
Yn+1 = Yn + b W +
2
+
1
a(+ ) + 2a + a( )
4
1
b(+ ) b( ) (W )2
+
4
1
b(+ ) 2b + b( ) {W Z}
(5.4.11)
2
1
1
) b(+ ) + b( )
+ ) b(
+
(W )2 W
b(
4
3
+
with
= Yn + a b
and
= + b(+ )
(5.4.12)
(5.4.13)
270
m
bk,j W j
j=1
m
m
1
k,j2 j1
j1
b
I(j1 ,j2 )
+
+ bk,j2
2 j2 =0 j1 =1
m
m
1
k,j2 j1
j1
b
I(0,j2 )
+ 2bk,j2 + bk,j2
2 j =0 j =1
2
j1 ,j2 bk,j3
j1 ,j2
bk,j3
+
m
1
2 j
1 ,j2 ,j3 =1
k,j3
with
and
j1
k,j3 j1
+ + b
I(j1 ,j2 ,j3 )
j = Y n + 1 a bj
m
j1 bj2
j1 ,j2 =
j1
,
+
+
(5.4.14)
(5.4.15)
(5.4.16)
5.5 Exercises
271
Fig. 5.4.2. Log-log plot of the absolute error for various strong schemes
order 1.5 Taylor scheme consumes only 4.4 seconds. As mentioned previously,
the above schemes are adequate for scenario simulation, dynamic nancial
analysis, ltering, testing of estimators and for hedge simulation.
5.5 Exercises
5.1. Write down for the linear SDE
dXt = ( Xt + ) dt + Xt dWt
with X0 = 1 the Euler scheme and the Milstein scheme.
5.2. Determine for the Vasicek short rate model
drt = (
r rt ) dt + dWt
the Euler and Milstein schemes. What are the dierences between the resulting
two schemes?
5.3. Write down for the SDE in Exercise 5.1 the strong order 1.5 Taylor
scheme.
5.4. Derive for the Black-Scholes SDE
dXt = Xt dt + Xt dWt
with X0 = 1 the explicit strong order 1.0 scheme.
6
Regular Strong Taylor Approximations with
Jumps
In this chapter we start to go beyond the work described in Kloeden & Platen
(1999) on the numerical solution of SDEs. We now allow the driving noise of
the SDEs to have jumps. We present regular strong approximations obtained
directly from a truncated Wagner-Platen expansion with jumps. The term
regular refers to the time discretizations used to construct these approximations. These do not include the jump times of the Poisson random measure,
as opposed to the jump-adapted strong approximations that will be presented
later in Chap. 8. A convergence theorem for approximations of a given strong
order of convergence will be presented at the end of this chapter. The reader
who aims to simulate a solution of an SDE with low jump intensity is referred
directly to Chap. 8 which describes jump-adapted schemes that are convenient
to use.
(6.1.1)
273
274
Xt = X0 +
b(s, Xs )dWs +
a(s, Xs )ds +
0
= X0 +
p (t)
a(s, Xs )ds +
0
b(s, Xs )dWs +
0
c(i , Xi , i ),
(6.1.2)
i=1
Xt = X0 +
p (t)
a(s, Xs )ds +
b(s, Xs )dWs +
0
c(i , Xi ).
(6.1.3)
i=1
d
f (t, x, u) +
ai (t, x) i f (t, x, u)
t
x
i=1
m
d
1 i,j
2
b (t, x)br,j (t, x) i j f (t, x, u)
2 i,r=1 j=1
x x
f t, x + c(t, x, v), u f (t, x, u) (dv), (6.1.5)
E
Lk f (t, x, u) =
d
i=1
bi,k (t, x)
f (t, x, u),
xi
for k {1, 2, . . . , m}
(6.1.6)
and
L1
v f (t, x, u) = f (t, x + c(t, x, v), u) f (t, x, u),
s()
275
(6.1.7)
(6.1.8)
(6.1.9)
where
tn+1
is
Atn measurable,
(6.1.10)
a.s.,
(6.1.11)
denoting the largest integer n such that tn t, for all t [0, T ]. Such a
time discretization is called regular, as opposed to the jump-adapted scheme
presented later in Chap. 8. For instance, we could consider an equidistant
time discretization with nth discretization time tn = n, n {0, 1, . . . , N },
T
. However, the discretization times could also be
and time step size = N
random, as is needed if one wants to employ a step size control. Conditions
(6.1.9),(6.1.10) and (6.1.11) pose some restrictions on the choice of the random
discretization times. Condition (6.1.9) requires that the maximum step size
in the time discretization cannot be larger than . Condition (6.1.10) ensures
that the length n = tn+1 tn of the next time step depends only on the
information available at the last discretization time tn . Condition (6.1.11)
guarantees a nite number of discretization points in any bounded interval
[0, t].
For simplicity, when describing the schemes we will use the abbreviation
f = f (tn , Yn )
(6.1.12)
and
c(i ) = c(tn , Yn , i ),
(6.1.13)
276
c (v) = c (tn , Yn , v)
and
c (i ) = c (tn , Yn , i ).
(6.1.14)
Note that here and in the sequel, the prime in (6.1.14) denotes the derivative
with respect to the spatial variable, which is the second argument. Moreover
as in the previous chapter, we will omit mentioning the initial value Y0 and
the step numbers n {0, 1, . . . , N } as this is self-explanatory.
Euler Scheme
The simplest scheme is again the well-known Euler scheme, which, in the
one-dimensional case d = m = 1, is given by the algorithm
tn+1
Yn+1 = Yn + an + bWn +
c(v) p (dv, dz)
E
tn
p (tn+1 )
= Yn + an + bWn +
c(i ),
(6.1.15)
(6.1.16)
(6.1.17)
(6.1.18)
(6.1.19)
is the ith mark of the Poisson random measure p at the ith jump time i ,
with distribution function F (). The Euler scheme (6.1.15) generally achieves
a strong order of convergence = 0.5, as will be shown at the end of this
chapter.
When we have a mark-independent jump size, which means c(t, x, v) =
c(t, x), the Euler scheme reduces to
Yn+1 = Yn + an + bWn + cpn ,
(6.1.20)
277
where
pn = p (tn+1 ) p (tn )
(6.1.21)
p (tn+1 )
k
Yn+1
= Ynk + ak n + bk Wn +
ck (i ),
(6.1.22)
for k {1, 2, . . . , d}, where ak , bk , and ck are the kth components of the drift,
diusion and jump coecients, respectively.
In the multi-dimensional case with scalar Wiener process, m = 1, and
mark-independent jump size, the kth component of the Euler scheme is given
by
k
= Ynk + ak n + bk Wn + ck pn ,
(6.1.23)
Yn+1
for k {1, 2, . . . , d}.
For the general multi-dimensional case with mark-dependent jump size the
kth component of the Euler scheme is of the form
k
Yn+1
= Ynk + ak n +
m
p (tn+1 )
bk,j Wnj +
j=1
ck (i ),
(6.1.24)
where ak and ck are the kth components of the drift and the jump coecients,
respectively, and bk,j is the component of the kth row and jth column of the
diusion matrix b, for k {1, 2, . . . , d}, and j {1, 2, . . . , m}. Moreover,
Wnj = Wtjn+1 Wtjn
(6.1.25)
is the N (0, n ) distributed nth increment of the jth Wiener process. We recall
that
(6.1.26)
i E
is the ith mark generated by the Poisson random measure.
In the multi-dimensional case with mark-independent jump size we obtain
the kth component of the Euler scheme
k
= Ynk + ak n +
Yn+1
m
bk,j Wnj + ck pn ,
(6.1.27)
j=1
278
can be interpreted as the strong order 0.5 Taylor scheme. In the following we
will use the Wagner-Platen expansion (4.4.4) for the construction of strong
Taylor schemes. Similarly, one can start from the compensated Wagner-Platen
expansion (4.4.5) obtaining slightly dierent compensated Taylor schemes, as
will be discussed later.
tn+1
tn+1
z2
tn+1
tn
z2
tn
tn+1
tn+1
tn
z2
tn
z2
tn
tn
+
tn
dW (z1 ) dW (z2 )
tn
tn
b tn , Yn + c(v) b p (dv, dz1 )dW (z2 )
(6.2.1)
c tn , Yn + c(v1 ), v2 c(v2 ) p (dv1 , dz1 )p (dv2 , dz2 ),
where
b = b (t, x) =
b(t, x)
x
c(t, x, v)
.
x
(6.2.2)
p (tn+1 )
Yn+1 = Yn + an + bWn +
c(i ) +
p (tn+1 )
+b
279
b b
2
(Wn ) n
2
c (i ) W (i ) W (tn )
b Yn + c(i ) b W (tn+1 ) W (i )
p (tn+1 )
p (j )
p (tn+1 )
c Yn + c(i ), j c j .
(6.2.3)
tn+1
s2
1
I(1,1) =
dWs1 dWs2 =
(Wn )2 n ,
2
tn
tn
tn+1
I(1,1) =
tn
tn+1
tn
s2
I(1,1) =
tn
tn+1
I(1,1) =
tn
tn
p (tn+1 )
s2
Wi pn Wtn ,
E
s2
tn
(6.2.5)
1
(pn )2 pn .
2
280
281
Fig. 6.2.2. Strong error for the Euler and the strong order 1.0 Taylor schemes
generated by these schemes. One clearly notices the higher accuracy of the
strong order 1.0 Taylor scheme.
Multi-dimensional Strong Order 1.0 Taylor Scheme
In the multi-dimensional case with scalar Wiener process, m = 1, and markdependent jump size, the kth component of the strong order 1.0 Taylor scheme
is given by the algorithm
282
tn+1
k
Yn+1
= Ynk + ak n + bk Wn +
tn
l=1
tn+1
bl
tn
z2
tn
tn+1
bk
dW (z1 )dW (z2 )
xl
z2
ck (v)
dW (z1 )p (dv, dz2 )
xl
E tn
l=1 tn
tn+1
z2
+
bk tn , Y n + c(v) bk p (dv, dz1 )dW (z2 )
tn
bl
tn
tn+1
z2
+
E
tn
tn
ck tn , Y n + c(v1 ), v2 ck v2
E
(6.2.6)
where ak , bk , and ck are the kth components of the drift, diusion and jump
coecients, respectively, for k {1, 2, . . . , d}. Similar to (6.2.3) we can rewrite
this scheme in a form that is readily applicable for scenario simulation.
For the multi-dimensional case with one driving Wiener process and markindependent jump size the kth component of the strong order 1.0 Taylor
scheme simplies to
k
= Ynk + ak n + bk Wn + ck pn +
Yn+1
d
l=1
bl
bk
I(1,1)
xl
ck
k
k
I(1,1)
t
I
+
b
,
Y
+
c
b
n
n
(1,1)
xl
l=1
+ ck tn , Y n + c ck I(1,1) ,
+
d
bl
(6.2.7)
for k {1, 2, . . . , d}. Here the four double stochastic integrals involved can be
generated as described in (6.2.5).
In the general multi-dimensional case the kth component of the strong
order 1.0 Taylor scheme is given by
tn+1
m
k
Yn+1
= Ynk + ak n +
bk,j Wnj +
ck (v) p (dv, dz)
m
d
bi,j1
j1 ,j2 =1 i=1
m
j1 =1 i=1
tn+1
tn
tn
j=1
bk,j2
xi
tn+1
tn
z2
tn
z2
bi,j1
E
tn
ck (v)
dW j1 (z1 )p (dv, dz2 )
xi
j1 =1
tn+1
tn
tn+1
tn
283
bk,j1 tn , Y n + c(v) bk,j1 p (dv, dz2 ) dW j1 (z2 )
z2
tn
z2
ck tn , Y n + c(v1 ), v2 ck v2
E
tn
(6.2.8)
m
bk,j Wnj + ck pn
j=1
m
d
i,j1
j1 ,j2 =1 i=1
m
bk,j1 tn , Y n
+
j1 =1
m
d
k
bk,j2
i,j1 c
I
+
b
I(j ,1)
(j
,j
)
1
2
xi
xi 1
j =1 i=1
(6.2.9)
+ c bk,j1 I(1,j1 ) + ck tn , Y n + c ck I(1,1) ,
tn+1
I(j1 ,1) =
tn
tn+1
tn
s2
I(1,j1 ) =
tn
tn
p (tn+1 )
s2
Wji1 pn Wtjn1 ,
tn+1
z2
I(j1 ,j2 ) =
dW j1 (z1 )dW j2 (z2 ),
(6.2.10)
tn
tn
284
tn+1
Yn+1 = Yn + ;
an + bWn +
c(v) p; (dv, dz)
+ bb
tn+1
tn
z2
tn
tn+1
tn+1
tn
tn
z2
tn
b c (v) dW (z1 );
p (dv, dz2 )
tn
z2
+
tn
tn
z2
tn+1
b tn , Yn + c(v) b p (dv, dz1 ) dW (z2 )
(6.2.12)
p (dv2 , dz2 ).
c tn , Yn + c(v1 ), v2 c(v2 ) p; (dv1 , dz1 );
For application in scenario simulation, the generation of a stochastic integral involving the compensated Poisson measure p; can be implemented as
follows: one could rst generate the corresponding multiple stochastic integral with jump integrations with respect to the Poisson jump measure p and
then subtract its mean. This is given by the corresponding multiple stochastic
integral with the integrations with respect to the Poisson jump measure p
replaced by integrations with respect to time and the intensity measure , see
(4.2.1). For example, to generate the stochastic integral
tn+1
tn
tn+1
tn+1
tn
tn+1
tn
(6.2.13)
285
tn+1
+ b b
tn+1
tn
z2
tn
tn+1
tn
z2
tn
tn+1
tn
tn+1
tn
tn+1
tn
tn+1
tn
z2
tn
z2
z2
tn
z2
+
tn
c tn , Yn + c(v1 ), v2 c(v2 ) (dv1 )dz1 p (dv2 , dz2 )
tn
z2
c tn , Yn + c(v1 ), v2 c(v2 ) p (dv1 , dz1 )p (dv2 , dz2 )
b tn , Yn + c(v) b p (dv, dz1 )dW (z2 )
tn
tn+1
tn
z2
tn
tn+1
tn
c tn , Yn + c(v1 ), v2 c(v2 ) p (dv1 , dz1 )(dv2 )dz2
E
c tn , Yn + c(v1 ), v2 c(v2 ) (dv1 )dz1 (dv2 )dz2 .
E
286
(6.3.1)
which is independent of the particular jump times. Let us consider a onedimensional SDE with mark-independent jump size, c(t, x, v) = c(t, x), satisfying the jump commutativity condition
b(t, x)
c(t, x)
= b(t, x + c(t, x)) b(t, x),
x
(6.3.2)
for all t [0, T ] and x . In this case the strong order 1.0 Taylor scheme
(6.2.4) depends only on the sum I(1,1) + I(1,1) expressed in (6.3.1). One
does not need to keep track of the exact location of the jump times. Hence,
its computational complexity is independent of the intensity level.
This is an important observation from a practical point of view. If a given
SDE satises the jump commutativity condition (6.3.2), then considerable
savings in computational time can be achieved. In practice, one can take
advantage of this insight and save valuable computational time.
When we have a linear diusion coecient of the form
b(t, x) = b1 (t) + b2 (t) x,
(6.3.3)
(6.3.4)
for all t [0, T ] and x . Therefore, for linear diusion coecients of the
form (6.3.3) the class of SDEs satisfying the jump commutativity condition
(6.3.2) is identied by mark-independent jump coecients of the form
c(t, x) = eK(t)
b1 (t) + b2 (t) x ,
287
(6.3.5)
1 2
Yn {(Wn )2 n }
2
1 2
Yn {(pn )2 pn }.
2
(6.3.6)
bl (t, x)
l=1
ck (, t, x)
= bk (t, x + c(t, x)) bk (t, x)
xl
(6.3.7)
(6.3.8)
is independent of the particular jump times. Therefore, for a general multidimensional SDE with mark-independent jump size, we obtain the jump commutativity condition
d
l=1
bl,j1 (t, x)
ck (t, x)
= bk,j1 (t, x + c(t, x)) bk,j1 (t, x)
xl
(6.3.9)
for j1 {1, 2, . . . , m}, k {1, 2, . . . , d}, t [0, T ] and x d . Thus, once the
diusion coecients are specied, a solution of the rst order semilinear dmdimensional system of PDEs (6.3.9) provides a d-dimensional commutative
jump coecient. Note that the system (6.3.9) has d m equations and only
d unknown functions. Therefore, even for simple diusion coecients, there
may not exist any jump coecient satisfying (6.3.9).
Consider, for instance, the multi-dimensional case with additive diusion
coecient b(t, x) = b(t). The jump commutativity condition (6.3.9) reduces
to the m d system of rst order homogeneous linear PDEs
d
l=1
bl,j (t)
ck (t, x)
=0
xl
(6.3.10)
288
(6.3.11)
l,j (t)xl
ck (t, x)
= k,j (t) ck (t, x)
xl
(6.3.12)
y i = (x1 )
d1
i+1 (t)
1 (t)
)
d1
(6.3.13)
has components
xi+1
289
(6.3.14)
m
bk,j Wnj + ck pn
j=1
1
+
2
+
m
d
j1 ,j2 =1
m
bk,j2
bi,j1
xi
i=1
Wnj1 Wnj2 n
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
1
2
+ {ck (tn , Y n + c) ck } (pn ) pn ,
2
(6.3.15)
for k {1, 2, . . . , d}. For instance, the special case of additive diusion and
jump coecients, which means b(t, x) = b(t) and c(t, x) = c(t), satises all
the required commutativity conditions and, therefore, leads to an ecient
strong order 1.0 Taylor scheme. Note that in this particular case the last two
lines of (6.3.15) vanish and, thus, only increments of Wiener processes and
the Poisson measure are needed.
From this analysis it becomes clear that when selecting a suitable numerical scheme for a specic model it is important to check for particular
commutativity properties of the SDE under consideration to potentially save
complexity and computational time. For the modeler it also suggests to focus
on models that satisfy certain commutativity conditions. The resulting model
may be more tractable than others with an ad hoc structure of diusion and
jumps coecients.
290
f
f
=
Y
+
I
(t
,
Y
)
=
I
(t
,
Y
)
Y
n
n+1
n
n
n
tn ,tn+1
A \{v}
tn ,tn+1
(6.4.2)
for n {0, 1, . . . , nT 1}. Similarly, we dene the strong order compensated
Taylor scheme by the vector equation
.
.
(tn , Y )
(tn , Y )
I
I
Y
=
Y
+
=
f
f
n+1
n
n
n
tn ,tn+1
A \{v}
tn ,tn+1
(6.4.3)
for n {0, 1, . . . , nT 1}.
Equations (6.4.2) and (6.4.3) provide recursive numerical routines generating approximate values of the solution of the SDE (6.1.1) at the time
discretization points. In order to assess the strong order of convergence of
these schemes, we dene, through a specic interpolation, the strong order
Taylor approximation Y = {Y
t , t [0, T ]}, by
I [f (tnt , Y
(6.4.4)
Y
t =
tnt )]tnt ,t
A
291
(6.4.2) and of the strong order compensated Taylor scheme (6.4.3), respectively, at the time discretization points. Between the discretization points the
multiple stochastic integrals have constant coecient functions but evolve
randomly as a stochastic process.
The strong Taylor schemes, given by (6.4.3), are constructed with jump
integrals with respect to the compensated Poisson measure p; , as follows from
the denition of the multiple stochastic integrals (4.2.9). Note that the Euler
scheme and the strong order 1.0 Taylor scheme presented in the previous
sections were expressed in terms of the Poisson measure p . When rewritten
in terms of the compensated Poisson measure p; , one recovers the strong
Taylor schemes (6.4.5), with = 0.5 and = 1.0, respectively.
Strong Convergence Theorem
The strong order of convergence of the compensated strong Taylor schemes
presented above can be derived from the following theorem. This convergence
theorem enables us to construct a compensated strong Taylor approximation
Y = {Y
t , t [0, T ]} of any given strong order {0.5, 1, 1.5, 2, . . .}.
Theorem 6.4.1. For given {0.5, 1, 1.5, 2, . . . }, let Y = {Y
t , t
[0, T ]} be the strong order compensated Taylor approximation dened in
(6.4.5), corresponding to a time discretization with maximum step size
(0, 1). We assume that
E(|X 0 |2 ) <
and
2
2
E(|X 0 Y
0 | ) K1 .
(6.4.6)
Moreover, suppose that the coecient functions f satisfy the following conditions:
For A , t [0, T ], u E s() and x, y d the coecient function
f satises the Lipschitz type condition
(6.4.7)
f (t, x, u) f (t, y, u) K1 (u)|x y|,
where (K1 (u))2 is (du1 ) . . . (dus() )-integrable.
For all A B(A ) we assume
f C 1,2
and
f H ,
(6.4.8)
292
2
E( sup |X z Y
z | |A0 ) K3
(6.4.10)
0zT
293
Lemma 6.5.1
For a given multi-index Mm \{v}, a time discretization
(t) with (0, 1) and g H let
Vt0 ,u,s() = . . . E sup |g(z, v 1 , . . . , v s() )|2 At0 (dv 1 ) . . . (dv s() )
E
t0 zu
< ,
Ft
(6.5.1)
n 1
2
z
I [g()]tn ,tn+1 + I [g()]tnz ,z
= E sup
t0 zt
n=0
and
n 1
2
z
Ft = E sup
I [g()]tn ,tn+1 + I [g()]tnz ,z
t0 zt
n=0
At0
(6.5.2)
At0 .
(6.5.3)
Then
Ft
t
(t t0 ) 2(l()1) t0 Vt0 ,u,s() du
t
and
t
(t t0 ) 2(l()1) t0 Vt0 ,u,s() du
t
Ft
4l()n()+2 C s() l()+n()1 t0 Vt0 ,u,s() du
n=0
n
z 1
n
z 1
n=0
tn+1
tn
I [g()]tn ,s ds +
tnz
tn
n=0
tn+1
I [g()]tns ,s ds +
I [g()]tnz ,s ds
z
tnz
I [g()]tns ,s ds
=
t0
I [g()]tns ,s ds.
(6.5.4)
The same type of equality holds analogously for every jn {1, 0, . . . , m}.
294
2. Let us rst consider the case with l() = n(). By the Cauchy-Schwarz
inequality we have
z
2
Ft = E
I [g()]tnu ,u du At0
sup
t zt
t0
E
sup (z t0 )
t0 zt
t0
(t t0 ) E
t0
= (t t0 )
t0
2
|I [g()]tnu ,u | du At0
E |I [g()]tnu ,u |2 |At0 du
sup |I [g()]tnu ,z | At0 du
zu
(t t0 )
E
t0
= (t t0 )
|I [g()]tnu ,u |2 du At0
tnu
E E
sup
tnu zu
t0
2
|I [g()]tnu ,z | Atnu At0 du, (6.5.5)
where the last line holds because t0 tnu a.s. and then At0 Atnu for
u [t0 , t]. Therefore, applying Lemma 4.5.2 to
2
E
sup |I [g()]tnu ,z | Atnu
(6.5.6)
t zu
nu
yields
Ft (t t0 ) 4l()n()
E (u tnu )
l()+n()1
t0
tnu
(t t0 ) 4l()n()
t0
Vtnu ,z,s() dz At0 du
E (u tnu )l()+n() Vtnu ,u,s() At0 du
t
(t t0 ) 4l()n() l()+n() E Vtnu ,u,s() At0 du, (6.5.7)
t0
where the last line holds as (u tnu ) for u [t0 , t] and t [t0 , T ] .
Since At0 Atnu , we notice that for u [t0 , t]
295
E Vtnu ,u,s() |At0
=E
|g(z, v 1 , . . . , v s() )|2 Atnu
...
sup
tnu zu
(dv 1 ) . . . (dv s() ) At0
...
E E
|g(z, v , . . . , v
1
sup
tnu zu
s()
)| Atnu At0
2
|g(z, v , . . . , v
1
s()
...
sup
tnu zu
E
E
sup |g(z, v , . . . , v
1
t0 zu
s()
)| At0
)| At0 (dv 1 ) . . . (dv s() )
2
= Vt0 ,u,s() .
(6.5.8)
It then follows
Ft (t t0 ) 4l()n() l()+n()
Vt0 ,u,s() du
t0
= (t t0 ) 2(l()1)
(6.5.9)
t0
since l() = n(), s() = s() and this completes the proof for the
case l() = n().
3. Let us now consider the case with a multi-index = (j1 , . . . , jl ) with
l()
= n() and jl {1, 2, . . . , m}. In this case the multiple stochastic
integral is a martingale. Hence, by Doobs inequality, It
os isometry and
Lemma 4.5.2 we obtain
z
2
j
Ft = E
I [g()]tnu ,u dWul At0
sup
t0 zt
t0
2
t
jl
I [g()]tnu ,u dWu
4E
t0
At0
296
E |I [g()]tnu ,u |2 At0 du
4
t0
2
E E |I [g()]tnu ,u | Atnu At0 du
=4
t0
E E
sup
tnu zu
t0
|I [g()]tnu ,z | Atnu At0 du
2
4 4l()n()
t
l()+n()1
E (u tnu )
t0
tnu
4 4l()n() l()+n()
Vtnu ,z,s() dz At0 du
Vt0 ,u,s() du
t0
= 4l()n() l()+n()1
Vt0 ,u,s() du
t0
l()n()+2
l()+n()1
(6.5.10)
t0
where the last passage holds since s() = s(). This completes the proof
in this case.
4. Let us now consider the case with a multi-index = (j1 , . . . , jl ) with
l()
= n() and jl = 1. The multiple stochastic integral is again a martingale. Therefore, by Doobs inequality, Lemma 4.5.2 and steps similar
to the previous case we obtain
z
2
s()
s()
I [g(, v
sup
)]tnu ,u p; (dv
, du) At0
Ft = E
t zt
t0
2
t
s()
s()
I [g(, v
4E
)]tnu ,u p; (dv
, du)
t0
=4
E
t0
=4
t0
E
t0
E |I [g(, v s() )]tnu ,u |2 At0 (dv s() ) du
E E |I [g(, v s() )]tnu ,u |2 Atnu At0 (dv s() ) du
At0
sup
tnu zu
|I [g(, v
s()
)]tnu ,z | Atnu At0 (dv s() )du
2
4l()n()+1
t
E (u tnu )l()+n()1
t0
tnu
4l()n()+1 l()+n()
Vtnu ,z,s() dz At0 (dv s() ) du
E Vtnu ,u,s() At0 (dv s() ) du.
t0
297
(6.5.11)
Hence, using (6.5.8) we have
Ft 4l()n()+1 l()+n()
t0
= 4l()n() l()+n()1
Vt0 ,u,s() du
t0
4l()n()+2 l()+n()1
Vt0 ,u,s() du
(6.5.12)
t0
since l() = l()+1, n() = n(), s() = s()+1 and this completes
the proof in this case.
5. Finally, we assume that = (j1 , . . . , jl ) with l()
= n() and jl = 0.
It can be shown that the discrete-time process
!
k
I [g()]t ,t , k {0, 1 . . . , nT 1}
(6.5.13)
n
n+1
n=0
is a discrete-time martingale.
Using Cauchy-Schwarz inequality we obtain
n 1
2
z
n 1
2
z
2 E sup
I [g()]tn ,tn+1
t0 zt
n=0
+2E
At
0
2
sup I [g()]tnz ,z At0 .
t0 zt
At0
(6.5.14)
298
n 1
2
z
I [g()]tn ,tn+1
E sup
t0 zt
n=0
At0
2
t 1
n
4 E
I [g()]tn ,tn+1
n=0
At0
n 2
t
2
-
I [g()]tn ,tn+1
4E
n=0
n 2
t
I [g()]tn ,tn+1 E |I [g()]tnt 1 ,tnt | Atnt 1
+2
n=0
.
2
+ E |I [g()]tnt 1 ,tnt | Atnt 1
At0
2
n 2
t
-
I [g()]tn ,tn+1
4E
n=0
.
2
+ E |I [g()]tnt 1 ,tnt | Atnt 1
At0 .
(6.5.15)
Here the last line holds by the discrete-time martingale property of the
involved stochastic integrals, which is E(I [g()]tnt 1 ,tnt |Atnt 1 ) = 0.
Then by applying Lemma 4.5.2 we obtain
n 1
2
z
I [g()]tn ,tn+1 At0
E sup
t zt
0
n=0
2
n 2
t
-
4E
I [g()]tn ,tn+1
n=0
+4
l()n()
l()+n()1
tnt
n 3
t
-
I [g()]tn ,tn+1 |2
4E |
tnt 1
.
Vtnt 1 ,u,s() du At0
n=0
+ 4l()n() l()+n()1
tnt 1
tnt 2
Vtnt 2 ,u,s() du
+4
l()n()
tnt
l()+n()1
tnt 1
.
Vtnt 1 ,u,s() du At0
299
2
n 3
t
-
I [g()]tn ,tn+1
4E
n=0
+4
l()n()
tnt 1
l()+n()1
tnt 2
+4
l()n()
.
Vtnt 2 ,u,s() du At0 , (6.5.16)
tnt
l()+n()1
Vtnt 2 ,u,s() du
tnt 1
where the last passage holds since Vtnt 1 ,u,s() Vtnt 2 ,u,s() . Applying
this procedure repetitively and using (6.5.8) we nally obtain
n 1
2
z
I [g()]tn ,tn+1 At0
E sup
t0 zt
n=0
l()n()+1
l()+n()1
t0
Vt0 ,u,s() du At0
= 4l()n()+1 l()+n()1
(6.5.17)
t0
2
z
= E sup
I [g()]tnz ,u du
t0 zt
tnz
E
sup (z tnz )
t0 zt
E E
t0
tnz
sup
tnu zu
2
|I [g()]tnz ,u | du At0
|I [g()]tnu ,z | Atnu At0 du
2
l()n()
At
0
l()+n()1
E
t0
tnu
Vtnu ,z,s() dz At0 du
300
4l()n() l()+n()1
Vt0 ,u,s() du
t0
= 4l()n() l()+n()1
(6.5.18)
t0
where the last passage holds since l() = l() + 1, n() = n() + 1
and s() = s().
Therefore, combining equations (6.5.17) and (6.5.18) we nally obtain
Ft 2 4l()n()+1 l()+n()1
Vt0 ,u,s() du
t0
+ 4l()n() l()+n()1
Vt0 ,u,s() du
t0
t
4l()n()+2 l()+n()1
(6.5.19)
t0
Let us now prove the assertion for F . The case of l() = n() has been
already proved since F = F .
Consider now the case of l() > n() + s() with = (j1 , . . . , jl ). Note
that in this case at least one of the elements of the multi-index belongs to
{1, 2, . . . , m} and, thus, the process
!
k
I [g()]tn ,tn+1 , k {0, 1 . . . , nT 1}
(6.5.20)
n=0
is a discrete-time martingale.
If jl = 0, then using similar steps as in (6.5.14)(6.5.19) we obtain
l()n()+2
l()+n()1 s()
K
Vt0 ,u,s() du
Ft 4
t0
t
Vt0 ,u,s() du
(6.5.21)
t0
t
s()
Vt0 ,u,s() du
Ft 4l()n()+2 l()+n()1 K
t0
(6.5.22)
301
2
s()
s()
sup
I [g(, v
)]tnu ,u p; (dv
, du) At0
Ft 2 E
t zt
t0
z
2
s()
s()
+ 2 E sup
I [g(, v
)]tnu ,u (dv
)du
t zt
t0
2
I [g(, v s() )]tnu ,u p; (dv s() , du)
E
sup
t zt
E
t0
At0 .(6.5.23)
by similar steps as
At0
s()1
4l()n() l()+n()1 K
(6.5.24)
t0
For the second term on the right-hand side of (6.5.23) by similar steps as
those used in (6.5.5)(6.5.9), we obtain
z
2
s()
s()
I [g(, v
)]tnu ,u (dv
)du At0
E
sup
t zt
0
t0
s()1
(t t0 )4l()n()1 l()+n()1 K
t
s()
4l()n()+2 l()+n()1 K
Vt0 ,u,s() du
Ft
Vt0 ,u,s() du
t0
t0
t
(6.5.26)
t0
Let us nally consider the case of l() = n() + s() with s() 1. By
use of relationship (4.2.13) one can rewrite the multiple stochastic integrals
I [g()]tn ,tn+1 appearing in Ft as the sum of 2s() multiple stochastic integrals
involving integrations with respect to the compensated Poisson measure p ,
the product of time and the intensity measure (). Therefore, by applying
the Cauchy-Schwarz inequality and similar steps as used before, we obtain
302
s()
Ft 2s() 4l()n()+2 l()+n()1 K
Vt0 ,u,s() du
t0
(6.5.27)
t0
(6.6.1)
0zT
n=0
Therefore, we have
2
2
sup (1 + |Y z | ) A0
E
sup |Y z | A0 E
0zT
0zT
E
sup
0zT
1 + Y
0 +
z 1
n
A \{v}
I [f (tn , Y
tn )]tn ,tn+1
n=0
2
sup
0zT
z 1
n
n=0
2
1 + 2 |Y
|
+
2
0
A \{v}
2
2
C1 1 + Y
+ 2K
0
303
sup
0zT
A \{v}
n 1
2
z
A0 , (6.6.2)
C1 (1 +
2
|Y
0 | )
+ 2 K1
2
sup f (z, Y
z )
0zu
...
E
A \{v}
!
A0 (dv 1 ) . . . (dv s() ) du
2
C1 1 + |Y
+ 2 K1
0 |
E
0
...
A \{v}
2
sup 1 + |Y
|
z
0zu
2
C1 1 + |Y
|
+
C
2
0
!
A0 du
sup
0zu
2
|
1 + |Y
z
A0 du.
(6.6.3)
We then have
2
2
E sup (1 + |Y s | )|A0 C1 (1 + |Y
0 | )
0sT
+2K
A \{v}
D(, , )
...
0
2
E( sup |f (s, Y
s )| |A0 )(dv1 ) . . . (dvs() )du
0su
304
2
C1 (1 + |Y
0 | ) + 2K()
...
K ()
A \{v}
T
2
E( sup (1 + |Y
s | )|A0 )du
0su
2
C1 (1 + |Y
0 | )
T
2
E( sup (1 + |Y
+ K(, , )
s | )|A0 )du.
0
(6.6.4)
0su
0sT
2
C1 (1 + |Y
0 | )
+ K(, , )
2
eK(,,)(T u) C1 (1 + |Y
0 | ) du
0
2
C1 (1 + |Y
0 | )
2
K(,,)T
+ C1 (1 + |Y
1)
0 | )(e
2
C2 (, , )(1 + |Y
0 | ),
(6.6.5)
Now by using Lemma 6.5.1 and Lemma 6.6.1, we can nally prove Theorem 6.4.1.
Proof of Theorem 6.4.1
1. With the Wagner-Platen expansion (4.4.4) we can represent the solution
of the SDE (1.8.2) as
I [f (, X )], +
I [f (, X )], ,
(6.6.6)
X =
A
B(A )
A \{v}
B(A )
n+1
nt
nt
n=0
n 1
t
n=0
!
I [f (, X )]tn ,tn+1 + I [f (, X )]tnt ,t
(6.6.7)
305
n=0
(6.6.8)
From the moment estimate (1.9.5) provided by Theorem 1.9.3 we have
(6.6.9)
E
sup |X z |2 A0 C 1 + E |X 0 |2 .
0zT
2. Let us now analyze the mean square error of the strong order compensated Taylor approximation Y . By (6.6.7), (6.6.8) and Cauchy-Schwarz
inequality we obtain
2
|
Z(t) = E sup |X z Y
z
A0
0zt
=E
sup X 0 Y
0
0zt
z 1
n
A \{v}
I [f (tn , X tn ) f (tn , Y
tn )]tn ,tn+1
n=0
z 1
n
B(A )
n=0
2
2
C3
X 0 Y
0 +
A \{v}
St +
Ut
B(A )
(6.6.11)
I f (tn , X tn ) f (tn , Y
St = E sup
)
tn
0zt
n=0
.
+ I f (tnz , X tnz ) f (tnz , Y
tnz )
tnz ,z
2
tn ,tn+1
A0 ,
(6.6.12)
306
Ut = E
z 1
n
2
I [f (, X )]tn ,tn+1 + I [f (, X )]tnz ,z A0 .
sup
0zt
n=0
(6.6.13)
3. By using again Lemma 6.5.1 and the Lipschitz condition (6.4.7) we obtain
z 1
n
.
I f (tn , X t ) f (tn , Y )
S = E sup
t
tn
0zt
n=0
.
+ I f (tnz , X tnz ) f (tnz , Y
tnz )
C4
...
tnz ,z
2
tn ,tn+1
A0
2
sup |f (tnz , X tnz ) f (tnz , Y
)|
tnz
A0
0zu
2
C4 . . .
K1 (v 1 , . . . , v s() ) (dv 1 ) . . . (dv s() )
sup |X tnz
0zu
2
Y
tnz |
A0 du
C5
Z(u) du.
(6.6.14)
Applying again Lemma 6.5.1 and the linear growth condition (6.4.9) we
obtain
z 1
n
2
n=0
C5 ()
C5 ()
. . . E sup |f (z, X z )|2 A0 (dv 1 ) . . . (dv s() )du
...
E
0zu
E sup (1 + |X z | ) A0 du
0zu
C6 ()
t+
0
where
sup |X z |2 A0 du ,
0zu
2l() 2
: l() = n()
() =
l() + n() 1 : l()
= n().
(6.6.15)
6.7 Exercises
307
t
C1 (1 + |X 0 |2 )du
Ut C6 2 t +
0
C7 2 (1 + |X 0 |2 ).
(6.6.16)
t
2
2
2
1
+
|X
+
C
Z(t) C8 X 0 Y
+
C
|
Z(u)du
.
9
0
10
0
0
(6.6.17)
By equations (6.6.9) and (6.6.10) Z(t) is bounded. Therefore, by the Gronwall inequality we have
2
Z(T ) K4 1 + |X 0 |2 2 + K5 |X 0 Y
.
(6.6.18)
0 |
Finally, by assumption (6.4.6), we obtain
)
2
E( sup |X z Y
Z(T ) K3 ,
z | |A0 ) =
(6.6.19)
0zT
Remark 6.6.2. Note that the same result as that formulated in Theorem
6.4.1 holds also for the strong order Taylor approximation (6.4.4) as mentioned in Corollary 6.4.3. The proof uses analogous steps as those employed
in the proof described above.
6.7 Exercises
6.1. Consider the two-dimensional SDE
dXt1 = dWt1
dXt2 = Xt1 dWt2
for t [0, T ], with X01 = X02 = 1. Here W 1 and W 2 are independent Wiener
processes. Does the above SDE satisfy a diusion commutativity condition?
6.2. Apply the Milstein scheme to the system of SDEs in Exercise 6.1.
6.3. Does the following diusion process with jumps satisfy the jump commutativity condition if its SDE has the form
dXt = a dt + 2 Xt dWt + 2 Xt p (E, dt)
for t 0?
7
Regular Strong It
o Approximations
In this chapter we describe strong approximations on a regular time discretization that are more general than the regular strong Taylor approximations
presented in the previous chapter. These approximations belong to the class
of regular strong It
o schemes, which includes derivative-free, implicit and
predictor-corrector schemes. More details on some of the results to be presented in this chapter can be found in Bruti-Liberati, Nikitopoulos-Sklibosios
& Platen (2006) and Bruti-Liberati & Platen (2008).
309
310
7 Regular Strong It
o Approximations
tn+1
(b(tn , Yn ) b)
tn+1
z2
tn
tn+1
z2
dWz1 dWz2
tn
tn
(c(tn , Yn , v) c(v))
tn+1
z2
+
b(tn , Yn + c(v)) b p (dv, dz1 ) dWz2
tn
tn
E
tn+1
z2
+
tn
tn
(7.1.1)
c(tn , Yn + c(v2 ), v1 ) c(v1 ) p (dv1 , dz1 )p (dv2 , dz2 ),
311
n .
(7.1.2)
p (tn+1 )
Yn+1 = Yn + an + bWn +
c(i ) +
p (tn+1 )
p (tn+1 )
b(tn , Yn ) b
2
(Wn ) n
2 n
c(tn , Yn , i ) c(i )
(Wi Wtn )
n
b (Yn + c(i )) b Wtn+1 Wi
p (tn+1 )
p (j )
c (Yn + c(i ), j ) c (j ) ,
(7.1.3)
with supporting value (7.1.2), which can be directly used in scenario simulation.
As discussed in Sect. 6.3, in the case of mark-independent jump size it
is recommended to check the particular structure of the SDE under consideration. Indeed, we derived the jump commutativity condition (6.3.2) under
which the strong order 1.0 Taylor scheme (6.2.4) exhibits a computational
complexity that is independent of the intensity level of the Poisson random
measure.
For the derivative-free strong order 1.0 scheme (7.1.1)(7.1.2) with markindependent jump size, the derivative-free coecient of the multiple stochastic
integral I(1,1) , which is of the form
c(tn , Yn ) c(tn , Yn )
,
n
depends on the time step size n . On the other hand, the coecient of the
multiple stochastic integral I(1,1) ,
b (tn , Yn + c(tn , Yn )) b (tn , Yn ) ,
is independent of n . Therefore, it is not possible to directly derive a commutativity condition similar to (6.3.2) that permits us to identify special classes
312
7 Regular Strong It
o Approximations
n .
(7.1.5)
Since the evaluation of the multiple stochastic integrals I(1,1) and I(1,1) , as
given in (6.2.5), depends on the number of jumps, the computational eciency
of the scheme (7.1.4)(7.1.5) depends on the total intensity of the jump
measure.
Let us consider the special class of one-dimensional SDEs satisfying the
jump commutativity condition (6.3.2), which we recall here in the form
b(t, x)
c(t, x)
= b (t, x + c(t, x)) b (t, x) ,
x
(7.1.6)
for all t [0, T ] and x . Under this condition, we should rst derive the
strong order 1.0 Taylor scheme, using the relationship
I(1,1) + I(1,1) = pn Wn ,
(7.1.7)
obtaining
b b
{(Wn )2 n }
(7.1.8)
2
{c(tn , Yn + c) c}
{(pn )2 pn },
+ {b(tn , Yn + c) b}pn Wn +
2
where we have used again the abbreviation (6.1.12). Then, we replace the
derivative b by the corresponding dierence ratio and obtain a derivative-free
strong order 1.0 scheme
Yn+1 = Yn + an + bWn + cpn
+
{b(tn , Yn ) b}
{c(tn , Yn + c) c}
{(pn )2 pn },
2
(7.1.9)
313
Let us discuss an even more specic example. For the SDE (1.8.5) with
c(t, x, v) = x , for 1, we can derive the derivative-free strong order 1.0
scheme, which, due to the multiplicative form of the diusion coecient, is
the same as the strong order 1.0 Taylor scheme (6.3.6).
Multi-dimensional Strong Order 1.0 Scheme
In the multi-dimensional case with scalar Wiener process, which means m =
1, and mark-dependent jump size, the kth component of the derivative-free
strong order 1.0 scheme is given by
tn+1
k
k
k
k
ck (v) p (dv, dz)
Yn+1 = Yn + a n + b Wn +
+
n) b
b (tn , Y
tn+1
z2
k
tn
tn+1
tn
z2
tn
tn
tn+1
E
z2
tn
z2
dWz1 dWz2
tn
tn
n , v) ck (v)
ck (tn , Y
bk (tn , Y n + c(v)) bk p (dv, dz1 ) dWz2
+
tn
tn
tn+1
ck (tn , Y n + c(v2 ), v1 ) ck (v1 )
(7.1.10)
bl (t, x)
ck (t, x)
= bk (t, x + c(t, x)) bk (t, x)
xl
(7.1.11)
for k {1, 2, . . . , d}, t [0, T ] and x d . Using the relationship (7.1.7) one
can derive in this case a strong order 1.0 Taylor scheme
1 l bk
b
{(Wn )2 n } (7.1.12)
2
xl
d
k
Yn+1
= Ynk + akn + bkWn + ckpn +
l=1
{c (tn , Y n + c) ck }
{(pn )2 pn },
2
k
314
7 Regular Strong It
o Approximations
) bk }
{bk (tn , Y
n
{(Wn )2 n } + {bk (tn , Y n + c) bk }pn Wn
2 n
{ck (tn , Y n + c) ck }
{(pn )2 pn },
2
(7.1.13)
with supporting value (7.1.2), for k {1, 2, . . . , d}. The computational complexity of the scheme (7.1.13) is independent of the jump intensity level.
In the general multi-dimensional case the kth component of the derivativefree strong order 1.0 scheme is given by
k
= Ynk + ak n +
Yn+1
m
tn
j=1
1
+
n
+
m
j1 =1
tn+1
tn
tn+1
tn+1
z2
tn
z2
tn
z2
z2
tn
+
tn
j2 ) bk,j1 dW j1 dW j2
bk,j1 (tn , Y
n
z1
z2
j2 ) ck,j1 dW j1 p (dv, dz2 )
ck,j1 (tn , Y
n
z1
bk,j1 (tn , Y n + c(v)) bk,j1 p (dv, dz2 ) dWzj21
tn
tn
j1 =1
tn+1
tn
j1 ,j2 =1
tn+1
bk,j Wnj +
(7.1.14)
n ,
(7.1.15)
bl,j1 (t, x)
l=1
ck (t, x)
= bk,j1 (t, x + c(t, x)) bk,j1 (t, x)
xl
315
(7.1.16)
m
bk,j Wnj + ck pn
j=1
1
+
2 n
+
m
m
j2 ) bk,j1 (tn , Y n ) I(j ,j )
bk,j1 (tn , Y
n
1 2
j1 ,j2 =1
bk,j1 (tn , Y n + c) bk,j1 pn Wnj1
j=1
{ck (tn , Y n + c) ck }
{(pn )2 pn },
2
(7.1.17)
for k {1, 2, . . . , d}, with the vector supporting values (7.1.15). For the generation of the multiple stochastic integral I(j1 ,j2 ) , for j1 , j2 {1, 2, . . . , d}, we
refer to Sect. 6.3.
For the special case of a multi-dimensional SDE satisfying the jump commutativity condition (7.1.16), the diusion commutativity condition
Lj1 bk,j2 (t, x) = Lj2 bk,j1 (t, x)
(7.1.18)
m
bk,j Wnj + ck pn
j=1
1
2
m
j1 ,j2 =1
j2 ) bk,j1 (tn , Y n ) W j1 W j2 n
bk,j1 (tn , Y
n
n
n
316
7 Regular Strong It
o Approximations
m
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
1 k
2
{c (tn , Y n + c) ck } (pn ) pn ,
2
(7.1.19)
p (tn+1 )
ck (i ),
(7.2.1)
where the parameter [0, 1] characterizes the degree of implicitness and we
have used the abbreviation dened in (6.1.12). For = 0 we recover the Euler
scheme (6.1.15), while for = 1, we obtain a fully drift-implicit Euler scheme.
The scheme (7.2.1) achieves a strong order of convergence = 0.5. This
scheme was proposed and analyzed in Higham & Kloeden (2006) for SDEs
driven by Wiener processes and homogeneous Poisson processes. It generalizes
the drift-implicit Euler scheme for pure diusions presented in Talay (1982b)
and Milstein (1988a).
By comparing the drift-implicit Euler scheme (7.2.1) with the Euler scheme
(6.1.15), one notices that the implementation of the former requires an additional computational eort. An algebraic equation has to be solved at
317
1
,
K
(7.2.2)
p (tn+1 )
ck (i ),
Ynk +
a (tn+1 , Y n+1 ) + (1 ) a
k
p (tn+1 )
m
k,j
j
n + b Wn +
ck (i ),
j=1
318
7 Regular Strong It
o Approximations
tn+1
tn+1
z2
+
c(v) p (dv, dz) + bb
dWz1 dWz2
E
tn
tn+1
tn
z2
+
E
tn
tn+1
tn+1
tn
z2
b (tn , Yn + c(v)) b p (dv, dz1 )dWz2
tn
z2
tn
tn
+
tn
tn
(7.2.4)
c (tn , Yn + c(v1 ), v2 ) c(v2 ) p (dv1 , dz1 ) p (dv2 , dz2 ),
where
b(t, x)
c(t, x, v)
and c (t, x, v) =
.
(7.2.5)
x
x
For simplicity, we have used the convention (6.1.12). Here the parameter
[0, 1], characterizes again the degree of implicitness. This scheme achieves a
strong order of convergence = 1.0. Note that the degree of implicitness
can, in principle, be also chosen to be greater than one if this helps the
numerically stability of the scheme. This scheme generalizes a rst strong
order drift-implicit scheme for pure diusions presented in Talay (1982a) and
Milstein (1988a).
One can simplify the double stochastic integrals appearing in the scheme
(7.2.4) as shown for the strong order 1.0 Taylor scheme (6.2.3). This makes
the resulting scheme more applicable to scenario simulation.
The jump commutativity condition (6.3.2), presented in Sect. 6.3, also
applies to drift-implicit schemes. Therefore, for the class of SDEs identied
by the jump commutativity condition (6.3.2) the computational eciency of
drift-implicit schemes of strong order = 1.0 is independent of the intensity
level of the Poisson measure. For instance, for the SDE (1.8.5) with c(t, x, v) =
x and 1, it is possible to derive a drift-implicit strong order 1.0 scheme,
given by
Yn
1
1 + (1 ) n + Wn + pn + 2 {(Wn )2 n }
Yn+1 =
1 n
2
1 2
2
(7.2.6)
+ pn Wn + {(pn ) pn } ,
2
b (t, x) =
319
which is also ecient in the case of a high intensity jump measure. Note that
should be chosen dierent from (n )1 , which automatically arises for
suciently small step sizes n .
In the multi-dimensional case with scalar Wiener process, m = 1, and
mark-dependent jump size, the kth component of the drift-implicit strong
order 1.0 scheme is given by
k
Yn+1
= Ynk + ak (tn+1 , Y n+1 ) + (1 ) ak n + bk Wn
tn+1
+
E
tn
l=1
tn+1
tn+1
z2
tn
tn
tn+1
tn
z2
bl
tn
l=1
z2
tn
tn+1
tn
z2
bl
tn
bk
dWz1 dWz2
xl
ck (v)
dWz1 p (dv, dz2 )
xl
bk (tn , Y n + c(v)) bk p (dv, dz1 )dWz2
+
tn
(7.2.7)
+ ck pn +
(7.2.8)
l=1
{ck (tn , Y n + c) ck }
{(pn )2 pn },
2
tn+1
+
tn
320
7 Regular Strong It
o Approximations
m
d
j1 ,j2 =1 i=1
m
d
tn+1
j1 =1
z2
tn
j1 =1 i=1
tn+1
tn
z2
bi,j1
tn
tn
z2
tn+1
tn
bk,j2
xi
i,j1
dWzj11 dWzj22
ck (v)
dWzj11 p (dv, dz2 )
xi
tn
tn+1
z2
+
tn
tn
k
c (tn , Y n + c(v1 ), v2 ) ck (v2 )
(7.2.9)
+ ck pn +
m
1
2
m
d
j1 ,j2 =1 i=1
bi,j1
bk,j2
Wnj1 Wnj2 n
i
x
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
1 k
2
{c (tn , Y n + c) ck } (pn ) pn ,
2
(7.2.10)
for k {1, 2, . . . , d}. The special case of additive diusion and jump coecient, which means b(t, x) = b(t) and c(t, x) = c(t), satises all the required
commutativity conditions and, therefore, leads to an ecient drift-implicit
strong order 1.0 scheme.
321
322
7 Regular Strong It
o Approximations
Yn+1 = Yn + Yn Wn .
(7.3.2)
Yn
,
1 Wn
(7.3.3)
(7.3.4)
fails because it can create division by zero. Furthermore, the absolute moment
of the inverse of a Gaussian random variable is innite, that is
E|(1 Wn )1 | = +.
(7.3.5)
(7.3.6)
(7.3.7)
where
and c0 , c1 represent positive, real valued uniformly bounded functions. The
freedom of choosing c0 and c1 can be exploited to construct a numerically
stable scheme, tailored for the dynamics of the given SDE. Note however,
the balanced implicit method is only of strong order = 0.5 since it is, in
principle, a variation of the Euler scheme. This low order strong convergence
is the price that one has to pay for obtaining numerical stability. As mentioned
already, any numerically stable scheme is better than an unstable one with
higher order.
The balanced implicit method can be interpreted as a family of specic
methods providing a kind of balance between approximating diusion terms
in the numerical scheme. One can expect that by an appropriate choice of
the parameters involved in these schemes one is able to nd an acceptable
combination that is suitable for the approximate solution of a given SDE.
Numerical experiments will demonstrate the better behavior of the balanced
implicit method in comparison with the explicit Euler method. As will be
discussed in Chap. 14, the method has a large working range of suitable time
step sizes without numerical instabilities arising. This is in contrast to the
323
explicit Euler method, where one has to be careful when using relatively large
time step sizes.
In a number of applications, in particular, those involving SDEs with multiplicative noise as is typical in nance and ltering, see Fischer & Platen
(1999) and Sect. 10.4, balanced implicit methods show better numerical stability than most other methods.
The General Balanced Implicit Method
To formulate the balanced implicit method for more general systems without
jumps, that is m, d {1, 2, . . .}, we suppose that the d-dimensional stochastic
process X = {X t , t [0, T ]} with E(X 0 )2 < satises the d-dimensional
SDE
m
dX t = a(t, X t ) dt +
bj (t, X t ) dWtj ,
(7.3.8)
j=1
1
m
j=1
where
C n = c0 (n , Y n ) +
m
cj (n , Y n ) |Wnj |
(7.3.10)
j=1
m
aj cj (t, x)
j=1
(7.3.11)
Here I is the unit matrix. Obviously, condition (7.3.11) can be easily fullled
by keeping c0 , c1 , . . . , cm all positive denite. Under these conditions one obtains directly the one-step increment Y n+1 Y n of the balanced implicit
method via the solution of a system of linear algebraic equations.
324
7 Regular Strong It
o Approximations
1
2
(7.3.14)
and p1 p2 + 12 . Then,
12
1
1
2
0
Y
|
K 1 + |X 0 |2 2 p2 2
E |X 0,X
k A0
k
(7.3.15)
m
bj (k , Y k ) Wkj ,
(7.3.16)
j=1
325
,Yn
H1 = E X nn+1
Y n+1 An
,Yn
E
= E X nn+1
YE
n+1 An + E Yn+1 Y n+1 An
1
K 1 + |Y n |2 2 2 + H2
with
H2 = E Y E
n+1 Y n+1 An
m
j
1
j
I (I + Cn )
= E
a(n , Y n ) +
b (n , Y n ) Wn An
j=1
m
j
1
j
= E (I + C n ) C n a(n , Y n ) +
b (n , Y n ) Wn An .
j=1
326
7 Regular Strong It
o Approximations
1
b(n , Yn ) (W 2 )
Yn+1 = Yn + a + b W + b
x
2
+ (Yn Yn+1 ) D(Yn ) |W 2 |.
(7.3.17)
Here the function D(Yn ) can be relatively freely chosen. This scheme is of
strong order = 1 under appropriate conditions, see Alcock & Burrage (2006)
and Kahl & Schurz (2006).
More generally, a multi-dimensional version of the rst strong order balanced implicit method is of the form
Y n+1 = Y n + a(n , Y n ) +
m
bj (n , Y n ) Wnj
j=1
d
m
j1 ,j2 =1
i=1
i,j1
!
m
bj2
I
Y
)
Dj1 ,j2 (Y n ) |Ij1 ,j2 |.
+
(Y
j ,j
n
n+1
xi 1 2
j ,j =1
1
327
p (tn+1 )
+ b(tn+1 , Yn+1 ) + (1 )b Wn +
c(i ), (7.4.1)
where a
= a b b , and the predictor
p (tn+1 )
Yn+1 = Yn + an + bWn +
c(i ).
(7.4.2)
m
n+1 ) + (1 )bk,j W j +
bk,j (tn+1 , Y
n
j=1
p (tn+1 )
ck (i ),
a
= a
bk,j1
j1 ,j2 =1 i=1
bk,j2
,
xi
(7.4.4)
m
j=1
p (tn+1 )
k,j
Wnj
ck (i ).
(7.4.5)
328
7 Regular Strong It
o Approximations
stability properties of the strong order 1.0 Taylor scheme (6.2.1) we have presented the drift-implicit strong order 1.0 Taylor scheme (7.2.4). However, this
scheme is computationally expensive, since it generally requires the solution
of an algebraic equation at each time step. In the following we propose the
predictor-corrector strong order 1.0 scheme which combines good numerical
stability properties and eciency.
In the one-dimensional case, d = m = 1, the predictor-corrector strong
order 1.0 scheme is given by the corrector
Yn+1 = Yn + a(tn+1 , Yn+1 ) + (1 ) a n + bWn
tn+1
tn+1
z2
+
c(v) p (dv, dz) + b b
dWz1 dWz2
tn
tn
tn+1
z2
tn
tn+1
tn+1
tn
z2
E
tn
tn
tn
z2
+
tn
tn
b (tn , Yn + c(v)) b p (dv, dz1 )dWz2
(7.4.6)
c (tn , Yn + c(v1 ), v2 ) c(v2 ) p (dv1 , dz1 ) p (dv2 , dz2 ),
Yn+1 = Yn + an + bWn +
+ bb
tn
tn+1
tn+1
tn
tn+1
tn
tn+1
tn
dWz1 dWz2
tn
tn
z2
z2
tn
z2
tn
tn
z2
tn+1
E
b (tn , Yn + c(v)) b p (dv, dz1 )dWz2
(7.4.7)
c (tn , Yn + c(v1 ), v2 ) c(v2 ) p (dv1 , dz1 ) p (dv2 , dz2 ).
329
m
k
n+1 ) + (1 ) ak n +
Yn+1
= Ynk + ak (tn+1 , Y
bk,j Wnj
j=1
tn+1
tn
m
d
j1 ,j2 =1 i=1
tn
m
d
tn+1
bk,j2
}dWzj11 dW j2 (z2 )
xi
{bi,j1
z2
ck (v)
dWzj11 p (dv, dz2 )
xi
bi,j1
tn+1
tn
z2
tn
j1 =1
z2
tn
tn
j1 =1 i=1
tn+1
tn
tn+1
z2
ck (tn , Y n + c(v1 ), v2 ) ck (v2 )
+
E
tn
tn
(7.4.8)
m
k,j
Wnj
i,j1
j1 ,j2 =1 i=1
m
d
tn+1
j1 =1
tn+1
tn
tn+1
z2
tn
z2
tn
tn
tn
z2
+
tn
bi,j1
j1 =1 i=1 tn
m
tn+1
tn
bk,j2
xi
j=1
m
d
tn+1
z2
tn
dWzj11 dWzj22
ck (v)
dWzj11 p (dv, dz2 )
xi
bk,j1 (tn , Y n + c(v)) bk,j1 p (dv, dz2 ) dWzj21
(7.4.9)
330
7 Regular Strong It
o Approximations
also. In particular, if the SDE under analysis satises the diusion commutativity condition (7.1.18) together with the jump commutativity condition
(7.1.16) and has mark-independent jump size, then we obtain an ecient, implementable predictor-corrector strong order 1.0 scheme. Its kth component
is given by the corrector
m
k
n+1 ) + (1 ) ak n +
= Ynk + ak (tn+1 , Y
bk,j Wnj
Yn+1
j=1
+ ck pn +
m
1
2
m
d
j1 ,j2 =1 i=1
bi,j1
bk,j2
Wnj1 Wnj2 n
i
x
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
1 k
2
{c (tn , Y n + c) ck } (pn ) pn ,
2
(7.4.10)
m
bk,j Wnj + ck pn
j=1
1
2
m
d
j1 ,j2 =1 i=1
m
bi,j1
bk,j2
Wnj1 Wnj2 n
i
x
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
1 k
2
{c (tn , Y n + c) ck } (pn ) pn ,
2
(7.4.11)
with [0, 1], for k {1, 2, . . . , d}. In the special case of additive diusion
and jump coecient, b(t, x) = b(t) and c(t, x) = c(t), respectively, which
satisfy the required commutativity conditions, we obtain an ecient predictorcorrector strong order 1.0 scheme.
We remark that as in the predictor-corrector Euler scheme as with the predictor-corrector strong order 1.0 scheme, one can introduce quasi-implicitness
into the diusion part. We emphasized that predictor-corrector schemes with
higher strong order can be easily constructed, providing numerically stable, efcient and conveniently implementable discrete-time approximations of jump
diusion processes that are quite ecient in many cases, see also Chap. 14.
331
tn ,tn+1
A \{v}
n,
+R
(7.5.2)
,n are
with n {0, 1, . . . , nT 1}. We assume that the coecients h,n and h
Atn -measurable and satisfy the estimates
2
)|
(7.5.3)
C(u) 2() ,
E
max |h,n f (tn , Y
n
0nnT 1
and
E
max
0nnT 1
,n f (tn , Y )|2
|h
n
C(u) 2() ,
(7.5.4)
332
7 Regular Strong It
o Approximations
2
E max
Rk K 2 ,
1nnT
0kn1
and
2
2
k
E max
R
K
1nnT
0kn1
(7.5.5)
(7.5.6)
Ht = E
333
2
max Y n Y n
1nnt
n1
.
)
I f (tk , Y
= E max
k
1nnt
tk ,tk+1
k=0 A \{v}
2
n1
.
,k
I h
+ Rn
tk ,tk+1
k=0
A \{v}
K1
A \{v}
2
n1
.
) f (tk , Y )
I f (tk , Y
E max
k
k
1nnt
tk ,tk+1
k=0
2
n1
.
,k
+E max
I f (tk , Y
)
h
k
1nnt
tk ,tk+1
k=0
n1
2
+ K1 E max
Rn .
1nnt
(7.5.8)
k=0
t
) f (tk , Y )|2
Ht K 2
...
E
max |f (tk , Y
k
k
0
A \{v}
1nnu
+E
max
1nnu
|f (tk , Y
k )
K2
,n |
h
(dv ) . . . (dv
Y |2 )
E( max |Y
k
k
0
A \{v}
1nnu
s()
)du () + K3 2
...
E
K4 (v1 , . . . , vs() )
(dv1 ) . . . (dvs() ) du
+
. . . K5 (v1 , . . . , vs() )2() (dv1 ) . . . (dvs() ) du ()
E
+ K3 2
t
Y |2 ) du
K5
E( max |Y
k
k
0nnu
() + K6 2
A \{v}
K7
Hu du + K6 2 .
0
(7.5.9)
334
7 Regular Strong It
o Approximations
From the second moment estimate (6.6.1) at the compensated strong Tay in Lemma 6.6.1 and a similar estimate of the compensated
lor scheme Y
strong Ito scheme Y , one can show that Ht is bounded. Therefore, by applying the Gronwall inequality to (7.5.9), we obtain
Ht K5 2 eK7 t .
(7.5.10)
= Y , we obtain
Since we have assumed Y
0
0
Y |2 ) K 2 .
E( max |Y
n
n
0nnT
(7.5.11)
'
= E
max |Y n Y n + Y n X tn |
0nnT
'
2
2 E
max |Y n Y n | + E
max |Y n X tn |
0nnT
0nnT
K ,
(7.5.12)
By the same arguments used above, one can show the following result.
Corollary 7.5.2
Let Y = {Y
n , n {0, 1, . . . , nT }} be a discrete-time
approximation generated by the strong order It
o scheme (7.5.1). If the conditions of Corollary 6.4.3 are satised, then
'
2
E
max X tn Y
(7.5.13)
K ,
n
0nnT
tn+1
Yn+1 = Yn + an + bWn +
tn
t
z2
n+1
b(tn , Yn ) b
+
dWz1 dWz2
n
tn
tn
tn+1
z2
c(tn , Yn , v) c(v)
+
dWz1 p (dv, dz2 )
n
tn
E tn
tn+1
z2
+
b(tn , Yn + c(v)) b p (dv, dz1 ) dWz2
tn
tn+1
tn
tn
E
z2
tn
335
(7.5.14)
c(tn , Yn + c(v2 ), v1 ) c(v1 ) p (dv1 , dz1 )p (dv2 , dz2 ),
n .
(7.5.15)
b(t, x)
x
and
b (t, x) =
2 b(t, x)
x2
(7.5.16)
(7.5.17)
and
c(tn , Yn , v) = c(tn , Yn , v) + c (tn , Yn , v) Yn Yn
Yn Yn ), v
c tn , Yn + (
2
Yn Yn ,
+
2
(7.5.18)
with
c (t, x, v) =
c(t, x, v)
x
and
c (t, x, v) =
2 c(t, x, v)
x2
(7.5.19)
(7.5.20)
336
7 Regular Strong It
o Approximations
with
h(0),n = a(tn , Yn ),
h(1),n = b(tn , Yn ),
1
c(tn , Yn , v) c(tn , Yn , v) ,
h(1,1),n =
n
h(1,1),n = b (tn , Yn + c(tn , Yn , v)) b (tn , Yn ) ,
h(1,1),n = c (tn , Yn + c(tn , Yn , v2 ), v1 ) c (tn , Yn , v1 ) ,
1
b(tn , Yn ) b(tn , Yn ) .
h(1,1),n =
n
(7.5.21)
Only for = (1, 1) and = (1, 1) the coecients h,n are dierent from
the coecients f,n of the strong order 1.0 Taylor scheme (6.2.1). Therefore,
to prove that the scheme (7.5.14)(7.5.15) is a strong order 1.0 Ito scheme, it
remains to check the condition (7.5.3) for these two coecients.
By the linear growth condition (6.4.9) of Theorem 6.4.1, we have
2
2
b(tn , Yn )2 b tn , Yn + b(tn , Yn ) n K1 (1 + |Yn |4 ) K2 1 + |Yn |
= C1 1 + |Yn |2 + |Yn |4 + |Yn |6 .
(7.5.22)
In a similar way we also obtain
b(tn , Yn )2 c tn , Yn + b(tn , Yn )
2
n , v C2 (v) 1 + |Yn |2 + |Yn |4 + |Yn |6 ,
(7.5.23)
where C2 (v) : E is a (dv)- integrable function.
Following similar steps as those used in the rst part of the proof of Theorem 6.4.1, one can show that
2q
E
max |Yn |
(7.5.24)
K 1 + E(|Y0 |2q ) ,
0nnT 1
b(tn , Yn )2 b (tn , Yn )
max
0nnT 1
2
2
n
K 1 + E(|Y0 |6 )
K 2 () .
(7.5.25)
We also have
E
max
0nnT 1
h(1,1),n f(1,1) (tn , Yn )2
337
C(v) 2 () ,
(7.5.26)
t+
L1
(7.5.27)
where
t+
L0 L0 a(u, Xu )du +
R1 (t) =
t
t+
0
L1
L
a(u,
X
)p
(dv
,
du)
ds
u
1
v1
L1 L0 a(u, Xu ) dWu
s
0
L1 L1 a(u, Xu ) dWu
L L a(u, Xu )du +
t
+
E
t+
1
L1
v1 L a(Xu )p (dv1 , du)
t
s
+
t
dWs
L0 L1
v2 a(u, Xu )du +
s
t
L1 L1
v2 a(u, Xu ) dWu
1 1
Lv1 Lv2 a(u, Xu )p (dv1 , du) p (dv2 , ds)
(7.5.28)
338
7 Regular Strong It
o Approximations
tn+1
+
c(tn , Yn , v) p (dv, ds).
tn
(7.5.29)
By replacing the rst drift coecient a(tn , Yn ) with its implicit expansion
(7.5.27), we obtain
Yn+1 = Yn + a(tn+1 , Yn+1 ) + (1 ) a n + bWn
tn+1
+
c(tn , Yn , v)p (dv, ds)
tn
L0 an + L1 aWn +
tn+1
tn
L1
ap
(dv,
dz)
+
R
(t)
n .
1
v
tn+1
+
c(tn , Yn , v)p (dv, ds).
(7.5.30)
tn
Note that the drift-implicit Euler scheme (7.5.30) is well dened for time step
size
1
,
K
where K is the Lipschitz constant appearing in the Lipschitz condition (1.9.2)
for the drift coecient a. As previously mentioned, Banachs xed point theorem ensures existence and uniqueness of the solution Yn+1 of the algebraic
equation in (7.5.30).
By applying the same arguments to every time integral appearing in a
higher order strong Taylor scheme, even in the general multi-dimensional case,
it is possible to derive multi-dimensional higher order implicit schemes as, for
instance, the drift-implicit strong order 1.0 scheme (7.2.9). To prove the strong
order of convergence of such drift-implicit schemes it is sucient to show that
one can rewrite these as strong Ito schemes. The drift-implicit Euler scheme
339
(7.5.30), for instance, can be written as a strong order 0.5 Ito scheme given
by
Yn+1 = Yn + I(0) [h(0),n ]tn ,tn+1 + I(1) [h(1),n ]tn ,tn+1 + I(1) [h(1),n ]tn ,tn+1 + Rn
(7.5.31)
with
h(0),n = a(tn , Yn ),
h(1),n = b(tn , Yn ),
and
Rn = n (a(tn+1 , Yn+1 ) a(tn , Yn )) .
(7.5.33)
Since the coecients h,n are the same as those of the Euler scheme
(6.1.15), all that remains is to check condition (7.5.6) for the remainder term
Rn . Following similar steps as those used in the proof of Lemma 6.6.1, one
can derive the second moment estimate
2
E
max |Yn | K 1 + E(|Y0 |2 ) .
(7.5.34)
0nnT 1
We refer to Higham & Kloeden (2006) for details of the proof of this estimate
in the case of the drift-implicit Euler scheme for SDEs driven by Wiener
processes and homogeneous Poisson processes.
By applying the Cauchy-Schwarz inequality, the linear growth condition
(1.9.3) and the estimate (7.5.34), we obtain
2
Rk
E max
1nnT
0kn1
2
E max
1nnT
KE
2
|k |
0kn1
0knT 1
K E
2
|a(tk+1 , Yk+1 ) a(tk , Yk )|
0kn1
2
|a(tk+1 , Yk+1 ) a(tk , Yk )|
0knT 1
2
|a(tk+1 , Yk+1 ) a(tk , Yk )|
0knT 1
K E
0knT 1
2
|Yk+1 Yk | .
(7.5.35)
340
7 Regular Strong It
o Approximations
2
E
|Yk+1 Yk |
0knT 1
i=1
2
|Yk+1 Yk | P (nT = i)
0ki1
i=1
2
E |Yk+1 Yk | Atk P (nT = i), (7.5.36)
0ki1
K E
tk+1
tk
2
Atk
k ,tk+1
E I [f (tk , Yk )]t
2
( (a(tk+1 , Yk+1 ) a(tk , Yk )) + ;
a(tk , Yk )) dz Atk
{(1),(1)}
K (tk+1 tk )
E |a(tk+1 , Yk+1 )|2 + |a(tk , Yk )|2 + |;
a(tk , Yk )|2 Atk dz
tk+1
tk
tk
{(1),(1)}
tk+1
tk+1
2
E f (tk , Yk ) Atk (dv s() )dz
1 + |Yk+1 |2 + 1 + |Yk |2 Atk dz
tk
{(1),(1)}
tk+1
E
E
tk
1 + |Yk |2 Atk (dv s() )dz
K 1 + max |Yn |
0ni
(tk+1 tk )
(7.5.37)
341
2
E max
Rk
1nnT
0kn1
2
E
1 + max |Yn |
0ni
i=1
(tk+1 tk ) P (nT = i)
0ki1
P (nT = i)
K 1 + E(|Y0 |2 ) T
i=1
K = K ,
2
(7.5.38)
for = 0.5.
Therefore, the convergence of the drift-implicit Euler scheme follows from
Corollary 7.5.2 since we have shown that it can be rewritten as a strong Ito
scheme of order = 0.5. In a similar way it is possible to show that the driftimplicit strong order 1.0 scheme (7.2.9) can be rewritten as a strong order
= 1.0 It
o scheme.
Strong Predictor-Corrector Schemes
In this subsection we show how to rewrite strong predictor-corrector schemes
as strong Ito schemes and how to derive the corresponding orders of strong
convergence.
Let us consider the family of strong predictor-corrector Euler schemes
(7.4.1)(7.4.2). We recall that in the one-dimensional case, d = m = 1, this
scheme is given by the corrector
Yn+1 = Yn +
a (tn+1 , Yn+1 ) + (1 )
a n
p (tn+1 )
+ b(tn+1 , Yn+1 ) + (1 )b Wn +
c(i ), (7.5.39)
where a
= a b b , and the predictor
p (tn+1 )
Yn+1 = Yn + an + bWn +
c(i ).
(7.5.40)
This scheme can be written as a strong order 0.5 Ito scheme given by
Yn+1 = Yn + I(0) [h(0),n ]tn ,tn+1 + I(1) [h(1),n ]tn ,tn+1 + I(1) [h(1),n ]tn ,tn+1 + Rn
(7.5.41)
with
342
7 Regular Strong It
o Approximations
h(0),n = a(tn , Yn ),
h(1),n = b(tn , Yn ),
and
Rn = R1,n + R2,n + R3,n + R4,n
with
R1,n =
R2,n =
a(tn+1 , Yn+1 ) a(tn , Yn ) n ,
(7.5.43)
(7.5.44)
b(tn+1 , Yn+1 ) b (tn+1 , Yn+1 ) b(tn , Yn ) b (tn , Yn ) n , (7.5.45)
R3,n = b(tn , Yn ) b (tn , Yn )n ,
and
R4,n =
b(tn+1 , Yn+1 ) b(tn , Yn ) Wn .
(7.5.46)
(7.5.47)
Since the coecients h,n are the same as those of the Euler scheme,
being the strong Taylor scheme of order = 0.5, one needs to show that the
remainder term Rn satises condition (7.5.5).
We also need the following second moment estimate of the numerical approximation Yn , where
2
E
max |Yn | K 1 + E(|Y0 |2 ) .
(7.5.48)
0nnT 1
This estimate can be derived by similar steps as those used in the proof of
Lemma 6.6.1.
By the Cauchy-Schwarz inequality, the Lipschitz condition on the drift
coecient and equation (7.5.40), we obtain
2
R1,k
E max
1nnT
0kn1
2
a(tk+1 , Yk+1 ) a(tk , Yk ) k
= E max
1nnT
0kn1
E max
1nnT
KE
|k |
0kn1
0knT 1
2
a(tk+1 , Yk+1 ) a(tk , Yk )
0kn1
0knT 1
2
a(tk+1 , Yk+1 ) a(tk , Yk )
K E
343
a(tk+1 , Yk+1 ) a(tk , Yk )2
0knT 1
K E
2
Yk+1 Yk
0knT 1
2
[f (tk , Yk )]t ,t
I
K E
k k+1
0knT 1 A0.5 \{v}
2
I [f (tk , Yk )]tk ,tk+1 .
(7.5.49)
0knT 1
A0.5 \{v}
2
E
I [f (tk , Yk )]tk ,tk+1
0knT 1
i=1
i=1
2
P (nT = i)
I [f (tk , Yk )]tk ,tk+1
0ki1
(7.5.50)
2
E I [f (tk , Yk )]tk ,tk+1 Atk P (nT = i)
0ki1
for every A0.5 \{v}. In the last line of (7.5.50) we have applied standard
properties of conditional expectations.
By using the Cauchy-Schwarz inequality for = (0), the It
o isometry for
= (j) and j {1, 1}, and the linear growth conditions (6.4.9), we obtain
2
E I [f (tk , Ytk )]tk ,tk+1 Atk
tk+1
n()
tk
s() n()
E |f (tk , Ytk )|2 Atk (dv s() )dz
tk+1
E 1 + |Ytk |2 Atk dz
tk
(7.5.51)
344
7 Regular Strong It
o Approximations
2
E max
R1,k
1nnT
0kn1
s() n()
A0.5 \{v}
i=1
ti
E
0
E (1 + |Ynz | )Atnz dz
A0.5 \{v}
A0.5 \{v}
K = K2 ,
s()
n()
i=1
ti
P (nT = i)
2
E 1 + max |Yn | dz P (nT = i)
0ni
s() n() 1 + E |Y0 |2 T
P (nT = i)
i=1
(7.5.52)
for = 0.5.
Note that in Theorem 6.4.1 and in Theorem 7.5.1 we assumed that the
coecient function f satises the Lipschitz condition (6.4.7) for every
A . If instead we require this condition to hold for every A0.5 B(A0.5 ),
then by taking into consideration the coecient function corresponding to
the multi-index (1, 1) B(A0.5 ), we see that the coecient b b satises the
Lipschitz type condition (6.4.7). Therefore, with similar steps as in (7.5.49),
(7.5.50) and (7.5.51), we obtain
2
E max
R2,k K = K 2 ,
(7.5.53)
1nnT
0kn1
for = 0.5.
By the linear growth condition on the coecient b b , which follows from
condition (6.4.9), and the second moment estimate (7.5.48), one can show that
2
E max
R3,k K = K 2 ,
(7.5.54)
1nnT
0kn1
for = 0.5.
By Doobs inequality, It
os isometry, the Lipschitz condition on the diusion coecient and (7.5.40), we obtain
345
2
E max
R4,k
1nnT
0kn1
2
tk+1
b(tk+1 , Yk+1 ) b(tk , Yk ) dWz
= E max
1nnT
tk
0kn1
2
T
b(tnz +1 , Ynz +1 ) b(tnz , Ynz ) dWz
4E
0
=4
2
E b(tnz +1 , Ynz +1 ) b(tnz , Ynz ) dz
K
0
K
0
2
E Ynz +1 Ynz dz
2
(7.5.55)
By the Lipschitz condition, the estimate (7.5.51) and the second moment
estimate (7.5.48), we have
2
E max
R4,k
1nnT
0kn1
K
s()
n()
tnz
tnz +1
E
0
A0.5 \{v}
s()
tnz +1
E
0
A0.5 \{v}
n()
tnz
2
E 1 + |Ytnz | Atnz ds dz
2
E 1 + max |Ytn | Atnz ds dz
0nnT
s() n()+1 1 + E(|Y0 |2 ) T
A0.5 \{v}
K = K2 ,
(7.5.56)
for = 0.5.
Finally, by combining the estimates (7.5.52)(7.5.53)(7.5.54) and (7.5.56)
we have shown that the family of strong predictor-corrector Euler schemes
(7.4.1)(7.4.2) can be rewritten as a strong order = 0.5 Ito scheme. Therefore, it achieves strong order = 0.5.
346
7 Regular Strong It
o Approximations
Also in the general multi-dimensional case one can analogously rewrite the
family of predictor-corrector Euler schemes (7.4.3)(7.4.5) and similarly the
predictor-corrector strong order 1.0 scheme (7.4.10)(7.4.11) as Ito scheme
of strong order = 0.5 and = 1.0, respectively. Finally, strong predictorcorrector schemes of higher strong order can be derived by similar methods
as the ones used above.
7.6 Exercises
7.1. Write down the drift implicit Euler scheme for the Black-Scholes model
dXt = Xt dt + Xt dWt
with X0 = 1.
7.2. Describe for the Vasicek interest rate model
drt = (
r rt ) dt + dWt
the drift implicit strong order 1.0 Runge-Kutta method.
7.3. Construct for the Black-Scholes model in Exercise 7.1 a balanced implicit
method. Make sure that for , > 0 the resulting approximation stays always
nonnegative.
7.4. For the Merton type SDE
dXt = a Xt dt + b Xt dWt + c Xt p (E, dt)
write down a derivative-free order 1.0 scheme when c > 1.
7.5. For the Merton model in the previous exercise provide the fully drift
implicit order 1.0 scheme.
8
Jump-Adapted Strong Approximations
347
348
these cases the proposed regular strong schemes can be readily applied. However, for scenario simulation we need to generate the multiple stochastic integrals by using random number generators. To avoid the generation of multiple
stochastic integrals with respect to the Poisson jump measure, Platen (1982a)
proposed jump-adapted approximations that signicantly reduce the complexity of higher order schemes. A jump-adapted time discretization makes these
schemes easily implementable for scenario simulation, including the case of
a mark-dependent jump size. Indeed, between jump times the evolution of
the SDE (6.1.4) is that of a diusion without jumps, which can be approximated by standard schemes for pure diusions, as presented in Chap. 5 and
Kloeden & Platen (1999). At jump times the prescribed jumps are performed.
As we will show in this chapter, it is possible to develop tractable jumpadapted higher order strong schemes including the case of mark-dependent
jump sizes. The associated multiple stochastic integrals involve only time and
Wiener process integrations.
In this chapter we consider a jump-adapted time discretization 0 = t0 <
t1 < . . . < tnT = T , on which we construct a jump-adapted discrete-time
approximation Y = {Yt , t [0, T ]} of the solution of the SDE (6.1.4). Here
nt = max{n {0, 1, . . .} : tn t} <
a.s.
(8.1.1)
denotes again the largest integer n such that tn t, for all t [0, T ].
We require the jump-adapted time discretization to include all the jump
times {1 , 2 , . . . , p (T ) } of the Poisson random measure p . Moreover, as
in Sect. 6.1, for a given maximum step size (0, 1) we require the jumpadapted time discretization
(t) = {0 = t0 < t1 < . . . < tnT = T },
(8.1.2)
(8.1.3)
and
tn+1
is
Atn measurable,
(8.1.4)
349
regular
t0
t1
t0
t2
r
1
r
2
r
t1 t2
r
t3
t3 = T
jump times
jump-adapted
t4
t5 = T
Within this time grid we separate the diusive part of the dynamics from
the jumps, because the jumps can arise only at discretization times. Therefore,
between discretization points we can approximate the diusive part with a
strong scheme for pure diusion processes. We then add the eect of a jump
to the evolution of the approximate solution when we encounter a jump time
as a discretization time. We remark that with jump-adapted schemes a markdependent jump coecient does not add any additional complexity to the
numerical approximation. Therefore, in this chapter we consider the general
case of jump diusion SDEs with mark-dependent jump size as given in the
SDE (6.1.4).
For convenience we set Y tn = Y n and dene
Y tn+1 = lim Y s ,
stn+1
as the almost sure left-hand limit of Y at time tn+1 . Moreover, to simplify the
notation, we will use again the abbreviation
f = f (tn , Y tn )
(8.1.5)
ai (t, x) i f (t, x)
L f (t, x) = f (t, x) +
t
x
i=1
0
Lk f (t, x) =
d
i=1
m
d
1 i,j
2
b (t, x)br,j (t, x) i j f (t, x)
2 i,r=1 j=1
x x
bi,k (t, x)
f (t, x),
xi
for k {1, 2, . . . , m}
(8.1.6)
(8.1.7)
350
for all t [0, T ] and x d , similar to the pure diusion case, see Chap. 5
and Kloeden & Platen (1999).
Essentially, throughout the remainder of this chapter we review strong
discrete-time approximations for the diusion part of the given SDE, as described in Chap. 5. Additionally, at each jump time of the driving Poisson
measure we introduce a time discretization point into the jump-adapted discretization and approximate the jump as required.
(8.2.1)
and
Ytn+1 = Ytn+1 +
(8.2.2)
where
tn = tn+1 tn
(8.2.3)
(8.2.4)
p (dv, {tn+1 }) = 1
(8.2.5)
E
and
(8.2.6)
(8.2.7)
351
as
p (dv, {tn+1 }) = 0.
(8.2.8)
and
Ytn+1 = Ytn+1 + Ytn+1
(8.2.10)
(8.2.11)
Ytkn+1 = Ytkn+1 +
(8.2.12)
for k {1, 2, . . . , d}, where ak , bk , and ck are the kth components of the drift,
the diusion and the jump coecients, respectively.
For the general multi-dimensional case the kth component of the jumpadapted Euler scheme is of the form
Ytkn+1 = Ytkn + ak tn +
m
bk,j Wtjn
(8.2.13)
j=1
and
Ytkn+1
Ytkn+1
+
E
(8.2.14)
where ak and ck are the kth components of the drift and the jump coecients,
respectively. Furthermore, bk,j is the component of the kth row and jth column of the diusion matrix b, for k {1, 2, . . . , d}, and j {1, 2, . . . , m}.
Additionally,
Wtjn = Wtjn+1 Wtjn
(8.2.15)
352
b b
2
(Wtn ) tn
2
(8.2.16)
and (8.2.2), which achieves strong order = 1.0. This scheme can be interpreted as a jump-adapted version of the Milstein scheme for pure diusions,
see Milstein (1974).
The comparison of the jump-adapted strong order 1.0 scheme (8.2.16) with
the regular strong order 1.0 Taylor scheme (6.2.1), shows that jump-adapted
schemes are much simpler. Avoiding the problem of the generation of multiple
stochastic integrals with respect to the Poisson measure.
For instance, for the SDE (1.8.5) of the Merton model, we have the jumpadapted strong order 1.0 Taylor scheme of the form
Ytn+1 = Ytn + Ytn tn + Ytn Wtn +
2 Ytn
2
(Wtn ) tn (8.2.17)
2
and (8.2.10).
For the multi-dimensional case with scalar Wiener process, being m = 1,
the kth component of the jump-adapted strong order 1.0 Taylor scheme is
given by
Ytkn+1 = Ytkn + ak tn + bk Wtn +
d
2
1 l bk
W
(8.2.18)
b
t
t
n
n
2
xl
l=1
m
j=1
bk,j Wtjn +
d
m
j1 ,j2 =1 i=1
bi,j1
bk,j2
I(j1 ,j2 )
xi
(8.2.19)
and (8.2.14), for k {1, 2, . . . , d}. For the generation of the multiple stochastic
integrals I(j1 ,j2 ) we refer to Sect. 5.3.
If we have a multi-dimensional SDE satisfying the diusion commutativity
condition, where
(8.2.20)
Lj1 bk,j2 (t, x) = Lj2 bk,j1 (t, x)
353
Ytkn
+ a tn +
m
bk,j Wtjn
j=1
1
2
m
d
bi,j1
j1 ,j2 =1 i=1
bk,j2
Wtjn1 Wtjn2 tn
(8.2.21)
xi
and (8.2.14), for k {1, 2, . . . , d}. The special case of additive diusion noise,
where b(t, x) = b(t), satises the required commutativity condition and, therefore, leads to an ecient jump-adapted strong order 1.0 Taylor scheme. Note
that this holds for general jump coecients, unlike the case of regular strong
schemes in Chap. 6 and one could easily discuss further special cases of the
above schemes.
Jump-Adapted Strong Order 1.5 Taylor Scheme
If we approximate the diusive part of the SDE (6.1.4) with the strong order
1.5 Taylor scheme, see Kloeden & Platen (1999), then we obtain the jumpadapted strong order 1.5 Taylor scheme.
In the autonomous one-dimensional case, d = m = 1, the jump-adapted
strong order 1.5 Taylor scheme is given by
bb
2
(Wtn ) tn + a bZtn
Ytn+1 = Ytn + atn + bWtn +
2
1
1 2
1 2
2
+
a a + b a (tn ) + a b + b b (Wtn tn Ztn )
2
2
2
1
1
2
(Wtn ) tn Wtn ,
(8.2.22)
+ b b b + b2
2
3
and equation (8.2.2), where we have used the abbreviation dened in (8.1.5).
Here we need the multiple stochastic integral
tn+1
s2
dWs1 ds2 .
(8.2.23)
Ztn =
tn
tn
Recall that Ztn has a Gaussian distribution with mean E(Ztn ) = 0, variance E((Ztn )2 ) = 13 (tn )3 and covariance E(Ztn Wtn ) = 12 (tn )2 .
For example, for the SDE (1.8.5) of the Merton model the terms involving
the random variable Ztn cancel out, thus, yielding the rather simple jumpadapted strong order 1.5 Taylor scheme
354
2 Ytn
2
(Wtn ) tn
2
2 Ytn
2
(tn ) + Ytn (Wtn tn )
2
1
1 3
2
(Wtn ) tn Wtn
+ Ytn
2
3
(8.2.24)
and (8.2.10).
For the multi-dimensional case with scalar Wiener process, m = 1, the kth
component of the jump-adapted strong order 1.5 Taylor scheme is given by
1
2
Ytkn+1 = Ytkn + ak tn + bk Wtn + L1 bk (Wtn ) tn
2
1
2
+ L1 ak Ztn + L0 bk Wtn tn Ztn + L0 ak (tn )
2
1
1
2
(Wtn ) tn Wtn
+ L1 L1 b k
(8.2.25)
2
3
and (8.2.12), for k {1, 2, . . . , d}, where the dierential operators L0 and L1
are dened in (8.1.6) and (8.1.7), respectively.
In the general multi-dimensional case the kth component of the jumpadapted strong order 1.5 Taylor scheme is given by
Ytkn+1 = Ytkn + ak tn +
+
m
1 0 k
2
L a (tn )
2
j=1
m
j1 ,j2 =1
m
(8.2.26)
j1 ,j2 ,j3 =1
355
1
{b(tn , Ytn ) b} (Wtn )2 tn ,
2 tn
(8.3.1)
tn .
(8.3.2)
1
t ) bk } (Wt )2 t ,
{bk (tn , Y
n
n
n
2 tn
(8.3.3)
tn ,
(8.3.4)
m
bk,j Wtjn
j=1
1
tn
m
j1 ) bk,j2 (tn , Y t ) I(j ,j ) , (8.3.5)
bk,j2 (tn , Y
tn
n
1 2
j1 ,j2 =1
356
tn ,
(8.3.6)
for k {1, 2, . . . , d} and for j {1, 2, . . . , m}. The multiple stochastic integrals can be generated, as described in Sect. 5.3. The diusion commutativity
condition presented in Sect. 8.2 may apply here also, and can therefore, lead
to very ecient jump-adapted derivative-free schemes.
Derivative-Free Strong Order 1.5 Scheme
In the autonomous one-dimensional case, d = m = 1, the jump-adapted
derivative-free strong order 1.5 scheme is given by
Ytn+1 = Ytn + bWtn +
1
a(Yt+
) a(Yt
) Ztn
n
n
2 tn
1 +
a(Ytn ) + 2a + a(Yt
) tn
n
4
1
t ) (Wt )2 t
b(Yt+
+
)
b(
Y
n
n
n
n
4 tn
+
+
+
1
b(Yt+
) + 2b + b(Yt
) (Wtn tn Ztn )
n
n
2 tn
+
1
t ) b(
b(
tn ) b(Ytn ) + b(Ytn )
n
4 tn
1
2
(Wtn ) tn Wtn ,
3
Yt
= Ytn + atn bWtn ,
n
tn = Ytn b(Ytn )
tn .
(8.3.7)
(8.3.8)
(8.3.9)
The multiple stochastic integral Ztn = I(1,0) can be generated as described in (5.3.37).
We could now continue to list multi-dimensional versions of this and other
schemes, including those satisfying the diusion commutativity condition.
However, we refrain from doing so since the methodology is straightforward
and already demonstrated above.
357
regions of numerical stability. To achieve this, one needs to introduce implicitness into the schemes. For the derivation of jump-adapted drift-implicit
schemes, it is sucient to replace the explicit scheme for the diusive part
by a drift-implicit one. We refer to Chap. 7 and Kloeden & Platen (1999) for
drift-implicit methods for SDEs driven by Wiener processes only.
Drift-Implicit Euler Scheme
For a one-dimensional SDE, d = m = 1, the jump-adapted drift-implicit Euler
scheme is given by
Ytn+1 = Ytn + a(tn+1 , Ytn+1 ) + (1 ) a tn + bWtn ,
(8.4.1)
and (8.2.2), where the parameter [0, 1] characterizes the degree of implicitness.
In the multi-dimensional case with scalar Wiener noise, m = 1, the kth
component of the jump-adapted drift-implicit Euler scheme is given by:
Ytkn+1 = Ytkn + ak (tn+1 , Y tn+1 ) + (1 ) ak tn + bk Wtn , (8.4.2)
and (8.2.12), for k {1, 2, . . . , d}.
In the multi-dimensional case the kth component of the jump-adapted driftimplicit Euler scheme follows as
Ytkn+1
Ytkn
+ a (tn+1 , Y tn+1 ) + (1 ) a
k
tn +
m
bk,j Wtjn ,
j=1
358
Ytkn+1 = Ytkn + ak (tn+1 , Y tn+1 ) + (1 ) ak tn + bk Wtn
+
d
2
1 l bk
Wtn tn
b
l
2
x
(8.4.4)
l=1
Ytkn
m
k
k
+ a (tn+1 , Y tn+1 ) + (1 ) a tn +
bk,j Wtjn
j=1
m
d
k,j2
i,j1 b
+
b
I(j1 ,j2 )
xi
j ,j =1 i=1
1
(8.4.5)
Ytn+1 = Ytn +
359
(8.5.2)
m
t ) + (1 )bk,j W j ,
bk,j (tn+1 , Y
tn
n+1
(8.5.3)
j=1
m
d
bk,j1
j1 ,j2 =1 i=1
the predictor
Ytkn+1 = Ytkn + ak tn +
m
j=1
and (8.2.14).
bk,j2
,
xi
bk,j Wtjn ,
(8.5.4)
(8.5.5)
360
(8.5.7)
and (8.2.2). The parameter [0, 1] characterizes again the degree of implicitness in the drift coecient. This scheme attains a strong order = 1.0.
In the general multi-dimensional case, the kth component of the jumpadapted predictor-corrector strong order 1.0 scheme is given by the corrector
Ytkn+1
m
k
k
+ a (tn+1 , Y tn+1 ) + (1 ) a tn +
bk,j Wtjn
Ytkn
j=1
d
m
bk,j2
bi,j1
I(j1 ,j2 ) ,
xi
j ,j =1 i=1
1
(8.5.8)
the predictor
Ytkn+1 = Ytkn +ak tn +
m
k,j
j=1
Wtjn +
m
d
bk,j2
bi,j1
I(j1 ,j2 ) (8.5.9)
xi
j ,j =1 i=1
1
and (8.2.14), for k {1, 2, . . . , d}. For the generation of the multiple stochastic
integrals I(j1 ,j2 ) we refer again to Sect. 5.3.
For SDEs satisfying the diusion commutativity condition (8.2.20), as in
the case of an additive diusion coecient b(t, x) = b(t), we can express all of
the multiple stochastic integrals in terms of the increments Wtjn1 and Wtjn2
of the Wiener process. This yields an eciently implementable jump-adapted
predictor-corrector strong order 1.0 scheme whose kth component is given by
the corrector
t ) + (1 ) ak t
Ytkn+1 = Ytkn + ak (tn+1 , Y
n+1
n
+
m
j=1
the predictor
bk,j Wtjn +
1
2
m
d
j1 ,j2 =1 i=1
bi,j1
bk,j2
Wtjn1 Wtjn2 tn ,
xi
Ytkn+1 = Ytkn + ak tn +
m
361
bk,j Wtjn
j=1
1
2
m
d
bi,j1
j1 ,j2 =1 i=1
bk,j2
Wtjn1 Wtjn2 tn
(8.5.10)
i
x
(8.6.1)
dXt = Xt dt + Xt dWt + c(t, Xt , v) p (dv, dt).
E
Because of the general form of the jump coecient c, the SDE (8.6.1) rarely
admits an explicit solution. However, based on the above described jumpadapted time discretization, we can construct the following jump-adapted
scheme given by
1 2
(8.6.2)
Ytn+1 = Ytn e( 2 )tn +Wtn ,
and
Ytn+1 = Ytn+1 +
(8.6.3)
362
Since we are using the explicit solution of the diusion part, see (1.8.6), no
discretization error is introduced between jump times. Moreover, since by
equation (8.6.3) the jump impact is added at the correct jump times, as such,
even at the jump times we do not introduce any error. Therefore, we have
described a method of simulating the solution of the jump diusion SDE
(8.6.1) that does not generate any discretization error.
Jump-Adapted Almost Exact Solutions
The above approach can be generalized for cases where we have an explicit or
almost exact solution to the diusion part of the SDE under consideration.
Examples are described in Chap. 2. In the general case, we consider here the
d-dimensional jump diusion SDE
(8.6.4)
dX t = a(t, X t ) dt + b(t, X t ) dW t + c(t, X t , v) p (dv, dt),
E
that we aim to solve. One has then to check whether this SDE belongs to the
special subclass of jump diusion SDEs for which the corresponding diusion
SDE
(8.6.5)
dZ t = a(t, Z t ) dt + b(t, Z t ) dW t ,
admits an explicit or almost exact solution in the sense of Chap. 2. If this is
the case, then we can construct, as in (8.6.2)(8.6.3), a jump-adapted scheme
without discretization error. This means, by the above methodology it is possible to simulate exact solutions for a considerable range of SDEs with jumps.
As previously noticed, if the SDE under consideration is driven by a Poisson measure with high intensity, then a jump-adapted scheme may be computationally too expensive. In such a case, one may prefer to use a regular
scheme that entails a discretization error, but permits the use of a coarser
time discretization. There may be also cases where one can use the explicit
solution of a Levy process driven SDE when approximating a more complex
SDE with a few low intensity major jumps.
363
d
f (t, x) +
ai (t, x) i f (t, x)
t
x
i=1
m
d
1 i,j
2
b (t, x)br,j (t, x) i j f (t, x)
2 i,r=1 j=1
x x
and
Lk f (t, x) =
d
bi,k (t, x)
i=1
f (t, x)
xi
(8.7.1)
(8.7.2)
m \A : A}.
B(A)
= { M
Moreover, for every {0.5, 1, 1.5, 2, . . .} we dene the hierarchical set
< : l() + n() 2 or l() = n() = + 1 .
A = M
2
Jump-Adapted Strong Taylor Schemes
For a jump-adapted time discretization with maximum time step size
(0, 1) we dene the jump-adapted strong order Taylor scheme by
364
Y tn+1 = Y tn +
f (tn , Y tn ) I
(8.7.4)
\{v}
A
and
Y tn+1 = Y tn+1 +
(8.7.5)
where I is the multiple stochastic integral of the multi-index over the time
period (tn , tn+1 ] and for n {0, 1, . . . , nT 1}.
To assess the order of strong convergence of these schemes, we dene
through a specic interpolation the jump-adapted strong order Taylor approximation by
I [f (tnt , Y tnt )]tnt ,t
(8.7.6)
Yt =
\{v}
A
and
2
2
E(|X 0 Y
0 | )C .
(8.7.7)
Moreover, suppose that the coecient functions f satisfy the following conditions:
For A , t [0, T ] and x, y d , the coecient f satises the Lipschitz
type condition
(8.7.8)
|f (t, x) f (t, y)| K1 |x y|.
A ) we assume
For all A B(
f C 1,2
and
f H ,
(8.7.9)
(8.7.10)
'
2
E
sup |X s Y s | A0 K3
(8.7.11)
0sT
365
n 1
t
A
)
B(
n=0
!
I [f (, X )]tn ,tn+1 + I [f (, X )]tnt ,t
n=0
+
0
(8.7.12)
for t [0, T ].
The jump-adapted strong order Taylor scheme can be written as
n 1
!
t
Yt = Y0 +
I [f (tn , Y tn )]tn ,tn+1 + I [f (tnt , Y tnt )]tnt ,t
\{v}
A
n=0
+
0
(8.7.13)
(8.7.14)
0zT
Moreover, with similar steps to those used in the proof of Lemma 6.6.1,
we can show the estimate
2
2
E
sup |Y z | A0 C 1 + E |Y
|
.
(8.7.15)
0
0zT
366
sup |X z
Z(t) = E
0zt
2
Y
z |
sup X 0 Y
0 +
=E
0zt
A0
z 1
n
\{v}
A
I [f (tn , X tn ) f (tn , Y
tn )]tn ,tn+1
n=0
+
E
I [f (, X )]tn ,tn+1 + I [f (, X )]tnz ,z
n=0
A
)
B(
2
c(tnu , X tnu , v) c(tnu , Y tnu , v) p (dv, du) A0
2
C3 X 0 Y
0 +
\{v}
A
St +
Ut + Pt
A
)
B(
St = E sup
I f (tn , X tn ) f (tn , Y
tn )
0zt
n=0
.
+ I f (tnz , X tnz ) f (tnz , Y
tnz )
Ut
=E
tnz ,z
(8.7.16)
tn ,tn+1
2
A0 ,
(8.7.17)
z 1
n
2
sup
I [f (, X )]tn ,tn+1 + I [f (, X )]tnz ,z A0 ,
0zt
n=0
(8.7.18)
and
Pt = E
z
2
c(tnu , X tnu , v) c(tnu , Y tnu , v) p (dv, du) A0 .
sup
0zt
(8.7.19)
Therefore, the terms St and Ut can be estimated as in the proof of Theorem
os isometry
6.4.1, while for Pt , by applying Jensens and Doobs inequalities, It
for jump processes, the Cauchy-Schwarz inequality and the Lipschitz condition
(1.9.2), we obtain
Pt = E sup
0zt
+
E
c(tnu , X tnu , v) c(tnu , Y tnu , v) p (dv, du)
2
c(tnu , X tnu , v) c(tnu , Y tnu , v) (dv)du
A0
2
t
c(tnu , X tnu , v) c(tnu , Y tnu , v) p (dv, du)
8 E
E
A0
z
2
c(tnu , X tnu , v) c(tnu , Y tnu , v) (dv)du
+ 2E sup
0zt
E
t
8E
+2tE
K E
0
t
X tn
A0
c(tn , X t , v) c(tn , Y t , v)2 (dv) du A0
u
nu
u
nu
t
367
c(tnu , X tn
2
A0
,
v)
c(t
,
Y
,
v)
(dv)
du
nu
tnu
u
2
Y tnu du A0
u
Z(u) du.
(8.7.20)
368
for t [0, T ] and X0 > 0, being the Merton model introduced in (1.8.5). We
recall that this SDE admits the explicit solution
"
p (t)
1
Xt = X0 e( 2
)t+Wt
i ,
(8.8.2)
i=1
(8.8.3)
369
Fig. 8.8.1. Log-log plot of strong error versus time step size (constant marks)
370
Fig. 8.8.2. Log-log plot of strong error versus time step size (constant marks)
Fig. 8.8.3. Log-log plot of strong error versus time step size (lognormal marks)
The jump-adapted drift-implicit strong order 1.0 scheme is more accurate for
small step sizes. In this plot we notice for the selected time step sizes that the
accuracy of the regular predictor-corrector Euler scheme is of a magnitude
equal to that of some rst order schemes. Of course, since its strong order of
convergence equals = 0.5, for smaller time step sizes it becomes less accurate
than rst order schemes. Finally, the most accurate scheme for all time step
sizes considered is the jump-adapted strong order 1.5 Taylor scheme, which
achieves an order of strong convergence of approximately = 1.5.
Let us now consider the case of lognormally distributed marks. Here the
logarithm of mark i = ln (i ) is an independent Gaussian random variable,
371
Fig. 8.8.4. Log-log plot of strong error versus time step size (lognormal marks)
In Fig. 8.8.3, we plot the results obtained from the regular and jumpadapted, regular and jump-adapted drift-implicit, and the regular predictorcorrector Euler schemes. Also in this case the results for the jump-adapted
predictor-corrector Euler scheme are indistinguishable from those of the jumpadapted Euler scheme and, thus, omitted. We remark that these results are
very similar to those obtained in Fig.8.8.1 for the case of constant marks, conrming that the orders of strong convergence derived in the previous chapters
hold also for the case of random marks. Again all schemes considered achieve a
strong order of about = 0.5. Moreover, the regular predictor-corrector Euler
scheme is the most accurate one. The remaining schemes have similar accuracy, with the regular Euler scheme the least accurate and the jump-adapted
drift-implicit Euler scheme the most accurate. In Fig. 8.8.4 we report the results for the regular predictor-corrector Euler, jump-adapted strong order 1.0
Taylor, jump-adapted drift-implicit strong order 1.0 and the jump-adapted
strong order 1.5 Taylor schemes. The results are again very similar to those
obtained for the case of constant marks, reported in Fig.8.8.2, with all schemes
achieving the prescribed theoretical orders of strong convergence.
The Case of High Intensities
Let us now consider the strong errors generated by the strong schemes analyzed in the previous subsection, when using the larger intensity = 2. The
remaining parameters of the SDE (8.8.1) are as in the previous subsection.
In Fig. 8.8.5 we show the results for the regular and jump-adapted, regular
and jump-adapted predictor-corrector and jump-adapted drift-implicit Euler
schemes. Note that the error generated by the regular drift-implicit scheme is
very similar to that of the regular Euler scheme and, thus, is here omitted.
All schemes achieve their theoretical orders of strong convergence of approximately = 0.5. Here we can clearly see that jump-adapted schemes are
372
Fig. 8.8.5. Log-log plot of strong error versus time step size (constant marks)
Fig. 8.8.6. Log-log plot of strong error versus time step size (constant marks)
more accurate than their regular counterparts. This is due to the simulation of
the jump impact at the correct jump times in the jump-adapted scheme. Moreover, we report that of the jump-adapted schemes, the drift-implicit scheme is
the least accurate, while the predictor-corrector scheme is the most accurate.
In Fig. 8.8.6 we show the results for the strong order 1.0 Taylor, jumpadapted strong order 1.0 Taylor, jump-adapted drift-implicit strong order 1.0,
jump-adapted predictor-corrector Euler and the jump-adapted strong order
1.5 Taylor schemes. Also here all schemes achieve roughly the theoretical
expected order of strong convergence. Note that while in the low intensity case
the accuracy of the regular and jump-adapted versions of the strong order 1.0
Taylor scheme were very similar, in the high intensity case the jump-adapted
versions are more accurate, and highlight that the jump-adapted strong order
1.5 Taylor scheme is the most accurate for all time step sizes considered.
373
Fig. 8.8.7. Log-log plot of strong error versus time step size (lognormal marks)
Fig. 8.8.8. Log-log plot of strong error versus time step size (lognormal marks)
Finally, in Figs. 8.8.7 and 8.8.8 we report the behavior of strong errors
for all schemes analyzed previously in this chapter for the case of lognormal
marks. The results are again very similar to those obtained in the case with
constant marks. In particular, we report that all schemes achieve approximately their theoretical order of strong convergence.
In Fig. 8.8.9 we report the results obtained from the implicit Euler, implicit strong order 1.0 Taylor, jump-adapted implicit Euler and jump-adapted
implicit order 1.0 Taylor schemes with = 1. Also in this case the strong
orders of convergence are consistent with those predicted by the convergence
theorems. We notice again the higher accuracy of jump-adapted schemes due
to the better simulation of the jump component.
We now consider the mark-dependent jump coecient c(t, x, v) = x (v1),
with marks drawn from a lognormal distribution with mean 1.1, and standard
374
Fig. 8.8.9. Log-log plot of strong error versus time step size
Fig. 8.8.10. Log-log plot of strong error versus time step size
deviation 0.02. As explained in Sect. 8.1, jump diusion SDEs with a markdependent jump size can be handled eciently by employing jump-adapted
schemes. Therefore, in Fig. 8.8.10 we compare the following jump-adapted
schemes: Euler, implicit Euler, strong order 1.0 Taylor, implicit strong order
1.0 Taylor, strong order 1.5 Taylor and implicit strong order 1.5 Taylor. For the
implicit schemes the degree of implicitness , equals one. Again, the orders
of strong convergence obtained from our numerical experiments are those
predicted by the theory. Comparing explicit with implicit schemes, we report
that for this choice of parameters the implicit schemes are more accurate.
Since the jump impact is simulated without creating any additional error,
these dierences are due to the approximation of the diusive part. We remark
that implicit schemes, which oer wider regions of stability as we will see in
Chap. 14, are more suitable for problems in which numerical stability is an
375
issue. This applies also for the areas of ltering and scenario simulation in
nance, where SDEs with multiplicative noise naturally arise.
376
(8.9.1)
for t [0, T ].
For a pure jump process X = {Xt , t [0, T ]} that is driven by the Poisson
process N we assume that its value Xt at time t satises the SDE
dXt = c(t, Xt ) dNt
(8.9.3)
377
To provide for later illustration a simple, yet still interesting example, let
us consider the linear SDE
dXt = Xt dNt
(8.9.4)
for t [0, T ] with X0 > 0 and constant . This is a degenerate case of the
SDE (1.8.5), with drift coecient a(t, x) = 0, diusion coecient b(t, x) = 0
and mark-independent jump coecient c(t, x) = x . By application of the
It
o formula one can demonstrate that the solution X = {Xt , t [0, T ]} of the
SDE (8.9.4) is a pure jump process with explicit representation
Xt = X0 exp{Nt ln( + 1)} = X0 ( + 1)Nt
(8.9.5)
for t [0, T ].
Jump-Adapted Schemes
We consider a jump-adapted time discretization 0 = t0 < t1 < . . . < tnT = T ,
where nT is as dened in (6.1.11) and the sequence t1 < . . . < tnT 1 equals
that of the jump times 1 < . . . < NT of the Poisson process N . On this
jump-adapted time grid we construct the jump-adapted Euler scheme by the
algorithm
(8.9.6)
Yn+1 = Yn + c Nn ,
for n {0, 1, . . . , nT 1}, with initial value Y0 = X0 , where Nn = Ntn+1
Ntn is the nth increment of the Poisson process N . Between discretization
times the right-continuous process Y is set to be piecewise constant. Note
that here and in the sequel, when no misunderstanding is possible, we use the
previously introduced abbreviation c = c(tn , Yn ).
Since the discretization points are constructed exactly at the jump times
of N , and the simulation of the increments Nti+1 Nti = 1 is exact, the jumpadapted Euler scheme (8.9.6) produces no discretization error. Let us emphasize that this is a particularly attractive feature of jump-adapted schemes,
when applied to pure jump SDEs. In the case of jump diusion SDEs, the
jump-adapted schemes typically produce a discretization error.
For the implementation of the scheme (8.9.6) one needs to compute the
jump times i , i {1, 2 . . . , NT }, and then apply equation (8.9.6) recursively
for every i {0, 1, 2 . . . , nT 1}. One can obtain the jump times via the
corresponding waiting times between two consecutive jumps by sampling from
an exponential distribution with parameter .
The computational demand of the algorithm (8.9.6) is heavily dependent
on the intensity of the jump process. Indeed, the average number of steps,
therefore the number of operations, is proportional to the intensity . Below
we will introduce alternative methods suitable for large intensities, based on
regular time discretizations.
378
Euler Scheme
In this subsection we develop discrete-time strong approximations whose computational complexity is independent of the jump intensity level. We consider
an equidistant time discretization with time step size (0, 1) as in Chap. 6.
The simplest strong Taylor approximation Y = {Yt , t [0, T ]} is the Euler
scheme, given by
(8.9.7)
Yn+1 = Yn + c Nn
for n {0, 1, . . . , nT 1} with initial value Y0 = X0 and Nn = Ntn+1 Ntn .
Between discretization times the right-continuous process Y is assumed to be
piecewise constant.
By comparing the scheme (8.9.7) with the algorithm (8.9.6), we notice
that the dierence in the schemes occurs in the time discretization. We emphasize that the average number of operations and, thus, the computational
complexity of the Euler scheme (8.9.7) is independent of the jump intensity.
Therefore, a simulation based on the Euler scheme (8.9.7) is also feasible in the
case of jump processes with high intensity. However, while the jump-adapted
Euler scheme (8.9.6) produces no discretization error, the accuracy of the Euler scheme (8.9.7) depends on the size of the time step and the nature of
the jump coecient.
For example, for the linear SDE (8.9.4), the Euler scheme (8.9.7) is of the
form
(8.9.8)
Yn+1 = Yn + Yn Nn = Yn (1 + Nn )
for n {0, 1, . . . , nT 1} with Y0 = X0 . Since the equidistant time discretization tn = n of this Euler scheme does not include the jump times of the
underlying Poisson process, we have an approximation error.
Theorem 6.4.1 shows that the Euler approximation (8.9.7) achieves strong
order of convergence = 0.5. This raises the question of how to construct
higher order discrete-time approximations for the case of pure jump SDEs.
The problem can be approached by a stochastic expansion for pure jump SDEs
that we describe below. This expansion is a particular case of the WagnerPlaten expansion (4.4.4) for jump diusions presented in Chap. 4. Therefore,
the resulting strong approximations are particular cases of the strong schemes
presented in Chap. 6.
Wagner-Platen Expansion
Since the use of the Wagner-Platen expansion for pure jump processes, see
Sect.4.1, may be unfamiliar to some readers, let us rst illustrate the structure
of this formula for a simple example. For any measurable function f :
and a given adapted counting process N = {Nt , t [0, T ]} we have the
representation
f (Nt ) = f (N0 ) +
f (Ns )
(8.9.9)
s(0,t]
379
for all t [0, T ], where f (Ns ) = f (Ns ) f (Ns ). We can formally write the
equation (8.9.9) in the dierential form of an SDE
df (Nt ) = f (Nt ) = (f (Nt + 1) f (Nt )) Nt
for t [0, T ]. This equation can also be obtained from the It
o formula for
semimartingales with jumps, see Sect. 1.5 or Protter (2005).
(Ns ) denes a measurObviously, the following dierence expression f
able function, as long as,
f (N ) = f (N + 1) f (N )
(8.9.10)
f (Nt ) = f (N0 ) +
f (Ns ) dNs
(8.9.11)
(0,t]
2
f (Ns )dNs dNs
f (N0 )dNs +
f (Nt ) = f (N0 ) +
1
1
2
(0,t]
(0,t] (0,s2 )
2
f (Ns )dNs dNs
(N0 )
= f (N0 ) + f
dNs +
1
1
2
(0,t]
(0,t]
(0,s2 )
(8.9.12)
q denotes for an integer q {1, 2, . . .} the q times
for t [0, T ]. Here ()
given in (8.9.10). Note that a double
consecutive application of the function
stochastic integral with respect to the counting process N naturally arises in
(8.9.12). As such, one can now in (8.9.12) apply the formula (8.9.11) to the
2 f (Ns ), which yields
measurable function ()
1
2
3 (t)
dNs + f (N0 )
dNs1 dNs2 +R
f (Nt ) = f (N0 )+f (N0 )
(0,t]
(0,t]
(0,s2 )
(8.9.13)
with remainder term
R3 (t) =
(0,t]
(0,s3 )
3
f (Ns ) dNs dNs dNs
1
1
2
3
(0,s2 )
380
3 (t),
f (Nt ) = f (N0 ) + f (N0 )
+ f (N0 )
+R
1
2
where
f (0) = f (1) f (0),
f (N0 ) =
2
f (N0 ) = f (2) 2 f (1) + f (0).
In the given case this leads to the expansion (4.1.25). More generally, by
induction it follows the Wagner-Platen expansion of the form
f (Nt ) =
l
k=0
with
Nt
k
l+1 (t)
() f (N0 )
+R
k
(8.9.14)
l+1 (t) =
R
...
(0,t]
(0,s2 )
(8.9.15)
for all t [0, T ]. In the same manner as previously shown, this leads to the
expansion
381
f (Xs ) dNs
f (Xt ) = f (X0 ) +
(0,t]
f (X0 ) +
= f (X0 ) +
(0,t]
= f (X0 ) +
dNs2
(0,s2 )
l k
f (X0 )
(0,t]
k=1
= f (X0 ) +
2
f (Xs ) dNs
1
1
(0,s2 )
l+1
dNs1 dNsk + R
f,t
l k
Nt
l+1
f (X0 )
+R
f,t
k
(8.9.16)
k=1
with
l+1 =
R
f,t
l+1
(0,t]
(0,s2 )
for t [0, T ] and l {1, 2, . . .}. One notes that (8.9.16) generalizes (8.9.14) in
an obvious fashion.
For the particular example of the linear SDE (8.9.4) we obtain for any
measurable function f the function
f (X ) = f (X (1 + )) f (X )
for the jump times [0, T ] with N = 1. Therefore, in the case l = 2 we
get from (4.1.23) and (8.9.16) the expression
f (Xt ) = f (X0 ) + f (X0 (1 + )) f (X0 ) (Nt N0 )
+ f (X0 (1 + )2 ) 2 f (X0 (1 + )) + f (X0 )
1
3
f,t
(Nt N0 ) ((Nt N0 ) 1) + R
2
3 we obtain in a systematic
for t [0, T ]. By neglecting the remainder term R
f,t
way, for this simple example a truncated Wagner-Platen expansion of f (Xt ) at
the expansion point X0 . We emphasize that in the derivation of the expansion
k f (),
(8.9.16), only measurability of the function f and the coecients ()
for k {1, . . . , l} is required. This contrasts with the case of diusion and
jump diusion SDEs where sucient dierentiability conditions are needed
to obtain a Wagner-Platen expansion. It also indicates that one can obtain
useful approximations for a general counting process that may not even be
Markovian. This is a remarkable fact since on the market micro structure level
all observed quantities are, in principle, pure jump processes.
382
Fig. 8.9.1. Exact solution, Euler and order 1.0 Taylor approximations
383
384
for t [0, T ], with X 0 . Here the jump coecient c and the Poisson
random measure p as dened in (1.8.2). Note that the mark space E of the
Poisson random measure can be made multi-dimensional or split into disjoint
subsets and, thus, can generate several sources for modeling jumps. The case
of a multi-dimensional SDE driven by several Poisson processes is a specic
case of the SDE (8.9.21). The SDE (8.9.21) is equivalent to the jump diusion
SDE (1.8.2) when the drift coecient a and the diusion coecient b equal
zero. Note that birth and death processes and, more generally, continuous
time Markov chains can be described by the SDE (8.9.21). It is sometimes
not the most convenient or usual description of such dynamics. However, it is
a very generic one as we can see.
Theorem 6.4.1, presented in Chap. 6, establishes the strong order of convergence of strong Taylor approximations for jump diusion SDEs. When
specifying the theorem mentioned to the case of SDEs driven by pure jump
processes, it turns out that it is possible to weaken the assumptions on the
coecients of the Wagner-Platen expansion. As we will see below, the Lipschitz and growth conditions on the jump coecient are sucient enough to
establish the convergence of strong Taylor schemes of any given strong order
of convergence {0.5, 1, 1.5, 2, . . .}. Dierentiability of the jump coecient
is not required. This is due to the structure of the increment operator L1
v , see
(4.3.6), naturally appearing in the coecient of the Wagner-Platen expansion
for pure jump processes.
For a regular time discretization (t) with maximum step size (0, 1)
we dene, the strong order Taylor scheme for pure jump SDEs by
s1
21
tn+1
1 k
0
L
Y
=
Y
+
.
.
.
c(tn , Y
n+1
n
n ,v )
d
k=0
tn
tn
(8.9.22)
for n {0, 1, . . . , nT 1} and {0.5, 1, 1.5, . . .}. Here we have used the
notation
when k = 0
c(tn , Y n , v )
1 k
0
1
0
when k = 1
c(tn , Y n , v ) = Lv1 c(t
L
n ,Y n , v )
0
1
L1
. . . Lv1 c(tn , Y n , v )
when k {1, . . .},
vk
(8.9.23)
385
Assume that the jump coecient satises the Lipschitz con|c(t, x, u) c(t, y, u)| K|x y|,
(8.9.24)
Assume that the jump coecient satises the growth condi (1 + |x|2 )
|c(t, x, u)|2 K
(8.9.26)
(0, ). Then
for t [0, T ] and x d and u E, with some constant K
for any {0.5, 1, 1.5, . . .} and k {0, 1, 2, . . . , 2 1} the kth coecient
386
k
L1 c(t, x, u) of the strong order Taylor scheme, satises the growth
condition
2
1 k
L
c(t, x, u) Ck (1 + |x|2 )
(8.9.27)
for t [0, T ], x, y d , u E k and some constant Ck (0, ) which only
depends on k.
Proof: We prove the assertion of Lemma 8.9.2 by induction with respect
to k. For k = 0, by applying the growth condition (8.9.26) we obtain
2
2
1 0
(1 + |x|2 ).
c(t, x, u) = c(t, x, u) K
L
For k = l +1, by the induction hypotheses, Jensens inequality and the growth
condition (8.9.26) we obtain
2
2
l
l
1 l+1
c(t, x, u) = L1 c t, x + c(t, x, v), u L1 c(t, x, u)
L
2
2
2 Cl 1 + x + c(t, x, v) + Cl 1 + x
2
2
2 Cl 1 + 2(x + c(t, x, v) ) + Cl 1 + |x|2
Cl+1 (1 + |x|2 ),
which completes the proof of Lemma 8.9.2.
Lemma 8.9.3
(8.9.28)
(8.9.29)
(8.9.30)
s2
T
sk
...
|g(s, v 1 , . . . , v k , )|2 (dv 1 )ds1 . . . (dv k ) dsk < .
E
0
387
s1
2
T
sk
1 k
E
...
c t, X s0 , u0 (du0 ) ds0 . . . (duk ) dsk
L
E
0
sk
s1
...
0
Ck
(T )k
+ Ck
k!
sk
s1
...
E
0
sup |X z |2 ds0 . . . dsk < .
0zT
The last passage holds, since conditions (8.9.28), (8.9.29) and (8.9.30) ensure
that
E
sup |X z |2
< ,
z[0,T ]
We emphasize that in the case of pure jump SDEs, unlike the more general
case of jump diusions, no extra dierentiability of the jump coecient c is
required when deriving higher order approximations.
Theorem 8.9.4. For given {0.5, 1, 1.5, 2, . . .}, let Y = {Y
t ,t
[0, T ]} be the strong order Taylor scheme (8.9.22) for the SDE (8.9.21)
corresponding to a regular time discretization (t) with maximum step size
(0, 1). We assume the jump coecient c(t, x, v) satises both the Lipschitz condition (8.9.24) and the growth condition (8.9.26). Moreover, suppose
that
(
2
2
E(|X 0 Y
and
E(|X 0 | ) <
0 | ) K1 .
Then the estimate
)
2
E( max |X n Y
n| )K
0nnT
388
Proof: The proof of Theorem 8.9.4 is a direct consequence of the convergence Theorem 6.4.1 for jump diusions presented in Chap. 6. This must be
the case, as by Lemma 8.9.1, 8.9.2 and 8.9.3, the coecients of the strong
order Taylor scheme (8.9.22) satisfy the conditions required by convergence
Theorem 6.4.1. Note that the dierentiability condition in Theorem 6.4.1, being f C 1,2 for all A B(A ), is not required for pure jump SDEs.
This condition is used in the proof of the general convergence Theorem 6.4.1,
only for the derivation of the Wagner-Platen expansion. In the pure jump
case, as shown above, one needs only measurability of the jump coecient c
to derive the corresponding Wagner-Platen expansion.
Theorem 8.9.4 states that the strong order Taylor scheme for pure jump
SDEs achieves a strong order of convergence equal to . In fact Theorem
8.9.4 states a strong convergence of order not only at the endpoint T , but
uniformly over all time discretization points. Thus, by including enough terms
from the Wagner-Platen expansion (8.9.16) we are able to construct schemes of
any given strong order of convergence {0.5, 1, 1.5, . . .}. Note that Theorem
8.9.4 applies to solutions of multi-dimensional pure jump SDEs, which is an
interesting fact.
For the mark-independent pure jump SDE (8.9.3) driven by a single Poisson process, the strong order Taylor scheme (8.9.22) reduces to
Y
n+1
Y
n
2
k
Nn
f (Y n )
+
k
(8.9.32)
k=1
8.10 Exercises
8.1. Consider the pure jump SDE
dXt = (a + b Xt ) dNt ,
where N = {Nt , t 0} is a Poisson process with intensity < . Write down
the corresponding strong order 1.0 Taylor scheme.
8.2. For the SDE in the previous exercise derive the strong order 2.0 Taylor
scheme.
9
Estimating Discretely Observed Diusions
In many areas of application the estimation of parameters in SDEs is an important practical task. This estimation is almost like inverting the problem
of scenario simulation and can benet from the application of Wagner-Platen
expansions. We have already mentioned that it is important to apply scenario
simulation when checking empirically the usefulness of a proposed estimation methods. This chapter introduces estimation techniques for discretely
observed diusion processes. Transform functions are applied to the data in
order to obtain estimators of both the drift and diusion coecients. Consistency and asymptotic normality of the resulting estimators is investigated.
Power transforms are used to estimate the parameters of ane diusions, for
which explicit estimators are obtained.
389
390
(9.1.1)
1 2 T
2
(9.1.2)
a(Xt ) dt
a(Xt ) dXt
LT () = exp
2
0
0
391
(9.1.3)
T
for n {0, 1, . . . , N 1}, where = N
. The increments W0 , W1 , . . . ,
WN 1 of the Wiener process are independent N (0, 1) distributed random
variables. Their joint probability density is given by
N
1
"
1
1
(N )
(Wn )2
pW =
exp
2
2
n=0
1
N
(2 ) 2
N 1
1
exp
(Wn )2
2 n=0
(9.1.4)
(Yn ) .
(9.1.5)
N exp
2 n=0
(2 ) 2
The Radon-Nikodym derivative of the discrete process Y = {Yn , n {0, 1, . . . ,
N 1}} with respect to the Wiener process W is simply the ratio
!
N 1
(N )
pY
1
(Yn )2 (Wn )2
= exp
(N )
2 n=0
p
W
N
1
N
1
1
= exp 2
a(Yn )2
a(Yn ) Wn
2
n=0
n=0
= exp
N
1
N 1
1 2
a(Yn )2
a(Yn ) Yn
2
n=0
n=0
(9.1.6)
392
LT ()
=0
.
T = 0T
a(Xt )2 dt
0
(9.1.7)
(9.1.8)
,
(9.1.9)
T = 0 T
a(Xt )2 dt
0
where is the true value of the parameter. According to Sect. 2.7, if
T
<
a(Xt )2 dt
(9.1.10)
for all T and if the SDE (9.1.1) has a stationary solution which is ergodic with
density p, found by solving the stationary Kolmogorov forward equation, then
it follows from the ergodicity of X that
1
T T
and
1
T T
a.s.
a(Xt ) dWt = 0
lim
a.s.
a(Xt )2 dt =
lim
(9.1.11)
(9.1.12)
(9.1.13)
1
Moreover, a version of the Central Limit Theorem tells us that T 2 (T
) converges in distribution as T to an N (0, 2 )-distributed random
variable with variance
1
2
2
a(x) p(x) dx
.
(9.1.14)
=
393
(9.2.1)
Note that the parameter appears linearly in the drift of the above SDE.
We assume that this It
o SDE has a unique strong solution for every vector
parameter value from some open set k , k {0, 1, . . .}. In (9.2.1) At ,
B t and D t are vector or matrix functions depending for each t on X t . For
simplicity, we will assume that they are continuous. The statistical parameter
is k-dimensional, and W is a m-dimensional Wiener process, so B t is a
d k-matrix and D t is a d m-matrix. The vector At is d-dimensional. At ,
B t and D t are assumed to be known and given by the problem under study.
The drift parameter vector is to be estimated from an observed sample path
X = {X t , t [0, T ]}.
We assume that . Without loss of generality we can assume that
0 . Since the functionals At , B t and D t depend on X through X t only,
the solution of the SDE (9.2.1) is Markovian.
Suppose we have observed the process X in the time interval [0, T ]. Let
P T denote the probability measure on the set of continuous functions from
[0, T ] into d corresponding to the solution of (9.2.1) for the parameter . The
likelihood function for our observation X = {X t , t [0, T ]} is the RadonNikodym derivative
dP T
LT () =
(9.2.2)
dP0T
provided P T is dominated by P0T for all , see Sect. 2.7. This is the case
if the d d-squared diusion coecient matrix
C t (X) = D t (X) D t (X)
is nonsingular for almost all t [0, T ], and P T (STi, < ) = 1, i,
{1, 2, . . . , k}, for all , where S T is the k k-information matrix
ST =
0
B t (X t ) C t (X t )1 B t (X t ) dt.
(9.2.3)
394
T
t,
B t (X t ) C t (X t )1 dX
HT =
(9.2.4)
where
t = Xt
X
As (X s ) ds.
0
(9.2.5)
(9.2.6)
This yields
ST1,1 =
1
c2
Xt2 dt,
0
HT1 =
1
c2
ST2,1 = ST1,2 =
Xt dt and
0
1
c2
HT2 =
Xt dt,
0
ST2,2 =
T
,
c2
1
(XT X0 ).
c2
T
T
1
Xt dXt (XT X0 )
Xt dt
1,T =
T
NT
0
0
T
1
1
2
2
2
=
T (XT X0 c T ) (XT X0 )
Xt dt (9.2.7)
NT
2
0
and
2,T
1
=
NT
1
=
NT
(XT X0 )
Xt2 dt
0
(XT X0 )
Xt2
0
Xt dt
0
1
dt (XT2 X02 c2 T )
2
Xt2
NT = T
Xt dXt
0
with
395
Xt dt
(9.2.8)
2
dt
Xt dt
(9.2.9)
396
Approximate Estimators
To estimate the parameters 1 and 2 we apply the estimators given by the
last expressions in (9.2.7) and (9.2.8) and obtain, after approximation of the
relevant time integrals by the trapezoidal formula, the approximate estimators
n
T 1
1
=
(Xtn+1 + Xtn )
T (XT2 X02 c2 T ) (XT X0 )
1,T
2 NT
n=0
(9.2.10)
and
n
T 1
1
2,T
=
X
)
(Xt2n+1 + Xt2n )
(X
T
0
2 NT
n=0
n
T 1
1
2
2
2
(XT X0 c T )
(Xtn+1 + Xtn )
(9.2.11)
2
n=0
with
NT
nT 1
T
=
(X 2 + Xt2n )
2 n=0 tn+1
2
nT 1
1
(Xtn+1 + Xtn ) .
2 n=0
(9.2.12)
and 2,T
Figs. 9.2.2 and 9.2.3 show that the approximate estimators 1,T
converge after initial oscillations to the corresponding values of 1 and 2 ,
respectively. It is important to note that the maximum likelihood method as
applied above, typically only allows us to estimate parameters in the drift
coecient of the SDE when based on a measure transformation as previously
indicated. However, parameters also in the diusion coecient can be estimated if the transition density is known and permits a suitable estimator.
397
398
timating functions are invariant under one-to-one transforms of the data and
these functions can be combined more simply than the estimators themselves.
Bibby & Srensen (1995) studied martingale estimating functions obtained
from the derivative of the continuous time log-likelihood function by correcting for the discretization bias when subtracting its compensator. The resulting estimating function, known as a linear martingale estimating function,
depends on the conditional moments of the diusion process. Quadratic martingale estimating functions also involving second order conditional moments
were obtained by Bibby & Srensen (1996) from a Gaussian approximation
to the likelihood function. Kessler & Srensen (1999) considered estimating
functions based on eigenfunctions for which the conditional moments are explicit. Srensen (2000) considered more general estimating functions, known
as prediction-based estimating functions. Here conditional moments are approximated by expressions involving only unconditional moments. This type
of estimating functions, is a useful alternative to martingale estimating functions, for instance, in stochastic volatility models and certain interest rate term
structure models. Christensen, Poulsen & Srensen (2001) compared optimal
martingale estimating functions and the approximate maximum likelihood
method, mentioned earlier, for the estimation of the parameters in a model
of the short rate to techniques such as the generalized method of moments,
see Hansen (1982), and indirect inference, see Gourieroux, Monfort & Renault
(1993) and Gallant & Tauchen (1996). It was found that optimal martingale
estimating functions and the approximate maximum likelihood method reduce bias, true standard errors and bias in estimated standard errors when
compared to the aforementioned techniques. In Bibby (1994) martingale estimating functions are combined to estimate parameters in both the drift and
the diusion coecient using techniques discussed in Heyde (1997).
The approach that will be presented below is based on the estimating
function approach. The objective is to obtain a simple yet general estimation
method, which provides exibility in the estimation of discretely observed
diusion processes via the use of transform functions. Unlike some of the
aforementioned techniques, particular information about the conditional and
unconditional moments of the diusion process are not needed. Furthermore,
parameters in the diusion coecients can be estimated and stationarity of
the observed process is not required. The Wagner-Platen expansion will be
used to expand the increments of functions of the observed diusion process.
Diusion Model
Consider a class of one-dimensional diusion processes dened by the following
SDE
(9.3.1)
dXt = b(t, Xt ; ) dt + (t, Xt ; ) dWt
for t [0, T ]. The initial value X0 = x0 is assumed to be A0 -measurable.
Here W = {Wt , t [0, T ]} denotes a standard Wiener process given on the
399
(9.3.3)
U (0 , X0 ; i ), U (1 , X1 ; i ), U (2 , X2 ; i ), . . .
(9.3.4)
(9.3.5)
1
2
u(t, x) + b(t, x; )
u(t, x) + 2 (t, x; ) 2 u(t, x)
L0 u(t, x) =
t
x
2
x
(9.3.6)
400
and
u(t, x).
(9.3.7)
x
For n {1, 2, . . .}, i {1, 2, . . . , p} and i we introduce the normalized
dierence
L1 u(t, x) = (t, x; )
Di ,n, =
1
(U (n , Xn ; i ) U (n1 , Xn1 ; i ))
n n1
(9.3.8)
1
(U (n , Xn ; i ) U (n1 , Xn1 ; i ))2 .
n n1
(9.3.9)
n
s
dz dWs
+ L1 L0 U (n1 , Xn1 ; i )
+ L0 L0 U (n1 , Xn1 ; i )
n1
n
n1
s
+ L0 L1 U (n1 , Xn1 ; i )
dWz ds
n1
+ Ri ,n, (n , n1 , Xn ),
n1
(9.3.10)
401
(2)
F n () = (F (1)
n () , F n () )
with
(j)
())
F n(j) () = (F1,n (), . . . , Fq,n
(j)
(2)
nt
1
M () F n (),
nt n=1
(9.3.11)
nt
1
n ()).
M () (F n () F
nt n=1
(9.3.13)
402
n () = E F n () X
F
n1
is the compensator for F n () and is of order . The optimal choice for the
weighting matrix M () in the unbiased estimating equation (9.3.13), in the
sense of Godambe & Heyde (1987), can be derived using the method outlined
in Heyde (1997). For details regarding the case of diusion processes we refer
to Srensen (1997), where the optimal weighting matrix is given by
M () = B () V ()1 .
(9.3.14)
Bi,j () = B
(, n1 , Xn1 )i,j = E
Fj,n () Fj,n () Xn1 .
i
The values of the i s should be chosen in such a way that the conditional
covariance matrix V () is invertible.
Keeping only the leading terms, we obtain
nt
1
() F n (),
K(,
t, ) =
M
nt n=1
(9.3.15)
() = B()V ()1 . Here B() = (B (1) (), B (2) ()) and the (i, j)th
where M
entry of the p q matrices B (1) () and B (2) () are
0
L U (n1 , Xn1 ; j )
i
(9.3.16)
2
1
L U (n1 , Xn1 ; j ) ,
i
(9.3.17)
B (1) ()i,j =
and
B (2) ()i,j =
respectively. Moreover,
V () =
V 11 () V 12 ()
V 21 () V 22 ()
,
where the (i, j)th entry of the q q matrices V 11 (), V 22 () and V 12 () are
V 11 ()i,j =
1 1
L U (n1 , Xn1 ; i ) L1 U (n1 , Xn1 ; j ),
(9.3.18)
/
02
V 22 ()i,j = 2 L1 U (n1 , Xn1 ; i )L1 U (n1 , Xn1 ; j ) ,
403
(9.3.19)
and
V 12 ()i,j
= 2 L1 U (n1 , Xn1 ; i )L1 U (n1 , Xn1 ; j ) L0 U (n1 , Xn1 ; j )
+ 2L1 U (n1 , Xn1 ; i )L1 U (n1 , Xn1 ; j ) L1 L1 U (n1 , Xn1 ; j )
/
02
+ L1 L1 U (n1 , Xn1 ; i ) L1 U (n1 , Xn1 ; j ) ,
(9.3.20)
while V 21 () = (V 12 ()) .
() in (9.3.15) is optimal in what is referred to as
The weighting matrix M
the xed sample sense, see Godambe & Heyde (1987) and Heyde (1997). This
means, the weighting matrix results in an estimating function (9.3.15), that
is, to the order of approximation used, and closest within the class of estimating functions of the form (9.3.11) to the corresponding, usually unknown,
score function. In this sense, this gives under appropriate assumptions the
most ecient estimator for a xed number of observations within the class
of estimators considered. For example, if the diusion process is ergodic, it
can be shown that a xed sample optimal martingale estimating function is
also asymptotically optimal, also known as Heyde-optimal, see Heyde (1997).
Heyde-optimality results in an estimator that has the smallest asymptotic
condence intervals within the class of estimators considered. The estimating
functions proposed in this section are approximations of martingale estimating
functions to the order .
Optimal Estimating Equations
For the given transform functions, with parameters i , i {1, 2, . . . , q},
we have now obtained a p-dimensional optimal estimating equation
K(,
t, ) = 0
for t 1 , see (9.3.15). Assuming that the resulting system of p equations has
=
a unique solution, we obtain for the particular SDE (9.3.1) an estimator
404
t, ) = 0,
,
H(
where
nt
1
H(, t, ) =
B (2) () V 22 ()1 F (2)
n ()
nt n=1
(9.3.21)
2
1
L U (n1 , Xn1 ; j ) ,
i
(9.3.22)
is the estimator
, t, ) = 0, where
on . Next estimate by solving G(,
of previously obtained, and
nt
1
B (1) (, ) V 11 (, )1 F (1)
G(,
, t, ) =
n (, )
nt n=1
(9.3.23)
L0 U (n1 , Xn1 ; j ).
i
(9.3.24)
G(,
, t, ) and H(,
t, ) are, to the order of approximation used, optimal
within the classes of estimating functions
nt
1
G(, , t, ) =
M (1) (, ) F (1)
n (, )
nt n=1
(9.3.25)
nt
1
H(, t, ) =
M (2) ()F (2)
n ()
nt n=1
(9.3.26)
and
for estimating and , respectively. The optimal martingale estimating function for the form (9.3.13) is the optimal combination of the optimal martingale
estimating functions to which (9.3.21) and (9.3.23) are approximations, see
Heyde (1997) and Bibby (1994).
405
are popular for modeling in nance, see Due & Kan (1994), Filipovic (2001)
and Sect. 2.3. Consider the ane SDE
dXt = (1 + 2 Xt )dt +
3 + 4 Xt dWt
(9.4.1)
2 < 0
or
4 > 0,
2 < 0
and
and
2
4
3 > 0
1
2 3
4
(9.4.2)
1,
(9.4.3)
see Platen & Heath (2006). In the rst case, the Ornstein-Uhlenbeck process
emerges, see (1.7.5). The process X lives then on the entire real line and
the stationary distribution of the process is Gaussian with mean 12 and
3
variance (2
. In the latter case the process X lives on the interval (y0 , )
2)
with y0 = 34 . The stationary density for such an ergodic ane diusion is
of the form
p(x) =
22
4
2 2 3
4 1
4
2 1 2 3 1
4
2
x + 34 4
x+
exp 2
4
24 1 243
3
4
(9.4.4)
for x (y0 , ), where () denotes the well-known Gamma function. Note
that the stationary density in (9.4.4) is a shifted Gamma distribution. In this
case the stationary mean is
1
x p(x) dx =
(9.4.5)
2
y0
and the stationary second moment has the form
(21 + 4 ) 1 3 2
x2 p(x) dx =
.
2(2 )2
y0
(9.4.6)
406
(9.4.7)
nt
1
(2)
F ()
nt n=1 i,n
(9.4.8)
(2)
Fi,n () = Qi ,n, (L1 U (n1 , Xn1 ; i ))2
nt
1
(1)
F (, ),
nt n=1 i,n
(9.4.9)
(1)
1
i 2
+ (3 + 4 Xn1 )i (i 1) Xn1
2
(9.4.10)
and by (9.3.7)
i 1
L1 U (n1 , Xn1 ; i ) = i Xn1
(
3 + 4 Xn1 .
(9.4.11)
Estimating Functions
We obtain from (9.3.9), (9.3.26), (9.4.8) and (9.4.11) the estimating function
2(i 1)
0,1,0
(i ) 3 (i )2 A
Hi (, t, ) = A
i 1
4 (i )2 A2
(9.4.12)
for i {3, 4} for two dierent values 3 , 4 > 0. Here we have used the
notation
Ar,k,j
(i ) =
nt
1
(Xn1 )r (Qi ,n, )k (Di ,n, )j
nt n=1
(9.4.13)
407
and Ar = Ar,0,0
(i ), which refers to an equidistant time discretization of step
size .
Similarly, we obtain from (9.3.8), (9.3.25), (9.4.9) and (9.4.10) the estimating function
Gi (, , t, ) = C (i , 3 , 4 ) 1 i Ai 1 2 i Ai
(9.4.14)
with
0,0,1
C (i , 3 , 4 ) = A
(i )
i (i 1)
(3 Ai 2 + 4 Ai 1 )
2
(9.4.15)
2
2
2
2
2
2
+ 2 E (R1 (, t, , i ))
1
(L0 U (1, y; i ))2 + (L1 L1 U (1, y; i ))2
=
2
y0
1
+ L1 U (1, y; i )L1 L0 U (1, y; i )
2
1 1
0 1
+ L U (1, y; i )L L U (1, y; i ) p(y)dy
2
+ 2 E (R2 (, 1, , i )).
(9.4.16)
Here p(y) is given by (9.4.4), and E (Rj (, t, , i )), for j {1, 2} and i
{3, 4} are nite. Similarly, we obtain
lim E (Gi (, , t, ))
3
1 0 0
(L L U (1, y; i )) p(y) dy + 2 E (R4 (, , 1, , i )), (9.4.17)
=
2
y0
nt
408
Estimating Equations
Note that if the time between observations tends to zero, then the expectation of the functions Hi and Gi will approach zero. By setting the estimating
functions to zero we obtain the linear system of four estimating equations
2( 1)
0,1,0
i 1
0 = A
(i ) 3 (i )2 A i
4 (i )2 A2
(9.4.18)
(9.4.19)
A1 1 + A2 2 =
1
C (2, 3 , 4 )
2
0,1,0
(1)
3 + A1 4 = A
1 0,1,0
(2).
A2 3 + A3 4 = A
4
This system has the explicit solution
1 = C (1, 3 , 4 ) A1 2
(9.4.20)
2 =
1
2
C (2, 3 , 4 ) A1 C (1, 3 , 4 )
A2 (A1 )2
0,1,0
(1) A1 4
3 = A
4 =
1
4
0,1,0
0,1,0
A
(2) A2 A
(1)
.
3
1
2
A A A
(9.4.21)
This means, we have obtained explicit expressions for the estimators of the
given class of ane diusions using power transform functions. The illustrated
transform function method can be extended to other classes of diusions including nonergodic and multi-dimensional diusions and the case with jumps.
If no explicit solution of the system of estimating equations is available, then
numerical solution techniques can be applied to identify their solution.
409
Fig. 9.4.1. Sample path of a square root process with 1 = 4 = 1, 2 = 0.1 and
3 = 0
410
asymptotic behavior and bias of the parameter estimators for some given class
of stationary diusion processes. We assume in this section that X is ergodic,
see Sect. 2.7, described by the SDE (9.3.1) with time homogeneous coecient
functions
b(t, x; ) = b(x, )
(9.5.1)
and
(t, x; ) = (x; )
(9.5.2)
for t [0, T ], x and . We use as state space the interval (, r) where
< r . For given parameter vector , the density of the scale
measure s : (, r) [0, T ] is given by the expression
s(x; ) = exp 2
y0
b(y; )
dy
2 (y; )
411
(9.5.3)
for x (, r) with some reference value y0 (, r), see Sect.2.7. If the following
two conditions
r
0
s(x; ) dx =
y0
and
s(x; ) dx =
(9.5.4)
1
dx <
s(x; ) 2 (x; )
(9.5.5)
(9.5.6)
for x (, r) and , see (2.7.34). The constant C() results from the
normalization condition
r
p(x, ) dx = 1.
(9.5.7)
nt
1
g(, Xn1 , Xn ; ),
nt n=1
(9.5.8)
for t [0, T ] and where Gt and g = (g1 , g2 , . . . , gp ) are p-dimensional. Furthermore, we assume that X is stationary and impose the condition that
X is geometrically -mixing. For a denition of this concept, see for instance
Doukhan (1994). For a given ergodic diusion process X, there exist a number
of relatively simple criteria ensuring -mixing with exponentially decreasing
mixing coecients. We cite the following rather weak set of conditions used in
Genon-Catalot, Jeantheau & Laredo (2000) on the coecients b and that
are sucient to ensure geometric -mixing of X.
412
Condition 9.5.1
(i)
2b(x; )
(x; )
x
(x; )
Each pair of neighboring observations (Xn1 , Xn ) has the joint probability density
p(, x, y; )
(x; )
q
(x, y) = p
on (, r)2 . For a function f : (, r)2 k , k {1, 2, . . .}, where we assume
that the following integral exists, we introduce the vector valued functional
r
r
p(x; )
dy dx.
f (x, y) p(, x, y; )
q (f ) =
For our purposes we cannot assume that the estimating function in (9.5.8)
is unbiased. Instead we make the following assumption.
that is an inteCondition 9.5.2
There exists a unique parameter value
rior point of such that
= 0.
q
g(, )
Note that the equation in Condition 9.5.2 is a vector equation with zerovector on the right hand side. We can now impose conditions on the estimating
function (9.5.8) similar to those by Barndor-Nielsen & Srensen (1994) and
Srensen (1999).
Condition 9.5.3
The function gi (, x, y; ) : is twice continuously dierentiable
with respect to for all x, y (, r), and i {1, 2, . . . , p}.
(ii) The function gi (, , ; ) : (, r) (, r) is such that there exists a
2+
) < for all and i {1, 2, . . . , p}.
> 0 with q
(gi (, )
(i)
gi (, , ; )
A(, ) = q
j
i,j=1
is invertible.
413
Properties of Estimators
The following theorem can be found in Kelly et al. (2004).
Theorem 9.5.4. (Kelly-Platen-Srensen) Suppose Conditions 9.5.2 and
T that solves
9.5.3 are satised. Then for T > , there exists an estimator
the system of estimating equations
T ) = 0
GT (
with a probability tending to one as nT . Moreover, we have the limit in
P-probability
T P=
lim
nT
nT
d
T )
=
nT (
R,
)(A(
)
1 )
)
1 v(,
,
R N 0, A(,
where
)
1 ) , where
)
1 v(,
,
variance matrix A(,
)
= q
g(, )
)
v(,
(g(, )
g(, X , X ; )
E g(, X0 , X1 ; )
k
k+1
k=1
g(, X , X ; )
.
+ E g(, Xk , Xk+1 ; )
0
1
(9.5.9)
Under the conditions imposed, the covariances in the innite sum in (9.5.9)
tend to zero exponentially fast as k . So the sum converges rapidly and
can usually be well approximated by a nite sum with relatively few terms.
The theorem can be derived in complete analogy with the proof of a similar
theorem in Srensen (1999).
414
the presence and impact of jumps. Eraker, Johannes & Polson (2003) apply
a Bayesian approach by using Markov chain Monte Carlo methods to infer jump parameters from index data. Chernov, Gallant, Ghysels & Tauchen
(2003) apply the ecient method of moments. Barndor-Nielsen & Shephard
(2006) and W
orner (2006) study the behavior of interesting statistics, including power variations.
In the following we provide a rst impression on the estimation of parameters for jump diusions following Ait-Sahalia (2004). For simplicity, let us
consider the Merton model, see Sect. 1.7 and Merton (1976). The logarithm
of an asset price is given by the SDE
dXt = dt + dWt + dYt
(9.6.1)
Nt
Ji
(9.6.2)
i=1
+
2
2
dt + dWt + (exp{Yt } 1) dNt
(9.6.4)
(9.6.5)
415
P (Xt x X0 = x0 , ) =
P (Xt x X0 = x0 , Nt = k, ) P (Nt = k )
k=0
(9.6.6)
for x . In the event Nt = k there have been k jumps in [0, t] at the times
0 < 1 < 2 < . . . < k t. Since Ji = Yi N (, ) is Gaussian, the
transition density of Xt given X0 = x0 is of the form
p(x t, x0 , ) =
p(t, x X0 = x0 , Nt = k, ) P (Nt = k )
k=0
(x x0 t k)2
exp
, (9.6.7)
2 (k + t 2 )
2 (k + t 2 ) k !
exp{ t} ( t)k
k=0
see Press (1967). When setting the parameters, care must be taken to ensure that the maximum likelihood for the above mixture of normals remains
bounded, see Honore (1998).
Press (1967) calculated the rst four moments from the above transition
density in the form:
m1t = E Xt = x0 + t( + )
2
M (t, , 2) = E Xt m1t = t( 2 + ( 2 + ) )
3
M (t, , 3) = E Xt m1t = t ( 2 + 3 )
4
M (t, , 4) = E Xt m1t = t ( 4 + 6 2 + 3 2 )
+ 3 t2 ( 2 + ( 2 + ) )2 .
(9.6.8)
(9.6.9)
the log-return of the asset price over the period from tn1 until tn , n
{1, 2, . . .}. Recall that the log-returns are independent under the above Merton
model.
For the estimation of the d-dimensional parameter vector we introduce
a vector of m moment conditions h(z, , ), m d, where the vector function
416
nT
1
h(Ztn , , ).
nT n=1
(9.6.10)
(9.6.11)
(9.6.12)
for n {1, 2, . . .}, where 0 is the true parameter vector of the observed
dynamics. We will denote by h the gradient of h with respect to . By application of standard statistical arguments, see for instance
Hansen (1982), and
0 ) converges in
subject to some regularity conditions it follows that T (
distribution to an N (0, ) distributed random vector with
1 =
1
D G D (D G S G D)1 D G D,
(9.6.13)
t , , 0 ))
D = E(h(Z
1
(9.6.14)
S = E h(Zt1 , , 0 ) h(Zt1 , , 0 )
(9.6.15)
where
is an m d matrix and
is an m m matrix.
Note that the weight GT can be designed optimally to minimize the variance . If GT is chosen optimally with m = d, then one has G = S 1 and
(9.6.13) reduces to
1 1
1 =
D S D.
(9.6.16)
Furthermore, the moments (9.6.8) allow for the case when h is a vector
of polynomials the calculation of D and S. In this manner one can apply the
GMM to construct estimators for any of the parameters. Of course, the form
of the estimators depends on which parameters are considered to be unknown
and which particular choices for h are made, see Ait-Sahalia (2004).
9.7 Exercises
417
(9.6.18)
AVARMLE () = (D S 1 D)1 = D 1 .
(9.6.19)
Furthermore, it has been shown in Ait-Sahalia (2004) that when the asset
price has jumps, the maximum likelihood estimator of 2 has for 0
the same asymptotic distribution as if no jumps were present. This means
for instance, when using high-frequency data the presence of jumps will not
hinder the estimation of the volatility.
Maximum likelihood estimation is usually the theoretically preferred methodology, however, already for the above simple model one cannot explicitly
compute the maximum likelihood estimators. Numerically it can be obtained
which can become rather time consuming. Therefore, alternative estimators,
which are directly based on certain moments and easily computable, are of
particular interest. For details on such GMM estimators we refer to Ait-Sahalia
(2004). Further interesting work related to the estimation of jump diusions
can be found in Ait-Sahalia, Fan & Peng (2009), Ait-Sahalia & Jacod (2009)
and Bladt & Srensen (2005).
9.7 Exercises
9.1. Write down for the Vasicek interest rate model with SDE
drt = (
r rt ) dt + dWt
a maximum likelihood estimator for the speed of adjustment parameter ,
when the interest rate reference level r and the diusion coecient are
known constants.
9.2. Describe a simple way of estimating the volatility parameter for the
Black-Scholes model for an asset price process X = {X(t), t [0, T ]} with
SDE
dXt = Xt (a dt + dWt )
for t [0, T ], X0 > 0.
10
Filtering
419
420
10 Filtering
(10.1.1)
(10.1.2)
(10.1.3)
Xt Xt Xt Xt
Ct = E
(10.1.5)
421
(10.1.6)
(f , g) = E
f
s
g s ds ,
(10.1.8)
(10.1.9)
are orthogonal in this sense, and also in the stronger pointwise sense, such
that
t
t =0
Xt X
E
X
for each t [0, T ], see Kallianpur (1980). Note that the Riccati equation
(10.1.6) must, in general, be solved numerically, which can be performed oline. It only involves the known coecients of the state and observation SDEs
(10.1.1) and (10.1.2). The coecients of the lter equation (10.1.7) are, therefore, known. This SDE can be solved on-line as the observations become available. Generally, solving a lter SDE must be performed numerically using a
strong discrete-time approximation. However, in a linear Gaussian situation
422
10 Filtering
(10.1.11)
(10.1.13)
for t [0, T ] with initial value C0 = 2 . Fortunately, this Riccati equation has
an explicit solution
2
.
(10.1.14)
Ct =
1 + 2 t
In the given example one can show that the lter has the explicit form
t = Ct Yt
X
(10.1.15)
for t [0, T ] since this is the explicit solution of the corresponding lter SDE
(10.1.7).
To see this, let us introduce the quantity
t
Jt = exp
Cs ds .
(10.1.16)
0
423
d(Ct Jt )
dCt
dJt
= Jt
+ Ct
dt
dt
dt
= Jt (Ct )2 Jt (Ct )2 = 0
(10.1.18)
(Ct Jt ) Yt = (C0 J0 ) Y0 +
(Cs Js ) dYs +
0
Ys d(Cs Js )
(10.1.19)
Ct Jt Yt =
Cs Js dYs
0
t
s Js
d X
t Jt
=X
(10.1.20)
424
10 Filtering
Fig. 10.1.2. Trajectories of the Kalman-Bucy lter for dierent hidden levels
t [0, 6.25]. These observed paths for the 11 cases are plotted in Fig. 10.1.1.
They show the trends caused by the corresponding dierent hidden levels. It
then remains to calculate, according to (10.1.21), the paths for the corresponding Kalman-Bucy lter. These are displayed in Fig. 10.1.2. They demonstrate
clearly the functioning of the lter. After some initial time the Kalman-Bucy
t has approximately detected in each of the cases the hidden level.
lter X
425
hidden, continuous time nite-state Markov chains with observations that are
perturbed by the noise of a Wiener process.
The Wonham Filter Problem
The systematic construction and investigation of lters for hidden Markov
chains goes back to Wonham (1965), Zakai (1969) and Fujisaki, Kallianpur
& Kunita (1972). Later, the question of nding discrete-time approximations
for the optimal lter was considered by Clark & Cameron (1980) and Newton
(1986a, 1991).
Now, we introduce lters for hidden, continuous time, nite state Markov
chains. Let (, AT , A, P ) be the underlying ltered probability space and
suppose that the hidden state process = {t , t [0, T ]} is a continuous time,
homogeneous Markov chain on the nite state space X = {a1 , a2 , . . . , ad }, d
{0, 1, . . .}. This could model, for instance, the hidden trend of the logarithms
of a group of commodity prices. Its d-dimensional probability vector p(t) =
(p1 (t), . . . , pd (t)) at time t, with components
pi (t) = P (t = ai )
(10.2.1)
(10.2.2)
(10.2.3)
426
10 Filtering
1
LT = exp
2
|h(s )| ds +
2
(10.2.4)
!
h(s ) dW s
(10.2.5)
(10.2.7)
t
m
t
A (s ) ds +
H j (s ) dW js
(10.2.8)
(t ) = p(0) +
0
j=1
427
The least-squares estimate at time t for g(t ) with respect to the given
observations at time t, that is with respect to the sigma-algebra Yt , is then
the Wonham lter and given by the conditional expectation
d
g(ak ) (t )k
(10.2.9)
g(t ) = E g(t ) Yt = k=1
d
k
k=1 (t )
for t [0, T ].
Quasi-exact Filters
Let us consider the following d-dimensional multiplicative noise SDE
dX t = AX t dt +
m
H k X t dWtk ,
(10.2.10)
k=1
and H k H n = H n H k
(10.2.11)
for all k, n {1, 2, . . . , m}, then an explicit solution of the SDE (10.2.10) can
be expressed by
(10.2.12)
X t = tX 0,
for t [0, ), see (2.4.9). Here, t is the matrix exponential
!
m
m
1
2
r
(H ) t +
H r Wt ,
t = exp At
2
r=1
(10.2.13)
=1
for t [0, ).
The above derivation shows that an SDE of the type (10.2.8) has an explicit
solution if the matrices A, H 1 , . . . , H m commute. Note that H 1 , . . . , H m in
(10.2.8) are diagonal matrices, and thus commute with each other. However,
the matrix A is not commuting with the other matrices. Therefore, we do
not have an exact explicit solution of the Zakai equation (10.2.8). Nevertheless, as illustrated in Platen & Rendek (2009b), if we formally take the matrix exponential (10.2.13) in the product (10.2.12), then one obtains a proxy
of the solution of the corresponding Zakai equation. It turns out that this
quasi-exact solution provides in many cases an excellent approximation of the
exact solution. This is a practically very valuable observation. The solution
can be exploited to solve approximately and eciently the Wonham lter
428
10 Filtering
n+1
1
s2
1
dWsj , . . . ,
dWsj , . . . ,
dWsj1 dWsk2 , . . .
t0
tn
t0
t0
for each j {1, 2, . . . , m}, n = n and n {0, 1, . . .}. We shall see later on
that with such integral observations it is possible to construct strong discretetime approximations Y with time step of the solution () of the Zakai
equation (10.2.8). This allows then for the given function g to form the approximate Wonham lter
d
g(ak ) Yt,k
g (t ) = k=1
(10.2.14)
d
,k
k=1 Yt
for t [0, T ].
As in Chap. 5 we shall say that a discrete-time approximation Y with
time step size converges on the time interval [0, T ] with strong order > 0
to the solution X of the corresponding SDE if there exists a nite constant
K, not depending on , and a 0 (0, 1) such that
E (n ) Y
(10.2.15)
n K
for all (0, 0 ) and n [0, T ]. Note that the expectation in (10.2.15) is
taken with respect to the probability measure P under which the observation
process W is a Wiener process.
Analogously, we say that an approximate Markov chain lter g (nt ) with
time step size converges on the time interval [0, T ] with strong order > 0
to the optimal lter g(nt ) for a given test function g if there exists a nite
constant K, not depending on , and a 0 (0, 1) such that
E g(nt ) g (nt ) K
(10.2.16)
429
for all (0, 0 ) and t [0, T ]. In contrast with (10.2.15) the expectation in (10.2.16) is taken with respect to the original probability measure
P . In Kloeden, Platen & Schurz (1993) the following convergence result was
derived.
Theorem 10.2.1. (Kloeden-Platen-Schurz) An approximate Markov chain
lter g (nt ) with time step size converges for t [0, T ] with strong order
> 0 to the optimal lter g(nt ) for a given bounded function g if the discretetime approximation Y used converges on [0, T ] to the solution () of the
Zakai equation (10.2.8) with the same strong order .
Proof: In view of the criterion (10.2.16) and the Bayes rule, see Sect. 2.7,
we need to estimate the strong approximation error
F (g) = E g( ) g ( )
n
= E Ln g(n ) g (n )
(10.2.17)
d
f (ak ) Xkn
(10.2.18)
f (ak ) Y,k
n
(10.2.19)
k=1
and
G
n (f ) =
d
k=1
Gn (g) Gn (g) + g (n ) Gn (1) Gn (1)
= E Gn (1)
Gn (1)
( ) G (1) G (1)
E Gn (g) G
n (g) + E g
n
n
n
K1
d
E Y,k
(n )k .
n
(10.2.20)
k=1
430
10 Filtering
Explicit Filters
Now, we derive discrete-time strong approximations Y that converge with a
given strong order > 0 to the solution () of the Zakai equation (10.2.8),
which can be used to build a corresponding approximate lter. A systematic presentation of such discrete-time strong approximations is provided in
Chap. 5.
Given an equidistant time discretization of the interval [0, T ] with step size
T
for some N {1, 2, . . .}, we dene the partition sigma-algebra
= N
j
1
PN
= {Wi1
: i {1, 2, . . . , N }, j {1, 2, . . . , m}}
N
dWsj , . . . , WNj 1 =
W0j =
(N 1)
dWsj
(10.2.21)
(10.2.22)
1
for all j {1, 2, . . . , m}. Thus, PN
contains the information about the increments of W for the given time discretization.
The simplest discrete-time approximation is obtained from the Euler
scheme, see (5.3.13). It has for the Zakai equation (10.2.8) the form
Y
n+1 = [I + A + Gn ] Y n
with
Gn =
m
H j Wnj
(10.2.23)
(10.2.24)
j=1
+
G
(10.2.25)
I
+
Y
n
n
n+1
n ,
2
where
1 2
H .
2 j=1 j
m
A=A
(10.2.26)
431
1
1
Gn A + Gn A + G2n + G3n Y
(10.2.27)
n ,
2
2
6
1
. A scheme with similar propwhich is called asymptotically ecient under PN
erties was proposed in Burrage (1998).
One can obtain higher order strong convergence for > 1 by exploiting additional information about the observation process W such as that contained
in the integral observations
N
s
j
dWrj ds, . . . , ZN
=
dWrj ds
Z0j =
1
0
(N 1)
(N 1)
(10.2.28)
for all j {1, 2, . . . , m}. We shall dene as partition sigma-algebra
j
j
1.5
PN
= Zi1
, Wi1
: i {1, 2, . . . , N }, j {1, 2, . . . , m}
(10.2.29)
1
the sigma-algebra generated by PN
together with the multiple stochastic inj
j
tegrals Z0 , . . ., ZN 1 for all j {1, 2, . . . , m}.
The order 1.5 strong Taylor scheme, described in (5.3.35) uses, due to
the diusion commutativity property of the Zakai equation (10.2.8), only the
1.5
. It takes the form
information contained in PN
2 2
A + A Mn
Y
n+1 = I + A + Gn +
2
1
1
M n A + Gn A + G2n + G3n Y
n , (10.2.30)
2
6
where
Mn =
m
H j Znj .
(10.2.31)
j=1
Note that one obtains the strong order 1.0 scheme (10.2.27) from (10.2.30)
1
with
if one replaces the Znj by their conditional expectations under PN
respect to the probability measure P , that is if one substitutes 12 Gn for
M n in (10.2.30).
In order to form a scheme of strong order = 2.0 one needs the information
from the observation process expressed in the partition sigma-algebra
j1
j1
2
= Wi1
, Zi1
, J(j1 ,j2 ,0),i1 , J(j1 ,0,j2 ),i1 : i {1, 2, . . . , N },
PN
j {1, 2, . . . , m}
(10.2.32)
432
10 Filtering
1.5
which is generated by PN
together with the multiple Stratonovich integrals
n+1
s3
s2
dWsj11 dWsj22 ds3 ,
J(j1 ,j2 ,0),n =
n+1
s3
s2
(10.2.33)
Y n+1 = I + A I + A M n A + A M n
2
1
1
1
+ G n I + A + Gn I + Gn I + Gn
2
3
4
+
m
j1 ,j2 =1
+ H j2 H j1 A J(j1 ,j2 ) J(j1 ,j2 ,0) J(j1 ,0,j2 ) Y n .
(10.2.34)
(10.2.35)
433
where [0, 1] denotes the degree of implicitness, see Sect. 7.2. As usual, the
Euler scheme (10.2.35) converges with strong order = 0.5.
The family of drift-implicit Milstein schemes with diusion commutativity condition, all of which converge with strong order = 1.0, gives us the
algorithm
1
1
G
=
(I
A
)
+
G
I
+
(1
)
A
I
+
Y
Y
n
n
n+1
n , (10.2.36)
2
see Sect. 7.2. We can continue to construct similar higher order schemes for
the Zakai equation. In principle, to each explicit scheme there corresponds
a family of drift-implicit schemes obtained from making implicit the terms
involving the nonrandom multiple stochastic integrals such as I(0) = and
I(0,0) = 12 2 .
As an example we mention the strong order 1.5 drift-implicit Taylor
scheme
1
1
1
A
Y
=
I
I + A + Gn A M n A + A M n
n+1
2
2
1
1
+ Gn I + Gn I + Gn
(10.2.37)
Y
n ,
2
3
which makes the drift implicit in the strong order 1.5 Taylor scheme (5.3.40).
434
10 Filtering
where the mean reversion level changes according to a continuous time nite state Markov chain = {t , t [0, T ]}. This could be a model for the
logarithm of a commodity price, see Miltersen & Schwartz (1998), where an
unobservable reference growth rate t changes from time to time. If investors
change their behavior suddenly, for instance, when a bull market for commodities has come to an end, then the reference growth rate of the log-price
also changes. Such an eect may be modeled well by a hidden Markov chain.
Let us take this as a motivation and concentrate on the underlying mathematics. We derive a nite-dimensional lter for the unobservable state of the
Markov chain = {t , t [0, T ]} based on observations of a mean reverting
diusion process r = {rt , t [0, T ]}, which we interpret here as a log-price
process. Additionally, various auxiliary lters will be presented that allow us
to estimate the parameters of the hidden Markov chain.
Let the log price r = {rt , t [0, T ]} be described by the SDE
drt = (t rt ) dt + dWt
(10.3.1)
(10.3.3)
435
d
Let A = {At = [ai,j
t ]i,j=1 , t [0, T ]} be the family of transition intensity
matrices associated with the continuous time Markov chain X, so that p(t)
satises the Kolmogorov equation
dp(t)
= At p(t),
dt
(10.3.4)
M t = Xt X0
As X s ds.
(10.3.5)
t
As X s ds + M t
(10.3.6)
Xt = X0 +
0
for t [0, T ].
Change of Measure
To obtain the lters and estimators required for estimating the parameters of
the model we now introduce a change of measure, see Sect. 2.7.
Assume that there is a probability measure P on (, AT , A) under which
is a nite state Markov chain with the transition intensity matrix family
A as described before. The process W = {Wt = rt , t [0, T ]} is a Wiener
process under P , independent of . In the following, the expected value of
1 t 2
2
Lt = exp
(X
a
r
)
dW
(X
a
r
)
ds
,
(10.3.7)
s
s
s
s
s
2 0
0
436
10 Filtering
(10.3.9)
where W0 = 0. Then Girsanovs theorem, see Sect. 2.7, implies that the process W is an (A, P )-Wiener process, independent of . That is, under P ,
the processes X and r have the dynamics given by (10.3.6) and (10.3.1),
respectively.
The reason for this change of measure is that under the measure P , calculations are easier because the observable process W = {Wt = rt , t [0, T ]}
is interpretable simply as a Wiener process. The fact that the measure P ,
which reects real world probabilities, is a dierent measure does not matter
because we can here use a version of the Bayes rule, see Sect. 2.7, to convert calculations made under one probability measure into a corresponding
quantity calculated under another equivalent probability measure.
General Filter
Let = {t , t [0, T ]} be an A-adapted process. We shall write = {t , t
[0, T ]} for the Y-optional projection of the process under the measure P ,
that is
(10.3.10)
t = E t Yt
P -a.s. We call the lter for .
The Y-optional projection of the process L under the measure P will be
denoted by () = {(t ), t [0, T ]}. Now, following Elliott et al. (1995) by
applying a form of the Bayes rule implies that
(t )
t =
(1)
(10.3.11)
t
H t = H0 +
s ds +
dM
+
s dWs .
(10.3.12)
s
s
0
Here and are A-predictable, square integrable processes and is an Apredictable, square integrable, d-dimensional vector process. Dierent specications of , and will allow us to obtain various lters. Using equation
(10.3.6) and the It
o formula for semimartingales, see (1.5.15), we obtain
Ht X t = H 0 X 0 +
s X s ds +
0
s X s dWs
Hs As X s ds +
0
X s (
s dM s ) +
437
Hs dM s +
0
(
s X s ) X s . (10.3.13)
0<st
(Ht X t ) = (H0 X 0 ) +
(s X s )ds +
0
i,j=1
As (Hs X s ) ds
0
((sj X s si X s ) ei ) aj,i
s ds (ej ei )
1 ( B s (Hs X s ) + (s X s )) drs ,
(10.3.14)
(10.3.15)
t
(X t ) = E(X 0 ) +
As (X s )ds +
1 B s (X s ) drs .
(10.3.16)
0
(X t )
,
(X t ) 1
(10.3.17)
438
10 Filtering
and unknown, and the vector a of possible values for the reference level is also
unknown. In this situation we can use the, so called, Expectation Maximization
(EM) algorithm, see Dembo & Zeitouni (1986), to estimate these unknown
parameters after observing the process r for a given time period, say up to
time t. One of the basic properties ofthe transition intensity matrix A is
d
that for each i {1, 2, . . . , d} one has j=1 ai,j = 0, so that there is no need
to estimate the elements ai,i for i {1, 2, . . . , d} if the others are already
estimated.
We shall use the notation for the set of parameters that we want to
estimate. These are, for instance, the transition intensities ai,j and reference
levels ai , which we are estimating, so = {ai,j , ai : i, j {1, 2, . . . , d}}. The
rst step in the EM algorithm is to choose an initial guess 0 for the given
parameter set. The EM algorithm is then applied to produce an estimate
1 for the parameters. This is repeated iteratively to produce a sequence of
estimates (n )nN . Each iteration consists of two steps:
The rst is the expectation step or E-step. In this step we start with n , the
nth iteration of estimated parameters. If is some set of possible parameter
values we shall denote by P the probability measure induced by the values
on (, AT , A), and En shall denote expectation under the measure Pn . We
then compute the conditional expectation of the log-likelihood function
dP
Yt .
Q(, n ) = En ln
dPn
The second step is the maximization step or M-step. It consists of maximizing
n ) with respect to to obtain the new estimate n+1 . In this way the
Q(,
EM algorithm maximizes iteratively the likelihood that the estimated parameters are the true underlying parameters. For a detailed discussion of the EM
algorithm we refer to Dembo & Zeitouni (1986).
To apply
the EM algorithm in our case we begin by nding an expression
dP
ai,j , a
i : i, j {1, 2, . . . , d}} and n = {ai,j
for dP A , where = {
n , an,i :
t
n
i, j {1, 2, . . . , d}} are two possible parameter sets.
Let Jtij denote the number of jumps that the process X makes from state
ei to ej in the interval [0, t]. Using Theorem T3 in Chapter VI of Bremaud
(1981) in conjunction with the Girsanov Theorem, see Sect. 2.7, one can calculate that
t
dP
an ) drs
= exp
1 (X
s a
dPn At
0
1 t 2
2
rs )2 (X
ds
(X s a
a
r
)
s
s n
2 0
d
"
exp
i,j=1, i =j
ln
0
a
j,i
aj,i
n
t
aj,i aj,i
)
(X
e
)
ds
, (10.3.18)
dJsij (
s i
n
0
439
= (
where a
a1 , a
2 , . . . , a
d ) and an = (an,1 , an,2 , . . . , an,d ) .
= di=1 a
Using the fact that X
i (X
s a
s ei ) we nd
dP
n ) = En ln
Q(,
Yt
dPn At
=
d
1 a
i En
(X
s ei ) drs Yt
i=1
1
2 En
2
)2
((X
s a
)) ds Yt
2rs (X
s a
ln(
aj,i ) En (Jtij Yt )
d
i,j=1, i =j
a
j,i En
(X
s ei ) ds Yt
+ R(n ),
(10.3.19)
so
n ) =
Q(,
d
a
i En
d
1
i=1
d
2 (
ai )2 En
Yt
(X
s ei ) ds Yt
2 a
i En
rs (X
s ei ) ds Yt
i=1
d
(X
s ei ) drs
i=1
ln(
aj,i ) En Jtij Yt
i,j=1, i =j
j,i
En
(X
s ei ) ds Yt
+ R(n ).
(10.3.20)
i
to by equating the partial derivatives of equation (10.3.20) in a
j,i and a
i,j
i
to zero. This implies that the next parameter set n+1 = {an+1 , an+1 : i, j
{1, 2, . . . , d}} generated by the EM algorithm is given by
440
10 Filtering
aj,i
n+1
1
t
ij
= En (Jt Yt )
(X s ei ) ds Yt
En
(10.3.21)
and
ain+1 =
1 En
Yt
Yt + En t rs (X
e
)
ds
i
s
0
. (10.3.22)
t
En 0 (X
s ei ) ds Yt
t
(X
s ei ) drs
0
t
(X
(10.3.23)
Oti =
s ei ) ds,
Kti =
(X
s ei ) drs ,
(10.3.24)
rs (X
s ei ) ds.
(10.3.25)
and
Iti
1 (Kti ) + (Iti )
.
(Oti )
(10.3.27)
441
1 t 2 2
2
2 2
2
(
((X
)
(X
a
r
)
)ds
R1 .
a
)
r
s
n
s
s
s
n
2 0
= (
Here a
a1 , a
2 , . . . , a
d ) , an = (an,1 , an,2 , . . . , an,d ) with R1 denoting a
. Now
term that is independent of and a
t
t
2
drs Yt En
Q(, n ) = En
Xs a
rs drs Yt
0
t
t
1 2
2
ds Yt 2 En
ds Yt
En
Xs a
rs X s a
2
0
0
t
+ En
rs2 ds Yt
(10.3.29)
+ R2 ,
0
recalling that En means expectation with respect to Pn . Here R2 is indepen . Employing as shorthand notation for the Y-optional
dent of both and a
projection of the process under the measure Pn and using the processes
Oi , Ki and I i , see equations (10.3.23) to (10.3.25), we proceed as previously
and write equation (10.3.29) as
d
t
2
i
n ) =
t
Q(,
a
i K
rs drs
i=1
1
2
2
t
d
d
2 i
i
2
(
ai ) O t 2
a
i It + rs ds .
i=1
(10.3.30)
i=1
C
,
rs drs
C =
0
(10.3.31)
d i i
I K
t
i=1
ti
O
(10.3.32)
442
10 Filtering
and
=
D
d
(I i )2
t
i=1
ti
O
rs2 ds.
(10.3.33)
1 i
Kt + Iti
n+1
.
ti
O
(10.3.34)
Simulation Example
For illustration, the lter methods derived above are now applied in a simulation study. The results computed use discrete-time approximate lters as
proposed in the previous section. The approximate lters were calculated using the Milstein scheme, see (5.3.19) with a suciently small time step size
to ensure high accuracy and numerical stability. Alternative methods, which
provide potentially more numerical stability can also be applied, as will be
discussed in Sect. 10.4.
The mean reverting process r was simulated using a total of 800,000 discretization points. In the simulation the hidden process is a three state
continuous time Markov chain, that is, we have d = 3. The intensity matrix
for the process is assumed to be
0.8 0.8
0
A = 1.5 3.0 1.5
(10.3.35)
0 0.4 0.4
The other parameter values were taken to be = 1.2, = 8.0, a1 = 0.3,
a2 = 1.9, a3 = 2.4. The initial values for the EM algorithm to start with, are
set to
1.0 1.0
0
A = 0.5 1.0 0.5
(10.3.36)
0 1.0 1.0
with = 5.0, a1 = 0.0, a2 = 1.0 and a3 = 2.0.
The EM algorithm was run over 50 iterations. Figs. 10.3.1 to 10.3.4 show
the evolution of the estimates of the parameters in dependence on the number
of iterations and include the true values for reference.
The simulation was conducted over the extremely long time interval
[0, 20000] to ensure that each state in the Markov chain was visited suciently often. The accurate estimation of the elements of the intensity matrix
A depends on the detection of numerous jumps in the Markov chain. With
shorter time series one already may be close to the true value, though no
guarantee can be given. In the rst part of the simulation with about 2, 000
discretization points over the rather short time interval [0, 10] the EM algorithm visually appeared to converge rather fast, however, the estimates could
still vary substantially.
443
Fig. 10.3.2. The evolution of the estimates of a2,1 = 0.8 and a3,2 = 0.4
(10.3.37)
444
10 Filtering
we nd that
(Oti X t ) =
((X s ) ei ) ei ds +
A (Osi X s ) ds
0
1 B s (Osi X s ) drs .
(10.3.38)
(Oti ) = (Oti X t ) 1
and the lter for Oi
E Oti Yt =
445
(10.3.39)
(Oti )
.
(X t ) 1
(10.3.40)
(X
s ei ) ej dM s .
(10.3.41)
Jtij
j,i
(X
s ei ) a
ds +
e
aj,i (ej ei )
(s X s sk X s ) ek a,k (e ek ) = X
e
e
i
i
s
i
k,=1
j,i
= X
s ei a (ej ei ). (10.3.42)
Applying Theorem 10.3.1 and using equations (10.3.42) and (10.3.37) we nd
the recursive equation for (Jtij X t ) in the form
aj,i ((X s ) ei ) ej ds +
(Jtij X t ) =
0
A (Jsij X s ) ds
0
1 B s (Jsij X s ) drs .
(10.3.43)
(10.3.44)
(10.3.45)
446
10 Filtering
Kti =
(ai rs ) (X
s ei ) ds +
(X
s ei ) dWs .
(10.3.46)
(Kti X t ) =
(ai rs ) ((X s ) ei ) ei ds +
A (Ksi X s ) ds
0
(10.3.47)
(10.3.48)
(Kti )
.
(X t ) 1
(10.3.49)
(Iti X t ) =
rs (X s ) ei ei ds +
A (Isi X s ) ds
0
1 B s (Isi X s ) drs .
(10.3.50)
(10.3.51)
(Iti )
.
(X t ) 1
(10.3.52)
447
t
s ds + Wt ,
(10.4.1)
Wt =
0
for t [0, T ]. One can say that W represents the time integral over the signal
process that is corrupted by a Wiener process W . We estimate the hidden
state of the Markov chain by observing only the values of the process W .
448
10 Filtering
(10.4.3)
for t [0, T ], with given initial probability vector p(0). H is the diagonal
d d matrix that has the elements of the vector a as diagonal elements and
is zero elsewhere.
Denote by Yt the observation sigma-algebra generated by W up to time t.
The Wonham lter for X at time t is then given as
t = E(X t Yt ).
X
As we have seen in Sect. 10.2, the theoretical solution to the problem of
t involves the unnormalized lter for the conditional distribution
calculating X
of X, which is denoted by (X) = {(X t ), t [0, T ]} and satises the Zakai
equation
(10.4.4)
d(X t ) = A (X t ) dt + D (X t ) dWt
for t [0, T ]. The Wonham lter for X is then computed as
t = E(X t Yt ) =
X
(X t )
(X t ) 1
(10.4.5)
449
(10.4.6)
1 2
H Y n ((Wn )2 )).
2
(10.4.7)
The Milstein scheme itself is then given by equation (10.4.7) with the choice
= 0.
To construct a balanced implicit method (7.3.9)(7.3.10) we must specify
the functions c0 and c1 in (7.3.10). When implementing numerical schemes to
approximately solve an SDE, problems arise when the coecients of (X t ) in
the It
o dierentials become large in magnitude. With this in mind we choose
c0 = 0 and c1 = |H| in the balanced implicit method which yields
I + A + H Wn
Y n+1 = Y n
.
(10.4.8)
1 + |H| |Wn |
The Case of No Mean Reversion
For our numerical simulations the parameters of the Markov chain X, or
equivalently , are chosen to give a realistic multiplicative noise term in the
Zakai equation (10.4.4). The simulated hidden Markov chain X is chosen to
have three states, with the vector a taken to be a = (5, 0, 5) . Experiments
have shown that using more states does not change the nature of the results
that we obtain. The intensity matrix A of the hidden Markov chain is chosen
to be of the simple form
1.0 1.0
0
A = 0.5 1.0 0.5 .
(10.4.9)
0 1.0 1.0
450
10 Filtering
This describes how the Markov chain jumps with prescribed intensities to
neighboring states. For instance, the intensity to jump from level 0 to level 5
is 0.5 per unit of time.
Now, let us investigate the approximate calculation of the Wonham lter.
We simulate the signal and observation processes over the time interval [0, T ]
with T = 10. The simulated output can be seen in Fig. 10.4.1.
Consider the probability
ei = E X ei Yt
(10.4.10)
qti = X
t
t
which, for i {1, 2, 3} is the ltered probability that the hidden Markov chain
X is in the ith state. For illustration let us consider qt1 , which corresponds to
the level = 5, that is i = 1.
To obtain the quantity qt1 we have to solve the SDE (10.4.4). The solution
has to be calculated numerically using any of the above mentioned strong
schemes on the basis of the discrete-time observations of W . When the step
1
, then all four numerical
size was chosen extremely small with about 1000
schemes produced virtually identical results. Fig. 10.4.2 displays a plot of qt1
1
as calculated by the semi-drift-implicit Milstein scheme with = 1000
and
1
= 0.5, showing that the high levels of qt track well the periods when the
hidden state indeed equals its highest level, = 5. However, such a ne time
discretization for the observation process is often not available in real world
nancial ltering.
Dierences between the schemes become apparent when the time step size
is chosen to be larger. Using the same realizations of the observation and
signal processes that were given in Fig.10.4.1, the Figs.10.4.3 to 10.4.6 display
the plots of the ltered probability qt1 , see (10.4.10), calculated by the four
1
. For this step size we
dierent schemes when using the larger step size = 20
can see that the only acceptable scheme appears to be the balanced implicit
451
Fig. 10.4.2. qt1 - obtained by the drift-implicit Milstein scheme with = 0.5 and
1
= 1000
1
20
method. In the other cases we even obtain negative probabilities and other
unrealistic estimates as lter values. The balanced implicit method makes the
lter work even for the given larger time step size, which is a remarkable fact.
Mean Reverting Observation Process
The second problem that we consider refers to a linearly mean reverting obser . Using mostly the same notation as in the previous example,
vation process W
is here assumed to satisfy the linear mean reverting
the observation process W
SDE
t ) dt + dW
t = (t W
(10.4.11)
dW
t
452
10 Filtering
1
20
1
20
(10.4.12)
(X t ) 1
for t [0, T ].
453
1
20
The Euler method, when approximating the solution of the Zakai equation
(10.4.12), has the form
n ) Y n W
n,
Y n+1 = Y n + A Y n + (H I W
(10.4.13)
n = W
= W
, = n+1 n and n
where W
n+1 Wn , Wn
n
{0, 1, . . . , N 1}. The drift-implicit Milstein scheme that approximates the
solution of (10.4.12) yields
n ) Y n W
n
Y n+1 = (I A )1 (Y n + (1 )A Y n + (H I W
+
1 2
n )2 Y n ((W
n )2 )),
(H I W
2
(10.4.14)
where [0, 1]. With the choice = 0 the algorithm (10.4.14) yields the
Milstein scheme when applied to (10.4.12).
Using the same reasoning as in the previous example we implement the
balanced implicit method (7.3.9)(7.3.10) by choosing c0 = 0 and c1n = | (H
n )| such that
IW
n ) W
n
I + A + (H I W
Y n+1 = Y n
.
(10.4.15)
n )| |W
n|
1 + | (H I W
In this example we choose to have the following set of ten states
{0, 1, . . . , 9}, with transition intensity matrix A such that leaves each state
with intensity 1, that is ai,i = 1. It jumps only to the neighboring states
with equal probability, that is with intensity 0.5 up or down if there are two
neighboring states and otherwise with intensity 1 to the neighboring state.
We choose the speed of adjustment parameter = 8 in the mean reversion
term of the SDE (10.4.11).
454
10 Filtering
Fig. 10.4.8. qt3 - obtained by the drift-implicit Milstein scheme with = 0.5 and
1
= 2000
The numerical experiments that were carried out in the case without mean
reversion are repeated in the mean reversion case, with the minor change that
all of the time step sizes are now halved. In this second study we again rst
inspect the trajectories generated by a single realization of the signal and
1
, Fig. 10.4.7 shows the
observation processes. For the time step size = 2000
. The
simulated paths of the signal process and the observation process W
generated observation process is a model for some quantity that is uctuating
around the hidden reference level, which is here allowed to take ten dierent
values.
To compare the output from the various numerical schemes we consider
qt3 = E(X
t e3 Yt ), which is the ltered probability that X is in the third
455
1
40
1
40
state, which has the value = 2. Fig. 10.4.8 shows the plot of the approximate value of qt3 computed by the drift-implicit scheme with = 0.5 with an
1
. We interpret this trajectory as being
extremely ne time step size = 2000
3
close to the exact trajectory of qt . Note in fact the correct detection of the
periods when the hidden state was indeed equal to 2.
Figs. 10.4.9 to 10.4.12 display the lter qt3 calculated by using the four alternative schemes when the observation time step size is much larger, namely
1
. Here only the balanced implicit method produces a useful lter.
= 40
For this rough step size the trajectory of this lter still approximates surprisingly well the trajectory that was computed with the much ner step size in
Fig. 10.4.8. The other numerical schemes clearly fail for the larger step size
due to their insucient numerical stability in ltering.
456
10 Filtering
1
40
1
40
The balanced implicit method can potentially provide the necessary reliability that needs to be guaranteed. It extends considerably the class of
problems that are amenable to the application of stochastic numerical techniques.
457
458
10 Filtering
cases when no equivalent risk neutral probability measure exists, the benchmark approach allows us still to price contingent claims consistently. The
previous section highlights how lters can be numerically calculated. Now, we
employ lters in a nancial setting.
Factor Model
To build a nancial market model with a suciently rich structure and computational tractability we introduce a multi-factor model with n 2 factors
Z 1 , Z 2 , . . . , Z n , forming the vector process
Z = Z t = Zt1 , . . . , Ztk , Ztk+1 , . . . , Ztn , t [0, T ] .
(10.5.1)
We shall assume that not all of the factors are observable. More precisely, only
the rst k factors are directly observed, while the remaining n k are not.
Here k is an integer with 1 k < n that we shall suppose to be xed during
most of this section. However, later on we shall discuss the implications of a
varying k. For xed k we shall consider the following subvectors of Z t
Y t = (Yt1 , . . . , Ytk ) = (Zt1 , . . . , Ztk )
and
(10.5.2)
(10.5.3)
(10.5.4)
for t [0, T ] with given vector Z 0 = (Y01 , . . . , Y0k , X01 , . . . , X0nk ) of initial
values. Here
W = W t = Wt1 , . . . , Wtk , Wtk+1 , . . . , Wtn , t [0, T ]
459
(10.5.5)
(10.5.6)
(10.5.7)
for t [0, T ] and i {1, 2, . . . , k}. This means that we assume, without loss
of generality, that the jump intensities of the rst k counting processes are
observed. The ith (A, P )-jump martingale is then dened by the stochastic
dierential
(10.5.8)
dMti = dNti it (Z t ) dt
for t [0, T ] and i {1, 2, . . . , n}. In the SDE (10.5.3)
N t = Nt1 , . . . , Ntk
(10.5.9)
denotes the vector of the rst k counting processes at time t [0, T ]. Concerning the coecients in the SDE (10.5.3)(10.5.4), we assume that the vectors
at (Z t ), At (Z t ), t (Z t ) and the matrices bt (Z t ), B t (Y t ), g t (Z t ) and Gt (Y t )
are such that a unique strong solution of (10.5.3) exists that does not explode
until time T , see Chap. 1. We shall also assume that the k k-matrix B t (Y t )
is invertible for all t [0, T ]. Finally, g t (Z t ) may be any bounded function
and the k k-matrix Gt (Y t ) is assumed to be a given function of Y t that is
invertible for each t [0, T ]. This latter assumption implies that, since there
are no common jumps among the components of N t , by observing a jump of
Y t we can establish which of the processes N i , i {1, 2, . . . , k}, has jumped.
Finite Dimensional Filter
In addition to the ltration A = (At )t[0,T ] , which represents the total information, we shall also consider the subltration
Y k = (Ytk )t[0,T ] A,
(10.5.10)
where Ytk = {Y s = (Zs1 , . . . , Zsk ) , s [0, t]} represents the observed information at time t [0, T ]. Thus Y k provides the structure of the evolution
460
10 Filtering
(10.5.13)
which consists of the k observed factors and the q components of the lter
state process. Furthermore, the lter state t satises an SDE of the form
k ) dt + D t (Z
k ) dY t
d t = J t (Z
t
t
(10.5.14)
461
Xi X
i Yk
Xt X
t
t
t
t
(10.5.17)
t and C t on k
for t [0, T ] and , i {1, 2, . . . , n k}. The dependence of X
is for simplicity suppressed in our notation. The above lter can be obtained
from a generalization of the well-known Kalman-Bucy lter, see Chapter 10
in Liptser & Shiryaev (1977) and Sect. 10.1, namely
.
/
0
t B t (Y t ) + C t (A1 )
t = a0 + a1 X
t + a2 Y t dt + b
dX
t
t
t
t
.
t + A2 Y t dt
(B t (Y t ) B t (Y t ) )1 dY t A0t + A1t X
t
0
/
1
(10.5.18)
462
10 Filtering
of (10.5.18). These coecients are given deterministic functions of time, except for B t (Y t ) which depends also on observed factors. The value of B t (Y t )
becomes known only at time t. However, this is sucient to determine the
solution of (10.5.18) at time t. The model (10.5.15) is in fact a conditionally
Gaussian lter model, where the lter process is given by the vector pro = {X
t , t [0, T ]} and the upper triangular array of the elements
cess X
. Note
of the matrix process C = {C t , t [0, T ]} with q = (n k) [3+(nk)]
2
by (10.5.17) that the matrix C t is symmetric. Obviously, in the case when
B t (Y t ) does not depend on Y t for all t [0, T ], then we have a Gaussian lter
model.
Finite-State Jump Filter Model
Here we assume that the unobserved factors form a continuous time, (n k)dimensional jump process X = {X t = (Xt1 , . . . , Xtnk ) , t [0, T ]}, which
can take a nite number M of values. More precisely, given a Z t -dependent
at
matrix g t (Z t ) and an intensity vector t (Z t ) = 1t (Z t ), . . . , nt (Z t )
= {N
t = (N 1 , . . . , N n ) ,
time t [0, T ] for the vector counting process N
t
t
t [0, T ]}, we consider the particular case of the model equations (10.5.3)
(10.5.4), where in the X-dynamics we have at (Z t ) = g t (Z t )t (Z t ) and
bt (Z t ) = 0. Thus, by (10.5.3)(10.5.4) and (10.5.8) we have
t
dX t = g t (Z t ) dN
(10.5.19)
for t [0, T ]. Notice that the process X of unobserved factors is here a pure
jump process and is therefore piecewise constant. On the other hand, for the
vector Y t of observed factors at time t we assume that it satises the equation
(10.5.3) with Gt (Y t ) = 0. This means that the process of observed factors Y
is only perturbed by continuous noise and does not jump.
In this example, the lter distribution is completely characterized by the
vector of conditional probabilities p(t) = (p1 (t), . . . , pM (t)) , where M is the
number of possible states 1 , . . . , M of the vector X t and
p(t)j = P X t = j Ytk ,
(10.5.20)
h
for t [0, T ] and j {1, 2, . . . , M }. Let a
i,j
t (y, ) denote the transition kernel
for X at time t to jump from state i into state j given Y t = y and X t = h ,
see Liptser & Shiryaev (1977). The components of the vector p(t) satisfy the
following dynamics
.
j
t (Y t , p(t))
t (Y t , p(t)) p(t) dt + p(t)j At (Y t , j ) A
dp(t)j = a
.
1
t (Y t , p(t)) dt ,
dY t A
B t (Y t ) B t (Y t )
where
(10.5.21)
M
M
j
h
h
t (Y t , p(t)) p(t) =
a
i,j
a
t (Y t , ) p(t)
i=1
463
p(t)i
h=1
At (Y t , ) = At (Y t , X t )
j
t (Y t , p(t)) =
A
M
Xt = j
At (Y t , j ) p(t)j
(10.5.22)
j=1
(10.5.24)
is as dened in (10.5.13).
where the vector Z
t
Notice that, in the case of the conditionally Gaussian model, the expression
k ) takes the particular form
t (Z
A
t
k ) = A0 + A1 X
t (Z
t + A2 Y t .
A
t
t
t
t
(10.5.25)
k ) can be represented as
t (Z
Furthermore, for the nite-state jump model, A
t
k) = A
t (Y t , p(t)) =
t (Z
A
t
M
At (Y t , j ) p(t)j
(10.5.26)
j=1
464
10 Filtering
Proposition 10.5.1
(Platen-Runggaldier) Let At (Z t ) and the invertible matrix B t (Y t ) in (10.5.3)(10.5.4) be such that
and
B t (Y t ) B t (Y t ) dt <
(10.5.27)
t (Z
) dt + B t (Y t ) dV t + Gt (Y t ) dN t
dY t = A
t
(10.5.28)
t (Z
k ) as in (10.5.24).
with A
t
Proof:
is
B t (Y t ) dV t = dY C
t At (Z t ) dt.
From (10.5.3)(10.5.4), (10.5.29) and (10.5.30) it follows that
.
t (Z
k ) dt.
dV t = dV t + B t (Y t )1 At (Z t ) A
t
(10.5.30)
(10.5.31)
t
exp (V t V s ) = 1 +
exp (V u V s ) dV u
u (Z
k ) du
exp (V u V s ) B 1
(Y
)
A
(Z
)
A
u
u
u
u
u
2
exp (V u V s ) du.
(10.5.32)
and that, by our assumptions and by the boundedness of exp (V u V s ) ,
465
u (Z
k ) du Y k =
exp (V u V s ) B 1
(Y
)
A
(Z
)
A
u
u
u
u
u
s
k
1
= 0.
(10.5.34)
Taking conditional expectations on the left and the right hand sides of
(10.5.32) we end up with the equation
t
E exp [(V t V s )] Ysk = 1
E exp (V u V s ) Ysk du,
2 s
(10.5.35)
which has the solution
k
E exp (V t V s ) Ys = exp
(t s)
(10.5.36)
2
for 0 s t T . We can conclude that (V t V s ) is a k-dimensional vector
of independent Ytk -measurable Gaussian random variables, each with variance
(ts) and independent of Ysk . By Levys theorem, see Sect. 1.3, V is thus a
k-dimensional Y k -adapted standard Wiener process.
k ) Gt (Y t ) dN t
+ D t (Z
t
t (Z
k ) dt + D
t (Z
k ) dV t + G
t (Z
k ) dN t ,
=J
t
t
t
(10.5.37)
466
10 Filtering
(10.5.38)
k = {Z
k , t [0, T ]} is
From Corollary 10.5.2 it follows that the process Z
t
Markovian.
Due to the existence of Markovian lter dynamics we have our original
Markovian factor model, see (10.5.3)(10.5.4), projected into a Markovian
model for the observed quantities. Here the driving observable noise V is an
(Y k , P )-Wiener process and the observable counting process N is generated
by the rst k components N 1 , N 2 , . . . , N k of the given n counting processes.
k =
Given k, for ecient notation, we write for the vector of observables Z
t
t = (Z 1 , Z 2 , . . . , Z k+q ) the corresponding system of SDEs in the form
Z
t
t
t
dZt = (t, Zt1 , Zt2 , . . . , Ztk+q ) dt +
k
r=1
k
k+q
1
2
dNtr
,r t, Zt
, Zt
, . . . , Zt
(10.5.39)
r=1
k = Z,
We also have as an immediate consequence of the Markovianity of Z
as well as property (10.5.11), the following result.
Corollary 10.5.3
(10.5.40)
467
Ytd+1 = Ztd+1 = rt
for t [0, T ].
Since the d + 1 primary security account processes coincide with the observable factors Y 1 , . . . , Y d+1 , we may write their dynamics in a form corresponding to (10.5.39). To this eect let for i {1, 2, . . . , k}, by analogy to
(10.5.8),
i (Z
t ) dt
i = dN i
(10.5.41)
dM
t
t
t
be the compensated ith (Y k , P )-jump martingale relative to the ltration Y k .
i (Z
t ) the compenHere, with some abuse of notation, we have denoted by
t
k
i
sating jump intensity for N with respect to Y . For simple notation, in what
t for Z
k , see (10.5.39). Let us now rewrite (10.5.39)
follows we shall often use Z
t
more concisely in vector form as
t = (t,
t ) dt + (t, Z
t ) dV t + (t, Z
t ) dM
t
Z
dZ
(10.5.42)
t (Z
t ) = (t, Z
t ) + (t, Z
t )
t ),
Z
(t,
(10.5.43)
with
are the k-vectors with components M
i , respectively.
and
i and
where M
t ) are
h1
i=1
t ) dV i +
j,i (t, Z
t
h2
t ) dM
j, (t, Z
t
(10.5.44)
=1
468
10 Filtering
dStj
j
St
rt dt +
h1
bj,i
t
d
h1 + dt
dVti + ti dt +
bj,
t
t dMt
i=1
=h1 +1
(10.5.45)
for t [0, T ] with Stj > 0, j {0, 1, . . . , d}. Here we set S00 = 1 and b0,i
t = 0
for t [0, T ] and i {0, 1, . . . , d}, where rt is the short rate. Above we have
in (10.5.45) for i {1, 2, . . . , h1 } the volatility
bj,i
t =
t)
j,i (t, Z
Stj
(10.5.46)
t )
j,ih1 (t, Z
j
St
(10.5.47)
d
for t [0, T ] and j {1, 2, . . . , d}. We assume that the matrix bt = [bj,i
t ]j,i=1
is invertible for all t [0, T ]. This allows us to write the market price of risk
vector t = (t1 , . . . , td ) in the form
t = b1
t [at rt 1]
(10.5.48)
j (t, Z
(10.5.49)
ajt =
j
St
for t [0, T ] and j {1, 2, . . . , d}.
We say that an Y k -predictable stochastic process = { t = (t0 , . . . , td ) ,
t [0, T ]} is a self-nancing strategy, if is S-integrable. The corresponding
portfolio St has at time t the value
St =
d
tj Stj
(10.5.50)
j=0
d
j
t
dStj
(10.5.51)
j=0
dSt = St
h
1
ti (ti dt + dVti )
i=1
d
469
i=h1 +1
i
t
ih1 (Z
t ) i
i
tih1
t
dt + dM
!
(10.5.52)
(10.5.53)
i
dSt = S
(bj,i
t ) dV
t
i=1
d
bj,i
t
i
1 ih1t
t )
(Z
i=h1 +1
i
ih1t
t )
(Z
!
tih1
dM
(10.5.54)
for t [0, T ] and j {0, 1, . . . , d}, see Chap. 3.
Furthermore, by application of the It
o formula it can be shown that the
benchmarked portfolio S = {St , t [0, T ]} with
S
St = t
St
(10.5.55)
h1
d
d
d
j j
j j
S
t t j,i
t t j,i
dSt = St
b ti dVti +
bt + 1
t t
S
S
t
i=1
j=1
j=1
i=h1 +1
i
1 ih1t
t )
(Z
!
tih1
1 dM
*
(10.5.56)
for t [0, T ].
Note again that the jth benchmarked primary security account Sj and all
benchmarked portfolios are driftless and, thus, (Y k , P )-local martingales. Due
to a result in Ansel & Stricker (1994) any nonnegative benchmarked portfolio
470
10 Filtering
471
(10.5.59)
for t [0, ], which by Corollary 10.5.3 can be computed on the basis of the
previous ltering results. Combining (10.5.58) with (10.5.59) and using the
fact that St is Ytk -measurable, we obtain
k
)
u
k (t, Z
t
=E
St
U (, Y ) k
Yt
S
k
u
k (t,Z t )
St
k
(10.5.60)
forms for t [0, ] a
472
10 Filtering
for some xed r {1, 2, . . . , k}, where we assume that the number of observed
k ) be the corresponding
factors that inuence the claim equals r. Let u
k (t, Z
t
fair price at time t under the information Ytk , as given by (10.5.59). Recall
k ) is the conditional expectation, under the real
that, by (10.5.59), u
k (t, Z
t
world probability measure, of u(t, Z t ) given Ytk , which implies that the corresponding conditional variance
2
k) Yk
u(t, Z t ) u
k (t, Z
(10.5.62)
Varkt (u) = E
t
t
at time t [0, ) is the minimal value of the mean square error, conditional
on Ytk , corresponding to the deviation from u(t, Z t ) of any Ytk -measurable
random variable. This conditional variance is computed under the real world
probability measure. It would not make sense if computed under any other
probability measure since the market participants are aected by the real
k ).
dierence between u(t, Z t ) and u
k (t, Z
t
Note that for larger k we have more information available, which naturally should reduce the above conditional variance. In Platen & Runggaldier
(2005) the following result is shown, which quanties the reduction in conditional variance and can also be seen as a generalization of the celebrated
Rao-Blackwell theorem towards ltering in incomplete markets.
Theorem 10.5.4. (Platen-Runggaldier) For m {0, 1, . . . , n k} and
k {1, 2, . . . , n 1} we have
k
Yt = Varkt (u) Rtk+m ,
E Vark+m
(u)
(10.5.63)
t
where
Rtk+m = E
2
k
k+m ) u
k) Yk
u
k+m (t, Z
(t,
Z
t
t
t
(10.5.64)
for t [0, ).
Proof:
2
k)
u(t, Z t ) u
k (t, Z
t
2
2
k+m ) + u
k+m ) u
k)
= u(t, Z t ) u
k+m (t, Z
k+m (t, Z
k (t, Z
t
t
t
k+m
k
k+m ) u
k+m ) u
k ) . (10.5.65)
+ 2 u(t, Z t ) u
k+m (t, Z
(t,
Z
(t,
Z
t
t
t
By taking conditional expectations with respect to Ytk on both sides of the
above equation it follows that
473
k
k+m
k+m
k
k+m ) u
k)
Y
+
R
Varkt (u) = E Vark+m
(u)
+
2
E
u
(t,
Z
(t,
Z
t
t
t
t
t
k+m ) Y k+m Y k .
(10.5.66)
k+m (t, Z
E u(t, Z t ) u
t
t
t
Since the last term on the right hand side is equal to zero by denition, we
obtain (10.5.63).
Hedging Strategy
To determine a hedging strategy in an incomplete market we have to use one
of the hedging criteria for incomplete markets. It turns out that the real world
pricing concept is related to local risk minimization, see F
ollmer & Schweizer
(1991) and DiMasi, Platen & Runggaldier (1995). To see this let us introduce
the benchmarked pricing function
k
)=
u
(t, Z
t
k)
u
k (t, Z
t
St
(10.5.67)
k ) we introduce
for t [0, T ]. To express conveniently the evolution of u
(t, Z
t
for i {1, 2, . . . , k + q} the operator
k
)=
(t, Z
Li u
t
k+q
=1
k)
u
(t, Z
t
Z
(10.5.68)
u t, Zt
(10.5.69)
= {
u(t, Z t ), t [0, ]} the martingale
formula for the (Y , P )-martingale u
representation
U (, Y )
k)
=u
(, Z
S
k ) + It, + R
t, ,
=u
(t, Z
t
(10.5.70)
474
10 Filtering
k
It, =
L u
(s, Z s ) dVs +
=1
=1
k ) dM
u (s, Z
s
s
k
k
k ) dV +
t, =
R
L u
(s, Z
s
s
=h1 +1
(10.5.71)
=h2 +1
) dM
(10.5.72)
u (s, Z
s
s
j
U
(t) Stj
V (t)
(10.5.73)
=
(10.5.74)
t
t
VU (t)
j=1
and for i {1, . . . , h2 } the relation
d
i
k )
iu (t, Z
t
j
j,i
t
1=
U (t) bt + 1
1 ih1
t (Z t )
VU (t)
j=1
(10.5.75)
for t [0, ]. The equations (10.5.74) and (10.5.75) lead with eU (t) =
(e1U (t), . . ., edU (t)) , where
k
L u
(t,Z t )
+ t
for {1, 2, . . . , h1 }
VU (t)
eU (t) =
k
i
i
u
(t,Z t )
ih1 (Z
+t
t )
V (t)
i
ih1 (Z
t )t
t
(10.5.76)
and U (t) = (1U (t), . . . , dU (t)) to the vector equation
eU (t) =
(10.5.77)
U (t) bt
for t [0, ]. Consequently, summarizing the above results we obtain the
following statement.
10.6 Exercises
475
Proposition 10.5.5
The hedgable part of the contingent claim U can be
replicated by the portfolio VU with fractions
U (t) = eU (t) b1
(10.5.78)
t
for t [0, T ].
The resulting hedging strategy can be interpreted as a local risk minimizing strategy in the sense of F
ollmer & Schweizer (1991). The corresponding
benchmarked version of the F
ollmer-Schweizer decomposition has the form
t, , see
(10.5.70). Note that the driving martingales in the unhedgable part R
(10.5.72), are orthogonal to the martingales that drive the primary security
accounts and, thus, the hedgable part It, , see (10.5.71). Obviously, to perform a local risk minimizing hedge the corresponding initial capital at time t
k ), see (10.5.59), of the contingent claim.
equals the fair price u
k (t, Z
t
10.6 Exercises
10.1. Name some type of nite-dimensional lters that can be applied in continuous time nance when observed and hidden quantities follow a continuous
Gaussian process.
10.2. Name a class of nite-dimensional lters, where the observations are
continuous and the hidden quantities jump.
10.3. Why, in practice, discrete-time strong numerical schemes are relevant
in ltering.
10.4. Name a strong numerical scheme that is fully implicit and can potentially overcome numerical instabilities for a range of step sizes when approximating the Zakai equation for the Wonham lter.
10.5. For an independent constant hidden Gaussian growth rate with mean
> 0 and variance 2 > 0 determine for an observed strictly, positive asset
price process S = {St , t [0, T ]} with SDE
1
dSt = St +
dt + St dWt
2
for t [0, T ] with S0 > 0, the least-squares estimate
t = E Yt
under the information given by the sigma-algebra Yt = {Sz : z [0, t]},
which is generated by the security S. Here W = {Wt , t [0, T ]} is an
independent standard Wiener process.
10.6. What is the purpose of an Expectation Maximization algorithm.
11
Monte Carlo Simulation of SDEs
477
478
m
bj (t, X t ) dWtj
(11.1.1)
j=1
() = |E(g(X T )) E(g(Y
T ))| Cg
(11.1.2)
for each (0, 0 ). We call relation (11.1.2) the weak convergence criterion.
Systematic and Statistical Error
Under the weak convergence criterion (11.1.2) functionals of the form
u = E(g(X T ))
are approximated via weak approximations Y of the solution of the SDE
(11.1.1). One can form a standard or raw Monte Carlo estimate using the
sample average
N
1
uN, =
g(Y
(11.1.3)
T (k )),
N
k=1
(11.1.4)
which we decompose into a systematic error sys and a statistical error stat ,
such that
(11.1.5)
N, = sys + stat .
Here we set
479
sys = E(
N, )
N
1
=E
g(Y T (k )) E(g(X T ))
N
k=1
= E(g(Y
T )) E(g(X T )).
(11.1.6)
(11.1.7)
Obviously, the absolute systematic error |sys | represents the weak error and
is a critical variable under the weak convergence criterion (11.1.2).
For a large number N , of independent simulated realizations of Y , we
can conclude from the Central Limit Theorem that the statistical error stat ,
becomes asymptotically Gaussian with mean zero and variance of the form
N, ) =
Var(stat ) = Var(
1
Var(g(Y
T )).
N
(11.1.8)
Note that we used in (11.1.8) the independence of the realizations for each
k . The expression (11.1.8) reveals a major disadvantage of the raw Monte
Carlo method, being that the variance of the statistical error stat decreases
only with N1 . Consequently, the deviation
Dev(stat ) =
1
Var(stat ) =
N
(
Var(g(Y
T ))
1
N
(11.1.9)
as N . This means
g(Y
T)
480
know the variance of the raw Monte Carlo estimates, we can form batches of
the independent outcomes of a simulation. Then the sample mean Xi , i
{1, 2, . . . , n}, of each batch can be interpreted by the Central Limit Theorem
as being approximately Gaussian with unknown variance. This then allows us
to form Student t condence intervals around the sample mean
1
Xi
n i=1
n
n =
involving the sample variance
1
(Xj
n )2 .
n 1 j=1
n
n2 =
The random variable Tn =
n
q
2
n
n
n
t1,n1 = 1 , where a = t1,n1
n and t1,n1 is the 100(1 )%
quantile of the Student t distribution with n 1 degrees of freedom. Without
going into further details, we will use this methodology for various simulation
experiments to check for corresponding condence intervals. However, in most
experiments that we perform it is possible to reach, with the total number N ,
of simulated outcomes, a magnitude such that the length of the condence
intervals becomes practically negligible in the graphs we plot.
We note from (11.1.9) that the length of a condence interval is not only
proportional to 1N but is also proportional to the square root of the variance
of the simulated functional. This indicates the possibility to overcome the
typically large variance of raw Monte Carlo estimates. These are the estimates
obtained without applying any change in the original simulation problem. One
can often reformulate the random variable that one wants to construct into
one that has the same mean but a much smaller variance. We describe later in
Chap. 16 how to construct unbiased estimates u
N, for the expectation u =
))
with
much
smaller
variances
than
the
raw Monte Carlo estimate
E(g(Y
T
uN, given in (11.1.3). These methods are called variance reduction techniques
and are described in detail later.
481
Wtjn+1
Wtjn
with Wnj =
and initial value Y 0 = X 0 . Here we again suppress
the dependence of the coecients on Y n and n .
482
where the
must be independent An+1 -measurable random variables
with moments satisfying the condition
3
j
j
j )2 K 2
+ E (W
W
E
E
W
+
(11.2.3)
n
n
n
nj
W
for some constant K and j {1, 2, . . . , m}. The simplest example of such
j to be used in (11.2.2) is a two-point disa simplied random variable W
n
tributed random variable with
j = = 1.
(11.2.4)
P W
n
2
Obviously, the above two-point distributed random variable satises the moment condition (11.2.3) required by the simplied Euler scheme (11.2.2). In
particular, in the case when the drift and diusion coecients are constant,
a random walk can be interpreted as being generated by a simplied Euler
scheme. This applies also to tree methods, as will be shown in Chap. 17.
When the drift and diusion coecients are only H
older continuous, then
one can show, see Mikulevicius & Platen (1991), that the Euler scheme still
converges weakly, but with some lower weak order < 1.0.
Let us check numerically that the Euler and the simplied Euler schemes
achieve an order of weak convergence = 1.0. We suppose that Xt
satises the following SDE
dXt = a Xt dt + b Xt dWt
(11.2.5)
483
Fig. 11.2.1. Log-log plot of the weak error for an SDE with multiplicative noise for
the Euler and simplied Euler schemes
In the log-log plot of Fig. 11.2.1 we show ln(()) against ln() for the
Euler and simplied Euler scheme. The slope of this graph yields an estimate
for the weak order of the scheme. We deserve that the weak errors are almost
identical for both schemes, and how the error
() = |E(XT ) E(YN )|
behaves if one decreases the step size . We emphasize that there is no signicant dierence in the weak errors produced by these two methods.
Weak Order 2.0 Taylor Scheme
We can derive more accurate weak Taylor schemes by including further multiple stochastic integrals from the Wagner-Platen expansion (4.4.7). Under the
weak convergence criterion the objective is now to include for a given weak
order of convergence the terms with information about the probability measure of the underlying diusion process into the expansion. As we will see,
this is dierent to the choice of the terms that are necessary to approximate
sample paths with some given strong order.
Let us consider the weak Taylor scheme of weak order = 2.0. This scheme
is obtained by adding all of the double stochastic integrals from the WagnerPlaten expansion (4.4.7) to the terms of the Euler scheme. This scheme was
rst proposed in Milstein (1978). In the autonomous, that is, the time independent, one-dimensional case d = m = 1 we have the weak order 2.0 Taylor
scheme
1
Yn+1 = Yn + a + b Wn + b b (Wn )2 + a b Zn
2
1
1 2 2
1 2
+ a a + a b + a b + b b {Wn Zn } . (11.2.6)
2
2
2
484
+ 1 b b
W
Yn+1 = Yn + a + b W
2
1
1
+ 1 a a + 1 a b2 2 , (11.2.7)
+ a b + a b + b b2 W
2
2
2
2
must satisfy the moment condition
where W
3
5
W
W
E W + E
+ E
4
2
W
W
+ E
+ E
32 K 3
(11.2.8)
= 3 = 1
P W
6
= 0 = 2.
and P W
3
(11.2.9)
In the multi-dimensional case d {1, 2, . . .} with m = 1, the kth component of the weak order 2.0 Taylor scheme with scalar noise is given by
k
= Ynk + ak + bk W +
Yn+1
1 1 k
L b (W )2
2
1 0 k 2
L a + L0 bk {W Z} + L1 ak Z, (11.2.10)
2
485
m
1 0 k 2 k,j
L a +
b W j + L0 bk,j I(0,j) + Lj ak I(j,0)
2
j=1
(11.2.11)
j1 ,j2 =1
Under the weak convergence criterion one can substitute the multiple It
o integrals I(0,j) and I(j,0) with simpler random variables. This is a crucial advantage
in Monte Carlo simulation compared to scenario simulation. In this way one
obtains from (11.2.11) the following simplied weak order 2.0 Taylor scheme
with kth component
k
= Ynk + ak +
Yn+1
m
1 0 k 2 k,j 1 0 k,j
j
L a +
b + L b + Lj ak W
2
2
j=1
1 j1 k,j2 j1 j2
W W + Vj1 ,j2 .
L b
2 j ,j =1
m
(11.2.12)
1
2
(11.2.13)
Vj1 ,j1 =
(11.2.14)
(11.2.15)
and
for j1 {1, 2, . . . , m} and j2 {j1 + 1, . . ., m}. Furthermore, one can choose
j in (11.2.12) to be independent three point disthe random variables W
tributed as described in (11.2.9).
A convergence theorem at the end of this section formulates sucient
conditions under which the above schemes converge with weak order = 2.0.
Weak Order 3.0 Taylor Scheme
Let us now consider weak Taylor schemes of weak order = 3.0. As shown in
Platen (1984) these need to include from the Wagner-Platen expansion (4.4.7)
all of the multiple It
o integrals of multiplicity three, to ensure that the scheme
converges with this weak order.
In the general multi-dimensional case d, m {1, 2, . . .} the kth component
of the weak order 3.0 Taylor scheme, proposed in Platen (1984), takes the form
486
k
Yn+1
= Ynk + ak +
m
bk,j W j +
j=1
m
j1 ,j2 =0
m
Lj ak I(j,0) +
j=0
m
m
m
m
j1 =0 j2 =1
j1 ,j2 =0 j3 =1
This scheme is of particular interest for various special cases of SDEs because
it leads to the construction of ecient third order schemes.
In the scalar case d = 1 with scalar noise m = 1 the following simplied
weak order 3.0 Taylor scheme can be obtained
2
+ 1 L1 b
Yn+1 = Yn + a + b W
2
1
Z
+ L1 a Z + L0 a 2 + L0 b W
2
1 0 0
2
L L b + L0 L1 a + L1 L0 a W
+
6
2
1 1 1
L L a + L 1 L0 b + L 0 L1 b
+
W
6
2
1
1
. (11.2.17)
W
3 W
+ L0 L0 a 3 + L1 L1 b
6
6
and Z are correlated Gaussian random variables with
Here W
1
N (0, ),
(11.2.18)
W
Z N 0, 3
3
and covariance
Z = 1 2 .
E W
(11.2.19)
2
For d {1, 2, . . .} and m = 1 we can substitute the multiple stochastic
integrals in the scheme (11.2.16) as follows:
I(1,1)
1 , I(1,0) Z,
I(0,1) W
Z
I(1) = W
1 2
1
,
W ,
I(1,1,0) I(1,0,1) I(0,1,1)
6
1 2
I(1,1,1) W W 3 ,
6
487
In Hofmann (1994) it has been shown for an SDE with additive noise and
m = {1, 2, . . .} that one can use approximate random variables of the form
I(0) = ,
I(j,0)
1
2
I(j) = j
I(0,0) =
3
1
2 ,
j + j
3
and
I(j1 ,j2 ,0)
2
6
2
,
2
I(0,0,0) =
3
,
6
5
I(j,0,0) I(0,j,0)
j1 j2 +
Vj1 ,j2
2
j
6
.
(11.2.20)
In this case one can use independent four point distributed random variables
1 , . . . , m with
(
P j = 3 + 6 =
12 + 4 6
and
P
(
j = 3 6 =
12 4 6
(11.2.21)
m
j=1
j+
bk,j W
1 0 k 2 1 0 0 k 3
L a + L L a
2
6
m
j Z j
+
Lj ak Z j + L0 bk,j W
j=1
1 0 0 k,j
j 2
L L b + L0 Lj ak + Lj L0 ak W
6
m
1 j1 j2 k
j1 j2
+
L L a W W I{j1 =j2 } .
6 j ,j =1
1
(11.2.22)
488
4
m
(11.2.23)
=1 j1 ,...,j =0
This weak order = 4.0 Taylor scheme appeared rst in Platen (1984).
In the case of special SDEs, for instance, those with additive noise, one
obtains highly accurate schemes when using (11.2.23). For the one-dimensional
case d = m = 1 with additive noise the following simplied weak order 4.0
Taylor scheme for additive noise can be derived
+
Yn+1 = Yn + a + b W
1 0
Z
L a 2 + L1 a Z + L0 b W
2
1 0 0
2
L L b + L0 L1 a W
3!
5 2
1 2
1 1
W
+ L L a 2 W Z
6
6
+
1
1 0 0
L L a 3 + L0 L0 L0 a 4
3!
4!
1
1 0 0
3
+
L L L a + L0 L1 L0 a + L0 L0 L1 a + L0 L0 L0 b W
4!
2
1 1 1 0
0 1 1
1 0 1
L L L a+L L L a+L L L a
+
W 2
4!
2
1
W
+ L1 L1 L1 a W
3 .
(11.2.24)
4!
+
489
Fig. 11.2.2. Log-log plot of the weak errors for an SDE with additive noise
Fig. 11.2.3. Log-log plot of the weak error for an SDE with multiplicative noise
we run a sucient number of simulations such that the systematic error sys
dominates the statistical error in (11.1.5) and the condence intervals in the
error plots become negligible.
For the SDE with additive noise
dXt = a Xt dt + b dWt
(11.2.25)
490
We now test the schemes for another SDE. Consider the case with multiplicative noise
(11.2.26)
dXt = a Xt dt + b Xt dWt
for t [0, T ]. Here we use the same parameter choices as above and g(X) = x.
In Fig. 11.2.3 we show the log-log plot for the corresponding results. It can be
seen that most schemes perform about as expected from theory.
Convergence Theorem
Let us now formulate a theorem that indicates the type of conditions that one
needs to satisfy to obtain a given order of weak convergence.
In what follows we shall use the notations from Chap.4 and Chap.5, where
we formulate the Wagner-Platen expansion and strong Taylor schemes. Recall
that for a given function f (t, x) = x we obtain the It
o coecient functions
f (t, x) = Lj1 Lj1 bj (x)
(11.2.27)
tn+1
s
s2
1
dWsj11 . . . dWsj1
dWsj ,
(11.2.28)
I,tn ,tn+1 =
tn
tn
tn
(11.2.29)
(11.2.30)
491
Let CP (d , ) denote the space of times continuously dierentiable functions g : d for which g and all of its partial derivatives of order up to
and including have polynomial growth. This set is slightly more general than
CP (d , ) that we introduced in Sect. 11.1. We state the following theorem
for an autonomous diusion process X with drift a(x) and diusion coecient b(x), which provides the order of weak convergence of the previously
presented schemes.
Theorem 11.2.1. For some {1, 2, . . .} and autonomous SDE for X let
Y be a weak Taylor scheme of order . Suppose that a and b are Lipschitz
2(+1) d
, for all k {1, 2, . . . , d}
continuous with components ak , bk,j CP
and j {0, 1, . . . , m}, and that the f satisfy a linear growth bound
|f (t, x)| K (1 + |x|) ,
(11.2.31)
492
1
1 +
+ 2 b W
+b
a +a +
b
2
4
2
1
1 +
+
(11.3.1)
b
W 2
b
4
Y n+1 = Y n +
and
= Yn + a b
2
4 j=1
m
r
j r
j
j 12
+
b
2
b
+
bj U
W
U
+
r=1
r=j
#
m
2
1
j bj R
j
j
+
(11.3.3)
bj R
W
+
4 j=1
*
m
1
j r
j r
j
r
+
b U+ b U
2
W W + Vr,j
r=1
r=j
m
j,
bj W
j = Y n + a bj
R
j=1
and
j = Y n bj
U
493
For additive noise the second order weak scheme (11.3.3) reduces to the
relatively simple algorithm
!
m
m
1
j
j
j.
+a +
Y n+1 = Y n +
b W
bj W
a Yn + a +
2
j=1
j=1
(11.3.4)
Explicit Weak Order 3.0 Schemes
In the autonomous case d {1, 2, . . .} with m = 1 one obtains for additive
noise the explicit weak order 3.0 scheme in vector form
1 +
1
3
+ a
Y n+1 = Y n + a + b W +
a + a a
a
2
2
4
)
+
1
1 +
+
a a
a
Z
a
4
2
.
1 -
+
+ ( + ) b a+
a Y n + a + a+
a + a
6
2
( + ) W + +
W
(11.3.5)
with
=
a
Y
+
a
a
n
and
=
a
Y
+
2
a
b
2
,
a
n
where is either or . Here one can use two correlated Gaussian random
N (0, ) and Z N (0, 1 3 ) with E(W
Z)
= 1 2 ,
variables W
3
2
together with two independent two-point distributed random variables and
with
1
P ( = 1) = P ( = 1) = .
2
In the autonomous case d {1, 2, . . .} with general scalar noise m = 1 we
have the following generalization of the scheme (11.3.5), which here is called
the explicit weak order 3.0 scheme for scalar noise
494
1
1
H a + H b Z
2
)
2
2
1
W
+
Gb
Ga Z +
2
2
+
W
+ F ++
+
(
+
)
6 a
+
Y n+1 = Y n + a + b W
1 ++
F b + F +
W
+ F +
+ F
b
b
b
24
2
1 ++
+
+
Fb Fb + Fb Fb
+
W
24
2
1 ++
+
+
Fb + Fb Fb Fb
+
W 3 W
24
2
1 ++
+
+
Fb + Fb Fb Fb
W (11.3.6)
+
24
+
with
1 +
3
,
+g
g
g
2
4
1 +
1
,
g
Gg = g + g
g
4
2
H g = g+ + g
F +
= g Y n + a + a+ + b b+ g +
g
g Y n + a b + g,
=
g
Y
+
a
+
a
b
g
n
g
g Y n + a b + g
where
and
g = g Y n + a b
= g Y n + 2 a 2 b ,
g
, Z,
and
with g being equal to either a or b. The random variables W
are as specied earlier for the scheme (11.3.5).
495
Vg,2
(11.4.1)
(T ) = 2E g YT E g YT2 ,
proposed in Talay & Tubaro (1990). It is a stochastic generalization of the
well-known Richardson extrapolation. This method works because the leading
error terms of both parts of the dierence on the right hand side of (11.4.1)
oset each other. What remains is a higher order error term.
As an example, consider again the geometric Brownian motion process X
satisfying the SDE
dXt = a Xt dt + b Xt dWt
with X0 = 0.1, a = 1.5 and b = 0.01 on the time interval [0, T ] where T = 1.
We use the Euler scheme (11.2.1) to simulate the Richardson extrapolation
496
Fig. 11.4.1. Log-log plot for absolute systematic error of Richardson extrapolation
Fig. 11.4.2. Log-log plot for absolute systematic error of weak order 4.0 extrapolation
12 E g YT
+ E g YT
Vg,4 (T ) =
32 E g YT
. (11.4.2)
21
Suitable weak order 2.0 approximations include the weak order 2.0 Taylor
(11.2.6), simplied weak order 2.0 Taylor (11.2.7) or the explicit weak order
2.0 (11.3.1) schemes. In Fig. 11.4.2 we show the log-log plot for the absolute
497
weak error of the weak order 4.0 extrapolation applied to the above test
example. Note the steep slope of the error.
Similarly, one can, in principle, extrapolate a weak order = 3.0 approximation Y , such as the weak order 3.0 Taylor scheme (11.2.16), the simplied
weak order 3.0 Taylor scheme (11.2.17) or the weak order 3.0 schemes (11.3.5)
and (11.3.6). This then allows us to obtain a sixth weak order approximation.
The weak order 6.0 extrapolation is dened by the method
1
Vg,6
(T ) =
4032 E g YT 1512 E g YT2
2905
+ 448 E g YT3 63 E g YT4
.
(11.4.3)
The practical use of extrapolations of such a high order, depends strongly on
the numerical stability of the underlying weak schemes. These weak methods
need to yield numerically reliable results with the appropriate leading error
coecients for a wide range of step sizes. We emphasize that the idea of
extrapolation is generally applicable for many weak approximation methods,
for instance, for trees and also in the case of jumps.
m
j,
bj (n , Y n ) W
(11.5.1)
j=1
(11.5.2)
498
m
j
bj (n , Y n ) W
j=1
(11.5.3)
j , j {1, 2, . . . , m}, as in (11.5.1). The parameter is the degree of
with W
drift implicitness. With = 0 the scheme (11.5.3) reduces to the simplied
Euler scheme (11.2.2), whereas with = 0.5 it represents a stochastic generalization of the trapezoidal method. Under sucient regularity one can show
that the schemes (11.5.3) converge with weak order = 1.0.
Fully Implicit Euler Scheme
For strong discrete-time approximations we noticed in Sect. 7.3 that making
the diusion terms implicit is not easily achieved because some Gaussian
random variables usually appear in the denominator of the resulting scheme.
However, the possible use of discrete random variables in weak simplied
schemes allows us to construct fully implicit weak schemes, that is, algorithms
where the approximation of the diusion term also becomes implicit. This
was not the case for strong schemes, where we nally suggested the use of a
balanced implicit method (7.3.9)(7.3.10) to achieve better numerical stability.
In general, implicit schemes require an algebraic equation to be solved at
each time step. This imposes an additional computational burden. For certain
SDEs the resulting implicit schemes can be directly solved, as is the case for
the Black-Scholes dynamics.
In the one-dimensional autonomous case d = m = 1 the fully implicit weak
Euler scheme has the form
,
(Yn+1 ) + b (Yn+1 ) W
Yn+1 = Yn + a
(11.5.4)
is as in (11.5.2) and a
where W
is some adjusted drift coecient dened by
a
= a b b .
(11.5.5)
m
j,
bj (n+1 , Y n+1 ) + (1 ) bj (n , Y n ) W
(11.5.6)
j=1
= a
a
bk,j1
j1 ,j2 =1 k=1
bj2
xk
499
(11.5.7)
Yn+1 = Yn + a (Yn+1 ) + b W
1
a (Yn+1 ) a (Yn+1 ) +
2
1 2
W +
+ bb
2
1 2
b (Yn+1 ) a (Yn+1 ) 2
2
1
1 2
(11.5.8)
a b + a b + b b W.
2
2
(11.5.9)
m
j1 W
j2 + Vj ,j
Lj1 bj2 W
1 2
j1 ,j2 =1
m
1 0 j
j
L b + (1 2) Lj a W
bj +
2
j=1
1
(1 2) L0 a + (1 ) L0 a (n+1 , Y n+1 ) 2 , (11.5.10)
2
500
Y n+1 = Y n +
1
2
m
m
1
j+1
j
{a (n+1 , Y n+1 ) + a} +
bj W
L0 bj W
2
2
j=1
j=1
j1 W
j2 + Vj ,j .
Lj1 bj2 W
1 2
m
(11.5.11)
j1 ,j2 =1
Note that the last two terms in (11.5.11) vanish for additive noise.
Implicit Weak Order 2.0 Scheme
One can avoid derivatives in the above implicit schemes. In the autonomous
case d {1, 2, . . .} with scalar noise, m = 1, in Platen (1995) the following
implicit weak order 2.0 scheme was derived, where
1
1 +
+ 2 b W
+b
(a + a (Y n+1 )) +
b
2
4
2
1
1 +
b
W 2
+
b
(11.5.12)
4
Y n+1 = Y n +
2
4 j=1
*
m
j r
j r
j
12
j
W
+
b U+ + b U 2 b
r=1
r=j
1
+
4 j=1
m
#
2
j bj R
j
j
W
bj R
+
(11.5.13)
*
m
1
j r
j r
j
r
+
b U+ b U
2
W W + Vr,j
r=1
r=j
and
j = Y n bj
U
501
1
a Yn+1 + a + b W
2
(11.5.14)
and predictor
.
Yn+1 = Yn + a + b W
(11.5.15)
Here the random variable W can be chosen to be Gaussian N (0, ) distributed or as two-point distributed random variable with
= = 1.
P W
(11.5.16)
2
In the general multi-dimensional case for d, m {1, 2, . . .} one can form the
following family of weak order 1.0 predictor-corrector methods with corrector
n+1 + (1 ) a
(n , Y n )
n+1 , Y
Y n+1 = Y n + a
+
m
n+1 + (1 ) bj (n , Y n ) W
j (11.5.17)
bj n+1 , Y
j=1
m
d
bj2
,
xk
(11.5.18)
j,
bj W
(11.5.19)
bk,j1
j1 ,j2 =1 k=1
m
j=1
502
W +
n = b W + b b
a b + b b W
2
2
2
and as predictor
1
+ 1
Yn+1 = Yn + a + n + a b W
2
2
a a +
1 2
a b 2 .
2
(11.5.21)
1
n+1 + a + n ,
a n+1 , Y
2
(11.5.23)
where
n =
m
m
1
j1 W
j+1
j2 + Vj ,j ,
Lj1 bj2 W
bj + L0 bj W
1 2
2
2 j ,j =1
j=1
1
and predictor
n+1 = Y n + a + n + 1 L0 a 2 + 1
j .
Y
Lj a W
2
2 j=1
m
(11.5.24)
503
In the autonomous case d {1, 2, . . .} with scalar noise m = 1 a derivativefree weak order 2.0 predictor-corrector method has corrector
Y n+1 = Y n +
where
n =
1
a Y n+1 + a + n ,
2
(11.5.25)
2
1
1 +
+ ) b(
) + 2 b W
)
+ 1 b(
b( ) + b(
W
2
4
4
and predictor
+a +
n+1 = Y n + 1 a
Y
n
2
with the supporting value
(11.5.26)
= Y n + a + b W
.
1
a Y n+1 + a + n ,
2
(11.5.27)
where
n =
m
m
j r
1 j j
j )+ 2 bj +
)+bj (U
r ) 2bj 12 W
j
b
(
U
b (R+ )+ bj (R
4 j=1
r=1
r=j
m
2
1 j j
j )
b (R+ ) bj (R
4 j=1
m
j r
1
j r
j
r
+
b U+ b U
2 .
W W + Vr,j
r=1
r=j
j = Y n + a bj
R
j = Y n bj
and U
(11.5.28)
504
m
j.
bj W
j=1
11.6 Exercises
with
11.1. Verify that the two-point distributed random variable W
= ) = 1
P (W
2
satises the moment conditions
3
2
W
W
K 2 .
E W + E
+ E
with
11.2. Prove that the three-point distributed random variable W
= 3 = 1
P W
6
and
=0 = 2
P W
3
+
E
W
+
E
W
+
E
4
2
W
W
+ E
+ E
3 2 K 3 .
11.6 Exercises
505
12
Regular Weak Taylor Approximations
In this chapter we present for the case with jumps regular weak approximations obtained directly from a truncated Wagner-Platen expansion. The
desired weak order of convergence determines which terms of the stochastic expansion have to be included in the approximation. These weak Taylor
schemes are dierent from the regular strong Taylor schemes presented in
Chap. 6. We will see that the construction of weak schemes requires a separate analysis. A convergence theorem, which allows us to construct weak
Taylor approximations of any given weak order, will be presented at the end
of this chapter.
for t [0, T ], with X 0 d . In the following, we will consider both the case of
a scalar Wiener process, m = 1, and that of an m-dimensional Wiener process,
m {1, 2, . . .}. Note that in the case of an autonomous multi-dimensional
SDE, we can always cover the case of an SDE with time-dependent coecients
by considering the time t as rst component of the vector process X. Moreover,
we remark that we will sometimes treat separately, the simpler case of a markindependent jump coecient c(t, x, v) = c(t, x).
E. Platen, N. Bruti-Liberati, Numerical Solution of Stochastic Dierential
Equations with Jumps in Finance, Stochastic Modelling
and Applied Probability 64, DOI 10.1007/978-3-642-13694-8 12,
Springer-Verlag Berlin Heidelberg 2010
507
508
m
j=1
p (tn+1 )
bk,j Wnj +
ck (i ),
for k {1, 2, . . . , d}. Here we have used the abbreviations dened in (6.1.12)
(6.1.14), with i denoting an independent random variable that is distributed
according to the measure (dv)
(E) . This scheme is identical to the Euler scheme
presented in Chap. 6.
Order 2.0 Taylor Scheme
We have seen that the Euler scheme is the simplest Taylor scheme, from the
point of view of both strong and weak approximations. Note that one can
simplify the Euler scheme by using discrete random variables, as discussed in
Sect.11.2. When a higher accuracy is required, and thus, a scheme with higher
order of convergence is sought, then it is important to distinguish between
strong and weak schemes. Indeed, by adding to the Euler scheme the four
multiple stochastic integrals appearing in the strong order 1.0 Taylor scheme
presented in Sect. 6.2, we can improve the order of strong convergence from
= 0.5 to = 1.0. However, it can be shown that the strong order 1.0 Taylor
scheme has generally only the same order of weak convergence = 1.0 as the
Euler scheme. This indicates that the construction of ecient higher order
weak schemes requires rules that are dierent from those applicable for strong
schemes, as became evident in Sects. 11.1 and 11.2. At the end of this chapter
we present a convergence theorem for weak Taylor approximations. It will
provide a rule for selecting from the Wagner-Platen expansion those multiple
stochastic integrals needed to achieve a given order of weak convergence =
{1.0, 2.0, . . .}. The rule is rather simple, one needs to include all the terms
with multiple stochastic integrals of up to multiplicity when constructing a
weak order scheme. In this way, we obtain the weak order 2.0 Taylor scheme
509
tn+1
Yn+1 = Yn + an + bWn +
c(v) p (dv, dz)
E
tn
+ b b
tn+1
dWz1 dWz2
tn
z2
tn+1
tn
z2
+
E
tn
tn+1
z2
+
tn
tn
b tn , Yn + c(v) b p (dv, dz1 )dWz2
E
tn
tn+1
tn
z2
c tn , Yn + c(v1 ), v2 c(v2 ) p (dv1 , dz1 )p (dv2 , dz2 )
E
tn
tn+1
z2
a
a 2 tn+1 z2
+ aa +
b
dz1 dz2 + a b
dWz1 dz2
+
t
2
tn
tn
tn
tn
+
b
b
+ a b + b 2
t
2
tn+1
z2
+
tn
tn+1
tn
z2
+
tn
tn
tn+1
z2
dz1 dWz2
tn
tn
c(v)
c (v) 2
+ a c (v) +
b dz1 p (dv, dz2 )
t
2
a tn , Yn + c(v) a p (dv, dz1 ) dz2 .
(12.1.3)
510
p (tn+1 )
Yn+1 = Yn + an + bWn +
c(i ) +
p (tn+1 )
+b
b b
2
(Wn ) n
2
c (i ) Wi Wtn
p (tn+1 )
b Yn + c(i ) b Wtn+1 Wi
c Yn + c(i ), j c j
p (j )
p (tn+1 )
1 a
a 2
+ a a +
b (n )2 + a b Zn
2 t
2
b
b 2
+ a b + b (Wn n Zn )
+
t
2
+
p (tn+1 )
p (tn+1 )
c(i )
c (i ) 2
+ a c (i ) +
b (i tn )
t
2
a Yn + c(i ) a (tn+1 i ).
(12.1.4)
This scheme is readily applicable for weak approximation and, thus, for Monte
Carlo simulation. The correlated Gaussian random variables Wn = Wtn+1
Wtn and
tn+1
s2
dWs1 ds2
(12.1.5)
Zn =
tn
tn
can be generated as in (5.3.37). The order 2.0 Taylor scheme for the SDE
(1.8.5) of the Merton model is given by
p (tn+1 )
Yn+1 = Yn + Yn n + Yn Wn + Yn
(i 1)
2
Yn (Wn )2 n + Yn Wn
2
p (tn+1 )
(i 1)
p (tn+1 )
+ Yn
511
p (j )
(i 1)(j 1)
2
Yn (n )2 + Yn Wn n + Yn n
2
p (tn+1 )
(i 1).
(12.1.6)
tn+1
s2
2
ds1 ds2 = n
I(0,0) =
2
tn
tn
tn+1
s2
I(1,0) =
dWs1 ds2 = Zn
tn
tn+1
tn
s2
I(0,1) =
tn
tn+1
tn
I(0,1) =
tn+1
tn
s2
I(1,0) =
tn
p (tn+1 )
s2
i pn tn
(12.1.8)
tn
Even in the case of mark-independent jump sizes, the computational complexity of the weak order 2.0 Taylor scheme depends on the intensity of
the Poisson jump measure. Indeed, as already discussed in Sect. (6.2), for
the generation of double stochastic integrals involving the Poisson measure
the random variables Wn and pn are not sucient. In this case one
needs also the location of jump times in every time interval (tn , tn+1 ], with
n {0, . . . , nT 1}.
512
tn+1
k
Yn+1
= Ynk + ak n + bk Wn +
ck (v) p (dv, dz)
tn
tn
tn+1
bk
dWz1 dWz2
xl
z2
bl
E
tn+1
tn
tn
tn+1
z2
+
E
tn
tn
ck (v)
dWz1 p (dv, dz2 )
xl
z2
bk tn , Yn + c(v) bk p (dv, dz1 )dWz2
tn
l=1
z2
bl
tn
l=1
tn+1
tn
ck tn , Yn + c(v1 ), v2 ck v2
E
d
d
k
2 ak bl bi tn+1 z2
l a
+
a
+
dz1 dz2
xl
xl xi 2
tn
tn
l=1
d
l=1
i,l=1
l a
tn+1
z2
dWz1 dz2
xl
tn
tn
d
d
k
2 bk bl bi tn+1 z2
l b
+
a
+
dz1 dWz2
xl
xl xi 2
tn
tn
l=1
tn+1
i,l=1
z2
+
E
tn
tn+1
tn
z2
+
tn
tn
d
l=1
al
d
ck (v)
2 ck (v) bl bi
+
dz1 p (dv, dz2 )
l
x
xl xi 2
i,l=1
ak tn , Yn + c(v) ak p (dv, dz1 )dz2 ,
(12.1.9)
for k {1, 2, . . . , d}, where ak , bk , and ck are the kth components of the drift,
diusion and jump coecients, respectively. Note that the multiple stochastic
integrals can be generated similarly as in the one-dimensional case.
In the general multi-dimensional case, the kth component of the weak order
2.0 Taylor scheme is given by
m
tn
j1 =1
tn+1
tn
tn+1
tn+1
z2
tn
m
j k
z2
z2
L a
j=1
tn+1
tn
m
tn+1
z2
tn
tn+1
tn
tn+1
z2
L b
tn
tn
dz1 dWzj2
z2
k
L1
v a p (dv, dz1 )dz2 ,
+
tn
0 k,j
j=1
dz1 dz2 +
tn
tn
dWzj1 dz2
z2
+ L0 a k
tn
k,j1
L1
p (dv, dz2 ) dWzj21
v b
k
L1
v1 c (v2 )p (dv1 , dz1 )p (dv2 , dz2 )
tn+1
dWzj11 dWzj22
tn
tn
tn
tn
z2
z2
tn
j1 =1
Lj1 bk,j2
j1 ,j2 =1
tn+1
tn
m
bk,j Wnj +
j=1
tn+1
513
(12.1.10)
for k {1, 2, . . . , d}, where the operators Lj , with j {1, . . . , m}, are dened
in (4.3.4)(4.3.6).
In the multi-dimensional case with mark-independent jump size, the kth
component of the weak order 2.0 Taylor scheme simplies to the algorithm
k
= Ynk + ak n +
Yn+1
m
bk,j Wnj + ck pn
j=1
m
j1 ,j2 =1
m
j1 =1
m
bk,j1 tn , Y n + c bk,j1 I(1,j1 ) + ck tn , Y n + c ck I(1,1)
j1 =1
m
j=1
Lj ak I(j,0) +
m
j=1
+ L0 ck I(0,1) + ak tn , Y n + c ak I(1,0) ,
(12.1.11)
514
for k {1, 2, . . . , d}. All multiple stochastic integrals that do not involve
Wiener processes can be generated as in the one-dimensional case (12.1.7).
For those involving one Wiener process, we can use the relations
p (tn+1 )
I(j,1) =
Wji pn Wtjn
I(j,0) = Znj
(12.1.12)
(12.2.1)
515
(12.2.2)
c(t, x)
= b t, x + c(t, x) b(t, x)
x
(12.2.3)
(12.2.5)
bl,j (t, x)
ck (t, x)
= bk,j t, x + c(t, x) bk,j (t, x)
xl
(12.2.6)
516
(12.2.7)
(12.2.8)
1
2
m
d
bi,j1
j1 ,j2 =1 i=1
m
bk,j2
Wnj1 Wnj2 n
i
x
{bk,j1 (tn , Y n + c) bk,j1 } pn Wnj1
j1 =1
m
d
ak
1 k
{c (tn , Y n + c) ck } (pn )2 pn +
bl,j l Znj
2
x
j=1
l=1
m
d
d
m
k,j
k,j
2 k,j i,j1 l,j1
b
b
b
b
b
Wnj n Znj
+
+
al
+
l
i
l
t
x
x x
2
j=1
j =1
l=1
ak
t
d
l=1
i,l=1
al
d
m
ak
2 ak bi,j1 bl,j1 (n )2
+
l
x
xi xl
2
2
j =1
+ a (tn , Y n + c) a
k
i,l=1
k
pn n ,
(12.2.10)
for k {1, 2, . . . , d}. The second weak order of convergence of the above
scheme will be shown below.
517
z,y
z,y
z,y
(X u )du +
b(X u ) dW u +
c(X z,y
p (dv, du)
Xt = y +
a
u , v);
z
(12.3.1)
for z t T , which starts at time z [0, T ] in y d . This process has the
same solution as the Ito process X = {X t , t [0, T ]}, which solves the SDE
(1.8.2).
2(+1)
For a given {1, 2, . . .} and function g CP
(d , ) dene the
functional
(12.3.2)
u(z, y) = E(g(X z,y
T ))
for (z, y) [0, T ] d . Then we have
0
u(0, X 0 ) = E(g(X 0,X
)) = E(g(X T )).
T
(12.3.3)
We will need the following lemma, see Mikulevicius & Platen (1988) and
Mikulevicius (1983), involving the operator
(12.3.5)
(12.3.6)
u(z, ) CP
for each z [0, T ].
(d , )
(12.3.7)
518
t
u(t, X s,y
)
=
u(s,
y)
+
L0 u(z, X s,y
z )dz
t
s
m
t
j=1
j
Lj u(z, X s,y
z )dWz +
s,y
L1
v u(z, X z )p (dv, dz),
Proof:
u(t
,
y)
E u(tn , X tn1
n1
tn1 = 0.
n
(12.3.8)
By It
os formula we obtain
,y
u(tn , X tn1
) = u(tn1 , y) +
n
+
m
tn
j=1
tn
tn1
tn1
tn
+
tn1
tn1 ,y
L1
)p (dv, dz).
v u(z, X z
519
(12.3.9)
which will give us the rule for the construction of regular weak Taylor approximations of weak order .
Given a regular time discretization (t) , with maximum step size
(0, 1), we dene the weak order Taylor scheme by the vector equation
.
.
I f (tn , Y
=
I f (tn , Y
,
Y
n+1 = Y n +
n)
n)
\{v}
tn ,tn+1
tn ,tn+1
(12.3.10)
for n {0, 1, . . . , nT 1} with f (t, x) = x, as dened in (4.3.8). Similarly, we
dene the weak order compensated Taylor scheme by the vector equation
.
.
I f (tn , Yn )
I f (tn , Yn )
Y
=
,
n+1 = Y n +
\{v}
tn ,tn+1
tn ,tn+1
(12.3.11)
for n {0, 1, . . . , nT 1} with f (t, x) = x in (4.3.9).
For simplicity, for the next and subsequent theorems we will assume an
autonomous multi-dimensional jump diusion SDE. This formulation is not
restrictive, since we can always rewrite an SDE with time-dependent coecients as being autonomous, by modeling the time t as the rst component of
the process X. However, the resulting conditions on the time component can
be relaxed in a direct proof for the non-autonomous case.
We now present the following convergence theorem which states that, under suitable conditions, for any given {1, 2, . . .} the corresponding weak
order compensated Taylor scheme (12.3.11) achieves the weak order of convergence .
Theorem 12.3.4. For given {1, 2, . . . }, let Y = {Y
n , n {0, 1, . . . ,
nT }} be the weak order compensated Taylor approximation dened in
(12.3.11) corresponding to a regular time discretization with maximum time
step size (0, 1).
We assume that E(|X 0 |q ) < , for q {1, 2, . . .}, and for any g
2(+1)
CP
(d , ) there exists a positive constant C, independent of , such that
|E(g(X 0 )) E(g(Y
0 ))| C .
(12.3.12)
Moreover, suppose that the drift, diusion and jump coecients are Lip2(+1)
schitz continuous with components ak , bk,j , ck CP
(d , ) for all k
{1, 2, . . . , d} and j {1, 2, . . . , m} and that the coecients f , with f (t, x) =
x, satisfy the linear growth condition
|f (t, x)| K(1 + |x|),
(12.3.13)
520
|E(g(X T )) E(g(Y
nT ))| C .
(12.3.14)
Remark 12.3.5. Note that the linear growth condition (12.3.13) on the coecient functions f is satised if, for instance, the drift, diusion and jump
coecients are uniformly bounded.
Remark 12.3.6. By replacing the conditions on the compensated It
o coecient functions f with equivalent conditions on the It
o coecient functions
f , one can show for given {1, 2, . . .} that the weak order Taylor scheme
(12.3.10) also attains a weak order of convergence .
Theorem 12.3.4 is an extension of the weak convergence theorem for diffusion SDEs presented in Kloeden & Platen (1999). The following proof of
Theorem 12.3.4 has similarities to the one given in Liu & Li (2000).
Proof: For ease of notation, so no misunderstanding is possible, we write Y
for Y . By (12.3.3) and the terminal condition of the Kolmogorov backward
equation (12.3.6) we can write
H = E g(Y nT ) E g(X T )
= E u(T, Y nT ) u(0, X 0 ) .
(12.3.15)
Moreover, by (12.3.12), (12.3.8), (12.3.7) and the deterministic Taylor expansion we obtain
nT
H E
u(tn , Y n ) u(tn1 , Y n1 ) + K
n=1
nT
t
,Y
n1
u(tn , Y n ) u(tn , X tn1
= E
) + K
n
n=1
H1 + H2 + K ,
(12.3.16)
where
d
!
nT
tn1 ,Yn1
i;tn1 ,Yn1
i
H1 = E
u(tn , X tn
) (Yn Xtn
)
y i
n=1
i=1
(12.3.17)
and
521
nT
d
2
1
tn1 ,Yn1
tn1 ,Yn1
H2 = E
u
t
,
X
+
(Y
X
)
n
n
n
tn
tn
2 y i y j
n=1
i,j=1
(Yni
i;t
,Y
Xtn n1 n1 )(Ynj
.
j;t
,Y
Xtn n1 n1 )
(12.3.18)
(12.3.19)
(12.3.20)
B( )
i
tn1 ,Yn1
t
,Y
n1
n1
u(tn , X tn
) I f (X
) tn1 ,tn |Atn1
E
y i
nT
d
KE
(tn tn1 )+1
n=1 i=1 {:l()=+1}
K .
(12.3.22)
522
nT d
1 2
tn1 ,Yn1
tn1 ,Yn1
H2 E
u tn , X tn
+ n (Y n X tn
)
4 y i y j
n=1
i,j=1
.
i;t
,Y
j;t
,Y
(Yni Xtn n1 n1 )2 + (Ynj Xtn n1 n1 )2
!
nT d
1 2
t
,Yn1
t
,Yn1
= E
u tn , X tn1
+ n (Y n X tn1
)
n
n
i
j
2 y y
n=1
i,j=1
(Yni
KE
nT
n=1
i;t
,Y
Xtn n1 n1 )2
d
!
i,j=1 {:l()=+1}
E
2
t
,Yn1
t
,Yn1
u tn , X tn1
+ n (Y n X tn1
)
n
n
y i y j
!
i
2
t
,Y
I f (X n1 n1 ) tn1 ,tn |Atn1
K .
(12.3.23)
12.4 Exercises
12.1. Consider the Merton type jump diusion model, see (1.7.36), with SDE
dSt = St (a dt + dWt + c dNt )
for t 0 and S0 > 0, where W = {Wt , t 0} is a standard Wiener process
and N = {Nt , t 0} a Poisson process with intensity (0, ). For an
equidistant time discretization provide the weak order 1.0 Taylor scheme.
12.2. For the Merton SDE in Exercise 12.1 provide the weak order 2.0 Taylor
scheme.
13
Jump-Adapted Weak Approximations
(13.1.1)
is
Atn measurable,
(13.1.3)
523
524
Since jump-adapted weak schemes are very convenient and can be applied
often, we present them in what follows with some detail. Similarities with
parts of Chap. 11, where we treated the case without jumps, naturally arise
and are deliberate.
Weak Order 1.0 Taylor Scheme
The simplest jump-adapted weak Taylor scheme is the jump-adapted Euler
scheme, presented in Chap. 8. We recall that in the general multi-dimensional
case the kth component is given by
Ytkn+1
Ytkn
+ a tn +
m
bk,j Wtjn
(13.1.4)
j=1
and
Ytkn+1
Ytkn+1
+
E
(13.1.5)
while, if tn+1 is not a jump time one has Y tn+1 = Y tn+1 , as E p (dv, {tn+1 })
= 0. Therefore, the weak order of convergence of the jump-adapted Euler
scheme is = 1.0, and thus, equals the weak order of the Euler scheme used
in (13.1.4) for the diusive component.
As was discussed in Sect. 11.2 and will be exploited below, for weak convergence it is possible to replace the Gaussian distributed random variables
by simpler multi-point distributed random variables that satisfy certain moment conditions. For instance, in the Euler scheme we can replace the random
tj that satisfy the moment
variables Wtjn by simpler random variables W
n
condition (11.2.3). In this case the order of weak convergence of the resulting
simplied Euler scheme is still = 1.0. It is popular to replace the random variables Wtjn by the simpler two-point distributed random variables
j , with
W
2,tn
j = t = 1 ,
(13.1.6)
P W
n
2,tn
2
for j {1, 2, . . . , m}. This yields the jump-adapted simplied Euler scheme,
whose kth component is given by
Ytkn+1 = Ytkn + ak tn +
m
j
bk,j W
2,tn
525
(13.1.7)
j=1
Ytn+1 = Ytn+1 +
(13.1.9)
and achieves weak order = 2.0. Here Ztn represents the double It
o integral
dened in (8.2.23). The jump-adapted weak order 2.0 Taylor scheme was rst
presented in Mikulevicius & Platen (1988).
If we compare the regular weak order 2.0 Taylor scheme (12.1.3) with the
jump-adapted weak order 2.0 Taylor scheme (13.1.8)(13.1.9), then we notice
that the latter is much simpler as it avoids multiple stochastic integrals with
respect to the Poisson measure.
Also in this case, since we are constructing weak schemes, we have some
freedom in the choice of the simplifying random variables. A jump-adapted
simplied weak order 2.0 scheme is given by
t )2 t
t + bb (W
(13.1.10)
Ytn+1 = Ytn + atn + bW
n
n
n
2
1 a
a 2
1 b
b
t t
+
+ aa +
b 2tn +
+ a b + ab + b2 W
n
n
2 t
2
2 t
2
t
and (13.1.9). To obtain weak order 2.0 convergence, the random variable W
n
should be Atn+1 -measurable and satisfy the moment condition (11.2.8). Note
t , n {0, . . . , nT
that if we choose independent random variables W
n
1}, then we automatically obtain the required Atn+1 -measurability. Recall
that the moment condition (11.2.8) is satised, for instance, by a three-point
3,t , where
distributed random variable W
n
526
1
3,t = 0) = 2 .
,
P (W
(13.1.11)
n
6
3
Note that the jump-adapted simplied weak order 2.0 scheme (13.1.10)
requires only one random variable to be generated at each time step. Therefore, it is computationally more ecient than the jump-adapted weak order
2.0 Taylor scheme (13.1.8).
In the general multi-dimensional case, the kth component of the jumpadapted weak order 2.0 Taylor scheme is of the form
3,t = 3t ) =
P (W
n
n
Ytkn+1 = Ytkn + ak tn +
+
m
L0 a k 2
tn
2
(13.1.12)
m
bk,j Wtjn + L0 bk,j I(0,j) + Lj ak I(j,0) +
Lj1 bk,j2 I(j1 ,j2 )
j=1
j1 ,j2 =1
and (13.1.5), where the operators L and L , with j {1, 2, . . . , m}, are dened in (4.3.4) and (4.3.5). The multiple stochastic integrals I(0,j) and I(j,0) ,
for j {1, 2, . . . , m}, can be generated as in (12.1.12). However, the generation of the multiple stochastic integrals involving two components of the
Wiener process, I(j1 ,j2 ) , with j1 , j2 {1, 2, . . . , m}, is not straightforward. In
the special case of the diusion commutativity condition (6.3.14), one can express these multiple stochastic integrals in terms of the Gaussian increments
of the Wiener processes Wtjn . In general, since we are interested in a weak
approximation, we can replace the multiple stochastic integrals by the corresponding simple multi-point distributed random variables satisfying appropriate moment conditions. In this way, we obtain the jump-adapted simplied
weak order 2.0 scheme with kth component of the form
0
Ytkn+1 = Ytkn + ak tn +
m
k,j tn 0 k,j
L0 a k 2
tj
tn +
(L b + Lj ak ) W
b +
n
2
2
j=1
1 j1 k,j2 j1 j2
Wtn Wtn + Vtjn1 ,j2
L b
2 j ,j =1
m
(13.1.13)
1
,
2
(13.1.14)
Vtjn1 ,j1 = tn
(13.1.15)
(13.1.16)
and
for j2 {j1 + 1, . . . , m} and j1 {1, 2, . . . , m}.
527
m
L0 a k 2
tn
2
(13.1.17)
m
bk,j Wtjn + L0 bk,j I(0,j) + Lj ak I(j,0) +
Lj1 bk,j2 I(j1 ,j2 )
j=1
m
j1 ,j2 =0
j1 ,j2 =1
m
m
j1 ,j2 =0 j3 =1
528
4,t that
In Hofmann (1994) a four-point distributed random variable W
n
satises condition (13.1.19) was proposed, where
(
,
P W4,tn = 3 + 6 tn =
12 + 4 6
(
,
P W4,tn = 3 6 tn =
(13.1.20)
12 4 6
see (11.2.21). However, since the probabilities in (13.1.20) are not rational
numbers, the corresponding four-point distributed random variable cannot be
eciently implemented by random bit generation, see Bruti-Liberati & Platen
(2004) and Bruti-Liberati, Martini, Piccardi & Platen (2008). Instead, we
5,t , with
present an alternative ve-point distributed random variable W
n
1
5,t = 6t =
5,t = t = 9 ,
P W
,
P W
n
n
n
n
30
30
1
(13.1.21)
P W5,tn = 0 = ,
3
which still satises condition (13.1.19) and is a highly ecient implementation based on random bit generators, see Bruti-Liberati et al. (2008). This
ve-point distributed random variable has also been suggested in Milstein &
Tretjakov (2004). The corresponding jump-adapted simplied weak order 3.0
scheme is then given by
5,t + 1 L1 b (W
5,t )2 t + 1 L0 a2t
Yn+1 = Yn + a + bW
n
n
n
n
2
2
1
2,t t
5,t + 1 W
+ L1 a W
n
n
n
2
3
1 0
1
+ L b W5,tn W2,tn tn
2
3
1 0 0
5,t 2
L L b + L0 L1 a + L1 L0 a W
+
tn
n
6
1 1 1
5,t )2 t
L L a + L1 L0 b + L0 L1 b (W
+
tn
n
n
6
1
1
5,t , (13.1.22)
5,t )2 3 t
W
+ L0 L0 a3tn + L1 L1 b (W
n
n
n
6
6
and relation (13.1.5). It involves the ve-point distributed random variables
2,t , see (13.1.6).
5,t and the two-point distributed random variables W
W
n
n
This scheme achieves an order of weak convergence = 3.0.
We remark that in Kloeden & Platen (1999) there exist multi-dimensional
versions of weak order = 3.0 schemes for the case with additive noise using
Gaussian random variables. Furthermore, there is an alternative weak order
3.0 scheme with multi-point distributed random variables given in Hofmann
(1994).
529
tn
tn (tn ) 2
n
4
Ytn+1 = Ytn +
and
Ytn+1 = Ytn+1 +
(13.2.1)
Yt
= Ytn + atn b
n
tn .
This scheme attains weak order = 2.0 and generalizes a scheme for pure
diusions presented in Platen (1984), see also Sect. 11.3.
If we replace the Gaussian random variables Wtn by the three-point
3,t dened in (13.1.11), then we obtain
distributed random variables W
n
the jump-adapted simplied weak order 2.0 derivative-free scheme that still
achieves weak order = 2.0.
In the autonomous, multi-dimensional case, the kth component of a jumpadapted weak order 2.0 derivative-free scheme is given by
1 k
a (Y tn ) + ak tn
2
m
1 k,j +,j
,j ) + 2bk,j W
tj
+
b (Rtn ) + bk,j (R
tn
n
4 j=1
Ytkn+1 = Ytkn +
m
k,j +,r
1
k,j ,r
tj t 2
b (U
+
(U tn ) + 2bk,j W
tn ) + b
n
n
r=1
r=j
530
m
2
1 k,j +,j
,j )
tj
+
tn
b (Rtn ) bk,j (R
W
tn
n
4 j=1
(13.2.2)
m
k,j +,r
k,j ,r
r + Vtr,j
tj W
+
b (U
(U tn ) W
tn
tn ) b
n
n
r=1
r=j
and
Ytkn+1 = Ytkn+1 +
(13.2.3)
m
tj ,
bj W
n
j=1
= Y t + a t bj
R
tn
n
n
and
= Y t bj
U
tn
n
tn ,
tn .
531
made explicit by using a predictor equation to generate the next value. The
predictor component of the scheme can be derived by the explicit WagnerPlaten expansion for pure diusions.
Order 1.0 Predictor-Corrector Scheme
In the one-dimensional case, d = m = 1, the jump-adapted predictor-corrector
Euler scheme is given by the corrector
1
(13.3.1)
Ytn+1 = Ytn +
a(Ytn+1 ) + a tn + bWtn ,
2
the predictor
Ytn+1 = Ytn + atn + bWtn ,
(13.3.2)
Ytn+1 = Ytn+1 +
(13.3.3)
m
j=1
t ) + (1 )bk,j W
tj ,
bk,j (tn+1 , Y
n+1
n
(13.3.4)
532
m
d
bk,j1
j1 ,j2 =1 i=1
bk,j2
.
xi
(13.3.5)
tj ,
bk,j W
n
(13.3.6)
m
j=1
Ytkn+1 = Ytkn+1 +
(13.3.7)
predictor
2 1 j k
1
tj t
Ytkn+1 = Ytkn + ak tn + tkn + L0 ak tn +
L a W
n
n
2
2 j=1
m
533
(13.4.1)
dX t = a(t, X t )dt + b(t, X t )dW t + c(t, X t , v) p (dv, dt)
E
(13.4.2)
534
and
Ytn+1 = Ytn+1 +
(13.4.6)
4b
2
4etn
Yt .
2 (1 etn ) n
535
with nt dened in (6.1.11). The term jump-adapted indicates again that all
jump times {1 , 2 , . . . p (T ) } of the Poisson measure p are included in the
time discretization. Moreover, we require a maximum step size (0, 1),
which means that P (tn+1 tn ) = 1 for every n {0, 1, 2, . . . , nT 1},
and if the discretization time tn+1 is not a jump time, then tn+1 should be
Atn -measurable. Let us introduce an additional ltration
Atn = 1p (E,{tn+1 })
=0 Atn
(13.5.1)
for every n {0, 1, . . . , nT }. We assume that tn+1 is Atn -measurable. We
also assume a nite number of time discretization points, which means that
nt < a.s. for t [0, T ]. The superposition of the jump times with an
equidistant time discretization, as discussed in Chap. 6, provides an example
of such jump-adapted time discretization.
We recall that for m N we denote the set of all multi-indices that do
not include components equal to 1 by
<m = {(j1 , . . . , jl ) : ji {0, 1, . . . , m}, i {1, 2, . . . , l} for l N } {v}
M
where v is the multi-index of length zero, see Chap. 4 and Sect. 8.7.
<m , the remainder set B(A)
Given a set A M
of A is dened by
<m \A : A}.
B(A)
= { M
Moreover, for every {1, 2, . . .} we dene the hierarchical set
<m : l() }.
= { M
For a jump-adapted time discretization (t) , with maximum time step size
(0, 1), we dene the jump-adapted weak order Taylor scheme by
f (tn , Y tn )I
(13.5.2)
Y tn+1 = Y tn +
\{v}
and
Y tn+1 = Y tn+1 +
(13.5.3)
536
|E(g(X T )) E(g(Y
tn ))| C .
T
tn
t
,y
tn1 ,y
1
)u(t
,
y)+
L
u(z,
X
)(dv)dz
E u(tn , X tn1
Atn1 = 0,
n1
z
v
n
Lemma 13.5.2
tn1
(13.5.4)
d
where the process X s,y = {X s,y
,
t
[0,
T
]},
for
(s,
y)
[0,
t]
,
is
dened
t
in (12.3.1), and
(13.5.5)
u(s, y) = E g (X s,y
T ) As .
Proof: Note that since all jump times are included in the time discretizat
,y
tion, Xzn1 evolves as a diusion in the time interval (tn1 , tn ) for every
os formula we obtain
n {1, . . . , nT }. Therefore, by It
tn
tn1 ,y
L0 u(z, X ztn1 ,y ) dz
u(tn , X tn ) = u(tn1 , y) +
tn1
m
tn
j=1
tn1
tn
= u(tn1 , y) +
tn1
j=1
tn
tn1
tn
tn1
tn1 ,y
L1
)(dv) dz,
v u(z, X z
537
(1)
0 are dened in (4.3.4)(4.3.7).
where the operators L0 , Lj , Lv
and L
We complete the proof of the lemma by applying the expected value, using
the result (12.3.5) and the properties of It
o integrals.
Lemma 13.5.3
For each p {1, 2, . . .} there exists a nite constant K
such that
t
,y
2q
2q
q
A
y|
(13.5.6)
E |X tn1
tn1 K(1 + |y| )(tn tn1 )
n
for all q {1, . . . , p} and n {1, . . . , nT }, where Atn1 is dened in (13.5.1).
Proof: Since there are no jumps between discretization points, the proof
of (13.5.6) follows from that of a similar lemma for the pure diusion case in
Kloeden & Platen (1999).
Let us also dene the process ja = { ja (t), t [0, T ]} by
ja (t) = ja (tn ) +
n ,t
\{v}
+
(tn ,t]
I f (tn , ja (tn )) t
(13.5.7)
(13.5.8)
l
"
y ph
(13.5.10)
h=1
538
Lemma 13.5.5
For each p {1, 2, . . .} there exist nite constants K and
r {1, 2, . . .} such that
2q tnz ,Ytnz
2q
+ Fp X z
Y
E Fp ja (z) Y
tnz
tnz Atnz
2r
(tnz +1 tnz )ql
K 1 + |Y
tnz |
(13.5.11)
nz
since ja (z) = Y
= Y
tnz and X z
tnz for z {t0 , t1 , . . . , tnT }. Moreover, since jump times arise only at discretization times, we obtain the estimate (13.5.11) by It
os formula as in the case of pure diusion SDEs, see
Kloeden & Platen (1999).
Lemma 13.5.6
For each p Pl , l {1, . . . , 2 + 1}, n {1, 2, . . . , nT }
and z [tn1 , tn ) there exist two nite constants K and r {1, 2, . . .} such
that
tn1 ,Ytn1
Y
E Fp ja (z) Y
tn1 Fp X z
tn1 Atn1
r
K 1 + |Y
(z tn1 ).
tn1 |
(13.5.12)
Proof: Since the time discretization includes all jump times, the proof of
the lemma follows from that for pure diusion SDEs, see Kloeden & Platen
(1999).
E
g(X
)
H = E g(Y
T
tnT
(13.5.13)
= E u(T, Y
)
u(0,
X
)
.
0
tn
T
Note that for the ease of notation we will write Y for Y when no misunderstanding is possible. Since Y 0 converges weakly to X 0 with order we
obtain
n
T
u(tn , Y tn ) u(tn , Y tn ) + u(tn , Y tn ) u(tn1 , Y tn1 )
H E
n=1
+ K .
(13.5.14)
539
tn1 ,Yt
u(tn , X tn n1 ) u(tn1 , Y tn1 )
tn
+
tn1
tn1 ,Ytn1
L1
)(dv)dz
v u(z, X z
.
+ K .
Note that by (13.5.8) and the properties of the stochastic integral with
respect to the Poisson measure, we have
n
T
T
1
u(tn , Y tn ) u(tn , Y tn ) = E
Lv u(z, ja (z)) p (dv, dz)
E
0
n=1
T
=E
0
L1
v u(z, ja (z)) (dv) dz
Therefore, we obtain
H H1 + H2 + K ,
where
(13.5.15)
n
T -
H1 = E
u(tn , Y tn ) u(tn , Y tn1 )
n=1
.
tn1 ,Ytn1
) u(tn , Y tn1 )
u(tn , X tn
and
(13.5.16)
T
-
1
H2 = E
u(z,
(z))
L
u(z,
Y
)
L1
tnz
ja
v
v
0
E
.
t
,Y
n
t
z
nz
L1
) L1
(dv) dz . (13.5.17)
v u(z, X z
v u(z, Y tnz )
1. Let us note that by (12.3.7) the function u is smooth enough to apply the
deterministic Taylor expansion. Therefore, by expanding the increments
of u in H1 we obtain
540
n 2+1
T
.
- 1
yp u(tn , Y tn1 ) Fp (Y tn Y tn1 )+ Rn (Y tn )
H1 = E
l!
n=1
l=1
pPl
- 2+1
1
tn1 ,Yt
yp u(tn , Y tn1 ) Fp (X tn n1 Y tn1 )
l!
l=1
pPl
!
.
,
tn1 ,Yt
+ Rn (X tn n1 )
(13.5.18)
1
2( + 1)!
yp u tn , Y tn1 + p,n (Z)(Z Y tn1 )
pP2(+1)
Fp (Z Y tn1 )
tn1 ,Ytn1
for Z = Y tn and X tn
diagonal matrix with
(13.5.19)
k,k
(Z) (0, 1)
p,n
(13.5.20)
.
)Atn1
+ E Rn (Y tn )Atn1 + E Rn (X tn
By (13.5.19), the Cauchy-Schwarz inequality, (12.3.7), (13.5.20) and
(13.5.11) we obtain
E Rn (Y tn )Atn1
K
. 12
-
2
E yp u tn , Y tn1 + p,n (Y tn )(Y tn Y tn1 ) Atn1
pP2(+1)
. 12
-
2
E Fp (Y tn Y tn1 ) Atn1
541
. 12
-
K E 1 + |Y tn1 |2r + |Y tn Y tn1 |2r Atn1
. 12
-
2
E Fp (Y tn Y tn1 ) Atn1
2r
+1
tn tn1
K 1 + Y tn1
,
(13.5.21)
+ p,n (X tn
tn1 ,Ytn1
)(X tn
. 12
2
Y tn1 ) Atn1
-
. 12
2
tn1 ,Yt
E Fp (X tn n1 Y tn1 ) Atn1
. 12
-
tn1 ,Yt
K E 1 + |Y tn1 |2r + |X tn n1 Y tn1 |2r Atn1
. 12
- t ,Y
n1
t
E |X tn n1 Y tn1 |4(+1) Atn1
+1
K 1 + |Y tn1 |2r tn tn1
(13.5.22)
E(|X tn
and
2
t ,Y
4(+1)
tn1 ,Yt
n1 t
,
Fp (X tn n1 Y tn1 ) K X tn n1 Y tn1
for every n {1, . . . , nT }, with p P2(+1) .
Finally, by applying the Cauchy-Schwarz inequality, (12.3.7), (13.5.8),
(13.5.12), (13.5.21), (13.5.22) and (13.5.9) we obtain
542
K 1 + E
max |Y tn |2r
0nnT
K 1 + |Y 0 |2r K .
(13.5.23)
l=1
.
+ Rnz ( ja (z))
- 2+1
1
tnz ,Ytnz
yp L1
Y tnz )
v u(z, Y tnz ) Fp (X z
l!
l=1
pPl
tn ,Yt
+ Rnz (X z z nz
!
.
) (dv) dz
2+1
1
E
u(z,
Y
)
yp L1
t
v
nz
l!
E
l=1
pPl
tn ,Yt
E Fp ( ja (z) Y tnz) Fp (X z z nz Y tnz)Atnz (13.5.24)
tnz ,Ytnz
+ E Rn ( ja (z))At
)At
+ E Rn (X z
(dv)dz.
z
nz
nz
tn ,Yt
E Rnz (X z z nz )Atnz K 1 + |Y tnz |2r (z tnz )+1 .
(13.5.26)
H2 K
0
543
E 1 + |Y tnz |2r (z tnz ) (dv) dz
E 1 + max |Y tn |2r (z tnz ) dz
0nnT
K .
(13.5.27)
Y
=
Y
+
c(tn , Y
(13.6.1)
tn+1
tn+1
tn+1 , v) p (dv, {tn+1 }),
E
for n {0, 1, . . . , nT 1}. Note that we use here the same notation Y as
applied in (13.5.2)(13.5.3) for the jump-adapted weak Taylor scheme. However, the jump-adapted approximation considered here is more general since
we do not specify its evolution between discretization points. The scheme
(13.5.2)(13.5.3) is a special case of the approximation considered here. The
theorem below will state the conditions for obtaining weak order of convergence {1, 2, . . .}.
First, let us formulate an important condition on the evolution of Y .
Dene a stochastic process ja = { ja (t), t [0, T ]} such that for every
n {0, . . . , nT }
(13.6.2)
ja (tn ) = Y tn
and
ja (tn ) = Y tn .
(13.6.3)
544
2q
2r
,
(13.6.4)
E
max |Y tn | |A0 K 1 + |Y
0 |
0nnT
2q
2r
K
1
+
max
E |Y
Y
|
|
A
|Y
|
(tn+1 tn )q
tn
tn+1
tn
tk
0kn
(13.6.5)
E
n
h=1
h=1
I fph (tn , Y
)
tn
\{v}
tn ,tn+1
2r
K 1 + max |Y
(tn+1 tn )
tk |
Atn
(13.6.6)
0knT
|E(g(X T )) E(g(Y
tn ))| C .
T
(13.6.7)
Theorem 13.6.1 formulates sucient conditions on a discrete-time approximation so that it approximates the diusion component of a jump-adapted
scheme. The most important condition is the estimate (13.6.6). It requires
the rst 2 + 1 conditional moments of the increments of the diusion approximation to be suciently close to those of the truncated Wagner-Platen
expansion for pure diusions.
Proof of Weak Convergence
To prove Theorem 13.6.1, let us for n {1, . . . , nT } and y d dene
I f (tn , y) tn1 ,tn .
(13.6.8)
yn, = y +
\{v}
545
Proof of Theorem 13.6.1: Note that in the following we will write Y for
Y . By (12.3.3) and (12.3.6) we obtain
H = E g(Y tnT ) E g(X T )
= E u(T, Y tnT ) u(0, X 0 ) .
(13.6.9)
+ K .
(13.6.10)
tn1 ,Yt
u(tn , X tn n1 ) u(tn1 , Y tn1 )
tn
+
E
tn1
E
n=1
nT
tn1 ,Ytn1
L1
) (dv) dz
v u(z, X z
.
+ K
tnz ,Ytnz
1
L1
v u(z, ja (z)) Lv u(z, X z
.
) (dv)dz
.
tn1 ,Ytn1
u(tn , Y tn )u(tn , Y tn1 ) u(tn , X tn
)u(tn , Y tn1 )
+ K
H1 + H2 + H3 + K ,
where
(13.6.11)
nT -
u(tn , Y tn ) u(tn , Y tn1 )
H1 = E
n=1
.
Yt
u(tn , n,n1 ) u(tn , Y tn1 )
,
(13.6.12)
546
nT -
Yt
H2 = E
u(tn , n,n1 ) u(tn , Y tn1 )
n=1
tn1 ,Yt
u(tn , X tn n1 )
.
u(tn , Y tn1 )
(13.6.13)
and
H3 = E
-
E
1
L1
u(z,
(z))
L
u(z,
Y
)
tnz
ja
v
v
tnz ,Ytnz
L1
v u(z, X z
.
) L1
(dv) dz
v u(z, Y tnz )
. (13.6.14)
(13.6.15)
and
ja (t) = ja (tn ) + a ja (tn ) (t tn ) + b ja (tn ) (Wt Wtn )
(13.6.17)
547
+
(tn ,t]
(13.6.18)
548
1
a ja (tn ) b ja (tn ) + a ja (tn ) b ja (tn )
2
2
1
n (t)(t tn )
+ b ja (tn ) b ja (tn )
2
(tn ,t]
where
n (t) =
(13.6.20)
(13.7.1)
for t [0, T ] and X0 . Recall from Chap. 1 that the SDE admits the closed
form solution
2
XT = X0 exp
a
(13.7.2)
T + WT .
2
When we analyze the central processor unit (CPU) time needed to run a
Monte Carlo simulation that computes the payo of a call option with 400000
paths and 128 time steps on a standard desktop machine, we report a CPU
549
time of 22 and 23.5 seconds for the Euler scheme and the weak order 2.0 Taylor scheme, respectively. The corresponding simplied versions only require
0.75 and 4.86 seconds, respectively. Thus, for the Euler method the simplied
version is about 29 times faster than the Gaussian one. The simplied weak
order 2.0 scheme is nearly ve times faster than the weak order 2.0 Taylor
scheme.
Accuracy for Smooth Payo
In the following we will discuss the accuracy of the Euler scheme, the weak
order 2.0 Taylor scheme and their simplied versions when applied to the
test SDE (13.7.1). As in Chap. 11, we show several log-log plots with the
logarithm log2 (()) of the weak error, as dened in (11.1.2), versus the
logarithm log2 () of the maximum time step size .
We rst study the computation of the functional E(g(XT )), where g is a
smooth function and XT is the solution of the SDE (13.7.1). Note that the
estimation of the expected value of a smooth function of the solution X arises,
for instance, when computing expected utilities. Another important application is the calculation of Value at Risk, via the simulation of moments, as
applied in Edgeworth expansions and saddle point methods in Studer (2001).
Therefore, we consider here the estimation of the kth moment E((XT )k ) of
XT at time T , for k N .
By (13.7.2) we obtain the theoretical kth moment of XT in closed form as
2
k
E (XT ) = (X0 )k exp
(k 1)
(13.7.3)
kT (a +
2
for k N , which we can use for comparison.
The particular structure of the SDE (13.7.1) allows us to obtain a closed
form solution also for the estimator of the kth moment provided by the Euler
and the weak order 2.0 Taylor schemes, and by their simplied versions.
Let us rewrite the Euler scheme (11.2.1) for the SDE (13.7.1) as
YT = Y0
N
1
"
(1 +
n=0
aT
+ Wn ),
N
(13.7.4)
T
T
where Wn are i.i.d Gaussian N (0, N
) random variables and N =
. Note
that
k/2
k!
T
for k even
k
(13.7.5)
E (Wn ) = (k/2)!2k/2 N
0
for k odd.
550
k
k
E (YT ) = (Y0 ) E
N 1
"
aT
+ Wn )k
(1 +
N
n=0
aT
k
+ Wn )
E (1 +
= (Y0 )
N
n=0
k
N
1
"
N
1
k
"
= (Y0 )
n=0 i=0
k
aT ki i
i
) E (Wn )
(1 +
i
N
[k/2]
k2q
q N
k
(2q)! 2 T
aT
k
= (Y0 )
. (13.7.6)
1+
2q
N
q!
2N
q=0
We recall that by [z] we denote the integer part of z and by
the combinatorial coecient, see (4.1.24).
Similarly, for the simplied Euler scheme (11.2.2) we obtain
N 1
"
aT
k
k
k
2,n )
+ W
(1 +
E (YT ) = (Y0 ) E
N
n=0
i
l
, for i l,
[k/2]
k2q 2 q N
k
T
aT
= (Y0 )
, (13.7.7)
1+
2q
N
N
q=0
k
with
2 T a2 T 2
+
,
h1 = 1 + a
2 N
2 N2
h2 = + a
T
N
and
h3 =
2
. (13.7.10)
2
E (YT )
= (Y0 )
N
1
"
551
k
2
h1 + h2 Wn + h3 (Wn )
E
n=0
= (Y0 )
k
N
1
"
n=0 i=0
= (Y0 )
N
1
k
"
n=0 i=0
= (Y0 )
i
k ki i ij j
i+j
h1
h2 h3 E (Wn )
i
j
j=0
[k/2]
k
N
1
"
n=0
q=0
[(k1)/2]
i
k ki
2
h2 Wn + h3 (Wn )
h E
i 1
q=0
2q
hk2q
1
q
2q 2(ql) 2l
2(q+l)
h3 E (Wn )
h2
2l
l=0
q
k
2q + 1 2(ql) 2l+1
k(2q+1)
h3
h
h
2q + 1 1
2l + 1 2
2(q+l+1)
E (Wn )
l=0
[k/2]
q+l
q
k
2q 2(ql) 2l (2(q + l))! T
h
hk2q
h
= (Y0 )
3
2q 1
2l 2
(q + l)!
2N
q=0
k
l=0
[(k1)/2]
q=0
q
k
2q + 1 2(ql) 2l+1
k(2q+1)
h3
h
h
2q + 1 1
2l + 1 2
l=0
(2(q + l + 1))!
(q + l + 1)!
T
2N
q+l+1 N
.
(13.7.11)
For the simplied weak order 2.0 Taylor scheme (11.2.7), with the three 3,n , see (11.2.9) or (13.1.11), we obpoint distributed random variable W
tain
N
1
2 k
"
k
k
3,n
3,n + h3 W
E (YT ) = (Y0 )
E
h1 + h2 W
n=0
k
= (Y0 )
hk1
[k/2]
q=1
q+l
q
k
2q 2(ql) 2l 1 3T
k2q
h3
h
h
2q 1
2l 2
3 N
l=0
552
q
k
2q + 1 2(ql) 2l+1
k(2q+1)
h3
h1
h
2q + 1
2l + 1 2
q=0
l=0
3T
N
q+l+1
N
(13.7.12)
for k even
(13.7.13)
for k odd,
for k {1, 2, . . .}. Therefore, the weak order 2.0 Taylor scheme and the simplied weak order 2.0 Taylor scheme provide the same estimate for E((YT )k )
with k {1, 2}.
In the following, we consider the estimation of the fth moment, that is
E((YT )5 ). We choose the following parameters: X0 = 1, a = 0.1, = 0.15,
T = 1. Note that by comparing the closed form solution (13.7.3) with the estimators (13.7.6), (13.7.11), (13.7.7) and (13.7.12), we can compute explicitly
the weak error () for the Euler and weak order 2.0 Taylor schemes, as well
as for their simplied versions.
In Fig. 13.7.1 we show the logarithm log2 (()) of the weak error for the
Euler and the weak order 2.0 Taylor schemes together with their simplied
versions versus the logarithm log2 () of the time step size. Note that the slope
of the log-log plot for each scheme corresponds to the theoretically predicted
weak order, being = 1.0 for both the Euler and simplied Euler, and =
2.0 for both the weak order 2.0 Taylor and simplied weak order 2.0 Taylor
schemes, as expected. The errors generated by the simplied schemes are very
similar to those of the corresponding weak Taylor schemes.
We know that simplied schemes achieve the same order of weak convergence as their counterparts that are based on Gaussian random variables.
However, it is useful to check whether there is a signicant loss in accuracy
when the time step size is relatively large.
Relative Weak Error
We will now present some plots of the relative weak error, that is
E (g(XT )) E (g(YT ))
,
() =
E (g(XT ))
(13.7.14)
553
Fig. 13.7.1. Log-log plot of the weak error for the Euler, the simplied Euler, the
weak order 2.0 Taylor and the simplied weak order 2.0 Taylor schemes
E (X )5 E (Y )5
T
T
,
E ((XT )5 )
(13.7.15)
where the parameters are set as before with T [1/365, 3] and [0.05, 0.5].
Moreover, we use only one time step, which means that the time step size
used for the discrete-time approximation YT equals T . In Fig. 13.7.2 we report
the relative error generated by the Euler scheme. For small values of the time
to maturity T and volatility , the Euler scheme is very precise, even when
using only one time step as considered here. For instance, when the maturity
time is set to 2/12 and the volatility to 0.1 we obtain a relative error of 0.13%.
When the time to maturity and the volatility increase, the accuracy of the
Euler scheme is not satisfactory. For instance, for T = 3 and = 0.5 the
relative weak error amounts to 99.6%. In Fig. 13.7.3 we report the relative
weak error generated by the simplied Euler scheme for the same parameter
values as above. The results are similar to those obtained in Fig. 13.7.2 for
the Euler scheme based on Gaussian random variables. Finally, Fig. 13.7.4
reports the dierence between the relative errors generated by the Euler and
simplied Euler schemes. Here we notice that the loss in accuracy due to the
use of simplied schemes does not exceed 4%.
The relative errors generated by the weak order 2.0 Taylor scheme and its
simplied version are signicantly smaller than those reported for the Euler
and simplied Euler schemes. However, the qualitative behavior with respect
to the values of the time to maturity T , and volatility is similar. In Fig.13.7.5
we report the relative error of the simplied weak order 2.0 scheme minus the
relative error of the weak order 2.0 Taylor scheme for the same parameters considered in the previous plots. Also in this case the loss in accuracy generated
by the multi-point distributed random variables is limited for all parameter
values tested.
554
555
Fig. 13.7.4. Relative error of simplied Euler scheme minus relative error of Euler
scheme
Fig. 13.7.5. Relative error of simplied weak order 2.0 scheme minus relative error
of weak order 2.0 Taylor scheme
that the simplied schemes are computationally less expensive than their
Gaussian counterparts. We recall that, for instance, the simplied Euler
scheme in our example is 29 times faster than the Euler scheme.
By comparing all six methods, we conclude that the second order simplied scheme is signicantly more ecient for the given example than any
other scheme considered. This result is rather important since it directs us
towards higher order simplied methods for ecient Monte Carlo simulation
for smooth payos.
556
Fig. 13.7.6. Relative error of simplied Euler scheme minus relative error of Euler
scheme for the tenth moment
Non-Smooth Payos
In option pricing we are confronted with the computation of expectations
of non-smooth payos. Note that weak convergence theorems for discretetime approximations require typically smooth payo functions. We refer to
Bally & Talay (1996a, 1996b) and Guyon (2006) for weak convergence theorems for the Euler scheme in the case of non-smooth payo functions.
To give a simple example, let us compute the price of a European call
option. In this case we assume that the Black-Scholes dynamics of the SDE
(13.7.1) is specied under the risk neutral measure, so that the drift coecient a equals the risk-free rate r. In this case the price of the European call
option is given by the expected value of the continuous, but only piecewise
557
dierentiable payo erT (XT K)+ = erT max(XT K, 0) with strike price
K and maturity T . By (13.7.2), we obtain the European call price from the
well-known Black-Scholes formula
cT,K (X0 ) = E(erT (XT K)+ ) = X0 N (d1 ) KerT N (d2 ),
(13.7.16)
2
X
ln ( K0 )+(r+ 2 )T
and
d
=
d
T , see (2.7.56).
where d1 =
2
1
T
For this particular example the estimator of the simplied Euler scheme
and that of the simplied weak order 2.0 Taylor scheme can be obtained in
closed form. For the simplied Euler scheme we obtain
) N i
N
T
erT N
rT
+
cT,K (Y0 ) = N
Y0 1 +
i
2
N
N
i=0
rT
1+
N
T
N
+
i
K
(13.7.17)
3T
3T
+ h3
N
N
i
+
K
(13.7.18)
cT,K (Y0 ) = erT Y0 (1 + rT ) K N (b) + Y0 T N (b) ,
with
K Y0 (1 + rT )
b=
Y0 T
x2
and
e 2
N (x) = ,
2
for x .
For the weak order 2.0 Taylor scheme with one time step, if
Y0 Y0 (h2 )2 4h3 (Y0 h1 K) > 0,
558
Fig. 13.7.8. Log-log plot of the weak errors for the Euler, the simplied Euler, the
weak order 2.0 Taylor and the simplied weak order 2.0 Taylor schemes
then we obtain
cT,K (Y0 ) = e
rT
+ Y0
where
b =
Y0 (h1 + h3 ) K N (b ) + N (b+ )
N (b+ ) N (b ) h2 + h3 b+ b
,
Y0 h2
559
the log-error forming a perfect line dependent on the log-time step size. The
weak order 2.0 Taylor scheme is more accurate and achieves an order of weak
convergence of about = 2.0. The accuracy of the simplied weak order 2.0
scheme is comparable to that of the weak order 2.0 Taylor scheme, but its
convergence is more erratic.
We report that when testing these schemes with dierent sets of parameters, we noticed that simplied schemes have an accuracy similar to that of
corresponding Taylor schemes, but their error exhibits oscillations depending
on the step size. This eect seems to be more pronounced in the simplied
weak order 2.0 Taylor scheme, but it can also be present, when using the
simplied Euler scheme for certain parameters.
As we will see in Sect. 17.2, the simplied Euler scheme and the simplied
weak order 2.0 schemes are related to binomial and trinomial trees, respectively. The erratic behavior of the weak error that we observed appears to be
due to the discrete nature of the presence of multi-point distributed random
variables used in the simplied schemes. This eect appears to be related to
that which was noticed for tree methods in Boyle & Lau (1994), and which
we will also discuss in Chap. 17.
As in the previous section, we now compare for a wide range of parameters
the accuracy of weak Taylor schemes with that of simplied weak Taylor
schemes when using only one time step. We consider the relative weak error,
cT,K (X) cT,K (Y )
,
(13.7.19)
cT,K (X)
with X0 = Y0 = 1, r = 0.1, [0.05, 0.5], K [0.88, 1.12] and T = 3/12.
In Fig. 13.7.9 we report the relative error generated by the Euler scheme. We
560
notice that for small values of the volatility and large values of the strike, the
relative error increases as much as 70%. Fig. 13.7.10 shows the relative error
for the simplied Euler scheme, which is extreme for large strikes and small
volatilities. Also in this case the loss in accuracy due to the use of two-point
distributed random variables (11.2.4) is rather small outside this region of
high strike prices and low volatilities. Note that
since we are using only one
time step, for every value of K Y0 (1 + rT + T ) all sample paths of the
simplied Euler scheme nish out-of-the-money and, thus, the expected call
payo is valued close to zero. On the other hand, some of the paths generated
by the Euler scheme end up in-the-money, providing a positive value for the
expected call payo, which is closer to the exact solution. Note however that in
this region the exact value of the expected payo is very small. For instance, for
K = 1.05 and = 0.05 the exact value equals about 0.002. Thus, the absolute
error generated by the simplied Euler scheme remains under control.
In Figs. 13.7.11 and 13.7.12 we report the dierence between the relative
error of the simplied Euler scheme and that of the Euler scheme with a
medium maturity T = 1 and a long maturity T = 4, respectively. Again we
can see that the accuracy of schemes based on simplied random variables is
similar to that of schemes based on Taylor schemes. Furthermore, we notice
that for certain sets of parameters the simplied Euler scheme is even more
accurate than the Euler scheme.
Finally, we conducted similar numerical experiments comparing the accuracy of the weak order 2.0 Taylor scheme to that of the simplied weak order
2.0 scheme. We report also that in this case the loss in accuracy due to the
use of the three-point distributed random variable (13.1.11) is quite small.
561
Fig. 13.7.11. Relative error of simplied Euler scheme minus relative error of Euler
scheme for T = 1
Fig. 13.7.12. Relative error of simplied Euler scheme minus relative error of Euler
scheme for T = 4
562
the accuracy of simplied weak schemes can be superior to that of weak Taylor
schemes.
Smooth Payos under the Merton Model
We study the weak approximation of the SDE (1.8.5), describing the Merton
model, which is of the form
p (t)
1
Xt = X0 e(a 2
)t+Wt
i ,
(13.7.21)
i=1
563
Fig. 13.7.13. Log-log plot of weak error versus time step size
Fig. 13.7.14. Log-log plot of weak error versus time step size
564
Fig. 13.7.15. Log-log plot of weak error versus time step size
Fig. 13.7.16. Log-log plot of weak error versus time step size
Let us now consider the case of lognormally distributed marks, where the
logarithm of the mark i = ln (i ) is an independent Gaussian random vari
able, i N (, ), with mean = 0.1738 and standard deviation = 0.15.
This implies that at a jump time the value of X drops on average by 15%,
since E()1 = 0.15. In Fig.13.7.16 we report the results for the regular Euler, jump-adapted predictor-corrector Euler, regular and jump-adapted order
2.0 Taylor, and the jump-adapted order 2.0 predictor-corrector schemes. The
accuracy of these schemes in the case of lognormally distributed jump sizes is
almost the same as that achieved in the case of a mark-independent jump coecient. Here all schemes achieve the theoretical orders of weak convergence
prescribed by the convergence theorems in the previous chapters.
565
e T ( T )j
j=0
j!
fj ,
=
j T
j2
2 )T
()
d2,j = d1,j j T , aj = rq(E ()1)+ j ln E
and j2 = 2 + j
T
T . Recall
that we denote by N () the probability distribution of a standard Gaussian
random variable.
566
Fig. 13.7.17. Log-log plot of weak error versus time step size
Let us rst consider the case of constant marks i = > 0, with = 0.85
as in the previous section. Moreover, we set r = 0.055, q = 0.01125, = 0.05,
= 0.15, X0 = 100 and K = 100. Note that this implies the risk neutral drift
a = r q ( 1) = 0.05.
In Fig. 13.7.17 we consider the case of a mark-independent jump coecient for the regular and jump-adapted Euler schemes, and the jump-adapted
predictor-corrector Euler scheme. All these schemes achieve, in our study, an
order of weak convergence of about one. The regular and the jump-adapted
Euler schemes achieve experimentally rst order weak convergence, with the
regular scheme being slightly more accurate. The jump-adapted predictorcorrector Euler scheme is far more accurate than the explicit schemes considered here. Its order of weak convergence is about one, with an oscillatory
behavior for large time step sizes. We will give in the next chapter some explanation for this excellent performance.
Fig. 13.7.18 shows the accuracy of the regular order 2.0 Taylor, jumpadapted order 2.0 Taylor and jump-adapted order 2.0 predictor-corrector
schemes. All schemes attain numerically an order of weak convergence equal
to = 2.0. In this example the jump-adapted order 2.0 Taylor scheme is the
most accurate. Finally, in Fig. 13.7.20 we report the results for these second
weak order schemes together with those of the regular Euler and jump-adapted
predictor-corrector Euler schemes. Notice the dierence in accuracy between
rst order and second order schemes. However, we emphasize again the excellent performance of the jump-adapted predictor-corrector Euler scheme already for large step sizes. This scheme reaches an accuracy similar to that of
some second order schemes. The jump-adapted weak order 2.0 Taylor scheme
is the most accurate for all time step sizes considered.
We have also tested the schemes mentioned above in the case of lognormal marks, where ln (i ) N (, ), with mean = 0.1738 and standard
deviation = 0.15. In Fig. 13.7.19 one notices that the numerical results are
567
Fig. 13.7.18. Log-log plot of weak error versus time step size
Fig. 13.7.19. Log-log plot of weak error versus time step size, lognormal marks
practically identical to those reported in Fig. 13.7.20 for the case of constant
marks.
Let us now test the weak schemes proposed when approximating the SDE
(13.7.20), for the case of lognormal marks, by using parameter values reasonably consistent with market data. We use here the parameters reported in
Andersen & Andreasen (2000a) from European call options on the S&P 500
in April 1999. The risk free rate and the dividend yield are given by r = 0.0559
and q = 0.01114, respectively. With a least-squares t of the Merton model to
the implied midpoint Black-Scholes volatilities of the S&P 500 options in April
1999, Andersen & Andreasen (2000a) obtained the following parameters: =
568
Fig. 13.7.20. Log-log plot of weak error versus time step size, constant marks
Fig. 13.7.21. Log-log plot of weak error versus time step size, lognormal marks
In Fig. 13.7.21 we show the results for the regular Euler, jump-adapted
predictor-corrector Euler, regular and jump-adapted order 2.0 Taylor, and
the jump-adapted order 2.0 predictor-corrector schemes, when pricing an atthe-money European call option with maturity time T = 4. We report that
the regular Euler scheme and the jump-adapted Euler scheme, which is not
shown in the gure, achieve rst order of weak convergence, with the jumpadapted Euler scheme being slightly less accurate than its regular counterpart.
The regular and jump-adapted order 2.0 Taylor and the jump-adapted order
2.0 predictor-corrector schemes are signicantly more accurate and achieve
an order of weak convergence of approximately = 2.0. We emphasize that
the jump-adapted predictor-corrector Euler scheme achieves in this example
a remarkable accuracy. For large time step sizes it is as good as some second
order schemes.
13.8 Exercises
569
In summary, the numerical results obtained indicate that predictor-corrector schemes for the given underlying dynamics remain accurate, even when
using large time step sizes. Also, Hunter, J
ackel & Joshi (2001) report that
the predictor-corrector Euler scheme is very accurate when pricing interest
rate options under the diusion LIBOR market model. Finally, we remark
that in this section we have analyzed only the accuracy and not the CPU
time needed to run the algorithms. This is, of course, strongly dependent on
the specic implementation and the problem at hand. We emphasize that
predictor-corrector schemes are only slightly computationally more expensive
than corresponding explicit schemes, as they use the same random variables.
13.8 Exercises
13.1. For the Merton type SDE given in Exercise 12.1 provide a jump-adapted
simplied weak order 2.0 scheme.
13.2. For the SDE in Exercise 12.1 formulate a jump-adapted fully implicit
weak order 1.0 predictor-corrector scheme.
14
Numerical Stability
When simulating discrete-time approximations of solutions of SDEs, in particular martingales, numerical stability is clearly more important than higher
order of convergence. The stability criterion presented is designed to handle
both scenario and Monte Carlo simulation, that is, both strong and weak
approximation methods. Stability regions for various schemes are visualized.
The result being that schemes, which have implicitness in both the drift and
the diusion terms, exhibit the largest stability regions. Rening the time step
size in a simulation can lead to numerical instabilities, which is not what one
experiences in deterministic numerical analysis. This chapter follows closely
Platen & Shi (2008).
571
572
14 Numerical Stability
cepts typically make use of specically designed test equations, which have explicit solutions available, see, for instance, Milstein (1995a), Kloeden & Platen
(1999), Hernandez & Spigler (1992, 1993), Saito & Mitsui (1993a, 1993b,
1996), Hofmann & Platen (1994, 1996) and Higham (2000).
Stability regions have been identied for the parameter sets of test equations where the propagation of errors is under control in a well-dened sense.
In some cases authors use solutions of complex valued linear SDEs with additive noise as test dynamics, see Hernandez & Spigler (1992, 1993), Milstein
(1995a), Kloeden & Platen (1999) and Higham (2000). However, in nancial
applications, these test equations are potentially not realistic enough to capture numerical instabilities, typically arising when simulating asset prices denominated in units of a certain numeraire. The resulting SDEs have no drift
and their diusion coecients are often level dependent in a multiplicative
manner. To study the stability of the corresponding numerical schemes, test
SDEs with multiplicative noise have been suggested in real valued and complex valued form, see Saito & Mitsui (1993a, 1993b, 1996), Hofmann & Platen
(1994, 1996) and Higham (2000). Recently, Bruti-Liberati & Platen (2008)
analyzed the stability regions of a family of strong predictor-corrector Euler
schemes under a criterion which captures a rather weak form of numerical
stability, known as asymptotic stability.
In the spirit of Bruti-Liberati & Platen (2008), we will use a more general
stability criterion. It unies, in some sense, criteria that identify asymptotic
stability, mean square stability and the stability of absolute moments. The
corresponding stability regions allow us to visualize the stability properties
for a range of simulation schemes. The interpretation of these plots will turn
out to be rather informative. The methodology can be employed for the study
of numerical stability properties of both strong and weak schemes.
Test Dynamics
Similar to Hofmann & Platen (1994) and Bruti-Liberati & Platen (2008) we
o SDE with multiplicative noise
use as linear test SDE the linear It
3
dXt = 1 Xt dt + || Xt dWt
(14.1.1)
2
for t 0, where X0 > 0, < 0, [0, 1). Here we call the degree of
stochasticity and the growth parameter. The corresponding Stratonovich
SDE has the form
dXt = (1 ) Xt dt +
|| Xt dWt ,
(14.1.2)
where denotes the Stratonovich stochastic dierential, see Sect. 1.4 and
Kloeden & Platen (1999).
For the two equivalent real valued SDEs (14.1.1) and (14.1.2), the parameter [0, 1) describes the degree of stochasticity of the test dynamics.
When = 0, the SDEs (14.1.1) and (14.1.2) become deterministic. In the
573
case = 23 , the It
o SDE (14.1.1) has no drift, and X is a martingale. This
case models a typical Black-Scholes asset price dynamic when the price is
expressed in units of a numeraire under the corresponding pricing measure.
When the numeraire is the savings account, then the pricing measure is, under appropriate assumptions, the equivalent risk neutral probability measure.
In Chap. 3 we have shown that the pricing measure is simply the real world
probability measure when the numeraire is the growth optimal portfolio. In
the case = 1, the Stratonovich SDE (14.1.2) has no drift.
Stability Criterion
Now, let us introduce the following notion of stability:
Denition 14.1.1.
totically p-stable if
(14.1.3)
(14.1.5)
1
),
if and only if < 1+1 p . This means, in the case when < 0 and [0, 1+p/2
2
that X is asymptotically p-stable. Furthermore, note that for < 0 by the
law of iterated logarithm and the law of large numbers, one has
(14.1.6)
P lim Xt = 0 = 1
t
574
14 Numerical Stability
575
(14.1.8)
1
E((Gn+1 ( , ))p 1) < 0
p
(14.1.9)
for all n {0, 1, . . .}. This provides the already indicated link to the kind of
asymptotic stability studied in Bruti-Liberati & Platen (2008) when p vanishes. In this case a scheme is called asymptotically stable for a given and
if E(ln(Gn+1 ( , ))) < 0 for all n {0, 1, . . .}.
In the following we visualize the stability regions of various schemes by
identifying the set of triplets ( , , p) for which the inequality (14.1.8) is
satised. More precisely, we will display the stability regions for [10, 0),
[0, 1) and p (0, 2]. Note that when p approaches zero, we obtain the
stability region identied under the asymptotic stability criterion, see (14.1.9).
We deliberately truncate in our visualization the stability regions at p = 2,
since the area shown on the top of the graph then provides information about
576
14 Numerical Stability
the popular mean-square stability of the scheme. For higher values of p the
stability region shrinks further, and for some schemes it may even become an
empty set. Asymptotically, for very large p the resulting stability region relates
to the worst case stability concept studied in Hofmann & Platen (1994, 1996).
As such the employed asymptotic p-stability covers a relatively wide range of
stability criteria.
(14.2.2)
577
see (7.4.1) and (7.4.2). Note that we need to use in (14.2.1) the adjusted drift
function a
= a b b . The parameters , [0, 1] are the degrees of implicitness in the drift and the diusion coecient, respectively. For the case
= = 0 one recovers the Euler scheme. A major advantage of the above
family of schemes is that they have exible degrees of implicitness parameters
and . For each possible combination of these two parameters the scheme
is of strong order r = 0.5. This allows us to compare for a given problem
simulated paths for dierent degrees of implicitness. If these paths dier signicantly from each other, then some numerical stability problem is likely
to be present and one needs to make an eort in providing extra numerical
stability. The above strong predictor-corrector schemes potentially oer the
required numerical stability. However, other schemes, in particular, implicit
schemes may have to be employed to secure sucient numerical stability for
a given simulation task.
Stability of the Euler Scheme
For the Euler scheme, which results for = = 0 in the algorithm (14.2.1),
it follows by (14.1.7) that
3
(14.2.3)
Gn+1 ( , ) = 1 + 1 + Wn ,
2
where Wn N (0, ) is a Gaussian distributed random variable with mean
zero and variance . The transfer function (14.2.3) yields the stability region
578
14 Numerical Stability
shown in Fig. 14.2.1. It is the region where E((Gn+1 ( , ))p ) < 1. This
stability region has been obtained numerically by identifying for each pair
(, ) [10, 0) [0, 1) those values p (0, 2] for which the inequality
(14.1.8) holds. One notes that for the purely deterministic dynamics of the
test SDE, that is = 0, the stability region covers the area (2, 0) (0, 2).
Also in later gures of this type we note that for the deterministic case = 0
we do not see any dependence on p. This is explained by the fact that in
this case there is no stochasticity and the relation (2, 0) reects the
classical interval of numerical stability for the Euler scheme, as obtained under
various criteria, see Hernandez & Spigler (1992). For a stochasticity parameter
of about 0.5, the stability region has in Fig.14.2.1 the largest cross section
in the direction of for most values of p. The stability region covers an
interval of increasing length up to about [6.5, 0] when p is close to zero,
and about [3, 0] when p increases to 2. For increased stochasticity parameter
beyond 0.5 the stability region declines in Fig. 14.2.1. We observe that if the
Euler scheme is stable for a certain step size, then it is also stable for smaller
step sizes. We see later that this is not the case for some other schemes.
The top of Fig. 14.2.1 shows the mean-square stability region of the Euler
scheme, where we have p = 2. As the order p of the moments increases,
the cross section in the direction of and will be further reduced. The
intuition is that as the stability requirements in terms of the order p of the
moment becomes stronger, there are less pairs (, ) for which the simulated
approximations pth moment tends to zero.
Semi-Drift-Implicit Method
Let us now consider the semi-drift-implicit predictor-corrector Euler method
with = 12 and = 0 in (14.2.1). The transfer function for this method equals
1
3
3
Gn+1 ( , ) = 1 + 1 1 +
1 + Wn
2
2
2
+ Wn .
Its stability region is displayed in Fig. 14.2.2, showing that this scheme has
good numerical stability around 0.6, where it is stable for almost all
[10, 0). When p is close to 2, its stability region begins to show a gouge
around 9. This means, when we consider the numerical stability for
the second moment p = 2, instability could arise when using a step size close
to 9. Unfortunately, for the stochasticity parameter value = 23 , that
is when X forms a martingale, the stability region is considerably reduced
when compared to the region for = 0.6. Near the value of = 0.6 the
semi-drift-implicit predictor-corrector scheme outperforms the Euler scheme
in terms of stability for most values of p.
579
3
3
1 + 1 + Wn
Gn+1 ( , ) = 1 + 1
2
2
+ Wn .
The corresponding stability region is plotted in Fig. 14.2.3. It has similar
features as the stability region of the semi-drift-implicit predictor-corrector
scheme. However, the stability region appears reduced. It no longer extends
as far in the negative direction of as in Fig. 14.2.2. In particular, this is
visible around a value of 0.6. When p is close to 2.0, the gouge observed
in Fig. 14.2.2 near 9 now occurs at 4. It appears as though the
stability region decreased by making the drift fully implicit. A symmetric
choice in (14.2.1) with = 1/2 seems to be more appropriate than the extreme
choice of full drift implicitness with = 1.
Semi-Implicit Diusion Term
Now, let us study the impact of making the diusion term implicit in
the predictor-corrector Euler method (14.2.1). First, we consider a predictor-
580
14 Numerical Stability
Gn+1 ( , ) = 1 + (1 ) + Wn
1
3
.
1+
1 + Wn
2
2
The corresponding stability region is shown in Fig. 14.2.4. Although it
appears to be rather restricted, when compared with the stability region for
the semi-drift-implicit predictor-corrector method, it is important to note that
for the martingale case, that is = 23 , the stability region is wider than
those of the previous schemes. This observation could become relevant for
applications in nance. In some sense, this should be expected since when
simulating martingale dynamics, one only gains more numerical stability by
making the diusion term implicit rather than the drift term, which is
simply zero for a martingale.
Furthermore, although it may seem counter-intuitive to the common experience in deterministic numerical analysis, this scheme can lose its numerical
stability by reducing the time step-size. For example, for p close to 0.01 and
near 23 , we observe that for about 3 the method is asymptotically pstable, whereas for smaller step sizes yielding a above 2, the method is no
longer asymptotically p-stable. This is a phenomenon, which is not common
581
Fig. 14.2.4. Stability region for the predictor-corrector Euler method with = 0
and = 12
in deterministic numerical analysis. In stochastic numerics it arises occasionally, as we will also see for some other schemes. In Fig. 14.2.4 it gradually
disappears as the value of p increases.
Symmetric Predictor-Corrector Method
An interesting scheme is the symmetric predictor-corrector Euler method,
obtained from (14.2.1) for = = 12 . It has transfer function
1
3
Gn+1 ( , ) = 1 + (1 ) 1 +
1 + Wn
2
2
1
3
.
+ Wn 1 +
1 + Wn
2
2
Its stability region is shown in Fig. 14.2.5. In particular, for the martingale
case = 23 it has a rather large region of stability when compared to that
of the Euler scheme and those of previously discussed schemes. For p near
zero, close to one and close to 1, there exists a small area where this
scheme is not asymptotically p-stable. Again, this is an area where a decrease
in the step size can make a simulation numerically unstable. Also here this
phenomenon disappears as the value of p increases.
582
14 Numerical Stability
Fig. 14.2.5. Stability region for the symmetric predictor-corrector Euler method
1
3
1 + 1 + Wn
Gn+1 ( , ) = 1 + 1
2
2
3
+ Wn 1 + 1 + Wn
2
3
= 1 + 1 + 1 + Wn
2
1
1 + Wn .
2
The resulting stability region is shown in Fig. 14.2.6. We notice here
that this stability region is considerably smaller than that of the symmetric predictor-corrector Euler scheme. It seems that asymptotic p-stability does
not always increase monotonically with the degrees of implicitness in both the
drift and the diusion term. Symmetric levels of implicitness tend to achieve
large stability regions.
In summary, in the case of predictor-corrector Euler methods, the stability regions displayed suggest that the symmetric predictor-corrector Euler
583
Fig. 14.2.6. Stability region for fully implicit predictor-corrector Euler method
method, see Fig. 14.2.5, is the most suitable of those visualized when simulating martingale dynamics with multiplicative noise. The symmetry between
the drift and diusion terms of the method balances well its numerical stability properties, making it an appropriate choice for a range of simulation
tasks.
584
14 Numerical Stability
1 +
Gn+1 ( , ) =
1
2
1 32 + Wn
.
1 12 1 32
(14.3.1)
Its stability region is shown in Fig. 14.3.1. We can see from (14.3.1) that
the transfer function involves in this case some division by and . This
kind of implicit scheme requires, in general, solving an algebraic equation at
each time step in order to obtain the approximate solution. This extra computational eort makes the stability region signicantly wider compared to those
obtained for the previous predictor-corrector Euler schemes. For instance, for
all considered values of p and , one obtains in Fig.14.3.1 always asymptotic
p-stability for a degree of stochasticity of up to = 0.5. Unfortunately, there
is no such stability for = 2/3 for any value of p when < 3. This means
that this numerical scheme may not be well suited for simulating martingale
dynamics when the step size needs to remain relatively large. The symmetric
predictor-corrector Euler method, with stability region shown in Fig. 14.2.5,
appears to be better prepared for such a task.
Full-Drift-Implicit Euler Method
Now, let us also consider the full-drift-implicit Euler scheme
Yn+1 = Yn + a(Yn+1 ) + b(Yn )Wn ,
see Sect. 7.2, with transfer function
585
1 + W
n
Gn+1 ( , ) =
.
1 1 32
Fig. 14.3.2 shows its stability region which looks similar to the region obtained for the semi-drift-implicit Euler scheme. However, it has an additional
area of stability for close to one and near -10. Additionally, this region diminishes as p becomes larger and does not cover the martingale case
= 2/3. It seems that the region of stability is likely to increase with the
degree of implicitness in the drift term of an implicit Euler scheme. However,
this increase of asymptotic p-stability appears to weaken as p becomes large.
Balanced Implicit Euler Method
Finally, we mention in this section the balanced implicit Euler method, proposed in Milstein et al. (1998), see Sect. 7.3. We study this scheme in the
particular form
3
Yn+1 = Yn + 1 Yn + ||Yn Wn + c|Wn |(Yn Yn+1 ).
2
Gn+1 ( , ) =
.
1 + |Wn |
586
14 Numerical Stability
587
Fig. 14.4.1. Stability region for the simplied symmetric Euler method
n = ) = 1 .
P (W
2
(14.4.1)
Platen & Shi (2008) have investigated the impact of such simplication
of the random variables in simplied weak schemes on corresponding stability regions. The resulting stability regions look rather similar to those of the
corresponding previously studied schemes with Gaussian random variables.
In most cases, the stability region increases slightly. To provide an example,
Fig. 14.4.1 shows the stability region of the simplied symmetric predictorcorrector Euler scheme. It is, of course, dierent from the one displayed in
Fig. 14.2.5. When comparing these two gures, the simplied scheme shows
an increased stability region, in particular near the critical level of 2/3,
which is critical for the Monte Carlo simulation of martingales. In general, one
can say that simplied schemes usually increase numerical stability. The stability region of the simplied balanced implicit Euler method shows a stability
region similar to the one already presented in Fig. 14.3.3, and is therefore, not
displayed here.
Simplied Implicit Euler Scheme
As pointed out in Sect. 11.5, terms that approximate the diusion coecient
can be made implicit in a simplied weak scheme. The reason being that in
such a scheme instead of an unbounded random variable Wn one uses a
n . Let us now consider the family of simplied
bounded random variable W
implicit Euler schemes, see Sect. 11.5, given in the form
588
14 Numerical Stability
Fig. 14.4.2. Stability region for the simplied symmetric implicit Euler scheme
Yn+1 = Yn + { a
(Yn+1 ) + (1 )
a (Yn )}
n,
+ { b(Yn+1 ) + (1 ) b(Yn )} W
(14.4.2)
n
||W
.
n
1 (1 + ( 32 )) ||W
(14.4.3)
Platen & Shi (2008) have studied the resulting stability regions for this
family of schemes. For instance, the stability region for the simplied semidrift-implicit Euler scheme resembles that in Fig. 14.3.1, and that of the simplied full-drift-implicit Euler method shown in Fig. 14.3.2. The simplied
symmetric implicit Euler scheme with = = 0.5 in (14.4.2) has its stability
region displayed in Fig. 14.4.2 and that of the simplied fully implicit Euler
method is shown in Fig. 14.4.3. We observe that both stability regions ll almost the entire area covered. For this reason, a simplied symmetric or fully
implicit Euler scheme could be able to overcome potential numerical instabilities in a Monte Carlo simulation that are otherwise dicult to master. Note
that we achieve here asymptotic p-stability even in some situations where the
test dynamics are unstable, which is interesting to know.
Gn+1 ( , ) =
589
Fig. 14.4.3. Stability region for the simplied fully implicit Euler scheme
One has to be aware of the fact that for the simplied symmetric implicit
Euler scheme, see Fig. 14.4.2, problems may arise for small step sizes. For
instance, for the martingale case, that is = 2/3, a problem occurs at p = 2,
the stability region is decreased by the use of a time step size that is too
small. A straightforward calculation shows by using (14.4.3) for = =
= 1 that the fully implicit Euler scheme is not asymptotically p-stable for
2
, as can be also visually conrmed
p = 2 when the step size is below ||
from Fig. 14.4.3, see Exercise 14.2. This means, for an implementation, which
may have been successfully working for some given large time step size in a
derivative pricing tool, by decreasing the time step size, often expecting a more
accurate valuation, one may in fact create severe numerical stability problems.
This type of eect has already been observed for several other schemes. We
emphasize that rening the time step size in a Monte Carlo simulation can
lead to numerical stability problems.
Testing Second Moments
Finally, let us perform some Monte Carlo simulations with the simplied Euler
scheme and the simplied semi-drift-implicit Euler scheme, where we estimate
the second moment of XT for T [3, 729], = 1, = 0.49 and = 3. We
simulated N = 1,000,000 sample paths to obtain extremely small condence
intervals, which are subsequently neglected in Table 14.4.1.
Table 14.4.1 shows the second moment exact solution, simplied Euler
scheme estimate YT , and simplied semi-drift-implicit Euler scheme estimate
590
14 Numerical Stability
T
E(XT2 )
E(YT2 )
E(YT2 )
3
0.89
1.51
0.94
9
0.7
3.47
0.83
27
81
243
729
0.34 0.039 0.00006 2.17 1013
41.3 69796 3.47 1014 9.83 1042
0.57 0.069 5.07 108 3.0 1030
Table 14.4.1. Second moment of the test dynamics calculated using the exact
solution XT ; the simplied Euler scheme YT ; and simplied semi-drift-implicit Euler
scheme YT
YT in that order. One can observe that the Monte Carlo simulation with the
simplied Euler scheme diverges and is not satisfactory, whereas the simplied
semi-drift-implicit Euler scheme estimate is much more realistic for a time
horizon that is not too far out. The better performance is explained by the
fact that this scheme remains for p = 2 in its stability region, whereas the
simplied Euler scheme is for the given parameter set in a region where there
is no asymptotic p-stability for p = 2.
14.5 Exercises
14.1. Identify for the Euler scheme and asymptotic p-stability with p = 2 the
boundary of its stability region for values as a function of .
14.2. Describe the step sizes for which the fully implicit simplied Euler
scheme for the test SDE with = 1 is asymptotically p-stable for p = 1.
15
Martingale Representations and Hedge Ratios
The calculation of hedge ratios is fundamental to both the valuation of derivative securities and also the risk management procedures needed to replicate
these instruments. In Monte Carlo simulation the following results on martingale representations and hedge ratios will be highly relevant. In this chapter
we follow closely Heath (1995) and consider the problem of nding explicit
It
o integral representations of the payo structure of derivative securities. If
such a representation can be found, then the corresponding hedge ratio can
be identied and numerically calculated. For simplicity, we focus here on the
case without jumps. The case with jumps is very similar.
=a
t ,x
(t, X t0 ) dt
m
t ,x
(15.1.1)
j=1
591
592
for t [t0 , T ], i {1, 2, . . . , d}. Here X t0 ,x starts at time t0 with initial value
x = (x1 , . . . , xd ) d . We assume that appropriate growth and Lipschitz
conditions apply for the drift ai and diusion coecients bi,j so that (15.1.1)
admits a unique strong solution and is Markovian, see Chap. 1.
In order to model the chosen numeraire in the market model we may
select the rst component X 1,t0 ,x = = {t , t [t0 , T ]} to model its price
movements. Under real world pricing, will be the growth optimal portfolio
(GOP). However, under risk neutral pricing we use the savings account as
numeraire. Both cases are covered by the following analysis.
The vector process X t0 ,x could model several risky assets or factors that
drive the securities, as well as, components that provide additional specications or features of the model such as stochastic volatility, market activity,
ination or averages of risky assets for Asian options, etc.
General Contingent Claims
In order to build a setting that will support American option pricing and
certain exotic option valuations, as well as hedging, we consider a stopping
time formulation as follows:
Let 0 [t0 , T ] d be some region with 0 (t0 , T ] d an open set
and dene a stopping time : by
t ,x
/ 0 }.
= inf{t > t0 : (t, X t0 )
(15.1.2)
593
Valuation Function
When applying the Markov property, the option pricing or valuation function
u : 0 1 corresponding to the payo structure h(, X t0 ,x ) is given by
t
t,x
u(t, x) = E
h , X
(15.1.3)
T
denotes expectation under the appropriately
for (t, x) 0 1 . Here E
(15.1.4)
(15.1.5)
for t [t0 , T ], starting at time t0 with initial value t0 . We can write the
solution to (15.1.5) in the form
Ztt0 ,t0 = t ,
(15.1.6)
for t [t0 , T ]. This expression together with the uniqueness of the solutions
of (15.1.1) and (15.1.5) shows that
t0 ,x
t
X t0 ,x = X t,X
and
(15.1.7)
594
0 0
t,Zt
= Zt0 ,t0 = Z
(15.1.8)
for t [t0 , T ]. Using these equalities, the functional (15.1.3), the Markov
1,t ,x
property and the assignment = X 0 with x1 = 1, we have
t0 ,x
ut = u t , X t
t0 ,t0
t0 ,x
t t,Zt
t,Xt
h Z
, X
t0 ,t0
t0 ,x
t t,Zt
t,Xt
At
E
h Z
, X
t0 ,t0
t0 ,x
1 t,Zt
t,Xt
At
t E
h Z
, X
t0 ,t0
t0 ,x
1 t,Zt
t,Xt
t E
h Z
, X
1
t0 ,x
t E
h , X
, X t0 ,x
h
t E
t0 ,x
, X t,Xt
h
t E
=E
=
=
=
=
=
=
t0 ,x
t , X t
= t u
(15.1.9)
t ,x
0
) 0 .
for (t , X t
Dene the (A, P )-martingale M = {Mt : t [t0 , T ]} by
X t0 ,x ) At ,
h(,
Mt = E
(15.1.10)
we have
t0 ,x
, X t,Xt
h
Mt = E
At
t0 ,x
, X t,Xt
h
=E
t0 ,x
=u
t , X t
for t [t0 , T ].
Consequently, the discounted or benchmarked valuation process
(15.1.11)
595
t0 ,x
, t [t0 , T ]
u
= u
t = u
t , X t
is an (A, P )-martingale. Also, from (15.1.9) we can determine values for the
t are known.
random variable ut if the corresponding values for t and u
Usually it is more convenient to compute prices via the function u
rather than
(15.2.1)
596
(15.2.2)
t
s dWs ,
(15.2.3)
M t = M0 +
t0
T
s2 ds
< .
t0
t
Mt = M 0 +
s dWs
t0
then
for some other A-predictable process ,
t
E(( (s s ) dWs )2 ) =
E((s s )2 ) ds = 0.
t0
t0
597
(15.2.4)
for (t, x) [t0 , T ] . We assume that the function u is of class C 1,3 , that
is, continuously dierentiable with respect to t and three times continuously
dierentiable in x. Expanding
t ,x
t ,x
(15.2.5)
by the It
o formula and applying the Kolmogorov backward equation yields
t
t ,x
u s, Xst0 ,x dWs
u t, Xt 0
b s, Xst0 ,x
(15.2.6)
= u(t0 , x) +
x
t0
for t [t0 , T ], see Sect. 2.7. Using the time-independent formulations of
(15.1.10), (15.1.11) and (15.2.6) we have
t ,x
Mt = E h XT0
At
t ,x
= u(t, Xt 0 )
t
u s, Xst0 ,x dWs
= u(t0 , x) +
b s, Xst0 ,x
x
t0
(15.2.7)
os formula
for t [t0 , T ]. This result can also be obtained by applying It
to (15.2.5), taking the conditional expectation on both sides of the resulting equation, and using the relations (15.2.2) and (15.2.6). Consequently, we
obtain
Mt0 = u(t0 , x)
and the integrand in the form
u s, Xst0 ,x .
s = b s, Xst0 ,x
x
We now have a martingale representation of the form (15.2.3). However, this
expression for the integrand requires the solution of the valuation equation
(15.2.4) to be known.
598
and x
L0 be the operators
2
+a
+ b2 2
s
x 2 x
2
2
a
b
1 2 3
L =
+
+ a+b
b
+
,
x
s x x x
x x2
2 x3
L0 =
(15.2.8)
(15.2.9)
for x , so that
0
L (u(s, x)) = 0
x
for (s, x) (t0 , T ) , see Sect. 2.7.
(15.2.10)
a(t, Xt )
b(t, Xt )
dt + Zt
dWt
x
x
(15.2.11)
s,z
u(t, x)
x
(15.2.12)
599
t,x
t,z
v(T, XT , ZT ) = v(t, x, z) +
(15.2.13)
b 2
+
b2 2 +
z 2 2 + bz
2
x
x
z
x x z
and
1 = b + z b .
L
x
x z
Calculating the partial derivatives of the function v using (15.2.12) and applying (15.2.10) yields
0 v(s, x, z) = z L0 u(s, x)
L
x
(15.2.14)
= 0,
for (s, x, z) (t0 , T ) 2 .
We now assume that
t,x
E v(T, XT , ZTt,1 ) <
for all (t, x) [t0 , T ] . Subject to certain growth bounds applying for the
derivative h
x , this condition will be veried in Sect. 15.4.
Consequently, taking expectations on both sides of (15.2.13) and using
(15.2.9) we have
u(t, x)
= v(t, x, 1)
x
t,x
= E v T, XT , ZTt,1
t,x
t,1
u T, XT
= E ZT
x
t,x
t,1
h XT
= E ZT
x
(15.2.15)
600
s,Xst0 ,x
h XT
E
b s, Xst0 ,x dWs .
x
t0
(15.2.16)
Applying the Markov property of X, see the remarks following (15.1.1), and
(15.1.7) with = T , this representation becomes
t ,x
u T, XT0
= u(t0 , x) +
u
t ,x
T, XT0
= u(t0 , x) +
t0
ZTs,1
t0 ,x
s,1
h XT
E ZT
As b s, Xst0 ,x dWs .
x
(15.2.17)
t ,x
t ,x
t,x
h XT
x
is an unbiased estimator of
u(t, x)
,
x
unlike nite dierence approximations. Note also that if
a Monte Carlo
method
t0 ,x
, t [t0 , T ],
is used to estimate the price functional u at the point s, Xs
t0 ,x
can be used, together with
then the same simulation trajectories for
X
t0 ,x
new ones for Z s,1 , to approximate u
at
s,
X
. This procedure is usually
s
x
more accurate since it is unbiased. It is also often more ecient, since only
one simulation run is required, compared to at least two separate simulation
runs needed for nite dierence estimates.
601
Martingale representations of the type (15.2.16)(15.2.18), under different conditions, have been obtained by Elliott & Kohlmann (1988) and
Colwell, Elliott & Kopp (1991) who use the Markov property and the differentiability of solutions of It
o SDEs with respect to the initial conditions.
Broadie & Glasserman (1997a) also derive explicit representations of hedge
ratios in the case where the payo structure is a standard European call
option and the diusion process X t0 ,x follows a one-dimensional geometric
Brownian motion. The result presented relies on certain smoothness conditions and an application of the Kolmogorov backward equation. It has the
advantage of being rather simple and straightforward. Furthermore, it can
also be extended to include stopping time boundaries and multi-dimensional
processes with jumps.
As noted by Colwell et al. (1991) similar results can be obtained by an
application of the Haussmann (1978) integral representation theorem. In fact,
Haussmanns integral representation theorem can be used for certain classes
of path dependent securities where the payo function h depends on whole
trajectories of the diusion process X t0 ,x . A proof of Haussmanns theorem
using Malliavin calculus techniques, which is related to the formula of Clark
(1970), is given by Ocone (1984), see also Davis (1980) and Haussmann (1979).
The use of Haussmanns integral representation theorem has the disadvantage of being less direct and the conditions of the theorem are more dicult to
establish. In fact, the results presented in this section are sucient for many
types of valuation and hedging problems that arise in practice. They can also
be extended to a wide class of path dependent options and to the case with
jumps.
602
t0 ,x
= u(t0 , x) +
u t , X t
i=1 j=1
= u(t0 , x) +
d
m
i=1 j=1
t
t0
t0
bi,j s, X ts0 ,x
u s, X ts0 ,x dWsj
xi
t0 ,x
dWsj
bi,j s, X ts0 ,x
u s , X s
xi
(15.3.2)
and u = u
, we
for t [t0 , T ]. Applying (15.1.10) and (15.1.11), with h = h
also have
Mt = E h , X t0 ,x At
t0 ,x
(15.3.3)
= u t , X t
so that u(, X t0 ,x ) = h(, X t0 ,x ).
Dene the operators L0 and x p L0 , p {1, 2, . . . , d}, by
L0 =
d
d
m
1 i,j k,j
+
ai
+
b b
,
s i=1 xi
2
xi xk
j=1
i,k=1
2
+
L0 =
xp
xp s i=1
d
ai
2
+ ai
xp xi
xp xi
d
m
k,j
2
2
3
1 bi,j k,j
i,j b
i,j k,j
+
b
+b
+b b
.
2
xp
xi xk
xp xi xk
xp xi xk
j=1
i,k=1
L0 u(t, x) = 0
xp
603
(15.3.4)
d,d,s,z
1,1,s,z
d
p,i,s,z
Zt
p=1
d
m
k,j
ak (t, X t )
(t, X t )
p,i,s,z b
dt +
Zt
dWtj (15.3.5)
xp
x
p
p=1 j=1
t,x
t,z
0 vi (s, X t,x , Z t,z ) ds
L
vi (, X , Z ) = vi (t, x, z) +
s
s
i
2
m
j=1
(15.3.7)
where
d
d
d
ak
Li =
+
a
+
zp,i
s
x
xp zk,i
p=1
=1
k=1
d
d
m
m
2
1 ,j k,j
+
b b
+
b,j
2
x
x
k
j=1
j=1
,k=1
j
L
i
d
=1
,j
,k=1
d
d
bk,j
+
zp,i
x
x
z
p
k,i
p=1
k=1
d
p=1
zp,i
bk,j
xp
2
,
x zk,i
604
0i vi (s, x, z) =
L
d
zp,i
p=1
d
d
2 u(s, x)
2 u(s, x)
+
a
zp,i
xp s
x xp
p=1
=1
d
d
zp,i
k=1 p=1
ak u(s, x)
xp xk
d
m
d
3 u(s, x)
1 ,j k,j
b b
zp,i
2
xp x xk
p=1
j=1
,k=1
d
m
bi,j
,k=1 j=1
d
zp,i
p=1
d
p=1
zp,i
bk,j
xp
2 u(s, x)
x xk
0
L u(s, x) = 0
xp
for p, i {1, 2, . . . , d}, then taking expectation on both sides of (15.3.7) yields
u(t, x) = vi (t, x, )
xi
t,
= E vi , X t,x
, Z
=E
d
u , X t,x
xp
Zp,i,t,
h , X t,x
xp
p=1
=E
d
p=1
Zp,i,t,
(15.3.9)
605
Martingale Representation
We assume
t,
E(|vi (, X t,x
, Z )|) <
The above expression for x u(t, x) can be substituted into (15.3.2) yieldi
ing
d
m
u , X t0 ,x = u(t0 , x) +
i=1 j=1
where
si = E
d
Zp,i,s,
p=1
si bi,j s, X ts0 ,x dWsj ,
(15.3.10)
t0
t0 ,x
s,Xs
h , X
.
xp
where
si
=E
d
p=1
Zp,i,s,
si bi,j s, X ts0 ,x dWsj ,
(15.3.11)
t0
h , X t0 ,x As
xp
.
Taking expectations on both sides of (15.3.11) and using the boundary condition
u(, X t0 ,x ) = h(, X t0 ,x )
we can infer that
u(t0 , x) = E(h(, X t0 ,x )).
Consequently, this martingale representation becomes
h
, X t0 ,x
d
m
= E h , X t0 ,x +
i=1 j=1
si bi,j s, X ts0 ,x dWsj , (15.3.12)
t0
606
sup |X ts0 ,x |2
< K1 <
(15.4.1)
s[t0 ,T ]
607
i
n,s
bi,j s, X ts0 ,x dWsj ,
t0
(15.4.3)
where
i
n,s
=E
d
Zp,i,s,
p=1
hn X t0 ,x As
xp
.
si bi,j s, X ts0 ,x dWsj
(15.4.4)
t0
t0
si bi,j s, X ts0 ,x dWsj , (15.4.5)
608
where
si
=E
d
Zp,i,s,
gp , X t0 ,x
As
p=1
and ZTp,i,s, is the unique solution of the SDE (15.3.5) with initial value at
time s [t0 , T ], as given by (15.3.8).
The above representation theorem allows, under general conditions, for the
payo structure to be expressed as a stochastic integral with respect to the
driving Wiener processes. Furthermore, it provides explicit functionals for the
corresponding hedge ratios. This enables these hedge ratios, as well as prices,
to be accurately computed.
A First Lemma
In the following, we will establish the representation theorem using two lemmas and some general results from measure theory and integration.
Lemma 15.4.2
Suppose that the payo function h satises conditions A1
and A2 and the random variables hn (, X t0 ,x ), n N , can be represented in
the form
m
hn , X t0 ,x = E hn , X t0 ,x +
j=1
j
n,s
dWsj ,
(15.4.6)
t0
n,s dWs = = 0,
lim =h , X
n
=
=
j=1 t0
2
where 2 =
Proof:
for all n N . This shows that the random variable |hn (, X t0 ,x )|2 is dominated by the variate K32 (1 + |X t0 ,x |2 ) for all n N . Now, from the growth
bound (15.4.1) we have
2
< K32 (1 + K1 ) < .
E K32 1 + X t0 ,x
(15.4.8)
The pointwise convergence of hn given by condition A1(i) means that
2 a.s.
lim hn , X t0 ,x h , X t0 ,x = 0.
609
(15.4.9)
In fact, the convergence here holds for all although we do not require
this stronger result. Combining (15.4.7), (15.4.8) and (15.4.9) we can apply
a version of the Dominated Convergence Theorem applicable to Lp spaces,
p > 0, or the Corollary to Theorem 2.6.3 in Shiryaev (1984), to obtain
2
lim E hn , X t0 ,x h , X t0 ,x = 0,
n
Furthermore,
E hn , X t0 ,x E h , X t0 ,x E hn , X t0 ,x h , X t0 ,x
=
=
= =hn , X t0 ,x h , X t0 ,x =1
(15.4.11)
for all n N , where 1 = E(| |) denotes the norm in the Banach
olders inequality f 1 f 2 for any f
space L1 (, AT , P ). Since by H
L2 (, AT , A, P ) then from (15.4.10) and (15.4.11) we can infer that
(15.4.12)
lim E hn , X t0 ,x E h , X t0 ,x = 0.
n
A Second Lemma
Let us prepare another result for the proof of the martingale representation
theorem.
j
Lemma 15.4.3
Let (nj = {n,s
, s [t0 , T ]})nN for each j {1, 2, . . . ,
m} be a sequence of stochastic processes, adapted to the ltration A and satisfying the conditions:
m
j
n,s
dWsj
j=1
t0
nN
610
j
sj | = 0
lim |n,s
j=1
sj dWsj
t0
is well-dened and
=
=
=m
=
m
=
=
j
j
j
j=
dW
dW
lim =
n,s
s
s
s = = 0.
=
n
= j=1 t0
=
j=1 t0
2
Proof: In this proof we will use some general arguments from measure theory and integration using the Banach space L2, = L2 ([t0 , T ], LAT , uL
P ), where L is the sigma-algebra of Lebesgue subsets of and uL is the
Lebesgue measure. We assume that this Banach space is equipped with the
norm
'
|f |2 duL P
f 2 =
[t0 ,T ]
j
n,s
dWsj =
t0
j
1s n,s
dWsj ,
t0
1 (nj 1
8
9
9
j
:
n2 )2 = E
611
1s (nj 1 ,s
nj 2 ,s )2
ds
t0
'
j
j
2
= E
(n1 ,s n2 ,s ) ds
t0
9
m
9
9 j
:E
(n1 ,s nj 2 ,s )2 ds
j=1
t0
8
9
2
9
m
9
=9
(nj 1 ,s nj 2 ,s ) dWsj
:E
j=1
t0
=
=
=m
=
m
=
=
j
j
j
j=
dW
dW
==
n1 ,s
s
n2 ,s
s=
=
=j=1 t0
=
j=1 t0
(15.4.13)
m
j
n,s
dWsj
j=1
t0
nN
612
1 nj 1 j 2 =
'
|nj j |2 duL P
D
'
|1 nj j |2 duL P
=
D
'
|1 nj j |2 duL P
[t0 ,T ]
= 1 nj j 2
so that from (15.4.14)
lim 1 nj 1 j 2 = 0.
(15.4.15)
for all s A. Therefore, from condition (b) in the statement of Lemma 15.4.3
we see that for each j {1, 2, . . . , m}
1 sj = 1 sj
a.s. for all s A.
Applying this result, Fubinis Theorem and recalling that 2 denotes
the norm of L2 (, AT , A, P ) we have for any integers j {1, 2, . . . , m} and
n N the relation
8
9
9
j
j
:
1 n 1 2 = E
T0
1s
j
n,s
sj
613
2
ds
t0
'
2
j
= E
1s n,s
sj ds
A
'
2
j
j
E 1s n,s s
ds
'
2
j
j
E 1s n,s s
ds
'
2
j
1s n,s
sj ds
= E
A
8
9
9
:
= E
2
j
j
1s n,s s ds
t0
'
= E
j
n,s
sj
2
ds
sj
dWsj =
=
2
t0
=
==
=
t0
j
n,s
dWsj
t0
=
=
(15.4.18)
Combining (15.4.15) and (15.4.18), which hold for each j {1, 2, . . . , m}, we
can infer that
=
=
=
=
m
=m
=
j
j
j
j=
=
n,s dWs
s dWs = = 0.
lim
n =
= j=1 t0
=
j=1 t0
2
Proof of the Martingale Representation Theorem
We will now show that the conditions required for an application of Lemmas
15.4.2 and 15.4.3 can be satised for suitable choices of processes n , n N ,
and under the Assumptions A1 and A2 of Theorem 15.4.1.
For integers j {1, 2, . . . , m}, n N and s [t0 , T ] dene
614
d
i
n,s
bi,j s, X ts0 ,x ,
(15.4.19)
si bi,j s, X ts0 ,x ,
(15.4.20)
i=1
sj =
d
i=1
i
where n,s
and si , i {1, 2, . . . , d}, n N are as given in the representations
(15.4.3) and (15.4.5), respectively.
Substituting (15.4.19) into (15.4.6) we obtain the representation (15.4.3),
which is assumed to be true by the hypothesis of Theorem 15.4.1. Also, as
given in the statement of Theorem 15.4.1 we assume that the payo function
h satises conditions A1 and A2. Consequently, by applying the results of
Lemma 15.4.2 we see that the random variables
m
T
j
n,s
dWsj ,
j=1
t0
hn , X t0 ,x
xp
(15.4.21)
=
= =
=
1 + =X t0 ,x =2 =Zp,i,s, =2
<
for s [t0 , T ]. Also the pointwise limit condition A1(ii) shows that
(15.4.22)
t0 ,x
t0 ,x p,i,s, a.s.
Z
= 0
gp X
lim
hn X
n xp
615
(15.4.23)
for s [t0 , T ]. These results demonstrate that the random variable n,p,i,s is
dominated by p,i,s , which has nite expectation by (15.4.22) and a P -almost
sure limit, as given by (15.4.23).
We can now apply the Dominated Convergence Theorem for conditional
expectations, see Shiryaev (1984) Theorem 2.7.2, yielding
t0 ,x
t0 ,x
a.s.
p,i,s,
gp X
hn X
(15.4.24)
lim E
Z
As = 0
n
xp
j
and sj given
for s [t0 , T ] and p, i {1, 2, . . . , d}. From the denition of n,s
by (15.4.19) and (15.4.20), respectively, and by applying (15.4.24) we can infer
that
a.s.
j
sj | = 0
(15.4.25)
lim |n,s
n
or equivalently
u
, X t0 ,x
= u(t0 , x) +
d
m
i=1 j=1
si bi,j s, X ts0 ,x dWsj
t0
616
A2* The functions hn and hn satisfy uniform linear growth bounds of the
form
(i)
|hn (x)|2 K32 1 + |x|2
(ii)
|hn (x)|2 K42 1 + |x|2
for all x and n N with K3 , K4 < .
x
h(x) = h(0) +
g(s) ds
617
(15.5.1)
for x , where both h and g satisfy linear growth bounds of the form
|h(x)|2 A0 (1 + |x|2 ),
(15.5.2)
|g(x)|2 B0 (1 + |x|2 )
(15.5.3)
for each i {1, 2, . . . , N }. These conditions and (15.5.1) show that h (x) =
g(x) for all x \{x1 , . . . , xN }. Note that standard payo functions such
as European calls or puts, where h(x) = (x K)+ or h(x) = (K x)+ ,
respectively, are absolutely continuous functions of the form (15.5.1).
Some Continuous Payo Functions
For a continuous function f : dene the function In,m (f ) : ,
for integers n 1, m 0 iteratively as follows:
In,0 (f )(x) = f (x)
x+ n1
In,m+1 (f )(x) = n
In,m (f )(s) ds
(15.5.4)
618
1 + |x|2
(15.5.6)
1
1 2
) 2(a2 + 2 ) 2 (a2 + 1)
n
n
x+ n1
|In,m+1 (h)(x)| n
|In,m (h)(s)| ds
x
1
x+ n
Am
1 + |s|2 ds
'
Am
2
1
1 + |x| +
n
Am
3 + 2 |x|2
Am+1
1 + |x|2 ,
(15.5.7)
m
n ].
(15.5.8)
619
1 + |x|2
(15.5.9)
x+ n1
=n
(In,m (h)) (s) ds,
(15.5.10)
x
x+ n1
g(s) ds.
(15.5.11)
=n
x
1
x+ n
Bm
1 + |s|2 ds
'
Bm
1
1 + |x| +
n
Bm
3 + 2 |x|2
2
Bm+1
1 + |x|2
(15.5.12)
1 + |x|2 ,
620
m
Then for all integers n > m
, g is continuous on the interval [x, x + n ].
m1
, as previously noted, and
For integers m 1, (In,m (h)) is of class C
hence is continuous on . From this fact, (15.5.10) and by using the Mean
Value Theorem we have for any integers n 1, m 1 the relation
m
n ].
(15.5.13)
(15.5.14)
for (t, x) [t0 , T ] and n N is of class C 1,m for all even integers
m 2. In particular, for m 4, un,m will be of class C 1,4 and hence,
using the results in Sect. 15.2 will admit an It
o integral representation
of the form (15.2.17) or (15.2.18). If (15.4.1) holds for the process X t0 ,x
and the linearized process Z s,1 , s [t0 , T ], given by (15.2.11), satises
621
(15.4.2), then by applying Theorem 15.4.1 we see that the valuation function u : [t0 , T ] given by (15.2.4) will also admit a representation
of the form (15.2.17) or (15.2.18). Thus, we have found an It
o integral rept ,x
t ,x
resentation of the random variable u(T, XT0 ) = h(XT0 ) with explicit
expressions for the integrand even in the case where the derivative of h is
discontinuous at a nite number of points.
This result, which includes a wide class of non-smooth payo functions,
justies the eort needed to deal with the technical diculties encountered in
the previous section.
622
max x (n)
fn (x) = (5 n)4d
Cn
Cn
Cn
Cn
4
y (k)
dy (1) . . . dy (4)
k=1
(15.6.2)
for x d , where (n) = ( n1 , n2 , . . . , nd ) d and Cn is the d-dimensional
1
given by
cube of length 5n
1
Cn = a d : a = (a1 , . . . , ad ) , ai 0,
for i {1, 2, . . . , d} .
5n
If we take = T , then we can remove the time parameter t from the formulation of conditions A1 and A2 given in Sect. 15.4. For example, condition A1(i)
becomes
A1(i) For each x d we have
lim hn (x) = h(x).
for x d .
Verifying Condition A1(i)
From the denition of the functions fn , given by (15.6.2), we see that
4
4d
(n)
(k)
y
|fn (x)| (5 n)
max x
dy (1) . . . dy (4)
Cn Cn Cn Cn
(5 n)4d
| max(x)| +
Cn
|x| +
k=1
d+1
n
Cn
Cn
Cn
d+1
n
dy (1) . . . dy (4)
(15.6.3)
for all x d and n N . Consequently, from the linear growth bound that
applies for In,4 (h) given by condition A2 (i) and the inequality (a + b)2
2(a2 + b2 ) we have
623
(15.6.4)
(15.6.5)
= h(max(x))
= h max(x)
(15.6.6)
624
(15.6.7)
i[1,d]
(i)
k=1
or
(n)
xi + i
4
(k)
yi
(n)
xj j
k=1
4
(k)
yj
k=1
x x
for all y (1) , . . . , y (4) Cn . This means that for 0 j 2 i and n > 2(d+1)
xj xi
one has the relation
4
4
(i)
(n)
(k)
(n)
(k)
max x +
= max x
(15.6.8)
y
y
k=1
k=1
4
(k)
yj
(k)
yi
1
5n
(n)
(n)
xj + i j
k=1
or
(n)
xi + i
4
(k)
yi
(n)
xj j
k=1
4
(k)
yj
k=1
for all y (1) , . . . , y (4) Cn and n N . Again this means that (15.6.8) holds
1
and y (1) , . . . , y (4) Cn .
for all 0 < 5n
Combining these two results and the denition of fn given by (15.6.2) we
have, in the case where i
= (x), and for suciently large n and suciently
small , the identity
fn (x + (i) ) fn (x) = 0.
Letting 0 we therefore obtain
lim
fn (x)
= 0.
xi
(15.6.9)
625
If i = (x), then xi = max(x) and xj < xi for j < i, j {1, 2, . . . , d}. Thus,
d+1
d+1
for j < i and n > minj<i
|xi xj | , so that n < minj<i |xi xj |, we have the
inequality
4
d+1
(n)
(n)
(k)
(k)
+
yi yj
xj +
xj + i j
xi ,
n
k=1
d+1
minj<i |xi xj | ,
(n)
then i
(n)
ij
n
4
(n)
(n)
(k)
(k)
xj + i j
+
yi yj
xj xi ,
k=1
xj j
4
(k)
yj
(n)
xi i
k=1
or
max x
(n)
4
4
(k)
yi
k=1
y
(k)
(n)
= xi i
k=1
4
(k)
yi ,
k=1
for all y (1) , . . . , y (4) Cn and hence for > 0 and suciently large n
4
4
(n)
(k)
(i)
(n)
(k)
max x +
y
yi .
= xi + i
k=1
k=1
fn (x)
= 1.
xi
(15.6.10)
(15.6.11)
626
fn (x)
= qi (x)
xi
for all x d . Thus by the chain rule, condition A1 (ii) and (15.6.5) we see
that
fn (x)
fn (x)
lim In,4 (h)
= lim (In,4 (h)) (fn (x))
n
n
xi
xi
= g(max(x)) qi (x)
= Qi (x)
(15.6.12)
for all x d and i {1, 2, . . . , d}. This proves that condition A1(ii) holds
in the time-independent case with = T for the approximating functions
fn (x)
xi 1.
627
for all x d and i {1, 2, . . . , d}, where K62 = 2 K42 (1 + (d + 1)2 ). This
veries that condition A2(ii) holds in the time-independent case with = T
(In,4 (h) fn ), n N .
for the approximating functions x
i
We will now assume that the valuation function un : [t0 , T ] d
given by
(15.6.13)
un (t, x) = E In,4 (h) fn XT1,t0 ,x , . . . , XTd,t0 ,x
for x = (x1 , . . . , xd )T d , t [t0 , T ] is of class C 1,3 . For example, using
Theorem 4.8.6 in Kloeden & Platen (1999) we see that if the drift ai and
diusion coecients bi,j have uniformly bounded derivatives and are of class
C 4,4 , then un will be of class C 1,4 . Thus, applying the results of the previous
1,t ,x
d,t ,x
o
sections, the payo function In,4 (h) fn (XT 0 , . . . , XT 0 ) will admit an It
integral representation of the form (15.4.3). Since the payo function hmax :
d with approximating functions hn = In,4 (h) fn , n N , satises
conditions A1 and A2 in Sect. 15.4 in the time-independent case with = T ,
we can apply Theorem 15.4.1 with = T to the functional h max and the
t ,x
diusion process X T0 yielding
d
m
t0 ,x
t0 ,x
h max X T
= E h max X T
+
=1 j=1
si bi,j s, X ts0 ,x dWsj ,
t0
(15.6.14)
where
si
=E
d
Qp
t ,x
X T0
ZTp,i,s,
As
p=1
for i {1, 2, . . . , d} and ZTp,i,s, is the unique solution of the SDE (15.3.5)
with initial value at time s [t0 , T ] as given by (15.3.8).
Thus, we have found an explicit representation of the payo structure
(15.6.1). This is the case when h is an absolutely continuous function of the
form (15.5.1) and the approximating valuation functions un , n N , are
suciently smooth.
628
is a payo function for which conditions A1 and A2 given in Sect. 15.5 hold.
We now consider, so called, lookback options with payo structures of the form
h
t ,x
Xt 0
sup
(15.7.1)
t[t0 ,T ]
To ensure that option prices for this payo structure are well-dened, we also
assume that the mean square bound (15.4.1) holds. Our aim will be to nd an
explicit representation for the payo structure (15.7.1) of the form (15.4.5).
We remark that nding an explicit It
o integral representation of the payo
structure for a lookback option is generally considered as one of the more dicult problems concerning the computation of hedge ratios for path-dependent
options. For example F
ollmer (1991) applies Clarks formula to obtain an explicit representation in the simple case, where X t0 ,x is a geometric Brownian
motion and h is the identity function.
Other path-dependent options, such as Asian options, can also be handled
using the methods developed in this section. In fact, the analysis is often
simpler because the state space may only need to be increased by a single
extra variable.
For integers m N and i {1, . . . , dm }, where dm = 2m , consider an
equidistant time discretization of the interval [t0 , T ] of the form t0 = 0 <
0
t0 ,x =
. We also denote by X
1 < . . . < dm = T with step size m = Tdt
m
t0 ,x = (X 1,t0 ,x , . . . , X dm ,t0 ,x ) , t [t0 , T ]} the unique solution of the dm {X
t
t
t
dimensional SDE with components
i,t ,x
i,t ,x
i,t ,x
dt + b t, Xt 0
dWt
(15.7.2)
dXt 0 = 1{t<i } a t, Xt 0
for t [t0 , T ], i {1, . . . , dm }, starting at time t0 with initial value x =
(x, . . . , x) dm . For simplicity, we will use the symbol x for the initial
t0 ,x at time t0 and x the initial value of the
value of the vector diusion X
one-dimensional diusion X t0 ,x at time t0 .
For each integer i {1, . . . , dm }, the evolution of the component process
X i,t0 ,x is stopped at time i and we can write
i,t0 ,x
Xt
t ,x
0
= Xt
i
(15.7.3)
for t [t0 , T ]. For the time discretization with step size m dene
m,t0 ,x
MT
and
,t0 ,x
MT
max
i{1,...,dm }
i,t0 ,x
XT
(15.7.4)
t ,x
= sup Xt 0 .
t[t0 ,T ]
hn
max
i{1,...,dm }
i,t0 ,x
XT
m,t ,x
= hn MT 0
629
(15.7.5)
for n N . As in Sect. 15.5 we assume that the valuation function un,m given
by
un,m (t, x1 , . . . , xdm ) = E hn MTm,t0 ,x
for x = (x1 , . . . , xdm ) dm and t [t0 , T ] is of class C 1,3 , and hence,
equation (15.6.14) admits a representation of the form
dm
m,t ,x
m,t ,x
hn MT 0
= E hn MT 0
+
i=1
si 1{s<i } b t, X ts0 ,x dWs ,
t0
(15.7.6)
where
si
=E
d
m
Qp
t0 ,x
X
T
ZTp,i,s,
As
p=1
=E
d
m
hn
m,t ,x
MT 0
t0 ,x Z p,i,s, As
qp X
T
T
p=1
(15.7.8)
630
(15.7.9)
dm
i=0
t0
m,t ,x
t0 ,x Z i,i,s, As
qi X
E hn MT 0
T
T
1{s<i } b s, X ts0 ,x dWs
m,t ,x
= E h n MT 0
dm
T
t0 i=0
m,t ,x
t0 ,x Z i,i,s, 1{s< } As
E hn MT 0
qi X
T
i
T
b s, X ts0 ,x dWs
(15.7.10)
for n, m N . For each m N and time discretization with step size m let
t0 ,x ), where : dm {1, . . . , dm } is given by (15.6.7). Using this
m = (X
T
function we dene m : by
m = m .
(15.7.11)
T
m,t ,x
+
E hn MT 0
1{s<m } As b s, X ts0 ,x dWs .
Zs,1
m
t0
,t ,x
MT 0
631
2
m,t0 ,x 2
m,t0 ,x
2
h n MT
K3 1 + M T
K32
t ,x 2
1 + sup X t0
.
(15.7.14)
t[t0 ,T ]
Note that this bound applies uniformly for both integer variables n and m.
In addition, from the mean square bound (15.4.1) we see that
t0 ,x 2
2
< .
E K1 1 + sup X T
t[t0 ,T ]
This result together with (15.7.13) and (15.7.14) means that we can exploit
a version of the Dominated Convergence Theorem applicable for Lp spaces,
p > 0, see for example the Corollary to Theorem 2.6.3 in Shiryaev (1984), to
obtain
=
=
=
m,t ,x
,t ,x =
h n MT 0 = = 0
(15.7.15)
lim =hn MT 0
m
(15.7.16)
for all n N .
Since dm = 2m for m N any discretization point i , i {1, . . . , dm },
belonging to the time discretization with step size m will be an element of
the set of discretization points for any time discretization with step size m
with m N and m m. This means that (
m )mN , is a non-decreasing
sequence of random times and thus limm m () exists for each . Let
lim m () = ()
(15.7.17)
(15.7.18)
632
Combining this result, (15.7.13) and by continuity of the function hn and the
sample paths of Z s,1 we can infer that
a.s.
m,t ,x
,t ,x
Zs,1
Zs,1
lim hn MT 0
1{s<m } = hn MT 0
1{s< } (15.7.19)
m
for any n N .
Applying the linear growth bound A2 (ii), we know that for integers n, m
N and s [t0 , T ]
)
2
m,t0 ,x
s,1
Z 1{s< } K4
1 + M m,t0 ,x Zs,1
hn M
T
8
9
2
9
t
,x
0
K4 : 1+ sup X t Zs,1
. (15.7.20)
m
t[t0 ,T ]
Also, applying the mean square bounds (15.4.1) and (15.4.2) together with
H
olders inequality we have
=8
8
=
9
=9
2
2 =
9
9
= = s,1 =
=
t ,x
=
=: 1+ sup X tt0 ,x = =
E K4 : 1+ sup X t0 Zs,1
K
4
m
= =Zm =2
=
t[t0 ,T ]
t[t0 ,T ]
=
=
2
<
(15.7.21)
for t [t0 , T ]. This inequality combined with (15.7.19) and (15.7.20) means
that we can apply the Dominated Convergence Theorem for conditional expectations, see for example Shiryaev (1984), Theorem 2.7.2, to obtain
a.s.
m,t ,x
,t ,x
lim E hn MT 0 Zs,1
1{s<m } As = E hn MT 0
1{s< } As
Zs,1
m
(15.7.22)
for all n N and s [t0 , T ].
Now dene the process m , m N , and by
m,t ,x
t0 ,x
m = E hn MT 0
,
1
Zs,1
{s<
m } As b s, X s
m
,t ,x
= E hn MT 0
1
Zs,1
As b s, X ts0 ,x .
{s<
lim m =
633
T
,t ,x
Zs,1
E hn MT 0
1{s< } As b s, X ts0 ,x dWs .
+
t0
We have thus found an explicit representation for a lookback option for the
smooth payo function hn , n N .
Representation for Non-Smooth Payos
Fortunately, these methods also extend to the case of a non-smooth payo
function h. In fact, using conditions A1 , A2 (i) and similar arguments to
those given for the proof of (15.7.15), (15.7.16), (15.7.19) and (15.7.20) we
can show
=
=
=
,t ,x
,t ,x = a.s.
h MT 0 = = 0,
lim =hn MT 0
n
2
a.s.
,t0 ,x
,t0 ,x
E h MT
lim E hn MT
= 0,
n
a.s.
,t ,x
,t0 ,x
Zs,1
Zs,1
lim hn MT 0
1
=
g
M
1{s< } ,
{s<
n
8
9
2
9
,t0 ,x
t
,x
s,1
Z 1{s< } K4 : 1 + sup X t0 Zs,1
hn MT
.
t[t0 ,T ]
(15.7.24)
The last two relations and (15.7.21) show that
a.s.
,t ,x
,t ,x
lim E hn MT 0
Zs,1
Zs,1
1{s< } As = E g MT 0
1{s< } As .
(15.7.25)
The rst two equations in (15.7.24) and relation (15.7.23) prove that the It
o
integral in (15.7.23) is a Cauchy sequence in L2 (, AT , A, P ) for n N . Also
from (15.7.25) we see that
,t ,x
Zs,1
lim E hn MT 0
1{s< } As b s, X ts0 ,x
,t ,x
As b s, X ts0 ,x .
Zs,1
= E g MT 0
1
{s<
}
a.s.
(15.7.26)
634
Furthermore, the random variables on both sides of the above equation are
clearly As -measurable, and consequently, by an application of Lemma 15.4.3
and using again the rst two equations in (15.7.23) we can infer that
,t ,x
,t ,x
h(MT 0 ) = E h MT 0
(15.7.27)
T
,t ,x
t0 ,x
Zs,1
dWs .
E g MT 0
1
+
{s<
} As b s, X s
t0
T
m,t ,x
+
E g MT 0
1{s<m } As b s, X ts0 ,x dWs .
Zs,1
m
t0
15.8 Exercises
635
15.8 Exercises
15.1. Consider a market with zero interest rate and a geometric Brownian
motion S = {St , t [0, T ]} as risky security, satisfying the SDE
dSt = St dWt
for t [0, T ] with S0 > 0. Here W is a Wiener process on a ltered probability
space (, AT , A, P ) under the equivalent risk neutral martingale measure P .
Identify for a payo HT = (ST )q with q (0, ) the martingale representation
involving the corresponding pricing function u(t, St ) at time t [0, T ].
15.2. Calculate for the pricing problem in Exercise 15.1 the hedge ratio dependent on time t and asset price St .
15.3. Derive for the risky asset price given in Exercise 15.1 the martingale
representation and hedge ratio for a European call option price cT,K (t, St ) at
time t [0, T ) when using the Black-Scholes formula.
16
Variance Reduction Techniques
637
638
Maltz & Hitzl (1979), Rubinstein (1981), Ermakov & Mikhailov (1982), Ripley (1983), Kalos & Whitlock (1986), Bratley, Fox & Schrage (1987), Chang
(1987), Law & Kelton (1991) and Ross (1990).
The application of Monte Carlo simulation to option pricing problems was
rst demonstrated in the seminal work of Boyle (1977). The basic Monte
Carlo method has now become standard in quantitative nance but also in
insurance, and the associated literature has rapidly expanded. Its variants
can also handle high-dimensional problems, which is dicult to impossible by
most other numerical techniques. Some publications, which extend the Monte
Carlo method to include alternative variance reduced estimators, path dependent options or quasi random number sequences, include Niederreiter (1988a),
Hofmann et al. (1992), Boyle, Broadie & Glasserman (1997), Broadie & Detemple (1997a), Broadie & Glasserman (1997b), Fu (1995), Grant, Vora &
Weeks (1997), Joy, Boyle & Tan (1996), J
ackel (2002), Kohatsu-Higa & Petterson (2002), Glasserman (2004) and Kebaier (2005). Longsta & Schwartz
(2001) have proposed a least-squares Monte Carlo method that can be applied
to path dependent securities including American style derivatives.
The use of measure transformations, to construct a range of variance
reduced estimators, was rst proposed by Milstein (1988a) and has subsequently been used and extended by Kloeden & Platen (1999), Hofmann et al.
(1992), Newton (1994), Heath (1995), Fournie, Lasry & Touzi (1997) and others. A particular It
o integral representation method has been developed in
Heath & Platen (2002d). It provides a continuous time extension of the martingale control variates described in Clewlow & Carverhill (1992) and will be
detailed in Sect. 16.5.
In what follows, we rst mention several more classical Monte Carlo variance reduction techniques. Later we will point to variance reduction methods
that use martingale representations or measure transformations for functionals of SDEs and exploit the stochastic analytic structure of the problem.
Antithetic Variates
We describe at rst a variance reduction technique that is based on the use
of antithetic variates. This technique is standard and has been applied, for
example, in Due & Glynn (1995), Hull & White (1987, 1988), Clewlow &
Carverhill (1992, 1994) and Barraquand (1995). A description of its use in
general Monte Carlo simulation is given by Ross (1990) and Law & Kelton
(1991). Here we describe a version of antithetic variance reduction, which
together with other variance reduction procedures will be used later.
The construction of antithetic variates can be illustrated using the probability space (, AT , A, P ) where = C([t0 , T ], 2 ) is the space of twodimensional continuous functions on [t0 , T ]. For we denote by 1 (t) and
2 (t) the two components of (t) 2 so that (t) = (1 (t), 2 (t)) 2 for
t [t0 , T ]. We assume that P is the two-dimensional Wiener measure under
which the coordinate mappings Wt1 () = 1 (t) and Wt2 () = 2 (t) dene
639
t ,x
For , dene
by (t)
= (1 (t), 2 (t)) , t [t0 , T ] and the
t0 ,x
(16.1.2)
h
T
T
for .
The variate
X t0 ,x
X t0 ,x = 1 h X t0 ,x + h
h
T
T
T
2
t ,x
t0 ,x )) = E(h(X t0 ,x )). In
is an unbiased estimator for E(h(XT0 )) since E(h(X
T
T
addition
1
t ,x
X t0 ,x
X t0 ,x
=
+ Var h
Var h XT0
Var h
T
T
4
t ,x
X t0 ,x
,h
+ 2 Cov h XT0
T
t ,x
t0 ,x ) will typically be
and, because the random variables h(XT0 ) and h(X
T
t
,x
0
640
conditional variance of X with respect to G, denoted Var(X G), is dened
by
2
Var X G = E X E X G
G
2
= E X2 G E X G
.
(16.1.4)
Let G1 , G2 be sub-sigma-algebras of AT with G1 G2 . Using the law of iterated conditional expectations, see (1.1.10), and the denition of conditional
variance we have for any integrable random variable X the relations
2
.
Var X G1 = E X 2 G1 E X G1
2
= E E X 2 G2 G1 E E X G2 G1
2
= E E X 2 G2 G1 E E X G2
G1
2
2
E X G2
G1 E E X G2 G1
= E Var X G2 G1 + Var E X G2 G1 . (16.1.5)
+E
(16.1.6)
this variate is an unbiased estimator for E(h(XT0 )). By applying the conditional variance formula (16.1.6) we obtain the inequality
t ,x
t ,x
At
Var h XT0
(16.1.7)
Var E h XT0
for t [t0 , T ]. In
general, the inequality in (16.1.7) will be strict. In this
t ,x
case E(h(XT0 ) At ) will provide a variance reduced unbiased estimator for
t ,x
E(h(XT0 )). This classical type of error minimization is known in its standard
formulation as variance reduction by conditioning, see Law & Kelton (1991).
Stratied Sampling
We will now turn our attention to the variance reduction technique of stratied
sampling. This is an error minimization technique of long standing which has
641
been widely used in classical Monte Carlo simulation, see for example Ross
(1990). Surprisingly, the method seems to have been underutilized for nancial
modeling problems. However, Curran (1994) has described a simple form of
stratied sampling for Asian options and geometric Brownian motion. We will
describe another version of stratied sampling which can be applied to a wide
class of SDEs and valuation functionals.
Let Ai AT , i {1, 2, . . . , N }, be a set of disjoint events satisfying
N
>
Ai = ,
Ai Aj =
i=1
P (Ai F )
P (Ai )
N
1
ZAi ,
Z =
N i=1
N
N
1
E(ZAi ) =
Z dP =
Z dP = E(Z),
N i=1
i=1 Ai
(16.1.8)
Z will be an unbiased estimator for E(Z). Furthermore, using the independence property of ZAi , i {1, 2, . . . , N }, we have from (16.1.6)
=
Var(Z)
N
Var(ZAi )
N2
i=1
1
E(Var(Z A))
N
1
Var(Z).
(16.1.9)
N
This inequality will be strict if Var(E(Z A)) > 0. Consequently, if we set
=
t ,x
Z = h(XT0 ),
642
then we obtain from (16.1.8) an unbiased estimator for E(h(XT0 )), which
from (16.1.9) will usually be a variance reduced estimator.
As an example let Y be a simplied weak Euler approximation, see
(11.2.2), of a given one-dimensional diusion process, say, X t0 ,x as given in
k , k {0, 1, . . . , N 1}, are two-point distributed ran(16.1.1), where the W
dom variables given by (11.2.4). Since we are using two-point variates with
N time steps, we can replace for theoretical purposes the underlying sample
space with
(16.1.10)
N = {1, 1}{0,1,...,N1} .
For each N we denote by k , k {0, 1, . . . , N 1}, the value of
= {W
k,
at the kth step. For N the corresponding random walk W
m
j
bj (t, X s,x
t ) dWt .
(16.1.11)
j=1
u(s, x) = E g(X s,x
T ) As
643
(16.1.12)
(16.1.13)
u(T, y) = g(y)
(16.1.14)
d
d
m
1 k,j ,j
+
ak
+
b b
.
k
k
s
x
2
x x
j=1
k=1
k,=1
In Milstein (1988a) the use of the Girsanov transformation, see Sect. 2.7,
was proposed to transform the underlying probability measure P such that
dened by
the process W
t
j
j
0,x dz
Wt = Wt
(16.1.15)
dj z, X
z
0
(16.1.16)
(16.1.17)
j=1
m
m
0,x
0,x
0,x
0,x dW j
dt +
= a t, X
dj t, X
bj t, X
bj t, X
t
t
t
t
t
j=1
j=1
(16.1.18)
644
0,x
=
dP
E g XT
g X 0,x
T
0,x
dP
g X
T
0,x
T
g X
dP
T
0
0,x T
= E g XT
.
0
=
(16.1.19)
(16.1.20)
g X
T
0
to evaluate the functional (16.1.12). So far, our analysis does not depend on
the particular choice of the functions dj , j {1, 2, . . . , m}. Therefore, we can
use the process as a control process to reduce the variance of the random
variable (16.1.20).
Now, let us study the following idealized situation, which is interesting
from a theoretical viewpoint. Assume that u(t, x) > 0 everywhere and that
the corresponding solutions of (16.1.17) and (16.1.18) exist. Furthermore, let
us choose the parameter functions dj in the form
dj (t, x) =
d
1
u(t, x)
bk,j (t, x)
u(t, x)
xk
(16.1.21)
k=1
for all (t, x) [0, T ] d and j {1, 2, . . . , m}. Then, with the Ito formula
and by using (16.1.17) and (16.1.18), (16.1.21) and (16.1.13), it can be shown
that
0,x t = u(0, x) 0
(16.1.22)
u t, X
t
u(0, x) = g X
.
T
0
(16.1.23)
Consequently, with the choice of the parameter functions in the form (16.1.21)
the variable
0,x
T
(16.1.24)
g X
T
0
is not random and the variance is even fully reduced to zero.
645
for all (t, x) [0, T ] d and j {1, 2, . . . , m}. Then the quantity
0,x
T
g X
T
0
is still a random variable, but in general, with small variance if u
is chosen
suciently close to u.
The weak discrete-time approximations presented in Chap. 11 can be read 0,x
ily applied. They provide weak approximations of the SDEs governing X
and to estimate the functional
0,x T
.
E g XT
= E g X 0,x
T
0
The measure transformation variance reduction technique is a rather exible method with substantial potential, as will be indicated in the following
section.
646
Unbiased Estimators
As in Chap. 15 let W = {W t = (W 1 , . . . , W m ) , t 0} be an m-dimensional
Wiener process dened on the ltered probability space (, AT , A, P ), where
A = (At ) t[t0 ,T ] denotes the P -augmentation of the natural ltration of W .
Here At0 is assumed to be the trivial sigma-algebra. Let M = {Mt , t [t0 , T ]}
be a square integrable (A, P )-martingale with respect to the ltration A and
measure P with the Kunita-Watanabe representation, that is, the martingale
representation
m
t
sj dWsj ,
(16.2.1)
Mt = Mt0 +
j=1
t0
T
s s ds
<
(16.2.2)
t0
1 T
d ds ds
< ,
(16.2.3)
E exp
2 t0 s
647
see Sect. 2.7 and Novikov (1972a). Also for t [t0 , T ], let Pt denote the
restriction of the measure P to the sigma-algebra At . Using the Girsanov
transformation, see Sect. 2.7, we know that there is a corresponding measure
= {W
t = (W
1, . . . , W
m ) , t 0} given by
Pt such that the process W
t
j
j
Wt = Wt
djs ds
(16.2.4)
t0
m
t
1
t
dPt
= t = exp
d
djs dWsj
s ds ds +
2 t0
dPt
j=1 t0
and = {t , t [t0 , T ]} is the process which is the solution of the integral
equation
m
t
djs s dWsj
(16.2.5)
t = 1 +
j=1
t0
Pt (A) =
t dPt =
t dPT =
E(T At ) dPT =
T dPT = PT (A).
A
j=1
j,
sj dW
s
T
E
ds < .
s
(16.2.6)
t0
(16.2.7)
t0
For the moment we will regard the process as being unspecied, except for the requirement that be A-predictable and that the mean square
integrability condition (16.2.7) holds. We assume that
648
M
T ) = E(MT ).
E(
(16.2.8)
This condition will be veried later for a wide class of contingent claims.
Equation (16.2.8) together with (16.2.1) and (16.2.6) imply the relations
M
T ) = E(
M
t)
E(Mt ) = E(MT ) = E(
(16.2.9)
for t [t0 , T ].
Since the restrictions of the measures P = PT and P = PT to At are Pt
Pt
and Pt , respectively, we have, using the Radon-Nikodym derivative t = ddP
,
t
the relations
M
t) =
t dPT
E(
M
t dPt
M
t t dPt
M
t t dPT
M
t t )
= E(M
(16.2.10)
for t [t0 , T ]. Combining (16.2.9) and (16.2.10) we see that the process
= {M
t t , t [t0 , T ]} is an unbiased estimator under P for E(MT ) at
M
time t [t0 , T ].
Variance of the Estimator
under the measure
We will now compute the variance of the estimator M
as a semimartingale using
P . To do this we rst express the martingale M
.
the Wiener process W rather than W
Applying (16.2.4) and (16.2.6) we have
t = M
t
M
0
j=1
djs sj
ds +
t0
j=1
sj dWsj .
(16.2.11)
t0
by the It
Expanding the estimator M
o formula together with (16.2.5)
t t becomes
and (16.2.11), the integral equation for M
t t = M
t +
M
0
j=1
for t [t0 , T ].
t0
s ) dW j
s (sj + djs M
s
(16.2.12)
649
m
t
s ) dW j
t t ) = E
s (sj + djs M
Var(M
s
j=1
E s2
t0
t0
m
s
sj + djs M
2
ds
(16.2.13)
j=1
for t [t0 , T ].
t > 0 holds P -a.s. for t [t0 , T ] and we
Consequently, if the inequality M
choose
j
djs = s
s
M
for all j {1, 2, . . . , m} and s [t0 , T ], then the variance of the random
t t under P is reduced to zero for any t [t0 , T ]. Of course, in this
variable M
form we have assumed that the integrand process and the martingale M
can somehow be determined. In practice, this condition is dicult to satisfy.
They usually cannot be easily computed even in the case where an explicit
2
m
t
Ms
t t ) = E
Var (M
s sj sj
dWsj
M
s
j=1 t0
=
t0
E s2
m
j=1
s
M
sj sj
s
M
2
ds.
(16.2.15)
650
Var(Mt t ) may become unacceptably large. Stability problems can also easily
t t is being approximated using stochastic numerical methods and
arise if M
is not constrained.
is
To see how these problems can be controlled, suppose the process M
bounded from below by the value > 0, that is,
t ()
M
(16.2.16)
for all (t, ) [t0 , T ] and the approximation satises a mean square
integrability bound of the form (16.2.7). Then for a suitably large choice of
, the process d = (d1 , . . . , dm ) , given by (16.2.14) will be small, in a mean
square sense, since
j
1 j
t
j
|dt | =
| |
t t
M
for t [t0 , T ] and j {1, 2, . . . , m}. This will ensure that the exponential
Pt
t = ddP
, t [t0 , T ] does not become unbounded but remains close to 1.
t
Unfortunately, for most derivative security valuation problems a lower
bound condition of this type will usually not hold. However, if we apply the
and j, given by
, M
above analysis to the processes M
= +M
t,
M
t
t = + M
t,
M
tj, = tj ,
(16.2.17)
tj,
t
M
for t [t0 , T ], j {1, 2, . . . , m}, and the above arguments we can generally ensure that the corresponding process will be close to 1. Con and j, ,
t t ) will be small for the approximations M
sequently, Var(M
j {1, 2, . . . , m}, and suciently large.
t
We remark that, in the above construction, we could dene t and M
directly by equations (16.2.5) and (16.2.11). This would enable us to build
estimators of the form (16.2.12) without measure transformation. However,
we also need to verify condition (16.2.8) and have a means for building the
, so that the estimator M
t t based on (16.2.12) can be computed. In
process M
practice, for most types of nancial evaluation problems this can be achieved
via a measure transformation. Also the machinery developed in this section
is needed in the next section when we construct other variance reduced estimators.
651
Note that the above variance reduction procedure can be applied itera is a martingale under P with an It
tively because the estimator M
o integral
representation of the form (16.2.12). We assume, of course, that the integrands
in (16.2.12) satisfy mean square integrability conditions of the form (16.2.7).
, then we can apply the above procedure to
Thus, if we let M (1) = M
(1)
rather than the original M . The new estimator will be of the form
M
(1) (1) for a suitably chosen process (1) . In a similar fashion,
M (2) = M
additional estimators M (3) , . . . , M (k) for some positive integer k can also be
built.
For a large class of evaluation problems in quantitative nance and insurance we can nd good analytic approximations for the valuation process M
and integrands j with j {1, 2, . . . , m}, see Sects. 3.5 and 3.6. Using these
one can construct the estimator M (1) . To construct M (2) we require approximations of the integrands (j + dj M (1) ), j {1, 2, . . . , m} appearing in
(16.2.12). Analytic approximations are usually more dicult to nd for this
type of integrand.
Variance of Alternative Estimator
Let us now compute the variance of the estimator M 1 under the measure
t
P . From the relation t1 = dP
we have
dP
t
E(Mt ) =
Mt dPT
Mt dPt
Mt t1 dPt
Mt t1 dPT
t t1 )
= E(M
(16.2.18)
for t [t0 , T ]. This expression can be compared to (16.2.10) for the estimator
. Since E(Mt ) = E(MT ) the process M 1 = {Mt t1 , t [t0 , T ]} is,
M
thus, an unbiased estimator under the measure P = PT for E(MT ) at time t,
, and in particular,
t [t0 , T ]. This result does not use the (A, P )-martingale M
does not require condition (16.2.8) to be veried.
We now express the random variables Mt and t as semimartingales using
under P rather than W . Using (16.2.1), (16.2.4) and
the Wiener process W
(16.2.5) we have
652
Mt = Mt0 +
sj djs ds +
t0
j=1
t = 1 +
(djs )2 s ds +
t0
j=1
j=1
t0
j=1
sj ,
sj dW
(16.2.19)
j.
djs s dW
s
(16.2.20)
t0
= Mt0 +
j=1
j.
s1 (sj djs Ms ) dW
s
(16.2.21)
t0
2
m
t
2
1
;
j
Var(M
s2 sj djs Ms dW
t t ) = E
s
j=1
=
t0
2
E
s
t0
m
2
sj djs Ms ds.
(16.2.22)
j=1
If the inequality Mt > 0 holds P a.s. for all t [t0 , T ] and we choose
djt =
tj
Mt
653
A Class of Estimators
For many derivative security pricing problems we are given an AT -measurable
random variable H : and a valuation martingale process of the form
Mt = E H A t
for t [t0 , T ]. This provides a general framework for modeling various types
of asset dynamics, contingent claims and sources of uncertainty. Because our
variance reduction procedures, obtained under P , require only M to be square
integrable, they can therefore be applied to a wide class of evaluation problems
covering risk neutral pricing and real world pricing.
We will now consider a more explicit form for the martingale process M
given by (16.2.1). Our task will be to verify condition (16.2.8) and nd another
t t ) than given by (16.2.13). Let X t0 ,x = {X t0 ,x , t
expression for Var (M
t
[t0 , T ]} be a d-dimensional diusion process which satises the system of SDEs
(15.1.1) starting at time t0 with initial value x d . As usual, we assume
that appropriate growth and smoothness conditions apply for the drift and
diusion coecients so that the solution to (15.1.1) is unique and Markovian.
We also take the valuation function u : 0 1 , as given by (15.1.4),
with u = u
for some suitable choice of payo functional h : 1 with h =
The regions 0 and 1 and stopping time are as given in Sect. 15.1. Our
h.
task is, therefore, to construct variance reduced estimators for the martingale
M given by
t0 ,x
(16.2.23)
Mt = E h , X t0 ,x At = u t , X t
for t [t0 , T ], see (15.1.11). We assume that appropriate growth bounds apply
for h so that M is square integrable under P .
Suppose that the process d = (d1 , . . . , dm ) has the form
t ,x
djt = dj (t, X t0 )
(16.2.24)
t
t0
m
t
j
t0 ,x dW
t0 ,x ds +
sj .
s,
X
a s, X
b
s
s
j=1
(16.2.25)
t0
654
t0 ,x = x +
X
t
t0
a s, X
s
,x
t0
t0 ,x dj s, X
t0 ,x ds
bj s, X
s
s
j=1
j=1
m
t0 ,x dW j .
bj s, X
s
s
(16.2.26)
t0
(16.2.27)
(16.2.29)
L0 =
655
d
d
m
1 i,j k,j
+
ai i +
b b
,
t i=1 x
2
xi xk
j=1
i,k=1
d
m
d
m
2
1 i,j k,j
0
i
i,j j
L =
+
a
b d
+
b b
,
t i=1
xi
2
xi xk
j=1
j=1
i,k=1
Lj =
d
i=1
bi,j
.
xi
and
t0 ,x
If we expand u(t , X
os formula for semimartingales and
t
) using It
(16.2.25), then we have from (16.2.30) the relation
m
t
t0 ,x = u(t0 , x) +
t0 ,x dW
j.
u t , X
Lj u s, X
(16.2.31)
t
s
s
j=1
t0
Writing this as an integral equation using the Wiener process W rather than
, we obtain from (16.2.4) the expression
W
m
t
j
t0 ,x = u(t0 , x)
t0 ,x Lj u s, X
t0 ,x ds
u t , X
s,
X
d
t
s
s
j=1
j=1
t0
t0 ,x dW j .
Lj u s, X
s
s
(16.2.32)
t0
t0 ,x
os formula and
In a similar manner we can determine u
(t , X
t
) using It
(16.2.25) to obtain
t
0
t0 ,x ds
t0 ,x = u(t0 , x) +
L
u
s,
X
u
t , X
t
s
t0
m
t
j=1
t0 ,x dW
j,
Lj u
s, X
s
s
t0
(16.2.33)
656
t0 ,x
u
t , X t = u(t0 , x) +
t0 ,x
s, X
L0 u
s
t0
j=1
j=1
t
,x
t
,x
0
0
ds
Lj u
dj s, X
s, X
s
s
t0
t
t0 ,x dW j .
Lj u
s, X
s
s
(16.2.34)
t0
j=1
.
t0 ,x dj s, X
t0 ,x dW j ,
t0 ,x + u s, X
s Lj u s, X
s
s
s
s
t0
t0 ,x
(t0 , x) +
u
t , X t t = u
t0 ,x s ds
L0 u
s, X
s
(16.2.36)
t0
j=1
.
t0 ,x dj s, X
t0 ,x dW j
t0 ,x + u
s,
X
s Lj u
s, X
s
s
s
s
t0
we see that
t0 ,x At
t = E
h , X
M
t0 ,x At
u , X
=E
= u(t0 , x) +
j=1
t0 ,x dW
sj
Lj u s, X
s
t0
t0 ,x .
= u t , X
t
(16.2.37)
This relation can also be derived using more general arguments as given in
Sect. 15.1, see in particular (15.1.11).
Unbiased Estimator
os formula (16.2.30) and (15.1.1), then we
If we expand u(, X t0 ,x ) using It
have
657
Lj u s, X ts0 ,x dWsj .
t0
(16.2.39)
t0 ,x ) t , t [t0 , T ]} is an
This expression shows that the process {u(t , X
t
at time .
Variance of the Estimator
Assume that u
(t, x) > 0 for all (t, x) 0 and choose the function d =
(d1 , . . . , dm ) by
(t, x)
Lj u
dj (t, x) =
(16.2.40)
u
(t, x)
for (t, x) 0 and j {1, 2, . . . , m}. Using this choice for d, together with
t0 ,x ) under P by
equation (16.2.36), we can compute the variance of u
(
, X
the relation
t0 ,x
t0 ,x
0
Var u
, X
= Var
s ds .
L u
s, X s
t0
t0 ,x
t0 ,x
t0 ,x
0
0
= Var
L u
L u s, X
s ds ,
s, X
Var u
, X
t0
(16.2.41)
which is a more convenient representation of the variance for some applications. This should be compared with the variance obtained from (16.2.35),
which is given by the equation
658
t0 ,x
Var u
, X
2
t0 ,x
m
u
s,
X
s
t0 ,x Lj u
t0 ,x
dWsj
= E
sLj u s, X
s, X
s
s
t0 ,x
u
s, X
j=1 t0
s
t0 ,x 2
m
u s, X s
t0 ,x Lj u
t0 ,x
Lj u s, X
ds
= E s2
s, X
. (16.2.42)
s
s
t0 ,x
t0
u
s, X s
j=1
This relation can also be obtained by substituting (16.2.40) into (16.2.13).
Thus, for a suitable choice of the function d we can make the random variable
t0 ,x ) an unbiased variance reduced estimator for E(MT ) at time .
u
(
, X
The methods described above provide the basis for powerful variance reduction and error minimization procedures. The explicit formulas for the variance of estimators given by (16.2.13), (16.2.22), (16.2.41) and (16.2.42) mean
that exact controls of the variance can be built and theoretical estimates can
be given.
We emphasize that the smoothness conditions used in the derivation of
(16.2.41) and (16.2.42) are not required in the more general formulations
leading to (16.2.13) and (16.2.22). Also, condition (16.2.8), required for the
, can often be easily veried in practice for reasonable choices
estimator M
659
to be
(16.2.1) and satisfying the integrability condition (16.2.2). We take M
(16.3.1)
(T t0 )
N
and
for N N . Furthermore, let M = {Mk , k {0, 1, . . . , N 1}}, M
k
and given
k , represent weak Euler approximations of the processes M , M
by (16.2.1), (16.2.11) and (16.2.5), respectively, of the form
Mk+1
= Mk +
m
kj Wkj ,
(16.3.2)
j=1
= M
M
k+1
k
m
j=1
= k +
k+1
m
djk kj
m
kj Wkj ,
(16.3.3)
j=1
djk k Wkj ,
(16.3.4)
j=1
= M0 and = 1.
for k {0, 1, . . . , N 1} with initial values M0 = M0 , M
0
0
1
m
We will take the increments (Wk , . . . , Wk ) , k {0, 1, . . . , N 1}, to be
either a collection of independent Gaussian random variables with expectation zero and variance under P , or a collection of multi-point random
variables, again with expectation zero and variance under P . We require
values for only the rst and second moments of these increments in what follows. For example, we can take a set of two-point distributed random variables
with
1
(16.3.5)
P Wkj = =
2
for k {0, 1, . . . , N 1} and j {1, 2, . . . , m}.
can be expressed
Multiplying (16.3.3) by (16.3.4) the product M
k+1 k+1
in the form
660
k+1
k k
M
k+1
=M
m
djk kj k +
j=1
m
m
kj k Wkj
j=1
k dj k W j
M
k
k
j=1
m
dik ki
j=1
m
djk k Wkj
j=1
m
m
+
djk k Wkj .
kj Wkj
j=1
(16.3.6)
j=1
We can write
m
m
m
2
djk k Wkj =
kj Wkj
kj djk k Wkj
j=1
j=1
j=1
m
j1 ,j2 =1
j1 =j2
m
m
m
j
j
j
j
j
j
dk k Wk = E k
k Wk
k dk .
E
j=1
j=1
j=1
From this result, again using the independence property of the increments
Wkj and taking expectations on both sides of (16.3.6) and the initial value
0 = 1, we obtain
= E M
= M0
=
E
M
E M
(16.3.8)
k+1 k+1
k k
0
0
is an unbiased estimator for
for k {0, 1, . . . , N 1}. This means that M
E(MN ) = M0 at time k , k {0, 1, . . . , N 1}.
Variance of Discrete-Time Estimators
Applying once again the independence property of the increments Wkj , we
see that for all k, {0, 1, . . . , N 1}, j1 , j2 {1, 2, . . . , m} with < k
the random variables Wkj1 and Wkj1 Wkj2 are both independent of any
Borel measurable functions of the variates djk1 , kj1 , k , Mk , dj2 , j2 , , M ,
Wj1 , j2 .
In particular, (Wkj1 )2 is independent of djk1 , kj1 , k , dj2 , j2 ,
[(Wj2 )2 ].
661
m
m
.
.
j
j
j
j
j
j
Ck, = E
dk k k (Wk )2
d (W )2
j=1
m
j=1
..
E djk1 kj1 k dj2 j2 (Wkj1 )2 (Wj2 )2
j, j2 =1
m
. .
E djk1 kj1 k dj2 j2 (Wj2 )2 E (Wkj1 )2
j, j2 =1
= 0.
(16.3.9)
Also, for any k {0, 1, . . . , N 1} and using the initial value 0 = 1 we can
write
k1
M
=
+ M0
M
M
k
+1 +1
=0
, k {1, 2, . . . , N } can
so that by applying (16.3.9), the variance of M
k k
be computed from the formula
k1
2
M
=
M
Var M
E
.
k k
+1 +1
(16.3.10)
=0
For small values of we can ignore all terms in the product (M
+1 +1
2
) , {1, . . . , k 1} obtained from (16.3.6) whose expectation under
M
P is zero or of order q for q 32 . This means that all terms on the right
hand side of (16.3.6), except
m
kj k Wkj
and
j=1
m
dj W j ,
M
k
k k
k
j=1
k1
m
2
2
.
)
Var(M
(16.3.11)
E
j + dj M
k k
=0
j=1
662
be approximations
As in Sect. 16.2 we let = (1 , . . . , m ) and M
1
m
2
m
M
2
)
Var(M
E
kj j
k k
M
j=1
=0
k1
(16.3.12)
for k {1, 2, . . . , N }.
The variance formulas (16.3.11) and (16.3.12) for the estimator M
should be compared to the variance of the continuous time version of the
estimator given by (16.2.13) and (16.2.15), respectively. Note that with the
above formulation we have used only very basic properties of discrete-time
approximation processes. In the case where the increments Wkj are two-point
random variables, with probabilities given by (16.3.5), these calculations can
be further simplied since in this case (Wkj )2 = for j {1, 2, . . . , m},
k {0, 1, . . . , N 1}. Similar expressions for the variance given by (16.3.11)
and (16.3.12) are obtained if we replace the Euler approximations (16.3.2)
to (16.3.4) by other higher order weak approximations, although for these
approximations the above computations become more involved.
The variance reduction procedures discussed in the previous sections can
also be formulated in a discrete-time framework. However, in this case we
need to be more careful regarding how the problem should be formulated as
some choices lead to biased estimates of E(MT ).
Construction of an Unbiased Discrete-Time Estimator
To see how an unbiased estimator can be constructed, let M and be
weak Euler approximations of the processes M and as given by (16.3.2)
and (16.3.4), respectively. By observing the form of (16.3.4) and the initial
condition 0 = 1, an induction argument shows that
k1
m
"
1 +
(16.3.13)
k =
dj Wj
=0
j=1
P (A) =
N
dP
(16.3.14)
for A AT .
We will now assume that the increments (Wk1 , . . . , Wkm ) , k {0, 1, . . .,
N 1}, form a vector of two-point distributed, independent random variables
663
with mean zero and variance under P with probabilities given by (16.3.5).
In addition, we assume that
m j
j
dk Wk K1
(16.3.15)
j=1
for some constant K1 + for all k {0, 1, . . . , N 1} and a time discretization with maximum step size . This assumption clearly depends on the properties of the random variables djk , j {1, 2, . . . , m}, k {0, 1, . . . , N 1}. For
a discussion on how the growth of these variates can be constrained, see the
commentary following (16.2.16) in Sect.16.2. We restrict our attention to twopoint random increments because it simplies the calculations in what follows.
However, this analysis can, in fact, be extended to a wider class of multi-point
random variables.
Applying (16.3.13) and the independence property of the increments
Mk
k
Mk
=E
k N
N
1
m
"
j
j
1 +
= E(Mk ) E
d W
=k
j=1
= E(Mk )
= M0
(16.3.16)
664
1 j
P (Ajk ) = E
A
k
= E 1Aj N
k
=E
1Aj
N
1
"
1+
m
=E
1+
1Aj
Wp
p=1
=1
dp
m
dpk
Wp
p=1
1+
djk
j
1
d
j
k
P
Wk = =
.
(16.3.17)
2
These probabilities can also be used to show that
(W j ) = dj ,
E
k
k
(W j )2 = ,
E
k
(16.3.18)
m
kj Wkj ,
j=1
m
djk Wkj ,
j=1
k = k Mk k ,
k+1
k (1+k ).
so that
=
Combining this result and the Euler approximation
(16.3.2) we can write for k {0, 1, . . . , N 1}
665
Mk+1
Mk (1 + )
Mk+1
Mk
k+1
k
k+1
k
.
k+1
(16.3.20)
k
k
k
;
Cov
,
E
E
=E
k+1
+1
k+1
k+1
+1
+1
denote the covariance of the random variables
k /k+1 and / under the
M
Mk
k+1
E
=E
E
= 0.
(16.3.21)
k+1
k+1
k
k
k
;
Cov
,
=E
k+1
+1
k+1
+1
N
k
=E
+1
k+1
k Mk k
N
=E
.
(16.3.22)
+1
k+1
Noting that the increments Wkj , j {1, 2, . . . , m}, appear only in the terms
,
k and k and not in the terms Mk , +1
and
N
1
"
N
=
(1 + p ),
k+1
p=k+1
and using the independence property of the increments Wkj , we can infer
from (16.3.22) that for k, {0, 1, . . . , N 1} with < k
k
;
Cov
,
= E(k ) E
k+1
+1
+1
k+1
N
k
E(k ) E
+1
k+1
= 0.
(16.3.23)
666
M+1
Mk
M
=
+ M0
k
+1
=0
since 0 = 1. Combining (16.3.20), (16.3.21) and (16.3.23) we obtain
k1
M+1
M
M
k
;
;
Var
+ M0
= Var
k
+1
=0
2
k1
.
E
=
(16.3.24)
+1
=0
by
+1
2
=
2
1
1 +
2
=
2
1 + ( )2 . . .
2
;
Var
Mk
k
k1
=0
2
2
2
m
1
=
j M dj Wj . (16.3.25)
E
j=1
=0
k1
We also know from the independence property of the increments Wkj and
(16.3.19) that
W j1 W j2 = E W j1 W j2 N
E
k
k
k
k
=E
Wkj1
Wkj2
N
1
"
(1 +
)
=0
= E Wkj1 Wkj2 (1 + k )
= E Wkj1 Wkj2 + E Wkj1 Wkj2 Wkj2
+ E Wkj1 Wkj2 Wkj2
=0
667
variates include the increments Wpj , j {1, 2, . . . , m}, with index values p
up to, but excluding the value , then from (16.3.25) and (16.3.18) we have
2
k1
m
2
1
M
j
j
k
;
M d .
E
(16.3.26)
Var
j=1
=0
This should be compared to the variance obtained from (16.2.22) for the
estimator M 1 .
Discrete Estimation of Integrands
We will now consider the problem of directly estimating the integrands in the
martingale representation (16.2.1) using discrete-time methods. Let Y
k =
(Yk,1 , . . ., Yk,d ) , k {0, 1, . . . , N 1} be an Euler approximation of the
diusion process X given by (15.1.1) of the form
,i
Yk+1
= Yk,i + ai (k , Y
k ) +
m
j
bi,j (k , Y
k )Wk
(16.3.27)
j=1
N
k
As in Chap. 15, see (15.1.11), we let u : {0 , . . . , N }d , be a valuation
function of the form
(16.3.29)
u(k , Y
k ) = Mk .
This expression and the law of iterated conditional expectations, see (1.1.10),
applied to the discrete-time martingale M means that
A .
(16.3.30)
u(k , Y
k ) = E u k+1 , Y k+1
k
We will now estimate the function u at time k , k {0, 1, . . . , N }, using a
backward numerical technique. The basic idea is as follows: From (16.3.30) we
668
u(k , Y
k )=
1
2m
k) .
u k+1 , Y
(
W
k+1
(16.3.31)
k Q
W
u(k+1 , Y
k+1 ) u(k , Y k ) +
m
j.
kj W
k
(16.3.32)
j=1
u k+1 , Y
k+1 u k , Y k
m
669
2
j ,
kj W
k
j=1
Z = h(X T0 ) (Y E(Y ))
670
for some suitable choice of , rather than h(X T0 ) directly. The parameter is chosen to minimize the variance of Z. Because
t ,x
(t
0
T
(16.4.3)
t ,x
t ,x
t ,x
0
+ 2 Var h X
Var(ZT ) = Var h X T0
T
t ,x
t ,x
0
,h X
2 Cov h X T0
T
the value of , which will minimize the variance of ZT , is
t ,x
t ,x
0
,h X
Cov h X T0
T
t ,x
min =
.
0
Var h X
T
(16.4.4)
Note that if a Monte Carlo simulation using the variate ZT is being performed,
then we can estimate simultaneously the best value of that will minimize the
variance of ZT , as given by (16.4.4). This type of calculation can be performed
as the simulation proceeds, see Clewlow & Carverhill (1992, 1994). We do not
t ,x
t0 ,x ) for each path or
need to store the values of the variates h(X T0 ) and h(X
T
realization of the simulation. This is a simple but useful form of least-squares
analysis which can be extended in a straightforward manner to include linear
combinations of control variates of the form 1 Y1 + . . . + n Yn .
671
(16.4.5)
(16.4.6)
for t [t0 , T ] with the same initial conditions. That is, we have St0 = s and
(, , ), corresponding to the payo
t ) = E (ST K)+ At
u
(t, St ,
(16.4.7)
672
t ,x
X
t0 ,x = E h
0
u
t, X
At
t
T
(16.4.8)
(t
0
t
j=1
(16.4.9)
t0
M = u
=u
(t0 , x) +
, X
(16.4.10)
sj dWsj .
j=1
t0
t0 ,x
=u
(t0 , x)
E u
, X
(16.4.11)
673
from Y itself, since E(Y ) can usually be easily calculated for discrete-time
Monte Carlo schemes.
For one-dimensional diusions Clewlow & Carverhill (1992, 1994) used linear combinations of discrete-time martingale control variates. If we use a time
discretization of the interval [t0 , T ], as given by (16.3.1), then these control
variates in several dimensions take the form
,i
=
CN
N
1
ik Yk,i E Yk,i
k=0
i
for i {1, 2, . . . , d}, k {0, 1, . . . , N 1}, where Y
k = Y k+1 Y k . Here k
is Atk -measurable and is chosen as an approximation to a hedging parameter
such as the delta or gamma sensitivities for the ith component of the diusion
process X given by (15.1.1). Since
A
= C
E C
t
k
T
CTi =
si dXsi,t0 ,x ai s, X ts0 ,x ds
t0
si
=
t0
m
bij s, X ts0 ,x dWsj
(16.4.12)
j=1
674
CT =
sj dWsj ,
(16.4.13)
t0
j=1
Var(ZT ) = Var
= E
j=1
sj sj dWsj
t0
j=1
j=1
T
t0
2
j
s sj dWsj
2
E sj sj ds.
(16.4.15)
t0
t0 ,x = u
(t
u
T, X
,
x)
+
0
T
i=1 j=1
T
t0
t0 ,x u
t0 ,x dW j .
bi,j s, X
s,
X
s
s
s
xi
(16.4.16)
675
d
m
T
Var(ZT ) = Var
u s, X ts0 ,x
bi,j s, X ts0 ,x
xi
i=1 j=1 t0
j
t0 ,x u
t0 ,x
bi,j s, X
s,
X
dW
s
s
s
xi
d
m
T
E
u s, X ts0 ,x
=
bi,j s, X ts0 ,x
xi
j=1 t0
i=1
2
t0 ,x u
t0 ,x
bi,j s, X
s,
X
ds.
s
s
xi
If we set
sj =
d
bi,j s, X ts0 ,x
u
s, X ts0 ,x ,
x
i
i=1
(16.4.17)
(16.4.18)
as used in (16.4.13) for j {1, 2, . . . , m}, then the variance of the estimator
ZT given by (16.4.14) can be calculated from (16.4.15). Thus, one obtains
Var(ZT ) =
j=1
E
t0
d
i=1
i,j
s, X ts0 ,x
2
t0 ,x
(u u
) s, X s
ds.
xi
(16.4.19)
We observe that the variance obtained for ZT is similar to, but dierent from
that obtained for ZT with the above choice for = (1 , . . . , m ) . In general,
we can expect a lower variance for ZT as can be seen by comparing (16.4.17)
with (16.4.19). The computational loads corresponding to the two estimators
can also vary. For example, Monte Carlo estimations using the variate ZT
involve only the calculation of payo structures but require the simulation
t0 ,x . This can be compared
of two separate diusion processes X t0 ,x and X
with estimates of ZT , which require simulation of only one explicit diusion
o integrals (16.4.13) eectively
process X t0 ,x . However, evaluation of the It
involves the simulation of another diusion process, namely the It
o integral
itself. Choosing the best estimator for a given valuation problem often involves
preliminary simulation studies and clearly depends on the specic structure
of the underlying model and payo functional.
Minimizing the Variance
We can extend the above analysis,
estimator ZT to include linear comL for the
676
ZT = h X T0
CT
(16.4.20)
=1
t ,x
L
t ,x
, CT .
Cov h XT0
(16.4.21)
=1
t ,x
k Cov(CT , CTk ) = Cov h X T0
, CT
(16.4.22)
k=1
for each {1, 2, . . . , L}. This system of linear equations will admit a unique
solution if the set of control variates {CT , {1, 2, . . . , L}} is linearly independent. As is the case for a single control variate, we can estimate the quantit ,x
ties Cov(CT , CTk ) and Cov(h(X T0 ), CT ), , k {1, 2, . . . , L}, for a given simulation and simultaneously the optimal vector of coecients = (1 , . . . , L )
to minimize the variance as given by (16.4.22). Also, as has been noted for
a single control variate, we can calculate the optimal vector of coecients
progressively during the simulation. We do not need to store output data for
t ,x
the variates X T0 , CT1 , . . . , CTL for each path of the simulation.
This formulation is simplied if we assume that Cov(CT , CTk ) = 0 for
= k, , k {1, 2, . . . , L}. That is, the random variables CT , {1, 2, . . . , L},
are mutually orthogonal when considered as elements of the Hilbert space
L2 (, AT , A, P ) with inner product (X, Y ) = E(X Y ) for X, Y L2 (, AT ,
A, P ). In this case (16.4.22) reduces to
t ,x
, CT
Cov h X T0
=
Var CT
t ,x
t ,x
E h X T0
E h X T0
CT
=
(16.4.23)
CT 22
for {1, 2, . . . , L}, where 2 denotes the norm in L2 (, AT , A, P ).
A mutually orthogonal set of control variates CT , {1, 2, . . . , L}, can be
constructed using Gram-Schmidt orthogonalization. For example, if CT1 and
CT2 are not orthogonal we can replace CT2 with
Cov(CT1 , CT2 ) CT1
.
CT2
Var(C 1 )
T
677
=
There are many ways of constructing the control variate process C
1
L
j=1
(16.4.24)
t0
j=1
t0
2
E 1{s[t1 1 ,t1 )} 1{s[t2 1 ,t2 )} si
ds
= 0.
Therefore, any two control variates CT1 and CT2 , 1
= 2 , are orthogonal.
These control variate methods can be applied iteratively or in combination
with other techniques. For example, we may nd an initial approximation
u1 : [t0 , T ] d for u from nite dierence methods. The integrands
given by (16.4.18) together with appropriate interpolation routines could then
be used to construct an It
o integral control variate of the form (16.4.13) and
from this a new approximation u2 for u would be obtained. Clearly, this
procedure can be repeated until sucient accuracy is achieved.
The control variate method is an ecient method which can deliver signicant variance reductions. However, its potential is limited since it does not
exploit fully the stochastic structure of the underlying problem. The generalization of the above results to the case with jumps is rather obvious.
678
L0 f (t, x) =
d
m
1 i,j
2 f (t, x)
b (t, x) bk,j (t, x)
2
xi xk
j=1
(16.5.3)
i,k=1
and
Lj f (t, x) =
d
i=1
bi,j (t, x)
f (t, x)
xi
(16.5.4)
B = [0, T ) {T } ( )
and a valuation function u : [0, T ] given by
u(t, x) = E h(, X t,x
)
679
(16.5.5)
(16.5.6)
for t [0, T ] is a square integrable (A, P )-martingale. Under conditions, as described in Chap. 15, by using the martingale representation theorem together
with the Markov property for X, it can be inferred that there exists an mdimensional A-predictable integrand = { t = (t1 , . . . , tm ) , t [0, T ]}
with
Mt = u(t, X 0,x
t )
= u(0, x) +
j=1
sj dWsj
(16.5.8)
for t [0, T ].
Our aim will be to nd an unbiased variance reduced estimator for u(0, x)
given an approximation u
: [0, T ] to u. Examples will be provided on
how to construct such approximate functions. We assume that u
C 1,2 ([0, T ]
f
) is from the class of functions f : [0, T ] with t (, x) continuous
2
f
on (0, T ) for x and xi x
k (t, ) continuous on for t (0, T ) and
satises appropriate integrability
i, k {1, 2, . . . , d}. We also assume that Lj u
tj , t [0, T ]} with
j = {M
conditions so that the process M
t
j
j
Mt =
Lj u
(s, X 0,x
(16.5.9)
s ) dWs
0
(16.5.10)
That is, u
is an approximation to u that matches the payo for u on B.
Applying the It
o formula to u
and using (16.5.1) yields
L0 u
(t, X 0,x
t )dt +
(0, x) +
u
(, X 0,x
)=u
0
j=1
j
Lj u
(t, X 0,x
t )dWt .
0
(16.5.11)
Consequently, by (16.5.6), (16.5.10), (16.5.11), Fubinis theorem and the
j one obtains
martingale property of M
680
u(0, x) = E h(, X 0,x
)
=E u
(, X 0,x
)
=u
(0, x) + E
L0 u
(t, X 0,x
t ) dt
0
=u
(0, x) +
E 1{t< } L0 u
(t, X 0,x
t ) dt
(16.5.12)
for x . Here 1{t< } denotes the indicator function applied to the event
{t < }. Due to (16.5.12) the random variable
Z = u
(0, x) +
L0 u
(t, X 0,x
(16.5.13)
t ) dt
0
is an unbiased estimator for u(0, x). We will refer to this estimator as the HP
estimator.
Variance of the HP Estimator
To compute the variance of Z note that from (16.5.13), (16.5.11), (16.5.10)
and (16.5.8) we have
m
Z = u
(, X 0,x
)
j=1
j=1
= u(0, x) +
j=1
j
Lj u
(t, X 0,x
t ) dWt
= u(, X 0,x
)
j
Lj u
(t, X 0,x
t ) dWt
j
tj Lj u
(t, X 0,x
t ) dWt .
(16.5.14)
j
tj Lj u
Var(Z ) = E
(t, X 0,x
t ) dWt
j=1
j=1
2
E 1{t< } tj Lj u
(t, X 0,x
)
dt. (16.5.15)
t
tj = Lj u(t, X 0,x
t )
681
(16.5.16)
Z, = u
(, X )
Lj u
(t, X 0,x
t ) dWt
j=1
= Z + (1 )
j=1
j
Lj u
(t, X 0,x
t ) dWt
(16.5.18)
is an unbiased estimator for u(0, x). This variance reduced estimator can be
interpreted as a control variate that is based on an It
o integral representation,
see Sect. 16.4. Inspection of (16.5.18) shows that for = 1 the HP estimator
and the above unbiased estimator are the same. This explains the relationship
between the two variance reduction methods. In (16.5.18) the parameter can
be chosen in an optimal manner similar as in Sect. 16.4.
Note that we have provided precise estimates of the level of variance reduction that can be obtained, not in terms of improvements to, or specication of
the order of weak convergence, but by explicitly computing the variance of the
corresponding estimators, see (16.5.15) and (16.5.17). Clearly, the degree of
variance reduction obtained depends on the choice of the approximation function u
. As in standard Monte Carlo methods, the reduced variance decreases
with the inverse of the square root of the number of simulated trajectories.
For a given sample size the absolute variance can be made drastically smaller
by the above method when compared with a raw Monte Carlo estimate.
Extensions of the HP Method
We now consider applications of the HP variance reduction technique to the
approximation of solutions of partial dierential equations (PDEs) that are
pricing functions in nance but could also originate from other applications.
In addition, some extensions of the method are presented.
682
(16.5.19)
(16.5.20)
m
j
Lj u(s, X t,x
(16.5.21)
u(t, x) = u(, X t,x
)
s ) dWs
j=1
m
j
= 0.
Lj u(s, X t,x
E
s ) dWs
j=1
Combining this result with (16.5.21) ensures that the representation (16.5.6)
applies for u. Consequently, the solution to (16.5.19)(16.5.20) must be unique.
(t, x) =
Suppose there is some approximation u
C 1,2 ([0, T ] ) with u
satises the PDE
h(t, x) for (t, x) B. Then the dierence u = u u
L0 u (t, x) + L0 u
(t, x) = 0
(16.5.22)
u (t, x) = 0
(16.5.23)
for (t, x) B. If u
is close to u, then it may be computationally more ecient
to solve (16.5.22)(16.5.23) rather than (16.5.19)(16.5.20). This provides a
simple but powerful illustration of how the ideas used to produce variance
reduced estimators can be directly applied in a PDE setting.
In some cases it may not be possible or convenient to use only PDE methods to solve (16.5.19)(16.5.20). For example, the dimension d may be simply too high. A hybrid method, which combines features of a PDE approach
683
together with the HP variance reduction technique, would rst apply a numerical solver to nd an approximation u
to u. Then by simulation, using
the formula (16.5.12) with estimator (16.5.13), a better approximation of the
solution u(0, x), x , at the point (0, x), could be obtained. Typically, the
approximate solution u
, possibly the output from an implicit nite dierence
solver, would only be available at discrete points on a grid. This may require
some additional smoothing or multi-dimensional interpolation technique to
(t, X 0,x
ensure that values for L0 u
t ), t [0, T ], can be computed for simulated
trajectories, see (16.5.13).
Iterative HP Methods
For many practical applications that arise in a nancial modeling context, a
systematic way of obtaining an approximate solution u
satisfying (16.5.6) is
= {X
t, t
to rst nd an approximation to the diusion process X. Let X
[s, T ]} be a d-dimensional diusion process that is governed by the SDE
s,x
dX
t
s,x ) dt +
(t, X
=a
t
m
j (t, X
s,x ) dW j
b
t
t
(16.5.24)
j=1
: [0, T ] d and
for t [s, T ], s [0, T ] with coecient functions a
j
: [0, T ] d , j {1, 2, . . . , m}. As is the case for (16.5.1) it is assumed
b
that (16.5.24) admits a unique strong solution which is Markovian. Denote
i,j ,
0 the corresponding PDE operator, as given in (16.5.3), with a
i and b
by L
i,j
i
i {1, 2, . . . , d}, j {1, 2, . . . , m} replacing a and b , respectively. Dene
the function u
: [0, T ] by replacing the diusion process X appearing
that is
in (16.5.6) with X,
t,x )
u
(t, x) = E h(X
(16.5.25)
s,x )
for (t, x) [0, T ] . Here : [s, T ] is the rst exit time of (t, X
t
from [s, T ) , that is
s,x
[s, T ) .
(16.5.26)
= inf t s : t, X
t
If u
is suciently smooth, then an application of the Kolmogorov backward
equation yields the PDE
0 u
L
(t, x) = 0
(16.5.27)
for (t, x) [0, T ] with boundary condition
u
(t, x) = h(t, x)
(16.5.28)
684
u(0, x) = u
(0, x) + E
L
u
(t, X 0,x
t ) dt
0,x
0
(L L ) u
(t, X t ) dt .
0
=u
(0, x) + E
(16.5.29)
0) u
Z = u
(0, x) +
(L0 L
(t, X 0,x
(16.5.30)
t ) dt.
0
: [0, T ] dened by
Consider an additional approximation function u
(t, x) = u
u
(t, x) + z(t, x),
where
t,x
L u
(s, X s ) ds
0
z(t, x) = E
(16.5.31)
(16.5.32)
(16.5.33)
for (t, x, y) [0, T ] (0, ) with the process Y s,y = {Yts,y , t [s, T ]}
dened by the relation
s,x ) dv
Yts,y = y +
1{v< } L0 u
(v, X
(16.5.34)
v
s
z(t, x, y) = E YTt,y .
(16.5.35)
0 on suciently smooth functions
Let us dene the extended operator L
f : [0, T ] (0, ) by
f (t, x, y)
0 f (t, x, y) = f (t, x, y) +
L
ai (t, x)
t
xi
i=1
d
d
m
1 i,j
2 f (t, x, y)
b (t, x) bk,j (t, x)
2
xi xk
j=1
i,k=1
+ L0 u
(t, x)
f (t, x, y)
y
(16.5.36)
685
(16.5.37)
0) E
= (L0 L
t,x ) ds
0 ) L0 u
(L0 L
(s, X
s
.
t,x ) ds . (16.5.38)
0 ) (L0 L
0) u
(L0 L
(s, X
s
=E
t,x ) ds
L0 u
(s, X
s
=E
T
0,x
0,x
0
=u
(0, x) + E h(, X 0,x
)
(,
X
)
+
E
1
L
u
(t,
X
)
dt.
{t<
}
t
0
0,x
Zh = u
(0, x) + h(, X 0,x
)
(,
X
)
+
L0 u
(t, X 0,x
t ) dt.
(16.5.40)
686
for t [s, T ] and s [0, T ] with initial values Sss,x = x1 > 0 and vss,x =
x2 > 0 and nonnegative constants , , , and [1, 1] with 2 >
1
1
2
2 . The vector W = {W t = (Wt , Wt ) , t [0, T ]} is a two-dimensional
1
2
Wiener process under P . Here the symbols S s,x and v s,x denote the stock
and squared volatility processes, respectively. The savings account process
= {B
t , t [0, T ]} is given by B
t = ert , where r > 0 is, for simplicity, a
B
constant short rate.
The GOP St in this market satises the SDE
dSt = St (r dt + t1 (t1 dt + dWt1 ))
with market prices of risk
r
t1 = (
2
vts,x
1
dSts,x = Sts,x
(16.5.42)
(16.5.43)
1
Sts,x
St
(
s,x2
1
vt t dWt1 .
(16.5.44)
687
t S
B
1 t 1 2
1
1
0
t =
=
exp
(
)
dt
+
dW
,
(16.5.46)
t
t
2 0 t
St
0
which is required to be an (A, P )-martingale.
The real world pricing formula (3.3.5), (16.5.45) and the Bayes rule yield
then also the traditional formula for the risk neutral price
t HT
T S St B
B
0
c(t, x) = E
T At
t S B
ST B
0
t HT
T B
Bt HT
=E
(16.5.47)
T At = E
T At ,
t B
B
denotes expectation under P .
where E
1
2
Under the measure P the dynamics for S s,x and v s,x becomes
(
1
1
1
2
1
dSts,x = r Sts,x dt + vts,x Sts,x dW
t
(
2
2
2
1 + 1 2 dW
2 (16.5.48)
dvts,x = vts,x dt + vts,x dW
t
t
for t [s, T ] and s [0, T ], where =
(r)
and
t1 = t1 dt + dWt1
dW
2 = dW 2 .
dW
t
t
= {W
t = (W
1, W
2 ) , t [0, T ]} is a two-dimensional Wiener proHere W
t
t
cess under P . The minimal equivalent martingale measure transforms the
stock price such that its appreciation rate becomes r. However, it leaves the
untraded noise Wt2 unchanged under P . We assume that 2 > 12 so that the
2
square root process v s,x remains strictly positive under P . Furthermore, one
needs to assume the parameters are chosen such that P is an equivalent risk
neutral probability measure. Let us also ensure that the stock price process
1
S s,x remains strictly positive under P and P .
688
Even that it requires stronger assumptions, let us work under the risk
neutral approach since this is common to most readers. However, the method
below works equally well when applied directly under the benchmark approach
with less restrictive modeling assumptions and without measure change. We
now describe a version of the HP variance reduction method, which can be
utilized to approximate the option price
c(0, x) = erT u(0, x),
(16.5.49)
1
(S 0,x K)+ .
u(0, x) = E
T
where
This means, we consider a European call with strike K under the Heston
model with risk neutral dynamics given by (16.5.48) and x = (x1 , x2 ) =
1
2
(S0s,x , v0s,x ) . The main idea will be to use the estimate (16.5.29) with the
HP estimator (16.5.30). As previously mentioned, one way of utilizing the
for the diusion process X. A
HP method is to nd an approximation X
=
convenient choice, based on the Black-Scholes model, is as follows: Let X
s,x1
s,x2
d
vts,x = vts,x
dt
(16.5.50)
1
for t [s, T ] and s [0, T ] with initial values Sss,x = x1 > 0 and vss,x = x2 >
0. For this system of SDEs the solution for v can be explicitly computed and
is given by
2
e(ts)
vts,x = + (x2 )
(16.5.51)
for t [s, T ] and s [0, T ]. Denote by BS(x1 , K, r, , T ) the Black-Scholes
price for a European call option with spot price x1 , strike K, short rate r,
constant volatility and maturity T . Using (16.5.51) and setting = T , the
approximate function u
: [0, T ] (0, )2 given in (16.5.25) takes the
form
1
(St,x K)+
u
(t, x) = E
T
t , T t)
= er(T t) BS(x1 , K, r,
1
(16.5.52)
2
T
2
1
t =
vzt,x dz
T t t
'
(T t) 1
e
= (x2 )
.
(16.5.53)
(T t)
689
Here BS(, , , , ) denotes the European call price according to the BlackScholes formula.
Evaluation of the expectation appearing in (16.5.29) requires the cal 0 )
u(t, X 0,x
culation of the values (L0 L
t ) for t [0, T ]. Using (16.5.48)
and (16.5.50) and noting that the coordinates for X = (X 1 , X 2 ) and
= (X
2 ) correspond to the vectors (S s,x1 , v s,x2 ) and (Ss,x1 , vs,x2 ) ,
1, X
X
0 ) can be determined. Thus, for suciently
respectively, the operator (L0 L
smooth functions f : [0, T ] (0, )2 we have
2
2
0 ) f (t, x) = x2 f (t, x) + 1 f (t, x)
(L0 L
(16.5.54)
x1 x2
2
(x2 )2
for (t, x) (0, T ) (0, )2 , where x = (x1 , x2 ) .
0) u
(t, x) can now be computed in an
The corresponding values for (L0 L
explicit form because u
(t, x) can be expressed as a scaled Black-Scholes price
with volatility
t , see (16.5.52) and (16.5.53). Substitution of the appropriate
partial derivatives for u
into (16.5.54) yields
2
t , T t)
t
BS(x1 , K, r,
0) u
(L0 L
(t, x) = x2 er(T t)
x1
t
x2
2
t
2 BS(x1 , K, r,
t , T t)
1
+
2
t2
x2
t , T t) 2
t
BS(x1 , K, r,
+
. (16.5.55)
t
(x2 )2
2
t
BS
BS
The partial derivatives BS
, x2t and (x
2 )2 can be com
t , x1
t ,
t2
puted in a straightforward manner using the Black-Scholes formula together
with the expression for
t given in (16.5.53). For example, BS
t is often referred
to as the Black-Scholes vega.
690
m
bj (n+1 , Yn+1 ) + (1 ) bj (n , Yn ) Wnj
(16.5.56)
j=1
m
bj Wnj
j=1
d
m
i=1 j=1
bi,j (n , Yn )
bj (n , Yn )
.
xi
Here , [0, 1] and Wnj , j {1, 2, . . . , m}, n {0, 1, . . . , N 1} are independent N (0, ) Gaussian random variables. Note that one could have used
j , which in this example give altwo-point distributed random variables W
n
most the same outcomes.
The corresponding discrete-time approximation for the estimator ZT ,
given in (16.5.30) with = T , can be obtained by adding this component
to the system of equations (16.5.48). With this choice of components the
numerical scheme (16.5.56) can be applied to construct an approximation
1
2
Y N = (YN1 , YN2 , YN3 ) to the random vector (ST0,x , vT0,x , ZT ) with initial
(0, x), x = (x1 , x2 ) . The expectavalues Y01 = x1 , Y02 = x2 and Y03 = u
3
tion E(YN ), therefore, provides an approximation to the value u(0, x), see
(16.5.29), (16.5.48) and (16.5.49).
691
Fig. 16.5.1. Simulated outcomes for the intrinsic value (St0,x K)+ , t [0, T ]
For all simulation experiments the results have been enhanced by the
use of antithetic variates, see Sect. 16.2. This consisted of full reection of
both Wiener components. That is, sample paths computed via the 2N outcomes (Wn1 , Wn2 ), n {0, 1, . . . , N 1} were combined with additional
sample paths generated from the pairs (Wn1 , Wn2 ), (Wn1 , Wn2 ) and
(Wn1 , Wn2 ) for n {0, 1, . . . , N 1}.
Simulation Results
To provide a graphical illustration of the level of variance reduction possible
with the HP method, Fig. 16.5.1 displays a group of 10 4 = 40 simulated
1
sample paths for the intrinsic value (St0,x K)+ , t [0, T ], with N = 40 time
discretization points. The results include sample paths generated from the
antithetically produced outcomes as explained above. For this and subsequent
plots the following default parameter values were used: = 0.6, = 0.04,
1
= 0.2, r = 0.04, T = 0.5, K = 100 with initial values S00,x = 100 and
2
v00,x = 0.04. Note that for these parameter values 2 = 0.6 > 0.5. These
choices produce a strong stochastic volatility eect, suciently constrained to
1
2
ensure that S 0,x and v 0,x remain strictly positive under P .
Fig. 16.5.2 shows the same set of sample paths for the estimator Zt , t
1
[0, T ]. Note that simulated outcomes for (St0,x K)+ , t [0, T ], are spread
over the wide interval [0.00, 50.00], whereas those for Zt are contained in the
extremely small interval [6.60, 6.77]. For a larger sample of 100 4 = 400
0,x1
ZT )
trajectories the standard error for the means E((S
K)+ ) and E(
T
were 0.608 and 0.004, respectively. A comparison of the two gures, therefore,
demonstrates the dramatic reduction in variance that can be obtained with
an implementation of the HP method. For this example the method produces
692
0) u
Fig. 16.5.3. Diusion operator values (L L
as a function of asset price S and
squared volatility v
693
Fig. 16.5.4. Price dierences between the Heston and corresponding Black-Scholes
model as a function of strike K and time t
694
Fig. 16.5.6. Implied volatility term structure for the Heston model
Fig. 16.5.4, the corresponding implied volatilities are illustrated in Fig. 16.5.6.
This implied volatility surface shows a negative skew with some smile eect,
a feature which is observed in many markets. For the Heston model choosing
> 0 leads to a positive skew. Note that the choice = 0 produces a smile.
The above described HP method is very powerful since it derives the variance reduction for each path using deep stochastic analytic relationships. It
is straightforward to generalize the HP method to the case with jumps and
to develop particular versions of it.
16.6 Exercises
16.1. Construct an antithetic variance reduction method, where you simulate
the expectation E(1 + Z + 12 (Z)2 ) for a standard Gaussian random variable
Z N (0, 1). For this purpose combine the raw Monte Carlo estimate
VN+
N
1
1
2
=
1 + Z(k ) + (Z(k ))
N
2
k=1
N
1
1
2
1 Z(k ) + (Z(k ))
N
2
k=1
that also uses the same outcomes but with a negative sign. Demonstrate the
variance reduction that can be achieved for the estimator
16.6 Exercises
695
1
VN = (VN+ + VN )
2
instead of using only VN+ . Is VN an unbiased estimator?
16.2. For calculating the expectation E(1+Z + 12 (Z)2 ) use the control variate
VN =
N
1
(1 + Z(k )) .
N
k=1
t0,x dt + X
t0,x dW
t
t0,x = X
dX
0,x
=X
= x0 , respectively, where
XT0,x
t0,x ) dt
t = dWt d(t, X
dW
and
t0,x ) dWt
dt = t d(t, X
with
t0,x ) =
d(t, X
t0,x )
t0,x u(t, X
X
.
t0,x )
x
u(t, X
16.4. Formulate a drift implicit simplied weak Euler scheme for the twofactor Heston model
17
Trees and Markov Chain Approximations
n!
pj (1 p)nj
j ! (n j) !
(17.1.1)
697
698
699
Fig. 17.1.2. Error for out-of-the money European call in dependence on number of
steps
Fig. 17.1.3. Log-log plot of weak error for at-the-money price, delta and gamma
which shows experimentally that the binomial tree achieves for a European
call a weak order of about 1.0.
700
Fig. 17.1.4. Log-log plot of absolute weak error for out-of-the money price, delta
and gamma of CRR binomial European call
701
Fig. 17.1.5. Log-log plot of the absolute error for in-the-money price, delta and
gamma of CRR binomial European call
put price against the log of the number of time steps. One notes that the
binomial American put price appears to converge to a limiting value. For
American put options an early exercise boundary emerges, which describes
the level at which the American option price equals the payo function. As
an illustration, in Fig. 17.1.7 we show for the above default parameters for
an at-the-money American put option the resulting early exercise boundary
when n = 250 time steps are used. One notes the oscillatory behavior of the
early exercise boundary due to the nature of the binomial tree with odd and
even time steps.
702
2
d = exp
r
1,
2
2
u = exp
r
+ 1,
2
respectively, Here the behavior of the weak error is very similar to that of
the CRR specication. Basically, only the center of the tree is slightly shifted.
Experimentally, one also observes rst weak order in this situation.
Let us now go further by allowing the tree to attain three possible values from each node at a new time step. This provides more freedom when
constructing the tree. It also avoids the typical features that arise from odd
and even time steps, as observed for a binomial tree. This raises the question
whether the accuracy or order of weak convergence can be improved in such
a trinomial tree.
To clarify this we form a trinomial tree, which is similar to the binomial
tree. Its main property is that it is recombining, however, here we go to three
nodes in each time step. This allows us to overcome the constraint of binomial
trees where it is not possible to remain at the same level. Below we assign
probabilities to the upward and the downward moves. With the remaining
probability being that the tree stays at the present level. This means that the
trinomial tree still remains very similar to the binomial tree, enriched by the
703
m=1
u = e
(17.1.2)
with a free parameter. In a risk neutral setting, Boyle (1988) suggested for
the above tree the following trinomial probabilities
pd =
(W R) (u + 1)2 (R 1) (u + 1)3
u2 (u + 2)
pm = 1 pd pu
pu =
(W R) (u + 1) (R 1)
u2 (u + 2)
with
W = R 2 e
(17.1.3)
and
R = er ,
where r denotes the short rate and the underlying asset is denominated in the
domestic currency. Note that for = 1 Boyles trinomial tree model coincides
with the original CRR binomial model.
The above probabilities and steps are chosen such that the rst and second
moment of the Black-Scholes model for the underlying security are matched,
that is,
pd (d + 1) + pm m + pu (u + 1) = R
and
pd (d + 1)2 + pm m2 + pu (u + 1)2 R2 = e2r (e
1).
704
Fig. 17.1.8. Weak error for Boyles trinomial at-the-money European call
2
(r 2 )
1
pd =
2 2
2
1
pm = 1 2
2
(r 2 )
1
pu =
+
2 2
2
(17.1.4)
for 1. Here one matches the rst and second moment of the logarithm of
the risk neutral asset price by the conditions
2
pd ln(d + 1) + pu ln(u + 1) = r
2
2
2
2 .
pd (ln(d + 1))2 + pu (ln(u + 1))2 2 + r
2
In the last condition the last term on its right hand side is of higher order and
can still be neglected to obtain the previous probabilities. Numerical experiments show that if pm 13 , then errors in the approximation are somehow
minimized.
For the same parameter settings, as used before
for the binomial tree, and
when applying Boyles trinomial tree with = 3, we show in Fig. 17.1.8
for an at-the-money trinomial European call the weak error dependent on the
number of time steps. It is remarkable that the odd-even oscillations, observed
for the binomial tree, are now removed. In a corresponding log-log plot in
Fig. 17.1.10 one realizes that Boyles trinomial tree achieves experimentally
the same order of weak convergence of about 1.0 as the binomial tree. We
show in Fig.17.1.9 for the same setting the weak error for an out-of-the money
European call with strike K = 1.1. This gure reveals substantial oscillations
of the absolute error, which generally decline as the number of steps increases.
705
Fig. 17.1.9. Weak error for out-of-the money Boyles trinomial European call with
K = 1.1
Fig. 17.1.10. Log-log plot of weak error for at-the-money price, delta and gamma
of Boyles trinomial European call
706
Fig. 17.1.11. Log-log plot of weak error for out-of-the money price, delta and
gamma of Boyles trinomial European call
three nodes in each time step smoothes the resulting pricing function when
compared to that of a binomial tree. However, in the above case it does not
improve the order of weak convergence. The reason for this will become clear
when we formulate a corresponding weak convergence theorem.
As we have seen, for the at-the-money trinomial tree European option
pricing there appear to be almost no oscillations in the prices, which is due
to the matching of the strike with the corresponding node at maturity. In
Fig.17.1.11 we show a log-log plot for an out-of-the money trinomial European
call with weak errors for price, delta and gamma. Here one notes oscillations
when increasing the number of steps. If one adjusts the tree such that the inor out-of-the money strike is located at a node, then one can reduce such inand out-of-the money price oscillations.
Note, that this does not change the weak order of convergence of the
method. These eects are of particular importance in the context of barrier
options, see Boyle (1988). Obviously, American pricing can be performed on
a trinomial tree similarly as on a binomial tree.
Black-Scholes Modication of a Tree
As we noticed above, the placing of the strike of an option at some node
has an impact on the error behavior. Therefore it has been suggested in
Broadie & Detemple (1997b) to smooth the call or put payo function atthe-money and one step before maturity by using the corresponding price at
that time obtained from the Black-Scholes formula. This Black-Scholes modication of a tree is then used to price options. To test this method we plot
in Fig. 17.1.12 the resulting log-weak error for binomial European calls with
and without the Black-Scholes modication against the log of the number of
steps. It follows that there is no increase in the order of weak convergence.
707
Fig. 17.1.12. Log-log plot of weak error with and without Black-Scholes modication for CRR binomial out-of-the money European call
However, the oscillations of the error, as seen in Fig. 17.1.4, are reduced and
a better numerical performance of the method can be observed.
Smooth Payo
Let us now check whether a smooth payo, compared to the nondierentiable
European call or put payo, provides some higher order of weak convergence.
As a simple test case we consider the cubic payo
g(S) = S 3
and compare the theoretical value
E(g(ST )) = (S0 )3 exp{3 (r + 2 ) T }
with that obtained from a trinomial tree. In Fig. 17.1.13 we show the loglog plot of the resulting absolute error dependent on the logarithm of the
number of steps. One notes that there is no change in the order of weak
convergence when using Boyles trinomial method if compared to Fig. 17.1.10.
The experimental weak order is about = 1.0.
Now, let us consider the trinomial tree proposed in Heston & Zhou (2000).
It uses
2
m = exp
r
2
d = m exp 3 1
u = m exp 3 1
(17.1.5)
708
Fig. 17.1.13. Log-log plot for Boyles trinomial tree with smooth payo
Fig. 17.1.14. Log-log plot for Heston-Zhou trinomial tree with smooth payo
with probabilities
1
.
(17.1.6)
6
In Fig. 17.1.14 we show the log-log plot of the weak error for the HestonZhou trinomial tree with the above smooth payo. Here we note that we
achieve experimentally the higher weak order of 2.0. We note that this
particular tree matches exactly, for the logarithm of the given asset price
model, those moments needed to achieve this weak order for smooth payos
in accordance with what was shown in (11.2.8) for simplied weak schemes.
Note that the probabilities and spatial steps in (17.1.5)(17.1.6) resemble
those in the three-point distributed random variables described in (11.2.9).
We also compute for this trinomial tree the weak error when calculating
European calls. The corresponding log-log plot is shown in Fig. 17.1.15. This
gure reveals that we obtain experimentally, due to the nonsmooth payo,
pd = pu =
709
Fig. 17.1.15. Log-log plot for Heston-Zhou trinomial tree for European call
only a weak order of about 1.0. The reasoning for this eect will become
clear when we discuss later a weak convergence theorem.
Pentanomial Trees
We have demonstrated empirically that the binomial and trinomial tree methods achieve, in general, a weak order of convergence of about 1.0. This
raises the question of how one could systematically improve the weak order
of convergence of tree methods. It turns out that the use of ve nodes in
each step in a tree or Markov chain approximation allows, quite generally,
to approximate the transition probabilities to the extent which provides the
weak order of convergence 3.0. One can construct a pentanomial tree
similar to a trinomial tree by just adding at each time step two extra nodes
that can be reached. As suggested in Heston & Zhou (2000), one may consider
additionally to the steps d, m and u also d and u , where
2
m = exp
r
2
d = m exp a 1
u = m exp a 1
d = m exp 3 a 1
u = m exp 3 a 1
(17.1.7)
with a 1.64947. The probability to move to the direct lower and upper
neighbors is each, set to pd = pu 0.181415, and that to jump to the extreme
710
(T ) = 2 E g YT E g YT2
Vg,2
one obtains a new valuation method, analogous to (11.4.1), that has theoretically weak order = 2.0. For the above European call option pricing example
we use the CRR binomial tree method and show in Fig. 17.1.16 the log-log
plot of the resulting weak error (BTree) against the log of the time step size
together with its Richardson extrapolation (Extrap). The regression of the
curve for the error of the Richardson extrapolation reveals a slope of about
1.0, which indicates that only a rst weak order is experimentally obtained. This is not what we theoretically expect. The reason for this is that
the leading error coecient for the considered nonsmooth payo is not one.
We also consider in Fig. 17.1.16, the Black-Scholes modication for a binomial
tree (BBS) and its Richardson extrapolation (ExtrapBBS). Here we note a
signicant improvement due to extrapolation.
Now, let us study the case of a smooth payo g(S) = S 3 . Again with CRR
binomial trees and Richardson extrapolation we calculate the weak error. In
Fig. 17.1.17 we show the log-log plot of this error, which has now the slope of a
second weak order method. This simple study demonstrates that extrapolation
for a non-dierentiable payo may not be benecial in the case of a European
call option and may be questionable in other situations. However, for a smooth
payo the theoretically achievable order appears to be obtained.
711
Fig. 17.1.16. Log-log plot for Richardson extrapolation with out-of-the money
binomial call payo and Black-Scholes modication
Fig. 17.1.17. Log-log plot for Richardson extrapolation with CRR binomial tree
for smooth payo
Also other weak rst order methods can be extrapolated as long as their
leading error coecients are appropriate. To test this, we use the trinomial
tree of Heston & Zhou (2000), which was introduced above and has second
weak order for smooth payos. We apply it with step sizes , 2 and 4 in
the following extrapolation method
.
1
32 E g YT 12 E g YT2 + E g YT4
,
(T ) =
Vg,4
21
see (11.4.2). In Fig. 17.1.18 we present a log-log plot for the absolute error
of the resulting fourth order extrapolation to obtain the expectation for the
smooth payo g(S) = S 3 . The gure shows that the slope of the log error
curve is approximately four, which indicates that we experimentally have obtained a fourth weak order method. However, when one tries to extrapolate
712
Fig. 17.1.18. Log-log plot for the weak error of a fourth order extrapolation using
the second order Heston-Zhou trinomial tree for smooth payo
Fig. 17.1.19. Log-log plot for a fourth order extrapolation error for out-of-the
money Heston-Zhou trinomial European calls
with this trinomial extrapolation method, European call option prices, this
extrapolation method fails because of the nonsmooth payos, as can be seen
from Fig. 17.1.19. This highlights the fact that extrapolation requires smooth
payos to be ecient.
713
714
tion. The major advantages of simplied schemes over tree methods are their
exibility and general applicability in higher dimensions.
As shown previously, implicit simplied methods can overcome certain numerical instabilities. Random bit generators, which will be presented below,
can be eciently applied to implicit schemes, while tree methods cannot easily
be made implicit. On the other hand, simplied schemes can be interpreted
as being, in some sense, equivalent to implicit nite dierence methods. However, as with trees, these methods cannot be easily implemented for higher
dimensions and one has an advantage when using simplied implicit methods
with variance reduction in higher dimensions.
The order of convergence of simplied schemes is independent of the dimension of the problem. As shown in Boyle, Broadie & Glasserman (1997),
simulation methods typically become more ecient than tree or nite dierence methods around dimension three or four.
Weak Taylor Schemes and Simplied Methods
For the dynamics of the underlying security let us consider the scalar SDE
dXt = a(t, Xt ) dt + b(t, Xt ) dWt
(17.2.1)
n = ) = 1 .
P (W
2
(17.2.2)
(17.2.3)
see (11.2.2). Here the rst three moments of the Wiener process increments
n , that is
Wn match those of W
n) = 0
E(Wn ) = E(W
715
n )2 ) =
E((Wn )2 ) = E((W
n )3 ) = 0.
E((Wn )3 ) = E((W
(17.2.4)
This property satises condition (11.2.3). A similar result applies to the weak
order 2.0 Taylor scheme
1
1
1 2
2
Yn+1 = Yn + a + b Wn + b b Wn +
a a + a b 2
2
2
2
1
+ a b Zn + a b + b b2 {Wn Zn } ,
(17.2.5)
2
o integral
where Zn represents the double It
tn+1
z
Zn =
dWz ds,
tn
tn
see (11.2.6), and we suppress again the dependence of the coecient functions
on (tn , Yn ). Here we can replace the Gaussian random variables Wn and Zn
n with
by expressions that use a three-point distributed random variable W
n = 3 = 1 ,
P W
6
n = 0 = 2,
P W
3
see (11.2.9). Then we obtain the simplied weak order 2.0 Taylor scheme
1 2
1
1 2
Wn +
Yn+1 = Yn + a + b Wn + b b
a a + a b 2
2
2
2
1
1
n ,
+
(17.2.6)
a b + a b + b b2 W
2
2
n is such
see (11.2.7). Since the three-point distributed random variable W
that the rst ve moments of the increments of the schemes (17.2.5) and
(17.2.6) are matched, see condition (11.2.8), the scheme (17.2.6) can be shown
to achieve the weak order = 2.0.
By using four or even ve point distributed random variables when approximating the multiple stochastic integrals needed, we can obtain simplied
weak Taylor schemes of weak order = 3.0 or 4.0, respectively, as shown in
Kloeden & Platen (1999) and Hofmann (1994).
An important issue with weak approximations for SDEs is their numerical stability. As noticed in Chap. 14, when considering a test equation with
multiplicative noise, the weak schemes described above show narrow regions
of asymptotic p-stability. In order to improve the numerical stability of such
methods one needs to introduce implicitness in the diusion terms. This leads,
for instance, to the fully implicit weak Euler scheme
716
Yn+1
b (tn+1 , Yn+1 )
= Yn + a (tn+1 , Yn+1 ) b (tn+1 , Yn+1 )
y
+ b (tn+1 , Yn+1 ) Wn ,
(17.2.7)
which makes no sense for certain diusion coecients, as shown in (7.3.3) and
(7.3.4). However, under the weak convergence criterion one can employ the
n , see (17.2.2), instead of Wn to
two-point distributed random variable W
obtain the simplied fully implicit Euler scheme
b (tn+1 , Yn+1 )
Yn+1 = Yn + a (tn+1 , Yn+1 ) b (tn+1 , Yn+1 )
y
n,
(17.2.8)
+ b (tn+1 , Yn+1 ) W
that achieves an order = 1.0 of weak convergence. It does not have the problems that the scheme (17.2.7) may exhibit due to the Gaussianity of Wn .
Random Bit Generators
Let us now demonstrate, for simplied schemes, how to implement highly ecient random bit generators that exploit the architecture of a digital computer
or can be used on parallel working hardware devices. The crucial advantage
of simplied schemes is the fact that only random bits have to be generated
instead of continuous random variables. These substitute the computationally demanding Gaussian random numbers needed for several weak Taylor
schemes.
Within this book we have not focussed on random or quasi-random number generation. For this topic we refer to a large literature which includes
Devroye (1986), Bratley et al. (1987), Niederreiter (1988b, 1992), LEcuyer
(1994, 1999), Fishman (1996), Pages & Xiao (1997), Matsumoto & Nishimura
(1998), Brent (2004) and Panneton, LEcuyer & Matsumoto (2006).
A well-known and ecient method to generate a pair of independent standard Gaussian random variables is the polar Marsaglia-Bray method coupled with a linear congruential random number generator, as described in
Press, Teukolsky, Vetterling & Flannery (2002). The following comparative
study uses, as Gaussian random number generator, the routine gasdev, see
Press et al. (2002).
For the simplied Euler scheme (17.2.3) and simplied fully implicit Euler
scheme (17.2.8) we use a two-point distributed random variable at each time
step, which is obtained from a random bit generator. This generator is an
algorithm that generates a single bit 0 or 1 with probability 0.5. The method
proposed is based on the theory of primitive polynomials modulo 2. These are
polynomials satisfying particular conditions whose coecients are zero or one.
The important property is that every primitive polynomial modulo 2 of order
n denes a recurrence relation for obtaining a new bit from the n preceding
ones with maximal length. This means that the period length of the recurrence
717
(17.2.9)
718
Fig. 17.2.1. Log-log plot of the weak error for the Euler, fully implicit Euler and
weak order 2.0 Taylor schemes
719
for instance, the simplied Euler scheme was 28 times faster than the Euler
scheme.
By comparing all six methods, we conclude that the second order simplied scheme is signicantly more ecient for the given example than any of
the other schemes considered. This result reveals that ecient Monte Carlo
simulation algorithms for smooth payos are obtained by using higher order
simplied schemes.
An Option Payo
In option pricing we are confronted with the computation of expectations of
non smooth payos. Close to maturity and at-the-money, a European call
option pricing function does not have the smoothness that is required in Theorem 11.2.1 for rst order weak convergence. To give a simple example, let us
compute the price of a European call option. Here we have a continuous but
only piecewise dierentiable payo (XT K)+ = max(XT K, 0) with strike
price K. For the option price at time t = 0 we obtain, by the Black-Scholes
formula,
E(erT (XT K)+ ) = X0 N (d1 ) erT K N (d2 ),
2
X
ln( K0 )+(r+ 2 )T
where d1 =
and d2 = d1 T , see (13.7.16). For this non
T
smooth payo we study the absolute weak error for the Euler and the simplied Euler method, assuming the volatility = 0.2 and the short rate
r = 0.1. Simulation experiments show that we observe no major gain by using
higher order methods, which is due to the non smooth option payo. Since
the second order simplied method (17.2.6) is approximately equivalent to a
trinomial tree, as discussed previously, this is consistent with an observation
in Heston & Zhou (2000), where it was mentioned that in option pricing the
order of convergence of trinomial trees is not superior to that of binomial trees.
In Fig. 17.2.2 we show the log-log absolute weak error plot for an at the
money-forward option, with strike K = X0 erT . The Euler method generates
a weak order = 1.0 with the log error forming a perfect line dependent
on the log step size. As mentioned earlier, the simplied Euler method is
approximately equivalent to a binomial tree. This method still achieves a
weak order = 1.0. However, its log-log error plot does not exhibit a perfect
line, due to the discrete nature of the random variables used. This appears
to be the same eect as we noticed for tree methods, see also Boyle & Lau
(1994). We observed, for in the money and out of the money options, a similar
order of convergence with similar log error patterns.
In Fig. 17.2.3 we show for the simplied Euler method the logarithm of
the CPU time versus the negative logarithm of the weak error. For the considered non smooth payo the increase in computational speed is still about
28 times. The simplied Euler method is signicantly more ecient than the
Euler scheme for every level of accuracy. We observed similar results also for
in-the-money and out-of-the-money options. In summary, one can say that
720
Fig. 17.2.2. Log-log plot of weak error for call option with Euler and simplied
Euler scheme
Fig. 17.2.3. Log-log plot of CPU time versus weak error for call option with Euler
and simplied Euler scheme
the proposed, rather simple, random bit generators when combined with simplied schemes can signicantly enhance the eciency of typical Monte Carlo
simulations in nance. It is worth emphasizing that these methods also work
in higher dimensions and for implicit schemes. Since parallel hardware devices will become rather inexpensive in future, one can expect a signicant
increase of their use in Monte Carlo simulation, extending considerably the
applicability of the kind of methods emphasized in this book.
721
The transition probabilities of Markov chains are chosen in such a way that
functionals converge with a desired weak order with respect to vanishing step
size.
Limits of Markov Chains
We have already seen in Chap. 3 how CRR binomial trees weakly approximate
a given diusion process, the geometric Brownian motion. Many problems in
nance and economics can be on a basic level conveniently modeled by Markov
chains. When small time steps are involved, then these Markov chains usually
approximate limiting solutions of corresponding It
o SDEs. When an SDE is
numerically approximated by trees or Markov chains, then one constructs an
approximate characterization of the probability measure of the corresponding
process. The same applies to simulation algorithms and even to nite dierence methods for PDEs, as will be shown below. In all cases the validity and
accuracy of such approximations are of importance. For Monte Carlo methods
we have already discussed their weak order of convergence. A similar approach
was indicated previously for trees, which are Markov chain approximations.
To illustrate the similarities in a simple example, let us examine the weak
convergence of a sequence of random walks to a standard Wiener process on
the time interval [0, 1]. This is one of the simplest cases, where a sequence
of Markov chains approximates a diusion process in a weak sense. For each
N {1, 2, . . .} we partition the time interval [0, 1] at the instants tk = k
with = N1 for k {0, 1, . . . , N }. Then we generate a random walk Y ,
1
starting
at Y0 = 0 with equally probable spatial steps of height = N 2
or , for jumping up or down, respectively, from the current position Yk
Yk+
= Yk
+ k
(17.3.1)
722
1
X
=
x
=
k
:
k
{0,
1,
2,
.
.
.}
,
k
= xj Yk
= xi ). The above description also ts into
where pi,j = P (Y(k+1)
that of a simplied Euler scheme for the trivial SDE of a Wiener process or a
simple symmetric binomial tree, as described previously. In a weak sense these
approximations of the standard Wiener process are somehow equivalent.
In the
walk the spatial step size was the same at every point
above random
1
.
A
natural
generalization is to allow the Markov chain to
of the -grid X
+ b Yn
= Yn
+ a Yn
n
(17.3.2)
Yn+
1
for given functions a and b dened on X
such that
x + a (x) b (x)
1
X
1
1
falls again on the grid X
for all x X
. This forms a recombining tree
or lattice. The functions a and b can be derived rather freely and need to
converge to the limiting functions a and b, respectively, on in an appropriate
manner where moments of increments are matched, as we will discuss below.
In either case we can form an Ito SDE
(17.3.3)
for t [0, 1], the solution could become the limiting diusion process if the
coecient functions a and b are suciently regular and the initial value X0
723
has not to live on a grid. What will be important under the convergence theorem, which we present at the end of the chapter, is that certain moments of
the increments of the Markov chain match those of the limiting process. In
more general cases it may, however, be dicult or cumbersome to write down
explicitly in all detail the state space and the transition matrix for a particular Markov chain. Other descriptions may be more convenient. The trees
of the previous section and the simplied Monte Carlo simulation schemes
of Chap. 11 provide some examples. The following analysis will provide weak
orders of convergence for discrete-time Markov chains that can be achieved
when certain conditions are satised.
Markov Chain Approximations as Numerical Methods
As already indicated above, a usable and intuitively appealing numerical
method for the approximation of functionals of diusions and also jump diffusions are Markov chain approximations. These are, for instance, extensively
discussed and applied in Kushner (1977) and Kushner & Dupuis (1992). The
essential idea involves an approximation of the original process, which solves
approximately a given SDE by a suitable Markov chain for which the corresponding approximate functional converges weakly to that of the diusion
process. Consistency conditions can ensure a type of weak convergence, as
described in Billingsley (1968) and Kushner (1977), for the approximating
Markov chain. Therefore, the convergence of a wide class of functionals is
guaranteed. As early as in Kushner (1977), Markov chains that jump not only
to the nearest neighbor but also to the next nearest neighbor nodes are discussed. Such approximations are expected to give somewhat better weak convergence. This refers, for instance, to the above described pentanomial trees
when compared with binomial trees. The aim of this section is to present a
exible methodology that allows the systematic construction of approximate
Markov chains for a range of numerical tasks. Following Platen (1992), a rich
class of suciently smooth functionals will be shown to converge with given
weak order to those of the underlying process if the maximum time step size
converges to zero.
As in Monte Carlo simulation we will see that higher order Markov chain
approximations require more smoothness of the drift and diusion coecients
and also a more precise approximation of the transition probability of the
increments of the diusion by those of the Markov chain. The convergence
of moments is again automatically covered by the weak convergence criterion
being used.
It will turn out for Markov chains, as proposed in Kushner (1977) and
Kushner & Dupuis (1992), and for binomial and trinomial trees that sucient
smoothness of the drift and diusion coecients, as well as of the payo
function, will give rst order weak convergence. We will also explicitly describe
examples of second- and third order Markov chain approximations. These can
then be used to construct corresponding multinomial trees.
724
t
Xt = X0 +
a(X s ) ds +
b(X s ) dWS ,
(17.3.4)
0
725
almost surely for k > 0. For simplicity, we omit the dependence on the value
Yn of a Taylor scheme in the following conditions similarly to the description
of weak schemes found earlier in the book.
In the case of weak order = 1.0, it can be shown that the Euler scheme
(11.2.1) satises the conditions:
E Yn+1 Yn Atn = a + O(2 ),
E |Yn+1 Yn |2 Atn = b2 + O(2 ),
E (Yn+1 Yn )3 Atn = O(2 ).
(17.3.6)
For the second weak order = 2.0, the weak order 2.0 Taylor scheme
(11.2.6) satises the conditions:
1
1 2
E Yn+1 Yn Atn = a +
a a + b a 2 + O(3 ),
2
2
1
E (Yn+1 Yn )2 Atn = b2 +
2 a (a + b b ) + b2 (2 a + b2
2
+ b b ) 2 + O(3 ),
E (Yn+1 Yn )3 Atn = 3 b2 (a + b b ) 2 + O(3 ),
E (Yn+1 Yn )4 Atn = 3 b4 2 + O(3 ),
E (Yn+1 Yn )5 Atn = O(3 ).
(17.3.7)
Finally, in the case of third weak order = 3.0, we nd that the weak order
3.0 Taylor scheme (11.2.16) has for its increments the conditional moments:
1
1 2
E Yn+1 Yn Atn = a +
a a + b a 2 + A 3 + O(4 ) (17.3.8)
2
2
with
1
1
A=
a a2 + a (a + b b ) + b2 2 a a + 3 a a + 2 a b b
6
2
1 (iv) 2
2
+a bb + b + a
b
,
2
726
1
2 a (a + b b ) + b2 (2 a + b2 + b b ) 2
E (Yn+1 Yn )2 Atn = b2 +
2
+ B 3 + O(4 )
with
B=
1
2 a a 3 a + b2 + b b + b b 3 a + b2 + a b2 (7 a + 8 b b + 2 b b )
6
+ b2 4 a2 + 7 b b a + b2 4 a + b2 + 8 b b2 b + 4 b b a
+
1 2
b 4 a + 5 b2 + 8 b b + b b(iv)
,
2
E (Yn+1 Yn )3 Atn = 3 b2 (a + b b ) 2 + C 3 + O(4 )
with
1 2
a b 9 a + 11 b2 + 7 b b
2
1 2
1 2
2
+ b b b 10 a + 8 b + b (7 a + 28 b b + 4 b b ) ,
2
2
C = a2 (a + 3 b b ) +
E (Yn+1 Yn )4 Atn = 3 b4 2 + D 3 + O(4 )
with
and
1
D = 2 b2 a (3 a + 9 b b ) + b2 6 a + 19 b2 + 7 b b ,
2
E (Yn+1 Yn )5 Atn = 15 b4 (a + 2 b b ) 3 + O(4 ),
E (Yn+1 Yn )6 Atn = 15 b6 3 + O(4 )
E (Yn+1 Yn )7 Atn = O(4 ).
The above formulas can be very useful when designing approximate Markov
chains of some given weak order.
A First Order Markov Chain Approximation
For simplicity, let us rst approximate a one-dimensional time-homogeneous
diusion with d = m = 1 and uniformly bounded drift and diusion coecient
functions a() and b(). We assume that a, b CP4 (, ) and b(x) b0 > 0 for
all x . Let Xh1 denote the h-grid on , which is dened for h > 0 by
727
Xh1 = {x = k h : k {. . . , 1, 0, 1, . . .}}.
For an equidistant time discretization with time step size , the Markov chain
Y shall form a random walk on Xh1 where only jumps to the nearest neighbor
points are possible. Let T be the last point of the time discretization and
Y0 = X0 . We dene for all x Xh1 the functions
Qh (x) = b2 (x)
h = sup Qh (x),
Q
xXh1
=
and
a+ =
a for a 0
0 otherwise,
h2
j
Q
a =
a for a < 0
0 otherwise.
+
b2 (x) h
1
a(x)
h
2
2
Q
728
x=
d
!
zk ek h : zk {. . . , 1, 0, 1, . . .}
k=1
where ek denotes the unit vector in the kth coordinate direction. Let T be the
last point in the time discretization. We introduce for all x d the quantity
k, (x) =
m
(17.3.10)
j=1
d
| k, (x)| 0
=1
=k
d
k,k (x),
k=1
h2
Qh (x)
(17.3.11)
ph (x, x ek h) =
k,k (x)
729
d
1
k,
| (x)| h
k
a (x)
2
2
Qh (x)
i=1
=k
ph (x, x + ek h + e h) = ph (x, x ek h e h) =
1
( k, (x))+
2 Qh (x)
ph (x, x ek h + e h) = ph (x, x + ek h e h) =
1
( k, (x)) (17.3.12)
2Qh (x)
for all k {1, 2, . . . , d} and {1, 2, . . . , d}\{k}, where all other unlisted
transition probabilities are zero.
This provides an example for a multi-dimensional Markov chain approximation. Note that in the convergence theorem the boundedness of a and b
that we assume here will later not be required. Here it is used to allow a
convenient construction of the Markov chain in the given example.
From the convergence theorem it will follow that this d-dimensional
Markov chain converges with weak order = 1 when 0. In this multidimensional case, the theorem will require the Markov chain to have conditional moments of its increments which satisfy the following properties:
k
Ynk Y n = x ak (x) th (x)
E Yn+1
k
+ E (Yn+1
Ynk )(Yn+1
Yn ) Y n = x k, (x) th (x)
p
k
+ E (Yn+1
Ynk )(Yn+1
Yn )(Yn+1
Ynp ) Y n = x
(1 + |x|r ) O(( th (x))2 )
(17.3.13)
(17.3.14)
xXhd
and
=
h2
h
Q
(17.3.15)
h instead of Qh (x)
to normalize all the transition probabilities in (17.3.12) by Q
and nally introduce the probability
ph (x, x) = 1
d
d
ph (x, x + uk ek h + v e h).
730
Again it will follow that this Markov chain converges for 0 with weak
order = 1, if enough smoothness for a and b is guaranteed.
A Second Order Markov Chain Approximation
Let us now describe a second order Markov chain approximation for the onedimensional case, d = m = 1, for given h > 0 and (0, 1) on the h-grid
Xh1 . We assume that a and b are from CP6 (, ) and together with their rst
and second derivatives are uniformly bounded and that b(x) b0 > 0 for all
x . Furthermore, we set Y0 = X0 and take T as the last point of the time
(0, 1) let us introduce the following
discretization. For each pair (x, )
functions
2
= a + a a + b a
1 (x, )
2
2
= b2 + (2 a(a + b b ) + b2 (2 a + b2 + b b ))
2 (x, )
= 3 b2 (a + b b )
3 (x, )
= 3 b4
4 (x, )
= 0,
5 (x, )
where we omit the dependence on x of the right hand sides of the equations.
We set for all x Xh1 and (0, 1)
1 15
3
2
4 ,
Qh, (x) =
3 4
4 h2
and x of the right hand side
where we again suppress the dependence on
and dene
= sup Q (x).
Q
h,
h,
xXh1
h2
,
h,
Q
731
1
1
1
1
1
h
p (x, x h) =
4
2 2
4 h 1 3
2
3
2h
2
h
Qh,
1
1
1
1
1 1
3 h 1
ph (x, x 2 h) =
4 2
h,
3 8 h2
8
4 h
Q
ph (x, x) = 1
ph (x, x + k h),
(17.3.16)
k{2,1,1,2}
for all x Xh1 and n {0, 1, . . .}, with some r N . This condition can be
veried to hold for the above Markov chain. We emphasize that the weak
convergence theorem will not require the boundedness of a and b, and their
. The
derivatives, which in this example is assumed to ensure a nite Q
h,
pentanomial tree, which we have constructed previously, can be shown to
satisfy the above conditions.
One can systematically nd second order Markov chain approximations in
the general multi-dimensional case on the d-dimensional grid Xhd or for a more
general discrete-time approximation. Here assumptions on the conditional moments, as will be stated in the convergence
theorem, need to be satised. Note
4
that one has to use, in general, k=1 dk neighboring points in a transition
step of a Markov chain to achieve weak order = 2. This means, in general,
a second weak order tree needs to have in the one dimensional case ve nodes
to go to, whereas in the two dimensional case 31 nodes are needed. This indicates that, in general, higher order multi-dimensional trees may be dicult
to implement unless the underlying structure of the diusion process allows
major simplications.
A Third Order Markov Chain Approximation
As an example we will construct now a third order Markov chain approximation for the one-dimensional case, d = m = 1, for given (0, 1) and h > 0,
on the h-grid Xh1 . We assume that a and b are from CP8 (1 , 1 ), b(x) b0 > 0
for all x , and that they are uniformly bounded, together with their derivatives, up to the fourth order. Furthermore, we set Y0 = X0 and assume T as
> 0 let us introduce the
the last discretization point. For all x Xh1 and
following functions
732
b2
2
1 (x, ) = 1 = 1 + a (a + a (a + b b )) +
2 a a + 3 a a + 2 a b b
2
2
b2
+ a (b b + b2 ) + aiv
2
6
/
2 = 2 + 2 a (a (3 a + b2 + b b ) + b b (3 a + b2 ))
+ b2 a (7 a + 8 b b + 2 b b ) + 4 a2 + 7 b b a
+ b2 (4 a + b2 ) + 8 b b2 b + b b 4 a
2
b2
+ (4 a + 5 b2 + 8 b b + b biv )
4
6
b2
a (9 a + 11 b2 + 7 b b )
3 = 3 + a2 (a + 3 b b ) +
2
+ b b (10 a + 8 b2 ) +
b2
2
(7 a + 28 b b + 4 b b )
2
b2
2
(6 a + 19 b2 + 7 b b )
4 = 4 + 2 b2 a (3 a + 9 b b ) +
2
5 = 2 15 b4 (a + 2 b b )
6 = 2 15 b6
7 = 0,
and x, and the k are as in the previous
where we omit the dependence on
1
(0, 1) the function
section. We dene for all x Xh and
Qh, (x) =
23
7
1
2
4 +
6 ,
18
18 h2
36 h4
where we again omit on the right hand side the dependence on and x.
Furthermore, we introduce
= sup Q (x)
Q
h,
h,
xXh1
h2
,
h,
Q
733
5
h,
2 2
456 h
24 h3
Q
p (x, x 2 h) =
h
ph (x, x 3 h) =
1
1
1
2
4 +
6
40
12 h2
120 h4
1
1
10
1
3
3
5
h 1 +
3
2
10
57 h
30 h
Qh,
1
1
1
2
4 +
6
90
744 h2
720 h4
1
1 1
1
1
h 1 +
3
5
h,
2 30
456 h
720 h3
Q
ph (x, x) = 1
3
ph (x, x + k h),
(17.3.18)
k=3
k=0
734
(17.4.1)
(17.4.2)
(17.4.3)
(17.4.4)
for x [0, ). This is the Kolmogorov backward equation, which arises from
the Feynman-Kac formula, see Sect. 2.7, when calculating the conditional expectation
(17.4.5)
u(t, Xt ) = E H(XT ) At .
Here E denotes expectation under the pricing measure P .
We recall, as explained in Chap. 3, that a benchmarked pricing function
u(t, Xt ) is obtained as the fair benchmarked price of a benchmarked payo
H(XT ). In this case the expectation (17.4.5) is taken with respect to the real
world probability measure. When using risk neutral pricing an equivalent risk
neutral probability measure needs to exist and the numeraire is the savings
account.
735
(17.4.6)
u(t,x+ x)u(t,x)
x
x))
(u(t,x)u(t,x
x
+ R2 (t, x)
x
(17.4.7)
Here R1 (t, x) and R2 (t, x) are corresponding remainder terms of the standard
Taylor expansions, and x > 0 is a small increment in the spatial direction
of x.
To place these dierences let us introduce an equally spaced spatial grid
1
= {xk = k x : k {0, 1, . . . , N }}
Xx
(17.4.8)
(17.4.9)
and
2 u(t, xk )
uk+1 (t) 2 uk (t) + uk1 (t)
=
+ R2 (t, xk )
x2
( x)2
(17.4.10)
736
(17.4.11)
(17.4.12)
and
for t [0, T ], with , 0. This creates a certain behavior of the PDE
solution at the respective boundaries. As we indicated already in Sect. 2.7,
when we discussed the Feynman-Kac formula, it can be rather important how
the behavior at the boundaries is approximated.
For the vector u(t) = (u0 (t), u1 (t), . . . , uN (t)) we can now transform
the pricing equations that arise from the PDE (17.4.3) into an approximate
system of ODEs that has the form
du(t)
A
+
u(t) + R(t) = 0
dt
(x)2
(17.4.13)
for t [0, T ]. Note the appearance of the vector R(t) of resulting remainder
terms and the corresponding matrix A = [Ai,j ]N
i,j=0 of the above linear system.
The matrix A has the elements
A0,0 = (x)2 ,
A0,1 = (x)2 ,
AN,N = (x)2
AN,N 1 = (x)2
(17.4.14)
for k {1, 2, . . . , N 1} and |k j| > 1. The matrix A thus has the form
(x)2 (x)2 0
0
...
0
0
0
A1,0
A1,1 A1,2 0
...
0
0
0
2,1
2,2
2,3
A A
...
0
0
0
0
A
.
.
.
.
.
.
.
..
..
..
..
..
..
..
A =
.
.
.
N 2,N 2
N 2,N 1
A
0
0
0
0
0
.
.
.
A
0
0
0
0
. . . AN 1,N 2 AN 1,N 1 AN 1,N
0
0
0
0
...
0
(x)2 (x)2
737
A
u(t) dt
(x)2
(17.4.15)
(17.4.16)
We also indicate in (17.4.16) the order of the leading error term, which is
important for the order of overall convergence of the approximation. Some of
these nite dierences can be used at boundaries.
For the second order spatial derivatives the following nite dierences can
be used instead of (17.4.10) for the construction of a numerical method:
2 u(t, xk )
uk 2 uk1 + uk2
=
+ O( x)
2
x
( x)2
2 u(t, xk )
uk+2 2 uk+1 + uk
=
+ O( x)
2
x
( x)2
2 u(t, xk )
uk+1 2 uk + uk1
=
+ O(( x)2 )
2
x
( x)2
738
2 u(t, xk )
2 uk 5 uk1 + 4 uk2 uk3
=
+ O(( x)2 )
2
x
( x)2
2 u(t, xk )
uk+3 + 4 uk+2 5 uk+1 + 2 uk
=
+ O(( x)2 ).
2
x
( x)2
(17.4.17)
( x)2
(17.4.18)
for n {0, 1, . . . , nT }. In our case we know the value of the vector u(T ) =
u(tnT ), which is the payo at maturity. This means, we have
uk (T ) = u(T, xk ) = H(xk )
(17.4.19)
for all k {0, 1, . . . , N }. Now the discrete-time approximate system of difference equations (17.4.18) with terminal condition (17.4.19) can be solved
in backward steps starting from the maturity date. Under appropriate conditions one obtains in this manner an approximate solution of the PDE (17.4.3).
The method described above constitutes a nite dierence method. This is a
widely used method in derivative pricing and quite suitable for a range of
models and exotic options.
Relation to Markov Chains
If one looks back into the previous section on Markov chain approximations,
then one realizes that the above nite dierence method resembles strongly
a Markov chain approximation. The above nite dierence method and the
following Markov chain approximation can indeed be shown to be equivalent
in a certain sense:
Note that (17.4.18) can be written as
u(tn+1 ) = px u(tn )
where
px = I + A
739
(17.4.20)
Here px (xi , xj ) denotes the transition probability for the Markov chain Y
1
1
to move from the state xi Xx
at time tn to the state xj Xx
at time
tn+1 = tn + . Due to (17.4.14) and (17.4.20) we have for i {1, 2, . . . , N 1}
the transition probabilities
1
2
1 b(xi )
for j = i
(x)2
(17.4.21)
p (xi , xj ) =
1
for j = i 1
2 (b(xi ) a(xi ) x) (x)2
0
otherwise.
Obviously, these are probabilities since they sum up to one. At the boundaries
we obtain from (17.4.11) that
for j = 1
x
for j = 0
p (x0 , xj ) = 1
0
otherwise
and by (17.4.12) the relation
x
p (xN , xj ) = 1
for j = N 1
for j = N
otherwise.
Also these quantities turn out to be probabilities for small enough . However, it must be noted that a nite dierence method can also be constructed
such that there is no direct interpretation for an equivalent Markov chain.
This theoretical possibility gives the method more freedom, as we will see
later. Still, for an understanding of some of the numerical properties of these
methods it is useful to draw parallels between the behavior of simplied Monte
Carlo methods, Markov chain approximations and nite dierence methods.
One notes that there are two sources of errors in a nite dierence method.
These are the truncation error caused by the spatial discretization and the
truncation error resulting from the time discretization. Fortunately, there is no
statistical error. We have seen previously that Markov chain approximations
require particular relationships to be satised between the spatial and time
discretization step sizes to obtain nonnegative probabilities. Due to the extra
freedom in a nite dierence method this only loosely applies for nite dierence methods. In the literature it is suggested to use, for an explicit method,
740
1
a ratio of the spatial over the time step size such that 0 < (x)
2 2 . This
choice has a natural explanation because otherwise it is dicult to construct
positive probabilities for an approximating Markov chain, see (17.4.21).
Similar to the moment and numerical stability conditions for Monte Carlo
methods and Markov chain approximations, nite dierence methods also
require similar consistency and stability conditions. These have to ensure,
as in the case of Markov chain approximations, that the approximate PDE
solution evolves, locally, similar to that of the underlying PDE and does not
propagate errors in an uncontrolled manner.
Additionally, when using a nite dierence method one has to be aware
of the fact that the underlying diusion or jump diusion process typically
extends its dynamics beyond the boundaries that one uses for the approximation of the state space. Fortunately, one can show for most situations
that this does not really harm signicantly the computational results, see
Lamberton & Lapeyre (2007), when the boundaries are chosen appropriately.
Finally, we emphasize that as with numerical methods for ODEs and SDEs,
the approximate solution that one generates by a nite dierence method is
clearly a dierent mathematical object, and only by weak convergence linked
to the exact solution of the PDE that motivated its construction.
For the global and overall convergence we have already seen for ODE
solvers and Monte Carlo schemes that the numerical stability is most important. In the literature on nite dierence methods, for instance in Richtmeyer
& Morton (1967) and Raviart & Thomas (1983), one can nd corresponding conditions and convergence theorems that make this precise. For the case
when the above approximate solution can be interpreted as a Markov chain
approximation, Theorem 17.5.1 provides a corresponding weak order of convergence. Its numerical stability can be linked to that of the Markov chain
approximation, using similar notions and methods as described in Chap. 14.
(x)2
(17.4.22)
741
.
(x)2
(17.4.23)
The fully implicit Euler method without stochasticity was shown to have a
relatively large stability region, see Chap. 14.
Crank-Nicolson Method
Another important method is obtained by choosing in (17.4.22) the degree of
implicitness = 12 . This yields the popular Crank-Nicolson method
u(tn+1 ) = u(tn ) +
1
A u(tn+1 ) + u(tn )
.
2
(x)2
(17.4.24)
1
1
(17.4.25)
u(tn+1 ) = I + A
u(tn ).
I A
2 (x)2
2 (x)2
With the large matrix
M =I
1
A
2 (x)2
742
1
I+ A
2 (x)2
u(tn ),
we can rewrite equation (17.4.25), and face at each time step the problem of
solving the linear system of equations
M u(tn+1 ) = B n .
(17.4.26)
The matrix M and the vector B n are known. To obtain the vector u(tn+1 )
that satises (17.4.25) one needs to invert the matrix M . However, this is,
in general, a challenging numerical problem because M is a large matrix.
Fortunately, most of its elements consist of zeros. Additionally, the nonzero
elements typically form diagonal bands. Such matrices are called sparse matrices and there is an area of scientic computing that has specialized on solving
sparse linear systems.
For the solution of sparse, linear systems one uses either direct solvers
or iterative solvers. A direct solver aims to achieve the solution to the given
precision of the computer. A popular direct solver is the tridiagonal solver,
which is also known as the Gaussian elimination method, see Barrett, Berry,
Chan, Demmel, Dongarra, Eijkhout, Pozo, Romine & van der Vorst (1994).
This can only be achieved in certain situations with reasonable eort. When
the matrix A is tridiagonal, reliable and ecient methods exist. On the other
hand, an iterative solver satises only an accuracy criterion and provides more
exibility and eciency. This is usually fully sucient for a nite dierence
method because there is no need to obtain exact solutions for the approximate
dierence equation. Iterative solvers, see Barrett et al. (1994), start from an
initial guess and iteratively improve the solution. Examples are the, so called,
Jacobi method, the Gauss-Seidel method and the successive overrelaxation
method.
Numerical Illustrations of Finite Dierence Methods
We perform some numerical experiments to highlight a few features of nite
dierence methods. We concentrate on the theta method, as described in
(17.4.22). Here this method is applied to the Black-Scholes dynamics for the
underlying security St , satisfying the SDE
dSt = St (r dt + dWt )
(17.4.27)
for t [0, T ] with S0 > 0. At rst we need to decide on the spatial interval
that we like to cover with our nite dierence method. Here we choose zero
to be the lowest grid point in S direction and a level Smax as the upper level.
This level is simply chosen such that a further increase does not change much
the results in the targeted range of accuracy.
For the European call payo that we will consider, we choose for simplicity, so called, Dirichlet boundary conditions. At maturity the approximate
743
Fig. 17.4.1. Prices in dependence on and S0 for the explicit nite dierence
method
pricing function is set equal to the payo function. To obtain the value of the
approximate pricing function at one step back in time, we invert the resulting
tridiagonal matrix, similar to that in (17.4.26), by using as direct solver the
Gaussian elimination method. This then provides at each time step and each
spatial grid point the approximate value of the European call option.
Since for European call options we have the explicit Black-Scholes formula,
one can analyze for dierent parameter settings the dierences between nite
dierence prices and theoretical prices from the Black-Scholes formula. The
main requirement under the given Black-Scholes dynamics for the numerical
stability of the explicit theta method, that is for = 0, translates into the
variable
2 2
S
+ r ,
(17.4.28)
=
(x)2
which needs to be less than one, see Wilmott et al. (1993). If this condition
is satised, then errors will not be enhanced during the computation.
In Figure 17.4.1 we plot prices for the explicit nite dierence method with
= 0 dependent on the parameter when measured at the money. Here we
set = 0.2, r = 0.05, T = 1 and K = 1 and consider a European call price in
dependence on S0 (0, 3).
One notes that for larger than one pricing errors become apparent that
distort signicantly the result. These errors are due to the fact that the time
step size is too large compared to the spatial step size. From Figure 17.4.1
we notice some instability of the explicit nite dierence method for too large
time step sizes. If one compares this with Markov chain approximations, then
one can see that for too large time step size one has, potentially, diculties
in obtaining positive probabilities when going to the next nodes, which, in
744
(tn+1 tn )
n{0,1,...}
745
(17.5.1)
Furthermore, for all n the stopping time tn+1 is assumed to be Atn -measurable.
For instance, (t) can be equidistant with step size . But also a step size
control is possible that is based on the information available until the previous
discretization point. Such a step size control has been, for instance, indicated
in (17.3.11).
We shall call a d-dimensional process Y = {Y t , t [0, T ]}, which is rightcontinuous with left hand limits, a discrete-time approximation if it is based
on a time discretization (t) with maximum step size (0, 1) such that
Y n = Y tn is Atn -measurable for each n {0, 1, . . .} and Y n+1 can be expressed
as a function of Y 0 , . . . , Y n , t0 , . . . , tn , tn+1 and a nite number of Atn+1 measurable random variables Z n+1,j for j {1, 2, . . . , }.
This denition includes, for instance, the case of a Markov chain if we
set = 1 and Z n+1,1 = Y n+1 Y n , where the random variable Z n+1,1 is
characterized by the transition probabilities
P Z n+1,1 = y x Y n = x = pn (x, y) = P Y n+1 = y Y n = x
for n {0, 1, . . .} and x, y d . It provides the possibility of a recursive
simulation of Y along the points of the time discretization. It also allows us
to construct recombining trees or lattices. Obviously, it covers weak Monte
Carlo simulation methods.
Various kinds of interpolations are possible between the time discretization
points. For the case of a Markov chain, one can simply choose a piecewise
constant interpolation such as
Y t = Y nt
for all t [0, T ], see (17.5.1).
Furthermore, we introduce the operators
L0 =
d
k=1
ak
1 k, 2
+
xk
2
xk x
(17.5.2)
k,=1
and
Lj =
d
k=1
bk,j
xk
(17.5.3)
for j {1, 2, . . . , m}, see (17.3.10), which are dened on the set of twice
dierentiable functions on d .
We set for easier notation Wt0 = t, bk,0 (x) = ak (x) and write for given
N and ph {1, 2, . . . , d} the one step weak order Taylor approximation,
see (11.2.30), of its ph th component
tn+1
s2
m
ph
n+1
= Ynph +
Lj1 . . . Ljk1 bph ,jk (Y n )
dWsj11 dWsjkk .
k=1 j1 ,...,jk =0
tn
tn
(17.5.4)
746
h=1
K 1 + max |Y k |2r (tn+1 tn )
0kn
(17.5.5)
for all n {0, 1, . . .},
(iii)
ak , bk,j C 2(+1) (d , )
(17.5.6)
(17.5.7)
(v) for each p N there exist constants K < and r N that do not
depend on such that for all q {1, 2, . . . , p} it holds that
(17.5.8)
E
max |Y n |2q At0 K(1 + |Y 0 |2r )
0nnT
and
2q
2r
Atn K 1 + max |Y k |
E |Y n+1 Y n |
(tn+1 tn )q (17.5.9)
0kn
747
(d , )
Suppose for xed weak order N and payo function g CP
the assumptions of the theorem to be satised. In the following, we list some
notations and results that we will need for the proof.
748
t
s,y
s,y
a(X z ) dz +
b(X s,y
X =y+
z ) dW z
s
(17.5.10)
for t [s, T ], which starts at time s [0, T ] in y d and has the same drift
and diusion coecients as that dened in (17.3.4).
For all s [0, T ] and y d , we dene the functional
u(s, y) = E(g(X s,y
T ))
(17.5.11)
0
)) = E(g(X T )),
u(0, X 0 ) = E(g(X 0,X
T
(17.5.12)
and it is
where X was dened in (17.3.4). According to the Feynman-Kac formula,
see Sect. 2.7, the function u(, ) is the unique solution of the Kolmogorovbackward equation
(17.5.13)
L0 u(s, y) = 0
for all (s, y) (0, T ) d with terminal condition
u(T, y) = g(y)
(17.5.14)
u(s, ) = CP
(d , )
(17.5.15)
u(t
(17.5.16)
E u tn , X tn2
,
y)
n1
n
According to an estimate in Krylov (1980), there exists for each p N a
constant K < such that for all q {1, 2, . . . , p}, all y d and all n
{1, 2, . . . , nT } it holds that
t
,Y
Atn1 K 1 + |y|2q (tn tn1 )q .
E |X tn1
y|
(17.5.17)
n
To achieve a compact notation we introduce for all N the set
P = {1, 2, . . . , d} ,
and dene for all y = (y 1 , . . . , y d ) d and p = (p1 , . . . , p ) P the
product
"
yhp ,
(17.5.18)
Fp (y) =
h=1
749
(17.5.19)
Therefore, it is shown that for each p N there exist constants K < and
r N such that for all {1, 2, . . . , 2( + 1)}, q {1, 2, . . . , p}, p P and
n {1, 2, . . . , nT } it holds that
2q
E |Fp ( n Y n1 )| Atn1 K 1 + |Y n1 |2r (tn tn1 )q . (17.5.20)
Using an estimate from Krylov (1980), p. 78, one has for q {1, 2, . . .} and
i {0, 1, . . . , nT 1} a function f , with
R=
sup
tn stn+1
|f (s)|
2q
At
q1
<
n1
E
tn+1
tn
s2
tn
g(s1 ) dWsj11
2q q1
dWsj
Kq, (tn+1 tn )
(17.5.21)
(17.5.22)
750
n
T
() E
(u(tn , Y n ) u(tn1 , Y n1 ))
n=1
(u(tn , X tn1
n
,Yn1
.
) u(tn1 , Y n1 )) + K
n
T
(u(tn , Y n ) u(tn1 , Y n1 ))
E
n=1
(u(tn , X tn1
n
= H 1 + H2 + K
,Yn1
.
) u(tn , Y n1 )) + K
(17.5.23)
with
n
T
.
H1 = E
(u(tn , Y n ) u(tn , Y n1 )) (u(tn , n ) u(tn , Y n1 ))
n=1
(17.5.24)
and
n
T
(u(tn , n ) u(tn , Y n1 ))
H2 = E
n=1
.
tn1 ,Yn1
) u(tn , Y n1 )) .
(17.5.25)
(u(tn , X tn
2+1
nT
1
yp u(tn , Y n1 ) Fp (Y n Y n1 ) + Rn (Y n )
H1 = E
!
n=1
pP
=2
2+1
1
p
.
y u(tn , Y n1 ) Fp ( n Y n1 ) + Rn ( n )
!
pP
=1
(17.5.26)
Here the remainder terms are of the form
1
Rn (Z) =
yp u(tn , Y n1 + p,n (Z)(Z Y n1 ))
(2( + 1))!
pP2(+1)
Fp (Z Y n1 )
(17.5.27)
k,j
p,n
(Z) =
p,n
0
751
(17.5.28)
otherwise
2+1
nT
1
yp u(tn , Y n1 )
H1 E
!
n=1
=1
pP
E Fp (Y n Y n1 ) Fp ( n Y n1 ) Atn1
*
+ E |Rn ( n )| At
. (17.5.29)
+ E |Rn (Y n )| At
n1
n1
Now, we obtain, with the help of (17.5.27), (17.5.15), (17.5.25), (17.5.28) and
(17.5.9), for all n {1, 2, . . . , nT }
E |Rn (Y n )| Atn1
K
12
2
E yp u(tn , Y n1 ) + p,n (Y n ) (Y n Y n1 ) Atn1
pP2(+1)
12
2
E |Fp (Y n Y n1 )| Atn1
1
K E 1 + |Y n1 |2r + |Y n Y n1 |2 Atn1 2
12
E |Y n Y n1 |4(+1) Atn1
K 1 +
max
0kn1
|Y k |2r (tn tn1 ).
(17.5.30)
In a similar way we obtain from (17.5.27), (17.5.4), (17.5.7) and the moment property (17.5.21) of multiple It
o integrals, for all n {1, 2, . . . , nT } the
estimate
E |Rn ( n )| Atn1 K 1 + |Y n1 |2r (tn tn1 ).
(17.5.31)
From (17.5.29), it follows now by application of equations (17.5.15),
(17.5.19), (17.5.4), (17.5.5), (17.5.30), (17.5.31), (17.5.8), (17.5.9) the inequality
752
H1 E
n
T
K 1 + max |Y k |2r (tn tn1 )
n=1
0knT
K 1 + E
max |Y k |2r
0knT
K 1 + E |Y 0 |2r
K .
(17.5.32)
nT
yp u(tn , Y n1 )
1
H2 E
!
n=1
pP
t
,Yn1
Atn1
E Fp ( n Y n1 ) Fp (X tn1
Y
)
n1
n
*
tn1 ,Yn1
+ E |Rn ( n )| Atn1 + E |Rn (X t
)| Atn1
(17.5.33)
n
n
T
.
tn1 ,Yn1
2r
)| Atn1
K 1 + |Y n1 | (tn tn1 ) + E |Rn (X tn
.
n=1
K E
max |Y k |2r + 1 .
0knT
It follows with condition (17.5.8), and because Y 0 has nite moments, that
H2 = K .
(17.5.34)
17.6 Exercises
753
17.6 Exercises
> 0 at maturity
17.1. For a European put option with benchmarked strike K
T = i, i {1, 2, . . .}, and time step size (0, 1), write down the multiperiod binomial tree option pricing formula at time t = 0 when a positive
18
Solutions for Exercises
2
dWs , 2
Ws dWs + ds
[W, (W ) ]t =
0
2 Ws ds.
0
755
756
2 from (1.7.5) or
2
t
2 exp{(t s)} dWs
=E
t0
exp{2(t s)} ds
=2
t0
= 1 exp{2(t t0 )}.
= X0 exp
1 1 1
2 2 2
2
l
Wt W0l
t+
3
= X0 exp t + Wt1 + Wt2 .
2
l=1
757
1.8 By the It
o formula one obtains
k 2 b2
dXt = Xt k a +
dt + k b dWt + (exp{k c} 1) dNt
2
for t [0, T ] with X0 = 1.
1.9 In the above linear SDE for Xt we take the expectation and obtain
k 2 b2
+ (exp{k c} 1) dt
d(t) = (t) k a +
2
for t [0, T ] with (0) = 1, which yields
k 2 b2
+ (exp{k c} 1) .
(t) = exp t k a +
2
1.10 We obtain by the formula (1.1.20) for the Poisson probabilities that
k2
et ( t)k
k!
k=1
k1
k1
t ( t)
t ( t)
+
= t
(k 1) e
e
(k 1)!
(k 1)!
E(Nt2 ) =
k=1
k=1
= t ( t + 1).
1.11 By the independence of the marks from the Poisson process it follows
that
N
t
t
.
E(Yt ) = E
k = E(Nt ) E(k ) =
2
k=1
1.12 The probability for a compound Poisson process with intensity > 0 of
having no jumps until time t > 0 equals the probability of the Poisson process
N of having no jumps until that time. Thus, by (1.1.20) we obtain
P (Nt = 0) = et .
758
1
2(t s)
1 2
(y1 x1 1 t)2 2(y1 x1 1 t)(y2 x2 2 t) + (y2 x2 2 t)2
exp
2(t s)(1 2 )
for t [0, ), s [0, t] and x1 , x2 , y1 , y2 .
2.2 We obtain the SDEs
dXt1 = 1 dt + dWt1
and
dXt2 = 2 dt + dWt1 +
1 2 dWt2 ,
for t 0.
2.3 The logarithms of each component of the geometric Brownian motion are
linearly transformed correlated Wiener processes. This leads to the transition
density
p(s, x1 , x2 ; t, y1 , y2 ) =
1
2(t s)
1 2 b2 y1 y2
#
2
1
1 2
ln(y1 ) ln(x1 ) + b (t s)
exp 2
2 b (t s) (1 2 )
2
1
1
2 ln(y1 ) ln(x1 ) + b2 (t s)
ln(y2 ) ln(x2 ) + b2 (t s)
2
2
*!
2
1
+ ln(y2 ) ln(x2 ) + b2 (t s)
2
for t [0, ), s [0, t] and x1 , x2 , y1 , y2 + .
2.4 The SDEs are the following
dXt1 = Xt1 b dWt1
dXt2 = Xt2 b dWt1 +
for t 0 and X01 , X02 + .
1 2 dWt2
759
= u(t, Xt ) +
t
u(s, Xs ) + a Xs
u(s, Xs )
t
X
b
Xs2
u(s,
X
)
+
(u(s,
X
(1
+
c))
u(s,
X
))
ds
s
s
s
2
X 2
u(s, Xs ) dWs
+
b Xs
X
t
T
+
(u(s, Xs (1 + c) u(s, Xs ))) (dNs ds).
2
Since u(t, Xt ) is a martingale the drift in the above SDE has to vanish and it
needs to hold for all (t, x) [0, T ] + that
b2
u(t, x) + a x
u(s, x) + x2
u(s, x) + (u(s, x(1 + c)) u(s, x)) = 0
t
X
2
X 2
and H(x) = u(T, x), which conrms the Feynman-Kac formula.
2.6 We consider the square root process Y = {Yt , t [0, )} of dimension
> 2 satisfying the SDE
2
c + b Yt dt + c Yt dWt
dYt =
4
for t [0, ), Y0 > 0, c > 0 and b < 0. The It
o integral
t
Ys dWs
Mt = c
0
forms a martingale due to Lemma 1.3.2 (iii), since the square root process,
as a transformed time changed squared Bessel process, has moments of any
positive order. Consequently,
2
c + b E(Yt ) dt
dE(Yt ) =
4
for t [0, ) with E(Y0 ) > 0. Therefore, we obtain
E(Yt ) = E(Y0 ) exp{b t} +
c2
(1 exp{b t}).
4b
760
2.7 Using the time homogenous Fokker-Planck equation the stationary density
of the ARCH diusion model is of the form
2
!
(2 u)
C
2
p( ) = 2 4 exp 2
du
2 u2
2
!
2
1
C
2 2
2 1
du
= 2 4 exp
du 2
2
2
2 u
2 u
2 2
C
1
1
2
2
= 2 4 exp
2 + 2 ln( ) ln( )
2
2
+2
1 2
2 2 1
,
= C1 exp 2
2
2
which is an inverse gamma density with an appropriate constant C1 > 0.
2.8 The stationary density for the squared volatility needs to satisfy the expression
2
!
(2 u) u
C
2
p( ) = 2 6 exp 2
du
2 u3
2
!
2
2
C
1
2 2 1
du
= 2 6 exp
du
2
2
2 u
2 u
2
C
2 1 1 ln(2 ) ln(2 )
= 2 6 exp
2
2
2
2
C
2 1 1 ln(2 ) + ln(2 )
= 2 6 exp
2
2
2
22 +3
1
1
2
= C1 exp 2 2 2
,
2
which is an inverse gamma density.
2.9 For the Heston model we obtain the stationary density for the squared
volatility
!
2
(2 u)
C
2
p( ) = 2 2 exp 2
du
2 u
2
!
2
2
2
C
2 2
2 2 1
2 2
2
2 2 1
= 2 2 exp
du
)
=
C
exp
)
,
(
1
2 2 u
2
2
which is a gamma density.
761
St
2(St )2
2
d
d
d
d
1
j
j
k
k
= rt +
,t
bj,k
,t
bj,k
dt
t t
t t
2
j=1
j=1
d ln(St ) =
k=1
d
d
k=1
j
k
,t
bj,k
t dWt .
k=1 j=1
2
d
d
d
j j,k
1 j j,k
gt = rt +
,t bt tk
b
.
2 j=1 ,t t
j=1
k=1
St
St
+ dSt + d , S
dSt = St d
S
St
St
t
d
d
d
St
S
j
tk dt + dWtk
= rt dt
tk dWtk + t rt dt +
,t
bj,k
t
St
S
t
k=1
k=1 j=1
d
d
St k j j,k
+
t
,t bt
St k=1
j=1
= St
d
k=1
d
j
k
dWtk .
,t
bj,k
t t
j=1
762
t
S
t
St the SDE
St t
1
1 1
(
St = (
S
dWt
dt
+
2 4 32 t t
S
2 St
2
St
t
t
3 t
1
(
dt +
8
2
St
(
t dWt ,
St
St
= (1 t Yt ) dt +
Yt dWt ,
1
dt
Yt
1
Yt
32
3
dWt = t |t |2 dt |t |2 2 dWt .
763
t0 +h
t0 +h
dz dWs +
t0
t0
b2
(Wt0 +h Wt0 )2 h
2
dWs dz
t0
t0
+ R.
By application of the It
o formula to (Wt Wt0 )(t t0 ) we can write
h2
= Xt0 a h + b (Wt0 +h Wt0 ) + a2
+ a b (Wt0 +h Wt0 ) h
2
b2
2
(Wt0 +h Wt0 ) h
+
+ R.
2
4.3 According to the denitions of the multi-index notation we get for
= (0, 0, 0, 0) : = (0, 0, 0), = (0, 0, 0), () = 4, n() = 4;
= (1, 0, 2) : = (0, 2), = (1, 0), () = 3, n() = 1;
= (0, 2, 0, 1) : = (2, 0, 1), = (0, 2, 0), () = 4, n() = 2.
4.4 By using the introduced notation one obtains
t
s
t
t2
dz ds =
s ds = ,
I(0,0),t =
2
0
0
0
t
s
I(1,0),t =
dWz ds
0
I(1,1),t =
dWz dWs =
0
Ws dWs
0
t
s
t
I(1,2),t =
dWz1 dWs2 =
Ws1 dWs2 .
0
764
4.6
E(I(1,0), ) = E
dWz ds
0
s2 ds =
0
2
dz dWs =
2
dz
ds
3
,
3
dWs
0
E(I(1), I(0,1), ) = E
E((I(1,0), )2 ) = E
E(Ws ) ds = 0,
E((I(0,1), ) ) = E
s dWs
s ds =
=
0
2
,
2
2
2
= 2 E (I(1), )2 2 E I(1), I(0,1), + E I(0,1),
= 3 3 +
3
3
=
.
3
3
and
4.8 The following sets are not hierarchical sets: , {(1)}, {v, (0), (0, 1)}.
4.9 The remainder sets are:
B({v, (1)}) = {(0), (0, 1), (1, 1)}
B({v, (0), (1), (0, 1)}) = {(1, 1), (1, 0), (0, 0), (1, 0, 1), (0, 0, 1)}.
765
Xt = X0 + a(0, X0 )
ds + b(0, X0 )
0
+ b(0, X0 )
b(0, X0 )
x
dWs
0
s2
dWs1 dWs2
0
= X0 + a(0, X0 ) t + b(0, X0 ) Wt
+ b(0, X0 )
1
b(0, X0 ) ((Wt )2 t).
x
2
2
Yn ((W )2 ).
2
5.2 Due to the additive noise of the Vasicek model the Euler and Milstein
schemes are identical and of the form
Yn+1 = Yn + (
r Yn ) + W,
where W = Wn+1 Wn .
5.3 The strong order 1.5 Taylor scheme has for the given SDE the form
766
Yn+1 = Yn + (Yn + ) + Yn W +
+ Yn Z +
Yn 2
((W )2 )
2
1
( Yn + ) 2
2
+( Yn + ) (W Z)
1
1 3
2
(W ) W,
+ Yn
2
3
where W = Wn+1 Wn and
Z =
n+1
(Ws Wn ) ds.
5.4 The explicit strong order 1.0 scheme has the form
Yn+1 = Yn + Yn + Yn W
Yn
+ ((W )2 ),
+
2
where W = Wn+1 Wn .
Solutions for Exercises of Chapter 6
6.1 It follows that the diusion coecients for the rst Wiener process W 1
are
and
b2,1 = 0
b1,1 = 1
and that of the second Wiener process are
b1,2 = 0
and
b2,2 = Xt1 .
2,2
b + b2,1 2 b2,2 = 1
x1
x
and
2,1
b + b2,2 2 b2,1 = 0.
1
x
x
Since the above values are not equal, the SDE does not satisfy a diusion
commutativity condition.
L2 b2,1 = b1,2
6.2 The Milstein scheme applied to the given SDE is of the form
1
Yn+1
= Yn1 + W 1
2
Yn+1
= Yn2 + Yn1 W 2 + I(1,2)
with
767
n+1
I(1,2) =
n
dWs11 dWs22 ,
W 1 = W1n+1 W1n
W 2 = W2n+1 W2n .
6.3 According to the jump commutativity condition (6.3.2) we need to consider the case of b(t, x) = 2x and c(t, x) = 2x. Then it follows
c(t, x)
=2
x
and we have the relation
b(t, x)
c(t, x)
= 2(x + 2x) 2x = b(t, x + c(t, x)) b(t, x).
x
Yn (1 + W )
.
1
7.2 The drift implicit strong order 1.0 Runge-Kutta method for the Vasicek
model yields the equation
Yn+1 = Yn + (
r Yn+1 ) + W
and thus the algorithm
Yn+1 =
Yn + r + W
,
1+
where W = Wn+1 Wn .
7.3 A balanced implicit method for the BS model is obtained by setting
Yn+1 = Yn + Yn + Yn W + (Yn Yn+1 ) |W |.
768
Yn (1 + + (W + |W |))
.
1 + ||W |
b(Yn Yn )
{(Wn )2 n } + b c Yn pn Wn
2 n
c2 Yn
{(pn )2 pn }
2
with supporting value
Yn = Yn + b Yn
+
n .
769
rt drt + r (rT r0 )
.
= T 0
t
2
2T
r
dt
2
r
r
dt
+
(
r
)
t
t
0
0
9.2 One can calculate the quadratic variation of ln(X) = {ln(Xt ), t [0, T ]},
where
1
d ln(Xt ) = a 2 dt + Wt
2
for t [0, T ]. Then the quadratic variation is of the form
[ln(X )]t = 2 t.
Consequently, to obtain the volatility one can use the expression
)
[ln(X )]T
.
=
T
770
2
Wt
1 + 2 t
E W
W
= 0.
Furthermore, it follows
E
and thus
E
2
=
1
1
+ =
2
2
2
W
= 0.
+ E
= 0 K 2 .
+ E
E W
W
W
E W = E
=E
= 0.
Furthermore, calculation of the expected values provides
2 1
1
W
= 3 + 3 = ,
E
6
6
E
4
=
771
1
1
9 2 + 9 2 = 3 2 ,
6
6
+ E
+ E
W
E W
2
4
2
W
W
+ E
+ E
3 = 0 K 3 .
11.3 The simplied Euler scheme for the BS model is of the form
n ,
Yn+1 = Yn 1 + + W
where = n+1 n and
n = = 1
P W
2
for n {0, 1, . . .} with Y0 = X0 . This scheme attains the weak order = 1.0.
11.4 The simplied weak order 2.0 Taylor scheme has for the BS model the
form
2 2
2 2
Wn + Wn +
Yn+1 = Yn 1 + + Wn +
2
2
with = n+1 n and
n = 3 = 1
P W
6
and
n = 0 = 2
P W
3
1 + Wn
1
772
Yn+1 = Yn
1
1+
( 2
n
) W
W
Ytn+1 = Ytn + a Ytn (tn+1 tn ) + Ytn W
(t
t
)
n+1
n
n
n
2
1
t (tn+1 tn )
+ a2 Ytn (tn+1 tn )2 + a Ytn W
n
2
773
with
Ytn+1 = Ytn+1 + c Ytn+1 Nn
where
t = 3 (tn+1 tn ) = 1
P W
n
6
2
t = 0 =
P W
n
3
1 if tn+1 is a jump time of N
Nn =
0 otherwise.
and
3
2
E (Gn+1 (, )) = E
1 + 1 + Wn
2
2
3
3
= 1 + 2 1 + 1 2 2
2
2
2
3
= 1 + 2 (1 2 ) + 1 2 2 .
2
Therefore, we have by (14.1.8) for < 0 that
E (Gn+1 (, ))2 < 1
if and only if
774
2
3
2 (1 2 ) + 1 2 2 < 0,
2
that is,
3
2 (1 2 ) > 1
2
2
.
This means, the boundary B() for values of the stability region as a
function of [0, 1) is given by the function
+
2(1 2)
B() =
.
(1 32 )2
14.2 It follows from (14.4.3) with = = = 1 that
Gn+1 (, 1) =
1
1+
||
2
(1 +
||
2
|| 2
)
n
||W
Consequently, we obtain
E (|Gn+1 (, 1)|) =
1+
||
2
which is less than one for > ||
.
Therefore, the fully implicit simplied Euler scheme is for = 1 asymp2
totically p-stable with p = 1 as long as the step size is not smaller than ||
.
q 2
q2 2
(T t)
u(t, St ) = (St ) exp (T t) +
2
2
2 2
q
E exp
(T t) + q (WT Wt ) At .
2
q
775
Since the term under the conditional expectation forms with respect to T an
(A, P )-martingale we get the pricing function in the form
q
u(t, S) = (S)q exp
(q 1) 2 (T t)
2
for t [0, T ] and S (0, ). The corresponding martingale representation is
obtained by application of the It
o formula to u(t, St ), which yields
(ST )q = u(T, ST )
= u(t, St ) +
t
T
u(z, Sz ) 1 2 u(z, Sz ) 2 2
+
Sz dz
t
2
S 2
u(z, Sz )
Sz dWz
S
t
T
q
q
(Sz )q exp
(q 1) 2 (T t)
(1 q) 2
= u(t, St ) +
2
2
t
q
1
(q 1) 2 (T z) 2 Sz2 dz
+ q (q 1) Sz(q2) exp
2
2
T
u(z, Sz )
Sz dWz
+
S
t
T
u(z, Sz )
Sz dWz .
= u(t, St ) +
S
t
+
15.2 By using Theorem 15.4.1 the hedge ratio is obtained as the partial derivative
u(t, St )
q (q 1) 2
=
(T t)
(St )q exp
S
S
2
q (q 1) 2
= q (St )q1 exp
(T t) .
2
15.3 By the Black-Scholes formula we have
cT,K (t, St ) = St N (d1 (t)) K N (d2 (t))
with
d1 (t) =
ln
St
K
+ (T t)
2
T t
and
d2 (t) = d1 (t)
T t.
776
1
2
+ 2 S 2
cT,K (t, S) = 0
t 2
S 2
for all (t, S) (0, T ) (0, ), we obtain the martingale representation
cT,K (T, ST ) = (ST K)+
= cT,K (t, St ) +
t
cT,K (z, Sz )
dSz .
S
2
N
1
1
= E
(Z(k ))2 1
Z(k ) +
N
2
k=1
2
1
1
=
E
Z+
(Z)2 1
N
2
1
1
2
4
2
E (Z) 2 E (Z) + 1
=
E (Z) +
N
4
3
1
1
.
=
1 + (3 2 + 1) =
N
4
2N
777
Alternatively, we obtain
N
1 1
1
Var VN = E
1 + Z(k ) + (Z(k ))2
2 N
2
k=1
2
N
1
1
3
+
1 Z(k ) + (Z(k ))2
N
2
2
k=1
N
2
1
1
= E
(Z(k ))2 1
N
2
k=1
2
1 1
E
N 4
1
1
(3 2 + 1) =
.
4N
2N
(Z)2 1
1
E (Z)4 2 E (Z)2 + 1
4N
This shows that the antithetic method provides a Monte Carlo estimator or
yielding only a third of the variance of a raw Monte Carlo estimator.
16.2 For E(VN ) = = 1 it is E(VN ) = 23 ,
The variance of VN is then obtained as
N
1
Var VN = E
1 + Z(k ) +
N
k=1
1
(Z(k ))2
2
2
N
1
3
+ 1
(1 + Z(k ))
N
2
k=1
2
N
1
1
= E
(Z(k ))2 1
Z(k ) (1 ) +
N
2
k=1
2
1
1
=
E
Z (1 ) +
(Z)2 1
N
2
1
1
=
E (Z)2 (1 )2 +
E (Z)4 2 E (Z)2 + 1
N
4
1
1 1
1
=
+ (1 )2 .
(1 )2 + (3 2 + 1) =
N
4
N 2
It turns out that the minimum variance can be achieved for min = 1, which
yields Var(VN ) = 21N .
778
XTt,x
Therefore, we obtain
2
At = x2 exp{(2 + 2 ) (T t)}.
t0,x = 2 .
d t, X
16.4 The fully drift implicit simplied weak Euler scheme for the Heston
model, when approximating X by Y and v by V , has for an equi-distant time
discretization with time step size the form
Yn+1 = Yn + (rd + rf ) Yn+1 + k
Vn+1 = Vn + (Vn+1 v) + p
1
Vn Yn W
n
1+
Vn W
n
2
1 2 W
n
1
Yn 1 + k Vn W
n
Yn+1 =
1 (rd + rf )
and
Vn v + p
Vn+1 =
1
W
n
n1 +
Vn W
n2
1 2 W
such that
1 = = P W
2 = = 1.
P W
n
n
2
with independent
and
2
W
n
779
16.5 The simplied weak Euler scheme with implicit drifts and implicit diffusion term for X has the form
Yn+1 = Yn + (rd + rf ) Yn+1 k 2 Vn+1 Yn+1 + k
Vn+1 = Vn + (Vn+1 v) + p
1
Vn+1 Yn+1 W
n
2
Vn W
n
Here we obtain
Yn+1 =
Yn
1 (rd + rf ) +
and
Vn+1
k2
Vn+1 k
n1
Vn+1 W
2
Vn v + p Vn W
n
.
=
1
u = exp{ } 1
with probability p =
d
ud
} 1
i
k=0
=K
+
i!
(1 + u)k (1 + d)ik S0
pk (1 p)ik K
k ! (i k) !
k
k=0
S0
i!
pk (1 p)ik
k ! (i k) !
k
k=0
i!
pk (1 p)ik (1 + u)k (1 + d)ik ,
k ! (i k) !
where k denotes the rst integer k for which S0 (1 + u)k (1 + d)ik < K.
780
17.2 A trinomial recombining tree can achieve, in general, weak order = 1.0
because three conditions can, in general, be satised to construct the transition probabilities to match the rst two conditional moments of its increments.
17.3 For a recombining tree to achieve weak order = 2.0 it is, in general,
necessary to have ve possible nodes at each time step where the Markov
chain could goto in one time step. Otherwise, the transition densities could,
in general, not be chosen to match the necessary moments.
Acknowledgements
781
Bibliographical Notes
783
784
Bibliographical Notes
Sufana (2004), Beskos, Papaspiliopoulos & Roberts (2006, 2008), Casella &
Roberts (2007), Da Fonseca, Grasselli & Tebaldi (2007, 2008), Grasselli &
Tebaldi (2008) and Chen (2008).
Chap. 3: Benchmark Approach to Finance and Insurance
This chapter summarizes an alternative framework for pricing and hedging
beyond the classical risk neutral paradigm, see Delbaen & Schachermayer
(1998, 2006). It is extending the numeraire portfolio idea, see Long (1990),
Bajeux-Besnainou & Portait (1997) and Becherer (2001), and has been developed in the book Platen & Heath (2006) and a sequence of papers, including
Platen (2001, 2002, 2004a, 2004b, 2004c, 2005a, 2005b, 2006), B
uhlmann
& Platen (2003), Platen & Stahl (2003), Christensen & Platen (2005, 2007),
Hulley et al. (2005), Miller & Platen (2005, 2008, 2009), Platen & Runggaldier
(2005, 2007), Le & Platen (2006), Christensen & Larsen (2007), Hulley &
Platen (2008a, 2008b, 2009), Kardaras & Platen (2008), Bruti-Liberati et al.
(2009), Filipovic & Platen (2009), Galesso & Runggaldier (2010), Hulley (2010)
and Hulley & Schweizer (2010).
Other literature that steps out of the classical pricing paradigm includes Rogers (1997), Sin (1998), Lewis (2000), Loewenstein & Willard (2000a,
2000b), Fernholz (2002), Cheridito (2003), Cox & Hobson (2005), Delbaen &
Schachermayer (2006), Bender, Sottinen & Valkeila (2007), Heston, Loewenstein & Willard (2007), Jarrow, Protter & Shimbo (2007a, 2007b), Karatzas
& Kardaras (2007) and Ekstr
om & Tysk (2009).
Chap. 4: Stochastic Expansions
The Wagner-Platen expansion was rst formulated in Wagner & Platen (1978)
and developed further in Platen (1981, 1982a, 1982b, 1984), Azencott (1982),
Liske (1982), Liske, Platen & Wagner (1982), Platen & Wagner (1982), Milstein (1988a, 1995a), Yen (1988, 1992), BenArous (1989), Kloeden & Platen
(1991a, 1991b, 1992), Kloeden, Platen & Wright (1992), Hu & Meyer (1993),
Hofmann (1994), Gaines (1994, 1995), Gaines & Lyons (1994), Castell &
Gaines (1995) Li & Liu, (1997, 2000), Lyons (1998), Burrage (1998), Takahashi
(1999), Kuznetsov (1998, 1999, 2009), Misawa (2001), Ryden & Wiktorsson
(2001), Studer (2001), Tocino & Ardanuy (2002), Schoutens & Studer (2003),
Kusuoka (2004), Lyons & Victoir (2004), Milstein & Tretjakov (2004) and
Bruti-Liberati & Platen (2007b).
Related other work on stochastic expansions includes Stroud (1971), Engel
(1982), Kohatsu-Higa (1997) and Bayer & Teichmann (2008).
Chap. 5: Introduction to Scenario Simulation
This chapter summarizes results on Euler schemes and some other discretetime strong approximations relating to work in Maruyama (1955), Franklin
Bibliographical Notes
785
(1965), Clements & Anderson (1973), Allain (1974), Wright (1974), Milstein,
(1974, 1978), Fahrmeir (1976), Yamada (1976), Doss (1977), Kloeden &
Pearson (1977), Wagner & Platen (1978), Sussmann (1978), Yamato (1979),
Gikhman & Skorokhod (1979), Gorostiza (1980), Clark & Cameron (1980),
R
umelin (1982), Talay (1982c, 1983), Clark (1984, 1985), Janssen (1984a,
1984b), Klauder & Petersen (1985a), Atalla (1986), Liske & Platen (1987),
Kanagawa (1988, 1989, 1995, 1996, 1997), Kaneko & Nakao (1988), Manella
& Palleschi (1989), Golec & Ladde (1989, 1993), Newton (1990, 1991), Mikulevicius & Platen (1991), Kurtz & Protter (1991a, 1991b), Kohatsu-Higa &
Protter (1994), Mackevicius (1994), Chan & Stramer (1998), Grecksch & Anh
(1998, 1999), Gy
ongy (1998), Jacod & Protter (1998), Shoji (1998), Schurz
(1999), Hausenblas (2002), Higham, Mao & Stuart (2002), Konakov &
Mammen (2002), Milstein, Repin & Tretjakov (2002), Malliavin & Thalmaier
(2003), Talay & Zheng (2003), Yuan & Mao (2004a), Jacod et al. (2005),
Guyon (2006), Berkaoui et al. (2008).
Systematic treatments, surveys and books on topics related to the simulation of solutions of SDEs, include Pardoux & Talay (1985), Gard (1988),
Milstein, (1988a, 1995a), Kloeden & Platen (1999), Bouleau & Lepingle (1993),
Kloeden et al. (2003), Janicki & Weron (1994), Talay (1995), Rogers & Talay
(1997), Kuznetsov (1998), Platen (1999a, 2004d, 2008), Cyganowski, Kloeden & Ombach (2001), Higham (2001), Burrage, Burrage & Tian (2004),
Glasserman (2004) and Bruti-Liberati & Platen (2007a).
Pseudo-random and quasi-random number generation are discussed, for
instance, in Brent (1974, 2004), Devroye (1986), Bratley et al. (1987), Niederreiter (1988b, 1992), LEcuyer (1994, 1999), Paskov & Traub (1995), Fishman
(1996), Joy et al. (1996), Hofmann & Mathe (1997), Pages & Xiao (1997),
Matsumoto & Nishimura (1998), Sloan & Wozniakowski (1998), Pivitt (1999),
Bruti-Liberati, Platen, Martini & Piccardi (2005), Panneton et al. (2006) and
Bruti-Liberati et al. (2008).
Stochastic dierential delay equations have been studied, for instance, in
Tudor & Tudor (1987), Tudor (1989), Mao (1991), Baker & Buckwar (2000),
Buckwar (2000), K
uchler & Platen (2000, 2002), Hu, Mohammed & Yan
(2004), Yuan & Mao (2004b), Buckwar & Shardlow (2005) and Kushner
(2005).
Papers on numerical methods for backward stochastic dierential equations include Ma, Protter & Yong (1994), Douglas, Ma & Protter (1996),
Chevance (1997), Ma, Protter, San Martin & Torres (2002), Nakayama (2002),
Makarov (2003), Bouchard & Touzi (2004), Gobet, Lemor & Warin (2005)
and Ma, Shen & Zhao (2008).
Chap. 6: Regular Strong Taylor Approximations with Jumps
Further discrete-time strong approximations of solutions of SDEs, potentially
with jumps, have been studied, for instance, in Franklin (1965), Shinozuka
(1971), Kohler & Boyce (1974), Rao, Borwanker & Ramkrishna (1974),
786
Bibliographical Notes
Dsagnidse & Tschitashvili (1975), Harris (1976), Glorennec (1977), Kloeden & Pearson (1977), Clark (1978, 1982, 1985), Nikitin & Razevig (1978),
Helfand (1979), Platen (1980a, 1982a, 1982b, 1984, 1987), Razevig (1980),
Rootzen (1980), Wright (1980), Greenside & Helfand (1981), Casasus (1982,
1984), Guo (1982, 1984), Talay (1982a, 1982b, 1982c, 1983, 1984, 1999),
Drummond, Duane & Horgan (1983), Janssen (1984a, 1984b), Shimizu &
Kawachi (1984), Tetzla & Zschiesche (1984), Unny (1984), Harris & Maghsoodi (1985), Averina & Artemiev (1986), Drummond, Hoch & Horgan (1986),
Kozlov & Petryakov (1986), Greiner, Strittmatter & Honerkamp (1987), Liske
& Platen (1987), Mikulevicius & Platen (1988, 1991), Milstein (1988a, 1988b),
Golec & Ladde (1989), Ikeda & Watanabe (1989), Feng, Lei & Qian (1992),
Artemiev (1993b), Kloeden et al. (1993), Saito & Mitsui (1993a, 1996), Komori, Saito & Mitsui (1994), Milstein & Tretjakov (1994a, 1994b, 1997, 2000),
Petersen (1994), Torok (1994), Ogawa (1995), Rascanu & Tudor (1995),
Gelbrich & Rachev (1996), Grecksch & Wadewitz (1996), Mackevicius (1996),
Maghsoodi (1996, 1998), Newton (1996), Schurz (1996b), Yannios & Kloeden
(1996), Artemiev & Averina (1997), Bokor (1997, 2003), Denk & Sch
aer
(1997), Kohatsu-Higa (1997), Kohatsu-Higa & Ogawa (1997), Komori, Mitsui
& Sugiura (1997), Protter & Talay (1997), Tudor & Tudor (1997), Abukhaled
& Allen (1998a, 1998b), Schein & Denk (1998), Kurtz & Protter (1991b), Jansons & Lythe (2000), Glasserman & Zhao (2000), Gr
une & Kloeden (2001b),
Hofmann, M
uller-Gronbach & Ritter (2001), Hausenblas (2002), Kubilius &
Platen (2002), Lehn, Rossler & Schein (2002), Metwally & Atiya (2002), Glasserman & Merener (2003b), Jimenez & Ozaki (2003), Gardo`
n (2004, 2006),
Carbonell, Jimenez, Biscay & de la Cruz (2005), Cortes, Sevilla-Peris & Jodar
(2005), Doering, Sagsyan & Smereka (2005), El-Borai, El-Nadi, Mostafa &
Ahmed (2005), Jacod et al. (2005), J
odar, Cortes, Sevilla-Peris & Villafuerte
(2005), Mao, Yuan & Yin (2005), Mora (2005), Bruti-Liberati et al. (2006),
Buckwar & Winkler (2006), Buckwar, Bokor & Winkler (2006), Burrage, Burrage, Higham, Kloeden & Platen (2006), Carbonell, Jimenez & Biscay (2006),
Jimenez, Biscay & Ozaki (2006), Bruti-Liberati & Platen (2007a, 2007b, 2008),
Kloeden & Jentzen (2007), Kloeden & Neuenkirch (2007), Komori (2007) and
Wang, Mei & Xue (2007).
Some work on discrete-time approximations for semimartingales, including
Levy processes and -stable processes can be found, for instance, in Marcus
(1978, 1981), Platen & Rebolledo (1985), Protter (1985), Jacod & Shiryaev
(2003), Mackevicius (1987), Bally (1989a, 1989b, 1990), Gy
ongy (1991), Kurtz
& Protter (1991a, 1991b), Janicki & Weron (1994), Kohatsu-Higa & Protter
(1994), Janicki (1996), Janicki, Michna & Weron (1996), Protter & Talay
(1997) and Tudor & Tudor (1997).
The approximation of solutions of SDEs with boundary conditions or rst
exit times were considered, for instance, in Platen (1983, 1985), Gerardi,
Marchetti & Rosa (1984), Slominski (1994, 2001), Asmussen, Glynn & Pitman
(1995), Lepingle (1995), Petterson (1995), Ferrante, Kohatsu-Higa & SanzSole (1996), Abukhaled & Allen (1998b), Hausenblas (1999a, 1999b, 2000a,
Bibliographical Notes
787
2000b, 2000c), Kanagawa & Saisho (2000), Makarov (2001), Alabert & Ferrante (2003), Bossy, Gobet & Talay (2004), Alcock & Burrage (2006), Kim,
Lee, Hanggi & Talkner (2007) and Sickenberger (2008).
Chap. 7: Regular Strong It
o Approximations
Derivative-free and Runge-Kutta type methods including predictor-corrector
and implicit schemes were studied, for instance, in Clements & Anderson
(1973), Wright (1974), Kloeden & Pearson (1977), R
umelin (1982), Talay
(1982b, 1984), Platen (1984), Klauder & Petersen (1985a), Pardoux & Talay
(1985), Chang (1987), Gard (1988), McNeil & Craig (1988), Milstein (1988a,
1995a), Smith & Gardiner (1988), Petersen (1990), Artemiev & Shkurko
(1991), Drummond & Mortimer (1991), Lepingle & Ribemont (1991), Newton
(1991), Hernandez & Spigler (1992, 1993), Kloeden & Platen (1999), Artemiev
(1993a, 1993b, 1994), Denk (1993), Saito & Mitsui (1993a, 1993b, 1995,
1996), Burrage & Platen (1994), Hofmann & Platen (1994, 1996), Komori
et al. (1994), Milstein & Platen (1994), Hofmann (1995), Mitsui (1995),
Biscay, Jimenez, Riera & Valdes (1996), Burrage & Burrage (1996, 1998, 1999,
2000), Schurz (1996a, 1996b, 1996c), Bokor (1997), Burrage, Burrage & Belaer (1997), Komori et al. (1997), Milstein & Tretward (1997), Denk & Sch
jakov (1997), Ryashko & Schurz (1997), Burrage (1998), Denk, Penski &
Sch
aer (1998), Milstein et al. (1998), Petersen (1998), Fischer & Platen
(1999), Liu & Xia (1999), Burrage, Burrage & Mitsui (2000), Burrage & Tian
(2000, 2002), Higham (2000), Hofmann et al. (2000a, 2000b, 2001), Asmussen
& Rosi
nski (2001), Fleury & Bernard (2001), Gr
une & Kloeden (2001a, 2001b,
2006), Higham et al. (2002), M
uller-Gronbach (2002), Tocino & Ardanuy
(2002), Vanden-Eijnden (2003), Cao, Petzold, Rathinam & Gillespie (2004),
Gardo`
n (2004, 2006), Glasserman (2004), Lepingle & Nguyen (2004), Wu,
Han & Meng (2004), Higham & Kloeden (2005, 2006), Rodkina & Schurz
(2005), Alcock & Burrage (2006), Kahl & Schurz (2006), Bruti-Liberati &
Platen (2008), Li, Abdulle & E (2008), Pang, Deng & Mao (2008), Rathinasamy & Balachandran (2008c, 2008a, 2008b) and Wang (2008).
Chap. 8: Jump-Adapted Strong Approximations
The use of jump-adapted time discretizations goes back to Platen (1982a) and
has been employed, for instance, in Maghsoodi (1996), Glasserman (2004),
Turner et al. (2004), Bruti-Liberati et al. (2006) and Bruti-Liberati & Platen
(2007b, 2008).
Chap. 9: Estimating Discretely Observed Diusions
There exists an extensive literature on estimation for stochastic dierential
equations, which includes Novikov (1972b), Taraskin (1974), Brown & Hewitt
788
Bibliographical Notes
Bibliographical Notes
789
(1984, 1985), Gladyshev & Milstein (1984), Talay (1984, 1986, 1987, 1990,
1991, 1995), Kanagawa (1985, 1989), Klauder & Petersen (1985a, 1985b),
Pardoux & Talay (1985), Ventzel, Gladyshev & Milstein (1985), Averina &
Artemiev (1986), Haworth & Pope (1986), Chang (1987), Petersen (1987,
1990), Romisch & Wakolbinger (1987), Mikulevicius & Platen (1988, 1991),
Gelbrich (1989), Greenside & Helfand (1981), Wagner (1989a, 1989b), Grorud
& Talay (1990, 1996), Talay & Tubaro (1990), Kloeden & Platen (1991b,
1992), Kloeden, Platen & Hofmann (1992, 1995), Goodlett & Allen (1994),
Hofmann (1994), Mackevicius (1994, 1996), Komori & Mitsui (1995), Platen
(1995), Arnold & Kloeden (1996), Bally & Talay (1996a, 1996b), Boyle et al.
(1997), Broadie & Detemple (1997a), Kohatsu-Higa & Ogawa (1997), Milstein & Tretjakov (1997), Protter & Talay (1997), Jacod & Protter (1998),
Hausenblas (1999a, 2002), Kurtz & Protter (1991a), Andersen & Andreasen
(2000b), Hofmann et al. (2000a, 2000b), Liu & Li (2000), Hunter et al. (2001),
Kohatsu-Higa (2001), Kusuoka (2001), Jansons & Lythe (2000), J
ackel (2002),
Kubilius & Platen (2002), Metwally & Atiya (2002), Tocino & Vigo-Aguiar
(2002), Glasserman & Merener (2003a, 2003b), Jimenez & Ozaki (2003, 2006),
Joshi (2003), Schoutens & Symens (2003), Cruzeiro, Malliavin & Thalmaier
(2004), Glasserman (2004), Kusuoka (2004), Lyons & Victoir (2004), Brandt,
Goyal, Santa-Clara & Stroud (2005), Jacod et al. (2005), Kebaier (2005),
Malliavin & Thalmaier (2006), Mao et al. (2005), Mora (2005), SchmitzAbe & Shaw (2005), Tocino (2005), Carbonell et al. (2006), Guyon (2006),
Jimenez et al. (2006), Giles (2007, 2008), Siopacha & Teichmann (2007),
Wang et al. (2007), Bruti-Liberati et al. (2008), Joshi & Stacey (2008),
Mordecki, Szepessy, Tempone & Zouraris (2008), Ninomiya & Victoir (2008)
and Giles, Higham & Mao (2009).
Chap. 12: Regular Weak Taylor Approximations
American option pricing via simulation or related methods has been studied, for instance, by Carriere (1996), Broadie & Glasserman (1997a, 1997b),
Broadie & Detemple (1997b), Acworth, Broadie & Glasserman (1998),
Broadie, Glasserman & Ha (2000), Thitsiklis & Van Roy (2001), Rogers (2002),
Longsta & Schwartz (2001), Clement, Lamberton & Protter (2002), Andersen & Broadie (2004), Stentoft (2004) and Chen & Glasserman (2007).
LIBOR market models and their implementation have been considered,
for instance, in Andersen & Andreasen (2000b), Glasserman & Zhao (2000),
Brigo & Mercurio (2005), Hunter et al. (2001), Rebonato (2002), Schoutens
& Studer (2003), Pietersz, Pelsser & van Regenmortel (2004), Andersen &
Brotherton-Ratclie (2005), Schoenmakers (2005) and Joshi & Stacey (2008).
Step size control and related questions were studied, for instance, in
Artemiev (1985), Ozaki (1992), M
uller-Gronbach (1996), Gaines & Lyons
(1997), Shoji & Ozaki (1997, 1998a, 1998b), Burrage (1998), Mauthner (1998),
Hofmann et al. (2000a, 2000b, 2001), Szepessy, Tempone & Zouraris (2001),
Burrage & Burrage (2002), Lehn et al. (2002), Guerra & Sorini (2003), Lamba,
790
Bibliographical Notes
Mattingly & Stuart (2003), Burrage, Herdiana & Burrage (2004), Bossy et al.
(2004), Romisch & Winkler (2006) and Lamba, Mattingly & Stuart (2007).
Chap. 13: Jump-Adapted Weak Approximations
Most of the literature mentioned already for Chap. 8, Chap. 11 and Chap. 12
is also relevant for Chap. 13.
Chap. 14: Numerical Stability
Questions related to numerical stability of simulations appear in a wide
range of papers, including Talay (1982b, 1984), Klauder & Petersen (1985a),
Pardoux & Talay (1985), McNeil & Craig (1988), Milstein (1988a, 1995a),
Smith & Gardiner (1988), Petersen (1990), Artemiev & Shkurko (1991),
Drummond & Mortimer (1991), Kloeden & Platen (1992), Hernandez &
Spigler (1992, 1993), Artemiev (1993a, 1993b, 1994), Saito & Mitsui (1993a,
1993b, 1996), Hofmann & Platen (1994, 1996), Komori et al. (1994), Milstein & Platen (1994), Hofmann (1995), Komori & Mitsui (1995), Mitsui
(1995), Schurz (1996a, 1996c), Bokor (1997, 1998), Komori et al. (1997),
Ryashko & Schurz (1997), Burrage (1998), Milstein et al. (1998), Petersen
(1998), Fischer & Platen (1999), Liu & Xia (1999), Burrage et al. (2000),
Burrage & Tian (2000, 2002), Higham (2000), Vanden-Eijnden (2003), Cao,
Petzold, Rathinam & Gillespie (2004), Wu et al. (2004), Higham & Kloeden
(2005, 2006), Rodkina & Schurz (2005), Tocino (2005), Alcock & Burrage
(2006), Bruti-Liberati & Platen (2008), Platen & Shi (2008), Li et al. (2008),
Rathinasamy & Balachandran (2008a, 2008b, 2008c) and Pang et al. (2008).
Stability in stochastic dierential equations with regime switching and
potential delay have been studied, for instance, in Cao, Liu & Fan (2004),
Liu, Cao & Fan (2004), Yuan & Mao (2004b), Ding, Wu & Liu (2006), Mao,
Yin & Yuan (2007), Mao & Yuan (2006), Ronghua, Hongbing & Qin (2006),
Ronghua & Yingmin (2006), Wang & Zhang (2006), Yuan & Glover (2006),
Mao (2007), Mao, Yuan & Yin (2007), Ronghua & Zhaoguang (2007), Mao,
Shen & Yuan (2008) and Zhang & Gan (2008).
Chap. 15: Martingale Representations
Methods that involve martingale representations or the simulation of hedge
ratios can be found, for instance, in Heath (1995), Fournie, Lasry, Lebuchoux, Lious & Touzi (1999, 2001), Potters, Bouchaud & Sestovic (2001),
Avellaneda & Gamba (2002), Heath & Platen (2002a, 2002b, 2002c), Milstein & Schoenmakers (2002), Takeuchi (2002), Bernis et al. (2003), Gobet & Kohatsu-Higa (2003), Bouchard, Ekeland & Touzi (2004), El-Khatib
& Privault (2004), Kusuoka (2004), Lkka (2004), Victoir (2004), Bally
et al. (2005), Detemple, Garcia & Rindisbacher (2005), Schoenmakers (2005),
Bibliographical Notes
791
Takahashi & Yoshida (2005), Tebaldi (2005), Bavouzet & Messaoud (2006),
Davis & Johansson (2006), Giles & Glasserman (2006), Malliavin & Thalmaier (2006), Teichmann (2006) and Kampen, Kolodko & Schoenmakers
(2008).
Chap. 16: Variance Reduction Techniques
References on variance reduction techniques include Hammersley & Handscomb (1964), Fosdick & Jordan (1968), Chorin (1973), Ermakov (1975),
Maltz & Hitzl (1979), Sabelfeld (1979), Rubinstein (1981), Ermakov & Mikhailov (1982), Ripley (1983), Kalos (1984), Binder (1985), De Raedt & Lagendijk
(1985), Ventzel et al. (1985), Kalos & Whitlock (1986), Bratley et al. (1987),
Chang (1987), Wagner (1987, 1988a, 1988b, 1989a, 1989b), Hull & White
(1988), Milstein (1988a, 1995a), Ross (1990), Law & Kelton (1991), Hofmann
References
793
794
References
References
795
796
References
Barrett, R., Berry, M., Chan, T. F., Demmel, J., Dongarra, J., Eijkhout, V., Pozo,
R., Romine, C. & van der Vorst, H. (1994). Templates for the Solution of
Linear Systems: Building Blocks for Iterative Methods, SIAM, Philadelphia.
Basle (1995). Planned Supplement to the Capital Accord to Incorporate Market
Risk, Basle Committee on Banking and Supervision, Basle, Switzerland.
Bavouzet, M.-P. & Messaoud, M. (2006). Computation of Greeks using Malliavins
calculus in jump type market models, Electronic J. of Prob. 11(10): 276300.
Bayer, C. & Teichmann, J. (2008). Cubature on Wiener space in innite dimension,
Proceeding of the Royal Society London, Vol. 464, pp. 24932516.
Becherer, D. (2001). The numeraire portfolio for unbounded semimartingales,
Finance Stoch. 5: 327341.
Bell, E. T. (1934). Exponential polynomials, Ann. Math. 35: 258277.
BenArous, G. (1989). Flots et series de Taylor stochastiques, Probab. Theory
Related Fields 81: 2977.
Bender, C., Sottinen, T. & Valkeila, E. (2007). Arbitrage with fractional Brownian
motion?, Theory of Stochastic Processes 13(1-2): 2334.
Berkaoui, A., Bossy, M. & Diop, A. (2005). Euler schemes for SDEs with nonLipschitz diusion coecient: Strong convergence. INRIA working paper
no.5637.
Berkaoui, A., Bossy, M. & Diop, A. (2008). Euler scheme for SDEs with nonLipschitz diusion coecient: Strong convergence, ESAIM: Prob. and Statist.
12: 111.
Bernis, G., Gobet, E. & Kohatsu-Higa, A. (2003). Monte Carlo evaluation of Greeks
for multidimensional barrier and look-back options, Math. Finance 13: 99113.
Bertoin, J. (1996). Levy Processes, Cambridge University Press.
Beskos, A., Papaspiliopoulos, O. & Roberts, G. (2006). Retrospective exact simulation of diusion sample paths with applications, Bernoulli 12: 10771098.
Beskos, A., Papaspiliopoulos, O. & Roberts, G. (2008). A factorisation of diusion
measure and nite sample path constructions, Methodology and Comp. Appl.
Prob. 10: 85104.
Beskos, A. & Roberts, G. (2005). Exact simulation of diusions, Ann. Appl. Probab.
15: 24222444.
Bhar, R., Chiarella, C. & Runggaldier, W. J. (2002). Estimation in models of the
instantaneous short term interest rate by use of a dynamic Baysian algorithm,
in K. Sandmann & P. Sch
onbucher (eds), Advances in Finance and Stochastics
- Essays in Honour of Dieter Sondermann, Springer, pp. 177195.
Bhar, R., Chiarella, C. & Runggaldier, W. J. (2004). Inferring the forward looking
equity risk premium from derivative prices, Studies in Nonlinear Dynamics &
Econometrics 8(1): 124.
Bibby, B. M. (1994). Optimal combination of martingale estimating functions
for discretely observed diusions, Technical report, University of Aarhus,
Department of Theoretical Statistics. Research Report No. 298.
Bibby, B. M., Jacobsen, M. & Srensen, M. (2003). Estimating functions for
discretely sampled diusion-type models, in Y. Ait-Sahalia & L. P. Hansen
(eds), Handbook of Financial Economics, Amsterdam: North Holland, chapter
4, pp. 203268.
Bibby, B. M. & Srensen, M. (1995). Martingale estimation functions for discretely
observed diusion processes, Bernoulli 1: 1739.
Bibby, B. M. & Srensen, M. (1996). On estimation for discretely observed
diusions: A review, Theory of Stochastic Processes 2(18): 4956.
References
797
798
References
Boyle, P. P., Broadie, M. & Glasserman, P. (1997). Monte Carlo methods for
security pricing. Computational nancial modelling, J. Econom. Dynam.
Control 21(8-9): 12671321.
Boyle, P. P. & Lau, S. H. (1994). Bumping up against the barrier with the binomial
method, J. Derivatives pp. 614.
Brandt, M., Goyal, A., Santa-Clara, P. & Stroud, J. R. (2005). A simulation
approach to dynamic portfolio choice with an application to learning about
return predictability, Rev. Financial Studies 18: 831873.
Brandt, M. & Santa-Clara, P. (2002). Simulated likelihood estimation of diusions with application to exchange rate dynamics in incomplete markets, J.
Financial Economics 63(2): 161210.
Bratley, P., Fox, B. L. & Schrage, L. (1987). A Guide to Simulation, 2nd edn,
Springer.
Bremaud, P. (1981). Point Processes and Queues, Springer, New York.
Brent, R. P. (1974). A Gaussian pseudo number generator, Commun. Assoc.
Comput. Mach. 17: 704706.
Brent, R. P. (2004). Note on Marsaglias xorshift random number generators, J.
Statist. Software 11(5): 14.
Breymann, W., Kelly, L. & Platen, E. (2006). Intraday empirical analysis and
modeling of diversied world stock indices, Asia-Pacic Financial Markets
12(1): 128.
Brigo, D. & Mercurio, F. (2005). Interest Rate Models - Theory and Practice, 2nd
edn, Springer Finance.
Broadie, M. & Detemple, J. (1997a). Recent advances in numerical methods for
pricing derivative securities, Numerical Methods in Finance, Cambridge Univ.
Press, pp. 4366.
Broadie, M. & Detemple, J. (1997b). The valuation of American options on multiple
assets, Math. Finance 7(3): 241286.
Broadie, M. & Glasserman, P. (1997a). Estimating security price derivatives using
simulation, Management Science 42(2): 269285.
Broadie, M. & Glasserman, P. (1997b). Pricing American - style securities using
simulation. Computational nancial modelling, J. Econom. Dynam. Control
21(8-9): 13231352.
Broadie, M., Glasserman, P. & Ha, Z. (2000). Pricing American options by simulation using a stochastic mesh with optimized weights, Probabilistic Constrained
Optimization: Methodology and Applications, Kluwer, Norwell, pp. 3250.
Broadie, M. & Kaya, O. (2006). Exact simulation of stochastic volatility and other
ane jump diusion processes, Oper. Res. 54: 217231.
Brown, B. M. & Hewitt, J. T. (1975). Asymptotic likelihood theorem for diusion
processes, J. Appl. Probab. 12: 228238.
Brown, G. & Toft, K. J. (1999). Constructing binomial trees from multiple implied
probability distributions, J. Derivatives 7(2): 83100.
Bru, M.-F. (1991). Wishart processes, J. Theoret. Probab. 4(4): 725751.
Bruti-Liberati, N., Martini, F., Piccardi, M. & Platen, E. (2008). A hardware
generator for multi-point distributed random numbers for Monte Carlo
simulation, Math. Comput. Simulation 77(1): 4556.
Bruti-Liberati, N., Nikitopoulos-Sklibosios, C. & Platen, E. (2006). First order
strong approximations of jump diusions, Monte Carlo Methods Appl. 12(3-4):
191209.
References
799
800
References
Burrage, K., Burrage, P. M. & Tian, T. (2004). Numerical methods for strong solutions of SDES, Proceeding of the Royal Society London, Vol. 460, pp. 373402.
Burrage, K. & Platen, E. (1994). Runge-Kutta methods for stochastic dierential
equations, Ann. Numer. Math. 1(1-4): 6378.
Burrage, K. & Tian, T. (2000). A note on the stability properties of the Euler
method for solving stochastic dierential equations, New Zealand J. of Maths.
29: 115127.
Burrage, K. & Tian, T. (2002). Predictor-corrector methods of Runge-Kutta type
for stochastic dierential equations, SIAM J. Numer. Anal. 40: 15161537.
Burrage, P. M. (1998). Runge-Kutta methods for stochastic dierential equations,
PhD thesis, University of Queensland, Brisbane, Australia.
Burrage, P. M. & Burrage, K. (2002). A variable stepsize implementation for
stochastic dierential equations, SIAM J. Sci. Comput. 24(3): 848864.
Burrage, P. M., Herdiana, R. & Burrage, K. (2004). Adaptive stepsize based on
control theory for stochastic dierential equations, J. Comput. Appl. Math.
170: 317336.
Cao, W. R., Liu, M. Z. & Fan, Z. C. (2004). MS-stability of the Euler-Maruyama
method for stochastic dierential delay equations, Appl. Math. Comput. 159:
127135.
Cao, Y., Petzold, L., Rathinam, M. & Gillespie, D. T. (2004). The numerical
stability of leaping methods for stochastic simulation of chemically reacting
systems, J. Chemical Physics 121: 1216912178.
Carbonell, F., Jimenez, J. C. & Biscay, R. (2006). Weak local linear discretizations
for stochastic dierential equations: Convergence and numerical schemes, J.
Comput. Appl. Math. 197: 578596.
Carbonell, F., Jimenez, J. C., Biscay, R. & de la Cruz, H. (2005). The local linearization method for numerical integration of random dierential equations,
BIT 45: 114.
Carr, P., Geman, H., Madan, D. & Yor, M. (2003). Stochastic volatility for Levy
processes, Math. Finance 13(3): 345382.
Carrasco, M., Chernov, M., Florens, J.-P. & Ghysels, E. (2007). Ecient estimation
of jump diusions and general dynamic models with a continuum of moment
conditions, J. Econometrics 140: 529573.
Carriere, J. (1996). Valuation of early-exercise price of options using simulations and
non-parametric regression, Insurance: Mathematics and Economics 19: 1930.
Casasus, L. L. (1982). On the numerical solution of stochastic dierential equations
and applications, Proceedings of the Ninth Spanish-Portuguese Conference on
Mathematics, Vol. 46 of Acta Salmanticensia. Ciencias, Univ. Salamanca, pp.
811814. (in Spanish).
Casasus, L. L. (1984). On the convergence of numerical methods for stochastic
dierential equations, Proceedings of the Fifth Congress on Dierential
Equations and Applications, Univ. La Laguna, pp. 493501. Puerto de la Cruz
(1982), (in Spanish), Informes 14.
Casella, B. & Roberts, G. (2007). Exact Monte Carlo simulation of killed diusions,
Adv. in Appl. Probab. 40: 273291.
Castell, F. & Gaines, J. (1995). An ecient approximation method for stochastic
dierential equations by means of the exponential Lie series, Math. Comput.
Simulation 38: 1319.
References
801
802
References
References
803
804
References
References
805
806
References
References
807
Fournie, E., Lasry, J. M. & Touzi, N. (1997). Monte Carlo methods for stochastic
volatility models, Numerical Methods in Finance, Cambridge Univ. Press,
Cambridge, pp. 146164.
Fournie, E., Lebuchoux, J. & Touzi, N. (1997). Small noise expansion and
importance sampling, Asymptot. Anal. 14(4): 331376.
Franklin, J. N. (1965). Dierence methods for stochastic ordinary dierential
equations, Math. Comp. 19: 552561.
Frey, R. & Runggaldier, W. J. (2001). A nonlinear ltering approach to volatility
estimation with a view towards high frequency data, Int. J. Theor. Appl.
Finance 4(2): 199210.
Friedman, A. (1975). Stochastic Dierential Equations and Applications, Vol. I, Vol.
28 of Probability and Mathematical Statistics, Academic Press, New York.
Fu, M. C. (1995). Pricing of nancial derivatives via simulation, in C. Alexopoulus,
K. Kang, W. Lilegdon & D. Goldsman (eds), Proceedings of the 1995 Winter
Simulation Conference, Piscataway, New Jersey: Institute of Electrical and
Electronics Engineers.
Fujisaki, M., Kallianpur, G. & Kunita, H. (1972). Stochastic dierential equations
for the non-linear ltering problem, Osaka J. Math. 9: 1940.
Gaines, J. G. (1994). The algebra of iterated stochastic integrals, Stochastics
Stochastics Rep. 49: 169179.
Gaines, J. G. (1995). A basis for iterated stochastic integrals, Math. Comput.
Simulation 38(1-3): 711.
Gaines, J. G. & Lyons, T. J. (1994). Random generation of stochastic area integrals,
SIAM J. Appl. Math. 54(4): 11321146.
Gaines, J. G. & Lyons, T. J. (1997). Variable step size control in the numerical
solution of stochastic dierential equations, SIAM J. Appl. Math. 57(5):
14551484.
Galesso, G. & Runggaldier, W. J. (2010). Pricing without equivalent martingale
measures under complete and incomplete observation, in C. Chiarella & A.
Novikov (eds), Contemporary Quantitative Finance - Essays in Honour of
Eckhard Platen, Springer.
Gallant, A. R. & Tauchen, G. (1996). Which moments to match?, Econometric
Theory 12: 657681.
Gard, T. C. (1988). Introduction to Stochastic Dierential Equations, Marcel
Dekker, New York.
Gardo`
n, A. (2004). The order of approximations for solutions of It
o-type stochastic
dierential equations with jumps, Stochastic Anal. Appl. 22(3): 679699.
Gardo`
n, A. (2006). The order 1.5 approximation for solutions of jump-diusion
equations, Stochastic Anal. Appl. 24(3): 11471168.
Gelbrich, M. (1989). Lp -Wasserstein-Metriken und Approximation stochastischer
Dierentialgleichungen, PhD thesis, Dissert. A, Humboldt-Universit
at Berlin.
Gelbrich, M. & Rachev, S. T. (1996). Discretization for stochastic dierential
equations, Lp Wasserstein metrics, and econometrical models, Distributions
with xed marginals and related topics, Vol. 28 of IMS Lecture Notes Monogr.
Ser., Inst. Math. Statist., Hayward, CA, pp. 97119.
Geman, H., Madan, D. & Yor, M. (2001). Asset prices are Brownian motion:
Only in business time, Quantitative Analysis in Financial Markets, World Sci.
Publishing, pp. 103146.
Geman, H. & Roncoroni, A. (2006). Understanding the ne structure of electricity
prices, J. Business 79(3): 12251261.
808
References
References
809
Goodlett, S. T. & Allen, E. J. (1994). A variance reduction technique for use with
the extrapolated Euler method for numerical solution of stochastic dierential
equations, Stochastic Anal. Appl. 12(1): 131140.
Gorostiza, L. G. (1980). Rate of convergence of an approximate solution of stochastic dierential equations, Stochastics 3: 267276. Erratum in Stochastics 4
(1981), 85.
Gorovoi, V. & Linetsky, V. (2004). Blacks model of interest rates as options, eigenfunction expansions and Japanese interest rates, Math. Finance 14(1): 4978.
Gourieroux, C., Monfort, A. & Renault, E. (1993). Indirect inference, J. Appl.
Econometrics 8: 85118.
Gourieroux, C. & Sufana, R. (2004). Derivative pricing with multivariate stochastic
volatility: Application to credit risk. Working paper CREST.
Grant, D., Vora, D. & Weeks, D. (1997). Path-dependent options: Extending the
Monte Carlo simulation approach, Management Science 43(11): 15891602.
Grasselli, M. & Tebaldi, C. (2008). Solvable ane term structure models, Math.
Finance 18: 135153.
Grecksch, W. & Anh, V. V. (1998). Approximation of stochastic dierential
equations with modied fractional Brownian motion, Z. Anal. Anwendungen
17(3): 715727.
Grecksch, W. & Anh, V. V. (1999). A parabolic stochastic dierential equation
with fractional Brownian motion input, Statist. Probab. Lett. 41(4): 337346.
Grecksch, W. & Wadewitz, A. (1996). Approximation of solutions of stochastic differential equations by discontinuous Galerkin methods, Z. Anal. Anwendungen
15(4): 901916.
Greenside, H. S. & Helfand, E. (1981). Numerical integration of stochastic
dierential equations, Bell Syst. Techn. J. 60(8): 19271940.
Greiner, A., Strittmatter, W. & Honerkamp, J. (1987). Numerical integration of
stochastic dierential equations, J. Statist. Phys. 51(I/2): 95108.
Grorud, A. & Talay, D. (1990). Approximation of Lyapunov exponents of stochastic
dierential systems on compact manifolds, Analysis and Optimization of
Systems, Vol. 144 of Lecture Notes in Control and Inform. Sci., Springer, pp.
704713.
Grorud, A. & Talay, D. (1996). Approximation of Lyapunov exponents of nonlinear
stochastic dierential equations, SIAM J. Appl. Math. 56(2): 627650.
Gr
une, L. & Kloeden, P. E. (2001a). Higher order numerical schemes for anely
controlled nonlinear systems, Numer. Math. 89: 669690.
Gr
une, L. & Kloeden, P. E. (2001b). Pathwise approximation of random ordinary
dierential equations, BIT 41: 711721.
Gr
une, L. & Kloeden, P. E. (2006). Higher order numerical approximation of
switching systems, Systems Control Lett. 55: 746754.
Guerra, M. L. & Sorini, L. (2003). Step control in numerical solution of stochastic
dierential equations, Internat. J. Comput. & Num. Analysis & Appl. 4(3):
279295.
Guo, S. J. (1982). On the mollier approximation for solutions of stochastic
dierential equations, J. Math. Kyoto Univ. 22: 243254.
Guo, S. J. (1984). Approximation theorems based on random partitions for stochastic dierential equations and applications, Chinese Ann. Math. 5: 169183.
Guyon, J. (2006). Euler scheme and tempered distributions, Stochastic Process.
Appl. 116(6): 877904.
810
References
Gy
ongy, I. (1991). Approximation of Ito stochastic equations, Math. USSR-Sb
70(1): 165173.
Gy
ongy, I. (1998). A note on Eulers approximations, Potential Anal. 8: 205216.
Hammersley, J. M. & Handscomb, D. C. (1964). Monte Carlo Methods, Methuen,
London.
Hansen, L. P. (1982). Large sample properties of the generalized method of
moments, Econometrica 50: 10291054.
Harris, C. J. (1976). Simulation of nonlinear stochastic equations with applications
in modelling water pollution, in C. A. Brebbi (ed.), Mathematical Models for
Environmental Problems, Pentech Press, London, pp. 269282.
Harris, C. J. & Maghsoodi, Y. (1985). Approximate integration of a class of
stochastic dierential equations, Control Theory, Proc. 4th IMA Conf.,
Cambridge/Engl., pp. 159168.
Harrison, J. M. & Kreps, D. M. (1979). Martingale and arbitrage in multiperiod
securities markets, J. Economic Theory 20: 381408.
Hausenblas, E. (1999a). A Monte-Carlo method with inherited parallelism for
solving partial dierential equations with boundary conditions, Lecture Notes
Comput. Sci., Vol. 1557, Springer, pp. 117126.
Hausenblas, E. (1999b). A numerical scheme using It
o excursion for simulating local
time resp. stochastic dierential equations with reection, Osaka J. Math.
36(1): 105137.
Hausenblas, E. (2000a). Monte-Carlo simulation of killed diusion, Monte Carlo
Methods Appl. 6(4): 263295.
Hausenblas, E. (2000b). A numerical scheme using excursion theory for simulating
stochastic dierential equations with reection and local time at a boundary,
Monte Carlo Methods Appl. 6(2): 81103.
Hausenblas, E. (2000c). A numerical scheme using excursion theory for simulating
stochastic dierential equations with reection and local time at a boundary,
Monte Carlo Methods Appl. 6: 81103.
Hausenblas, E. (2002). Error analysis for approximation of stochastic dierential
equations driven by Poisson random measures, SIAM J. Numer. Anal. 40(1):
87113.
Haussmann, U. (1978). Functionals of Ito processes as stochastic integrals, SIAM
J. Control Optim. 16: 252269.
Haussmann, U. (1979). On the integral representation of functionals of Ito processes,
Stochastics 3: 1728.
Haworth, D. C. & Pope, S. B. (1986). A second-order Monte-Carlo method for the
solution of the Ito stochastic dierential equation, Stochastic Anal. Appl. 4:
151186.
Heath, D. (1995). Valuation of derivative securities using stochastic analytic and
numerical methods, PhD thesis, ANU, Canberra.
Heath, D. & Platen, E. (2002a). Consistent pricing and hedging for a modied
constant elasticity of variance model, Quant. Finance 2(6): 459467.
Heath, D. & Platen, E. (2002b). Perfect hedging of index derivatives under a
minimal market model, Int. J. Theor. Appl. Finance 5(7): 757774.
Heath, D. & Platen, E. (2002c). Pricing and hedging of index derivatives under
an alternative asset price model with endogenous stochastic volatility, in J.
Yong (ed.), Recent Developments in Mathematical Finance, World Scientic,
pp. 117126.
References
811
812
References
References
813
814
References
Jensen, B. (2000). Binomial models for the term structure of interest. Department
of Finance, Copenhagen Business School (working paper).
Jiang, G. J. & Knight, J. L. (2002). Estimation of continuous-time processes via the
empirical characteristic function, J. Bus. Econom. Statist. 20(2): 198212.
Jimenez, J. C., Biscay, R. & Ozaki, T. (2006). Inference methods for discretely
observed continuous-time stochastic volatility models: A commented overview,
Asia-Pacic Financial Markets 12: 109141.
Jimenez, J. C. & Ozaki, T. (2003). Local linearization lters for nonlinear
continuous-discrete state space models with multiplicative noise, Int. J.
Control 76: 11591170.
Jimenez, J. C. & Ozaki, T. (2006). An approximate innovation method for the
estimation of diusion processes from discrete data, J. Time Series Analysis
27: 7797.
J
odar, L., Cortes, J. C., Sevilla-Peris, P. & Villafuerte, L. (2005). Approximating
processes of stochastic diusion models under spatial uncertainty, Proc.
ASMDA, 16-21 May, Brest, France, pp. 13031312.
Johannes, M. (2004). The statistical and economic role of jumps in continuous-time
interest rate models, J. Finance 59(1): 227260.
Johnson, N. L., Kotz, S. & Balakrishnan, N. (1995). Continuous Univariate
Distributions, Vol. 2 of Wiley Series in Probability and Mathematical Statistics,
2nd edn, John Wiley & Sons.
Jorion, P. (1988). On jump processes in the foreign exchange and stock markets,
Rev. Financial Studies 1: 427445.
Joshi, M. S. (2003). The Concept and Practice of Mathematical Finance, Cambridge
University Press.
Joshi, M. S. & Stacey, A. M. (2008). New and robust drift approximations for the
LIBOR market model, Quant. Finance 8: 427434.
Joy, C., Boyle, P. P. & Tan, K. S. (1996). Quasi Monte Carlo methods in numerical
nance, Management Science 42(6): 926938.
Kahl, C. & J
ackel, P. (2006). Fast strong approximation Monte-Carlo schemes for
stochastic volatility models, Quant. Finance 6(6): 513536.
Kahl, C. & Schurz, H. (2006). Balanced Milstein methods for ordinary SDEs, Monte
Carlo Methods Appl. 12(2): 143170.
Kallianpur, G. (1980). Stochastic Filtering Theory, Springer.
Kalman, R. E. & Bucy, R. S. (1961). New results in linear ltering and prediction
theory, Trans. ASME Ser. D. J. Basic Engrg. 83: 95108.
Kalos, M. H. (1984). Monte Carlo Methods in Quantum Problems, Vol. 125 of
NATO ASI, Series C, Reidel, Boston.
Kalos, M. H. & Whitlock, P. A. (1986). Monte Carlo Methods. Vol. I. Basics,
Wiley, New York.
Kampen, J., Kolodko, A. & Schoenmakers, J. (2008). Monte Carlo Greeks for
nancial products via approximative transition densities, SIAM J. Sci.
Comput. 31(1): 122.
Kamrad, B. & Ritchken, P. (1991). Multinomial approximating models for options
with k state variables, Management Science 37: 16401652.
Kanagawa, S. (1985). The rate of convergence in Maruyamas invariance principle,
Proc. 4th Inter. Vilnius Confer. Prob. Theory Math. Statist., pp. 142144.
Vilnius 1985.
Kanagawa, S. (1988). The rate of convergence for Maruyamas approximate
solutions of stochastic dierential equations, Yokohoma Math. J. 36: 7985.
References
815
816
References
Klauder, J. R. & Petersen, W. P. (1985a). Numerical integration of multiplicativenoise stochastic dierential equations, SIAM J. Numer. Anal. 6: 11531166.
Klauder, J. R. & Petersen, W. P. (1985b). Spectrum of certain non-self-adjoint
operators and solutions of Langevin equations with complex drift, J. Statist.
Phys. 39: 5372.
Klebaner, F. (2005). Introduction to Stochastic Calculus with Applications, Imperial
College Press. Second edition.
Kloeden, P. E. (2002). The systematic derivation of higher order numerical schemes
for stochastic dierential equations, Milan Journal of Mathematics 70(1):
187207.
Kloeden, P. E. & Jentzen, A. (2007). Pathwise convergent higher order numerical
schemes for random ordinary dierential equations, Proc. Roy. Soc. London A
463: 29292944.
Kloeden, P. E. & Neuenkirch, A. (2007). The pathwise convergence of approximation
schemes for stochastic dierential equations, LMS J. Comp. Math. 10: 235253.
Kloeden, P. E. & Pearson, R. A. (1977). The numerical solution of stochastic
dierential equations, J. Austral. Math. Soc. Ser. B 20: 812.
Kloeden, P. E. & Platen, E. (1991a). Relations between multiple Ito and
Stratonovich integrals, Stochastic Anal. Appl. 9(3): 311321.
Kloeden, P. E. & Platen, E. (1991b). Stratonovich and Ito stochastic Taylor
expansions, Math. Nachr. 151: 3350.
Kloeden, P. E. & Platen, E. (1992). Higher order implicit strong numerical schemes
for stochastic dierential equations, J. Statist. Phys. 66(1/2): 283314.
Kloeden, P. E. & Platen, E. (1999). Numerical Solution of Stochastic Dierential
Equations, Vol. 23 of Appl. Math., Springer. Third printing, (rst edition
(1992)).
Kloeden, P. E., Platen, E. & Hofmann, N. (1992). Stochastic dierential equations:
Applications and numerical methods, Proc. 6th IAHR International Symposium
on Stochastic Hydraulics, pp. 7581. National Taiwan University, Taipeh.
Kloeden, P. E., Platen, E. & Hofmann, N. (1995). Extrapolation methods for the
weak approximation of Ito diusions, SIAM J. Numer. Anal. 32(5): 15191534.
Kloeden, P. E., Platen, E. & Schurz, H. (1992). Eective simulation of optimal
trajectories in stochastic control, Optimization 1: 633644.
Kloeden, P. E., Platen, E. & Schurz, H. (1993). Higher order approximate Markov
chain lters, in S. Cambanis (ed.), Stochastic Processes - A Festschrift in
Honour of Gopinath Kallianpur, Springer, pp. 181190.
Kloeden, P. E., Platen, E. & Schurz, H. (2003). Numerical Solution of SDEs
Through Computer Experiments, Universitext, Springer. Third corrected
printing, (rst edition (1994)).
Kloeden, P. E., Platen, E., Schurz, H. & Srensen, M. (1996). On eects of
discretization on estimators of drift parameters for diusion processes, J. Appl.
Probab. 33: 10611076.
Kloeden, P. E., Platen, E. & Wright, I. (1992). The approximation of multiple
stochastic integrals, Stochastic Anal. Appl. 10(4): 431441.
Kl
uppelberg, C., Lindner, A. & Maller, R. (2006). Continuous time volatility
modelling: COGARCH versus Ornstein-Uhlenbeck models, in Y. Kabanov
et al. (ed.), From Stochastic Calculus to Mathematical Finance, Springer, pp.
393419.
Kohatsu-Higa, A. (1997). High order Ito-Taylor approximations to heat kernels, J.
Math. Kyoto Univ. 37(1): 129150.
References
817
818
References
References
819
820
References
References
821
822
References
References
823
824
References
References
825
Pang, S., Deng, F. & Mao, X. (2008). Almost sure and moment exponential stability
of Euler-Maruyama discretizations for hybrid stochastic dierential equations,
J. Comput. Appl. Math. 213: 127141.
Panneton, F., LEcuyer, P. & Matsumoto, M. (2006). Improved long-period
generators based on linear recurrences modulo 2, ACM Transactions on
Mathematical Software 32(1): 116.
Pardoux, E. & Talay, D. (1985). Discretization and simulation of stochastic
dierential equations, Acta Appl. Math. 3(1): 2347.
Paskov, S. & Traub, J. (1995). Faster valuation of nancial derivatives, J. Portfolio
Manag. pp. 113120.
Pearson, N. & Sun, T. S. (1989). A test of the Cox, Ingersoll, Ross model of the
term structure of interest rates using the method of moments. Sloan School of
Management, Massachusetts Institute of Technology (working paper).
Pedersen, A. R. (1995). A new approach to maximum likelihood estimation for
stochastic dierential equations based on discrete observations, Scand. J.
Statist. 22: 5571.
Pelsser, A. & Vorst, T. (1994). The binomial model and the Greeks, J. Derivatives
1(3): 4549.
Petersen, W. P. (1987). Numerical simulation of Ito stochastic dierential equations
on supercomputers, Random Media, Vol. 7 of IMA Vol. Math. Appl., Springer,
pp. 215228.
Petersen, W. P. (1990). Stability and accuracy of simulations for stochastic
dierential equations. IPS Research Report No. 90-02, ETH Z
urich.
Petersen, W. P. (1994). Some experiments on numerical simulations of stochastic
dierential equations and a new algorithm, J. Comput. Phys. 113(1): 7581.
Petersen, W. P. (1998). A general implicit splitting for stabilizing numerical
simulations of Ito stochastic dierential equations, SIAM J. Numer. Anal.
35(4): 14391451.
Petterson, R. (1995). Approximations for stochastic dierential equations with
reecting convex boundaries, Stochastic Process. Appl. 59: 295308.
Picard, J. (1984). Approximation of nonlinear ltering problems and order of
convergence, Filtering and Control of Random Processes, Vol. 61 of Lecture
Notes in Control and Inform. Sci., Springer, pp. 219236.
Pietersz, R., Pelsser, A. & van Regenmortel, M. (2004). Fast drift approximated
pricing in the BGM model, J. Comput. Finance 8: 93124.
Pivitt, B. (1999). Accessing the Intel Random Number Generator with CDSA,
Technical report, Intel Platform Security Division.
Platen, E. (1980a). Approximation of Ito integral equations, Stochastic Dierential
Systems, Vol. 25 of Lecture Notes in Control and Inform. Sci., Springer, pp.
172176.
Platen, E. (1980b). Weak convergence of approximations of Ito integral equations,
Z. Angew. Math. Mech. 60: 609614.
Platen, E. (1981). A Taylor formula for semimartingales solving a stochastic
dierential equation, Stochastic Dierential Systems, Vol. 36 of Lecture Notes
in Control and Inform. Sci., Springer, pp. 157164.
Platen, E. (1982a). An approximation method for a class of It
o processes with jump
component, Liet. Mat. Rink. 22(2): 124136.
Platen, E. (1982b). A generalized Taylor formula for solutions of stochastic
dierential equations, SANKHYA A 44(2): 163172.
826
References
References
827
828
References
Rathinasamy, A. & Balachandran, K. (2008b). Mean-square stability of secondorder Runge-Kutta methods for multi-dimensional linear stochastic dierential
systems, J. Comput. Appl. Math. 219: 170197.
Rathinasamy, A. & Balachandran, K. (2008c). Mean-square stability of semiimplicit Euler method for linear stochastic dierential equations with multiple
delays and Markovian switching, Appl. Math. Comput. 206: 968979.
Raviart, P. A. & Thomas, J. M. (1983). Introduction to the Numerical Analysis
of Partial Dierential Equations, Collection of Applied Mathematics for the
Masters Degree, Masson, Paris. (in French).
Razevig, V. D. (1980). Digital modelling of multi-dimensional dynamics under
random perturbations, Automat. Remote Control 4: 177186. (in Russian).
Rebonato, R. (2002). Modern Pricing of Interest Rate Derivatives: The Libor
Market Model and Beyond, Princeton, University Press.
Revuz, D. & Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd
edn, Springer.
Richtmeyer, R. & Morton, K. (1967). Dierence Methods for Initial-Value Problems,
Interscience, New York.
Ripley, B. D. (1983). Stochastic Simulation, Wiley, New York.
Rodkina, A. & Schurz, H. (2005). Almost sure asymptotic stability of drift-implicit
-methods for bilinear ordinary stochastic dierential equations in 1 , J.
Comput. Appl. Math. 180: 1331.
Rogers, L. C. G. (1997). Arbitrage with fractional Brownian motion, Math. Finance
7(1): 95105.
Rogers, L. C. G. (2002). Monte-Carlo valuation of American options, Math. Finance
pp. 271286.
Rogers, L. C. G. & Talay, D. (eds) (1997). Numerical Methods in Finance. Sessions
at the Isaac Newton Institute, Cambridge, GB, 1995, Cambridge Univ. Press.
R
omisch, W. (1983). On an approximate method for random dierential equations,
in J. vom Scheidt (ed.), Problems of Stochastic Analysis in Applications, pp.
327337. Wiss. Beitr
age der Ingenieurhochschule Zwickau, Sonderheft.
R
omisch, W. & Wakolbinger, A. (1987). On the convergence rates of approximate
solutions of stochastic equations, Vol. 96 of Lecture Notes in Control and
Inform. Sci., Springer, pp. 204212.
R
omisch, W. & Winkler, R. (2006). Stepsize control for mean-square numerical
methods for stochastic dierential equations with small noise, SIAM J. Sci.
Comput. 28(2): 604625.
Ronghua, L., Hongbing, M. & Qin, C. (2006). Exponential stability of numerical
solutions to SDDEs with Markovian switching, Appl. Math. Comput. 174:
13021313.
Ronghua, L. & Yingmin, H. (2006). Convergence and stability of numerical solutions
to SDDEs with Markovian switching, Appl. Math. Comput. 175: 10801091.
Ronghua, L. & Zhaoguang, C. (2007). Convergence of numerical solution to SDDEs
with Poisson jumps and Markovian switching, J. Comput. Appl. Math. 184:
451463.
Rootzen, H. (1980). Limit distributions for the error in approximations of stochastic
integrals, Ann. Probab. 8(2): 241251.
Ross, S. A. (1976). The arbitrage theory of capital asset pricing, J. Economic
Theory 13: 341360.
Ross, S. M. (1990). A Course in Simulation, Macmillan.
References
829
830
References
References
831
832
References
References
833
834
References
Author Index
835
836
Author Index
Author Index
Davis, M. H. A., 6, 598, 601, 791, 803
Da Fonseca, J., 112, 784, 803
Deelstra, G., 63, 783, 803
Delbaen, F., XXVII, 40, 63, 94, 147,
783, 784, 803
Dembo, A., 438, 788, 803
DeMiguel, V., 154, 803
Demmel, J., 742, 796
Deng, F., 787, 790, 825
Denk, G., 786, 787, 803, 829
Derman, E., 42, 791, 797
Detemple, J., 638, 706, 789, 790, 798,
803
Devroye, L., 716, 785, 803
Dewynne, J., 734, 743, 791, 834
de la Cruz, H., 786, 800
De Raedt, H., 791, 803
DiMasi, G., 473, 788, 791, 804, 818
Ding, X., 790, 804
Diop, A., 63, 783, 785, 796, 797, 804
Doering, C. R., 786, 804
Donati-Martin, C., 71, 804
Dongarra, J., 742, 796
Doob, J. L., 20, 21, 804
Doss, H., 785, 804
Dothan, L. U., 42, 804
Douglas, J., 785, 804
Doukhan, P., 411, 804
Doumerc, Y., 71, 804
Drummond, I. T., 786, 804
Drummond, P. D., 787, 790, 804
Dsagnidse, A. A., 786, 804
Duane, S., 786, 804
Due, D., XXIII, 42, 405, 480, 481,
638, 783, 788, 804
Dupuis, P. G., 723, 791, 805, 818
E, W., 787, 790, 819
Eberlein, E., 14, 53, 75, 105, 783, 805
Eijkhout, V., 742, 796
Einstein, A., 6, 805
Ekeland, I., 790, 797
Ekstr
om, E., 784, 805
El-Borai, M. M., 786, 805
El-Khatib, Y., 598, 790, 805
El-Nadi, K. E., 786, 805
Elerain, O., 390, 788, 805
837
838
Author Index
Author Index
Hull, J., 42, 638, 791, 812
Hulley, H., 158, 164, 166, 168, 171, 172,
784, 812, 813
Hunter, C. J., 569, 789, 813
Iacus, S. M., 788, 813
Ikeda, N., XXVII, 21, 44, 58, 59, 278,
783, 786, 813
Ingersoll, J. E., 41, 163, 533, 802, 813
J
ackel, P., XXVII, 61, 62, 477, 569,
638, 783, 788, 789, 813, 814
Jackwerth, J. C., 791, 813
Jacobsen, M., 397, 788, 796
Jacod, J., 23, 75, 390, 417, 747, 783,
785, 786, 789, 793, 813
Janicki, A., 785, 786, 813
Jansons, K. M., 786, 789, 813
Janssen, R., 785, 786, 813
Jarrow, R., XXIII, 702, 784, 813, 814
Jeanblanc, M., XXVII, 814
Jeantheau, T., 411, 788, 808
Jensen, B., 791, 814
Jentzen, A., 786, 816
Jiang, G. J., 788, 814
Jiang, L., 791, 834
Jimenez, J. C., 786, 787, 789, 797, 800,
814
J
odar, L., 786, 802, 814
Johannes, M., XXIII, 414, 805, 814
Johansson, M. P., 598, 791, 803
Johnson, N. L., 171, 814
Jones, O. D., 61, 783, 799
Jordan, H. F., 791, 806
Jorion, P., XXIII, 814
Joshi, M. S., XXVII, 497, 569, 789,
813, 814
Joy, C., 638, 785, 814
Kabanov, Y., XXIII, 45, 56, 797
Kahl, C., 61, 326, 358, 783, 787, 814
Kallianpur, G., 421, 425, 426, 788, 807,
814
Kalman, R. E., 419, 788, 814
Kalos, M. H., 638, 791, 814
Kampen, J., 791, 814
Kamrad, B., 703, 815
Kan, R., 42, 405, 783, 804
Kanagawa, S., 785, 787, 789, 815
839
840
Author Index
Author Index
Mathe, P., 785, 812
Matsumoto, H., 71, 804
Matsumoto, M., 716, 785, 822, 825
Mattingly, J. C., 790, 818
Mauthner, S., 789, 822
McLeish, D. L., XXVII, 788, 822
McNeil, A., XXVII, 65, 66, 97, 783,
822
McNeil, K. J., 787, 790, 822
Mei, C., 786, 789, 833
Melchior, M., 791, 822
Meleard, S., 75, 783, 785, 786, 789, 813
Meng, X., 787, 790, 834
Mercurio, F., XXVII, 789, 798
Merener, N., 57, 534, 786, 789, 808
Merton, R. C., XXIII, 14, 37, 38, 41,
47, 50, 52, 158, 161, 162, 166, 414,
565, 822
Messaoud, M., 791, 796
Metwally, S. A. K., 786, 789, 822
Meyer, P. A., 784, 812
Michna, Z., 786, 813
Mikhailov, G. A., 638, 791, 806
Mikosch, T., XXVII, 805, 822
Mikulevicius, R., 482, 517, 518, 525,
534537, 620, 728, 748, 785, 786,
789, 822
Miller, S., 41, 158, 164, 166, 168,
170172, 784, 812, 822
Milne, F., 53, 820
Milstein, G. N., XXVI, XXVII, 61,
256, 261, 278, 316, 318, 322, 324,
352, 358, 477, 483, 499, 528, 572,
576, 585, 586, 638, 642, 643, 645,
784791, 808, 822, 823, 833
Miltersen, K. R., 434, 823
Misawa, T., 784, 823
Mitsui, T., 572, 574, 786, 787, 789, 790,
799, 817, 823, 829
Miyahara, Y., 53, 823
Mohammed, S.-E. A., 785, 812
Monfort, A., 398, 788, 809
Moore, J. B., XXVII, 309, 434437,
445, 788, 805
Mora, C. M., 786, 789, 823
Mordecki, E., 789, 823
Mortimer, I. K., 787, 790, 804
Morton, K., 734, 740, 791, 828
Mostafa, O. L., 786, 805
841
Ottinger,
H., 791, 822
Ozaki, T., 786, 788, 789, 814, 825, 830
Pages, G., 716, 785, 825
Palleschi, V., 785, 821
Pan, J., XXIII, 804
Pang, S., 787, 790, 825
Panneton, F., 716, 785, 825
Papaspiliopoulos, O., 784, 796
Pardoux, E., 785, 787, 789, 790, 825
Paskov, S., 785, 825
Pasquali, S., 457, 788, 801
Pearson, N., 42, 825
Pearson, R. A., 785787, 816
Pedersen, A. R., 390, 788, 825
Pelsser, A., 789, 791, 825
Peng, H., 417, 793
Penski, C., 787, 803
Petersen, W. P., 785787, 789, 790,
816, 825
842
Author Index
Author Index
Santa-Clara, P., 390, 788, 789, 798
Sanz-Sole, M., 787, 806
San Martin, J., 785, 820
Schachermayer, W., XXVII, 147, 783,
784, 803, 804
Sch
aer, S., 786, 787, 803
Schein, O., 786, 789, 819, 829
Schmitz-Abe, K., 789, 829
Schnell, S., XXIII, 375, 787, 833
Schoenmakers, J., 789791, 814, 823,
829
Scholes, M., 14, 37, 40, 47, 136, 797
Sch
onbucher, P. J., XXIII, XXVII, 56,
167, 829
Schoutens, W., 784, 789, 830
Schrage, L., 638, 716, 785, 791, 798
Schroder, M., 40, 830
Schurz, H., 61, 245, 322, 326, 358,
390, 393, 429, 477, 576, 585, 586,
785788, 790, 814, 816, 817,
823, 828830
Schwartz, E., 434, 823
Schwartz, E. S., 638, 713, 789, 820
Schweizer, M., 40, 128, 473, 475, 593,
638, 645, 668, 671, 682, 687, 784,
791, 806, 811813, 830
Seneta, E., 53, 75, 783, 820
Sestovic, D., 790, 827
Setiawaty, B., 440, 788, 830
Sevilla-Peris, P., 786, 802, 814
Seydel, R., XXVII, 788, 830
Shahabuddin, P., 791, 808
Shardlow, T., 785, 799
Shaw, W., 734, 741, 744, 789, 791, 829,
830
Shedler, G. S., 61, 783, 819
Shen, J., 785, 820
Shen, Y., 790, 821
Shephard, N., 14, 53, 75, 105, 390, 414,
783, 788, 795, 805
Sheu, S.-J., 40, 806
Shi, L., 571, 587, 588, 781, 790, 827
Shimbo, K., 784, 813, 814
Shimizu, A., 786, 830
Shinozuka, M., 786, 830
Shirakawa, H., 40, 94, 803
Shiryaev, A. N., XXVII, 1, 4, 5, 23, 25,
389, 421, 460463, 609, 612, 615,
843
844
Author Index
Author Index
Wu, K., 790, 804
Wu, S., 787, 790, 834
Xia, X., 787, 790, 820
Xiao, Y. J., 716, 785, 825
Xu, C., 791, 834
Xue, H., 786, 789, 833
Yamada, T., 785, 834
Yamato, Y., 785, 834
Yan, F., 785, 812
Yannios, N., 786, 834
Yen, V. V., 784, 834
Yin, G., 786, 789, 790, 821
Yingmin, H., 790, 828
Yong, J. M., 785, 820
Yor, M., XXVII, 14, 53, 62, 71, 75, 89,
90, 105, 168, 169, 783, 800, 804,
808, 814, 828
845
Index
847
848
Index
Brownian motion, 6
geometric, 47
Cauchy-Schwarz inequality, 21
CEV model, 40
chain rule
classical, 34
change of measure, 435
Chebyshev inequality, 23
CIR model, 41
classical chain rule, 34, 192
Clayton copula, 65
coecient function
It
o, 204
commutativity condition
diusion, 258, 516
jump, 286
compensated
drift coecient, 196
Poisson measure, 195, 211
Wagner-Platen expansion, 196, 206
weak Taylor scheme, 519
compound Poisson process, 414
concatenation operator, 197
conditional
density, 5
expectation, 5
iterated, 5
variance, 472, 639
conditionally Gaussian model, 460
condence interval, 480
consistent, 240
contingent claim, 470, 592
continuation region, 128, 592
continuous uncertainty, 139
control variate, 670
general, 671
convergence
order, 238
order of strong, 292
strong order, 251
theorem
jump-adapted, 535
Markov chain, 744
strong, 291
weak, 544
weak order, 519, 724
copula, 65
Archimedian, 65
Clayton, 65
Gaussian, 65, 66
correlation
property, 29
counting process, 139, 193
Courtadon model, 42
covariance, 2
covariation, 24
approximate, 24
property, 29
Cramer-Lundberg model, 12, 52
Crank-Nicolson method, 741
credit worthiness, 10
CRR binomial tree, 698
d-dimensional linear SDE, 48
defaultable zero coupon bond, 166
degree of implicitness, 239
density
Gaussian, 66
derivative-free
scheme, 267, 309, 334
strong order 1.0 scheme, 310
deviation, 479
dierence method, 234
dierential equation
ordinary, 233
diusion
coecient, 28, 248
additive, 287
multiplicative, 288
commutativity condition, 258, 516
stationary, 409
Dirac delta function, 131
discount rate process, 125
discounted payo with payo rate, 126
discrete-time approximation, 247
discretization
spatial, 735
distribution
generalized hyperbolic, 73
generalized inverse Gaussian, 80
non-central chi-square, 78
normal inverse Gaussian, 74
Student-t, 74
variance-gamma, 74
Wishart non-central, 80
Diversication Theorem, 153
diversied portfolio, 152
Index
drift coecient, 28, 248
compensated, 196
drift-implicit
Euler scheme, 316
simplied Euler scheme, 497
strong order 1.0 scheme, 318
strong scheme, 337
equidistant time discretization, 246
ergodicity, 132
error
absolute, 251
covariance matrix, 420
function, 78
global discretization, 234
local discretization, 234
roundo, 234
statistical, 478
strong, 368
systematic, 478
weak, 478, 699
estimate
least-squares, 16
Monte Carlo, 478
estimating function, 397
martingale, 398
estimation
maximum likelihood, 389
parameter, 390, 413
estimator
approximate, 396
HP, 678
maximum likelihood, 392, 394, 417
unbiased, 600
Euler method, 234
balanced implicit, 585
Euler scheme, 207, 246, 253, 276
fully implicit, 239
jump-adapted, 351
predictor-corrector, 576
simplied symmetric, 587
Euler-Maruyama scheme, 246
European call option, 166, 172
event
driven uncertainty, 140
exact simulation, 64
almost, 361
jump-adapted, 361
exact solution
849
almost, 105
exercise boundary, 592
expansion
compensated Wagner-Platen, 196,
206
general Wagner-Platen, 206
stochastic Taylor, 187
Stratonovich-Taylor, 191
Wagner-Platen, 187
Expectation Maximization algorithm,
438
expectation, conditional, 5
explicit
lter, 430
solution, 45
strong order 1.5 scheme, 269
weak order 2.0 scheme, 491
exponential
Levy model, 14, 53
extrapolation, 240, 495
Richardson, 241, 710
weak order 2.0, 495
weak order 4.0, 496
F
ollmer-Schweizer decomposition, 475
fair price, 471
fair zero coupon bond, 149
Feynman-Kac formula, 124, 128
lter, 436
distribution, 460
explicit, 430
nite-dimensional, 460
implicit, 432
Kalman-Bucy, 420
SDE, 421
ltered probability space, 4
ltering, 419, 456
ltration, 4
natural, 4
nite
dierence approximation, 737
dierence method, 734, 738
variation
property, 29
nite-dimensional lter, 460
nite-state jump model, 460
rst exit time, 128
rst hitting time, 19
rst jump commutativity condition, 515
850
Index
growth rate
long term, 145
Heath-Platen method, 677
hedgable part, 474
hedge ratio, 178, 591, 595, 634
hedging strategy, 473
Hermite polynomial, 202, 262
Heston model, 106, 111, 686
multi-dimensional, 108, 155
Heston-Zhou trinomial tree, 708
Heun method, 239
Heyde-optimal, 403
hidden Markov chain, 425
hidden state, 420
hierarchical set, 206, 290, 363, 519
high intensity, 372, 378
Ho-Lee model, 41
H
older inequality, 21, 22
HP estimator, 678
Hull-White model, 42
implicit
lter, 432
methods, 583
weak Euler scheme, 498
weak order 2.0 scheme, 500
incomplete
gamma function, 107
market, 458, 472
indirect inference, 398
indistinguishability, 57
inequality
Cauchy-Schwarz, 21
Chebyshev, 23
Doob, 20
Gronwall, 22
H
older, 21, 22
Jensens, 22
Lyapunov, 22
Markov, 23
maximal martingale, 20
information matrix, 393
information set, 4
insurance premium, 12
integrable random variable, 4
integral
It
o, 31
multiple stochastic, 279
Index
Riemann-Stieltjes, 28
Stratonovich, 33
symmetric, 33
intensity, 139
high, 372, 378
measure, 13, 32
interest rate model, 41
interpolation, 247
inverse Gaussian process, 97
inverse transform method, 64, 77
iterated conditional expectations, 5
It
o coecient function, 204, 363
compensated, 205
It
o dierential, 34
It
o formula, 34, 36, 100
jump process, 36
multi-dimensional, 35
It
o integral, 27, 31
equation, 39
multiple, 253
It
o process with jumps, 44
Jensens inequality, 22
jump coecient, 376, 468
jump commutativity condition, 286
rst, 515
second, 515
jump diusion, 44, 413
jump martingale, 140
jump measure
Poisson, 44
jump model
nite-state, 460
jump ratio, 51
jump size, 24
mark-independent, 511
jump time, 9, 11
jump-adapted
approximation, 348
derivative-free scheme, 355
drift-implicit scheme, 357
Euler scheme, 351, 524
simplied, 524
exact simulation, 361
predictor-corrector
Euler scheme, 531
scheme, 359
strong order It
o scheme, 367
strong order Taylor scheme, 363
851
Taylor scheme
weak order 2.0, 525
time discretization, 348, 377, 523
weak order
3.0 scheme, 527
weak order Taylor scheme, 535
weak scheme
exact, 533
Kalman-Bucy lter, 420
knock-out-barrier option, 128
Kolmogorov
backward equation, 125, 130, 598
forward equation, 130
stationary, 131
Law of the Minimal Price, 147
least-squares
estimate, 16
Monte Carlo, 638
leverage eect, 41
Levy area, 261
Levy measure, 14
Levy process, 14, 31
matrix, 97
Levys Theorem, 26
Lie group symmetry, 75
likelihood ratio, 390
linear congruential random number
generator, 716
linear growth condition, 58
linear SDE, 101
linear test SDE, 572
linearity, 29
linearized
SDE, 603, 629
stochastic dierential equation, 598
Lipschitz condition, 58
local discretization error, 234
local martingale, 25
property, 29
local risk minimization, 473
lognormal
model, 42
long term growth rate, 145
Longsta model, 42
lookback
option, 627
Lyapunov inequality, 22
852
Index
mark, 10, 11
mark set, 13
mark-independent jump size, 511
market
activity, 119
incomplete, 458, 472
price
of event risk, 142
of risk, 141, 468
time, 120
market risk
general, 151
specic, 151
Markov chain
approximation, 722
second order, 730
third order, 731
convergence theorem, 744
hidden, 425
Markov inequality, 23
Marsaglia-Bray method, 716
martingale, 16, 133
estimating function, 398
jump, 140
local, 25
measure
minimal equivalent, 593
property, 29
representation, 473, 591, 595, 596,
600, 605, 646
representation theorem, 607
strict local, 25
matrix
commute, 48, 101
error covariance, 420
fundamental, 48
Riccati equation, 421
maximum likelihood
estimation, 389
estimator, 392, 394, 417
mean, 2
reverting, 451
measure
change, 435
intensity, 13, 32
Levy, 14
Poisson, 55
transformation, 645
Index
Longsta, 42
Merton, 37, 50, 52, 55, 158, 414, 562
minimal market, 40, 115, 168
multi-factor, 458
Ornstein-Uhlenbeck, geometric, 40
Pearson-Sun, 42
Platen, 42
Sandmann-Sondermann, 42
Vasicek, 41, 209
modication
Black-Scholes, 706
modied Bessel function, 71
moment estimate for SDEs, 59
Monte Carlo
estimate, 478
least-squares, 638
method, 477
simulation, 477
multi-dimensional
ARCH diusion, 156
Heston model, 155
MMM, 117
multi-factor model, 458
multi-index, 196, 206, 363
multiple It
o integral, 253
approximate, 263
multiple stochastic integral, 194, 197,
218
mixed, 283
multiple Stratonovich integral, 253
approximate, 265
multiplicative diusion coecient, 288
natural ltration, 4
non-central
chi-square distribution, 78, 107, 171
Wishart distribution, 80
normal inverse Gaussian distribution,
74
Novikov condition, 135, 646
numerical instability, 244
numerical stability, 571, 740
numerically stable, 244
optimal estimating equation, 403
option
American, 700
basket, 621
European call, 166, 172
knock-out-barrier, 128
lookback, 627
path dependent, 638
Optional Sampling Theorem, 20
order of convergence, 238
order of strong convergence, 251
order of weak convergence, 491
Ornstein-Uhlenbeck process, 46, 69
orthogonal, 421
parameter estimation, 390, 413
partial
dierential equation, 123
integro dierential equation, 124
path dependent option, 638
pathwise approximation, 251
payo
continuous, 617
non-smooth, 556, 633
rate, 126
smooth, 549, 707
Pearson-Sun model, 42
pentanomial tree, 709
Platen model, 42
Platen scheme, 267
Poisson
jump measure, 44
measure, 12, 14, 32, 55
compensated, 195, 211
process, 8
compensated, 17
compound, 10, 414
time transformed, 9
portfolio, 142, 468
diversied, 152, 158
predictable, 19
predictor-corrector
Euler scheme, 327, 576
price
benchmarked, 734
pricing formula
binomial option, 181
Black-Scholes, 184
real world, 148, 177
risk neutral, 148
primary security account, 140
probability
measure change, 133
space, 1
ltered, 4
853
854
Index
probability (cont.)
unnormalized conditional, 426
process
counting, 139, 193
gamma, 97
general ane, 96
inverse Gaussian, 97
Levy, 14
Ornstein-Uhlenbeck, 46, 69, 85
multi-dimensional, 86
Poisson, 8
predictable, 19
pure jump, 32, 193
Radon-Nikodym derivative, 133
square root, 62, 72, 259, 409
squared Bessel, 77, 116, 210
multi-dimensional, 89
stationary, 2
VG-Wishart, 98
Wiener, 6
Wishart, 70, 91, 94
property
correlation, 29
covariation, 29
nite variation, 29
local martingale, 29
martingale, 29
regularity, 152
supermartingale, 146
pure jump
process, 32, 193
SDE, 375
quadratic variation, 23
approximate, 23
Radon-Nikodym derivative, 133
random
bit generator, 713, 716
measure, 12
number,quasi, 638
variable,integrable, 4
real world pricing formula, 148, 177
recovery rate, 10
regular
time discretization, 275
weak approximation, 507
regularity property, 152
remainder set, 206, 363
Index
second jump commutativity condition,
515
second moment estimate, 212
self-nancing, 142, 468
semimartingale, 30
special, 30
sequence of
approximate GOPs, 153
diversied portfolios, 152
JDMs
regular, 152
sigma-algebra, 1, 4
augmented, 4
simplied
Euler scheme, 482
symmetric predictor-corrector Euler
scheme, 587
weak order
2.0 Taylor scheme, 484
3.0 Taylor scheme, 486
weak schemes, 712
Sklars theorem, 65
smooth payo, 707
sparse matrix, 742
spatial discretization, 735
specic generalized volatility, 151
specic market risk, 151
square root process, 62, 72, 259, 409
squared Bessel process, 77, 116, 210
SR-process
correlated, 89
multi-dimensional, 72, 87
stability region, 574
state space, 1
stationary
density, 131
diusion, 409
independent increment, 6
process, 2
statistical error, 478
stochastic
dierential equation, 38
linearized, 598
vector, 43
integral
compensated multiple, 198
multiple, 194, 197, 218
process, 1
855
856
Index