Johannes Wissel
Cornell University
Fall 2010
0. Introduction
Motivation: Some examples of financial derivatives

If S_T ≤ K, the option is worthless.
If S_T > K, exercise the option (buy S for K) and sell S on the exchange for S_T, making a profit S_T − K.
Thus, the option value at maturity is
(S_T − K)^+
(option payoff).

If S_T ≥ K, the option is worthless.
If S_T < K, exercise the option (sell S for K) and buy S on the exchange for S_T, making a profit K − S_T.
Thus, the option value at maturity is
(K − S_T)^+
(option payoff).

E[C_1] = (1/2)(115 − 100)^+ + (1/2)(90 − 100)^+ = 7.5
If C_0 > 6, the seller could make a certain profit of C_0 − 6 by selling the option and simultaneously employing the above investment strategy.
If C_0 < 6, the buyer could make a certain profit of 6 − C_0 by buying the option and simultaneously employing the reverse strategy (Δ = 0.6, b = −54).
Such arbitrage opportunities (the possibility of a riskless profit without net investment of capital) are unrealistic in most markets.
Absence of arbitrage is a fundamental concept in financial market models and in the theory of derivative pricing.
Let C_1 = (S_1 − K)^+ denote the option payoff. We can rewrite (1.1), (1.2) in one equation as
b + Δ S_1 = C_1,
which must hold for both possible values of S_1.
The fair option value at time 0 is the initial value of the above investment strategy,
b + Δ S_0 = C_0.
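The replication equations above can be solved directly. A minimal sketch with the numbers used in the text (S_0 = 100, S_1 ∈ {115, 90}, strike K = 100, zero interest):

```python
# One-period binomial example from the text: solve the replication equations
#   b + delta * 115 = 15   and   b + delta * 90 = 0
# for the stock position delta and the bond/bank position b.
S0, Su, Sd, K = 100.0, 115.0, 90.0, 100.0
Cu, Cd = max(Su - K, 0.0), max(Sd - K, 0.0)   # option payoffs: 15 and 0

delta = (Cu - Cd) / (Su - Sd)   # 0.6 shares of stock
b = Cu - delta * Su             # -54 in the bank (i.e. borrow 54)

C0 = b + delta * S0             # initial cost of the replicating strategy
print(delta, b, C0)             # 0.6, -54, 6
```

This reproduces the values (Δ = 0.6, b = −54) and the fair price C_0 = 6 quoted above.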
In the above example we have E[S_1] > S_0 and therefore E[C_1] > C_0.
Valuation via expectations. Now suppose we change the probabilities of the values of S_1 to p̃ and 1 − p̃ such that
Ẽ[S_1] = p̃ · 115 + (1 − p̃) · 90 = S_0.
We have to take p̃ = 0.4. Then we obtain
C_0 = b + Δ S_0 = b + Δ Ẽ[S_1] = Ẽ[b + Δ S_1] = Ẽ[C_1] = Ẽ[(S_1 − K)^+].
Conclusion: the fair option price is the expected payoff under the new measure P̃ with
Ẽ[S_1] = S_0.
The new measure is known as a risk-neutral or pricing measure.
The concept of pricing assets by computing expectations under risk-neutral measures is another fundamental concept in financial engineering. We will later generalize this method to dynamic (multi-period) models using stochastic processes known as martingales.
There is a deep relationship between absence of arbitrage and asset pricing via risk-neutral measures, known as the fundamental theorem of asset pricing. We will formulate and discuss several versions of this result in later chapters of the course.
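The risk-neutral computation above can be sketched in a few lines for the same one-period example:

```python
# Risk-neutral valuation of the one-period example: choose p~ so that the
# expected stock price equals S_0, then price the call as the expected payoff.
S0, Su, Sd, K = 100.0, 115.0, 90.0, 100.0

p = (S0 - Sd) / (Su - Sd)       # solves p*115 + (1-p)*90 = 100  ->  p = 0.4
C0 = p * max(Su - K, 0) + (1 - p) * max(Sd - K, 0)   # expected payoff = 6.0
print(p, C0)
```

The result agrees with the replication price C_0 = 6 obtained above.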
The multi-period binomial model
Asset prices B_t (bond) and S_t (stock) at time t, and a maturity date N.
For a self-financing trading strategy (sfts) with bond positions θ_t and stock positions Δ_t, the discounted wealth satisfies
X_{t+1}/B_{t+1} − X_t/B_t = (θ_t B_{t+1} + Δ_t S_{t+1})/B_{t+1} − (θ_t B_t + Δ_t S_t)/B_t = Δ_t (S_{t+1}/B_{t+1} − S_t/B_t),
hence
X_n/B_n = X_k/B_k + Σ_{t=k}^{n−1} Δ_t (S_{t+1}/B_{t+1} − S_t/B_t)  (1.5)
Note:
X_n depends only on X_k and the stock positions Δ_k, ..., Δ_{n−1}. (Bond positions are determined via (1.3).)
X_n is a random variable which depends on the (random) values of the stochastic process S_t, t = 0, ..., n.
Replication via a sfts
h_{t+1}(S_{t+1})/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_{t+1}/B_{t+1} − S_t/B_t)  (1.6)
must hold for any values of the random variables S_t, S_{t+1}.
Given S_t there are two possibilities for S_{t+1}:
h_{t+1}(S_t u)/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_t u/(B_t (1+r)) − S_t/B_t),
h_{t+1}(S_t d)/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_t d/(B_t (1+r)) − S_t/B_t)
Solving for Δ_t(S_t), h_t(S_t) gives
Δ_t(S_t) = (h_{t+1}(S_t u) − h_{t+1}(S_t d)) / (S_t (u − d)),
h_t(S_t)/B_t = (1/B_{t+1}) (p̃ h_{t+1}(S_t u) + (1 − p̃) h_{t+1}(S_t d))
where p̃ is defined by
p̃ · S_t u/(B_t (1+r)) + (1 − p̃) · S_t d/(B_t (1+r)) = S_t/B_t  ⟺  p̃ = (1 + r − d)/(u − d)  (1.7)
In summary we obtain
Theorem 1.2
For a derivative with payoff h(S_N) at time N, define recursively functions h_N, ..., h_0 via
h_N(S_N) = h(S_N),  (1.8)
h_t(S_t)/B_t = (1/B_{t+1}) (p̃ h_{t+1}(S_t u) + (1 − p̃) h_{t+1}(S_t d))  (1.9)
for t = N − 1, ..., 0 and p̃ in (1.7). Then there exists a sfts with value process X_t = h_t(S_t), t = 0, ..., N, and stock positions
Δ_t(S_t) = (h_{t+1}(S_t u) − h_{t+1}(S_t d)) / (S_t (u − d))  (1.10)
for t = 0, ..., N − 1.
We say we can replicate the payoff h(S_N) from initial capital h_k(S_k) at time k within the binomial model, using the sfts (1.10). We call X_k the fair price of the derivative at time k.
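The backward induction (1.8)-(1.9) of Theorem 1.2 can be sketched on a recombining tree. The parameter values below are assumed for illustration only:

```python
# Backward induction of Theorem 1.2 for a call payoff h(s) = max(s - K, 0):
# start from h_N on the terminal nodes (1.8) and step back via (1.9).
u, d, r, S0, N, K = 1.2, 0.8, 0.05, 100.0, 3, 100.0
p = (1 + r - d) / (u - d)                      # risk-neutral probability (1.7)

# level N: node k has stock price S0 * u^k * d^(N-k), k = number of up-moves
h = [max(S0 * u**k * d**(N - k) - K, 0.0) for k in range(N + 1)]
for t in range(N - 1, -1, -1):                 # one backward step per period
    h = [(p * h[k + 1] + (1 - p) * h[k]) / (1 + r) for k in range(t + 1)]
    # now h[k] = h_t(S0 * u^k * d^(t-k))

price = h[0]                                   # fair price h_0(S_0) at time 0
print(round(price, 4))
```

The hedge ratios Δ_t follow from (1.10) by differencing adjacent node values at each level.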
c) if A_1, A_2, ... ∈ F, then ∪_{j=1}^∞ A_j ∈ F.
For P we demand
a) P[Ω] = 1
b) if A_1, A_2, ... ∈ F are disjoint, then P[∪_{j=1}^∞ A_j] = Σ_{j=1}^∞ P[A_j].
Elements of F are called events.
Remark: If A_1, A_2, ... ∈ F, then
∩_{j=1}^∞ A_j = (∪_{j=1}^∞ A_j^c)^c ∈ F.
Also for A_1, ..., A_n ∈ F we have ∪_{j=1}^n A_j ∈ F and ∩_{j=1}^n A_j ∈ F
(take A_j = ∅ for j > n in c)).
Interpretation
(Ω, F, P) is a model for a random experiment.
Example 2.2 (Finite independent coin-toss space, fair coin)
Ω_N = { ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N },
F = { A | A ⊆ Ω_N }, and P[A] = |A| / 2^N for all A ∈ F.
Example 2.3 (Another probability measure)
Finite independent coin-toss space (unfair coin), binomial model:
Let p ∈ (0, 1) and
Ω_N = { ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N },
F = { A | A ⊆ Ω_N }, and P[ω] = p^{#H(ω_1...ω_N)} (1 − p)^{#T(ω_1...ω_N)}
for all ω ∈ Ω_N. This defines a probability on F by setting
P[A] = Σ_{ω ∈ A} P[ω].
It is easy to check the axioms for F and P in Definition 2.1.
The power set F = { A | A ⊆ Ω } is a σ-algebra for every space Ω.
However this is not always the appropriate choice:
Example 2.4 (Infinite coin-toss space)
Ω_∞ = { ω = (ω_1 ω_2 ...) | ω_i ∈ {H, T}, i ≥ 1 }.
For x ∈ {H, T}^n let A_x be the set of all ω ∈ Ω_∞ beginning with x. We want P[A_x] = 1/2^n for our measure P. But this implies
P[{ω_1 ω_2 ...}] = P[A_{ω_1} ∩ A_{ω_1 ω_2} ∩ ...] = lim_{n→∞} P[A_{ω_1...ω_n}] = lim_{n→∞} 1/2^n = 0
(see [Shr04], Theorem A.1.1.), i.e. individual ω have probability zero.
Idea: Top-down approach.
1) For each n ≥ 1, let 𝒜_n consist of the sets A_x with x ∈ {H, T}^n.
Examples: 𝒜_1 = {A_H, A_T}, 𝒜_2 = {A_HH, A_HT, A_TH, A_TT}, etc.
2) Collect all sets in 𝒜_1, 𝒜_2, ..., and add all sets required to make the collection a σ-algebra. We call the result F_∞. Note that F_∞ ⊆ F.
3) It can be shown¹ that specifying the probability on the sets in 𝒜_1, 𝒜_2, ... (via P[A_x] = 1/2^n for any x ∈ {H, T}^n) uniquely determines a probability measure P on F_∞.
Note: The same construction works for p ∈ (0, 1) and
P[A_x] = p^{#H(x)} (1 − p)^{#T(x)}.
¹ Caratheodory's extension theorem, see e.g. [Dur95], Theorem A.1.1.
The last example shows that for infinite probability spaces, P[A] = 1 does not necessarily imply A = Ω.
Definition 2.5
Let (Ω, F, P) be a probability space. If A ∈ F satisfies P[A] = 1, we say that A occurs almost surely (a.s.).
Random variables and distributions
Definition 2.6
Let (Ω, F, P) be a probability space. A function X : Ω → R is called F-measurable if
{X ≤ b} := { ω ∈ Ω | X(ω) ≤ b } ∈ F
for all b ∈ R. We also say X is a random variable on (Ω, F).
If F is the power set, then every function on Ω is F-measurable.
If X and Y are F-measurable, then f(X, Y) is also F-measurable for every reasonable² function f.
The distribution function F_X of X is
F_X(x) = P[X ≤ x].
² The function f must be Borel-measurable. Every function f : R² → R we shall ever encounter is Borel-measurable.
Take the intervals [a, b] and all subsets of R required to make the collection a σ-algebra (called the Borel-σ-algebra B(R) on R).
Let X be a random variable on (Ω, F, P). Then {X ∈ B} ∈ F for every B ∈ B(R). The distribution μ_X of X under P is defined by
μ_X(B) := P[X ∈ B], B ∈ B(R).
μ_X defines a probability measure on (R, B(R)).
The distribution determines the distribution function, and vice versa.
Example 2.7 (Stock price in binomial model)
Consider the binomial model (Ω_N, F, P) (Example 2.2) and let
S_t(ω) = S_0 u^{#H(ω_1...ω_t)} d^{#T(ω_1...ω_t)}
for t = 0, ..., N. Choose for instance S_0 = 1 and u = 2, d = 1/2.
Then the distribution μ_{S_2} of S_2 under P is determined by
μ_{S_2}[{4}] = μ_{S_2}[{1/4}] = 1/4,  μ_{S_2}[{1}] = 1/2.
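The distribution in Example 2.7 can be checked by enumerating the four equally likely outcomes of the first two fair-coin tosses:

```python
from itertools import product
from fractions import Fraction

# Enumerate the first two fair coin tosses and tabulate S_2 = S_0 * u^#H * d^#T
# with S_0 = 1, u = 2, d = 1/2 as in Example 2.7.
S0, u, d = Fraction(1), Fraction(2), Fraction(1, 2)
dist = {}
for w in product("HT", repeat=2):
    s2 = S0 * u ** w.count("H") * d ** w.count("T")
    dist[s2] = dist.get(s2, Fraction(0)) + Fraction(1, 4)  # each path has prob 1/4

print(dist)
```

Exact rational arithmetic (`Fraction`) avoids floating-point keys colliding only approximately.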
Many non-discrete random variables X have a density, i.e., a function f_X ≥ 0 on R such that
μ_X(B) = ∫_B f_X(x) dx for all B ∈ B(R).
In particular, the distribution function F_X satisfies
F_X(b) = μ_X((−∞, b]) = ∫_{−∞}^b f_X(x) dx
and so
f_X(x) = F_X'(x).
Example 2.8
Let a < b and suppose
μ_X(B) = P[X ∈ B] = ∫_B (1/(b−a)) I_{[a,b]}(x) dx.
Then X has uniform distribution on [a, b]. Its density is
f(x) = (1/(b−a)) I_{[a,b]}(x).
Example 2.9
We say that X has exponential distribution with parameter λ if
F_X(x) = P[X ≤ x] = 1 − e^{−λx}
for x ≥ 0, and F_X(x) = 0 for x < 0. Then X has density
f_X(x) = d/dx (1 − e^{−λx}) = λ e^{−λx}
for x ≥ 0 and f_X(x) = 0 for x < 0.
Example 2.10
Suppose P[X ≤ x] = Φ(x) := ∫_{−∞}^x (1/√(2π)) e^{−y²/2} dy.
We say X has standard normal distribution. X has density φ(x) = Φ'(x) = (1/√(2π)) e^{−x²/2}.
Expectations
Let X be a random variable on a probability space (Ω, F, P). The elementary definition of the expectation
E[X] = Σ_{ω∈Ω} X(ω) P[{ω}]
cannot be used if Ω is uncountably infinite (as in Ex. 2.4).
First suppose X = Σ_{i=1}^k x_i I_{A_i} with A_i ∈ F, using the indicator function
I_A(ω) = 1 (ω ∈ A), 0 (ω ∉ A).
Such a random variable is called a simple function. We then define
E[X] = Σ_{i=1}^k x_i E[I_{A_i}] = Σ_{i=1}^k x_i P[A_i].
If X ≥ 0, let X_n (n ≥ 1) be any sequence³ of simple functions such that X_n ↑ X (n → ∞) a.s. Then define
E[X] = lim_{n→∞} E[X_n].
One can show that this limit always exists in [0, ∞], and does not depend on the choice of the approximating sequence.
For general X write X = X⁺ − X⁻ with X⁺ = max(X, 0), X⁻ = max(−X, 0). So we define
E[X] = E[X⁺] − E[X⁻],
provided that E[X⁺], E[X⁻] are not both ∞.
³ For instance X_n = Σ_{j=0}^{n 2^n} (j/2^n) I_{{j/2^n ≤ X < (j+1)/2^n}}.
Theorem 2.11
a) X is integrable if and only if E[|X|] < ∞.
b) Linearity: For α, β ∈ R and r.v.s X, Y
E[αX + βY] = αE[X] + βE[Y]
c) Monotonicity: If X ≤ Y a.s., then E[X] ≤ E[Y].
Now let X and X_1, X_2, ... be random variables on (Ω, F, P).
Theorem 2.12
a) Monotone convergence: If 0 ≤ X_n ↑ X (n → ∞) a.s., then
lim_{n→∞} E[X_n] = E[X].
b) Dominated convergence: If X_n → X (n → ∞) a.s. and |X_n| ≤ Y a.s. for all n and some integrable Y, then
lim_{n→∞} E[X_n] = E[X].
c) Fatou's lemma: If X_n ≥ 0 a.s. for all n, then
E[liminf_{n→∞} X_n] ≤ liminf_{n→∞} E[X_n].
Proofs: See e.g. [Dur95], Theorems A.4.7, A.5.4, A.5.5, A.5.6.
Example 2.13
On Ω_∞ let X_n = I_{{ω_n = H}} and X = Σ_{n=1}^∞ X_n/2^n. By monotone convergence, we have
E[X] = Σ_{n=1}^∞ E[X_n/2^n] = Σ_{n=1}^∞ (1/2)(1/2^n) = 1/2.
We can also compute the distribution of X. We verify that
P[k/2^n ≤ X < (k+1)/2^n] = 1/2^n
for n = 1, 2, ... and all intervals [k/2^n, (k+1)/2^n) ⊆ [0, 1]. Hence
μ_X([a, b]) = b − a
for all a, b ∈ [0, 1] of the form k/2^n. A limit argument then yields the equation for all a, b ∈ [0, 1]. Hence μ_X is the uniform distribution on [0, 1].
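A Monte Carlo sketch of Example 2.13 (assuming the binary-expansion construction X = Σ_n I(ω_n = H)/2^n): simulate coin sequences, truncate the series, and check that the sample behaves like a uniform variable on [0, 1].

```python
import random

# Monte Carlo sketch of Example 2.13: X = sum_n I(w_n = H) / 2^n is (up to
# truncation of the series) a uniform random number on [0, 1].
random.seed(0)

def sample_X(n_terms=30):
    return sum(random.randint(0, 1) / 2 ** n for n in range(1, n_terms + 1))

xs = [sample_X() for _ in range(100_000)]
mean = sum(xs) / len(xs)                                   # should be near 1/2
frac_below_quarter = sum(x < 0.25 for x in xs) / len(xs)   # should be near 1/4
print(round(mean, 3), round(frac_below_quarter, 3))
```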
Computing expectations
Theorem 2.14
Let X be a random variable on (Ω, F, P), and g a function on R. If g(X) is integrable, then
E[g(X)] = ∫ g(x) dμ_X(x).
Theorem 2.15
Let X be a random variable on (Ω, F, P) with density f, and g a function on R. If g(X) is integrable, then
E[g(X)] = ∫ g(x) f(x) dx.
Proofs: See [Shr04], Theorems 1.5.1. and 1.5.2.
Let X be a random variable with E[X²] < ∞. The variance of X is
Var[X] = E[(X − E[X])²] = E[X²] − E[X]².
The standard deviation of X is √Var[X].
Example 2.16
Suppose X has standard normal distribution. Then
E[X] = ∫ x φ(x) dx = 0,
Var[X] = E[X²] = ∫ x² φ(x) dx = 1.
Example 2.17
Suppose X has uniform distribution on [0, 1]. Its density is f(x) = I_{[0,1]}(x), and so
E[X] = ∫_0^1 x dx = 1/2.
Independence
Definition 2.18
Let (Ω, F, P) be a probability space.
Sets A_1, ..., A_n ∈ F are independent if
P[A_1 ∩ ... ∩ A_n] = P[A_1] ⋯ P[A_n].
Random variables X_1, ..., X_n on (Ω, F, P) are independent if
P[X_1 ≤ a_1, ..., X_n ≤ a_n] = P[X_1 ≤ a_1] ⋯ P[X_n ≤ a_n]
for all a_1, ..., a_n ∈ R.
Infinitely many sets A_1, A_2, ... are independent if the sets A_1, ..., A_n are independent for every n ∈ N. Infinitely many r.v. X_1, X_2, ... are independent if X_1, ..., X_n are independent for every n ∈ N.
Remark: If X_1, ..., X_n are independent, then also
P[X_1 ∈ B_1, ..., X_n ∈ B_n] = P[X_1 ∈ B_1] ⋯ P[X_n ∈ B_n]
for any sets B_1, ..., B_n ∈ B(R).
The joint distribution function of X, Y is
F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y].
If there exists a function f_{X,Y} ≥ 0 such that
F_{X,Y}(x, y) = ∫_{−∞}^x ( ∫_{−∞}^y f_{X,Y}(u, v) dv ) du,
then X, Y have a joint density f_{X,Y}.
Theorem 2.19
Let X, Y be random variables. The following are equivalent.
(i) X and Y are independent.
(ii) The joint distribution function of X and Y factors:
F_{X,Y}(x, y) = F_X(x) F_Y(y) for all x, y ∈ R.
If X, Y have a joint density, then (i) and (ii) are equivalent to
(iii) The joint density of X and Y factors:
f_{X,Y}(x, y) = f_X(x) f_Y(y) for all x, y ∈ R.
Theorem 2.20
Let X, Y be independent random variables and f, g functions. Then f(X) and g(Y) are independent.
Proofs: [Shr04] Theorems 2.2.5. and 2.2.7.
Remark: The last results can be generalized to any number of random variables, see [Res99] Theorem 4.2.1 and Lemma 4.4.1.
Example 2.21
Let Ω_N be the independent coin toss space in Example 2.3 and
Y_t(ω) = u (ω_t = H), d (ω_t = T)
for t = 1, ..., N. One checks
P[Y_1 ≤ a_1, ..., Y_N ≤ a_N] = P[Y_1 ≤ a_1] ⋯ P[Y_N ≤ a_N]
for all a_1, ..., a_N ∈ R: if any a_t < d then both sides are 0; if all a_t ≥ d then both sides are (1 − p)^{#{t | a_t < u}}.
Example 2.22
Let S_t = S_0 Π_{i=1}^t Y_i be the stock price in the binomial model.
Then for any 0 ≤ k < n ≤ N, the random variables S_k and S_n/S_k are independent.
Example 2.24
Let X_1, ..., X_n be independent standard normal random variables. Then their joint density is
f_{X_1...X_n}(x_1, ..., x_n) = (1/(2π)^{n/2}) e^{−(x_1² + ... + x_n²)/2},  x_1, ..., x_n ∈ R,
the product of the marginal densities f_{X_i}(x_i) = (1/√(2π)) e^{−x_i²/2}.
Theorem 2.25 (Strong law of large numbers)
Let X_1, X_2, ... be independent and identically distributed (iid) random variables with E[|X_i|] < ∞ and E[X_i] = μ ∈ R. Then
lim_{n→∞} (1/n) Σ_{i=1}^n X_i = μ a.s.
Proof: [Dur95] Theorem 1.7.1.
Theorem 2.26
Let X, Y be independent and integrable. Then
E[XY] = E[X]E[Y].
Proof: [Shr04] Theorem 2.2.7 (vi).
Corollary 2.27
Let X_1, ..., X_n be independent random variables with E[X_i²] < ∞ for all i. Then
Var[X_1 + ... + X_n] = Var[X_1] + ... + Var[X_n].
Dependence. The simplest way to measure dependence of random variables is via covariance and correlation.
Definition 2.28
Let X, Y have second moments. The covariance of X and Y is
Cov[X, Y] = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y].
The correlation of X and Y is
Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).
The Cauchy-Schwarz inequality implies that
Corr[X, Y] ∈ [−1, 1].
Definition 2.29
Let X = (X_1, ..., X_d)^⊤ be a random vector with second moments (here ⊤ is transpose). The covariance matrix of X is Cov[X] = (Cov[X_i, X_j])_{i,j=1,...,d}.
This implies Cov[BX + b] = B Cov[X] B^⊤ for B ∈ R^{k×d} and b ∈ R^k.
Example 2.30 (Multivariate normal distribution)
Let X = (X_1, ..., X_d)^⊤ = AZ with Z = (Z_1, ..., Z_d)^⊤, Z_1, ..., Z_d independent standard normal random variables and A ∈ R^{d×d} with AA^⊤ = Σ. Then X has multivariate normal distribution with covariance matrix Cov[X] = Σ.
Information and σ-algebras
Take a probability space Ω. Let X_i (i ∈ I) be functions on Ω.
The smallest σ-algebra which contains all sets {X_i ≤ b} where b ∈ R and i ∈ I is called the σ-algebra generated by the X_i. It is denoted by σ(X_i | i ∈ I). This is the smallest σ-algebra for which each X_i is measurable.
Interpretation: σ(X_i | i ∈ I) models the (minimal) information required to determine the values of the X_i.
Example 2.31 (Binomial model with partial information I)
Let
Ω_N = { ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N }
with F = { A | A ⊆ Ω_N }, and S_t = S_0 Y_1 ⋯ Y_t with the Y_j from Example 2.21. For each n ≤ N we define
F_n = σ(Y_1, ..., Y_n) = σ(S_1, ..., S_n).
We say F_n is the information available in the market at time n.
Usually σ-algebras are defined via generators. In some simple cases one can explicitly specify the list of sets in the σ-algebra:
Example 2.32
Let Ω_N as before. Fix n ≤ N, and for each x ∈ {H, T}^n let A_x be the set of all ω ∈ Ω_N beginning with x. Then F_n is the family of all sets that can be generated by taking unions of sets A_x with x ∈ {H, T}^n.
We find
F_1 = {∅, Ω_N, A_H, A_T},
F_2 = {∅, Ω_N, A_HH, A_HT, A_TH, A_TT,
A_H, A_T, A_HH ∪ A_TT, A_HT ∪ A_TT, A_TH ∪ A_HH, A_TH ∪ A_HT,
A_HH ∪ A_HT ∪ A_TH, A_HH ∪ A_HT ∪ A_TT,
A_HH ∪ A_TH ∪ A_TT, A_HT ∪ A_TH ∪ A_TT}, etc.
Example 2.33 (Binomial model with partial information II)
Let Ω_N and F as before, and let X(ω) = #H(ω). Then σ(X) is the information obtained from observing only the total number of heads. We have σ(X) = σ(S_N).
Remark. For a single random variable X, the family σ(X) consists of all sets {X ∈ B} with B ∈ B(R).
Definition 2.34
Let (Ω, F, P) be a probability space.
Sub-σ-algebras G_1, ..., G_n of F are independent if
P[A_1 ∩ ... ∩ A_n] = P[A_1] ⋯ P[A_n] for all A_i ∈ G_i.
Example: In the binomial model let F_n = σ(Y_1, ..., Y_n). Then the Y_t for t ≥ n + 1 are independent of F_n.
Conditional expectations and martingales
Now let A ∈ F with P[A] > 0, and suppose we know that the outcome of the random experiment will be in A. What is our estimate for X in this case?
The conditional expectation of X given A is
E[X|A] = E[X I_A] / P[A].
Example: Take the binomial model (cf. Example 2.2), where
S_n = S_0 Π_{i=1}^n Y_i and E[S_n] = S_0 Π_{i=1}^n E[Y_i] = S_0 ((u+d)/2)^n.
Now suppose we know the results of the first two coin tosses, or the values of Y_1, Y_2, or the information in F_2. We find
E[S_n | A_HH] = E[(S_0 Π_{i=1}^n Y_i) I_{{Y_1 = Y_2 = u}}] / P[A_HH] = S_0 u² ((u+d)/2)^{n−2},
E[S_n | A_HT] = S_0 u d ((u+d)/2)^{n−2},
E[S_n | A_TH] = S_0 d u ((u+d)/2)^{n−2},
E[S_n | A_TT] = S_0 d² ((u+d)/2)^{n−2}.
Note: On each of the individual sets, the conditional expectations are equal to the value of the random variable S_2 ((u+d)/2)^{n−2}. Thus
E[S_n I_A] = E[S_2 ((u+d)/2)^{n−2} I_A]
for any set A ∈ {A_HH, A_HT, A_TH, A_TT}, and hence for all A ∈ F_2.
In summary, the random variable S_2 ((u+d)/2)^{n−2} is F_2-measurable and for any A ∈ F_2 satisfies
E[S_n I_A] = E[S_2 ((u+d)/2)^{n−2} I_A].
It is an estimate of S_n based on the information in F_2.
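The formula E[S_n | A_HH] = S_0 u² ((u+d)/2)^{n−2} can be verified by brute-force enumeration of the fair-coin space Ω_N of Example 2.2; the values of S_0, u, d, n below are assumed for illustration:

```python
from itertools import product

# Check E[S_n | A_HH] = S_0 * u^2 * ((u+d)/2)^(n-2) by enumerating all coin
# paths of length n (each with probability 2^-n) and restricting to A_HH.
S0, u, d, n = 1.0, 2.0, 0.5, 5

total = count = 0.0
for w in product("HT", repeat=n):
    if w[0] == w[1] == "H":                  # the event A_HH
        s_n = S0 * u ** w.count("H") * d ** w.count("T")
        total += s_n
        count += 1

cond_exp = total / count                     # = E[S_n I_A] / P[A] for A = A_HH
formula = S0 * u**2 * ((u + d) / 2) ** (n - 2)
print(cond_exp, formula)
```

Since all paths are equally likely, the conditional expectation reduces to a plain average over the paths in A_HH.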
Definition 2.36
Let X be an integrable random variable on (Ω, F, P) and G be a sub-σ-algebra of F. A conditional expectation of X given G is any random variable, denoted E[X|G], that satisfies
(i) (Measurability) E[X|G] is G-measurable,
(ii) (Partial averaging) E[X I_A] = E[E[X|G] I_A] for any A ∈ G.
In the example above, E[S_n | F_2] = S_2 ((u+d)/2)^{n−2}.
A conditional expectation always exists ([Shr04], Theorem B.1).
It is unique in the following sense. Let Y and Z be conditional expectations of X given G. Then A = {Y − Z > 0} ∈ G by (i), and so by (ii)
E[(Y − Z) I_A] = E[Y I_A] − E[Z I_A] = E[X I_A] − E[X I_A] = 0,
which implies P[Y − Z > 0] = 0. Reversing the roles of Y and Z, we obtain P[Z − Y > 0] = 0. Hence Y = Z a.s.
As for the expectation we have
Theorem 2.37
a) Linearity: For α, β ∈ R and r.v.s X, Y
E[αX + βY | G] = αE[X | G] + βE[Y | G]
b) Monotonicity: If X ≤ Y a.s., then E[X | G] ≤ E[Y | G].
c) Jensen's inequality: If φ is a convex function then
E[φ(X) | G] ≥ φ(E[X | G]).
Theorem 2.38
a) Taking out measurable factors: If X is G-measurable,
E[XY | G] = X E[Y | G].
b) Iterated conditioning: If H ⊆ G is a sub-σ-algebra, then
E[E[X | G] | H] = E[X | H].
c) Independence: Let X = (X_1, ..., X_d) be G-measurable and Y = (Y_1, ..., Y_e) be independent of G. Then for any function f
E[f(X, Y) | G] = g(X),
where g(x) := E[f(x, Y)] for all x ∈ R^d.
Special cases: If X is independent of G, then E[X | G] = E[X].
1) In the binomial model, Y_{t+1} is independent of F_t, so
E[Y_{t+1} − 1 | F_t] = E[Y_{t+1} − 1] = (u+d)/2 − 1.
2) Let M_t = max_{j=0,...,t} S_j. For t ≥ 0
E[M_{t+1} | F_t] = E[max(M_t, S_t Y_{t+1}) | F_t]
= E[max(m, s Y_{t+1})] |_{(m,s)=(M_t, S_t)}
= ((1/2) max(m, su) + (1/2) max(m, sd)) |_{(m,s)=(M_t, S_t)}
= (1/2) max(M_t, S_t u) + (1/2) max(M_t, S_t d)
3) Let X, Z be random variables. If G = σ(Z) we write E[X | σ(Z)] = E[X | Z]. In this case there exists a function f with
E[X | Z] = f(Z).
This follows from
Theorem 2.39
Let Y and Z be random variables. If Y is σ(Z)-measurable, then Y = f(Z) for some function f.
Example: Let X, Y be independent N(0, 1) distributed and Z = X² + Y². We want to compute E[|X| | Z] = f(Z). Since Z has density (1/2) e^{−z/2} (exponential with parameter 1/2), partial averaging gives for z ≥ 0
E[f(Z) I_{{Z ≤ z}}] = ∫_0^z f(u) (1/2) e^{−u/2} du.
The left hand side is
E[|X| I_{{X²+Y² ≤ z}}] = ∫∫ |x| I_{{x²+y² ≤ z}} (1/(2π)) e^{−(x²+y²)/2} dx dy
= ∫_0^{2π} ∫_0^∞ |r cos θ| I_{{r² ≤ z}} (1/(2π)) e^{−r²/2} r dr dθ
= 4 ∫_0^{√z} (1/(2π)) e^{−r²/2} r² dr.
Differentiating both sides w.r.t. z yields f(z) = (2/π) √z.
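A Monte Carlo sketch of this result: averaging |X| over samples whose Z falls in a thin bin around a fixed z should approximate f(z) = (2/π)√z (bin location and sample sizes are assumptions of the sketch).

```python
import math, random

# With X, Y independent N(0,1) and Z = X^2 + Y^2, the conditional expectation
# E[|X| | Z = z] should equal f(z) = (2/pi) * sqrt(z). Estimate it at z = 4
# by averaging |X| over samples with Z in a thin bin around 4.
random.seed(1)
z_lo, z_hi = 3.9, 4.1
hits = []
for _ in range(400_000):
    x, y = random.gauss(0, 1), random.gauss(0, 1)
    if z_lo < x * x + y * y < z_hi:
        hits.append(abs(x))

mc = sum(hits) / len(hits)
exact = (2 / math.pi) * math.sqrt(4.0)
print(round(mc, 3), round(exact, 3))
```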
Definition 2.40
a) A filtration on a space Ω is a family of σ-algebras (F_t)_{t=0,1,2,...,N} (discrete time) or (F_t)_{t∈[0,T]} (continuous time) such that F_s ⊆ F_t for all s ≤ t.
b) A stochastic process (X_t)_{t∈[0,T]} is a family of random variables X_t indexed with time t. We say that (X_t)_{t∈[0,T]} is adapted to the filtration (F_t)_{t∈[0,T]} if X_t is F_t-measurable for each t.
In discrete time, we replace [0, T] with {0, 1, ..., N}.
Intuition: F_t models the information available at time t.
A filtration models the flow of information.
Adapted processes are obtained as follows. For a given stochastic process (X_t), let F_t = σ(X_s | s ≤ t). This is called the filtration generated by the process (X_t). Clearly (X_t) is adapted to (F_t). We can then construct further adapted processes from (X_t).
Example: Let (F_t)_{t=0,1,2,...} be the filtration generated by the process (Y_t)_{t=0,1,2,...} in the binomial model. Then
(Y_t)_{t=0,1,2,...} is adapted;
(S_t)_{t=0,1,2,...} with S_t = S_0 Y_1 ⋯ Y_t is adapted;
any trading strategy of the form Δ_t = f_t(S_t) is an adapted process. Intuitively, this means that Δ_t can be determined with information available at time t. In particular, we do not look into the future to make the investment decision.
Definition 2.41
Let (F_t)_{t∈[0,T]} be a filtration on Ω and (X_t)_{t∈[0,T]} a stochastic process on (Ω, F, P). Suppose (X_t)_{t∈[0,T]} is adapted and all X_t are integrable.
(ii) The process (X_t)_{t∈[0,T]} is a martingale if
E[X_t | F_s] = X_s for all 0 ≤ s ≤ t ≤ T.
(iii) The process (X_t)_{t∈[0,T]} is a submartingale if
E[X_t | F_s] ≥ X_s for all 0 ≤ s ≤ t ≤ T.
(iv) The process (X_t)_{t∈[0,T]} is a supermartingale if
E[X_t | F_s] ≤ X_s for all 0 ≤ s ≤ t ≤ T.
In the same way we define (sub-/super-) martingales in discrete time for a discrete time filtration (F_t)_{t=0,1,2,...,N}.
Example 2.42
Define a probability measure P̃ on the coin toss space (Ω_N, F) via
P̃[ω] = p̃^{#H(ω)} (1 − p̃)^{#T(ω)},  ω ∈ Ω_N,
for some p̃ ∈ (0, 1). Define Y_1, ..., Y_N as in Example 2.21; they are independent under P̃ with P̃[Y_i = u] = p̃ and P̃[Y_i = d] = 1 − p̃.
Let B_t = (1 + r)^t and S_t = S_0 Π_{i=1}^t Y_i for t = 0, ..., N be the bond and stock price process in the binomial model, and (F_t)_{t=0,1,...,N} be the filtration generated by the process Y_1, ..., Y_N.
The process (S_t/B_t)_{t=0,1,...,N} is adapted to (F_t)_{t=0,1,...,N} since S_t/B_t is a function of Y_1, ..., Y_t which are F_t-measurable.
Write Ẽ for the expectation under P̃; then for 0 ≤ s ≤ t ≤ N
Ẽ[S_t/B_t | F_s] = Ẽ[S_0 Π_{i=1}^t Y_i / (1+r)^t | F_s]
= (S_0 Π_{i=1}^s Y_i / (1+r)^s) Ẽ[Π_{i=s+1}^t Y_i / (1+r)^{t−s} | F_s]
= (S_s/B_s) Ẽ[Π_{i=s+1}^t Y_i / (1+r)^{t−s}]
= (S_s/B_s) ((p̃u + (1 − p̃)d)/(1+r))^{t−s}.
So (S_t/B_t)_{t=0,1,2,...,N} is a martingale on (Ω_N, F, P̃) if
p̃u + (1 − p̃)d = 1 + r  ⟺  p̃ = (1+r−d)/(u−d)  (2.1)
(a submartingale if p̃ > (1+r−d)/(u−d) and a supermartingale if p̃ < (1+r−d)/(u−d)).
Note:
Suppose (S_t/B_t)_{t=0,1,2,...,N} is a martingale under P̃. We must have p̃ ∈ [0, 1], which is satisfied if d < 1 + r < u.
This condition is equivalent to absence of arbitrage, see below.
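A one-line numeric check of the martingale condition (2.1), with illustrative parameter values: under p̃ = (1+r−d)/(u−d), one discounted step leaves the stock price unchanged in expectation.

```python
# Check (2.1): with p~ = (1+r-d)/(u-d), the one-step conditional expectation
# of the discounted stock price equals its current value.
u, d, r, s = 1.2, 0.8, 0.05, 100.0   # illustrative parameters, S_t = s
p = (1 + r - d) / (u - d)

one_step = (p * s * u + (1 - p) * s * d) / (1 + r)   # E~[S_{t+1}/(1+r) | S_t = s]
print(one_step)   # equals s
```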
Example 2.43
Given initial wealth X_0, let (X_t)_{t=0,1,2,...,N} be the value process
X_t/B_t = X_0/B_0 + Σ_{n=0}^{t−1} Δ_n (S_{n+1}/B_{n+1} − S_n/B_n)
of a sfts (see (1.5)), where Δ_n is F_n-measurable for each time n (investment decisions based on available information).
If (S_t/B_t)_{t=0,1,2,...,N} is a martingale, then so is (X_t/B_t)_{t=0,1,2,...,N}:
it is adapted,
X_t/B_t = X_s/B_s + Σ_{n=s}^{t−1} Δ_n (S_{n+1}/B_{n+1} − S_n/B_n), and for n ≥ s
Ẽ[Δ_n (S_{n+1}/B_{n+1} − S_n/B_n) | F_s]
= Ẽ[Ẽ[Δ_n (S_{n+1}/B_{n+1} − S_n/B_n) | F_n] | F_s]
= Ẽ[Δ_n Ẽ[S_{n+1}/B_{n+1} − S_n/B_n | F_n] | F_s] = 0.
Pricing derivatives with martingales
Consider the binomial model with 0 < d < 1 + r < u.
By Theorem 1.2, for any derivative with payoff h(S_N) at time N there exists a sfts with value process h_t(S_t), t = 0, ..., N (fair price at t) and h_N(S_N) = h(S_N).
Example 2.42: (S_t/B_t)_{t=0,1,2,...,N} is a martingale under P̃ with p̃ = (1+r−d)/(u−d). By Example 2.43 the discounted value process is then also a martingale, hence
h_t(S_t)/B_t = Ẽ[h(S_N)/B_N | F_t] for all t = 0, ..., N.  (2.2)
We compute
h_t(S_t) = (B_t/B_N) Ẽ[h(S_N) | F_t]
= (1/(1+r)^{N−t}) Ẽ[h(S_t Π_{i=t+1}^N Y_i) | F_t]
= (1/(1+r)^{N−t}) Ẽ[h(S Π_{i=t+1}^N Y_i)] |_{S=S_t}
= (1/(1+r)^{N−t}) Ẽ[h(S u^{#H} d^{N−t−#H})] |_{S=S_t}
= (1/(1+r)^{N−t}) Σ_{k=0}^{N−t} (N−t choose k) p̃^k (1 − p̃)^{N−t−k} h(S_t u^k d^{N−t−k}).
(Here #H denotes the number of heads from t + 1 to N.)
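The closed-form sum above and the backward induction of Theorem 1.2 should give the same time-0 price. A minimal sketch with assumed parameters:

```python
from math import comb

# Price a call h(s) = max(s - K, 0) at t = 0 in two ways: the risk-neutral
# binomial sum derived above, and the backward induction (1.9) of Theorem 1.2.
u, d, r, S0, N, K = 1.2, 0.8, 0.05, 100.0, 4, 100.0
p = (1 + r - d) / (u - d)
h = lambda s: max(s - K, 0.0)

# risk-neutral sum over the number of heads k in N steps
by_sum = sum(comb(N, k) * p**k * (1 - p) ** (N - k) * h(S0 * u**k * d ** (N - k))
             for k in range(N + 1)) / (1 + r) ** N

# backward induction on the recombining tree
v = [h(S0 * u**k * d ** (N - k)) for k in range(N + 1)]
for t in range(N - 1, -1, -1):
    v = [(p * v[k + 1] + (1 - p) * v[k]) / (1 + r) for k in range(t + 1)]
by_induction = v[0]

print(round(by_sum, 6), round(by_induction, 6))
```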
Major goal of the course:
Develop a martingale pricing theory for continuous time models
Change of measure
Two probability measures on (Ω_N, F):
a) P[ω] = p^{#H(ω)} (1 − p)^{#T(ω)}
b) P̃[ω] = p̃^{#H(ω)} (1 − p̃)^{#T(ω)} with p̃ = (1+r−d)/(u−d)
Under P the expected returns of stock and bond are
E[(S_{t+1} − S_t)/S_t | F_t] = pu + (1 − p)d − 1,  (B_{t+1} − B_t)/B_t = r
Usually E[(S_{t+1} − S_t)/S_t | F_t] > (B_{t+1} − B_t)/B_t because of risk aversion.
Under P̃ the expected return of the stock is
Ẽ[(S_{t+1} − S_t)/S_t | F_t] = p̃u + (1 − p̃)d − 1 = r = (B_{t+1} − B_t)/B_t.
a) A trading strategy holds θ_t bonds and Δ_t^i shares of stock i at time t; its value is
X_t = θ_t B_t + Σ_{i=1}^m Δ_t^i S_t^i.  (2.3)
We demand that θ_t and Δ_t^i are bounded and F_t-measurable.
b) The strategy is self-financing if for any t = 0, 1, ..., N − 1
X_{t+1} = θ_{t+1} B_{t+1} + Σ_{i=1}^m Δ_{t+1}^i S_{t+1}^i = θ_t B_{t+1} + Σ_{i=1}^m Δ_t^i S_{t+1}^i.  (2.4)
The value process X_t of a sfts is computed as follows:
X_{t+1}/B_{t+1} − X_t/B_t = (θ_t B_{t+1} + Σ_{i=1}^m Δ_t^i S_{t+1}^i)/B_{t+1} − (θ_t B_t + Σ_{i=1}^m Δ_t^i S_t^i)/B_t
= Σ_{i=1}^m Δ_t^i (S_{t+1}^i/B_{t+1} − S_t^i/B_t)  (2.5)
X_n/B_n = X_k/B_k + Σ_{t=k}^{n−1} Σ_{i=1}^m Δ_t^i (S_{t+1}^i/B_{t+1} − S_t^i/B_t)  (2.6)
for any 0 ≤ k < n ≤ N.
Definition 2.46
An arbitrage opportunity is a sfts whose value process satisfies
X_0 = 0, P[X_N < 0] = 0, and P[X_N > 0] > 0.
A meaningful market model should be free of arbitrage opportunities.
Theorem 2.47 (FTAP in discrete time)
There is no arbitrage opportunity in the market B, S^1, ..., S^m if, and only if, there exists a measure P̃ equivalent to P on (Ω, F) such that all discounted price processes (S_t^i/B_t)_{t=0,1,...,N} for i = 1, ..., m are martingales under P̃.
Proof of "if" part: Suppose there exists P̃ as above. Then (2.6) implies that the value process X_t/B_t of any sfts is a P̃-martingale. Now if we had an arbitrage opportunity, its value process would satisfy
Ẽ[X_N/B_N] = X_0/B_0 = 0  (2.7)
and P[X_N < 0] = 0, thus also P̃[X_N < 0] = 0. So (2.7) implies P̃[X_N > 0] = 0, and thus also P[X_N > 0] = 0, a contradiction.
The proof of the "only if" part requires tools from functional analysis beyond the scope of this course, see e.g. Föllmer and Schied, Stochastic Finance, Theorem 5.17. A measure P̃ as in the FTAP is called a martingale measure.
Remarks. 1) The FTAP requires very few assumptions on the market model.
2) In the binomial model, we have identified the risk-neutral measure P̃ as a martingale measure in Example 2.42. One can show that it is the only martingale measure in this model. In this case, derivative prices are determined by arbitrage considerations.
3) In more general arbitrage-free models, there can be more than one martingale measure. In this situation, absence of arbitrage alone does not uniquely determine derivative prices.
The FTAP is a key result for derivative pricing:
Corollary 2.48
Take an arbitrage-free market B, S^1, ..., S^m as above. Suppose we add a derivative with price process V_t to the market. The resulting market B, S^1, ..., S^m, V is arbitrage-free if and only if
V_t = B_t Ẽ[V_N/B_N | F_t],  t = 0, ..., N  (2.8)
where P̃ is a martingale measure for the market B, S^1, ..., S^m.
Conditional distributions
For a random variable X on (Ω, F, P), the distribution μ_X of X is
μ_X(B) := P[X ∈ B], B ∈ B(R).
Now let G ⊆ F be a sub-σ-algebra. The conditional distribution μ_{X|G} of X given G is
μ_{X|G}(B) := P[X ∈ B | G], B ∈ B(R),
where the conditional probability is P[X ∈ B | G] = E[I_{{X ∈ B}} | G].
Note that P[X ∈ B | G] is a random variable, i.e., depends on ω.
One checks that for each ω, the function
B ↦ P[X ∈ B | G](ω)
is a probability measure on (R, B(R)) (a random measure).
In many situations we can write X as a function of a G-measurable and a G-independent part, and use Theorem 2.38 c) to compute its conditional distribution given G.
Example 2.49
Let Y, Z be independent N(0, 1) distributed random variables on (Ω, F, P) and X = Y + Z, so X has N(0, 2) distribution.
Let G = σ(Y); then Y is G-measurable and Z is independent of G.
Hence for B ∈ B(R)
P[X ∈ B | G](ω) = E[I_{{Y+Z ∈ B}} | G](ω) = E[I_{{y+Z ∈ B}}] |_{y=Y(ω)} = P[y + Z ∈ B] |_{y=Y(ω)}.
Since y + Z ~ N(y, 1) for y ∈ R, it follows that the conditional distribution of X given G is N(Y(ω), 1).
3. Brownian motion
Main goals
Let (Ω_∞, F_∞, P) be the infinite coin-toss space with p = 1/2, set X_j(ω) = 1 if ω_j = H and X_j(ω) = −1 if ω_j = T, and let
M_k = Σ_{j=1}^k X_j, k = 1, 2, ...
(the symmetric random walk). Note
#H(ω_1...ω_k) = Σ_{j=1}^k (1 + X_j(ω))/2 = (1/2)(k + M_k(ω)),
#T(ω_1...ω_k) = Σ_{j=1}^k (1 − X_j(ω))/2 = (1/2)(k − M_k(ω)),
so
S_k = S_0 u^{#H} d^{#T} = S_0 u^{(k+M_k)/2} d^{(k−M_k)/2}.  (3.1)
Jump sizes u = u_n, d = d_n depending on the length 1/n of the time step:
S^(n)(t) = S_{nt} = S_0 u_n^{(nt+M_{nt})/2} d_n^{(nt−M_{nt})/2}
= S_0 e^{(1/2)(log u_n − log d_n) M_{nt} + (1/2)(log u_n + log d_n) nt}  (3.2)
Define the scaled symmetric random walk
W^(n)(t) = (1/√n) M_{nt}  (3.3)
for⁴ nt ∈ N. Then we have for each such n
Var[W^(n)(t)] = t.
⁴ If k < nt < k + 1 for k ∈ N then W^(n)(t) is given by linear interpolation between W^(n)(k/n) and W^(n)((k+1)/n).
We obtain
S^(n)(t) = S_0 e^{(1/2)√n (log u_n − log d_n) W^(n)(t) + (1/2)(log u_n + log d_n) nt}  (3.4)
Now suppose we know that W^(n)(t) → W(t) for n → ∞. (We can hope that we get convergence since Var[W^(n)(t)] = t for all n.) Suppose moreover
(1/2)√n (log u_n − log d_n) → σ and (1/2)(log u_n + log d_n) n → c
for n → ∞. In this case we obtain
S(t) = S_0 e^{σW(t) + ct}
Continuous time limit of the binomial model
Define
u_n = 1 + σ/√n + μ/n,  d_n = 1 − σ/√n + μ/n  (3.5)
with σ > 0, μ ∈ R. Using log(1 + x) = x − (1/2)x² + O(x³) we find
(1/2)(log u_n − log d_n) = σ/√n + O(n^{−3/2}),
(1/2)(log u_n + log d_n) = (1/n)(μ − (1/2)σ²) + O(n^{−3/2}).
So from (3.4)
log S^(n)(t)
= log S_0 + (1/2)√n (log u_n − log d_n) W^(n)(t) + (1/2)(log u_n + log d_n) nt
= log S_0 + σ W^(n)(t) + O(n^{−1}) W^(n)(t) + (μ − (1/2)σ²) t + O(n^{−1/2})
→ log S_0 + σ W(t) + (μ − (1/2)σ²) t.
Hence the random variables S^(n)(t) with u_n, d_n in (3.5) converge to
S(t) = S_0 e^{σW(t) + (μ − (1/2)σ²) t}  (3.6)
in distr. for n → ∞ (geometric Brownian motion, see below).
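The convergence (3.6) can be probed by simulation (parameter values assumed for illustration): for large n, log S^(n)(t) should be approximately normal with mean log S_0 + (μ − σ²/2)t and variance σ²t.

```python
import math, random

# Simulation sketch of the limit (3.6): simulate the n-step binomial model
# with u_n, d_n from (3.5) and compare the sample mean and variance of
# log S^(n)(t) with (mu - sigma^2/2)*t and sigma^2*t.
random.seed(2)
sigma, mu, S0, t, n = 0.3, 0.08, 1.0, 1.0, 400
u = 1 + sigma / math.sqrt(n) + mu / n
d = 1 - sigma / math.sqrt(n) + mu / n

logs = []
for _ in range(10_000):
    heads = sum(random.randint(0, 1) for _ in range(int(n * t)))
    logs.append(math.log(S0) + heads * math.log(u) + (n * t - heads) * math.log(d))

m = sum(logs) / len(logs)
v = sum((x - m) ** 2 for x in logs) / len(logs)
print(round(m, 3), round(v, 3))   # compare with (mu - sigma^2/2)*t, sigma^2*t
```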
Convergence of the scaled random walk
Theorem 3.1
Let W^(n)(t), t ≥ 0 be a scaled symmetric random walk, and 0 = t_0 < t_1 < ... < t_m such that each nt_j ∈ N.
a) The increments
W^(n)(t_1) − W^(n)(t_0), W^(n)(t_2) − W^(n)(t_1), ..., W^(n)(t_m) − W^(n)(t_{m−1})
are independent.
b) The increments satisfy
E[W^(n)(t_j) − W^(n)(t_{j−1})] = 0,
Var[W^(n)(t_j) − W^(n)(t_{j−1})] = t_j − t_{j−1}.
Proof: For a) use Theorem 2.20, for b) use Corollary 2.27.
Theorem 3.2 (Central limit theorem)
Let Y_1, Y_2, ... be independent identically distributed (iid) random variables with E[Y_k] = μ and Var[Y_k] = σ² < ∞ for all k, and
S_n = (1/(σ√n)) Σ_{k=1}^n (Y_k − μ).
Then the S_n converge to a standard normal random variable in distribution, i.e.
lim_{n→∞} P[S_n ≤ x] = Φ(x) for all x ∈ R.
Proof: See [Dur95] Theorem 2.4.1 or [Res99] Theorem 9.7.1.
For nt_{j−1}, nt_j ∈ N the increment of W^(n) satisfies
(W^(n)(t_j) − W^(n)(t_{j−1})) / √(t_j − t_{j−1})
= (1/√(n(t_j − t_{j−1}))) (M_{nt_j} − M_{nt_{j−1}})
= (1/√(nt_j − nt_{j−1})) Σ_{k=1}^{nt_j − nt_{j−1}} X_{nt_{j−1}+k}
→ Z ~ N(0, 1)  (n → ∞)
in distribution by the central limit theorem.
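A small simulation of this CLT statement: a normalized random-walk increment is a sum of independent ±1 steps divided by the square root of their number, and should have mean ≈ 0 and variance ≈ 1 (step and sample counts are assumptions of the sketch).

```python
import math, random

# CLT sketch: normalized increments of the symmetric random walk should be
# approximately N(0, 1).
random.seed(3)
steps, trials = 500, 4000
samples = []
for _ in range(trials):
    m = sum(2 * random.randint(0, 1) - 1 for _ in range(steps))  # +/-1 steps
    samples.append(m / math.sqrt(steps))

mean = sum(samples) / len(samples)
var = sum(x * x for x in samples) / len(samples)
print(round(mean, 3), round(var, 3))   # near 0 and 1
```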
Brownian motion
Brownian motion is obtained as the limit of W^(n)(t) for n → ∞.
Definition 3.3
A stochastic process W(t), t ≥ 0 on a space (Ω, F, P) is a Brownian motion (BM) if it satisfies
(i) W(0) = 0 a.s.
(ii) t ↦ W(t) is continuous a.s.
(iii) For all times 0 = t_0 < t_1 < ... < t_m, the increments
W(t_1) = W(t_1) − W(t_0), W(t_2) − W(t_1), ..., W(t_m) − W(t_{m−1})
are independent, and W(t_j) − W(t_{j−1}) ~ N(0, t_j − t_{j−1}) for all j = 1, ..., m.
Distribution of Brownian motion
For 0 = t_0 < t_1 < ... < t_m
(W(t_1) − W(t_0), ..., W(t_m) − W(t_{m−1})) = (√(t_1 − t_0) Z_1, ..., √(t_m − t_{m−1}) Z_m)
with independent standard normal Z_1, ..., Z_m by (iii).
So (W(t_1), ..., W(t_m)) has multivariate normal distribution (as a linear combination of a multivariate normal).
For all times 0 ≤ s < t
E[W(s)] = E[W(t)] = 0,
Cov[W(s), W(t)] = E[W(s)W(t)] = E[W(s)(W(t) − W(s))] + E[W(s)²] = s = min(s, t).
Filtration of Brownian motion
Let W(t), t ≥ 0 be a BM on (Ω, F, P). Let (F^W_t)_{t≥0} be the filtration generated by the BM.
Reminder: F^W_t = σ(W(s) | 0 ≤ s ≤ t) is the smallest σ-algebra such that all W(s) with s ≤ t are F^W_t-measurable.
Important properties:
a) F^W_t contains all events which can be expressed in terms of the values of W(s) for 0 ≤ s ≤ t.
b) W(t), t ≥ 0 is adapted to the filtration (F^W_t)_{t≥0}.
c) W(u) − W(t) is independent of F^W_t for 0 ≤ t < u (this follows from Definition 3.3 (iii)).
Brownian motion for a general filtration
Sometimes one wants a BM on a space which is already equipped with a filtration. Then the following characterization of BM is used.
Definition 3.4
Let W(t), t ≥ 0 be a stochastic process on a space (Ω, F, P) which is adapted to a filtration (F_t)_{t≥0}. The process is a Brownian motion w.r.t. (F_t)_{t≥0} if it satisfies
(i) W(0) = 0 a.s.
(ii) t ↦ W(t) is continuous a.s.
(iii) For all times 0 ≤ t < u, the increment W(u) − W(t) is independent of F_t and has N(0, u − t) distribution.
The two definitions are equivalent: a BM in the sense of Definition 3.3 is a BM w.r.t. the filtration (F^W_t)_{t≥0} it generates. In particular, a Brownian motion is a martingale: for 0 ≤ s < t,
E[W(t) | F_s] = E[W(t) − W(s) + W(s) | F_s]
= E[W(t) − W(s) | F_s] + E[W(s) | F_s]
= E[W(t) − W(s)] + W(s) = W(s).
Geometric Brownian motion
S(t) = S_0 e^{σW(t) + (μ − (1/2)σ²) t}
(i) S(t) is continuous in t, a.s.
(ii) For 0 ≤ s < t,
S(t) = S_0 e^{σW(s) + (μ − (1/2)σ²) s} e^{σ(W(t) − W(s)) + (μ − (1/2)σ²)(t−s)} = S(s) e^{σ(W(t) − W(s)) + (μ − (1/2)σ²)(t−s)}.  (3.7)
So the logarithmic return log(S(t)/S(s)) is independent of F_s (by property (iii) of BM).
(iii) For 0 ≤ s < t, the logarithmic return log(S(t)/S(s)) has a normal distribution. S(t) has a lognormal distribution.
Geometric BM is used as a model for the stock price in the Black-Scholes model (see Chapter 4).
Theorem 3.6 (Exponential martingale)
Let W(t), t ≥ 0 be a Brownian motion with a filtration (F_t), t ≥ 0. Then
S(t) = S_0 e^{σW(t) + (μ − (1/2)σ²) t}
is a martingale for μ = 0, and a submartingale (supermartingale) if μ > 0 (μ < 0).
Proof: S(t) is F_t-measurable as a function of W(t), and integrable since W(t) has exponential moments. Take the case μ = 0. For 0 ≤ s < t, by (3.7)
E[S(t) | F_s] = E[S(s) e^{σ(W(t) − W(s)) − (1/2)σ²(t−s)} | F_s] = S(s) e^{−(1/2)σ²(t−s)} E[e^{σ(W(t) − W(s))}].
In the last step we used Theorem 2.38 a) and c).
W(t) − W(s) ~ N(0, t − s) implies E[e^{σ(W(t) − W(s))}] = e^{(1/2)σ²(t−s)}, so E[S(t) | F_s] = S(s).
The cases μ ≠ 0 are an exercise.
Markov property of BM
Definition 3.7
Let $\bigl(X(t)\bigr)_{t \in [0,T]}$ be a stochastic process adapted to a filtration $\bigl(\mathcal{F}_t\bigr)_{t \in [0,T]}$. Assume that for all $0 \le s \le t \le T$ and every nonnegative function $f$, there is another function$^5$ $g$ such that
$E\bigl[f\bigl(X(t)\bigr) \mid \mathcal{F}_s\bigr] = g\bigl(X(s)\bigr).$
Then we say that $\bigl(X(t)\bigr)_{t \in [0,T]}$ is a Markov process.
Theorem 3.8
Let $W(t)$, $t \ge 0$ be BM and $\mathcal{F}_t$, $t \ge 0$ the filtration generated by the BM. Then $W(t)$, $t \ge 0$ is a Markov process.
$^5$ To be precise, $f$ and $g$ are assumed to be Borel-measurable.
Proof: From Theorem 2.38 c)
$E\bigl[f\bigl(W(t)\bigr) \mid \mathcal{F}_s\bigr] = E\bigl[f\bigl(W(t) - W(s) + W(s)\bigr) \mid \mathcal{F}_s\bigr] = E\bigl[f\bigl(W(t) - W(s) + x\bigr)\bigr]\Big|_{x = W(s)}$
$= \Bigl[\int \frac{1}{\sqrt{2\pi(t-s)}}\, e^{-\frac{z^2}{2(t-s)}}\, f(z + x)\,dz\Bigr]_{x = W(s)} = \Bigl[\int \frac{1}{\sqrt{2\pi(t-s)}}\, e^{-\frac{(y-x)^2}{2(t-s)}}\, f(y)\,dy\Bigr]_{x = W(s)}$
$= \int p\bigl(t-s, W(s), y\bigr)\, f(y)\,dy \quad (3.8)$
where $p(\tau, x, y) = \frac{1}{\sqrt{2\pi\tau}}\, e^{-\frac{(y-x)^2}{2\tau}}$.
$p(\tau, x, y)$ is the transition density of BM. (3.8) says that the conditional distribution of $W(t)$ given $\mathcal{F}_s$ is $N\bigl(W(s), t-s\bigr)$.
The fact that this distribution only depends on $W(s)$ (instead of all information in $\mathcal{F}_s$) is the essence of the Markov property.
Quadratic variation
Motivation: Let $f(t)$ be a function. We seek a measure of variation (total up and down oscillation) of $f(t)$ on $[0, T]$.
Definition 3.9
For a partition $\Pi = \{t_0, t_1, \ldots, t_n\}$ of $[0, T]$, i.e., times with $0 = t_0 < t_1 < \ldots < t_n = T$, set $|\Pi| = \max_{j=0,\ldots,n-1}(t_{j+1} - t_j)$.
The (first-order) variation of $f$ on $[0, T]$ is
$FV_f(T) = \lim_{|\Pi| \to 0} \sum_{j=0}^{n-1} |f(t_{j+1}) - f(t_j)|.$
If $f(t)$ is differentiable then
$FV_f(T) = \int_0^T |f'(t)|\,dt. \quad (3.9)$
If $f(t)$ is increasing then
$FV_f(T) = f(T) - f(0).$
Definition 3.10
For a partition $\Pi = \{t_0, t_1, \ldots, t_n\}$ of $[0, T]$, i.e., times with $0 = t_0 < t_1 < \ldots < t_n = T$, set $|\Pi| = \max_{j=0,\ldots,n-1}(t_{j+1} - t_j)$.
The quadratic variation of $f$ on $[0, T]$ is$^6$
$[f, f](T) = \lim_{|\Pi| \to 0} \sum_{j=0}^{n-1} |f(t_{j+1}) - f(t_j)|^2.$
The covariation of two functions $f$ and $g$ on $[0, T]$ is
$[f, g](T) = \lim_{|\Pi| \to 0} \sum_{j=0}^{n-1} \bigl(f(t_{j+1}) - f(t_j)\bigr)\bigl(g(t_{j+1}) - g(t_j)\bigr).$
$^6$ In both Definitions 3.9 and 3.10 one can show that the limits do not depend on the choice of the partitions $\Pi$. We omit the proof of this fact here.
Basic properties. 1) If $f(t)$ is continuous and $g(t)$ has finite first-order variation (e.g. if $g(t)$ has an integrable derivative) then $[f, g](T) = 0$ for all $T$.
2) Thus if $f(t)$ is continuous and has finite first-order variation, $[f, f](T) = 0$ for all $T$.
3) Covariation is bilinear and symmetric. In particular,
$[f + g, f + g](T) = [f, f](T) + [g, g](T) + 2[f, g](T).$
4) If $f_1, f_2$ are continuous and $g_1, g_2$ are continuous and have finite first-order variation, then
$[f_1 + g_1, f_2 + g_2] = [f_1, f_2].$
5) The function $[f, f](t)$ is increasing in $t$, and thus has finite first-order variation.
Quadratic variation of Brownian motion
Theorem 3.11
Let $W$ be a BM. Then $[W, W](T) = T$ for all $T \ge 0$ a.s.
It follows that BM paths $W(t)$, $t \ge 0$, do not have finite first-order variation. $W(t)$, $t \ge 0$, is not differentiable (a.s.).
Intuition: BM accumulates quadratic variation at rate one per unit time.
Note: By definition,
$[W, W](T)(\omega) = \lim_{|\Pi| \to 0} \sum_{j=0}^{n-1} \bigl(W(t_{j+1})(\omega) - W(t_j)(\omega)\bigr)^2.$
Theorem 3.11 says that the random variables $\sum_{j=0}^{n-1} |W(t_{j+1}) - W(t_j)|^2$ converge to $T$ when $|\Pi| \to 0$.
Proof: Take any partition $\Pi = \{t_0, t_1, \ldots, t_n\}$ of $[0, T]$. We show
$\sum_{j=0}^{n-1} |W(t_{j+1}) - W(t_j)|^2 \to T$
in $L^2$ when $|\Pi| \to 0$. Indeed,
$E\Bigl[\sum_{j=0}^{n-1} |W(t_{j+1}) - W(t_j)|^2\Bigr] = \sum_{j=0}^{n-1} (t_{j+1} - t_j) = T$
and
$\mathrm{Var}\Bigl[\sum_{j=0}^{n-1} |W(t_{j+1}) - W(t_j)|^2\Bigr] = \sum_{j=0}^{n-1} \mathrm{Var}\bigl[|W(t_{j+1}) - W(t_j)|^2\bigr]$
$= \sum_{j=0}^{n-1} \Bigl(E\bigl[|W(t_{j+1}) - W(t_j)|^4\bigr] - E\bigl[|W(t_{j+1}) - W(t_j)|^2\bigr]^2\Bigr)$
$= \sum_{j=0}^{n-1} \bigl(3(t_{j+1} - t_j)^2 - (t_{j+1} - t_j)^2\bigr) = 2\sum_{j=0}^{n-1} |t_{j+1} - t_j|^2$
$\le 2\Bigl(\max_{j=0,\ldots,n-1} |t_{j+1} - t_j|\Bigr) \sum_{j=0}^{n-1} |t_{j+1} - t_j| = 2|\Pi|\,T \to 0.$
Notations:
a) $[W, W](T) = T$
b) $d[W, W](t) = dt$
c) $dW(t)\,dW(t) = dt$
Notation c) is delicate since $dW(t)$ alone is not well-defined.
To interpret c), take the partition of $[0, T]$ defined by $t_j = \frac{j}{n}T$ for $j = 0, \ldots, n$, so $\Delta t = t_{j+1} - t_j = \frac{T}{n}$. Then
$\bigl(W(t_{j+1}) - W(t_j)\bigr)^2 = T\,\frac{Y_{j+1}^2}{n} = Y_{j+1}^2\,\Delta t \quad (3.10)$
with $Y_{j+1} \sim N(0, 1)$, hence $E[Y_{j+1}^2] = 1$. So c) is a differential version of (3.10).
By the Law of Large Numbers $\sum_{j=0}^{n-1} T\,\frac{Y_{j+1}^2}{n} \to T\,E[Y_{j+1}^2] = T$ for $n \to \infty$, and so we find again
$\lim_{n \to \infty} \sum_{j=0}^{n-1} |W(t_{j+1}) - W(t_j)|^2 = T.$
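Theorem 3.11 is easy to observe numerically: the sampled quadratic variation of a simulated BM path concentrates around $T$ as the mesh shrinks. A minimal sketch (step count and seed are arbitrary choices):

```python
import math
import random

random.seed(0)

def bm_quadratic_variation(T, n):
    """Sum of squared increments of one simulated BM path on [0, T]
    over the uniform partition with n steps."""
    dt = T / n
    qv = 0.0
    for _ in range(n):
        dW = random.gauss(0.0, math.sqrt(dt))  # W(t_{j+1}) - W(t_j) ~ N(0, dt)
        qv += dW * dW
    return qv

T = 2.0
qv = bm_quadratic_variation(T, n=200_000)  # should be close to T = 2
```

The standard deviation of the result is of order $\sqrt{2|\Pi|T}$, consistent with the variance bound in the proof.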
Application: Volatility of geometric BM
Let $S(t)$, $t \ge 0$, a price process, and $\Pi = \{t_0, t_1, \ldots, t_n\}$ a partition of $[0, T]$. The (annualized) historical variance of the asset along the partition is
$\mathrm{Var}_{hist}[0, T] = \frac{1}{T} \sum_{j=0}^{n-1} \Bigl(\log \frac{S(t_{j+1})}{S(t_j)}\Bigr)^2.$
Assume now $S(t)$ is a geometric BM. Then for small $|\Pi|$,
$\mathrm{Var}_{hist}[0, T] = \frac{1}{T} \sum_{j=0}^{n-1} \Bigl(\bigl(\sigma W(t_{j+1}) + (\mu - \tfrac{1}{2}\sigma^2)t_{j+1}\bigr) - \bigl(\sigma W(t_j) + (\mu - \tfrac{1}{2}\sigma^2)t_j\bigr)\Bigr)^2$
$\approx \frac{1}{T} \bigl[\sigma W(\cdot) + (\mu - \tfrac{1}{2}\sigma^2)\,\cdot\;,\; \sigma W(\cdot) + (\mu - \tfrac{1}{2}\sigma^2)\,\cdot\,\bigr](T)$
$= \frac{1}{T}\,[\sigma W, \sigma W](T) = \frac{1}{T}\,\sigma^2\,[W, W](T) = \sigma^2.$
The (annualized) historical volatility of the asset during $[0, T]$ is defined as
$\sigma_{hist}[0, T] = \sqrt{\mathrm{Var}_{hist}[0, T]}.$
So if $S(t)$ is a geometric BM, then for small $|\Pi|$
$\sigma_{hist}[0, T] \approx \sigma \quad (3.11)$
for every $T$.
Let $S(t)$ be a stochastic process adapted to a filtration $(\mathcal{F}_t)_{t \ge 0}$, and $\Delta > 0$ a small time increment. The (annualized) volatility at $t$ is the standard deviation of $\frac{1}{\sqrt{\Delta}} \log \frac{S(t+\Delta)}{S(t)}$ conditional on $\mathcal{F}_t$.
Assume again $S(t)$ is a geometric BM. Then
$\frac{1}{\sqrt{\Delta}} \log \frac{S(t+\Delta)}{S(t)} = \frac{\sigma\bigl(W(t+\Delta) - W(t)\bigr)}{\sqrt{\Delta}} + \bigl(\mu - \tfrac{1}{2}\sigma^2\bigr)\sqrt{\Delta} \;\sim\; N\Bigl(\bigl(\mu - \tfrac{1}{2}\sigma^2\bigr)\sqrt{\Delta},\; \sigma^2\Bigr),$
so the volatility is equal to $\sigma$.
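Relation (3.11) can be checked on simulated data: for a geometric BM sampled on a fine grid, the annualized historical volatility recovers the model $\sigma$. A sketch with illustrative parameters (not from the text):

```python
import math
import random

random.seed(2)

mu, sigma = 0.10, 0.30
T, n = 1.0, 100_000
dt = T / n

# accumulate squared log returns of an exactly simulated geometric BM, cf. (3.7)
sum_sq = 0.0
for _ in range(n):
    dlog = sigma * random.gauss(0.0, math.sqrt(dt)) + (mu - 0.5 * sigma**2) * dt
    sum_sq += dlog * dlog

sigma_hist = math.sqrt(sum_sq / T)  # should be close to sigma = 0.30
```

Note that the drift $\mu$ contributes only terms of order $\Delta t$ to each squared log return, so it barely affects the estimate; this mirrors why the quadratic variation argument ignores the finite-variation part.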
$\mu_n(B) = P\bigl[W^{(n)}(\cdot) \in B\bigr], \quad B \in \mathcal{S},$
where $W^{(n)}(t)$ is the scaled symmetric random walk in (3.3).
Theorem 3.12 (Donsker's theorem)
There exists a unique probability measure $\mu$ on $(S, \mathcal{S})$ such that
$\mu_n \to \mu \text{ weakly } (n \to \infty),$
and the stochastic process $W_t$, $t \in [0, T]$, is a BM on $(S, \mathcal{S}, \mu)$.
Proof: See [Dur95] Section 7.6.
A consequence of Donsker's theorem is the following:
Corollary 3.13
Let $W^{(n)}$ be the scaled symmetric random walk and $W$ be a BM. Then for any times $0 < t_1 < \ldots < t_m$ and any bounded continuous function $h: \mathbb{R}^m \to \mathbb{R}$ we have
$\lim_{n \to \infty} E\bigl[h\bigl(W^{(n)}(t_1), \ldots, W^{(n)}(t_m)\bigr)\bigr] = E\bigl[h\bigl(W(t_1), \ldots, W(t_m)\bigr)\bigr].$
This result implies that when the number $n$ of time steps per year goes to infinity, derivative prices in the binomial model converge to derivative prices in the Black-Scholes model (see Chapter 5).
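A one-marginal illustration of Corollary 3.13: the distribution of the scaled symmetric random walk at time 1, $W^{(n)}(1) = \frac{1}{\sqrt{n}}\sum_{k=1}^n X_k$ with i.i.d. $\pm 1$ tosses $X_k$, is close to $N(0, 1)$ for large $n$. The sketch below (arbitrary step and sample counts) compares an empirical probability with the normal cdf value $\Phi(0.5)$:

```python
import math
import random

random.seed(3)

def scaled_walk_at_1(n):
    """W^(n)(1) = (1/sqrt(n)) * (sum of n independent +/-1 coin tosses)."""
    s = sum(1 if random.random() < 0.5 else -1 for _ in range(n))
    return s / math.sqrt(n)

n_steps, n_paths = 401, 20_000   # odd step count keeps the lattice away from the threshold
frac_below = sum(1 for _ in range(n_paths)
                 if scaled_walk_at_1(n_steps) <= 0.5) / n_paths

phi_half = 0.5 * (1.0 + math.erf(0.5 / math.sqrt(2.0)))  # Phi(0.5), about 0.6915
```
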
4. Stochastic calculus
Main topics
Ito-Doeblin formula
Simple processes. Fix a partition $0 = t_0 < t_1 < \ldots < t_m = T$ of $[0, T]$ and consider a stock position that is held constant on each interval $[t_j, t_{j+1})$:
$\Delta(t) = \sum_{j=0}^{m-1} \Delta(t_j)\, I_{[t_j, t_{j+1})}(t). \quad (4.1)$
We demand that $\Delta(t_j)$ is $\mathcal{F}_{t_j}$-measurable and bounded.
A process $\Delta(t)$ as in (4.1) is called a simple process.
The discounted gains from trading with the simple position $\Delta$ in a self-financing way are
$\sum_{j=0}^{m-1} \Delta(t_j)\, \Bigl(\frac{S(t_{j+1} \wedge t)}{B(t_{j+1} \wedge t)} - \frac{S(t_j \wedge t)}{B(t_j \wedge t)}\Bigr) \quad (4.2)$
for $t \in [0, T]$.
Definition 4.1
The stochastic integral (SI) of the simple process $\Delta(t)$ with respect to the process $M(t) = \frac{S(t)}{B(t)}$ is
$\int_0^t \Delta(u)\,dM(u) := \sum_{j=0}^{m-1} \Delta(t_j)\, \bigl(M(t_{j+1} \wedge t) - M(t_j \wedge t)\bigr), \quad t \in [0, T].$
Continuous trading
We now consider strategies with continuously changing positions. A continuous trading strategy $\Delta(t)$ is approximated by a sequence of simple strategies $\Delta_n(t)$ (trading at discrete times) for which the re-balancing frequency becomes small when $n \to \infty$. That is,
$\Delta(t) = \lim_{n \to \infty} \Delta_n(t), \quad t \in [0, T]. \quad (4.3)$
Remark: One can show that every continuous adapted process $\Delta(t)$ can be approximated by simple processes as in (4.3).
We will see that this approximation is a stable procedure in the sense that the value processes $X_n(t)$ corresponding to the simple self-financing trading strategies $\Delta_n(t)$ satisfy
$X_n(t) \to X(t) \quad (n \to \infty)$
for some limit value $X(t)$.
Definition 4.2
a) Let $\Delta(t) = \lim_{n \to \infty} \Delta_n(t)$ be a continuous adapted process approximated by simple processes $\Delta_n(t)$. The stochastic integral of $\Delta(t)$ with respect to $M(t) := \frac{S(t)}{B(t)}$ is
$\int_0^t \Delta(u)\,dM(u) := \lim_{n \to \infty} \int_0^t \Delta_n(u)\,dM(u), \quad t \in [0, T]. \quad (4.4)$
b) The value process $X(t)$ of the continuous self-financing trading strategy $\Delta(t)$ with initial value $X(0)$ is given by
$\frac{X(t)}{B(t)} = X(0) + \int_0^t \Delta(u)\,d\Bigl(\frac{S(u)}{B(u)}\Bigr). \quad (4.5)$
We now specify, in three steps, a class of processes $M(t) = \frac{S(t)}{B(t)}$ for which the limit in (4.4) exists.
Step 1: SI with respect to differentiable processes
Let $M(t)$ be a stochastic process which is differentiable in $t$.
Theorem 4.3
If $\Delta$ is adapted and $\Delta(t)M'(t)$ is integrable, then
$\int_0^t \Delta(u)\,dM(u) = \int_0^t \Delta(u)M'(u)\,du.$
Proof: For a simple process $\Delta(t) = \sum_{j=0}^{m-1} \Delta_{t_j} I_{[t_j, t_{j+1})}(t)$,
$\int_0^t \Delta(u)\,dM(u) = \sum_{j=0}^{m-1} \Delta_{t_j}\bigl(M(t_{j+1} \wedge t) - M(t_j \wedge t)\bigr) = \sum_{j=0}^{m-1} \Delta_{t_j} \int_{t_j \wedge t}^{t_{j+1} \wedge t} M'(u)\,du$
$= \sum_{j=0}^{m-1} \int_{t_j \wedge t}^{t_{j+1} \wedge t} \Delta(u)M'(u)\,du = \int_0^t \Delta(u)M'(u)\,du.$
For adapted $\Delta$ choose simple processes $\Delta_n$ such that $\Delta_n \to \Delta$ and $|\Delta_n| \le |\Delta|$. We then obtain
$\int_0^t \Delta(u)\,dM(u) = \lim_{n \to \infty} \int_0^t \Delta_n(u)\,dM(u) = \lim_{n \to \infty} \int_0^t \Delta_n(u)M'(u)\,du = \int_0^t \Delta(u)M'(u)\,du$
by dominated convergence.
Corollary 4.4 (Associativity)
Let $\Delta$, $\Gamma$ adapted such that $\Gamma(t)M'(t)$ and $\Delta(t)\Gamma(t)M'(t)$ are integrable. Then
$\int_0^t \Delta(u)\,d\Bigl(\int_0^u \Gamma(v)\,dM(v)\Bigr) = \int_0^t \Delta(u)\Gamma(u)\,dM(u).$
Step 2: SI with respect to Brownian motion
Theorem 4.5
If $M(t)$ is a martingale and $\Delta(t)$ a simple process, then the SI $\int_0^t \Delta(u)\,dM(u)$ for $t \in [0, T]$ is also a martingale.
Proof: See [Shr04] Theorem 4.2.1.
Corollary 4.6
If $M(t)$ is a square-integrable martingale and $\Delta(t)$ a simple process, then the SI satisfies $E\bigl[\int_0^t \Delta(u)\,dM(u)\bigr] = 0$ and
$E\Bigl[\Bigl(\int_0^t \Delta(u)\,dM(u)\Bigr)^2\Bigr] = E\Bigl[\sum_{j=0}^{m-1} \Delta_{t_j}^2 \bigl(M(t_{j+1} \wedge t) - M(t_j \wedge t)\bigr)^2\Bigr].$
Proof: This follows from Theorem 4.5 since for every square-integrable martingale $X$,
$E\bigl[\bigl(X(t) - X(0)\bigr)^2\bigr] = \sum_{j=0}^{m-1} E\bigl[\bigl(X(t_{j+1} \wedge t) - X(t_j \wedge t)\bigr)^2\bigr].$
We now take $M(t) = W(t)$ BM.
Theorem 4.7 (Ito isometry)
For a simple process $\Delta(t)$ we have
$E\Bigl[\Bigl(\int_0^t \Delta(u)\,dW(u)\Bigr)^2\Bigr] = E\Bigl[\int_0^t \Delta(u)^2\,du\Bigr].$
Proof: For any $t_{j+1} \le t$
$E\bigl[\Delta_{t_j}^2\bigl(W(t_{j+1}) - W(t_j)\bigr)^2\bigr] = E\Bigl[\Delta_{t_j}^2\, E\bigl[\bigl(W(t_{j+1}) - W(t_j)\bigr)^2 \mid \mathcal{F}_{t_j}\bigr]\Bigr] = E\bigl[\Delta_{t_j}^2 (t_{j+1} - t_j)\bigr] = E\Bigl[\int_{t_j}^{t_{j+1}} \Delta(u)^2\,du\Bigr],$
and similarly $E\bigl[\Delta_{t_j}^2\bigl(W(t) - W(t_j)\bigr)^2\bigr] = E\bigl[\int_{t_j}^t \Delta(u)^2\,du\bigr]$ for $t \in (t_j, t_{j+1}]$.
Excursion: existence and uniqueness of SI. Let $L^2$ be the space of all $\mathcal{F}_t$-measurable random variables $X$ with $\|X\| = E[X^2]^{1/2} < \infty$, and $L^2_W$ be the space of all adapted processes $\Delta$ with
$\|\Delta\|_W = E\Bigl[\int_0^t \Delta(u)^2\,du\Bigr]^{1/2} < \infty.$
Let $\mathcal{S}$ be the space of all simple processes. Then the map
$\Delta \mapsto \int_0^t \Delta(u)\,dW(u)$
defines a linear isometry $\mathcal{S} \to L^2$ by Theorem 4.7. Since $\mathcal{S}$ is dense in $L^2_W$, and $L^2$ is a Banach space, there is a unique way to extend this map$^7$ to a linear isometry $L^2_W \to L^2$ by setting
$\int_0^t \Delta(u)\,dW(u) := \lim_{n \to \infty} \int_0^t \Delta_n(u)\,dW(u) \text{ in } L^2$
for a sequence $\Delta_n \in \mathcal{S}$ with $\Delta_n \to \Delta$ in $L^2_W$.
$^7$ See e.g. Kreyszig, Introductory Functional Analysis with Applications, p. 100.
The last result specifies a space $L^2_W$ of admissible integrands for BM: all adapted processes $\Delta$ with $E\bigl[\int_0^t \Delta(u)^2\,du\bigr] < \infty$.
Note: If $\Delta \in L^2_W$ adapted, then by definition
$\int_0^t \Delta(u)\,dW(u) := \lim_{n \to \infty} \int_0^t \Delta_n(u)\,dW(u) \text{ in } L^2$
when $\Delta_n$ are simple processes with $\Delta_n \to \Delta$ in $L^2_W$ $(n \to \infty)$.
The last condition is fulfilled for instance if $\Delta_n \to \Delta$ a.s. $(n \to \infty)$ and $|\Delta_n| \le |\Delta|$ a.s. for all $n$. A sequence $\Delta_n$ of this type can be found for every adapted $\Delta \in L^2_W$.
Basic properties of SI. Let $W$ be a BM.
Theorem 4.8
The stochastic integral $I(t) = \int_0^t \Delta(u)\,dW(u)$ satisfies
(i) $I(t)$ is linear in $\Delta$.
(ii) $I(t)$ is $\mathcal{F}_t$-measurable.
(iii) $I(t)$, $t \in [0, T]$, is a martingale.
(iv) $I(t)$ is continuous in $t$.
Proof: (i) - (iv) are immediate for simple processes $\Delta$. For an arbitrary adapted process $\Delta \in L^2_W$ we choose simple $\Delta_n$ such that $\Delta_n \to \Delta$ in $L^2_W$. Then (i) and (ii) generalize from $\Delta_n$ to $\Delta$. For (iii) we note that for $A \in \mathcal{F}_t$, the $L^2$-convergence gives
$E\Bigl[\int_0^T \Delta\,dW\; I_A\Bigr] = \lim_{n \to \infty} E\Bigl[\int_0^T \Delta_n\,dW\; I_A\Bigr] = \lim_{n \to \infty} E\Bigl[\int_0^t \Delta_n\,dW\; I_A\Bigr] = E\Bigl[\int_0^t \Delta\,dW\; I_A\Bigr].$
The proof of (iv) requires results from martingale theory, see [Dur96] Chapter 2 Theorems 4.3a and 6.3.
Theorem 4.9 (Ito isometry)
For every adapted process $\Delta(t) \in L^2_W$ we have
$E\Bigl[\Bigl(\int_0^t \Delta(u)\,dW(u)\Bigr)^2\Bigr] = E\Bigl[\int_0^t \Delta(u)^2\,du\Bigr].$
Proof: Take simple processes $\Delta_n \to \Delta$ in $L^2_W$ when $n \to \infty$. Then $\int_0^t \Delta_n(u)\,dW(u) \to \int_0^t \Delta(u)\,dW(u)$ in $L^2$, and thus
$E\Bigl[\Bigl(\int_0^t \Delta(u)\,dW(u)\Bigr)^2\Bigr] = \lim_{n \to \infty} E\Bigl[\Bigl(\int_0^t \Delta_n(u)\,dW(u)\Bigr)^2\Bigr] = \lim_{n \to \infty} E\Bigl[\int_0^t \Delta_n(u)^2\,du\Bigr] = E\Bigl[\int_0^t \Delta(u)^2\,du\Bigr].$
Step 3: SI with respect to Ito processes
Definition 4.10
Let $W(t)$ a BM with filtration $(\mathcal{F}_t)$, $t \ge 0$. An Ito process is a stochastic process of the form
$X(t) = X(0) + \int_0^t \mu(u)\,du + \int_0^t \sigma(u)\,dW(u) \quad (4.6)$
with a constant $X(0)$ and adapted processes $\mu$ and $\sigma$ satisfying the appropriate integrability conditions.
Remarks.
1) BM $X(t) = W(t)$ is an Ito process: take $X(0) = 0$, $\mu(u) = 0$, and $\sigma(u) = 1$.
2) Every nice continuous martingale $X(t)$ is an Ito process of the form $X(t) = \int_0^t \sigma(u)\,dW(u)$ with a BM $W(t)$.
Stochastic integrals. The limit in the definition (4.4) of the SI exists in $L^2$ if the integrator $M$ is an Ito process:
Theorem 4.11 (Associativity)
Let $\Delta$ adapted and $X(t)$ an Ito process
$X(t) = X(0) + \int_0^t \mu(u)\,du + \int_0^t \sigma(u)\,dW(u).$
Then under the appropriate integrability conditions
$\int_0^t \Delta(u)\,dX(u) = \int_0^t \Delta(u)\mu(u)\,du + \int_0^t \Delta(u)\sigma(u)\,dW(u).$
Implications:
1) If $X(t)$ is an Ito process, then $I(t) = \int_0^t \Delta(u)\,dX(u)$ is again an Ito process.
2) SIs w.r.t. Ito processes can be reduced to SIs w.r.t. Brownian motion plus ordinary (Lebesgue) integrals.
Proof of Theorem 4.11: a) The case $X(t) = X(0) + \int_0^t \mu(u)\,du$ is established in Corollary 4.4.
b) Next suppose $X(t) = \int_0^t \sigma(u)\,dW(u)$. Let $\Delta^{(n)}$ be a sequence of simple processes with $\Delta^{(n)} \to \Delta$ a.s. $(n \to \infty)$ and $|\Delta^{(n)}| \le |\Delta|$. Then $\Delta^{(n)}\sigma \to \Delta\sigma$ in $L^2_W$, and thus
$\int_0^t \Delta^{(n)}(u)\,dX(u) = \sum_{j=0}^{m-1} \Delta^{(n)}_{t_j}\bigl(X(t_{j+1} \wedge t) - X(t_j \wedge t)\bigr) = \sum_{j=0}^{m-1} \Delta^{(n)}_{t_j} \int_{t_j \wedge t}^{t_{j+1} \wedge t} \sigma(u)\,dW(u)$
$= \sum_{j=0}^{m-1} \int_{t_j \wedge t}^{t_{j+1} \wedge t} \Delta^{(n)}_{t_j}\sigma(u)\,dW(u) = \int_0^t \Delta^{(n)}(u)\sigma(u)\,dW(u) \to \int_0^t \Delta(u)\sigma(u)\,dW(u) \text{ in } L^2.$
c) The case of a general Ito process now follows from linearity of the SI in the integrator.
Important Example: The SI $\int X\,dX$
Let $X$ an Ito process as in (4.6). We compute$^8$ $\int_0^t X(u)\,dX(u)$.
Let $\Pi_1, \Pi_2, \ldots$ partitions of $[0, t]$ with $|\Pi_n| \to 0$ and define
$\Delta_n(t) = \sum_{t_j \in \Pi_n} X(t_j)\, I_{[t_j, t_{j+1})}(t).$
Then $\Delta_n(t) \to X(t)$ and so
$2\int_0^t X(u)\,dX(u) = \lim_{n \to \infty} \sum_{t_j \in \Pi_n} 2X(t_j)\bigl(X(t_{j+1}) - X(t_j)\bigr)$
$= \lim_{n \to \infty} \sum_{t_j \in \Pi_n} \bigl(X(t_{j+1})^2 - X(t_j)^2\bigr) - \lim_{n \to \infty} \sum_{t_j \in \Pi_n} \bigl(X(t_{j+1}) - X(t_j)\bigr)^2$
$= X(t)^2 - X(0)^2 - [X, X](t). \quad (4.7)$
For $X = W$ BM we have $[W, W](t) = t$ by Theorem 3.11. Thus
$\int_0^t W(u)\,dW(u) = \frac{1}{2}W(t)^2 - \frac{1}{2}t.$
$^8$ The SI exists if $E\bigl[\int_0^t \sigma(u)^4\,du\bigr] < \infty$.
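The identity $\int_0^t W\,dW = \frac{1}{2}W(t)^2 - \frac{1}{2}t$ can be observed numerically by evaluating the left-endpoint sums that define the SI on a fine grid (grid size and seed are arbitrary choices):

```python
import math
import random

random.seed(4)

T, n = 1.0, 20_000
dt = T / n

# left-endpoint (Ito) sum: sum_j W(t_j) * (W(t_{j+1}) - W(t_j))
W = 0.0
ito_sum = 0.0
for _ in range(n):
    dW = random.gauss(0.0, math.sqrt(dt))
    ito_sum += W * dW
    W += dW

closed_form = 0.5 * W * W - 0.5 * T  # (1/2) W(T)^2 - (1/2) T
```

The discrepancy is exactly $\frac{1}{2}(T - \sum_j (\Delta W_j)^2)$, which vanishes as the mesh shrinks by Theorem 3.11.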
Ito-Doeblin formula for Brownian motion
The last equation
$W(t)^2 - W(0)^2 = 2\int_0^t W(u)\,dW(u) + t \quad (4.8)$
shows that the chain rule from ordinary calculus does not hold for the stochastic integral with respect to BM: Let $g(t)$ be differentiable.
chain rule: $\frac{d}{du}\bigl(g(u)\bigr)^2 = 2g(u)g'(u)$
integral form: $g(t)^2 - g(0)^2 = 2\int_0^t g(u)g'(u)\,du = 2\int_0^t g(u)\,dg(u)$
The additional term $t$ in (4.8) appears because BM has non-zero quadratic variation, in contrast to differentiable functions.
Let $f: \mathbb{R} \to \mathbb{R}$ be a $C^2$ function. We want a chain rule for $f\bigl(W(t)\bigr)$. (Previously we had $f(x) = x^2$.) By Taylor's theorem
$f\bigl(W(t_{j+1})\bigr) - f\bigl(W(t_j)\bigr) = f'\bigl(W(t_j)\bigr)\bigl(W(t_{j+1}) - W(t_j)\bigr) + \frac{1}{2}f''\bigl(W(\bar{t}_j)\bigr)\bigl(W(t_{j+1}) - W(t_j)\bigr)^2$
with some $\bar{t}_j \in (t_j, t_{j+1})$. For a partition $\Pi = \{t_0, \ldots, t_m\}$ of $[0, t]$ take $\sum_{j=0}^{m-1}$ and let $|\Pi| \to 0$, then
$\sum_{j=0}^{m-1} f'\bigl(W(t_j)\bigr)\bigl(W(t_{j+1}) - W(t_j)\bigr) \to \int_0^t f'\bigl(W(u)\bigr)\,dW(u)$
$\sum_{j=0}^{m-1} \frac{1}{2}f''\bigl(W(\bar{t}_j)\bigr)\bigl(W(t_{j+1}) - W(t_j)\bigr)^2 \to \frac{1}{2}\int_0^t f''\bigl(W(u)\bigr)\,du$
We obtain the Ito-Doeblin formula for BM
$f\bigl(W(t)\bigr) - f\bigl(W(0)\bigr) = \int_0^t f'\bigl(W(u)\bigr)\,dW(u) + \frac{1}{2}\int_0^t f''\bigl(W(u)\bigr)\,du.$
Main results on Ito processes
We ultimately want to extend the Ito-Doeblin formula to multivariate functions $f\bigl(X_1(t), \ldots, X_d(t)\bigr)$ of Ito processes $X_1, \ldots, X_d$. To this end we need the quadratic variation of an Ito process.
Theorem 4.12
Let $X(t) = X(0) + \int_0^t \mu(u)\,du + \int_0^t \sigma(u)\,dW(u)$ an Ito process. Then its quadratic variation is
$[X, X](t) = \int_0^t \sigma(u)^2\,du.$
Let $Y(t) = Y(0) + \int_0^t \nu(u)\,du + \int_0^t \gamma(u)\,dW(u)$ another Ito process. The covariation of $X$ and $Y$ is then (symmetric bilinear form)
$[X, Y](t) = \frac{1}{2}[X + Y, X + Y](t) - \frac{1}{2}[X, X](t) - \frac{1}{2}[Y, Y](t)$
$= \frac{1}{2}\int_0^t \bigl(\sigma(u) + \gamma(u)\bigr)^2\,du - \frac{1}{2}\int_0^t \sigma(u)^2\,du - \frac{1}{2}\int_0^t \gamma(u)^2\,du = \int_0^t \sigma(u)\gamma(u)\,du.$
Theorem 4.13 (Continuous martingales with finite variation)
Let $X(t)$ be a continuous martingale with finite first-order variation. Then $X(t) = X(0)$ for all $t \ge 0$.
Proof: We give the proof under the condition that $X$ and $FV_X$ are bounded (the general result then follows via a method known as localization). Take a partition $\Pi = \{t_0, \ldots, t_m\}$ of $[0, t]$. Then
$E\bigl[\bigl(X(t) - X(0)\bigr)^2\bigr] = E\Bigl[\sum_{j=0}^{m-1} \bigl(X(t_{j+1}) - X(t_j)\bigr)^2\Bigr]$
$\le E\Bigl[\max_{j=0,\ldots,m-1} \bigl|X(t_{j+1}) - X(t_j)\bigr| \sum_{j=0}^{m-1} \bigl|X(t_{j+1}) - X(t_j)\bigr|\Bigr] \le E\Bigl[\max_{j=0,\ldots,m-1} \bigl|X(t_{j+1}) - X(t_j)\bigr|\; FV_X(t)\Bigr] \to 0$
for $|\Pi| \to 0$ by dominated convergence, because everything is bounded and $\max_{j=0,\ldots,m-1} |X(t_{j+1}) - X(t_j)| \to 0$ (continuity of $X$). Hence $E\bigl[\bigl(X(t) - X(0)\bigr)^2\bigr] = 0$, so $X(t) = X(0)$ a.s.
Theorem 4.14
Let $X(t) = \int_0^t \sigma(u)\,dW(u)$ with an adapted process $\sigma \in L^2_W$. Then
$M(t) = X(t)^2 - \int_0^t \sigma(u)^2\,du, \quad t \ge 0,$
is a continuous martingale.
Proof: $M(t)$ is adapted and integrable since $X(t)$ is square-integrable. Also $E\bigl[\bigl(X(t) - X(s)\bigr)^2 \mid \mathcal{F}_s\bigr] = E\bigl[X(t)^2 - X(s)^2 \mid \mathcal{F}_s\bigr]$ for $s \le t$ since $X(t)$ is a martingale. Moreover, $X(t) - X(s) = \int_s^t \sigma(u)\,dW(u)$. Hence
$E\bigl[M(t) - M(s) \mid \mathcal{F}_s\bigr] = E\Bigl[\Bigl(\int_s^t \sigma(u)\,dW(u)\Bigr)^2 - \int_s^t \sigma(u)^2\,du \;\Big|\; \mathcal{F}_s\Bigr].$
So it remains to show that for every $A \in \mathcal{F}_s$,
$E\Bigl[I_A\Bigl(\Bigl(\int_s^t \sigma(u)\,dW(u)\Bigr)^2 - \int_s^t \sigma(u)^2\,du\Bigr)\Bigr] = 0.$
But this equation is Ito isometry (Theorem 4.9) for the process
$\Delta(u) = \begin{cases} 0 & (u \le s) \\ \sigma(u)I_A & (u > s). \end{cases}$
Proof of Theorem 4.12. We have to show $[X, X](t) = \int_0^t \sigma(u)^2\,du$. We give a proof under the assumption that $\sigma$ is bounded.
By remarks 1) and 3) after Definition 3.10, $[X, X](t) = [Y, Y](t)$ for $Y(t) = \int_0^t \sigma(u)\,dW(u)$, so it suffices to show
$[Y, Y](t) = \int_0^t \sigma(u)^2\,du.$
To this end, we use (4.7) to compute
$[Y, Y](t) - \int_0^t \sigma(u)^2\,du = Y(t)^2 - 2\int_0^t Y(u)\,dY(u) - \int_0^t \sigma(u)^2\,du = Y(t)^2 - \int_0^t \sigma(u)^2\,du - 2\int_0^t Y(u)\sigma(u)\,dW(u).$
RHS is a continuous martingale by Theorem 4.14 and the martingale property of Brownian integrals. LHS has finite first order variation. So Theorem 4.13 yields $[Y, Y](t) - \int_0^t \sigma(u)^2\,du = 0$ for all $t$.
Unique representation of Ito processes. Let $X(t)$ be an Ito process and suppose that we have two representations
$X(t) = X(0) + \int_0^t \mu_1(u)\,du + \int_0^t \sigma_1(u)\,dW(u) = X(0) + \int_0^t \mu_2(u)\,du + \int_0^t \sigma_2(u)\,dW(u)$
with adapted processes $\mu_i, \sigma_i$ for $i = 1, 2$ satisfying the required integrability conditions. Then
$Z(t) := \int_0^t \bigl(\mu_1(u) - \mu_2(u)\bigr)\,du = \int_0^t \bigl(\sigma_2(u) - \sigma_1(u)\bigr)\,dW(u)$
for all $t$. So $Z(t)$ is a continuous martingale (RHS) with finite first order variation (LHS), hence $Z(t) = 0$ for all $t$ by Theorem 4.13.
We obtain $Z'(t) = \mu_1(t) - \mu_2(t) = 0$. Also
$0 = E\Bigl[\Bigl(\int_0^t \bigl(\sigma_1(u) - \sigma_2(u)\bigr)\,dW(u)\Bigr)^2\Bigr] = E\Bigl[\int_0^t \bigl(\sigma_1(u) - \sigma_2(u)\bigr)^2\,du\Bigr],$
which implies $\sigma_1(t) - \sigma_2(t) = 0$.
Ito-Doeblin formula for Ito processes
Theorem 4.15 (Ito-Doeblin formula for Ito processes)
Let $f: \mathbb{R}^d \to \mathbb{R}$ be a $C^2$ function and $X_1(t), \ldots, X_d(t)$ Ito processes. Set $X(t) = \bigl(X_1(t), \ldots, X_d(t)\bigr)$. Then
$f\bigl(X(t)\bigr) = f\bigl(X(0)\bigr) + \sum_{i=1}^d \int_0^t \frac{\partial f}{\partial x_i}\bigl(X(u)\bigr)\,dX_i(u) + \frac{1}{2}\sum_{i,j=1}^d \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}\bigl(X(u)\bigr)\,d[X_i, X_j](u). \quad (4.9)$
Remarks. 1) In particular $f\bigl(X(t)\bigr)$ is again an Ito process.
2) If some $X_i(t) = X_i(0) + \int_0^t \mu_i(u)\,du$ has finite variation, then $[X_i, X_j](t) = 0$ for all $j$, so the corresponding integrals in (4.9) vanish.
Special cases
1) For $d = 2$ with $X_1(t) = t$, $X_2(t) = W(t)$ we obtain the Ito-Doeblin formula for BM (general case)
$f\bigl(t, W(t)\bigr) = f\bigl(0, W(0)\bigr) + \int_0^t f_t\bigl(u, W(u)\bigr)\,du + \int_0^t f_x\bigl(u, W(u)\bigr)\,dW(u) + \frac{1}{2}\int_0^t f_{xx}\bigl(u, W(u)\bigr)\,du. \quad (4.10)$
2) Let $X, Y$ be Ito processes and $f(x, y) = xy$. Then we obtain the product rule
$X(t)Y(t) = X(0)Y(0) + \int_0^t X(u)\,dY(u) + \int_0^t Y(u)\,dX(u) + [X, Y](t). \quad (4.11)$
Proof of Theorem 4.15: See [Dur96] Section 2.10.
Differential notation
We often write the Ito formula
$f\bigl(X(t)\bigr) = f\bigl(X(0)\bigr) + \sum_{i=1}^d \int_0^t \frac{\partial f}{\partial x_i}\bigl(X(u)\bigr)\,dX_i(u) + \frac{1}{2}\sum_{i,j=1}^d \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}\bigl(X(u)\bigr)\,d[X_i, X_j](u)$
in a differential form:
$df\bigl(X(t)\bigr) = \sum_{i=1}^d \frac{\partial f}{\partial x_i}\bigl(X(t)\bigr)\,dX_i(t) + \frac{1}{2}\sum_{i,j=1}^d \frac{\partial^2 f}{\partial x_i \partial x_j}\bigl(X(t)\bigr)\,d[X_i, X_j](t).$
For an Ito process
$X(t) = X(0) + \int_0^t \mu(u)\,du + \int_0^t \sigma(u)\,dW(u)$
let $Y(t) = \int_0^t \Delta(u)\,dX(u)$. By Theorem 4.11 we have
$Y(t) = Y(0) + \int_0^t \Delta(u)\mu(u)\,du + \int_0^t \Delta(u)\sigma(u)\,dW(u).$
We write this in differential form as
$dX(t) = \mu(t)\,dt + \sigma(t)\,dW(t), \quad dY(t) = \Delta(t)\,dX(t)$
implies
$dY(t) = \Delta(t)\mu(t)\,dt + \Delta(t)\sigma(t)\,dW(t).$
Examples
1) Generalized geometric BM. Let $W(t)$ be a BM for a filtration $(\mathcal{F}_t)_{t \ge 0}$, let $\alpha(t)$, $\sigma(t)$ adapted processes, and
$X(t) = \int_0^t \bigl(\alpha(u) - \tfrac{1}{2}\sigma(u)^2\bigr)\,du + \int_0^t \sigma(u)\,dW(u).$
By the Ito formula $S(t) = S_0 e^{X(t)}$ is an Ito process. $f(x) = S_0 e^x$ yields $f\bigl(X(t)\bigr) = f'\bigl(X(t)\bigr) = f''\bigl(X(t)\bigr) = S(t)$ and so
$dS(t) = S(t)\,dX(t) + \frac{1}{2}S(t)\,d[X, X](t).$
By Theorem 4.12, $[X, X](t) = \int_0^t \sigma(u)^2\,du$, so Theorem 4.11 yields
$dS(t) = S(t)\bigl(\alpha(t) - \tfrac{1}{2}\sigma(t)^2\bigr)\,dt + S(t)\sigma(t)\,dW(t) + \frac{1}{2}S(t)\sigma(t)^2\,dt = S(t)\alpha(t)\,dt + S(t)\sigma(t)\,dW(t).$
For constant $\alpha$ and $\sigma$ we obtain $X(t) = \bigl(\alpha - \tfrac{1}{2}\sigma^2\bigr)t + \sigma W(t)$, so
$S(t) = S_0\, e^{\sigma W(t) + (\alpha - \frac{1}{2}\sigma^2)t}$
is geometric BM.
2) Ornstein-Uhlenbeck process. Let $W(t)$ be a BM and $R(t)$ be an Ito process satisfying the stochastic differential equation
$dR(t) = \bigl(\alpha - \beta R(t)\bigr)\,dt + \sigma\,dW(t)$
for some constants $\alpha, \beta, \sigma$. The solution of this equation is
$R(t) = \frac{\alpha}{\beta} + e^{-\beta t}\Bigl(R(0) - \frac{\alpha}{\beta} + \int_0^t \sigma e^{\beta u}\,dW(u)\Bigr). \quad (4.12)$
To see this let $X(t) = R(0) - \frac{\alpha}{\beta} + \int_0^t \sigma e^{\beta u}\,dW(u)$ and note $de^{-\beta t} = e^{-\beta t}(-\beta)\,dt$. Then by the product rule
$dR(t) = d\bigl(e^{-\beta t}X(t)\bigr) = e^{-\beta t}\,dX(t) + X(t)\,de^{-\beta t}$
$= e^{-\beta t}\sigma e^{\beta t}\,dW(t) + e^{\beta t}\Bigl(R(t) - \frac{\alpha}{\beta}\Bigr)e^{-\beta t}(-\beta)\,dt = \bigl(\alpha - \beta R(t)\bigr)\,dt + \sigma\,dW(t).$
The process $R(t)$ is used to model the spot interest rate in the Vasicek interest rate model.
We assume that $\alpha, \beta, \sigma > 0$ and write the Ornstein-Uhlenbeck stochastic differential equation as
$dR(t) = \beta\Bigl(\frac{\alpha}{\beta} - R(t)\Bigr)\,dt + \sigma\,dW(t).$
The OU process $R(t)$ is mean-reverting with long-term mean $\frac{\alpha}{\beta}$. The speed of mean reversion is determined by $\beta$, and the volatility of the process by $\sigma$. (4.12) yields the distribution of $R(t)$ using
Theorem 4.16
Let $\gamma(t)$ a deterministic function and $W(t)$ a Brownian motion. Then
$I(t) = \int_0^t \gamma(s)\,dW(s) \sim N\Bigl(0,\, \int_0^t \gamma(s)^2\,ds\Bigr).$
Proof (sketch): $E[I(t)] = 0$ and $\mathrm{Var}[I(t)] = \int_0^t \gamma(s)^2\,ds$ by Ito isometry. To prove normality, take $u \in \mathbb{R}$ and define $S(t) = e^{X(t)}$ with $X(t) = -\int_0^t \frac{1}{2}\bigl(iu\gamma(s)\bigr)^2\,ds + \int_0^t iu\gamma(s)\,dW(s)$. Then by 1)
$dS(t) = S(t)\,iu\gamma(t)\,dW(t),$
so $S(t)$ is a martingale.$^9$ Thus
$E\bigl[e^{iu I(t)}\bigr] = E\bigl[e^{X(t)}\bigr]\, e^{-\frac{1}{2}u^2 \int_0^t \gamma(s)^2\,ds} = e^{-\frac{1}{2}u^2 \int_0^t \gamma(s)^2\,ds}.$
$^9$ This follows from Novikov's condition, see Karatzas and Shreve, Brownian Motion and Stochastic Calculus, Chap. 3.5.D.
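Combining (4.12) with Theorem 4.16, $R(t)$ is Gaussian with mean $\frac{\alpha}{\beta} + (R(0) - \frac{\alpha}{\beta})e^{-\beta t}$ and variance $\frac{\sigma^2}{2\beta}(1 - e^{-2\beta t})$. The sketch below (illustrative parameter values, not from the text) simulates the SDE with a simple Euler-Maruyama scheme and compares the Monte Carlo mean and variance of $R(t)$ with this closed form:

```python
import math
import random

random.seed(5)

alpha, beta, sigma = 0.6, 2.0, 0.3
R0, t_end = 0.01, 1.0
n_steps, n_paths = 200, 20_000
dt = t_end / n_steps

finals = []
for _ in range(n_paths):
    R = R0
    for _ in range(n_steps):
        # Euler-Maruyama step for dR = (alpha - beta R) dt + sigma dW
        R += (alpha - beta * R) * dt + sigma * random.gauss(0.0, math.sqrt(dt))
    finals.append(R)

mean_mc = sum(finals) / n_paths
var_mc = sum((x - mean_mc) ** 2 for x in finals) / n_paths

# closed form from (4.12) and Theorem 4.16
mean_th = alpha / beta + (R0 - alpha / beta) * math.exp(-beta * t_end)
var_th = sigma**2 * (1.0 - math.exp(-2.0 * beta * t_end)) / (2.0 * beta)
```
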
The Black-Scholes-Merton Equation
Black-Scholes model: $B(t) = e^{rt}$, $S(t) = S_0\, e^{\sigma W(t) + (\mu - \frac{1}{2}\sigma^2)t}$.
$dB(t) = B(t)r\,dt$
$dS(t) = S(t)\mu\,dt + S(t)\sigma\,dW(t)$
Pricing derivatives via replication. Consider payoff $\bigl(S(T) - K\bigr)^+$ at time $T$. We want to find a sfts with stock position $\Delta(t)$ and initial value $X(0)$ whose value process has the form
$X(t) = c\bigl(t, S(t)\bigr) \quad (4.13)$
for some deterministic function $c(t, x)$ and all $t \in [0, T]$. The sfts is a replication strategy if $c(t, x)$ satisfies the terminal condition
$c(T, x) = (x - K)^+ \quad \text{for all } x \ge 0,$
since this implies $X(T) = c\bigl(T, S(T)\bigr) = \bigl(S(T) - K\bigr)^+$.
How do we find $c(t, x)$ satisfying (4.13)?
The Black-Scholes-Merton PDE. We have $X(t) = c\bigl(t, S(t)\bigr)$. Compute differentials on both sides. By (4.5)
$\frac{X(t)}{B(t)} = X(0) + \int_0^t \Delta(u)\,d\Bigl(\frac{S(u)}{B(u)}\Bigr).$
Also $\frac{S(t)}{B(t)} = S_0\, e^{\sigma W(t) + ((\mu - r) - \frac{1}{2}\sigma^2)t}$ is a geometric BM, so
$d\Bigl(\frac{S(t)}{B(t)}\Bigr) = \frac{S(t)}{B(t)}(\mu - r)\,dt + \frac{S(t)}{B(t)}\sigma\,dW(t). \quad (4.14)$
Thus
$d\Bigl(\frac{X(t)}{B(t)}\Bigr) = \Delta(t)\,d\Bigl(\frac{S(t)}{B(t)}\Bigr) = \Delta(t)\frac{S(t)}{B(t)}(\mu - r)\,dt + \Delta(t)\frac{S(t)}{B(t)}\sigma\,dW(t), \quad (4.15)$
$dX(t) = d\Bigl(\frac{X(t)}{B(t)}\,B(t)\Bigr) = \frac{X(t)}{B(t)}\,dB(t) + B(t)\,d\Bigl(\frac{X(t)}{B(t)}\Bigr)$
$= X(t)r\,dt + \Delta(t)S(t)(\mu - r)\,dt + \Delta(t)S(t)\sigma\,dW(t) = \bigl(X(t) - \Delta(t)S(t)\bigr)r\,dt + \Delta(t)\,dS(t). \quad (4.16)$
So
$dX(t) = \bigl(X(t) - \Delta(t)S(t)\bigr)r\,dt + \Delta(t)\,dS(t).$
By Ito's formula,
$dc\bigl(t, S(t)\bigr) = c_t\bigl(t, S(t)\bigr)\,dt + c_x\bigl(t, S(t)\bigr)\,dS(t) + \frac{1}{2}c_{xx}\bigl(t, S(t)\bigr)\,d[S, S](t)$
$= \Bigl(c_t\bigl(t, S(t)\bigr) + \frac{1}{2}c_{xx}\bigl(t, S(t)\bigr)S(t)^2\sigma^2\Bigr)\,dt + c_x\bigl(t, S(t)\bigr)\,dS(t)$
Equating $dX(t)$ and $dc\bigl(t, S(t)\bigr)$, we obtain
$\Delta(t) = c_x\bigl(t, S(t)\bigr), \quad (4.17)$
$-r c_x\bigl(t, S(t)\bigr)S(t) = -r c\bigl(t, S(t)\bigr) + c_t\bigl(t, S(t)\bigr) + \frac{1}{2}c_{xx}\bigl(t, S(t)\bigr)S(t)^2\sigma^2.$
So $c(t, x)$ is the solution to the Black-Scholes-Merton PDE
$c_t(t, x) + rx\,c_x(t, x) + \frac{1}{2}\sigma^2 x^2 c_{xx}(t, x) = r c(t, x) \quad (4.18)$
for all $t \in [0, T)$, $x > 0$, with terminal condition
$c(T, x) = (x - K)^+ \quad \text{for all } x \ge 0. \quad (4.19)$
The Black-Scholes formula is the solution to this PDE. It is
$c(t, x) = x\,\Phi\bigl(d_+(t, x)\bigr) - Ke^{-r(T-t)}\,\Phi\bigl(d_-(t, x)\bigr) \quad (4.20)$
where $\Phi$ denotes the standard normal cdf and
$d_\pm(t, x) = \frac{1}{\sigma\sqrt{T-t}}\Bigl(\log\frac{x}{K} + \Bigl(r \pm \frac{\sigma^2}{2}\Bigr)(T-t)\Bigr).$
This can be checked by direct verification$^{10}$. The fair price at time $t$ of a European call with strike $K$ and maturity $T$ in the Black-Scholes model is $X(t) = c\bigl(t, S(t)\bigr)$. The replication strategy is given by $X(0) = c(0, S_0)$ and $\Delta(t) = c_x\bigl(t, S(t)\bigr)$.
$^{10}$ See [Shr04], Ex. 4.9. We shall see another derivation of (4.20) in Chapt. 5.
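Formula (4.20) translates directly into code. The following sketch implements $c(t, x)$ with the standard normal cdf written via the error function (the sample parameter values are arbitrary):

```python
import math

def norm_cdf(x):
    """Standard normal cdf Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(t, x, K, T, r, sigma):
    """Black-Scholes call price c(t, x), eq. (4.20)."""
    tau = T - t
    d_plus = (math.log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return x * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_minus)

price = bs_call(t=0.0, x=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
```

For these values the price is about 10.45; it increases in $\sigma$ and dominates the lower bound $x - Ke^{-r(T-t)}$, as the formula requires.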
Remarks. 1) The Black-Scholes formula $c(t, x)$ is a deterministic function. The price $X(t) = c\bigl(t, S(t)\bigr)$ and the stock position $\Delta(t) = c_x\bigl(t, S(t)\bigr)$, $t \in [0, T]$, are adapted stochastic processes.
2) The Black-Scholes formula does not depend on the drift $\mu$ of the stock. $c\bigl(t, S(t)\bigr)$ depends on $r, \sigma$ (model parameters), $K, T$ (option characteristics), and $t, S(t)$ (state variables).
3) The above analysis applies to every European type derivative with a payoff $h(S_T)$ at time $T$ for some function $h(x)$, $x \ge 0$. The fair price is given by the solution $c^{(h)}(t, x)$ to the PDE (4.18) with terminal condition $c^{(h)}(T, x) = h(x)$ for all $x \ge 0$.
Example. Let $C(t)$, $P(t)$ and $F(t)$ be prices of a European call, a European put, and a forward contract with payoffs $\bigl(S(T) - K\bigr)^+$, $\bigl(K - S(T)\bigr)^+$, and $S(T) - K$, respectively. In any model we have $C(t) = F(t) + P(t)$ (put-call parity) under absence of arbitrage. Also $F(t) = S(t) - e^{-r(T-t)}K$ in the Black-Scholes model (check the PDE). Indeed, in the Black-Scholes model
$P(t) = c\bigl(t, S(t)\bigr) - S(t) + e^{-r(T-t)}K.$
Delta hedging and the greeks
The partial derivatives of $c\bigl(t, S(t)\bigr)$ w.r.t. $t$, $S(t)$, $\sigma$, $r$ are called the greeks. In particular, writing $\varphi = \Phi'$ for the standard normal density,
$\Delta(t) = c_x\bigl(t, S(t)\bigr) = \Phi\bigl(d_+(t, S(t))\bigr) > 0$
$\Gamma(t) = c_{xx}\bigl(t, S(t)\bigr) = \frac{\varphi\bigl(d_+(t, S(t))\bigr)}{S(t)\,\sigma\sqrt{T-t}} > 0$
$\Theta(t) = c_t\bigl(t, S(t)\bigr) = -\frac{\sigma S(t)\,\varphi\bigl(d_+(t, S(t))\bigr)}{2\sqrt{T-t}} - rKe^{-r(T-t)}\,\Phi\bigl(d_-(t, S(t))\bigr) < 0$
A position with value $V(t, S)$ is called
delta-neutral if $\frac{\partial V(t, S)}{\partial S} = 0$,
short gamma if $\frac{\partial^2 V(t, S)}{\partial S^2} < 0$,
long theta if $\frac{\partial V(t, S)}{\partial t} > 0$.
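The closed-form delta $\Phi(d_+)$ can be cross-checked against a finite-difference derivative of the call price, a useful sanity test when implementing greeks (parameter values below are arbitrary):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(t, x, K, T, r, sigma):
    tau = T - t
    d_plus = (math.log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return x * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_minus)

t, S, K, T, r, sigma = 0.0, 105.0, 100.0, 0.75, 0.03, 0.25

# closed-form delta: Phi(d_+)
d_plus = (math.log(S / K) + (r + 0.5 * sigma**2) * (T - t)) / (sigma * math.sqrt(T - t))
delta_cf = norm_cdf(d_plus)

# central finite difference of the price in the spot variable
h = 1e-4
delta_fd = (bs_call(t, S + h, K, T, r, sigma)
            - bs_call(t, S - h, K, T, r, sigma)) / (2.0 * h)
```
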
Multi-dimensional Brownian motion
Definition 4.17
A $d$-dimensional Brownian motion is a process $W(t) = \bigl(W_1(t), \ldots, W_d(t)\bigr)$, $t \ge 0$, such that
(i) $W_i(t)$ is a one-dimensional Brownian motion for each $i$
(ii) If $i \neq j$ then the processes $W_i(t)$ and $W_j(t)$ are independent
Let $\mathcal{F}_t = \sigma\bigl(W(s) \mid s \le t\bigr)$ for $t \ge 0$ the filtration generated by the process $W(t)$. As for the one-dimensional case we have for every $0 \le t < u$ that $W(u) - W(t)$ is independent of $\mathcal{F}_t$.
Theorem 4.18
Let $W(t) = \bigl(W_1(t), \ldots, W_d(t)\bigr)$, $t \ge 0$, a $d$-dimensional Brownian motion. Then
$[W_i, W_j](t) = \begin{cases} t & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$
Proof: See [Shr04] Section 4.6.1.
Ito processes for multi-dimensional BM
Definition 4.19
Let $W(t) = \bigl(W_1(t), \ldots, W_d(t)\bigr)$ a $d$-dimensional BM with filtration $(\mathcal{F}_t)$, $t \ge 0$. An Ito process is a stochastic process of the form
$X(t) = X(0) + \int_0^t \mu(u)\,du + \sum_{i=1}^d \int_0^t \sigma_i(u)\,dW_i(u) \quad (4.21)$
with a constant $X(0)$ and adapted processes $\mu$ and $\sigma_i$ satisfying the appropriate integrability conditions.
Most results on Ito processes for 1-dimensional BM carry over.
Theorem 4.20 (Associativity)
Let $\Delta$ adapted and $X(t)$ an Ito process as in (4.21). Then under the appropriate integrability conditions
$\int_0^t \Delta(u)\,dX(u) = \int_0^t \Delta(u)\mu(u)\,du + \sum_{i=1}^d \int_0^t \Delta(u)\sigma_i(u)\,dW_i(u).$
We again need quadratic variation and covariation of Ito processes.
Theorem 4.21
Let
$X(t) = X(0) + \int_0^t \mu(u)\,du + \sum_{i=1}^d \int_0^t \sigma_i(u)\,dW_i(u),$
$Y(t) = Y(0) + \int_0^t \nu(u)\,du + \sum_{i=1}^d \int_0^t \gamma_i(u)\,dW_i(u).$
Then
$[X, Y](t) = \sum_{i=1}^d \int_0^t \sigma_i(u)\gamma_i(u)\,du.$
Proof: The idea is to first verify the claim for simple processes $h_1, h_2$. For adapted processes $h_1, h_2$, one approximates the integrands $h_1, h_2$ with simple processes.
Details: See [Dur96] Chap. 2, Theorems (4.2c) (simple processes) and (5.4), (6.5), (8.7) (general integrands).
The Ito formula (4.9) again holds.
Theorem 4.22
Let $f: \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function and $X_1(t), \ldots, X_n(t)$ Ito processes. Set $X(t) = \bigl(X_1(t), \ldots, X_n(t)\bigr)$. Then
$f\bigl(X(t)\bigr) = f\bigl(X(0)\bigr) + \sum_{i=1}^n \int_0^t \frac{\partial f}{\partial x_i}\bigl(X(u)\bigr)\,dX_i(u) + \frac{1}{2}\sum_{i,j=1}^n \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}\bigl(X(u)\bigr)\,d[X_i, X_j](u).$
Proof: See [Dur96] Section 2.10.
The next result provides an important characterization of BM.
Theorem 4.23 (Levy)
Let $M_1(t), \ldots, M_d(t)$ be continuous martingales for a filtration $(\mathcal{F}_t)_{t \ge 0}$ with $M_i(0) = 0$, $[M_i, M_i](t) = t$, and $[M_i, M_j](t) = 0$ for $i \neq j$. Then $M(t) = \bigl(M_1(t), \ldots, M_d(t)\bigr)$, $t \ge 0$, is a $d$-dim BM.
Proof (for $d = 1$, sketch): We show $M(t) - M(s) \sim N(0, t-s)$ and independent of $\mathcal{F}_s$ for $s \le t$. Take $u \in \mathbb{R}$ and define
$X(t) = e^{iuM(t) + \frac{1}{2}u^2 t}.$
By Ito's formula for continuous martingales (see [Dur96] Chap. 2)
$dX(t) = X(t)\,d\bigl(iuM(t) + \tfrac{1}{2}u^2 t\bigr) + \frac{1}{2}X(t)\,d[iuM, iuM](t)$
$= X(t)iu\,dM(t) + X(t)\frac{1}{2}u^2\,dt + \frac{1}{2}X(t)(-1)u^2\,d[M, M](t) = X(t)iu\,dM(t).$
It follows that $X(t)$ is also a martingale. Hence
$E\bigl[e^{iuM(t) + \frac{1}{2}u^2 t} \mid \mathcal{F}_s\bigr] = e^{iuM(s) + \frac{1}{2}u^2 s},$
$E\bigl[e^{iu(M(t) - M(s))} \mid \mathcal{F}_s\bigr] = e^{-\frac{1}{2}u^2(t-s)}.$
This implies $M(t) - M(s) \sim N(0, t-s)$ and independent of $\mathcal{F}_s$.
Application: Constructing correlated stock prices
For geometric BM $dS(t) = S(t)\mu\,dt + S(t)\sigma\,dW(t)$ we have
$\log \frac{S(t+\Delta)}{S(t)} = \sigma\bigl(W(t+\Delta) - W(t)\bigr) + \bigl(\mu - \tfrac{1}{2}\sigma^2\bigr)\Delta.$
Suppose we model two stocks $S_1, S_2$ by
$dS_1(t) = S_1(t)\mu_1\,dt + S_1(t)\sigma_1\,dW_1(t),$
$dS_2(t) = S_2(t)\mu_2\,dt + S_2(t)\sigma_2\bigl(\rho\,dW_1(t) + \sqrt{1-\rho^2}\,dW_2(t)\bigr)$
for independent BMs $W_1(t)$, $W_2(t)$ and some $\rho \in [-1, 1]$. Define
$W_3(t) = \rho W_1(t) + \sqrt{1-\rho^2}\,W_2(t).$
Then $W_3(t)$ is a cont. martingale with $W_3(0) = 0$ and $[W_3, W_3](t) = t$, so $W_3$ is a BM by Theorem 4.23.
We have
$\mathrm{Corr}\Bigl(\log \frac{S_1(t+\Delta)}{S_1(t)},\, \log \frac{S_2(t+\Delta)}{S_2(t)}\Bigr)$
$= \mathrm{Corr}\Bigl(\sigma_1\bigl(W_1(t+\Delta) - W_1(t)\bigr) + \bigl(\mu_1 - \tfrac{1}{2}\sigma_1^2\bigr)\Delta,\; \sigma_2\bigl(W_3(t+\Delta) - W_3(t)\bigr) + \bigl(\mu_2 - \tfrac{1}{2}\sigma_2^2\bigr)\Delta\Bigr)$
$= \mathrm{Corr}\bigl(W_1(t+\Delta) - W_1(t),\; W_3(t+\Delta) - W_3(t)\bigr)$
$= \mathrm{Corr}\Bigl(W_1(t+\Delta) - W_1(t),\; \rho\bigl(W_1(t+\Delta) - W_1(t)\bigr) + \sqrt{1-\rho^2}\bigl(W_2(t+\Delta) - W_2(t)\bigr)\Bigr)$
$= \rho\,\mathrm{Corr}\bigl(W_1(t+\Delta) - W_1(t),\; W_1(t+\Delta) - W_1(t)\bigr) = \rho.$
That is, logarithmic returns of $S_1$ and $S_2$ have correlation $\rho$.
In summary, starting from a 2-dim BM, we constructed two geometric BM asset price processes $S_1$ and $S_2$ with correlated logarithmic returns.
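This construction is easy to verify by simulation: sample many independent one-period log returns of $S_1$ and $S_2$ from the two SDE solutions and compute their empirical correlation (parameter values below are arbitrary):

```python
import math
import random

random.seed(6)

rho = 0.7
mu1, sigma1 = 0.05, 0.20
mu2, sigma2 = 0.10, 0.35
delta = 1.0 / 252          # one period, e.g. a trading day
n = 100_000

r1, r2 = [], []
for _ in range(n):
    dw1 = random.gauss(0.0, math.sqrt(delta))
    dw2 = random.gauss(0.0, math.sqrt(delta))
    # log returns over [t, t+delta] from the exact GBM solutions
    r1.append(sigma1 * dw1 + (mu1 - 0.5 * sigma1**2) * delta)
    r2.append(sigma2 * (rho * dw1 + math.sqrt(1.0 - rho**2) * dw2)
              + (mu2 - 0.5 * sigma2**2) * delta)

m1, m2 = sum(r1) / n, sum(r2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(r1, r2)) / n
sd1 = math.sqrt(sum((a - m1) ** 2 for a in r1) / n)
sd2 = math.sqrt(sum((b - m2) ** 2 for b in r2) / n)
corr = cov / (sd1 * sd2)   # should be close to rho = 0.7
```
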
5. Risk-Neutral Pricing
Main topics
Risk-neutral measure
Let $B(t) = e^{rt}$, $S(t) = S_0\, e^{\sigma W(t) + (\mu - \frac{1}{2}\sigma^2)t}$ (Black-Scholes model) and $X(t)$ the value process of a sfts in this market. By Theorem 3.6, $\frac{S(t)}{B(t)}$ is a martingale if and only if $\mu = r$. In this case $\frac{X(t)}{B(t)}$ is also a martingale by (4.15) and Theorem 4.8. If in addition $X(T) = \bigl(S(T) - K\bigr)^+$ (replication strategy), then
$\frac{X(t)}{B(t)} = E\Bigl[\frac{X(T)}{B(T)} \,\Big|\, \mathcal{F}_t\Bigr] = E\Bigl[\frac{\bigl(S(T) - K\bigr)^+}{B(T)} \,\Big|\, \mathcal{F}_t\Bigr]. \quad (5.1)$
Problem: In general $\mu \neq r$.
Idea: Find a new measure $\widetilde{P}$ on $(\Omega, \mathcal{F})$ such that $\frac{S(t)}{B(t)}$, $t \ge 0$, becomes a martingale under $\widetilde{P}$. Then replace $E$ with $\widetilde{E}$ in (5.1).
Change of measure
Let $Z \ge 0$ a random variable on $(\Omega, \mathcal{F}, P)$ with $E[Z] = 1$. Then
$\widetilde{P}[A] = E[Z I_A] \quad \text{for all } A \in \mathcal{F} \quad (5.2)$
defines a new probability measure $\widetilde{P}$ on $(\Omega, \mathcal{F})$ (check!). Note that for each $A \in \mathcal{F}$, if $P[A] = 0$ then also $\widetilde{P}[A] = 0$. Conversely, we have
Theorem 5.1 (Radon-Nikodym)
Suppose $P$ and $\widetilde{P}$ are probability measures on $(\Omega, \mathcal{F})$ such that for each $A \in \mathcal{F}$, if $P[A] = 0$ then also $\widetilde{P}[A] = 0$. Then there exists a random variable $Z \ge 0$ on $(\Omega, \mathcal{F})$ such that $E[Z] = 1$ and
$\widetilde{P}[A] = E[Z I_A] \quad \text{for all } A \in \mathcal{F}.$
Proof: See [Dur95] Appendix A.8.
$Z$ is called a Radon-Nikodym derivative of $\widetilde{P}$ w.r.t. $P$ and denoted $Z = \frac{d\widetilde{P}}{dP}$. If $Z > 0$, then also $\widetilde{E}[Z^{-1}] = 1$ (check!).
Examples
1) Finite space. Let $\Omega$ finite, $\mathcal{F}$ = power set of $\Omega$, and $P$, $\widetilde{P}$ probability measures such that $P[A] = 0$ implies $\widetilde{P}[A] = 0$. Then
$Z(\omega) = \frac{\widetilde{P}[\omega]}{P[\omega]}\, I_{\{P[\omega] > 0\}},$
is a Radon-Nikodym derivative of $\widetilde{P}$ w.r.t $P$ (check!).
For $\omega$ with $P[\omega] > 0$ we have
$\widetilde{P}[\omega] = Z(\omega)P[\omega],$
so $\widetilde{P}$ is obtained by weighting the probabilities under $P$ with $Z$.
2) Binomial model. $P$ and $\widetilde{P}$ on $(\Omega_N, \mathcal{F})$ as in Examples 2.2 and 2.42. Then by 1)
$Z(\omega) = \frac{\widetilde{P}[\omega]}{P[\omega]} = 2^N p^{\#H(\omega)}(1-p)^{\#T(\omega)}, \quad \omega \in \Omega_N.$
3) Normal distributions. Let $X$ a random variable on $(\Omega, \mathcal{F}, P)$ with $P[X \le x] = \Phi(x)$ and $Y = X + \theta$ for some $\theta \in \mathbb{R}$. Hence $Y \sim N(\theta, 1)$ under $P$.
Find a probability $\widetilde{P}$ on $(\Omega, \mathcal{F})$ such that $Y \sim N(0, 1)$ under $\widetilde{P}$.
Solution: Define $Z = e^{-\theta X - \frac{1}{2}\theta^2}$. Then $Z > 0$ and
$E[Z] = \int \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\, e^{-\theta x - \frac{1}{2}\theta^2}\,dx = \int \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x+\theta)^2}\,dx = 1.$
So $\widetilde{P}[A] = E[Z I_A]$ for $A \in \mathcal{F}$ defines a probability measure on $(\Omega, \mathcal{F})$, and for $b \in \mathbb{R}$
$\widetilde{P}[Y \le b] = E\bigl[Z I_{\{Y \le b\}}\bigr] = E\bigl[e^{-\theta X - \frac{1}{2}\theta^2} I_{\{X \le b - \theta\}}\bigr] = \int_{-\infty}^{b-\theta} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\, e^{-\theta x - \frac{1}{2}\theta^2}\,dx$
$= \int_{-\infty}^{b-\theta} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x+\theta)^2}\,dx = \int_{-\infty}^b \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}y^2}\,dy = \Phi(b),$
so $Y \sim N(0, 1)$ under $\widetilde{P}$.
Expectations under change of measure. Let P and P̃ be as before,
Z = dP̃/dP, and let (F_t)_{t∈[0,T]} be a filtration on (Ω, F).

The Radon-Nikodym derivative process of P̃ w.r.t. P is

    Z(t) := E[Z | F_t],   t ∈ [0, T].

Let Y be an integrable random variable. Then

    Ẽ[Y] = E[YZ].                                                  (5.4)

If moreover Y is F_t-measurable and 0 ≤ s ≤ t ≤ T, then

    Ẽ[Y] = E[YZ(t)],                                               (5.5)

    Ẽ[Y | F_s] = (1/Z(s)) E[YZ(t) | F_s].                          (5.6)

Proof. For (5.4): If Y = Σ_{i=1}^k y_i 1_{A_i} with A_i ∈ F then

    Ẽ[Y] = Σ_{i=1}^k y_i P̃[A_i] = Σ_{i=1}^k y_i E[Z 1_{A_i}]
         = E[ Z Σ_{i=1}^k y_i 1_{A_i} ].

If Y is nonnegative, take simple Y_n with Y_n ↑ Y, then by
monotone convergence

    Ẽ[Y] = lim_n Ẽ[Y_n] = lim_n E[Y_n Z] = E[YZ].

If Y is integrable, use Y = Y⁺ − Y⁻.

For (5.5):

    Ẽ[Y] =(5.4)= E[YZ] = E[ E[YZ | F_t] ] = E[ Y E[Z | F_t] ] = E[YZ(t)].

For (5.6): Z(s) Ẽ[Y | F_s] is F_s-measurable, and for A ∈ F_s

    E[ Z(s) Ẽ[Y | F_s] 1_A ] = E[ Ẽ[Y 1_A | F_s] Z(s) ]
        =(5.5)= Ẽ[ Ẽ[Y 1_A | F_s] ] = Ẽ[Y 1_A] =(5.5)= E[Y 1_A Z(t)],

proving the partial averaging property for Z(s) Ẽ[Y | F_s] and YZ(t).
Hence Z(s) Ẽ[Y | F_s] = E[YZ(t) | F_s].
Girsanov's Theorem

In view of the role of martingales in risk-neutral pricing we ask:
which processes are martingales under the new measure P̃?

Assume Z = dP̃/dP > 0 P-a.s. Then also Z(t) = E[Z | F_t] > 0 P-a.s.,
and Z(t) > 0 P̃-a.s. as well, since P̃[Z = 0] = E[1_{Z=0} Z] = 0 and

    P̃[Z(t) = 0] = E[ 1_{Z(t)=0} Z ] =(5.5)= E[ 1_{Z(t)=0} Z(t) ] = 0.

If Y(t)Z(t) is a P-martingale, i.e. E[Y(t)Z(t) | F_s] = Y(s)Z(s) for
s ≤ t, then Y(t) is a P̃-martingale: for s ≤ t

    Ẽ[Y(t) | F_s] =(5.6)= (1/Z(s)) E[Y(t)Z(t) | F_s]
                  = (1/Z(s)) Y(s)Z(s) = Y(s).
Let W(t), t ∈ [0, T] be a Brownian motion on a space (Ω, F, P)
with a filtration (F_t)_{t∈[0,T]}. For an adapted process Θ(t) define

    Z(t) := exp( −∫₀ᵗ Θ(u)dW(u) − ½ ∫₀ᵗ Θ(u)² du ),   t ∈ [0, T].   (5.7)

This is a generalized geometric BM with

    Z(t) = Z(0) − ∫₀ᵗ Z(u)Θ(u)dW(u),

see Chap. 4. Assuming some integrability¹¹, Z(t) is a martingale. So
E[Z(T)] = Z(0) = 1 and we can define a probability measure P̃ by

    P̃[A] = E[Z(T) 1_A]  for all A ∈ F.                             (5.8)

Theorem 5.5 (Girsanov)
The process W̃(t) := W(t) + ∫₀ᵗ Θ(u)du, t ∈ [0, T], is a BM under P̃.

¹¹ A sufficient condition is E[ exp( ½ ∫₀ᵀ Θ(u)² du ) ] < ∞.
Proof. We use Levy's characterization of BM (Theorem 4.23).
Firstly,

    d( W̃(t)Z(t) ) = W̃(t)dZ(t) + Z(t)dW̃(t) + d[Z, W̃](t)
                  = −W̃(t)Θ(t)Z(t)dW(t) + Z(t)dW(t) + Z(t)Θ(t)dt − Θ(t)Z(t)dt
                  = ( 1 − W̃(t)Θ(t) ) Z(t)dW(t),

so W̃(t)Z(t) is a P-martingale and hence W̃(t) is a P̃-martingale.
Moreover W̃(0) = 0, W̃(t) has continuous paths, and
[W̃, W̃](t) = [W, W](t) = t, so W̃(t) is a BM under P̃ by Levy's theorem.

Consider now the market model (5.9), (5.10),

    dB(t) = B(t)R(t)dt,                   B(0) = 1,
    dS(t) = S(t)α(t)dt + S(t)σ(t)dW(t),  S(0) = S₀,

with adapted R(t), α(t), and σ(t) > 0. The discounted stock price
satisfies

    d( S(t)/B(t) ) = (S(t)/B(t)) ( α(t) − R(t) )dt + (S(t)/B(t)) σ(t)dW(t)
                   = (S(t)/B(t)) σ(t) dW̃(t),                       (5.12)

where we introduced the market price of risk process

    Θ(t) = ( α(t) − R(t) ) / σ(t)                                   (5.13)

and W̃(t) = W(t) + ∫₀ᵗ Θ(u)du. Defining P̃ via the R-N derivative
process Z(t) as in (5.7), (5.8), the process W̃(t) is a P̃-BM by
Girsanov's theorem. P̃ is called the risk-neutral measure, P is
called the real-world measure. (5.12) yields

Corollary 5.6
The discounted stock price process S(t)/B(t) is a martingale under P̃.
Stock price under the risk-neutral measure

From W̃(t) = W(t) + ∫₀ᵗ ( (α(u) − R(u)) / σ(u) ) du we obtain

    S(t) = S₀ exp( ∫₀ᵗ σ(u)dW(u) + ∫₀ᵗ ( α(u) − ½σ(u)² ) du )
         = S₀ exp( ∫₀ᵗ σ(u)dW̃(u) + ∫₀ᵗ ( R(u) − ½σ(u)² ) du )     (5.14)

and thus

    dS(t) = S(t)R(t)dt + S(t)σ(t)dW̃(t).                            (5.15)

The stock has mean rate of return equal to R(t) under P̃.
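For constant r and σ, (5.14) and Corollary 5.6 can be illustrated with a short Monte Carlo sketch: sampling W̃(T) ~ N(0, T) under P̃ and averaging the discounted terminal price should recover S₀ (all parameter values below are illustrative).

```python
import math, random

# Sketch of (5.14) with constant coefficients: under P-tilde,
# S(T) = S0 exp(sigma*W~(T) + (r - sigma^2/2)T), so the discounted price
# e^{-rT} S(T) has P-tilde expectation S0 (martingale property, Cor. 5.6).
S0, r, sigma, T = 100.0, 0.05, 0.2, 1.0
rng = random.Random(1)

n = 200_000
total = 0.0
for _ in range(n):
    w = rng.gauss(0.0, math.sqrt(T))    # W~(T) ~ N(0, T) under P-tilde
    total += math.exp(-r * T) * S0 * math.exp(sigma * w + (r - 0.5 * sigma**2) * T)
est = total / n
assert abs(est - S0) < 0.5              # Monte Carlo estimate of S0
```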
Let V(T) be an F_T-measurable random variable which represents
the payoff of a derivative at time T. We allow path-dependence,
so this includes but is not limited to payoffs V(T) = h(S(T)).
We say that V(T) is attainable if it can be replicated, i.e.,
if there exists a sfts with terminal value X(T) = V(T) a.s.
The fair price V(t) of the derivative at t is equal to the value X(t)
of the replication portfolio at t.

Corollary 5.7 (Risk-neutral pricing)
Suppose V(T) is attainable. Then

    V(t) = B(t) Ẽ[ V(T)/B(T) | F_t ]
         = Ẽ[ e^{−∫ₜᵀ R(u)du} V(T) | F_t ].                        (5.16)

Proof: By (4.5), which holds for a general asset price B(t),

    X(t)/B(t) = X(0) + ∫₀ᵗ Δ(u) d( S(u)/B(u) ).

So X(t)/B(t) is a P̃-martingale by (5.12), and

    V(t)/B(t) = Ẽ[ V(T)/B(T) | F_t ] = Ẽ[ X(T)/B(T) | F_t ] = X(t)/B(t).
Risk-neutral pricing in the Black-Scholes model

Let bond and stock satisfy

    dB(t) = B(t)r dt,                     B(0) = 1,
    dS(t) = S(t)α(t)dt + S(t)σ dW(t),    S(0) = S₀,

where r and σ are constant and α(t) is adapted. Let Θ(t) = (α(t) − r)/σ
and Z(t) and P̃ as in (5.7), (5.8). From (5.14) we have

    S(t) = S₀ exp( σW̃(t) + ( r − ½σ² )t )                          (5.17)

where W̃(t) = W(t) + ∫₀ᵗ Θ(u)du is a P̃-BM. Results in Chap. 4
say that the European call payoff V(T) = ( S(T) − K )⁺ can be
replicated (the analysis is the same when the constant α is replaced
by an adapted process). By Corollary 5.7

    c( t, S(t) ) = V(t) = Ẽ[ e^{−r(T−t)} ( S(T) − K )⁺ | F_t ].    (5.18)

We write

    S(T) = S(t) e^{σ(W̃(T) − W̃(t)) + (r − ½σ²)(T−t)}
         = S(t) e^{−σ√τ Y + (r − ½σ²)τ}

with τ = T − t, Y = −( W̃(T) − W̃(t) ) / √(T−t) ~ N(0, 1) and
independent of F_t under P̃.
So by (5.18)

    c( t, S(t) ) = V(t)
      = Ẽ[ ( e^{−rτ} S(t) e^{−σ√τ Y + (r − ½σ²)τ} − e^{−rτ}K )⁺ | F_t ]
      = Ẽ[ ( x e^{−σ√τ Y − ½σ²τ} − e^{−rτ}K )⁺ ] |_{x = S(t)}.

Note

    x e^{−σ√τ Y − ½σ²τ} − e^{−rτ}K ≥ 0
    ⟺  log x − σ√τ Y − ½σ²τ ≥ −rτ + log K
    ⟺  Y ≤ (1/(σ√τ)) ( log(x/K) + ( r − ½σ² )τ ) =: d₋(τ, x).

Hence

    c(t, x) = Ẽ[ x e^{−σ√τ Y − ½σ²τ} 1_{Y ≤ d₋(τ,x)} − K e^{−rτ} 1_{Y ≤ d₋(τ,x)} ]
            = x Φ( d₋(τ, x) + σ√τ ) − K e^{−rτ} Φ( d₋(τ, x) )
            = x Φ( d₊(τ, x) ) − K e^{−rτ} Φ( d₋(τ, x) ).
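The resulting Black-Scholes call formula is easy to implement. A minimal Python version; the reference value 10.4506 is the standard check for x = K = 100, r = 0.05, σ = 0.2, τ = 1:

```python
import math

# Black-Scholes call price derived above:
# c(tau, x) = x Phi(d+) - K e^{-r tau} Phi(d-),
# d± = (log(x/K) + (r ± sigma^2/2) tau) / (sigma sqrt(tau)).
def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(tau, x, K, r, sigma):
    d_minus = (math.log(x / K) + (r - 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_plus = d_minus + sigma * math.sqrt(tau)
    return x * Phi(d_plus) - K * math.exp(-r * tau) * Phi(d_minus)

price = bs_call(1.0, 100.0, 100.0, 0.05, 0.2)
assert abs(price - 10.4506) < 1e-3      # standard textbook reference value
```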
Martingale representation theorem and hedging

In the Black-Scholes model, a European call can be replicated via
the sfts X(0) = c( 0, S(0) ), Δ(t) = c_x( t, S(t) ) (delta hedging, see
Chap. 4). What about other models and derivatives?

Take a BM W(t) with filtration (F_t)_{t≥0} generated by this BM,
and consider the model (5.9), (5.10)

    dB(t) = B(t)R(t)dt,                   B(0) = 1,
    dS(t) = S(t)α(t)dt + S(t)σ(t)dW(t),  S(0) = S₀,

with adapted R(t), α(t), and σ(t) > 0. Define Θ(t) = ( α(t) − R(t) ) / σ(t)
and P̃ via the R-N derivative process Z(t) as in (5.7), (5.8). Also
let W̃(t) = W(t) + ∫₀ᵗ Θ(u)du. This is a P̃-BM.
The next result guarantees the existence of replication strategies.
For a proof see e.g. Revuz and Yor, Continuous Martingales and
Brownian motion, Theorem V.3.4.
Theorem 5.8 (Martingale representation theorem)
Let W(t), t ∈ [0, T] be a BM and (F_t)_{t∈[0,T]} the filtration
generated by this BM. For every P-martingale M(t) w.r.t.
(F_t)_{t∈[0,T]} there exists an adapted process Γ(t) such that

    M(t) = M(0) + ∫₀ᵗ Γ(u)dW(u),   t ∈ [0, T].

Corollary 5.9
For every P̃-martingale M̃(t), t ∈ [0, T] w.r.t. (F_t)_{t∈[0,T]} there
exists an adapted process Γ̃(t) such that

    M̃(t) = M̃(0) + ∫₀ᵗ Γ̃(u)dW̃(u),   t ∈ [0, T].

Now let V(T) be an F_T-measurable derivative payoff and define

    V(t) := B(t) Ẽ[ V(T)/B(T) | F_t ].
V(t)/B(t) is a P̃-martingale, so by Corollary 5.9 there exists Γ̃(t) with

    V(t)/B(t) = V(0)/B(0) + ∫₀ᵗ Γ̃(u)dW̃(u)
              = V(0) + ∫₀ᵗ ( Γ̃(u)B(u) / (S(u)σ(u)) ) (S(u)/B(u)) σ(u)dW̃(u)
              = V(0) + ∫₀ᵗ ( Γ̃(u)B(u) / (S(u)σ(u)) ) d( S(u)/B(u) ),   t ∈ [0, T].

Therefore taking X(0) = Ẽ[ V(T)/B(T) ] and Δ(t) = Γ̃(t)B(t) / (S(t)σ(t))
in (4.5) yields the desired sfts with X(T) = V(T).
Remarks.
1) We have shown that in a model of the form (5.9), (5.10), every
F_T-measurable derivative payoff is attainable, i.e. can be
replicated by self-financing trading in bank account and stock.
Such a model is called complete.
2) The key assumptions are that the filtration is generated by a
one-dimensional BM W(t), and that σ(t) is positive.
3) The martingale representation theorem justifies the use of the
risk-neutral pricing formula (5.16) by showing that every derivative
is attainable. But it does not provide a method of finding Γ̃(t) in
Δ(t) = Γ̃(t)B(t) / (S(t)σ(t)). Under further assumptions on the
processes R(t) and σ(t), the strategy Δ(t) can be found using PDE
methods (Feynman-Kac theorem, see [Shr04] Chap. 6).
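In the Black-Scholes case, where Δ(t) = c_x(t, S(t)) = Φ(d₊(τ, S(t))) is known in closed form, the replication argument can be illustrated numerically: rebalancing on a fine discrete grid, the hedging portfolio's terminal value approximately matches the call payoff. A sketch under that assumption (parameter values illustrative; paths are simulated with drift r for simplicity, although replication works pathwise for any drift):

```python
import math, random

# Discrete delta-hedging sketch: hold Delta(t) = Phi(d+(tau, S(t))) shares,
# keep the rest in the bank account; terminal value ~ call payoff.
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_steps, n_paths = 2000, 20
dt = T / n_steps
rng = random.Random(2)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d_plus(tau, x):
    return (math.log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))

def bs_price(tau, x):
    dp = d_plus(tau, x)
    return x * Phi(dp) - K * math.exp(-r * tau) * Phi(dp - sigma * math.sqrt(tau))

errors = []
for _ in range(n_paths):
    S, X = S0, bs_price(T, S0)          # initial capital = Black-Scholes price
    for i in range(n_steps):
        tau = T - i * dt
        delta = Phi(d_plus(tau, S))     # shares held over [t, t + dt)
        S_new = S * math.exp(sigma * rng.gauss(0.0, math.sqrt(dt))
                             + (r - 0.5 * sigma**2) * dt)
        # stock gain plus interest on the cash position X - delta*S
        X = X + delta * (S_new - S) + (X - delta * S) * (math.exp(r * dt) - 1.0)
        S = S_new
    errors.append(abs(X - max(S - K, 0.0)))
mean_err = sum(errors) / n_paths
assert mean_err < 0.5                   # small replication error on a fine grid
```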
The fundamental theorems of asset pricing

We consider a market with a bank account process B(t),

    dB(t) = B(t)R(t)dt,                                             (5.19)

and m risky assets S_i(t), i = 1, ..., m,

    dS_i(t) = S_i(t)α_i(t)dt + S_i(t) Σ_{j=1}^d σ_ij(t)dW_j(t).     (5.20)

Here W(t) = ( W_1(t), ..., W_d(t) ) is a d-dim BM on a space
(Ω, F_T, P) with a filtration (F_t)_{t∈[0,T]}, and R(t), α_i(t), and
σ_ij(t) are adapted processes.
Assume σ_i(t) = √( Σ_{j=1}^d σ_ij(t)² ) > 0 for i = 1, ..., m and let

    B_i(t) = Σ_{j=1}^d ∫₀ᵗ ( σ_ij(u) / σ_i(u) ) dW_j(u).

Then

    dS_i(t) = S_i(t)α_i(t)dt + S_i(t)σ_i(t)dB_i(t).

The processes B_i(t) are martingales with

    d[B_i, B_k](t) = Σ_{j₁=1}^d Σ_{j₂=1}^d ( σ_ij₁(t)σ_kj₂(t) / (σ_i(t)σ_k(t)) ) d[W_j₁, W_j₂](t)
                   = Σ_{j=1}^d ( σ_ij(t)σ_kj(t) / (σ_i(t)σ_k(t)) ) dt
                   =: ρ_ik(t)dt.

So d[B_i, B_i](t) = dt, and B_i(t) is a 1-dim BM by Levy's theorem.
Also ρ_ik(t) ∈ [−1, 1] is the instantaneous correlation of B_i(t) and
B_k(t) for i ≠ k.
This means that the logarithmic returns of S_i and S_k at time t
have conditional correlation ρ_ik(t), cf. [Shr04] Exercise 4.17.
Arbitrage and the first FTAP

Let X(t) be the value process of a sfts which holds Δ_i(t) shares of
asset S_i at time t, where Δ_i(t) are adapted processes. Then

    dX(t) = ( X(t) − Σ_{i=1}^m Δ_i(t)S_i(t) ) R(t)dt + Σ_{i=1}^m Δ_i(t)dS_i(t).

The product rule and d( 1/B(t) ) = −(1/B(t))R(t)dt imply

    X(t)/B(t) = X(0) + Σ_{i=1}^m ∫₀ᵗ Δ_i(u) d( S_i(u)/B(u) ).       (5.21)

Definition 5.10
a) A sfts is admissible if there exists a constant b ≥ 0 such that
its value process satisfies X(t) ≥ −b for all t, a.s.
b) A sfts is an arbitrage if it is admissible, and its value process
satisfies X(0) = 0 a.s., X(T) ≥ 0 a.s., and P[X(T) > 0] > 0.
Theorem 5.11 (First fundamental theorem of asset pricing)
If there exists a measure P̃ equivalent to P such that
S_1(t)/B(t), ..., S_m(t)/B(t) are P̃-martingales, then the market
model B, S_1, ..., S_m does not admit arbitrage.

A measure P̃ as in Theorem 5.11 is called a risk-neutral or
equivalent martingale measure for the market B, S_1, ..., S_m.

Proof: Let X(t) be the value process of an admissible sfts with
X(0) = 0 and X(T) ≥ 0 a.s. By (5.21), X(t)/B(t) is a P̃-martingale¹².
Then Ẽ[ X(T)/B(T) ] = X(0) = 0 and hence P̃[X(T) > 0] = 0. By
equivalence P[X(T) > 0] = 0, so the sfts cannot be an arbitrage.

¹² This requires an integrability condition on Δ(t).
Existence of a risk-neutral measure

Let W(t) = ( W_1(t), ..., W_d(t) ) be a d-dim BM as above. Take a
d-dim adapted process Θ(t) = ( Θ_1(t), ..., Θ_d(t) ) and define for
t ∈ [0, T]

    Z(t) := exp( −∫₀ᵗ Θ(u)·dW(u) − ½ ∫₀ᵗ ||Θ(u)||² du ).           (5.22)

If E[ exp( ½ ∫₀ᵀ ||Θ(u)||² du ) ] < ∞, then Z(t) is a martingale and
we can define a probability measure P̃ by P̃[A] = E[Z(T) 1_A] for all
A ∈ F.

The same proof as in the 1-dimensional case yields

Theorem 5.12 (Girsanov)
The process W̃(t) := W(t) + ∫₀ᵗ Θ(u)du, t ∈ [0, T], is a d-dim BM
under P̃.
Writing the model (5.19), (5.20) in discounted prices we have

    d( S_i(t)/B(t) ) = (S_i(t)/B(t)) ( α_i(t) − R(t) )dt
                       + (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t)dW_j(t)

for i = 1, ..., m. Now S_1(t)/B(t), ..., S_m(t)/B(t) are P̃-martingales if

    d( S_i(t)/B(t) ) = (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t) ( Θ_j(t)dt + dW_j(t) )
                     = (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t)dW̃_j(t)

with W̃_j(t) = W_j(t) + ∫₀ᵗ Θ_j(u)du, and this is the case if and only if

    α_i(t) − R(t) = Σ_{j=1}^d σ_ij(t)Θ_j(t),   i = 1, ..., m.       (5.23)

This is a linear system of m equations for the d unknowns
Θ_1(t), ..., Θ_d(t).

Conclusion: If (5.23) has a solution Θ(t) = ( Θ_1(t), ..., Θ_d(t) ), then
the model (5.19), (5.20) does not admit arbitrage.
If (5.23) has no solution, there is an arbitrage in the model.
Instead of a proof we give an example.

Example. Take m = 2, d = 1, and constant coefficients R, α_i, σ_i.
Then (5.23) becomes

    α_1 − R = σ_1 Θ,
    α_2 − R = σ_2 Θ.

This has a solution if and only if ( α_1 − R )/σ_1 = ( α_2 − R )/σ_2.
Suppose this does not hold, say ( α_1 − R )/σ_1 > ( α_2 − R )/σ_2.
We can realize an arbitrage via the sfts X(0) = 0,

    Δ_1(t) = 1/( S_1(t)σ_1 ),   Δ_2(t) = −1/( S_2(t)σ_2 ).

By (5.21), its discounted value process is

    X(t)/B(t) = ∫₀ᵗ Δ_1(u) d( S_1(u)/B(u) ) + ∫₀ᵗ Δ_2(u) d( S_2(u)/B(u) )
      = ∫₀ᵗ ( 1/(σ_1 S_1(u)) ) (S_1(u)/B(u)) ( (α_1 − R)du + σ_1 dW(u) )
        − ∫₀ᵗ ( 1/(σ_2 S_2(u)) ) (S_2(u)/B(u)) ( (α_2 − R)du + σ_2 dW(u) )
      = ∫₀ᵗ (1/B(u)) ( (α_1 − R)/σ_1 − (α_2 − R)/σ_2 ) du > 0.
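In this m = 2, d = 1 case, solvability of (5.23) reduces to the two stocks having the same market price of risk (α_i − R)/σ_i. A tiny Python sketch of that criterion (function names and parameter values are ours, for illustration):

```python
# Solvability check for (5.23) with d = 1: the model admits a risk-neutral
# measure iff all stocks share the same market price of risk (alpha_i - R)/sigma_i.
def market_prices_of_risk(R, alphas, sigmas):
    return [(a - R) / s for a, s in zip(alphas, sigmas)]

def has_risk_neutral_measure(R, alphas, sigmas, tol=1e-12):
    thetas = market_prices_of_risk(R, alphas, sigmas)
    return max(thetas) - min(thetas) < tol

# consistent pair: Theta = 0.5 for both stocks
assert has_risk_neutral_measure(0.02, [0.07, 0.12], [0.1, 0.2])
# inconsistent pair: Theta_1 = 0.5 > Theta_2 = 0.25, so an arbitrage exists
assert not has_risk_neutral_measure(0.02, [0.07, 0.07], [0.1, 0.2])
```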
Risk-neutral pricing and the second FTAP

So far we defined prices of derivatives via replication arguments.
We now introduce the concept of pricing by arbitrage arguments.
Suppose the market B, S_1, ..., S_m given by (5.19), (5.20) admits a
risk-neutral measure P̃. For an F_T-measurable derivative payoff V(T)
define

    V(t) := B(t) Ẽ[ V(T)/B(T) | F_t ].                             (5.24)

Then V(t)/B(t), t ∈ [0, T] is a P̃-martingale. So if the derivative is
traded for V(t) at t, the market B, S_1, ..., S_m, V does not
admit arbitrage by the first FTAP. We call V(t) a fair or
arbitrage-free price.

The model B, S_1, ..., S_m is called complete if every F_T-measurable
derivative payoff V(T) can be replicated, i.e. if there exist adapted
processes Δ_i(t) with

    V(T)/B(T) = V(0) + Σ_{i=1}^m ∫₀ᵀ Δ_i(u) d( S_i(u)/B(u) ).

Otherwise the model is called incomplete.
Theorem 5.14 (Second fundamental theorem of asset pricing)
Suppose that the model B, S_1, ..., S_m has a risk-neutral measure.
The following statements are equivalent.
(i) The model B, S_1, ..., S_m is complete.
(ii) The risk-neutral measure is unique.
(iii) Equation (5.23) has a unique solution Θ(t) = ( Θ_1(t), ..., Θ_d(t) ).
Proof (sketch). (i) ⇒ (ii): Assume completeness and let P̃_1, P̃_2 be
risk-neutral measures. For A ∈ F = F_T define V(T) = 1_A B(T).
By completeness V(T) can be replicated via a sfts with some initial
value V(0). Its discounted value process is a martingale under P̃_1
and P̃_2, so

    P̃_i[A] = Ẽ_i[1_A] = Ẽ_i[ V(T)/B(T) ] = V(0)  for i = 1, 2.

(ii) ⇒ (iii): Assume uniqueness of P̃ and let Θ(t) and Θ'(t) be
solutions of (5.23). Then Z(T) and Z'(T) define the same risk-neutral
measure, which forces Θ(t) = Θ'(t).
(iii) ⇒ (i): Given an integrable derivative payoff V(T) we have to
show that there are Δ_i(t) with

    V(T)/B(T) = V(0) + Σ_{i=1}^m ∫₀ᵀ Δ_i(t) d( S_i(t)/B(t) )
              = V(0) + Σ_{i=1}^m ∫₀ᵀ Δ_i(t) (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t)dW̃_j(t).

Take a risk-neutral measure and define V(t) via (5.24). We obtain

    V(T)/B(T) = V(0) + Σ_{j=1}^d ∫₀ᵀ Γ̃_j(t)dW̃_j(t)

for suitable Γ̃_j(t) by multi-dim martingale representation, see below.
So we have to show that there is a solution Δ_1(t), ..., Δ_m(t) to

    Σ_{i=1}^m Δ_i(t) (S_i(t)/B(t)) σ_ij(t) = Γ̃_j(t),   j = 1, ..., d.   (5.25)

Since (5.23) has a unique solution, the matrix σ_ij(t) has rank d
(and thus m ≥ d). This implies that (5.25) has a solution.
Theorem 5.15 (Martingale representation theorem)
Let ( W_1(t), ..., W_d(t) ), t ∈ [0, T] be a d-dim BM and (F_t)_{t∈[0,T]}
the filtration generated by this BM. For every P-martingale M(t)
w.r.t. (F_t)_{t∈[0,T]} there exists an adapted ( Γ_1(t), ..., Γ_d(t) )
such that

    M(t) = M(0) + Σ_{j=1}^d ∫₀ᵗ Γ_j(u)dW_j(u),   t ∈ [0, T].

For every P̃-martingale M̃(t), t ∈ [0, T] w.r.t. (F_t)_{t∈[0,T]} there
exists an adapted ( Γ̃_1(t), ..., Γ̃_d(t) ) such that

    M̃(t) = M̃(0) + Σ_{j=1}^d ∫₀ᵗ Γ̃_j(u)dW̃_j(u),   t ∈ [0, T].
Discussion of the multi-dimensional asset price model.
1) d = m = 1: The model (5.19), (5.20) is arbitrage-free and
complete if σ_1(t) ≠ 0 for all t.
2) d = m > 1: The model (5.19), (5.20) is arbitrage-free and
complete if the matrix ( σ_ij(t) )_{i,j=1,...,d} is nonsingular for all t.
3) d > m: The model (5.19), (5.20) is arbitrage-free if the matrix
( σ_ij(t) )_{i=1,...,m, j=1,...,d} has rank m for all t.
It is incomplete in this case.
4) d < m: The model (5.19), (5.20) is arbitrage-free if the vector
( α_1(t) − R(t), ..., α_m(t) − R(t) ) is in the image of the linear map
defined by the matrix ( σ_ij(t) )_{i=1,...,m, j=1,...,d} (drift conditions).
It is complete if moreover the matrix ( σ_ij(t) )_{i=1,...,m, j=1,...,d}
has rank d.
Dividend-Paying Stocks

Let B(t) = e^{∫₀ᵗ R(u)du} be a bank account process, and S(t) the
price process of a stock with cumulative dividend process D(t) (= sum
of all dividends paid between times 0 and t). D(t) is an increasing
process. If we hold the stock from 0 to t and invest all dividends in
the bank account, our wealth at time t is

    Y(t) = S(t) + ∫₀ᵗ ( B(t)/B(u) ) dD(u).

We conclude

    d( Y(t)/B(t) ) = d( S(t)/B(t) + ∫₀ᵗ (1/B(u)) dD(u) )
      = (1/B(t)) dS(t) − S(t)(1/B(t))R(t)dt + (1/B(t)) dD(t)
      = (1/B(t)) ( d( S(t) + D(t) ) − S(t)R(t)dt ).                 (5.26)

The value process X(t) of a sfts in the stock and bank account
with stock position Δ(t) at time t satisfies

    dX(t) = ( X(t) − Δ(t)S(t) ) R(t)dt + Δ(t) d( S(t) + D(t) ).
Using the product rule and the last two equations, we find

    d( X(t)/B(t) ) = (1/B(t)) dX(t) − X(t)(1/B(t))R(t)dt
      = (1/B(t)) ( ( X(t) − Δ(t)S(t) )R(t)dt + Δ(t)d( S(t) + D(t) ) − X(t)R(t)dt )
      = Δ(t) d( Y(t)/B(t) ).

Thus if Y(t)/B(t) is a P̃-martingale, then X(t)/B(t) is a P̃-martingale.
Stock model with dividends. We now assume

    d( S(t) + D(t) ) = S(t)α(t)dt + S(t)σ(t)dW(t).                  (5.27)

Returns of the stock including dividend payments have mean α(t)
and volatility σ(t) (modeled by adapted processes). By (5.26)

    d( Y(t)/B(t) ) = (1/B(t)) ( S(t)( α(t) − R(t) )dt + S(t)σ(t)dW(t) )
                   = (S(t)/B(t)) σ(t) dW̃(t)

where W̃(t) = W(t) + ∫₀ᵗ Θ(u)du with Θ(t) = ( α(t) − R(t) ) / σ(t).
So by choosing Z(t) and P̃ as in (5.7), (5.8), Y(t)/B(t) is a
P̃-martingale.

In summary we obtain

    d( X(t)/B(t) ) = Δ(t) d( Y(t)/B(t) ),                           (5.28)
    d( Y(t)/B(t) ) = (S(t)/B(t)) σ(t) dW̃(t)                        (5.29)

for the value process X(t) of a sfts. Therefore, as in the case of
zero dividends, a derivative payoff V(T) at time T has fair price

    V(t) = B(t) Ẽ[ V(T)/B(T) | F_t ]                               (5.30)

and can be replicated via a sfts (martingale representation theorem).
The difference between the dividend and no-dividend case is in the
stock model, now given by (5.27)

    d( S(t) + D(t) ) = S(t)α(t)dt + S(t)σ(t)dW(t).
We solve the equation for S(t) in two important cases.
Examples.
1) Continuously paying dividend. Here we assume

    D(t) = ∫₀ᵗ A(u)S(u)du

for an adapted rate process A(t) ≥ 0. Then (5.27) becomes

    dS(t) = S(t)( α(t) − A(t) )dt + S(t)σ(t)dW(t)

and we obtain

    S(t) = S₀ exp( ∫₀ᵗ σ(u)dW(u) + ∫₀ᵗ ( α(u) − A(u) − ½σ(u)² ) du )
         = S₀ exp( ∫₀ᵗ σ(u)dW̃(u) + ∫₀ᵗ ( R(u) − A(u) − ½σ(u)² ) du ).   (5.31)

For constant coefficients R(t) = r, σ(t) = σ, and A(t) = a,

    S(t) = S₀ exp( σW̃(t) + ( r − a − ½σ² )t ).
Using the risk-neutral pricing formula (5.30) we can compute
derivative prices as in the Black-Scholes model without dividends.
With S(t) = S₀ exp( σW̃(t) + ( r − a − ½σ² )t ), t ∈ [0, T], we obtain

    V(t) = Ẽ[ e^{−r(T−t)} ( S(T) − K )⁺ | F_t ]
         = S̄(t) Φ( d₊(τ, S̄(t)) ) − e^{−r(T−t)} K Φ( d₋(τ, S̄(t)) )

with S̄(t) = S(t)e^{−a(T−t)}, via a similar computation as for the
no-dividend case in (5.18).
When a = 0 we recover the original Black-Scholes price formula
V(t) = c( t, S(t) ).
2) Lump payments of dividends. Here we assume

    D(t) = Σ_{t_j ≤ t} a_j S(t_j−)

with F_{t_j}-measurable random variables a_j ∈ [0, 1] for j = 1, ..., n,
and a finite number of payment dates 0 < t_1 < ... < t_n < T.
Then D(t) is constant in t between consecutive payment dates,
and jumps up by a_j S(t_j−) at time t_j.
By (5.27) the sum S(t) + D(t) is continuous in t. Therefore S(t)
jumps down by a_j S(t_j−) at time t_j, i.e. S(t_j) = (1 − a_j)S(t_j−).
Also

    dS(t) = S(t)α(t)dt + S(t)σ(t)dW(t) = S(t)R(t)dt + S(t)σ(t)dW̃(t)

for t ∈ (t_j, t_{j+1}) between consecutive payment dates. Thus

    S(t) = S(t_j) exp( ∫_{t_j}^t σ(u)dW̃(u) + ∫_{t_j}^t ( R(u) − ½σ(u)² ) du )
           if t ∈ (t_j, t_{j+1}),

    S(t_{j+1}) = (1 − a_{j+1}) S(t_j) exp( ∫_{t_j}^{t_{j+1}} σ(u)dW̃(u)
                 + ∫_{t_j}^{t_{j+1}} ( R(u) − ½σ(u)² ) du ).
By recursion it follows for all t ∈ [0, T] that

    S(t) = S₀ Π_{t_j ≤ t} (1 − a_j) · exp( ∫₀ᵗ σ(u)dW̃(u)
           + ∫₀ᵗ ( R(u) − ½σ(u)² ) du ).                            (5.32)

For constant coefficients R(t) = r, σ(t) = σ, and a_j,

    S(t) = S₀ Π_{t_j ≤ t} (1 − a_j) · e^{σW̃(t) + ( r − ½σ² )t}.

The risk-neutral pricing formula (5.30) then yields for the price V(t)
of the European call with strike K and maturity T

    V(t) = S̄(t) Φ( d₊(τ, S̄(t)) ) − e^{−r(T−t)} K Φ( d₋(τ, S̄(t)) )

where S̄(t) = S(t) Π_{t < t_j ≤ T} (1 − a_j).
Remark. We can compute the value X(t) of a portfolio strategy
which starts with one share of stock and instantaneously reinvests
all dividends in the stock. Let η(t) be the number of shares in the
portfolio at time t. This is an increasing process with η(0) = 1 and

    dη(t) = η(t−) dD(t) / S(t).

Case 1) D(t) = ∫₀ᵗ A(u)S(u)du. Then we find η(t) = e^{∫₀ᵗ A(u)du}.

Case 2) D(t) = Σ_{t_j ≤ t} a_j S(t_j−). Then η(t) is constant between
consecutive payment dates and at payment date t_j we have

    η(t_j) − η(t_j−) = η(t_j−) a_j S(t_j−) / S(t_j) = η(t_j−) a_j / (1 − a_j).

Thus η(t_j) = η(t_j−) · 1/(1 − a_j) = η(t_{j−1}) · 1/(1 − a_j), and so
η(t) = Π_{t_j ≤ t} 1/(1 − a_j).
Now since X(t) = η(t)S(t), using either (5.31) or (5.32) we
obtain in each case that

    X(t) = S₀ exp( ∫₀ᵗ σ(u)dW̃(u) + ∫₀ᵗ ( R(u) − ½σ(u)² ) du ).

In conclusion, if the dividend-paying stock model is given by
(5.27), then the associated value process X(t) from continuous
dividend reinvestment follows the same stochastic process as the
non-dividend-paying stock S(t) in (5.14).
In particular, the discounted value process

    X(t)/B(t) = S₀ exp( ∫₀ᵗ σ(u)dW̃(u) − ½ ∫₀ᵗ σ(u)² du )

is again a martingale under the risk-neutral measure P̃.