
Financial Engineering with Stochastic Calculus I

Johannes Wissel
Cornell University
Fall 2010
0. Introduction
Motivation: Some examples of financial derivatives
- an airline seeks protection against rising oil prices (→ forward contract)
- a company wants to hedge the risk in a payment obligation in foreign currency at a future time (→ call option)
- a fund manager wants to protect a stock position against losses (→ put option)
Main objectives of financial engineering (FE)
- development of quantitative models for financial markets
- design, pricing, and hedging of financial derivatives
- development of quantitative methods of risk management

Stochastic calculus
- provides the mathematical framework for continuous time models in FE
- asset values are modeled by stochastic processes
- trading strategy values are modeled by stochastic integrals

Course syllabus
I: Introduction: financial engineering, binomial model
II: Background in probability: information and σ-algebras, independence, general conditional expectations, martingales, fundamental theorem of asset pricing
III: Brownian motion (BM): scaled random walks, definition of BM, distribution of BM, filtration for BM, martingale property of BM, quadratic variation
IV: Stochastic calculus: stochastic integral, Ito processes, Ito-Doeblin formula, Black-Scholes-Merton equation, multivariable stochastic calculus
V: Risk-neutral pricing: Girsanov's theorem, risk-neutral measure, martingale representation, fundamental theorems of asset pricing
VI: Miscellaneous topics (if time permits): dividends, forwards and futures, PDE pricing techniques.
1. The binomial model
Main goals
- Introduce some fundamental ideas and concepts in FE
- Introduce a mathematical model which later serves as a main building block for Brownian motion and continuous time models
Motivating problem: Pricing of options
Consider a financial asset S, e.g. a stock, which is traded on an exchange. Denote the market price of the asset at time t by S_t.
- Call option on the asset S: A derivative which gives the holder the right, but not the obligation, to buy the asset S at a future time T for a pre-arranged price K from the issuer (to exercise the option).
- Terminology: S underlying, K strike price, T maturity of the option.
- Value of the option at maturity?
  - If S_T ≤ K, the option is worthless.
  - If S_T > K, exercise the option (buy S for K) and sell S on the exchange for S_T, making a profit S_T − K.
  Thus, the option value at maturity is (S_T − K)^+ (the option payoff).
- Question: What is the value of the option prior to T?

Motivating problem: Pricing of options
Consider a financial asset S, e.g. a stock, which is traded on an exchange. Denote the market price of the asset at time t by S_t.
- Put option on the asset S: A derivative which gives the holder the right, but not the obligation, to sell the asset S at a future time T for a pre-arranged price K to the issuer (to exercise the option).
- Terminology: S underlying, K strike price, T maturity of the option.
- Value of the option at maturity?
  - If S_T ≥ K, the option is worthless.
  - If S_T < K, exercise the option (sell S for K) and buy S on the exchange for S_T, making a profit K − S_T.
  Thus, the option value at maturity is (K − S_T)^+ (the option payoff).
- Question: What is the value of the option prior to T?


The market
We consider a market with two assets:
- Risky asset (stock, commodity, foreign exchange rate, ...)
- Riskless asset (money market, bank account or bond with fixed interest rate)
To develop a concept of a fair value of financial assets and derivatives, throughout this course we make the standard assumption that agents may
- borrow and invest at the same interest rate in the money market
- take long and short positions in all traded assets
- trade without transaction costs and feedback effects on prices (frictionless market)
Simplest example: One-period binomial model
We assume that prices for the bond B_t and stock S_t are quoted at times t = 0 (today) and t = 1 (option maturity).
Suppose B_0 = B_1 = 1, S_0 = 100, and S_1 is a random variable taking values S_1 = 115 or S_1 = 90, each with probability p = 1/2.
Consider a call with strike K = 100. Value at time 0?
- First guess: Expectation of the payoff (S_1 − 100)^+? Gives
  E[(S_1 − 100)^+] = (1/2)(115 − 100)^+ + (1/2)(90 − 100)^+ = 7.5
- Turns out to be too expensive: A suitable investment strategy can generate the option payoff out of less capital!
- Indeed, suppose at time 0 we invest into α shares of bond and Δ shares of stock, and require that at time 1
  α + Δ · 115 = (115 − 100)^+,   (1.1)
  α + Δ · 90 = (90 − 100)^+.   (1.2)
  We find Δ = 0.6, α = −54. This requires α + Δ · 100 = 6 of initial capital and does the same job as the call.
The initial capital C_0 = 6 is the fair value at time 0 of the option in the above example.
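The one-period replication above can be checked directly; a minimal sketch in Python, with α and Δ denoting the bond and stock positions as in the text:

```python
# One-period binomial model: B_0 = B_1 = 1, S_1 = 115 or 90, call strike K = 100.
s_up, s_down, K, S0 = 115.0, 90.0, 100.0, 100.0

payoff_up = max(s_up - K, 0.0)        # (115 - 100)^+ = 15
payoff_down = max(s_down - K, 0.0)    # (90 - 100)^+  = 0

# Solve the linear system (1.1)-(1.2): alpha + delta*115 = 15, alpha + delta*90 = 0.
delta = (payoff_up - payoff_down) / (s_up - s_down)  # stock position
alpha = payoff_up - delta * s_up                     # bond position

C0 = alpha + delta * S0  # initial capital of the replicating strategy
print(delta, alpha, C0)  # 0.6 -54.0 6.0
```

Any other option price would allow an arbitrage against this strategy, as discussed next.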
Any price C_0 ≠ 6 for the option would introduce an arbitrage opportunity into the market:
- If C_0 > 6, the seller could make a certain profit of C_0 − 6 by selling the option and simultaneously employing the above investment strategy.
- If C_0 < 6, the buyer could make a certain profit of 6 − C_0 by buying the option and simultaneously employing the reverse strategy (Δ = −0.6, α = 54).
Such arbitrage opportunities (the possibility of a riskless profit without net investment of capital) are unrealistic in most markets.
Absence of arbitrage is a fundamental concept in financial market models and in the theory of derivative pricing.
Let C_1 = (S_1 − K)^+ denote the option payoff. We can rewrite (1.1), (1.2) in one equation as
  α + Δ S_1 = C_1,
which must hold for both possible values of S_1.
The fair option value at time 0 is the initial value of the above investment strategy
  α + Δ S_0 = C_0.
In the above example we have E[S_1] > S_0 and therefore E[C_1] > C_0.
Valuation via expectations. Now suppose we change the probabilities of the values of S_1 to p̃ and 1 − p̃ such that
  Ẽ[S_1] = p̃ · 115 + (1 − p̃) · 90 = S_0.
We have to take p̃ = 0.4. Then we obtain
  C_0 = α + Δ S_0 = α + Δ Ẽ[S_1] = Ẽ[α + Δ S_1] = Ẽ[C_1] = Ẽ[(S_1 − K)^+].
Conclusion:
- Our first guess "expectation of the payoff" works, but only under a new probability measure!
- The crucial property of the new measure is
  Ẽ[S_1] = S_0.
The new measure is known as a risk-neutral or pricing measure.
The concept of pricing assets by computing expectations under risk-neutral measures is another fundamental concept in financial engineering. We will later generalize this method to dynamic (multi-period) models using stochastic processes known as martingales.
There is a deep relationship between absence of arbitrage and
asset pricing via risk-neutral measures, known as the fundamental
theorem of asset pricing. We will formulate and discuss several
versions of this result in later chapters of the course.
The multi-period binomial model
- Discrete-time model: time t = 0, 1, ..., N
- Asset prices B_t (bond) and S_t (stock) at time t
- Bond price process:
  B_0 = 1,
  B_{t+1} = B_t (1 + r), t = 0, ..., N − 1,
  where r ≥ 0 is a constant interest rate.
  ⇒ B_t = (1 + r)^t for all t (no randomness ⇒ riskless asset).
- Stock price process: initial value S_0,
  S_{t+1} = S_t u with probability p, S_{t+1} = S_t d with probability 1 − p, for t = 0, ..., N − 1,
  where 0 < p < 1 and 0 < d < 1 + r < u.
  ⇒ S_t is a random variable for t ≥ 1 ⇒ risky asset
Pricing derivatives in the binomial model
- A European-type derivative on an underlying asset S is characterized by
  - a maturity date N
  - a payoff function h(S)
- The instrument pays to the holder h(S_N) at time N
- Examples:
  - call option with strike K: h(S) = (S − K)^+
  - put option with strike K: h(S) = (K − S)^+
The key to derivative pricing is the idea of payoff replication via self-financing trading strategies (sfts).
Definition 1.1
Consider a trading strategy which invests into α_t units of bond and Δ_t units of stock at time t = 0, 1, ..., N − 1, so that the portfolio value at each time t is
  X_t = α_t B_t + Δ_t S_t.   (1.3)
The strategy is self-financing if for any t = 0, 1, ..., N − 1
  X_{t+1} = α_{t+1} B_{t+1} + Δ_{t+1} S_{t+1} = α_t B_{t+1} + Δ_t S_{t+1}.   (1.4)
(1.4) says that the portfolio is rearranged at time t + 1 from (α_t, Δ_t) to (α_{t+1}, Δ_{t+1}) without inserting or withdrawing capital.
The value process X_t of a sfts is computed as follows:
  X_{t+1}/B_{t+1} − X_t/B_t = (α_t B_{t+1} + Δ_t S_{t+1})/B_{t+1} − (α_t B_t + Δ_t S_t)/B_t = Δ_t (S_{t+1}/B_{t+1} − S_t/B_t)
⇒ X_n/B_n = X_k/B_k + Σ_{t=k}^{n−1} Δ_t (S_{t+1}/B_{t+1} − S_t/B_t)   (1.5)
Note:
- The value process X_n at any time n > k is controlled by the portfolio value X_k at initial time k and the stock positions Δ_k, ..., Δ_{n−1}. (Bond positions are determined via (1.3).)
- X_n is a random variable which depends on the (random) values of the stochastic process S_t, t = 0, ..., n.
Replication via a sfts
- goal: find a sfts with X_N = h(S_N)
- recursive procedure: Suppose we have a function h_{t+1}(S_{t+1}). Can we find h_t(S_t) and Δ_t(S_t) such that the one-period sfts with initial value h_t(S_t) and stock position Δ_t(S_t) from t to t + 1 yields the value h_{t+1}(S_{t+1}) at t + 1? By (1.5),
  h_{t+1}(S_{t+1})/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_{t+1}/B_{t+1} − S_t/B_t)   (1.6)
  must hold for any values of the random variables S_t, S_{t+1}.
Given S_t there are two possibilities for S_{t+1}:
  h_{t+1}(S_t u)/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_t u/(B_t (1+r)) − S_t/B_t),
  h_{t+1}(S_t d)/B_{t+1} = h_t(S_t)/B_t + Δ_t(S_t) (S_t d/(B_t (1+r)) − S_t/B_t).
Solving for Δ_t(S_t), h_t(S_t) gives
  Δ_t(S_t) = (h_{t+1}(S_t u) − h_{t+1}(S_t d)) / (S_t (u − d)),
  h_t(S_t)/B_t = (1/B_{t+1}) (p̃ h_{t+1}(S_t u) + (1 − p̃) h_{t+1}(S_t d)),
where p̃ is defined by
  p̃ S_t u/(B_t (1+r)) + (1 − p̃) S_t d/(B_t (1+r)) = S_t/B_t  ⇔  p̃ = (1 + r − d)/(u − d).   (1.7)
In summary we obtain
Theorem 1.2
For a derivative with payoff h(S_N) at time N, define recursively functions h_N, ..., h_0 via
  h_N(S_N) = h(S_N),   (1.8)
  h_t(S_t)/B_t = (1/B_{t+1}) (p̃ h_{t+1}(S_t u) + (1 − p̃) h_{t+1}(S_t d))   (1.9)
for t = N − 1, ..., 0 and p̃ in (1.7). Then there exists a sfts with value process X_t = h_t(S_t), t = 0, ..., N, and stock positions
  Δ_t(S_t) = (h_{t+1}(S_t u) − h_{t+1}(S_t d)) / (S_t (u − d))   (1.10)
for t = 0, ..., N − 1.
We say we can replicate the payoff h(S_N) from initial capital h_k(S_k) at time k within the binomial model, using the sfts (1.10). We call X_k the fair price of the derivative at time k.
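The recursion (1.8)-(1.9) of Theorem 1.2 translates directly into a backward-induction algorithm; a sketch in Python (the parameter values in the usage line are illustrative, except for the one-period example from earlier in the chapter):

```python
# Backward induction of Theorem 1.2: set h_N(s) = h(s), then apply (1.9) for t = N-1,...,0.
def binomial_price(h, S0, u, d, r, N):
    """Fair price h_0(S_0) of a European payoff h(S_N) in the binomial model."""
    p = (1 + r - d) / (u - d)  # risk-neutral probability (1.7)
    # values h_N at the N+1 terminal nodes S_0 u^k d^(N-k), k = 0,...,N
    values = [h(S0 * u**k * d**(N - k)) for k in range(N + 1)]
    for t in range(N - 1, -1, -1):
        # one step of (1.9); discounting by B_t/B_{t+1} = 1/(1+r)
        values = [(p * values[k + 1] + (1 - p) * values[k]) / (1 + r)
                  for k in range(t + 1)]
    return values[0]

# one-period call from the earlier example: should give (approximately) 6.0
price = binomial_price(lambda s: max(s - 100.0, 0.0), S0=100.0, u=1.15, d=0.90, r=0.0, N=1)
print(price)
```

Each pass of the loop shortens the list of node values by one, so the total work is O(N^2), matching the complexity remark below.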
Note:
- The complexity of the pricing algorithm in Theorem 1.2 grows with N; one can show it is O(N^2).
- For more complex derivatives than European types, it may become exponential in N in the worst case.
Questions:
- Can we construct a (tractable) continuous time limit of the binomial model for N → ∞?
- If so, what about trading strategies, replication, and pricing in such a model?
Answers are provided by probability theory and stochastic calculus.
2. Background in probability theory
Main goals
- to introduce concepts from probability theory on infinite spaces which are needed for continuous time models (probability measures, σ-algebras, independence, general conditional expectations)
- to further analyze the binomial model using elementary probability theory
Probability spaces
Definition 2.1
A probability space consists of a set Ω, a family F of subsets of Ω (σ-algebra), and a function P (probability measure) which assigns to every A ∈ F a probability P[A] ∈ [0, 1]. For F we demand
a) Ω ∈ F
b) if A ∈ F, then the complement A^c ∈ F
c) if a sequence of events A_1, A_2, ... ∈ F, then ∪_{j=1}^∞ A_j ∈ F.
For P we demand
a) P[Ω] = 1
b) if A_1, A_2, ... ∈ F are disjoint, then P[∪_{j=1}^∞ A_j] = Σ_{j=1}^∞ P[A_j].
Elements of F are called events.
Remark: If A_1, A_2, ... ∈ F, then ∩_{j=1}^∞ A_j = (∪_{j=1}^∞ A_j^c)^c ∈ F.
Also for A_1, ..., A_n ∈ F we have ∪_{j=1}^n A_j ∈ F and ∩_{j=1}^n A_j ∈ F (take A_j = ∅ for j > n in c)).
Interpretation
(Ω, F, P) is a model for a random experiment.
- Ω is the set of possible outcomes ω of the random experiment.
- F models the information obtained from observing the result of the experiment. It contains the sets of outcomes which we can distinguish by the observation.
  The observation may not tell us the exact outcome ω of the experiment, but for any set A ∈ F we can tell whether or not ω ∈ A.
- For any A ∈ F, the number P[A] is the probability with which we expect ω to be in A.
Example 2.2
Finite independent coin-toss space: Let
  Ω_N = {ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N},
F = {A | A ⊆ Ω_N}, and P[A] = |A| / 2^N for all A ∈ F.
Example 2.3 (Another probability measure)
Finite independent coin-toss space (unfair coin), binomial model: Let p ∈ (0, 1) and
  Ω_N = {ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N},
F = {A | A ⊆ Ω_N}, and P[ω] = p^{#H(ω_1...ω_N)} (1 − p)^{#T(ω_1...ω_N)} for all ω ∈ Ω_N. This defines a probability on F by setting
  P[A] = Σ_{ω∈A} P[ω].
It is easy to check the axioms for F and P in Definition 2.1.
The power set F = {A | A ⊆ Ω} is a σ-algebra for every space Ω. However this is not always the appropriate choice:
- For infinite spaces Ω, it is not always possible to put a reasonable probability measure on the power set, see Example 2.4 below.
- In some situations smaller σ-algebras model the relevant information, see Examples 2.31 and 2.33.
The trivial σ-algebra F_0 = {∅, Ω} also satisfies the axioms for every space Ω. This corresponds to the situation where no information is available on the outcome of the random experiment.
Example 2.4
Infinite independent coin-toss space: Let
  Ω_∞ = {ω = (ω_1 ω_2 ...) | ω_i ∈ {H, T}, i ≥ 1}.
- F = {A | A ⊆ Ω_∞} is a σ-algebra, but it is impossible to construct a meaningful probability measure on F.
- For x ∈ Ω_n let A_x denote the set of all sequences ω ∈ Ω_∞ beginning with x. We want P[A_x] = 1/2^n for our measure P. But this implies
  P[{ω_1 ω_2 ...}] = P[A_{ω_1} ∩ A_{ω_1ω_2} ∩ ...] = lim_{n→∞} P[A_{ω_1...ω_n}] = lim_{n→∞} 1/2^n = 0
  (see [Shr04], Theorem A.1.1.), i.e. individual ω have probability zero.
- We cannot build P from the bottom up, starting with elements ω ∈ Ω_∞.
Idea: Top down approach.
1) For each n ≥ 1, let A_n consist of the sets A_x with x ∈ Ω_n. Examples:
  A_1 = {A_H, A_T},
  A_2 = {A_HH, A_HT, A_TH, A_TT}, etc.
2) Collect all sets in A_1, A_2, ..., and add all sets required to make the collection a σ-algebra. We call the result F_∞. Note that F_∞ ⊆ F.
3) It can be shown¹ that specifying the probability on the sets in A_1, A_2, ... (via P[A_x] = 1/2^n for any x ∈ Ω_n) uniquely determines a probability measure P on F_∞.
Note: The same construction works for p ∈ (0, 1) and P[A_x] = p^{#H(x)} (1 − p)^{#T(x)}.
¹ Carathéodory's extension theorem, see e.g. [Dur95], Theorem A.1.1.
The last example shows that for infinite probability spaces, P[A] = 1 does not necessarily imply A = Ω.
Definition 2.5
Let (Ω, F, P) be a probability space. If A ∈ F satisfies P[A] = 1, we say that A occurs almost surely (a.s.).
Random variables and distributions
Definition 2.6
Let (Ω, F, P) be a probability space. A function X : Ω → R is called F-measurable if
  {X ≤ b} := {ω ∈ Ω | X(ω) ≤ b} ∈ F
for all b ∈ R. We also say X is a random variable on (Ω, F).
If F is the power set, then every function on Ω is F-measurable.
If X and Y are F-measurable, then f(X, Y) is also F-measurable for every reasonable² function f.
The distribution function F_X of X is
  F_X(x) = P[X ≤ x].
² The function f must be Borel-measurable. Every function f : R² → R we shall ever encounter is Borel-measurable.
Take the intervals [a, b] and all subsets of R required to make the collection a σ-algebra (called the Borel σ-algebra B(R) on R).
Let X be a random variable on (Ω, F, P). Then {X ∈ B} ∈ F for every B ∈ B(R). The distribution μ_X of X under P is defined by
  μ_X(B) := P[X ∈ B], B ∈ B(R).
μ_X defines a probability measure on (R, B(R)).
The distribution determines the distribution function, and vice versa.
Example 2.7 (Stock price in binomial model)
Consider the binomial model (Ω_N, F, P) (Example 2.2) and let
  S_t(ω) = S_0 u^{#H(ω_1...ω_t)} d^{#T(ω_1...ω_t)}
for t = 0, ..., N. Choose for instance S_0 = 1 and u = 2, d = 1/2. Then the distribution μ_{S_2} of S_2 under P is determined by
  μ_{S_2}({4}) = μ_{S_2}({1/4}) = 1/4,  μ_{S_2}({1}) = 1/2.
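The distribution in Example 2.7 can be recovered by enumerating Ω_2; a small sketch using exact rational arithmetic:

```python
from itertools import product
from fractions import Fraction

# Example 2.7 with S_0 = 1, u = 2, d = 1/2: enumerate all omega in Omega_2.
u, d = Fraction(2), Fraction(1, 2)
dist = {}
for omega in product("HT", repeat=2):
    s2 = u**omega.count("H") * d**omega.count("T")     # S_2(omega) with S_0 = 1
    dist[s2] = dist.get(s2, Fraction(0)) + Fraction(1, 4)  # P[omega] = 1/4

print(dist)  # mass 1/4 at 4, mass 1/2 at 1, mass 1/4 at 1/4
```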
Many non-discrete random variables X have a density, i.e., a function f_X ≥ 0 on R such that
  μ_X(B) = ∫_B f_X(x) dx for all B ∈ B(R).
In particular, the distribution function F_X satisfies
  F_X(b) = μ_X((−∞, b]) = ∫_{−∞}^b f_X(x) dx
and so
  f_X(x) = F_X'(x).
Example 2.8
Let a < b and suppose μ_X(B) = P[X ∈ B] = ∫_B (1/(b−a)) I_{[a,b]}(x) dx. Then X has uniform distribution on [a, b]. Its density is
  f(x) = (1/(b−a)) I_{[a,b]}(x).
Example 2.9
We say that X has exponential distribution with parameter λ if
  F_X(x) = P[X ≤ x] = 1 − e^{−λx}
for x ≥ 0, and F_X(x) = 0 for x < 0. Then X has density
  f_X(x) = (d/dx)(1 − e^{−λx}) = λ e^{−λx}
for x ≥ 0 and f_X(x) = 0 for x < 0.
Example 2.10
Suppose P[X ≤ x] = Φ(x) := ∫_{−∞}^x φ(z) dz with φ(x) = (1/√(2π)) e^{−x²/2}. We say X has standard normal distribution. X has density φ.
Expectations
Let X be a random variable on a probability space (Ω, F, P). The elementary definition of the expectation
  E[X] = Σ_{ω∈Ω} X(ω) P[ω]
cannot be used if Ω is uncountably infinite (as in Ex. 2.4).
- If X can only assume finitely many different values, we can write X = Σ_{i=1}^k x_i I_{A_i} with A_i ∈ F, using the indicator function
  I_A(ω) = 1 if ω ∈ A, 0 if ω ∉ A.
  Such a random variable is called a simple function. We then define
  E[X] = Σ_{i=1}^k x_i E[I_{A_i}] = Σ_{i=1}^k x_i P[A_i].
- If X ≥ 0, let X_n (n ≥ 1) be any sequence³ of simple functions such that X_n ↗ X (n → ∞) a.s. Then define
  E[X] = lim_{n→∞} E[X_n].
  One can show that this limit always exists in [0, ∞], and does not depend on the choice of the approximating sequence.
- Finally for a general random variable X, note that X^+ = max{X, 0} and X^− = max{−X, 0} are nonnegative and satisfy X = X^+ − X^−. So we define
  E[X] = E[X^+] − E[X^−],
  provided that E[X^+], E[X^−] < ∞. In this case we say that X is integrable.
³ There always exists such a sequence, e.g. X_n = Σ_{j=0}^{n2^n} (j/2^n) I_{{j/2^n ≤ X < (j+1)/2^n}}.
Theorem 2.11
a) X is integrable if and only if E[|X|] < ∞.
b) Linearity: For α, β ∈ R and r.v.s X, Y
  E[αX + βY] = αE[X] + βE[Y].
c) Monotonicity: If X ≤ Y a.s., then E[X] ≤ E[Y].
Now let X and X_1, X_2, ... be random variables on (Ω, F, P).
Theorem 2.12
a) Monotone convergence: If 0 ≤ X_n ↗ X (n → ∞) a.s., then
  lim_{n→∞} E[X_n] = E[X].
b) Dominated convergence: If X_n → X (n → ∞) a.s. and |X_n| ≤ Y a.s. for all n and some integrable Y, then
  lim_{n→∞} E[X_n] = E[X].
c) Fatou's lemma: If X_n ≥ 0 a.s. for all n, then
  E[liminf_{n→∞} X_n] ≤ liminf_{n→∞} E[X_n].
Proofs: See e.g. [Dur95], Theorems A.4.7, A.5.4, A.5.5, A.5.6.
Example 2.13
On Ω_∞ in Example 2.4 let X_n(ω) = 1 if ω_n = H, 0 if ω_n = T, and define X = Σ_{n=1}^∞ X_n/2^n. By monotone convergence, we have
  E[X] = Σ_{n=1}^∞ E[X_n/2^n] = Σ_{n=1}^∞ (1/2)(1/2^n) = 1/2.
We can also compute the distribution of X. We verify that
  P[k/2^n ≤ X ≤ (k+1)/2^n] = 1/2^n
for n = 1, 2, ... and all intervals [k/2^n, (k+1)/2^n] ⊆ [0, 1]. This implies
  μ_X([a, b]) = b − a
for all a, b ∈ [0, 1] of the form k/2^n. A limit argument then yields the equation for all a, b ∈ [0, 1]. Hence μ_X is the uniform distribution on [0, 1].
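Example 2.13 can be illustrated by simulation; a sketch that truncates the series at 30 tosses (the truncation error is below 2^-30, so it does not affect the statistics shown):

```python
import random

# X = sum_n X_n / 2^n, where X_n = 1 if the n-th toss is heads, else 0.
random.seed(0)

def sample_X(n_tosses=30):
    return sum(random.getrandbits(1) / 2**n for n in range(1, n_tosses + 1))

samples = [sample_X() for _ in range(100_000)]
mean = sum(samples) / len(samples)
frac_below_half = sum(x <= 0.5 for x in samples) / len(samples)
print(mean, frac_below_half)  # both close to 1/2, as for the uniform distribution on [0, 1]
```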
Computing expectations
Theorem 2.14
Let X be a random variable on (Ω, F, P), and g a function on R. If g(X) is integrable, then
  E[g(X)] = ∫_R g(x) dμ_X(x).
Theorem 2.15
Let X be a random variable on (Ω, F, P) with density f, and g a function on R. If g(X) is integrable, then
  E[g(X)] = ∫_{−∞}^∞ g(x) f(x) dx.
Proofs: See [Shr04], Theorems 1.5.1. and 1.5.2.
Let X be a random variable with E[X²] < ∞. The variance of X is
  Var[X] = E[(X − E[X])²] = E[X²] − E[X]².
The standard deviation of X is √(Var[X]).
Example 2.16
Suppose X has standard normal distribution. Then
  E[X] = ∫_{−∞}^∞ x φ(x) dx = 0,
  Var[X] = E[X²] = ∫_{−∞}^∞ x² φ(x) dx = 1.
Example 2.17
Suppose X has uniform distribution on [0, 1]. Its density is f(x) = I_{[0,1]}(x), and so
  E[X] = ∫_0^1 x dx = 1/2.
Independence
Definition 2.18
Let (Ω, F, P) be a probability space.
- Sets A_1, ..., A_n ∈ F are independent if
  P[A_1 ∩ ... ∩ A_n] = P[A_1] ··· P[A_n].
- Random variables X_1, ..., X_n on (Ω, F, P) are independent if
  P[X_1 ≤ a_1, ..., X_n ≤ a_n] = P[X_1 ≤ a_1] ··· P[X_n ≤ a_n]
  for all a_1, ..., a_n ∈ R.
Infinitely many sets A_1, A_2, ... are independent if the sets A_1, ..., A_n are independent for every n ∈ N. Infinitely many r.v. X_1, X_2, ... are independent if X_1, ..., X_n are independent for every n ∈ N.
Remark: If X_1, ..., X_n are independent, then also
  P[X_1 ∈ B_1, ..., X_n ∈ B_n] = P[X_1 ∈ B_1] ··· P[X_n ∈ B_n]
for any sets B_1, ..., B_n ∈ B(R).
The joint distribution function of X, Y is
  F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y].
If there exists a function f_{X,Y} ≥ 0 such that
  F_{X,Y}(x, y) = ∫_{−∞}^x (∫_{−∞}^y f_{X,Y}(u, v) dv) du,
then X, Y have a joint density f_{X,Y}.
Theorem 2.19
Let X, Y be random variables. The following are equivalent.
(i) X and Y are independent.
(ii) The joint distribution function of X and Y factors:
  F_{X,Y}(x, y) = F_X(x) F_Y(y) for all x, y ∈ R.
If X, Y have a joint density, then (i) and (ii) are equivalent to
(iii) The joint density of X and Y factors:
  f_{X,Y}(x, y) = f_X(x) f_Y(y) for all x, y ∈ R.
Theorem 2.20
Let X, Y be independent random variables and f, g functions. Then f(X) and g(Y) are independent.
Proofs: [Shr04] Theorems 2.2.5. and 2.2.7.
Remark: The last results can be generalized to any number of random variables, see [Res99] Theorem 4.2.1 and Lemma 4.4.1.
Example 2.21
Let Ω_N be the independent coin toss space in Example 2.3 and
  Y_t(ω) = u if ω_t = H, d if ω_t = T,
for t = 1, ..., N. One checks
  P[Y_1 ≤ a_1, ..., Y_N ≤ a_N] = P[Y_1 ≤ a_1] ··· P[Y_N ≤ a_N]
for all a_1, ..., a_N ∈ R: if any a_t < d then both sides are 0; if all a_t ≥ d then both sides are (1 − p)^{#{t | a_t < u}}.
Example 2.22
Let Ω_∞ be the infinite independent coin toss space and Y_1, Y_2, ... as in Example 2.21. Then the random variables Y_1, Y_2, ... are independent.
Example 2.23
Let Ω_N be the independent coin toss space, Y_t, t = 1, ..., N as before, and S_t = S_0 ∏_{i=1}^t Y_i the stock price in the binomial model. Then for any 0 ≤ k < n ≤ N, the random variables S_k and S_n/S_k are independent.
Example 2.24
Let X_1, ..., X_n be independent standard normal random variables. Then their joint density is
  f_{X_1...X_n}(x_1, ..., x_n) = (1/(2π)^{n/2}) e^{−(x_1² + ... + x_n²)/2},  x_1, ..., x_n ∈ R,
the product of the marginal densities f_{X_i}(x_i) = (1/√(2π)) e^{−x_i²/2}.
Theorem 2.25 (Strong law of large numbers)
Let X_1, X_2, ... be independent and identically distributed (iid) random variables with E[|X_i|] < ∞ and E[X_i] = μ ∈ R. Then
  lim_{n→∞} (1/n) Σ_{i=1}^n X_i = μ  a.s.
Proof: [Dur95] Theorem 1.7.1.
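The strong law is easy to see in simulation; a sketch for coin-toss returns Y_i ∈ {u, d} with a fair coin (the values u = 2, d = 1/2 are illustrative, borrowed from Example 2.7):

```python
import random

# SLLN: the running average of iid Y_i converges to mu = E[Y_i] = (u + d)/2.
random.seed(1)
u, d = 2.0, 0.5
mu = (u + d) / 2  # 1.25

n = 200_000
running_sum = 0.0
for _ in range(n):
    running_sum += u if random.random() < 0.5 else d
average = running_sum / n
print(average)  # close to mu = 1.25 for large n
```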
Theorem 2.26
Let X, Y be independent and integrable. Then
  E[XY] = E[X] E[Y].
Proof: [Shr04] Theorem 2.2.7 (vi).
Corollary 2.27
Let X_1, ..., X_n be independent random variables with E[X_i²] < ∞ for all i. Then
  Var[X_1 + ... + X_n] = Var[X_1] + ... + Var[X_n].
Dependence. The simplest way to measure dependence of random variables is via covariance and correlation.
Definition 2.28
Let X, Y have second moments. The covariance of X and Y is
  Cov[X, Y] = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X] E[Y].
The correlation of X and Y is
  Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).
The Cauchy-Schwarz inequality implies that Corr[X, Y] ∈ [−1, 1].
Definition 2.29
Let X = (X_1, ..., X_d)^⊤ be a random vector with E[‖X‖²] < ∞. Define the covariance matrix Cov[X] of X by Cov[X]_{ij} = Cov[X_i, X_j].
Note Cov[X] = E[(X − E[X])(X − E[X])^⊤] (here ^⊤ is transpose).
This implies Cov[BX + b] = B Cov[X] B^⊤ for B ∈ R^{k×d} and b ∈ R^k.
Example 2.30 (Multivariate normal distribution)
Let X = (X_1, ..., X_d)^⊤ be a vector of random variables. X has multivariate normal distribution with mean μ ∈ R^d and covariance matrix Σ ∈ R^{d×d}, written X ~ N(μ, Σ), if
  X = μ + AZ,
where Z = (Z_1, ..., Z_d)^⊤ with Z_1, ..., Z_d independent standard normal random variables and A ∈ R^{d×d} with AA^⊤ = Σ.
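The construction X = μ + AZ of Example 2.30, together with the rule Cov[BX + b] = B Cov[X] B^⊤ from Definition 2.29, can be checked numerically; a sketch using a Cholesky factor as one choice of A (the particular μ and Σ are illustrative):

```python
import numpy as np

# X = mu + A Z with A A^T = Sigma gives Cov[X] = A Cov[Z] A^T = A I A^T = Sigma.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
A = np.linalg.cholesky(Sigma)          # one choice of A with A A^T = Sigma

Z = rng.standard_normal((100_000, 2))  # rows of independent standard normals
X = mu + Z @ A.T                       # rows are samples of N(mu, Sigma)

print(X.mean(axis=0))                  # close to mu
print(np.cov(X, rowvar=False))         # close to Sigma
```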
Information and σ-algebras
Take a probability space Ω. Let X_i (i ∈ I) be functions on Ω. The smallest σ-algebra which contains all sets {X_i ≤ b} where b ∈ R and i ∈ I is called the σ-algebra generated by the X_i. It is denoted by σ(X_i | i ∈ I). This is the smallest σ-algebra for which each X_i is measurable.
Interpretation: σ(X_i | i ∈ I) models the (minimal) information required to determine the values of the X_i.
Example 2.31 (Binomial model with partial information I)
Let Ω_N = {ω = (ω_1 ... ω_N) | ω_i ∈ {H, T}, i = 1, ..., N} with F = {A | A ⊆ Ω_N}, and S_t = S_0 Y_1 ··· Y_t with the Y_j from Example 2.21. For each n ≤ N we define
  F_n = σ(Y_1, ..., Y_n) = σ(S_1, ..., S_n).
We say F_n is the information available in the market at time n.
Usually σ-algebras are defined via generators. In some simple cases one can explicitly specify the list of sets in the σ-algebra:
Example 2.32
Let Ω_N as before. Fix n ≤ N, and for each x ∈ Ω_n let A_x be the set of all ω ∈ Ω_N beginning with x. Then F_n is the family of all sets that can be generated by taking unions of sets A_x with x ∈ Ω_n. We find
  F_1 = {∅, Ω_N, A_H, A_T},
  F_2 = {∅, Ω_N, A_HH, A_HT, A_TH, A_TT, A_H, A_T, A_HH ∪ A_TT, A_HT ∪ A_TT, A_TH ∪ A_HH, A_TH ∪ A_HT, A_HH ∪ A_HT ∪ A_TH, A_HH ∪ A_HT ∪ A_TT, A_HH ∪ A_TH ∪ A_TT, A_HT ∪ A_TH ∪ A_TT}.
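The enumeration of F_2 in Example 2.32 can be reproduced mechanically by forming all unions of the four atoms A_x, x ∈ Ω_2; a sketch with N = 3, so that each atom contains two sequences:

```python
from itertools import product, combinations

# Example 2.32 with N = 3: the atoms A_x, x in Omega_2, and all their unions form F_2.
omega_N = [''.join(w) for w in product("HT", repeat=3)]
atoms = {x: frozenset(w for w in omega_N if w.startswith(x))
         for x in (''.join(p) for p in product("HT", repeat=2))}

F2 = set()
for k in range(len(atoms) + 1):
    for combo in combinations(atoms.values(), k):
        F2.add(frozenset().union(*combo))  # union of k atoms (empty union = empty set)

print(len(F2))  # 2^4 = 16 sets, from the empty set up to Omega_N
```

Since the four atoms are disjoint and nonempty, all 2^4 unions are distinct, which is why F_2 has exactly 16 elements.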
Example 2.33 (Binomial model with partial information II)
Let Ω_N and F as before, and let X(ω) = #H(ω). Then σ(X) is the information obtained from observing only the total number of heads. We have σ(X) = σ(S_N).
Remark. For a single random variable X, the family σ(X) consists of all sets {X ∈ B} with B ∈ B(R).
Definition 2.34
Let (Ω, F, P) be a probability space.
- Sub-σ-algebras G_1, ..., G_n of F are independent if
  P[A_1 ∩ ... ∩ A_n] = P[A_1] ··· P[A_n] for all A_i ∈ G_i.
- A random variable X is independent of a σ-algebra G if σ(X) and G are independent.
Example 2.35
For the random variables Y_1, Y_2, ... on Ω_∞ let F_n = σ(Y_1, ..., Y_n). Then the Y_t for t ≥ n + 1 are independent of F_n.
Conditional expectations and martingales
- Let X be a random variable on (Ω, F, P), which we interpret as a model for a random experiment. E[X] is the estimate for the value of X(ω).
- Now let A ∈ F with P[A] > 0, and suppose we know that the outcome ω of the random experiment will be in A. What is our estimate for X in this case?
The conditional expectation of X given A is
  E[X | A] = E[X I_A] / P[A].
Example: Take the binomial model (cf. Example 2.2), where
  S_n = S_0 ∏_{i=1}^n Y_i and E[S_n] = S_0 ∏_{i=1}^n E[Y_i] = S_0 ((u+d)/2)^n.
Now suppose we know the results of the first two coin tosses, or the values of Y_1, Y_2, or the information in F_2. We find
  E[S_n | A_HH] = E[(S_0 ∏_{i=1}^n Y_i) I_{Y_1=Y_2=u}] / P[A_HH] = S_0 u² ((u+d)/2)^{n−2},
  E[S_n | A_HT] = S_0 ud ((u+d)/2)^{n−2},
  E[S_n | A_TH] = S_0 du ((u+d)/2)^{n−2},
  E[S_n | A_TT] = S_0 d² ((u+d)/2)^{n−2}.
Note: On each of the individual sets, the conditional expectations are equal to the value of the random variable S_2 ((u+d)/2)^{n−2}. Thus
  E[S_n I_A] = E[S_2 ((u+d)/2)^{n−2} I_A]
for any set A ∈ {A_HH, A_HT, A_TH, A_TT}, and hence for all A ∈ F_2.
In summary, the random variable S_2 ((u+d)/2)^{n−2}
- is F_2-measurable
- for any A ∈ F_2 satisfies E[S_n I_A] = E[S_2 ((u+d)/2)^{n−2} I_A].
It is an estimate of S_n based on the information in F_2.
Definition 2.36
Let X be an integrable random variable on (Ω, F, P) and G be a sub-σ-algebra of F. A conditional expectation of X given G is any random variable, denoted E[X | G], that satisfies
(i) (Measurability) E[X | G] is G-measurable,
(ii) (Partial averaging) E[X I_A] = E[E[X | G] I_A] for any A ∈ G.
In the example above, E[S_n | F_2] = S_2 ((u+d)/2)^{n−2}.
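The two conditions of Definition 2.36 can be verified by brute force in the finite coin-toss model; a sketch checking partial averaging for the candidate E[S_n | F_2] = S_2 ((u+d)/2)^{n−2} over the atoms of F_2 (the choices n = 4, u = 2, d = 1/2, fair coin are illustrative):

```python
from itertools import product

# Check E[S_n I_A] = E[g I_A] for g = S_2 ((u+d)/2)^(n-2) on every atom A of F_2.
n, u, d, S0 = 4, 2.0, 0.5, 1.0
omegas = list(product("HT", repeat=n))   # Omega_n; each omega has probability 2^-n
prob = 1.0 / len(omegas)

def S(t, omega):
    heads = omega[:t].count("H")
    return S0 * u**heads * d**(t - heads)

g = lambda omega: S(2, omega) * ((u + d) / 2) ** (n - 2)  # candidate E[S_n | F_2]

for x in product("HT", repeat=2):        # the atoms A_x of F_2
    lhs = sum(S(n, w) * prob for w in omegas if w[:2] == x)
    rhs = sum(g(w) * prob for w in omegas if w[:2] == x)
    assert abs(lhs - rhs) < 1e-12
print("partial averaging holds on all atoms of F_2")
```

Measurability holds by construction, since g depends on ω only through its first two tosses.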
A conditional expectation always exists ([Shr04], Theorem B.1). It is unique in the following sense. Let Y and Z be conditional expectations of X given G. Then A = {Y − Z > 0} ∈ G by (i), and so by (ii)
  E[(Y − Z) I_A] = E[Y I_A] − E[Z I_A] = E[X I_A] − E[X I_A] = 0,
which implies P[Y − Z > 0] = 0. Reversing the roles of Y and Z, we obtain P[Z − Y > 0] = 0. Hence Y = Z a.s.
As for the expectation we have
Theorem 2.37
a) Linearity: For α, β ∈ R and r.v.s X, Y
  E[αX + βY | G] = αE[X | G] + βE[Y | G].
b) Monotonicity: If X ≤ Y a.s., then E[X | G] ≤ E[Y | G].
c) Jensen's inequality: If φ is a convex function then
  E[φ(X) | G] ≥ φ(E[X | G]).
Theorem 2.38
a) Taking out measurable factors: If X is G-measurable,
  E[XY | G] = X E[Y | G].
b) Iterated conditioning: If H ⊆ G is a sub-σ-algebra, then
  E[E[X | G] | H] = E[X | H].
c) Independence: Let X = (X_1, ..., X_d) be G-measurable and Y = (Y_1, ..., Y_e) be independent of G. Then for any function f
  E[f(X, Y) | G] = g(X),
where g(x) := E[f(x, Y)] for all x ∈ R^d.
Special cases:
- If X is G-measurable then E[X | G] = X.
- If Y is independent of G then E[Y | G] = E[Y].
- Partial averaging with A = Ω yields E[E[X | G]] = E[X].
- If H = {∅, Ω}, then E[X | H] = E[X].
Proofs: [Shr04] Theorem 2.3.2, and [Res99] equation (10.17).
Examples. Let S_t = S_0 Y_1 ··· Y_t for t ≥ 0 be the binomial model with independent Y_1, Y_2, ... and P[Y_j = u] = P[Y_j = d] = 1/2, and let F_t be the σ-algebra in Example 2.31.
Since Y_{t+1}, Y_{t+2}, ... are independent of F_t, any function of Y_{t+1}, Y_{t+2}, ... is independent of F_t.
1) For t ≥ 0
  E[(S_{t+1} − S_t)/S_t | F_t] = E[Y_{t+1} − 1 | F_t] = E[Y_{t+1} − 1] = (u+d)/2 − 1.
2) Let M_t = max_{j=0,...,t} S_j. For t ≥ 0
  E[M_{t+1} | F_t] = E[max(M_t, S_t Y_{t+1}) | F_t]
  = E[max(m, s Y_{t+1})] |_{(m,s)=(M_t,S_t)}
  = ((1/2) max(m, su) + (1/2) max(m, sd)) |_{(m,s)=(M_t,S_t)}
  = (1/2) max(M_t, S_t u) + (1/2) max(M_t, S_t d).
3) Let X, Z be random variables. If G = σ(Z) we write E[X | σ(Z)] = E[X | Z]. In this case there exists a function f with
  E[X | Z] = f(Z).
This follows from
Theorem 2.39
Let Y and Z be random variables. If Y is σ(Z)-measurable, then Y = f(Z) for some function f.
Example: Let X, Y be independent N(0, 1) distributed and Z = X² + Y². We want to compute E[|X| | Z] = f(Z). That is, find f. Now
  E[|X| I_{Z≤z}] = E[f(Z) I_{Z≤z}] = ∫_0^z f(u) (1/2) e^{−u/2} du
by partial averaging (Z has density (1/2) e^{−u/2} on [0, ∞)). The left hand side is
  ∫∫ |x| I_{x²+y²≤z} (1/(2π)) e^{−(x²+y²)/2} dx dy
  = ∫_0^{2π} ∫_0^∞ |r cos θ| I_{r²≤z} (1/(2π)) e^{−r²/2} r dr dθ
  = 4 ∫_0^{√z} (1/(2π)) e^{−r²/2} r² dr.
Differentiating both sides w.r.t. z yields f(z) = (2/π) √z.
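The closed form f(z) = (2/π)√z can be sanity-checked by Monte Carlo, comparing both sides of the partial-averaging identity E[|X| I_{Z≤z}] = E[f(Z) I_{Z≤z}] for a few values of z; a sketch:

```python
import numpy as np

# Check E[|X| 1_{Z<=z}] = E[f(Z) 1_{Z<=z}] for f(z) = (2/pi) sqrt(z), Z = X^2 + Y^2.
rng = np.random.default_rng(42)
X = rng.standard_normal(1_000_000)
Y = rng.standard_normal(1_000_000)
Z = X**2 + Y**2
f = lambda z: (2.0 / np.pi) * np.sqrt(z)

for z in (0.5, 1.0, 4.0):
    lhs = np.mean(np.abs(X) * (Z <= z))
    rhs = np.mean(f(Z) * (Z <= z))
    assert abs(lhs - rhs) < 0.005  # agreement up to Monte Carlo error
print("partial averaging consistent with f(z) = (2/pi) sqrt(z)")
```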
Definition 2.40
a) A filtration on a space Ω is a family of σ-algebras (F_t)_{t=0,1,2,...,N} (discrete time) or (F_t)_{t∈[0,T]} (continuous time) such that F_s ⊆ F_t for all s ≤ t.
b) A stochastic process (X_t)_{t∈[0,T]} is a family of random variables X_t indexed with time t. We say that (X_t)_{t∈[0,T]} is adapted to the filtration (F_t)_{t∈[0,T]} if X_t is F_t-measurable for each t.
In discrete time, we replace [0, T] with {0, 1, ..., N}.
Intuition: F_t models the information available at time t. A filtration models the flow of information.
Adapted processes are obtained as follows. For a given stochastic process (X_t), let F_t = σ(X_s | s ≤ t). This is called the filtration generated by the process (X_t). Clearly (X_t) is adapted to (F_t). We can then construct further adapted processes from (X_t).
Example: Let Ω_∞ be the infinite coin toss space and F_t = σ(Y_1, ..., Y_t) for t = 0, 1, 2, ... as in Example 2.31.
- (Y_t)_{t=0,1,2,...} is adapted.
- (S_t)_{t=0,1,2,...} with S_t = S_0 Y_1 ··· Y_t is adapted.
- For any sequence of deterministic functions f_t(·), t = 0, 1, 2, ..., the process (f_t(S_t))_{t=0,1,2,...} is adapted.
Remark: In Theorem 1.2 we found that the stock positions for the replication strategy of a European derivative are of the form
  Δ_t = f_t(S_t),
i.e., Δ_t is an adapted process. Intuitively, this means that Δ_t can be determined with information available at time t. In particular, we do not look into the future to make the investment decision.
Definition 2.41
Let (F_t)_{t∈[0,T]} be a filtration on Ω and (X_t)_{t∈[0,T]} a stochastic process on (Ω, F, P). Suppose (X_t)_{t∈[0,T]} is adapted and all X_t are integrable.
(i) The process (X_t)_{t∈[0,T]} is a martingale if
  E[X_t | F_s] = X_s for all 0 ≤ s ≤ t ≤ T.
(ii) The process (X_t)_{t∈[0,T]} is a submartingale if
  E[X_t | F_s] ≥ X_s for all 0 ≤ s ≤ t ≤ T.
(iii) The process (X_t)_{t∈[0,T]} is a supermartingale if
  E[X_t | F_s] ≤ X_s for all 0 ≤ s ≤ t ≤ T.
In the same way we define (sub-/super-)martingales in discrete time for a discrete time filtration (F_t)_{t=0,1,2,...,N}.
Example 2.42
Define a probability measure on the coin toss space (Ω_N, F) via
  P̃[ω] = p̃^{#H(ω)} (1 − p̃)^{#T(ω)},  ω ∈ Ω_N,
for some p̃ ∈ (0, 1). Define Y_1, ..., Y_N as in Example 2.21; they are independent under P̃ with P̃[Y_i = u] = p̃ and P̃[Y_i = d] = 1 − p̃.
Let B_t = (1 + r)^t and S_t = S_0 ∏_{i=1}^t Y_i for t = 0, ..., N be the bond and stock price process in the binomial model, and (F_t)_{t=0,1,...,N} be the filtration generated by the process Y_1, ..., Y_N.
- The process (S_t/B_t)_{t=0,1,...,N} is adapted to (F_t)_{t=0,1,...,N} since S_t/B_t is a function of Y_1, ..., Y_t, which are F_t-measurable.
- Write Ẽ for the expectation under P̃; then for 0 ≤ s ≤ t ≤ N
  Ẽ[S_t/B_t | F_s] = Ẽ[S_0 ∏_{i=1}^t Y_i / (1+r)^t | F_s]
  = (S_0 ∏_{i=1}^s Y_i / (1+r)^s) Ẽ[∏_{i=s+1}^t Y_i / (1+r)^{t−s} | F_s]
  = (S_s/B_s) Ẽ[∏_{i=s+1}^t Y_i / (1+r)^{t−s}]
  = (S_s/B_s) ((p̃u + (1 − p̃)d)/(1 + r))^{t−s}.
So
_
S
t
B
t
_
t=0,1,2,...,N
is a martingale on (
N
, T,

P) if
pu + (1 p)d = 1 + r p =
1+r d
ud
(2.1)
(a submartingale if p >
1+r d
ud
and a supermartingale if p <
1+r d
ud
).
Note:

p in (2.1) is equal to the parameter p in (1.7).


So p in (1.7) can be interpreted as that up-move probability
for which
_
S
t
B
t
_
t=0,1,2,...,N
becomes a martingale.

Suppose
_
S
t
B
t
_
t=0,1,2,...,N
is a martingale under

P. We must
have p [0, 1], which is satised if d < 1 + r < u.
This condition is equivalent to absence of arbitrage, see below.
Example 2.43
Given initial wealth X_0, let (X_t)_{t=0,1,2,...,N} be the value process

  X_t/B_t = X_0/B_0 + Σ_{n=0}^{t−1} Δ_n (S_{n+1}/B_{n+1} − S_n/B_n)

of a sfts (see (1.5)), where Δ_n is F_n-measurable for each time n (investment decisions based on available information).
If (S_t/B_t)_{t=0,1,2,...,N} is a martingale, then so is (X_t/B_t)_{t=0,1,2,...,N}:
• it is adapted
• X_t/B_t = X_s/B_s + Σ_{n=s}^{t−1} Δ_n (S_{n+1}/B_{n+1} − S_n/B_n), and for n ≥ s

  Ẽ[Δ_n (S_{n+1}/B_{n+1} − S_n/B_n) | F_s]
  = Ẽ[ Ẽ[Δ_n (S_{n+1}/B_{n+1} − S_n/B_n) | F_n] | F_s ]
  = Ẽ[ Δ_n Ẽ[S_{n+1}/B_{n+1} − S_n/B_n | F_n] | F_s ] = 0.
Pricing derivatives with martingales
Consider the binomial model with 0 < d < 1 + r < u.
By Theorem 1.2, for any derivative with payoff h(S_N) at time N there exists a sfts with value process h_t(S_t), t = 0, ..., N (fair price at t) and h_N(S_N) = h(S_N).
• Example 2.42: (S_t/B_t)_{t=0,1,2,...,N} is a martingale under P̃ with p̃ = (1+r−d)/(u−d)
• Example 2.43: The discounted value process of any sfts is a martingale under P̃
Corollary 2.44
The discounted value process h_t(S_t)/B_t, t = 0, ..., N, is a martingale under P̃. In particular,

  h_t(S_t)/B_t = Ẽ[h_N(S_N)/B_N | F_t] = Ẽ[h(S_N)/B_N | F_t]  for all t = 0, ..., N.   (2.2)

We compute

  h_t(S_t) = (B_t/B_N) Ẽ[h(S_N) | F_t]
  = (1+r)^{−(N−t)} Ẽ[h(S_t ∏_{i=t+1}^N Y_i) | F_t]
  = (1+r)^{−(N−t)} Ẽ[h(S ∏_{i=t+1}^N Y_i)] |_{S=S_t}
  = (1+r)^{−(N−t)} Ẽ[h(S u^{#H} d^{N−t−#H})] |_{S=S_t}
  = (1+r)^{−(N−t)} Σ_{k=0}^{N−t} C(N−t, k) p̃^k (1−p̃)^{N−t−k} h(S_t u^k d^{N−t−k}).

(Here #H denotes the number of heads from t + 1 to N, and C(N−t, k) the binomial coefficient.)
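The last formula is easy to evaluate numerically. The following sketch (illustrative parameter values, not taken from the notes) prices a European call via the risk-neutral expectation and cross-checks the result against one-step backward induction h_t(S) = (p̃ h_{t+1}(Su) + (1−p̃) h_{t+1}(Sd))/(1+r):

```python
import math

# Hypothetical 3-period binomial model with d < 1 + r < u (illustrative values)
S0, u, d, r, N = 100.0, 1.2, 0.9, 0.05, 3
p = (1 + r - d) / (u - d)  # risk-neutral up-probability, formula (2.1)

def price(h, t, S_t):
    """Fair price h_t(S_t) via the risk-neutral expectation formula."""
    n = N - t
    disc = (1 + r) ** (-n)
    return disc * sum(
        math.comb(n, k) * p**k * (1 - p)**(n - k) * h(S_t * u**k * d**(n - k))
        for k in range(n + 1)
    )

def backward(h, S):
    """Cross-check: one-step backward induction through the binomial tree."""
    vals = {}
    for t in range(N, -1, -1):
        for k in range(t + 1):          # k = number of up-moves so far
            s = S * u**k * d**(t - k)
            if t == N:
                vals[(t, k)] = h(s)
            else:
                vals[(t, k)] = (p * vals[(t + 1, k + 1)]
                                + (1 - p) * vals[(t + 1, k)]) / (1 + r)
    return vals[(0, 0)]

call = lambda s: max(s - 100.0, 0.0)    # European call, strike 100
v0 = price(call, 0, S0)
assert abs(v0 - backward(call, S0)) < 1e-10
```

Both routes compute the same expectation, so they agree up to floating-point error.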
Major goal of the course:
Develop a martingale pricing theory for continuous time models
Change of measure
Two probability measures on (Ω_N, F):
a) P[ω] = p^{#H(ω)} (1 − p)^{#T(ω)}
b) P̃[ω] = p̃^{#H(ω)} (1 − p̃)^{#T(ω)} with p̃ = (1+r−d)/(u−d)
Under P the expected returns of stock and bond are

  E[(S_{t+1} − S_t)/S_t | F_t] = pu + (1−p)d − 1,  (B_{t+1} − B_t)/B_t = r.

Usually E[(S_{t+1} − S_t)/S_t | F_t] > (B_{t+1} − B_t)/B_t because of risk aversion.
Under P̃ the expected return of the stock is

  Ẽ[(S_{t+1} − S_t)/S_t | F_t] = p̃u + (1−p̃)d − 1 = r = (B_{t+1} − B_t)/B_t.

P̃ is called risk-neutral probability measure.
• Note that we have P[A] = 0 ⟺ P̃[A] = 0 for all A ∈ F.
Two probability measures with this property are called equivalent.
Remark: In the binomial model we actually have P[A] = 0 only if A = ∅. The notion of equivalent probability measures will become more important (and less trivial) in the case of continuous-time models requiring infinite probability spaces, see Chapter 5.
Until now our discussions mostly focused on the binomial model.
We now come to a result which reaches far beyond this model.
Fundamental theorem of asset pricing
Consider a market with one riskless asset B_t and m risky assets S^i_t for i = 1, ..., m (stocks, derivatives, ...). Asset prices are modeled by nonnegative adapted stochastic processes on a space (Ω, F, P) with a filtration (F_t)_{t=0,1,2,...}. We assume B_t > 0 for all t.
Definition 2.45
a) Consider a trading strategy which invests into Θ_t units of B and Δ^i_t units of S^i at time t = 0, 1, ..., N − 1, so that the portfolio value at each time t is

  X_t = Θ_t B_t + Σ_{i=1}^m Δ^i_t S^i_t.   (2.3)

We demand that Θ_t and Δ^i_t are bounded and F_t-measurable.
b) The strategy is self-financing if for any t = 0, 1, ..., N − 1

  X_{t+1} = Θ_{t+1} B_{t+1} + Σ_{i=1}^m Δ^i_{t+1} S^i_{t+1} = Θ_t B_{t+1} + Σ_{i=1}^m Δ^i_t S^i_{t+1}.   (2.4)
The value process X_t of a sfts is computed as follows:

  X_{t+1}/B_{t+1} − X_t/B_t
  = (Θ_t B_{t+1} + Σ_{i=1}^m Δ^i_t S^i_{t+1})/B_{t+1} − (Θ_t B_t + Σ_{i=1}^m Δ^i_t S^i_t)/B_t
  = Σ_{i=1}^m Δ^i_t (S^i_{t+1}/B_{t+1} − S^i_t/B_t)   (2.5)

  ⟹ X_n/B_n = X_k/B_k + Σ_{t=k}^{n−1} Σ_{i=1}^m Δ^i_t (S^i_{t+1}/B_{t+1} − S^i_t/B_t)   (2.6)

for any 0 ≤ k < n ≤ N.
Definition 2.46
An arbitrage opportunity is a sfts whose value process satisfies X_0 = 0, P[X_N < 0] = 0, and P[X_N > 0] > 0.
A meaningful market model should be free of arbitrage opportunities.
Theorem 2.47 (FTAP in discrete time)
There is no arbitrage opportunity in the market B, S^1, ..., S^m if, and only if, there exists a measure P̃ equivalent to P on (Ω, F) such that all discounted price processes (S^i_t/B_t)_{t=0,1,...,N} for i = 1, ..., m are martingales under P̃.
Proof of "if" part: Suppose there exists P̃ as above. Then (2.6) implies that the value process X_t/B_t of any sfts is a P̃-martingale. Now if we had an arbitrage opportunity, its value process would satisfy

  Ẽ[X_N/B_N] = X_0/B_0 = 0   (2.7)

and P[X_N < 0] = 0, thus also P̃[X_N < 0] = 0. So (2.7) implies P̃[X_N > 0] = 0, and thus also P[X_N > 0] = 0, a contradiction.
The proof of the "only if" part requires tools from functional analysis beyond the scope of this course, see e.g. Föllmer and Schied, Stochastic Finance, Theorem 5.17. A measure P̃ as in the FTAP is called a martingale measure.
Remarks. 1) The FTAP requires very few assumptions on the market model.
2) In the binomial model, we have identified the risk-neutral measure P̃ as a martingale measure in Example 2.42. One can show that it is the only martingale measure in this model. In this case, derivative prices are determined by arbitrage considerations.
3) In more general arbitrage-free models, there can be more than one martingale measure. In this situation, absence of arbitrage alone does not uniquely determine derivative prices.
The FTAP is a key result for derivative pricing:
Corollary 2.48
Take an arbitrage-free market B, S^1, ..., S^m as above. Suppose we add a derivative with price process V_t to the market. The resulting market B, S^1, ..., S^m, V is arbitrage-free if and only if

  V_t = B_t Ẽ[V_N/B_N | F_t],  t = 0, ..., N   (2.8)

where P̃ is a martingale measure for the market B, S^1, ..., S^m.
Conditional distributions
For a random variable X on (Ω, F, P), the distribution μ_X of X is

  μ_X(B) := P[X ∈ B],  B ∈ B(R).

Now let G ⊂ F be a sub-σ-algebra. The conditional distribution μ_{X|G} of X given G is

  μ_{X|G}(B) := P[X ∈ B | G],  B ∈ B(R)

where the conditional probability is P[X ∈ B | G] = E[I_{X∈B} | G].
Note that P[X ∈ B | G] is a random variable, i.e., depends on ω ∈ Ω.
One checks that for each ω, the function B ↦ P[X ∈ B | G](ω) is a probability measure on (R, B(R)) (a random measure).
In many situations we can write X as a function of a G-measurable and a G-independent part, and use Theorem 2.38 c) to compute its conditional distribution given G.
Example 2.49
Let Y, Z be independent N(0, 1) distributed random variables on (Ω, F, P) and X = Y + Z, so X has N(0, 2) distribution.
Let G = σ(Y), then Y is G-measurable and Z is independent of G.
Hence for B ∈ B(R)

  P[X ∈ B | G](ω) = E[I_{Y+Z∈B} | G](ω) = E[I_{y+Z∈B}] |_{y=Y(ω)} = P[y + Z ∈ B] |_{y=Y(ω)}.

Since y + Z ∼ N(y, 1) for y ∈ R, it follows that the conditional distribution of X given G is N(Y(ω), 1).
3. Brownian motion
Main goals
• to introduce Brownian motion as a prototype continuous-time martingale
• to develop its basic properties
• to explain its role in asset price modeling
Motivation
Goal: constructing a continuous-time limit of the binomial model
• Let (Ω_∞, F, P) be as in Example 2.4 (independent coin tosses with head and tail probability 1/2). Define

  X_j(ω) = +1 if ω_j = H,  −1 if ω_j = T,

and a symmetric random walk by M_0 = 0, M_k = Σ_{j=1}^k X_j, k = 1, 2, ...
• Note

  #H(ω_1 ... ω_k) = Σ_{j=1}^k (1 + X_j(ω))/2 = ½(k + M_k(ω)),
  #T(ω_1 ... ω_k) = Σ_{j=1}^k (1 − X_j(ω))/2 = ½(k − M_k(ω)),

so

  S_k = S_0 u^{#H} d^{#T} = S_0 u^{½(k+M_k)} d^{½(k−M_k)}.   (3.1)

• Discrete time model: S_k = S_0 u^{½(k+M_k)} d^{½(k−M_k)}
• Continuous time notation: let t denote time in years.
  1 discrete time step = 1/n year in continuous time: S^{(n)}(t) = S_{nt} for nt ∈ N.
• Jump size u = u_n, d = d_n depending on length of time step:

  S^{(n)}(t) = S_{nt} = S_0 u_n^{½(nt+M_{nt})} d_n^{½(nt−M_{nt})}
  = S_0 e^{½(log u_n − log d_n) M_{nt} + ½(log u_n + log d_n) nt}   (3.2)

• Scaling: Note Var[M_{nt}] = nt for nt ∈ N. This motivates to define the scaled symmetric random walk

  W^{(n)}(t) = (1/√n) M_{nt}   (3.3)

for⁴ nt ∈ N. Then we have for each such n: Var[W^{(n)}(t)] = t.

⁴ If k < nt < k + 1 for k ∈ N then W^{(n)}(t) is given by linear interpolation between W^{(n)}(k/n) and W^{(n)}((k+1)/n).

We obtain

  S^{(n)}(t) = S_0 e^{½√n(log u_n − log d_n) W^{(n)}(t) + ½(log u_n + log d_n) nt}   (3.4)

• We want to construct S(t) via taking n → ∞. This works if we have limits

  W^{(n)}(t) → W(t),
  ½√n(log u_n − log d_n) → σ,
  ½(log u_n + log d_n) n → c

for n → ∞. In this case we obtain S(t) = S_0 e^{σW(t)+ct}.
Now suppose we know that W^{(n)}(t) → W(t) for n → ∞.
(We can hope that we get convergence since Var[W^{(n)}(t)] = t for all n.)
Continuous time limit of the binomial model
Define

  u_n = 1 + σ/√n + α/n,  d_n = 1 − σ/√n + α/n   (3.5)

with σ > 0, α ∈ R. Using log(1 + x) = x − ½x² + O(x³) we find

  ½(log u_n − log d_n) = σ/√n + O(n^{−3/2}),
  ½(log u_n + log d_n) = (1/n)(α − ½σ²) + O(n^{−3/2}).

So from (3.4)

  log S^{(n)}(t)
  = log S_0 + ½√n(log u_n − log d_n) W^{(n)}(t) + ½(log u_n + log d_n) nt
  = log S_0 + σW^{(n)}(t) + O(n^{−1})W^{(n)}(t) + (α − ½σ²)t + O(n^{−1/2})
  → log S_0 + σW(t) + (α − ½σ²)t.

Hence the random variables S^{(n)} with u_n, d_n in (3.5) converge to

  S(t) = S_0 e^{σW(t) + (α − ½σ²)t}   (3.6)

in distr. for n → ∞ (geometric Brownian motion, see below).
Convergence of the scaled random walk
Theorem 3.1
Let W^{(n)}(t), t ≥ 0 be a scaled symmetric random walk, and 0 = t_0 < t_1 < ... < t_m such that each nt_j ∈ N.
a) The increments

  W^{(n)}(t_1) − W^{(n)}(t_0), W^{(n)}(t_2) − W^{(n)}(t_1), ..., W^{(n)}(t_m) − W^{(n)}(t_{m−1})

are independent.
b) The increments satisfy

  E[W^{(n)}(t_j) − W^{(n)}(t_{j−1})] = 0,  Var[W^{(n)}(t_j) − W^{(n)}(t_{j−1})] = t_j − t_{j−1}.

Proof: For a) use Theorem 2.20, for b) use Corollary 2.27.
Theorem 3.2 (Central limit theorem)
Let Y_1, Y_2, ... be independent identically distributed (iid) random variables with E[Y_k] = μ and Var[Y_k] = σ² < ∞ for all k, and

  S_n = (1/√n) Σ_{k=1}^n (Y_k − μ)/σ.

Then S_n converge to a standard normal random variable in distribution, i.e.

  lim_{n→∞} P[S_n ≤ x] = Φ(x) for all x ∈ R.

Proof: See [Dur95] Theorem 2.4.1 or [Res99] Theorem 9.7.1.
For nt_{j−1}, nt_j ∈ N the increment of W^{(n)} satisfies

  (W^{(n)}(t_j) − W^{(n)}(t_{j−1})) / √(t_j − t_{j−1})
  = (1/√(n(t_j − t_{j−1}))) (M_{nt_j} − M_{nt_{j−1}})
  = (1/√(nt_j − nt_{j−1})) Σ_{k=1}^{nt_j − nt_{j−1}} X_{nt_{j−1}+k}
  → Z ∼ N(0, 1)  (n → ∞)

in distribution by the central limit theorem.
Brownian motion
Brownian motion is obtained as the limit of W^{(n)}(t) for n → ∞.
Definition 3.3
A stochastic process W(t), t ≥ 0 on a space (Ω, F, P) is a Brownian motion (BM) if it satisfies
(i) W(0) = 0 a.s.
(ii) t ↦ W(t) is continuous a.s.
(iii) For all times 0 = t_0 < t_1 < ... < t_m, the increments

  W(t_1) = W(t_1) − W(t_0), W(t_2) − W(t_1), ..., W(t_m) − W(t_{m−1})

are independent, and W(t_j) − W(t_{j−1}) ∼ N(0, t_j − t_{j−1}) for all j = 1, ..., m.
Distribution of Brownian motion
For 0 = t_0 < t_1 < ... < t_m

  (W(t_1) − W(t_0), ..., W(t_m) − W(t_{m−1})) = (√(t_1 − t_0) Z_1, ..., √(t_m − t_{m−1}) Z_m)

with independent standard normal Z_1, ..., Z_m by (iii).
So (W(t_1), ..., W(t_m)) has multivariate normal distribution (as linear combination of a multivar normal).
For all times 0 ≤ s < t

  E[W(s)] = E[W(t)] = 0,
  Cov[W(s), W(t)] = E[W(s)W(t)] = E[W(s)(W(t) − W(s)) + W(s)²] = s = min(s, t).
Filtration of Brownian motion
Let W(t), t ≥ 0 be a BM on (Ω, F, P). Let (F^W_t)_{t≥0} be the filtration generated by the BM.
Reminder: F^W_t = σ(W(s) | 0 ≤ s ≤ t) is the smallest σ-algebra such that all W(s) with s ≤ t are F^W_t-measurable.
Important properties:
a) F^W_t contains all events which can be expressed in terms of the values of W(s) for 0 ≤ s ≤ t.
b) W(t), t ≥ 0 is adapted to the filtration (F^W_t)_{t≥0}.
c) W(u) − W(t) is independent of F^W_t for 0 ≤ t < u (this follows from Definition 3.3 (iii)).
Brownian motion for a general filtration
Sometimes one wants a BM on a space which is already equipped with a filtration. Then the following characterization of BM is used.
Definition 3.4
Let W(t), t ≥ 0 be a stochastic process on a space (Ω, F, P) which is adapted to a filtration (F_t)_{t≥0}. The process is a Brownian motion w.r.t. (F_t)_{t≥0} if it satisfies
(i) W(0) = 0 a.s.
(ii) t ↦ W(t) is continuous a.s.
(iii) For all times 0 ≤ t < u, the increment W(u) − W(t) is independent of F_t and has N(0, u − t) distribution.
The two definitions are equivalent:
• On the last page we saw that a BM in the sense of Definition 3.3 satisfies Definition 3.4
• It is not difficult to show that a BM in the sense of Definition 3.4 also satisfies Definition 3.3 (exercise).
In general models, F_t in Definition 3.4 can contain more information than F^W_t.
However we only work with the filtration F_t = F^W_t, t ≥ 0 in our models, so we drop the superscript and only write F_t as before.
Martingale property of Brownian motion
Theorem 3.5
BM is a martingale.
Proof: BM is adapted to its filtration, and integrable because the normal distribution has finite first moment. For 0 ≤ s ≤ t,

  E[W(t) | F_s] = E[W(t) − W(s) + W(s) | F_s]
  = E[W(t) − W(s) | F_s] + E[W(s) | F_s]
  = E[W(t) − W(s)] + W(s)
  = W(s).
Geometric Brownian motion

  S(t) = S_0 e^{σW(t) + (α − ½σ²)t}

(i) S(t) is continuous in t, a.s.
(ii) For 0 ≤ s < t,

  S(t) = S_0 e^{σW(s) + (α − ½σ²)s} e^{σ(W(t) − W(s)) + (α − ½σ²)(t − s)}
  = S(s) e^{σ(W(t) − W(s)) + (α − ½σ²)(t − s)}.   (3.7)

So the logarithmic return log(S(t)/S(s)) is independent of F_s (by property (iii) of BM).
(iii) For 0 ≤ s < t, the logarithmic return log(S(t)/S(s)) has a normal distribution. S(t) has a lognormal distribution.
Geometric BM is used as a model for the stock price in the Black-Scholes model (see Chapter 4).
Theorem 3.6 (Exponential martingale)
Let W(t), t ≥ 0 be a Brownian motion with a filtration (F_t), t ≥ 0. Then

  S(t) = S_0 e^{σW(t) + (α − ½σ²)t}

is a martingale for α = 0, and a submartingale (supermartingale) if α > 0 (α < 0).
Proof: S(t) is F_t-measurable as a function of W(t), and integrable since W(t) has exponential moments. Take the case α = 0. For 0 ≤ s < t, by (3.7)

  E[S(t) | F_s] = E[S(s) e^{σ(W(t) − W(s)) − ½σ²(t − s)} | F_s]
  = S(s) e^{−½σ²(t − s)} E[e^{σ(W(t) − W(s))}].

In the last step we used Theorem 2.38 a) and c).
W(t) − W(s) ∼ N(0, t − s) implies E[e^{σ(W(t) − W(s))}] = e^{½σ²(t − s)}, hence E[S(t) | F_s] = S(s).
The cases α ≠ 0 are an exercise.
Markov property of BM
Definition 3.7
Let (X(t))_{t∈[0,T]} be a stochastic process adapted to a filtration (F_t)_{t∈[0,T]}. Assume that for all 0 ≤ s ≤ t ≤ T and every nonnegative function f, there is another function⁵ g such that

  E[f(X(t)) | F_s] = g(X(s)).

Then we say that (X(t))_{t∈[0,T]} is a Markov process.
Theorem 3.8
Let W(t), t ≥ 0 be BM and F_t, t ≥ 0 the filtration generated by the BM. Then W(t), t ≥ 0 is a Markov process.

⁵ To be precise, f and g are assumed to be Borel-measurable.
Proof: From Theorem 2.38 c)

  E[f(W(t)) | F_s] = E[f(W(t) − W(s) + W(s)) | F_s]
  = E[f(W(t) − W(s) + x)] |_{x=W(s)}
  = ∫ (1/√(2π(t−s))) e^{−z²/(2(t−s))} f(z + x) dz |_{x=W(s)}
  = ∫ (1/√(2π(t−s))) e^{−(y−x)²/(2(t−s))} f(y) dy |_{x=W(s)}
  = ∫ p(t − s, W(s), y) f(y) dy   (3.8)

where p(τ, x, y) = (1/√(2πτ)) e^{−(y−x)²/(2τ)}.
p(τ, x, y) is the transition density of BM. (3.8) says that the conditional distribution of W(t) given F_s is N(W(s), t − s).
The fact that this distribution only depends on W(s) (instead of all information in F_s) is the essence of the Markov property.
Quadratic variation
Motivation: Let f(t) be a function. We seek a measure of variation (total up and down oscillation) of f(t) on [0, T].
Definition 3.9
For a partition Π = {t_0, t_1, ..., t_n} of [0, T], i.e., times with 0 = t_0 < t_1 < ... < t_n = T, set |Π| = max_{j=0,...,n−1}(t_{j+1} − t_j).
The (first-order) variation of f on [0, T] is

  FV_f(T) = lim_{|Π|→0} Σ_{j=0}^{n−1} |f(t_{j+1}) − f(t_j)|.

If f(t) is differentiable then

  FV_f(T) = ∫_0^T |f'(t)| dt.   (3.9)

If f(t) is increasing then FV_f(T) = f(T) − f(0).
Definition 3.10
For a partition Π = {t_0, t_1, ..., t_n} of [0, T], i.e., times with 0 = t_0 < t_1 < ... < t_n = T, set |Π| = max_{j=0,...,n−1}(t_{j+1} − t_j).
The quadratic variation of f on [0, T] is⁶

  [f, f](T) = lim_{|Π|→0} Σ_{j=0}^{n−1} |f(t_{j+1}) − f(t_j)|².

The covariation of two functions f and g on [0, T] is

  [f, g](T) = lim_{|Π|→0} Σ_{j=0}^{n−1} (f(t_{j+1}) − f(t_j))(g(t_{j+1}) − g(t_j)).

⁶ In both Definitions 3.9 and 3.10 one can show that the limits do not depend on the choice of the partitions Π. We omit the proof of this fact here.
Basic properties. 1) If f(t) is continuous and g(t) has finite first-order variation (e.g. if g(t) has an integrable derivative) then [f, g](T) = 0 for all T.
2) Thus if f(t) is continuous and has finite first-order variation, [f, f](T) = 0 for all T.
3) Covariation is bilinear and symmetric. In particular,

  [f + g, f + g](T) = [f, f](T) + [g, g](T) + 2[f, g](T).

4) If f_1, f_2 are continuous and g_1, g_2 are continuous and have finite first-order variation, then [f_1 + g_1, f_2 + g_2] = [f_1, f_2].
5) The function [f, f](t) is increasing in t, and thus has finite first-order variation.
Quadratic variation of Brownian motion
Theorem 3.11
Let W be a BM. Then [W, W](T) = T for all T ≥ 0 a.s.
It follows that BM paths W(t), t ≥ 0, do not have finite first-order variation. W(t), t ≥ 0, is not differentiable (a.s.).
Intuition: BM accumulates quadratic variation at rate one per unit time.
Note: By definition,

  [W, W](T)(ω) = lim_{|Π|→0} Σ_{j=0}^{n−1} |W(t_{j+1})(ω) − W(t_j)(ω)|².

Theorem 3.11 says that the random variables Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|² converge to T when |Π| → 0.
Proof: Take any partition Π = {t_0, t_1, ..., t_n} of [0, T]. We show

  Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|² → T

in L² when |Π| → 0. Indeed,

  E[Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|²] = Σ_{j=0}^{n−1} (t_{j+1} − t_j) = T

and

  Var[Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|²] = Σ_{j=0}^{n−1} Var[|W(t_{j+1}) − W(t_j)|²]
  = Σ_{j=0}^{n−1} (E[|W(t_{j+1}) − W(t_j)|⁴] − E[|W(t_{j+1}) − W(t_j)|²]²)
  = Σ_{j=0}^{n−1} (3(t_{j+1} − t_j)² − (t_{j+1} − t_j)²) = 2 Σ_{j=0}^{n−1} |t_{j+1} − t_j|²
  ≤ 2 (max_{j=0,...,n−1} |t_{j+1} − t_j|) Σ_{j=0}^{n−1} |t_{j+1} − t_j| = 2|Π|T → 0.
Notations:
a) [W, W](T) = T
b) d[W, W](t) = dt
c) dW(t)dW(t) = dt
Notation c) is delicate since dW(t) alone is not well-defined.
To interpret c), take the partition of [0, T] defined by t_j = (j/n)T for j = 0, ..., n, so Δt = t_{j+1} − t_j = T/n. Then

  (W(t_{j+1}) − W(t_j))² = T Y²_{j+1}/n = Y²_{j+1} Δt   (3.10)

with Y_{j+1} ∼ N(0, 1), hence E[Y²_{j+1}] = 1. So c) is a differential version of (3.10).
By the Law of Large Numbers Σ_{j=0}^{n−1} T Y²_{j+1}/n → T E[Y²_{j+1}] = T for n → ∞, and so we find again

  lim_{n→∞} Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|² = T.
Application: Volatility of geometric BM
Let S(t), t ≥ 0, be a price process, and Π = {t_0, t_1, ..., t_n} a partition of [0, T]. The (annualized) historical variance of the asset along the partition is

  Var_hist[0, T] = (1/T) Σ_{j=0}^{n−1} (log(S(t_{j+1})/S(t_j)))².

Assume now S(t) is a geometric BM. Then for small |Π|,

  Var_hist[0, T]
  = (1/T) Σ_{j=0}^{n−1} [(σW(t_{j+1}) + (α − ½σ²)t_{j+1}) − (σW(t_j) + (α − ½σ²)t_j)]²
  ≈ (1/T) [σW(t) + (α − ½σ²)t, σW(t) + (α − ½σ²)t](T)
  = (1/T) [σW, σW](T) = (1/T) σ² [W, W](T) = σ².

The (annualized) historical volatility of the asset during [0, T] is defined as

  σ_hist[0, T] = √(Var_hist[0, T]).

So if S(t) is a geometric BM, then for small |Π|

  σ_hist[0, T] ≈ σ   (3.11)

for every T.
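Approximation (3.11) can be verified on a simulated geometric BM path. A minimal sketch (σ, α and the grid size are illustrative assumptions):

```python
import math
import random

random.seed(2)

# Simulate log S(t) = log S0 + sigma*W(t) + (alpha - sigma^2/2)*t on a fine grid
S0, sigma, alpha, T, n = 100.0, 0.3, 0.08, 1.0, 100000
dt = T / n
logS = [math.log(S0)]
for _ in range(n):
    dW = random.gauss(0.0, dt ** 0.5)
    logS.append(logS[-1] + sigma * dW + (alpha - 0.5 * sigma ** 2) * dt)

# Annualized historical variance along the partition, then its square root
var_hist = sum((logS[j + 1] - logS[j]) ** 2 for j in range(n)) / T
sigma_hist = var_hist ** 0.5

# sigma_hist should recover sigma = 0.3 up to a small discretization error
assert abs(sigma_hist - sigma) < 0.01
```

Note that the drift α barely matters: its contribution to the squared log-returns is of order |Π| and vanishes as the grid is refined, exactly as in the covariation argument above.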
Let S(t) be a stochastic process adapted to a filtration (F_t)_{t≥0}, and Δ > 0 a small time increment. The (annualized) volatility at t is the standard deviation of (1/√Δ) log(S(t+Δ)/S(t)) conditional on F_t.
Assume again S(t) is a geometric BM. Then

  (1/√Δ) log(S(t+Δ)/S(t)) = (σ(W(t+Δ) − W(t)) + (α − ½σ²)Δ)/√Δ ∼ N((α − ½σ²)√Δ, σ²),

so the volatility is equal to σ.
⟹ we can identify σ from price observations (approximately) via (3.11).
Remark. In the second semester of the course we will analyze models which have non-constant (stochastic) volatility.
Excursion: Existence of BM
Does there exist a stochastic process as in Definition 3.3?
Existence proof via Donsker's theorem: Let S be the space of all continuous functions ω on [0, T] with ω(0) = 0. Define a norm on S by ‖ω‖ = sup_{t∈[0,T]} |ω(t)|, and let 𝒮 be the σ-algebra generated by the open subsets of S with respect to the norm ‖·‖. For each t ∈ [0, T] define a random variable on S by W_t(ω) = ω(t), ω ∈ S.
We define probability measures μ_n on (S, 𝒮) by

  μ_n(B) = P[W^{(n)}(·) ∈ B],  B ∈ 𝒮,

where W^{(n)}(t) is the scaled symmetric random walk in (3.3).
Theorem 3.12 (Donsker's theorem)
There exists a unique probability measure μ on (S, 𝒮) such that

  μ_n → μ weakly (n → ∞),

and the stochastic process W_t, t ∈ [0, T], is a BM on (S, 𝒮, μ).
Proof: See [Dur95] Section 7.6.
A consequence of Donsker's theorem is the following:
Corollary 3.13
Let W^{(n)} be the scaled symmetric random walk and W be a BM. Then for any times 0 < t_1 < ... < t_m and any bounded continuous function h : R^m → R we have

  lim_{n→∞} E[h(W^{(n)}(t_1), ..., W^{(n)}(t_m))] = E[h(W(t_1), ..., W(t_m))].

This result implies that when the number n of time steps per year goes to infinity, derivative prices in the binomial model converge to derivative prices in the Black-Scholes model (see Chapter 5).
4. Stochastic calculus
Main topics
• Stochastic integrals as value processes of self-financing trading strategies
• Ito-Doeblin formula
• Application: Black-Scholes-Merton equation
Motivation
Consider a market with a bond and a stock. Prices at time t ≥ 0 (continuous time) are given by B(t) = e^{rt} and S(t), where S is a continuous stochastic process adapted to a filtration (F_t)_{t≥0}.
• Trading takes place only at times 0 = t_0 < t_1 < ... < t_m = T.
In [t_j, t_{j+1}) hold Θ(t_j) units of bond and Δ(t_j) units of stock, then trade at prices B(t_{j+1}), S(t_{j+1}) in a self-financing way:

  Θ(t_{j+1})B(t_{j+1}) + Δ(t_{j+1})S(t_{j+1}) = Θ(t_j)B(t_{j+1}) + Δ(t_j)S(t_{j+1}).

• So the positions held at time t are

  Θ(t) = Σ_{j=0}^{m−1} Θ(t_j) I_{[t_j, t_{j+1})}(t),  Δ(t) = Σ_{j=0}^{m−1} Δ(t_j) I_{[t_j, t_{j+1})}(t).   (4.1)

We demand that Δ(t_j) is F_{t_j}-measurable and bounded.
A process Δ(t) as in (4.1) is called a simple process.

The value process X(t) = Θ(t)B(t) + Δ(t)S(t) then satisfies

  X(t)/B(t) = X(0) + Σ_{j=0}^{m−1} Δ(t_j) (S(t_{j+1}∧t)/B(t_{j+1}∧t) − S(t_j∧t)/B(t_j∧t))   (4.2)

for t ∈ [0, T].
Definition 4.1
The stochastic integral (SI) of the simple process Δ(t) with respect to the process M(t) = S(t)/B(t) is

  ∫_0^t Δ(u) dM(u) := Σ_{j=0}^{m−1} Δ(t_j) (M(t_{j+1}∧t) − M(t_j∧t)),  t ∈ [0, T].
Continuous trading
We now consider strategies with continuously changing positions.
A continuous trading strategy Δ(t) is approximated by a sequence of simple strategies Δ_n(t) (trading at discrete times) for which the time between re-balancing dates becomes small when n → ∞. That is,

  Δ(t) = lim_{n→∞} Δ_n(t),  t ∈ [0, T].   (4.3)

Remark: One can show that every continuous adapted process Δ(t) can be approximated by simple processes as in (4.3).
We will see that this approximation is a stable procedure in the sense that the value processes X_n(t) corresponding to the simple self-financing trading strategies Δ_n(t) satisfy

  X_n(t) → X(t)  (n → ∞)

for some limit value X(t).
Definition 4.2
a) Let Δ(t) = lim_{n→∞} Δ_n(t) be a continuous adapted process approximated by simple processes Δ_n(t). The stochastic integral of Δ(t) with respect to M(t) := S(t)/B(t) is

  ∫_0^t Δ(u) dM(u) := lim_{n→∞} ∫_0^t Δ_n(u) dM(u),  t ∈ [0, T].   (4.4)

b) The value process X(t) of the continuous self-financing trading strategy Δ(t) with initial value X(0) is given by

  X(t)/B(t) = X(0) + ∫_0^t Δ(u) d(S(u)/B(u)).   (4.5)

We now specify, in three steps, a class of processes M(t) = S(t)/B(t) for which the limit in (4.4) exists.
Step 1: SI with respect to differentiable processes
Let M(t) be a stochastic process which is differentiable in t.
Theorem 4.3
If Δ is adapted and Δ(t)M'(t) integrable as a function of t, then the limit in (4.4) exists a.s., and

  ∫_0^t Δ(u) dM(u) = ∫_0^t Δ(u) M'(u) du.

Proof: For a simple process Δ(t) = Σ_{j=0}^{m−1} Δ_{t_j} I_{[t_j,t_{j+1})}(t),

  ∫_0^t Δ(u) dM(u) = Σ_{j=0}^{m−1} Δ_{t_j} (M(t_{j+1}∧t) − M(t_j∧t))
  = Σ_{j=0}^{m−1} Δ_{t_j} ∫_{t_j∧t}^{t_{j+1}∧t} M'(u) du = Σ_{j=0}^{m−1} ∫_{t_j∧t}^{t_{j+1}∧t} Δ(u)M'(u) du
  = ∫_0^t Δ(u)M'(u) du.

For adapted Δ choose simple processes Δ_n → Δ such that |Δ_n| ≤ |Δ|. We then obtain

  ∫_0^t Δ(u) dM(u) = lim_{n→∞} ∫_0^t Δ_n(u) dM(u)
  = lim_{n→∞} ∫_0^t Δ_n(u)M'(u) du = ∫_0^t Δ(u)M'(u) du

by dominated convergence.
Corollary 4.4 (Associativity)
Let Δ, Γ adapted such that Γ(t)M'(t) and Δ(t)Γ(t)M'(t) are integrable. Then

  ∫_0^t Δ(u) d(∫_0^u Γ(v) dM(v)) = ∫_0^t Δ(u)Γ(u) dM(u).
Step 2: SI with respect to Brownian motion
Theorem 4.5
If M(t) is a martingale and Δ(t) a simple process, then the SI ∫_0^t Δ(u) dM(u) for t ∈ [0, T] is also a martingale.
Proof: See [Shr04] Theorem 4.2.1.
Corollary 4.6
If M(t) is a square-integrable martingale and Δ(t) a simple process, then the SI satisfies E[∫_0^t Δ(u) dM(u)] = 0 and

  E[(∫_0^t Δ(u) dM(u))²] = E[Σ_{j=0}^{m−1} Δ²_{t_j} (M(t_{j+1}∧t) − M(t_j∧t))²].

Proof: This follows from Theorem 4.5 since for every square-integrable martingale X,

  E[(X(t) − X(0))²] = Σ_{j=0}^{m−1} E[(X(t_{j+1}∧t) − X(t_j∧t))²].
We now take M(t) = W(t) BM.
Theorem 4.7 (Ito isometry)
For a simple process Δ(t) we have

  E[(∫_0^t Δ(u) dW(u))²] = E[∫_0^t Δ(u)² du].

Proof: For any t_{j+1} ≤ t

  E[Δ²_{t_j} (W(t_{j+1}) − W(t_j))²] = E[Δ²_{t_j} E[(W(t_{j+1}) − W(t_j))² | F_{t_j}]]
  = E[Δ²_{t_j} (t_{j+1} − t_j)] = E[∫_{t_j}^{t_{j+1}} Δ(u)² du],

and similarly E[Δ²_{t_j} (W(t) − W(t_j))²] = E[∫_{t_j}^t Δ(u)² du] for t ∈ (t_j, t_{j+1}].
Excursion: existence and uniqueness of SI. Let L² be the space of all F_t-measurable random variables X with ‖X‖ = E[X²]^{1/2} < ∞, and L²_W be the space of all adapted processes Δ with

  ‖Δ‖_W = E[∫_0^t Δ(u)² du]^{1/2} < ∞.

Let 𝒮 be the space of all simple processes. Then the map

  Δ ↦ ∫_0^t Δ(u) dW(u)

defines a linear isometry 𝒮 → L² by Theorem 4.7. Since 𝒮 is dense in L²_W, and L² is a Banach space, there is a unique way to extend this map⁷ to a linear isometry L²_W → L² by setting

  ∫_0^t Δ(u) dW(u) := lim_{n→∞} ∫_0^t Δ_n(u) dW(u) in L²

for a sequence Δ_n ∈ 𝒮 with Δ_n → Δ in L²_W.

⁷ See e.g. Kreyszig, Introductory Functional Analysis with Applications, p. 100.
The last result
• provides a positive answer to the existence and uniqueness question in (4.4) in the case of M(t) = W(t) BM
• specifies a space L²_W of admissible integrands for BM: all adapted processes Δ with E[∫_0^t Δ(u)² du] < ∞
Note: If Δ ∈ L²_W adapted, then by definition

  ∫_0^t Δ(u) dW(u) := lim_{n→∞} ∫_0^t Δ_n(u) dW(u) in L²

when Δ_n are simple processes with Δ_n → Δ in L²_W (n → ∞).
The last condition is fulfilled for instance if Δ_n → Δ a.s. (n → ∞) and |Δ_n| ≤ |Δ| a.s. for all n. A sequence Δ_n of this type can be found for every adapted Δ ∈ L²_W.
Basic properties of SI. Let W be a BM.
Theorem 4.8
The stochastic integral I(t) = ∫_0^t Δ(u) dW(u) satisfies
(i) I(t) is linear in Δ.
(ii) I(t) is F_t-measurable.
(iii) (I(t))_{t∈[0,T]} is a martingale.
(iv) I(t) is continuous in t.
Proof: (i)–(iv) are immediate for simple processes Δ. For an arbitrary adapted process Δ ∈ L²_W we choose simple Δ_n such that Δ_n → Δ in L²_W. Then (i) and (ii) generalize from Δ_n to Δ. For (iii) we note that for A ∈ F_t, the L²-convergence gives

  E[∫_0^T Δ dW · I_A] = lim_{n→∞} E[∫_0^T Δ_n dW · I_A]
  = lim_{n→∞} E[∫_0^t Δ_n dW · I_A] = E[∫_0^t Δ dW · I_A].

The proof of (iv) requires results from martingale theory, see [Dur96] Chapter 2 Theorems 4.3a and 6.3.
Theorem 4.9 (Ito isometry)
For every adapted process Δ(t) ∈ L²_W we have

  E[(∫_0^t Δ(u) dW(u))²] = E[∫_0^t Δ(u)² du].

Proof: Take simple processes Δ_n → Δ in L²_W when n → ∞. Then ∫_0^t Δ_n(u) dW(u) → ∫_0^t Δ(u) dW(u) in L², and thus

  E[(∫_0^t Δ(u) dW(u))²] = lim_{n→∞} E[(∫_0^t Δ_n(u) dW(u))²]
  = lim_{n→∞} E[∫_0^t Δ_n(u)² du] = E[∫_0^t Δ(u)² du].
Step 3: SI with respect to Ito processes
Definition 4.10
Let W(t) a BM with filtration (F_t), t ≥ 0. An Ito process is a stochastic process of the form

  X(t) = X(0) + ∫_0^t μ(u) du + ∫_0^t σ(u) dW(u)   (4.6)

with a constant X(0) and adapted processes μ and σ satisfying the appropriate integrability conditions.
Remarks.
1) BM X(t) = W(t) is an Ito process: take X(0) = 0, μ(u) = 0, and σ(u) = 1.
2) Every "nice" continuous martingale X(t) is an Ito process of the form X(t) = ∫_0^t σ(u) dW(u) with a BM W(t).
Stochastic integrals. The limit in the definition (4.4) of the SI exists in L² if the integrator M is an Ito process:
Theorem 4.11 (Associativity)
Let Δ adapted and X(t) an Ito process

  X(t) = X(0) + ∫_0^t μ(u) du + ∫_0^t σ(u) dW(u).

Then under the appropriate integrability conditions

  ∫_0^t Δ(u) dX(u) = ∫_0^t Δ(u)μ(u) du + ∫_0^t Δ(u)σ(u) dW(u).

Implications:
1) If X(t) is an Ito process, then I(t) = ∫_0^t Δ(u) dX(u) is again an Ito process.
2) SIs w.r.t. Ito processes can be reduced to SIs w.r.t. Brownian motion plus ordinary (Lebesgue) integrals.
Proof of Theorem 4.11: a) The case X(t) = X(0) + ∫_0^t μ(u) du is established in Corollary 4.4.
b) Next suppose X(t) = ∫_0^t σ(u) dW(u). Let Δ^{(n)} be a sequence of simple processes with Δ^{(n)} → Δ a.s. (n → ∞) and |Δ^{(n)}| ≤ |Δ|. Then Δ^{(n)}σ → Δσ in L²_W, and thus

  ∫_0^t Δ^{(n)}(u) dX(u) = Σ_{j=0}^{m−1} Δ^{(n)}_{t_j} (X(t_{j+1}∧t) − X(t_j∧t))
  = Σ_{j=0}^{m−1} Δ^{(n)}_{t_j} ∫_{t_j∧t}^{t_{j+1}∧t} σ(u) dW(u)
  = Σ_{j=0}^{m−1} ∫_{t_j∧t}^{t_{j+1}∧t} Δ^{(n)}_{t_j} σ(u) dW(u)
  = ∫_0^t Δ^{(n)}(u)σ(u) dW(u) → ∫_0^t Δ(u)σ(u) dW(u) in L².

c) The case of a general Ito process now follows from linearity of the SI in the integrator.
Important Example: The SI ∫ X dX
Let X an Ito process as in (4.6). We compute⁸ ∫_0^t X(u) dX(u).
Let Π_1, Π_2, ... partitions of [0, t] with |Π_n| → 0 and define

  Δ_n(t) = Σ_{t_j∈Π_n} X(t_j) I_{[t_j,t_{j+1})}(t).

Then Δ_n(t) → X(t) and so

  2 ∫_0^t X(u) dX(u) = lim_{n→∞} Σ_{t_j∈Π_n} 2X(t_j)(X(t_{j+1}) − X(t_j))
  = lim_{n→∞} Σ_{t_j∈Π_n} (X(t_{j+1})² − X(t_j)²) − lim_{n→∞} Σ_{t_j∈Π_n} (X(t_{j+1}) − X(t_j))²
  = X(t)² − X(0)² − [X, X](t).   (4.7)

For X = W BM we have [W, W](t) = t by Theorem 3.11. Thus

  ∫_0^t W(u) dW(u) = ½W(t)² − ½t.

⁸ The SI exists if E[∫_0^t σ(u)⁴ du] < ∞.
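The identity ∫_0^t W dW = ½W(t)² − ½t can be seen numerically by evaluating the left-endpoint (Ito) sums along a simulated path. A minimal sketch (grid size and horizon are illustrative assumptions):

```python
import random

random.seed(3)

# Simulate one BM path on [0, t] with n increments
t, n = 2.0, 200000
dt = t / n
w = [0.0]
for _ in range(n):
    w.append(w[-1] + random.gauss(0.0, dt ** 0.5))

# Left-endpoint (Ito) sums: sum_j W(t_j) * (W(t_{j+1}) - W(t_j))
ito_sum = sum(w[j] * (w[j + 1] - w[j]) for j in range(n))

# Compare with (1/2) W(t)^2 - (1/2) t along the same path
rhs = 0.5 * w[-1] ** 2 - 0.5 * t
assert abs(ito_sum - rhs) < 0.05
```

The gap between the two sides is ½(t − Σ(ΔW)²), i.e., exactly the quadratic-variation error, which vanishes as the partition is refined.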
Ito-Doeblin formula for Brownian motion
The last equation

  W(t)² − W(0)² = 2 ∫_0^t W(u) dW(u) + t   (4.8)

shows that the chain rule from ordinary calculus does not hold for the stochastic integral with respect to BM: Let g(t) be differentiable.
• chain rule: (d/du)(g(u))² = 2g(u)g'(u)
• integral form: g(t)² − g(0)² = 2 ∫_0^t g(u)g'(u) du = 2 ∫_0^t g(u) dg(u)
The additional term t in (4.8) appears because BM has non-zero quadratic variation, in contrast to differentiable functions.
Let f : R → R be a C² function. We want a chain rule for f(W(t)). (Previously we had f(x) = x².) By Taylor's theorem

  f(W(t_{j+1})) − f(W(t_j)) = f'(W(t_j))(W(t_{j+1}) − W(t_j)) + ½f''(W(t*_j))(W(t_{j+1}) − W(t_j))²

with some t*_j ∈ (t_j, t_{j+1}). For a partition Π = {t_0, ..., t_m} of [0, t] take Σ_{j=0}^{m−1} and let |Π| → 0, then

  Σ_{j=0}^{m−1} f'(W(t_j))(W(t_{j+1}) − W(t_j)) → ∫_0^t f'(W(u)) dW(u),
  Σ_{j=0}^{m−1} ½f''(W(t*_j))(W(t_{j+1}) − W(t_j))² → ½ ∫_0^t f''(W(u)) du.

We obtain the Ito-Doeblin formula for BM

  f(W(t)) − f(W(0)) = ∫_0^t f'(W(u)) dW(u) + ½ ∫_0^t f''(W(u)) du.
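The formula can be tested numerically for a concrete choice of f. A minimal sketch with f = sin (so f' = cos, f'' = −sin; grid size is an illustrative assumption):

```python
import math
import random

random.seed(4)

# Check f(W(t)) - f(W(0)) ≈ int_0^t f'(W) dW + (1/2) int_0^t f''(W) du for f = sin
t, n = 1.0, 200000
dt = t / n
w = 0.0
si, li = 0.0, 0.0                  # stochastic and Lebesgue integrals
for _ in range(n):
    dW = random.gauss(0.0, dt ** 0.5)
    si += math.cos(w) * dW          # f'(W(u)) dW(u), left endpoint
    li += -math.sin(w) * dt         # f''(W(u)) du
    w += dW                         # advance the BM path

lhs = math.sin(w) - math.sin(0.0)
assert abs(lhs - (si + 0.5 * li)) < 0.02
```

Dropping the ½∫f'' du correction term makes the check fail for generic paths, which is the whole point of the formula: the second-order Taylor term survives in the limit because of the quadratic variation of BM.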
Main results on Ito processes
We ultimately want to extend the Ito-Doeblin formula to multivariate functions f(X_1(t), ..., X_d(t)) of Ito processes X_1, ..., X_d. To this end we need the quadratic variation of an Ito process.
Theorem 4.12
Let X(t) = X(0) + ∫_0^t μ(u) du + ∫_0^t σ(u) dW(u) an Ito process. Then its quadratic variation is

  [X, X](t) = ∫_0^t σ(u)² du.

Let Y(t) = Y(0) + ∫_0^t ν(u) du + ∫_0^t ρ(u) dW(u) another Ito process. The covariation of X and Y is then (symmetric bilinear form)

  [X, Y](t) = ½[X + Y, X + Y](t) − ½[X, X](t) − ½[Y, Y](t)
  = ½ ∫_0^t (σ(u) + ρ(u))² du − ½ ∫_0^t σ(u)² du − ½ ∫_0^t ρ(u)² du
  = ∫_0^t σ(u)ρ(u) du.
Theorem 4.13 (Continuous martingales with finite variation)
Let X(t) be a continuous martingale with finite first order variation. Then X(t) = X(0) for all t ≥ 0.
Proof: We give the proof under the condition that X and FV_X are bounded (the general result then follows via a method known as localization). Take a partition Π = {t_0, ..., t_m} of [0, t]. Then

  E[(X(t) − X(0))²] = E[Σ_{j=0}^{m−1} (X(t_{j+1}) − X(t_j))²]
  ≤ E[max_{j=0,...,m−1} |X(t_{j+1}) − X(t_j)| · Σ_{j=0}^{m−1} |X(t_{j+1}) − X(t_j)|]
  ≤ E[max_{j=0,...,m−1} |X(t_{j+1}) − X(t_j)| · FV_X(t)] → 0

for |Π| → 0 by dominated convergence, because everything is bounded and max_{j=0,...,m−1} |X(t_{j+1}) − X(t_j)| → 0 (continuity of X).
Hence E[(X(t) − X(0))²] = 0, and so X(t) = X(0) a.s.


Theorem 4.14 (Generalized Ito isometry)
Let X(t) = ∫_0^t σ(u) dW(u) with E[∫_0^t σ(u)² du] < ∞. Then the process

  M(t) = X(t)² − ∫_0^t σ(u)² du,  t ≥ 0

is a continuous martingale.
Proof: M(t) is adapted and integrable since X(t) is square-integrable. Also E[(X(t) − X(s))² | F_s] = E[X(t)² − X(s)² | F_s] for s ≤ t since X(t) is a martingale. Moreover, X(t) − X(s) = ∫_s^t σ(u) dW(u). Hence

  E[M(t) − M(s) | F_s] = E[(∫_s^t σ(u) dW(u))² − ∫_s^t σ(u)² du | F_s].

So it remains to show that for every A ∈ F_s,

  E[I_A ((∫_s^t σ(u) dW(u))² − ∫_s^t σ(u)² du)] = 0.

But this equation is Ito isometry (Theorem 4.9) for the process

  σ̃(u) = 0 for u ≤ s,  σ(u)I_A for u > s.
Proof of Theorem 4.12: We have to show [X, X](t) = ∫_0^t σ(u)^2 du. We give a proof under the assumption that σ is bounded. By remarks 1) and 3) after Definition 3.10, [X, X](t) = [Y, Y](t) for Y(t) = ∫_0^t σ(u) dW(u), so it suffices to show

[Y, Y](t) = ∫_0^t σ(u)^2 du.

To this end, we use (4.7) to compute

[Y, Y](t) − ∫_0^t σ(u)^2 du = Y(t)^2 − 2∫_0^t Y(u) dY(u) − ∫_0^t σ(u)^2 du
= Y(t)^2 − ∫_0^t σ(u)^2 du − 2∫_0^t Y(u)σ(u) dW(u).

The RHS is a continuous martingale by Theorem 4.14 and the martingale property of Brownian integrals. The LHS has finite first order variation. So Theorem 4.13 yields [Y, Y](t) − ∫_0^t σ(u)^2 du = 0 for all t. □
Unique representation of Ito processes. Let X(t) be an Ito process and suppose that we have two representations

X(t) = X(0) + ∫_0^t α_1(u) du + ∫_0^t σ_1(u) dW(u)
= X(0) + ∫_0^t α_2(u) du + ∫_0^t σ_2(u) dW(u)

with adapted processes α_i, σ_i for i = 1, 2 satisfying the required integrability conditions. Then

Z(t) := ∫_0^t (α_1(u) − α_2(u)) du = ∫_0^t (σ_2(u) − σ_1(u)) dW(u)

for all t. So Z(t) is a continuous martingale (RHS) with finite first order variation (LHS), hence Z(t) = 0 for all t by Theorem 4.13. We obtain Z'(t) = α_1(t) − α_2(t) = 0. Also

0 = E[(∫_0^t (σ_1(u) − σ_2(u)) dW(u))^2] = E[∫_0^t (σ_1(u) − σ_2(u))^2 du],

which implies σ_1(t) − σ_2(t) = 0.
Ito-Doeblin formula for Ito processes

Theorem 4.15 (Ito-Doeblin formula for Ito processes)
Let f : R^d → R be a C^2 function and X_1(t), ..., X_d(t) Ito processes. Set X(t) = (X_1(t), ..., X_d(t)). Then

f(X(t)) = f(X(0)) + Σ_{i=1}^d ∫_0^t f_{x_i}(X(u)) dX_i(u)
+ (1/2) Σ_{i,j=1}^d ∫_0^t f_{x_i x_j}(X(u)) d[X_i, X_j](u).    (4.9)

Remarks. 1) In particular f(X(t)) is again an Ito process.
2) If some X_i(t) = X_i(0) + ∫_0^t α_i(u) du has finite variation, then [X_i, X_j](t) = 0 for all j, so the corresponding integrals in (4.9) vanish.
Special cases
1) For d = 2 with X_1(t) = t, X_2(t) = W(t) we obtain the Ito-Doeblin formula for BM (general case)

f(t, W(t)) = f(0, W(0)) + ∫_0^t f_t(u, W(u)) du
+ ∫_0^t f_x(u, W(u)) dW(u) + (1/2) ∫_0^t f_xx(u, W(u)) du.    (4.10)

2) Let X, Y be Ito processes and f(x, y) = xy. Then we obtain the product rule

X(t)Y(t) = X(0)Y(0) + ∫_0^t X(u) dY(u) + ∫_0^t Y(u) dX(u) + [X, Y](t).    (4.11)

Proof of Theorem 4.15: See [Dur96] Section 2.10.
Differential notation
We often write the Ito formula

f(X(t)) = f(X(0)) + Σ_{i=1}^d ∫_0^t f_{x_i}(X(u)) dX_i(u) + (1/2) Σ_{i,j=1}^d ∫_0^t f_{x_i x_j}(X(u)) d[X_i, X_j](u)

in a differential form:

df(X(t)) = Σ_{i=1}^d f_{x_i}(X(t)) dX_i(t) + (1/2) Σ_{i,j=1}^d f_{x_i x_j}(X(t)) d[X_i, X_j](t).

For an Ito process

X(t) = X(0) + ∫_0^t α(u) du + ∫_0^t σ(u) dW(u)

let Y(t) = ∫_0^t Γ(u) dX(u). By Theorem 4.11 we have

Y(t) = Y(0) + ∫_0^t Γ(u)α(u) du + ∫_0^t Γ(u)σ(u) dW(u).

We write this in differential form as

dX(t) = α(t) dt + σ(t) dW(t), dY(t) = Γ(t) dX(t)

implies

dY(t) = Γ(t)α(t) dt + Γ(t)σ(t) dW(t).
Examples
1) Generalized geometric BM. Let W(t) be a BM for a filtration (F_t)_{t≥0}, let α(t), σ(t) be adapted processes, and

X(t) = ∫_0^t (α(u) − (1/2)σ(u)^2) du + ∫_0^t σ(u) dW(u).

By the Ito formula S(t) = S_0 e^{X(t)} is an Ito process. f(x) = S_0 e^x yields f(X(t)) = f'(X(t)) = f''(X(t)) = S(t) and so

dS(t) = S(t) dX(t) + (1/2) S(t) d[X, X](t).

By Theorem 4.12, [X, X](t) = ∫_0^t σ(u)^2 du, so Theorem 4.11 yields

dS(t) = S(t)(α(t) − (1/2)σ(t)^2) dt + S(t)σ(t) dW(t) + (1/2) S(t)σ(t)^2 dt
= S(t)α(t) dt + S(t)σ(t) dW(t).

For constant α and σ we obtain X(t) = (α − (1/2)σ^2)t + σW(t), so

S(t) = S_0 e^{σW(t) + (α − (1/2)σ^2)t}

is geometric BM.
2) Ornstein-Uhlenbeck process. Let W(t) be a BM and R(t) an Ito process satisfying the stochastic differential equation

dR(t) = (α − βR(t)) dt + σ dW(t)

for some constants α, β, σ. The solution of this equation is

R(t) = α/β + e^{−βt} (R(0) − α/β + ∫_0^t σ e^{βu} dW(u)).    (4.12)

To see this let X(t) = R(0) − α/β + ∫_0^t σ e^{βu} dW(u) and note de^{−βt} = e^{−βt}(−β) dt. Then by the product rule

dR(t) = d(e^{−βt} X(t)) = e^{−βt} dX(t) + X(t) de^{−βt}
= e^{−βt} σ e^{βt} dW(t) + e^{−βt} X(t)(−β) dt
= σ dW(t) − β(R(t) − α/β) dt
= (α − βR(t)) dt + σ dW(t).

The process R(t) is used to model the spot interest rate in the Vasicek interest rate model.
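A small simulation illustrates (4.12) (an illustrative sketch, not from the lecture; numpy and the parameter values are my choices). An Euler scheme for dR = (α − βR) dt + σ dW should reproduce the mean E[R(t)] = α/β + e^{−βt}(R(0) − α/β) and the variance σ^2(1 − e^{−2βt})/(2β) implied by the closed-form solution and Theorem 4.16 below:

```python
import numpy as np

# Euler simulation of dR = (alpha - beta*R) dt + sigma dW, compared with the
# mean and variance implied by the closed form (4.12).
rng = np.random.default_rng(2)
alpha, beta, sigma, R0 = 0.06, 2.0, 0.1, 0.01
t, m, paths = 1.0, 1000, 20_000
dt = t / m

R = np.full(paths, R0)
for _ in range(m):
    dW = rng.normal(0.0, np.sqrt(dt), paths)
    R = R + (alpha - beta * R) * dt + sigma * dW

mean_exact = alpha / beta + np.exp(-beta * t) * (R0 - alpha / beta)
var_exact = sigma**2 * (1 - np.exp(-2 * beta * t)) / (2 * beta)
print(R.mean(), mean_exact, R.var(), var_exact)
```

The simulated mean drifts from R(0) toward the long-term level α/β, the mean-reversion effect described next.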
We assume that α, β, σ > 0 and write the Ornstein-Uhlenbeck stochastic differential equation as

dR(t) = β(α/β − R(t)) dt + σ dW(t).

The OU process R(t) is mean-reverting with long-term mean α/β. The speed of mean reversion is determined by β, and the volatility of the process by σ. (4.12) yields the distribution of R(t) using
Theorem 4.16
Let θ(t) be a deterministic function and W(t) a Brownian motion. Then I(t) = ∫_0^t θ(s) dW(s) ~ N(0, ∫_0^t θ(s)^2 ds).

Proof (sketch): E[I(t)] = 0 and Var[I(t)] = ∫_0^t θ(s)^2 ds by Ito isometry. To prove normality, take u ∈ R and define S(t) = e^{X(t)} with X(t) = −∫_0^t (1/2)(iuθ(s))^2 ds + ∫_0^t iuθ(s) dW(s). Then by 1), dS(t) = S(t)iuθ(t) dW(t), so S(t) is a martingale.^9 Thus

E[e^{iuI(t)}] = E[e^{X(t)}] e^{−(1/2)u^2 ∫_0^t θ(s)^2 ds} = e^{−(1/2)u^2 ∫_0^t θ(s)^2 ds}.

^9 This follows from Novikov's condition, see Karatzas and Shreve, Brownian Motion and Stochastic Calculus, Chap. 3.5.D.
The Black-Scholes-Merton Equation
Black-Scholes model: B(t) = e^{rt}, S(t) = S_0 e^{σW(t) + (α − (1/2)σ^2)t}, i.e.

dB(t) = B(t) r dt,
dS(t) = S(t)α dt + S(t)σ dW(t).

Pricing derivatives via replication. Consider the payoff (S(T) − K)^+ at time T. We want to find a sfts with stock position Δ(t) and initial value X(0) whose value process has the form

X(t) = c(t, S(t))    (4.13)

for some deterministic function c(t, x) and all t ∈ [0, T]. The sfts is a replication strategy if c(t, x) satisfies the terminal condition

c(T, x) = (x − K)^+ for all x ≥ 0,

since this implies X(T) = c(T, S(T)) = (S(T) − K)^+.
How do we find c(t, x) satisfying (4.13)?
The Black-Scholes-Merton PDE. We have X(t) = c(t, S(t)). Compute the differential on both sides. By (4.5)

X(t)/B(t) = X(0) + ∫_0^t Δ(u) d(S(u)/B(u)).

Also S(t)/B(t) = S_0 e^{σW(t) + ((α−r) − (1/2)σ^2)t} is a geometric BM, so

d(S(t)/B(t)) = (S(t)/B(t))(α − r) dt + (S(t)/B(t))σ dW(t).    (4.14)

Thus

d(X(t)/B(t)) = Δ(t) d(S(t)/B(t)) = Δ(t)(S(t)/B(t))(α − r) dt + Δ(t)(S(t)/B(t))σ dW(t),    (4.15)

dX(t) = d((X(t)/B(t)) B(t)) = (X(t)/B(t)) dB(t) + B(t) d(X(t)/B(t))
= X(t) r dt + Δ(t)S(t)(α − r) dt + Δ(t)S(t)σ dW(t)
= (X(t) − Δ(t)S(t)) r dt + Δ(t) dS(t).    (4.16)

So

dX(t) = (X(t) − Δ(t)S(t)) r dt + Δ(t) dS(t).

By Ito's formula,

dc(t, S(t)) = c_t(t, S(t)) dt + c_x(t, S(t)) dS(t) + (1/2) c_xx(t, S(t)) d[S, S](t)
= (c_t(t, S(t)) + (1/2) c_xx(t, S(t)) S(t)^2 σ^2) dt + c_x(t, S(t)) dS(t).

Equating dX(t) and dc(t, S(t)), we obtain

Δ(t) = c_x(t, S(t)),    (4.17)
rc(t, S(t)) = rc_x(t, S(t))S(t) + c_t(t, S(t)) + (1/2) c_xx(t, S(t)) S(t)^2 σ^2.

So c(t, x) is the solution to the Black-Scholes-Merton PDE

c_t(t, x) + rx c_x(t, x) + (1/2) σ^2 x^2 c_xx(t, x) = rc(t, x)    (4.18)

for all t ∈ [0, T), x > 0, with terminal condition

c(T, x) = (x − K)^+ for all x ≥ 0.    (4.19)
The Black-Scholes formula is the solution to this PDE. It is

c(t, x) = x N(d_+(t, x)) − K e^{−r(T−t)} N(d_−(t, x))    (4.20)

where N denotes the standard normal cdf and

d_±(t, x) = (1/(σ√(T−t))) [log(x/K) + (r ± σ^2/2)(T − t)].

This can be checked by direct verification.^10 The fair price at time t of a European call with strike K and maturity T in the Black-Scholes model is X(t) = c(t, S(t)). The replication strategy is given by X(0) = c(0, S_0) and Δ(t) = c_x(t, S(t)).

^10 See [Shr04], Ex. 4.9. We shall see another derivation of (4.20) in Chapt. 5.
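Formula (4.20) is easy to implement directly (an illustrative sketch, not part of the lecture; the parameter values are arbitrary, and the function names are my own). Using the standard library's erf for the normal cdf N:

```python
from math import log, sqrt, exp, erf

def N(x):
    # standard normal cdf
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(t, x, K, T, r, sigma):
    # Black-Scholes formula (4.20)
    tau = T - t
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return x * N(d_plus) - K * exp(-r * tau) * N(d_minus)

c = bs_call(0.0, 100.0, 100.0, 1.0, 0.05, 0.2)
# put price via put-call parity P = C - S + K e^{-r tau} (see the example below)
p = c - 100.0 + 100.0 * exp(-0.05)
print(c, p)
```

Note that the drift α appears nowhere in the function arguments, in line with Remark 2) below.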
Remarks. 1) The Black-Scholes formula c(t, x) is a deterministic function. The price X(t) = c(t, S(t)) and the stock position Δ(t) = c_x(t, S(t)), t ∈ [0, T], are adapted stochastic processes.
2) The Black-Scholes formula does not depend on the drift α of the stock. c(t, S(t)) depends on r, σ (model parameters), K, T (option characteristics), and t, S(t) (state variables).
3) The above analysis applies to every European type derivative with a payoff h(S_T) at time T for some function h(x), x ≥ 0. The fair price is given by the solution c^{(h)}(t, x) to the PDE (4.18) with terminal condition c^{(h)}(T, x) = h(x) for all x ≥ 0.

Example. Let C(t), P(t) and F(t) be prices of a European call, a European put, and a forward contract with payoffs (S(T) − K)^+, (K − S(T))^+, and S(T) − K, respectively. In any model we have C(t) = F(t) + P(t) (put-call parity) under absence of arbitrage. Also F(t) = S(t) − e^{−r(T−t)}K in the Black-Scholes model (check the PDE). Hence, in the Black-Scholes model

P(t) = c(t, S(t)) − S(t) + e^{−r(T−t)}K.
Delta hedging and the greeks
The partial derivatives of c(t, S(t)) w.r.t. t, S(t), σ, r are called the greeks. In particular,

Δ(t) = c_x(t, S(t)) = N(d_+(t, S(t))) > 0,
Γ(t) = c_xx(t, S(t)) = N'(d_+(t, S(t))) / (S(t)σ√(T−t)) > 0,
Θ(t) = c_t(t, S(t)) = −σS(t)N'(d_+(t, S(t))) / (2√(T−t)) − rKe^{−r(T−t)} N(d_−(t, S(t))) < 0.

Example: Hedging a short position in a call option. At time t_0: (−1) shares of option, β(t_0) shares of bond, Δ(t_0) shares of stock. Total portfolio value at t ≥ t_0:

V(t, S(t)) = (−1)c(t, S(t)) + β(t_0)e^{rt} + Δ(t_0)S(t)

- delta-neutral: ∂V(t, S)/∂S = 0
- short gamma: ∂²V(t, S)/∂S² < 0
- long theta: ∂V(t, S)/∂t > 0
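The closed-form greeks above can be cross-checked against finite differences of the Black-Scholes formula (an illustrative sketch, not from the lecture; the parameter values and step size h are my choices):

```python
from math import log, sqrt, exp, erf, pi

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def n(x):
    # standard normal density N'(x)
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(t, x, K, T, r, sigma):
    tau = T - t
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return x * N(d_plus) - K * exp(-r * tau) * N(d_plus - sigma * sqrt(tau))

t, x, K, T, r, sigma = 0.0, 100.0, 100.0, 1.0, 0.05, 0.2
tau = T - t
d_plus = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
d_minus = d_plus - sigma * sqrt(tau)

delta = N(d_plus)                                         # c_x
gamma = n(d_plus) / (x * sigma * sqrt(tau))               # c_xx
theta = (-sigma * x * n(d_plus) / (2 * sqrt(tau))
         - r * K * exp(-r * tau) * N(d_minus))            # c_t

h = 1e-4  # finite-difference step
delta_fd = (bs_call(t, x + h, K, T, r, sigma) - bs_call(t, x - h, K, T, r, sigma)) / (2 * h)
theta_fd = (bs_call(t + h, x, K, T, r, sigma) - bs_call(t - h, x, K, T, r, sigma)) / (2 * h)
print(delta, delta_fd, theta, theta_fd)
```

The finite-difference values match the closed forms, and the signs confirm Δ, Γ > 0 and Θ < 0.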
Multi-dimensional Brownian motion

Definition 4.17
A d-dimensional Brownian motion is a process W(t) = (W_1(t), ..., W_d(t)), t ≥ 0, such that
(i) W_i(t) is a one-dimensional Brownian motion for each i
(ii) If i ≠ j then the processes W_i(t) and W_j(t) are independent

Let F_t = σ(W(s) | s ≤ t) for t ≥ 0 be the filtration generated by the process W(t). As in the one-dimensional case we have for every 0 ≤ t < u that W(u) − W(t) is independent of F_t.

Theorem 4.18
Let W(t) = (W_1(t), ..., W_d(t)), t ≥ 0, be a d-dimensional Brownian motion. Then

[W_i, W_j](t) = t if i = j, and 0 if i ≠ j.

Proof: See [Shr04] Section 4.6.1.
Ito processes for multi-dimensional BM

Definition 4.19
Let W(t) = (W_1(t), ..., W_d(t)) be a d-dimensional BM with filtration (F_t), t ≥ 0. An Ito process is a stochastic process of the form

X(t) = X(0) + ∫_0^t α(u) du + Σ_{i=1}^d ∫_0^t σ_i(u) dW_i(u)    (4.21)

with a constant X(0) and adapted processes α and σ_i satisfying the appropriate integrability conditions.

Most results on Ito processes for 1-dimensional BM carry over.

Theorem 4.20 (Associativity)
Let Γ be adapted and X(t) an Ito process as in (4.21). Then under the appropriate integrability conditions

∫_0^t Γ(u) dX(u) = ∫_0^t Γ(u)α(u) du + Σ_{i=1}^d ∫_0^t Γ(u)σ_i(u) dW_i(u).
We again need quadratic variation and covariation of Ito processes.

Theorem 4.21
Let

X(t) = X(0) + ∫_0^t α(u) du + Σ_{i=1}^d ∫_0^t σ_i(u) dW_i(u),
Y(t) = Y(0) + ∫_0^t β(u) du + Σ_{i=1}^d ∫_0^t γ_i(u) dW_i(u).

Then

[X, Y](t) = Σ_{i=1}^d ∫_0^t σ_i(u)γ_i(u) du.

Proof: The idea is to first verify the claim for simple processes h_1, h_2. For adapted processes h_1, h_2, one approximates the integrands h_1, h_2 with simple processes.
Details: See [Dur96] Chap. 2, Theorems (4.2c) (simple processes) and (5.4), (6.5), (8.7) (general integrands).
The Ito formula (4.9) again holds.

Theorem 4.22
Let f : R^n → R be a C^2 function and X_1(t), ..., X_n(t) Ito processes. Set X(t) = (X_1(t), ..., X_n(t)). Then

f(X(t)) = f(X(0)) + Σ_{i=1}^n ∫_0^t f_{x_i}(X(u)) dX_i(u)
+ (1/2) Σ_{i,j=1}^n ∫_0^t f_{x_i x_j}(X(u)) d[X_i, X_j](u).

Proof: See [Dur96] Section 2.10.
The next result provides an important characterization of BM.

Theorem 4.23 (Levy)
Let M_1(t), ..., M_d(t) be continuous martingales for a filtration (F_t)_{t≥0} with M_i(0) = 0, [M_i, M_i](t) = t, and [M_i, M_j](t) = 0 for i ≠ j. Then M(t) = (M_1(t), ..., M_d(t)), t ≥ 0, is a d-dim BM.

Proof (for d = 1, sketch): We show M(t) − M(s) ~ N(0, t − s) and independent of F_s for s ≤ t. Take u ∈ R and define

X(t) = e^{iuM(t) + (1/2)u^2 t}.

By Ito's formula for continuous martingales (see [Dur96] Chap. 2)

dX(t) = X(t) d(iuM(t) + (1/2)u^2 t) + (1/2) X(t) d[iuM, iuM](t)
= X(t)iu dM(t) + X(t)(1/2)u^2 dt + (1/2)X(t)(−1)u^2 d[M, M](t)
= X(t)iu dM(t).

It follows that X(t) is also a martingale. Hence

E[e^{iuM(t) + (1/2)u^2 t} | F_s] = e^{iuM(s) + (1/2)u^2 s},
E[e^{iu(M(t) − M(s))} | F_s] = e^{−(1/2)u^2 (t−s)}.

This implies M(t) − M(s) ~ N(0, t − s) and independent of F_s.
Application: Constructing correlated stock prices
For geometric BM dS(t) = S(t)α dt + S(t)σ dW(t) we have

log(S(t+δ)/S(t)) = σ(W(t+δ) − W(t)) + (α − (1/2)σ^2)δ.

Suppose we model two stocks S_1, S_2 by

dS_1(t) = S_1(t)α_1 dt + S_1(t)σ_1 dW_1(t),
dS_2(t) = S_2(t)α_2 dt + S_2(t)σ_2 (ρ dW_1(t) + √(1−ρ^2) dW_2(t))

for independent BMs W_1(t), W_2(t) and some ρ ∈ [−1, 1]. Define

W_3(t) = ρW_1(t) + √(1−ρ^2) W_2(t).

Then W_3(t) is a cont. martingale with W_3(0) = 0 and [W_3, W_3](t) = t.

- By Levy's theorem in one dimension W_3(t) is a 1-dim BM, and

dS_2(t) = S_2(t)α_2 dt + S_2(t)σ_2 dW_3(t)

is a geometric BM.

- We have

Corr(log(S_1(t+δ)/S_1(t)), log(S_2(t+δ)/S_2(t)))
= Corr(σ_1(W_1(t+δ) − W_1(t)) + (α_1 − (1/2)σ_1^2)δ, σ_2(W_3(t+δ) − W_3(t)) + (α_2 − (1/2)σ_2^2)δ)
= Corr(W_1(t+δ) − W_1(t), W_3(t+δ) − W_3(t))
= Corr(W_1(t+δ) − W_1(t), ρ(W_1(t+δ) − W_1(t)) + √(1−ρ^2)(W_2(t+δ) − W_2(t)))
= ρ Corr(W_1(t+δ) − W_1(t), W_1(t+δ) − W_1(t))
= ρ.

That is, logarithmic returns of S_1 and S_2 have correlation ρ.
In summary, starting from a 2-dim BM, we constructed two geometric BM asset price processes S_1 and S_2 with correlated logarithmic returns.
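The construction is easy to reproduce in simulation (an illustrative sketch, not from the lecture; numpy, the parameter values, and the seed are my choices). Sampling log-returns over one period [t, t+δ] directly from independent Gaussian increments:

```python
import numpy as np

# Simulate the two correlated geometric BMs and check that log-returns
# over [t, t+delta] have correlation approximately rho.
rng = np.random.default_rng(3)
rho, sigma1, sigma2, alpha1, alpha2 = 0.7, 0.2, 0.3, 0.05, 0.08
delta, paths = 0.25, 100_000

dW1 = rng.normal(0.0, np.sqrt(delta), paths)
dW2 = rng.normal(0.0, np.sqrt(delta), paths)
dW3 = rho * dW1 + np.sqrt(1 - rho**2) * dW2   # increment of the BM W_3

logret1 = sigma1 * dW1 + (alpha1 - 0.5 * sigma1**2) * delta
logret2 = sigma2 * dW3 + (alpha2 - 0.5 * sigma2**2) * delta
corr = np.corrcoef(logret1, logret2)[0, 1]
print(corr)                                    # ≈ rho = 0.7
```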
5. Risk-Neutral Pricing
Main topics
- Change of measure: Girsanov's theorem
- Risk-neutral measure
- Martingale representation theorem
- Fundamental theorems of asset pricing

Motivation
Discrete time: If the discounted stock process is a martingale, so is the discounted value process of any sfts (Example 2.43).
Continuous time: Let B(t) = e^{rt}, S(t) = S_0 e^{σW(t) + (α − (1/2)σ^2)t} (Black-Scholes model) and X(t) the value process of a sfts in this market. By Theorem 3.6, S(t)/B(t) is a martingale if and only if α = r. In this case X(t)/B(t) is also a martingale by (4.15) and Theorem 4.8. If in addition X(T) = (S(T) − K)^+ (replication strategy), then

X(t)/B(t) = E[X(T)/B(T) | F_t] = E[(S(T) − K)^+ / B(T) | F_t].    (5.1)

Problem: In general α ≠ r.
Idea: Find a new measure P̃ on (Ω, F) such that S(t)/B(t), t ≥ 0, becomes a martingale under P̃. Then replace E with Ẽ in (5.1).
Change of measure
Let Z ≥ 0 be a random variable on (Ω, F, P) with E[Z] = 1. Then

P̃[A] = E[ZI_A] for all A ∈ F    (5.2)

defines a new probability measure P̃ on (Ω, F) (check!). Note that for each A ∈ F, if P[A] = 0 then also P̃[A] = 0. Conversely, we have

Theorem 5.1 (Radon-Nikodym)
Suppose P and P̃ are probability measures on (Ω, F) such that for each A ∈ F, if P[A] = 0 then also P̃[A] = 0. Then there exists a random variable Z ≥ 0 on (Ω, F) such that E[Z] = 1 and

P̃[A] = E[ZI_A] for all A ∈ F.

Proof: See [Dur95] Appendix A.8.

Z is called a Radon-Nikodym derivative of P̃ w.r.t. P and denoted Z = dP̃/dP. If Z' is another random variable with the properties in Theorem 5.1, then P[Z = Z'] = 1 (check!).
Examples
1) Finite space. Let Ω be finite, F = power set of Ω, and P, P̃ probability measures such that P[A] = 0 implies P̃[A] = 0. Then

Z(ω) = (P̃[{ω}]/P[{ω}]) I_{{P[{ω}]>0}}

is a Radon-Nikodym derivative of P̃ w.r.t. P (check!). For ω with P̃[{ω}] > 0 we have

P̃[{ω}] = Z(ω)P[{ω}],

so P̃ is obtained by weighting the probabilities under P with Z.

2) Binomial model. P and P̃ on (Ω_N, F) as in Examples 2.2 and 2.42. Then by 1)

Z(ω) = P̃[{ω}]/P[{ω}] = 2^N p̃^{#H(ω)} (1 − p̃)^{#T(ω)}, ω ∈ Ω_N.

3) Normal distributions. Let X be a random variable on (Ω, F, P) with P[X ≤ x] = N(x) and Y = X + θ for some θ ∈ R. Hence Y ~ N(θ, 1) under P.
Find a probability P̃ on (Ω, F) such that Y ~ N(0, 1) under P̃.
Solution: Define Z = e^{−θX − (1/2)θ^2}. Then Z > 0 and

E[Z] = ∫ (1/√(2π)) e^{−(1/2)x^2} e^{−θx − (1/2)θ^2} dx = ∫ (1/√(2π)) e^{−(1/2)(x+θ)^2} dx = 1.

So P̃[A] = E[ZI_A] for A ∈ F defines a probability measure on (Ω, F), and for b ∈ R

P̃[Y ≤ b] = E[ZI_{Y≤b}] = E[e^{−θX − (1/2)θ^2} I_{X≤b−θ}]
= ∫_{−∞}^{b−θ} (1/√(2π)) e^{−(1/2)x^2} e^{−θx − (1/2)θ^2} dx = ∫_{−∞}^{b−θ} (1/√(2π)) e^{−(1/2)(x+θ)^2} dx
= ∫_{−∞}^{b} (1/√(2π)) e^{−(1/2)y^2} dy = N(b),

so Y ~ N(0, 1) under P̃.
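Example 3) can be checked by reweighted Monte Carlo (an illustrative sketch, not from the lecture; numpy, θ = 0.5, and the seed are my choices). Computing P̃-expectations as P-expectations against the density Z should recover the moments of a standard normal:

```python
import numpy as np

# X ~ N(0,1) under P, Y = X + theta, Z = exp(-theta*X - theta^2/2).
# Reweighting by Z makes Y standard normal under P~:
# E[Z] = 1,  E~[Y] = E[ZY] = 0,  E~[Y^2] = E[Z Y^2] = 1.
rng = np.random.default_rng(4)
theta, samples = 0.5, 1_000_000
X = rng.normal(0.0, 1.0, samples)
Y = X + theta
Z = np.exp(-theta * X - 0.5 * theta**2)

print(Z.mean(), (Z * Y).mean(), (Z * Y**2).mean())
```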
Expectations under change of measure. Let P and P̃ be as before, Z = dP̃/dP, and (F_t)_{t∈[0,T]} a filtration on (Ω, F).
The Radon-Nikodym derivative process of P̃ w.r.t. P is

Z(t) := E[Z | F_t], t ∈ [0, T].    (5.3)

By iterated conditioning it is a nonnegative martingale.

Lemma 5.2
Let Y be a P̃-integrable random variable. Then

Ẽ[Y] = E[YZ].    (5.4)

If moreover Y is F_t-measurable, then also

Ẽ[Y] = E[YZ(t)],    (5.5)
Z(s)Ẽ[Y | F_s] = E[YZ(t) | F_s] for all s ≤ t.    (5.6)

Proof. For (5.4): If Y = Σ_{i=1}^k y_i I_{A_i} with A_i ∈ F then

Ẽ[Y] = Σ_{i=1}^k y_i P̃[A_i] = Σ_{i=1}^k y_i E[ZI_{A_i}] = E[Z Σ_{i=1}^k y_i I_{A_i}].

If Y is nonnegative, take simple Y_n with Y_n ↑ Y; then by monotone convergence

Ẽ[Y] = lim_n Ẽ[Y_n] = lim_n E[Y_n Z] = E[YZ].

If Y is integrable, use Y = Y^+ − Y^−.

For (5.5):

Ẽ[Y] = E[YZ] = E[E[YZ | F_t]] = E[Y E[Z | F_t]] = E[YZ(t)],

using (5.4) in the first step.

For (5.6): Z(s)Ẽ[Y | F_s] is F_s-measurable, and for A ∈ F_s

E[Z(s)Ẽ[Y | F_s] I_A] = E[Ẽ[YI_A | F_s] Z(s)] = Ẽ[Ẽ[YI_A | F_s]] = Ẽ[YI_A] = E[YI_A Z(t)]

by (5.5), proving the partial averaging property for Z(s)Ẽ[Y | F_s] and YZ(t).
Girsanov's Theorem
In view of the role of martingales in risk-neutral pricing we ask:
- How do martingales behave under change of measure?
- How does BM behave under change of measure?

Definition 5.3
Probability measures P and P̃ on (Ω, F) are equivalent if they agree which sets in F have probability zero, that is if

P[A] = 0 ⟺ P̃[A] = 0 for each A ∈ F.

If P, P̃ are equivalent then

Z = dP̃/dP > 0 P-a.s.,
Z(t) = E[Z | F_t] > 0 P-a.s.,

since P̃[Z = 0] = E[I_{Z=0}Z] = 0 and P̃[Z(t) = 0] = Ẽ[I_{Z(t)=0}] = E[I_{Z(t)=0}Z(t)] = 0 by (5.5), hence P[Z = 0] = P[Z(t) = 0] = 0.

Theorem 5.4
Let P, P̃ be equivalent and Y(t), t ∈ [0, T] an adapted process. Y(t) is a P̃-martingale if and only if Y(t)Z(t) is a P-martingale.

Proof: First note Ẽ[|Y(t)|] = E[|Y(t)|Z(t)] = E[|Y(t)Z(t)|], so Y(t) is P̃-integrable if and only if Y(t)Z(t) is P-integrable.
If Y(t) is a P̃-martingale then for s ≤ t, by (5.6),

E[Y(t)Z(t) | F_s] = Z(s)Ẽ[Y(t) | F_s] = Y(s)Z(s).

If Y(t)Z(t) is a P-martingale then for s ≤ t, by (5.6),

Ẽ[Y(t) | F_s] = (1/Z(s)) E[Y(t)Z(t) | F_s] = (1/Z(s)) Y(s)Z(s) = Y(s).
Let W(t), t ∈ [0, T] be a Brownian motion on a space (Ω, F, P) with a filtration (F_t)_{t∈[0,T]}. For an adapted Θ(t) define

Z(t) := exp(−∫_0^t Θ(u) dW(u) − (1/2)∫_0^t Θ(u)^2 du), t ∈ [0, T].    (5.7)

This is a generalized geometric BM with

Z(t) = Z(0) − ∫_0^t Z(u)Θ(u) dW(u),

see Chap. 4. Assuming some integrability,^11 Z(t) is a martingale. So E[Z(T)] = Z(0) = 1 and we can define a probability measure P̃ by

P̃[A] = E[Z(T)I_A] for all A ∈ F.    (5.8)

Theorem 5.5 (Girsanov)
The process W̃(t) := W(t) + ∫_0^t Θ(u) du, t ∈ [0, T], is a BM under P̃.

^11 A sufficient condition is E[exp((1/2)∫_0^T Θ(u)^2 du)] < ∞.

Proof. We use Levy's characterization of BM (Theorem 4.23). Firstly,

d(W̃(t)Z(t)) = W̃(t) dZ(t) + Z(t) dW̃(t) + d[Z, W̃](t)
= −W̃(t)Z(t)Θ(t) dW(t) + Z(t) dW(t) + Z(t)Θ(t) dt − Z(t)Θ(t) · 1 dt
= Z(t)(1 − W̃(t)Θ(t)) dW(t),

so W̃(t)Z(t) is a P-martingale. Hence W̃(t) is a P̃-martingale by Theorem 5.4. Also W̃(t) is continuous, we have W̃(0) = 0, and [W̃, W̃](t) = [W, W](t) = t. Therefore W̃(t) is a BM under P̃.
Risk-neutral measure
Take a BM W(t) on a space (Ω, F, P) with a filtration (F_t)_{t∈[0,T]}. The market model for bank account B(t) and stock S(t):

dB(t) = B(t)R(t) dt, B(0) = 1,    (5.9)
dS(t) = S(t)α(t) dt + S(t)σ(t) dW(t), S(0) = S_0,    (5.10)

where R(t) (instantaneous interest rate), α(t) (mean rate of return) and σ(t) (volatility) are adapted processes. We assume σ(t) > 0 a.s. for all t. Then

B(t) = exp(∫_0^t R(u) du),
S(t) = S_0 exp(∫_0^t σ(u) dW(u) + ∫_0^t (α(u) − (1/2)σ(u)^2) du),
S(t)/B(t) = S_0 exp(∫_0^t σ(u) dW(u) + ∫_0^t (α(u) − R(u) − (1/2)σ(u)^2) du).

Thus S(t)/B(t) is a generalized geometric BM with

d(S(t)/B(t)) = (S(t)/B(t))(α(t) − R(t)) dt + (S(t)/B(t))σ(t) dW(t)
= (S(t)/B(t))σ(t)(Θ(t) dt + dW(t))    (5.11)
= (S(t)/B(t))σ(t) dW̃(t)    (5.12)

where we introduced the market price of risk process

Θ(t) = (α(t) − R(t))/σ(t)    (5.13)

and W̃(t) = W(t) + ∫_0^t Θ(u) du. Defining P̃ via the R-N derivative process Z(t) as in (5.7), (5.8), the process W̃(t) is a P̃-BM by Girsanov's theorem. P̃ is called the risk-neutral measure, P is called the real-world measure. (5.12) yields

Corollary 5.6
The discounted stock price process S(t)/B(t) is a martingale under P̃.
Stock price under the risk-neutral measure
From W̃(t) = W(t) + ∫_0^t (α(u) − R(u))/σ(u) du we obtain

S(t) = S_0 exp(∫_0^t σ(u) dW(u) + ∫_0^t (α(u) − (1/2)σ(u)^2) du)
= S_0 exp(∫_0^t σ(u) dW̃(u) + ∫_0^t (R(u) − (1/2)σ(u)^2) du)    (5.14)

and thus

dS(t) = S(t)R(t) dt + S(t)σ(t) dW̃(t).    (5.15)

The stock has mean rate of return equal to R(t) under P̃.
Let V(T) be an F_T-measurable random variable which represents the payoff of a derivative at time T. We allow path-dependence, so this includes but is not limited to payoffs V(T) = h(S(T)).
We say that V(T) is attainable if it can be replicated, i.e., if there exists a sfts with terminal value

X(T) = V(T) a.s.

The fair price V(t) of the derivative at t is equal to the value X(t) of the replication portfolio at t.

Corollary 5.7 (Risk-neutral pricing)
Suppose V(T) is attainable. Then

V(t) = B(t)Ẽ[V(T)/B(T) | F_t] = Ẽ[e^{−∫_t^T R(u) du} V(T) | F_t].    (5.16)

Proof: By (4.5), which holds for a general asset price B(t),

X(t)/B(t) = X(0) + ∫_0^t Δ(u) d(S(u)/B(u)).

So X(t)/B(t) is a P̃-martingale by (5.12), and V(t)/B(t) = X(t)/B(t).
Risk-neutral pricing in the Black-Scholes model
Let bond and stock satisfy

dB(t) = B(t)r dt, B(0) = 1,
dS(t) = S(t)α(t) dt + S(t)σ dW(t), S(0) = S_0,

where r and σ are constant and α(t) is adapted. Let Θ(t) = (α(t) − r)/σ and Z(t) and P̃ as in (5.7), (5.8). From (5.14) we have

S(t) = S_0 exp(σW̃(t) + (r − (1/2)σ^2)t)    (5.17)

where W̃(t) = W(t) + ∫_0^t Θ(u) du is a P̃-BM. Results in Chap. 4 say that the European call payoff V(T) = (S(T) − K)^+ can be replicated (the analysis is the same when the constant α is replaced by an adapted process). By Corollary 5.7

c(t, S(t)) = V(t) = Ẽ[e^{−r(T−t)}(S(T) − K)^+ | F_t].    (5.18)

We write

S(T) = S(t) e^{σ(W̃(T) − W̃(t)) + (r − (1/2)σ^2)(T−t)} = S(t) e^{σ√τ Y + (r − (1/2)σ^2)τ}

with τ = T − t and Y = (W̃(T) − W̃(t))/√(T−t) ~ N(0, 1) and independent of F_t under P̃. So by (5.18)

c(t, S(t)) = V(t) = Ẽ[(e^{−rτ} S(t) e^{σ√τ Y + (r − (1/2)σ^2)τ} − e^{−rτ}K)^+ | F_t]
= Ẽ[(x e^{σ√τ Y − (1/2)σ^2 τ} − e^{−rτ}K)^+]|_{x=S(t)}.

Note

x e^{σ√τ Y − (1/2)σ^2 τ} − e^{−rτ}K ≥ 0
⟺ log x + σ√τ Y − (1/2)σ^2 τ ≥ −rτ + log K
⟺ Y ≥ −(1/(σ√τ))(log(x/K) + (r − (1/2)σ^2)τ) = −d_−(t, x).

Hence

c(t, x) = Ẽ[x e^{σ√τ Y − (1/2)σ^2 τ} I_{Y≥−d_−(t,x)} − Ke^{−rτ} I_{Y≥−d_−(t,x)}]
= x N(d_−(t, x) + σ√τ) − Ke^{−rτ} N(d_−(t, x))
= x N(d_+(t, x)) − Ke^{−rτ} N(d_−(t, x)).
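The risk-neutral representation (5.18) lends itself to a Monte Carlo check (an illustrative sketch, not from the lecture; numpy, the parameter values, and the seed are my choices): simulate S(T) under P̃ via (5.17) and compare the discounted average payoff with the closed-form price:

```python
import numpy as np
from math import log, sqrt, exp, erf

# Monte Carlo for E~[e^{-rT}(S(T)-K)^+] using S(T) under P~, versus (4.20).
rng = np.random.default_rng(5)
S0, K, T, r, sigma, paths = 100.0, 100.0, 1.0, 0.05, 0.2, 2_000_000

WT = rng.normal(0.0, sqrt(T), paths)                 # P~-Brownian motion at T
ST = S0 * np.exp(sigma * WT + (r - 0.5 * sigma**2) * T)
mc_price = exp(-r * T) * np.maximum(ST - K, 0.0).mean()

N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))       # standard normal cdf
d_plus = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
bs_price = S0 * N(d_plus) - K * exp(-r * T) * N(d_plus - sigma * sqrt(T))
print(mc_price, bs_price)                            # the two prices agree
```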
Martingale representation theorem and hedging
In the Black-Scholes model, a European call can be replicated via the sfts X(0) = c(0, S(0)), Δ(t) = c_x(t, S(t)) (delta hedging, see Chap. 4). What about other models and derivatives?
Take a BM W(t) with filtration (F_t)_{t≥0} generated by this BM, and consider the model (5.9), (5.10)

dB(t) = B(t)R(t) dt, B(0) = 1,
dS(t) = S(t)α(t) dt + S(t)σ(t) dW(t), S(0) = S_0,

with adapted R(t), α(t), and σ(t) > 0. Define Θ(t) = (α(t) − R(t))/σ(t) and P̃ via the R-N derivative process Z(t) as in (5.7), (5.8). Also let W̃(t) = W(t) + ∫_0^t Θ(u) du. This is a P̃-BM.
The next result guarantees the existence of replication strategies. For a proof see e.g. Revuz and Yor, Continuous Martingales and Brownian Motion, Theorem V.3.4.

Theorem 5.8 (Martingale representation theorem)
Let W(t), t ∈ [0, T] be a BM and (F_t)_{t∈[0,T]} the filtration generated by this BM. For every P-martingale M(t) w.r.t. (F_t)_{t∈[0,T]} there exists an adapted process Γ(t) such that

M(t) = M(0) + ∫_0^t Γ(u) dW(u), t ∈ [0, T].

Corollary 5.9
For every P̃-martingale M̃(t), t ∈ [0, T] w.r.t. (F_t)_{t∈[0,T]} there exists an adapted process Γ̃(t) such that

M̃(t) = M̃(0) + ∫_0^t Γ̃(u) dW̃(u), t ∈ [0, T].

Remark: This is not trivial from Theorem 5.8 since the filtration is generated by W (and not by W̃). But it can easily be derived from the martingale representation theorem by using Theorem 5.4.
The hedging problem. Recall that a sfts with initial value X(0) and stock position Δ(t) has discounted value process

X(t)/B(t) = X(0) + ∫_0^t Δ(u) d(S(u)/B(u)), t ∈ [0, T].    (4.5)

Given an F_T-measurable derivative payoff V(T), the task is to find X(0) and Δ(t) such that the value process satisfies X(T) = V(T).
We have d(S(t)/B(t)) = (S(t)/B(t))σ(t) dW̃(t), see (5.12). Define

V(t) = B(t)Ẽ[V(T)/B(T) | F_t].

V(t)/B(t) is a P̃-martingale, so by Corollary 5.9 there exists Γ̃(t) with

V(t)/B(t) = V(0)/B(0) + ∫_0^t Γ̃(u) dW̃(u)
= V(0) + ∫_0^t (Γ̃(u)B(u))/(S(u)σ(u)) · (S(u)/B(u))σ(u) dW̃(u)
= V(0) + ∫_0^t (Γ̃(u)B(u))/(S(u)σ(u)) d(S(u)/B(u)), t ∈ [0, T].

Therefore taking X(0) = Ẽ[V(T)/B(T)] and Δ(t) = (Γ̃(t)B(t))/(S(t)σ(t)) in (4.5) yields the desired sfts with X(T) = V(T).

Remarks.
1) We have shown that in a model of the form (5.9), (5.10), every F_T-measurable derivative payoff is attainable, i.e. can be replicated by self-financing trading in bank account and stock. Such a model is called complete.
2) The key assumptions are that the filtration is generated by a one-dimensional BM W(t), and that σ(t) is positive.
3) The martingale representation theorem justifies the use of the risk-neutral pricing formula (5.16) by showing that every derivative is attainable. But it does not provide a method of finding Γ̃(t) in Δ(t) = (Γ̃(t)B(t))/(S(t)σ(t)). Under further assumptions on the processes R(t) and σ(t), the strategy Δ(t) can be found using PDE methods (Feynman-Kac theorem, see [Shr04] Chap. 6).
The fundamental theorems of asset pricing
We consider a market with a bank account process B(t),

dB(t) = B(t)R(t) dt,    (5.19)

and m risky assets S_i(t), i = 1, ..., m,

dS_i(t) = S_i(t)α_i(t) dt + S_i(t) Σ_{j=1}^d σ_ij(t) dW_j(t).    (5.20)

Here W(t) = (W_1(t), ..., W_d(t)) is a d-dim BM on a space (Ω, F_T, P) with a filtration (F_t)_{t∈[0,T]}, and R(t), α_i(t), and σ_ij(t) are adapted processes.
Assume σ_i(t) = √(Σ_{j=1}^d σ_ij(t)^2) > 0 for i = 1, ..., m and let

B_i(t) = Σ_{j=1}^d ∫_0^t (σ_ij(u)/σ_i(u)) dW_j(u).

Then

dS_i(t) = S_i(t)α_i(t) dt + S_i(t)σ_i(t) dB_i(t).

The processes B_i(t) = Σ_{j=1}^d ∫_0^t (σ_ij(u)/σ_i(u)) dW_j(u) are martingales with

d[B_i, B_k](t) = Σ_{j_1=1}^d Σ_{j_2=1}^d (σ_{ij_1}(t)σ_{kj_2}(t))/(σ_i(t)σ_k(t)) d[W_{j_1}, W_{j_2}](t)
= Σ_{j=1}^d (σ_ij(t)σ_kj(t))/(σ_i(t)σ_k(t)) dt = ρ_ik(t) dt.

So d[B_i, B_i](t) = dt, and B_i(t) is a 1-dim BM by Levy's theorem. Also ρ_ik(t) ∈ [−1, 1] is the instantaneous correlation of B_i(t) and B_k(t) for i ≠ k.
This means that the logarithmic returns of S_i and S_k at time t have conditional correlation ρ_ik(t), cf. [Shr04] Exercise 4.17.
Arbitrage and the first FTAP
Let X(t) be the value process of a sfts which holds Δ_i(t) shares of asset S_i at time t, where the Δ_i(t) are adapted processes. Then

dX(t) = (X(t) − Σ_{i=1}^m Δ_i(t)S_i(t)) R(t) dt + Σ_{i=1}^m Δ_i(t) dS_i(t).

The product rule and d(1/B(t)) = −(1/B(t))R(t) dt imply

X(t)/B(t) = X(0) + Σ_{i=1}^m ∫_0^t Δ_i(u) d(S_i(u)/B(u)).    (5.21)

Definition 5.10
a) A sfts is admissible if there exists a constant b ≥ 0 such that its value process satisfies X(t) ≥ −b for all t, a.s.
b) A sfts is an arbitrage if it is admissible, and its value process satisfies X(0) = 0 a.s., X(T) ≥ 0 a.s., and P[X(T) > 0] > 0.

Theorem 5.11 (First fundamental theorem of asset pricing)
If there exists a measure P̃ equivalent to P such that S_1(t)/B(t), ..., S_m(t)/B(t) are P̃-martingales, then the market model B, S_1, ..., S_m does not admit arbitrage.

- A measure P̃ as in Theorem 5.11 is called a risk-neutral or equivalent martingale measure for the market B, S_1, ..., S_m.
- One can show that a market model without a risk-neutral measure violates the no free lunch with vanishing risk property, which is close to admitting arbitrage. See Delbaen and Schachermayer, The Mathematics of Arbitrage (2006).

Proof (sketch): Let X(t) be the value process of an admissible sfts. Then X(t)/B(t) is a P̃-martingale^12 as a sum of stochastic integrals w.r.t. the P̃-martingales S_1(t)/B(t), ..., S_m(t)/B(t). So if X(0) = 0 and X(T) ≥ 0, then Ẽ[X(T)/B(T)] = 0 and hence P̃[X(T) > 0] = 0. By equivalence P[X(T) > 0] = 0, so the sfts cannot be an arbitrage.

^12 This requires an integrability condition on Δ(t).
Existence of a risk-neutral measure
Let W(t) = (W_1(t), ..., W_d(t)) be a d-dim BM as above. Take a d-dim adapted Θ(t) = (Θ_1(t), ..., Θ_d(t)) and define for t ∈ [0, T]

Z(t) := exp(−∫_0^t Θ(u) · dW(u) − (1/2)∫_0^t ||Θ(u)||^2 du).    (5.22)

If E[exp((1/2)∫_0^T ||Θ(u)||^2 du)] < ∞, then Z(t) is a martingale and we can define a probability measure P̃ by

P̃[A] = E[Z(T)I_A] for all A ∈ F.

The same proof as in the 1-dimensional case yields

Theorem 5.12 (Girsanov)
The process W̃(t) := W(t) + ∫_0^t Θ(u) du, t ∈ [0, T], is a d-dim BM under P̃.

Writing the model (5.19), (5.20) in discounted prices we have

d(S_i(t)/B(t)) = (S_i(t)/B(t))(α_i(t) − R(t)) dt + (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t) dW_j(t)

for i = 1, ..., m. Now S_1(t)/B(t), ..., S_m(t)/B(t) are P̃-martingales if

d(S_i(t)/B(t)) = (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t)(Θ_j(t) dt + dW_j(t))
= (S_i(t)/B(t)) Σ_{j=1}^d σ_ij(t) dW̃_j(t)

with W̃_j(t) = W_j(t) + ∫_0^t Θ_j(u) du, and this is the case if and only if

α_i(t) − R(t) = Σ_{j=1}^d σ_ij(t)Θ_j(t), i = 1, ..., m.    (5.23)

This is a linear system of m equations for d unknowns Θ_1(t), ..., Θ_d(t).
Conclusion: If (5.23) has a solution Θ(t) = (Θ_1(t), ..., Θ_d(t)), then the model (5.19), (5.20) does not admit arbitrage.
If (5.23) has no solution, there is an arbitrage in the model.
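For constant coefficients, checking (5.23) is just linear algebra (an illustrative sketch, not from the lecture; the coefficient values are my choices). With numpy one can solve for the market price of risk vector Θ and read off existence (no arbitrage) and uniqueness (relevant for completeness, see the second FTAP below) from the rank of the volatility matrix:

```python
import numpy as np

# Market price of risk: solve alpha_i - R = sum_j sigma_ij * Theta_j  (5.23).
R = 0.02
alpha = np.array([0.08, 0.05])                  # m = 2 assets
sigma = np.array([[0.30, 0.10],                 # d = 2 Brownian motions
                  [0.05, 0.20]])

Theta, _, rank, _ = np.linalg.lstsq(sigma, alpha - R, rcond=None)
exists = np.allclose(sigma @ Theta, alpha - R)  # (5.23) solvable -> no arbitrage
unique = (rank == sigma.shape[1])               # rank d -> unique Theta
print(Theta, exists, unique)
```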
Instead of a proof we give an
Example. Take m = 2, d = 1, and constant coefficients R, α_i, σ_i. Then (5.23) becomes

α_1 − R = σ_1 Θ, α_2 − R = σ_2 Θ.

This has a solution if and only if (α_1 − R)/σ_1 = (α_2 − R)/σ_2. Suppose this does not hold, say (α_1 − R)/σ_1 > (α_2 − R)/σ_2. We can realize an arbitrage via the sfts X(0) = 0, Δ_1(t) = 1/(S_1(t)σ_1), Δ_2(t) = −1/(S_2(t)σ_2). By (5.21), its discounted value process is

X(t)/B(t) = ∫_0^t Δ_1(u) d(S_1(u)/B(u)) + ∫_0^t Δ_2(u) d(S_2(u)/B(u))
= ∫_0^t (1/(S_1(u)σ_1)) (S_1(u)/B(u)) ((α_1 − R) du + σ_1 dW(u))
− ∫_0^t (1/(S_2(u)σ_2)) (S_2(u)/B(u)) ((α_2 − R) du + σ_2 dW(u))
= ∫_0^t (1/B(u)) ((α_1 − R)/σ_1 − (α_2 − R)/σ_2) du > 0.
Risk-neutral pricing and the second FTAP
So far we defined prices of derivatives via replication arguments. We now introduce the concept of pricing by arbitrage arguments. Suppose the market B, S_1, ..., S_m given by (5.19), (5.20) admits a risk-neutral measure P̃.

- For a derivative with F_T-measurable payoff V(T) at T define

V(t) = B(t)Ẽ[V(T)/B(T) | F_t].    (5.24)

- Then V(t)/B(t), t ∈ [0, T] is a P̃-martingale. So if the derivative is traded for V(t) at t, the market B, S_1, ..., S_m, V does not admit arbitrage by the first FTAP. We call V(t) a fair or arbitrage-free price.
- If (5.23) has more than one solution (the typical situation for d > m), then there is more than one risk-neutral measure. Different measures usually yield different prices V(t) in (5.24).
- When do we have a unique fair price?

Definition 5.13 (Complete market model)
The model B, S_1, ..., S_m given by (5.19), (5.20) is complete if every derivative can be replicated, i.e., if for every integrable F_T-measurable random variable V(T) there exists a real number V(0) and adapted processes Δ_1(t), ..., Δ_m(t) such that

V(T)/B(T) = V(0) + Σ_{i=1}^m ∫_0^T Δ_i(u) d(S_i(u)/B(u)).

Otherwise the model is called incomplete.

Theorem 5.14 (Second fundamental theorem of asset pricing)
Suppose that the model B, S_1, ..., S_m has a risk-neutral measure. The following statements are equivalent.
(i) The model B, S_1, ..., S_m is complete.
(ii) The risk-neutral measure is unique.
(iii) Equation (5.23) has a unique solution Θ(t) = (Θ_1(t), ..., Θ_d(t)).
Proof (sketch). (i) (ii): Assume completeness and let

P
1
,

P
2
risk-neutral measures. For A T = T
T
dene V(T) = I
A
B(T).
By completeness V(T) can be replicated via a sfts with some initial
value V(0). Its discounted value process is a martingale under

P
1
and

P
2
, so

P
i
[A] =

E
i
[I
A
] =

E
i
_
V(T)
B(T)

= V(0) for i = 1, 2.
(ii) $\Rightarrow$ (iii): Assume uniqueness of $\tilde P$ and let $\Theta(t)$ and $\tilde\Theta(t)$ be solutions of (5.23). Then $Z(T)$ and $\tilde Z(T)$ in (5.22) with $\Theta(t)$ and $\tilde\Theta(t)$ both define a Radon-Nikodym derivative for a risk-neutral measure. Thus $Z(T) = \tilde Z(T)$ a.s. and so $\Theta(t) = \tilde\Theta(t)$.
(iii) $\Rightarrow$ (i): Given an integrable derivative payoff $V(T)$ we have to show that there are $\Delta_i(t)$ with
$$\frac{V(T)}{B(T)} = V(0) + \sum_{i=1}^m \int_0^T \Delta_i(t)\, d\Big(\frac{S_i(t)}{B(t)}\Big) = V(0) + \sum_{i=1}^m \int_0^T \Delta_i(t)\, \frac{S_i(t)}{B(t)} \sum_{j=1}^d \sigma_{ij}(t)\, d\tilde W_j(t).$$
Take a risk-neutral measure and define $V(t)$ via (5.24). We obtain
$$\frac{V(T)}{B(T)} = V(0) + \sum_{j=1}^d \int_0^T \tilde\Gamma_j(t)\, d\tilde W_j(t)$$
for suitable $\tilde\Gamma_j(t)$ by multi-dimensional martingale representation, see below.
So we have to show that there is a solution $\Delta_1(t), \dots, \Delta_m(t)$ to
$$\sum_{i=1}^m \Delta_i(t)\, \frac{S_i(t)}{B(t)}\, \sigma_{ij}(t) = \tilde\Gamma_j(t), \quad j = 1, \dots, d. \qquad (5.25)$$
Since (5.23) has a unique solution, the matrix $\sigma_{ij}(t)$ has rank $d$ (and thus $m \geq d$). This implies that (5.25) has a solution.
Theorem 5.15 (Martingale representation theorem)
Let $\big(W_1(t), \dots, W_d(t)\big)$, $t \in [0,T]$ be a BM and $(\mathcal F_t)_{t\in[0,T]}$ the filtration generated by this BM. For every $P$-martingale $M(t)$ w.r.t. $(\mathcal F_t)_{t\in[0,T]}$ there exists an adapted $\big(\Gamma_1(t), \dots, \Gamma_d(t)\big)$ such that
$$M(t) = M(0) + \sum_{j=1}^d \int_0^t \Gamma_j(u)\, dW_j(u), \quad t \in [0,T].$$
For every $\tilde P$-martingale $\tilde M(t)$, $t \in [0,T]$ w.r.t. $(\mathcal F_t)_{t\in[0,T]}$ there exists an adapted $\big(\tilde\Gamma_1(t), \dots, \tilde\Gamma_d(t)\big)$ such that
$$\tilde M(t) = \tilde M(0) + \sum_{j=1}^d \int_0^t \tilde\Gamma_j(u)\, d\tilde W_j(u), \quad t \in [0,T].$$
Discussion of the multi-dimensional asset price model.
1) $d = m = 1$: The model (5.19), (5.20) is arbitrage-free and complete if $\sigma_1(t) \neq 0$ for all $t$.
2) $d = m > 1$: The model (5.19), (5.20) is arbitrage-free and complete if the matrix $\big(\sigma_{ij}(t)\big)_{i,j=1,\dots,d}$ is nonsingular for all $t$.
3) $d > m$: The model (5.19), (5.20) is arbitrage-free if the matrix $\big(\sigma_{ij}(t)\big)_{i=1,\dots,m,\,j=1,\dots,d}$ has rank $m$ for all $t$. It is incomplete in this case.
4) $d < m$: The model (5.19), (5.20) is arbitrage-free if the vector $\big(\alpha_1(t) - R(t), \dots, \alpha_m(t) - R(t)\big)$ is in the image of the linear map defined by the matrix $\big(\sigma_{ij}(t)\big)_{i=1,\dots,m,\,j=1,\dots,d}$ (drift conditions). It is complete if moreover the matrix $\big(\sigma_{ij}(t)\big)_{i=1,\dots,m,\,j=1,\dots,d}$ has rank $d$.
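The rank conditions in cases 1)-4) are easy to check numerically at a fixed time $t$. A minimal sketch (hypothetical volatility matrices; for $d < m$ the drift conditions must still be verified separately):

```python
import numpy as np

# Rank checks behind cases 1)-4): for the m x d matrix (sigma_ij(t)) at a fixed
# time t, rank m makes the market-price-of-risk equations (5.23) solvable for
# every drift vector (no arbitrage), and rank d makes the solution unique
# (completeness). Hypothetical numbers for illustration.
def classify(sigma: np.ndarray):
    m, d = sigma.shape
    r = np.linalg.matrix_rank(sigma)
    return bool(r == m), bool(r == d)   # (arbitrage-free, complete)

# d = m = 2 with nonsingular sigma: arbitrage-free and complete, as in case 2)
print(classify(np.array([[0.2, 0.0], [0.1, 0.3]])))   # (True, True)
# d = 2 > m = 1: arbitrage-free but incomplete, as in case 3)
print(classify(np.array([[0.2, 0.1]])))               # (True, False)
```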
Dividend-Paying Stocks
Let $B(t) = e^{\int_0^t R(u)\,du}$ be a bank account process, and $S(t)$ the price process of a stock with cumulative dividend process $D(t)$ (= sum of all dividends paid between times 0 and $t$). $D(t)$ is an increasing process. If we hold the stock from 0 to $t$ and invest all dividends in the bank account, our wealth at time $t$ is
$$Y(t) = S(t) + \int_0^t \frac{B(t)}{B(u)}\, dD(u).$$
We conclude
$$d\Big(\frac{Y(t)}{B(t)}\Big) = d\Big(\frac{S(t)}{B(t)} + \int_0^t \frac{1}{B(u)}\, dD(u)\Big)$$
$$= \frac{1}{B(t)}\, dS(t) - S(t)\frac{1}{B(t)} R(t)\,dt + \frac{1}{B(t)}\, dD(t)$$
$$= \frac{1}{B(t)}\Big(d\big(S(t) + D(t)\big) - S(t)R(t)\,dt\Big). \qquad (5.26)$$
The value process $X(t)$ of a sfts in the stock and bank account with stock position $\Delta(t)$ at time $t$ satisfies
$$dX(t) = \big(X(t) - \Delta(t)S(t)\big) R(t)\,dt + \Delta(t)\, d\big(S(t) + D(t)\big).$$
Using the product rule and the last two equations, we find
$$d\Big(\frac{X(t)}{B(t)}\Big) = \frac{1}{B(t)}\, dX(t) - X(t)\frac{1}{B(t)} R(t)\,dt$$
$$= \frac{1}{B(t)}\Big(\big(X(t) - \Delta(t)S(t)\big)R(t)\,dt + \Delta(t)\, d\big(S(t) + D(t)\big) - X(t)R(t)\,dt\Big)$$
$$= \Delta(t)\, d\Big(\frac{Y(t)}{B(t)}\Big).$$
Thus if $\frac{Y(t)}{B(t)}$ is a $\tilde P$-martingale, then $\frac{X(t)}{B(t)}$ is a $\tilde P$-martingale.
Stock model with dividends. We now assume
$$d\big(S(t) + D(t)\big) = S(t)\alpha(t)\,dt + S(t)\sigma(t)\,dW(t). \qquad (5.27)$$
Returns of the stock including dividend payments have mean $\alpha(t)$ and volatility $\sigma(t)$ (modeled by adapted processes). By (5.26)
$$d\Big(\frac{Y(t)}{B(t)}\Big) = \frac{1}{B(t)}\Big(S(t)\big(\alpha(t) - R(t)\big)\,dt + S(t)\sigma(t)\,dW(t)\Big) = \frac{S(t)}{B(t)}\,\sigma(t)\, d\tilde W(t)$$
where $\tilde W(t) = W(t) + \int_0^t \Theta(u)\,du$ with $\Theta(t) = \frac{\alpha(t) - R(t)}{\sigma(t)}$.
So by choosing $Z(t)$ and $\tilde P$ as in (5.7), (5.8), $\frac{Y(t)}{B(t)}$ is a $\tilde P$-martingale.
In summary we obtain
$$d\Big(\frac{X(t)}{B(t)}\Big) = \Delta(t)\, d\Big(\frac{Y(t)}{B(t)}\Big), \qquad (5.28)$$
$$d\Big(\frac{Y(t)}{B(t)}\Big) = \frac{S(t)}{B(t)}\,\sigma(t)\, d\tilde W(t) \qquad (5.29)$$
for the value process $X(t)$ of a sfts. Therefore, as in the case of zero dividends, a derivative payoff $V(T)$ at time $T$ has fair price
$$V(t) = B(t)\,\tilde E\Big[\frac{V(T)}{B(T)}\;\Big|\;\mathcal F_t\Big] \qquad (5.30)$$
and can be replicated via a sfts (martingale representation theorem).
The difference between the dividend and no-dividend case is in the stock model, now given by (5.27):
$$d\big(S(t) + D(t)\big) = S(t)\alpha(t)\,dt + S(t)\sigma(t)\,dW(t).$$
We solve the equation for $S(t)$ in two important cases.
Examples.
1) Continuously paying dividend. Here we assume
$$D(t) = \int_0^t A(u)S(u)\,du$$
for an adapted rate process $A(t) \geq 0$. Then (5.27) becomes
$$dS(t) = S(t)\big(\alpha(t) - A(t)\big)\,dt + S(t)\sigma(t)\,dW(t)$$
and we obtain
$$S(t) = S_0 \exp\Big(\int_0^t \sigma(u)\,dW(u) + \int_0^t \big(\alpha(u) - A(u) - \tfrac{1}{2}\sigma(u)^2\big)\,du\Big)$$
$$= S_0 \exp\Big(\int_0^t \sigma(u)\,d\tilde W(u) + \int_0^t \big(R(u) - A(u) - \tfrac{1}{2}\sigma(u)^2\big)\,du\Big). \qquad (5.31)$$
For constant coefficients $R(t) = r$, $\sigma(t) = \sigma$, and $A(t) = a$,
$$S(t) = S_0 \exp\Big(\sigma \tilde W(t) + \big(r - a - \tfrac{1}{2}\sigma^2\big)t\Big).$$
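As a sanity check on this solution: under $\tilde P$ the discounted total-return process $Y(t)/B(t)$ is a martingale, so the discounted stock alone drifts down at the dividend rate, $\tilde E\big[e^{-rT}S(T)\big] = S_0 e^{-aT}$. A quick simulation (illustrative parameters) confirms this:

```python
import numpy as np

# Check E~[e^{-rT} S(T)] = S0 * e^{-aT} for the continuous-dividend model
# S(T) = S0 * exp(sigma*W~(T) + (r - a - sigma^2/2)*T). Illustrative parameters.
rng = np.random.default_rng(1)
S0, r, sigma, a, T = 100.0, 0.05, 0.2, 0.03, 1.0

W_T = rng.standard_normal(500_000) * np.sqrt(T)
S_T = S0 * np.exp(sigma * W_T + (r - a - 0.5 * sigma**2) * T)
lhs = np.exp(-r * T) * S_T.mean()

print(lhs, S0 * np.exp(-a * T))   # both approximately 97.04
```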
Using the risk-neutral pricing formula (5.30) we can compute derivative prices as in the Black-Scholes model without dividends. With $S(t) = S_0 \exp\big(\sigma \tilde W(t) + (r - a - \tfrac{1}{2}\sigma^2)t\big)$, $t \in [0,T]$, we obtain
$$V(t) = \tilde E\Big[e^{-r(T-t)}\big(S(T) - K\big)^+ \;\Big|\; \mathcal F_t\Big] = \hat S(t)\, N\big(d_+(t, \hat S(t))\big) - e^{-r(T-t)} K\, N\big(d_-(t, \hat S(t))\big)$$
with $\hat S(t) = S(t)e^{-a(T-t)}$ via a similar computation as for the no-dividend case in (5.18).
When $a = 0$ we recover the original Black-Scholes price formula $V(t) = c\big(t, S(t)\big)$.
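For concreteness, a small implementation of this dividend-adjusted formula (here $N$ is the standard normal cdf and $d_\pm$ are the usual Black-Scholes arguments evaluated at the adjusted spot $\hat S(t)$; parameters are illustrative):

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Black-Scholes call with continuous dividend yield a: replace the spot S(t)
# by S^(t) = S(t)*e^{-a*(T-t)} in the no-dividend formula.
def call_div(S, K, r, sigma, a, tau):
    """Price at time t with tau = T - t years to maturity."""
    N = NormalDist().cdf
    S_hat = S * exp(-a * tau)                 # dividend-adjusted spot
    d_plus = (log(S_hat / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S_hat * N(d_plus) - exp(-r * tau) * K * N(d_minus)

print(f"{call_div(100, 100, 0.05, 0.2, 0.03, 1.0):.4f}")  # lower than no-dividend price
# With a = 0 we recover the ordinary Black-Scholes value:
print(f"{call_div(100, 100, 0.05, 0.2, 0.0, 1.0):.4f}")   # 10.4506
```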
2) Lump payments of dividends. Here we assume
$$D(t) = \sum_{t_j \leq t} a_j S(t_j-)$$
with $\mathcal F_{t_j}$-measurable random variables $a_j \in [0,1]$ for $j = 1, \dots, n$, and a finite number of payment dates $0 < t_1 < \dots < t_n < T$.
Then $D(t)$ is constant in $t$ between consecutive payment dates, and jumps up by $a_j S(t_j-)$ at time $t_j$.
By (5.27) the sum $S(t) + D(t)$ is continuous in $t$. Therefore $S(t)$ jumps down by $a_j S(t_j-)$ at time $t_j$, i.e. $S(t_j) = (1 - a_j)S(t_j-)$. Also
$$dS(t) = S(t)\alpha(t)\,dt + S(t)\sigma(t)\,dW(t) = S(t)R(t)\,dt + S(t)\sigma(t)\,d\tilde W(t)$$
for $t \in (t_j, t_{j+1})$ between consecutive payment dates. Thus
$$S(t) = S(t_j)\, e^{\int_{t_j}^t \sigma(u)\,d\tilde W(u) + \int_{t_j}^t (R(u) - \frac{1}{2}\sigma(u)^2)\,du} \quad \text{if } t \in (t_j, t_{j+1}),$$
$$S(t_{j+1}) = (1 - a_{j+1})\, S(t_j)\, e^{\int_{t_j}^{t_{j+1}} \sigma(u)\,d\tilde W(u) + \int_{t_j}^{t_{j+1}} (R(u) - \frac{1}{2}\sigma(u)^2)\,du}.$$
By recursion it follows for all $t \in [0,T]$ that
$$S(t) = S_0 \prod_{t_j \leq t} (1 - a_j)\; e^{\int_0^t \sigma(u)\,d\tilde W(u) + \int_0^t (R(u) - \frac{1}{2}\sigma(u)^2)\,du}. \qquad (5.32)$$
For constant coefficients $R(t) = r$, $\sigma(t) = \sigma$, and $a_j$,
$$S(t) = S_0 \prod_{t_j \leq t} (1 - a_j)\; e^{\sigma \tilde W(t) + (r - \frac{1}{2}\sigma^2)t}.$$
The risk-neutral pricing formula (5.30) then yields for the price $V(t)$ of the European call with strike $K$ and maturity $T$
$$V(t) = \hat S(t)\, N\big(d_+(t, \hat S(t))\big) - e^{-r(T-t)} K\, N\big(d_-(t, \hat S(t))\big)$$
where $\hat S(t) = S(t) \prod_{t < t_j \leq T} (1 - a_j)$.
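In code, this adjustment amounts to multiplying the spot by the surviving dividend factors before applying the Black-Scholes formula. A sketch with hypothetical payment dates and dividend fractions:

```python
from math import exp, log, sqrt
from statistics import NormalDist

# European call under lump dividends: replace S(t) by
# S^(t) = S(t) * prod_{t < t_j <= T} (1 - a_j) in the Black-Scholes formula.
# dates/fracs are hypothetical payment dates t_j (in years) and fractions a_j.
def call_lump(S, K, r, sigma, t, T, dates, fracs):
    N = NormalDist().cdf
    adj = 1.0
    for tj, aj in zip(dates, fracs):
        if t < tj <= T:
            adj *= 1.0 - aj
    S_hat = S * adj                      # dividend-adjusted spot
    tau = T - t
    d_plus = (log(S_hat / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S_hat * N(d_plus) - exp(-r * tau) * K * N(d_minus)

# two 2% dividends before maturity lower the call value vs. no dividends
print(call_lump(100, 100, 0.05, 0.2, 0.0, 1.0, [0.25, 0.75], [0.02, 0.02]))
print(call_lump(100, 100, 0.05, 0.2, 0.0, 1.0, [], []))  # ~ 10.4506
```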
Remark. We can compute the value $X(t)$ of a portfolio strategy which starts with one share of stock and instantaneously reinvests all dividends in the stock. Let $\Gamma(t)$ be the number of shares in the portfolio at time $t$. This is an increasing process with $\Gamma(0) = 1$ and
$$d\Gamma(t) = \Gamma(t)\,\frac{dD(t)}{S(t)}.$$
Case 1) $D(t) = \int_0^t A(u)S(u)\,du$. Then we find $\Gamma(t) = e^{\int_0^t A(u)\,du}$.
Case 2) $D(t) = \sum_{t_j \leq t} a_j S(t_j-)$. Then $\Gamma(t)$ is constant between consecutive payment dates and at payment date $t_j$ we have
$$\Gamma(t_j) - \Gamma(t_j-) = \Gamma(t_j-)\,\frac{a_j S(t_j-)}{S(t_j)} = \Gamma(t_j-)\,\frac{a_j}{1 - a_j}.$$
Thus $\Gamma(t_j) = \Gamma(t_j-)\,\frac{1}{1 - a_j} = \Gamma(t_{j-1})\,\frac{1}{1 - a_j}$ and so $\Gamma(t) = \prod_{t_j \leq t} \frac{1}{1 - a_j}$.
Now since $X(t) = \Gamma(t)S(t)$, using either (5.31) or (5.32) we obtain in each case that
$$X(t) = S_0 \exp\Big(\int_0^t \sigma(u)\,d\tilde W(u) + \int_0^t \big(R(u) - \tfrac{1}{2}\sigma(u)^2\big)\,du\Big).$$
In conclusion, if the dividend-paying stock model is given by (5.27), then the associated value process $X(t)$ from continuous dividend reinvestment follows the same stochastic process as the non-dividend-paying stock $S(t)$ in (5.14).
In particular, the discounted value process
$$\frac{X(t)}{B(t)} = S_0 \exp\Big(\int_0^t \sigma(u)\,d\tilde W(u) - \int_0^t \tfrac{1}{2}\sigma(u)^2\,du\Big)$$
is again a martingale under the risk-neutral measure $\tilde P$.
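The reinvestment identity can be verified pathwise in the lump-dividend case: the dividend factors in (5.32) cancel against the accumulated share count, so the reinvestment portfolio tracks the no-dividend stock exactly. A sketch with illustrative dates and dividend fractions:

```python
import numpy as np

# Pathwise check: with lump dividends, the share count from reinvestment is
# prod_{t_j <= t} 1/(1 - a_j), while S(t) carries the factor
# prod_{t_j <= t} (1 - a_j) by (5.32). Their product X(t) equals the
# no-dividend GBM S0*exp(sigma*W~(t) + (r - sigma^2/2)*t).
rng = np.random.default_rng(2)
S0, r, sigma, T, n = 100.0, 0.05, 0.2, 1.0, 1000
dates, fracs = [0.25, 0.75], [0.02, 0.03]   # hypothetical t_j and a_j

t = np.linspace(0.0, T, n + 1)
dW = rng.standard_normal(n) * np.sqrt(T / n)
W = np.concatenate([[0.0], np.cumsum(dW)])
gbm = S0 * np.exp(sigma * W + (r - 0.5 * sigma**2) * t)   # no-dividend path

div_factor = np.ones_like(t)
for tj, aj in zip(dates, fracs):
    div_factor[t >= tj] *= 1.0 - aj
S = gbm * div_factor          # dividend-paying stock path, as in (5.32)
shares = 1.0 / div_factor     # accumulated share count from reinvestment
X = shares * S

print(np.max(np.abs(X - gbm)))   # ~ 0 (rounding only): X follows (5.14)
```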