
Stochastic Processes and Time Series

Module 2
Markov Chains - II
Dr. Alok Goswami, Professor, Indian Statistical Institute, Kolkata

1 Conditional Probability Results


1.1 Some Basic Results involving Conditional Probabilities:
CP1: P(A ∩ B) = P(A | B) P(B)
CP2: More generally, P(A1 ∩ · · · ∩ An) = P(An | A1 ∩ · · · ∩ An−1) · · · P(A2 | A1) · P(A1)
CP3: P(A ∩ B | C) = P(A | B ∩ C) · P(B | C)
CP4: More generally, P(A1 ∩ · · · ∩ An | B) = P(A1 | A2 ∩ · · · ∩ An ∩ B) · · · P(An−1 | An ∩ B) · P(An | B)
CP5: P(B) = P(B | A1) P(A1) + · · · + P(B | An) P(An) for mutually exclusive and collectively exhaustive events A1, . . . , An.
CP6: More generally, for events A1, . . . , An as above, P(B | C) = P(B ∩ A1 | C) + · · · + P(B ∩ An | C) = P(B | A1 ∩ C) P(A1 | C) + · · · + P(B | An ∩ C) P(An | C)

2 Back to Markov Chain:


Let (Xn )n≥0 be a Markov chain (to be often abbreviated as MC) with state space S and transition
probabilities {pxy , x, y ∈ S}.
Recall: For any x, y ∈ S, pxy represents the conditional probability that the chain will move to state y at the next time point, given that the chain is in state x at the present time point.
What about the probability of moving to a state y after 2 steps?
What is P(Xn+2 = y | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)?

2.1 2-step transition probabilities:


Using (CP6) and (CP3) and the Markov property, the above equals:
Σ_{z∈S} P(Xn+2 = y, Xn+1 = z | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)
= Σ_{z∈S} P(Xn+2 = y | X0 = x0, . . . , Xn−1 = xn−1, Xn = x, Xn+1 = z) × P(Xn+1 = z | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)
= Σ_{z∈S} pxz pzy

3 Markov chains: Some Notations
The quantity Σ_{z∈S} pxz pzy obtained above is denoted by pxy^(2) and is called the “2-step transition probability” (from state x to state y).
In fact, what we have just proved implies:
P(Xn+2 = y | X0 = x0, . . . , Xn−1 = xn−1, Xn = x) = P(Xn+2 = y | Xn = x) = P(X2 = y | X0 = x) = pxy^(2)
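In code, the 2-step probability is just this sum over the intermediate state z. A minimal Python sketch, using a made-up 3-state chain (the states and probabilities here are assumptions for illustration, not from the notes):

```python
# Transition probabilities p[x][y] of a hypothetical 3-state chain
P = {
    0: {0: 0.5, 1: 0.3, 2: 0.2},
    1: {0: 0.1, 1: 0.6, 2: 0.3},
    2: {0: 0.4, 1: 0.4, 2: 0.2},
}
S = list(P)

def p2(x, y):
    """2-step transition probability: sum over intermediate states z."""
    return sum(P[x][z] * P[z][y] for z in S)

# Each row of 2-step probabilities is again a probability distribution
for x in S:
    assert abs(sum(p2(x, y) for y in S) - 1.0) < 1e-12
```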

3.1 k-step transition probabilities


One can similarly consider k-step transition probabilities, for any k ≥ 2, and show by induction that
P(Xn+k = y | X0 = x0, . . . , Xn−1 = xn−1, Xn = x) = P(Xn+k = y | Xn = x) = P(Xk = y | X0 = x) = pxy^(k),
where pxy^(k) = Σ_{z∈S} pxz^(k−1) pzy = Σ_{z∈S} pxz pzy^(k−1) = Σ# pxz1 pz1z2 · · · pzk−1 y,
with the last sum Σ# above being over all z1, . . . , zk−1 ∈ S.

4 Matrix Notations:
4.1 Transition Matrix:
Consider the S × S matrix: P = ((pxy))x,y∈S
P : transition matrix (or, the transition probability matrix).
Properties: (i) all entries non-negative, (ii) each row-sum = 1
pxy^(2) = Σ_z pxz pzy = (x, y)th entry of P^2 =⇒ P^2 = ((pxy^(2)))
In general, P^k = ((pxy^(k))), for every k ≥ 1. (Write pxy = pxy^(1).)
This holds for k = 0 also: pxy^(0) = P(X0 = y | X0 = x) = δxy.
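These matrix identities are easy to check numerically. A minimal pure-Python sketch (the 2-state matrix is an assumed example, not from the notes):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(P, k):
    """k-step transition matrix P^k; P^0 is the identity ((delta_xy))."""
    n = len(P)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = matmul(R, P)
    return R

P = [[0.9, 0.1],
     [0.2, 0.8]]

# Every power of a transition matrix is again a transition matrix:
# non-negative entries, each row summing to 1.
for k in range(5):
    assert all(abs(sum(row) - 1.0) < 1e-12 for row in matpow(P, k))
```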

4.2 Chapman-Kolmogorov Equation


From the trivial identity P^(m+n) = P^m · P^n, one easily gets:
C-K Equations: pxy^(m+n) = Σ_{z∈S} pxz^(m) pzy^(n), for x, y ∈ S, m, n ≥ 1
(x → y in m+n steps ≡ x → some z in m steps, z → y in n steps)
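The C-K equation is precisely the (x, y) entry of the product P^m · P^n written out. A quick numerical check, again with an assumed 2-state chain, comparing P^5 computed by repeated multiplication against the sum Σ_z pxz^(2) pzy^(3):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.7, 0.3],
     [0.4, 0.6]]

P2 = matmul(P, P)               # 2-step probabilities
P3 = matmul(P2, P)              # 3-step probabilities
P5 = matmul(matmul(P3, P), P)   # 5-step, by repeated multiplication

# Chapman-Kolmogorov with m = 2, n = 3:
# p_xy^(5) = sum over z of p_xz^(2) * p_zy^(3)
for x in range(2):
    for y in range(2):
        ck = sum(P2[x][z] * P3[z][y] for z in range(2))
        assert abs(P5[x][y] - ck) < 1e-12
```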
We introduce two very important notations:
For any event A involving (Xn )n≥0 , Px (A) := P(A | X0 = x). For any real random variable
Z determined by (Xn )n≥0 , Ex (Z) := E(Z | X0 = x). Thus, for each x ∈ S, Px denotes the
(conditional) probability and Ex denotes the (conditional) expectation, given that the chain starts
at state x.

Clearly, in this notation,

pxy = Px(X1 = y) = Ex(I{X1=y}) and pxy^(k) = Px(Xk = y) = Ex(I{Xk=y})

(Here, I(·) denotes the usual indicator random variable.)

5 Some Facts Left as Exercises:


Using the Markov property (and also time-homogeneity), one can show the following (treat these as exercises!):
P(Xn+1 = y1, . . . , Xn+k = yk | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)
= Px(X1 = y1, . . . , Xk = yk) = pxy1 py1y2 · · · pyk−1yk

For any B ⊆ S^k and for any f : S^k → R,

P((Xn+1, . . . , Xn+k) ∈ B | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)
= Px((X1, . . . , Xk) ∈ B) = Σ_{(y1,...,yk)∈B} pxy1 py1y2 · · · pyk−1yk

and, E(f(Xn+1, . . . , Xn+k) | X0 = x0, . . . , Xn−1 = xn−1, Xn = x)
= Ex(f(X1, . . . , Xk)) = Σ_{(y1,...,yk)} f(y1, . . . , yk) pxy1 py1y2 · · · pyk−1yk
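These product formulas translate directly into code. A sketch with an assumed 2-state chain: taking B = S^k, the path probabilities must sum to 1, which gives a quick sanity check of the formula:

```python
from itertools import product

# Hypothetical 2-state chain (assumed values)
P = {0: {0: 0.7, 1: 0.3},
     1: {0: 0.4, 1: 0.6}}
S = [0, 1]

def path_prob(x, path):
    """P_x(X1 = y1, ..., Xk = yk) = p_{x,y1} * p_{y1,y2} * ... * p_{y(k-1),yk}."""
    prob, prev = 1.0, x
    for y in path:
        prob *= P[prev][y]
        prev = y
    return prob

# With B = S^k the paths are exhaustive and mutually exclusive,
# so their probabilities must total 1.
k, x = 3, 0
total = sum(path_prob(x, path) for path in product(S, repeat=k))
assert abs(total - 1.0) < 1e-12
```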
The last two equalities dealt with the “finite” future (Xn+1, . . . , Xn+k). But the same holds for the “infinite” future as well. For any B ⊆ S^∞ and for any f : S^∞ → R,
P((Xn+1, Xn+2, . . .) ∈ B | X0 = x0, . . . , Xn−1 = xn−1, Xn = x) = Px((X1, X2, . . .) ∈ B)
and, E(f(Xn+1, Xn+2, . . .) | X0 = x0, . . . , Xn−1 = xn−1, Xn = x) = Ex(f(X1, X2, . . .))

Upshot: Given the past up to a time n and the present state x, the conditional future evolution of the chain from time (n+1) behaves in exactly the same way as the evolution of the chain from time 1, conditioned on its starting at the state x, and the corresponding conditional probabilities and expectations are all completely determined by the transition probabilities of the MC.

6 Initial Distribution
Conditional probabilities and expectations related to future evolution, given the history up to any time point, reduce to Px and Ex, which are, in turn, determined by the transition probabilities.
To get unconditional probabilities and expectations, we need to specify the distribution of the

initial state X0 and use:
P(A) = Σ_{x∈S} P(A | X0 = x) P(X0 = x),   E(Z) = Σ_{x∈S} E(Z | X0 = x) P(X0 = x)
Denote the distribution of X0 by µ: µx = P(X0 = x), x ∈ S. Writing Pµ and Eµ respectively for (unconditional) probabilities and expectations arising out of µ, the above relations reduce to:
Pµ(A) = Σ_{x∈S} µx Px(A),   Eµ(Z) = Σ_{x∈S} µx Ex(Z)
Distribution of a MC is completely specified by Initial Distribution (µx , x ∈ S) and the
Transition Probabilities (pxy , x, y ∈ S)

7 Distribution of the MC
Joint distribution of (X1 , . . . , Xn ):
Pµ(X1 = x1, . . . , Xn = xn) = Σ_{x∈S} µx pxx1 px1x2 · · · pxn−1xn
Marginal distribution of Xn: Pµ(Xn = x) = Σ_{z∈S} µz pzx^(n)
Denote the distribution of Xn by µ(n) = {µx^(n), x ∈ S}. The above formula says (in matrix notation): µ(n) = µ · P^n, where µ and µ(n) are both viewed as 1 × S (row) vectors. Thus, probabilities of all finite-dimensional events are explicitly given in terms of the initial distribution µ and transition matrix P. But the really important questions for a MC centre around understanding its long-term characteristics, and these involve Infinite-Dimensional Events!!
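The relation µ(n) = µ · P^n can be computed one step at a time, since µ(n) = µ(n−1) · P. A small sketch with an assumed 2-state chain and initial distribution:

```python
P = [[0.5, 0.5],
     [0.2, 0.8]]
mu = [1.0, 0.0]   # assumed initial distribution: start in state 0

def step(mu, P):
    """One step in distribution: mu -> mu * P (row vector times matrix)."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]

# mu^(n) = mu * P^n, built up one multiplication at a time
mu_n = mu
for _ in range(3):
    mu_n = step(mu_n, P)

# The marginal distribution of X3 is still a probability distribution
assert abs(sum(mu_n) - 1.0) < 1e-12
```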
What are these questions? How does one answer them?

8 On-Off Chain: Detailed Analysis


A complete analysis of the On-Off Chain will help us identify at least one important question for a general MC.

State space: S = {0, 1}


Transition probabilities: p01 = α = 1 − p00, p10 = β = 1 − p11.
Let P(X0 = 0) = µ0, P(X0 = 1) = µ1 (= 1 − µ0).

Denote µ0^(n) = P(Xn = 0), µ1^(n) = P(Xn = 1) = 1 − µ0^(n).
Recurrence relation between the successive µ0^(n):

µ0^(n) = µ0^(n−1) · p00 + µ1^(n−1) · p10 = β + (1 − α − β) · µ0^(n−1)

Deduce:
µ0^(n) = β Σ_{k=0}^{n−1} (1 − α − β)^k + (1 − α − β)^n · µ0.

Assume: α + β > 0. [Just rules out α = β = 0, so OK (why?)]
 
Get a simple formula for the distribution of Xn:
µ0^(n) = β/(α+β) − (1 − α − β)^n (β/(α+β) − µ0),   µ1^(n) = 1 − µ0^(n)

All three main parameters µ0, α and β are involved, as expected!

We ask: as n → ∞, is there a limit? If so, what is it?

Assume: α + β < 2. [Rules out α = β = 1, again OK (why?)]

0 < α + β < 2 ⇒ |1 − α − β| < 1 ⇒ (1 − α − β)^n → 0

We conclude (except in case α = β = 0 or α = β = 1):

lim_{n→∞} µ0^(n) = β/(α+β) = π0 (say)  and  lim_{n→∞} µ1^(n) = 1 − π0 = α/(α+β)
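The recurrence and the closed-form solution can be checked against each other numerically; a sketch with assumed values α = 0.3, β = 0.1, µ0 = 0.9 (so π0 = β/(α+β) = 0.25):

```python
# On-Off chain with assumed parameter values (not from the notes):
alpha, beta = 0.3, 0.1   # p01 = alpha, p10 = beta
mu0 = 0.9                # P(X0 = 0)
r = 1.0 - alpha - beta   # the factor (1 - alpha - beta)

def mu0_recurrence(n):
    """Iterate mu0^(n) = beta + (1 - alpha - beta) * mu0^(n-1)."""
    m = mu0
    for _ in range(n):
        m = beta + r * m
    return m

def mu0_closed(n):
    """Closed form: beta/(alpha+beta) - r^n * (beta/(alpha+beta) - mu0)."""
    pi0 = beta / (alpha + beta)
    return pi0 - r ** n * (pi0 - mu0)

# The two must agree for every n...
for n in range(20):
    assert abs(mu0_recurrence(n) - mu0_closed(n)) < 1e-12

# ...and since |r| < 1, mu0^(n) converges to beta/(alpha+beta)
assert abs(mu0_closed(500) - beta / (alpha + beta)) < 1e-12
```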

8.1 What are the main takeaways from the above analysis?
The chain has a unique limiting distribution. The limit depends only on α and β, but not on µ. (The initial distribution often has little or no effect in the long run!) The limit distribution π = {π0 = β/(α+β), π1 = α/(α+β)} has the property:
if µ = π, then µ(n) = π for all n. (X0 ∼ π ⇒ Xn ∼ π ∀ n)
(First prove: πP = π, then conclude πP^n = π for all n)
This property is expressed as: π is a stationary distribution.
π = (β/(α+β), α/(α+β)) is the only stationary distribution.
(It is the only probability on S = {0, 1} satisfying πP = π)

To sum up: The On-Off Chain has, irrespective of the initial distribution, a unique limiting
distribution, which is also a stationary distribution for the chain and is the only one.
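The stationarity claim πP = π is a two-line numerical check; a sketch with the same kind of assumed values of α and β:

```python
# On-Off chain with assumed values alpha = 0.3, beta = 0.1
alpha, beta = 0.3, 0.1
P = [[1 - alpha, alpha],
     [beta, 1 - beta]]

# Claimed stationary distribution pi = (beta/(alpha+beta), alpha/(alpha+beta))
pi = [beta / (alpha + beta), alpha / (alpha + beta)]

# Check pi * P = pi; by induction pi * P^n = pi for all n
pi_P = [pi[0] * P[0][y] + pi[1] * P[1][y] for y in range(2)]
for y in range(2):
    assert abs(pi_P[y] - pi[y]) < 1e-12
```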

• Do the Same Results hold for general Markov Chains?

• Quite expectedly, the answer is : NO!!

• Next Natural Question: How to go about understanding the long-term behaviour of a general MC?

• Distinct Advantage for On-Off chain: Simple explicit formula for µ(n) .

• Such explicit computations NOT possible for general MC.

• Will have to employ different ideas that do NOT depend on such explicit computations.
