YL. Chang
Fall 2015
Motivation Example
Consider an inventory model:
$D_n$: the demand in the $n$th period (e.g., a day or a week). Demand is random and its distribution is known.
Inventory that is left at the end of this period can be used to satisfy the demand in the next period
(e.g. paper, pencils, tires, ink etc.)
Suppose the demand $\{D_n\}$ is an i.i.d. sequence with the following distribution:
d            | 0   | 1   | 2   | 3
P[D_n = d]   | 1/8 | 1/4 | 1/2 | 1/8
Assume Cp = $100, Cv = $50, Cf = $100, h = $10, where h is the holding cost, i.e., h = Cs .
Question: What is the optimal inventory policy to maximize the long-run average profit per week?
Please do not confuse this with the newsvendor problem. In the newsvendor problem, you have to get rid of leftover items at the end of each period: at time 0 you make the order decision; at time 1 demand is realized and any leftover is removed. Now we consider the case where the leftover can be used in the following period.
How do we analyze the performance of an inventory policy in this case? Say that every Friday at 5 pm we decide how much to order for the following week, and the ordered items arrive at 8 am the following Monday. We sell from Monday to Friday. Let's again consider an (s, S) policy.
Let $X_n$ be the inventory level at the end of period $n$.
(i) If $X_n < s$, order $S - X_n$ items.
(ii) Otherwise, do not order.
The goal of our class is again to determine the (s, S) policy so that our long-run average profit can be
maximized.
Let's do some analysis first on $X_n$.
Definition 1. Time series: A sequence of data points over time. Time can be discrete or continuous.
Hence, $\{X_n\}_{n \ge 0}$ is a time series.
Assume our policy is $(s, S) = (2, 4)$. Then
(i) If $X_n \le 1$, order $4 - X_n$ items.
(ii) Otherwise, do not order.
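Let's consider the following probability: $P(X_{n+1} = 3 \mid X_n = 1)$. Since $X_n = 1 < 2$, we order up to 4 items, so $X_{n+1} = 3$ exactly when the next period's demand is 1, which has probability 1/4. Computing every one-step probability in the same way gives the following matrix (rows: $X_n$, columns: $X_{n+1}$):
\[
P = \begin{array}{c|ccccc}
  & 0 & 1 & 2 & 3 & 4 \\ \hline
0 & 0 & 1/8 & 1/2 & 1/4 & 1/8 \\
1 & 0 & 1/8 & 1/2 & 1/4 & 1/8 \\
2 & 5/8 & 1/4 & 1/8 & 0 & 0 \\
3 & 1/8 & 1/2 & 1/4 & 1/8 & 0 \\
4 & 0 & 1/8 & 1/2 & 1/4 & 1/8
\end{array}
\]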
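The matrix above can also be built mechanically by enumerating the demand outcomes for each inventory level. The following is a minimal sketch, assuming that demand takes the values 0, 1, 2, 3 with probabilities 1/8, 1/4, 1/2, 1/8 and that unmet demand is lost (inventory never goes negative). The function name `inventory_transition_matrix` is only illustrative.

```python
from fractions import Fraction


def inventory_transition_matrix(demand_pmf, s, S):
    """Return P with P[i][j] = P(X_{n+1} = j | X_n = i) under an (s, S) policy."""
    size = S + 1
    P = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        start = S if i < s else i          # order up to S when the level is below s
        for d, prob in demand_pmf.items():
            j = max(start - d, 0)          # assumption: demand beyond stock is lost
            P[i][j] += prob
    return P


# Assumed demand distribution: values 0..3 with the probabilities from the table.
demand_pmf = {0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 8)}
P = inventory_transition_matrix(demand_pmf, s=2, S=4)
for row in P:
    print([str(x) for x in row])
```

Using exact fractions keeps the entries directly comparable with the hand-computed matrix.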
1.1 Elements in DTMC
(1) State space ($S$): The state space is the set containing all possible values that $X_n$ can take. You will see that $S$ does not have to be finite. For example,
(a) Inventory level $X_n$ in the motivation section: $S = \{0, 1, 2, 3, 4\}$.
(b) The status of a machine: $S = \{\text{working}, \text{down}\}$.
(c) Tomorrow's weather: $S = \{\text{Hot}, \text{Cold}\}$.
(d) Simple random walk. Suppose you toss a coin at each time $n$ and you go up by 1 if you get a head, or go down by 1 if you get a tail. You are interested in your position. Then $S = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
(2) Transition probability matrix $P = (P_{ij})$. The transition probability matrix describes the probability of a future event given the information about our current state.
$P_{ij} \ge 0$.
$\sum_{j \in S} P_{ij} = 1$ for all $i$. The sum of each row should be 1, but the sum of each column does not have to be 1.
$P$ is a square matrix.
The size of $P$ is (# of elements in state space) $\times$ (# of elements in state space), i.e., $|S| \times |S|$.
(3) Initial distribution $a_0$: The initial distribution tells us in which state the system starts off, i.e., where the system is at time 0.
For example, if your current inventory level is 2, then the initial distribution is $a_0 = (0, 0, 1, 0, 0)$.
If you have no idea about your initial inventory level and you believe it is equally likely to be 0, 1, 2, 3, or 4, then $a_0 = (1/5, 1/5, 1/5, 1/5, 1/5)$.
Definition 2. (Stochastic process) A stochastic process is a collection of random variables, representing the
evolution of some values (random) of the system over time.
A time series can be seen as a realization from a stochastic process. Or a stochastic process is a time series
model.
Definition 3. (Discrete Time Markov Chain) A discrete time stochastic process $X = \{X_n : n = 0, 1, 2, \ldots\}$ is said to be a DTMC on state space $S$ with transition matrix $P$ if for each $n \ge 1$ and for all $i_0, i_1, i_2, \ldots, i, j \in S$,
\[
P(X_{n+1} = j \mid X_0 = i_0, X_1 = i_1, X_2 = i_2, \ldots, X_{n-1} = i_{n-1}, X_n = i) = P_{ij} = P(X_{n+1} = j \mid X_n = i). \tag{1}
\]
Definition 4. (Markov Property) Given the current information (state), future and past are independent.
From the information gathering perspective, it is very appealing because you just need to remember
the current state. For an opposite example, Wikipedia keeps track of all the histories of each article.
It requires tremendous effort.
What if you think your situation depends not only on the current state but also on the state of one week ago? Then you can redefine the state space so that each state describes the status of your system over two weeks instead of one. At this point, I want to stress that you are the one who decides what your model should be. You can add more assumptions to fit your situation to a Markov model.
1.2 DTMC Model
Define a DTMC:
Step 1: Define $X_n$, $n \ge 0$. $X_n$ usually has some physical meaning about your system, such as the inventory level at the end of every Friday, the weather, etc.
Step 2: Find the state space $S$: find all possible values of $X_n$ for $n$ sufficiently large.
Step 3: Find the transition probability matrix $P$. If you cannot find it, check whether the Markov property holds. If it does not hold, you may need to redefine $X_n$, or conclude that the system cannot be modeled as a DTMC.
Step 4: Find the initial distribution $a_0$ from the initial information about the system.
1. A Two State Model S = {0, 1}
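P: a $2 \times 2$ matrix on states 0 and 1 (entries missing).
\[
P = \begin{array}{c|cc}
 & \text{Hot} & \text{Cold} \\ \hline
\text{Hot} & 3/4 & 1/4 \\
\text{Cold} & 1/6 & 5/6
\end{array}
\]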
Machine repairing process. State 0: machine is up and running. State 1: machine is down.
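2. Inventory model in the motivation example: Let $Y_n$ be the inventory level at the beginning of period $n$, after any order has arrived. Then $Y = \{Y_n : n = 0, 1, 2, \ldots\}$ is a DTMC.
State space $S = \{2, 3, 4\}$.
\[
P = \begin{array}{c|ccc}
  & 2 & 3 & 4 \\ \hline
2 & 1/8 & 0 & 7/8 \\
3 & 1/4 & 1/8 & 5/8 \\
4 & 1/2 & 1/4 & 1/4
\end{array}
\]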
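3. A simple random walk: suppose you toss a coin at each time. You go right if you get a head, left if you get a tail. Then $S = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ (infinite state space).
Let $X_n$ be the position after the $n$th toss of the coin, let $p$ be the probability of a head, and let $q = 1 - p$ be the probability of a tail. Then
\[
P_{ij} = \begin{cases} p, & j = i + 1 \\ q, & j = i - 1 \\ 0, & \text{otherwise.} \end{cases}
\]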
4. (Weird Frog Jumping Around). Imagine a frog jumping around lily pads.
(1) Suppose the frog is jumping around 3 pads. Label the pads 1, 2, and 3. The frog's jumping has been observed extensively, and behavioral scientists concluded that this frog's jumping pattern is as follows.
$1 \to 2 \to 3 \to 2 \to 1 \to 2 \to 3 \to 2 \to \cdots$
Let $X_n$ denote the label of the pad that the frog is sitting on at time $n$. Is $X_n$ a DTMC? If so, what are the state space and the transition probability matrix? If not, why?
It is not a DTMC because $X_n$ does not have the Markov property. To show it, let's try to complete the transition probability matrix given that the state space is $S = \{1, 2, 3\}$.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 0 & 1 & 0 \\
2 & ? & ? & ? \\
3 & 0 & 1 & 0
\end{array}
\]
If the frog is sitting on pad 2, it is impossible to predict its next step by just looking at its current location. If we additionally know where it came from to reach pad 2, it becomes possible to predict its next step. That is, past information is required to determine the probability of the future. Hence, the Markov property does not hold and $X_n$ cannot be a DTMC.
(2) A friend of the frog is also jumping around lily pads in a nearby pond. This friend frog jumps around 4 pads instead of 3. Label the pads 1, 2A, 2B, and 3. The friend frog's jumping pattern is characterized as follows.
$1 \to 2A \to 3 \to 2B \to 1 \to 2A \to 3 \to 2B \to 1 \to 2A \to \cdots$
Let $Y_n$ denote the pad on which the friend frog is sitting at time $n$. Is $Y_n$ a DTMC? If so, what are the state space and the transition probability matrix? If not, why?
It is a DTMC. The state space is $S = \{1, 2A, 2B, 3\}$ and the transition probability matrix is
\[
P = \begin{array}{c|cccc}
   & 1 & 2A & 2B & 3 \\ \hline
1  & 0 & 1 & 0 & 0 \\
2A & 0 & 0 & 0 & 1 \\
2B & 1 & 0 & 0 & 0 \\
3  & 0 & 0 & 1 & 0
\end{array}
\]
I strongly suggest you read the textbook. There are many examples in the textbook and many
exercises at the end of each chapter. You can find the solution at the end of your textbook.
Those examples and exercises will be very helpful for your midterms and final.
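1.3 Sample Path
\[
P = \begin{array}{c|ccc}
  & 0 & 1 & 2 \\ \hline
0 & 0 & 1/4 & 3/4 \\
1 & 1/2 & 0 & 1/2 \\
2 & 1 & 0 & 0
\end{array}
\]
1.4 Transition diagram
\[
P = \begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 3/4 & 1/4 \\
1 & 1/2 & 1/2
\end{array}
\qquad
P = \begin{array}{c|ccc}
  & 0 & 1 & 2 \\ \hline
0 & 0 & 1/4 & 3/4 \\
1 & 1/2 & 0 & 1/2 \\
2 & 1 & 0 & 0
\end{array}
\]
[Figures: sample path and transition diagrams for the chains above.]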
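A sample path of the three-state chain above can be generated by repeatedly drawing the next state from the row of $P$ that belongs to the current state. The snippet below is a small sketch of that idea. The helper name `sample_path`, the starting state, the path length, and the random seed are all illustrative choices, not part of the notes.

```python
import random

# Three-state chain on S = {0, 1, 2} from this section.
P = [[0.0, 0.25, 0.75],
     [0.5, 0.0, 0.5],
     [1.0, 0.0, 0.0]]


def sample_path(P, x0, n_steps, rng):
    """Simulate X_0, X_1, ..., X_{n_steps}, starting from X_0 = x0."""
    path = [x0]
    for _ in range(n_steps):
        current = path[-1]
        # Draw the next state using row `current` of P as the weights.
        next_state = rng.choices(range(len(P)), weights=P[current])[0]
        path.append(next_state)
    return path


rng = random.Random(0)  # fixed seed so the run is reproducible
path = sample_path(P, x0=0, n_steps=10, rng=rng)
print(path)
```

Every consecutive pair in the printed path is a transition with positive probability in $P$; a different seed gives a different realization of the same process.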
1.5
1.5.1
Calculating Probabilities
Math Review: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, so $P(A \cap B) = P(A \mid B) P(B)$.
Suppose
\[
P = \begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 0.7 & 0.3 \\
1 & 0.5 & 0.5
\end{array}
\]
and the initial distribution is $a_0 = (0.6, 0.4)$, i.e., $P(X_0 = 0) = 0.6$, $P(X_0 = 1) = 0.4$. What is $P(X_1 = 0)$?
\begin{align*}
P(X_1 = 0) &= P(X_1 = 0, X_0 = 0) + P(X_1 = 0, X_0 = 1) && \text{(law of total probability)} \\
&= P(X_1 = 0 \mid X_0 = 0) P(X_0 = 0) + P(X_1 = 0 \mid X_0 = 1) P(X_0 = 1) && \text{(conditional probability)} \\
&= 0.7 \times 0.6 + 0.5 \times 0.4 = 0.62.
\end{align*}
Exercise: Show that $P(X_1 = 1) = 0.38$.
Let $a_n$ be the probability distribution of $X_n$; then $a_1 = (0.62, 0.38)$, i.e., $P(X_1 = 0) = 0.62$, $P(X_1 = 1) = 0.38$.
We note,
\[
a_0 P = (0.6 \;\; 0.4) \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix}
= (0.6 \times 0.7 + 0.4 \times 0.5 \;\;\; 0.6 \times 0.3 + 0.4 \times 0.5) = (0.62 \;\; 0.38) = a_1.
\]
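What is $P(X_2 = 0)$?
\begin{align*}
P(X_2 = 0) &= P(X_2 = 0, X_1 = 0) + P(X_2 = 0, X_1 = 1) \\
&= P(X_2 = 0 \mid X_1 = 0) P(X_1 = 0) + P(X_2 = 0 \mid X_1 = 1) P(X_1 = 1) \\
&= 0.7 \times 0.62 + 0.5 \times 0.38 = 0.624.
\end{align*}
In matrix form,
\[
a_1 P = (0.62 \;\; 0.38) \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix} = (P(X_2 = 0) \;\; P(X_2 = 1)) = a_2,
\]
\[
a_1 P = (a_0 P) P = (0.6 \;\; 0.4) \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix} = (P(X_2 = 0) \;\; P(X_2 = 1)) = a_2,
\]
that is, $a_0 P^2 = a_2$.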
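Hence,
\[
a_n = a_{n-1} P = (a_{n-2} P) P = a_{n-2} P^2 = \cdots = a_0 P^n,
\]
\[
a_{n+1} = a_n P, \qquad a_{n+k} = a_n P^k = a_{n+1} P^{k-1}.
\]
Remark: $P_{ij} = P(X_{n+1} = j \mid X_n = i)$, and in general $(P^2)_{ij} \ne (P_{ij})^2$. Indeed, $(P^2)_{ij} = P(X_{n+2} = j \mid X_n = i) = \sum_{k \in S} P_{ik} P_{kj}$, whereas $(P_{ij})^2 = [P(X_{n+1} = j \mid X_n = i)]^2$. In general, $(P^k)_{ij} = P(X_{n+k} = j \mid X_n = i)$.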
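The relations $a_1 = a_0 P$ and $a_2 = a_0 P^2$ are easy to check numerically. The following sketch uses exact fractions and two small helper functions (`vec_mat` and `mat_mul`, names chosen here only for illustration) on the two-state example with $a_0 = (0.6, 0.4)$. It also shows that $(P^2)_{01}$ and $(P_{01})^2$ differ.

```python
from fractions import Fraction as F

P = [[F(7, 10), F(3, 10)],
     [F(1, 2), F(1, 2)]]
a0 = [F(3, 5), F(2, 5)]


def vec_mat(a, M):
    """Row vector times matrix: (a M)_j = sum_i a_i M_ij."""
    return [sum(a[i] * M[i][j] for i in range(len(a))) for j in range(len(M[0]))]


def mat_mul(A, B):
    """Matrix product A B, computed row by row."""
    return [vec_mat(row, B) for row in A]


a1 = vec_mat(a0, P)           # distribution of X_1
a2 = vec_mat(a1, P)           # distribution of X_2
P2 = mat_mul(P, P)            # two-step transition matrix
a2_direct = vec_mat(a0, P2)   # a_0 P^2, which should equal a_2

print([float(x) for x in a1], [float(x) for x in a2])
print(float(P2[0][1]), float(P[0][1] ** 2))  # (P^2)_01 versus (P_01)^2
```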
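Example: consider the chain on $S = \{0, 1, 2\}$ with
\[
P = \begin{array}{c|ccc}
  & 0 & 1 & 2 \\ \hline
0 & 0 & 1/4 & 3/4 \\
1 & 1/2 & 0 & 1/2 \\
2 & 1 & 0 & 0
\end{array}
\]
Quantities such as $\sum_{i=0}^{2} P(X_4 = 0, X_3 = i \mid X_2 = 0)$, $P(X_1 = 2, X_0 = 1)$, and $P(X_1 = 2)$ can be computed from $P$, its powers, and the initial distribution. The two-step transition matrix is
\[
P^2 = \begin{pmatrix} 0 & 0.25 & 0.75 \\ 0.5 & 0 & 0.5 \\ 1 & 0 & 0 \end{pmatrix}
\begin{pmatrix} 0 & 0.25 & 0.75 \\ 0.5 & 0 & 0.5 \\ 1 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0.875 & 0 & 0.125 \\ 0.5 & 0.125 & 0.375 \\ 0 & 0.25 & 0.75 \end{pmatrix}.
\]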
1.6 Stationary Distribution
Suppose this week I drink Coke half of the time and Pepsi half of the time, i.e., $S = \{\text{Coke}, \text{Pepsi}\}$, $a_0 = (0.5, 0.5)$. The transition probability matrix is:
\[
P = \begin{array}{c|cc}
 & \text{Coke} & \text{Pepsi} \\ \hline
\text{Coke} & 0.7 & 0.3 \\
\text{Pepsi} & 0.5 & 0.5
\end{array}
\]
Observe:
$a_1 = [0.6 \;\; 0.4] = a_0 P$
$a_2 = [0.62 \;\; 0.38] = a_1 P = a_0 P^2$
$a_3 = [0.624 \;\; 0.376] = a_2 P = a_0 P^3$
$a_4 = [0.6248 \;\; 0.3752] = a_3 P = a_0 P^4$
$a_5 \approx [0.625 \;\; 0.375] = a_4 P = a_0 P^5$
$a_6 \approx [0.625 \;\; 0.375] = a_5 P = a_0 P^6$
$a_7 \approx [0.625 \;\; 0.375] = a_6 P = a_0 P^7$
...
$a_{100} \approx [0.625 \;\; 0.375] \approx a_7 \approx a_8 \approx \cdots \approx a_\infty$
Once you get to the distribution $[0.625 \;\; 0.375]$, it does not change over time any more. This distribution describes the steady state of your system: what happens if we run the system for a long period of time (dynamic equilibrium).
Does the steady state always exist? Sometimes not. We will learn later the conditions that guarantee the existence of the steady state.
Definition 5. (Stationary Distribution). Suppose $\pi = (\pi_i, i \in S)$ satisfies
$\pi_i \ge 0$ for all $i \in S$ and $\sum_{i \in S} \pi_i = 1$,
$\pi = \pi P$.
Then, $\pi$ is said to be a stationary distribution of the DTMC.
Remark: We could have either a unique stationary distribution or infinitely many stationary distributions. The conditions that guarantee the uniqueness of the stationary distribution will be given later.
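What is the stationary distribution for the Coke and Pepsi example? We need $\pi_c \ge 0$, $\pi_p \ge 0$,
\[
(\pi_c \;\; \pi_p) = (\pi_c \;\; \pi_p) \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix}, \qquad \pi_c + \pi_p = 1,
\]
that is,
\[
\pi_c = 0.7 \pi_c + 0.5 \pi_p, \qquad \pi_p = 0.3 \pi_c + 0.5 \pi_p, \qquad \pi_c + \pi_p = 1.
\]
Solving gives $\pi_c = 0.625$, $\pi_p = 0.375$, i.e., $\pi = [0.625 \;\; 0.375]$, which is exactly the same as what we observed above.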
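For larger matrices the same system $\pi = \pi P$, $\sum_i \pi_i = 1$ can be solved by elimination. The following is a minimal sketch with exact fractions; the helpers `solve` and `stationary` are illustrative names. It assumes the chain has a unique stationary distribution, so that the linear system is nonsingular after one balance equation is replaced by the normalization.

```python
from fractions import Fraction as F


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A square and nonsingular)."""
    n = len(A)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def stationary(P):
    """Solve pi = pi P with sum(pi) = 1, assuming the solution is unique."""
    n = len(P)
    # Equations j = 0..n-2: sum_i pi_i P[i][j] - pi_j = 0; last equation: sum_i pi_i = 1.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] for j in range(n - 1)]
    A.append([F(1)] * n)
    b = [F(0)] * (n - 1) + [F(1)]
    return solve(A, b)


P = [[F(7, 10), F(3, 10)],
     [F(1, 2), F(1, 2)]]
pi = stationary(P)
print([float(x) for x in pi])
```

For the Coke and Pepsi matrix this reproduces $\pi = [5/8 \;\; 3/8] = [0.625 \;\; 0.375]$.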
In the test, you could be asked to give the stationary distribution for a simple transition probability matrix.
Hence, you should practice to compute the stationary distribution from the given transition matrix.
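\[
P = \begin{array}{c|ccc}
  & 0 & 1 & 2 \\ \hline
0 & 0 & 1/4 & 3/4 \\
1 & 1/2 & 0 & 1/2 \\
2 & 1 & 0 & 0
\end{array}
\]
\begin{align*}
0 \cdot \pi_0 + \tfrac{1}{2} \pi_1 + \pi_2 &= \pi_0 \\
\tfrac{1}{4} \pi_0 + 0 + 0 &= \pi_1 \\
\tfrac{3}{4} \pi_0 + \tfrac{1}{2} \pi_1 + 0 &= \pi_2 \\
\pi_0 + \pi_1 + \pi_2 &= 1
\end{align*}
$\pi = (0.4706, \; 0.1176, \; 0.4118)$.
1.6.1 Examples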
1. $\pi = [0.625 \;\; 0.375]$ means
(1) In the long run, I drink Coke 62.5% of the time and Pepsi 37.5% of the time.
(2) Suppose that, after a long time, you visit me on some day. Then the chance that you see me drinking Coke on that day is 62.5%, and the chance that you see me drinking Pepsi is 37.5%.
Questions:
(1) Suppose Pepsi is \$1 and Coke is \$1.5. How much do I spend on soda in a month, on average?
$\$1.5 \times (30 \times 0.625) + \$1 \times (30 \times 0.375) = \$39.375$
(2) If I drink soda for the next 10 years (3650 days), on how many days will I drink Pepsi?
$3650 \times 0.375 = 1368.75$ days on average.
2. Inventory example. Suppose the demand $\{D_n\}$ is an i.i.d. sequence with the following distribution:
d            | 0   | 1   | 2   | 3
P[D_n = d]   | 0.1 | 0.4 | 0.3 | 0.2
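Assume $C_p = \$2000$, $C_v = \$1000$, $C_f = \$1500$, $h = \$100$, where $h$ is the holding cost for each item left at the end of a Friday. Consider the $(s = 2, S = 3)$ policy. What is the long-run average profit per week?
Let $f(i)$ be the expected profit of the following week, given that this week's inventory ends with $i$ items.
$i = 0$:
No holding cost.
Place an order for 3 items: cost $C_f + 3C_v$.
Next week we start with $S = 3$ items.
\begin{align*}
f(0) &= P(D = 1) \cdot 1 \cdot C_p + P(D = 2) \cdot 2 \cdot C_p + P(D = 3) \cdot 3 \cdot C_p - (C_f + 3C_v) \\
&= [1 \times 0.4 \times \$2000 + 2 \times \$2000 \times 0.3 + 3 \times \$2000 \times 0.2] - [3 \times \$1000 + \$1500] = -\$1300.
\end{align*}
$i = 1$:
Holding cost: $1 \times h$.
Place an order for 2 items: cost $C_f + 2C_v$.
Next week we start with $S = 3$ items.
\begin{align*}
f(1) &= P(D = 1) \cdot 1 \cdot C_p + P(D = 2) \cdot 2 \cdot C_p + P(D = 3) \cdot 3 \cdot C_p - (C_f + 2C_v + h) \\
&= [1 \times 0.4 \times \$2000 + 2 \times \$2000 \times 0.3 + 3 \times \$2000 \times 0.2] - [2 \times \$1000 + \$1500 + \$100] = -\$400.
\end{align*}
$i = 2$:
Holding cost: $2 \times h$.
No order.
Next week we start with 2 items, so at most 2 items can be sold.
\begin{align*}
f(2) &= P(D = 1) \cdot 1 \cdot C_p + P(D = 2) \cdot 2 \cdot C_p + P(D = 3) \cdot 2 \cdot C_p - 2h \\
&= [1 \times 0.4 \times \$2000 + 2 \times \$2000 \times 0.3 + 2 \times \$2000 \times 0.2] - [2 \times \$100] = \$2600.
\end{align*}
Similarly,
\begin{align*}
f(3) &= P(D = 1) \cdot 1 \cdot C_p + P(D = 2) \cdot 2 \cdot C_p + P(D = 3) \cdot 3 \cdot C_p - 3h \\
&= [1 \times 0.4 \times \$2000 + 2 \times \$2000 \times 0.3 + 3 \times \$2000 \times 0.2] - [3 \times \$100] = \$2900.
\end{align*}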
Please construct the transition probability matrix $P$ and solve for the stationary distribution $\pi$ by yourself. Note that the state space is $S = \{0, 1, 2, 3\}$. Then,
Long-run average profit: $f(0)\pi_0 + f(1)\pi_1 + f(2)\pi_2 + f(3)\pi_3$.
The percentage of time that the system is losing money: $\pi_0 + \pi_1$.
Assume you get \$500 if the system is making money and $-\$100$ if the system is losing money. Then your average bonus is $\$500 \times (\pi_2 + \pi_3) - \$100 \times (\pi_0 + \pi_1)$.
The manager may want to talk with you if he/she sees that the system is losing more than \$1000 on a day. Then the probability that you have to talk with your manager on a day is $\pi_0$.
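One way to check your own answer is to let a short program build $P$ and $f$ from the problem data and approximate $\pi$ by iterating $a_{n+1} = a_n P$. The sketch below assumes demand values 0, 1, 2, 3 with probabilities 0.1, 0.4, 0.3, 0.2 and lost sales, which is what the formulas for $f(i)$ imply. The 200 iterations and the uniform starting distribution are arbitrary choices.

```python
demand_pmf = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}   # assumed demand values 0..3
C_p, C_v, C_f, h = 2000, 1000, 1500, 100
s, S = 2, 3


def start_level(i):
    """Inventory at the start of next week when this week ends with i items."""
    return S if i < s else i


def expected_profit(i):
    """f(i): expected profit of next week given this week ends with i items."""
    order_cost = C_f + (S - i) * C_v if i < s else 0
    revenue = sum(p * min(d, start_level(i)) * C_p for d, p in demand_pmf.items())
    return revenue - order_cost - i * h


def transition_row(i):
    """Row i of P for the end-of-week inventory level."""
    row = [0.0] * (S + 1)
    for d, p in demand_pmf.items():
        row[max(start_level(i) - d, 0)] += p   # assumption: excess demand is lost
    return row


n = S + 1
P = [transition_row(i) for i in range(n)]
f = [expected_profit(i) for i in range(n)]

pi = [1.0 / n] * n                 # any starting distribution works for this chain
for _ in range(200):               # a_{k+1} = a_k P
    pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

average_profit = sum(f[i] * pi[i] for i in range(n))
print(f, pi, average_profit)
```

Iteration is used here instead of solving the linear system only to keep the sketch short; it relies on this particular chain converging to its stationary distribution.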
1.6.3 Flow Balance Equations
Under the stationary condition, for any state, the rate into the state should be equal to the rate out of the state. (Ignore self loops.)
State space $S = \{1, 2, 3\}$.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 0.25 & 0.25 & 0.5 \\
2 & 0.5 & 0 & 0.5 \\
3 & 0 & 1 & 0
\end{array}
\]
State 1:
rate in: $0.5\pi_2$
rate out: $(0.5 + 0.25)\pi_1$
Hence, $0.5\pi_2 = 0.75\pi_1$.
State 2:
rate in: $0.25\pi_1 + \pi_3$
rate out: $\pi_2$
Hence, $\pi_3 + 0.25\pi_1 = \pi_2$.
State 3:
rate in: $0.5\pi_1 + 0.5\pi_2$
rate out: $\pi_3$
Hence, $0.5(\pi_1 + \pi_2) = \pi_3$.
Together with $\pi_1 + \pi_2 + \pi_3 = 1$, this gives $\pi = (0.2667, 0.4, 0.3333)$.
Cut Method
With $p/q = 1/2$: $\pi_n = (1/2)^n \pi_0 = (1/2)^{n+1}$.
1.7 Limiting probability
Recall the $n$-step transition probability $P^n_{ij} = P(X_{k+n} = j \mid X_k = i)$. Let $n \to \infty$; what can happen?
Examples:
(1) Chain goes well (I will explain when a chain goes well in the next section).
\[
P = \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix}, \qquad
P^\infty = \lim_{n \to \infty} P^n = \lim_{n \to \infty} \begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix}^n
= \begin{pmatrix} 0.625 & 0.375 \\ 0.625 & 0.375 \end{pmatrix}
\]
(a) In the long run, what you do today has little effect on the future. The limiting probability is independent of the initial state. Each row is identical, which means your initial state is irrelevant once you have run the system for a long time.
(b) Each row of the matrix converges to the stationary distribution:
\[
P^\infty = \lim_{n \to \infty} P^n = \begin{pmatrix} \pi \\ \pi \end{pmatrix} = \begin{pmatrix} 0.625 & 0.375 \\ 0.625 & 0.375 \end{pmatrix}
\]
Hence, if the chain goes well, we could use $P^n$ (for $n$ large) to get the stationary distribution quickly.
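(2) Limit may not exist. For example,
\[
P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
The stationary distribution solves $\pi = \pi P$:
\[
(\pi_1 \;\; \pi_2) = (\pi_1 \;\; \pi_2) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \pi_1 + \pi_2 = 1,
\]
so $\pi_1 = \pi_2$ and $\pi = [\pi_1 \;\; \pi_2] = [0.5 \;\; 0.5]$. But
\[
P^{100} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \ne \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = P^{101}, \qquad
P^{1000} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \ne \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = P^{1001},
\]
and none of these matrices has rows equal to $\pi$.
The behavior is similar to $(-1)^n$, where the limit of $(-1)^n$ does not exist. The chain is obviously oscillating. However,
\[
\frac{1}{2}\left(P^{100} + P^{101}\right) = \frac{1}{2}\left(P^{1000} + P^{1001}\right) = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix} = \begin{pmatrix} \pi \\ \pi \end{pmatrix}.
\]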
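The oscillation and the averaging effect can be reproduced in a few lines. This is a sketch with naive matrix powers; `mat_mul` and `mat_pow` are illustrative helpers, and repeated multiplication is perfectly adequate for a $2 \times 2$ matrix.

```python
def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_pow(P, n):
    """P^n by repeated multiplication, starting from the identity matrix."""
    result = [[1 if i == j else 0 for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


P = [[0, 1],
     [1, 0]]
P100 = mat_pow(P, 100)   # even power: the identity matrix
P101 = mat_pow(P, 101)   # odd power: P itself
average = [[(P100[i][j] + P101[i][j]) / 2 for j in range(2)] for i in range(2)]
print(P100, P101, average)
```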
(3)
\[
P = \begin{pmatrix} 0.7 & 0.3 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 \\ 0 & 0 & 0.6 & 0.4 \\ 0 & 0 & 0.3 & 0.7 \end{pmatrix}, \qquad
P^\infty = \begin{pmatrix} 5/8 & 3/8 & 0 & 0 \\ 5/8 & 3/8 & 0 & 0 \\ 0 & 0 & 3/7 & 4/7 \\ 0 & 0 & 3/7 & 4/7 \end{pmatrix}
\]
The first two rows are identical and the last two rows are identical. However, the second and the third rows are different. In fact, $\pi^{(1)} = [5/8 \;\; 3/8]$ is the stationary distribution of $\begin{pmatrix} 0.7 & 0.3 \\ 0.5 & 0.5 \end{pmatrix}$ and $\pi^{(2)} = [3/7 \;\; 4/7]$ is the stationary distribution of $\begin{pmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{pmatrix}$. If you draw the transition diagram of this chain, you will see it has two disjoint subchains. In the long run, each of these subchains converges to its own stationary distribution.
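What is the stationary distribution? Solve $\pi = \pi P$:
\[
(\pi_1 \;\; \pi_2 \;\; \pi_3 \;\; \pi_4) = (\pi_1 \;\; \pi_2 \;\; \pi_3 \;\; \pi_4)
\begin{pmatrix} 0.7 & 0.3 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 \\ 0 & 0 & 0.6 & 0.4 \\ 0 & 0 & 0.3 & 0.7 \end{pmatrix}
\]
that is,
\begin{align*}
0.7\pi_1 + 0.5\pi_2 &= \pi_1 \\
0.3\pi_1 + 0.5\pi_2 &= \pi_2 \\
0.6\pi_3 + 0.3\pi_4 &= \pi_3 \\
0.4\pi_3 + 0.7\pi_4 &= \pi_4 \\
\pi_1 + \pi_2 + \pi_3 + \pi_4 &= 1
\end{align*}
which reduces to
\begin{align*}
5\pi_2 &= 3\pi_1 \\
4\pi_3 &= 3\pi_4 \\
\pi_1 + \pi_2 + \pi_3 + \pi_4 &= 1
\end{align*}
We have 4 variables but only 3 independent equations. Hence, there are infinitely many stationary distributions. Indeed, if we set $\pi_1 + \pi_2 = \alpha$, $0 \le \alpha \le 1$, then $\pi_1 = \frac{5}{8}\alpha$, $\pi_2 = \frac{3}{8}\alpha$, $\pi_3 = \frac{3}{7}(1 - \alpha)$, $\pi_4 = \frac{4}{7}(1 - \alpha)$, i.e., $\pi = [\frac{5}{8}\alpha \;\; \frac{3}{8}\alpha \;\; \frac{3}{7}(1 - \alpha) \;\; \frac{4}{7}(1 - \alpha)]$, $0 \le \alpha \le 1$.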
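1.8 DTMC Techniques
1.8.1 Accessibility and Irreducibility
\[
P = \begin{pmatrix}
0.2 & 0 & 0.5 & 0 & 0.3 \\
0 & 0.5 & 0 & 0.5 & 0 \\
0.1 & 0 & 0.1 & 0 & 0.8 \\
0 & 0.2 & 0.5 & 0.3 & 0 \\
0.5 & 0 & 0.5 & 0 & 0
\end{pmatrix}
\]
\[
P = \begin{pmatrix}
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0 \\
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0
\end{pmatrix}
\]
This chain is irreducible, since $1 \leftrightarrow 2 \leftrightarrow 3 \leftrightarrow 4$.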
Periodicity
Let's look at Example (2) in Section 1.7 again. It is a finite state, irreducible DTMC. Hence, we should have a unique stationary distribution. However, $P^{1000} \ne P^{1001}$ and each row doesn't converge to the stationary distribution $\pi$. In this case, we cannot obtain the limiting probability, because the chain oscillates. Then what is the relationship between $P^n$ and $\pi$?
Definition 7. (Periodicity) For a state $i \in S$,
\[
d(i) = \gcd\{n : P^n_{i,i} > 0\}.
\]
1.
\[
P = \begin{pmatrix}
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0 \\
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0
\end{pmatrix}
\]
Then,
$d(1) = \gcd\{2, 4, 6, 8, \ldots\} = 2$
$d(2) = 2$.
We say this DTMC is periodic with period $d = 2$.
In fact, if $i \leftrightarrow j$, then $d(i) = d(j)$. This is called the solidarity property. Since all states in an irreducible DTMC communicate, the periods of all states are the same. We will revisit it after covering recurrence and transience.
2.
\[
P = \begin{pmatrix} 0.1 & 0.2 & 0.7 \\ 0.5 & 0 & 0.5 \\ 0.6 & 0.4 & 0 \end{pmatrix}
\]
Then,
$d(1) = \gcd\{1, 2, 3, \ldots\} = 1$
$d(2) = \gcd\{2, 3, 4, \ldots\} = 1$
We say this DTMC is aperiodic.
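The definition $d(i) = \gcd\{n : P^n_{i,i} > 0\}$ can be explored numerically by tracking which states are reachable in exactly $n$ steps. The sketch below only looks at $n \le$ `max_n` (a hypothetical cutoff, 50 by default), so it is a finite-horizon check rather than a proof. States are indexed from 0 in the code, so state 1 of the notes is index 0.

```python
from math import gcd


def period(P, i, max_n=50):
    """gcd of all n <= max_n with (P^n)[i][i] > 0; returns 0 if no return is seen."""
    n_states = len(P)
    # reach[k] is True when state k can be reached from i in exactly n steps.
    reach = [k == i for k in range(n_states)]
    d = 0
    for n in range(1, max_n + 1):
        reach = [any(reach[k] and P[k][j] > 0 for k in range(n_states))
                 for j in range(n_states)]
        if reach[i]:
            d = gcd(d, n)   # gcd(0, n) == n, so the first return time initializes d
    return d


P_periodic = [[0, 0.5, 0, 0.5],
              [0.5, 0, 0.5, 0],
              [0, 0.5, 0, 0.5],
              [0.5, 0, 0.5, 0]]
P_aperiodic = [[0.1, 0.2, 0.7],
               [0.5, 0, 0.5],
               [0.6, 0.4, 0]]

print(period(P_periodic, 0), period(P_aperiodic, 0), period(P_aperiodic, 1))
```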
Theorem 2.
(ii) If the DTMC is periodic with period $d \ge 2$, then the limiting probability does not exist. However,
\[
\lim_{n \to \infty} \frac{P^n + P^{n+1} + P^{n+2} + \cdots + P^{n+d-1}}{d}
\]
exists.
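Examples:
1.
\[
P = \begin{pmatrix} 0.2 & 0.8 \\ 0.5 & 0.5 \end{pmatrix}
\]
Solving
\[
[\pi_1 \;\; \pi_2] \begin{pmatrix} 0.2 & 0.8 \\ 0.5 & 0.5 \end{pmatrix} = [\pi_1 \;\; \pi_2], \qquad \pi_1 + \pi_2 = 1
\]
gives $\pi = [5/13 \;\; 8/13]$.
2.
\[
P = \begin{pmatrix}
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0 \\
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0
\end{pmatrix}
\]
(a) Is this DTMC irreducible?
Yes, $1 \leftrightarrow 2 \leftrightarrow 3 \leftrightarrow 4$.
(b) Is this chain aperiodic?
No, the period is $d = 2$.
(c) Does the DTMC have any stationary distributions? If so, how many? Can you solve for them?
It is a finite state, irreducible DTMC. According to Corollary 1, there exists a unique stationary distribution. Solving
\[
[\pi_1 \;\; \pi_2 \;\; \pi_3 \;\; \pi_4]
\begin{pmatrix}
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0 \\
0 & 0.5 & 0 & 0.5 \\
0.5 & 0 & 0.5 & 0
\end{pmatrix}
= [\pi_1 \;\; \pi_2 \;\; \pi_3 \;\; \pi_4], \qquad \pi_1 + \pi_2 + \pi_3 + \pi_4 = 1
\]
gives $\pi = [1/4 \;\; 1/4 \;\; 1/4 \;\; 1/4]$.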
1.9
Limiting                               | Stationary | Remark
exists, independent of initial state   | unique     | nice
does not exist                         | unique     | think about $(P^n + \cdots + P^{n+d-1})/d$
exists, depends on initial state       | maybe      |
Let $X$ be a DTMC on state space $S$ with transition matrix $P$. For each state $i \in S$, let $\tau_i$ denote the first time $n \ge 1$ such that $X_n = i$. For example,
$P(\tau_1 = 2 \mid X_0 = 2) = P(X_1 \ne 1, X_2 = 1 \mid X_0 = 2)$
$P(\tau_3 = 3 \mid X_0 = 3) = P(X_1 \ne 3, X_2 \ne 3, X_3 = 3 \mid X_0 = 3)$
$P(\tau_i = k \mid X_0 = i)$: the probability that the system first returns to state $i$ at time $k$.
Definition 8. (1) State $i$ is said to be recurrent if $P(\tau_i < \infty \mid X_0 = i) = 1$.
(2) State $i$ is said to be positive recurrent if $E(\tau_i \mid X_0 = i) < \infty$.
(3) State $i$ is said to be transient if it is not recurrent.
(4) State $i$ is said to be an absorbing state if $P_{i,i} = 1$.
(1) means that, starting from state $i$, the chain returns to state $i$ with probability 1. (3) means it is possible that you never return to state $i$.
1.9.1
What is the difference between $P(\tau_i < \infty \mid X_0 = i) = 1$ and $E(\tau_i \mid X_0 = i) < \infty$? In general, what is the difference between $P(X < \infty) = 1$ and $E[X] < \infty$? Aren't they the same?
Let's look at the meaning of each expression:
(1) $P(X < \infty) = 1$ means the random variable $X$ can only take finite values. $X$ could be 1, 2, 100, 10000 billion, etc. However, $X$ cannot be infinite, i.e., $P(X = \infty) = 0$.
(2) $E[X] < \infty$ means the expected value of $X$ is finite.
Example:
Let $X$ be the number of tosses needed to get the first head; then $X$ is a geometric random variable. It may take millions of tosses for you to get the first head.
\[
P(X = n) = q^{n-1} p, \qquad n = 1, 2, \ldots
\]
where $p$ is the probability of getting a success and $q = 1 - p$.
Do we have $P(X < \infty) = 1$?
\[
P(X < \infty) = \sum_{n=1}^{\infty} q^{n-1} p = p \cdot \frac{1}{1 - q} = 1
\]
Yes, we do. It means that no matter how small the success probability is (as long as $p > 0$), it will take a finite number of trials to get the first success for sure! I hope you can remember this fact. You will have many challenges in reality, and sometimes you may be very frustrated. However, keep going! You will definitely make it!
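How about the expectation?
\begin{align*}
E[X] &= 1 \cdot p + 2qp + 3q^2 p + \cdots = p(1 + 2q + 3q^2 + \cdots) = p \, \frac{d}{dq}\left(q + q^2 + q^3 + \cdots\right) \\
&= p \, \frac{d}{dq}\left(\frac{q}{1 - q}\right) = p \left[\frac{1 - q + q}{(1 - q)^2}\right] = \frac{p}{(1 - q)^2} = \frac{1}{p} < \infty.
\end{align*}
Hence, the expected number of trials to get a success is also finite.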
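You may ask: if we have a random variable $X$ such that $P(X < \infty) = 1$, do we always have $E[X] < \infty$?
Not necessarily. For example, we know $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$ (here $\pi$ is the constant 3.14159..., not a stationary distribution). Consider the random variable $X$ such that $P(X = n) = \frac{1}{n^2} \cdot \frac{6}{\pi^2}$ for $n \ge 1$. Then
\[
P(X < \infty) = \sum_{n=1}^{\infty} P(X = n) = \frac{6}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2} = 1.
\]
However,
\[
E[X] = \sum_{n=1}^{\infty} n P(X = n) = \frac{6}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n} = \infty.
\]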
Definition 9. (Solidarity property). If $i \leftrightarrow j$, then $d(i) = d(j)$, where $d(k)$ is the period of state $k$. Also, $i$ is (positive) recurrent or transient if and only if $j$ is (positive) recurrent or transient, respectively.
The solidarity property states that if two states communicate with each other, it is not possible that one state $i$ is recurrent and the other state $j$ is transient. Therefore, if we apply the solidarity property to an irreducible DTMC, exactly one of the following statements is true:
(1) All states are transient.
(2) All states are (positive) recurrent.
The reason is simple: all states communicate with each other in an irreducible DTMC.
Equivalently,
Assume a Markov chain is irreducible. If state $i$ is (positive) recurrent, then every state is (positive) recurrent and we say this Markov chain is (positive) recurrent.
If a Markov chain is irreducible, then either every state is (positive) recurrent or every state is transient.
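Theorem 3. Assume that $X$ is irreducible on a finite state space. Then,
(1) $X$ is positive recurrent.
(2) $X$ has a unique stationary distribution $\pi = (\pi_i)_{i \in S}$.
(3) If the DTMC is aperiodic, $\lim_{n \to \infty} P^n_{i,j} = \pi_j$, independent of what $i$ is.
(4) If the DTMC is periodic with period $d$,
\[
\lim_{n \to \infty} \frac{P^n_{i,j} + P^{n+1}_{i,j} + \cdots + P^{n+d-1}_{i,j}}{d} = \pi_j.
\]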
Theorem 4. Assume $X$ is irreducible. $X$ is positive recurrent if and only if $X$ has a (unique) stationary distribution $\pi = (\pi_i)$. Furthermore,
\[
E(\tau_i \mid X_0 = i) = \frac{1}{\pi_i}.
\]
Example:
\[
P = \begin{pmatrix} 0.5 & 0.5 \\ 1 & 0 \end{pmatrix}
\]
$P(\tau_1 = 1 \mid X_0 = 1) = 0.5$
$P(\tau_1 = 2 \mid X_0 = 1) = P(X_1 \ne 1, X_2 = 1 \mid X_0 = 1) = P(X_1 = 2, X_2 = 1 \mid X_0 = 1) = P(X_2 = 1 \mid X_1 = 2) P(X_1 = 2 \mid X_0 = 1) = 1 \times 0.5 = 0.5$
$P(\tau_1 = 3 \mid X_0 = 1) = P(X_1 \ne 1, X_2 \ne 1, X_3 = 1 \mid X_0 = 1) = P(X_1 = 2, X_2 = 2, X_3 = 1 \mid X_0 = 1) = 0$
$P(\tau_1 = 4 \mid X_0 = 1) = 0$
For $n \ge 4$, $P(\tau_1 = n \mid X_0 = 1) = 0$.
Hence, $E[\tau_1 \mid X_0 = 1] = 1 \times 0.5 + 2 \times 0.5 = 1.5 < \infty$ and state 1 is positive recurrent. From the solidarity property (or class property), we know state 2 is also positive recurrent. You can verify that the stationary distribution is $\pi = (2/3 \;\; 1/3)$. From Theorem 4, we know $E(\tau_1 \mid X_0 = 1) = \frac{1}{\pi_1} = 3/2$, which is equal to what we calculated. This chain is aperiodic (because of a self loop), so according to Theorem 3, we know
\[
\lim_{n \to \infty} P^n = \begin{pmatrix} 2/3 & 1/3 \\ 2/3 & 1/3 \end{pmatrix}.
\]
In addition,
there is only one class, $\{1, 2\}$, which is positive recurrent. Hence, this chain is also irreducible.
Since it contains a self loop and it is irreducible, we know this chain is aperiodic, i.e., $d(1) = d(2) = 1$.
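The first-return probabilities in this example can also be computed mechanically: keep track of the probability mass that has not yet returned to the target state and push it through $P$ one step at a time. The sketch below does this for the two-state example. The function name `first_return_pmf` and the cutoff of 20 steps are illustrative choices, and states are indexed from 0, so state 1 of the notes is index 0.

```python
def first_return_pmf(P, i, max_n):
    """Return [P(tau_i = n | X_0 = i) for n = 1, ..., max_n]."""
    n_states = len(P)
    # mass[k]: probability of being in state k now without having returned to i before.
    mass = [P[i][k] for k in range(n_states)]
    pmf = []
    for _ in range(max_n):
        pmf.append(mass[i])   # paths that arrive at i now return for the first time
        mass[i] = 0.0         # stop following paths that have already returned
        mass = [sum(mass[k] * P[k][j] for k in range(n_states)) for j in range(n_states)]
    return pmf


P = [[0.5, 0.5],
     [1.0, 0.0]]
pmf = first_return_pmf(P, i=0, max_n=20)
mean_return_time = sum(n * p for n, p in enumerate(pmf, start=1))
print(pmf[:4], mean_return_time)
```

For this chain all of the mass returns within two steps, so the truncated sum is exact and agrees with $1/\pi_1 = 3/2$.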
1.9.2 Classification of States
Definition 10. (Class) Two states that communicate are said to be in the same class. Within each class, all states communicate with each other, but no pair of states in different classes communicates.
In other words, the concept of communication divides the state space into a number of separate classes.
Hence,
The Markov chain is said to be irreducible if there is only one class, that is, if all states communicate with each other.
Solidarity property = class property.
Definition 11. (Closed Subset) A subset $S_0 \subset S$ of states is closed if $P_{ij} = 0$ for each $i \in S_0$ and $j \notin S_0$.
In plain English, once entered, a closed set cannot be exited.
Theorem 5. If a closed subset $S_0$ has only finitely many states, then it must contain at least one recurrent state. In particular, any finite Markov chain must contain at least one recurrent state.
Proof. Start from any state in $S_0$. By definition, the chain stays in $S_0$ forever. If all states in $S_0$ are transient, then each of them is visited not at all or only finitely many times. This is impossible, since the chain makes infinitely many visits to the finitely many states of $S_0$.
Examples:
1.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 0 & 0.6 & 0.4 \\
2 & 0.7 & 0 & 0.3 \\
3 & 0.8 & 0.2 & 0
\end{array}
\]
0
0
1
0
2
1
0
0
0
0
3
0
1
0
0
0
4
0
0
1
0
1
0
0
0
0
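\[
P = \begin{array}{c|ccccc}
  & 1 & 2 & 3 & 4 & 5 \\ \hline
1 & 0 & 1 & 0 & 0 & 0 \\
2 & 1 & 0 & 0 & 0 & 0 \\
3 & 0 & 0 & 0.25 & 0.75 & 0 \\
4 & 0 & 0 & 0.3 & 0.6 & 0.1 \\
5 & 0 & 0 & 0 & 0 & 1
\end{array}
\]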
$S_n = \sum_{i=1}^{n} \xi_i$, where each step $\xi_i$ equals $+1$ with probability $p$ and $-1$ with probability $q$.
Hence, $S_n \approx \frac{1}{3} n$ and $\lim_{n \to \infty} S_n = +\infty$. So if you start from state 2, you will never get back to state 2 in the long run. This means every state in the chain is transient.
In fact, if $p \ne q$, you can show this chain is always transient by the same method. If $p = q$, this chain is null recurrent, i.e., it is recurrent but not positive recurrent. The proof is omitted.
Final Remarks
In a DTMC with finite state space, not all states can be transient. i.e., there exists at least one
recurrent state.
A recurrent state is accessible from all states in its class, but is not accessible from recurrent states in
other classes.
A transient state is not accessible from any recurrent state.
At least one, possibly more, recurrent states are accessible from a given transient state.
1.10 Absorption Probability
How do we calculate $\lim_{n \to \infty} P^n$ if the DTMC possibly has more than one class?
Key observation: When $n$ is sufficiently large, a chain that started in a transient state will have entered one of the recurrent classes. That is, after a very large $n$, transient states detach from the DTMC (intuitively). Hence, we can focus on the recurrent classes first and then come back to analyze the transient classes.
Let
\[
f_{i,j} = P(X_\infty = j \mid X_0 = i) = \lim_{n \to \infty} (P^n)_{i,j}.
\]
This is the probability that, starting from state $i$, the DTMC will be at state $j$ eventually.
Hence, $\sum_{j \in S} f_{i,j} = 1$, i.e., the sum of each row of $P^\infty$ equals 1.
Question: What is $P(X_\infty = j \mid X_{20} = i)$?
$P(X_\infty = j \mid X_{20} = i) = P(X_\infty = j \mid X_{200} = i) = f_{i,j}$ from the definition.
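Examples:
1.
\[
P = \begin{array}{c|c} & 1 \\ \hline 1 & 1 \end{array}
\]
What is $P^\infty$?
$P^\infty = f_{1,1} = 1$.
2.
\[
P = \begin{array}{c|cc}
  & 1 & 2 \\ \hline
1 & 0.2 & 0.8 \\
2 & 0.8 & 0.2
\end{array}
\]
What is $P^\infty$? Assume you know that
\[
(0.5 \;\; 0.5) \begin{pmatrix} 0.2 & 0.8 \\ 0.8 & 0.2 \end{pmatrix} = (0.5 \;\; 0.5) = (\pi_1 \;\; \pi_2).
\]
Then
\[
P^\infty = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}.
\]
3.
\[
P = \begin{array}{c|cc}
  & 1 & 2 \\ \hline
1 & 1 & 0 \\
2 & 1 & 0
\end{array}
\]
What is $P^\infty$?
$f_{1,1} = 1$
$f_{1,2} = 0$ since $\sum_{j \in S} f_{i,j} = 1$.
\[
P^\infty = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}.
\]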
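4.
\[
P = \begin{array}{c|cc}
  & 1 & 2 \\ \hline
1 & 1 & 0 \\
2 & 0.5 & 0.5
\end{array}
\]
What is $P^\infty$?
$f_{1,1} = 1$
$f_{1,2} = 0$ since $\sum_{j \in S} f_{i,j} = 1$.
$f_{2,1} = 1$ since state 2 is transient.
$f_{2,2} = 0$ since state 2 is transient.
\[
P^\infty = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}.
\]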
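5.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 0.25 & 0.75 & 0 \\
2 & 0.5 & 0.5 & 0 \\
3 & 1 & 0 & 0
\end{array}
\]
Note that
\[
(0.4 \;\; 0.6) \begin{pmatrix} 0.25 & 0.75 \\ 0.5 & 0.5 \end{pmatrix} = (0.4 \;\; 0.6) = (\pi_1 \;\; \pi_2).
\]
$f_{1,1} = 0.4$
$f_{1,2} = 0.6$
$f_{1,3} = 0$ since $\sum_{j \in S} f_{i,j} = 1$, or since $\{1, 2\}$ is a closed subset.
$f_{2,1} = 0.4$
$f_{2,2} = 0.6$
$f_{2,3} = 0$ since $\sum_{j \in S} f_{i,j} = 1$, or since $\{1, 2\}$ is a closed subset.
$f_{3,1} = 0.4 = \pi_1$. Make sure you understand why $f_{3,1} = \pi_1$.
$f_{3,2} = 0.6 = \pi_2$. Make sure you understand why $f_{3,2} = \pi_2$.
$f_{3,3} = 0$ since it is transient.
\[
P^\infty = \begin{pmatrix} 0.4 & 0.6 & 0 \\ 0.4 & 0.6 & 0 \\ 0.4 & 0.6 & 0 \end{pmatrix}.
\]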
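6.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 1 & 0 & 0 \\
2 & 0.25 & 0.5 & 0.25 \\
3 & 0 & 0 & 1
\end{array}
\]
What is $P^\infty$?
Class $\{1\}$ is recurrent. Class $\{2\}$ is transient. Class $\{3\}$ is recurrent. And we know $P^\infty_{1,1} = P^\infty_{3,3} = 1$ because they are absorbing states. Furthermore, by symmetry, the probability that state 2 goes to class $\{1\}$ is equal to the probability that state 2 goes to class $\{3\}$. Hence,
\[
P^\infty = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 1 & 0 & 0 \\
2 & 0.5 & 0 & 0.5 \\
3 & 0 & 0 & 1
\end{array}
\]
Note, $P^\infty_{2,2} = 0$ since it is a transient state.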
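7.
\[
P = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 1 & 0 & 0 \\
2 & 1/3 & 1/2 & 1/6 \\
3 & 0 & 0 & 1
\end{array}
\]
What is $P^\infty$?
Same as before:
Class $\{1\}$ is recurrent. Class $\{2\}$ is transient. Class $\{3\}$ is recurrent.
\[
P^\infty = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 1 & 0 & 0 \\
2 & ? & 0 & ? \\
3 & 0 & 0 & 1
\end{array}
\]
However, this time state 2 goes to class $\{1\}$ and to class $\{3\}$ with different probabilities. Note,
\begin{align*}
f_{2,1} &= P(X_\infty = 1 \mid X_0 = 2) \\
&= P(X_\infty = 1, X_1 = 1 \mid X_0 = 2) + P(X_\infty = 1, X_1 = 2 \mid X_0 = 2) + P(X_\infty = 1, X_1 = 3 \mid X_0 = 2) \\
&= P(X_\infty = 1 \mid X_1 = 1) P(X_1 = 1 \mid X_0 = 2) + P(X_\infty = 1 \mid X_1 = 2) P(X_1 = 2 \mid X_0 = 2) \\
&\quad + P(X_\infty = 1 \mid X_1 = 3) P(X_1 = 3 \mid X_0 = 2) \\
&= 1 \times 1/3 + f_{2,1} \times 1/2 + 0 \times 1/6 = 1/3 + 1/2 \, f_{2,1}.
\end{align*}
And
$f_{2,1} + f_{2,3} = 1$.
Hence, we have
\[
\begin{cases} f_{2,1} + f_{2,3} = 1 \\ f_{2,1} = 1/3 + 1/2 \, f_{2,1} \end{cases}
\]
so $f_{2,1} = 2/3$, $f_{2,3} = 1/3$, and
\[
P^\infty = \begin{array}{c|ccc}
  & 1 & 2 & 3 \\ \hline
1 & 1 & 0 & 0 \\
2 & 2/3 & 0 & 1/3 \\
3 & 0 & 0 & 1
\end{array}
\]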
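The first-step equation $f_{2,1} = 1/3 + \frac{1}{2} f_{2,1}$ from Problem 7 can be solved and cross-checked numerically. The sketch below solves it in closed form with exact fractions and then compares the result with row 2 of a high power of $P$ obtained by repeated squaring. The ten squarings (giving $P^{1024}$) are an arbitrary but ample choice, and the code uses 0-based indices, so state 2 is index 1.

```python
from fractions import Fraction as F

P = [[F(1), F(0), F(0)],
     [F(1, 3), F(1, 2), F(1, 6)],
     [F(0), F(0), F(1)]]

# First-step analysis for the single transient state 2 (index 1):
# f_{2,1} = P_{2,1} + P_{2,2} f_{2,1}  =>  f_{2,1} = P_{2,1} / (1 - P_{2,2}).
f21 = P[1][0] / (1 - P[1][1])
f23 = P[1][2] / (1 - P[1][1])


def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Numerical check: square the matrix 10 times to get P^(2^10) in floating point.
Pn = [[float(x) for x in row] for row in P]
for _ in range(10):
    Pn = mat_mul(Pn, Pn)

print(f21, f23, Pn[1])
```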
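8.
\[
P = \begin{array}{c|cccc}
  & 1 & 2 & 3 & 4 \\ \hline
1 & 0.25 & 0.75 & 0 & 0 \\
2 & 0.5 & 0.5 & 0 & 0 \\
3 & 0 & 1/3 & 1/2 & 1/6 \\
4 & 0 & 0 & 0 & 1
\end{array}
\]
What is $P^\infty$?
Class $\{1, 2\}$ is recurrent. Class $\{3\}$ is transient. Class $\{4\}$ is recurrent.
From Problems 2, 5, and 7, we know
\[
P^\infty = \begin{array}{c|cccc}
  & 1 & 2 & 3 & 4 \\ \hline
1 & \pi_1 & \pi_2 & 0 & 0 \\
2 & \pi_1 & \pi_2 & 0 & 0 \\
3 & f_{3,\{1,2\}} \pi_1 & f_{3,\{1,2\}} \pi_2 & 0 & f_{3,4} \\
4 & 0 & 0 & 0 & 1
\end{array}
= \begin{pmatrix}
0.4 & 0.6 & 0 & 0 \\
0.4 & 0.6 & 0 & 0 \\
2/3 \times 0.4 & 2/3 \times 0.6 & 0 & 1/3 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
where
\[
\begin{cases} f_{3,\{1,2\}} + f_{3,4} = 1 \\ f_{3,\{1,2\}} = 1/3 + 1/2 \, f_{3,\{1,2\}} \end{cases}
\]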
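9.
\[
P = \begin{pmatrix}
0.2 & 0.8 & 0 & 0 & 0 \\
0.5 & 0.5 & 0 & 0 & 0 \\
0 & 0.25 & 0 & 0.75 & 0 \\
0 & 0 & 0.5 & 0 & 0.5 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
Start with the rows of $P^\infty$ for the recurrent class $\{1, 2\}$:
\[
P^\infty = \begin{pmatrix}
\pi_1 & \pi_2 & ? & ? & ? \\
\pi_1 & \pi_2 & ? & ? & ? \\
? & ? & ? & ? & ? \\
? & ? & ? & ? & ? \\
? & ? & ? & ? & ?
\end{pmatrix}
\]
where
\[
(\pi_1 \;\; \pi_2) = (\pi_1 \;\; \pi_2) \begin{pmatrix} 0.2 & 0.8 \\ 0.5 & 0.5 \end{pmatrix}, \qquad \pi_1 + \pi_2 = 1,
\]
that is,
\begin{align*}
0.2\pi_1 + 0.5\pi_2 &= \pi_1 \\
0.8\pi_1 + 0.5\pi_2 &= \pi_2 \\
\pi_1 + \pi_2 &= 1
\end{align*}
so $(\pi_1 \;\; \pi_2) = (5/13 \;\; 8/13)$. Filling in the zeros and the absorbing state 5 gives
\[
P^\infty = \begin{pmatrix}
\pi_1 & \pi_2 & 0 & 0 & 0 \\
\pi_1 & \pi_2 & 0 & 0 & 0 \\
? & ? & ? & ? & ? \\
? & ? & ? & ? & ? \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]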
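Step 3: Work on the rows of $P^\infty$ for the transient classes (fill in the blanks), using $f_{3,\{1,2\}}$, $f_{4,\{1,2\}}$, $f_{3,5}$, $f_{4,5}$:
\[
P^\infty = \begin{pmatrix}
5/13 & 8/13 & 0 & 0 & 0 \\
5/13 & 8/13 & 0 & 0 & 0 \\
f_{3,\{1,2\}} \pi_1 & f_{3,\{1,2\}} \pi_2 & 0 & 0 & f_{3,5} \\
f_{4,\{1,2\}} \pi_1 & f_{4,\{1,2\}} \pi_2 & 0 & 0 & f_{4,5} \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
where:
$f_{3,\{1,2\}} = P(X_\infty \in \{1, 2\} \mid X_0 = 3)$
$f_{4,\{1,2\}} = P(X_\infty \in \{1, 2\} \mid X_0 = 4)$
$f_{3,5} = P(X_\infty = 5 \mid X_0 = 3)$
$f_{4,5} = P(X_\infty = 5 \mid X_0 = 4)$
Again, why is $P^\infty_{3,1} = f_{3,\{1,2\}} \pi_1$?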
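Step 4: Solve for $f_{3,\{1,2\}}$, $f_{4,\{1,2\}}$, $f_{3,5}$, $f_{4,5}$ and complete the matrix.
\[
P^\infty = \begin{pmatrix}
5/13 & 8/13 & 0 & 0 & 0 \\
5/13 & 8/13 & 0 & 0 & 0 \\
2/5 \times 5/13 & 2/5 \times 8/13 & 0 & 0 & 3/5 \\
1/5 \times 5/13 & 1/5 \times 8/13 & 0 & 0 & 4/5 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
Each row sums to 1: $\sum_{j=1}^{5} P^\infty_{i,j} = 1$ for $i = 1, \ldots, 5$.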
1.11 Review
Markov property
Being able to model a problem by a DTMC, including (1) identifying the state space; (2) constructing the transition probability matrix; (3) specifying the initial distribution
Meaning of transition probability $P$ and $P^n$
Sample path and transition diagram
Calculate the 1-step probability $P(X_1)$ using $P_{ij}$
Calculate the $n$-step transition probability $P(X_n \mid X_0)$ using $P^n_{ij}$, and notice that $(P_{i,j})^n \ne (P^n)_{ij}$
Calculate the stationary distribution (3 methods: (1) definition, (2) flow balance equations, and (3) cut method)
Explanation of stationary distribution
DTMC technique
(1) Accessibility, irreducibility, and definition of a class.
(2) Existence of a stationary distribution and the number of stationary distributions.
(3) Difference between $P(\tau_i < \infty \mid X_0 = i) = 1$ and $E[\tau_i \mid X_0 = i] < \infty$.
(4) Recurrent/transient/absorbing state: you need to identify whether a state/class is recurrent or transient.
(5) Period of a state.
(6) Solidarity (class) property.
(7) Limiting probability, its existence, and its relationship with the stationary distribution.
(8) Relationship between the stationary distribution and $E[\tau_i]$.
(9) Calculate the absorption probability matrix $P^\infty$.