
Games, Risks, and Decisions - M4S11

Taught by Lynda White - l.white@ic.ac.uk
Autumn 2014

Contents

0 Introduction
  0.1 Game Theory
  0.2 Utility Theory
  0.3 Decision Theory

1 Game Theory
  1.1 Introduction
  1.2 Two-Person Zero-Sum Games
  1.3 Non-Zero-Sum Games

2 Utility
  2.1 Introduction
  2.2 The Lottery Axioms
  2.3 The Existence of a Unique Utility Function
  2.4 The Utility of Money

3 Bayesian Methods

4 Decision Theory
  4.1 Introduction
  4.2 Decision Rules
  4.3 Decision Trees

5 Mastery Exam Component

6 Appendix: Further Material of Use
  6.1 On Risk and Loss Functions
  6.2 Mid-Term 1 - 13/11/14
  6.3 Mid-Term 2 - 11/12/14
  6.4 Further Problem Sheet Questions
0 Introduction
0.1 Game Theory
Started in the 1920s: Borel 1921, von Neumann 1928. Developed in WWII for logistics, submarine search, and air defence. 1944 gave us von Neumann and Morgenstern's The Theory of Games and Economic Behaviour. Later on, John Nash [film: A Beautiful Mind].
0.1.1 Aim
The aim of Game Theory is to develop optimal strategies in competitive situations involving two or more
intelligent antagonists. Key points:
Conflicting interests
Playing rationally
Areas of application:
1. Games:
   Chess: 2 players, no random element, perfect information
   Monopoly: 2 players, random element, perfect information
   Bridge: 2 against 2, random element, non-perfect information
   Can you think of a game with no random element and not perfect information?
2. Military / political situations, countries with conflicting interests. Note that coalitions can form
3. Biology: Competing species. Evolutionary stable strategies [i.e. balance of species in an environment
such that none has an incentive to change its behaviour]
4. Economics: Companies compete for the market share in a product. Auctions, how much would you bid?
We will only look at two-person games. [MSci students will consider n-person games.] There are two types:
1. Strictly competitive games where the players act independently
2. Games in which players may cooperate to their mutual advantage

0.2 Utility Theory

We will need to assess the value of the outcomes of any decisions we make. Not all outcomes are monetary.
We need a common scale [of value] to compare the values of, say, a kidney machine and an incubator in a
hospital. We will develop a mathematical theory based on axioms representing how a reasonable person makes
decisions. This leads to a utility function, a measure of value.
  Risk = E[loss] = −E[utility]

We will look at risk aversion. How risk averse are you? Consider choosing one of the following:

  1 guaranteed, or
  { 2 if a fair coin shows H
  { 0 if a fair coin shows T

N.B. these have the same expected outcome. Attitudes change when the sums are multiplied by 10, 100, 1000 etc.

0.3 Decision Theory


We make decisions all the time; the crucial element is the uncertainty. To limit the uncertainty we collect data.
Decision making is like playing a game with the unknown. We will study Bayesian inference to tell us how to
incorporate this data into the decision-making process in order to maximise expected utility.
Exempli gratia: the marriage problem. We want to choose neither too soon nor too late. Whom can you choose
as your long-term partner? Roughly, look around for about 1/3 of the time and then choose the first one after
that who is better than all those seen so far. In fact, it's not actually 1/3 but 1/e.
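The 1/e rule is easy to check by simulation. A minimal Monte Carlo sketch (the function name and parameters are ours, not from the notes):

```python
import math
import random

def secretary_success_rate(n=100, cutoff_frac=1/math.e, trials=20000, seed=1):
    """Estimate P(choosing the best of n candidates) under the rule:
    observe the first cutoff_frac*n candidates without choosing, then
    take the first candidate better than everyone seen so far."""
    rng = random.Random(seed)
    cutoff = int(cutoff_frac * n)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))                      # n - 1 is the best candidate
        rng.shuffle(ranks)
        best_seen = max(ranks[:cutoff], default=-1)
        chosen = ranks[-1]                          # otherwise forced to take the last one
        for r in ranks[cutoff:]:
            if r > best_seen:
                chosen = r
                break
        wins += (chosen == n - 1)
    return wins / trials

print(round(secretary_success_rate(), 3))  # close to 1/e ≈ 0.368
```

Sweeping cutoff_frac shows the success probability peaks near 1/e rather than 1/3, as claimed above.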
[End lecture 1, 09/10/14]

1 Game Theory

1.1 Introduction

We assume 2 players, A and B, both are rational and greedy! Each player has available a number of strategies,
or recipes, for playing the game. A strategy for a player [sometimes called a pure strategy] involves a
complete description of all the moves that the player will make, including responses to the opponent's moves
and any random moves [e.g. using dice or coins]. I.e. a strategy is a program that can be followed mechanically
by the player. It usually comprises a sequence of moves, some of which may be random. Examples:
1. Two companies, A and B. Each produces a product. To capture more of the market each will decide to
do one of three things:
Spend 10% extra on advertising
Reduce the product price
Give away a free product

These are the 3 pure strategies for each player


2. A fair coin is tossed. The result is shown to A and B. Each then has a choice of n possible moves,
{m1, …, mn}. Consider player A. A strategy for A is a map {H, T} → {m1, …, mn}. An example of
a strategy would be: if the coin shows H, play move m4; if the coin shows T, play move m6. H and T
can each be mapped to any of the n moves, so A has n² strategies. N.B. for each player there are 2n
outcomes. These are not strategies
3. GOPS [game of pure strategy]. Each player has 10 cards numbered 1, 2, …, 10. The dealer has a similar
set of cards and deals them onto the table one by one, at random, so both players can see them. As each
card is dealt, the players simultaneously bid for it by each playing one of their own cards. The higher
bid wins the dealt card. The cards that have been used for bidding are discarded. If both players play
the same card, the dealt card is discarded. The winner is the one with the higher total score on the dealt
cards they have won.
Let's think about GOPS with 2 cards: a strategy for A tells A which card [1 or 2] to play in response
to the first dealt card. The rest of the game is then determined. A strategy is a map {dealt cards} →
{A's first card}, i.e. a map {1, 2} → {1, 2}. The number of strategies is thus 2² = 4.
Question: How many strategies does each player have in a game of GOPS with 3 cards each?
Players may have a finite or infinite number of pure strategies. Here's an example where players have an infinite
number of strategies:

  A chooses a real number a ∈ [0, 1]
  B chooses a real number b ∈ [0, 1]
  A wins [and B loses] an amount |a − b|.

In general, if A plays a and B plays b we write down the ordered pair (expected gain to A, expected gain to B).
N.B. "expected" is to take account of possible random moves. Write this as (gA(a, b), gB(a, b)). We call gA(a, b)
and gB(a, b) the gains or pay-offs to A and B respectively when A, B play a, b respectively.
If gain to A + gain to B = 0 for all a, b we call the game zero-sum. In such a game, whatever one player wins the
other loses. The game is strictly competitive; there's no point in collaborating. If gain to A + gain to B = c
[= constant, ≠ 0] then we make this into a zero-sum game. Let

  g'A(a, b) = gA(a, b) − c/2
  g'B(a, b) = gB(a, b) − c/2

then g'A(a, b) + g'B(a, b) = 0 for all a, b.
If the gain to A + the gain to B ≠ constant, i.e. in a non-constant-sum game, there are two possibilities:
1. The players can't collaborate; there's no pre-play communication. E.g. the Prisoner's Dilemma
2. The players can collaborate for their mutual advantage. This is sometimes called bargaining
Listing the strategies for each player and forming a table of pay-offs [or gains] is called the normal form of the
game.
Another form of the game, the extensive form, describes the game as a tree. E.g. Nim: piles of stones. Players,
in turn, take stones from one pile only. The aim is to take the last stone. The two approaches are equivalent
with strategies being described using paths through the tree in the extensive form, see section 4.4 for an
example. We will look only at the normal form.
[End lecture 2, 14/10/14]

GOPS with 3 Cards

There are 7,077,888 pure strategies. We need two sub-strategies here. A strategy for A must tell her/him:
1. Which card to play in response to the first card dealt
2. Having played her/his first card, which second card to play in response to B's first card and the second
dealt card
1. is determined by a map {1, 2, 3} → {1, 2, 3}, i.e. {first dealt card} → {A's first card}. ∃ 3³ such
maps. For each such map, A needs to know how to respond at the next step, i.e. A has 2 choices for each
((i, j), k, l) where: i = first card dealt; j = A's first card; k = B's first card; l = second card dealt.
For a given map [in 1.] there are 3 possibilities for the [ordered] pair (i, j), 3 for k, and 2 for l. Hence for each
map [in 1.] there are 3 × 3 × 2 = 18 possibilities for stage 2. We have a map from this set of 18 to the 2 choices
for A, giving 2¹⁸ sub-strategies. The total number of strategies for A is therefore 3³ × 2¹⁸ = 7,077,888.
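The count above is easy to sanity-check:

```python
# Stage 1: {first dealt card} -> {A's first card}: 3**3 maps.
# Stage 2: 3 choices for (i, j) x 3 for k x 2 for l = 18 situations,
# with 2 remaining cards to choose from in each -> 2**18 sub-strategies.
stage1_maps = 3 ** 3
stage2_situations = 3 * 3 * 2
total = stage1_maps * 2 ** stage2_situations
print(total)  # 7077888
```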

1.2 Two-Person Zero-Sum Games

1.2.1 Introduction
Let

  AS = the set of pure strategies for A
  BS = the set of pure strategies for B

Assume gain to A + gain to B = 0 for all a ∈ AS, b ∈ BS. AS, BS can be finite or infinite. A and B choose
their strategies independently and simultaneously with no collaboration. If A chooses a ∈ AS and B chooses
b ∈ BS let:

  g(a, b) = gain to A

so −g(a, b) = gain to B, or g(a, b) = loss to B. N.B. g(a, b) could be negative! g(a, b) can be monetary or
simply a score, e.g. A wins ⟹ g = +1 and A loses ⟹ g = −1. If AS = {a1, …, am} and BS = {b1, …, bn}
are finite, the game is called a finite or matrix game. In this case, we write:

  gij = g(ai, bj)

Then G = (gij) is called the pay-off matrix.
For analytical reasons, we assume g(a, b) is bounded below.
Note the problem of choosing optimal strategies is not altered by the following:
1. Adding a constant to each pay-off
2. Multiplying each pay-off by a positive constant
Examples:
1. Rock, paper, scissors. Win = +1; lose = −1. Rock beats scissors, paper beats rock, scissors beat paper:

           R   P   S
    R  [   0  −1   1 ]
    P  [   1   0  −1 ]
    S  [  −1   1   0 ]

2. Matching pennies. A and B each have a coin, and each simultaneously shows a side, H or T, of her coin.
If both show the same then B pays A one unit [i.e. A wins, B loses]. Otherwise, B wins one unit from
A. a1 = H, a2 = T, b1 = H, b2 = T:

    G = [  1  −1 ]
        [ −1   1 ]
3. Matching pennies with imperfect spying. Like matching pennies, but after A has made a choice of H or
T [without showing it to B] a spy-coin is tossed by a third party. The spy is unreliable and:

  P(spy-coin shows A's choice) = p
  P(spy-coin shows opposite of A's choice) = 1 − p

Both A and B know the value of p. The result of the spy-coin toss is then shown to B, who then makes
a choice. A has 2 pure strategies: a1 = H; a2 = T. B has 4 pure strategies: a strategy is a map
{spy-coin tosses} → {H, T}. We can label these strategies:

  (H, H)   (H, T)   (T, H)   (T, T)

Here, the first entry in each ordered pair tells B how to play if the spy-coin shows heads and the second
entry tells B how to play if the spy-coin shows tails. E.g. (H, H) tells B to play H in either case. (H, T)
tells B to copy the spy-coin.
Let's take a look at the pay-off matrix:

            (H,H)   (H,T)   (T,H)   (T,T)
  a1 = H [    1     2p−1    1−2p     −1  ]
  a2 = T [   −1     2p−1    1−2p      1  ]

These are the expected gains to A. For example:

  g(a1, b2) = g(H, (H, T)) = 1 × P(B chooses H) + (−1) × P(B chooses T)
            = 1 × P(spy-coin shows H) + (−1) × P(spy-coin shows T)   [since B copies the spy-coin]
            = p − (1 − p) = 2p − 1
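The four columns of this matrix can be generated mechanically rather than case by case. A sketch (function name ours), using exact arithmetic so the entries come out as the formulas above:

```python
from fractions import Fraction

def spy_payoff_matrix(p):
    """Expected gain to A in matching pennies with an imperfect spy.
    Rows: A plays H or T. Columns: B's maps (spy shows H, spy shows T) -> choice."""
    b_strategies = [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
    matrix = []
    for a in ('H', 'T'):
        row = []
        for b in b_strategies:
            # Spy shows A's choice with prob p, the opposite with prob 1 - p.
            shows = [(a, p), ('T' if a == 'H' else 'H', 1 - p)]
            gain = 0
            for spy, prob in shows:
                b_choice = b[0] if spy == 'H' else b[1]
                gain += prob * (1 if b_choice == a else -1)  # match: A wins one unit
            row.append(gain)
        matrix.append(row)
    return matrix

print(spy_payoff_matrix(Fraction(3, 4)))  # rows [1, 1/2, -1/2, -1] and [-1, 1/2, -1/2, 1]
```

With p = 3/4 the columns are 1, 2p − 1 = 1/2, 1 − 2p = −1/2 and −1, matching the table.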

Here's another example, from the 2013/14 course:

4. A man, X, arrives in his office at a random time T between 4:00 and 5:00 p.m., i.e. T ~ U(4, 5), a
uniform distribution. X will leave the office at 5:00 p.m. There are two clients, A and B. Each wants
to meet X for dinner that evening but must phone X while X is in the office to make the arrangements.
Each client only has time to make one phone call, and whoever calls first, provided X is in the office,
will get the dinner appointment. If both call before X arrives in the office, each gets pay-off 0. Gain
from getting the dinner appointment = +1; gain from not getting the dinner appointment = −1. Note that
this is a zero-sum game. Let A call at time a and B call at time b, with a, b ∈ [4, 5], a ≠ b. If T = t, A gains:

   1   if t < a < b or b < t < a
  −1   if t < b < a or a < t < b
   0   if a < t and b < t

So:

  g(a, b) = E_T(gain to A)
          = { 1 × P(T < a) + (−1) × P(a < T < b),    a < b
            { 1 × P(b < T < a) + (−1) × P(T < b),    a > b
          = { 2a − b − 4,    a < b
            { a − 2b + 4,    a > b      [N.B. a ≠ b]
1.2.2 Pure Strategy Saddle-Points

Question: How should A and B play in a two-person zero-sum game?

  A wants to maximise g(a, b)
  B wants to minimise g(a, b)

Consider player A and pure strategy a ∈ AS. We know that A cannot get less than:

  inf_{b ∈ BS} g(a, b)

[Recall that inf = greatest lower bound. In a finite game, this inf is simply the smallest entry in row a.]
A cannot guarantee getting more than this. Hence:

  L = sup_{a ∈ AS} inf_{b ∈ BS} g(a, b)

is the upper limit to what A can guarantee getting. [sup = least upper bound. In a finite game,
L = max_a min_b g(a, b).]

[End lecture 3, 15/10/14]


Similarly, for any b ∈ BS, B will not lose more than:

  sup_{a ∈ AS} g(a, b)

In a finite game, this is the largest entry in column b. B can't guarantee losing less if they play b. B wants to
choose b to minimise this guarantee level, so if:

  U = inf_{b ∈ BS} sup_{a ∈ AS} g(a, b)

then −U = the upper limit of what B can guarantee getting.

Definition: L = the lower pure value of the game; U = the upper pure value of the game.

Lemma 1. L ≤ U.

Proof. For any a ∈ AS, b ∈ BS we have:

  inf_{b'} g(a, b') ≤ g(a, b) ≤ sup_{a'} g(a', b)

and this yields:

  sup_{a'} inf_{b'} g(a', b') ≤ sup_{a'} g(a', b)  ∀ b   ⟹   sup_{a'} inf_{b'} g(a', b') ≤ inf_{b'} sup_{a'} g(a', b')

i.e. L ≤ U. ∎

Definition: If L = U then their common value is called the pure value of the game.

Suppose ∃ (a*, b*) with a* ∈ AS, b* ∈ BS such that:

  (1)   L = sup_a inf_b g(a, b) = inf_b g(a*, b)
        U = inf_b sup_a g(a, b) = sup_a g(a, b*)

Then, in a finite game, b* minimises the column maxima in the pay-off matrix and a* maximises the row
minima [in the pay-off matrix].
If L = U = the pure value of the game, then (1) ⟹ sup_a g(a, b*) = inf_b g(a*, b). Since g(a*, b*) lies between
the two values, they are both equal to g(a*, b*). If, in a finite game, g(a*, b*) is simultaneously the smallest value
in its row and the largest in its column of the pay-off matrix, then the game has a pure value = g(a*, b*).
Example:

              b1  b2  b3    row min.
      a1  [    1   3   2 ]     1
      a2  [    5   4   6 ]     4
      a3  [    3   2   4 ]     3
  col. max.    5   4   6

Largest row min. = smallest col. max. = 4. Pure value = 4. A should play a2 and B should play b2.
If such a*, b* exist [as in (1)] we say that the game has a pure strategy saddle-point at (a*, b*) [similarly,
g(a*, b*) is called a saddle-point]. We sometimes say that a* and b* are good strategies for A and B respectively.
We use the abbreviation PSSP. If a game has a PSSP [i.e. an element of the pay-off matrix that is the smallest
in its row and largest in its column] then we have solved the game.
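The "smallest in its row, largest in its column" test is easy to mechanise. A sketch (function name ours):

```python
def find_pssp(G):
    """Return (i, j, value) for a pure strategy saddle-point of pay-off
    matrix G (gains to the row player A), or None if no PSSP exists.
    A PSSP entry is the smallest in its row and the largest in its column."""
    for i, row in enumerate(G):
        for j, g in enumerate(row):
            if g == min(row) and g == max(G[k][j] for k in range(len(G))):
                return i, j, g
    return None

G = [[1, 3, 2],
     [5, 4, 6],
     [3, 2, 4]]
print(find_pssp(G))  # (1, 1, 4): A plays a2, B plays b2, pure value 4
```

Running it on the matching pennies matrix [[1, −1], [−1, 1]] returns None, as expected.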
Definition: A pure strategy a ∈ AS is said to be dominated by another pure strategy a' ∈ AS if:

  g(a, b) ≤ g(a', b)  for all b ∈ BS

[I.e. a' is always at least as good as a, whatever B plays.] Similarly b ∈ BS is dominated by another pure strategy
b' ∈ BS if g(a, b) ≥ g(a, b') for all a ∈ AS.
Warning: It is possible for a game to have a PSSP at (a*, b*) but for a* to be dominated by another pure
strategy! Exempli gratia:

           b1  b2  b3
    a1  [   3   5   4 ]
    a2  [   3   6   5 ]
    a3  [   1   2   0 ]

g(a1, b1) = 3 is a PSSP. However a2 dominates a1, the PSSP. (a2, b1), also with pay-off 3, is also a PSSP.



Exercises
1. Show that if a game has two PSSPs, given by (a, b) and (a', b'), then g(a, b) = g(a', b')
2. Show that if a game has a unique PSSP, given by (a*, b*), then a* is not dominated by any other a' ∈ AS,
with a similar result for B. [Hint: WLOG let (a1, b1) be a unique PSSP in a finite game. Suppose, for a
contradiction, that a2 dominates a1. Show that (a2, b1) also gives a PSSP]
If A has a pure strategy, a, that is dominated by a pure strategy a' ∈ AS then we delete the row of the pay-off
matrix corresponding to a. We may lose a PSSP by deleting a row [or column] like this, but it will not affect
the existence [or otherwise] of a PSSP in the game. See 2. above. It is good practice to delete dominated rows
and columns and to look for PSSPs.
Not all games have a PSSP. Here's an example of a game without a PSSP [with L = 4, U = 5]:

              b1  b2  b3    row min.
      a1  [    1   3   2 ]     1
      a2  [    5   4   6 ]     4
      a3  [    3   7   4 ]     3
  col. max.    5   7   6

We can delete a1 by comparing it with a2 or with a3 to give:

  [ 5  4  6 ]
  [ 3  7  4 ]

We can now delete b3 [compare it with b1] to give:

          b1  b2
    a2 [   5   4 ]
    a3 [   3   7 ]
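The deletion of dominated rows and columns can be iterated automatically. A sketch (function name ours; when two strategies are identical, one of the pair is deleted):

```python
def delete_dominated(G):
    """Iteratively delete dominated rows (for the maximiser A) and
    dominated columns (for the minimiser B) of pay-off matrix G.
    Returns the reduced matrix."""
    G = [row[:] for row in G]
    changed = True
    while changed:
        changed = False
        # Row r is dominated by row s if G[r][j] <= G[s][j] for all j.
        for r in range(len(G)):
            if any(s != r and all(G[r][j] <= G[s][j] for j in range(len(G[0])))
                   for s in range(len(G))):
                del G[r]
                changed = True
                break
        # Column c is dominated by column d if G[i][c] >= G[i][d] for all i.
        for c in range(len(G[0])):
            if any(d != c and all(G[i][c] >= G[i][d] for i in range(len(G)))
                   for d in range(len(G[0]))):
                for row in G:
                    del row[c]
                changed = True
                break
    return G

print(delete_dominated([[1, 3, 2], [5, 4, 6], [3, 7, 4]]))  # [[5, 4], [3, 7]]
```

On the example above it deletes a1 and then b3, leaving exactly the 2 × 2 game shown.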
[End lecture 4, 16/10/14]

The Duellists' Problem

2 duellists, X and Y. Each has a gun with only one bullet. They start at 1 unit distance apart and
simultaneously start walking towards each other. Each must decide when [i.e. at what distance apart] they
should fire. Rules:
1. A duellist who hits the other one wins 1 point and the loser loses 1 point
2. If a duellist fires and misses, the other can walk up and shoot at point-blank range, therefore winning
3. If both fire at the same time and both hit or both miss, each gets a pay-off of 0
This is a two-person zero-sum game. The pure strategies for X and Y are XS = YS = [0, 1]. Let X fire at a
distance x apart; let Y fire at a distance y apart; x, y ∈ [0, 1].

We'll assume P(X hits when the distance apart is x) = p1(x) and P(Y hits when the distance apart is y) = p2(y).
Both X and Y know the functions p1(x) and p2(y). p1(x) and p2(y) are monotonic decreasing and continuous,
with p1(0) = p2(0) = 1. The expected pay-offs to X depend on whether:

  X fires first         x > y
  Y fires first         x < y
  Both fire together    x = y

We get:

  g(x, y) = { 1 × P(X hits) + (−1) × P(X misses),                        x > y
            { 1 × P(Y misses) + (−1) × P(Y hits),                        x < y
            { 1 × P(X hits)P(Y misses) + (−1) × P(X misses)P(Y hits),    x = y

Using P(X hits) = p1(x) and P(Y hits) = p2(y):

  g(x, y) = { 2p1(x) − 1,       x > y
            { 1 − 2p2(y),       x < y
            { p1(x) − p2(x),    x = y

Consider the pure strategy <d>: fire at distance apart d, where d satisfies p1(d) + p2(d) = 1. [∃ such a solution.]
N.B. d satisfies p1(d) = 1 − p2(d), i.e. P(X hits) = P(Y misses), and vice-versa. Suggest X and Y both play
this pure strategy <d>. We find that (<d>, <d>) gives a PSSP. To see this, calculate:

  g(x, <d>) = { 2p1(x) − 1,       x > d
              { 1 − 2p2(d),       x < d
              { p1(d) − p2(d),    x = d

Note that 2p1(x) − 1 < 2p1(d) − 1 [when x > d, because p1 is mon. decreasing] = p1(d) − p2(d), and
1 − 2p2(d) = p1(d) − p2(d). So sup_x g(x, <d>) = p1(d) − p2(d) = g(<d>, <d>).
Similarly inf_y g(<d>, y) = p1(d) − p2(d) = g(<d>, <d>), so (<d>, <d>) gives a PSSP and the pure value of
the game is p1(d) − p2(d).
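The firing distance d with p1(d) + p2(d) = 1 can be found numerically by bisection, since p1 + p2 is continuous and decreasing with p1(0) + p2(0) = 2. A sketch, assuming p1(1) + p2(1) ≤ 1 so a root is bracketed; the two hit functions below are hypothetical, not from the notes:

```python
def fire_distance(p1, p2, tol=1e-12):
    """Bisection for the common firing distance d with p1(d) + p2(d) = 1.
    Assumes f(d) = p1(d) + p2(d) - 1 is continuous, decreasing,
    positive at d = 0 and non-positive at d = 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p1(mid) + p2(mid) - 1 > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical hit functions, just to illustrate:
p1 = lambda x: 1 - x         # X's gun
p2 = lambda x: 1 - x ** 2    # Y's gun: better at long range
d = fire_distance(p1, p2)    # solves (1-d) + (1-d^2) = 1, i.e. d^2 + d = 1
print(round(d, 4), round(p1(d) - p2(d), 4))  # d ≈ 0.618; value ≈ -0.2361
```

Here the value p1(d) − p2(d) is negative: with the better long-range gun, Y wins on average.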

1.2.3 Randomised Strategies

If ∃ a PSSP, it's clear to both players what they need to do; we're home and dry. If no PSSP exists then we will
find that it is beneficial to introduce randomised strategies. Consider the following example:

         b1  b2
  a1  |   3   4      No PSSP
  a2  |   5   2

Suppose the game is to be played many times and suppose A always plays a1. Then B would always play b1
and A would get 3. If A always plays a2 then B would play b2 and A would get 2. I.e. if A always sticks to one
pure strategy, they can't get more than 3 = max(2, 3). With randomised strategies A can do better than 3. In
fact, they can get 3 1/2 on average. Also, randomised strategies introduce an element of surprise.
Definition: A randomised [or mixed] strategy for A is a probability distribution over AS, A's pure strategies
[with a similar definition for B].
If the game is finite and AS = {a1, …, am}, BS = {b1, …, bn} then a randomised strategy is a set of probabilities
{p1, …, pm} where pi ≥ 0 and Σi pi = 1, such that A chooses pure strategy ai with probability pi. We write
α = (p1, …, pm), β = (q1, …, qn) where qi ≥ 0, Σi qi = 1. Sometimes we write α = p1 a1 + p2 a2 + ⋯ + pm am
or α = p1 <a1> + ⋯ + pm <am>.
N.B. pure strategies are just special cases of randomised strategies. E.g. the pure strategy a1 corresponds
to α := (1, 0, …, 0) =: (p1, p2, …, pm).
A and B don't reveal their [randomised] strategies to each other pre-play. If A chooses α and B chooses β
then the pay-off to A, in a finite game, is:

  E[gain to A] = g(α, β) = Σi Σj gij pi qj

Here G = (gij) is the pay-off matrix. Notice that the term pi qj occurs because the players choose their randomised
strategies independently. Since the strategy choices are independent this can also be written as:

  Σi pi Σj gij qj = Σi pi g(ai, β)

In the continuous case:

  E[gain to A] = ∫a ∫b g(a, b) fA(a) fB(b) db da

where fA, fB are P.D.F.s.
[End lecture 5, 21/10/14]

Question 1
Part I

             B
         b1  b2  b3  b4
  a1  |   a   a   b   b
  a2  |   c   d   c   d
  a3  |   c   e   c   e

WLOG a ≤ b. If we look at column b3 and compare it with b1 we see that b3 is dominated by b1, and similarly
b4 is dominated by b2. This leaves us with the following:

         b1  b2
  a1  |   a   a
  a2  |   c   d
  a3  |   c   e

WLOG let's take d ≤ e. So a2 is dominated by a3. Notice that we have not gained any PSSPs along the
way. We now have:

         b1  b2
  a1  |   a   a
  a3  |   c   e

WLOG c ≤ e. The second strategy for B in this 2 × 2 game is dominated by the first. We are then left with

  [ a ]
  [ c ]

which has a PSSP at max{a, c}.

Part II

  G = [ 1  2  2 ]
      [ 0  2  1 ]
      [ 0  0  3 ]

So, no; this is a counterexample: G has a PSSP but Gᵀ does not have a PSSP.

Question 2
Part I
WLOG we can assume the pay-offs are 1, 2, 3, 4 in some order and that g(a1, b1) = 1. We can write down the
6 possible pay-off matrices:

  [1 2]  [1 2]  [1 3]  [1 3]  [1 4]  [1 4]
  [3 4]  [4 3]  [2 4]  [4 2]  [2 3]  [3 2]

All of these are equally likely, and 4 of them have a PSSP, so P(PSSP) = 4/6 = 2/3.

Part II
3 × 3 game.

  P(PSSP) = Σ_{(i,j)} P(PSSP in cell (i, j))

since the entries are distinct, so these 9 events are disjoint [if there were more than one PSSP, two entries in
the pay-off matrix would have to be equal; this contradicts the fact that all entries are distinct]. Consider:

  P(PSSP in cell (i, j) | the cells in row i and col. j contain, say, a, b, c, d, e) = 4/5!

E.g. take (i, j) to be the middle cell. The entry in cell (i, j) must be the middle value in {a, b, c, d, e}.
The 4 remaining values must be placed such that the 2 largest values go in row i and the 2 smallest values go
in column j [2 orders in each case]. ∃ 5! possible arrangements.
This conditional probability is the same for all selections of 5 symbols out of the 9, so unconditionally
P(PSSP in cell (i, j)) = 4/5!, and so:

  P(PSSP) = 9 × 4/5! = 3/10
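The 3/10 answer can be confirmed by brute force, enumerating all 9! equally likely arrangements of 9 distinct pay-offs (function names ours):

```python
from itertools import permutations
from fractions import Fraction

def has_pssp(G):
    """True if some entry of G is smallest in its row and largest in its column."""
    row_min = [min(r) for r in G]
    col_max = [max(c) for c in zip(*G)]
    return any(G[i][j] == row_min[i] == col_max[j]
               for i in range(len(G)) for j in range(len(G[0])))

def p_pssp_3x3():
    """Exact P(PSSP) for a random 3x3 pay-off matrix with 9 distinct entries."""
    hits = total = 0
    for perm in permutations(range(9)):
        G = [perm[0:3], perm[3:6], perm[6:9]]
        hits += has_pssp(G)
        total += 1
    return Fraction(hits, total)

print(p_pssp_3x3())  # 3/10
```

The enumeration takes a few seconds but is exact, unlike a Monte Carlo estimate.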

Question 3

Assume that the volume of available business is proportional to the population and that between them the two
companies take all of the business [so it becomes a constant-sum game].

                         Small Company
                        W    X    Y    Z
               W  |    60   48   56   64
  Large        X  |    72   60   64   68
  Company      Y  |    64   56   60   72
               Z  |    56   52   48   60

E.g. g(X, Y) = (0.8 × 20) + (0.8 × 40) + (0.4 × 20) + (0.4 × 20) = 64, since L is closer to X and W but farther
from Y and Z than S is. ∃ a PSSP at (X, X), and L can expect to get 60% of the business. Constant-sum game.
When the population distribution changes, let p = 3q. The population distribution becomes:

  W: 20(1 + 2q)%    X: 40(1 − 3q)%    Y: 20(1 + 2q)%    Z: 20(1 + 2q)%

N.B. the sum is 100.

We recalculate the pay-off matrix:

                              Small Company
                      W           X           Y           Z
           W  |      60       48 + 16q    56 − 8q     64 − 32q
  Large    X  |   72 − 16q       60       64 − 32q    68 − 24q
  Company  Y  |   64 + 8q     56 + 32q       60       72 − 16q
           Z  |   56 + 32q    52 + 24q    48 + 16q       60

If p < 3/8 then (X, X) is still a PSSP
If p = 3/8 then (X, X), (X, Y), (Y, X) and (Y, Y) all become PSSPs
When p > 3/8, (Y, Y) is a PSSP [and (X, X) no longer is a PSSP]

[If the stores can locate anywhere between W and Z, then ∃ a PSSP at (X, X).]
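The three cases in p = 3q can be checked numerically from the recalculated matrix (helper names ours; rows/columns 0-3 stand for W, X, Y, Z):

```python
def payoff(q):
    """Large company's pay-off matrix (% of business) as a function of q."""
    return [[60,          48 + 16*q,  56 - 8*q,   64 - 32*q],
            [72 - 16*q,   60,         64 - 32*q,  68 - 24*q],
            [64 + 8*q,    56 + 32*q,  60,         72 - 16*q],
            [56 + 32*q,   52 + 24*q,  48 + 16*q,  60]]

def pssp_cells(G):
    """All cells (i, j) whose entry is smallest in row i and largest in column j."""
    return [(i, j) for i in range(4) for j in range(4)
            if G[i][j] == min(G[i]) and G[i][j] == max(row[j] for row in G)]

# p = 3q, so p = 3/8 corresponds to q = 1/8:
print(pssp_cells(payoff(0.05)))   # [(1, 1)]                  -> (X, X)
print(pssp_cells(payoff(0.125)))  # four saddle-points at p = 3/8
print(pssp_cells(payoff(0.2)))    # [(2, 2)]                  -> (Y, Y)
```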

Extra Solutions
Let's also answer the exercises posed in lecture 4:
1. If a game has 2 PSSPs then they have the same pay-off.
WLOG ∃ PSSPs at distinct locations in the pay-off matrix at (a1, b1) and (a2, b2). If these are in the
same row or column they must have the same pay-off. If not, then we have:

  g(a1, b1) ≤ g(a1, b2)   [because (a1, b1) is a PSSP: smallest in its row]
            ≤ g(a2, b2)   [because (a2, b2) is a PSSP: largest in its column]

Similarly g(a2, b2) ≤ g(a1, b1), using comparisons with g(a2, b1). Hence g(a1, b1) = g(a2, b2).

2. If (a1, b1) is a PSSP and a2 dominates a1, show that (a2, b1) also gives a PSSP. [Hint: Show g(a1, b1) =
g(a2, b1).]
Clearly:

  g(a2, b1) ≥ g(a1, b1)   [a2 dominates a1]
  g(a1, b1) ≥ g(a2, b1)   [(a1, b1) is a PSSP: largest in its column]

So g(a1, b1) = g(a2, b1) and g(a2, b1) ≥ g(a, b1) ∀ a. Also, g(a2, b1) = g(a1, b1) ≤ g(a1, b) ∀ b [as (a1, b1) is
a PSSP] ≤ g(a2, b) ∀ b [a2 dominates a1].
So g(a2, b1) is the smallest in its row and largest in its column.
[End lecture 6, 23/10/14]
We only consider the cases where pairs of randomised strategies have finite pay-offs. Let

  𝒜 = {α : g(α, β) < ∞ for all β}
  ℬ = {β : g(α, β) < ∞ for all α}
Example:

         b1  b2        Let α = (1/3, 2/3)
  a1  [   3   4 ]          β = (3/4, 1/4)
  a2  [   5   1 ]

  g(α, β) = 3 × (1/3)(3/4) + 4 × (1/3)(1/4) + 5 × (2/3)(3/4) + 1 × (2/3)(1/4)
          = (1/3)[3 × (3/4) + 4 × (1/4)] + (2/3)[5 × (3/4) + 1 × (1/4)]
          = 15/4
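The double sum Σi Σj gij pi qj is one line of code. A sketch (function name ours), checked on the example above with exact arithmetic:

```python
from fractions import Fraction as F

def mixed_payoff(G, p, q):
    """Expected gain to A, sum_i sum_j g_ij p_i q_j, when A mixes the
    rows of G with probabilities p and B mixes the columns with q."""
    return sum(G[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))

G = [[3, 4], [5, 1]]
alpha = (F(1, 3), F(2, 3))
beta = (F(3, 4), F(1, 4))
print(mixed_payoff(G, alpha, beta))  # 15/4
```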
What are the analogues of L, U when we allow random strategies?

Lemma 2. For any α* ∈ 𝒜, β* ∈ ℬ:

  1. sup_{α ∈ 𝒜} g(α, β*) = sup_{a ∈ AS} g(a, β*)
  2. inf_{β ∈ ℬ} g(α*, β) = inf_{b ∈ BS} g(α*, b)

Proof (of 1. only, in the discrete case). Clearly sup_{a ∈ AS} g(a, β*) ≤ sup_{α ∈ 𝒜} g(α, β*) because AS ⊆ 𝒜.
Conversely, if α = (p1, …, pm) then:

  g(α, β*) = Σi pi g(ai, β*)
           ≤ Σi pi sup_{a ∈ AS} g(a, β*)   [definition of sup, and pi ≥ 0]
           = sup_{a ∈ AS} g(a, β*)          [Σi pi = 1]

for all α, so sup_{α ∈ 𝒜} g(α, β*) ≤ sup_{a ∈ AS} g(a, β*). ∎

Definitions: A maximin strategy for A is a randomised strategy α* such that:

  inf_β g(α*, β) = sup_α inf_β g(α, β)

or equivalently [by lemma 2] such that inf_b g(α*, b) = sup_α inf_b g(α, b). The RHS is called the lower value of
the game and is denoted by VL [compare this with L, where we only look at pure strategies]. Clearly
sup_α inf_b g(α, b) ≥ sup_a inf_b g(a, b) [AS ⊆ 𝒜], i.e. VL ≥ L.
Similarly, a minimax strategy for B is a randomised strategy β* such that sup_α g(α, β*) = inf_β sup_α g(α, β).
The RHS = the upper value of the game = VU. We showed [lemma 1] that L ≤ U. The same argument shows
VL ≤ VU. We also have VU ≤ U, so we have L ≤ VL ≤ VU ≤ U.
If VL = VU then their common value, denoted by V, is called the value of the game, and (α*, β*) define a
saddle-point of the game [where α* represents the maximin and β* a minimax] and we have:

  inf_β g(α*, β) = sup_α g(α, β*) [= g(α*, β*)]

[Compare this with the PSSP.] If α* and β* define a saddle-point then we say that α* and β* are good strategies
for A and B respectively. In this case, the quantity V = value = amount A should pay B for the privilege of
playing the game. V = 0 defines a fair game.
Let's find a minimax strategy for B. [We'll see a better method later.] Example:

         b1  b2
  a1  [   1   3 ]      L = 1
  a2  [   2   1 ]      U = 2     ⟹ VU, VL ∈ [1, 2]

Let β = (q1, q2) where qi ≥ 0 and q1 + q2 = 1. We will find q1 and q2 to make β minimax for B. Calculate:

  g(a1, β) = q1 + 3q2 = 3 − 2q1
  g(a2, β) = 2q1 + q2 = 1 + q1

We need to choose q1 so that the larger of these two, 3 − 2q1 and 1 + q1, is minimised. [N.B. we only need to
consider g(a1, β) and g(a2, β), by lemma 2.]

[In the omitted diagram, the two lines cross at a point x.] At x: q1 = 2/3, q2 = 1/3. So (2/3, 1/3) is minimax
for B and VU = 1 + 2/3 = 5/3, which is the y-coordinate at x.

Similarly, we find α = (1/3, 2/3) is maximin for A and VL = 5/3. As these two are equal, the game has value 5/3.
This is a very long-winded method, and later on we'll see a better method of solving 2 × 2 games.
Note if AS and/or BS are/is not finite, then there is no guarantee that the suprema and infima above are
actually attained by any randomised strategies.
Definition: If a game has a value and if the maximin and minimax strategies for A, B respectively exist, then
the game is called strictly determined. We will prove later that all finite games are strictly determined.
All perfect information games have a PSSP [including chess] but we won't prove this.
To solve a game means to find VL, VU and maximin and minimax strategies for A, B when these exist. The
following result enables us to check whether a guessed solution to a game is actually a solution.
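For a 2 × 2 game with no PSSP, the equalising calculation above has a closed form; a sketch (function name ours; only valid when the game has no saddle-point, so the mixes land in [0, 1]):

```python
from fractions import Fraction as F

def solve_2x2(G):
    """Solve a 2x2 zero-sum game with no PSSP by equalising:
    choose q1 with g(a1, beta) = g(a2, beta), and p1 with
    g(alpha, b1) = g(alpha, b2)."""
    (g11, g12), (g21, g22) = G
    denom = g11 - g12 - g21 + g22
    q1 = F(g22 - g12, denom)       # B's weight on b1
    p1 = F(g22 - g21, denom)       # A's weight on a1
    value = F(g11 * g22 - g12 * g21, denom)
    return (p1, 1 - p1), (q1, 1 - q1), value

alpha, beta, V = solve_2x2([[1, 3], [2, 1]])
print(alpha, beta, V)  # (1/3, 2/3), (2/3, 1/3), value 5/3
```

On the example above it reproduces α = (1/3, 2/3), β = (2/3, 1/3) and V = 5/3, and on matching pennies it gives the (1/2, 1/2) mixes with value 0.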
[End lecture 7, 28/10/14]
In the class on Thursday 6th November, we'll go over questions 4, 7, 12, and 11.
The following lemma tells us how to check that a guessed solution to a game is actually a solution:
Lemma 3. If α*, β* are randomised strategies for A, B respectively such that for all a ∈ AS, b ∈ BS we have:

  g(a, β*) ≤ g(α*, b)

then the game has value V = g(α*, β*) and α*, β* are maximin and minimax respectively.

Proof.

  VU = inf_β sup_α g(α, β) ≤ sup_α g(α, β*)
     = sup_a g(a, β*)      [by lemma 2]
     ≤ inf_b g(α*, b)      [by the hypothesis: the largest value on the LHS ≤ the smallest on the RHS]
     = inf_β g(α*, β)      [by lemma 2]
     ≤ sup_α inf_β g(α, β) = VL

Hence VU ≤ VL. But we know that VL ≤ VU, so VL = VU [= V, say], which is the value of the game:

  V = sup_α g(α, β*) = inf_β g(α*, β)

[because the inequalities above have become equalities]. Hence α* is maximin and β* is minimax. Finally,
since g(α*, β*) lies between the terms in the above equation, we have V = g(α*, β*). ∎

Examples:
1.
         b1  b2  b3
  a1  [   1   3   3 ]
  a2  [   2   1   2 ]

Let α* = (1/3, 2/3), β* = (2/3, 1/3, 0). We calculate: g(a1, β*) = 5/3, g(a2, β*) = 5/3; g(α*, b1) = 5/3,
g(α*, b2) = 5/3, g(α*, b3) = 7/3. Using lemma 3 we deduce that α* is maximin, β* is minimax and V = 5/3.

2. AS = BS = [0, 1], a, b ∈ [0, 1], g(a, b) = |a − b|, i.e.

  g(a, b) = { a − b,   a ≥ b
            { b − a,   a < b

First calculate L, U. L = sup_a inf_b g(a, b): first fix a; inf_b |a − b| = 0, so L = sup_a 0 = 0.
U = inf_b sup_a g(a, b): first fix b; sup_a |a − b| = max{b, 1 − b}, so U = inf_b max{b, 1 − b}. [In the omitted
diagram we've drawn b < 1/2, which would yield 1 − b.]
Exercise: Show U = 1/2. So L ≠ U.

This tells us that if the game has a value V then V ∈ [0, 1/2]. One might guess

  α* = 1/2 <0> + 1/2 <1>,    β* = <1/2>

Using lemma 3:

  g(a, β*) = g(a, <1/2>) = |a − 1/2| ≤ 1/2,   a ∈ [0, 1]
  g(α*, b) = 1/2 g(<0>, b) + 1/2 g(<1>, b) = 1/2 (b + 1 − b) = 1/2

As g(a, β*) ≤ 1/2 = g(α*, b) ∀ a ∈ [0, 1], we can apply lemma 3. So α* is maximin, β* is minimax and V = 1/2.
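Lemma 3 checks like the one in example 1 are mechanical in the finite case: compute g(a, β*) for every row and g(α*, b) for every column and compare. A sketch (names ours):

```python
from fractions import Fraction as F

G = [[1, 3, 3],
     [2, 1, 2]]
alpha = [F(1, 3), F(2, 3)]          # guessed maximin for A
beta = [F(2, 3), F(1, 3), F(0)]     # guessed minimax for B

row_payoffs = [sum(G[i][j] * beta[j] for j in range(3)) for i in range(2)]
col_payoffs = [sum(G[i][j] * alpha[i] for i in range(2)) for j in range(3)]
print(row_payoffs)  # [5/3, 5/3]      -> g(a_i, beta)
print(col_payoffs)  # [5/3, 5/3, 7/3] -> g(alpha, b_j)

# Lemma 3's hypothesis: g(a, beta) <= g(alpha, b) for all pure a, b.
assert max(row_payoffs) <= min(col_payoffs)  # so V = g(alpha, beta) = 5/3
```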

1.2.4 Admissible Strategies

Definition: A randomised strategy α for A is called inadmissible if ∃ a randomised strategy α' ∈ 𝒜 such that
g(α', b) ≥ g(α, b) ∀ b ∈ BS, where at least one inequality is strict [so α' ≠ α]. In this case, α' is said to dominate
α, i.e. whatever B plays, α' is better [or equal] than [or to] α for A.
Definition: A randomised strategy α is called admissible if no such α' exists. [There are similar definitions for
B, with the inequalities the other way round.]
We can delete inadmissible strategies without affecting the existence or otherwise of a saddle-point. We may,
by doing this, miss one or more solutions to the game, but this doesn't matter as we only want one solution.

Example: 5 targets; 2 armies.

[Figure omitted: targets 1-5, with 2 joined to 1, 3 and 4, and 4 joined to 5.]

Army A can attack one and only one target. Army B must choose one target to defend, but any target connected
immediately to that chosen target is also automatically defended. E.g. if B chooses 2 then 1, 3 and 4 are also
defended. If A attacks a defended target he/she loses a point [i.e. receives −1] and B receives +1. If A attacks
an undefended target then A receives +1 and B receives −1. This is a zero-sum game [always check]!

                     B defends
                   1    2    3    4    5
          1  [    −1   −1    1    1    1 ]
    A     2  [    −1   −1   −1   −1    1 ]
 attacks  3  [     1   −1   −1    1    1 ]
          4  [     1   −1    1   −1   −1 ]
          5  [     1    1    1   −1   −1 ]

Delete the inadmissible a2 by comparing it with a1; similarly delete a4 by comparing it with a5. Delete b1 by
comparing it with b2; delete b3 by comparing it with b2.
[End lecture 8, 29/10/14]
Doing this yields the following matrix:

B defends
                 2    4    5
A          1 |  −1   +1   +1
attacks    3 |  −1   +1   +1
           5 |  +1   −1   −1

4 and 5 are the same so we can delete one; 1 and 3 are the same so we can also delete one of those. This yields:

B defends
                 2    4
A          3 |  −1   +1
attacks    5 |  +1   −1

Try to guess a solution. Use lemma 3 to show that σ_S = τ_S = (1/2, 1/2) is a solution of the sub-game: σ_S is the maximin and τ_S is the minimax in the sub-game.
Maximin and minimax strategies for the whole game are σ = (0, 0, 1/2, 0, 1/2), τ = (0, 1/2, 0, 1/2, 0). Value? Value = 0. [Exercise.] Is this solution reasonable? [Exercise.]
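The lemma 3 check suggested above can be done mechanically. The sketch below rebuilds the pay-off matrix from the rules of the example; the adjacency used (1–2, 2–3, 2–4, 4–5) is an assumption consistent with the stated deletions, since only the connections at target 2 are given explicitly:

```python
from fractions import Fraction

# Assumed connections between the 5 targets (only those at target 2 are
# stated explicitly in the example).
adj = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2, 5}, 5: {4}}

# Pay-off to A: -1 if the attacked target i is defended (i = j or i is
# adjacent to the defended target j), +1 otherwise.
G = [[-1 if i == j or i in adj[j] else 1 for j in range(1, 6)]
     for i in range(1, 6)]

def is_solution(G, sigma, tau):
    """Lemma 3: (sigma, tau) solve the game with value g(sigma, tau) iff
    g(a_i, tau) <= g(sigma, tau) <= g(sigma, b_j) for all pure a_i, b_j."""
    m, n = len(G), len(G[0])
    v = sum(sigma[i] * G[i][j] * tau[j] for i in range(m) for j in range(n))
    ok_rows = all(sum(G[i][j] * tau[j] for j in range(n)) <= v
                  for i in range(m))
    ok_cols = all(sum(sigma[i] * G[i][j] for i in range(m)) >= v
                  for j in range(n))
    return ok_rows and ok_cols, v

half = Fraction(1, 2)
ok, V = is_solution(G, [0, 0, half, 0, half], [0, half, 0, half, 0])
```

Under the assumed adjacency, `ok` comes out `True` with `V = 0`, confirming the guessed solution.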

1.2.5 Equaliser Strategies


Definition: A randomised strategy σ for A is called an equaliser strategy [ES] if g(σ, b) = constant ∀b ∈ B_S, and there's a similar definition for B.
Example: Two committees, A and B, are at war. B has two airfields, but can only defend one.
B's pure strategies are: b1: defend airfield 1; b2: defend airfield 2.
A only has resources to attack one airfield. A's pure strategies are a1 and a2. If A attacks a defended airfield it withdraws and there is no loss to either side. If A attacks an undefended airfield, then the airfield is destroyed. Airfield 1 is twice as valuable, to both A and B, as airfield 2.
         b1   b2
a1 |   0    2
a2 |   1    0

Consider σ = (1/3, 2/3). We find g(σ, b1) = g(σ, b2) = 2/3. So σ is an equaliser strategy for A: g(σ, τ) = 2/3 ∀τ.
Also, τ = (2/3, 1/3) is an equaliser strategy for B.
Exercise: Show that σ, τ are maximin and minimax for A, B respectively, using lemma 3. Find V.

If we had instead the following pay-off matrix:

         b1   b2   b3
a1 |   0    2    2
a2 |   1    0    1

we would find that σ = (1/3, 2/3) would still be maximin and τ = (2/3, 1/3, 0) would still be the minimax [check using lemma 3], but σ would no longer be an equaliser strategy, because g(σ, b3) = 4/3: i.e. a maximin strategy is not necessarily an equaliser strategy [same for minimax]. Likewise, an equaliser strategy is not necessarily maximin or minimax. Exempli gratia:

         b1   b2
a1 |   2    4
a2 |   2    4

b2 is an equaliser strategy, but b2 is not minimax. However...


Lemma 4. If both A and B have equaliser strategies, σ* and τ* say, then the game has a value and σ*, τ* are maximin and minimax respectively.

Proof. For all σ ∈ Ā we have that:

inf_τ g(σ*, τ) = g(σ*, τ*)   [σ* is ES]
             = g(σ, τ*)     [τ* is ES]
             ≥ inf_τ g(σ, τ)   ⟹   inf_τ g(σ*, τ) ≥ sup_σ inf_τ g(σ, τ)

By definition, σ* is maximin, because the reverse inequality, ≤, trivially holds. Exercise: Show τ* is minimax.
Also inf_τ g(σ*, τ) = g(σ*, τ*) = sup_σ g(σ, τ*). This also shows that V_L = g(σ*, τ*) = V_U, so the game has value g(σ*, τ*).

If a game has a pair of equaliser strategies σ*, τ* for A, B respectively, then we say they form a simple solution.


1.2.6 Solving 2 × 2 Games

Consider the two-person zero-sum game with pay-offs:

         b1   b2
a1 |   x    y
a2 |   z    w

Assume neither player has a pure strategy which is an ES, i.e. x ≠ y, x ≠ z, z ≠ w, y ≠ w. [Exercise: show that if x = y then ∃ a PSSP.]
then 9 a PSSP.
If the game has a PSSP, we are done! If the game has no PSSP then

x − y and z − w have opposite signs, and
x − z and y − w have opposite signs.

[If they didn't have opposite signs then the game would have a PSSP. None of the differences can be 0 because otherwise ∃ a PSSP.] Consider the following:

σ = ( |z − w| / (|x − y| + |z − w|) , |x − y| / (|x − y| + |z − w|) )

N.B. the denominator ≠ 0, and σ is a proper vector of probabilities.

Firstly, notice that σ represents a randomised strategy. Secondly, suppose x < y and z > w. Then:

g(σ, b1) = (x|z − w| + z|x − y|) / (|x − y| + |z − w|) = (x(z − w) + z(y − x)) / (|x − y| + |z − w|) = (zy − wx) / (|x − y| + |z − w|)

and

g(σ, b2) = (y|z − w| + w|x − y|) / (|x − y| + |z − w|) = (zy − wx) / (|x − y| + |z − w|)

So σ is an equaliser strategy for A if x < y and z > w. If x > y and z < w we find, in a similar fashion, that g(σ, b1) = g(σ, b2) = (wx − zy)/(|x − y| + |z − w|). In both cases σ is an ES for A. Similarly:

τ = ( |y − w| / (|y − w| + |x − z|) , |x − z| / (|y − w| + |x − z|) )

is an ES for B. So the game has equaliser strategies for A and for B and, by lemma 4, σ is maximin and τ is minimax. Exercise: Compare with the result for the airfields example in section 1.2.5.
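These formulae are easy to mechanise. The sketch below uses an equivalent determinant form: when there is no PSSP, x − y and z − w have opposite signs, so |x − y| + |z − w| = |x − y − z + w| and the absolute values can be dropped:

```python
from fractions import Fraction

def solve_2x2(x, y, z, w):
    """Mixed solution of the 2x2 game [[x, y], [z, w]], assuming no PSSP
    (so D below is nonzero and all fractions land in [0, 1])."""
    D = x - y - z + w
    sigma = (Fraction(w - z, D), Fraction(x - y, D))   # maximin for A
    tau = (Fraction(w - y, D), Fraction(x - z, D))     # minimax for B
    V = Fraction(x * w - y * z, D)                     # value of the game
    return sigma, tau, V
```

On the airfields game [[0, 2], [1, 0]] this returns σ = (1/3, 2/3), τ = (2/3, 1/3), V = 2/3, agreeing with section 1.2.5.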

[End lecture 9, 30/10/14]


The test will be very basic — solving a series of games. There'll be a class on Thursday 6th November. This will be the last examinable lecture for the test on the 13th November.
1.2.7 The G⁻¹ Method

If m = n, G, the pay-off matrix, is square. If G⁻¹ exists we can use it to check if the game has a simple solution. If the game has a simple solution this method gives it.
Let G = (g_ij) be the n × n pay-off matrix. Calculate G⁻¹ if such exists. If G⁻¹ does not exist, try adding a constant to each entry of G to give an invertible matrix. [The game theory problem will be unaffected. This will work, i.e. ∃ such a constant, provided the rows of G together with the row vector of all ones span Rⁿ.]
Assume that G⁻¹ exists. We will look for a pair of ESs [one for A, one for B]. Let:

σ = (p1, …, pn),   τ = (q1, …, qn)

σ is an ES if g(σ, bj) = Σᵢ pᵢ gᵢⱼ = k ∀j, i.e. if

(1)   σG = k·1ᵀ, where 1ᵀ = row vector of all ones.

Similarly for B we have

(2)   Gτᵀ = k′·1, where 1 = column vector of all ones.

From (1): σ = k·1ᵀG⁻¹. From (2): τᵀ = k′·G⁻¹1.

We need to show that these two, σ and τ, represent proper randomised strategies. This will be so provided the entries of σ are all the same sign [similarly for τ]. [0 is allowed.] From the expressions above:

σ consists of k × [row vector of the column sums of G⁻¹]
τ consists of k′ × [row sums of G⁻¹]

In order for these to be randomised strategies, G⁻¹ must have all its row sums of the same sign [or 0] and all its column sums of the same sign [or 0]. If this property does not hold then ∄ a simple solution. If these conditions are met, we have:

σ = 1ᵀG⁻¹ / (1ᵀG⁻¹1) = 1ᵀG⁻¹ / (sum of all entries of G⁻¹)

Similarly:

τᵀ = G⁻¹1 / (sum of all entries of G⁻¹)

[Sum of entries of G⁻¹ = sum of those of (G⁻¹)ᵀ.] σ, τ are maximin and minimax respectively by lemma 4.

Exercise: Verify the value:

V = k = k·1ᵀG⁻¹1 / (1ᵀG⁻¹1) = σ·1 / (1ᵀG⁻¹1) = 1 / (1ᵀG⁻¹1) = 1 / (sum of entries of G⁻¹)

[using σ·1 = 1].

Example:

G =  5   0   5          G⁻¹ = (1/185) ×  −7   25   15      [row sum 33]
     7  −3  −1                          −31    5   40      [row sum 14]
     3   5   4                           44  −25  −15      [row sum 4]
                                    column sums: 6, 5, 40

We ignore the factor 1/185 and add the numbers outside the G⁻¹ matrix. The row and column sums are all positive, so the game has a simple solution:

σ = (6/51, 5/51, 40/51) = (2/17, 5/51, 40/51),   τ = (33/51, 14/51, 4/51) = (11/17, 14/51, 4/51),   V = 1/(sum of entries of G⁻¹) = 185/51

Warning: this is not necessarily the best method. Unless G is very easy to invert it is often better to seek equaliser strategies by writing down the relevant equations.
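The method can be checked with exact rational arithmetic. The sketch below hand-rolls a Gauss–Jordan inverse (any linear-algebra library would do) and reproduces the 3 × 3 example:

```python
from fractions import Fraction

def inverse(G):
    """Exact inverse of a square matrix by Gauss-Jordan elimination."""
    n = len(G)
    A = [[Fraction(G[i][j]) for j in range(n)]
         + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # pivot row
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def g_inverse_method(G):
    """Simple solution via G^-1: sigma ~ column sums of G^-1, tau ~ row
    sums, V = 1/(sum of all entries) -- valid only if all the row sums
    and all the column sums share a sign."""
    Ginv = inverse(G)
    n = len(Ginv)
    rows = [sum(r) for r in Ginv]
    cols = [sum(Ginv[i][j] for i in range(n)) for j in range(n)]
    total = sum(rows)
    same_sign = (all(s >= 0 for s in rows + cols)
                 or all(s <= 0 for s in rows + cols))
    if total == 0 or not same_sign:
        return None          # no simple solution by this method
    return [s / total for s in cols], [s / total for s in rows], 1 / total

sigma, tau, V = g_inverse_method([[5, 0, 5], [7, -3, -1], [3, 5, 4]])
```

This gives σ = (2/17, 5/51, 40/51), τ = (11/17, 14/51, 4/51) and V = 185/51, as above.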

1.2.8 Equilibrium Pairs

Definition: A pair of strategies (σ*, τ*) is said to be in equilibrium [or form an equilibrium pair] if:

g(σ, τ*) ≤ g(σ*, τ*) ≤ g(σ*, τ)   ∀σ ∈ Ā, τ ∈ B̄

In an equilibrium pair, neither player has an incentive to change strategy.
From lemma 3 we see that an equilibrium pair has σ*, τ* maximin [for A] and minimax [for B] and V = g(σ*, τ*). However, the converse also holds:


Lemma 5. If σ*, τ* are maximin and minimax and if the game has a value, then V = g(σ*, τ*) and (σ*, τ*) is an equilibrium pair.

Proof.

V_L = inf_τ g(σ*, τ) ≤ g(σ*, τ*) ≤ sup_σ g(σ, τ*) = V_U    [σ* is maximin; τ* is minimax]

Since the game has a value, we know V_L = V_U = V, so everything in the line above is equal and V = g(σ*, τ*).
Now to show (σ*, τ*) is an equilibrium pair, let σ′, τ′ ∈ Ā, B̄ respectively. Then we have:

g(σ′, τ*) ≤ sup_σ g(σ, τ*) [= g(σ*, τ*)] = inf_τ g(σ*, τ) ≤ g(σ*, τ′)

so (σ*, τ*) is an equilibrium pair.

1.2.9 S-Games

Pre-amble
If we have a set of points x1, …, xn ∈ Rᵐ we can form their convex hull, C(x1, …, xn), which consists of all points of the form q1x1 + q2x2 + ⋯ + qnxn where qᵢ ≥ 0 ∀i and Σᵢ qᵢ = 1.
E.g. with m = 2, imagine a rubber band closing in on the points shaded. The convex hull is the triangle and its interior, C(x1, x2, x3, x4).
In general, C(x1, …, xn) is a convex set, i.e. for any 2 points in C, the line segment joining them is also in C. [Exercise.]

[End lecture 10, 04/11/14]


If we have 2 disjoint [i.e. non-intersecting] convex sets in Rᵐ it can be shown that there is a hyperplane [i.e. an (m − 1)-dimensional affine subspace of Rᵐ]:

a1x1 + a2x2 + ⋯ + amxm = k,   aᵢ ∈ R

which separates the two convex sets. Here, (x1, …, xm) is a point in Rᵐ. E.g. with m = 2 one can obviously always draw a line [of dimension 1 = 2 − 1] that separates the two disjoint convex sets C1 and C2.

If (x1, …, xm) ∈ Rᵐ then

a·x > k for x ∈ C2
a·x < k for x ∈ C1

[>, < could be the other way round], where a = (a1, …, am) and k is a constant.
In our diagram, C1 and C2 include their boundaries. However not all convex sets do include their boundaries, e.g. {(x1, x2) ∈ R² : x1² + x2² < 1}. We could have 2 disjoint convex sets, C1 and C2, where C2 includes its boundary but C1 does not, but one of the boundary points of C1 is in C2.

We still get a separating hyperplane, but

a·x ≥ k for x ∈ C2
a·x ≤ k for x ∈ C1 ∪ (boundary of C1)

Application to Finite Games

Finite game: A_S = {a1, …, am}, B_S = {b1, …, bn}. Let τ ∈ B̄ be a randomised strategy for B. We represent τ by a point in Rᵐ:

(g(a1, τ), g(a2, τ), …, g(am, τ)) ∈ Rᵐ

Exempli gratia: G =  3  2  1  2
                     4  1  5  3

Let τ = (2/3)b1 + (1/3)b2. [Recall from M1GLA:]

g(a1, τ) = (2/3)g(a1, b1) + (1/3)g(a1, b2) = 8/3
g(a2, τ) = (2/3)g(a2, b1) + (1/3)g(a2, b2) = 3

τ = (2/3)b1 + (1/3)b2 lies on the line segment joining b1 and b2, and lies in C(b1, b2, b3, b4). [Any other τ ∈ B̄ will also lie in the convex hull of b1, b2, b3, b4.]
More generally, τ = q1b1 + ⋯ + qnbn [qᵢ ∈ [0, 1], Σqᵢ = 1] is represented by the centre of gravity of weights q1, …, qn at b1, …, bn respectively.
Let S be the set of all points in Rᵐ that represent randomised strategies for B. S is called the risk set for B. [A can also have a risk set. If we want to distinguish between the 2 risk sets, we call them S(B) and S(A).]
S [for B] is a convex set. It's the convex hull of the points in Rᵐ that represent the pure strategies for B. In our example, S is the triangle with vertices b1, b2, b3 and its interior. If m = 2, S is a convex polygon and its interior. In higher dimensions, m > 2, S is a convex polyhedron and its interior.
B must choose a point in S as her randomised strategy.
What about A? A typical randomised strategy for A is λ = (λ1, …, λm), where λᵢ ∈ [0, 1] and Σᵢ λᵢ = 1. This defines a set of hyperplanes in Rᵐ:

H_k : λ1x1 + ⋯ + λmxm = k,   k a constant.

So we have a set of hyperplanes {H_k}. As k varies the H_k form a set of parallel hyperplanes.
Example [m = 2] [the arrow indicates the direction, which follows from k increasing]:

A typical line is λ1x1 + λ2x2 = k. Exercise: Why is the slope of the lines negative?
[All points in S on the same hyperplane have the same pay-off to A [and to B].]
In general, a typical λ = (λ1, …, λm) defines a direction in Rᵐ. If λ ∈ Ā, τ ∈ B̄, then the pay-off to A is:

g(λ, τ) = λ1 g(a1, τ) + ⋯ + λm g(am, τ) = (λ1 × first coordinate of s) + ⋯ + (λm × mth coordinate of s) = λ·s   [dot product]

where s ∈ S is the point representing τ in S.
N.B. all points on the same hyperplane H_k have the same pay-off [namely k, as long as H_k intersects S]. Think of this as a new game in which B's randomised strategies are points of S, the risk set, and A's pure strategies are the m coordinates in Rᵐ. The pay-off to A is λ·s. This is called an S-game.

E.g.: G =  3  2  1  2
           4  1  5  3

with τ = (2/3)b1 + (1/3)b2:

1. If λ = (1, 0) then H_k is the line x1 [= g(a1, τ)] = k
2. If λ = (0, 1) then H_k is the line x2 [= g(a2, τ)] = k
3. If λ = (1/3, 2/3) then H_k is the line (1/3)x1 + (2/3)x2 = k

We will represent maximin [for A] and minimax [for B] geometrically.


[End lecture 11, 05/11/14]

Question 4
R = rock, S = scissors, P = paper.

        R    P    S
R  |   0   −1    1
P  |   1    0   −1
S  |  −1    1    0

σ = (1/3, 1/3, 1/3) and τ = (1/3, 1/3, 1/3) are both ESs. So σ is maximin, τ is minimax and V = 0.

Question 7
Zero-sum game because all targets are equally valuable to A and to B. The pay-off matrix is [for m = 6]:

[6 × 6 pay-off matrix of 0s and ±1s — not recoverable from this copy.]

The obvious thing to do is to delete inadmissible pure strategies. This leads to:

         b1   b6
a2 |   0    1
a5 |   1    0

Consider σ_S = (1/2, 1/2), τ_S = (1/2, 1/2). Both are ESs, so σ_S, τ_S are maximin and minimax respectively in the sub-game. For the whole game we have:

σ = (0, 1/2, 0, 0, 1/2, 0),   τ = (1/2, 0, 0, 0, 0, 1/2),   V = 1/2

Check this using Lemma 3. [Exercise.]

m = 8: Delete inadmissible pure strategies. This leads to the 4 × 4 matrix:

         b1   b4   b5   b8
a2 |   0    1    1    1
a4 |   1    0    0    1
a5 |   1    0    0    1
a7 |   1    1    1    0

a4 = a5 and b4 = b5 in the sub-game. Delete, for example, a4 and b4. We then get:

         b1   b5   b8
a2 |   0    1    1
a5 |   1    0    1
a7 |   1    1    0

which has a simple solution:

σ_S = τ_S = (1/3, 1/3, 1/3)

We then extend this to the whole game:

σ = (0, 1/3, 0, 0, 1/3, 0, 1/3, 0),   τ = (1/3, 0, 0, 0, 1/3, 0, 0, 1/3),   V = 2/3

Question 11
Suppose A bids a and B bids b.

A gets: 400 − a if a > b;  b if a < b;  200 if a = b.   [This gives g(a, b).]
B gets: a if a > b;  400 − b if a < b;  200 if a = b.   [This is a constant-sum game.]

The table of pay-offs to A [bids and entries in units of 100; there are 4 PSSPs]:

                    B's bid
                0    1    2    3    4
A's      0 |   2    1    2    3    4
bid      1 |   3    2    2    3    4
         2 |   2    2    2    3    4
         3 |   1    1    1    2    4
         4 |   0    0    0    0    2

If we change this to a zero-sum game by subtracting 200 from each pay-off, Value = 0. This is a fair game.
If any real-numbered bid from 0 to 400 is allowed, consider A and B both playing / bidding ⟨200⟩. This is a PSSP. Check this using Lemma 3.
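The table and its saddle points can be generated mechanically; a sketch with bids restricted to multiples of 100 as above (entries in the original units):

```python
def payoff_to_A(a, b):
    """A's gain when A bids a and B bids b for the object worth 400."""
    if a > b:
        return 400 - a
    if a < b:
        return b
    return 200

bids = [0, 100, 200, 300, 400]
G = [[payoff_to_A(a, b) for b in bids] for a in bids]

# A PSSP is an entry that is both a minimum of its row (A cannot do
# better by deviating) and a maximum of its column (nor can B).
saddles = [(a, b)
           for i, a in enumerate(bids) for j, b in enumerate(bids)
           if G[i][j] == min(G[i])
           and G[i][j] == max(G[r][j] for r in range(len(bids)))]
```

`saddles` contains exactly the four pairs with both bids in {100, 200}, each with pay-off 200 — consistent with the value 200 (i.e. 0 after subtracting 200).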

Question 12
By contradiction. Suppose τ is a unique minimax strategy for B that is not admissible. Then ∃ τ′ ≠ τ such that:

g(σ, τ′) ≤ g(σ, τ)   ∀σ ∈ Ā

[with at least one strict inequality]. Hence:

sup_σ g(σ, τ′) ≤ sup_σ g(σ, τ)

By the definition of τ as minimax, we must have:

sup_σ g(σ, τ′) = sup_σ g(σ, τ)

so τ′ is also minimax, which contradicts the uniqueness of τ as minimax.

Question 8
Part I
Here we have a PSSP, at (a1, b3) with V = 3:

G =  3  4  3
     2  6  1
     5  4  0

Part II

G =  x  0  0
     0  y  0
     0  0  z

Are there any PSSPs? If any of x, y, z is 0 we get a PSSP and V = 0.
Look for ESs. Suppose σ = (p, q, r) is an ES for A, with p, q, r ∈ [0, 1] and p + q + r = 1. Then:

xp = yq = zr   ⟹   p = yz/D,  q = xz/D,  r = xy/D

where D = xy + xz + yz. If x, y, z all have the same sign then (p, q, r) represents a randomised strategy, and V = xyz/D.
If they didn't all have the same sign, delete inadmissible strategies → a game with value 0.
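A quick exact-arithmetic sketch of the Part II formulae:

```python
from fractions import Fraction

def diag_game(x, y, z):
    """Equaliser solution of G = diag(x, y, z) when x, y, z all share a
    sign: solve xp = yq = zr with p + q + r = 1."""
    D = x * y + x * z + y * z
    sigma = (Fraction(y * z, D), Fraction(x * z, D), Fraction(x * y, D))
    return sigma, Fraction(x * y * z, D)
```

E.g. `diag_game(1, 2, 3)` gives σ = (6/11, 3/11, 2/11) and V = 6/11; indeed 1·(6/11) = 2·(3/11) = 3·(2/11).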

Part III
Add 1 to each entry in Part II.
[End lecture 12, 06/11/14]
[Test 1: Thursday 13th November, 40 minutes long. Justify your answers! Test 2: 11th December.]
We will represent minimax and maximin strategies geometrically. Let:

Q_λ = {(x1, …, xm) ∈ Rᵐ : xᵢ ≤ λ, i ∈ {1, …, m}}

Q_λ is the set of points in Rᵐ whose maximum coordinate is ≤ λ. [λ may not be small, or even positive!] For m = 2:

Q_λ is convex and it includes its boundary. [The interior of Q_λ is also convex.]
Let S_λ = S ∩ Q_λ. S_λ contains those points of the risk set, S, whose maximum coordinate is ≤ λ.
If λ is very large and positive then the whole of S will lie in Q_λ and S_λ = S; if λ is large in magnitude but negative, then Q_λ may not intersect S at all. Somewhere in-between we have the smallest λ such that Q_λ intersects S [S_λ ≠ ∅]. [Since we assumed that the pay-offs are bounded, this is finite.]


We let:

λ_M = inf{λ : S_λ = S ∩ Q_λ ≠ ∅}

Claim: S_{λ_M} = S ∩ Q_{λ_M} is the set of minimax points for B. This is because it contains those points of S that minimise the maximum coordinate, i.e. they minimise sup_a g(a, τ) [= sup_σ g(σ, τ)].
Typically, for m = 2, we have:

Here, S_{λ_M} consists of the single point s_M alone. Note that other situations can arise, e.g. multiple minimax points, and a minimax point for B that is a pure strategy for B.

What about A? Consider the hyperplane H_k where:

λ1x1 + λ2x2 + ⋯ + λmxm = k

For fixed λ = (λ1, …, λm) [i.e. a randomised strategy for A] and fixed k, all points of S on the hyperplane have the same pay-off [= k]. For fixed λ, the hyperplanes H_k are parallel. As we increase k the pay-off increases.
H_k must intersect S, so the most A can guarantee getting by playing λ is given by:

inf{k : H_k ∩ S ≠ ∅}   [= k_λ, say]

A wants to choose λ [i.e. a direction] to maximise this infimum. We will see soon that the optimal λ [i.e. maximin] is given by a hyperplane separating two convex sets: S and the interior of Q_{λ_M}.

Typically, for m = 2, we have [the direction of the lines is determined by λ; the solid line is the separating line that yields the limiting k]:

The coefficients of the separating line give the maximin strategy for A.
In many situations the separating line in the case m = 2 will lie along an edge of S. Suppose we have:

H_k has equation λ1x1 + λ2x2 = k. The point of contact P has x1 > x2. So we can increase the pay-off to A by increasing λ1 [and so decreasing λ2]. This rotates the line through P clockwise [P remains fixed].
However we cannot go beyond the edge PQ because otherwise, as it reaches Q, the point of contact changes. Similarly, if Q were the point of contact we could increase the pay-off to A by increasing λ2 [as x2 > x1 for Q] and thereby rotating the line anti-clockwise. The limiting [optimal] situation is that the optimal direction for A is parallel to PQ.
Notes [for m = 2]:
1. The slope of the separating line is never positive [λ1 + λ2 = 1, λᵢ ≥ 0]
2. The separating line passes through (λ_M, λ_M) [proved later]
3. Typically [m = 2], we get the diagram at the top below. The direction of the separating line gives us the direction for A's maximin strategy. Sometimes the separating line [m = 2] is not along an edge of S, and there may be more than one maximin strategy. [Illustrated in the other two diagrams below.]

We will show [later] that the separating line [m = 2] passes through (λ_M, λ_M). And, as we've already seen, sometimes the separating line is not along an edge of the risk set S.
[End lecture 13, 11/11/14]
Class Thursday 20th November.
1.2.10 The Minimax Theorem for Finite Games

Recall: B is trying to minimise sup_{aᵢ} g(aᵢ, τ) = max_{aᵢ} g(aᵢ, τ). [The aᵢ are pure strategies for A.]
Notation: In an S-game, g(λ, τ) = g(λ, s), where s is the point representing τ in the S-game. We can write the pay-off to A as g(λ, τ) = g(λ, s) = λ·s [a dot product], λ = (λ1, …, λm), s = (s1, …, sm).

Theorem 1 (The Minimax Theorem; a very famous theorem proved by von Neumann). Every finite game is strictly determined [i.e. has a value and maximin [for A] and minimax [for B] strategies].
Proof. B_S is finite, so S is closed [i.e. it includes all its limit points] and S is bounded. Let s_M ∈ S be a minimax point for B [s_M ∈ S because S contains its limit points]. By definition of minimax:

λ_M [= maximum coordinate of s_M] = sup_σ g(σ, s_M) = max_{a ∈ A_S} g(a, s_M), as the game is finite.

Recall: Q_{λ_M} = {(x1, …, xm) ∈ Rᵐ : xᵢ ≤ λ_M, i = 1, 2, …, m}. Let T be the interior of Q_{λ_M}, so T = {(x1, …, xm) ∈ Rᵐ : xᵢ < λ_M, i = 1, 2, …, m}. T is a convex set, and T ∩ S = ∅ because all points in S have at least one coordinate ≥ λ_M [by definition of λ_M]. S is also convex [and includes its boundary], so ∃ a separating hyperplane a·x = k, where a = (a1, …, am). [N.B. these aᵢ are not pure strategies for A; they're components of a row vector a, and x = (x1, …, xm).] This separates S and T:

(1)   a·x ≥ k for x ∈ S;   a·x ≤ k for x ∈ T ∪ boundary of T [i.e. Q_{λ_M}]

[We'll see that if ≥, ≤ are the other way round, the proof still works!]

Let εᵢ ∈ Rᵐ have ith coordinate = 1 and the rest = 0. Then s_M − εᵢ ∈ Q_{λ_M}, because all components of s_M − εᵢ are ≤ λ_M. Then:

a·s_M ≥ k ≥ a·(s_M − εᵢ)   [s_M ∈ S, s_M − εᵢ ∈ Q_{λ_M}]

Hence a·εᵢ ≥ 0 ∀i, i.e. the ith coordinate of a is ≥ 0 for i = 1, …, m. [If ≥, ≤ were the other way round in (1), then this would yield aᵢ ≤ 0 ∀i = 1, …, m, and we could work with −a.]

Let σ = a / Σᵢ aᵢ. [This is a vector of probabilities whose entries sum to 1.] So σ represents a randomised strategy for A. Let V = k / Σᵢ aᵢ.

We'll show that σ is maximin for A and V = value of the game. From (1):

(2)   σ·s ≥ V for s ∈ S;   (3)   σ·x ≤ V for x ∈ Q_{λ_M}

Consider the point x = (λ_M, …, λ_M) ∈ Q_{λ_M} [by definition], so from (3):

λ_M = σ·x ≤ V

For any σ′ ∈ Ā and any τ ∈ B̄, represented by s ∈ S, we have:

(4)   g(σ′, s_M) = σ′·s_M ≤ λ_M [definition of λ_M as the maximum coordinate of s_M] ≤ V ≤ σ·s = g(σ, τ)

Looking at the 2 extreme ends of this inequality we can use lemma 3 to show that σ is maximin [we know already that s_M is minimax] and V is the value of the game: if we put σ′ = σ and s = s_M in (4), this gives V = g(σ, s_M) = λ_M, and we get equality throughout (4).

Corollary: From (3) we get σ·x = λ_M = V, so the separating hyperplane passes through x = (λ_M, …, λ_M).


Examples:
1.
         b1   b2   b3   b4   b5
a1 |   4    5    8    2    6
a2 |   1    8    5    6    6

In this diagram, the dotted lines denote the boundaries of Q_{λ_M}. s_M is the point (22/7, 22/7) [the intersection of the line x1 = x2 and the line joining b1 and b4]. V = 22/7.
The minimax for B is (4/7)b1 + (3/7)b4 [M1GLA!], i.e. τ = (4/7, 0, 0, 3/7, 0).
The separating line joins b1 and b4 and has equation:

(5/7)x1 + (2/7)x2 = 22/7    [x1 = g(a1, τ), x2 = g(a2, τ)]

So A's maximin strategy is (5/7)a1 + (2/7)a2.
The only admissible strategies for B are those on the line segment joining b1 and b4. In fact, from the pay-off matrix, b2, b3 and b5 are inadmissible. We could have deleted these to give:

         b1   b4
a1 |   4    2
a2 |   1    6

[End lecture 14, 12/11/14]

2. A has 3 strategies and B has 2. There are no obvious dominated [inadmissible] pure strategies. Plot the risk set S(A) for A [n = 2]:

         b1   b2
a1 |   3    5
a2 |   4    1
a3 |   0    8

The dotted lines mark the boundary of the limiting Q_λ for A. t_M, the intersection of the line x1 = x2 and the segment from a1 to a2, gives the maximin point for A: (3/5)a1 + (2/5)a2. [Exercise.]
The separating line is along a1a2. This has equation 4x1 + x2 = 17. We just have to normalise this:

(4/5)x1 + (1/5)x2 = 17/5 = V,   τ = (4/5, 1/5),   σ = (3/5, 2/5, 0)

3.
         b1   b2   b3
a1 |   3    3    5
a2 |   5    2    3

PSSP at (a1, b2).
Here there are multiple minimax points for B; they join (3, 3) to b2 = (3, 2). The separating line here is simply x1 = 3, i.e. g(a1, τ) = 3, so σ = a1. b2 is the only admissible strategy for B; the other minimax points are inadmissible.

1.2.11 Bayes Strategies in S-Games

Definition: A randomised strategy σ* for A is a Bayes strategy for A with respect to τ ∈ B̄ if g(σ*, τ) = sup_σ g(σ, τ), i.e. σ* maximises the expected pay-off to A if B plays τ.
Definition: If σ* is Bayes with respect to τ, then g(σ*, τ) is called the Bayes loss of τ to B.
Similarly, s* ∈ S is Bayes with respect to λ ∈ Ā if g(λ, s*) = inf_s g(λ, s).
Typically, for m = 2:

the point b here is Bayes with respect to λ; λ is represented by the parallel lines. Exercise: Show that σ, τ are maximin and minimax respectively ⟺ σ is Bayes with respect to τ and vice-versa.

Admissibility in S-Games
Intuitively, admissible strategies for B are those points of S in the south-west corner of our diagrams.
More precisely (assuming a finite game): Assume S is bounded and includes its boundary. For s ∈ S let Q(s) = {x ∈ Rᵐ : xᵢ ≤ sᵢ, i = 1, …, m} where x = (x1, …, xm), s = (s1, …, sm). N.B. this is different from Q_λ earlier.
Below are two diagrams with m = 2, in which Q(s) ∩ S ≠ {s}, i.e. s is inadmissible, and Q(s) ∩ S = {s}, i.e. s is admissible.

Let λ(S) = {s ∈ S : Q(s) ∩ S = {s}}.
λ(S) consists precisely of all admissible points for B, because s ∈ λ(S) ⟺ there is no point s′ ∈ S [s′ ≠ s] such that s′ᵢ ≤ sᵢ ∀i [with at least one strict inequality].

Lemma 6. If a Bayes strategy s ∈ S with respect to λ ∈ Ā is unique, then it is admissible.
Proof. Suppose s is inadmissible, and let λ = (λ1, …, λm). Then ∃ s* ≠ s such that s*ᵢ ≤ sᵢ for all i, with at least one strict inequality. Hence, by multiplying by λᵢ and summing, Σᵢ λᵢ s*ᵢ ≤ Σᵢ λᵢ sᵢ [we need ≤ rather than < in case one of the λᵢ's is 0], i.e. g(λ, s*) ≤ g(λ, s).
But s is Bayes with respect to λ, so g(λ, s) ≤ g(λ, s*). So g(λ, s) = g(λ, s*), which contradicts the uniqueness of s as the unique Bayes strategy for B with respect to λ.

Lemma 7. If s is Bayes with respect to λ = (λ1, …, λm) where λᵢ > 0 ∀i, then s is admissible.
Proof. Suppose s is inadmissible. Then ∃ s* ∈ S such that

g(aᵢ, s*) ≤ g(aᵢ, s) ∀i;   g(aⱼ, s*) < g(aⱼ, s) for some j

hence, as λⱼ > 0, g(λ, s*) = Σᵢ λᵢ g(aᵢ, s*) < Σᵢ λᵢ g(aᵢ, s) = g(λ, s), contradicting s as Bayes with respect to λ.

Lemma 8. If s is admissible, then it is Bayes with respect to some λ ∈ Ā.
Proof. If s is admissible then s ∈ λ(S), and S and Q(s)\{s} are disjoint convex sets, so ∃ a separating hyperplane between them. The coefficients of this hyperplane give the required λ.
1.2.12 Solving Finite Games

1. Cheat! There are many game solvers on the web that solve finite games. See the following site, which will only work with numerical values: http://banach.lse.ac.uk/form.html
[End lecture 15, 18/11/14]
2. Linear programming for finite games. Simplex method algorithm. We won't look at this approach.
3. (a) Look for a PSSP
(b) Delete any obvious inadmissible strategies. The dominating strategies are usually pure, but not necessarily:

Exempli gratia: Consider

         b1   b2   b3
a1 |   3    2    4
a2 |   4    5    1

(1/2)b2 + (1/2)b3 gives the point (3, 3), compared with b1 = (3, 4), i.e. b1 is inadmissible. Exercise: show σ = (2/3, 1/3) and τ = (0, 1/2, 1/2) are maximin and minimax [V = 3].

(c) Look for a pair of equaliser strategies:

- Use ideas of symmetry. In the following, the first and third rows are mirror images and the second row is a mirror image of itself [similarly for columns]:

1  2  6
3  4  3
6  2  1

We should try σ = (p, q, p), τ = (r, s, r), where 2p + q = 1 [similarly for B].
- If the pay-off matrix, G, is symmetric [G = Gᵀ] then an equaliser strategy for A, if such exists, will also be an equaliser strategy for B.
- If the game is symmetric [i.e. G = −Gᵀ] then an equaliser strategy for A, if such exists, will also be an ES for B. A symmetric game has value 0: if σ = (p1, …, pm) is maximin for A then τ = (p1, …, pm) must be minimax for B [by symmetry: both players have the same options], and V = σGσᵀ = (σGσᵀ)ᵀ = σGᵀσᵀ = −σGσᵀ = −V ⟹ V = 0. [Here we used the fact that V is a scalar, so is equal to its transpose.] This is a fair game.

4. Look at sub-games [i.e. we take a subset of A's pure strategies and a subset of B's]:
Sometimes we can find a solution to a sub-game that, when extended to the whole game, gives a solution to the whole game. [One can simply try putting 0's for the other pure strategies in an attempt to solve the whole game. Use lemma 3 to check.] In fact, it can be shown that ∃ at least one solution that can be obtained from extending a simple solution [i.e. a pair of ESs] to some sub-game. [Q_{λ_M} intersects S on a face or edge or point of S [in some linear subspace of S], and the pay-off on the separating hyperplane is the same at all points of this intersection.] Example:

3  5  2
4  1  3
0  8  2

The G⁻¹ method does not work here: there's no simple solution [all the row sums are > 0, but this is not true for all the column sums]. We need to look at sub-games. Try the following:

         b1   b2   b3
a1 |   3    5    2
a2 |   4    1    3

b1 is inadmissible, leaving

5  2
1  3

→ σ_S = (2/5)a1 + (3/5)a2, τ_S = (1/5)b2 + (4/5)b3. Try σ = (2/5, 3/5, 0), τ = (0, 1/5, 4/5).
So lemma 3 tells us we do not have a solution [g(a3, τ) = 16/5 > 13/5]. However:

         b1   b2   b3
a2 |   4    1    3
a3 |   0    8    2

gives σ = (0, 3/4, 1/4), τ = (0, 1/8, 7/8), which works [using lemma 3]. V = 11/4. [Exercise.]
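Both attempted extensions can be checked against lemma 3 mechanically; a sketch:

```python
from fractions import Fraction

G = [[3, 5, 2], [4, 1, 3], [0, 8, 2]]

def lemma3_check(G, sigma, tau, v):
    """Lemma 3: sigma, tau solve the game with value v iff
    g(a_i, tau) <= v for every row i and g(sigma, b_j) >= v for every
    column j."""
    m, n = len(G), len(G[0])
    rows = all(sum(G[i][j] * tau[j] for j in range(n)) <= v
               for i in range(m))
    cols = all(sum(sigma[i] * G[i][j] for i in range(m)) >= v
               for j in range(n))
    return rows and cols

F = Fraction
# Extension of the {a1, a2} x {b2, b3} sub-game solution: fails, since
# g(a3, tau) = 16/5 exceeds the candidate value 13/5.
first_ok = lemma3_check(G, [F(2, 5), F(3, 5), 0],
                        [0, F(1, 5), F(4, 5)], F(13, 5))
# Extension of the {a2, a3} x {b2, b3} sub-game solution: works, V = 11/4.
second_ok = lemma3_check(G, [0, F(3, 4), F(1, 4)],
                         [0, F(1, 8), F(7, 8)], F(11, 4))
```

`first_ok` comes out `False` (the a3 row breaks the candidate value 13/5) while `second_ok` is `True`.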

[End lecture 16, 19/11/14]

Question 5
Suppose (σ, τ) and (σ*, τ*) are both equilibrium pairs. Then:

g(σ*, τ) ≤ g(σ, τ) ≤ g(σ, τ*) ≤ g(σ*, τ*) ≤ g(σ*, τ)

[the first two inequalities because (σ, τ) is in equilibrium, the last two because (σ*, τ*) is]. Hence g(σ, τ) = g(σ*, τ) = g(σ, τ*) = g(σ*, τ*).
To show (σ, τ*) is in equilibrium we need to show that, for all σ′ ∈ Ā and τ′ ∈ B̄, we have:

g(σ′, τ*) ≤ g(σ, τ*) ≤ g(σ, τ′)   [the definition of an equilibrium pair]

To see this, note:

g(σ′, τ*) ≤ g(σ*, τ*) [since (σ*, τ*) is in equilibrium] = g(σ, τ*)

Also g(σ, τ*) = g(σ, τ) ≤ g(σ, τ′) [since (σ, τ) is in equilibrium]. The pair (σ*, τ) is handled similarly.

Question 6

         b1   b2   b3
a1 |   1    2    3
a2 |   2    3    2
a3 |   3    2    x

Part I
Two methods:
- The G⁻¹ method. What makes the row and column sums have the same sign? Answer: 1 ≤ x ≤ 3.
- Let σ = (p, q, r), where p, q, r ≥ 0, p + q + r = 1. Write down the equations for σ to be an ES and solve:

p = (3 − x)/4,   q = (x − 1)/4,   r = 1/2

We see that (p, q, r) is a randomised strategy because 1 ≤ x ≤ 3.

Value = 1·p + 2·q + 3·r = (x + 7)/4

Part II

G =  1  2  3
     2  3  2
     3  2  x

1. x > 3: b3 is inadmissible [compare it with b1]. Now delete a1 [compare it with a2]. We are left with:

         b1   b2
a2 |   2    3
a3 |   3    2

Let's guess: σ_S = (1/2, 1/2), τ_S = (1/2, 1/2). σ_S, τ_S are ESs for the sub-game, so are maximin and minimax for the sub-game. Extend this to the whole game. We get that the value = 5/2.

2. x < 1: b2 is inadmissible — compare it with (1/2)b1 + (1/2)b3. We are left with:

         b1   b3
a1 |   1    3
a2 |   2    2
a3 |   3    x


We see that a2 is maximin [and it happens to be an ES]. There are many separating lines, all going through (2, 2). B has minimax strategy qb1 + (1 − q)b3 in the sub-game for 1/2 ≤ q ≤ (2 − x)/(3 − x). [One way of seeing this is by asking: for which values of q does Lemma 3 tell us we have a solution to the sub-game? Another way of forming the inequality is to note that the separating line has the form y − 2 = m(x − 2), where the gradient m lies between those of the lines joining (2, 2) to (3, x) and (2, 2) to (1, 3), so x − 2 ≤ m ≤ −1; the corresponding direction is λ = (m/(m − 1), −1/(m − 1)).]
Extend this to the whole game. The value = 2.

Question 9
Let τ be an admissible ES for B. Proof by contradiction: suppose τ is not minimax. Then ∃ τ′ such that:

sup_σ g(σ, τ) > sup_σ g(σ, τ′)

τ is an ES, so suppose g(σ, τ) = c (= constant) ∀σ. So c > sup_σ g(σ, τ′). So c > g(σ, τ′) ∀σ.
So g(σ, τ) > g(σ, τ′) ∀σ, which contradicts the admissibility of τ.

Question 14

G =   0    a    b
     −a    0    c
     −b   −c    0

This has no PSSPs. If a = 0 we look at possibilities for b and c. If both are negative, then (a3, b3) is a PSSP and V = 0. If at least one of b and c is ≥ 0 then [exercise] show that ∃ a PSSP. So a ≠ 0. Similarly b, c ≠ 0.
To show a, b have opposite signs, look at these two cases:
- a, b are both > 0
- a, b are both < 0
In both cases we get PSSPs.
Show a, c have the same sign, again by contradiction.
Look for equaliser strategies. Note that the pay-off matrix is anti-symmetric (G = −Gᵀ), so the game is fair and V = 0. Write down the equations for an ES for A [same for B]. We find:

σ = τ = (c, −b, a) / (a − b + c)

to be a pair of ESs, and hence σ, τ are maximin and minimax respectively. Check that these are randomised strategies.


Question 13
Lynda only gave numerical solutions to this question. Here's my solution:
Part I

         b1   b2   b3   b4
a1 |   1    3    2    7
a2 |   8    2    4    1

The line x1 = x2 intersects the line segment b2b3 [b2 = (3, 2), b3 = (2, 4)], so the separating line lies along the segment b2b3. Writing it as y = mx + c:

2 = 3m + c and 4 = 2m + c   ⟹   m = −2, c = 8   ⟹   y + 2x = 8

Normalising: (2/3)x + (1/3)y = 8/3, so λ = (2/3, 1/3) and V = 8/3.
As for τ = (0, q, 1 − q, 0), we need the point on b2b3 with both coordinates equal to 8/3:

3q + 2(1 − q) = 8/3 and 2q + 4(1 − q) = 8/3   ⟹   q = 2/3   ⟹   τ = (0, 2/3, 1/3, 0)

Part II

G =  1  4  3
     3  2  4
     5  1  4

b3 is inadmissible — compare it with (1/2)b1 + (1/2)b2. Then a2 is inadmissible — compare it with (1/2)a1 + (1/2)a3 — giving the sub-game

1  4
5  1

which you can solve using the formula for 2 × 2 games with no PSSP, and extend this to the whole game:

σ = (4/7, 0, 3/7),   τ = (3/7, 4/7, 0),   V = 19/7
[End lecture 17, 20/11/14]


1.3 Non-Zero-Sum Games
1.3.1 Introduction

In a zero-sum game [i.e. a strictly competitive game]:

1. It's never advantageous to divulge your strategy to your opponent
2. There is no point in discussing a joint strategy with your opponent
3. If (σ1, τ1) and (σ2, τ2) are both equilibrium pairs then so are (σ1, τ2) and (σ2, τ1), and the pay-offs are equal
4. (σ, τ) is an equilibrium pair ⟺ σ, τ are maximin / minimax. V = g(σ, τ). [Proved in lemmas 3, 5]

In non-zero-sum games, one or more of these may not hold:


Let gA(σ, τ) = gain to A when A plays σ and B plays τ; gB(σ, τ) = gain to B when A plays σ and B plays τ.
A wants to maximise gA(σ, τ); B wants to maximise gB(σ, τ).
Non-zero-sum games are of two types:
1. Non-cooperative games: A and B cannot confer. They both know gA and gB and the table of pay-offs. They choose their strategies independently and simultaneously
2. Cooperative games: The 2 players can confer in advance to decide on a joint strategy to their possible mutual advantage. Their strategy choices are not independent. All pre-play agreements are binding. Bargaining
1.3.2

Non-Cooperative Games

Examples:
1. Prisoners Dilemma:
Two suspects, A and B, are held in dierent cells and they cannot communicate. The officer in charge
suspects that they are guilty of major crime but does not have enough evidence to convict them. Each
prisoner is given two choices:
DONT CONFESS
= a1 for A
= b1 for B

CONFESS
= a2 for A
= b2 for B

If both were to confess, (a2 , b2 ), they each would receive eight years in prison. If neither confesses, they
get one year each in prison because they can be convicted on some other minor crime. If one confesses
and the other does not, the one who confesses gets three months and the other gets ten years:
B
b1= dont confess b2 = confess
( 1, 1)
( 10, 0.25)
A a1 = dont confess
a2 = confess
( 0.25, 10)
( 8, 8)
Multiply by 4 and add 40:
a1
a2

b 1 b2
(36, 36) (0, 39)
(39, 0) (8, 8)

Whatever B chooses, A would prefer a2 . Similarly, whatever A chooses, B would prefer b2 . However if
A plays a2 and B plays b2 then both do rather badly. What appears to be best for them, as individuals,
is not good when they both choose to confess

2. The battle of the sexes:
   A is a man, B is a woman. Each has a choice: cinema [a1 and b1] or tennis [a2 and b2]. A prefers the cinema, B prefers tennis, but most importantly they want to be together. Consider:

              b1          b2
      a1   (2, 1)      (-1, -1)
      a2   (-1, -1)    (1, 2)

   Assume there's no communication about the decision. The man says to himself: I want a1, and she wants b2, but we both do badly like that. So, if I choose a2 [i.e. give in to her] and she chooses b2 then we both do pretty well. However B will argue in a similar way and give in to him by playing b1. This gives the combination (a2, b1), which is bad for both. Moral: communicate in advance!
3. Evolution:
   Dawkins [Selfish Gene], Maynard Smith. Worked models for animal behaviour. In a population of hawks and doves, each member behaves either like a hawk [H] or a dove [D]. When two members confront each other, we have the rules:
   - If 2 hawks meet they always fight until one gets badly injured [cost = 50]. On average, the loss to a hawk in an H-H confrontation is 25 [half the time they win, half the time they lose]
   - When a hawk and a dove meet, the hawk wins 50 because the dove runs away [dove gets 0]
   - When 2 doves meet no one gets hurt, but one [who gets 0] runs away, the other gets 50. However both lose 10 for wasting time in a staring match. In a D-D confrontation the average gain = (1/2)(50) - 10 = 15.

   We can model this as a two-person game: 2 players, A and B. Each can choose H or D. We get a pay-off table:

             H             D
      H   (-25, -25)    (50, 0)
      D   (0, 50)       (15, 15)

   Question: is there a mixture [i.e. a randomised strategy] of H and D that is stable? I.e. a mixture such that any small deviation from the stable state is soon brought back to it. We are looking for an equilibrium.
[End lecture 18, 25/11/14]
4. Cuban missile crisis in 1962:
   The Russians wanted to put nuclear weapons on Cuba, threatening the USA. The US had two strategies:
   - Naval blockade to stop the Russians sending further weapons
   - Air strike to wipe out existing weapons on Cuba, followed by an invasion of Cuba
   Russia also had two strategies:
   - Withdraw (W)
   - Not withdraw

                            Russians
                     W                     Not W
   USA  Blockade     Compromise for both   Soviet victory, more powerful weapons
        Airstrike    USA victory           Nuclear war

   In the end, Russia withdrew and persuaded Kennedy not to invade Cuba. See Game Theory and the Humanities by S.J. Brams [MIT Press].

Definitions:
1. The pair (α*, β*) is said to be in [Nash] equilibrium if

      gA(α*, β*) ≥ gA(α, β*) ∀α ∈ A_S
      gB(α*, β*) ≥ gB(α*, β) ∀β ∈ B_S

   The definition is due to Nash [1950] → Nobel prize. Neither A nor B has an incentive to change strategy. N.B. This definition agrees with our earlier definition for zero-sum games, why? We call (α*, β*) an equilibrium pair
2. Two equilibrium pairs (α1, β1) and (α2, β2) are called interchangeable if (α1, β2) and (α2, β1) are also equilibrium pairs
3. A non-cooperative game is called Nash solvable if every pair of equilibrium pairs is interchangeable. By convention, any game with 0 or 1 equilibrium pair is Nash solvable.
Examples of equilibrium pairs:
1. Chicken [the starred entries in the following are the equilibrium pairs [pure]]:
   We have a long straight road on a single track. To swerve or not to swerve? That is the question. It's of course more macho not to swerve, but one doesn't want to get killed!

                              B
                      swerve       don't swerve
   A  swerve          (2, 2)       (1, 4)*
      don't swerve    (4, 1)*      (-100, -100)

2. Prisoner's Dilemma:

                               B
                      b1 = don't confess   b2 = confess
   A  a1 = don't confess   (36, 36)        (0, 39)
      a2 = confess         (39, 0)         (8, 8)

3. Battle of the sexes:

              b1          b2
      a1   (2, 1)      (-1, -1)
      a2   (-1, -1)    (1, 2)

   Here there are two equilibrium pairs of pure strategies, but ∃ also a pair of randomised strategies in equilibrium:

      α* = (3/5)a1 + (2/5)a2
      β* = (2/5)b1 + (3/5)b2

   [N.B. These are NOT maximin and minimax respectively.]
   Exercise: Check from the definition that (α*, β*) is an equilibrium pair. We need to verify: gA(α*, β*) ≥ gA(ai, β*) ∀i; gB(α*, β*) ≥ gB(α*, bj) ∀j. [We only need to check pure strategies by lemma 2]
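The pure equilibrium pairs in tables like these can be found by exhaustive checking of each cell. A minimal sketch (the function name `pure_equilibria` and the list encoding of the tables are illustrative, not from the course):

```python
# Sketch: find pure-strategy equilibrium pairs of a bimatrix game by checking,
# for each cell (i, j), that neither player gains by deviating unilaterally.
def pure_equilibria(gA, gB):
    m, n = len(gA), len(gA[0])
    pairs = []
    for i in range(m):
        for j in range(n):
            best_for_A = all(gA[i][j] >= gA[k][j] for k in range(m))
            best_for_B = all(gB[i][j] >= gB[i][l] for l in range(n))
            if best_for_A and best_for_B:
                pairs.append((i, j))
    return pairs

# Rescaled Prisoner's Dilemma: the only pure equilibrium pair is (confess, confess).
gA_pd = [[36, 0], [39, 8]]
gB_pd = [[36, 39], [0, 8]]
print(pure_equilibria(gA_pd, gB_pd))  # [(1, 1)]

# Chicken: the two starred off-diagonal cells.
gA_ch = [[2, 1], [4, -100]]
gB_ch = [[2, 4], [1, -100]]
print(pure_equilibria(gA_ch, gB_ch))  # [(0, 1), (1, 0)]
```

Note this only finds pure equilibria; the randomised pair for the Battle of the Sexes would not appear.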


There is a situation where it is easy to find equilibrium pairs:

Lemma 9. In a non-zero-sum game played non-cooperatively [or cooperatively], if α* is an equaliser strategy for A using B's pay-offs, and β* is an equaliser strategy for B using A's pay-offs, then (α*, β*) is an equilibrium pair.

Proof. We have gA(ai, β*) = CA ∀i and gB(α*, bj) = CB ∀j, where CA and CB are constants. So

   gA(α*, β*) = CA = gA(ai, β*) ≥ gA(α, β*) ∀α
   gB(α*, β*) = CB = gB(α*, bj) ≥ gB(α*, β) ∀β
Examples:
1. Battle of the Sexes

              b1          b2
      a1   (2, 1)      (-1, -1)
      a2   (-1, -1)    (1, 2)

   A's pay-offs are

       2  -1
      -1   1

   and in this game B has equaliser strategy (2/5, 3/5). B's pay-offs are

       1  -1
      -1   2

   and in this game A has equaliser strategy (3/5, 2/5). N.B. Not all equilibrium pairs occur this way.
2. Evolutionary Model

             H             D
      H   (-25, -25)    (50, 0)
      D   (0, 50)       (15, 15)

   A's pay-offs are

      -25  50
        0  15

   which gives β* = (35/60, 25/60) = (7/12, 5/12). Similarly α* = (7/12, 5/12). So a population with 7/12 hawks and 5/12 doves is stable or in equilibrium. This is called an evolutionarily stable strategy.
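The 2×2 equaliser computation can be sketched in closed form (the helper name `equaliser` is illustrative; it solves the same equal-expected-gain equation as above):

```python
from fractions import Fraction

# Sketch: equalising mixture over the columns of a 2x2 matrix M of A's pay-offs,
# i.e. the q = P(column 1) making A's expected gain the same for either row.
def equaliser(M):
    (a, b), (c, d) = M
    # a*q + b*(1-q) = c*q + d*(1-q)  =>  q = (d - b)/(a - b - c + d)
    return Fraction(d - b, a - b - c + d)

print(equaliser([[-25, 50], [0, 15]]))  # 7/12, the stable proportion of hawks
print(equaliser([[2, -1], [-1, 1]]))    # 2/5, B's equaliser in the Battle of the Sexes
```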

Definition: An equilibrium pair (α1, β1) is said to be equivalent to an equilibrium pair (α2, β2) if

   gA(α1, β1) = gA(α2, β2)
   gB(α1, β1) = gB(α2, β2)

[In a zero-sum game any 2 equilibrium pairs are both equivalent and interchangeable.]
Examples: Look at Chicken and the Battle of the Sexes. Equivalent? Interchangeable? In both Chicken and the Battle of the Sexes the equilibrium pairs of pure strategies are neither equivalent nor interchangeable.
Geometrical Interpretation of Non-Zero-Sum, Non-Cooperative Games

We plot the points (x, y) where x = pay-off to A and y = pay-off to B. Suppose α = p a1 + (1-p) a2 and β = q b1 + (1-q) b2 in a game where each player has two pure strategies.
[End lecture 19, 26/11/14]


These represent randomised strategies when 0 p, q 1. We have:
x = gA (, ) = pqgA (a1 , b1 ) + p(1

q)gA (a1 , b2 ) + q(1

p)gA (a2 , b1 ) + (1

p)(1

q)gA (a2 , b2 )

with a similar expression for y.


Question: which points (x, y) 2 R2 correspond to randomised strategies (, )? Alternatively, for which points
(x, y) 2 R2 do there exist p, q 2 [0, 1] with 0 p, q 1 that give rise to (x, y)?

The set of such points corresponding to randomised strategies for A and B is called the pay-o set, S.
Firstly, any point of S must be in the convex hull of the 4 points (ai ,bj) [i,j 2{1,2}] because 0  pq  1,
0p(1 q)1 &c. and pq+p(1 q)+q(1 p)+(1 p)(1 q)=1.
HowevernoteverypointintheconvexhullisnecessarilyanelementofS becausethecoefficientspq,p(1 q),
q(1 p)and(1 p)(1 q)areconstrained[i.e. theyhaveaparticularstructure];thisarisesfromthefactthat
the players choose their strategies independently.
Example: Battle of the Sexes:

              b1          b2
      a1   (2, 1)      (-1, -1)
      a2   (-1, -1)    (1, 2)

Then (x, y) ∈ S ⟺ for some p, q ∈ [0, 1] we have:

   (x, y) = pq(2, 1) + p(1-q)(-1, -1) + q(1-p)(-1, -1) + (1-p)(1-q)(1, 2)
          = q[p(2, 1) + (1-p)(-1, -1)] + (1-q)[p(-1, -1) + (1-p)(1, 2)]

So (x, y) lies on the line segment joining the two points L = p(2, 1) + (1-p)(-1, -1) and M = p(-1, -1) + (1-p)(1, 2). But L is a point on the line segment joining (2, 1) and (-1, -1), and M is a point on the line segment joining (-1, -1) and (1, 2). The line segment LM is in S. In the below diagrams, S is the shaded region of the parabolic envelope.

Here, we let p vary in [0, 1]. S consists of the 2 straight line segments [(-1, -1) to (2, 1) and (-1, -1) to (1, 2)], the parabolic arc and the region interior to these. Note that there are points in the convex hull that are not in S.
Note the following:
- We would have produced the same S if we had used p instead of q in the expressions for (x, y)
- All points on the line segment (-1, -1) to (2, 1) are in S [take q = 1] and all points on the line segment joining (-1, -1) to (1, 2) are in S [take q = 0]
- The line segment joining (1, 2) and (2, 1) is not in S. Note that (-1, -1) and (2, 1) are in the same row of the table of pay-offs, as are (-1, -1) and (1, 2), whereas (1, 2) and (2, 1) have no row or column in common

Exercise: Show that the equation of the parabolic arc is 5(x - y + 1)² = 4(3x - 2y + 1).
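The exercise can be sanity-checked numerically: writing x and y out in terms of p and q gives x - y + 1 = p + q and 3x - 2y + 1 = 5pq, so points of S satisfy 5(x - y + 1)² ≥ 4(3x - 2y + 1), with equality on the arc where p = q. A sketch (function name illustrative):

```python
import random

# Numerical check (sketch): every sampled pay-off point (x, y) of the
# Battle of the Sexes satisfies 5(x - y + 1)^2 >= 4(3x - 2y + 1),
# with equality on the parabolic arc (p = q).
def payoff_point(p, q):
    x = 2*p*q - 1*p*(1-q) - 1*q*(1-p) + 1*(1-p)*(1-q)
    y = 1*p*q - 1*p*(1-q) - 1*q*(1-p) + 2*(1-p)*(1-q)
    return x, y

random.seed(1)
for _ in range(10_000):
    x, y = payoff_point(random.random(), random.random())
    assert 5*(x - y + 1)**2 >= 4*(3*x - 2*y + 1) - 1e-9

# On the arc p = q the two sides agree:
x, y = payoff_point(0.3, 0.3)
print(abs(5*(x - y + 1)**2 - 4*(3*x - 2*y + 1)) < 1e-9)  # True
```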

In general, S depends on the orientation and location of the four points representing the pure strategies.

Example [2 × 2 game]: Label the four pay-off points P11, P12, P21, P22, where Pij represents (ai, bj). Here, the line segments joining P11 and P12, P21 and P22, P11 and P21, and P12 and P22 all lie in S [set p = 1, p = 0, q = 1, q = 0 respectively]. However the line segments joining P11 and P22 or P12 and P21 may not be in S.
Sometimes S comprises the whole of the convex hull of the four points representing pairs of pure strategies. E.g. if the four points are the vertices of a convex quadrilateral, labelled in the order P11, P12, P22, P21, we have diagrams (a) or (b). If M is in the convex hull, we can find p, q ∈ [0, 1] such that α = (p, 1-p), β = (q, 1-q) are represented by the point M.
In general, this will not be so and various different diagrams can result. In (d) we have a curve on the lower part and two straight lines on the upper part. The whole of the shaded area = S. The lines drawn are in equal increments. [In (c) there are two straight lines on the lower part, and a curve on the upper.] Other possibilities can occur.

Theorem 2 (Nash 1950). Every finite 2-person game has at least one equilibrium pair of strategies.

Preamble: Let α = (α1, ..., αm), β = (β1, ..., βn) be any randomised strategies for A, B respectively. Define

   ri = max {gA(ai, β) - gA(α, β), 0}   i = 1, ..., m
   sj = max {gB(α, bj) - gB(α, β), 0}   j = 1, ..., n

Example:

              b1       b2
      a1   (4, 0)   (0, 2)
      a2   (3, 1)   (4, 0)

Let α = (1/3, 2/3), β = (1/4, 3/4). Then gA(α, β) = 17/6, so r1 = max {gA(a1, β) - 17/6, 0} = 0 and r2 = max {gA(a2, β) - 17/6, 0} = 11/12. [Exercise.]
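The quantities in the preamble can be checked exactly with rational arithmetic (variable names illustrative):

```python
from fractions import Fraction as F

# Sketch of the r_i computation for the example above:
# r_i = max{ g_A(a_i, beta) - g_A(alpha, beta), 0 }.
gA = [[4, 0], [3, 4]]               # A's pay-offs
alpha = [F(1, 3), F(2, 3)]
beta = [F(1, 4), F(3, 4)]

g_rows = [sum(b * v for b, v in zip(beta, row)) for row in gA]  # g_A(a_i, beta)
g = sum(a, b = 0) if False else sum(a * r for a, r in zip(alpha, g_rows))  # g_A(alpha, beta)
r = [max(gi - g, 0) for gi in g_rows]
print(g, r)  # g = 17/6, r = [0, 11/12]
```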

Proof (Outline).
Let αi' = (αi + ri)/(1 + Σk rk) and βj' = (βj + sj)/(1 + Σk sk), and let α' = (α1', ..., αm') and β' = (β1', ..., βn'). Check that these are randomised strategies for A and B [Exercise].

Define f(α, β) = (α', β'). We now use Brouwer's Fixed Point Theorem [without proof]: a continuous function from a closed bounded convex set C into C has a fixed point. f satisfies the requirements of this theorem: f is a continuous map from a closed bounded convex subset of R^(m+n) to itself. So f has a fixed point, i.e. ∃(α, β) such that f(α, β) = (α, β). We will show that the pair (α, β) is in equilibrium ⟺ f(α, β) = (α, β).

Firstly ⟹: If (α, β) is an equilibrium pair then, by definition, gA(ai, β) ≤ gA(α, β) ∀i, so ri = 0 ∀i and hence αi' = αi ∀i and so α' = α. Similarly β' = β, so (α, β) is a fixed point of f.

Secondly ⟸: By contradiction. Suppose (α, β) is not an equilibrium pair. Then either [or both]:

   ∃α⁰ such that gA(α⁰, β) > gA(α, β)
   ∃β⁰ such that gB(α, β⁰) > gB(α, β)

We'll just consider the first of these situations as the second is very similar. Let α⁰ = (α1⁰, ..., αm⁰). We know that gA(α⁰, β) = Σi αi⁰ gA(ai, β), hence ∃ at least one i such that gA(ai, β) > gA(α, β). Otherwise, there would be a contradiction with the first case: gA(α⁰, β) ≤ Σi αi⁰ gA(α, β) = gA(α, β) Σi αi⁰ = gA(α, β).

For this i, we have ri > 0 and, as a consequence, Σk rk > 0. However, gA(α, β) = Σi αi gA(ai, β) = Σ_{k: αk>0} αk gA(ak, β), and hence gA(ak, β) ≤ gA(α, β) for some k with αk > 0, otherwise gA(α, β) > gA(α, β), clearly a contradiction [uses Σ_{k: αk>0} αk = 1].
For this particular k, we have rk = 0 and αk' = αk/(1 + Σi ri) < αk since αk > 0 and Σi ri > 0. Hence αk' ≠ αk, so α' ≠ α and (α, β) is not a fixed point of f.

Solving Non-Cooperative Non-Zero-Sum Games

Difficult! We've met Nash solvability already.
Definition: A non-cooperative game is called Nash solvable if every pair of equilibrium pairs is interchangeable.
Convention: If a game only has one equilibrium pair, it is called Nash solvable. Exempli gratia: Prisoner's Dilemma is Nash solvable [as it only has one equilibrium pair]. The Battle of the Sexes is not.


Definition: A pair of randomised strategies (α, β) is called jointly inadmissible if ∃(α⁰, β⁰) such that

   gA(α⁰, β⁰) ≥ gA(α, β)
   gB(α⁰, β⁰) ≥ gB(α, β)

with at least one strict >. Otherwise (α, β) is jointly admissible.


Definition: A non-cooperative game is said to have a solution in the strict sense if:
1. ∃ an equilibrium pair amongst the jointly admissible pairs
2. All jointly admissible equilibrium pairs are both equivalent and interchangeable [if ∃ only one jointly admissible equilibrium pair, by convention it is solvable in the strict sense]
Very few interesting games satisfy these stringent requirements. E.g. the following does have a solution in the strict sense:

        B
   (1, 1)  (0, 0)
   (0, 0)  (2, 2)

(1, 1) is in equilibrium, but is not jointly admissible. (2, 2) is a solution in the strict sense.
Prisoner's Dilemma has no jointly admissible equilibrium pairs, so it is not solvable in the strict sense.
Repetition of Non-Cooperative Non-Zero-Sum Games [or Supergames]
Richard Dawkins' The Selfish Gene; Poundstone's Prisoner's Dilemma. Consider:

                                 B [you]
                             cooperate   defect
   A [me]  cooperate [nice]   (3, 3)     (0, 5)
           defect [nasty]     (5, 0)     (1, 1)

If you only play one game, it's best to defect [though both then do badly]. But how should we play if we repeat the game a large number of times? You need to think about your strategy. Generally, both players do reasonably well if both cooperate, but there is always the temptation to defect to gain more. Would your opponent forgive you? I somehow doubt it.
In 1981 Robert Axelrod [American political scientist] conducted an experiment. He held a competition and asked people to submit strategies [as computer programs] for playing repeated Prisoner's Dilemma games; he also added a program that played randomly. The participants weren't told how many rounds, but they played each other for about 200 games!

[End lecture 21, 02/12/14]


Strategies:
1. Tit for Tat: Cooperate on first move then copy previous moves of opponent. Never 1st to defect
2. Naive Prober: Same as Tit for Tat, but occasionally play defect, then copy your opponent's last move
3. Grudger: Always play defect once the opponent has defected
There are two kinds of strategy: a nice strategy is one where you are never the first to defect; a nasty strategy is the opposite. Axelrod found that all nice strategies did better than all nasty ones.

Second experiment: 62 submitted strategies. All were told the results of the first experiment. Many in the second experiment were nice strategies because these had worked in round one. Others submitted nasty strategies because they wanted to take advantage of a lot of softies. Tit for Tat won in both experiments. For one game, defect is the only sensible strategy. If the number of moves is known in advance, then each player would play defect on the last move as there's no chance of retaliation. So it is best to play defect on round n - 1 [n total moves] as the opponent will play defect on round n anyway. Backwards induction &c. The game becomes trivial.
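A toy re-run of such a tournament can be sketched as follows; the strategy names follow the notes, while the function names and the 200-round pairing are illustrative:

```python
# Toy round robin (sketch) for the repeated game above.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_hist, opp_hist):
    return 'C' if not opp_hist else opp_hist[-1]

def grudger(my_hist, opp_hist):
    return 'D' if 'D' in opp_hist else 'C'

def always_defect(my_hist, opp_hist):
    return 'D'

def play(s1, s2, rounds=200):
    h1, h2, t1, t2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2); t1 += p1; t2 += p2
    return t1, t2

print(play(tit_for_tat, grudger))        # (600, 600): two nice strategies cooperate
print(play(tit_for_tat, always_defect))  # (199, 204): the defector gains only once
```

Two nice strategies settle into mutual cooperation, while a nasty strategy gains only on the first move before Tit for Tat retaliates.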
1.3.3 Cooperative Games
Most games are like this. All agreements are binding and players decide in advance, after discussion, what each will play. Assume A_S, B_S are finite. We have a pay-off table with entries:

   {(gA(ai, bj), gB(ai, bj))} ∈ R²   i = 1, ..., m;  j = 1, ..., n

Definition: A joint strategy (α, β) is a joint probability distribution over A_S × B_S.

I.e. we attach probability pij to the point representing (ai, bj). Because the players cooperate, their strategy choices are not independent and so pij ≠ pi qj necessarily. 0 ≤ pij ≤ 1, Σ_{i,j} pij = 1. The pay-off set, the set of points attainable by joint strategies, is now the whole convex hull of the points representing the (ai, bj). Let us call it S.
Example with the Battle of the Sexes. S is the pay-off set [the isosceles triangle and its interior].

Definition: A point (x, y) ∈ S is called inadmissible if ∃ a distinct point (x*, y*) ∈ S such that x* ≥ x and y* ≥ y. Otherwise (x, y) is called admissible.
In the Battle of the Sexes, the admissible points are shaded above.

Definition: The set of admissible points is called the pareto optimal set.
S is a convex polygon. Typically:

In this diagram, LM is the pareto optimal set.


B would prefer L, but A would prefer M. However we can find:

   sA = sup_α inf_β gA(α, β)

I.e. look at A's pay-offs alone and calculate the value of the game to A. This is the maximum A can guarantee getting, ignoring B's pay-offs. A won't settle for less than this. sA is A's security level. Similarly:

   sB = sup_β inf_α gB(α, β)

is B's security level. Remember that B is trying to make gB big here.

Definition: The points of the pareto optimal set with x ≥ sA and y ≥ sB are called the negotiation set, N, sometimes called the bargaining set.

Referring to the above diagram, N = the part of the pareto optimal set from L' to M'.

Examples:
1. Battle of the Sexes:
   A's pay-offs:

       2  -1
      -1   1

   Solve as a zero-sum game → sA = 1/5. Be careful with B, as the pay-off table has gains to B: we may need to transpose B's pay-offs → sB = 1/5. N = whole of the line segment from (1, 2) to (2, 1)
2. Prisoner's Dilemma:

      (36, 36)  (0, 39)
      (39, 0)   (8, 8)

   A →  36   0      B →  36  39
        39  [8]           0  [8]

   sA = 8, sB = 8 [PSSPs bracketed]. Again, in general, be careful as the numbers in the pay-off table are gains to B.
   The set N is from L' to M'. A cannot reasonably expect to get more than the x-coordinate of M' = 38 1/3 and B cannot reasonably expect to get more than the y-coordinate of L'. Exercise: What is this value?
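The security-level computations above can be sketched as the zero-sum value of a 2×2 matrix of a player's own pay-offs (function name illustrative):

```python
from fractions import Fraction as F

# Security level (sketch): zero-sum value of a 2x2 matrix of a player's own
# pay-offs -- a saddle point if one exists, else the mixed-strategy formula.
def security_level(M):
    maximin = max(min(row) for row in M)
    minimax = min(max(M[i][j] for i in range(2)) for j in range(2))
    if maximin == minimax:
        return F(maximin)        # saddle point [PSSPs]
    (a, b), (c, d) = M
    return F(a * d - b * c, a + d - b - c)

print(security_level([[2, -1], [-1, 1]]))   # 1/5: s_A in the Battle of the Sexes
print(security_level([[36, 0], [39, 8]]))   # 8: s_A in the Prisoner's Dilemma
```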

Question: What is a fair solution? I.e. which points on L'M' should represent a solution?

The Nash Arbitration Procedure [Also Called the Shapley Solution]
Choose s ∈ S to maximise (s1 - sA)(s2 - sB) [the excess over sA multiplied by the excess over sB] over all points in N, where s = (s1, s2), with respect to s1, s2. Why choose this function? The advantages are:
- Symmetric in s1 - sA and s2 - sB
- Scale invariant
[End lecture 22, 03/12/14]

Question 10
Part I
α = β = (1/4, 1/4, 1/4, 1/4) are equaliser strategies (ES) for the 2 players, so maximin and minimax.

Part II

   1 2 3
   6 1 2
   2 3 2

There are two approaches:
1. Write down the equations which we need for a pair of ESs
2. The G method

Both lead to α = (1/3, 1/6, 1/2), β = (1/6, 1/3, 1/2), v = 7/3.

Question 15

              b1       b2
      a1   (x, z)   (y, z)
      a2   (x, w)   (y, w)

All pairs of pure strategies are in equilibrium. We need to show that every pair of randomised strategies is in equilibrium.
Let α* = p a1 + (1-p) a2, β* = q b1 + (1-q) b2 be any pair of strategies, 0 ≤ p ≤ 1, 0 ≤ q ≤ 1. We must show that:

   gA(α*, β*) ≥ gA(α, β*) ∀α
   gB(α*, β*) ≥ gB(α*, β) ∀β

For A (B is similar):

   gA(α*, β*) = pq gA(a1, b1) + p(1-q) gA(a1, b2) + q(1-p) gA(a2, b1) + (1-p)(1-q) gA(a2, b2)
              ≥ pq gA(α, b1) + p(1-q) gA(α, b2) + &c. ∀α
                 [since each pure pair is in equilibrium, gA(ai, bj) ≥ gA(α, bj) ∀α]
              = q gA(α, b1) + (1-q) gA(α, b2)
              = gA(α, β*) ∀α

Question 16
Let (α1, β1) and (α2, β2) be two equilibrium pairs. For 0 ≤ λ ≤ 1, let α0 = λα1 + (1-λ)α2 and β0 = λβ1 + (1-λ)β2. We need to show that (α0, β0) is in equilibrium. For A (B is similar):

   gA(α0, β0) = λ² gA(α1, β1) + λ(1-λ)[gA(α1, β2) + gA(α2, β1)] + (1-λ)² gA(α2, β2)

Since the game is Nash solvable, we know that (α1, β2) and (α2, β1) are also equilibrium pairs, so for any α ∈ A_S, we have:

   gA(α0, β0) ≥ λ² gA(α, β1) + λ(1-λ)[gA(α, β2) + gA(α, β1)] + (1-λ)² gA(α, β2)
             = λ gA(α, β1) + (1-λ) gA(α, β2)
             = gA(α, β0) ∀α

A similar result holds for gB and so (α0, β0) is also an equilibrium pair. Nash equilibria form a convex set.

Question 17

              b1       b2
      a1   (1, 4)   (9, 0)
      a2   (7, 1)   (3, 3)

gA(a1, b1) = 1 < 7 = gA(a2, b1), so (a1, b1) is not in equilibrium. Check the other three pairs of pure strategies are not in equilibrium. Look for ESs for each player using the other's pay-offs.
For A:

   1 9
   7 3   [these are gains to A] → β* = (1/2, 1/2) is an ES for B using A's pay-offs.

For B:

   4 0
   1 3   [these are gains to B] → α* = (1/3, 2/3) is an ES for A using B's pay-offs.

[Find gA(ai, β*) = 5 ∀i, gB(α*, bj) = 2 ∀j, hence (α*, β*) is in equilibrium].

Question 18

   (36, 36)  (0, 39)
   (39, 0)   (8, 8)

Suppose that α = (p, 1-p), β = (q, 1-q) are in equilibrium.

   gA(α, β) = 36pq + 39q(1-p) + 8(1-p)(1-q)
            = 5pq + 31q - 8p + 8

We know gA(α, β) ≥ gA(a1, β) and gA(α, β) ≥ gA(a2, β) [we only need compare with pure strategies].
Using the second of these: 5pq + 31q - 8p + 8 ≥ 31q + 8, so p(5q - 8) ≥ 0, which [since 5q - 8 < 0] ⟹ p = 0. If p = 0, gB(α, β) = 8(1-q) ≥ gB(α, b2) = 8 ⟹ q = 0.

[End lecture 23, 04/12/14]


Example [Shapley solution]:

                  B
              b1       b2
   A  a1   (2, 1)   (3, 0)
      a2   (0, 4)   (2, 5)

The dot on the pareto optimal set, i.e. the line segment between (2, 5) and (3, 0), represents the Shapley solution.

   A:  [2]  3      B:  [1]  0
        0   2           4   5   [gains to B]

   A: PSSP = a1, sA = 2.  B: PSSP = b1, sB = 1.

The Shapley solution is (s1, s2) where (s1 - 2)(s2 - 1) is maximised over N, i.e. along the relevant part of the line 5s1 + s2 = 15. We find (s1 - 2)(s2 - 1) = (s1 - 2)(14 - 5s1) = -5s1² + 24s1 - 28, a quadratic in s1 [∩-shaped]. Differentiation tells us that the maximum occurs when s1 = 12/5, s2 = 3. Check that this is in N. N.B. It is a global maximum.
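The maximisation can be sketched with exact arithmetic via the vertex of the quadratic (variable names illustrative):

```python
from fractions import Fraction as F

# Sketch: maximise (s1 - sA)(s2 - sB) along the pareto line s2 = 15 - 5*s1
# for the example above, via the vertex of the resulting quadratic in s1.
sA, sB = 2, 1
# (s1 - 2)(14 - 5 s1) = -5 s1^2 + 24 s1 - 28, vertex at s1 = -b/(2a)
a, b = -5, 24
s1 = F(-b, 2 * a)
s2 = 15 - 5 * s1
print(s1, s2)  # 12/5 3
```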
Exercises [Cooperative Games]: Give examples where:
1. Not all equilibrium pairs are jointly admissible
2. Not all jointly admissible pairs are in equilibrium
3. It is possible for α to be inadmissible [ignoring B's pay-offs] and vice-versa for β, yet (α, β) is in equilibrium

2 Utility
2.1 Introduction

Consider the choice between: decision d1 = accept £100; decision d2 = get £200 with probability 1/2 and £0 with probability 1/2. Most people prefer d1 even though both have the same expected pay-off.

Now consider the St. Petersburg Paradox [Bernoulli]. Toss a fair coin until you get tails. You get £2^n if the first tail occurs at toss n.

   E(pay-off) = Σ_{n=1}^∞ 2^n P(first tail is on toss n) = Σ_{n=1}^∞ 2^n (1/2^n) = Σ_{n=1}^∞ 1 → ∞

Although the expected pay-off is infinite, people won't pay much to play it. So the expected pay-off is not a good measure to compare decisions about money.
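The divergence is easy to see numerically: each extra admissible toss adds exactly 1 to the truncated expectation (function name illustrative):

```python
# Truncated St. Petersburg expectation (sketch): each term 2^n * (1/2^n) = 1,
# so allowing more tosses makes the expected pay-off grow without bound.
def truncated_expectation(max_tosses):
    return sum(2**n * (1 / 2**n) for n in range(1, max_tosses + 1))

for n in (10, 20, 40):
    print(n, truncated_expectation(n))  # grows linearly: 10.0, 20.0, 40.0
```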
In addition, the results of making decisions may not even be monetary; they may be prestige, goodwill &c., which are hard to quantify. We need a scale for value. We also may have more than one objective. We need to make a comprehensive list of possible decisions and, for each decision, a list of consequences.
Let R = the set of all possible consequences of a set of decisions. The unknown events need to be taken into account. The elements of R may not be numerical, and may involve unknown events. We have R = {C}, where C is a consequence.
Example: You are invited to your friend's house one evening at 8p.m., but you do not know if they will provide dinner. Two decisions: d1 = eat beforehand; d2 = do not eat beforehand. Your friends will either provide dinner [event θ1] or not [event θ2]. The true value of θ is unknown. There are four consequences: C1 = (d1, θ1), C2 = (d1, θ2), C3 = (d2, θ1), C4 = (d2, θ2). We have a set of consequences {C1, ..., C4} and P, a probability distribution over these. An element of P is a set of probabilities p1, p2, p3, p4 [pi ≥ 0, Σi pi = 1].

The notation (p1C1, p2C2, p3C3, p4C4) denotes a gamble or a lottery in which Ci occurs with probability pi. In our first example, R = {£0, £100, £200} ↔ {C0, C100, C200}. We were asked to compare the two lotteries (1 C100) and (1/2 C0, 1/2 C200).

2.2 The Lottery Axioms

By comparing lotteries we will develop a common scale for utility [i.e. value]. We can think of decision-making as comparing lotteries. R = {C1, ..., Cm}, Ci = i-th consequence. A lottery is L = (p1C1, ..., pmCm) where 0 ≤ pi ≤ 1, Σi pi = 1. We can form a compound lottery such as L = (pL1, (1-p)L2) where L1 and L2 are lotteries, which means you receive the result of L1 with probability p and the result of L2 with probability 1-p.
Definition: We write L1 < L2 if the decision-maker prefers L2 to L1; L1 > L2 [L1 preferred to L2]; L1 ~ L2 [indifferent between L1 and L2]; L1 ≲ L2 [i.e. L1 < L2 or L1 ~ L2].
Example at the friend's house:

         θ1                  θ2
   d1    C1 [full / costs]   C2 [costs]
   d2    C3 [best]           C4 [starve]

Each Ci is a degenerate lottery. We can order these, e.g.: C4 < C1 < C2 < C3. We will now set out some axioms for how a rational decision-maker should compare lotteries.


Stuvia.co.uk - The Marketplace for Revision Notes & Study Guides

Axiom 1. For any two lotteries L1 and L2, either L1 < L2, L1 > L2, or L1 ~ L2.

I.e. the decision-maker must express their feelings about L1 and L2.
Axiom 2 (Transitivity). For any three lotteries L1, L2, L3 such that L1 ≲ L2, L2 ≲ L3 we must have L1 ≲ L3.
There are some criticisms of axiom 2:
- Many people do not behave transitively in their preferences, especially when the consequences are multidimensional
- Three non-transitive dice problem: Red 1, 4, 4, 4, 4, 4; Blue 3, 3, 3, 3, 3, 6; Green 2, 2, 2, 5, 5, 5

     P(R > B) = 25/36 ≈ 0.694    P(B > G) = 21/36 ≈ 0.583    P(G > R) = 21/36 ≈ 0.583

  All exceed 1/2, so the relation "beats" is non-transitive
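The dice probabilities can be verified exactly by enumerating all 36 pairs of faces (function name illustrative):

```python
from fractions import Fraction
from itertools import product

# Exact win probabilities for the three non-transitive dice above.
RED   = [1, 4, 4, 4, 4, 4]
BLUE  = [3, 3, 3, 3, 3, 6]
GREEN = [2, 2, 2, 5, 5, 5]

def p_beats(d1, d2):
    wins = sum(1 for a, b in product(d1, d2) if a > b)
    return Fraction(wins, len(d1) * len(d2))

for pair in [(RED, BLUE), (BLUE, GREEN), (GREEN, RED)]:
    assert p_beats(*pair) > Fraction(1, 2)   # each die beats the next in the cycle
print(p_beats(RED, BLUE), p_beats(BLUE, GREEN), p_beats(GREEN, RED))  # 25/36 7/12 7/12
```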
[End lecture 24, 09/12/14]
Axiom 3 (Substitutability of Lotteries).
1. If L = (p1L1, ..., pnLn) and L1 ~ L1', then L ~ (p1L1', p2L2, ..., pnLn)
2. If L = (p1L1, ..., pnLn) and Li ~ (qi1C1, ..., qimCm) [the qs are probabilities] then we should be indifferent: L ~ (Σi pi qi1 C1, ..., Σi pi qim Cm) [like the law of total probability]

Criticism: Some people like a multi-stage lottery [those who like gambling]. Axiom 3 ignores this fact.
Here's an example from the 2013/14 course:
Ci = £i, L = (1/5 L1, 4/5 L2) where L1 = (1/2 C2000, 1/2 C0), L2 = C200. Then we should believe that:

   L ~ (1/10 C2000, 1/10 C0, 4/5 C200)

Axiom 4 (Principle of Irrelevant Alternatives). If L1, L2, and L are 3 lotteries then for any λ ∈ (0, 1) we should have L1 < L2 ⟺ (λL1, (1-λ)L) < (λL2, (1-λ)L).
I.e. introducing a new lottery, L, should not change our preferences between L1 and L2.
Axiom 5 (Continuity). For any consequences C1 < C < C2, ∃λ ∈ (0, 1) such that C ~ ((1-λ)C1, λC2). However finding this λ may be difficult.

Here's an example from the 2013/14 course:

A doctor is to treat a patient. The patient will either die or survive. There are 3 treatments available: T1, T2, T. T always cures but is more expensive than T2. T1 causes half the patients to die; the cost is the same as T. T2 is cheap, and always cures. Clearly T1 < T < T2. But the doctor cannot find λ ∈ (0, 1) such that T ~ ((1-λ)T1, λT2) because the survival of the patient is more important than any cost.
Exercise: C0 = get nothing, C1000 = get £1000, C2000 = get £2000. Choose λ so you are indifferent between C1000 and L = ((1-λ)C0, λC2000).


2.3 The Existence of a Unique Utility Function

First notice that axioms 1 and 2 together enable us to order the consequences C1, ..., Cm: C̲ ≲ ... ≲ C̄, where C̲ is a worst consequence and C̄ a best. To avoid trivialities we assume C̲ < C̄.
Lemma 10. For any consequence C such that C̲ < C < C̄ there exists a unique λ ∈ (0, 1) such that C ~ ((1-λ)C̲, λC̄).

Proof. Axiom 5 tells us that ∃ at least one such λ. Suppose [for a contradiction] that ∃λ1 < λ2 such that:

   C ~ ((1-λ1)C̲, λ1C̄)  and  C ~ ((1-λ2)C̲, λ2C̄)

Axiom 2 now tells us:

   ((1-λ1)C̲, λ1C̄) ~ ((1-λ2)C̲, λ2C̄)

Let L be the lottery ((1-λ)C̲, λC̄) where λ = λ1/(1 + λ1 - λ2). Check λ ∈ (0, 1). Let μ = λ2 - λ1. Check μ ∈ (0, 1). Notice that 1 - μ = 1 + λ1 - λ2. Consider:

   (μC̄, (1-μ)L) = (μC̄, (1-μ)λC̄, (1-μ)(1-λ)C̲) = ((λ2-λ1)C̄, λ1C̄, (1-λ2)C̲) ~ ((1-λ2)C̲, λ2C̄)   [by axiom 3]

since μ + (1-μ)λ = (λ2 - λ1) + λ1 = λ2. Similarly:

   (μC̲, (1-μ)L) ~ ((1-λ1)C̲, λ1C̄)   [by axiom 3]

since (1-μ)λ = λ1. Hence by axiom 2:

   (μC̄, (1-μ)L) ~ ((1-λ2)C̲, λ2C̄) ~ C ~ ((1-λ1)C̲, λ1C̄) ~ (μC̲, (1-μ)L)

so (μC̄, (1-μ)L) ~ (μC̲, (1-μ)L), which violates axiom 4 because C̲ < C̄. So λ1 = λ2.

Corollary: Lemma 10 works with C̲, C̄ replaced by any C1, C2 where C1 < C < C2.


Definition: A utility function u is a map u : R → ℝ such that:
1. If C1 < C2 then u(C1) < u(C2)
2. If C1 < C < C2 and if C ~ ((1-λ)C1, λC2) [note that λ exists and is unique by Axiom 5 and the corollary to Lemma 10] then u(C) = (1-λ)u(C1) + λu(C2)

I.e. u is preference preserving and linear.

Lemma 11. If u is a utility function then u(Ci) = aλi + b, where a and b are constants such that a > 0 and λi is the unique value in (0, 1) such that Ci ~ ((1-λi)C̲, λiC̄).
Proof. By linearity, u(Ci) = (1-λi)u(C̲) + λiu(C̄) = aλi + b where a = u(C̄) - u(C̲) and b = u(C̲). Since u preserves our preferences, we have a > 0 because C̲ < C̄.

N.B. The converse of Lemma 11 holds too [not proved here].


We can extend this to lotteries. Let L = (p1C1, ..., pmCm) be a lottery and let Ci ~ ((1-λi)C̲, λiC̄), λi unique. Hence L ~ ((Σi pi(1-λi))C̲, (Σi piλi)C̄) [by axiom 3] = ((1-λ)C̲, λC̄) where λ = Σi piλi. We define u(L) = aλ + b, a > 0, a, b constants. This is a well-defined function because the λi's are well-defined and so is λ. It can also be shown that u(L) has the properties of a utility function. Note that:

   u(L) = a Σi piλi + b = Σi pi(aλi + b) = Σi pi u(Ci)

so u(L) can be thought of as expected utility. If we accept axioms 1 to 5 then we should base our preferences on maximising the expected utility.
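Ranking lotteries by expected utility can be sketched as follows, using an illustrative concave utility u(z) = log(1 + z) [not one prescribed by the notes] on the £100-for-certain example from the introduction:

```python
import math

# Sketch: rank lotteries by expected utility u(L) = sum_i p_i u(C_i).
def expected_utility(lottery, u):
    # lottery is a list of (probability, monetary consequence) pairs
    return sum(p * u(c) for p, c in lottery)

u = lambda z: math.log(1 + z)     # an illustrative concave utility
L1 = [(1.0, 100)]                 # accept 100 for certain
L2 = [(0.5, 0), (0.5, 200)]       # 200 with probability 1/2, else 0
print(expected_utility(L1, u) > expected_utility(L2, u))  # True: the sure 100 wins
```

With any strictly concave u the certain £100 beats the fair gamble, matching the observed preference for d1.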

2.4 The Utility of Money

What does u(z) look like? Assume each element of R is a sum of money in R≥0. Consider u(z) for z ∈ R≥0. Typically we have the following concave u(z):

- u(0) = 0
- u(z) is increasing in z
- u(z) is twice differentiable
- u(z) is concave: u''(z) ≤ 0 because u'(z) decreases as z increases, which reflects the idea that an extra £10 to a rich person means very little compared with what it means to a poor person
- u(z) is linear near z = 0
- u(z) is bounded above; after £10^1000 no one would want any more

[End lecture 25, 10/12/14]
u(z) is very different for z < 0. Typical functions used to model u(z) for z ≥ 0 are:
- u(z) = a log(1 + bz), a, b > 0 constants and b small (to get linearity near the origin)
- u(z) = z/(z + γ), γ > 0 constant. γ represents willingness to take risks: a high γ, love risk; a low γ, risk averse, with this u(z)
- u(z) = 1 - e^(-λz), λ > 0 constant. Risk averse: large λ; risk loving: small λ

A risk-averse person prefers the certainty of small amounts of money. A risk-loving person prefers speculative gains of large amounts to certain gains of small amounts.
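One way to see the effect of λ in u(z) = 1 - e^(-λz) is the certainty equivalent: the sure sum whose utility equals the expected utility of a gamble. A sketch (function name and the 200-or-nothing gamble are illustrative):

```python
import math

# Sketch: certainty equivalent under u(z) = 1 - exp(-lam*z), i.e. the sure sum
# z* with u(z*) equal to the expected utility of "200 with prob 1/2, else 0".
def certainty_equivalent(lam):
    eu = 0.5 * (1 - math.exp(-lam * 200)) + 0.5 * 0.0   # u(0) = 0
    return -math.log(1 - eu) / lam                      # invert u

for lam in (0.001, 0.01, 0.05):
    print(lam, certainty_equivalent(lam))   # always below the expected value 100
```

The certainty equivalent falls as λ grows, matching the note that large λ corresponds to risk aversion.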

3 Bayesian Methods

If X, Y are continuous random variables then the conditional P.D.F. of Y given X = x is:

   fY|X(y|x) = fX,Y(x, y)/fX(x) = fX|Y(x|y)fY(y) / ∫ fX|Y(x|y)fY(y) dy

If x is fixed, fX(x) = constant and fY|X(y|x) ∝ fX,Y(x, y) = fX|Y(x|y)fY(y).

Example: Y ~ Exp(λ), λ > 0. X|Y = y ~ Exp(y). We will calculate the conditional distribution of Y|X. fY|X(y|x) ∝ fX|Y(x|y)fY(y) = y e^(-xy) λe^(-λy) ∝ y e^(-(x+λ)y), y ∈ (0, ∞). I.e. given X = x, Y has a Gamma(2, x + λ) distribution.

Bayesian Inference
Suppose we have some observations x1, ..., xn, which are realisations of random variables X1, ..., Xn that have a joint P.D.F. depending on an unknown parameter θ [θ may be vector-valued but will be scalar in an exam]. E.g. X1, ..., Xn independent and identically distributed N(θ, 1) random variables where θ is unknown [this is called a random sample]. The Bayesian approach to estimating θ assumes that θ itself has a P.D.F. before we collect the data. I.e. we regard θ as a random variable. Denote this P.D.F. π(θ). We call it a prior P.D.F. for θ. It represents our knowledge and beliefs about θ before we collect the data.
Here's an example from the 2013/14 course: A patient sees a doctor. The patient either has a disease, D, or not. The doctor must diagnose which it is: θ = 1 is the event that the patient has D; θ = 0 is the event that the patient does not have D. The doctor believes P(θ = 1) = π, P(θ = 0) = 1 - π.
The doctor now does some tests, i.e. collects data. We will call this data X. We get:

   P(θ = 1|X = x) ∝ fX|θ(x|θ = 1) π
   P(θ = 0|X = x) ∝ fX|θ(x|θ = 0) (1 - π)

These are our updated beliefs about θ. In general, we use Bayes' Theorem as follows:

   π(θ|x) ∝ fX|θ(x|θ) π(θ)

where x = (x1, ..., xn), X = (X1, ..., Xn). π(θ|x) is called the posterior P.D.F. of θ. fX|θ(x|θ) is the joint P.D.F. of the data given a particular θ. As a function of θ, this joint P.D.F. is called the likelihood function, i.e. posterior P.D.F. of θ ∝ likelihood × prior P.D.F. of θ.
Examples:
1. X ~ Bin(n, θ). n known, θ unknown. We're told π(θ) = 6θ(1 - θ), θ ∈ [0, 1]. Using Bayes:

      π(θ|x) ∝ (n choose x) θ^x (1 - θ)^(n-x) · 6θ(1 - θ) ∝ θ^(x+1) (1 - θ)^(n-x+1)

   θ ∈ [0, 1]. Hence the posterior P.D.F. of θ, given X = x, is that of a Beta(x+2, n-x+2) random variable. The prior distribution looks like something resembling a semi-circle: its posterior counterpart is more peaked - the variance has decreased. The posterior P.D.F. updates our feelings about θ in light of the data
2. X ~ N(θ, σ²), σ² is known, θ is unknown. We want to estimate θ. The prior distribution of θ [we are
   told] is π(θ) = N(μ0, σ0²) where μ0, σ0² are both known. Use Bayes:

   π(θ | x) ∝ exp{ −(1/2) [ (x − θ)²/σ² + (θ − μ0)²/σ0² ] }   [= prior × likelihood]

   The first term in the [ ] comes from the likelihood, and the second term from the prior. Think of this
   as a function of θ. Complete the square in θ. Get an expression proportional to another normal P.D.F.:

   θ | x ~ N(m, 1/p)   where   m = (1/p)( x/σ² + μ0/σ0² )   and   p = 1/σ² + 1/σ0²
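A numerical sketch of this normal-prior update (our own function and argument names: `sigma2` is the data variance, `mu0` and `sigma0_2` the prior mean and variance); precisions add, and the posterior mean is the precision-weighted average of the observation and the prior mean:

```python
def normal_posterior(x, sigma2, mu0, sigma0_2):
    """theta | x for X ~ N(theta, sigma2) with prior theta ~ N(mu0, sigma0_2).

    Returns (posterior mean m, posterior variance 1/p).
    """
    p = 1 / sigma2 + 1 / sigma0_2           # posterior precision: precisions add
    m = (x / sigma2 + mu0 / sigma0_2) / p   # precision-weighted average
    return m, 1 / p

m, v = normal_posterior(x=4.0, sigma2=1.0, mu0=0.0, sigma0_2=1.0)
print(m, v)   # 2.0 0.5: mean halfway between data and prior, variance halved
```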

How do we find π(θ)?

• From historical data
• Subjectively
• Non-informative priors. How do we express ignorance about θ?

Non-Informative Priors
If θ can only take a finite set of k values we might set π(θ) = 1/k. If θ ∈ [a, b] then take π(θ) = 1/(b − a), θ ∈ [a, b].

Problem: Any non-linear function of θ, say g(θ), is not uniformly distributed over [g(a), g(b)]. There's a worse
problem if the range of θ values is infinite.
If, for example, π(θ) = constant [finite] then ∫ π(θ) dθ is infinite. This is an example of an improper prior as it
doesn't integrate to 1. However, we can still use Bayes' Theorem as we often get a proper posterior distribution
for θ [one that integrates to 1].
E.g. take X1, . . . , Xn a random sample from Exp(θ), θ unknown [θ > 0] [random sample means X1, . . . , Xn
are independently and identically distributed]. We assume π(θ) ∝ 1/θ, an improper prior. Then

π(θ | x) ∝ θⁿ exp(−θ Σ_{i=1}^n x_i) · θ⁻¹  [by independence]  = θ^{n−1} exp(−θ Σ_{i=1}^n x_i)   [θ > 0]

Provided n ≥ 1, this gives a proper posterior P.D.F. for θ given X = x, that of a Gamma(n, Σ_{i=1}^n x_i) random variable.
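A crude numerical confirmation that the improper prior still yields a proper Gamma(n, Σx_i) posterior here (a sketch with made-up values n = 5, Σx_i = 10; the normalising constant Σx_iⁿ/(n−1)! is the Gamma constant):

```python
import math

def unnorm_posterior(theta, n, sum_x):
    """theta^(n-1) * exp(-theta * sum_x): the unnormalised posterior above."""
    return theta ** (n - 1) * math.exp(-theta * sum_x)

n, sum_x = 5, 10.0
# crude rectangle-rule integral of the unnormalised posterior over theta > 0
step = 0.001
area = sum(unnorm_posterior(i * step, n, sum_x) for i in range(1, 20000)) * step
norm_const = sum_x ** n / math.factorial(n - 1)   # Gamma(n, sum_x) constant
print(abs(area * norm_const - 1) < 0.01)   # True: it integrates to (about) 1
```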

[End lecture 26, 16/12/14]

4 Decision Theory
4.1 Introduction
Decision making involves uncertainty. Firstly, make a list, D, of all possible decisions or actions. Secondly, make
a list of the unknowns. These are called the states of nature, Θ.
E.g. do you take an umbrella? You don't know if it will rain or not. Θ = {0, 1}. θ = 0: it rains; θ = 1: it does not
rain. For any d ∈ D and θ ∈ Θ we assume that we can evaluate the consequences and the utility of this
consequence. In decision theory, we talk of losses; let us define the following loss function:

L(θ, d) = −(utility of the action d when θ is the true state of nature)

from which we can form a table of the states of nature {θ_i} (i = 1, . . . , m) against the actions/decisions {d_j} (j = 1, . . . , n) with entries
equal to L(θ_i, d_j). We assume that all the Ls are finite. It will look a bit like the pay-off matrix of a two-person
zero-sum game: the states of nature represent the set of pure strategies for player A [the rows] and the
actions those for player B [the columns]. We are trying to minimise L. However, there are differences:

• Nature is not playing like an intelligent opponent, i.e. it is not trying to maximise L
• What are the randomised strategies for both players? For Nature a randomised strategy is a
  probability distribution over Θ, i.e. a prior P.D.F. [or P.M.F.] π(θ) for θ
• We (i.e. player B) do not need to use any element of surprise on Nature so there's less need for a
  randomised strategy for the decision maker
• We know π(θ), so we are trying to choose the best d ∈ D in the light of Nature playing the randomised
  strategy π(θ). I.e. we need to look for a Bayes strategy for the decision maker with respect to π(θ)
We should also note that even if we consider randomised actions [i.e. probability distributions δ over D],
Lemma 2 tells us that:

inf_{d ∈ D} L(θ, d) = inf_δ L(θ, δ)

So we lose nothing by restricting ourselves to pure strategies for the decision maker.
Definitions:
1. r(π, d) = E_π L(θ, d) = ∫_Θ L(θ, d) π(θ) dθ is called the Bayes Loss of d with respect to π(θ)
2. A decision d* ∈ D is called a Bayes Decision or Bayes Action with respect to π(θ) if it minimises r(π, d),
   i.e. if r(π, d*) ≤ r(π, d) ∀d ∈ D
3. If d* is a Bayes Decision then r(π, d*) is called the Bayes Loss of π
Example: Let π(θ1) = 1/4, π(θ2) = 3/4, with loss table:

      d1  d2
θ1     0   3
θ2     5   2

r(π, d1) = 0 × 1/4 + 5 × 3/4 = 15/4
r(π, d2) = 3 × 1/4 + 2 × 3/4 = 9/4

So d2 is the Bayes Action with respect to π; the Bayes Loss of π is 9/4.
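This computation is easy to mechanise (a sketch; `bayes_action` is our own helper, applied to the prior and loss table of the example above):

```python
def bayes_action(prior, loss_table):
    """prior[i] = pi(theta_i); loss_table[i][j] = L(theta_i, d_j).

    Returns the index of the Bayes action and the Bayes losses r(pi, d_j).
    """
    n_actions = len(loss_table[0])
    r = [sum(p * row[j] for p, row in zip(prior, loss_table))
         for j in range(n_actions)]
    return min(range(n_actions), key=r.__getitem__), r

prior = [1/4, 3/4]
loss = [[0, 3],   # L(theta_1, d1), L(theta_1, d2)
        [5, 2]]   # L(theta_2, d1), L(theta_2, d2)
j_star, r = bayes_action(prior, loss)
print(j_star, r)   # 1 [3.75, 2.25]: d2 is the Bayes action, Bayes loss 9/4
```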

4.2 Decision Rules


In practice, we collect data to inform our decision making. These data give us information about the state of
nature. Let X represent the data. Let 𝒳 = sample space of all possible x's. The distribution of X depends
on the unknown θ. D is the set of actions.
Definitions:
1. A decision rule d is [now] a map d : 𝒳 → D. I.e. if we observe x ∈ 𝒳 then we choose d(x) ∈ D
2. The risk function of a decision rule, d, is:
   R(θ, d) = E_{X|θ} L(θ, d(X)) = expected value of the loss over the distribution of X for a particular fixed θ
   If we have no data then the risk is equivalent to the loss, i.e. R(θ, d) = L(θ, d)
3. The Bayes Risk of d with respect to π(θ) is r(π, d) = E_π R(θ, d). [Compare with E_π L(θ, d) in the
   case of having no data, as on the previous page]
4. A Bayes Decision Rule, d*, with respect to π(θ) minimises the Bayes Risk of d with respect to π(θ). I.e.
   if d* is a Bayes Rule, then r(π, d*) ≤ r(π, d) ∀d(x)
5. If d* is a Bayes Rule, then r(π, d*) is called the Bayes Risk of π, denoted by r(π)
Lemma 12. The Bayes Risk is minimised by minimising the expected posterior loss of d(x).
Proof. We want to choose d(x) to minimise the Bayes Risk. I.e. we want to choose d(x) to minimise:

r(π, d) = ∫_Θ π(θ) R(θ, d) dθ = ∫_Θ π(θ) [ ∫_𝒳 f(x | θ) L(θ, d(x)) dx ] dθ

where the integral in the square bracket is R(θ, d). The next step assumes we can interchange θ and x, which
we can do as we're dealing with reality and expect not to have any pathological example:

= ∫_𝒳 f_X(x) [ ∫_Θ π(θ | x) L(θ, d(x)) dθ ] dx

using π(θ | x) = f(x | θ) π(θ) / f_X(x) from Bayes' Theorem.

Looking at the last integral, once we have observed x we should minimise the [ . . . ] inside the main integral. If we
were to do this for each x then we would have minimised r(π, d). I.e. once we know x we need to minimise, for
any particular x:

∫_Θ π(θ | x) L(θ, d(x)) dθ = the expected posterior loss of choosing the decision rule d(x)

So minimising the Bayes Risk ≡ minimising the expected posterior loss ≡ maximising the expected posterior
utility.
Example: We observe x from X where X ~ N(θ, 1) [θ is unknown]. The prior distribution of θ is N(0, τ²). Then

θ | x ~ N( τ²x/(1 + τ²), τ²/(1 + τ²) )   [See Section 3]

We need a loss function. The most common in inference is squared error loss:

L(θ, d(x)) = (θ − d(x))²

There was some more material on this covered in the 2013/14 lecture series, which I'll include in an appendix
in section 6.1.

Lemma 13. Under squared error loss [and a proper posterior distribution for θ], the Bayes Decision Rule is to
estimate θ by the mean of the posterior distribution for θ.
Proof. The expected posterior loss of a decision rule d(x) is:

∫ π(θ | x) (θ − d(x))² dθ

This is a ∪-shaped quadratic in d(x). We differentiate with respect to d(x) and set it = 0:

d/d{d(x)} ∫ π(θ | x) (θ − d(x))² dθ = −2 ∫ π(θ | x) (θ − d(x)) dθ = 0
⟹ ∫ θ π(θ | x) dθ = ∫ π(θ | x) d(x) dθ = d(x),

the mean of the posterior distribution of θ. N.B. ∫ π(θ | x) dθ = 1.
Examples:
1. In our X ~ N(θ, 1) example, with squared error loss, the prior is N(0, τ²). The posterior distribution of θ
   is N( τ²x/(1 + τ²), τ²/(1 + τ²) ) [see Section 3].
   The mean of the posterior distribution for θ, i.e. the Bayes Rule, d*, for estimating θ with respect to π(θ),
   is d*(x) = τ²x/(1 + τ²)
2. X ~ Bin(n, θ). π(θ) = 6θ(1 − θ), θ ∈ [0, 1]. Squared error loss. We have π(θ | x) ∝ θ^{x+1}(1 − θ)^{n−x+1}, i.e.
   the posterior distribution for θ is Beta(x + 2, n − x + 2). The Bayes Rule d* for estimating θ with respect
   to π(θ) is the mean of this beta distribution, i.e. d*(x) = (x + 2)/(n + 4).
   If instead we were interested in φ = θ² then we would calculate:

   E_{θ|X}(θ² | X = x) = var(θ | X = x) + [E(θ | X = x)]² = (x + 2)(n − x + 2) / [(n + 4)²(n + 5)] + [(x + 2)/(n + 4)]²

   from the properties of the Beta distribution.
[End lecture 27, 17/12/14]

4.3 Decision Trees

Decision-making is usually sequential; most decision trees have many steps, which we use to represent decision
problems. There are 2 kinds of nodes [vertices]: decision nodes and chance nodes. Quite often these alternate.
Examples:
1. Dinner example: C1 [overfull]; C2 [okay but costly]; C3 [great]; C4 [starve]:

   In this diagram, the probability marked on the chance branch is P(meal), the square represents the decision node and the circles, the chance nodes.
   It is drawn from left to right → time goes from left to right. The Ci are consequences [in practice with
   attached utilities, sometimes called terminal utilities, which sum to 1]

2. Inference [an example from the 2013/14 course]:

3. A company is thinking of launching a new product that it has developed. The marketing executive
   estimates the profits / losses resulting from different market shares. The market share level, θ, is equal
   to one of 2% or 10% with prior probabilities of 0.3 and 0.7 [with the numbers in utility units]:

   Market Share Level (θ)     10%      2%
   Prior                      0.7      0.3
   Launch                     500     −250
   Don't launch                 0        0

   [Cost for scrap is 0 because the research and development costs have already been sunk.]

   E(utility of launch) = 0.7 × 500 − 0.3 × 250 = 275
   E(utility of scrap) = 0

   This is called a prior analysis of the problem. We might do better if we knew more about the market
   share [i.e. 10% or 2%]. We would launch if 10% and scrap if 2%.

   E(utility under perfect information) = 500 × 0.7 + 0 × 0.3 = 350

   The difference [350 − 275 = 75] is called the expected value of perfect information. It is not worth spending more than this to get information on the unknown market share.
   Suppose now that the company is offered a market research proposal at a cost of 10 utility units. The
   market researchers will report one of:
   • Market share will be high
   • Market share will be low
   Unfortunately "high" and "low" do not correspond exactly with 10% and 2% as the M.R. company may
   get things wrong.

                         Market Research Report
                          High      Low
   Actual Share   10%     0.85    (0.15)
                   2%     0.25     0.75

   (0.15) = P(the market research report says low | the share is actually 10%).


Let's draw the decision tree:

The Market Research Problem

Note the terminal utilities on the right hand side. All probabilities are conditional on everything to the
left. Exempli gratia:
i. P(MR says high) = P(high | 10%)P(10%) + P(high | 2%)P(2%) = 0.85 × 0.7 + 0.25 × 0.3 = 0.67
ii. P(10% | MR says high) = P(MR says high | 10%)P(10%) / P(MR says high) = 0.89 by Bayes

How do we solve this? We fold back the tree from right to left. The purple numbers are obtained from
this folding back. The quadruple purple lines indicate to us not to take these decisions.
We take expectations at chance nodes; we maximise at decision nodes. In the top right corner, we take
expectations: 0.89(500 − 10) + 0.11(−250 − 10).
Repeat this for other chance nodes on the right. Attach these expectations to the relevant chance nodes.
Then, maximise at the decision nodes you meet next. For a more algorithmic specification:
• Work from right to left
• Take expectations at a chance node [use probabilities to the right]
• Maximise utility at a decision node
• Block off the non-optimal lines at the decision nodes at each stage
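The fold-back for this example can be written directly (a sketch; `fold_back` is our own name, but the probabilities 0.85/0.25, prior 0.7/0.3, utilities 500/−250 and cost 10 are those given above):

```python
def fold_back(cost=10):
    """Expected utility of buying the market research, by folding back."""
    p_high = 0.85 * 0.7 + 0.25 * 0.3        # P(MR says high) = 0.67
    p10_high = 0.85 * 0.7 / p_high          # P(10% | high), about 0.89
    p10_low = 0.15 * 0.7 / (1 - p_high)     # P(10% | low)

    def launch(p10):                        # chance node after deciding to launch
        return p10 * (500 - cost) + (1 - p10) * (-250 - cost)

    eu_high = max(launch(p10_high), -cost)  # decision node after "high"
    eu_low = max(launch(p10_low), -cost)    # decision node after "low"
    return p_high * eu_high + (1 - p_high) * eu_low

eu_research = fold_back()          # about 268.75
eu_prior = 0.7 * 500 - 0.3 * 250   # 275, the prior analysis
print(eu_research < eu_prior)      # True: the research is not worth its cost
```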

4. The Marriage Problem. When should one get married?! Assume n suitable partners will be presented to
   you in a random order, one by one [n is known]. Having seen person j you can rank persons 1, 2, . . . , j.
   You then must decide to reject person j [and move on to person j + 1] or accept person j. You cannot
   back-track; you cannot accept one seen earlier. [If you were able to see all at once you could rank them
   all.] If you reach person n you must accept them. p_i > p_j denotes "person i is preferred to person j".
   There are no ties.
   Terminal utility = P(best person has been selected). We want to maximise this. Consider the case n = 4.
   We need to calculate the terminal utilities. For example:

   P(p3 is best | p3 > p2 > p1) = P(p3 > p2 > p1 and p3 > p4) / P(p3 > p2 > p1) = (1/8) / (1/6) = 3/4

   The Marriage Problem

   If p2 > p1 then accept p2. Otherwise see p3 and only accept p3 if p3 > p2 > p1.

[End lecture 28, 18/12/14]


We didn't cover the case for general n in lectures, but I'll include it here anyway. If u_j = utility of the
chance node immediately following seeing p_j, exempli gratia u_n = 1/n = utility of choosing the final
person. Find:

u_{n−k} = (1/(n−k)) [ max{ (n−k)/n, u_{n−k+1} } + (n−k−1) u_{n−k+1} ]
        = ((n−(k+1))/n) [ 1/(n−(k+1)) + 1/(n−k) + · · · + 1/(n−1) ]

as long as u_{n−k+1} ≤ (n−k)/n holds.
The threshold occurs when u_{n−k+1} = (n−k)/n. Set

((n−k)/n) [ 1/(n−k) + · · · + 1/(n−1) ] = (n−k)/n,  i.e.  1/(n−k) + · · · + 1/(n−1) = 1.

Use Euler's approximation of the sum of reciprocals:

log( (n−1)/(n−(k+1)) ) ≈ 1  ⟹  (n−k)/n → e⁻¹ ≈ 0.368

After the threshold, choose the best so far, id est the first one better than all those seen so far.
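The backward recursion can be computed exactly with rationals (a sketch; u[j] is the value of the chance node just after seeing person j, with u[n] = 1/n as above):

```python
from fractions import Fraction

def secretary_values(n):
    """Backward induction for the marriage problem: returns the list u[0..n]."""
    u = [Fraction(0)] * (n + 1)
    u[n] = Fraction(1, n)
    for j in range(n - 1, 0, -1):
        best_so_far = Fraction(j, n)   # win prob if we accept a best-so-far now
        # with prob 1/j person j is the best so far; otherwise we must continue
        u[j] = (max(best_so_far, u[j + 1]) + (j - 1) * u[j + 1]) / j
    return u

u = secretary_values(4)
print(u[1])   # 11/24, the optimal win probability for n = 4
```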

5 Mastery Exam Component

The M4S11 mastery question will be on n-person cooperative games. The following books may be useful:
• Introducing Game Theory and its Applications by Mendelson, Chapman and Hall
• Introduction to Game Theory by Morris, Springer Verlag
• The Theory of Games by Wang, Oxford
We're expected to study the following topics:
1. Coalitions
2. Characteristic Functions
3. Imputations
4. The core of an n-person cooperative game
Definition. Let P be the set consisting of all N players. A coalition, S, is a subset of P. The corresponding
counter-coalition to S is simply Sᶜ = P \ S.
Clearly there are 2ᴺ coalitions. We call P the grand coalition and its complement is the empty coalition, ∅.
With a coalition S, it's natural to think of the game as having two players: S and Sᶜ. We can rewrite this
game in bi-matrix form, with each tuple's first coordinate the sum of the pay-offs to the players in the
coalition and its second coordinate the sum of the pay-offs to the players in Sᶜ.
Definition. The maximin value for the coalition, denoted v(S), is called the characteristic function. Its
domain is the set of all coalitions.
Note: Obviously v(P) = the largest total pay-off which the set of all players can achieve, and v(∅) = 0.
Theorem (Superadditivity). Let S and T be two disjoint coalitions. Then:

v(S ∪ T) ≥ v(S) + v(T)

Proof. By the definition of a characteristic function, there's a joint strategy for the members of S such that the
total pay-off to the members of S is at least v(S). Similarly, there's a joint strategy for the members of T such that
the total pay-off to them is at least v(T). Since S and T are disjoint, if players in the two coalitions play according
to these strategies, then the total pay-off to the union is guaranteed to be at least v(S) + v(T) and hence the
maximin value for the union is at least this value, i.e. superadditivity holds.
Definition. A game in characteristic function form comprises a set of players P = {P1, . . . , PN} together with
a function v defined on all subsets of P such that v(∅) = 0 and superadditivity holds, i.e. v(S ∪ T) ≥ v(S) + v(T)
for disjoint subsets S and T of P.
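Superadditivity is easy to test mechanically for a characteristic function stored as a dict on frozensets (a sketch using the three-player majority game, where any coalition of at least two players is worth 1; the representation is our own choice):

```python
from itertools import combinations

players = (1, 2, 3)
v = {frozenset(s): (1 if len(s) >= 2 else 0)
     for r in range(len(players) + 1) for s in combinations(players, r)}

def superadditive(v):
    """Check v(S union T) >= v(S) + v(T) for every pair of disjoint coalitions."""
    coalitions = list(v)
    return all(v[s | t] >= v[s] + v[t]
               for s in coalitions for t in coalitions if not (s & t))

print(superadditive(v))   # True
```

Dropping the grand coalition's value to 0 would break the check, which is a handy sanity test for hand-built characteristic functions.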
Definition. An N-person game, v, in characteristic function form is said to be inessential if:

v(P) = Σ_{i=1}^N v({P_i})

and a game which is not inessential is said to be essential.


If a game is inessential all it means is that there's no reason for coalitions to form. We can see this by the
below theorem:
Theorem. Let S be any coalition of the players in an inessential game. Then:

v(S) = Σ_{P_i ∈ S} v({P_i})

Proof. Suppose not; then by superadditivity we must have v(S) > Σ_{P_i ∈ S} v({P_i}). But now, by superadditivity:

v(P) ≥ v(S) + v(Sᶜ) > Σ_{i=1}^N v({P_i})

which contradicts the game being inessential.


Theorem. A two-person-zero-sum game in its normal form is inessential in its characteristic function form.
Proof. As the game is zero-sum, v(P) = 0. Further, v({P1}) = −v({P2}). Hence:

v(P) = v({P1}) + v({P2})


However, zero-sum games with more than two players can be essential.
Definition. Let v be an N-person game in characteristic function form with players P = {P1, . . . , PN}. An
N-tuple, x, of real numbers is said to be an imputation if both of the following conditions hold:
1. Individual Rationality: For all players P_i: x_i ≥ v({P_i})
2. Collective Rationality: We have: Σ_{i=1}^N x_i = v(P)

The first condition is an intuitively obvious imposition. If x_i were < v({P_i}) then player P_i would do better
off on their own. For the second condition, consider:
• Σ_{i=1}^N x_i ≥ v(P): If the sum were < v(P), then ε = v(P) − Σ_{i=1}^N x_i > 0 and the players would form the
  grand coalition and distribute the total pay-off v(P) as x′_i = x_i + ε/N, so ≥ must hold
• Σ_{i=1}^N x_i ≤ v(P): Suppose x occurs, i.e. a coalition S with such a profit split occurs. Then, using superadditivity:

  Σ_{i=1}^N x_i = Σ_{P_i ∈ S} x_i + Σ_{P_i ∈ Sᶜ} x_i ≤ v(S) + v(Sᶜ) ≤ v(P)

Theorem. Let v be an N-person game in characteristic function form. If v is inessential, then it only has
one imputation:

x = (v({P1}), . . . , v({PN}))

and if v is essential then it has infinitely many imputations.
Proof. Suppose that v is inessential and that x is an imputation. If for some j we had x_j > v({P_j}) then
Σ_{i=1}^N x_i > Σ_{i=1}^N v({P_i}) = v(P), contradicting collective rationality.
Now suppose that v is essential and let:

ε = v(P) − Σ_{i=1}^N v({P_i}) > 0

Then for any N-tuple (ε1, . . . , εN) of positive numbers summing to ε we have x_i = v({P_i}) + ε_i, which clearly defines
an imputation. So, with infinitely many choices for (ε1, . . . , εN), there are infinitely many imputations when a game is
essential.
For an essential game, there are too many imputations. So we need a way to single out the ones which merit
the title of a solution. The following definition attempts to formalise the notion of one imputation being
preferred over another.

Definition. Let v be a game in characteristic function form, let x and y be imputations and let S be a coalition.
We say that x dominates y through the coalition S if the following two conditions hold:
1. x_i > y_i ∀P_i ∈ S
2. Σ_{P_i ∈ S} x_i ≤ v(S)
and we denote this by x ≻_S y.
The second condition merely tells us that such an x is feasible, that the coalition can attain enough pay-off so
as to distribute the pay-offs as prescribed. Now we'll meet a solution concept, albeit one that's fundamentally
flawed in that it is sometimes empty!
Definition. Let v be a game in characteristic function form. The core of v consists of all imputations which are not
dominated by any other imputation through any coalition.
So if x is in the core, then no group of players has reason to form a coalition and replace x with a different
imputation. At first, it looks difficult to decide whether x is in the core, but the following theorem will help
make this easier.
Theorem. Let v be a game in its characteristic function form with N players and let x be an imputation. x
is in the core of v if and only if:

Σ_{P_i ∈ S} x_i ≥ v(S)

for every coalition S.

Proof. Suppose this formula holds for every coalition S. If some other imputation w dominates x through a
coalition S then:

Σ_{P_i ∈ S} w_i > Σ_{P_i ∈ S} x_i ≥ v(S)

which violates the feasibility condition in the definition of domination.
Now suppose that x is in the core and suppose that S is a coalition such that:

Σ_{P_i ∈ S} x_i < v(S)

Note that S ≠ P, otherwise collective rationality in the definition of an imputation would be violated. Next,
there has to exist a P_j ∈ Sᶜ such that x_j > v({P_j}). If this weren't true then by superadditivity:

Σ_{i=1}^N x_i < v(S) + Σ_{P_i ∈ Sᶜ} x_i = v(S) + Σ_{P_i ∈ Sᶜ} v({P_i}) ≤ v(S) + v(Sᶜ) ≤ v(P)

which violates collective rationality. So we can indeed choose P_j ∈ Sᶜ such that ∃ε with 0 < ε ≤ x_j − v({P_j})
and ε ≤ v(S) − Σ_{P_i ∈ S} x_i. Now, with k denoting the number of players in S, we define a new imputation, w,
by:

w_i = x_i + ε/k for P_i ∈ S
w_j = x_j − ε
w_i = x_i for all other i

Then w dominates x through S and so the assumption that x is in the core is contradicted.
The next Corollary will give a more convenient form of the above result, which will allow us to calculate the core
of a game more easily.
Corollary. Let v be a game in characteristic function form with N players and let x be an N-tuple of numbers.
Then x is an imputation in the core if and only if the following two conditions hold:
1. Σ_{i=1}^N x_i = v(P)
2. Σ_{P_i ∈ S} x_i ≥ v(S) for every coalition S
Proof. Certainly an imputation in the core satisfies the two above conditions. Now let x satisfy both conditions. The second condition applied to one-player coalitions shows that individual rationality holds. The first
condition is collective rationality, and so x is an imputation. And it's certainly in the core.
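The corollary translates directly into a membership test (a sketch; `in_core` and the three-player toy game are our own - singletons worth 0, pairs worth 0.5, the grand coalition worth 1):

```python
from itertools import combinations

def in_core(x, v, n, eps=1e-9):
    """Corollary test: sum(x) == v(P) and every coalition S gets >= v(S)."""
    grand = frozenset(range(n))
    if abs(sum(x) - v[grand]) > eps:
        return False                       # collective rationality fails
    return all(sum(x[i] for i in s) >= v[frozenset(s)] - eps
               for r in range(1, n + 1) for s in combinations(range(n), r))

v = {frozenset(s): w for s, w in [
    ((0,), 0.0), ((1,), 0.0), ((2,), 0.0),
    ((0, 1), 0.5), ((0, 2), 0.5), ((1, 2), 0.5), ((0, 1, 2), 1.0)]}

print(in_core([1/3, 1/3, 1/3], v, 3))   # True
print(in_core([0.7, 0.2, 0.1], v, 3))   # False: {P2, P3} get only 0.3 < 0.5
```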
Definition. Let v be a game in characteristic function form. We say that v is constant-sum if, for every
coalition S, we have:

v(S) + v(Sᶜ) = v(P)

Further, it's zero-sum if it's constant-sum and v(P) = 0.
The concepts of zero-sum and constant-sum are not the same in normal and characteristic function forms; the
two aren't equivalent. It's possible for a game which is not constant-sum in its normal form to be constant-sum
in its characteristic function form.
Theorem. If an N-person game is zero-sum in its normal form, then it is also zero-sum in its characteristic
function form.
Proof. Suppose that the game weren't zero-sum in its characteristic function form, i.e. for some coalition S we
have v(S) ≠ −v(Sᶜ). Then, if the players in the coalition S and the counter-coalition Sᶜ adopt the strategies
that give them the maximin pay-offs v(S) and v(Sᶜ) in the normal form game, then the total pay-off to all
players would be v(S) + v(Sᶜ) ≠ 0. But this is a contradiction: since the game is zero-sum in the normal form
we know that Σ_{i=1}^N π_i(x1, . . . , xN) = 0 for all strategies x1, . . . , xN for players P1, . . . , PN.
Theorem. If an N-person game is constant-sum in its normal form, then it is also constant-sum in its
characteristic function form.
Proof. Let c be the constant value of the normal-form game Γ, i.e. Σ_{i=1}^N π_i(x1, . . . , xN) = c for all choices of
strategies x1, . . . , xN for players P1, . . . , PN respectively. We define a new game Γ′ by subtracting c/N from
every pay-off in Γ. Then π′_i(x1, . . . , xN) = π_i(x1, . . . , xN) − c/N for every choice of i and for all choices of
strategies. Then Γ′ is zero-sum, and thus the characteristic function form of Γ′, u, is zero-sum. But it's easy
to see that the characteristic function v of Γ is related to u by the formula:

v(S) = u(S) + kc/N

where k is the number of players in the coalition S. So clearly v is constant-sum.
Theorem. If v is both essential and constant-sum, then its core is empty.
Proof. Suppose v has players {P_i} (i = 1, . . . , N). We'll prove this by showing that if v is constant-sum and there's an
imputation x in its core then v must be inessential. We know, for any player P_j, that x_j ≥ v({P_j}) by
individual rationality. Since x is in the core, we also have:

Σ_{i ≠ j} x_i ≥ v({P_j}ᶜ)

Summing these inequalities we see, using collective rationality, that:

v(P) = Σ_{i=1}^N x_i ≥ v({P_j}) + v({P_j}ᶜ) = v(P)

by the constant-sum property. Hence the inequality is actually an equality and so x_j = v({P_j}). Since this holds
for every j, the game is inessential.
Definition. A game v in characteristic function form is called simple if all of the below hold:
• v(S) is either 0 or 1 for every coalition S
• v(P) = 1
• v({P_i}) = 0 ∀P_i ∈ P
In a simple game, a coalition S with v(S) = 1 is called a winning coalition and a coalition with v(S) = 0 is
called a losing one.

6 Appendix: Further Material of Use

6.1 On Risk and Loss Functions
These are a couple of examples given in last year's course, which Lynda didn't have time to go over this year:
1. Oil company. Should they drill for oil at a particular site? There is an unknown probability θ that there is
   oil there. They have drillings at other similar sites to inform us about θ. Θ = [0, 1], θ = P(oil at the site).
   d1 = drill at the site; d2 = do not drill at the site. Let A = profit if d1 is chosen and oil is found,
   B = loss if d1 is chosen and no oil is found. So the expected losses are:

   L(θ, d1) = −Aθ + B(1 − θ)
   L(θ, d2) = 0

   In n similar wells [n known], x have oil. So 𝒳 = {0, 1, . . . , n} and X ~ Bin(n, θ)


(a) Calculation of the risk function. Consider the decision rules {d_c} of the form:

    d_c(x) = d1 if x/n ≥ c;  d_c(x) = d2 if x/n < c

    where c ∈ [0, 1] is a constant.

    R(θ, d_c) = expected loss over the distribution of X for a particular θ
              = L(θ, d1) P(X ≥ cn | θ) + L(θ, d2) P(X < cn | θ)
              = (B − (A + B)θ) Σ_{r ≥ cn} C(n, r) θ^r (1 − θ)^{n−r}

(b) Calculation of the Bayes Decision Rule. Suppose now that we have a prior distribution for θ and
    that it is Beta(α, β) where α, β are known. So:

    π(θ | x) ∝ θ^x (1 − θ)^{n−x} · θ^{α−1} (1 − θ)^{β−1}

    Hence the distribution of θ | x is Beta(α + x, n − x + β).
    We want to choose the action [d1 or d2] to minimise the expected posterior loss, so we calculate:

    E_{θ|X} L(θ, d1) = −(A + B) E_{θ|X}(θ | X = x) + B = −(A + B) (α + x)/(α + n + β) + B

    where (α + x)/(α + n + β) is the mean of the posterior distribution, and

    E_{θ|X} L(θ, d2) = 0

    So, we drill if (A + B) (α + x)/(α + n + β) ≥ B, i.e. when x ≥ [(n + β)B − αA] / (A + B).

    This is the Bayes Decision Rule
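The resulting rule is a simple threshold in x; a sketch with made-up numbers (profit A = 10, loss B = 40, a flat Beta(1, 1) prior, 4 similar wells):

```python
def drill(x, n, A, B, alpha, beta):
    """Bayes decision for the oil problem: True means choose d1 (drill)."""
    post_mean = (alpha + x) / (alpha + n + beta)   # E(theta | x), Beta posterior
    return (A + B) * post_mean >= B                # expected posterior loss of d1 <= 0

# profit A = 10 if oil, loss B = 40 if dry, 4 similar wells observed
print([drill(x, 4, 10, 40, 1, 1) for x in range(5)])
# [False, False, False, False, True]: drill only if all 4 wells struck oil
```

This matches the closed-form threshold x ≥ ((n + β)B − αA)/(A + B) = 3.8 for these numbers.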

2. Observe a random sample x1, . . . , xn from N(θ, 1) where θ is unknown. x_i is an observation on X_i.
   X1, . . . , Xn are independent N(θ, 1) random variables. Θ = ℝ. We want to estimate θ, i.e. we are asking:
   which function d(x) of the data best estimates θ? We need a loss function, L. Some common ones are:
   • Squared error loss: L(θ, d(x)) = (d(x) − θ)²
   • Absolute loss: L(θ, d(x)) = |d(x) − θ|

   Example to show the risk function [assume squared error loss]: R(θ, d_c) = E_{X|θ} L(θ, d_c(X)) where d_c(x) = cx̄,
   c = constant and x̄ = (1/n) Σ_{i=1}^n x_i, X̄ similarly.

   R(θ, d_c) = E_{X|θ}(cX̄ − θ)² = var(cX̄ − θ) + (E(cX̄ − θ))² = c²/n + θ²(c − 1)²

   using E X̄ = θ, var X̄ = 1/n.
   In the range [A, B], d_{1/2} is better than d1, but we don't know θ! Note that if c > 1 then:

   R(θ, d1) = 1/n < c²/n + θ²(c − 1)² = R(θ, d_c) ∀θ

   If c > 1 we can improve on d_c by using d1. We say that d_c is inadmissible.


Suppose now that n = 1, i.e. one observation, X, and we have a prior distribution for θ that is N(0, τ²)
where τ is known. Squared error loss.

R(θ, d_c) = c² + θ²(c − 1)² as above

Bayes Risk of d_c with respect to π: r(π, d_c) = E_π R(θ, d_c) = E_π[c² + θ²(c − 1)²] = c² + (c − 1)² E_π(θ²) =
c² + (c − 1)²τ², since the variance of the prior is equal to τ². Amongst the rules of this kind (i.e. of the
kind d_c(x) = cx) this gives the best one, i.e. we minimise c² + (c − 1)²τ² with respect to c → c = τ²/(1 + τ²).
This rule is best overall [i.e. the Bayes Rule].
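A quick check that c = τ²/(1 + τ²) really minimises the Bayes risk c² + (c − 1)²τ² (a sketch with τ² = 4; the grid search is just a sanity check of the calculus):

```python
def bayes_risk(c, tau2):
    """r(pi, d_c) = c^2 + (c - 1)^2 * tau^2 for d_c(x) = c x, n = 1."""
    return c ** 2 + (c - 1) ** 2 * tau2

tau2 = 4.0
c_star = tau2 / (1 + tau2)                      # claimed minimiser: 0.8
grid = [i / 1000 for i in range(1001)]
c_grid = min(grid, key=lambda c: bayes_risk(c, tau2))
print(c_star, c_grid)   # 0.8 0.8
```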
Lemma. If we have absolute loss, the Bayes Rule is to estimate θ by the median of the posterior distribution
of θ [i.e. that value m, say] such that:

∫_{−∞}^{m} π(θ | x) dθ = 1/2

Proof. The expected posterior loss of decision rule d ≡ d(x) is:

∫ π(θ | x) |θ − d(x)| dθ = ∫_{−∞}^{d} (d − θ) π(θ | x) dθ + ∫_{d}^{∞} (θ − d) π(θ | x) dθ

Now let us differentiate with respect to d [N.B. the limits of the integrals involve d itself]:

∫_{−∞}^{d} π(θ | x) dθ = ∫_{d}^{∞} π(θ | x) dθ   [Exercise]

We have used:

∂/∂a ∫_{−∞}^{g(a)} f(a, x) dx = ∫_{−∞}^{g(a)} ∂f/∂a dx + g′(a) f(a, g(a))

6.2 Mid-Term 1 - 13/11/14

Solve the following games:

Part I

G =
3 6
1 0

(a1, b1) is a PSSP because 3 < 6 [row 1] and 3 > 1 [column 1]. The game thus has value V = 3.
Part II

G =
1 0 2
0 2 1
2 1 0

Consider α* = β* = (1/3, 1/3, 1/3):

g(a1, β*) = g(a2, β*) = g(a3, β*) = 1
g(α*, b1) = g(α*, b2) = g(α*, b3) = 1

Hence A and B both have ESs, implying that α* is maximin and β* is minimax. V = 1.
Part III

G =
1 4 6
2 0 3
1 2 8

b3 is inadmissible [compare it with b1 or b2]. So we delete b3, and after doing so we see that a3 is now inadmissible
[compare it with a1]. So we then delete a3. We're left with:

1 4
2 0

A has ES α_S = (2/5, 3/5) and B has ES β_S = (4/5, 1/5).
Hence α_S is maximin in this sub-game and β_S is minimax. For the whole game, α* = (2/5, 3/5, 0) and
β* = (4/5, 1/5, 0) are maximin and minimax respectively. V = 8/5.

Part IV

x, y ∈ ℝ, x, y > 0:

G =
 x  2x
2y   y

(a1, b1) is a PSSP if x/y ≥ 2. In this case V = x. (a2, b2) is a PSSP if x/y ≤ 1/2. In this case V = y.
When 1/2 < x/y < 2 there's no PSSP. In that case

α = ( y/(x + y), x/(x + y) ) is an ES for A

β = ( |y − 2x| / (|y − 2x| + |2y − x|), |2y − x| / (|y − 2x| + |2y − x|) ) = ( (2x − y)/(x + y), (2y − x)/(x + y) ) is an ES for B

Check that the latter is indeed a randomised strategy.
As both α and β are ESs, α is maximin and β is minimax. V = xy/(x + y) + 2xy/(x + y) = 3xy/(x + y).

Part V

G =
1  x  x²
x  x² x³

1. x > 0, x ∈ ℝ:
   0 < x ≤ 1 ⟹ (a1, b3) is a PSSP since x² ≤ 1, x² ≤ x and x² ≥ x³. V = x².
   When x > 1, (a2, b1) is a PSSP as x < x², x < x³ and x > 1. V = x.

2. x = −2 [considering subgames]:

   G =
    1 −2  4
   −2  4 −8

   ∄ any obvious inadmissible strategies. The idea here is to solve the 3 sub-games to see if any can be
   extended to give a solution to the whole game. Consider:

    1 −2
   −2  4

   Here α_S = β_S = (2/3, 1/3) are ESs in the sub-game.
   For the whole game, we calculate:

   g(a1, β*) = g(a2, β*) = 0
   g(α*, b1) = g(α*, b2) = g(α*, b3) = 0

   where α* = (2/3, 1/3) and β* = (2/3, 1/3, 0). Hence α* and β* are both ESs for the whole game and α*, β* are
   maximin, minimax respectively. V = 0.
   If we had chosen to disregard b1 instead of b3 we would have obtained an alternate solution to the whole
   game: α = (2/3, 1/3), β = (0, 2/3, 1/3).
   The other 2 × 2 sub-game has a PSSP at (a1, b1), but Lemma 3 shows [very easily] that this cannot be
   extended to a solution to the whole game. In fact, any β = λ(2/3, 1/3, 0) + (1 − λ)(0, 2/3, 1/3) with
   α = (2/3, 1/3) for λ ∈ [0, 1] gives a solution

6.3 Mid-Term 2 - 11/12/14

Part I
a. Note that (3, 8) lies on the line joining (2, 9) and (7, 4) (x + y = 11). S, the risk set, is the triangle with
   vertices (2, 9), (7, 4) and (8, 2).
b. The line segments (2, 9) → (7, 4) and (7, 4) → (8, 2) are the admissible strategies for A.
c. s_M = λ(7, 4) + (1 − λ)(3, 8) = (4λ + 3, 8 − 4λ) lies on y = x if λ = 5/8. So α = (5/8)a3 + (3/8)a4 is maximin for
   A. The line joining (3, 8) and (7, 4) is x/2 + y/2 = 11/2, so β = (1/2, 1/2) is minimax and V = 11/2. An alternative
   maximin strategy for A is α′ = (7/10)a3 + (3/10)a1 [or any strategy of the form pα + (1 − p)α′ with p ∈ [0, 1]]
d. The maximum of the four values below is 31/5. Hence a1 is Bayes with respect to π = (2/5, 3/5), and the
   Bayes loss for π is 31/5:

   g(a1, π) = (2/5) × 2 + (3/5) × 9 = 31/5
   g(a2, π) = (2/5) × 8 + (3/5) × 2 = 22/5
   g(a3, π) = (2/5) × 7 + (3/5) × 4 = 26/5
   g(a4, π) = (2/5) × 3 + (3/5) × 8 = 30/5

Part II

Part III
(a3, b1) is an equilibrium pair because 3 > 2 [column 1] and 6 > 5 [row 3] for A and B respectively.
(a1, b2) is an equilibrium pair because 3 > 1 and 3 > 2 [column 2] and 2 = 2 [row 1] for A and B respectively.
A's pay-offs are:

1 3 2
2 1 3
3 2 1

An equaliser strategy for B using these pay-offs is β = (1/3)b1 + (1/3)b2 + (1/3)b3.
B's pay-offs are:

2 2 2
3 5 1
6 5 5

An equaliser strategy for A using these pay-offs is α = 1a1, as g_B(a1, b_j) = 2 for j = 1, 2, 3.
Since α and β are equaliser strategies using each other's pay-offs, (α, β) is an equilibrium pair.

6.4 Further Problem Sheet Questions

The answers that follow were all written by Lynda White, and have been copied into this document.

Question 19
The pairs of pure joint strategies form a rectangle and the pay-off set consists of two triangles:

(a1, b2) and (a2, b1) are both admissible and in equilibrium, but they are not interchangeable so the game is
not solvable in the strict sense.

Question 20
The pay-off set is the whole of the convex set determined by the four points representing pairs of pure strategies.
The pairs (a_i, b2) [i = 1, 2] are both equilibrium pairs and so, therefore, is any pair of the form (α, b2). Show
that there are no other equilibrium pairs by considering x = 2pq − q + 4, y = p − q + 2.
The game is Nash solvable. The security levels for A and B are 3 and 2 respectively and the Shapley solution
is (a1, b2) as (a1, b2) is the only admissible pair of strategies. Draw a diagram!

Question 21
a. i. Firstly consider the non-cooperative game. Note that if α = (p, 1 − p) and β = (q, 1 − q) then
      x = g_A = −4pq + 3p + 2q + 1 and y = g_B = 4pq − p − 2q + 3. Hence:

      p = (x + y)/2 − 2,   q = (10 − x − 3y)/(20 − 4x − 4y),   1 − q = (10 − 3x − y)/(20 − 4x − 4y)

      [provided x + y ≠ 5]. If x + y = 5, p = 1/2 and q is arbitrary.
      So p ≥ 0 and 1 − p ≥ 0 ⟺ (x, y) lies between the parallel lines formed by the join of (1, 3) and
      (3, 1), and the join of (4, 2) and (2, 4). Also q ≥ 0 and 1 − q ≥ 0 ⟺ 10 − x − 3y and 10 − 3x − y
      have the same sign.
      The jointly admissible strategies are those corresponding to the line segment joining (2, 4) and (4, 2).
      No pair of pure strategies is in equilibrium [check]. If α = (p, 1 − p) and β = (q, 1 − q) are in
      equilibrium then −4pq + 3p + 2q + 1 ≥ g_A(a1, β) = 4 − 2q ⟹ (4q − 3)(1 − p) ≥ 0 ⟹ p = 1 or
      q ≥ 3/4. Also, −4pq + 3p + 2q + 1 ≥ g_A(a2, β) = 2q + 1 ⟹ p(3 − 4q) ≥ 0 ⟹ q ≤ 3/4 or p = 0.
      Hence either p = 1, q ≥ 3/4, or p = 0, q ≤ 3/4, or q = 3/4.
      If p = 1: 4pq − p − 2q + 3 ≥ g_B(α, b1) ⟹ 2q + 2 ≥ 4 ⟹ q ≥ 1 > 3/4
      If p = 0: 4pq − p − 2q + 3 ≥ g_B(α, b2) ⟹ 3 − 2q ≥ 3 ⟹ q ≤ 0 < 3/4
      Hence q = 3/4 and g_B(α, β) = 2p + 3/2. But g_B(α, b1) = 3p + 1 and g_B(α, b2) = 3 − p ⟹ p = 1/2. That
      is, p = 1/2, q = 3/4 is the only equilibrium pair and the game is Nash solvable
   ii. If the game is played cooperatively, s_A = s_B = 5/2. The negotiation set is the line joining (7/2, 5/2) to
      (5/2, 7/2) and the Shapley solution is given by the point (3, 3) by symmetry. This corresponds to α = a1
      and β = (1/2)b1 + (1/2)b2
b.

i. Non-cooperative game: since each edge of the convex hull of the four pure strategy points can be
achieved by suitable p and q, the pay-o set is the whole of the convex hull [draw a diagram]. The
jointly admissible strategies are those corresponding to the line segment joining (2, 5) and (3, 0). The
point (2, 1) represents a pair of pure strategies in equilibrium. Using a method similar to that in 26.
show that this is the only equilibrium pair. The game is Nash solvable
ii. Cooperative game: show that sA = 2, sB = 1. The negotiation set N is the line segment joining (2, 5) to (14/5, 1) [whose equation is y = 15 − 5x] and the Shapley solution is (s1, s2) where (s1 − 2)(s2 − 1) is maximum over N. Differentiation gives s1 = 12/5, s2 = 3, which is in fact in N. The Shapley solution is α = (2/5)a1 + (3/5)a2, β = b2.
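The maximisation of (s1 − 2)(s2 − 1) along the line y = 15 − 5x can be confirmed numerically (a sketch; the interval endpoints come from the negotiation set above):

```python
# Check that (s1 - 2)(s2 - 1) with s2 = 15 - 5*s1 is maximised at s1 = 12/5.
def product(s1):
    s2 = 15 - 5 * s1
    return (s1 - 2) * (s2 - 1)

# Search over the segment s1 in [2, 14/5] (where s1 >= 2 and s2 >= 1).
grid = [2 + i * (2.8 - 2) / 10000 for i in range(10001)]
s1_best = max(grid, key=product)
print(round(s1_best, 3), round(15 - 5 * s1_best, 3))  # -> 2.4 3.0
```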

Question 22
Draw a diagram. The payoff set is a square. The admissible strategy pairs correspond to points on the line segment joining (a − b, a + b) to (a, b). The security levels for A and B are respectively 0 and b. The Shapley solution maximises s1(s2 − b) subject to a·s1 + b·s2 = a² + b². Differentiation gives s1 = a/2 and s2 = (a² + 2b²)/(2b).
If a/2 > a − b, i.e. if b > a/2, the Shapley solution is given by x = a/2 and y = (a² + 2b²)/(2b), which is in the negotiation set. Otherwise the Shapley solution is at (a − b, a + b).
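The constrained maximiser can be checked for sample values of a and b (chosen here, as an illustration, with b > a/2 so the interior solution applies):

```python
# For sample a, b with b > a/2, check that s1 = a/2, s2 = (a**2 + 2*b**2)/(2*b)
# maximises s1*(s2 - b) subject to a*s1 + b*s2 = a**2 + b**2.
a, b = 3.0, 2.0  # illustrative values with b > a/2

def objective(s1):
    s2 = (a**2 + b**2 - a * s1) / b  # stay on the constraint line
    return s1 * (s2 - b)

grid = [i * a / 10000 for i in range(10001)]  # s1 in [0, a]
s1_best = max(grid, key=objective)
s2_best = (a**2 + b**2 - a * s1_best) / b
print(s1_best, s2_best)  # -> 1.5 4.25
```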

Question 23
L1 > L2 ⟹ u(5000) > 0.1u(25000) + 0.89u(5000) + 0.01u(0), and L4 > L3 ⟹ 0.1u(25000) + 0.9u(0) > 0.11u(5000) + 0.89u(0).
Rearranging, the first gives 0.11u(5000) > 0.1u(25000) + 0.01u(0), while the second gives 0.1u(25000) + 0.01u(0) > 0.11u(5000): two contradictory inequalities. Many people would say L1 > L2 and L4 > L3, but if you believe the Lottery axioms these preferences are contradictory.
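Since the two rearranged inequalities are exact opposites, no utility function can satisfy both; a brute-force illustration over random increasing utilities:

```python
# The rearranged inequalities are exact opposites:
#   L1 > L2  <=>  0.11*u(5000) > 0.1*u(25000) + 0.01*u(0)
#   L4 > L3  <=>  0.1*u(25000) + 0.01*u(0) > 0.11*u(5000)
# so no utility function u can hold both preferences.
import random

random.seed(0)
both = 0
for _ in range(10000):
    u0, u5, u25 = sorted(random.random() for _ in range(3))  # any increasing u
    pref_12 = 0.11 * u5 > 0.1 * u25 + 0.01 * u0
    pref_43 = 0.1 * u25 + 0.01 * u0 > 0.11 * u5
    if pref_12 and pref_43:
        both += 1
print(both)  # -> 0
```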

Question 24
Plot a graph! Note that u(z) and u′(z) are both continuous at z = 1/λ.
u′(z) = 2λ − 2λ²z for 0 ≤ z ≤ 1/λ and zero otherwise. Hence u(z) is a non-decreasing function of z, reflecting the fact that people prefer more money to less. u″(z) = −2λ² for 0 ≤ z ≤ 1/λ and zero otherwise. So u(z) is a concave function of z, reflecting the fact that people tend to avoid taking risks.
Also u(z) → 1 as z → ∞, which reflects the fact that for most people there is an upper limit to the amount of money they want. High values of λ correspond to high risk aversion.
u(L1) = 0.4(20λ − 100λ²) + 0.4(10λ − 25λ²) and u(L2) = 0.1(20λ − 100λ²) + 0.9(10λ − 25λ²), since 1/λ > 10. From this we deduce that L1 > L2 iff λ < 2/35.
When λ = 0.01, 1/λ = 100 and, for 10 ≤ θ ≤ 20, 100 ≤ 10θ ≤ 200 [so u(10θ) = 1] and 50 ≤ 5θ ≤ 100 [so u(5θ) = 0.1θ − 0.0025θ²]. Hence u(L1) = 0.4u(10θ) + 0.4u(5θ) = (0.4 × 1) + 0.4(0.1θ − 0.0025θ²) and u(L2) = 0.1u(10θ) + 0.9u(5θ) = (0.1 × 1) + 0.9(0.1θ − 0.0025θ²).
u(L1) < u(L2) if 0.3 < 0.5(0.1θ − 0.0025θ²), i.e. if g(θ) = θ²/8 − 5θ + 30 < 0. But for 10 ≤ θ ≤ 20, 8g(θ) = (θ − 10)(θ − 20) − 10θ + 40 < 0, so g(θ) < 0 for θ ∈ [10, 20]. We conclude that L2 is preferred to L1 and u(L2) = 0.1 + 0.09θ − 0.00225θ², which is minimised [over θ ∈ [10, 20]] when θ = 10 or θ = 20. Direct calculation of u(L2) for these two values of θ leads to the required result.
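Both claims (the λ < 2/35 threshold and the sign of g on [10, 20]) can be verified numerically with the utility function u(z) = 2λz − λ²z² (z ≤ 1/λ), u(z) = 1 otherwise:

```python
# Check: u(L1) > u(L2) iff lam < 2/35 (first part), and
# g(theta) = theta**2/8 - 5*theta + 30 < 0 on [10, 20] (second part).
def u(z, lam):
    # utility u(z) = 2*lam*z - (lam*z)**2 for z <= 1/lam, else 1
    return 2 * lam * z - (lam * z) ** 2 if z <= 1 / lam else 1.0

def uL1(lam):
    return 0.4 * u(10, lam) + 0.4 * u(5, lam)

def uL2(lam):
    return 0.1 * u(10, lam) + 0.9 * u(5, lam)

threshold = 2 / 35
ok_below = uL1(threshold - 1e-4) > uL2(threshold - 1e-4)
ok_above = uL1(threshold + 1e-4) < uL2(threshold + 1e-4)

g_negative = all(t * t / 8 - 5 * t + 30 < 0 for t in [10 + k * 0.1 for k in range(101)])
print(ok_below, ok_above, g_negative)  # -> True True True
```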

Question 25
1. u(L) = (p × 0.67) + ((1 − p) × 0.46) > 0.59 when p > 0.62.
2. u(L′) = (p² × 0.72) + (2p(1 − p) × 0.59) + ((1 − p)² × 0.32) > 0.59 if 0.14p² − 0.54p + 0.27 < 0 [i.e. if p > 0.59].
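The two thresholds can be recovered numerically (0.13/0.21 for the linear case, and the smaller root of the quadratic for the second):

```python
# Check the thresholds: 0.67p + 0.46(1-p) > 0.59 iff p > 13/21 (~0.62), and
# u(L') > 0.59 iff 0.14p**2 - 0.54p + 0.27 < 0, whose smaller root is ~0.59.
def uL(p):
    return 0.67 * p + 0.46 * (1 - p)

def uLprime(p):
    return 0.72 * p**2 + 0.59 * 2 * p * (1 - p) + 0.32 * (1 - p) ** 2

root1 = 0.13 / 0.21                       # exact threshold for u(L)
disc = (0.54**2 - 4 * 0.14 * 0.27) ** 0.5
root2 = (0.54 - disc) / (2 * 0.14)        # smaller root of the quadratic
print(round(root1, 2), round(root2, 2))   # -> 0.62 0.59
```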

Question 26
Let X be the number of people out of the 5 who win the lottery. Then X ~ Bin(5, 2/3). The reward to each person is then (1/5)(10000X − 10000(5 − X)) = 4000X − 10000, so [with current assets of 10000] each person's final assets are 4000X. The expected utility for each person is therefore Σ_{j=0}^{5} u(4000j) P(X = j) = (1/3⁵){u(0) + 10u(4000) + 40u(8000) + 80u(12000) + 80u(16000) + 32u(20000)}.
Apart from u(0) and u(20000), we do not know any of these utilities exactly, but if we assume that the utility function is concave we have:
u(4000) ≥ 0.4u(2500) + 0.6u(5000)
u(8000) ≥ 0.4u(5000) + 0.6u(10000)
u(12000) ≥ 0.6u(10000) + 0.4u(15000)
u(16000) ≥ 0.8u(15000) + 0.2u(20000)

Hence E(utility) ≥ 0.89, which is greater than the utility [= 0.85] of each person's current assets, so it is worth each person taking part in the lottery.
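The weights 1, 10, 40, 80, 80, 32 in the expected-utility sum are just 3⁵·P(X = j) for X ~ Bin(5, 2/3); a quick check:

```python
# Check the binomial weights 3**5 * P(X = j) for X ~ Bin(5, 2/3).
from math import comb

weights = [comb(5, j) * 2**j for j in range(6)]  # C(5, j) * 2**j = 3**5 * P(X = j)
print(weights)  # -> [1, 10, 40, 80, 80, 32]
```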

Question 27
a. For xi = m, m + 1, . . . we have P(Xi = xi) = C(xi − 1, m − 1) θ^m (1 − θ)^(xi − m). By Bayes:

π(θ|x) ∝ θ^(mn + α − 1) (1 − θ)^(Σxi − mn + β − 1).

Hence the posterior distribution of θ is Beta(mn + α, Σxi − mn + β). Hence you can find the posterior mean and variance.

b. π(λ|x) ∝ λ^n e^(−λΣxi) · λ^(a − 1) e^(−λ/b) = λ^(a + n − 1) e^(−λ(Σxi + 1/b)). Hence the posterior distribution of λ is Gamma with shape a + n and rate Σxi + 1/b, and we can then find the posterior mean and variance.
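The conjugate update in part b. can be sanity-checked by confirming that the unnormalised posterior equals the claimed Gamma kernel up to a constant (illustrative values for a, b and the data):

```python
# Check that the exponential-likelihood x Gamma-prior posterior kernel matches
# Gamma(shape a + n, rate sum(x) + 1/b) up to a constant factor.
from math import exp

a, b = 2.0, 0.5        # prior Gamma shape a and scale b (illustrative)
x = [0.3, 1.2, 0.7]    # illustrative sample from the exponential model
n, s = len(x), sum(x)

def unnorm_posterior(lam):
    likelihood = lam**n * exp(-lam * s)
    prior = lam**(a - 1) * exp(-lam / b)
    return likelihood * prior

def gamma_kernel(lam):
    return lam**(a + n - 1) * exp(-lam * (s + 1 / b))

ratios = [unnorm_posterior(l) / gamma_kernel(l) for l in (0.5, 1.0, 2.0, 3.0)]
constant = all(abs(r - ratios[0]) < 1e-12 for r in ratios)
print(constant)  # -> True
```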

Question 28

π(θ|x) ∝ (1/θⁿ) · b aᵇ/θ^(b+1) for θ > a and 0 < x1, . . . , xn < θ.
These two conditions on θ can be expressed as θ > max{a, max{xi}}. The result follows immediately.

Question 29
π(β|x) ∝ exp(−(1/2) Σ(yi − βxi)²).

Now complete the square in β and show that the posterior distribution of β is normal with mean Σxiyi / Σxi² and variance 1/Σxi².
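The posterior mean Σxiyi/Σxi² is also the value of β maximising the exponent (least squares through the origin), which is easy to confirm numerically on sample data:

```python
# Check that beta_hat = sum(x*y)/sum(x*x) minimises sum((y - beta*x)**2).
xs = [1.0, 2.0, 3.0]   # illustrative data
ys = [1.1, 1.9, 3.2]

beta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def sse(beta):
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys))

# The sum of squares is a parabola in beta centred at beta_hat.
eps = 1e-3
local_min = sse(beta_hat) < sse(beta_hat - eps) and sse(beta_hat) < sse(beta_hat + eps)
print(round(beta_hat, 4), local_min)  # -> 1.0357 True
```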

Question 30
Let M be the random variable representing the smaller amount, m, and let X and Y be the amounts in your envelope and the other envelope respectively. For any m, we have:

P(X = m|M = m) = P(X = 2m|M = m) = 1/2,

since the two envelopes are allocated at random.

The posterior distribution of M given we observe X = x is given by:

P(M = x|X = x) = P(X = x|M = x)π(x) / [P(X = x|M = x)π(x) + P(X = x|M = x/2)π(x/2)] = π(x) / [π(x) + π(x/2)]

and P(M = x/2|X = x) = 1 − P(M = x|X = x) = π(x/2) / [π(x) + π(x/2)].

E(reward|swap) = E(Y|X = x) = (x/2) P(Y = x/2|X = x) + 2x P(Y = 2x|X = x)
= (x/2) P(M = x/2|X = x) + 2x P(M = x|X = x) = x[π(x/2) + 4π(x)] / [2(π(x) + π(x/2))].

You should swap ⟺ E(reward|swap) > x ⟺ 2π(x) > π(x/2).

Question 31
a. dc(x) = cx ⟹ R(θ, dc) = E_{X|θ}(cX − θ)² = c²var(X) + (c − 1)²θ² = c²θ + (c − 1)²θ². We see that R(θ, dc) ≥ c²θ > θ = R(θ, d1) when c > 1, so dc(x) is dominated by d1(x) when c > 1.

b. Under the square error loss the Bayes Rule for estimating θ is the posterior mean. The posterior density of θ is π(θ|x) ∝ e^(−θ) θ^x · θ^(α − 1) e^(−βθ).
Hence the posterior distribution of θ is Gamma(α + x, 1 + β) and E(θ|x) = (α + x)/(1 + β). Hence the Bayes Rule is dB(x) = (α + x)/(1 + β).

We have R(θ, dB(x)) = E_{X|θ}[((α + X)/(1 + β) − θ)²] = var_{X|θ}((α + X)/(1 + β)) + ((α + θ)/(1 + β) − θ)² = θ/(1 + β)² + (α − βθ)²/(1 + β)².

Hence the Bayes Risk of dB is:

E_θ[θ/(1 + β)² + (α − βθ)²/(1 + β)²] = [E(θ) + β² var(θ)]/(1 + β)² = [α/β + α]/(1 + β)² = α/(β(1 + β)),

using the fact that α − βθ has prior mean 0, so E(α − βθ)² = β² var(θ) = α.
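Both the risk expression and the Bayes risk α/(β(1 + β)) can be checked numerically for illustrative α, β (Poisson sum for the risk, midpoint-rule integration against the Gamma prior for the Bayes risk):

```python
# For the Poisson likelihood and Gamma(alpha, beta) prior (shape-rate), check
# R(theta, dB) = (theta + (alpha - beta*theta)**2)/(1 + beta)**2 by direct
# summation, and the Bayes risk alpha/(beta*(1 + beta)) by numerical integration.
from math import exp, gamma as gamma_fn

alpha, beta = 2.0, 3.0  # illustrative prior parameters

def risk_direct(theta, kmax=100):
    # E_{X|theta} ((alpha + X)/(1 + beta) - theta)^2, Poisson sum truncated at kmax
    pmf, total = exp(-theta), 0.0
    for k in range(kmax):
        total += pmf * ((alpha + k) / (1 + beta) - theta) ** 2
        pmf *= theta / (k + 1)
    return total

def risk_formula(theta):
    return (theta + (alpha - beta * theta) ** 2) / (1 + beta) ** 2

ok_risk = all(abs(risk_direct(t) - risk_formula(t)) < 1e-9 for t in (0.5, 1.0, 2.5))

# Integrate risk_formula against the Gamma(alpha, beta) prior (midpoint rule).
h = 1e-3
bayes = sum(
    risk_formula(t) * beta**alpha * t**(alpha - 1) * exp(-beta * t) / gamma_fn(alpha) * h
    for t in (h * (i + 0.5) for i in range(40000))
)
ok_bayes = abs(bayes - alpha / (beta * (1 + beta))) < 1e-4
print(ok_risk, ok_bayes)  # -> True True
```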

Question 32
[This material wasn't covered in lectures.]

Question 33
R(μ, dc) = E_{X|θ}(cX − μ)² = c²var(X) + [(c − 1)μ]², where μ = 1/θ. For the geometric distribution var(X) = μ(μ − 1), so we get:

R(μ, dc) = c²μ(μ − 1) + [(c − 1)μ]² ≤ [c² + (c − 1)²]μ².

Now:

R(μ, dc) − R(μ, d1) = (c² − 1)μ(μ − 1) + [(c − 1)μ]² > (c² − 1)μ(μ − 1) ≥ 0

for all μ ≥ 1 when c > 1. Hence dc is inadmissible for c > 1.


When μ has a prior P.D.F. proportional to 1/μ² you can find

π(μ|x) = x(x + 1)(1 − 1/μ)^(x−1) (1/μ)³

by Bayes' Theorem [and some integration to get the normalising constant]. You can then find the Bayes Rule by calculating the posterior mean of μ:

E(μ|x) = ∫₁^∞ μ · x(x + 1)(1 − 1/μ)^(x−1) (1/μ)³ dμ,

which [after an obvious substitution] gives E(μ|x) = x + 1.
However, it is easier to note that the prior distribution of θ = 1/μ is Uniform[0, 1] and so the posterior P.D.F. of θ is x(x + 1)θ(1 − θ)^(x−1). We then calculate E(μ|x) = E(1/θ|x) = ∫₀¹ x(x + 1)(1 − θ)^(x−1) dθ = x + 1.

Question 34
[This material wasn't covered in lectures.]

Question 35
R(σ², dc) = E_{X|σ²}(cΣXi² − σ²)² = var(cΣXi²) + [(cn − 1)σ²]² = [2c²n + (cn − 1)²]σ⁴, which is minimised with respect to c when c = c₀ = 1/(n + 2).
Let τ = 1/σ². Then π(τ|x) ∝ τ^(n/2) exp(−(τ/2)Σxi²) · τ^(m−1) exp(−βτ). So the posterior distribution of τ is:

Gamma(m + n/2, β + (1/2)Σxi²).

The Bayes Rule for estimating σ² is E(σ²|x) = E(1/τ|x) = (2β + Σxi²)/(2m + n − 2) [exercise].
As β → 0, dB(x) becomes cΣxi² with c = 1/(2m + n − 2) ≠ 1/(n + 2) unless m = 2. Hence the Bayes Rule becomes inadmissible unless m = 2.
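The claim that c₀ = 1/(n + 2) minimises the bracketed risk coefficient 2c²n + (cn − 1)² is easy to confirm by a grid search:

```python
# Check that f(c) = 2*n*c**2 + (c*n - 1)**2 is minimised at c0 = 1/(n + 2).
def f(c, n):
    return 2 * n * c**2 + (c * n - 1) ** 2

results = {}
for n in (1, 2, 5, 10):
    grid = [k / 10000 for k in range(1, 10001)]
    results[n] = min(grid, key=lambda c: f(c, n))

ok = all(abs(results[n] - 1 / (n + 2)) < 1e-3 for n in results)
print(ok)  # -> True
```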

Question 36
If 1/2 < θ < 3/2 we find, on folding back, that choosing box A gives expected utility θ/2 + max{−θ/2, 3/4 − 3θ/4} and that choosing box B gives expected utility (θ + 2)/4. Hence the utility of box A is less than that of box B if and only if:

max{−θ/2, 3/4 − 3θ/4} < 1/2 − θ/4.

By plotting the three relevant lines, we see that this will always be the case and we therefore choose box B. If θ > 3/2, we find that the utilities of choosing the two boxes are both θ and so it doesn't matter which box we choose.

P(R from A) = P(R from A|W = A)P(W = A) + P(R from A|W = B)P(W = B) = θ/2 + (1 − θ)/4 = (1 + θ)/4

P(Correct) = P(W = A|R from A) = P(R from A|W = A)P(W = A) / P(R from A) = (θ/2) / ((1 + θ)/4) = 2θ/(1 + θ)

Question 37
Let F denote a faulty machine, R denote the number of alarms that ring and H denote overhaul.
Without further information, the expected utility of H is 0.8 × 800 + 0.2 × 700 = 780 and that of H̄ is 0.8 × 1000 + 0.2 × 0 = 800. With perfect information the expected utility is 0.8 × 1000 + 0.2 × 700 = 940.
Hence the expected value of perfect information is 940 − max{780, 800} = 140. Since each scanning device costs 50, we can take n = 0, 1, 2.
1. For n = 1 show that P(R = 0) = P(R = 1) = 0.5.
2. For n = 2 show that P(R = 0) = 0.29, P(R = 1) = 0.42, P(R = 2) = 0.29.
Hence calculate the conditional probabilities of F and F̄ given the possible values of R for the cases n = 1 and n = 2.
Taking into account the costs of the scanning devices, the optimal decision is n = 1 [utility = 812]. One should use one scanning device and then overhaul if the alarm sounds. [See the diagram below, shamelessly copied from Lynda's notes.]
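The expected-value-of-perfect-information arithmetic can be checked directly (utilities as in the solution above, written over a common denominator to keep the arithmetic exact):

```python
# Check the expected-value-of-perfect-information arithmetic from Question 37.
u_overhaul = (8 * 800 + 2 * 700) / 10     # always overhaul: 0.8*800 + 0.2*700
u_no_overhaul = (8 * 1000 + 2 * 0) / 10   # never overhaul: 0.8*1000 + 0.2*0
u_perfect = (8 * 1000 + 2 * 700) / 10     # act knowing the machine's state
evpi = u_perfect - max(u_overhaul, u_no_overhaul)
print(u_overhaul, u_no_overhaul, u_perfect, evpi)  # -> 780.0 800.0 940.0 140.0
```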

Question 38
Let M = minor fault; S = serious fault; TM = examination indicates M; TS = examination indicates S. We have:

P(M) = 0.9          P(S) = 0.1
P(TM) = 37/50       P(TS) = 13/50
P(M|TM) = 36/37     P(S|TM) = 1/37
P(M|TS) = 9/13      P(S|TS) = 4/13

Folding back the decision tree, we see that the detailed examination should be selected if 13/50 + c < 4/10, i.e. if c < 7/50, i.e. if the cost of the examination is less than 140,000.
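The table in Question 38 can be reproduced from P(M) = 0.9, P(S) = 0.1 together with examination accuracies P(TM|M) = 0.8 and P(TM|S) = 0.2; these accuracies are an assumption inferred here, chosen because they reproduce the stated conditional probabilities exactly:

```python
# Reverse-engineer the Question 38 table: with P(M) = 0.9, P(S) = 0.1 and the
# assumed examination accuracies P(TM|M) = 0.8, P(TM|S) = 0.2, Bayes' theorem
# reproduces all the stated conditional probabilities.
from fractions import Fraction as F

pM, pS = F(9, 10), F(1, 10)
pTM_M, pTM_S = F(8, 10), F(2, 10)  # assumed accuracies

pTM = pTM_M * pM + pTM_S * pS
pTS = 1 - pTM
pM_TM = pTM_M * pM / pTM
pM_TS = (1 - pTM_M) * pM / pTS
print(pTM, pTS, pM_TM, pM_TS)  # -> 37/50 13/50 36/37 9/13
```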

[Working for Question 37:]

n = 1:
P(R = 0) = P(R = 0|F)P(F) + P(R = 0|F̄)P(F̄)
P(F|R = 0) = P(R = 0|F)P(F) / P(R = 0), &c.

n = 2:
P(R = 0) = P(R = 0|F)P(F) + P(R = 0|F̄)P(F̄)
P(R = 1) = P(R = 1|F)P(F) + P(R = 1|F̄)P(F̄)

S-ar putea să vă placă și