P.G. Babu
Indira Gandhi Institute of Development Research
Film City Road, Goregaon (East), Mumbai 400 065
email: babu@igidr.ac.in
Sentenced to Reality
People were always telling me
"You've got to live
in the real world." I heard it from
parents and teachers.
To live in the real world, like a
verdict. What terrible sin
Could these souls have committed
that their lives in this world
should begin with a verdict:
You are sentenced to reality for
life.
With no possibility of parole.
The parole is death.
-Yehuda Amichai
in Open Closed Open.
Decision Theory is a collection of models in which primitives are details about the behavior of
individual agents. An agent or an economic agent or a Decision Maker may often mean, quite
literally, an individual. But, there would be occasions when an agent might stand for family,
country, or even an entire generation. There would also be occasions when we would clone an
agent into many, each clone operating in a different circumstance and each regarded as an agent.
¹These are lecture notes; no originality, other than the organization of the material, is claimed. The first section
Hence, who is an agent might depend on the economic or decision making or social context that
you are dealing with or trying to model. We have to be careful that when we take an agent to be
a group of individuals, the assumptions we might impose on that entity will be distinct from those
that we might impose on a single agent.
Before you go further, pause for a second to read the situation given below carefully and answer the question. Imagine yourself to be a Mumbai doctor who is in charge of medical decision making in this town. You have been informed by the Central Institute of Viral Diseases that Chikungunya will hit your town in the coming winter and will result in 600 deaths. There are two possible vaccination programs that you can employ: the first program will save 400 people for sure, and the second will save all 600 with probability 2/3 and save none with probability 1/3. Which program do you choose?
Let us begin with the primitive preferences, which can be thought of as the mental attitude of an agent toward alternatives, independent of choice. All of us are free to mentally prefer (or dream of) a BMW over a Suzuki, even if we would never be able to actually exercise a choice between these two car models. So, preference is a mental (and hence more fundamental) concept than choice. Preferences can be defined over objects in a set, for that matter, any set X. For the time being, we will treat X merely as a set of alternatives, without specifying what its elements are.
Now, put two alternatives or objects x and y from X in front of an agent. Ask him the following question: how do you compare x and y? For each distinct pair (x, y), we can expect to see any of the following answers:
1. x is better than y, but not the reverse.
2. y is better than x, but not the reverse.
3. Neither: the agent ranks x and y the same way, or cannot rank them at all.
Now consider the same problem framed differently: one of the two vaccination programs is to be chosen by you. In the first, 200 people will die with certainty. In the second, there is a 2/3 probability that no one will die and a 1/3 probability that all 600 will die.
So far, from the given primitive ≻ and the two conditions that we had imposed, we tried to get other conditions. Now, let us try something different. Would it be possible for us to unearth other kinds of preference relations from ≻? The answer is in the affirmative. We can think of other preferences such as preferred or indifferent to (denoted by ≽) and indifferent to (denoted by ∼). Let us define them.
Definition 4 For x, y ∈ X, x ≽ y, i.e., x is weakly preferred to y, if it is not the case that y ≻ x.
Definition 5 For x, y ∈ X, x ∼ y, i.e., x is indifferent to y, if neither x ≻ y nor y ≻ x.
Remark: Again, strictly speaking, we should have indexed these preferences by the name of the decision maker. When we say x ≽ y we actually mean that the decision maker in question weakly prefers x to y; also, x ∼ y means that the decision maker in question is indifferent between x and y. The way we have written the definitions, you might get the impression that the objects x and y somehow have a life of their own and hence can decide to be indifferent to each other! That is not the case though.
Let us think about the two new definitions. The way we have defined weak preference is by ruling out strict preference one way (y ⊁ x), and indifference as the absence of strict preference both ways (x ⊁ y and y ⊁ x). This could give rise to problems when the decision maker finds it difficult to make judgements. Recall our earlier example that we used to motivate the negative transitivity condition. The agent was unable to say either (20, 1) ≻ (15, 4) or (15, 4) ≻ (20, 1). However, this inability does not make him indifferent between these two alternatives. If it did, we would run into serious trouble with the indifference relation; because then (15, 4) ∼ (20, 1) and (20, 1) ∼ (14, 3), thereby resulting in (15, 4) ∼ (14, 3) by transitivity, even though the agent strictly prefers (15, 4) to (14, 3). Thankfully, the asymmetry and negative transitivity conditions (which ruled out such decision makers' inabilities) come to our rescue, and hence we after all have well-behaved weak preference and indifference relations.
What are all the conditions on ≽ and ∼ that we can derive from their definitions (which are based on ≻) and the two assumptions that we have imposed on ≻?
Theorem 6
If ≻ is asymmetric and negatively transitive, and ≽ and ∼ are defined from ≻ as above, then the following hold true:
≽ is complete (i.e., for any x, y ∈ X, either x ≽ y or y ≽ x or both).
≽ is transitive (i.e., if x ≽ y and y ≽ z, then x ≽ z).
∼ is reflexive (i.e., x ∼ x, for all x ∈ X), symmetric (i.e., x ∼ y implies y ∼ x), and transitive (x ∼ y and y ∼ z implies x ∼ z).
If w ≻ x, x ∼ y and y ≻ z, then w ≻ y and x ≻ z.
Proof:
Completeness of Weak Preference: Recall that ≻ is asymmetric. That is, there is no pair (x, y) ∈ X × X such that x ≻ y and y ≻ x. Hence, either x ⊁ y or y ⊁ x or both. That gives us y ≽ x or x ≽ y or both, thereby satisfying the definition of completeness of ≽.
Transitivity of Weak Preference: We are given that ≻ satisfies negative transitivity. From that definition, transitivity of ≽ follows: x ≽ y and y ≽ z mean y ⊁ x and z ⊁ y, and negative transitivity (z ⊁ y and y ⊁ x imply z ⊁ x) then gives z ⊁ x, i.e., x ≽ z.
Reflexivity of Indifference: The indifference relation, ∼, is reflexive because ≻ is irreflexive (irreflexivity follows at once from asymmetry: x ≻ x and x ≻ x cannot both hold). All you have to do is to remember that indifference is the absence of strict preference on both sides.
Transitivity-look-alike construction using Indifference and Strict Preference: Given that w ≻ x and x ∼ y, we can think of any of the following possibilities: either w ∼ y, or w ≻ y, or y ≻ w. Can y ≻ w? It is impossible: negative transitivity applied to y ≻ w gives us either y ≻ x or x ≻ w; the former contradicts x ∼ y, and the latter contradicts w ≻ x by asymmetry. Also, w ∼ y is impossible, since then x ∼ y and y ∼ w would, by transitivity of ∼, give us x ∼ w, contradicting the given fact that w ≻ x. Hence, w ≻ y must be true. You can replicate the similar logic to conclude that x ≻ z.
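The properties listed in Theorem 6 can be checked mechanically on a small example. Below is a minimal sketch (the set X and the strict preference are invented for illustration): it derives the weak preference and indifference relations from the strict one exactly as in the definitions above, and then asserts the theorem's conclusions.

```python
from itertools import product

# Hypothetical strict preference on X = {a, b, c}: a > b > c (a linear order),
# given as the set of pairs (x, y) with x strictly preferred to y.
X = {"a", "b", "c"}
strict = {("a", "b"), ("b", "c"), ("a", "c")}

def weakly_preferred(x, y):
    # Definition 4: x is weakly preferred to y iff it is NOT the case that y > x.
    return (y, x) not in strict

def indifferent(x, y):
    # Indifference: absence of strict preference both ways.
    return (x, y) not in strict and (y, x) not in strict

# Completeness of weak preference: for every pair, x >= y or y >= x.
assert all(weakly_preferred(x, y) or weakly_preferred(y, x)
           for x, y in product(X, X))

# Transitivity of weak preference.
assert all(weakly_preferred(x, z)
           for x, y, z in product(X, X, X)
           if weakly_preferred(x, y) and weakly_preferred(y, z))

# Indifference is reflexive, symmetric, and transitive.
assert all(indifferent(x, x) for x in X)
assert all(indifferent(y, x) for x, y in product(X, X) if indifferent(x, y))
assert all(indifferent(x, z)
           for x, y, z in product(X, X, X)
           if indifferent(x, y) and indifferent(y, z))
print("All Theorem 6 properties hold on this example.")
```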
There are quite a few things to think about now. To begin with, introspect about the relation between asymmetry of ≻ and completeness of ≽. Also, carefully consider the relation between negative transitivity of ≻ and transitivity of ≽.
Recall now from mathematics that any relation that satisfies reflexivity, transitivity and symmetry qualifies to be called by a special name: equivalence relation. You can see that indifference is an equivalence relation. Also, recall that any equivalence relation divides the given set X into mutually exclusive equivalence classes, whose union gives us back the set X. The indifference relation, being an equivalence relation, can do the same thing. It can divide a given set X into mutually exclusive indifference classes, whose union gives us back the set X. So, if you take any object in the set X, it should necessarily lie in exactly one of these mutually exclusive indifference classes. Also, if we take the intersection of any two arbitrary indifference classes, we ought to get an empty intersection. You might wonder now whether that is the reason why we learn in our elementary textbooks that indifference curves do not intersect.
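The partition property just described is easy to see concretely. A small sketch (the alternatives and the utilities that induce the indifference judgements are made up): group a set into indifference classes and check that the classes are mutually exclusive and that their union gives back the whole set.

```python
# Hypothetical alternatives with utilities; two alternatives are indifferent
# exactly when they have the same utility, which is an equivalence relation.
utility = {"w": 1, "x": 2, "y": 2, "z": 1}

# Group the set into indifference classes: one class per distinct utility level.
classes = {}
for alt, u in utility.items():
    classes.setdefault(u, set()).add(alt)
indifference_classes = list(classes.values())

# Mutually exclusive: distinct classes have empty intersection.
for i, c1 in enumerate(indifference_classes):
    for c2 in indifference_classes[i + 1:]:
        assert c1 & c2 == set()

# Exhaustive: the union of the classes gives back the whole set.
assert set().union(*indifference_classes) == set(utility)
print(sorted(sorted(c) for c in indifference_classes))  # → [['w', 'z'], ['x', 'y']]
```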
If your imagination is fertile, you can visualize these indifference classes to be nothing more than those good-looking indifference curves of our elementary economics textbooks, except for the nagging feeling as to why they should be so good looking. If you have such a nagging feeling, you can be sure that your intuition is working fine, because there is nothing that we have done so far which would give us such ultra-thin concave curves. To get such a shape, you can be sure that we need to put in a lot more hard work (a euphemism for additional assumptions or conditions).
Let us dispense with one other related misconception that might arise. If I give you a finite set of people, with you and me thrown in, and the strict relation taller than, you can easily verify that such a relation will satisfy asymmetry (if you are taller than me, I cannot be taller than you, or can I?) and negative transitivity. Suppose I asked you: would the operation of the taller than relation in such a finite set always lead us to a unique answer? In other words, does a strict preference relation always give us a unique preference-optimizing answer? In the above example, it could so happen that we both are taller than everyone else and, in addition, I am not taller than you and you are not taller than me. If so, maybe we are of the same height, in which case we both would be answers.
What is the source of the misconception? Perhaps, in our tunnel vision focused only on the strict of strict preference, we for a moment have forgotten that we can indeed derive indifference from strict preference, and unless we put additional technical conditions on preferences (say, the average of our heights ought to be strictly taller than our respective heights), there is no reason to believe that strict preference would lead us to a unique answer.
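The taller-than point can be made concrete. In the sketch below (the names and heights are invented), the strict relation alone leaves two maximal elements, so no unique "answer" emerges:

```python
# Hypothetical heights; "taller than" is the strict relation: asymmetric and
# negatively transitive, but ties in height are possible.
height = {"you": 180, "me": 180, "ann": 170, "bob": 165}

def taller_than(a, b):
    return height[a] > height[b]

# Maximal elements: those to whom nobody is strictly taller.
maximal = {p for p in height if not any(taller_than(q, p) for q in height)}
print(maximal)  # both "you" and "me": the strict relation gives no unique answer
```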
You can also show that x ≽ y if and only if x ≻ y or x ∼ y. How? x ≽ y iff y ⊁ x iff x ≻ y or x ∼ y. Also, x ≽ y and y ≽ x implies x ∼ y (from the definitions of ≽ and ∼ using ≻).
Let us now pose the converse question. Can we take weak preference, ≽, as our primitive and derive strict preference and indifference from that alternative primitive? If the answer is yes, you can use either of them. The next section takes up this issue.
Given that we are now going to begin using weak preference, ≽, as our primitive, let us define it using completeness and transitivity conditions.
Definition 7 The preference relation ≽ is rational (also called a weak order or a complete preorder) if it possesses the following properties:
Completeness: For all x, y ∈ X, we have x ≽ y or y ≽ x or both.
Transitivity: For all x, y, z ∈ X, if x ≽ y and y ≽ z, then x ≽ z.
As I told you before, the completeness axiom is not an innocent one either. Think, for example, of a bouquet delivered at your hostel room and another delivered at your parental home. We are in effect saying that you are capable of comparing both of them.
Transitivity, however, is even more fundamental to us; much of economic and decision theory would perhaps fall flat without it. Let us understand the spirit behind it using the following money pump argument, which we owe to Menahem Yaari (this is also known as the Dutch Book Argument in the economics-of-uncertainty literature). Suppose you are the type for whom the following is true: x ≻ y, y ≻ z and z ≻ x. Then, I can loot you using the following scheme. If you have x in your hand, I will offer you z provided you give me Rs. 10. You will gladly do so, as you prefer z to x. Now I will say, "Great. Give me another ten and I will exchange z for y." You will again oblige. Now, I will offer you x again. I am richer by Rs. 30 and you are back with x again. More importantly, I am ready to operate the money pump all over again. In the uncertainty literature, they call this phenomenon making book against oneself, which is a nice-sounding euphemism for being a fool.
Having heard this story and its bottom line, all of us would be ashamed of violating transitivity, and even if we do violate it unknowingly, when it is pointed out we would jump to correct it, albeit with a red face. That behavior perhaps explains why we call ≽ with completeness and transitivity a rational preference relation.
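The money pump can be simulated in a few lines. The sketch below (the objects, the order of offers and the Rs. 10 fee are as in the story above) simply traces the three trades:

```python
# A decision maker with the cyclic strict preferences x > y, y > z, z > x
# (the violation of transitivity described above). The trader repeatedly
# offers a swap to a strictly preferred object for Rs. 10.
prefers = {("x", "y"), ("y", "z"), ("z", "x")}

holding, trader_profit = "x", 0
offers = ["z", "y", "x"]  # each offer is strictly preferred to the current holding
for offer in offers:
    assert (offer, holding) in prefers  # the DM gladly accepts the trade
    holding = offer
    trader_profit += 10

print(holding, trader_profit)  # back to "x", and the trader is richer by Rs. 30
```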
Despite all the convincing arguments above as to why we need to adhere to transitivity, that is one axiom we often violate. The earlier (Kahneman-Tversky) framing example would violate it. You can think of the following majority opinion situation as well. Say that you can be in one of three different moods when you get up in the morning. In mood 1, x ≻₁ y ≻₁ z; in mood 2, z ≻₂ x ≻₂ y; and in mood 3, y ≻₃ z ≻₃ x. Define your preference relation in the following manner: x ≻ y if you prefer x to y in the majority of the moods. That would again give you intransitivity. So, you now know what you are ruling out with transitivity.
Let us now derive the strict preference (≻) and indifference (∼) relations from the weak one.
Definition 8 We will say that x ≻ y if it is not the case that y ≽ x. Now, x ∼ y if both x ≽ y and y ≽ x.
Pause for a moment now and contrast the "x ≽ y and y ≽ x" requirement for indifference derived from weak preference with that of indifference derived from the strict one. As you would expect, negative transitivity of strict preference³ corresponds to transitivity of weak preference, and asymmetry of strict preference corresponds to the notion of completeness of weak preference. We are ready to prove the following result.
Theorem 9
If ≽ is rational (viz., it satisfies completeness and transitivity), and ≻ and ∼ are defined from ≽ as above, then the following are true:
1. ≻ is irreflexive and negatively transitive.
2. ∼ is reflexive, symmetric and transitive.
³In case you find it difficult to visualize, think of the contrapositive definition of negative transitivity of strict preference.
Proof:
Indifference: The reflexivity of indifference follows definitionally from the completeness of weak
preference. Similarly, transitivity of indifference follows from the transitivity of weak preference.
Symmetry follows from the definition of indifference relation.
Given these results, we learn that we can use either strict preference or weak preference as our primitive; having shown the equivalence, henceforth we will use weak preference. Once we have the preference orderings, on a pragmatic note, we wonder if there could be a real-valued function representation of the preferences. Pragmatism arises here because at times it might be easier to work with real-valued functions rather than with the preferences themselves. The question now is: can preferences be represented by real-valued functions, which in the economics literature are popularly known as utility functions? We will postpone this question to a later lecture.
Remark: At some point you might have wondered about our (over)emphasis on the individual
in individual preference relations. A natural question that might arise relates to interpersonal
comparison of preferences or alternatively utility functions that represent those preferences. It is a
valid question if you are thinking about ideas such as collective choice and justice. However, it is
beyond our current goals of exposition.
Let us now think about what we mean by a decision problem under certainty. In this kind of decision problem, there are various actions available to a Decision Maker (DM henceforth), and each of these actions can lead to only one outcome. For example, if you go to a restaurant and order Penne Arrabbiata, that is what you would get on your plate. Each action has one and only one outcome. The question then is the choice of action. That in turn would depend on the ranking of the corresponding outcomes. And you can do that ranking thanks to your knowledge of preferences. Think of the following.
Action 1    Outcome A1
Action 2    Outcome A2
...         ...
Action n    Outcome An
If you are given a table like this, you would perhaps rank-order the outcomes and then choose the action that leads to the most preferred outcome. For a second, assume that there are only five actions and hence five outcomes, which are 1, 2, 3, 4 and 5. Which is the best? With completeness of preferences over these outcomes, you would be able to compare any two of them at a time. With transitivity of preferences, you would be able to organize all the outcomes into a nice linear ordering, with the most favored at the top and the least favored at the bottom. Of course, you can do so as long as you have a finite number of outcomes/actions.
Suppose you have weak preferences to begin with. Then, from weak preferences you can derive the indifference relation. Use the indifference relation to group all outcomes among which you are indifferent. Then, you will have organized the various outcomes into different indifference classes. Now, we also know how to derive strict preference from weak preference. Hence, you can order these indifference classes using the strict preference relation. Why are we insistent on strict preference relations? The transitivity of weak preferences implies transitivity of strict preferences and hence rules out the possibility of strict preferences being cyclical.
Now, a transitive, acyclical preference relation can be numerically represented by any set of numbers that preserves the inherent ordering. We will come back to this representation by numbers, or utilities, in the next section. By the transitivity of the ordering on the real numbers, you can always go to 5, which would be the best, and 1, which would be the worst. If so, then you will choose action 5, which will give you the desired outcome (best choice) 5.
In the same manner, if you are given three actions possible, viz., eat apples, eat oranges and
eat bananas, again using your preferences over outcomes, viz., apples in your taste bud, oranges in
your taste bud and bananas in your taste bud, you can decide what is the best outcome and then
correspondingly choose the relevant action. For example, if apples are most preferred as per your
taste bud, then you will choose the action eat apples.
If you thought all decision problems under certainty are easy to solve, you need only remind yourself of the Traveling Salesman's Problem. The salesman has to visit, let us say, 20 cities, starting from a base city and returning to the base city, visiting every other city exactly once. The distances from one city to another are completely known. All you need to do is to find the shortest possible route that completes the tour. As the number of cities grows, the time needed to solve this problem grows explosively, and finding an efficient algorithm for it is still an open problem.
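To get a feel for why the brute-force approach explodes, here is a sketch of exhaustive tour enumeration on a tiny invented instance (city names and distances are hypothetical); for n cities it examines (n−1)! candidate tours:

```python
from itertools import permutations

# Brute-force Traveling Salesman on a tiny instance (distances are made up).
# Checking every tour is fine for 4 cities but infeasible as n grows.
dist = {("base", "A"): 2, ("base", "B"): 9, ("base", "C"): 10,
        ("A", "B"): 6, ("A", "C"): 4, ("B", "C"): 3}
dist.update({(b, a): d for (a, b), d in list(dist.items())})  # make symmetric

def tour_length(order):
    route = ["base", *order, "base"]  # start and end at the base city
    return sum(dist[a, b] for a, b in zip(route, route[1:]))

# Enumerate all orderings of the non-base cities and keep the shortest tour.
best = min(permutations(["A", "B", "C"]), key=tour_length)
print(best, tour_length(best))  # shortest tour has length 18
```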
This section summarizes all the basic introductory details relevant to a typical decision theory course.
It is usual to divide decision theory into two major parts: (i) Descriptive Decision Theory and (ii) Normative Decision Theory. The former focuses on how decisions are made by ordinary mortals like me, and the latter focuses on how decisions ought to be made by rational agents (such as you philosophers). This distinction is not always useful, as information about real-world agents' decision-making behavior might be relevant to prescriptions about how decisions ought to be made. You could also study senior business executives such as George Soros to make sense of their decision-making styles, in the hope of learning something that might come in handy while advising ordinary people as to how they ought to make their business decisions. Also, in economics itself, no one really takes the notion of homo economicus in the normative sense, or for that matter in the descriptive sense. For us economists, it is merely useful as an idealization in our models. Maybe I am too blithe here. Be that as it may, what makes more sense is to focus on experimental and abstract types of decision theories.
We would also distinguish between group decision making and individual decision making. Elections, for example, are group decisions. Decisions by elected officials such as Presidents could be either individual (as in President Bill Clinton's decision in the Monica Lewinsky affair) or they could be categorized as group (as in Clinton's efforts towards the Palestinian peace process). You could ask at this point whether game theory falls under individual decision making or group decision making. While it involves more than one player, it will not count as group decision making, as each agent in a game chooses an action to further his or her own objectives. Unlike in group decisions, no effort will be made to develop a policy that applies to all the participants in the game. Think of two big supermarkets, X and Y, who have to decide on when to organize a sale. Obviously, a sale by one might mean fewer sales for the other. However, when they make such decisions in a game-theoretic setup, they decide independently, each expecting that it is in the other's interest to declare a sale as well. You could have a situation where both of them might sit down together and collectively decide how to sequence a sale. That of course would be a group decision. But that is not really a noncooperative game situation.
Group decision theory focuses on the development of common policies and the joint distribution or redistribution of resources throughout the group. Individual decision theory focuses on how individuals can best pursue their self-interest, even if doing so does not satisfy rational, social or moral principles, or even if it goes against their group's interests (whichever group identity they have) or, for that matter, their own welfare (think of suicide, for example, or drug addiction).
A decision, whether made by a group or an individual, involves a choice among one or more actions, each of which can produce one of several outcomes, depending on the state of the world. Let us understand what a state of the world is. Imagine that you are contemplating going to a beach. What could be the state of the world? It could be rain or shine. You can also think of cloudy. Whatever the state of the world may be, it is not in your hands to decide which of them would occur. It is beyond your control. But within a state of the world, you know everything that is to be known. A state of the world is a complete characterization of the relevant world, which in turn is relevant to your model. For example, if you are thinking about oil prices being high or low as possible states, perhaps the Saudi King's health might be important for which state occurs. If you are thinking about climate change, a butterfly flying in New York might make a difference to weather patterns. Whatever it is, you should be able to define the states of the world relevant to your problem exhaustively and mutually exclusively. The latter means there cannot be overlapping states.
Another analogy might help you. Think of going to a multi-cuisine restaurant, and consider the situation where each day one expert chef turns up. Even though this restaurant has a menu book with different pages corresponding to different cuisines such as French, Italian or Indian, no one knows which chef would turn up that particular day: it could be the French chef, the Italian chef or the Indian chef. So, within a given menu page you know all that would be available, including prices, etc., but you do not know which state of the world (or which chef) would appear. In the absence of that knowledge, you have to go prepared with your order for each one of the chefs; that is your contingency plan for each state of the world.
Think now of entering a dark room that smells of liquified petroleum gas (LPG). You might want to switch on the light. That act might lead to a big explosion that might kill you. Alternatively, you could consider not switching on the light. If you do not switch on, there is unlikely to be an explosion. If you switch on, there could be two possible outcomes: an explosion can happen or it might not happen. Your decision here is not a straightforward one, as you are not certain about the explosion. It might depend on the amount of LPG in the air in that dark room. Hence, the outcome of your action depends on the state of the world, viz., the LPG in the environment in which your action takes place. Hence, as we said, any decision involves three components: acts, states, and outcomes. When analyzing a decision problem, the DM or the analyst must determine the relevant set of acts, states and outcomes that characterize that decision problem. In our example, acts could have been come back the next day, go get a flashlight, or leave the door open so as to ventilate. Outcomes could have been no damage, moderate damage, and severe damage. States could have been the LPG-to-air ratio, which allows an infinity of states, as there are infinitely many ratios between 0 and 1. However, the simplest characterization of the decision problem is the best and what you should aim for.
Let us try and formulate the above story as a decision table. The rows correspond to actions and the columns of the table correspond to states of the world.

                    Explosive Gas Level    Non-explosive Gas Level
Switch on           Explosion              No Explosion
Do not switch on    No Explosion           No Explosion
As we said earlier, the problem specification is all-important. For that to happen, the states must be mutually exclusive and exhaustive. For example, if we have No Gas, Some Gas and More Gas as states of the world, we would not be able to say whether the second and third states of the world, viz., some and more, exclude each other.
Let us now begin to look at various decision rules. The rules with which we will start deal with those situations where the DM has to take decisions under ignorance. Hence, in these situations safety first would work well for the DM. All the rules in this section will have that flavor.
6.1
Maximin Rule
This is the most conservative of the rules. Look for the minimum payoff corresponding to each act (look in the row corresponding to a given action) and choose the act whose minimum is the maximum of all such minimums. This rule leads us to maximize the minimum. It is best to work with an example.
[Payoff table: actions A1–A4 (rows) against states S1–S4 (columns); the entries were not preserved, apart from a −1 in A2's row.]
Here the minimum corresponding to action A1 is (look for the minimum payoff in the row corresponding to A1) 0. Similarly, for A2 it is −1. You can now find the minimum for the other actions. The maximum of these minimums happens to be 3, which corresponds to action A4. Hence, A4 is the action that maximizes the minimum. If there are two or more acts whose minimums are maximal, this rule counts them all as equally good.
In case you want to do better than that, you then need to use what is known as the Lexical Maximin Rule. Here, if you end up with ties when you use the maximin rule, cross out all the rows except the tied ones. After that, cross out the minimum numbers that you have, and compare the next lowest. If that also leads to ties, then repeat the procedure by crossing out the minimum you have and comparing the next lowest. Again, the best way to see it is to use an example.
[Payoff table: actions A1–A4 (rows) against states S1–S5 (columns); the entries were not preserved.]
Here, if you use the maximin rule, you will have to choose all the actions, as the minimum happens to be 0 for all actions. Now, given that the tie involves all the actions, you cannot strike out any action or row. So, strike out the zeros and choose the next lowest number. It happens to be 1. Again, all the actions lead to the same minimum, viz., 1. So, you again have the same kind of tie between all four actions. Now, strike out 1 and move to the next lowest number. That turns out to be 3, again the same for all rows. Now, strike out 3 and look for the next minimum. That will be 4 for rows A1 and A4, and 2 for rows A2 and A3. Hence, you are left with a tie between actions A1 and A4.
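Lexical maximin amounts to comparing each act's payoffs sorted from worst to best, lexicographically; plain maximin looks only at the first entry of that sorted vector. A sketch with an invented two-act table in which plain maximin ties at 0:

```python
# Lexical maximin: compare each act's payoffs sorted from worst to best,
# lexicographically. Hypothetical payoffs with a tied minimum of 0.
payoffs = {"A1": [0, 1, 3, 4, 9],
           "A2": [0, 1, 3, 3, 9]}

def lexical_maximin(table):
    # Python compares lists lexicographically, which is exactly the
    # "cross out the shared minimums and compare the next lowest" procedure.
    best_key = max(sorted(row) for row in table.values())
    return {act for act, row in table.items() if sorted(row) == best_key}

print(lexical_maximin(payoffs))  # → {'A1'}: after the shared 0, 1, 3 the tie
                                 # is broken by 4 versus 3
```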
This rule picks out the best of the worst possible scenarios. You will use it perhaps when
potential losses are enormous.
Think now of the following decision situation.
        S1      S2
A1      1.50    1.75
A2      1.00    10000
Here, the maximin rule would pick action A1. However, by choosing action A2, you might lose 0.50 if S1 happens, but will get 10000 if S2 happens. The maximin rule fails to take advantage of small losses and big gains. This motivates our next decision rule.
6.2
Minimax Regret Rule
This decision rule focuses on missed opportunities rather than on the worst possibilities that we might have to face. For example, in the above decision table, if the DM chooses A1 and state S2 happens, he misses an opportunity to gain 9998.25, whereas he misses an opportunity to gain only 0.50 if he chooses A2 and state S1 happens. Let this missed opportunity be called regret.
For the above decision table, let us calculate a regret table as follows. In order to get the regret numbers, remember to subtract each payoff in each box of the decision table from the maximum number in its column. I repeat, in the column.
        S1      S2
A1      0       9998.25
A2      0.50    0
Now pick the action for which the maximum regret is minimal; that is, minimize the maximum regret. It happens to be action A2 (its maximum regret is 0.50, against 9998.25 for A1).
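The regret computation for the table above can be sketched as follows (the payoffs are the ones given in the text):

```python
# The decision table above: payoffs by act, one entry per state.
payoffs = {"A1": [1.50, 1.75],
           "A2": [1.00, 10000]}

def minimax_regret(table):
    acts = list(table)
    n_states = len(next(iter(table.values())))
    # Regret = column maximum minus the entry (subtract within each column/state).
    col_max = [max(table[a][s] for a in acts) for s in range(n_states)]
    regrets = {a: [col_max[s] - table[a][s] for s in range(n_states)] for a in acts}
    # Choose the act(s) whose maximum regret is minimal.
    best = min(max(r) for r in regrets.values())
    return regrets, {a for a, r in regrets.items() if max(r) == best}

regrets, chosen = minimax_regret(payoffs)
print(regrets)  # A1's regrets: [0.0, 9998.25]; A2's: [0.5, 0]
print(chosen)   # → {'A2'}
```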
Given two decision tables, one an ordinal transformation of the other, the maximin rule picks the same set of acts in both tables. However, the minimax regret rule does not do so. Let us see how using the following example. Consider the following table.
        S1    S2
A1      3     5
A2      1     7
Here the maximin rule will pick action A1. See for yourself.
The regret table for this decision table would look as follows (with the explicit regret calculations):

        S1            S2
A1      3 − 3 = 0     7 − 5 = 2
A2      3 − 1 = 2     7 − 7 = 0
Here the minimax regret rule will pick both actions A1 and A2 .
Let us now do an ordinal transformation of the above decision table: change 1 to 3, 3 to 7, 5 to 11, and 7 to 16, to get the new decision table.

        S1    S2
A1      7     11
A2      3     16
The maximin rule will continue to pick action A1 . So, ordinal transformation does not affect
the working of the maximin rule.
Let us recalculate the regrets for this ordinally transformed decision table. The new regret table will be:

        S1            S2
A1      7 − 7 = 0     16 − 11 = 5
A2      7 − 3 = 4     16 − 16 = 0

The maximum regrets are now 5 for A1 and 4 for A2, so the minimax regret rule picks A2 alone: the ordinal transformation has changed the answer.
Now consider instead the positive linear transformation 2u + 1 of the original table: change 1 to 3, 3 to 7, 5 to 11, and 7 to 15.

        S1    S2
A1      7     11
A2      3     15

The corresponding regret table is:

        S1            S2
A1      7 − 7 = 0     15 − 11 = 4
A2      7 − 3 = 4     15 − 15 = 0

Here, both actions A1 and A2 are again picked
by the minimax regret rule. Why? Recall how we calculate the regret: the regret R equals the maximum in that column, Max, minus the element in that box, say u. Now transform the payoffs using the positive linear transformation a·u + b, a > 0. Then the maximum becomes a·Max + b, and the new regret R′ will be (a·Max + b) − (a·u + b), which equals a·(Max − u) = a·R. Hence, R′ = a·R with a > 0. As a result, the transformation has preserved the original ordering of the regrets, and with it the answer given by the minimax regret rule.
What is so special about positive linear transformations? Consider the difference between two utilities, say 10 and 8, which is 10 − 8 = 2, and also the difference between the positive linear transformations of those two numbers: (a·10 + b) − (a·8 + b) = a·(10 − 8) = 2a. So, this kind of transformation preserves the distances between utilities or payoffs up to the common positive scale factor a, and in particular preserves comparisons between such distances.
You could then criticize the minimax regret rule on the ground that it demands a more refined preference structure than is needed for, say, the maximin rule, because such preservation of distances or intervals between utility numbers is not one of the usual conditions we have on preferences.
Another criticism is something that came up during the lecture. Consider the following decision
tables and calculate the regret tables for yourselves.
[Decision table: actions A1 and A2 against states S1–S3; the entries were not preserved.]
Here, it turns out that action A1 is the minimax regret action. Now, to the same decision table,
add another action A3 and get the following decision table.
[Decision table: actions A1–A3 against states S1–S3; the entries were not preserved.]
As per the minimax regret criterion, A3 is not any better than the other two actions. However, its presence seems to change the answer. Now, if you calculate, you will get A2 as the minimax regret action. So, you could argue that the addition of an irrelevant action has changed the decision.
6.3
Optimism-Pessimism Rule
We have thought about the maximin rule. Why not a maximax rule: maximize the maximums? Perhaps that is too much optimism to indulge in, given that we are supposedly considering suitable decision rules for acting under complete ignorance. Maybe we can reach a compromise between these two extremes of pessimism (the maximin rule) and optimism (the maximax rule).
Let MAX be the maximum corresponding to an action, and MIN be its minimum. Combine
these two as, a.MAX + (1 a)MIN, where 0 a 1. Here, when a = 0 you will get the MIN,
and when a = 1, you will get the MAX, as special cases. Basically, the parameter a works as an
optimism/pessimism index. Who would choose a? You, as a Decision Maker (DM) will, as it is
your optimism/pessimism index.
Once you fix your a, calculate a·MAX + (1 − a)·MIN for all actions, and then choose the action
for which this weighted value is maximal. Let us calculate it in an example.
        S1    S2    S3
A1      10     …     0
A2      −1     …    11

(The middle entries were lost in transcription; the rule below uses only each row's maximum and minimum, which can be recovered from the calculations that follow.)
Let us use the optimism-pessimism index a = 0.5. For action A1, we get a·MAX +
(1 − a)·MIN = (0.5 × 10) + (0.5 × 0) = 5. Similarly, for action A2, we get (0.5 × 11) + (0.5 × (−1)) = 5.
So the two actions tie: both A1 and A2 are maximal.
Now change a to 0.8. Then we get 8 for A1 and 8.6 for A2, and the rule picks action A2.
When a = 1, you are using maximax rule and when a = 0 you are using maximin rule.
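The rule is simple enough to sketch in a few lines; the payoffs below are just the row maxima and minima recoverable from the example above, which is all the rule looks at.

```python
def hurwicz(payoffs, a):
    """Optimism-pessimism value: a*MAX + (1 - a)*MIN, for 0 <= a <= 1."""
    return a * max(payoffs) + (1 - a) * min(payoffs)

A1, A2 = [10, 0], [11, -1]                   # row extremes from the example
print(hurwicz(A1, 0.5), hurwicz(A2, 0.5))    # both 5.0 -> a tie at a = 0.5
print(hurwicz(A1, 0.8), hurwicz(A2, 0.8))    # 8.0 and about 8.6 -> A2 wins
```

Note that a = 1 recovers the maximax rule and a = 0 the maximin rule, as the text says.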
What are the problems with this rule? We are leaving too much to the Decision Maker (DM),
and there is no hope of consistency here. Even the same DM may not use the same index a all
the time, and hence you are likely to see wide variations in decisions even by the same agent over
time. We could also take any decision we want and rationalize it by choosing a suitable
optimism-pessimism index. Hence, it is hard to defend it as a rule that rational decision makers use. Again,
this rule preserves order under a positive linear transformation, as does the minimax regret rule.
6.4 Principle of Insufficient Reason
This comes from probability theory. If I do not know anything about the states of the world, then I
need to treat all the states as equiprobable, that is, as if they could occur with equal probability. There
is no reason to believe that the worst state has a higher probability of occurrence, or for that matter
the best state. If you are really ignorant, then you need to treat all the states as equiprobable and
also as independent of your choices. Consider the following example.
        S1   S2   S3   S4   S5
A1       5    7    2    1   10
A2       …   10    …   20    …
A3       …    …    …    …    …

(The rows for A2 and A3 were partly lost in transcription; A1's row is recovered from the calculation below.)
Now, given the above principle, you need to treat all states of the world as equiprobable.
Hence, you would assign probability 1/5 to each state; that is, the probability of each state
occurring is 1/5, as per the principle of insufficient reason.
Treat the numbers in the boxes as utilities. Then you can calculate the expected utility as per the
following formula:
E(u) = p1 u1 + p2 u2 + . . . + p5 u5
for each row/action.
That is, for each action, you are looking at the utility values of outcomes for each state and
multiplying each outcome by the probability of that state occurring and adding up across states.
And then, choose that action that has the maximum expected utility. We will later go into the
details of the expected utility theory. Right now, we are pretty much blindly applying the formula.
The expected utility if you take action A1 will be 5, for A2 it will be 8, and for A3 it will be 3.
How did we get this? For example, in the case of A1: (outcome in State 1) × (probability of State
1 happening) + ... + (outcome in State 5) × (probability of State 5 happening). That is:
(5 × 1/5) + (7 × 1/5) + (2 × 1/5) + (1 × 1/5) + (10 × 1/5) = 5. Similarly, calculate for the other actions.
You will find that the maximum expected utility happens for action A2.
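A sketch of the calculation; A1's row is the one given in the text, while the rows for A2 and A3 are hypothetical completions (the originals were lost) chosen to match the stated expected utilities of 8 and 3.

```python
def expected_utility(utilities, probs=None):
    """Expected utility; with no probabilities given, apply the principle of
    insufficient reason and treat all states as equiprobable."""
    if probs is None:
        probs = [1 / len(utilities)] * len(utilities)
    return sum(p * u for p, u in zip(probs, utilities))

actions = {
    "A1": [5, 7, 2, 1, 10],    # row from the text: E(u) = 5
    "A2": [4, 10, 3, 20, 3],   # hypothetical completion: E(u) = 8
    "A3": [3, 3, 3, 3, 3],     # hypothetical completion: E(u) = 3
}
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)   # A2
```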
This principle requires that the Decision Maker's utility be invariant to positive linear
transformations, much like the minimax regret rule and the optimism-pessimism rule. There is an
important objection to this principle of insufficient reason: there can be no justification for assuming that the states happen with equal probability. If you know nothing about the
states, why then assume that they are equiprobable? If every probability assignment is baseless,
then do not assign any probability at all. If you buy this argument, then you should look elsewhere
in order to find a rule for decision making under ignorance.
The bottom line, after considering various rules under ignorance, is that there is no one rule
that fits all situations. If you buy the criticism against the principle of insufficient reason, you
should give equal weight to the objections against the other decision rules too. There is no single rule that
dominates everything else. Also, for the same decision problem, each of these rules could potentially
pick a different answer. Even if you say that you will look at majority voting among these different
rules, you could end up in a cycle, as in our mood example. So, there is no universal recipe when
it comes to decision rules.
Before we embark on a full-blown analysis of von Neumann-Morgenstern Expected Utility Theory, let
us quickly review a few other ad hoc decision rules.
7.1 Domar-Musgrave Risk Index
According to Domar and Musgrave, Decision Makers (or investors) worry the most about
the possibility of the actual yield or payoff being less than zero. Hence, they propose a risk index

RI = −Σ_{x_i ≤ 0} p(x_i)·x_i.

Here the x_i are the rates of return or payoffs and the p(x_i) are the associated probabilities.
The higher the risk index, the riskier the investment option.
Realizing that investors would feel that they have made a mistake in their investment if
they earn less than the riskless interest rate prevalent in the market, Domar and Musgrave later
modified their risk index as follows:

RI = −Σ_{x_i ≤ r} p(x_i)(x_i − r)

for discrete outcomes x_i. Here r is the risk-free market interest rate, say that on a government
bond.
x_i       x1     x2     x3     x4     x5
         −50%   −10%     5%    50%   100%
p(x_i)    1/5    1/5    1/5    1/5    1/5
         x      p(x)
A1     −50%     0.1
A2     −10%     0.5
Using the first variant, we get the same risk index for both investment options A1 and A2.
But some decision makers would prefer A2 to A1, as A1 carries the possibility of a 50% loss. The main
disadvantage is that these indices do not take into account the differential damage of the various
negative monetary returns.
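A sketch of the first Domar-Musgrave index; the leading minus sign is my reading of the convention that a higher index means more risk.

```python
def risk_index(outcomes, probs):
    """Domar-Musgrave risk index: minus the expected value of the
    non-positive outcomes, so that a higher index means a riskier option."""
    return -sum(p * x for x, p in zip(outcomes, probs) if x <= 0)

# Options A1 and A2 from the table: a single possible loss each.
print(risk_index([-0.50], [0.1]))   # 0.05
print(risk_index([-0.10], [0.5]))   # 0.05 -> same index, yet only A1 risks a 50% loss
```

The equal indices illustrate the complaint in the text: the index is blind to how damaging the individual negative returns are.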
7.2 Mean-Variance Rule
Here one works with the mean E(x) = Σ_i p(x_i)x_i and the variance Var(x) = Σ_i p(x_i)(x_i − E(x))².
You first calculate the means and choose whichever option has the higher mean. If the means
come out the same, then you choose the one with the lower variance.
See the following example with two investment options. Investment A gives you three outcomes
−1, 2, 5 with probabilities 1/3 each. Investment B gives you three outcomes −1, 2, 8 with
probabilities 2/6, 3/6, 1/6 respectively. If you calculate the means for both the investments, you will get
E(x) = 2 for both. So, you need to calculate the variances: they are 6 and 9 respectively. You
will conclude that investment B is riskier as it has the higher variance. On closer look, you will see
that B's higher variance comes from its outcome +8, a deviation to the right of the mean. The problem
with this index is therefore that it takes both the good and the bad deviations from the mean into
account, although only bad deviations imply losses.
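The means and variances above can be checked directly; a minimal sketch:

```python
def mean(xs, ps):
    """Expected value of a discrete distribution."""
    return sum(p * x for x, p in zip(xs, ps))

def variance(xs, ps):
    """Probability-weighted squared deviations from the mean."""
    m = mean(xs, ps)
    return sum(p * (x - m) ** 2 for x, p in zip(xs, ps))

A = ([-1, 2, 5], [1/3, 1/3, 1/3])
B = ([-1, 2, 8], [2/6, 3/6, 1/6])
print(mean(*A), mean(*B))           # both means are 2
print(variance(*A), variance(*B))   # variances 6 and 9: B looks riskier
```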
7.3 Semi-Variance
The semi-variance is SV = Σ_{x_i ≤ A} p(x_i)(x_i − A)², where A is a
constant such that earning below A is treated as a failure. In general, A can be set equal to E(x),
and hence the name semi-variance.
Again, think of two different options A and B. Investment option A will give us five different
outcomes 1, 2, 3, 4, 5 with probabilities 1/5 each. Investment option B will give us three outcomes
−1, 3, 7 with probabilities 1/20, 18/20, 1/20 respectively. If you calculate the means, you will get
3 for both. The semi-variance with the cut-off at the mean is 1 for investment A, and for B
you will get 4/5. You would conclude that investment A is riskier than B. However, if you are the
type of decision maker who treats negative income as bankruptcy, then, contrary to this rule, you might
prefer A to B.
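A minimal sketch of the semi-variance, with the cut-off defaulting to the mean:

```python
def mean(xs, ps):
    return sum(p * x for x, p in zip(xs, ps))

def semivariance(xs, ps, cutoff=None):
    """Like the variance, but only outcomes at or below the cut-off
    (by default the mean) count as 'bad' and enter the sum."""
    if cutoff is None:
        cutoff = mean(xs, ps)
    return sum(p * (x - cutoff) ** 2 for x, p in zip(xs, ps) if x <= cutoff)

A = ([1, 2, 3, 4, 5], [0.2] * 5)
B = ([-1, 3, 7], [1/20, 18/20, 1/20])
print(semivariance(*A))   # about 1.0
print(semivariance(*B))   # about 0.8 -> A looks riskier, though only B can go negative
```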
What is the bottom line? There is no hope of getting one single measure of risk.
Recall basic Microeconomic theory, where we defined the preferences of a typical agent i over
her consumption possibility set Xi. That setup, if you remember, did not contain any uncertainty.
In such a scenario, we looked for a real-valued function that would represent the agent's preferences.
In much the same manner, we are now going to search for a real-valued function representation of
an agent's preferences over uncertain prospects.
We, however, need to know what is the relevant space here. It helps to imagine the underlying
space, say X, to be one of monetary outcomes. In so doing, we will assume that we can attach
money values to even nebulous notions, say for example, a commodity bundle (even though it is
anything but nebulous!). It also helps to imagine that any outcome in this world (at least that part
of the world that is of interest to us) is captured by this very large but finite (a simplification we
are making to make our, viz., the modelers', lives easy) outcome space.
We can think of a risky asset as one which takes values in this finite outcome space with the
corresponding probabilities. In other words, a risky asset for us is essentially a random variable
defined on the finite monetary outcome space X. If we can think of one risky asset, so can we
depict two, three, or more. A question might cross your mind as to whether we are constraining all
these risky assets to take on the same outcome values. We do not have to do it. Simply put, if you
have a risky asset which takes only monetary outcomes 1, 2, 3 (for example) from this big space X,
then such a risky asset can be written as a random variable which attaches positive probabilities
only to outcomes 1, 2, 3 and probability zero to all the rest of the outcomes in the X space. If
that is so, then we can also depict riskless or risk free assets on the same space; a risk free asset
such as a government bond would put all the probability weight 1 on only one outcome and take
value 0 on all the rest of the elements of X. In probability, we have learnt to call such risk-free
assets (viz., those which put all probability weight on only one outcome) degenerate random
variables or degenerate probability distributions. So, what we have now is essentially a space of all
random variables (some degenerate and the rest non-degenerate) or equivalently, space of all simple
probability distributions (denote it by P (X)) defined over the monetary outcome space X. We call
these distributions simple because we are considering only finite outcomes. In the literature, such
distributions are said to have finite support, meaning that such a probability
distribution puts positive probability on only a finite number of monetary outcomes. Observe that
even if the underlying X space is finite, the probability space (or space of risky assets), P(X), will
have infinitely many elements. If you are puzzled, think of an X space with only two elements, 1 and 2, and see
how many different probability distributions you can define on this two-element set!
Now we come to the crux of von Neumann-Morgenstern theory; the relevant space on which we
are going to define preference is the space of risky assets or alternatively, space of simple probability
distributions P (X) (lest you forget, we remind you that this space also contains degenerate (or
riskless) assets as well).
We will keep using the terms risky asset or a simple probability distribution or a lottery
synonymously. If we can think of lotteries with outcomes taking values in the monetary outcome
space X, we can also imagine lotteries whose outcomes are in turn lotteries. We call such special
lotteries compound lotteries. See the examples below.
INSERT FIGURE HERE.
The moment you stare at a typical compound lottery, you realize that it can be reduced to a
simple lottery, by multiplying probabilities along each branch and summing across branches. For example, in the example
above, we have shown a compound lottery and the corresponding reduced-form simple lottery.
So any compound lottery, however complicated it looks, can be reduced to a simple lottery form,
in which case the two are identical as far as we are concerned. Discern carefully that we are sweeping an
implicit assumption under the carpet here. By saying that a compound lottery and its reduced-form
simple lottery are the same to us, we are essentially saying that neither the sequence of lotteries
nor the time taken to complete them is significant to us.
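The reduction is mechanical: the probability of an outcome in the reduced simple lottery is the branch-weighted sum of its probabilities in the sub-lotteries. A sketch with made-up numbers (the figure's own example is not reproduced here):

```python
def reduce_compound(compound):
    """Reduce a compound lottery [(weight, simple_lottery), ...] to a simple
    lottery: the probability of outcome x is sum_i weight_i * p_i(x)."""
    simple = {}
    for weight, lottery in compound:
        for outcome, prob in lottery.items():
            simple[outcome] = simple.get(outcome, 0.0) + weight * prob
    return simple

# With probability 0.5 you get outcome "a" for sure; with probability 0.5
# you face a sub-lottery giving "a" w.p. 0.4 and "b" w.p. 0.6.
print(reduce_compound([(0.5, {"a": 1.0}), (0.5, {"a": 0.4, "b": 0.6})]))
# -> {'a': 0.7, 'b': 0.3} (up to float rounding)
```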
As in consumer theory, we need to define the basic axioms that will help us get a representation
of preferences on the P (X) space. It helps to think about the reasonableness of each of these axioms
that we are going to impose now, given that we are going to live with them for the rest of our lives.
To begin, let the underlying preferences of a typical agent i be given by ≿_i on P(X), the space
of simple lotteries or simple probability distributions or assets. These preferences are given by a
binary relation ≿_i ⊆ P(X) × P(X). The initial axioms are fairly intuitive to understand.
Axiom 1: Completeness For any two assets or lotteries p and q in P(X), either p ≿_i q or q ≿_i p
or both.
All that we are saying is that any two assets should be comparable. That is the minimum we
expect from our Decision Maker (DM henceforth): that she has the ability to compare.
Axiom 2: Transitivity For any p, q, and r in P(X), if p ≿_i q and q ≿_i r, then p ≿_i r.
Axiom 3: Archimedean Axiom For any p, q, and r in P(X) such that p ≻_i q ≻_i r, there
exist numbers a and b in the interval (0, 1) such that ap + (1 − a)r ≻_i q and q ≻_i bp + (1 − b)r.
First, let us focus on the name of this axiom. The name derives from the Archimedean property
that we use in basic analysis. If x and y are positive real numbers, then x is infinitesimal with respect
to y (or y is infinite with respect to x) if for every natural number n, nx < y. Put alternatively,
however many times we keep adding x, as long as the number of times n that we add x is finite,
the sum will always be less than y. A system with the Archimedean property rules out the existence of
such infinitesimals x and infinites y.
24
In our case, let us interpret the first component. What does it say? It states that if you prefer p
to q, then mixing in a little of the least preferred lottery r with the one you like the most, viz., p, cannot make
you reverse your preference: for a close enough to 1, you will still prefer the compound lottery ap + (1 − a)r to q. In the
same manner, if you prefer q to r, a mere addition of your most preferred lottery p (with a small
probability) to your least preferred r cannot make you reverse your preference; i.e., for b close enough to 0, you will still
prefer q over the compound lottery bp + (1 − b)r. Basically, a good lottery is never so great as to
make a bad lottery (such as r) more palatable merely by being served along with it.
In the same way, a bad lottery is never so terrible as to make a good one less palatable. Hopefully,
now you can see why the name Archimedean property; in a way, that is what we are doing here
in this axiom: ruling out the presence of infinitesimals and infinites.
Now that we understand the technical part of the axiom, let us get to the intuition. Suppose p
stands for Rupees 1 million for sure, q stands for Rupees 10000 for sure and r stands for death for
sure. If you are asked to apply the axiom, you might refuse to accept it as a reasonable behavior
norm stating that no amount of money is worth your life. Alternatively, if you are offered Rupees
10000 for sure here at IGIDR and the alternative of driving to reach the Borivli train station where
you will get Rupees 1 million, most of us would by now have jumped into a car. However, such an
act involves a very small probability of death in a road accident. So, we get you to agree to the
first part of the axiom.
On the other hand, if you are offered Rupees 10000 for sure right now and the alternative of
Rupees 1 million if you successfully jog to Borivli train station on the railway tracks dodging the
oncoming trains, which one would you accept? I guess Rupees 10000 right now, given that we would
almost surely stare at death if we had to jog on a railway track (at the least, the probability of
death is pretty high)! That is the underlying spirit of the second part of the axiom.
Axiom 4: Independence Axiom For all p, q, and r in P(X) and any a ∈ (0, 1], p ≿_i q if and
only if ap + (1 − a)r ≿_i aq + (1 − a)r.
What do we make of the independence axiom? It says that if we like p better than q, then mixing in a third,
irrelevant lottery r in the same way with both p and q will not reverse the choice; alternatively put,
you will still pick the compound lottery which combines p and r over the compound lottery which
combines q and r using the same probability weights as in the previous case.
Does this axiom mean that we are ruling out complementarity? If it is true, it will certainly be
an unreasonable restriction. Let us first see if the accusation is true. Suppose there are two travel
agents p and q who are competing against each other to lure you as their customer. The travel
agent p is offering you a free ticket to Paris with probability 1 and the travel agent q is offering
you a free ticket to London with probability 1. If you think of the set of all possible outcomes
as X = (ticket to Paris, ticket to London), then the lottery p = (1, 0) and lottery q = (0, 1).
Suppose you prefer Paris over London; that would mean you would prefer lottery p to lottery q.
Now let us say that both the travel agents are giving you free tickets for London theatre with
probability 0.3 and give out their old prizes with probability 0.7. The outcome space X now is
(ticket to Paris, ticket to London, ticket to London theatre). Your lottery p will now be (0.7, 0, 0.3)
and that of q will be (0, 0.7, 0.3). Now it may appear as if independence axiom is saying that you
should still prefer p to q while actually you should be switching your preferences to lottery q as the
ticket to London and ticket to London theatre are complementary to each other. This reasoning is
not correct for the following reason. No one is offering you a ticket to London theatre in addition
to the ticket to London. Ticket to London theatre is being offered to you in place of the regular
tickets. And that is the subtle but all important difference. So, if you like Paris to London, you
should still go to p and not q. The independence axiom does not rule out the complementarity of
outcomes, but rather rules out complementarity of lotteries.
We could now cook up a different situation where independence axiom could be counterintuitive.
Instead of the original ticket, you might now get to watch a movie about Paris with a probability 0.3.
So, the outcome space X is (ticket to Paris, ticket to London, movie about Paris) and p = (0.7, 0, 0.3)
and q = (0, 0.7, 0.3). Here the independence axiom says that you will still prefer p to q. One might
be tempted to state that here you might want to change your preference. After all, if you go to
agent p and get the movie about Paris prize, then you might be sitting to watch the movie cursing
your fate - you could have gone to Paris in the real sense but here you are ... watching a movie
about Paris and not being in Paris! Here, the emergence of the possibility of a movie might lead
you to reverse your preference. However, this is ruled out by the independence axiom.
It is my experience that most people readily buy this axiom, and increasingly most do not violate
it either. Perhaps you are a convert already, or those experiments have been repeated
ad nauseam so often that everyone has learnt what to do! Whichever is the case, keep an open mind about
this axiom, as most experimental evidence militates against it. We will come back to this axiom in
great detail after we accomplish our task of getting a representation of the preferences over P (X),
as this axiom plays a fundamental role in the way the representation is obtained.
Let us now state the von Neumann - Morgenstern expected utility theorem representation of
preferences over lottery space, P (X).
Theorem 10 (von Neumann - Morgenstern Theorem)
Let P(X) be the space of simple probability distributions and let ≿_i be defined on P(X). Then ≿_i satisfies
completeness, transitivity, the Archimedean axiom and the independence axiom if and only if there is a real-valued
function U : P(X) → ℝ such that (a) U represents ≿_i (that is, for all p, q ∈ P(X), p ≿_i q ⟺
U(p) ≥ U(q)); (b) U is affine (that is, for all p, q ∈ P(X), U(ap + (1 − a)q) = aU(p) + (1 − a)U(q)
for any a ∈ (0, 1)); (c) moreover, if V : P(X) → ℝ also represents the said preferences, then
there exist α, β ∈ ℝ, α > 0, such that V = αU + β; that is, U is unique up to a positive linear
transformation; (d) also, U(p) has the expected utility form, viz., U(p) = Σ_x p(x)u(x), where u is
the Bernoulli utility function on the monetary outcomes x ∈ X.
A diagram is always helpful in understanding any new theory. Here again we will use a
simple geometric device, which dates back to Jacob Marschak and was popularized by Mark Machina.
In their honor, we call such a diagram the Marschak-Machina triangle. This is a restrictive
depiction and can be used only for lotteries or risky prospects with three outcomes. However,
it is good enough for our understanding. Consider three different outcomes X = {x1, x2, x3}.
The set of all simple probability distributions on this three-outcome set consists of the vectors
(p1, p2, p3). This set of probability distributions can be captured by the area of the
triangle below.
INSERT FIGURE ABOUT HERE.
The three corners of the triangle represent three certain events: that is, p1 = 1 is a situation
where probability of outcome x1 happening is 1. Similarly, when p2 = 1, outcome x2 happens with
probability 1 and p3 = 1 stands for outcome x3 happening for sure. Those three corners are the sure
outcomes in our story. Any other point in the triangle represents a lottery p = (p1 , p2 , p3 ). From
basic probability laws, we know that p1 + p2 + p3 = 1. Hence, we can infer that p2 = 1 − p1 − p3.
Hence, when p1 becomes zero, and p3 becomes zero, p2 should be 1. Where will this be captured
in our diagram? It will be captured by the origin, where p1 = 0 and p3 = 0. Hence, the case where
p2 = 1 is represented by the origin in this triangle. Any point on the horizontal axis would mean
that p3 = 0 and p1 > 0 as well as p2 > 0. In the same way, we can figure out that any point on the
vertical axis will represent a case where p1 = 0 and p2 > 0 as well as p3 > 0. Now, what should a
point on the hypotenuse stand for? This is the situation where p2 = 0 and p1 > 0 as well as p3 > 0.
Any point in the inside of the triangle would represent cases where p1 > 0, p2 > 0, p3 > 0.
Hence, we conclude that geometrically we can represent all the probability distributions over
outcomes x1, x2, x3 using points in the unit triangle in the (p1, p3) plane. We also assume, without
loss of generality, that outcome x3 is the most preferred outcome, and that outcome x1 is the least
preferred outcome. That is, x3 ≻ x2 ≻ x1. Hence, as we move upwards, we go closer to the
best outcome, or towards that probability distribution which puts all the probability mass on the
best outcome x3, viz., the degenerate probability distribution δ_{x3}.
Now start with any arbitrary point p = (p1 , p2 , p3 ) in the interior of the diagram. It should
be clear to you that any horizontal movement from this point to the right hand side represents
an increase in p1 at the expense of p2 , while p3 remains constant. Any vertical movement in the
upward direction would represent increase in p3 at the expense of p2 , while p1 remains constant.
Any diagonal movement will represent changes in all the three probabilities. Also, note carefully
that any leftward and upward movement means better lotteries as it takes you closer to your most
preferred outcome.
To repeat: how do you find p2? From any point in the triangle, read off p1
(from the horizontal axis) and p3 (from the vertical axis), and then compute
p2 = 1 − p1 − p3.
What are the indifference curves in this diagram? They should be the loci of the solutions to
the equation linear in probabilities (from the expected utility form of the Neumann-Morgenstern
utility functions). That is,
p1 u(x1) + (1 − p1 − p3)u(x2) + p3 u(x3) = k.
By changing the value of k you would get different indifference curves.
What should be the slope of this indifference curve? We know from the theorem that the N-M
utility function:
U(p) = Eu(x) = p1 u(x1) + (1 − p1 − p3)u(x2) + p3 u(x3).
Totally differentiate this U function to get:
dU = u(x1)dp1 − u(x2)(dp1 + dp3) + u(x3)dp3 = 0.
That in turn implies

(u(x1) − u(x2))dp1 + (u(x3) − u(x2))dp3 = 0.

On simplification, we get:

dp3/dp1 = (u(x2) − u(x1))/(u(x3) − u(x2)) > 0,

given that u(x3) > u(x2) > u(x1).
This is the slope of the indifference curve. As it is positive, indifference curves should be
upward sloping. Observe carefully that the slope expression does not have p1 or p3 . Hence, slopes
are unaffected by the changes in probability. Therefore the indifference curves would be parallel
straight lines increasing in value in the north-west direction.
Now, you should take on face value what I say: the slope of these indifference curves captures
the risk aversion of the decision maker. If you are risk averse, your indifference curves will be
steeper. If you are risk loving, your indifference curves will be relatively flat.
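The "parallel straight lines" claim is easy to check numerically; the Bernoulli utility values below are hypothetical:

```python
# Hypothetical Bernoulli utilities with u(x3) > u(x2) > u(x1).
u1, u2, u3 = 0.0, 2.0, 3.0

def EU(p1, p3):
    """Expected utility of the lottery (p1, 1 - p1 - p3, p3)."""
    return p1 * u1 + (1 - p1 - p3) * u2 + p3 * u3

# Slope of an indifference curve: dp3/dp1 = (u2 - u1)/(u3 - u2).
slope = (u2 - u1) / (u3 - u2)

# Moving from any starting point in the direction (1, slope) leaves EU
# unchanged, and the slope does not depend on the point: parallel lines.
for p1, p3 in [(0.2, 0.3), (0.1, 0.1), (0.4, 0.2)]:
    step = 0.05
    assert abs(EU(p1 + step, p3 + slope * step) - EU(p1, p3)) < 1e-12
print(slope)   # 2.0
```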
We can also compute the expected (monetary) value of a lottery: E(x) = Σ_i p_i x_i = p1 x1 + (1 − p1 − p3)x2 + p3 x3.
Then, we can also figure out what the iso-expected-value lines should be. They
are the solutions to p1 x1 + (1 − p1 − p3)x2 + p3 x3 = k, where k is a constant; by varying the
constant we get different lines. What should be the slope of these iso-expected-value lines? Going by the previous
exercise, it should be (x2 − x1)/(x3 − x2).
By the independence axiom, any convex combination of two points on the same indifference curve must be on that curve too. This will happen with linear
indifference curves. If we instead had a nonlinear indifference curve U′′ passing through p′ and
q′, then we would get ap′ + (1 − a)q′ = r′ ≻ q′, as r′ lies above U′′. Hence, nonlinear indifference curves
violate the independence axiom.
Alternatively put, any violation of the independence axiom will mess up the linearity of the indifference curves in probabilities. But that in itself should not bother us, as long as the tangents
to the nonlinear indifference curves are everywhere steeper than the iso-expected-value
lines whenever you consider a mean-preserving spread, viz., an increase in pure risk.
10 Proof of the von Neumann-Morgenstern Theorem
As I said in the lecture, you need to see the proof at least once in your lifetime. It helps to
understand how the theorem works and also provides you with a deep understanding of the role of
the different axioms, and the important role played by the independence axiom in particular.
Proof of the NM Theorem:
Part I: Axioms imply the Representation and Affinity
Lemma 1: If ≿_i on P(X) satisfies the axioms, then p ≻_i q and 0 ≤ a < b ≤ 1 imply
bp + (1 − b)q ≻_i ap + (1 − a)q.
Proof of Lemma 1: Case (i): Suppose a = 0. Then p ≻_i q and the independence axiom
give us:

bp + (1 − b)q ≻_i bq + (1 − b)q = q = ap + (1 − a)q.
Case (ii): Suppose a > 0. Let bp + (1 − b)q = r for a moment. The fact that r ≻_i q (from
Case (i), since b > 0) together with the independence axiom (note that a/b ∈ (0, 1)) yields the following:

r = (1 − a/b)r + (a/b)r
  ≻_i (1 − a/b)q + (a/b)r
  = (1 − a/b)q + (a/b)(bp + (1 − b)q)
  = ap + q − (a/b)q + (a/b)q − aq
  = ap + (1 − a)q.
Intuition: If lottery p is preferred to lottery q, and we construct two compound lotteries of p and q with
different weights, then we prefer the compound lottery which puts relatively more weight on p.
Lemma 2: If ≿_i on P(X) satisfies the axioms and p ≿_i q ≿_i r with p ≻_i r, then there exists a unique a* ∈ [0, 1] such that q ∼_i a*p + (1 − a*)r.
Proof of Lemma 2: This lemma is what makes the Archimedean axiom also known as the
axiom of continuity. Recall the intermediate value theorem from basic calculus and draw the parallels
for yourself.
1. If p ∼_i q, then a* = 1 should work and there is nothing much for us to do.
2. If r ∼_i q, then a* = 0 should work and again it is fairly straightforward.
3. If p ≻_i q ≻_i r, then define the following set:

{a ∈ [0, 1] | q ≿_i ap + (1 − a)r}.

This set is certainly nonempty, because a = 0 is an element (since q ≻_i r), and it is
bounded above by 1. Thus, there is a least upper bound for this set. Let
that least upper bound (also known as the supremum) be a*.
Intuition: Given lottery q, we can construct a compound lottery which yields the same utility as
q by appropriately combining any lottery p which is preferred to q with any lottery r to which q is
preferred.
Lemma 3: For any x ∈ X, let δ_x be the degenerate lottery at x. That is,

δ_x(x′) = 1 if x′ = x, and δ_x(x′) = 0 if x′ ≠ x.

If ≿_i on P(X) satisfies the axioms, then there exist x̄ and x̲ in X such that δ_x̄ ≿_i p ≿_i δ_x̲ for all
p ∈ P(X).
Proof of Lemma 3: If the outcome space X is finite, there are best and worst outcomes in X, say
x̄ and x̲ (by transitivity). Even if the set X is not finite, the probability simplex is a compact object
and hence maximal and minimal elements x̄ and x̲ exist.
Claim: Let p0, p1, . . . , pK be K + 1 lotteries and (λ1, . . . , λK) ≥ 0 be probability weights with
Σ_{k=1}^{K} λk = 1. If pk ≿ p0 for all k, then Σ_{k=1}^{K} λk pk ≿ p0. If p0 ≿ pk for all k, then
p0 ≿ Σ_{k=1}^{K} λk pk.

Proof of the Claim: Write

Σ_{k=1}^{K} λk pk = (1 − λK) Σ_{k=1}^{K−1} [λk/(1 − λK)] pk + λK pK.

By induction, we have Σ_{k=1}^{K−1} [λk/(1 − λK)] pk ≿ p0. Hence,

(1 − λK)[Σ_{k=1}^{K−1} (λk/(1 − λK)) pk] + λK pK ≿ (1 − λK)p0 + λK pK   (by the independence axiom)
≿ (1 − λK)p0 + λK p0   (by the independence axiom again, since pK ≿ p0)
= p0.

Hence, we get Σ_{k=1}^{K} λk pk ≿ p0.
Using similar logic, one can also prove the other case, viz., p0 pk for all k.
Coming back to the proof of Lemma 3, for each n let δ_{x_n} be the lottery that yields outcome
x_n with probability 1. Then δ_x̄ ≿_i δ_{x_n}, because both of them are sure outcomes and x̄ is a
best outcome. Let p = (p1, . . . , pN) be any lottery; it can be written as p = Σ_n p_n δ_{x_n}
(a compound lottery). As δ_x̄ ≿_i δ_{x_n} for every n, the Claim above gives δ_x̄ ≿_i p; by the
symmetric argument, p ≿_i δ_x̲.
Thus, we have got all four axioms from the representation (viz., the NM utility function U) and
its affinity property.
Part III: Uniqueness up to a Positive Linear Transformation

Suppose V : P(X) → ℝ also represents the same preferences. Let s and r be lotteries with s ≻_i r,
and for any p between them define the normalization

H ∘ V(p) = [V(p) − V(r)] / [V(s) − V(r)].

By Lemma 2, there is an a* ∈ (0, 1) such that p ∼_i a*s + (1 − a*)r. Now, replace p in the
analogous expression for H ∘ U(p) by a*s + (1 − a*)r. Then we will have:

H ∘ U(p) = [U(a*s + (1 − a*)r) − U(r)] / [U(s) − U(r)]
         = [a*U(s) + (1 − a*)U(r) − U(r)] / [U(s) − U(r)]
         = a*,

using the affinity of U. Since V is also affine, the same computation gives H ∘ V(p) = a*.
Equating the two normalizations,

[U(p) − U(r)] / [U(s) − U(r)] = [V(p) − V(r)] / [V(s) − V(r)],

and solving for V(p):

V(p) = U(p)·[V(s) − V(r)]/[U(s) − U(r)] − U(r)·[V(s) − V(r)]/[U(s) − U(r)] + V(r).

If we let α = [V(s) − V(r)]/[U(s) − U(r)] and β = −U(r)·[V(s) − V(r)]/[U(s) − U(r)] + V(r),
then we can rewrite the above expression as:

V(p) = αU(p) + β,

which is the desired expression.
Part IV: Expected Utility Form
From Part II we know that NM Utility function U exists and represents the preferences defined
on P(X) and that it is affine. Now, define the function

δ_x : X → {0, 1} as δ_x(y) = 1 if y = x and δ_x(y) = 0 otherwise.

This implies that for every x ∈ X, δ_x is a degenerate distribution, and hence δ_x ∈ P(X). Let
U(δ_x) = u(x). Now consider a distribution p = [p(x), p(y)] with a two-point support. We can then
write this distribution as a convex combination of the degenerate distributions δ_x and δ_y in the
following manner:

p = p(x)δ_x + p(y)δ_y.

Thus,

U(p) = U(p(x)δ_x + p(y)δ_y) = p(x)U(δ_x) + p(y)U(δ_y) = p(x)u(x) + p(y)u(y)
by affinity and our definitions of u(x) and u(y). Thus, more generally, any distribution p with
finite support can be written as a convex combination of degenerate distributions,

p = Σ_{x ∈ supp(p)} p(x)δ_x,

and hence

U(p) = U(Σ_{x ∈ supp(p)} p(x)δ_x) = Σ_{x ∈ supp(p)} p(x)u(x),
which is the expected utility representation of U(p). Thus, as p ≿_i q if and only if U(p) ≥ U(q) by
the NM theorem, equivalently, p ≿_i q if and only if

Σ_{x ∈ supp(p)} p(x)u(x) ≥ Σ_{x ∈ supp(q)} q(x)u(x).

And a function v : X → ℝ gives another such representation of the preferences over outcomes if and
only if there exist α > 0 and β such that v = αu + β.
In terms of more general probability distributions, the expected utility form will look like:
U(p) = ∫_X u(x) dp(x).
11 Summary
You need to know a few things very well. Let me summarize them here for you.
As I have repeatedly said, we can always find a best lottery (δ_x̄) and a worst lottery (δ_x̲).
These are nothing but the degenerate probability distributions that put all the probability weight on
the best and worst outcomes; in our notation, these will be:

δ_x̄(x) = 1 if x = x̄, and 0 if x ≠ x̄

for the best outcome x̄, and

δ_x̲(x) = 1 if x = x̲, and 0 if x ≠ x̲

for the worst outcome x̲. Also, observe that U(δ_x̄) = u(x̄) and U(δ_x̲) = u(x̲).
Any other lottery you can conceive of would always be between these best and worst lotteries.
That is, any other lottery or risky situation or simple probability distribution (I am deliberately
using all these terms, so that you get used to these synonyms), say p, would be such
that δ_x̄ ≿ p ≿ δ_x̲. Then, we know from Lemma 2 that if the two conditions
δ_x̄ ≿ p ≿ δ_x̲ and δ_x̄ ≻ δ_x̲ are satisfied, then there exists a unique weight a* ∈ [0, 1] such that

p ∼ a*δ_x̄ + (1 − a*)δ_x̲.
Let me emphasize the word unique. Given that such a weight is unique, we can use it as
Neumann-Morgenstern Utility for the lottery p. That is, the N-M utility of lottery p, viz.,
U (p) = a .
Similarly, if you have another lottery q, say, then it would also be the case that δ_{x̄} ≿ q ≿ δ_{x̲}
and δ_{x̄} ≻ δ_{x̲}, so there would exist another unique weight b* ∈ [0, 1] such that

    q ~ b* δ_{x̄} + (1 − b*) δ_{x̲}.

Then, we can assign b* as the N-M utility of q, that is, U(q) = b*. Similarly, you can define the
N-M utility of whichever lottery or risky situation or simple probability distribution you like.
Also, recall that the N-M utility function U(p) takes the expected utility form, that is,
U(p) = Σ_x p(x) u(x), where u is called the Bernoulli (or money) utility function on the monetary
outcomes x ∈ X.
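The calibration idea can be made concrete in code. Since U(δ_{x̄}) = u(x̄) and U(δ_{x̲}) = u(x̲), the indifference p ~ a* δ_{x̄} + (1 − a*) δ_{x̲} forces U(p) = a* u(x̄) + (1 − a*) u(x̲), so a* = (U(p) − u(x̲)) / (u(x̄) − u(x̲)). The Python sketch below illustrates this; the particular outcome range and utility function are my own assumptions for illustration:

```python
def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

def calibration_weight(p, u, best, worst):
    """The unique a* with U(p) = a* * u(best) + (1 - a*) * u(worst)."""
    return (expected_utility(p, u) - u(worst)) / (u(best) - u(worst))

# Assumed setup: outcomes lie between a worst outcome 0 and a best outcome 10,
# with Bernoulli utility u(x) = x**2.
u = lambda x: x ** 2
p = {5: 0.5, 10: 0.5}

a_star = calibration_weight(p, u, best=10, worst=0)
print(a_star)  # 0.625: p is indifferent to 0.625*delta(best) + 0.375*delta(worst)
```

Notice that when u is normalized so that u(worst) = 0 and u(best) = 1, a* is exactly the expected utility of p, which is why the weight itself can serve as the N-M utility.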
A doubt could come into your mind as to why all these N-M utilities are numbers between zero
and one. Could they then be probabilities themselves? That is NOT true. If you recall the N-M
utility theorem, we have an important statement (that we prove) that says: Moreover, if another
real-valued function, say, V : P(X) → R also represents the said preferences, then there exist
α, β ∈ R, α > 0, such that V = αU + β. That is, the N-M utility is unique up to a positive
linear transformation. Let us understand what we are saying with a specific example.
Example 11 Consider two lotteries: lottery A, which gives outcome 5 with probability 0.5 and
outcome 10 with probability 0.5, and lottery B, which gives outcome 8 with probability 1. Lottery
B is a degenerate lottery; that is, it can be represented by a degenerate simple probability
distribution that puts all its probability weight on the monetary outcome 8 (or simply put,
attaches probability one to monetary outcome 8).
Suppose this Decision Maker (DM) has the Bernoulli (that is, money) utility function, say,
u(x) = x². Then, E_A u(x) = 0.5 u(5) + 0.5 u(10) = (0.5)(5²) + (0.5)(10²) = 62.5. Similarly, you
can calculate E_B u(x) = 64. Hence, you conclude that B ≻ A as per the expected utility criterion.
Now, let us transform this utility function using a positive linear transformation; that is, let us
take another v(x) = a u(x) + b where a = 100 and b = 0. Let us once again calculate expected
utilities with this v(x) function. That is, E_A v(x) = 100 E_A u(x) = (100)(62.5) = 6250 and
similarly, E_B v(x) = 6400. Again, you notice that the order of preference, viz., B ≻ A, is
preserved under this positive linear transformation.
Let us perform yet another transformation: w(x) = a u(x) + b where b = −100 and a = 1. That is,
w(x) = −100 + x². Calculate E_A w(x) = −100 + E_A u(x) = −100 + 62.5 = −37.5. Similarly,
E_B w(x) = −100 + 64 = −36. Again, you observe that B ≻ A, as −36 is larger than −37.5, and
hence the preference ordering is still preserved.
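The comparisons in Example 11 can be checked mechanically. The Python sketch below (the function and variable names are my own) re-runs the ranking of A and B under the original utility and under positive affine transformations v = a·u + b with a > 0:

```python
def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

A = {5: 0.5, 10: 0.5}   # lottery A from Example 11
B = {8: 1.0}            # degenerate lottery B

u = lambda x: x ** 2    # Bernoulli utility from Example 11

# (a, b) = (1, 0) is u itself; (100, 0) and (1, -100) are the two
# transformations used in the example.
for a, b in [(1, 0), (100, 0), (1, -100)]:
    v = lambda x, a=a, b=b: a * u(x) + b   # bind a, b per iteration
    ea, eb = expected_utility(A, v), expected_utility(B, v)
    print(f"a={a}, b={b}: E_A={ea}, E_B={eb}, B preferred: {eb > ea}")
```

In every case E_B exceeds E_A, so B ≻ A survives each positive affine transformation, exactly as the example claims.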
Hence, it should be clear in your mind that even though N-M utility scales are defined between
zero and one, they can always be scaled up or down, or their origin can be changed, without any
loss of ordering of the preferences. You can think of the analogy between this and the conversion
of temperatures from the Fahrenheit (F) to the Celsius (C) scale or vice versa. Recall that
F = (9/5)C + 32 is the affine transformation that allows you to move back and forth between the
Fahrenheit and Celsius scales. What we are performing by way of a positive linear transformation
of the N-M utility function is not any different.