Preface
1 Introduction
1.1 Focus of the book
1.2 Basic objectives
1.3 Content and structure
1.4 Some advice
3 Risk aversion
3.1 Marschak-Machina triangle
3.2 Contingent claims
3.3 Measures of risk aversion
3.4 Slope of risk aversion
4 Applications
4.1 Portfolio choice
4.2 The demand for insurance
4.3 Precautionary savings
4.4 Theory of production under risk
Index
List of Figures
Preface
nomic processes. Once a student can see what a problem involves, and
how it should be tackled graphically, then it is a relatively easy step
to apply the correct mathematical techniques to it. The underlying
theme of the present book is to attempt to achieve this by sticking
rigorously with problems in only two dimensions, and showing as much
as possible both mathematical and graphical treatments side by side.
I am indebted to a great many individuals both for fostering my
own interest in the topic of the microeconomics of decision making
under risk, and for turning my rough-and-ready lecture notes into
what I hope is now a coherent and sensible treatment of the topic.
I was initially drawn to problems in choice under risk by the late
Prof. Richard Manning in classes that were taught at the University
of Canterbury some 25 years ago. Since then, the main impetus to
my interest in the topic has come from the many vibrant discussions
that are so typical at the annual meetings of the European Group of
Risk and Insurance Economists (EGRIE), which I habitually attend.
I owe a huge debt of gratitude to Jasper Mackenzie who took on the
arduous task of preparing so professionally the graphs that appear in
the book. I also thank Nick Sanders who helped me with an earlier set
of graphs, which allowed deadlines to be reached. Aleta Bezuidenhout
and Jaime Marshall at Palgrave Macmillan have been a pleasure to
work with.
Chapter 1
Introduction
dimensions. That is, the assumption is that there are only two goods
present in the choice problem. This, however, should not be a concern.
The restricted number of dimensions is in place only in order that the
visual apparatus of a graph can be used. There is no doubt that a
graphical exposition of the solution helps enormously to capture most
of the essential elements of the solution, and for that reason two-
dimensional analysis is often used. But the model itself is robust to
an extension to any number of goods, and indeed it is often solved in
its multi-dimensional version in more advanced courses.
The other most often cited simplification that is important for the
model to be a faithful representation of real-world decision making
is the fact that everything that the decision maker needs to know,
he does know. In particular, he is fully informed of the availability
of all goods, of the prices of all goods, and, of course, of his own
income and preferences. It is likely that none of these things are really
quite so certain. Prices and availability of goods differ over sellers,
and it is often very difficult (or at least, very costly) to know exactly
where to go to get any particular item at any particular price. Even
personal attributes such as the disposable income and preferences
of the decision maker are known only approximately. One way to
deal with income uncertainty might be to set a budget for purchases
that is small enough to be guaranteed to be available, and then any
surplus income that results is simply retained as a random element of
savings. But then, we should ask what would be the optimal size of
the consumption budget that should be established? More generally,
we would be better to enquire about how the risks and uncertainties
that undoubtedly surround a decision-making environment can be
best catered for. This is the underlying theme of this book.
Risk is an ever-present element in decision making. It is often
related to time, because the final consequences of the decisions that
we make often do not occur simultaneously with the decision. Between
the moment of the decision, and the moment of the consequence,
other random elements in the problem environment might be playing
out, affecting the consequences of our decisions. That is, a given
decision can, feasibly, lead to more than one outcome or consequence,
depending on the outcome of other relevant stochastic elements. What
we need, therefore, is a convincing theory of how to best take such
stochastic elements into account when the decisions are made.
One obvious way in which the existence of risk affects economic
transactions is the existence of markets and institutions in which
risk itself can be traded. Take, for example, the insurance industry,
which clearly offers a service designed to shift risk from insurance
consumers to insurance companies, in exchange for a premium payment.
However, many other examples exist, including (but certainly
not restricted to) markets for financial products like shares in businesses,
and, of course, options and futures on those shares, contracts
between employers and employees that shift risks from the former to
the latter, fixed rather than variable interest rate contracts that shift
risk from borrowers to lenders, and so on. Achieving an understanding
of how such markets and institutions work to the mutual benefit of
all concerned, and how they affect decision making, is a fundamental
purpose of this book.
with, the decision maker lives in a world in which the only person
of relevance is himself, just like Robinson Crusoe living alone on his
tropical island. His choices and decisions are made in an environment
in which other important things may change (the weather, the tides,
the appearance of ships on the horizon), but those other changes are
not controlled directly by any other decision maker. They are, as it
were, acts of nature. The objective of these decisions is to provide the
decision maker with the greatest possible welfare (or utility), given
the fact that some other important values have yet to be fixed. Part
I of this book deals with this type of single-person decision problem.
Then, another decision maker turns up. Just like Friday, whom
Robinson Crusoe meets on the island. Now, with two decision makers
on the island, a small economy emerges in which meaningful transactions
can take place between the two. In Part II of this book we look
at how these small economies may work as far as risk sharing
goes. In particular, we are interested in how the two individuals can
join together in an effort to confront the risks that they face, the risks
posed by the whims of nature. The main thing at this intermediate
stage of existence is that Robinson and Friday are both fully
informed about the exact nature of the risks that they face. They
both know what outcomes would result under each and every feasible
state of nature, and (importantly) the likelihood of each and every
state of nature.
The climax of the story is when, perhaps after some time on
the island, Friday begins to understand that there is a fundamental
difference between himself and Robinson. Their information sets are
different, and this will have a profound effect upon the way they work
together. Perhaps we can think that Robinson (as the master) is busy
writing the story of his adventures, and so he sends Friday (as the
servant) off to labour each day in the jungles and oceans to get food
for the two of them. Assume, for example, that Robinson really likes
to eat fish, and that Friday is happy to eat only fruit. Fruit is easily
available and in plentiful supply all over the island, and so there is no
problem about gathering all the fruit that the two may ever require.
But fishing is different. It is inherently risky, and the outcome of how
many fish are caught depends upon many random factors. Maybe it
turns out that the best place to fish is a cove that is very far away,
and Friday would rather not walk so far, and instead he prefers to fish
at a closer location in spite of it not being such a plentiful supplier
of fish. Robinson, who does not want to have to accompany Friday
1
The possible exception is the case of insurance demand, where one could argue
that not only the insurance consumer, but also the insurer is present. However,
when we analyse the insurance demand model, our primary attention is placed
upon the decision of the consumer, and the insurer is really present only as a
parameter set in the demander's decision problem.
2
If you are unsure about how this is done, check Appendix A.
2. Risk and preferences
they will almost certainly not be used. This is an example of what is known by
economists as bounded rationality.
3
For an excellent account of this history, see Against the Gods: The Remarkable
Story of Risk, by Peter Bernstein, published by John Wiley & Sons (New York) in
1996.
4
And we are not dealing with just any mathematicians. The two names most
often associated with this idea are Blaise Pascal and Pierre de Fermat.
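was first suggested), then we would value the lottery at its expected
value. However, the expected value of the lottery5 is

$$
\begin{aligned}
E\tilde{x} &= \frac{1}{2}\cdot 1 + \left(\frac{1}{2}\right)^{2}\cdot 2 + \left(\frac{1}{2}\right)^{3}\cdot 4 + \dots \\
&= \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^{i} 2^{i-1} \\
&= \sum_{i=1}^{\infty} \frac{2^{i-1}}{2^{i}} \\
&= \sum_{i=1}^{\infty} \frac{1}{2} \\
&= \infty
\end{aligned}
$$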
The expected monetary prize turns out to be infinite! However, it
appears quite clear that no sane bettor would place a very high value
on the lottery, since the most likely outcome is that the prize that will
end up being won is on the order of 1 or 2 monetary units, or perhaps
4 if we are quite lucky. So how can this dilemma be resolved?
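One way to see why the sum diverges is to truncate it: each possible outcome adds exactly one half to the expected value. The following Python sketch computes the truncated sum exactly; the function name `truncated_expected_prize` is ours and purely illustrative.

```python
from fractions import Fraction


def truncated_expected_prize(max_tosses: int) -> Fraction:
    """Expected prize when only the first `max_tosses` outcomes are counted.

    Outcome i (i = 1, 2, ...) has probability (1/2)**i and pays 2**(i - 1),
    exactly as in the series for the Saint Petersburg lottery.
    """
    return sum(
        (Fraction(1, 2**i) * 2 ** (i - 1) for i in range(1, max_tosses + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Each extra outcome adds 1/2, so the total is n/2.
        print(n, truncated_expected_prize(n))
```

Counting the first 10 outcomes gives an expected prize of 5, counting the first 100 gives 50, and the total grows without bound as more outcomes are included.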
It was the famous Swiss mathematician Daniel Bernoulli6 who first
proposed a solution. Bernoulli postulated that what was important in
a risky situation such as that proposed was the "moral value" of the
prizes, rather than their pure monetary values. His analysis rests on
the recognition that the loss of a certain amount of money, say $x$,
implies a change in "happiness" that is, in absolute value, greater
than the change that would occur if that same amount of money was
earned. These days, we use the term "utility of the prizes" rather than
the "moral value of the prizes". In simple mathematical terms, if we
use $u(\cdot)$ to denote a utility function and $w$ as the initial (riskless)
wealth, then what Bernoulli recognises is that for any given $x$ we
should have $u(w) - u(w - x) > u(w + x) - u(w)$.
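As a quick numerical illustration of this inequality, take the logarithmic utility function $u(w) = \ln w$ with $w = 100$ and $x = 10$. Both the functional form and the numbers are assumptions chosen here for illustration; they are not taken from Bernoulli's argument.

```latex
\begin{align*}
u(w) - u(w - x) &= \ln 100 - \ln 90 \approx 0.1054, \\
u(w + x) - u(w) &= \ln 110 - \ln 100 \approx 0.0953.
\end{align*}
```

The utility lost when 10 units are taken away is larger than the utility gained when 10 units are added, as the inequality requires.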
Using a clear and logical argument, based principally on intuition,
Bernoulli concludes that the bettor in the Saint Petersburg lottery
5
Note the tilde ($\sim$) that has appeared above the random variable $x$. This tilde is often
used to distinguish a random variable from a deterministic one, and we shall follow
that custom throughout this text. The $E$ is the expectations operator.
6
The Bernoulli family was full of famous mathematicians. It was a cousin of
Daniel, called Nicolas, who suggested the Saint Petersburg problem in the first
place. Even though history credits Daniel Bernoulli as being the author of the
solution that we shall analyse here, he himself recognised that Gabriel Cramer had
discussed the same solution some years earlier.
2.1. Historical antecedents
Naturally, the bettor will receive only one prize, and the probability of
prize $x_i$ is $p_i$. We shall indicate different lotteries by different vectors,
in this case different probability vectors, expressed by the introduction
of a super-index. That is, two different lotteries are expressed as
$(x, p^1)$ and $(x, p^2)$. Note that this implies that all lotteries share
the same prize vector, something that will be altered later on. However,
for now it is sufficient to note that, since in any probability vector
we can incorporate a 0 in any place that we desire, two lotteries with
different prizes can still be captured by our notation. For example,
if lottery 1 has the prizes $x_1$ and $x_2$, and lottery 2 has the prizes
$x_3$ and $x_4$, then we can simply write $(x_1, x_2, x_3, x_4, p^1_1, p^1_2, 0, 0)$ and
$(x_1, x_2, x_3, x_4, 0, 0, p^2_3, p^2_4)$.
We shall indicate the utility function for lotteries by
$U(x, p)$. With this notation we are explicitly assuming that such a
utility function actually exists, and recognising that the utility of a
lottery will depend upon both the set of possible prizes, and the associated
set of probabilities. Our initial objective is to find a particular
functional form for $U$, using the definition of a utility function,
that is, $U(x, p^h) \geq U(x, p^k)$ if and only if lottery $h$ is weakly preferred to lottery $k$. In other words, we
2.2. Expected utility theory
Note that with only two prizes, the implication goes in both directions.
With more than two prizes, the implication goes only from left to right.
You should think carefully about why this is so. That is, first-order
stochastic dominance implies that a greater probability on the better
prize leads to a more preferred lottery, and if there are only two prizes,
then it is also true that the more preferred lottery must have a greater
probability on the better prize.
7
This assumption is further justified below, in footnote 10.
Now, note that we can associate with each one of the prizes a
number, i , such that
xi
i (x1 , xn , i , 1 i ) i = 1, 2, ..., n
8
Using the same notation as up to now, we should really write i1 instead of
i . However, when there are only two prizes there is no need to continue with
the subindex that indicates the corresponding prize, and so we eliminate it in the
interests of simplifying the notation.
qh qk if and only if h k
k
and so we can simply take U ( k ) = qk = pi i = pi u(xi ). That
is, the utility of the lottery is equal to the expected value of the utility
of its prizes.
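The expected utility rule is simple to compute. The sketch below is a minimal illustration under our own assumptions: the function name `expected_utility`, the prize vector, the two probability vectors, and the square-root utility function are all hypothetical and chosen only to show the calculation.

```python
import math
from typing import Callable, Sequence


def expected_utility(
    prizes: Sequence[float],
    probs: Sequence[float],
    u: Callable[[float], float],
) -> float:
    """Return the sum over prizes of p_i * u(x_i)."""
    if len(prizes) != len(probs):
        raise ValueError("prizes and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u(x) for p, x in zip(probs, prizes))


# Two lotteries over the same (hypothetical) prize vector.
prizes = [1.0, 4.0, 9.0]
lottery_h = [0.2, 0.5, 0.3]
lottery_k = [0.4, 0.4, 0.2]

eu_h = expected_utility(prizes, lottery_h, math.sqrt)
eu_k = expected_utility(prizes, lottery_k, math.sqrt)

if __name__ == "__main__":
    print("lottery h:", eu_h)
    print("lottery k:", eu_k)
    print("h preferred to k:", eu_h > eu_k)
```

Lottery h places more probability on the better prizes, so with any increasing utility function it receives the higher expected utility.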
Lottery   p1     p2     p3
A         0      1      0
B         0.1    0.89   0.01
C         0      0.11   0.89
D         0.1    0      0.9

Lottery   x1   x2   x3   p1     p2     p3
A         5    5    5    0.1    0.89   0.01
B         25   5    0    0.1    0.89   0.01
C         25   5    0    0      0.11   0.89
D         25   0    0    0.1    0.01   0.89
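The following sketch checks what expected utility implies for these four lotteries. It assumes that the columns p1, p2 and p3 of the first table are the probabilities of the prizes 25, 5 and 0 respectively (as the second table suggests), and that the relevant comparisons are A against B and C against D. The utility functions are arbitrary examples and the function names are ours.

```python
import math

# Assumed prize vector (x1, x2, x3) for the first table.
PRIZES = (25.0, 5.0, 0.0)

LOTTERIES = {
    "A": (0.0, 1.0, 0.0),
    "B": (0.1, 0.89, 0.01),
    "C": (0.0, 0.11, 0.89),
    "D": (0.1, 0.0, 0.9),
}


def expected_utility(probs, u):
    """Expected utility of a probability vector over PRIZES."""
    return sum(p * u(x) for p, x in zip(probs, PRIZES))


def preference_gaps(u):
    """Return (EU(A) - EU(B), EU(C) - EU(D)) for utility function u."""
    eu = {name: expected_utility(p, u) for name, p in LOTTERIES.items()}
    return eu["A"] - eu["B"], eu["C"] - eu["D"]


EXAMPLE_UTILITIES = {
    "square root": math.sqrt,
    "linear": lambda x: x,
    "log(1 + x)": math.log1p,
}

if __name__ == "__main__":
    for name, u in EXAMPLE_UTILITIES.items():
        gap_ab, gap_cd = preference_gaps(u)
        print(f"{name}: EU(A)-EU(B) = {gap_ab:.4f}, EU(C)-EU(D) = {gap_cd:.4f}")
```

Under these assumptions the two gaps coincide for every utility function, so an expected utility maximiser who prefers A to B must also prefer C to D.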
probabilities over the same prize set, but where in one scenario the
probabilities are known and in the other they are not.
Prospect theory
Another well-known alternative decision criterion is known as prospect
theory, which hypothesises that utility should be defined relative
to a particular wealth level (often taken to be the perceived initial
wealth), and what is important are changes in wealth from that level
rather than the levels of wealth that are attained. Above all, it is often
hypothesised that utility may be convex below the critical wealth
level (the domain of losses) and concave above the critical wealth
level (the domain of gains). It is also often assumed that the utility
function may have a non-differentiable kink at the critical wealth level.
When the critical wealth level is indeed taken as perceived initial
wealth, clearly it will change over time as lotteries play out. A small
risk involves comparing falling below the initial wealth into a zone of
convex utility (the loss domain) and going above the initial wealth into
a zone of concave utility (the gains domain). In such a setting small
risks become disproportionately important compared to the smooth,
everywhere concave, kind of utility function that is typically used in
expected utility theory. An example of two utility functions, one for
traditional utility and the other for prospect theory, together with
a perceived initial wealth of $w_0$, is shown in Figure 2.1. Specifically,
Figure 2.1 assumes a kink at the initial wealth $w_0$, and that the two
functions coincide above $w_0$, but they are clearly different below $w_0$.
Convexity of utility under the perceived initial wealth and concavity
of utility above perceived initial wealth has the implication of
what has become known as loss aversion. Loss aversion is simply a
situation in which the effect upon utility of a loss of a small amount of
wealth is greater than the effect of a gain of the same small amount of
wealth. Clearly, any concave function is loss averse in this sense, but
when the function is drawn with a kink at perceived initial wealth
and with convex utility below the initial wealth, the loss aversion
effect for small losses is greatly amplified, both by the kink, and by
the change from convexity to concavity as we move from left to right.
This is shown in Figure 2.1, where the small change in wealth is $x$.
Because Figure 2.1 assumes the two functions to be equal above $w_0$,
the welfare effect of the gain ($+x$) is the same for both functions,
the distance a. But the welfare effect of the loss ($-x$) is greater (in
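2.3. Alternative decision criteria
[Figure 2.1: utility $u(w)$ against wealth $w$ for a traditional concave utility function and a prospect theory utility function with a kink at $w_0$; the points $w_0 - x$, $w_0$ and $w_0 + x$ are marked on the horizontal axis, $u(w_0)$ on the vertical axis, and the distances a and c show the utility effects of the gain and the loss]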
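The amplified loss aversion described for Figure 2.1 can be illustrated numerically. The sketch below is built on our own assumptions: the square-root form above $w_0$, the convex loss branch below $w_0$, and the values of `W0` and `LOSS_SCALE` are hypothetical choices that merely reproduce the qualitative shape described in the text (a kink at $w_0$, convexity below it, and both functions coinciding above it).

```python
import math

W0 = 100.0         # perceived initial wealth (hypothetical value)
LOSS_SCALE = 1.0   # steepness of the loss branch (hypothetical value)


def concave_utility(w: float) -> float:
    """Traditional, everywhere concave utility."""
    return math.sqrt(w)


def prospect_utility(w: float, w0: float = W0) -> float:
    """Coincides with the concave function above w0; convex and kinked below."""
    if w >= w0:
        return concave_utility(w)
    return concave_utility(w0) - LOSS_SCALE * math.sqrt(w0 - w)


def gain_and_loss_effects(u, x: float, w0: float = W0):
    """Return (utility gained from +x, utility lost from -x) around w0."""
    return u(w0 + x) - u(w0), u(w0) - u(w0 - x)


if __name__ == "__main__":
    for label, u in (("concave", concave_utility), ("prospect", prospect_utility)):
        gain, loss = gain_and_loss_effects(u, x=1.0)
        print(f"{label}: gain effect = {gain:.4f}, loss effect = {loss:.4f}")
```

Both functions show a loss effect that exceeds the gain effect, but with these assumed forms the difference is tiny for the smooth concave function and very large for the kinked one.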
Discussion
Economic theorists have reacted in very different ways to the generalisation
of expected utility that is suggested by making the preferences
non-linear in probabilities. In reality, the debate relates to positive
Summary
In this chapter, you should have learned the following:
Problems
1. Show mathematically that, when the utility function for wealth
of w is equal to ln(w), then the expected utility of the Saint
Petersburg lottery is equal to ln(2).
2. Work out the value of the expected utility of the Saint Peters-
burg lottery when the utility function for wealth of w is w.
3. Really, the analysis of Daniel Bernoulli asks the wrong question.
Bernoulli is interested in the number w for which the utility of
w for sure is equal to the utility of the posed lottery. While this
is an interesting question, an even more interesting one from
the point of view of economics is the following. Given an initial
wealth of w0 , what price q would an individual be willing to pay
to purchase the St. Petersburg lottery? Write down the equation
that would define q for the St. Petersburg lottery, assuming that
utility is the log function. Can this equation be solved exactly
for any given w0 ? How about for w0 = 2?
4. Use your equation from the previous problem, establishing the
limit price q that would be paid by an individual with risk-
Risk aversion
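[Figure: lotteries 1 to 4 in the Marschak-Machina triangle, with $p_3$ on the vertical axis; lotteries 1 and 2 share the same value of $p_3$, that is, $p_3^1 = p_3^2$]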
That is, the curves that maintain expected value constant (from now
on, iso-expected value curves), are also straight lines with positive
slope. The interesting question is, how do the indifference curves and
the iso-expected value curves compare to each other? The answer
depends entirely upon the concavity of the utility function, $u(w)$. Let's
see how.
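[Figure 3.2: a concave utility function $u(w)$ with the wealth levels $w_3 < w_2 < w_1$ on the horizontal axis and $u(w_3)$, $u(w_2)$, $u(w_1)$ on the vertical axis; points a, b and c lie on the curve, point d completes the triangle with a and b, and the angles of the two chords are marked 1 and 2]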
Figure 3.2 shows a typical concave utility function, along with the
three levels of wealth $w_1 > w_2 > w_3$. If we draw the line segments
joining point a to point b, and point b to point c, then due to the
concavity of the utility function, the slope of the line joining a to
b must be greater than the slope of the line joining b to c. That
is, the angle marked 1 is greater than the angle marked 2. We can measure these two slopes using some simple
geometry. Consider the triangle formed by the three points a, b and d.
The slope of the line joining a to b (actually, the tangent of angle 1)
is equal to the length of the opposite side (the distance from
d to b) divided by the length of the adjacent side (the distance from
a to d). But these two distances are, respectively, $u(w_2) - u(w_3)$ and
$w_2 - w_3$. Thus, the slope of the line joining a to b is $\frac{u(w_2) - u(w_3)}{w_2 - w_3}$. In exactly the same way,
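[Figure: the Marschak-Machina triangle with $p_1$ on the horizontal axis and $p_3$ on the vertical axis, showing a line along which $E\tilde{w}$ is constant and a line along which $Eu(\tilde{w})$ is constant]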
Variance is defined as $\sigma^2 = \sum_i p_i (w_i - E\tilde{w})^2$. In fact,
it turns out that $\sigma^2$ is greater for lottery 2 than for lottery 1. To see why, it is necessary to
consider the derivative of $\sigma^2$ with respect to $p_1$ conditional upon
$E\tilde{w}$ remaining constant (this is suggested as problem 1 at the end
of the chapter). Nevertheless, note that as we increase $p_1$ along a
particular iso-expected value line, we need to increase $p_3$ and therefore
decrease $p_2$. This corresponds to a displacement of probability weight
from the centre of the distribution to the extremes, which implies an
increase in variance.
In short, we have reached the following important conclusion: if
the utility function is strictly concave, then an increase in variance
while holding the expected value constant implies a decrease in expected
utility.1 Economists say that such preferences display risk aversion,
since it is normal to associate variance with risk. Thus, concavity of
the utility function is equivalent to risk aversion. Of course, if u(w)
were linear, then we would have a risk neutral preference, and if it
were convex we would have a preference for risk (sometimes referred to as
risk loving).
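The conclusion can be checked with a small numerical example. The sketch below moves probability along an iso-expected value line in the triangle and reports the mean, the variance and the expected utility before and after. The three wealth levels, the starting probabilities and the logarithmic utility function are our own illustrative assumptions, and the function names are hypothetical.

```python
import math

# Three wealth levels with w1 > w2 > w3 (hypothetical values).
WEALTH = (3.0, 2.0, 1.0)


def moments(p1: float, p3: float):
    """Return (mean, variance, expected log utility) of the lottery (p1, p2, p3)."""
    probs = (p1, 1.0 - p1 - p3, p3)
    mean = sum(p * w for p, w in zip(probs, WEALTH))
    variance = sum(p * (w - mean) ** 2 for p, w in zip(probs, WEALTH))
    exp_utility = sum(p * math.log(w) for p, w in zip(probs, WEALTH))
    return mean, variance, exp_utility


def shift_along_iso_expected_value(p1: float, p3: float, dp1: float):
    """Raise p1 by dp1 and adjust p3 so that expected wealth is unchanged."""
    w1, w2, w3 = WEALTH
    dp3 = dp1 * (w1 - w2) / (w2 - w3)
    return p1 + dp1, p3 + dp3


start = (0.2, 0.2)
end = shift_along_iso_expected_value(*start, dp1=0.1)

if __name__ == "__main__":
    for label, (p1, p3) in (("start", start), ("end", end)):
        mean, variance, eu = moments(p1, p3)
        print(f"{label}: mean = {mean:.3f}, variance = {variance:.3f}, Eu = {eu:.4f}")
```

Expected wealth is the same at both points, while the variance rises and the expected utility of the concave (logarithmic) utility function falls.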
out, in the sense that they are steeper and steeper as we move
upwards and to the west in the triangle. Such preferences cannot
correspond to expected utility.
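[Figure: lotteries A, B, C and D in the Marschak-Machina triangle, with $p_1$ on the horizontal axis (marked at 0.1 and 1) and $p_3$ on the vertical axis (marked at 0.01, 0.89, 0.9 and 1)]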
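[Figure: state-contingent wealth space with $w_1$ on the horizontal axis and $w_2$ on the vertical axis, the certainty line $w_1 = w_2$, and the wealth levels $w_0$, $w_0 - q$ and $w_0 - q + x$ marked]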
$$E\tilde{w} = p w_2 + (1 - p) w_1 \qquad (3.2)$$

$$\sigma^2(\tilde{w}) = p (w_2 - E\tilde{w})^2 + (1 - p)(w_1 - E\tilde{w})^2$$
$$Eu(\lambda \tilde{w}_1 + (1 - \lambda)\tilde{w}_2) > \lambda Eu(\tilde{w}_1) + (1 - \lambda) Eu(\tilde{w}_2)$$
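[Figure: iso-variance lines in state-contingent wealth space (axes $w_1$ and $w_2$), with $\sigma^2 = 0$ in the centre and the lines $\sigma^2 = \sigma_1^2$ and $\sigma^2 = \sigma_2^2$ on either side of it]
[Figure: state-contingent wealth space (axes $w_1$ and $w_2$) with the certainty line $w_1 = w_2$, a region labelled $\sigma^2 > 0$, and a slope labelled $\frac{1-p}{p}$]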
tails. Would you voluntarily accept this lottery? How about win
two dollars on heads, lose a dollar on tails? Try to answer the
following question honestly. You are offered to voluntarily play a
lottery in which on heads you win x dollars, and on tails you lose
one dollar. What is the smallest value of x for which you would
play this lottery? What is the expected value of the lottery, and
what is its variance? Think about what your answers imply for
your own preference towards risk.
asset. Can the risk-free asset ever dominate the risky one, in the
sense that the investor would invest only in the risk-free asset
and not in the risky asset at all?
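[Figure: state-contingent wealth space (axes $w_1$ and $w_2$) with the certainty line $w_1 = w_2$, the risk-free point at $w_0(1 + t)$ in both states, the risky point with $w_0(1 + r)$ in state 1 and $w_0(1 - r)$ in state 2, and a slope labelled $\frac{1-p}{p}$]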
Since the expected return on the risky asset is greater than the
expected return on the risk-free asset, we know that the point
corresponding to all wealth invested in the risky asset, which lies
below the certainty line, must lie above the expected value line
passing through the point on the certainty line corresponding to
all wealth invested in the risk-free asset. Thus, the straight line
joining these two points (the line showing all possible investment
opportunities as wealth is spread over the two investments) is
less steep (flatter) than the expected value line of the risk-free
investment. But since the slope of the risk-free expected value
line is simply the ratio of state contingent probabilities, we also
know that the indifference curve at the risk-free investment is
steeper than the market opportunities line. Thus the tangency
between the market opportunities line and the indifference curve
must occur below the certainty line, that is, some money is
always invested in the risky asset. Curiously, the result that
some risk will always be purchased is independent of exactly
how risk averse the individual is, and how slight might be the
expected value advantage of the risky asset.
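The result can be illustrated numerically. The sketch below is a rough check under our own assumptions: two states, a risk-free return `T`, a risky asset that gains the fraction `R` in state 1 and loses it in state 2, a probability `P` of state 2, and constant relative risk aversion utility with coefficient `gamma`. All names and numbers are hypothetical; the optimum is located by a simple grid search rather than by solving the first-order condition.

```python
import math

W0 = 1.0   # initial wealth, normalised to 1 (assumption)
T = 0.02   # risk-free return (assumption)
R = 0.50   # the risky asset gains or loses this fraction (assumption)
P = 0.45   # probability of the bad state, state 2 (assumption)


def crra_utility(w: float, gamma: float) -> float:
    """Constant relative risk aversion utility; gamma = 1 is the log case."""
    if gamma == 1.0:
        return math.log(w)
    return w ** (1.0 - gamma) / (1.0 - gamma)


def expected_utility(share: float, gamma: float) -> float:
    """Expected utility when the fraction `share` of wealth is in the risky asset."""
    safe = W0 * (1.0 + T)
    w1 = safe + share * W0 * (R - T)    # good state
    w2 = safe + share * W0 * (-R - T)   # bad state
    return (1.0 - P) * crra_utility(w1, gamma) + P * crra_utility(w2, gamma)


def optimal_risky_share(gamma: float, steps: int = 1000) -> float:
    """Grid search over shares in [0, 1] for the expected utility maximum."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: expected_utility(a, gamma))


if __name__ == "__main__":
    # Expected risky return is R * (1 - 2 * P) = 0.05, which exceeds T.
    for gamma in (1.0, 5.0, 20.0):
        print(f"gamma = {gamma}: optimal risky share = {optimal_risky_share(gamma)}")
```

With these numbers the optimal share in the risky asset shrinks as risk aversion increases, but it stays strictly positive even for the most risk averse investor considered.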
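[Figure: state-contingent wealth space (axes $w_1$ and $w_2$) with the certainty line $w_1 = w_2$, the endowment $w_0$ in both states, and the acceptance set $A(w_0)$]
[Figure: state-contingent wealth space (axes $w_1$ and $w_2$) with the certainty line $w_1 = w_2$ and the indifference curves $u_i(w_0)$ and $u_j(w_0)$ passing through the endowment $w_0$]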
with $f'(u) > 0$ will also represent the same preferences. A composite
function of the form $z(w) \equiv f(u(w))$ with $f'(u) > 0$ is known as a
positive monotone transformation of $u(w)$.
Let's now go back to our uncertain environment (just for now, let
us consider an n state world, rather than a strictly 2 state world). If
two utility functions for wealth $u_i(w)$ and $u_j(w)$ are to represent the
same preferences over lotteries, then it must be true that the two functions
always give the same ordering over lotteries, or in other words,
that the two expected utilities are related by a positive monotone
transformation:

$$\sum_{k=1}^{n} p_k u_i(w_k) = H\left(\sum_{k=1}^{n} p_k u_j(w_k)\right) \quad \text{with } H'(\cdot) > 0$$

and so

$$H'(\cdot) = \frac{u_i'(w_k)}{u_j'(w_k)} \quad \forall w_k \qquad (3.4)$$

Differentiating (3.4) yields

$$H''(\cdot) = \frac{u_i''(w_k) u_j'(w_k) - u_i'(w_k) u_j''(w_k)}{\left(u_j'(w_k)\right)^2} \quad \forall w_k$$

But recall that if the two functions are to represent the same preferences
over lotteries, then it must hold that $R^a_i(w_k) = R^a_j(w_k)$ for all
$w_k$, that is

$$-\frac{u_i''(w_k)}{u_i'(w_k)} = -\frac{u_j''(w_k)}{u_j'(w_k)} \;\Rightarrow\; u_i''(w_k) u_j'(w_k) = u_i'(w_k) u_j''(w_k) \quad \forall w_k$$

and so clearly it would have to hold that

$$H''(\cdot) = 0 \quad \forall w_k$$

where $a > 0$ from (3.4). Now, since $b = \sum_{k=1}^{n} p_k b$, we have

$$\sum_{k=1}^{n} p_k u_i(w_k) = a \sum_{k=1}^{n} p_k u_j(w_k) + \sum_{k=1}^{n} p_k b = \sum_{k=1}^{n} p_k \left(a u_j(w_k) + b\right)$$

that is

$$u_i(w) = a u_j(w) + b \quad \text{with } a > 0$$
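A small numerical check makes the point concrete. In the sketch below, the two lotteries, the base utility function and the two transformations are our own hypothetical choices: one transformation is affine with a positive slope, the other is monotone increasing but not affine.

```python
import math


def expected_utility(lottery, u):
    """Expected utility of a lottery given as (wealth, probability) pairs."""
    return sum(p * u(w) for w, p in lottery)


SAFE = [(4.41, 1.0)]               # wealth of 4.41 for sure
RISKY = [(1.0, 0.5), (9.0, 0.5)]   # wealth of 1 or 9 with equal probability


def u_j(w: float) -> float:
    """Base utility function."""
    return math.sqrt(w)


def u_affine(w: float) -> float:
    """Affine transformation a * u_j + b with a = 3 > 0 and b = 5."""
    return 3.0 * u_j(w) + 5.0


def u_nonlinear(w: float) -> float:
    """Monotone increasing but non-affine transformation of u_j."""
    return u_j(w) ** 4


def prefers_safe(u) -> bool:
    """True if the safe lottery has the higher expected utility under u."""
    return expected_utility(SAFE, u) > expected_utility(RISKY, u)


if __name__ == "__main__":
    for name, u in (("u_j", u_j), ("affine", u_affine), ("nonlinear", u_nonlinear)):
        print(f"{name}: safe lottery preferred = {prefers_safe(u)}")
```

The affine transformation ranks the two lotteries exactly as the base function does, while the non-affine transformation, although increasing in wealth, reverses the ranking.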
Since $R^a_f(u) u'(w) > 0$ it turns out that $R^a_v(w) > R^a_u(w)$ for all
$w$. So indeed $v(w)$ is more risk averse than $u(w)$.
For any relative lottery that offers certainty (that is, $r_1 = r_2$), we get
the result that the slope of the indifference curve is equal to $-\frac{1-p}{p}$,
just as in the case of absolute lotteries. The second derivative of an
indifference curve in the space of relative lotteries is

$$\left.\frac{d^2 r_2}{d(r_1)^2}\right|_{dEu=0} = -\left(\frac{1-p}{p}\right) \frac{w u''(r_1 w) u'(r_2 w) - u'(r_1 w) w u''(r_2 w) \frac{dr_2}{dr_1}}{\left(u'(r_2 w)\right)^2}$$
4
Earlier we used w to indicate a wealth vector, and now it is being used to
indicate a scalar. From the context of the analysis it should always be clear which
dimensionality of w is being assumed.
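$$
\begin{aligned}
\left.\frac{d^2 r_2}{d(r_1)^2}\right|_{dEu=0}
&= -\left(\frac{1-p}{p}\right) \frac{w u''(rw) u'(rw) - u'(rw) w u''(rw) \left(-\frac{1-p}{p}\right)}{\left(u'(rw)\right)^2} \\
&= -\left(\frac{1-p}{p}\right) \frac{w u''(rw) u'(rw) \left(1 + \frac{1-p}{p}\right)}{\left(u'(rw)\right)^2} \\
&= -\left(\frac{1-p}{p}\right)\left(1 + \frac{1-p}{p}\right) \frac{w u''(rw)}{u'(rw)} \\
&= -f(p) \frac{w u''(rw)}{u'(rw)} \\
&\equiv R^r(w) f(p)
\end{aligned}
$$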
Risk premium
Let's go back to absolute lotteries. In what we have done above, we
always began with a situation of certainty, that is, our endowment
points were risk-free. Now let's consider what can be done when we
start off from a wealth distribution that involves risk; concretely, we
shall assume an endowment characterised by $w_1 > w_2$. In the same
manner as previously, the indifference curve that passes through the
endowment point defines the lower frontier of the acceptance set. This
indifference curve cuts the certainty line at a point of wealth equal to,
say, w in either state. w satisfies
Since E w
p = w1 + w2 , this is just
u(w1 ) + u(w2 ) = u (E w
) w1 + w2
p
$$u'(E\tilde{w}) = \frac{u(w_2) - u(w_1)}{w_2 - w_1}$$
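[Figure: utility function $u(w)$ with the wealth levels $w_2$, $E\tilde{w}$ and $w_1$ on the horizontal axis and $u(w_2)$, $u(w_1)$ on the vertical axis, together with the slope $u'(E\tilde{w})$]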
Arrow-Pratt approximation
A logical thing to think about is exactly how the risk premium relates
to the Arrow-Pratt measure of absolute risk aversion. To begin with,
note that from the definition of the certainty equivalent, and from the
definition of the risk premium, we can write
= u(w ) = u(w )
pu(w2 ) + (1 p)u(w1 ) = Eu(w)
Thus, the risk premium is the maximum amount of wealth that the
individual would be willing to pay to substitute his lottery for the
one with no risk at all but with the same expected value. It is now
useful to split the initial lottery into two parts:6 the risk-free part,
$w_0$, and a risky part which we denote by the random variable $\tilde{x}$, whose
expected value is $E\tilde{x} = \bar{x}$, multiplied by a constant, $k$. In this way,
the endowed expected utility is

$$Eu(w_0 + k\tilde{x}) = u(w_0 + k\bar{x} - \pi(k)) \qquad (3.6)$$

$$\pi(k) \approx \pi(0) + k\pi'(0) + \frac{k^2}{2}\pi''(0) \qquad (3.7)$$
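In this equation, we are going to substitute for the values of $\pi(0)$,
$\pi'(0)$ and $\pi''(0)$. We begin by noting that, if we set $k = 0$ in (3.6),
then we directly obtain the result $\pi(0) = 0$. Second, differentiate (3.6) with
respect to $k$ to obtain

$$E\tilde{x} u'(w_0 + k\tilde{x}) = (\bar{x} - \pi'(k)) u'(w_0 + k\bar{x} - \pi(k))$$

Setting $k = 0$ here gives $\pi'(0) = 0$. Differentiating a second time with respect to $k$ gives

$$E\tilde{x}^2 u''(w_0 + k\tilde{x}) = (\bar{x} - \pi'(k))^2 u''(w_0 + k\bar{x} - \pi(k)) - \pi''(k) u'(w_0 + k\bar{x} - \pi(k))$$

But since (as we have just seen) $\pi(0) = 0$ and $\pi'(0) = 0$, setting $k = 0$
gives us the result that

$$E\tilde{x}^2 u''(w_0) = \bar{x}^2 u''(w_0) - \pi''(0) u'(w_0)$$

that is,

$$\pi''(0) = -\frac{u''(w_0)}{u'(w_0)} \left(E\tilde{x}^2 - \bar{x}^2\right)$$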
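To get a feel for how good the second-order approximation is, the sketch below compares the exact risk premium with the approximation $\frac{1}{2}\sigma_x^2 R^a(w_0)$ that follows from (3.7) when $k = 1$. It is a minimal illustration under our own assumptions: logarithmic utility, a zero-mean risk of plus or minus 10 on a base wealth of 100, and hypothetical function names.

```python
import math


def exact_risk_premium(w0, outcomes, probs, u, u_inverse):
    """Expected wealth minus the certainty equivalent of w0 + x."""
    expected_wealth = sum(p * (w0 + x) for p, x in zip(probs, outcomes))
    exp_utility = sum(p * u(w0 + x) for p, x in zip(probs, outcomes))
    certainty_equivalent = u_inverse(exp_utility)
    return expected_wealth - certainty_equivalent


def arrow_pratt_premium(w0, outcomes, probs, abs_risk_aversion):
    """Second-order approximation: half the variance of x times R^a(w0)."""
    mean = sum(p * x for p, x in zip(probs, outcomes))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(probs, outcomes))
    return 0.5 * variance * abs_risk_aversion(w0)


W0 = 100.0
OUTCOMES = (-10.0, 10.0)   # zero-mean risk (assumption)
PROBS = (0.5, 0.5)

exact = exact_risk_premium(W0, OUTCOMES, PROBS, math.log, math.exp)
approx = arrow_pratt_premium(W0, OUTCOMES, PROBS, lambda w: 1.0 / w)

if __name__ == "__main__":
    print(f"exact risk premium:        {exact:.4f}")
    print(f"Arrow-Pratt approximation: {approx:.4f}")
```

For logarithmic utility, absolute risk aversion is $1/w$, so the approximation here is 0.5, very close to the exact premium of roughly 0.501. The gap widens as the risk becomes large relative to wealth.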
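[Figure 3.12: two lotteries with the same expected value and different variances in state-contingent wealth space (axes $w_1$ and $w_2$), with the certainty line $w_1 = w_2$, the indifference curves of individuals i and j through each lottery, and the points a, b, c, d and $E\tilde{w}$ marked on the horizontal axis]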
In Figure 3.12 we have drawn two lotteries with the same expected
value and different variances. The expected utility indifference curves
through each of these two lotteries are also drawn for two individuals,
one of whom (individual i) is more risk averse than the other (individual j).
If we concentrate on either of the two different lotteries, then it
is clear that the risk premium is greater for the individual with greater
risk aversion, so the risk premium is increasing in risk aversion. On
the other hand, if we concentrate on the indifference curves of either
of the two individuals, then it is also clear that holding risk aversion
constant and increasing variance also leads to an increase in the risk
premium.7
Thus, always within the assumption that the utility function itself
is increasing and concave (so that both relative and absolute risk
aversion are positive) we can directly conclude that
At the second step we can note that $u'''(w) \geq 0$ is a necessary (but not
sufficient) condition for decreasing absolute risk aversion, $R^{a\prime}(w) < 0$.
In words, a necessary condition for decreasing absolute risk aversion
is that marginal utility is convex. But we have already assumed that
$u'(w) > 0$ and that $u''(w) < 0$ for all $w$, that is, marginal utility is
positive and decreasing. From that, we can directly conclude that, at
least for very large values of $w$, marginal utility will indeed be convex
(if not, it would either have to be negative or increasing; draw a
graph of marginal utility if you are not convinced).
At the final step, we can also conclude that a necessary and sufficient
condition for decreasing absolute risk aversion is that $R^a(w) <
P(w)$. The function $P(w)$ as defined above is known as absolute
prudence, and so absolute risk aversion is decreasing if (and only if)
absolute risk aversion is less than absolute prudence. Another way of
looking at prudence is to consider the utility function $v(w) = -u'(w)$.
Prudence of $u(w)$ is then just the Arrow-Pratt measure of absolute risk
aversion of $v(w)$. So $u(w)$ displays decreasing absolute risk aversion if
$u(w)$ is less risk averse than is $-u'(w)$. The concept of prudence turns
out to be important for decisions that involve savings as a hedge
against risk, and it is normally accepted that risk averse individuals
also display positive prudence, implying that indeed $u'''(w) > 0$. We
study exactly this kind of problem in the next chapter.
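The link between prudence and the slope of absolute risk aversion can be verified directly. The following is a sketch that assumes the usual definitions $R^a(w) = -u''(w)/u'(w)$ and $P(w) = -u'''(w)/u''(w)$; the logarithmic example at the end is our own illustration.

```latex
\begin{align*}
\frac{dR^a(w)}{dw}
  &= -\frac{u'''(w)\,u'(w) - \left(u''(w)\right)^2}{\left(u'(w)\right)^2}
   = -\frac{u'''(w)}{u'(w)} + \left(R^a(w)\right)^2 \\
  &= R^a(w)\left(R^a(w) - P(w)\right),
  \qquad \text{since } R^a(w)\,P(w) = \frac{u'''(w)}{u'(w)}.
\end{align*}

% Example: u(w) = ln(w), so u' = 1/w, u'' = -1/w^2, u''' = 2/w^3.
\begin{align*}
R^a(w) = \frac{1}{w}, \qquad
P(w) = \frac{2}{w}, \qquad
\frac{dR^a(w)}{dw} = \frac{1}{w}\left(\frac{1}{w} - \frac{2}{w}\right) = -\frac{1}{w^2} < 0.
\end{align*}
```

With $R^a(w) > 0$, the sign of the derivative is the sign of $R^a(w) - P(w)$, which is the necessary and sufficient condition stated in the text. In the logarithmic case prudence is twice absolute risk aversion, so absolute risk aversion is decreasing.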
In short, it is very often accepted that absolute risk aversion is
in fact decreasing (indeed, a common assumption, which is also
often found to correspond to real life choices in empirical analyses, is
that relative risk aversion is constant). In graphical terms, decreasing
absolute risk aversion corresponds to a family of indifference curves
that become more and more linear as we move away from the origin
of the graph.
Summary
In this chapter you should have learned the following:
Problems
1. Prove mathematically that a movement upwards along a line
of constant expected value in the Marschak-Machina triangle
corresponds to an increase in variance of wealth.
2. Use Jensen's inequality to prove that if $u(w)$ is concave, then
the iso-expected value lines in the Marschak-Machina triangle
are steeper than the indifference curves.
Applications
wi = 1 1i + 2 2i i = 1, 2
v 1 1 N 1 + v 2 2 N 2 v 1 10 N 1 + v 2 20 N 2
v 1 N 1 ( 1 10 ) + v 2 N 2 ( 2 20 ) 0
We shall also add the restrictions that $\alpha^j \ge 0$, $j = 1, 2$, that is,
it is impossible to be the owner of a negative share in a firm. In
reality, this type of restriction does not necessarily need to hold, since
owning a negative proportion of a firm is simply a situation in which
instead of owning shares, shares are owed. In many real-world markets
this is possible, and is known as holding a short position in a firm.
Selling more shares than one owns is known as a short sale. Short
positions are possible only when there exists a time dimension in share
trading. An individual who believes that the price of a share will go
down tomorrow can sell shares today (although he does not actually
have them) at the current market price, with the promise of delivering
them the day after tomorrow. Then, with the money that he gets for
the sale, he waits until the next day when he purchases the shares (at
the lower price, if his belief has been fulfilled), and then he settles his
share debt. The profit from such a trade (net of any transactions costs)
is the difference in the price of the shares, multiplied by the number
of shares involved. Of course, this can be a very dangerous strategy.
If rather than going down, the shares increase in price, the investor
makes a negative profit, and what is more, since (theoretically) the
price can increase without bound, the negative profit can also become
very large.1 Many bankruptcies have occurred through betting on
short sales. Our assumption of eliminating short sales avoids such
a complication.
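The arithmetic of a short sale can be sketched in a few lines. This is only an illustration: the function name `short_sale_profit` and all prices and quantities are hypothetical, and transactions costs are ignored.

```python
def short_sale_profit(sell_price, buy_back_price, shares):
    """Profit from selling `shares` today and buying them back later."""
    return (sell_price - buy_back_price) * shares


# The price falls, as the investor expected: a positive profit.
gain = short_sale_profit(sell_price=10.0, buy_back_price=8.0, shares=100)

# The price rises instead: the loss grows without bound in the buy-back price.
loss = short_sale_profit(sell_price=10.0, buy_back_price=25.0, shares=100)

# A long position bought at 10 can lose at most the amount invested.
max_long_loss = -10.0 * 100
```

In this sketch the short seller's loss exceeds the largest loss possible on the corresponding long position, which is the danger that the no-short-sales assumption removes.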
1 In comparison, holding only positive positions in firms limits losses to the
amount invested (the scenario in which the prices of the shares held drop to 0).

Our interest is in the optimal portfolio choice of the investor,
that is, his optimal choices of $\alpha^j$. Formally, the problem is to
maximise $Eu(w(\alpha))$ with respect to $\alpha$, conditional upon
$v^1 N^1 (\alpha^1 - \alpha^1_0) + v^2 N^2 (\alpha^2 - \alpha^2_0) \le 0$
and $\alpha^j \ge 0$, $j = 1, 2$. Since the objective function
(expected utility) is concave in $\alpha$, and since the restrictions are linear,
we can rest assured that the problem has a unique solution. We shall
formulate the problem by ignoring the non-negativity constraints, since
if they are satisfied in any solution found by not imposing them, we
know that the same solution would be found by imposing them, and
if one of them is not satisfied then we know that the optimal solution
is to simply set that $\alpha^j$ equal to 0. The Lagrangean function for the
problem is
$$v^1 N^1 (\alpha^1 - \alpha^1_0) + v^2 N^2 (\alpha^2 - \alpha^2_0) = 0 \tag{4.1}$$
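[Figure: state-contingent wealth space with axes w1 and w2, showing one line for each firm, the endowment points A (firm 1) and B (firm 2), and their vector sum C.]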
the graph, and $\alpha^j = 1$ would give the point $\pi^j$). The initial endowment
of the investor is indicated by point A on the line pertaining to firm
1 and point B on the line of firm 2. The vector sum of these two
points indicates that the individual's initial point is found at C, and
the optimal position of the individual is given by the tangency point
between his indifference curve and the frontier of all feasible trades
(the line passing through C).
The principal problem in working this through is simply to obtain
the equation for the slope of the frontier of feasible trades in state
contingent wealth space. Note carefully, since we have not depicted the
two assets (the shares) on the axes, the slope of the budget constraint
in state contingent wealth space is certainly not equal to the negative
of the ratio of the prices of the assets, as one may be tempted into
believing at first glance. Let's investigate.
We know that the slope of the individual's indifference curves in
state contingent wealth space (his marginal rate of substitution) is
$$-\frac{(1-p)\,u'(w_1)}{p\,u'(w_2)}$$
$$g(\alpha) = V^1 (\alpha^1 - \alpha^1_0) + V^2 (\alpha^2 - \alpha^2_0)$$
so that the restriction reads $g(\alpha) = 0$. First, note that point C must
necessarily lie on the implied restriction, since it corresponds to
$\alpha^j = \alpha^j_0$, $j = 1, 2$, which clearly yields $g(\alpha) = 0$. Second, from the implicit
function theorem we have
$$\left.\frac{d\alpha^2}{d\alpha^1}\right|_{dg(\alpha)=0} = -\frac{V^1}{V^2}$$
[Figure: state-contingent wealth space with axes w1 and w2, showing the points A, B, C and E.]
z0 + (1 p)(y1 x1 ) + p(y2 L x2 )
pL (1 p)x1 + px2
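[Figures: insurance in state-contingent wealth space with axes w1 and w2, showing the certainty line w1 = w2, the endowment point (w0, w0 - L), expected wealth w0 - pL, and the expected utility of the endowment, Eu(w) = (1 - p)u(w0) + pu(w0 - L).]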
Thus the contract must satisfy $(1-p)x_1 + px_2 = -pL$, and since
$x_1 = x_2 = x_c$, we have in this case $x_c = -pL$. Thus the individual's
final wealth ends up at the point w in Figure 4.4, and his expected
utility is the greatest possible within the set of possibilities that is
offered by the zone of feasible contracts. Note that under this contract,
the individual's wealth in both states ends up at $w_0 - pL$, so in state 1
the contract asks him to pay $pL$, and in state 2 the contract gives him
a payment of $L - pL$. In other words, the contract asks for a premium
payment from the insured to the insurer of $pL$ in both states, and
offers an indemnity payment from the insurer to the insured in state
2 of $L$. The fact that the indemnity is, in absolute value, equal to the
size of the loss is the indication that the contract has full coverage,
and the fact that the premium is equal to the expected value of the
loss is known as a case of a fair premium.
Second, consider the case when the insurer is a monopolist, so
that the contract is characterised by $x_1 = x_2 = x_m$, that is, again
we know that the contract will still involve full coverage, and all that
we need to find out is the amount of the premium. But clearly, the
company will offer the contract that maximises her expected profit
while still being accepted by the individual. If the contract does not
leave the insured indifferent between accepting it or not, then the
same indemnity payment can be made with a higher premium, which
must increase the expected profit of the insurer. So the contract must
lie on the individual's initial indifference curve. But since it offers full
coverage, it offers full certainty to the individual. Thus
$$u(w_0 + x_m) = (1-p)u(w_0) + pu(w_0 - L)$$
This is just the definition that we saw previously for the certainty
equivalent wealth, $w^*$, and so we have $w_0 + x_m = w^*$. But from the
definition of the risk premium, we now know that $w^* = E\tilde{w} - \pi =
w_0 - pL - \pi$, and so $x_m = -(pL + \pi)$. A monopoly contract leaves the
individual with wealth of $w_0 - (pL + \pi)$ in both states, and so it implies
that the premium paid by the insured to the insurer is $pL + \pi$, and
the indemnity coverage to be received if state 2 eventuates is equal
to $L$. Hence, the only difference between a competitive contract and
a monopoly contract is the premium to be paid; in both contracts
the indemnity to be received in state 2 is always equal to the loss,
$L$. When we have a monopoly insurer, the premium is equal to the
competitive premium plus the individual's risk premium.
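The two premiums can be compared numerically. The sketch below assumes logarithmic utility and hypothetical values for initial wealth, the loss and its probability; neither the utility function nor the numbers come from the text.

```python
import math


def certainty_equivalent(w0, loss, p, u=math.log, u_inv=math.exp):
    """Wealth w* satisfying u(w*) = (1 - p) u(w0) + p u(w0 - loss)."""
    expected_utility = (1 - p) * u(w0) + p * u(w0 - loss)
    return u_inv(expected_utility)


w0, loss, p = 100.0, 50.0, 0.2            # hypothetical numbers
fair_premium = p * loss                    # competitive premium pL
w_star = certainty_equivalent(w0, loss, p)
risk_premium = (w0 - p * loss) - w_star    # expected wealth minus w*
monopoly_premium = fair_premium + risk_premium
```

Under these assumptions the monopoly contract leaves the individual with exactly the certainty equivalent wealth, while the competitive contract leaves him with expected wealth.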
subject to
$$\frac{\partial Eu(C^*)}{\partial C} = p\,u'(w_0 - (qC^* + k) - L + C^*)(1-q) - (1-p)\,u'(w_0 - (qC^* + k))\,q = 0 \tag{4.3}$$
Some special cases are now quite evident. First, say $q = p$, which
is the case studied in the previous section. In that case (and only
in that case), $\frac{p(1-q)}{(1-p)q} = 1$, and so the first-order condition indicates
that $u'(w_0 - (qC^* + k)) = u'(w_0 - (qC^* + k) - L + C^*)$. But since
Consider the other two options, $q > p$ and $q < p$. In the first of
these cases, we get $\frac{p(1-q)}{(1-p)q} < 1$, in which case $u'(w_0 - (qC^* + k)) <
u'(w_0 - (qC^* + k) - L + C^*)$. Again, due to concavity of the utility
function, this indicates that $w_0 - (qC^* + k) > w_0 - (qC^* + k) - L + C^*$,
or $C^* < L$, that is, under-insurance. The other case, $q < p$, leads
directly to over-insurance, $C^* > L$.
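The two cases can be checked numerically by solving the first-order condition (4.3) by bisection. The sketch below assumes logarithmic utility, so that u'(w) = 1/w; the function names and all parameter values are hypothetical.

```python
def foc(C, w0, L, p, q, k=0.0):
    """Left-hand side of the first-order condition (4.3) for u(w) = ln(w)."""
    w_no_loss = w0 - (q * C + k)
    w_loss = w_no_loss - L + C
    return p * (1 - q) / w_loss - (1 - p) * q / w_no_loss


def optimal_coverage(w0, L, p, q, k=0.0, tol=1e-10):
    """Bisection on the first-order condition, which is decreasing in C."""
    lo, hi = 0.0, 2 * L
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, w0, L, p, q, k) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w0, L, p = 100.0, 50.0, 0.2
C_fair = optimal_coverage(w0, L, p, q=0.2)     # q = p
C_loaded = optimal_coverage(w0, L, p, q=0.3)   # q > p
```

With these numbers the solution is full coverage when q = p and partial coverage when q > p. The search interval from 0 to 2L is a choice made for this example and would need adjusting for other parameter values.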
Now, typically over-insurance is problematic, and is never a feature
of the real-world insurance business. It is not difficult to see why. If,
in case of accident, the insured receives back more money in indemnity
than what he loses in the accident, he has a clear incentive to
artificially boost the probability of accident, which is detrimental to
the expected profit of the insurer. Normally, actions by the insured
to attempt to create accidents (a well-known type of insurance fraud)
cannot be easily monitored by the insurer, and so in order to avoid
such a scenario, insurers do not offer contracts with $q < p$, and
correspondingly, we shall ignore that option here.
As far as the comparative statics of insurance are concerned, the
interesting case occurs when $q > p$, since the optimal coverage does not
respond at all to changes in any parameter values (outside of moving
$k$ from below to above $\pi$) in the case when $q = p$. So from now on,
let us consider only the case $q > p$, and we shall look at the effects
of changing the parameter values on the optimal insurance choice. To
do this, go back to the first version of the first-order condition (4.3),
to which we can directly apply the implicit function theorem.
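The most important result to note is what happens when the
individual becomes independently wealthier, that is, $w_0$ increases.
From the implicit function theorem,
$$\left.\frac{\partial C^*}{\partial w_0}\right|_{dEu=0} = -\frac{\partial^2 Eu / \partial C \partial w_0}{\partial^2 Eu / \partial C^2}$$
But since we know that $\frac{\partial^2 Eu}{\partial C^2} < 0$, the sign of $\frac{\partial C^*}{\partial w_0}$ is the same as the
sign of $\frac{\partial^2 Eu}{\partial C \partial w_0}$. Differentiating (4.3) we have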
$$\frac{\partial^2 Eu}{\partial C \partial w_0} = p\,u''(w_0 - (qC^* + k) - L + C^*)(1-q) - (1-p)\,u''(w_0 - (qC^* + k))\,q \tag{4.5}$$
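However, we can cancel the term $p(1-q)$ using the first-order condition
itself, to get
$$\frac{\partial^2 Eu}{\partial C \partial w_0} = \frac{q(1-p)\,u'(w_0 - (qC^* + k))}{u'(w_0 - (qC^* + k) - L + C^*)}\,u''(w_0 - (qC^* + k) - L + C^*) - (1-p)\,u''(w_0 - (qC^* + k))\,q$$
$$= q(1-p)\left[\frac{u''(w_0 - (qC^* + k) - L + C^*)\,u'(w_0 - (qC^* + k))}{u'(w_0 - (qC^* + k) - L + C^*)} - u''(w_0 - (qC^* + k))\right]$$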
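Now, since $q(1-p) > 0$, the sign of $\frac{\partial^2 Eu}{\partial C \partial w_0}$ is equal to the sign of the
bracketed term
$$\frac{u''(w_0 - (qC^* + k) - L + C^*)\,u'(w_0 - (qC^* + k))}{u'(w_0 - (qC^* + k) - L + C^*)} - u''(w_0 - (qC^* + k))$$
or more formally, $\frac{\partial^2 Eu}{\partial C \partial w_0} \gtreqless 0$ as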
$$\frac{\partial^2 Eu}{\partial C \partial k} = -p(1-q)\,u''(w_0 - (qC^* + k) - L + C^*) + (1-p)\,q\,u''(w_0 - (qC^* + k)) = -\frac{\partial^2 Eu}{\partial C \partial w_0}$$
where we have used (4.5). Thus, if $\frac{\partial^2 Eu}{\partial C \partial w_0} < 0$ due to decreasing
absolute risk aversion, then we have $\frac{\partial^2 Eu}{\partial C \partial k} > 0$, and an increase in
the fixed component of the price of insurance will actually increase the
demand for insurance! Why is this? Again, the logic is quite easy. Note
that the fixed component of the price of insurance is nothing more
than a loss in risk-free wealth regardless of whether or not an accident
occurs. Thus an increase in $k$ is exactly equivalent to a decrease in
$w_0$. Running our previous argument in reverse, this decrease in
risk-free wealth would make the individual more risk averse, and thereby
increase his insurance purchase.
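Both comparative statics results can be illustrated with a closed form. The sketch below assumes logarithmic utility (which has decreasing absolute risk aversion), for which the interior solution of (4.3) can be solved by hand as C* = [(1 - p)qL - (q - p)(w0 - k)] / [q(1 - q)]. The function name and the parameter values are hypothetical.

```python
def optimal_coverage_log(w0, L, p, q, k=0.0):
    """Interior solution of (4.3) when u(w) = ln(w); assumes 0 < q < 1."""
    return ((1 - p) * q * L - (q - p) * (w0 - k)) / (q * (1 - q))


base = optimal_coverage_log(w0=100.0, L=50.0, p=0.2, q=0.3)
richer = optimal_coverage_log(w0=101.0, L=50.0, p=0.2, q=0.3)
poorer = optimal_coverage_log(w0=99.0, L=50.0, p=0.2, q=0.3)
higher_fee = optimal_coverage_log(w0=100.0, L=50.0, p=0.2, q=0.3, k=1.0)
```

In this example coverage falls when wealth rises, rises when the fixed fee rises, and a one unit increase in k has the same effect as a one unit decrease in w0, since only the difference w0 - k enters the formula.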
In actual fact, it is somewhat unfair to label insurance as an
inferior good when absolute risk aversion is decreasing. In this
insurance model, the only decision that the insurance consumer takes
is how much insurance to purchase. This is a little different from the
traditional consumer model in which a decision is taken on at least
two goods. If we introduce a second good into the insurance model, it
becomes unclear whether or not insurance is inferior.
and the indifference curves are downward sloping convex curves. The
optimal choice is where an indifference curve is tangent to the budget
constraint. The marginal rate of substitution can be found from the
utility function using the implicit function theorem, and it is
$$\left.\frac{dc_2}{dc_1}\right|_{dU=0} = -\frac{u'(c_1)}{\delta\,u'(c_2)}$$
$$\frac{u'(c_1^0)}{\delta\,u'(c_2^0)} = (1 + r)$$
$$c_2^0 = (1 + r)(y_1 - c_1^0) + y_2$$
$$\frac{u'(c_1^0)}{\delta\,u'(c_2^0)} = (1 + r)$$
Exercise 4.4. Show that if $\delta = \frac{1}{1+r}$, the consumer will
consume the same in each period. Further, if the financial system is
costless ($r = 0$) and the individual is infinitely patient ($\delta = 1$),
show that consumption in each period is exactly the average
of total income. How do the two consumption choices relate to
each other when $\delta > \frac{1}{1+r}$ and when $\delta < \frac{1}{1+r}$? Can you give
some economic intuition for these results?
Answer. If $\delta = \frac{1}{1+r}$ the tangency condition becomes
$\frac{u'(c_1^0)}{u'(c_2^0)} = 1$. Given concave utility, this is just the same as saying $c_1^0 = c_2^0$,
that is, the consumer will consume the same in each period. In
this case, consumption $c^0$ can be calculated from the budget
constraint; $c^0 = (1 + r)(y_1 - c^0) + y_2$, which solves out to
$c^0 = \frac{(1+r)y_1 + y_2}{2+r}$. With $r = 0$ and $\delta = 1$, again we get $\frac{u'(c_1^0)}{u'(c_2^0)} = 1$
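The results of the exercise can be checked with a small numerical sketch. It assumes logarithmic utility in both periods, with a discount factor called `delta` here, so that the tangency condition and the budget constraint solve to c1 = [(1 + r)y1 + y2] / [(1 + r)(1 + delta)] and c2 = delta(1 + r)c1. The utility function, the function name and the incomes are assumptions made for illustration.

```python
def two_period_consumption(y1, y2, r, delta):
    """Optimal (c1, c2) for U = ln(c1) + delta * ln(c2), a hypothetical utility."""
    c1 = ((1 + r) * y1 + y2) / ((1 + r) * (1 + delta))
    c2 = delta * (1 + r) * c1
    return c1, c2


r = 0.05
# delta = 1 / (1 + r): equal consumption in both periods.
c1, c2 = two_period_consumption(y1=100.0, y2=40.0, r=r, delta=1 / (1 + r))
# r = 0 and delta = 1: each period's consumption is the average of total income.
a1, a2 = two_period_consumption(y1=100.0, y2=40.0, r=0.0, delta=1.0)
# delta > 1 / (1 + r): a more patient consumer shifts consumption to period 2.
p1, p2 = two_period_consumption(y1=100.0, y2=40.0, r=r, delta=0.99)
```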
subject to $-\frac{Ey_2}{1+r} \le s \le y_1$
Again, the second-order condition is satisfied by concavity of the
utility function, and so (assuming an interior solution) the optimal
savings is the solution to
That is,
$$-u'(y_1 - s^*) + \delta(1+r)\left[p\,u'(y_{22} + (1+r)s^*) + (1-p)\,u'(y_{21} + (1+r)s^*)\right] = 0 \tag{4.7}$$
Now, we are interested in seeing how the solution to (4.7) compares
to the solution to (4.6), that is, what is the effect of the introduction
of pure income risk? In principle, we would most likely expect that the
risk will result in more savings, since by passing income into period
2 savings is a way in which the adverse outcome of low period 2
income can be insured against. Such a savings strategy is known as
precautionary savings.
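A numerical sketch of this comparison follows. It assumes logarithmic utility (which is prudent, since its marginal utility is convex), no discounting, a zero interest rate, and hypothetical incomes; the risky period 2 income has the same mean as the certain one, so the risk is a pure income risk. Function names are illustrative only.

```python
def savings_foc(s, y1, y2_states, probs, r=0.0, delta=1.0):
    """Derivative of expected lifetime utility with respect to s, for u = ln."""
    marginal_future = sum(pr / (y2 + (1 + r) * s)
                          for pr, y2 in zip(probs, y2_states))
    return -1.0 / (y1 - s) + delta * (1 + r) * marginal_future


def optimal_savings(y1, y2_states, probs, r=0.0, delta=1.0, tol=1e-10):
    """Bisection; the first-order condition is decreasing in s."""
    lo = -min(y2_states) / (1 + r) + 1e-9   # keeps period 2 consumption positive
    hi = y1 - 1e-9                          # keeps period 1 consumption positive
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if savings_foc(mid, y1, y2_states, probs, r, delta) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


s_certain = optimal_savings(100.0, [100.0], [1.0])
s_risky = optimal_savings(100.0, [50.0, 150.0], [0.5, 0.5])
```

Under these assumptions savings are zero with a certain period 2 income and strictly positive once the income risk is introduced.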
[Figure: the objective function U(s) plotted against savings s, with the savings levels s0 and s* marked on the horizontal axis.]
subject to $-\frac{y_2}{1+Er} \le s \le y_1$
Again, since the objective function is concave in $s$, if we assume that
the solution is interior, then the optimal savings for this problem, $s^*$,
is given by the solution to the first-order condition;
which simplifies to
Note that this equation is of the form $ph(r_2) + (1-p)h(r_1) > h(r)$,
where $r = pr_2 + (1-p)r_1$. Thus the requirement is that $h(r) =
u'(y_2 + (1+r)s^0)(1+r)$ is convex in $r$, that is, we require $h''(r) > 0$.
However, we can calculate
and
$$h''(r) = u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2 + u''(y_2 + (1+r)s^0)s^0 + u''(y_2 + (1+r)s^0)s^0$$
$$= u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2 + 2u''(y_2 + (1+r)s^0)s^0$$
$$2u''(y_2 + (1+r)s^0)s^0 > -u'''(y_2 + (1+r)s^0)(1+r)(s^0)^2$$
or
$$2 < -\frac{u'''(y_2 + (1+r)s^0)(1+r)s^0}{u''(y_2 + (1+r)s^0)} \tag{4.8}$$
At the last step, be careful that you understand why the inequality
direction has changed; it is because we divided by $u''$, which is
negative.
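Condition (4.8) can be checked against a direct numerical calculation of the curvature of h(r). The sketch below assumes constant relative risk aversion, with marginal utility w^(-gamma), for which -u'''(w)/u''(w) = (gamma + 1)/w. The utility function, function names, step size and parameter values are all assumptions made for illustration.

```python
def h(r, y2, s0, gamma):
    """h(r) = u'(y2 + (1 + r) s0) (1 + r) with marginal utility w**(-gamma)."""
    return (y2 + (1 + r) * s0) ** (-gamma) * (1 + r)


def h_second_derivative(r, y2, s0, gamma, step=1e-4):
    """Central finite-difference approximation to h''(r)."""
    return (h(r + step, y2, s0, gamma) - 2 * h(r, y2, s0, gamma)
            + h(r - step, y2, s0, gamma)) / step ** 2


def condition_4_8(r, y2, s0, gamma):
    """2 < -u'''(w)(1 + r) s0 / u''(w), which is (gamma + 1)(1 + r) s0 / w here."""
    w = y2 + (1 + r) * s0
    return 2 < (gamma + 1) * (1 + r) * s0 / w


high = (0.05, 10.0, 20.0, 4.0)   # r, y2, s0, gamma: prudence is high enough
low = (0.05, 10.0, 20.0, 1.0)    # positive prudence, but not high enough
```

In the second case prudence is positive, yet h(r) is concave, so the condition fails.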
Condition (4.8) tells us several things about savings under interest
rate risk. First, it is not true that positive prudence is sufficient
for the individual to save more under interest rate risk than under
certainty. Prudence must be sufficiently high. To see this, recall that
the coefficient of absolute prudence is the third derivative of utility
divided by the second derivative and multiplied by $-1$, and then
simply write our condition in the following ways
s
2/((1 + r)s )
$$\frac{\partial^2 Eu(\tilde{y})}{\partial x^2} = (1-p)\left\{u''(y_1(x^*))[y_1'(x^*)]^2 + u'(y_1(x^*))\,y_1''(x^*)\right\} + p\left\{u''(y_2(x^*))[y_2'(x^*)]^2 + u'(y_2(x^*))\,y_2''(x^*)\right\}$$
The sign of this depends on the sign of $y_i''(x) = 2d'(x) + d''(x)x - c''(x)$.
Since we have already assumed that $c''(x) > 0$, and that $d'(x) \le 0$,
the second derivative of expected utility with respect to $x$ is likely
to be negative, but we do need to assume that $d''(x)$ is not too positive
for this to happen. We shall indeed make this assumption, as if it were
not to hold, all that happens is that we would get a corner solution
(either output of 0, or output going infinite), which is neither realistic
nor interesting.
Now, if the producer were risk neutral, then $u'(y_1(x)) = u'(y_2(x))$,
and the first-order condition (4.9) would simplify to (using
super-indexes of 0 to indicate the risk neutral solution)
or
$$d(x^0) + d'(x^0)x^0 - c'(x^0) = -[(1-p)\varepsilon_1 + p\varepsilon_2] = 0$$
So in the end, the risk neutral solution is nothing more than the
condition that marginal revenue $d(x^0) + d'(x^0)x^0$ be equal to marginal
cost $c'(x^0)$. Of course, this is also the solution when no risk exists
($\varepsilon_1 = \varepsilon_2 = 0$).
Now, let's substitute that solution $x^0$ into the first-order condition
for the risk averse producer. If the sign of the first-order condition
becomes negative, we would then know that $x^* < x^0$ (draw a quick
graph of a concave function to help you see why).
When we use the condition for $x^0$, it turns out that
$$\frac{\partial Eu(\tilde{y})}{\partial x} = (1-p)\,u'(y_1(x^0))\,\varepsilon_1 + p\,u'(y_2(x^0))\,\varepsilon_2 \tag{4.10}$$
Now, since our assumption is $\varepsilon_1 > 0 > \varepsilon_2$, it also happens that for
any $x$ we have $y_1(x) = (d(x) + \varepsilon_1)x - c(x) > y_2(x) = (d(x) + \varepsilon_2)x - c(x)$.
Thus $y_1(x^0) > y_2(x^0)$, and since utility is concave, this implies that
$u'(y_1(x^0)) < u'(y_2(x^0))$. We can use this in equation (4.10) as follows:
and since $\varepsilon_1 > 0 > \varepsilon_2$, for any $x > 0$ we have $y_1(x) > y_2(x)$. Thus
the feasible set that we are looking for is located below the certainty
line in $(y_1, y_2)$ space. When $x = 0$, $y_1(x) = y_2(x) = 0$, so the feasible
set does contain the origin of $(y_1, y_2)$ space. Furthermore, we have
$y_1(x) - y_2(x) = (\varepsilon_1 - \varepsilon_2)x$, which is larger the larger is $x$. Thus, as $x$
grows, the upper frontier of the feasible set gets further and further
away from the certainty line.
Consider how $y_i(x)$ changes with $x$:
$$y_i''(x) = 2d'(x) + d''(x)x - c''(x) < 0$$
So, $y_i(x)$ is (under the assumptions made on $d(x)$ and $c(x)$) a concave
function of $x$, and so it has a maximum. This means that the feasible
set that we are looking for must be bounded, since neither $y_1$ nor $y_2$
can exceed their maximum values. If you like, say the maximum value
of $y_i$ is denoted by $y_i^{max}$, then we can draw in our $(y_1, y_2)$ space a
rectangle with sides of $y_1^{max}$ and $y_2^{max}$, and the feasible set that we are
looking for must be everywhere contained within that rectangle.
Denote by $x_i$ the value of $x$ that maximises $y_i(x)$. Under our
assumptions on $d(x)$ and $c(x)$, we have
where we have used the fact that $2d'(x_i) + d''(x_i)x_i - c''(x_i) < 0$ from
the second-order condition for our main maximisation problem. The
fact that $\frac{dx_i}{d\varepsilon_i} > 0$ indicates that as $\varepsilon$ increases, so does the value of $x$
that maximises $y(x)$. So it turns out that $x_1 > x_2$.
Now, all of this indicates that we know that for any x < x2 both y1
and y2 are increasing with x. Thus the frontier of the feasible set over
this range of values of x must be positively sloped. Since it started at
y1 = y2 = 0, and since it both lies below, and slopes away from, the
certainty line, over this range of values of x the frontier of the feasible
set must be an increasing function of slope less than 1. But then, when
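[Figure: the feasible set in (y1, y2) space, starting at the origin (x = 0), with the points x = x2 (where y2 reaches its maximum) and x = x1 (where y1 reaches its maximum) marked on its frontier.]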
The only really relevant section of the feasible set is the negatively
sloped part. This is because the indifference curves in $(y_1, y_2)$ space
are negatively sloped, and so when we maximise utility on this feasible
set the optimal point must turn out to be on the negatively sloped
section, that is, we know that whatever value of $x$ maximises utility,
it must satisfy $x_2 < x^* < x_1$. The solution is shown in Figure 4.8,
where the indifference curve is tangent to the feasible set boundary.
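[Figure 4.8: the risk averse and risk neutral solutions on the frontier of the feasible set, showing the certainty line and the risk neutral solution x0.]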
Figure 4.8 also shows the risk neutral solution, $x^0$. The risk neutral
indifference curves are straight lines with slope $-\frac{1-p}{p}$, which, of course,
is the slope of the indifference curves of the risk averse problem as they
cross the certainty line. It is this property that leads to the risk averse
solution, $x^*$, locating to the north-west of the risk neutral solution $x^0$.
If you imagine the risk averse indifference curve going through $x^0$, it
would have to be less steep than the risk neutral indifference curve at
that point. Thus the risk averse problem must find its maximum at a
smaller value of $x$.
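The ordering of the solutions can be verified numerically. The sketch below assumes a linear demand d(x) = A - Bx, a quadratic cost c(x) = COST * x^2, a mean-zero additive price shock, and constant absolute risk aversion utility u(y) = -exp(-RHO * y). None of these functional forms or numbers comes from the text; they are chosen only so that the assumptions on d(x) and c(x) hold.

```python
import math

A, B, COST, RHO = 10.0, 1.0, 0.5, 0.1   # hypothetical demand, cost, risk aversion
P, EPS1, EPS2 = 0.5, 2.0, -2.0          # state 2 probability and the two shocks


def profit(x, eps):
    """y_i(x) = (d(x) + eps_i) x - c(x)."""
    return (A - B * x + eps) * x - COST * x ** 2


def marginal_profit(x, eps):
    """y_i'(x) for the linear demand and quadratic cost assumed above."""
    return A - 2 * B * x + eps - 2 * COST * x


def eu_foc(x):
    """(1 - p) u'(y1) y1' + p u'(y2) y2' with u'(y) = RHO * exp(-RHO * y)."""
    mu1 = RHO * math.exp(-RHO * profit(x, EPS1))
    mu2 = RHO * math.exp(-RHO * profit(x, EPS2))
    return ((1 - P) * mu1 * marginal_profit(x, EPS1)
            + P * mu2 * marginal_profit(x, EPS2))


x1 = (A + EPS1) / (2 * (B + COST))   # maximises y1
x2 = (A + EPS2) / (2 * (B + COST))   # maximises y2
x0 = (A + (1 - P) * EPS1 + P * EPS2) / (2 * (B + COST))  # risk neutral output

lo, hi = x2, x1                      # the optimum lies between x2 and x1
while hi - lo > 1e-10:
    mid = 0.5 * (lo + hi)
    if eu_foc(mid) > 0:
        lo = mid
    else:
        hi = mid
x_star = 0.5 * (lo + hi)
```

Bisection is valid here because expected utility is concave in x under these assumptions, so its derivative changes sign only once between x2 and x1.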
Notice that the first term on the right-hand side of this is positive,
and the second term is negative. So this could indeed be positive,
negative or equal to zero. Expected utility over this interval is concave
under the assumption that the newsboy is risk averse, $u'' < 0$ (you
can check this by calculating the second derivative and checking that
it is negative). In short, the optimal number of newspapers to order
in, $x^*$, must satisfy
[Figures: three panels plotting expected utility Eu(x) against the order quantity x, with x2 and x1 marked on the horizontal axis.]
q(1 p)
x = x 2
c
Summary
In this chapter you should have learned the following:
1. The basic theory of choice under risk can be applied to many
specific questions, relating to consumers, investors, savers and
producers (to name a few).
2. The stock market, where shares in companies are traded,
provides a mechanism under which individuals can organise their
holdings of risk. In the model analysed in this chapter, there
was no price risk, only profit risk, and our investor was able to
spread his portfolio over companies with different (and risky)
profit outcomes. The main thing to note in this model is how its
solution conforms, almost exactly, with the kind of solution that
we get in any standard consumer theory model: the equilibrium
is at the point at which the indifference curve is tangent to a
budget line.
3. The classic model of transactions involving risk is found in the
insurance market. Here we have analysed the insurance decision
of an insurance consumer, that is, the demand for insurance.
We saw that insurance demand will always involve full
coverage whenever the premium is marginally fair, and so long as
there is no fixed-cost element to the premium that exceeds the
insurance consumer's risk premium. When the premium is no
longer marginally fair, we get partial coverage. In these cases it
also happens that, if absolute risk aversion is decreasing with
wealth, as the insurance consumer gets wealthier, less coverage
is demanded.
4. When more than one period is brought into the analysis, a
decision maker has the opportunity to pass money from one
period to the next in the form of savings. When it happens that
there are risks in the second period, then it may be optimal to
save from the first to the second period in order to mitigate
the effects of second period risks. Such a strategy is known
as precautionary savings. We showed that, when the second
period risk is upon income, then the decision maker will be a
precautionary saver only if she is prudent (i.e., if her marginal
utility is convex). When the second period risk is upon the
interest rate, prudence alone is a necessary but not sufficient
condition for precautionary savings. The sufficient condition is
that prudence must be sufficiently high.
Problems
1. Draw a graph of a situation of a strictly risk averse monopolist
insurer. Comment on the differences between this situation and
that of a risk neutral insurer.
2. Assume a model of a risk-neutral monopolistic insurer. What is
the expected profit that this insurer extracts from a risk averse
individual with a loss of $L$ that occurs with probability $p$? Would
the insurer prefer to insure a risk with higher or lower $p$?
3. An individual with strictly increasing and concave utility has a
lottery that pays $x_1$ with probability $1 - p$ and $x_2$ with
probability $p$. Assume $0 < x_2 < x_1$. The individual can insure his
lottery with a perfectly competitive insurer. Write the equation
for the increase in expected utility that the individual receives
under the optimal insurance demand. This increase in expected
utility is a function of the probability $p$, so write the expected
utility increase as $H(p)$. Evaluate the concavity or convexity of
$H(p)$ and find the value of $p$ that would maximise $H(p)$.
4. Assume an individual with wealth $w_0$, and a risk on that wealth.
The risk is that with probability $p$, a fraction of the wealth
is lost. Assume that an insurer offers coverage such that if the
loss occurs, the indemnity paid to the individual is $C$ (which is
(c) Now draw the locus of optimal points, one for each value of
q between the two extremes of the previous two questions.
Locate (graphically) the premium that corresponds to the
maximal expected prot of the insurer.
From now on we shall consider how risk and uncertainty can be dealt
with in a somewhat more general equilibrium setting. It will, however,
be a very simple general equilibrium, with only two economic agents
present at all times. We begin (in this chapter) with an analysis of risk
sharing between the two individuals under an assumption of perfect
information (all that is relevant to the situation is fully known by
both players), and then later (chapters 6 and 7) we shall consider
what happens when we relax the perfect information assumption.
Here then, we retain the contingent claims environment of the
previous chapter, but we adapt our graphical presentation to include
two individuals, both of whom are assumed to be strictly risk averse
(unless otherwise stated). The natural way to do this is by using the
well-known Edgeworth box diagram of intermediate microeconomics.
This will be the principal graphical tool used throughout this
chapter. Again, the only good present in the model is money. The
initial endowment of individual $i$ is given by the vector
$\bar{w}^i = (\bar{w}_1^i, \bar{w}_2^i)$
for $i = 1, 2$. Here, $w_j^i$ represents the wealth of individual $i$ in state of
nature $j$, and if $\bar{w}_1^i \neq \bar{w}_2^i$ then individual $i$ has a risky endowment.
In what follows, we assume that state 2 is the unfavourable state for
both individuals, that is, $\bar{w}_2^i < \bar{w}_1^i$ for both $i = 1, 2$. This assumption
implies that total (or aggregate) wealth in state 2, $W_2 = \bar{w}_2^1 + \bar{w}_2^2$, is
strictly less than total (or aggregate) wealth in state 1, $W_1 = \bar{w}_1^1 + \bar{w}_1^2$.
Such a situation, $W_2 < W_1$, is known as characterising aggregate risk.
When there exists aggregate risk, the Edgeworth box is longer than
it is tall (see Figure 5.1).
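[Figure 5.1: the Edgeworth box under aggregate risk, with origins O1 and O2, axes w1 and w2 for each individual, the certainty lines C1 and C2, the endowment point, and the indifference curves Eu1(w) and Eu2(w).]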
individual without decreasing the utility of the other). The fact that
both of the individuals are assumed to be risk averse has an important
consequence in the graph. It implies that at all interior points on it,
the contract curve must lie strictly between the two certainty lines,
without ever touching either. The reason for this is quite simple to
see. We know that the slope of an indifference curve where it passes
through the corresponding certainty line is equal to $-\frac{(1-p)}{p}$, where of
course, $p$ is the probability that state 2 occurs. The strict convexity of
the indifference curves of individual 1 then implies that it is impossible
that an indifference curve of individual 1 has slope equal to $-\frac{(1-p)}{p}$ at
the point at which it passes through the certainty line of individual 2
(and so the two indifference curves can never be tangent to each other
at a point on $C_2$). After all, the indifference curves of individual 1 take
the particular slope $-\frac{(1-p)}{p}$ only at points on $C_1$, and they never take
that slope again anywhere else. In the same way, the strict concavity
of the indifference curves of individual 2 (with respect to the origin
$O_1$), and the fact that their slope is equal to $-\frac{(1-p)}{p}$ at points on $C_2$
imply that these curves can never have slope equal to $-\frac{(1-p)}{p}$ at points
on $C_1$. So the contract curve cannot ever touch either certainty line.
To see that in fact the contract curve lies between the two certainty
lines, consider one indifference curve in particular, $Eu_2(x) = U$. This
curve must cut both certainty lines, but at the point at which it cuts
$C_1$ it must be less steep than it is at the point at which it cuts $C_2$.
When it cuts $C_1$, it is flatter than the indifference curve of individual 1
at that same point. But a similar argument suffices to show that where
this particular indifference curve cuts $C_2$, it must be steeper than the
indifference curve of individual 1 passing through that same point. So
as we move along that indifference curve of individual 2, the marginal
rate of substitution of individual 1 is less than that of individual 2
at $C_1$ (remember that the MRS are negative numbers), and greater
than that of individual 2 at $C_2$. By the intermediate value theorem,
at some point between the two certainty lines, the two marginal rates
of substitution must be equal.
The logic of why the contract curve cannot touch either certainty
line (at interior points in the Edgeworth box) is also easy to
appreciate. Since both individuals are strictly risk averse, it cannot be efficient
for only one of them to accept all of the risk.
There are a couple of straightforward exceptions to the rule that
the contract curve cannot touch the certainty lines, but they must
Let's consider this aspect of the contract curve before moving on.
We know that the contract curve is formed by all the points at
which the two marginal rates of substitution are equated. Since the
marginal rate of substitution of individual $i$ is
$$MRS_i = -\frac{(1-p)\,u_i'(w_1^i)}{p\,u_i'(w_2^i)} \quad \text{for } i = 1, 2$$
it can be seen that the contract curve is the set of points such that
$w_j^1 + w_j^2 = W_j$ for $j = 1, 2$, that satisfy
$$\frac{(1-p)\,u_1'(w_1^1)}{p\,u_1'(w_2^1)} = \frac{(1-p)\,u_2'(w_1^2)}{p\,u_2'(w_2^2)}$$
Note that the probabilities cancel, so that the contract curve satisfies
$$u_1'(w_1^1)\,u_2'(w_2^2) = u_1'(w_2^1)\,u_2'(w_1^2) \tag{5.1}$$
Curiously then, the position and slope of the contract curve are
independent of the probabilities of the states of nature.1
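The independence from the probabilities can be seen numerically. The sketch below assumes that both individuals have constant relative risk aversion utility, with marginal utility w^(-gamma), and finds by bisection the state 2 wealth of individual 1 that equates the two marginal rates of substitution for a given state 1 wealth. The utility functions, function names, risk aversion parameters and aggregate wealth levels are all hypothetical.

```python
def mrs(p, gamma, w_state1, w_state2):
    """MRS = -(1 - p) u'(w1) / (p u'(w2)) with marginal utility w**(-gamma)."""
    return -(1 - p) * w_state1 ** (-gamma) / (p * w_state2 ** (-gamma))


def contract_curve_w2(w1, W1, W2, p, gamma1=2.0, gamma2=0.5, tol=1e-12):
    """State 2 wealth of individual 1 at which the two MRS are equal."""
    lo, hi = 1e-9, W2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The gap MRS1 - MRS2 is decreasing in individual 1's state 2 wealth.
        gap = mrs(p, gamma1, w1, mid) - mrs(p, gamma2, W1 - w1, W2 - mid)
        if gap > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


W1, W2, w1 = 10.0, 6.0, 5.0
w2_low_p = contract_curve_w2(w1, W1, W2, p=0.1)
w2_high_p = contract_curve_w2(w1, W1, W2, p=0.7)
```

The two solutions coincide because the common factor (1 - p)/p multiplies both marginal rates of substitution and so never changes the sign of their difference.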
Using (5.1) we can easily see what happens along the borders of
the box, and in particular, what happens at the origins. Take, for
example, the lower axis of the Edgeworth box, that is, the axis that
sets individual 1's state 2 wealth equal to 0 (and thus individual 2's
state 2 wealth equal to $W_2$). The contract curve must establish exactly
how state 1 wealth should be shared when state 2 wealth is shared in
this way. When we look at interior points close to the lower axis, the
contract curve will involve points at which the two marginal rates of
substitution are equal, and that contract curve must touch the lower
axis at some point. Let us call that point $w_1^1 = a$, as is the case for
contract curve a shown in Figure 5.2.
If point a is not the origin $O_1$ (as in curve a in Figure 5.2), then the
contract curve will actually follow the lower border of the Edgeworth
box until it reaches $O_1$, but those points will be corner solutions rather
than tangency solutions between indifference curves. What is more
interesting are the cases like contract curve b in Figure 5.2, where
there are no corner solutions, and the contract curve converges to the
origin. We might wonder when this kind of contract curve eventuates.
1
Obviously, this is true only when both individuals share the same probabilities
(i.e., the case is one of risk, rather than uncertainty).
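[Figure 5.2: the Edgeworth box with origins O1 and O2, the certainty lines C1 and C2, and two possible contract curves: curve a, which meets the lower axis at the point a, and curve b, which converges to the origin.]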
So, conditional upon $0 < u_1'(0) < \infty$, it becomes impossible that
the contract curve converges to the origin $O_1$, since the marginal
rates of substitution are different at that point. Following an identical
argument, conditional upon $0 < u_2'(0) < \infty$, it turns out that the
contract curve also cannot converge to the origin $O_2$. Given this, let's
think a bit more about the condition, $0 < u_i'(0) < \infty$, that marginal
utility with zero wealth is positive and finite.
Given that we are assuming the utility function to be concave,
we know that the marginal utility is decreasing. But if it is also to
be positive for positive wealth levels, we must directly eliminate the
option that $u_i'(0) = 0$. So, as far as the relationship between the
contract curve and the origins of the box is concerned, the only case
that we need to consider is the possibility that2 $u_i'(0) = \infty$.
If $u_i'(0) = \infty$ then the contract curve converges to the origin $O_i$. To
see why, assume that individual 1's utility function is characterised by
$u_1'(0) = \infty$. Now, what is the value of $w_1^1$ that corresponds to a Pareto
efficient allocation in which $w_2^1 = 0$ and $w_2^2 = W_2$? Graphically, the
question is, conditional upon being on the lower axis of the box, which
point is Pareto efficient? Let the value of $w_1^1$ that we are searching for
be denoted by $w_1^1 = a$. Whatever is the value of $a$, it is necessarily true
that $a \le W_1 - W_2$, since we know that the point in question can never
lie beneath the certainty line of individual 2 (recall, the contract curve
always lies strictly between the two certainty lines), and this certainty
line touches the lower axis at the point $w_1^1 = W_1 - W_2$.
Now, at the relevant point we get
$$\left.\frac{u_1'(w_1^1)}{u_1'(w_2^1)}\right|_{w_1^1 = a \le W_1 - W_2,\ w_2^1 = 0} = \frac{u_1'(a)}{u_1'(0)}$$
$$\frac{u_2'(W_1 - a)}{u_2'(W_2)} > 0$$
2
This should really be written as $\lim_{w \to 0} u_i'(w) = \infty$, but no confusion at all
will result from the simpler expression used in the text.
So the only possible option is that, when $u_1'(0) = \infty$ we must set
$a = 0$, in which case the contract curve converges3 to the origin $O_1$.
Naturally, the very same argument reveals that when $u_2'(0) = \infty$ the
contract curve must converge to the origin $O_2$. Of course, both could
occur simultaneously.
In order to say more about the contract curve, we need to look
at its slope. We will now do this, limiting ourselves to interior points
only. Also, since any point on the contract curve is fully defined by
the coordinates of individual 1's consumption, and since we know
that whatever is individual 1's consumption, individual 2 consumes
the rest of the wealth in each state, we can simplify our notation by
eliminating the need to continue with the super-indexes that indicate
which individual is which. Thus, we now use simply $w_i^1 = w_i$ and
$w_i^2 = W_i - w_i$ for $i = 1, 2$.
We begin with the equation that defines the contract curve itself,
equation (5.1), which now reads as follows
$$u_1'(w_1)\,u_2'(W_2 - w_2) = u_1'(w_2)\,u_2'(W_1 - w_1)$$
$$h(w_1, w_2) \equiv \mathrm{Ln}(u_1'(w_1)) + \mathrm{Ln}(u_2'(W_2 - w_2)) - \mathrm{Ln}(u_1'(w_2)) - \mathrm{Ln}(u_2'(W_1 - w_1)) \tag{5.2}$$
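From the implicit function theorem, we get the slope of the
contract curve as
$$\left.\frac{dw_2}{dw_1}\right|_{dh(\cdot)=0} = -\frac{\partial h(\cdot)/\partial w_1}{\partial h(\cdot)/\partial w_2}$$
so long as $\frac{\partial h(\cdot)}{\partial w_2} \neq 0$. However, it is evident that
$$\frac{\partial\,\mathrm{Ln}(u'(w))}{\partial w} = \frac{u''(w)}{u'(w)} = -R^a(w)$$
where $R^a(w)$ is the Arrow-Pratt measure of absolute risk aversion.
Carrying out the suggested derivatives we get4
$$\frac{\partial h(\cdot)}{\partial w_1} = -R_1^a(w_1) - R_2^a(W_1 - w_1) < 0$$
$$\frac{\partial h(\cdot)}{\partial w_2} = R_2^a(W_2 - w_2) + R_1^a(w_2) > 0$$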
Finally then, we find that the slope of the contract curve at any
interior point is
$$\left.\frac{dw_2}{dw_1}\right|_{dh(\cdot)=0} = \left.\frac{dw_2}{dw_1}\right|_{cc} = \frac{R_1^a(w_1) + R_2^a(W_1 - w_1)}{R_1^a(w_2) + R_2^a(W_2 - w_2)} \tag{5.3}$$
1. Since both individuals are strictly risk averse, $R_i^a > 0$ for $i = 1, 2$,
the contract curve has strictly positive slope at all interior
points.
2. Since the contract curve lies between the two certainty lines, we
have $w_1 > w_2$ and $W_1 - w_1 > W_2 - w_2$, and so the slope of the
contract curve is less than 1 if both individuals have decreasing
absolute risk aversion, greater than 1 if both individuals have
increasing absolute risk aversion, and equal to 1 if both individuals
have constant absolute risk aversion.
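The two observations above can be checked numerically. The following is a minimal sketch, not part of the original text: the three absolute risk aversion functions and the wealth figures are hypothetical, and the point (w1, w2) is chosen only to satisfy the inequalities in item 2 rather than being solved from the contract curve equation.

```python
def contract_curve_slope(R1, R2, w1, w2, W1, W2):
    """Slope dw2/dw1 from equation (5.3), evaluated at the point (w1, w2).

    R1 and R2 are callables giving each individual's absolute risk aversion.
    """
    numerator = R1(w1) + R2(W1 - w1)
    denominator = R1(w2) + R2(W2 - w2)
    return numerator / denominator


# Hypothetical absolute risk aversion functions (illustrative assumptions).
def constant_ara(w):
    return 0.5          # constant absolute risk aversion


def decreasing_ara(w):
    return 2.0 / w      # decreasing absolute risk aversion


def increasing_ara(w):
    return 0.1 * w      # increasing absolute risk aversion


# Hypothetical aggregate wealth and a point with w1 > w2, W1 - w1 > W2 - w2.
W1, W2 = 10.0, 6.0
w1, w2 = 5.0, 3.5

slope_cara = contract_curve_slope(constant_ara, constant_ara, w1, w2, W1, W2)
slope_dara = contract_curve_slope(decreasing_ara, decreasing_ara, w1, w2, W1, W2)
slope_iara = contract_curve_slope(increasing_ara, increasing_ara, w1, w2, W1, W2)

print(slope_cara, slope_dara, slope_iara)
```

In each case the slope is positive, and it is below, above or equal to 1 according to whether absolute risk aversion is decreasing, increasing or constant.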
\[ w_1 = k_1(W_1) \quad\text{and}\quad w_2 = k_2(W_2) \]
\[ w_i = \alpha W_i \quad\text{for } i = 1, 2 \]
5
This does in fact occur in some royalty type contracts. The author is often
paid a larger share of the revenue when the revenue is large. We shall consider
exactly this example later on.
5.2. Constant proportional risk sharing 125
But in this case, since $\alpha = w_1 / W_1$, it turns out that the contract
stipulates an allocation that must satisfy
\[ w_2 = \alpha W_2 = \frac{w_1}{W_1} W_2 = \frac{W_2}{W_1} w_1 \]
Clearly, this is a point on the diagonal of the Edgeworth box. Given
this, our task is to ask whether such a point can ever be the result of
an efficient risk sharing arrangement. This requires thinking about
intersections between the contract curve and the diagonal passing
through the Edgeworth box.
It is easy to see that, quite in general, the contract curve must have
at least one point that coincides with the diagonal of the Edgeworth
box. This is trivial when $u_i'(0) = \infty$ for some $i = 1, 2$, since in that case
we know that the contract curve touches the diagonal at least at the
origin $O_i$. For the case in which $u_i'(0) < \infty$ for both $i = 1, 2$, we need
only recall that the contract curve cannot touch either certainty line.
But since it also cannot pass through the origins, it must start out
below the diagonal and end up above it. Thus there must be at least
one internal point at which the contract curve crosses the diagonal.
So, we know that independently of the exact situation, there must
always exist at least one point such that a constant proportional risk
sharing contract is Pareto efficient. However, this is quite different from
a statement to the effect that such a point will always be efficient.
Indeed, for a very large set of logical cases, there is a single point
of intersection between the contract curve and the diagonal. Simply
using (5.3), we can see that if absolute risk aversion for each individual
is non-decreasing, the contract curve must have a slope that is greater
than or equal to 1, and so in all of those cases there can be only a
single intersection with the diagonal of the box.
Other cases are also relatively easy to spot. For example, if we
define $\lambda \equiv W_2 / W_1 < 1$, then a contract on the diagonal of the Edgeworth
box is $w_2 = \lambda w_1$. Now, there can be only a single intersection between
the contract curve and the diagonal if, at any such intersection the
contract curve has greater slope than the diagonal. Using (5.3) this
requires that at the point in question
\[ R_1^a(w_1) + R_2^a(W_1 - w_1) > \lambda\left[R_1^a(\lambda w_1) + R_2^a(\lambda(W_1 - w_1))\right] \tag{5.4} \]
Now, define a function
\[ f(\lambda) \equiv \lambda\left[R_1^a(\lambda w_1) + R_2^a(\lambda(W_1 - w_1))\right] \]
126 5. Perfect information
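Note that equation (5.4) is equivalent to $f(1) > f(\lambda)$, and since $\lambda < 1$,
it is, therefore, sufficient that $f'(\lambda) > 0$. Since
\[ f'(\lambda) = R_1^a(\lambda w_1) + R_2^a(\lambda(W_1 - w_1)) + \lambda\left[w_1 R_1^{a\prime}(\lambda w_1) + (W_1 - w_1) R_2^{a\prime}(\lambda(W_1 - w_1))\right] \]
we require
\[ R_1^a(\lambda w_1) + R_2^a(\lambda(W_1 - w_1)) + \lambda\left[w_1 R_1^{a\prime}(\lambda w_1) + (W_1 - w_1) R_2^{a\prime}(\lambda(W_1 - w_1))\right] > 0 \]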
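The comparison of f(1) with f(λ) can be illustrated with a short numerical sketch. It assumes that λ denotes the ratio W2/W1 and that f is the right-hand side of (5.4) viewed as a function of λ; the risk aversion functions and wealth values are hypothetical.

```python
def f(lam, R1, R2, w1, W1):
    """Right-hand side of (5.4) as a function of lam (assumed to be W2 / W1)."""
    return lam * (R1(lam * w1) + R2(lam * (W1 - w1)))


# Hypothetical risk aversion functions (illustrative assumptions).
def increasing_ara(w):
    return 0.1 + 0.05 * w   # non-decreasing absolute risk aversion


def crra_low(w):
    return 2.0 / w          # constant relative risk aversion equal to 2


def crra_high(w):
    return 3.0 / w          # constant relative risk aversion equal to 3


W1, w1, lam = 10.0, 5.0, 0.6

# Non-decreasing absolute risk aversion: f(1) > f(lam), so (5.4) holds.
f_one_iara = f(1.0, increasing_ara, increasing_ara, w1, W1)
f_lam_iara = f(lam, increasing_ara, increasing_ara, w1, W1)

# Constant relative risk aversion: f does not vary with lam, so the
# sufficient condition f' > 0 fails and the case needs separate treatment.
f_one_crra = f(1.0, crra_low, crra_high, w1, W1)
f_lam_crra = f(lam, crra_low, crra_high, w1, W1)

print(f_one_iara, f_lam_iara, f_one_crra, f_lam_crra)
```

The constant relative risk aversion case is exactly the one in which the derivative condition gives no answer, because R(λw) = R(w)/λ makes f constant in λ.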
[Figure: Edgeworth box with origins O1 and O2, certainty lines C1 and C2, and axes w1 and w2.]
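\[ u(w) = \frac{w^{1-R}}{1-R} \]
\[ u'(w) = \frac{1}{w^{R}} \tag{5.6} \]
constant $R_i$ gives
\[ \ln\left[\left(\frac{1}{w_1}\right)^{R_1}\right] + \ln\left[\left(\frac{1}{W_2 - w_2}\right)^{R_2}\right] = \ln\left[\left(\frac{1}{w_2}\right)^{R_1}\right] + \ln\left[\left(\frac{1}{W_1 - w_1}\right)^{R_2}\right] \]
that is,
\[ R_1 \ge R_2 \;\Longrightarrow\; \frac{w_1}{w_2} \le \frac{W_1 - w_1}{W_2 - w_2} \]
or cross-multiplying the second inequality, we get
\[ R_1 \ge R_2 \;\Longrightarrow\; \frac{w_2}{w_1} \ge \frac{W_2 - w_2}{W_1 - w_1} \]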
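But any point on the diagonal line in the Edgeworth box is defined
by $w_2 = \lambda w_1$, that is, the diagonal line satisfies
\[ \frac{w_2}{w_1} = \frac{\lambda w_1}{w_1} = \lambda \quad\text{and}\quad \frac{W_2 - w_2}{W_1 - w_1} = \frac{\lambda W_1 - \lambda w_1}{W_1 - w_1} = \lambda \]
Points on the diagonal therefore satisfy
\[ \frac{w_2}{w_1} = \lambda = \frac{W_2 - w_2}{W_1 - w_1} \]
points above the diagonal satisfy
\[ \frac{w_2}{w_1} > \lambda > \frac{W_2 - w_2}{W_1 - w_1} \]
and points below the diagonal satisfy
\[ \frac{w_2}{w_1} < \lambda < \frac{W_2 - w_2}{W_1 - w_1} \]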
In short, then, it turns out that if $R_1 > R_2$, then all points on
the contract curve must satisfy
\[ \frac{w_2}{w_1} > \frac{W_2 - w_2}{W_1 - w_1} \]
and so in this case the contract curve must lie above the diagonal line
at all internal points. Likewise, if $R_2 > R_1$ then the contract curve
must lie below the diagonal line at all internal points, and if $R_1 = R_2$
then the contract curve coincides with the diagonal line.
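The position of the contract curve relative to the diagonal can be verified numerically. The sketch below assumes constant relative risk aversion utilities with marginal utility $w^{-R_i}$, so that the contract curve satisfies R1·ln(w1/w2) = R2·ln((W1 − w1)/(W2 − w2)), and solves for w2 by bisection. The function name and the parameter values are hypothetical.

```python
import math


def crra_contract_curve_w2(w1, W1, W2, R1, R2, tol=1e-12):
    """Solve R1*ln(w1/w2) = R2*ln((W1-w1)/(W2-w2)) for w2 by bisection.

    The left side minus the right side is strictly decreasing in w2 on
    (0, W2), so there is exactly one root.
    """
    def gap(w2):
        return R1 * math.log(w1 / w2) - R2 * math.log((W1 - w1) / (W2 - w2))

    low, high = 1e-12, W2 - 1e-12
    while high - low > tol:
        middle = 0.5 * (low + high)
        if gap(middle) > 0:
            low = middle      # root lies above the midpoint
        else:
            high = middle     # root lies at or below the midpoint
    return 0.5 * (low + high)


# Hypothetical aggregate wealth and a state-1 allocation for individual 1.
W1, W2, w1 = 10.0, 6.0, 5.0
diagonal_w2 = (W2 / W1) * w1

w2_above = crra_contract_curve_w2(w1, W1, W2, R1=3.0, R2=1.0)   # R1 > R2
w2_on = crra_contract_curve_w2(w1, W1, W2, R1=2.0, R2=2.0)      # R1 = R2
w2_below = crra_contract_curve_w2(w1, W1, W2, R1=1.0, R2=3.0)   # R1 < R2

print(diagonal_w2, w2_above, w2_on, w2_below)
```

With these numbers the solved point lies above, on, or below the diagonal exactly as the three cases in the text indicate.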
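[Figure 5.4: Contract curves in the Edgeworth box under constant relative risk aversion for the three cases R1 > R2, R1 = R2 and R1 < R2, with origins O1 and O2, certainty lines C1 and C2, and axes w1 and w2.]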
In Figure 5.4 we show the three options. Note that the contract
curve bends towards the certainty line of the more risk averse
individual. Thus the less risk averse of the two is insuring the position
of the more risk averse, by accepting a larger share of the risk in any
efficient contract. Note also that, in the case of two individuals with
constant relative risk aversion, if one is more risk averse than the other,
then there are no internal points on the diagonal that are Pareto
efficient, and so in this rather likely case it is impossible that a contract
with constant proportional risk sharing be Pareto efficient. Finally, as a
limit case, if one of the individuals is risk neutral ($R = 0$), then the
contract curve will coincide entirely with the certainty line of the other.
5.3. Increases in aggregate wealth
\[ \frac{\partial w_i}{\partial W_j} = 0 \quad\text{for } j \neq i \]
\[ \max_{w^1, w^2} \; k_1 Eu_1(w^1) + k_2 Eu_2(w^2) \quad\text{subject to}\quad w_i^1 + w_i^2 \le W_i, \quad i = 1, 2 \]
These equations all imply that the two multipliers are strictly
positive, thus as expected the solution allocates all the wealth in each
state, and so $w_1^2 = W_1 - w_1^1$ and $w_2^2 = W_2 - w_2^1$. This allows us to go
back to our original notation, so we can express the state contingent
wealths of the two individuals as $w_1^1 = w_1$, $w_1^2 = W_1 - w_1$, $w_2^1 = w_2$,
and $w_2^2 = W_2 - w_2$. Using this, the first-order conditions can be more
easily combined and expressed as
\[ k_1 u_1'(w_1) = k_2 u_2'(W_1 - w_1) \tag{5.8} \]
\[ k_1 u_1'(w_2) = k_2 u_2'(W_2 - w_2) \tag{5.9} \]
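Now, differentiate this with respect to $W_1$, to get
\[ \left(-R_1^a(w_1)\right)\frac{\partial w_1}{\partial W_1} + R_2^a(W_2 - w_2)\frac{\partial w_2}{\partial W_1} = \left(-R_1^a(w_2)\right)\frac{\partial w_2}{\partial W_1} + R_2^a(W_1 - w_1)\left(\frac{\partial w_1}{\partial W_1} - 1\right) \]
\[ \frac{\partial w_1}{\partial W_1} = \frac{T_1(w_1)}{T_2(W_1 - w_1) + T_1(w_1)} \]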
Summary
In this chapter, you should have learned the following.
some risk. That is, the contract curve cannot touch the certainty
line of either individual at any strictly interior point of the box.
4. There are a great many cases in which the contract curve touches
the diagonal of the Edgeworth box at only one interior point.
In those cases, since the diagonal describes all allocations that
involve constant proportional risk sharing, it is unlikely that
constant proportional risk sharing will be consistently chosen as
the equilibrium allocation.
5. Only when both individuals have constant and equal relative
risk aversion does the contract curve coincide with the diagonal
of the box, and so only in this case can we guarantee that
a constant proportional risk sharing contract will be optimal.
Specifically, if the two individuals have constant but different
relative risk aversion, then a constant proportional risk sharing
contract will never be optimal.
6. Aggregate risk will be shared between the two according to a
sharing rule that depends upon absolute risk tolerances. Specifically,
individual i will retain a proportion of any increase in
aggregate wealth in state 1 that is equal to that individual's risk
tolerance in state 1 divided by the sum of the two individuals'
risk tolerances in state 1. A similar result holds for how increases
in state 2 wealth are shared.
7. In particular, the way an increase in state i aggregate wealth
is shared is independent of how much wealth is available for
sharing in state j.
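The sharing rule in points 6 and 7 can be illustrated with a small sketch. It assumes, purely for illustration, that both individuals have constant absolute risk aversion utilities of the form u_i(w) = −exp(−a_i w), so that risk tolerance is T_i = 1/a_i and the first-order condition k1·u1'(w) = k2·u2'(W − w) has a closed-form solution. The function names and parameter values are hypothetical.

```python
import math


def share_of_increase(T1, T2):
    """Proportion of an increase in aggregate wealth retained by individual 1."""
    return T1 / (T1 + T2)


def cara_allocation(W, a1, a2, k1=1.0, k2=1.0):
    """Individual 1's wealth solving k1*a1*exp(-a1*w) = k2*a2*exp(-a2*(W - w))."""
    return (a2 * W + math.log((k1 * a1) / (k2 * a2))) / (a1 + a2)


# Hypothetical coefficients of absolute risk aversion.
a1, a2 = 2.0, 0.5
T1, T2 = 1.0 / a1, 1.0 / a2

# Finite-difference estimate of how individual 1's wealth responds to an
# increase in aggregate wealth in a given state.
W, h = 10.0, 1e-3
numeric_share = (cara_allocation(W + h, a1, a2) - cara_allocation(W, a1, a2)) / h
rule_share = share_of_increase(T1, T2)

print(numeric_share, rule_share)
```

Because the allocation in one state depends only on aggregate wealth in that state, the computed share is unaffected by wealth in the other state.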
Problems
1. Assume that the two agents have different probability beliefs
regarding the probability of the two states of nature. Specifically,
assume that individual 1 believes that the probability of state
2 is $p_1$ while individual 2 believes it to be $p_2$, and, of course,
$p_1 \neq p_2$. Each individual is fully informed of the probability
belief of the other.
(a) Write out the equation that describes the contract curve,
and evaluate its slope.
(b) Does it still hold true that the contract curve cannot touch
the certainty lines of either of the two individuals at an
interior point? Explain why or why not.
Asymmetric information:
Adverse selection
In the preceding chapter there were two active agents in the model,
an insurer and an insured individual. Efficient risk sharing in the
insurance model was dependent upon the assumption that both active
economic agents have exactly the same information. Above all, we
assumed that both agreed on the value of the probabilities of the
states of nature. It is relatively simple to see that, if the two agents
had different beliefs as to exactly what is the value of the probability of
each state of nature, then nothing important changes, so long as each
knows the probability belief of the other. But again, having different
beliefs, but where each agent is fully informed of the beliefs of the
other, is not really an extension to the model since it retains the
symmetric information nature of the set-up. But what happens when
at least one agent is totally ignorant of the probability belief of the
other? Such a situation is known as asymmetric information.
6.1. Preliminary comments 139
In all that follows, we are only interested in what people know and
do not know, and not when there is disagreement as to true values.
For example, if you are convinced that team A will win on Saturday
with probability one half, and your friend is equally convinced that
the probability of team A winning is only one quarter, and you are
both informed as to the probability assessment of the other, we cannot
speak of a case of asymmetric information. Everything that is relevant
is known by all concerned. Thus asymmetric information as we shall
study it involves situations in which at least one party is totally
uninformed as to some relevant data point.
A perfect information setting is one in which all economic agents
are fully informed of all relevant parameter values. On the other hand,
imperfect information is a situation in which at least one agent is
uninformed of the value of at least one relevant data point. If it turns
out that both individuals are uninformed of the values of the same
data, then we have a case of imperfect but symmetric information.
Note that imperfect but symmetric information does not necessarily
arise when there exists uncertainty as to a relevant data value, and
both individuals estimate the probability density that they think
should correspond to the unknown data. It depends on whether or
not each individual knows the other's probability belief. Thus a model
of pure uncertainty (or risk) is not generally a setting of imperfect
information.
When imperfect information exists, it is possible that the two
agents differ in what they each know. Such a scenario is a case of
asymmetric information, and that is what we are interested in here. In
order that things are as simple as possible, we shall only be considering
here very simple asymmetric information problems, in which one agent
is fully informed, and the other is informed of all relevant data except
for one specific value.1
The model that will be used throughout this chapter is the Edgeworth
box, although we will not be drawing the axes corresponding
to person 2 (the top and right-hand side axes). In that way, a point
on our graphs will represent the allocation from person 1's point of
view (where person 1's origin is the origin of the graph). Person 1's
more preferred allocations are to the north-east, while person 2's more
preferred allocations are to the south-west. Our convention will always
1 Cases in which the uninformed party is uninformed as to more than one data
point, or when both parties are uninformed as to something, but where what is
unknown to one is known to the other, are possible but unnecessarily complicated.
140 6. Adverse selection
On the other hand, say the salesman exerts low effort in which case
it is likely that he does not sell much, and yet he may simply have
a lucky day anyway and manage to make good sales in spite of his
laziness.3 The point is that the result obtained becomes an imperfect
signal for the level of effort used, and we have a legitimate situation
of asymmetric information whose solution is no longer trivial.
In a situation of asymmetric information, the objective of the
principal is to choose a set of contracts to offer such that the agent's
best choice (or best response) among these contracts reveals the
information that the principal lacks at the outset. In this chapter and the
next, we consider the two basic problems in turn, to see how they are
solved. However, before working through the principal-agent model
proper, it is worthwhile to take a look at a couple of ideas related to
adverse selection in risk-free situations.
willing to pay, at most, the expected value of the car, and so the price
must also satisfy $p \le (1 - q)v_1 + q v_2$.
But now think about the two options for the true value of the
car. Say the car is really of high quality, and so its true value (known
to the seller, unknown to the buyer) is actually $v_1$. But clearly it is
always true that $(1 - q)v_1 + q v_2 < v_1$, and so combining all of our
inequalities we see that we are looking for a price that satisfies
fee in order to sell the car along with the certificate, the seller
would now require the buyer to pay $v_1 + m$, but the buyer is still
only getting a vehicle of value $v_1$, and so would not be willing
to pay any more than that. So the seller cannot afford to pay
the mechanic. You should go through the case when the buyer
pays the mechanic yourself, but the same outcome happens.
Signalling
In a rather provocative and very influential paper in the early 1970s,
Michael Spence considered the relationship between employers and
employees when the latter are better informed of their underlying
value to the firm (e.g., their productivity) than are the former.
Specifically, say there are two types of workers, those with value $v_a$ and those
with value $v_b$, where $v_a > v_b$. It is known that a proportion of all
workers are of type a. Now, this is exactly the same type of adverse
selection problem as in the Akerlof used car market: under perfect
information (and assuming the employer is perfectly competitive, that
is, the employer must earn zero profits) the firm would like to pay each
worker their value, but under asymmetric information (the firm cannot
observe the worker's type) the firm is afraid of paying the wage $v_a$ as
it might be accepted by a type-b worker, thereby generating negative
profits. So without any further mechanisms in place, the high wage
cannot be offered, and the upshot is that only the type-b workers are
employed.
However, Spence recognised that the value of a worker to a firm
might very well be highly correlated with the worker's abilities in other
areas of life. Specifically, Spence considered the possibility that high
value workers might also be more capable students in an education
6.2. Adverse selection without risk 145
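[Figure: wage w plotted against education e, showing the wage levels va and vb, the education cost lines cb e and ca e, and the threshold education level e*.]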
$v_b$. On the other hand, type-a workers will invest in exactly the level
of education $e_a = e^*$, and will be employed at the wage $v_a$. No worker
type has an incentive to alter his investment in education, and the
employer's beliefs on who is who are confirmed at this equilibrium. In
this way the signal has allowed the employer to sort the two types of
workers into their correct wage categories, even though their
underlying value was not observable.
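The logic of the separating equilibrium can be sketched in a few lines. The sketch assumes, as an illustration only, that education costs are linear, c_a·e for type-a workers and c_b·e for type-b workers with c_a < c_b, and that a worker with education of at least e* is paid v_a while any other worker is paid v_b. The function names and numbers are hypothetical.

```python
def separating_education_range(v_a, v_b, c_a, c_b):
    """Lowest and highest threshold e* that separates the two worker types.

    Assumes linear education costs c_a * e and c_b * e with c_a < c_b.
    """
    wage_gain = v_a - v_b
    return wage_gain / c_b, wage_gain / c_a


def is_separating(e_star, v_a, v_b, c_a, c_b):
    """True if type a chooses e* and type b chooses no education."""
    type_a_signals = v_a - c_a * e_star >= v_b    # signalling pays for type a
    type_b_abstains = v_b >= v_a - c_b * e_star   # mimicking does not pay for type b
    return type_a_signals and type_b_abstains


# Hypothetical worker values and marginal education costs.
v_a, v_b, c_a, c_b = 10.0, 6.0, 1.0, 2.0

lowest, highest = separating_education_range(v_a, v_b, c_a, c_b)
print(lowest, highest)
print(is_separating(3.0, v_a, v_b, c_a, c_b))
print(is_separating(1.0, v_a, v_b, c_a, c_b))
print(is_separating(5.0, v_a, v_b, c_a, c_b))
```

A threshold that is too low invites type-b workers to mimic, and one that is too high makes the signal unprofitable even for type-a workers.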
Both the Akerlof and Spence models are set in risk-free
environments. That means that there is always another way to resolve
the issue: we only need to pay at the end of the game rather than
proportion of the population that each type represents, but she does
not directly observe a particular agent's true type).
The proposed relationship will be carried out under risk, with
two states of nature. As always, we consider state 1 to be the better
state in the sense that whatever is the type of agent concerned, the
relationship yields a greater result for the principal in state 1 than in
state 2. Concretely, we assume that if state i occurs, then the contract
yields a payment for the principal of $x_i$, where $x_1 > x_2$, and where we
understand the variable $x$ as a monetary amount.
The driving assumption in the model is that the underlying
differences between the two types of agent result in them being
differentiated only by the probability with which the states of nature occur.
If the principal contracts with a type i agent, then the probability of
state 2 is $p_i$, for $i = 1, 2$. We assume that $p_1 < p_2$, so that type-1
agents are "better" than type-2 agents, since type-1 agents manage
to generate the better payoff for the principal (the more favourable
state of nature) with a greater probability.
In this set-up, a contract offered by the principal consists of a
vector of two numbers that indicate what the agent will receive in
each of the two possible states of nature, once the state has been
realised and the payoff to the principal ($x$) has been received. Thus a
contract is a vector $w = (w_1, w_2)$, where the agent is paid the wage $w_i$
when the principal receives the result $x_i$ for $i = 1, 2$. Notice that the
contract shares the outcome of each state of nature between the two
parties, the agent getting $w_i$ and the principal getting $x_i - w_i$. Thus
the contract is both a way to remunerate and provide incentives to
the agent, and to share risk.
The principal may offer more than one contract, and allow each
agent to choose between the contracts on offer. In general, we say
that the set of contracts offered by a principal is the contract menu.
However, it is very important to note that, since the principal cannot
distinguish between different agent types, she must offer exactly the
same contract menu to all agents, thus allowing all agents exactly the
same choice.
Note that since there are only two types of agent, at most the
principal will include only two different contracts in the contract menu.
The reason is clear; if she were to include more than two contracts
in the menu, all but two will certainly be ignored by both types of
agent. That is, since all agents of a given type are exactly identical,
what appeals to one will appeal to all of them in the same way. So all
type-1 agents will prefer the same contract, and all type-2 agents will
also coincide as to which contract is the most preferred, although the
type-2 agents may choose a different contract to the type-1s. So, there
is never anything to be gained by offering more than two different
contracts in the menu (although it may be useful to offer a single
contract in the menu, something which we can interpret as a special
case of offering two contracts: it is offering two contracts that are
equal). Our objective is to find the coordinates of the two optimal
contracts, which we shall denote by $w^1$ and $w^2$ respectively, without
requiring that they necessarily be different.
If it turns out that all agents, irrespective of their type, choose
the same contract, then we say that they have been pooled, and we
speak of a pooling equilibrium. On the other hand, if type-1 agents
choose a different contract to type-2 agents, then we say that the
agents have been separated, and we talk of a separating equilibrium.
This second situation (separating equilibria) is much more interesting,
since it implies that the principal will be able to perfectly infer an
agent's type by the choice of contract that he makes, and so contract
choice is a perfect signal for agent type. For this reason, separating
equilibria are also often known as self-selecting equilibria.
Since the principal is the contractor in the relationship, we can
assume that she is some type of business person, and we shall assign
her an objective function that is expected (monetary) profit. Thus we
are assuming that the principal is risk neutral. On the other hand,
the agents are the contracted parties (e.g., workers, or employees in
general) and so will be assigned an objective function that is expected
utility, where their utility function, $u(w)$, is an increasing concave
function of money; $u'(w) > 0$ and $u''(w) < 0$.
If the principal contracts with a type i agent, her expected profit
is
\[ E_i(x - w) = (1 - p_i)(x_1 - w_1) + p_i(x_2 - w_2) = E_i x - (1 - p_i)w_1 - p_i w_2 \]
while the expected utility of a type i agent is
\[ E_i u(w) = (1 - p_i)u(w_1) + p_i u(w_2) \quad i = 1, 2 \]
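[Figure: indifference curves E1 u and E2 u of the two agent types in the contract space (w1, w2), with the certainty line w1 = w2 and the preference direction.]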
Now, the assumption that $p_1 < p_2$ tells us that at any given point $w$
in the contract space the indifference curve of a type-1 agent is steeper
than the indifference curve of a type-2 agent at the same point. To
see this, we simply need to differentiate the slope of the indifference
curve, $MRS_i(w) = -\frac{(1 - p_i)u'(w_1)}{p_i u'(w_2)}$, with respect to $p_i$,
\[ \frac{\partial MRS_i}{\partial p_i} = \frac{(u'(w_1) p_i u'(w_2)) + (1 - p_i)u'(w_1)u'(w_2)}{(p_i u'(w_2))^2} \]
\[ = \frac{u'(w_1)u'(w_2)}{(p_i u'(w_2))^2}\,(p_i + (1 - p_i)) \]
\[ = \frac{u'(w_1)}{p_i^2\, u'(w_2)} > 0 \]
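The sign of this derivative can be confirmed numerically. The sketch below assumes that the marginal rate of substitution is the (negative) slope of the indifference curve, −(1 − p)u'(w1)/(p u'(w2)), and uses a square-root utility function as a hypothetical example; the contract point and probabilities are also illustrative.

```python
import math


def marginal_utility(w):
    """Marginal utility for the hypothetical utility function u(w) = sqrt(w)."""
    return 0.5 / math.sqrt(w)


def mrs(p, w1, w2):
    """Slope of an indifference curve at (w1, w2) when state 2 has probability p."""
    return -((1 - p) * marginal_utility(w1)) / (p * marginal_utility(w2))


def dmrs_dp(p, w1, w2):
    """Analytic derivative of the slope with respect to p."""
    return marginal_utility(w1) / (p ** 2 * marginal_utility(w2))


# Hypothetical probabilities of state 2 and a contract point.
p1, p2 = 0.2, 0.5
w1, w2 = 9.0, 4.0

step = 1e-6
numeric = (mrs(p1 + step, w1, w2) - mrs(p1 - step, w1, w2)) / (2 * step)
analytic = dmrs_dp(p1, w1, w2)

print(numeric, analytic)
print(abs(mrs(p1, w1, w2)), abs(mrs(p2, w1, w2)))
```

The slope becomes less negative as p rises, so the type with the lower probability of state 2 has the steeper indifference curve at any given point.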
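[Figure 6.3: expected profit lines E1(x - w) and E2(x - w) of the principal in the contract space (w1, w2), with the certainty line w1 = w2, the point w0 and the preference direction.]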
curves of the principal are less steep when she contracts with a type-2
agent than when she contracts with a type-1 agent. In Figure 6.3 the
indifference curves of a principal (expected profit lines) are shown for
the cases of contracts signed with each type of agent.
Let the endowed, or reservation, utility of a type i agent be denoted
by $\bar{u}_i$, and assume that the reservation utility of the principal is 0. We
also assume that, since type-1 agents are in some way more prone
to generate the good state of nature, they also have greater reservation
utility than type-2 agents; $\bar{u}_2 < \bar{u}_1$. This assumption implies that
the reservation utility indifference curve of a type-1 agent cuts the
certainty line in the space of contracts above the point at which the
reservation indifference curve of a type-2 agent cuts it. It also implies
that the two reservation utility indifference curves intersect each other
at a point characterised by $w_1 > w_2$.
A contract $w$ will attract a type i agent voluntarily, and will be
voluntarily offered by the principal conditional upon being accepted
by a type i agent, if it satisfies the conditions
\[ (1 - p_i)u(w_1) + p_i u(w_2) \ge \bar{u}_i \]
\[ E_i x - (1 - p_i)w_1 - p_i w_2 \ge 0 \]
\[ E_i u(w^i) \ge E_i u(w^j) \quad i, j = 1, 2 \]
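These conditions are easy to check for any proposed menu. The following sketch assumes a square-root utility function, and the probabilities, outcomes, reservation utilities and the two contracts are all hypothetical values chosen only to show the participation and incentive compatibility checks at work.

```python
import math


def expected_utility(p, w, u=math.sqrt):
    """Expected utility of contract w = (w1, w2) when state 2 has probability p."""
    return (1 - p) * u(w[0]) + p * u(w[1])


def expected_profit(p, x, w):
    """Principal's expected profit from contract w with outcomes x = (x1, x2)."""
    return (1 - p) * (x[0] - w[0]) + p * (x[1] - w[1])


def participates(p, w, reservation):
    """Participation condition of an agent with the given reservation utility."""
    return expected_utility(p, w) >= reservation


def prefers_own(p, own, other):
    """Incentive compatibility: the agent weakly prefers his own contract."""
    return expected_utility(p, own) >= expected_utility(p, other)


# Hypothetical primitives.
p1, p2 = 0.2, 0.5
x = (10.0, 4.0)
reservation_1, reservation_2 = 2.5, 1.9
contract_1 = (9.0, 1.0)     # intended for type-1 agents
contract_2 = (4.41, 4.41)   # intended for type-2 agents (risk free)

menu_is_separating = (
    participates(p1, contract_1, reservation_1)
    and participates(p2, contract_2, reservation_2)
    and prefers_own(p1, contract_1, contract_2)
    and prefers_own(p2, contract_2, contract_1)
)

profit_1 = expected_profit(p1, x, contract_1)
profit_2 = expected_profit(p2, x, contract_2)
print(menu_is_separating, profit_1, profit_2)
```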
Perfect competition
If the principal acts in a perfectly competitive environment, she is
restricted to earning a non-positive profit, but since her participation
condition requires that the expected profit also be non-negative, we
have the result that the expected profit must be exactly equal to 0. In
this case, efficiency demands that the principal searches for the two
contracts $w^1$ and $w^2$, where the first is designed for type-1 agents and
the second is designed for type-2 agents, that respectively maximise
the expected utility of the two types of agent, subject to the condition
that she earns an expected profit of 0, and subject to the participation
and incentive compatibility constraints of the two types of agent. That
6.3. Principal-agent setting 155
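subject to
\[ \lambda E_1(x - w^1) + (1 - \lambda)E_2(x - w^2) = 0 \tag{6.1} \]
where $\lambda$ is the proportion of type-1 agents in the population, and
\[ (1 - p_1)u(w_1^1) + p_1 u(w_2^1) \ge \bar{u}_1 \]
\[ (1 - p_2)u(w_1^2) + p_2 u(w_2^2) \ge \bar{u}_2 \]
\[ (1 - p_1)u(w_1^1) + p_1 u(w_2^1) \ge (1 - p_1)u(w_1^2) + p_1 u(w_2^2) \]
\[ (1 - p_2)u(w_1^2) + p_2 u(w_2^2) \ge (1 - p_2)u(w_1^1) + p_2 u(w_2^1) \]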
This is clearly a complex and large problem, with four choice
variables (two components of the two wage vectors), and five restrictions
(implying five Lagrange multipliers). In all, if we solve the problem
using the Lagrange method, we would have to handle a system of nine
simultaneous equations in nine unknowns. Fortunately, it is far easier
to carry out a graphical analysis.
To begin with, we have the following result:
Result 6.1: Whatever is the solution to an adverse selection problem
under perfect competition, it is characterised by $w^1 \neq w^2$.
Result 6.1 indicates that it is impossible for the solution to involve
a single contract for both types of agent, that is, there will never be
a pooling equilibrium. To see why, assume that this were not true,
that is, assume that we can have a solution with $w^1 = w^2 = w$, and
define $q \equiv \lambda p_1 + (1 - \lambda)p_2$. Now, if the solution were to imply a single
contract for both types of agent, then to satisfy (6.1), we require
\[ \lambda\left[E_1 x - (1 - p_1)w_1 - p_1 w_2\right] = -(1 - \lambda)\left[E_2 x - (1 - p_2)w_1 - p_2 w_2\right] \]
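[Figure 6.4: a proposed pooling contract w in the contract space (w1, w2), with the certainty line w1 = w2, the point w0, and the zero expected profit lines E2(x - w) = 0, E(x - w) = 0 and E1(x - w) = 0.]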
The situation has been drawn in Figure 6.4. The proposed equilibrium
contract is the point $w$ where the two indifference curves intersect
(the steepest indifference curve at that point corresponds to a type-1
agent).
Now, note that this graph implies that we can always design a
new contract, located below the indifference curve of the type-2 agent
and above the indifference curve of the type-1 agent (so that it would
be accepted only by type-1 agents), and yet that offers the principal
a strictly positive expected profit. For example, in Figure 6.4 any
contract located on the line $E(x - w) = 0$ a little below the point $w$
would be sufficient. But since all principals have the same incentive
to offer this new contract given that (by assumption) the others are
all offering the point $w$, we cannot have a Nash equilibrium at $w$.
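[Figure: contract space (w1, w2) with the certainty line w1 = w2, the points A and C, and the line E(x - w) = 0.]
[Figure: contract space (w1, w2) with the certainty line w1 = w2, the contract w1, and the lines E2(x - w) = 0 and E1(x - w) = 0.]
[Figure: contract space (w1, w2) with the certainty line w1 = w2, the contract w1, and the lines E2(x - w) = 0, E(x - w) = 0 and E1(x - w) = 0.]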
A monopolistic principal
When the principal is a monopolist, her objective is to maximise
expected profit. Naturally, when there is only one principal, we can
safely ignore all the arguments in the previous sub-section based
on "rebel" contracts that take one or another type of agent from the
rest of the principals. In the monopoly problem, the principal need
only search for the two contracts that maximise her expected profit
(conditional upon that expected profit being non-negative) subject to
the participation and incentive compatibility constraints of both types
of agent. Indeed, since we know from the previous problem (perfect
competition) that it is always possible for the principal to offer two
contracts that give her an expected profit of 0, we can in fact also
ignore the participation constraint of the principal (the restriction
that in the solution to the problem her expected profit must be
non-negative), since at least one contract menu exists that achieves this
objective. So we know that whatever is the solution to the expected
profit maximising problem, it can never end up giving a negative
expected profit. Thus, the problem can be formulated as
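\[ \max_{w^1, w^2} \; \lambda\left[E_1 x - (1 - p_1)w_1^1 - p_1 w_2^1\right] + (1 - \lambda)\left[E_2 x - (1 - p_2)w_1^2 - p_2 w_2^2\right] \]
subject to
\[ (1 - p_1)u(w_1^1) + p_1 u(w_2^1) \ge \bar{u}_1 \tag{6.2} \]
\[ (1 - p_2)u(w_1^2) + p_2 u(w_2^2) \ge \bar{u}_2 \tag{6.3} \]
\[ (1 - p_1)u(w_1^1) + p_1 u(w_2^1) \ge (1 - p_1)u(w_1^2) + p_1 u(w_2^2) \tag{6.4} \]
\[ (1 - p_2)u(w_1^2) + p_2 u(w_2^2) \ge (1 - p_2)u(w_1^1) + p_2 u(w_2^1) \tag{6.5} \]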
Again, this is a rather large problem, with four variables and
four restrictions which implies four multipliers. A full mathematical
treatment of the problem would require analysing the simultaneous
solution to eight equations in eight unknowns. However, using some
easy graphical analysis, we can reduce the problem down to an
equivalent one with only two equations in two unknowns. Let's see how.
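[Figure: contract space (w1, w2) with the certainty line w1 = w2, the type-2 contract w2 on the certainty line, the type-1 contract w1(w2), and the indifference curves E2 u(w) = u(w2) and E1 u(w) = u1.]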
the type-2 contract. Directly, she will lose some expected profit on the
type-2s, but it also has the effect of pushing the type-2 indifference
curve upwards, and forcing the optimal contract of the type-1 agents
upwards around the type-1 reservation utility indifference curve. This
implies an increase in the expected profit that is earned on the type-1
contract. So the principal will increase the payment to the type-2s
until the marginal loss she suffers on that contract exactly equals the
marginal gain she gets back on the type-1 contract. In general, then,
it is certainly not true that we should conclude that the principal will
keep the type-2 agents on their reservation utility indifference curve,
as she would in a symmetric information problem. We shall now go
on to look at this in a little more detail, but in order to simplify the
notation, from now on we use the variable $w$ to represent the wage
that is paid to the type-2 agents (the same in each state of nature),
and $w_i$ to represent the wage of the type-1 agents in state i.
To begin with, note that so long as the principal sets $w$ at a
level that is less than the certainty equivalent wealth of type-1 agents
(the point where their reservation utility indifference curve cuts the
certainty axis), then we know that the type-1 incentive compatibility
condition cannot bind, and that the type-1 optimal contract must
be characterised by $w_1 > w_2$. In the following, we shall make use
of the general result that, outside of a very extreme case (which we
will consider), the type-1 agent incentive compatibility condition will
never bind, and so is irrelevant to the problem and can be ignored.
Now, we know that in all cases the type-1 agent participation
condition binds, as does the type-2 agent incentive compatibility
condition. Formally, these two ideas are written as
\[ (1 - p_1)u(w_1) + p_1 u(w_2) = \bar{u}_1 \]
\[ (1 - p_2)u(w_1) + p_2 u(w_2) = u(w) \]
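Because the two binding conditions are linear in the utility levels u(w1) and u(w2), the type-1 contract can be recovered directly for any wage w paid to the type-2 agents. The sketch below assumes a square-root utility function (so that the inverse is the square, valid for non-negative utility levels); the function name, probabilities and reservation utility are hypothetical.

```python
import math


def type1_contract(w, reservation_1, p1, p2):
    """Solve the two binding conditions for the type-1 contract (w1, w2).

    Assumes u(w) = sqrt(w). The conditions are linear in the utility levels
    U1 = u(w1) and U2 = u(w2), with determinant p2 - p1.
    """
    target = math.sqrt(w)   # utility of the risk-free type-2 wage
    U1 = (p2 * reservation_1 - p1 * target) / (p2 - p1)
    U2 = ((1 - p1) * target - (1 - p2) * reservation_1) / (p2 - p1)
    return U1 ** 2, U2 ** 2


# Hypothetical primitives.
p1, p2, reservation_1 = 0.2, 0.5, 2.6

w1_low, w2_low = type1_contract(4.0, reservation_1, p1, p2)
w1_high, w2_high = type1_contract(4.41, reservation_1, p1, p2)

print(w1_low, w2_low)
print(w1_high, w2_high)
```

Raising the type-2 wage lowers the state-1 wage and raises the state-2 wage of the type-1 contract, moving it along the type-1 reservation indifference curve towards the certainty line.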
subject to
\[ u(w) \ge \bar{u}_2 \]
If we write the restriction as
\[ g(w) \equiv -u(w) + \bar{u}_2 \le 0 \]
then we can use the Lagrange method, so long as the objective function
is concave in the choice variable $w$, since the equation that defines
the restriction, $g(w)$, is convex by the assumption of concavity of the
utility function.
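The first derivative of the objective function with respect to $w$ is
\[ f'(w) = -\lambda\left[(1 - p_1)\frac{\partial w_1}{\partial w} + p_1\frac{\partial w_2}{\partial w}\right] - (1 - \lambda) \]
\[ = -\lambda\left[(1 - p_1)\left(\frac{-p_1}{p_2 - p_1}\right)\frac{u'(w)}{u'(w_1)} + p_1\left(\frac{1 - p_1}{p_2 - p_1}\right)\frac{u'(w)}{u'(w_2)}\right] - (1 - \lambda) \]
\[ = -\lambda\,\frac{(1 - p_1)p_1}{p_2 - p_1}\,u'(w)\left[u'(w_2)^{-1} - u'(w_1)^{-1}\right] - (1 - \lambda) \tag{6.8} \]
where we have used (6.6) and (6.7). The second derivative is
\[ f''(w) = -\lambda\,\frac{(1 - p_1)p_1}{p_2 - p_1}\,u''(w)\left[u'(w_2)^{-1} - u'(w_1)^{-1}\right] - \lambda\,\frac{(1 - p_1)p_1}{p_2 - p_1}\,u'(w)\,\frac{\partial\left[u'(w_2)^{-1} - u'(w_1)^{-1}\right]}{\partial w} \]
But since
\[ \frac{\partial\left[u'(w_2)^{-1} - u'(w_1)^{-1}\right]}{\partial w} = -u'(w_2)^{-2}u''(w_2)\frac{\partial w_2}{\partial w} + u'(w_1)^{-2}u''(w_1)\frac{\partial w_1}{\partial w} > 0 \]
the second term of the second derivative is certainly negative, and we
only need concern ourselves with the first term. The first term of the
second derivative is not positive if
\[ u'(w_2)^{-1} - u'(w_1)^{-1} \le 0 \]
that is, if
\[ u'(w_2) \ge u'(w_1) \;\Longrightarrow\; w_2 \le w_1 \]
However, as will be shown below, since this will be true in all possible
cases, it is indeed true that the objective function is concave in $w$
and we can solve the principal's simplified problem using traditional
maximisation techniques.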
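The Lagrangean for the problem is
\[ L(w, \mu) = \lambda E_1 x + (1 - \lambda)E_2 x - \lambda\left[(1 - p_1)w_1(w) + p_1 w_2(w)\right] - (1 - \lambda)w + \mu\left[-\bar{u}_2 + u(w)\right] \]
\[ \mu\left[-\bar{u}_2 + u(w)\right] = 0 \]
\[ u'(w_2)^{-1} - u'(w_1)^{-1} \ge 0 \]
that is,
\[ w_2 \ge w_1 \]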
However, since it is never feasible to have an equilibrium with $w_2 > w_1$,
this case must correspond to $w_2 = w_1$, and so the type-1 contract
is located on the certainty axis. But since the type-1 agent contract
is also located on the indifference curve of the type-2 agents, we now
know that when $\lambda = 1$ the equilibrium is pooling with $w_2 = w_1 = w$.
Of course, this is not at all surprising: if there are no type-2 agents
(which is basically what $\lambda = 1$ indicates) then the principal needs only
to deal with the type-1 agents in an expected profit maximising way.
Really, when $\lambda = 1$ there is no problem of asymmetric information.
Furthermore, in any other case ($\lambda < 1$) it must necessarily be true
that $w_2 < w_1$, and so the equilibrium will be separating. To see this,
just apply the implicit function theorem to the first-order condition
\[ \frac{\partial w}{\partial \lambda} = -\frac{\partial^2 L / \partial w\,\partial\lambda}{\partial^2 L / \partial w^2} \]
The sign of this is equal to the sign of the numerator as the Lagrangean
has already been shown to be concave in $w$. But since
\[ \frac{\partial^2 L}{\partial w\,\partial\lambda} = \frac{\partial^2 f}{\partial w\,\partial\lambda} \]
from (6.8) it turns out that
\[ \frac{\partial^2 L}{\partial w\,\partial\lambda} = -\frac{(1 - p_1)p_1}{p_2 - p_1}\,u'(w)\left[u'(w_2)^{-1} - u'(w_1)^{-1}\right] + 1 \]
Summary
From our analysis of the perfect competition case, we can conclude
that
Problems
1. In the model of Akerlof, of the second-hand car market, cars
were defined to be of high or low quality, without really paying
much attention to what quality actually means. Assume that
any given car can either break down or not, and that the
probability of breaking down is $p$. Good quality cars break down with
probability $p_1$ and bad quality cars break down with probability
$p_2$, where $p_1 < p_2$. For simplicity, assume that a broken down
car has value 0, and a non-broken down car has value 1. Assume
that sellers can offer their cars along with a guarantee. The
guarantee stipulates that the seller will pay the purchaser an
amount of money, $g$, should the vehicle break down. What is
the cost to a seller of each quality of car of selling with the
guarantee? Calculate the minimum size of the guarantee such
that it signals a good quality car. Describe the final (separating)
equilibrium.
Asymmetric information:
Moral hazard
and so
But since $p(e_2) - p(e_1) > 0$, it turns out that the vectors $w$ such
that $f(w) = 0$ must satisfy $u(w_2) - u(w_1) < 0$, that is, they have
a higher wage in state 1 than in state 2, $w_2 < w_1$. But then, since
the slope of the contour is nothing more than the ratio of marginal
utilities, and recalling that the utility function is concave (marginal
utility is decreasing), the fact that $w_2 < w_1$ implies that the slope of
the contour is always less than 1.
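With logarithmic utility the slope of the contour is $\frac{w_2}{w_1}$, and
deriving again with respect to $w_1$ gives
$$\frac{d^2 w_2}{d(w_1)^2} = \frac{\frac{dw_2}{dw_1} w_1 - w_2}{(w_1)^2}$$
But since in this case $\frac{dw_2}{dw_1} = \frac{w_2}{w_1}$, this reduces to
$\frac{d^2 w_2}{d(w_1)^2} = 0$.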
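So the contour under logarithmic utility is linear. What about
under constant absolute risk aversion? In general the slope of
the contour is
$$\left.\frac{dw_2}{dw_1}\right|_{df(w)=0} = \frac{u'(w_1)}{u'(w_2)}$$
Deriving again with respect to $w_1$ we get
$$\left.\frac{d^2 w_2}{d(w_1)^2}\right|_{df(w)=0} = \frac{u''(w_1)u'(w_2) - u'(w_1)u''(w_2)\frac{dw_2}{dw_1}}{u'(w_2)^2} = \frac{u''(w_1)u'(w_2) - u'(w_1)u''(w_2)\frac{u'(w_1)}{u'(w_2)}}{u'(w_2)^2}$$
Since the denominator is positive, this is negative whenever the
numerator is negative, which re-orders to
$$\frac{u''(w_1)}{u'(w_1)} < \frac{u''(w_2)}{u'(w_2)}\frac{u'(w_1)}{u'(w_2)}$$
that is,
$$R_a(w_1) > R_a(w_2)\frac{u'(w_1)}{u'(w_2)}$$
Under constant absolute risk aversion $R_a(w_1) = R_a(w_2)$, and so the
condition becomes
$$1 > \frac{u'(w_1)}{u'(w_2)}$$
which holds because $w_2 < w_1$ and marginal utility is decreasing. So
under constant absolute risk aversion the contour is concave.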
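The two curvature results can be checked numerically. The following is a minimal sketch rather than part of the original argument. It assumes that the contour f(w) = 0 can be written as u(w1) - u(w2) = k for some positive constant k, which is consistent with the slope given in the text. The value of k, the risk aversion coefficient a, and all function names are hypothetical choices made only for illustration.

```python
import math

def contour_w2_log(w1, k):
    """w2 on the contour ln(w1) - ln(w2) = k (logarithmic utility)."""
    return w1 * math.exp(-k)

def contour_w2_cara(w1, k, a):
    """w2 on the contour u(w1) - u(w2) = k with u(w) = -exp(-a*w) (CARA)."""
    return -math.log(k + math.exp(-a * w1)) / a

def slope(fn, w1, h=1e-4):
    """Central finite-difference approximation of dw2/dw1."""
    return (fn(w1 + h) - fn(w1 - h)) / (2 * h)

def curvature(fn, w1, h=1e-3):
    """Central finite-difference approximation of the second derivative."""
    return (fn(w1 + h) - 2 * fn(w1) + fn(w1 - h)) / h ** 2

# Hypothetical parameter values (assumptions, not taken from the text).
k, a, w1 = 0.1, 1.0, 2.0
log_contour = lambda w: contour_w2_log(w, k)
cara_contour = lambda w: contour_w2_cara(w, k, a)

print("log utility: slope", slope(log_contour, w1), "curvature", curvature(log_contour, w1))
print("CARA:        slope", slope(cara_contour, w1), "curvature", curvature(cara_contour, w1))
```

Under these assumptions the logarithmic contour should show a slope below 1 and a curvature that is zero up to rounding, while the CARA contour should show a slope below 1 and a strictly negative curvature.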
In Figure 7.1 we can see the curve f(w) = 0 together with two
indifference curves of the agent passing through a point on the contour.
It is important to note that the two indifference curves drawn
represent only the part of the utility function that depends on money, that
is, they are curves along which $E_e u(w)$ is constant. Clearly, since d(e)
is independent of the contingent wage vectors whatever was the choice
of e, along the curves that are drawn total utility $E_e u(w) - d(e)$ is
also constant. Since we know that an individual is indifferent between
two situations when his indifference curves for total expected utility
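[Figure 7.1: the contour f(w) = 0 and two of the agent's indifference curves, Ee1 u and Ee2 u, drawn in (w1, w2) space together with the certainty line w1 = w2]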
The agent is indifferent between the two effort levels that he can
choose between for any contract located on the curve f(w) = 0, and
so clearly the agent has a strict preference for one effort level over the
other for any point not located on that curve. In order to see exactly
what that preference is, consider a point characterised by w1 = w2 =
w, which is to the left of the curve f(w) = 0. With a risk-free wage,
the individual is indifferent as to which state of nature occurs, and
so will always choose the least costly effort level, that is, e2. So at all
points to the left of f(w) = 0 the individual prefers low effort to high
effort, and at any point to the right of f(w) = 0 the preference is for
high effort over low. The intuition is clear: the greater is the variance
that the contract offers, the more state 1 is preferred to state 2 by
the agent, and so the more reasonable it becomes that he is willing to
suffer additional costs in terms of effort to increase the probability of
occurrence of state 1.
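The agent's preference over effort levels at a given contract can be illustrated with a short sketch. It follows the text in treating p(e) as the probability of state 2, e1 as high effort and e2 as low effort. The square-root utility function, the probabilities and the effort costs below are hypothetical values chosen only so that the example can be computed.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
P_STATE2 = {"e1": 0.2, "e2": 0.6}    # p(e): probability of state 2 under each effort
DISUTILITY = {"e1": 0.5, "e2": 0.1}  # d(e): e1 is high (costly) effort, e2 is low effort

def expected_utility(effort, w1, w2, u=math.sqrt):
    """Total expected utility E_e u(w) - d(e) of the contract (w1, w2)."""
    p = P_STATE2[effort]
    return p * u(w2) + (1 - p) * u(w1) - DISUTILITY[effort]

def preferred_effort(w1, w2, tol=1e-9):
    """Return 'e1', 'e2' or 'indifferent' for the contract (w1, w2)."""
    gap = expected_utility("e1", w1, w2) - expected_utility("e2", w1, w2)
    if abs(gap) < tol:
        return "indifferent"
    return "e1" if gap > 0 else "e2"

print(preferred_effort(4, 4))  # a risk-free wage
print(preferred_effort(9, 1))  # a contract with a large gap between states
print(preferred_effort(4, 1))  # a contract on the curve f(w) = 0 for these values
```

With these assumed numbers, the contract (4, 1) satisfies u(w1) - u(w2) = 1, which is exactly the gap that makes the two effort levels equally attractive, so it lies on the curve f(w) = 0.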
Now consider the principal. Since the principal is risk neutral, her
expected profit at any contract when the agent offers effort of $e_i$ is
$$p(e_i)(x_2 - w_2) + (1 - p(e_i))(x_1 - w_1)$$
Thus, the principal is indifferent between the two effort levels if
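[Two figures: the line g(w) = 0 and the principal's expected profit lines Ee1(x - w) and Ee2(x - w) in (w1, w2) space, with the certainty line w1 = w2 and the point x1 - x2 marked on the w1 axis]
[Figure: the contour f(w) = 0 and the line Ee1(x - w) = 0 in (w1, w2) space, with the certainty line w1 = w2]
[Figure: the contour f(w) = 0, the point B and the agent's indifference curves Ee1 u and Ee2 u in (w1, w2) space, with the certainty line w1 = w2]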
contract that will be chosen is the one that lies at the intersection
of the indifference curve $E_{e_1}u(w) = \bar{u} + d(e_1)$ and the curve
$f(w) = 0$. This contract is indicated in Figure 7.6 as the point B, which
incidentally also lies on the indifference curve for low effort passing
through the optimal contract for low effort, A, due to the fact that
both are on indifference curves that represent the same reservation
level of utility, $\bar{u}$.
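[Figure: the contour f(w) = 0, the point B and the agent's reservation-utility indifference curves for high effort (Ee1 u) and low effort (Ee2 u) in (w1, w2) space, with the certainty line w1 = w2]
[Figure: the same diagram with the point A and the line g(w) = 0 added]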
Summary
To summarise the case of perfect competition, we can note the follow-
ing points:
Problems
1. Assume that a perfectly competitive principal decides to
contract low effort. Then the risk aversion of the agents increases.
Is it in the best interests of the principal to change to contracting
high effort? Explain.
2. Assume that you observe the effort (high or low) demanded
by a monopolistic principal under symmetric information. Can
you then know what effort (high or low) this principal should
contract under asymmetric information? Explain.
Appendices
Appendix A
Mathematical toolkit
necessary that the student fully understands exactly what each piece
of mathematical toolkit is actually doing. This is a far different story
from simply being able to do the maths when asked to. It is only when
you understand what the maths is doing that you will know why each
technique is useful and when.
Given the above, in this appendix we set out the basic
mathematical techniques that are used over and over again in the text. Really
there are only a very few of them, but students are well advised to
be very comfortable with each of them before moving forward into
the text proper. The tools set out in this appendix are the
following: the implicit function theorem, considerations of concavity
of functions, the Kuhn-Tucker method of constrained optimisation,
and some very basic ideas regarding probability.
[Figure A.1: a concave function f(x) in (x, y) space, with the points (x1, y1) and (x2, y2) marked]
values $y_1 = f(x_1)$ and $y_2 = f(x_2)$. The straight line that joins these
two points in (x, y) space is given by an equation of the type:
$$y = a + bx$$
where a and b are constants. Of course, given the two points $(x_1, y_1)$
and $(x_2, y_2)$, we can actually solve the two equations $y_i = a + bx_i$,
$i = 1, 2$, in the two unknowns a and b, but we do not need to actually do
that right now. All we need to note is that any value of x that lies
between $x_1$ and $x_2$ can be written as a weighted average of the two
extremes. That is, if we write $x_3 = \lambda x_1 + (1 - \lambda)x_2$, then so long as
we take $0 < \lambda < 1$, we get $\min\{x_1, x_2\} < x_3 < \max\{x_1, x_2\}$. Now,
consider the value of y that corresponds to our $x_3$ thus defined. Using
the equation for a straight line we have
$$\begin{aligned}
y_3 &= a + bx_3 \\
&= a + b[\lambda x_1 + (1 - \lambda)x_2] \\
&= [\lambda a + (1 - \lambda)a] + \lambda bx_1 + (1 - \lambda)bx_2 \\
&= \lambda(a + bx_1) + (1 - \lambda)(a + bx_2) \\
&= \lambda y_1 + (1 - \lambda)y_2
\end{aligned}$$
where (at the third step) we have used the obvious fact that
$\lambda a + (1 - \lambda)a = a$.
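The algebra above can be confirmed with a short numerical sketch. The two points and the weight used here are hypothetical values, and the function names are illustrative only. The sketch builds the line through two points and checks that the weighted average of the two points lies on it.

```python
def line_through(point_1, point_2):
    """Return the constants (a, b) of the line y = a + b*x through two points."""
    (x1, y1), (x2, y2) = point_1, point_2
    b = (y2 - y1) / (x2 - x1)
    a = y1 - b * x1
    return a, b

def weighted_point(lam, point_1, point_2):
    """Weighted average of two points, with weight lam on the first point."""
    (x1, y1), (x2, y2) = point_1, point_2
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# Hypothetical example values.
first, second = (1.0, 3.0), (4.0, 9.0)
a, b = line_through(first, second)
x3, y3 = weighted_point(0.25, first, second)

# The weighted average of the y values equals the line evaluated at x3.
print(y3, a + b * x3)
```

Any weight strictly between 0 and 1 gives a point strictly between the two extremes, and the two printed numbers coincide because the line is linear in x.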
All of this tells us that, given any two points $(x_1, y_1)$ and $(x_2, y_2)$,
then any other point on the straight line that joins them can be defined
as the point $(x_3, y_3) = (\lambda x_1 + (1 - \lambda)x_2, \lambda y_1 + (1 - \lambda)y_2)$, so long as
$0 < \lambda < 1$. Therefore, the statement that the straight line joining any
two points on a function like that shown in Figure A.1 lies beneath
the function itself can be written mathematically as follows: for any
$x_1$ and $x_2$, and for any $\lambda : 0 < \lambda < 1$, then
[Figure: an indifference curve u(x) = constant drawn in (x1, x2) space]
There are two important points to note here. First, the utility
function and indifference curve example is special, since it corresponds
to a specific assumption on the first derivatives of the function (utility
is increasing in each argument). This is what leads to decreasing
convex indifference curves. Mathematically, an indifference curve is
really just a contour or level set of the underlying function, since
it is all the vectors such that the function itself does not alter its value.
Try to draw the graphs of a contour of a concave function f(x1, x2)
that is increasing in one argument and decreasing in the other. Or a
contour of a concave function that is decreasing in both arguments.
The second important point to note is that you should never
confuse a contour with the function that generates it. In terms of
utility theory, an indifference curve is an entirely different concept to
a utility function. We are interested in the concavity of the utility
$\forall x^i, x^j \in X$ and $\forall \lambda : 0 \le \lambda \le 1$, if $x^k(\lambda) \in X$ then X is convex
$h(x^k(\lambda))$. To see this, note that for any $x^i$ and $x^j$, we can define
$\min\{h(x^i), h(x^j)\} \equiv c$. Then both $x^i$ and $x^j$ belong to the set
$X(c) \equiv \{x : h(x) \ge c\}$. But if h(x) is quasi-concave, then X(c) is a convex set,
and so $x^k(\lambda)$ must also belong to X(c), that is,
$\min\{h(x^i), h(x^j)\} = c \le h(x^k(\lambda))$.
)$h(x^j)$), but neither of the reverse affirmations is true. When the
strict inequality in any of the definitions of concavity/convexity is
used, then the corresponding characteristic is strict (e.g.,
$h(x^k(\lambda)) > \lambda h(x^i) + (1 - \lambda)h(x^j)$ implies that h(x) is strictly concave).
Now you should be able to see that the previous discussion
concerning concave utility functions and convex indifference curves can
also be framed in terms of quasi-concavity. The minimal concavity
requisite on a consumer's utility function for each indifference curve
to be a convex function in goods space is that the utility function be
quasi-concave. Naturally, this is also the minimal requirement upon a
choice problem with a convex budget set to be guaranteed to have a
unique optimal point.
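The difference between concavity and quasi-concavity can be made concrete with a small sketch. It tests both definitions on a coarse grid for a function of one variable, so it can only detect violations and is not a proof. The grid, the weights and the example function h(x) = x^3 on the interval from 0 to 2 are hypothetical choices made for illustration.

```python
import itertools
import math

GRID = [i / 4 for i in range(9)]   # the points 0, 0.25, ..., 2
WEIGHTS = [0.25, 0.5, 0.75]        # weights strictly between 0 and 1

def holds_on_grid(h, lower_bound):
    """Check h(x_k) >= lower_bound(h(x_i), h(x_j), lam) at every grid combination."""
    for xi, xj in itertools.product(GRID, repeat=2):
        for lam in WEIGHTS:
            xk = lam * xi + (1 - lam) * xj
            if h(xk) < lower_bound(h(xi), h(xj), lam) - 1e-12:
                return False
    return True

def is_concave_on_grid(h):
    """Concavity: h(x_k) is at least the weighted average of h(x_i) and h(x_j)."""
    return holds_on_grid(h, lambda hi, hj, lam: lam * hi + (1 - lam) * hj)

def is_quasi_concave_on_grid(h):
    """Quasi-concavity: h(x_k) is at least the smaller of h(x_i) and h(x_j)."""
    return holds_on_grid(h, lambda hi, hj, lam: min(hi, hj))

cube = lambda x: x ** 3

print(is_concave_on_grid(math.sqrt), is_quasi_concave_on_grid(math.sqrt))
print(is_concave_on_grid(cube), is_quasi_concave_on_grid(cube))
```

The square root passes both checks, as a concave function must. The cube passes only the quasi-concavity check, since it is increasing but convex on this interval, which illustrates that quasi-concavity does not imply concavity.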
where f(x) is increasing and concave and each $g_i(x)$ is increasing and
convex. Certainly the most familiar example is maximising the utility
of consumption subject to a budget constraint and non-negativity on
the goods in question. You may have seen problems that look very
similar to this, but where the $\le$ that appears in the restrictions is written
as an equality. There is a significant difference between problems with
inequality restrictions and problems with equality restrictions, and
here we will be interested only in the former.
4
Actually, you will see that when we study risky rather than certain environ-
ments, as is the case for the present text, we do indeed require concavity of utility
in goods, not just quasi-concavity, in order to guarantee that our decision maker
is what we call risk averse. This is shown in the next appendix.
5
We know this from the Weierstrass Theorem.
Together, equation (A.1) and the fact that $g(x^*) = b$ are two
equations in the two unknowns, $x_1^*$ and $x_2^*$, and so their simultaneous
solution gives us the solution to the initial constrained optimisation
problem.
It is worthwhile to clearly point out that, although the solution to
the above problem involves $g(x^*) = b$, this equality was not directly
assumed at any point. The underlying restriction for the problem is
$g(x^*) \le b$, and the fact that this is solved with equality rather than
with inequality appears endogenously as we solve the problem. You
should see, from the above logic, that the equality has in fact been
a direct result of the assumption that the objective function f(x) is
increasing in the elements of the vector x.
In order to resolve a more general problem, with any number of
restrictions, we cannot fall back on the simple intuition that was just
used. The reason is that, although it will always be true that at least
one restriction binds (due to the fact that the objective function is
increasing in all variables), we cannot know for sure which one or
which ones. So it is impossible to know which of the equations gi (x) =
bi are valid for obtaining the solution. In order to solve the problem,
it is convenient to transform it into a second maximisation problem
with no restrictions.
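The point that we cannot know in advance which restrictions bind can be illustrated with a brute-force sketch. It does not use the Kuhn-Tucker method itself: it simply searches a grid of feasible points for a hypothetical problem, maximising ln(1 + x1) + ln(1 + x2) subject to g1(x) = x1 + x2 <= b1 and g2(x) = x1 <= b2 with non-negative quantities, and then reports which restrictions hold with equality at the best point found. The objective, the restrictions and the values of b1 and b2 are all assumptions made for this example.

```python
import math

def solve_by_grid(b1, b2, steps_per_unit=20):
    """Grid search for the maximum of ln(1+x1) + ln(1+x2)
    subject to x1 + x2 <= b1, x1 <= b2 and x1, x2 >= 0."""
    best_value, best_x = -math.inf, None
    n = int(b1 * steps_per_unit)
    for i in range(n + 1):
        for j in range(n + 1):
            x1, x2 = i / steps_per_unit, j / steps_per_unit
            # Skip points that violate either restriction.
            if x1 + x2 > b1 + 1e-12 or x1 > b2 + 1e-12:
                continue
            value = math.log(1 + x1) + math.log(1 + x2)
            if value > best_value:
                best_value, best_x = value, (x1, x2)
    x1, x2 = best_x
    binding = []
    if abs(b1 - (x1 + x2)) < 1e-9:
        binding.append("g1")
    if abs(b2 - x1) < 1e-9:
        binding.append("g2")
    return best_x, binding

print(solve_by_grid(4, 3))  # a loose second restriction
print(solve_by_grid(4, 1))  # a tight second restriction
```

In both cases at least one restriction binds, as the text argues for an increasing objective function, but which ones bind changes with b2. This is the information that the multipliers of the Kuhn-Tucker method are designed to reveal without an exhaustive search.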
In a variant of the well-known Lagrange method of solving
problems with equality constraints,6 Harold Kuhn and Albert Tucker
proved that the solution to the general problem coincides with the
solution to the alternative problem:
$$\max_x L(x, \delta) \equiv f(x) + \sum_{i=1}^m \delta_i [b_i - g_i(x)]$$
To see the logic underlying the Lagrange method, note that since
f(x) is concave, and each $g_i(x)$ is convex (and so $-g_i(x)$ is concave),
it turns out that, with non-negative multipliers, $L(x, \delta)$ is concave
in x. Thus the global maximum of $L(x, \delta)$ is found where its
first derivatives are 0. Call the point that achieves this $x^*$. Now, by
the very definition of a global maximum, it is true that
$$f(x^*) + \sum_{i=1}^m \delta_i [b_i - g_i(x^*)] \ge f(x) + \sum_{i=1}^m \delta_i [b_i - g_i(x)] \quad \forall x$$
But since the multipliers are defined such that
$\sum_{i=1}^m \delta_i [b_i - g_i(x^*)] = 0$, it holds that
$$f(x^*) \ge f(x) + \sum_{i=1}^m \delta_i [b_i - g_i(x)] \quad \forall x$$
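$$\frac{\partial L(x, \delta)}{\partial x_i} = \frac{\partial f(x^*)}{\partial x_i} - \sum_{j=1}^m \delta_j \frac{\partial g_j(x^*)}{\partial x_i} = 0 \qquad i = 1, 2 \tag{A.3}$$
$$x_i^* = x_i^*(b, g) \qquad i = 1, 2$$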
But, from the first-order conditions (A.3), the first term of this is
exactly 0, and so we are left with
$$\frac{\partial L(\cdot)}{\partial b_k} = \sum_{j=1}^m \frac{\partial \delta_j}{\partial b_k}[b_j - g_j(x^*)] + \delta_k$$
abilities. In short,7 we can point out the result that if the set of
possible outcomes can be divided into a sufficiently large number of
independent events, then there will exist a probability measure that
represents subjective probabilities, in the sense that if A is not less
probable than B, then the corresponding probability measure assigns
numbers p(A) and p(B) such that $p(A) \ge p(B)$. However, for our
purposes, this result is not particularly useful, since we will typically
be considering simple cases with a small number of possible outcomes
(two, or at most three), that cannot be sub-divided. For us,
a simple definition of probability will suffice.
Let $\tilde{x}$ be a random variable,8 and let X be the set of values that
$\tilde{x}$ can take. Naturally, the set X cannot be empty, $X \neq \emptyset$. We shall
identify any general element of X by $x_i$, and we shall assume that
there are z different elements in X, that is, X can be thought of as
a vector with z elements; $X = (x_1, x_2, ..., x_z)$. Now, if z = 1, that is,
there is only one element in X, then we say that x is a constant (it is
deterministic). A deterministic variable is also sometimes referred to
as a degenerate random variable. On the other hand, if z > 1, then
we say that x is a random variable (it is stochastic). We use the term
lottery to describe the mechanism by which a particular element of X
is assigned to x.
When a lottery is repeated many times independently, we obtain a
list of the values that have been assigned to x in each trial. Denote by
$n_i(m)$ the number of times that the particular value $x_i$ was assigned
to x when the lottery is repeated m times. In this way we obtain the
vector $n(m) = (n_1(m), n_2(m), ..., n_z(m))$, where $\sum_{i=1}^z n_i(m) = m$.
On the other hand, we also obtain the relative frequencies of each
$x_i$ defined by the vector $r(m) = (\frac{n_1(m)}{m}, \frac{n_2(m)}{m}, ..., \frac{n_z(m)}{m})$. Of course,
the relative frequencies are numbers with the properties that, for any
given m we have $0 \le \frac{n_i(m)}{m} \le 1$ for all i, and $\sum_{i=1}^z \frac{n_i(m)}{m} = 1$. It is
important to note that the relative frequencies of X refer to the past,
while the concept of probability that we are searching for refers to the
future.
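The counts n(m) and the relative frequencies r(m) can be produced by simulating a lottery. The sketch below is only an illustration: the set of values, the weights that drive the simulated mechanism, the number of repetitions and the seed are all hypothetical, and the function name is not taken from the text.

```python
import random
from collections import Counter

def relative_frequencies(values, weights, m, seed=0):
    """Repeat a lottery over `values` m times and return (n(m), r(m))."""
    rng = random.Random(seed)                      # seeded for reproducibility
    draws = rng.choices(values, weights=weights, k=m)
    counts = Counter(draws)
    n = [counts[x] for x in values]                # n_i(m): times each x_i was drawn
    r = [n_i / m for n_i in n]                     # n_i(m) / m
    return n, r

# Hypothetical lottery with z = 3 possible values.
n, r = relative_frequencies([10, 20, 30], [0.5, 0.3, 0.2], m=1000)
print(n, sum(n))
print(r, sum(r))
```

The counts always sum to m and the relative frequencies always sum to 1 (up to rounding), whatever mechanism generated the draws. They describe trials that have already happened, which is the sense in which the text says they refer to the past.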
Now, we can use the following definition of probability: the
probability of $x_i$, denoted by $p_i$, is the belief that the individual has for
the relative frequency of $x_i$ that would be obtained if the lottery were
7
For a more detailed account, see The Foundations of Statistics, by Leonard
Savage, originally published by J. Wiley & Sons in 1954.
8
In all of the present text, all random variables (those that can take on more
than one final value) will be indicated by a curly line above the variable.
A primer on consumer
theory under certainty,
and indirect utility
$\frac{\partial u(x)}{\partial x_i} > 0$, $i = 1, 2$. Our assumption on concavity will be described,
as was established in the previous appendix, by Jensen's inequality:
$\forall x^1, x^2$ and $\forall \lambda \in (0, 1)$, $u(\lambda x^1 + (1 - \lambda)x^2) > \lambda u(x^1) + (1 - \lambda)u(x^2)$.
These assumptions imply that the indifference curves
corresponding to u(x), drawn in $(x_1, x_2)$ space, are decreasing and strictly convex,
and that indifference curves located further from the origin correspond
to greater levels of utility. From the implicit function theorem, the
slope of the indifference curve passing through any given point x at
that point is
$$\left.\frac{dx_2}{dx_1}\right|_{du(x)=0} = -\frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2}$$
the unspent wealth is $w - z > 0$. But then this unspent wealth can
be profitably allocated to the purchase of at least one of the two
goods. Say it is all allocated to good 1, then the strictly positive
additional amount $\frac{w - z}{p_1}$ of good 1 can be purchased, and since utility
is increasing in the consumption of good 1, adding this new quantity
to the consumption bundle must increase utility. Thus, in our search
for the optimal solution we need only consider points that lie on the
budget line.
Next, note that unless the optimal point is at one of the extreme
vertices of the budget set, it must correspond to a point of tangency
between the budget line and an indifference curve. If this were not
the case, then the budget line and the indifference curve would cut at
the proposed point, which implies that some part of that indifference
curve lies strictly within the budget set. In other words, the proposed
point is indifferent to some other point for which not all wealth is
spent. But since we have just shown that any point for which not all
wealth is spent can be improved upon, no point that is indifferent to
such a point can ever be optimal. Thus, outside of a corner solution,
there must be a tangency between the budget line and an indifference
curve, which is expressed as
$$\frac{\partial u(x^*)/\partial x_1}{\partial u(x^*)/\partial x_2} = \frac{p_1}{p_2}$$
$$L(x, \delta) = u(x) + \delta_1[0 + x_1] + \delta_2[0 + x_2] + \delta_3[w - p_1x_1 - p_2x_2]$$
$$\delta_i x_i^* = 0,\ i = 1, 2; \qquad \delta_3[w - p_1x_1^* - p_2x_2^*] = 0 \tag{B.4}$$
The first thing to notice is that again we can be sure that the
budget constraint will saturate, $w = p_1x_1^* + p_2x_2^*$. To see why, note
that if we can show that in any solution we always get $\delta_3 > 0$, then
directly from the third complementary slackness condition we would
know that $w = p_1x_1^* + p_2x_2^*$. Given that, let's write the first-order
conditions as:
$$\frac{\partial u(x^*)}{\partial x_i} + \delta_i = \delta_3 p_i \qquad i = 1, 2$$
and multiply this by $x_i^*$, so that from the complementary slackness
conditions we can ignore the term $\delta_i x_i^*$, and so we get
$$\frac{\partial u(x^*)}{\partial x_i}x_i^* = \delta_3 p_i x_i^* \qquad i = 1, 2$$
Now, we sum these two equations to obtain
$$\sum_{i=1}^2 \frac{\partial u(x^*)}{\partial x_i}x_i^* = \delta_3(p_1x_1^* + p_2x_2^*) \le \delta_3 w$$
If the solution is not interior, that is, one of the two quantities
$x_1^*$ or $x_2^*$ is equal to zero (recall that both cannot be zero with
positive wealth, since the solution must lie on the budget line), then
the solution will not in general be given by the tangency condition.
These types of cases, known as corner solutions, can still be easily
calculated from the tangency condition. If we denote by $\hat{x}$ the point
that does satisfy (B.8), then the optimal vector (the point $x^*$ that
simultaneously satisfies (B.3) and (B.4)) is found as
$$x^* = \begin{cases} \hat{x} & \text{if } \hat{x}_i \ge 0;\ i = 1, 2 \\ (x_i = 0,\ x_j = \frac{w}{p_j}) & \text{if } \hat{x}_i < 0;\ i, j = 1, 2;\ i \ne j \end{cases}$$
In all that follows, unless we specifically state otherwise, we shall
simply assume that the solution to the problem is interior, and so is
calculated directly from the tangency condition and the budget line.
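The rule for moving from the tangency point to a corner solution can be sketched for one specific utility function. The choice u(x) = ln(x1 + 1) + ln(x2) is a hypothetical example, not one used in the text; it is increasing and strictly concave, and its tangency point on the budget line can be negative in good 1 when wealth is low. The function names and the numbers are illustrative assumptions.

```python
import math

def utility(x1, x2):
    """Hypothetical utility function u(x) = ln(x1 + 1) + ln(x2)."""
    return math.log(x1 + 1) + math.log(x2)

def tangency_point(w, p1, p2):
    """Point on the budget line where the marginal rate of substitution,
    x2 / (x1 + 1), equals the price ratio p1 / p2."""
    return (w - p1) / (2 * p1), (w + p1) / (2 * p2)

def optimal_bundle(w, p1, p2):
    """Use the tangency point if it is non-negative, otherwise the corner."""
    x_hat = tangency_point(w, p1, p2)
    if x_hat[0] >= 0 and x_hat[1] >= 0:
        return x_hat
    if x_hat[0] < 0:
        return 0.0, w / p2   # good 1 set to zero, all wealth spent on good 2
    return w / p1, 0.0       # good 2 set to zero, all wealth spent on good 1

print(optimal_bundle(10, 2, 1))  # interior solution
print(optimal_bundle(1, 2, 1))   # corner solution: tangency would need x1 < 0
```

For this utility function the tangency quantity of good 1 is (w - p1) / (2 p1), so the corner case appears whenever wealth is below the price of good 1.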
However, in any optimal solution (i.e., both before and after the
increase in w) we know that the budget constraint must saturate (B.9),
and so we can derive this restriction with respect to w, which reveals
the result
$$p_1\frac{\partial x_1^*}{\partial w} + p_2\frac{\partial x_2^*}{\partial w} = 1$$
Substituting this into the previous equation, it turns out that
$$\frac{\partial v(p, w)}{\partial w} = \delta > 0$$
Note that this is exactly what was mentioned at the end of the
previous appendix, when optimisation was considered in general.
However, now we can clearly refer to $\delta$ as the marginal utility of wealth,
an important concept in microeconomics. Since $\delta$ is always strictly
positive, we know that an increase in wealth will always increase
utility.
Next, consider an increase in one of the prices, say $p_i$. Deriving
the indirect utility function, we get
$$\frac{\partial v(p, w)}{\partial p_i} = -\delta x_i^* < 0 \qquad i = 1, 2$$
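Both derivatives of the indirect utility function can be checked numerically for a simple case. The sketch assumes the hypothetical utility function u(x) = ln(x1) + ln(x2), for which the optimal demands are w / (2 p1) and w / (2 p2). The multiplier on the budget constraint is computed from the first-order condition for good 1 and compared with finite-difference derivatives of v(p, w). All names and numbers are illustrative assumptions.

```python
import math

def demand(p1, p2, w):
    """Optimal demands for the hypothetical utility u(x) = ln(x1) + ln(x2)."""
    return w / (2 * p1), w / (2 * p2)

def indirect_utility(p1, p2, w):
    """v(p, w): utility evaluated at the optimal demands."""
    x1, x2 = demand(p1, p2, w)
    return math.log(x1) + math.log(x2)

def budget_multiplier(p1, p2, w):
    """Multiplier on the budget constraint, from du/dx1 = multiplier * p1."""
    x1, _ = demand(p1, p2, w)
    return (1 / x1) / p1

def derivative(fn, x, h=1e-6):
    """Central finite-difference approximation of a first derivative."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

# Hypothetical prices and wealth.
p1, p2, w = 2.0, 1.0, 10.0
multiplier = budget_multiplier(p1, p2, w)
dv_dw = derivative(lambda wealth: indirect_utility(p1, p2, wealth), w)
dv_dp1 = derivative(lambda price: indirect_utility(price, p2, w), p1)

print(multiplier, dv_dw)                       # marginal utility of wealth
print(-multiplier * demand(p1, p2, w)[0], dv_dp1)  # effect of a rise in p1
```

In each printed pair the two numbers should agree up to the error of the finite-difference approximation, which is what the two results in the text state for this example.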
v(p, w) =constant
xi
pi
The final result that we should look at here is also perhaps the
most important, at least for the subject matter of the main text:
the concavity of the indirect utility function in wealth. That utility
should be concave in wealth is an often assumed characteristic, and it
is most important to the economics of risk and uncertainty. It turns
out that it is true that indirect utility is concave in wealth, but only
conditional upon the direct utility function being concave in the vector
of goods. This might not seem to be a severe restriction, as indeed it
is very often assumed that u(x) is concave in x, since among other
things this implies that the indifference curves will be convex contours.
But as we have seen in the mathematical appendix, concavity of u(x)
is by no means necessary for convexity of indifference curves. What
is required is that utility be quasi-concave in the vector of goods,
a weaker requirement than strict concavity, and one that will not
necessarily generate concavity of indirect utility in wealth. However,
that said, it is still not too much of a compromise to assume strict
concavity of u(x), and so we shall.
The result can be proved as follows. Hold prices constant, and
compare the optimal solution to the utility maximisation problem
with two different levels of wealth, say $w_1$ and $w_2$. Call the solutions
to these two problems, respectively, $x^1$ and $x^2$. Then, consider the
utility maximisation problem with wealth equal to
$\lambda w_1 + (1 - \lambda)w_2 = w_3$. Call the solution to that problem $x^3$. We know that
$p_1x_1^1 + p_2x_2^1 = w_1$ and that $p_1x_1^2 + p_2x_2^2 = w_2$. Multiplying the first of
these equations by $\lambda$ and the second by $(1 - \lambda)$, and summing them
gives
$$\lambda(p_1x_1^1 + p_2x_2^1) + (1 - \lambda)(p_1x_1^2 + p_2x_2^2) = \lambda w_1 + (1 - \lambda)w_2 = w_3$$
that is,
$$p_1[\lambda x_1^1 + (1 - \lambda)x_1^2] + p_2[\lambda x_2^1 + (1 - \lambda)x_2^2] = w_3$$