
Open Macro Economics

Roberto Rigobon
MIT

Summer 2010

Contents

1 International Asset Pricing: Discrete Time
1.1 Small Open Economies
1.1.1 Basic Production Economy under Certainty
1.1.1.1 External Accounts
1.1.2 Asset Pricing in a Small Open Economy
1.1.2.1 Risk Free Rate
1.1.2.2 Risk Neutrality
1.1.2.3 Challenging Issues
1.1.2.4 Incomplete Markets (Only Bonds and Stocks)
1.2 General Equilibrium: Two Countries, Single Good
1.2.1 Pareto Allocation
1.2.1.1 Log Utility
1.2.2 Competitive Equilibrium
1.2.2.1 Single Good Tips
1.2.3 Asset Prices
1.2.3.1 Risk Free Bond
1.2.3.2 Stock Markets
1.2.4 Log Utility Case
1.2.4.1 Pareto Allocation
1.2.4.2 Competitive Equilibrium
1.2.4.3 Interest Rate
1.2.4.4 Stock Prices
1.2.4.4.1 Veronesi et al.:
1.2.4.4.2 Cochrane-Longstaff-Santa Clara (two trees):
1.2.4.5 Research Questions
1.2.4.6 Portfolio Holdings
1.2.5 Issues with this model
1.2.6 Problem Sets
1.2.6.1 Stock markets and Output: The Bernoulli World
1.2.6.2 Non-tradables in a single good model

1.3 General Equilibrium: Two Countries, Two Goods
1.3.1 Utilities and Pareto Allocation
1.3.1.1 Terms of Trade
1.3.1.1.1 Ricardian Effect:
1.3.1.1.2 Dependent Economy Effect:
1.3.1.1.3 Wealth Transfer Effect:
1.3.1.2 Goods’ Prices and Exchange Rates
1.3.2 Competitive Equilibrium
1.3.3 Asset Prices
1.3.3.1 Issues with this model
1.4 Demand Shocks
1.4.1 Social Planner
1.4.2 Competitive Equilibrium
1.4.3 Asset Prices
1.4.4 Interest Rate

2 Introduction to Brownian Motion and Stochastic Calculus: Some Applications
2.1 Basic Continuous Time
2.1.1 Brownian Motion: Random Walk representation
2.1.1.1 Properties
2.1.1.2 Some approximations (from the Random Walk)
2.1.2 Brownian Motion: Continuous Time Representation
2.1.2.1 Itô’s lemma
2.1.2.2 Bellman Equation
2.1.2.2.1 Stationary problem:
2.1.2.2.2 Non-Stationary Problem:
2.1.2.2.3 What makes Brownian motion so special?
2.1.3 Constraints and Barriers
2.1.3.1 Absorbing Barriers
2.1.3.2 Reflecting Barriers
2.1.3.3 Resetting Barriers
2.1.3.4 Shifting Barriers
2.1.4 Distributions and paths
2.1.5 Control problem: defining optimal barriers
2.2 Applications
2.2.1 Target Zones
2.2.1.1 The Differential Equation
2.2.1.2 Boundary Conditions
2.2.1.3 Comments
2.2.2 Cochrane-Longstaff-Santa Clara
2.2.2.1 Evolution of the Share
2.2.2.2 Solving for Stock Prices
2.2.3 Problem Sets
2.2.3.1 Numerical Cochrane, Longstaff, and Santa-Clara
2.3 Sticky prices models in continuous time

3 Balance of Payment Crises in a Simple Monetary Model
3.1 Stochastic Fiscal Reform and Crises
3.2 Model without Debt Constraints: Optimal Monetary Policy
3.2.1 Environment and Consumers
3.2.2 Government

3.2.3 Central Bank
3.2.4 Optimal Monetary and Exchange Rate Policy
3.2.4.1 Flexible exchange rate
3.2.4.2 Optimal interest rate path
3.2.4.3 Solution: Formal Derivation
3.3 Model with Debt Constraints: Balance of payments crisis
3.3.1 Optimal Monetary and Exchange Rate Policy
3.3.1.1 Solution: A heuristic approach
3.3.1.2 Solution: Formal Derivation
3.4 Sequence of Stabilization Programs
3.5 Solution for the Money in the Utility model

4 Identification in Macroeconomics: Problem
4.1 Problems and Biases
4.1.1 Simultaneous equations
4.1.2 Omitted variables
4.1.3 Error-in-variables
4.2 Lack of Identification
4.2.1 General set-up
4.3 Standard solutions
4.3.1 Parameter Restrictions
4.3.1.1 Exclusion Restrictions
4.3.1.1.1 Contemporaneous coefficients: Assuming the problem away
4.3.1.1.2 Exogenous Variables: Indirect Least Squares
4.3.1.1.3 Instrumental Variables
4.3.1.2 Long Run Restrictions
4.3.2 Variance Restrictions
4.3.2.1 Near Identification
4.3.2.2 Relative variance restriction
4.3.3 Sign Restrictions
4.3.4 Reversed Regressions and “Bounds”

5 Identification through Heteroskedasticity: Theory
5.1 Preliminary Intuition
5.2 Identification
5.2.1 Identification under two regimes
5.2.2 Identification under more than two regimes
5.3 Identification with common shocks
5.3.1 Related literature
5.4 Consistency under misspecification of the heteroskedasticity
5.4.1 Misspecification of the regime windows
5.4.2 Under-specified number of regimes

References

Preface

I started writing these notes while I was visiting the Graduate Institute of International Studies in Genève in 2004. I started with the section on identification in macroeconomics; after that, I had far too much free time (they gave me tenure), and I have continued them throughout the years while I was visiting PUC in Rio, the University of Indiana, the University of Wisconsin at Madison, the Bank of England, the European Central Bank, the Inter-American Development Bank, Universidad Los Andes in Bogota, and now the Kiel Institute. I thank all of them for their tremendous hospitality and for motivating me to organize my thoughts in this area.
Before starting it is absolutely crucial to begin with a disclaimer. Most people have disclaimers and I have never been able to write one. So, here it goes... Although I do not work at a central bank or a multilateral organization such as the IDB, IMF or WB, my opinions do not reflect the views of those organizations, nor their board members, nor their staff members, nor their respective significant others, nor their pets either. Just in case you were wondering.
Now turning to more serious issues, there are three important characteristics that define these notes. First, there is, probably, a continuum of mistakes, especially because they were written in Spanish and then translated to English by someone who has a very limited knowledge of both languages, i.e., me. Second, these notes try to summarize an extremely extensive literature. I cannot cite everybody who deserves to be cited. The main reason is that I cannot type all those citations in BibTeX. That is quite embarrassing because a lot of them are actually colleagues who are still kicking around. If by any chance you think that I have forgotten to cite 24 of your papers, I apologize, and I can only offer you these words of profound sympathy: “you are in very good company”. I promise you, though, that I will not forget a single one of my papers. So, at least someone will be well represented.

1 International Asset Pricing: Discrete Time

Some of the most important questions in international open macro involve asset pricing issues. For example, the new research on the current account and the definitions of sustainability and external adjustment require portfolio and asset pricing considerations even to define them. Traditional puzzles such as international diversification, the correlation between savings and investment, and home bias are directly related to asset price considerations. Finally, issues of contagion, crises, the forward premium puzzle, and the failures of uncovered interest rate parity are also related to asset prices.
In this chapter we discuss the “basics” of international asset pricing. There are hundreds of open questions, and these notes will hopefully help develop some of the tools so you can tackle more difficult questions later.

1.1 Small Open Economies

1.1.1 Basic Production Economy under Certainty

In small open economy models we typically assume that the interest rate is exogenous. The assumptions are the following: discrete time; infinite horizon; investment without depreciation; only one international bond; single good. The budget constraint states that all sources of income equal all expenditures. The sources of income are $y_t + (1+r)b_t$: output plus the principal and interest on the bond. The expenditures are consumption, government expenditures, the new bonds ($b_{t+1}$), and investment. Hence, the budget constraint is
$$y_t + (1+r)b_t = c_t + g_t + b_{t+1} + i_t$$
which implies
$$b_{t+1} - b_t = \underbrace{y_t + r b_t - c_t - g_t}_{\text{domestic savings}} - i_t$$
where the left-hand side is foreign savings (the current account), which equals domestic savings minus domestic investment.
From the production side, capital evolves according to $k_{t+1} = k_t + i_t$, and output is $y_t = A \cdot F(k_t)$.

Consumer’s utility is
$$U_t = \sum_{s=t}^{\infty} \beta^{s-t} u(c_s)$$
From the budget constraint and production, consumption is
$$\begin{aligned}
c_s &= y_s + (1+r)b_s - g_s - i_s - b_{s+1} \\
&= A \cdot F(k_s) + (1+r)b_s - g_s - (k_{s+1} - k_s) - b_{s+1}
\end{aligned}$$
where $k_s$ and $b_s$ are the state variables. The maximization problem is
$$\max_{\{b_{s+1},\,k_{s+1}\}} \sum_{s=t}^{\infty} \beta^{s-t} u\left(A \cdot F(k_s) + (1+r)b_s - g_s - (k_{s+1} - k_s) - b_{s+1}\right)$$
The FOCs are
$$\begin{aligned}
\frac{\partial U_t}{\partial b_{s+1}} = 0 &\implies -\beta^{s-t} u'(c_s) + \beta^{s+1-t}(1+r)\,u'(c_{s+1}) = 0 \\
\frac{\partial U_t}{\partial k_{s+1}} = 0 &\implies -\beta^{s-t} u'(c_s) + \beta^{s+1-t}\left(A \cdot F'(k_{s+1}) + 1\right) u'(c_{s+1}) = 0
\end{aligned}$$

From the first FOC we have
$$u'(c_s) = \beta(1+r)\,u'(c_{s+1}) \tag{1.1}$$
which is the Euler equation. From the second one,
$$\begin{aligned}
u'(c_s) &= \beta\left(A \cdot F'(k_{s+1}) + 1\right) u'(c_{s+1}) \\
\beta(1+r)\,u'(c_{s+1}) &= \beta\left(A \cdot F'(k_{s+1}) + 1\right) u'(c_{s+1}) \\
1 + r &= A \cdot F'(k_{s+1}) + 1 \\
r &= A \cdot F'(k_{s+1})
\end{aligned} \tag{1.2}$$

These two equations, together with the transversality condition, solve for the optimal consumption. The transversality condition is
$$\lim_{t\to\infty} \frac{1}{(1+r)^t}\, b_{t+1} = 0$$

The intuition for this constraint is the following. Assume $g$ is zero and $k$ is constant. Also assume that the initial bond holdings are zero and that $1 + r = 1/\beta$. If consumption and output are constant, bond accumulation implies
$$b_{t+1} = (1+r)\,b_t + y - c$$
$$b_{T+1} = \sum_{t=0}^{T} (1+r)^t (y - c) = \frac{(1+r)^{T+1} - 1}{(1+r) - 1}\,(y - c)$$
$$\frac{b_{T+1}}{(1+r)^T} = \frac{(1+r)^{T+1} - 1}{(1+r)^T}\,\frac{y - c}{r}$$
As $T \to \infty$ the right-hand side converges to $(1+r)(y-c)/r$, which is zero only when $y = c$. It is easy to see why the left-hand side has to be zero. Suppose that it is not, and that its value is negative. Then I can increase consumption today by one unit by borrowing one unit. Having borrowed that amount today, I owe $1 + r$ next period, which I can keep rolling over with new borrowing until time $T$. The net present value of that debt is one, which means that I can afford the increase in consumption. Therefore, the original consumption path cannot be optimal. The opposite occurs if the left-hand side is positive. In the end, it has to be zero.
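The limit in this argument can be checked numerically. The sketch below is only an illustration under the text's assumptions (zero initial bonds, constant $y$ and $c$); the function names and parameter values are hypothetical. It iterates $b_{t+1} = (1+r)b_t + y - c$ and compares the discounted position $b_{T+1}/(1+r)^T$ with the closed form.

```python
def discounted_bonds(y, c, r, T):
    """Return b_{T+1} / (1 + r)**T when b_0 = 0 and b_{t+1} = (1 + r) b_t + y - c."""
    b = 0.0
    for _ in range(T + 1):            # builds b_1, ..., b_{T+1}
        b = (1 + r) * b + (y - c)
    return b / (1 + r) ** T


def closed_form(y, c, r, T):
    """Closed form ((1 + r)**(T + 1) - 1) / (1 + r)**T * (y - c) / r."""
    return ((1 + r) ** (T + 1) - 1) / (1 + r) ** T * (y - c) / r


y, c, r = 1.0, 0.9, 0.05              # hypothetical values with y > c
limit = (1 + r) * (y - c) / r         # limit of the discounted position as T grows

for T in (10, 100, 400):
    print(T, discounted_bonds(y, c, r, T), closed_form(y, c, r, T), limit)
```

With $y > c$ the discounted position settles at a strictly positive number, so the transversality condition fails; it holds only when $y = c$.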
One problem of small open economy models is that nothing ties $1 + r$ to $\beta$. If $\beta(1+r) \neq 1$, the country either shrinks to zero or grows to infinity. Nevertheless, assume that $1 + r = 1/\beta$. In this case the Euler equation is
$$u'(c_s) = u'(c_{s+1}) \implies c_s = c_{s+1} = c$$
Substituting in the budget constraint and iterating forward we have
$$b_{t+1} = (1+r)\,b_t + y_t - c$$
$$b_T = (1+r)^T b_0 + \sum_{s=0}^{T-1} (1+r)^{T-1-s}\,(y_s - c)$$
which implies
$$\frac{b_T}{(1+r)^T} = b_0 + \sum_{s=0}^{T-1} \frac{1}{(1+r)^{s+1}}\,(y_s - c)$$
The transversality condition implies that the left-hand side goes to zero as $T \to \infty$; therefore
$$c \sum_{s=0}^{\infty} \frac{1}{(1+r)^{s+1}} = b_0 + \sum_{s=0}^{\infty} \frac{1}{(1+r)^{s+1}}\,y_s$$
$$c\,\frac{\frac{1}{1+r}}{1 - \frac{1}{1+r}} = c\,\frac{1}{r} = b_0 + NPV(y)$$
$$c = r\left(b_0 + NPV(y)\right)$$

Therefore, consumption is just the interest annuity of wealth, which is given by the initial assets plus the net present value of output, $NPV(y) = \sum_{s=0}^{\infty} y_s/(1+r)^{s+1}$.
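A small numerical sketch of the annuity rule follows. It assumes the timing of the budget constraint $b_{t+1} = (1+r)b_t + y_t - c$, so that output at date $s$ is discounted by $(1+r)^{s+1}$. The function names and the output path are hypothetical, and output is set to zero after the listed periods purely for illustration.

```python
def annuity_consumption(b0, y_path, r):
    """c = r * (b0 + NPV(y)), with NPV(y) = sum_s y_s / (1 + r)**(s + 1)."""
    npv = sum(y / (1 + r) ** (s + 1) for s, y in enumerate(y_path))
    return r * (b0 + npv)


def bond_path(b0, y_path, c, r, periods):
    """Iterate b_{t+1} = (1 + r) b_t + y_t - c; output is zero once y_path ends."""
    bonds = [b0]
    for t in range(periods):
        y = y_path[t] if t < len(y_path) else 0.0
        bonds.append((1 + r) * bonds[-1] + y - c)
    return bonds


r, b0 = 0.05, 2.0
y_path = [1.0, 0.5, 1.5, 1.0]          # hypothetical output, zero afterwards
c = annuity_consumption(b0, y_path, r)
bonds = bond_path(b0, y_path, c, r, periods=60)

# Once output stops, bonds stay at c / r (interest alone pays for consumption),
# so the discounted position b_T / (1 + r)**T shrinks as T grows.
print(c, bonds[-1], c / r, bonds[-1] / (1 + r) ** 60)
```

Any consumption level above this `c` would make the bond position diverge to minus infinity at rate $1+r$, which is what the transversality condition rules out.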

1.1.1.1 External Accounts

One interesting aspect is to analyze the properties of the external accounts. The trade balance is the difference between output and consumption
$$TB_s = y_s - c$$
while the current account is the trade balance plus interest payments
$$CA_s = y_s - c + r b_s$$
Consumption is constant; therefore, any fluctuation of the trade balance is a fluctuation of output, and the correlation between the trade balance and output is one:
$$\Delta TB_s = \Delta y_s$$
Furthermore, fluctuations in the current account are also correlated with output. Remember that (from the budget constraint) $\Delta b_s = r b_{s-1} + y_{s-1} - c = CA_{s-1}$. Therefore,
$$\Delta CA_s = \Delta y_s + r\,CA_{s-1}$$
Having discussed their relationship with output, let's study their level. From the accumulation of the budget constraint and the transversality condition we know that the average current account is zero. This means that the average trade balance equals minus the interest income on the initial asset holdings, $-r b_0$. In other words, if the country has initial savings of $b_0$, it can consume $r b_0$ plus output forever, and run a trade deficit of $r b_0$ forever.
What all this means is that the trade balance and the current account are both pro-cyclical, and more volatile than consumption.
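The two identities for the external accounts can be verified on any output path. The sketch below uses hypothetical numbers and a hypothetical helper, `external_accounts`; it takes constant consumption as given rather than solving for it.

```python
def external_accounts(b0, y_path, c, r):
    """Trade balance TB_s = y_s - c and current account CA_s = y_s - c + r b_s."""
    b, tb, ca = b0, [], []
    for y in y_path:
        tb.append(y - c)
        ca.append(y - c + r * b)
        b = b + ca[-1]                # b_{s+1} - b_s = CA_s
    return tb, ca


r, b0, c = 0.05, 1.0, 1.0             # hypothetical values
y_path = [1.0, 1.2, 0.9, 1.1, 1.0]
tb, ca = external_accounts(b0, y_path, c, r)

for s in range(1, len(y_path)):
    d_y = y_path[s] - y_path[s - 1]
    # identity from the text: dCA_s = dy_s + r * CA_{s-1}
    assert abs((ca[s] - ca[s - 1]) - (d_y + r * ca[s - 1])) < 1e-12
    # the trade balance moves one-for-one with output
    assert abs((tb[s] - tb[s - 1]) - d_y) < 1e-12
```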

1.1.2 Asset Pricing in a Small Open Economy

We keep the same assumptions as before, but now we allow for a state of the world, denoted $w \in W$, described by a probability law $\pi(s, w_s)$. We also assume that there is a complete set of Arrow-Debreu assets (AD from now on): we pay a price $q(s, w_s)$ today for a contract that delivers one unit of the good at time $s$ in state $w$. Finally, we assume an endowment economy with stochastic production $y(s, w_s)$.
Consumers now maximize
$$U_t = \sum_{s=t}^{\infty} \sum_{w} \beta^{s-t}\,\pi(s, w_s)\,u\left(c(s, w_s)\right)$$

The budget constraint is less trivial than before, because today (at time $t$ and state $w_t$) we are not only consuming but also buying and selling all the AD assets. Denote the quantity of assets traded by $B(s, w_s)$: a purchase of $B(s, w_s)$ stipulates that $B(s, w_s)$ units will be delivered at time $s$ in state $w$. So, the budget constraints are
$$\begin{aligned}
y(t, w_t) + (1+r)\,b_t &= c(t, w_t) + \sum_{s=t+1}^{\infty} \sum_{w} q(s, w_s)\,B(s, w_s) && \text{at time } t \\
y(s, w_s) + B(s, w_s) &= c(s, w_s) && \text{at time } s
\end{aligned}$$

where all the asset accumulation is summarized by the AD assets delivered at a particular point in time in a given state of the world. We also know that the contract to deliver one good today has to have a price of one. In other words, $q(t, w) = 1$ for today’s state $w$. We can combine all the budget constraints and arrive at
$$\sum_{s=t}^{\infty} \sum_{w} q(s, w_s)\left[y(s, w_s) - c(s, w_s)\right] + (1+r)\,b_t = 0$$

The Lagrangian is
$$\mathcal{L} = \sum_{s=t}^{\infty} \sum_{w} \beta^{s-t}\,\pi(s, w_s)\,u\left(c(s, w_s)\right) + \mu \left[\sum_{s=t}^{\infty} \sum_{w} q(s, w_s)\left[y(s, w_s) - c(s, w_s)\right] + (1+r)\,b_t\right]$$

The FOC with respect to consumption is
$$\frac{\partial \mathcal{L}}{\partial c(s, w_s)} = 0 \implies \beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) - \mu\,q(s, w_s) = 0$$
which is
$$\beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) = \mu\,q(s, w_s)$$
At time $t$ we know that $q = 1$ for $w = w_t$ and zero otherwise, and that the probability of today's state is $\pi = 1$. Therefore, $u'(c(t, w_t)) = \mu$. Combining, we obtain
$$\beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) = q(s, w_s)\,u'\left(c(t, w_t)\right) \tag{1.3}$$
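To see what (1.3) implies for prices, the sketch below solves it for $q(s, w_s)$ under an assumed CRRA marginal utility $u'(c) = c^{-\gamma}$. The text does not impose a functional form, so this choice, the function name, and the two-state numbers are illustrative only.

```python
def ad_price(beta, horizon, prob, c_future, c_today, gamma):
    """State price q(s, w_s) implied by (1.3) under CRRA marginal utility.

    Assumes u'(c) = c ** (-gamma); gamma = 0 is the risk-neutral case.
    """
    return beta ** horizon * prob * (c_future / c_today) ** (-gamma)


# Hypothetical one-period-ahead example with two equally likely states.
beta, gamma, c_today = 0.96, 2.0, 1.0
states = {"boom": (0.5, 1.10), "bust": (0.5, 0.90)}   # (probability, consumption)

prices = {name: ad_price(beta, 1, p, c, c_today, gamma)
          for name, (p, c) in states.items()}

for name, q in prices.items():
    prob = states[name][0]
    # q / prob is the price per unit of probability
    print(f"{name}: q = {q:.4f}, q / pi = {q / prob:.4f}")
```

A claim that pays in the low-consumption state costs more per unit of probability than one that pays in the high-consumption state, because marginal utility is higher there.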



1.1.2.1 Risk Free Rate

Using the Arrow-Debreu prices we can find the risk-free rate that prevails in the economy. What is the value today of a contract that delivers one unit of the good at time $s$ in all the states of the world? Such a contract has to deliver $B(s, w_s) = 1$ for all $w$, so its price today is $\sum_w q(s, w_s)$:
$$\frac{1}{1 + R_{s,t}} = \sum_{w} q(s, w_s) \tag{1.4}$$

This defines the risk-free rate $R_{s,t}$ between $t$ and $s$.


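We can combine the risk-free rate with the law of motion of consumption. Start from (1.3),
$$\beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) = q(s, w_s)\,u'\left(c(t, w_t)\right)$$
and sum over states:
$$\begin{aligned}
\sum_{w} \beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) &= \sum_{w} q(s, w_s)\,u'\left(c(t, w_t)\right) \\
&= u'\left(c(t, w_t)\right) \sum_{w} q(s, w_s) \\
&= u'\left(c(t, w_t)\right) \frac{1}{1 + R_{s,t}}
\end{aligned}$$

which can be rearranged to
$$\begin{aligned}
u'\left(c(t, w_t)\right) &= \beta^{s-t}\,(1 + R_{s,t}) \sum_{w} \pi(s, w_s)\,u'\left(c(s, w_s)\right) \\
u'\left(c(t, w_t)\right) &= \beta^{s-t}\,(1 + R_{s,t})\,E_t\,u'\left(c(s, w_s)\right)
\end{aligned} \tag{1.5}$$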
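As a consistency check, the risk-free rate can be computed two ways: by summing the AD prices as in (1.4), or directly from the Euler equation (1.5). The sketch below does both under an assumed CRRA marginal utility; the functional form, names, and numbers are illustrative assumptions, not part of the text.

```python
def marginal_utility(c, gamma):
    """Assumed CRRA marginal utility u'(c) = c ** (-gamma)."""
    return c ** (-gamma)


def risk_free_rate(beta, horizon, c_today, future_states, gamma):
    """Return R_{s,t} computed from (1.4) and from (1.5).

    future_states is a list of (probability, consumption) pairs for date s.
    """
    mu_today = marginal_utility(c_today, gamma)
    # AD prices from (1.3), then 1 / (1 + R) = sum of prices, as in (1.4)
    prices = [beta ** horizon * p * marginal_utility(c, gamma) / mu_today
              for p, c in future_states]
    rate_from_prices = 1.0 / sum(prices) - 1.0
    # Euler equation (1.5): u'(c_t) = beta**h * (1 + R) * E_t u'(c_s)
    expected_mu = sum(p * marginal_utility(c, gamma) for p, c in future_states)
    rate_from_euler = mu_today / (beta ** horizon * expected_mu) - 1.0
    return rate_from_prices, rate_from_euler


r_prices, r_euler = risk_free_rate(0.96, 1, 1.0, [(0.5, 1.10), (0.5, 0.90)], 2.0)
print(r_prices, r_euler)
```

The two numbers coincide by construction, which is the content of the derivation from (1.3) to (1.5).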

In the small open economy we usually assume that the interest rate is exogenous, and usually also that it is constant. In this case the risk-free rate every period is $r$ and therefore $1 + R_{s,t} = (1+r)^{s-t}$, so the Euler equation becomes
$$u'\left(c(t, w_t)\right) = \left(\beta(1+r)\right)^{s-t} E_t\,u'\left(c(s, w_s)\right)$$

1.1.2.2 Risk Neutrality

This is a convenient simplification that is commonly used in finance. The reason is that in finance (asset pricing) the probability space is changed to make agents appear risk neutral, and therefore the Arrow-Debreu prices have a very simple specification.
Assume that there exists a foreigner who is risk neutral: not that we are risk neutral at home, but that someone outside the country is. Risk neutrality implies that the marginal utility of consumption is constant. Substituting in the foreigner's counterpart of (1.3),
$$q(s, w_s) = \beta^{s-t}\,\pi(s, w_s)$$

We know that the risk-free rate implies
$$\frac{1}{1 + R_{s,t}} = \sum_{w} q(s, w_s) = \sum_{w} \beta^{s-t}\,\pi(s, w_s) = \beta^{s-t}$$

Therefore, $\frac{1}{1 + R_{s,t}} = \beta^{s-t}$ for all $s$. Because $\beta$ is constant, $1 + R_{s,t} = (1+r)^{s-t}$ and $(1+r)\beta = 1$. So, the first implication of risk neutrality is that the interest rate and the discount rate are indeed congruent.

Risk neutrality in the rest of the world implies that consumption at home is the same across all states of the world. In other words, there is perfect risk sharing. Notice that the existence of one risk-neutral agent implies perfect risk sharing for all agents, regardless of the home agent’s degree of risk aversion.
$$\begin{aligned}
\beta^{s-t}\,\pi(s, w_s)\,u'\left(c(s, w_s)\right) &= q(s, w_s)\,u'\left(c(t, w_t)\right) \\
&= \beta^{s-t}\,\pi(s, w_s)\,u'\left(c(t, w_t)\right) \\
u'\left(c(s, w_s)\right) &= u'\left(c(t, w_t)\right)
\end{aligned} \tag{1.6}$$

So, there is perfect risk sharing! Across all states and all times!
Substituting in the budget constraint produces implications that are almost identical to the world without uncertainty:
$$c = r\left(b_0 + E_0\,NPV(y)\right) \tag{1.7}$$
This has the exact same time-series implications we discussed in the previous section. Before, the fluctuation of output was deterministic; here it is stochastic. Except for that, everything else is identical. The trade balance and the current account are pro-cyclical, the average or expected current account is zero, and the expected trade balance is minus the interest on the initial holdings. For permanent shocks, the correlation between output and consumption is zero, and the correlation between output and the trade balance is one. In the data, however, this is very different. The correlation between output and the trade balance is -0.13, while the correlation between output and consumption is 0.60!
Another interesting implication of (1.7) is that a change in the trade balance has to have a counterpart in the net foreign asset position. For example, take first differences (and use the fact that consumption is flat); then
$$-\Delta b_0 = \Delta E_0\,NPV(y)$$

So, any accumulation or decumulation of foreign assets ($\Delta b_0$) is the same as the trade balance (the right-hand side).

1.1.2.3 Challenging Issues

As discussed before, SOE models fail to capture simple correlation structures in the data. Several papers have tried to solve the problems by introducing investment, labor supply and non-separability, adjustment and transport costs, and incomplete markets. None of these extensions has been successful. To match emerging markets’ moments, the most successful alternatives have been the inclusion of borrowing constraints, and the paper by Aguiar and Gopinath (the cycle is the trend).
Small open economies do pose very important and difficult challenges.

1. The first problem appears when the model has to be solved and there is no stationary equilibrium.
In general, the solution is found as an approximation around the deterministic equilibrium. But this
might not be good enough.
2. Solutions of most SOE models depend on the initial conditions, which means that simulating the model
is exceedingly difficult.
3. When you do not have risk neutrality in the rest of the world, closing the models is not easy. There is
a large RBC literature on this issue. Schmitt-Grohé and Uribe (2003) have a very nice paper proposing
different methodologies for closing SOE models. They discuss four alternatives:

(a) β depends on consumption
(b) r depends on β
(c) Portfolio Adjustment Costs
(d) Complete Markets

In the end, all four possibilities are very similar and produce very similar results.

1.1.2.4 Incomplete Markets (Only Bonds and Stocks)

The previous model assumes the existence of AD assets. Here we show that if the country has access to bonds and stocks only, and is able to trade continuously, then the economy behaves as if markets were complete.
Assume we have a claim on the output of the home country (as if it were a Lucas tree). The price of the stock, under complete markets and risk neutrality, is
$$S_t = \sum_{s=t}^{\infty} \sum_{w} q(s, w_s)\,y(s, w_s)$$

but we know that
$$q(s, w_s) = \beta^{s-t}\,\pi(s, w_s) = (1+r)^{-(s-t)}\,\pi(s, w_s)$$

Therefore,
$$\begin{aligned}
S_t &= \sum_{s=t}^{\infty} \sum_{w} (1+r)^{-(s-t)}\,\pi(s, w_s)\,y(s, w_s) \\
&= \sum_{s=t}^{\infty} (1+r)^{-(s-t)}\,E_t\,y(s, w_s) \\
&= E_t\,NPV(y)
\end{aligned}$$

Consumption is then the same as before:
$$c = r\left(b_0 + E_t\,NPV(y)\right).$$

In other words, optimal consumption is the annuity equivalent of today’s total wealth, measured as the value of today’s foreign assets ($b_0$) plus the value of the home stock ($S_0$). So, optimal consumption does not require any asset other than stocks and bonds. We arrive at the exact same allocation whether we have complete markets or just two assets!
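The stock price formula is just an expected discounted sum, which the following sketch evaluates by truncating the infinite horizon. The i.i.d. two-state output process, the function names, and the numbers are assumptions made for illustration; with a constant expected future output $\bar{y}$ the sum has the closed form $y_t + \bar{y}/r$.

```python
def stock_price(expected_output, r, horizon):
    """Truncated S_t = sum_{h=0}^{horizon} (1 + r)**(-h) * E_t[y_{t+h}]."""
    return sum(expected_output(h) / (1 + r) ** h for h in range(horizon + 1))


r = 0.05
y_today = 1.2                                   # current output is known
future_states = [(0.5, 0.8), (0.5, 1.2)]        # hypothetical i.i.d. (prob, output)
mean_future = sum(p * y for p, y in future_states)


def expected_output(h):
    """E_t[y_{t+h}]: today's output at h = 0, the unconditional mean afterwards."""
    return y_today if h == 0 else mean_future


price = stock_price(expected_output, r, horizon=2000)
closed_form = y_today + mean_future / r          # infinite-horizon value
print(price, closed_form)
```

Only expected output enters the price: under risk neutrality the variance of output across states is irrelevant for $S_t$.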

1.2 General Equilibrium: Two Countries, Single Good

Assumptions: endowment economy, single good (no relative prices), no investment, and complete markets.
The complete-markets assumption implies that the world competitive allocation achieves the Pareto allocation. We assume that the utilities of the agents are different. Sometimes we will assume at the outset that the utilities are the same in order to analyze some of the properties.
Asset prices: we assume there is a claim to the tree that produces home output, as well as a claim to the production of the foreign output.

1.2.1 Pareto Allocation

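The program is
$$\max \sum_{s=t}^{\infty} \sum_{w} \beta^{s-t}\,\pi(s, w_s)\left[u_H\left(c_H(s, w_s)\right) + \Psi\,u_F\left(c_F(s, w_s)\right)\right]$$

where $\Psi$ is the Pareto weight. Because there is a single good, the resource constraint is
$$c_H(s, w_s) + c_F(s, w_s) = y_H(s, w_s) + y_F(s, w_s) = y(s, w_s)$$

The Lagrangian is
$$\mathcal{L} = \sum_{s=t}^{\infty} \sum_{w} \left\{ \beta^{s-t}\,\pi(s, w_s)\left[u_H\left(c_H(s, w_s)\right) + \Psi\,u_F\left(c_F(s, w_s)\right)\right] + \mu(s, w_s)\left[y(s, w_s) - c_H(s, w_s) - c_F(s, w_s)\right] \right\}$$

The FOCs are
$$\begin{aligned}
\frac{\partial \mathcal{L}}{\partial c_H(s, w_s)} = 0 &\implies \beta^{s-t}\,\pi(s, w_s)\,u_H'\left(c_H(s, w_s)\right) - \mu(s, w_s) = 0 \\
\frac{\partial \mathcal{L}}{\partial c_F(s, w_s)} = 0 &\implies \beta^{s-t}\,\pi(s, w_s)\,\Psi\,u_F'\left(c_F(s, w_s)\right) - \mu(s, w_s) = 0
\end{aligned}$$
which implies
$$u_H'\left(c_H(s, w_s)\right) = \Psi\,u_F'\left(c_F(s, w_s)\right)$$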
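When the two utilities differ, this condition generally has no closed form, but it is easy to solve state by state. The sketch below assumes CRRA utilities with different risk aversions, $u_i'(c) = c^{-\gamma_i}$, and finds $c_H$ by bisection; the functional form, the function name, and the parameter values are illustrative assumptions.

```python
def planner_split(y, psi, gamma_h, gamma_f, tol=1e-12):
    """Solve c_H**(-gamma_h) = psi * (y - c_H)**(-gamma_f) for c_H in (0, y).

    The left side falls and the right side rises in c_H, so the root is unique.
    """
    lo, hi = 0.0, y
    while hi - lo > tol * y:
        mid = 0.5 * (lo + hi)
        # excess > 0: home marginal utility still too high, so raise c_H
        excess = mid ** (-gamma_h) - psi * (y - mid) ** (-gamma_f)
        if excess > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: the foreign agent is more risk averse than the home agent.
for world_output in (1.0, 2.0):
    c_home = planner_split(world_output, psi=1.0, gamma_h=2.0, gamma_f=4.0)
    print(world_output, c_home, c_home / world_output)
```

With different risk aversions the home share of world output changes with the level of output, so consumptions are no longer proportional; the less risk-averse agent absorbs more of the fluctuations.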

1.2.1.1 Log Utility

If we have the same utility function for each country, of the log form, then, because $\Psi$ is constant in the Pareto allocation, the two consumptions are proportional
$$c_H(s, w_s) \propto c_F(s, w_s)$$

for every time $s$ and every state of the world $w$.
In fact, this is perfect risk sharing. However, in practice this is not the case. In the data, consumptions are correlated but not to this degree (the correlation is only 70 percent), and, in fact, output levels are very highly correlated relative to consumptions (the correlation of outputs is 61 percent). Another interesting implication is that consumptions are perfectly correlated with world output.
Assume that the constant of proportionality is $\alpha$, so that $c_F(s, w_s) = \alpha\,c_H(s, w_s)$; then
$$c_H(s, w_s) = \frac{1}{1+\alpha}\,y(s, w_s)$$
$$c_F(s, w_s) = \frac{\alpha}{1+\alpha}\,y(s, w_s)$$
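The constant of proportionality can be pinned down from the planner's first-order condition $u_H'(c_H) = \Psi\,u_F'(c_F)$. The short derivation below is a worked step that assumes only $u_H = u_F = \log$:

```latex
\begin{aligned}
\frac{1}{c_H(s, w_s)} = \frac{\Psi}{c_F(s, w_s)}
  &\;\Longrightarrow\; c_F(s, w_s) = \Psi\, c_H(s, w_s), \\
c_H(s, w_s) + c_F(s, w_s) = y(s, w_s)
  &\;\Longrightarrow\; c_H(s, w_s) = \frac{1}{1+\Psi}\, y(s, w_s), \quad
     c_F(s, w_s) = \frac{\Psi}{1+\Psi}\, y(s, w_s).
\end{aligned}
```

So under log utility the constant $\alpha$ is simply the Pareto weight $\Psi$, and each country consumes a fixed share of world output.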

1.2.2 Competitive Equilibrium

After solving for the consumption allocation, given the Pareto weight, we solve for the competitive equilibrium. The competitive equilibrium is what we use to price assets. The idea is to find the asset prices that are consistent with the consumption allocation derived in the Pareto problem.
The Arrow-Debreu assets entail purchasing an asset today at a price $q(s, w_s)$ that will deliver one good at time $s$ in state $w$. The consumer maximizes
$$\max \sum_{s=t}^{\infty} \sum_{w} \beta^{s-t}\,\pi(s, w_s)\,u_H\left(c_H(s, w_s)\right)$$
subject to
$$y_H(t, w_t) = c_H(t, w_t) + (1 + r_t)\,B_t + \sum_{s=t+1}^{\infty} \sum_{w} q(s, w_s)\,B(s, w_s)$$
$$y_H(s, w_s) + B(s, w_s) = c_H(s, w_s)$$
where $B(s, w_s)$ is the number of AD assets purchased at $t$.
The first budget constraint is the one prevailing at the beginning. The agent has some initial assets (which we identified as bonds) and purchases all the AD assets. Then, when time and state $(s, w_s)$ are realized, the agent receives its endowment and the payoff of the AD assets it purchased.

1.2.2.1 Single Good Tips

1. There is one tremendous simplification that we can make in the single good environment: we know, ex ante, that the number of AD assets held for state $(s,w_s)$ is given by $c_H(s,w_s) - y_H(s,w_s)$. In that case, we can write the Lagrangian and solve it. The purpose of the competitive equilibrium maximization is to determine the price of the AD assets in terms of the utility.
2. The second tip is that $q(t,w_t)$ at time $t$ is always equal to 1. Therefore, the budget constraint can be written as
$$\sum_{s=t}^{\infty}\sum_{w} q(s,w_s)\,\big(y_H(s,w_s) - c_H(s,w_s)\big) = (1+r_t)\,B_t$$

These two tricks make the maximization problem extremely simple. The reason is that we do not have to
maximize the AD holdings. In other words, we do not have to take derivatives with respect to B (s, ws ). The
Lagrangian is
$$\mathcal{L} = \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,u_H(c_H(s,w_s)) + \lambda\left[\sum_{s=t}^{\infty}\sum_{w} q(s,w_s)\,\big(y_H(s,w_s) - c_H(s,w_s)\big) - (1+r_t)\,B_t\right]$$

The FOC implies
$$\frac{\partial \mathcal{L}}{\partial c_H(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,u_H'(c_H(s,w_s)) - \lambda\, q(s,w_s) = 0$$

What we always try to get from the competitive equilibrium is the following:
$$q(s,w_s) = \frac{1}{\lambda}\,\beta^{s-t}\,\pi(s,w_s)\,u_H'(c_H(s,w_s)) \tag{1.8}$$

1.2.3 Asset Prices

Here we compute both the interest rate and the stock prices.

1.2.3.1 Risk Free Bond

The risk free bond at some given time is equivalent to receiving one unit of the good at time $s$ in all states of the world $w$. In other words,
$$P_{B,s,t} = \sum_{w} q(s,w_s)\,B(s,w_s)$$
where $B(s,w_s) = 1$.
$$\begin{aligned}
\frac{1}{1+R_{s,t}} &= \sum_{w} q(s,w_s)\,B(s,w_s) \\
&= \sum_{w} q(s,w_s) \\
&= \sum_{w} \frac{1}{\lambda}\,\beta^{s-t}\,\pi(s,w_s)\,u_H'(c_H(s,w_s)) \\
&= \frac{1}{\lambda}\,\beta^{s-t}\sum_{w} \pi(s,w_s)\,u_H'(c_H(s,w_s)) \\
&= \frac{1}{\lambda}\,\beta^{s-t}\,E_t\,u_H'(c_H(s,w_s))
\end{aligned}$$
We can also compute the interest rate at $s = t$. We know that the interest rate between today and today is zero, and that the AD asset that delivers today has a price equal to 1. Therefore,
$$\frac{1}{1+R_{t,t}} = \frac{1}{\lambda}\,\beta^{t-t}\,E_t\,u_H'(c_H(t,w_t))$$
$$R_{t,t} = 0$$
$$E_t\,u_H'(c_H(t,w_t)) = u_H'(c_H(t,w_t))$$
$$\lambda = u_H'(c_H(t,w_t))$$

Substituting this into the previous equation, the interest rate between time $t$ and $s$ satisfies

$$\frac{1}{1+R_{s,t}} = \beta^{s-t}\,\frac{E_t\,u_H'(c_H(s,w_s))}{u_H'(c_H(t,w_t))} \tag{1.9}$$
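
As a numerical illustration of equation (1.9), the following sketch prices a discount bond on a finite state space. The function names, the two-state example, and all parameter values are assumptions made for illustration; they do not come from the notes.

```python
def bond_price(beta, horizon, probs, c_future, c_today, marginal_utility):
    """Price today of one good delivered in every state `horizon` periods ahead.

    Implements 1/(1+R_{s,t}) = beta^(s-t) * E_t[u'(c_H(s))] / u'(c_H(t)).
    """
    expected_mu = sum(p * marginal_utility(c) for p, c in zip(probs, c_future))
    return beta ** horizon * expected_mu / marginal_utility(c_today)


def log_marginal_utility(c):
    # u(c) = ln(c)  =>  u'(c) = 1/c
    return 1.0 / c


# Hypothetical two-state example (values chosen only for illustration).
price = bond_price(0.95, 1, [0.5, 0.5], [2.0, 0.5], 1.0, log_marginal_utility)
interest_rate = 1.0 / price - 1.0

# Sanity check from the text: the bond between today and today costs 1.
same_day_price = bond_price(0.95, 0, [1.0], [1.0], 1.0, log_marginal_utility)
```

In this example the bad state (low consumption) has a high marginal utility, which pushes the bond price above one and the interest rate below zero.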

1.2.3.2 Stock Markets

To determine the stock prices we have to replicate the flow from the tree. At time $s$ and state $w$ the home tree delivers $y_H(s,w_s)$ goods. That means that we need $B(s,w_s) = y_H(s,w_s)$ for all $s, w$. Therefore, the price of that claim today is
$$\sum_{s=t}^{\infty}\sum_{w} q(s,w_s)\,y_H(s,w_s)$$

The value of the stock market of the foreign country is obtained simply by changing the dividend process from home to foreign. Everything else is identical. From the previous derivation we have that
$$q(s,w_s) = \frac{1}{\lambda}\,\beta^{s-t}\,\pi(s,w_s)\,u_H'(c_H(s,w_s))$$
$$q(t,w_t) = \frac{1}{\lambda}\,u_H'(c_H(t,w_t)) = 1$$
$$q(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,\frac{u_H'(c_H(s,w_s))}{u_H'(c_H(t,w_t))}$$
therefore
$$S_{H,t} = \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\frac{u_H'(c_H(s,w_s))}{u_H'(c_H(t,w_t))}\,y_H(s,w_s) \tag{1.10}$$

1.2.4 Log Utility Case

Assume that both countries have the same utility, and both have logs: u(c) = ln(c).

1.2.4.1 Pareto Allocation

The Pareto allocation equations are

$$u_H'(c_H(s,w_s)) = \Psi\, u_F'(c_F(s,w_s))$$
$$c_H(s,w_s) + c_F(s,w_s) = y(s,w_s)$$

Substituting the logs,

$$\frac{1}{c_H(s,w_s)} = \Psi\,\frac{1}{c_F(s,w_s)}$$
$$c_F(s,w_s) = \Psi\, c_H(s,w_s)$$

which, substituting in the resource constraint, implies

$$c_H(s,w_s) = \frac{1}{1+\Psi}\,y(s,w_s)$$

1.2.4.2 Competitive Equilibrium

From the competitive equilibrium, we solved for the price of the AD assets:
$$q(s,w_s) = \frac{1}{\lambda}\,\beta^{s-t}\,\pi(s,w_s)\,u_H'(c_H(s,w_s)) = \frac{1}{\lambda}\,\beta^{s-t}\,\pi(s,w_s)\,\frac{1}{c_H(s,w_s)}$$
We also know that the price of the AD asset at $s = t$ is equal to 1, so we can solve for the multiplier:
$$\lambda = \frac{1}{c_H(t,w_t)}$$
which implies
$$q(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,\frac{c_H(t,w_t)}{c_H(s,w_s)}.$$

Given the solutions from the Pareto allocation we substitute the consumption sharing rules. Because markets are complete we know that the Pareto weight is constant. We could check this, but it is hard, so let's assume it.
$$q(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,\frac{y(t,w_t)}{y(s,w_s)}$$
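
A small numerical check of this substitution: with $c_H = y/(1+\Psi)$ in every state, the Pareto weight cancels from the AD price. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
def ad_price_from_consumption(beta, horizon, prob, c_today, c_future):
    # q(s, w_s) = beta^(s-t) * pi(s, w_s) * c_H(t) / c_H(s)
    return beta ** horizon * prob * c_today / c_future


def ad_price_from_output(beta, horizon, prob, y_today, y_future):
    # Same price after substituting c_H = y / (1 + Psi); Psi cancels.
    return beta ** horizon * prob * y_today / y_future


beta, horizon, prob = 0.96, 3, 0.25    # assumed values
y_today, y_future = 2.0, 2.5           # assumed world output today and in state (s, w_s)

prices = []
for psi in (0.5, 1.0, 2.0):            # different (constant) Pareto weights
    share = 1.0 / (1.0 + psi)          # home consumption share of world output
    prices.append(ad_price_from_consumption(beta, horizon, prob,
                                            share * y_today, share * y_future))

q_output = ad_price_from_output(beta, horizon, prob, y_today, y_future)
```

Every entry of `prices` coincides with `q_output`, which is why only world output appears in the final expression.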

1.2.4.3 Interest Rate

The interest rate is

$$\frac{1}{1+R_{s,t}} = \beta^{s-t}\,\frac{E_t\,u_H'(c_H(s,w_s))}{u_H'(c_H(t,w_t))} = \beta^{s-t}\,\frac{E_t\,\frac{1}{y(s,w_s)}}{\frac{1}{y(t,w_t)}}$$
$$\frac{1}{1+R_{s,t}} = \beta^{s-t}\,y(t,w_t)\,E_t\,\frac{1}{y(s,w_s)}$$
This seems simple, but it is not! The problem is computing $E_t \frac{1}{y(s,w_s)}$. For example, if this is a one country world, and we assume that output is log-normally distributed, then this is simple. But here we have two countries. If we assume that each country's output is log-normally distributed, then world output is NOT log-normal. If we assume that each country's output is normal, then the sum is normal, but then computing this expectation is extremely difficult, and probably not well defined.
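
On a finite state space the expectation can at least be computed by brute force. The sketch below enumerates the joint states of two hypothetical, independent two-point output distributions (an assumption made here purely for illustration) and shows that $E_t[1/y]$ is not $1/E_t[y]$, so the mean of world output is not enough to price the bond.

```python
from itertools import product

# Hypothetical two-point output distributions (value, probability) for each country.
home_outcomes = [(0.8, 0.5), (1.2, 0.5)]
foreign_outcomes = [(0.9, 0.5), (1.1, 0.5)]

beta, horizon = 0.95, 1     # assumed discount factor and horizon s - t
y_today = 2.0               # assumed world output today

expected_inverse = 0.0      # E_t[1 / y(s, w_s)]
expected_output = 0.0       # E_t[y(s, w_s)]
for (y_h, p_h), (y_f, p_f) in product(home_outcomes, foreign_outcomes):
    prob = p_h * p_f        # independence across countries is an assumption
    world = y_h + y_f
    expected_inverse += prob / world
    expected_output += prob * world

bond_price = beta ** horizon * y_today * expected_inverse
naive_price = beta ** horizon * y_today / expected_output   # wrongly uses 1 / E[y]
```

By Jensen's inequality `expected_inverse` exceeds `1 / expected_output`, so the naive shortcut understates the bond price.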

1.2.4.4 Stock Prices

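As before, let's substitute

$$\begin{aligned}
S_{H,t} &= \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\frac{u_H'(c_H(s,w_s))}{u_H'(c_H(t,w_t))}\,y_H(s,w_s) \\
&= \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\frac{\frac{1}{c_H(s,w_s)}}{\frac{1}{c_H(t,w_t)}}\,y_H(s,w_s) \\
&= \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,y(t,w_t)\,\frac{y_H(s,w_s)}{y(s,w_s)} \\
&= y(t,w_t)\sum_{s=t}^{\infty} \beta^{s-t}\,E_t\left[\frac{y_H(s,w_s)}{y(s,w_s)}\right]
\end{aligned}$$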

This is, again, extremely difficult to compute. The problem is that we have no idea what to assume about the output processes in order to compute this expectation. In the closed economy this does not happen. Why? Because $y_H = y$, and therefore the ratio inside the expectation is equal to one in all states, and we can compute the value of the stock.

1.2.4.4.1 Veronesi et al.:

One possible solution offered by Veronesi and others is to specify the country output as a share times the world output. In other words, assume that $y_H(s,w_s) = \theta(s,w_s)\cdot y(s,w_s)$. This implies that
$$S_{H,t} = y(t,w_t)\sum_{s=t}^{\infty} \beta^{s-t}\,E_t\left[\frac{y_H(s,w_s)}{y(s,w_s)}\right] = y(t,w_t)\sum_{s=t}^{\infty} \beta^{s-t}\,E_t\,\theta(s,w_s)$$

For example, assume that the shares follow a random walk. Then $E_t\,\theta(s,w_s) = \theta(t,w_t)$, and the value of the stock is
$$S_{H,t} = \frac{1}{1-\beta}\,y(t,w_t)\,\theta(t,w_t)$$
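
The closed form can be checked against a truncated version of the infinite sum. The sketch below is only illustrative: the function names, the truncation length, and the parameter values are assumptions.

```python
def stock_price_closed_form(beta, y_today, theta_today):
    # S_{H,t} = y(t) * theta(t) / (1 - beta)
    return y_today * theta_today / (1.0 - beta)


def stock_price_truncated(beta, y_today, expected_shares):
    # y(t) * sum over horizons k = s - t of beta^k * E_t[theta(s)]
    return y_today * sum(beta ** k * share for k, share in enumerate(expected_shares))


beta, y_today, theta_today = 0.9, 2.0, 0.4     # assumed values

# Random-walk shares: E_t[theta(s)] = theta(t) at every horizon.
expected_shares = [theta_today] * 500
approx = stock_price_truncated(beta, y_today, expected_shares)
exact = stock_price_closed_form(beta, y_today, theta_today)
```

With 500 terms the truncated sum is numerically indistinguishable from the closed form, since the omitted tail is of order $\beta^{500}$.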

Of course, this assumption is an extreme simplification, and even though it solves the problem of computing the stock prices, it offers no solution to the interest rate, which is as complicated with this assumption as it is without it.

1.2.4.4.2 Cochrane-Longstaff-Santa Clara (two trees):

A very nice paper solving this problem is Cochrane, Longstaff, and Santa-Clara (2005). To understand this
paper we need to develop some basic tools in stochastic calculus. We do this in the next chapter and delay
the discussion of this paper until then.

1.2.4.5 Research Questions

1. There is a large research effort in this area regarding the terms of trade. In this model, because there is only one good, there are no exchange rate and terms of trade implications. Important research introducing transport costs, or other inefficiencies, has tried to deal with some of these shortcomings.
2. Second, and equally important, is the problem of portfolio allocation: this model implies perfect risk
sharing and a very particular – and unrealistic – portfolio holding. For instance, portfolio holdings are
identical.

1.2.4.6 Portfolio Holdings

Finally, we can study the portfolio holdings that prevail. In this particular case, because the solutions are so simple, we can derive the portfolio from the consumption solution. This is not always the case, and most of the time you need to optimize; but here it is simple enough.
From the Pareto allocation we know that consumption at home is given by
$$c_H(s,w_s) = \frac{1}{1+\Psi}\,y(s,w_s) = \frac{1}{1+\Psi}\,\big(y_H(s,w_s) + y_F(s,w_s)\big)$$
Originally, however, Home owned home production ($y_H(s,w_s)$) and purchased the AD assets. The idea of the assets was to deliver the optimal consumption. So, from the asset side we know that

$$c_H(s,w_s) = B(s,w_s) + y_H(s,w_s)$$

for every state. Therefore, we can solve for the AD assets delivered in state $(s,w_s)$ that guarantee the optimal consumption:
$$\begin{aligned}
B(s,w_s) &= \frac{1}{1+\Psi}\,\big(y_H(s,w_s) + y_F(s,w_s)\big) - y_H(s,w_s) \\
&= \frac{1}{1+\Psi}\,y_F(s,w_s) - \frac{\Psi}{1+\Psi}\,y_H(s,w_s)
\end{aligned}$$

So, in every state, the optimal "delivery" is one in which home agents get $\frac{1}{1+\Psi}\,y_F(s,w_s)$ and pay $\frac{\Psi}{1+\Psi}\,y_H(s,w_s)$. Notice that trading in the stock actually delivers these flows. If home buys $\frac{1}{1+\Psi}$ of the foreign stock, and sells $\frac{\Psi}{1+\Psi}$ of the home stock, then in every state $(s,w_s)$ the deliveries are exactly as those shown here.

This allocation implies that the optimal portfolio for home agents is to hold $\frac{1}{1+\Psi}$ of both home and foreign assets, while the optimal portfolio of foreign agents is to hold $\frac{\Psi}{1+\Psi}$ of both stocks. Notice, as we said before, that the portfolios are identical up to a scaling factor. Also, the optimal portfolio is constant. Finally, it is a constant proportion of the world stock (home plus foreign).
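
The equivalence between the AD holdings and the stock portfolio can be verified state by state. In the sketch below the function names, the Pareto weight, and the output states are all hypothetical.

```python
def home_ad_holdings(psi, y_home, y_foreign):
    # B(s, w_s) = y_F / (1 + Psi) - Psi * y_H / (1 + Psi)
    return y_foreign / (1.0 + psi) - psi * y_home / (1.0 + psi)


def home_stock_payoff(psi, y_home, y_foreign):
    # Home holds a share 1/(1+Psi) of the home tree and of the foreign tree.
    share = 1.0 / (1.0 + psi)
    return share * y_home + share * y_foreign


psi = 1.5                                          # assumed Pareto weight
states = [(1.0, 1.0), (0.7, 1.4), (1.3, 0.6)]      # assumed (y_H, y_F) pairs

consumption_via_ad = [y_h + home_ad_holdings(psi, y_h, y_f) for y_h, y_f in states]
consumption_via_stocks = [home_stock_payoff(psi, y_h, y_f) for y_h, y_f in states]
pareto_consumption = [(y_h + y_f) / (1.0 + psi) for y_h, y_f in states]
```

All three lists coincide: the constant stock portfolio delivers the Pareto consumption in every state, so the full set of AD assets is not needed here.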
One final question: how can we obtain the solution for $\Psi$? The optimal portfolios and the initial portfolio give a very easy way to get it. Assume we start from $y_H(0) = y_F(0)$, which implies $\theta(0) = 1/2$ and $S_H(0) = S_F(0)$. Also, assume that before the world starts, home and foreign own their own stock. The initial wealth is $S_H(0)$ and the optimal wealth when the world starts is $\frac{1}{1+\Psi}\,[S_H(0) + S_F(0)]$. Because there cannot be a transfer of wealth between foreign and home when the world starts, these two have to be identical, which only occurs if $\Psi = 1$. Of course, the solution for $\Psi$ depends on initial wealth.

1.2.5 Issues with this model

This model has tremendous problems. I am going to concentrate on the log utility one.

1. Because there is only one good on earth, there is no exchange rate or terms of trade in the model. The
only way to introduce exchange rates is by introducing transport costs – which is an unnatural way of
dealing with the issue.
2. Consumptions are perfectly correlated – there is perfect risk sharing here as it was in the previous
model. One alternative is to deviate from log utility, but excessive consumption correlation still remains
a property of all these models.

3. Portfolio holdings are identical – in terms of wealth share invested in each asset. This also implies that
the current account is identical to zero every time, and the change in net foreign asset positions is zero
as well.

4. PPP holds and the price level is identical in all countries. One alternative is to allow for non-tradable consumption. If the production of NT is stochastic (it is a separate tree), then the price levels of the countries will vary, and the real exchange rate enters the Euler equation.

1.2.6 Problem Sets

These are the problem sets that can be used for this section.

1.2.6.1 Stock markets and Output: The Bernoulli World

As we discussed, in the standard two-country, one-good model stock markets and output can have almost any correlation. This problem is just to highlight this fact. As we showed, stock prices at time $t$ are
$$S_t = \sum_{i=t}^{\infty}\sum_{s\in S} \beta^{i-t}\,y^w_t\,\pi(s,i)\,\frac{y_i(s,i)}{y^w_i(s,i)},$$
$$S^*_t = \sum_{i=t}^{\infty}\sum_{s\in S} \beta^{i-t}\,y^w_t\,\pi(s,i)\,\frac{y^*_i(s,i)}{y^w_i(s,i)}.$$

Notice that in class we derived these expressions for time 0, but you can do a simple "search and replace" and obtain the expressions at all times and states. Assume that output in each country is determined by the outcome of two coins being flipped. Assume that the first coin is flipped with payoffs
$$x_t = \begin{cases} \eta & \text{if } H \\ -\eta & \text{if } T \end{cases}$$
while the second coin has payoffs
$$\varepsilon_t = \begin{cases} \varepsilon & \text{if } H \\ -\varepsilon & \text{if } T \end{cases}$$
Assume output of each country is given by

yt = C0 + xt + εt
yt∗ = C0 + xt − εt

which obviously implies that total world output is

ytw = 2 (C0 + xt ) .

Finally, assume that in every period we flip the coins independently.
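
As a starting point for the questions below, the one-period state space can be enumerated directly. This is only a setup sketch: the function name, the parameter values, and the assumption that both coins are fair are not stated in the problem.

```python
from itertools import product


def bernoulli_world(c0, eta, eps):
    """Enumerate the four coin outcomes of one period.

    Returns a list of (probability, y_home, y_foreign, y_world) tuples,
    assuming two fair and independent coins.
    """
    states = []
    for x, e in product((eta, -eta), (eps, -eps)):
        y_home = c0 + x + e        # y_t  = C0 + x_t + eps_t
        y_foreign = c0 + x - e     # y*_t = C0 + x_t - eps_t
        states.append((0.25, y_home, y_foreign, y_home + y_foreign))
    return states


# Hypothetical parameter values, chosen only for illustration.
states = bernoulli_world(c0=10.0, eta=1.0, eps=0.5)
world_outputs = sorted({y_world for _, _, _, y_world in states})
```

World output takes only two values, $2(C_0 \pm \eta)$, because the second coin cancels in the sum; moments and expectations can then be computed as probability-weighted sums over `states`.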

1. Compute the stochastic properties of the output processes. What is the variance of each output (home, foreign, world) and what is the correlation of outputs across the world?
2. Compute the expectations $\sum_{s\in S}\pi(s,i)\,\frac{y_i(s,i)}{y^w_i(s,i)}$ and $\sum_{s\in S}\pi(s,i)\,\frac{y^*_i(s,i)}{y^w_i(s,i)}$.

3. Compute the closed-form solution for the stock prices of both countries. What is the correlation of the stock prices? What do you have to impose in this example to get no correlation in output, and perfect correlation in output?

1.2.6.2 Non-tradables in a single good model

Assume that there are three goods, that the utility of each country is the same, and that the tradable good is identical. The social planner's problem is
$$\max \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\big[u_H(c_H(s,w_s), c_{NT,H}(s,w_s)) + \Psi\, u_F(c_F(s,w_s), c_{NT,F}(s,w_s))\big]$$

where cN T,H (s, ws ) and cN T,F (s, ws ) are the non tradable consumption in each respective country. As we
said before, we assume that uH (.) = uF (.).
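The resource constraints, with their respective multipliers, are

$$c_H(s,w_s) + c_F(s,w_s) = y_H(s,w_s) + y_F(s,w_s) \qquad \text{(multiplier } \mu(s,w_s)\text{)}$$
$$c_{NT,H}(s,w_s) = y_{NT,H}(s,w_s) \qquad \text{(multiplier } \mu_{NT,H}(s,w_s)\text{)}$$
$$c_{NT,F}(s,w_s) = y_{NT,F}(s,w_s) \qquad \text{(multiplier } \mu_{NT,F}(s,w_s)\text{)}$$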

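The FOCs are

$$\frac{\partial}{\partial c_H(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,u^H_T = \mu(s,w_s)$$
$$\frac{\partial}{\partial c_{NT,H}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,u^H_{NT} = \mu_{NT,H}(s,w_s)$$
$$\frac{\partial}{\partial c_F(s,w_s)} = 0 \implies \Psi\,\beta^{s-t}\,\pi(s,w_s)\,u^F_T = \mu(s,w_s)$$
$$\frac{\partial}{\partial c_{NT,F}(s,w_s)} = 0 \implies \Psi\,\beta^{s-t}\,\pi(s,w_s)\,u^F_{NT} = \mu_{NT,F}(s,w_s)$$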

From the resource constraints we can compute the relative price of the NT goods. Assume that the numeraire is the tradable good; then the price of NT at home is $\mu_{NT,H}/\mu$ and the price of the NT good in the foreign country is $\mu_{NT,F}/\mu$. Therefore, $P_{NT,H} = u^H_{NT}/u^H_T$, and similarly for the foreign country. Because $c_{NT,H}(s,w_s) = y_{NT,H}(s,w_s)$ we know that using any traditional utility function implies that the price level in each country (a combination of the tradable and the non-tradable) is going to be stochastic as well.
Solve the problem for the Cobb-Douglas case with expenditure share on tradables equal to $\alpha$. Solve for the price of tradables and non-tradables.

1. Find the Euler Equation for the tradable consumption in each country (do not substitute for the
prices). Notice that there are two terms, one that depends on the relative consumption of the goods,
and another term that comes from the change in relative prices of the non-tradable goods. Provide
a short intuition of what this equation entails. Is there risk sharing? If relative prices of NT’s go up,
what occurs to relative consumptions of tradables?
2. Now, substitute for the prices and find what it implies.

1.3 General Equilibrium: Two Countries, Two Goods.

Let's now introduce multiple goods. We will do this only in the context of log utility because otherwise we have no simple solution.
The first assumption is to have each country have their own tree. This is the easiest production allocation
that we can think of – except single good. This is an endowment economy where each country is fully
specialized in the production of one differentiated item.
Of course, consumers derive utility from consuming the different items. We allow for the two utility
functions to be different across countries. The idea is to capture aspects such as home bias in consumption,
and non-tradable demand as part of the framework.

1.3.1 Utilities and Pareto Allocation

Assume the utilities are given by

$$u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) = \alpha_H \ln c_{HH}(s,w_s) + \beta_H \ln c_{HF}(s,w_s)$$
$$u_F(c_{FH}(s,w_s), c_{FF}(s,w_s)) = \beta_F \ln c_{FH}(s,w_s) + \alpha_F \ln c_{FF}(s,w_s)$$

where $c_{HH}(s,w_s)$ is the consumption by home agents of the home produced good; $c_{HF}(s,w_s)$ is the consumption by home agents of the foreign good; and similarly for foreigners ($c_{FH}(s,w_s)$ is foreign consumption of the home good, and $c_{FF}(s,w_s)$ is foreign consumption of the foreign good).
We also assume that αH > β H and that αF > β F . This is the home bias in consumption assumption. We
do not need this for most of the derivations, but at some point in time we are going to rely on this assumption
to characterize the evolution of the terms of trade and the real exchange rate. This assumption is sensible
because in general there are non-tradables in the economies and their consumption can be interpreted as a
home bias in the consumption. We also assume that the weights are always positive.
Notice that this is not your typical Cobb-Douglas. Instead of setting the weights to be $\alpha$ and $1-\alpha$ we have adopted a different specification. Of course, it can be transformed to the "usual" setting. The reason why we do this is the simplicity of the assumptions required on the weights. These are the expenditure weights, and as will become clear in this section, we need the expenditure shares to be stochastic to avoid simple and trivially unrealistic results. By the way, this is a very famous result from Helpman and Razin (1978), and Cole and Obstfeld (1991). So, even though trivial, it is still extremely important. We will replicate it here, but in order to "break" that result we need to make the expenditure shares stochastic. The assumptions and properties of the stochastic processes of $\alpha_H$ and $\beta_H$ are much easier to specify than the assumptions for $(\alpha_H + \beta_H)$ and $\frac{\alpha_H}{\alpha_H+\beta_H}$, which correspond to the alternative formulation

$$u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) = \xi(s,w_s)\,\big[\gamma(s,w_s) \ln c_{HH}(s,w_s) + (1-\gamma(s,w_s)) \ln c_{HF}(s,w_s)\big]$$
$$\gamma(s,w_s) = \frac{\alpha_H(s,w_s)}{\alpha_H(s,w_s) + \beta_H(s,w_s)}$$
$$\xi(s,w_s) = \alpha_H(s,w_s) + \beta_H(s,w_s)$$

which "seems" more natural. We come back to these points later when we introduce demand shocks. For the moment we will assume they are constant and proceed in the exact same way we did before.
We proceed to the Pareto allocation in exactly the same manner as in the previous case. The social planner optimizes
$$\max \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\big[u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) + \Psi\, u_F(c_{FH}(s,w_s), c_{FF}(s,w_s))\big]$$

subject to the resource constraints

$$c_{HH}(s,w_s) + c_{FH}(s,w_s) = y_H(s,w_s)$$
$$c_{HF}(s,w_s) + c_{FF}(s,w_s) = y_F(s,w_s)$$

In this problem there are several interesting aspects. The multipliers on the resource constraints have two
properties. Their ratio is equal to the relative price of the items. In other words, their ratio is a measure of
the terms of trade. This is going to become an extremely important component of our discussion and the
transmission mechanisms. In addition, the two multipliers are the product of the price of risk (same for both
multipliers) times the price of the items (once a numeraire has been decided).
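The optimization is
$$\begin{aligned}
\mathcal{L} = \sum_{s=t}^{\infty}\sum_{w} \Big\{ & \beta^{s-t}\,\pi(s,w_s)\,\big[u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) + \Psi\, u_F(c_{FH}(s,w_s), c_{FF}(s,w_s))\big] \\
& + \mu_H(s,w_s)\,\big[y_H(s,w_s) - c_{HH}(s,w_s) - c_{FH}(s,w_s)\big] \\
& + \mu_F(s,w_s)\,\big[y_F(s,w_s) - c_{HF}(s,w_s) - c_{FF}(s,w_s)\big] \Big\}
\end{aligned}$$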

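The first order conditions are

$$\frac{\partial}{\partial c_{HH}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\alpha_H}{c_{HH}(s,w_s)} = \mu_H(s,w_s)$$
$$\frac{\partial}{\partial c_{HF}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\beta_H}{c_{HF}(s,w_s)} = \mu_F(s,w_s)$$
$$\frac{\partial}{\partial c_{FH}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\Psi\,\frac{\beta_F}{c_{FH}(s,w_s)} = \mu_H(s,w_s)$$
$$\frac{\partial}{\partial c_{FF}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\Psi\,\frac{\alpha_F}{c_{FF}(s,w_s)} = \mu_F(s,w_s)$$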

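Substituting in the resource constraints, we obtain

$$c_{HH}(s,w_s) = \frac{\alpha_H}{\alpha_H + \Psi\beta_F}\,y_H(s,w_s) \tag{1.11}$$
$$c_{HF}(s,w_s) = \frac{\beta_H}{\beta_H + \Psi\alpha_F}\,y_F(s,w_s)$$
$$c_{FH}(s,w_s) = \frac{\Psi\beta_F}{\alpha_H + \Psi\beta_F}\,y_H(s,w_s)$$
$$c_{FF}(s,w_s) = \frac{\Psi\alpha_F}{\beta_H + \Psi\alpha_F}\,y_F(s,w_s)$$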

For the moment we have assumed that the expenditure shares are constant, and that the Pareto Weight
is also constant. Therefore, there is no (s, ws ) term next to them.
We can solve for the multipliers.
$$\mu_H(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,(\alpha_H + \Psi\beta_F)\,\frac{1}{y_H(s,w_s)} \tag{1.12}$$
$$\mu_F(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,(\beta_H + \Psi\alpha_F)\,\frac{1}{y_F(s,w_s)}$$
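
The allocation in (1.11) and the multipliers in (1.12) can be checked numerically for a single state. The sketch below is illustrative only: the function name and all parameter values are assumptions, and the multipliers are reported after dividing by the common factor $\beta^{s-t}\pi(s,w_s)$.

```python
def pareto_allocation(alpha_h, beta_h, alpha_f, beta_f, psi, y_home, y_foreign):
    """Consumption of each good by each country, following equation (1.11)."""
    c_hh = alpha_h / (alpha_h + psi * beta_f) * y_home
    c_fh = psi * beta_f / (alpha_h + psi * beta_f) * y_home
    c_hf = beta_h / (beta_h + psi * alpha_f) * y_foreign
    c_ff = psi * alpha_f / (beta_h + psi * alpha_f) * y_foreign
    return c_hh, c_hf, c_fh, c_ff


# Assumed weights with home bias (alpha > beta in both countries).
alpha_h, beta_h, alpha_f, beta_f, psi = 0.7, 0.3, 0.6, 0.4, 1.2
y_home, y_foreign = 1.5, 2.0
c_hh, c_hf, c_fh, c_ff = pareto_allocation(alpha_h, beta_h, alpha_f, beta_f,
                                           psi, y_home, y_foreign)

# Home-good multiplier implied by each first order condition (scaled),
# and the closed form from equation (1.12) (scaled the same way).
mu_h_from_home = alpha_h / c_hh
mu_h_from_foreign = psi * beta_f / c_fh
mu_h_closed_form = (alpha_h + psi * beta_f) / y_home
```

Both first order conditions for the home good return the same multiplier, and the two resource constraints hold exactly.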

1.3.1.1 Terms of Trade

The first step is to compute the terms of trade as the ratio of the multipliers in equation (1.12). The price of Home relative to Foreign is given by

$$T(s,w_s) = \frac{\mu_H(s,w_s)}{\mu_F(s,w_s)}$$
$$T(s,w_s) = \frac{\alpha_H + \Psi\beta_F}{\beta_H + \Psi\alpha_F}\,\frac{y_F(s,w_s)}{y_H(s,w_s)} \tag{1.13}$$

There are three aspects that are worth highlighting about the terms of trade.

1.3.1.1.1 Ricardian Effect:

The Ricardian Effect refers to the impact that the world supply of goods has on the relative prices of the goods. Ricardo said that an increase in the supply of Home goods, for instance, should reduce its relative price. Notice that this is exactly what the term $\frac{y_F(s,w_s)}{y_H(s,w_s)}$ is doing. In any state of nature when $y_F(s,w_s)$ increases, or when $y_H(s,w_s)$ decreases, the relative price of home goods rises.

1.3.1.1.2 Dependent Economy Effect:

The Dependent Economy effect refers to the impact on relative prices when relative demands shift. This effect was first introduced in the 60's by Salter and Swan.1 But years later, Rudi Dornbusch actually was able to formulate it and explain it to humankind.2 He called it the Dependent Economy, and I have always kept that name in his honor.

1 See Salter (1959) and Swan (1960).


2 See Dornbusch (1980) and Dornbusch (1983).

The Dependent Economy effect says that if demand shifts toward home goods, the price of home goods will go up, or the real exchange rate will appreciate (more on this below). If home demand shifts toward home goods, then $\alpha_H$ increases. If foreign demand shifts toward home goods, then $\beta_F$ increases. Notice that both are in the numerator of $T(s,w_s)$, implying that they increase the relative price of home goods (as expected).
It is important to highlight that the Dependent Economy effect is independent of the values of the expenditure shares: the $\alpha$'s and $\beta$'s can be any positive numbers and a shift in them still has the desired impact on the terms of trade.
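
Both effects can be read off equation (1.13) with a few comparative statics. The baseline parameter values and the function name below are assumptions chosen for illustration.

```python
def terms_of_trade(alpha_h, beta_h, alpha_f, beta_f, psi, y_home, y_foreign):
    # T = (alpha_H + Psi*beta_F) / (beta_H + Psi*alpha_F) * y_F / y_H, equation (1.13)
    return (alpha_h + psi * beta_f) / (beta_h + psi * alpha_f) * y_foreign / y_home


base = dict(alpha_h=0.7, beta_h=0.3, alpha_f=0.6, beta_f=0.4,
            psi=1.0, y_home=1.0, y_foreign=1.0)       # assumed baseline
t_base = terms_of_trade(**base)

# Ricardian effect: more home output lowers the relative price of home goods.
t_more_home_output = terms_of_trade(**{**base, "y_home": 1.2})

# Dependent Economy effect: demand shifts toward home goods raise T.
t_home_demand_shift = terms_of_trade(**{**base, "alpha_h": 0.8})
t_foreign_demand_shift = terms_of_trade(**{**base, "beta_f": 0.5})
```

Here each weight is shifted on its own, holding the others fixed, which is possible because the weights are not restricted to sum to one.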

1.3.1.1.3 Wealth Transfer Effect:

The Wealth Transfer Effect originates in a debate in the late 1920s between Bertil Ohlin and John Maynard Keynes, known at the time as the "Transfer Problem", regarding the true burden of the reparations payments demanded of Germany after World War I. Keynes argued that the payments would result in a reduction of the demand for German goods and cause a deterioration of the German terms of trade, making the burden on Germany much higher than the actual value of the payments. On the other hand, Ohlin's view was that the shift in demand would have no impact on relative prices. This implication would be correct if all countries had the exact same demands, i.e. the exact same expenditure shares, $\alpha_H = \beta_F$ and $\alpha_F = \beta_H$.
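In our case, because we are assuming that $\alpha_i > \beta_i$, a change in $\Psi$ has the following implication for the relative prices:
$$\frac{\partial T(s,w_s)}{\partial \Psi} = \frac{\beta_F\,(\beta_H + \Psi\alpha_F) - \alpha_F\,(\alpha_H + \Psi\beta_F)}{(\beta_H + \Psi\alpha_F)^2}\,\frac{y_F(s,w_s)}{y_H(s,w_s)}$$
$$\operatorname{sign}\frac{\partial T(s,w_s)}{\partial \Psi} = \operatorname{sign}\,(\beta_F\beta_H - \alpha_F\alpha_H) < 0$$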
So, an increase in $\Psi$, which is equivalent to a transfer of wealth from home to foreign, reduces the relative demand for home products. Why? Consumption is proportional to wealth, and foreigners spend a higher proportion of their budget on their own goods. Therefore, a transfer of wealth from home to foreign reduces the demand for home products by more than foreigners increase it, and increases the demand for foreign goods because foreigners increase it by more than home reduces it. In the end, there is a relative increase in the demand for foreign goods, increasing their relative price.
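
The sign of the transfer effect can be confirmed by comparing the analytic derivative with a finite difference. The sketch uses hypothetical weights with home bias, plus one case with identical demands (Ohlin's case); the function names and step size are assumptions.

```python
def terms_of_trade(alpha_h, beta_h, alpha_f, beta_f, psi, y_home, y_foreign):
    return (alpha_h + psi * beta_f) / (beta_h + psi * alpha_f) * y_foreign / y_home


def dT_dpsi(alpha_h, beta_h, alpha_f, beta_f, psi, y_home, y_foreign):
    # Analytic derivative; the numerator simplifies to beta_F*beta_H - alpha_F*alpha_H.
    numerator = beta_f * beta_h - alpha_f * alpha_h
    return numerator / (beta_h + psi * alpha_f) ** 2 * y_foreign / y_home


params = (0.7, 0.3, 0.6, 0.4)       # assumed alpha_H, beta_H, alpha_F, beta_F
psi, y_home, y_foreign, step = 1.0, 1.0, 1.0, 1e-6

analytic = dT_dpsi(*params, psi, y_home, y_foreign)
numeric = (terms_of_trade(*params, psi + step, y_home, y_foreign)
           - terms_of_trade(*params, psi - step, y_home, y_foreign)) / (2 * step)

# Ohlin's case: identical demands (alpha_H = beta_F and alpha_F = beta_H).
ohlin = dT_dpsi(0.6, 0.4, 0.4, 0.6, psi, y_home, y_foreign)
```

With home bias the derivative is negative (Keynes), while with identical demands the transfer leaves the terms of trade unchanged (Ohlin).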

1.3.1.2 Goods’ Prices and Exchange Rates

When you have several goods it is crucial to decide on a numeraire. This is not always simple, especially
in our case where the consumers have two different utility functions, and utility functions that are going to
be shifting soon. Therefore, just defining what is a reasonable numeraire is not easy at all. Additionally, we
would not like to have changes across countries – and volatility – caused by shifts in the numeraire. For that
reason, we assume a constant weight
$$a \cdot p_H(s,w_s) + (1-a) \cdot p_F(s,w_s) = 1$$
but
$$p_H(s,w_s) = p_F(s,w_s) \cdot T(s,w_s)$$
therefore
$$p_H(s,w_s) = \frac{T(s,w_s)}{a\,T(s,w_s) + (1-a)} \tag{1.14}$$
$$p_F(s,w_s) = \frac{1}{a\,T(s,w_s) + (1-a)}$$
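
A short sketch of equation (1.14): given the terms of trade and the basket weight, it returns both goods prices. The function name and the numerical values are assumptions for illustration.

```python
def goods_prices(terms_of_trade, a):
    # Numeraire: a * p_H + (1 - a) * p_F = 1, combined with p_H = p_F * T.
    p_foreign = 1.0 / (a * terms_of_trade + (1.0 - a))
    p_home = terms_of_trade * p_foreign
    return p_home, p_foreign


a = 0.5                                      # assumed basket weight
p_home, p_foreign = goods_prices(1.25, a)    # assumed terms of trade T = 1.25
basket_value = a * p_home + (1.0 - a) * p_foreign
```

By construction the basket is always worth one unit, so movements in $T$ show up as opposite movements in the two goods prices.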

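From the utility of each country

$$\begin{aligned}
P_H &= \left(\frac{p_H}{\frac{\alpha_H}{\alpha_H+\beta_H}}\right)^{\frac{\alpha_H}{\alpha_H+\beta_H}} \left(\frac{p_F}{\frac{\beta_H}{\alpha_H+\beta_H}}\right)^{\frac{\beta_H}{\alpha_H+\beta_H}} \\
&= T^{\frac{\alpha_H}{\alpha_H+\beta_H}}\,\frac{1}{a\,T + (1-a)}\,K_H
\end{aligned}$$

where $K_H$ is a constant that depends only on the expenditure shares.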

The price level in each country is different, and their ratio is (up to the constant factor $K_H/K_F$)

$$\frac{P_H}{P_F} = T^{\frac{\alpha_H}{\alpha_H+\beta_H} - \frac{\beta_F}{\alpha_F+\beta_F}}.$$

In general, due to home bias in consumption, $\frac{\alpha_H}{\alpha_H+\beta_H} > \frac{\beta_F}{\alpha_F+\beta_F}$, i.e., the expenditure share of home agents on the home good is larger than the expenditure share of foreign agents on the home good, so the terms of trade and the real exchange rate move together.
Notice that this model implies that PPP does not hold. Permanent changes in the relative outputs imply permanent changes in the terms of trade and in the price levels.
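
The link between the terms of trade and the real exchange rate can be illustrated as follows. The sketch ignores any constant of proportionality in the price-level ratio (an assumption), and the function name and weights are hypothetical.

```python
def real_exchange_rate(terms_of_trade, alpha_h, beta_h, alpha_f, beta_f):
    # P_H / P_F = T^(gamma_H - gamma_F), ignoring constant factors, where
    # gamma_H and gamma_F are the home and foreign expenditure shares on the home good.
    gamma_h = alpha_h / (alpha_h + beta_h)
    gamma_f = beta_f / (alpha_f + beta_f)
    return terms_of_trade ** (gamma_h - gamma_f)


# Home bias (assumed weights): the real exchange rate rises with T.
rer_low = real_exchange_rate(1.0, 0.7, 0.3, 0.6, 0.4)
rer_high = real_exchange_rate(1.2, 0.7, 0.3, 0.6, 0.4)

# Identical expenditure shares: the real exchange rate does not move with T.
rer_no_bias = real_exchange_rate(1.2, 0.5, 0.5, 0.5, 0.5)
```

With home bias the exponent is positive, so an improvement in the terms of trade appreciates the real exchange rate; with identical shares the exponent is zero and PPP holds.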

1.3.2 Competitive Equilibrium

Let's now solve the competitive equilibrium problem.


We assume that home owns the domestic production. As before, they consume home and foreign goods.
They use their income to purchase the home and foreign goods, and also to purchase all the AD assets
that are available. We will have two different sets of AD assets. And the reason is very simple, we need to
deliver both domestic and foreign goods. This means that we have prices for both. Finally, just for notational
simplicity, we assume no initial holdings of the bond.
$$\max \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,u_H(c_{HH}(s,w_s), c_{HF}(s,w_s))$$
subject to
$$\begin{aligned}
p_H(t,w_t)\,y_H(t,w_t) = {} & p_H(t,w_t)\,c_{HH}(t,w_t) + p_F(t,w_t)\,c_{HF}(t,w_t) \\
& + \sum_{s=t+1}^{\infty}\sum_{w} q_H(s,w_s)\,B_H(s,w_s)\,p_H(t,w_t) \\
& + \sum_{s=t+1}^{\infty}\sum_{w} q_F(s,w_s)\,B_F(s,w_s)\,p_F(t,w_t)
\end{aligned}$$
$$p_H(s,w_s)\,c_{HH}(s,w_s) + p_F(s,w_s)\,c_{HF}(s,w_s) = p_H(s,w_s)\,y_H(s,w_s) + p_H(s,w_s)\,B_H(s,w_s) + p_F(s,w_s)\,B_F(s,w_s)$$

Assume the multiplier for the first equation is $\lambda_t$ while the multiplier for the second set of constraints (one for each $(s,w_s)$) is $\lambda(s,w_s)$. The consumer maximizes over both levels of consumption and also the AD assets purchased. Also remember that our utility function is the log utility function we used before:
$$\frac{\partial}{\partial c_{HH}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\alpha_H}{c_{HH}(s,w_s)} = p_H(s,w_s)\,\lambda(s,w_s)$$
$$\frac{\partial}{\partial c_{HF}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\beta_H}{c_{HF}(s,w_s)} = p_F(s,w_s)\,\lambda(s,w_s)$$
$$\frac{\partial}{\partial B_H(s,w_s)} = 0 \implies -\lambda_t\,q_H(s,w_s)\,p_H(t,w_t) + \lambda(s,w_s)\,p_H(s,w_s) = 0$$
$$\frac{\partial}{\partial B_F(s,w_s)} = 0 \implies -\lambda_t\,q_F(s,w_s)\,p_F(t,w_t) + \lambda(s,w_s)\,p_F(s,w_s) = 0$$

From the last two equations we can find the prices of the AD assets

$$\lambda_t\,q_H(s,w_s)\,p_H(t,w_t) = \lambda(s,w_s)\,p_H(s,w_s)$$
$$\lambda_t\,q_F(s,w_s)\,p_F(t,w_t) = \lambda(s,w_s)\,p_F(s,w_s)$$
$$\frac{q_H(s,w_s)}{q_F(s,w_s)} = \frac{p_F(t,w_t)}{p_H(t,w_t)}\,\frac{p_H(s,w_s)}{p_F(s,w_s)}$$
$$\frac{q_H(s,w_s)}{q_F(s,w_s)} = \frac{T(s,w_s)}{T(t,w_t)}$$

In the end, the purpose of the competitive equilibrium is to find the prices of the AD assets that sustain the competitive equilibrium. Hence, we have to substitute the consumptions into the AD equations. From the FOC we have the following two relationships
$$\beta^{s-t}\,\pi(s,w_s)\,\frac{\alpha_H}{c_{HH}(s,w_s)} = p_H(s,w_s)\,\lambda(s,w_s)$$
$$\frac{\alpha_H}{c_{HH}(t,w_t)} = p_H(t,w_t)\,\lambda_t$$

but we know (from the third FOC) that

$$q_H(s,w_s)\,\lambda_t\,p_H(t,w_t) = \lambda(s,w_s)\,p_H(s,w_s)$$
$$q_H(s,w_s)\,\frac{\alpha_H}{c_{HH}(t,w_t)} = \beta^{s-t}\,\pi(s,w_s)\,\frac{\alpha_H}{c_{HH}(s,w_s)}$$
$$q_H(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,\frac{c_{HH}(t,w_t)}{c_{HH}(s,w_s)}$$

Similarly,
$$q_F(s,w_s) = \beta^{s-t}\,\pi(s,w_s)\,\frac{c_{HF}(t,w_t)}{c_{HF}(s,w_s)}$$
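
As a consistency check, the two AD prices derived from consumption should satisfy $q_H/q_F = T(s,w_s)/T(t,w_t)$ once the Pareto allocation is substituted. The sketch below verifies this for one hypothetical future state; all names and values are assumptions.

```python
def home_consumption(weights, psi, y_home, y_foreign):
    # Pareto allocation for home agents under constant expenditure weights.
    alpha_h, beta_h, alpha_f, beta_f = weights
    c_hh = alpha_h / (alpha_h + psi * beta_f) * y_home
    c_hf = beta_h / (beta_h + psi * alpha_f) * y_foreign
    return c_hh, c_hf


def terms_of_trade(weights, psi, y_home, y_foreign):
    alpha_h, beta_h, alpha_f, beta_f = weights
    return (alpha_h + psi * beta_f) / (beta_h + psi * alpha_f) * y_foreign / y_home


def ad_prices(discount, horizon, prob, weights, psi, today, future):
    """Return (q_H, q_F); `today` and `future` are (y_H, y_F) pairs."""
    c_hh_t, c_hf_t = home_consumption(weights, psi, *today)
    c_hh_s, c_hf_s = home_consumption(weights, psi, *future)
    scale = discount ** horizon * prob
    return scale * c_hh_t / c_hh_s, scale * c_hf_t / c_hf_s


weights = (0.7, 0.3, 0.6, 0.4)               # assumed alpha_H, beta_H, alpha_F, beta_F
psi, discount = 1.2, 0.95                    # assumed Pareto weight and discount factor
today, future = (1.0, 1.0), (1.3, 0.8)       # assumed (y_H, y_F) today and in state (s, w_s)

q_home, q_foreign = ad_prices(discount, 2, 0.25, weights, psi, today, future)
price_ratio = q_home / q_foreign
tot_ratio = terms_of_trade(weights, psi, *future) / terms_of_trade(weights, psi, *today)
```

Because the consumption shares are constant, the expenditure weights and the Pareto weight cancel from both ratios, leaving only relative outputs.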

1.3.3 Asset Prices

Let us start with the stock prices first. As we did previously, the price of the stock can be replicated
with the AD assets. It is a claim that delivers yH (s, ws ) home goods in each of the states of the world:
BH (s, ws ) = yH (s, ws ).
$$S_{H,t} = \sum_{s=t}^{\infty}\sum_{w} q_H(s,w_s)\,y_H(s,w_s)\,p_H(t,w_t)$$

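So, just to clarify, the home AD assets deliver home goods and have to be paid for today using home goods. That is why $p_H(t,w_t)$ appears in the budget constraint. Substituting,
$$\begin{aligned}
S_{H,t} &= \sum_{s=t}^{\infty}\sum_{w} \beta^{s-t}\,\pi(s,w_s)\,\frac{c_{HH}(t,w_t)}{c_{HH}(s,w_s)}\,y_H(s,w_s)\,p_H(t,w_t) \\
&= \sum_{s=t}^{\infty} \beta^{s-t}\,p_H(t,w_t)\,c_{HH}(t,w_t)\sum_{w} \pi(s,w_s)\,\frac{y_H(s,w_s)}{c_{HH}(s,w_s)}
\end{aligned}$$

From the Pareto allocation we have

$$c_{HH}(s,w_s) = \frac{\alpha_H}{\alpha_H + \Psi\beta_F}\,y_H(s,w_s)$$

$$\begin{aligned}
S_{H,t} &= \sum_{s=t}^{\infty} \beta^{s-t}\,p_H(t,w_t)\,c_{HH}(t,w_t)\sum_{w} \pi(s,w_s)\,\frac{\alpha_H + \Psi\beta_F}{\alpha_H} \\
&= \sum_{s=t}^{\infty} \beta^{s-t}\,p_H(t,w_t)\,y_H(t,w_t)\,\frac{\alpha_H}{\alpha_H + \Psi\beta_F}\,\frac{\alpha_H + \Psi\beta_F}{\alpha_H} \\
&= \frac{1}{1-\beta}\,p_H(t,w_t)\,y_H(t,w_t)
\end{aligned}$$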

Similarly, the foreign stock is

$$S_{F,t} = \frac{1}{1-\beta}\,p_F(t,w_t)\,y_F(t,w_t)$$
Therefore, the stock prices are perfectly correlated:

$$\frac{S_{H,t}}{S_{F,t}} = \frac{p_H(t,w_t)\,y_H(t,w_t)}{p_F(t,w_t)\,y_F(t,w_t)} = T(t,w_t)\,\frac{y_H(t,w_t)}{y_F(t,w_t)} = \frac{\alpha_H + \Psi\beta_F}{\beta_H + \Psi\alpha_F}$$

This is a very famous result (well, famous only if you have no life whatsoever and you are doing a Ph.D. instead of making money and enjoying a great and fruitful career. Oh, wait! You are doing a Ph.D.!). This has been found by several important papers in the literature: Helpman and Razin (1978), Cole and Obstfeld (1991), and Zapatero (1995). In this simple model, the terms of trade diversify all the world risk and, therefore, financial markets are redundant. In other words, the exact same Pareto allocation can be achieved with or without financial market completeness.
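
The constant ratio of stock prices can be seen numerically by evaluating both closed forms across several output states. The function name, weights, numeraire weight, and output states below are assumptions made for illustration.

```python
def stock_prices(discount, a, weights, psi, y_home, y_foreign):
    """Closed-form S_H and S_F under constant expenditure weights."""
    alpha_h, beta_h, alpha_f, beta_f = weights
    tot = (alpha_h + psi * beta_f) / (beta_h + psi * alpha_f) * y_foreign / y_home
    p_foreign = 1.0 / (a * tot + (1.0 - a))     # numeraire: a*p_H + (1-a)*p_F = 1
    p_home = tot * p_foreign
    s_home = p_home * y_home / (1.0 - discount)
    s_foreign = p_foreign * y_foreign / (1.0 - discount)
    return s_home, s_foreign


weights, psi = (0.7, 0.3, 0.6, 0.4), 1.2             # assumed values
states = [(1.0, 1.0), (1.4, 0.9), (0.6, 1.7)]        # assumed (y_H, y_F) realizations

ratios = []
for y_home, y_foreign in states:
    s_home, s_foreign = stock_prices(0.95, 0.5, weights, psi, y_home, y_foreign)
    ratios.append(s_home / s_foreign)

constant = (0.7 + 1.2 * 0.4) / (0.3 + 1.2 * 0.6)     # (alpha_H + Psi*beta_F)/(beta_H + Psi*alpha_F)
```

Whatever the output realization, the terms of trade offset the change in relative quantities, so the ratio of the two stock prices never moves.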

1.3.3.1 Issues with this model

There are several problems with this model, obviously

1. Stock markets are perfectly correlated.


2. Because of that, portfolio allocation is indeterminate – any portfolio allocation sustains equilibrium
3. Consumptions are also perfectly correlated.

The literature has three main solutions to this problem: demand shocks, financial constraints, and non-log
utility.

1.4 Demand Shocks

The assumption now is that $\alpha_H(s,w_s)$ and $\beta_H(s,w_s)$ are both stochastic variables. For the moment we will assume that markets are complete; that is, there are enough financial assets to handle the uncertainty arising from the demand shocks. For example, assume that $\alpha_H = 1 - \beta_H$, which is a reasonable assumption, and that $\alpha_H$ is uncorrelated with output; then we only need two bonds and two stocks to fully span the space. Continuous trading in this environment will guarantee that the economy behaves as if it were under complete markets.

1.4.1 Social Planner

In our case, we will evaluate the market as if markets are complete and then price all the other assets, using the exact same procedure. First, the Pareto optimal problem:

$$\begin{aligned}
\mathcal{L} = \sum_{s=t}^{\infty}\sum_{w} \Big\{ & \beta^{s-t}\,\pi(s,w_s)\,\big[u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) + \Psi\, u_F(c_{FH}(s,w_s), c_{FF}(s,w_s))\big] \\
& + \mu_H(s,w_s)\,\big[y_H(s,w_s) - c_{HH}(s,w_s) - c_{FH}(s,w_s)\big] \\
& + \mu_F(s,w_s)\,\big[y_F(s,w_s) - c_{HF}(s,w_s) - c_{FF}(s,w_s)\big] \Big\}
\end{aligned}$$
where
$$u_H(c_{HH}(s,w_s), c_{HF}(s,w_s)) = \alpha_H(s,w_s) \ln c_{HH}(s,w_s) + \beta_H(s,w_s) \ln c_{HF}(s,w_s)$$
$$u_F(c_{FH}(s,w_s), c_{FF}(s,w_s)) = \beta_F \ln c_{FH}(s,w_s) + \alpha_F \ln c_{FF}(s,w_s)$$

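The first order conditions are

$$\frac{\partial}{\partial c_{HH}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\alpha_H(s,w_s)}{c_{HH}(s,w_s)} = \mu_H(s,w_s)$$
$$\frac{\partial}{\partial c_{HF}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\frac{\beta_H(s,w_s)}{c_{HF}(s,w_s)} = \mu_F(s,w_s)$$
$$\frac{\partial}{\partial c_{FH}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\Psi\,\frac{\beta_F}{c_{FH}(s,w_s)} = \mu_H(s,w_s)$$
$$\frac{\partial}{\partial c_{FF}(s,w_s)} = 0 \implies \beta^{s-t}\,\pi(s,w_s)\,\Psi\,\frac{\alpha_F}{c_{FF}(s,w_s)} = \mu_F(s,w_s)$$
Substituting in the resource constraints, we obtain
$$c_{HH}(s,w_s) = \frac{\alpha_H(s,w_s)}{\alpha_H(s,w_s) + \Psi\beta_F}\,y_H(s,w_s)$$
$$c_{HF}(s,w_s) = \frac{\beta_H(s,w_s)}{\beta_H(s,w_s) + \Psi\alpha_F}\,y_F(s,w_s)$$
$$c_{FH}(s,w_s) = \frac{\Psi\beta_F}{\alpha_H(s,w_s) + \Psi\beta_F}\,y_H(s,w_s)$$
$$c_{FF}(s,w_s) = \frac{\Psi\alpha_F}{\beta_H(s,w_s) + \Psi\alpha_F}\,y_F(s,w_s)$$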

This, as can be seen, has the exact same structure as before. From here, as we did before, we can compute the terms of trade
$$T(s,w_s) = \frac{\alpha_H(s,w_s) + \Psi\beta_F}{\beta_H(s,w_s) + \Psi\alpha_F}\,\frac{y_F(s,w_s)}{y_H(s,w_s)}$$
Again, notice that the equation is virtually the same except that now the movements in the expenditure shares will have an effect on the terms of trade.
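
To see the demand shock at work, the sketch below holds outputs fixed and lets $\alpha_H(s,w_s)$ vary across states, imposing $\beta_H = 1 - \alpha_H$ as in the example above. The function name, the realizations of $\alpha_H$, and the other values are hypothetical.

```python
def terms_of_trade(alpha_h, beta_h, alpha_f, beta_f, psi, y_home, y_foreign):
    # Same formula as before, now evaluated with state-dependent home weights.
    return (alpha_h + psi * beta_f) / (beta_h + psi * alpha_f) * y_foreign / y_home


alpha_f, beta_f, psi = 0.6, 0.4, 1.0       # assumed constant foreign weights and Pareto weight
y_home = y_foreign = 1.0                   # outputs held fixed to isolate the demand shock

alpha_h_states = [0.6, 0.7, 0.8]           # assumed realizations of alpha_H(s, w_s)
tot_by_state = [
    terms_of_trade(a_h, 1.0 - a_h, alpha_f, beta_f, psi, y_home, y_foreign)
    for a_h in alpha_h_states
]
```

Even with constant outputs, the terms of trade now move across states: a stronger home preference for home goods raises their relative price.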

1.4.2 Competitive Equilibrium

The problem arises when we want to solve for the competitive equilibrium. Here, in general, we need far more structure. As we saw in the previous case, when we have two sources of uncertainty (output at home and abroad) we need two AD assets for each state. If we add demand shocks that are uncorrelated with output, then we need even more AD securities. In the discrete time problem this is cumbersome, in the sense that we need to keep track of several assets. One easy way to see the solution of the problem is to assume that there are only two sources of risk, $y_H(s,w_s)$ and $y_F(s,w_s)$, and that $\alpha_H(s,w_s)$ and $\beta_H(s,w_s)$ are linear functions of the underlying shocks. In this case, the bonds and the two AD securities from before are more than enough to span the risk and markets are complete. Assume we can write
$$\alpha_H(s,w_s) = \sigma_{\alpha,H} \cdot y_H(s,w_s) + \sigma_{\alpha,F} \cdot y_F(s,w_s)$$
$$\beta_H(s,w_s) = \sigma_{\beta,H} \cdot y_H(s,w_s) + \sigma_{\beta,F} \cdot y_F(s,w_s)$$

As before, assume no initial holdings of the bond.


\[
\max \sum_{s=t}^{\infty}\sum_{w}\beta^{s-t}\pi(s,w_s)\,u_H\big(c_{HH}(s,w_s),\,c_{HF}(s,w_s)\big)
\]
subject to
\[
\begin{aligned}
p_H(t,w_t)\,y_H(t,w_t) &= p_H(t,w_t)\,c_{HH}(t,w_t) + p_F(t,w_t)\,c_{HF}(t,w_t) \\
&\quad + \sum_{s=t+1}^{\infty}\sum_{w} q_H(s,w_s)\,B_H(s,w_s)\,p_H(t,w_t) \\
&\quad + \sum_{s=t+1}^{\infty}\sum_{w} q_F(s,w_s)\,B_F(s,w_s)\,p_F(t,w_t) \\
p_H(s,w_s)\,c_{HH}(s,w_s) + p_F(s,w_s)\,c_{HF}(s,w_s) &= p_H(s,w_s)\,y_H(s,w_s) + p_H(s,w_s)\,B_H(s,w_s) + p_F(s,w_s)\,B_F(s,w_s)
\end{aligned}
\]
Assume the multiplier for the first equation is $\lambda_t$, while the multiplier for the second set of constraints (one
for each $(s,w_s)$) is $\lambda(s,w_s)$. The consumer chooses both levels of consumption and also the AD assets
purchased. Also remember that our utility function is the log utility function we used before. The first order conditions are
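\[
\begin{aligned}
\frac{\partial}{\partial c_{HH}(s,w_s)} = 0 &\implies \beta^{s-t}\pi(s,w_s)\frac{\alpha_H(s,w_s)}{c_{HH}(s,w_s)} = p_H(s,w_s)\,\lambda(s,w_s) \\
\frac{\partial}{\partial c_{HF}(s,w_s)} = 0 &\implies \beta^{s-t}\pi(s,w_s)\frac{\beta_H(s,w_s)}{c_{HF}(s,w_s)} = p_F(s,w_s)\,\lambda(s,w_s) \\
\frac{\partial}{\partial B_H(s,w_s)} = 0 &\implies -\lambda_t\,q_H(s,w_s)\,p_H(t,w_t) + \lambda(s,w_s)\,p_H(s,w_s) = 0 \\
\frac{\partial}{\partial B_F(s,w_s)} = 0 &\implies -\lambda_t\,q_F(s,w_s)\,p_F(t,w_t) + \lambda(s,w_s)\,p_F(s,w_s) = 0
\end{aligned}
\]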

From the last two equations we can find the prices of the AD assets
\[
\begin{aligned}
\lambda_t\,q_H(s,w_s)\,p_H(t,w_t) &= \lambda(s,w_s)\,p_H(s,w_s) \\
\lambda_t\,q_F(s,w_s)\,p_F(t,w_t) &= \lambda(s,w_s)\,p_F(s,w_s)
\end{aligned}
\]
which, as before, imply
\[
\frac{q_H(s,w_s)}{q_F(s,w_s)} = \frac{p_F(t,w_t)}{p_H(t,w_t)}\cdot\frac{p_H(s,w_s)}{p_F(s,w_s)} = \frac{T(s,w_s)}{T(t,w_t)}
\]
which are the same as before, except that now they will be affected by how the terms of trade respond
to the demand shocks.
As before, substitute the consumptions into the AD equations. From the FOC we have the following two
relationships. At time s and state w consumption satisfies

\[
\beta^{s-t}\pi(s,w_s)\frac{\alpha_H(s,w_s)}{c_{HH}(s,w_s)} = p_H(s,w_s)\,\lambda(s,w_s)
\]

and at time t (today), consumption satisfies

\[
\frac{\alpha_H(t,w_t)}{c_{HH}(t,w_t)} = p_H(t,w_t)\,\lambda_t
\]

In the previous exercise we did not keep track of the expenditure share, but now we do have to keep track.
From the third FOC we have

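\[
q_H(s,w_s)\,\lambda_t\,p_H(t,w_t) = \lambda(s,w_s)\,p_H(s,w_s)
\]
\[
\begin{aligned}
q_H(s,w_s)\frac{\alpha_H(t,w_t)}{c_{HH}(t,w_t)} &= \beta^{s-t}\pi(s,w_s)\frac{\alpha_H(s,w_s)}{c_{HH}(s,w_s)} \\
q_H(s,w_s) &= \beta^{s-t}\pi(s,w_s)\frac{c_{HH}(t,w_t)}{c_{HH}(s,w_s)}\cdot\frac{\alpha_H(s,w_s)}{\alpha_H(t,w_t)}
\end{aligned}
\]

similarly
\[
q_F(s,w_s) = \beta^{s-t}\pi(s,w_s)\frac{c_{HF}(t,w_t)}{c_{HF}(s,w_s)}\cdot\frac{\beta_H(s,w_s)}{\beta_H(t,w_t)}
\]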

Notice that now the AD prices include the movements of the expenditure shares.

1.4.3 Asset Prices

Let us start with the stock prices first. As we did previously, the price of the stock can be replicated
with the AD assets. It is a claim that delivers yH (s, ws ) home goods in each of the states of the world:
BH (s, ws ) = yH (s, ws ).
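\[
S_{H,t} = \sum_{s=t}^{\infty}\sum_{w} q_H(s,w_s)\,y_H(s,w_s)\,p_H(t,w_t)
\]

Substituting
\[
\begin{aligned}
S_{H,t} &= \sum_{s=t}^{\infty}\sum_{w}\beta^{s-t}\pi(s,w_s)\frac{c_{HH}(t,w_t)}{c_{HH}(s,w_s)}\cdot\frac{\alpha_H(s,w_s)}{\alpha_H(t,w_t)}\,y_H(s,w_s)\,p_H(t,w_t) \\
&= \sum_{s=t}^{\infty}\beta^{s-t}p_H(t,w_t)\frac{c_{HH}(t,w_t)}{\alpha_H(t,w_t)}\sum_{w}\pi(s,w_s)\,\alpha_H(s,w_s)\frac{y_H(s,w_s)}{c_{HH}(s,w_s)}
\end{aligned}
\]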

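This is the exact same formula we had before! Let us work out the last summation, which is just an expectation:
\[
\begin{aligned}
\sum_{w}\pi(s,w_s)\,\alpha_H(s,w_s)\frac{y_H(s,w_s)}{c_{HH}(s,w_s)}
&= \sum_{w}\pi(s,w_s)\,\alpha_H(s,w_s)\frac{y_H(s,w_s)}{\frac{\alpha_H(s,w_s)}{\alpha_H(s,w_s)+\Psi\beta_F}\,y_H(s,w_s)} \\
&= \sum_{w}\pi(s,w_s)\,\alpha_H(s,w_s)\frac{\alpha_H(s,w_s)+\Psi\beta_F}{\alpha_H(s,w_s)} \\
&= \sum_{w}\pi(s,w_s)\left[\alpha_H(s,w_s)+\Psi\beta_F\right] \\
&= E_t\left[\alpha_H(s,w_s)+\Psi\beta_F\right]
\end{aligned}
\]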
Notice the tremendous simplicity! The stock price today is
\[
S_{H,t} = \sum_{s=t}^{\infty}\beta^{s-t}p_H(t,w_t)\frac{c_{HH}(t,w_t)}{\alpha_H(t,w_t)}\,E_t\left[\alpha_H(s,w_s)+\Psi\beta_F\right]
\]
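If we assume that $\alpha_H(s,w_s)$ is a martingale, then $E_t[\alpha_H(s,w_s)] = \alpha_H(t,w_t)$, and because markets are
complete, $\Psi$ is constant as well. Since the allocation gives
$c_{HH}(t,w_t)/\alpha_H(t,w_t) = y_H(t,w_t)/(\alpha_H(t,w_t)+\Psi\beta_F)$, every term of the sum reduces to
$\beta^{s-t}p_H(t,w_t)\,y_H(t,w_t)$. Therefore, the stock market prices are
\[
\begin{aligned}
S_{H,t} &= \frac{1}{1-\beta}\,p_H(t,w_t)\,y_H(t,w_t) \\
S_{F,t} &= \frac{1}{1-\beta}\,p_F(t,w_t)\,y_F(t,w_t)
\end{aligned}
\]
which is identical to what we had before. However, stock markets are not perfectly correlated now.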

\[
\frac{S_{H,t}}{S_{F,t}} = \frac{p_H(t,w_t)\,y_H(t,w_t)}{p_F(t,w_t)\,y_F(t,w_t)} = T(t,w_t)\frac{y_H(t,w_t)}{y_F(t,w_t)} = \frac{\alpha_H(t,w_t)+\Psi\beta_F}{\beta_H(t,w_t)+\Psi\alpha_F}
\]
This means that movements in the demand expenditure shares will produce shifts in the relative stock prices. This
is a model in which the lack of perfect correlation of stock markets is the outcome of demand shocks.
What happens at time zero? Just before the world starts, let us assume that home owns all the stock at
home, that foreigners own all their stocks, and that the bond is in zero net supply. When we open the
economy for trade the first thing that occurs is an exchange of stocks across the two countries that has to
keep wealth constant. Let us assume that the first moment in which we trade is at time t = 0. We know
that $b_H(0) = \frac{\alpha_H(0)}{\alpha_H(0)+\Psi\beta_F}$ and $b_F(0) = \frac{\beta_H(0)}{\beta_H(0)+\Psi\alpha_F}$, which means that home sold
$\frac{\Psi\beta_F}{\alpha_H(0)+\Psi\beta_F}$ shares of the home stock to foreigners, in exchange for $\frac{\beta_H(0)}{\beta_H(0)+\Psi\alpha_F}$ shares of the
foreign stock. The prices of the stocks are $S_{H,0} = \frac{1}{1-\beta}p_H(0)y_H(0)$ and $S_{F,0} = \frac{1}{1-\beta}p_F(0)y_F(0)$. We also
know that $T(0) = \frac{p_H(0)}{p_F(0)} = \frac{\alpha_H(0)+\Psi\beta_F}{\beta_H(0)+\Psi\alpha_F}\cdot\frac{y_F(0)}{y_H(0)}$.
The value of the stocks sold is
\[
\frac{\Psi\beta_F}{\alpha_H(0)+\Psi\beta_F}\cdot\frac{1}{1-\beta}\,p_H(0)\,y_H(0)
\]
while the value of the stock purchased is
\[
\frac{\beta_H(0)}{\beta_H(0)+\Psi\alpha_F}\cdot\frac{1}{1-\beta}\,p_F(0)\,y_F(0)
\]
equating these two we obtain
\[
\begin{aligned}
\frac{p_H(0)\,y_H(0)}{p_F(0)\,y_F(0)} &= \frac{\beta_H(0)}{\beta_H(0)+\Psi\alpha_F}\cdot\frac{\alpha_H(0)+\Psi\beta_F}{\Psi\beta_F} \\
\frac{\alpha_H(0)+\Psi\beta_F}{\beta_H(0)+\Psi\alpha_F} &= \frac{\alpha_H(0)+\Psi\beta_F}{\beta_H(0)+\Psi\alpha_F}\cdot\frac{\beta_H(0)}{\Psi\beta_F} \\
1 &= \frac{\beta_H(0)}{\Psi\beta_F}
\end{aligned}
\]

which solves for $\Psi$

\[
\Psi = \frac{\beta_H(0)}{\beta_F}
\]

With the solution for $\Psi$ and the previous equations we characterize the whole equilibrium.
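
As a quick numerical check of this result, the sketch below computes the gap between the value of the home stock sold and the value of the foreign stock purchased, both measured in units of the foreign good, and confirms that it closes at $\Psi = \beta_H(0)/\beta_F$. The function name, the discount factor and every parameter value are hypothetical.

```python
def stock_trade_gap(psi, alpha_h0, beta_h0, alpha_f, beta_f, y_h0, y_f0, discount=0.95):
    """Value of home stock sold minus value of foreign stock bought.

    Both values are expressed in units of the foreign good, so the home
    stock is converted with the terms of trade T(0) = p_H(0) / p_F(0).
    """
    terms_of_trade = (alpha_h0 + psi * beta_f) / (beta_h0 + psi * alpha_f) * y_f0 / y_h0
    share_sold = psi * beta_f / (alpha_h0 + psi * beta_f)
    share_bought = beta_h0 / (beta_h0 + psi * alpha_f)
    value_sold = share_sold * terms_of_trade * y_h0 / (1.0 - discount)
    value_bought = share_bought * y_f0 / (1.0 - discount)
    return value_sold - value_bought


# Hypothetical time-zero expenditure shares and outputs.
alpha_h0, beta_h0, alpha_f, beta_f = 0.7, 0.3, 0.6, 0.4
psi_star = beta_h0 / beta_f
print("gap at psi*:", stock_trade_gap(psi_star, alpha_h0, beta_h0, alpha_f, beta_f, 1.2, 0.8))
print("gap at psi = 1:", stock_trade_gap(1.0, alpha_h0, beta_h0, alpha_f, beta_f, 1.2, 0.8))
```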

1.4.4 Interest Rate

Not only can we compute the price of the stocks, but we can also compute the risk free interest
rate. Here, we can define different bonds. The bond can pay one unit of the home good in all states of the
world. This would be a domestic risk free REAL bond. It can also pay one unit of the foreign good, or
one unit of the numeraire basket. Each of these bonds is risk free in terms of the unit in which it pays,
and any of them will complete the financial market. The reason is that it is easy to
transform one bond into another by trading on the stock markets. We are going to concentrate
on one of these bonds and compute the interest rate for that one; the methodology is identical for any
other bond price.
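So, the price of the real bond paying one unit of the home good in all states of the world is
\[
\begin{aligned}
B_{H,t} &= \sum_{s=t}^{\infty}\sum_{w}\left[q_H(s,w_s)\,p_H(s,w_s)\right] \\
&= \sum_{s=t}^{\infty}\sum_{w}\left[\beta^{s-t}\pi(s,w_s)\frac{c_{HH}(t,w_t)}{c_{HH}(s,w_s)}\cdot\frac{\alpha_H(s,w_s)}{\alpha_H(t,w_t)}\,p_H(s,w_s)\right] \\
&= \sum_{s=t}^{\infty}\beta^{s-t}\frac{c_{HH}(t,w_t)}{\alpha_H(t,w_t)}\sum_{w}\left[\pi(s,w_s)\frac{\alpha_H(s,w_s)}{c_{HH}(s,w_s)}\,p_H(s,w_s)\right] \\
&= \sum_{s=t}^{\infty}\beta^{s-t}\frac{c_{HH}(t,w_t)}{\alpha_H(t,w_t)}\sum_{w}\left[\pi(s,w_s)\left(\alpha_H(s,w_s)+\Psi\beta_F\right)\frac{p_H(s,w_s)}{y_H(s,w_s)}\right]
\end{aligned}
\]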

Remember that the prices are determined by a numeraire. So, let us assume that we decide to keep a world
basket with constant weights at a constant price

\[
a\cdot p_H(s,w_s) + (1-a)\cdot p_F(s,w_s) = 1
\]

which, using $T(s,w_s) = p_H(s,w_s)/p_F(s,w_s)$, solves for the price as a function of the terms of trade

\[
p_H(s,w_s) = \frac{T(s,w_s)}{a\,T(s,w_s) + (1-a)}
\]

The equation for the interest rate does not have a closed form solution, but we know the stochastic
process of the terms of trade and output in closed form; hence, we can derive the evolution of the price of
the bond, and therefore, the interest rate. It is exactly for this kind of problem that continuous time makes
life easier. That is exactly the topic that we follow below.
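
The numeraire step can be sketched in a couple of lines. The helper below is hypothetical (its name and the example numbers are not from the text); it recovers both goods prices from the terms of trade and the basket weight $a$, using $T = p_H/p_F$.

```python
def prices_from_terms_of_trade(terms_of_trade, a):
    """Goods prices implied by the basket a*p_H + (1-a)*p_F = 1 and T = p_H/p_F."""
    p_h = terms_of_trade / (a * terms_of_trade + (1.0 - a))
    p_f = p_h / terms_of_trade
    return p_h, p_f


# Hypothetical values for the terms of trade and the basket weight.
p_h, p_f = prices_from_terms_of_trade(0.8, 0.4)
print("p_H, p_F:", p_h, p_f)
```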

2 Introduction to Brownian Motion and Stochastic Calculus: Some Applications

2.1 Basic Continuous Time

Continuous time is an extremely useful tool. Why? Well, it is like approximating functions to solve difficult
problems, except that the approximation is exact. In other words, every time you log linearize a macro model
around a deterministic steady state, for example, you are making an approximation that is rarely useful
in practice. Shocks are rarely small enough for the approximation to make sense, and the deterministic
steady state is unlikely to reflect the economy's steady state, and therefore it is an uninteresting point to
start with. When you use continuous time the approximation you make is an exact solution, and therefore
it can be used to analyze real situations. Furthermore, because it has the simple form of an approximation,
you can solve models that otherwise are intractable.
We start by introducing the random walk representation of Brownian motion and derive its most
important properties. Then we provide the Wiener representation and derive everything there as well. Finally,
we use Brownian motion to study the behavior of the exchange rate in a target zone exchange rate regime.

2.1.1 Brownian Motion: Random Walk representation.

Brownian motion is the only continuous process with independent Gaussian increments. Intuitive, no? Not
exactly... Because this definition is not extremely useful as stated, we use representations to develop our
intuition. The random walk is by far the most used one.
Assume we define the following stochastic process:
\[
x_{t+\Delta t} - x_t =
\begin{cases}
\Delta h & \text{w/p } p \\
-\Delta h & \text{w/p } 1-p
\end{cases}
\]

and assume that the process satisfies the following properties:



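\[
\begin{aligned}
E\left(x_{t+\Delta t} - x_t\right) &= \mu\Delta t \\
Var\left(x_{t+\Delta t} - x_t\right) &= \sigma^2\Delta t
\end{aligned}
\]

Given these moments we can compute $\Delta h$ and $p$.

\[
\begin{aligned}
\Delta h\,(2p-1) &= \mu\Delta t \\
\Delta h^2 - \mu^2\Delta t^2 &= \sigma^2\Delta t
\end{aligned}
\]

If $\Delta t$ is small such that $\Delta t^2 \ll \Delta t$, then this system of equations can be easily solved:
\[
x_{t+\Delta t} - x_t =
\begin{cases}
\sigma\sqrt{\Delta t} & \text{w/p } \frac{1}{2}\left(1 + \frac{\mu}{\sigma}\sqrt{\Delta t}\right) \\
-\sigma\sqrt{\Delta t} & \text{w/p } \frac{1}{2}\left(1 - \frac{\mu}{\sigma}\sqrt{\Delta t}\right)
\end{cases}
\]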

When we take the limit $\Delta t \to 0$, this is the random walk representation of a Brownian motion with drift
$\mu$ and variance $\sigma^2$. For the record, most books define these quantities for the random walk the way we did:
the jump up and down is called $\Delta h$, the probability of going up is $p$, and the probability of going down is $q = 1-p$.
We will take the limit when the steps go to zero and define this process as Brownian Motion. Indeed,
Brownian Motion has some very special properties.
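
The step size and probability just derived can be checked against the two target moments. The sketch below uses hypothetical parameter values; it shows that the mean is matched exactly, while the variance is matched up to the $\mu^2\Delta t^2$ term that was dropped when solving the system.

```python
import math


def random_walk_step(mu, sigma, dt):
    """Step size and up-probability of the random walk representation."""
    dh = sigma * math.sqrt(dt)
    p = 0.5 * (1.0 + (mu / sigma) * math.sqrt(dt))
    return dh, p


def increment_moments(mu, sigma, dt):
    """Mean and variance of one increment of the two-point process."""
    dh, p = random_walk_step(mu, sigma, dt)
    mean = p * dh + (1.0 - p) * (-dh)
    variance = dh**2 - mean**2  # the squared increment equals dh^2 on both branches
    return mean, variance


# Hypothetical parameters.
mean, variance = increment_moments(mu=0.05, sigma=0.2, dt=1e-4)
print("mean:", mean, "variance:", variance)
```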

2.1.1.1 Properties

1. First, the path is continuous at every point; this means that there are no jumps. However, the path
is everywhere non-differentiable. This is easy to prove because the limit $\lim_{\Delta t\to 0}\left(x_{t+\Delta t}-x_t\right)$ is zero,
and it is bounded. But the derivative goes to infinity. Assume a positive jump; then
the derivative is $\lim_{\Delta t\to 0}\frac{x_{t+\Delta t}-x_t}{\Delta t} = \lim_{\Delta t\to 0}\frac{\sigma\sqrt{\Delta t}}{\Delta t} \to \infty$. The exact same occurs if we compute
the derivative when negative shocks occur.
2. Second, the process has finite quadratic variation, thus the integral $\int \Delta x_t^2$ has meaning. In fact, this is a very
important property and, although it is hard to prove for the continuous time representation, it is easy
to prove for the random walk one. Let us compute the square of $\Delta x_{t+\Delta t}$: $\Delta x_{t+\Delta t}^2 = \sigma^2\Delta t$. This is true
for either of the two possible paths. Hence, even though the process of increments is a random variable,
the process of the square of the increments is a deterministic variable. We can add all the increments
from time $t_1$ to time $t_2$, and take the limit when $\Delta t$ goes to zero. In the end, the sum is
\[
\begin{aligned}
\int_{t_1}^{t_2}\Delta x_t^2 &= \lim_{\Delta t\to 0}\sum_{t=t_1}^{t_2}\Delta x_{t+\Delta t}^2 \\
&= \lim_{\Delta t\to 0}\sigma^2\Delta t\cdot\frac{t_2-t_1}{\Delta t} \\
&= \sigma^2\left(t_2-t_1\right)
\end{aligned}
\]

which is finite and well defined if the time interval is well defined.
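3. Third, the length of all paths is infinite: in other words, $\int\left|\Delta x_t\right| \to \infty$. As in the previous case, in
every one of the paths $\left|\Delta x_t\right|$ is not a stochastic variable: $\left|\Delta x_t\right| = \sigma\sqrt{\Delta t}$. Let us compute the integral (or
summation) as before
\[
\begin{aligned}
\int_{t_1}^{t_2}\left|\Delta x_t\right| &= \lim_{\Delta t\to 0}\sum_{t=t_1}^{t_2}\left|\Delta x_t\right| \\
&= \lim_{\Delta t\to 0}\sigma\sqrt{\Delta t}\cdot\frac{t_2-t_1}{\Delta t} \\
&= \lim_{\Delta t\to 0}\sigma\,\frac{t_2-t_1}{\sqrt{\Delta t}} \to \infty
\end{aligned}
\]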

4. Fourth, the increments are independent (this follows directly from the definition of the random walk).
5. Fifth, the random variable $x_s - x_t$ is normally distributed with mean $\mu(s-t)$ and variance $\sigma^2(s-t)$. This
is easy to prove by using the central limit theorem and the fact that the increments are independent.
In reality, for a given $\Delta t$ there are $n$ steps, where $n$ is given by $\frac{s-t}{\Delta t}$. Because at each node we have a
Bernoulli process with increments $\Delta h$, the distribution of $n$ Bernoullis is a Binomial distribution.
When $\Delta t$ goes to zero the Binomial distribution converges to a normal distribution. If you want to get
familiar with this stuff, do the proofs and get used to the representation. They shouldn't be too hard.
Do them between innings of tonight's game.
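
Properties 2 and 3 can be seen directly in a simulation of the random walk. The sketch below is a minimal illustration with hypothetical parameters and seeds: whatever the realized path, the sum of squared increments equals $\sigma^2 (t_2 - t_1)$, while the sum of absolute increments equals $\sigma (t_2 - t_1)/\sqrt{\Delta t}$ and therefore grows without bound as the step shrinks.

```python
import math
import random


def simulate_random_walk(mu, sigma, t_end, n_steps, seed=0):
    """Simulate one path; return the end point, the sum of squared
    increments and the sum of absolute increments (path length)."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    dh = sigma * math.sqrt(dt)
    p = 0.5 * (1.0 + (mu / sigma) * math.sqrt(dt))
    x = 0.0
    quadratic_variation = 0.0
    path_length = 0.0
    for _ in range(n_steps):
        step = dh if rng.random() < p else -dh
        x += step
        quadratic_variation += step * step
        path_length += abs(step)
    return x, quadratic_variation, path_length


# Hypothetical parameters; the step count is quadrupled in the second run.
for n_steps, seed in ((10_000, 0), (40_000, 7)):
    _, qv, length = simulate_random_walk(0.05, 0.2, 1.0, n_steps, seed)
    print(n_steps, "steps: quadratic variation", qv, "path length", length)
```

Quadrupling the number of steps leaves the quadratic variation unchanged and doubles the path length, which is the $1/\sqrt{\Delta t}$ divergence derived above.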

2.1.1.2 Some approximations (from the Random Walk)

This is one of the most important results we have in continuous time. From the random walk representation
we can derive Itô's lemma as an approximation. What is really important in the end is that this approximation
is actually exact when we are dealing with a continuous time process. But in order to develop the intuition
it is always useful to get this approximation first.
Imagine we have a function of a variable that follows a Brownian Motion. Let us approximate the expected
value of the function by using the Taylor expansion.
\[
\begin{aligned}
E F\left(x_{t+\Delta t}\right) &= F(x_t) + F'(x_t)\,E\left(x_{t+\Delta t}-x_t\right) + \frac{1}{2}F''(x_t)\,E\left(x_{t+\Delta t}-x_t\right)^2 \\
&= F(x_t) + F'(x_t)\left(\sigma\sqrt{\Delta t}\,\frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right) - \sigma\sqrt{\Delta t}\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\right) \\
&\quad + \frac{1}{2}F''(x_t)\left(\sigma^2\Delta t\,\frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right) + \sigma^2\Delta t\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\right) \\
&= F(x_t) + F'(x_t)\left(\sigma\frac{\mu}{\sigma}\Delta t\right) + \frac{1}{2}F''(x_t)\left(\sigma^2\Delta t\right) \\
&= F(x_t) + \mu F'(x_t)\,\Delta t + \frac{1}{2}\sigma^2F''(x_t)\,\Delta t
\end{aligned}
\]

Notice that the first order term and the second order term are both proportional to ∆t.
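
To see that the second order term cannot be dropped, the sketch below compares the exact two-point expectation of $F(x) = e^x$ with the expansion above, with and without the $\frac{1}{2}\sigma^2 F''\Delta t$ term. The choice of $F$, the helper name and the parameter values are illustrative assumptions.

```python
import math


def check_exponential(dt, x=0.0, mu=0.05, sigma=0.2):
    """Exact E[F(x + dx)] for F = exp under the two-point random walk,
    next to the second order and first order approximations."""
    dh = sigma * math.sqrt(dt)
    p = 0.5 * (1.0 + (mu / sigma) * math.sqrt(dt))
    exact = p * math.exp(x + dh) + (1.0 - p) * math.exp(x - dh)
    f = math.exp(x)  # for the exponential, F = F' = F''
    second_order = f + mu * f * dt + 0.5 * sigma**2 * f * dt
    first_order = f + mu * f * dt
    return exact, second_order, first_order


exact, second_order, first_order = check_exponential(1e-4)
print("error with the second order term:   ", exact - second_order)
print("error without the second order term:", exact - first_order)
```

The first error is of order $\Delta t^2$; the second is of order $\Delta t$, the same order as the drift term itself.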

2.1.2 Brownian Motion: Continuous Time Representation.

There is a simpler representation (at least notationally) of Brownian motion. First, let us define the Wiener
process (dw) as the Brownian motion with zero drift and unit variance (variance of dt). Because increments
are normally distributed (Gaussian), we can represent any Brownian motion as a linear combination
of a Wiener process and a predictable drift. In other words,

\[
dx_t = \mu\,dt + \sigma\,dw_t \qquad (2.1)
\]

This is the representation that we will use.


Notice the following properties of the Wiener process: $E\,dw_t = 0$ and $E\,dw_t^2 = dt$ (see footnote 1). This means that
the expected value of $dx_t$ is $E\,dx_t = \mu\,dt + \sigma E\,dw_t = \mu\,dt$, while the variance is given by
$V\,dx_t = E\left(dx_t - \mu\,dt\right)^2 = \sigma^2E\,dw_t^2 = \sigma^2dt$.
An alternative way of defining it is by starting the definition with the Wiener process:

Definition 1 Brownian Motion: Let $\{w_t\}_0^{\infty}$ with $w_t \sim N(0,1)$. Define $\Delta x_t = \mu\Delta t + \sigma\sqrt{\Delta t}\,w_t$ where $t \in [0,T]$ and
where $n\cdot\Delta t = T$ and $n \to \infty$.

This is the more formal definition; however, at least to me, it is a little bit more obscure.
We are not going to prove normality, which is proved using the central limit theorem, nor the properties
we have already proved in the random walk representation. The proofs are a little bit harder but doing them
does not provide additional intuition. We will, however, derive Itô's lemma.

2.1.2.1 Itô’s lemma

Itô's lemma is the result of a Taylor expansion and the properties of Brownian Motion. Assume we have
a function of time and $x_t$, $F(x_t, t)$. The change in the function is given by

\[
dF = F_t\,dt + F_x\,dx + \frac{1}{2}\left(F_{tt}\,dt^2 + F_{xx}\,dx^2 + 2F_{xt}\,dx\,dt\right)
\]

substituting $dx = \mu\,dt + \sigma\,dw_t$ with $dw_t^2 = dt$, and taking the limit when dt goes to zero so that all terms of
order higher than dt vanish,

\[
dF = F_t\,dt + F_x\,dx + \frac{1}{2}F_{xx}\,\sigma^2dt \qquad (2.2)
\]

Note that we can separate the deterministic from the stochastic part,
\[
dF = \left[F_t + F_x\,\mu + \frac{1}{2}F_{xx}\,\sigma^2\right]dt + F_x\,\sigma\,dw_t
\]

In other words, this is like a normal derivative but where we have to add an additional term that comes
from the quadratic variation of the Brownian motion. As before, the first and second order terms are of the same
order of magnitude.
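
A path-by-path illustration of the lemma: along one simulated random-walk path, the sketch below accumulates the Itô increments of $F(x) = e^x$ and compares them with the true change in $F$, and does the same for the ordinary chain rule that omits the $\frac{1}{2}F_{xx}\sigma^2 dt$ term. The function, seed and parameters are hypothetical, and the random walk stands in for the Brownian motion.

```python
import math
import random


def ito_check(mu=0.05, sigma=0.2, t_end=1.0, n_steps=10_000, seed=1):
    """Return (true change - Ito sum, true change - naive chain-rule sum)
    for F(x) = exp(x) along one random-walk path starting at x = 0."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    dh = sigma * math.sqrt(dt)
    p = 0.5 * (1.0 + (mu / sigma) * math.sqrt(dt))
    x = 0.0
    ito_sum = 0.0    # F_x dx + 0.5 * F_xx * sigma^2 * dt
    naive_sum = 0.0  # F_x dx only
    for _ in range(n_steps):
        dx = dh if rng.random() < p else -dh
        f = math.exp(x)  # F = F_x = F_xx for the exponential
        ito_sum += f * dx + 0.5 * f * sigma**2 * dt
        naive_sum += f * dx
        x += dx
    true_change = math.exp(x) - 1.0
    return true_change - ito_sum, true_change - naive_sum


ito_error, naive_error = ito_check()
print("error of Ito's lemma:       ", ito_error)
print("error of the ordinary chain rule:", naive_error)
```

The chain-rule error does not shrink with the step size: it converges to $\frac{1}{2}\sigma^2\int F\,dt$, which is exactly the extra term in (2.2).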

2.1.2.2 Bellman Equation

In almost all the problems we encounter we will end up deriving a Bellman Equation to describe the problem.
I would like to derive the equation for the general case.

1 Well, this property is far stronger than this. Not only $E\,dw^2 = dt$, but $dw^2 = dt$. As we did before, using the random walk
representation you will be able to prove it.



Assume that an agent has a state $x_t$ that evolves following a Brownian Motion with drift and
variance parameters, $B(\mu, \sigma)$. Assume that agents derive an instantaneous utility $u(x_t)$ every dt, and that the discount
rate is given by $\beta$. Assume that the agent does not make a choice; there is no maximization at all. For the
moment we would like to know the present value of the utility derived by the agent.
Some problems do look like this, where the value function is just evolving with the instantaneous utility.
For example, all the problems of irreversible investment or sticky prices.
The discrete representation of the Bellman Equation is the following

\[
V(x_t) = u(x_t)\,dt + (1-\beta\,dt)\,E\left[V\left(x_{t+dt}\right)\right]
\]

which is the same as the typical representation but where we are clear and specific about the flows per unit
of time. For example, we are assuming that between t and t + dt the random variable takes the value xt .
Therefore the utility is u (xt ) dt. Furthermore, the discount rate is β for an instantaneous period of time,
which means that the discount rate in the interval is βdt.

2.1.2.2.1 Stationary problem:

For simplicity, we assume stationarity of the problem so the value function does not depend on time. We
use this specification first, and then relax it at the end of this subsection. The first step is to compute the
expectation on the right hand side. For this we use Itô's lemma. We can approximate the function $V(\cdot)$ as
follows
\[
V\left(x_{t+dt}\right) = V(x_t) + V'(x_t)\,dx_t + \frac{1}{2}V''(x_t)\left[dx_t\right]^2
\]
given the definition of the process we know that
\[
V\left(x_{t+dt}\right) = V(x_t) + V'(x_t)\left[\mu\,dt + \sigma\,dw_t\right] + \frac{1}{2}V''(x_t)\,\sigma^2dt .
\]
Taking expectations
\[
E\,V\left(x_{t+dt}\right) = V(x_t) + V'(x_t)\,\mu\,dt + \frac{1}{2}V''(x_t)\,\sigma^2dt .
\]

Substituting in the Bellman Equation we obtain
\[
\begin{aligned}
V(x_t) &= u(x_t)\,dt + (1-\beta\,dt)\left(V(x_t) + V'(x_t)\,\mu\,dt + \frac{1}{2}V''(x_t)\,\sigma^2dt\right) \\
&= u(x_t)\,dt + V(x_t)\,(1-\beta\,dt) + V'(x_t)\left(\mu\,dt - \mu\beta\,dt^2\right) + \frac{1}{2}V''(x_t)\left(\sigma^2dt - \sigma^2\beta\,dt^2\right).
\end{aligned}
\]
We know that $dt^2 \ll dt$, then
\[
\begin{aligned}
V(x_t)\,\beta\,dt &= u(x_t)\,dt + V'(x_t)\,\mu\,dt + \frac{1}{2}V''(x_t)\,\sigma^2dt \\
V(x_t)\,\beta\,dt &= \left[u(x_t) + V'(x_t)\,\mu + \frac{1}{2}V''(x_t)\,\sigma^2\right]dt
\end{aligned}
\]

Notice that the left and right hand sides are proportional to dt. Therefore,
\[
\beta V(x_t) = u(x_t) + \mu V'(x_t) + \frac{1}{2}\sigma^2V''(x_t) \qquad (2.3)
\]
Which is an ordinary differential equation. Given the utility function we can solve for the value function.
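
As a worked example of solving (2.3), take the hypothetical utility $u(x) = -x^2$ and guess a quadratic value function $V(x) = ax^2 + bx + c$. Matching coefficients gives $a = -1/\beta$, $b = -2\mu/\beta^2$ and $c = -\sigma^2/\beta^2 - 2\mu^2/\beta^3$. This is only the particular solution; the exponential solutions of the homogeneous equation are set to zero here by assumption. The sketch below checks the guess by evaluating the residual of (2.3) with finite differences.

```python
def value_function_quadratic(x, mu, sigma, beta):
    """Particular solution of (2.3) for the assumed utility u(x) = -x**2."""
    a = -1.0 / beta
    b = -2.0 * mu / beta**2
    c = -sigma**2 / beta**2 - 2.0 * mu**2 / beta**3
    return a * x**2 + b * x + c


def bellman_residual(x, mu, sigma, beta, h=1e-3):
    """beta*V - (u + mu*V' + 0.5*sigma^2*V''), with central differences."""
    def v(z):
        return value_function_quadratic(z, mu, sigma, beta)

    v1 = (v(x + h) - v(x - h)) / (2.0 * h)
    v2 = (v(x + h) - 2.0 * v(x) + v(x - h)) / h**2
    return beta * v(x) - (-(x**2) + mu * v1 + 0.5 * sigma**2 * v2)


# Hypothetical parameters.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, bellman_residual(x, mu=0.02, sigma=0.2, beta=0.05))
```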

2.1.2.2.2 Non-Stationary Problem:

A lot of problems that we encounter are not stationary, for example if time is finite. In this case, the value
function changes through time and it is impossible to argue that the same function is valid at every instant.
Another example is a European option that has a strike price at some given time. In this case, we have
to formally take into account the fact that the value function depends on time as well as on the state
variable x.
The Bellman equation is identical except that the approximation of the function is different. The
expectation on the right hand side is given by
\[
\begin{aligned}
V\left(x_{t+dt}, t+dt\right) &= V(x_t,t) + V_x(x_t,t)\,dx_t + V_t(x_t,t)\,dt + \frac{1}{2}V_{xx}(x_t,t)\left[dx_t\right]^2 \\
E\,V\left(x_{t+dt}, t+dt\right) &= V(x_t,t) + \mu V_x(x_t,t)\,dt + V_t(x_t,t)\,dt + \frac{1}{2}\sigma^2V_{xx}(x_t,t)\,dt
\end{aligned}
\]
so, the Bellman Equation is
\[
\beta V(x_t,t) = u(x_t) + \mu V_x(x_t,t) + \frac{1}{2}\sigma^2V_{xx}(x_t,t) + V_t(x_t,t) \qquad (2.4)
\]
which is a partial differential equation, and in fact a parabolic one. For some simple
processes and simple utility functions, these equations have simple solutions. However, in most cases
these equations have to be solved numerically. There is a technique called MultiGrid that is fantastic for
numerically solving partial differential equations.

2.1.2.2.3 What makes Brownian motion so special?

Brownian Motion is a very special process because when we take the limit when $\Delta t$ goes to zero, the jumps
and the probabilities converge at a very particular speed; and it is that speed that makes the mean and the
variance of the process proportional to the time elapsed.
For example, imagine we were doing this with a standard AR process. Something like $x_{t+\Delta t} - x_t =
(\mu + \sigma\varepsilon_t)\cdot\Delta t$, where $\varepsilon_t$ is a standard normal random variable. In this case, the first order term would have been
proportional to $\Delta t$, while the second order term would have been proportional to $\Delta t^2$. In other words, the
expected increment is $\mu\Delta t$, while the variance is $\sigma^2\Delta t^2$. This means that when the limit is taken, the
second order term disappears at a faster speed, making it irrelevant. The approximation of the function will
only depend on the first order term.
In the Brownian Motion case, this does not happen. The first and second order terms are of the same
order of magnitude. Interestingly, when we solve the discrete time problems, the second order terms are as
relevant as the first order ones for some of the choices. Hence, the limit of the Brownian Motion, where the
second order term survives, is a much better representation of the economic problem at hand than the one
derived from the AR example discussed above. There is an additional advantage of BM! Because this is a
limit, this approximation is exact, and not a linearization.
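
The difference in speeds can be put in numbers. The sketch below (hypothetical parameters) computes the ratio of the second order term $E[\Delta x^2]$ to the first order term $E[\Delta x]$ for the random walk and for the AR-type process: in the first case the ratio stays at $\sigma^2/\mu$, in the second it shrinks linearly with $\Delta t$.

```python
def second_to_first_order_ratio(dt, mu=0.05, sigma=0.2):
    """Ratio E[dx^2] / E[dx] for the Brownian random walk and for the
    AR-type process dx = (mu + sigma * eps) * dt with eps ~ N(0, 1)."""
    bm_ratio = (sigma**2 * dt) / (mu * dt)
    ar_second_moment = (mu**2 + sigma**2) * dt**2  # E[(mu + sigma*eps)^2] * dt^2
    ar_ratio = ar_second_moment / (mu * dt)
    return bm_ratio, ar_ratio


for dt in (1e-2, 1e-3, 1e-4):
    print(dt, second_to_first_order_ratio(dt))
```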
There are other important equations that can be derived from the random walk representation: for
instance, continuous time boundary conditions and constraints, and the Kolmogorov forward and backward
equations for ergodic distributions. This is what we do next.

2.1.3 Constraints and Barriers

Every problem we will encounter has a boundary restriction or a constraint. Specifying these constraints
requires some understanding of how Brownian motion actually works. In this section we see some of the most
common constraints and derive a “methodology” to deal with them.

We study four types of restrictions: absorbing, reflecting, resetting, and shifting. In each example I will try
to provide some intuition of what they are supposed to represent. We are going to derive all these formulas
for the stationary case. However, the extension to the non-stationary one should be trivial.
Because we are going to use it frequently, let us remember the Bellman Equation:
\[
\beta V(x_t) = u(x_t) + \mu V'(x_t) + \frac{1}{2}\sigma^2V''(x_t)
\]

2.1.3.1 Absorbing Barriers

An absorbing barrier is the instance when the value function takes a particular value when a certain state
is reached. For example, assume that we know that when the state reaches the value $\bar{x}$ the instantaneous
payoff is $\bar{u}$ permanently, and that the state will remain at $\bar{x}$ with probability one.
So, this is like a final condition. When the economy reaches a certain state it is trapped (absorbed). What
is the stochastic process of x?
When x is not at the absorbing state
\[
x_{t+\Delta t} = x_t +
\begin{cases}
\Delta h & \text{w/p } p \\
-\Delta h & \text{w/p } 1-p
\end{cases}
\qquad\text{where } p = \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)
\]

when the economy reaches the absorbing state ($x_t = \bar{x}$)

\[
x_{t+\Delta t} = x_t \quad\text{with probability 1}
\]

What is the Bellman equation when we are at $\bar{x}$? This is equivalent to asking what the Bellman equation is
when $\mu = \sigma = 0$. This is the case because at $\bar{x}$ the process has drift and variance exactly equal to zero.

\[
\beta V(\bar{x}) = u(\bar{x})
\]

So, with $u(\bar{x}) = \bar{u}$, the boundary condition for the differential equation is

\[
V(\bar{x}) = \frac{\bar{u}}{\beta}
\]

This case is a very simple case in which we know what is the value of the function at a particular state.
In finance those restrictions are common. For example, in the case of an option we know exactly what is the
value of the payoffs at the terminal time. And therefore, we can describe V (x, T ) perfectly. We use those
restrictions to solve for the differential equation.
In macro and international, however, absorbing barriers are not that common. First, it can be a case in
which priors reach zero or one, which in continuous time we know will never happen. Second, it
occurs when some types disappear or dominate the whole world, so it is a case of entry and exit. Third,
it appears when one country is infinitely larger than the other; etc. Some of these are rare circumstances
but good constraints to impose if we know something about the solution of the problem in these extreme
conditions. In fact, except for the entry/exit payoff, the other two are always extreme circumstances. In
macro, the most common restrictions are reflecting and resetting barriers. These are the next two examples.

2.1.3.2 Reflecting Barriers

Assume that we know that when a certain state is reached the stochastic process cannot continue. For
example, assume that there is an upper bound on the stochastic process and that when
that state is reached we will need to either remain there or move down. Similarly, we can have
a bottom state below which we know the economy cannot fall further.
This is the example of a credible Target Zone. Imagine the central bank commits that if the exchange rate
reaches some level they will start selling to force the price down, but if the price is below some level they
will buy. If we believe that the central bank will do so, and it is capable of doing so, then when the
economy reaches that state of the world the stochastic process can only move in one direction.
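Assume that an upper reflecting barrier is at $\bar{x}$. Then, when we are close to the barrier the evolution of
the process is as follows. In equations, the standard process is
\[
x_{t+\Delta t} = x_t +
\begin{cases}
\Delta h & \text{w/p } p \\
-\Delta h & \text{w/p } 1-p
\end{cases}
\qquad\text{where } p = \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right),
\]
while at the reflecting barrier ($x_t = \bar{x}$) the process is
\[
x_{t+\Delta t} = x_t +
\begin{cases}
0 & \text{w/p } p \\
-\Delta h & \text{w/p } 1-p
\end{cases}
\qquad\text{where } p = \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right).
\]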

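Let us derive the evolution of the value function when it is at the barrier. In discrete time we have

\[
V(\bar{x}) = u(\bar{x})\,\Delta t + (1-\beta\Delta t)\,E\left[V\left(x_{t+\Delta t}\right)\right]
\]

where the approximation of the function is from Itô's lemma:

\[
\begin{aligned}
E\,V\left(x_{t+\Delta t}\right) &= V(\bar{x}) + p\,V'(\bar{x})\cdot(\bar{x}-\bar{x}) + (1-p)\,V'(\bar{x})\cdot(\bar{x}-\Delta h-\bar{x}) \\
&\quad + \frac{1}{2}p\,V''(\bar{x})\cdot(\bar{x}-\bar{x})^2 + \frac{1}{2}(1-p)\,V''(\bar{x})\cdot(\bar{x}-\Delta h-\bar{x})^2 .
\end{aligned}
\]
Notice that because we are on the reflecting barrier the "positive" jumps truly remain in the same place.
We can substitute for $\Delta h$ and $p$.
\[
\begin{aligned}
E\,V\left(x_{t+\Delta t}\right) &= V(\bar{x}) + V'(\bar{x})\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\left(-\sigma\sqrt{\Delta t}\right) \\
&\quad + \frac{1}{2}V''(\bar{x})\cdot\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\left(-\sigma\sqrt{\Delta t}\right)^2
\end{aligned}
\]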
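Substituting in the Bellman equation
\[
\begin{aligned}
V(\bar{x}) &= u(\bar{x})\,\Delta t + (1-\beta\Delta t)\,V(\bar{x}) + (1-\beta\Delta t)\,V'(\bar{x})\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\left(-\sigma\sqrt{\Delta t}\right) \\
&\quad + (1-\beta\Delta t)\,\frac{1}{2}V''(\bar{x})\cdot\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\left(-\sigma\sqrt{\Delta t}\right)^2
\end{aligned}
\]
Eliminating $V(\bar{x})$ from both sides and rearranging terms we have
\[
\begin{aligned}
\beta\Delta t\,V(\bar{x}) &= u(\bar{x})\,\Delta t + V'(\bar{x})\,(1-\beta\Delta t)\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\left(-\sigma\sqrt{\Delta t}\right) \\
&\quad + V''(\bar{x})\,(1-\beta\Delta t)\,\frac{1}{4}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\sigma^2\Delta t
\end{aligned}
\]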

Notice that the second term on the right is proportional to $\sqrt{\Delta t}$, while the third one is proportional to
$\Delta t$. By eliminating all the terms of higher order in $\Delta t$ we simplify to
\[
\beta V(\bar{x})\,\Delta t = u(\bar{x})\,\Delta t - \frac{1}{2}V'(\bar{x})\,\sigma\sqrt{\Delta t} + \frac{1}{4}V''(\bar{x})\,\sigma^2\Delta t,
\]
and because $\sqrt{\Delta t} \gg \Delta t$, we have that this constraint simplifies to

\[
V'(\bar{x}) = 0 \qquad (2.5)
\]

So, at any reflecting barrier, the value function approaches it with a zero slope. By the way, this is the
case given the form of the reflection. There are other mechanisms for how the reflection takes place, but
this is the simplest one and produces an incredibly simple constraint.
This is exactly the constraint we will use in the Target Zone example.
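
To see the zero-slope condition at work, consider a hypothetical example with linear utility $u(x) = x$ and a single upper reflecting barrier at $\bar{x}$. The stationary Bellman equation then has the solution $V(x) = x/\beta + \mu/\beta^2 + Ae^{rx}$, where $r$ is the positive root of $\frac{1}{2}\sigma^2r^2 + \mu r - \beta = 0$ (an assumption made here so that the barrier stops mattering as $x \to -\infty$), and $V'(\bar{x}) = 0$ pins down $A = -e^{-r\bar{x}}/(\beta r)$. The sketch below builds this solution; names and parameter values are illustrative.

```python
import math


def reflecting_value_function(mu, sigma, beta, x_bar):
    """Value function for u(x) = x with an upper reflecting barrier at x_bar.

    Returns V and V' as functions of x (valid for x <= x_bar).
    """
    # Positive root of 0.5*sigma^2*r^2 + mu*r - beta = 0.
    r = (-mu + math.sqrt(mu**2 + 2.0 * sigma**2 * beta)) / sigma**2
    # Constant chosen so that the slope is zero at the barrier.
    a = -math.exp(-r * x_bar) / (beta * r)

    def v(x):
        return x / beta + mu / beta**2 + a * math.exp(r * x)

    def v_prime(x):
        return 1.0 / beta + a * r * math.exp(r * x)

    return v, v_prime


# Hypothetical parameters.
v, v_prime = reflecting_value_function(mu=0.02, sigma=0.2, beta=0.05, x_bar=1.0)
print("slope at the barrier:", v_prime(1.0))
print("slope far below the barrier:", v_prime(-5.0), "vs 1/beta =", 1.0 / 0.05)
```

Far from the barrier the slope is close to $1/\beta$, the slope of the unconstrained solution; it falls smoothly to zero as the state approaches $\bar{x}$.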

2.1.3.3 Resetting Barriers

Resetting barriers are also extremely important and used all over the place. In general there are two types
of resetting barriers: the ones that have fixed costs, and the ones that have proportional costs.
For example, cases of fixed costs include irreversible investment, sticky prices, and inventory problems. The
idea is that when the economy reaches a certain state ($\bar{x}$) the agent receives a flow F (positive or negative)
and the state jumps to ($\hat{x}$). In the irreversible investment or inventory model, when the capital is too low
(or inventory is too low) we pay a fixed cost F and we reinvest (change the capital stock) or purchase goods
(change the inventory).
Let us see what happens to the value function. Remember that when the economy is at $\bar{x}$ it receives the
fixed flow and the instantaneous utility, and then jumps to $\hat{x}$ independently of the realization of
the state variable.

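\[
V(\bar{x}) = u(\bar{x})\,\Delta t + (1-\beta\Delta t)\left[V(\hat{x}) + F\right]
\]

and taking the limit when $\Delta t$ goes to zero,

\[
V(\bar{x}) = V(\hat{x}) + F \qquad (2.6)
\]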

In some “circles”, this constraint is called the value matching constraint. When you stare at it, it does
make a lot of sense. When the economy reaches $\bar{x}$ it automatically jumps to $\hat{x}$ and gets a flow F. Imagine the
flow F were zero; then the two value functions should have the exact same value, and they do according
to this constraint. In the case of menu costs, or irreversible investment, or inventory replenishment, the flow is
always negative.
There is another type of cost that is proportional to the movement. In other words, the cost is
proportional to $(\hat{x} - \bar{x})$. In this case, the value function is

\[
V(\bar{x}) = u(\bar{x})\,\Delta t + (1-\beta\Delta t)\left[V(\hat{x}) + f\cdot(\hat{x}-\bar{x})\right]
\]

which implies
\[
\frac{V(\bar{x}) - V(\hat{x})}{\bar{x}-\hat{x}} = -f.
\]
where f is the proportional cost.
Of course there are combinations of these two types of constraints. But the constraints in the end are not
much more complicated than what we have derived.

2.1.3.4 Shifting Barriers

Finally, an extremely interesting set of constraints is what happens when in a state of the world the profit
flow changes. In other words, assume that in some region, for all x ≤ x̄ there is an instantaneous utility
function (or profit function) given by ul , while for the rest of the space the utility is uh .

This is a very common constraint when we are dealing with problems such as different regimes of
competition, or regions of credit constraints, etc. In general, we can think of different competitive arrangements
where on one side of the state space the agents are playing one game, and a different game on the other.
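Clearly, there are two value functions and Bellman equations, one for each regime
\[
\begin{aligned}
\beta V_l(x_t) &= u_l(x_t) + \mu V_l'(x_t) + \frac{1}{2}\sigma^2V_l''(x_t) \\
\beta V_h(x_t) &= u_h(x_t) + \mu V_h'(x_t) + \frac{1}{2}\sigma^2V_h''(x_t)
\end{aligned}
\]
where the constraint implies the following Bellman equation at $\bar{x}$ (in discrete time)

\[
V_l(\bar{x}) = u_l(\bar{x})\,\Delta t + (1-\beta\Delta t)\left[p\,V_h(\bar{x}+\Delta h) + (1-p)\,V_l(\bar{x}-\Delta h)\right]
\]

where the assumption is that at the barrier, if the state jumps up we shift to the other regime, but if it drops
we remain in the low regime. The approximation of the function implies
\[
\begin{aligned}
V_h(\bar{x}+\Delta h) &= V_h(\bar{x}) + V_h'(\bar{x})\,\Delta h + \frac{1}{2}V_h''(\bar{x})\,\Delta h^2 \\
V_l(\bar{x}-\Delta h) &= V_l(\bar{x}) - V_l'(\bar{x})\,\Delta h + \frac{1}{2}V_l''(\bar{x})\,\Delta h^2
\end{aligned}
\]
substituting
\[
\begin{aligned}
V_l(\bar{x}) &= u_l(\bar{x})\,\Delta t + (1-\beta\Delta t)\,p\left[V_h(\bar{x}) + V_h'(\bar{x})\,\Delta h + \frac{1}{2}V_h''(\bar{x})\,\Delta h^2\right] \\
&\quad + (1-\beta\Delta t)\,(1-p)\left[V_l(\bar{x}) - V_l'(\bar{x})\,\Delta h + \frac{1}{2}V_l''(\bar{x})\,\Delta h^2\right]
\end{aligned}
\]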

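substituting the probabilities, and eliminating all the high order terms we end up with
\[
\begin{aligned}
V_l(\bar{x}) &= u_l(\bar{x})\,\Delta t + \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)V_h(\bar{x}) + \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)V_h'(\bar{x})\,\Delta h \\
&\quad + (1-\beta\Delta t)\,\frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)V_l(\bar{x}) - \frac{1}{2}\left(1-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)V_l'(\bar{x})\,\Delta h.
\end{aligned}
\]

Notice that the terms not multiplied by any power of $\Delta t$ are
\[
V_l(\bar{x}) = \frac{1}{2}V_h(\bar{x}) + \frac{1}{2}V_l(\bar{x})
\]
which implies
\[
V_l(\bar{x}) = V_h(\bar{x}) \qquad (2.7)
\]

Interestingly, after this condition is achieved, we also have terms that are proportional to $\sqrt{\Delta t}$ (recall that $\Delta h = \sigma\sqrt{\Delta t}$). Those are
\[
0 = \frac{1}{2}\frac{\mu}{\sigma}\sqrt{\Delta t}\,V_h(\bar{x}) + \frac{1}{2}V_h'(\bar{x})\,\Delta h + \frac{1}{2}\left(-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)V_l(\bar{x}) - \frac{1}{2}V_l'(\bar{x})\,\Delta h.
\]
After substituting the fact that the functions have to be continuous at $\bar{x}$, the first and third terms cancel
each other and we find that
\[
V_h'(\bar{x}) = V_l'(\bar{x}) \qquad (2.8)
\]

So, not only does the function have to be continuous, but it is also differentiable.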
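
Conditions (2.7) and (2.8) are enough to pin down a two-regime solution. As a hypothetical example, let the flow utility be a constant $u_l$ below $\bar{x}$ and a constant $u_h$ above it. Each Bellman equation then has the solution $V_l(x) = u_l/\beta + Ae^{r_+x}$ and $V_h(x) = u_h/\beta + Be^{r_-x}$, where $r_+ > 0 > r_-$ are the roots of $\frac{1}{2}\sigma^2r^2 + \mu r - \beta = 0$ (the other root is dropped in each region by assumption, so that both functions stay bounded). The sketch below solves the two matching conditions for $A$ and $B$.

```python
import math


def shifting_regime_solution(u_l, u_h, mu, sigma, beta, x_bar):
    """Value functions (and their slopes) for constant flow utilities u_l
    (x <= x_bar) and u_h (x >= x_bar), pasted with (2.7) and (2.8)."""
    disc = math.sqrt(mu**2 + 2.0 * sigma**2 * beta)
    r_plus = (-mu + disc) / sigma**2
    r_minus = (-mu - disc) / sigma**2
    gap = (u_h - u_l) / beta
    # a = A*exp(r_plus*x_bar), b = B*exp(r_minus*x_bar) solve
    #   a - b = gap            (continuity)
    #   a*r_plus = b*r_minus   (differentiability)
    a = gap * r_minus / (r_minus - r_plus)
    b = gap * r_plus / (r_minus - r_plus)

    def v_l(x):
        return u_l / beta + a * math.exp(r_plus * (x - x_bar))

    def v_h(x):
        return u_h / beta + b * math.exp(r_minus * (x - x_bar))

    def dv_l(x):
        return a * r_plus * math.exp(r_plus * (x - x_bar))

    def dv_h(x):
        return b * r_minus * math.exp(r_minus * (x - x_bar))

    return v_l, v_h, dv_l, dv_h


# Hypothetical parameters.
v_l, v_h, dv_l, dv_h = shifting_regime_solution(0.0, 1.0, 0.02, 0.2, 0.05, x_bar=0.0)
print("values at the barrier:", v_l(0.0), v_h(0.0))
print("slopes at the barrier:", dv_l(0.0), dv_h(0.0))
```

At the barrier the common value lies strictly between $u_l/\beta$ and $u_h/\beta$, reflecting the chance of crossing into the other regime.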

2.1.4 Distributions and paths

Another important aspect of any Brownian motion problem is to understand how it evolves, and what its
distribution is. The laws of motion of these processes are described by the Kolmogorov forward and backward
equations. The idea is in general very simple: in one equation we write the probability that a certain
state is reached, which obviously takes into consideration the barriers, while the other equation describes
the distribution of the next states conditional on where the economy is.
In terms of figures, the idea is that in one equation we compute the mass of observations that arrive at
state x coming from all possible states, while the other computes where the state will end up assuming we are
at x.
Let us see some examples. Assume we are interested in estimating the ergodic distribution (assuming
one exists). Assume we have a Brownian motion and there are reflecting barriers at $x_l$ and $x_h$. Define the
probability of being in state x as $\phi(x)$. For all the x's between the two reflecting barriers we have that the
probability of being in state x is the probability of being in state $x - \Delta h$ and getting a positive shock, plus
the mass on state $x + \Delta h$ times the probability of a bad shock. Technically, this is

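\[
\phi(x) = p\,\phi(x-\Delta h) + (1-p)\,\phi(x+\Delta h)
\]

From a second-order Taylor expansion we know that
\[
\begin{aligned}
\phi(x-\Delta h) &= \phi(x) - \Delta h\cdot\phi'(x) + \frac{1}{2}(\Delta h)^2\cdot\phi''(x) \\
\phi(x+\Delta h) &= \phi(x) + \Delta h\cdot\phi'(x) + \frac{1}{2}(\Delta h)^2\cdot\phi''(x)
\end{aligned}
\]
which implies that
\[
\phi(x) = \phi(x) + (1-2p)\,\phi'(x)\,\Delta h + \frac{1}{2}(\Delta h)^2\cdot\phi''(x).
\]
Substituting for $p$ and $\Delta h$,
\[
0 = \left(-\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\sigma\sqrt{\Delta t}\,\phi'(x) + \frac{1}{2}\left(\sigma\sqrt{\Delta t}\right)^2\cdot\phi''(x).
\]
Notice that both terms are proportional to $\Delta t$, hence,
\[
\frac{1}{2}\sigma^2\cdot\phi''(x) - \mu\cdot\phi'(x) = 0
\]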
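The substitution step above relies on the two-point approximation of the Brownian motion. The following is a minimal sketch of a numerical check, assuming the definitions used in these notes, $p = \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)$ and $\Delta h = \sigma\sqrt{\Delta t}$. The function names and parameter values are illustrative only.

```python
import math


def binomial_step(mu, sigma, dt):
    """Return (p, dh) for the two-point approximation of dx = mu*dt + sigma*dw."""
    dh = sigma * math.sqrt(dt)                      # size of an up or down move
    p = 0.5 * (1.0 + (mu / sigma) * math.sqrt(dt))  # probability of an up move
    return p, dh


def step_moments(mu, sigma, dt):
    """Mean and variance of one increment x(t+dt) - x(t) under the approximation."""
    p, dh = binomial_step(mu, sigma, dt)
    mean = p * dh - (1.0 - p) * dh
    # Both moves have absolute size dh, so E[step^2] = dh^2.
    variance = dh ** 2 - mean ** 2
    return mean, variance


# Illustrative parameter values (assumptions, not taken from the derivation).
mu, sigma, dt = 0.08, 0.16, 0.01
p, dh = binomial_step(mu, sigma, dt)
mean, variance = step_moments(mu, sigma, dt)
print("(1 - 2p) * dh:", (1.0 - 2.0 * p) * dh, "  -mu * dt:", -mu * dt)
print("mean:", mean, "  variance:", variance, "  sigma^2 * dt:", sigma ** 2 * dt)
```

The coefficient $(1-2p)\Delta h$ equals $-\mu\Delta t$, the one-step mean is $\mu\Delta t$, and the variance differs from $\sigma^2\Delta t$ only by a term of order $\Delta t^2$, which is why both terms of the equation end up proportional to $\Delta t$.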

The other Kolmogorov equation computes where the state variable (or particle) moves next. In this case, the law of motion is
\[
\phi(x) = p\,\phi(x+\Delta h) + (1-p)\,\phi(x-\Delta h)
\]
which is very similar to the one we had before, except that here we are saying that the mass of particles at $x$ moves up with probability $p$ and down with probability $1-p$. Using the same expansions as before we arrive at
\[
\frac{1}{2}\sigma^2\cdot\phi''(x) + \mu\cdot\phi'(x) = 0
\]

The first one is called the Kolmogorov forward equation and the second one is the backward one. We have to impose boundary conditions, and those are imposed by stating the evolution of the probability distribution next to the barriers. So, if we are at the lower boundary and get a negative shock, we remain on the same boundary:
\[
\phi(x_l) = p\,\phi(x_l+\Delta h) + (1-p)\,\phi(x_l).
\]

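Substituting we have
\[
\begin{aligned}
\phi(x_l) &= p\left(\phi(x_l) + \Delta h\cdot\phi'(x_l) + \frac{1}{2}(\Delta h)^2\cdot\phi''(x_l)\right) + (1-p)\,\phi(x_l) \\
0 &= \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\sigma\sqrt{\Delta t}\,\phi'(x_l) + \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right)\frac{1}{2}\left(\sigma\sqrt{\Delta t}\right)^2\cdot\phi''(x_l) \\
0 &= \frac{1}{2}\sigma\sqrt{\Delta t}\,\phi'(x_l) + \Delta t\cdot\Phi \\
0 &= \phi'(x_l)
\end{aligned}
\]
where $\Phi$ collects the coefficients of the terms of order $\Delta t$ and higher, which vanish faster than $\sqrt{\Delta t}$.

The same constraint is obtained at the upper bound. So, the differential equation is solved with these two constraints in mind. There are other constraints worth emphasizing, such as the requirement that the probability distribution add up to one.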

2.1.5 Control problem: defining optimal barriers

There are optimality conditions when we are solving a control problem that must be satisfied. In general
they are very intuitive and can be derived in the same way we have derived the constraints - usually they
are a derivative of the constraints.
There are great references for Brownian motion and stochastic calculus. My preferred ones are Øksendal
(2003), Harrison (1990), Dixit (1993), and Karatzas and Shreve (1988).

2.2 Applications

We study two interesting applications of Brownian motion. The first one is the solution of the Target Zone problem for exchange rates, and the second one is to find a solution to the Cochrane, Longstaff, and Santa-Clara paper, which is the two-country, single-good asset pricing problem we discussed earlier.

2.2.1 Target Zones

From the quantity theory of money we have the following equation:

\[
m_t - p_t = y_t + v_t - \alpha E\pi_t \tag{2.9}
\]

Using purchasing power parity, and assuming foreign prices have no inflation and are equal to one, we can rewrite equation 2.9 as

\[
S_t = m_t - y_t - v_t + \alpha E\hat{S}_t \tag{2.10}
\]

where $S_t = p_t - p_t^*$ is the spot exchange rate, and $\hat{S}_t = \pi_t - \pi_t^*$ is the exchange rate depreciation (which is the result of our assumption of fixed foreign prices). Recall that the inflation rate is the change in prices, and therefore the exchange rate depreciation is just the change of the spot exchange rate.
Assume fundamentals are described by a Brownian motion process: $m_t - y_t - v_t = x_t$. Then equation 2.10 is what is called a stochastic differential equation.

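Let's see how we can solve it.

\[
\begin{aligned}
S_t &= x_t + \alpha E\hat{S}_t \\
E\hat{S}_t &= \frac{E\,dS_t}{dt}
\end{aligned}
\]

$S_t$ is a function of $x_t$. Assume that $x_t$ is given by equation 2.11.

\[
dx_t = \mu\,dt + \sigma\,dw_t \tag{2.11}
\]

This means that we can apply Itô's lemma and take expectations.

\[
\begin{aligned}
dS_t &= \frac{\partial S}{\partial t}\,dt + \frac{\partial S}{\partial x}\,dx_t + \frac{1}{2}\frac{\partial^2 S}{\partial x^2}\,dx_t^2 \\
&= \left[\frac{\partial S}{\partial t} + \frac{\partial S}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 S}{\partial x^2}\sigma^2\right]dt + \frac{\partial S}{\partial x}\sigma\,dw_t \\
E\,dS_t &= \left[\frac{\partial S}{\partial t} + \frac{\partial S}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 S}{\partial x^2}\sigma^2\right]dt
\end{aligned}
\]

Substituting in the differential equation we obtain:

\[
S = x_t + \alpha\left[\frac{\partial S}{\partial t} + \frac{\partial S}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 S}{\partial x^2}\sigma^2\right]
\]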

which is a partial differential equation with boundary conditions that we will define later. From here on, what is left is just algebra: given the boundary conditions, the solution to the partial differential equation is simple.
Note that so far we have not imposed the fact that the exchange rate is operating under a target zone. This is where we get the boundary conditions. First assume that the exchange rate moves between bounds $[L, U]$. This means that if the fundamentals imply an exchange rate that is between $L$ and $U$ then there is no intervention. However, if the fundamentals imply an exchange rate outside the band, the central bank will intervene to move the exchange rate back toward the band. For simplicity (see Bertola and Caballero for the relaxation of this assumption) assume that the central bank has an infinite amount of reserves. Thus, it can always defend the band.

2.2.1.1 The Differential Equation

First, because the bands are time invariant and the process is time invariant, the functional form of $S$ depends on time only through the value of the fundamental. Therefore, $\partial S/\partial t = 0$. The partial differential equation is then a second-order ordinary differential equation. So, instead of using the notation $\partial S/\partial x$, we use the fact that the spot exchange rate is a function of the fundamental only and write the first derivative simply as $S'$.
\[
S = x_t + \alpha\left[\mu S' + \frac{1}{2}\sigma^2 S''\right]
\]

There are two parts to the solution of this differential equation: the homogeneous and the particular solution. It is important to highlight that if there were no target zones, the only solution for this differential equation would be the particular solution. To find it, simply posit that

\[
S(x_t) = a + b x_t + c x_t^2 + \ldots
\]

Substituting and equating coefficients we find that the solution is $S(x_t) = x_t + \alpha\mu$.
The homogeneous solution takes the form $S(x_t) = Ae^{\lambda x_t}$. Substituting in the homogeneous differential equation we find that the roots have to satisfy $1 = \alpha\left[\mu\lambda + \frac{1}{2}\sigma^2\lambda^2\right]$. So, the general solution is

\[
S(x) = Ae^{\lambda_1 x} + Be^{\lambda_2 x} + x + \alpha\mu
\]

So, A and B have to be found from the boundary conditions.
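The algebra behind the general solution can be checked numerically. The sketch below computes the two roots of $1 = \alpha\left[\mu\lambda + \frac{1}{2}\sigma^2\lambda^2\right]$ and evaluates the residual of $S = x + \alpha\left[\mu S' + \frac{1}{2}\sigma^2 S''\right]$ for arbitrary constants $A$ and $B$. The function names and the parameter values are assumptions chosen for illustration.

```python
import math


def target_zone_roots(alpha, mu, sigma):
    """Roots of 0.5*alpha*sigma^2*lam^2 + alpha*mu*lam - 1 = 0 (positive root first)."""
    a = 0.5 * alpha * sigma ** 2
    b = alpha * mu
    disc = math.sqrt(b * b + 4.0 * a)  # discriminant of a*lam^2 + b*lam - 1 = 0
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def ode_residual(x, A, B, alpha, mu, sigma):
    """Residual S - x - alpha*(mu*S' + 0.5*sigma^2*S'') for the general solution."""
    lam1, lam2 = target_zone_roots(alpha, mu, sigma)
    e1 = A * math.exp(lam1 * x)
    e2 = B * math.exp(lam2 * x)
    s = e1 + e2 + x + alpha * mu             # S(x)
    s1 = lam1 * e1 + lam2 * e2 + 1.0         # S'(x)
    s2 = lam1 ** 2 * e1 + lam2 ** 2 * e2     # S''(x)
    return s - x - alpha * (mu * s1 + 0.5 * sigma ** 2 * s2)


# Illustrative values; A and B are arbitrary because any pair solves the ODE.
alpha, mu, sigma = 1.0, 0.02, 0.1
lam1, lam2 = target_zone_roots(alpha, mu, sigma)
print("roots:", lam1, lam2)
print("residual:", ode_residual(0.3, 0.7, -1.2, alpha, mu, sigma))
```

Because the constant term of the quadratic is negative, one root is positive and the other negative, and the residual is zero up to rounding for any $A$ and $B$. The constants are pinned down only by the boundary conditions.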

2.2.1.2 Boundary Conditions

If there are no boundary conditions, the exchange rate function is described only by the particular solution. Let us forget about the $\alpha\mu$ term for a moment (it is the effect of the expected depreciation on the exchange rate). Notice that when there are no boundaries the exchange rate will fully follow the fundamentals; in other words, the changes of the exchange rate are perfectly correlated with the changes of the fundamentals. If we have bands, however, we know that when we are at the lower boundary (for example) the exchange rate cannot go down any further. So, it either remains at the same place or moves up. This is anticipated by the agents (in the expectation) and therefore it affects the level of the exchange rate immediately.
How can we take this into account? The first step is to determine the law of motion of the fundamental when we are close to the boundaries. As we said before, assume that at $x = l$ the exchange rate takes a value of $L$, and that at $x = u$ the exchange rate is equal to $U$. The constraint on the law of motion of the Brownian motion at the boundary implies that when $x = l$ the fundamental cannot fall further, and that when $x = u$ the fundamental cannot increase further. In fact, that is exactly what the central bank does: it intervenes to keep the fundamentals within the admissible range.
This constraint on the law of motion of the fundamental implies that, at $x_t = l$, the stochastic process is
\[
x_{t+\Delta t} - x_t =
\begin{cases}
\Delta h & \text{w/p } p \\
0 & \text{w/p } 1-p
\end{cases}
\qquad \text{where } p = \frac{1}{2}\left(1+\frac{\mu}{\sigma}\sqrt{\Delta t}\right).
\]
Let us see the implications of this process on the exchange rate. Remember that the original law of motion of the exchange rate, before applying Itô's lemma, is

\[
S_t = x_t + \alpha E\hat{S}_t
\]

Let us now evaluate this equation at the boundary. Remember that when $x_t = l$ the exchange rate takes the value of $L$. This means that
\[
\alpha\,\frac{E\Delta S}{\Delta t} = S(l) - l
\]

At $l$ the fundamentals either increase or remain the same (this is the intervention). Given this, the change in the exchange rate is as follows:
\[
\Delta S_t = S_{t+\Delta t} - S_t(l) =
\begin{cases}
S(l+\Delta h) - S(l) & \text{w/p } p \\
0 & \text{w/p } 1-p
\end{cases}
\]

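So, the expected value is
\[
\begin{aligned}
E\left[S_{t+\Delta t} - S_t(l)\right] &= \left[S(l+\Delta h) - S(l)\right]\cdot p \\
&= \left[S(l) + \Delta h\cdot S'(l) + \frac{1}{2}\Delta h^2\cdot S''(l) - S(l)\right]\cdot p \\
&= \frac{1}{2}\sigma\sqrt{\Delta t}\,S'(l) + \left[\frac{1}{2}\mu S'(l) + \frac{1}{4}\sigma^2 S''(l)\right]\Delta t
\end{aligned}
\]

Substituting in the law of motion of the exchange rate we have
\[
\alpha\left[\frac{1}{2}\sigma\sqrt{\Delta t}\,S'(l) + \left[\frac{1}{2}\mu S'(l) + \frac{1}{4}\sigma^2 S''(l)\right]\Delta t\right] = \Delta t\left[S(l) - l\right].
\]

Notice that when $\Delta t$ goes to zero, $\sqrt{\Delta t} \gg \Delta t$. Therefore, keeping only the dominant term in the previous equation we get
\[
\begin{aligned}
\frac{1}{2}\alpha\sigma\sqrt{\Delta t}\,S'(l) &= 0 \\
S'(l) &= 0
\end{aligned}
\]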

Therefore, the function is flat at l! The exact same thing occurs at the top bound.
There are four equations that describe the function and the impact of the intervention on the function.
These are the boundary conditions:

\[ S(u) = U \tag{2.12a} \]
\[ S'(u) = 0 \tag{2.12b} \]
\[ S(l) = L \tag{2.12c} \]
\[ S'(l) = 0 \tag{2.12d} \]

The values of A and B as well as l and u satisfy the four boundary conditions (equations 2.12).
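A minimal sketch of how the boundary conditions pin down the constants. To keep the system linear, it makes a simplifying assumption that is not in the text: the bounds on the fundamental, $l$ and $u$, are taken as given, the two smooth-pasting conditions (2.12b) and (2.12d) are solved for $A$ and $B$, and the implied band edges $L = S(l)$ and $U = S(u)$ are then read off. Solving for $l$ and $u$ given $L$ and $U$ would require a nonlinear root finder. All names and parameter values are illustrative.

```python
import math


def target_zone_roots(alpha, mu, sigma):
    """Roots of 0.5*alpha*sigma^2*lam^2 + alpha*mu*lam - 1 = 0 (positive root first)."""
    a = 0.5 * alpha * sigma ** 2
    b = alpha * mu
    disc = math.sqrt(b * b + 4.0 * a)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def solve_band(l, u, alpha, mu, sigma):
    """Solve S'(l) = 0 and S'(u) = 0 for (A, B); return A, B, L = S(l), U = S(u)."""
    lam1, lam2 = target_zone_roots(alpha, mu, sigma)
    # S'(x) = A*lam1*exp(lam1*x) + B*lam2*exp(lam2*x) + 1, set to zero at l and u.
    a11, a12 = lam1 * math.exp(lam1 * l), lam2 * math.exp(lam2 * l)
    a21, a22 = lam1 * math.exp(lam1 * u), lam2 * math.exp(lam2 * u)
    det = a11 * a22 - a12 * a21
    # Cramer's rule with right-hand side (-1, -1).
    A = (-a22 + a12) / det
    B = (-a11 + a21) / det

    def s(x):
        return A * math.exp(lam1 * x) + B * math.exp(lam2 * x) + x + alpha * mu

    return A, B, s(l), s(u)


# Illustrative symmetric case with zero drift.
A, B, L, U = solve_band(l=-0.1, u=0.1, alpha=1.0, mu=0.0, sigma=0.1)
print("A:", A, " B:", B)
print("L:", L, " U:", U)
```

In this zero-drift symmetric case the implied band is narrower than the range of the fundamental ($U < u$ and $L > l$), which reflects the stabilizing effect of the anticipated intervention discussed above.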

2.2.1.3 Comments

Some comments on the solution and its empirical relevance: First, the implied probability distribution in a target zone is such that most of the time the exchange rate should be close to the boundaries. In other words, if we depict the range of the fundamentals on the x-axis, fluctuating from $l$ to $u$, the distribution should look like a U. However, in reality, the distribution is closer to an inverted U. Thus, the empirical distribution implies that most of the time the exchange rate is closer to the center of the band than to the extremes. The main reason is that the central bank intervenes not only at the edges of the band but also in the center. In the model we have used, intervention occurs only at the extremes.
Second, the interest rate differential should be declining. In fact, close to the upper band the exchange rate is more likely to appreciate than depreciate, so smaller interest rates should be expected. But this seems counterfactual. We have observed that the longer and closer to the band the exchange rate is, the higher the interest rate. The reason is that in our model there is no possibility of abandonment of the exchange rate regime, while this is the premium we observe in reality.
Extensions: Bertola and Caballero (1992) allow for the probability of realignment. Garber and Svensson (1994) look at the cases when there are large interventions, not necessarily at the edges of the band.

2.2.2 Cochrane-Longstaff-Santa Clara

The second application we study is related to the asset pricing section we covered at the beginning of the course. Remember that for the two-country, one-good case we derived the price of the stock in discrete time.
In continuous time, the formula is very similar. The next subsection follows closely the paper by Cochrane, Longstaff, and Santa-Clara (2005). The solution for the stock price in discrete or continuous time is exactly the same. In continuous time we usually work with a finite horizon, which simplifies the regularity conditions, but that is the only difference. The stock price at home is given by
\[
S_{H,t} = y(t)\int_t^T e^{-\rho(s-t)}\,E_t\left[\frac{y_H(s)}{y(s)}\right]ds
\]
Notice that the important problem to be solved is to describe the evolution of $\frac{y_H(s)}{y(s)}$, which is the evolution of the share of home output relative to world output. Computing the expectation of this is quite important.
Assume a very general specification of the output processes
\[
\frac{dy_H(t)}{y_H(t)} = \mu_H(t)\,dt + \sigma_H(t)\,dz_H(t) \tag{2.13}
\]
\[
\frac{dy_F(t)}{y_F(t)} = \mu_F(t)\,dt + \sigma_F(t)\,dz_F(t)
\]
where we assume that dzH (t) and dzF (t) are the standard Wiener processes with mean zero and variance
one. Additionally, assume that the two Brownian motions are uncorrelated.

2.2.2.1 Evolution of the Share

The first step is to find the process of the share. Assume we have the function
\[
\theta(x, y) = \frac{x}{x+y}
\]
then, the first derivatives are
\[
\theta_x = \frac{y}{(x+y)^2}, \qquad \theta_y = -\frac{x}{(x+y)^2}
\]
while the second order derivatives are
\[
\theta_{xx} = -\frac{2y}{(x+y)^3}, \qquad \theta_{yy} = \frac{2x}{(x+y)^3}, \qquad \theta_{xy} = \left[\frac{1}{(x+y)^2} - \frac{2y}{(x+y)^3}\right]
\]

We can now substitute and find the stochastic properties of the share.
\[
d\theta = \theta_x\,dx + \theta_y\,dy + \frac{1}{2}\left[\theta_{xx}\cdot[dx]^2 + \theta_{yy}\cdot[dy]^2 + 2\theta_{xy}\cdot[dx\cdot dy]\right]
\]

Denote the share as $\theta$. Substituting, we get
\[
d\theta = \frac{y_F}{(y_H+y_F)^2}\,dy_H - \frac{y_H}{(y_H+y_F)^2}\,dy_F + \frac{1}{2}\left[-\frac{2y_F}{(y_H+y_F)^3}[dy_H]^2 + \frac{2y_H}{(y_H+y_F)^3}[dy_F]^2\right]
\]

where we have already used the fact that the Brownian motions are independent, meaning that $[dy_H\cdot dy_F] = 0$. We know that $\theta = \frac{y_H}{y_H+y_F}$ and that $1-\theta = \frac{y_F}{y_H+y_F}$. Therefore the equation describing the evolution of the share can be written as:
\[
d\theta = \theta(1-\theta)\frac{dy_H}{y_H} - \theta(1-\theta)\frac{dy_F}{y_F} - \theta^2(1-\theta)\left[\frac{dy_H}{y_H}\right]^2 + \theta(1-\theta)^2\left[\frac{dy_F}{y_F}\right]^2
\]

I have eliminated the $(t)$ terms from the notation for simplicity. This can be further reduced to
\[
d\theta = \theta(1-\theta)\left[\frac{dy_H}{y_H} - \frac{dy_F}{y_F}\right] + \theta(1-\theta)\left[(1-\theta)\left[\frac{dy_F}{y_F}\right]^2 - \theta\left[\frac{dy_H}{y_H}\right]^2\right]
\]

This is a difficult process to describe for the general Brownian motions in equation (2.13). To simplify further, assume that $\mu_H(t) = \mu_F(t) = \mu$, and that $\sigma_H(t) = \sigma_F(t) = \sigma$. The evolution of $\theta$ is then
\[
\begin{aligned}
d\theta &= \theta(1-\theta)(\mu\,dt + \sigma\,dz_H) - \theta(1-\theta)(\mu\,dt + \sigma\,dz_F) - \theta^2(1-\theta)\left(\sigma^2 dt\right) + \theta(1-\theta)^2\left(\sigma^2 dt\right) \\
&= \theta(1-\theta)\,\sigma\,(dz_H - dz_F) + \theta(1-\theta)\left[(1-\theta)\,dt - \theta\,dt\right]\sigma^2 \\
&= \theta(1-\theta)\,\sigma\,(dz_H - dz_F) + \theta(1-\theta)(1-2\theta)\,\sigma^2 dt
\end{aligned}
\]

Notice that there is a drift
\[
E[d\theta_t] = \theta(1-\theta)(1-2\theta)\,\sigma^2\,dt
\]
which is zero at $\theta = 0$, $\theta = 1$, and $\theta = 1/2$. Also, it is easy to confirm that the drift is positive when $\theta \in (0, 1/2)$ and negative when $\theta \in (1/2, 1)$. In other words, the share tends to move to the middle, and there are two absorbing states: 0 and 1. The variance of this process is given by
\[
[d\theta_t]^2 = \theta^2(1-\theta)^2\,\sigma^2\cdot 2\,dt
\]
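The drift of the share can be cross-checked numerically. The sketch below, with illustrative function names, evaluates the Itô drift directly from the partial derivatives of $\theta(x, y) = x/(x+y)$ under the equal-parameter assumption $\mu_H = \mu_F = \mu$ and $\sigma_H = \sigma_F = \sigma$, and compares it with the closed form $\theta(1-\theta)(1-2\theta)\sigma^2$.

```python
def share_drift_closed_form(theta, sigma):
    """Drift of the share per unit of time: theta*(1-theta)*(1-2*theta)*sigma^2."""
    return theta * (1.0 - theta) * (1.0 - 2.0 * theta) * sigma ** 2


def share_drift_from_ito(y_h, y_f, mu, sigma):
    """Drift of theta = y_h/(y_h+y_f) built from the partial derivatives of theta."""
    total = y_h + y_f
    th_x = y_f / total ** 2            # d theta / d y_h
    th_y = -y_h / total ** 2           # d theta / d y_f
    th_xx = -2.0 * y_f / total ** 3    # second derivative in y_h
    th_yy = 2.0 * y_h / total ** 3     # second derivative in y_f
    # With independent shocks the cross term [dy_h * dy_f] is zero.
    first_order = th_x * mu * y_h + th_y * mu * y_f
    second_order = 0.5 * (th_xx * (sigma * y_h) ** 2 + th_yy * (sigma * y_f) ** 2)
    return first_order + second_order


# Illustrative values: output levels chosen so that the share is 0.3.
mu, sigma = 0.08, 0.16
print(share_drift_from_ito(0.3, 0.7, mu, sigma))
print(share_drift_closed_form(0.3, sigma))
```

The common growth rate $\mu$ drops out of the drift, so only the Itô correction terms remain, and the two functions agree up to rounding.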

2.2.2.2 Solving for Stock Prices

Importantly, the presence of a nonzero drift implies that the share is not a martingale. However, the drift and the variance depend only on the value of $\theta$ today, which means that the share is a Markov process.
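The price of the stock is given by
\[
\begin{aligned}
S_H(t) &= y(t)\int_t^T e^{-\rho(s-t)}\,E_t\left[\frac{y_H(s)}{y(s)}\right]ds \\
&= y(t)\int_t^T e^{-\rho(s-t)}\,E_t\,\theta_s\,ds
\end{aligned}
\]

this can be re-written as
\[
\begin{aligned}
\frac{S_H(t)}{y(t)} &= E_t\,F(\theta_t) \\
F(\theta_t) &= \int_t^T e^{-\rho(s-t)}\,\theta_s\,ds
\end{aligned}
\]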

As before, we can use Itô's lemma to compute this integral (and its expectation). $F(\cdot)$ is a function of $\theta$ and therefore for a given process of the share we can compute the evolution of the function by Itô's lemma.
\[
dF(\theta_t) = F'(\theta_t)\,d\theta_t + \frac{1}{2}F''(\theta_t)\,[d\theta_t]^2
\]
Hence,
\[
E_t\,dF(\theta_t) = F'(\theta_t)\,E_t[d\theta_t] + \frac{1}{2}F''(\theta_t)\,[d\theta_t]^2
\]
From the previous derivation of the process of $\theta$
\[
\begin{aligned}
E_t[d\theta_t] &= \theta(1-\theta)(1-2\theta)\,\sigma^2\cdot dt \\
[d\theta_t]^2 &= \theta^2(1-\theta)^2\,\sigma^2\cdot 2\,dt
\end{aligned}
\]
substituting
\[
E_t\,dF(\theta_t) = \left[F'(\theta_t)\left[\theta(1-\theta)(1-2\theta)\right] + F''(\theta_t)\left[\theta^2(1-\theta)^2\right]\right]\sigma^2\,dt \tag{2.14}
\]

To finalize the differential equation describing the function we have to concentrate on the term on the left. We have defined the function $F$ as
\[
F(\theta_t) = \int_t^T e^{-\rho(s-t)}\,\theta_s\,ds
\]
which means that $F$ is a stochastic integral (mainly because it is an integral of a stochastic variable). This function, interestingly, depends on time through two channels: the integrand and the limit of the integration. So, we can compute the change in the function with respect to $t$.
\[
\begin{aligned}
dF(\theta_t) &= \rho\left[\int_t^T e^{-\rho(s-t)}\,\theta_s\,ds\right]dt - e^{-\rho(t-t)}\,\theta_t\,dt \\
&= \rho F(\theta_t)\,dt - \theta_t\,dt
\end{aligned}
\]
where the first term of the derivative is from the integrand, and the second one is from the limit in the integration. Now, we do not need the whole process of $F(\cdot)$ but its expectation.
\[
E_t\,dF(\theta_t) = \rho F(\theta_t)\,dt - \theta_t\,dt
\]
The term on the left is what we computed above in equation (2.14). Substituting implies the following differential equation
\[
F'(\theta_t)\left[\theta_t(1-\theta_t)(1-2\theta_t)\right]\sigma^2 + F''(\theta_t)\left[\theta_t^2(1-\theta_t)^2\right]\sigma^2 = \rho F(\theta_t) - \theta_t
\]

So, we have to solve the following differential equation to find the solution to the function $F$ and the value of the stock.
\[
\theta^2(1-\theta)^2\sigma^2\cdot F'' + \theta(1-\theta)(1-2\theta)\,\sigma^2\cdot F' - \rho F = -\theta
\]
This is usually solved using some numerical methods. There is one exact solution to this differential equation that uses hyperbolic functions (it can be solved with Lagrange or Fourier transforms), but this exact solution only exists for this very simple example where the drifts of the processes and the variances are equal and constant. In the general case, there is no simple solution and the easiest way to solve this differential equation is by appealing to the several numerical methods available in the literature. My preferred method is called Multigrid. See Briggs, Henson, and McCormick (2000) for further references.
Notice that once F is known as a function of the share, then the stock price of home is known using
SH (t) = y (t) F (θt )
while the foreign stock is
SF (t) = y (t) F (1 − θt )
That solves the problem of all the stock prices.
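A minimal numerical sketch of the differential equation for $F$, using plain central finite differences and a tridiagonal solver rather than the Multigrid method mentioned above. It rests on two assumptions that are not spelled out in the text: the equation is treated as time-independent (the infinite-horizon case), and the boundary values $F(0) = 0$ and $F(1) = 1/\rho$ are obtained by evaluating the equation at the endpoints, where the coefficients on $F'$ and $F''$ vanish. Function names, grid size, and parameter values are illustrative.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_share_ode(rho, sigma, n=200):
    """Solve th^2(1-th)^2 s^2 F'' + th(1-th)(1-2th) s^2 F' - rho F = -th on [0, 1]."""
    h = 1.0 / n
    theta = [i / n for i in range(n + 1)]
    f_left, f_right = 0.0, 1.0 / rho   # assumed boundary values at theta = 0 and 1
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n):
        th = theta[i]
        a = th ** 2 * (1.0 - th) ** 2 * sigma ** 2             # coefficient on F''
        b = th * (1.0 - th) * (1.0 - 2.0 * th) * sigma ** 2    # coefficient on F'
        lo = a / h ** 2 - b / (2.0 * h)
        up = a / h ** 2 + b / (2.0 * h)
        r = -th
        if i == 1:
            r -= lo * f_left      # move the known boundary value to the right side
        if i == n - 1:
            r -= up * f_right
        lower.append(lo)
        diag.append(-2.0 * a / h ** 2 - rho)
        upper.append(up)
        rhs.append(r)
    interior = solve_tridiagonal(lower, diag, upper, rhs)
    return theta, [f_left] + interior + [f_right]


# Illustrative parameters.
theta, F = solve_share_ode(rho=0.02, sigma=0.16, n=200)
print("F(0.25):", F[50], " F(0.5):", F[100], " F(0.75):", F[150])
```

Under these assumptions the solution satisfies $F(\theta) + F(1-\theta) = 1/\rho$, so the two price-output ratios add up to $1/\rho$, which gives a simple check on any numerical solution.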

2.2.3 Problem Sets

This is a nice problem set that will allow you to develop some numerical methods and learn how to solve differential equations on a computer.

2.2.3.1 Numerical Cochrane, Longstaff, and Santa-Clara.

In the two-country, single-good endowment economy, assume that production of each country is described by the following processes
\[
\begin{aligned}
\frac{dy_{H,t}}{y_{H,t}} &= \mu_H\cdot dt + \sigma_H\cdot dz_{H,t} \\
\frac{dy_{F,t}}{y_{F,t}} &= \mu_F\cdot dt + \sigma_F\cdot dz_{F,t} \\
y_{W,t} &= y_{H,t} + y_{F,t}
\end{aligned}
\]
where the $dz$'s are independent Wiener processes.

1. Derive the expression for the share of the home country in world production.
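2. Use the fact that the stock prices are given by
\[
\begin{aligned}
\frac{S_{H,t}}{y_{W,t}} &= E_t\int_t^\infty e^{-\beta(s-t)}\,\theta_s\,ds \\
\frac{S_{F,t}}{y_{W,t}} &= E_t\int_t^\infty e^{-\beta(s-t)}\,(1-\theta_s)\,ds
\end{aligned}
\]
to derive the differential equation that home and foreign stock prices should satisfy.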
3. Solve each of the equations numerically. For this purpose assume that µH = µF = 0.08, σ H = σ F =
0.16, and β = 0.02. For simplicity assume that both countries start with initial output equal to one.
4. The numerical solution in the previous case gives you a mapping between the current share and the
current price of the stocks. Now, we want to characterize the behavior of stock prices through time. To
do so, we’ll perform a Monte-Carlo exercise. Draw 100 realizations for dzH and dzF . Remember that
they are supposed to be normally distributed with mean zero and variance one. Assume the economy
starts with a share equal to 1/2 at time 0. Given the stochastic process described in question 1, plot the
path of the share. Using the numerical solution from the previous question, determine the price of the
stocks and compute their path. For the given path, compute the stock market returns, and compute
the correlation. (In this question you need to think in terms of a discrete approximation to the model
with dt = 1.)
5. Run the Monte-Carlo 500 times and compute the distribution of the correlation. Present the 95%
confidence interval for the estimated correlation.
6. What is the correlation (distribution) across stocks if we increase µF to 0.10? Present the 95% confidence interval for the estimated correlation.
7. How does it change if instead of increasing µF we decrease σF to 0.10? Present the 95% confidence interval for the estimated correlation.
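As a starting point for the Monte-Carlo questions, the following is a minimal sketch of how one path of the share might be simulated with dt = 1. It assumes an exact log-normal update of each output level, which is one possible discrete approximation and not necessarily the one intended by the problem set. The function name, the seed handling, and the default arguments are illustrative.

```python
import math
import random


def simulate_share_path(mu_h, mu_f, sigma_h, sigma_f, periods=100, dt=1.0, seed=0):
    """Simulate the home share theta = y_h / (y_h + y_f) over a number of periods."""
    rng = random.Random(seed)      # seeded generator so that a path is reproducible
    y_h, y_f = 1.0, 1.0            # both countries start with output equal to one
    path = [y_h / (y_h + y_f)]
    for _ in range(periods):
        z_h = rng.gauss(0.0, 1.0)  # independent standard normal shocks
        z_f = rng.gauss(0.0, 1.0)
        # Log-normal step for each geometric Brownian motion (assumed discretization).
        y_h *= math.exp((mu_h - 0.5 * sigma_h ** 2) * dt + sigma_h * math.sqrt(dt) * z_h)
        y_f *= math.exp((mu_f - 0.5 * sigma_f ** 2) * dt + sigma_f * math.sqrt(dt) * z_f)
        path.append(y_h / (y_h + y_f))
    return path


# Parameter values from question 3 of the problem set.
path = simulate_share_path(0.08, 0.08, 0.16, 0.16, periods=100, seed=1)
print("initial share:", path[0], " final share:", path[-1])
```

Each simulated share can then be mapped into stock prices through the numerical solution of the differential equation, and repeating the exercise with different seeds gives the distribution of the correlation.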

2.3 Sticky prices models in continuous time



3
Balance of Payment Crises in a Simple Monetary Model

Balance of payments crises are one of the most prolific areas of research in international economics. It all started with the seminal contributions of XX and XX, which described in a very simple framework how, in a continuous and parsimonious model, discrete changes in portfolios can cause a crisis. Those models describe a crisis mostly as the outcome of an unsustainable fiscal policy. In fact, this was possibly the cause of many crises during the 70's and 80's, and obviously right now in countries such as Greece, Spain, and Portugal. Crises that are driven entirely by unsustainable fiscal policy are called first generation.
The second generation of models started with Obstfeld's paper on self-fulfilling crises. In these models the fiscal policy is sustainable if the agents buy into the country, but it is unsustainable if the agents decide to bet against the country. In other words, crises occur conditional on the expectations of agents, hence the multiple equilibria aspect.
The third generation of models introduced the financial sector as the mechanism of transmission. Either the banks have inconsistent practices (lending, capital requirements, or credit evaluation), which makes these models look like an extension of the standard first generation model, or the banking sector suffers from liquidity shocks in the spirit of Diamond and Dybvig, which makes the model similar to the second generation models.
In this chapter I want to introduce the first generation model in an intertemporal model as we have done so far. I will discuss how the crisis takes place in a stochastic environment and discuss some alternative explanations for why the inconsistent fiscal policy takes place.

3.1 Stochastic Fiscal Reform and Crises

When countries are in fiscal trouble they tend to initiate a process of fiscal reform. These reform processes are rarely certain. In most instances, part of the fiscal reform was delayed or not implemented completely, so the fiscal deficit increased and the program had to be abandoned. The aftermath of these programs is not encouraging: most of these policies turned out to be failures, lowering reserves and causing higher inflation rates. Hence, most of the time the intentions differ significantly from the outcomes.

The 80's and 90's were filled with Latin American fiscal reforms. Nowadays, the Europeans are leading the way in trying to get their public accounts in order.
The literature has explained the need for fiscal reforms and disinflation programs based on four alternative theories: the Olivera-Tanzi effect, optimal tax composition, exchange rate management as a disciplinary device, and political economy issues. First, if the economy is operating on the wrong side of the Laffer curve, there exists another equilibrium with lower inflation. The idea is that the lag that exists between the realization of income and the time income tax is paid reduces real revenue. A stabilization moves the economy to the left hand side of the Laffer curve and no fiscal effort is required. This is the Olivera-Tanzi effect (see Olivera (1967) and Tanzi (1978)). Second, the disinflation program might be the result of an optimal tax choice problem. For example, consider an economy that has a high inflation tax and a low income tax. Moving toward the optimal tax portfolio implies a reduction in inflation and an increase in income tax. This kind of tax recomposition is common in the Latin American experience, and is an important component of their reform processes. Third, the disinflation program can be thought of as a commitment or disciplinary device to encourage fiscal responsibility. If there is a conflict between the central bank and the government, and the central bank is the stronger one, then the monetary authority initiates a managed exchange rate to force the fiscal authority to reduce expenditure.¹ Fourth, there are models that concentrate on how the incentive structure of the government interacts with the choice of exchange rate regime (see Tornell and Velasco 1995, 1998, 2000).
These theories capture important aspects of the disinflation programs in Latin America. They fail, however, to explain several of the issues in those processes. The first two theories cannot justify why disinflation programs usually end with a balance of payments crisis. Both predict that no extra financing is needed during the disinflation. The third explanation does not seem to capture the institutional arrangements that prevail in Latin America; central bank independence is a relatively new concept for the continent, and, in general, we observe that the monetary authority abandons the policy, and not the converse. Finally, the fourth hypothesis depends on political economy institutions that are not necessarily common across the region. These are undoubtedly important components of the story. Here, however, we abstract from them and emphasize an alternative explanation.
In this chapter, we discuss a simple model that accounts for the behavior of the government based on three assumptions: the process of reform is uncertain, inflation has welfare costs, and disinflations are costly. The model has three main implications: First, the optimal exchange rate policy implies the initiation of a disinflation program at the announcement of a fiscal reform. Second, even if there exists a possibility of a balance of payments crisis, it is still optimal to initiate the disinflation program. Third, it is optimal to engage in a sequence of stabilization programs until one of them is successful, or until a balance of payments crisis occurs.²
The intuition is that the announcement of a fiscal reform conveys good news about the future in the form of lower expected fiscal deficits. Seigniorage has welfare costs; therefore it is optimal for the central bank to
smooth the inflationary tax. Hence, a disinflation program is initiated at the announcement of the reform
even though this implies a larger fiscal deficit currently. This deficit is financed with reserves. If the reform
never takes place and the disinflation program has to be abandoned, the ex-post inflation rate is higher than

¹ The European disinflation experiences of the 80's can be classified as examples of the use of monetary policy as a commitment device.
² It is important to point out that here we are concerned with the timing between the disinflation and the fiscal reform. As we discuss below, the model only captures a small part of the disinflation (around 25%). In an earlier version of the paper, the model included sticky prices and it was able to account for a sizeable fraction of the exchange rate peg.

the one that existed before the program was initiated; it looks as if the government made a mistake when it implemented the stabilization program in the first place.³

3.2 Model without Debt Constraints: Optimal Monetary Policy

This section discusses the basic ingredients of the model and shows that the optimal monetary policy is one
in which a disinflation program is started together with a fiscal reform. This has the implication of having a
fiscal deficit and a loss of reserves at the same time – in the same way it occurs in the Krugman model. The
first case is one where the country has no debt restrictions. This is the case analyzed here. The next section
looks at the problem when there is an upper bound on the debt level, and therefore, there is the possibility
of a currency crisis.
The three main ingredients of the model are the following: the reform process is uncertain, inflation has
welfare costs, and a disinflation program is costly.
In the model, we assume that inflation is the only available tax, and that it generates welfare costs. First,
the assumption that inflation is the only available tax is capturing the fact that seigniorage has been an
important share of the government’s revenue, and it is the source for the marginal revenue when economies
have a fiscal deficit. The second assumption is that inflation has welfare costs. We capture this welfare cost
with a concave utility function and a cash in advance constraint. In fact, this is how we introduce money
into the model. This particular formulation has the advantage that it can be interpreted as a tax smoothing
problem, where inflation is a distortionary tax. Barro (1979) showed that when taxes are distortionary, the
optimal policy is to spread the tax burden across time: tax smoothing. See Barro (1988), Calvo and Guidotti
(1992), Ball and Mankiw (1994), Mankiw (1984) and Saint-Paul (1994). In our case, the tax smoothing result
implies inflation smoothing. Note that the smoothing motive justifies the implementation of the disinflation
program. In other words, the announcement of the fiscal reform implies that future welfare costs might be
smaller. If the cost function is convex, then consumers want to transfer part of the future benefits to today,
which requires reducing current inflation.⁴ Finally, a disinflation program is costly because it deprives the
government of a source of revenue. The loss in reserves today leads to a higher level of inflation in the future,
as the government seeks to recover revenue. This is the Sargent and Wallace (1980) effect, which in our case,
appears as a reduction in reserves, rather than as an increase in debt (see also Liviatan (1984, 1986) and
van Wijnbergen (1988)).
As should be clear by now, in the model the tax smoothing motive drives the timing of the disinflation,
while the Sargent and Wallace effect generates the costs of the program.
Two additional remarks: First, failed stabilization programs are far more costly than just the Sargent
and Wallace effect. In practice, the failure to implement a disinflation program is costly, not only in loss
of reserves, but in several and probably more important ways, such as recessions, loss in credibility in
future programs, etc. These costs could be included in the model. However, it would complicate the analysis
without improving the intuition. Second, the amount of disinflation predicted by the model comes from the
tax smoothing motive, and therefore, is relatively small in comparison with the data. The Latin American
experience on average implies a reduction in the exchange rate depreciation from 200 percent to almost
zero. The tax smoothing (at best) would be able to account for one quarter of that. This caveat, however,

³ This intuition is closely related to Calvo & Drazen (1995). They also study the impact of uncertain policies on the path of the economy. In their case, they concentrate on the existence of market imperfections and its interaction with the uncertain duration of the policies.
⁴ It is important to mention that the assumption that the welfare costs are convex can be relaxed. It can be shown that if the concavity of seigniorage is larger than the concavity of the welfare costs of inflation, then the optimal strategy still is to smooth inflation.

can be solved if sticky prices or inflation inertia (à la Calvo) is introduced in the model. Moreover, issues of credibility, transparency of policy, and/or political economy will help explain the size of the disinflation. In this section, however, we are more concerned with the timing of the disinflation than with its magnitude. The inclusion of sticky prices (for example) complicates the analysis but does not change the date at which the disinflation program is initiated. In that model, only the "intensity" of the disinflation program is changed.

3.2.1 Environment and Consumers

Consider a small open economy where there is a single tradable good and where PPP holds. Assume there
is perfect capital mobility and zero foreign inflation. All bonds are indexed, thus the domestic nominal
inflation rate is equal to the rate of depreciation, and the domestic interest rate is equal to the depreciation
rate plus the foreign real interest rate (assumed to be constant).5 There are three agents: an infinitely lived
representative consumer, the government and the central bank.
Consumers choose their consumption path and portfolio holdings taking as given the exchange rate policy.
Formally, the consumer’s problem is,
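$$
\begin{aligned}
\max_{\{c_t\}} \quad & E \int_0^{\infty} \ln c_t \, e^{-\rho t} \, dt \\
\text{s.t.} \quad & \dot{a}_t = \rho a_t + y - c_t - i_t m_t \\
& c_t \le \frac{1}{\alpha} m_t \\
& \lim_{t \to \infty} a_t e^{-\rho t} = 0
\end{aligned}
\tag{3.1}
$$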

where ct is consumption, y is output (assumed to be constant), at are the asset holdings denominated in
tradables, mt denotes money balances in terms of tradables, ρ is the discount rate (assumed to be constant),
and it is the nominal domestic interest rate. The first equation is the consumer’s objective function. The
second one is the budget constraint in terms of tradables, where the interest rate has been already substituted
by the international interest rate. The third one is the cash in advance constraint. And the fourth one is the
transversality condition on consumer’s assets.
There are four technical assumptions used in the model that simplify the analysis. First, we assume that
consumers do not derive utility from government expenditure. Second, we assume that output is exogenously
given.6 Third, we adopt a cash in advance formulation to introduce money into the model. An equivalent
formulation is one where money enters in the utility function. The same general results hold with the
exception that the path of money holdings might be different. Cash in advance assumes that money and
consumption are complements, and money in the utility function relaxes this assumption. We choose a cash
in advance formulation because it captures the distortionary inflation tax in a simpler way. The money in the
utility function is analyzed in a separate subsection. Fourth, we assume log utility. The choice of log utility
simplifies the consumer’s solution making current consumption independent of the future interest rate path.
A different utility function implies that current consumption is a function of the future path of interest
rates. Thus, some intertemporal substitution is made by the consumers at the announcement of the reform.

5 We assume that there is no growth in the world economy and that it is in steady state, thus the international real interest

rate is equal to the discount rate.


6 Relaxing these two assumptions does not change the results. If output depends on the level of expenditure or consumers

derive utility from public expenditure, this makes the expenditure reduction less desirable. However, if reducing expenditure is
welfare improving, then there is a reduction in tax requirements in the future and the results still hold.
3.2 Model without Debt Constraints: Optimal Monetary Policy 53

However, full smoothing is only achieved if there is tax smoothing (this result comes from Barro (1979)). So,
still it is the case that the optimal strategy involves inflation smoothing.
The solution for the consumer’s problem is,
$$
c_t = \frac{y + \rho a_0}{1 + \alpha i_t} \tag{3.2}
$$
$$
m_t = \alpha \, \frac{y + \rho a_0}{1 + \alpha i_t} \tag{3.3}
$$
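As a minimal numerical sketch of equations (3.2) and (3.3), the snippet below evaluates consumption and money demand for a given nominal interest rate. The function name and the parameter values are hypothetical and chosen only for illustration; they do not come from the text.

```python
def consumer_solution(y, rho, a0, alpha, i):
    """Consumption (3.2) and money demand (3.3) for a given nominal rate i."""
    permanent_income = y + rho * a0
    c = permanent_income / (1.0 + alpha * i)
    m = alpha * c  # the cash in advance constraint binds, so m = alpha * c
    return c, m


# Hypothetical parameter values, used only to illustrate the formulas.
c, m = consumer_solution(y=1.0, rho=0.05, a0=0.0, alpha=0.5, i=0.25)

# Consumption plus the interest cost of holding money exhausts y + rho * a0.
spending = c + 0.25 * m
```

The last line illustrates why the nominal interest rate acts as a distortionary tax here: total spending on consumption and on the opportunity cost of money is fixed at y + ρa0, so a higher rate lowers consumption one for one.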

3.2.2 Government

The government finances an exogenous expenditure on tradables by inflationary tax and interest earnings on
reserves. We assumed that the government expenditure has no impact on output or the consumer’s utility;
it is wasteful expenditure. At time zero, the government announces an uncertain fiscal reform, in the sense
that it is not sure when it can be implemented or if it will ever be. We assume that all agents have the same
prior about the probability of success of such reform.
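Assume that the expenditure process is described by,
$$
g_t =
\begin{cases}
g_h & t < \tau \\[4pt]
\begin{cases}
g_h & \text{w.p. } 1 - q \\
g_l & \text{w.p. } q
\end{cases} & t \ge \tau
\end{cases}
\tag{3.4}
$$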
where q, τ and gh > gl are exogenously given. Define the expenditure improvement as ∆g = gh − gl . Define
the bad state of the world as the state in which expenditure is not reduced, and the good state of the world
as the state in which expenditure is permanently reduced.
There are three technical remarks about this stochastic process: First, the timing of the adjustment
is known, but not its outcome. In section 3.4, we show that the results still hold if this assumption is
relaxed. Second, the drift of the process is negative, thus there is a true process of reform in place at least in
expectations. Third, the expenditure process is exogenous. The question we are addressing is why countries
peg their exchange rates, conditional on having a fiscal deficit and a reform in place. Thus, the exogeneity
of the process can be interpreted as the existence of conflicts between monetary and fiscal policy, and that
the fiscal authority is the stronger one. Hence, the expenditure process can be considered as exogenous by
the central bank.
The government’s budget constraint is given by,
$$
\dot{B}_t = e_t g_t - \Omega_t + i_t B_t \tag{3.5}
$$

where Bt denotes the government debt held by the central bank and Ωt represents the central bank’s profits,
discussed below. We assume that the government’s debt is in nominal terms but indexed. This eliminates
the incentives for discrete devaluations or surprise inflations to reduce its real value.
In the present model, the government has been oversimplified; it has no choices to make. It follows a very
simple rule: it maintains a high level of expenditure and, when τ is reached, it can, if lucky, reduce it forever.

3.2.3 Central Bank

The central bank decides the path of exchange rate depreciations that maximizes the consumer’s utility, taking as
given the government’s expenditure path and the consumer’s reaction function. This is a benevolent Central
Bank in the sense that its objective function is exactly the same as that of the consumers. Obviously different
results would be obtained if the Central Bank has a different objective. In other words, the central bank does
not follow a pure inflation target. However, it does target an inflation rate consistent with maximization of
the consumer’s utility.7
The central bank’s balance sheet and flow profits in nominal terms are given by:

$$
M_t = e_t r_t + B_t \tag{3.6}
$$
$$
\Omega_t = i_t B_t + \left(i^*_t + \hat{e}_t\right) e_t r_t
$$

where Mt represents the nominal money holdings, rt is total reserves in foreign currency, i∗t is the foreign
nominal interest rate, and êt denotes the exchange rate depreciation. The first equation is the central bank’s
balance sheet. The second equation is the central bank’s profits which consist of nominal interest earnings
on government’s debt, foreign interest earnings on reserves, and the capital gains on reserves due to a
depreciation.
One implication of perfect capital mobility, the indexed government debt and the PPP assumptions is that
choosing the exchange rate depreciation is the same as choosing the inflation rate or the nominal interest
rate.8 Given this equivalence we assume that the central bank chooses the nominal interest rate. Formally,
the problem is,
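$$
\begin{aligned}
\max_{\{i_t\}} \quad & E \int_0^{\infty} \ln\left(\frac{y + \rho a_0}{1 + \alpha i_t}\right) e^{-\rho t} \, dt \\
\text{s.t.} \quad & \dot{b}_t = \rho b_t + g_t - i_t m_t \\
& \lim_{t \to \infty} b_t e^{-\rho t} = 0 \\
& r_t \ge \bar{r}
\end{aligned}
\tag{3.7}
$$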

The first constraint is the government’s budget constraint in real terms. This is obtained by substituting
(3.6) into (3.5), and rewriting it in terms of tradables. Again, the real interest rate has been substituted out
by the international interest rate. The second constraint is the transversality condition on the government’s
debt. The third constraint is an international liquidity constraint reflected in a minimum level of reserves.

3.2.4 Optimal Monetary and Exchange Rate Policy

The main question is why do governments initiate a disinflation program even though the fiscal equilibrium
is not guaranteed.9 The main result is that the optimal exchange rate path is a managed exchange rate
regime with a depreciation rate lower than the one implied by flexible exchange rate.
The central bank’s problem is to choose the path of nominal interest rates that solves (3.7) when r̄ → −∞.
First, we solve the model for a flexible exchange rate as a benchmark. Second, we solve for the optimal
exchange rate policy.

7 This approach to optimal monetary policy is now standard in the literature. See Lahiri and Vegh (2000).
8 Additionally, these assumptions imply that government foreign debt and reserves are perfect substitutes, so a constraint on the level of reserves is equivalent to a constraint on the level of debt.
9 An alternative way of posing the same question is: why do fiscal and monetary policy seem to be inconsistent for some period of time?

3.2.4.1 Flexible exchange rate.

We define the flexible exchange rate as the one that implies a constant level of reserves; thus the government’s
debt is also constant. Imposing ḃt = 0 on the government’s budget constraint we obtain,

ρb0 + gt = it mt

This equation implies that seigniorage has to equal total government expenditure (including interest on debt) in every period.
Given the money demand, equation (3.3), we can solve for the interest rate, which implicitly solves for the
exchange rate depreciation.
$$
\frac{1}{1 + \alpha i_t} = 1 - \frac{g_t + \rho b_0}{y + \rho a_0} \tag{3.8}
$$
$$
\hat{e}_t = i_t - \rho
$$

Denote $i^f_h$ ($i^f_l$) as the interest rate in the flexible regime consistent with a high (low) level of expenditure.
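The flexible-regime rate in equation (3.8) can be computed directly. The sketch below is illustrative only: the function name and the parameter values are assumptions, not part of the original text.

```python
def flexible_rate(g, b, y, rho, a0, alpha):
    """Interest rate and depreciation implied by (3.8) for expenditure g and debt b."""
    theta = 1.0 - (g + rho * b) / (y + rho * a0)  # theta = 1 / (1 + alpha * i)
    if theta <= 0.0:
        raise ValueError("expenditure plus debt service exceeds y + rho * a0")
    i = (1.0 / theta - 1.0) / alpha
    depreciation = i - rho
    return i, depreciation


# Hypothetical parameters: high and low expenditure with the same initial debt.
i_f_h, e_f_h = flexible_rate(g=0.2, b=0.1, y=1.0, rho=0.05, a0=0.0, alpha=0.5)
i_f_l, e_f_l = flexible_rate(g=0.1, b=0.1, y=1.0, rho=0.05, a0=0.0, alpha=0.5)
```

With these values the high-expenditure rate exceeds the low-expenditure rate, and seigniorage i·m equals g + ρb in each case, which is the condition that keeps debt constant.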

3.2.4.2 Optimal interest rate path

We now show that the optimal exchange rate path before τ is a managed exchange rate with a depreciation
rate smaller than the one implied by a flexible exchange rate, and that after τ the optimal regime is a flexible
exchange rate. The problem is solved by backward induction.
We know that, after τ , expenditure is constant in each of the states of the world. By tax smoothing,
the optimal regime is one that implies a constant inflationary tax. The only constant rate of depreciation
consistent with the government’s transversality condition is a flexible exchange rate. Denote the government
debt at τ as bτ . Substituting in equation (3.8) we obtain the interest rate in each state of the world.

$$
\frac{1}{1 + \alpha i^1_h} = 1 - \frac{g_h + \rho b_\tau}{y + \rho a_0} \tag{3.9}
$$
$$
\frac{1}{1 + \alpha i^1_l} = 1 - \frac{g_l + \rho b_\tau}{y + \rho a_0} \tag{3.10}
$$

where $i^1_h$ is the interest rate consistent with the higher level of expenditure and $i^1_l$ is the one consistent with
the lower level of expenditure.
The second step is to solve for the interest rate before τ . Writing the Hamiltonian and optimizing we
obtain that the interest rate is constant prior to τ and that it satisfies the following constraint:

$$
i^1 = (1 - q)\, i^1_h + q\, i^1_l \tag{3.11}
$$

where $i^1$ is the interest rate on [0, τ]. Equation (3.11) comes from equating expected marginal utilities
of consumption before and after τ. Finally, we use the law of motion of debt to compute its value at time τ,
given $i^1$.
$$
b_\tau = b_0 + \frac{e^{\rho\tau} - 1}{\rho}\left[ g_h + \rho b_0 - (y + \rho a_0)\left[1 - \frac{1}{1 + \alpha i^1}\right]\right] \tag{3.12}
$$

Equations (3.9), (3.10), (3.11), and (3.12) constitute a system of four equations in four unknowns. The
solutions for the interest rate, debt, reserves and consumption are shown in figure 3.1.
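One way to solve the system (3.9) to (3.12) numerically is sketched below. The reduction to a single unknown (debt at τ) and the use of bisection are choices made here for illustration, not the method used in the text; the function name and the parameter values are hypothetical.

```python
import math


def solve_unconstrained(y, rho, a0, alpha, g_h, g_l, b0, q, tau):
    """Solve (3.9)-(3.12) for b_tau, i^1, i^1_h and i^1_l by bisection on b_tau."""
    Y = y + rho * a0
    growth = (math.exp(rho * tau) - 1.0) / rho

    def thetas(b_tau):
        th_h = 1.0 - (g_h + rho * b_tau) / Y            # (3.9)
        th_l = 1.0 - (g_l + rho * b_tau) / Y            # (3.10)
        th_1 = 1.0 / (q / th_l + (1.0 - q) / th_h)      # (3.11) written in 1/(1+alpha*i)
        return th_1, th_h, th_l

    def gap(b_tau):
        th_1, _, _ = thetas(b_tau)
        implied = b0 + growth * (g_h + rho * b0 - Y * (1.0 - th_1))  # (3.12)
        return implied - b_tau

    # Assumed bracket: gap > 0 at b0 and gap < 0 just below the debt level at
    # which seigniorage could no longer cover g_h plus interest.
    lo = b0
    hi = b0 + 0.999 * ((Y - g_h) / rho - b0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    b_tau = 0.5 * (lo + hi)

    def to_rate(theta):
        return (1.0 / theta - 1.0) / alpha

    th_1, th_h, th_l = thetas(b_tau)
    return {"b_tau": b_tau, "i_1": to_rate(th_1),
            "i_1h": to_rate(th_h), "i_1l": to_rate(th_l)}


# Hypothetical parameter values, for illustration only.
solution = solve_unconstrained(y=1.0, rho=0.05, a0=0.0, alpha=0.5,
                               g_h=0.2, g_l=0.1, b0=0.1, q=0.5, tau=2.0)
```

For these illustrative parameters, debt at τ exceeds initial debt and the pre-τ rate lies between the two post-τ rates.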

Proposition 2 Along the optimal path, the exchange rate depreciation between [0, τ ] is smaller than the one
implied by flexible exchange rate. Moreover, foreign debt is increasing or, equivalently, reserves are falling.

Proof. The proof is by contradiction. Suppose the proposition is false, so that $i^1 \ge i^f_h$. Substituting in
the intertemporal budget constraint of the government, we obtain $\dot{b}_t \le 0$, because the larger interest
rate implies a larger seigniorage. Thus, at time τ, total debt is no larger than the initial debt $b_0$. Then,
according to equations (3.8), (3.9), and (3.10), $i^1_h \le i^f_h$ and $i^1_l \le i^f_l$. However, by equation (3.11) the interest
rate $i^1$ is a weighted average of the interest rates after τ. In particular, it is always smaller than $i^1_h$,
which is no larger than $i^f_h$. But this is a contradiction.

The proposition states that a disinflation program is initiated even though expenditure has not been
adjusted. The disinflation causes an increase in debt due to the reduction in seigniorage. If the fiscal adjustment
fails, so the bad state of the world is realized, the new equilibrium depreciation rate is higher than the one
that would prevail if a flexible exchange rate were adopted in the first place. Ex-post, it looks as if the
country made a mistake initiating the stabilization program.
The intuition of the result is the following. The announcement of the fiscal reform conveys good news in
terms of future expected reductions in expenditure; the expected equivalent annuity of expenditure falls. By
the intertemporal budget constraint of the government the expected equivalent annuity of taxation should
fall too. Because inflation generates welfare costs the optimal path of inflation tax is to have a constant
expected rate of inflation. Thus a disinflation is initiated.10
Finally, note that there is no guarantee that reserves are positive in the bad state of the world. If τ or
the expected expenditure improvement are large enough, reserves can be negative, especially when the fiscal
adjustment does not take place.11 We return to this point in the next section.

3.2.4.3 Solution: Formal Derivation

Solving for consumption, money holdings, and the multiplier in the problem described in equation (3.1), we obtain
$$
c_t = \frac{1}{\lambda_0} \frac{1}{1 + \alpha i_t}
$$
$$
m_t = \frac{1}{\lambda_0} \frac{\alpha}{1 + \alpha i_t}
$$
Substituting in the intertemporal budget constraint, integrating and imposing the transversality condition
we obtain,
$$
\frac{1}{\lambda_0} = y + \rho a_0
$$
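If there is no uncertainty, the central bank’s problem is
$$
\begin{aligned}
\max_{\{i_t\}} \quad & E \int_0^{\infty} \ln\left(\frac{1}{1 + \alpha i_t}\right) e^{-\rho t} \, dt \\
\text{s.t.} \quad & \dot{b}_t = \rho b_t + g_t - (y + \rho a_0)\left(1 - \frac{1}{1 + \alpha i_t}\right) \\
& \lim_{t \to \infty} b_t e^{-\rho t} = 0
\end{aligned}
$$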

10 The model has additional implications that are well in line with the existing literature on exchange rate based stabilization programs. First, on impact, reserves go up and decrease thereafter. The reduction in the nominal interest rate implies an increase in demand for real balances, which is reflected in an increase in reserves on the implementation of the disinflation. Second, there is a consumption boom at the announcement of the reform. Third, the trade balance and the current account deteriorate. See Calvo (1986 and 1987), Calvo and Vegh (1993), Agenor and Montiel (1986), Rodriguez (1982).
11 The comparative statics are analyzed in the appendix. An increase in q unambiguously increases debt at τ and reduces current inflation. An increase in τ increases debt at time τ and increases current inflation.
3.3 Model with Debt Constraints: Balance of payments crisis 57

Define,
$$
\theta_t = \frac{1}{1 + \alpha i_t} \tag{3.13}
$$
then, the first order conditions are,
$$
\frac{1}{\theta_t} + (y + \rho a_0)\, \lambda_t = 0
$$
$$
\dot{\lambda}_t = 0
$$

This means that the optimal strategy for the government is to have constant inflation after τ, that is, to smooth the
inflationary tax. To determine the level of the multiplier we substitute in the budget constraint and impose
the transversality condition. For a constant expenditure level $g_i$, $i \in \{h, l\}$,
$$
\theta_i = 1 - \frac{g_i + \rho b_0}{y + \rho a_0}
$$
This means that the solution is a constant depreciation rate equal to the flexible exchange rate.
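The steps behind this result can be sketched as follows, assuming a current-value Hamiltonian formulation with constant expenditure g. The symbols H and k are introduced here only for illustration and do not appear in the original.

```latex
\begin{align*}
% Current-value Hamiltonian: control \theta_t, state b_t, costate \lambda_t
H &= \ln\theta_t + \lambda_t\left[\rho b_t + g - (y+\rho a_0)(1-\theta_t)\right] \\
% First order condition with respect to the control
\frac{\partial H}{\partial\theta_t} &= \frac{1}{\theta_t} + (y+\rho a_0)\,\lambda_t = 0 \\
% Costate equation: the discount rate equals the interest rate, so \lambda is constant
\dot{\lambda}_t &= \rho\lambda_t - \frac{\partial H}{\partial b_t} = \rho\lambda_t - \rho\lambda_t = 0 \\
% With \theta constant, the budget constraint is linear in b_t and integrates to
b_t &= \left(b_0 + \frac{k}{\rho}\right)e^{\rho t} - \frac{k}{\rho},
\qquad k \equiv g - (y+\rho a_0)(1-\theta) \\
% The transversality condition \lim b_t e^{-\rho t} = 0 requires b_0 + k/\rho = 0, hence
\theta &= 1 - \frac{g + \rho b_0}{y+\rho a_0}
\end{align*}
```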
Under fiscal uncertainty, we solve the problem by backward induction. Given that we know that without
fiscal uncertainty the solution is a constant interest rate, then after τ , there should be a constant inflation
rate consistent with a flexible exchange rate given the level of debt at τ . Formally,
$$
\theta^1_h = 1 - \frac{g_h + \rho b_\tau}{y + \rho a_0}, \quad \text{and} \quad \theta^1_l = 1 - \frac{g_l + \rho b_\tau}{y + \rho a_0} \tag{3.14}
$$
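Hence, the central bank’s problem becomes,
$$
\max_{\{\theta_t\}} \left[ \int_0^{\tau} \ln\theta_t \, e^{-\rho t} \, dt + e^{-\rho\tau}\, \frac{1}{\rho} \left( q \ln\theta^1_l + (1 - q) \ln\theta^1_h \right) \right]
$$
s.t.
$$
\dot{b}_t = \rho b_t + g_h - (y + \rho a_0)\left(1 - \frac{1}{1 + \alpha i_t}\right)
$$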

Writing the Hamiltonian and solving the first order conditions we find (as before) that the optimal interest rate
has to be constant on [0, τ]. Using the debt accumulation equation and substituting in the maximization
problem, the first order condition implies
$$
\frac{1}{\theta^1} = \frac{q}{\theta^1_l} + \frac{1 - q}{\theta^1_h} \tag{3.15}
$$

This condition says that the marginal utility of consumption before τ is equal to the expected marginal
utility of consumption after τ. The debt accumulation equation and equations (3.14) and (3.15) form a
system of four equations in four unknowns with the properties described in the text.

3.3 Model with Debt Constraints: Balance of payments crisis

In this section, we explore the optimal exchange rate policy when the country faces a constraint on the level
of reserves (or equivalently on its level of debt). We build heavily on the seminal literature
on balance of payments crises (see Krugman (1979), Flood and Garber (1986), Calvo (1986 and 1987)). The model
has, however, the additional implication that a disinflation program is initiated and optimal even though it
implies a positive probability of facing a crisis.

3.3.1 Optimal Monetary and Exchange Rate Policy

As should be clear from the previous section, there are parameters under which the optimal policy implies
negative reserves at τ. In these cases, if there exists a constraint on the level of reserves (assume that for
simplicity it is zero), the central bank is unable to implement the optimal unconstrained strategy. The
solution to the constrained optimization problem is then a corner solution. The optimal policy implies that
the central bank sets the interest rate to the minimum one that guarantees that at time τ, in any state of
the world, reserves are greater than or equal to the minimum. In other words, the central bank sets the interest
rate such that in the bad state of the world reserves are zero and the country faces a currency crisis.12
The balance of payments crisis occurs à la Krugman (1979) with the twist that here the timing is given by
the reform process, as opposed to the timing given by the fixed exchange rate and the process of the fiscal
deficit. In Krugman’s model, the engine of the crisis is an exogenous fiscal deficit under a fixed exchange
rate. Thus, the timing is determined by the necessity to finance the deficit entirely with reserves. In our
case, the expenditure process is exogenous, but not the fiscal deficit. The timing of the crisis is given by the
realization of not implementing the reform, and the inflation tax revenue (or equivalently the fiscal deficit)
adjusts to make the crisis rational at τ . In other words, the inflation tax is such that there is a fiscal deficit
financed by reserves that makes optimal a speculative attack at τ .

3.3.1.1 Solution: A heuristic approach

Assume the constraint is hit. We know that after τ reserves are zero in the bad state; therefore, by the balance
sheet of the central bank, domestic debt and money holdings are equal.
$$
b_\tau = m_\tau \;\Rightarrow\; b_\tau = \alpha \, \frac{y + \rho a_0}{1 + \alpha i^c_h}
$$
where $i^c_h$ stands for the interest rate when the level of expenditure is high and there is a constraint on the
level of reserves. The interest rate after τ also has to satisfy the transversality condition on the government
debt, so it is determined by equation (3.8). Solving for the maximum level of debt,
$$
\bar{b}_\tau = \frac{\alpha}{1 + \alpha\rho} \left(y + \rho a_0 - g_h\right) \tag{3.16}
$$

The interest rate prior to τ has to be consistent with a debt accumulation such that debt is equal to
equation (3.16) at time τ. Using the equation for debt accumulation we solve for the interest rate ($i^c$) prior
to τ.
$$
\frac{1}{1 + \alpha i^c} = \frac{1}{1 + \alpha i^f} + \frac{\alpha\rho}{(1 + \alpha\rho)\left(e^{\rho\tau} - 1\right)} \left[ 1 - \frac{g_h + \rho b_0 \left(1 + \frac{1}{\alpha\rho}\right)}{y + \rho a_0} \right] \tag{3.17}
$$

Proposition 3 The optimal path implies that a disinflation program is initiated at the announcement of the
fiscal reform. Most importantly, there is a positive probability of a balance of payments crisis.

Proof. $i^1$ implies an accumulation of debt that generates negative reserves, and we constructed $i^c$ to have
a lower rate of debt accumulation. Thus, $i^c > i^1$ by construction. To show that $i^f > i^c$ we follow the same
proof by contradiction as in Proposition 2, or proceed by inspection of equation (3.17).
Finally, the interest rate is computed such that the reserves reach their minimum in the case of not
adjusting the expenditure. This means, that there is a balance of payments crisis at τ that occurs with

12 When expenditure is not adjusted interest rates increase and reserves fall. Thus, if the constraint is binding it has to be

binding in the bad state of the world only.



probability equal to the probability that the bad state of the world is realized. In other words, when it is
known that the fiscal reform has failed there is a speculative attack.

Note that the proposition implies that a government initiates a disinflation program even though there is
a risk of a balance of payments crisis. Indeed, this result is robust to alternative formulations of preferences
and expenditure processes. The intuition is that the announcement of the fiscal reform conveys good news in
the future and the government wants to transfer part of those future benefits to today in the form of higher
real balances. The extent to which this transfer can be made is limited by the debt constraint. Therefore
there is no full smoothing of consumption and money holdings. It is always optimal, however, to transfer
some of those benefits to today.
Two remarks about the cost of a balance of payments crisis: First, in this model, the only cost of the
balance of payments crisis is the elimination of reserves and the lack of foreign credit; this is just the Sargent
& Wallace effect. Balance of payments crises, however, are likely to be more costly than this. Considering
additional costs does not change the qualitative implications of the model.
In particular, the proposition (almost) continues to be true if additional costs have to be paid after the
crisis occurs. The intuition is that the crisis is avoided with probability one if the interest rate implemented
is an ε larger than ic . Thus, in the case in which the costs of the balance of payments crisis are paid after
the speculative attack, the model predicts the same timing for the initiation of the disinflation program, and
a similar depreciation rate.
Second, a more realistic treatment of the cost of a balance of payments crisis would assume that the interest rate faced by the
government is a decreasing (convex and differentiable) function of the level of reserves. The disinflation will
be initiated at the announcement of the reform, but the size of the disinflation would be smaller. There is a
marginal benefit of reserves on top of its financing role that limits the extent of the depreciation.

3.3.1.2 Solution: Formal Derivation

We know that when the constraint is hit the expenditure is high. At that moment, money demand and total
debt are equal (bτ = mτ ). Moreover, the interest rate has to be one in which there is no change in the level
of debt, thus it is the flexible exchange rate. This system of equations uniquely determines the debt at τ ,
which is consistent with hitting the constraint.
$$
b^c_\tau = \frac{\alpha}{1 + \alpha\rho} \left(y + \rho a_0 - g_h\right) \tag{3.18}
$$
Thus, if the constraint is binding, then the interest rate on [0, τ] has to be one such that the debt
accumulated until time τ is equal to equation (3.18). Using the government’s debt law of motion, we can
solve for the interest rate,
$$
\theta^c \equiv 1 + \Psi_\tau \left[ \alpha\rho - \frac{g_h \left((1 + \alpha\rho)\, e^{\rho\tau} - 1\right) + \rho b_0 (1 + \alpha\rho)\, e^{\rho\tau}}{y + \rho a_0} \right] \tag{3.19}
$$
$$
\Psi_\tau \equiv \frac{1}{(1 + \alpha\rho)\left(e^{\rho\tau} - 1\right)}
$$
After some algebra, we obtain equation (3.17) in the text.
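A sketch of that algebra, assuming that $i^f$ in (3.17) denotes the flexible rate under high expenditure and initial debt $b_0$, and writing $Y \equiv y + \rho a_0$ as a shorthand introduced here:

```latex
\begin{align*}
% Split the two numerator terms of (3.19) so that a factor (1+\alpha\rho)(e^{\rho\tau}-1) appears
g_h\left((1+\alpha\rho)e^{\rho\tau}-1\right) &= g_h(1+\alpha\rho)\left(e^{\rho\tau}-1\right) + \alpha\rho\, g_h \\
\rho b_0(1+\alpha\rho)e^{\rho\tau} &= \rho b_0(1+\alpha\rho)\left(e^{\rho\tau}-1\right) + \rho b_0(1+\alpha\rho) \\
% Since \Psi_\tau (1+\alpha\rho)(e^{\rho\tau}-1) = 1, the first pieces collapse to the flexible rate
\theta^c &= 1 - \frac{g_h+\rho b_0}{Y}
  + \Psi_\tau\left[\alpha\rho - \frac{\alpha\rho\, g_h + \rho b_0(1+\alpha\rho)}{Y}\right] \\
% Factor \alpha\rho out of the bracket
&= \frac{1}{1+\alpha i^f}
  + \frac{\alpha\rho}{(1+\alpha\rho)\left(e^{\rho\tau}-1\right)}
    \left[1 - \frac{g_h + \rho b_0\left(1+\frac{1}{\alpha\rho}\right)}{Y}\right]
\end{align*}
```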
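The central bank’s problem is the Kuhn-Tucker problem where the policy is $\theta^c$ before τ if the constraint is
binding. Formally,
$$
\theta_t =
\begin{cases}
\theta^c & t < \tau \\[4pt]
\theta^c_h = 1 - \dfrac{g_h + \rho b^c_\tau}{y + \rho a_0} & g_t = g_h, \; t \ge \tau \\[8pt]
\theta^c_l = 1 - \dfrac{g_l + \rho b^c_\tau}{y + \rho a_0} & g_t = g_l, \; t \ge \tau
\end{cases}
$$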

Notice that by construction $b^c_\tau$ is smaller than the debt obtained in the optimal unconstrained strategy, thus
$\theta^c_h > \theta^1_h$ and $\theta^c_l > \theta^1_l$, but it is positive which implies that there is a disinflation program before τ.

3.4 Sequence of Stabilization Programs.

Usually fiscal reforms do not occur in isolation. Usually after a failed reform a new one is started. However,
the previous failure clearly is costly – in terms of the inflation rate. This pattern of one process of reform
after the other, all of them failing, and having a toll on the inflation rate, is a very common pattern in the
data. In this section, we show that this behavior is indeed the optimal policy. We show that the central bank
implements a sequence of stabilization programs, even though each time it is harder to reduce inflation and
it is more costly if the program fails.
To capture these dynamics we change our basic framework and assume that the government is continuously
trying to reduce expenditure: every time a fiscal reform fails, the government announces a new one. This
behavior should arise naturally from the assumption that expenditure is wasteful. A very simple way of
modelling this is to assume that expenditure follows a Poisson process, which implies that there is a fiscal
reform at every point in time with probability q dt of being successful.
$$
g_{t+dt} =
\begin{cases}
\begin{cases}
g_h & \text{w/p } 1 - q\,dt \\
g_l & \text{w/p } q\,dt
\end{cases} & \text{if } g_t = g_h \\[12pt]
g_l \quad \text{w/p } 1 & \text{if } g_t = g_l
\end{cases}
\tag{3.20}
$$
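Under the process in (3.20), the waiting time until the first successful reform is exponentially distributed with rate q, so the probability that expenditure is still high at time t is exp(−qt). The sketch below illustrates this with a small Monte Carlo simulation; the function names, the sample size and the seed are arbitrary choices made here.

```python
import math
import random


def prob_still_high(q, t):
    """Probability that no reform has succeeded by time t under (3.20)."""
    return math.exp(-q * t)


def simulate_still_high(q, t, n_paths=20000, seed=0):
    """Monte Carlo estimate of the same probability from exponential waiting times."""
    rng = random.Random(seed)
    survivors = sum(1 for _ in range(n_paths) if rng.expovariate(q) > t)
    return survivors / n_paths


# Hypothetical values: success intensity q = 0.2 per unit of time, horizon t = 5.
analytic = prob_still_high(0.2, 5.0)
estimate = simulate_still_high(0.2, 5.0)
```

The expected duration of the high-expenditure state is 1/q, so a lower q means a longer expected sequence of failed reforms.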

Define the high state as the one with a high level of expenditure, and the low state as the one with a low level
of expenditure. In an unconstrained economy, it is easy to show that the optimal policy is to implement a
continuum of disinflation programs. This is shown by proving that when expenditure is high, the optimal
nominal interest rate is always smaller than the one implied by a flexible exchange rate. To solve the problem
we define a value function in each of the states of the world.
$$
\rho V^l(b_t) = \max_{\theta_t} \left\{ \ln\theta_t + \left[\rho b_t + g_l - (y + \rho a_0)(1 - \theta_t)\right] V^l_b \right\} \tag{3.21}
$$
$$
\rho V^h(b_t) = \max_{\theta_t} \left\{ \ln\theta_t + \left[\rho b_t + g_h - (y + \rho a_0)(1 - \theta_t)\right] V^h_b + q\left[V^l - V^h\right] \right\}
$$

where V l is the value function when expenditure is low and V h is the value function when expenditure is
high (A formal solution is in appendix ??).

Proposition 4 If expenditure is high, the optimal strategy involves a rate of depreciation smaller than the
one implied by a flexible exchange rate, inflation and government debt increase with every unsuccessful fiscal
reform, and the optimal strategy approaches the flexible exchange rate at high levels of debt.
If expenditure is low, the optimal strategy is either a flexible or a fixed exchange rate regime. This is because
the optimal flexible regime is a constant exchange rate.

Expenditure follows equation (3.20). The unconstrained problem implies that there exist two value
functions: one for the low level of expenditure and one for the high level of expenditure. The Bellman
equations satisfy,
$$
\rho V^l(b_t) = \max_{\theta_t} \left\{ \ln\theta_t + \left[\rho b_t + g_l - (y + \rho a_0)(1 - \theta_t)\right] V^l_b \right\}
$$
$$
\rho V^h(b_t) = \max_{\theta_t} \left\{ \ln\theta_t + \left[\rho b_t + g_h - (y + \rho a_0)(1 - \theta_t)\right] V^h_b + q\left[V^l - V^h\right] \right\}
$$
3.4 Sequence of Stabilization Programs. 61

[Figure 3.1: plots not reproduced. Series shown: debt, interest_h, interest_l, reserves_h, reserves_l, consumption_h, consumption_l.]

FIGURE 3.1. Solution to the unconstrained economy.



The solution for the first one is the following: the first order condition and the envelope theorem equations
are,
$$
\frac{1}{\theta^l_t} + (y + \rho a_0)\, V^l_b = 0 \tag{3.22}
$$
$$
\left[\rho b_t + g_l - (y + \rho a_0)\left(1 - \theta^l_t\right)\right] V^l_{bb} = 0
$$

This means that the solution is the flexible exchange rate regime.
$$
\theta^l(b_t) = 1 - \frac{g_l + \rho b_t}{y + \rho a_0} \tag{3.23}
$$
Substituting in the Bellman equation it is possible to solve for the value function. Notice that the value
function is twice differentiable, decreasing and concave, and the interest rate policy function is increasing
and convex. Now, we solve the problem for the value function with high level of expenditure. The first order
condition and the envelope theorem imply,
$$
\frac{1}{\theta^h_t} + (y + \rho a_0)\, V^h_b = 0 \tag{3.24}
$$
$$
\left[\rho b_t + g_h - (y + \rho a_0)\left(1 - \theta^h_t\right)\right] V^h_{bb} = q\left[V^h_b - V^l_b\right] \tag{3.25}
$$

Proposition 5 $\forall\, b_t < \infty \;\Rightarrow\; \theta^h(b_t) < \theta^l(b_t)$

Proof. Let us first show that they cannot be equal, and then show that $\theta^h(b_t)$ cannot be larger than $\theta^l(b_t)$.
Assume $\theta^h(b_t) = \theta^l(b_t)$. Then equations (3.22) and (3.24) imply that $V^l_b = V^h_b$. Substituting in the right
hand side of equation (3.25) we obtain,
$$
\rho b_t + g_h - (y + \rho a_0)\left(1 - \theta^h_t\right) = 0
$$
which implies that the solution for $\theta^h(b_t)$ is,
$$
\theta^h(b_t) = 1 - \frac{g_h + \rho b_t}{y + \rho a_0} \ne \theta^l(b_t) \quad \forall\, b_t < \infty
$$
which is a contradiction for any finite level of debt.
Now assume $\theta^h(b_t) > \theta^l(b_t)$. In this case, equations (3.22) and (3.24) imply,
$$
(y + \rho a_0)\, V^h_b = -\frac{1}{\theta^h_t} > -\frac{1}{\theta^l_t} = (y + \rho a_0)\, V^l_b \;\Rightarrow\; V^h_b > V^l_b
$$
This implies that the right hand side of equation (3.25) is always positive. Given the properties of the value
function we know that $V^h_{bb}$ is negative. Thus this would imply that the term in brackets is negative.
$$
\rho b_t + g_h - (y + \rho a_0)\left(1 - \theta^h(b_t)\right) < 0
$$
Solving for $\theta^h(b_t)$
$$
\theta^h(b_t) < 1 - \frac{g_h + \rho b_t}{y + \rho a_0} < \theta^l(b_t)
$$
which is a contradiction. Therefore, $\theta^h(b_t) < \theta^l(b_t)$ for any finite level of debt.

Note that this proposition also implies that $V^h_b < V^l_b$. Thus the right hand side of equation (3.25) is
negative. We now show that the optimal solution implies a disinflation program when expenditure is high.

Proposition 6 $\forall\, b_t < \infty \;\Rightarrow\; \theta^h(b_t) > \theta^f(b_t)$

Proof. Remember that we define $\theta^f(b_t)$ as the solution to
$$
\rho b_t + g_h - (y + \rho a_0)\left(1 - \theta^f(b_t)\right) = 0
$$
Given the concavity of the value function and Proposition 5 ($V^h_b < V^l_b$) we know that,
$$
\rho b_t + g_h - (y + \rho a_0)\left(1 - \theta^h(b_t)\right) > 0
$$
Therefore, the optimal path implies a reduction in reserves (increasing debt) and $\theta^h(b_t) > \theta^f(b_t)$ for any
finite level of debt.

Substituting the definitions of θ and the value functions, we obtain the following differential equation for
the interest rate when expenditure is high.
$$
\left[\rho b_t + g_h - (y + \rho a_0)\left(1 - \frac{1}{1 + \alpha i^h}\right)\right] \frac{\partial i^h}{\partial b_t} = q\left[i^h - i^l\right] \tag{3.26}
$$
where $i^l(b_t)$ has a closed form solution from equation (3.23). The boundary condition for the differential
equation is,
$$
\lim_{b \to -\infty} i^f = \lim_{b \to -\infty} i^h = \lim_{b \to -\infty} i^l = -\frac{1}{\alpha}
$$

The solution is shown in figure 3.2. The schedule in the bottom is the interest rate when there is a low
level of expenditure. The schedule on the top is the interest rate implied by a flexible exchange rate when
expenditure is high. The schedule in the middle is the solution for the differential equation when expenditure
is high.
Now, assume that reserves have to be positive. This imposes a limit on the maximum level of debt given
by equation (3.18). The solution for the constrained economy implies the same differential equation as before
but with a different boundary condition. Formally,
$$
\left[\rho b_t + g_h - (y + \rho a_0)\left(1 - \frac{1}{1 + \alpha i^h}\right)\right] \frac{\partial i^h}{\partial b_t} = q\left[i^h - i^l\right]
$$
$$
i^h(\bar{b}) = \frac{1}{\alpha} \left( \frac{g_h + \rho \bar{b}}{y + \rho a_0 - g_h - \rho \bar{b}} \right)
$$
$$
\bar{b} = \frac{\alpha}{1 + \alpha\rho} \left(y + \rho a_0 - g_h\right)
$$
The properties of the solution are preserved. The numerical solution is shown in figure 3.2. We compare
the solution for the constrained and unconstrained economies. Notice that for low levels of debt the two
solutions behave similarly. However, when the crisis is close the interest rate starts increasing faster in the
constrained case.
The proposition implies that the optimal strategy when expenditure is high is a managed exchange rate
regime. Additionally, it implies that the larger the level of debt, the smaller the disinflation effort. In other
words, the difference between the optimal nominal interest rate and the one implied by a flexible exchange
rate is a decreasing function of debt.
The differential equations implied by equation (3.21) do not have a closed form solution, thus we solve them
numerically. The solution for the optimal policy when expenditure is high is shown in figure 3.2. Debt as a
percentage of GDP is measured on the x-axis, $i^l$ is the interest rate implied by a flexible exchange rate when
64 3. Balance of Payment Crises in a Simple Monetary Model

expenditure is low (the bottom schedule), if is the interest rate implied by flexible exchange rate when
expenditure is high (the top schedule), and ih is the solution of the differential equation when expenditure is
high. The interest rate is increasing with debt and is always smaller than the interest rate implied by flexible
exchange rate in the high state.
We now introduce the possibility of a balance of payments crisis. Similarly as in the previous section, the
maximum level of debt is given by equation (3.16). At this level of debt the optimal strategy is a flexible
exchange rate regime; thus we use this constraint as a boundary condition for the differential equation. After
substituting by the FOC, the differential equation is,
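\[
\begin{aligned}
\left[ \rho b_t + g_h - (y + \rho a_0) \left( 1 - \frac{1}{1 + \alpha i^h} \right) \right] \frac{\partial i^h}{\partial b_t} &= q \left[ i^h - i^l \right] \\
i^h(\bar{b}) &= \frac{1}{\alpha} \left( \frac{g_h + \rho \bar{b}}{y + \rho a_0 - g_h - \rho \bar{b}} \right) \\
\bar{b} &= \frac{\alpha}{1 + \alpha \rho} \left( y + \rho a_0 - g_h \right)
\end{aligned}
\]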

The solution is shown in figure 3.2, where the interest rate of the constrained economy is computed. Note
that for low levels of debt, the solutions for the constrained and unconstrained economy are similar. On the
other hand, when debt is increasing the constrained economy approaches the flexible exchange rate faster
than the unconstrained economy. Finally, when the maximum level of debt is reached, the regime changes
to a flexible exchange rate in the constrained economy. In other words, when reserves are zero there are no
possibilities of financing a reduction in inflation, other than implementing the fiscal reform.
Finally, observe that if the country is close to hitting the debt constraint, a foreign loan is welfare improving and
implies the immediate adoption of a disinflation program. To clarify the intuition, assume that the economy
has reached the maximum level of debt, so it has a flexible exchange rate. Let us interpret the debt level as net
of foreign help. This means that a loan from the IMF or the World Bank increases the debt capacity of the
country. In terms of our model, the economy jumps to the left in figure 3.2. Therefore, a more aggressive
disinflation program is initiated, real balances increase, and the consumer’s utility goes up.

3.5 Solution for the Money in the Utility model.

In this section we show that the main result of the paper can be obtained in a Money in the utility model.
The two most important results in the model are: First, the tax smoothing result that implies that optimal
interest rates are constant if government expenditure is constant. Second, that expected marginal utilities
are equalized at the time of reform.
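Assume the consumer derives utility from holding real balances and that its objective function is
\[
\begin{aligned}
\max_{\{c_t, m_t\}} \; & E \int_0^\infty U(c_t, m_t) \, e^{-\rho t} \, dt \\
\text{s.t.} \quad & \dot{a}_t = \rho a_t + y - c_t - i_t m_t \\
& \lim_{t \to \infty} a_t e^{-\rho t} = 0
\end{aligned}
\]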

[Figure 3.2: interest-rate schedules plotted against debt as a percentage of GDP. Plotted series: interest rate, low expenditure; interest rate, high expenditure; interest rate (flexible), high expenditure; interest rate, constrained case, high expenditure.]

FIGURE 3.2. Solution to the Poisson process.

The first order conditions are

\begin{align}
U_c &= \lambda_t \tag{3.27} \\
U_m &= i_t \lambda_t \tag{3.28} \\
\dot{\lambda}_t &= 0 \tag{3.29} \\
\dot{a}_t &= \rho a_t + y - c_t - i_t m_t \tag{3.30}
\end{align}

This implies that consumption is constant and that the total inflationary taxes paid satisfy
\[
\begin{aligned}
U_c &= \lambda_0 \\
c_0 &= y + \rho a_0 - \rho \int_0^\infty i_t m_t \, e^{-\rho t} \, dt
\end{aligned}
\]

Now assume that expenditure is constant, and let us show that the optimal solution is to have consumption,
monetary holdings, and the interest rate constant. The Central Bank maximizes the same utility function as
consumers, subject to its budget constraint and the solution of the consumer's problem expressed by equations (3.27)
to (3.30).
\[
\begin{aligned}
\max_{\{i_t\}} \; & E \int_0^\infty U(c_t, m_t) \, e^{-\rho t} \, dt \\
\text{s.t.} \quad & \dot{b}_t = \rho b_t + g - i_t m_t \\
& \text{Equations (3.27) to (3.30)} \\
& \lim_{t \to \infty} b_t e^{-\rho t} = 0
\end{aligned}
\]

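The Hamiltonian is
\[
H = U(c_t, m_t) + \mu_t \left( \rho b_t + g - i_t m_t \right)
\]
where the FOC are
\[
\begin{aligned}
U_c \frac{\partial c_t}{\partial i_t} + U_m \frac{\partial m_t}{\partial i_t} &= \mu_t \left( m_t + i_t \frac{\partial m_t}{\partial i_t} \right) \\
\dot{\mu}_t &= 0 \\
\dot{b}_t &= \rho b_t + g - i_t m_t
\end{aligned}
\]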

Using the government budget constraint we can show that the net present value of total expenditures has to
be equal to the net present value of total taxes. Thus
\[
0 = g + \rho b_0 - \rho \int_0^\infty i_t m_t \, e^{-\rho t} \, dt
\]

This implies that total consumption is constant and equal to
\[
c_0 = y + \rho a_0 - (g + \rho b_0) \tag{3.31}
\]
Note that equation (3.31) implies that consumption is independent of the path of taxes. This is because
there is Ricardian equivalence in this model: if taxes are reduced today, they will have to be recovered
in the future. This implies that $\partial c_t / \partial i_t = 0$.

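Substituting the solution of the consumer problem (3.28) in the FOC’s of the Central Bank we obtain
\[
\frac{i_t}{m_t} \frac{\partial m_t}{\partial i_t} = -\frac{\mu_0}{\mu_0 - \lambda_0}
\]
or in other words,
\[
m_t \frac{U_{mm}}{U_m} = -\left( 1 - \frac{\lambda_0}{\mu_0} \right)
\]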
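
The step from the first-order conditions to these two expressions can be spelled out as follows. This is only a sketch: it uses (3.28), the result that $\partial c_t / \partial i_t = 0$, and the constancy of λ and µ; the shorthand $m_i$ for $\partial m_t / \partial i_t$ is introduced here for brevity and is not part of the original notation.

```latex
% Shorthand (an assumption of this sketch): m_i := \partial m_t / \partial i_t.
\begin{align*}
&\text{Central Bank FOC with } \partial c_t / \partial i_t = 0:
  & U_m \, m_i &= \mu_0 \left( m_t + i_t \, m_i \right) \\
&\text{Substitute } U_m = i_t \lambda_0 \text{ from (3.28)}:
  & i_t \, m_i \, (\lambda_0 - \mu_0) &= \mu_0 \, m_t \\
&\text{Divide by } m_t (\lambda_0 - \mu_0):
  & \frac{i_t}{m_t} \, m_i &= -\frac{\mu_0}{\mu_0 - \lambda_0} \\
&\text{Differentiate (3.28) with respect to } i_t \text{ at constant } c_0:
  & U_{mm} \, m_i &= \lambda_0 \\
&\text{Combine the last two lines, using } i_t \lambda_0 = U_m:
  & m_t \frac{U_{mm}}{U_m} &= -\left( 1 - \frac{\lambda_0}{\mu_0} \right)
\end{align*}
```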

Therefore, if the demand for real balances is well behaved (monotonic), the optimal solution for the central
bank is to have a constant elasticity of substitution on the money holdings. For example, in a CES this
implies a unique money demand for each level of consumption. Given that the level of consumption is unique
due to equation (3.31), this implies that the interest rate is constant too. This proves the first part of the
results: if expenditure is constant and the utility function is well behaved (decreasing and monotonic demand
functions), the optimal monetary policy is to set a constant tax.
From the budget constraint of the government debt it is easy to show that the solution implies that

it mt = g + ρb0

The second result comes directly from the concavity of the utility function. The optimal monetary policy
equates the expected marginal utility before and after the resolution of the uncertainty. The reason is
that otherwise there would be a jump in the exchange rate that would have been anticipated. In order to avoid
it, the expected marginal utility after τ and the marginal utility before τ must be the same.
Given some assumptions on the utility function, we can show that increases in expenditure require increases
in the interest rate to raise the extra resources. This implies that after τ, if the fiscal reform is successful,
there is a decrease in the interest rate. Thus, the interest rate before τ has to be a weighted average of the
interest rates under high and low expenditure. Because in our setup the high expenditure after τ
is the same as the one that exists at time t = 0, the interest rate between 0 and τ is smaller
than the one that exists before t = 0 and smaller than the one that exists after τ if the reform is unsuccessful,
but bigger than the one that will prevail if the reform is successful.

4
Identification in Macroeconomics: Problem

The problem of identification in macroeconomics is one of the most studied issues in theoretical and applied
work. Problems of simultaneous equations, omitted variables, and errors in variables have motivated a large
literature in econometrics. In these notes, my objective is to describe how these problems affect the
estimation of macro-models and to study some of the new methodologies that have been developed to solve
them. We, as a profession, are still far from having a satisfactory answer, but we are clearly moving in the
right direction.
This chapter describes the three problems we are interested in analyzing. First, we discuss the biases
that arise in each of the cases and their properties. Second, we provide a reinterpretation of the biases by
relating the problem of recovering the “true” coefficients from the data (i.e., the lack of identification) to
these three problems. This puts all the problems within the same framework. The third section analyzes the
standard solutions that the literature has offered to these problems. The purpose of that section is to provide
a concise summary of the “favorite” techniques within a single framework. It is by no means intended as
a survey of the literature.

4.1 Problems and Biases

Economic data suffer from several problems. Indeed, if you have worked on empirical projects, the list of
problems you have faced seems infinite, and it probably is. There are problems of simultaneous equations, of
omitted variables, of aggregation, of noisy data, of truncated variables, etc. In these brief notes I would like
to concentrate on three main problems: simultaneous equations, omitted variables, and error-in-variables.
One reason is that I consider them the most important problems; another is that I find them the most
interesting ones.1

1 Lately, though, I have been working on aggregation issues, so the next version of these notes will probably claim that
aggregation is also crucial. In any case, the choice of topics reflects my preferences, and not the aggregate opinion of the profession.

4.1.1 Simultaneous equations

The problem of simultaneous equations is perhaps one of the most common issues we face in applied work.
In fact, it is the preferred objection of any referee who wants to protect his/her personal agenda and reject any
(of my) papers. In any case, it is also common in practical issues. For instance, the problem of estimating
the slope of the demand curve, when the researcher does not know whether the observed price-quantity pairs are the
result of shifts in the supply schedule or in the demand curve, is one of the benchmark models in most
econometrics classes. The problem is more general than this. For example, estimating the Central Bank
reaction function, the fiscal policy reaction function, savings and investment behavior, the linkages among
asset prices or among countries (contagion), the choice of education and wages, participation in the
labor market and taxes, the impact of the quality of institutions on income, or the Q theory are
just a few of the possible questions where endogeneity is a crucial issue.
In this sub-section, we study the general problem of simultaneous equations in the standard supply and
demand framework. Assume that we are interested in estimating the following relationship:

yt = αxt + εt (4.1)

For simplicity, let us assume that the two variables have mean zero (so there is no constant in the regression),
and that both are univariate with dimensions T × 1. It is well known that the OLS estimated coefficient
takes the form
\[
\hat{\alpha}_{OLS} = (x_t' x_t)^{-1} (x_t' y_t).
\]

The problem of simultaneous equations, however, is that the variable x also depends on y. Assume that
they satisfy the following relationship:
xt = βyt + η t . (4.2)

Equations (4.1) and (4.2) form a system of equations that is known as the structural model:

yt = αxt + εt (Structural Model)


xt = βyt + η t

where εt and η t are known as the structural shocks. In most macro applications the following moments are
usually assumed

Assumption 7 Assume that the structural errors have mean zero
\[
E(\varepsilon_t) = 0, \quad E(\eta_t) = 0,
\]
finite variance
\[
E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2,
\]
and are uncorrelated
\[
E(\varepsilon_t' \eta_t) = 0.
\]

The assumptions imply that unconditionally the errors have mean zero, that their variances are finite, and
that their covariance is zero. This last assumption is not required but in most macroeconomic models it is
used. The main reason is that we would like to be able to think of the structural shocks as innovations that
are economically meaningful, such as demand versus supply shocks, nominal versus real shocks, or permanent
versus transitory shocks. In general, it is easier to understand the implications of these shocks when they are
considered as independent or orthogonal. As will become clear below, we will relax this assumption. For the
moment, this assumption is innocuous to our discussion, and we keep it because it simplifies
the algebra tremendously.

Additionally, this covariance assumption implies that all the joint co-movement between the observed
variables (x and y) is explained by the endogenous coefficients (α and β) and not by the correlation in their
disturbances.2
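The structural model implies that the observed variables are given by what is known as the reduced form
\[
\begin{aligned}
y_t &= \frac{1}{1 - \alpha\beta} \left( \alpha \eta_t + \varepsilon_t \right) \qquad \text{(Reduced Form Model)} \\
x_t &= \frac{1}{1 - \alpha\beta} \left( \eta_t + \beta \varepsilon_t \right)
\end{aligned}
\]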
where, in order to assure that the observed variables have finite variance, we will impose the following
assumption:

Assumption 8 Assume that the structural parameters satisfy:

|αβ| < 1

Under assumptions (7) and (8) there are only three relevant moments that can be estimated in the sample:
the variance of y, the variance of x, and their covariance. If the distributions are not normal, higher
moments can also be estimated and are relevant, but we leave those issues for later, mainly because
identification becomes a much easier problem to solve when the distributions are not normal.
We would like to put ourselves in the toughest of all positions and discuss how to solve the problem
there. Furthermore, the assumption that the variables are normal, so that their sum is also normal, is a
standard assumption in macro applications. The moments are:
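\begin{align}
Var(y_t) &= \frac{1}{(1 - \alpha\beta)^2} \left[ \alpha^2 \sigma_\eta^2 + \sigma_\varepsilon^2 \right] \tag{4.3a} \\
Covar(x_t, y_t) &= \frac{1}{(1 - \alpha\beta)^2} \left[ \alpha \sigma_\eta^2 + \beta \sigma_\varepsilon^2 \right] \tag{4.3b} \\
Var(x_t) &= \frac{1}{(1 - \alpha\beta)^2} \left[ \sigma_\eta^2 + \beta^2 \sigma_\varepsilon^2 \right] \tag{4.3c}
\end{align}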

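The estimate from equation (4.1) is:
\begin{align}
\hat{\alpha}_{OLS} &= \frac{Covar(x_t, y_t)}{Var(x_t)} \notag \\
&= \alpha + \beta (1 - \alpha\beta) \frac{\sigma_\varepsilon^2}{\sigma_\eta^2 + \beta^2 \sigma_\varepsilon^2} \tag{4.4}
\end{align}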

Equation (4.4) shows that the OLS estimate has an additional term, which is the bias introduced by
simultaneous equations. There are some properties of this bias that are worth discussing. First, under the
assumption that |αβ| < 1 the sign of the bias is the sign of β. So, if x is a decreasing function of y then the
OLS estimate is smaller (downward biased) than the true one, while the converse occurs if the coefficient is
positive. Notice that nothing prevents the bias from reversing the sign of the coefficient. In other words, the fact
that α is positive (for instance) does not necessarily force the OLS estimate to be positive. We will see that
this is not the case with some of the other problems discussed below.

2 This does not imply that the disturbances are uncorrelated in most applications, but any such correlation can be
transformed into a setup similar to the one studied here.



Second, the bias is exactly zero if β is zero. Obviously, assuming that β is zero and that the covariance
of the structural shocks is also zero eliminates the problem of simultaneous equations: it is just
assuming the problem away.3 In any case, it is important to highlight this case because some of
the solutions that are widely used in the literature are in fact making this assumption.
Third, notice that there is another condition in which the bias goes to zero:
\[
\frac{\sigma_\eta^2}{\sigma_\varepsilon^2} \to \infty
\]
which can happen if the innovations to the first equation are zero ($\sigma_\varepsilon^2 = 0$) or if the innovations to the second
equation are infinitely large ($\sigma_\eta^2 \to \infty$), which is the case when the variables are integrated but they are also
cointegrated.
Finally, the bias is small (and goes toward zero) when $\sigma_\eta^2 \gg \sigma_\varepsilon^2$. This is known in the literature as near
identification and we will return to this issue in the next chapter.
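
The bias formula can be checked with a small simulation. The sketch below is only an illustration: the parameter values, sample size, and function names are assumptions chosen for this example, and the shocks are drawn as normal. It draws data from the reduced form and compares the OLS slope with the limit in equation (4.4).

```python
import numpy as np

def simulate_structural(alpha, beta, var_eps, var_eta, T, seed=0):
    """Draw T observations of (x, y) from the reduced form of the structural model."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, np.sqrt(var_eps), T)
    eta = rng.normal(0.0, np.sqrt(var_eta), T)
    y = (alpha * eta + eps) / (1.0 - alpha * beta)
    x = (eta + beta * eps) / (1.0 - alpha * beta)
    return x, y

def ols_slope(x, y):
    """OLS slope without a constant (both variables have mean zero)."""
    return float(x @ y / (x @ x))

def ols_limit(alpha, beta, var_eps, var_eta):
    """Large-sample value of the OLS slope, equation (4.4)."""
    return alpha + beta * (1.0 - alpha * beta) * var_eps / (var_eta + beta**2 * var_eps)

# Hypothetical values: a downward-sloping "demand" and an upward-sloping "supply".
alpha, beta = -0.5, 0.8
x, y = simulate_structural(alpha, beta, var_eps=1.0, var_eta=1.0, T=200_000)
alpha_hat = ols_slope(x, y)
alpha_lim = ols_limit(alpha, beta, 1.0, 1.0)
print(f"true alpha = {alpha}, OLS estimate = {alpha_hat:.3f}, limit from (4.4) = {alpha_lim:.3f}")
```

With these values the limit in (4.4) is positive even though the true α is negative, which illustrates the remark that the bias can reverse the sign of the coefficient.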

4.1.2 Omitted variables

Omitted variable bias is perhaps the second most important issue afflicting macro applied work. The fact
that it is almost impossible to control for all observables implies that in most of our specifications we always
have some degree of misspecification. Obviously this should not be considered as a justification to never do
applied work. On the contrary, as in the case of endogeneity, this problem should just make our claims less
ambitious.
One of the most important and most studied problems of omitted variables is the estimation of the returns to
one more year of schooling. The idea is that there exists an unobservable variable, the individual's ability,
that is correlated both with the decision to participate in the school system and with the salaries received.
It could be argued that an individual of higher ability would be willing to study more years, and for the
same level of education might receive a higher wage.
As before, we study a simplified model to highlight the problems of estimation. Assume that we are
interested in estimating the following relationship:

yt = αxt + εt

but in this case, the true model is the following:

yt = αxt + γzt + εt (Omitted Variable Model)


xt = zt + η t

where εt and η t are the structural shocks, and zt is an unobservable omitted variable. The following moments
are usually assumed

3 By the way, this is indeed very common: Denial is the most important source of happiness. If you have a problem, a solution
is to assume that you do not have one.

Assumption 9 Assume that the structural errors have mean zero
\[
E(\varepsilon_t) = 0, \quad E(\eta_t) = 0, \quad E(z_t) = 0,
\]
finite variance
\[
E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2, \quad E(z_t' z_t) = \sigma_z^2,
\]
and are uncorrelated
\[
E(\varepsilon_t' \eta_t) = 0, \quad E(\varepsilon_t' z_t) = 0, \quad E(\eta_t' z_t) = 0.
\]

Which are equivalent to the assumptions made before (Assumption 7). The reduced form is the following

yt = (α + γ) zt + αη t + εt
xt = zt + η t

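As before, there are only three relevant moments that can be estimated in the data:
\begin{align}
Var(y_t) &= (\alpha + \gamma)^2 \sigma_z^2 + \alpha^2 \sigma_\eta^2 + \sigma_\varepsilon^2 \tag{4.5a} \\
Covar(x_t, y_t) &= (\alpha + \gamma) \sigma_z^2 + \alpha \sigma_\eta^2 \tag{4.5b} \\
Var(x_t) &= \sigma_z^2 + \sigma_\eta^2 \tag{4.5c}
\end{align}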

The OLS estimate is:
\[
\hat{\alpha}_{OLS} = \alpha + \gamma \frac{\sigma_z^2}{\sigma_\eta^2 + \sigma_z^2} \tag{4.6}
\]

Equation (4.6) shows the bias introduced by omitted variables. Notice that in this case we have similar
remarks as the ones for the simultaneous equations problem. First, the sign of the bias is the sign of γ. As
before, nothing prevents the bias from reversing the sign of the coefficient, and if γ is zero, then the omitted
variable does not enter the y equation, and hence the bias disappears.
Second, if
\[
\frac{\sigma_\eta^2}{\sigma_z^2} \to \infty
\]
the bias goes to zero. Finally, the bias is small when $\sigma_\eta^2 \gg \sigma_z^2$, which is exactly the same condition as before
for near identification.
This parallel will continue to be present, and part of the purpose of this section is to show that
these different problems are all related in some form.
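
How quickly the omitted-variable bias shrinks as the variance ratio grows can be illustrated numerically. The following sketch is hedged in the same way as any simulation: the parameter values, the normal shocks, and the helper names `ov_bias` and `simulated_ols` are assumptions made for this illustration only.

```python
import numpy as np

def ov_bias(gamma, var_z, var_eta):
    """Large-sample omitted-variable bias implied by equation (4.6)."""
    return gamma * var_z / (var_eta + var_z)

def simulated_ols(alpha, gamma, var_z, var_eta, var_eps, T=200_000, seed=0):
    """OLS slope of y on x when the true model also contains the unobserved z."""
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, np.sqrt(var_z), T)
    eta = rng.normal(0.0, np.sqrt(var_eta), T)
    eps = rng.normal(0.0, np.sqrt(var_eps), T)
    x = z + eta
    y = alpha * x + gamma * z + eps
    return float(x @ y / (x @ x))

alpha, gamma = 1.0, 0.5          # hypothetical coefficients
ratios = (1.0, 10.0, 100.0)      # var_eta / var_z with var_z fixed at one
simulated = [simulated_ols(alpha, gamma, 1.0, r, 1.0) - alpha for r in ratios]
theoretical = [ov_bias(gamma, 1.0, r) for r in ratios]
for r, s, t in zip(ratios, simulated, theoretical):
    print(f"var_eta/var_z = {r:6.1f}: simulated bias = {s:.4f}, bias from (4.6) = {t:.4f}")
```

As the ratio of the two variances grows, both columns approach zero, which is the near-identification case described in the text.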

4.1.3 Error-in-variables

Finally, let us study the problem of errors in variables. Assume we are interested in estimating the exact same
relationship but that the true model is

yt = αx∗t + εt (Error-in-variables Model)


xt = x∗t + η t

where x∗t is the true variable, but one that cannot be observed. We only observe a noisy and unbiased measure
of it (xt ). As before, εt and η t are the structural shocks, and the following moments are usually assumed:

Assumption 10 Assume that the structural errors have mean zero
\[
E(\varepsilon_t) = 0, \quad E(\eta_t) = 0, \quad E(x_t^*) = 0,
\]
finite variance
\[
E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2, \quad E(x_t^{*\prime} x_t^*) = \sigma_{x^*}^2,
\]
and are uncorrelated
\[
E(\varepsilon_t' \eta_t) = 0, \quad E(\varepsilon_t' x_t^*) = 0, \quad E(\eta_t' x_t^*) = 0.
\]

These are the conditions that make this a “classical” error-in-variables problem. The non-classical error-in-variables
problem produces different implications from the ones discussed here. These are important extensions, but
beyond our scope.
Assumption 10 is equivalent to the assumptions made in the previous two sub-sections. As before, there
are only three relevant moments that can be estimated in the data:

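\begin{align}
Var(y_t) &= \alpha^2 \sigma_{x^*}^2 + \sigma_\varepsilon^2 \tag{4.7a} \\
Covar(x_t, y_t) &= \alpha \sigma_{x^*}^2 \tag{4.7b} \\
Var(x_t) &= \sigma_{x^*}^2 + \sigma_\eta^2 \tag{4.7c}
\end{align}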

The OLS estimate is:
\[
\hat{\alpha}_{OLS} = \alpha - \alpha \frac{\sigma_\eta^2}{\sigma_{x^*}^2 + \sigma_\eta^2} \tag{4.8}
\]

Equation (4.8) shows the bias introduced by error-in-variables. Although the form of equation (4.8) is
similar to (4.6), their properties are not exactly the same. First, the sign of the bias depends on the coefficient
in the equation to be estimated: the bias is negative if the coefficient is positive, and
positive if the coefficient is negative. Second, because the ratio of the variances in the right term
is always smaller than one, the bias is always smaller than α in absolute terms. This implies that the
bias (in this case) can never change the sign of the coefficient: the bias moves the coefficient toward
zero but never reaches it.4
The only circumstance in which the bias is zero is when5
\[
\frac{\sigma_{x^*}^2}{\sigma_\eta^2} \to \infty.
\]
Finally, as before, the bias is small when $\sigma_{x^*}^2 \gg \sigma_\eta^2$, which is exactly the same condition implied by near
identification.
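
The attenuation result can also be illustrated by simulation. This is a minimal sketch under assumed values: the coefficients, variances, sample size, normal shocks, and the helper names `eiv_limit` and `simulated_ols` are all hypothetical choices for this example.

```python
import numpy as np

def eiv_limit(alpha, var_xstar, var_eta):
    """Large-sample OLS slope under classical measurement error, equation (4.8)."""
    return alpha - alpha * var_eta / (var_xstar + var_eta)

def simulated_ols(alpha, var_xstar, var_eta, var_eps, T=200_000, seed=0):
    """OLS slope of y on the noisy measure x = x_star + eta."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(0.0, np.sqrt(var_xstar), T)
    eta = rng.normal(0.0, np.sqrt(var_eta), T)
    eps = rng.normal(0.0, np.sqrt(var_eps), T)
    x = x_star + eta
    y = alpha * x_star + eps
    return float(x @ y / (x @ x))

# One positive and one negative coefficient: both estimates shrink toward zero.
estimates = {a: simulated_ols(a, var_xstar=1.0, var_eta=0.5, var_eps=1.0) for a in (2.0, -2.0)}
for a, est in estimates.items():
    print(f"true alpha = {a:+.1f}, OLS estimate = {est:+.3f}, limit from (4.8) = {eiv_limit(a, 1.0, 0.5):+.3f}")
```

In both cases the estimate keeps the sign of the true coefficient but is smaller in absolute value, as the text argues.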

4.2 Lack of Identification

The previous section has discussed the biases introduced by the three problems under analysis. There is a
deeper issue that we would like to highlight in this section: the lack of identification.
The previous section says that OLS estimates are biased, which just tells us that OLS is not the
appropriate estimation technique. If there were a different procedure that allowed us
to recover the true coefficients from the data, then the issues highlighted in the previous section would be just a
curiosity, and mostly a warning: “Beware of OLS”. But this is not the case. The main problem when any
of these three problems is present in the data is that the true coefficients cannot be recovered from the data
with any procedure without further assumptions. This means that there does not exist a single technique
or methodology that could help us solve the problem of estimating equation (4.1).
This is easily seen by counting the number of equations, or moments, that can be computed in the data,
and by comparing it to the number of parameters that describe them. The way we have set up all three
problems, there are only three moments that can be computed in the data: the variance of y, the variance
of x, and their covariance. These moments are given in equations (4.3), (4.5), and (4.7) for the cases of
simultaneous equations, omitted variables, and error-in-variables problems, respectively.

4 This is known in the earlier literature as the “iron law” of economics. See Hausman (19XX). Obviously this is the case
in the linear bivariate setting (as the one described here). If the model is non-linear or there are more regressors then the bias
can go in any direction.
5 There is another circumstance: α = 0, but this is not an interesting case.

In the case of simultaneous equations there are four coefficients: α, β, $\sigma_\varepsilon^2$, and $\sigma_\eta^2$, that is, three equations and
four unknowns. For omitted variables we have five parameters: α, γ, $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_z^2$, that is, three equations and
five unknowns. Finally, for the error-in-variables problem we have four parameters: α, $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_{x^*}^2$, that is,
three equations and four unknowns. In all three problems the number of coefficients or parameters to be
estimated is larger than the number of equations. Furthermore, not only is the number of equations smaller
than the number of unknowns, but there is no linear or non-linear combination of the equations that can
solve for any of the parameters, and especially the parameter of interest, α.
Therefore, without further assumptions there is a continuum of solutions that satisfy the sample moments.
In other words, we cannot recover the true parameters from the data, which is known as an identification
problem.
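
The continuum of solutions can be made concrete for the simultaneous equations case. The sketch below is an illustration under assumed parameter values: it computes the three moments in (4.3) for one structure and then, for an arbitrary alternative value of α, constructs a different β and different shock variances that reproduce exactly the same moments with uncorrelated shocks. The function names and the particular way of building the alternative structure are choices made for this example, not part of the original notes.

```python
import numpy as np

def reduced_form_moments(alpha, beta, var_eps, var_eta):
    """Var(y), Covar(x, y), Var(x) from equations (4.3a)-(4.3c)."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    var_y = scale * (alpha**2 * var_eta + var_eps)
    cov_xy = scale * (alpha * var_eta + beta * var_eps)
    var_x = scale * (var_eta + beta**2 * var_eps)
    return var_y, cov_xy, var_x

def equivalent_structure(var_y, cov_xy, var_x, alpha_new):
    """For an arbitrary alpha_new, return (beta, var_eps, var_eta) matching the moments.

    beta is chosen so that eps = y - alpha_new * x and eta = x - beta * y are uncorrelated.
    """
    beta_new = (cov_xy - alpha_new * var_x) / (var_y - alpha_new * cov_xy)
    var_eps = var_y - 2.0 * alpha_new * cov_xy + alpha_new**2 * var_x
    var_eta = var_x - 2.0 * beta_new * cov_xy + beta_new**2 * var_y
    return beta_new, var_eps, var_eta

# Hypothetical "true" structure and the moments it generates.
target = reduced_form_moments(alpha=-0.5, beta=0.8, var_eps=1.0, var_eta=1.0)

# Three different values of alpha, all consistent with the same three moments.
candidates = {}
for alpha_new in (-0.5, 0.0, 0.3):
    beta_new, var_eps, var_eta = equivalent_structure(*target, alpha_new)
    candidates[alpha_new] = (beta_new, var_eps, var_eta)
    implied = reduced_form_moments(alpha_new, beta_new, var_eps, var_eta)
    print(alpha_new, np.round(candidates[alpha_new], 4), np.allclose(implied, target))
```

Every candidate reproduces the data moments, so the data alone cannot tell the candidates apart; additional assumptions are needed to pick one.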

4.2.1 General set-up

The problem of identification described before can be generalized. In this section we discuss the exact same
issues and we introduce the standard terminology of systems of equations, in particular the rank and order
conditions. We will come back several times to these concepts and therefore this is a good time to develop
them.
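Assume the model to be estimated is
\[
\begin{aligned}
y_t &= \alpha x_t + \varepsilon_t \\
E(\varepsilon_t' x_t) &\neq 0
\end{aligned}
\]
In this model, the OLS estimate is given by
\[
\hat{\alpha}_{OLS} = \alpha + \frac{E(\varepsilon_t' x_t)}{Var(x_t)}.
\]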
This again indicates that the bias comes from the fact that the right-hand-side variable is
correlated with the residual.
In this model, the identification problem is due to the same aspect as in the previous examples. In the data
we can only compute three moments, $Var(y_t)$, $Var(x_t)$, and $Covar(y_t, x_t)$, but there are four parameters: α,
$\sigma_\varepsilon^2$, $Var(x_t)$, and $E(\varepsilon_t' x_t)$.
In the standard literature on systems of equations, when the number of equations is smaller than the
number of unknowns it is said that the system does not satisfy the order condition. As should
be expected, a system of equations where the order condition is not satisfied has no hope of actually recovering
the parameters without further assumptions or equations.
It is important to highlight that the fact that the order condition is satisfied does not guarantee that
the system of equations has a solution. In other words, we can have enough equations, but they may not be
independent. The independence of the equations is a condition known as the rank condition. It states that the
number of independent equations has to be at least as large as the number of unknowns. The name “rank” condition
comes from the literature on linear systems of equations, where the independence of the equations is computed
using the rank of the matrix describing the system. Most of the systems of equations we face when
estimating parameters involve non-linear relationships, and checking their independence is much harder than
just calculating the column rank of a matrix. Nevertheless, the econometric literature has adopted these definitions
since the seminal contribution by (Fisher 1976).

The purpose of this section is to show (and try to convince the reader) that all these problems can
be described as part of a more general issue in which the number of coefficients that have to be estimated is
larger than the number of equations or moments that can be computed in the data. The next section deals
with the methods that have been proposed in the literature to solve this problem.

4.3 Standard solutions

In this section, we study the standard methods that have been proposed to solve the problem of identification.
Because most of the analysis can be done in the simultaneous equations set up we concentrate entirely on
this framework.
To clarify the intuition, let us remember the set up that we have been using. Consider the following
standard problem of simultaneous equations:

yt = αxt + εt , (4.9)
xt = βyt + η t , (4.10)

where (4.9) is the demand equation, (4.10) is the supply equation, $y_t$ and $x_t$ are the observed price and
quantity, and $\varepsilon_t$ and $\eta_t$ are the structural shocks. The parameters of interest are α, β, and the variances of
the shocks: $\sigma_\varepsilon^2$, $\sigma_\eta^2$. For the moment, assume that the structural shocks are not correlated: $\sigma_{\varepsilon\eta} = 0$. This
assumption is relaxed below.
It is well known that if α and β are different from zero, equations (4.9) and (4.10) cannot be consistently
estimated without further information. Actually, one can only estimate the covariance matrix of the reduced
form ($\hat{\Omega}$) given by
\[
\hat{\Omega} = \frac{1}{(1 - \alpha\beta)^2}
\begin{bmatrix}
\alpha^2 \sigma_\eta^2 + \sigma_\varepsilon^2 & \alpha \sigma_\eta^2 + \beta \sigma_\varepsilon^2 \\
\cdot & \sigma_\eta^2 + \beta^2 \sigma_\varepsilon^2
\end{bmatrix}.
\]
The problem of identification is that the covariance matrix only provides three moments (the variance of $y_t$,
the variance of $x_t$, and the covariance between $y_t$ and $x_t$) while there are four unknowns: α, β, $\sigma_\eta^2$, $\sigma_\varepsilon^2$.
The literature has solved the problem of identification by imposing additional parameter constraints. This
amounts to creating or assuming additional equations for the system we are studying. These restrictions
can be divided into the following classes: parameter restrictions, variance restrictions, sign restrictions,
and reverse regressions. In this section we summarize the implications of these assumptions.
The objective of this section, therefore, is to describe briefly most of the assumptions that have been used
in the literature. This is by no means intended as an exhaustive survey; it is just a summary of some of
the most used techniques. As will become clear, I will oversimplify what each of the methodologies does, and
indeed, I will present a critical perspective on all of them. It is important to mention that, even though I will
address the methods through this “critical lens”, these assumptions have proven to be extremely
useful in applied work. We have learned a great deal by using them, and several of the agreements we have
in the profession are the outcome of empirical studies using one or more of these techniques. There are other
economic problems, however, in which none of them can be rationalized and we are still in search of the
answers.

4.3.1 Parameter Restrictions

By far the most extensively used assumption in the literature is parameter restrictions in the form
of exclusion restrictions or long run restrictions. For instance, (i) when we estimate VARs and compute a
Cholesky decomposition to estimate the structural equations, we are indeed using an exclusion restriction
that is implied by the ordering in the VAR; (ii) when we solve the problem by using instrumental variables,
we are imposing an exclusion restriction; (iii) when we solve the problem of error-in-variables by using lags,
we are using an exclusion restriction, etc.

4.3.1.1 Exclusion Restrictions

4.3.1.1.1 Contemporaneous coefficients: Assuming the problem away

The first type of exclusion restriction is one in which we assume that either β = 0 or α = 0. In my view this
is just assuming the problem of endogeneity or omitted variables away. Put like this, the assumption
does not sound that reasonable, does it? But this is exactly the implicit assumption that we are making
when we use the triangular decomposition (the Cholesky decomposition) in a VAR! This is exactly the
assumption implied when we claim that a certain variable is a valid instrument, etc.
The assumption indicated here implies that (let us concentrate on β = 0)
\[
\begin{aligned}
y_t &= \alpha x_t + \varepsilon_t, \\
x_t &= \eta_t,
\end{aligned}
\]
which implies that $x_t$ is orthogonal to $\varepsilon_t$ and we can run OLS in the first equation to recover the true
coefficient.
For the multivariate setup the assumptions are very similar. Assume that there are N endogenous variables,
and that the contemporaneous relationship is described by the matrix A.
\[
A X_t = \varepsilon_t
\]
where $\varepsilon_t$ are the structural shocks, assumed to be uncorrelated and with covariance matrix Σ, and where A
is a dense matrix with ones on the diagonal. For example, for the bivariate case, the matrix is
\[
A = \begin{bmatrix} 1 & -\alpha \\ -\beta & 1 \end{bmatrix}
\]
and the structural shocks covariance matrix is
\[
\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 & 0 \\ 0 & \sigma_\eta^2 \end{bmatrix}
\]

Returning to the multivariate setup, the reduced form model is
\[
X_t = A^{-1} \varepsilon_t .
\]
Notice that in this model we can compute the covariance matrix of the observed variables ($X_t$), which
contributes N(N + 1)/2 moments. These moments are explained by the theoretical covariance matrix
given by $A^{-1} \Sigma A^{-1\prime}$, which has N variances (the elements of Σ) and N(N − 1) parameters (the off-diagonal
elements of A; remember that the diagonal is set equal to one). Clearly, there are more unknowns than equations,
which is the identification problem we have been discussing all along. The solution requires several exclusion
restrictions (r); let us see how many:
\[
\begin{aligned}
\frac{N(N + 1)}{2} + r &\geq N + N(N - 1) = N^2 \\
r &\geq \frac{N(N - 1)}{2}
\end{aligned}
\]
Observe that we need to impose as many exclusion restrictions as are needed to guarantee that A has
zeros in the lower (or upper) triangle. The Cholesky decomposition does exactly that.
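
The link between the triangular restriction and the Cholesky decomposition can be sketched for the bivariate case with β = 0. The example is only an illustration: the parameter values are hypothetical, the variables are ordered as (x, y) so that the unrestricted variable comes second, and population moments are used instead of an estimated covariance matrix.

```python
import numpy as np

def restrictions_needed(n):
    """Minimum number of exclusion restrictions, r >= N(N - 1)/2."""
    return n * (n - 1) // 2

# Hypothetical structure with beta = 0: x_t = eta_t and y_t = alpha * x_t + eps_t.
alpha, var_eps, var_eta = -0.5, 1.0, 2.0

# With the ordering X_t = (x_t, y_t)', the structural matrix is lower triangular.
A = np.array([[1.0, 0.0],
              [-alpha, 1.0]])
Sigma = np.diag([var_eta, var_eps])

# Covariance matrix of the observed variables implied by the structure.
A_inv = np.linalg.inv(A)
Omega = A_inv @ Sigma @ A_inv.T

# The Cholesky factor is lower triangular with Omega = L @ L.T.
L = np.linalg.cholesky(Omega)
alpha_chol = L[1, 0] / L[0, 0]      # contemporaneous effect of x on y
var_eta_chol = L[0, 0] ** 2         # variance of the first structural shock
var_eps_chol = L[1, 1] ** 2         # variance of the second structural shock

print(restrictions_needed(2), alpha_chol, var_eta_chol, var_eps_chol)
```

The factorization recovers the structural parameters only because the assumed zero restriction is true in this example; if β were not zero, the same computation would still return numbers, but they would not be the structural coefficients.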

4.3.1.1.2 Exogenous Variables: Indirect Least Squares

A different set of exclusion restrictions appears when the excluded variable is exogenous rather than
endogenous. Assume the model is the following:

yt = αxt + πwt + εt ,
xt = βyt + γwt + η t ,

where wt is observed. We still make the same Assumption 7 in addition to

Assumption 11 Assume that the observed variable (wt) has mean zero

$$E(w_t) = 0,$$

finite variance

$$E(w_t' w_t) = \sigma^2_w,$$

and is uncorrelated with all the other shocks

$$E(w_t' \varepsilon_t) = 0, \qquad E(w_t' \eta_t) = 0.$$

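The zero-mean assumption is innocuous. We need it in this setup to ensure that we do not have to
estimate a constant term. This is obviously a simplification. The reduced form model is
$$\begin{aligned}
y_t &= \frac{1}{1-\alpha\beta}\left((\alpha\gamma + \pi)\, w_t + \varepsilon_t + \alpha\eta_t\right) \\
x_t &= \frac{1}{1-\alpha\beta}\left((\gamma + \beta\pi)\, w_t + \beta\varepsilon_t + \eta_t\right)
\end{aligned}$$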
Although in this model we can compute six moments (three variances, one for each of the observable variables,
and three covariances), there are seven unknowns: three variances ($\sigma^2_\varepsilon$, $\sigma^2_\eta$, and $\sigma^2_w$) and four coefficients (α,
β, γ, and π). Furthermore, there is no way of recovering even some of the coefficients.
However, it is easy to show that one exclusion restriction is enough to solve the problem of identification.

Assumption 12 Assume that π = 0.

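In this case, we are assuming that the variable wt enters the second equation but does not enter the first
one. The reduced form is
$$\begin{aligned}
y_t &= \frac{1}{1-\alpha\beta}\left(\alpha\gamma w_t + \varepsilon_t + \alpha\eta_t\right) \\
x_t &= \frac{1}{1-\alpha\beta}\left(\gamma w_t + \beta\varepsilon_t + \eta_t\right)
\end{aligned}$$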
Notice that the ratio between the coefficients on the exogenous variable identifies α. The regression coefficient
of yt on wt is $\frac{\alpha\gamma}{1-\alpha\beta}$, while the coefficient of xt on wt is $\frac{\gamma}{1-\alpha\beta}$. The ratio is exactly α. This methodology was
developed by (Haavelmo 1947).
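
As a quick numerical check of the ratio argument, the sketch below simulates the reduced form under Assumption 12 and compares indirect least squares with OLS. The parameter values, sample size, and normal shocks are illustrative assumptions, not part of the original exposition.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000

# Illustrative parameter values (assumptions); pi = 0 by Assumption 12.
alpha, beta, gamma = 0.5, -0.4, 1.0

# Exogenous variable and mutually uncorrelated structural shocks.
w = rng.normal(size=T)
eps = rng.normal(size=T)
eta = rng.normal(size=T)

# Reduced form of the two-equation system.
y = (alpha * gamma * w + eps + alpha * eta) / (1 - alpha * beta)
x = (gamma * w + beta * eps + eta) / (1 - alpha * beta)


def slope(dep, reg):
    """OLS slope without a constant (all variables have mean zero)."""
    return (reg @ dep) / (reg @ reg)


alpha_ils = slope(y, w) / slope(x, w)  # ratio of the two reduced form slopes
alpha_ols = slope(y, x)                # contaminated by simultaneity

print(f"ILS: {alpha_ils:.3f}  OLS: {alpha_ols:.3f}  true: {alpha}")
```

In this design the ILS ratio should settle near the true α, while the OLS slope of yt on xt should not.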

4.3.1.1.3 Instrumental Variables

Instrumental variables is similar to the indirect least squares approach we have seen, but the required assumptions are
weaker, which also explains why instrumental variables has been used so much in the literature. The setup
is the following:

yt = αxt + πwt + εt ,
xt = βyt + γwt + η t ,
where wt is observed and from now on will be denoted as the “instrument”. We change Assumption 7 to the
following:

Assumption 13 Assume that the shocks and the observed variable (wt) have mean zero
$$E(\varepsilon_t) = 0, \qquad E(\eta_t) = 0, \qquad E(w_t) = 0,$$

finite variance
$$E(\varepsilon_t' \varepsilon_t) = \sigma^2_\varepsilon, \qquad E(\eta_t' \eta_t) = \sigma^2_\eta, \qquad E(w_t' w_t) = \sigma^2_w,$$
and the instrument is uncorrelated with the residual in the first equation:

$$E(w_t' \varepsilon_t) = 0.$$

Lemma 14 The coefficient α can be estimated consistently if and only if the shocks satisfy Assumption 13,
and the exclusion restriction
π = 0,
is imposed. Furthermore, one of the possible ways to estimate α is the following:
$$\hat{\alpha}_{IV} = (w_t' x_t)^{-1} (w_t' y_t)$$

Notice that in this case the structural shocks are not required to be uncorrelated: $E(\eta_t' \varepsilon_t) \neq 0$ is allowed. Moreover,
the instrument can be correlated with the residuals in the second equation, $E(w_t' \eta_t) \neq 0$. In these circumstances,
even though there are fewer equations than unknowns, we can still solve the problem of estimating α.
This is the beauty of the instrumental variables approach.
First, let's make clear that the number of equations is in principle not enough to solve the problem. This
means that even though one of the coefficients is identified, the other coefficients cannot be recovered without
further assumptions. Second, we derive the instrumental variable estimates. The reduced form is
$$\begin{aligned}
y_t &= \frac{1}{1-\alpha\beta}\left(\alpha\gamma w_t + \varepsilon_t + \alpha\eta_t\right) \\
x_t &= \frac{1}{1-\alpha\beta}\left(\gamma w_t + \beta\varepsilon_t + \eta_t\right)
\end{aligned}$$
In this model there are six moments that can be computed in the sample, but there are eight unknowns:
three variances ($\sigma^2_\varepsilon$, $\sigma^2_\eta$, and $\sigma^2_w$), three coefficients (α, β, and γ), and two covariances of the
structural shocks ($E(\eta_t' \varepsilon_t)$ and $E(w_t' \eta_t)$). This means that not all the coefficients can be recovered from the
data.
However, the amazing implication of instrumental variables is that even though this system is underidentified,
in the sense that the total number of equations is smaller than the total number of unknowns,
one of the parameters (actually the parameter of interest) can still be recovered from the moments.
Notice that
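$$\begin{aligned}
\operatorname{plim} \frac{1}{T} w_t' x_t &= \frac{\gamma}{1-\alpha\beta}\,\sigma^2_w + \frac{\beta}{1-\alpha\beta} \operatorname{plim} \frac{1}{T} w_t' \varepsilon_t + \frac{1}{1-\alpha\beta} \operatorname{plim} \frac{1}{T} w_t' \eta_t \\
&= \frac{1}{1-\alpha\beta} \left( \gamma\sigma^2_w + \operatorname{plim} \frac{1}{T} w_t' \eta_t \right)
\end{aligned}$$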

and
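$$\begin{aligned}
\operatorname{plim} \frac{1}{T} w_t' y_t &= \frac{\alpha\gamma}{1-\alpha\beta}\,\sigma^2_w + \frac{1}{1-\alpha\beta} \operatorname{plim} \frac{1}{T} w_t' \varepsilon_t + \frac{\alpha}{1-\alpha\beta} \operatorname{plim} \frac{1}{T} w_t' \eta_t \\
&= \frac{\alpha}{1-\alpha\beta} \left( \gamma\sigma^2_w + \operatorname{plim} \frac{1}{T} w_t' \eta_t \right)
\end{aligned}$$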

which means that even when $\operatorname{plim} \frac{1}{T} w_t' \eta_t \neq 0$, the ratio between these two plims is still α.⁶ These
assumptions are much weaker than the ones required by ILS; no wonder IV made such an incredible
impact in our profession while the impact of ILS has been significantly smaller.
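
The point that neither $E(\eta_t'\varepsilon_t) = 0$ nor $E(w_t'\eta_t) = 0$ is needed can be illustrated with a small simulation. This is only a sketch: the coefficients, the way the correlations are generated, and the normal shocks are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000

# Illustrative parameter values (assumptions); pi = 0 as in Lemma 14.
alpha, beta, gamma = 0.5, 0.8, 1.0

w = rng.normal(size=T)
eps = rng.normal(size=T)  # independent of w, so E(w'eps) = 0 holds
# eta is deliberately correlated with both eps and the instrument w.
eta = 0.5 * eps + 0.7 * w + rng.normal(size=T)

# Reduced form of the two-equation system.
y = (alpha * gamma * w + eps + alpha * eta) / (1 - alpha * beta)
x = (gamma * w + beta * eps + eta) / (1 - alpha * beta)

alpha_iv = (w @ y) / (w @ x)   # (w'x)^{-1} (w'y)
alpha_ols = (x @ y) / (x @ x)  # OLS of y on x, for comparison

print(f"IV: {alpha_iv:.3f}  OLS: {alpha_ols:.3f}  true: {alpha}")
```

Only the orthogonality between the instrument and the residual of the first equation is imposed in this design; the other two correlations are nonzero by construction.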
Before turning our attention to the next subject it is important to remember the implicit assumptions of
IV for a much more general setup, one that allows random coefficients, for example. We will use this in future
chapters and it is worth including these concepts right away.
Assume we are interested in estimating
yt = αt xt + εt
where
αt = ᾱ + η t
and where
$$E(x_t' \varepsilon_t) \neq 0.$$

Assume we have an instrumental variable denoted as wt that satisfies the following assumptions

Assumption 15 The instrumental variable is correlated with the right hand side variable

$$E(w_t' x_t) \neq 0$$

but uncorrelated with the residual on the first equation, as well as with the random coefficient

$$E(w_t' \varepsilon_t) = 0, \qquad E(w_t' \eta_t) = 0.$$

Then the average of the random coefficients can be recovered by using the standard instrumental variable
estimator.

It is important to indicate what these assumptions are stating: the instrument affects both
endogenous variables, but the effect on yt is entirely through the impact of the instrument on xt. This means
that the residuals in the first equation are unaffected by the instrument, as are the coefficients. Under
these circumstances IV is a consistent estimate of the average effect (ᾱ).
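
A minimal simulation sketch of the random coefficient case follows. For simplicity it assumes that the random part of the coefficient is fully independent of the instrument and of xt, which is stronger than the zero correlation stated in Assumption 15; the first-stage equation and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200_000

alpha_bar, gamma = 1.0, 1.0  # illustrative values (assumptions)

w = rng.normal(size=T)              # instrument
u = rng.normal(size=T)              # first-stage disturbance
x = gamma * w + u                   # hypothetical first stage
eps = 0.8 * u + rng.normal(size=T)  # correlated with x, so E(x'eps) != 0
eta = 0.5 * rng.normal(size=T)      # random part of the coefficient,
                                    # drawn independently of w and x
y = (alpha_bar + eta) * x + eps

alpha_iv = (w @ y) / (w @ x)   # standard IV estimator
alpha_ols = (x @ y) / (x @ x)  # OLS, biased by the endogeneity of x

print(f"IV: {alpha_iv:.3f}  OLS: {alpha_ols:.3f}  average effect: {alpha_bar}")
```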

4.3.1.2 Long Run Restrictions

One of the most used restrictions in VARs is the one that was popularized by (Blanchard and Quah 1989).
If it is known that one shock does not have permanent effects, then, under some conditions, it is possible
to obtain identification. For example, assume that nominal shocks are short lived, while real shocks are
permanent. Imposing this constraint, (Blanchard and Quah 1989) and (Shapiro and Watson 1988) were able
to estimate the effects of aggregate shocks on aggregate activity and unemployment.

6 Usually it is assumed that $\operatorname{plim} \frac{1}{T} w_t' \eta_t = 0$ and that $\operatorname{plim} \frac{1}{T} \varepsilon_t' \eta_t = 0$, but as this derivation has shown this is not a
requirement.

The idea is that we can impose that the long run effect of some shock is zero, creating one additional
equation in the system and achieving identification. Obviously, this assumption can be used only when the
system includes lagged dependent variables; otherwise it is equivalent to an exclusion restriction.
XXXXXX

4.3.2 Variance Restrictions

Finally, identification can be achieved through constraints on the variances,⁷ for example, that $\sigma^2_\eta / \sigma^2_\varepsilon$ is equal to some constant, or to infinity. The
case in which the relative variance is restricted to be equal to a constant has not been (frequently) used in
applied work, while the assumption that the ratio goes to zero or to infinity is one of the underlying
assumptions of most event studies.

4.3.2.1 Near Identification

Near identification refers to the case in which one of the variances is infinitely large in comparison to the
others. In that case, as has been discussed in Chapter 4, the problem of identification is solved.
Most event studies indeed appeal to this assumption. For example, in corporate finance, when we are
evaluating the impact of earnings announcements on stock prices, the idea is to pool all earnings announcements
together in one single day, and the argument is that this process of averaging makes all other shocks in the
economy, such as changes in risk premium, interest rates, confidence, etc., smaller. Therefore, it is possible to
measure the impact of the earnings exclusively.
This is the original intuition developed by (Wright 1928) to solve the problem of identification. See (Fisher
1976) for a general discussion.

4.3.2.2 Relative variance restriction

Setting
$$\sigma^2_\eta / \sigma^2_\varepsilon = \lambda$$
solves the problem of identification in the simultaneous equations problem. In general, this assumption is
hard to justify and therefore, it has not received a lot of attention in applied work. However, it is important
to highlight that in principle, this assumption is as hard to justify as those based on exclusion restrictions.

4.3.3 Sign Restrictions

Constraining the signs of the slopes of the structural equations can achieve partial identification
because the two inequalities imply a region of admissible parameters.
Even though a unique estimate cannot be obtained, at least an admissible interval is derived. See (Fisher
1976) and (Blanchard and Diamond 1989).
[to be completed XXXX]

7 See (Rothenberg and Ruud 1990) for a detailed study where covariance restrictions are imposed in linear simultaneous
equation models.

4.3.4 Reversed Regressions and "Bounds"

In the standard simultaneous equations problem, it is possible to determine, under certain conditions,
the range in which the true coefficients lie. The method was developed by Gini (1926) and it was later
recovered by (?) and (?).⁸ The purpose of the bounds is to show the extent of the misspecification,
and to offer a range of coefficients that is valid under any possible identification scheme. A regression in which
the bounds are tight implies that the biases introduced by simultaneous equations are small.⁹
This method was developed for the general problem of misspecification. Assume we are interested in
estimating the simple relationship
$$y_t = a x_t + \nu_{y,t} \tag{4.11}$$
where the right hand side variable is correlated with the residual because there is a problem of simultaneous
equations. Notice that this is exactly the first equation in our system of equations. It is well known, and
as we have already argued, that in the presence of misspecification we cannot estimate a consistently. Indeed,
because regression (4.11) is misspecified, it is important to realize that there are two ways of estimating a:
$$y_t = a x_t + \nu_{y,t} \tag{4.12}$$
$$x_t = \frac{1}{a}\, y_t + \nu_{x,t} \tag{4.13}$$
Observe that under endogeneity both regressions are equally wrong! Gini studied this problem and realized
that depending on the sources of the misspecification, the OLS estimates in these two regressions provide
bounds for the true coefficient. The estimate in equation (4.12) provides one bound, and the inverse of the
estimate on equation (4.13) provides the other bound. The case of simultaneous equations implies that the
OLS estimate in equation (4.12) is (the same as before):
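$$\hat{\alpha}_{\text{eq-4.12}} = \alpha + \beta (1 - \alpha\beta) \frac{\sigma^2_\varepsilon}{\sigma^2_\eta + \beta^2 \sigma^2_\varepsilon}$$
while the estimate of 1/a in equation (4.13) is (note that the two expressions are similar):
$$\widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq-4.13}} = \frac{1}{\alpha} - \frac{1}{\alpha} (1 - \alpha\beta) \frac{\sigma^2_\varepsilon}{\alpha^2 \sigma^2_\eta + \sigma^2_\varepsilon}$$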
We are interested in the estimation of α, hence, we solve for α in the second equation. We can in fact use both
estimates and compute the range where the true coefficient α must lie if the model is correct. To illustrate
the range, consider the two possible cases; where α and β have different or similar signs.
If α and β have different signs, the bias in equation (4.12) makes the OLS coefficient smaller (in absolute
value) than the true one. In other words,
$$\left| \hat{\alpha}_{\text{eq-4.12}} \right| < |\alpha|$$
Similarly, under the same conditions, the estimate in equation (4.13) is also biased toward zero. Hence we can write
$$\left| \widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq-4.13}} \right| < \left| \frac{1}{\alpha} \right|$$

Therefore,
$$\left| \hat{\alpha}_{\text{eq-4.12}} \right| < |\alpha| < \frac{1}{\left| \widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq-4.13}} \right|}$$

8 See (?) for a discussion along the same lines as here.
9 Although the bounds were developed for the general misspecification problem, here we concentrate on the simultaneous
equations case.

In other words, if the two schedules have different signs, then the true coefficient lies between these two
estimates.
The intuition of this result is very simple. First, it is important to realize that equation (4.12) is the OLS
run in one direction, while equation (4.13) is the OLS regression in the other direction. If the schedules
have different signs, simultaneous equations will bias the OLS coefficients toward zero, because the OLS
coefficient is a linear combination of the two coefficients—one positive, and the other negative. Hence the
OLS coefficients in both regressions are smaller in absolute terms than the true ones. However, the coefficient
in the first equation (4.12) attempts to estimate α and the coefficient in the second equation (4.13) estimates
1/α. This is what determines the range.
When the two schedules have the same sign the range of coefficients is different. In this case, the bias in
the OLS in both equations (4.12 and 4.13) is away from zero. So, if both coefficients are positive the OLS
estimates are larger than the true ones, and if the coefficients are negative the OLS ones are smaller than the true ones.
This means that in absolute terms the true coefficient has to satisfy the following relationship:
$$|\alpha| < \min \left\{ \left| \hat{\alpha}_{\text{eq-4.12}} \right|, \; \frac{1}{\left| \widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq-4.13}} \right|} \right\}$$

Again, this implies a range of coefficients that is admissible. The intuition in this case follows the same
reasoning as before, where the difference in the two estimates is due to the fact that in both equations the
implied estimates of α are larger in absolute value than the true one.
These bounds have been extended to the multivariate case (here we have discussed only the
bivariate case), and to cases in which the misspecification is not only simultaneous equations but takes other forms as
well.
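
The bounds for the case of opposite signs can be sketched numerically. The example below uses the simultaneous equations model without exogenous variables, with hypothetical parameter values (α = 1, β = −0.5, unit variances) chosen so that the two schedules have different signs.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200_000

alpha, beta = 1.0, -0.5  # illustrative values with opposite signs

eps = rng.normal(size=T)
eta = rng.normal(size=T)

# Reduced form of y = alpha*x + eps, x = beta*y + eta.
y = (eps + alpha * eta) / (1 - alpha * beta)
x = (beta * eps + eta) / (1 - alpha * beta)

forward = (x @ y) / (x @ x)  # OLS slope in (4.12): estimate of alpha
reverse = (y @ x) / (y @ y)  # OLS slope in (4.13): estimate of 1/alpha

# The direct estimate and the inverse of the reversed estimate bracket alpha.
lower, upper = sorted([abs(forward), 1.0 / abs(reverse)])
print(f"bounds: [{lower:.3f}, {upper:.3f}]  true |alpha|: {abs(alpha)}")
```

With these particular values the interval is wide, which signals that the simultaneity bias is large relative to the true coefficient.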

5
Identification through Heteroskedasticity: Theory.

The question of identification when the model includes endogenous variables has been studied for several
decades now.¹ The problem arises when the structural form cannot be directly estimated, and the parameters
must be recovered from the reduced form, which has fewer equations than the number of unknowns. Thus, to
solve for the original parameters, more information is required. The typical solution is to impose additional
constraints based on economic knowledge about the particular model that is estimated. Indeed, as was
discussed in the previous chapter, assumptions such as exclusion, sign, long-run, and covariance restrictions
have been very useful in numerous applied problems. However, they cannot always be justified.
In this chapter we present an alternative method to solve the identification problem that is based on the
heteroskedasticity that exists in the data. I show that if the structural shocks have a known correlation
(usually zero), the identification problem can be solved by simply appealing to the heteroskedasticity of the
structural shocks. For simplicity, I begin with a case in which there are two endogenous variables and two
regimes. Subsequently, I study the cases in which there are more than two regimes, when there are multiple
endogenous variables, and when common unobservable shocks are present.
The chapter is organized as follows: In section 5.1, we discuss the preliminary intuition of the method of
identification based on the heteroskedasticity. In section 5.2, the typical problem of identification is specified
in the bivariate setting. The methodology based on heteroskedasticity is studied when the data exhibit two
regimes, as well as when they exhibit more than two regimes. A GMM interpretation of the estimation problem
is developed. In section 5.3, necessary conditions for identification are derived for multivariate processes
with unobservable common shocks. In section 5.4, the question of consistency under misspecification of the
heteroskedasticity is explored in the bivariate setup. Two cases are studied: first, when the number of
regimes is correctly specified but not the timing of the regimes, or windows, and second, when the assumed number
of regimes is smaller than the actual number of regimes exhibited by the data.

1 See (Fisher 1976) for the most comprehensive treatment of the subject. See (Haavelmo 1947) and (Koopmans, Rubin, and

Leipnik 1950) for the seminal contributions.



5.1 Preliminary Intuition

The typical problem of identification is depicted in the first panel of Figure 5.1. Assume that in the standard
supply and demand problem we are interested in estimating the slope of the demand curve. The realizations
are the outcomes of shocks to both the supply and the demand schedule, so, the OLS estimates would be
biased. The instrumental variable approach solves the problem of identification by finding a variable that
shifts the supply schedule without affecting the demand curve, thus measuring the slope of the demand. The
heteroskedasticity of the structural shocks works in a similar fashion.
The simplest intuition can be developed by looking at a special case: Split the sample in two and assume
that in the second sub-sample the supply shocks are more volatile than in the first sub-sample, while the
demand shocks have a constant variance across the two sub-samples. This increase in the variance of the
supply shocks implies that the “cloud” of realizations enlarges through the demand schedule, as is shown in
the second panel of Figure 5.1. The residuals are distributed along an ellipse, and the shift in the variance
implies a rotation along the demand curve. From the instrumental variables point of view, this is equivalent
to having a “probabilistic” instrument; we cannot assure that the supply curve shifts (as in the standard IV
approach), but in the second sample shocks to the supply are more likely to occur. Thus, the joint behavior
approximates more closely the demand schedule.
In the limit, if the variance of the supply shocks goes to infinity, the ellipse collapses and becomes the
demand curve. In this case, the slope of the demand can be estimated by OLS. This intuition was put
forward by (Wright 1928). This paper extends the original methodology to the case in which the shifts in
the variances are finite, and the form of the heteroskedasticity is unknown. In fact, if the structural shocks
are not correlated, the system is identified just by knowing that there is a change in the relative variance of
the shocks. In particular, if both variances shift by the same amount, then the two ellipses are proportional,
and the system is not identified. On the other hand, if the relative importance changes, then the system will
be identified by the rotation of the ellipse.

5.2 Identification

5.2.1 Identification under two regimes.

Assume there are two regimes in the variances of the structural shocks: high and low volatility. Additionally,
assume that the structural parameters are stable across the regimes. Under these assumptions the two
reduced form covariance matrices have the same structure as before:
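$$\hat{\Omega}_s \equiv \begin{bmatrix} \omega_{11,s} & \omega_{12,s} \\ \cdot & \omega_{22,s} \end{bmatrix} = \frac{1}{(1-\alpha\beta)^2} \begin{bmatrix} \alpha^2 \sigma^2_{\eta,s} + \sigma^2_{\varepsilon,s} & \alpha\sigma^2_{\eta,s} + \beta\sigma^2_{\varepsilon,s} \\ \cdot & \sigma^2_{\eta,s} + \beta^2 \sigma^2_{\varepsilon,s} \end{bmatrix}, \quad s \in \{1, 2\}, \tag{5.1}$$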
where each regime is denoted as s ∈ {1, 2}, where the variances of the structural shocks in regime s are
given by $\sigma^2_{\varepsilon,s}$ and $\sigma^2_{\eta,s}$, and where $\hat{\Omega}_s$ indicates the reduced form covariance matrix in regime s. In this new
system of equations there are six unknowns: α, β, $\sigma^2_{\eta,1}$, $\sigma^2_{\varepsilon,1}$, $\sigma^2_{\eta,2}$, and $\sigma^2_{\varepsilon,2}$, and two covariance matrices that
provide six equations! If the equations are independent, the problem of identification has been solved. It is
essential to restate the assumptions that lead to the identification of the system: (i) the parameters are stable
across the heteroskedasticity regimes, and (ii) the structural shocks are not correlated. These assumptions
are implicit in much of the applied macro work and are further discussed below.
Solving for the variances in equation (5.1), α and β satisfy the following non-linear system of equations:
$$\alpha = \frac{\omega_{12,s} - \beta \cdot \omega_{11,s}}{\omega_{22,s} - \beta \cdot \omega_{12,s}}, \quad s \in \{1, 2\}. \tag{5.2}$$

After some algebra, β solves the quadratic equation:

$$[\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\, \beta^2 - [\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]\, \beta + [\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] = 0 \tag{5.3}$$

There are two solutions to the quadratic equation. It is easy to show that if α, β is one solution to the system
of equations, then β ∗ = 1/α, α∗ = 1/β, is the other solution. Indeed, the solutions are the two possible ways
in which the structural form can be written. In other words, the system is identified up to row permutations
of the original model.
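
The two-regime solution can be sketched in a few lines of code. The functions `reduced_form_cov` and `solve_two_regimes` are hypothetical helpers written for this illustration, and the structural values are assumptions; the algebra is simply equations (5.1), (5.2), and (5.3) applied to population covariance matrices.

```python
import numpy as np


def reduced_form_cov(alpha, beta, var_eps, var_eta):
    """Covariance matrix of (y, x) implied by equation (5.1)."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    w11 = alpha**2 * var_eta + var_eps
    w12 = alpha * var_eta + beta * var_eps
    w22 = var_eta + beta**2 * var_eps
    return scale * np.array([[w11, w12], [w12, w22]])


def solve_two_regimes(omega1, omega2):
    """Solve the quadratic (5.3) for beta, then recover alpha from (5.2)."""
    a = omega1[0, 0] * omega2[0, 1] - omega1[0, 1] * omega2[0, 0]
    b = -(omega1[0, 0] * omega2[1, 1] - omega1[1, 1] * omega2[0, 0])
    c = omega1[0, 1] * omega2[1, 1] - omega1[1, 1] * omega2[0, 1]
    solutions = []
    for beta_hat in np.roots([a, b, c]):
        alpha_hat = (omega1[0, 1] - beta_hat * omega1[0, 0]) / (
            omega1[1, 1] - beta_hat * omega1[0, 1]
        )
        solutions.append((alpha_hat, beta_hat))
    return solutions


# Illustrative values: only the variance of eta shifts across regimes.
omega1 = reduced_form_cov(0.5, -0.4, var_eps=1.0, var_eta=1.0)
omega2 = reduced_form_cov(0.5, -0.4, var_eps=1.0, var_eta=4.0)

for alpha_hat, beta_hat in solve_two_regimes(omega1, omega2):
    print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}")
```

One root returns the original pair (α, β) and the other returns (1/β, 1/α), which is the row permutation discussed above.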

Proposition 16 Let yt and xt be described by equations (4.9) and (4.10), where the parameters (α and β)
determining the law of motion are stable and where the disturbances have finite variance, are not correlated,
and exhibit heteroskedasticity that can be described with two regimes. Then, if the covariance matrices satisfy
$$\det \left| \hat{\Omega}_2 - \frac{\omega_{11,2}}{\omega_{11,1}} \hat{\Omega}_1 \right| \neq 0 \tag{5.4}$$

the structural form is just identified: α and β are consistently estimated from the two estimable covariance
matrices.

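Proof. Identification is achieved if equation (5.3) has real solutions. A real solution requires
$$[\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]^2 - 4\, [\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\, [\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] > 0.$$

After some algebra this is equal to

$$\left[ \omega^2_{11,2}\, \omega^2_{22,2} \right] [\theta_{11} - \theta_{22}]^2 - \left[ 2 \omega_{11,2}\, \omega_{22,2}\, \omega^2_{12,2} \right] [2 (\theta_{11} - \theta_{12}) (\theta_{12} - \theta_{22})] > 0,$$
where $\theta_{11} = \frac{\omega_{11,1}}{\omega_{11,2}}$, $\theta_{12} = \frac{\omega_{12,1}}{\omega_{12,2}}$, and $\theta_{22} = \frac{\omega_{22,1}}{\omega_{22,2}}$. A sufficient condition for this inequality to be positive is
$$\left[ \omega^2_{11,2}\, \omega^2_{22,2} \right] - \left[ 2 \omega_{11,2}\, \omega_{22,2}\, \omega^2_{12,2} \right] > 0$$
$$[\theta_{11} - \theta_{22}]^2 - [2 (\theta_{11} - \theta_{12}) (\theta_{12} - \theta_{22})] > 0.$$

The first one is satisfied because of the positive definite properties of the covariance matrix
$$\omega_{11,2}\, \omega_{22,2} \left[ \omega_{11,2}\, \omega_{22,2} - 2 \omega^2_{12,2} \right] > 0.$$

The second inequality is, after some algebra, equal to

$$[\theta_{11} - \theta_{12}]^2 + [\theta_{22} - \theta_{12}]^2 > 0,$$

which is always positive. Therefore, if the coefficients in the quadratic equation are different from zero, then
the two roots are real.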
The last requirement is to show when the quadratic equation does not have infinitely many solutions. This requires
that either
$$\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} \neq 0,$$
or
$$\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} \neq 0.$$
Given the model generating the data, these two conditions are not satisfied if the heteroskedasticity implies
a proportional change of both structural shocks' variances, in other words, when $\Omega_2 = a\,\Omega_1$ for some scalar
a. This is the only case in which the quadratic equation (5.3) has infinitely many solutions.

Note that if $\Omega_2 = a\,\Omega_1$ then $\det[\Omega_2 - a\,\Omega_1] = 0$, which can be tested by computing whether or not
$$\det \left[ \Omega_2 - \frac{\omega_{11,2}}{\omega_{11,1}} \Omega_1 \right] \stackrel{?}{=} 0.$$
By construction this is equivalent to asking if the covariance of the normalized
difference is equal to zero:
$$\omega_{11,1}\omega_{12,2} - \omega_{11,2}\omega_{12,1} \stackrel{?}{=} 0.$$
The small sample properties of this statistic are better behaved than the ones from the determinant, and in
the empirical section this is what is implemented to check the rank condition.
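Consistent estimates of both covariance matrices imply that the estimate of β solves the following quadratic
equation:

$$[\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\, \beta^2 - [\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]\, \beta + [\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] = 0,$$

where, since each element of the covariance matrices in (5.1) carries the factor $1/(1-\alpha\beta)^2$,
$$\begin{aligned}
\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} &= \frac{1}{(1-\alpha\beta)^4} \left[ \left( \alpha^2\sigma^2_{\eta,1} + \sigma^2_{\varepsilon,1} \right) \left( \alpha\sigma^2_{\eta,2} + \beta\sigma^2_{\varepsilon,2} \right) - \left( \alpha\sigma^2_{\eta,1} + \beta\sigma^2_{\varepsilon,1} \right) \left( \alpha^2\sigma^2_{\eta,2} + \sigma^2_{\varepsilon,2} \right) \right] \\
\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} &= \frac{1}{(1-\alpha\beta)^4} \left[ \left( \alpha^2\sigma^2_{\eta,1} + \sigma^2_{\varepsilon,1} \right) \left( \sigma^2_{\eta,2} + \beta^2\sigma^2_{\varepsilon,2} \right) - \left( \sigma^2_{\eta,1} + \beta^2\sigma^2_{\varepsilon,1} \right) \left( \alpha^2\sigma^2_{\eta,2} + \sigma^2_{\varepsilon,2} \right) \right] \\
\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2} &= \frac{1}{(1-\alpha\beta)^4} \left[ \left( \alpha\sigma^2_{\eta,1} + \beta\sigma^2_{\varepsilon,1} \right) \left( \sigma^2_{\eta,2} + \beta^2\sigma^2_{\varepsilon,2} \right) - \left( \sigma^2_{\eta,1} + \beta^2\sigma^2_{\varepsilon,1} \right) \left( \alpha\sigma^2_{\eta,2} + \beta\sigma^2_{\varepsilon,2} \right) \right],
\end{aligned}$$

which after some algebra are equal to

$$\begin{aligned}
\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} &= \frac{1}{(1-\alpha\beta)^3} \left[ -\alpha\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \alpha\sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \right] \\
\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} &= \frac{1}{(1-\alpha\beta)^3} \left[ -\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} (1 + \alpha\beta) + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} (1 + \alpha\beta) \right] \\
\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2} &= \frac{1}{(1-\alpha\beta)^3} \left[ -\beta\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \beta\sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \right].
\end{aligned}$$
The common factor $1/(1-\alpha\beta)^3$ cancels in the quadratic equation and plays no role in what follows.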
Hence, the two solutions to the quadratic equation are
$$\hat{\beta} = \frac{[(1 + \alpha\beta) \pm (1 - \alpha\beta)] \left[ -\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \right]}{2\alpha \left[ -\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \right]},$$

where, under the assumption that the rank condition is satisfied (equations (5.4) or (5.5)), the solution of
the system of equations is
$$\hat{\beta} = \frac{(1 + \alpha\beta) \pm (1 - \alpha\beta)}{2\alpha}$$

where one solution is $\hat{\beta} = \beta$ and the other one is $\hat{\beta} = 1/\alpha$, which are the two permutations of the system of
equations. Thus, if $\sigma^2_{\eta,1}$, $\sigma^2_{\varepsilon,2}$, $\sigma^2_{\varepsilon,1}$, and $\sigma^2_{\eta,2}$ are consistently estimated from the data, the consistency of $\hat{\beta}$ is
assured. But consistent estimates of the structural variances are indeed obtained from consistent estimates of
the reduced form covariance matrices if the system is linear, the parameters are stable, and the residuals
have finite variances.
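Furthermore, observe that β is consistent if the relative variances of the structural shocks shift:

$$-\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \neq 0 \;\Rightarrow\; \frac{\sigma^2_{\eta,1}}{\sigma^2_{\varepsilon,1}} \neq \frac{\sigma^2_{\eta,2}}{\sigma^2_{\varepsilon,2}},$$

which is the generalization of the intuition in (Wright 1928).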



Equation (5.4) is equivalent to

$$\omega_{11,1}\omega_{12,2} - \omega_{11,2}\omega_{12,1} \neq 0. \tag{5.5}$$
Note that conditions (5.4) and (5.5) are similar to testing the rank condition when the order condition
(number of equations) has been satisfied. In terms of the standard literature on linear systems of equations,
the order condition requires that the number of equations be larger than or equal to the number
of unknowns. The rank condition requires the number of linearly independent equations to be equal to or
larger than the number of unknowns. In linear systems of equations, this is done by computing the rank
of the matrix. In the case studied here, the system is non-linear, and the rank condition takes the form of
equation (5.4).
Equation (5.4) fails if the two covariance matrices are proportional; i.e., the heteroskedasticity does not
identify the system if the relative variances are constant across regimes. Returning to the intuition given
in the introduction, imagine that the variance of both shocks doubles; then the shape of the ellipse across
the two regimes is the same, and nothing can be learned about the original system. Technically, this is the
case in which we have six equations and six unknowns, but the equations are not independent. On the other
hand, when the relative ratio of the variances shifts, then the heteroskedasticity changes the region in which
the errors are distributed, enlarging the ellipse along one of the structural equations. This rotation in the
ellipse can be estimated from the reduced form covariances allowing us to obtain the slope of the schedules.
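
The failure of the rank condition under a proportional shift is easy to verify numerically. The sketch below evaluates the statistic in (5.5) on population covariance matrices built from equation (5.1); the helper names and the parameter values are assumptions made for illustration.

```python
import numpy as np


def reduced_form_cov(alpha, beta, var_eps, var_eta):
    """Covariance matrix of (y, x) implied by equation (5.1)."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    w11 = alpha**2 * var_eta + var_eps
    w12 = alpha * var_eta + beta * var_eps
    w22 = var_eta + beta**2 * var_eps
    return scale * np.array([[w11, w12], [w12, w22]])


def rank_statistic(omega1, omega2):
    """Left hand side of condition (5.5)."""
    return omega1[0, 0] * omega2[0, 1] - omega2[0, 0] * omega1[0, 1]


alpha, beta = 0.5, -0.4  # illustrative values

base = reduced_form_cov(alpha, beta, var_eps=1.0, var_eta=1.0)
# Both variances double: the relative variance is unchanged.
proportional = reduced_form_cov(alpha, beta, var_eps=2.0, var_eta=2.0)
# Only the variance of eta rises: the relative variance shifts.
rotated = reduced_form_cov(alpha, beta, var_eps=1.0, var_eta=4.0)

stat_proportional = rank_statistic(base, proportional)
stat_rotated = rank_statistic(base, rotated)

print(f"proportional shift: {stat_proportional:.6f}")
print(f"relative shift:     {stat_rotated:.6f}")
```

In sample, the statistic would be computed from estimated covariance matrices and compared with its sampling distribution rather than with an exact zero.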
The simplest intuition of how identification is achieved can be developed by first analyzing the case in
which the variance changes for only one shock. Assume that it is known that at some point in time there is
an increase in the variance of the supply shocks. During that period, the “cloud” of realizations is going to
widen along the demand curve as depicted in Figure 5.1. Comparing how the ellipse of the realizations has
changed across the two samples allows one to determine the slope of the demand curve. In this particular
case, because it has been assumed that the structural shocks have zero correlation, this is enough to estimate
the slope of the supply curve, too. Moreover, this explanation has an instrumental variable interpretation.
A valid instrument to estimate the demand schedule is one that moves the supply without affecting the
demand. In this example, the rise in the variance of the supply shocks becomes a probabilistic instrument
precisely because it increases the likelihood that the supply equation “moves”.
Finally, when both variances shift, there is an expansion along both schedules. So it is not necessary to
know which shock becomes more important across the regimes. It is enough if the relative variances shift -
equation (5.4) would be satisfied and both schedules identified.

5.2.2 Identification under more than two regimes.

It is easy to extend the previous results to the case where there are more than two regimes. Assume that
the data exhibit multiple finite heteroskedasticity regimes indexed by s ∈ {1, .., S}. For each regime, the
covariance matrix is
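$$\hat{\Omega}_s \equiv \begin{bmatrix} \omega_{11,s} & \omega_{12,s} \\ \cdot & \omega_{22,s} \end{bmatrix} = \frac{1}{(1-\alpha\beta)^2} \begin{bmatrix} \alpha^2 \sigma^2_{\eta,s} + \sigma^2_{\varepsilon,s} & \alpha\sigma^2_{\eta,s} + \beta\sigma^2_{\varepsilon,s} \\ \cdot & \sigma^2_{\eta,s} + \beta^2 \sigma^2_{\varepsilon,s} \end{bmatrix}. \tag{5.6}$$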

This is a system that has 3S equations (one covariance matrix per regime) and 2S + 2 unknowns: S times
two structural variances for each regime, plus two parameters (α and β).
The order condition will be satisfied for any S larger than or equal to two. The rank condition takes the
same form as equations (5.4) and (5.5) for any pair of regimes. Indeed, the system is overidentified if there
are at least three regimes that satisfy the rank condition for all combinations.
Appealing to the probabilistic IV interpretation used before, each new heteroskedastic regime is a valid
instrument if and only if it satisfies the rank condition with respect to all the previous regimes. In this case,

each new covariance matrix adds three equations and only two unknowns. Otherwise, the new heteroskedastic
regime does not increase the number of restrictions on the structural coefficients. Hence, for S larger than
two, and for all covariance matrices satisfying the rank condition, the system of equations is overidentified,
and the underlying assumptions, such as that α and β are stable through time, can be tested. The estimation
has a minimum distance interpretation where each heteroskedastic regime is equivalent to one instrument.²

5.3 Identification with common shocks

In the previous sections, the stochastic process is bivariate and there are no common shocks. In this section,
these assumptions are relaxed and the necessary conditions to achieve identification are discussed.³
It should be clear that if we allow for a common unobservable heteroskedastic shock in the bivariate
setting, the heteroskedasticity will not be sufficient to achieve identification. Each heteroskedastic regime
adds not only three equations, but also three unknowns. So it is essential to impose some constraints on the
covariances to be able to use the variation in the second moments to solve the problem of identification.
Assume that there are N endogenous variables, K common unobservable shocks, and s ∈ {1, ..., S} possible
regimes or states. Denote the structural form as follows:
     
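$$A_{N \times N} \begin{bmatrix} x_{1,t} \\ \vdots \\ x_{N,t} \end{bmatrix} = \Gamma_{N \times K} \begin{bmatrix} z_{1,t} \\ \vdots \\ z_{K,t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \vdots \\ \varepsilon_{N,t} \end{bmatrix}, \tag{5.7}$$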

where all the shocks are assumed to have zero correlation at all leads and lags,

$$\begin{aligned}
E[z_{i,t}, z_{j,t}] &= 0 \quad \forall i \neq j,\; i, j \in \{1, \dots, K\} \\
E[\varepsilon_{i,t}, \varepsilon_{j,t}] &= 0 \quad \forall i \neq j,\; i, j \in \{1, \dots, N\} \\
E[z_{i,t}, \varepsilon_{j,t}] &= 0 \quad \forall i \neq j,\; i \in \{1, \dots, K\},\; j \in \{1, \dots, N\},
\end{aligned} \tag{5.8}$$

and where $x_{n,t}$, n ∈ {1, ..., N} are the N endogenous (row vector) variables; where $z_{k,t}$, k ∈ {1, ..., K} are
the K unobservable common shocks, assumed to have no correlation, with variance $\sigma_{z,k,s}$ in state s; and
where $\varepsilon_{n,t}$ are the structural shocks, assumed not to be correlated, with variance $\sigma_{\varepsilon,n,s}$ in state s.
The matrix AN ×N describes the contemporaneous parameters,
 
$$A_{N \times N} = \begin{bmatrix} 1 & a_{12} & \cdots & a_{1n} \\ a_{21} & 1 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & 1 \end{bmatrix}, \tag{5.9}$$

where the assumption of normalization already has been imposed (coefficients along the diagonal are equal
to one). And ΓN ×K are the parameters from the common shocks, where normalization is also assumed; in

2 The additional equations can also be interpreted as a factor regression model, where the left hand side variables of equation
(5.6) are the estimates (or observable), the variances ($\sigma^2_{\eta,s}$ and $\sigma^2_{\varepsilon,s}$) are the unobservable factors, and the coefficients are the
weights or factor loadings. Factor analysis usually assumes that the $\omega_{ij,s}$'s are independent. It is unlikely, however, that this is
the case in this setup. Therefore proper corrections have to be considered in the estimation procedure. In this paper, I use the
GMM interpretation.
3 Including common shocks in the model is equivalent to relaxing the assumption on the correlation of the structural shocks.
5.3 Identification with common shocks 91

this case, it implies a unitary impact on the first equation,


 
1 1 ··· 1
 γ 21 γ 22 · · · γ 2k 
 
ΓN ×K =  . .. .. .. . (5.10)
 .. . . . 
γ n1 γ n2 ··· γ nk

Proposition 17 A multivariate system of N equations, with K unobservable common shocks, described by
equations (5.7), (5.8), (5.9), and (5.10), is identified if and only if, for N > 1,
(i) the number of states (S) satisfies

\[
S \geq \frac{2\,(N+K)\,(N-1)}{N^2 - N - 2K},
\tag{5.11}
\]

(ii) the number of endogenous variables is large enough (equivalently, the number of common shocks is small
enough) to satisfy

\[
N^2 - N - 2K > 0,
\tag{5.12}
\]

(iii) and the covariance matrices constitute a system of equations that is linearly independent.

Proof. Note that the proposition states a necessary condition, but not a sufficient one. Thus it is stating
an order condition. From equation (5.7), the number of equations is given by the covariance matrix in each
regime. This provides $N(N+1)/2$ equations in each state. The total number of unknowns is as follows: the
matrix $A_{N\times N}$ has $N(N-1)$ parameters; the matrix $\Gamma_{N\times K}$ has $K(N-1)$ parameters; the variances of the
common shocks across states number $K \cdot S$ (K variances times S regimes); and the variances of the structural
shocks across regimes number $N \cdot S$ (N variances times S regimes). Identification, then, requires

\[
\begin{aligned}
S \cdot \frac{N(N+1)}{2} &\geq N(N-1) + K(N-1) + S \cdot K + S \cdot N, \\
S &\geq \frac{2\,(N+K)\,(N-1)}{N^2 - N - 2K}.
\end{aligned}
\]
Inequality (5.11) indicates the minimum number of states required to obtain identification. Finally, in order
for (5.11) to make sense, there is a minimum number of endogenous variables, which is given by

\[
N^2 - N - 2K > 0.
\]

Equation (5.12) is the “catch up” constraint. It indicates the conditions under which one additional regime
in the variance-covariance matrix adds more equations than unknowns. In the example that motivated this section,
N = 2 and K = 1, so that $N^2 - N - 2K = 0$: the inequality is not satisfied and no further information is obtained from
the heteroskedasticity. Moreover, if the common shocks are interpreted as the sources of correlation between
the structural shocks, then this constraint indicates that some of the covariances of the structural shocks
must be restricted to be constant or zero. Solving for K, it is found that identification requires $K < N(N-1)/2$,
where the right-hand side of this inequality is exactly the number of all possible contemporaneous correlations
among structural shocks.
There are two main implications of proposition 17: First, in the absence of common shocks only two states
are required to achieve identification, independently of the number of endogenous variables N . Second, if
K > 0 and N is finite, the number of states required to achieve identification is always larger than two.
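The counting argument behind Proposition 17 can be checked numerically. The following is a minimal sketch in Python, not part of the original derivation; the function names `min_regimes` and `counts` are hypothetical, and the code only restates the order condition (5.11) and the constraint (5.12), so it says nothing about the rank condition.

```python
def min_regimes(n_vars, n_common):
    """Smallest integer S satisfying the order condition (5.11).

    Returns None when N <= 1 or when the constraint (5.12) fails,
    i.e. when an extra regime adds no more equations than unknowns.
    """
    slack = n_vars * n_vars - n_vars - 2 * n_common  # left-hand side of (5.12)
    if n_vars <= 1 or slack <= 0:
        return None
    numerator = 2 * (n_vars + n_common) * (n_vars - 1)
    return -(-numerator // slack)  # integer ceiling of numerator / slack


def counts(n_vars, n_common, n_regimes):
    """Number of covariance equations and of unknowns for S regimes."""
    equations = n_regimes * n_vars * (n_vars + 1) // 2
    unknowns = (n_vars * (n_vars - 1)            # off-diagonal entries of A
                + n_common * (n_vars - 1)        # free entries of Gamma
                + n_regimes * (n_common + n_vars))  # variances in each regime
    return equations, unknowns


# Illustrative values only (not taken from the text).
for n, k in [(2, 0), (2, 1), (3, 1), (4, 2)]:
    print(n, k, min_regimes(n, k))
```

With no common shocks the function returns two regimes for any N > 1, and for N = 2, K = 1 it returns `None`, in line with the two implications stated above.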

The estimation of this model is performed by GMM where the moment conditions are

\[
A\,\Omega_s\,A' = \Gamma\,\Omega_{z,s}\,\Gamma' + \Omega_{\varepsilon,s},
\tag{5.13}
\]

where Ωs is the covariance matrix that can be estimated in the data from the observed variables (xt ) in
regime s, Ωz,s is the covariance matrix of the common unobservable shocks in regime s, which, given the
assumptions in equation (5.8), is a diagonal matrix, and Ωε,s is the covariance matrix of the structural shocks
in regime s, which given the assumptions in equation (5.8), is also diagonal. The parameters of interest are
A and Γ.
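To make the moment condition (5.13) concrete, the sketch below evaluates its residual for one regime in plain Python. It is an illustration under stated assumptions rather than the estimation code used in the papers: all numerical values are hypothetical, the helper names are invented for this example, and the 2-by-2 case is used only because its inverse is easy to write by hand (that case is not identified, but the moment condition still holds at the true parameters).

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def diag(values):
    return [[v if i == j else 0.0 for j in range(len(values))]
            for i, v in enumerate(values)]


def moment_residual(a, gamma, omega_s, var_z, var_eps):
    """A Omega_s A' - Gamma Omega_z,s Gamma' - Omega_eps,s for one regime."""
    lhs = matmul(matmul(a, omega_s), transpose(a))
    common = matmul(matmul(gamma, diag(var_z)), transpose(gamma))
    eps = diag(var_eps)
    n = len(a)
    return [[lhs[i][j] - common[i][j] - eps[i][j] for j in range(n)]
            for i in range(n)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Hypothetical parameters: N = 2 endogenous variables, K = 1 common shock.
A = [[1.0, -0.5], [-0.3, 1.0]]
GAMMA = [[1.0], [0.8]]
VAR_Z, VAR_EPS = [2.0], [1.0, 1.5]

# Covariance of the observables implied by the structural form (5.7).
rhs = matmul(matmul(GAMMA, diag(VAR_Z)), transpose(GAMMA))
rhs = [[rhs[i][j] + diag(VAR_EPS)[i][j] for j in range(2)] for i in range(2)]
a_inv = inverse_2x2(A)
omega_s = matmul(matmul(a_inv, rhs), transpose(a_inv))

residual_true = moment_residual(A, GAMMA, omega_s, VAR_Z, VAR_EPS)
residual_wrong = moment_residual([[1.0, 0.0], [-0.3, 1.0]], GAMMA, omega_s,
                                 VAR_Z, VAR_EPS)
max_abs_true = max(abs(x) for row in residual_true for x in row)
max_abs_wrong = max(abs(x) for row in residual_wrong for x in row)
print(max_abs_true, max_abs_wrong)
```

A GMM routine would stack these residuals over regimes, with the sample covariance in place of `omega_s`, and minimize a weighted norm over A, Γ, and the variances; that step is omitted here.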
As I hope is clear, identifying the model when there are common shocks is
much harder than in the case in which the covariance assumption on the structural shocks can be imposed
directly. In what follows I would like to discuss two methodologies that Brian Sack and I have used in other
papers to deal with the presence of common shocks. This is an extremely important problem when we are
dealing with macro asset pricing.

5.3.1 Related literature

At this point it is useful to discuss the relationship between this methodology and the literature on
identification using heteroskedasticity. As mentioned before, the use of second moments as a source of identification
was first introduced by Philip Wright [1928]. He indicated that an increase in the variance of the shocks
in one equation reduces the bias introduced by simultaneous equation problems in the OLS estimate of
the other one. Taking the limit to infinity implies that OLS would estimate the coefficients consistently.
More recent research has extended the original intuition (i) to non-linear models, (ii) to
models with parametric representations of the heteroskedasticity (such as ARCH or GARCH models), and
(iii) to models that are partially identified.
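Wright's intuition can be illustrated with the population OLS slope implied by a bivariate covariance matrix of the form used in Section 5.4.1. The sketch below is only an illustration: it assumes that α is the coefficient in the equation being estimated, that η is the shock to the other equation, and the function name and numerical values are hypothetical.

```python
def ols_slope(alpha, beta, var_eta, var_eps):
    """Population OLS slope of x1 on x2, Cov(x1, x2) / Var(x2).

    Uses the off-diagonal and lower-diagonal entries of the bivariate
    covariance matrix; the common factor 1 / (1 - alpha*beta)^2 cancels.
    """
    cov = alpha * var_eta + beta * var_eps
    var_x2 = var_eta + beta ** 2 * var_eps
    return cov / var_x2


ALPHA, BETA = 0.5, -0.3  # hypothetical structural coefficients
biases = [abs(ols_slope(ALPHA, BETA, v, 1.0) - ALPHA)
          for v in (1.0, 10.0, 100.0, 1e9)]
print(biases)
```

As the variance of η grows relative to the variance of ε, the simultaneity bias in the OLS estimate of α shrinks toward zero, which is the limiting argument described above.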
Klein and Vella (2000b) and Klein and Vella (2000a) discuss the problem of identification and estimation
in a binary endogenous model when exclusion restrictions (or any other parameter restrictions) are not
available, and in the triangular model, respectively. They estimate the heteroskedasticity
semiparametrically and use the residual from the second equation as an additional regressor in the first equation
as the instrument.4
Sentana (1992) and Sentana and Fiorentini (2001) study the problem of estimation in factor regressions
when there is conditional heteroskedasticity. The simple case developed in this section (proposition 16) is
a special case of their proposition 3. They study the conditions under which identification is achieved in a
non-triangular system when the common latent factors exhibit heteroskedasticity.
There are important differences between those papers and the approach developed here. First, the
procedure highlighted in this paper requires only the knowledge that a shift in the relative variances has occurred;
that is, the regime shift comes from economic events, such as crises, policy shifts, or other characteristics of
the data, such as heteroskedasticity across regions, time, or other cross-sectional characteristics. The ARCH
specification uses the time series heteroskedasticity in the data as a statistical vehicle to achieve identification.
Second, the procedure described in this paper allows us to test some of the underlying assumptions, such
as parameter stability; the system is overidentified when there are more than two regimes. The techniques
based on conditional heteroskedasticity are unable to provide this test. Third, as is shown below, if the
heteroskedasticity is misspecified in this model, the coefficients are still consistent. This is not the case when the
heteroskedasticity is modeled parametrically; misspecification in those cases could bias the contemporaneous
coefficients as well. Furthermore, if the data exhibit conditional heteroskedasticity, and the procedure
described here is implemented, it is still the case that the coefficients will be consistent. Fourth, models that rely
on conditional heteroskedasticity to achieve identification require the number of heteroskedastic shocks to
be smaller than, or equal to, the number of endogenous variables. As is shown in Section 5.3, this is not the
case in the present procedure. If there are more than two regime shifts, there exist conditions under which it is
possible to have more latent factors than endogenous variables and still be able to identify the structural
system.

4 See also (Chen and Khan 1999) for a general solution of the problem of identification in sample selection models when the
data exhibit heteroskedasticity.
Though the estimation procedures among all these papers are very different, they share the same intuition
for solving the problem of endogenous variables: the heteroskedasticity adds equations to the system after
some covariance restrictions have been imposed. It is important to mention that these procedures require
that the system of equations be linear, or in other words, that the coefficients be stable to changes in the
volatility. Future research should consider extending the methodology to non-linear specifications.
Finally, in addition to the papers mentioned above, some applied papers already have used heteroskedastic-
ity to identify a system of equations. In the context of conditional heteroskedasticity, see (Caporale, Cipollini,
and Spagnolo 2002), (Dungey and Martin 2001), (King, Sentana, and Wadhwani 1994), and (Rigobon 2002).
In these papers a structural conditionally heteroskedastic model is estimated from a reduced form GARCH
model. In the context of regime switches see (?), (Rigobon and Sack 2003), and (Rigobon and Sack 2004).
In the context of testing parameter stability see (Rigobon 2000) and (Rigobon 2003). I discuss partial
identification of simultaneous equation models with unobservable common shocks. That paper is more concerned
with developing a test for stability of parameters rather than with identifying the system of equations. The
procedure depends on the presence of a particular form of the heteroskedasticity, where in the short run only
a subset of the variances are allowed to shift.
Relatively new applications are arising in panel data questions. As far as I can tell, it seems that in
those applications the power of the panel is strong enough to produce very tight estimates. (Hogan and
Rigobon 2003) estimate the returns to education using the heteroskedasticity that exists among the different
regions in the U.K. See also (Rigobon and Rodrik ????) and (Lee, Ricci, and Rigobon 2004), among others.

5.4 Consistency under misspecification of the heteroskedasticity.

An important question arising from the previous derivation is the issue of consistency when the heteroskedas-
ticity is misspecified. This section shows that the estimates are consistent even though the regimes might be
misspecified.
In this section two cases are evaluated: (i) when the windows of the heteroskedasticity are wrongly specified
but the number of regimes is correct, and (ii) when the data have more regimes than the ones assumed in
the specification. Without loss of generality, only the bivariate case in which there are no common shocks is
discussed.
The intuition about why consistency is achieved in these two cases is that the misspecified covariance
matrices are linear combinations of the true underlying ones. Therefore, the misspecified system of equations
is a linear transformation of the original problem. If this linear transformation does not drop the rank of the
system, the same solution is obtained. It is not proven in this section, but it should be intuitively clear
that the misspecification reduces the power of the test by shrinking the differences across regimes. In the
limit, when the misspecification is so large that the system drops rank, the estimates
are inconsistent: there is a continuum of them.

5.4.1 Misspecification of the regime windows.

Assume the system is described by equations (4.9) and (4.10), and that the data exhibit heteroskedastic-
ity with only two regimes. If the windows are misspecified, the computed covariance matrices are linear
combinations of the true underlying covariance matrices. Denote

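\[
\begin{aligned}
\Omega_{r1} &= \lambda_{r1}\,\Omega_1 + (1-\lambda_{r1})\,\Omega_2, \\
\Omega_{r2} &= (1-\lambda_{r2})\,\Omega_1 + \lambda_{r2}\,\Omega_2,
\end{aligned}
\]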

where Ω1 and Ω2 are the true covariance matrices describing the heteroskedasticity, Ωr1 and Ωr2 are the
estimated covariance matrices, and λr1 and λr2 are weights indicating how “correct” the windows are; when
they are equal to one, the windows coincide with the true regimes.

Proposition 18 Assume the original system satisfies the rank condition (5.4). If the misspecified het-
eroskedasticity satisfies the rank condition (5.4), then the model is identified and its estimators are consistent.

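Proof. After some algebra the two covariance matrices can be written in terms of the underlying variances:

\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,r1} + \sigma^2_{\varepsilon,r1} & \alpha\sigma^2_{\eta,r1} + \beta\sigma^2_{\varepsilon,r1} \\
\cdot & \sigma^2_{\eta,r1} + \beta^2\sigma^2_{\varepsilon,r1}
\end{bmatrix},
\]

\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,r2} + \sigma^2_{\varepsilon,r2} & \alpha\sigma^2_{\eta,r2} + \beta\sigma^2_{\varepsilon,r2} \\
\cdot & \sigma^2_{\eta,r2} + \beta^2\sigma^2_{\varepsilon,r2}
\end{bmatrix},
\]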

where

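\[
\sigma^2_{\eta,r1} = \lambda_{r1}\sigma^2_{\eta,1} + (1-\lambda_{r1})\sigma^2_{\eta,2}
\quad\text{and}\quad
\sigma^2_{\varepsilon,r1} = \lambda_{r1}\sigma^2_{\varepsilon,1} + (1-\lambda_{r1})\sigma^2_{\varepsilon,2},
\tag{5.14}
\]

\[
\sigma^2_{\eta,r2} = (1-\lambda_{r2})\sigma^2_{\eta,1} + \lambda_{r2}\sigma^2_{\eta,2}
\quad\text{and}\quad
\sigma^2_{\varepsilon,r2} = (1-\lambda_{r2})\sigma^2_{\varepsilon,1} + \lambda_{r2}\sigma^2_{\varepsilon,2}.
\tag{5.15}
\]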

Given that the original heteroskedasticity satisfied the rank condition ($\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} - \sigma^2_{\eta,2}\sigma^2_{\varepsilon,1} \neq 0$), there are
two questions to answer: (i) in which circumstances the misspecified model satisfies the rank condition, and
(ii) in which circumstances the estimates are consistent. After some algebra, $\Omega_{r1}$ and $\Omega_{r2}$ satisfy equation
(5.4) if and only if
\[
\sigma^2_{\eta,r1}\,\sigma^2_{\varepsilon,r2} \neq \sigma^2_{\eta,r2}\,\sigma^2_{\varepsilon,r1}.
\]
Substituting the definitions of the variances (equations 5.14 and 5.15), the rank condition is not satisfied
if and only if
\[
\lambda_{r1} = 1 - \lambda_{r2}.
\]
In other words, the rank condition is not satisfied if the windows are so badly specified that they imply the
same weights on the true regimes. Thus, the two computed matrices are identical.
Assume the rank condition is satisfied; then the question is whether the solution of the new system of
equations is consistent. Substituting equations (5.14) and (5.15) into equation (5.3), the estimated coefficient,
written here as $\hat\beta$ to distinguish it from the true $\beta$, solves

\[
\frac{\Phi\alpha}{(1-\alpha\beta)^3}
\left( \hat\beta^2 - \left( \frac{1}{\alpha} + \beta \right)\hat\beta + \frac{\beta}{\alpha} \right) = 0,
\tag{5.16}
\]

where
\[
\Phi = \left( \sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} - \sigma^2_{\eta,2}\sigma^2_{\varepsilon,1} \right)\left( 1 - \lambda_{r1} - \lambda_{r2} \right).
\]
Note that under the assumption that the original heteroskedasticity satisfies the rank condition, and that
$\lambda_{r1} \neq 1 - \lambda_{r2}$, $\Phi$ is different from zero. Hence, equation (5.16) is the exact same quadratic equation
as in the well-specified model. Thus consistency is assured if the covariance matrix is consistently estimated.

The two solutions are β and 1/α. Therefore, if the regimes are misspecified and the system satisfies the rank
condition, then the estimates are consistent.
In other words, if the computed covariance matrices satisfy the rank condition, then the estimates are
consistent even if the regimes have been slightly misspecified. On the other hand, if the misspecification is so
large that the system fails the rank condition, then the coefficients are not identified. Hence, the estimated
coefficients should be consistent for small perturbations of the regime definitions.
Remember that the equivalent rank condition is testable. Therefore, the degree of misspecification can be
detected in the applications.
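The argument of this subsection can be reproduced numerically. The sketch below is a minimal illustration under stated assumptions: equation (5.3) is not reproduced in this section, so the quadratic used here is derived from the requirement that the structural shocks be uncorrelated in both windows, taking the structural matrix to be [[1, −α], [−β, 1]], which is consistent with the covariance matrices written above. All numerical values and function names are hypothetical.

```python
import math


def omega(alpha, beta, var_eta, var_eps):
    """Entries (w11, w12, w22) of the bivariate covariance matrix."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    return (scale * (alpha ** 2 * var_eta + var_eps),
            scale * (alpha * var_eta + beta * var_eps),
            scale * (var_eta + beta ** 2 * var_eps))


def mix(first, second, weight):
    """weight * first + (1 - weight) * second, entry by entry."""
    return tuple(weight * a + (1.0 - weight) * b
                 for a, b in zip(first, second))


def quadratic_coefficients(om_a, om_b):
    """Coefficients (c3, c2, c1) of c3*b^2 + c2*b + c1 = 0.

    Assumed stand-in for equation (5.3): the vector (-beta, 1 + alpha*beta,
    -alpha) is orthogonal to (w11, w12, w22) in both windows, so it is
    proportional to the cross product of the two covariance vectors.
    """
    c1 = om_a[1] * om_b[2] - om_a[2] * om_b[1]
    c2 = om_a[2] * om_b[0] - om_a[0] * om_b[2]
    c3 = om_a[0] * om_b[1] - om_a[1] * om_b[0]
    return c3, c2, c1


def roots(om_a, om_b):
    c3, c2, c1 = quadratic_coefficients(om_a, om_b)
    disc = math.sqrt(c2 * c2 - 4.0 * c3 * c1)
    return sorted([(-c2 - disc) / (2.0 * c3), (-c2 + disc) / (2.0 * c3)])


ALPHA, BETA = 0.5, -0.3  # hypothetical true coefficients
omega_1 = omega(ALPHA, BETA, var_eta=1.0, var_eps=1.0)
omega_2 = omega(ALPHA, BETA, var_eta=4.0, var_eps=1.0)

# Misspecified windows with weights 0.8 and 0.7 on the correct regime.
omega_r1 = mix(omega_1, omega_2, 0.8)
omega_r2 = mix(omega_2, omega_1, 0.7)
true_roots = roots(omega_1, omega_2)
misspecified_roots = roots(omega_r1, omega_r2)

# Weights with lambda_r1 = 1 - lambda_r2: both windows coincide.
degenerate = quadratic_coefficients(mix(omega_1, omega_2, 0.6),
                                    mix(omega_2, omega_1, 0.4))
print(true_roots, misspecified_roots, degenerate)
```

In this example both the well-specified and the misspecified windows deliver the roots β and 1/α, while the degenerate weights make every coefficient of the quadratic vanish, so nothing is identified.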

5.4.2 Under-specified number of regimes.

Assume the system is described by equations (4.9) and (4.10), and that the data exhibit heteroskedasticity
with S ∗ regimes, where there are no restrictions to the form of the heteroskedasticity. For simplicity denote
the variances of the structural shocks in each regime as follows:
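\[
\sigma^2_{\eta,s} = (1+\delta_{\eta,s})\,\sigma^2_{\eta,0}, \qquad
\sigma^2_{\varepsilon,s} = (1+\delta_{\varepsilon,s})\,\sigma^2_{\varepsilon,0}, \qquad \forall s \neq 0,
\]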
where σ 2η,s and σ 2ε,s represent the variances of the idiosyncratic shocks in regime s, and δ η,s and δ ε,s are the
changes of those variances relative to the variances from regime s = 0.
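Assume that only two regimes are used in the estimation. Without loss of generality assume that the first
window corresponds to the first set of $\hat{s} < S^*$ regimes and that the second window corresponds to the second
set of $S^* - \hat{s}$ regimes. The covariance matrices of each of the misspecified periods are given by

\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2 \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s}
& \alpha \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \beta \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s} \\
\cdot
& \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \beta^2 \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s}
\end{bmatrix}
\]

for the first window, and

\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2 \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s}
& \alpha \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \beta \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s} \\
\cdot
& \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \beta^2 \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s}
\end{bmatrix}
\]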

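for the second one. The two matrices can be rewritten as

\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
(1+\delta_{\eta,r1})\,\alpha^2\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\sigma^2_{\varepsilon,0}
& (1+\delta_{\eta,r1})\,\alpha\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\beta\sigma^2_{\varepsilon,0} \\
\cdot
& (1+\delta_{\eta,r1})\,\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\beta^2\sigma^2_{\varepsilon,0}
\end{bmatrix}
\]

\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
(1+\delta_{\eta,r2})\,\alpha^2\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\sigma^2_{\varepsilon,0}
& (1+\delta_{\eta,r2})\,\alpha\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\beta\sigma^2_{\varepsilon,0} \\
\cdot
& (1+\delta_{\eta,r2})\,\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\beta^2\sigma^2_{\varepsilon,0}
\end{bmatrix},
\]

where

\[
\delta_{\eta,r1} = \frac{1}{\hat{s}}\sum_{s<\hat{s}} \delta_{\eta,s}
\quad\text{and}\quad
\delta_{\eta,r2} = \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \delta_{\eta,s},
\tag{5.17}
\]

\[
\delta_{\varepsilon,r1} = \frac{1}{\hat{s}}\sum_{s<\hat{s}} \delta_{\varepsilon,s}
\quad\text{and}\quad
\delta_{\varepsilon,r2} = \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \delta_{\varepsilon,s}.
\tag{5.18}
\]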

Proposition 19 Assume the true heteroskedasticity is described by S ∗ regimes and that those covariance
matrices satisfy the rank condition (5.4). Assume that only two regimes have been used in the estimation;
then, if the following conditions are satisfied, the system is identified and its estimates are consistent.

1. The misspecified covariance matrices have to exhibit heteroskedasticity: $\Omega_{r1} \neq \Omega_{r2}$.


2. The misspecified covariance matrices satisfy the rank condition (5.4).

Proof. The first assumption in the proposition is to guarantee that the original system can be identified
if the heteroskedasticity is well specified. In the ill-specified model, identification is achieved if the relative
volatilities change. This is equivalent to

\[
\delta_{\eta,r1} \neq \delta_{\eta,r2} \quad\text{or}\quad \delta_{\varepsilon,r1} \neq \delta_{\varepsilon,r2}.
\tag{5.19}
\]

Equation (5.19) indeed guarantees that the two estimated covariance matrices are different. In other words,
it guarantees that the order condition will be satisfied; there is heteroskedasticity in the estimated model.
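The next question is, as before, what the conditions for consistency are. Substituting the computed
covariance matrices ($\Omega_{r1}$ and $\Omega_{r2}$) into equation (5.3), the estimated coefficient, written here as $\hat\beta$, satisfies

\[
\frac{\sigma^2_{\eta,0}\,\sigma^2_{\varepsilon,0}\,\Phi\alpha}{(1-\alpha\beta)^3}
\left( \hat\beta^2 - \left( \frac{1}{\alpha} + \beta \right)\hat\beta + \frac{\beta}{\alpha} \right) = 0,
\tag{5.20}
\]

where
\[
\Phi = (1+\delta_{\varepsilon,r1})(1+\delta_{\eta,r2}) - (1+\delta_{\varepsilon,r2})(1+\delta_{\eta,r1}).
\]
Note that if $\Phi$ is different from zero, then $\hat\beta$ solves the same quadratic equation as in the original model. $\Phi$ is
different from zero if condition (5.19) is satisfied, and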

\[
\frac{1+\delta_{\eta,r1}}{1+\delta_{\eta,r2}} \neq \frac{1+\delta_{\varepsilon,r1}}{1+\delta_{\varepsilon,r2}}.
\tag{5.21}
\]

Condition (5.21) indicates that the change in the variances across the misspecified regimes cannot be pro-
portional. In other words, this is equivalent to the rank condition discussed before. Again, the two roots
solving equation (5.20) are β and 1/α.
In summary, even though the assumed form of the heteroskedasticity implies a smaller number of regimes
than those exhibited in the data, the system is identified and its estimates are consistent if and only if the
order and rank conditions are satisfied by the misspecified matrices.
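The same numerical check can be run for an under-specified number of regimes. The sketch below is illustrative only: it assumes four true regimes indexed 0 to 3 that are pooled into two windows of two regimes each (one possible reading of the summation ranges above), it uses hypothetical values for the δ's, and it again replaces equation (5.3), which is not reproduced in this section, with the quadratic implied by uncorrelated structural shocks.

```python
import math

ALPHA, BETA = 0.5, -0.3          # hypothetical true coefficients
VAR_ETA_0, VAR_EPS_0 = 1.0, 1.0  # hypothetical regime-0 variances


def mean(values):
    return sum(values) / len(values)


def omega(var_eta, var_eps):
    """Entries (w11, w12, w22) of the bivariate covariance matrix."""
    scale = 1.0 / (1.0 - ALPHA * BETA) ** 2
    return (scale * (ALPHA ** 2 * var_eta + var_eps),
            scale * (ALPHA * var_eta + BETA * var_eps),
            scale * (var_eta + BETA ** 2 * var_eps))


def pooled_omega(delta_eta, delta_eps):
    """Average of the regime covariance matrices inside one window."""
    mats = [omega((1 + d_eta) * VAR_ETA_0, (1 + d_eps) * VAR_EPS_0)
            for d_eta, d_eps in zip(delta_eta, delta_eps)]
    return tuple(mean(entries) for entries in zip(*mats))


def phi(d_eta_r1, d_eta_r2, d_eps_r1, d_eps_r2):
    """Phi as defined below equation (5.20)."""
    return (1 + d_eps_r1) * (1 + d_eta_r2) - (1 + d_eps_r2) * (1 + d_eta_r1)


def roots(om_a, om_b):
    """Roots of the quadratic used here as an assumed stand-in for (5.3)."""
    c1 = om_a[1] * om_b[2] - om_a[2] * om_b[1]
    c2 = om_a[2] * om_b[0] - om_a[0] * om_b[2]
    c3 = om_a[0] * om_b[1] - om_a[1] * om_b[0]
    disc = math.sqrt(c2 * c2 - 4.0 * c3 * c1)
    return sorted([(-c2 - disc) / (2.0 * c3), (-c2 + disc) / (2.0 * c3)])


# Four true regimes, pooled into windows {0, 1} and {2, 3}.
DELTA_ETA = [0.0, 0.5, 2.0, 3.0]
DELTA_EPS = [0.0, 0.2, 0.1, 0.4]

phi_value = phi(mean(DELTA_ETA[:2]), mean(DELTA_ETA[2:]),
                mean(DELTA_EPS[:2]), mean(DELTA_EPS[2:]))
pooled_roots = roots(pooled_omega(DELTA_ETA[:2], DELTA_EPS[:2]),
                     pooled_omega(DELTA_ETA[2:], DELTA_EPS[2:]))
print(phi_value, pooled_roots)
```

Because Φ is different from zero for these values, the two pooled windows still return β and 1/α as the roots, even though only two of the four regimes are distinguished in the estimation.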

It is important to mention that if the number of true regimes is smaller than the number of regimes used
in the estimation, then the system of equations does not satisfy the rank condition. In other words, there are
not enough independent equations to identify the system. It should be clear that in those cases the estimates
are inconsistent, and the confidence intervals are infinitely large.
The two cases analyzed in this section are probably the most common forms of misspecification. However,
they are not exhaustive. Depending on the particular application in which the identification is used, and the
possible misspecification problems that could be encountered, the consistency of the methodology should be
explored further.

FIGURE 5.1. Identification Problem.



References

Andrews, D. W. K., and E. Zivot (1992): “Further Evidence on the Great Crash, the Oil Price Shock,
and the Unit Root Hypothesis,” Journal of Business and Economic Statistics, 10, 251–70.
Bai, J., R. L. Lumsdaine, and J. H. Stock (1998): “Testing for and Dating Common Breaks in Multivariate
Time Series,” Review of Economic Studies, 65, 395–432.
Bai, J. (1997): “Estimation of a Change Point in Multiple Regression Models,” The Review of Economics
and Statistics, pp. 551–563.
Banerjee, A., R. L. Lumsdaine, and J. H. Stock (1992): “Recursive and Sequential Tests of the
Unit Root and Trend Break Hypotheses: Theory and International Evidence,” Journal of Business and
Economic Statistics, 10, 271–88.
Barten, A. P., and L. S. Bronsard (1970): “Two-Stage Least Squares Estimation with Shift in the
Structural Form,” Econometrica, 38(6), 938–941.
Bertola, G., and R. J. Caballero (1992): “Target Zones and Realignments,” American Economic
Review, 82, 520–536.
Blanchard, O., and D. Quah (1989): “The Dynamic Effects of Aggregate Demand and Aggregate Supply
Disturbances,” American Economic Review, 79, 655–73.
Blanchard, O. J., and P. Diamond (1989): “The Beveridge Curve,” Brookings Papers in Economic
Activity, 1, 1–76.
Briggs, W. L., V. E. Henson, and S. F. McCormick (2000): A Multigrid Tutorial. SIAM: Society for
Industrial and Applied Mathematics, second edn.
Caballero, R., and E. Engel (2003): “Adjustment is Much Slower Than You Think,” NBER.
Caporale, G. M., A. Cipollini, and N. Spagnolo (2002): “Testing for Contagion: A Conditional
Correlation Analysis,” CEMFE Mimeo.

Chen, S., and S. Khan (1999): “√n-Consistent Estimation of Heteroskedastic Sample Selection Models,”
University of Rochester, Mimeo.

Chen, S.-S., and C. Engel (2004): “Does ‘Aggregation Bias’ Explain the PPP Puzzle?,” NBER Working
Papers, 10304.
Chong, T. T.-L. (1995): “Partial Parameter Consistency in a Misspecified Structural Change Model,”
Economics Letters, 49(4), 351–57.
(1999): “Asymptotic Distribution of the Sup-Wald Statistic under Specification Errors,” Structural
Change and Economic Dynamics, 10, 421–30.
Chow, G. C. (1960): “Test of Equality Between Sets of Coefficients in Two Linear Regressions,”
Econometrica, 28, 591–605.
Cochrane, J. H., F. A. Longstaff, and P. Santa-Clara (2005): “Two Trees,” working paper, Uni-
versity of Chicago.
Cole, H. L., and M. Obstfeld (1991): “Commodity Trade and International Risk Sharing,” Journal of
Monetary Economics, 28, 3–24.
Dixit, A. (1993): The Art of Smooth Pasting. Harwood Academic Publishers.
Dornbusch, R. (1980): Open Economy Macroeconomics. Basic Books, Inc. Publishers, New York.
(1983): “Real Interest Rates, Home Goods and Optimal External Borrowing,” Journal of Political
Economy, 91, 141–153.
Dufour, J.-M. (1982): “Generalized Chow Tests for Structural Change: A Coordinate-Free Approach,”
International Economic Review, 23, 565–575.
Dufour, J.-M., E. Ghysels, and A. Hall (1994): “Generalized Predictive Tests and Structural Change
Analysis in Econometrics,” International Economic Review, 35(1), 199–227.
Dungey, M., and V. L. Martin (2001): “Contagion Across Financial Markets: An Empirical Assessment,”
Australian National University Mimeo.
Erlat, H. (1983): “A Note on Testing Structural Change in a Single Equation Belonging to a Simultaneous
System,” Economics Letters, 13, 185–89.
Fachini, S. (1989): “A Montecarlo Experiment on the Power of Variable Addition Tests for Parameter
Stability,” Giornale Degli Economisti e Annali di Economia, 48(9-10), 497–506.
Fisher, F. M. (1976): The Identification Problem in Econometrics. Robert E. Krieger Publishing Co., New
York, second edn.
Froot, K. A., and K. Rogoff (1995): “Perspectives on PPP and Long-Run Real Exchange Rates,” in
Handbook of International Economics, ed. by G. M. Grossman, and K. Rogoff, vol. 3 of Handbooks in
Economics, chap. 32, pp. 1647–1729. Elsevier.
Garber, P. M., and L. E. Svensson (1994): “The Operation and Collapse of Fixed Exchange Rate
Regimes,” in Handbook of International Economics, ed. by G. Grossman, and K. Rogoff, vol. 3, chap. 36,
pp. 1865–1911. Elsevier, 1 edn.
Haavelmo, T. (1947): “Methods of Measuring the Marginal Propensity to Consume,” Journal of the Amer-
ican Statistical Association, 42, 105–122.
Harrison, M. (1990): Brownian motion and Stochastic Flow Systems. Krieger Publishing Company.
Helpman, E., and A. Razin (1978): A Theory of International Trade under Uncertainty. Academic Press,
San Diego.

Hodoshima, J. (1988): “Estimation of a Single Structural Equation with Structural Change,” Econometric
Theory, 4, 86–96.

(1992): “Finite-Sample Properties of Single-Equation Estimators under Structural Change,” Journal
of Econometrics, 53, 189–209.
Hogan, V., and R. Rigobon (2003): “Using Unobserved Supply Shocks to Estimate the Returns to Edu-
cation,” NBER working paper 9145.

Imbs, J., H. Mumtaz, M. O. Ravn, and H. Rey (2002): “PPP Strikes Back: Aggregation and the Real
Exchange Rate,” NBER Working Papers, 9372.

Karatzas, I., and S. E. Shreve (1988): Brownian Motion and Stochastic Calculus. Springer-Verlag, New
York.
King, M., E. Sentana, and S. Wadhwani (1994): “Volatility and Links Between National Stock Markets,”
Econometrica, 62, 901–33.

Klein, R., and F. Vella (2000a): “Employing Heteroskedasticity to Identify and Estimate Triangular
Semiparametric Models,” Rutgers mimeo.

(2000b): “Identification and Estimation of the Binary Treatment Model Under Heteroskedasticity,”
Rutgers mimeo.

Koopmans, T., H. Rubin, and R. Leipnik (1950): “Measuring the Equation Systems of Dynamic
Economics,” in Statistical Inference in Dynamic Economic Models, Cowles Commission for Research in
Economics, chap. II, pp. 53–237. John Wiley and Sons, New York.
Lee, H. Y., L. Ricci, and R. Rigobon (2004): “Once Again, is Account Openness Good for Growth?,”
Journal of Development Economics, 75(2), 451–472.
Lo, A. W., and W. K. Newey (1985): “A Large-Sample Chow Test for the Linear Simultaneous Equation,”
Economics Letters, 18, 351–53.
Lumsdaine, R. L., and S. Ng (1999): “Testing for ARCH in the Presence of a Possible Misspecified
Conditional Mean,” Journal of Econometrics, 93, 257–79.
MacKinnon, J. (1989): “Heteroskedasticity Robust Tests for Structural Change,” in Econometrics of Struc-
tural Change, ed. by W. Kramer. Physica-Verlag Heidelberg, Germany.

Obstfeld, M., and K. S. Rogoff (1996): Foundations of International Macroeconomics. The MIT Press,
Cambridge.
Øksendal, B. (2003): Stochastic Differential Equations. 6th edition, Springer-Verlag, Berlin.

Perron, P. (1989): “The Great Crash, the Oil Price Shock and the Unit Root Hypothesis,” Econometrica,
57, 1361–1401.
Rigobon, R. (2000): “A Simple Test for the Stability of Linear Models under Heteroskedasticity, Omitted
Variable, and Endogenous Variable Problems,” MIT Mimeo: http://web.mit.edu/rigobon/www/.

(2002): “The Curse of Non-Investment Grade Countries,” Journal of Development Economics,
69(2), 423–449.
(2003): “On the Measurement of the International Propagation of Shocks: Is the Transmission
Stable?,” Journal of International Economics, 61, 261–283.

Rigobon, R., and D. Rodrik (????): “Rule of Law, Democracy, Openness, and Income: Estimating the
Interrelationships,” NBER.

Rigobon, R., and B. Sack (2003): “Measuring the Reaction of Monetary Policy to the Stock Market,”
Quarterly Journal of Economics, 118, 639–669.
(2004): “The Impact of Monetary Policy on Asset Prices,” Journal of Monetary Economics, 51,
1553–75.

Rigobon, R., and T. Stoker (2003): “Censored Regressors and Expansion Bias,” MIT Mimeo.
Rothenberg, T. J., and P. A. Ruud (1990): “Simultaneous Equations with Covariance Restrictions,”
Journal of Econometrics, 44(1-2), 25–39.
Salter, W. (1959): “Internal and External Balance: The Role of Price Expenditure Effects,” Economic
Record, 35, 226–238.

Sentana, E. (1992): “Identification of Multivariate Conditionally Heteroskedastic Factor Models,” LSE,
FMG Discussion Paper, 139.
Sentana, E., and G. Fiorentini (2001): “Identification, Estimation and Testing of Conditional Het-
eroskedastic Factor Models,” Journal of Econometrics, 102(2), 143–164.
Shapiro, M. D., and M. W. Watson (1988): Sources of Business Cycle Fluctuations. MIT Press, Cam-
bridge, Mass.
Swan, T. W. (1960): “Economic Control in a Dependent Economy,” Economic Record, 36, 51–66.
Uribe, M., and S. Schmitt-Grohe (2003): “Closing Small Open Economy Models,” Journal of Interna-
tional Economics, 61, 163–185.
Wright, P. G. (1928): The Tariff on Animal and Vegetable Oils, The Institute of Economics. The Macmillan
Company, New York.
Zapatero, F. (1995): “Equilibrium Asset Prices and Exchange Rates,” Journal of Economic Dynamics and
Control, 19, 787–811.
