
Part 11: Heterogeneity [ 1/36]

Econometric Analysis of Panel Data


William Greene
Department of Economics
Stern School of Business

Part 11: Heterogeneity [ 2/36]

Agenda

Random Parameter Models
    Fixed effects
    Random effects
Heterogeneity in Dynamic Panels
Random Coefficient Vectors - Classical vs. Bayesian
    General RPM Swamy/Hsiao/Hildreth/Houck
    Hierarchical and Two Step Models
    True Random Parameter Variation
        Discrete Latent Class
        Continuous
            Classical
            Bayesian

Part 11: Heterogeneity [ 3/36]

A Capital Asset Pricing Model


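$\tilde R_{it} = \tilde\gamma_{0t} + \tilde\gamma_{1t}\beta_i + \tilde\gamma_{2t}\beta_i^2 + \tilde\gamma_{3t}s_i + \tilde\eta_{it}$
$\tilde R_{it}$ = one period percentage return
$\tilde\gamma_{0t}$ = expected return on a riskless security (stochastic)
$\tilde\gamma_{1t}$ = expected premium on the 'market' portfolio, $\tilde R_{Mt} - \tilde R_{0t}$
$\tilde\gamma_{2t}$ = "nonlinear" risk effect
$\tilde\gamma_{3t}$ = "nonbeta risk" term
Data are $[\tilde R_{it}, \beta_i, \beta_i^2, s_i]$, generated by auxiliary regressions

Coefficients are 'random' through time.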
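
As a rough illustration of coefficients that are 'random' through time, the sketch below runs one cross-section regression per period and then averages the period coefficients. It is a minimal sketch on simulated data, not the original study's procedure: the function name `fama_macbeth`, the sample sizes, the parameter values, and the assumption that the regressors are the same in every period are all illustrative.

```python
import numpy as np

def fama_macbeth(R, Z):
    """Period-by-period cross-section OLS, then time averages.

    R : (T, N) array of returns; Z : (N, K) regressors, assumed fixed over t.
    Returns the (T, K) period coefficients, their mean, and the standard
    error of the mean based on the time-series variation of the coefficients.
    """
    gammas = np.linalg.lstsq(Z, R.T, rcond=None)[0].T   # row t holds gamma_t
    g_bar = gammas.mean(axis=0)
    se = gammas.std(axis=0, ddof=1) / np.sqrt(gammas.shape[0])
    return gammas, g_bar, se

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
T, N = 120, 50
beta = rng.uniform(0.5, 1.5, size=N)        # stands in for the estimated betas
s = rng.uniform(0.1, 0.3, size=N)           # stands in for the nonbeta risk s_i
Z = np.column_stack([np.ones(N), beta, beta**2, s])
gamma_mean = np.array([0.5, 1.0, 0.0, 0.0])
gamma_t = gamma_mean + rng.normal(scale=0.5, size=(T, 4))   # random through time
R = gamma_t @ Z.T + rng.normal(scale=1.0, size=(T, N))

gammas, g_bar, se = fama_macbeth(R, Z)
print(np.round(g_bar, 3), np.round(se, 3))
```

Because the regressors are held fixed across periods here, the average of the period coefficients coincides with a single regression of time-averaged returns on `Z`; the period-by-period step is what supplies the standard errors.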


Fama - MacBeth, "Risk, Return, and Equilibrium: Empirical
Tests," Journal of Political Economy, 1973.

Part 11: Heterogeneity [ 4/36]

Heterogeneous Production Model


$\text{Health}_{i,t} = \alpha_i + \beta_i \text{HEXP}_{i,t} + \gamma_i \text{EDUC}_{i,t} + \varepsilon_{i,t}$
i = country, t = year
Health = health care outcome, e.g., life expectancy
HEXP = health care expenditure
EDUC = education
Parameter heterogeneity:
Discrete? AIDS dominated vs. QOL dominated
Continuous? Cross cultural heterogeneity
World Health Organization, "The 2000 World Health Report"

Part 11: Heterogeneity [ 5/36]

Parameter Heterogeneity
Unobserved Effects Random Constants
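$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$
$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
$\alpha_i = \alpha + u_i$,
$E[u_i | X_i] = 0$ --> Random effects
$E[u_i | X_i] \neq 0$ --> Fixed effects
$E_X E[u_i | X_i] = 0$.
$\mathrm{Var}[u_i | X_i]$ not yet defined - so far, constant.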

Part 11: Heterogeneity [ 6/36]

Parameter Heterogeneity
Generalize to Random Parameters
$y_{it} = x_{it}'\beta_i + \varepsilon_{it}$
$\beta_i = \beta + u_i$
$E[u_i | X_i]$ zero or nonzero - to be defined
$E_X[E[u_i | X_i]] = 0$
$\mathrm{Var}[u_i | X_i]$ to be defined, constant or variable
"The Pooling Problem": What is the consequence
of estimating under the erroneous assumption of
constant parameters? (Theil, 1960, "The Aggregation
Problem") (Maddala, 1970s - 1990s, "The Pooling
Problem")

Part 11: Heterogeneity [ 7/36]

Fixed Effects
(Hildreth, Houck, Hsiao, Swamy)
$y_{it} = x_{it}'\beta_i + \varepsilon_{it}$, each observation
$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
Assume (temporarily) $T_i > K$.
$E[u_i | X_i] = g(X_i)$ (conditional mean)
$P[u_i | X_i] = (X_i - E[X_i])\theta$ (projection)
$E_X[E[u_i | X_i]] = E_X[P[u_i | X_i]] = 0$
$\mathrm{Var}[u_i | X_i] = \Gamma$, constant but nonzero

Part 11: Heterogeneity [ 8/36]

OLS and GLS Are Inconsistent


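$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
$y_i = X_i\beta + X_iu_i + \varepsilon_i$, $T_i$ observations
$y_i = X_i\beta + w_i$
$E[w_i | X_i] = X_iE[u_i | X_i] + E[\varepsilon_i | X_i] \neq 0$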

Part 11: Heterogeneity [ 9/36]

Estimating the Fixed Effects Model


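$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} =
\begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix} +
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$

Estimator: Equation by equation OLS or (F)GLS

Estimate $\beta$? $\frac{1}{N}\sum_{i=1}^{N}\hat\beta_i$ is consistent for $E[\beta_i]$ in N.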
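
The equation-by-equation estimator and its simple average can be sketched as follows. This is a minimal sketch on simulated data: the helper name `mean_group_ols`, the panel dimensions, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def mean_group_ols(y_list, X_list):
    """OLS unit by unit; returns the (N, K) estimates and their simple average."""
    b = np.array([np.linalg.lstsq(X, y, rcond=None)[0]
                  for y, X in zip(y_list, X_list)])
    return b, b.mean(axis=0)

# Simulated panel (all values are illustrative assumptions).
rng = np.random.default_rng(0)
N, T, K = 200, 12, 2
beta = np.array([1.0, 0.5])                       # E[beta_i]
X_list, y_list = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta + rng.normal(scale=0.3, size=K)  # beta_i = beta + u_i
    y = X @ beta_i + rng.normal(scale=0.5, size=T)
    X_list.append(X)
    y_list.append(y)

b_i, b_bar = mean_group_ols(y_list, X_list)
print(b_bar)   # close to beta when N is large
```

Each unit needs T_i > K for its own regression to be computable, which is why the slides assume this temporarily.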

Part 11: Heterogeneity [ 10/36]

Partial Fixed Effects Model


Some individual specific parameters
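$y_i = D_i\alpha_i + X_i\beta + \varepsilon_i$, $T_i$ observations
Use OLS and Frisch-Waugh
$\hat\beta = [\sum_{i=1}^{N} X_i'M_D^iX_i]^{-1}[\sum_{i=1}^{N} X_i'M_D^iy_i]$, $M_D^i = I - D_i(D_i'D_i)^{-1}D_i'$
$\hat\alpha_i = [D_i'D_i]^{-1}D_i'(y_i - X_i\hat\beta)$
E.g., Individual specific time trends,
$y_{it} = \alpha_{i0} + \alpha_{i1}t + x_{it}'\beta + \varepsilon_{it}$; Detrend individual data, then OLS
E.g., Individual specific constant terms,
$y_{it} = \alpha_{i0} + x_{it}'\beta + \varepsilon_{it}$; Individual group mean deviations, then OLS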
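
The Frisch-Waugh computation for the individual-specific time trend case can be sketched as below. This is an illustration under assumptions: the data are simulated, the panel is balanced so one projection matrix serves every unit, and all names and parameter values are hypothetical. The last lines check the partitioned estimator against brute-force OLS on the full set of unit-specific constants and trends.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 30, 10
beta = np.array([0.8, -0.4])                        # common slopes (assumed values)
t = np.arange(1, T + 1, dtype=float)
D = np.column_stack([np.ones(T), t])                # D_i: constant and time trend
M = np.eye(T) - D @ np.linalg.solve(D.T @ D, D.T)   # M_D = I - D(D'D)^{-1}D'

X_list, y_list = [], []
for _ in range(N):
    X = rng.normal(size=(T, 2)) + 0.1 * t[:, None]  # regressors that also trend
    alpha_i = rng.normal(size=2)                    # unit-specific constant and trend
    y = D @ alpha_i + X @ beta + rng.normal(scale=0.3, size=T)
    X_list.append(X)
    y_list.append(y)

# Partitioned (Frisch-Waugh) estimator of the common slopes
Sxx = sum(X.T @ M @ X for X in X_list)
Sxy = sum(X.T @ M @ y for X, y in zip(X_list, y_list))
b_fw = np.linalg.solve(Sxx, Sxy)

# Recover the unit-specific coefficients: a_i = (D'D)^{-1} D'(y_i - X_i b)
a_fw = np.array([np.linalg.solve(D.T @ D, D.T @ (y - X @ b_fw))
                 for X, y in zip(X_list, y_list)])

# Brute-force check: OLS on [block-diagonal D, stacked X]
Z = np.hstack([np.kron(np.eye(N), D), np.vstack(X_list)])
coef = np.linalg.lstsq(Z, np.concatenate(y_list), rcond=None)[0]
b_full, a_full = coef[-2:], coef[:-2].reshape(N, 2)
print(np.allclose(b_fw, b_full), np.allclose(a_fw, a_full))
```

Multiplying by `M` is the "detrend individual data" step; replacing `D` with a single column of ones gives the group mean deviations case.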

Part 11: Heterogeneity [ 11/36]

Heterogeneous Dynamic Models


$\log Y_{i,t} = \alpha_i + \gamma_i \log Y_{i,t-1} + \beta_i x_{it} + \varepsilon_{i,t}$
long run effect of interest is $\dfrac{\beta_i}{1 - \gamma_i}$
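
A small numerical sketch of the long run effect follows. The unit-level estimates are hypothetical numbers chosen only for illustration. It shows that averaging the unit-specific long run effects is not the same as computing the ratio from averaged coefficients, because the effect is nonlinear in the lag coefficient.

```python
def long_run_effect(beta_i, gamma_i):
    """beta_i / (1 - gamma_i); requires |gamma_i| < 1 for a stable process."""
    if abs(gamma_i) >= 1:
        raise ValueError("gamma_i must satisfy |gamma_i| < 1")
    return beta_i / (1.0 - gamma_i)

# Hypothetical unit-level estimates (beta_i, gamma_i), for illustration only.
estimates = [(0.20, 0.50), (0.30, 0.80), (0.10, 0.20)]

effects = [long_run_effect(b, g) for b, g in estimates]
mean_of_effects = sum(effects) / len(effects)

# Ratio built from the averaged coefficients instead
b_bar = sum(b for b, _ in estimates) / len(estimates)
g_bar = sum(g for _, g in estimates) / len(estimates)
effect_of_means = long_run_effect(b_bar, g_bar)

print(mean_of_effects, effect_of_means)
```

With these numbers the two summaries differ noticeably, since one unit has a lag coefficient close to one.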

See:
Pesaran, H., Smith, R., Im, K., "Estimating Long-Run Relationships
From Dynamic Heterogeneous Panels," Journal of Econometrics, 1995.
(Repeated with further study in Matyas and Sevestre, The
Econometrics of Panel Data.)
Smith, J., notes, Applied Econometrics, Dynamic Panel Data Models,
University of Warwick.
http://www2.warwick.ac.uk/fac/soc/economics/staff/faculty/jennifersmith/panel/
Weinhold, D., "A Dynamic "Fixed Effects" Model for Heterogeneous
Panel Data," London School of Economics, 1999.

Part 11: Heterogeneity [ 12/36]

Random Effects and Random Parameters

THE Random Parameters Model
$y_{it} = x_{it}'\beta_i + \varepsilon_{it}$, each observation
$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
Assume (temporarily) $T_i > K$.
$E[u_i | X_i] = 0$
$\mathrm{Var}[u_i | X_i] = \Gamma$, constant but nonzero

We differentiate the classical and Bayesian interpretations


Randomness here is heterogeneity, not "uncertainty"
Bayesian approach to be considered later.

Part 11: Heterogeneity [ 13/36]

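Estimating the Random Parameters Model

$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
$y_i = X_i\beta + X_iu_i + \varepsilon_i$, $T_i$ observations
$y_i = X_i\beta + w_i$
$E[w_i | X_i] = X_iE[u_i | X_i] + E[\varepsilon_i | X_i] = 0$
$\mathrm{Var}[w_i | X_i] = \sigma^2_{\varepsilon,i}I + X_i\Gamma X_i'$ <== Should $\sigma^2_{\varepsilon,i}$ vary by i?

Objects of estimation: $\beta$, $\sigma^2_{\varepsilon,i}$, $\Gamma$
Second level estimation: $\beta_i$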

Part 11: Heterogeneity [ 14/36]

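Estimating the Random Parameters Model by OLS

$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
$y_i = X_i\beta + X_iu_i + \varepsilon_i$, $T_i$ observations
$y_i = X_i\beta + w_i$
$b = [\sum_{i=1}^{N}X_i'X_i]^{-1}[\sum_{i=1}^{N}X_i'y_i] = \beta + [\sum_{i=1}^{N}X_i'X_i]^{-1}\sum_{i=1}^{N}X_i'w_i$
$\mathrm{Var}[b | X] = [\sum_{i=1}^{N}X_i'X_i]^{-1}[\sum_{i=1}^{N}X_i'(X_i\Gamma X_i' + \sigma^2_\varepsilon I)X_i][\sum_{i=1}^{N}X_i'X_i]^{-1}$
$= \sigma^2_\varepsilon[\sum_{i=1}^{N}X_i'X_i]^{-1} + [\sum_{i=1}^{N}X_i'X_i]^{-1}[\sum_{i=1}^{N}(X_i'X_i)\Gamma(X_i'X_i)][\sum_{i=1}^{N}X_i'X_i]^{-1}$
= the usual + the variation due to the random parameters
Robust estimator
$\mathrm{Est.Var}[b] = [\sum_{i=1}^{N}X_i'X_i]^{-1}[\sum_{i=1}^{N}X_i'\hat w_i\hat w_i'X_i][\sum_{i=1}^{N}X_i'X_i]^{-1}$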
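
The robust estimator can be sketched in a few lines. This is an illustrative sketch on simulated data: the function name `pooled_ols_robust` and all parameter values are assumptions. It also computes the conventional $s^2(X'X)^{-1}$ for comparison, which leaves out the variation due to the random parameters.

```python
import numpy as np

def pooled_ols_robust(y_list, X_list):
    """Pooled OLS b and the group-level sandwich estimator of Var[b]."""
    Sxx = sum(X.T @ X for X in X_list)
    Sxy = sum(X.T @ y for X, y in zip(X_list, y_list))
    b = np.linalg.solve(Sxx, Sxy)
    meat = np.zeros_like(Sxx)
    for X, y in zip(X_list, y_list):
        g = X.T @ (y - X @ b)            # X_i' w_hat_i
        meat += np.outer(g, g)           # X_i' w_hat_i w_hat_i' X_i
    bread = np.linalg.inv(Sxx)
    return b, bread @ meat @ bread

# Simulated random parameters panel (illustrative values).
rng = np.random.default_rng(2)
N, T = 300, 10
beta = np.array([1.0, 0.5])
X_list, y_list = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    u = rng.normal(scale=0.5, size=2)    # Var[u_i] = 0.25 I (assumed)
    y = X @ (beta + u) + rng.normal(scale=0.5, size=T)
    X_list.append(X)
    y_list.append(y)

b, V_robust = pooled_ols_robust(y_list, X_list)

# Conventional s^2 (X'X)^{-1}, which ignores the X_i u_i part of w_i
Sxx = sum(X.T @ X for X in X_list)
resid = np.concatenate([y - X @ b for X, y in zip(X_list, y_list)])
V_conv = resid @ resid / (N * T - 2) * np.linalg.inv(Sxx)

print(np.sqrt(np.diag(V_robust)), np.sqrt(np.diag(V_conv)))
```

In this design the conventional standard errors come out well below the robust ones, which is the practical consequence of ignoring the parameter variation.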

Part 11: Heterogeneity [ 15/36]

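Estimating the Random Parameters Model by GLS

$y_i = X_i\beta_i + \varepsilon_i$, $T_i$ observations
$\beta_i = \beta + u_i$
$y_i = X_i\beta + X_iu_i + \varepsilon_i$, $T_i$ observations
$y_i = X_i\beta + w_i$, $\mathrm{Var}[w_i | X_i] = \Omega_i = (X_i\Gamma X_i' + \sigma^2_{\varepsilon,i}I)$
$\hat\beta = [\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_i]^{-1}[\sum_{i=1}^{N}X_i'\Omega_i^{-1}y_i]$
For FGLS, we need $\hat\Gamma$ and $\hat\sigma^2_{\varepsilon,i}$.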

Part 11: Heterogeneity [ 16/36]

Estimating the RPM


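$b_i = (X_i'X_i)^{-1}X_i'y_i = \beta + (X_i'X_i)^{-1}X_i'w_i$, $w_i = X_iu_i + \varepsilon_i$
$= \beta + u_i + (X_i'X_i)^{-1}X_i'\varepsilon_i$
$\mathrm{Var}[b_i | X_i] = \Gamma + \sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$
$\hat\sigma^2_{\varepsilon,i} = \dfrac{\sum_{t=1}^{T_i}(y_{it} - x_{it}'b_i)^2}{T_i - K}$ is unbiased

(but not consistent because $T_i$ is fixed).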

Part 11: Heterogeneity [ 17/36]

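An Estimator for $\Gamma$
$E[b_i | X_i] = \beta$
$\mathrm{Var}[b_i | X_i] = \Gamma + \sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$
$\mathrm{Var}[b_i] = \mathrm{Var}_X E[b_i | X_i] + E_X \mathrm{Var}[b_i | X_i]$
$= 0 + E_X[\Gamma + \sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}]$
$= \Gamma + E_X[\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}]$
Estimate $\mathrm{Var}[b_i]$ with $\frac{1}{N}\sum_{i=1}^{N}(b_i - \bar b)(b_i - \bar b)'$
Estimate $E_X[\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}]$ with $\frac{1}{N}\sum_{i=1}^{N}\hat\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$
$\hat\Gamma = \frac{1}{N}\sum_{i=1}^{N}(b_i - \bar b)(b_i - \bar b)' - \frac{1}{N}\sum_{i=1}^{N}\hat\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$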

Part 11: Heterogeneity [ 18/36]

A Positive Definite Estimator for $\Gamma$

$\hat\Gamma = \frac{1}{N}\sum_{i=1}^{N}(b_i - \bar b)(b_i - \bar b)' - \frac{1}{N}\sum_{i=1}^{N}\hat\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$
May not be positive definite. What to do?
(1) The second term converges (in theory) to 0 in Ti. Drop it.
(2) Various Bayesian "shrinkage" estimators,
(3) An ML estimator
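
The moment estimator and remedy (1) can be sketched as follows. This is a minimal sketch on simulated data: the function name `swamy_gamma`, the panel dimensions, and the true variance values are illustrative assumptions. The sketch falls back to the first term alone whenever the difference is not positive definite.

```python
import numpy as np

def swamy_gamma(y_list, X_list):
    """Moment estimator of Gamma from unit-by-unit OLS.

    Returns (gamma_hat, b, used_fallback). If the difference of the two terms
    is not positive definite, the second term is dropped (remedy (1)).
    """
    N, K = len(y_list), X_list[0].shape[1]
    b = np.empty((N, K))
    V_sum = np.zeros((K, K))
    for i, (y, X) in enumerate(zip(y_list, X_list)):
        XtX_inv = np.linalg.inv(X.T @ X)
        b[i] = XtX_inv @ X.T @ y
        e = y - X @ b[i]
        s2 = e @ e / (len(y) - K)        # unbiased estimate of sigma^2_{eps,i}
        V_sum += s2 * XtX_inv
    dev = b - b.mean(axis=0)
    S_b = dev.T @ dev / N                # first term
    gamma_hat = S_b - V_sum / N          # first term minus second term
    used_fallback = bool(np.linalg.eigvalsh(gamma_hat).min() <= 0)
    if used_fallback:
        gamma_hat = S_b
    return gamma_hat, b, used_fallback

# Simulated panel; Gamma = diag(0.09, 0.04) is an assumed value.
rng = np.random.default_rng(3)
N, T = 400, 15
beta = np.array([1.0, 0.5])
Gamma = np.diag([0.09, 0.04])
X_list, y_list = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    u = rng.multivariate_normal(np.zeros(2), Gamma)
    y = X @ (beta + u) + rng.normal(scale=0.5, size=T)
    X_list.append(X)
    y_list.append(y)

gamma_hat, b, used_fallback = swamy_gamma(y_list, X_list)
print(np.round(gamma_hat, 3), used_fallback)
```

The fallback is only one of the three options listed; a shrinkage or ML estimator would need a different implementation.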

Part 11: Heterogeneity [ 19/36]

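Estimating $\beta_i$

$\hat\beta_{GLS} = \sum_{i=1}^{N}W_ib_{i,OLS}$
$W_i = \left\{\sum_{j=1}^{N}[\Gamma + \sigma^2_{\varepsilon,j}(X_j'X_j)^{-1}]^{-1}\right\}^{-1}[\Gamma + \sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}]^{-1}$

Best linear unbiased predictor based on GLS is

$\hat\beta_i = A_i\hat\beta_{GLS} + (I - A_i)b_{i,OLS} = b_{i,OLS} + A_i(\hat\beta_{GLS} - b_{i,OLS})$
$A_i = \{\Gamma^{-1} + [\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}]^{-1}\}^{-1}\Gamma^{-1}$

$$\mathrm{Var}[\hat\beta_i \mid \text{all data}] = [A_i \;\; (I - A_i)]
\begin{bmatrix} \mathrm{Var}[\hat\beta_{GLS}] & W_i\mathrm{Var}[b_{i,OLS}] \\ \mathrm{Var}[b_{i,OLS}]W_i' & \mathrm{Var}[b_{i,OLS}] \end{bmatrix}
\begin{bmatrix} A_i' \\ (I - A_i)' \end{bmatrix}$$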
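
The GLS weights and the predictor of the unit-specific coefficients can be sketched as below. This is an illustration under assumptions: the function name `rpm_gls_and_blup` is hypothetical, the input numbers are made up, and $\Gamma$ and the $\sigma^2_{\varepsilon,i}(X_i'X_i)^{-1}$ matrices are treated as known, whereas in practice they would be replaced by estimates.

```python
import numpy as np

def rpm_gls_and_blup(b, V, gamma):
    """GLS estimate of beta and predictions of beta_i.

    b     : (N, K) unit-by-unit OLS estimates
    V     : (N, K, K) matrices sigma^2_{eps,i} (X_i'X_i)^{-1}
    gamma : (K, K) variance of the random parameters
    Returns beta_gls (K,), W (N, K, K), beta_i (N, K).
    """
    N, K = b.shape
    P = np.array([np.linalg.inv(gamma + V[i]) for i in range(N)])
    P_sum_inv = np.linalg.inv(P.sum(axis=0))
    W = np.array([P_sum_inv @ P[i] for i in range(N)])     # weights sum to I
    beta_gls = sum(W[i] @ b[i] for i in range(N))
    gamma_inv = np.linalg.inv(gamma)
    beta_i = np.empty_like(b)
    for i in range(N):
        # A_i = {Gamma^{-1} + V_i^{-1}}^{-1} Gamma^{-1}
        A = np.linalg.solve(gamma_inv + np.linalg.inv(V[i]), gamma_inv)
        beta_i[i] = b[i] + A @ (beta_gls - b[i])
    return beta_gls, W, beta_i

# Hypothetical inputs for three units (illustration only).
b = np.array([[1.0, 0.4], [1.4, 0.6], [0.6, 0.5]])
V = np.array([0.05 * np.eye(2), 0.10 * np.eye(2), 0.20 * np.eye(2)])
gamma = np.array([[0.09, 0.0], [0.0, 0.04]])

beta_gls, W, beta_i = rpm_gls_and_blup(b, V, gamma)
print(np.round(beta_gls, 4))
print(np.round(beta_i, 4))
```

Units whose OLS estimates are less precise (larger `V[i]`) get a smaller GLS weight and are pulled more strongly toward the GLS estimate.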

Part 11: Heterogeneity [ 20/36]

Baltagi and Griffin's Gasoline Data


World Gasoline Demand Data, 18 OECD Countries, 19 years
Variables in the file are
COUNTRY = name of country
YEAR = year, 1960-1978
LGASPCAR = log of consumption per car
LINCOMEP = log of per capita income
LRPMG = log of real price of gasoline
LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on
which the analysis is based is Baltagi, B. and Griffin, J., "Gasoline
Demand in the OECD: An Application of Pooling and Testing
Procedures," European Economic Review, 22, 1983, pp. 117-137. The
data were downloaded from the website for Baltagi's text.

Part 11: Heterogeneity [ 21/36]

OLS and FGLS Estimates


+----------------------------------------------------+
| Overall OLS results for pooled sample.             |
| Residuals   Sum of squares        =   14.90436     |
|             Standard error of e   =   .2099898     |
| Fit         R-squared             =   .8549355     |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant     2.39132562       .11693429    20.450    .0000
 LINCOMEP      .88996166       .03580581    24.855    .0000
 LRPMG        -.89179791       .03031474   -29.418    .0000
 LCARPCAP     -.76337275       .01860830   -41.023    .0000
+------------------------------------------------+
| Random Coefficients Model                      |
| Residual standard deviation      =       .3498 |
| R squared                        =       .5976 |
| Chi-squared for homogeneity test =    22202.43 |
| Degrees of freedom               =          68 |
| Probability value for chi-squared=     .000000 |
+------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 CONSTANT     2.40548802       .55014979     4.372    .0000
 LINCOMEP      .39314902       .11729448     3.352    .0008
 LRPMG        -.24988767       .04372201    -5.715    .0000
 LCARPCAP     -.44820927       .05416460    -8.275    .0000

Part 11: Heterogeneity [ 22/36]

Country Specific Estimates

Part 11: Heterogeneity [ 23/36]

Estimated

Part 11: Heterogeneity [ 24/36]

Two Step Estimation (Saxonhouse)


A Fixed Effects Model
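$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
Secondary Model
$\alpha_i = z_i'\delta$
Two approaches
(1) Reduced form is a linear model with time constant $z_i$
$y_{it} = x_{it}'\beta + z_i'\delta + \varepsilon_{it}$
(2) Two step
(a) FEM at step 1
(b) $a_i = \alpha_i + (a_i - \alpha_i) = z_i'\delta + v_i$
$\mathrm{Var}[v_i] = \sigma^2_\varepsilon\left[\dfrac{1}{T_i} + \bar x_i'\left(\sum_{j=1}^{N}X_j'M_D^jX_j\right)^{-1}\bar x_i\right]$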

Use weighted least squares regression of ai on zi
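
The two step approach can be sketched on simulated data as follows. All names and values are illustrative assumptions, and the weights use the variance of $a_i$ up to the common factor $\sigma^2_\varepsilon$, namely $1/T_i$ plus the quadratic form in the group means evaluated at the inverse of the pooled within-groups moment matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 150, 1
delta = np.array([1.0, 2.0])                  # assumed secondary-model coefficients
beta = np.array([0.5])                        # assumed common slope
T_i = rng.integers(4, 12, size=N)             # unbalanced panel, 4 to 11 periods
z = np.column_stack([np.ones(N), rng.normal(size=N)])
alpha = z @ delta                             # alpha_i = z_i' delta

groups = []
for i in range(N):
    X = rng.normal(size=(T_i[i], K)) + 0.5 * z[i, 1]   # x correlated with z_i
    y = alpha[i] + X @ beta + rng.normal(scale=0.4, size=T_i[i])
    groups.append((y, X))

# Step 1: within (fixed effects) estimator, then a_i = ybar_i - xbar_i' b
Sxx = sum((X - X.mean(0)).T @ (X - X.mean(0)) for _, X in groups)
Sxy = sum((X - X.mean(0)).T @ (y - y.mean()) for y, X in groups)
b = np.linalg.solve(Sxx, Sxy)
a = np.array([y.mean() - X.mean(0) @ b for y, X in groups])

# Var[v_i] / sigma^2 = 1/T_i + xbar_i' (sum_j X_j' M_D X_j)^{-1} xbar_i
Sxx_inv = np.linalg.inv(Sxx)
h = np.array([1.0 / len(y) + X.mean(0) @ Sxx_inv @ X.mean(0) for y, X in groups])

# Step 2: weighted least squares of a_i on z_i with weights 1/h_i
w = 1.0 / np.sqrt(h)
delta_hat = np.linalg.lstsq(z * w[:, None], a * w, rcond=None)[0]
print(np.round(b, 3), np.round(delta_hat, 3))
```

The weighting matters mainly when $T_i$ varies a lot across units; with a balanced panel and similar group means it is close to an unweighted regression.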

Part 11: Heterogeneity [ 25/36]

A Hierarchical Model
Fixed Effects Model
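$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
Secondary Model
$\alpha_i = z_i'\delta + u_i$ <========
Two approaches
(1) Reduced form is an REM with time constant $z_i$
$y_{it} = x_{it}'\beta + z_i'\delta + u_i + \varepsilon_{it}$
(2) Two step
(a) FEM at step 1
(b) $a_i = \alpha_i + (a_i - \alpha_i) = z_i'\delta + u_i + v_i$
$\mathrm{Var}[u_i + v_i] = \sigma^2_u + \sigma^2_\varepsilon\left[\dfrac{1}{T_i} + \bar x_i'\left(\sum_{j=1}^{N}X_j'M_D^jX_j\right)^{-1}\bar x_i\right]$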

Part 11: Heterogeneity [ 26/36]

Analysis of Fannie Mae

Fannie Mae
The Funding Advantage
The Pass Through

Passmore, W., Sherlund, S., Burgess, G.,
"The Effect of Housing Government-Sponsored
Enterprises on Mortgage Rates," 2005,
Federal Reserve Board and Real Estate
Economics

Part 11: Heterogeneity [ 27/36]

Two Step Analysis of Fannie-Mae


Fannie Mae's GSE Funding Advantage and Pass Through
$RM_{i,s,t} = \beta_{0s,t} + \beta_{1s,t}'LTV_{i,s,t} + \beta_{2s,t}Small_{i,s,t} + \beta_{3s,t}Fees_{i,s,t}$
$\qquad + \beta_{4s,t}New_{i,s,t} + \beta_{5s,t}MtgCo_{i,s,t} + \alpha_{s,t}J_{i,s,t} + \varepsilon_{i,s,t}$

i, s, t = individual, state, month
1,036,252 observations in 370 state-months.
RM = mortgage rate
LTV = 3 dummy variables for loan to value
Small = dummy variable for small loan
Fees = dummy variable for whether fees paid up front
New = dummy variable for new home
MtgCo = dummy variable for mortgage company
J = dummy variable for whether this is a JUMBO loan
THIS IS THE COEFFICIENT OF INTEREST.

Part 11: Heterogeneity [ 28/36]

Average of 370 First Step Regressions

Symbol   Variable      Mean   S.D.   Coeff   S.E.
RM       Rate %        7.23   0.79
Jumbo                  0.06   0.23   0.16    0.05
LTV1     75%-80%       0.36   0.48   0.04    0.04
LTV2     81%-90%       0.15   0.35   0.17    0.05
LTV3     >90%          0.22   0.41   0.15    0.04
New      New Home      0.17   0.38   0.05    0.04
Small    < $100,000    0.27   0.44   0.14    0.04
Fees     Fees paid     0.62   0.52   0.06    0.03
MtgCo    Mtg. Co.      0.67   0.47   0.12    0.05
R2 = 0.77

Part 11: Heterogeneity [ 29/36]

Second Step
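$\alpha_{s,t} = \lambda_0$ +
$\lambda_1$ GSE Funding Advantage$_{s,t}$ - estimated separately
$\lambda_2$ Risk free cost of credit$_{s,t}$
$\lambda_3$ Corporate debt spreads$_{s,t}$ - estimated 4 different ways
$\lambda_4$ Prepayment spread$_{s,t}$
$\lambda_5$ Maturity mismatch risk$_{s,t}$
$\lambda_6$ Aggregate Demand$_{s,t}$
$\lambda_7$ Long term interest rate$_{s,t}$
$\lambda_8$ Market Capacity$_{s,t}$
$\lambda_9$ Time trend$_{s,t}$
$\lambda_{10}$-$\lambda_{13}$ 4 dummy variables for CA, NJ, MD, VA$_{s,t}$
$\lambda_{14}$-$\lambda_{16}$ 3 dummy variables for calendar quarters$_{s,t}$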

Part 11: Heterogeneity [ 30/36]

Estimates of $\lambda_1$
Second step based on 370 observations. Corrected for
"heteroscedasticity, autocorrelation, and monthly clustering."
Four estimates based on different estimates of corporate
credit spread:
0.07 (0.11) 0.31 (0.11) 0.17 (0.10) 0.10 (0.11)
Reconcile the 4 estimates with a minimum distance estimator
Minimize $d'\Omega^{-1}d$, where $d = [(\hat\lambda_1^1 - \lambda_1), (\hat\lambda_1^2 - \lambda_1), (\hat\lambda_1^3 - \lambda_1), (\hat\lambda_1^4 - \lambda_1)]'$
Estimated mortgage rate reduction: About 16 basis points. .16%.

Part 11: Heterogeneity [ 31/36]

The Minimum Distance Estimator


0.07 (0.11)   0.31 (0.11)   0.17 (0.10)   0.10 (0.11)

Reconcile the 4 estimates with a minimum distance estimator

Minimize $d'\Omega^{-1}d$, where $d = [(\hat\lambda_1^1 - \lambda_1), (\hat\lambda_1^2 - \lambda_1), (\hat\lambda_1^3 - \lambda_1), (\hat\lambda_1^4 - \lambda_1)]'$

$\hat\lambda_1 = \dfrac{.07/.11^2}{(1/.11^2) + (1/.11^2) + (1/.10^2) + (1/.11^2)} + \dfrac{.31/.11^2}{(1/.11^2) + (1/.11^2) + (1/.10^2) + (1/.11^2)} + \dots$

Approximately .17%.
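
The arithmetic can be checked with a few lines. This sketch assumes the four estimates are treated as uncorrelated, so that $\Omega$ is diagonal and the minimum distance estimator reduces to an inverse-variance weighted average; the variable names are illustrative.

```python
# The four second-step estimates and their standard errors, as listed above
estimates = [0.07, 0.31, 0.17, 0.10]
std_errors = [0.11, 0.11, 0.10, 0.11]

# With a diagonal weighting matrix the minimizer is the precision-weighted mean
weights = [1.0 / se**2 for se in std_errors]
mde = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
mde_se = (1.0 / sum(weights)) ** 0.5

print(round(mde, 3), round(mde_se, 3))
```

Under that assumption the weighted average comes to roughly 0.163, that is, about 16 basis points. A full minimum distance estimator would also use the covariances among the four estimates, which are not given here.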

Part 11: Heterogeneity [ 32/36]

A Hierarchical Linear Model


German Health Data
$Hsat_{it} = \beta_1 + \beta_2 AGE_{it} + \gamma_i EDUC_{it} + \beta_4 MARRIED_{it} + \varepsilon_{it}$
$\gamma_i = \alpha_1 + \alpha_2 FEMALE_i + u_i$
Sample ; all$
Reject ; _Groupti < 7 $
Regress ; Lhs = newhsat ; Rhs = one,age,educ,married
; RPM = female ; Fcn = educ(n)
; pts = 25 ; halton
; pds = _groupti ; Parameters$
Sample ; 1 887 $
Create ; betaeduc = beta_i $
Dstat ; rhs = betaeduc $
Histogram ; Rhs = betaeduc $

Part 11: Heterogeneity [ 33/36]

OLS Results
OLS Starting values for random parameters model...
Ordinary     least squares regression ............
LHS=NEWHSAT  Mean                 =        6.69641
             Standard deviation   =        2.26003
             Number of observs.   =           6209
Model size   Parameters           =              4
             Degrees of freedom   =           6205
Residuals    Sum of squares       =    29671.89461
             Standard error of e  =        2.18676
Fit          R-squared            =         .06424
             Adjusted R-squared   =         .06378
Model test   F[  3,  6205] (prob) =   142.0(.0000)
--------+---------------------------------------------------------
        |                  Standard            Prob.      Mean
 NEWHSAT|  Coefficient       Error       z    z>|Z|       of X
--------+---------------------------------------------------------
Constant|    7.02769***      .22099    31.80  .0000
     AGE|    -.04882***      .00307   -15.90  .0000     44.3352
 MARRIED|     .29664***      .07701     3.85  .0001      .84539
    EDUC|     .14464***      .01331    10.87  .0000     10.9409
--------+---------------------------------------------------------

Part 11: Heterogeneity [ 34/36]

Maximum Simulated Likelihood


Normal exit:  27 iterations. Status=0. F=    12584.28
------------------------------------------------------------------
Random Coefficients  LinearRg Model
Dependent variable              NEWHSAT
Log likelihood function    -12583.74717
Estimation based on N =   6209, K =   7
Unbalanced panel has    887 individuals
LINEAR regression model
Simulation based on   25 Halton draws
--------+---------------------------------------------------------
        |                  Standard            Prob.      Mean
 NEWHSAT|  Coefficient       Error       z    z>|Z|       of X
--------+---------------------------------------------------------
        |Nonrandom parameters
Constant|    7.34576***      .15415    47.65  .0000
     AGE|    -.05878***      .00206   -28.56  .0000     44.3352
 MARRIED|     .23427***      .05034     4.65  .0000      .84539
        |Means for random parameters
    EDUC|     .16580***      .00951    17.43  .0000     10.9409
        |Scale parameters for dists. of random parameters
    EDUC|    1.86831***      .00179  1044.68  .0000
        |Heterogeneity in the means of random parameters
cEDU_FEM|    -.03493***      .00379    -9.21  .0000
        |Variance parameter given is sigma
Std.Dev.|    1.58877***      .00954   166.45  .0000
--------+---------------------------------------------------------

Part 11: Heterogeneity [ 35/36]

Individual Coefficients

--> Sample ; 1 - 887 $
--> create ; betaeduc = beta_i $
--> dstat ; rhs = betaeduc $
Descriptive Statistics
All results based on nonmissing observations.
==============================================================================
Variable      Mean       Std.Dev.     Minimum      Maximum     Cases Missing
==============================================================================
All observations in current sample
--------+---------------------------------------------------------------------
BETAEDUC|   .161184      .132334     -.268006      .506677       887       0

[Figure: histogram of BETAEDUC; vertical axis Frequency, horizontal axis BETAEDUC from -.268 to .507]

Part 11: Heterogeneity [ 36/36]

A Hierarchical Linear Model

A hedonic model of house values


Beron, K., Murdoch, J., Thayer, M.,
Hierarchical Linear Models with
Application to Air Pollution in the South
Coast Air Basin, American Journal of
Agricultural Economics, 81, 5, 1999.

Part 11: Heterogeneity [ 37/36]

HLM
$y_{ijk}$ = log of home sale price i, neighborhood j, community k.
$y_{ijk} = \sum_{m=1}^{M}\beta_{jk}^{m}x_{ijk}^{m} + \varepsilon_{ijk}$ (linear regression model)
$x_{ijk}^{m}$ = sq.ft, #baths, lot size, central heat, AC, pool, good view,
age, distance to beach
Random coefficients
$\beta_{jk}^{m} = \sum_{q=1}^{Q}\gamma_{j}^{qm}N_{jk}^{q} + w_{jk}^{m}$
$N_{jk}^{q}$ = %population poor, race mix, avg age, avg. travel to work,
FBI crime index, school avg. CA achievement test score
$\gamma_{j}^{qm} = \sum_{s=1}^{S}\delta_{s}^{qm}E_{j}^{s} + v_{j}^{qm}$
$E_{j}^{s}$ = air quality measure, visibility
