Regression Analysis Tutorial

LECTURE / DISCUSSION Weighted Least Squares

Econometrics Laboratory • University of California at Berkeley • 22-26 March 1999


Introduction
In a regression problem with time series data (where the variables have subscript "t" denoting the time the variable was observed), it is common for the error terms to be correlated across time, but with a constant variance; this is the problem of "autocorrelated disturbances," which will be considered in the next lecture. For regressions with cross-section data (where the subscript "i" now denotes a particular individual or firm at a point in time), it is usually safe to assume the errors are uncorrelated, but often their variances are not constant across individuals. This is known as the problem of heteroskedasticity (for "unequal scatter"); the usual assumption of constant error variance is referred to as homoskedasticity. Although the mean of the dependent variable might be a linear function of the regressors, the variance of the error terms might also depend on those same regressors, so that the observations might "fan out" in a scatter diagram, as illustrated in the following diagrams.


[Scatter diagrams of Y against X illustrating homoskedasticity (constant spread), increasing heteroskedasticity (spread grows with X), and "U-shaped" heteroskedasticity.]


Assumptions of Heteroskedastic Linear Model


• $y_i = \alpha + \beta x_i + \epsilon_i$ [simple linear model] or $y_i = \sum_{j=1}^{K} \beta_j x_{ij} + \epsilon_i$ [multiple regression model];
• $E(\epsilon_i) = 0$ [zero mean error terms];
• $\mathrm{Cov}(\epsilon_i, \epsilon_{i'}) = 0$ if $i \neq i'$ [no serial correlation]; and
• $\mathrm{Var}(\epsilon_i) = \sigma_i^2 = \sigma^2 h_i$, some $h_i$ [heteroskedasticity].

Sometimes also assume

• $\epsilon_i$ normally distributed [optional].
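As a concrete illustration of the assumptions above, the following sketch simulates data from the simple heteroskedastic linear model. The parameter values and the choice $h_i = x_i^2$ are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
alpha, beta, sigma = 1.0, 2.0, 0.5            # illustrative "true" parameters
x = rng.uniform(1.0, 5.0, size=n)             # regressor
h = x**2                                      # hypothetical variance factor h_i
eps = rng.normal(0.0, sigma * np.sqrt(h))     # E(eps_i) = 0, Var(eps_i) = sigma^2 * h_i
y = alpha + beta * x + eps                    # mean of y is linear in x, variance is not constant
```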


Examples of Heteroskedastic Models


1. Grouped (Aggregate) Data
For individual "i" in group "s" (i.e., state, region, time period),

$y_{is} = \alpha + \beta x_{is} + \epsilon_{is}$, with $\mathrm{Var}(\epsilon_{is}) = \sigma^2$, etc.

However, we only observe some group averages:

$\bar{y}_s = \frac{1}{n_s} \sum_{i=1}^{n_s} y_{is}$,  $\bar{x}_s = \frac{1}{n_s} \sum_{i=1}^{n_s} x_{is}$.

Then

$\bar{y}_s = \alpha + \beta \bar{x}_s + \bar{\epsilon}_s$, with $\mathrm{Var}(\bar{\epsilon}_s) = \sigma^2 \cdot \frac{1}{n_s} = \sigma^2 h_s$.
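A quick numerical check of the variance formula for group averages; the group sizes below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
for ns in [5, 20, 100]:                            # hypothetical group sizes n_s
    # simulate many group means of n_s iid errors with variance sigma^2
    eps_bar = rng.normal(0.0, sigma, size=(50_000, ns)).mean(axis=1)
    print(ns, eps_bar.var(), sigma**2 / ns)        # empirical variance vs. sigma^2 / n_s
```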


2. Random Coefficients
Both the intercept and slope vary (randomly) across i,

$y_i \equiv \alpha_i + \beta_i x_i$,

where $E(\alpha_i) \equiv \alpha$, $\mathrm{Var}(\alpha_i) \equiv \sigma_\alpha^2$, $E(\beta_i) \equiv \beta$, $\mathrm{Var}(\beta_i) \equiv \sigma_\beta^2$, $\mathrm{Cov}(\alpha_i, \beta_i) \equiv \sigma_{\alpha\beta}$, so that

$y_i \equiv \alpha + \beta x_i + \epsilon_i$,

with $\epsilon_i \equiv (\alpha_i - \alpha) + (\beta_i - \beta) x_i$, which has $E(\epsilon_i) \equiv 0$ and

$\mathrm{Var}(\epsilon_i) = \sigma_i^2 = \sigma_\alpha^2 + 2\sigma_{\alpha\beta} x_i + \sigma_\beta^2 x_i^2 = \sigma_\alpha^2 (1 + \delta_1 x_i + \delta_2 x_i^2) \equiv \sigma^2 h_i$.


3. Variance Proportional to Square of Mean


$y_i = \alpha + \beta x_i + \epsilon_i$, with

$\mathrm{Var}(\epsilon_i) = \sigma^2 (\alpha + \beta x_i)^2 \equiv \sigma^2 h_i$,

so that a larger variance is associated with a larger mean.


Properties of Classical Least Squares Under Heteroskedasticity


• Least squares estimators of $\alpha$ and $\beta$ are still unbiased and consistent;
• Least squares estimators are no longer efficient, i.e., they are no longer the best linear unbiased estimators; and
• The usual estimators for the standard errors of least squares are biased, so the usual confidence intervals and test statistics are incorrect, and may lead to incorrect conclusions.
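A small Monte Carlo along the lines below illustrates the first and third points: with an error standard deviation proportional to $x_i$ (an illustrative choice, not from the lecture), the least squares slope stays centered on the true $\beta$, but the usual standard error formula misstates its actual sampling variability.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 5000
alpha, beta = 1.0, 2.0
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

slopes, usual_ses = [], []
for _ in range(reps):
    eps = rng.normal(0.0, x)                        # error sd proportional to x: heteroskedastic
    y = alpha + beta * x + eps
    coef = XtX_inv @ X.T @ y                        # classical least squares
    e = y - X @ coef
    s2 = e @ e / (n - 2)
    slopes.append(coef[1])
    usual_ses.append(np.sqrt(s2 * XtX_inv[1, 1]))   # usual (homoskedastic) standard error

print(np.mean(slopes))                              # close to beta = 2: still unbiased
print(np.std(slopes), np.mean(usual_ses))           # true spread vs. the usual (biased) estimate
```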


Approaches to Dealing with Heteroskedasticity


• For known heteroskedasticity (e.g., grouped data with known group sizes), use weighted least squares (WLS) to obtain efficient unbiased estimates;
• Test for heteroskedasticity of a special form using a squared residual regression;
• Estimate the unknown heteroskedasticity parameters using this squared residual regression, then use the estimated variances in the WLS formula to get efficient estimates of the regression coefficients (known as feasible WLS); or
• Stick with the (inefficient) least squares estimators, but get estimates of the standard errors which are correct under arbitrary heteroskedasticity.


Correction for Heteroskedasticity of Known Form


If $\mathrm{Var}(\epsilon_i) = \sigma^2 h_i$, where $h_i$ is known (e.g., grouped data), then

$y_i = \alpha + \beta x_i + \epsilon_i$

implies

$\frac{y_i}{\sqrt{h_i}} = \alpha \cdot \frac{1}{\sqrt{h_i}} + \beta \cdot \frac{x_i}{\sqrt{h_i}} + \frac{\epsilon_i}{\sqrt{h_i}}$,  or  $y_i^* = \alpha z_i^* + \beta x_i^* + \epsilon_i^*$,

with $y_i^* \equiv y_i / \sqrt{h_i}$, $z_i^* \equiv 1 / \sqrt{h_i}$, etc. Since $\mathrm{Var}(\epsilon_i^*) = \mathrm{Var}(\epsilon_i) / h_i = \sigma^2$, we can use classical least squares on this transformed equation to get efficient estimates of $\alpha$ and $\beta$. For the multiple regression model, divide the dependent variable and all of the regressors (including the constant term) by $\sqrt{h_i}$, then do least squares.
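A minimal numpy sketch of this transformation, assuming $h_i$ is known; the data-generating step is made up so that the example runs on its own.

```python
import numpy as np

rng = np.random.default_rng(3)
n, alpha, beta, sigma = 200, 1.0, 2.0, 0.5
x = rng.uniform(1.0, 5.0, size=n)
h = x**2                                           # h_i assumed known (illustrative choice)
y = alpha + beta * x + rng.normal(0.0, sigma * np.sqrt(h))

# divide y, the constant, and x by sqrt(h_i), then run classical least squares
w = 1.0 / np.sqrt(h)
Z_star = np.column_stack([w, x * w])               # transformed constant and regressor
coef, *_ = np.linalg.lstsq(Z_star, y * w, rcond=None)
print(coef)                                        # efficient estimates of alpha, beta
```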


Weighted Least Squares


Regressing $y_i^*$ on $z_i^*$ and $x_i^*$ involves minimization of

$\sum_{i=1}^{n} \left( y_i^* - a z_i^* - b x_i^* \right)^2 = \sum_{i=1}^{n} \frac{(y_i - a - b x_i)^2}{h_i} \, ;$

thus, a more efficient estimator is obtained by downweighting the squared residuals for observations with large variances, in proportion to those variances.
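Equivalently, the minimizer solves the weighted normal equations $X'WX \, b = X'Wy$ with $W = \mathrm{diag}(1/h_i)$. The sketch below checks this equivalence on made-up data.

```python
import numpy as np

rng = np.random.default_rng(3)
n, alpha, beta, sigma = 200, 1.0, 2.0, 0.5
x = rng.uniform(1.0, 5.0, size=n)
h = x**2                                           # illustrative known variance factor
y = alpha + beta * x + rng.normal(0.0, sigma * np.sqrt(h))

X = np.column_stack([np.ones(n), x])
W = np.diag(1.0 / h)                               # downweight observations with large h_i
coef_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# same answer as regressing y/sqrt(h) on 1/sqrt(h) and x/sqrt(h)
Xs = X / np.sqrt(h)[:, None]
coef_star, *_ = np.linalg.lstsq(Xs, y / np.sqrt(h), rcond=None)
print(coef_wls, coef_star)
```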


Properties of Weighted Least Squares Estimates (with known weights)


• Estimated coefficients are efficient, i.e., best linear unbiased (BLUE).
• Regression of $y_i^*$ on $z_i^*$ and $x_i^*$ gives correct standard errors for the coefficient estimates.
• $R^2$ must be redefined, since the transformed model usually has no intercept term.


Detection of Heteroskedasticity (unknown weights)


• Residual plot: Graph the squared LS residuals $e_i^2 = (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$ against $x_i$ or $\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$ to check variability.
• Diagnostic testing: Do a formal statistical test for a particular hypothesized form of $h_i$.
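A residual plot of this kind takes only a few lines of matplotlib; the data here are simulated for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
x = rng.uniform(1.0, 5.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)       # made-up heteroskedastic sample

X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ coef) ** 2                           # squared least squares residuals

plt.scatter(x, e2)                                 # variability visibly grows with x
plt.xlabel("x")
plt.ylabel("squared residual")
plt.show()
```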


Squared Residual Regression Test for Heteroskedasticity


Conditions: Moderate to large samples (n = 50-100+), possibly nonnormal errors, and a linear form for $h_i$:

$h_i = 1 + \delta_1 z_{i1} + \cdots + \delta_L z_{iL}$

$\Rightarrow\ \sigma_i^2 = \sigma^2 + \sigma^2 \delta_1 z_{i1} + \cdots + \sigma^2 \delta_L z_{iL}$

$\Rightarrow\ \epsilon_i^2 = \sigma^2 + \sigma^2 \delta_1 z_{i1} + \cdots + \sigma^2 \delta_L z_{iL} + u_i$, with $E(u_i) = 0$,

where $z_{i1}, \ldots, z_{iL}$ are known functions of the regressors (e.g., $z_{i1} = x_i$, $z_{i2} = x_i^2$ for the random coefficients model).

Idea: Replace the unknown squared errors $\epsilon_i^2$ with the squared residuals $e_i^2 = (y_i - \hat{y}_i)^2$ from least squares, then regress $e_i^2$ on 1, $z_{i1}, \ldots, z_{iL}$.

Steps:
(1) Get the LS residuals $e_i = y_i - \hat{\alpha} - \hat{\beta} x_i$.
(2) Get $R^2$ from the regression of $e_i^2$ on 1, $z_{i1}, \ldots, z_{iL}$.
(3) Construct the usual F-statistic for the joint significance of $z_{i1}, \ldots, z_{iL}$,

$F = \frac{R^2}{1 - R^2} \cdot \frac{n - L - 1}{L} \, ,$

and reject homoskedasticity if it exceeds the critical value from the F-table with L and n - L - 1 degrees of freedom.
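A sketch of the three steps for the random coefficients form ($z_{i1} = x_i$, $z_{i2} = x_i^2$, so L = 2), run on simulated data; the p-value uses the F distribution from scipy.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(5)
n = 200
x = rng.uniform(1.0, 5.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)       # made-up heteroskedastic sample

# Step 1: least squares residuals
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ coef) ** 2

# Step 2: regress e_i^2 on 1, z_i1 = x_i, z_i2 = x_i^2 and compute R^2
Z = np.column_stack([np.ones(n), x, x**2])
gamma, *_ = np.linalg.lstsq(Z, e2, rcond=None)
r2 = 1.0 - np.sum((e2 - Z @ gamma) ** 2) / np.sum((e2 - e2.mean()) ** 2)

# Step 3: F-statistic with L = 2 and n - L - 1 degrees of freedom
L = 2
F = (r2 / (1.0 - r2)) * (n - L - 1) / L
print(F, f.sf(F, L, n - L - 1))                    # small p-value -> reject homoskedasticity
```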


Correction for Heteroskedasticity: Feasible WLS


• Conditions: Same as for the squared residual regression test, including $\sigma_i^2 = \sigma^2 (1 + \delta_1 z_{i1} + \cdots + \delta_L z_{iL})$.
• Idea: Use the squared residual regression to estimate the weights.
• Steps:
(1) Fit $y_i = \alpha + \beta x_i + \epsilon_i$ by least squares, and get the residuals $e_i = y_i - \hat{\alpha} - \hat{\beta} x_i$.
(2) Regress $e_i^2$ on 1, $z_{i1}, \ldots, z_{iL}$, and take the fitted values $\hat{\sigma}_i^2$ from this regression (the estimates of $\sigma^2 h_i$).
(3) Replace $\sigma_i^2$ with $\hat{\sigma}_i^2$ in the WLS formula, i.e., minimize $\sum_{i=1}^{n} \frac{(y_i - a - b x_i)^2}{\hat{\sigma}_i^2}$, to estimate $\alpha$ and $\beta$.
• Properties: In large samples, (approximately) the same as WLS.
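A self-contained sketch of this two-step procedure with $z_{i1} = x_i$, $z_{i2} = x_i^2$; clipping the fitted variances away from zero is my own safeguard, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)       # made-up heteroskedastic sample

# Step 1: least squares fit and squared residuals
X = np.column_stack([np.ones(n), x])
coef_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ coef_ls) ** 2

# Step 2: squared residual regression gives estimated variances sigma_i^2
Z = np.column_stack([np.ones(n), x, x**2])
gamma, *_ = np.linalg.lstsq(Z, e2, rcond=None)
sigma2_hat = np.clip(Z @ gamma, 1e-8, None)        # guard against nonpositive fitted values

# Step 3: weighted least squares with weights 1 / sigma2_hat
W = 1.0 / sigma2_hat
coef_fwls = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * y))
print(coef_ls, coef_fwls)
```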


Examples of Feasible WLS


1. Grouped Data
Regress $\sqrt{n_s}\, \bar{y}_s$ on $\sqrt{n_s}$ and $\sqrt{n_s}\, \bar{x}_s$ to estimate $\alpha$ and $\beta$ (exact WLS, since $h_s = 1/n_s$ is known).
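A sketch of this regression with hypothetical group sizes and group averages:

```python
import numpy as np

rng = np.random.default_rng(7)
S = 60
n_s = rng.integers(5, 200, size=S)                 # hypothetical group sizes
x_bar = rng.uniform(0.0, 10.0, size=S)             # observed group-average regressor
y_bar = 1.0 + 2.0 * x_bar + rng.normal(0.0, 3.0 / np.sqrt(n_s))   # group-mean error variance sigma^2/n_s

# regress sqrt(n_s)*ybar_s on sqrt(n_s) and sqrt(n_s)*xbar_s
w = np.sqrt(n_s)
Z = np.column_stack([w, w * x_bar])
coef, *_ = np.linalg.lstsq(Z, w * y_bar, rcond=None)
print(coef)                                        # exact WLS estimates of alpha, beta
```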

2. Random Coefficients
First get the LS estimates $\hat{\alpha}$, $\hat{\beta}$ and the residuals $e_i$. Since the model has

$\sigma_i^2 = \sigma_\alpha^2 (1 + \delta_1 x_i + \delta_2 x_i^2)$,

regress $e_i^2$ on 1, $x_i$, $x_i^2$, and test $H_0$: $\delta_1 = \delta_2 = 0$ with an F-test. If $H_0$ (homoskedasticity) is rejected, let $\hat{\sigma}_i^2$ be the fitted values from this squared residual regression, and plug them into the WLS formula. (For multiple regression, set the $z_{i\ell}$ equal to the squares and cross-products of the regressors.)


3. Variance Proportional to Square of Mean


To test for heteroskedasticity, regress $e_i^2 = (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$ on 1 and $z_{i1} = (\hat{y}_i)^2 = (\hat{\alpha} + \hat{\beta} x_i)^2$, and do an F-test (or t-test) for the exclusion of $\hat{y}_i^2$. If homoskedasticity is rejected, use $\hat{h}_i = \hat{y}_i^2$ in place of $h_i$ in the WLS formula.


Correcting Least Squares Standard Errors for Heteroskedasticity


• Situation: Suspect heteroskedasticity but don't want to specify $h_i$; willing to stick with the (inefficient but unbiased) least squares coefficient estimators.
• Idea: Find a formula for the standard errors of LS which is valid under either homoskedasticity or heteroskedasticity. Known as Eicker-White (or just White) heteroskedasticity-consistent standard errors.
• Usual Variance Estimator for LS (page 60):

$\hat{V}(\hat{\beta}) = s^2 / \sum x_i^2$ if no intercept term;

$\hat{V}(\hat{\beta}) = s^2 / \sum (x_i - \bar{x})^2 = s^2 \sum (x_i - \bar{x})^2 / \left[ \sum (x_i - \bar{x})^2 \right]^2$ if intercept term.

This formula assumes $E\left[\epsilon_i^2 (x_i - \bar{x})^2\right] = E(\epsilon_i^2) \cdot E\left[(x_i - \bar{x})^2\right]$, which fails under heteroskedasticity.


• Corrected Variance Estimator for $\hat{\beta}$: Now use

$\hat{V}(\hat{\beta}) = \sum e_i^2 (x_i - \bar{x})^2 / \left[ \sum (x_i - \bar{x})^2 \right]^2$,

where $e_i = y_i - \hat{\alpha} - \hat{\beta} x_i$. Note that $\hat{V}(\hat{\beta}) \neq s^2 / \sum (x_i - \bar{x})^2$ unless $e_i^2 = s^2 = \frac{1}{n-k} \sum e_i^2$ for all observations, which never happens in practice.

• Formula for Multiple Regression: Similar, but more complicated. Fortunately, many computer packages (e.g., TSP) compute "Eicker-White" standard errors as an option.
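A numpy sketch of the corrected variance formula for the slope in the simple regression with an intercept, compared with the usual estimator, on simulated data. (In modern software this is usually a built-in option, e.g., cov_type='HC0' in statsmodels' OLS fit.)

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200
x = rng.uniform(1.0, 5.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)       # made-up heteroskedastic sample

X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef
xd = x - x.mean()

s2 = e @ e / (n - 2)
usual_var = s2 / np.sum(xd**2)                           # usual variance estimator for the slope
white_var = np.sum(e**2 * xd**2) / np.sum(xd**2) ** 2    # Eicker-White corrected variance
print(np.sqrt(usual_var), np.sqrt(white_var))            # standard errors: usual vs. robust
```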
