Econometrics
Multiple Regression Analysis: Heteroskedasticity
Spring Semester
Heteroskedasticity
With MLR.1 through MLR.5 we derived the variance of the OLS
estimators and further concluded that OLS is asymptotically
Normal: enough to conduct inference "as usual"
Heteroskedasticity
Heteroskedastic Case
Suppose y is wage and x is education
[Figure: conditional densities f(y|x) at x1, x2, x3 along the regression line E(y|x) = b0 + b1x]
Heteroskedasticity
Theorem
Under assumptions MLR.1 through MLR.5
$$\mathrm{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)}, \qquad j = 1, \dots, k$$
where
$$SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$$
and $R_j^2$ is the coefficient of determination from regressing xj on all the other regressors (including a constant).
It tells us how much the other regressors "explain" xj
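The variance formula can be checked numerically against the matrix expression σ²(X'X)⁻¹. The following is a minimal sketch, assuming NumPy is available; the function name `ols_variance_two_ways` and its arguments are illustrative and not part of the slides.

```python
import numpy as np

def ols_variance_two_ways(X, sigma2):
    """Slope variances computed via sigma^2 / (SST_j (1 - R_j^2)) and via
    the diagonal of sigma^2 (Z'Z)^{-1}, where Z adds a constant to X.

    X is an (n, k) array of regressors WITHOUT the constant (hypothetical input).
    """
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])
    # Matrix form; drop the first diagonal entry, which belongs to the intercept
    via_matrix = sigma2 * np.diag(np.linalg.inv(Z.T @ Z))[1:]

    via_formula = np.empty(k)
    for j in range(k):
        xj = X[:, j]
        # Regress x_j on a constant and all the other regressors
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ coef
        sst_j = np.sum((xj - xj.mean()) ** 2)
        r2_j = 1.0 - (resid @ resid) / sst_j
        via_formula[j] = sigma2 / (sst_j * (1.0 - r2_j))
    return via_formula, via_matrix
```

The two vectors should agree up to rounding error, which also shows why a regressor that is well "explained" by the others (R_j² close to 1) has a large variance.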
João Valle e Azevedo (NOVA SBE) Econometrics Lisbon 4 / 19
MLR.5 Variance LM Statistic Testing Heteroskedasticity WLS and GLS
Heteroskedasticity
$$t = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \overset{a}{\sim} \mathrm{Normal}(0, 1)$$
  - This is a heteroskedasticity-robust t statistic
Often, the estimated variance is corrected for degrees of freedom by
multiplying by n/(n-k-1) (irrelevant for large n)
Why not always use robust standard errors?
  - In small samples, t statistics based on robust standard errors will not have a
    distribution close to the Normal (or t), and inferences will not be correct
We will not deal with heteroskedasticity-robust F statistics
Instead, we use heteroskedasticity-robust LM tests
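A robust t statistic can be sketched as follows. The extracted text does not spell out the robust variance estimator, so the code assumes the standard White "sandwich" form (X'X)⁻¹ X' diag(û²) X (X'X)⁻¹, with the optional n/(n-k-1) correction mentioned above. The function name and arguments are illustrative.

```python
import numpy as np

def robust_t_stats(y, X, dof_correction=True):
    """OLS with heteroskedasticity-robust standard errors (assumed White form).

    X is an (n, k) array without the constant. Returns (beta, se, t), where
    t tests H0: beta_j = 0 for each coefficient, intercept first.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    p = Z.shape[1]  # k + 1 estimated parameters
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta = ZtZ_inv @ Z.T @ y
    u = y - Z @ beta  # OLS residuals
    # "Meat" of the sandwich: sum over i of u_i^2 * z_i z_i'
    meat = (Z * u[:, None] ** 2).T @ Z
    V = ZtZ_inv @ meat @ ZtZ_inv
    if dof_correction:
        V = V * n / (n - p)  # n / (n - k - 1), irrelevant for large n
    se = np.sqrt(np.diag(V))
    return beta, se, beta / se
```

With the correction switched on, every standard error is scaled by the same factor sqrt(n/(n-k-1)), so the choice matters only in small samples.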
Heteroskedasticity
A Robust LM Statistic
Suppose we have a standard model
y = β0 + β1 x1 + β2 x2 + ... + βk xk + u
and our null hypothesis is H0 : βk−q+1 = βk−q+2 = ... = βk = 0 (the
number of restrictions is q)
First, run OLS on the restricted model and save the residuals ŭ
Regress each of the excluded variables on all of the included variables
(q different regressions) and save each set of residuals r̆1 , r̆2 , ..., r̆q
Regress a variable defined to be = 1 on r̆1 ŭ, r̆2 ŭ, ..., r̆q ŭ, with no
intercept
The LM statistic is n − SSR1, where SSR1 is the sum of squared
residuals from this final regression. Under the null, it has an
asymptotic chi-square distribution with q degrees of freedom
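The steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a reference implementation: the names `robust_lm`, `X_incl` (regressors kept under the null) and `X_excl` (the q excluded regressors) are hypothetical.

```python
import numpy as np

def _resid(y, Z):
    """Residuals from a least-squares regression of y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

def robust_lm(y, X_incl, X_excl):
    """Heteroskedasticity-robust LM statistic for H0: coefficients on X_excl are 0.

    X_incl is (n, k - q) and X_excl is (n, q), both without a constant.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), X_incl])
    # Step 1: residuals from the restricted model
    u = _resid(y, Z)
    # Step 2: residuals from regressing each excluded variable on the included ones
    R = np.column_stack([_resid(X_excl[:, j], Z) for j in range(X_excl.shape[1])])
    # Step 3: regress a vector of ones on the products r_j * u, with no intercept
    e = _resid(np.ones(n), R * u[:, None])
    # LM = n - SSR1, compared with a chi-square with q degrees of freedom
    return n - e @ e
```

The returned value is compared with a chi-square critical value with q degrees of freedom (for example, about 3.84 at the 5% level when q = 1).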
Heteroskedasticity
  - We do not observe the error, but we can use the residuals from the OLS regression
Heteroskedasticity
$$F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)} \sim F_{k,\,n-k-1}$$
Alternatively, we can form the LM statistic LM = nR², which is
approximately distributed as χ²_k under the null. (R² here comes from the
auxiliary regression of the squared OLS residuals on the regressors; this is not the typical LM test!)
These tests are usually called the Breusch-Pagan tests for
heteroskedasticity
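A minimal sketch of both versions of the test, assuming NumPy and assuming the auxiliary regression is that of the squared OLS residuals on a constant and x1, ..., xk. The function name `breusch_pagan` is illustrative.

```python
import numpy as np

def breusch_pagan(y, X):
    """F and LM versions of the Breusch-Pagan test.

    X is an (n, k) array without the constant. Returns (F, LM, R2), where R2
    is from regressing the squared OLS residuals on a constant and X.
    """
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])
    # Original regression: save the squared residuals
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    u2 = (y - Z @ coef) ** 2
    # Auxiliary regression of u^2 on the same regressors
    g, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    e = u2 - Z @ g
    r2 = 1.0 - (e @ e) / np.sum((u2 - u2.mean()) ** 2)
    F = (r2 / k) / ((1.0 - r2) / (n - k - 1))  # compare with F(k, n - k - 1)
    LM = n * r2                                # compare with chi-square(k)
    return F, LM, r2
```

Large values of either statistic are evidence against homoskedasticity.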
Heteroskedasticity
$$F = \frac{R^2/q}{(1 - R^2)/(n - q - 1)} \sim F_{q,\,n-q-1} \quad \text{(approx.) under the null}$$
  - and LM = nR² ∼ χ²_q (approx.) under the null (q = k + k(k + 1)/2)
  - If k is large and n is small, these approximations are poor
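The count q = k + k(k + 1)/2 matches an auxiliary regression of the squared residuals on the regressors, their squares and all cross products (the White test). The extracted text does not show that regression, so the sketch below assumes it; the function name `white_test` is illustrative.

```python
import numpy as np
from itertools import combinations_with_replacement

def white_test(y, X):
    """F and LM statistics from regressing squared OLS residuals on the
    regressors, their squares and their cross products (assumed White form).

    X is an (n, k) array without the constant. Returns (F, LM, q).
    """
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    u2 = (y - Z @ coef) ** 2
    # k(k + 1)/2 second-order terms: squares (a == b) and cross products (a < b)
    second = [X[:, a] * X[:, b]
              for a, b in combinations_with_replacement(range(k), 2)]
    W = np.column_stack([np.ones(n), X] + second)
    q = W.shape[1] - 1  # equals k + k(k + 1)/2
    g, *_ = np.linalg.lstsq(W, u2, rcond=None)
    e = u2 - W @ g
    r2 = 1.0 - (e @ e) / np.sum((u2 - u2.mean()) ** 2)
    F = (r2 / q) / ((1.0 - r2) / (n - q - 1))  # compare with F(q, n - q - 1)
    LM = n * r2                                # compare with chi-square(q)
    return F, LM, q
```

Because q grows quadratically in k, the auxiliary regression uses many degrees of freedom, which is one way to see why the approximations are poor when k is large and n is small.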
Heteroskedasticity
WLS
y = β0 + β1 x1 + β2 x2 + β3 x3 + ... + βk xk + u
Example:
WLS
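WLS minimizes
$$\sum_{i=1}^{n} \left( y_i^* - \frac{\hat{\beta}_0}{\sqrt{h_i}} - \hat{\beta}_1 x_{i1}^* - \dots - \hat{\beta}_k x_{ik}^* \right)^2$$
where $y_i^* = y_i/\sqrt{h_i}$, $x_{i1}^* = x_{i1}/\sqrt{h_i}$, and so on. This is the same as minimizing
$$\sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik} \right)^2 / h_i$$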
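The equivalence between the transformed regression and the weighted sum of squares can be sketched as below. It assumes NumPy and a known positive vector `h` with Var(u_i | x_i) proportional to h_i; the function names `wls` and `weighted_ssr` are illustrative.

```python
import numpy as np

def wls(y, X, h):
    """WLS via OLS on the transformed model: every variable, including the
    constant, is divided by sqrt(h_i). X is (n, k) without the constant."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    w = 1.0 / np.sqrt(h)
    # The constant column becomes 1 / sqrt(h_i) after the transformation
    beta, *_ = np.linalg.lstsq(Z * w[:, None], y * w, rcond=None)
    return beta

def weighted_ssr(beta, y, X, h):
    """Objective in the original variables: sum of squared residuals / h_i."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    return np.sum((y - Z @ beta) ** 2 / h)
```

The coefficients returned by `wls` minimize `weighted_ssr`, so they are interpreted in the original model even though they are computed in the transformed one.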
WLS
More on WLS
We interpret WLS estimates in the original (not transformed) model
but get variances of the WLS estimators from the transformed model
WLS is optimal if we know the form of Var (ui |xi )
In most cases, won’t know the form of heteroskedasticity
Can often estimate the form of heteroskedasticity
Example:
WLS
Summary:
  - Run OLS in the original model, save the residuals û, square them and
    take logs
  - Regress ln(û²) on all of the independent variables (plus a constant) and
    get the fitted values ĝ
  - Do WLS using 1/exp(ĝ) as the weight
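The three steps in the summary can be sketched as follows, assuming NumPy; the function name `fgls` is illustrative. The weight 1/exp(ĝ) is applied by dividing every variable, including the constant, by sqrt(exp(ĝ)).

```python
import numpy as np

def fgls(y, X):
    """Feasible GLS following the summary: OLS, log squared residuals,
    then WLS with weights 1 / exp(g_hat). X is (n, k) without the constant.

    Returns (beta_fgls, h_hat), where h_hat = exp(g_hat).
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    # Step 1: OLS in the original model, then log of the squared residuals
    b_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)
    log_u2 = np.log((y - Z @ b_ols) ** 2)
    # Step 2: regress ln(u^2) on the regressors and keep the fitted values
    d, *_ = np.linalg.lstsq(Z, log_u2, rcond=None)
    h_hat = np.exp(Z @ d)  # strictly positive by construction
    # Step 3: WLS with weight 1 / h_hat, i.e. divide everything by sqrt(h_hat)
    w = 1.0 / np.sqrt(h_hat)
    beta, *_ = np.linalg.lstsq(Z * w[:, None], y * w, rcond=None)
    return beta, h_hat
```

Using the exponential guarantees positive estimated variances, which a linear model for û² would not.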
WLS
Notes on GLS
OLS is still unbiased and consistent under heteroskedasticity (as long
as MLR.1 through MLR.4 hold)
We use GLS just for efficiency (smaller variance of the estimators)
If we know the weights to use in WLS, then GLS is unbiased.
Otherwise, and assuming that we estimate a correctly specified model for
heteroskedasticity, FGLS (Feasible GLS) is not unbiased
but is consistent and asymptotically efficient
Remember, with FGLS we are estimating the parameters of the
original model. Standard errors in the transformed model also refer to
standard errors in the original model
Can use the t and F tests for inference
When doing F tests with WLS, form the weights from the
unrestricted model and use those weights to do WLS on the restricted
model as well as on the unrestricted model