
MLR.5 Variance LM Statistic Testing Heteroskedasticity WLS and GLS

Econometrics
Multiple Regression Analysis: Heteroskedasticity

João Valle e Azevedo

NOVA School of Business and Economics

Spring Semester

João Valle e Azevedo (NOVA SBE) Econometrics Lisbon 1 / 19



Heteroskedasticity

Properties of OLS: Variance


Assumption MLR.5 (Homoskedasticity): The error u has the same
variance given any value of the explanatory variables:

Var(u|x1, ..., xk) = σ², leading to Var(β̂|X) = σ²(X′X)⁻¹

With MLR.1 through MLR.5 we have derived the variance of the OLS
estimators and further concluded that OLS was asymptotically
Normal: enough to conduct inference "as usual"

If MLR.5 does not hold, that is, if the conditional variance of u is
allowed to vary given the x's, then the errors are heteroskedastic
and the results above are NOT valid. We cannot make inference "as
usual" (t tests, F tests, LM tests)


Heteroskedastic Case
Suppose y is wage and x is education

[Figure: conditional densities f(y|x) around the line E(y|x) = β0 + β1x at x1 < x2 < x3; the spread of the distribution varies with x, showing how spread out the distribution of y is at each x]



Properties of OLS: Variance (Cont.)

Theorem
Under assumptions MLR.1 through MLR.5

Var(β̂j) = σ² / [SSTj(1 − Rj²)],  j = 0, 1, ..., k

where SSTj = Σⁿᵢ₌₁ (xij − x̄j)²

Rj² is the coefficient of determination from regressing xj on all the other regressors.
It tells us how much the other regressors "explain" xj

Variance with Heteroskedasticity


Now assume Var(ui|xi1, ..., xik) = σi²
For the simple regression case:

β̂1 = β1 + [Σ(xi − x̄)ui] / [Σ(xi − x̄)²]

So, conditional on the x's:

Var(β̂1) = [Σ(xi − x̄)²σi²] / [Σ(xi − x̄)²]²

A valid estimator when σi² ≠ σ² is:

Var̂(β̂1) = [Σ(xi − x̄)²ûi²] / [Σ(xi − x̄)²]², where ûi are the OLS residuals
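This estimator is easy to compute directly. A minimal NumPy sketch for the simple regression case; the data-generating process and all variable names here are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0, 10, n)
u = rng.normal(0, 1 + 0.3 * x)               # heteroskedastic: sd of u grows with x
y = 2.0 + 0.5 * x + u

xc = x - x.mean()
beta1_hat = (xc * y).sum() / (xc ** 2).sum()  # OLS slope
beta0_hat = y.mean() - beta1_hat * x.mean()
resid = y - beta0_hat - beta1_hat * x

# Heteroskedasticity-robust variance of beta1_hat:
# sum (xi - xbar)^2 * uhat_i^2  /  [ sum (xi - xbar)^2 ]^2
var_robust = (xc ** 2 * resid ** 2).sum() / ((xc ** 2).sum()) ** 2
se_robust = np.sqrt(var_robust)
```

The square root of `var_robust` is the robust standard error used for inference on the next slides.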


Variance with Heteroskedasticity


For the multiple regression model, a valid (consistent) estimator of
Var (β̂j ) with heteroskedasticity is:
Var̂(β̂j) = [Σ r̂ij² ûi²] / SSRj²

r̂ij is the i-th residual from regressing xj on all other independent
variables

SSRj is the sum of squared residuals from this regression

ûi are the OLS residuals
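As a numerical check, this partialling-out formula agrees exactly with the (j, j) element of the "sandwich" form (X′X)⁻¹X′diag(ûi²)X(X′X)⁻¹ of the White covariance matrix. A sketch with simulated data (names and setup are mine, not the slides'):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 400, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
u = rng.normal(size=n) * (1 + X[:, 1] ** 2)   # heteroskedastic errors
y = X @ np.array([1.0, 0.5, -0.2, 0.3]) + u

beta = np.linalg.lstsq(X, y, rcond=None)[0]
uhat = y - X @ beta

# Sandwich (White) covariance: (X'X)^{-1} X' diag(uhat^2) X (X'X)^{-1}
XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ (X.T * uhat ** 2) @ X @ XtX_inv

# Partialling-out formula for the coefficient on column 1:
others = np.delete(X, 1, axis=1)
gamma = np.linalg.lstsq(others, X[:, 1], rcond=None)[0]
r1 = X[:, 1] - others @ gamma                 # residuals r̂_i1
SSR1 = (r1 ** 2).sum()
var_j = (r1 ** 2 * uhat ** 2).sum() / SSR1 ** 2
```

By the Frisch-Waugh-Lovell logic, `var_j` and `V[1, 1]` are the same number.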


Robust Standard Errors


The square root of this variance can be used as a standard error for
inference (Robust Standard error). With these standard errors it
turns out that:

(β̂j − βj ) a
t= ∼ Normal(0, 1)
se(β̂j )
- This is a heteroskedasticity-robust t statistic

Often, the estimated variance is corrected for degrees of freedom by
multiplying by n/(n − k − 1) (irrelevant for large n)
Why not always use robust standard errors?
- In small samples, t statistics using robust standard errors will not have a
distribution close to the Normal (or t) and inferences will not be correct

We will not deal with heteroskedasticity-robust F statistics;
instead, use heteroskedasticity-robust LM tests

A Robust LM Statistic
Suppose we have a standard model

y = β0 + β1 x1 + β2 x2 + ... + βk xk + u
and our null hypothesis is H0 : βk−q+1 = βk−q+2 = ... = βk = 0 (the
number of restrictions is q)
First, we just run OLS on the restricted model and save the residuals, ŭ

Regress each of the excluded variables on all of the included variables
(q different regressions) and save each set of residuals r̆1 , r̆2 , ..., r̆q
Regress a variable defined to be = 1 on r̆1 ŭ, r̆2 ŭ, ..., r̆q ŭ, with no
intercept
The LM statistic is n − SSR1, where SSR1 is the sum of squared
residuals from this final regression; under the null it has a
chi-square distribution with q degrees of freedom
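The four steps above can be sketched in NumPy; the data-generating process (three regressors, two truly irrelevant) and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x1))     # heteroskedastic errors
y = 1.0 + 0.8 * x1 + u                        # x2, x3 truly irrelevant

def resid(A, b):
    """Residuals from an OLS regression of b on the columns of A."""
    coef = np.linalg.lstsq(A, b, rcond=None)[0]
    return b - A @ coef

# H0: coefficients on x2 and x3 are zero (q = 2 restrictions)
included = np.column_stack([np.ones(n), x1])
u_breve = resid(included, y)                  # restricted-model residuals
r2 = resid(included, x2)                      # residuals of excluded vars
r3 = resid(included, x3)                      # on the included vars

# Regress a constant 1 on r̆_j * ŭ with no intercept; LM = n - SSR1
Z = np.column_stack([r2 * u_breve, r3 * u_breve])
ssr1 = (resid(Z, np.ones(n)) ** 2).sum()
LM = n - ssr1                                 # ~ chi2(q) under H0
```

Since H0 is true in this simulation, LM should be an unremarkable draw from a χ² with 2 degrees of freedom.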

Testing for Heteroskedasticity


Want to test H0: Var(u|x1, ..., xk) = σ², which is equivalent to
H0: E(u²|x1, ..., xk) = E(u²) = σ²

If we assume the relationship between u² and the xj is linear, we can test
this as a set of linear restrictions
- Thus, for u² = δ0 + δ1x1 + ... + δkxk + ν this means testing
H0: δ1 = δ2 = ... = δk = 0
- We don't observe the error, but we can use the residuals from the OLS regression


The Breusch-Pagan Test


Estimate û² = δ0 + δ1x1 + ... + δkxk + ν by OLS (using the squared
OLS residuals in place of u²)
Want to test H0: δ1 = δ2 = ... = δk = 0
- Take the R² of this regression. With assumptions MLR.1 through
MLR.4 still in place we can use an F test or an LM-type test
- The F statistic is just the reported F statistic for overall significance of
this regression:

F = (R²/k) / [(1 − R²)/(n − k − 1)] ∼ F(k, n−k−1)

Alternatively, we can form the LM statistic LM = nR², which is
approximately distributed as χ²k under the null (R² of the regression
above, not the typical LM test!)
These tests are usually called the Breusch-Pagan tests for
heteroskedasticity
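The Breusch-Pagan recipe (OLS, square the residuals, auxiliary regression, LM = nR²) is short in NumPy; the simulated design, with variance linear in x1, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 800
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
u = rng.normal(size=n) * np.sqrt(1 + 2 * x1)  # Var(u|x) linear in x1
y = 1.0 + x1 + x2 + u

# Step 1: OLS on the original model, save the squared residuals
X = np.column_stack([np.ones(n), x1, x2])
uhat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
u2 = uhat ** 2

# Step 2: auxiliary regression of û² on the regressors; take its R²
fit = X @ np.linalg.lstsq(X, u2, rcond=None)[0]
R2 = 1 - ((u2 - fit) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()

LM = n * R2                                   # ~ chi2(k) under homoskedasticity
```

A large LM relative to the χ²k critical value leads to rejecting homoskedasticity.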


The White Test


The Breusch-Pagan tests will detect linear forms of heteroskedasticity
The White test allows for nonlinearities by using squares and
cross-products of all the x’s
Estimate

u² = δ0 + δ1x1 + ... + δkxk + δk+1x1² + ... + δ2k xk² + δ2k+1 x1x2 + ... + δk+k(k+1)/2 xk−1xk + error

by OLS
Want to test H0: δ1 = δ2 = ... = δk+k(k+1)/2 = 0
- Take the R² of this regression and still use the F or LM statistics to
test whether all the xj, xj², and xjxh are jointly significant:

F = (R²/q) / [(1 − R²)/(n − q − 1)] ∼ F(q, n−q−1) (approx.) under the null

- and LM = nR² ∼ χ²q (approx.) under the null, where q = k + k(k + 1)/2
- If k is large and n is small these approximations are poor

Alternate form of the White Test


Now, the fitted values from OLS, ŷ, are a function of all the x's
Thus, ŷ² will be a function of the squares and cross-products, and ŷ
and ŷ² can "substitute" for all of the xj, xj², and xjxh, so:
- Regress the squared residuals on ŷ and ŷ² (as well as a constant) and
use the R² to form an F or LM statistic (as for the BP or White tests)
- Only testing 2 restrictions now
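The special form differs from the Breusch-Pagan sketch only in the auxiliary regressors: ŷ and ŷ² instead of all the x's. A sketch under an illustrative design (variance depending on x1²):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 800
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * (1 + 0.5 * x1 ** 2)  # nonlinear heteroskedasticity
y = 1 + x1 + x2 + u

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ beta
u2 = (y - X @ beta) ** 2

# Special-form White test: regress û² on a constant, ŷ, and ŷ²
Z = np.column_stack([np.ones(n), yhat, yhat ** 2])
fit = Z @ np.linalg.lstsq(Z, u2, rcond=None)[0]
R2 = 1 - ((u2 - fit) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
LM = n * R2                                   # ~ chi2(2) under the null
```

Whatever the number of original regressors, the χ² reference distribution here always has 2 degrees of freedom.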


WLS

Weighted Least Squares


We can always estimate robust standard errors for OLS
However, if we know something about the specific form of the
heteroskedasticity, we can obtain estimators that have a smaller
variance than OLS
If we in fact know that form, we can transform the model into
one that has homoskedastic errors


Case of known form up to a multiplicative constant

y = β0 + β1 x1 + β2 x2 + β3 x3 + ... + βk xk + u

Suppose we know that Var(u|x) = σ²h(x), or

Var(ui|x) = σ²h(xi) = σ²hi

Example:

wage = β0 + β1 Education + β2 Experience + β3 Tenure + u



We know that E(ui/√hi | x) = 0, because hi depends only on x, and
Var(ui/√hi | x) = σ², because Var(ui|x) = σ²hi
So, if we divide the regression equation by √hi we will get a model
where the error is homoskedastic (MLR.1 to MLR.5 verified again)

Generalized Least Squares


Estimating the transformed equation by OLS is an example of
generalized least squares (GLS)
GLS will be BLUE (Best Linear Unbiased Estimator) in this case
The GLS estimator for the particular case where we divide the
regression equation by √hi is called a weighted least squares (WLS)
estimator. Why? It minimizes

Σⁿᵢ₌₁ (yi* − β̂0/√hi − β̂1 xi1* − ... − β̂k xik*)²

where yi* = yi/√hi, xij* = xij/√hi, which is the same as minimizing the
weighted sum of squared residuals

Σⁿᵢ₌₁ (yi − β̂0 − β̂1xi1 − ... − β̂kxik)²/hi
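The two minimization problems give the same estimates, which can be verified numerically. A sketch assuming, for illustration, h(x) = x (so Var(u|x) = σ²x); the names are mine:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 600
x = rng.uniform(1, 5, n)
h = x                                          # assumed: Var(u|x) = sigma^2 * x
u = rng.normal(size=n) * np.sqrt(h)
y = 2 + 3 * x + u

# WLS as OLS on the transformed model: divide everything by sqrt(h_i)
w = 1 / np.sqrt(h)
Xs = np.column_stack([w, x * w])               # transformed constant and x
beta_wls = np.linalg.lstsq(Xs, y * w, rcond=None)[0]

# Equivalent: solve the weighted normal equations
# (X' diag(1/h) X) beta = X' diag(1/h) y directly
X = np.column_stack([np.ones(n), x])
W = 1 / h
beta_direct = np.linalg.solve((X * W[:, None]).T @ X, (X * W[:, None]).T @ y)
```

Both routes weight each observation by 1/hi, downweighting the noisier observations.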


More on WLS
We interpret WLS estimates in the original (not transformed) model
but get variances of the WLS estimators in the transformed model
WLS is optimal if we know the form of Var (ui |xi )
In most cases, won’t know the form of heteroskedasticity
Can often estimate the form of heteroskedasticity
Example:

wage = β0 + β1 Education + β2 Experience + β3 Tenure + u

Var (u|Education, Experience, Tenure) = σ 2 exp(δ0 + δ1 Education)

- where δ0 and δ1 are unknown


Must estimate the form of heteroskedasticity: Feasible GLS
First, we assume a model for heteroskedasticity
Example: Var(u|x) = E(u²|x) = σ² exp(δ0 + δ1x1 + ... + δkxk) > 0
Since we don’t know the δ’s, must estimate them
We can write the above model as:

u² = σ² exp(δ0 + δ1x1 + ... + δkxk)ν, where E(ν|x) = 1


Assume further that ν is independent of x

Then ln(u²) = α0 + δ1x1 + ... + δkxk + e

where E (e) = 0 and e is independent of x


Feasible GLS (continued)

ln(u²) = α0 + δ1x1 + ... + δkxk + e, where E(e) = 0 and e is independent of x

Can use û (from OLS) instead of u to estimate this equation by OLS
Then, obtain an estimate of hi as ĥi = exp(ĝi), where ĝi are the
fitted values from this regression
Finally, use 1/ĥi as the weights in WLS

Summary:
- Run OLS in the original model, save the residuals, û, square them and
take logs
- Regress ln(û²) on all of the independent variables (plus constant) and
get the fitted values, ĝ
- Do WLS using 1/exp(ĝ) as the weight
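The three-step summary above translates directly into NumPy; the exponential variance function and all names below are an illustrative simulation, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000
x = rng.uniform(size=n)
u = rng.normal(size=n) * np.exp(0.5 * (0.2 + 1.5 * x))  # Var(u|x) = exp(0.2 + 1.5x)
y = 1 + 2 * x + u

X = np.column_stack([np.ones(n), x])

def ols(A, b):
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Step 1: OLS on the original model; save residuals, square, take logs
uhat = y - X @ ols(X, y)
logu2 = np.log(uhat ** 2)

# Step 2: regress ln(û²) on the regressors; keep the fitted values ĝ
g = X @ ols(X, logu2)

# Step 3: WLS with weights 1/exp(ĝ), i.e. divide the model by sqrt(ĥ)
hhat = np.exp(g)
w = 1 / np.sqrt(hhat)
beta_fgls = ols(X * w[:, None], y * w)
```

Since ĥ is estimated, this is FGLS: consistent and asymptotically efficient rather than unbiased, as the next slide notes.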

Notes on GLS
OLS is still unbiased and consistent with heteroskedasticity (as long
as MLR.1 through MLR.4 hold)
We use GLS just for efficiency (smaller variance of the estimators)
If we know the weights to use in WLS, then GLS is unbiased.
Otherwise, assuming we estimate a correctly specified model for
heteroskedasticity, feasible GLS (FGLS) is not unbiased
but is consistent and asymptotically efficient
Remember, with FGLS we are estimating the parameters of the
original model. Standard errors in the transformed model also refer to
standard errors in the original model
Can use the t and F tests for inference
When doing F tests with WLS, form the weights from the
unrestricted model and use those weights to do WLS on the restricted
model as well as on the unrestricted model