
ECON2206 Introductory Econometrics

Week 9 Tutorial Exercises

Readings
Read Chapter 8 thoroughly.
Make sure that you know the meanings of the Key Terms at the chapter end.

Review Questions (these may or may not be discussed in tutorial classes)


What is heteroskedasticity in a regression model?
In the presence of heteroskedasticity, are the t and F statistics from the usual OLS output still valid?
Why? Are there any other problems with OLS under heteroskedasticity?
What are heteroskedasticity-robust standard errors? How do you use them in Stata?
How do you detect if there is heteroskedasticity?
If heteroskedasticity is present in a known form, how would you estimate the model?
If heteroskedasticity is present in an unknown form, how would you estimate the model?
What are the steps in the FGLS estimation?
How would you handle the heteroskedasticity of the LPM?

Problem Set (these will be discussed in tutorial classes)


Q1. Wooldridge 8.1
Parts (ii) and (iii). The homoskedasticity assumption played no role in Chapter 5 in showing
that OLS is consistent. But we know that heteroskedasticity causes statistical inference based on
the usual t and F statistics to be invalid, even in large samples. As heteroskedasticity is a
violation of the Gauss-Markov assumptions, OLS is no longer BLUE.


Q2. Wooldridge 8.2
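With $\mathrm{Var}(u \mid inc, price, educ, female) = \sigma^2 inc^2$, we have $h(x) = inc^2$, where $h(x)$ is the heteroskedasticity
function defined in equation (8.21). Therefore, $\sqrt{h(x)} = inc$, and so the transformed equation is
obtained by dividing the original equation by $inc$:

$$\frac{beer}{inc} = \beta_0 \frac{1}{inc} + \beta_1 + \beta_2 \frac{price}{inc} + \beta_3 \frac{educ}{inc} + \beta_4 \frac{female}{inc} + \frac{u}{inc}.$$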

Notice that $\beta_1$, which is the slope on $inc$ in the original model, is now a constant in the
transformed equation. This is simply a consequence of the form of the heteroskedasticity and
the functional forms of the explanatory variables in the original equation. For many functional
forms there will be no constant in the transformed model.
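
As a quick check on why dividing by inc removes the heteroskedasticity, the sketch below computes the conditional variance of the transformed error. It writes $x$ for the full set of regressors (inc, price, educ, female) and uses only the variance assumption stated above; because inc is part of $x$, it can be treated as a constant inside the conditional variance.

```latex
\mathrm{Var}\!\left(\frac{u}{inc} \,\Big|\, x\right)
  = \frac{1}{inc^{2}}\,\mathrm{Var}(u \mid x)
  = \frac{\sigma^{2}\, inc^{2}}{inc^{2}}
  = \sigma^{2}.
```

The transformed error therefore has constant variance, so OLS applied to the transformed equation is the WLS estimator for the original model.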

Q3. Wooldridge 8.5


(i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones
are very similar in practice.

(ii) The effect is $-.029(4) = -.116$, so the probability of smoking falls by about .116.

(iii) As usual, we compute the turning point in the quadratic: $.020/[2(.00026)] \approx 38.46$, so about
38 and one-half years.
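
The turning-point formula in (iii) comes from setting the derivative of the quadratic in age to zero. In the sketch below, $\widehat{smokes}$ denotes the fitted probability of smoking, and $\hat\beta_{age} = .020$ and $\hat\beta_{age^2} = -.00026$ are the coefficients quoted above; the symbol names are labels chosen here for illustration.

```latex
\frac{\partial\, \widehat{smokes}}{\partial\, age}
  = \hat\beta_{age} + 2\,\hat\beta_{age^2}\, age = 0
\quad\Longrightarrow\quad
age^{*} = -\frac{\hat\beta_{age}}{2\,\hat\beta_{age^2}}
        = \frac{.020}{2(.00026)} \approx 38.46 .
```

Because the coefficient on the squared term is negative, the estimated probability of smoking rises with age up to this point and declines afterwards.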

(iv) Holding other factors in the equation fixed, a person in a state with restaurant smoking
restrictions has a .101 lower chance of smoking. This is similar to the effect of having four more
years of education.

(v) We just plug the values of the independent variables into the OLS regression line:

$$\widehat{smokes} = .656 - .069\log(67.44) + .012\log(6{,}500) - .029(16) + .020(77) - .00026(77)^2 \approx .0052.$$

Thus, the estimated probability of smoking for this person is close to zero. (In fact, this person is
not a smoker, so the equation predicts well for this particular observation.)
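
To verify the plug-in calculation in (v), the terms can be evaluated one at a time. This sketch assumes natural logarithms and that the restaurant-restriction dummy equals zero for this person (no such term appears in the expression above). Intermediate values are rounded to five decimals.

```latex
\begin{aligned}
.069\,\log(67.44)   &\approx .069 \times 4.21124 \approx .29058, \\
.012\,\log(6{,}500) &\approx .012 \times 8.77956 \approx .10535, \\
.029\,(16) &= .46400, \qquad .020\,(77) = 1.54000, \qquad
.00026\,(77)^2 = .00026 \times 5{,}929 \approx 1.54154, \\
.656 - .29058 + .10535 - .46400 + 1.54000 - 1.54154 &\approx .0052 .
\end{aligned}
```

The age and age-squared terms almost cancel at age 77, which is why the prediction ends up so close to zero.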



Q4. Wooldridge C8.10 (See Ch8_C10.do)

(i) In the following equation, estimated by OLS, the usual standard errors are in () and the
heteroskedasticity-robust standard errors are in []:

$$
\begin{array}{rcccccc}
\widehat{e401k} = & -.506 & +\ .0124\,inc & -\ .000062\,inc^2 & +\ .0265\,age & -\ .00031\,age^2 & -\ .0035\,male \\
 & (.081) & (.0006) & (.000005) & (.0039) & (.00005) & (.0121) \\
 & [.079] & [.0006] & [.000005] & [.0038] & [.00004] & [.0121]
\end{array}
$$
$n = 9{,}275$, $R^2 = .094$.

There are no important differences; if anything, several of the robust standard errors (those on the intercept, age, age squared, and male) are slightly smaller.

(ii) This is a general claim. Since $\mathrm{Var}(y \mid x) = p(x)[1 - p(x)]$, we can write $E(u^2 \mid x) = p(x) - [p(x)]^2$.
Written in error form, $u^2 = p(x) - [p(x)]^2 + v$. In other words, we can write this as a regression
model $u^2 = \delta_0 + \delta_1 p(x) + \delta_2 [p(x)]^2 + v$, with the restrictions $\delta_0 = 0$, $\delta_1 = 1$, and $\delta_2 = -1$. Remember
that, for the LPM, the fitted values, $\hat{y}_i$, are estimates of $p(x_i)$. So, when we run the regression of
$\hat{u}_i^2$ on $\hat{y}_i$ and $\hat{y}_i^2$ (including an intercept), the intercept estimate should be close to zero, the
coefficient on $\hat{y}_i$ should be close to one, and the coefficient on $\hat{y}_i^2$ should be close to $-1$.
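
The starting point $\mathrm{Var}(y \mid x) = p(x)[1 - p(x)]$ is the variance of a binary outcome. A short derivation, using only that $y$ takes the values 0 and 1 (so $y^2 = y$), that $u = y - p(x)$, and the LPM assumption $E(u \mid x) = 0$:

```latex
\begin{aligned}
\mathrm{Var}(y \mid x) &= E(y^2 \mid x) - [E(y \mid x)]^2
  = E(y \mid x) - [E(y \mid x)]^2
  = p(x) - [p(x)]^2, \\
E(u^2 \mid x) &= \mathrm{Var}(u \mid x) = \mathrm{Var}(y \mid x) = p(x)\,[1 - p(x)] .
\end{aligned}
```

This is why the LPM error is heteroskedastic unless $p(x)$ does not vary with $x$.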

(iii) The White LM statistic and F statistic are about 581.9 and 310.32, respectively, both of which
are very significant. The coefficient on $\widehat{e401k}$ is about 1.010, the coefficient on $\widehat{e401k}^2$ is about
$-.970$, and the intercept is about $-.009$. These estimates are quite close to what we expect to find
from the theory in part (ii).
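
The LM statistic quoted in (iii) is n times the R-squared from the auxiliary regression of the squared residuals on the fitted values and their squares. A minimal Stata sketch of that calculation follows. It assumes the same data file and variable names as the log below; the scalar name LM is introduced here for illustration only.

```stata
* Original LPM by OLS, then residuals and fitted values
use 401ksubs.dta, clear
regress e401k inc incsq age agesq male
predict e4res, residuals
predict e4hat, xb

* Auxiliary regression: squared residuals on fitted values and their squares
generate e4ressq = e4res^2
generate e4hatsq = e4hat^2
regress e4ressq e4hat e4hatsq

* LM statistic = n * R-squared, from the results stored by regress
scalar LM = e(N) * e(r2)
display "LM = " LM

* p-value from the chi-squared distribution with 2 degrees of freedom
display "p-value = " chi2tail(2, LM)
```

With n = 9,275 and an R-squared of about .0627, this should reproduce an LM statistic of roughly 581.9, far above the 5% critical value of 5.99 for a chi-squared distribution with two degrees of freedom.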

(iv) The smallest fitted value is about .030 and the largest is about .697. The WLS estimates of
the LPM are

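$$
\begin{array}{rcccccc}
\widehat{e401k} = & -.488 & +\ .0126\,inc & -\ .000062\,inc^2 & +\ .0255\,age & -\ .00030\,age^2 & -\ .0055\,male \\
 & (.076) & (.0005) & (.000004) & (.0037) & (.00004) & (.0117)
\end{array}
$$
$n = 9{,}275$, $R^2 = .108$.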

There are no important differences with the OLS estimates. The largest relative change is in the
coefficient on male, but this variable is very insignificant using either estimation method.
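
The smallest and largest fitted values matter in (iv) because the WLS weights, one over the estimated variance $\hat{y}(1 - \hat{y})$, are only defined when every fitted value lies strictly between zero and one. A hedged Stata sketch of that check followed by the weighted regression is given here. It assumes the same data file as the log below; the variable names hhat and w_lpm are hypothetical names chosen for this example.

```stata
* OLS estimates of the LPM and the fitted probabilities
use 401ksubs.dta, clear
regress e401k inc incsq age agesq male
predict e4hat, xb

* Stop with an error if any fitted probability is outside (0, 1)
assert e4hat > 0 & e4hat < 1

* Estimated variance function for the LPM and the implied WLS weights
generate hhat = e4hat * (1 - e4hat)
generate w_lpm = 1 / hhat

* WLS: analytic weights are inversely proportional to the error variance
regress e401k inc incsq age agesq male [aweight = w_lpm]
```

If the assert line fails, some fitted values fall outside the unit interval, and the weights would need to be adjusted (or OLS with robust standard errors used instead) before WLS can be applied.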

. * Read data from Stata file
. use 401ksubs.dta
.
. *Part (i)
. reg e401k inc incsq age agesq male

Source | SS df MS Number of obs = 9,275
-------------+---------------------------------- F(5, 9269) = 192.96
Model | 208.430869 5 41.6861738 Prob > F = 0.0000
Residual | 2002.39458 9,269 .216031349 R-squared = 0.0943
-------------+---------------------------------- Adj R-squared = 0.0938
Total | 2210.82544 9,274 .238389632 Root MSE = .46479

------------------------------------------------------------------------------
e401k | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inc | .0124464 .0005929 20.99 0.000 .0112843 .0136086
incsq | -.0000616 4.73e-06 -13.03 0.000 -.0000709 -.0000524
age | .0265061 .0039225 6.76 0.000 .0188173 .034195
agesq | -.0003053 .000045 -6.78 0.000 -.0003935 -.000217
male | -.0035328 .012084 -0.29 0.770 -.0272202 .0201545
_cons | -.5062895 .0810961 -6.24 0.000 -.6652556 -.3473233
------------------------------------------------------------------------------

. predict e4res, residual

. predict e4hat, xb

. reg e401k inc incsq age agesq male, robust

Linear regression Number of obs = 9,275
F(5, 9269) = 209.32
Prob > F = 0.0000
R-squared = 0.0943
Root MSE = .46479

------------------------------------------------------------------------------
| Robust
e401k | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inc | .0124464 .0006003 20.73 0.000 .0112697 .0136232
incsq | -.0000616 5.00e-06 -12.32 0.000 -.0000715 -.0000518
age | .0265061 .0038235 6.93 0.000 .0190113 .034001
agesq | -.0003053 .0000438 -6.98 0.000 -.000391 -.0002195
male | -.0035328 .0120525 -0.29 0.769 -.0271583 .0200927
_cons | -.5062895 .0785541 -6.45 0.000 -.6602728 -.3523061
------------------------------------------------------------------------------
.
. *Part (iii)
. gen e4ressq=e4res*e4res
. gen e4hatsq=e4hat*e4hat
. reg e4ressq e4hat e4hatsq

Source | SS df MS Number of obs = 9,275
-------------+---------------------------------- F(2, 9272) = 310.32
Model | 14.7106003 2 7.35530013 Prob > F = 0.0000
Residual | 219.765799 9,272 .023702092 R-squared = 0.0627
-------------+---------------------------------- Adj R-squared = 0.0625
Total | 234.476399 9,274 .0252832 Root MSE = .15395

------------------------------------------------------------------------------
e4ressq | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
e4hat | 1.009682 .057717 17.49 0.000 .8965443 1.12282
e4hatsq | -.9702863 .069728 -13.92 0.000 -1.106968 -.8336041
_cons | -.0090334 .0109145 -0.83 0.408 -.0304283 .0123615
------------------------------------------------------------------------------

. test e4hat e4hatsq

( 1) e4hat = 0
( 2) e4hatsq = 0

F( 2, 9272) = 310.32
Prob > F = 0.0000
.
. *Part (iv)
. summ e4hat

Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
e4hat | 9,275 .3921294 .1499158 .0299172 .6971899

. gen wgt=1/(e4hat*(1-e4hat))

. reg e401k inc incsq age agesq male [aweight=wgt]
(sum of wgt is 4.4712e+04)

Source | SS df MS Number of obs = 9,275
-------------+---------------------------------- F(5, 9269) = 224.23
Model | 231.856957 5 46.3713915 Prob > F = 0.0000
Residual | 1916.86894 9,269 .206804288 R-squared = 0.1079
-------------+---------------------------------- Adj R-squared = 0.1074
Total | 2148.7259 9,274 .231693541 Root MSE = .45476

------------------------------------------------------------------------------
e401k | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inc | .012552 .0005343 23.49 0.000 .0115045 .0135994
incsq | -.0000621 4.18e-06 -14.85 0.000 -.0000703 -.0000539
age | .0255006 .0037105 6.87 0.000 .0182273 .0327739
agesq | -.000295 .0000425 -6.94 0.000 -.0003782 -.0002117
male | -.00553 .0117142 -0.47 0.637 -.0284924 .0174323
_cons | -.488038 .0755912 -6.46 0.000 -.6362133 -.3398627
------------------------------------------------------------------------------
