
Martin Luther University of Halle-Wittenberg

Department of Economics
Chair of Econometrics

Econometrics
Lecture
6. Applications

Summer 2015

Key questions and objectives

This chapter focuses on the following key questions:
- How does changing the units of measurement of variables affect the OLS regression results (OLS intercept, slope estimates, standard errors, t statistics, F statistics, and confidence intervals)?
- How can we specify an appropriate functional form relationship between the explained and explanatory variables?
- How can we obtain confidence intervals for a prediction from the OLS regression line?

Applications

6 Applications
6.1 Effects of data scaling on OLS statistics
6.2 Functional form specification
    6.2.1 Using logarithmic functional forms
    6.2.2 Models with quadratics
    6.2.3 Models with interaction terms
6.3 Goodness-of-fit and selection of regressors
    6.3.1 Adjusted R-squared
    6.3.2 Selection of regressors
6.4 Prediction
    6.4.1 Confidence intervals for predictions
    6.4.2 Predicting y when ln y is the dependent variable


Applications
Effects of data scaling on OLS statistics

6.1 Effects of data scaling on OLS statistics

- In general, the coefficients, standard errors, confidence intervals, t statistics, and F statistics change in ways that preserve all measured effects and testing outcomes when variables are rescaled.
- Data scaling is often used to reduce the number of zeros after a decimal point in an estimated coefficient.
- Example: birth weight and cigarette smoking. Regression model:

    bwght^ = β̂0 + β̂1 cigs + β̂2 faminc,   (6.1)

  where
    bwght  = child birth weight, in ounces,
    cigs   = number of cigarettes smoked by the pregnant mother, per day,
    faminc = annual family income, in thousands of dollars.

Applications
Effects of data scaling on OLS statistics

The estimates of this equation, obtained using the data in BWGHT.RAW, are given in Table 6.1.

Table 6.1: Effects of Data Scaling

                              Dependent Variable
  Independent        (1) bwght      (2) bwghtlbs     (3) bwght
  Variables
  cigs                 -.4634          -.0289            ---
                       (.0916)         (.0057)
  packs                  ---             ---           -9.268
                                                       (1.832)
  faminc                .0927           .0058           .0927
                       (.0292)         (.0018)         (.0292)
  intercept           116.974          7.3109         116.974
                       (1.049)         (.0656)         (1.049)
  Observations         1,388           1,388           1,388
  R-Squared             .0298           .0298           .0298
  SSR               557,485.51      2,177.6778      557,485.51
  SER                  20.063          1.2539          20.063

Source: Wooldridge (2013), Table 6.1
Applications
Effects of data scaling on OLS statistics

Conversion of the dependent variable:
- All OLS estimates change. But once the effects are transformed into the same units, we get exactly the same answer, regardless of how the dependent variable is measured.
- Standard errors and confidence intervals change.
- Residuals and SSR change.
- Statistical significance is not affected: t and p values remain unchanged.
- R-squared is not affected.

Conversion of an explanatory variable affects only its coefficient and standard error.

Question: in the birth weight equation, suppose that faminc is measured in dollars rather than in thousands of dollars. Thus, define the variable fincdol = 1,000 · faminc. How will the OLS statistics change when fincdol is substituted for faminc? Do you think it is better to measure income in dollars or in thousands of dollars?
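The scaling claims above can be checked numerically. A minimal Python sketch (not part of the lecture, which uses Stata; simulated data and a hand-rolled OLS helper):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
faminc = rng.uniform(10, 100, n)              # income in thousands of dollars
cigs = rng.poisson(3, n).astype(float)
bwght = 117 - 0.5 * cigs + 0.1 * faminc + rng.normal(0, 20, n)

def ols(y, X):
    """Coefficients and standard errors for a regression of y on X (with intercept)."""
    Z = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ b
    sigma2 = resid @ resid / (len(y) - Z.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))
    return b, se

# original regression vs. income measured in dollars (fincdol = 1,000 * faminc)
b1, se1 = ols(bwght, np.column_stack([cigs, faminc]))
b2, se2 = ols(bwght, np.column_stack([cigs, 1000 * faminc]))

# the income coefficient and its standard error are both divided by 1,000 ...
assert np.isclose(b2[2] * 1000, b1[2])
assert np.isclose(se2[2] * 1000, se1[2])
# ... so the t statistic (and hence significance) is unchanged
assert np.isclose(b2[2] / se2[2], b1[2] / se1[2])
```

This is exactly what happens in the fincdol question: the coefficient and standard error shrink by a factor of 1,000, while t statistics, R-squared, and fitted values stay the same.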

Applications
Effects of data scaling on OLS statistics

- If the dependent variable appears in logarithmic form, changing its unit of measurement does not affect the slope coefficients:
    Conversion: ln(c·yi) = ln c + ln yi, c > 0
    New intercept: β̂0,new = β̂0,old + ln c
- Similarly, changing the unit of measurement of any explanatory variable xj, where ln(xj) appears in the regression, only affects the intercept:
    Conversion: ln(c·xij) = ln c + ln xij, c > 0
    New intercept: β̂0,new = β̂0,old − β̂j ln c


Applications
Functional form specification

6.2.1 Using logarithmic functional forms

Example: housing prices and air pollution. Estimated equation:

    ln(price)^ = 9.23 − .718 ln(nox) + .306 rooms   (6.7)
                (0.19)  (.066)         (.019)

- The coefficient β̂1 is the elasticity of price with respect to nox: if nox increases by 1%, price is predicted to fall by .718%, ceteris paribus.
- The coefficient β̂2 is the semi-elasticity of price with respect to rooms. It is the change in ln(price) when Δrooms = 1. When multiplied by 100, this is the approximate percentage change in price: one more room increases price by about 30.6%.
- The approximation error occurs because, as the change in ln y becomes larger and larger, the approximation %Δy ≈ 100·Δln(y) becomes more and more inaccurate.

Applications
Functional form specification

For the exact interpretation, consider the general estimated model:

    ln(y)^ = β̂0 + β̂1 ln(x1) + β̂2 x2.

Holding x1 fixed, we have Δln(y)^ = β̂2·Δx2. The exact percentage change is

    %Δŷ = 100·[exp(β̂2·Δx2) − 1],   (6.8)

where the multiplication by 100 turns the proportionate change into a percentage change. When Δx2 = 1,

    %Δŷ = 100·[exp(β̂2) − 1].   (6.9)

In the housing price example, %Δprice^ = 100·[exp(.306) − 1] = 35.8%, which is notably larger than the approximate percentage change, 30.6%.

Applications
Functional form specification

The adjustment in 6.8 is not as crucial for small percentage changes.

    β̂2      Approximate: 100·β̂2      Exact: 100·[exp(β̂2) − 1]
    0.05             5                         5.13
    0.10            10                        10.52
    0.15            15                        16.18
    0.20            20                        22.14
    0.30            30                        34.99
    0.50            50                        64.87
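The approximate and exact columns above follow directly from equations 6.8 and 6.9; a small illustrative sketch:

```python
import math

def approx_pct(b):
    """Approximate percentage change for a one-unit change: 100 * beta_hat."""
    return 100 * b

def exact_pct(b):
    """Exact percentage change (equation 6.9): 100 * (exp(beta_hat) - 1)."""
    return 100 * (math.exp(b) - 1)

for b in (0.05, 0.10, 0.15, 0.20, 0.30, 0.50):
    print(f"{b:.2f}: approx {approx_pct(b):5.1f}, exact {exact_pct(b):5.2f}")

# the housing example: approximate 30.6% vs. exact percentage change
print(round(exact_pct(0.306), 1))  # 35.8
```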
Advantages of using logarithmic variables:
- Appealing interpretations.
- When y > 0, models using ln(y) as the dependent variable often satisfy the CLM assumptions more closely than models using the level of y.
- Taking the log of a variable often narrows its range (e.g. monetary values, such as firms' annual sales). Narrowing the range of the dependent and independent variables can make OLS estimates less sensitive to outliers.

Applications
Functional form specification

Using explanatory variables that are measured as percentages:

    ln(wage)^ = 0.3 − 0.05 unemployment_rate
    ln(wage)^ = 0.3 − 0.05 ln(unemployment_rate)

- The first equation says that an increase in the unemployment rate by one percentage point (e.g. a change from 8 to 9) decreases wages by about 5%.
- The second equation says that an increase in the unemployment rate by one percent (e.g. a change from 8 to 8.08) decreases wages by about 0.05%.

Limitations of logarithms: logs cannot be used if a variable takes on zero or negative values. Sometimes, ln(1 + y) is used. However, this approach is acceptable only when the data on y contain relatively few zeros. Alternatives are Tobit and Poisson models.

Applications
Functional form specification

6.2.2 Models with quadratics

Quadratic functions are also used often to capture decreasing or increasing marginal effects. Example:

    y = β0 + β1 x + β2 x²,   (6.10)

where y = wage and x = exper.

Interpretation: the effect of x on y depends on the value of x:

    Δŷ ≈ (β̂1 + 2β̂2 x)·Δx,  so  Δŷ/Δx ≈ β̂1 + 2β̂2 x.   (6.11)

Typically, we might plug in the average value of x in the sample, or some other interesting values, such as the median or the lower and upper quartile values.

Applications
Functional form specification

Example: wage regression. Estimated equation:

    wage^ = 3.73 + .298 exper − .0061 exper²   (6.12)

Equation 6.12 implies that exper has a diminishing effect on wage:
- The first year of experience is worth about $.298 (roughly 30 cents) per hour.
- The second year of experience is worth less: .298 − 2(.0061)(1) ≈ .286.
- In going from 10 to 11 years of experience, wage is predicted to increase by about .298 − 2(.0061)(10) = .176.

The turning point (here, the maximum of the function) is achieved at the coefficient on x over twice the absolute value of the coefficient on x²:

    x* = β̂1 / (2|β̂2|) = .298 / (2 · .0061) ≈ 24.4.   (6.13)
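Equations 6.11 and 6.13 are easy to evaluate with the estimates from 6.12. A small sketch (illustrative only):

```python
def marginal_effect(b1, b2, x):
    """Approximate effect of one more unit of x at level x: b1 + 2*b2*x (eq. 6.11)."""
    return b1 + 2 * b2 * x

def turning_point(b1, b2):
    """Turning point of the quadratic: b1 / (2*|b2|) (eq. 6.13)."""
    return b1 / (2 * abs(b2))

b1, b2 = 0.298, -0.0061  # estimates from equation 6.12

print(round(marginal_effect(b1, b2, 0), 3))   # 0.298: value of the first year
print(round(marginal_effect(b1, b2, 1), 3))   # 0.286: value of the second year
print(round(marginal_effect(b1, b2, 10), 3))  # 0.176: going from 10 to 11 years
print(round(turning_point(b1, b2), 1))        # 24.4: experience level where wage peaks
```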

Applications
Functional form specification

Figure 6.1: Quadratic relationship between wage and exper. The fitted curve starts at wage = 3.73 when exper = 0 and reaches its maximum, wage ≈ 7.37, at exper = 24.4. Source: Wooldridge (2013), Figure 6.1.

Applications
Functional form specification

Example: effects of pollution on housing prices.

    ln(price)^ = β̂0 + β̂1 ln(nox) + β̂2 ln(dist) + β̂3 rooms + β̂4 rooms² + β̂5 stratio

. reg lprice lnox ldist c.rooms##c.rooms stratio

      Source |       SS       df       MS              Number of obs =     506
-------------+------------------------------           F(  5,   500) =  151.77
       Model |  50.9872385     5  10.1974477           Prob > F      =  0.0000
    Residual |  33.5949865   500  .067189973           R-squared     =  0.6028
-------------+------------------------------           Adj R-squared =  0.5988
       Total |   84.582225   505  .167489554           Root MSE      =  .25921

---------------------------------------------------------------------------------
         lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
           lnox |   -.901682   .1146869    -7.86   0.000     -1.12701   -.6763544
          ldist |  -.0867814   .0432807    -2.01   0.045    -.1718159    -.001747
          rooms |   -.545113   .1654541    -3.29   0.001     -.870184   -.2200419
c.rooms#c.rooms |   .0622612    .012805     4.86   0.000      .037103    .0874194
        stratio |  -.0475902   .0058542    -8.13   0.000     -.059092   -.0360884
          _cons |   13.38548   .5664731    23.63   0.000     12.27252    14.49844
---------------------------------------------------------------------------------

Applications
Functional form specification

Interpretation: what is the effect of rooms on ln(price)?
- Because the coefficient on rooms is negative and the coefficient on rooms² is positive, this equation implies that, at low values of rooms, an additional room has a negative effect on ln(price).
- At some point, the effect becomes positive, and the quadratic shape means that the semi-elasticity of price with respect to rooms is increasing as rooms increases.
- Turnaround value of rooms:

    rooms* = |−.5451| / (2 · .0623) ≈ 4.4

Applications
Functional form specification

Figure 6.2: log(price) as a quadratic function of rooms, with a minimum at rooms = 4.4. Source: Wooldridge (2013), Figure 6.2.

Applications
Functional form specification

- Only five of the 506 communities in the sample have houses averaging 4.4 rooms or less, about 1% of the sample. Hence, the quadratic to the left of 4.4 can, for practical purposes, be ignored.
- To the right of 4.4, we see that adding another room has an increasing effect on the percentage change in price:

    %Δprice^ ≈ 100·[−.545 + 2(.062)·rooms]·Δrooms = (−54.5 + 12.4·rooms)·Δrooms

- An increase in rooms from, say, five to six increases price by about −54.5 + 12.4(5) = 7.5%.
- An increase from six to seven increases price by −54.5 + 12.4(6) = 19.9%.
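The two computations above can be reproduced with the slide's formula (illustrative sketch):

```python
def pct_change_price(rooms):
    """Approximate % change in price from one additional room (slide formula)."""
    return -54.5 + 12.4 * rooms

print(round(pct_change_price(5), 1))  # 7.5  (going from five to six rooms)
print(round(pct_change_price(6), 1))  # 19.9 (going from six to seven rooms)
print(round(54.5 / 12.4, 1))          # 4.4  (turnaround value of rooms)
```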

Applications
Functional form specification

- If the coefficients on the level and squared terms have the same sign (either both positive or both negative) and the explanatory variable is nonnegative, then there is no turning point for values x > 0.
- Quadratic functions may also be used to allow for a nonconstant elasticity. Example:

    ln(price) = β0 + β1 ln(nox) + β2 (ln nox)² + ... + u.   (6.15)

  The elasticity depends on the level of nox:

    %Δprice ≈ [β1 + 2β2 ln(nox)]·%Δnox.   (6.16)

- Further (higher-order) polynomial terms can be included in regression models:

    y = β0 + β1 x + β2 x² + β3 x³ + β4 x⁴ + u.

Applications
Functional form specification

6.2.3 Models with interaction terms

Sometimes, the partial effect, elasticity, or semi-elasticity of the dependent variable with respect to an explanatory variable depends on the magnitude of another explanatory variable. Example: in the model

    price = β0 + β1 sqrft + β2 bdrms + β3 sqrft·bdrms + β4 bthrms + u

the partial effect of bdrms on price is

    Δprice/Δbdrms = β2 + β3 sqrft.   (6.17)

Interaction effect between square footage and number of bedrooms: if β3 > 0, then an additional bedroom yields a higher increase in housing price for larger houses.

Applications
Functional form specification

Example: did returns to education change between 1978 and 1985? Consider the following wage regression:

    ln(wage) = β1 + β2 y85 + β3 educ + β4 y85·educ + ... + u.

Returns to education are:

    Δln(wage)/Δeduc = β3 + β4 y85 = β3 if y85 = 0, and β3 + β4 if y85 = 1.

Applications
Functional form specification

      Source |       SS       df       MS              Number of obs =    1084
-------------+------------------------------           F(  8,  1075) =   99.80
       Model |  135.992074     8  16.9990092           Prob > F      =  0.0000
    Residual |  183.099094  1075  .170324738           R-squared     =  0.4262
-------------+------------------------------           Adj R-squared =  0.4219
       Total |  319.091167  1083   .29463635           Root MSE      =   .4127

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         y85 |   .1178062   .1237817     0.95   0.341     -.125075    .3606874
        educ |   .0747209   .0066764    11.19   0.000     .0616206    .0878212
     y85educ |   .0184605   .0093542     1.97   0.049      .000106     .036815
             |  [output omitted]
       _cons |   .4589329   .0934485     4.91   0.000     .2755707     .642295
------------------------------------------------------------------------------

- Returns to education in 1978: 7.47%.
- Returns to education in 1985: (.0747 + .0185) · 100 = 9.32%.
- Returns to education increased between 1978 and 1985 by β̂4 = .0185, i.e. by 1.85 percentage points.


Applications
Goodness-of-fit and selection of regressors

6.3.1 Adjusted R-squared

- R-squared is the proportion of the total sample variation in y that is explained by x1, x2, ..., xk.
- The size of R-squared does not affect unbiasedness.
- R-squared never decreases when additional explanatory variables are added to the model, because SSR never goes up (and usually falls) as more variables are added:

    R² = 1 − SSR/SST.

- The adjusted R-squared imposes a penalty for adding additional independent variables to a model:

    R̄² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)] = 1 − σ̂² / [SST/(n − 1)].   (6.21)

Applications
Goodness-of-fit and selection of regressors

- SSR/(n − k − 1) can go up or down when a new independent variable is added to a regression.
- If we add a new independent variable to a regression equation, R̄² increases if, and only if, the t statistic on the new variable is greater than one in absolute value.
- It holds that

    R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1).   (6.22)

- R̄² can be negative, indicating a very poor model fit relative to the number of degrees of freedom.
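Equation 6.22 can be checked against earlier output. A sketch, using the R-squared, n, and k from the pollution regression shown earlier in this chapter:

```python
def adj_r2(r2, n, k):
    """Adjusted R-squared via equation 6.22."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# pollution/housing regression: R-squared .6028 with n = 506 and k = 5 regressors
print(round(adj_r2(0.6028, 506, 5), 4))  # 0.5988, matching Stata's "Adj R-squared"

# with a weak fit and many regressors, the adjusted R-squared can be negative
print(adj_r2(0.02, 30, 10) < 0)  # True
```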

Applications
Goodness-of-fit and selection of regressors

- Adjusted R-squared can be used to choose between nonnested models. (Two equations are nonnested models when neither equation is a special case of the other.)
- Example: explaining major league baseball players' salaries.

    Model 1: ln(salary) = β0 + β1 yrs + β2 games + β3 bavg + β4 hrunsyr + u,  R̄² = .6211
    Model 2: ln(salary) = β0 + β1 yrs + β2 games + β3 bavg + β4 rbisyr + u,   R̄² = .6226

- Based on the adjusted R-squared, there is a very slight preference for the model with rbisyr.

Applications
Goodness-of-fit and selection of regressors

Example: explaining R&D intensity.

    Model 1: rdintens = β0 + β1 ln(sales) + u,           R² = .061, R̄² = .030
    Model 2: rdintens = β0 + β1 sales + β2 sales² + u,   R² = .148, R̄² = .090

- The first model captures a diminishing return by including sales in logarithmic form; the second model does this by using a quadratic. Thus, the second model contains one more parameter than the first.
- Neither R² nor R̄² can be used to choose between different functional forms for the dependent variable.

Applications
Goodness-of-fit and selection of regressors

6.3.2 Selection of regressors

- A long regression (i.e. one with many explanatory variables) is more likely to have a ceteris paribus interpretation than a short regression.
- Furthermore, a long regression generates more precise estimates of the coefficients on the variables included in a short regression, because the additional covariates lead to a smaller residual variance.
- However, it is also possible to control for too many variables in a regression analysis (over controlling).

Applications
Goodness-of-fit and selection of regressors

Example: impact of state beer taxes on traffic fatalities.
- Idea: a higher tax on beer will reduce alcohol consumption, and likewise drunk driving, resulting in fewer traffic fatalities.
- Model to measure the ceteris paribus effect of taxes on fatalities:

    fatalities = β0 + β1 tax + β2 miles + β3 percmale + β4 perc16_21 + ...,

  where
    miles = total miles driven,
    percmale = percentage of the state population that is male,
    perc16_21 = percentage of the population between ages 16 and 21.
- The model does not include a variable measuring per capita beer consumption. Are we committing an omitted variables error?
- No, because controlling for beer consumption would imply that we measure the difference in fatalities due to a one percentage point increase in tax, holding beer consumption fixed. This is not interesting.


Applications
Prediction

6.4.1 Confidence intervals for predictions

(a) CI for E(y | x1, ..., xk) (for the average value of y for the subpopulation with a given set of covariates):
- Predictions are subject to sampling variation because they are obtained using the OLS estimators.
- Estimated equation:

    ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk.   (6.27)

- Plugging in particular values of the independent variables, we obtain a prediction for y. The parameter we would like to estimate is:

    θ0 = β0 + β1 c1 + β2 c2 + ... + βk ck = E(y | x1 = c1, x2 = c2, ..., xk = ck).   (6.28)

- The estimator of θ0 is

    θ̂0 = β̂0 + β̂1 c1 + β̂2 c2 + ... + β̂k ck.   (6.29)

Applications
Prediction

- The uncertainty in this prediction is represented by a confidence interval for θ0.
- With large df, we can construct a 95% confidence interval for θ0 using the rule of thumb θ̂0 ± 2·se(θ̂0).
- How do we obtain the standard error of θ̂0? Trick:
    Write β0 = θ0 − β1 c1 − β2 c2 − ... − βk ck.
    Plug this into y = β0 + β1 x1 + β2 x2 + ... + βk xk + u.
    This gives

    y = θ0 + β1(x1 − c1) + β2(x2 − c2) + ... + βk(xk − ck) + u.   (6.30)

- That is, we run a regression where we subtract the value cj from each observation on xj. The predicted value and its standard error are obtained from the intercept in regression 6.30.
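The centering trick can be demonstrated on simulated data. A minimal numpy sketch (not from the lecture; the data and variable names are made up): the intercept of the shifted regression equals the prediction at (c1, c2), and its standard error is se(θ̂0).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.5, size=n)

c1, c2 = 1.2, -0.7  # point at which we want E(y | x1 = c1, x2 = c2)

# ordinary fit, then prediction by plugging in (c1, c2)
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
pred = b[0] + b[1] * c1 + b[2] * c2

# regression 6.30: subtract c_j from each observation on x_j
Xc = np.column_stack([np.ones(n), x1 - c1, x2 - c2])
theta = np.linalg.lstsq(Xc, y, rcond=None)[0]

# the intercept of the shifted regression IS the prediction
assert np.isclose(theta[0], pred)

# and its standard error is the standard error of the prediction
resid = y - Xc @ theta
sigma2 = resid @ resid / (n - 3)
se_theta0 = np.sqrt(sigma2 * np.linalg.inv(Xc.T @ Xc)[0, 0])
print(round(float(theta[0]), 3), round(float(se_theta0), 3))
```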

Applications
Prediction

Example: confidence interval for predicted college GPA. Estimation results:

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  398.02
       Model |  499.030504     4  124.757626           Prob > F      =  0.0000
    Residual |  1295.16517  4132  .313447524           R-squared     =  0.2781
-------------+------------------------------           Adj R-squared =  0.2774
       Total |  1794.19567  4136  .433799728           Root MSE      =  .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
      hsperc |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
       hsize |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
     hsizesq |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   1.492652   .0753414    19.81   0.000     1.344942    1.640362
------------------------------------------------------------------------------

Note: colgpa = GPA after the fall semester, sat = combined SAT score, hsperc = high school percentile (from the top), hsize = size of the graduating class (in 100s).

Applications
Prediction

What is the predicted college GPA when sat = 1,200, hsperc = 30, and hsize = 5 (which means 500)? Define a new set of independent variables: sat0 = sat − 1,200, hsperc0 = hsperc − 30, hsize0 = hsize − 5, and hsizesq0 = hsize² − 25.

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  398.02
       Model |  499.030503     4  124.757626           Prob > F      =  0.0000
    Residual |  1295.16517  4132  .313447524           R-squared     =  0.2781
-------------+------------------------------           Adj R-squared =  0.2774
       Total |  1794.19567  4136  .433799728           Root MSE      =  .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sat0 |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
     hsperc0 |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
      hsize0 |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
    hsizesq0 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   2.700075   .0198778   135.83   0.000     2.661104    2.739047
------------------------------------------------------------------------------

Applications
Prediction

- The variance of the prediction is smallest at the mean values of the xj (because the variance of the intercept estimator is smallest when each explanatory variable has zero sample mean).

(b) CI for a particular unit from the population: prediction interval.
- In forming a confidence interval for an unknown outcome on y, we must account for the variance of the unobserved error.
- Let y⁰ be the value for an individual not in our original sample, let x1⁰, x2⁰, ..., xk⁰ be the new values of the independent variables, and let u⁰ be the unobserved error.
- Model for observation (y⁰, x1⁰, ..., xk⁰):

    y⁰ = β0 + β1 x1⁰ + β2 x2⁰ + ... + βk xk⁰ + u⁰.   (6.33)

- Prediction:

    ŷ⁰ = β̂0 + β̂1 x1⁰ + β̂2 x2⁰ + ... + β̂k xk⁰.

- Prediction error:

    ê⁰ = y⁰ − ŷ⁰ = (β0 + β1 x1⁰ + ... + βk xk⁰) + u⁰ − ŷ⁰.   (6.34)

Applications
Prediction

- The expected prediction error is zero, E(ê⁰) = 0, because the β̂j are unbiased — so E(ŷ⁰) = β0 + β1 x1⁰ + ... + βk xk⁰ — and u⁰ has zero mean.
- The variance of the prediction error is the sum of the variances, because u⁰ and ŷ⁰ are uncorrelated:

    Var(ê⁰) = Var(ŷ⁰) + Var(u⁰) = Var(ŷ⁰) + σ².   (6.35)

- There are two sources of variation in ê⁰:
    1. Sampling error in ŷ⁰, which arises because we have estimated the βj; it decreases with the sample size.
    2. σ², the variance of the error in the population; it does not change with the sample size.
- Standard error of ê⁰:

    se(ê⁰) = {[se(ŷ⁰)]² + σ̂²}^(1/2).   (6.36)

Applications
Prediction

It holds that ê⁰/se(ê⁰) has a t distribution with n − k − 1 degrees of freedom. Therefore,

    P(−t_{α/2} ≤ ê⁰/se(ê⁰) ≤ t_{α/2}) = 1 − α,
    P(−t_{α/2} ≤ (y⁰ − ŷ⁰)/se(ê⁰) ≤ t_{α/2}) = 1 − α,
    P(ŷ⁰ − t_{α/2}·se(ê⁰) ≤ y⁰ ≤ ŷ⁰ + t_{α/2}·se(ê⁰)) = 1 − α.

Applications
Prediction

Example: prediction interval (for GPA) for any particular student.

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  398.02
       Model |  499.030503     4  124.757626           Prob > F      =  0.0000
    Residual |  1295.16517  4132  .313447524           R-squared     =  0.2781
-------------+------------------------------           Adj R-squared =  0.2774
       Total |  1794.19567  4136  .433799728           Root MSE      =  .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sat0 |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
     hsperc0 |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
      hsize0 |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
    hsizesq0 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   2.700075   .0198778   135.83   0.000     2.661104    2.739047
------------------------------------------------------------------------------

    se(ê⁰) = [(.020)² + (.560)²]^(1/2) ≈ .560.
    95% prediction interval: 2.70 ± 1.96 · .560 = [1.60, 3.80].
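The interval arithmetic on this slide is a one-liner; a quick sketch using the rounded values from the output (se(θ̂0) ≈ .020, σ̂ ≈ .560, ŷ⁰ = 2.70):

```python
import math

se_pred, sigma_hat, y_hat = 0.020, 0.560, 2.70

se_e0 = math.sqrt(se_pred**2 + sigma_hat**2)  # equation 6.36
lo, hi = y_hat - 1.96 * se_e0, y_hat + 1.96 * se_e0

print(round(se_e0, 3))             # 0.56: the error variance dominates
print(round(lo, 2), round(hi, 2))  # 1.6 3.8
```

Almost all of the interval width comes from σ̂², not from the sampling error in ŷ⁰, which is why the 95% prediction interval is so wide.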

Applications
Prediction

6.4.2 Predicting y when ln y is the dependent variable

- Given the OLS estimates, we can predict ln(y) for any value of the explanatory variables:

    ln(y)^ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk.   (6.39)

- How can we predict y? N.B.: ŷ ≠ exp(ln(y)^). Hence, simply exponentiating the predicted value of ln(y) does not work; in fact, it systematically underestimates the expected value of y.
- It can be shown that

    E(y | x) = exp(σ²/2) · exp(β0 + β1 x1 + β2 x2 + ... + βk xk),

  where σ² is the variance of u.

Applications
Prediction

- Hence, the prediction of y is:

    ŷ = exp(σ̂²/2) · exp(ln(y)^),   (6.40)

  where σ̂² is the unbiased estimator of σ².
- The prediction in 6.40 relies on the normality of the error term, u. How can we obtain a prediction that does not rely on normality?
- General model:

    E(y | x) = α0 · exp(β0 + β1 x1 + β2 x2 + ... + βk xk),   (6.41)

  where α0 is the expected value of exp(u).
- Given an estimate α̂0, we can predict y as

    ŷ = α̂0 · exp(ln(y)^).   (6.42)

Applications
Prediction

First approach to estimating α0: a consistent, but not unbiased, smearing estimate is

    α̂0 = (1/n) Σ_{i=1}^{n} exp(û_i).   (6.43)

Second approach to estimating α0:
- Define m_i = exp(β0 + β1 x_i1 + β2 x_i2 + ... + βk x_ik).
- Replace the βj with their OLS estimates and obtain m̂_i = exp(ln(y_i)^).
- Estimate a simple regression of y_i on m̂_i without an intercept. The slope estimate is a consistent, but not unbiased, estimate of α0.

With a consistent estimate of α0, the prediction for y can be calculated as α̂0 · exp(ln(y)^).
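Both estimators of α0 can be illustrated on simulated data. A sketch (not from the lecture; it assumes a one-regressor log model with normal errors, so the true α0 = exp(σ²/2) = exp(0.125) ≈ 1.13):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
u = rng.normal(scale=0.5, size=n)
lny = 1.0 + 0.3 * x + u
y = np.exp(lny)

# OLS of ln(y) on x, then residuals
X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, lny, rcond=None)[0]
uhat = lny - X @ b

# first approach (eq. 6.43): smearing estimate
alpha_smear = np.mean(np.exp(uhat))

# second approach: regress y on m_hat = exp(fitted ln y), no intercept
m_hat = np.exp(X @ b)
alpha_reg = (m_hat @ y) / (m_hat @ m_hat)

print(round(float(alpha_smear), 3), round(float(alpha_reg), 3))  # both near 1.13
```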

Applications
Prediction

Example: predicting CEO salaries. Model:

    ln(salary) = β0 + β1 ln(sales) + β2 ln(mktval) + β3 ceoten + u.

Estimation results:

      Source |       SS       df       MS              Number of obs =     177
-------------+------------------------------           F(  3,   173) =   26.91
       Model |  20.5672434     3  6.85574779           Prob > F      =  0.0000
    Residual |  44.0789697   173  .254791732           R-squared     =  0.3182
-------------+------------------------------           Adj R-squared =  0.3063
       Total |  64.6462131   176  .367308029           Root MSE      =  .50477

------------------------------------------------------------------------------
     lsalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lsales |   .1628545   .0392421     4.15   0.000     .0853995    .2403094
     lmktval |    .109243   .0495947     2.20   0.029     .0113545    .2071315
      ceoten |   .0117054   .0053261     2.20   0.029      .001193    .0222178
       _cons |   4.503795   .2572344    17.51   0.000     3.996073    5.011517
------------------------------------------------------------------------------

Applications
Prediction

The smearing estimate of α0 is:

. predict uhat, res
. gen euhat = exp(uhat)
. su euhat

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       euhat |       177    1.135661    .6970541   .0823372   6.378018

Applications
Prediction

The regression estimate of α0 is:

. predict lsalary_hat
(option xb assumed; fitted values)
. gen m_hat = exp(lsalary_hat)
. reg salary m_hat, nocons

      Source |       SS       df       MS              Number of obs =     177
-------------+------------------------------           F(  1,   176) =  562.39
       Model |  147352711      1   147352711           Prob > F      =  0.0000
    Residual |   46113901    176  262010.801           R-squared     =  0.7616
-------------+------------------------------           Adj R-squared =  0.7603
       Total |  193466612    177  1093031.71           Root MSE      =  511.87

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       m_hat |   1.116857   .0470953    23.71   0.000     1.023912    1.209801
------------------------------------------------------------------------------

Applications
Prediction

Prediction for sales = 5,000 (which means $5 billion, because sales is in millions), mktval = 10,000 (or $10 billion), and ceoten = 10:

    ln(salary)^ = 4.503 + 0.163·ln(5000) + 0.109·ln(10000) + 0.012·10 = 7.013.

- Naive prediction: exp(7.013) = 1110.983.
- Prediction using the smearing estimate: 1.136 · exp(7.013) = 1262.076.
- Prediction using the regression estimate: 1.117 · exp(7.013) = 1240.967.
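The three predictions can be reproduced from the estimation output. A sketch (using the full-precision coefficients, so ln(salary)^ comes out as 7.014 rather than the slide's rounded 7.013):

```python
import math

# coefficients from the CEO salary output above
cons, b_lsales, b_lmktval, b_ceoten = 4.503795, 0.1628545, 0.109243, 0.0117054

lnsal = cons + b_lsales * math.log(5000) + b_lmktval * math.log(10000) + b_ceoten * 10

naive = math.exp(lnsal)     # systematically underestimates E(salary | x)
smear = 1.135661 * naive    # corrected with the smearing estimate
regr = 1.116857 * naive     # corrected with the regression estimate

print(round(lnsal, 3))  # 7.014
print(round(naive), round(regr), round(smear))
```

Both corrections scale the naive prediction up, with the smearing estimate giving the largest predicted salary.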

Key terms

adjusted R-squared
interaction effect
nonnested models
over controlling
prediction error
prediction interval
predictions
quadratic functions
smearing estimate
variance of the prediction error


References
Textbook: Chapter 6 in Wooldridge (2013).
Further readings: Chapters 8 and 9 in Stock and Watson (2012); Chapters 6 and 10 in Hill et al. (2001).

Hill, R. C., Griffiths, W. E., and Judge, G. G. (2001). Undergraduate Econometrics. John Wiley & Sons, New York.
Stock, J. H. and Watson, M. W. (2012). Introduction to Econometrics. Pearson, Boston.
Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach. Cengage Learning, Mason, OH.
