
# Martin Luther University of Halle-Wittenberg

Department of Economics
Chair of Econometrics

Econometrics
Lecture
6. Applications

Summer 2015
1 / 49

## Key questions and objectives

This chapter focuses on the following key questions:
- How does changing the units of measurement of variables affect the OLS regression results (intercept and slope estimates, standard errors, t statistics, F statistics, and confidence intervals)?
- How can we specify an appropriate functional form for the relationship between the explained and explanatory variables?
- How can we obtain confidence intervals for a prediction from the OLS regression line?

## 6 Applications

6.1 Effects of data scaling on OLS statistics
6.2 Functional form specification
    6.2.1 Using logarithmic functional forms
    6.2.3 Models with interaction terms
6.3 Goodness-of-fit and selection of regressors
    6.3.2 Selection of regressors
6.4 Prediction
    6.4.1 Confidence intervals for predictions
    6.4.2 Predicting y when ln y is the dependent variable


## 6.1 Effects of data scaling on OLS statistics

When variables are rescaled, the coefficients, standard errors, confidence intervals, t statistics, and F statistics change in ways that preserve all measured effects and testing outcomes.
Data scaling is often used to reduce the number of zeros after a decimal point in an estimated coefficient.

Example: birth weight and cigarette smoking
Regression model:

bwght-hat = β̂0 + β̂1 cigs + β̂2 faminc,   (6.1)

where
bwght = child birth weight, in ounces;
cigs = number of cigarettes smoked by the pregnant mother, per day;
faminc = annual family income, in thousands of dollars.

The estimates of this equation, obtained using the data in BWGHT.RAW, are given in Table 6.1.

## Table 6.1: Effects of Data Scaling

| Independent variables | (1) bwght | (2) bwghtlbs | (3) bwght |
|---|---|---|---|
| cigs | -.4634 (.0916) | -.0289 (.0057) | |
| packs | | | -9.268 (1.832) |
| faminc | .0927 (.0292) | .0058 (.0018) | .0927 (.0292) |
| intercept | 116.974 (1.049) | 7.3109 (.0656) | 116.974 (1.049) |
| Observations | 1,388 | 1,388 | 1,388 |
| R-squared | .0298 | .0298 | .0298 |
| SSR | 557,485.51 | 2,177.6778 | 557,485.51 |
| SER | 20.063 | 1.2539 | 20.063 |

Source: Wooldridge (2013), Table 6.1. Standard errors in parentheses; the column heads give the dependent variable.

Conversion of the dependent variable:
- All OLS estimates change. But once the effects are transformed into the same units, we get exactly the same answer, regardless of how the dependent variable is measured.
- Standard errors and confidence intervals change.
- Residuals and the SSR change.
- Statistical significance is not affected: t and p values remain unchanged.
- R-squared is not affected.

Conversion of an explanatory variable affects only its own coefficient and standard error.

Question: in the birth weight equation, suppose that faminc is measured in dollars rather than in thousands of dollars. Thus, define the variable fincdol = 1,000·faminc. How will the OLS statistics change when fincdol is substituted for faminc? Do you think it is better to measure income in dollars or in thousands of dollars?
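These scaling facts can be checked numerically. Below is a minimal pure-Python sketch with made-up numbers (not the BWGHT data): dividing y by 16 (ounces to pounds) scales the slope by 1/16, while converting x from cigarettes to packs (dividing by 20) multiplies its slope by 20.

```python
# Toy data (hypothetical): y = birth weight in ounces, x = cigarettes/day.
x = [0, 5, 10, 15, 20, 25]
y = [120, 118, 115, 113, 110, 108]

def ols(x, y):
    """Simple-regression OLS: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

_, b1 = ols(x, y)                                  # y in ounces
_, b1_lbs = ols(x, [yi / 16 for yi in y])          # y in pounds
_, b1_packs = ols([xi / 20 for xi in x], y)        # x in packs

# Rescaling y rescales the slope by the same factor; rescaling x
# rescales only that variable's slope, by the inverse factor.
print(round(b1_lbs * 16, 10) == round(b1, 10))   # True
print(round(b1_packs / 20, 10) == round(b1, 10)) # True
```

With the actual BWGHT data the same identities hold for every coefficient, which is what Table 6.1 displays.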

Changing the unit of measurement of the dependent variable, when ln y is used in the regression, does not affect the slope coefficients:
- Conversion: ln(c·yi) = ln c + ln yi, for c > 0.
- New intercept: β̂0,new = β̂0,old + ln c.
Similarly, changing the unit of measurement of an explanatory variable xj, where ln(xj) appears in the regression, affects only the intercept:
- Conversion: ln(c·xij) = ln c + ln xij, c > 0.
- New intercept: β̂0,new = β̂0,old - β̂j·ln c.

## 6.2 Functional form specification

## 6.2.1 Using logarithmic functional forms

Example: housing prices and air pollution
Estimated equation:

ln(price)-hat = 9.23 - .718 ln nox + .306 rooms   (6.7)
               (0.19)  (.066)       (.019)

The coefficient on ln nox is the elasticity of price with respect to nox: when nox increases by 1%, price is predicted to fall by .718%, ceteris paribus.
The coefficient β̂2 is the semi-elasticity of price with respect to rooms. It is the change in ln price when Δrooms = 1. When multiplied by 100, this is the approximate percentage change in price: one more room raises price by about 30.6%.
The approximation error occurs because, as the change in ln y becomes larger and larger, the approximation %Δy ≈ 100·Δln y becomes more and more inaccurate.

For the exact interpretation, consider the general estimated model:

ln(y)-hat = β̂0 + β̂1 ln x1 + β̂2 x2.

Holding x1 fixed, we have Δln(y)-hat = β̂2 Δx2.
The exact percentage change is

%Δŷ = 100·[exp(β̂2 Δx2) - 1],   (6.8)

where the multiplication by 100 turns the proportionate change into a percentage change.
When Δx2 = 1,

%Δŷ = 100·[exp(β̂2) - 1].   (6.9)

In the housing example: %Δprice = 100·[exp(.306) - 1] = 35.8%, which is notably larger than the approximate percentage change, 30.6%.
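The exact conversion in (6.8)/(6.9) is a one-liner; the values below use the rooms coefficient .306 from equation (6.7).

```python
import math

# Approximate vs. exact percentage change implied by a log-level
# coefficient (equations 6.8/6.9).
def exact_pct_change(beta, dx=1.0):
    return 100 * math.expm1(beta * dx)  # 100*(exp(beta*dx) - 1)

beta2 = 0.306
approx = 100 * beta2
exact = exact_pct_change(beta2)
print(round(approx, 1), round(exact, 1))  # 30.6 35.8
```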

The adjustment in (6.8) is not as crucial for small percentage changes:

| β̂2 | Approximate: 100·β̂2 | Exact: 100·[exp(β̂2) - 1] |
|---|---|---|
| 0.05 | 5 | 5.13 |
| 0.10 | 10 | 10.52 |
| 0.15 | 15 | 16.18 |
| 0.20 | 20 | 22.14 |
| 0.30 | 30 | 34.99 |
| 0.50 | 50 | 64.87 |

Beyond their appealing interpretations, log models have further advantages:
- When y > 0, models using ln y as the dependent variable often satisfy the CLM assumptions more closely than models using the level of y.
- Taking the log of a variable often narrows its range (e.g. monetary values, such as firms' annual sales). Narrowing the range of the dependent and independent variables can make OLS estimates less sensitive to outliers.

Applications
Functional form specification

## \ = 0.3 0.05unemployment rate

ln(wage)
\ = 0.3 0.05 ln(unemployment rate)
ln(wage)

## one percentage point (e.g. a change from 8 to 9) decreases wages by

The second equation says that an increase in the unemployment rate by
one percent (e.g. a change from 8 to 8.08) decreases wages by about
0.05%.
Limitations of logarithms: logs cannot be used if a variable takes on

## zero or negative values. Sometimes, ln(1 + y ) is used. However, this

approach is acceptable only when the data on y contain relatively few
zeros. Alternatives are Tobit and Poisson models.
13 / 49

Quadratic functions are also often used, to capture decreasing or increasing marginal effects.
Example:

ŷ = β̂0 + β̂1 x + β̂2 x²,   (6.10)

where y = wage and x = exper.
Interpretation: the effect of x on y depends on the value of x:

Δŷ ≈ (β̂1 + 2β̂2 x)·Δx,  so  Δŷ/Δx ≈ β̂1 + 2β̂2 x.   (6.11)

Typically, we might plug in the average value of x in the sample, or some other interesting values, such as the median or the lower and upper quartile values.

Example: wage regression
Estimated equation:

wage-hat = 3.73 + .298 exper - .0061 exper²   (6.12)

Equation (6.12) implies that exper has a diminishing effect on wage:
- The first year of experience is worth $.298 (about 30 cents) per hour.
- The second year of experience is worth less: .298 - 2(.0061)(1) = .286.
- In going from 10 to 11 years of experience, wage is predicted to increase by about .298 - 2(.0061)(10) = .176.
The turning point (the maximum of the function) is reached at the coefficient on x over twice the absolute value of the coefficient on x²:

x* = β̂1 / (2|β̂2|) = .298 / (2(.0061)) ≈ 24.4.   (6.13)
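A small sketch of the marginal-effect formula (6.11) and the turning point (6.13), using the estimates from equation (6.12):

```python
# Marginal effect and turning point of the quadratic wage equation:
# wage-hat = 3.73 + .298*exper - .0061*exper^2.
b1, b2 = 0.298, -0.0061

def marginal_effect(exper):
    """Approximate hourly-wage gain from one more year of experience."""
    return b1 + 2 * b2 * exper

turning_point = -b1 / (2 * b2)
print(round(marginal_effect(1), 3))   # 0.286 (second year)
print(round(marginal_effect(10), 3))  # 0.176 (year 10 to 11)
print(round(turning_point, 1))        # 24.4
```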

[Figure 6.1 omitted: the fitted quadratic relationship between wage and exper (Wooldridge, 2013). Wage rises from 3.73 at exper = 0 to a maximum of 7.37 at exper = 24.4, then declines.]
If the turning point lies beyond all but a small percentage of the people in the sample, the declining segment is not of much concern.

Example: effects of pollution on housing prices
Model:

ln price = β0 + β1 ln nox + β2 ln dist + β3 rooms + β4 rooms² + β5 stratio + u

. reg lprice lnox ldist c.rooms##c.rooms stratio

      Source |       SS       df       MS              Number of obs =     506
-------------+------------------------------           F(  5,   500) =  151.77
       Model |  50.9872385     5  10.1974477           Prob > F      =  0.0000
    Residual |  33.5949865   500  .067189973           R-squared     =  0.6028
-------------+------------------------------           Adj R-squared =  0.5988
       Total |   84.582225   505  .167489554           Root MSE      =  .25921

---------------------------------------------------------------------------------
         lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
           lnox |   -.901682   .1146869    -7.86   0.000     -1.12701   -.6763544
          ldist |  -.0867814   .0432807    -2.01   0.045    -.1718159    -.001747
          rooms |   -.545113   .1654541    -3.29   0.001     -.870184   -.2200419
c.rooms#c.rooms |   .0622612    .012805     4.86   0.000      .037103    .0874194
        stratio |  -.0475902   .0058542    -8.13   0.000     -.059092   -.0360884
          _cons |   13.38548   .5664731    23.63   0.000     12.27252    14.49844
---------------------------------------------------------------------------------

Interpretation: what is the effect of rooms on ln price?
Because the coefficient on rooms is negative and the coefficient on rooms² is positive, this equation implies that, at low values of rooms, an additional room has a negative effect on ln price.
At some point, the effect becomes positive, and the quadratic shape means that the semi-elasticity of price with respect to rooms is increasing as rooms increases.
Turnaround value of rooms:

rooms* = .5451 / (2(.0623)) ≈ 4.4

[Figure 6.2 omitted: log(price) as a quadratic function of rooms (Wooldridge, 2013). The fitted curve decreases until rooms = 4.4 and increases thereafter.]

Only five of the 506 communities in the sample have houses averaging 4.4 rooms or less, about 1% of the sample. Hence, the part of the quadratic to the left of 4.4 can, for practical purposes, be ignored.
To the right of 4.4, adding another room has an increasing effect on the percentage change in price:

%Δprice-hat ≈ 100·[-.545 + 2(.062)·rooms]·Δrooms = (-54.5 + 12.4·rooms)·Δrooms

An increase in rooms from, say, five to six increases price by about -54.5 + 12.4(5) = 7.5%.
An increase from six to seven increases price by about -54.5 + 12.4(6) = 19.9%.
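The back-of-the-envelope numbers above can be reproduced directly:

```python
# Semi-elasticity of price with respect to rooms implied by the
# quadratic fit: % change per extra room = -54.5 + 12.4*rooms.
def pct_per_room(rooms):
    return -54.5 + 12.4 * rooms

print(round(pct_per_room(5), 1))  # 7.5  (five -> six rooms)
print(round(pct_per_room(6), 1))  # 19.9 (six -> seven rooms)
```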

If the coefficients on the level and squared terms have the same sign (either both positive or both negative) and the explanatory variable is nonnegative, then there is no turning point for values x > 0.
Quadratic functions may also be used to allow for a nonconstant elasticity.
Example:

ln price = β0 + β1 ln nox + β2 (ln nox)² + ... + u.   (6.15)

The elasticity depends on the level of nox:

%Δprice ≈ [β1 + 2β2 ln nox]·%Δnox.   (6.16)

Further (higher-order) polynomial terms can be included in regression models:

y = β0 + β1 x + β2 x² + β3 x³ + β4 x⁴ + u.

## 6.2.3 Models with interaction terms

Sometimes, the partial effect, elasticity, or semi-elasticity of the dependent variable with respect to an explanatory variable depends on the magnitude of another explanatory variable.
Example: in the model

price = β0 + β1 sqrft + β2 bdrms + β3 sqrft·bdrms + β4 bthrms + u

the partial effect of bdrms on price is

Δprice/Δbdrms = β2 + β3·sqrft.   (6.17)

Interaction effect between square footage and number of bedrooms: if β3 > 0, then an additional bedroom yields a higher increase in housing price for larger houses.
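A sketch of the partial effect (6.17). The coefficients here are hypothetical values (price in $1000s), chosen only to illustrate the sign logic; they are not estimates from the slides.

```python
# Partial effect of bdrms on price with an interaction term (eq. 6.17).
beta2, beta3 = 10.0, 0.005  # hypothetical coefficients

def bedroom_effect(sqrft):
    """Ceteris paribus effect of one extra bedroom at a given size."""
    return beta2 + beta3 * sqrft

# With beta3 > 0 the effect grows with house size:
print(round(bedroom_effect(1500), 1))  # 17.5
print(round(bedroom_effect(2500), 1))  # 22.5
```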

Example: did returns to education change between 1978 and 1985?
Consider the following wage regression:

ln wage = β1 + β2·y85 + β3·educ + β4·(y85·educ) + ... + u.

The returns to education are:

Δln wage/Δeduc = β3 + β4·y85 = β3, if y85 = 0;  β3 + β4, if y85 = 1.

Applications
Functional form specification

Source |
SS
df
MS
-------------+-----------------------------Model | 135.992074
8 16.9990092
Residual | 183.099094 1075 .170324738
-------------+-----------------------------Total | 319.091167 1083
.29463635

Number of obs
F( 8, 1075)
Prob > F
R-squared
Root MSE

=
=
=
=
=
=

1084
99.80
0.0000
0.4262
0.4219
.4127

-----------------------------------------------------------------------------lwage |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------y85 |
.1178062
.1237817
0.95
0.341
-.125075
.3606874
educ |
.0747209
.0066764
11.19
0.000
.0616206
.0878212
y85educ |
.0184605
.0093542
1.97
0.049
.000106
.036815
[output omitted]
_cons |
.4589329
.0934485
4.91
0.000
.2755707
.642295
------------------------------------------------------------------------------

## Returns to education in 1978: 7.47%

Returns to education in 1985: (.0747 + .0185)100 = 9.32%
Returns to education increased between 1978 and 1985 by

## 4 = 0.0185, i.e. by 1.85 percentage points.

24 / 49

## 6.3 Goodness-of-fit and selection of regressors

R-squared is the proportion of the total sample variation in y that is explained by x1, x2, ..., xk:

R² = 1 - SSR/SST.

The size of R-squared does not affect unbiasedness.
R-squared never decreases when additional explanatory variables are added to the model, because SSR never goes up (and usually falls) as regressors are added.
The adjusted R-squared imposes a penalty for adding additional independent variables to a model:

R̄² = 1 - [SSR/(n - k - 1)] / [SST/(n - 1)] = 1 - σ̂² / [SST/(n - 1)].   (6.21)

SSR/(n - k - 1) can go up or down when a new independent variable is added to a regression.
If we add a new independent variable to a regression equation, R̄² increases if, and only if, the t statistic on the new variable is greater than one in absolute value.
It holds that

R̄² = 1 - (1 - R²)(n - 1) / (n - k - 1).   (6.22)

R̄² can be negative, indicating a very poor model fit relative to the number of degrees of freedom.
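Formula (6.22) is easy to verify against the Stata output shown earlier for the housing-price regression (R² = .6028, n = 506, k = 5, adjusted R² = .5988):

```python
# Adjusted R-squared from R-squared, n, and k (equation 6.22).
def adj_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(adj_r2(0.6028, 506, 5), 4))  # 0.5988, as in the Stata output
print(adj_r2(0.02, 30, 10) < 0)          # True: adjusted R2 can be negative
```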

Applications
Goodness-of-fit and selection of regressors

## models. (Two equations are nonnested models when neither equation

is a special case of the other.)
Example: explaining major league baseball players salaries

## Model 1: ln salary = 0 + 1 yrs + 2 games + 3 bavg + 4 hrunsyr + u

2 = .6211
R
Model 2: ln salary = 0 + 1 yrs + 2 games + 3 bavg + 4 rbisyr + u
2 = .6226
R
Based on the adjusted R-squared, there is a very slight preference for
the model with rbisyr.

28 / 49

Example: explaining R&D intensity
Model 1: rdintens = β0 + β1 ln sales + u, with R² = .061 and R̄² = .030.
Model 2: rdintens = β0 + β1 sales + β2 sales² + u, with R² = .148 and R̄² = .090.
The first model captures a diminishing return by including sales in logarithmic form; the second model does this by using a quadratic. Thus, the second model contains one more parameter than the first.
Neither R² nor R̄² can be used to choose between different functional forms for the dependent variable.

## 6.3.2 Selection of regressors

A long regression (i.e. one with many explanatory variables) is more likely to have a ceteris paribus interpretation than a short regression.
Furthermore, a long regression generates more precise estimates of the coefficients on the variables included in a short regression, because the additional covariates lead to a smaller residual variance.
However, it is also possible to control for too many variables in a regression analysis (over controlling).

Example: impact of state beer taxes on traffic fatalities
Idea: a higher tax on beer will reduce alcohol consumption, and likewise drunk driving, resulting in fewer traffic fatalities.
Model to measure the ceteris paribus effect of taxes on fatalities:

fatalities = β0 + β1 tax + β2 miles + β3 percmale + β4 perc16_21 + ...,

where
miles = total miles driven;
percmale = percentage of the state population that is male;
perc16_21 = percentage of the population between ages 16 and 21.
The model does not include a variable measuring per capita beer consumption. Are we committing an omitted variables error?
No: controlling for beer consumption would mean that β1 measures the difference in fatalities due to a one percentage point increase in tax, holding beer consumption fixed. This is not interesting, because the tax is supposed to work precisely by reducing beer consumption.

## 6.4 Prediction

## 6.4.1 Confidence intervals for predictions

(a) CI for E(y | x1, ..., xk), i.e. for the average value of y for the subpopulation with a given set of covariates.
Predictions are subject to sampling variation because they are obtained using the OLS estimators.
Estimated equation:

ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk.   (6.27)

Let c1, c2, ..., ck denote particular values of the explanatory variables at which we want a prediction for y. The parameter we would like to estimate is:

θ0 = β0 + β1 c1 + β2 c2 + ... + βk ck   (6.28)
   = E(y | x1 = c1, x2 = c2, ..., xk = ck).

The estimator of θ0 is

θ̂0 = β̂0 + β̂1 c1 + β̂2 c2 + ... + β̂k ck.   (6.29)

The uncertainty in this prediction is represented by a confidence interval for θ0.
With a large df, we can construct a 95% confidence interval for θ0 using the rule of thumb θ̂0 ± 2·se(θ̂0).
How do we obtain the standard error of θ̂0? Trick:
- Write β0 = θ0 - β1 c1 - β2 c2 - ... - βk ck.
- Plug this into y = β0 + β1 x1 + β2 x2 + ... + βk xk + u.
- This gives

y = θ0 + β1 (x1 - c1) + β2 (x2 - c2) + ... + βk (xk - ck) + u.   (6.30)

That is, we run a regression in which we subtract the value cj from each observation on xj.
The predicted value and its standard error are then obtained as the intercept in regression (6.30).
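The recentering trick in (6.30) can be seen in a tiny simple-regression example (made-up data): after subtracting c from x, the intercept of the new regression equals the prediction at x = c.

```python
# The recentering trick (eq. 6.30) in miniature.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1]
c = 2.0

def ols(x, y):
    """Simple-regression OLS: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

b0, b1 = ols(x, y)
pred_at_c = b0 + b1 * c
a0, _ = ols([xi - c for xi in x], y)  # recentered regression
print(round(a0, 10) == round(pred_at_c, 10))  # True
```

In software, the same trick delivers se(θ̂0) as the standard error of the intercept, which is exactly what the recentered GPA regression below exploits.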

Example: confidence interval for predicted college GPA
Estimation results for predicting college GPA:

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  398.02
       Model |  499.030504     4  124.757626           Prob > F      =  0.0000
    Residual |  1295.16517  4132  .313447524           R-squared     =  0.2781
-------------+------------------------------           Adj R-squared =  0.2774
       Total |  1794.19567  4136  .433799728           Root MSE      =  .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
      hsperc |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
       hsize |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
     hsizesq |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   1.492652   .0753414    19.81   0.000     1.344942    1.640362
------------------------------------------------------------------------------

Note: colgpa = GPA after fall semester, sat = combined SAT score, hsperc = high school percentile (from top), hsize = size of graduating class (in hundreds).

Question: what is the predicted colgpa, and a confidence interval for its expected value, when sat = 1,200, hsperc = 30, and hsize = 5 (which means 500 students)?
Define a new set of independent variables: sat0 = sat - 1,200, hsperc0 = hsperc - 30, hsize0 = hsize - 5, and hsizesq0 = hsize² - 25.

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  398.02
       Model |  499.030503     4  124.757626           Prob > F      =  0.0000
    Residual |  1295.16517  4132  .313447524           R-squared     =  0.2781
-------------+------------------------------           Adj R-squared =  0.2774
       Total |  1794.19567  4136  .433799728           Root MSE      =  .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sat0 |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
     hsperc0 |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
      hsize0 |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
    hsizesq0 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   2.700075   .0198778   135.83   0.000     2.661104    2.739047
------------------------------------------------------------------------------

The standard error of the prediction is smallest at the mean values of the xj (because the variance of the intercept estimator is smallest when each explanatory variable has zero sample mean).

(b) CI for a particular unit from the population: prediction interval
In forming a confidence interval for an unknown outcome on y, we must account for the variance of the unobserved error.
- Let y⁰ be the value for an individual not in our original sample.
- Let x1⁰, x2⁰, ..., xk⁰ be the new values of the independent variables.
- Let u⁰ be the unobserved error.
Model for observation (y⁰, x1⁰, ..., xk⁰):

y⁰ = β0 + β1 x1⁰ + β2 x2⁰ + ... + βk xk⁰ + u⁰.   (6.33)

Prediction:

ŷ⁰ = β̂0 + β̂1 x1⁰ + β̂2 x2⁰ + ... + β̂k xk⁰.   (6.34)

Prediction error: ê⁰ = y⁰ - ŷ⁰.

The expected prediction error is zero, E(ê⁰) = 0, because the β̂j are unbiased and u⁰ has zero mean.
The variance of the prediction error is the sum of the variances, because u⁰ and ŷ⁰ are uncorrelated:

Var(ê⁰) = Var(ŷ⁰) + Var(u⁰) = Var(ŷ⁰) + σ².   (6.35)

There are two sources of variation in ê⁰:
1. Sampling error in ŷ⁰, which arises because we have estimated the βj; it decreases with the sample size.
2. σ², the variance of the error in the population; it does not change with the sample size.
Standard error of ê⁰:

se(ê⁰) = {[se(ŷ⁰)]² + σ̂²}^(1/2).   (6.36)

Applications
Prediction

## It holds that e0 /se(

e 0 ) has a t distribution with n k 1 degrees of

freedom.
Therefore,



e0
P t/2 6
6
t
/2 = 1
se(
e 0)


y 0 y 0
6 t/2 = 1
P t/2 6
se(
e 0)


P y 0 t/2 se(
e 0 ) 6 y 0 6 y0 + t/2 se(
e 0) = 1

39 / 49

Example: prediction interval (for GPA) for any particular student
From the recentered regression above (intercept 2.700075 with standard error .0198778; Root MSE .55986):

se(ê⁰) = [(.020)² + (.560)²]^(1/2) ≈ .560.

95% prediction interval: 2.70 ± 1.96(.560) = [1.60, 3.80].
The interval is wide because the error variance dominates the sampling variance of the predicted mean.
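The interval arithmetic above, as a sketch:

```python
import math

# Prediction-interval arithmetic for the GPA example (eq. 6.36): the
# error standard deviation (.560) dwarfs the sampling error (.020).
se_mean = 0.020      # se of predicted mean (intercept of recentered reg.)
sigma_hat = 0.560    # Root MSE
se_e0 = math.sqrt(se_mean ** 2 + sigma_hat ** 2)

lo, hi = 2.70 - 1.96 * se_e0, 2.70 + 1.96 * se_e0
print(round(se_e0, 3), round(lo, 2), round(hi, 2))  # 0.56 1.6 3.8
```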

## 6.4.2 Predicting y when ln y is the dependent variable

Given the OLS estimators, we can predict ln y for any value of the explanatory variables:

ln(y)-hat = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk.   (6.39)

How do we predict y?
N.B.: ŷ ≠ exp(ln(y)-hat). Simply exponentiating the predicted value for ln y does not work: it systematically underestimates the expected value of y.
It can be shown that, under the CLM assumptions,

E(y|x) = exp(σ²/2)·exp(β0 + β1 x1 + β2 x2 + ... + βk xk),

where σ² is the variance of u.

Hence, the prediction of y is:

ŷ = exp(σ̂²/2)·exp(ln(y)-hat),   (6.40)

where σ̂² is the unbiased estimator of σ².
The prediction in (6.40) relies on normality of the error term, u.
How can we obtain a prediction that does not rely on normality?
General model:

E(y|x) = α0·exp(β0 + β1 x1 + β2 x2 + ... + βk xk),   (6.41)

where α0 is the expected value of exp(u).
Given an estimate α̂0, we can predict y as

ŷ = α̂0·exp(ln(y)-hat).   (6.42)

Applications
Prediction

## First approach to estimate 0 : a consistent but not unbiased

smearing estimate is

0 = n1

n
X

exp(
ui ).

(6.43)

i=1

## Second approach to estimate 0 :

Define mi = exp(0 + 1 xi1 + 2 xi2 + ... + k xik ).
d
Replace the j with their OLS estimates and obtain m
i = exp(ln
yi ).
Estimate a simple regression of yi on m
i without an intercept. The
slope estimate is a consistent but not unbiased estimate for 0 .
With a consistent estimate for 0 , the prediction for y can be

d
calculated as
0 exp(ln
y ).

43 / 49
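A sketch of the smearing estimate (6.43) on simulated errors. With normal errors, α̂0 should approach E[exp(u)] = exp(σ²/2); the demeaning below is a crude stand-in for computing OLS residuals.

```python
import math
import random

# Smearing estimate (6.43) on simulated errors: with u ~ N(0, sigma^2),
# alpha0 = E[exp(u)] = exp(sigma^2 / 2), about 1.133 for sigma = 0.5.
random.seed(0)
n, sigma = 10_000, 0.5
u = [random.gauss(0.0, sigma) for _ in range(n)]
mean_u = sum(u) / n
uhat = [ui - mean_u for ui in u]  # crude stand-in for OLS residuals

alpha0_hat = sum(math.exp(ui) for ui in uhat) / n
print(round(math.exp(sigma ** 2 / 2), 3))  # 1.133 (theoretical value)
print(alpha0_hat)                          # close to 1.133
```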

Example: predicting CEO salaries
Model:

ln salary = β0 + β1 ln sales + β2 ln mktval + β3 ceoten + u.

Estimation results:

      Source |       SS       df       MS              Number of obs =     177
-------------+------------------------------           F(  3,   173) =   26.91
       Model |  20.5672434     3  6.85574779           Prob > F      =  0.0000
    Residual |  44.0789697   173  .254791732           R-squared     =  0.3182
-------------+------------------------------           Adj R-squared =  0.3063
       Total |  64.6462131   176  .367308029           Root MSE      =  .50477

------------------------------------------------------------------------------
     lsalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lsales |   .1628545   .0392421     4.15   0.000     .0853995    .2403094
     lmktval |    .109243   .0495947     2.20   0.029     .0113545    .2071315
      ceoten |   .0117054   .0053261     2.20   0.029      .001193    .0222178
       _cons |   4.503795   .2572344    17.51   0.000     3.996073    5.011517
------------------------------------------------------------------------------

The smearing estimate of α0 is the sample mean of exp(ûi):

. predict uhat, res
. gen euhat = exp(uhat)
. su euhat

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       euhat |       177    1.135661    .6970541   .0823372   6.378018

The regression estimate of α0 is the slope from regressing salary on m̂i = exp(lsalary_hat) without an intercept:

. predict lsalary_hat
(option xb assumed; fitted values)
. gen m_hat = exp(lsalary_hat)
. reg salary m_hat, nocons

      Source |       SS       df       MS              Number of obs =     177
-------------+------------------------------           F(  1,   176) =  562.39
       Model |   147352711     1   147352711           Prob > F      =  0.0000
    Residual |    46113901   176  262010.801           R-squared     =  0.7616
-------------+------------------------------           Adj R-squared =  0.7603
       Total |   193466612   177  1093031.71           Root MSE      =  511.87

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       m_hat |   1.116857   .0470953    23.71   0.000     1.023912    1.209801
------------------------------------------------------------------------------

Predicted salary for sales = 5,000 (i.e. firm sales of $5 billion, as sales is in millions), mktval = 10,000 (a market value of $10 billion), and ceoten = 10:

ln(salary)-hat = 4.504 + .163 ln(5,000) + .109 ln(10,000) + .0117(10) = 7.013.

Naive prediction: exp(7.013) = 1110.983 (salary is measured in thousands of dollars, so about $1,111,000).
Prediction using the smearing estimate: 1.136·exp(7.013) = 1262.076.
Prediction using the regression estimate: 1.117·exp(7.013) = 1240.967.
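These predictions can be reproduced from the rounded coefficients (a sketch; salary in thousands of dollars):

```python
import math

# Reproducing the CEO-salary predictions from the rounded coefficients
# (sales and mktval in millions of dollars, ceoten in years).
lsalary_hat = (4.504 + 0.163 * math.log(5000)
               + 0.109 * math.log(10000) + 0.0117 * 10)

naive = math.exp(lsalary_hat)  # ignores E[exp(u)] > 1
smearing = 1.136 * naive
regression = 1.117 * naive
print(round(lsalary_hat, 3))                          # 7.013
print(round(naive), round(smearing), round(regression))  # 1111 1262 1241
```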

## Key terms

interaction effect
nonnested models
over controlling
prediction error
prediction interval
predictions
smearing estimate
variance of the prediction error

## References

Textbook: Chapter 6 in Wooldridge (2013).
Further readings: Chapters 8 and 9 in Stock and Watson (2012); Chapters 6 and 10 in Hill et al. (2001).

Hill, R. C., Griffiths, W. E., and Judge, G. G. (2001). Undergraduate Econometrics. John Wiley & Sons, New York.
Stock, J. H. and Watson, M. W. (2012). Introduction to Econometrics. Pearson, Boston.
Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach. Cengage Learning, Mason, OH.