
Chapter 2

The Simple Linear Regression Model: Specification and Estimation

Walter R. Paczkowski
Rutgers University
Principles of Econometrics, 4th Edition Chapter 2: The Simple Linear Regression Model Page 1
Chapter Contents

2.1 An Economic Model
2.2 An Econometric Model
2.3 Estimating the Regression Parameters
2.4 Assessing the Least Squares Estimators
2.5 The Gauss-Markov Theorem
2.6 The Probability Distributions of the Least Squares Estimators
2.7 Estimating the Variance of the Error Term
2.8 Estimating Nonlinear Relationships
2.9 Regression with Indicator Variables
2.1
An Economic Model

As economists we are usually more interested in studying relationships between variables
– Economic theory tells us that expenditure on economic goods depends on income
– Consequently we call y the “dependent variable” and x the “independent” or “explanatory” variable
– In econometrics, we recognize that real-world expenditures are random variables. All the information about the dependent variable y is contained in the probability density of y.
The relevant pdf of y is a conditional probability density function since it is “conditional” upon an x
– The conditional mean, or expected value, of y is
E(y|x)
• The expected value of a random variable is
called its ‘‘mean’’ value, which is really a
contraction of population mean, the center of
the probability distribution of the random
variable.

Figure 2.1a Probability distribution of food expenditure y given income x = $1000

The conditional variance of y is σ², which measures the dispersion of y about its mean $\mu_{y|x}$
– We assume this to be constant and independent of the level of income x.
– The conditional mean E(y|x) varies with x but the conditional variance σ² is assumed constant.
– Thus E(y|x) is the focus of our attention.

Figure 2.1b Probability distributions of food expenditures y given incomes x = $1000 and x = $2000

In order to investigate the relationship between expenditure and income we must build an economic model and then a corresponding econometric model that forms the basis for a quantitative or empirical economic analysis
– This econometric model is also called a
regression model

The simple regression function is written as

Eq. 2.1    $E(y|x) = \mu_{y|x} = \beta_1 + \beta_2 x$

where β1 is the intercept and β2 is the slope

It is called simple regression not because it is easy, but because there is only one explanatory variable on the right-hand side of the equation

Figure 2.2 The economic model: a linear relationship between average per person food expenditure and income

The slope of the regression line can be written as:

Eq. 2.2    $\beta_2 = \dfrac{\Delta E(y|x)}{\Delta x} = \dfrac{dE(y|x)}{dx}$

where “Δ” denotes “change in” and “dE(y|x)/dx” denotes the derivative of the expected value of y given an x value

2.2
An Econometric Model

Figure 2.3 The probability density function for y at two levels of income

While average expenditure is given exactly by a straight-line function, the expenditure y of an individual family may deviate from this average level and is given by

$y = E(y|x) + e = \beta_1 + \beta_2 x + e$

Here e is a random error. Why e?
There are several key assumptions underlying the error term of simple linear regression.
These are required for statistical inference to be valid.
ASSUMPTIONS OF THE SIMPLE LINEAR REGRESSION MODEL - II

2.2.1
Introducing the Error Term

Assumption SR1:
The value of y, for each value of x, is:

$y = \beta_1 + \beta_2 x + e$
Assumption SR2:
The expected value of the random error e is:

$E(e) = 0$

This is equivalent to assuming that

$E(y) = \beta_1 + \beta_2 x$

Assumption SR3:
The variance of the random error e is:

$\operatorname{var}(e) = \sigma^2 = \operatorname{var}(y)$

The random variables y and e have the same variance because they differ only by a constant.

Assumption SR4:
The covariance between any pair of random errors, ei and ej, is:

$\operatorname{cov}(e_i, e_j) = \operatorname{cov}(y_i, y_j) = 0$

The stronger version of this assumption is that the random errors e are statistically independent, in which case the values of the dependent variable y are also statistically independent
Assumption SR5:
The variable x is not random, and must take at least two different values

Assumption SR6:
(optional) The values of e are normally distributed about their mean if the values of y are normally distributed, and vice versa

$e \sim N(0, \sigma^2)$

2.3
Estimating the Regression Parameters

The parameters of the population regression

$y = \beta_1 + \beta_2 x + e$

are unknown.

We collect a random sample $(y_i, x_i)$, $i = 1, 2, 3, \ldots, N$

Table 2.1 Food Expenditure and Income Data

Figure 2.6 Data for food expenditure example

2.3.1
The Least Squares Principle

The fitted regression line is:

Eq. 2.5    $\hat{y}_i = b_1 + b_2 x_i$

The least squares residual is:

Eq. 2.6    $\hat{e}_i = y_i - \hat{y}_i = y_i - b_1 - b_2 x_i$

Figure 2.7 The relationship among y, ê and the fitted regression line
Suppose we have another fitted line:

$\hat{y}_i^* = b_1^* + b_2^* x_i$

The least squares line has the smaller sum of squared residuals:

if $SSE = \sum_{i=1}^{N} \hat{e}_i^2$ and $SSE^* = \sum_{i=1}^{N} \hat{e}_i^{*2}$ then $SSE < SSE^*$

Least squares estimates for the unknown parameters β1 and β2 are obtained by minimizing the error sum of squares function:

$S(\beta_1, \beta_2) = \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)^2$

THE LEAST SQUARES ESTIMATORS

Eq. 2.7    $b_2 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

Eq. 2.8    $b_1 = \bar{y} - b_2 \bar{x}$
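A minimal Python sketch of Eq. 2.7 and Eq. 2.8 follows. The function names and the five data points are made up for illustration (they are not the food expenditure sample). The last lines check the least squares principle by comparing the sum of squared residuals with that of one arbitrarily chosen alternative line.

```python
def least_squares(x, y):
    """Return (b1, b2) computed from Eq. 2.7 and Eq. 2.8."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)  # zero only if x never varies
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared deviations of y from the line intercept + slope * x."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data, not the food expenditure sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, b2 = least_squares(x, y)
sse_ls = sum_squared_residuals(x, y, b1, b2)
# Any other line, here an arbitrary shift of the fitted one, has a larger SSE.
sse_other = sum_squared_residuals(x, y, b1 + 0.5, b2 - 0.1)
print(b1, b2, sse_ls, sse_other)
```

The division by the sum of squared deviations of x is why x must take at least two different values (assumption SR5).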

2.3.2
Estimates for the Food Expenditure Function

$b_2 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \dfrac{18671.2684}{1828.7876} = 10.2096$

$b_1 = \bar{y} - b_2 \bar{x} = 283.5735 - (10.2096)(19.6048) = 83.4160$

A convenient way to report the values for b1 and b2 is to write out the estimated or fitted regression line:

$\hat{y}_i = 83.42 + 10.21 x_i$
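The arithmetic can be reproduced from the reported sums and sample means. This sketch only re-does the division and subtraction; the variable names are illustrative, and because the inputs are already rounded to four decimals the intercept may differ from the reported value in the last digit.

```python
# Sums and sample means as reported in the text.
s_xy = 18671.2684   # sum of (x_i - x_bar)(y_i - y_bar)
s_xx = 1828.7876    # sum of (x_i - x_bar)^2
x_bar = 19.6048     # mean weekly income, in units of $100
y_bar = 283.5735    # mean weekly food expenditure, in dollars

b2 = s_xy / s_xx            # Eq. 2.7
b1 = y_bar - b2 * x_bar     # Eq. 2.8
print(round(b2, 4), round(b1, 2))
```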

Figure 2.8 The fitted regression line

2.3.3
Interpreting the Estimates

The value b2 = 10.21 is an estimate of β2, the amount by which weekly expenditure on food per household increases when household weekly income increases by $100. Thus, we estimate that if income goes up by $100, expected weekly expenditure on food will increase by approximately $10.21
– Strictly speaking, the intercept estimate b1 = 83.42 is an estimate of the weekly expenditure on food for a household with zero income

2.3.3a
Elasticities

Income elasticity is a useful way to characterize the responsiveness of consumer expenditure to changes in income. The elasticity of a variable y with respect to another variable x is:

$\varepsilon = \dfrac{\text{percentage change in } y}{\text{percentage change in } x} = \dfrac{\Delta y}{\Delta x} \cdot \dfrac{x}{y}$

In the linear economic model given by Eq. 2.1 we have shown that

$\beta_2 = \dfrac{\Delta E(y)}{\Delta x}$
The elasticity of mean expenditure with respect to income is:

Eq. 2.9    $\varepsilon = \dfrac{\Delta E(y) / E(y)}{\Delta x / x} = \dfrac{\Delta E(y)}{\Delta x} \cdot \dfrac{x}{E(y)} = \beta_2 \cdot \dfrac{x}{E(y)}$

A frequently used alternative is to calculate the elasticity at the “point of the means” because it is a representative point on the regression line.

$\hat{\varepsilon} = b_2 \dfrac{\bar{x}}{\bar{y}} = 10.21 \times \dfrac{19.60}{283.57} = 0.71$

Interpretation: If the income of a household increases by 1%, expenditure on food increases by 0.71% on average. This is true for an average household, why?
Was food a normal good or inferior good?
Classification: ε > 0 normal, otherwise inferior
Was food a luxury or necessity for the average
household in the sample?
Classification: ε > 1 luxury, otherwise necessity
What goods are luxury and what are necessity?
What is a necessity and what is a luxury depends
on the level of income. For people with a low
income, food and clothing can be luxuries. So the
level of income has a big effect on income
elasticity of demand
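The point-of-the-means elasticity and the two classification rules above can be combined in a short Python sketch. The function names are illustrative; the inputs are the rounded values b2 = 10.21, mean income 19.60 and mean expenditure 283.57 reported earlier, and the thresholds (0 and 1) are the ones stated in the classification lines.

```python
def elasticity_at_means(slope, x_bar, y_bar):
    """Elasticity evaluated at the point of the means: b2 * x_bar / y_bar."""
    return slope * x_bar / y_bar


def classify_good(elasticity):
    """Apply the two classification rules stated above."""
    kind = "normal" if elasticity > 0 else "inferior"
    need = "luxury" if elasticity > 1 else "necessity"
    return kind, need


eps_hat = elasticity_at_means(10.21, 19.60, 283.57)
print(round(eps_hat, 2), classify_good(eps_hat))
```

Under these rules an elasticity between 0 and 1, as estimated here, marks food as a normal good and a necessity for the average household in the sample.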
Income elasticity of demand for some products in the US

Figure: income elasticity of demand for food in 20 countries. Source: Theil et al. (1989). Advances in Econometrics, edited book

2.3.3b
Prediction

Suppose that we wanted to predict weekly food expenditure for a household with a weekly income of $2000. Income is measured in units of $100, so this prediction is carried out by substituting x = 20 into our estimated equation to obtain:

$\hat{y} = 83.42 + 10.21 x = 83.42 + 10.21(20) = 287.61$

(The result 287.61 uses the unrounded estimates 83.4160 and 10.2096; the rounded coefficients shown give 287.62.)

We predict that a household with a weekly income of $2000 will spend $287.61 per week on food
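A small sketch of the prediction step, with an illustrative helper name. It evaluates the fitted line at x = 20 twice, once with the two-decimal coefficients and once with the four-decimal estimates reported earlier, to show where the small rounding difference comes from.

```python
def predict(x, intercept, slope):
    """Evaluate the fitted line intercept + slope * x."""
    return intercept + slope * x


# x = 20 corresponds to a weekly income of $2000 (income is in units of $100).
rounded = predict(20, 83.42, 10.21)          # two-decimal coefficients
unrounded = predict(20, 83.4160, 10.2096)    # four-decimal estimates
print(round(rounded, 2), round(unrounded, 2))
```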

2.3.3c
Computer Output

Figure 2.9 EViews Regression Output

2.3.4
Other Economic Models

The simple regression model can be applied to estimate the parameters of many relationships in economics, business, and the social sciences
– The applications of regression analysis are
fascinating and useful

Is a CEO compensated for good company performance?
Salary ($1000) = 967 + 5.58 ROE (%)

Returns to schooling: Does wage increase with education?
Wage ($/h) = 8.49 + 1.18 EDU (years)

Input-Output relation:
Output (# units) = 36 + 0.75 Labor (hours)
2.4
Assessing the Least Squares Fit

How close are the estimates of the food expenditure model

$\hat{y}_i = 83.42 + 10.21 x_i = b_1 + b_2 x_i$

to the true parameters β1 and β2?
We call b1 and b2 the least squares estimators.
– We can investigate the properties of the estimators b1
and b2 , which are called their sampling properties, and
deal with the following important questions:
1. If the least squares estimators are random
variables, then what are their expected values,
variances, covariances, and probability
distributions?
2. How do the least squares estimators compare with
other procedures that might be used, and how can
we compare alternative estimators?
2.5
The Gauss-Markov Theorem

GAUSS-MARKOV THEOREM

Under the assumptions SR1-SR5 of the linear regression model, the estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of β1 and β2. They are the Best Linear Unbiased Estimators (BLUE) of β1 and β2

MAJOR POINTS ABOUT THE GAUSS-MARKOV THEOREM

1. The estimators b1 and b2 are “best” when compared to similar
estimators, those which are linear and unbiased. The Theorem does
not say that b1 and b2 are the best of all possible estimators.

2. The estimators b1 and b2 are best within their class because they
have the minimum variance. When comparing two linear and
unbiased estimators, we always want to use the one with the
smaller variance, since that estimation rule gives us the higher
probability of obtaining an estimate that is close to the true
parameter value.

3. In order for the Gauss-Markov Theorem to hold, assumptions SR1-SR5
must be true. If any of these assumptions are not true, then b1
and b2 are not the best linear unbiased estimators of β1 and β2.

4. The Gauss-Markov Theorem does not depend on the assumption
of normality (assumption SR6).

5. In the simple linear regression model, if we want to use a linear
and unbiased estimator, then we have to do no more searching.
The estimators b1 and b2 are the ones to use. This explains why
we are studying these estimators and why they are so widely used
in research, not only in economics but in all social and physical
sciences as well.

6. The Gauss-Markov theorem applies to the least squares
estimators. It does not apply to the least squares estimates from a
single sample.

2.4.1
The Estimator b2

The estimator b2 can be rewritten as:

Eq. 2.10    $b_2 = \sum_{i=1}^{N} w_i y_i$

where

Eq. 2.11    $w_i = \dfrac{x_i - \bar{x}}{\sum (x_i - \bar{x})^2}$

It can also be written as:

Eq. 2.12    $b_2 = \beta_2 + \sum w_i e_i$
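The step from Eq. 2.10 to Eq. 2.12 is not spelled out on the slide. A sketch of the usual argument substitutes $y_i = \beta_1 + \beta_2 x_i + e_i$ into Eq. 2.10 and uses two properties of the weights in Eq. 2.11: they sum to zero, and $\sum w_i x_i = 1$.

```latex
\begin{align*}
\sum w_i &= \frac{\sum (x_i - \bar{x})}{\sum (x_i - \bar{x})^2} = 0,
\qquad
\sum w_i x_i = \frac{\sum (x_i - \bar{x}) x_i}{\sum (x_i - \bar{x})^2} = 1 \\
b_2 &= \sum w_i y_i = \sum w_i (\beta_1 + \beta_2 x_i + e_i) \\
    &= \beta_1 \sum w_i + \beta_2 \sum w_i x_i + \sum w_i e_i \\
    &= \beta_2 + \sum w_i e_i
\end{align*}
```

The second weight property holds because $\sum (x_i - \bar{x}) x_i = \sum (x_i - \bar{x})^2$, which follows from $\sum (x_i - \bar{x}) \bar{x} = 0$.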

2.4.2
The Expected Values of b1 and b2

We will show that if our model assumptions hold, then E(b2) = β2, which means that the estimator is unbiased. We can find the expected value of b2 using the fact that the expected value of a sum is the sum of the expected values:

Eq. 2.13
$E(b_2) = E\left(\beta_2 + \sum w_i e_i\right) = E(\beta_2 + w_1 e_1 + w_2 e_2 + \cdots + w_N e_N)$
$= E(\beta_2) + E(w_1 e_1) + E(w_2 e_2) + \cdots + E(w_N e_N)$
$= E(\beta_2) + \sum E(w_i e_i)$
$= \beta_2 + \sum w_i E(e_i)$
$= \beta_2$

using $E(e_i) = 0$ and $E(w_i e_i) = w_i E(e_i)$
The property of unbiasedness is about the average values of b1 and b2 if many samples of the same size are drawn from the same population
– If we took the averages of estimates from many samples, these averages would approach the true parameter values β1 and β2
– Unbiasedness does not say that an estimate from any one sample is close to the true parameter value, and thus we cannot say that an estimate is unbiased
– We can say that the least squares estimation procedure (or the least squares estimator) is unbiased
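The repeated-sampling idea can be illustrated with a small Monte Carlo sketch in Python. Everything here is an assumption chosen for illustration: the true parameters (β1 = 3, β2 = 2, σ = 2), the fixed x values 1 to 20, normal errors, the number of samples and the seed. The slope is computed in the weighted form of Eq. 2.10.

```python
import random


def slope_weights(x):
    """Weights w_i = (x_i - x_bar) / sum (x_i - x_bar)^2 from Eq. 2.11."""
    x_bar = sum(x) / len(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [(xi - x_bar) / s_xx for xi in x]


def simulate_slopes(beta1, beta2, sigma, x, n_samples, seed=0):
    """Draw many samples with the same fixed x and return the b2 from each."""
    rng = random.Random(seed)
    w = slope_weights(x)  # x is fixed across samples, so the weights are too
    slopes = []
    for _ in range(n_samples):
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(sum(wi * yi for wi, yi in zip(w, y)))  # Eq. 2.10
    return slopes


# Illustrative (assumed) population values, not estimates from the text.
x = [float(i) for i in range(1, 21)]
slopes = simulate_slopes(beta1=3.0, beta2=2.0, sigma=2.0, x=x, n_samples=2000)
mean_slope = sum(slopes) / len(slopes)
print(min(slopes), max(slopes), mean_slope)
```

Individual estimates scatter around the true slope, while their average across samples should sit close to it, which is what unbiasedness describes.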
2.4.3
Repeated Sampling

Table 2.2 Estimates from 10 Samples

Figure 2.10 Two possible probability density functions for b2

The variance of b2 is defined as $\operatorname{var}(b_2) = E[b_2 - E(b_2)]^2$

2.4.4
The Variances and Covariances of b1 and b2

If the regression model assumptions SR1-SR5 are correct (assumption SR6 is not required), then the variances and covariance of b1 and b2 are:

Eq. 2.14    $\operatorname{var}(b_1) = \sigma^2 \left[ \dfrac{\sum x_i^2}{N \sum (x_i - \bar{x})^2} \right]$

Eq. 2.15    $\operatorname{var}(b_2) = \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}$

Eq. 2.16    $\operatorname{cov}(b_1, b_2) = \sigma^2 \left[ \dfrac{-\bar{x}}{\sum (x_i - \bar{x})^2} \right]$
MAJOR POINTS ABOUT THE VARIANCES AND COVARIANCES OF b1 AND b2

1. The larger the variance term σ², the greater the uncertainty there is in the statistical model, and the larger the variances and covariance of the least squares estimators.

2. The larger the sum of squares, $\sum (x_i - \bar{x})^2$, the smaller the variances of the least squares estimators and the more precisely we can estimate the unknown parameters.

3. The larger the sample size N, the smaller the variances and covariance of the least squares estimators.

4. The larger the term $\sum x_i^2$, the larger the variance of the least squares estimator b1.

5. The absolute magnitude of the covariance increases the larger in magnitude is the sample mean $\bar{x}$, and the covariance has a sign opposite to that of $\bar{x}$.
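Points 2 and 5 can be checked numerically with a short sketch of Eq. 2.14 to Eq. 2.16. The function name, the value σ² = 4 and the two sets of x values are assumptions made up for the illustration; both sets have the same mean but very different spread.

```python
def ls_variances(x, sigma2):
    """Return (var_b1, var_b2, cov_b1_b2) from Eq. 2.14, 2.15 and 2.16."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    var_b1 = sigma2 * sum(xi ** 2 for xi in x) / (n * s_xx)
    var_b2 = sigma2 / s_xx
    cov_b1_b2 = sigma2 * (-x_bar) / s_xx
    return var_b1, var_b2, cov_b1_b2


# Two made-up designs with the same mean (10) but different variation in x.
narrow_x = [9.0, 10.0, 11.0, 9.0, 10.0, 11.0]
wide_x = [0.0, 10.0, 20.0, 0.0, 10.0, 20.0]

narrow = ls_variances(narrow_x, sigma2=4.0)
wide = ls_variances(wide_x, sigma2=4.0)
print(narrow)
print(wide)
```

With the wider spread of x the variance of b2 is much smaller, and because the sample mean of x is positive in both designs the covariance between b1 and b2 is negative.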
series x            ' declare the explanatory variable
series y            ' declare the dependent variable
x=10*nrnd           ' x drawn from a normal distribution with standard deviation 10
y=3+2*x+50*nrnd     ' y = 3 + 2x + e, with e normal and standard deviation 50

Ex 2.6 p78 and computer ex CAPM

Figure 2.11 The influence of variation in the explanatory variable x on precision of estimation (a) Low x variation, low precision (b) High x variation, high precision

2.6
The Probability Distributions of the Least Squares Estimators

If we make the normality assumption (assumption SR6 about the error term) then the least squares estimators are normally distributed:

Eq. 2.17    $b_1 \sim N\left( \beta_1, \dfrac{\sigma^2 \sum x_i^2}{N \sum (x_i - \bar{x})^2} \right)$

Eq. 2.18    $b_2 \sim N\left( \beta_2, \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2} \right)$
A CENTRAL LIMIT THEOREM

If assumptions SR1-SR5 hold, and if the sample size N is sufficiently large, then the least squares estimators have a distribution that approximates the normal distributions shown in Eq. 2.17 and Eq. 2.18

2.7
Estimating the Variance of the Error Term

The variance of the random error ei is:

$\operatorname{var}(e_i) = \sigma^2 = E[e_i - E(e_i)]^2 = E(e_i^2)$

if the assumption E(ei) = 0 is correct.

Since the “expectation” is an average value we might consider estimating σ² as the average of the squared errors:

$\hat{\sigma}^2 = \dfrac{\sum e_i^2}{N}$

where the error terms are $e_i = y_i - \beta_1 - \beta_2 x_i$

The least squares residuals are obtained by replacing the unknown parameters by their least squares estimates:

$\hat{e}_i = y_i - \hat{y}_i = y_i - b_1 - b_2 x_i$

Using the residuals in place of the unobservable errors gives:

$\hat{\sigma}^2 = \dfrac{\sum \hat{e}_i^2}{N}$

This estimator is biased, but there is a simple modification that produces an unbiased estimator, and that is:

Eq. 2.19    $\hat{\sigma}^2 = \dfrac{\sum \hat{e}_i^2}{N - 2}$

so that:

$E(\hat{\sigma}^2) = \sigma^2$
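A minimal sketch of Eq. 2.19 in Python. The data are made up for illustration, and the values b1 = 0.05 and b2 = 1.99 are the least squares estimates for these five points obtained from Eq. 2.7 and Eq. 2.8; the function name is also illustrative.

```python
def error_variance_estimate(x, y, b1, b2):
    """Unbiased estimate of sigma^2 from Eq. 2.19: sum of e_hat^2 over (N - 2)."""
    residuals = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    return sum(e ** 2 for e in residuals) / (len(x) - 2)


# Made-up data; b1 and b2 are the least squares estimates for these points.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, b2 = 0.05, 1.99

sigma2_hat = error_variance_estimate(x, y, b1, b2)
# Dividing by N instead of N - 2 always gives a smaller number.
naive = sigma2_hat * (len(x) - 2) / len(x)
print(sigma2_hat, naive)
```

The divisor N - 2 reflects the two parameters, β1 and β2, that were estimated before the residuals could be computed.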
2.7.1
Estimating the Variance and Covariance of the Least Squares Estimators

Replace the unknown error variance σ² in Eq. 2.14 to Eq. 2.16 by $\hat{\sigma}^2$ to obtain:

Eq. 2.20    $\widehat{\operatorname{var}}(b_1) = \hat{\sigma}^2 \left[ \dfrac{\sum x_i^2}{N \sum (x_i - \bar{x})^2} \right]$

Eq. 2.21    $\widehat{\operatorname{var}}(b_2) = \dfrac{\hat{\sigma}^2}{\sum (x_i - \bar{x})^2}$

Eq. 2.22    $\widehat{\operatorname{cov}}(b_1, b_2) = \hat{\sigma}^2 \left[ \dfrac{-\bar{x}}{\sum (x_i - \bar{x})^2} \right]$

The square roots of the estimated variances are the “standard errors” of b1 and b2:

Eq. 2.23    $se(b_1) = \sqrt{\widehat{\operatorname{var}}(b_1)}$

Eq. 2.24    $se(b_2) = \sqrt{\widehat{\operatorname{var}}(b_2)}$
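The standard errors can be sketched in Python by plugging an estimate of σ² into the variance formulas and taking square roots. The x values and the error variance estimate (0.107 / 3) are made-up illustrative inputs, and the function name is hypothetical.

```python
import math


def standard_errors(x, sigma2_hat):
    """Return (se_b1, se_b2): square roots of the estimated variances
    in Eq. 2.20 and Eq. 2.21, as in Eq. 2.23 and Eq. 2.24."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    var_b1_hat = sigma2_hat * sum(xi ** 2 for xi in x) / (n * s_xx)
    var_b2_hat = sigma2_hat / s_xx
    return math.sqrt(var_b1_hat), math.sqrt(var_b2_hat)


# Made-up inputs for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma2_hat = 0.107 / 3

se_b1, se_b2 = standard_errors(x, sigma2_hat)
print(se_b1, se_b2)
```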

Ex 2.6, 2.7, 2.8 p78
