Lecture 3
The Simple Linear
Regression Model
What is Ordinary Least Squares?
✔ Ordinary Least Squares (OLS) finds the
linear model that minimizes the sum of the
squared errors.
✔ In the least-squares sense, such a model provides the
best linear explanation/prediction of the data.
✔ Later we’ll show that OLS is the “Best
Linear Unbiased Estimator” (BLUE)
“Explained” and “Unexplained”
Variation
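[Figure: scatter plot of Y against X with the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ (intercept $\hat{\beta}_0$, slope $\hat{\beta}_1$). For the observation at $X_i$, the plot marks the observed value $y_i$, the fitted value $\hat{y}_i$, the residual $\hat{u}_i$, and the total deviation $(y_i - \bar{y})$.]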
“Explained” and “Unexplained”
Variation
✔ Square $(y_i - \bar{y})$ and sum across all observations and we have our SST (Total Sum of Squares).
✔ Square $\hat{u}_i$ and sum across all observations and we have our SSR (Residual Sum of Squares).
✔ Square $(\hat{y}_i - \bar{y})$ and sum across all observations and we have our SSE (Explained Sum of Squares).
[Figure: the same plot of Y against X, with $y_i$, $\hat{y}_i$, $\hat{u}_i$ and $(y_i - \bar{y})$ marked at $X_i$.]
Some Useful Properties of
Summation
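1. $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$
2. $\sum_{i=1}^{n} c = nc$
3. $\sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i$
4. $\sum_{i=1}^{n} (a x_i + b y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$
4.1. $\sum_{i=1}^{n} (x_i / y_i) \neq \dfrac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} y_i}$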
Some Useful Properties of
Summation
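5. $\dfrac{\sum_{i=1}^{n} y_i}{n} = \bar{y}$ ...and... $\dfrac{\sum_{i=1}^{n} x_i}{n} = \bar{x}$
6. $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
7. $\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n(\bar{x})^2$
8. $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x}) y_i = \sum_{i=1}^{n} x_i y_i - n(\bar{x} \cdot \bar{y})$
Proofs of 7 and 8 are in Appendix A (page 709) of Wooldridge.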
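The summation properties can be checked numerically. The sketch below uses small made-up numbers (the lists `x`, `y` and the constants `a`, `b`, `c` are illustrative assumptions, not data from the lecture) and confirms properties 2 through 8, including that the sum of ratios in 4.1 generally differs from the ratio of sums.

```python
# Hypothetical illustration data (not from the lecture).
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]
a, b, c = 2.0, -3.0, 5.0
n = len(x)

x_bar = sum(x) / n  # property 5: the mean is the sum divided by n
y_bar = sum(y) / n


def close(u, v, tol=1e-9):
    """Return True when two floats agree up to a small tolerance."""
    return abs(u - v) < tol


# Property 2: summing a constant n times gives n*c.
prop2 = close(sum(c for _ in range(n)), n * c)
# Property 3: a constant factor can be pulled out of the sum.
prop3 = close(sum(c * xi for xi in x), c * sum(x))
# Property 4: the sum of a linear combination splits term by term.
prop4 = close(sum(a * xi + b * yi for xi, yi in zip(x, y)),
              a * sum(x) + b * sum(y))
# Property 4.1: the sum of ratios is NOT the ratio of sums in general.
sum_of_ratios = sum(xi / yi for xi, yi in zip(x, y))
ratio_of_sums = sum(x) / sum(y)
# Property 6: deviations from the mean sum to zero.
prop6 = close(sum(xi - x_bar for xi in x), 0.0)
# Property 7: sum of squared deviations.
prop7 = close(sum((xi - x_bar) ** 2 for xi in x),
              sum(xi ** 2 for xi in x) - n * x_bar ** 2)
# Property 8: sum of cross-products of deviations.
prop8 = close(sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)),
              sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar)

print(prop2, prop3, prop4, prop6, prop7, prop8)
print(sum_of_ratios, ratio_of_sums)
```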
Minimizing the Sum of Squared
Errors
✔ How to put the Least in OLS?
✔ In mathematical jargon we seek to minimize
the residual sum of squares (SSR), where:
$$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \hat{u}_i^2$$
Picking the Parameters
✔ To Minimize SSR, we need parameter
estimates.
✔ In calculus, if you wish to know where a
function is at its minimum, you take the
first derivative and set it equal to zero.
✔ In this case we must take partial
derivatives since we have two parameters
(β0 & β1) to worry about.
Parameters that Minimize SSR
Minimize the Squared Errors
✔ The SSR function is:
$$SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
✔ Substitute in our equation for $\hat{y}_i$:
$$SSR = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
Here comes the magic, Baby!
◆ Simplify Terms
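Partial Derivative with respect to β0
$$SSR = \sum (y_i - \beta_0 - \beta_1 x_i)^2$$
✔ Group the terms:
$$SSR = \sum (y_i - (\beta_0 + \beta_1 x_i))^2$$
✔ F.O.I.L. (First, Outside, Inside, Last): $(A - B)^2 = A^2 - AB - BA + B^2 = A^2 - 2AB + B^2$
$$SSR = \sum \left[ y_i^2 - 2 y_i (\beta_0 + \beta_1 x_i) + (\beta_0 + \beta_1 x_i)^2 \right]$$
✔ Multiply $-2y_i$ through and expand the last square:
$$SSR = \sum \left[ y_i^2 - 2 y_i \beta_0 - 2 y_i \beta_1 x_i + \beta_0^2 + 2 \beta_0 \beta_1 x_i + \beta_1^2 x_i^2 \right]$$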
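Partial Derivative with respect to β0
✔ Take the derivative only of terms which include β0:
$$\frac{\partial SSR}{\partial \beta_0} = \sum (0 - 2 y_i - 0 + 2 \beta_0 + 2 \beta_1 x_i + 0)$$
$$\frac{\partial SSR}{\partial \beta_0} = \sum (-2 y_i + 2 \beta_0 + 2 \beta_1 x_i)$$
✔ Simplify:
$$\frac{\partial SSR}{\partial \beta_0} = \sum 2 (y_i - \beta_0 - \beta_1 x_i)(-1)$$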
Partial Derivative with respect to β1
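✔ First equation is the partial derivative with respect to β0:
$$\frac{\partial SSR}{\partial \beta_0} = \sum_{i=1}^{n} 2 (y_i - \beta_0 - \beta_1 x_i)(-1)$$
✔ Second equation is with respect to β1:
$$\frac{\partial SSR}{\partial \beta_1} = \sum_{i=1}^{n} 2 (y_i - \beta_0 - \beta_1 x_i)(-x_i)$$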
Simplify and Set Equal to Zero
✔ First equation is for β0, second is for β1
✔ Set = 0 to find minimum point
✔ (Hats denote that parameters are estimates)
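$$\frac{\partial SSR}{\partial \beta_0} = -2 \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
$$\frac{\partial SSR}{\partial \beta_1} = -2 \sum x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$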
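The two partial derivatives can be sanity-checked against a finite-difference approximation. This is a minimal sketch: the function names (`ssr`, `ssr_gradient`, `numeric_gradient`), the data, and the trial point `(b0, b1)` are all illustrative assumptions rather than anything from the lecture.

```python
# Hypothetical illustration data and trial parameter values.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]
b0, b1 = 0.5, 1.5


def ssr(b0, b1, x, y):
    """Residual sum of squares for the line b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def ssr_gradient(b0, b1, x, y):
    """Analytical partial derivatives of SSR, as derived on the slides."""
    d_b0 = -2 * sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
    d_b1 = -2 * sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y))
    return d_b0, d_b1


def numeric_gradient(b0, b1, x, y, h=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    d_b0 = (ssr(b0 + h, b1, x, y) - ssr(b0 - h, b1, x, y)) / (2 * h)
    d_b1 = (ssr(b0, b1 + h, x, y) - ssr(b0, b1 - h, x, y)) / (2 * h)
    return d_b0, d_b1


analytic = ssr_gradient(b0, b1, x, y)
numeric = numeric_gradient(b0, b1, x, y)
print(analytic)
print(numeric)
```

The two printed pairs should agree to several decimal places. Neither derivative is zero at this trial point, which simply means it is not the minimum.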
The Normal Equations
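✔ Divide equation 1 by -2:
$$\sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
✔ Divide equation 2 by -2 and rearrange:
$$\sum x_i y_i = \hat{\beta}_0 \left(\sum x_i\right) + \hat{\beta}_1 \left(\sum x_i^2\right)$$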
Solving the Normal Equations
✔ Now we have two equations with two
unknown terms: β0 and β1
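✔ Write the first normal equation as $\sum y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum x_i$ and multiply it by $\sum x_i$; multiply the second by $n$; then subtract the first from the second:
$$n \sum x_i y_i - \sum x_i \sum y_i = \hat{\beta}_0 n \sum x_i + n \hat{\beta}_1 \left(\sum x_i^2\right) - n \hat{\beta}_0 \sum x_i - \hat{\beta}_1 \left(\sum x_i\right)^2$$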
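Still Solving for β1 …
✔ Terms of $\hat{\beta}_0 n \sum x_i$ cancel one another out:
$$n \sum x_i y_i - \sum x_i \sum y_i = n \hat{\beta}_1 \left(\sum x_i^2\right) - \hat{\beta}_1 \left(\sum x_i\right)^2$$
✔ Then we factor out $\hat{\beta}_1$ from both terms on the right-hand side:
$$n \sum x_i y_i - \sum x_i \sum y_i = \hat{\beta}_1 \left( n \sum x_i^2 - \left(\sum x_i\right)^2 \right)$$
✔ Then divide through by the quantity in parentheses on the right-hand side to yield:
$$\hat{\beta}_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$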
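As a rough check on this algebra, the raw-sum formula for the slope can be computed directly and substituted back into both normal equations. The data and the function name `slope_from_raw_sums` below are illustrative assumptions, not part of the lecture.

```python
# Hypothetical illustration data (not from the lecture).
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]
n = len(x)


def slope_from_raw_sums(x, y):
    """Slope estimate: (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - Sum(x)^2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


b1_hat = slope_from_raw_sums(x, y)
# Intercept from the first normal equation: Sum(y) = n*b0 + b1*Sum(x).
b0_hat = (sum(y) - b1_hat * sum(x)) / n

# Both normal equations should hold (up to rounding) at the estimates.
normal_eq_1 = sum(yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y))
normal_eq_2 = sum(xi * (yi - b0_hat - b1_hat * xi) for xi, yi in zip(x, y))

print(b0_hat, b1_hat)
print(normal_eq_1, normal_eq_2)
```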
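A Solution for β1
✔ Now multiply numerator & denominator by 1/n:
$$\hat{\beta}_1 = \frac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}$$
✔ Recall that:
$$\frac{\sum y_i}{n} = \bar{y} \quad \text{...and...} \quad \frac{\sum x_i}{n} = \bar{x}$$
✔ Tricky: multiplying by 1/n is the same as multiplying by 1/n * n/n = n/n², so $\dfrac{\sum x_i \sum y_i}{n} = n \cdot \dfrac{\sum x_i}{n} \cdot \dfrac{\sum y_i}{n} = n(\bar{x})(\bar{y})$.
✔ Why? I need to multiply three separate numbers, so I can't simply split the 1/n. Imagine I wanted to multiply ½(5*10) = 25. I can't solve it by multiplying ½(5) * ½(10), which equals 12.5. That is like multiplying by ¼. I need to multiply by ½ * 1 = ½ * 2/2 = 2/4. Now, 2(½*5)(½*10) = 25.
$$\hat{\beta}_1 = \frac{\sum x_i y_i - n(\bar{x})(\bar{y})}{\sum x_i^2 - n(\bar{x})(\bar{x})} = \frac{\sum x_i y_i - n(\bar{x} \cdot \bar{y})}{\sum x_i^2 - n(\bar{x})^2}$$
✔ By summation properties 7 and 8, this yields:
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$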
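The "tricky" step and the equivalence of the two slope formulas can be confirmed with a few lines of arithmetic. This is only a sketch on made-up numbers; the variable names are assumptions chosen for illustration.

```python
# Hypothetical illustration data (not from the lecture).
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# The "tricky" step: Sum(x)*Sum(y)/n equals n * x_bar * y_bar,
# not x_bar * y_bar (which would amount to dividing by n twice).
term_before = sum(x) * sum(y) / n
term_after = n * x_bar * y_bar
wrong_split = x_bar * y_bar

# Slope from raw sums.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)
slope_raw = (n * sum_xy - sum(x) * sum(y)) / (n * sum_xx - sum(x) ** 2)

# Slope from deviations about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope_dev = s_xy / s_xx

print(term_before, term_after, wrong_split)
print(slope_raw, slope_dev)
```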
Now Solving for β0
✔ Take the first normal equation:
$$\sum y_i = n \hat{\beta}_0 + \hat{\beta}_1 \left(\sum x_i\right)$$
✔ Then divide both sides by n and rearrange to yield:
$$\hat{\beta}_0 = \frac{\sum y_i}{n} - \hat{\beta}_1 \frac{\sum x_i}{n}$$
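A Solution for β0
✔ Now recall again that:
$$\frac{\sum y_i}{n} = \bar{y} \quad \text{...and...} \quad \frac{\sum x_i}{n} = \bar{x}$$
✔ Thus:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$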
But What Does It Mean?
✔ Equation for β1 may not seem to make
intuitive sense at first
✔ But if we break it down into pieces, we can
begin to see the logic
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
Understanding What Makes β1
✔ Numerator for β1 is made up of TWO parts
– Deviations of X from its mean
– Deviations of Y from its mean
– Then we multiply those deviations
– And sum them up across all observations
✔ We know this as the covariance (strictly, n − 1 times the sample covariance of x and y).
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
Understanding What Makes β1
✔ Denominator of β1 is made up of the
deviation of x from its mean times itself
✔ We square this term.
✔ And sum up across all observations
✔ We know this as the variance in the independent variable (strictly, n − 1 times the sample variance of x).
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
Understanding What Makes β1
✔ Thus β1 is made up of changes in x times changes in
y, divided by changes in x squared
– A.K.A. “rise over run”
✔ Notice if the changes in x are EQUAL to the changes
in y, then β1 = 1
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
Understanding What Makes β1
✔ If the changes in y are LARGER than the
changes in x, then β1 > 1
– I.E. a 1 unit change in x creates more than a 1
unit change in y
✔ If the changes in y are SMALLER than the
changes in x, then β1 < 1
– I.E. a 1 unit change in x creates less than a 1
unit change in y
Understanding What Makes β1
✔ This corresponds to our intuitive
understanding of the slope of a line
– How much change in y do we observe for each
change in x?
✔ We can also see how β1 is calculated in
units of the dependent variable per unit of the
independent variable.
– It is changes in the dependent variable over
changes in the independent variable
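A small sketch can make the "rise over run" reading concrete. The data, the helper `slope`, and the choice of metres versus centimetres are all illustrative assumptions: rescaling x by 100 divides the slope by 100, and when y moves one-for-one with x the slope is 1.

```python
def slope(x, y):
    """Slope estimate from deviations about the means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical data: x measured in metres, y in arbitrary units.
x_m = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]

# The same x values expressed in centimetres.
x_cm = [100.0 * v for v in x_m]

slope_per_m = slope(x_m, y)    # units of y per metre
slope_per_cm = slope(x_cm, y)  # units of y per centimetre

# When changes in y equal changes in x, the slope is 1.
y_one_for_one = [v + 5.0 for v in x_m]
slope_equal_changes = slope(x_m, y_one_for_one)

print(slope_per_m, slope_per_cm, slope_equal_changes)
```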
Let’s Do An Example!
y x
8 2
2 0
5 1
26 8
14 4
17 5
26 8
Calculating β0 & β1
✔ Mean of x is 4
✔ Mean of y is 14
Y X Y - mean X - mean
8 2 -6 -2
2 0 -12 -4
5 1 -9 -3
26 8 12 4
14 4 0 0
17 5 3 1
26 8 12 4
Calculating β0 & β1
$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 186$$
$$\sum (x_i - \bar{x})^2 = 62$$
$$\hat{\beta}_1 = \frac{186}{62} = 3$$
y x y - mean x - mean (x - mean)(y - mean) (x - mean)²
8 2 -6 -2 12 4
2 0 -12 -4 48 16
5 1 -9 -3 27 9
26 8 12 4 48 16
14 4 0 0 0 0
17 5 3 1 3 1
26 8 12 4 48 16
Calculating β0 and β1
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
✔ β0 = mean of y - β1 (mean of x)
✔ Recall that:
– mean of y = 14 & mean of x = 4
✔ β0 = 14 - 3(4)
✔ β0 = 2
✔ Our equation is: ŷ = 2 + 3x
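The worked example can be reproduced in a few lines. The sketch below uses the seven (x, y) pairs from the example table; only the variable names are assumptions.

```python
# Data from the lecture's worked example.
x = [2, 0, 1, 8, 4, 5, 8]
y = [8, 2, 5, 26, 14, 17, 26]
n = len(x)

x_bar = sum(x) / n  # mean of x
y_bar = sum(y) / n  # mean of y

# Sum of cross-products of deviations and sum of squared x deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1_hat = s_xy / s_xx               # slope
b0_hat = y_bar - b1_hat * x_bar    # intercept

print(x_bar, y_bar)    # 4.0 14.0
print(s_xy, s_xx)      # 186.0 62.0
print(b0_hat, b1_hat)  # 2.0 3.0
```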
Which Looks Like…This!
[Figure: “Regression of y on x”, plotting y (axis from 0 to 30) against x (axis from 0 to 9).]
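Calculating R²
✔ Let’s return to SSR:
$$\sum \hat{u}_i^2 = \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 = SSR$$
✔ Plug in $\hat{\beta}_0$ and $\hat{\beta}_1$ and solve to get:
$$\sum (y_i - \bar{y})^2 = \hat{\beta}_1^2 \sum (x_i - \bar{x})^2 + \sum \hat{u}_i^2$$
$$SST = SSE + SSR$$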
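The decomposition is easier to appreciate on data that do not fit perfectly. The following is a minimal sketch on made-up numbers (the data and variable names are assumptions, not from the lecture), checking that SST equals SSE plus SSR and that SSE equals the squared slope times the sum of squared x deviations.

```python
# Hypothetical illustration data with an imperfect linear fit.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 10.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1_hat = s_xy / s_xx
b0_hat = y_bar - b1_hat * x_bar

fitted = [b0_hat + b1_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total
sse = sum((fi - y_bar) ** 2 for fi in fitted)           # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual

print(sst, sse, ssr)
print(sse + ssr)             # should match sst up to rounding
print(b1_hat ** 2 * s_xx)    # should match sse up to rounding
```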
Calculating R²
✔ R² = SSE/SST
y - mean (y - mean)²
-6 36
-12 144
-9 81
12 144
0 0
3 9
12 144
Sum 558
$$R^2 = \frac{\hat{\beta}_1^2 \sum (x_i - \bar{x})^2}{\sum (y_i - \bar{y})^2} = \frac{9(62)}{558} = \frac{558}{558} = 1$$
✔ Our model perfectly explains variation in y.
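The R² calculation for the example can be sketched as follows, using the lecture's data and the estimates β0 = 2 and β1 = 3 found earlier. The variable names are assumptions; both SSE/SST and 1 − SSR/SST are computed to show that they agree.

```python
# Data and estimates from the lecture's worked example.
x = [2, 0, 1, 8, 4, 5, 8]
y = [8, 2, 5, 26, 14, 17, 26]
b0_hat, b1_hat = 2.0, 3.0

n = len(y)
y_bar = sum(y) / n
fitted = [b0_hat + b1_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # 558.0
sse = sum((fi - y_bar) ** 2 for fi in fitted)           # 558.0
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # 0.0

r_squared = sse / sst
r_squared_alt = 1 - ssr / sst

print(sst, sse, ssr)
print(r_squared, r_squared_alt)  # 1.0 1.0
```

Every fitted value equals the observed value here, so the residuals are all zero and R² is exactly 1.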