to Statistical Learning
LINEAR REGRESSION
Chapter 03
Outline
The Linear Regression Model
Least Squares Fit
Measures of Fit
Inference in Regression
Other Considerations in Regression Model
Qualitative Predictors
Interaction Terms
Potential Fit Problems
Linear vs. KNN Regression
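$$Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
The parameters of the linear regression model are easy to interpret.
$\beta_0$ is the intercept (i.e. the average value for Y if all the X's are zero).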
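We estimate the parameters using least squares, i.e. we choose the estimates that minimize
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_1 - \cdots - \hat{\beta}_p X_p\right)^2$$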
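[Figure: the least squares line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \cdots + \hat{\beta}_p X_p$ fitted to data from the model $Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$]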
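A minimal sketch of a least squares fit may help make the minimization concrete. It solves the normal equations with plain Gauss-Jordan elimination in standard-library Python. The function names `fit_least_squares` and `mse` and the toy data are assumptions made for illustration and do not come from the slides.

```python
def fit_least_squares(X, y):
    """Return [b0, b1, ..., bp] that minimize the MSE (normal equations)."""
    n = len(y)
    # Design matrix with a leading column of ones for the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    k = len(A[0])
    # Augmented system (A^T A | A^T y).
    M = [
        [sum(A[i][r] * A[i][c] for i in range(n)) for c in range(k)]
        + [sum(A[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[r][k] / M[r][r] for r in range(k)]


def mse(X, y, beta):
    """Mean squared error of the fitted values for coefficients beta."""
    preds = [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


# Hypothetical toy data generated exactly from Y = 1 + 2*X1 + 3*X2 (no noise).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

beta = fit_least_squares(X, y)
print([round(b, 6) for b in beta])  # estimates of b0, b1, b2
print(mse(X, y, beta))              # essentially zero for noise-free data
```

Because the toy data contain no noise, the estimates should recover the generating coefficients up to floating-point error. With real data a numerically safer solver would usually be preferred over hand-written elimination.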
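Measures of Fit: $R^2$
Some of the variation in Y can be explained by variation in X; $R^2$ is the fraction of the variation in Y that the regression explains.
$$R^2 = 1 - \frac{\mathrm{RSS}}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2} \approx 1 - \frac{\text{Ending Variance}}{\text{Starting Variance}}$$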
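The formula can be sketched in a few lines of Python. The function name `r_squared` and the four data points are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - RSS / TSS for observed y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # "ending" variation
    tss = sum((yi - y_bar) ** 2 for yi in y)               # "starting" variation
    return 1 - rss / tss


# Hypothetical observations and fitted values.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]

# Here RSS = 1.0 and TSS = 5.0, so about 80% of the variation is explained.
print(r_squared(y, y_hat))
```

Predicting the mean of Y for every observation gives RSS equal to TSS and therefore an $R^2$ of zero, while a perfect fit gives an $R^2$ of one.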
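Inference in Regression
[Figure: scatter plot of Y against X with the estimated (least squares) line]
To test whether predictor $X_j$ is related to Y, test
$$H_0: \beta_j = 0 \quad \text{vs} \quad H_a: \beta_j \neq 0$$
using the t statistic
$$t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$
which is the number of standard deviations $\hat{\beta}_j$ is away from zero.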
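As a rough illustration of the test, the sketch below computes the t statistic and an approximate two-sided p-value. It assumes the sample is large enough for the t distribution to be approximated by a standard normal, which is a simplification and not what regression software reports exactly. The function names and the numbers are hypothetical.

```python
import math


def t_statistic(beta_hat, std_err):
    """Number of standard errors the estimate lies away from zero."""
    return beta_hat / std_err


def approx_two_sided_p(t):
    """Two-sided p-value under a standard normal approximation.

    Assumption: many residual degrees of freedom, so the t distribution
    is close to the normal. Exact values require the t distribution.
    """
    return math.erfc(abs(t) / math.sqrt(2))


# Hypothetical estimate and standard error.
t = t_statistic(0.5, 0.2)
p = approx_two_sided_p(t)
print(t, p)
```

A large absolute t value and a small p-value suggest that the coefficient differs from zero. A t value near zero gives a p-value near one.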
Std Err     t-value    p-value
0.4578      15.3603    0.0000
0.0027      17.6676    0.0000
The Std Err column gives $SE(\hat{\beta}_j)$ and the p-value column gives the P-value of each t test.
Std Err     t-value    p-value
0.3119       9.4223    0.0000
0.0014      32.8086    0.0000
0.0086      21.8935    0.0000
0.0059      -0.1767    0.8599

Std Err     t-value    p-value
0.6214      19.8761    0.0000
0.0166       3.2996    0.0011
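Test of the hypothesis that all slopes are zero ($\beta_1 = \beta_2 = \cdots = \beta_p = 0$), using the F statistic:
              df     SS           MS           F           p-value
Regression    2      4860.2347    2430.1174    859.6177    0.0000
Residual      197    556.9140     2.8270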
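The arithmetic behind the F statistic can be checked directly from the sums of squares and degrees of freedom above. This sketch assumes the row with df = 2 is the regression row and the row with df = 197 is the residual row. The function name `f_statistic` is hypothetical.

```python
def f_statistic(ss_reg, df_reg, ss_res, df_res):
    """Return (MS regression, MS residual, F) from an ANOVA table."""
    ms_reg = ss_reg / df_reg  # mean square explained by the regression
    ms_res = ss_res / df_res  # mean square left in the residuals
    return ms_reg, ms_res, ms_reg / ms_res


# SS and df values copied from the table above.
ms_reg, ms_res, f = f_statistic(4860.2347, 2, 556.9140, 197)
print(ms_reg, ms_res, f)
```

The computed values agree with the MS and F columns up to rounding. An F statistic this large indicates that at least one slope is very unlikely to be zero.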
Qualitative Predictors
How do you stick men and women (category listings) into a regression equation? Code them as indicator (dummy) variables.
Interpretation
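Suppose we want to include income and gender.
Two genders (male and female). Let
$$\text{Gender}_i = \begin{cases} 0 & \text{if male} \\ 1 & \text{if female} \end{cases}$$
Then the regression equation is
$$Y_i \approx \beta_0 + \beta_1 \text{Income}_i + \beta_2 \text{Gender}_i = \begin{cases} \beta_0 + \beta_1 \text{Income}_i & \text{if male} \\ \beta_0 + \beta_1 \text{Income}_i + \beta_2 & \text{if female} \end{cases}$$
$\beta_2$ is the average extra balance each month that females have compared with males at the same income level.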
Regression coefficients
                 Coefficient    Std Err    t-value    p-value
Constant         233.7663       39.5322    5.9133     0.0000
Income           0.0061         0.0006     10.4372    0.0000
Gender_Female    24.3108        40.8470    0.5952     0.5521
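To see how the dummy variable enters a prediction, the sketch below plugs the fitted coefficients from the table into the regression equation. The function names and the income value are hypothetical, and the unit of income is an assumption because the slides do not state it.

```python
# Coefficients copied from the regression table above.
B0 = 233.7663        # Constant
B_INCOME = 0.0061    # Income
B_FEMALE = 24.3108   # Gender_Female


def encode_gender(gender):
    """Dummy coding used in the slides: 0 if male, 1 if female."""
    return {"male": 0, "female": 1}[gender]


def predicted_balance(income, gender):
    """Fitted balance for a given income and gender."""
    return B0 + B_INCOME * income + B_FEMALE * encode_gender(gender)


income = 50000  # hypothetical income value
diff = predicted_balance(income, "female") - predicted_balance(income, "male")
print(diff)  # equals the Gender_Female coefficient at any income level
```

The difference between the two predictions does not depend on income, which is what the dummy coding implies. The p-value of 0.5521 in the table indicates that this estimated difference is not statistically distinguishable from zero.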
Interaction
When the effect on Y of increasing X1 depends on the value of
another variable X2.
Example:
Maybe the effect on Salary (Y) of increasing Position (X1)
depends on gender (X2)?
For example, maybe male salaries go up faster (or slower)
than female salaries as employees get promoted.
Advertising example:
TV and radio advertising both increase sales.
Perhaps spending money on both of them may increase
sales more than spending the same amount on one alone?
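Interaction in advertising
$$\text{Sales} = \beta_0 + \beta_1\,\text{TV} + \beta_2\,\text{Radio} + \beta_3\,(\text{TV} \times \text{Radio})$$
The last term, $\beta_3\,(\text{TV} \times \text{Radio})$, is the interaction term. Grouping the terms in TV:
$$\text{Sales} = \beta_0 + (\beta_1 + \beta_3\,\text{Radio})\,\text{TV} + \beta_2\,\text{Radio}$$
Spending one extra dollar on TV increases average sales by 0.0191 + 0.0011 × Radio.
Grouping the terms in Radio instead:
$$\text{Sales} = \beta_0 + (\beta_2 + \beta_3\,\text{TV})\,\text{Radio} + \beta_1\,\text{TV}$$
Spending one extra dollar on Radio increases average sales by 0.0289 + 0.0011 × TV.

Term         Estimate
Intercept    6.7502202
TV           0.0191011
Radio        0.0288603
TV*Radio     0.0010865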
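The effect of the interaction term can be explored numerically. The sketch below assumes the four estimates belong, in order, to the intercept, TV, Radio and the TV × Radio term, which is inferred from the quoted effect 0.0191 + 0.0011 × Radio. The function names and the budget values are hypothetical.

```python
# Estimates from the slide; the assignment to terms is an assumption.
B0 = 6.7502202
B_TV = 0.0191011
B_RADIO = 0.0288603
B_TV_RADIO = 0.0010865


def predicted_sales(tv, radio):
    """Fitted sales for given TV and radio budgets."""
    return B0 + B_TV * tv + B_RADIO * radio + B_TV_RADIO * tv * radio


def tv_effect(radio):
    """Change in fitted sales per extra unit of TV at a given radio budget."""
    return B_TV + B_TV_RADIO * radio


def radio_effect(tv):
    """Change in fitted sales per extra unit of radio at a given TV budget."""
    return B_RADIO + B_TV_RADIO * tv


# Hypothetical budgets: the TV effect grows as the radio budget grows.
for radio in (0, 20, 40):
    print(radio, tv_effect(radio))
```

Increasing TV by one unit at a fixed radio budget changes the prediction by exactly `tv_effect(radio)`, so the slope in TV is no longer a single number but depends on Radio.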
Expanded Estimates
Term              Estimate     Std Error    t Ratio    Prob>|t|
Intercept         112.77039    1.454773     77.52      <.0001
Gender[female]    1.8600957    0.527424     3.53       0.0005
Gender[male]      -1.860096    0.527424     -3.53      0.0005
Position          6.0553559    0.280318     21.60      <.0001
[Figure: regression equation, fitted lines for men and women plotted against Position; different intercepts, same slopes]
Interaction Effects
Our model has forced the line for men and the line for
women to be parallel.
Parallel lines say that promotions have the same salary
benefit for men as for women.
Expanded Estimates
Term                       Estimate     Std Error    t Ratio    Prob>|t|
Intercept                  112.63081    1.484825     75.85      <.0001
Gender[female]             1.1792165    1.484825     0.79       0.4280
Gender[male]               -1.179216    1.484825     -0.79      0.4280
Position                   6.1021378    0.296554     20.58      <.0001
Gender[female]*Position    0.1455111    0.296554     0.49       0.6242
Gender[male]*Position      -0.145511    0.296554     -0.49      0.6242
[Figure: fitted lines for men and women plotted against Position (0 to 10)]
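The estimates with the interaction term imply a separate line for each gender. The sketch below works out both lines under two assumptions that the slides do not state explicitly: gender is effect coded (+1 for female, -1 for male, as the mirrored Gender[female] and Gender[male] estimates suggest), and the interaction is the plain, uncentered product of that code with Position. The names are hypothetical.

```python
# Estimates copied from the Expanded Estimates table with the interaction.
INTERCEPT = 112.63081
GENDER_FEMALE = 1.1792165      # Gender[male] is the negative of this value
POSITION = 6.1021378
FEMALE_X_POSITION = 0.1455111  # Gender[male]*Position is the negative


def gender_code(gender):
    """Assumed effect coding: +1 for female, -1 for male."""
    return {"female": 1, "male": -1}[gender]


def fitted_line(gender):
    """Return (intercept, slope) of the fitted line for one gender."""
    g = gender_code(gender)
    intercept = INTERCEPT + GENDER_FEMALE * g
    slope = POSITION + FEMALE_X_POSITION * g
    return intercept, slope


for gender in ("female", "male"):
    intercept, slope = fitted_line(gender)
    print(gender, round(intercept, 4), round(slope, 4))
```

Under these assumptions the two slopes differ by twice the interaction estimate. The p-value of 0.6242 on the interaction term gives little evidence that the lines are anything other than parallel.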
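KNN Regression
KNN regression is similar to the KNN classifier.
To predict Y for a given value of X, consider the K closest
points in the training data and take the average of their responses:
$$\hat{f}(x) = \frac{1}{K}\sum_{x_i \in N_i} y_i$$
where $N_i$ denotes the set of the K training points closest to x.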
regression.
Is that better?
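The KNN prediction rule, averaging the responses of the K nearest training points, can be sketched for a single predictor as follows. The function name `knn_predict` and the training data are hypothetical, and absolute distance on one variable is an assumption made to keep the example short.

```python
def knn_predict(x0, xs, ys, k):
    """Average the responses of the k training points closest to x0."""
    # Indices of the training points sorted by distance to x0.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


# Hypothetical one-dimensional training data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

for k in (1, 3, 5):
    print(k, knn_predict(2.9, xs, ys, k))
```

With k = 1 the prediction copies the response of the single nearest point, and with k equal to the sample size it reduces to the overall mean. Small k gives a flexible but noisy fit, while large k gives a smoother but potentially biased one.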