9.6
Step 1. The stepwise regression routine first fits a simple linear regression model for each of the P - 1 potential X variables. For each simple linear regression model, the t* statistic (2.17) for testing whether or not the slope is zero is obtained:

t*_k = b_k / s{b_k}
The X variable with the largest t* value is the candidate for first addition. If this t* value exceeds a predetermined level, or if the corresponding P-value is less than a predetermined level, the X variable is added. Otherwise, the program terminates with no X variable considered sufficiently helpful to enter the regression model. Since the degrees of freedom associated with MSE vary depending on the number of X variables in the model, and since repeated tests on the same data are undertaken, fixed t* limits for adding or deleting a variable have no precise probabilistic meaning. For this reason, software programs often favor the use of predetermined alpha-limits.
Step 2. Assume X7 is the variable entered at step 1. The stepwise regression routine now fits all regression models with two X variables, where X7 is one of the pair. For each such regression model, the t* test statistic corresponding to the newly added predictor Xk is obtained. This is the statistic for testing whether or not beta_k = 0 when X7 and Xk are the variables in the model. The X variable with the largest t* value (equivalently, the smallest P-value) is the candidate for addition at the second stage. If this t* value exceeds a predetermined level (i.e., the P-value falls below a predetermined level), the second X variable is added. Otherwise, the program terminates.
Step 3. Suppose X3 is added at the second stage. Now the stepwise regression routine examines whether any of the other X variables already in the model should be dropped. For our illustration, there is at this stage only one other X variable in the model, X7, so only one t* test statistic is obtained. At later stages there would be a number of these t* statistics, one for each of the variables in the model besides the one last added. The variable for which this t* value is smallest (equivalently, the variable for which the P-value is largest) is the candidate for deletion. If this t* value falls below a predetermined limit (or the P-value exceeds a predetermined limit), the variable is dropped from the model; otherwise, it is retained.
Step 4. Suppose X7 is retained so that both X3 and X7 are now in the model. The stepwise regression
routine now examines which X variable is the next candidate for addition, then examines whether any
of the variables already in the model should now be dropped, and so on until no further X variables can
either be added or deleted, at which point the search terminates.
Note that the stepwise regression algorithm allows an X variable, brought into the model at an earlier
stage, to be dropped subsequently if it is no longer helpful in conjunction with variables added at later
stages.
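For reference, a forward stepwise search of this kind can be run directly in SAS with PROC REG. The sketch below is only an illustration: the data set name mydata, the predictor list, and the 0.10 entry/stay limits are assumptions, not values given in the problem.

* Sketch: forward stepwise search with assumed alpha-to-enter and
  alpha-to-stay limits of 0.10;
proc reg data=mydata;
  model Y = X1 X2 X3 X4 X5 X6 X7 / selection=stepwise slentry=0.10 slstay=0.10;
run;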
Forward Selection. The forward selection search procedure is a simplified version of forward stepwise regression, omitting the test of whether a variable, once entered into the model, should later be dropped.
Backward Elimination. The backward elimination search procedure is the opposite of forward
selection. It begins with the model containing all potential X variables and identifies the one with the
largest P-value. If the maximum P-value is greater than a predetermined limit, that X variable is
dropped. The model with the remaining P - 2 X variables is then fitted, and the next candidate for
dropping is identified. This process continues until no further X variables can be dropped.
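As a further illustration under the same assumptions as above, the forward selection and backward elimination searches differ from the stepwise routine only in the selection= option:

* Forward selection: variables are only added, never re-examined;
proc reg data=mydata;
  model Y = X1 X2 X3 X4 X5 X6 X7 / selection=forward slentry=0.10;
run;

* Backward elimination: start with all potential X variables and drop
  the one with the largest P-value until none exceeds the stay limit;
proc reg data=mydata;
  model Y = X1 X2 X3 X4 X5 X6 X7 / selection=backward slstay=0.10;
run;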
9.9 a>
Number in   R-Square   Adjusted    C(p)      AIC        Variables
Model                  R-Square                         in Model
    1       0.6190     0.6103       8.3536   220.5294   X1
    1       0.4155     0.4022      35.2456   240.2137   X3
    1       0.3635     0.3491      42.1123   244.1312   X2
    2       0.6761     0.6610       2.8072   215.0607   X1 X3
    2       0.6550     0.6389       5.5997   217.9676   X1 X2
    2       0.4685     0.4437      30.2471   237.8450   X2 X3
    3       0.6822     0.6595       4.0000   216.1850   X1 X2 X3

Obs   VarsInModel   Press
 1    X1            5569.56
 2    X2            9254.49
 3    X3            8451.43
 4    X1 X2         5235.19
 5    X1 X3         4902.75
 6    X2 X3         8115.91
 7    X1 X2 X3      5057.89
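The two tables above could be produced with calls along the following lines; the exact options used are not recorded in the code section at the end of this write-up, so the selection options, the outest= data set name, and the use of the press option are assumptions.

* All-subsets summary: R-square, adjusted R-square, C(p), and AIC;
* the press option writes a _PRESS_ value for each fitted subset to
  the outest= data set (assumed mechanism for the PRESS table);
proc reg data=patient outest=subsets press;
  model Y = X1 X2 X3 / selection=rsquare adjrsq cp aic;
run;

* inspect the saved estimates and the _PRESS_ value for each subset;
proc print data=subsets;
run;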
b> As seen in part a>, the answer is yes: all of the criteria point to the same subset, (X1, X3). Unfortunately, as discussed in class, this is a rather rare thing to happen.
c> Forward stepwise regression is mainly advantageous when the pool of candidate covariates is large; in our case the pool is small (only three X variables), so it offers no real advantage here.
9.10
Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4       8718.02248    2179.50562    129.74   <.0001
Error             20        335.97752      16.79888
Corrected Total   24       9054.00000

Root MSE           4.09864   R-Square   0.9629
Dependent Mean    92.20000   Adj R-Sq   0.9555
Coeff Var          4.44538
Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1           -124.38182          9.94106    -12.51    <.0001
X1           1              0.29573          0.04397      6.73    <.0001
X2           1              0.04829          0.05662      0.85    0.4038
X3           1              1.30601          0.16409      7.96    <.0001
X4           1              0.51982          0.13194      3.94    0.0008
Correlation matrix (p-value shown beneath each correlation):

              proficiency        X1        X2        X3        X4
proficiency       1.00000   0.51441   0.49701   0.89706   0.86939
                             0.0085    0.0115    <.0001    <.0001
X1                0.51441   1.00000   0.10227   0.18077   0.32666
                   0.0085              0.6267    0.3872    0.1110
X2                0.49701   0.10227   1.00000   0.51904   0.39671
                   0.0115    0.6267              0.0078    0.0496
X3                0.89706   0.18077   0.51904   1.00000   0.78204
                   <.0001    0.3872    0.0078              <.0001
X4                0.86939   0.32666   0.39671   0.78204   1.00000
                   <.0001    0.1110    0.0496    <.0001
The F value tells us that at least some of the predictors should be kept. According to its P-value (0.4038), X2 should be dropped. The correlation matrix shows that X2 and X3 are moderately related (r = 0.519), while X3 and X4 are highly correlated (r = 0.782), so the X3, X4 correlation needs to be considered thereafter.
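A minimal sketch of the calls that would produce the output above (only the data step for this problem appears in the code section below, so the proc statements here are assumptions):

* Full first-order model for the job proficiency data (9.10);
proc reg data=hw9_910;
  model proficiency = X1 X2 X3 X4;
run;

* Pairwise correlations among the response and the four predictors;
proc corr data=hw9_910;
  var proficiency X1 X2 X3 X4;
run;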
9.11 a>
Number in   R-Square   Adjusted    C(p)      AIC       SBC        Variables
Model                  R-Square                                   in Model
    1       0.8047     0.7962                                     X3
    1       0.7558                                                X4
    1       0.2646                                                X1
    1       0.2470                                                X2
    2       0.9330     0.9269      17.1130   85.7272   89.38384   X1 X3
    2       0.8773     0.8661                                     X3 X4
    2       0.8153     0.7985                                     X1 X4
    2       0.8061     0.7884                                     X2 X3
    2       0.7833     0.7636                                     X2 X4
    2       0.4642                                                X1 X2
    3       0.9615     0.9560       3.7274   73.8473   78.72282   X1 X3 X4
    3       0.9341     0.9247      18.5215   87.3143   92.18984   X1 X2 X3
    3       0.8790     0.8617                                     X2 X3 X4
    3       0.8454     0.8233                                     X1 X2 X4
    4       0.9629     0.9555       5.0000   74.9542   81.04859   X1 X2 X3 X4
According to the table above, the best four subsets are (X1, X3), (X1, X3, X4), (X1, X2, X3), and (X1, X2, X3, X4), because they have the largest adjusted R-square values.
b> Yes, these additional criteria are useful: as the table above shows, C(p), AIC, and SBC are all smallest for the subset (X1, X3, X4), so they help single out one model among the leading candidates.
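The subset table in part a> corresponds to an all-possible-subsets run of roughly the following form (the option list is an assumption):

* All subsets with adjusted R-square, C(p), AIC, and SBC (9.11 a);
proc reg data=hw9_910;
  model proficiency = X1 X2 X3 X4 / selection=rsquare adjrsq cp aic sbc;
run;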
9.18 a>
Summary of Stepwise Selection

Step   Variable   Variable   Number    Partial    Model      C(p)     F       Pr > F
       Entered    Removed    Vars In   R-Square   R-Square            Value
  1    X3                       1                                             <.0001
  2    X1                       2                                             <.0001
  3    X4                       3       0.0285     0.9615    3.7274   15.59   0.0007
b> We can see from 9.11 a> that the subset (X1, X3, X4) has the largest adjusted R-square, so the stepwise result agrees with that criterion.
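The stepwise summary in part a> corresponds to a call of roughly the following form; the entry and stay limits shown are assumptions, since the limits actually used are not recorded here.

* Forward stepwise selection for the job proficiency data (9.18 a);
proc reg data=hw9_910;
  model proficiency = X1 X2 X3 X4 / selection=stepwise slentry=0.05 slstay=0.10;
run;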
Code:
options ls = 80 nodate;
title 'problem 9.9';
data patient;
infile 'C:\Reg\hw9\9.9.txt';
input Y X1 X2 X3;
run;

options ls = 80 nodate;
title 'problem 9.10';
data hw9_910;
infile 'C:\Reg\hw9\9.10.txt';
input proficiency X1 X2 X3 X4;
run;