regression
Statement of problem
A common problem is that there is a large set
Stepwise regression:
Preliminary steps
1. Specify an Alpha-to-Enter significance level (αE = 0.05).
2. Specify an Alpha-to-Remove significance level (αR = 0.05).
Stepwise regression:
Stopping the procedure
The procedure is stopped when adding an
Drawbacks of stepwise
regression
The final model is not guaranteed to be
Forward selection
Step 1 -- the first predictor in the model is the best single predictor
Select the predictor with the numerically largest simple
correlation with the dependent variable
ry,x1 vs. ry,x2 vs. ry,x3 vs. ry,x4
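Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pearson_r` and `best_single_predictor` are hypothetical, the data are made up (not from the slides), and "numerically largest" is read here as largest in absolute value.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_single_predictor(predictors, y):
    """Return (name, r) for the predictor with the largest |r| with y."""
    correlations = {name: pearson_r(x, y) for name, x in predictors.items()}
    # Assumption: compare correlations by absolute value, ignoring sign.
    best = max(correlations, key=lambda name: abs(correlations[name]))
    return best, correlations[best]

# Made-up illustration data (not from the slides).
y = [2.0, 4.1, 5.9, 8.2, 9.8]
predictors = {
    "x1": [1.0, 3.0, 2.0, 5.0, 4.0],
    "x2": [5.0, 3.0, 4.0, 1.0, 2.0],
    "x3": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x4": [2.0, 2.0, 3.0, 3.0, 2.0],
}
best_name, best_r = best_single_predictor(predictors, y)
print(best_name, round(best_r, 3))
```

With these made-up numbers x3 enters first, which matches the notation used in the following steps. The sketch leaves out the significance test on the winning correlation.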
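Step 2 -- the next predictor in the model is the one that will
contribute the most -- with two equivalent definitions
1. The 2-predictor model (including the first predictor) with the
numerically largest R² -- if the R² is significant and
significantly larger than the r² from the first step
R²y.x3,x1 vs. R²y.x3,x2 vs. R²y.x3,x4
2. The predictor with the highest semi-partial correlation with the
dependent variable, controlling that predictor for the predictor
already in the model -- if the semi-partial is significant
ry(x1.x3) vs. ry(x2.x3) vs. ry(x4.x3)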
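The two definitions pick the same predictor because the squared semi-partial correlation of a candidate is exactly the gain in R² from adding it. Assuming, as in the notation above, that x3 entered at step 1, the relation for candidate x1 is:

```latex
r_{y(x_1 . x_3)}^{2} \;=\; R^{2}_{y.x_3,x_1} \;-\; r^{2}_{y,x_3}
```

Since r²y,x3 is the same for every candidate, ranking the candidates by R² of the 2-predictor model gives the same order as ranking them by squared semi-partial correlation.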
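All subsequent steps -- the next predictor in the model is the one
that will contribute the most -- with two equivalent definitions
1. The 3-predictor model (including the first two predictors) with the
numerically largest R² -- if the R² is significant and
significantly larger than the R² from the previous step
R²y.x3,x2,x1 vs. R²y.x3,x2,x4
2. The predictor with the highest semi-partial correlation with the
dependent variable, controlling that predictor for the predictors
already in the model -- if the semi-partial is significant
ry(x1.x3,x2) vs. ry(x4.x3,x2)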
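The forward procedure above can be sketched as a loop that adds, at each step, the predictor giving the largest R². This is a minimal sketch, not a definitive implementation: all function names are hypothetical, the data are made up, and the significance tests are replaced by a simple minimum R² gain (`min_gain`), which is an assumption made to keep the example free of statistical libraries.

```python
def solve_linear(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    coef = solve_linear(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

def forward_selection(predictors, y, min_gain=0.01):
    """Add predictors one at a time, always taking the largest R^2.

    Assumption: min_gain stands in for the significance test on the
    R^2 increase described in the slides.
    """
    selected, current_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        trial = {name: r_squared([predictors[s] for s in selected + [name]], y)
                 for name in remaining}
        best = max(trial, key=trial.get)
        if trial[best] - current_r2 < min_gain:
            break  # no remaining predictor contributes enough
        selected.append(best)
        remaining.remove(best)
        current_r2 = trial[best]
    return selected, current_r2

# Made-up data: y is roughly 2*x3 + x1, while x2 and x4 are unrelated.
predictors = {
    "x1": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "x2": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
    "x3": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x4": [1.0, 4.0, 1.0, 4.0, 2.0, 1.0, 3.0, 5.0],
}
y = [5.1, 4.8, 10.1, 9.0, 14.9, 21.2, 15.9, 22.0]
selected, final_r2 = forward_selection(predictors, y)
print(selected, round(final_r2, 4))
```

A real analysis would replace the `min_gain` rule with the F test (or equivalent t test) on the R² change at the chosen Alpha-to-Enter level.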
Backward selection
Step 1 -- start with the full model (all predictors) -- if the R is
significant. Consider the regression weights of this model.
Step 2 -- remove from the model that predictor that contributes
the least
Delete that predictor with the largest p-value associated
with its regression (b) weight -- if that p-value is greater
than .05. (The idea is that the predictor with the largest p-value
is the one most likely not to be contributing to the
model in the population)
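The removal rule can also be read in terms of R². For a full model with k predictors fitted to n observations, the t statistic of predictor j's regression weight is related to the drop in R² when that predictor alone is removed (the symbols n, k, and the subscripts "full" and "(-j)" are notation introduced here for illustration):

```latex
t_j^{2} \;=\; \frac{R^{2}_{\text{full}} - R^{2}_{(-j)}}{\left(1 - R^{2}_{\text{full}}\right) / (n - k - 1)}
```

All weights in the same model share the same degrees of freedom, so the predictor with the largest p-value is the one with the smallest |t|, and therefore the one whose removal lowers R² the least.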
Stepwise regression
Step 1 -- the first predictor in the model is the best single predictor
(same as the forward inclusion model)
Select the predictor with the numerically largest simple
correlation with the criterion -- if it is a significant
correlation
By using this procedure we are sure that the initial model works
Step 2 -- the next predictor in the model is the one that will
contribute the most -- with two equivalent definitions
(same as the forward inclusion model)
1. The 2-predictor model (including the first predictor) with the
numerically largest R² -- if the R² is significant and
significantly larger than the r² from the first step
2. Add to the model that predictor with the highest semi-partial
correlation with the criterion, controlling that predictor for the
predictor already in the model -- if the semi-partial is
significant
By using this procedure we are sure the 2-predictor model works
and works better than the 1-predictor model
Model selection
A full model is one that includes all the
variables
A null model is one that includes only the
intercept
Selection of which variables to include can be
done by you, by the computer, or both
Types of selection:
Forward, backward, stepwise
Backward selection
Starts with a full model
Removes variables starting with the least
significant variable
Often the best approach to start with
with a chiropractor?
You get an adjusted R squared from a
BACKward regression problem!
Forward selection
Starts with a null model
Enters the variables into the model starting
Stepwise selection
Starts with a full or null model (usually a full
Stepwise Regression
Analysis
Stepwise finds the explanatory variable with the
highest R² to start with. It then checks each of
the remaining variables until two variables with
highest R² are found. It then repeats the process
until three variables with highest R² are found,
and so on.
The overall R² gets larger as more variables are
added.
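The reason R² cannot go down is visible from its definition: adding a variable can only leave the residual sum of squares unchanged or reduce it. The adjusted R² corrects for this by penalizing the number of predictors. In the notation assumed here, n is the number of observations and k the number of predictors:

```latex
R^{2} = 1 - \frac{SS_{\text{error}}}{SS_{\text{total}}},
\qquad
R^{2}_{\text{adj}} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - k - 1}
```

Unlike R², the adjusted value can fall when an added variable explains too little to offset the lost degree of freedom.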
Stepwise may be useful in the early exploratory
stage of data analysis, but not to be relied upon
for the confirmatory stage.
Week assignment
Summary stepwise regression (3 pages)
Run stepwise regression (data in next slide)
https://www.youtube.com/watch?v=eme0ErU7GJA
childA  childIn  parentIn  teacherIn  frequency
30
30
30
30
30
30
30
30
30