BUSINESS STATISTICS
by Amir D. Aczel & Jayavel Sounderpandian
7th edition
Prepared by Lloyd Jaisingh, Morehead State University

Chapter 11: Multiple Regression

McGraw-Hill/Irwin
Chapter 11 topics:
- Using Statistics
- The k-Variable Multiple Regression Model
- The F Test of a Multiple Regression Model
- How Good is the Regression?
- Tests of the Significance of Individual Regression Parameters
- Testing the Validity of the Regression Model
- Using the Multiple Regression Model for Prediction
Lines and Planes

Any two points (A and B), or an intercept and slope (β0 and β1), define a line on a two-dimensional surface.

Any three points (A, B, and C), or an intercept and coefficients of x1 and x2 (β0, β1, and β2), define a plane in a three-dimensional surface.
The k-Variable Multiple Regression Model

Y = β0 + β1X1 + β2X2 + ... + βkXk + ε

where β0 is the Y-intercept of the regression surface and each βi, i = 1, 2, ..., k, is the slope of the regression surface, sometimes called the response surface, with respect to Xi.

For two independent variables: y = β0 + β1x1 + β2x2 + ε

Model assumptions:
1. ε ~ N(0, σ²), independent of other errors.
2. The variables Xi are uncorrelated with the error term.
In a simple regression model, ŷ = b0 + b1x, the least-squares estimators minimize the sum of squared errors from the estimated regression line.

In a multiple regression model, ŷ = b0 + b1x1 + b2x2, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.

Ŷ = b0 + b1X1 + b2X2 + ... + bkXk

where Ŷ is the predicted value of Y, the value lying on the estimated regression surface. The terms bi, for i = 0, 1, ..., k, are the least-squares estimates of the population regression parameters βi.

The actual, observed value of Y is the predicted value plus an error:

yj = b0 + b1x1j + b2x2j + ... + bkxkj + ej,  j = 1, ..., n.
11-9
Least-Squares Estimation:
The 2-Variable Normal Equations
11-10
x y b x b x b x x
2
x y b x b x x b x
2
2
2
Example 11-1

 Y    X1   X2   X1X2   X1²    X2²    X1Y    X2Y
 72   12    5     60   144     25     864    360
 76   11    8     88   121     64     836    608
 78   15    6     90   225     36    1170    468
 70   10    5     50   100     25     700    350
 68   11    3     33   121      9     748    204
 80   16    9    144   256     81    1280    720
 82   14   12    168   196    144    1148    984
 65    8    4     32    64     16     520    260
 62    8    3     24    64      9     496    186
 90   18   10    180   324    100    1620    900
---- ----  ---   ----  ----   ----   ----   ----
743  123   65    869   1615   509    9382   5040

Normal Equations:
743 = 10b0 + 123b1 + 65b2
9382 = 123b0 + 1615b1 + 869b2
5040 = 65b0 + 869b1 + 509b2

Solution:
b0 = 47.164942
b1 = 1.5990404
b2 = 1.1487479

Estimated regression equation:
Ŷ = 47.164942 + 1.5990404X1 + 1.1487479X2
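As a quick numerical check, the three normal equations above can be solved directly. This is a minimal sketch using NumPy's linear solver (the choice of NumPy is an assumption; any linear-system solver works):

```python
import numpy as np

# Coefficient matrix and right-hand side of the normal equations
# for Example 11-1 (sums taken from the data table above)
A = np.array([[10.0, 123.0, 65.0],
              [123.0, 1615.0, 869.0],
              [65.0, 869.0, 509.0]])
rhs = np.array([743.0, 9382.0, 5040.0])

# Solve for the least-squares estimates b0, b1, b2
b0, b1, b2 = np.linalg.solve(A, rhs)
```

The solution reproduces the slide's estimates b0 ≈ 47.1649, b1 ≈ 1.5990, b2 ≈ 1.1487.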
[Regression software output for Example 11-1: coefficient estimates.]
Decomposition of the Total Deviation

Total deviation: Y − Ȳ
Regression deviation: Ŷ − Ȳ
Error deviation: Y − Ŷ

Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE
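The decomposition SST = SSR + SSE holds exactly for a least-squares fit. A minimal sketch, using a small made-up data set (the x and y values below are illustrative, not from the text):

```python
# Illustrative data (assumed values, not from the text)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Least-squares fit of a simple regression line
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)                # total deviation
SSR = sum((yh - ybar) ** 2 for yh in yhat)             # regression deviation
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))   # error deviation
```

For the least-squares fit the identity SST = SSR + SSE holds up to floating-point rounding.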
The F Test of a Multiple Regression Model

A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:

H0: β1 = β2 = ... = βk = 0
H1: Not all the βi (i = 1, 2, ..., k) are equal to 0

Source of    Sum of    Degrees of
Variation    Squares   Freedom        Mean Square                F Ratio
Regression   SSR       k              MSR = SSR/k                F = MSR/MSE
Error        SSE       n − (k + 1)    MSE = SSE/(n − (k + 1))
Total        SST       n − 1          MST = SST/(n − 1)
With α = 0.01, the critical point is F0.01(2, 7) = 9.55. The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we may conclude that the dependent variable is related to one or more of the independent variables.
How Good is the Regression?

Mean square error:

MSE = SSE / (n − (k + 1)) = Σ(y − ŷ)² / (n − (k + 1))

Standard error of estimate: s = √MSE

The multiple coefficient of determination, R², measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:

R² = SSR/SST = 1 − SSE/SST

Example output: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%
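The fit measures above can be computed directly from the sums of squares. A minimal sketch; the SSE and SST values below are assumptions, back-computed so they roughly reproduce the reported s = 1.911 and R-sq = 96.1% for n = 10, k = 2:

```python
# Assumed (back-computed) sums of squares, n = 10 observations, k = 2 variables
n, k = 10, 2
SSE, SST = 25.6, 655.5
SSR = SST - SSE

R2 = SSR / SST                 # multiple coefficient of determination
MSE = SSE / (n - (k + 1))      # mean square error
MST = SST / (n - 1)            # total mean square
R2_adj = 1 - MSE / MST         # adjusted R-squared = 1 - MSE/MST
s = MSE ** 0.5                 # standard error of estimate
```

With these inputs, R² ≈ 0.961, adjusted R² ≈ 0.950, and s ≈ 1.91, matching the example output.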
The F ratio can also be written in terms of R²:

F = MSR/MSE = (SSR/k) / (SSE/(n − (k + 1))) = (R²/k) / ((1 − R²)/(n − (k + 1)))

The adjusted multiple coefficient of determination corrects R² for the degrees of freedom:

R²(adj) = 1 − (SSE/(n − (k + 1))) / (SST/(n − 1)) = 1 − MSE/MST
Tests of the Significance of Individual Regression Parameters

Hypothesis tests about an individual slope parameter:
H0: βi = 0
H1: βi ≠ 0

Test statistic for each test:

t(n − (k + 1)) = (bi − 0) / s(bi)
Variable    Coefficient Estimate    Standard Error    t-Statistic
Constant          53.12                  5.43             9.783 *
X1                 2.03                  0.22             9.227 *
X2                 5.60                  1.30             4.308 *
X3                10.35                  6.88             1.504
X4                 3.45                  2.70             1.259
X5                -4.25                  0.38           -11.184 *

n = 150, t0.025 = 1.96; * marks coefficients significant at the 0.05 level.
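Each t statistic is simply the estimate divided by its standard error, compared against the critical point t0.025 ≈ 1.96 for n = 150. A minimal sketch using the estimates and standard errors from the table above:

```python
# (estimate, standard error) pairs from the coefficient table
coefs = {
    "Constant": (53.12, 5.43),
    "X1": (2.03, 0.22),
    "X2": (5.60, 1.30),
    "X3": (10.35, 6.88),
    "X4": (3.45, 2.70),
    "X5": (-4.25, 0.38),
}
t_crit = 1.96  # two-sided critical point for large n at the 0.05 level

# t statistic = estimate / standard error
t_stats = {v: est / se for v, (est, se) in coefs.items()}
significant = sorted(v for v, t in t_stats.items() if abs(t) > t_crit)
```

This flags the constant, X1, X2, and X5 as significant, while X3 and X4 are not.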
Testing the Validity of the Regression Model: Residual Plots

It appears that the residuals are randomly distributed, with no pattern and with equal variance, as M1 increases.

It appears that the residuals are increasing as the Price increases; the variance of the residuals is not constant.
[Residual Plots for Exports: normal probability plot of the residuals, histogram of the residuals, residuals versus fitted values, and residuals versus observation order.]
Outliers and Influential Observations

Outliers: an outlier pulls the estimated regression line toward it (compare the regression line with the outlier to the regression line without it).

Influential observations: a single distant point can determine the fitted line even when there is no relationship within the main cluster of data. A more appropriate curvilinear relationship is seen when the in-between data are known.
[Software output flagging unusual observations, with fitted values (e.g., 2.6420, 4.5949, 5.1317) and standardized residuals (e.g., 2.80R, -2.87R, -2.57R, 2.02R) or influence flags (-0.14 X).]

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
[Estimated Regression Plane for Example 11-1, plotted over Advertising and Promotions.]
Using the Multiple Regression Model for Prediction

A prediction interval for a value of Y is based on the critical point t(α/2, n − (k + 1)) of the t distribution with n − (k + 1) degrees of freedom.
EXAMPLE 11-3

COST    PROM    BOOK
 4.2     1.0      0
 6.0     3.0      1
 5.5     6.0      1
 3.3     1.0      0
12.5    11.0      1
 9.6     8.0      1
 2.5     0.5      0
10.8     5.0      0
 8.4     3.0      1
 6.6     2.0      0
10.7     1.0      1
11.0    15.0      1
 3.5     4.0      0
 6.9    10.0      0
 7.8     9.0      1
10.1    10.0      0
 5.0     1.0      1
 7.5     5.0      0
 6.4     8.0      1
10.0    12.0      1
Qualitative (Dummy) Independent Variables

A regression with one quantitative variable (X1) and one qualitative variable (X2):

y = b0 + b1x1 + b2x2

The fit is two parallel lines with intercepts b0 (X2 = 0) and b0 + b2 (X2 = 1) and common slope b1.

A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):

y = b0 + b1x1 + b2x2 + b3x3
A qualitative variable with r levels or categories is represented with (r − 1) 0/1 (dummy) variables.

Example: movie category

Category     X2    X3
Adventure     0     0
Drama         0     1
Romance       1     0

A regression with one quantitative variable (X1) and two qualitative variables (X2 and X3):

y = b0 + b1x1 + b2x2 + b3x3
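The coding table above can be sketched as a small lookup: r = 3 categories become (r − 1) = 2 dummy variables, with Adventure as the baseline category (both dummies zero):

```python
# Dummy coding for the three movie categories in the table above.
# Adventure is the baseline (all dummies zero).
def dummies(category):
    """Return the (X2, X3) dummy pair for a movie category."""
    coding = {
        "Adventure": (0, 0),
        "Drama": (0, 1),
        "Romance": (1, 0),
    }
    return coding[category]
```

For instance, a Drama movie enters the regression with X2 = 0, X3 = 1.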
Salary = 8547 + 949 Education + 1258 Experience − 3256 Gender
  (SE)   (32.6)  (45.1)         (78.5)            (212.4)
  (t)    (262.2) (21.0)         (16.0)            (−15.3)

Gender = 1 if Female, 0 if Male

On average, female salaries are $3256 below male salaries.
A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):

y = b0 + b1x1 + b2x2 + b3x1x2

The two fitted lines have intercepts b0 and b0 + b2 and slopes b1 and b1 + b3.
Polynomial Regression

One-variable polynomial regression models:

ŷ = b0 + b1X                          (linear)
ŷ = b0 + b1X + b2X²                   (quadratic)
ŷ = b0 + b1X + b2X² + b3X³            (cubic)
Variable    Estimate    Standard Error    T-statistic
X1            2.34           0.92             2.54
X2            3.11           1.05             2.96
X1²           4.22           1.00             4.22
X2²           3.57           2.12             1.68
X1X2          2.77           2.30             1.20
The multiplicative model:

Y = β0 X1^β1 X2^β2 X3^β3 ε

The logarithmic transformation:

log Y = log β0 + β1 log X1 + β2 log X2 + β3 log X3 + log ε
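The point of the transformation is that the multiplicative model becomes exactly linear in the logs. A minimal numeric sketch (the coefficient and X values here are made up purely for illustration, and the error term is omitted):

```python
import math

# Assumed illustrative parameters for Y = b0 * X1**b1 * X2**b2 (no error term)
b0, b1, b2 = 2.0, 0.5, -0.3
x1, x2 = 9.0, 4.0

y = b0 * x1**b1 * x2**b2

# After taking logs, the model is linear in log X1 and log X2
lhs = math.log(y)
rhs = math.log(b0) + b1 * math.log(x1) + b2 * math.log(x2)
```

The two sides agree up to floating-point rounding, so ordinary least squares can be applied to the logged variables.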
The exponential model:

Y = β0 e^(β1X) ε

The logarithmic transformation:

log Y = log β0 + β1X + log ε
[Scatterplots for a sales-advertising example: SALES versus ADVERT with fitted line Y = 3.66825 + 6.784X (R-Squared = 0.978); LOGSALE versus LOGADV with fitted line Y = 1.70082 + 0.553136X (R-Squared = 0.947); and residuals (RESIDS) versus fitted values (Y-HAT).]
Variance-Stabilizing Transformations

Square root transformation: Y′ = √Y
Useful when the variance of the regression errors is approximately proportional to the conditional mean of Y.

Logarithmic transformation: Y′ = log(Y)
Useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y.

Reciprocal transformation: Y′ = 1/Y
Useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y.

Logistic function: when the dependent variable is a proportion p, use the transformation p′ = log[p/(1 − p)].
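The logit transformation at the end of the list maps a proportion in (0, 1) onto the whole real line, which is what makes a linear model for p′ sensible. A minimal sketch:

```python
import math

def logit(p):
    """Logit transformation: log of the odds p / (1 - p), for p in (0, 1)."""
    return math.log(p / (1 - p))
```

For example, logit(0.5) = 0, and the transformation is antisymmetric about p = 0.5: logit(1 − p) = −logit(p).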
11-11: Multicollinearity

Orthogonal X variables provide information from independent sources: no multicollinearity.

With some degree of collinearity, the problems it causes for the regression depend on the degree of collinearity.
Effects of Multicollinearity

- Variances of regression coefficients are inflated.
- Magnitudes of regression coefficients may be different from what are expected.
- Signs of regression coefficients may not be as expected.
- Adding or removing variables produces large changes in coefficients.
- Removing a data point may cause large changes in coefficient estimates or signs.
- In some cases, the F ratio may be significant while the t ratios are not.
The variance inflation factor of variable Xh:

VIF(Xh) = 1 / (1 − Rh²)

where Rh² is the R² value obtained for the regression of Xh on the other independent variables.

[Plot: the relationship between VIF and Rh²; VIF rises sharply as Rh² approaches 1.]
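The VIF formula is a one-liner, and evaluating it at a few Rh² values shows the sharp rise pictured in the plot:

```python
def vif(r2_h):
    """Variance inflation factor for variable h, given R_h^2 from
    regressing X_h on the other independent variables."""
    return 1.0 / (1.0 - r2_h)
```

With no collinearity (Rh² = 0) the VIF is 1; at Rh² = 0.5 it is 2; by Rh² = 0.9 it has already reached about 10.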
Residual Autocorrelation

[Plots of the residuals against their own lagged values (lags 1 through 4), used to detect autocorrelation.]

The Durbin-Watson test (first-order autocorrelation):

H0: ρ1 = 0
H1: ρ1 ≠ 0

The Durbin-Watson test statistic:

d = Σ(i = 2 to n) (ei − ei−1)² / Σ(i = 1 to n) ei²
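The statistic above is straightforward to compute from a sequence of residuals:

```python
def durbin_watson(e):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals, divided by the sum of squared residuals."""
    num = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    den = sum(ei ** 2 for ei in e)
    return num / den
```

Residuals that alternate in sign (negative autocorrelation) push d toward 4, while residuals that repeat (positive autocorrelation) push d toward 0; uncorrelated residuals give d near 2.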
Critical Points of the Durbin-Watson Statistic d:

        k = 1        k = 2        k = 3        k = 4        k = 5
  n    dL    dU     dL    dU     dL    dU     dL    dU     dL    dU
 15   1.08  1.36   0.95  1.54   0.82  1.75   0.69  1.97   0.56  2.21
 16   1.10  1.37   0.98  1.54   0.86  1.73   0.74  1.93   0.62  2.15
 17   1.13  1.38   1.02  1.54   0.90  1.71   0.78  1.90   0.67  2.10
 18   1.16  1.39   1.05  1.53   0.93  1.69   0.82  1.87   0.71  2.06
 ...
 65   1.57  1.63   1.54  1.66   1.50  1.70   1.47  1.73   1.44  1.77
 70   1.58  1.64   1.55  1.67   1.52  1.70   1.49  1.74   1.46  1.77
 75   1.60  1.65   1.57  1.68   1.54  1.71   1.51  1.74   1.49  1.77
 80   1.61  1.66   1.59  1.69   1.56  1.72   1.53  1.74   1.51  1.77
 85   1.62  1.67   1.60  1.70   1.57  1.72   1.55  1.75   1.52  1.77
 90   1.63  1.68   1.61  1.70   1.59  1.73   1.57  1.75   1.54  1.78
 95   1.64  1.69   1.62  1.71   1.60  1.73   1.58  1.75   1.56  1.78
100   1.65  1.69   1.63  1.72   1.61  1.74   1.59  1.76   1.57  1.78
Regions of the Durbin-Watson statistic:

0 to dL: Positive Autocorrelation
dL to dU: Test is Inconclusive
dU to 4 − dU: No Autocorrelation
4 − dU to 4 − dL: Test is Inconclusive
4 − dL to 4: Negative Autocorrelation

For n = 67, k = 4:  dU ≈ 1.73, so 4 − dU ≈ 2.27;  dL ≈ 1.47, so 4 − dL ≈ 2.53 < 2.58.

H0 is rejected, and we conclude there is negative first-order autocorrelation.
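The five decision regions can be sketched as a small classifier (the region labels here are my wording of the diagram above):

```python
def dw_decision(d, dL, dU):
    """Classify a Durbin-Watson statistic d against the table values dL, dU."""
    if d < dL:
        return "positive autocorrelation"
    if d < dU:
        return "inconclusive"
    if d <= 4 - dU:
        return "no autocorrelation"
    if d <= 4 - dL:
        return "inconclusive"
    return "negative autocorrelation"
```

For the slide's example (d = 2.58, dL ≈ 1.47, dU ≈ 1.73), d exceeds 4 − dL ≈ 2.53, so the test concludes negative first-order autocorrelation.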
Partial F Tests

Full model:
Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

Reduced model:
Y = β0 + β1X1 + β2X2 + ε

Partial F test:
H0: β3 = β4 = 0
H1: β3 and β4 not both 0

Partial F statistic:

F(r, n − (k + 1)) = [(SSE_R − SSE_F) / r] / MSE_F

where SSE_R is the sum of squared errors of the reduced model, SSE_F is the sum of squared errors of the full model; MSE_F is the mean square error of the full model [MSE_F = SSE_F/(n − (k + 1))]; r is the number of variables dropped from the full model.
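The partial F statistic is simple arithmetic once the two error sums of squares are in hand. A minimal sketch; the SSE values below are made up for illustration:

```python
def partial_F(SSE_R, SSE_F, r, n, k):
    """Partial F statistic for dropping r variables from a full model
    with k independent variables and n observations."""
    MSE_F = SSE_F / (n - (k + 1))
    return ((SSE_R - SSE_F) / r) / MSE_F

# Illustrative (assumed) numbers: dropping r = 2 of k = 4 variables
F = partial_F(SSE_R=40.0, SSE_F=30.0, r=2, n=20, k=4)
```

Here MSE_F = 30/15 = 2, so F = (10/2)/2 = 2.5, to be compared against the F(r, n − (k + 1)) distribution.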
All possible regressions: run regressions with all possible combinations of independent variables and select the best model.
Stepwise procedures:

- Forward selection: add one variable at a time to the model, on the basis of its F statistic.
- Backward elimination: remove one variable at a time, on the basis of its F statistic.
- Stepwise regression: adds variables to the model and subtracts variables from the model, on the basis of the F statistic.
Stepwise Regression (flowchart):

1. Compute the F statistic for each variable not in the model.
2. If none is significant, stop.
3. Otherwise, enter the most significant (smallest p-value) variable into the model.
4. Check whether any variable already in the model should be removed; if so, remove it.
5. Repeat from step 1.
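The entry half of the loop above (forward selection) can be sketched as follows. The `f_stat` callback is hypothetical: in practice it would refit the regression and return the F statistic for adding variable `v` to the current model; `f_enter` is an assumed entry threshold:

```python
def forward_select(candidates, f_stat, f_enter=4.0):
    """Greedy forward selection: repeatedly add the candidate variable
    with the largest F statistic, until none clears the entry threshold."""
    model, remaining = [], list(candidates)
    while remaining:
        best_v = max(remaining, key=lambda v: f_stat(model, v))
        if f_stat(model, best_v) < f_enter:
            break  # no remaining variable is significant: stop
        model.append(best_v)
        remaining.remove(best_v)
    return model

# Toy scoring function with fixed (made-up) F statistics per variable
toy_f = lambda model, v: {"x1": 12.0, "x2": 6.5, "x3": 1.2}[v]
selected = forward_select(["x1", "x2", "x3"], toy_f)
```

With the toy scores, x1 and x2 clear the threshold and x3 does not, so the selected model is ["x1", "x2"]. Full stepwise regression would add a removal check after each entry, as in the flowchart.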