Documente Academic
Documente Profesional
Documente Cultură
The output of the logistic analysis is presented in Table 8.1. X1 and X2 have
Wald chi-square values 10.5556 and 2.9985, respectively. Using Wald cut-off
value 4, as outlined in Chapter 2, Section 2.11.1, there is indication that X1
is an important predictor variable, whereas X2 is not quite important. The
classification accuracy of the base RESPONSE model is displayed in Classi-
fication Table 8.2. The Total column represents the actual number of non-
responders and responders: there are 296,120 nonresponders and 3,094
responders. The Total row represents the predicted or classified number of
TABLE 8.2
Classification Table of Model with X1 and X2
Classified
0 1 Total
Actual 0 143,012 153,108 296,120
1 1,390 1,704 3,094
Total 144,402 154,812 299,214
TCCR 48.37%
TABLE 8.3
Logistic Regression of Response on X1, X2 and X1X2
Parameter Standard Wald Pr >
Variable df Estimate Error Chi-Square Chi-Square
INTERCEPT 1 –4.5900 0.0502 8374.8095 0.E+00
X1 1 –0.0092 0.0186 0.2468 0.6193
X2 1 0.0292 0.0126 5.3715 0.0205
X 1X 2 1 –0.0074 0.0047 2.4945 0.1142
X2 =
ne 0 =0
98.9% 99.0%
1.1% 1.0%
208277 90937
X1
FIGURE 8.1
CHAID Tree for Testing X2 for a Special Point
X1=
ne 0 =0
99.0% 98.9%
1.0% 1.1%
229645 69569
X2
FIGURE 8.2
CHAID Tree for Testing X1 for a Special Point
1. The top box indicates that for the sample of 299,214 individuals there
are 3,094 responders and 296,120 nonresponders. The response rate
is 1% and nonresponse rate is 99%.
2. The left-leaf node represents 229,645 individuals whose X1 values
are not equal to zero. The response rate among these individuals is
1.0%.
3. The right-leaf node represents 69,569 individuals whose X1 values
equal zero. The response rate among these individuals is 1.1%.
4. I reference the bottom row of four branches (defined by the
intersection of X1 and X2 intervals/nodes), from left to right: #1
through #4.
5. The branch #1 represents 22,626 individuals whose X1 values equal
zero and X2 values lie in [0, 2). The response rate among these indi-
viduals is 1.0%.
FIGURE 8.3
Smooth Plot of Response and X2
1The minimum value is one of several values which can be used; alternatives are the means or
medians of the predefined ranges.
TABLE 8.6
Classification Table of Model with X2 and X1X2
Classified
0 1 Total
Actual 0 165,007 131,113 296,120
1 1,617 1,477 3,094
Total 166,624 132,590 299,214
TCCR 55.64%
TABLE 8.7
Logistic Regression of Response on X2, X1X2 and X2_SQ
Parameter Standard Wald Pr >
Variable df Estimate Error Chi-Square Chi-Square
INTERCEPT 1 –4.6247 0.0338 18734.1156 0.E+00
X1X2 1 –0.0075 0.0027 8.1118 0.0040
X2 1 0.7550 0.1151 43.0144 5.E-11
X2_SQ 1 –0.1516 0.0241 39.5087 3.E-10
TABLE 8.9
Summary of Model Performance
Number of
Responder
Model Correct
Type Defined by TCCR Classification RCCR
base X 1, X2 48.37% 1,704 1.10%
full X 1, X 2, X 1X 2 55.64% 1,478 1.11%
best-so-far X 2, X 1X 2 55.64% 1,477 1.11%
best X 2, X1X2, X2_SQ 64.59% 1,256 1.19%
8.9 Summary
After briefly reviewing the concepts of interaction variables and multi-col-
linearity, and the relationship between the two, I restated the popular strat-
egy for modeling with interaction variables: the Principle of Marginality
states that a model including an interaction variable should also include the
component variables that define the interaction. I reinforced the cautionary
note that accompanies this principle: neither testing the statistical signifi-
cance (or noticeable importance) nor interpreting the coefficients of the com-
ponent variables should be performed. Moreover, I pointed out that an
unfortunate by-product of this principle is that models with unnecessary
component variables go undetected, resulting in either unreliable predictions
and/or deterioration of performance.
Then I presented an alternative strategy, which is based on Nelder’s Notion
of a Special Point. I defined the strategy for first-order interaction, X1*X2, as
higher order interactions are seldom found in database marketing applica-
tions: predictor variable X1 = 0 is called a special point on the scale if when
X1 = 0 there is no relationship between the dependent variable and a second
predictor variable X2. If X1 = 0 is a special point, then omit X2 from the model;
References
1. Marquardt, D.W., You should standardize the predictor variables in your re-
gression model. Journal of the American Statistical Association, 75, 87–91, 1980.
2. Aiken, L.S. and West, S.G., Multiple Regression: Testing and Interpreting Interac-
tions, Sage Publications, Thousand Oaks, CA ,1991.
3. Chipman, H., Bayesian variable selection with related predictors, Canadian
Journal of Statistics, 24, 17–36, 1996.
4. Peixoto, J.L., Hierarchical variable selection in polynomial regression models,
The American Statistician, 41, 311–313, 1987.
5. Nelder, J.A., Functional marginality is important (letter to editor), Applied Sta-
tistics, 46, 281–282, 1997.
6. McCullagh, P.M. and Nelder, J.A., Generalized Linear Models, Chapman & Hall,
London, 1989.
7. Fox, J., Applied Regression Analysis, Linear Models, and Related Methods, Sage
Publications, Thousand Oaks, CA, 1997.
8. Nelder, J., The selection of terms in response-surface models — how strong is
the weak-heredity principle? The American Statistician, 52, 4, 1998.