http://intranet.hbs.edu/dept/research/statistics/
Types of Models
These models generalize the binary logit and probit models.
Ordinal logit and probit handle ordered outcomes with more than 2 categories.
Multinomial logit handles unordered outcomes with more than 2 categories.
(Multinomial probit is not commonly used because of computational difficulties.)
Outline of Talk
Review of Binary Models
Ordinal Models
Multinomial Logit
[Figure: binary model. Line a + bX with the PDF of Y*; regions labeled p=1 and p=0.]
Ordinal Outcomes
3 or more categorical outcomes, which can be treated as ordered
Bond ratings (AAA, AA, B, C, ...)
Likert scales (e.g. responses on a 1-7 scale, from strongly disagree to strongly agree)
Often analyzed as continuous
[Figure: ordinal model. Thresholds t1, t2, t3; regions labeled p=1 and p=0 at each threshold.]
             |      Coef.   Std. Err.
-------------+------------------------
       _cut1 |  -2.076242   .1548201    (Ancillary parameters)
       _cut2 |  -.9736895   .0807119
       _cut3 |  -.4528313    .073509
       _cut4 |   1.106628   .0781733
       _cut5 |   2.079342   .0932966
       _cut6 |   3.176076    .167065
--------------------------------------
In terms of cutpoints:
The outcome is in the second ordered category or higher (not the first) if 1.07*x + u > -2.08.
The outcome is in the third ordered category or higher (not the first or second) if 1.07*x + u > -.97.
The outcome is in exactly the second ordered category if -.97 > 1.07*x + u > -2.08.

Equivalently, in terms of intercepts:
The outcome is in the second ordered category or higher (not the first) if 1.07*x + 2.08 + u > 0.
The outcome is in the third ordered category or higher if 1.07*x + .97 + u > 0.
The outcome is in exactly the second ordered category if 1.07*x + 2.08 + u > 0 and 1.07*x + .97 + u < 0.
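The threshold rules above translate directly into category probabilities. The following is a minimal Python sketch, assuming the error u follows a standard logistic distribution (an ordered logit; the slide itself does not say which link produced these estimates). It uses the coefficient 1.07 and the six cutpoints from the output above. The function names are illustrative, not part of any package.

```python
import math

BETA = 1.07  # coefficient on x quoted in the text
# Cutpoints _cut1.._cut6 from the output above
CUTS = [-2.076242, -0.9736895, -0.4528313, 1.106628, 2.079342, 3.176076]


def logistic_cdf(z):
    """CDF of the standard logistic distribution (assumed error distribution)."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_at_least(k, x):
    """P(outcome is in ordered category k or higher), for k = 1..7."""
    if k <= 1:
        return 1.0
    # Category >= k  iff  BETA*x + u > cut_{k-1}; by symmetry of the
    # logistic distribution this probability is F(BETA*x - cut_{k-1}).
    return logistic_cdf(BETA * x - CUTS[k - 2])


def category_probs(x):
    """P(outcome is exactly category k), k = 1..7, as differences of cumulative probabilities."""
    n = len(CUTS) + 1
    upper = [prob_at_least(k, x) for k in range(1, n + 1)] + [0.0]
    return [upper[i] - upper[i + 1] for i in range(n)]


if __name__ == "__main__":
    for k, p in enumerate(category_probs(0.0), start=1):
        print(f"category {k}: {p:.4f}")
```

Calling `category_probs(x)` returns seven probabilities that sum to 1; the second entry is the probability that 1.07*x + u falls between the first two cutpoints.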
Interpreting Coefficients
Multiple cutpoints with no intercept term, or multiple intercept terms.
The probabilities modeled are those of all outcomes >= k, compared with all outcomes < k.
Interpret the coefficients as in the corresponding binary model.
Parallel Regressions
[Figure: parallel regression lines a1 + bX, a2 + bX, a3 + bX plotted as Y against X; regions labeled p=1 and p=0.]
Proportional Odds
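\[
\begin{aligned}
\log\frac{p_2}{1-p_2} &= \alpha_2 + \beta X \\
\log\frac{p_3}{1-p_3} &= \alpha_3 + \beta X \\
\log\frac{p_3}{1-p_3} - \log\frac{p_2}{1-p_2} &= \alpha_3 - \alpha_2 \\
\log\frac{\mathrm{odds}_3}{\mathrm{odds}_2} &= \alpha_3 - \alpha_2 \\
\mathrm{odds}_3 &= \mathrm{odds}_2 \cdot \exp(\alpha_3 - \alpha_2)
\end{aligned}
\]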
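The proportional odds property can be checked numerically: the ratio of the two cumulative odds depends only on the difference between the intercepts, not on X. This is a small Python sketch, assuming a logit link and taking the intercepts 2.08 and .97 and the slope 1.07 from the earlier ordinal example; the names `ALPHA2`, `ALPHA3`, and `odds_ratio` are illustrative.

```python
import math

BETA = 1.07    # slope from the earlier ordinal example
ALPHA2 = 2.08  # intercept for "second category or higher"
ALPHA3 = 0.97  # intercept for "third category or higher"


def cumulative_odds(alpha, x):
    """Odds p/(1-p) when log(p/(1-p)) = alpha + BETA*x."""
    return math.exp(alpha + BETA * x)


def odds_ratio(x):
    """odds_3 / odds_2 at a given x."""
    return cumulative_odds(ALPHA3, x) / cumulative_odds(ALPHA2, x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        # The ratio is the same at every x: exp(ALPHA3 - ALPHA2)
        print(x, odds_ratio(x), math.exp(ALPHA3 - ALPHA2))
```

Because the slope is shared across the cumulative equations, the BETA*x term cancels in the ratio, which is what "proportional odds" refers to.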
Interpreting Cutpoints
SD = Strongly Disagree, MoD = Moderately Disagree, D = Disagree, SoD = Somewhat Disagree, N = Neutral, VSA = Very Slightly Agree, SoA = Somewhat Agree, A = Agree, SA = Strongly Agree
[Figure: Probability of Responses, with response labels SD, D, MoD, SoD, N, VSA, SoA, SA.]
[Figure: response labels SD, D, MoD, SoD, N, VSA, SA with cutpoints t1, t2, t3.]
Interpreting Cutpoints
Model is Y* = u (no predictor variables)
oprobit y

             |      Coef.   Std. Err.
-------------+------------------------
       _cut1 |  -2.494879    .014093
       _cut2 |  -1.501138   .0061005
       _cut3 |  -.4976369   .0041469
       _cut4 |   .5008453   .0041494
       _cut5 |   1.506652   .0061208
       _cut6 |   2.519265   .0144766
Uneven Cutpoints
             |      Coef.   Std. Err.
-------------+------------------------
       _cut1 |  -2.494879    .014093
       _cut2 |  -2.217338   .0106104
       _cut3 |  -.4976369   .0041469
       _cut4 |   .5008453   .0041494
       _cut5 |   1.506652   .0061208
       _cut6 |   2.519265   .0144766
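With no predictors (Y* = u) and a probit link, the cutpoints map to category probabilities through the standard normal CDF, so the two sets of estimates above can be compared directly. This is a minimal Python sketch using only the standard library; `normal_cdf` and `category_probs` are illustrative helper names.

```python
import math

# Cutpoints from the two oprobit outputs above
EVEN_CUTS = [-2.494879, -1.501138, -0.4976369, 0.5008453, 1.506652, 2.519265]
UNEVEN_CUTS = [-2.494879, -2.217338, -0.4976369, 0.5008453, 1.506652, 2.519265]


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probs(cuts):
    """P(category k) = Phi(cut_k) - Phi(cut_{k-1}), with Phi(-inf)=0 and Phi(+inf)=1."""
    cdf = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


if __name__ == "__main__":
    for name, cuts in (("even", EVEN_CUTS), ("uneven", UNEVEN_CUTS)):
        probs = category_probs(cuts)
        print(name, " ".join(f"{p:.4f}" for p in probs))
```

Moving _cut2 from about -1.50 to about -2.22 narrows the interval for the second category and widens the one for the third, so the second category becomes much rarer and the third more common, while the other categories are unchanged.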
Multinomial Logit
A generalization of logistic regression
More than two outcomes
Outcomes are not ordered
We are interested in the relative probabilities of outcomes
Examples
Choice of transportation: bus, taxi, private car
Choice of product brand
Occupational choice (considered as unordered): craft, blue collar, professional, white collar
Example Data
ID   Distance   Income   Choice
1        5        15     Bus
2       10        10     Car
3        1        12     Car
4       25        18     Bus
5       30        40     Taxi
6        2        20     Bus
7        1         8     Taxi
Sample Results
-----------------------------------------------------------
     outcome |      Coef.   Std. Err.       z     P>|z|
-------------+---------------------------------------------
Taxi         |
    distance |  -.0757664   .1305456    -0.58    0.562
      income |    .319901   .0830162     3.85    0.000
       _cons |   -6.22562   1.734012    -3.59    0.000
-------------+---------------------------------------------
Car          |
    distance |   .4482523   .1129979     3.97    0.000
      income |   .0447404   .0581754     0.77    0.442
       _cons |  -2.587764   1.214103    -2.13    0.033
-------------+---------------------------------------------
Car          |
    distance |   .5240187   .1245058     4.21    0.000
      income |  -.2751607    .080734    -3.41    0.001
       _cons |   3.637855   1.705811     2.13    0.033
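To see how such coefficients turn into predicted probabilities, here is a minimal Python sketch of the multinomial logit formula. It assumes the first two panels (Taxi and Car) are contrasts against Bus as the base category; the output does not label the base, but that reading is consistent with the third panel being the difference of the first two. The dictionary layout and the function name `choice_probs` are illustrative.

```python
import math

# Coefficients from the first two panels above, assumed to be relative to Bus
COEF = {
    "Taxi": {"distance": -0.0757664, "income": 0.319901, "_cons": -6.22562},
    "Car": {"distance": 0.4482523, "income": 0.0447404, "_cons": -2.587764},
}


def choice_probs(distance, income):
    """Predicted probability of each outcome; the base category (Bus) has linear score 0."""
    scores = {"Bus": 0.0}
    for outcome, c in COEF.items():
        scores[outcome] = c["_cons"] + c["distance"] * distance + c["income"] * income
    denom = sum(math.exp(s) for s in scores.values())
    return {outcome: math.exp(s) / denom for outcome, s in scores.items()}


if __name__ == "__main__":
    # Hypothetical traveler: distance 10, income 15
    for outcome, p in choice_probs(10, 15).items():
        print(f"{outcome}: {p:.4f}")
```

The log of the ratio of any two predicted probabilities is the difference of their linear scores, which is why the model is described in terms of relative probabilities.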
Coefficients on Distance
Base    Outcome    Coef. on distance
Taxi    Bus          .0757664
Bus     Taxi        -.0757664
Bus     Car          .4482523
Taxi    Car          .5240187
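These four values are linked by simple arithmetic: switching the base category flips the sign of a contrast, and a contrast between two non-base outcomes is the difference of their contrasts with the base. A short worked version follows, where the subscript notation "outcome | base" is introduced here only for illustration and the Taxi-versus-Bus value -.0757664 is read as a contrast against Bus:

```latex
\begin{aligned}
\beta_{\text{Bus}\mid\text{Taxi}} &= -\beta_{\text{Taxi}\mid\text{Bus}} = -(-0.0757664) = 0.0757664 \\
\beta_{\text{Car}\mid\text{Taxi}} &= \beta_{\text{Car}\mid\text{Bus}} - \beta_{\text{Taxi}\mid\text{Bus}}
  = 0.4482523 - (-0.0757664) = 0.5240187
\end{aligned}
```

So the choice of base category changes how the results are displayed, not the fitted model.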
[Figure: "Change in the Predicted Probability" plot, with outcomes marked T, B, C; axis from -.18 to .33; rows distance and income (Std Coef).]
[Figure: "Logit Coefficient Scale Relative to" plot, with outcomes marked T, B, C; axis from -1.48 to 1.39; rows distance and income (Std Coef).]
Nested Logit
Treats a set of choices as a hierarchy
IIA assumption can be relaxed
References
Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage.
Hosmer, D. W. & Lemeshow, S. (2000). Applied Logistic Regression (2nd ed.). New York: Wiley.
Allison, P. D. (1999). Logistic Regression Using the SAS System: Theory and Application. Cary, NC: SAS Institute.
Long, J. S. & Freese, J. (2001). Regression Models for Categorical Dependent Variables Using Stata. College Station, TX: Stata Press.
Appendix
Programming Examples
By James Zeitler
Analysis of Maximum Likelihood Estimates

       DF   Standard Error   Wald Chi-Square
7       1       0.1666           363.5568
6       1       0.0933           496.5331
5       1       0.0781           200.8158
4       1       0.0734            38.0347
3       1       0.0807           145.4615
2       1       0.1555           178.1792
        1       0.1208            79.1034
Analysis of Maximum Likelihood Estimates

Parameter   Mode   DF   Estimate   Standard Error
Intercept   Bus     1    6.2253       1.7340
Intercept   Car     1    3.6375       1.7057
Distance    Bus     1    0.0757       0.1305
Distance    Car     1    0.5240       0.1245
Income      Bus     1   -0.3199       0.0830
Income      Car     1   -0.2751       0.0807
[SPSS output: Parameter Estimates table, Threshold rows, with 95% Confidence Interval (Lower Bound, Upper Bound).]
PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = LOGIT
  /PRINT = FIT PARAMETER SUMMARY .
[SPSS output: Parameter Estimates table, Threshold rows, with 95% Confidence Interval (Lower Bound, Upper Bound).]
PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = PROBIT
  /PRINT = FIT PARAMETER SUMMARY .
[SPSS output: Parameter Estimates table for CHOICE (rows Bus and Taxi), with columns B and df.]
Analyze
NOMREG choice WITH distance income
  /CRITERIA = CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20) LCONVERGE(0) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /MODEL
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER SUMMARY LRT .