Ordinal and Multinomial Models

William Simpson Research Computing Services
http://intranet.hbs.edu/dept/research/statistics/
Types of Models
Models are generalizations of the logit and probit models.
Ordinal logit and probit deal with ordered data (more than 2 categories).
Multinomial logit deals with unordered data with more than 2 categories.
(Multinomial probit is not commonly used due to computational difficulties.)
Outline of Talk
Review of Binary Models
Ordinal Models
Multinomial Logit
Binary Data View 1 (CDF)

View 1: we compute a number that is a linear combination of our predictors, call it y = α + βx (labelled a + bX in the figures). We then convert y into a probability p by using a cumulative distribution function (CDF). Our final outcome is 1 with probability p.
[Figure: CDF curve; probability (0 to 1) plotted against a + bX, from -3 to 3]
Another CDF View

[Figure: CDF view; Y plotted against a + bX, with levels p=0 and p=1 marked]
Binary Data View 2 (Latent or Unobserved Variable)

View 2: we compute a number that is a linear combination of our predictors and then add an error term, call it y* = α + βx + u. We then get an outcome of 1 if y* >= 0, and an outcome of 0 if y* < 0. In this case, the probabilistic element is the error term u, and y* is an unobserved variable.
Binary Data Unobserved Variable View

[Figure: Y* plotted against a + bX, showing the PDF of Y*]
What Happens When Standard Deviation of u Changes

y* = α + βx + v, std(v) > std(u)
[Figure: Y* plotted against a + bX; labels t and X]
Comparing CDF and Latent Variable Views

The two views are equivalent. Each one can be converted into the other: the cumulative distribution function (CDF) in view 1 matches the CDF of the distribution of u in view 2.
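The equivalence can be checked numerically. The sketch below is an illustration only: it assumes a probit model (u standard normal) with made-up coefficients ALPHA = 0.5 and BETA = 1.0, which are not from the talk. It computes p from view 1 with the normal CDF and compares it with the share of simulated view 2 outcomes where y* >= 0.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_view1(alpha, beta, x):
    """View 1: push the linear predictor through the CDF."""
    return normal_cdf(alpha + beta * x)

def prob_view2(alpha, beta, x, n_draws=100_000, seed=0):
    """View 2: simulate y* = alpha + beta*x + u with u ~ N(0, 1); outcome is 1 when y* >= 0."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws)
               if alpha + beta * x + rng.gauss(0.0, 1.0) >= 0.0)
    return hits / n_draws

# Hypothetical coefficients, chosen only for illustration.
ALPHA, BETA = 0.5, 1.0

for x in (-2.0, 0.0, 1.0):
    print(x, round(prob_view1(ALPHA, BETA, x), 3), round(prob_view2(ALPHA, BETA, x), 3))
```

The two columns should agree up to simulation noise; swapping in a logistic error and the logistic CDF would give the logit version of the same check.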
Combining the Two Views

[Figure: Y and Y* plotted against a + bX, with levels p=0 and p=1 marked]
Ordinal Outcomes
3 or more categorical outcomes, which can be treated as ordered
Bond ratings (AAA, AA, B, C, ...)
Likert scales (e.g. responses on a 1-7 scale, from strongly disagree to strongly agree)
Often analyzed as continuous
Ordinal Outcomes (Latent Variable View)

[Figure: Y* plotted against bX, with thresholds t1, t2, t3 marked]
Ordinal Outcomes (CDF and Latent Variable View)

[Figure, shown in four steps: bX with thresholds t1, t2, t3, and levels p=0 and p=1 marked]
SAS and Stata Code

Stata
    oprobit outcome x
  or
    ologit outcome x

SAS
    proc logistic;
      class outcome;
      model outcome = x / link=probit;
    run;
  or, in place of the model statement above,
      model outcome = x;
Sample Output (Stata oprobit)

--------------------------------------------------------
           y |      Coef.   Std. Err.       z    P>|z|
-------------+------------------------------------------
           x |   1.074575   .1209108     8.89   0.000
-------------+------------------------------------------
       _cut1 |  -2.076242   .1548201   (Ancillary parameters)
       _cut2 |  -.9736895   .0807119
       _cut3 |  -.4528313    .073509
       _cut4 |   1.106628   .0781733
       _cut5 |   2.079342   .0932966
       _cut6 |   3.176076    .167065
--------------------------------------------------------
Interpretation of Stata Output

           x |   1.074575   .1209108
-------------+----------------------
       _cut1 |  -2.076242   .1548201
       _cut2 |  -.9736895   .0807119

Outcome will be in the second ordered category or higher (not the first), if 1.07*x + u > -2.08.
Outcome will be in the third ordered category or higher (not the first or second), if 1.07*x + u > -.97.
Outcome will be in the second ordered category exactly, if -.97 > 1.07*x + u > -2.08.
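The rules above turn into category probabilities once u is taken to be standard normal. The following is a minimal sketch using the coefficient and cutpoints from the oprobit output; the function names are invented for this illustration, and x = 0 is an arbitrary example value.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimates copied from the oprobit output.
BETA = 1.074575
CUTS = [-2.076242, -0.9736895, -0.4528313, 1.106628, 2.079342, 3.176076]

def prob_at_least(k, x):
    """P(outcome >= k) for k = 2..7: the chance that BETA*x + u exceeds cutpoint k-1."""
    return normal_cdf(BETA * x - CUTS[k - 2])

def category_probs(x):
    """Probabilities of categories 1..7 as differences of adjacent cumulative probabilities."""
    upper = [1.0] + [prob_at_least(k, x) for k in range(2, 8)] + [0.0]
    return [upper[i] - upper[i + 1] for i in range(7)]

probs = category_probs(0.0)
for category, p in enumerate(probs, start=1):
    print(category, round(p, 4))
```

The probability of exactly the second category is prob_at_least(2, x) minus prob_at_least(3, x), matching the third rule above.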
Sample Output (SAS PROC LOGISTIC with LINK=PROBIT)

Parameter      DF   Estimate   Std Error
Intercept 7     1    -3.1758      0.1666
Intercept 6     1    -2.0793      0.0933
Intercept 5     1    -1.1066      0.0781
Intercept 4     1     0.4528      0.0734
Intercept 3     1     0.9737      0.0807
Intercept 2     1     2.0762      0.1555
x               1     1.0746      0.1208
Interpretation of SAS Output

Parameter      DF   Estimate   Std Error
Intercept 3     1     0.9737      0.0807
Intercept 2     1     2.0762      0.1555
x               1     1.0746      0.1208

Outcome will be in the second ordered category or higher (not the first), if 1.07*x + 2.08 + u > 0.
Outcome will be in the third ordered category or higher, if 1.07*x + .97 + u > 0.
Outcome will be in the second ordered category if 1.07*x + 2.08 + u > 0 and 1.07*x + .97 + u < 0.
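The SAS intercepts and the Stata cutpoints describe the same hurdles with opposite signs. A small sketch of the conversion, using the estimates from the two outputs; the helper name is made up for this example.

```python
# Estimates copied from the Stata (oprobit) and SAS (LINK=PROBIT) outputs.
STATA_CUTS = {1: -2.076242, 2: -0.9736895, 3: -0.4528313,
              4: 1.106628, 5: 2.079342, 6: 3.176076}
SAS_INTERCEPTS = {2: 2.0762, 3: 0.9737, 4: 0.4528,
                  5: -1.1066, 6: -2.0793, 7: -3.1758}

def sas_intercept_from_cut(cut):
    """b*x + u > cut is the same event as b*x - cut + u > 0, so the intercept is -cut."""
    return -cut

for k in range(2, 8):
    implied = sas_intercept_from_cut(STATA_CUTS[k - 1])
    print(f"Intercept {k}: SAS {SAS_INTERCEPTS[k]:8.4f}   implied by _cut{k - 1} {implied:8.4f}")
```

Small differences in the last digits reflect the two programs' own convergence and rounding, not a different model.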
Interpreting Coefficients
Multiple cutpoints with no intercept term, or multiple intercept terms.
Probabilities modeled are probabilities for all outcomes >= k, compared with all outcomes < k.
Interpret the coefficients the same as in the corresponding binary model.
Interpreting Coefficients (Ordinal Probit)

\[ p_2 = \Phi(\alpha_2 + \beta X) \]
\[ p_3 = \Phi(\alpha_3 + \beta X) \]
$\Phi$ is the cumulative distribution function of a standard normal
$p_2$ is the probability of outcome 2 or higher
$p_3$ is the probability of outcome 3 or higher
\[ \text{prob(exactly 2)} = p_2 - p_3 \]
Interpreting Coefficients (Ordinal Logit)

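\[ \log\frac{p_2}{1 - p_2} = \alpha_2 + \beta X \]
\[ p_2 = \frac{\exp(\alpha_2 + \beta X)}{1 + \exp(\alpha_2 + \beta X)} \]
\[ p_3 = \frac{\exp(\alpha_3 + \beta X)}{1 + \exp(\alpha_3 + \beta X)} \]
$p_2$ is the probability of outcome 2 or higher
$p_3$ is the probability of outcome 3 or higher
\[ \text{prob(exactly 2)} = p_2 - p_3 \]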
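As a worked instance of the ordinal logit formulas, the sketch below plugs in Intercept 2, Intercept 3 and the slope from the SAS ordered logit output in the appendix. The value x = -1 is an arbitrary choice for illustration, and the function names are not from the talk.

```python
import math

def inv_logit(z):
    """Logistic CDF: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Intercepts 2 and 3 and the slope from the SAS ordered logit output in the appendix.
ALPHA_2, ALPHA_3, BETA = 4.3014, 1.7093, 1.8479

def cumulative_probs(x):
    """Return (p2, p3): probabilities of outcome 2 or higher, and 3 or higher."""
    p2 = inv_logit(ALPHA_2 + BETA * x)
    p3 = inv_logit(ALPHA_3 + BETA * x)
    return p2, p3

p2, p3 = cumulative_probs(-1.0)
print("p2 =", round(p2, 4))
print("p3 =", round(p3, 4))
print("prob(exactly 2) =", round(p2 - p3, 4))
```

Because the intercept for outcome 2 or higher is larger than the one for 3 or higher, p2 is never below p3, so the difference is a valid probability.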
Assumptions of Ordinal Models

Relationship between probabilities and α + βx follows the assumed form (normal for probit, logistic for logit).
Parallel regressions: the coefficient β is the same for every hurdle, aka equal slopes (proportional odds for logistic models).
If not, use generalized ordered logit.
Parallel Regressions
[Figure: Y against X; parallel curves for a1 + bX, a2 + bX, a3 + bX, with levels p=0 and p=1 marked]
Proportional Odds
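\[
\begin{aligned}
\log\frac{p_2}{1 - p_2} &= \alpha_2 + \beta X \\
\log\frac{p_3}{1 - p_3} &= \alpha_3 + \beta X \\
\log\frac{p_3}{1 - p_3} - \log\frac{p_2}{1 - p_2} &= \alpha_3 - \alpha_2 \\
\log\frac{\text{odds}_3}{\text{odds}_2} &= \alpha_3 - \alpha_2 \\
\text{odds}_3 &= \text{odds}_2 \cdot \exp(\alpha_3 - \alpha_2)
\end{aligned}
\]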
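The last line of the derivation says the ratio of the two odds does not depend on X. A quick numeric check, using made-up parameter values (ALPHA_2, ALPHA_3 and BETA below are hypothetical, not estimates from the talk):

```python
import math

def odds_at_least(alpha_k, beta, x):
    """Odds of outcome k or higher under the ordinal logit model: exp(alpha_k + beta*x)."""
    return math.exp(alpha_k + beta * x)

# Hypothetical parameter values, used only to illustrate the identity.
ALPHA_2, ALPHA_3, BETA = 1.0, -0.5, 0.8

ratios = [odds_at_least(ALPHA_3, BETA, x) / odds_at_least(ALPHA_2, BETA, x)
          for x in (-3.0, 0.0, 2.5)]

print("odds_3 / odds_2 at three values of X:", ratios)
print("exp(alpha_3 - alpha_2):             ", math.exp(ALPHA_3 - ALPHA_2))
```

If the slope were allowed to differ between hurdles, the ratio would change with X, which is the situation the generalized ordered logit is meant for.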
Interpreting Cutpoints
Sample Likert Scale with Extra Points

  1     2    2.3    3     4    4.2    5     6     7
 SD     D    MoD   SoD    N    VSA   SoA    A    SA

SD = Strongly Disagree, D = Disagree, MoD = Moderately Disagree, SoD = Somewhat Disagree, N = Neutral, VSA = Very Slightly Agree, SoA = Somewhat Agree, A = Agree, SA = Strongly Agree
Probability of Responses
[Figure: probability of each response; regions labelled SD, D, MoD, SoD, N, VSA, SoA, SA]
Sample Likert Scale with Uneven Points

  1     2    2.3    3     4    4.2    7
 SD     D    MoD   SoD    N    VSA   SA

SD = Strongly Disagree, D = Disagree, MoD = Moderately Disagree, SoD = Somewhat Disagree, N = Neutral, VSA = Very Slightly Agree, SA = Strongly Agree
Probabilities with Uneven Scale
[Figure: probability of each response on the uneven scale; regions labelled SD, D, MoD, SoD, N, VSA, SA]
Ordinal Outcomes (Latent Variable View)

[Figure: Y* plotted against bX, with thresholds t1, t2, t3 marked]
Interpreting Cutpoints
Model is Y* = u (no predictor variables)
oprobit y
             |      Coef.   Std. Err.
       _cut1 |  -2.494879    .014093
       _cut2 |  -1.501138   .0061005
       _cut3 |  -.4976369   .0041469
       _cut4 |   .5008453   .0041494
       _cut5 |   1.506652   .0061208
       _cut6 |   2.519265   .0144766
Uneven Cutpoints
             |      Coef.   Std. Err.
       _cut1 |  -2.494879    .014093
       _cut2 |  -2.217338   .0106104
       _cut3 |  -.4976369   .0041469
       _cut4 |   .5008453   .0041494
       _cut5 |   1.506652   .0061208
       _cut6 |   2.519265   .0144766
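With no predictors, each cutpoint is just a point on the standard normal scale, so the share of responses in each category follows from the normal CDF. The sketch below compares the evenly spaced and the uneven cutpoints from the two outputs; the function names are invented for the example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_shares(cuts):
    """With Y* = u ~ N(0, 1), category k covers the interval between cutpoints k-1 and k."""
    cdf = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cuts) + 1)]

# Cutpoints copied from the two oprobit outputs (only _cut2 differs).
EVEN_CUTS = [-2.494879, -1.501138, -0.4976369, 0.5008453, 1.506652, 2.519265]
UNEVEN_CUTS = [-2.494879, -2.217338, -0.4976369, 0.5008453, 1.506652, 2.519265]

even = category_shares(EVEN_CUTS)
uneven = category_shares(UNEVEN_CUTS)
for k in range(7):
    print(k + 1, round(even[k], 4), round(uneven[k], 4))
```

Moving _cut2 from about -1.50 to about -2.22 shrinks category 2 and enlarges category 3 by the same amount, while the other categories are unchanged.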
Cutpoints for Ordinal Logit
             |      Coef.   Std. Err.
       _cut1 |  -5.090054   .0405469
       _cut2 |  -2.623044   .0125897
       _cut3 |  -.8017561   .0068396
       _cut4 |   .8111758   .0068519
       _cut5 |   2.630199   .0126288
       _cut6 |    5.05293   .0398104
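The logit cutpoints are larger in magnitude than the probit ones because the logistic distribution has a larger standard deviation, but they should imply nearly the same cumulative proportions. The sketch below checks this under the assumption that the ologit output was fit to the same data as the evenly spaced oprobit output, which the slides do not state explicitly.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

# Cutpoints copied from the evenly spaced oprobit output and from the ologit output.
PROBIT_CUTS = [-2.494879, -1.501138, -0.4976369, 0.5008453, 1.506652, 2.519265]
LOGIT_CUTS = [-5.090054, -2.623044, -0.8017561, 0.8111758, 2.630199, 5.05293]

for k, (cp, cl) in enumerate(zip(PROBIT_CUTS, LOGIT_CUTS), start=1):
    print(f"_cut{k}: probit share below {normal_cdf(cp):.4f}   "
          f"logit share below {logistic_cdf(cl):.4f}   ratio {cl / cp:.2f}")
```

The ratio of logit to probit cutpoints is not constant: it is larger in the tails, where the logistic distribution is heavier than the normal.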
Multinomial Logit
A generalization of logistic regression
More than two outcomes
Outcomes are not ordered
We are interested in the relative probabilities of outcomes
Examples
Choice of transportation: bus, taxi, private car
Choice of product brand
Occupational choice (considered as unordered): craft, blue collar, professional, white collar
Example Data
ID   Distance   Income   Choice
1           5       15   Bus
2          10       10   Car
3           1       12   Car
4          25       18   Bus
5          30       40   Taxi
6           2       20   Bus
7           1        8   Taxi
Using a Reference Level

ID   Distance   Income   Choice
1           5       15   Bus
2          10       10   Car
3           1       12   Car
4          25       18   Bus
5          30       40   Taxi
6           2       20   Bus
7           1        8   Taxi
Sample Results
----------------------------------------------------
     outcome |      Coef.   Std. Err.       z    P>|z|
-------------+--------------------------------------
Taxi         |
    distance |  -.0757664   .1305456    -0.58   0.562
      income |    .319901   .0830162     3.85   0.000
       _cons |   -6.22562   1.734012    -3.59   0.000
-------------+--------------------------------------
Car          |
    distance |   .4482523   .1129979     3.97   0.000
      income |   .0447404   .0581754     0.77   0.442
       _cons |  -2.587764   1.214103    -2.13   0.033
----------------------------------------------------
(Outcome outcome==Bus is the comparison group)
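These coefficients are log odds relative to Bus, so predicted probabilities follow by exponentiating each outcome's linear predictor and normalizing. A minimal sketch, with the coefficients from the output and an arbitrary, hypothetical traveller (distance 10, income 15):

```python
import math

# Coefficients from the output; Bus is the comparison group, so its coefficients are zero.
COEFS = {
    "Bus":  {"distance": 0.0,        "income": 0.0,       "_cons": 0.0},
    "Taxi": {"distance": -0.0757664, "income": 0.319901,  "_cons": -6.22562},
    "Car":  {"distance": 0.4482523,  "income": 0.0447404, "_cons": -2.587764},
}

def predicted_probs(distance, income):
    """Multinomial logit probabilities: exp(linear predictor), normalized to sum to 1."""
    scores = {
        mode: math.exp(c["distance"] * distance + c["income"] * income + c["_cons"])
        for mode, c in COEFS.items()
    }
    total = sum(scores.values())
    return {mode: s / total for mode, s in scores.items()}

# Hypothetical values of the predictors, chosen only for illustration.
probs = predicted_probs(distance=10, income=15)
for mode, p in probs.items():
    print(mode, round(p, 4))
```

The log of the ratio of any outcome's probability to the Bus probability reproduces that outcome's linear predictor.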
Sample Results (2)

----------------------------------------------------
     outcome |      Coef.   Std. Err.       z    P>|z|
-------------+--------------------------------------
Bus          |
    distance |   .0757664   .1305456     0.58   0.562
      income |   -.319901   .0830162    -3.85   0.000
       _cons |    6.22562   1.734012     3.59   0.000
-------------+--------------------------------------
Car          |
    distance |   .5240187   .1245058     4.21   0.000
      income |  -.2751607    .080734    -3.41   0.001
       _cons |   3.637855   1.705811     2.13   0.033
----------------------------------------------------
(Outcome outcome==Taxi is the comparison group)
Coefficients on Distance
(A → B is the coefficient for outcome B when A is the comparison group.)
Taxi → Bus     .0757664
Bus → Taxi    -.0757664
Bus → Car      .4482523
Taxi → Car     .5240187
Bus → Taxi + Taxi → Car = Bus → Car

-.0757664 + .5240187 = .4482523
Bus → Car = Taxi → Car - Taxi → Bus
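The same arithmetic works for every coefficient, so a table fit with one comparison group can be converted to any other by subtraction. A short sketch, starting from the Bus-based estimates; the function name rebase is made up for this example.

```python
# (distance, income, _cons) for each outcome relative to Bus, from the first results table.
VS_BUS = {
    "Bus":  (0.0, 0.0, 0.0),
    "Taxi": (-0.0757664, 0.319901, -6.22562),
    "Car":  (0.4482523, 0.0447404, -2.587764),
}

def rebase(coefs, new_reference):
    """Re-express every outcome's coefficients relative to new_reference by subtraction."""
    ref = coefs[new_reference]
    return {outcome: tuple(b - r for b, r in zip(values, ref))
            for outcome, values in coefs.items()}

vs_taxi = rebase(VS_BUS, "Taxi")
for outcome, (distance, income, cons) in vs_taxi.items():
    print(f"{outcome:5s} distance {distance: .7f}  income {income: .7f}  _cons {cons: .6f}")
```

The Car and Bus rows agree with the second results table up to rounding in the last printed digit. Standard errors cannot be converted this way, since they depend on the covariance between the two sets of estimates.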
Probability Change Plot

[Figure: change in the predicted probability for distance and income, each varied by +/- sd/2; outcomes marked B, C, T; horizontal axis from -.18 to .33]
Odds Ratio Plot

[Figure: odds ratio plot for distance and income (standardized coefficients); outcomes marked B, C, T; factor change scale from .23 to 4, logit coefficient scale from -1.48 to 1.39]
Independence from Irrelevant Alternatives (IIA)

Relative odds of two categories shouldn't change when a new category is added.
E.g., if choices are car, bus, and Yellow Cab, the relative proportions shouldn't change if a new choice is added, e.g. Black & White Cab.
Not realistic in this case. Assumption should be examined carefully.
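The IIA property is built into the multinomial logit formula, which the sketch below illustrates with made-up utilities (all numbers are hypothetical). Adding a second cab company with the same utility as Yellow Cab leaves the car-to-bus odds untouched and raises the combined cab share, even though in practice the two cab companies would mostly split the existing cab riders.

```python
import math

def choice_probs(utilities):
    """Multinomial logit choice probabilities from a dict of systematic utilities."""
    weights = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical utilities, chosen only for illustration.
before = choice_probs({"car": 1.0, "bus": 0.2, "Yellow Cab": 0.5})
after = choice_probs({"car": 1.0, "bus": 0.2, "Yellow Cab": 0.5, "Black & White Cab": 0.5})

print("car/bus odds before:", before["car"] / before["bus"])
print("car/bus odds after: ", after["car"] / after["bus"])
print("cab share before:   ", before["Yellow Cab"])
print("cab share after:    ", after["Yellow Cab"] + after["Black & White Cab"])
```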
Other Models for Nominal Outcomes

Conditional Logit
Attributes of choices can be used as predictors
Nested Logit
Treats a set of choices as a hierarchy
IIA assumption can be relaxed
Appendix
Programming Examples
By James Zeitler
Ordered Logit (SAS)

proc logistic data = work.ordinals descending;
  model y = x;
run;

The LOGISTIC Procedure

Model Information
  Data Set                  WORK.ORDINALS
  ..............................................
  Model                     cumulative logit
  Optimization Technique    Fisher's scoring

Response Profile
  Ordered Value    y    Total Frequency
              1    7                  6
  .............................
              7    1                  6

Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates
  Parameter      DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
  Intercept 7     1    -6.1912           0.4312          206.1863       <.0001
  Intercept 6     1    -3.6194           0.1804          402.7389       <.0001
  Intercept 5     1    -1.8611           0.1414          173.2883       <.0001
  Intercept 4     1     0.7326           0.1275           33.0150       <.0001
  Intercept 3     1     1.7093           0.1520          126.4030       <.0001
  Intercept 2     1     4.3014           0.4189          105.4418       <.0001
  x               1     1.8479           0.2176           72.1016       <.0001
Ordered Probit (SAS)

proc logistic data = work.ordinals descending;
  model y = X / LINK = PROBIT;
run;

The LOGISTIC Procedure

Model Information
  Data Set                  WORK.ORDINALS
  ...............................................
  Model                     cumulative probit

Response Profile
  Ordered Value    y    Total Frequency
              1    7                  6
  ............................
              7    1                  6

Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates
  Parameter      DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
  Intercept 7     1    -3.1758           0.1666          363.5568       <.0001
  Intercept 6     1    -2.0793           0.0933          496.5331       <.0001
  Intercept 5     1    -1.1066           0.0781          200.8158       <.0001
  Intercept 4     1     0.4528           0.0734           38.0347       <.0001
  Intercept 3     1     0.9737           0.0807          145.4615       <.0001
  Intercept 2     1     2.0762           0.1555          178.1792       <.0001
  x               1     1.0746           0.1208           79.1034       <.0001
Multinomial Logit (SAS)

/* Use Link = GLOGIT in PROC LOGISTIC */
/* to estimate a multinomial logit    */
/* Refer to the response profile to   */
/* determine the reference category   */
proc logistic data = transport;
  class Mode;
  model Mode = Distance Income / link = glogit;
run;

The LOGISTIC Procedure

Model Information
  Data Set                     WORK.TRANSPORT
  Response Variable            Mode
  Number of Response Levels    3
  Model                        generalized logit

Response Profile
  Ordered Value    Mode    Total Frequency
              1    Bus                  27
              2    Car                  42
              3    Taxi                 31

Logits modeled use Mode='Taxi' as the reference category.

Analysis of Maximum Likelihood Estimates
  Parameter   Mode   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
  Intercept   Bus     1     6.2253           1.7340           12.8897       0.0003
  Intercept   Car     1     3.6375           1.7057            4.5475       0.0330
  Distance    Bus     1     0.0757           0.1305            0.3367       0.5617
  Distance    Car     1     0.5240           0.1245           17.7135       <.0001
  Income      Bus     1    -0.3199           0.0830           14.8488       0.0001
  Income      Car     1    -0.2751           0.0807           11.6155       0.0007
Ordered Logit (SPSS)

Analyze > Regression > Ordinal...
Logit is default link distribution

Parameter Estimates
                                                                 95% Confidence Interval
                       Estimate   Std. Error      Wald   df   Sig.   Lower Bound   Upper Bound
Threshold   [Y = 1]      -4.302         .419   105.441    1   .000        -5.123        -3.480
            [Y = 2]      -1.709         .152   126.409    1   .000        -2.007        -1.411
            [Y = 3]       -.733         .127    33.018    1   .000         -.983         -.483
            [Y = 4]       1.861         .141   173.282    1   .000         1.584         2.138
            [Y = 5]       3.619         .180   402.733    1   .000         3.266         3.973
            [Y = 6]       6.191         .431   206.180    1   .000         5.346         7.036
Location    X             1.848         .218    72.096    1   .000         1.421         2.274
Link function: Logit.
Ordered Logit Syntax and Results (SPSS)
PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = LOGIT
  /PRINT = FIT PARAMETER SUMMARY .
Ordered Probit (SPSS)

Analyze > Regression > Ordinal...
Set Probit as link distribution

Parameter Estimates
                                                                 95% Confidence Interval
                       Estimate   Std. Error      Wald   df   Sig.   Lower Bound   Upper Bound
Threshold   [Y = 1]      -2.076         .156   178.170    1   .000        -2.381        -1.771
            [Y = 2]       -.974         .081   145.464    1   .000        -1.132         -.815
            [Y = 3]       -.453         .073    38.033    1   .000         -.597         -.309
            [Y = 4]       1.107         .078   200.820    1   .000          .954         1.260
            [Y = 5]       2.079         .093   496.537    1   .000         1.896         2.262
            [Y = 6]       3.176         .167   363.453    1   .000         2.850         3.503
Location    X             1.075         .121    79.106    1   .000          .838         1.311
Link function: Probit.
Ordered Probit Syntax and Results (SPSS)
PLUM y WITH x
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = PROBIT
  /PRINT = FIT PARAMETER SUMMARY .
Multinomial Logit (SPSS)
Analyze > Regression > Multinomial logit...

NOMREG choice WITH distance income
  /CRITERIA = CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20) LCONVERGE(0) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /MODEL
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER SUMMARY LRT .

Parameter Estimates
                                                                      95% Confidence Interval for Exp(B)
CHOICE                    B   Std. Error     Wald   df   Sig.   Exp(B)   Lower Bound   Upper Bound
Bus     Intercept     2.588        1.214    4.543    1   .033
        DISTANCE      -.448         .113   15.736    1   .000     .639          .512          .797
        INCOME        -.045         .058     .591    1   .442     .956          .853         1.072
Taxi    Intercept    -3.638        1.706    4.548    1   .033
        DISTANCE      -.524         .125   17.714    1   .000     .592          .464          .756
        INCOME         .275         .081   11.616    1   .001    1.317         1.124         1.542
