
Multinomial Logistic Regression: Nominal Outcomes

● Example 1. Alligator’s Food Preference (Nominal Outcome)


– The outcome variable Y is the type of food (fish, birds, reptiles,
other), and the predictor variables may include the size/length of the
alligator, gender, geographic location, etc.
● Example 2. High school students’ program choices (Nominal Outcome)
– Y: program choice (1 = general, 2 = academic, 3 = vocational);
predictors may include reading and writing scores as well as
socioeconomic status.
● Example 3. Severity of Disease (Ordinal Outcome)
– Patients may be classified as Y = 0 (no disease), Y = 1 (early stage of
the disease), or Y = 2 (advanced stage of the disease). The outcome
variable Y is ordinal with three ordered categories (0, 1, 2), and the
covariates x may include patient characteristics (age, race, gender,
marital status, family history, blood and X-ray results, etc.)
● Example 4. Strength of Response in Opinion Surveys (Ordinal Outcome)
– A subject's answer falls into one of several ordered categories (1 =
strongly disagree, 2 = disagree, 3 = no opinion, 4 = agree, 5 = strongly
agree). Covariates are various characteristics of the subject (gender,
age, race, education, income, etc.)
Multinomial Logistic Regression (Nominal Outcome)
● Analysis
– Collapse the outcome categories to two and then fit a binary logistic regression.
• This approach loses information and changes the original research
question into a very different one.
– Fit a separate binary logistic regression for each pair of outcomes.
• Two problems with this approach are
– Each analysis is run on a different sample.
– Without constraining the logistic models, the estimated probabilities
of the outcome categories may sum to more than 1.
– Fit a baseline-category logit model with Y = 0 (say) as the baseline outcome
category (two logits when Y has three outcomes: 0, 1, 2): (Multinomial_nominal.sas)
• ln[p(y=1|x) / p(y=0|x)] = β10 + β11 x1 + β12 x2 + ... + β1p xp = x’β1
– p(y=1|x) / p(y=0|x) = exp(x’β1)
• ln[p(y=2|x) / p(y=0|x)] = β20 + β21 x1 + β22 x2 + ... + β2p xp = x’β2
– p(y=2|x) / p(y=0|x) = exp(x’β2)
• This is called the baseline logit, generalized logit, or multinomial logit model.
• Use p(y=0|x) + p(y=1|x) + p(y=2|x) = 1 to obtain
– p(y=0|x) = 1 / [1 + exp(x’β1) + exp(x’β2)]
– p(y=1|x) = exp(x’β1) / [1 + exp(x’β1) + exp(x’β2)]
– p(y=2|x) = exp(x’β2) / [1 + exp(x’β1) + exp(x’β2)]
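As a minimal sketch of the three probability formulas above, the following Python function converts the two linear predictors x’β1 and x’β2 into category probabilities. The function name and all coefficient values are hypothetical and chosen only for illustration; they do not come from the SAS examples.

```python
import numpy as np

def baseline_logit_probs(x, beta1, beta2):
    """Return [p(y=0|x), p(y=1|x), p(y=2|x)] with y=0 as the baseline category."""
    eta1 = float(np.dot(x, beta1))  # x'beta1 = ln[p(y=1|x) / p(y=0|x)]
    eta2 = float(np.dot(x, beta2))  # x'beta2 = ln[p(y=2|x) / p(y=0|x)]
    denom = 1.0 + np.exp(eta1) + np.exp(eta2)
    return np.array([1.0, np.exp(eta1), np.exp(eta2)]) / denom

# Hypothetical covariate vector (leading 1 for the intercept) and coefficients.
x = np.array([1.0, 2.0])
beta1 = np.array([0.5, -0.1])
beta2 = np.array([-1.0, 0.3])

probs = baseline_logit_probs(x, beta1, beta2)
print(probs, probs.sum())  # the three probabilities sum to 1 by construction
```

Because all three probabilities share one denominator, they are guaranteed to sum to 1, which is what the separate pairwise binary models cannot ensure.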
Multinomial Logistic Regression (Ordinal outcome)
● Analysis
• Proportional odds model (Multinomial_ordinal.sas)
– Compares Y ≤ j to Y > j.
– By default, PROC LOGISTIC models these cumulative probabilities when
the outcome variable has three or more categories, with common
beta coefficients for the covariates.
– It assumes that the odds ratio is the same at every cut point
(independent of j); check whether this assumption is justified.
PROC LOGISTIC automatically displays a test of the common odds ratio
(the score test for the proportional odds assumption).
– If the common beta-coefficient assumption is not valid, use the
unequalslopes option in the MODEL statement to fit a model with
different beta coefficients for different categories j (cumulative
logistic model).
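The proportional odds structure can be sketched in a few lines of Python. This assumes the parameterization logit P(Y ≤ j | x) = α_j + x’β, with one intercept per cut point and a single shared β. The function name, intercepts, and slope below are hypothetical values, not estimates from Multinomial_ordinal.sas.

```python
import numpy as np

def proportional_odds_probs(x, alphas, beta):
    """Category probabilities under logit P(Y <= j | x) = alpha_j + x'beta.

    alphas must be increasing; there is one intercept per cut point.
    """
    eta = np.asarray(alphas, dtype=float) + float(np.dot(x, beta))
    cum = 1.0 / (1.0 + np.exp(-eta))   # P(Y <= j | x) for each cut point j
    cum = np.append(cum, 1.0)          # the last category closes the scale
    return np.diff(cum, prepend=0.0)   # P(Y = j | x) as successive differences

# Hypothetical parameters: three ordered categories, one covariate.
alphas = [-1.0, 0.5]
beta = np.array([0.8])
probs_a = proportional_odds_probs(np.array([1.0]), alphas, beta)
probs_b = proportional_odds_probs(np.array([0.0]), alphas, beta)

def cumulative_odds(probs):
    cum = np.cumsum(probs)[:-1]
    return cum / (1.0 - cum)           # odds of Y <= j versus Y > j

# The cumulative odds ratio is the same at every cut point j.
print(cumulative_odds(probs_a) / cumulative_odds(probs_b))
```

Both printed ratios equal exp(0.8), which is the "common odds ratio" that the proportional odds assumption imposes. With unequal slopes, each cut point would have its own β and the ratios would differ.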
Multinomial Logistic Regression Model: Outcome variable is Y
● The ratio p(y=j|x=a) / p(y=0|x=a) will be referred to as odds_{j,a}
● The odds ratio of outcome y = j versus y = 0 for covariate x = a
versus x = b is OR_j(a,b) = odds_{j,a} / odds_{j,b}
= [p(y=j|x=a) / p(y=0|x=a)] / [p(y=j|x=b) / p(y=0|x=b)]
● Confidence intervals for odds ratios within each logit are obtained
in the same fashion as in binary logistic regression
● Interpret odds ratios and the corresponding confidence intervals
as in the binary-outcome setting
● Comparisons of odds ratios (i.e., tests and estimates of
contrasts of beta coefficients within or across logit models) can
be carried out using “test” statements
● Use LSMEANS with the GLM parameterization to find predicted
probabilities for selected variables when all others
are held constant at their average values
● The OUTPUT statement can be used to save predicted
probabilities for all sets of x values in the input data file.
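To make the odds ratio and its confidence interval concrete, here is a minimal Python sketch for a single covariate in logit j. It assumes a Wald-type interval, which is the usual choice in binary logistic regression; the function name, coefficient, and standard error are hypothetical values, not output from the SAS examples.

```python
import math

def odds_ratio_ci(beta_j, se_j, a, b, z=1.96):
    """OR_j(a, b) = exp[(a - b) * beta_j] with a Wald interval (z=1.96 for ~95%)."""
    diff = a - b
    log_or = diff * beta_j             # log odds ratio for x = a versus x = b
    half = z * abs(diff) * se_j        # half-width on the log scale
    return math.exp(log_or), math.exp(log_or - half), math.exp(log_or + half)

# Hypothetical estimate: slope 0.4 with standard error 0.1, comparing x = 3 to x = 1.
or_j, lower, upper = odds_ratio_ci(beta_j=0.4, se_j=0.1, a=3.0, b=1.0)
print(or_j, lower, upper)
```

The interval is built on the log scale and then exponentiated, so it is not symmetric around the odds ratio. An interval that excludes 1 corresponds to a significant effect of x on outcome j relative to the baseline category.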
Multinomial Logistic Regression Model
● Variable Selection Methods
– Automatic variable selection methods may be used to select a subset of variables for
each model. This may result in different subsets of significant variables for different
logits, but one may prefer to have the same subset for all logit functions.
● Model diagnostics
– Nominal outcomes: use the diagnostics for binary outcome models.
– Ordinal outcomes: not much is known, but some idea can be obtained using
multiple binary models.
● Goodness-of-fit:
– Nominal outcomes: more research is needed in this area.
• Sum the predicted probabilities of all outcomes except the reference category. A
goodness-of-fit chi-square test based on a g x k contingency table (g = number of groups
formed using the summed predicted probability above, k = number of
outcome categories) was proposed by Fagerland et al. (2008) and Fagerland (2009).
– Ordinal outcomes: much more research is needed.
• For the proportional odds model, use the Lipsitz test and the Fagerland-Hosmer test.
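The g x k table behind the nominal-outcome test can be sketched as follows. This is an illustration of the grouping idea only, run on synthetic data generated inside the block. The function name, sample size, and coefficients are hypothetical, and the reference distribution and degrees of freedom should be taken from the cited papers rather than from this sketch.

```python
import numpy as np

def grouped_gof_table(y, probs, g=10):
    """Observed/expected g x k tables and a Pearson-type chi-square statistic.

    y: integer outcomes in 0..k-1; probs: n x k predicted probabilities,
    with column 0 as the reference category.
    """
    y = np.asarray(y)
    probs = np.asarray(probs, dtype=float)
    k = probs.shape[1]
    score = probs[:, 1:].sum(axis=1)           # sum over non-reference outcomes
    order = np.argsort(score, kind="stable")   # sort subjects by that sum
    observed = np.zeros((g, k))
    expected = np.zeros((g, k))
    for i, idx in enumerate(np.array_split(order, g)):  # g near-equal groups
        observed[i] = np.bincount(y[idx], minlength=k)  # observed counts
        expected[i] = probs[idx].sum(axis=0)            # expected counts
    chi2 = ((observed - expected) ** 2 / expected).sum()
    return observed, expected, chi2

# Synthetic data from a known three-category model (illustration only).
rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
num = np.column_stack([np.ones(n), np.exp(0.5 * x), np.exp(-0.5 + x)])
probs = num / num.sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=p) for p in probs])

observed, expected, chi2 = grouped_gof_table(y, probs, g=10)
print(observed.shape, chi2)
```

In practice `probs` would be the predicted probabilities saved from the fitted model, for example by the OUTPUT statement, rather than the true probabilities used here.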
Multinomial Logistic Regression Model

● SAS examples
• multinomial_nominalW.sas
• multinomial_ordinalW.sas
