
MARKETING ANALYSIS ASSIGNMENT-1

Logistic Regression

Submitted to: Prof. Shelendra Tyagi

Submitted by: Ramit Gahlaut, 053/2015


Q1. The file logitsubscibedata.xls gives the number of people in each age group who
subscribe and do not subscribe to a magazine. How does age influence the chance of
subscribing to the magazine?

Answer:
To assess how age influences the chance of subscribing to the magazine, we use logit
(logistic) regression.
Logit Regression output using SPSS
Block 1

The significance value for the model is less than 0.05, which indicates that the model
is statistically significant.

Variance explained
To understand how much of the variation in the dependent variable is explained by the
model (the equivalent of R² in multiple regression), we interpret the "Model Summary"
table below:
Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      1381.112a           .068                   .102

a. Estimation terminated at iteration number 4 because parameter estimates changed by
less than .001.

This table contains the Cox & Snell R Square and Nagelkerke R Square values, which
are both methods of calculating the explained variation. These values are sometimes
referred to as pseudo R² values (and will have lower values than in multiple
regression). They are interpreted in a similar manner, but with more caution.
Therefore, the explained variation in the dependent variable based on our model
ranges from 6.8% to 10.2%, depending on whether we refer to the Cox & Snell R² or
the Nagelkerke R² method, respectively. Nagelkerke R² is a modification of Cox &
Snell R², the latter of which cannot achieve a value of 1. For this reason, it is
preferable to report the Nagelkerke R² value.
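The relationship between the two pseudo R² measures can be sketched as follows. This is a minimal illustration, not SPSS's own code. The model -2 log likelihood (1381.112) comes from the Model Summary table, and the null-model value is obtained by adding the model chi-square (94.86) reported in the Conclusion. The sample size is not stated in this report, so N_ASSUMED is a hypothetical placeholder chosen only for illustration.

```python
import math

def cox_snell_r2(dev_null, dev_model, n):
    """Cox & Snell R^2 from the -2 log likelihoods of the null and fitted models."""
    return 1.0 - math.exp(-(dev_null - dev_model) / n)

def nagelkerke_r2(dev_null, dev_model, n):
    """Nagelkerke R^2: Cox & Snell R^2 divided by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(-dev_null / n)
    return cox_snell_r2(dev_null, dev_model, n) / max_cox_snell

DEV_MODEL = 1381.112          # -2 log likelihood from the Model Summary table
DEV_NULL = DEV_MODEL + 94.86  # model chi-square from the Conclusion added back
N_ASSUMED = 1347              # ASSUMPTION: sample size is not given in the report

cs = cox_snell_r2(DEV_NULL, DEV_MODEL, N_ASSUMED)
nk = nagelkerke_r2(DEV_NULL, DEV_MODEL, N_ASSUMED)
print(f"Cox & Snell R^2 = {cs:.3f}, Nagelkerke R^2 = {nk:.3f}")
```

Because the divisor is always below 1, the Nagelkerke value is always larger than the Cox & Snell value for the same model, which is why the two figures in the table differ.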

Category prediction
Binomial logistic regression estimates the probability of an event (in this case,
subscribing to the magazine) occurring. If the estimated probability of the event
occurring is greater than or equal to 0.5 (better than even chance), SPSS classifies the
event as occurring (e.g., subscription). If the probability is less than 0.5,
SPSS classifies the event as not occurring (e.g., no subscription). It is
very common to use binomial logistic regression to predict whether cases can be
correctly classified (i.e., predicted) from the independent variables. Therefore, we
need a method to assess the effectiveness of the predicted
classification against the actual classification. There are many methods to assess this,
and their usefulness depends on the nature of the study conducted. However, all
methods revolve around the observed and predicted classifications, which are
presented in the "Classification Table", as shown below:

The cut value is .500. This means that if the estimated probability of a case falling
into the "yes" category is greater than .500, that case is classified into the "yes"
category. Otherwise, the case is classified into the "no" category (as mentioned previously).
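The classification rule described above can be sketched in a few lines. This is only an illustration of the idea, using the "greater than or equal to 0.5" rule stated earlier. The probabilities and observed outcomes are hypothetical toy values, not figures from the assignment's data file, and the function names are invented for this example.

```python
def classify(probabilities, cutoff=0.5):
    """Label a case 1 ("yes") when its estimated probability reaches the cutoff, else 0 ("no")."""
    return [1 if p >= cutoff else 0 for p in probabilities]

def classification_table(observed, predicted):
    """Count cases by (observed, predicted) pair and return the counts with overall accuracy."""
    counts = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for o, p in zip(observed, predicted):
        counts[(o, p)] += 1
    accuracy = (counts[(0, 0)] + counts[(1, 1)]) / len(observed)
    return counts, accuracy

# HYPOTHETICAL toy data, for illustration only.
probs = [0.10, 0.35, 0.50, 0.62, 0.48, 0.80]
observed = [0, 0, 1, 1, 1, 0]

predicted = classify(probs)
counts, accuracy = classification_table(observed, predicted)
print(counts, f"overall accuracy = {accuracy:.1%}")
```

The overall percentage in a classification table is the share of cases on the diagonal, that is, observed "no" predicted as "no" plus observed "yes" predicted as "yes".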

Variables in the equation


The "Variables in the Equation" table shows the contribution of each independent
variable to the model and its statistical significance. This table is shown below:

The Wald test ("Wald" column) is used to determine statistical significance for each of
the independent variables. The statistical significance of the test is found in the "Sig."
column. From these results you can see that age (p = .00) and gender (p = .002) both added
significantly to the model/prediction. We can use the information in the "Variables in
the Equation" table to predict the change in the odds of an event occurring for a one-unit
change in an independent variable when all other independent variables are kept
constant. For example, the table shows that the odds of subscribing to the magazine
were 1.22 times higher for females than for males.
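The odds ratios follow from exponentiating the coefficients. A minimal sketch, assuming the rounded coefficients quoted in the Analysis section (-.052 for Age, .204 for the woman indicator); the dictionary keys are labels invented for this example:

```python
import math

# Coefficients (B) as quoted in the Analysis section, rounded to three decimals.
coefficients = {"Age": -0.052, "Gender (1 = woman)": 0.204}

# Exp(B) is the multiplicative change in the odds for a one-unit increase
# in the predictor, holding the other predictor constant.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: Exp(B) = {ratio:.3f}")
```

An Exp(B) above 1 means the odds rise with the predictor, and a value below 1 means they fall. With these rounded inputs the gender value comes out near the 1.22 quoted above.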

Analysis

The coefficient and intercept estimates give us the following equation, where Woman is
an indicator equal to 1 for a woman and 0 for a man:

log(p/(1-p)) = logit(p) = .801 + (-.052)*Age + (.204)*Woman

Let's fix Age at some value. We will use 54. For a man (Woman = 0), the conditional
logit of being a subscriber to the magazine when Age is held at 54 is
log(p/(1-p))(Age=54) = .801 + (-.052)*54 + (.204)*0
We can examine the effect of a one-unit increase in Age. When Age is held at 55,
the conditional logit of being a subscriber is
log(p/(1-p))(Age=55) = .801 + (-.052)*55 + (.204)*0
Taking the difference of the two equations (using the unrounded Age coefficient), we
have the following:
log(p/(1-p))(Age=55) - log(p/(1-p))(Age=54) = -0.052399.
We can say now that the coefficient for Age is the difference in the log odds. In other
words, for a one-unit increase in Age, the expected change in log odds is -0.052399.
By exponentiating both sides of our last equation, we have the following:
Exp[log(p/(1-p))(Age=55) - log(p/(1-p))(Age=54)] = Exp(log(p/(1-p))(Age=55)) /
Exp(log(p/(1-p))(Age=54)) = odds(Age=55)/odds(Age=54) = Exp(-0.052399) =
0.948950.
Equivalently, odds(Age=54)/odds(Age=55) = Exp(0.052399) = 1.053796.

So we can say that for a one-unit increase in Age, we expect to see about a 5% decrease
in the odds of being a subscriber to the magazine. This 5% decrease does not depend
on the value that Age is held at.
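The worked example can be checked numerically. The sketch below is an illustration under stated assumptions. It takes the intercept and gender coefficient from the equation, and for Age it combines the negative sign from the equation with the unrounded magnitude .052399 quoted in the text. The function names are invented for this example.

```python
import math

INTERCEPT = 0.801
B_AGE = -0.052399  # ASSUMPTION: sign from the equation, magnitude from the text
B_WOMAN = 0.204    # applies when the gender indicator is 1 (woman), 0 for a man

def logit(age, woman):
    """Log odds of subscribing for a given age and gender indicator."""
    return INTERCEPT + B_AGE * age + B_WOMAN * woman

def probability(age, woman):
    """Convert the log odds back to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-logit(age, woman)))

# One-unit increase in Age for a man, from 54 to 55.
diff = logit(55, 0) - logit(54, 0)
odds_ratio = math.exp(diff)

# The same ratio at a different age and gender, to show it does not depend on them.
odds_ratio_other = math.exp(logit(31, 1) - logit(30, 1))

print(f"change in log odds = {diff:.6f}")
print(f"odds ratio = {odds_ratio:.4f} ({(1 - odds_ratio):.1%} decrease in odds)")
print(f"P(subscribe | man, 54) = {probability(54, 0):.3f}")
```

The odds ratio is constant across ages, but the change in probability is not, because the logistic function is non-linear. This is why the 5% figure describes odds rather than the probability of subscribing.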

Conclusion
A logistic regression was performed to ascertain the effects of age and gender on the
likelihood that participants subscribe to the magazine. The logistic regression
model was statistically significant, χ²(4) = 94.86, p < .0005. The model explained
10.2% (Nagelkerke R²) of the variance in subscription and correctly classified
76.2% of cases. The odds of subscribing to the magazine were 1.22 times higher for
females than for males. Increasing age was associated with a lower likelihood of
subscribing to the magazine.
