
Homework: Regression

Background:
This homework was developed using data supplied by MMA, a consulting firm that specializes
in marketing mix models. The purpose of this exercise is to understand how sales for Brand C
relate to factors in the marketplace (own marketing efforts, others' marketing efforts,
environmental factors). This information could then be used to (i) assess the relative impact of
different elements of the marketing mix and (ii) forecast volume.
This is an integrated dataset with 179 weeks (Feb 2000 to Jul 2003) of observations for a variety
of marketing mix variables. Brand C is one of the big players in a fairly commoditized product
category. Brands R, E, P and U are some of the other brands in the category. Brands C and U are
owned by the same company. Some of the measures in the dataset should look familiar, while
others may be new. The key dependent variable in the dataset is equivalent units sales volume.
You have selling price information for brands C, E, and P. The variable disacv_c measures not
only how deep a discount was offered for brand C, but also how prevalent that discount
was across sales outlets.
A unique measure of promotional activity included in the dataset is expressed in terms of coupon
valuation within an FSI drop, and how big the coupon drop was (in terms of circulation). This
measure is reported as two variables for brand C contingent on holiday or non-holiday time
periods. The dataset also has information about coupon drops for competitors R and E.
Information from Nielsen Media Research is incorporated into the dataset as TV GRP
information for commercials featuring brands C and U. A gross rating point (GRP) is a variable
used to measure the impact of television advertising. There is also an indicator variable for the
thematic focus of the television advertising message.
The dataset also includes information about the prevalence of brand C's bonus pack offering, a
measure of line length per store expressed as a rolling average of SKUs per store, and (using
panel data) the percent share of brand C that is sold through Wal*Mart.
Definitions of variables in dataset:

week         Week of observations
eq_volum_c   Equivalent unit sales volume for brand C (the dependent variable)
disacv_c     Brand C %ACV * % Discount (This variable captures depth of price
             discount and how prevalent it was. That is, weighted average price
             discount)
bonusacv     %ACV for stores in which brand C bonus pack had sales
price_c      Brand C price per equivalent unit (non-promoted price)
price_e      Brand E price per equivalent unit (non-promoted price)
price_p      Private label price per equivalent unit (non-promoted price)
tvgrp_c      Brand C TV GRPs (GRPs are reach TIMES frequency, or the number of
             people viewing the commercial and how many times they see it.)
tvgrp_u      Brand U TV GRPs (GRPs are reach TIMES frequency, or the number of
             people viewing the commercial and how many times they see it.)
trustad      Theme of Brand C TV advertising focused on the message "Trusted".
             Included to indicate times when this ad ran.
fsi_holi     Brand C Holiday FSIs (coupon value * circulation)
fsi_non      Brand C Non-Holiday FSIs (coupon value * circulation)
fsi_comp     Brand E or R FSIs (coupon value * circulation)
itemstor     Number of Brand C items sold per store, rolling 13-week average
walmart      Wal*Mart share

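Three of the definitions above are simple products (%ACV * % Discount, reach * frequency, coupon value * circulation). The short sketch below works through that arithmetic. The function names and all input numbers are hypothetical, and the scaling used in the actual dataset (fractions versus percentages) is not stated in the handout, so treat the units here as an assumption.

```python
# Illustrative arithmetic only. All inputs are made-up numbers, and the
# dataset's real scaling (fractions vs. percentages) is an assumption.

def discount_acv(pct_acv, pct_discount):
    """%ACV on discount times discount depth (a weighted average discount)."""
    return pct_acv * pct_discount


def gross_rating_points(reach_pct, avg_frequency):
    """GRPs as reach (in percent of audience) times average frequency."""
    return reach_pct * avg_frequency


def fsi_weight(coupon_value, circulation):
    """FSI measure as coupon face value times circulation of the drop."""
    return coupon_value * circulation


# Hypothetical week: discount offered in stores covering 40% ACV, 15% deep.
example_disacv = discount_acv(40.0, 0.15)
# Hypothetical flight: 50% reach, seen 3 times on average.
example_grps = gross_rating_points(50.0, 3.0)
# Hypothetical drop: $0.50 coupon, 2 million circulation.
example_fsi = fsi_weight(0.50, 2_000_000)

print(example_disacv, example_grps, example_fsi)
```

A deep discount in few stores and a shallow discount in many stores can produce the same disacv_c value, which is worth keeping in mind when interpreting its coefficient.
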
Questions:

1. Univariate (one variable at a time) analysis: Use summary statistics and time series
graphs to highlight key findings in the data. Please be thoughtful and concise in your
reporting.
2. Bivariate (two variables at a time) analysis: Use correlations and X-Y plots to explore
bivariate relationships between the variables reported. Please include only those graphs
that show an association between the dependent variable (eq_volum_c) and an independent
variable. Do any pairs of the independent variables exhibit a high association with each
other?
3. Develop one regression model to explain drivers of Eq. Unit Sales. Use methods
discussed in class to determine this final model and discuss the following aspects of your
model using appropriate diagnostics:
a. linearity
b. homoscedasticity of errors
c. normality of errors
d. independence of errors

Please be thoughtful and concise in your reporting.


4. Justify your choice of independent variables in the model. That is, why did you choose to
include or exclude certain variables? Briefly comment on multicollinearity in the data
provided and how you went about addressing it in your final model.
5. Comment on the in-sample model fit (R-squared) and significance level of the independent
variables. Include a time series plot that reports the actual as well as predicted (i.e. model-
based) eq_volum_c. How well do you do?
6. Out of sample prediction: Rerun your regression model using only 150 weeks of data.
Use this model to predict eq_volum_c for the remaining 29 weeks in the data. Report
your findings in a time series plot. Please comment on how well your model predicts out
of sample.

7. Using your final model (in 3 above), what can you say about the effectiveness of the
different marketing activities for Brand C (e.g. price discounts vs. couponing vs.
advertising)? What can you say about the effect of competitive activities on Brand C
sales?
8. In about a paragraph, discuss how you could use your final model (in 3 above) to manage
the marketing mix of your brand C. Next, as an illustration, pose one specific marketing
mix related question that could be answered by your model and then answer that
question.
9. What are some of the limitations on the inference you can draw from the above model?
How could you overcome these limitations?
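
Several of the questions above involve mechanical steps: fitting on the first 150 weeks and predicting the remaining 29, reporting R-squared, checking independence of errors, and checking multicollinearity. The following is a minimal sketch of those steps using NumPy only. It runs on synthetic data generated inside the block, not on the MMA dataset; the coefficients, the three chosen predictors, and the helper names (fit_ols, durbin_watson, vif) are assumptions made for illustration, and the Durbin-Watson statistic and variance inflation factors are offered as common diagnostics rather than as the methods discussed in class.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    """Fitted values for the coefficient vector returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


def r_squared(y, y_hat):
    """1 - SS_residual / SS_total."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)


def durbin_watson(resid):
    """Values near 2 suggest little first-order autocorrelation."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)


def vif(X):
    """Variance inflation factor per column: 1 / (1 - R^2 on the others)."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        fitted = predict(fit_ols(others, X[:, j]), others)
        factors.append(1.0 / (1.0 - r_squared(X[:, j], fitted)))
    return np.array(factors)


# Synthetic stand-in for the data: 179 weeks, three hypothetical predictors.
rng = np.random.default_rng(0)
n_weeks = 179
price_c = rng.normal(1.0, 0.05, n_weeks)
disacv_c = rng.uniform(0.0, 10.0, n_weeks)
tvgrp_c = rng.uniform(0.0, 100.0, n_weeks)
X = np.column_stack([price_c, disacv_c, tvgrp_c])
# Made-up data-generating process, used only so the sketch has something to fit.
y = 500 - 200 * price_c + 8 * disacv_c + 0.5 * tvgrp_c + rng.normal(0, 10, n_weeks)

# Fit on the first 150 weeks, hold out the last 29.
train, holdout = slice(0, 150), slice(150, n_weeks)
beta = fit_ols(X[train], y[train])
fitted_train = predict(beta, X[train])
pred_holdout = predict(beta, X[holdout])

r2_in_sample = r_squared(y[train], fitted_train)
dw = durbin_watson(y[train] - fitted_train)
vifs = vif(X[train])
# Mean absolute percentage error on the 29 held-out weeks.
mape_holdout = np.mean(np.abs(y[holdout] - pred_holdout) / np.abs(y[holdout]))

print("in-sample R^2:", r2_in_sample)
print("Durbin-Watson:", dw)
print("VIFs:", vifs)
print("hold-out MAPE:", mape_holdout)
```

With the real data, X and y would come from the dataset columns, and plotting y against the fitted and predicted values over week gives the time series plots requested in questions 5 and 6.
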
