
Analysis of Variance (ANOVA)

ANOVA
The Analysis of Variance technique is used when the independent variables are of
nominal scale (categorical) and the dependent variable is metric (continuous).

Designs
The design of the experiment is most critical in performing any experiment to be
analyzed through the technique of ANOVA.


Completely Randomised Design in a One-Way ANOVA (Single Factor)
Randomised Block Design (Single Blocking Factor)
Factorial Design with 2 or more Factors.


One-Way ANOVA

Marketing researchers are often interested in examining the
differences in the mean values of the dependent variable for several
categories of a single independent variable or factor.

There is one dependent (metric) variable.
There is only one categorical independent variable, which is called a factor.
Each category of the independent variable is called a level.

The independent variable may be different levels of prices, or different pack sizes,
or different product colours, and the effect (dependent variable) could be sales,
preferences or attitudes towards the brand.

Conducting One-way Analysis of Variance
Decompose the Total Variation
The total variation in Y, denoted by $SS_y$, can be decomposed into two components:

$$SS_y = SS_{between} + SS_{within}$$

where the subscripts between and within refer to the categories of X. $SS_{between}$ is the variation in Y related to the variation in the means of the categories of X. For this reason, $SS_{between}$ is also denoted as $SS_x$. $SS_{within}$ is the variation in Y related to the variation within each category of X. $SS_{within}$ is not accounted for by X. Therefore it is referred to as $SS_{error}$.

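The total variation in Y may be decomposed as:

$$SS_y = SS_x + SS_{error}$$

where

$$SS_y = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$$

$$SS_x = \sum_{j=1}^{c} n (\bar{Y}_j - \bar{Y})^2$$

$$SS_{error} = \sum_{j=1}^{c} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2$$

and

$Y_i$ = individual observation
$\bar{Y}_j$ = mean for category j
$\bar{Y}$ = mean over the whole sample, or grand mean
$Y_{ij}$ = i th observation in the j th category
n = number of observations in each category
c = number of categories
N = total number of observations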
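The three sums of squares defined above can be sketched in a few lines of Python. This is only an illustration: the function name `sums_of_squares` and the toy data are assumptions, not part of the original slides. The sketch weights each category by its own size, which reduces to the factor n in the formulas when all categories have the same number of observations.

```python
from statistics import fmean


def sums_of_squares(groups):
    """Return (SS_y, SS_x, SS_error) for a list of per-category samples."""
    all_values = [y for group in groups for y in group]
    grand_mean = fmean(all_values)

    # Total variation: every observation against the grand mean.
    ss_y = sum((y - grand_mean) ** 2 for y in all_values)

    # Between-category variation: category means against the grand mean,
    # weighted by category size (n in the formulas when sizes are equal).
    ss_x = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)

    # Within-category variation: observations against their own category mean.
    ss_error = sum((y - fmean(g)) ** 2 for g in groups for y in g)

    return ss_y, ss_x, ss_error


# Toy data, made up for illustration: two categories, three observations each.
ss_y, ss_x, ss_error = sums_of_squares([[1, 2, 3], [4, 5, 6]])
print(ss_y, ss_x, ss_error)
```

For any data set, the first value returned should equal the sum of the other two, up to floating-point rounding.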
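Independent Variable X

| | $X_1$ | $X_2$ | $X_3$ | ... | $X_c$ | Total Sample |
|---|---|---|---|---|---|---|
| | $Y_1$ | $Y_1$ | $Y_1$ | ... | $Y_1$ | $Y_1$ |
| | $Y_2$ | $Y_2$ | $Y_2$ | ... | $Y_2$ | $Y_2$ |
| | : | : | : | | : | : |
| | $Y_n$ | $Y_n$ | $Y_n$ | ... | $Y_n$ | $Y_N$ |
| Category Mean | $\bar{Y}_1$ | $\bar{Y}_2$ | $\bar{Y}_3$ | ... | $\bar{Y}_c$ | $\bar{Y}$ |

Within-category variation = $SS_{within}$
Between-category variation = $SS_{between}$
Total variation = $SS_y$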
ANOVA Table

| Source of Variation | Sum of Squares | DF | Mean Square | F |
|---|---|---|---|---|
| Between Groups | SSx | c-1 | MSx = SSx/(c-1) | MSx/MSE |
| Within Groups | SSE = SSy-SSx | N-c | MSE = SSE/(N-c) | |
| Total | SSy | N-1 | | |
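As a rough sketch of how the rows of the ANOVA table follow from the two sums of squares, the helper below fills in the degrees of freedom, mean squares and F ratio. The function name `anova_table` and the input numbers are hypothetical and chosen only for illustration.

```python
def anova_table(ss_x, ss_error, c, n_total):
    """Build the one-way ANOVA table rows from SS_x and SS_error.

    c is the number of categories, n_total the total sample size (N).
    """
    df_between = c - 1
    df_within = n_total - c
    ms_x = ss_x / df_between          # mean square due to X
    ms_error = ss_error / df_within   # mean square due to error
    return [
        ("Between Groups", ss_x, df_between, ms_x, ms_x / ms_error),
        ("Within Groups", ss_error, df_within, ms_error, None),
        ("Total", ss_x + ss_error, n_total - 1, None, None),
    ]


def cell(value):
    """Format a table cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.3f}"


# Illustrative numbers only (not from the slides).
rows = anova_table(ss_x=13.5, ss_error=4.0, c=2, n_total=6)
for source, ss, df, ms, f in rows:
    print(f"{source:<15} {ss:>8.3f} {df:>3} {cell(ms):>8} {cell(f):>8}")
```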
Test Significance

In one-way analysis of variance, the interest lies in testing the null hypothesis that the category means are equal in the population.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c$$

Under the null hypothesis, $SS_x$ and $SS_{error}$ come from the same source of variation. In other words, the population variance of Y can be estimated from either mean square:

$$MS_x = \frac{SS_x}{c - 1} \quad \text{(mean square due to X)}$$

$$MS_{error} = \frac{SS_{error}}{N - c} \quad \text{(mean square due to error)}$$


The null hypothesis may be tested by the F statistic based on the ratio between these two estimates:

$$F = \frac{SS_x/(c - 1)}{SS_{error}/(N - c)} = \frac{MS_x}{MS_{error}}$$

This statistic follows the F distribution, with (c - 1) and (N - c) degrees of freedom (df).

Interpret the Results
If the null hypothesis of equal category means is not rejected, then the independent variable does not have a significant effect on the dependent variable.
On the other hand, if the null hypothesis is rejected, then the effect of the independent variable is significant.
Tukey's test can then be used to see which pairs of groups differ significantly.

Example:

The department store is attempting to determine the effect of in-store promotion (X) on sales (Y).

The null hypothesis is that the category means are equal:

$$H_0: \mu_1 = \mu_2 = \mu_3$$

$H_1$: At least one of the means is different from the others.

The significance level is $\alpha = 0.05$.
EFFECT OF IN-STORE PROMOTION ON SALES

Normalized sales by level of in-store promotion:

| Store No. | High | Medium | Low |
|---|---|---|---|
| 1 | 10 | 8 | 5 |
| 2 | 9 | 8 | 7 |
| 3 | 10 | 7 | 6 |
| 4 | 8 | 9 | 4 |
| 5 | 9 | 6 | 5 |
| 6 | 8 | 4 | 2 |
| 7 | 9 | 5 | 3 |
| 8 | 7 | 5 | 2 |
| 9 | 7 | 6 | 1 |
| 10 | 6 | 4 | 2 |
| Column totals | 83 | 62 | 37 |
| Category means, $\bar{Y}_j$ | 83/10 = 8.3 | 62/10 = 6.2 | 37/10 = 3.7 |

Grand mean, $\bar{Y}$ = (83 + 62 + 37)/30 = 6.067
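The column totals, category means and grand mean can be checked with a short script. The data are the thirty sales figures from the table; the variable names (`sales`, `totals`, `means`, `grand_mean`) are illustrative choices, not from the original.

```python
# Normalized sales for stores 1-10 under each level of in-store promotion.
sales = {
    "High":   [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    "Medium": [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    "Low":    [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

# Column totals and category means.
totals = {level: sum(values) for level, values in sales.items()}
means = {level: totals[level] / len(values) for level, values in sales.items()}

# Grand mean over all observations.
n_total = sum(len(values) for values in sales.values())
grand_mean = sum(totals.values()) / n_total

print(totals)
print(means)
print(round(grand_mean, 3))
```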
To test the null hypothesis, the various sums of squares are computed as follows:

$$
\begin{aligned}
SS_y ={}& (10-6.067)^2 + (9-6.067)^2 + (10-6.067)^2 + (8-6.067)^2 + (9-6.067)^2 \\
&+ (8-6.067)^2 + (9-6.067)^2 + (7-6.067)^2 + (7-6.067)^2 + (6-6.067)^2 \\
&+ (8-6.067)^2 + (8-6.067)^2 + (7-6.067)^2 + (9-6.067)^2 + (6-6.067)^2 \\
&+ (4-6.067)^2 + (5-6.067)^2 + (5-6.067)^2 + (6-6.067)^2 + (4-6.067)^2 \\
&+ (5-6.067)^2 + (7-6.067)^2 + (6-6.067)^2 + (4-6.067)^2 + (5-6.067)^2 \\
&+ (2-6.067)^2 + (3-6.067)^2 + (2-6.067)^2 + (1-6.067)^2 + (2-6.067)^2 \\
={}& (3.933)^2 + (2.933)^2 + (3.933)^2 + (1.933)^2 + (2.933)^2 \\
&+ (1.933)^2 + (2.933)^2 + (0.933)^2 + (0.933)^2 + (-0.067)^2 \\
&+ (1.933)^2 + (1.933)^2 + (0.933)^2 + (2.933)^2 + (-0.067)^2 \\
&+ (-2.067)^2 + (-1.067)^2 + (-1.067)^2 + (-0.067)^2 + (-2.067)^2 \\
&+ (-1.067)^2 + (0.933)^2 + (-0.067)^2 + (-2.067)^2 + (-1.067)^2 \\
&+ (-4.067)^2 + (-3.067)^2 + (-4.067)^2 + (-5.067)^2 + (-4.067)^2 \\
={}& 185.867
\end{aligned}
$$

Illustrative Applications of One-way Analysis of Variance

$$
\begin{aligned}
SS_x &= 10(8.3-6.067)^2 + 10(6.2-6.067)^2 + 10(3.7-6.067)^2 \\
&= 10(2.233)^2 + 10(0.133)^2 + 10(-2.367)^2 \\
&= 106.067
\end{aligned}
$$

$$
\begin{aligned}
SS_{error} ={}& (10-8.3)^2 + (9-8.3)^2 + (10-8.3)^2 + (8-8.3)^2 + (9-8.3)^2 \\
&+ (8-8.3)^2 + (9-8.3)^2 + (7-8.3)^2 + (7-8.3)^2 + (6-8.3)^2 \\
&+ (8-6.2)^2 + (8-6.2)^2 + (7-6.2)^2 + (9-6.2)^2 + (6-6.2)^2 \\
&+ (4-6.2)^2 + (5-6.2)^2 + (5-6.2)^2 + (6-6.2)^2 + (4-6.2)^2 \\
&+ (5-3.7)^2 + (7-3.7)^2 + (6-3.7)^2 + (4-3.7)^2 + (5-3.7)^2 \\
&+ (2-3.7)^2 + (3-3.7)^2 + (2-3.7)^2 + (1-3.7)^2 + (2-3.7)^2 \\
={}& (1.7)^2 + (0.7)^2 + (1.7)^2 + (-0.3)^2 + (0.7)^2 \\
&+ (-0.3)^2 + (0.7)^2 + (-1.3)^2 + (-1.3)^2 + (-2.3)^2 \\
&+ (1.8)^2 + (1.8)^2 + (0.8)^2 + (2.8)^2 + (-0.2)^2 \\
&+ (-2.2)^2 + (-1.2)^2 + (-1.2)^2 + (-0.2)^2 + (-2.2)^2 \\
&+ (1.3)^2 + (3.3)^2 + (2.3)^2 + (0.3)^2 + (1.3)^2 \\
&+ (-1.7)^2 + (-0.7)^2 + (-1.7)^2 + (-2.7)^2 + (-1.7)^2 \\
={}& 79.80
\end{aligned}
$$
It can be verified that $SS_y = SS_x + SS_{error}$ as follows:

$$185.867 = 106.067 + 79.80$$
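The hand computation can be cross-checked numerically. The following is a minimal sketch using only the Python standard library and the sales data from the example; the variable names are illustrative.

```python
from statistics import fmean

# Sales for stores 1-10 under High, Medium and Low in-store promotion.
groups = [
    [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],   # High
    [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],     # Medium
    [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],     # Low
]

observations = [y for g in groups for y in g]
grand_mean = fmean(observations)

# Total, between-category and within-category sums of squares.
ss_y = sum((y - grand_mean) ** 2 for y in observations)
ss_x = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
ss_error = sum((y - fmean(g)) ** 2 for g in groups for y in g)

print(round(ss_y, 3), round(ss_x, 3), round(ss_error, 3))
```

The three printed values should agree with the hand-computed 185.867, 106.067 and 79.80.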
The strength of the effect of X on Y is measured as follows:

$$\eta^2 = \frac{SS_x}{SS_y} = \frac{106.067}{185.867} = 0.571$$

In other words, 57.1% of the variation in sales (Y) is accounted for by in-store promotion (X), indicating a modest effect.
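As a small illustration, the ratio can be wrapped in a helper. The name `eta_squared` is an assumption made for this sketch; the inputs are the sums of squares computed in the example.

```python
def eta_squared(ss_x, ss_y):
    """Share of the total variation in Y that is accounted for by X."""
    return ss_x / ss_y


# Sums of squares from the in-store promotion example.
strength = eta_squared(ss_x=106.067, ss_y=185.867)
print(f"{strength:.3f} ({strength:.1%} of the variation in sales)")
```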





$$F = \frac{SS_x/(c - 1)}{SS_{error}/(N - c)} = \frac{MS_x}{MS_{error}}$$

$$F = \frac{106.067/(3-1)}{79.800/(30-3)} = 17.944$$

From the Statistical Tables, we can see that for 2 and 27 degrees of freedom, the critical value of F is 3.35 for $\alpha = 0.05$.

Because the calculated value of F is greater than the critical value, we reject the null hypothesis.
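The test can also be reproduced without statistical tables. The sketch below relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2 (that is, c = 3 categories): P(F > f) = (1 + 2f/df2)^(-df2/2). The function names are hypothetical; for other values of c a statistics library would be needed instead.

```python
def f_statistic(ss_x, ss_error, c, n_total):
    """F = MS_x / MS_error for a one-way ANOVA."""
    return (ss_x / (c - 1)) / (ss_error / (n_total - c))


def f_tail_prob_df1_2(f, df2):
    """P(F > f) for (2, df2) degrees of freedom.

    Closed form valid ONLY for numerator df = 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Critical value for (2, df2) df, obtained by inverting the tail formula."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


# In-store promotion example: c = 3 categories, N = 30 observations.
f_value = f_statistic(ss_x=106.067, ss_error=79.8, c=3, n_total=30)
critical = f_critical_df1_2(alpha=0.05, df2=27)
p_value = f_tail_prob_df1_2(f_value, df2=27)

print(round(f_value, 3), round(critical, 2), p_value < 0.05)
```

The computed critical value should match the tabulated 3.35, and the F statistic should match 17.944.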
Press Post Hoc after selecting the dependent variable and the factor.
Press continue to go back.
Press OK to get output
Another Example:
Three different versions of advertising copy have been created by an advertising agency
for a campaign. Let us call these versions of copy ADCOPY 1, 2 and 3. Now, the ad
agency wants to test which of these three versions of the advertising copy is preferred
by its target population, before they launch the campaign.

A sample of 18 respondents is selected from the target population in the nearby areas of
the city. At random, these 18 respondents are assigned to the 3 versions of ad copy.
Each version of ad copy is thus shown to six of the respondents.
The respondents are asked to rate their liking for the ad copy shown to them on a scale
of 1 to 10. (1 = Not liked at all, 10 = Liked a lot, and other values in between these
two). The ratings given by the 18 respondents are tabulated.
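| Sr. No. | Ad copy | Rating |
|---|---|---|
| 1 | 1 | 6.00 |
| 2 | 1 | 7.00 |
| 3 | 1 | 5.00 |
| 4 | 1 | 8.00 |
| 5 | 1 | 8.00 |
| 6 | 1 | 8.00 |
| 7 | 2 | 4.00 |
| 8 | 2 | 4.00 |
| 9 | 2 | 5.00 |
| 10 | 2 | 7.00 |
| 11 | 2 | 7.00 |
| 12 | 2 | 6.00 |
| 13 | 3 | 5.00 |
| 14 | 3 | 5.00 |
| 15 | 3 | 4.00 |
| 16 | 3 | 7.00 |
| 17 | 3 | 8.00 |
| 18 | 3 | 7.00 |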

Ratings

| Respondents | Adcopy1 | Adcopy2 | Adcopy3 |
|---|---|---|---|
| 1 | 6 | 4 | 5 |
| 2 | 7 | 4 | 5 |
| 3 | 5 | 5 | 4 |
| 4 | 8 | 7 | 7 |
| 5 | 8 | 7 | 8 |
| 6 | 8 | 6 | 7 |
$F_{2,15} = 7.70$
The codes in the ad copy column (1, 2, 3) indicate the different versions of the ad. The last column, rating, is the rating given by a respondent to the ad copy seen by him/her. Thus, six respondents have rated each ad. Please note that these eighteen respondents were randomly assigned to the three ad versions. This random assignment is called a completely randomised assignment or design.

This data is entered into a statistical package for performing a One-Way ANOVA, because we have only 1 categorical factor (Ad copy) at 3 levels (1, 2, 3) and 1 dependent variable (Rating).

Output



| Source of Variation | Sum of Squares | DF | Mean Square | F | Sig. of F |
|---|---|---|---|---|---|
| Between Groups | 7.000 | 2 | 3.500 | 1.780 | .203 |
| Within Groups | 29.500 | 15 | 1.967 | | |
| Total | 36.500 | 17 | 2.147 | | |

The null hypothesis for this F-test is that there is no difference in the mean ratings for the three ad copy versions.

$$H_0: M_1 = M_2 = M_3$$

where $M_1$, $M_2$ and $M_3$ are the mean ratings for the three versions of ad copy.

The significance of F (.203) is greater than 0.05. Thus, in this case, we fail to reject the null hypothesis at the 95 percent confidence level.

In other words, the Ratings given to the three ad
copy versions are not significantly different from
each other.
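The output table can be reproduced from the eighteen ratings with a short script. This is a sketch, not the statistical package's own procedure: the variable names are illustrative, and the significance is computed with a closed-form tail probability that is valid only because the numerator degrees of freedom equal 2 here (three ad copy versions).

```python
from statistics import fmean

# Ratings given by six respondents to each ad copy version.
ratings = {
    1: [6, 7, 5, 8, 8, 8],
    2: [4, 4, 5, 7, 7, 6],
    3: [5, 5, 4, 7, 8, 7],
}

values = [r for group in ratings.values() for r in group]
grand_mean = fmean(values)
c, n_total = len(ratings), len(values)

# Between-groups and within-groups sums of squares.
ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in ratings.values())
ss_within = sum((r - fmean(g)) ** 2 for g in ratings.values() for r in g)

ms_between = ss_between / (c - 1)
ms_within = ss_within / (n_total - c)
f_value = ms_between / ms_within

# Upper-tail probability of F; this closed form holds ONLY for numerator df = 2.
df2 = n_total - c
sig_of_f = (1 + 2 * f_value / df2) ** (-df2 / 2)

print(f"Between Groups  {ss_between:.3f}  {c - 1}  {ms_between:.3f}  {f_value:.3f}  {sig_of_f:.3f}")
print(f"Within Groups   {ss_within:.3f}  {df2}  {ms_within:.3f}")
print(f"Total           {ss_between + ss_within:.3f}  {n_total - 1}")
```

Since the computed significance is above 0.05, the sketch leads to the same conclusion as the output table.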
