Part 2: Statistics
Frequency Distribution
Definition: Frequency is the number of times a
certain event occurs
Example: Suppose a pediatrician is interested in the weights of newborn babies at Gandhi Hospital. The frequency graph of the weights (in grams) is given on the next slide.
What is the probability that the birth weight of a baby falls in the 2440 g class?
Frequency distribution
p=0.0677
Birthweight < 3000 g
How many of them?
Frequency distribution
28.57%
Frequency distribution
Frequency distribution
P(Birthweight=6000) → 0
Frequency distribution
Frequency distribution
We have used the empirical frequency distribution
to make certain predictions or make judgments and
decisions. In many cases we will make such
predictions not from empirical distributions, but on
the basis of theoretical considerations.
We may feel that data should be distributed in a certain way. If our observed data do not sufficiently conform to the values expected on the basis of these assumptions, we will have serious doubts about our assumptions.
Probability distribution
Discrete probability distributions
Binomial distribution
Negative Binomial distribution
Multinomial distribution
Geometric distribution
Hypergeometric distribution
Poisson distribution
The binomial distribution
Suppose that n independent experiments, or trials, are performed, where n is a fixed number, and that each experiment results in a "success" with probability p and a "failure" with probability q = 1 − p.
The total number of successes, X, is a
binomial random variable with parameters n
and p.
The Binomial Distribution
The probability that X = k, written p(k), can be found in the following way: any particular sequence of k successes occurs with probability p^k (1 − p)^(n−k), and there are n! / (k!(n−k)!) such sequences, so

p(k) = [n! / (k!(n−k)!)] · p^k · (1 − p)^(n−k)
The binomial distribution
E(X) = np
Var(X) = npq
The binomial distribution
(figure: binomial pmfs for n = 10, p = 0.1 and n = 10, p = 0.5)
Example of binomial distribution
Tay-Sachs disease is a rare but fatal disease of genetic origin occurring chiefly in infants and children. If a couple are both carriers of Tay-Sachs disease, their child has probability 0.25 of being born with the disease.
If such a couple has four children, what is
the frequency function for the number of
children that will have the disease?
Solution of the problem
P(X = 0) = C(4,0) · 0.25⁰ · 0.75⁴ = 1 · 1 · 0.3164 ≈ 0.316
P(X = 1) = C(4,1) · 0.25¹ · 0.75³ = 4 · 0.25 · 0.421875 ≈ 0.422
P(X = 2) = C(4,2) · 0.25² · 0.75² = 6 · 0.0625 · 0.5625 ≈ 0.211
P(X = 3) = C(4,3) · 0.25³ · 0.75¹ = 4 · 0.015625 · 0.75 ≈ 0.047
P(X = 4) = C(4,4) · 0.25⁴ · 0.75⁰ = 1 · 0.00391 · 1 ≈ 0.004
The binomial distribution (bar chart of the five probabilities: 0.316, 0.422, 0.211, 0.047, 0.004)
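The five probabilities above can be reproduced with a short sketch using Python's standard library (the helper name binom_pmf is my own, not from the slides):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    # P(X = k) for a binomial random variable with parameters n and p
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Tay-Sachs example: n = 4 children, p = 0.25 per child
probs = [binom_pmf(k, 4, 0.25) for k in range(5)]
# rounded: 0.316, 0.422, 0.211, 0.047, 0.004
```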
The Geometric Distribution
The geometric distribution is also
constructed from independent Bernoulli
trials like the binomial, but from an
infinite sequence. On each trial, a
success occurs with probability p, and X is
the total number of trials up to and
including the first success. In order that X = k, there must be k − 1 failures followed by a success, so

p(k) = (1 − p)^(k−1) · p
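A minimal sketch of this pmf (my own code, not from the slides; the name geom_pmf is assumed):

```python
def geom_pmf(k: int, p: float) -> float:
    # P(X = k): k - 1 failures followed by the first success
    return (1 - p)**(k - 1) * p

# e.g. p = 0.25: probability the first success occurs on trial 3
p3 = geom_pmf(3, 0.25)   # 0.75**2 * 0.25 = 0.140625
```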
The geometric distribution
The hypergeometric distribution
Suppose that an urn contains n balls, of
which r are black and n-r are white.
Let X denote the number of black balls
drawn when taking m balls without
replacement. So
P(X = k) = p(k) = [C(r, k) · C(n − r, m − k)] / C(n, m)
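A sketch of the hypergeometric pmf using math.comb (the helper name is my own):

```python
from math import comb

def hypergeom_pmf(k: int, n: int, r: int, m: int) -> float:
    # P(X = k) black balls when drawing m balls, without replacement,
    # from an urn of n balls of which r are black
    return comb(r, k) * comb(n - r, m - k) / comb(n, m)

# urn with n = 10 balls, r = 4 black; draw m = 3
p1 = hypergeom_pmf(1, 10, 4, 3)   # C(4,1)*C(6,2)/C(10,3) = 60/120 = 0.5
```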
The hypergeometric distribution
The Poisson distribution
The Poisson distribution can be derived as the limit of a binomial distribution as the number of trials n approaches infinity and the probability of success on each trial p approaches zero in such a way that np = λ:

p(k) = λ^k · e^(−λ) / k!
The Poisson distribution
E(X) = λ
Var(X) = λ
The Poisson distribution
Example
Two dice are rolled 100 times, and the
number of double sixes X is counted.
The distribution of X is binomial with
n=100 and p=1/36=0.0278. Since n is
large and p is small, we can
approximate the binomial probabilities
by Poisson probabilities with λ=np=2.78
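A quick numerical check of this approximation (my own sketch, not from the slides):

```python
from math import comb, exp, factorial

n, p = 100, 1 / 36       # two dice rolled 100 times; "success" = double six
lam = n * p              # Poisson parameter λ = np ≈ 2.78

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return lam**k * exp(-lam) / factorial(k)

# the two pmfs agree closely for small k since n is large and p is small
diffs = [abs(binom_pmf(k) - poisson_pmf(k)) for k in range(10)]
```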
Another example
Suppose that an office receives
telephone calls as a Poisson process
with λ=0.5 per minute. The number of
calls in a 5-min interval follows a
Poisson distribution with parameter
ω=5λ=2.5. Thus, the probability of no
calls in 5-min interval is
2.5
p(k 0) e 0.082
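The same number in a two-line sketch:

```python
from math import exp

lam_per_min = 0.5
omega = 5 * lam_per_min      # Poisson parameter for a 5-minute interval
p_no_calls = exp(-omega)     # p(k = 0) = e^(-2.5)
```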
Continuous density functions
For the continuous random
variable, the role of the frequency
function is taken by a density
function f(x) which has the
properties:
f(x) ≥ 0 and ∫ f(x) dx = 1 (integrated over the whole range)

and

P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx
Probability Density Function
Examples of Continuous Density Functions
f(x) = 0 for x < a
f(x) = 1/(b − a) for a ≤ x ≤ b
f(x) = 0 for x > b
Uniform Density Function…
A function f(x) is called a probability density function over the range a ≤ x ≤ b if it meets the following requirements:
1) f(x) ≥ 0 for all x between a and b, and
2) the total area under the curve between a and b is 1.
(figure: constant density f(x) with area = 1 between a and b)
Uniform Distribution…
Consider the uniform probability distribution
(sometimes called the rectangular probability
distribution).
It is described by the function:
(figure: rectangular density between a and b)

F(x) = P(X ≤ x) = ∫ f(t) dt (integrated from a to x)
Example …
The amount of gasoline sold daily at a service station is
uniformly distributed with a minimum of 2,000 liters
and a maximum of 5,000 liters.
(figure: uniform density over 2,000 to 5,000 liters)
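Probabilities for this uniform density are just rectangle areas; a sketch (interval 2,500 to 3,000 liters is my own illustration, not from the slides):

```python
a, b = 2_000, 5_000          # liters sold daily, uniform on [a, b]
density = 1 / (b - a)

def uniform_prob(x0: float, x1: float) -> float:
    # P(x0 <= X <= x1) for X uniform on [a, b], with a <= x0 <= x1 <= b
    return (x1 - x0) * density

p = uniform_prob(2_500, 3_000)   # 500/3000 = 1/6
```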
The exponential density function
The exponential density function
The exponential density function
P(X ≤ x₀) = 1 − e^(−λx₀)
and
S(x₀) = P(X > x₀) = 1 − P(X ≤ x₀) = e^(−λx₀)
Example
Let X represent the lifetime of a washing machine. Suppose the average lifetime for this type of washing machine is 15 years.
a. What is the probability that this washing machine can be used for less than 6 years?
b. What is the probability that this washing machine can be used for more than 18 years?
Example
Approximating the area under the density graph between 0 and 6 as a rectangle plus a triangle (the density falls from f(0) ≈ 0.0667 to f(6) ≈ 0.0447):
P(X ≤ 6) ≈ 0.0447 × 6 + (0.0667 − 0.0447) × 6/2 = 0.3342
The exponential density function
P(X ≤ 6) = 1 − e^(−6/15) ≈ 0.3297
P(X > 18) = e^(−18/15) ≈ 0.3012
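A sketch reproducing both answers (function names are my own):

```python
from math import exp

mean_lifetime = 15            # years; rate is 1/15 per year

def cdf(x):                   # P(X <= x)
    return 1 - exp(-x / mean_lifetime)

def surv(x):                  # P(X > x)
    return exp(-x / mean_lifetime)

p_less_6 = cdf(6)     # about 0.3297
p_more_18 = surv(18)  # about 0.3012
```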
Poisson and exponential ...
Let Y be a Poisson random variable representing
the number of occurrences in the unit time interval
with the probability distribution
P(Y = k) = μ^k · e^(−μ) / k!,

where μ is the mean number of occurrences in this time interval. Then, if X represents the time of one occurrence, X has the exponential density function with mean

E(X) = 1/μ
Example
The average number of car accidents
on the highway in two days is 8. What
is the probability of no accident for
more than 3 days?
Example
The average number of car accidents on the
highway in one day is 4. Thus, the mean
time of one occurrence is 0.25 (day).
Let Y be the Poisson random variable with
mean 4 representing the number of car accidents in one day, while X is the exponential random variable with mean 0.25 representing the time until one accident occurrence.
The Normal distribution
The normal distribution plays a central role in
probability and statistics.
This distribution is also called the Gaussian
distribution after Carl Friedrich Gauss, who
proposed it as a model for measurement
errors (in 1809).
The normal distribution has been used as a
model for such diverse phenomena as a
person’s height, the distribution of IQ scores,
and the velocity of gas molecules.
The Normal Distribution…
The frequency curve of the normal probability distribution looks like the following graph. As can be seen, it has the shape of a bell. It is also symmetrical around the mean: the left side of the curve is a mirror image of the right side.
The normal distribution
The density function of the normal distribution depends on two parameters: μ, called the mean, and σ, called the standard deviation of the normal density (where −∞ < μ < ∞ and σ > 0):

f(x) = [1 / (σ√(2π))] · e^(−(x − μ)² / (2σ²))
The Normal Distribution…
(figure: normal densities with μ = 0 and μ = 4, and with σ = 1, 2, 3)
The normal distribution
(figure: cumulative normal distribution function and normal probability density function, with areas of 34.13%, 13.59% and 2.14% between successive standard deviations from the mean)
Standard normal distribution
The special case for which μ = 0 and σ = 1 is called the standard normal density.
Standard normal distribution
Probabilities for general normal random
variables can be evaluated in terms of
probabilities for standard normal
variables.
To demonstrate it we will use the
following property:
If X ~ N(μ, σ²) and Y = aX + b,
then Y ~ N(aμ + b, a²σ²).
Standard normal distribution
Suppose that X ~ N(μ, σ²) and we wish to find P(x₀ < X < x₁) for some numbers x₀ and x₁. Consider the random variable

Z = (X − μ) / σ,

where a = 1/σ and b = −μ/σ. We see that

Z ~ N(aμ + b, a²σ²) = N(μ/σ − μ/σ, (σ/σ)²) = N(0, 1)
Standard Normal Distribution…
As we shall see shortly, any normal distribution can be
converted to a standard normal distribution with simple
algebra. This makes calculations much easier.
Standard normal distribution
Therefore

F_X(x) = P(X ≤ x) = P((X − μ)/σ ≤ (x − μ)/σ) = P(Z ≤ (x − μ)/σ) = Φ((x − μ)/σ)

We thus have

P(x₀ < X < x₁) = F_X(x₁) − F_X(x₀) = Φ((x₁ − μ)/σ) − Φ((x₀ − μ)/σ)
Normal Distribution…
The normal distribution is described by two parameters:
its mean and its standard deviation . Increasing
the mean shifts the curve to the right…
Normal Distribution…
The normal distribution is described by two parameters:
its mean and its standard deviation . Increasing
the standard deviation “flattens” the curve…
Calculating Normal Probabilities…
(figure: standard normal density, shaded area to the left of z = 1.52)
P(Z < 1.52) = 0.5 + P(0 < Z < 1.52) = 0.5 + 0.4357 = 0.9357
Example 2: Return To Investment
The return on investment is normally
distributed with a mean of 10% and a standard
deviation of 5%. What is the probability of losing
money?
We want to determine P(X < 0). Thus,

P(X < 0) = P((X − μ)/σ < (0 − 10)/5)
 = P(Z < −2)
 = 0.5 − P(0 < Z < 2)
 = 0.5 − 0.4772
 = 0.0228
Finding Values of ZA…
Often we’re asked to find some value of Z for
a given probability, i.e. given an area (A)
under the curve, what is the corresponding
value of z (zA) on the horizontal axis that gives
us this area?
Finding Values of Z…
What value of z corresponds to an area under the curve of
2.5%? That is, what is z.025 ?
Using the values of Z
Similarly
P(-1.645 < Z < 1.645) = 0.90
Example
Scores on a certain standardized test, IQ scores, are approximately normally distributed with mean μ = 100 and standard deviation σ = 15.
If an individual is selected at random,
what is the probability that his score X
satisfies 120 < X < 130?
Example
We can calculate this probability by using
the standard normal distribution as
follows:
P(120 < X < 130) = P((120 − 100)/15 < Z < (130 − 100)/15)
 = P(1.33 < Z < 2) = Φ(2) − Φ(1.33)
 = 0.9772 − 0.9082 = 0.069
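The standard normal CDF Φ can be built from math.erf, so this probability can be checked without tables (a sketch, not from the slides):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    # standard normal CDF Phi(z), expressed via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 100, 15
# P(120 < X < 130) = Phi(2) - Phi(4/3), about 0.068
p = phi((130 - mu) / sigma) - phi((120 - mu) / sigma)
```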
Symmetry and kurtosis
In many cases an observed frequency
distribution departs obviously from
normality; thus statistics that measure the
nature and amount of departure are
useful.
We will focus on two types of departures
from normality: skewness and kurtosis.
Skewness
Skewness, which is another name for
asymmetry, means that one tail of the curve
is drawn out more than the other.
In such curves the mean and the median do
not coincide.
Curves are called skewed to the right or to the left, depending upon whether the right or the left tail is drawn out.
Kurtosis
If a symmetrical distribution is considered to
have a center, two shoulders, and two tails, the
kurtosis describes the proportions in the center
and in the tails with relation to those in the
shoulders.
There are three types of kurtosis: leptokurtic,
mesokurtic and platykurtic
We shall define the leptokurtic and platykurtic
curves.
Kurtosis
A leptokurtic curve has more items near
the center and at the tails, with fewer
items in the shoulders relative to a
normal distribution with the same mean
and variance.
Kurtosis
A platykurtic curve has fewer items at
the center and at the tails than the
normal curve, but has more items in the
shoulders. A bimodal distribution is an
extreme case of platykurtic distribution.
Skewness and kurtosis
The sample statistics for measuring skewness and kurtosis are called g₁ and g₂; they estimate the population parameters G₁ and G₂.

g₁ = [n / ((n − 1)(n − 2))] · Σ(Xᵢ − X̄)³ / s³

g₂ = [n(n + 1) / ((n − 1)(n − 2)(n − 3))] · Σ(Xᵢ − X̄)⁴ / s⁴ − 3(n − 1)² / ((n − 2)(n − 3))
Skewness and kurtosis
In a normal frequency distribution both
G1 and G2 are zero.
A negative g1 indicates skewness to the
left, a positive g1 skewness to the right.
A negative g2 indicates platykurtosis,
while a positive g2 shows leptokurtosis.
The absolute magnitudes of g₁ and g₂ carry little meaning on their own.
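A sketch (my own, not from the slides) of g₁ and g₂ with the statistics module; a symmetric sample gives g₁ = 0 and a negative g₂ (platykurtic):

```python
from statistics import mean, stdev

def g1(data):
    # sample skewness
    n, xbar, s = len(data), mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum((x - xbar)**3 for x in data) / s**3

def g2(data):
    # sample excess kurtosis
    n, xbar, s = len(data), mean(data), stdev(data)
    m4 = sum((x - xbar)**4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * m4 / s**4
            - 3 * (n - 1)**2 / ((n - 2) * (n - 3)))

sym = [1, 2, 3, 4, 5]   # symmetric sample
```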
Quantile measures of symmetry and kurtosis
Denoting the ith quartile by Qᵢ, we can define the Bowley coefficient of skewness. It is a measure that ranges from −1, for a distribution with extreme left skewness, through 0, for a symmetrical distribution, to 1, for a distribution with extreme right skewness:

skewness = (Q₃ + Q₁ − 2Q₂) / (Q₃ − Q₁)
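The Bowley coefficient can be sketched with statistics.quantiles (which uses the "exclusive" quartile method by default; the function name is my own):

```python
from statistics import quantiles

def bowley(data):
    # quartile-based skewness coefficient
    q1, q2, q3 = quantiles(data, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)

sym = [1, 2, 3, 4, 5]     # symmetric sample: coefficient 0
right = [1, 2, 3, 4, 10]  # long right tail: positive coefficient
```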
Quantile measures of kurtosis
A kurtosis measure based on the octiles Oᵢ (the 12.5%, 25%, 37.5%, and so on, points) was proposed in 1988 by Moors:

kurtosis = [(O₇ − O₅) + (O₃ − O₁)] / (Q₃ − Q₁)

It ranges from zero, for extreme platykurtosis, through 1.233, for the normal distribution, to infinity, for extreme leptokurtosis.
Graphic test for normality
Chi-Square Distribution and Chi-square Tests
Multinomial Experiments
Multinomial Experiment
A multinomial experiment is a probability experiment
consisting of a fixed number of trials in which there are
more than two possible outcomes for each independent
trial. (Unlike the binomial experiment in which there
were only two possible outcomes.)
Example:
A researcher claims that the distribution of favorite pizza
toppings among teenagers is as shown below.
Each outcome is classified into categories, and the probability for each possible outcome is fixed.

Topping | Frequency, f
Cheese | 41%
Pepperoni | 25%
Sausage | 15%
Mushrooms | 10%
Onions | 9%
Uses of Chi-Square Goodness-of-Fit Test
A Chi-Square Goodness-of-Fit Test is used to test whether a
frequency distribution fits an expected distribution.
To calculate the test statistic for the chi-square goodness-of-fit test,
the observed frequencies and the expected frequencies are used.
The observed frequency O of a category is the frequency for the
category observed in the sample data.
The expected frequency E of a category is the calculated
frequency for the category. Expected frequencies are obtained
assuming the specified (or hypothesized) distribution. The expected
frequency for the ith category is
Ei = npi
where n is the number of trials (the sample size) and pi is the
assumed probability of the ith category.
Observed and Expected Frequencies
Example:
200 teenagers are randomly selected and asked what their favorite
pizza topping is. The results are shown below.
Find the observed frequencies and the expected frequencies.
Chi-Square Goodness-of-Fit Test
Performing a Chi-Square Goodness-of-Fit Test
In Words | In Symbols
6. Calculate the test statistic. | χ² = Σ (O − E)² / E
7. Make a decision to reject or fail to reject the null hypothesis. | If χ² is in the rejection region, reject H₀; otherwise, fail to reject H₀.
8. Interpret the decision in the context of the original claim. |
Chi-Square Goodness-of-Fit Test
Example:
A researcher claims that the distribution of favorite pizza
toppings among teenagers is as shown below. 200
randomly selected teenagers are surveyed.
Topping Frequency, f
Cheese 39%
Pepperoni 26%
Sausage 15%
Mushrooms 12.5%
Onions 7.5%
Chi-Square Goodness-of-Fit Test
Example continued:

Topping | Observed frequency | Expected frequency
Cheese | 78 | 82
Pepperoni | 52 | 50
Sausage | 30 | 30
Mushrooms | 25 | 20
Onions | 15 | 18

Rejection region: α = 0.01, χ²₀ = 13.277

χ² = Σ (O − E)²/E = (78 − 82)²/82 + (52 − 50)²/50 + (30 − 30)²/30 + (25 − 20)²/20 + (15 − 18)²/18 ≈ 2.025
Conclusion: Fail to reject H0.
There is not enough evidence at the 1% level to reject the researcher's claim.
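The χ² statistic in this example can be verified in a few lines (a sketch using the observed and expected counts of the example):

```python
observed = [78, 52, 30, 25, 15]    # survey of 200 teenagers
expected = [82, 50, 30, 20, 18]    # 200 x the claimed proportions
chi2 = sum((o - e)**2 / e for o, e in zip(observed, expected))
# about 2.025, well below the critical value 13.277, so H0 is not rejected
```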
Chi-Square Test of Independence
Contingency Tables
Contingency Tables
An r × c contingency table shows the observed frequencies for two variables. The observed frequencies are arranged in r rows and c columns. The intersection of a row and a column is called a cell.

Age
Gender | 16–20 | 21–30 | 31–40 | 41–50 | 51–60 | 61 and older
Male | 32 | 51 | 52 | 43 | 28 | 10
Female | 13 | 22 | 33 | 21 | 10 | 6
Expected Frequency
Age
Gender | 16–20 | 21–30 | 31–40 | 41–50 | 51–60 | 61 and older | Total
Male | 32 | 51 | 52 | 43 | 28 | 10 | 216
Female | 13 | 22 | 33 | 21 | 10 | 6 | 105
Total | 45 | 73 | 85 | 64 | 38 | 16 | 321
Expected Frequency
Example continued:

Age
Gender | 16–20 | 21–30 | 31–40 | 41–50 | 51–60 | 61 and older | Total
Male | 32 | 51 | 52 | 43 | 28 | 10 | 216
Female | 13 | 22 | 33 | 21 | 10 | 6 | 105
Total | 45 | 73 | 85 | 64 | 38 | 16 | 321

E₁,₄ = 216 × 64 / 321 ≈ 43.07
E₁,₅ = 216 × 38 / 321 ≈ 25.57
E₁,₆ = 216 × 16 / 321 ≈ 10.77
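The expected-frequency rule, E = (row total × column total) / n, can be sketched as (my own code, not from the slides):

```python
row_totals = [216, 105]
col_totals = [45, 73, 85, 64, 38, 16]
n = sum(row_totals)   # 321 drivers in the sample

def expected(i: int, j: int) -> float:
    # E_ij = (total of row i) * (total of column j) / n
    return row_totals[i] * col_totals[j] / n

e_14 = expected(0, 3)   # 216 * 64 / 321, about 43.07
```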
Chi-Square Independence Test
Chi-Square Independence Test
Performing a Chi-Square Independence Test
In Words | In Symbols
6. Calculate the test statistic. | χ² = Σ (O − E)² / E
Chi-Square Independence Test
Example:
The following contingency table shows a random sample
of 321 fatally injured passenger vehicle drivers by age
and gender. The expected frequencies are displayed in
parentheses. At α = 0.05, can you conclude that the drivers' ages are related to gender in such accidents?
Age
Gender | 16–20 | 21–30 | 31–40 | 41–50 | 51–60 | 61 and older | Total
Male | 32 (30.28) | 51 (49.12) | 52 (57.20) | 43 (43.07) | 28 (25.57) | 10 (10.77) | 216
Female | 13 (14.72) | 22 (23.88) | 33 (27.80) | 21 (20.93) | 10 (12.43) | 6 (5.23) | 105
Total | 45 | 73 | 85 | 64 | 38 | 16 | 321
Student's t statistic:

t = (X̄ − μ) / (S / √n) = [(X̄ − μ) / (σ/√n)] / √( [(n − 1)S² / σ²] / (n − 1) )

the ratio of a standard normal variable to the square root of an independent χ²ₙ₋₁ variable, (n − 1)S²/σ², divided by its degrees of freedom n − 1.
Student’s t distribution
g(t) = K · (1 + t²/ν)^(−(ν + 1)/2)

where

K = G((ν + 1)/2) / [√(πν) · G(ν/2)]

and G(k) = k × (k − 1) × (k − 2) × (k − 3) × … × 2 × 1, with ν denoting the degrees of freedom.
Relationship between Normal & t Distributions
Section 3
COMPARING TWO VARIANCES
THE F DISTRIBUTION
The F Distribution
If V ~ χ²ₖ₁ and W ~ χ²ₖ₂, with V and W independent, then

F = (V / k₁) / (W / k₂)

Application ({Xᵢ} and {Yⱼ} independent):

X₁, …, Xₙ₁ ~ iid N(μ₁, σ₁²)
Y₁, …, Yₙ₂ ~ iid N(μ₂, σ₂²)

V = (n₁ − 1)S₁² / σ₁² ~ χ²ₙ₁₋₁,  W = (n₂ − 1)S₂² / σ₂² ~ χ²ₙ₂₋₁,  V and W independent

F = (V/k₁) / (W/k₂) = { [(n₁ − 1)S₁²/σ₁²] / (n₁ − 1) } / { [(n₂ − 1)S₂²/σ₂²] / (n₂ − 1) } = (S₁²/σ₁²) / (S₂²/σ₂²)
F-Distribution
Let s12 and s22 represent the sample variances of two
different populations. If both populations are normal and the
population variances σ12 and σ22 are equal, then the sampling
distribution of
F = s₁² / s₂²

is called an F-distribution. The frequency function is given by

h(f) = [G((k₁ + k₂)/2) / (G(k₁/2) · G(k₂/2))] · (k₁/k₂)^(k₁/2) · f^(k₁/2 − 1) · (1 + k₁f/k₂)^(−(k₁ + k₂)/2),  f ≥ 0
There are several properties of this distribution.
F-Distribution
1. The F-distribution is a family of curves each of which is
determined by two types of degrees of freedom: the degrees
of freedom corresponding to the variance in the numerator,
denoted d.f.N, and the degrees of freedom corresponding to
the variance in the denominator, denoted d.f.D.
F-Distribution
The F-Distribution
Finding Critical Values for the F-Distribution
Critical Values for the F-Distribution
Example:
Find the critical F-value for a right-tailed test when
= 0.05, d.f.N = 5 and d.f.D = 28.
Appendix B, Table 7: F-Distribution (α = 0.05)

d.f.D \ d.f.N | 1 | 2 | 3 | 4 | 5 | 6
1 | 161.4 | 199.5 | 215.7 | 224.6 | 230.2 | 234.0
2 | 18.51 | 19.00 | 19.16 | 19.25 | 19.30 | 19.33
3 | 10.13 | 9.55 | 9.28 | 9.12 | 9.01 | 8.94
4 | 7.71 | 6.94 | 6.59 | 6.39 | 6.26 | 6.16
5 | 6.61 | 5.79 | 5.41 | 5.19 | 5.05 | 4.95
6 | 5.99 | 5.14 | 4.76 | 4.53 | 4.39 | 4.28
7 | 5.59 | 4.74 | 4.35 | 4.12 | 3.97 | 3.87
… | | | | | |
27 | 4.21 | 3.35 | 2.96 | 2.73 | 2.57 | 2.46
28 | 4.20 | 3.34 | 2.95 | 2.71 | 2.56 | 2.45
29 | 4.18 | 3.33 | 2.93 | 2.70 | 2.55 | 2.43

Reading row d.f.D = 28, column d.f.N = 5 gives the critical value F₀ = 2.56. (In the same table, F₀ = 4.53 is the critical value for d.f.N = 4 and d.f.D = 6.)
Two-Sample F-Test for Variances
A two-sample F-test is used to compare two population variances σ₁² and σ₂² when a sample is randomly selected from each population. The populations must be independent and normally distributed. The test statistic is

F = s₁² / s₂²

where s₁² and s₂² represent the sample variances with s₁² ≥ s₂². The degrees of freedom for the numerator is d.f.N = n₁ − 1 and the degrees of freedom for the denominator is d.f.D = n₂ − 1, where n₁ is the size of the sample having variance s₁² and n₂ is the size of the sample having variance s₂².
Two-Sample F-Test for Variances
Using a Two-Sample F-Test to Compare σ₁² and σ₂²
In Words In Symbols
1. Identify the claim. State the null State H0 and Ha.
and alternative hypotheses.
Two-Sample F-Test for Variances
Using a Two-Sample F-Test to Compare σ₁² and σ₂²
In Words | In Symbols
5. Determine the rejection region. |
6. Calculate the test statistic. | F = s₁² / s₂²
7. Make a decision to reject or fail to reject the null hypothesis. | If F is in the rejection region, reject H₀; otherwise, fail to reject H₀.
8. Interpret the decision in the context of the original claim. |
Two-Sample F-Test
Example:
A travel agency’s marketing brochure indicates that
the standard deviations of hotel room rates for two cities are
the same. A random sample of 13 hotel room rates in one
city has a standard deviation of $27.50 and a random
sample of 16 hotel room rates in the other city has a
standard deviation of $29.75. Can you reject the agency’s
claim at α = 0.01?

Because 29.75 > 27.50, s₁² = 885.06 and s₂² = 756.25.
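A sketch of the F statistic for this example (placing the larger sample variance in the numerator, as the test requires):

```python
s1, s2 = 27.50, 29.75             # sample standard deviations of room rates
v1, v2 = s1**2, s2**2             # 756.25 and 885.0625
F = max(v1, v2) / min(v1, v2)     # about 1.17
# numerator d.f. comes from the sample with the larger variance (n = 16)
dfN, dfD = 16 - 1, 13 - 1
```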
More often than not, not all values observed are the
same. This is because there are various sources of
error which create differences between observations
gathered by a researcher. The analysis of variance is
the breakdown of such variabilities into their
component parts.
The technique of the analysis of variance is credited to the famous British statistician Sir Ronald A. Fisher.
One-Way ANOVA
One-way analysis of variance is a hypothesis-testing
technique that is used to compare means from three or
more populations. Analysis of variance is usually
abbreviated ANOVA.
Test Statistic for a One-Way ANOVA
Finding the Test Statistic for a One-Way ANOVA Test
Performing a One-Way ANOVA Test
In Words In Symbols
1. Identify the claim. State the null State H0 and Ha.
and alternative hypotheses.
In Words | In Symbols
5. Determine the rejection region. |
6. Calculate the test statistic. | F = MS_B / MS_W
7. Make a decision to reject or fail to reject the null hypothesis. | If F is in the rejection region, reject H₀; otherwise, fail to reject H₀.
8. Interpret the decision in the context of the original claim. |
ANOVA Summary Table
A table is a convenient way to summarize the results in a
one-way ANOVA test.
Variation | Sum of squares | Degrees of freedom | Mean squares | F
Between | SS_B | d.f.N | MS_B = SS_B / d.f.N | MS_B / MS_W
Within | SS_W | d.f.D | MS_W = SS_W / d.f.D |
Performing a One-Way ANOVA Test
Example:
The following table shows the salaries of randomly
selected individuals from four large metropolitan areas. At α = 0.05, can you conclude that the mean salary is different in at least one of the areas?

d.f.D = N − k = 20 − 4 = 16

Variation | Degrees of freedom | Mean squares
Within (Cities) | 16 | 14,909,904.97
Total | 19 |
End of Probability
Distributions and
Statistical Analysis
CHAPTER
DIFFERENT TYPES OF
HYPOTHESIS TESTING
CONTENTS OF THE CHAPTER
• Examples on Hypothesis Testing
• One-tailed and Two-tailed Hypotheses
• Test of One Mean
• Test of two means which are Independent
• Test of two means when samples are Dependent
• Test for a Single Proportion
• Test for Two Proportions
• Cross Tabulation Hypothesis (Chi-square test)
Hypothesis Testing
Some Important
Hypotheses
Bankers assumed high-income earners are more
profitable than low-income earners
Clients who carefully balance their checkbooks every
month and minimize fees due to overdrafts are
unprofitable checking account customers
Old clients are more likely to diminish CD balances
by large amounts compared to younger clients
This was non-intuitive because conventional wisdom
suggested that older clients have a larger portfolio of
assets and seek less risky investments
Data Analysis
Descriptive
Computing measures of central tendency and dispersion, as well as constructing one-way tables
Inferential
Data analysis aimed at testing specific
hypotheses is usually called inferential
analysis
Null and Alternative
Hypotheses
H₀: Null Hypothesis
Hₐ: Alternative Hypothesis
Steps in Conducting a
Hypothesis Test
Step 1. Set up H0 and Ha
Step 2. Identify the nature of the sampling
distribution curve and specify the appropriate
test statistic
Step 3. Determine whether the hypothesis
test is one-tailed or two-tailed
Steps in Conducting a
Hypothesis Test (Cont’d)
Step 4. Taking into account the specified significance
level, determine the critical value (two critical values
for a two-tailed test) for the test statistic from the
appropriate statistical table
Step 5. State the decision rule for rejecting H0
Step 6. Compute the value for the test statistic from
the sample data
Step 7. Using the decision rule specified in step 5, either reject H₀ or fail to reject H₀
Launching a Product Line
Into a
New Market Area
Martha, product manager for a line of apparel, wishes
to introduce the product line into a new market area
Survey of a random sample of 400 households in that
market showed a mean income per household of
$30,000. Martha strongly believes the product line
will be adequately profitable only in markets where
the mean household income is greater than $29,000.
Should Martha introduce the product line into the new
market?
Martha’s Criterion for
Decision Making
To reach a final decision, Martha has to make
a general inference (about the population)
from the sample data
Criterion: mean income across all households in the market area under consideration
If the mean population household income is
greater than $29,000, then Martha should
introduce the product line into the new market
Martha’s Hypothesis
Martha’s decision making is equivalent to
either accepting or rejecting the hypothesis:
The population mean household income in the
new market area is greater than $29,000
One-Tailed Hypothesis Test
The term one-tailed signifies that all z-values that would cause Martha to reject H₀ are in just one tail of the sampling distribution.

Is the population mean greater than $29,000?
H₀: μ ≤ $29,000
Hₐ: μ > $29,000
Type One and Type Two Errors
Type I error occurs if the null hypothesis is
rejected when it is true
Type II error occurs if the null hypothesis is
not rejected when it is false
Summary of Errors Involved in
Hypothesis Testing
Inference Based on Sample Data | Real State of Affairs: H₀ is True | Real State of Affairs: H₀ is False
H₀ is True | Correct decision (confidence level = 1 − α) | Type II error (P(Type II error) = β)
H₀ is False | Type I error (significance level = α) | Correct decision (power = 1 − β)
Scenario - Firms ABC & XYZ
Firm ABC
ABC should be more cautious
Firm XYZ
XYZ should be less cautious
Exhibit 1 Identifying the Critical Sample Mean
Value – Sampling Distribution
Sample mean (x̄) values greater than $29,000 (that is, x̄-values on the right-hand side of the sampling distribution centered on μ = $29,000) suggest that H₀ may be false. More important, the farther to the right x̄ is, the stronger is the evidence against H₀.
Martha’s Decision Rule for Rejecting
the Null Hypothesis
Criterion Value
Every sample mean x̄ has a corresponding standard normal deviate:

z = (x̄ − μ) / s_x̄

or, equivalently, x̄ = μ + z·s_x̄. Substituting x̄_c for x̄ and z_c for z:

x̄_c = μ + z_c·s_x̄, where z_c is the critical standard normal deviate.
Computing the Criterion Value
The standard deviation for the sample of 400 households is $8,000. The standard error of the mean is

s_x̄ = s / √n = $8,000 / √400 = $400

The critical mean household income x̄_c is found through the following two steps:
1. Determine the critical z-value, z_c. For α = .05, from Appendix 1, z_c = 1.645.
Test Statistic
z = (x̄ − μ) / s_x̄ = ($30,000 − $29,000) / $400 = 2.5
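Martha's z statistic can be reproduced in a few lines (a sketch, not from the text):

```python
from math import sqrt

mu0, xbar, s, n = 29_000, 30_000, 8_000, 400
se = s / sqrt(n)           # standard error of the mean: 8000/20 = 400
z = (xbar - mu0) / se      # 2.5, beyond the critical z_c = 1.645
```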
Exhibit 2 Critical Value for Rejecting
the Null Hypothesis
P - Value – Actual Significance Level
T-test
x̄ = $30,000, s = $8,000
From the t-table in Appendix 3, t_c = 1.71 for α = .05 and d.f. = 24.
Two-Tailed Hypothesis Test
Test of Two Means
Test of Two Means (Cont’d)
City 1: n₁ = 300, x̄₁ = 30, s₁ = 22
City 2: n₂ = 200, x̄₂ = 35, s₂ = 25

z = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(s₁²/n₁ + s₂²/n₂)
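A sketch of this two-sample z statistic with the numbers above (variable names are my own):

```python
from math import sqrt

n1, x1, s1 = 300, 30, 22
n2, x2, s2 = 200, 35, 25
# under H0, mu1 - mu2 = 0
z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)   # about -2.30
```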
Decision – Two-Tailed Test
Computing Z-value – Two-Tailed Test
Exhibit 5 Hypothesis Test Related to Mean
Exercising in Two Cities
The t- test for Independent Samples
Test statistic:
n₁ = 20, x̄₁ = 30, s₁ = 22
n₂ = 10, x̄₂ = 35, s₂ = 25
t = −0.56
National Insurance Company Study – Perceived Service
Quality Differences Between Males and Females
Group Statistics

Gender | N | Mean | Std. Deviation | Std. Error Mean
OQ, male | 137 | 7.87 | 2.26 | .19
OQ, female | 126 | 7.83 | 2.31 | .21
Test of Two Means When Samples Are Dependent
Sales Per Store Before and After a
Promotional Campaign
Sales per Store (In Thousands)
Test of Two Means When Samples Are Dependent
(Cont’d)
x̄_d = 50/10 = 5
Test of Two Means When Samples Are Dependent
(Cont’d)
Test statistic:

t = (x̄_d − μ_d) / (s_d / √n) = 2.10
Test for a Single Proportion
Test for a Single Proportion (Cont’d)
H₀: p ≤ 0.3
Hₐ: p > 0.3
(p is the symbol for the population proportion)
Test for a Single Proportion (Cont’d)
Test of Two Proportions: Choosing Between
Commercial X & Commercial Y For a New Product
Test of Two Proportions (Cont’d)
Question
Can Tom conclude that commercial Y will be
more effective in the total market for the new
product?
Criterion for Decision Making
Hypothesis
Null and Alternative Hypotheses
Commercial X | Commercial Y
Sample sizes: n₁ = 200 | n₂ = 200

H₀: p₁ ≥ p₂ (p₁ − p₂ ≥ 0)
Hₐ: p₁ < p₂ (p₁ − p₂ < 0)
Test of Two Proportions – Commercial
Sample Standard Error
Test of Two Proportions
Estimated standard error of (p̄₁ − p̄₂) = 0.042
Cross-Tabulations: Chi-square Contingency Test
Telecommunications Company
Table 3 Two-Way Tabulation of Bluetooth Technology and
Whether Customers Would Buy Cell Phone
Cross Tabulations - Hypotheses
Conducting the Test
Expected Values
E_ij = (nᵢ × nⱼ) / n

where nᵢ is the total for row i, nⱼ is the total for column j, and n is the overall sample size.
Computing Expected Values
Table 4 Observed and Expected Cell Frequencies
Chi-square Test Statistic
Computed χ² = 72.00
Since the computed Chi-square value is greater
than the critical value of 3.84, reject H0.
The apparent relationship between “Bluetooth
technology "and "would buy the cellular phone"
revealed by the sample data is unlikely to have
occurred because of chance
Interpretation
Cross-Tabulation Using SPSS
for National Insurance Company
Need to Conduct Chi-square
Test to Reach a Conclusion
Association Between Education and Customer’s
Willingness to recommend National to a Friend
National Insurance Company Study -
Chi-Square Test
For Chi-Square Assessment:
1. Select ANALYZE
2. Click on DESCRIPTIVE STATISTICS
3. Select CROSS-TABS
4. Move the variable “highest level of schooling” to ROW(s) box
5. Move “rec” to COLUMN(s) box;
6. Click on “STATISTICS”
7. Select CHI-SQUARE, CONTINGENCY COEFFICIENT, and
CRAMER’S V
8. Click on CELLS
9. Select OBSERVED and EXPECTED FREQUENCIES
10. Click CONTINUE
11. Click OK
National Insurance Company Study –
P-Value Significance
The actual significance level (p-value) = 0.019
The chances of getting a chi-square value as high as 10.007 when there is no relationship between education and recommendation are about 19 in 1,000
The apparent relationship between education
and recommendation revealed by the sample
data is unlikely to have occurred because of
chance
Jill and Tom can safely reject null hypothesis
Precautions in Interpreting
Cross Tabulation Results
Two-way tables cannot show conclusive evidence of a causal relationship
Watch out for small cell sizes
The risk of drawing erroneous inferences increases when more than two variables are involved
Exercise: Two-way Table Based on a Survey of
200 Hospital Patients:
Patients with heart disease        20     40
Patients without heart disease     80     60
Column totals                     100    100
Nonparametric Methods
Sign Test
Wilcoxon Signed-Rank Test
Mann-Whitney-Wilcoxon Test
Kruskal-Wallis Test
Rank Correlation
Sign Test
Example: Peanut Butter Taste Test
Hypotheses
H0: No preference for one brand over the other exists
Ha: A preference for one brand over the other exists
Sampling Distribution
Sampling distribution of the number of “+”
values if there is no brand preference:
μ = 0.5(30) = 15, σ = √(0.25(30)) = 2.74
Example: Peanut Butter Taste Test
Rejection Rule
Using 0.05 level of significance,
Reject H0 if z < -1.96 or z > 1.96
Test Statistic
z = (18 - 15)/2.74 = 3/2.74 = 1.095
Conclusion
Do not reject H0. There is insufficient evidence in
the sample to conclude that a difference in preference
exists for the two brands of peanut butter.
Fewer than 10 or more than 20 individuals would
have to have a preference for a particular brand in
order for us to reject H0.
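The large-sample sign test above is easy to reproduce; a sketch using the slide's figures (30 responses, 18 “+” values):

```python
import math

n = 30       # number of "+" or "-" responses
plus = 18    # observed number of "+" values

mu = 0.5 * n                  # mean under H0: no preference
sigma = math.sqrt(0.25 * n)   # standard deviation under H0 (about 2.74)

z = (plus - mu) / sigma
reject = abs(z) > 1.96        # two-tailed rejection rule at alpha = .05
```

This gives z ≈ 1.095, so H0 is not rejected, matching the slide's conclusion.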
Wilcoxon Signed-Rank Test
Example: Express Deliveries
Hypotheses
H0: The delivery times of the two services are the
same; neither offers faster service than the other.
Ha: Delivery times differ between the two services;
recommend the one with the smaller times.
Sampling Distribution
Sampling distribution of T if the populations
are identical:
μT = 0, σT = 19.62
Example: Express Deliveries
Rejection Rule
Using 0.05 level of significance,
Reject H0 if z < -1.96 or z > 1.96
Test Statistic
z = (T − μT)/σT = (44 − 0)/19.62 = 2.24
Conclusion
Reject H0. There is sufficient evidence in the
sample to conclude that a difference exists in the
delivery times provided by the two services.
Recommend using the NiteFlite service.
Mann-Whitney-Wilcoxon Test
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
First, rank the combined data from the lowest to
the highest values, with tied values being assigned
the average of the tied rankings.
Then, compute T, the sum of the ranks for the first
sample.
Then, compare the observed value of T to the
sampling distribution of T for identical populations.
The value of the standardized test statistic z will
provide the basis for deciding whether to reject H0.
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
Sampling Distribution of T for Identical Populations
• Mean
μT = (1/2) n1 (n1 + n2 + 1)
• Standard Deviation
σT = √( (1/12) n1 n2 (n1 + n2 + 1) )
• Distribution Form
Approximately normal, provided
n1 > 10 and n2 > 10
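These two formulas can be computed directly; plugging in n1 = n2 = 10 and the rank sum T = 86.5 used later in the Westin example reproduces the slide's figures:

```python
import math

def mww_null_params(n1, n2):
    """Mean and standard deviation of T, the rank sum of sample 1,
    when the two populations are identical."""
    mu = 0.5 * n1 * (n1 + n2 + 1)
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mu, sigma

mu_t, sigma_t = mww_null_params(10, 10)   # Westin example: n1 = n2 = 10
z = (86.5 - mu_t) / sigma_t               # T = 86.5 from the slides
```

This yields μT = 105, σT ≈ 13.23, and z ≈ −1.40.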
Example: Westin Freezers
Example: Westin Freezers
Sampling distribution of T if the populations
are identical:
μT = (1/2)(10)(21) = 105, σT = 13.23
Example: Westin Freezers
Rejection Rule
Using .05 level of significance,
Reject H0 if z < -1.96 or z > 1.96
Test Statistic
z = (T − μT)/σT = (86.5 − 105)/13.23 = −1.40
Conclusion
Do not reject H0. There is insufficient evidence in
the sample data to conclude that there is a difference
in the annual energy cost associated with the two
brands of freezers.
Kruskal-Wallis Test
Rank Correlation
Rank Correlation
rs = 1 − (6 Σ di²) / (n(n² − 1))
Test for Significant Rank Correlation
Rank Correlation
• Standard Deviation
σrs = 1/√(n − 1)
• Distribution Form
Approximately normal, provided n > 10
Example: Connor Investors
Rank Correlation
Connor Investors provides a portfolio
management service for its clients. Two of Connor’s
analysts rated ten investments from high (10) to low
(1) risk as shown below. Use rank correlation, with
α = .10, to comment on the agreement of the two
analysts’ ratings.
Investment A B C D E F G H I J
Analyst #1 1 4 9 8 6 3 5 7 2 10
Analyst #2 1 5 6 2 9 7 3 10 4 8
Example: Connor Investors
Analyst #1 Analyst #2
Investment Rating Rating Differ. (Differ.)2
A 1 1 0 0
B 4 5 -1 1
C 9 6 3 9
D 8 2 6 36
E 6 9 -3 9
F 3 7 -4 16
G 5 3 2 4
H 7 10 -3 9
I 2 4 -2 4
J 10 8 2 4
Sum = 92
Example: Connor Investors
Hypotheses
H0: ρs = 0 (No rank correlation exists.)
Ha: ρs ≠ 0 (Rank correlation exists.)
Sampling Distribution
Sampling distribution of
rs under the assumption
of no rank correlation
σrs = 1/√(10 − 1) = .333
μrs = 0
Example: Connor Investors
Rejection Rule
Using .10 level of significance,
Reject H0 if z < -1.645 or z > 1.645
Test Statistic
rs = 1 − (6 Σ di²)/(n(n² − 1)) = 1 − 6(92)/(10(100 − 1)) = 0.4424
z = (rs − μrs)/σrs = (.4424 − 0)/.3333 = 1.33
Conclusion
Do not reject H0. There is not a significant rank
correlation. The two analysts are not showing
agreement in their rating of the risk associated with
the different investments.
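The whole Connor calculation can be verified in a few lines of Python using the ranks from the slide:

```python
a1 = [1, 4, 9, 8, 6, 3, 5, 7, 2, 10]   # Analyst #1 ranks
a2 = [1, 5, 6, 2, 9, 7, 3, 10, 4, 8]   # Analyst #2 ranks
n = len(a1)

# Sum of squared rank differences
d2 = sum((x - y) ** 2 for x, y in zip(a1, a2))

# Spearman rank correlation: rs = 1 - 6*sum(d^2) / (n(n^2 - 1))
rs = 1 - 6 * d2 / (n * (n ** 2 - 1))

# Large-sample test: std dev of rs under H0 is 1/sqrt(n - 1)
sigma_rs = 1 / (n - 1) ** 0.5
z = rs / sigma_rs
```

This reproduces Σdi² = 92, rs ≈ .4424, and z ≈ 1.33, so H0 is not rejected at α = .10.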
MULTIVARIATE STATISTICAL
TECHNIQUES
DEFINITION OF MULTIVARIATE TECHNIQUES
All statistical techniques which simultaneously analyze more
than two variables on a sample of observations can be
categorized as multivariate techniques.
Multivariate analysis is a collection of methods for analyzing
data in which a number of observations are available for each
object.
In the analysis of many problems, it is helpful to have a
number of scores for each object. For instance, in the field of
intelligence testing, if we start with the theory that general
intelligence is reflected in a variety of specific performance
measures, then to study intelligence in the context of this
theory one must administer many tests of mental skills, such
as vocabulary, speed of recall, mental arithmetic, verbal
analogies and so on.
Example of Multivariate Variables
• The score on each test is one variable, Xi, and there are
several, k, of such scores for each object, represented
as X1, X2,…, Xk.
• Most of the research studies involve more than two
variables in which situation analysis is desired of the
association between one (at times many) criterion
variable and several independent variables
• Or we may be required to study the association
between variables having no dependency relationships.
• All such analyses are termed as multivariate analyses or
multivariate techniques.
• In brief, techniques that take account of the various
relationships among variables are termed multivariate
analyses or multivariate techniques.
GROWTH OF MULTIVARIATE TECHNIQUES
• Of late, multivariate techniques have emerged as a powerful
tool to analyze data represented in terms of many variables.
• The main reason is that a series of univariate analyses
carried out separately for each variable may, at times, lead to
incorrect interpretation of the results.
• This is so because univariate analysis does not consider the
correlation or inter-dependence among the variables.
• As a result, during the last fifty years, a number of statisticians
have contributed to the development of several multivariate
techniques.
• Today these techniques are being applied in many fields such
as economics, sociology, psychology, agriculture,
anthropology, biology and medicine.
CHARACTERISTICS AND APPLICATIONS
The basic objective underlying multivariate techniques is to
represent a collection of massive data in a simplified way.
In other words, multivariate techniques transform a mass of
observations into a smaller number of composite scores in
such a way that they may reflect as much information as
possible contained in the raw data obtained concerning a
research study.
Thus, the main contribution of these techniques is in
arranging a large amount of complex information involved
in the real data into a simplified visible form.
Mathematically, multivariate techniques consist in “forming
a linear composite vector in a vector subspace, which can
be represented in terms of projection of a vector onto …”
CHARACTERISTICS AND APPLICATIONS
For better appreciation and understanding of
multivariate techniques, one must be familiar with
fundamental concepts of linear algebra, vector spaces,
orthogonal and oblique projections and univariate
analysis.
Even then, before applying multivariate techniques for
meaningful results, one must consider the nature and
structure of the data and the real aim of the analysis.
We should also not forget that multivariate techniques
involve several complex mathematical computations and
as such can be utilized only where computing facilities
are available.
CLASSIFICATION OF MULTIVARIATE TECHNIQUES
Today, there exist a great variety of multivariate techniques which can
be conveniently classified into two broad categories viz., dependence
methods and interdependence methods.
This sort of classification depends upon the question: Are some of the
involved variables dependent upon others? If the answer is ‘yes’, we
have dependence methods; but in case the answer is ‘no’, we have
interdependence methods.
Two more questions are relevant for understanding the nature of
multivariate techniques. Firstly, in case some variables are
dependent, the question is how many variables are dependent?
The other question is, whether the data are metric or non-metric?
This means whether the data are quantitative, collected on interval
or ratio scale, or whether the data are qualitative, collected on
nominal or ordinal scale.
MULTIPLE DISCRIMINANT ANALYSIS:
We may briefly refer to the technical aspects of
discriminant analysis.
There happens to be a simple scoring system that assigns a
score to each individual or object.
This score is a weighted average of the individual’s numerical
values of his independent variables.
On the basis of this score, the individual is assigned to the ‘most
likely’ category.
For example, an individual is 20 years old, has an annual income
of USD 12,000, and has 10 years of formal education.
Let b1, b2, and b3 be the weights attached to the independent
variables of age, income and education respectively.
• The individual’s score (z), assuming a linear score, would
be:
z = b1(20) + b2(12000) + b3(10)
This numerical value of z can then be transformed into the
probability that the individual is an early user, a late user
or a non-user of the newly marketed consumer product
(here we are making three categories viz. early user, late
user or a non-user).
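The scoring step can be sketched as follows; the weights b1, b2, b3 below are hypothetical, since the slides do not give their estimated values:

```python
# Hypothetical discriminant weights (the slides leave b1, b2, b3 unspecified)
b1, b2, b3 = 0.05, 0.0001, 0.1

age, income, education = 20, 12000, 10

# Linear discriminant score from the slide: z = b1*age + b2*income + b3*education
z = b1 * age + b2 * income + b3 * education
```

With these illustrative weights, z = 0.05·20 + 0.0001·12000 + 0.1·10 = 3.2; the individual is then assigned to whichever category this score makes ‘most likely’.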
The numerical values and signs of the b’s indicate
the importance of the independent variables in
their ability to discriminate among the different
classes of individuals.
• Thus, through the discriminant analysis, the researcher can as
well determine which independent variables are most useful in
predicting whether the respondent is to be put into one group or
the other.
• In other words, discriminant analysis reveals which specific
variables in the profile account for the largest proportion of inter-
group differences.
In case only two groups of the individuals are to be formed on
the basis of several independent variables, we can then have a
model like this
FACTOR ANALYSIS:
SCORE MATRIX (or Matrix S)
It is assumed that scores on each measure are standardized
[i.e., xi = (Xi − X̄i)/si ].
This being so, the sum of scores in any column of the matrix,
S, is zero and the variance of scores in any column is 1.0.
Then factors (a factor is any linear combination of the
variables in a data matrix and can be stated in a general way
like: A = Wa·a + Wb·b + … + Wk·k)
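The standardization and the factor-as-linear-combination idea can be sketched with hypothetical data:

```python
import numpy as np

# Hypothetical raw scores: 5 objects (rows) on 3 measures (columns)
X = np.array([[10.,  2., 30.],
              [12.,  4., 28.],
              [ 8.,  6., 35.],
              [14.,  8., 25.],
              [11., 10., 32.]])

# Standardize each column: x = (X - column mean) / column standard deviation
S = (X - X.mean(axis=0)) / X.std(axis=0)

# As stated above, each column of S now sums to zero and has variance 1.0

# A factor is any linear combination of the standardized variables
W = np.array([0.6, 0.3, 0.1])   # hypothetical weights Wa, Wb, Wk
A = S @ W
```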
IMPORTANT METHODS OF FACTOR ANALYSIS
• The communality for each variable remains
undisturbed regardless of rotation, but the
eigenvalues change as a result of rotation.
Factor scores: Factor score represents the degree to
which each respondent gets high scores on the
group of items that load high on each factor.
• Factor scores can help explain what the factors
mean. With such scores, several other multivariate
analyses can be performed.
• We can now take up the important methods of
factor analysis.
CENTROID METHOD OF FACTOR ANALYSIS
This method of factor analysis, developed by L.L. Thurstone,
was quite frequently used until about 1950 before the
advent of large capacity high speed computers.*
The centroid method tends to maximize the sum of
loadings, disregarding signs; it is the method which
extracts the largest sum of absolute loadings for each
factor in turn.
It is defined by linear combinations in which all weights are
either + 1.0 or – 1.0.
The main merit of this method is that it is relatively simple,
can be easily understood and involves simpler
computations.
• If one understands this method, it becomes easy to
understand the mechanics involved in other methods of
factor analysis.
• Various steps involved in this method are as follows:
1) This method starts with the computation of a matrix of
correlations, R, wherein units are placed in the diagonal
spaces. The product moment formula is used for
working out the correlation coefficients.
2) If the correlation matrix so obtained happens to be
positive manifold (i.e., disregarding the diagonal
elements, each variable has a larger sum of positive
correlations than of negative
correlations), the centroid method requires that the weights for
all variables be +1.0. In other words, the variables are not
weighted; they are simply summed.
• But in case the correlation matrix is not a positive manifold,
then reflections must be made before the first centroid factor is
obtained.
3) The first centroid factor is determined as under:
a) The sum of the coefficients (including the diagonal unity) in
each column of the correlation matrix is worked out.
b) Then the sum of these column sums (T) is obtained.
c) The sum of each column obtained as per (a) above is divided
by the square root of T obtained in (b) above, resulting in
what are called centroid loadings.
d) This way each centroid loading (one loading for one variable)
is computed. The full set of loadings so obtained constitute
the first centroid factor (say A).
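Steps (a)–(d) can be sketched in a few lines; the correlation matrix R below is hypothetical:

```python
import numpy as np

# Hypothetical positive-manifold correlation matrix, unities in the diagonal
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

col_sums = R.sum(axis=0)            # step (a): column sums, diagonal included
T = col_sums.sum()                  # step (b): sum of the column sums
loadings = col_sums / np.sqrt(T)    # steps (c)-(d): first centroid factor A
```

A handy check on the arithmetic: the loadings always sum to √T.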
4) To obtain the second centroid factor (say B), one must first
obtain a matrix of residual coefficients. For this purpose,
the loadings for the two variables on the first centroid
factor are multiplied.
This is done for all possible pairs of variables (in each
diagonal space is the square of the particular factor
loading).
The resulting matrix of factor cross products may be
named as Q1. Then Q1 is subtracted element by element
from the original matrix of correlation, R, and the result
is the first matrix of residual coefficients, R1.*
After obtaining R1, one must reflect some of the
variables in it, meaning thereby that some of the
variables are given negative signs in the sum [This is
usually done by inspection.
The aim in doing this should be to obtain a reflected matrix,
R'1, which will have the highest possible sum of coefficients
(T)].
For any variable which is so reflected, the signs of all
coefficients in that column and row of the residual matrix are
changed.
When this is done, the matrix is named the ‘reflected matrix’,
from which the loadings are obtained in the usual way
(already explained in the context of first centroid factor), but
the loadings of the variables which were reflected must be
given negative signs.
The full set of loadings so obtained constitutes the second
centroid factor (say B).
Thus loadings on the second centroid factor are obtained
from R'1.
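The residual-matrix step can be sketched with a small hypothetical correlation matrix; Q1 holds the factor cross products and R1 the first residual coefficients:

```python
import numpy as np

# Hypothetical correlation matrix with unities in the diagonal
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# First centroid factor loadings (step 3)
A = R.sum(axis=0) / np.sqrt(R.sum())

# Step 4: cross products of the loadings; each diagonal entry of Q1
# is the square of the corresponding factor loading
Q1 = np.outer(A, A)

# First matrix of residual coefficients
R1 = R - Q1
```

Because the loadings were built from the column sums, every column of R1 sums to zero; roughly half its entries are therefore negative, which is why reflection is needed before extracting the next factor.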
5)For subsequent factors (C, D, etc.) the same process outlined
above is repeated.
After the second centroid factor is obtained, cross products
are computed, forming matrix Q2. This is then subtracted
from R1 (and not from R'1), resulting in R2.
To obtain a third factor (C), one should operate on R2 in the
same way as on R1.
First, some of the variables would have to be reflected to
maximize the sum of loadings, which would produce R'2.
Loadings would be computed from R'2 as they were from R'1.
Again, it would be necessary to give negative signs to the
loadings of variables which were reflected which would
result in third centroid factor (C).
PRINCIPAL-COMPONENTS METHOD OF FACTOR ANALYSIS
From the ∑aij and ∑yij , we may find bij of the original model,
transferring back from the p’s into the standardized X’s.
An alternative method for finding the factor loadings is as
under:
1) Correlation coefficients (by the product moment
method) between the pairs of k variables are worked
out and may be arranged in the form of a correlation
matrix, R, as under
ALTERNATIVE METHOD FOR FINDING THE FACTOR LOADINGS
MAXIMUM LIKELIHOOD (ML) METHOD OF FACTOR ANALYSIS
R-TYPE AND Q-TYPE FACTOR ANALYSES
• Factors emerge when there are high correlations within
groups of people. Q-type analysis is useful when the
object is to sort out people into groups based on their
simultaneous responses to all the variables.
• Factor analysis has been mainly used in developing
psychological tests (such as IQ tests, personality tests,
and the like) in the realm of psychology.
• In marketing, this technique has been used to look at
media readership profiles of people.
• Merits: The main merits of factor analysis can be stated
thus:
1) The technique of factor analysis is quite useful
when we want to condense and simplify the
multivariate data.
2) The technique is helpful in pointing out important and interesting
relationships among observed data that were there all the time, but
not easy to see from the data alone.
3) The technique can reveal the latent factors (i.e., underlying factors
not directly observed) that determine relationships among several
variables concerning a research study.
For example, if people are asked to rate different cold drinks (say,
Limca, Nova-cola, Gold Spot and so on) according to preference, a
factor analysis may reveal some salient characteristics of cold drinks
that underlie the relative preferences.
4) The technique may be used in the context of empirical clustering of
products, media or people i.e., for providing a classification scheme
when data scored on various rating scales have to be grouped
together.
LIMITATIONS OF FACTOR ANALYSIS
One should also be aware of several limitations of factor
analysis. Important ones are as follows:
1) Factor analysis, like all multivariate techniques, involves
laborious computations and a heavy cost burden.
With computer facility available these days, there is no
doubt that factor analysis has become relatively faster and
easier, but the cost factor continues to be the same i.e.,
large factor analyses are still bound to be quite expensive.
2) The results of a single factor analysis are generally
considered less reliable and dependable, for very often a
factor analysis starts with a set of imperfect data.
“The factors are nothing but blurred averages, difficult to
be identified.”
To overcome this difficulty, it has been realised that analysis
should at least be done twice. If we get more or less similar
results from all rounds of analyses, our confidence concerning
such results increases.
3) Factor-analysis is a complicated decision tool that can be used
only when one has thorough knowledge and enough experience
of handling this tool. Even then, at times it may not work well
and may even disappoint the user.
To conclude, we can state that in spite of all the said limitations
“when it works well, factor analysis helps the investigator make
sense of large bodies of intertwined data.
When it works unusually well, it also points out some
interesting relationships that might not have been obvious from
examination of the input data alone”.
CLUSTER ANALYSIS
Cluster analysis consists of methods of classifying
variables into clusters.
Technically, a cluster consists of variables that correlate
highly with one another and have comparatively low
correlations with variables in other clusters.
The basic objective of cluster analysis is to determine
how many mutually exclusive and exhaustive groups or
clusters, based on the similarities of profiles among
entities, really exist in the population and then to state
the composition of such groups.
Various groups to be determined in cluster analysis are
not predefined as happens to be the case in
discriminant analysis.
In general, cluster analysis contains the following steps to
be performed:
a) First of all, if some variables have a negative sum of
correlations in the correlation matrix, one must reflect
variables so as to obtain a maximum sum of positive
correlations for the matrix as a whole.
b) The second step consists in finding out the highest
correlation in the correlation matrix and the two variables
involved (i.e., having the highest correlation in the matrix)
form the nucleus of the first cluster.
c) Then one looks for those variables that correlate highly
with the said two variables and includes them in the cluster.
This is how the first cluster is formed.
d) To obtain the nucleus of the second cluster, we find two
variables that correlate highly but have low correlations
with members of the first cluster.
Variables that correlate highly with the said two variables
are then found. Such variables, along with the said two
variables, thus constitute the second cluster.
e) One proceeds on similar lines to search for a third cluster
and so on.
From the above description we find that clustering
methods in general are judgmental and are devoid of
statistical inferences.
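Steps (a)–(c) of the procedure can be sketched for a hypothetical correlation matrix; the cut-off used for “correlates highly” is a judgmental choice, as the text notes:

```python
import numpy as np

# Hypothetical correlation matrix for 5 variables
R = np.array([[1.0, 0.8, 0.7, 0.1, 0.2],
              [0.8, 1.0, 0.6, 0.2, 0.1],
              [0.7, 0.6, 1.0, 0.1, 0.2],
              [0.1, 0.2, 0.1, 1.0, 0.9],
              [0.2, 0.1, 0.2, 0.9, 1.0]])
k = R.shape[0]

# Step (b): the highest off-diagonal correlation gives the first nucleus
off = R - np.eye(k)                        # zero out the diagonal
i, j = np.unravel_index(off.argmax(), off.shape)

# Step (c): add every variable correlating highly with both nucleus members
threshold = 0.5                            # judgmental cut-off
cluster = {int(i), int(j)} | {v for v in range(k)
                              if v not in (i, j)
                              and min(R[v, i], R[v, j]) > threshold}
```

Here the first nucleus is variables 3 and 4 (r = .9); variables 0–2 would then form the nucleus of the second cluster.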
For problems concerning a large number of variables,
various cut-and-try methods have been proposed for
locating clusters.
McQuitty has specially developed a number of rather
elaborate computational routines* for that purpose.
In spite of the above-stated limitation, cluster analysis
has been found useful in the context of market research
studies.
Through the use of this technique we can segment the
market for a product on the basis of several
characteristics of the customers, such as personality,
socio-economic considerations, psychological factors,
purchasing habits, and the like.
Multidimensional Scaling
Multidimensional scaling (MDS) allows a researcher to
measure an item in more than one dimension at a time.
The basic assumption is that people perceive a set of
objects as being more or less similar to one another on
a number of dimensions (usually uncorrelated with one
another) instead of only one.
There are several MDS techniques (also known as
techniques for dimensional reduction) often used for
the purpose of revealing patterns of one sort or
another in interdependent data structures.
If data happen to be non-metric, MDS involves rank
ordering each pair of objects in terms of similarity.
Then the judged similarities are transformed into
distances through statistical manipulations and are
consequently shown in n-dimensional space in a way
that the interpoint distances best preserve the original
interpoint proximities.
After this sort of mapping is performed, the dimensions
are usually interpreted and labeled by the researcher.
The significance of MDS lies in the fact that it enables
the researcher to study
“The perceptual structure of a set of stimuli and the
cognitive processes underlying the development of this
structure.... MDS provides a mechanism for
determining the truly salient attributes without forcing
the judge to appear irrational”
• With MDS, one can scale objects, individuals or both with a
minimum of information.
• The MDS analysis will reveal the most salient attributes
which happen to be the primary determinants for making a
specific decision.
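For metric data, the classical (metric) flavour of MDS, a simpler relative of the non-metric procedure described above, can be sketched with numpy: square the distances, double-center, and embed with the top eigenvectors. The distance matrix below is hypothetical:

```python
import numpy as np

# Hypothetical pairwise distances among 4 objects (here they lie on a line)
D = np.array([[0., 1., 2., 3.],
              [1., 0., 1., 2.],
              [2., 1., 0., 1.],
              [3., 2., 1., 0.]])
n = D.shape[0]

# Double-center the squared distances
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J

# Embed in 2 dimensions using the largest eigenvalues/eigenvectors
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1]
coords = vecs[:, order[:2]] * np.sqrt(np.maximum(vals[order[:2]], 0))

# Interpoint distances in the embedding approximate the original proximities
D_hat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```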
LATENT STRUCTURE ANALYSIS
This type of analysis shares both of the objectives of factor
analysis viz., to extract latent factors and express
relationship of observed (manifest) variables with these
factors as their indicators and to classify a population of
respondents into pure types.
This type of analysis is appropriate when the variables
involved in a study do not possess dependency relationship
and happen to be non-metric.
In addition to the above stated multivariate techniques, we
may also describe the salient features of what is known as
“Path analysis”, a technique useful for decomposing the
total correlation between any two variables in a causal
system.
PATH ANALYSIS
The term ‘path analysis’ was first introduced by the biologist
Sewall Wright in 1934 in connection with decomposing the
total correlation between any two variables in a causal
system.
The technique of path analysis is based on a series of multiple
regression analyses with the added assumption of causal
relationship between independent and dependent variables.
This technique lays relatively heavy emphasis on the
heuristic use of a visual diagram, technically described as a
path diagram.
An illustrative path diagram showing interrelationships
between Fathers’ education, Fathers’ occupation, Sons’
education, Sons’ first occupation and Sons’ present
occupation can be shown in the figure below.
Path analysis makes use of standardized partial regression
coefficients (known as beta weights) as effect coefficients.
If linear additive effects are assumed, then through path
analysis a simple set of equations can be built up showing how
each variable depends on preceding variables.
“The main principle of path analysis is that any correlation
coefficient between two variables, or a gross or overall
measure of empirical relationship can be decomposed into a
series of parts: separate paths of influence leading through
chronologically intermediate variable to which both the
correlated variables have links.”
The merit of path analysis in comparison to correlational
analysis is that it makes possible the assessment of the relative
influence of each antecedent or explanatory variable, first by
making explicit the assumptions underlying the causal
connections and then by elucidating the indirect effect of the
explanatory variables.
The use of the path analysis technique requires the
assumption that there are linear, additive, asymmetric
relationships among a set of variables which can be measured
at least on a quasi-interval scale.
Each dependent variable is regarded as determined by the
variables preceding it in the path diagram, and a residual
variable, defined as uncorrelated with the other variables, is
postulated to account for the unexplained portion of the
variance in the dependent variable.
The determining variables are assumed for the analysis to be
given (exogenous in the model).
We may illustrate the path analysis technique in connection
with a simple problem of testing a causal model with three
explicit variables as shown in the following path diagram:
where the X variables are measured as deviations
from their respective means.
p21 may be estimated from the simple regression of
X2 on X1, i.e., X2 = b21X1, and p31 and p32 may be
estimated from the regression of X3 on X2 and X1 as
under:
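The two regressions can be sketched on simulated standardized data; the generating coefficients 0.6, 0.3 and 0.4 are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulate a simple causal system: X1 -> X2, and (X1, X2) -> X3
x1 = rng.standard_normal(n)
x2 = 0.6 * x1 + np.sqrt(1 - 0.6 ** 2) * rng.standard_normal(n)
x3 = 0.3 * x1 + 0.4 * x2 + 0.5 * rng.standard_normal(n)

def standardize(v):
    return (v - v.mean()) / v.std()

x1, x2, x3 = (standardize(v) for v in (x1, x2, x3))

# p21: slope of the simple regression of X2 on X1
p21 = (x1 @ x2) / (x1 @ x1)

# p31, p32: slopes of the multiple regression of X3 on X1 and X2
X = np.column_stack([x1, x2])
p31, p32 = np.linalg.lstsq(X, x3, rcond=None)[0]

# Decomposition: the total correlation r13 splits into a direct path (p31)
# and an indirect path through X2 (p32 * r12)
r12 = (x1 @ x2) / n
r13 = (x1 @ x3) / n
```

The identity r13 = p31 + p32·r12 holds exactly (it is just the normal equations of the regression), illustrating how path analysis decomposes a total correlation into direct and indirect parts.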