PROPORTIONS, TEST OF
INDEPENDENCE AND
GOODNESS OF FIT
Quantitative Analysis
Mayank Patel, CFA
Learning Objectives
In this chapter, you learn:
Testing the equality of population
proportions for three or more populations
Testing the independence of two categorical
variables
Testing whether a probability distribution for
a population follows a specific historical or
theoretical probability distribution.
Contingency Tables
Useful in situations involving multiple
population proportions
Used to classify sample observations
according to two or more characteristics
Also called a cross-classification table.
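Test for equality of proportions among c independent populations:
$H_0: \pi_1 = \pi_2 = \cdots = \pi_c$
$H_1$: Not all of the $\pi_j$ are equal ($j = 1, 2, \ldots, c$)
The Chi-square test statistic is:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
where:
$f_o$ = observed frequency in a particular cell of the table
$f_e$ = expected frequency in a particular cell if $H_0$ is true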
Computing the Overall Proportion
The overall proportion is:
$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n}$$
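Object to Record Sharing    Insurance Companies    Pharmacies    Medical Researchers
Yes                         410                    295           335
No                          90                     205           165

$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{410 + 295 + 335}{500 + 500 + 500} = 0.6933$$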
Observed (fo) and expected (fe) frequencies by organization:

Organization      Insurance Companies    Pharmacies       Medical Researchers
Yes               fo = 410               fo = 295         fo = 335
                  fe = 346.667           fe = 346.667     fe = 346.667
No                fo = 90                fo = 205         fo = 165
                  fe = 153.333           fe = 153.333     fe = 153.333
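The expected frequencies in the table above follow from the overall proportion: each group's expected "yes" count is the overall proportion times the group size, and the expected "no" count is the remainder. A minimal sketch of that arithmetic, using the observed counts from the slides (the variable names are illustrative, not from the source):

```python
# Observed (yes, no) counts per organization, taken from the slides.
observed = {
    "Insurance Companies": (410, 90),
    "Pharmacies": (295, 205),
    "Medical Researchers": (335, 165),
}

# Overall (pooled) proportion of "yes" answers across all groups.
total_yes = sum(yes for yes, _ in observed.values())
total_n = sum(yes + no for yes, no in observed.values())
p_bar = total_yes / total_n

# Expected frequencies under H0: every group shares the proportion p_bar.
expected = {
    group: (p_bar * (yes + no), (1 - p_bar) * (yes + no))
    for group, (yes, no) in observed.items()
}

print(f"overall proportion = {p_bar:.4f}")
for group, (fe_yes, fe_no) in expected.items():
    print(f"{group}: fe(yes) = {fe_yes:.3f}, fe(no) = {fe_no:.3f}")
```

Because all three samples have the same size (500), every group gets the same pair of expected frequencies.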
Cell contributions $(f_o - f_e)^2 / f_e$:

Object to Record Sharing    Insurance Companies    Pharmacies    Medical Researchers
Yes                         11.571                 7.700         0.3926
No                          26.159                 17.409        0.888

The Chi-square test statistic is:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 64.1196$$
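The sum over all cells can be checked with a few lines of code. This sketch takes the observed and expected frequencies as printed in the slides and adds up the six cell contributions. The critical value 5.991 is an assumption: it is the tabulated upper-tail value for 2 degrees of freedom at a 0.05 significance level, which the slides do not state explicitly for this example.

```python
# Observed and expected frequencies per cell, as printed in the slides.
# Order: (Insurance, Pharmacies, Researchers) for "yes", then for "no".
f_o = [410, 295, 335, 90, 205, 165]
f_e = [346.667, 346.667, 346.667, 153.333, 153.333, 153.333]

# Contribution of each cell: (fo - fe)^2 / fe.
contributions = [(o - e) ** 2 / e for o, e in zip(f_o, f_e)]
chi2 = sum(contributions)

# Degrees of freedom for a 2 x c table: (2 - 1)(c - 1).
df = (2 - 1) * (3 - 1)

# Assumed critical value: chi-square upper tail, alpha = 0.05, 2 d.f.
CHI2_UPPER = 5.991
reject_h0 = chi2 > CHI2_UPPER

print(f"chi-square = {chi2:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

The result agrees with the slide value of 64.1196 up to rounding of the individual cell contributions.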
Decision Rule:
If $\chi^2 > \chi^2_U$, reject H0;
otherwise, do not reject H0
Critical Range
$$\text{Critical range} = \sqrt{\chi^2_U}\,\sqrt{\frac{p_j(1 - p_j)}{n_j} + \frac{p_{j'}(1 - p_{j'})}{n_{j'}}}$$
Object to Record Sharing    Insurance Companies    Pharmacies     Medical Researchers
Yes                         410                    295            335
                            p1 = 0.82              p2 = 0.59      p3 = 0.67
No                          90                     205            165

Comparison               Absolute Difference    Critical Range
| Group 1 - Group 2 |    0.23                   0.06831808
| Group 1 - Group 3 |    0.15                   0.0664689
| Group 2 - Group 3 |    0.08                   0.074485617
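The pairwise comparison above (often called the Marascuilo procedure) declares two groups different when the absolute difference of their sample proportions exceeds the critical range for that pair. A minimal sketch, assuming the 0.05 upper-tail chi-square value with 2 degrees of freedom (5.991) and equal sample sizes of 500 per group as in the example:

```python
from itertools import combinations
from math import sqrt

# Assumed critical value: chi-square upper tail, alpha = 0.05, 2 d.f.
CHI2_UPPER = 5.991

# "Yes" counts and sample sizes for the three groups in the slides.
yes = {"Group 1": 410, "Group 2": 295, "Group 3": 335}
n = {group: 500 for group in yes}
p = {group: yes[group] / n[group] for group in yes}

results = {}
for a, b in combinations(p, 2):
    abs_diff = abs(p[a] - p[b])
    critical_range = sqrt(CHI2_UPPER) * sqrt(
        p[a] * (1 - p[a]) / n[a] + p[b] * (1 - p[b]) / n[b]
    )
    # A pair differs significantly when the difference exceeds its range.
    results[(a, b)] = (abs_diff, critical_range, abs_diff > critical_range)

for (a, b), (abs_diff, critical_range, significant) in results.items():
    print(f"|{a} - {b}| = {abs_diff:.2f}, "
          f"critical range = {critical_range:.4f}, significant: {significant}")
```

In this example every absolute difference exceeds its critical range, so all three pairs of proportions differ significantly.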
$\chi^2$ Test of Independence
Similar to the $\chi^2$ test for equality of more than two proportions
The Chi-square test statistic is:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
where:
$f_o$ = observed frequency in a particular cell of the r x c table
$f_e$ = expected frequency in a particular cell if $H_0$ is true
$\chi^2$ for the r x c case has (r-1)(c-1) degrees of freedom
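Decision Rule
The decision rule is: if $\chi^2 > \chi^2_U$, reject H0; otherwise, do not reject H0, where $\chi^2_U$ is taken from the chi-square distribution with (r-1)(c-1) degrees of freedom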
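Observed frequencies (the 20/wk column follows from the row and column totals):

Class Standing    20/wk    10/wk    none    Total
Fresh.            24       32       14      70
Soph.             22       26       12      60
Junior            10       14       6       30
Senior            14       16       10      40
Total             70       88       42      200

Expected frequency for one cell (Junior, 20/wk), if H0 is true:
$$f_e = \frac{30 \times 70}{200} = 10.5$$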
Expected frequencies if H0 is true:

Class Standing    20/wk    10/wk    none    Total
Fresh.            24.5     30.8     14.7    70
Soph.             21.0     26.4     12.6    60
Junior            10.5     13.2     6.3     30
Senior            14.0     17.6     8.4     40
Total             70       88       42      200
$$\chi^2 = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709$$
Example: Test of Independence
The test statistic is $\chi^2 = 0.709$; $\chi^2_U$ with 6 d.f. = 12.592 ($\alpha = 0.05$)
Decision Rule:
If $\chi^2 > 12.592$, reject H0; otherwise, do not reject H0
Here, $\chi^2 = 0.709 < \chi^2_U = 12.592$, so do not reject H0
Conclusion: there is insufficient evidence that meal plan and class standing are related.
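The meal plan example can be reproduced for a general r x c table by computing each expected frequency as (row total x column total) / n. In this sketch the 20/wk counts are not printed in the source and are inferred from the row and column totals, which is an assumption; the critical value 12.592 is the one quoted in the slides for 6 d.f.

```python
# Observed counts: rows = Fresh., Soph., Junior, Senior;
# columns = 20/wk, 10/wk, none.
# Assumption: the 20/wk column is inferred from the printed totals.
observed = [
    [24, 32, 14],
    [22, 26, 12],
    [10, 14, 6],
    [14, 16, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

CHI2_UPPER = 12.592  # critical value quoted in the slides (alpha = 0.05, 6 d.f.)
reject_h0 = chi2 > CHI2_UPPER

print(f"chi-square = {chi2:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

With these counts the statistic comes out at about 0.709, matching the slides, so H0 is not rejected.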
Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
where:
$f_i$ = observed frequency for category i
$e_i$ = expected frequency for category i
$k$ = number of categories
Reject H0 if $\chi^2 \geq \chi^2_\alpha$
Hypotheses
H0: pC = pL = pS = pA = .25
Ha: The population proportions are not
pC = .25, pL = .25, pS = .25, and pA = .25
where:
pC = population proportion that purchase a colonial
pL = population proportion that purchase a log cabin
pS = population proportion that purchase a split-level
pA = population proportion that purchase an A-frame
Rejection rule: Reject H0 if $\chi^2 \geq 7.815$ (upper-tail area .05, df = 3); otherwise, do not reject H0
Expected Frequencies
e1 = .25(100) = 25
e2 = .25(100) = 25
e3 = .25(100) = 25
e4 = .25(100) = 25
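Test Statistic
$$\chi^2 = \frac{(f_1 - 25)^2}{25} + \frac{(f_2 - 25)^2}{25} + \frac{(f_3 - 25)^2}{25} + \frac{(f_4 - 25)^2}{25} = 1 + 1 + 4 + 4 = 10$$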
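Area in Upper Tail          .10      .05      .025     .01       .005
$\chi^2$ Value (df = 3)     6.251    7.815    9.348    11.345    12.838
Because $\chi^2 = 10$ is between 9.348 and 11.345, the area in the upper tail is between .025 and .01.
Actual p-value is .0186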
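The exact p-value can be checked without statistical tables. For 3 degrees of freedom the chi-square upper-tail area has a closed form in terms of the complementary error function, so the standard library is enough. The helper name `chi2_upper_tail_df3` is illustrative, and the formula applies only to df = 3.

```python
from math import erfc, exp, pi, sqrt


def chi2_upper_tail_df3(x: float) -> float:
    """Upper-tail area P(X > x) for a chi-square variable with 3 d.f.

    Closed form valid only for 3 degrees of freedom:
    erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


p_value = chi2_upper_tail_df3(10.0)
print(f"p-value for chi-square = 10 with 3 d.f.: {p_value:.4f}")

# Sanity check against the tabulated critical value at the .05 level.
print(f"upper-tail area at 7.815: {chi2_upper_tail_df3(7.815):.4f}")
```

Since the p-value is below .05, the conclusion matches the critical value approach: H0 is rejected.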
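4. Compute the value of the test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
5. Reject H0 if $\chi^2 \geq \chi^2_\alpha$ (where $\alpha$ is the significance level)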
Example: IQ Computers
A simple random sample of 30 of the salespeople was taken and their numbers of units sold are listed below.

33    43    44    45    52    52    56    58    63    64
64    65    66    68    70    72    73    73    74    75
83    84    85    86    91    92    94    98    102   105
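Interval Definition
Six intervals of equal probability under the normal curve: Areas = 1.00/6 = .1667
Interval boundaries:
53.02 = 71 - .97(18.54)
63.03 = 71 - .43(18.54)
71
78.97 = 71 + .43(18.54)
88.98 = 71 + .97(18.54)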
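The interval boundaries can be reproduced from the sample itself. This sketch estimates the mean and standard deviation from the 30 observations, then asks the fitted normal distribution for the points that cut it into six equal-probability pieces. It uses unrounded z values, so the results differ from the slide values (which round z to .97 and .43) in the second decimal place.

```python
from statistics import NormalDist, mean, stdev

# Units sold by the 30 sampled salespeople (from the slides).
units_sold = [
    33, 43, 44, 45, 52, 52, 56, 58, 63, 64,
    64, 65, 66, 68, 70, 72, 73, 73, 74, 75,
    83, 84, 85, 86, 91, 92, 94, 98, 102, 105,
]

x_bar = mean(units_sold)  # sample mean
s = stdev(units_sold)     # sample standard deviation (n - 1 divisor)

k = 6  # number of equal-probability intervals
fitted = NormalDist(mu=x_bar, sigma=s)

# The k - 1 interior boundaries sit at cumulative probabilities 1/k, 2/k, ...
boundaries = [fitted.inv_cdf(i / k) for i in range(1, k)]

print(f"mean = {x_bar}, s = {s:.2f}")
print("boundaries:", [round(b, 2) for b in boundaries])
```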
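Interval              fi    ei    fi - ei
Less than 53.02       6     5     1
53.02 to 63.03        3     5     -2
63.03 to 71.00        6     5     1
71.00 to 78.97        5     5     0
78.97 to 88.98        4     5     -1
More than 88.98       6     5     1
Total                 30    30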
Test Statistic
$$\chi^2 = \frac{(1)^2}{5} + \frac{(-2)^2}{5} + \frac{(1)^2}{5} + \frac{(0)^2}{5} + \frac{(-1)^2}{5} + \frac{(1)^2}{5} = 1.600$$
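Area in Upper Tail          .90     .10      .05      .025     .01
$\chi^2$ Value (df = 3)     .584    6.251    7.815    9.348    11.345
Because $\chi^2 = 1.600$ is between .584 and 6.251, the area in the upper tail is between .90 and .10.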
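The whole IQ Computers calculation can be checked end to end: place each observation in its interval, compare the counts with the expected 5 per interval, and sum the contributions. The sketch uses the interval boundaries printed in the slides. The degrees of freedom are taken as k - 2 - 1 = 3 because two parameters (mean and standard deviation) are estimated from the sample, and the comparison value 7.815 assumes a .05 significance level.

```python
from bisect import bisect_right

# Units sold by the 30 sampled salespeople (from the slides).
units_sold = [
    33, 43, 44, 45, 52, 52, 56, 58, 63, 64,
    64, 65, 66, 68, 70, 72, 73, 73, 74, 75,
    83, 84, 85, 86, 91, 92, 94, 98, 102, 105,
]

# Interval boundaries as printed in the slides.
boundaries = [53.02, 63.03, 71.0, 78.97, 88.98]
k = len(boundaries) + 1  # six intervals

# Count the observations falling in each interval.
observed = [0] * k
for x in units_sold:
    observed[bisect_right(boundaries, x)] += 1

# Equal-probability intervals give the same expected count everywhere.
expected = [len(units_sold) / k] * k

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = k - 2 - 1  # two estimated parameters: mean and standard deviation

CHI2_UPPER = 7.815  # assumed: alpha = .05, 3 d.f.
reject_h0 = chi2 >= CHI2_UPPER

print("observed:", observed)
print(f"chi-square = {chi2:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

The statistic is well below the critical value, so the assumption that units sold follow a normal distribution is not rejected.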