Chapter 1
Introduction to Statistics
LEARNING OBJECTIVES
1. Define statistics.
2.
3.
4.
of the meaning of the term. For this reason, several perceptions of the word
statistics are given in the chapter.
Chapter 1 sets up the paradigm of inferential statistics. The student
will understand that while there are many useful applications of descriptive
statistics in business, the strength of the application of statistics in the field of
business is through inferential statistics. From this notion, we will later
introduce probability, sampling, confidence intervals, and hypothesis testing.
The process involves taking a sample from the population, computing a
statistic on the sample data, and making an inference (decision or
conclusion) back to the population from which the sample has been drawn.
In chapter 1, levels of data measurement are emphasized. Too many
texts present data to the students with no comment or discussion of how the
data were gathered or the level of data measurement. In chapter 7, there is
a discussion of sampling techniques. However, in this chapter, four levels of
data are discussed. It is important for students to understand that the
statistician is often given data to analyze without input as to how it was
gathered or the type of measurement. It is incumbent upon statisticians and
researchers to ascertain the level of measurement that the data represent so
that appropriate techniques can be used in analysis. Not all techniques
presented in this text can be appropriately used to analyze all data.
CHAPTER OUTLINE
1.1
Statistics in Business
Marketing
Management
Finance
Economics
Management Information Systems
1.2
1.3
Data Measurement
Nominal Level
Ordinal Level
Interval Level
Ratio Level
Comparison of the Four Levels of Data
Statistical Analysis Using the Computer: Excel and MINITAB
KEY TERMS
Census
Descriptive Statistics
Inferential Statistics
Interval Level Data
Metric Data
Non-metric Data
Nonparametric Statistics
Parameter
Parametric Statistics
Population
Sample
Statistic
Statistics
1.1
accounting - cost of goods, salary expense, depreciation, utility costs,
taxes, equipment inventory, etc.
finance - World bank bond rates, number of failed savings and loans,
measured risk of common stocks, stock dividends, foreign exchange
rate, liquidity rates for a single-family, etc.
1.2
manufacturing - size of punched hole, number of rejects, amount of
inventory, amount of production, number of production workers, etc.
healthcare - number of patients per physician per day, average cost of
hospital stay, average daily census of hospital, time spent waiting to
see a physician, patient satisfaction, number of blood tests done per
week.
1.3
1)
2)
3)
4)
1)
2)
3)
4)
1.4
decisions -
1)
2)
3)
1.5
1.6
1)
2)
3)
a) ratio
b) ratio
c) ordinal
d) nominal
e) ratio
f) ratio
g) nominal
h) ratio

a) ordinal
b) ratio
c) nominal
d) ratio
e) interval
f) interval
Page | 9
g) nominal
h) ordinal

1.7
a)
The population for this study is the 900 electric contractors who
purchased Rathburn wire.
b)
c)
d)
Chapter 2
Charts and Graphs
LEARNING OBJECTIVES
The overall objective of chapter 2 is for you to master several techniques for
summarizing and depicting data, thereby enabling you to:
1.
2.
3.
meaningful manner. One mechanism for data summarization is the
frequency distribution, which is essentially a way of organizing ungrouped or
raw data into grouped data. It is important to realize that there is
considerable art involved in constructing a frequency distribution. There are
nearly as many possible frequency distributions for a problem as there are
students in a class. Students should begin to think about the receiver or user
of their statistical product. For example, what class widths and class
endpoints would be most familiar and meaningful to the end user of the
distribution? How can the data best be communicated and summarized using
the frequency distribution?
The second part of chapter 2 presents various ways to depict data
using graphs. The student should view these graphical techniques as tools
for use in communicating characteristics of the data in an effective manner.
Most business students will have some type of management opportunity in
their field before their career ends. The ability to make effective
presentations and communicate their ideas in succinct, clear ways is an
asset. Through the use of graphics packages and such techniques as
frequency polygons, ogives, histograms, and pie charts, the manager can
enhance his/her personal image as a communicator and decision-maker. In
addition, the manager can emphasize that the final product (the frequency
polygon, etc.) is just the beginning. Students should be encouraged to study
the graphical output to recognize business trends, highs, lows, etc. and
realize that the ultimate goal for these tools is their usage in decision making.
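The "art" of choosing class widths and endpoints described above can be sketched quickly in code; a minimal illustration using numpy on made-up data (the data values and class edges here are hypothetical, not from the text):

```python
import numpy as np

# Hypothetical raw (ungrouped) data -- not from the text.
data = np.array([12, 15, 22, 27, 31, 33, 35, 38, 41, 44,
                 46, 52, 55, 58, 61, 63, 67, 71, 74, 79])

# Choosing class endpoints is where the "art" lies: pick widths and
# endpoints the end user will find natural (here, width 10).
edges = np.arange(10, 90, 10)            # classes 10-under 20, ..., 70-under 80
freq, _ = np.histogram(data, bins=edges)

for lo, hi, f in zip(edges[:-1], edges[1:], freq):
    print(f"{lo} - under {hi}: {f}")
```

Changing `edges` regenerates the whole distribution, which is why there are nearly as many possible frequency distributions for one problem as there are students in a class.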
CHAPTER OUTLINE
2.1
Frequency Distributions
Class Midpoint
Relative Frequency
Cumulative Frequency
2.2
Histograms
Frequency Polygons
Ogives
Pie Charts
Stem and Leaf Plots
Pareto Charts
2.3
KEY TERMS
Class Mark
Class Midpoint
Cumulative Frequency
Frequency Distribution
Frequency Polygon
Grouped Data
Histogram
Ogive
Pareto Chart
Pie Chart
Range
Relative Frequency
Scatter Plot
SOLUTIONS TO PROBLEMS IN CHAPTER 2
2.1
a)
Class Interval
Frequency
10 - under 25
25 - under 40
13
40 - under 55
11
55 - under 70
70 - under 85
8
50
b)
Class Interval
Frequency
10 - under 18
18 - under 26
26 - under 34
34 - under 42
42 - under 50
50 - under 58
58 - under 66
66 - under 74
74 - under 82
82 - under 90
c)
2.2
One possible frequency distribution is the one below with 12 classes
and class intervals of 2.
Class Interval
Frequency
39 - under 41
41 - under 43
43 - under 45
45 - under 47
10
47 - under 49
18
49 - under 51
13
51 - under 53
15
53 - under 55
15
55 - under 57
57 - under 59
59 - under 61
61 - under 63
The distribution reveals that only 13 of the 100 boxes of raisins contain 50 ± 1 raisins (49 - under 51). However, 71 of the 100 boxes of raisins contain between 45 and 55 raisins. It shows that there are five boxes that have 9 or more extra raisins (59-61 and 61-63) and two boxes that have 9 to 11 fewer raisins (39-41) than the boxes are supposed to contain.
2.3
Class Interval   Frequency   Class Midpoint   Relative Frequency   Cumulative Frequency
0 - 5                6            2.5          6/86 = .0698                6
5 - 10               8            7.5              .0930                 14
10 - 15             17           12.5              .1977                 31
15 - 20             23           17.5              .2674                 54
20 - 25             18           22.5              .2093                 72
25 - 30             10           27.5              .1163                 82
30 - 35              4           32.5              .0465                 86
TOTAL               86                            1.0000
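The relative and cumulative frequency columns follow mechanically from the class frequencies; a small sketch using the frequencies of problem 2.3:

```python
# Class frequencies for problem 2.3 (classes 0-5 through 30-35); total = 86.
freqs = [6, 8, 17, 23, 18, 10, 4]
total = sum(freqs)

cum = 0
for f in freqs:
    cum += f
    # relative frequency = f / total; cumulative frequency = running sum
    print(f"f={f:2d}  rel={f / total:.4f}  cum={cum}")
```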
2.4
Class Interval   Frequency   Class Midpoint   Relative Frequency   Cumulative Frequency
0 - 2               218            1               .436                 218
2 - 4               207            3               .414                 425
4 - 6                56            5               .112                 481
6 - 8                11            7               .022                 492
8 - 10                8            9               .016                 500
TOTAL               500                           1.000

2.5
2.6 Histogram:
Frequency Polygon:
2.7
Histogram:
Frequency Polygon:
2.8 Ogive:
2.9
STEM
LEAF
21
2, 8, 8, 9
22
0, 1, 2, 4, 6, 6, 7, 9, 9
23
0, 0, 4, 5, 8, 8, 9, 9, 9, 9
24
0, 0, 3, 6, 9, 9, 9
25
0, 3, 4, 5, 5, 7, 7, 8, 9
26
0, 1, 1, 2, 3, 3, 5, 6
27
0, 1, 3
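A stem and leaf plot like the one above can be generated programmatically; a minimal sketch (the values here are hypothetical, chosen only to produce stems in the same 21-25 range):

```python
from collections import defaultdict

# Hypothetical three-digit observations -- not the problem's data.
values = [212, 218, 218, 219, 220, 221, 230, 234, 240, 253]

stems = defaultdict(list)
for v in sorted(values):
    stems[v // 10].append(v % 10)   # stem = leading digits, leaf = last digit

for stem in sorted(stems):
    print(stem, ", ".join(str(leaf) for leaf in stems[stem]))
```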
2.10
Firm                   Proportion   Degrees
Caterpillar               .372        134
Deere                     .246         89
(firm name missing)       .144         52
Eaton                     .121         44
American Standard         .117         42
TOTAL                    1.000        361

Pie Chart:
Annual Sales
Eaton; 12%
Caterpillar; 37%
Deere; 25%
2.11
Company
Proportion
Degrees
Southwest
.202
73
Delta
.198
71
American
.181
65
United
.140
50
Northwest
.108
39
US Airways
.094
34
Continental
.078
28
TOTAL
1.001
360
Pie Chart:
Enplanements
Continental; 8%
Southwest; 20%
US Airways; 9%
Delta; 20%
United; 14%
American; 18%
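The degrees column of a pie-chart table is just proportion × 360, rounded to whole degrees; a sketch using the enplanement shares from problem 2.11:

```python
# Proportions from problem 2.11; degrees = proportion * 360.
shares = {"Southwest": .202, "Delta": .198, "American": .181,
          "United": .140, "Northwest": .108, "US Airways": .094,
          "Continental": .078}

degrees = {name: round(p * 360) for name, p in shares.items()}
for name, d in degrees.items():
    print(f"{name:12s} {shares[name]:.3f}  {d}")
print("total degrees:", sum(degrees.values()))
```

Rounding each proportion and slice separately is why the table's totals come out as 1.001 and 360 rather than exactly 1.000.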
2.12
Brand                  Proportion   Degrees
Pfizer                    .289        104
(brand name missing)      .259         93
Merck                     .125         45
Bristol-Myers Squibb      .120         43
Abbott Laboratories       .112         40
Wyeth                     .095         34
TOTAL                    1.000        359
Pie Chart:
Wyeth; 9%
Abbott Lab.; 11%
Pfizer; 29%
Merck; 13%
2.13
STEM   LEAF
1      3, 6, 7, 7, 7, 9, 9, 9
2      0, 3, 3, 5, 7, 8, 9, 9
3      2, 3, 4, 5, 7, 8, 8
4      1, 4, 5, 6, 6, 7, 7, 8, 8, 9
5      0, 1, 2, 2, 7, 8, 9
6      0, 1, 4, 5, 6, 7, 9
7      0, 7
The stem and leaf plot shows that the number of passengers per flight was relatively evenly distributed from the high teens through the sixties. Rarely was there a flight with at least 70 passengers. The 40s category contained the most flights (10).
2.14
Complaint             Number   % of Total
Busy Signal             420      56.45
(complaint missing)     184      24.73
(complaint missing)      85      11.42
Got Disconnected         37       4.97
(complaint missing)      10       1.34
Poor Connection           8       1.08
Total                   744      99.99
Pareto Chart: Customer Complaints (Minitab chart omitted; counts and cumulative percents as in the table above)
2.15
Scatter plot: Industrial Products versus Human Food (chart omitted)
2.16
Scatter plot: Sales versus Advertising (chart omitted)
2.17
Class Interval
Frequencies
16 - under 23
23 - under 30
30 - under 37
37 - under 44
44 - under 51
51 - under 58
TOTAL
30
10
11
2.18
Class Interval   Frequency   Midpoint   Rel. Freq.   Cum. Freq.
20 - under 25       17         22.5       .207           17
25 - under 30       20         27.5       .244           37
30 - under 35       16         32.5       .195           53
35 - under 40       15         37.5       .183           68
40 - under 45        8         42.5       .098           76
45 - under 50        6         47.5       .073           82
2.19
Class Interval    Frequency
50 - under 60        13
60 - under 70        27
70 - under 80        43
80 - under 90        31
90 - under 100        9
TOTAL               123

Histogram:

Frequency Polygon:
Ogive:
2.20
Label   Value   Proportion   Degrees
A         55       .180         65
B        121       .397        143
C         83       .272         98
D         46       .151         54
TOTAL    305      1.000        360

Pie Chart:
A; 18%
B; 40%
C; 27%
D; 15%
2.21
STEM
LEAF
28
4, 6, 9
29
0, 4, 8
30
1, 6, 8, 9
31
1, 2, 4, 6, 7, 7
32
4, 4, 6
33
2.22
Problem
Frequency
Percent of Total
673
26.96
29
1.16
108
4.33
379
15.18
73
2.92
564
22.60
12
0.48
402
16.11
54
2.16
202
8.09
10
2496
Pareto Chart:
(Minitab Pareto chart of problem categories omitted)

2.23
Class Interval
Frequency
32 - under 37
37 - under 42
42 - under 47
12
47 - under 52
11
52 - under 57
14
57 - under 62
62 - under 67
14
16
18
67 - under 72
TOTAL
1
50
2.25
Class Interval   Frequency   Class Midpoint   Relative Frequency   Cumulative Frequency
20 - under 25        8          22.5          8/53 = .1509                8
25 - under 30        6          27.5              .1132                 14
30 - under 35        5          32.5              .0943                 19
35 - under 40       12          37.5              .2264                 31
40 - under 45       15          42.5              .2830                 46
45 - under 50        7          47.5              .1321                 53
TOTAL               53                            .9999

2.26
Frequency Distribution:
Class Interval
Frequency
10 - under 20
20 - under 30
30 - under 40
40 - under 50
50 - under 60
12
60 - under 70
70 - under 80
80 - under 90
2
50
Histogram:
Frequency Polygon:
The distribution appears to peak near the center and diminish toward the end intervals.
2.27
Class Interval   Frequency   Cumulative Frequency
20 - under 25        8               8
25 - under 30        6              14
30 - under 35        5              19
35 - under 40       12              31
40 - under 45       15              46
45 - under 50        7              53
TOTAL               53
Histogram:
Frequency Polygon:
b. Ogive:
2.28
Asking Price                     Frequency   Cumulative Frequency
$ 80,000 - under $100,000           21              21
$100,000 - under $120,000           27              48
$120,000 - under $140,000           18              66
$140,000 - under $160,000           11              77
$160,000 - under $180,000            6              83
$180,000 - under $200,000            3              86
TOTAL                               86
Histogram: (chart omitted; class midpoints 90,000 through 190,000)

Frequency Polygon: (chart omitted)

Ogive: (chart omitted; class endpoints 80,000 through 200,000)
2.29
Amount Spent
Cumulative
on Prenatal Care
Frequency
Frequency
$ 0 - under $100
12
21
19
40
11
51
57
57
Histogram:
Frequency Polygon:
Ogive:
2.30
Cumulative
Price
Frequency
Frequency
14
23
17
40
$2.20 - under $2.35
16
56
18
74
82
87
87
Histogram:
Frequency Polygon:
Ogive:
2.31
Genre
Albums Sold
Proportion
Degrees
R&B           146.4    .29    104
Alternative   102.6    .21     76
Rap            73.7    .15     54
Country        64.5    .13     47
Soundtrack     56.4    .11     40
Metal          26.6    .05     18
Classical      14.8    .03     11
Latin          14.5    .03     11
TOTAL         499.5   1.00    361

Pie Chart:
Classical; 3%
Metal ; 5%
Soundtrack; 11%
Latin; 3%
R&B; 29%
Country; 13%
Rap; 15%
Alternative; 21%
2.32
Scatter plot: Manufactured Goods versus Agricultural Products (chart omitted)
2.33
Industry
Total Release
Proportion
Degrees
Chemicals
737,100,000
.366
132
Primary metals
566,400,000
.281
103
Paper
229,900,000
.114
41
109,700,000
.054
19
102,500,000
.051
18
Food
89,300,000
.044
16
Fabricated Metals
85,900,000
.043
15
Petroleum
63,300,000
.031
11
29,100,000
.014
Transportation
Equipment
Electrical
Equipment
TOTAL   0.998   360

Pie Chart:
Fab. Metals; 4%
Petro.; 3%
Elec.; 1%
Food; 4%
Trans. Equip.; 5%
Chem.; 37%
Plas. & Rubber; 5%
Paper; 11%
2.34
(Minitab Pareto chart omitted; the final categories include Labeling, 44 (8.8%), and Discoloration, 32 (6.4%))
2.35
STEM
LEAF
42
43
44
20, 40, 59
45
12
46
53, 54
47
30, 34, 58
48
49
63
50
48, 49, 90
51
66
52
53
38, 66, 66
54
31, 78
55
56
56
69
57
37, 50
58
59
2.36
STEM
19, 23
LEAF
22
00, 68
23
24
25
24, 55
26
27
42, 60, 64
28
14, 30
29
30
02, 10
2.38 Family practice is most prevalent, at about 20%, with pediatrics next at slightly less. A virtual tie exists between ob/gyn, general surgery, anesthesiology, and psychiatry at about 14% each.
2.39 The fewest audits performed is 12 and the most is 42. More companies (8) performed 27 audits than any other number. Thirty-five companies performed between 12 and 19 audits. Only 7 companies performed 40 or more audits.
2.40 There were relatively constant sales from January through August ($4 to $6 million). Each month from September through December sales increased, with December having the sharpest increase ($15 million in sales in December).
Chapter 3
Descriptive Statistics
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
6.
7.
8.
example, in tracking the price of a stock over a period of time, a financial
analyst might determine that the larger the standard deviation, the greater
the risk (because of swings in the price). However, because the size of a
standard deviation is a function of the mean and a coefficient of variation
conveys the size of a standard deviation relative to its mean, other financial
researchers prefer the coefficient of variation as a measure of the risk. That
is, it can be argued that a coefficient of variation takes into account the size
of the mean (in the case of a stock, the investment) in determining the
amount of risk as measured by a standard deviation.
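The point about risk can be made concrete with a short sketch comparing two hypothetical price series (the numbers are invented for illustration):

```python
import statistics

# Two hypothetical stock-price series -- not from the text.
stock_a = [40, 42, 41, 43, 44]        # small mean, small swings
stock_b = [140, 150, 130, 160, 120]   # larger mean, larger swings

def cv(prices):
    """Coefficient of variation: std dev as a percent of the mean."""
    return statistics.stdev(prices) / statistics.mean(prices) * 100

print(f"CV(A) = {cv(stock_a):.2f}%")
print(f"CV(B) = {cv(stock_b):.2f}%")
```

Here B's standard deviation is ten times A's, but relative to its mean (the size of the investment) its CV is only about three times larger, which is the sense in which the CV adjusts risk for the size of the mean.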
It should be emphasized that the calculation of measures of central
tendency and variability for grouped data is different than for ungrouped or
raw data. While the principles are the same for the two types of data,
implementation of the formulas is different. Computations of statistics from
grouped data are based on class midpoints rather than raw values; and for
this reason, students should be cautioned that group statistics are often just
approximations.
Measures of shape are useful in helping the researcher describe a
distribution of data. The Pearsonian coefficient of skewness is a handy tool for
ascertaining the degree of skewness in the distribution. Box and Whisker
plots can be used to determine the presence of skewness in a distribution and
to locate outliers. The coefficient of correlation is introduced here instead of
chapter 14 (regression chapter) so that the student can begin to think about
two-variable relationships and analyses and view a correlation coefficient as a
descriptive statistic. In addition, when the student studies simple regression
in chapter 14, there will be a foundation upon which to build. All in all,
chapter 3 is quite important because it presents some of the building blocks
for many of the later chapters.
CHAPTER OUTLINE
Correlation
KEY TERMS
Arithmetic Mean
Bimodal
Box and Whisker Plot
Chebyshev's Theorem
Correlation
Empirical Rule
Interquartile Range
Kurtosis
Leptokurtic
Measures of Shape
Measures of Variability
Median
Mesokurtic
Mode
Multimodal
Percentiles
Platykurtic
Quartiles
Range
Skewness
Standard Deviation
Sum of Squares of x
Variance
z Score
3.1
Mode

2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 7, 8, 8, 8, 9

The mode = 4 because 4 is the most frequently occurring value.
3.2
Median

Since there are an odd number of terms, the median is the middle number. The median = 4.

Using the formula, the median is located at the (n + 1)/2 th term = (15 + 1)/2 = 8th term. The 8th term = 4.
3.3
Median

Arrange terms in ascending order:
073, 167, 199, 213, 243, 345, 444, 524, 609, 682

There are 10 terms. Since there are an even number of terms, the median is the average of the middle two terms:
Median = (243 + 345)/2 = 588/2 = 294

Using the formula, the median is located at the (n + 1)/2 th term: n = 10, so (10 + 1)/2 = 5.5th term.
3.4
Mean

x: 17.3, 44.5, 31.6, 40.0, 52.8, 38.8, 30.1, 78.5;  Σx = 333.6

μ = Σx/N = 333.6/8 = 41.7
3.5
Mean

x: 7, -2, 5, 9, 0, -3, -6, -7, -4, -5, 2, -8;  Σx = -12

μ = Σx/N = -12/12 = -1 (as a population); x̄ = Σx/n = -12/12 = -1 (as a sample)
3.6
11, 13, 16, 17, 18, 19, 20, 25, 27, 28, 29, 30, 32, 33, 34
P35: i = (35/100)(15) = 5.25, round up to the 6th term: P35 = 19
P55: i = (55/100)(15) = 8.25, round up to the 9th term: P55 = 27
Q1 = P25: i = (25/100)(15) = 3.75, round up to the 4th term: Q1 = 17
Q2 = Median: located at the (15 + 1)/2 = 8th term: Q2 = 25
Q3 = P75: i = (75/100)(15) = 11.25, round up to the 12th term: Q3 = 30
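The location rule used throughout these solutions — i = (P/100)·n, round a fractional i up to the next whole term, and average two terms when i comes out whole — can be wrapped in a small helper (the function name is my own):

```python
import math

def percentile(p, data):
    """Textbook percentile rule: i = (p/100) * n; if i is fractional, take
    the next whole term; if i is whole, average the i-th and (i+1)-th terms."""
    values = sorted(data)
    n = len(values)
    i = (p / 100) * n
    if i != int(i):
        return values[math.ceil(i) - 1]   # round up to the next term
    k = int(i)
    return (values[k - 1] + values[k]) / 2

data = [11, 13, 16, 17, 18, 19, 20, 25, 27, 28, 29, 30, 32, 33, 34]
print("P35 =", percentile(35, data))   # i = 5.25 -> 6th term
print("Q1  =", percentile(25, data))   # i = 3.75 -> 4th term
```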
3.7
80, 94, 97, 105, 107, 112, 116, 116, 118, 119, 120, 127,
128, 138, 138, 139, 142, 143, 144, 145, 150, 162, 171, 172
n = 24

P20: i = (20/100)(24) = 4.8, round up to the 5th term: P20 = 107
P47: i = (47/100)(24) = 11.28, round up to the 12th term: P47 = 127
P83: i = (83/100)(24) = 19.92, round up to the 20th term: P83 = 145
Q1 = P25: i = (25/100)(24) = 6, so Q1 = average of the 6th and 7th terms = (112 + 116)/2 = 114
Q2 = Median: located at the (24 + 1)/2 = 12.5th term, so Q2 = (127 + 128)/2 = 127.5
Q3 = P75: i = (75/100)(24) = 18, so Q3 = average of the 18th and 19th terms = (143 + 144)/2 = 143.5
3.8

Mean = Σx/n = 18,245/15 = 1,216.33
Q2 = Median: located at the (15 + 1)/2 = 8th term = 1,233
P63: i = (63/100)(15) = 9.45, round up to the 10th term: P63 = 1,277
P29: i = (29/100)(15) = 4.35, round up to the 5th term: P29 = 1,119

3.9

Median: located at the (12 + 1)/2 = 6.5th position
Q3 = P75: i = (75/100)(12) = 9, so Q3 = the average of the 9th and 10th terms
P20: i = (20/100)(12) = 2.4, round up to the 3rd term: P20 = 2.12
P60: i = (60/100)(12) = 7.2, round up to the 8th term: P60 = 5.10
P80: i = (80/100)(12) = 9.6, round up to the 10th term: P80 = 7.88
P93: i = (93/100)(12) = 11.16, round up to the 12th term: P93 = 8.97
3.10

n = 17;  Mean = Σx/n = 61/17 = 3.588
Median: located at the (17 + 1)/2 = 9th term; Median = 4
Q3 = P75: i = (75/100)(17) = 12.75, round up to the 13th term: Q3 = 4
P11: i = (11/100)(17) = 1.87, round up to the 2nd term: P11 = 1
P35: i = (35/100)(17) = 5.95, round up to the 6th term: P35 = 3
P58: i = (58/100)(17) = 9.86, round up to the 10th term: P58 = 4
P67: i = (67/100)(17) = 11.39, round up to the 12th term: P67 = 4

3.11
x     x - μ       (x - μ)²
6      1.7143      2.9388
2     -2.2857      5.2244
4     -0.2857      0.0816
9      4.7143     22.2246
1     -3.2857     10.7958
3     -1.2857      1.6530
5      0.7143      0.5102
Σx = 30;  Σ|x - μ| = 14.2857;  Σ(x - μ)² = 43.4284

μ = Σx/N = 30/7 = 4.2857

a.) Range = 9 - 1 = 8
b.) M.A.D. = Σ|x - μ|/N = 14.2857/7 = 2.0408
c.) σ² = Σ(x - μ)²/N = 43.4284/7 = 6.2041
d.) σ = √6.2041 = 2.4908
e.) Numbers in order: 1, 2, 3, 4, 5, 6, 9
Q1 = P25: i = (25/100)(7) = 1.75, round up to the 2nd term: Q1 = 2
Q3 = P75: i = (75/100)(7) = 5.25, round up to the 6th term: Q3 = 6
IQR = Q3 - Q1 = 6 - 2 = 4
f.) z-scores:
z = (6 - 4.2857)/2.4908 = 0.69
z = (2 - 4.2857)/2.4908 = -0.92
z = (4 - 4.2857)/2.4908 = -0.11
z = (9 - 4.2857)/2.4908 = 1.89
z = (1 - 4.2857)/2.4908 = -1.32
z = (3 - 4.2857)/2.4908 = -0.52
z = (5 - 4.2857)/2.4908 = 0.29
3.12

x = 0, 2, 3, 4, 4, 5, 5, 9;  Σx = 32;  x̄ = Σx/n = 32/8 = 4
Σ|x - x̄| = 14;  Σ(x - x̄)² = 48

a) Range = 9 - 0 = 9
b) M.A.D. = Σ|x - x̄|/n = 14/8 = 1.75
c) s² = Σ(x - x̄)²/(n - 1) = 48/7 = 6.8571
d) s = √6.8571 = 2.6186
e) Numbers in order: 0, 2, 3, 4, 4, 5, 5, 9
Q1 = P25: i = (25/100)(8) = 2, so Q1 = average of the 2nd and 3rd terms = (2 + 3)/2 = 2.5
Q3 = P75: i = (75/100)(8) = 6, so Q3 = average of the 6th and 7th terms = (5 + 5)/2 = 5
3.13

a.) ORIGINAL FORMULA

x      x - μ      (x - μ)²
12    -9.167     84.034
23     1.833      3.360
19    -2.167      4.696
26     4.833     23.358
24     2.833      8.026
23     1.833      3.360
Σx = 127;  Σ(x - μ) = -0.002;  Σ(x - μ)² = 126.834

μ = Σx/N = 127/6 = 21.167
σ² = Σ(x - μ)²/N = 126.834/6 = 21.139
σ = 4.598

b.) SHORT-CUT FORMULA

x      x²
12     144
23     529
19     361
26     676
24     576
23     529
Σx = 127;  Σx² = 2815

σ² = (Σx² - (Σx)²/N)/N = (2815 - (127)²/6)/6 = (2815 - 2688.17)/6 = 126.83/6 = 21.138
σ = 4.598
P a g e | 87
3.14

Σx = 1387;  Σx² = 87,365;  n = 25
x̄ = Σx/n = 1387/25 = 55.48
s² = 433.9267;  s = 20.8309
3.15

Σx = 6886;  Σx² = 3,901,664;  n = 16
μ = Σx/n = 6886/16 = 430.375
σ² = 58,631.295;  σ = 242.139
3.16
14, 15, 18, 19, 23, 24, 25, 27, 35, 37, 38, 39, 39, 40, 44,
46, 58, 59, 59, 70, 71, 73, 82, 84, 90
Q1 = P25: i = (25/100)(25) = 6.25, round up to the 7th term: Q1 = 25
Q3 = P75: i = (75/100)(25) = 18.75, round up to the 19th term: Q3 = 59
IQR = Q3 - Q1 = 59 - 25 = 34
3.17

a) 1 - 1/k² = 1 - 1/2² = 1 - 1/4 = 3/4 = .75
b) 1 - 1/2.5² = 1 - 1/6.25 = .84
c) 1 - 1/1.6² = 1 - 1/2.56 = .609
d) 1 - 1/3.2² = 1 - 1/10.24 = .902
3.18
Set 1:  Σx = 262;  Σx² = 17,970;  N = 4
μ₁ = 262/4 = 65.5
σ₁ = √[(Σx² - (Σx)²/N)/N] = √[(17,970 - (262)²/4)/4] = 14.2215

Set 2:  Σx = 570;  Σx² = 82,070;  N = 4
μ₂ = 570/4 = 142.5
σ₂ = √[(82,070 - (570)²/4)/4] = 14.5344

CV₁ = (14.2215/65.5)(100) = 21.71%
CV₂ = (14.5344/142.5)(100) = 10.20%
xx
( x x) 2
3.19
7
1.833
3.361
3.833
14.694
10
1.167
1.361
12
3.167
10.028
0.167
0.028
0.833
0.694
14
5.167
26.694
5.833
34.028
11
2.167
4.694
13
4.167
17.361
0.833
0.694
2.833
8.028
106
121.665
x 106
n
12
= 8.833
xx
a) MAD =
b) s2 =
32
12
= 2.667
( x x) 2 121.665
n 1
11
= 11.06
32.000
s 2 11.06
c) s =
= 3.326
3 5 6 7 8 8 9 10 11 12 13 14
Q1 = P25: i = (.25)(12) = 3
Q1 = (6 + 7)/2 = 6.5
Q3 = P75: i = (.75)(12) = 9
e.) z =
f.) CV =
3.20 n = 11
6 8.833
3.326
(3.326)(100)
8.833
= - 0.85
= 37.65%
x-
768
429
475.64
136.64
323
30.64
306
13.64
Q3 = (11 + 12)/2 =
286
6.36
262
30.36
215
77.36
172
120.36
162
148
130.36
144.36
145
147.36
x = 3216
= 292.36
x-
x = 3216
= 1313.08
x2 = 1,267,252
x
N
b.) MAD =
1313.08
11
= 119.37
(x ) 2
(3216) 2
x
1,267,252
N
11
N
11
2
c.) 2 =
= 29,728.23
29,728.23
d.) =
e.) Q1 = P25:
= 172.42
i = .25(11) = 2.75
Q3 = P75:
i = .75(11) = 8.25
f.) z = (x - μ)/σ = (172 - 292.36)/172.42 = -0.70
g.) CV = (σ/μ)(100) = (172.42/292.36)(100) = 58.98%

3.21

μ = 125;  σ = 12
3.22

μ = 38;  σ = 6
x₁ - μ = 50 - 38 = 12;  (x₁ - μ)/σ = 12/6 = 2
x₂ - μ = 26 - 38 = -12;  (x₂ - μ)/σ = -12/6 = -2
k = 2:  1 - 1/k² = 1 - 1/2² = 1 - 1/4 = 3/4 = .75
Between 14 and 62?  μ = 38;  σ = 6
x₁ - μ = 62 - 38 = 24;  (x₁ - μ)/σ = 24/6 = 4
x₂ - μ = 14 - 38 = -24;  (x₂ - μ)/σ = -24/6 = -4
k = 4
k = 4:  1 - 1/k² = 1 - 1/4² = 1 - 1/16 = 15/16 = .9375
1 - 1/k² = .89
.11 = 1/k²
.11k² = 1
k² = 1/.11 = 9.09
k = 3.015
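Chebyshev's theorem and its inversion (solving for k given a desired proportion) are easy to check numerically; a small sketch (helper names are my own):

```python
import math

def within_k(k):
    """Chebyshev's theorem: at least 1 - 1/k^2 of values lie within k sigma."""
    return 1 - 1 / k ** 2

def k_for(p):
    """Invert 1 - 1/k^2 = p to find the required number of std devs k."""
    return math.sqrt(1 / (1 - p))

print(within_k(2))            # at least .75 of values within 2 std devs
print(within_k(4))            # at least .9375 within 4 std devs
print(round(k_for(.89), 3))   # k = 3.015, as in the solution above
```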
3.23

1 - 1/k² = .80
1 - .80 = 1/k²
.20k² = 1
k² = 5
k = 2.236
3.24

μ = 43. 68% of the values lie within μ ± 1σ. Thus, between the mean, 43, and one of the values, 46, is one standard deviation. Therefore, 1σ = 46 - 43 = 3.

99.7% of the values lie within μ ± 3σ. Thus, between the mean, 43, and one of the values, 51, are three standard deviations. Therefore, 3σ = 51 - 43 = 8, so σ = 8/3 = 2.67.
1 - 1/k² = .77
Solving for k:  .23 = 1/k², and therefore .23k² = 1, k² = 4.3478, k = 2.085
2.085σ = 4, so σ = 4/2.085 = 1.918

3.25

μ = 29;  σ = 4
(x₁ - μ)/σ = (21 - 29)/4 = -8/4 = -2 standard deviations
(x₂ - μ)/σ = (37 - 29)/4 = 8/4 = 2 standard deviations
Exceed 37 days: Since 95% of the values fall between 21 and 37 days, 5% fall outside this range. Since the normal distribution is symmetrical, half of that, 2½%, falls below 21 and 2½% falls above 37.

Exceed 41 days: (x - μ)/σ = (41 - 29)/4 = 12/4 = 3 standard deviations. The empirical rule states that 99.7% of the values fall within μ ± 3σ = 29 ± 3(4) = 29 ± 12. That is, 99.7% of the values will fall between 17 and 41 days. 0.3% will fall outside this range, and half of this, or 0.15%, will lie above 41.
σ = 4:  (x - μ)/σ = (25 - 29)/4 = -1 standard deviation
μ ± 1σ = 29 ± 1(4) = 29 ± 4
Therefore, 68% of the values lie between 25 and 33, and 32% lie outside this range, with half of that, ½(32%) = 16%, less than 25.
3.26
x
97
109
111
118
120
130
132
133
137
137
Σx = 1224;  Σx² = 151,486;  n = 10
x̄ = 122.4;  s = 13.615
Bordeaux: x = 137;  z = (137 - 122.4)/13.615 = 1.07
Montreal: x = 130;  z = (130 - 122.4)/13.615 = 0.56
Edmonton: x = 111;  z = (111 - 122.4)/13.615 = -0.84
Hamilton: x = 97;  z = (97 - 122.4)/13.615 = -1.87
3.27
Mean

Class     f     M     fM
0 - 2    39     1     39
2 - 4    27     3     81
4 - 6    16     5     80
6 - 8    15     7    105
8 - 10   10     9     90
10 - 12   8    11     88
12 - 14   6    13     78
Σf = 121;  ΣfM = 561

μ = ΣfM/Σf = 561/121 = 4.64
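The grouped mean uses class midpoints as stand-ins for the raw values; a sketch with the class/frequency pairs of problem 3.27 (the frequencies of the last two classes are inferred from the fM column):

```python
# (class interval, frequency) pairs for problem 3.27.
classes = [((0, 2), 39), ((2, 4), 27), ((4, 6), 16), ((6, 8), 15),
           ((8, 10), 10), ((10, 12), 8), ((12, 14), 6)]

sum_f = sum(f for _, f in classes)                           # total frequency
sum_fM = sum(f * (lo + hi) / 2 for (lo, hi), f in classes)   # sum of f * midpoint
mean = sum_fM / sum_f
print(f"grouped mean = {mean:.2f}")
```

Because midpoints stand in for the raw data, the result is only an approximation — compare problem 3.34, where the grouped mean 40,204 differs from the actual ungrouped mean 37,816.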
3.28
Class       f     M     fM
1.2 - 1.6  220   1.4   308
1.6 - 2.0  150   1.8   270
2.0 - 2.4   90   2.2   198
2.4 - 2.8  110   2.6   286
2.8 - 3.2  280   3.0   840
Σf = 850;  ΣfM = 1902

Mean: μ = ΣfM/Σf = 1902/850 = 2.24
Mode: the modal class is 2.8 - 3.2; the mode is its midpoint, 3.0
3.29
Class
Total
fM
20-30
25
175
30-40
11
35
385
40-50
18
45
810
50-60
13
55
715
60-70
65
390
70-80
75
300
59
2775
fM 2775
f
59
=
= 47.034
( M - ) 2
M-
f ( M - ) 2
-22.0339
485.4927
3398.449
-12.0339
144.8147
1592.962
- 2.0339
4.1367
74.462
7.9661
63.4588
824.964
17.9661
322.7808
1936.685
27.9661
782.1028
3128.411
Total 10,955.933
f ( M ) 2 10,955.93
f
59
2 =
= 185.694
185.694
=
13.627
3.30
fM2
Class
5- 9
20
140
980
9 - 13
18
11
198
2,178
13 - 17
8 15
120
1,800
17 - 21
6 19
114
2,166
21 - 25
2 23
46
1,058
f=54
fM
fM= 618
fM2= 8,182
(fM ) 2
(618) 2
8182
n
54 8182 7071.67
n 1
53
53
fM 2
s2 =
s 2 20.9
s =
= 4.575
= 20.931
3.31
Class
fM2
fM
18 - 24
17
21
357
24 - 30
22
27
594
30 - 36
26
33
858
36 - 42
35
39
1,365
42 - 48
33
45
1,485
48 - 54
30
51
1,530
54 - 60
32
57
1,824
60 - 66
21
63
1,323
66 - 72
15
69
1,035
7,497
16,038
28,314
53,235
66,825
78,030
103,968
83,349
71,415
f= 231
x
a.) Mean:
fM= 10,371
fM2= 508,671
fM fM 10,371
n
f
231
= 44.896
b.) Mode. The Modal Class = 36-42. The mode is the class midpoint =
39
(fM ) 2
(10,371) 2
fM
508,671
n
231 43,053.5065
n 1
230
230
2
c.) s2 =
187.2
d.) s =
= 13.682
= 187.189
3.32
fM2
Class
fM
0-1
31
0.5
15.5
7.75
1-2
57
1.5
85.5
128.25
2-3
26
2.5
65.0
162.50
3-4
14
3.5
49.0
171.50
4-5
4.5
27.0
121.50
5-6
5.5
16.5
90.75
f=137
fM=258.5
fM2=682.25
a.) Mean
fM 258.5
f
137
=
b.) Mode:
= 1.887
c.) Variance:
(fM ) 2
( 258.5) 2
682.25
N
137
N
137
fM 2
2 =
2 1.4197
= 1.1915
= 1.4197
3.33
fM2
fM
20-30
30-40
8
7
25
35
200
245
5000
8575
40-50
45
45
2025
50-60
55
60-70
65
195
12675
70-80
75
75
5625
f = 20
fM = 760
fM2 = 33900
a.) Mean:
fM 760
f
20
=
b.) Mode.
= 38
c.) Variance:
(fM ) 2
(760) 2
33,900
N
20
N
20
fM 2
2 =
= 251
2 251
= 15.843
3.34
No. of Farms
fM
0 - 20,000
16
10,000
160,000
20,000 - 40,000
11
30,000
330,000
40,000 - 60,000
10
50,000
500,000
60,000 - 80,000
70,000
420,000
80,000 - 100,000
90,000
450,000
100,000 - 120,000
1
f = 49
110,000
110,000
fM = 1,970,000
fM 1,970,000
f
49
=
= 40,204
The actual mean for the ungrouped data is 37,816. This computed group mean, 40,204, is really just an approximation based on using the class midpoints in the calculation. Apparently, the actual numbers of farms per state in some categories do not average to the class midpoint, and in fact might be less than the class midpoint, since the actual mean is less than the grouped data mean.
(fm) 2
(1,970,000) 2
11
fM
1.185 x10
N
49
N
49
2
2 =
= 801,999,167
= 28,319.59
The actual standard deviation was 29,341. The difference again is due to the grouping of the data and the use of class midpoints to represent the data. The class midpoints do not accurately reflect the raw data.
3.35
mean = $35
median = $33
mode = $21
The stock prices are skewed to the right. While many of the stock prices are at the cheaper end, a few extreme prices at the higher end pull the mean upward.
3.36
mean = 51
median = 54
mode = 59
The distribution is skewed to the left. More people are older, but the most extreme ages are the younger ages.
3.37

Sk = 3(μ - Md)/σ = 3(5.51 - 3.19)/9.59 = 0.726

3.38

n = 25;  Σx = 600;  x̄ = 24;  s = 6.6521;  Md = 23

Sk = 3(x̄ - Md)/s = 3(24 - 23)/6.6521 = 0.451
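The Pearsonian coefficient of skewness used in 3.37 and 3.38 is a one-liner (the function name is my own):

```python
def pearson_skewness(mean, median, std):
    """Sk = 3(mean - median) / std; positive means skewed to the right."""
    return 3 * (mean - median) / std

print(round(pearson_skewness(5.51, 3.19, 9.59), 3))   # problem 3.37
print(round(pearson_skewness(24, 23, 6.6521), 3))     # problem 3.38
```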
3.39
Q1 = 500.
Median = 558.5.
Q3 = 589.
Inner Fences:
3.40

n = 18

Median: located at the (18 + 1)/2 = 9.5th term; Median = 74
Q1 = P25: i = (25/100)(18) = 4.5, round up to the 5th term: Q1 = 66
Q3 = P75: i = (75/100)(18) = 13.5, round up to the 14th term: Q3 = 90
Therefore, IQR = Q3 - Q1 = 90 - 66 = 24

Outer Fences: Q1 - 3.0 IQR = 66 - 3.0(24) = -6
Q3 + 3.0 IQR = 90 + 3.0(24) = 162

There are no extreme outliers. The only mild outlier is 21. The distribution is positively skewed since the median is nearer to Q1 than Q3.
3.41

Σx = 80;  Σx² = 1,148;  Σy = 69;  Σy² = 815;  Σxy = 624;  n = 7

r = (Σxy - ΣxΣy/n) / √[(Σx² - (Σx)²/n)(Σy² - (Σy)²/n)]
  = (624 - (80)(69)/7) / √[(1,148 - (80)²/7)(815 - (69)²/7)]
  = -164.571 / √[(233.714)(134.857)]
  = -164.571 / 177.533 = -.927

3.42
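The computational-sums form of r used throughout these correlation problems can be wrapped once and reused (the helper name is my own):

```python
import math

def pearson_r(sx, sxx, sy, syy, sxy, n):
    """r from sums: (Sxy - SxSy/n) / sqrt((Sxx - Sx^2/n)(Syy - Sy^2/n))."""
    num = sxy - sx * sy / n
    den = math.sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))
    return num / den

print(round(pearson_r(80, 1148, 69, 815, 624, 7), 3))              # problem 3.41
print(round(pearson_r(1087, 322345, 2032, 878686, 507509, 5), 3))  # problem 3.42
```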
Σx = 1,087;  Σx² = 322,345;  Σy = 2,032;  Σy² = 878,686;  Σxy = 507,509;  n = 5

r = (507,509 - (1,087)(2,032)/5) / √[(322,345 - (1,087)²/5)(878,686 - (2,032)²/5)]
  = 65,752.2 / √[(86,031.2)(52,881.2)]
  = 65,752.2 / 67,449.5 = .975
3.43

Delta (x): 47.6, 46.3, 50.6, 52.6, 52.4, 52.7
SW (y): 15.1, 15.4, 15.9, 15.6, 16.4, 18.1

Σx = 302.2;  Σy = 96.5;  Σx² = 15,259.62;  Σy² = 1,557.91;  Σxy = 4,870.11;  n = 6

r = (4,870.11 - (302.2)(96.5)/6) / √[(15,259.62 - (302.2)²/6)(1,557.91 - (96.5)²/6)] = .6445

3.44
Σx = 6,087;  Σx² = 6,796,149;  Σy = 1,050;  Σy² = 194,526;  Σxy = 1,130,483;  n = 9

r = (1,130,483 - (6,087)(1,050)/9) / √[(6,796,149 - (6,087)²/9)(194,526 - (1,050)²/9)]
  = 420,333 / √[(2,679,308)(72,026)]
  = 420,333 / 439,294.705 = .957

3.45
Σx = 17.09;  Σx² = 58.7911;  Σy = 15.12;  Σy² = 41.7054;  Σxy = 48.97;  n = 8
r = (48.97 - (17.09)(15.12)/8) / √[(58.7911 - (17.09)²/8)(41.7054 - (15.12)²/8)]
  = 16.6699 / √[(22.28259)(13.1286)] = 16.6699/17.1038 = .975

Σx = 15.12;  Σx² = 41.7054;  Σy = 15.86;  Σy² = 42.0396;  Σxy = 41.5934;  n = 8
r = (41.5934 - (15.12)(15.86)/8) / √[(41.7054 - (15.12)²/8)(42.0396 - (15.86)²/8)]
  = 11.618 / √[(13.1286)(10.59715)] = 11.618/11.795 = .985

Σx = 17.09;  Σx² = 58.7911;  Σy = 15.86;  Σy² = 42.0396;  Σxy = 48.5827;  n = 8
r = (48.5827 - (17.09)(15.86)/8) / √[(58.7911 - (17.09)²/8)(42.0396 - (15.86)²/8)]
  = 14.702 / √[(22.2826)(10.5972)] = 14.702/15.367 = .957
The years 2 and 3 are the most correlated with r = .985.
3.46
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 4, 4, 5, 6, 8
Mean: x̄ = Σx/n = 75/30 = 2.5
Median: located at the (n + 1)/2 = (30 + 1)/2 = 15.5th position; Median = 2
Range = 8 - 1 = 7
Q1 = P25: i = (25/100)(30) = 7.5, round up to the 8th term: Q1 = 1
Q3 = P75: i = (75/100)(30) = 22.5, round up to the 23rd term: Q3 = 3
IQR = Q3 - Q1 = 3 - 1 = 2
3.47
P10: i = (10/100)(40) = 4
P80: i = (80/100)(40) = 32
Q1 = P25: i = (25/100)(40) = 10
Q3 = P75: i = (75/100)(40) = 30
Range = 81 - 19 = 62
3.48
μ = Σx/N = 126,904/20 = 6345.2
Median: located at the (n + 1)/2 = 10.5th term
P30:
i = (.30)(20) = 6
P60:
i = (.60)(20) = 12
P90:
i = (.90)(20) = 18
Q1 = P25:
i = (.25)(20) = 5
Q1 = (4464+4507)/2 = 4485.5
Q3 = P75:
i = (.75)(20) = 15
Q3 = (6796+8687)/2 = 7741.5
3.49

n = 10;  Σx = 87.95;  Σx² = 1130.9027

σ = √[(Σx² - (Σx)²/N)/N] = √[(1130.9027 - (87.95)²/10)/10] = 5.978
3.50

a.) μ = Σx/N = 26,675/11 = 2425;  Median = 1965
b.) Q1 = 1532;  IQR = Q3 - Q1 = 1335
c.) Variance: σ² = (Σx² - (Σx)²/N)/N = (86,942,873 - (26,675)²/11)/11 = 2,023,272.55
Standard deviation: σ = √2,023,272.55 = 1422.42
d.) Texaco: z = (x - μ)/σ = (1532 - 2425)/1422.42 = -0.63
Exxon Mobil: z = (6300 - 2425)/1422.42 = 2.72
e.) Skewness: Sk = 3(μ - Md)/σ = 3(2425 - 1965)/1422.42 = 0.97

3.51

a.) Mean: x̄ = Σx/n = 32.95/14 = 2.3536;  Median = 1.93;  No mode
b.) Range = 4.73 - 1.20 = 3.53
Q1: i = (25/100)(14) = 3.5, round up to the 4th term: Q1 = 1.68
Q3: i = (75/100)(14) = 10.5, round up to the 11th term: Q3 = 2.87
IQR = Q3 Q1 = 2.87 1.68 = 1.19
xx
x x
2.3764
5.6473
x
4.73
3.64
3.53
2.87
2.61
2.59
2.07
1.79
1.77
1.69
1.68
1.41
1.37
1.20
1.2864
1.1764
0.5164
0.2564
0.2364
0.2836
0.5636
0.5836
0.6636
0.6736
0.9436
0.9836
1.1536
xx
xx
n
MAD =
x x
n 1
s2 =
11.6972
14
13.8957
13
1.6548
1.3839
0.2667
0.0657
0.0559
0.0804
0.3176
0.3406
0.4404
0.4537
0.8904
0.9675
1.3308
( x x)
= 11.6972
= 13.8957
= 0.8355
= 1.0689
s 2 1.0689
s=
= 1.0339
Sk =
3( x M d ) 3(2.3536 1.93)
s
1.0339
= 1.229
Inner Fences:
3.52
Class     f      M       fM         fM²
15-20     9    17.5    157.5     2756.25
20-25    16    22.5    360.0     8100.00
25-30    27    27.5    742.5    20418.75
30-35    44    32.5   1430.0    46475.00
35-40    42    37.5   1575.0    59062.50
40-45    23    42.5    977.5    41543.75
45-50     7    47.5    332.5    15793.75
50-55     2    52.5    105.0     5512.50
Σf = 170;  ΣfM = 5680.0;  ΣfM² = 199662.50
a.) Mean: x̄ = ΣfM/Σf = 5680/170 = 33.412
Mode: the modal class is 30-35; the class midpoint is the mode = 32.5

b.) Variance: s² = (ΣfM² - (ΣfM)²/n)/(n - 1) = (199,662.5 - (5680)²/170)/169 = 58.483
Standard deviation: s = √58.483 = 7.647
3.53
Class      f      M       fM        fM²
0 - 20    32     10      320      3,200
20 - 40   16     30      480     14,400
40 - 60   13     50      650     32,500
60 - 80   10     70      700     49,000
80 - 100  19     90    1,710    153,900
Σf = 90;  ΣfM = 3,860;  ΣfM² = 253,000

a) Mean: x̄ = ΣfM/Σf = 3,860/90 = 42.89
Mode: the modal class is 0-20; the midpoint of this class is the mode = 10.

b) s² = (ΣfM² - (ΣfM)²/n)/(n - 1) = (253,000 - (3860)²/90)/89 = (253,000 - 165,551.1)/89 = 87,448.9/89 = 982.572
s = √982.572 = 31.346
3.54
Σx = 36   Σx² = 256   Σy = 44   Σy² = 300   Σxy = 188   n = 7

r = [Σxy − (Σx·Σy)/n] / √{[Σx² − (Σx)²/n][Σy² − (Σy)²/n]}

  = [188 − (36)(44)/7] / √{[256 − (36)²/7][300 − (44)²/7]}

  = -38.2857/√[(70.85714)(23.42857)] = -38.2857/40.7441 = -.940
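The computational (sums) formula for Pearson's r used above can be sketched as a small helper (name and signature are mine):

```python
from math import sqrt

def pearson_r(n, sx, sy, sxy, sx2, sy2):
    """r from the sums formula: [Sxy - SxSy/n] / sqrt([Sx2-(Sx)^2/n][Sy2-(Sy)^2/n])."""
    num = sxy - sx * sy / n
    den = sqrt((sx2 - sx ** 2 / n) * (sy2 - sy ** 2 / n))
    return num / den

# Sums from problem 3.54:
r = pearson_r(7, 36, 44, 188, 256, 300)   # about -0.940
```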
3.55
CVx = (σx/μx)(100%) = (3.45/32)(100%) = 10.78%

CVy = (σy/μy)(100%) = (5.40/84)(100%) = 6.43%
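The coefficient of variation comparison can be sketched directly:

```python
def cv(sigma, mu):
    """Coefficient of variation as a percentage: (sigma / mu) * 100."""
    return sigma / mu * 100

cv_x = cv(3.45, 32)   # about 10.78
cv_y = cv(5.40, 84)   # about 6.43
```

Although y's standard deviation is larger in absolute terms, x varies more relative to its mean.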
3.56  μ = 7.5, σ = 3
Each of the numbers, 1 and 14, is 6.5 units away from the mean: 14 − 7.5 = 6.5.
k = 6.5/3 = 2.167
3.57  μ = 419, σ = 27
a) 68%:   μ ± 1σ = 419 ± 27     → 392 to 446
   95%:   μ ± 2σ = 419 ± 2(27)  → 365 to 473
   99.7%: μ ± 3σ = 419 ± 3(27)  → 338 to 500
b) Each of the points, 359 and 479, is a distance of 60 from the mean, μ = 419:
   k = 60/27 = 2.22
c) z = (400 − 419)/27 = -0.70
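The empirical-rule intervals from 3.57 can be generated with a one-line helper (a sketch, not from the text):

```python
def empirical_ranges(mu, sigma):
    """mu +/- k*sigma bounds for the 68 / 95 / 99.7 percent empirical-rule intervals."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

ranges = empirical_ranges(419, 27)
# ranges[1] -> (392, 446), ranges[2] -> (365, 473), ranges[3] -> (338, 500)
```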
3.58
a)
                   x           x²
   Albania       4,900     24,010,000
   Bulgaria      8,200     67,240,000
   Croatia      11,200    125,440,000
   Czech        16,800    282,240,000
           Σx = 41,100   Σx² = 498,930,000

   μ = Σx/N = 41,100/4 = 10,275

   σ = √[(Σx² − (Σx)²/N)/N] = √[(498,930,000 − (41,100)²/4)/4] = 4376.86

b)
                   x           x²
   Hungary      14,900    222,010,000
   Poland       12,000    144,000,000
   Romania       7,700     59,290,000
   Bosnia/Herz   6,500     42,250,000
           Σx = 41,100   Σx² = 467,550,000

   μ = Σx/N = 41,100/4 = 10,275

   σ = √[(Σx² − (Σx)²/N)/N] = √[(467,550,000 − (41,100)²/4)/4] = 3363.31

c) CV1 = (σ1/μ1)(100) = (4376.86/10,275)(100) = 42.60%

   CV2 = (σ2/μ2)(100) = (3363.31/10,275)(100) = 32.73%
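The two group standard deviations in 3.58 can be verified from the raw values (list names are mine):

```python
from math import sqrt

def pop_sigma(values):
    """Population standard deviation computed directly from deviations."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)

group1 = [4900, 8200, 11200, 16800]   # Albania, Bulgaria, Croatia, Czech
group2 = [14900, 12000, 7700, 6500]   # Hungary, Poland, Romania, Bosnia/Herz
s1 = pop_sigma(group1)                # about 4376.86
s2 = pop_sigma(group2)                # about 3363.31
```

Both groups have the same mean (10,275), so the CV comparison reduces to comparing the sigmas.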
3.59
Mean
$35,748
Median
$31,369
Mode
$29,500
Since these three measures are not equal, the distribution is skewed. The distribution is skewed to the right because the mean is greater than the median. Often, the median is preferred in reporting income data because it yields information about the middle of the data while ignoring extremes.
3.60
Σx = 36.62   Σx² = 217.137
Σy = 57.23   Σy² = 479.3231
Σxy = 314.9091   n = 8

r = [Σxy − (Σx·Σy)/n] / √{[Σx² − (Σx)²/n][Σy² − (Σy)²/n]}

  = [314.9091 − (36.62)(57.23)/8] / √{[217.137 − (36.62)²/8][479.3231 − (57.23)²/8]}

  = 52.938775/√[(49.50895)(69.91399)] = .90
3.61
a) Q1 = P25: i = (25/100)(20) = 5; Q1 = 49.1
   Q3 = P75: i = (75/100)(20) = 15; Q3 = 80.7
   Median: located at the (n + 1)/2 = (20 + 1)/2 = 10.5th term

   Inner Fences:
   Q1 − 1.5 IQR = 49.1 − 47.4 = 1.7
   Q3 + 1.5 IQR = 80.7 + 47.4 = 128.1

   Outer Fences:
   Q1 − 3.0 IQR = 49.1 − 94.8 = -45.7
   Q3 + 3.0 IQR = 80.7 + 94.8 = 175.5

b) and c) There are no outliers in the lower end. There are two extreme outliers in the upper end (South Louisiana, 198.8, and Houston, 190.9). There is one mild outlier at the upper end (New York, 145.9). Since the median is nearer to Q1, the distribution is positively skewed.
[Box-and-whisker plot of U.S. Ports omitted; axis scale 50 to 200]
3.62
Paris: k = 1.459
1.459σ = 32, so σ = 21.93

Moscow: k = 2.425
2.425σ = 44, so σ = 18.14
Chapter 4
Probability
LEARNING OBJECTIVES
The main objective of Chapter 4 is to help you understand the basic principles of
probability, specifically enabling you to
1.
2.
Understand and apply marginal, union, joint, and conditional
probabilities.
3.
4.
5.
Section 4.8, on Bayes' theorem, can be skipped in a one-semester course without losing any continuity. This section is a prerequisite to the chapter 19 presentation of revising probabilities in light of sample information (section 19.4).
CHAPTER OUTLINE
4.1
Introduction to Probability
4.2
4.3
Structure of Probability
Experiment
Event
Elementary Events
Sample Space
Unions and Intersections
Mutually Exclusive Events
Independent Events
Collectively Exhaustive Events
Complementary Events
Counting the Possibilities
The mn Counting Rule
Sampling from a Population with Replacement
Combinations: Sampling from a Population Without
Replacement
4.4
4.5
Addition Laws
Probability Matrices
Complement of a Union
Special Law of Addition
4.6
Multiplication Laws
General Law of Multiplication
Special Law of Multiplication
4.7
Conditional Probability
Independent Events
4.8
KEY TERMS
A Priori
Bayes' Rule
Complement
Conditional Probability
Elementary Events
Event
Experiment
Independent Events
Intersection
Joint Probability
Marginal Probability
mn Counting Rule
Mutually Exclusive Events
Probability Matrix
Relative Frequency of Occurrence
Sample Space
Set Notation
Subjective Probability
Union
Union Probability
4.1
Sample Space:
D1 D2, D2 D3, D3 A5
D1 D3, D2 A4, D3 A6
D1 A4, D2 A5, A4 A5
D1 A5, D2 A6, A4 A6
D1 A6, D3 A4, A5 A6
9/15 = .60
4.2
a) X ∪ Z = {1, 2, 3, 4, 5, 7, 8, 9}
b) X ∩ Y = {7, 9}
c) X ∩ Z = {1, 3, 7}
d) X ∪ Y ∪ Z = {1, 2, 3, 4, 5, 7, 8, 9}
e) X ∩ Y ∩ Z = {7}
f)
g)
h) X or Y = X ∪ Y = {1, 2, 3, 4, 5, 7, 8, 9}
i) Y and Z = Y ∩ Z = {2, 4, 7}
4.3
If A = {2, 6, 12, 24} and the population is the positive even numbers through 30,
A′ = {4, 8, 10, 14, 16, 18, 20, 22, 26, 28, 30}
4.4
6(4)(3)(3) = 216
4.5
Sample Space:
D1 D2 A1, D1 D2 A2, D1 D2 A3,
D1 D2 A4, D1 A1 A2, D1 A1 A3,
D1 A1 A4, D1 A2 A3, D1 A2 A4,
D1 A3 A4, D2 A1 A2, D2 A1 A3,
D2 A1 A4, D2 A2 A3, D2 A2 A4,
D2 A3 A4, A1 A2 A3, A1 A2 A4,
A1 A3 A4, A2 A3 A4
6C3 = 6!/(3!3!) = 20

12/20 = 3/5 = .60
4.6
4.7
20C6 = 20!/(6!14!) = 38,760
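Combination counts like 20C6 can be checked against both the factorial definition and the standard-library helper:

```python
from math import comb, factorial

# 20C6 via the factorial definition and via math.comb; both give 38,760.
by_factorials = factorial(20) // (factorial(6) * factorial(14))
by_comb = comb(20, 6)
```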
4.8
P(B C) = .03
4.9
D
12
25
10
20
15
23
16
21
60
d) P(C ∪ F) = P(C) + P(F) − P(C ∩ F) = .5167
4.10 The matrix:
.10   .03    .13
.04   .12    .16
.27   .06    .33
.31   .07    .38
.72   .28   1.00
4.11
P(A) = .47
P(T) = .28
We need to know the probability of the intersection of A and T (the proportion who have ridden both) or to determine whether these two events are mutually exclusive.
4.12
P(L) = .75   P(M) = .78   P(M ∩ L) = .61
4.13
4.14
P(T F) = .35
P(F) = .44
.28
.37
d) The matrix:
                 Faculty References
                   Y      N
Transcript   Y   .35    .19     .54
             N   .09    .37     .46
                 .44    .56    1.00
4.15
C
11
16
40
17
14
21
15
57
4.16 The matrix:
        D     E     F
A     .12   .13   .08     .33
B     .18   .09   .04     .31
C     .06   .24   .06     .36
      .36   .46   .18    1.00

a) P(E ∩ B) = .09
b) P(C ∩ F) = .06
c) P(E ∩ D) = .00 (E and D are mutually exclusive)
4.17
a) (without replacement): (6/50)(5/49) = 30/2450 = .0122
b) (with replacement): (6/50)(6/50) = 36/2500 = .0144

4.18
Let U = urban, I = care for ill relatives
P(U) = .78   P(I) = .15   P(I|U) = .11

c) The matrix:
                 U
              Yes      No
I   Yes     .0858   .0642    .15
    No      .6942   .1558    .85
            .78     .22     1.00

d) P(NU ∩ I) is found in the "no" for U column and the "yes" for I row. Take the marginal, .15, minus the yes-yes cell, .0858, to get .0642.
4.19
Let S = stockholder, C = college
P(S) = .43   P(C) = .37   P(C|S) = .75
P(C ∩ S) = P(C|S)·P(S) = (.75)(.43) = .3225

The matrix:
                 C
              Yes      No
S   Yes     .3225   .1075    .43
    No      .0475   .5225    .57
            .37     .63     1.00
4.20
P(P) = .52   P(F) = .10   P(P|F) = .91

The matrix:
            P      NP
F        .091    .009    .10
NF       .429    .471    .90
         .520    .480   1.00
4.21
Let S = safety, A = age
P(S) = .30   P(A) = .39   P(A|S) = .87
P(A ∩ S) = P(A|S)·P(S) = (.87)(.30) = .261

The matrix:
                 A
              Yes     No
S   Yes     .261   .039    .30
    No      .129   .571    .70
            .39    .61    1.00
4.22
P(O) = .29

The matrix:
                 O
              Yes     No
    Yes      .13    .47    .60
    No       .16    .24    .40
             .29    .71   1.00
4.23 The matrix:
        E     F     G
A      15    12     8      35
B      11    17    19      47
C      21    32    27      80
D      18    13    12      43
       65    74    66     205

a) P(G|A) = 8/35 = .2286
d) P(E ∩ G) = .0000 (E and G are mutually exclusive)
4.24 The matrix:
        C     D
A     .36   .44    .80
B     .11   .09    .20
      .47   .53   1.00

c) P(A ∩ B) = .0000 (A and B are mutually exclusive)
4.25 The matrix:
                    Calculator
                    Yes    No
Computer   Yes       46     3      49
           No        11    15      26
                     57    18      75

Test P(V1|V2) against the marginal P(V1):
46/57 = .8070 but 49/75 = .6533
Since this is one example in which the conditional does not equal the marginal in this matrix, the variable computer is not independent of the variable calculator.
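The independence check in 4.25 can be sketched numerically. The yes-yes row's second cell (3) is recovered from the margins, so treat it as a reconstructed value:

```python
# Counts for problem 4.25; the cell value 3 is inferred from the margins.
yes_yes, calc_yes_total = 46, 57    # computer-yes among calculator-yes
comp_yes_total, grand = 49, 75      # computer-yes overall

conditional = yes_yes / calc_yes_total    # P(computer | calculator) ~ .8070
marginal = comp_yes_total / grand         # P(computer)              ~ .6533
independent = abs(conditional - marginal) < 1e-9   # False here
```

Because the conditional and marginal probabilities differ, the two variables are not independent.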
4.26
Let C = construction

c) P(C|S) = P(C ∩ S)/P(S) = (1258/83,384)/(8010/83,384) = .15705

d) P(S|C) = P(C ∩ S)/P(C) = (1258/83,384)/(10,867/83,384) = .11576

e) P(NS|NC) = P(NS ∩ NC)/P(NC) = [1 − P(C ∪ S)]/P(NC)

f) P(NS|C) = P(NS ∩ C)/P(C) = [P(C) − P(C ∩ S)]/P(C)
4.27
Let E = economy, Q = qualified
P(E) = .46   P(E ∩ Q) = .15   P(Q) = .37

The matrix:
                 Q
              Yes     No
E   Yes      .15    .31    .46
    No       .22    .32    .54
             .37    .63   1.00
The matrix:
TD
Yes
No
Yes
No
.1080
.4320
.54
.46
EM
1.00
4.29
Let H = hardware, S = software
P(H) = .37   P(S) = .54   P(S|H) = .97
P(S ∩ H) = P(S|H)·P(H) = (.97)(.37) = .3589

The matrix:
                 S
              Yes      No
H   Yes     .3589   .0111    .37
    No      .1811   .4489    .63
            .54     .46     1.00
4.30
P(R) = .43   P(S) = .46   P(R|S) = .77   P(not S) = .54
P(R ∩ S) = P(R|S)·P(S) = (.77)(.46) = .3542

The matrix:
                 S
              Yes      No
R   Yes     .3542   .0758    .43
    No      .1058   .4642    .57
            .46     .54     1.00
4.31
P(A) = .10   P(B) = .40   P(C) = .50
P(D|A) = .05   P(D|B) = .12   P(D|C) = .08

Event   Prior P(Ei)   Conditional P(D|Ei)   Joint P(D ∩ Ei)   Revised P(Ei|D)
A          .10             .05                 .005            .005/.093 = .0538
B          .40             .12                 .048            .048/.093 = .5161
C          .50             .08                 .040            .040/.093 = .4301
                                          P(D) = .093
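The Bayes revision above can be sketched as a minimal Python helper (dictionary names are mine, not from the text):

```python
# Priors and conditionals from problem 4.31.
priors = {"A": 0.10, "B": 0.40, "C": 0.50}
cond = {"A": 0.05, "B": 0.12, "C": 0.08}          # P(D | Ei)

joint = {e: priors[e] * cond[e] for e in priors}  # P(D and Ei)
p_d = sum(joint.values())                         # P(D) = .093
revised = {e: joint[e] / p_d for e in priors}     # P(Ei | D)
```

The same three-step pattern (joint, total, divide) reproduces every revision table in this chapter.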
4.32
Let P(A) = .30   P(B) = .45   P(C) = .25
P(I|C) = .05   P(K|C) = .95

a) P(B) = .45

c)
Event   Prior P(Ei)   Conditional P(I|Ei)   Joint P(I ∩ Ei)   Revised P(Ei|I)
A          .30             .20                .0600            .0600/.1265 = .4743
B          .45             .12                .0540            .0540/.1265 = .4269
C          .25             .05                .0125            .0125/.1265 = .0988
                                         P(I) = .1265

d)
Event   Prior P(Ei)   Conditional P(K|Ei)   Joint P(K ∩ Ei)   Revised P(Ei|K)
A          .30             .80                .2400            .2400/.8735 = .2748
B          .45             .88                .3960            .3960/.8735 = .4533
C          .25             .95                .2375            .2375/.8735 = .2719
                                         P(K) = .8735
4.33
Let P(T) = .72   P(G) = .28   P(V|T) = .30   P(V|G) = .20

Event   Prior P(Ei)   Conditional P(V|Ei)   Joint P(V ∩ Ei)   Revised P(Ei|V)
T          .72             .30                .216             .216/.272 = .7941
G          .28             .20                .056             .056/.272 = .2059
                                         P(V) = .272
4.34
Let S = small, L = large
P(S) = .70   P(L) = .30   P(T|L) = .82   P(T|S) = .18

Event   Prior P(Ei)   Conditional P(T|Ei)   Joint P(T ∩ Ei)   Revised P(Ei|T)
S          .70             .18                .1260            .1260/.3720 = .3387
L          .30             .82                .2460            .2460/.3720 = .6613
                                         P(T) = .3720
4.35 The matrix:
                    Variable 1
                     D      E
Variable 2   A      10     20      30
             B      15      5      20
             C      30     15      45
                    55     40      95

50/95 = .52632
f) P(B ∩ C) = .0000 (mutually exclusive)
h) P(A|B) = P(A ∩ B)/P(B) = .0000/(20/95) = .0000 (A and B are mutually exclusive)
4.36
12
31
22
10
25
21
18
16
23
78
b) P(A|B) = P(A ∩ B)/P(B) = .0000/(22/78) = .0000 (A and B are mutually exclusive)
4.37
Age (years)        <35   35-44   45-54   55-64   >65
Gender   Male      .11    .20     .19     .12    .16     .78
         Female    .07    .08     .04     .02    .01     .22
                   .18    .28     .23     .14    .17    1.00

a) P(35-44) = .28
b) P(Woman ∩ 45-54) = .04
c) P(Man ∪ 35-44) = P(Man) + P(35-44) − P(Man ∩ 35-44) = .78 + .28 − .20 = .86
d) P(<35 ∪ 55-64) = P(<35) + P(55-64) = .18 + .14 = .32
e) P(Woman|45-54) = P(Woman ∩ 45-54)/P(45-54) = .04/.23 = .1739
f) P(not W ∩ not 55-64) = .11 + .20 + .19 + .16 = .66
4.38
Let T = thoroughness, K = knowledge
P(T) = .78   P(K) = .40   P(T ∩ K) = .27

The matrix:
                 K
              Yes     No
T   Yes      .27    .51    .78
    No       .13    .09    .22
             .40    .60   1.00
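Filling in a 2x2 probability matrix from two marginals and the intersection, as done repeatedly in these problems, can be sketched generically (function and labels are mine):

```python
def prob_matrix(p_a, p_b, p_ab):
    """Complete a 2x2 joint-probability matrix from P(A), P(B), and P(A and B)."""
    return {
        ("A", "B"): p_ab,
        ("A", "notB"): p_a - p_ab,
        ("notA", "B"): p_b - p_ab,
        ("notA", "notB"): 1 - p_a - p_b + p_ab,
    }

# Problem 4.38 values: A = thoroughness, B = knowledge.
m = prob_matrix(0.78, 0.40, 0.27)
```

The four cells necessarily sum to 1, which is a useful sanity check on any matrix built this way.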
4.39
Let R = retirement
P(R) = .42   P(L) = .61   P(R ∩ L) = .33

The matrix:
                 L
              Yes     No
R   Yes      .33    .09    .42
    No       .28    .30    .58
             .61    .39   1.00
4.40
P(T) = .16   P(T|W) = .20   P(W) = .21   P(NE) = .20

{(.20)(.83)}/(.84) = .1976

e) P(not W ∩ not NE|T) = P(not W ∩ not NE ∩ T)/P(T), where
   P(not W ∩ not NE ∩ T) = .16 − P(W ∩ T) − P(NE ∩ T) = .16 − .042 − .034 = .084
   P(not W ∩ not NE|T) = .084/.16 = .525
4.41
Let M = MasterCard, A = American Express, V = Visa
P(M) = .30   P(A) = .20   P(V) = .25
P(M ∩ A) = .08   P(V ∩ M) = .12   P(A ∩ V) = .06
.25 .40
4.42
P(N) = .51
Therefore, P(S) = 1 - .51 = .49
P(<45) = .57
The matrix:
            S       N
<45      .189    .381     .57
>45      .301    .129     .43
         .490    .510    1.00
4.43
P(M) = .43   P(R) = .45   P(R|M) = .81

                  Reduce
                 Yes      No
Save   Yes     .3483   .0817    .43
       No      .1017   .4683    .57
               .45     .55     1.00
4.44
Let R = read, B = checked in with the boss
P(R) = .40   P(B) = .34   P(B|R) = .78

            B      NB
R        .312    .088    .40
NR       .028    .572    .60
         .34     .66    1.00
4.45
P(Q) = .35   P(C|Q) = .75   P(Q|C) = .40

                  C
              Yes        No
Q   Yes     .2625     .0875     .35
    No      .39375    .25625    .65
            .65625    .34375   1.00
4.46
Let: D = denial
I = inappropriate
C = customer
P = payment dispute
S = specialty
G = delays getting care
R = prescription drugs
P(D) = .17
P(I) = .14
P(C) = .14
P(S) = .10
P(G) = .08
P(R) = .07
P(P) = .11
4.47
Let R = retention
P(R) = .56   P(P ∩ R) = .36   P(R|P) = .90

c) P(P) = ??
   Solve P(R|P) = P(R ∩ P)/P(P) for P(P):
   P(P) = P(R ∩ P)/P(R|P) = .36/.90 = .40

d) P(R ∪ P) = P(R) + P(P) − P(R ∩ P) = .56 + .40 − .36 = .60

                 P
              Yes     No
R   Yes      .36    .20    .56
    No       .04    .40    .44
             .40    .60   1.00

Note: In constructing the matrix, we are given P(R) = .56, P(P ∩ R) = .36, and P(R|P) = .90; that is, only one marginal probability is given. From P(R), we can get P(NR) by taking 1 − .56 = .44. However, only these two marginal values can be computed directly. To solve for P(P) using what is given: since we know that 90% of P lies in the intersection and that the intersection is .36, we can set up the equation .90·P(P) = .36. Solving gives P(P) = .40.
4.48
Let M = mail, S = sales
P(M) = .38   P(M ∩ S) = .0000

                 S
              Y        N
M    Y     .0000     .38     .38
     N     .21       .41     .62
           .21       .79    1.00
4.49
P(V|F) = .60   P(NF) = .59   P(NV) = .695

Solve for P(NF ∩ NV) = P(NV) − P(F ∩ NV) = .695 − .164 = .531
P(NF ∪ NV) = P(NF) + P(NV) − P(NF ∩ NV) = .59 + .695 − .531 = .754

                 V
              Y       N
F    Y     .246    .164     .41
     N     .059    .531     .59
           .305    .695    1.00
4.50
Let S = Sarabia, T = Tran, J = Jackson, B = blood test
P(S) = .41   P(T) = .32   P(J) = .27
P(B|S) = .05   P(B|T) = .08   P(B|J) = .06

Event   Prior P(Ei)   Conditional P(B|Ei)   Joint P(B ∩ Ei)   Revised P(Ei|B)
S          .41             .05                .0205              .3291
T          .32             .08                .0256              .4109
J          .27             .06                .0162              .2600
                                         P(B) = .0623
4.51
Let R = regulations, T = tax burden
P(R) = .30   P(T|R) = .71   P(T) = .35

e) P(NR|T) = 1 − P(R|T) = 1 − .6086 = .3914
f) P(NR|NT) = P(NR ∩ NT)/P(NT) = [1 − P(R ∪ T)]/P(NT) = (1 − .4370)/.65 = .8662

                 T
              Y       N
R    Y     .213    .087     .30
     N     .137    .563     .70
           .35     .65     1.00
4.52
Event   Prior P(Ei)   Conditional P(RB|Ei)   Joint P(RB ∩ Ei)   Revised P(Ei|RB)
0-24       .353            .11                 .03883            .03883/.25066 = .15491
25-34      .142            .24                 .03408            .03408/.25066 = .13596
35-44      .160            .27                 .04320            .04320/.25066 = .17235
> 45       .345            .39                 .13455            .13455/.25066 = .53678
                                          P(RB) = .25066
P(GH) = .29
P(HM) = .21
P(FG) = .40
Let FG =
Chapter 5
Discrete Distributions
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
6.
distribution approaches the normal curve as p gets nearer to .50 and as n gets larger for other values of p. It can be useful to demonstrate this in class along with showing how the graphs of Poisson distributions also approach the normal curve as λ gets larger.
In this text (as in most), because of the number of variables used in its computation, only exact probabilities are determined for the hypergeometric distribution. This, combined with the fact that there are no hypergeometric tables given in the text, makes it cumbersome to determine cumulative probabilities for the hypergeometric distribution. Thus, the hypergeometric distribution can be presented as a fall-back position to be used only when the binomial distribution should not be applied because of the non-independence of trials and the size of the sample.
CHAPTER OUTLINE
5.1
5.2
Distributions
Mean or Expected Value
Variance and Standard Deviation of a Discrete Distribution
5.3
Binomial Distribution
Solving a Binomial Problem
Using the Binomial Table
Using the Computer to Produce a Binomial Distribution
Mean and Standard Deviation of the Binomial Distribution
Graphing Binomial Distributions
5.4
Poisson Distribution
Working Poisson Problems by Formula
Using the Poisson Tables
Mean and Standard Deviation of a Poisson Distribution
Graphing Poisson Distributions
Using the Computer to Generate Poisson Distributions
Approximating Binomial Problems by the Poisson Distribution
5.5
Hypergeometric Distribution
KEY TERMS
Binomial Distribution
Continuous Distributions
Discrete Distributions
Hypergeometric Distribution
Lambda (λ)
Poisson Distribution
Random Variable
SOLUTIONS TO PROBLEMS IN CHAPTER 5
5.1
x    P(x)    x·P(x)    (x − μ)²     (x − μ)²·P(x)
1    .238     .238     2.775556     0.6605823
2    .290     .580     0.443556     0.1286312
3    .177     .531     0.111556     0.0197454
4    .158     .632     1.779556     0.2811700
5    .137     .685     5.447556     0.7463152

μ = Σ[x·P(x)] = 2.666
σ² = Σ[(x − μ)²·P(x)] = 1.836444
σ = √1.836444 = 1.355155

5.2
x    P(x)    x·P(x)    (x − μ)²      (x − μ)²·P(x)
0    .103     .000      7.573504     0.780071
1    .118     .118      3.069504     0.362201
2    .246     .492      0.565504     0.139114
3    .229     .687      0.061504     0.014084
4    .138     .552      1.557504     0.214936
5    .094     .470      5.053504     0.475029
6    .071     .426     10.549500     0.749015
7    .001     .007     18.045500     0.018046

μ = Σ[x·P(x)] = 2.752
σ² = Σ[(x − μ)²·P(x)] = 2.752496
σ = √2.752496 = 1.6591

5.3
x    P(x)    x·P(x)    (x − μ)²     (x − μ)²·P(x)
0    .461     .000     0.913936     0.421324
1    .285     .285     0.001936     0.000552
2    .129     .258     1.089936     0.140602
3    .087     .261     4.177936     0.363480
4    .038     .152     9.265936     0.352106

μ = Σ[x·P(x)] = 0.956
σ² = Σ[(x − μ)²·P(x)] = 1.278064
σ = √1.278064 = 1.1305
5.4
x    P(x)    x·P(x)    (x − μ)²     (x − μ)²·P(x)
0    .262     .000      1.4424      0.37791
1    .393     .393      0.0404      0.01588
2    .246     .492      0.6384      0.15705
3    .082     .246      3.2364      0.26538
4    .015     .060      7.8344      0.11752
5    .002     .010     14.4324      0.02886
6    .000     .000     23.0304      0.00000

μ = Σ[x·P(x)] = 1.201
σ² = Σ[(x − μ)²·P(x)] = 0.96260
σ = √0.96260 = 0.98112
5.5
a) n = 4, p = .10, q = .90
   P(x = 3) = 4C3(.10)³(.90)¹ = 4(.001)(.90) = .0036

b) n = 7, p = .80, q = .20
   P(x = 4) = 7C4(.80)⁴(.20)³ = 35(.4096)(.008) = .1147

c) n = 10, p = .60, q = .40
   P(x ≥ 7) = 10C7(.60)⁷(.40)³ + 10C8(.60)⁸(.40)² + 10C9(.60)⁹(.40)¹ + 10C10(.60)¹⁰(.40)⁰
            = .2150 + .1209 + .0403 + .0060 = .3823

d) n = 12, p = .45, q = .55
   P(5 ≤ x ≤ 7) = 12C5(.45)⁵(.55)⁷ + 12C6(.45)⁶(.55)⁶ + 12C7(.45)⁷(.55)⁵
                = .2225 + .2124 + .1489 = .5838
5.6 By Table A.2:
a) n = 20, p = .50: P(x = 12) = .120
b) n = 20, p = .30
c) n = 20, p = .70: .065 + .031 + .012 + .004 + .001 + .000 = .113
d) n = 20, p = .90
e) n = 15, p = .40
f) n = 10, p = .60
5.7
a) n = 20, p = .70, q = .30
   μ = n·p = 20(.70) = 14
   σ = √(n·p·q) = √(20(.70)(.30)) = √4.2 = 2.05

b) n = 70, p = .35, q = .65
   μ = n·p = 70(.35) = 24.5
   σ = √(n·p·q) = √(70(.35)(.65)) = √15.925 = 3.99

c) n = 100, p = .50, q = .50
   μ = n·p = 100(.50) = 50
   σ = √(n·p·q) = √(100(.50)(.50)) = √25 = 5
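The binomial mean and standard deviation formulas used in 5.7 can be sketched as a helper (names are mine):

```python
from math import sqrt

def binom_mean_sd(n, p):
    """mu = n*p and sigma = sqrt(n*p*q) for a binomial distribution."""
    return n * p, sqrt(n * p * (1 - p))

mu_a, sd_a = binom_mean_sd(20, 0.70)    # 14, about 2.05
mu_b, sd_b = binom_mean_sd(70, 0.35)    # 24.5, about 3.99
mu_c, sd_c = binom_mean_sd(100, 0.50)   # 50, 5
```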
5.8
a) n = 6, p = .70
x    Prob
0    .001
1    .010
2    .060
3    .185
4    .324
5    .303
6    .118

b) n = 20, p = .50
x    Prob
0    .000
1    .000
2    .000
3    .001
4    .005
5    .015
6    .037
7    .074
8    .120
9    .160
10   .176
11   .160
12   .120
13   .074
14   .037
15   .015
16   .005
17   .001
18   .000
19   .000
20   .000
c) n = 8, p = .80
x    Prob
0    .000
1    .000
2    .001
3    .009
4    .046
5    .147
6    .294
7    .336
8    .168

5.9
a) n = 20, p = .78, x = 14
   P(x = 14) = 20C14(.78)^14(.22)^6 = .1356

b) n = 20, p = .75, x = 20
   P(x = 20) = 20C20(.75)^20(.25)^0 = .0032

c) n = 20, p = .70, x < 12
5.10  n = 16, p = .40
P(x > 8):
x    Prob
9    .084
10   .039
11   .014
12   .004
13   .001
     .142

P(3 ≤ x ≤ 6):
x    Prob
3    .047
4    .101
5    .162
6    .198
     .508

n = 13, p = .88
P(x = 10) = 13C10(.88)^10(.12)^3 = 286(.278500976)(.001728) = .1376
P(x = 13) = 13C13(.88)^13(.12)^0 = (.88)^13 = .1898
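The exact binomial probability used just above (n = 13, p = .88, x = 10) can be reproduced with a small pmf sketch:

```python
from math import comb

def binom_pmf(n, p, x):
    """Exact binomial probability: nCx * p^x * q^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p10 = binom_pmf(13, 0.88, 10)   # about .1376
p13 = binom_pmf(13, 0.88, 13)   # about .1898
```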
5.11  n = 25, p = .60
a) P(x > 15) = P(x = 15) + P(x = 16) + … + P(x = 25):
x (n = 25, p = .60)    Prob
15    .161
16    .151
17    .120
18    .080
19    .044
20    .020
21    .007
22    .002
      .585

b) P(x > 20) = P(x = 21) + P(x = 22) + P(x = 23) + P(x = 24) + P(x = 25)
             = .007 + .002 + .000 + .000 + .000 = .009

c) P(x < 10):
x     Prob.
9     .009
8     .003
7     .001
<6    .000
      .013
5.12  n = 16, p = .50
P(x > 10) = P(x = 11) + P(x = 12) + … + P(x = 16):
x     Prob.
11    .067
12    .028
13    .009
14    .002
15    .000
16    .000
      .106

For n = 10, p = .87, x = 6:
P(x = 6) = 10C6(.87)⁶(.13)⁴ = 210(.433626)(.00028561) = .0260
5.13  n = 15, p = .20
a) P(x = 5) = 15C5(.20)⁵(.80)¹⁰ = 3003(.00032)(.1073742) = .1032

c) P(x = 0) = 15C0(.20)⁰(.80)¹⁵ = .0352

e)
5.14  n = 18
a) p = .30: μ = 18(.30) = 5.4
   p = .34: μ = 18(.34) = 6.12

b) P(x > 8) for n = 18, p = .30:
x     Prob
8     .081
9     .039
10    .015
11    .005
12    .001
      .141

c) n = 18, p = .34:
   P(2 ≤ x ≤ 4) = 18C2(.34)²(.66)¹⁶ + 18C3(.34)³(.66)¹⁵ + 18C4(.34)⁴(.66)¹⁴
                = .0229 + .0630 + .1217 = .2076

d) n = 18, p = .30, x = 0: 18C0(.30)⁰(.70)¹⁸ = .00163
   n = 18, p = .34, x = 0: 18C0(.34)⁰(.66)¹⁸ = .00056
5.15
a) P(x = 5|λ = 2.3) = (2.3⁵ e^-2.3)/5! = (64.36343)(.100259)/120 = .0538

b) P(x = 2|λ = 3.9) = (3.9² e^-3.9)/2! = (15.21)(.020242)/2 = .1539

c) P(x ≤ 3|λ = 4.1):
   P(x = 3) = (4.1³ e^-4.1)/3! = (68.921)(.016573)/6 = .1904
   P(x = 2) = (4.1² e^-4.1)/2! = (16.81)(.016573)/2 = .1393
   P(x = 1) = (4.1¹ e^-4.1)/1! = (4.1)(.016573)/1 = .0679
   P(x = 0) = (4.1⁰ e^-4.1)/0! = (1)(.016573)/1 = .0166
   P(x ≤ 3) = .1904 + .1393 + .0679 + .0166 = .4142

d) P(x = 0|λ = 2.7) = (2.7⁰ e^-2.7)/0! = (1)(.06721)/1 = .0672

e) P(x = 1|λ = 5.4) = (5.4¹ e^-5.4)/1! = (5.4)(.0045166)/1 = .0244

f) P(4 < x < 8|λ = 4.4):
   P(x = 5) = (4.4⁵ e^-4.4)/5! = (1649.1622)(.01227734)/120 = .1687
   P(x = 6) = (4.4⁶ e^-4.4)/6! = (7256.3139)(.01227734)/720 = .1237
   P(x = 7) = (4.4⁷ e^-4.4)/7! = (31,927.781)(.01227734)/5040 = .0778
   P(4 < x < 8) = .1687 + .1237 + .0778 = .3702
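The Poisson formula worked by hand in 5.15 can be sketched as a pmf helper:

```python
from math import exp, factorial

def poisson_pmf(lam, x):
    """Poisson probability: lambda^x * e^(-lambda) / x!."""
    return lam ** x * exp(-lam) / factorial(x)

p_a = poisson_pmf(2.3, 5)                              # about .0538
p_b = poisson_pmf(3.9, 2)                              # about .1539
p_c = sum(poisson_pmf(4.1, x) for x in range(4))       # about .4142
```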
b) P(x>7 = 2.9):
Prob
.0068
.0022
10
.0006
11
.0002
12
.0000
.0098
Prob
.1852
.1944
5
6
7
.1633
.1143
.0686
.0360
.0168
.7786
Prob
.0550
.1596
.2314
.2237
.1622
.0940
6
.0455
.9714
Prob
6
7
.1594
.1298
.0925
.3817
5.17
a) λ = 6.3: mean = 6.3, standard deviation = √6.3 = 2.51
x     Prob
0     .0018
1     .0116
2     .0364
3     .0765
4     .1205
5     .1519
6     .1595
7     .1435
8     .1130
9     .0791
10    .0498
11    .0285
12    .0150
13    .0073
14    .0033
15    .0014
16    .0005
17    .0002
18    .0001
19    .0000

b) λ = 1.3: mean = 1.3, standard deviation = √1.3 = 1.14
x     Prob
0     .2725
1     .3542
2     .2303
3     .0998
4     .0324
5     .0084
6     .0018
7     .0003
8     .0001
9     .0000

c) λ = 8.9: mean = 8.9, standard deviation = √8.9 = 2.98
x     Prob
0     .0001
1     .0012
2     .0054
3     .0160
4     .0357
5     .0635
6     .0941
7     .1197
8     .1332
9     .1317
10    .1172
11    .0948
12    .0703
13    .0481
14    .0306
15    .0182
16    .0101
17    .0053
18    .0026
19    .0012
20    .0005
21    .0002
22    .0001

d) λ = 0.6: mean = 0.6, standard deviation = √0.6 = .775
x     Prob
0     .5488
1     .3293
2     .0988
3     .0198
4     .0030
5     .0004
6     .0000
5.18  λ = 2.8|4 minutes
a) P(x = 6|λ = 2.8) = .0407
b) P(x = 0|λ = 2.8) = .0608 (from Table A.3)
c) P(x > 4|λ = 2.8):
x     Prob.
5     .0872
6     .0407
7     .0163
8     .0057
9     .0018
10    .0005
11    .0001
      .1523
15.23% of the time a second window will need to be opened.
P(x ≥ 5|λ = 5.6):
x     Prob.
5     .1697
6     .1584
7     .1267
8     .0887
9     .0552
10    .0309
11    .0157
12    .0073
13    .0032
14    .0013
15    .0005
16    .0002
17    .0001
      .6579
5.19
a) P(x = 0) = .0302
λ = 7.0|10 minutes

e) P(x = 8|15 minutes): λ = 10.5|15 minutes
   P(x = 8|λ = 10.5) = (10.5⁸ e^-10.5)/8! = .1009

5.20
b) P(x=6
= 5.6):
Prob.
15
.0005
16
.0002
17
.0001
.0008
Lambda has changed because of an overall increase in pollution.
5.21
a) P(x = 0|λ = 0.6) = .5488
b) P(x = 1|λ = 0.6) = .3293
c) P(x ≥ 2|λ = 0.6):
x     Prob.
2     .0988
3     .0198
4     .0030
5     .0004
6     .0000
      .1220
d) P(x ≤ 3|λ = 1.8):
x     Prob.
0     .1653
1     .2975
2     .2678
3     .1607
      .8913
e) P(x = 4|6 years): λ = 3.6|6 years
   P(x = 4|λ = 3.6) = (3.6⁴ e^-3.6)/4! = .1912
5.22
a) P(x=0 = 1.2):
b) P(x=2 2 months):
P(x=2
= 0.6):
Prob.
.1653
.2975
.4628
The result is likely to happen almost half the time (46.26%). Ship channel and weather conditions are about normal for this period. Safety awareness is about normal for this period. There is no compelling reason to reject the lambda value of 0.6 collisions per 4 months based on an outcome of 0 or 1 collisions per 6 months.
5.23
a) P(x = 0|λ = 1.2) = .3012
b) P(x ≥ 4|λ = 1.2):
x     Prob.
4     .0260
5     .0062
6     .0012
7     .0002
8     .0000
      .0336
5.24  n = 100,000, p = .00004
λ = n·p = 100,000(.00004) = 4.0

a) P(x ≥ 7|λ = 4.0):
x     Prob.
7     .0595
8     .0298
9     .0132
10    .0053
11    .0019
12    .0006
13    .0002
14    .0001
      .1106

b) P(x > 10|λ = 4.0):
x     Prob.
11    .0019
12    .0006
13    .0002
14    .0001
      .0028
5.25  p = .009, n = 200
λ = n·p = 200(.009) = 1.8
c) P(x = 0) = .1653
5.26
If 99% see a doctor, then 1% do not see a doctor. Thus, p = .01 for this problem.
n = 300, p = .01, λ = n(p) = 300(.01) = 3
a) P(x = 5): using λ = 3 and Table A.3, P(x = 5) = .1008
5.27
a) P(x = 3|N = 11, A = 8, n = 4) = (8C3 · 3C1)/11C4 = (56)(3)/330 = .5091

b) P(x < 2|N = 15, A = 5, n = 6) = P(x = 1) + P(x = 0)
   = (5C1 · 10C5)/15C6 + (5C0 · 10C6)/15C6 = (5)(252)/5005 + (1)(210)/5005
   = .2517 + .0420 = .2937

c) P(x = 0|N = 9, A = 2, n = 3) = (2C0 · 7C3)/9C3 = (1)(35)/84 = .4167

d) P(x > 4|N = 20, A = 5, n = 7) = P(x = 5) + P(x = 6) + P(x = 7)
   = (5C5 · 15C2)/20C7 + (5C6 · 15C1)/20C7 + (5C7 · 15C0)/20C7
   = (1)(105)/77,520 + .0000 + .0000 = .0014
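The hypergeometric probabilities in 5.27 can be reproduced with a small pmf sketch (function name is mine):

```python
from math import comb

def hypergeom_pmf(N, A, n, x):
    """P(x successes) when drawing n without replacement from N items, A of which are successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

p_a = hypergeom_pmf(11, 8, 4, 3)   # (8C3 * 3C1)/11C4, about .5091
p_c = hypergeom_pmf(9, 2, 3, 0)    # (2C0 * 7C3)/9C3, about .4167
```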
5.28  N = 19, n = 6, A = 11
a) P(x = 1 private) = (11C1 · 8C5)/19C6 = (11)(56)/27,132 = .0227
b) P(x = 4 private) = (11C4 · 8C2)/19C6 = (330)(28)/27,132 = .3406
c) P(x = 6 private) = (11C6 · 8C0)/19C6 = (462)(1)/27,132 = .0170
d) P(x = 0 private) = (11C0 · 8C6)/19C6 = (1)(28)/27,132 = .0010
5.29  N = 17, A = 8, n = 4
a) P(x = 0) = (8C0 · 9C4)/17C4 = (1)(126)/2380 = .0529
b) P(x = 4) = (8C4 · 9C0)/17C4 = (70)(1)/2380 = .0294
c) P(x = 2) = (8C2 · 9C2)/17C4 = (28)(36)/2380 = .4235

5.30  N = 20, A = 16 white, N − A = 4 red, n = 5
a) P(x = 4 white) = (16C4 · 4C1)/20C5 = (1820)(4)/15,504 = .4696
b) P(x = 4 red) = (4C4 · 16C1)/20C5 = (1)(16)/15,504 = .0010
c) P(x = 5 red) = (4C5 · 16C0)/20C5 = .0000, because 4C5 is impossible to determine. The participant cannot draw 5 red beads if there are only 4 to draw from.
P a g e | 267
5.31  N = 10, n = 4
a) A = 3, x = 2: P(x = 2) = (3C2 · 7C2)/10C4 = (3)(21)/210 = .30
b) A = 5, x = 0: P(x = 0) = (5C0 · 5C4)/10C4 = (1)(5)/210 = .0238
c) A = 5, x = 3: P(x = 3) = (5C3 · 5C1)/10C4 = (10)(5)/210 = .2381

5.32  N = 16
A = 4 defective, n = 3
a) P(x = 0) = (4C0 · 12C3)/16C3 = (1)(220)/560 = .3929
b) P(x = 3) = (4C3 · 12C0)/16C3 = (4)(1)/560 = .0071
c) P(x = 2) = (4C2 · 12C1)/16C3 = (6)(12)/560 = .1286
d) P(x < 2) = P(x = 1) + P(x = 0) = (4C1 · 12C2)/16C3 + .3929 (from part a)
            = (4)(66)/560 + .3929 = .4714 + .3929 = .8643
P a g e | 269
5.33  N = 18, A = 11 Hispanic, n = 5
P(x < 2) = P(x = 1) + P(x = 0) = (11C1 · 7C4)/18C5 + (11C0 · 7C5)/18C5
         = (11)(35)/8568 + (1)(21)/8568 = .0449 + .0025 = .0474
5.34
9C8(.85)⁸(.15)¹ + 9C9(.85)⁹(.15)⁰ = .3679 + .2316 = .5995

14C3(.70)³(.30)¹¹ + 14C2(.70)²(.30)¹² + 14C1(.70)¹(.30)¹³ + 14C0(.70)⁰(.30)¹⁴
= (364)(.3430)(.00000177) + (91)(.49)(.000000531) + (14)(.70)(.00000016) + (1)(1)(.000000048)
= .0002
5.35
Prob.
.028
.121
.233
.267
.200
.849
Prob.
12
13
.063
.022
14
.005
15
.000
.090
Prob.
21
.000
22
.000
23
24
.000
.000
25
.000
.000
5.36
a) P(x = 4|λ = 1.25) = (1.25⁴ e^-1.25)/4! = (2.4414)(.2865)/24 = .0291

b) P(x ≤ 1|λ = 6.37) = P(x = 1) + P(x = 0)
   = (6.37¹ e^-6.37)/1! + (6.37⁰ e^-6.37)/0! = (6.37)(.0017)/1 + (1)(.0017)/1
   = .0108 + .0017 = .0125

c) P(x > 5|λ = 2.4) = (2.4⁶ e^-2.4)/6! + (2.4⁷ e^-2.4)/7! + (2.4⁸ e^-2.4)/8! + (2.4⁹ e^-2.4)/9! + (2.4¹⁰ e^-2.4)/10! + …
5.37
Prob.
.0369
.1217
.2008
.2209
.1823
.7626
Prob.
.1890
.0992
.0417
.0146
.0044
.0011
9
10
.0003
.0001
11
.0000
.3504
Prob.
3
4
5
.1852
.1944
.1633
.5429
5.38
a) P(x = 3|N = 6, n = 4, A = 5) = (5C3 · 1C1)/6C4 = (10)(1)/15 = .6667

b) P(x < 2|N = 10, A = 5, n = 3) = P(x = 1) + P(x = 0)
   = (5C1 · 5C2)/10C3 + (5C0 · 5C3)/10C3 = (5)(10)/120 + (1)(10)/120
   = .4167 + .0833 = .5000

c) P(x ≥ 2|N = 13, A = 3, n = 5) = P(x = 2) + P(x = 3)
   = (3C2 · 10C3)/13C5 + (3C3 · 10C2)/13C5 = (3)(120)/1287 + (1)(45)/1287
   = .2797 + .0350 = .3147
5.39
n = 25 p = .20 retired
P(x = 8) = .180
P(x = 0) = .000
P(x > 12) = P(x = 12) + P(x = 13) + . . . + P(x = 20) = .035 + .015 + .
005 + .001 = .056
x=8
5.40
a)
Prob.
.0176
.0047
7
.0011
.0002
.0236
5.41  N = 32, A = 10, n = 12
a) P(x = 3) = (10C3 · 22C9)/32C12 = (120)(497,420)/225,792,840 = .2644
b) P(x = 6) = (10C6 · 22C6)/32C12 = (210)(74,613)/225,792,840 = .0694
c) P(x = 0) = (10C0 · 22C12)/32C12 = (1)(646,646)/225,792,840 = .0029
d) A = 22:
   P(7 ≤ x ≤ 9) = (22C7 · 10C5)/32C12 + (22C8 · 10C4)/32C12 + (22C9 · 10C3)/32C12
                = .1903 + .2974 + .2644 = .7521
5.42
accepts
Prob.
.2466
.3452
.2417
.1128
.9463
If x < 3, buyer
5.43
a) n = 20, p = .25
   P(x ≤ 1) = P(x = 1) + P(x = 0) = 20C1(.25)¹(.75)¹⁹ + 20C0(.25)⁰(.75)²⁰ = .0243
5.44
Prob.
.233
.121
10
.028
.382
b) n = 15, p = 1/3
   P(x = 0) = 15C0(1/3)⁰(2/3)¹⁵ = .0023

c) n = 7, p = .53
   P(x = 7) = 7C7(.53)⁷(.47)⁰ = .0117
   Probably the 53% figure is too low for this population, since the probability of this occurrence is so low (.0117).
5.45  n = 12
a) p = .20: P(x = 0) = 12C0(.20)⁰(.80)¹² = .0687
b) p = .25: P(x = 5) = 12C5(.25)⁵(.75)⁷ = .1032
5.46
n = 100,000
p = .000014
Worked as a Poisson:
a) P(x = 5):
= np = 100,000(.000014) = 1.4
b) P(x = 0):
Prob
.0005
.0001
.0006
5.47
Prob.
.001
.008
.041
.124
.174
5.48
n = 25
p = .20
.062
x     Prob.
11    .004
12    .001
13    .000
      .005

c) Since such a result would only occur 0.5% of the time by chance, it is likely that the analyst's list was not representative of the entire state of Idaho, or the 20% figure for the Idaho census is not correct.
5.49
Prob.
.0198
.0030
.0004
.0232
Assume one trip is independent of the other. Let F = flat tire and NF = no flat tire.
P(NF1 ∩ NF2) = P(NF1)·P(NF2)
5.50  N = 25, n = 8
a) P(x = 1 in NY), A = 4:
   (4C1 · 21C7)/25C8 = (4)(116,280)/1,081,575 = .4300
b) A = 10:
   (10C4 · 15C4)/25C8 = (210)(1365)/1,081,575 = .2650
c) P(x = 0 in California), A = 5:
   (5C0 · 20C8)/25C8 = (1)(125,970)/1,081,575 = .1165
d) P(x = 3 with M), A = 3:
   (3C3 · 22C5)/25C8 = (1)(26,334)/1,081,575 = .0243
5.51  N = 24, n = 6
a) A = 8: P(x = 6) = (8C6 · 16C0)/24C6 = (28)(1)/134,596 = .0002
b) P(x = 0) = (8C0 · 16C6)/24C6 = (1)(8008)/134,596 = .0595
d) A = 16 East Side:
   P(x = 3) = (16C3 · 8C3)/24C6 = (560)(56)/134,596 = .2330

5.52  n = 25, p = .20
μ = n·p = 25(.20) = 5
σ = √(n·p·q) = √(25(.20)(.80)) = 2
x     Prob.
13    .0000
The values for x > 12 are so far away from the expected value that they are very unlikely to occur.

P(x = 14) = 25C14(.20)¹⁴(.80)¹¹
If this value (x = 14) actually occurred, one would doubt the validity of the p = .20 figure, or one would have experienced a very rare event.
5.53
a) P(x ≥ 6|λ = 2.4):
x     Prob.
6     .0241
7     .0083
8     .0025
9     .0007
10    .0002
11    .0000
      .0358
b) P(x ≤ 1|λ = 0.6):
   P(x = 1) = .3293
   P(x = 0) = .5488
   P(x ≤ 1) = .8781
5.54
n = 160
p = .01
Prob.
.0002
.0000
.0002
Prob.
.2584
.1378
.0551
5
.0176
.0047
.4736
5.55  p = .005, n = 1,000
λ = n·p = (1,000)(.005) = 5
c) P(x = 0) = .0067
5.56
n=8
p = .36
x = 0 women
5.57  N = 34
a) n = 5, x = 3, A = 13:
   P(x = 3) = (13C3 · 21C2)/34C5 = (286)(210)/278,256 = .2158

b) n = 8, A = 5:
   P(x ≤ 2) = (5C0 · 29C8)/34C8 + (5C1 · 29C7)/34C8 + (5C2 · 29C6)/34C8
            = 4,292,145/18,156,204 + 7,803,900/18,156,204 + 4,750,200/18,156,204
            = .2364 + .4298 + .2616 = .9278

c) n = 5, x = 2, A = 3
5.58  N = 14, n = 4
a) A = 10: P(x = 4) = (10C4 · 4C0)/14C4 = (210)(1)/1001 = .2098
b) A = 4: P(x = 4) = (4C4 · 10C0)/14C4 = (1)(1)/1001 = .0010
c) A = 4: P(x = 2) = (4C2 · 10C2)/14C4 = (6)(45)/1001 = .2697
5.59
a) λ = 3.84|1,000:
   P(x = 0) = (3.84⁰ e^-3.84)/0! = e^-3.84 = .0215
b) λ = 7.68|2,000:
   P(x = 6) = (7.68⁶ e^-7.68)/6! = .1317
5.60
μ = n·p = 15(.36) = 5.4
σ = √(n·p·q) = √(15(.36)(.64)) = 1.86
The most likely values are near the mean, 5.4. Note from the printout that the most probable values are at x = 5 and x = 6, which are near the mean.
5.61
This printout contains the probabilities for various values of x from zero to eleven from a Poisson distribution with λ = 2.78. Note that the highest probabilities are at x = 2 and x = 3, which are near the mean. The probability is slightly higher at x = 2 than at x = 3, even though x = 3 is nearer to the mean, because of the piling-up effect of x = 0.
5.62
σ = √(n·p·q) = √(22(.64)(.36)) = 2.25
5.63
high
Chapter 6
Continuous Distributions
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
Chapter 5 introduced the students to discrete distributions. This
chapter introduces the students to three continuous distributions: the uniform
distribution, the normal distribution and the exponential distribution. The
normal distribution is probably the most widely known and used distribution.
The text has been prepared with the notion that the student should be able to
work many varied types of normal curve problems. Examples and practice
problems are given wherein the student is asked to solve for virtually any of
the four variables in the z equation. It is very helpful for the student to get
into the habit of constructing a normal curve diagram, with a shaded portion
for the desired area of concern for each problem using the normal
distribution. Many students tend to be more visual learners than auditory, and these diagrams will be of great assistance in problem demonstration and in problem solution.
This chapter contains a section dealing with the solution of binomial distribution
problems by the normal curve. The correction for continuity is emphasized. In this text,
the correction for continuity is always used whenever a binomial distribution problem is
worked by the normal curve. Since this is often a stumbling block for students to
comprehend, the chapter has included a table (Table 6.4) with rules of thumb as to how to
apply the correction for continuity. It should be emphasized, however, that answers for
this type of problem are still only approximations. For this reason and also in an effort to
link chapters 5 & 6, the student is sometimes asked to work binomial problems both by
methods in this chapter and also by using binomial tables (A.2). This also will allow the
student to observe how good the approximation of the normal curve is to binomial
problems.
The exponential distribution can be taught as a continuous distribution,
which can be used in complement with the Poisson distribution of chapter 5
to solve inter-arrival time problems. The student can see that while the
Poisson distribution is discrete because it describes the probabilities of whole
number possibilities per some interval, the exponential distribution describes
the probabilities associated with times that are continuously distributed.
CHAPTER OUTLINE
6.1
Normal Distribution
History of the Normal Distribution
Probability Density Function of the Normal Distribution
Standardized Normal Distribution
Solving Normal Curve Problems
Using the Computer to Solve for Normal Distribution Probabilities
6.3
6.4
Exponential Distribution
Probabilities of the Exponential Distribution
Using the Computer to Determine Exponential Distribution
Probabilities
KEY TERMS
Exponential Distribution
Uniform Distribution
Normal Distribution
z Distribution
Rectangular Distribution
z Score
6.1 a = 200, b = 240

a) f(x) = 1/(b - a) = 1/(240 - 200) = 1/40 = .025

b) μ = (a + b)/2 = (200 + 240)/2 = 220
   σ = (b - a)/√12 = (240 - 200)/√12 = 40/√12 = 11.547

c) P(x > 230) = (240 - 230)/(240 - 200) = 10/40 = .250

d) P(205 < x < 220) = (220 - 205)/(240 - 200) = 15/40 = .375

e) P(x < 225) = (225 - 200)/(240 - 200) = 25/40 = .625
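Computations like those in 6.1 can be checked numerically; the following sketch uses the endpoints of this problem:

```python
import math

a, b = 200, 240  # endpoints from problem 6.1

height = 1 / (b - a)             # f(x) = .025
mu = (a + b) / 2                 # 220
sigma = (b - a) / math.sqrt(12)  # 11.547

def uniform_prob(lo, hi, a, b):
    # P(lo < x < hi) for a uniform distribution on [a, b]: the
    # proportion of the interval's length that [lo, hi] covers.
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

print(height, mu, round(sigma, 3), uniform_prob(230, 240, a, b))
```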
6.2 a = 8, b = 21

a) f(x) = 1/(b - a) = 1/(21 - 8) = 1/13 = .0769

b) μ = (a + b)/2 = (8 + 21)/2 = 29/2 = 14.5
   σ = (b - a)/√12 = (21 - 8)/√12 = 13/√12 = 3.7528

c) P(10 < x < 17) = (17 - 10)/(21 - 8) = 7/13 = .5385
6.3 a = 2.80, b = 3.14

   μ = (a + b)/2 = (2.80 + 3.14)/2 = 2.97
   σ = (b - a)/√12 = (3.14 - 2.80)/√12 = 0.098

   P(3.00 < x < 3.10) = (3.10 - 3.00)/(3.14 - 2.80) = .10/.34 = 0.2941

6.4 a = 11.97, b = 12.03

   Height = 1/(b - a) = 1/(12.03 - 11.97) = 16.667

   P(12.01 < x < 12.03) = (12.03 - 12.01)/(12.03 - 11.97) = .02/.06 = .3333

   P(11.98 < x < 12.01) = (12.01 - 11.98)/(12.03 - 11.97) = .03/.06 = .5000
6.5 μ = 2100, a = 400, b = 3800

   σ = (b - a)/√12 = (3800 - 400)/√12 = 981.5

   Height = 1/(b - a) = 1/(3800 - 400) = .000294

   P(700 < x < 1500) = (1500 - 700)/(3800 - 400) = 800/3400 = .2353
6.6 (areas from Table A.5)

   .4750

b) P(z < 0.73): .2673

   .4977, .4279, .4962, .3599

   Table A.5 value for z = -0.87: .3078
6.7 a) P(x < 635 | μ = 604, σ = 56.8):  z = (635 - 604)/56.8 = 0.55;  area = .2088
   P(x < 635) = .5000 + .2088 = .7088

b) P(x < 20 | μ = 48, σ = 12):  z = (20 - 48)/12 = -2.33;  area = .4901
   P(x < 20) = .5000 - .4901 = .0099

c) P(100 < x < 150 | μ = 111, σ = 33.8):
   z = (150 - 111)/33.8 = 1.15, area = .3749;  z = (100 - 111)/33.8 = -0.33, area = .1293
   P(100 < x < 150) = .3749 + .1293 = .5042

d) P(250 < x < 255 | μ = 264, σ = 10.9):
   z = (250 - 264)/10.9 = -1.28, area = .3997;  z = (255 - 264)/10.9 = -0.83, area = .2967
   P(250 < x < 255) = .3997 - .2967 = .1030

e) P(x > 35 | μ = 37, σ = 4.35):  z = (35 - 37)/4.35 = -0.46;  area = .1772
   P(x > 35) = .1772 + .5000 = .6772

f) P(x > 170 | μ = 156, σ = 11.4):  z = (170 - 156)/11.4 = 1.23;  area = .3907
   P(x > 170) = .5000 - .3907 = .1093
6.8 μ = 22, σ = 4

a) P(x > 17):  z = (17 - 22)/4 = -1.25;  area = .3944
   P(x > 17) = .3944 + .5000 = .8944

b) P(x < 13):  z = (13 - 22)/4 = -2.25;  area = .4878
   P(x < 13) = .5000 - .4878 = .0122

c) P(25 < x < 31):
   z = (31 - 22)/4 = 2.25, area = .4878;  z = (25 - 22)/4 = 0.75, area = .2734
   P(25 < x < 31) = .4878 - .2734 = .2144
6.9 μ = 60, σ = 11.35

a) P(x > 85):  z = (85 - 60)/11.35 = 2.20;  area = .4861
   P(x > 85) = .5000 - .4861 = .0139

b) P(45 < x < 70):
   z = (45 - 60)/11.35 = -1.32, area = .4066;  z = (70 - 60)/11.35 = 0.88, area = .3106
   P(45 < x < 70) = .4066 + .3106 = .7172

c) P(65 < x < 75):
   z = (65 - 60)/11.35 = 0.44, area = .1700;  z = (75 - 60)/11.35 = 1.32, area = .4066
   P(65 < x < 75) = .4066 - .1700 = .2366

d) P(x < 40):  z = (40 - 60)/11.35 = -1.76;  area = .4608
   P(x < 40) = .5000 - .4608 = .0392
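Table A.5 look-ups such as those in 6.8 and 6.9 can be verified with Python's standard library (a sketch using the parameters of 6.9; tiny differences from the table answers are rounding in the table):

```python
from statistics import NormalDist

dist = NormalDist(mu=60, sigma=11.35)  # parameters from problem 6.9

p_gt_85 = 1 - dist.cdf(85)             # P(x > 85)
p_45_70 = dist.cdf(70) - dist.cdf(45)  # P(45 < x < 70)

print(round(p_gt_85, 4), round(p_45_70, 4))
```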
6.10 μ = $1,332, σ = $725

a) P(x > $2,000):  z = (2000 - 1332)/725 = 0.92;  area = .3212
   P(x > $2,000) = .5000 - .3212 = .1788

b) P(x < $0):  z = (0 - 1332)/725 = -1.84;  area = .4671
   P(x < $0) = .5000 - .4671 = .0329

c) P($100 < x < $700):
   z = (100 - 1332)/725 = -1.70, area = .4554;  z = (700 - 1332)/725 = -0.87, area = .3078
   P($100 < x < $700) = .4554 - .3078 = .1476

6.11 μ = $30,000, σ = $9,000
a) P(x > $45,000):  z = (45,000 - 30,000)/9,000 = 1.67;  area = .4525
   P(x > $45,000) = .5000 - .4525 = .0475

b) P(x < $15,000):  z = (15,000 - 30,000)/9,000 = -1.67;  area = .4525
   P(x < $15,000) = .5000 - .4525 = .0475

   P(x > $50,000):  z = (50,000 - 30,000)/9,000 = 2.22;  area = .4868
   P(x > $50,000) = .5000 - .4868 = .0132

c) P($5,000 < x < $20,000):
   z = (5,000 - 30,000)/9,000 = -2.78, area = .4973;  z = (20,000 - 30,000)/9,000 = -1.11, area = .3665
   P($5,000 < x < $20,000) = .4973 - .3665 = .1308

d) Since 90.82% of the values are greater than x = $7,000, x = $7,000 is in the
   lower half of the distribution and .9082 - .5000 = .4082 lie between x and μ.
   From Table A.5, z = -1.33.

   Solving for σ:  z = (x - μ)/σ

   -1.33 = (7,000 - 30,000)/σ

   σ = 23,000/1.33 = 17,293.23

e) x = $33,000 is in the upper half of the distribution and .7995 - .5000 = .2995 of the
   values lie between x and the mean. From Table A.5, z = 0.84.

   Solving for μ:  z = (x - μ)/σ

   0.84 = (33,000 - μ)/9,000

   μ = 33,000 - 0.84(9,000) = $25,440
6.12 μ = 200, σ = 47

a) Determine x:

   Since 50% of the values are greater than the mean, μ = 200, 10% or .1000 lie
   between x and the mean. From Table A.5, the z value associated with an area
   of .1000 is z = -0.25. The z value is negative since x is below the mean.
   Substituting z = -0.25, μ = 200, and σ = 47 into the formula and solving for x:

   z = (x - μ)/σ

   -0.25 = (x - 200)/47

   x = 188.25

b) Since x is only less than 17% of the values, 33% (.5000 - .1700) or .3300 lie
   between x and the mean. Table A.5 yields a z value of 0.95 for an area of
   .3300. Using this z = 0.95, μ = 200, and σ = 47, x can be solved for:

   z = (x - μ)/σ

   0.95 = (x - 200)/47

   x = 244.65

c) Since 22% of the values lie below x, 28% lie between x and the mean
   (.5000 - .2200). Table A.5 yields a z of -0.77 for an area of .2800.
   Using the z value of -0.77, μ = 200, and σ = 47, x can be solved for:

   z = (x - μ)/σ

   -0.77 = (x - 200)/47

   x = 163.81

d) z = (x - μ)/σ

   0.13 = (x - 200)/47

   x = 206.11
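The "work backward from a probability to x" pattern of 6.12 corresponds to the inverse normal cdf. A sketch (note that the exact z for the 40th percentile is -0.2533, so software returns 188.09 rather than the table-rounded 188.25):

```python
from statistics import NormalDist

# Problem 6.12 a): mu = 200, sigma = 47. Find x so that 60% of the
# values are greater than x, i.e. x is the 40th percentile.
dist = NormalDist(mu=200, sigma=47)
x = dist.inv_cdf(0.40)
print(round(x, 2))
```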
6.13 a) Solving for μ:  z = (x - μ)/σ

   -0.64 = (1700 - μ)/625

   μ = 1700 + 0.64(625) = 2100

b) Solving for x:  z = (x - μ)/σ

   0.48 = (x - 2258)/625

   x = 2258 + 0.48(625) = 2558
6.14 μ = 22, σ = ??

   Since 72.4% of the values are greater than 18.5, then 22.4% lie
   between 18.5 and μ. x = 18.5 is below the mean. From Table A.5, z = -0.59.

   -0.59 = (18.5 - 22)/σ

   -0.59σ = -3.5

   σ = 3.5/0.59 = 5.932
6.15 σ = 4

   x = 20; .2100 of the area lies between x and μ. The z score associated
   with this area is -0.55.

   Solving for μ:  z = (x - μ)/σ

   -0.55 = (20 - μ)/4

   μ = 20 + 0.55(4) = 22.20
6.16 μ = 9.7

   Since 22.45% are greater than 11.6, x = 11.6 is in the upper half of the
   distribution and .2755 (.5000 - .2245) lie between x and the mean.
   Table A.5 yields a z = 0.76 for an area of .2755.

   Solving for σ:  z = (x - μ)/σ

   0.76 = (11.6 - 9.7)/σ

   σ = 1.9/0.76 = 2.5
6.17 a) μ = np = 30(.70) = 21
   σ = √(n·p·q) = √(30(.70)(.30)) = 2.51

b) μ = np = 25(.50) = 12.5
   σ = √(25(.50)(.50)) = 2.5

c) μ = np = 40(.60) = 24
   σ = √(40(.60)(.40)) = 3.10

d) μ = np = 16(.45) = 7.2
   σ = √(16(.45)(.55)) = 1.99
6.18 a) n = 8 and p = .50

   μ = np = 8(.50) = 4;  σ = √(n·p·q) = √(8(.50)(.50)) = 1.414

   μ ± 3σ = 4 ± 3(1.414) = 4 ± 4.242

   (-0.242 to 8.242) does not lie between 0 and 8, so the normal curve is not
   a good approximation.

b) n = 18 and p = .80

   μ = np = 18(.80) = 14.4;  σ = √(18(.80)(.20)) = 1.697

   μ ± 3σ = 14.4 ± 3(1.697) = 14.4 ± 5.091

   (9.309 to 19.491) does not lie between 0 and 18, so the normal curve is not
   a good approximation.

c) n = 12 and p = .30

   μ = np = 12(.30) = 3.6;  σ = √(12(.30)(.70)) = 1.587

   μ ± 3σ = 3.6 ± 3(1.587) = 3.6 ± 4.761

   (-1.161 to 8.361) does not lie between 0 and 12, so the normal curve is not
   a good approximation.

d) n = 30 and p = .75

   μ = np = 30(.75) = 22.5;  σ = √(30(.75)(.25)) = 2.372

   μ ± 3σ = 22.5 ± 3(2.372) = 22.5 ± 7.116

   (15.384 to 29.616) lies between 0 and 30, so the normal curve is a
   reasonable approximation.

e) n = 14 and p = .50

   μ = np = 14(.50) = 7;  σ = √(14(.50)(.50)) = 1.87

   μ ± 3σ = 7 ± 3(1.87) = 7 ± 5.61

   (1.39 to 12.61) lies between 0 and 14, so the normal curve is a
   reasonable approximation.
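The μ ± 3σ screening used in 6.18 can be written as a small helper (a sketch; the rule itself is exactly as stated in the exercise):

```python
import math

def normal_approx_ok(n, p):
    # Rule of thumb: the normal curve is a reasonable approximation
    # to the binomial when mu +/- 3 sigma lies entirely within [0, n].
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return 0 <= mu - 3 * sigma and mu + 3 * sigma <= n

# The five cases of problem 6.18:
for n, p in [(8, .50), (18, .80), (12, .30), (30, .75), (14, .50)]:
    print(n, p, normal_approx_ok(n, p))
```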
6.19 a) n = 25, p = .40:  μ = np = 25(.40) = 10

   σ = √(n·p·q) = √(25(.40)(.60)) = 2.449

   μ ± 3σ = 10 ± 3(2.449) = 10 ± 7.347

   (2.653 to 17.347) lies between 0 and 25; the normal curve approximation
   is sufficient.

   P(x = 8): correcting for continuity, 7.5 to 8.5
   z = (7.5 - 10)/2.449 = -1.02, area = .3461;  z = (8.5 - 10)/2.449 = -0.61, area = .2291
   P(x = 8) = .3461 - .2291 = .1170

b) n = 20, p = .60:  μ = np = 20(.60) = 12

   σ = √(20(.60)(.40)) = 2.19

   μ ± 3σ = 12 ± 3(2.19) = 12 ± 6.57

   Correcting for continuity: x = 12.5
   z = (12.5 - 12)/2.19 = 0.23, area = .0910

c) n = 15, p = .50:  μ = np = 15(.50) = 7.5

   σ = √(15(.50)(.50)) = 1.9365

   Approximation by the normal curve is sufficient.

   Correcting for continuity: x = 6.5
   z = (6.5 - 7.5)/1.9365 = -0.52, area = .1985

d) n = 10, p = .70:  μ = np = 10(.70) = 7

   σ = √(10(.70)(.30)) = 1.449

   μ ± 3σ = 7 ± 3(1.449) = 7 ± 4.347

   (2.653 to 11.347) does not lie between 0 and 10, so the normal curve is
   not a good approximation here.
6.20 n = 120, p = .37

   μ = np = 120(.37) = 44.4

   σ = √(n·p·q) = √(120(.37)(.63)) = 5.29

   Correcting for continuity: x = 39.5

   z = (39.5 - 44.4)/5.29 = -0.93;  from Table A.5, area = .3238
6.21 n = 70, p = .59

   μ = np = 70(.59) = 41.3

   σ = √(n·p·q) = √(70(.59)(.41)) = 4.115

   Correcting for continuity: x = 34.5

   z = (34.5 - 41.3)/4.115 = -1.65;  from Table A.5, area = .4505
6.22 a) p = .53, n = 300

   μ = np = 300(.53) = 159

   σ = √(n·p·q) = √(300(.53)(.47)) = 8.645

   Test: μ ± 3σ = 159 ± 3(8.645) = 159 ± 25.935

   (133.065 to 184.935) lies between 0 and 300. It is okay to use the normal
   distribution as an approximation on parts a) and b).

   Correcting for continuity: x = 175.5
   z = (175.5 - 159)/8.645 = 1.91, area = .4719
   P = .5000 - .4719 = .0281

b) Correcting for continuity: x = 170.5 and x = 164.5
   z = (170.5 - 159)/8.645 = 1.33, area = .4082;  z = (164.5 - 159)/8.645 = 0.64, area = .2389
   P = .4082 - .2389 = .1693

c) p = .60, n = 300

   μ = np = 300(.60) = 180;  σ = √(300(.60)(.40)) = 8.485

   Test: μ ± 3σ = 180 ± 3(8.485) = 180 ± 25.455

   (154.545 to 205.455) lies between 0 and 300, so the approximation may be used.

   z = (170.5 - 180)/8.485 = -1.12, area = .3686;  z = (154.5 - 180)/8.485 = -3.01, area = .4987
   P = .4987 - .3686 = .1301

d) z = (199.5 - 180)/8.485 = 2.30, area = .4893
   P = .5000 - .4893 = .0107
6.23 n = 130, p = .25

   μ = np = 130(.25) = 32.5

   σ = √(n·p·q) = √(130(.25)(.75)) = 4.94

a) Correcting for continuity: x = 36.5
   z = (36.5 - 32.5)/4.94 = 0.81, area = .2910
   P = .5000 - .2910 = .2090

b) Correcting for continuity: x = 25.5 and x = 35.5
   z = (25.5 - 32.5)/4.94 = -1.42, area = .4222;  z = (35.5 - 32.5)/4.94 = 0.61, area = .2291
   P = .4222 + .2291 = .6513

c) Correcting for continuity: x = 19.5
   z = (19.5 - 32.5)/4.94 = -2.63, area = .4957
   P = .5000 - .4957 = .0043

d) P(x = 30): correcting for continuity, 29.5 to 30.5
   z = (29.5 - 32.5)/4.94 = -0.61, area = .2291;  z = (30.5 - 32.5)/4.94 = -0.40, area = .1554
   P(x = 30) = .2291 - .1554 = .0737
6.24 n = 95

a) p = .52

   μ = np = 95(.52) = 49.4;  σ = √(n·p·q) = √(95(.52)(.48)) = 4.87

   test passed: μ ± 3σ lies between 0 and 95

   Correcting for continuity: x = 43.5
   z = (43.5 - 49.4)/4.87 = -1.21, area = .3869

b) Correcting for continuity: x = 52.5 and x = 56.5
   z = (52.5 - 49.4)/4.87 = 0.64, area = .2389;  z = (56.5 - 49.4)/4.87 = 1.46
   from Table A.5, area = .4279
   P = .4279 - .2389 = .1890

c) Joint Venture: p = .70, n = 95

   μ = np = 95(.70) = 66.5;  σ = √(95(.70)(.30)) = 4.47

   test passed: μ ± 3σ lies between 0 and 95

   Correcting for continuity: x = 59.5
   z = (59.5 - 66.5)/4.47 = -1.57, area = .4418

   P(x < 60) = .5000 - .4418 = .0582

d) P(55 < x < 62): correcting for continuity, x = 54.5 and x = 62.5
   z = (54.5 - 66.5)/4.47 = -2.68, area = .4963;  z = (62.5 - 66.5)/4.47 = -0.89, area = .3133
   P = .4963 - .3133 = .1830
6.25 f(x) = λe^(-λx)

a) λ = 0.1

   x0    f(x0)
    0    .1000
    1    .0905
    2    .0819
    3    .0741
    4    .0670
    5    .0607
    6    .0549
    7    .0497
    8    .0449
    9    .0407
   10    .0368

b) λ = 0.3

   x0    f(x0)
    0    .3000
    1    .2222
    2    .1646
    3    .1220
    4    .0904
    5    .0669
    6    .0496
    7    .0367
    8    .0272
    9    .0202

c) λ = 0.8

   x0    f(x0)
    0    .8000
    1    .3595
    2    .1615
    3    .0726
    4    .0326
    5    .0147
    6    .0066
    7    .0030
    8    .0013
    9    .0006

d) λ = 3.0

   x0    f(x0)
    0    3.0000
    1    .1494
    2    .0074
    3    .0004
    4    .0000
    5    .0000
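The density tables in 6.25 can be regenerated programmatically; a sketch reproducing the λ = 0.1 column:

```python
import math

def expon_pdf(x, lam):
    # f(x) = lambda * e^(-lambda * x): the height of the exponential
    # density at x.
    return lam * math.exp(-lam * x)

# Reproduce the lambda = 0.1 column of problem 6.25 a):
table = [round(expon_pdf(x0, 0.1), 4) for x0 in range(11)]
print(table)
```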
6.26 For the exponential distribution, μ = 1/λ and σ = 1/λ.

a) λ = 3.25:  μ = 1/3.25 = 0.31;  σ = 1/3.25 = 0.31

b) λ = 0.7:  μ = 1/0.7 = 1.43;  σ = 1/0.7 = 1.43

c) λ = 1.1:  μ = 1/1.1 = 0.91;  σ = 1/1.1 = 0.91

d) λ = 6.0:  μ = 1/6.0 = 0.17;  σ = 1/6.0 = 0.17
6.27 a) P(x > 5 | λ = 1.35):

   for x0 = 5:  P(x > 5) = e^(-λx) = e^(-1.35(5)) = e^(-6.75) = .0012

b) for x0 = 3:  1 - e^(-λx) = 1 - e^(-0.68(3)) = 1 - e^(-2.04) = 1 - .1300 = .8700

c) for x0 = 4:

d) for x0 = 6:  = .9918
6.28 a) μ = 23 sec. between arrivals

   λ = 1/23 = .0435/sec.

   Change to minutes:  λ = .0435(60) = 2.61/min.

   for x0 = 1:  P(x > 1 min.) = e^(-2.61(1)) = .0735

b) λ = .0435/sec.

   Change to minutes:  λ = 2.61/min.

   P(x > 3 | λ = 2.61/min.):

   for x0 = 3:  P(x > 3 min.) = e^(-2.61(3)) = e^(-7.83) = .0004
6.29 λ = 2.44/min.

a) Let x0 = 10:  P(x > 10 min.) = e^(-λx) = e^(-2.44(10)) = e^(-24.4) = .0000

b) P(x > 5 min | λ = 2.44/min.):

   Let x0 = 5:  P(x > 5 min.) = e^(-2.44(5)) = e^(-12.2) = .0000

c) Let x0 = 1:  P(x > 1 min.) = e^(-2.44(1)) = .0872

d) Expected time = μ = 1/λ = 1/2.44 = .41 min. = 24.6 sec.
a) =
1
1
1.12
Let x0 = 2,
Let x0 = 10,
1703
6.31 λ = 3.39/1,000 passengers

   μ = 1/λ = 1/3.39 = 0.295

   (0.295)(1,000) = 295 passengers between arrivals, on average

   for x0 = 0.5 (500 passengers):  P = e^(-λx) = e^(-3.39(0.5)) = e^(-1.695) = .1836
6.32 μ = 20 years

   λ = 1/20 = .05/year

   x0    P(x > x0) = e^(-.05·x0)
    1    .9512
    2    .9048
    3    .8607
6.33 λ = 2/month

   Average time between arrivals = μ = 1/λ = 1/2 month = 15 days

   Change to days:  λ = 2/30 = .067/day

   let x0 = 2:  P(x > 2 days) = e^(-λx) = e^(-.067(2)) = e^(-.134) = .8746
6.34 a = 6, b = 14

a) f(x) = 1/(b - a) = 1/(14 - 6) = 1/8 = .125

b) μ = (a + b)/2 = (6 + 14)/2 = 10
   σ = (b - a)/√12 = (14 - 6)/√12 = 8/√12 = 2.309

c) P(11 < x < 14) = (14 - 11)/(14 - 6) = 3/8 = .375

d) P(7 < x < 12) = (12 - 7)/(14 - 6) = 5/8 = .625
6.35 a) P(x < 21 | μ = 25, σ = 4):  z = (21 - 25)/4 = -1.00;  area = .3413
   P(x < 21) = .5000 - .3413 = .1587

b) P(x > 77 | μ = 50, σ = 9):  z = (77 - 50)/9 = 3.00;  area = .4987
   P(x > 77) = .5000 - .4987 = .0013

c) P(x > 47 | μ = 50, σ = 6):  z = (47 - 50)/6 = -0.50;  area = .1915
   P(x > 47) = .1915 + .5000 = .6915

d) P(13 < x < 29 | μ = 23, σ = 4):
   z = (13 - 23)/4 = -2.50, area = .4938;  z = (29 - 23)/4 = 1.50, area = .4332
   P(13 < x < 29) = .4938 + .4332 = .9270

e) P(x ≥ 105 | μ = 90, σ = 2.86):  z = (105 - 90)/2.86 = 5.24;  area = .5000
   P(x ≥ 105) = .5000 - .5000 = .0000
6.36 a) P(x = 12 | n = 25, p = .60):

   μ = np = 25(.60) = 15;  σ = √(n·p·q) = √(25(.60)(.40)) = 2.45

   μ ± 3σ = 15 ± 3(2.45) = 15 ± 7.35

   (7.65 to 22.35) lies between 0 and 25.
   The normal curve approximation is sufficient.

   Correcting for continuity: 11.5 to 12.5
   z = (11.5 - 15)/2.45 = -1.43, area = .4236;  z = (12.5 - 15)/2.45 = -1.02, area = .3461
   P(x = 12) = .4236 - .3461 = .0775

b) n = 15, p = .50:  μ = np = 15(.50) = 7.5

   σ = √(15(.50)(.50)) = 1.94

   Correcting for continuity: x = 5.5
   z = (5.5 - 7.5)/1.94 = -1.03, area = .3485

c) P(x ≤ 3 | n = 10 and p = .50):

   μ = np = 10(.50) = 5;  σ = √(10(.50)(.50)) = 1.58

   μ ± 3σ = 5 ± 3(1.58) = 5 ± 4.74

   Correcting for continuity: x = 3.5
   z = (3.5 - 5)/1.58 = -0.95, area = .3289
   P(x ≤ 3) = .5000 - .3289 = .1711

d) n = 15, p = .40:  μ = np = 15(.40) = 6

   σ = √(15(.40)(.60)) = 1.90

   μ ± 3σ = 6 ± 3(1.90) = 6 ± 5.7

   (0.3 to 11.7) lies between 0 and 15.
   The normal curve approximation is sufficient.

   Correcting for continuity: x = 7.5
   z = (7.5 - 6)/1.9 = 0.79, area = .2852
   P = .5000 - .2852 = .2148
6.37 a) P(x > 3 | λ = 1.3):

   let x0 = 3:  P(x > 3) = e^(-λx) = e^(-1.3(3)) = e^(-3.9) = .0202

b) P(x < 2 | λ = 2.0):

   Let x0 = 2:  P(x < 2 | λ = 2.0) = 1 - e^(-2.0(2)) = 1 - e^(-4) = 1 - .0183 = .9817

c) P(1 < x < 3 | λ = 1.65):

   Let x0 = 3:  e^(-λx) = e^(-1.65(3)) = e^(-4.95) = .0071
   Let x0 = 1:  e^(-1.65(1)) = .1920

   P(1 < x < 3) = P(x > 1) - P(x > 3) = .1920 - .0071 = .1849

d) Let x0 = 2
6.38 μ = 43.4, x = 48, z = 1.175

   Solving for σ:  z = (x - μ)/σ

   1.175 = (48 - 43.4)/σ

   σ = 4.6/1.175 = 3.915
6.39 p = 1/5 = .20, n = 150

   μ = np = 150(.20) = 30

   σ = √(n·p·q) = √(150(.20)(.80)) = 4.899

   Correcting for continuity: x = 50.5

   z = (50.5 - 30)/4.899 = 4.18;  area ≈ .5000, so the tail probability is ≈ .0000
6.40 λ = 1 customer/20 minutes;  μ = 1/λ = 1 (twenty-minute interval)

a) 1 hour interval:  x0 = 3 (twenty-minute intervals)

   P(x > 1 hour) = e^(-λx) = e^(-3) = .0498

b) 10 to 30 minutes:  x0 = .5, x0 = 1.5

   P(10 min. < x < 30 min.) = e^(-.5) - e^(-1.5) = .6065 - .2231 = .3834

c) x0 = 5/20 = .25

   P(x > .25) = e^(-.25) = .7788

   P(x < .25) = 1 - .7788 = .2212
6.41 μ = 90.28, σ = 8.53

a) P(x < 80):  z = (80 - 90.28)/8.53 = -1.21;  area = .3869
   P(x < 80) = .5000 - .3869 = .1131

b) P(x > 95):  z = (95 - 90.28)/8.53 = 0.55;  area = .2088
   P(x > 95) = .5000 - .2088 = .2912

c) P(83 < x < 87):
   z = (83 - 90.28)/8.53 = -0.85, area = .3023;  z = (87 - 90.28)/8.53 = -0.38, area = .1480
   P(83 < x < 87) = .3023 - .1480 = .1543
6.42 σ = 83

   Solving for μ:  z = (x - μ)/σ

   1.88 = (2655 - μ)/83

   μ = 2655 - 1.88(83) = 2498.96 million
6.43 a = 18, b = 65

a) μ = (a + b)/2 = (65 + 18)/2 = 41.5

b) f(x) = 1/(b - a) = 1/(65 - 18) = 1/47 = .0213

c) P(25 < x < 50) = (50 - 25)/(65 - 18) = 25/47 = .5319
a) =
1
1
1 .8
d)
x0 = 1 min. = 60/15 = 4
6.45 μ = 951, σ = 96

a) P(x > 1000):  z = (1000 - 951)/96 = 0.51;  area = .1950
   P(x > 1000) = .5000 - .1950 = .3050

b) P(900 < x < 1100):
   z = (900 - 951)/96 = -0.53, area = .2019;  z = (1100 - 951)/96 = 1.55, area = .4394
   P(900 < x < 1100) = .2019 + .4394 = .6413

c) P(825 < x < 925):
   z = (825 - 951)/96 = -1.31, area = .4049;  z = (925 - 951)/96 = -0.27, area = .1064
   P(825 < x < 925) = .4049 - .1064 = .2985

d) P(x < 700):  z = (700 - 951)/96 = -2.61;  area = .4955
   P(x < 700) = .5000 - .4955 = .0045
6.46 n = 60, p = .24

   μ = np = 60(.24) = 14.4

   σ = √(n·p·q) = √(60(.24)(.76)) = 3.308

   Test: μ ± 3σ = 14.4 ± 9.924

   Since 4.476 to 24.324 lies between 0 and 60, the normal distribution
   can be used to approximate this problem.

a) Correcting for continuity: x = 16.5
   z = (16.5 - 14.4)/3.308 = 0.63, area = .2357

b) Correcting for continuity: x = 22.5
   z = (22.5 - 14.4)/3.308 = 2.45, area = .4929

c) Correcting for continuity: x = 12.5 and x = 7.5
   z = (12.5 - 14.4)/3.308 = -0.57, area = .2157;  z = (7.5 - 14.4)/3.308 = -2.09, area = .4817
   P = .4817 - .2157 = .2660

6.47 μ = 45,970, σ = 4,246
a) P(x > 50,000):  z = (50,000 - 45,970)/4,246 = 0.95;  area = .3289
   P(x > 50,000) = .5000 - .3289 = .1711

b) P(x < 40,000):  z = (40,000 - 45,970)/4,246 = -1.41;  area = .4207
   P(x < 40,000) = .5000 - .4207 = .0793

c) P(x < 35,000):  z = (35,000 - 45,970)/4,246 = -2.58;  area = .4951
   P(x < 35,000) = .5000 - .4951 = .0049

d) P(39,000 < x < 47,000):
   z = (39,000 - 45,970)/4,246 = -1.64, area = .4495;  z = (47,000 - 45,970)/4,246 = 0.24, area = .0948
   P(39,000 < x < 47,000) = .4495 + .0948 = .5443
6.48 μ = 9 minutes between calls

   λ = 1/μ = .1111/minute = .1111(60)/hour = 6.67/hour

   Let x0 = 5:  P(x > 5 min.) = e^(-λx) = e^(-.1111(5)) = e^(-.5555) = .5738
6.49 μ = 88, σ = 6.4

a) P(x < 70):  z = (70 - 88)/6.4 = -2.81
   From Table A.5, area = .4975
   P(x < 70) = .5000 - .4975 = .0025

b) P(x < 80):  z = (80 - 88)/6.4 = -1.25;  area = .3944
   P(x < 80) = .5000 - .3944 = .1056

c) P(x > 100):  z = (100 - 88)/6.4 = 1.88;  area = .4699
   P(x > 100) = .5000 - .4699 = .0301

d) P(90 < x < 100):
   z = (90 - 88)/6.4 = 0.31, area = .1217;  z = (100 - 88)/6.4 = 1.88, area = .4699
   P(90 < x < 100) = .4699 - .1217 = .3482
6.50 n = 200, p = .81

   μ = np = 200(.81) = 162

   σ = √(n·p·q) = √(200(.81)(.19)) = 5.548

a) Correcting for continuity: x = 150.5
   z = (150.5 - 162)/5.548 = -2.07, area = .4808

b) Correcting for continuity: x = 154.5
   z = (154.5 - 162)/5.548 = -1.35, area = .4115

c) Correcting for continuity: x = 158.5
   z = (158.5 - 162)/5.548 = -0.63, area = .2357

d) Correcting for continuity: x = 143.5
   z = (143.5 - 162)/5.548 = -3.33, area = .4996
6.51 n = 150, p = .75

   μ = np = 150(.75) = 112.5

   σ = √(n·p·q) = √(150(.75)(.25)) = 5.3033

a) Correcting for continuity: x = 104.5
   z = (104.5 - 112.5)/5.3033 = -1.51, area = .4345
   P(x < 105) = .5000 - .4345 = .0655

b) Correcting for continuity: x = 109.5, x = 120.5
   z = (109.5 - 112.5)/5.3033 = -0.57, area = .2157;  z = (120.5 - 112.5)/5.3033 = 1.51, area = .4345
   P = .2157 + .4345 = .6502

c) Correcting for continuity: x = 95.5
   z = (95.5 - 112.5)/5.3033 = -3.21, area = .4993
6.52 μ = (a + b)/2 = 2.165

   a + b = 2(2.165) = 4.33

   b = 4.33 - a

   Height = 1/(b - a) = 0.862

   1 = 0.862b - 0.862a

   Substituting b from above, 1 = 0.862(4.33 - a) - 0.862a

   1 = 3.73246 - 0.862a - 0.862a

   1 = 3.73246 - 1.724a

   1.724a = 2.73246

   a = 1.585

   and b = 4.33 - 1.585 = 2.745
6.53 μ = 85,200

   The 60% can be split into 30% and 30% because the two x values are
   equal distance from the mean. From Table A.5, z = 0.84.

   Solving for σ:  z = (x - μ)/σ

   .84 = (94,800 - 85,200)/σ

   σ = 9,600/.84 = 11,428.57
6.54 n = 75

   p = .81 (prices);  p = .44 (products)

a) μ1 = np = 75(.81) = 60.75

   σ1 = √(n·p·q) = √(75(.81)(.19)) = 3.3974

   Correcting for continuity: x = 66.5
   z = (66.5 - 60.75)/3.3974 = 1.69, area = .4545

b) μ2 = np = 75(.44) = 33

   σ2 = √(75(.44)(.56)) = 4.2988

   Test: μ ± 3σ = 33 ± 3(4.299) = 33 ± 12.897

   20.103 to 45.897 lies between 0 and 75. It is okay to use the normal
   distribution to approximate this problem.

   Correcting for continuity: x = 22.5
   z = (22.5 - 33)/4.2988 = -2.44, area = .4927
6.55 λ = 0.6/month;  μ = 1/λ = 1/0.6 = 1.67 months

a) Let x0 = 1:  P(x > 1 month) = e^(-0.6(1)) = .5488

b) Let x0 = 0.5:  P(x > 0.5 month) = e^(-0.6(0.5)) = e^(-0.3) = .7408

   P(x < 0.5 month) = 1 - P(x > 0.5 month) = 1 - .7408 = .2592
6.56 n = 50, p = .80

   μ = np = 50(.80) = 40

   σ = √(n·p·q) = √(50(.80)(.20)) = 2.828

   Test: μ ± 3σ = 40 ± 3(2.828) = 40 ± 8.484

   31.516 to 48.484 lies between 0 and 50, so the approximation may be used.

a) Correcting for continuity: x = 34.5
   z = (34.5 - 40)/2.828 = -1.94, area = .4738

b) Correcting for continuity: x = 41.5
   z = (41.5 - 40)/2.828 = 0.53, area = .2019

c) Correcting for continuity: x = 47.5
   z = (47.5 - 40)/2.828 = 2.65, area = .4960
6.57 μ = 2087, σ = 175

a) z.30 = -.84

   z = (x - μ)/σ

   -.84 = (x - 2087)/175

   x = 2087 - .84(175) = 1940

b) z.15 = -0.39

   -.39 = (x - 2087)/175

   x = 2087 - .39(175) = 2018.75

c) z.35 = 1.04

   1.04 = (x - 2087)/175

   x = 2087 + 1.04(175) = 2269
Let x0 = 1
Let x0 = 2.5
6.59 μ = 2,106,774, σ = 50,940

a) P(x > 2,200,000):  z = (2,200,000 - 2,106,774)/50,940 = 1.83;  area = .4664
   P(x > 2,200,000) = .5000 - .4664 = .0336

b) P(x < 2,000,000):  z = (2,000,000 - 2,106,774)/50,940 = -2.10;  area = .4821
   P(x < 2,000,000) = .5000 - .4821 = .0179
For x0 = 1:
For x0 = 2:
6.61 The number of sales associates working is uniformly distributed from 11
   to 32. The mean is (11 + 32)/2 = 21.5 and the standard deviation is
   (32 - 11)/√12 = 6.06. About 81% of the time there are less than or
   equal to 28 sales associates working. One hundred percent of the time there
   are less than or equal to 34 sales associates working and never more than
   34. About 23.8% of the time there are 16 or fewer sales associates working.
   There are 21 or fewer sales associates working about 47.6% of the time.
6.62 The weight of the rods is normally distributed with a mean of 227 mg
   and a standard deviation of 2.3 mg. The probability that a rod weighs less
   than or equal to 227 mg is .5000 (since 227 is the mean), and less than
   231 mg is .9590.
6.63 The lengths of cell phone calls are normally distributed with a mean
   of 2.35 minutes. Nearly all of the calls are less than or equal to 2.60
   minutes, almost 82% are less than or equal to 2.45 minutes, over 32% are
   less than 2.3 minutes, and almost none are less than 2 minutes.
Chapter 7
LEARNING OBJECTIVES
The two main objectives for Chapter 7 are to give you an appreciation for the
proper application of sampling techniques and an understanding of the sampling
distributions of two statistics, thereby enabling you to:
1.
2.
3.
4.
5.
x
6.
and
Virtually every analysis discussed in this text deals with sample data.
It is important, therefore, that students are exposed to the ways and means
that samples are gathered. The first portion of chapter 7 deals with
sampling. Reasons for sampling versus taking a census are given. Most of
these reasons are tied to the fact that taking a census costs more than
sampling if the same measurements are being gathered. Students are then
exposed to the idea of random versus nonrandom sampling. Random
sampling appeals to their concepts of fairness and equal opportunity. This
text emphasizes that nonrandom samples are nonprobability samples and
cannot be used in inferential analysis because levels of confidence and/or
probability cannot be assigned. It should be emphasized throughout the
discussion of sampling techniques that as future business managers (most
students will end up as some sort of supervisor/manager) students should be
aware of where and how data are gathered for studies. This will help to
assure that they will not make poor decisions based on inaccurate and poorly
gathered data.
Chapter 7 also develops the concepts of the central limit theorem and the
standard error of the mean (σ/√n). As the student sees the central limit
theorem unfold, he/she begins to see that if the sample size is large enough,
sample means can be analyzed using the normal curve regardless of the
shape of the population.
Chapter 7 presents formulas derived from the central limit theorem for
both sample means and sample proportions. Taking the time to introduce
these techniques in this chapter can expedite the presentation of material in
chapters 8 and 9.
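A small simulation can make the theorem concrete for students (illustrative only; the uniform parent population, sample size, and trial count are arbitrary choices): sample means drawn from a decidedly non-normal population pile up around μ with a spread close to σ/√n = .2887/√40 ≈ .046.

```python
import random
import statistics

random.seed(1)

# Parent population far from normal: uniform on [0, 1);
# mu = .5 and sigma = 1/sqrt(12) = .2887.
n, trials = 40, 2000
means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(trials)]

# The sample means cluster near mu with spread near sigma / sqrt(n).
print(round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```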
CHAPTER OUTLINE
7.1 Sampling
Reasons for Sampling
Reasons for Taking a Census
Frame
Random Versus Nonrandom Sampling
Random Sampling Techniques
Simple Random Sampling
Stratified Random Sampling
Systematic Sampling
Cluster or Area Sampling
Nonrandom Sampling
Convenience Sampling
Judgment Sampling
Quota Sampling
Snowball Sampling
Sampling Error
Nonsampling Errors
7.2 Sampling Distribution of x̄
Sampling from a Finite Population
7.3 Sampling Distribution of p̂
KEY TERMS
Quota Sampling
Random Sampling
Sample Proportion
Sampling Error
Frame
Snowball Sampling
Judgment Sampling
Nonrandom Sampling
Nonsampling Errors
Systematic Sampling
Two-Stage Sampling
7.1
a)
b)
c)
d)
e)
i.
ii.
i.
ii.
i.
ii.
i.
ii.
i.
ii.
7.4
7.5
a)
b)
c)
d)
a)
b)
c)
d)
e)
f) Manufacturing, finance, communications, health care, retailing, chemical,
transportation.
7.6
7.7
N = nk = 825
7.8
k = N/n = 3,500/175 = 20

Start between 0 and 20. The human resource department probably has a list of
company employees which can be used for the frame. Also, there might be a
company phone directory available.
7.9
a)
b)
c)
7.10
i.
Counties
ii.
Metropolitan areas
i.
ii.
i.
States
ii.
Counties
7.11
7.12
7.13 σ = 10, μ = 50, n = 64

a) P(x̄ > 52):  z = (x̄ - μ)/(σ/√n) = (52 - 50)/(10/√64) = 1.6;  area = .4452
   P(x̄ > 52) = .5000 - .4452 = .0548

b) P(x̄ < 51):  z = (51 - 50)/(10/√64) = 0.80;  area = .2881
   P(x̄ < 51) = .5000 + .2881 = .7881

c) P(x̄ < 47):  z = (47 - 50)/(10/√64) = -2.40;  area = .4918
   P(x̄ < 47) = .5000 - .4918 = .0082

d) P(48.5 < x̄ < 52.4):
   z = (48.5 - 50)/(10/√64) = -1.20, area = .3849;  z = (52.4 - 50)/(10/√64) = 1.92, area = .4726
   P(48.5 < x̄ < 52.4) = .3849 + .4726 = .8575

e) P(50.6 < x̄ < 51.3):
   z = (50.6 - 50)/(10/√64) = 0.48, area = .1844;  z = (51.3 - 50)/(10/√64) = 1.04, area = .3508
   P(50.6 < x̄ < 51.3) = .3508 - .1844 = .1664
7.14 μ = 23.45, σ = 3.8

a) n = 10, P(x̄ > 22):  z = (22 - 23.45)/(3.8/√10) = -1.21;  area = .3869
   P(x̄ > 22) = .3869 + .5000 = .8869

b) n = 4, P(x̄ > 26):  z = (26 - 23.45)/(3.8/√4) = 1.34;  area = .4099
   P(x̄ > 26) = .5000 - .4099 = .0901
7.15 n = 36, μ = 278

   .3600 of the area lies between x̄ = 280 and the mean; from Table A.5, z = 1.08.

   Solving for σ:  z = (x̄ - μ)/(σ/√n)

   1.08 = (280 - 278)/(σ/√36)

   σ/6 = 2/1.08

   σ = 6(2)/1.08 = 11.11

7.16 n = 81, σ = 12

   .3200 of the area lies between x̄ = 300 and the mean; from Table A.5, z = 0.92.

   Solving for μ:  z = (x̄ - μ)/(σ/√n)

   0.92 = (300 - μ)/(12/√81)

   0.92(12/9) = 300 - μ

   1.2267 = 300 - μ

   μ = 300 - 1.2267 = 298.77

7.17 a) N = 1,000, n = 60, μ = 75, σ = 6
   P(x̄ < 76.5):

   z = (x̄ - μ) / [(σ/√n)·√((N - n)/(N - 1))]
     = (76.5 - 75) / [(6/√60)·√((1000 - 60)/(1000 - 1))] = 2.00

   from Table A.5, prob. = .4772

   P(x̄ < 76.5) = .4772 + .5000 = .9772

b) N = 90, n = 36, μ = 108, σ = 3.46

   P(107 < x̄ < 107.7):

   z = (107 - 108) / [(3.46/√36)·√((90 - 36)/(90 - 1))] = -2.23;  area = .4871
   z = (107.7 - 108) / [(3.46/√36)·√((90 - 36)/(90 - 1))] = -0.67;  area = .2486

   P(107 < x̄ < 107.7) = .4871 - .2486 = .2385

c) N = 250, n = 100, μ = 35.6, σ = 4.89

   P(x̄ ≥ 36):

   z = (36 - 35.6) / [(4.89/√100)·√((250 - 100)/(250 - 1))] = 1.05;  area = .3531

   P(x̄ ≥ 36) = .5000 - .3531 = .1469

d) N = 5000, n = 60, μ = 125, σ = 13.4

   P(x̄ ≤ 123):

   z = (123 - 125) / [(13.4/√60)·√((5000 - 60)/(5000 - 1))] = -1.16;  area = .3770

   P(x̄ ≤ 123) = .5000 - .3770 = .1230
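The finite-population computations of 7.17 all follow a single pattern; a sketch of part a):

```python
import math
from statistics import NormalDist

def z_finite(x_bar, mu, sigma, n, N):
    # z statistic for a sample mean drawn without replacement from a
    # finite population: the standard error carries the finite
    # correction factor sqrt((N - n)/(N - 1)).
    se = (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))
    return (x_bar - mu) / se

# Part a) of problem 7.17: N = 1000, n = 60, mu = 75, sigma = 6.
z = z_finite(76.5, 75, 6, 60, 1000)
print(round(z, 2), round(NormalDist().cdf(z), 4))
```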
7.18 σ = 30, μ = 99.9, n = 38

a) P(x̄ < 90):  z = (90 - 99.9)/(30/√38) = -2.03;  area = .4788
   P(x̄ < 90) = .5000 - .4788 = .0212

b) P(98 < x̄ < 105):
   z = (105 - 99.9)/(30/√38) = 1.05, area = .3531
   z = (98 - 99.9)/(30/√38) = -0.39;  from table A.5, area = .1517
   P(98 < x̄ < 105) = .3531 + .1517 = .5048

c) P(x̄ < 112):  z = (112 - 99.9)/(30/√38) = 2.49;  area = .4936
   P(x̄ < 112) = .5000 + .4936 = .9936

d) P(93 < x̄ < 96):
   z = (93 - 99.9)/(30/√38) = -1.42, area = .4222;  z = (96 - 99.9)/(30/√38) = -0.80, area = .2881
   P(93 < x̄ < 96) = .4222 - .2881 = .1341
7.19 N = 1500, n = 100, μ = 177,000, σ = 8,500

   P(x̄ > $185,000):

   z = (185,000 - 177,000) / [(8,500/√100)·√((1500 - 100)/(1500 - 1))] = 9.74

   P(x̄ > $185,000) = .5000 - .5000 = .0000
7.20 μ = $65.12, σ = $21.45, n = 45

   P(x̄ > x̄0) = .2300

   .5000 - .2300 = .2700 of the area lies between x̄0 and the mean;
   from Table A.5, z = 0.74.

   Solving for x̄0:  z = (x̄0 - μ)/(σ/√n)

   0.74 = (x̄0 - 65.12)/(21.45/√45)

   2.366 = x̄0 - 65.12

   x̄0 = 65.12 + 2.366 = $67.49

7.21 μ = 50.4, σ = 11.8
a) n = 42, P(x̄ > 52):  z = (52 - 50.4)/(11.8/√42) = 0.88;  area = .3106
   P(x̄ > 52) = .5000 - .3106 = .1894

b) P(x̄ < 47.5):  z = (47.5 - 50.4)/(11.8/√42) = -1.59;  area = .4441
   P(x̄ < 47.5) = .5000 - .4441 = .0559

c) P(x̄ < 40):  z = (40 - 50.4)/(11.8/√42) = -5.71
   P(x̄ < 40) = .5000 - .5000 = .0000

d) 71% of the values are greater than 49. Therefore, 21% are between the
   sample mean of 49 and the population mean, μ = 50.4.

   z.21 = -0.55

   Solving for σ:  -0.55 = (49 - 50.4)/(σ/√42)

   σ/√42 = 1.4/0.55 = 2.5455

   σ = 2.5455·√42 = 16.4964
7.22 p = .25

a) n = 110, P(p̂ < .21):

   z = (p̂ - p)/√(p·q/n) = (.21 - .25)/√((.25)(.75)/110) = -0.97;  area = .3340
   P(p̂ < .21) = .5000 - .3340 = .1660

b) n = 33, P(p̂ > .24):

   z = (.24 - .25)/√((.25)(.75)/33) = -0.13;  area = .0517
   P(p̂ > .24) = .0517 + .5000 = .5517

c) n = 59, P(.24 < p̂ < .27):

   z = (.24 - .25)/√((.25)(.75)/59) = -0.18, area = .0714
   z = (.27 - .25)/√((.25)(.75)/59) = 0.35, area = .1368
   P(.24 < p̂ < .27) = .0714 + .1368 = .2082

d) n = 80, P(p̂ > .30):

   z = (.30 - .25)/√((.25)(.75)/80) = 1.03;  area = .3485
   P(p̂ > .30) = .5000 - .3485 = .1515

e) n = 800, P(p̂ > .30):

   z = (.30 - .25)/√((.25)(.75)/800) = 3.27;  area = .4995
   P(p̂ > .30) = .5000 - .4995 = .0005
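The p̂ computations in 7.22 all reduce to one standard-error formula; a sketch of part a) (the software answer differs from the table answer of .1660 only by rounding):

```python
import math
from statistics import NormalDist

def phat_prob_below(phat, p, n):
    # P(sample proportion < phat) under the normal approximation,
    # with standard error sqrt(p * q / n).
    se = math.sqrt(p * (1 - p) / n)
    return NormalDist().cdf((phat - p) / se)

# Part a) of problem 7.22: p = .25, n = 110, P(p-hat < .21).
print(round(phat_prob_below(.21, .25, 110), 4))
```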
7.23 p = .58, n = 660

a) P(p̂ > .60):

   z = (p̂ - p)/√(p·q/n) = (.60 - .58)/√((.58)(.42)/660) = 1.04;  area = .3508
   P(p̂ > .60) = .5000 - .3508 = .1492

b) P(.55 < p̂ < .65):

   z = (.65 - .58)/√((.58)(.42)/660) = 3.64, area = .4998
   z = (.55 - .58)/√((.58)(.42)/660) = -1.56, area = .4406
   P(.55 < p̂ < .65) = .4998 + .4406 = .9404

c) P(p̂ > .57):

   z = (.57 - .58)/√((.58)(.42)/660) = -0.52;  area = .1985
   P(p̂ > .57) = .1985 + .5000 = .6985

d) P(.53 < p̂ < .56):

   z = (.56 - .58)/√((.58)(.42)/660) = -1.04, area = .3508
   z = (.53 - .58)/√((.58)(.42)/660) = -2.60, area = .4953
   P(.53 < p̂ < .56) = .4953 - .3508 = .1445

e) P(p̂ < .48):

   z = (.48 - .58)/√((.58)(.42)/660) = -5.21
   P(p̂ < .48) = .5000 - .5000 = .0000
7.24 p = .40

   P(p̂ > .35) = .80, so .3000 of the area lies between .35 and the mean;
   from Table A.5, z = -0.84.

   Solving for n:  z = (p̂ - p)/√(p·q/n)

   -0.84 = (.35 - .40)/√((.40)(.60)/n)

   .05 = 0.84·√(.24/n)

   √n = 0.84·√(.24)/.05 = 8.23

   n = 67.73 ≈ 68
7.25 p = .28, n = 140

   P(p̂ < p̂0) = .3000, so .2000 of the area lies between p̂0 and p = .28;
   from Table A.5, z = -0.52.

   Solving for p̂0:  z = (p̂0 - p)/√(p·q/n)

   -0.52 = (p̂0 - .28)/√((.28)(.72)/140)

   -.02 = p̂0 - .28

   p̂0 = .28 - .02 = .26
7.26 n = 600, x = 150, p = .21

   p̂ = x/n = 150/600 = .25

   z = (p̂ - p)/√(p·q/n) = (.25 - .21)/√((.21)(.79)/600) = 2.41;  area = .4920

   P(p̂ ≥ .25) = .5000 - .4920 = .0080
7.27 p = .48, n = 200

a) P(x ≤ 90):  p̂ = 90/200 = .45
   z = (.45 - .48)/√((.48)(.52)/200) = -0.85;  area = .3023
   P(x ≤ 90) = .5000 - .3023 = .1977

b) P(x ≥ 100):  p̂ = 100/200 = .50
   z = (.50 - .48)/√((.48)(.52)/200) = 0.57;  area = .2157
   P(x ≥ 100) = .5000 - .2157 = .2843

c) P(x > 80):  p̂ = 80/200 = .40
   z = (.40 - .48)/√((.48)(.52)/200) = -2.26;  area = .4881
   P(x > 80) = .5000 + .4881 = .9881
7.28 p = .19, n = 950

a) P(p̂ > .25):

   z = (p̂ - p)/√(p·q/n) = (.25 - .19)/√((.19)(.81)/950) = 4.71
   P(p̂ > .25) = .5000 - .5000 = .0000

b) P(.15 < p̂ < .20):

   z = (.15 - .19)/√((.19)(.81)/950) = -3.14, area = .4992
   z = (.20 - .19)/√((.19)(.81)/950) = 0.79, area = .2852
   P(.15 < p̂ < .20) = .4992 + .2852 = .7844

c) p̂1 = 133/950 = .14 and p̂2 = 171/950 = .18

   P(.14 < p̂ < .18):

   z = (.14 - .19)/√((.19)(.81)/950) = -3.93, area = .5000
   z = (.18 - .19)/√((.19)(.81)/950) = -0.79, area = .2852
   P(.14 < p̂ < .18) = .5000 - .2852 = .2148
7.29 μ = 76, σ = 14

a) n = 35, P(x̄ > 79):  z = (79 - 76)/(14/√35) = 1.27;  area = .3980
   P(x̄ > 79) = .5000 - .3980 = .1020

b) n = 140, P(74 < x̄ < 77):
   z = (74 - 76)/(14/√140) = -1.69, area = .4545;  z = (77 - 76)/(14/√140) = 0.85, area = .3023
   P(74 < x̄ < 77) = .4545 + .3023 = .7568

c) n = 219, P(x̄ < 76.5):  z = (76.5 - 76)/(14/√219) = 0.53;  area = .2019
   P(x̄ < 76.5) = .5000 + .2019 = .7019
7.30 p = .46

a) n = 60, P(.41 < p̂ < .53):

   z = (.53 - .46)/√((.46)(.54)/60) = 1.09, area = .3621
   z = (.41 - .46)/√((.46)(.54)/60) = -0.78, area = .2823
   P(.41 < p̂ < .53) = .3621 + .2823 = .6444

b) n = 458, P(p̂ < .40):

   z = (.40 - .46)/√((.46)(.54)/458) = -2.58;  area = .4951
   P(p̂ < .40) = .5000 - .4951 = .0049

c) n = 1350, P(p̂ > .49):

   z = (.49 - .46)/√((.46)(.54)/1350) = 2.21;  area = .4864
   P(p̂ > .49) = .5000 - .4864 = .0136
7.31
   Under 18    250(.22) = 55
   18 - 25     250(.18) = 45
   26 - 50     250(.36) = 90
   51 - 65     250(.10) = 25
   over 65     250(.14) = 35

   n = 250
7.32 p = .55, n = 600, x = 298

   p̂ = x/n = 298/600 = .497

   P(p̂ < .497):

   z = (p̂ - p)/√(p·q/n) = (.497 - .55)/√((.55)(.45)/600) = -2.61;  area = .4955

   P(p̂ < .497) = .5000 - .4955 = .0045
7.33 a) Roster of production employees secured from the human
resources department of the company.
7.34 μ = $17,755, σ = $650, n = 30, N = 120

   P(x̄ < 17,500):

   z = (17,500 - 17,755) / [(650/√30)·√((120 - 30)/(120 - 1))] = -2.47;  area = .4932

   P(x̄ < 17,500) = .5000 - .4932 = .0068
7.35
Number the employees from 0001 to 1250. Randomly sample from the
random number table until 60 different usable numbers are obtained. You
cannot use numbers from 1251 to 9999.
7.36 μ = $125, n = 32, σ² = $525

a) P(x̄ > $110):  z = (110 - 125)/(√525/√32) = -3.70
   P(x̄ > $110) = .5000 + .5000 = 1.0000

b) P(x̄ > $135):  z = (135 - 125)/(√525/√32) = 2.47;  area = .4932
   P(x̄ > $135) = .5000 - .4932 = .0068

c) P($120 < x̄ < $130):
   z = (120 - 125)/(√525/√32) = -1.23, area = .3907;  z = (130 - 125)/(√525/√32) = 1.23, area = .3907
   P($120 < x̄ < $130) = .3907 + .3907 = .7814
7.37 n = 1100

a) x > 810, p = .73

   p̂ = x/n = 810/1100 = .7364

   z = (.7364 - .73)/√((.73)(.27)/1100) = 0.48;  area = .1844
   P(x > 810) = .5000 - .1844 = .3156

b) x < 1030, p = .96

   p̂ = x/n = 1030/1100 = .9364

   z = (.9364 - .96)/√((.96)(.04)/1100) = -3.99
   P(x < 1030) = .5000 - .5000 = .0000

c) p = .85, P(.82 < p̂ < .84):

   z = (.82 - .85)/√((.85)(.15)/1100) = -2.79, area = .4974
   z = (.84 - .85)/√((.85)(.15)/1100) = -0.93, area = .3238
   P(.82 < p̂ < .84) = .4974 - .3238 = .1736
7.38
1)
2)
3)
4)
5)
6)
7)
8)
9)
7.39
Divide the factories into geographic regions and select a few factories to
represent those regional areas of the country. Take a random sample of
employees from each selected factory. Do the same for distribution centers
and retail outlets. Divide the United States into regions of areas. Select a
few areas. Take a random sample from each of the selected area distribution
centers and retail outlets.
7.40
N = 12,080 n = 300
7.41 p = .54, n = 565

a) P(x ≥ 339):  p̂ = x/n = 339/565 = .60
   z = (.60 - .54)/√((.54)(.46)/565) = 2.86;  area = .4979
   P(x ≥ 339) = .5000 - .4979 = .0021

b) P(x ≤ 288):  p̂ = 288/565 = .5097
   z = (.5097 - .54)/√((.54)(.46)/565) = -1.45;  area = .4265
   P(x ≤ 288) = .5000 - .4265 = .0735

c) P(p̂ < .50):
   z = (.50 - .54)/√((.54)(.46)/565) = -1.91;  area = .4719
   P(p̂ < .50) = .5000 - .4719 = .0281
7.42 μ = $550, σ = $100, n = 50

   P(x̄ < $530):

   z = (530 - 550)/(100/√50) = -1.41;  area = .4207

   P(x̄ < $530) = .5000 - .4207 = .0793
7.43 μ = 56.8, n = 51, σ = 12.3

a) P(x̄ > 60):  z = (60 - 56.8)/(12.3/√51) = 1.86;  area = .4686
   P(x̄ > 60) = .5000 - .4686 = .0314

b) P(x̄ > 58):  z = (58 - 56.8)/(12.3/√51) = 0.70;  area = .2580
   P(x̄ > 58) = .5000 - .2580 = .2420

c) P(56 < x̄ < 57):
   z = (56 - 56.8)/(12.3/√51) = -0.46, area = .1772;  z = (57 - 56.8)/(12.3/√51) = 0.12, area = .0478
   P(56 < x̄ < 57) = .1772 + .0478 = .2250

d) P(x̄ < 55):  z = (55 - 56.8)/(12.3/√51) = -1.05;  area = .3531
   P(x̄ < 55) = .5000 - .3531 = .1469

e) P(x̄ < 50):  z = (50 - 56.8)/(12.3/√51) = -3.95
   P(x̄ < 50) = .5000 - .5000 = .0000
7.45 p = .73, n = 300

a) P(210 ≤ x ≤ 234):

   p̂1 = 210/300 = .70 and p̂2 = 234/300 = .78

   z = (.70 - .73)/√((.73)(.27)/300) = -1.17, area = .3790
   z = (.78 - .73)/√((.73)(.27)/300) = 1.95, area = .4744
   P = .3790 + .4744 = .8534

b) P(p̂ > .78):

   z = (.78 - .73)/√((.73)(.27)/300) = 1.95;  area = .4744
   P(p̂ > .78) = .5000 - .4744 = .0256

c) p = .73, n = 800, P(p̂ > .78):

   z = (.78 - .73)/√((.73)(.27)/800) = 3.19;  area = .4993
   P(p̂ > .78) = .5000 - .4993 = .0007
7.46 n = 140, p = .22

a) p̂ = 35/140 = .25
   z = (.25 - .22)/√((.22)(.78)/140) = 0.86;  area = .3051
   P = .5000 - .3051 = .1949

b) p̂ = 21/140 = .15
   z = (.15 - .22)/√((.22)(.78)/140) = -2.00;  area = .4772
   P = .5000 - .4772 = .0228

c) n = 300, p = .20

   P(.18 < p̂ < .25):

   z = (.18 - .20)/√((.20)(.80)/300) = -0.87, area = .3078
   z = (.25 - .20)/√((.20)(.80)/300) = 2.17, area = .4850
   P(.18 < p̂ < .25) = .3078 + .4850 = .7928
7.47 Answers will vary.

   With a census, data is usually more general and easier to analyze because it
   is in a more standard format. Decision-makers are sometimes more
   comfortable with a census because everyone is included and there is no
   sampling error. A census appears to be a better political device because the
   CEO can claim that everyone in the company has had input.
7.48 p = .75, n = 150, x = 120

   p̂ = 120/150 = .80

   P(p̂ ≥ .80):

   z = (.80 - .75)/√((.75)(.25)/150) = 1.41;  area = .4207

   P(p̂ ≥ .80) = .5000 - .4207 = .0793
7.49 Switzerland: n = 40, μ = $21.24, σ = $3

   P(21 < x̄ < 22):

   z = (21 - 21.24)/(3/√40) = -0.51, area = .1950
   z = (22 - 21.24)/(3/√40) = 1.60, area = .4452
   P(21 < x̄ < 22) = .1950 + .4452 = .6402

   Japan: n = 35, μ = $22.00, σ = $3

   P(x̄ > 23):

   z = (23 - 22)/(3/√35) = 1.97;  area = .4756
   P(x̄ > 23) = .5000 - .4756 = .0244

   U.S.: n = 50, μ = $19.86, σ = $3

   P(x̄ < 18.90):

   z = (18.90 - 19.86)/(3/√50) = -2.26;  area = .4881
   P(x̄ < 18.90) = .5000 - .4881 = .0119
7.50
7.51
a)
Age, Ethnicity, Religion, Geographic Region, Occupation, Urban-Suburban-Rural, Party Affiliation, Gender
b)
c)
d) μ = $281, n = 65, σ = $47

   P(x̄ > $273):

   z = (273 - 281)/(47/√65) = -1.37;  area = .4147

   P(x̄ > $273) = .4147 + .5000 = .9147
Chapter 8
Statistical Inference: Estimation for Single
Populations
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
6.
This chapter introduces the student to the t distribution for
estimating population means when σ is unknown. Emphasize that this
applies only when the population is normally distributed, because an
assumption underlying the t test is that the population is normally
distributed, albeit that the t test is robust to violations of this
assumption. The student will observe that the t formula is essentially the
same as the z formula and that it is the table that is different. When the
population is normally distributed and σ is known, the z
formula can be used even for small samples.
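A minimal sketch of the t-based interval discussed above, using the numbers of problem 8.13 from this chapter (the critical value 2.179 is taken from the t table, since the Python standard library does not provide t quantiles):

```python
import math

def t_interval(x_bar, s, n, t_crit):
    # Confidence interval for mu when sigma is unknown:
    # x_bar +/- t * s / sqrt(n), with t_crit from the t table
    # for the chosen confidence level and df = n - 1.
    half = t_crit * s / math.sqrt(n)
    return x_bar - half, x_bar + half

# Problem 8.13: n = 13, x_bar = 45.62, s = 5.694, t(.025, 12) = 2.179
lo, hi = t_interval(45.62, 5.694, 13, 2.179)
print(round(lo, 2), round(hi, 2))  # 42.18 49.06
```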
A formula is given in chapter 8 for estimating the population variance, and
it is here that the student is introduced to the chi-square distribution. An
assumption underlying the use of this technique is that the population is
normally distributed. The use of the chi-square statistic to estimate the
population variance is extremely sensitive to violations of this assumption.
For this reason, extreme caution should be exercised in using this technique.
Because of this, some statisticians omit this technique from consideration,
presentation, and usage.
Lastly, this chapter contains a section on the estimation of sample size.
One of the more common questions asked of statisticians is: "How large of a
sample size should I take?" In this section, it should be emphasized that
sample size estimation gives the researcher a "ball park" figure as to how
many to sample. The error of estimation is a measure of the sampling error.
It is also equal to the ± error of the interval shown earlier in the chapter.
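A sketch of the sample-size computation (using σ = 36, E = 5, and 95% confidence, the values of the first sample-size exercise in this chapter); the "ball park" figure is rounded up to the next whole unit:

```python
import math

def sample_size_mean(z, sigma, error):
    # n = z^2 * sigma^2 / E^2, rounded up because the sample size
    # must be a whole number at least as large as the computed value.
    return math.ceil((z * sigma / error) ** 2)

# sigma = 36, E = 5, 95% confidence (z = 1.96): 199.15 -> sample 200
print(sample_size_mean(1.96, 36, 5))
```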
CHAPTER OUTLINE
8.1
The t Distribution
Robustness
Mean
8.3
8.4
8.5
KEY WORDS
Bounds
Point Estimate
Chi-square Distribution
Robust
Degrees of Freedom (df)
Sample-Size Estimation
Error of Estimation
t Distribution
Interval Estimate
t Value
8.1 a) x̄ = 25, σ = 3.5, n = 60, 95% confidence, z.025 = 1.96

   x̄ ± z·σ/√n = 25 ± 1.96(3.5/√60) = 25 ± 0.89 = 24.11 ≤ μ ≤ 25.89

b) x̄ = 119.6, σ = 23.89, n = 75, 98% confidence, z.01 = 2.33

   x̄ ± z·σ/√n = 119.6 ± 2.33(23.89/√75) = 119.6 ± 6.43 = 113.17 ≤ μ ≤ 126.03

c) x̄ = 3.419, σ = 0.974, n = 32, 90% C.I., z.05 = 1.645

   x̄ ± z·σ/√n = 3.419 ± 1.645(0.974/√32) = 3.419 ± 0.283 = 3.136 ≤ μ ≤ 3.702

d) x̄ = 56.7, σ = 12.1, N = 500, n = 47, 80% C.I., z.10 = 1.28

   x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 56.7 ± 1.28(12.1/√47)·√((500 - 47)/(500 - 1))
   = 56.7 ± 2.15 = 54.55 ≤ μ ≤ 58.85
8.2 n = 36, x̄ = 211, σ = 23, 95% C.I., z.025 = 1.96

   x̄ ± z·σ/√n = 211 ± 1.96(23/√36) = 211 ± 7.51 = 203.49 ≤ μ ≤ 218.51
8.3 n = 81, x̄ = 47, σ = 5.89, 90% C.I., z.05 = 1.645

   x̄ ± z·σ/√n = 47 ± 1.645(5.89/√81) = 47 ± 1.08 = 45.92 ≤ μ ≤ 48.08
8.4 n = 70, σ² = 49, x̄ = 90.4, 94% C.I., z.03 = 1.88

   x̄ = 90.4  Point Estimate

   x̄ ± z·σ/√n = 90.4 ± 1.88(√49/√70) = 90.4 ± 1.57 = 88.83 ≤ μ ≤ 91.97
8.5 n = 39, N = 200, x̄ = 66, σ = 11, 96% C.I., z.02 = 2.05

   x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 66 ± 2.05(11/√39)·√((200 - 39)/(200 - 1))
   = 66 ± 3.25 = 62.75 ≤ μ ≤ 69.25

   x̄ = 66  Point Estimate
8.6 n = 120, x̄ = 18.72, σ = 0.8735, 99% C.I., z.005 = 2.575

   x̄ = 18.72  Point Estimate

   x̄ ± z·σ/√n = 18.72 ± 2.575(0.8735/√120) = 18.72 ± 0.21 = 18.51 ≤ μ ≤ 18.93
8.7 N = 1500, n = 187, x̄ = 5.3 years, σ = 1.28 years, 95% C.I., z.025 = 1.96

   x̄ = 5.3 years  Point Estimate

   x̄ ± z·(σ/√n)·√((N - n)/(N - 1)) = 5.3 ± 1.96(1.28/√187)·√((1500 - 187)/(1500 - 1))
   = 5.3 ± 0.17 = 5.13 ≤ μ ≤ 5.47
8.8 n = 24, x̄ = 5.625, σ = 3.23, 90% C.I., z.05 = 1.645

   x̄ ± z·σ/√n = 5.625 ± 1.645(3.23/√24) = 5.625 ± 1.085 = 4.540 ≤ μ ≤ 6.710
8.9 n = 36, x̄ = 3.306, σ = 1.17, 98% C.I., z.01 = 2.33

   x̄ ± z·σ/√n = 3.306 ± 2.33(1.17/√36) = 3.306 ± 0.454 = 2.852 ≤ μ ≤ 3.760
8.10 n = 36, x̄ = 2.139, σ = .113, 90% C.I., z.05 = 1.645

   x̄ = 2.139  Point Estimate

   x̄ ± z·σ/√n = 2.139 ± 1.645(.113/√36) = 2.139 ± .031 = 2.108 ≤ μ ≤ 2.170
8.11 95% confidence interval: x̄ = 24.533, σ = 5.124, n = 45, z = ±1.96

   x̄ ± z·σ/√n = 24.533 ± 1.96(5.124/√45) = 24.533 ± 1.497 = 23.036 ≤ μ ≤ 26.030
n = 41
Confidence interval:
8.13 n = 13, x̄ = 45.62, s = 5.694, df = 13 - 1 = 12, α/2 = .025

   t.025,12 = 2.179

   x̄ ± t·s/√n = 45.62 ± 2.179(5.694/√13) = 45.62 ± 3.44 = 42.18 ≤ μ ≤ 49.06
8.14 n = 12, x̄ = 319.17, s = 9.104, df = 12 - 1 = 11, α/2 = .05

   t.05,11 = 1.796

   x̄ ± t·s/√n = 319.17 ± 1.796(9.104/√12) = 319.17 ± 4.72 = 314.45 ≤ μ ≤ 323.89
8.15 n = 41, x̄ = 128.4, s = 20.6, df = 41 - 1 = 40, α/2 = .01

   t.01,40 = 2.423

   x̄ ± t·s/√n = 128.4 ± 2.423(20.6/√41) = 128.4 ± 7.80 = 120.60 ≤ μ ≤ 136.20

   x̄ = 128.4  Point Estimate
8.16 n = 15, x̄ = 2.364, s² = 0.81, df = 15 - 1 = 14, α/2 = .05

   t.05,14 = 1.761

   x̄ ± t·s/√n = 2.364 ± 1.761(√0.81/√15) = 2.364 ± 0.409 = 1.955 ≤ μ ≤ 2.773
8.17 n = 25, x̄ = 16.088, s = .817, df = 25 - 1 = 24, α/2 = .005

   t.005,24 = 2.797

   x̄ ± t·s/√n = 16.088 ± 2.797(.817/√25) = 16.088 ± 0.457 = 15.631 ≤ μ ≤ 16.545

   x̄ = 16.088  Point Estimate
8.18 n = 22, x̄ = 1,192, s = 279, df = n - 1 = 21, t.01,21 = 2.518

   x̄ ± t·s/√n = 1,192 ± 2.518(279/√22) = 1,192 ± 149.78 = 1,042.22 ≤ μ ≤ 1,341.78
8.19 n = 20, df = 19, 95% CI, t.025,19 = 2.093

   x̄ = 2.36116, s = 0.19721

   2.36116 ± 2.093(0.19721/√20) = 2.36116 ± 0.0923 = 2.26886 ≤ μ ≤ 2.45346

   Error = 0.0923
8.20 n = 28, x̄ = 5.335, s = 2.016, df = 28 - 1 = 27, α/2 = .05

   t.05,27 = 1.703

   x̄ ± t·s/√n = 5.335 ± 1.703(2.016/√28) = 5.335 ± 0.649 = 4.686 ≤ μ ≤ 5.984
8.21 n = 10, x̄ = 49.8, s = 18.22, 95% confidence, df = 10 - 1 = 9, α/2 = .025

   t.025,9 = 2.262

   x̄ ± t·s/√n = 49.8 ± 2.262(18.22/√10) = 49.8 ± 13.03 = 36.77 ≤ μ ≤ 62.83

8.22 n = 14, df = 13, 98% confidence, α/2 = .01

   t.01,13 = 2.650

   from data: x̄ = 152.16, s = 14.42

   confidence interval: x̄ ± t·s/√n = 152.16 ± 2.65(14.42/√14)
   = 152.16 ± 10.21 = 141.95 ≤ μ ≤ 162.37
8.23 n = 17, df = 17 - 1 = 16, 99% confidence, α/2 = .005

   t.005,16 = 2.921

   from data: x̄ = 8.06, s = 5.07

   confidence interval: x̄ ± t·s/√n = 8.06 ± 2.921(5.07/√17)
   = 8.06 ± 3.59 = 4.47 ≤ μ ≤ 11.65
8.24 x̄ ± t·s/√n
8.25 a) n = 44, p̂ = .51, 99% C.I., z.005 = 2.575

   p̂ ± z·√(p̂·q̂/n) = .51 ± 2.575·√((.51)(.49)/44) = .51 ± .194 = .316 ≤ p ≤ .704

b) n = 300, p̂ = .82, 95% C.I., z.025 = 1.96

   p̂ ± z·√(p̂·q̂/n) = .82 ± 1.96·√((.82)(.18)/300) = .82 ± .043 = .777 ≤ p ≤ .863

c) n = 1150, p̂ = .48, 90% C.I., z.05 = 1.645

   p̂ ± z·√(p̂·q̂/n) = .48 ± 1.645·√((.48)(.52)/1150) = .48 ± .024 = .456 ≤ p ≤ .504

d) n = 95, p̂ = .32, 88% C.I., z.06 = 1.555

   p̂ ± z·√(p̂·q̂/n) = .32 ± 1.555·√((.32)(.68)/95) = .32 ± .074 = .246 ≤ p ≤ .394
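The proportion intervals of 8.25 and the following problems share one formula; a sketch using x = 479 and n = 800 at 97% confidence (the manual rounds p̂ to .60 before computing, giving .562 to .638; carrying full precision gives .561 to .636):

```python
import math

def prop_interval(phat, n, z):
    # p-hat +/- z * sqrt(p-hat * q-hat / n)
    half = z * math.sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half

# x = 479, n = 800, 97% confidence (z = 2.17)
lo, hi = prop_interval(479 / 800, 800, 2.17)
print(round(lo, 3), round(hi, 3))
```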
8.26 a) n = 116, x = 57, 99% C.I., z.005 = 2.575

   p̂ = x/n = 57/116 = .49

   p̂ ± z·√(p̂·q̂/n) = .49 ± 2.575·√((.49)(.51)/116) = .49 ± .12 = .37 ≤ p ≤ .61

b) n = 800, x = 479, 97% C.I., z.015 = 2.17

   p̂ = x/n = 479/800 = .60

   p̂ ± z·√(p̂·q̂/n) = .60 ± 2.17·√((.60)(.40)/800) = .60 ± .038 = .562 ≤ p ≤ .638

c) n = 240, x = 106, 85% C.I., z.075 = 1.44

   p̂ = x/n = 106/240 = .44

   p̂ ± z·√(p̂·q̂/n) = .44 ± 1.44·√((.44)(.56)/240) = .44 ± .046 = .394 ≤ p ≤ .486

d) n = 60, x = 21, 90% C.I., z.05 = 1.645

   p̂ = x/n = 21/60 = .35

   p̂ ± z·√(p̂·q̂/n) = .35 ± 1.645·√((.35)(.65)/60) = .35 ± .10 = .25 ≤ p ≤ .45
8.27 n = 85, x = 40

   p̂ = x/n = 40/85 = .47

   90% C.I., z.05 = 1.645:
   p̂ ± z·√(p̂·q̂/n) = .47 ± 1.645·√((.47)(.53)/85) = .47 ± .089 = .381 ≤ p ≤ .559

   95% C.I., z.025 = 1.96:
   p̂ ± z·√(p̂·q̂/n) = .47 ± 1.96·√((.47)(.53)/85) = .47 ± .106 = .364 ≤ p ≤ .576

   99% C.I., z.005 = 2.575:
   p̂ ± z·√(p̂·q̂/n) = .47 ± 2.575·√((.47)(.53)/85) = .47 ± .139 = .331 ≤ p ≤ .609

   All other things being constant, as the confidence increased, the width of the
   interval increased.
8.28 n = 1003, p̂ = .255, 99% CI, z.005 = 2.575

   p̂ ± z·√(p̂·q̂/n) = .255 ± 2.575·√((.255)(.745)/1003) = .255 ± .035 = .220 ≤ p ≤ .290

   n = 10,000, p̂ = .255, 99% CI, z.005 = 2.575

   p̂ ± z·√(p̂·q̂/n) = .255 ± 2.575·√((.255)(.745)/10,000) = .255 ± .011 = .244 ≤ p ≤ .266
8.29  n = 560   p̂ = .47   95% CI   z.025 = 1.96

      p̂ ± z√(p̂·q̂/n) = .47 ± 1.96√[(.47)(.53)/560] = .47 ± .0413

      .4287 ≤ p ≤ .5113

n = 560   p̂ = .28   90% CI   z.05 = 1.645

      .28 ± 1.645√[(.28)(.72)/560] = .28 ± .0312

      .2488 ≤ p ≤ .3112
8.30  n = 1,250   x = 997   p̂ = x/n = 997/1,250 = .80   98% C.I.   z.01 = 2.33

      p̂ ± z√(p̂·q̂/n) = .80 ± 2.33√[(.80)(.20)/1,250] = .80 ± .0264

      .7736 ≤ p ≤ .8264
8.31  n = 3,481   x = 927   p̂ = x/n = 927/3,481 = .266

a)  p̂ = .266 is the point estimate.

b)  99% C.I.   z.005 = 2.575

      p̂ ± z√(p̂·q̂/n) = .266 ± 2.575√[(.266)(.734)/3,481] = .266 ± .019

      .247 ≤ p ≤ .285
8.32  n = 89   x = 48   p̂ = x/n = 48/89 = .54   85% C.I.   z.075 = 1.44

      p̂ ± z√(p̂·q̂/n) = .54 ± 1.44√[(.54)(.46)/89] = .54 ± .0761

      .4639 ≤ p ≤ .6161
8.33  n = 672   p̂ = .63   95% Confidence   z.025 = ±1.96

      p̂ ± z√(p̂·q̂/n) = .63 ± 1.96√[(.63)(.37)/672] = .63 ± .0365

      .5935 ≤ p ≤ .6665
8.34  n = 275   x = 121   p̂ = x/n = 121/275 = .44   98% confidence   z.01 = 2.33

      p̂ ± z√(p̂·q̂/n) = .44 ± 2.33√[(.44)(.56)/275] = .44 ± .0697

      .3703 ≤ p ≤ .5097
8.35 a)  n = 12   df = 12 − 1 = 11   x̄ = 28.4   s² = 44.9   99% C.I.
      χ².995,11 = 2.60320   χ².005,11 = 26.7569

      (12 − 1)(44.9)/26.7569 ≤ σ² ≤ (12 − 1)(44.9)/2.60320

      18.46 ≤ σ² ≤ 189.73

b)  n = 7   x̄ = 4.37   s = 1.24   s² = 1.5376   95% C.I.   df = 7 − 1 = 6
      χ².975,6 = 1.23734   χ².025,6 = 14.4494

      (7 − 1)(1.5376)/14.4494 ≤ σ² ≤ (7 − 1)(1.5376)/1.23734

      0.64 ≤ σ² ≤ 7.46

c)  n = 20   df = 20 − 1 = 19   x̄ = 105   s = 32   s² = 1,024   90% C.I.
      χ².95,19 = 10.11701   χ².05,19 = 30.1435

      (20 − 1)(1,024)/30.1435 ≤ σ² ≤ (20 − 1)(1,024)/10.11701

      645.45 ≤ σ² ≤ 1,923.10

d)  n = 17   df = 17 − 1 = 16   s² = 18.56   80% C.I.
      χ².90,16 = 9.31224   χ².10,16 = 23.5418

      (17 − 1)(18.56)/23.5418 ≤ σ² ≤ (17 − 1)(18.56)/9.31224

      12.61 ≤ σ² ≤ 31.89
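All four parts use the same pattern, (n − 1)s²/χ²_upper ≤ σ² ≤ (n − 1)s²/χ²_lower. A minimal Python check of part a), with the table values hard-coded from the solution (Python itself is an assumption; the text works from tables, Excel, and MINITAB):

```python
def variance_ci(s2, n, chi2_upper, chi2_lower):
    # (n-1)*s^2/chi2_upper <= sigma^2 <= (n-1)*s^2/chi2_lower
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Problem 8.35a: n = 12, s^2 = 44.9, 99% CI, chi2_.005,11 = 26.7569, chi2_.995,11 = 2.60320
low, high = variance_ci(44.9, 12, 26.7569, 2.60320)
```

Note that the upper-tail table value produces the lower endpoint, which is why the interval is not symmetric about s².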
8.36  n = 16   s² = 37.1833   98% C.I.   df = 16 − 1 = 15
      χ².99,15 = 5.22936   χ².01,15 = 30.5780

      (16 − 1)(37.1833)/30.5780 ≤ σ² ≤ (16 − 1)(37.1833)/5.22936

      18.24 ≤ σ² ≤ 106.66
8.37  n = 20   df = 20 − 1 = 19   s = 4.3   s² = 18.49   98% C.I.
      χ².99,19 = 7.63270   χ².01,19 = 36.1908

      (20 − 1)(18.49)/36.1908 ≤ σ² ≤ (20 − 1)(18.49)/7.63270

      9.71 ≤ σ² ≤ 46.03

8.38  n = 15   df = 15 − 1 = 14   s² = 3.067   99% C.I.
      χ².995,14 = 4.07466   χ².005,14 = 31.3194

      (15 − 1)(3.067)/31.3194 ≤ σ² ≤ (15 − 1)(3.067)/4.07466

      1.37 ≤ σ² ≤ 10.54
8.39  n = 14   s² = 26,798,241.76   95% C.I.   df = 14 − 1 = 13
      χ².975,13 = 5.00874   χ².025,13 = 24.7356

      (14 − 1)(26,798,241.76)/24.7356 ≤ σ² ≤ (14 − 1)(26,798,241.76)/5.00874

      14,084,039 ≤ σ² ≤ 69,553,849
8.40 a)  σ = 36   E = 5   95% Confidence   z.025 = 1.96

      n = z²σ²/E² = (1.96)²(36)²/5² = 199.15

      Sample 200

b)  σ = 4.13   E = 1   99% Confidence   z.005 = 2.575

      n = z²σ²/E² = (2.575)²(4.13)²/1² = 113.1

      Sample 114

c)  σ = 105   E = 10   90% Confidence   z.05 = 1.645

      n = z²σ²/E² = (1.645)²(105)²/10² = 298.3

      Sample 299

d)  E = 3   Range = 108 − 50 = 58   σ ≈ Range/4 = 14.5   88% Confidence   z.06 = 1.555

      n = z²σ²/E² = (1.555)²(14.5)²/3² = 56.5

      Sample 57
8.41 a)  E = .02   p = .40   96% Confidence   z.02 = 2.05

      n = z²pq/E² = (2.05)²(.40)(.60)/(.02)² = 2521.5

      Sample 2522

b)  E = .04   p = .50   95% Confidence   z.025 = 1.96

      n = z²pq/E² = (1.96)²(.50)(.50)/(.04)² = 600.25

      Sample 601

c)  E = .05   p = .55   90% Confidence   z.05 = 1.645

      n = z²pq/E² = (1.645)²(.55)(.45)/(.05)² = 267.9

      Sample 268

d)  E = .01   p = .50   99% Confidence   z.005 = 2.575

      n = z²pq/E² = (2.575)²(.50)(.50)/(.01)² = 16,576.6

      Sample 16,577
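Each part computes n = z²pq/E² and then rounds up, never down, since the sample size must guarantee the stated error. A sketch of that rule in Python (an assumption — the text does these by hand):

```python
import math

def n_for_proportion(z, p, error):
    # n = z^2 * p * q / E^2, always rounded UP to the next whole person
    return math.ceil(z * z * p * (1 - p) / (error * error))

n_b = n_for_proportion(1.96, 0.50, 0.04)    # Problem 8.41b
n_d = n_for_proportion(2.575, 0.50, 0.01)   # Problem 8.41d
```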
8.42  E = $200   σ = $1,000   99% Confidence   z.005 = 2.575

      n = z²σ²/E² = (2.575)²(1,000)²/200² = 165.77

      Sample 166
8.43  E = $2   σ = $12.50   90% Confidence   z.05 = 1.645

      n = z²σ²/E² = (1.645)²(12.50)²/2² = 105.7

      Sample 106
8.44  E = $100   σ = $475   90% Confidence   z.05 = 1.645

      n = z²σ²/E² = (1.645)²(475)²/100² = 61.05

      Sample 62
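The mean version of the sample-size formula follows the same round-up rule; a minimal Python check of 8.44 (Python is an assumption, not a tool the text uses):

```python
import math

def n_for_mean(z, sigma, error):
    # n = z^2 * sigma^2 / E^2, rounded up to the next whole unit
    return math.ceil((z * sigma / error) ** 2)

n = n_for_mean(1.645, 475, 100)   # Problem 8.44: E = $100, sigma = $475, z.05 = 1.645
```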
8.45  p = .20   q = .80   E = .02   90% Confidence   z.05 = 1.645

      n = z²pq/E² = (1.645)²(.20)(.80)/(.02)² = 1082.41

      Sample 1083
8.46  p = .50   q = .50   E = .05   95% Confidence   z.025 = 1.96

      n = z²pq/E² = (1.96)²(.50)(.50)/(.05)² = 384.16

      Sample 385
8.47  p = .50   q = .50   E = .10   95% Confidence   z.025 = 1.96

      n = z²pq/E² = (1.96)²(.50)(.50)/(.10)² = 96.04

      Sample 97
8.48  n = 35   x̄ = 45.6   σ = 7.75

80% confidence   z.10 = 1.28

      x̄ ± z(σ/√n) = 45.6 ± 1.28(7.75/√35) = 45.6 ± 1.68

      43.92 ≤ μ ≤ 47.28

94% confidence   z.03 = 1.88

      45.6 ± 1.88(7.75/√35) = 45.6 ± 2.46

      43.14 ≤ μ ≤ 48.06

98% confidence   z.01 = 2.33

      45.6 ± 2.33(7.75/√35) = 45.6 ± 3.05

      42.55 ≤ μ ≤ 48.65
8.49  n = 10   x̄ = 12.03   s = .4373   df = 9

90% confidence, α/2 = .05   t.05,9 = 1.833

      x̄ ± t(s/√n) = 12.03 ± 1.833(.4373/√10) = 12.03 ± .25

      11.78 ≤ μ ≤ 12.28

95% confidence, α/2 = .025   t.025,9 = 2.262

      12.03 ± 2.262(.4373/√10) = 12.03 ± .31

      11.72 ≤ μ ≤ 12.34

99% confidence, α/2 = .005   t.005,9 = 3.25

      12.03 ± 3.25(.4373/√10) = 12.03 ± .45

      11.58 ≤ μ ≤ 12.48
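The three t intervals above can be checked with a short Python sketch using the table values quoted in the solution (Python is an assumption here; the t values come from the text's table):

```python
import math

def t_interval(mean, s, n, t_val):
    # x_bar ± t * s / sqrt(n)
    half = t_val * s / math.sqrt(n)
    return mean - half, mean + half

# Problem 8.49: x_bar = 12.03, s = .4373, n = 10; t table values for df = 9
ci90 = t_interval(12.03, 0.4373, 10, 1.833)
ci99 = t_interval(12.03, 0.4373, 10, 3.25)
```

As with the z intervals, the center never moves; only the t multiplier widens the interval.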
8.50 a)  n = 715   x = 329   p̂ = 329/715 = .46   95% confidence   z.025 = 1.96

      p̂ ± z√(p̂·q̂/n) = .46 ± 1.96√[(.46)(.54)/715] = .46 ± .0365

      .4235 ≤ p ≤ .4965

b)  n = 284   p̂ = .71   90% confidence   z.05 = 1.645

      .71 ± 1.645√[(.71)(.29)/284] = .71 ± .0443

      .6657 ≤ p ≤ .7543

c)  n = 1,250   p̂ = .48   95% confidence   z.025 = 1.96

      .48 ± 1.96√[(.48)(.52)/1,250] = .48 ± .0277

      .4523 ≤ p ≤ .5077

d)  n = 457   x = 270   p̂ = 270/457 = .591   98% confidence   z.01 = 2.33

      .591 ± 2.33√[(.591)(.409)/457] = .591 ± .0536

      .5374 ≤ p ≤ .6446
8.51  n = 10   s = 7.40045   s² = 54.7667   df = 10 − 1 = 9

90% confidence, α/2 = .05, 1 − α/2 = .95
      χ².95,9 = 3.32512   χ².05,9 = 16.9190

      (10 − 1)(54.7667)/16.9190 ≤ σ² ≤ (10 − 1)(54.7667)/3.32512

      29.13 ≤ σ² ≤ 148.24

95% confidence, α/2 = .025, 1 − α/2 = .975
      χ².975,9 = 2.70039   χ².025,9 = 19.0228

      (10 − 1)(54.7667)/19.0228 ≤ σ² ≤ (10 − 1)(54.7667)/2.70039

      25.91 ≤ σ² ≤ 182.53
8.52 a)  σ = 44   E = 3   95% confidence   z.025 = 1.96

      n = z²σ²/E² = (1.96)²(44)²/3² = 826.4

      Sample 827

b)  E = 2   Range = 88 − 20 = 68   σ ≈ Range/4 = 17   90% confidence   z.05 = 1.645

      n = z²σ²/E² = (1.645)²(17)²/2² = 195.5

      Sample 196

c)  E = .04   p = .50   q = .50   98% confidence   z.01 = 2.33

      n = z²pq/E² = (2.33)²(.50)(.50)/(.04)² = 848.3

      Sample 849

d)  E = .03   p = .70   q = .30   95% confidence   z.025 = 1.96

      n = z²pq/E² = (1.96)²(.70)(.30)/(.03)² = 896.4

      Sample 897
8.53  n = 17   x̄ = 10.765   s = 2.223   df = 17 − 1 = 16

99% confidence, α/2 = .005   t.005,16 = 2.921

      x̄ ± t(s/√n) = 10.765 ± 2.921(2.223/√17) = 10.765 ± 1.575

      9.19 ≤ μ ≤ 12.34
8.54  p = .40   E = .03   90% Confidence   z.05 = 1.645

      n = z²pq/E² = (1.645)²(.40)(.60)/(.03)² = 721.61

      Sample 722
8.55  n = 17   s² = 4.941   99% C.I.   df = 17 − 1 = 16
      χ².995,16 = 5.14216   χ².005,16 = 34.2671

      (17 − 1)(4.941)/34.2671 ≤ σ² ≤ (17 − 1)(4.941)/5.14216

      2.307 ≤ σ² ≤ 15.374
8.56  n = 45   x̄ = 213   σ = 48   98% Confidence   z.01 = 2.33

      x̄ ± z(σ/√n) = 213 ± 2.33(48/√45) = 213 ± 16.67

      196.33 ≤ μ ≤ 229.67
8.57  n = 39   x̄ = 37.256   σ = 3.891   90% confidence   z.05 = 1.645

      x̄ ± z(σ/√n) = 37.256 ± 1.645(3.891/√39) = 37.256 ± 1.025

      36.231 ≤ μ ≤ 38.281
8.58  σ = 6   E = 1   98% Confidence   z.01 = 2.33

      n = z²σ²/E² = (2.33)²(6)²/1² = 195.44

      Sample 196
8.59  n = 1,255   x = 714   p̂ = x/n = 714/1,255 = .569   95% Confidence   z.025 = 1.96

      p̂ ± z√(p̂·q̂/n) = .569 ± 1.96√[(.569)(.431)/1,255] = .569 ± .027

      .542 ≤ p ≤ .596
8.60  n = 41   df = 41 − 1 = 40   x̄ = 128   s = 21   98% C.I.   t.01,40 = 2.423

      x̄ ± t(s/√n) = 128 ± 2.423(21/√41) = 128 ± 7.947

      120.05 ≤ μ ≤ 135.95
8.61  n = 60   N = 300   x̄ = 6.717   σ = 3.06   98% Confidence   z.01 = 2.33

      x̄ ± z(σ/√n)√[(N − n)/(N − 1)] = 6.717 ± 2.33(3.06/√60)√[(300 − 60)/(300 − 1)]

      = 6.717 ± 0.825

      5.892 ≤ μ ≤ 7.542
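Because n/N = 60/300 is well above 5%, the finite population correction shrinks the interval noticeably. A minimal Python check of 8.61 (Python is an assumption, not part of the text):

```python
import math

def mean_ci_fpc(mean, sigma, n, N, z):
    # z interval with the finite population correction sqrt((N - n)/(N - 1))
    half = z * (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))
    return mean - half, mean + half

# Problem 8.61: x_bar = 6.717, sigma = 3.06, n = 60, N = 300, z.01 = 2.33
low, high = mean_ci_fpc(6.717, 3.06, 60, 300, 2.33)
```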
8.62  E = $20   σ = $142.50   95% Confidence   z.025 = 1.96

      n = z²σ²/E² = (1.96)²(142.50)²/20² = 195.02

      Sample 196
8.63  n = 245   x = 189   p̂ = x/n = 189/245 = .77   90% Confidence   z.05 = 1.645

      p̂ ± z√(p̂·q̂/n) = .77 ± 1.645√[(.77)(.23)/245] = .77 ± .044

      .726 ≤ p ≤ .814
8.64  n = 90   x = 30   p̂ = x/n = 30/90 = .33   95% Confidence   z.025 = 1.96

      p̂ ± z√(p̂·q̂/n) = .33 ± 1.96√[(.33)(.67)/90] = .33 ± .097

      .233 ≤ p ≤ .427
8.65  n = 12   x̄ = 43.7   s² = 228   df = 12 − 1 = 11   95% C.I.   t.025,11 = 2.201

      x̄ ± t(s/√n) = 43.7 ± 2.201(√228/√12) = 43.7 ± 9.59

      34.11 ≤ μ ≤ 53.29

For σ², 98% C.I.:   χ².99,11 = 3.05350   χ².01,11 = 24.7250

      (12 − 1)(228)/24.7250 ≤ σ² ≤ (12 − 1)(228)/3.05350

      101.44 ≤ σ² ≤ 821.35
8.66  n = 27   x̄ = 4.82   s = 0.37   df = 26   95% CI   t.025,26 = 2.056

      x̄ ± t(s/√n) = 4.82 ± 2.056(0.37/√27) = 4.82 ± .1464

      4.6736 ≤ μ ≤ 4.9664

Since 4.50 is not in the interval, we are 95% confident that μ does not equal 4.50.
8.67  n = 77   x̄ = 2.48   σ = 12   95% Confidence   z.025 = 1.96

      x̄ ± z(σ/√n) = 2.48 ± 1.96(12/√77) = 2.48 ± 2.68

      −0.20 ≤ μ ≤ 5.16

The interval is inconclusive. It says that we are 95% confident that the average arrival time is somewhere between .20 of a minute (12 seconds) early and 5.16 minutes late. Since zero is in the interval, there is a possibility that, on average, the flights are on time.
8.68  n = 560   p̂ = .33   99% Confidence   z.005 = 2.575

      p̂ ± z√(p̂·q̂/n) = .33 ± 2.575√[(.33)(.67)/560] = .33 ± .05

      .28 ≤ p ≤ .38
8.69  p = .50   E = .05   98% Confidence   z.01 = 2.33

      n = z²pq/E² = (2.33)²(.50)(.50)/(.05)² = 542.89

      Sample 543
8.70  n = 27   x̄ = 2.10   s = 0.86   df = 27 − 1 = 26

98% confidence, α/2 = .01   t.01,26 = 2.479

      x̄ ± t(s/√n) = 2.10 ± 2.479(0.86/√27) = 2.10 ± 0.41

      1.69 ≤ μ ≤ 2.51
8.71  n = 23   s = .0631455   df = 23 − 1 = 22   90% C.I.
      χ².95,22 = 12.33801   χ².05,22 = 33.9245

      (23 − 1)(.0631455)²/33.9245 ≤ σ² ≤ (23 − 1)(.0631455)²/12.33801

      .0026 ≤ σ² ≤ .0071
8.72  n = 39   x̄ = 1.294   σ = 0.205   99% Confidence   z.005 = 2.575

      x̄ ± z(σ/√n) = 1.294 ± 2.575(0.205/√39) = 1.294 ± .085

      1.209 ≤ μ ≤ 1.379
8.73  The sample mean fill for the 58 cans is 11.9788 oz. with a standard deviation of .0536 oz. The 99% confidence interval for the population fill is 11.9607 oz. to 11.9969 oz., which does not include 12 oz. We are 99% confident that the population mean is not 12 oz., indicating that the machine may be underfilling the cans.
8.74
The point estimate for the average length of burn of the new bulb is 2198.217
8.75  The point estimate for the average age of a first time buyer is 27.63 years. The sample of 21 buyers produces a standard deviation of 6.54 years. We are 98% confident that the actual population mean age of a first-time home buyer is between 24.0222 years and 31.2378 years.
8.76  A poll of 781 American workers was taken. Of these, 506 drive their cars to work. Thus, the point estimate for the population proportion is 506/781 = .647887. A 95% confidence interval to estimate the population proportion shows that we are 95% confident that the actual value lies between .61324 and .681413. The error of this interval is ± .0340865.
Chapter 9
Statistical Inference:
Hypothesis Testing for Single Populations

LEARNING OBJECTIVES

The main objective of Chapter 9 is to help you to learn how to test hypotheses on single populations, thereby enabling you to:

1.
2.  Understand Type I and Type II errors and know how to solve for Type II errors.
3.  Know how to implement the HTAB system to test hypotheses.
4.
5.
6.
7.
CHAPTER TEACHING STRATEGY

In Chapter 9, the formal process of hypothesis testing is examined. The student can be made aware that much of the development of concepts to this point, including sampling, level of data measurement, descriptive tools such as mean and standard deviation, probability, and distributions, paves the way for testing hypotheses. Often students (and instructors) will say "Why do we need to test this hypothesis when we can make a decision by examining the data?" Sometimes it is true that examining the data could allow hypothesis decisions to be made. However, by using the methodology and structure of hypothesis testing even in "obvious" situations, the researcher has added credibility and rigor to his/her findings. Some statisticians actually report findings in a court of law as an expert witness. Others report their findings in a journal, to the public, to the corporate board, to a client, or to their manager. In each case, by using the hypothesis testing method rather than a "seat of the pants" judgment, the researcher stands on a much firmer foundation by using the principles of hypothesis testing and random sampling. Chapter 9 brings together many of the tools developed to this point and formalizes a procedure for testing hypotheses.

The statistical hypotheses are set up so as to contain all possible decisions. The two-tailed test always has = in the null hypothesis and ≠ in the alternative hypothesis. One-tailed tests are presented with = in the null hypothesis and either > or < in the alternative hypothesis. If in doubt, the researcher should use a two-tailed test. Chapter 9 begins with a two-tailed test example. Often that which the researcher wants to demonstrate true or prove true is set up as the alternative hypothesis. The null hypothesis is that the new theory or idea is not true, the status quo is still true, or that there is no difference. The null hypothesis is assumed to be true before the process begins. Some researchers liken this procedure to a court of law where the defendant is presumed innocent (assume null is true - nothing has happened). Evidence is brought before the judge or jury. If enough evidence is presented, then the null hypothesis (defendant innocent) can no longer be accepted or assumed true. The null hypothesis is rejected as not true and the alternate hypothesis is accepted as true by default. Emphasize that the researcher needs to make a decision after examining the observed statistic.
Some of the key concepts in this chapter are one-tailed and two-tailed tests and Type I and Type II error. In order for a one-tailed test to be conducted, the problem must include some suggestion of a direction to be tested. If the student sees such words as greater, less than, more than, higher, younger, etc., then he/she knows to use a one-tail test. If no direction is given (test to determine if there is a "difference"), then a two-tailed test is called for. Ultimately, students will see that the only effect of using a one-tailed test versus a two-tailed test is on the critical table value. A one-tailed test uses all of the value of alpha in one tail. A two-tailed test splits alpha and uses alpha/2 in each tail, thus creating a critical value that is further out in the distribution. The result is that (all things being the same) it is more difficult to reject the null hypothesis with a two-tailed test. Many computer packages such as MINITAB include in the results a p-value. If you designate that the hypothesis test is a two-tailed test, the computer will double the p-value so that it can be compared directly to alpha.

The student can be told that there are some widely accepted values for alpha (probability of committing a Type I error) in the research world and that
a value is usually selected before the research begins. On the other hand,
since the value of Beta (probability of committing a Type II error) varies with
every possible alternate value of the parameter being tested, Beta is usually
examined and computed over a range of possible values of that parameter.
As you can see, the concepts of hypothesis testing are difficult and represent
higher levels of learning (logic, transfer, etc.). Student understanding of
these concepts will improve as you work your way through the techniques in
this chapter and in chapter 10.
CHAPTER OUTLINE

9.1
9.2  Testing Hypotheses About a Population Mean Using the z Statistic (σ known)
9.3  Testing Hypotheses About a Population Mean Using the t Test (σ unknown)
9.4  Testing Hypotheses About a Proportion
9.5
9.6
KEY TERMS

Alpha (α)
Alternative Hypothesis
Beta (β)
Critical Value
Hypothesis
Hypothesis Testing
Level of Significance
Nonrejection Region
Null Hypothesis
Observed Significance Level
Observed Value
One-tailed Test
Operating-Characteristic Curve (OC)
p-Value Method
Power
Power Curve
Rejection Region
Research Hypothesis
Statistical Hypothesis
Substantive Result
Two-Tailed Test
Type I Error
Type II Error
9.1 a)  Ho: μ = 25
        Ha: μ ≠ 25

   x̄ = 28.1   σ = 8.46   n = 57   α = .01   zc = ±2.575

   z = (x̄ − μ)/(σ/√n) = (28.1 − 25)/(8.46/√57) = 2.77

   Since the observed z = 2.77 > z.005 = 2.575, the decision is to reject the null hypothesis.

b)  Critical sample means:

   zc = (x̄c − μ)/(σ/√n):   ±2.575 = (x̄c − 25)/(8.46/√57)

   x̄c = 25 ± 2.885

   x̄c = 27.885 (upper value)
   x̄c = 22.115 (lower value)
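The z statistic and the critical sample means in 9.1 can be checked with a short Python sketch (Python is an assumption; the critical z comes from the text's table):

```python
import math

# Problem 9.1: H0: mu = 25 vs Ha: mu != 25; x_bar = 28.1, sigma = 8.46, n = 57, alpha = .01
se = 8.46 / math.sqrt(57)
z = (28.1 - 25) / se
reject = abs(z) > 2.575                 # two-tailed critical value z.005

# part b: critical sample means separating rejection from nonrejection
xc_lower, xc_upper = 25 - 2.575 * se, 25 + 2.575 * se
```

Either route, comparing z to ±2.575 or comparing x̄ to the critical means, gives the same decision.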
9.2  Ho: μ = 7.48
     Ha: μ < 7.48

   x̄ = 6.91   n = 24   σ = 1.21   α = .01

   For one-tail, α = .01   zc = −2.33

   z = (x̄ − μ)/(σ/√n) = (6.91 − 7.48)/(1.21/√24) = −2.31

   Since the observed z = −2.31 > zc = −2.33, the decision is to fail to reject the null hypothesis.
9.3 a)  Ho: μ = 1,200
        Ha: μ > 1,200

   x̄ = 1,215   n = 113   σ = 100   α = .10   zc = 1.28

   z = (x̄ − μ)/(σ/√n) = (1,215 − 1,200)/(100/√113) = 1.59

   Since the observed z = 1.59 > zc = 1.28, the decision is to reject the null hypothesis.

b)  The probability of z > 1.59 is .5000 − .4441 = .0559 (the p-value), which is less than α = .10. Reject the null hypothesis.

c)  Critical mean value:

   zc = (x̄c − μ)/(σ/√n):   1.28 = (x̄c − 1,200)/(100/√113)

   x̄c = 1,200 + 12.04 = 1,212.04

   Since the observed x̄ = 1,215 is greater than the critical x̄c = 1,212.04, the decision is to reject the null hypothesis.
9.4  Ho: μ = 82
     Ha: μ < 82

   x̄ = 78.125   n = 32   σ = 9.184   α = .01   z.01 = −2.33

   z = (x̄ − μ)/(σ/√n) = (78.125 − 82)/(9.184/√32) = −2.39

   Since the observed z = −2.39 < z.01 = −2.33, the decision is to reject the null hypothesis.
9.5  Ho: μ = $424.20
     Ha: μ ≠ $424.20

   x̄ = $432.69   n = 54   σ = $33.90   α = .05   z.025 = ±1.96

   z = (x̄ − μ)/(σ/√n) = (432.69 − 424.20)/(33.90/√54) = 1.84

   Since the observed z = 1.84 < z.025 = 1.96, the decision is to fail to reject the null hypothesis.
9.6  H0: μ = $62,600
     Ha: μ < $62,600

   x̄ = $58,974   n = 18   σ = $7,810   α = .01   z.01 = −2.33

   z = (x̄ − μ)/(σ/√n) = (58,974 − 62,600)/(7,810/√18) = −1.97

   Since the observed z = −1.97 > z.01 = −2.33, the decision is to fail to reject the null hypothesis.
9.7  H0: μ = 5
     Ha: μ ≠ 5

   x̄ = 5.0611   n = 42   N = 650   σ = 0.2803   α = .10   z.05 = ±1.645

   z = (x̄ − μ)/[(σ/√n)√((N − n)/(N − 1))]
     = (5.0611 − 5)/[(0.2803/√42)√((650 − 42)/(650 − 1))] = 1.46

   Since the observed z = 1.46 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.
9.8  Ho: μ = 18.2
     Ha: μ < 18.2

   x̄ = 15.6   n = 32   σ = 2.3   α = .10   z.10 = −1.28

   z = (x̄ − μ)/(σ/√n) = (15.6 − 18.2)/(2.3/√32) = −6.39

   Since the observed z = −6.39 < z.10 = −1.28, the decision is to reject the null hypothesis.
9.9  Ho: μ = $4,292
     Ha: μ < $4,292

   x̄ = $4,008   σ = $386   n = 55   α = .01   z.01 = −2.33

   z = (x̄ − μ)/(σ/√n) = (4,008 − 4,292)/(386/√55) = −5.46

   Since the observed z = −5.46 < z.01 = −2.33, the decision is to reject the null hypothesis.
9.10  Ho: μ = 123
      Ha: μ > 123

   α = .05   n = 40   x̄ = 132.36   s = 27.68

   z = (132.36 − 123)/(27.68/√40) = 2.14

   Since the observed z = 2.14 > z.05 = 1.645, the decision is to reject the null hypothesis.
9.11  n = 20   x̄ = 16.45   s = 3.59   df = 20 − 1 = 19   α = .05

      Ho: μ = 16
      Ha: μ ≠ 16

   t.025,19 = ±2.093

   t = (x̄ − μ)/(s/√n) = (16.45 − 16)/(3.59/√20) = 0.56

   Since the observed t = 0.56 < t.025,19 = 2.093, the decision is to fail to reject the null hypothesis.
9.12  n = 51   x̄ = 58.42   s² = 25.68   df = 51 − 1 = 50   α = .01

      Ho: μ = 60
      Ha: μ < 60

   t.01,50 = −2.403

   t = (x̄ − μ)/(s/√n) = (58.42 − 60)/(√25.68/√51) = −2.23

   Since the observed t = −2.23 > t.01,50 = −2.403, the decision is to fail to reject the null hypothesis.
9.13  n = 11   x̄ = 1,236.36   s = 103.81   df = 11 − 1 = 10   α = .05

      Ho: μ = 1,160
      Ha: μ > 1,160

   For a one-tail test, α = .05   t.05,10 = 1.812

   t = (x̄ − μ)/(s/√n) = (1,236.36 − 1,160)/(103.81/√11) = 2.44

   Since the observed t = 2.44 > t.05,10 = 1.812, the decision is to reject the null hypothesis.
9.14  n = 20   x̄ = 8.37   s = .1895   df = 20 − 1 = 19   α = .01

      Ho: μ = 8.3
      Ha: μ ≠ 8.3

   For a two-tail test, α/2 = .005   t.005,19 = ±2.861

   t = (x̄ − μ)/(s/√n) = (8.37 − 8.3)/(.1895/√20) = 1.65

   Since the observed t = 1.65 < t.005,19 = 2.861, the decision is to fail to reject the null hypothesis.
9.15  n = 12   x̄ = 1.85083   s = .02353   df = 12 − 1 = 11   α = .10

      H0: μ = 1.84
      Ha: μ ≠ 1.84

   For a two-tailed test, α/2 = .05   critical t.05,11 = ±1.796

   t = (x̄ − μ)/(s/√n) = (1.85083 − 1.84)/(.02353/√12) = 1.59

   Since the observed t = 1.59 < t.05,11 = 1.796, the decision is to fail to reject the null hypothesis.
9.16  n = 25   x̄ = 3.1948   s = .0889   df = 25 − 1 = 24   α = .01

      Ho: μ = $3.16
      Ha: μ > $3.16

   t.01,24 = 2.492

   t = (x̄ − μ)/(s/√n) = (3.1948 − 3.16)/(.0889/√25) = 1.96

   Since the observed t = 1.96 < t.01,24 = 2.492, the decision is to fail to reject the null hypothesis.
x
9.17 n = 19
= $31.67
s = $1.29
df = 19 1 = 18
= .05
H0: = $32.28
Ha: $32.28
x 31.67 32.28
s
1.29
n
t =
19
= -2.06
t.025,18 = + 2.101
P a g e | 22
9.18  n = 61   x̄ = 3.72   s = 0.65   df = 61 − 1 = 60   α = .01

      H0: μ = 3.51
      Ha: μ > 3.51

   t.01,60 = 2.390

   t = (x̄ − μ)/(s/√n) = (3.72 − 3.51)/(0.65/√61) = 2.52

   Since the observed t = 2.52 > t.01,60 = 2.390, the decision is to reject the null hypothesis.
9.19  n = 22   x̄ = 1,031.32   s = 240.37   df = 22 − 1 = 21   α = .05

      H0: μ = 1,135
      Ha: μ ≠ 1,135

   t.025,21 = ±2.080

   t = (x̄ − μ)/(s/√n) = (1,031.32 − 1,135)/(240.37/√22) = −2.02

   Since the observed t = −2.02 > t.025,21 = −2.080, the decision is to fail to reject the null hypothesis.
9.20  n = 12   x̄ = 42.167   s = 9.124   df = 12 − 1 = 11   α = .01

      H0: μ = 46
      Ha: μ < 46

   t.01,11 = −2.718

   t = (x̄ − μ)/(s/√n) = (42.167 − 46)/(9.124/√12) = −1.46

   Since the observed t = −1.46 > t.01,11 = −2.718, the decision is to fail to reject the null hypothesis.
9.21  n = 26   x̄ = 19.534 minutes   s = 4.100 minutes   df = 26 − 1 = 25   α = .05

      H0: μ = 19
      Ha: μ ≠ 19

   t.025,25 = ±2.060

   t = (x̄ − μ)/(s/√n) = (19.534 − 19)/(4.100/√26) = 0.66

   Since the observed t = 0.66 < t.025,25 = 2.060, the decision is to fail to reject the null hypothesis.
9.22  Ho: p = .45
      Ha: p > .45

   n = 310   p̂ = .465   α = .05   z.05 = 1.645

   z = (p̂ − p)/√(pq/n) = (.465 − .45)/√[(.45)(.55)/310] = 0.53

   Since the observed z = 0.53 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.
9.23  Ho: p = 0.63
      Ha: p < 0.63

   n = 100   x = 55   p̂ = x/n = 55/100 = .55   z.01 = −2.33

   z = (p̂ − p)/√(pq/n) = (.55 − .63)/√[(.63)(.37)/100] = −1.66

   Since the observed z = −1.66 > z.01 = −2.33, the decision is to fail to reject the null hypothesis.
9.24  Ho: p = .29
      Ha: p ≠ .29

   n = 740   x = 207   p̂ = x/n = 207/740 = .28   α = .05   z.025 = ±1.96

   z = (p̂ − p)/√(pq/n) = (.28 − .29)/√[(.29)(.71)/740] = −0.60

   Since the observed z = −0.60 > z.025 = −1.96, the decision is to fail to reject the null hypothesis.

p-Value Method:

   z = −0.60

   Since the p-value = .2743 > α/2 = .025, the decision is to fail to reject the null hypothesis.

Critical value method:

   zc = (p̂c − p)/√(pq/n):   ±1.96 = (p̂c − .29)/√[(.29)(.71)/740]

   p̂c = .29 ± .033

   Since p̂ = .28 falls between the critical values .257 and .323, the decision is to fail to reject the null hypothesis.
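The p-value route in 9.24 can be sketched in Python using the standard normal CDF built from the error function (Python is an assumption; the text reads the .2743 area from Table A.5):

```python
import math

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Problem 9.24: H0: p = .29 vs Ha: p != .29; n = 740; the text rounds p_hat = 207/740 to .28
se = math.sqrt(0.29 * 0.71 / 740)
z = (0.28 - 0.29) / se
p_value_one_tail = norm_cdf(z)          # lower-tail area for this negative z
reject = p_value_one_tail < 0.05 / 2    # compared against alpha/2, as the text does
```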
9.25  Ho: p = .48
      Ha: p ≠ .48

   n = 380   x = 164   p̂ = x/n = 164/380 = .4316   α = .01   α/2 = .005   z.005 = ±2.575

   z = (p̂ − p)/√(pq/n) = (.4316 − .48)/√[(.48)(.52)/380] = −1.89

   Since the observed z = −1.89 is greater than z.005 = −2.575, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the proportion is any different than .48.
9.26  Ho: p = .79
      Ha: p < .79

   n = 415   x = 303   p̂ = x/n = 303/415 = .7301   α = .01   z.01 = −2.33

   z = (p̂ − p)/√(pq/n) = (.7301 − .79)/√[(.79)(.21)/415] = −3.00

   Since the observed z = −3.00 is less than z.01 = −2.33, the decision is to reject the null hypothesis.
9.27  Ho: p = .31
      Ha: p ≠ .31

   n = 600   x = 200   p̂ = x/n = 200/600 = .3333   α = .10   α/2 = .05   z.05 = ±1.645

   z = (p̂ − p)/√(pq/n) = (.3333 − .31)/√[(.31)(.69)/600] = 1.23

   Since the observed z = 1.23 is less than z.05 = 1.645, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the proportion is any different than .31.

      Ho: p = .24
      Ha: p < .24

   n = 600   x = 130   p̂ = x/n = 130/600 = .2167   α = .05   z.05 = −1.645

   z = (p̂ − p)/√(pq/n) = (.2167 − .24)/√[(.24)(.76)/600] = −1.34

   Since the observed z = −1.34 is greater than z.05 = −1.645, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the proportion is less than .24.
9.28  Ho: p = .18
      Ha: p > .18

   n = 376   p̂ = .22   α = .01   one-tailed test   z.01 = 2.33

   z = (p̂ − p)/√(pq/n) = (.22 − .18)/√[(.18)(.82)/376] = 2.02

   Since the observed z = 2.02 is less than z.01 = 2.33, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the proportion is greater than .18.
9.29  Ho: p = .32
      Ha: p < .32

   n = 118   x = 22   p̂ = x/n = 22/118 = .1864   α = .05   z.05 = −1.645

   z = (p̂ − p)/√(pq/n) = (.1864 − .32)/√[(.32)(.68)/118] = −3.11

   Since the observed z = −3.11 is less than z.05 = −1.645, the decision is to reject the null hypothesis.
9.30  Ho: p = .47
      Ha: p ≠ .47

   n = 67   x = 40   p̂ = x/n = 40/67 = .597   α = .05   α/2 = .025   z.025 = ±1.96

   z = (p̂ − p)/√(pq/n) = (.597 − .47)/√[(.47)(.53)/67] = 2.08

   Since the observed z = 2.08 is greater than z.025 = 1.96, the decision is to reject the null hypothesis.
9.31 a)  H0: σ² = 20   Ha: σ² > 20   α = .05   n = 15   df = 15 − 1 = 14   s² = 32

   χ².05,14 = 23.6848

   χ² = (n − 1)s²/σ² = (15 − 1)(32)/20 = 22.4

   Since χ² = 22.4 < χ².05,14 = 23.6848, the decision is to fail to reject the null hypothesis.

b)  H0: σ² = 8.5   Ha: σ² ≠ 8.5   α = .10   α/2 = .05   n = 22   df = n − 1 = 21   s² = 17

   χ².05,21 = 32.6706

   χ² = (22 − 1)(17)/8.5 = 42

   Since χ² = 42 > χ².05,21 = 32.6706, the decision is to reject the null hypothesis.

c)  H0: σ² = 45   Ha: σ² < 45   α = .01   n = 8   df = n − 1 = 7   s = 4.12

   χ².99,7 = 1.23904

   χ² = (8 − 1)(4.12)²/45 = 2.64

   Since χ² = 2.64 > χ².99,7 = 1.23904, the decision is to fail to reject the null hypothesis.

d)  H0: σ² = 5   Ha: σ² ≠ 5   α = .05   α/2 = .025   n = 11   df = 11 − 1 = 10   s² = 1.2

   χ².025,10 = 20.4832   χ².975,10 = 3.24696

   χ² = (11 − 1)(1.2)/5 = 2.4

   Since χ² = 2.4 < χ².975,10 = 3.24696, the decision is to reject the null hypothesis.
9.32  H0: σ² = 14   Ha: σ² ≠ 14   α = .05   α/2 = .025   n = 12   df = 12 − 1 = 11
      s² = 30.0833

   χ².025,11 = 21.9200   χ².975,11 = 3.81574

   χ² = (12 − 1)(30.0833)/14 = 23.64

   Since χ² = 23.64 > χ².025,11 = 21.9200, the decision is to reject the null hypothesis.
9.33  H0: σ² = .001   Ha: σ² > .001   α = .01   n = 16   df = 16 − 1 = 15   s² = .00144667

   χ².01,15 = 30.5780

   χ² = (16 − 1)(.00144667)/.001 = 21.7

   Since χ² = 21.7 < χ².01,15 = 30.5780, the decision is to fail to reject the null hypothesis.
9.34  H0: σ² = 199,996,164   Ha: σ² ≠ 199,996,164   α = .10   α/2 = .05   n = 13
      df = 13 − 1 = 12   s² = 832,089,743.7

   χ².05,12 = 21.0261   χ².95,12 = 5.22603

   χ² = (13 − 1)(832,089,743.7)/199,996,164 = 49.93

   Since χ² = 49.93 > χ².05,12 = 21.0261, the decision is to reject the null hypothesis.
9.35  H0: σ² = .04   Ha: σ² > .04   α = .01   n = 7   df = 7 − 1 = 6   s = .34   s² = .1156

   χ².01,6 = 16.8119

   χ² = (7 − 1)(.1156)/.04 = 17.34

   Since χ² = 17.34 > χ².01,6 = 16.8119, the decision is to reject the null hypothesis.
9.36  H0: μ = 100   Ha: μ < 100   n = 48   σ = 14

a)  μa = 99   α = .10   z.10 = −1.28

   zc = (x̄c − μ)/(σ/√n):   −1.28 = (x̄c − 100)/(14/√48)   →   x̄c = 97.4

   z = (x̄c − μa)/(σ/√n) = (97.4 − 99)/(14/√48) = −0.79

   β = P(z > −0.79) = .5000 + .2852 = .7852

b)  α = .05   z.05 = −1.645

   −1.645 = (x̄c − 100)/(14/√48)   →   x̄c = 96.68

   z = (96.68 − 99)/(14/√48) = −1.15

   β = P(z > −1.15) = .5000 + .3749 = .8749

c)  α = .01   z.01 = −2.33

   −2.33 = (x̄c − 100)/(14/√48)   →   x̄c = 95.29

   z = (95.29 − 99)/(14/√48) = −1.84

   β = P(z > −1.84) = .5000 + .4671 = .9671

d)  As α decreases, the critical value moves farther from the hypothesized mean and β, the probability of committing a Type II error, increases.
9.37  H0: μ = 100   Ha: μ < 100   n = 48   σ = 14   α = .05   zc = −1.645

   zc = (x̄c − μ)/(σ/√n):   −1.645 = (x̄c − 100)/(14/√48)   →   x̄c = 96.68

a)  μa = 98.5

   z = (96.68 − 98.5)/(14/√48) = −0.90

   β = P(z > −0.90) = .5000 + .3159 = .8159

b)  μa = 98

   z = (96.68 − 98)/(14/√48) = −0.65

   β = P(z > −0.65) = .5000 + .2422 = .7422

c)  μa = 97

   z = (96.68 − 97)/(14/√48) = −0.16

   β = P(z > −0.16) = .5000 + .0636 = .5636

d)  μa = 96

   z = (96.68 − 96)/(14/√48) = 0.34

   β = P(z > 0.34) = .5000 − .1331 = .3669

e)  As the alternative value gets farther from the null hypothesized value, the probability of committing a Type II error reduces (all other variables being held constant).
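The pattern in 9.37, fixing the critical mean and re-evaluating it under each alternative, can be sketched in Python (an assumption; the text uses Table A.5 areas):

```python
import math

def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def beta_lower_tail(mu0, mu_a, sigma, n, z_alpha):
    # Type II error for H0: mu = mu0 vs Ha: mu < mu0 at the alternative mu_a
    se = sigma / math.sqrt(n)
    x_c = mu0 - z_alpha * se          # critical sample mean
    z = (x_c - mu_a) / se
    return 1 - norm_cdf(z)            # P(fail to reject) = P(x_bar > x_c | mu_a)

# Problem 9.37: mu0 = 100, sigma = 14, n = 48, alpha = .05 (z.05 = 1.645)
beta_a = beta_lower_tail(100, 98.5, 14, 48, 1.645)   # part a
beta_d = beta_lower_tail(100, 96.0, 14, 48, 1.645)   # part d
```

Alternatives far from the hypothesized mean produce a much smaller β, exactly the point of part e).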
9.38  Ho: μ = 50   Ha: μ ≠ 50   μa = 53   n = 35   σ = 7   α = .01   z.005 = ±2.575

   zc = (x̄c − μ)/(σ/√n):   ±2.575 = (x̄c − 50)/(7/√35)   →   x̄c = 50 ± 3.05

   Upper end:  z = (53.05 − 53)/(7/√35) = 0.04

   Other end:  z = (46.95 − 53)/(7/√35) = −5.11

   Area associated with z = −5.11 is .5000; area associated with z = 0.04 is .0160.

   β = .5000 + .0160 = .5160
9.39 a)  Ho: p = .65   Ha: p < .65   α = .05   n = 360   pa = .60   z.05 = −1.645

   zc = (p̂c − p)/√(pq/n):   −1.645 = (p̂c − .65)/√[(.65)(.35)/360]   →   p̂c = .609

   z = (p̂c − pa)/√(pa·qa/n) = (.609 − .60)/√[(.60)(.40)/360] = 0.35

   β = P(z > 0.35) = .5000 − .1368 = .3632

b)  pa = .55   p̂c = .609

   z = (.609 − .55)/√[(.55)(.45)/360] = 2.25

   β = P(z > 2.25) = .5000 − .4878 = .0122

c)  pa = .50   p̂c = .609

   z = (.609 − .50)/√[(.50)(.50)/360] = 4.14

   β = P(z > 4.14) ≈ .0000
9.40  n = 58   x̄ = 45.1   σ = 8.7   α = .05   α/2 = .025

   H0: μ = 44   Ha: μ ≠ 44   z.025 = ±1.96

   z = (45.1 − 44)/(8.7/√58) = 0.96

   Since z = 0.96 < zc = 1.96, the decision is to fail to reject the null hypothesis.

   Critical values:  ±1.96 = (x̄c − 44)/(8.7/√58)   →   x̄c = 44 ± 2.239   →   41.761 and 46.239

   For 45 years:  z = (46.239 − 45)/(8.7/√58) = 1.08   β = .5000 + .3599 = .8599

   For 46 years:  z = (46.239 − 46)/(8.7/√58) = 0.21   β = .5000 + .0832 = .5832

   For 47 years:  z = (46.239 − 47)/(8.7/√58) = −0.67   β = .5000 − .2486 = .2514

   For 48 years:  z = (46.239 − 48)/(8.7/√58) = −1.54   β = .5000 − .4382 = .0618
9.41  Ho: p = .71   Ha: p < .71

   n = 463   x = 324   p̂ = x/n = 324/463 = .6998   α = .10   z.10 = −1.28

   z = (p̂ − p)/√(pq/n) = (.6998 − .71)/√[(.71)(.29)/463] = −0.48

   Since the observed z = −0.48 > z.10 = −1.28, the decision is to fail to reject the null hypothesis.

Type II error:

   Solving for the critical proportion:

   zc = (p̂c − p)/√(pq/n):   −1.28 = (p̂c − .71)/√[(.71)(.29)/463]   →   p̂c = .683

   For pa = .69:  z = (.683 − .69)/√[(.69)(.31)/463] = −0.33   β = P(z > −0.33) = .6293

   For pa = .66:  z = (.683 − .66)/√[(.66)(.34)/463] = 1.04   β = P(z > 1.04) = .1492

   For pa = .60:  z = (.683 − .60)/√[(.60)(.40)/463] = 3.65   β = P(z > 3.65) ≈ .0001
9.42  HTAB steps:

1) Ho: μ = 36
   Ha: μ ≠ 36

2) z = (x̄ − μ)/(σ/√n)

3) α = .01

4) two-tailed test, α/2 = .005, z.005 = ±2.575
   If the observed value of z is greater than 2.575 or less than −2.575, the decision will be to reject the null hypothesis.

5) n = 63, x̄ = 38.4, σ = 5.93

6) z = (38.4 − 36)/(5.93/√63) = 3.21

7) Since the observed value of z = 3.21 is greater than z.005 = 2.575, the decision is to reject the null hypothesis.
9.43  HTAB steps:

1) Ho: μ = 7.82
   Ha: μ < 7.82

2) t = (x̄ − μ)/(s/√n)

3) α = .05

4) df = 17 − 1 = 16, one-tailed test, t.05,16 = −1.746

5) n = 17, x̄ = 7.01, s = 1.69

6) t = (7.01 − 7.82)/(1.69/√17) = −1.98

7) Since the observed t = −1.98 is less than the table value of t = −1.746, the decision is to reject the null hypothesis.
9.44  HTAB steps:

a. 1) Ho: p = .28
      Ha: p > .28

   2) z = (p̂ − p)/√(pq/n)

   3) α = .10

   4) This is a one-tailed test, z.10 = 1.28. If the observed z is greater than 1.28, the decision will be to reject the null hypothesis.

   5) n = 783, x = 230, p̂ = 230/783 = .2937

   6) z = (.2937 − .28)/√[(.28)(.72)/783] = 0.85

   7) Since z = 0.85 is less than z.10 = 1.28, the decision is to fail to reject the null hypothesis.

b. 1) Ho: p = .61
      Ha: p ≠ .61

   2) z = (p̂ − p)/√(pq/n)

   3) α = .05

   4) This is a two-tailed test, z.025 = ±1.96. If the observed value of z is greater than 1.96 or less than −1.96, then the decision will be to reject the null hypothesis.

   5) n = 401, p̂ = .56

   6) z = (.56 − .61)/√[(.61)(.39)/401] = −2.05

   7) Since z = −2.05 is less than z.025 = −1.96, the decision is to reject the null hypothesis.

   8) The population proportion is not likely to be .61.
9.45  HTAB steps:

1) H0: σ² = 15.4
   Ha: σ² > 15.4

2) χ² = (n − 1)s²/σ²

3) α = .01

4) n = 18, df = 17, one-tailed test
   χ².01,17 = 33.4087

5) s² = 29.6

6) χ² = (n − 1)s²/σ² = (17)(29.6)/15.4 = 32.675

7) Since χ² = 32.675 < χ².01,17 = 33.4087, the decision is to fail to reject the null hypothesis.
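Step 6 of the HTAB sequence above is a one-line computation; a minimal Python sketch of it, with the table value hard-coded from the solution (Python is an assumption, not a tool the text uses):

```python
def chi2_stat(s2, n, sigma2_0):
    # chi-square test statistic for H0: sigma^2 = sigma2_0
    return (n - 1) * s2 / sigma2_0

# Problem 9.45: H0: sigma^2 = 15.4, s^2 = 29.6, n = 18; table value chi2_.01,17 = 33.4087
chi2 = chi2_stat(29.6, 18, 15.4)
reject = chi2 > 33.4087
```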
9.46 a)  H0: μ = 130   Ha: μ > 130   n = 75   σ = 12   α = .01   z.01 = 2.33   μa = 135

   Solving for x̄c:

   zc = (x̄c − μ)/(σ/√n):   2.33 = (x̄c − 130)/(12/√75)   →   x̄c = 133.23

   z = (133.23 − 135)/(12/√75) = −1.28

   β = P(z < −1.28) = .5000 − .3997 = .1003

b)  H0: p = .44   Ha: p < .44   n = 1,095   α = .05   pa = .42   z.05 = −1.645

   zc = (p̂c − p)/√(pq/n):   −1.645 = (p̂c − .44)/√[(.44)(.56)/1,095]   →   p̂c = .4153

   z = (.4153 − .42)/√[(.42)(.58)/1,095] = −0.32

   β = P(z > −0.32) = .5000 + .1255 = .6255
9.47  Ho: p = .32   Ha: p > .32

   n = 80   p̂ = .39   α = .01   z.01 = 2.33

   z = (p̂ − p)/√(pq/n) = (.39 − .32)/√[(.32)(.68)/80] = 1.34

   Since the observed z = 1.34 < z.01 = 2.33, the decision is to fail to reject the null hypothesis.
9.48  Ho: μ = 3.3   Ha: μ ≠ 3.3

   n = 64   x̄ = 3.45   σ² = 1.31   α = .05   zc = ±1.96

   z = (x̄ − μ)/(σ/√n) = (3.45 − 3.3)/(√1.31/√64) = 1.05

   Since the observed z = 1.05 < zc = 1.96, the decision is to fail to reject the null hypothesis.
9.49  Ho: p = .57   Ha: p < .57

   n = 210   x = 93   p̂ = x/n = 93/210 = .443   α = .10   zc = −1.28

   z = (p̂ − p)/√(pq/n) = (.443 − .57)/√[(.57)(.43)/210] = −3.72

   Since the observed z = −3.72 < zc = −1.28, the decision is to reject the null hypothesis.
9.50  H0: σ² = 16   Ha: σ² > 16   n = 12   α = .05   df = 12 − 1 = 11   s = 5.98544

   χ².05,11 = 19.6752

   χ² = (12 − 1)(5.98544)²/16 = 24.63

   Since χ² = 24.63 > χ².05,11 = 19.6752, the decision is to reject the null hypothesis.
9.51  H0: μ = 8.4   Ha: μ ≠ 8.4   α = .01   α/2 = .005   n = 7   df = 7 − 1 = 6
      x̄ = 5.6   s = 1.3

   t.005,6 = ±3.707

   t = (x̄ − μ)/(s/√n) = (5.6 − 8.4)/(1.3/√7) = −5.70

   Since the observed t = −5.70 < t.005,6 = −3.707, the decision is to reject the null hypothesis.
9.52  x̄ = $26,650   n = 100   σ = $12,000

a)  Ho: μ = $25,000   Ha: μ > $25,000   α = .05   z.05 = 1.645

   z = (x̄ − μ)/(σ/√n) = (26,650 − 25,000)/(12,000/√100) = 1.38

   Since the observed z = 1.38 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.

b)  μa = $30,000   zc = 1.645

   Solving for x̄c:

   zc = (x̄c − μ)/(σ/√n):   1.645 = (x̄c − 25,000)/(12,000/√100)   →   x̄c = 26,974

   z = (26,974 − 30,000)/(12,000/√100) = −2.52

   β = P(z < −2.52) = .5000 − .4941 = .0059
9.53  H0: σ² = 4   Ha: σ² > 4   n = 8   df = 8 − 1 = 7   s = 7.80   α = .10

   χ².10,7 = 12.0170

   χ² = (8 − 1)(7.80)²/4 = 106.47

   Since χ² = 106.47 > χ².10,7 = 12.0170, the decision is to reject the null hypothesis.
9.54  Ho: p = .46   Ha: p > .46

   n = 125   x = 66   p̂ = x/n = 66/125 = .528   α = .05   z.05 = 1.645

   z = (p̂ − p)/√(pq/n) = (.528 − .46)/√[(.46)(.54)/125] = 1.53

   Since the observed value of z = 1.53 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.

   Solving for p̂c:

   zc = (p̂c − p)/√(pq/n):   1.645 = (p̂c − .46)/√[(.46)(.54)/125]   →   p̂c = .533

   and therefore, for pa = .50:

   z = (p̂c − pa)/√(pa·qa/n) = (.533 − .50)/√[(.50)(.50)/125] = 0.74

   β = P(z < 0.74) = .5000 + .2704 = .7704
9.55  n = 16   x̄ = 175   s = 14.28286   df = 16 − 1 = 15   α = .05

   H0: μ = 185   Ha: μ < 185

   t.05,15 = −1.753

   t = (x̄ − μ)/(s/√n) = (175 − 185)/(14.28286/√16) = −2.80

   Since the observed t = −2.80 < t.05,15 = −1.753, the decision is to reject the null hypothesis.
9.56  H0: p = .182   Ha: p > .182

   n = 428   x = 84   p̂ = x/n = 84/428 = .1963   α = .01   z.01 = 2.33

   z = (p̂ − p)/√(pq/n) = (.1963 − .182)/√[(.182)(.818)/428] = 0.77

   Since the observed z = 0.77 < z.01 = 2.33, the decision is to fail to reject the null hypothesis.

   Solving for p̂c:

   zc = (p̂c − p)/√(pq/n):   2.33 = (p̂c − .182)/√[(.182)(.818)/428]   →   p̂c = .2255

   For pa = .21:

   z = (p̂c − pa)/√(pa·qa/n) = (.2255 − .21)/√[(.21)(.79)/428] = 0.79

   β = P(z < 0.79) = .5000 + .2852 = .7852
9.57  Ho: μ = $15   Ha: μ > $15

   x̄ = $19.34   n = 35   σ = $4.52   α = .10   zc = 1.28

   z = (x̄ − μ)/(σ/√n) = (19.34 − 15)/(4.52/√35) = 5.68

   Since the observed z = 5.68 > zc = 1.28, the decision is to reject the null hypothesis.
9.58  H0: σ² = 16   Ha: σ² > 16   n = 22   df = 22 − 1 = 21   s = 6   α = .05

   χ².05,21 = 32.6706

   χ² = (22 − 1)(6)²/16 = 47.25

   Since χ² = 47.25 > χ².05,21 = 32.6706, the decision is to reject the null hypothesis.
9.59  H0: μ = 2.5   Ha: μ > 2.5

   x̄ = 3.4   s = 0.6   α = .01   n = 9   df = 9 − 1 = 8   t.01,8 = 2.896

   t = (x̄ − μ)/(s/√n) = (3.4 − 2.5)/(0.6/√9) = 4.50

   Since the observed t = 4.50 > t.01,8 = 2.896, the decision is to reject the null hypothesis.
9.60 a)  Ho: μ = 23.58   Ha: μ ≠ 23.58

   n = 95   x̄ = 22.83   σ = 5.11   α = .05   z.025 = ±1.96

   z = (x̄ − μ)/(σ/√n) = (22.83 − 23.58)/(5.11/√95) = −1.43

   Since the observed z = −1.43 > z.025 = −1.96, the decision is to fail to reject the null hypothesis.

b)  zc = (x̄c − μ)/(σ/√n):   ±1.96 = (x̄c − 23.58)/(5.11/√95)

   x̄c = 23.58 ± 1.03   →   x̄c = 22.55, 24.61

   For μa = 22.30:

   z = (x̄c − μa)/(σ/√n) = (22.55 − 22.30)/(5.11/√95) = 0.48

   z = (24.61 − 22.30)/(5.11/√95) = 4.41

   From Table A.5, the areas for z = 0.48 and z = 4.41 are .1844 and .5000.

   β = .5000 − .1844 = .3156
9.61  n = 12   x̄ = 12.333   s² = 10.424

   H0: σ² = 2.5   Ha: σ² ≠ 2.5   α = .05   df = 11

   χ².025,11 = 21.9200   χ².975,11 = 3.81574

   χ² = (n − 1)s²/σ² = (11)(10.424)/2.5 = 45.866

   Since χ² = 45.866 > χ².025,11 = 21.9200, the decision is to reject the null hypothesis.
9.62  H0: μ = 23   Ha: μ < 23

   n = 16   x̄ = 18.5   s = 6.91   df = 16 − 1 = 15   α = .10

   t.10,15 = −1.341

   t = (x̄ − μ)/(s/√n) = (18.5 − 23)/(6.91/√16) = −2.60

   Since the observed t = −2.60 < t.10,15 = −1.341, the decision is to reject the null hypothesis.
9.63  The sample mean is 3.969, s = 0.866, df = 21.

   t = (x̄ − μ)/(s/√n)
9.64  H0: p = .25   Ha: p ≠ .25

   Since the p-value = .045 < α = .05, the decision is to reject the null hypothesis. The sample proportion, p̂ = .205729, is less than the hypothesized p = .25.
9.65  H0: μ = 2.51   Ha: μ > 2.51

   This is a one-tailed test. The sample mean is 2.55, which is more than the hypothesized value. The observed t value is 1.51 with an associated p-value of .072 for a one-tailed test. Because the p-value is greater than α = .05, the decision is to fail to reject the null hypothesis.
9.66  H0: μ = 2747   Ha: μ < 2747

   This is a one-tailed test. Sixty-seven households were included in this study. The sample average amount spent on home-improvement projects was 2,349. Since z = −2.09 < z.05 = −1.645, the decision is to reject the null hypothesis at α = .05.
Chapter 10
Statistical Inferences about Two Populations
LEARNING OBJECTIVES
1.
2.
3.
5.
CHAPTER TEACHING STRATEGY
In conducting a t test for the difference of two means from independent populations, there are two different formulas given in the chapter. One version of this test uses a "pooled" estimate of the population variance and assumes that the population variances are equal. The other version does not assume equal population variances and is simpler to compute. In doing hand calculations, it is generally easier to use the pooled variance formula because the degrees of freedom formula for the unequal variance formula is quite complex. However, it is good to expose students to both formulas since computer software packages often give you the option of using the pooled formula that assumes equal population variances or the formula for unequal variances.
CHAPTER OUTLINE

10.3  Statistical Inferences for Two Related Populations
          Confidence Intervals
10.4
10.5  Testing Hypotheses About Two Population Variances
KEY TERMS

Dependent Samples
F Distribution
F Value
Independent Samples
Matched-Pairs Test
Related Measures
SOLUTIONS TO PROBLEMS IN CHAPTER 10
10.1  Sample 1:  x̄1 = 51.3   s1² = 52   n1 = 31
      Sample 2:  x̄2 = 53.2   s2² = 60   n2 = 32

a)  Ho: μ1 − μ2 = 0
    Ha: μ1 − μ2 < 0

   z.10 = −1.28

   z = [(x̄1 − x̄2) − (μ1 − μ2)]/√(s1²/n1 + s2²/n2) = [(51.3 − 53.2) − 0]/√(52/31 + 60/32) = −1.01

   Since the observed z = −1.01 > z.10 = −1.28, the decision is to fail to reject the null hypothesis.

b)  Critical value:

   zc = [(x̄1 − x̄2)c − (μ1 − μ2)]/√(s1²/n1 + s2²/n2)

   −1.28 = [(x̄1 − x̄2)c − 0]/√(52/31 + 60/32)

   (x̄1 − x̄2)c = −2.41

c)
10.2  Sample 1:  n1 = 32   x̄1 = 70.4   σ1 = 5.76
      Sample 2:  n2 = 31   x̄2 = 68.7   σ2 = 6.1

   90% CI   z.05 = 1.645

   (x̄1 − x̄2) ± z√(σ1²/n1 + σ2²/n2) = (70.4 − 68.7) ± 1.645√(5.76²/32 + 6.1²/31)

   = 1.7 ± 2.46

   −0.76 ≤ μ1 − μ2 ≤ 4.16
10.3 a)  Sample 1:  x̄1 = 88.23   σ1² = 22.74   n1 = 30
         Sample 2:  x̄2 = 81.2   σ2² = 26.65   n2 = 30

   Ho: μ1 − μ2 = 0
   Ha: μ1 − μ2 ≠ 0

   For a two-tail test, α/2 = .01   z.01 = ±2.33

   z = [(x̄1 − x̄2) − (μ1 − μ2)]/√(σ1²/n1 + σ2²/n2) = [(88.23 − 81.2) − 0]/√(22.74/30 + 26.65/30) = 5.48

   Since the observed z = 5.48 > z.01 = 2.33, the decision is to reject the null hypothesis.

b)  (x̄1 − x̄2) ± z√(σ1²/n1 + σ2²/n2) = (88.23 − 81.2) ± 2.33√(22.74/30 + 26.65/30)

   7.03 ± 2.99

   4.04 ≤ μ1 − μ2 ≤ 10.02

   This supports the decision made in a) to reject the null hypothesis because zero is not in the interval.
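The statistic and interval in 10.3 share the same standard error term, which a short Python sketch makes explicit (Python is an assumption; z.01 = 2.33 is the text's table value):

```python
import math

def two_sample_z(x1, var1, n1, x2, var2, n2):
    # z statistic for the difference in two means, population variances known
    return (x1 - x2) / math.sqrt(var1 / n1 + var2 / n2)

# Problem 10.3: x_bar1 = 88.23, var1 = 22.74, n1 = 30; x_bar2 = 81.2, var2 = 26.65, n2 = 30
z = two_sample_z(88.23, 22.74, 30, 81.2, 26.65, 30)
half = 2.33 * math.sqrt(22.74 / 30 + 26.65 / 30)   # 98% half-width (z.01 = 2.33)
```

Because the half-width (about 2.99) is smaller than the observed difference 7.03, zero cannot fall inside the interval, matching the rejection in part a).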
10.4  Computers/electronics:  x̄1 = 1.96   σ1² = 1.0188   n1 = 50
      Food/Beverage:  x̄2 = 3.02   σ2² = .9180   n2 = 50

   Ho: μ1 − μ2 = 0
   Ha: μ1 − μ2 ≠ 0

   z.005 = ±2.575

   z = [(x̄1 − x̄2) − (μ1 − μ2)]/√(σ1²/n1 + σ2²/n2) = [(1.96 − 3.02) − 0]/√(1.0188/50 + .9180/50) = −5.39

   Since the observed z = −5.39 < z.005 = −2.575, the decision is to reject the null hypothesis.
10.5  Sample 1:  n1 = 40   x̄1 = 5.3   σ1² = 1.99
      Sample 2:  n2 = 37   x̄2 = 6.5   σ2² = 2.36

   95% CI   z.025 = 1.96

   (x̄1 − x̄2) ± z√(σ1²/n1 + σ2²/n2) = (5.3 − 6.5) ± 1.96√(1.99/40 + 2.36/37)

   = −1.2 ± .66

   −1.86 ≤ μ1 − μ2 ≤ −0.54
10.6  Managers:  n1 = 35   x̄1 = 1.84   σ1 = .38
      Specialty:  n2 = 41   x̄2 = 1.99   σ2 = .51

   98% CI   z.01 = 2.33

   (x̄1 − x̄2) ± z√(σ1²/n1 + σ2²/n2) = (1.84 − 1.99) ± 2.33√(.38²/35 + .51²/41)

   = −.15 ± .2384

   −.3884 ≤ μ1 − μ2 ≤ .0884

   Hypothesis Test:

   1) Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

   2) z = [(x̄1 − x̄2) − (μ1 − μ2)]/√(σ1²/n1 + σ2²/n2)

   3) α = .02

   6) z = [(1.84 − 1.99) − 0]/√(.38²/35 + .51²/41) = −1.47

   7) Since z = −1.47 > z.01 = −2.33, the decision is to fail to reject the null hypothesis.
10.7 1996: x̄1 = 190, σ1 = 18.50, n1 = 51
     2006: x̄2 = 198, σ2 = 15.60, n2 = 47

α = .01

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 < 0

For a one-tailed test, z.01 = −2.33

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) = ((190 − 198) − 0) / √(18.50²/51 + 15.60²/47) = −2.32

Since the observed z = −2.32 > z.01 = −2.33, the decision is to fail to reject the null hypothesis.
10.8 Seattle: n1 = 31, x̄1 = 2.64, σ1² = .03
     Atlanta: n2 = 31, x̄2 = 2.36, σ2² = .015

For 99% confidence, z.005 = 2.575

(x̄1 − x̄2) ± z √(σ1²/n1 + σ2²/n2) = (2.64 − 2.36) ± 2.575 √(.03/31 + .015/31)

.28 ± .10

.18 ≤ μ1 − μ2 ≤ .38

Between $.18 and $.38 difference with Seattle being more expensive.
10.9 Canon: x̄1 = 5.8, σ1 = 1.7, n1 = 36
     Pioneer: x̄2 = 5.0, σ2 = 1.4, n2 = 45

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

z.025 = ±1.96

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) = ((5.8 − 5.0) − 0) / √(1.7²/36 + 1.4²/45) = 2.27

Since the observed z = 2.27 > zc = 1.96, the decision is to reject the null hypothesis.
10.10 x̄1 = 8.05, σ1 = 1.36, n1 = 50
      x̄2 = 7.26, σ2 = 1.06, n2 = 38

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

z.10 = 1.28

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) = ((8.05 − 7.26) − 0) / √(1.36²/50 + 1.06²/38) = 3.06

Since the observed z = 3.06 > zc = 1.28, the decision is to reject the null hypothesis.
10.11 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 < 0

α = .01, df = 8 + 11 − 2 = 17

Sample 1: n1 = 8, x̄1 = 24.56, s1² = 12.4
Sample 2: n2 = 11, x̄2 = 26.42, s2² = 15.8

t = ((x̄1 − x̄2) − (μ1 − μ2)) / √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

t = ((24.56 − 26.42) − 0) / √( (12.4(7) + 15.8(10)) / 17 · (1/8 + 1/11) ) = −1.05

Since the observed t = −1.05 > t.01,17 = −2.567, the decision is to fail to reject the null hypothesis.
10.12 a) α = .10
Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0
df = 20 + 20 − 2 = 38

Sample 1: n1 = 20, x̄1 = 118, s1 = 23.9
Sample 2: n2 = 20, x̄2 = 113, s2 = 21.6

For α/2 = .05, critical t.05,38 = 1.697 (used df = 30)

t = ((118 − 113) − 0) / √( (23.9²(19) + 21.6²(19)) / 38 · (1/20 + 1/20) ) = 0.69

Since the observed t = 0.69 < t.05,38 = 1.697, the decision is to fail to reject the null hypothesis.

b) (x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

= (118 − 113) ± 1.697 √( (23.9²(19) + 21.6²(19)) / 38 · (1/20 + 1/20) )

5 ± 12.224
10.13 α = .05
Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0
df = n1 + n2 − 2 = 10 + 10 − 2 = 18

Sample 1: n1 = 10, x̄1 = 45.38, s1 = 2.357
Sample 2: n2 = 10, x̄2 = 40.49, s2 = 2.355

t = ((45.38 − 40.49) − 0) / √( (2.357²(9) + 2.355²(9)) / 18 · (1/10 + 1/10) ) = 4.64

Since the observed t = 4.64 > t.05,18 = 1.734, the decision is to reject the null hypothesis.
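The pooled-variance t statistic used throughout this block of problems can be sketched in a few lines (the function name is my own, not from the text):

```python
from math import sqrt

def pooled_t(xbar1, xbar2, s1, s2, n1, n2, delta=0.0):
    """Pooled-variance t statistic for two independent samples.
    sp2 = (s1^2 (n1-1) + s2^2 (n2-1)) / (n1 + n2 - 2)."""
    sp2 = (s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / (n1 + n2 - 2)
    return ((xbar1 - xbar2) - delta) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Problem 10.13
t = pooled_t(45.38, 40.49, 2.357, 2.355, 10, 10)
print(round(t, 2))  # 4.64
```

Compare the result against the appropriate t table value with n1 + n2 − 2 degrees of freedom, exactly as the worked solutions do.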
10.14 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

α = .01, df = 18 + 18 − 2 = 34

Sample 1: n1 = 18, x̄1 = 5.333, s1² = 12
Sample 2: n2 = 18, x̄2 = 9.444, s2² = 2.026

For α/2 = .005, critical t.005,34 = ±2.75 (used df = 30)

t = ((5.333 − 9.444) − 0) / √( (12(17) + 2.026(17)) / 34 · (1/18 + 1/18) ) = −4.66

Since the observed t = −4.66 < t.005,34 = −2.75, reject the null hypothesis.

Confidence interval:
(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )
= (5.333 − 9.444) ± t √( (12(17) + 2.026(17)) / 34 · (1/18 + 1/18) )

−4.111 ± 2.1689
10.15 Peoria: n1 = 21, x̄1 = 116,900, s1 = 2,300
      Evansville: n2 = 26, x̄2 = 114,000, s2 = 1,750

df = 21 + 26 − 2 = 45; for 90% confidence, t.05,40 = 1.684 (used df = 40)

(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

= (116,900 − 114,000) ± 1.684 √( (2,300²(20) + 1,750²(25)) / 45 · (1/21 + 1/26) )

2,900 ± 994.62
10.16 α = .10
Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0
df = 12 + 12 − 2 = 22

Co-op: n1 = 12, x̄1 = $15.645, s1 = $1.093
Interns: n2 = 12, x̄2 = $15.439, s2 = $0.958

t = ((15.645 − 15.439) − 0) / √( (1.093²(11) + 0.958²(11)) / 22 · (1/12 + 1/12) ) = 0.49

Since the observed t = 0.49 < t.05,22 = 1.717, the decision is to fail to reject the null hypothesis.

90% Confidence Interval: t.05,22 = 1.717

(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

0.206 ± 0.7204
10.17 Let Boston be group 1.

1) Ho: μ1 − μ2 = 0
   Ha: μ1 − μ2 > 0
2) t = ((x̄1 − x̄2) − (μ1 − μ2)) / √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )
3) α = .01
5) Boston: n1 = 8, x̄1 = 47, s1 = 3
   Dallas: n2 = 9, x̄2 = 44, s2 = 3
6) t = ((47 − 44) − 0) / √( (9(7) + 9(8)) / 15 · (1/8 + 1/9) ) = 2.06
7) Since t = 2.06 < t.01,15 = 2.602, the decision is to fail to reject the null hypothesis.
10.18 nm = 22, x̄m = 112, sm = 11
      nno = 20, x̄no = 122, sno = 12

df = nm + nno − 2 = 22 + 20 − 2 = 40

For 98% confidence, t.01,40 = 2.423

(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

= (112 − 122) ± 2.423 √( (11²(21) + 12²(19)) / 40 · (1/22 + 1/20) )

−10 ± 8.60
10.19 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

df = n1 + n2 − 2 = 11 + 11 − 2 = 20

Toronto: n1 = 11, x̄1 = $67,381.82, s1 = $2,067.28
Mexico City: n2 = 11, x̄2 = $63,481.82, s2 = $1,594.25

t = ((67,381.82 − 63,481.82) − 0) / √( (2,067.28²(10) + 1,594.25²(10)) / 20 · (1/11 + 1/11) ) = 4.95

Since the observed t = 4.95 > t.005,20 = 2.845, the decision is to reject the null hypothesis.
10.20 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 > 0

df = n1 + n2 − 2 = 9 + 10 − 2 = 17

Men: n1 = 9, x̄1 = $110.92, s1 = $28.79
Women: n2 = 10, x̄2 = $75.48, s2 = $30.51

t = ((110.92 − 75.48) − 0) / √( (28.79²(8) + 30.51²(9)) / 17 · (1/9 + 1/10) ) = 2.60

Since the observed t = 2.60 > t.01,17 = 2.567, the decision is to reject the null hypothesis.
10.21 Ho: D = 0
      Ha: D > 0

Sample 1   Sample 2     d
38         22           16
27         28           −1
30         21            9
41         38            3
36         38           −2
38         26           12
33         19           14
35         31            4
44         35            9

n = 9, d̄ = 7.11, sd = 6.45, df = n − 1 = 9 − 1 = 8, α = .01

t = (d̄ − D) / (sd/√n) = (7.11 − 0) / (6.45/√9) = 3.31

Since the observed t = 3.31 > t.01,8 = 2.896, the decision is to reject the null hypothesis.
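The matched-pairs statistic can be computed directly from the column of differences; this is a quick sketch (the helper name is my own), using the differences from Problem 10.21:

```python
from math import sqrt

def paired_t(d, hypothesized=0.0):
    """Matched-pairs t statistic from a list of differences d."""
    n = len(d)
    dbar = sum(d) / n
    sd = sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))  # sample std dev of d
    return (dbar - hypothesized) / (sd / sqrt(n))

# differences (Sample 1 - Sample 2) from Problem 10.21
t = paired_t([16, -1, 9, 3, -2, 12, 14, 4, 9])
print(round(t, 2))  # 3.31
```

The same call works for every before/after problem in this section once the differences are tabulated.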
10.22 Ho: D = 0
      Ha: D ≠ 0

Before: 107  99  110  100  96   98  100  102  107  109  104   96   99
After:  102  98  113  108  89  101   99  102  105  110  102  100  101

n = 13, d̄ = 2.5385, sd = 3.4789, α = .05, df = n − 1 = 13 − 1 = 12

t = (d̄ − D) / (sd/√n) = (2.5385 − 0) / (3.4789/√13) = 2.63

Since the observed t = 2.63 > t.025,12 = 2.179, the decision is to reject the null hypothesis.
10.23 n = 22, d̄ = 40.56, sd = 26.58

For 98% confidence, t.01,21 = 2.518

d̄ ± t sd/√n = 40.56 ± 2.518 (26.58/√22)

40.56 ± 14.27
10.24
Before   After     d
32       40       −8
28       25        3
35       36       −1
32       32        0
26       29       −3
25       31       −6
37       39       −2
16       30      −14
35       31        4

n = 9, d̄ = −3, sd = 5.6347, df = n − 1 = 9 − 1 = 8

t.05,8 = 1.86

d̄ ± t sd/√n = −3 ± 1.86 (5.6347/√9)

−3 ± 3.49
10.25
City             Cost     Resale       d
Atlanta          20427    25163    −4736
Boston           27255    24625     2630
Des Moines       22115    12600     9515
Kansas City      23256    24588    −1332
Louisville       21887    19267     2620
Portland         24255    20150     4105
Raleigh-Durham   19852    22500    −2648
Reno             23624    16667     6957
Ridgewood        25885    26875     −990
San Francisco    28999    35333    −6334
Tulsa            20836    16292     4544

n = 11, df = 10, d̄ = 1302.82, sd = 4938.22

α = .01, α/2 = .005, t.005,10 = 3.169

d̄ ± t sd/√n = 1302.82 ± 3.169 (4938.22/√11) = 1302.82 ± 4718.42
10.26 Ho: D = 0
      Ha: D < 0

d: −2, −1, −2, −3, −4, −1, −4

n = 9, d̄ = −1.778, sd = 1.716, α = .05, df = n − 1 = 9 − 1 = 8

t = (d̄ − D) / (sd/√n) = (−1.778 − 0) / (1.716/√9) = −3.11

Since the observed t = −3.11 < t.05,8 = −1.86, the decision is to reject the null hypothesis.
10.27
Before   After     d
255      197      58
230      225       5
290      215      75
242      215      27
300      240      60
250      235      15
215      190      25
230      240     −10
225      200      25
219      203      16
236      223      13

n = 11, df = n − 1 = 11 − 1 = 10, d̄ = 28.09, sd = 25.813

For 98% confidence, t.01,10 = 2.764

d̄ ± t sd/√n = 28.09 ± 2.764 (25.813/√11) = 28.09 ± 21.51
10.28 H0: D = 0
      Ha: D > 0

n = 27, df = 27 − 1 = 26, d̄ = 3.71, sd = 5

t = (d̄ − D) / (sd/√n) = (3.71 − 0) / (5/√27) = 3.86

Since the observed t = 3.86 > t.01,26 = 2.479, the decision is to reject the null hypothesis.
10.29 n = 21, d̄ = 75, sd = 30, df = 21 − 1 = 20

t.05,20 = 1.725

d̄ ± t sd/√n = 75 ± 1.725 (30/√21) = 75 ± 11.29
10.30 Ho: D = 0
      Ha: D ≠ 0

n = 15, df = 15 − 1 = 14, d̄ = −2.85, sd = 1.9, α = .01

t = (d̄ − D) / (sd/√n) = (−2.85 − 0) / (1.9/√15) = −5.81

Since the observed t = −5.81 < t.005,14 = −2.977, the decision is to reject the null hypothesis.
10.31 a) Sample 1: n1 = 368, x1 = 175
         Sample 2: n2 = 405, x2 = 182

p̂1 = x1/n1 = 175/368 = .476
p̂2 = x2/n2 = 182/405 = .449

Ho: p1 − p2 = 0
Ha: p1 − p2 ≠ 0

p̄ = (x1 + x2)/(n1 + n2) = (175 + 182)/(368 + 405) = .462

z = ((p̂1 − p̂2) − (p1 − p2)) / √( p̄·q̄ (1/n1 + 1/n2) ) = ((.476 − .449) − 0) / √( (.462)(.538)(1/368 + 1/405) ) = 0.75

b) Sample 1: p̂1 = .38, n1 = 649
   Sample 2: p̂2 = .25, n2 = 558

p̄ = (n1 p̂1 + n2 p̂2)/(n1 + n2) = (649(.38) + 558(.25))/(649 + 558) = .32

Ho: p1 − p2 = 0
Ha: p1 − p2 > 0

z.10 = 1.28

z = ((.38 − .25) − 0) / √( (.32)(.68)(1/649 + 1/558) ) = 4.83

Since the observed z = 4.83 > zc = 1.28, the decision is to reject the null hypothesis.
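The pooled two-proportion z statistic can also be checked in code. A small sketch (the function name is my own); note that working with unrounded proportions gives a slightly different value than the text, which rounds p̂1 and p̂2 to three decimals before dividing:

```python
from math import sqrt

def two_prop_z(x1, n1, x2, n2):
    """Two-proportion z statistic using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pbar = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Problem 10.31a: the text rounds the proportions first and reports 0.75;
# carrying full precision gives about 0.73
z = two_prop_z(175, 368, 182, 405)
print(round(z, 2))  # 0.73
```

Such small rounding discrepancies are common in hand-worked proportion problems and do not change any of the decisions in this chapter.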
10.32 a) n1 = 85, p̂1 = .75; n2 = 90, p̂2 = .67

For 90% confidence, z.05 = 1.645

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.75 − .67) ± 1.645 √( (.75)(.25)/85 + (.67)(.33)/90 )

= .08 ± .11

b) n1 = 1100, p̂1 = .19; n2 = 1300, p̂2 = .17

z.025 = 1.96

(.19 − .17) ± 1.96 √( (.19)(.81)/1100 + (.17)(.83)/1300 ) = .02 ± .03

c) n1 = 430, x1 = 275, p̂1 = x1/n1 = 275/430 = .64
   n2 = 399, x2 = 275, p̂2 = x2/n2 = 275/399 = .69

z.075 = 1.44

(.64 − .69) ± 1.44 √( (.64)(.36)/430 + (.69)(.31)/399 ) = −.05 ± .047

d) n1 = 1500, x1 = 1050, p̂1 = 1050/1500 = .70
   n2 = 1500, x2 = 1100, p̂2 = 1100/1500 = .733

z.10 = 1.28

(.70 − .733) ± 1.28 √( (.70)(.30)/1500 + (.733)(.267)/1500 ) = −.033 ± .02
10.33 H0: pm − pw = 0
      Ha: pm − pw < 0

nm = 374, p̂m = .59; nw = 481, p̂w = .70

z.05 = −1.645

p̄ = (nm p̂m + nw p̂w)/(nm + nw) = (374(.59) + 481(.70))/(374 + 481) = .652

z = ((p̂m − p̂w) − (pm − pw)) / √( p̄·q̄ (1/nm + 1/nw) ) = ((.59 − .70) − 0) / √( (.652)(.348)(1/374 + 1/481) ) = −3.35

Since the observed z = −3.35 < z.05 = −1.645, the decision is to reject the null hypothesis.
10.34 n1 = 210, p̂1 = .24; n2 = 176, p̂2 = .35

For 90% confidence, z.05 = ±1.645

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.24 − .35) ± 1.645 √( (.24)(.76)/210 + (.35)(.65)/176 )

= −.11 ± .0765
10.35 p̂1 = .48, n1 = 56; Banks: p̂2 = .56, n2 = 89

p̄ = (n1 p̂1 + n2 p̂2)/(n1 + n2) = (56(.48) + 89(.56))/(56 + 89) = .529

Ho: p1 − p2 = 0
Ha: p1 − p2 ≠ 0

z = ((p̂1 − p̂2) − (p1 − p2)) / √( p̄·q̄ (1/n1 + 1/n2) ) = ((.48 − .56) − 0) / √( (.529)(.471)(1/56 + 1/89) ) = −0.94
10.36 n1 = 35, x1 = 5; n2 = 35, x2 = 7

p̂1 = x1/n1 = 5/35 = .14
p̂2 = x2/n2 = 7/35 = .20

For 98% confidence, z.01 = 2.33

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.14 − .20) ± 2.33 √( (.14)(.86)/35 + (.20)(.80)/35 )

= −.06 ± .21

−.27 ≤ p1 − p2 ≤ .15

10.37 H0: p1 − p2 = 0
      Ha: p1 − p2 ≠ 0

α = .10, z.05 = ±1.645

p̂1 = .09, n1 = 780; p̂2 = .06, n2 = 915

p̄ = (n1 p̂1 + n2 p̂2)/(n1 + n2) = (780(.09) + 915(.06))/(780 + 915) = .0738

z = ((p̂1 − p̂2) − (p1 − p2)) / √( p̄·q̄ (1/n1 + 1/n2) ) = ((.09 − .06) − 0) / √( (.0738)(.9262)(1/780 + 1/915) ) = 2.35

Since the observed z = 2.35 > z.05 = 1.645, the decision is to reject the null hypothesis.
10.38 n1 = 850, p̂1 = .60; n2 = 910, p̂2 = .52

For 95% confidence, z.025 = 1.96

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.60 − .52) ± 1.96 √( (.60)(.40)/850 + (.52)(.48)/910 )

= .08 ± .046
10.39 H0: σ1² = σ2²
      Ha: σ1² < σ2²

α = .01

n1 = 10, s1² = 562; n2 = 12, s2² = 1013

dfnum = 12 − 1 = 11, dfdenom = 10 − 1 = 9

F = s2²/s1² = 1013/562 = 1.80

Since the observed F = 1.80 < F.01,10,9 = 5.26, the decision is to fail to reject the null hypothesis.
10.40 H0: σ1² = σ2²
      Ha: σ1² ≠ σ2²

α = .05

n1 = 5, s1 = 4.68; n2 = 19, s2 = 2.78

dfnum = 5 − 1 = 4, dfdenom = 19 − 1 = 18

F.025,4,18 = 3.61; lower-tail critical value = 1/F.025,18,4 = .277

F = s1²/s2² = (4.68)²/(2.78)² = 2.83

Since the observed F = 2.83 < F.025,4,18 = 3.61, the decision is to fail to reject the null hypothesis.
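The F ratio for comparing two sample variances is simple enough to verify directly. A sketch (the helper name is my own, following the text's convention of putting the larger variance in the numerator):

```python
def f_ratio(var_a, var_b):
    """F statistic for two sample variances; by convention the
    larger variance goes in the numerator so F >= 1."""
    return max(var_a, var_b) / min(var_a, var_b)

# Problem 10.40: s1 = 4.68, s2 = 2.78
F = f_ratio(4.68 ** 2, 2.78 ** 2)
print(round(F, 2))  # 2.83
```

The numerator and denominator degrees of freedom for the table lookup come from whichever sample contributed the larger variance.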
10.41
City 1: 3.43  3.40  3.39  3.32  3.39  3.38  3.34  3.38  3.38  3.28
City 2: 3.33  3.42  3.39  3.30  3.46  3.39  3.36  3.44  3.37  3.38

n1 = 10, df1 = 9, s1² = .0018989
n2 = 10, df2 = 9, s2² = .0023378

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

α = .10, α/2 = .05

F = s1²/s2² = .0018989/.0023378 = 0.81

Since the observed F = 0.81 is greater than the lower-tail critical value of 0.314 and less than the upper-tail critical value of 3.18, the decision is to fail to reject the null hypothesis.
10.42 Let Houston = group 1 and Chicago = group 2.

1) H0: σ1² = σ2²
   Ha: σ1² ≠ σ2²
2) F = s1²/s2²
3) α = .01
4) df1 = 12, df2 = 10
   F.005,12,10 = 5.66; F.995,10,12 = .177
   If the observed value is greater than 5.66 or less than .177, the decision will be to reject the null hypothesis.
5) s1² = 393.4, s2² = 702.7
6) F = 393.4/702.7 = 0.56
7) Since F = 0.56 is less than 5.66 and greater than .177, the decision is to fail to reject the null hypothesis.
10.43 H0: σ1² = σ2²
      Ha: σ1² > σ2²

α = .05

n1 = 12, s1 = 7.52, dfnum = 12 − 1 = 11
n2 = 15, s2 = 6.08, dfdenom = 15 − 1 = 14

F = s1²/s2² = (7.52)²/(6.08)² = 1.53

Since the observed F = 1.53 < F.05,10,14 = 2.60 (used dfnum = 10), the decision is to fail to reject the null hypothesis.
10.44 H0: σ1² = σ2²
      Ha: σ1² ≠ σ2²

α = .01

n1 = 15, s1² = 91.5, dfnum = 15 − 1 = 14
n2 = 15, s2² = 67.3, dfdenom = 15 − 1 = 14

F.005,12,14 = 4.43 (used dfnum = 12); F.995,14,12 = .226

F = s1²/s2² = 91.5/67.3 = 1.36

Since the observed F = 1.36 < F.005,12,14 = 4.43 and > F.995,14,12 = .226, the decision is to fail to reject the null hypothesis.
10.45 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

Sample 1: x̄1 = 138.4, σ1 = 6.71, n1 = 48
Sample 2: x̄2 = 142.5, σ2 = 8.92, n2 = 39

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) = ((138.4 − 142.5) − 0) / √(6.71²/48 + 8.92²/39) = −2.38

Since the observed value of z = −2.38 is less than the critical value of z = −1.645, the decision is to reject the null hypothesis. There is a significant difference in the means of the two populations.
10.46 Sample 1: x̄1 = 34.9, σ1² = 2.97, n1 = 34
      Sample 2: x̄2 = 27.6, σ2² = 3.50, n2 = 31

For 98% confidence, z.01 = 2.33

(x̄1 − x̄2) ± z √(σ1²/n1 + σ2²/n2) = (34.9 − 27.6) ± 2.33 √(2.97/34 + 3.50/31)

7.3 ± 1.04

10.47 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 > 0

Sample 1: x̄1 = 2.06, s1² = .176, n1 = 12
Sample 2: x̄2 = 1.93, s2² = .143, n2 = 15

α = .05, df = 12 + 15 − 2 = 25

t = ((2.06 − 1.93) − 0) / √( (.176(11) + .143(14)) / 25 · (1/12 + 1/15) ) = 0.85

Since the observed value of t = 0.85 is less than the critical value of t = 1.708, the decision is to fail to reject the null hypothesis. The mean for population one is not significantly greater than the mean for population two.
10.48 Sample 1: x̄1 = 74.6, s1² = 10.5, n1 = 18
      Sample 2: x̄2 = 70.9, s2² = 11.4, n2 = 19

For 95% confidence, α/2 = .025. Using df = 18 + 19 − 2 = 35, t.025,30 = 2.042 (used df = 30)

(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

= (74.6 − 70.9) ± 2.042 √( (10.5(17) + 11.4(18)) / 35 · (1/18 + 1/19) )

3.7 ± 2.22
10.49 Ho: D = 0
      Ha: D < 0

α = .01, n = 21, df = 20, d̄ = −1.16, sd = 1.01

The critical t.01,20 = −2.528. If the observed t is less than −2.528, then the decision will be to reject the null hypothesis.

t = (d̄ − D) / (sd/√n) = (−1.16 − 0) / (1.01/√21) = −5.26

Since the observed value of t = −5.26 is less than the critical t value of −2.528, the decision is to reject the null hypothesis. The mean population difference is less than zero.
10.50
Respondent   Before   After      d
1            47       63       −16
2            33       35        −2
3            38       36         2
4            50       56        −6
5            39       44        −5
6            27       29        −2
7            35       32         3
8            46       54        −8
9            41       47        −6

d̄ = −4.44, sd = 5.703, df = 8

For 99% confidence, t.005,8 = 3.355

d̄ ± t sd/√n = −4.44 ± 3.355 (5.703/√9) = −4.44 ± 6.38
10.51 Ho: p1 − p2 = 0
      Ha: p1 − p2 ≠ 0

α = .05, α/2 = .025, z.025 = ±1.96

If the observed value of z is greater than 1.96 or less than −1.96, then the decision will be to reject the null hypothesis.

Sample 1: x1 = 345, n1 = 783
Sample 2: x2 = 421, n2 = 896

p̄ = (x1 + x2)/(n1 + n2) = (345 + 421)/(783 + 896) = .4562

p̂1 = x1/n1 = 345/783 = .4406
p̂2 = x2/n2 = 421/896 = .4699

z = ((p̂1 − p̂2) − (p1 − p2)) / √( p̄·q̄ (1/n1 + 1/n2) ) = ((.4406 − .4699) − 0) / √( (.4562)(.5438)(1/783 + 1/896) ) = −1.20

Since the observed z = −1.20 is greater than −1.96, the decision is to fail to reject the null hypothesis.
10.52 Sample 1: n1 = 409, p̂1 = .71
      Sample 2: n2 = 378, p̂2 = .67

For 99% confidence, z.005 = 2.575

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.71 − .67) ± 2.575 √( (.71)(.29)/409 + (.67)(.33)/378 )

= .04 ± .085
10.53 H0: σ1² = σ2²
      Ha: σ1² ≠ σ2²

α = .05

n1 = 8, s1² = 46, dfnum = 8 − 1 = 7
n2 = 10, s2² = 37, dfdenom = 10 − 1 = 9

F.025,7,9 = 4.20; F.975,9,7 = .238

If the observed value of F is greater than 4.20 or less than .238, then the decision will be to reject the null hypothesis.

F = s1²/s2² = 46/37 = 1.24

Since the observed F = 1.24 is less than F.025,7,9 = 4.20 and greater than F.975,9,7 = .238, the decision is to fail to reject the null hypothesis. There is no significant difference in the variances of the two populations.
10.54 Term: x̄t = $75,000, st = $22,000, nt = 27
      Whole Life: x̄w = $45,000, sw = $15,500, nw = 29

df = 27 + 29 − 2 = 54; for 95% confidence, t.025,50 = 2.009 (used df = 50)

(x̄1 − x̄2) ± t √( (s1²(n1 − 1) + s2²(n2 − 1)) / (n1 + n2 − 2) · (1/n1 + 1/n2) )

= (75,000 − 45,000) ± 2.009 √( (22,000²(26) + 15,500²(28)) / 54 · (1/27 + 1/29) )

30,000 ± 10,160.11
10.55
Morning   Afternoon     d
43        41            2
51        49            2
37        44           −7
24        32           −8
47        46            1
44        42            2
50        47            3
55        51            4
46        49           −3

n = 9, d̄ = −0.444, sd = 4.447, df = 9 − 1 = 8

t.05,8 = 1.86

d̄ ± t sd/√n = −0.444 ± 1.86 (4.447/√9) = −0.444 ± 2.757

10.56 Marketing Managers: n1 = 400, x1 = 220
      Accountants: n2 = 450, x2 = 216

Ho: p1 − p2 = 0
Ha: p1 − p2 > 0

α = .01, z.01 = 2.33

p̂1 = 220/400 = .55
p̂2 = 216/450 = .48

p̄ = (x1 + x2)/(n1 + n2) = (220 + 216)/(400 + 450) = .513

z = ((.55 − .48) − 0) / √( (.513)(.487)(1/400 + 1/450) ) = 2.04

Since the observed z = 2.04 is less than z.01 = 2.33, the decision is to fail to reject the null hypothesis. There is no significant difference between marketing managers and accountants in the proportion who keep track of obligations in their head.
10.57 Accounting: n1 = 16, x̄1 = 26,400, s1 = 1,200
      Data Entry: n2 = 14, x̄2 = 25,800, s2 = 1,050

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

α = .05, dfnum = 16 − 1 = 15, dfdenom = 14 − 1 = 13

F.025,15,13 = 3.05; F.975,15,13 = 0.33

F = s1²/s2² = 1,440,000/1,102,500 = 1.31

Since the observed F = 1.31 is less than F.025,15,13 = 3.05 and greater than F.975,15,13 = 0.33, the decision is to fail to reject the null hypothesis.
10.58 Men: n1 = 60, x̄1 = 631, σ1 = 100
      Women: n2 = 41, x̄2 = 848, σ2 = 100

For a 95% Confidence Level, α/2 = .025 and z.025 = 1.96

(x̄1 − x̄2) ± z √(σ1²/n1 + σ2²/n2) = (631 − 848) ± 1.96 √(100²/60 + 100²/41)

= −217 ± 39.7
10.59 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

α = .01, df = 20 + 24 − 2 = 42

Detroit: n1 = 20, x̄1 = 17.53, s1 = 3.2
Charlotte: n2 = 24, x̄2 = 14.89, s2 = 2.7

For two-tail test, α/2 = .005 and the critical t.005,40 = ±2.704 (used df = 40)

t = ((17.53 − 14.89) − 0) / √( (3.2²(19) + 2.7²(23)) / 42 · (1/20 + 1/24) ) = 2.97

Since the observed t = 2.97 > t.005,40 = 2.704, the decision is to reject the null hypothesis.
10.60 With Fertilizer: x̄1 = 38.4, σ1 = 9.8, n1 = 35
      Without Fertilizer: x̄2 = 23.1, σ2 = 7.4, n2 = 35

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) = ((38.4 − 23.1) − 0) / √(9.8²/35 + 7.4²/35) = 7.37

Since the observed z = 7.37 > z.01 = 2.33, the decision is to reject the null hypothesis.
10.61 Specialty: n1 = 350, p̂1 = .75
      Discount: n2 = 500, p̂2 = .52

For 90% confidence, z.05 = 1.645

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.75 − .52) ± 1.645 √( (.75)(.25)/350 + (.52)(.48)/500 )

= .23 ± .053
10.62 H0: σ1² = σ2²
      Ha: σ1² ≠ σ2²

α = .01

n1 = 8, s1² = 72,909; n2 = 7, s2² = 129,569

dfnum = 6, dfdenom = 7

F.005,6,7 = 9.16; F.995,7,6 = .11

F = s2²/s1² = 129,569/72,909 = 1.78

Since F = 1.78 < F.005,6,7 = 9.16 but also > F.995,7,6 = .11, the decision is to fail to reject the null hypothesis. There is no difference in the variances of the shifts.
10.63
Name Brand   Store Brand     d
54           49              5
55           50              5
59           52              7
53           51              2
54           50              4
61           56              5
51           47              4
53           49              4

n = 8, d̄ = 4.5, sd = 1.414, df = 8 − 1 = 7

t.05,7 = 1.895

d̄ ± t sd/√n = 4.5 ± 1.895 (1.414/√8) = 4.5 ± .947
10.64 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 < 0

α = .01, df = 23 + 19 − 2 = 40

Wisconsin: n1 = 23, x̄1 = 69.652, s1² = 9.9644
Tennessee: n2 = 19, x̄2 = 71.7368, s2² = 4.6491

t = ((69.652 − 71.7368) − 0) / √( (9.9644(22) + 4.6491(18)) / 40 · (1/23 + 1/19) ) = −2.44

Since the observed t = −2.44 < t.01,40 = −2.423, the decision is to reject the null hypothesis.
10.65
Wednesday   Friday     d
71          53         18
56          47          9
75          52         23
68          55         13
74          58         16

n = 5, d̄ = 15.8, sd = 5.263, df = 5 − 1 = 4, α = .05

Ho: D = 0
Ha: D > 0

t = (d̄ − D) / (sd/√n) = (15.8 − 0) / (5.263/√5) = 6.71

Since the observed t = 6.71 > t.05,4 = 2.132, the decision is to reject the null hypothesis.
10.66 Ho: p1 − p2 = 0
      Ha: p1 − p2 ≠ 0

α = .05, z.025 = ±1.96

Machine 1: x1 = 38, n1 = 191
Machine 2: x2 = 21, n2 = 202

p̂1 = x1/n1 = 38/191 = .199
p̂2 = x2/n2 = 21/202 = .104

p̄ = (x1 + x2)/(n1 + n2) = (38 + 21)/(191 + 202) = .15

z = ((.199 − .104) − 0) / √( (.15)(.85)(1/191 + 1/202) ) = 2.64

Since the observed z = 2.64 > zc = 1.96, the decision is to reject the null hypothesis.
10.67 Construction: n1 = 338, x1 = 297
      Telephone Repair: n2 = 281, x2 = 192

p̂1 = x1/n1 = 297/338 = .879
p̂2 = x2/n2 = 192/281 = .683

For 90% confidence, z.05 = 1.645

(p̂1 − p̂2) ± z √( p̂1q̂1/n1 + p̂2q̂2/n2 ) = (.879 − .683) ± 1.645 √( (.879)(.121)/338 + (.683)(.317)/281 )

= .196 ± .054
10.68 Aerospace: n1 = 33, x̄1 = 12.4, σ1 = 2.9
      Automobile: n2 = 35, x̄2 = 4.6, σ2 = 1.8

For 99% confidence, z.005 = 2.575

(x̄1 − x̄2) ± z √(σ1²/n1 + σ2²/n2) = (12.4 − 4.6) ± 2.575 √( (2.9)²/33 + (1.8)²/35 )

= 7.8 ± 1.52
10.69 Discount: x̄1 = $47.20, σ1 = $12.45, n1 = 60
      Specialty: x̄2 = $27.40, σ2 = $9.82, n2 = 40

α = .01

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

z = ((47.20 − 27.40) − 0) / √(12.45²/60 + 9.82²/40) = 8.86

Since the observed z = 8.86 > zc = 2.575, the decision is to reject the null hypothesis.
10.70 Before/After sample data

n = 5, d̄ = 4.0, sd = 1.8708, df = 5 − 1 = 4, α = .01

Ho: D = 0
Ha: D > 0

t = (d̄ − D) / (sd/√n) = (4.0 − 0) / (1.8708/√5) = 4.78

Since the observed t = 4.78 > t.01,4 = 3.747, the decision is to reject the null hypothesis.
10.71 Ho: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0

α = .01, df = 10 + 6 − 2 = 14

Sample 1: n1 = 10, x̄1 = 18.3, s1² = 17.122
Sample 2 (B): n2 = 6, x̄2 = 9.667, s2² = 7.467

t = ((18.3 − 9.667) − 0) / √( (17.122(9) + 7.467(5)) / 14 · (1/10 + 1/6) ) = 4.52

Since the observed t = 4.52 > t.005,14 = 2.977, the decision is to reject the null hypothesis.
10.72 A t test was used to determine if Hong Kong has significantly different rates than Mumbai. Let group 1 be Hong Kong.

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

n1 = 19, x̄1 = 130.4, s1 = 12.9
n2 = 23, x̄2 = 128.4, s2 = 13.9
10.73 H0: D = 0
Ha: D 0
10.74 The point estimates from the sample data indicate that in the northern city the market share is .31078 and in the southern city the market share is .27013. The point estimate for the difference in the two proportions of market share is .04065. Since the 99% confidence interval ranges from −.03936 to +.12067 and zero is in the interval, any hypothesis-testing decision based on this interval would result in failure to reject the null hypothesis. Alpha is .01 with a two-tailed test. This is underscored by an observed z value of 1.31, which has an associated p-value of .191 and, of course, is not significant for any of the usual values of α.
H0: 12 = 22
Ha: 12 > 22
Twenty-six pipes were measured for sample one and twenty-eight pipes
were measured for sample two. The observed F = 2.0575 is significant
at = .05 for a one-tailed test since the associated p-value is .034787.
The variance of pipe lengths for machine 1 is significantly greater than
the variance of pipe lengths for machine 2.
Chapter 11
Analysis of Variance and
Design of Experiments
LEARNING OBJECTIVES
The focus of this chapter is learning about the design of experiments and the
analysis of variance thereby enabling you to:
1.
2.
3.
4.
5.
6.
CHAPTER TEACHING STRATEGY
This important chapter opens the door for students to a broader view
of statistics than they have seen to this time.
Through the topic of
experimental designs, the student begins to understand how to
scientifically set up controlled experiments in which to test certain
hypotheses. They learn about independent and dependent variables. With
the completely randomized design, the student can see how the t test for two
independent samples can be expanded to include three or more samples by
using analysis of variance. This is something that some of the more curious
students were probably wondering about in chapter 10.
Through the
randomized block design and the factorial designs, the student can
understand how to analyze not only multiple categories of one variable
but also to simultaneously analyze multiple variables with several categories
each. Thus, this chapter affords the instructor an opportunity to help the
student develop a structure for statistical analysis.
might result in a smaller treatment F value than would occur in a completely
randomized design. The repeated-measures design is shown in the chapter
as a special case of the random block design.
CHAPTER OUTLINE
Using a Computer to Do a Two-Way ANOVA
KEY TERMS
a posteriori
a priori
Blocking Variable
Classification Variables
Classifications
Concomitant Variables
Confounding Variables
Dependent Variable
Experimental Design
F Distribution
F Value
Factorial Design
Factors
Independent Variable
Interaction
Levels
Multiple Comparisons
One-way Analysis of Variance
Post-hoc
Treatment Variable
Tukey-Kramer Procedure
11.1
a) Time Period, Market Condition, Day of the Week, Season of the Year
11.2
a) Type of 737, Age of the plane, Number of Landings per Week of the plane,
City where the plane is based
11.3
Region
11.4
11.5
Source       df     SS       MS      F
Treatment     2                    11.07
Error        14    14.00    1.00
Total        16    36.24

α = .05

Since the observed F = 11.07 > F.05,2,14 = 3.74, the decision is to reject the null hypothesis.
11.6
Source       df     SS       MS      F
Treatment     4    93.76    23.44   15.82
Error        18    26.67     1.48
Total        22   120.43

α = .01

Since the observed F = 15.82 > F.01,4,18 = 4.58, the decision is to reject the null hypothesis.
11.7
Source       df     SS       MS       F
Treatment     3    544.3    181.43   13.00
Error        12    167.5     13.96
Total        15    711.8

α = .01

Since the observed F = 13.00 > F.01,3,12 = 5.95, the decision is to reject the null hypothesis.
11.8
Source       df     SS       MS      F
Treatment     1    64.28    64.28   17.76
Error        12    43.43     3.62
Total        13   107.71

α = .05

Since the observed F = 17.76 > F.05,1,12 = 4.75, the decision is to reject the null hypothesis.

Equivalently, as a pooled t test with two groups:

n1 = 7, x̄1 = 29, s1² = 3
n2 = 7, x̄2 = 24.71429, s2² = 4.238095

t = (29 − 24.71429) / √( (3(6) + 4.238095(6)) / 12 · (1/7 + 1/7) ) = 4.214

Also, t = √F = √17.76 = 4.214
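The t-versus-F relationship noted above (with two groups, the one-way ANOVA F is the square of the pooled t) is easy to confirm numerically. A sketch using the summary statistics from Problem 11.8 (variable names are my own):

```python
from math import sqrt

# summary statistics from Problem 11.8
n1 = n2 = 7
xbar1, xbar2 = 29, 24.71429
var1, var2 = 3, 4.238095

sp2 = (var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2)  # pooled variance
t = (xbar1 - xbar2) / sqrt(sp2 * (1 / n1 + 1 / n2))

print(round(t, 2))       # 4.21
print(round(t ** 2, 2))  # 17.76, matching the ANOVA F
```

This is why ANOVA with exactly two treatment levels and the two-sample pooled t test always reach the same decision.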
11.9
Source       df      SS          MS         F
Treatment     4     583.39    145.8475    7.50
Error        50     972.18     19.4436
Total        54   1,555.57

11.10
Source       df     SS      MS       F
Treatment     2    29.64   14.820   3.03
Error        14    68.42    4.887
Total        16    98.06

F.05,2,14 = 3.74

Since the observed F = 3.03 < F.05,2,14 = 3.74, the decision is to fail to reject the null hypothesis.
11.11
Source       df      SS        MS        F
Treatment     3    .007076   .002359   10.10
Error        15    .003503   .000234
Total        18    .010579

α = .01

Since the observed F = 10.10 > F.01,3,15 = 5.42, the decision is to reject the null hypothesis.
11.12
Source       df       SS            MS          F
Treatment     2   180,700,000   90,350,000   92.67
Error        12    11,699,999      975,000
Total        14   192,400,000

α = .01

Since the observed F = 92.67 > F.01,2,12 = 6.93, the decision is to reject the null hypothesis.
11.13
Source       df     SS      MS       F
Treatment     2    29.61   14.80   11.76
Error        15    18.89    1.26
Total        17    48.50

α = .05

Since the observed F = 11.76 > F.05,2,15 = 3.68, the decision is to reject the null hypothesis.
11.14
Source       df      SS        MS        F
Treatment     3    456,630   152,210   11.03
Error        16    220,770    13,798
Total        19    677,400

α = .05

Since the observed F = 11.03 > F.05,3,16 = 3.24, the decision is to reject the null hypothesis.
11.15 There are 4 treatment levels. The sample sizes are 18, 15, 21, and
11. The F
value is 2.95 with a p-value of .04. There is an overall significant
difference at
alpha of .05. The means are 226.73, 238.79, 232.58, and 239.82.
11.16 The independent variable for this study was plant with five
classification levels
(the five plants). There were a total of 43 workers who participated in
the study. The dependent variable was number of hours worked per
week. An observed F value of 3.10 was obtained with an associated p-value of .026595. With an alpha of .05, there was a significant
overall difference in the average number of hours worked per week
by plant. A cursory glance at the plant averages revealed that workers
at plant 3 averaged 61.47 hours per week (highest number) while
workers at plant 4 averaged 49.20 (lowest number).
11.17 C = 6, α = .05, MSE = .3352, N = 46, N − C = 40

q.05,6,40 = 4.23

n3 = 8, x̄3 = 15.85; n6 = 7, x̄6 = 17.21

HSD = q √( (MSE/2)(1/n3 + 1/n6) ) = 4.23 √( (.3352/2)(1/8 + 1/7) ) = 0.896

|x̄3 − x̄6| = |15.85 − 17.21| = 1.36

Since 1.36 > HSD = 0.896, the difference between the two means is significant.
11.18 C = 4, n = 6, N = 24, dferror = N − C = 24 − 4 = 20, α = .05

MSE = 2.389, q.05,4,20 = 3.96

HSD = q √(MSE/n) = 3.96 √(2.389/6) = 2.50

11.19 C = 3, MSE = 1.002381, α = .05, N = 17, N − C = 14

q.05,3,14 = 3.70

n1 = 6, x̄1 = 2; n2 = 5, x̄2 = 4.6

HSD = q √( (MSE/2)(1/n1 + 1/n2) ) = 3.70 √( (1.002381/2)(1/6 + 1/5) ) = 1.586

|x̄1 − x̄2| = |2 − 4.6| = 2.6

Since 2.6 > HSD = 1.586, the difference between the two means is significant.
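The Tukey-Kramer computation that recurs in problems 11.17 through 11.26 can be packaged in one small function (the helper name is my own; the q value still comes from a studentized range table):

```python
from math import sqrt

def tukey_kramer_hsd(q, mse, n_i, n_j):
    """Tukey-Kramer honestly significant difference for unequal group sizes:
    HSD = q * sqrt((MSE/2) * (1/n_i + 1/n_j))."""
    return q * sqrt((mse / 2) * (1 / n_i + 1 / n_j))

# Problem 11.17: q.05,6,40 = 4.23, MSE = .3352, n3 = 8, n6 = 7
hsd = tukey_kramer_hsd(4.23, 0.3352, 8, 7)
print(round(hsd, 3))             # 0.896
print(abs(15.85 - 17.21) > hsd)  # True, so the two means differ significantly
```

With equal group sizes this reduces to the ordinary HSD = q √(MSE/n) used in 11.18 and 11.21.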
11.20 From problem 11.6, MSE = 1.481481, C = 5, N = 23, N − C = 18

n2 = 5, x̄2 = 10; n4 = 5, x̄4 = 16

α = .01, q.01,5,18 = 5.38

HSD = 5.38 √( (1.481481/2)(1/5 + 1/5) ) = 2.93

|x̄2 − x̄4| = |10 − 16| = 6

Since 6 > HSD = 2.93, the difference between the two means is significant.
11.21 N = 16, n = 4, C = 4, N − C = 12, MSE = 13.95833

q.01,4,12 = 5.50

HSD = q √(MSE/n) = 5.50 √(13.95833/4) = 10.27

x̄1 = 115.25, x̄2 = 125.25, x̄3 = 131.5, x̄4 = 122.5

|x̄1 − x̄3| = 16.25 > 10.27, so x̄1 and x̄3 are significantly different using the HSD test.
11.22 n = 7, C = 2, MSE = 3.619048, N = 14, N − C = 14 − 2 = 12, α = .05

q.05,2,12 = 3.08

HSD = q √(MSE/n) = 3.08 √(3.619048/7) = 2.215

x̄1 = 29 and x̄2 = 24.71429

Since |x̄1 − x̄2| = 4.29 > HSD = 2.215, the decision is to reject the null hypothesis.
11.23 C = 4, α = .01, MSE = .000234, N = 19, N − C = 15

q.01,4,15 = 5.25

n1 = 4, n2 = 6, n3 = 5, n4 = 4
x̄1 = 4.03, x̄2 = 4.001667, x̄3 = 3.974, x̄4 = 4.005

HSD1,2 = 5.25 √( (.000234/2)(1/4 + 1/6) ) = .0367
HSD1,3 = 5.25 √( (.000234/2)(1/4 + 1/5) ) = .0381
HSD1,4 = 5.25 √( (.000234/2)(1/4 + 1/4) ) = .0402
HSD2,3 = 5.25 √( (.000234/2)(1/6 + 1/5) ) = .0344
HSD2,4 = 5.25 √( (.000234/2)(1/6 + 1/4) ) = .0367
HSD3,4 = 5.25 √( (.000234/2)(1/5 + 1/4) ) = .0381

|x̄1 − x̄3| = .056 > HSD1,3 = .0381, so this pair of means is significantly different.
11.24 α = .01, C = 3, MSE = 975,000, n = 5, N = 15, N − C = 12

q.01,3,12 = 5.04

HSD = q √(MSE/n) = 5.04 √(975,000/5) = 2,225.6

x̄1 = 40,900, x̄2 = 49,400, x̄3 = 45,300

|x̄1 − x̄2| = 8,500
|x̄1 − x̄3| = 4,400
|x̄2 − x̄3| = 4,100

Each difference exceeds HSD = 2,225.6, so all three pairs of means are significantly different.
11.25 α = .05, C = 3, N = 18, N − C = 15, MSE = 1.259365

q.05,3,15 = 3.67

n1 = 5, x̄1 = 7.6; n2 = 7, x̄2 = 8.8571; n3 = 6, x̄3 = 5.8333

HSD1,2 = 3.67 √( (1.259365/2)(1/5 + 1/7) ) = 1.705
HSD1,3 = 3.67 √( (1.259365/2)(1/5 + 1/6) ) = 1.763
HSD2,3 = 3.67 √( (1.259365/2)(1/7 + 1/6) ) = 1.620

|x̄1 − x̄3| = 1.767 (is significant)
|x̄2 − x̄3| = 3.024 (is significant)
11.26 α = .05, MSE = 13,798.13, n = 5, C = 4, N = 20, N − C = 16

x̄1 = 591, x̄2 = 350, x̄3 = 776, x̄4 = 563

HSD = q √(MSE/n) = 4.05 √(13,798.13/5) = 212.76

|x̄1 − x̄2| = 241
|x̄1 − x̄3| = 185
|x̄1 − x̄4| = 28
|x̄2 − x̄3| = 426
|x̄2 − x̄4| = 213
|x̄3 − x̄4| = 213

The differences 241, 426, 213, and 213 exceed HSD = 212.76 and are significant.
11.27 The MINITAB output gives Tukey pairwise comparisons at α = .05. There were five plants and ten pairwise comparisons. The comparison involving plant 3, where the reported confidence interval (0.180 to 22.460) contains the same sign throughout, indicates that 0 is not in the interval. Since zero is not in the interval, we are 95% confident that there is a pairwise difference significantly different from zero. The lower and upper values for all other confidence intervals have different signs, indicating that zero is included in the interval. This indicates that the difference in the means for these pairs might be zero.
11.28 H0: μ1 = μ2 = μ3 = μ4
      Ha: At least one treatment mean is different from the others

Source       df     SS        MS        F
Treatment     3    62.95    20.9833    5.56
Blocks        4   257.50    64.3750   17.05
Error        12    45.30     3.7750
Total        19   365.75

α = .05

For treatments, the observed F = 5.56 > F.05,3,12 = 3.49; the decision is to reject the null hypothesis.
11.29 H0: μ1 = μ2 = μ3
      Ha: At least one treatment mean is different from the others

Source       df      SS        MS         F
Treatment     2    .001717   .000858    1.48
Blocks        3    .076867   .025622   44.13
Error         6    .003483   .000581
Total        11    .082067

α = .01

For treatments, the observed F = 1.48 < F.01,2,6 = 10.92 and the decision is to fail to reject the null hypothesis.
11.30
Source       df      SS         MS        F
Treatment     5    2477.53   495.506    1.91
Blocks        9    3180.48   353.387    1.36
Error        45   11661.38   259.142
Total        59   17319.39

α = .05

For treatments, the observed F = 1.91 < F.05,5,45 = 2.45 and the decision is to fail to reject the null hypothesis.
11.31
Source       df     SS       MS       F
Treatment     3   199.48   66.493   3.90
Blocks        6   265.24   44.207   2.60
Error        18   306.59   17.033
Total        27   771.31

α = .01

For treatments, the observed F = 3.90 < F.01,3,18 = 5.09 and the decision is to fail to reject the null hypothesis.
11.32
Source       df     SS        MS         F
Treatment     3   2302.5   767.5000   15.67
Blocks        9   5402.5   600.2778   12.26
Error        27   1322.5    48.9815
Total        39   9027.5

α = .05

For treatments, the observed F = 15.67 > F.05,3,27 = 2.96 and the decision is to reject the null hypothesis.
11.33
Source       df      SS         MS        F
Treatment     2    64.5333   32.2667   15.37
Blocks        4   137.6000   34.4000   16.38
Error         8    16.8000    2.1000
Total        14   218.9333

α = .01

For treatments, the observed F = 15.37 > F.01,2,8 = 8.65 and the decision is to reject the null hypothesis.
11.35 The p value for Phone Type, .00018, indicates that there is an overall significant
difference in treatment means at alpha .001. The lengths of calls differ according to type
of telephone used. The p-value for managers, .00028, indicates that there is an overall
difference in block means at alpha .001. The lengths of calls differ according to Manager.
The significant blocking effects have improved the power of the F test for treatments.
11.36 This is a two-way factorial design with two independent variables and
one dependent variable. It is 2x4 in that there are two row treatment
levels and four column treatment levels. Since there are three
measurements per cell, interaction can be analyzed.
dfrow treatment = 1
dfcolumn treatment = 3
dfinteraction = 3
dferror = 16
dftotal = 23
11.37 This is a two-way factorial design with two independent variables and
one dependent variable. It is 4x3 in that there are four row treatment
levels and three column treatment levels. Since there are two
measurements per cell, interaction can be analyzed.

dfrow treatment = 3
dfcolumn treatment = 2
dfinteraction = 6
dferror = 12
dftotal = 23
11.38
Source        df     SS        MS       F
Row            3   126.98    42.327   3.46
Column         4    37.49     9.373   0.77
Interaction   12   380.82    31.735   2.60
Error         60   733.65    12.228
Total         79  1278.94

α = .05

Critical F.05,3,60 = 2.76 for rows. For rows, the observed F = 3.46 > F.05,3,60 = 2.76 and the decision is to reject the null hypothesis.

Critical F.05,4,60 = 2.53 for columns. For columns, the observed F = 0.77 < F.05,4,60 = 2.53 and the decision is to fail to reject the null hypothesis.

Critical F.05,12,60 = 1.92 for interaction. For interaction, the observed F = 2.60 > F.05,12,60 = 1.92 and the decision is to reject the null hypothesis.
11.39
Source        df     SS      MS      F
Row            1   1.047   1.047   2.40
Column         3   3.844   1.281   2.94
Interaction    3   0.773   0.258   0.59
Error         16   6.968   0.436
Total         23  12.632

α = .05

Critical F.05,1,16 = 4.49 for rows. For rows, the observed F = 2.40 < F.05,1,16 = 4.49 and the decision is to fail to reject the null hypothesis.

Critical F.05,3,16 = 3.24 for columns. For columns, the observed F = 2.94 < F.05,3,16 = 3.24 and the decision is to fail to reject the null hypothesis.

Critical F.05,3,16 = 3.24 for interaction. For interaction, the observed F = 0.59 < F.05,3,16 = 3.24 and the decision is to fail to reject the null hypothesis.
P a g e | 205
11.40 Source        df       SS       MS      F
      Row            1   60.750   60.750   38.37
      Column         2   14.000    7.000    4.42
      Interaction    2    2.000    1.000    0.63
      Error          6    9.500    1.583
      Total         11   86.250

α = .01

Critical F.01,1,6 = 13.75 for rows. For rows, the observed F = 38.37 > F.01,1,6 = 13.75 and the decision is to reject the null hypothesis.

Critical F.01,2,6 = 10.92 for columns. For columns, the observed F = 4.42 < F.01,2,6 = 10.92 and the decision is to fail to reject the null hypothesis.

Critical F.01,2,6 = 10.92 for interaction. For interaction, the observed F = 0.63 < F.01,2,6 = 10.92 and the decision is to fail to reject the null hypothesis.
11.41 Source        df        SS        MS       F
      Treatment 1    1   1.24031   1.24031   63.67
      Treatment 2    3   5.09844   1.69948   87.25
      Interaction    3   0.12094   0.04031    2.07
      Error         24   0.46750   0.01948
      Total         31   6.92719

α = .05
11.42 Source         df        SS        MS       F
      Age             3   42.4583   14.1528   14.77
      No. Children    2   49.0833   24.5417   25.61
      Interaction     6    4.9167    0.8194    0.86
      Error          12   11.5000    0.9583
      Total          23  107.9583

α = .05
Critical F.05,3,12 = 3.49 for Age. For Age, the observed F = 14.77 >
F.05,3,12 = 3.49 and the decision is to reject the null hypothesis.
Critical F.05,2,12 = 3.89 for No. Children. For No. Children, the observed
F = 25.61 > F.05,2,12 = 3.89 and the decision is to reject the null
hypothesis.
11.43 Source        df        SS       MS       F
      Location       2   1736.22   868.11   34.31
      Competitors    3   1078.33   359.44   14.20
      Interaction    6    503.33    83.89    3.32
      Error         24    607.33    25.31
      Total         35   3925.22

α = .05
Critical F.05,2,24 = 3.40 for rows. For rows, the observed F = 34.31 >
F.05,2,24 = 3.40 and the decision is to reject the null hypothesis.
11.44 This two-way design has 3 row treatments and 5 column treatments. There are 45 total observations with 3 in each cell.

F_R = MS_R / MS_E = 46.16 / 3.49 = 13.23
p-value = .000 and the decision is to reject the null hypothesis for rows.

F_C = MS_C / MS_E = 249.70 / 3.49 = 71.57
p-value = .000 and the decision is to reject the null hypothesis for columns.

F_I = MS_I / MS_E = 55.27 / 3.49 = 15.84
p-value = .000 and the decision is to reject the null hypothesis for interaction.
11.45 The null hypotheses are that there are no interaction effects, that there are no significant differences in the means of the valve openings by machine, and that there are no significant differences in the means of the valve openings by shift. Since the p-value for interaction effects is .876, there are no significant interaction effects, and that is good since significant interaction effects would confound the study. The p-value for columns (shifts) is .008, indicating that column effects are significant at alpha of .01. There is a significant difference in the mean valve opening according to shift. No multiple comparisons are given in the output. However, an examination of the shift means indicates that the mean valve opening on shift one was the largest at 6.47, followed by shift three with 6.3 and shift two with 6.25. The p-value for rows (machines) is .937 and that is not significant.
11.46 This two-way factorial design has 3 rows and 3 columns with three observations per cell. The observed F value for rows is 0.19, for columns is 1.19, and for interaction is 1.40. Using an alpha of .05, the critical F value for rows and columns (same df) is F.05,2,18 = 3.55. Neither the observed F value for rows nor the observed F value for columns is significant. The critical F value for interaction is F.05,4,18 = 2.93. There is no significant interaction.
11.47 Source      df      SS      MS      F
      Treatment    3   66.69   22.23   8.82
      Error       12   30.25    2.52
      Total       15   96.94

α = .05

Since the treatment F = 8.82 > F.05,3,12 = 3.49, the decision is to reject the null hypothesis.
MSE = 2.52, n = 4, N = 16, C = 4, N − C = 12

q.05,4,12 = 4.20

HSD = q√(MSE/n) = (4.20)√(2.52/4) = 3.33

x̄1 = 12, x̄2 = 7.75, x̄3 = 13.25, x̄4 = 11.25
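The HSD arithmetic above can be checked in a few lines. A sketch, assuming the tabled studentized-range value q.05,4,12 = 4.20 is supplied by hand (no table lookup is performed here):

```python
from math import sqrt

# Tukey's HSD for equal group sizes: HSD = q * sqrt(MSE / n).
def tukey_hsd(q, mse, n):
    return q * sqrt(mse / n)

hsd = tukey_hsd(4.20, 2.52, 4)
print(round(hsd, 2))  # -> 3.33

means = {1: 12.0, 2: 7.75, 3: 13.25, 4: 11.25}
# Flag every pair whose absolute mean difference exceeds HSD.
for i in means:
    for j in means:
        if i < j and abs(means[i] - means[j]) > hsd:
            print(i, j, abs(means[i] - means[j]))
```

With the sample means above, pairs (1, 2), (2, 3), and (2, 4) exceed HSD = 3.33.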
11.48 Source      df        SS       MS      F
      Treatment    6    68.19   11.365   0.87
      Error       19   249.61   13.137
      Total       25   317.80
11.49 Source      df    SS      MS       F
      Treatment    5   210   42.000   2.31
      Error       36   655   18.194
      Total       41   865

11.50 Source      df       SS      MS       F
      Treatment    2   150.91   75.46   16.19
      Error       22   102.53    4.66
      Total       24   253.44

α = .01
Since the observed F = 16.19 > F.01,2,22 = 5.72, the decision is to reject
the null
hypothesis.
x̄1 = 9.200, n1 = 10
x̄2 = 14.250, n2 = 8
x̄3 = 8.714286, n3 = 7

MSE = 4.66, α = .01, C = 3, N = 25, N − C = 22

q.01,3,22 = 4.64

HSD1,2 = 4.64·√[(4.66/2)(1/10 + 1/8)] = 3.36

HSD1,3 = 4.64·√[(4.66/2)(1/10 + 1/7)] = 3.49

HSD2,3 = 4.64·√[(4.66/2)(1/8 + 1/7)] = 3.67

x̄1 − x̄2 = 5.05 and x̄2 − x̄3 = 5.54 are significantly different at α = .01.
11.51 This design is a repeated-measures type of random block design. There is one treatment variable with three levels. There is one blocking variable with six people in it (six levels). The degrees of freedom treatment are two. The degrees of freedom block are five. The error degrees of freedom are ten. The total degrees of freedom are seventeen. There is one dependent variable.
11.52 Source      df        SS         MS      F
      Treatment    3    20,994   6,998.00   5.58
      Blocks       9    16,453   1,828.11   1.46
      Error       27    33,891   1,255.22
      Total       39    71,338

α = .05

Since the observed F = 5.58 > F.05,3,27 = 2.96 for treatments, the decision is to reject the null hypothesis.
11.53 Source      df        SS        MS       F
      Treatment    3   240.125    80.042   31.51
      Blocks       5   548.708   109.742   43.20
      Error       15    38.125     2.542
      Total       23   826.958

α = .05

Since for treatments the observed F = 31.51 > F.05,3,15 = 3.29, the decision is to reject the null hypothesis.
Ignoring the blocking effects, the sum of squares blocking and sum of
squares error are combined together for a new SS error = 548.708 +
38.125 = 586.833. Combining the degrees of freedom error and
blocking yields a new dferror = 20. Using these new figures, we compute
a new mean square error, MSE = (586.833/20) = 29.34165.
n = 6, C = 4, N = 24, N − C = 20

q.05,4,20 = 3.96

HSD = q√(MSE/n) = (3.96)√(29.34165/6) = 8.757

x̄1 = 16.667, x̄2 = 12.333, x̄3 = 12.333, x̄4 = 19.833

None of the pairs of means are significantly different using Tukey's HSD = 8.757. This may be due in part to the fact that we compared means by folding the blocking effects back into error, and the blocking effects were highly significant.
11.54 Source        df       SS        MS      F
      Treatment 1    4    29.13    7.2825   1.98
      Treatment 2    1    12.67   12.6700   3.44
      Interaction    4    73.49   18.3725   4.99
      Error         30   110.38    3.6793
      Total         39   225.67

α = .05
11.55 Source        df        SS       MS       F
      Treatment 2    3   257.889   85.963   38.21
      Treatment 1    2     1.056    0.528    0.23
      Interaction    6    17.611    2.935    1.30
      Error         24    54.000    2.250
      Total         35   330.556

α = .01

Critical F.01,3,24 = 4.72 for treatment 2. For the treatment 2 effects, the observed F = 38.21 > F.01,3,24 = 4.72 and the decision is to reject the null hypothesis.

Critical F.01,2,24 = 5.61 for treatment 1. For the treatment 1 effects, the observed F = 0.23 < F.01,2,24 = 5.61 and the decision is to fail to reject the null hypothesis.

Critical F.01,6,24 = 3.67 for interaction. For the interaction effects, the observed F = 1.30 < F.01,6,24 = 3.67 and the decision is to fail to reject the null hypothesis.
P a g e | 220
11.56 Source        df        SS        MS       F
      Age            2   49.3889   24.6944   38.65
      Region         3    1.2222    0.4074    0.64
      Interaction    6    1.2778    0.2130    0.33
      Error         24   15.3333    0.6389
      Total         35   67.2222

α = .05
Critical F.05,2,24 = 3.40 for Age. For the age effects, the observed F =
38.65 >
F.05,2,24 = 3.40 and the decision is to reject the null hypothesis.
Critical F.05,3,24 = 3.01 for Region. For the region effects, the observed F
= 0.64
< F.05,3,24 = 3.01 and the decision is to fail to reject the null
hypothesis.
Critical F.05,6,24 = 2.51 for interaction. For interaction effects, the
observed
F = 0.33 < F.05,6,24 = 2.51 and the decision is to fail to reject the null
hypothesis.
There are no significant interaction effects. Only the Age effects are
significant.
x̄1 = 2.667, x̄2 = 4.917, x̄3 = 2.250

n = 12, C = 3, N = 36, N − C = 33

q.05,3,33 = 3.49

HSD = q√(MSE/n) = (3.49)√(.5404/12) = 0.7406
Shown below is a graph of the interaction using the cell means by Age.
11.57 Source      df           SS           MS      F
      Treatment    3   90,477,679   30,159,226   7.38
      Error       20   81,761,905    4,088,095
      Total       23  172,239,584

α = .05

The treatment F = 7.38 > F.05,3,20 = 3.10 and the decision is to reject the null hypothesis.
11.58 Source      df        SS        MS        F
      Treatment    2   460,353   230,176   103.70
      Blocks       5    33,524     6,705     3.02
      Error       10    22,197     2,220
      Total       17   516,074

α = .01
11.59 Source      df         SS       MS      F
      Treatment    2     9.5550    4.777   0.46
      Error       18   185.1337   10.285
      Total       20   194.6885

α = .05

Since the treatment F = 0.46 < F.05,2,18 = 3.55, the decision is to fail to reject the null hypothesis.
11.60 Source        df       SS      MS       F
      Years          2    4.875   2.437    5.16
      Size           3   17.083   5.694   12.06
      Interaction    6    2.292   0.382    0.81
      Error         36   17.000   0.472
      Total         47   41.250

α = .05
Critical F.05,2,36 = 3.32 for Years. For Years, the observed F = 5.16 >
F.05,2,36 = 3.32 and the decision is to reject the null hypothesis.
Critical F.05,3,36 = 2.92 for Size. For Size, the observed F = 12.06 >
F.05,3,36 = 2.92
and the decision is to reject the null hypothesis.
11.61 Source      df       SS       MS       F
      Treatment    4   53.400   13.350   13.64
      Blocks       7   17.100    2.443    2.50
      Error       28   27.400    0.979
      Total       39   97.900

α = .05

For treatments, the observed F = 13.64 > F.05,4,28 = 2.71 and the decision is to reject the null hypothesis.
11.62 This is a one-way ANOVA with four treatment levels. There are 36 observations in the study. The p-value of .045 indicates that there is a significant overall difference in the means at α = .05. An examination of the mean analysis shows that the sample sizes are different, with sizes of 8, 7, 11, and 10, respectively. No multiple comparison technique was used here to conduct pairwise comparisons. However, a study of sample means shows that the two most extreme means are from levels one and four. These two means would be the most likely candidates for multiple comparison tests. Note that the confidence intervals for means one and four (shown in the graphical output) are seemingly nonoverlapping, indicating a potentially significant difference.
cases, the observed value is less than the critical value.
11.64 This is a two-way ANOVA with 5 rows and 2 columns. There are 2 observations per cell. For rows, FR = 0.98 with a p-value of .461, which is not significant. For columns, FC = 2.67 with a p-value of .134, which is not significant. For interaction, FI = 4.65 with a p-value of .022, which is significant at α = .05. Thus, there are significant interaction effects, and the row and column effects are confounded. An examination of the interaction plot reveals that most of the lines cross, verifying the finding of significant interaction.

11.65 This is a two-way ANOVA with 4 rows and 3 columns. There are 3 observations per cell. FR = 4.30 with a p-value of .014 is significant at α = .05. The null hypothesis is rejected for rows. FC = 0.53 with a p-value of .594 is not significant. We fail to reject the null hypothesis for columns. FI = 0.99 with a p-value of .453 for interaction is not significant. We fail to reject the null hypothesis for interaction effects.
11.66 This was a random block design with 5 treatment levels and 5 blocking levels. For both treatment and blocking effects, the critical value is F.05,4,16 = 3.01. The observed F value for treatment effects is MSC / MSE = 35.98 / 7.36 = 4.89, which is greater than the critical value. The null hypothesis for treatments is rejected, and we conclude that there is a significant difference in treatment means. No multiple comparisons have been computed in the output. The observed F value for blocking effects is MSR / MSE = 10.36 / 7.36 = 1.41, which is less than the critical value. There are no significant blocking effects. Using a random block design on this experiment might have caused a loss of power.
11.67 This one-way ANOVA has 4 treatment levels and 24 observations. The F = 3.51 yields a p-value of .034, indicating significance at α = .05. Since the sample sizes are equal, Tukey's HSD is used to make multiple comparisons. The computer output shows that means 1 and 3 are the only pair that is significantly different (same signs in the confidence interval). Observe on the graph that the confidence intervals for means 1 and 3 barely overlap.
Chapter 12
Analysis of Categorical Data
LEARNING OBJECTIVES
1.
2.
The chi-square goodness-of-fit test examines the categories of one
variable to determine if the distribution of observed occurrences matches
some expected or theoretical distribution of occurrences. It can be used to
determine if some standard or previously known distribution of proportions is
the same as some observed distribution of proportions. It can also be used to
validate the theoretical distribution of occurrences of phenomena such as
random arrivals that are often assumed to be Poisson distributed. You will
note that the degrees of freedom, k - 1 for a given set of expected values or
for the uniform distribution, change to k - 2 for an expected Poisson
distribution and to k - 3 for an expected normal distribution. To conduct a chi-square goodness-of-fit test to analyze an expected Poisson distribution, the
value of lambda must be estimated from the observed data. This causes the
loss of an additional degree of freedom. With the normal distribution, both
the mean and standard deviation of the expected distribution are estimated
from the observed values causing the loss of two additional degrees of
freedom from the k - 1 value.
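The degree-of-freedom loss described above can be seen in a short sketch. This is our own illustration, not the text's: it fits a Poisson goodness-of-fit test to the arrival data of problem 12.23, estimating lambda from the observed data (hence df = k − 2). Because it keeps lambda unrounded rather than using the text's λ = 2.2, the statistic lands near, but not exactly on, the 7.25 computed there.

```python
from math import exp, factorial

# Chi-square GOF statistic against a Poisson model; df = k - 2 because
# lambda is estimated from the data (one extra df lost).
def poisson_gof(observed):
    n = sum(observed)
    lam = sum(i * f for i, f in enumerate(observed)) / n
    k = len(observed)
    # expected probabilities for 0..k-2 arrivals; last cell takes the tail
    probs = [exp(-lam) * lam**i / factorial(i) for i in range(k - 1)]
    probs.append(1 - sum(probs))
    expected = [p * n for p in probs]
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    return chi2, k - 2

chi2, df = poisson_gof([26, 40, 57, 32, 17, 12, 8])  # arrivals 0..6+
print(round(chi2, 2), df)
```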
CHAPTER OUTLINE

12.1
Chi-Square Goodness-of-Fit Test
Testing a Population Proportion Using the Chi-Square Goodness-of-Fit Test as an Alternative Technique to the z Test
12.2
Contingency Analysis: Chi-Square Test of Independence
KEY TERMS

Categorical Data
Chi-Square Distribution
Chi-Square Test of Independence
Contingency Analysis
Contingency Table
SOLUTIONS TO THE ODD-NUMBERED PROBLEMS IN CHAPTER 12
12.1  fo    fe    (fo − fe)²/fe
      53    68       3.309
      37    42       0.595
      32    33       0.030
      28    22       1.636
      18    10       6.400
      15     8       6.125

Observed χ² = Σ(fo − fe)²/fe = 18.095

df = k − 1 = 6 − 1 = 5, α = .05

χ².05,5 = 11.0705

Since the observed χ² = 18.095 > χ².05,5 = 11.0705, the decision is to reject the null hypothesis. The observed frequencies are not distributed the same as the expected frequencies.
12.2  fo    fe    (fo − fe)²/fe
      19    18       0.056
      17    18       0.056
      14    18       0.889
      18    18       0.000
      19    18       0.056
      21    18       0.500
      18    18       0.000
      18    18       0.000
Σfo = 144   Σfe = 144    χ² = 1.557

fe = Σf/k = 144/8 = 18

df = k − 1 = 8 − 1 = 7, α = .01

χ².01,7 = 18.4753

Observed χ² = Σ(fo − fe)²/fe = 1.557

Since the observed χ² = 1.557 < χ².01,7 = 18.4753, the decision is to fail to reject the null hypothesis.
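The uniform-expected computation in 12.2 is compact enough to verify directly. A sketch (our own illustration): every expected frequency is the mean of the observed counts.

```python
# Chi-square GOF against a uniform expectation: fe = mean of the counts.
observed = [19, 17, 14, 18, 19, 21, 18, 18]
fe = sum(observed) / len(observed)          # 144 / 8 = 18
chi2 = sum((fo - fe) ** 2 / fe for fo in observed)
print(round(chi2, 3))  # -> 1.556 (the text gets 1.557 by rounding each term)
```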
12.3  Number    fo    (Number)(fo)
         0      28          0
         1      17         17
         2      11         22
         3       5         15
                61         54

λ = 54/61 = 0.9

Expected
Number    Probability    Expected Frequency
  0          .4066       .4066(61) = 24.803
  1          .3659                   22.320
  2          .1647                   10.047
 ≥3          .0628                    3.831

Combining the last two categories so that the expected frequency is large enough:

Number    fo       fe      (fo − fe)²/fe
  0       28     24.803        0.412
  1       17     22.320        1.268
 ≥2       16     13.878        0.324
          61     60.993        2.004

df = k − 2 = 3 − 2 = 1, α = .05

χ².05,1 = 3.8415

Observed χ² = Σ(fo − fe)²/fe = 2.004

Since the observed χ² = 2.004 < χ².05,1 = 3.8415, the decision is to fail to reject the null hypothesis.
12.4  Category    f(observed)    Midpt.      fm        fm²
      10-20            6           15         90      1,350
      20-30           14           25        350      8,750
      30-40           29           35      1,015     35,525
      40-50           38           45      1,710     76,950
      50-60           25           55      1,375     75,625
      60-70           10           65        650     42,250
      70-80            7           75        525     39,375
      n = Σf = 129           Σfm = 5,715   Σfm² = 279,825

x̄ = Σfm/Σf = 5,715/129 = 44.3

s = √[(Σfm² − (Σfm)²/n)/(n − 1)] = √[(279,825 − (5,715)²/129)/128] = 14.43
For category 10-20:
z = (10 − 44.3)/14.43 = −2.38    prob. = .4913
z = (20 − 44.3)/14.43 = −1.68    prob. = .4535
Expected prob.: .4913 − .4535 = .0378

For category 20-30:  for x = 30,
z = (30 − 44.3)/14.43 = −0.99    prob. = .3389
Expected prob.: .4535 − .3389 = .1146

For category 30-40:  for x = 40,
z = (40 − 44.3)/14.43 = −0.30    prob. = .1179
Expected prob.: .3389 − .1179 = .2210

For category 40-50:  for x = 50,
z = (50 − 44.3)/14.43 = 0.40     prob. = .1554
Expected prob.: .1179 + .1554 = .2733

For category 50-60:  for x = 60,
z = (60 − 44.3)/14.43 = 1.09     prob. = .3621
Expected prob.: .3621 − .1554 = .2067

For category 60-70:  for x = 70,
z = (70 − 44.3)/14.43 = 1.78     prob. = .4625
Expected prob.: .4625 − .3621 = .1004

For category 70-80:  for x = 80,
z = (80 − 44.3)/14.43 = 2.47     prob. = .4932
Expected prob.: .4932 − .4625 = .0307
Category    Prob.    Expected frequency
  < 10      .0087    .0087(129) =  1.12
  10-20     .0378    .0378(129) =  4.88
  20-30     .1146                 14.78
  30-40     .2210                 28.51
  40-50     .2733                 35.26
  50-60     .2067                 26.66
  60-70     .1004                 12.95
  70-80     .0307                  3.96
  > 80      .0068                  0.88

The small tail categories are combined with their neighbors (< 10 with 10-20, and > 80 with 70-80):

Category    fo       fe     (fo − fe)²/fe
  10-20      6      6.00        .000
  20-30     14     14.78        .041
  30-40     29     28.51        .008
  40-50     38     35.26        .213
  50-60     25     26.66        .103
  60-70     10     12.95        .672
  70-80      7      4.84        .964
                               2.001

Calculated χ² = Σ(fo − fe)²/fe = 2.001

df = k − 3 = 7 − 3 = 4, α = .05

χ².05,4 = 9.4877

Since the observed χ² = 2.001 < χ².05,4 = 9.4877, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the observed frequencies are not normally distributed.
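The expected category probabilities above can be reproduced with an exact normal CDF. A sketch (our own illustration; it uses exact z-values rather than the two-decimal table values, so each probability lands within a rounding step of the figures above):

```python
from math import erf, sqrt

# Normal CDF via the error function.
def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 44.3, 14.43          # grouped-data estimates from 12.4
edges = [10, 20, 30, 40, 50, 60, 70, 80]
# probability mass the fitted normal puts in each class interval
probs = [normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
         for a, b in zip(edges, edges[1:])]
for (a, b), p in zip(zip(edges, edges[1:]), probs):
    print(f"{a}-{b}: {p:.4f}")
```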
12.5  Definition          fo    Exp. Prop.        fe           (fo − fe)²/fe
      Happiness           42       .39      227(.39) = 88.53       24.46
      Sales/Profit        95       .12      227(.12) = 27.24      168.55
      Helping Others      27       .18      227(.18) = 40.86        4.70
      Achievement/
        Challenge         63       .31      227(.31) = 70.37        0.77
                         227                                      198.48

Ha: The observed frequencies are not distributed the same as the expected frequencies.

Observed χ² = 198.48

df = k − 1 = 4 − 1 = 3, α = .05

χ².05,3 = 7.8147

Since the observed χ² = 198.48 > χ².05,3 = 7.8147, the decision is to reject the null hypothesis. The observed frequencies for men are not distributed the same as the expected frequencies, which are based on the responses of women.
12.6  Age      fo    Exp. Prop.         fe          (fo − fe)²/fe
      10-14    22       .09      (.09)(212) = 19.08      0.45
      15-19    50       .23      (.23)(212) = 48.76      0.03
      20-24    43       .22      (.22)(212) = 46.64      0.28
      25-29    29       .14      (.14)(212) = 29.68      0.02
      30-34    19       .10      (.10)(212) = 21.20      0.23
      ≥ 35     49       .22      (.22)(212) = 46.64      0.12
              212                                        1.13

α = .01, df = k − 1 = 6 − 1 = 5

χ².01,5 = 15.0863

Since the observed χ² = 1.13 < χ².01,5 = 15.0863, the decision is to fail to reject the null hypothesis. The observed distribution is not different from the distribution of expected frequencies.
12.7  Age      fo    Midpt.      fm        fm²
      10-20    16      15        240      3,600
      20-30    44      25      1,100     27,500
      30-40    61      35      2,135     74,725
      40-50    56      45      2,520    113,400
      50-60    35      55      1,925    105,875
      60-70    19      65      1,235     80,275
              231           Σfm = 9,155   Σfm² = 405,375

x̄ = Σfm/n = 9,155/231 = 39.63

s = √[(Σfm² − (Σfm)²/n)/(n − 1)] = √[(405,375 − (9,155)²/231)/230] = 13.6
For category 10-20:
z = (10 − 39.63)/13.6 = −2.18    prob. = .4854
z = (20 − 39.63)/13.6 = −1.44    prob. = .4251
Expected prob.: .4854 − .4251 = .0603

For category 20-30:  for x = 30,
z = (30 − 39.63)/13.6 = −0.71    prob. = .2611
Expected prob.: .4251 − .2611 = .1640

For category 30-40:  for x = 40,
z = (40 − 39.63)/13.6 = 0.03     prob. = .0120
Expected prob.: .2611 + .0120 = .2731

For category 40-50:  for x = 50,
z = (50 − 39.63)/13.6 = 0.76     prob. = .2764
Expected prob.: .2764 − .0120 = .2644

For category 50-60:  for x = 60,
z = (60 − 39.63)/13.6 = 1.50     prob. = .4332
Expected prob.: .4332 − .2764 = .1568

For category 60-70:  for x = 70,
z = (70 − 39.63)/13.6 = 2.23     prob. = .4871
Expected prob.: .4871 − .4332 = .0539

For < 10:
Probability between 10 and the mean = .0603 + .1640 + .2611 = .4854
Probability < 10 = .5000 − .4854 = .0146
Age        Probability          fe
< 10         .0146       (.0146)(231) =  3.37
10-20        .0603       (.0603)(231) = 13.93
20-30        .1640                      37.88
30-40        .2731                      63.09
40-50        .2644                      61.08
50-60        .1568                      36.22
60-70        .0539                      12.45
> 70         .0129                       2.98

Combining the tail categories with their neighbors (< 10 with 10-20, and > 70 with 60-70):

Age      fo       fe     (fo − fe)²/fe
10-20    16     17.30        0.10
20-30    44     37.88        0.99
30-40    61     63.09        0.07
40-50    56     61.08        0.42
50-60    35     36.22        0.04
60-70    19     15.43        0.83
                             2.45

df = k − 3 = 6 − 3 = 3, α = .05

χ².05,3 = 7.8147

Observed χ² = 2.45

Since the observed χ² = 2.45 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis.
12.8  Number      f    (f)(Number)
        0        18          0
        1        28         28
        2        47         94
        3        21         63
        4        16         64
        5        11         55
     6 or more    9         54
          Σf = 150    Σf(number) = 358

λ = Σf(number)/Σf = 358/150 = 2.4

Number       Probability          fe
   0           .0907       (.0907)(150) = 13.61
   1           .2177       (.2177)(150) = 32.66
   2           .2613                      39.20
   3           .2090                      31.35
   4           .1254                      18.81
   5           .0602                       9.03
6 or more      .0358                       5.36

   fo       fe     (fo − fe)²/fe
   18     13.61        1.42
   28     32.66        0.66
   47     39.20        1.55
   21     31.35        3.42
   16     18.81        0.42
   11      9.03        0.43
    9      5.36        2.47
                      10.37

α = .01, df = k − 2 = 7 − 2 = 5

χ².01,5 = 15.0863

Since the observed χ² = 10.37 < χ².01,5 = 15.0863, the decision is to fail to reject the null hypothesis.
12.9  H0: p = .28
      Ha: p ≠ .28

      n = 270,  x = 62

                            fo          fe          (fo − fe)²/fe
      Spend More            62    270(.28) =  75.6     2.44656
      Don't Spend More     208    270(.72) = 194.4     0.95144
      Total                270               270.0     3.39800

      df = k − 1 = 2 − 1 = 1, α/2 = .025

      χ².025,1 = 5.02389

      Since the observed χ² = 3.398 < χ².025,1 = 5.02389, the decision is to fail to reject the null hypothesis.
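Problems 12.9 and 12.10 use the chi-square statistic as an alternative to the z test for a proportion. A sketch of the equivalence using the 12.9 numbers (our own illustration): with 1 degree of freedom the chi-square statistic equals z squared exactly.

```python
from math import sqrt

n, x, p0 = 270, 62, 0.28
fo = [x, n - x]                        # observed counts
fe = [n * p0, n * (1 - p0)]            # expected counts under H0
chi2 = sum((o - e) ** 2 / e for o, e in zip(fo, fe))

# z test of the same proportion
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)

print(round(chi2, 3), round(z ** 2, 3))  # -> 3.398 3.398
```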
12.10  H0: p = .30
       Ha: p < .30

       n = 180,  x = 42

                         fo         fe         (fo − fe)²/fe
       Provide           42    180(.30) =  54     2.6667
       Don't Provide    138    180(.70) = 126     1.1429
       Total            180               180     3.8095

       α = .05, df = k − 1 = 2 − 1 = 1

       χ².05,1 = 3.8415

       Since the observed χ² = 3.8095 < χ².05,1 = 3.8415, the decision is to fail to reject the null hypothesis.
12.11                 Variable Two

       Variable   (202.77)  (326.23)
       One           203       326    | 529
                   (68.23)  (109.77)
                      68       110    | 178
                     271       436    | 707

e11 = (529)(271)/707 = 202.77      e12 = (529)(436)/707 = 326.23
e21 = (271)(178)/707 = 68.23       e22 = (436)(178)/707 = 109.77

χ² = (203 − 202.77)²/202.77 + (326 − 326.23)²/326.23
   + (68 − 68.23)²/68.23 + (110 − 109.77)²/109.77 = 0.00

df = (c − 1)(r − 1) = (2 − 1)(2 − 1) = 1, α = .01

χ².01,1 = 6.6349

Since the observed χ² = 0.00 < χ².01,1 = 6.6349, the decision is to fail to reject the null hypothesis. Variable One is independent of Variable Two.
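The test-of-independence arithmetic above can be sketched in a few lines (our own illustration): each expected frequency is row total × column total / grand total.

```python
# Chi-square test of independence on the 2x2 table from 12.11.
obs = [[203, 326],
       [68, 110]]
row_tot = [sum(r) for r in obs]                 # 529, 178
col_tot = [sum(c) for c in zip(*obs)]           # 271, 436
grand = sum(row_tot)                            # 707

chi2 = 0.0
for i, row in enumerate(obs):
    for j, fo in enumerate(row):
        fe = row_tot[i] * col_tot[j] / grand    # expected cell frequency
        chi2 += (fo - fe) ** 2 / fe

print(round(chi2, 4))  # -> 0.0017, essentially zero as in the worked result
```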
12.12                     Variable Two

       Variable   (22.92)  (14.10)  (45.83)   (59.15)
       One           24       13       47        58    | 142
                  (94.08)  (57.90) (188.17)  (242.85)
                     93       59      187       244    | 583
                    117       72      234       302    | 725

e11 = (142)(117)/725 = 22.92       e12 = (142)(72)/725 = 14.10
e13 = (142)(234)/725 = 45.83       e14 = (142)(302)/725 = 59.15
e21 = (583)(117)/725 = 94.08       e22 = (583)(72)/725 = 57.90
e23 = (583)(234)/725 = 188.17      e24 = (583)(302)/725 = 242.85

χ² = (24 − 22.92)²/22.92 + (13 − 14.10)²/14.10 + (47 − 45.83)²/45.83 + (58 − 59.15)²/59.15
   + (93 − 94.08)²/94.08 + (59 − 57.90)²/57.90 + (187 − 188.17)²/188.17 + (244 − 242.85)²/242.85

   = 0.24

df = (c − 1)(r − 1) = (4 − 1)(2 − 1) = 3

The observed χ² = 0.24 is far below any common critical value (e.g., χ².05,3 = 7.8147), so the decision is to fail to reject the null hypothesis.
12.13                        Social Class
                         Lower     Middle     Upper
       Number     0     (7.56)    (14.33)    (9.11)
       of                  7         18         6     |  31
       Children   1    (17.06)    (32.36)   (20.58)
                           9         38        23     |  70
             2 or 3    (46.06)    (87.38)   (55.56)
                          34         97        58     | 189
                 >3    (26.32)    (49.93)   (31.75)
                          47         31        30     | 108
                          97        184       117     | 398

e11 = (31)(97)/398 = 7.56       e12 = (31)(184)/398 = 14.33      e13 = (31)(117)/398 = 9.11
e21 = (70)(97)/398 = 17.06      e22 = (70)(184)/398 = 32.36      e23 = (70)(117)/398 = 20.58
e31 = (189)(97)/398 = 46.06     e32 = (189)(184)/398 = 87.38     e33 = (189)(117)/398 = 55.56
e41 = (108)(97)/398 = 26.32     e42 = (108)(184)/398 = 49.93     e43 = (108)(117)/398 = 31.75

χ² = (7 − 7.56)²/7.56 + (18 − 14.33)²/14.33 + (6 − 9.11)²/9.11
   + (9 − 17.06)²/17.06 + (38 − 32.36)²/32.36 + (23 − 20.58)²/20.58
   + (34 − 46.06)²/46.06 + (97 − 87.38)²/87.38 + (58 − 55.56)²/55.56
   + (47 − 26.32)²/26.32 + (31 − 49.93)²/49.93 + (30 − 31.75)²/31.75

   = .04 + .94 + 1.06 + 3.81 + .98 + .28 + 3.16 + 1.06 + .11 + 16.25 + 7.18 + .10 = 34.97

df = (c − 1)(r − 1) = (3 − 1)(4 − 1) = 6, α = .05

χ².05,6 = 12.5916

Since the observed χ² = 34.97 > χ².05,6 = 12.5916, the decision is to reject the null hypothesis.
12.14                     Type of Music Preferred
       Region        Rock       R&B      Country   Classical
         NE       (132.06)   (30.85)    (20.06)    (12.03)
                     140        32          5         18      | 195
         S        (159.15)   (37.18)    (24.17)    (14.50)
                     134        41         52          8      | 235
         W        (136.80)   (31.96)    (20.78)    (12.47)
                     154        27          8         13      | 202
                     428       100         65         39      | 632

e11 = (195)(428)/632 = 132.06     e12 = (195)(100)/632 = 30.85
e13 = (195)(65)/632 = 20.06       e14 = (195)(39)/632 = 12.03
e21 = (235)(428)/632 = 159.15     e22 = (235)(100)/632 = 37.18
e23 = (235)(65)/632 = 24.17       e24 = (235)(39)/632 = 14.50
e31 = (202)(428)/632 = 136.80     e32 = (202)(100)/632 = 31.96
e33 = (202)(65)/632 = 20.78       e34 = (202)(39)/632 = 12.47

χ² = (140 − 132.06)²/132.06 + (32 − 30.85)²/30.85 + (5 − 20.06)²/20.06 + (18 − 12.03)²/12.03
   + (134 − 159.15)²/159.15 + (41 − 37.18)²/37.18 + (52 − 24.17)²/24.17 + (8 − 14.50)²/14.50
   + (154 − 136.80)²/136.80 + (27 − 31.96)²/31.96 + (8 − 20.78)²/20.78 + (13 − 12.47)²/12.47

   = 64.93

df = (c − 1)(r − 1) = (4 − 1)(3 − 1) = 6, α = .01

χ².01,6 = 16.8119

Since the observed χ² = 64.93 > χ².01,6 = 16.8119, the decision is to reject the null hypothesis.
12.15                   Transportation Mode
       Industry           Air      Train     Truck
       Publishing      (26.21)   (12.75)   (46.04)
                          32        12        41     |  85
       Comp. Hard.     (10.79)    (5.25)   (18.96)
                           5         6        24     |  35
                          37        18        65     | 120

e11 = (85)(37)/120 = 26.21     e12 = (85)(18)/120 = 12.75     e13 = (85)(65)/120 = 46.04
e21 = (35)(37)/120 = 10.79     e22 = (35)(18)/120 = 5.25      e23 = (35)(65)/120 = 18.96

χ² = (32 − 26.21)²/26.21 + (12 − 12.75)²/12.75 + (41 − 46.04)²/46.04
   + (5 − 10.79)²/10.79 + (6 − 5.25)²/5.25 + (24 − 18.96)²/18.96

   = 6.43

df = (c − 1)(r − 1) = (3 − 1)(2 − 1) = 2, α = .05

χ².05,2 = 5.9915

Since the observed χ² = 6.43 > χ².05,2 = 5.9915, the decision is to reject the null hypothesis.
12.16                   Number of Bedrooms
                        < 2                  > 4
       Number         (66.48)  (137.48)   (70.03)
       of                116      101        57     | 274
       Stories       (139.52)  (288.52)  (146.97)
                          90      325       160     | 575
                         206      426       217     | 849

e11 = (274)(206)/849 = 66.48      e12 = (274)(426)/849 = 137.48     e13 = (274)(217)/849 = 70.03
e21 = (575)(206)/849 = 139.52     e22 = (575)(426)/849 = 288.52     e23 = (575)(217)/849 = 146.97

χ² = (116 − 66.48)²/66.48 + (101 − 137.48)²/137.48 + (57 − 70.03)²/70.03
   + (90 − 139.52)²/139.52 + (325 − 288.52)²/288.52 + (160 − 146.97)²/146.97

   = 72.34

α = .10, df = (c − 1)(r − 1) = (3 − 1)(2 − 1) = 2

χ².10,2 = 4.6052

Since the observed χ² = 72.34 > χ².10,2 = 4.6052, the decision is to reject the null hypothesis.
12.17                 Mexican Citizens
                        Yes       No
       Type    Dept.  (21.49)  (19.51)
       of                24       17     |  41
       Store   Disc.  (18.34)  (16.66)
                         20       15     |  35
               Hard.  (15.72)  (14.28)
                         11       19     |  30
               Shoe   (31.45)  (28.55)
                         32       28     |  60
                         87       79     | 166

e11 = (41)(87)/166 = 21.49     e12 = (41)(79)/166 = 19.51
e21 = (35)(87)/166 = 18.34     e22 = (35)(79)/166 = 16.66
e31 = (30)(87)/166 = 15.72     e32 = (30)(79)/166 = 14.28
e41 = (60)(87)/166 = 31.45     e42 = (60)(79)/166 = 28.55

χ² = (24 − 21.49)²/21.49 + (17 − 19.51)²/19.51 + (20 − 18.34)²/18.34 + (15 − 16.66)²/16.66
   + (11 − 15.72)²/15.72 + (19 − 14.28)²/14.28 + (32 − 31.45)²/31.45 + (28 − 28.55)²/28.55

   = 3.93

α = .05, df = (c − 1)(r − 1) = (2 − 1)(4 − 1) = 3

χ².05,3 = 7.8147

Since the observed χ² = 3.93 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis.
12.18 α = .01, k = 7, df = k − 1 = 6

Use:  χ² = Σ(fo − fe)²/fe

       fo     fe    (fo − fe)²   (fo − fe)²/fe
      214    206        64          0.311
      235    232         9          0.039
      279    268       121          0.451
      281    284         9          0.032
      264    268        16          0.060
      254    232       484          2.086
      211    206        25          0.121
                                    3.100

Observed χ² = Σ(fo − fe)²/fe = 3.100

Since the observed value of χ² = 3.1 < χ².01,6 = 16.8119, the decision is to fail to reject the null hypothesis. The observed distribution is not different from the expected distribution.
12.19                   Variable 2

       Variable   (11.04)  (20.85)  (24.12)
       1             12       23       21    |  56
                   (8.87)  (16.75)  (19.38)
                      8       17       20    |  45
                   (7.09)  (13.40)  (15.50)
                      7       11       18    |  36
                     27       51       59    | 137

e11 = 11.04     e12 = 20.85     e13 = 24.12
e21 = 8.87      e22 = 16.75     e23 = 19.38
e31 = 7.09      e32 = 13.40     e33 = 15.50

χ² = (12 − 11.04)²/11.04 + (23 − 20.85)²/20.85 + (21 − 24.12)²/24.12
   + (8 − 8.87)²/8.87 + (17 − 16.75)²/16.75 + (20 − 19.38)²/19.38
   + (7 − 7.09)²/7.09 + (11 − 13.40)²/13.40 + (18 − 15.50)²/15.50

   = 1.65

α = .05, df = (c − 1)(r − 1) = (2)(2) = 4

χ².05,4 = 9.4877

Since the observed χ² = 1.65 < χ².05,4 = 9.4877, the decision is to fail to reject the null hypothesis.
12.20                        Location
       Customer            NE
       Industrial      (206.5)   (128.38)   (78.12)
                          230       115        68     | 413
       Retail          (208.5)   (129.62)   (78.88)
                          185       143        89     | 417
                          415       258       157     | 830

e11 = (413)(415)/830 = 206.5      e12 = (413)(258)/830 = 128.38     e13 = (413)(157)/830 = 78.12
e21 = (417)(415)/830 = 208.5      e22 = (417)(258)/830 = 129.62     e23 = (417)(157)/830 = 78.88

χ² = (230 − 206.5)²/206.5 + (115 − 128.38)²/128.38 + (68 − 78.12)²/78.12
   + (185 − 208.5)²/208.5 + (143 − 129.62)²/129.62 + (89 − 78.88)²/78.88

   = 10.71

df = (c − 1)(r − 1) = (3 − 1)(2 − 1) = 2

The observed χ² = 10.71 exceeds even χ².01,2 = 9.2103, so the decision is to reject the null hypothesis.
12.21                      fo
       …                  189
       Peanut Butter      168
       Cheese Cracker     155
       Lemon Flavored     161
       Chocolate Mint     216
       Vanilla Filled     165
                   Σfo = 1,054

H0: The observed frequencies are distributed the same as the expected frequencies.
Ha: The observed frequencies are not distributed the same as the expected frequencies.

fe = Σfo/no. kinds = 1,054/6 = 175.67

       fo        fe     (fo − fe)²/fe
      189     175.67        1.01
      168     175.67        0.33
      155     175.67        2.43
      161     175.67        1.23
      216     175.67        9.26
      165     175.67        0.65
                           14.91

The observed χ² = 14.91

α = .05, df = k − 1 = 6 − 1 = 5

χ².05,5 = 11.0705

Since the observed χ² = 14.91 > χ².05,5 = 11.0705, the decision is to reject the null hypothesis.
12.22                    Gender

       Bought       (133.96)   (138.04)
       Car            207         65      |   272
                    (884.04)   (910.96)
                      811        984      | 1,795
                    1,018      1,049      | 2,067

e11 = (272)(1,018)/2,067 = 133.96       e12 = (272)(1,049)/2,067 = 138.04
e21 = (1,795)(1,018)/2,067 = 884.04     e22 = (1,795)(1,049)/2,067 = 910.96

χ² = (207 − 133.96)²/133.96 + (65 − 138.04)²/138.04
   + (811 − 884.04)²/884.04 + (984 − 910.96)²/910.96

   = 90.36

α = .05, df = (c − 1)(r − 1) = (2 − 1)(2 − 1) = 1

χ².05,1 = 3.8415

Since the observed χ² = 90.36 > χ².05,1 = 3.8415, the decision is to reject the null hypothesis.
12.23  Arrivals    fo    (fo)(Arrivals)
          0        26           0
          1        40          40
          2        57         114
          3        32          96
          4        17          68
          5        12          60
          6         8          48
            Σfo = 192    Σ(fo)(arrivals) = 426

λ = Σ(fo)(arrivals)/Σfo = 426/192 = 2.2

Arrivals     Probability          fe
   0           .1108       (.1108)(192) = 21.27
   1           .2438       (.2438)(192) = 46.81
   2           .2681                      51.48
   3           .1966                      37.75
   4           .1082                      20.77
   5           .0476                       9.14
  ≥6           .0249                       4.78

       fo       fe     (fo − fe)²/fe
       26     21.27        1.05
       40     46.81        0.99
       57     51.48        0.59
       32     37.75        0.88
       17     20.77        0.68
       12      9.14        0.89
        8      4.78        2.17
                           7.25

Observed χ² = 7.25

α = .05, df = k − 2 = 7 − 2 = 5

χ².05,5 = 11.0705

Since the observed χ² = 7.25 < χ².05,5 = 11.0705, the decision is to fail to reject the null hypothesis. There is not enough evidence to reject the claim that the observed frequency of arrivals is Poisson distributed.
12.24  Soft Drink       fo    Proportion          fe            (fo − fe)²/fe
       Classic Coke    314       .179      (.179)(1,726) = 308.95     0.08
       Pepsi           219       .115      (.115)(1,726) = 198.49     2.12
       Diet Coke       212       .097                      167.42    11.87
       Mt. Dew         121       .063                      108.74     1.38
       Diet Pepsi       98       .061                      105.29     0.50
       Sprite           93       .057                       98.38     0.29
       Dr. Pepper       88       .056                       96.66     0.78
       Others          581       .372                      642.07     5.81
             Σfo = 1,726                                             22.83

Observed χ² = 22.83

α = .05, df = k − 1 = 8 − 1 = 7

χ².05,7 = 14.0671

Since the observed χ² = 22.83 > χ².05,7 = 14.0671, the decision is to reject the null hypothesis. The observed frequencies are not distributed the same as the expected frequencies from the national poll.
12.25                               Position
       Years       Manager   Programmer   Operator   Systems Analyst
        0-3        (22.06)     (17.16)     (12.53)       (15.25)
                      6           37          11            13        |  67
        4-8        (29.96)     (23.30)     (17.02)       (20.72)
                     28           16          23            24        |  91
        >8         (28.98)     (22.54)     (16.46)       (20.03)
                     47           10          12            19        |  88
                     81           63          46            56        | 246

e11 = (67)(81)/246 = 22.06      e12 = (67)(63)/246 = 17.16
e13 = (67)(46)/246 = 12.53      e14 = (67)(56)/246 = 15.25
e21 = (91)(81)/246 = 29.96      e22 = (91)(63)/246 = 23.30
e23 = (91)(46)/246 = 17.02      e24 = (91)(56)/246 = 20.72
e31 = (88)(81)/246 = 28.98      e32 = (88)(63)/246 = 22.54
e33 = (88)(46)/246 = 16.46      e34 = (88)(56)/246 = 20.03

χ² = (6 − 22.06)²/22.06 + (37 − 17.16)²/17.16 + (11 − 12.53)²/12.53 + (13 − 15.25)²/15.25
   + (28 − 29.96)²/29.96 + (16 − 23.30)²/23.30 + (23 − 17.02)²/17.02 + (24 − 20.72)²/20.72
   + (47 − 28.98)²/28.98 + (10 − 22.54)²/22.54 + (12 − 16.46)²/16.46 + (19 − 20.03)²/20.03

   = 59.63

α = .01, df = (c − 1)(r − 1) = (4 − 1)(3 − 1) = 6

χ².01,6 = 16.8119

Since the observed χ² = 59.63 > χ².01,6 = 16.8119, the decision is to reject the null hypothesis.
12.26  H0: p = .43
       Ha: p ≠ .43

       n = 315,  x = 120,  α = .05,  α/2 = .025

                               fo           fe           (fo − fe)²/fe
       More Work,
       More Business          120    (.43)(315) = 135.45     1.76
       Others                 195    (.57)(315) = 179.55     1.33
       Total                  315                315.00      3.09

       df = k − 1 = 2 − 1 = 1

       χ².025,1 = 5.0239

       Since the observed χ² = 3.09 < χ².025,1 = 5.0239, the decision is to fail to reject the null hypothesis.
12.27                 Type of College or University
       Number        Community     Large        Small
       of            College       University   College
       Children
                      (52.05)      (158.60)     (23.36)
                         25           178          31       | 234
          1           (44.93)      (136.91)     (20.16)
                         49           141          12       | 202
                      (20.68)       (63.03)      (9.28)
                         31            54           8       |  93
         >3            (9.34)       (28.47)      (4.19)
                         22            14           6       |  42
                        127           387          57       | 571

e11 = (234)(127)/571 = 52.05     e12 = (234)(387)/571 = 158.60    e13 = (234)(57)/571 = 23.36
e21 = (202)(127)/571 = 44.93     e22 = (202)(387)/571 = 136.91    e23 = (202)(57)/571 = 20.16
e31 = (93)(127)/571 = 20.68      e32 = (93)(387)/571 = 63.03      e33 = (93)(57)/571 = 9.28
e41 = (42)(127)/571 = 9.34       e42 = (42)(387)/571 = 28.47      e43 = (42)(57)/571 = 4.19

χ² = (25 − 52.05)²/52.05 + (178 − 158.6)²/158.6 + (31 − 23.36)²/23.36
   + (49 − 44.93)²/44.93 + (141 − 136.91)²/136.91 + (12 − 20.16)²/20.16
   + (31 − 20.68)²/20.68 + (54 − 63.03)²/63.03 + (8 − 9.28)²/9.28
   + (22 − 9.34)²/9.34 + (14 − 28.47)²/28.47 + (6 − 4.19)²/4.19

   = 54.64

α = .05, df = (c − 1)(r − 1) = (3 − 1)(4 − 1) = 6

χ².05,6 = 12.5916

Since the observed χ² = 54.64 > χ².05,6 = 12.5916, the decision is to reject the null hypothesis.
12.28 The observed chi-square is 30.18 with a p-value of .0000043. The chi-square goodness-of-fit test indicates that there is a significant difference between the observed frequencies and the expected frequencies. The distribution of responses to the question is not the same for adults between 21 and 30 years of age as it is for others. Marketing and sales people might reorient their 21- to 30-year-old efforts away from home improvement and pay more attention to leisure travel/vacation, clothing, and home entertainment.
12.29 The observed chi-square value for this test of independence is 5.366.
The associated p-value of .252 indicates failure to reject the null
hypothesis. There is not enough evidence here to say that color choice
is dependent upon gender. Automobile marketing people do not have
to worry about which colors especially appeal to men or to women
because car color is independent of gender. In addition, design and
production people can determine car color quotas based on other
variables.
Chapter 13
Nonparametric Statistics
LEARNING OBJECTIVES
1.
Recognize the advantages and disadvantages of nonparametric
statistics.
2.
3.
Know when and how to use the Mann-Whitney U Test, the Wilcoxon
matched-pairs signed rank test, the Kruskal-Wallis test, and the
Friedman test.
4.
Chapter 13 contains six new techniques for analysis. Only the first
technique, the runs test, is conceptually a different idea for the student to
consider than anything presented in the text to this point. The runs test is a
mechanism for testing to determine if a string of data is random. There is a
runs test for small samples that uses Table A.12 in the appendix and a test for
large samples, which utilizes a z test.
CHAPTER OUTLINE
13.1
Runs Test
Small-Sample Runs Test
Large-Sample Runs Test
13.2
Mann-Whitney U Test
Small-Sample Case
Large-Sample Case
13.3
13.4
Kruskal-Wallis Test
13.5
Friedman Test
13.6
KEY TERMS
Friedman Test
Parametric Statistics
Kruskal-Wallis Test
Runs Test
Mann-Whitney U Test
Nonparametric Statistics
Wilcoxon Matched-Pairs Signed Rank Test
SOLUTIONS TO CHAPTER 13
13.1 α = .05. The lower tail critical value is 6 and the upper tail critical value is 16.

n1 = 10, n2 = 10, R = 11

Since R = 11 falls between the critical values of 6 and 16, the decision is to fail to reject the null hypothesis. The data are random.
13.2 n1 = 26, n2 = 21, n = 47, R = 9

μR = 2n1n2/(n1 + n2) + 1 = 2(26)(21)/(26 + 21) + 1 = 24.234

σR = √[2n1n2(2n1n2 − n1 − n2) / ((n1 + n2)²(n1 + n2 − 1))] = 3.351

z = (R − μR)/σR = (9 − 24.234)/3.351 = −4.55

Since the observed value of z = −4.55 < z.025 = −1.96, the decision is to reject the null hypothesis. The data are not randomly generated.
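The large-sample runs test z statistic used in 13.2 (and again in 13.5) follows directly from the mean and standard deviation of R. A minimal illustrative sketch, not part of the original solution:

```python
from math import sqrt

def runs_test_z(n1, n2, R):
    """Large-sample runs test: z statistic for R observed runs,
    given n1 and n2 observations of the two types."""
    mu_R = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_R = sqrt(
        (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2))
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return (R - mu_R) / sigma_R

# Problem 13.2: n1 = 26, n2 = 21, R = 9
z = runs_test_z(26, 21, 9)
print(round(z, 2))  # -4.55
```

Calling `runs_test_z(40, 24, 27)` reproduces the z of about −1.08 found in problem 13.5.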
13.3
n1 = 8
n2 = 52
= .05
13.5 α = .05. If the observed value of z is greater than 1.96 or less than −1.96, the decision is to reject the null hypothesis.

R = 27, n1 = 40, n2 = 24

μR = 2n1n2/(n1 + n2) + 1 = 2(40)(24)/64 + 1 = 31

σR = √[2n1n2(2n1n2 − n1 − n2) / ((n1 + n2)²(n1 + n2 − 1))] = 3.716

z = (R − μR)/σR = (27 − 31)/3.716 = −1.08

Since the observed z of −1.08 is greater than the critical lower tail z value of −1.96, the decision is to fail to reject the null hypothesis. The data are randomly generated.
13.6
n1 = 5
n2 = 8
n = 13
= .05
R=4
Since R = 4 is greater than the lower critical value of 3 and less than the upper critical value of 11, the decision is to fail to reject the null hypothesis.
The data are randomly generated.
13.7 Use the small-sample Mann-Whitney U test since both n1, n2 < 10; α = .05. Since this is a two-tailed test, α/2 = .025. The p-value is obtained using Table A.13.

Value   Rank
 11      1
 13      2.5
 13      2.5
 14      4
 15      5
 17      6
 18      7.5
 18      7.5
 21      9.5
 21      9.5
 22     11
 23     12.5
 23     12.5
 24     14
 26     15
 29     16

n1 = 8, n2 = 8, W1 = 62.5

U = n1n2 + n1(n1 + 1)/2 − W1 = (8)(8) + (8)(9)/2 − 62.5 = 37.5

U′ = n1n2 − U = 64 − 37.5 = 26.5
13.8
Value   Rank
203      1
208      2
209      3
211      4
214      5
216      6
217      7
218      8
219      9
222     10
223     11
224     12
227     13
229     14
230     15.5
230     15.5
231     17
236     18
240     19
241     20
248     21
255     22
256     23
283     24

n1 = 11, n2 = 13

W1 = 6 + 7 + 10 + 12 + 17 + 19 + 20 + 21 + 22 + 23 + 24 = 181

μ = n1n2/2 = (11)(13)/2 = 71.5

σ = √[n1n2(n1 + n2 + 1)/12] = √[(11)(13)(25)/12] = 17.26

U = n1n2 + n1(n1 + 1)/2 − W1 = (11)(13) + (11)(12)/2 − 181 = 28

z = (U − μ)/σ = (28 − 71.5)/17.26 = −2.52

Since the observed z = −2.52 < z.025 = −1.96, the decision is to reject the null hypothesis.
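The large-sample Mann-Whitney U computation in 13.8 can be sketched in a few lines. This is an illustrative check, not part of the original solution; it takes W1 (the rank sum of group 1) as given.

```python
from math import sqrt

def mann_whitney_z(n1, n2, w1):
    """Large-sample Mann-Whitney U: U from the group-1 rank sum W1,
    then the normal approximation z = (U - mu_U) / sigma_U."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mu_u) / sigma_u

# Problem 13.8: n1 = 11, n2 = 13, W1 = 181
u, z = mann_whitney_z(11, 13, 181)
print(u, round(z, 2))  # 28.0 -2.52
```

The same helper reproduces the U and z values in problems 13.10, 13.11, 13.12, 13.47, and 13.55.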
13.9
Contacts
Rank
Group
3.5
3.5
10
11
6.5
11
6.5
12
8.5
12
8.5
13
11
13
11
13
11
14
13
15
14
16
15
17
16
W1 = 39

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (7)(9) + (7)(8)/2 − 39 = 52

U2 = n1n2 − U1 = (7)(9) − 52 = 11

U = 11

From Table A.13, the p-value = .0156. Since this p-value is greater than α = .01, the decision is to fail to reject the null hypothesis.
13.10 Ho: Urban and rural spend the same amounts
Ha: Urban and rural spend different amounts
Expenditure
Rank
Group
1950
2050
2075
2110
2175
2200
2480
2490
2540
2585
10
2630
11
2655
12
2685
13
2710
14
2750
15
2770
16
2790
17
2800
18
2850
19.5
2850
19.5
2975
21
2995
22.5
2995
22.5
3100
24
n1 = 12, n2 = 12

W1 = 1 + 4 + 5 + 6 + 7 + 9 + 11 + 12 + 14 + 15 + 19.5 + 19.5 = 123

μ = n1n2/2 = (12)(12)/2 = 72

σ = √[n1n2(n1 + n2 + 1)/12] = √[(12)(12)(25)/12] = 17.32

U = n1n2 + n1(n1 + 1)/2 − W1 = (12)(12) + (12)(13)/2 − 123 = 99

z = (U − μ)/σ = (99 − 72)/17.32 = 1.56

α = .05, α/2 = .025, z.025 = ±1.96

Since the observed z = 1.56 < z.025 = 1.96, the decision is to fail to reject the null hypothesis.
13.11
Earnings
Rank
Gender
$28,900
31,400
36,600
40,000
40,500
41,200
42,300
42,500
44,500
45,000
10
47,500
11
47,800
12.5
47,800
12.5
48,000
14
50,100
15
51,000
16
51,500
17.5
51,500
17.5
53,850
19
55,000
20
57,800
21
61,100
22
63,900
23
n1 = 11, n2 = 12, W1 = 193.5

μ = n1n2/2 = (11)(12)/2 = 66

σ = √[n1n2(n1 + n2 + 1)/12] = √[(11)(12)(24)/12] = 16.25

U = n1n2 + n1(n1 + 1)/2 − W1 = (11)(12) + (11)(12)/2 − 193.5 = 4.5

z = (U − μ)/σ = (4.5 − 66)/16.25 = −3.78

α = .01, z.01 = −2.33

Since the observed z = −3.78 < z.01 = −2.33, the decision is to reject the null hypothesis.
13.12 Ho: There is no difference in the price of a single-family home in Denver and Hartford
Ha: There is a difference in the price of a single-family home in Denver and Hartford
Price
Rank
City
132,405
134,127
134,157
134,514
135,062
135,238
135,940
136,333
7
8
D
H
136,419
136,981
10
137,016
11
137,359
12
137,741
13
137,867
14
138,057
15
139,114
16
139,638
17
140,031
18
140,102
19
140,479
20
141,408
21
141,730
22
141,861
23
142,012
24
142,136
25
143,947
26
143,968
27
144,500
28
n1 = 13, n2 = 15

W1 = 1 + 3 + 5 + 7 + 10 + 11 + 15 + 17 + 19 + 20 + 21 + 22 + 23 = 174

U = n1n2 + n1(n1 + 1)/2 − W1 = (13)(15) + (13)(14)/2 − 174 = 112

μ = n1n2/2 = (13)(15)/2 = 97.5

σ = √[n1n2(n1 + n2 + 1)/12] = √[(13)(15)(29)/12] = 21.708

z = (U − μ)/σ = (112 − 97.5)/21.708 = 0.67

z.025 = ±1.96

Since the observed z = 0.67 < z.025 = 1.96, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that there is a price difference for single-family homes in Denver and Hartford.
Rank
212
179
33
15
234
184
50
16
219
213
7.5
199
167
32
13.5
194
189
206
200
7.5
234
212
22
11
225
221
220
223
-3
- 3.5
218
217
234
208
26
12
212
215
-3
-3.5
219
187
32
13.5
196
198
-2
-2
178
189
-11
-9
213
201
12
10
n = 16

T− = 3.5 + 3.5 + 2 + 9 = 18, T = 18

μT = n(n + 1)/4 = (16)(17)/4 = 68

σT = √[n(n + 1)(2n + 1)/24] = √[16(17)(33)/24] = 19.34

z = (T − μT)/σT = (18 − 68)/19.34 = −2.59

α = .10, α/2 = .05, z.05 = ±1.645

Since the observed z = −2.59 < z.05 = −1.645, the decision is to reject the null hypothesis.
13.14 Ho: Md = 0
Ha: Md 0
Before
After
Rank
49
43
+9
41
29
12
+12
47
30
17
+14
39
38
53
40
13
+13
51
43
+10
51
46
49
40
38
42
-4
- 5.5
54
50
+ 5.5
46
47
-1
- 1.5
50
47
+4
44
39
+ 7.5
49
49
45
47
-2
+ 1.5
+ 7.5
+11
-3
n = 14. Since the observed T is less than or equal to the critical T, the decision is to reject the null hypothesis. There is a significant difference in before and after.
13.15 Ho: The population differences > 0
Ha: The population differences < 0
Before
After
Rank
10,500
12,600
-2,100
-11
8,870
10,660
-1,790
-9
12,300
11,890
410
10,510
14,630
-4,120
-17
5,570
8,580
-3,010
-15
9,150
10,115
-965
-7
11,980
14,320
-2,370
-12
6,740
6,900
-160
-2
7,340
8,890
-1,550
-8
13,400
16,540
-3,140
-16
12,200
11,300
900
10,570
13,330
-2,760
-13
9,880
9,990
-110
-1
12,100
14,050
-1,950
-10
9,000
9,500
-500
-4
11,800
12,450
-650
-5
10,500
13,450
-2,950
-14
T+ = 3 + 6 = 9, T = 9, n = 17

μT = n(n + 1)/4 = (17)(18)/4 = 76.5

σT = √[n(n + 1)(2n + 1)/24] = √[17(18)(35)/24] = 21.12

z = (T − μT)/σT = (9 − 76.5)/21.12 = −3.20

α = .05, z.05 = −1.645

Since the observed z = −3.20 < z.05 = −1.645, the decision is to reject the null hypothesis.
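The large-sample Wilcoxon matched-pairs z statistic used in 13.15 depends only on n and T. An illustrative sketch, not part of the original solution:

```python
from math import sqrt

def wilcoxon_z(n, t):
    """Large-sample Wilcoxon matched-pairs signed rank test:
    z statistic for T (the smaller signed-rank sum) with n
    nonzero differences."""
    mu_t = n * (n + 1) / 4
    sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mu_t) / sigma_t

# Problem 13.15: n = 17 nonzero differences, T = 9
z = wilcoxon_z(17, 9)
print(round(z, 2))  # -3.2
```

Calling `wilcoxon_z(16, 18)` reproduces the z of about −2.59 found in problem 13.13.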
13.16 Ho:Md = 0
Ha:Md < 0
Manual
426
Scanner
473
d
-47
Rank
-11
387
446
-59
-13
410
421
-11
506
510
-4
-2
411
465
-54
-12
398
409
-11
427
414
13
449
459
-10
-4
407
502
-95
-14
438
439
-1
-1
418
456
-38
-10
482
499
-17
-8
512
517
-5
-3
402
437
-35
-9
-5.5
-5.5
n = 14

T+ = 7

T− = 11 + 13 + 5.5 + 2 + 12 + 5.5 + 4 + 14 + 1 + 10 + 8 + 3 + 9 = 98

T = min(T+, T−) = 7

Since the observed T = 7 < T.05,14 = 26, the decision is to reject the null hypothesis. The differences are significantly less than zero and the scanner (after) scores are significantly higher.
13.17 Ho: The population differences 0
Ha: The population differences < 0
1999
2006
Rank
49
54
-5
-7.5
27
38
-11
-15
39
38
75
80
-5
59
53
11
67
68
-1
-2
22
43
-21
-20
61
67
-6
-11
58
73
-15
-18
60
55
7.5
72
58
14
16.5
62
57
7.5
49
63
-14
-16.5
48
49
-1
-2
19
39
-20
-19
32
34
-2
60
66
-6
80
90
-10
55
57
-2
-4.5
68
58
10
13.5
-7.5
-4.5
-11
-13.5
n = 20, T = T+ = 58

μT = n(n + 1)/4 = (20)(21)/4 = 105

σT = √[n(n + 1)(2n + 1)/24] = √[20(21)(41)/24] = 26.79

z = (T − μT)/σT = (58 − 105)/26.79 = −1.75

For α = .10, z.10 = −1.28

Since the observed z = −1.75 < z.10 = −1.28, the decision is to reject the null hypothesis.
13.18
April 2002   April 2006   d   Rank
63.1
57.1
5.7
67.1
66.4
0.7
65.5
61.8
3.7
68.0
65.3
2.7
8.5
66.6
63.5
3.1
10.5
65.7
66.4
-0.7
-3.5
69.2
64.9
4.3
67.0
65.2
1.8
65.2
65.1
0.1
1.5
60.7
62.2
-1.5
-5
63.4
60.3
3.1
10.5
59.2
57.4
1.8
6.5
62.9
58.2
4.7
15
69.4
65.3
4.1
13
67.3
67.2
0.1
1.5
66.8
64.1
2.7
8.5
n = 16

T− = 8.5, T = 8.5

μT = n(n + 1)/4 = (16)(17)/4 = 68

σT = √[n(n + 1)(2n + 1)/24] = √[16(17)(33)/24] = 19.339

z = (T − μT)/σT = (8.5 − 68)/19.339 = −3.08

For α = .05, z.05 = −1.645

Since the observed z = −3.08 < z.05 = −1.645, the decision is to reject the null hypothesis.
13.19 Ho: The 5 populations are identical
Ha: At least one of the 5 populations is different
5__
157
165
219
286
197
188
197
257
243
215
175
204
243
259
235
174
214
231
250
217
201
183
217
279
240
203
203
233
213
BY RANKS
1
5__
18
29
7.5
26
23.5
15
7.5
12
23.5
27
21
14
19
25
16.5
16.5
28
22
10.5
Tj: 33.5, 40.5, 113.5, 132.5, 115
nj: 6, 5, 6, 5, 7

ΣTj²/nj = 8,062.67, n = 29

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((29)(30)) · (8,062.67) − 3(30) = 21.21

α = .01, df = c − 1 = 5 − 1 = 4, χ².01,4 = 13.2767

Since the observed K = 21.21 > χ².01,4 = 13.2767, the decision is to reject the null hypothesis.
13.20 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different
Group 1: 19, 21, 29, 22, 37, 42
Group 2: 30, 38, 35, 24, 29
Group 3: 39, 32, 41, 44, 30, 27, 33

By ranks:
Group 1: 1, 2, 6.5, 3, 13, 17; T1 = 42.5, n1 = 6
Group 2: 8.5, 14, 12, 4, 6.5; T2 = 45, n2 = 5
Group 3: 15, 10, 16, 18, 8.5, 5, 11; T3 = 83.5, n3 = 7

ΣTj²/nj = (42.5)²/6 + (45)²/5 + (83.5)²/7 = 1,702.08, n = 18

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((18)(19)) · (1,702.08) − 3(19) = 2.72

α = .05, df = c − 1 = 3 − 1 = 2, χ².05,2 = 5.9915

Since the observed K = 2.72 < χ².05,2 = 5.9915, the decision is to fail to reject the null hypothesis.
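The Kruskal-Wallis K in 13.20 can be verified from the raw group data; the only subtle step is assigning midranks to ties. An illustrative sketch, not part of the original solution:

```python
def kruskal_wallis_k(groups):
    """Kruskal-Wallis K from raw data, using midranks for ties:
    K = 12/(n(n+1)) * sum(Tj^2 / nj) - 3(n+1)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # midrank of each distinct value = average of its tied positions
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * term - 3 * (n + 1)

# Problem 13.20 data
g1 = [19, 21, 29, 22, 37, 42]
g2 = [30, 38, 35, 24, 29]
g3 = [39, 32, 41, 44, 30, 27, 33]
print(round(kruskal_wallis_k([g1, g2, g3]), 2))  # 2.72
```

The helper works for any number of groups of unequal sizes, so it also checks the K values in problems 13.21 through 13.24.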
13.21 Ho: The 4 populations are identical
Ha: At least one of the 4 populations is different
Region 1
Region 2
Region 3
$1,200
$225
$ 675
$1,075
450
950
500
1,050
110
100
1,100
750
800
350
310
180
375
275
660
330
200
Region 4
680
425
By Ranks
Region 1
Region 2
Region 3
23
15
21
12
19
13
20
22
17
18
10
14
4
_
T j 69
nj
Region 4
16
_
11
40
71
96
ΣTj²/nj = 3,438.27, n = 23

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((23)(24)) · (3,438.27) − 3(24) = 2.75

α = .05, df = c − 1 = 4 − 1 = 3, χ².05,3 = 7.8147

Since the observed K = 2.75 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis.
13.22 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different

Small Town
$21,800
City
$22,300
Suburb
$22,000
22,500
21,900
22,600
21,750
21,900
22,800
22,200
22,650
22,050
21,600
21,800
21,250
22,550
By ranks:
Small Town: 4.5, 12, 3, 10, 2; Tj = 31.5, nj = 5
City: 11, 6.5, 6.5, 15, 4.5; Tj = 43.5, nj = 5
Suburb: 8, 14, 16, 9, 1, 13; Tj = 61, nj = 6

ΣTj²/nj = (31.5)²/5 + (43.5)²/5 + (61)²/6 = 1,197.07, n = 16

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((16)(17)) · (1,197.07) − 3(17) = 1.81

α = .05, df = c − 1 = 3 − 1 = 2, χ².05,2 = 5.9915

Since the observed K = 1.81 < χ².05,2 = 5.9915, the decision is to fail to reject the null hypothesis.
13.23 Ho: The 4 populations are identical
Ha: At least one of the 4 populations is different
Amusement Parks
Lake Area
City
National
Park
0
1
3
By Ranks
Amusement Parks
Lake Area
City
National Park
2
20.5
11.5
11.5
5.5
11.5
11.5
28.5
5.5
20.5
20.5
20.5
33
11.5
28.5
11.5
28.5
20.5
20.5
5.5
28.5
11.5
33
20.5
20.5
28.5
33
11.5
__
Tj 34
__
20.5
28.5
5.5
20.5
____
207.5
154.0
199.5
nj 7
Tj
10
7
9
10
8
= 12,295.80
n = 34

ΣTj²/nj = 12,295.80

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((34)(35)) · (12,295.80) − 3(35) = 18.99

α = .05, df = c − 1 = 4 − 1 = 3, χ².05,3 = 7.8147

Since the observed K = 18.99 > χ².05,3 = 7.8147, the decision is to reject the null hypothesis.
Day Shift
Swing Shift
Graveyard Shift
52
45
41
57
48
46
53
44
39
56
51
49
55
48
42
50
54
35
51
49
52
43
By Ranks
Day Shift
22
9.5
18
21
14.5
20
9.5
13
19
14.5
11.5
Tj 125
Tj
Graveyard Shift
16.5
nj
Swing Shift
5 _
11.5
4
1
16.5
___
82
46
7
8
7
ΣTj²/nj = (125)²/7 + (82)²/8 + (46)²/7 = 3,374.93, n = 22

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((22)(23)) · (3,374.93) − 3(23) = 11.04

α = .05, df = c − 1 = 3 − 1 = 2, χ².05,2 = 5.9915

Since the observed K = 11.04 > χ².05,2 = 5.9915, the decision is to reject the null hypothesis.
13.25 Ho: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population

c = 5, b = 5, df = c − 1 = 4, χ².05,4 = 9.4877

If the observed value of χr² > 9.4877, then the decision will be to reject the null hypothesis.

Rj: 8.5, 11.5, 12, 18, 25
Rj²: 72.25, 132.25, 144, 324, 625
ΣRj² = 1,297.5

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((5)(5)(6)) · (1,297.5) − 3(5)(6) = 13.8

Since the observed χr² = 13.8 > χ².05,4 = 9.4877, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
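Given the treatment rank sums Rj, the Friedman statistic in 13.25 is a one-line formula. An illustrative sketch, not part of the original solution:

```python
def friedman_stat(rank_sums, b):
    """Friedman chi-square from the c treatment rank sums Rj
    over b blocks: 12/(b c (c+1)) * sum(Rj^2) - 3 b (c+1)."""
    c = len(rank_sums)
    sum_r2 = sum(r * r for r in rank_sums)
    return 12 / (b * c * (c + 1)) * sum_r2 - 3 * b * (c + 1)

# Problem 13.25: b = 5 blocks, c = 5 treatments
print(round(friedman_stat([8.5, 11.5, 12, 18, 25], b=5), 1))  # 13.8
```

The same helper reproduces the χr² values in problems 13.27, 13.28, 13.42, 13.46, and 13.50.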
13.26 Ho: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than
at least one
other treatment population.
c = 6, b = 9, df = c − 1 = 5, χ².05,5 = 11.0705

Rj²: 225, 625, 625, 3136, 2500, 1444
ΣRj² = 8,555

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((9)(6)(7)) · (8,555) − 3(9)(7) = 82.59

Since the observed χr² = 82.59 > χ².05,5 = 11.0705, the decision is to reject the null hypothesis.
13.27 Ho: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population

c = 4, b = 6, df = c − 1 = 3, χ².01,3 = 11.3449

Rj: 8, 20, 22, 10
Rj²: 64, 400, 484, 100
ΣRj² = 1,048

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((6)(4)(5)) · (1,048) − 3(6)(5) = 14.8

Since the observed χr² = 14.8 > χ².01,3 = 11.3449, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
13.28 Ho: The treatment populations are equal
Ha: At least one of the treatment populations yields larger values than at least one other treatment population

c = 3, b = 10, df = c − 1 = 2, χ².05,2 = 5.9915

If the observed value of χr² > 5.9915, then the decision will be to reject the null hypothesis.

Rj (5-day, 4-day, 3.5-day): 29, 18, 13
Rj²: 841, 324, 169
ΣRj² = 1,334

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((10)(3)(4)) · (1,334) − 3(10)(4) = 13.4

Since the observed χr² = 13.4 > χ².05,2 = 5.9915, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population.
13.29 c = 4 treatments, b = 5 blocks

Since the p-value of .564 > α = .10, .05, or .01, the decision is to fail to reject the null hypothesis. There is no significant difference in treatments.
13.31
x Ranked
y Ranked
23
201
41
259
10.5
11
37
234
29
240
-.5
0.25
-2
d2
25
231
-2
17
209
-2
33
229
41
246
10.5
1.5
2.25
40
248
10
-1
28
227
19
200
Σd² = 23.5, n = 11

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(23.5)/(11(120)) = .893
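The Spearman rank correlation used in 13.31 through 13.37 is a single formula in the sum of squared rank differences. An illustrative sketch, not part of the original solution:

```python
def spearman_rs(d_squared_sum, n):
    """Spearman's rank correlation from the sum of squared
    rank differences: rs = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    return 1 - 6 * d_squared_sum / (n * (n * n - 1))

# Problem 13.31: sum of d^2 = 23.5, n = 11
print(round(spearman_rs(23.5, 11), 3))  # 0.893
```

Calling `spearman_rs(164, 8)` reproduces the rs of about −.95 found in problem 13.33.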
13.32
d2
-2
-3
11
10
10
-2
11
-2
4
Σd² = 34, n = 11

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(34)/(11(120)) = .845
13.33
x Ranked
y Ranked
d2
99
108
36
67
139
-1
82
117
46
168
-7
49
80
124
57
162
-4
16
49
145
-4
16
91
102
36
Σd² = 164, n = 8

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(164)/(8(63)) = −.95
13.34
x Ranked
y Ranked
d2
92
9.3
-1
96
9.0
91
8.5
6.5
-.5
.25
89
8.0
91
8.3
6.5
1.5
2.25
88
8.4
-2
84
8.1
-1
81
7.9
-1
83
7.2
1
Σd² = 15.5, n = 9

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(15.5)/(9(80)) = .871
13.35 Bank
Home
Bank
Home
Credit
Equity
Cr. Cd.
Eq. Loan
Card
Loan
Rank
Rank
2.51
2.07
12
2.86
1.95
2.33
1.66
6.5
13
d2
11
121
4.5
20.25
49
2.54
1.77
10
2.54
1.51
10
7.5
2.5
2.18
1.47
14
10
16
3.34
1.75
-1
2.86
1.73
6.5
2.74
1.48
2.54
1.51
10
3.18
1.25
14
-10
100
3.53
1.44
11
-10
100
3.51
1.38
12
-10
100
3.11
1.30
13
-8
64
7.5
1.5
-1
2.5
49
6.25
2.25
1
6.25
Σd² = 636, n = 14

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(636)/(14(14² − 1)) = −.398
13.36
Year
1
Iron
Steel
Rank
Rank
d2
12
12
2
11
10
-2
-5
25
-2
10
11
-1
10
-2
11
16
12
16
Σd² = 80, n = 12

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(80)/(12(144 − 1)) = 0.72
d2
11
10
100
1055
10
64
943
1005
95
8
4
3
16
5
25
on NYSE
on AMEX
1774
1063
1885
2088
2361
Rank NYSE
2570
981
2675
936
2907
896
-3
3047
893
-6
36
3114
3025
2862
862
769 3
765
19
36
-8
64
10
5
d2 = 162
-7
49
-6
11
Σd² = 408, n = 11

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(408)/(11(11² − 1)) = −0.855
13.38 = .05
H 0:
Ha:
n1 = 13, n2 = 21
R = 10
Since this is a two-tailed test, use /2 = .025. The critical value is: z.025
= + 1.96
μR = 2n1n2/(n1 + n2) + 1 = 2(13)(21)/(13 + 21) + 1 = 17.06

σR = √[2n1n2(2n1n2 − n1 − n2) / ((n1 + n2)²(n1 + n2 − 1))] = 2.707

z = (R − μR)/σR = (10 − 17.06)/2.707 = −2.61
Since the observed z = - 2.61 < z.025 = - 1.96, the decision is to reject
the null
hypothesis. The observations in the sample are not randomly
generated.
13.39
Sample 1
Sample 2
573
547
532
566
544
551
565
538
540
557
548
560
536
557
523
547
α = .01. Since n1 = 8 and n2 = 8 are both < 10, use the small-sample Mann-Whitney U test.
Rank
Group
523
532
1
536
1
538
540
544
547
7.5
547
7.5
548
551
10
557
11.5
557
11.5
560
13
565
14
566
15
573
16
W1 = 1 + 2 + 3 + 5 + 6 + 9 + 14 + 16 = 56

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (8)(8) + (8)(9)/2 − 56 = 44

U2 = n1n2 − U1 = 8(8) − 44 = 20
From Table A.13, the p-value (1-tailed) is .1172, for 2-tailed, the p-value
is .2344.
Since the p-value is > = .05, the decision is to fail to reject
the null
hypothesis.
13.40 = .05, n = 9
H 0: Md = 0
Ha: Md 0
Group 1
Group 2
d
-0.8
Rank
5.6
6.4
-8.5
1.3
1.5
-0.2
-4.0
4.7
4.6
0.1
2.0
3.8
4.3
-0.5
-6.5
2.4
2.1
0.3
5.0
5.5
6.0
-0.5
-6.5
5.1
5.2
-0.1
-2.0
4.6
4.5
0.1
2.0
3.7
4.5
-0.8
-8.5
T+ = 2 + 5 + 2 = 9
T = min(T+, T-) = 9
Since the observed value of T = 9 > T.025 = 6, the decision is to fail to
reject the
null hypothesis. There is not enough evidence to declare that there
is a difference between the two groups.
13.41 nj = 7, n = 28, c = 4, df = 3
Group 1
Group 2
Group 3
Group 4
6
11
13
10
13
12
10
10
Group 2
Group 3
By Ranks:
Group 1
9.5
25
17.5
Tj
3.5
Group 4
1
27.5
13.5
3.5
9.5
13.5
23
17.5
9.5
27.5
26
23
20.5
13.5
20.5
17.5
23
17.5
13.5
81.5
63.5
139
122
9.5
ΣTj²/nj = (81.5² + 63.5² + 139² + 122²)/7 = 6,411.36

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((28)(29)) · (6,411.36) − 3(29) = 7.75

Since the observed K = 7.75 < χ².01,3 = 11.3449, the decision is to fail to reject the null hypothesis.
13.42 = .05, b = 7, c = 4, df = 3
2.05,3 = 7.8147
Blocks
Group 1
Group 2
16
14
15
17
19
17
13
24
26
25
21
13
10
11
19
11
18
13
7
14
21
15
Group 3
Group 4
16
By Ranks:
Blocks
Group 1
Group 2
Group 3
Group 4
1
5
Rj: 24, 16, 13, 17
Rj²: 576, 256, 169, 289
ΣRj² = 1,290

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((7)(4)(5)) · (1,290) − 3(7)(5) = 5.57

Since the observed χr² = 5.57 < χ².05,3 = 7.8147, the decision is to fail to reject the null hypothesis.
13.43
Ranks
1
d2
101
87
-6
36
129
89
-6
36
133
84
-3
147
79
-1
156
70
179
64
25
183
67
25
190
71
16
Σd² = 152, n = 8

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(152)/(8(63)) = −.81
13.44 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different

1 Gal.
1.1
5 Gal.
2.9
10 Gal.
3.1
1.4
2.5
2.4
1.7
2.6
3.0
1.3
2.2
2.3
1.9
2.1
2.9
1.4
2.0
1.9
2.1
2.7
By Ranks
1 Gal.
5 Gal.
10 Gal.
17.5
20
3.5
14
13
15
19
11
12
6.5
9.5
3.5
9.5
16
Tj (1 Gal., 5 Gal., 10 Gal.): 31, 91, 88
nj: 7, 7, 6

ΣTj²/nj = (31)²/7 + (91)²/7 + (88)²/6 = 2,610.95, n = 20

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((20)(21)) · (2,610.95) − 3(21) = 11.60

α = .01, df = c − 1 = 3 − 1 = 2, χ².01,2 = 9.2104

Since the observed K = 11.60 > χ².01,2 = 9.2104, the decision is to reject the null hypothesis.
13.45 N = 40 n1 = 24 n2 = 16
= .05
Use the large-sample runs test since n1 and n2 are not both less than 20.

R = 19

μR = 2n1n2/(n1 + n2) + 1 = 2(24)(16)/(24 + 16) + 1 = 20.2

σR = √[2n1n2(2n1n2 − n1 − n2) / ((n1 + n2)²(n1 + n2 − 1))] = √[2(24)(16)(2(24)(16) − 24 − 16) / ((40)²(39))] = 2.993

z = (R − μR)/σR = (19 − 20.2)/2.993 = −0.40

Since z = −0.40 > z.025 = −1.96, the decision is to fail to reject the null hypothesis.
13.46 c = 3 and b = 7
Operator
1
Machine 1
231
Machine 2
229
Machine 3
234
2
233
232
231
229
233
230
232
235
231
235
228
232
234
237
231
236
233
230
By ranks:
Rj (Machine 1, Machine 2, Machine 3): 16, 15, 11
Rj²: 256, 225, 121
ΣRj² = 256 + 225 + 121 = 602

df = c − 1 = 2, χ².05,2 = 5.99147. If the observed χr² > 5.99147, the decision will be to reject the null hypothesis.

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((7)(3)(4)) · (602) − 3(7)(4) = 2

Since the observed χr² = 2 < χ².05,2 = 5.99147, the decision is to fail to reject the null hypothesis.
13.47 Ho: EMS workers are not older
Ha: EMS workers are older
Age
Rank
Group
21
23
24
25
27
27
27
28
28
28
29
11
30
13
30
13
30
13
32
15
33
16.5
33
16.5
36
18.5
36
18.5
37
20
39
21
41
22

n1 = 10, n2 = 12

W1 = 1 + 2 + 3 + 4 + 6 + 9 + 15 + 18.5 + 20 + 22 = 100.5

μ = n1n2/2 = (10)(12)/2 = 60

σ = √[n1n2(n1 + n2 + 1)/12] = √[(10)(12)(23)/12] = 15.17

U = n1n2 + n1(n1 + 1)/2 − W1 = (10)(12) + (10)(11)/2 − 100.5 = 74.5

z = (U − μ)/σ = (74.5 − 60)/15.17 = 0.96

With α = .05, z.05 = 1.645

Since the observed z = 0.96 < z.05 = 1.645, the decision is to fail to reject the null hypothesis.
13.48 Ho: The population differences = 0
Ha: The population differences ≠ 0
With
Without
Rank
1180
1209
-29
-6
874
902
-28
-5
1071
862
209
18
668
503
165
15
889
974
-85
-12.5
724
675
49
880
821
59
10
482
567
-85
796
602
194
16
1207
1097
110
14
968
962
1027
1045
-18
-4
1158
896
262
20
670
708
-38
-8
849
642
207
17
559
327
232
19
449
483
-34
-7
992
978
14
1046
973
73
11
852
841
11
n = 20

T− = 6 + 5 + 12.5 + 12.5 + 4 + 8 + 7 = 55, T = 55

μT = n(n + 1)/4 = (20)(21)/4 = 105

σT = √[n(n + 1)(2n + 1)/24] = √[20(21)(41)/24] = 26.79

z = (T − μT)/σT = (55 − 105)/26.79 = −1.87

α = .01, α/2 = .005, z.005 = −2.575

Since the observed z = −1.87 > z.005 = −2.575, the decision is to fail to reject the null hypothesis.
GMAT
350
Rank
1
Month
J
430
460
470
490
500
510
520
530
9.5
530
9.5
540
11
550
12.5
550
12.5
560
14
570
15.5
570
15.5
590
17
600
18
610
19
630
20
n1 = 10, n2 = 10

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (10)(10) + (10)(11)/2 − 109.5 = 45.5

U2 = n1n2 − U1 = (10)(10) − 45.5 = 54.5

From Table A.13, the p-value for U = 45 is .3980 and for U = 44 is .3697. For a two-tailed test, double the p-value to approximately .74. Using α = .10, the decision is to fail to reject the null hypothesis.
13.50 Use the Friedman test. b = 6, c = 4, df = 3, = .05
Location
A
Brand
176
58
111
4
120
156
62
98
117
203
89
117
105
183
73
118
113
147
46
101
114
190
83
113
115
By ranks:
Rj: 24, 6, 14, 16
Rj²: 576, 36, 196, 256
ΣRj² = 1,064

χr² = 12/(bc(c + 1)) · ΣRj² − 3b(c + 1) = 12/((6)(4)(5)) · (1,064) − 3(6)(5) = 16.4

Since χr² = 16.4 > χ².05,3 = 7.8147, the decision is to reject the null hypothesis. At least one treatment population yields larger values than at least one other treatment population. An examination of the data shows that location one produced the highest sales for all brands and location two produced the lowest sales of gum for all brands.
13.51 Ho: The population differences = 0
Ha: The population differences 0
Box
No Box
185
170
15
11
109
112
-3
-3
92
90
105
87
18
60
51
45
49
-4
-4.5
25
11
14
10
58
40
18
13.5
161
165
-4
-4.5
108
82
26
15.5
89
94
-5
123
139
-16
-12
34
21
13
8.5
68
55
13
8.5
59
60
-1
-1
78
52
26
15.5
n = 16

T− = 3 + 4.5 + 4.5 + 6 + 12 + 1 = 31, T = 31

μT = n(n + 1)/4 = (16)(17)/4 = 68

σT = √[n(n + 1)(2n + 1)/24] = √[16(17)(33)/24] = 19.34

z = (T − μT)/σT = (31 − 68)/19.34 = −1.91

α = .05, α/2 = .025, z.025 = −1.96

Since the observed z = −1.91 > z.025 = −1.96, the decision is to fail to reject the null hypothesis.
13.52
Ranked
Cups
Stress
Ranked
Cups
Stress
25
80
-2
41
85
16
35
45
-4
16
11
30
28
50
34
65
18
40
20
d2
Σd² = 26, n = 9

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(26)/(9(80)) = .783
13.53 α = .05, α/2 = .025
H 0:
Ha:
R = 21
13.54
Before
After
Rank
430
465
-35
-11
485
475
10
5.5
520
535
-15
- 8.5
360
410
-50
440
425
15
500
505
-5
-2
425
450
-25
-10
470
480
-10
515
520
-5
-2
430
430
OMIT
450
460
-10
-5.5
495
500
-5
540
530
10
-12
8.5
-5.5
-2
5.5
n = 12
T = 19.5
From Table A.14, using n = 12, the critical T for = .01, one-tailed, is
10.
Since T = 19.5 is not less than or equal to the critical T = 10, the
decision is to fail
to reject the null hypothesis.
13.55 Ho: With ties have no higher scores
Ha: With ties have higher scores
Rating
Rank
Group
16
17
19
3.5
19
3.5
20
21
6.5
21
6.5
22
22
22
23
11.5
23
11.5
24
13
25
15.5
25
15.5
25
15.5
25
15.5
26
19
26
19
26
19
27
21
28
22
n1 = 11, n2 = 11, W1 = 163.5

μ = n1n2/2 = (11)(11)/2 = 60.5

σ = √[n1n2(n1 + n2 + 1)/12] = √[(11)(11)(23)/12] = 15.23

U = n1n2 + n1(n1 + 1)/2 − W1 = (11)(11) + (11)(12)/2 − 163.5 = 23.5

z = (U − μ)/σ = (23.5 − 60.5)/15.23 = −2.43

For α = .05, z.05 = −1.645

Since the observed z = −2.43 < z.05 = −1.645, the decision is to reject the null hypothesis.
Sales
n1 = 9
Rank
Type of Dispenser
92
105
106
110
114
117
118
7.5
118
7.5
125
126
10
128
11
129
12
137
13
143
14
144
15
152
16
153
17
168
18
n2 = 9

W1 = 4 + 7.5 + 11 + 13 + 14 + 15 + 16 + 17 + 18 = 115.5

U1 = n1n2 + n1(n1 + 1)/2 − W1 = (9)(9) + (9)(10)/2 − 115.5 = 10.5

U2 = n1n2 − U1 = 81 − 10.5 = 70.5

α = .01

From Table A.13, the p-value = .0039. The decision is to reject the null hypothesis since the p-value is less than .01.
13.57 Ho: The 4 populations are identical
Ha: At least one of the 4 populations is different
45
55
70
85
216
228
219
218
215
224
220
216
218
225
221
217
216
222
223
221
219
226
224
218
214
225
217
By Ranks:
45
70
85
23
11.5
18.5
13
20.5
14.5
6.5
16
17
22
18.5
11.5
1
Tj
nj
55
20.5
31.5
14.5
9
6.5
120.5
74.5
49.5
nj: 6, 6, 5, 6; ΣTj²/nj = 4,103.84; n = 23

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((23)(24)) · (4,103.84) − 3(24) = 17.21

α = .01, df = c − 1 = 4 − 1 = 3, χ².01,3 = 11.3449

Since the observed K = 17.21 > χ².01,3 = 11.3449, the decision is to reject the null hypothesis.
13.58
Ranks
Sales
Miles
Ranks
Sales
Miles
d2
150,000
1,500
210,000
2,100
285,000
3,200
-4
16
301,000
2,400
335,000
2,200
390,000
2,500
400,000
3,300
-1
425,000
3,100
440,000
3,600
0
Σd² = 26, n = 9

rs = 1 − 6Σd²/(n(n² − 1)) = 1 − 6(26)/(9(80)) = .783
13.59 Ho: The 3 populations are identical
Ha: At least one of the 3 populations is different
3-day
Quality
Mgmt. Inv.
27
16
11
38
21
17
25
18
10
40
28
22
31
29
15
19
20
35
31
By Ranks:
3-day
Mgmt. Inv.
14
20
11
13
21
15
17.5
16
12
5
19
Tj 34
nj
Quality
113.5
10
17.5
83.5
Tj: 34, 113.5, 83.5; nj = 7, 7, 7

ΣTj²/nj = (34)²/7 + (113.5)²/7 + (83.5)²/7 = 3,001.5, n = 21

K = 12/(n(n + 1)) · Σ(Tj²/nj) − 3(n + 1) = 12/((21)(22)) · (3,001.5) − 3(22) = 11.96

α = .10, df = c − 1 = 3 − 1 = 2, χ².10,2 = 4.6052

Since the observed K = 11.96 > χ².10,2 = 4.6052, the decision is to reject the null hypothesis.
Husbands
Wives
Rank
27
35
-8
-12
22
29
-7
-11
28
30
-2
-6.5
19
20
-1
-2.5
28
27
2.5
29
31
-2
-6.5
18
22
-4
-9.5
21
25
19
29
6.5
-4
-9.5
18
28
-10
-13.5
20
21
-1
-2.5
24
22
6.5
23
33
-10
-13.5
25
38
-13
-16.5
22
34
-12
-15
16
31
-15
-18
23
36
-13
-16.5
30
31
-1
-2.5
n = 18, T = 15.5

μT = n(n + 1)/4 = (18)(19)/4 = 85.5

σT = √[n(n + 1)(2n + 1)/24] = √[18(19)(37)/24] = 22.96

z = (T − μT)/σT = (15.5 − 85.5)/22.96 = −3.05

α = .01, z.01 = −2.33

Since the observed z = −3.05 < z.01 = −2.33, the decision is to reject the null hypothesis.
13.61 This problem uses a randomized block design, which is analyzed by the Friedman nonparametric test. There are 4 treatments and 10 blocks. The value of the observed χr² (shown as S) is 12.16 (adjusted for ties) and has an associated p-value of .007 that is significant at α = .01. At least one treatment population yields larger values than at least one other treatment population. Examining the treatment medians, treatment one has an estimated median of 20.125 and treatment two has a treatment median of 25.875. These two are the farthest apart.
13.62 This is a runs test for randomness. n1 = 21, n2 = 29. Because of the size of the ns, this is a large-sample runs test. There are 28 runs, R = 28.

μR = 25.36, σR = 3.34

z = (R − μR)/σR = (28 − 25.36)/3.34 = 0.79

The p-value for this statistic is .4387 for a two-tailed test. The decision is to fail to reject the null hypothesis at α = .05.
13.63 The null hypothesis is that the two populations are identical; the alternate hypothesis is that they are not identical.

The value of W is 191.5. The p-value for the test is .0066. The test is significant at α = .01. The decision is to reject the null hypothesis. The two populations are not identical. An examination of medians shows that the median for group two (46.5) is larger than the median for group one (37.0).
13.64 A Kruskal-Wallis test has been used to analyze the data. The null
hypothesis is
that the four populations are identical; and the alternate hypothesis is
that at least one of the four populations is different. The H statistic
(same as the K statistic) is 11.28 when adjusted for ties. The p-value
for this H value is .010 which indicates that there is a significant
difference in the four groups at α = .05 and marginally so at α = .01.
An examination of the medians reveals that all group medians are the
same (35) except for group 2 that has a median of 25.50. It is likely
that it is group 2 that differs from the other groups.
Chapter 14
Simple Regression Analysis
LEARNING OBJECTIVES
1.
Compute the equation of a simple regression line from a sample of data
and interpret the slope and intercept of the equation.
2.
3.
4.
5.
Test hypotheses about the slope of the regression model and interpret
the results.
6.
7.
are any better than "seat of the pants" or "crystal ball" estimates remains to be
seen.
It is my view that for many of these students, the most important facet
of this chapter lies in understanding the "buzz" words of regression such as
standard error of the estimate, coefficient of determination, etc. because they
may only interface regression again as some type of computer printout to be
deciphered. The concepts then may be more important than the calculations.
CHAPTER OUTLINE
14.1
14.2
14.3
Residual Analysis
Using Residuals to Test the Assumptions of the Regression Model
Using the Computer for Residual Analysis
14.4
14.5
Coefficient of Determination
Relationship Between r and r2
14.6
Hypothesis Tests for the Slope of the Regression Model and Testing the
Overall
Model
Testing the Slope
Testing the Overall Model
14.7
Estimation
Confidence Intervals to Estimate the Conditional Mean of y: μy|x
Prediction Intervals to Estimate a Single Value of y
14.8
KEY TERMS
Coefficient of Determination (r2)
Prediction Interval
Confidence Interval
Probabilistic Model
Dependent Variable
Regression Analysis
Deterministic Model
Heteroscedasticity
Homoscedasticity
Residual
Residual Plot
Scatter Plot
Independent Variable
Simple Regression
Outliers
Standard Error of the Estimate (se)
SOLUTIONS TO CHAPTER 14
14.1
 x   y
12  17
21  15
28  22
 8  19
20  24

Σx = 89, Σx² = 1,833, Σy = 97, Σy² = 1,935, Σxy = 1,767, n = 5

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [1,767 − (89)(97)/5] / [1,833 − (89)²/5] = 40.4/248.8 = 0.162

b0 = ȳ − b1x̄ = 97/5 − (0.162)(89/5) = 16.51

ŷ = 16.51 + 0.162x
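The shortcut-sum arithmetic of 14.1 can be verified with a short least-squares sketch. This is illustrative only and is not part of the original solution; it uses the 14.1 data as reconstructed above.

```python
def least_squares(x, y):
    """Slope and intercept from the shortcut sums:
    b1 = SSxy / SSxx, b0 = ybar - b1 * xbar."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(a * a for a in x) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Problem 14.1 data
x = [12, 21, 28, 8, 20]
y = [17, 15, 22, 19, 24]
b0, b1 = least_squares(x, y)
print(round(b0, 2), round(b1, 3))  # 16.51 0.162
```

The same helper reproduces the slope and intercept of any of the regression problems in this chapter, given the raw data.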
14.2
x_
_y_
140
25
119
29
103
46
91
70
65
88
29
112
24
128
= 571
x2
= 58,293
= 498
y2 = 45,154
xy = 30,099
n=7
b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [30,099 − (571)(498)/7] / [58,293 − (571)²/7] = −0.898

b0 = ȳ − b1x̄ = 144.414

ŷ = 144.414 − 0.898x
14.3
(Advertising) x
(Sales) y
12.5
148
3.7
55
21.6
338
60.0
994
37.6
541
6.1
89
16.8
126
41.2
379
x = 199.5
x2 = 7,667.15
Σxy = 107,610.4, Σy = 2,670, Σy² = 1,587,328, n = 8

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [107,610.4 − (199.5)(2,670)/8] / [7,667.15 − (199.5)²/8] = 15.240

b0 = ȳ − b1x̄ = 2,670/8 − (15.240)(199.5/8) = −46.292

ŷ = −46.292 + 15.240x
14.4
(Prime) x   (Bond) y
16           5
 6          12
 8           9
 4          15
 7           7

Σx = 41, Σy = 48, Σx² = 421, Σy² = 524, Σxy = 333, n = 5

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [333 − (41)(48)/5] / [421 − (41)²/5] = −0.715

b0 = ȳ − b1x̄ = 48/5 − (−0.715)(41/5) = 15.460

ŷ = 15.460 − 0.715x
14.5
Bankruptcies(y)
34.3
58.1
Firm Births(x)
35.0
55.4
38.5
57.0
40.1
58.5
35.5
57.4
37.9
58.0
Σx = 344.4, Σy = 221.3, Σx² = 19,774.78, Σy² = 8,188.41, Σxy = 12,708.08, n = 6

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [12,708.08 − (344.4)(221.3)/6] / [19,774.78 − (344.4)²/6] = 0.878

b0 = ȳ − b1x̄ = 221.3/6 − (0.878)(344.4/6) = −13.503

ŷ = −13.503 + 0.878x
14.6
5.65
213
4.65
258
3.96
297
3.36
340
2.95
374
2.52
420
2.44
426
2.29
441
2.15
460
2.07
469
2.17
434
2.10
444
Σx = 36.31, Σy = 4,576, Σx² = 124.7931, Σy² = 1,825,028, Σxy = 12,766.71, n = 12

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [12,766.71 − (36.31)(4,576)/12] / [124.7931 − (36.31)²/12] = −72.328

b0 = ȳ − b1x̄ = 4,576/12 − (−72.328)(36.31/12) = 600.186

ŷ = 600.186 − 72.328x
14.7
Steel
99.9
2.74
97.9
2.87
98.9
2.93
87.9
2.87
92.9
2.98
97.9
3.09
100.6
3.36
104.9
3.61
105.3
3.75
108.6
3.95
Σx = 994.8, Σy = 32.15, Σx² = 99,293.28, Σy² = 104.9815, Σxy = 3,216.652, n = 10

b1 = SSxy/SSxx = [Σxy − (ΣxΣy)/n] / [Σx² − (Σx)²/n] = [3,216.652 − (994.8)(32.15)/10] / [99,293.28 − (994.8)²/10] = 0.05557

b0 = ȳ − b1x̄ = 32.15/10 − (0.05557)(994.8/10) = −2.31307

ŷ = −2.31307 + 0.05557x
14.8
 x   y
15  47
 8  36
19  56
12  44
 5  21

ŷ = 13.625 + 2.303x

Residuals:
 x   y   Predicted (ŷ)   Residual (y − ŷ)
15  47   48.1694          −1.1694
 8  36   32.0489           3.9511
19  56   57.3811          −1.3811
12  44   41.2606           2.7394
 5  21   25.1401          −4.1401
y
14.9
 x   y   Predicted (ŷ)   Residual (y − ŷ)
12  17   18.4582          −1.4582
21  15   19.9196          −4.9196
28  22   21.0563           0.9437
 8  19   17.8087           1.1913
20  24   19.7572           4.2428

ŷ = 16.51 + 0.162x
14.10
  x    y   Predicted (ŷ)   Residual (y − ŷ)
140   25   18.6597           6.3403
119   29   37.5229          −8.5229
103   46   51.8948          −5.8948
 91   70   62.6737           7.3263
 65   88   86.0281           1.9720
 29  112  118.3648          −6.3648
 24  128  122.8561           5.1439

ŷ = 144.414 − 0.898x
14.11
ŷ = −46.292 + 15.240x

   x     y    Predicted (ŷ)   Residual (y − ŷ)
12.5   148     144.2053           3.7947
 3.7    55      10.0954          44.9047
21.6   338     282.8873          55.1127
60.0   994     868.0945         125.9055
37.6   541     526.7236          14.2764
 6.1    89      46.6708          42.3292
16.8   126     209.7364         −83.7364
41.2   379     581.5868        −202.5868
14.12
ŷ = 15.460 − 0.715x

 x    y    Predicted (ŷ)   Residual (y − ŷ)
16    5       4.0259           0.9741
 6   12      11.1722           0.8278
 8    9       9.7429          −0.7429
 4   15      12.6014           2.3986
 7    7      10.4576          −3.4575
14.13
 _x_    _y_    Predicted (ŷ)   Residual (y − ŷ)
58.1   34.3      37.4978          −3.1978
55.4   35.0      35.1277          −0.1277
57.0   38.5      36.5322           1.9678
58.5   40.1      37.8489           2.2511
57.4   35.5      36.8833          −1.3833
58.0   37.9      37.4100           0.4900

The residual for x = 58.1 is relatively large, but the residual for x = 55.4 is quite small.
14.14
ŷ = 50.506 − 1.646x

_x_   _y_    Predicted (ŷ)   Residual (y − ŷ)
  5    47      42.2756           4.7244
  7    38      38.9836          −0.9836
 11    32      32.3997          −0.3996
 12    24      30.7537          −6.7537
 19    22      19.2317           2.7683
 25    10       9.3558           0.6442
14.15
Miles (x)   Cost (y)   Predicted (ŷ)   Residual (y − ŷ)
  1,245       2.64        2.5376            .1024
    425       2.31        2.3322           −.0222
  1,346       2.45        2.5629           −.1128
    973       2.52        2.4694            .0506
    255       2.19        2.2896           −.0996
    865       2.55        2.4424            .1076
  1,080       2.40        2.4962           −.0962
    296       2.37        2.2998            .0702

ŷ = 2.2257 + 0.00025x
14.16

14.17
14.19 SSE = Σy² − b0·Σy − b1·Σxy = 1,935 − (16.51)(97) − (0.1624)(1,767) = 46.5692

se = √(SSE/(n − 2)) = √(46.5692/3) = 3.94
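The SSE shortcut and the standard error of the estimate follow directly from the sums. A sketch using this problem's numbers:

```python
import math

# SSE = Σy² − b0·Σy − b1·Σxy (computational form used in this chapter)
sum_y2, b0, sum_y, b1, sum_xy, n = 1935, 16.51, 97, 0.1624, 1767, 5

sse = sum_y2 - b0 * sum_y - b1 * sum_xy
se = math.sqrt(sse / (n - 2))   # standard error of the estimate
```

The same two lines reproduce se for 14.20 through 14.26 once that problem's sums and coefficients are substituted.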
14.20 SSE = 272.0

se = √(SSE/(n − 2)) = √(272.0/5) = 7.376
14.21 SSE = Σy² − b0·Σy − b1·Σxy = 1,587,328 − (−46.29)(2,670) − (15.24)(107,610.4) = 70,940

se = √(SSE/(n − 2)) = √(70,940/6) = 108.7

Six out of eight (75%) of the sales estimates are within $108.7 million.
14.22 se = √(SSE/(n − 2)) = √(19.8885/3) = 2.575

Four out of five (80%) of the estimates are within 2.575 of the actual rate for bonds. This amount of error is probably not acceptable to financial analysts.
14.23
 _x_    _y_    Predicted (ŷ)   Residual (y − ŷ)   (y − ŷ)²
58.1   34.3      37.4978          −3.1978         10.2259
55.4   35.0      35.1277          −0.1277          0.0163
57.0   38.5      36.5322           1.9678          3.8722
58.5   40.1      37.8489           2.2511          5.0675
57.4   35.5      36.8833          −1.3833          1.9135
58.0   37.9      37.4100           0.4900          0.2401

SSE = Σ(y − ŷ)² = 21.3355

se = √(SSE/(n − 2)) = √(21.3355/4) = 2.3095
14.24
Residual (y − ŷ)   (y − ŷ)²
     4.7244        22.3200
    −0.9836          .9675
    −0.3996          .1597
    −6.7537        45.6125
     2.7683         7.6635
     0.6442          .4150

SSE = Σ(y − ŷ)² = 77.1382

se = √(SSE/(n − 2)) = √(77.1382/4) = 4.391
14.25
Residual (y − ŷ)   (y − ŷ)²
      .1024          .0105
     −.0222          .0005
     −.1129          .0127
      .0506          .0026
     −.0996          .0099
      .1076          .0116
     −.0962          .0093
      .0702          .0049

SSE = Σ(y − ŷ)² = .0620

se = √(SSE/(n − 2)) = √(.0620/6) = .1017
14.26
Volume (x)   Sales (y)
   728.6        10.5
   497.9        48.1
   439.1        64.8
   377.9        20.1
   375.5        11.4
   363.8       123.8
   276.3        89.0

Σx = 3,059.1   Σy = 367.7   Σx² = 1,464,071.97   Σy² = 30,404.31   Σxy = 141,558.6   n = 7

b1 = −.1504   b0 = 118.257

ŷ = 118.257 − .1504x

se = √(SSE/(n − 2)) = √(8,211.6245/5) = 40.526

This is a relatively large standard error of the estimate given the sales values (ranging from 10.5 to 123.8).
14.27 r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 46.6399/(1,935 − (97)²/5) = .123
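The coefficient of determination used in 14.27 through 14.31 is one line of arithmetic; a sketch with this problem's numbers:

```python
# r² = 1 − SSE / (Σy² − (Σy)²/n): proportion of variation in y explained by x
sse, sum_y2, sum_y, n = 46.6399, 1935, 97, 5

ss_yy = sum_y2 - sum_y ** 2 / n   # total variation SSyy
r_sq = 1 - sse / ss_yy
```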
14.28 r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 272.121/(45,154 − (498)²/7) = .972
14.29 r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 70,940/(1,587,328 − (2,670)²/8) = .898
14.30 r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 19.8885/(524 − (48)²/5) = .685

This value of r² is a modest value. 68.5% of the variation of y is accounted for by x, but 31.5% is unaccounted for.
14.31 r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 21.33547/(8,188.41 − (221.3)²/6) = .183
14.32
CCI (y)   Median Income (x)
 116.8        37.415
  91.5        36.770
  68.5        35.501
  61.6        35.047
  65.9        34.700
  90.6        34.942
 100.0        35.887
 104.6        36.306
 125.4        37.005

Σx = 323.573   Σy = 824.9   Σx² = 11,640.93413   Σy² = 79,718.79   Σxy = 29,804.4505   n = 9

b1 = (29,804.4505 − (323.573)(824.9)/9)/(11,640.93413 − (323.573)²/9) = 19.2204

b0 = ȳ − b1·x̄ = 824.9/9 − (19.2204)(323.573/9) = −599.3674

ŷ = −599.3674 + 19.2204x

SSE = Σy² − b0·Σy − b1·Σxy = 79,718.79 − (−599.3674)(824.9) − (19.2204)(29,804.4505) = 1,283.13435

se = √(SSE/(n − 2)) = √(1,283.13435/7) = 13.539

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 1,283.13435/(79,718.79 − (824.9)²/9) = .688
14.33 sb = se/√(Σx² − (Σx)²/n) = 3.94/√(1,833 − (89)²/5) = .2498

b1 = 0.162

H0: β1 = 0    Ha: β1 ≠ 0    α = .05    df = n − 2 = 5 − 2 = 3    t.025,3 = 3.182

t = (b1 − 0)/sb = (0.162 − 0)/.2498 = 0.65

Since the observed t = 0.65 < t.025,3 = 3.182, the decision is to fail to reject the null hypothesis.
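The slope test above can be sketched in a few lines; the critical value t.025,3 = 3.182 is taken from a t table rather than computed.

```python
import math

# t test of H0: β1 = 0 using this problem's sums; t.025,3 = 3.182 from a table
se, sum_x2, sum_x, n, b1 = 3.94, 1833, 89, 5, 0.162

sb = se / math.sqrt(sum_x2 - sum_x ** 2 / n)   # standard error of the slope
t = (b1 - 0) / sb
reject = abs(t) > 3.182                        # two-tailed test at α = .05
```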
14.34 sb = se/√(Σx² − (Σx)²/n) = 7.376/√(58,293 − (571)²/7) = .068145

b1 = −0.898

H0: β1 = 0    Ha: β1 ≠ 0    α = .01    t.005,5 = 4.032

t = (b1 − 0)/sb = (−0.898 − 0)/.068145 = −13.18

Since the observed t = −13.18 < −t.005,5 = −4.032, the decision is to reject the null hypothesis.
14.35 sb = se/√(Σx² − (Σx)²/n) = 108.7/√(7,667.15 − (199.5)²/8) = 2.095

b1 = 15.240

H0: β1 = 0    Ha: β1 ≠ 0    α = .10    df = n − 2 = 8 − 2 = 6    t.05,6 = 1.943

t = (b1 − 0)/sb = (15.240 − 0)/2.095 = 7.27

Since the observed t = 7.27 > t.05,6 = 1.943, the decision is to reject the null hypothesis.
14.36 sb = se/√(Σx² − (Σx)²/n) = 2.575/√(421 − (41)²/5) = .27963

b1 = −0.715

H0: β1 = 0    Ha: β1 ≠ 0    α = .05    df = n − 2 = 5 − 2 = 3    t.025,3 = 3.182

t = (b1 − 0)/sb = (−0.715 − 0)/.27963 = −2.56

Since the observed t = −2.56 > −t.025,3 = −3.182, the decision is to fail to reject the null hypothesis.
P a g e | 438
se
( x )
2.3095
(344.4) 2
19,774.78
6
14.37 sb =
= 0.926025
b1 = 0.878
Ho: = 0
= .05
Ha: 0
df = n - 2 = 6 - 2 = 4
t.025,4 = 2.776
b1 1 0.878 0
sb
.926025
t =
= 0.948
Since the observed t = 0.948 < t.025,4 = 2.776, the decision is to fail to
reject the
null hypothesis.
14.38 The model is significant at α = .05 but not at α = .01. For simple regression, t = √F = 2.874. Since t.05,8 = 1.86 but t.01,8 = 2.896, the slope is significant at α = .05 but not at α = .01.
14.39 x0 = 25   95% confidence   α/2 = .025   df = n − 2 = 5 − 2 = 3   t.025,3 = 3.182

Σx = 89   Σx² = 1,833   se = 3.94   x̄ = 89/5 = 17.8

ŷ = 16.5 + 0.162(25) = 20.55

ŷ ± t(α/2),n−2 · se · √(1/n + (x0 − x̄)²/(Σx² − (Σx)²/n))

20.55 ± 3.182(3.94)√(1/5 + (25 − 17.8)²/(1,833 − (89)²/5))

20.55 ± 3.182(3.94)(.63903) = 20.55 ± 8.01

12.54 < E(y25) < 28.56
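The interval for E(y|x0) above can be sketched directly from the sums; again the table value t.025,3 = 3.182 is supplied rather than computed.

```python
import math

# 95% CI for E(y | x0 = 25) using this problem's sums; t.025,3 = 3.182 from a table
t_crit, se, n = 3.182, 3.94, 5
sum_x, sum_x2, x0 = 89, 1833, 25

x_bar = sum_x / n
ss_xx = sum_x2 - sum_x ** 2 / n
y_hat = 16.5 + 0.162 * x0
half_width = t_crit * se * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
lo, hi = y_hat - half_width, y_hat + half_width
```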
14.40 x0 = 100   df = n − 2 = 7 − 2 = 5   t.05,5 = 2.015

Σx = 571   Σx² = 58,293   se = 7.377   x̄ = 571/7 = 81.57143

ŷ = 144.414 − 0.898(100) = 54.614

ŷ ± t(α/2),n−2 · se · √(1 + 1/n + (x0 − x̄)²/(Σx² − (Σx)²/n))

54.614 ± 2.015(7.377)√(1 + 1/7 + (100 − 81.57143)²/(58,293 − (571)²/7))

For x0 = 130:

ŷ = 144.414 − 0.898(130) = 27.674

27.674 ± 2.015(7.377)√(1 + 1/7 + (130 − 81.57143)²/(58,293 − (571)²/7))

27.674 ± 2.015(7.377)(1.1589) = 27.674 ± 17.227

The interval of y for x0 = 130 is wider than the interval of y for x0 = 100 because x0 = 100 is nearer to the value of x̄ = 81.57 than is x0 = 130.
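The widening of the interval away from x̄ can be seen by wrapping the half-width in a small function; note the extra "1 +" under the radical for a single value of y.

```python
import math

# Prediction-interval half-width for a single y at x0, using this problem's
# sums; t.05,5 = 2.015 is taken from a t table
t_crit, se, n = 2.015, 7.377, 7
sum_x, sum_x2 = 571, 58293

def half_width(x0):
    x_bar = sum_x / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    return t_crit * se * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
```

half_width(130) reproduces the ± 17.227 figure; half_width(100) is smaller because 100 is nearer x̄ = 81.57.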
14.41 x0 = 20   df = n − 2 = 8 − 2 = 6   α/2 = .01   t.01,6 = 3.143

Σx = 199.5   Σx² = 7,667.15   se = 108.8   x̄ = 199.5/8 = 24.9375

ŷ = −46.29 + 15.24(20) = 258.51

For the average value of y:

ŷ ± t(α/2),n−2 · se · √(1/n + (x0 − x̄)²/(Σx² − (Σx)²/n))

258.51 ± (3.143)(108.8)√(1/8 + (20 − 24.9375)²/(7,667.15 − (199.5)²/8))

For a single value of y:

ŷ ± t(α/2),n−2 · se · √(1 + 1/n + (x0 − x̄)²/(Σx² − (Σx)²/n))

258.51 ± (3.143)(108.8)√(1 + 1/8 + (20 − 24.9375)²/(7,667.15 − (199.5)²/8))

The confidence interval for the single value of y is wider than the confidence interval for the average value of y because the average is more towards the middle and individual values of y can vary more than values of the average.
14.42 x0 = 10   df = n − 2 = 5 − 2 = 3   α/2 = .005   t.005,3 = 5.841

Σx = 41   Σx² = 421   se = 2.575   x̄ = 41/5 = 8.20

ŷ = 15.46 − 0.715(10) = 8.31

8.31 ± 5.841(2.575)√(1/5 + (10 − 8.2)²/(421 − (41)²/5))

If the prime interest rate is 10%, we are 99% confident that the average bond rate is between 0.97% and 15.65%.
14.43
Year (x)   Fertilizer (y)
  2001         11.9
  2002         17.9
  2003         22.0
  2004         21.8
  2005         26.0

Σx = 10,015   Σy = 99.6   Σx² = 20,060,055   Σy² = 2,097.26   Σxy = 199,530.9   n = 5

b1 = (199,530.9 − (10,015)(99.6)/5)/(20,060,055 − (10,015)²/5) = 3.21

b0 = ȳ − b1·x̄ = 99.6/5 − (3.21)(10,015/5) = −6,409.71

ŷ = −6,409.71 + 3.21x

ŷ(2008) = −6,409.71 + 3.21(2008) = 35.97
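Trend-line forecasts like this one fit the year as x and evaluate the line at a future year. A sketch with this problem's data; the function name is illustrative.

```python
# Fit a least-squares trend line and evaluate it at a new x (here, a year)
def fit_and_forecast(xs, ys, x_new):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b1 = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
    b0 = sy / n - b1 * sx / n
    return b0 + b1 * x_new

years = [2001, 2002, 2003, 2004, 2005]
fert = [11.9, 17.9, 22.0, 21.8, 26.0]
forecast_2008 = fit_and_forecast(years, fert, 2008)
```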
14.44
Year (x)   Fertilizer (y)
  1998        5,860
  1999        6,632
  2000        7,125
  2001        6,000
  2002        4,380
  2003        3,326
  2004        2,642

Σx = 14,007   Σy = 35,965   Σx² = 28,028,035   Σy² = 202,315,489   Σxy = 71,946,954   n = 7

b1 = (71,946,954 − (14,007)(35,965)/7)/(28,028,035 − (14,007)²/7) = −678.9643

b0 = ȳ − b1·x̄ = 35,965/7 − (−678.9643)(14,007/7) = 1,363,745.39

ŷ = 1,363,745.39 − 678.9643x

ŷ(2007) = 1,363,745.39 − 678.9643(2007) = 1,064.04
14.45
Cum. Quarter (x)   Sales (y)
        1            11.93
        2            12.46
        3            13.28
        4            15.08
        5            16.08
        6            16.82
        7            17.60
        8            18.66
        9            19.73
       10            21.11
       11            22.21
       12            22.94

(x = cumulative quarter, 2003 through 2005)

Σx = 78   Σy = 207.9   Σx² = 650   Σy² = 3,755.2084   Σxy = 1,499.07   n = 12

b1 = (1,499.07 − (78)(207.9)/12)/(650 − (78)²/12) = 1.033

b0 = ȳ − b1·x̄ = 207.9/12 − (1.033)(78/12) = 10.6105

ŷ = 10.6105 + 1.033x

ŷ(19) = 10.6105 + 1.033(19) = 30.2375
14.46
 x    y
 5    8
 7    9
 3   11
16   27
12   15
 9   13

Σx = 52   Σx² = 564   Σy = 83   Σy² = 1,389   Σxy = 865   n = 6
b1 = 1.2853   b0 = 2.6941

a) ŷ = 2.6941 + 1.2853x

b)
 x    y    Predicted (ŷ)   Residual (y − ŷ)
 5    8       9.1206          −1.1206
 7    9      11.6912          −2.6912
 3   11       6.5500           4.4500
16   27      23.2588           3.7412
12   15      18.1177          −3.1176
 9   13      14.2618          −1.2618

c) (y − ŷ)²: 1.2557, 7.2426, 19.8025, 13.9966, 9.7194, 1.5921

SSE = 53.6089

se = √(SSE/(n − 2)) = √(53.6089/4) = 3.661

d) r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 53.6089/(1,389 − (83)²/6) = .777

e) H0: β1 = 0    Ha: β1 ≠ 0    α = .01    df = n − 2 = 6 − 2 = 4    t.005,4 = 4.604

sb = se/√(Σx² − (Σx)²/n) = 3.661/√(564 − (52)²/6) = .34389

t = (b1 − 0)/sb = (1.2853 − 0)/.34389 = 3.74

Since the observed t = 3.74 < t.005,4 = 4.604, the decision is to fail to reject the null hypothesis.

f) The r² = 77.7% is modest. There appears to be some prediction with this model. The slope of the regression line is not significantly different from zero using α = .01. However, for α = .05, the null hypothesis of a zero slope is rejected. The standard error of the estimate, se = 3.661, is not particularly small given the range of values for y (27 − 8 = 19).
14.47
 x    y
53    5
47    5
41    7
50    4
58   10
62   12
45    3
60   11

Σx = 416   Σx² = 22,032   Σy = 57   Σy² = 489   Σxy = 3,106   n = 8
b1 = 0.355   b0 = −11.335

a) ŷ = −11.335 + 0.355x

b)
Predicted (ŷ)   Residual (y − ŷ)
   7.480           −2.480
   5.350           −0.350
   3.220            3.780
   6.415           −2.415
   9.255            0.745
  10.675            1.325
   4.640           −1.640
   9.965            1.035

c) (y − ŷ)²: 6.1504, 0.1225, 14.2884, 5.8322, 0.5550, 1.7556, 2.6896, 1.0712

SSE = 32.4649

d) se = √(SSE/(n − 2)) = √(32.4649/6) = 2.3261

e) r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 32.4649/(489 − (57)²/8) = .608

f) H0: β1 = 0    Ha: β1 ≠ 0    α = .05    df = n − 2 = 8 − 2 = 6    t.025,6 = 2.447

sb = se/√(Σx² − (Σx)²/n) = 2.3261/√(22,032 − (416)²/8) = 0.116305

t = (b1 − 0)/sb = (0.355 − 0)/.116305 = 3.05

Since the observed t = 3.05 > t.025,6 = 2.447, the decision is to reject the null hypothesis.

g) With r² = .608, 39.2% of the variance of y is unaccounted for by x. The range of y values is 12 − 3 = 9, and the standard error of the estimate is 2.3261.
14.48 Σx = 1,263   Σy = 417   Σx² = 268,295   Σy² = 29,135   Σxy = 88,288   n = 6

b0 = 25.42778   b1 = 0.209369

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 46.845468/153.5 = .695

Coefficient of determination = r² = .695
14.49 a) x0 = 60

Σx = 524   Σx² = 36,224   Σy = 215   Σy² = 6,411   Σxy = 15,125   n = 8
b1 = .5481   b0 = −9.026   se = 3.201
df = n − 2 = 8 − 2 = 6   α/2 = .025   t.025,6 = 2.447

ŷ = −9.026 + 0.5481(60) = 23.86

x̄ = 524/8 = 65.5

23.86 ± 2.447(3.201)√(1/8 + (60 − 65.5)²/(36,224 − (524)²/8))

b) x0 = 70

ŷ70 = −9.026 + 0.5481(70) = 29.341

29.341 ± 2.447(3.201)√(1 + 1/8 + (70 − 65.5)²/(36,224 − (524)²/8))

c) The confidence interval for (b) is much wider because part (b) is for a single value of y, which produces a much greater possible variation. In actuality, x0 = 70 in part (b) is slightly closer to the mean (x̄ = 65.5) than x0 = 60. However, the width of the single-value interval is much greater than that of the average or expected y value in part (a).
14.50
Year (x)   Cost (y)
    1         56
    2         54
    3         49
    4         46
    5         45

Σx = 15   Σy = 250   Σx² = 55   Σy² = 12,594   Σxy = 720   n = 5

b1 = (720 − (15)(250)/5)/(55 − (15)²/5) = −3

b0 = ȳ − b1·x̄ = 250/5 − (−3)(15/5) = 59

ŷ = 59 − 3x

ŷ(7) = 59 − 3(7) = 38
14.51 Σy = 267   Σy² = 15,971   Σx = 21   Σx² = 101   Σxy = 1,256   n = 5

b0 = 9.234375   b1 = 10.515625

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 297.7969/(15,971 − (267)²/5) = 1 − 297.7969/1,713.2 = .826
14.52 n = 12   Σx = 548   Σy = 5,940   Σx² = 26,592   Σy² = 3,211,546   Σxy = 287,908

b1 = 10.626383   b0 = 9.728511

ŷ = 9.728511 + 10.626383x

se = √(SSE/(n − 2)) = √(94,337.9762/10) = 97.1277

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 94,337.9762/(3,211,546 − (5,940)²/12) = 1 − 94,337.9762/271,246 = .652

t = (10.626383 − 0)/(97.1277/√(26,592 − (548)²/12)) = 4.33

If α = .01, then t.005,10 = 3.169. Since the observed t = 4.33 > t.005,10 = 3.169, the decision is to reject the null hypothesis.
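The whole chain above (slope, SSE, se, r², and the slope t statistic) can be computed from the six sums without intermediate rounding; a sketch with this problem's numbers:

```python
import math

# Full simple-regression summary from sums (numbers from this problem);
# avoiding intermediate rounding matches the hand work to ~two decimals
n, sx, sy = 12, 548, 5940
sxx, syy, sxy = 26592, 3211546, 287908

ss_xx = sxx - sx ** 2 / n
ss_yy = syy - sy ** 2 / n
ss_xy = sxy - sx * sy / n

b1 = ss_xy / ss_xx
sse = ss_yy - b1 * ss_xy          # algebraic shortcut for SSE
se = math.sqrt(sse / (n - 2))
r_sq = 1 - sse / ss_yy
t = b1 / (se / math.sqrt(ss_xx))  # test statistic for H0: β1 = 0
```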
14.53
Sales (y)   Number of Units (x)
  17.1            12.4
   7.9             7.5
   4.8             6.8
   4.7             8.7
   4.6             4.6
   4.0             5.1
   2.9            11.2
   2.7             5.1
   2.7             2.9

Σx = 64.3   Σy = 51.4   Σx² = 538.97   Σy² = 460.1   Σxy = 440.46   n = 9

b1 = 0.92025   b0 = −0.863565

ŷ = −0.863565 + 0.92025x

SSE = Σy² − b0·Σy − b1·Σxy = 460.1 − (−0.863565)(51.4) − (0.92025)(440.46) = 99.153926

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 99.153926/(460.1 − (51.4)²/9) = 1 − 99.153926/166.55 = .405
14.54
Year (x)   Total Employment (y)
  1995          11,152
  1996          10,935
  1997          11,050
  1998          10,845
  1999          10,776
  2000          10,764
  2001          10,697
  2002           9,234
  2003           9,223
  2004           9,158

Σx = 19,995   Σy = 103,834   Σx² = 39,980,085   Σy² = 1,084,268,984   Σxy = 207,596,350   n = 10

b1 = (207,596,350 − (19,995)(103,834)/10)/(39,980,085 − (19,995)²/10) = −239.188

b0 = ȳ − b1·x̄ = 103,834/10 − (−239.188)(19,995/10) = 488,639.564

ŷ = 488,639.564 − 239.188x

ŷ(2008) = 488,639.564 − 239.188(2008) = 8,350.30
14.55
1977 (x)   2003 (y)
    581        666
    213        214
    668        496
    345        204
  1,476      1,600
  1,776      6,278

Σx = 5,059   Σy = 9,458   Σx² = 6,280,931   Σy² = 42,750,268   Σxy = 14,345,564   n = 6

b1 = (14,345,564 − (5,059)(9,458)/6)/(6,280,931 − (5,059)²/6) = 3.1612

b0 = ȳ − b1·x̄ = −1,089.0712

ŷ = −1,089.0712 + 3.1612x

For x0 = 700: ŷ = −1,089.0712 + 3.1612(700) = 1,123.757

SSE = Σy² − b0·Σy − b1·Σxy = 7,701,506.49

se = √(SSE/(n − 2)) = √(7,701,506.49/4) = 1,387.58

α = .05   t.025,4 = 2.776   x̄ = 5,059/6 = 843.167   SSxx = Σx² − (Σx)²/n = 2,015,350.833

Confidence interval:
ŷ ± t(α/2),n−2 · se · √(1/n + (x0 − x̄)²/(Σx² − (Σx)²/n))

1,123.757 ± (2.776)(1,387.58)√(1/6 + (700 − 843.167)²/2,015,350.833)

1,123.757 ± 1,619.81

−496.05 to 2,743.57

H0: β1 = 0    Ha: β1 ≠ 0    α = .05    df = 4

t = (b1 − 0)/sb = (3.1612 − 0)/(1,387.58/√2,015,350.833) = 3.234

Since the observed t = 3.234 > t.025,4 = 2.776, the decision is to reject the null hypothesis.
14.56 Σx = 11.902   Σy = 516.8   Σx² = 25.1215   Σy² = 61,899.06   Σxy = 1,202.867   n = 7

b1 = 66.36277   b0 = −39.0071

ŷ = −39.0071 + 66.36277x

SSE = Σy² − b0·Σy − b1·Σxy = 2,232.343

se = √(SSE/(n − 2)) = √(2,232.343/5) = 21.13

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 2,232.343/(61,899.06 − (516.8)²/7) = 1 − .094 = .906

14.57 Σx = 44,754   Σy = 17,314   Σx² = 167,540,610   Σy² = 24,646,062   Σxy = 59,852,571   n = 13

b1 = (59,852,571 − (44,754)(17,314)/13)/(167,540,610 − (44,754)²/13) = .01835

b0 = ȳ − b1·x̄ = 17,314/13 − (.01835)(44,754/13) = 1,268.685

ŷ = 1,268.685 + .01835x

Trend line (x = time period 1 through 13, so that 2007 corresponds to x = 15):

Σx = 91   Σy = 44,754   Σx² = 819   Σy² = 167,540,610   Σxy = 304,797   n = 13

b1 = (304,797 − (91)(44,754)/13)/(819 − (91)²/13) = −46.5989

b0 = ȳ − b1·x̄ = 44,754/13 − (−46.5989)(91/13) = 3,768.81

ŷ = 3,768.81 − 46.5989x

ŷ(2007) = 3,768.81 − 46.5989(15) = 3,069.83
14.58 Σx = 323.3   Σy = 6,765.8   Σx² = 29,629.13   Σy² = 7,583,144.64   Σxy = 339,342.76   n = 7

b1 = (339,342.76 − (323.3)(6,765.8)/7)/(29,629.13 − (323.3)²/7) = 1.82751

b0 = ȳ − b1·x̄ = 882.138

ŷ = 882.138 + 1.82751x

se = √(SSE/(n − 2)) = √(994,623.07/5) = 446.01

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 994,623.07/(7,583,144.64 − (6,765.8)²/7) = 1 − .953 = .047

H0: β1 = 0    Ha: β1 ≠ 0    α = .05    t.025,5 = 2.571

SSxx = Σx² − (Σx)²/n = 29,629.13 − (323.3)²/7 = 14,697.29

t = (b1 − 0)/(se/√SSxx) = (1.82751 − 0)/(446.01/√14,697.29) = 0.50

Since the observed t = 0.50 < t.025,5 = 2.571, the decision is to fail to reject the null hypothesis.
14.59 Let Water Use = y and Temperature = x

Σx = 608   Σx² = 49,584   Σy = 1,025   Σy² = 152,711   Σxy = 86,006   n = 8

b1 = 2.40107   b0 = −54.35604

ŷ = −54.35604 + 2.40107x

ŷ(100) = −54.35604 + 2.40107(100) = 185.751

SSE = Σy² − b0·Σy − b1·Σxy = 1,919.5146

se = √(SSE/(n − 2)) = √(1,919.5146/6) = 17.886

r² = 1 − SSE/(Σy² − (Σy)²/n) = 1 − 1,919.5145/(152,711 − (1,025)²/8) = 1 − .09 = .91

H0: β1 = 0    Ha: β1 ≠ 0    α = .01    t.005,6 = 3.707

sb = se/√(Σx² − (Σx)²/n) = 17.886/√(49,584 − (608)²/8) = .30783

t = (b1 − 0)/sb = (2.40107 − 0)/.30783 = 7.80

Since the observed t = 7.80 > t.005,6 = 3.707, the decision is to reject the null hypothesis.
14.60 a) The regression equation is: ŷ = 67.2 − 0.0565x

b) For every unit increase in x, the predicted value of y will decrease by .0565.

c) The t ratio for the slope is −5.50 with an associated p-value of .000. This is significant at α = .10. The t ratio is negative because the slope is negative and the numerator of the t ratio formula equals the slope minus zero.
14.61 The F value for overall predictability is 7.12 with an associated p-value of .0205, which is significant at α = .05. It is not significant at α = .01. The coefficient of determination is .372 with an adjusted r² of .32. This represents very modest predictability. The standard error of the estimate is 982.219, which in units of 1,000 laborers means that about 68% of the predictions are within 982,219 of the actual figures. The regression model is: Number of Union Members = 22,348.97 − 0.0524 Labor Force. For a labor force of 100,000 (thousand, actually 100 million), substitute x = 100,000 and get a predicted value of 17,108.97 (thousand), which is actually 17,108,970 union members.
14.62 The Residual Model Diagnostics from MINITAB indicate a relatively healthy set of residuals. The histogram indicates that the error terms are generally normally distributed. This is somewhat confirmed by the nearly straight-line Normal Plot of Residuals. However, the Residuals vs. Fits graph indicates that there may be some heteroscedasticity, with greater error variance for small x values.
Chapter 15
Multiple Regression Analysis
LEARNING OBJECTIVES
1.
2.
3.
4.
Presented early in chapter 15 are the simultaneous equations that
need to be solved to develop a first-order multiple regression model using
two predictors. This should help the student to see that there are three
equations with three unknowns to be solved. In addition, there are eight
values that need to be determined before solving the simultaneous equations
(Σx1, Σx2, Σy, Σx1², . . .). Suppose there are five predictors. Six simultaneous
equations must be solved, and the number of sums needed as constants in
the equations becomes overwhelming. At this point, the student will begin to
realize that most researchers do not want to take the time nor the effort to
solve for multiple regression models by hand. For this reason, much of the
chapter is presented using computer printouts. The assumption is that the
use of multiple regression analysis is largely from computer analysis.
CHAPTER OUTLINE
KEY TERMS
Adjusted R2
R2
Residual
Dependent Variable
Response Plane
Independent Variable
Response Surface
Response Variable
Multiple Regression
Outliers
15.1 ŷ = 25.03 − 0.0497x1 + 1.928x2

Predicted value of y for x1 = 200 and x2 = 7 is:

ŷ = 25.03 − 0.0497(200) + 1.928(7) = 28.586
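Evaluating a fitted multiple regression equation is just a dot product of coefficients and predictor values; a sketch using this problem's model:

```python
# Two-predictor model from this problem, evaluated at x1 = 200, x2 = 7
coeffs = {"intercept": 25.03, "x1": -0.0497, "x2": 1.928}

def predict(x1, x2):
    return coeffs["intercept"] + coeffs["x1"] * x1 + coeffs["x2"] * x2

y_hat = predict(200, 7)
```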
15.2 ŷ = 118.56 − 0.0794x1 − 0.88428x2 + 0.3769x3

ŷ = 118.56 − 0.0794(33) − 0.88428(29) + 0.3769(13) = 95.19538
15.3 ŷ = 121.62 − 0.174x1 + 6.02x2 + 0.00026x3 + 0.0041x4

There are four independent variables. If x2, x3, and x4 are held constant, the predicted y will decrease by 0.174 for every unit increase in x1. Predicted y will increase by 6.02 for every unit increase in x2 as x1, x3, and x4 are held constant. Predicted y will increase by 0.00026 for every unit increase in x3 holding x1, x2, and x4 constant. If x4 is increased by one unit, the predicted y will increase by 0.0041 if x1, x2, and x3 are held constant.
15.4 ŷ = 31,409.5 + 0.08425x1 + 289.62x2 − 0.0947x3
15.5
For every unit increase in paper consumption, the predicted per capita
consumption increases by 116.2549 if fish and gasoline consumption
are held constant. For every unit increase in fish consumption, the
predicted per capita consumption decreases by 120.0904 if paper and
gasoline consumption are held constant. For every unit increase in
gasoline consumption, the predicted per capita consumption increases
by 45.73328 if paper and fish consumption are held constant.
15.6 Insider Ownership = 17.68 − 0.0594 Debt Ratio − 0.118 Dividend Payout

The coefficients mean that for every unit of increase in debt ratio there is a predicted decrease of 0.0594 in insider ownership if dividend payout is held constant. On the other hand, if dividend payout is increased by one unit, then there is a predicted drop in insider ownership of 0.118 if debt ratio is held constant.
15.7
There are 9 predictors in this model. The F test for overall significance
of the model is 1.99 with a probability of .0825. This model is not
significant at = .05. Only one of the t values is statistically
significant. Predictor x1 has a t of 2.73 which has an associated
probability of .011 and this is significant at = .05.
15.8
15.9
15.10 The regression model is:

Insider Ownership = 17.68 − 0.0594 Debt Ratio − 0.118 Dividend Payout

The overall value of F is only 0.02 with a p-value of .982. This model is not significant. Neither of the t values is significant (tDebt = −0.19 with a p-value of .855 and tDividend = −0.11 with a p-value of .913).

15.11 ŷ = 3.981 + 0.07322x1 − 0.03232x2 − 0.003886x3
15.12 The regression equation for the model using both x1 and x2 is:

ŷ = 243.44 − 16.608x1 − 0.0732x2

The regression equation for the model using only x1 is:

ŷ = 235.143 − 16.7678x1

15.13 ŷ = 657.053 + 5.7103x1 − 0.4169x2 − 3.4715x3

x1 is significant at α = .01 (t = 3.19, p-value of .0087)

x3 is significant at α = .05 (t = −2.41, p-value of .0349)
15.14 The standard error of the estimate is 3.503. R2 is .408 and the
adjusted R2 is only .203. This indicates that there are a lot of
insignificant predictors in the model. That is underscored by the fact
that eight of the nine predictors have non significant t values.
15.15 se = 9.722, R2 = .515 but the adjusted R2 is only .404. The difference
in the two is due to the fact that two of the three predictors in the
model are non-significant. The model fits the data only modestly. The
adjusted R2 indicates that 40.4% of the variance of y is accounted for
by this model and 59.6% is unaccounted for by the model.
15.16 The standard error of the estimate of 14,660.57 indicates that this
model predicts Per Capita Personal Consumption to within + 14,660.57
about 68% of the time. The entire range of Personal Per Capita for the
data is slightly less than 110,000. Relative to this range, the standard
error of the estimate is modest. R2 = .85988 and the adjusted value of
R2 is .799828 indicating that there are potentially some non significant
variables in the model. An examination of the t statistics reveals that
two of the three predictors are not significant. The model has
relatively good predictability.
15.19 For the regression equation for the model using both x1 and x2, se =
6.333,
R2 = .963 and adjusted R2 = .957. Overall, this is a very strong model.
For the regression model using only x1 as a predictor, the standard
error of the estimate is 6.124, R2 = .963 and the adjusted R2 = .960.
The value of R2 is the same as it was with the two predictors. However,
the adjusted R2 is slightly higher with the one-predictor model because
the non-significant variable has been removed. In conclusion, by using
the one predictor model, we get virtually the same predictability as
with the two predictor model and it is more parsimonious.
15.21 The Histogram indicates that there may be some problem with the
error
terms being normally distributed. The Residuals vs. Fits plot reveals
that there may be some lack of homogeneity of error variance.
15.22 There are four predictors. The equation of the regression model is:
y
= -55.9 + 0.0105 x1 0.107 x2 + 0.579 x3 0.870 x4
The test for overall significance yields an F = 55.52 with a p-value of .
000
which is significant at = .001. Three of the t tests for regression
coefficients are significant at = .01 including the coefficients for
x2, x3, and x4. The R2 value of 80.2% indicates strong predictability for
the model. The value of the adjusted R2 (78.8%) is close to R2 and se is
9.025.
15.23 There are two predictors in this model. The equation of the regression
model is:
y
= 203.3937 + 1.1151 x1 2.2115 x2
The F test for overall significance yields a value of 24.55 with an
associated p-value of .0000013 which is significant at = .00001.
Both
variables yield t values that are significant at a 5% level of
significance.
x2 is significant at = .001. The R2 is a rather modest 66.3% and the
standard error of the estimate is 51.761.
15.24 ŷ = 137.27 + 0.0025x1 + 29.206x2
F = 10.89 with p = .005, se = 9.401, R2 = .731, adjusted R2 = .664. For
x1, t = 0.01 with p = .99 and for x2, t = 4.47 with p = .002. This model
has good predictability. The gap between R2 and adjusted R2 indicates
that there may be a non-significant predictor in the model. The t
values show x1 has virtually no predictability and x2 is a significant
predictor of y.
15.25 F = 16.05 with p = .001, se = 37.07, R2 = .858, adjusted R2 = .804. For
x1, t = -4.35 with p = .002; for x2, t = -0.73 with p = .483, for x3, t =
1.96 with p = .086. Thus, only one of the three predictors, x1, is a
significant predictor in this model. This model has very good
predictability (R2 = .858). The gap between R2 and adjusted R2
underscores the fact that there are two non-significant predictors in
this model.
15.26 The overall F for this model was 12.19 with a p-value of .002, which is significant at α = .01. The t test for Silver is significant at α = .01 (t = 4.94, p = .001). The t test for Aluminum yields t = 3.03 with a p-value of .016, which is significant at α = .05. The t test for Copper was insignificant with a p-value of .939. The value of R2 was 82.1% compared to an adjusted R2 of 75.3%. The gap between the two indicates the presence of some insignificant predictors (Copper). The standard error of the estimate is 53.44.
15.27 The low value of adjusted R2 indicates that the model has very low predictability. Both t values are not significant (tNavalVessels = 0.67 with p = .541 and tCommercial = 1.07 with p = .345). Neither predictor is a significant predictor of employment.
15.28 The regression model was:
15.29 Only one of the four predictors has a significant t ratio, and that is Utility with t = 2.57 and p = .018. Among the other t ratios and their respective probabilities, tHealthcare = −0.64 with p = .53.
This model is very weak. Only the predictor, Utility, shows much
promise in accounting for the grocery variability.
ŷ = 87.89 − 0.256x1 − 2.714x2 + 0.0706x3

F = 47.57 with a p-value of .000, significant at α = .001.

se = 0.8503, R2 = .941, adjusted R2 = .921.

All three predictors produced significant t tests, with two of them (x2 and x3) significant at α = .01 and the other, x1, significant at α = .05. This is a very strong model.
15.32 Two of the diagnostic charts indicate that there may be a problem with the error terms being normally distributed. The histogram indicates that the error term distribution might be skewed to the right, and the normal probability plot is somewhat nonlinear. In addition, the residuals vs. fits chart indicates a potential heteroscedasticity problem, with residuals for middle values of x producing more variability than those for lower and higher values of x.
Chapter 16
Building Multiple Regression Models
LEARNING OBJECTIVES
1.
analysis.
2.
3.
4.
CHAPTER TEACHING STRATEGY
CHAPTER OUTLINE
16.4 Multicollinearity
KEY TERMS
Qualitative Variable
Backward Elimination
Dummy Variable
Search Procedures
Stepwise Regression
Forward Selection
Indicator Variable
Tukey's Ladder of Transformations
Multicollinearity
Quadratic Model
16.1 ŷ = −147.27 + 27.128x

t = 15.15 with p = .000. This is a very strong simple regression model.

ŷ = −22.01 + 3.385x + 0.9373x²
16.2 The exponential model is ŷ = b0·b1ˣ.

Using logs: log ŷ = log b0 + x log b1

The regression model is solved for in the computer using the values of x and the values of log y. The resulting regression equation is:
16.3 ŷ = −1,456.6 + 71.017x

ŷ = 1,012 − 14.06x + 0.6115x²

R2 = .947 but adjusted R2 = .911. The t ratio for the x term is t = −0.17 with p = .876. The t ratio for the x² term is t = 1.03 with p = .377.
(Scatter plot: Ad Exp, 1,000 to 7,000, on the vertical axis; 30 to 110 on the horizontal axis.)
16.4 The exponential model is ŷ = b0·b1ˣ.

Using logs: log ŷ = log b0 + x log b1

The regression model is solved for in the computer using the values of x and the values of log y, where x is failures and y is liabilities. The resulting regression equation is:
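The log-transform mechanics described above can be sketched as follows. The data here are made up purely to illustrate the method (y = 2 · 1.5ˣ exactly, so the fit recovers b0 = 2 and b1 = 1.5); the problem's actual failure/liability data are not reproduced.

```python
import math

# Fit y = b0 · b1^x by regressing log10(y) on x (hypothetical exact data)
xs = [1, 2, 3, 4, 5]
ys = [2 * 1.5 ** x for x in xs]

log_ys = [math.log10(y) for y in ys]
n = len(xs)
sx, sy = sum(xs), sum(log_ys)
sxy = sum(x * y for x, y in zip(xs, log_ys))
sxx = sum(x * x for x in xs)

slope = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)   # = log10(b1)
intercept = sy / n - slope * sx / n                  # = log10(b0)

b1 = 10 ** slope     # back-transform to the exponential model's base
b0 = 10 ** intercept
```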
16.5 ŷ = −28.61 − 2.68x1 + 18.25x2 − 0.2135x1² − 1.533x2² + 1.226x1·x2

None of the t ratios for this model are significant: t(x1) = −0.25 with p = .805, t(x2) = 0.91 with p = .378, t(x1²) = −0.33 with p = .745, t(x2²) = −0.68 with p = .506, and t(x1·x2) = 0.52 with p = .613. This model has a high R2, yet none of the predictors are individually significant.

The same thing occurs when the interaction term is not in the model. None of the t tests are significant. The R2 remains high at .957, indicating that the loss of the interaction term was insignificant.
16.6
= .01.
16.7 ŷ = 13.619 − 0.01201x1 + 2.998x2

The overall F = 8.43 is significant at α = .01 (p = .009). The t ratio for the x1 variable is only t = −0.14 with p = .893. However, the t ratio for the dummy variable, x2, is t = 3.88 with p = .004. The indicator variable is the significant predictor in this regression model, which has some predictability (adjusted R2 = .575).
16.8 ŷ = 7.909 + 0.581x1 + 1.458x2 − 5.881x3 − 4.108x4

With c = 4 categories of the qualitative variable, c − 1 = 3 indicator variables are used.

se = 1.733, R2 = .806, and adjusted R2 = .747

For the predictors, t = 0.56 with p = .585 for the x1 variable (not significant); t = 1.32 with p = .208 for the first indicator variable (x2), which is non-significant; t = −5.32 with p = .000 for x3, the second indicator variable, which is significant at α = .001; and t = −3.21 with p = .007 for the third indicator variable (x4), which is significant at α = .01. This model has strong predictability, and the only significant predictor variables are the two dummy variables, x3 and x4.
16.9
16.10 The regression model is:

ŷ = 41.225 + 1.081x1 − 18.404x2

For x2 = 0, ŷ becomes 41.225 + 1.081x1. For x2 = 1, ŷ becomes (41.225 − 18.404) + 1.081x1 = 22.821 + 1.081x1.
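The effect of the dummy variable is a pure intercept shift, which a short sketch of this problem's model makes visible:

```python
# Dummy-variable model from this problem: x2 = 0 and x2 = 1 give two
# parallel lines with the same slope b1 and intercepts b0 and b0 + b2
b0, b1, b2 = 41.225, 1.081, -18.404

def predict(x1, x2):
    return b0 + b1 * x1 + b2 * x2

shift = predict(10, 1) - predict(10, 0)   # equals b2 for any x1
```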
Step 1:
Step 2:
y
The model is
Step 1:
y
The model is
Step 2:
= 133.53 - 0.78 x4
y
The model is
The variables, x1 and x3 never enter the procedure.
16.15 The output shows that the final model had four predictor variables, x4,
x2, x5, and x7. The variables, x3 and x6 did not enter the stepwise
analysis. The procedure took four steps. The final model was:
The R2 for this model was .5929, and se was 3.36. The t ratios were:
tx4 = 3.07, tx2 = 2.05, tx5 = 2.02, and tx7 = 1.98.
16.16 The output indicates that the stepwise process only went two steps.
Variable x3 entered at step one. However, at step two, x3 dropped out
of the analysis and x2 and x4 entered as the predictors. x1 was the
dependent variable. x5 never entered the procedure and was not
included in the final model as x3 was not. The final regression model
was:
16.17 The output indicates that the procedure went through two steps. At step 1, dividends entered the process, yielding an r2 of .833 by itself. The t value was 6.69, and the model was ŷ = −11.062 + 61.1x1. At step 2, net income entered the procedure and dividends remained in the model. The R2 for this two-predictor model was .897, which is a modest increase from the simple regression model shown in step one. The step 2 model was:
Correlation matrix:

            Premiums     Income   Dividends   Gain/Loss
Premiums           1
Income      0.808236
Dividends   0.912515   0.682321
Gain/Loss   -0.40984     0.0924    -0.52241
16.18 This stepwise regression procedure only went one step. The only
significant predictor was natural gas. No other predictors entered the
model. The regression model is:
For this model, R2 = .9295 and se = 0.490. The t value for natural gas
was 11.48.
Chapter 18
Statistical Quality Control
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
x̄ charts, R charts, p charts, and c charts.
assembly line. However, even in most service industries such as insurance, banking, or healthcare, there are processes. A useful class activity might be to brainstorm about what kind of process is involved in a person buying gasoline for their car, checking in to a hospital, or purchasing a health club membership. Think about it from a company's perspective. What activities must occur in order for a person to get their car filled up?
In analyzing processes, we first discuss the construction of flowcharts. Flowcharting can be very beneficial in identifying activities and flows that need to be studied for quality improvement. One very important outcome of a flowchart is the identification of bottlenecks. You may find out that all applications for employment, for example, must pass across a clerk's desk, where they sit for several days. This backs up the system and prevents flow. Other process techniques include fishbone diagrams, Pareto analysis, and control charts.
In this chapter, four types of control charts are presented. Two of the
charts, the x bar chart and the R chart, deal with measurements of product
attributes such as weight, length, temperature and others. The other two
charts deal with whether or not items are in compliance with specifications (p
chart) or the number of noncompliances per item (c chart). The c chart is
less widely known and used than the other three. As part of the material on
control charts, a discussion on variation is presented. Variation is one of the
main concerns of quality control. A discussion on various types of variation
that can occur in a business setting can be profitable in helping the student
understand why particular measurements are charted and controlled.
CHAPTER OUTLINE
18.1
Benchmarking
Just-in-Time Systems
Reengineering
Failure Mode and Effects Analysis (FMEA)
Poka-Yoke
Six Sigma
Design for Six Sigma
Lean Manufacturing
Team Building
18.2
Process Analysis
Flowcharts
Pareto Analysis
Cause-and-Effect (Fishbone) Diagrams
Control Charts
Check Sheets
Histogram
Scatter Chart
18.3
Control Charts
Variation
Types of Control Charts
x̄ Chart
R Charts
p Charts
c Charts
Interpreting Control Charts
18.4
Acceptance Sampling
Single Sample Plan
Double-Sample Plan
Multiple-Sample Plan
Determining Error and OC Curves
KEY TERMS
Acceptance Sampling
After-Process Quality Control
p Chart
Pareto Analysis
Benchmarking
Pareto Chart
c Chart
Poka-Yoke
Cause-and-Effect Diagram
Process
Centerline
Producer's Risk
Check Sheet
Product Quality
Consumer's Risk
Quality
Control Chart
Quality Circle
Quality Control
Double-Sample Plan
R Chart
Reengineering
Fishbone Diagram
Scatter Chart
Flowchart
Single-Sample Plan
Histogram
Six Sigma
Team Building
Ishikawa Diagram
Transcendent Quality
Lean Manufacturing
User Quality
Manufacturing Quality
Value Quality
Multiple-Sample Plan
x̄ Chart
Operating Characteristic (OC) Curve
SOLUTIONS TO PROBLEMS IN CHAPTER 18
18.2
Complaint              Number    % of Total
Busy Signal               420         56.45
                          184         24.73
                           85         11.42
Get Disconnected           37          4.97
                           10          1.34
Poor Connection             8          1.08
Total                     744         99.99
18.4 x̄1 = 27.00, x̄2 = 24.29, x̄3 = 25.29, x̄4 = 27.71, x̄5 = 25.86
R1 = 8, R2 = 8, R3 = 9, R4 = 7, R5 = 6

x̄̄ = 26.03, R̄ = 7.6

For x̄ Chart: Since n = 7, A2 = 0.419
Centerline: x̄̄ = 26.03
UCL: x̄̄ + A2·R̄ = 26.03 + (0.419)(7.6) = 26.03 + 3.18 = 29.21
LCL: x̄̄ - A2·R̄ = 26.03 - (0.419)(7.6) = 26.03 - 3.18 = 22.85

For R Chart: Since n = 7, D3 = 0.076 and D4 = 1.924
Centerline: R̄ = 7.6
UCL: D4·R̄ = (1.924)(7.6) = 14.62
LCL: D3·R̄ = (0.076)(7.6) = 0.58

x̄ Chart:
R Chart:
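The x̄ and R chart limits above follow one pattern throughout the chapter: x̄̄ ± A2·R̄ for the x̄ chart, and D4·R̄, D3·R̄ for the R chart. A minimal sketch using the 18.4 data and standard chart constants (the helper name is ours, not from the text):

```python
def xbar_r_limits(sample_means, sample_ranges, a2, d3, d4):
    """Control limits for x-bar and R charts from subgroup means/ranges.
    Returns (centerline, UCL, LCL) for each chart."""
    x_bar_bar = sum(sample_means) / len(sample_means)
    r_bar = sum(sample_ranges) / len(sample_ranges)
    x_chart = (x_bar_bar, x_bar_bar + a2 * r_bar, x_bar_bar - a2 * r_bar)
    r_chart = (r_bar, d4 * r_bar, d3 * r_bar)
    return x_chart, r_chart

# Data from 18.4; for subgroups of size n = 7: A2 = 0.419, D3 = 0.076, D4 = 1.924
x_chart, r_chart = xbar_r_limits([27.00, 24.29, 25.29, 27.71, 25.86],
                                 [8, 8, 9, 7, 6], 0.419, 0.076, 1.924)
```

The same function covers 18.5, 18.22, 18.25, and 18.26 by swapping in the appropriate table constants for the subgroup size.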
18.5 x̄1 = 4.55, x̄2 = 4.10, x̄3 = 4.30, x̄4 = 4.80, x̄5 = 4.70, x̄6 = 4.73, x̄7 = 4.38

x̄̄ = 4.51, R̄ = 0.90

For x̄ Chart: Since n = 4, A2 = 0.729
Centerline: x̄̄ = 4.51
UCL: x̄̄ + A2·R̄ = 4.51 + (0.729)(0.90) = 4.51 + 0.66 = 5.17
LCL: x̄̄ - A2·R̄ = 4.51 - (0.729)(0.90) = 4.51 - 0.66 = 3.85

For R Chart: Since n = 4, D3 = 0 and D4 = 2.282
Centerline: R̄ = 0.90
UCL: D4·R̄ = (2.282)(0.90) = 2.05
LCL: D3·R̄ = 0

x̄ Chart:
R Chart:
18.6 The ten sample proportions are .02, .07, .05, .04, .02, .03, .00, .03,
.01, and .06

p̄ = .033

Centerline: p̄ = .033
UCL: p̄ + 3·√(p̄(1 - p̄)/n) = .033 + 3·√((.033)(.967)/100) = .033 + .0536 = .0866
LCL: .033 - .0536 = -.0206 → 0

p Chart:
18.7 The seven sample proportions are .025, .000, .025, .075, .05, .125, and .05

p̄ = .050

Centerline: p̄ = .050
UCL: .05 + 3·√((.05)(.95)/40) = .05 + .1034 = .1534
LCL: .05 - .1034 = -.0534 → 0

p Chart:
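The p chart limits in 18.6 and 18.7 both come from p̄ ± 3·√(p̄(1 - p̄)/n), with a negative LCL truncated at zero. A small sketch of that computation (the function name is ours):

```python
import math

def p_chart_limits(p_bar, n):
    """3-sigma control limits for a p chart; a negative LCL is set to zero."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar + 3 * sigma, max(0.0, p_bar - 3 * sigma)

ucl, lcl = p_chart_limits(0.05, 40)   # problem 18.7
```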
18.8 c̄ = 22/35 = 0.62857

Centerline: c̄ = 0.62857
UCL: c̄ + 3·√c̄ = 0.62857 + 3·√0.62857 = 0.62857 + 2.37848 = 3.00705
LCL: 0.62857 - 2.37848 = -1.74991 → 0

c Chart:
18.9 c̄ = 43/32 = 1.34375

Centerline: c̄ = 1.34375
UCL: c̄ + 3·√c̄ = 1.34375 + 3·√1.34375 = 1.34375 + 3.47761 = 4.82136
LCL: 1.34375 - 3.47761 = -2.13386 → 0

c Chart:
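The c chart limits in 18.8 and 18.9 use c̄ ± 3·√c̄, again truncating a negative LCL at zero. A minimal sketch (helper name is ours):

```python
import math

def c_chart_limits(total_nonconformances, n_items):
    """c chart: centerline c-bar with 3-sigma limits c-bar +/- 3*sqrt(c-bar);
    a negative LCL is set to zero."""
    c_bar = total_nonconformances / n_items
    spread = 3 * math.sqrt(c_bar)
    return c_bar, c_bar + spread, max(0.0, c_bar - spread)

c_bar, ucl, lcl = c_chart_limits(43, 32)   # problem 18.9
```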
18.10 a.) Six or more consecutive points are decreasing. Two of three
consecutive points are in the outer one-third (near LCL). Four out of
five points are in the outer two-thirds (near LCL).
b.) This is a relatively healthy control chart with no obvious rule
violations.
c.) One point is above the UCL. Two out of three consecutive points are
in
the outer one-third (both near LCL and near UCL). There are six
consecutive increasing points.
18.11 While there are no points outside the limits, the first chart exhibits
some
problems. The chart ends with 9 consecutive points below the
centerline.
Of these 9 consecutive points, there are at least 4 out of 5 in the outer
2/3 of the lower region. The second control chart contains no points
outside the control limit. However, near the end, there are 8
consecutive points above the centerline. The p chart contains no
points outside the upper control limit. Three times, the chart contains
two out of three points in the outer third. However, this occurs in the
lower third where the proportion of noncompliance items approaches
zero and is probably not a problem to be concerned about. Overall,
this seems to display a process that is in control. One concern might
be the wide swings in the proportions at samples 15–16 and 22–23.
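The run rules applied in 18.10 and 18.11 (for example, eight or nine consecutive points on one side of the centerline) are mechanical enough to automate. A sketch of one such check, with a nine-point threshold as an illustrative choice (the function name is ours):

```python
def nine_in_a_row(values, centerline):
    """Flag one common control-chart run rule: nine consecutive
    points on the same side of the centerline."""
    run, side = 0, 0
    for v in values:
        s = 1 if v > centerline else (-1 if v < centerline else 0)
        if s != 0 and s == side:
            run += 1          # run continues on the same side
        else:
            run = 1 if s != 0 else 0   # run resets (or point sits on the line)
        side = s
        if run >= 9:
            return True
    return False
```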
But x1 = 2 and x2 = 2
so x1 + x2 = 4 > 3
18.13 n = 10, c = 0

p0 = .05: P(x = 0) = 10C0(.05)0(.95)10 = .5987

p1 = .14: P(x = 0) = 10C0(.14)0(.86)10 = .2213
18.14 n = 12, c = 1

p0 = .04: 1 - [12C0(.04)0(.96)12 + 12C1(.04)1(.96)11] = 1 - [.6127 + .3064] = .0809

p1 = .15: 12C0(.15)0(.85)12 + 12C1(.15)1(.85)11 = .1422 + .3012 = .4434
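The acceptance probabilities in 18.13 and 18.14 are cumulative binomial sums, P(X ≤ c) for X ~ Binomial(n, p). A minimal sketch (the helper name is ours):

```python
from math import comb

def accept_prob(n, c, p):
    """Probability a lot is accepted under a single-sample plan:
    P(X <= c) for X ~ Binomial(n, p) defectives in the sample."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Problem 18.13: n = 10, c = 0
pa_p0 = accept_prob(10, 0, 0.05)   # about .5987
pa_p1 = accept_prob(10, 0, 0.14)   # about .2213
```

Evaluating the same function over a grid of p values produces the OC-curve tables in 18.15, 18.16, and 18.29.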
18.15 n = 8, c = 0, p0 = .03, p1 = .1

Producer's risk: 1 - .7837 = .2163

p     Probability of Acceptance
.01   .9227
.02   .8506
.03   .7837
.04   .7214
.05   .6634
.06   .6096
.07   .5596
.08   .5132
.09   .4703
.10   .4305
.11   .3937
.12   .3596
.13   .3282
.14   .2992
.15   .2725

OC Chart:
18.16 n = 11, c = 1, p0 = .08, p1 = .20

p     Probability of Acceptance
.02   .9805
.04   .9308
.06   .8618
.08   .7819
.10   .6974
.12   .6127
.14   .5311
.16   .4547
.18   .3849
.20   .3221
.22   .2667
.24   .2186

OC Chart:
18.17
Stop
(no)
D K L M (yes) Stop
Stop
(no)
(no)
Start A B (yes) C E F G
(yes)
H(no) J Stop
(yes)
18.18
Problem    Frequency    Percent of Total
   1          673            26.96
   2           29             1.16
   3          108             4.33
   4          379            15.18
   5           73             2.92
   6          564            22.60
   7           12             0.48
   8          402            16.11
   9           54             2.16
  10          202             8.09
Total        2496

Pareto Chart:
18.20 a) n = 13, c = 1

p0 = .05:
13C0(.05)0(.95)13 + 13C1(.05)1(.95)12 =
(1)(1)(.51334) + (13)(.05)(.54036) = .51334 + .35123 = .86457

p1 = .12:
13C0(.12)0(.88)13 + 13C1(.12)1(.88)12 =
(1)(1)(.18979) + (13)(.12)(.21567) = .18979 + .33645 = .52624

b) n = 20, c = 2, p0 = .03

The probability of acceptance is:
20C0(.03)0(.97)20 + 20C1(.03)1(.97)19 + 20C2(.03)2(.97)18 =
.5438 + .3364 + .0988 = .9790
18.21 The ten sample proportions are .06, .22, .16, .14, .00, .04, .18, .10,
.02, and .12

p̄ = 52/500 = .104

Centerline: p̄ = .104
UCL: .104 + 3·√((.104)(.896)/50) = .104 + .1295 = .2335
LCL: .104 - .1295 = -.0255 → 0

p Chart:
18.22 x̄1 = 24.022, x̄2 = 24.048, x̄3 = 23.998, x̄4 = 24.018, x̄5 = 24.014,
x̄6 = 23.996, x̄7 = 24.000, x̄8 = 24.002, x̄9 = 24.000, x̄10 = 24.034,
x̄11 = 24.012, x̄12 = 24.022

x̄̄ = 24.01383, R̄ = 0.05167

For x̄ Chart: Since n = 12, A2 = 0.266
Centerline: x̄̄ = 24.01383
UCL: x̄̄ + A2·R̄ = 24.01383 + (0.266)(.05167) = 24.01383 + .01374 = 24.02757
LCL: x̄̄ - A2·R̄ = 24.01383 - (0.266)(.05167) = 24.01383 - .01374 = 24.00009

For R Chart: Since n = 12, D3 = .284 and D4 = 1.716
Centerline: R̄ = .05167
UCL: D4·R̄ = (1.716)(.05167) = .08866
LCL: D3·R̄ = (.284)(.05167) = .01467

x̄ Chart:
R Chart:
18.23 (single-sample plan with n = 15, c = 0)

p     Probability of Acceptance
.01   .8601
.02   .7386
.04   .5421
.06   .3953
.08   .2863
.10   .2059
.12   .1470
.14   .1041

OC Curve:
18.24 c̄ = 77/36 = 2.13889

Centerline: c̄ = 2.13889
UCL: c̄ + 3·√c̄ = 2.13889 + 3·√2.13889 = 2.13889 + 4.38748 = 6.52637
LCL: 2.13889 - 4.38748 = -2.24859 → 0

c Chart:
18.25 x̄1 = 1.2100, x̄2 = 1.2050, x̄3 = 1.2075, x̄4 = 1.1900, x̄5 = 1.2025,
x̄6 = 1.1725, x̄7 = 1.1950, x̄8 = 1.1950, x̄9 = 1.1850

x̄̄ = 1.19583, R̄ = 0.04667

For x̄ Chart: Since n = 4, A2 = .729
Centerline: x̄̄ = 1.19583
UCL: x̄̄ + A2·R̄ = 1.19583 + .729(.04667) = 1.19583 + .03402 = 1.22985
LCL: x̄̄ - A2·R̄ = 1.19583 - .729(.04667) = 1.19583 - .03402 = 1.16181

For R Chart: Since n = 9, D3 = .184 and D4 = 1.816
Centerline: R̄ = .04667
UCL: D4·R̄ = (1.816)(.04667) = .08475
LCL: D3·R̄ = (.184)(.04667) = .00859

x̄ Chart:
R chart:
18.26 x̄1 = 14.99333, x̄2 = 15.00000, x̄3 = 15.01333, x̄4 = 14.97833,
x̄5 = 15.00000, x̄6 = 15.01667, x̄7 = 14.97833, x̄8 = 14.99667

x̄̄ = 14.99854, R̄ = 0.05

For x̄ Chart: Since n = 6, A2 = .483
Centerline: x̄̄ = 14.99854
UCL: x̄̄ + A2·R̄ = 14.99854 + .483(.05) = 14.99854 + .02415 = 15.02269
LCL: x̄̄ - A2·R̄ = 14.99854 - .483(.05) = 14.99854 - .02415 = 14.97439

For R Chart: Since n = 6, D3 = 0 and D4 = 2.004
Centerline: R̄ = .05
UCL: D4·R̄ = 2.004(.05) = .1002
LCL: D3·R̄ = 0(.05) = .0000

x̄ Chart:
R chart:
18.27 The twelve sample proportions are .12, .04, .00, .02667, .09333, .18667,
.06667, .14667, .05333, .10667, .0000, and .09333

p̄ = 70/900 = .07778

Centerline: p̄ = .07778
UCL: .07778 + 3·√((.07778)(.92222)/75) = .07778 + .09278 = .17056
LCL: .07778 - .09278 = -.01500 → 0

p Chart:
18.28 c̄ = 16/25 = 0.64

Centerline: c̄ = 0.64
UCL: c̄ + 3·√c̄ = 0.64 + 3(0.8) = 0.64 + 2.40 = 3.04
LCL: 0.64 - 2.40 = -1.76 → 0
18.29 n = 10, c = 2, p0 = .10, p1 = .30

p     Probability of Acceptance
.05   .9885
.10   .9298
.15   .8202
.20   .6778
.25   .5256
.30   .3828
.35   .2616
.40   .1673
.45   .0996
.50   .0547
18.30 c̄ = 81/40 = 2.025

Centerline: c̄ = 2.025
UCL: c̄ + 3·√c̄ = 2.025 + 3·√2.025 = 2.025 + 4.269 = 6.294
LCL: 2.025 - 4.269 = -2.244 → 0

c Chart:
18.31 The fifteen sample proportions are .05, .00, .15, .025, .025, .10, .00,
.05, .05, .15, .125, .075, .075, .025, and .000

p̄ = 36/600 = .06

Centerline: p̄ = .06
UCL: .06 + 3·√((.06)(.94)/40) = .06 + .1127 = .1727
LCL: .06 - .1127 = -.0527 → 0

p Chart:
1/3 of the confidence bands (± 1σx̄).
18.33 There are some items to be concerned about with this chart. Only one
sample range is above the upper control limit. However, near the
beginning of the chart there are eight sample ranges in a row below
the centerline. Later in the run, there are nine sample ranges in a row
above the centerline. The quality manager or operator might want to
determine whether there is some systematic reason why there is a
string of ranges below the centerline and, perhaps more importantly,
why there is a string of ranges above the centerline.
18.34 This p chart reveals that two of the sixty samples (about 3%) produce
proportions that are too large. Nine of the sixty samples (15%)
produce proportions large enough to be more than 1σp above the
centerline. In general, this chart indicates a process that is under
control.
18.35 The centerline of the c chart indicates that the process is averaging
0.74 nonconformances per part. Twenty-five of the fifty sampled items
have zero nonconformances. None of the samples exceed the upper
control limit for nonconformances. However, the upper control limit is
3.321 nonconformances, which in and of itself, may be too many.
Indeed, three of the fifty (6%) samples actually had three
nonconformances. An additional six samples (12%) had two
nonconformances. One matter of concern may be that there is a run of
ten samples in which nine of the samples exceed the centerline
(samples 12 through 21). The question raised by this phenomenon is
whether or not there is a systematic flaw in the process that produces
strings of nonconforming items.
Chapter 19
Decision Analysis
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
The notion of contemporary decision making is built into the title of the
text as a statement of the importance of recognizing that statistical analysis
is primarily done as a decision-making tool. For the vast majority of students,
statistics takes on importance only insofar as it aids decision-makers in
weighing various alternative pathways and helping the manager make the
best possible determination. It has been an underlying theme since Chapter 1
that the techniques presented should be considered in a decision-making
context. This chapter focuses on analyzing the decision-making situation and
presents several alternative techniques for analyzing decisions under varying
conditions.
States of nature are the possible environments, over which we have no
control, within which the outcomes will occur. These include such things as
the economy, the weather, the health of the CEO, wildcat strikes, competition,
changes in consumer demand, etc. While the text presents problems with
only a few states of nature in order to keep the length of solutions reasonable,
students should learn to consider as many states of nature as possible in
decision making. Determining payoffs is relatively difficult but essential in
the analysis of decision alternatives.
CHAPTER OUTLINE
KEY TERMS
Decision Alternatives
Decision Analysis
Hurwicz Criterion
Maximax Criterion
Maximin Criterion
Minimax Regret
Decision Table
Payoffs
Decision Trees
Payoff Table
EMV'er
Risk-Avoider
Risk-Taker
States of Nature
Utility
SOLUTIONS TO PROBLEMS IN CHAPTER 19
19.1
        S1    S2    S3    Max    Min
d1     250   175   -25    250    -25
d2     110   100    70    110     70
d3     390   140   -80    390    -80

a.) Maximax: Max {250, 110, 390} = 390   decision: Select d3
b.) Maximin: Max {-25, 70, -80} = 70     decision: Select d2
c.) Hurwicz criterion, for α = .3:
    d1: .3(250) + .7(-25) = 57.5
    d2: .3(110) + .7(70) = 82
    d3: .3(390) + .7(-80) = 61
    decision: Select d2

    For α = .8:
    d1: .8(250) + .2(-25) = 195
    d2: .8(110) + .2(70) = 102
    d3: .8(390) + .2(-80) = 296
    decision: Select d3

Comparing the results for the two values of alpha: with the more
pessimistic point of view (α = .3), the decision is to select d2 and the
weighted payoff is 82. Using the more optimistic point of view (α = .8)
results in choosing d3 with a higher weighted payoff of 296.
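The Hurwicz computation above, α·(row maximum) + (1 - α)·(row minimum) for each alternative, is easy to sketch in a few lines (the function name and dictionary layout are ours):

```python
def hurwicz(payoffs, alpha):
    """Hurwicz criterion: weigh each alternative's best payoff by alpha
    and its worst payoff by (1 - alpha); choose the largest weighted value."""
    scores = {d: alpha * max(row) + (1 - alpha) * min(row)
              for d, row in payoffs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Payoff table from 19.1
payoffs = {"d1": [250, 175, -25], "d2": [110, 100, 70], "d3": [390, 140, -80]}
```

With α = .3 this selects d2 (weighted payoff 82); with α = .8 it selects d3 (296), matching the hand computation.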
d.) Minimax regret, with opportunity loss table:

        S1    S2    S3    Max
d1     140     0    95    140
d2     280    75     0    280
d3       0    35   150    150

    Min {140, 280, 150} = 140   decision: Select d1
19.2
       S1    S2    S3    S4    Max    Min
d1     50    70   120   110    120     50
d2     80    20    75   100    100     20
d3     20    45    30    60     60     20
d4    100    85   -30   -20    100    -30
d5      0   -10    65    80     80    -10

a.) Maximax: Max {120, 100, 60, 100, 80} = 120   Decision: Select d1
b.) Maximin: Max {50, 20, 20, -30, -10} = 50     Decision: Select d1
c.) α = .5:
    d1: .5(120) + .5(50) = 85
    d2: .5(100) + .5(20) = 60
    d3: .5(60) + .5(20) = 40
    d4: .5(100) + .5(-30) = 35
    d5: .5(80) + .5(-10) = 35
    Decision: Select d1

d.) Minimax regret:
       S1    S2    S3    S4    Max
d1     50    15     0     0     50
d2     20    65    45    10     65
d3     80    40    90    50     90
d4      0     0   150   130    150
d5    100    95    55    30    100

    Min {50, 65, 90, 150, 100} = 50   Decision: Select d1
19.3
      S1    S2    S3    Max    Min
A     60    15   -25     60    -25
B     10    25    30     30     10
C    -10    40    15     40    -10
D     25    25    20     25     20

Decision: Select A
Decision: Select B
19.4
          None   Somewhat   Very    Max    Min
Not        -50        -50    -50     -50    -50
Few       -200        300    400     400   -200
Many      -600        100   1000    1000   -600

Regret table:
          None   Somewhat   Very    Max
Not          0        350   1050   1050
Few        150          0    600    600
Many       550        200      0    550

Min {1050, 600, 550} = 550
19.5, 19.6
19.7 Expected Payoff with Perfect Information =
19.9
Down(.30)
Lock-In
-150
No
Up(.65)
200
175
No Change(.05)
EMV
-250
85
19.10
EMV
No Layoff
Layoff 1000
-320
-960
-110
Layoff 5000
400
19.12 a.)
        S1(.30)   S2(.70)    EMV
d1         350      -100      35
d2        -200       325    167.5

b. & c.) For a forecast of S1:

State   Prior   Cond.   Joint   Revised
S1        .30     .90    .270    .6067
S2        .70     .25    .175    .3933
                P(FS1) = .445

For a forecast of S2:

State   Prior   Cond.   Joint   Revised
S1        .30     .10    .030    .054
S2        .70     .75    .525    .946
                P(FS2) = .555
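The Prior/Cond./Joint/Revised tables used throughout 19.12–19.24 are all the same Bayesian revision: joint = prior × conditional, and the revised probabilities are the joints normalized by their sum (which is itself the probability of that forecast). A minimal sketch (function name is ours):

```python
def revise(priors, conds):
    """Bayesian revision: joint_i = prior_i * cond_i; the revised
    (posterior) probabilities are the joints divided by their sum,
    which is the probability of the forecast itself."""
    joints = [p * c for p, c in zip(priors, conds)]
    total = sum(joints)
    return total, [j / total for j in joints]

# 19.12, forecast of S1: priors .30/.70, conditionals .90/.25
total, posts = revise([0.30, 0.70], [0.90, 0.25])
```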
19.13
Dec(.60)
Inc(.40)
-225
125
EMV
425
-150
350
35
15
-400
50
Forecast Decrease:

State      Prior   Cond.   Joint   Revised
Decrease     .60     .75     .45    .8824
Increase     .40     .15     .06    .1176
                 P(FDec) = .51

Forecast Increase:

State      Prior   Cond.   Joint   Revised
Decrease     .60     .25     .15    .3061
Increase     .40     .85     .34    .6939
                 P(FInc) = .49
19.14
               S1(.20)   S2(.30)   S3(.50)    EMV
Don't Plant         20       -40                -16
Small              -90        10       175     72.5
Large             -600      -150       800      235
Forecast Decrease:

Prior   Cond.   Joint   Revised
 .20     .70    .140     .8974
 .30     .02    .006     .0385
 .50     .02    .010     .0641
        P(FDec) = .156

Forecast Same:

Prior   Cond.   Joint   Revised
 .20     .25    .050     .1333
 .30     .95    .285     .7600
 .50     .08    .040     .1067
        P(FSame) = .375

Forecast Increase:

Prior   Cond.   Joint   Revised
 .20     .05    .010     .0213
 .30     .03    .009     .0192
 .50     .90    .450     .9595
        P(FInc) = .469
19.15
               Oil(.11)   No Oil(.89)     EMV
Drill         1,000,000      -100,000    21,000
Don't Drill

Conditional probabilities of the forecast given the actual state:

                   Actual Oil   Actual No Oil
Forecast Oil            .20           .10
Forecast No Oil         .80           .90

Forecast Oil:

State    Prior   Cond.   Joint   Revised
Oil        .11     .20    .022    .1982
No Oil     .89     .10    .089    .8018
         P(FOil) = .111

Forecast No Oil:

State    Prior   Cond.   Joint   Revised
Oil        .11     .80    .088    .0990
No Oil     .89     .90    .801    .9010
         P(FNo Oil) = .889
19.16
       S1    S2    Max    Min
d1     50   100    100     50
d2    -75   200    200    -75
d3     25    40     40     25
d4     75    10     75     10

a.) Maximax: Max {100, 200, 40, 75} = 200   Decision: Select d2
b.) Maximin: Max {50, -75, 25, 10} = 50     Decision: Select d1
c.) Hurwicz criterion with α = .6:
    d1: 100(.6) + 50(.4) = 80
    d2: 200(.6) + (-75)(.4) = 90
    d3: 40(.6) + 25(.4) = 34
    d4: 75(.6) + 10(.4) = 49
    Decision: Select d2
d.) Minimax regret:
       S1    S2    Maximum
d1     25   100        100
d2    150     0        150
d3     50   160        160
d4      0   190        190

    Min {100, 150, 160, 190} = 100   Decision: Select d1
19.17
Decision: Select d1
19.18
       S1(.40)   S2(.60)    EMV
d1        200       150     170
d2        -75       450     240
d3        175       125     145

Select d2

Forecast S1:

State   Prior   Cond.   Joint   Revised
S1        .4      .9      .36    .667
S2        .6      .3      .18    .333
        P(FS1) = .54

Forecast S2:

State   Prior   Cond.   Joint   Revised
S1        .4      .1      .04    .087
S2        .6      .7      .42    .913
        P(FS2) = .46
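The expected-monetary-value step that recurs in 19.18 and the problems around it is a probability-weighted sum per alternative. A minimal sketch (the function name and dictionary layout are ours):

```python
def emv(payoffs, probs):
    """Expected monetary value of each decision alternative."""
    evs = {d: sum(p * x for p, x in zip(probs, row))
           for d, row in payoffs.items()}
    best = max(evs, key=evs.get)
    return best, evs

# Payoff table and state probabilities from 19.18
best, evs = emv({"d1": [200, 150], "d2": [-75, 450], "d3": [175, 125]},
                [0.40, 0.60])
```

This reproduces the EMVs 170, 240, and 145 and selects d2, matching the hand computation.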
19.19
         Small   Moderate   Large    Min     Max
Small      200       250      300     200     300
Modest     100       300      600     100     600
Large     -300       400     2000    -300    2000

Regret table:
         Small   Moderate   Large    Max
Small        0       150     1700    1700
Modest     100       100     1400    1400
Large      500         0        0     500

Min {1700, 1400, 500} = 500
Decision: Large Number
19.20
No
Low
Fast
Low
-700
-400
1200
1200
-700
Medium
-300
-100
550
550
-300
100
125
150
150
100
High
a.) = .1:
b.) = .5:
c.) = .8:
Max
Min
P a g e | 616
d.) Two of the three alpha values (.5 and .8) lead to a decision of pricing
low.
Alpha of .1 suggests pricing high as a strategy. For optimists (high
alphas), pricing low is a better strategy; but for more pessimistic
people,
pricing high may be the best strategy.
19.21
            Mild(.75)   Severe(.25)    EMV
Regular         2000        -2500      875
Weekend         1200         -200      850
Not Open        -300          100     -200
19.22
Don't Produce
Produce
-700
-200
1800
150
400
EMV
-235
-1600
90
Decision: Based on Max EMV = Max {-235, 90} = 90, select Produce.
19.23
            Red.(.15)   Con.(.35)   Inc.(.50)     EMV
Automate      -40,000     -15,000      60,000   18,750
Do Not          5,000      10,000     -30,000  -10,750

Forecast Reduction:

Prior   Cond.   Joint   Revised
 .15     .60    .090     .60
 .35     .10    .035     .2333
 .50     .05    .025     .1667
        P(FRed) = .150

Forecast Constant:

Prior   Cond.   Joint   Revised
 .15     .30    .045     .10
 .35     .80    .280     .6222
 .50     .25    .125     .2778
        P(FCons) = .450

Forecast Increase:

Prior   Cond.   Joint   Revised
 .15     .10    .015     .0375
 .35     .10    .035     .0875
 .50     .70    .350     .8750
        P(FInc) = .400
19.24
              Chosen(.20)   Not Chosen(.80)     EMV
Build              12,000            -8,000   -4,000
Don't              -1,000             2,000    1,400

Forecast Chosen:

State        Prior   Cond.   Joint   Revised
Chosen         .20     .45    .090    .2195
Not Chosen     .80     .40    .320    .7805
             P(FC) = .410

Forecast Not Chosen:

State        Prior   Cond.   Joint   Revised
Chosen         .20     .55    .110    .1864
Not Chosen     .80     .60    .480    .8136
             P(FNC) = .590
16.19 Correlation matrix:

        y       x1      x2
x1    -.653
x2    -.891    .650
x3     .821   -.615   -.688
16.20 Correlation matrix:

        y       x1      x2      x3
x1    -.241
x2     .621   -.359
x3     .278   -.161    .243
x4    -.724    .325   -.442   -.278
Correlation matrix:

             Net Income   Dividends
Dividends          .682
Gain/Loss          .092       -.522

Correlation matrix:

              Natural Gas   Fuel Oil
Fuel Oil            .570
Gasoline            .701        .934
ŷ = 564 - 27.99 x1 - 6.155 x2 - 15.90 x3

ŷ = 1540 + 48.2 x1

The R2 at this step is .9112 and the t ratio is 11.55. At step 2, x1²
entered the procedure and x1 remained in the analysis. The stepwise
regression procedure stopped at this step and did not proceed. The
final model was:

ŷ = 1237 + 136.1 x1 - 5.9 x1²

The R2 at this step was .9723, the t ratio for x1 was 7.89, and the t ratio
for x1² was -5.14.
16.25 In this model, with x1 and the log of x1 as predictors, only log x1 was
a significant predictor of y. The stepwise procedure went only to step 1.
The regression model was:

ŷ = -13.20 + 11.64 Log x1

R2 = .9617 and the t ratio of Log x1 was 17.36. This model has very
strong predictability using only the log of the x1 variable.
16.26 The regression model is:

t(oilseed) = 3.74 with p = .006 and t(livestock) = 3.78 with p = .005. Both
predictors are significant at α = .01. This is a model with strong
predictability.
The low value of adjusted R2 indicates that the model has very low
predictability. Neither t value is significant (t(Naval Vessels) = 0.67 with
p = .541 and t(Commercial) = 1.07 with p = .345). Neither predictor is a
significant predictor of employment.

The t ratios were: t(food) = 8.32, t(fuel oil) = 2.81, t(shelter) = 2.56.
16.30 The stepwise regression process with these two independent variables
went only one step. At step 1, Soybeans entered, producing the model
Corn = -2,962 + 5.4 Soybeans. The t ratio for Soybeans was 5.43.
Wheat did not enter into the analysis.
Grocery = 76.23 + 0.08592 Housing + 0.16767 Utility
+ 0.0284 Transportation - 0.0659 Healthcare

Only one of the four predictors has a significant t ratio: Utility, with
t = 2.57 and p = .018. This model is very weak. Only the predictor
Utility shows much promise in accounting for the grocery variability.
16.32 The output suggests that the procedure went only two steps.

ŷ = 124.5 - 43.4 x1 + 1.36 x2

ŷ = 87.89 + 0.071 x3 - 2.71 x2 - 0.256 x1
16.34 The R2 for the full model is .321. After dropping variable x3, the R2
is still .321. Variable x3 added virtually no information to the model.
This is underscored by the fact that the p-value for the t test of the
slope for x3 is .878, indicating no significance. The standard error of
the estimate actually drops slightly after x3 is removed from the model.
Chapter 17
Time-Series Forecasting and Index Numbers
LEARNING OBJECTIVES
1.
2.
3.
4.
5.
and to
forecast by using decomposition techniques
6.
7.
The first section of the chapter contains a general discussion about the
various possible components of time-series data. It creates the setting
against which the chapter later proceeds into trend analysis and seasonal
effects. In addition, two measurements of forecasting error are presented so
that students can measure the error of forecasts produced by the various
techniques and begin to compare the merits of each.
A full gamut of time-series forecasting techniques has been presented,
beginning with the most naïve models and progressing through averaging
models and exponential smoothing. An attempt is made in the section on
exponential smoothing to show the student, through algebra, why it is called
by that name. Using the derived equations and a few selected values for
alpha, the student is shown how past values and forecasts are smoothed in
the prediction of future values. The more advanced smoothing techniques
are briefly introduced in later sections but are explained in much greater
detail on WileyPLUS.
Trend is solved for next, using the time periods as the predictor
variable. In this chapter both linear and quadratic trends are explored and
compared. There is a brief introduction to Holt's two-parameter exponential
smoothing method, which includes trend. A more detailed explanation of
Holt's method is available on WileyPLUS. The trend analysis section is placed
earlier in the chapter than seasonal effects because finding seasonal effects
makes more sense when there are no trend effects in the data or the trend
effect has been removed.
In regression analysis involving data over time, autocorrelation can be
a problem. Because of this, section 17.5 contains a discussion on
autocorrelation and autoregression. The Durbin-Watson test is presented as a
mechanism for testing for the presence of autocorrelation. Several possible
ways of overcoming the autocorrelation problem are presented such as the
addition of independent variables, transforming variables, and autoregressive
models.
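The Durbin-Watson statistic mentioned above is simple enough to compute directly from a residual series: d = Σ(e_t - e_{t-1})² / Σe_t². A minimal sketch (function name is ours):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
    Values near 2 suggest no autocorrelation; values near 0 suggest
    positive autocorrelation; values near 4, negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den
```

For example, a perfectly alternating residual series yields a value well above 2, signalling negative autocorrelation.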
CHAPTER OUTLINE
Winters' Three-Parameter Exponential Smoothing Method
KEY TERMS
Autocorrelation
Moving Average
Autoregression
Averaging Models
Cycles
Seasonal Effects
Cyclical Effects
Serial Correlation
Decomposition
Simple Average
Deseasonalized Data
Durbin-Watson Test
Error of an Individual
Smoothing Techniques
Exponential Smoothing
Stationary
First-Difference Approach
Time-Series Data
Forecasting
Trend
Forecasting Error
Index Number
Irregular Fluctuations
Laspeyres Price Index
Mean Absolute Deviation (MAD)
Index Number
Weighted Aggregate Price Index Number
Weighted Moving Average
17.1
Period      e      |e|      e²
  1       2.30    2.30    5.29
  2       1.60    1.60    2.56
  3      -1.40    1.40    1.96
  4       1.10    1.10    1.21
  5       0.30    0.30    0.09
  6      -0.90    0.90    0.81
  7      -1.90    1.90    3.61
  8      -2.10    2.10    4.41
  9       0.70    0.70    0.49
Total    -0.30   12.30   20.43

MAD = Σ|e| / no. forecasts = 12.30/9 = 1.367

MSE = Σe² / no. forecasts = 20.43/9 = 2.27
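The two error measures above can be sketched in a few lines (the function name is ours):

```python
def mad_mse(errors):
    """Mean absolute deviation and mean squared error of forecast errors."""
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mad, mse

# Forecast errors from problem 17.1
mad, mse = mad_mse([2.30, 1.60, -1.40, 1.10, 0.30, -0.90, -1.90, -2.10, 0.70])
```

This reproduces MAD ≈ 1.367 and MSE = 2.27, and the same helper applies to the error columns in 17.2 through 17.4.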
17.2
Period   Value   Forecast     e     |e|     e²
  1       202
  2       191       202      -11    11     121
  3       173       192      -19    19     361
  4       169       181      -12    12     144
  5       171       174       -3     3       9
  6       175       172        3     3       9
  7       182       174        8     8      64
  8       196       179       17    17     289
  9       204       189       15    15     225
 10       219       198       21    21     441
 11       227       211       16    16     256
Total                          35   125    1919

MAD = Σ|e| / no. forecasts = 125.00/10 = 12.5

MSE = Σe² / no. forecasts = 1,919/10 = 191.9
17.3
Period   Value   Forecast     e     |e|     e²
          19.4     16.6      2.8    2.8    7.84
          26.8     24.8      2.0    2.0    4.00
          29.2     25.9      3.3    3.3   10.89
          35.5     28.6      6.9    6.9   47.61
Total                       21.5   21.5   94.59

MAD = Σ|e| / No. Forecasts = 21.5/6 = 3.583

MSE = Σe² / No. Forecasts = 94.59/6 = 15.765
17.4
Year     Acres     Forecast        e        |e|           e²
  1     140,000
  2     141,730    140,000      1730      1730      2,992,900
  3     134,590    141,038     -6448      6448     41,576,704
  4     131,710    137,169     -5459      5459     29,800,681
  5     131,910    133,894     -1984      1984      3,936,256
  6     134,250    132,704      1546      1546      2,390,116
  7     135,220    133,632      1588      1588      2,521,744
  8     131,020    134,585     -3565      3565     12,709,225
  9     120,640    132,446    -11806     11806    139,381,636
 10     115,190    125,362    -10172     10172    103,469,584
 11     114,510    119,259     -4749      4749     22,553,001
Total                        -39,319     49,047   361,331,847

MAD = Σ|e| / No. Forecasts = 49,047/10 = 4,904.7

MSE = Σe² / No. Forecasts = 361,331,847/10 = 36,133,184.7

17.5
a.)  Forecast    |error|
      44.75       14.25
      52.75       13.25
      61.50        9.50
      64.75       21.25
      70.50       30.50
      81.00       16.00

b.)  Forecast    |error|
      53.25        5.75
      56.375       9.625
      62.875       8.125
      67.25       18.75
      76.375      24.625
      89.125       7.875

c.)  Difference in errors:
      14.25 - 5.75 = 8.5
      3.625
      1.375
      2.5
      5.875
      8.125
17.6
Period   Value   F(α=.1)   Error   F(α=.8)   Error
  1       211
  2       228      211       17      211       17
  3       236      213       23      225       11
  4       241      215       26      234        7
  5       242      218       24      240        2
  6       227      220        7      242      -15
  7       217      221       -4      230      -13
  8       203      221      -18      220      -17
Using alpha of .1 produced forecasting errors that were larger than
those using alpha = .8 for the first three forecasts. For the next two
forecasts (periods 6
and 7), the forecasts using alpha = .1 produced smaller errors. Each
exponential smoothing model produced nearly the same amount of
error in forecasting the value for period 8. There is no strong argument
in favor of either model.
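The forecasts in 17.6 follow the exponential smoothing recursion F(t+1) = α·X(t) + (1 - α)·F(t), seeded with the first observation. A minimal sketch (the function name and the seeding convention are ours):

```python
def exp_smooth(values, alpha, f0=None):
    """Simple exponential smoothing: F(t+1) = alpha*X(t) + (1-alpha)*F(t).
    By convention here the first forecast equals the first observation."""
    f = values[0] if f0 is None else f0
    forecasts = []
    for x in values:
        forecasts.append(f)           # forecast made for this period
        f = alpha * x + (1 - alpha) * f  # update for the next period
    return forecasts
```

With the 17.6 data and α = .1 this yields forecasts 211, 211, 212.7, 215.0, ..., matching the rounded column in the table; changing α to .8 reproduces the second column.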
17.7
Period   Value   F(α=.3)   Error   F(α=.7)   Error   3-mo.avg.   Error
  1       9.4
  2       8.2      9.4     -1.2      9.4     -1.2
  3       7.9      9.0     -1.1      8.6     -0.7
  4       9.0      8.7      0.3      8.1      0.9       8.5       0.5
  5       9.8      8.8      1.0      8.7      1.1       8.4       1.4
  6      11.0      9.1      1.9      9.5      1.5       8.9       2.1
  7      10.3      9.7      0.6     10.6     -0.3       9.9       0.4
  8       9.5      9.9     -0.4     10.4     -0.9      10.4      -0.9
  9       9.1      9.8     -0.7      9.8     -0.7      10.3      -1.2
17.8
Year    Orders     F(a)       e(a)      F(b)       e(b)
  1     2512.7
  2     2739.9
  3     2874.9
  4     2934.1
  5     2865.7
  6     2978.5    2785.46    193.04    2852.36    126.14
  7     3092.4    2878.62    213.78    2915.49    176.91
  8     3356.8    2949.12    407.68    3000.63    356.17
  9     3607.6    3045.50    562.10    3161.94    445.66
 10     3749.3    3180.20    569.10    3364.41    384.89
 11     3952.0    3356.92    595.08    3550.76    401.24
 12     3949.0    3551.62    397.38    3740.97    208.03
 13     4137.0    3722.94    414.06    3854.64    282.36
17.9
Year   No. Issues   F(α=.2)    |e|      F(α=.9)    |e|
  1       332
  2       694        332.0    362.0     332.0    362.0
  3       518        404.4    113.6     657.8    139.8
  4       222        427.1    205.1     532.0    310.0
  5       209        386.1    177.1     253.0     44.0
  6       172        350.7    178.7     213.4     41.4
  7       366        315.0     51.0     176.1    189.9
  8       512        325.2    186.8     347.0    165.0
  9       667        362.6    304.4     495.5    171.5
 10       571        423.5    147.5     649.9     78.9
 11       575        453.0    122.0     578.9      3.9
 12       865        477.4    387.6     575.4    289.6
 13       609        554.9     54.1     836.0    227.0
Total                        2289.9             2023.0

MAD(α=.2) = 2289.9/12 = 190.8
MAD(α=.9) = 2023.0/12 = 168.6

ŷ = 37,969 + 9899.1 Period
F = 1603.11 (p = .000), R2 = .988, adjusted R2 = .988,
se = 6,861, t = 40.04 (p = .000)
ŷ = 35,769 + 10,473 Period - 26.08 Period²

R2 = 83.0%   se = 190.1374
[Figure: plot of Union Members (14,500 to 17,000) versus Year, 1980 to 2010]
17.12 [Figure: Year line fit plot of Loans (0 to 1,200) versus Year]
Trend Model:  R2 = 33.0,  t = 1.86 (p = .106),  F = 3.44 (p = .106)

Quadratic Model:  R2 = 90.6,  F = 28.95 (p = .0008)
The graph indicates a quadratic fit rather than a linear fit. The
quadratic model produced an R2 = 90.6 compared to R2 = 33.0 for
linear trend indicating a much better fit for the quadratic model. In
addition, the standard error of the estimate drops from 96.80 to 39.13
with the quadratic model along with the t values becoming significant.
17.13
Month
Broccoli
Jan.(yr. 1)
132.5
Feb.
164.8
Mar.
141.2
Apr.
133.8
May
138.4
June
150.9
12-Mo. Mov.Tot.
2-Yr.Tot.
TC
SI
1655.2
July
146.6
3282.8
136.78
3189.7
132.90
3085.0
128.54
3034.4
126.43
2996.7
124.86
2927.9
122.00
93.30
1627.6
Aug.
90.47
146.9
1562.1
Sept.
92.67
138.7
1522.9
Oct.
98.77
128.0
1511.5
Nov.
111.09
112.4
1485.2
Dec.
100.83
121.0
1442.7
Jan.(yr. 2)
113.52
104.9
2857.8
119.08
2802.3
116.76
2750.6
114.61
2704.8
112.70
2682.1
111.75
2672.7
111.36
1415.1
Feb.
117.58
99.3
1387.2
Mar.
112.36
102.0
1363.4
Apr.
92.08
122.4
1341.4
May
99.69
112.1
1340.7
June
102.73
108.4
1332.0
July
119.0
Aug.
119.0
Sept.
114.9
Oct.
106.0
Nov.
111.7
Dec.
112.3
17.14
Month
C
Ship
12m tot
2yr tot
TC
SI
TCI
Jan(Yr1) 1891
1952.50
2042.72
Feb
1986
1975.73
2049.87
Mar
1987
1973.78
2057.02
Apr
1987
1972.40
2064.17
May
2000
1976.87
2071.32
June
2082
1982.67
2078.46
23822
July
1878
2085.61
94.49
47689
1987.04
94.51 1970.62
23867
Aug
2074
2092.76
96.13
47852
48109
48392
23985
Sept
2086
2099.91
95.65
24124
Oct
2045
2107.06
93.48
24268
Nov
1945
2114.20
95.76
48699
2029.13
95.85 2024.57
49126
2046.92
90.92 2002.80
24431
Dec
1861
2121.35
94.41
24695
Jan(Yr2) 1936
93.91
49621
2067.54
93.64 1998.97
2128.50
24926
Feb
2104
2135.65
98.01
49989
50308
25063
Mar
2126
2142.80
98.56
25245
Apr
98.39
2131
50730
2149.94
51132
2157.09
51510
2164.24
51973
2165.54
2171.39
25485
May
99.11
2163
25647
June
2346
103.23
25863
July
2109
101.92
26110
97.39 2213.01
Aug
2211
2178.54
98.45
52346
52568
26236
Sept
2268
2185.68
99.91
26332
Oct
2285
100.37
52852
2192.83
26520
Nov
2107
2199.98
99.69
53246
2218.58
94.97 2193.19
53635
2234.79
26909
92.94 2235.26
26726
Dec
2077
2207.13 101.27
Jan(Yr3) 2183
101.79
53976
2249.00
27067
97.07 2254.00
2214.28
Feb
2230
2221.42
99.87
54380
2265.83
27313
98.42 2218.46
Mar
2222
2228.57
99.04
54882
2286.75
27569
97.17 2207.21
Apr
2319
102.96
55355
2306.46 100.54 2301.97
27786
2235.72
May
2369
104.40
55779
2324.13 101.93 2341.60
27993
2242.87
June
2529
2250.02 107.04
56186
28193
July
2267
105.39
56539
28346
2355.79
96.23 2378.80
2257.17
Aug
2457
2264.31 105.26
56936
57504
28590
Sept
2524
2271.46 106.99
28914
Oct
2502
105.76
58075
2278.61
58426
2434.42
2285.76
29161
Nov
2314
105.38
95.05 2408.66
29265
Dec
2277
2292.91 106.87
58573
2440.54
93.30 2450.50
29308
Jan(Yr4) 2336
104.87
58685
2445.21
95.53 2411.98
2300.05
58815
2307.20
58806
2314.35
58793
2321.50
58920
2328.65
29377
Feb
2474
106.67
29438
Mar
2546
109.28
29368
Apr
2566
109.72
29425
May
2473
104.97
29495
June
2572
104.86
59018
2335.79
59099
2462.46
2342.94
29523
July
2336
104.62
94.86 2451.21
29576
Aug
2518
2350.09 103.93
59141
59106
2462.75
58933
58779
2449.13
97.34 2481.52
58694
2445.58
94.25 2480.63
29565
Sept
2454
2357.24 100.24
99.64 2362.80
29541
Oct
2559
2364.39 104.25
29392
Nov
2384
2371.53 104.64
29387
Dec
2305
2378.68 104.29
29307
Jan(Yr5) 2389
103.39
58582
2440.92
97.87 2466.70
2385.83
29275
Feb
2463
2392.98 102.39
29268
58543
Mar
2522
2400.13 104.38
58576
29308
Apr
99.67
2417
58587
2441.13
99.01 2399.25
2407.27
58555
2414.42
29279
May
2468
101.04
29276
June
2492
2421.57
98.00
58458
29182
July
99.54
2304
58352
2431.33
94.76 2417.63
2428.72
29170
Aug
2511
2435.87
99.99
58258
57922
57658
57547
2397.79
29088
Sept
2494
2443.01
98.29
28834
Oct
2530
2450.16
99.46
28824
Nov
2381
2457.31 100.86
99.30 2478.40
28723
Dec
96.55
2211
57400
2391.67
92.45 2379.47
2464.46
28677
Jan(Yr6) 2377
99.30
57391
2391.29
99.40 2454.31
2471.61
28714
Feb
2381
2478.76
95.56
57408
2392.00
99.54 2368.68
57346
2389.42
94.92 2252.91
28694
Mar
2268
2485.90
90.63
28652
Apr
95.84
2407
57335
2493.05
57362
2390.08
99.03 2339.63
2500.20
57424
2507.35
28683
May
93.58
2367
28679
June
92.90
2446
28745
July
2341
Aug
2491
Sept
2452
Oct
2561
Nov
2377
Dec
2277
Seasonal Indexing:
Month
Year1
Year2
Year3
Year4
Year5
Year6
Index
Jan
93.64
97.07
95.53
97.87
99.40
96.82
Feb
101.01
99.54 100.49
Mar
101.42
94.92 100.64
Apr
May
June
96.23
94.86
July
94.51
Aug
104.02
103.06
Sept
104.60
103.55 105.34
99.64 103.34
103.83
Oct
101.42
103.76 103.40
104.21 105.31
103.79
Nov
95.85
94.97
95.05
97.24
99.30
96.05
Dec
90.92
92.94
93.30
94.25
92.45
92.90
Total
94.76
95.28
1199.69
Month   Index
Jan     96.85
Feb     100.52
Mar     100.67
Apr     100.74
May     101.17
June    105.01
July    95.30
Aug     103.09
Sept    103.86
Oct     103.82
Nov     96.07
Dec     92.92
ŷ = 2035.58 + 7.1481x

R2 = .682, se = 102.9

Note: The trend line was determined after seasonal effects were removed (based on the TCI column).
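The fitted trend line reported above can be evaluated for any period number. A minimal Python sketch, assuming periods are numbered 1, 2, 3, … (the function name and the period values used below are illustrative, not from the text):

```python
# Evaluate the reported deseasonalized trend line y-hat = 2035.58 + 7.1481x.
# The coefficients come from the text; the period numbers are illustrative.
def trend_forecast(x, intercept=2035.58, slope=7.1481):
    """Return the trend value for time period x (x = 1, 2, 3, ...)."""
    return intercept + slope * x

print(round(trend_forecast(1), 2))   # trend value for the first period
print(round(trend_forecast(72), 2))  # trend value for the 72nd period
```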
Predictor    Coef      t-ratio    p
Constant     1.4228    4.57       0.0001
Shelter      0.4925    7.99       0.0000

s = 0.939    R-sq = 72.7%    R-sq(adj) = 71.5%

Food    Housing    ŷ         e          e²
8.5     15.7       9.1555    -0.6555    0.4296
7.8     11.5       7.0868    0.7132     0.5086
4.1     7.2        4.9690    -0.8690    0.7551
2.3     2.7        2.7526    -0.4526    0.2048
3.7     4.1        3.4421    0.2579     0.0665
2.3     4.0        3.3929    -1.0929    1.1944
3.3     3.0        2.9004    0.3996     0.1597
4.0     3.0        2.9004    1.0996     1.2092
4.1     3.8        3.2944    0.8056     0.6490
5.7     3.8        3.2944    2.4056     5.7870
5.8     4.5        3.6391    2.1609     4.6693
3.6     4.0        3.3929    0.2071     0.0429
1.4     2.9        2.8511    -1.4511    2.1057
2.1     2.7        2.7526    -0.6526    0.4259
2.3     2.5        2.6541    -0.3541    0.1254
2.8     2.6        2.7033    0.0967     0.0093
3.2     2.9        2.8511    0.3489     0.1217
2.6     2.6        2.7033    -0.1033    0.0107
2.2     2.3        2.5556    -0.3556    0.1264
2.2     2.2        2.5063    -0.3063    0.0938
2.3     3.5        3.1466    -0.8466    0.7168
3.1     4.0        3.3929    -0.2929    0.0858
1.8     2.2        2.5063    -0.7063    0.4989
2.1     2.5        2.6541    -0.5541    0.3070
3.4     2.5        2.6541    0.7459     0.5564
2.5     3.3        3.0481    -0.5481    0.3004
Total                                   21.1603
Σ(et − et-1)² = 1.873 + 2.503 + 0.173 + 0.505 + 1.825 + 2.228 + 0.490 + 0.0864 + 2.560 + 0.060 + 3.817 + 2.750 + 0.638 + 0.089 + 0.203 + 0.064 + 0.205 + 0.064 + 0.205 + 0.064 + 0.002 + 0.292 + 0.307 + 0.171 + 0.023 + 1.690 + 1.674 = 24.561

Σet² = 21.160

D = Σ(et − et-1)² / Σet² = 24.561/21.160 = 1.16
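The Durbin-Watson statistic used throughout these problems is a simple ratio of two sums; a minimal Python sketch with illustrative residuals (not this problem's data):

```python
# Durbin-Watson statistic: D = sum((e_t - e_{t-1})^2) / sum(e_t^2).
def durbin_watson(residuals):
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den

e = [1.0, -0.5, 0.8, -0.2, 0.3]  # illustrative residuals
print(round(durbin_watson(e), 3))
```

Values near 2 indicate no autocorrelation; values well below 2 suggest positive autocorrelation.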
Housing

Predictor     Coef       t-ratio    p
Constant      3.1286     10.30      0.000
First Diff    -0.2003    -1.09      0.287

s = 1.44854    R-sq = 4.9%    R-sq(adj) = 0.8%

Food    Housing    First Diff
8.5     15.7
7.8     11.5       -4.2
4.1     7.2        -4.3
2.3     2.7        -4.5
3.7     4.1        1.4
2.3     4.0        -0.1
3.3     3.0        -1.0
4.0     3.0        0.0
4.1     3.8        0.8
5.7     3.8        0.0
5.8     4.5        0.7
3.6     4.0        -0.5
1.4     2.9        -1.1
2.1     2.7        -0.2
2.3     2.5        -0.2
2.8     2.6        0.1
3.2     2.9        0.3
2.6     2.6        -0.3
2.2     2.3        -0.3
2.2     2.2        -0.1
2.3     3.5        1.3
3.1     4.0        0.5
1.8     2.2        -1.8
2.1     2.5        0.3
3.4     2.5        0.0
2.5     3.3        0.8
ŷ for x = 150:  ŷ = 21,881 (million $)

R2 = 37.9%    adjusted R2 = 34.1%    se = 13,833    F = 9.78, p = .006    D = 2.49
The critical table values for k = 1 and n = 18 are dL = 1.16 and dU = 1.39. Since the observed value of D = 2.49 is above dU, the decision is to fail to reject the null hypothesis. There is no significant autocorrelation.
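The decision rule applied above can be sketched as follows; the function is a hypothetical convenience, with the critical values dL and dU taken from this problem:

```python
# One-sided Durbin-Watson decision (test for positive autocorrelation):
# reject H0 if D < dL, fail to reject if D > dU, inconclusive in between.
def dw_decision(D, dL, dU):
    if D < dL:
        return "reject H0"
    if D > dU:
        return "fail to reject H0"
    return "inconclusive"

# Values from this problem: D = 2.49, dL = 1.16, dU = 1.39.
print(dw_decision(2.49, 1.16, 1.39))
```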
Failed Bank Assets    Number of Failures    ŷ           e            e²
8,189                 11                    2,882.8     5,306.2      28,155,356
104                   7                     2,336.1     -2,232.1     4,982,296
1,862                 34                    6,026.5     -4,164.5     17,343,453
4,137                 45                    7,530.1     -3,393.1     11,512,859
36,394                79                    12,177.3    24,216.7     586,449,390
3,034                 118                   17,507.9    -14,473.9    209,494,371
7,609                 144                   21,061.7    -13,452.7    180,974,565
7,538                 201                   28,852.6    -21,314.6    454,312,622
56,620                221                   31,586.3    25,033.7     626,687,597
28,507                206                   29,536.0    -1,029.0     1,058,894
10,739                159                   23,111.9    -12,372.9    153,089,247
43,552                108                   16,141.1    27,410.9     751,357,974
16,915                100                   15,047.6    1,867.4      3,487,085
2,588                 42                    7,120.0     -4,532.0     20,539,127
825                   11                    2,882.8     -2,057.8     4,234,697
753                   6                     2,199.4     -1,446.4     2,092,139
186                   5                     2,062.7     -1,876.7     3,522,152
27                    1                     1,516.0     -1,489.0     2,217,144
17.18
11
7
1,862
34
27
4,137
45
11
36,394
79
34
3,034
118
39
7,609
144
26
7,538
201
57
56,620
221
20
28,507
206
-15
10,739
159
-47
43,552
108
-51
16,915
100
-8
2,588
42
-58
825
11
-31
753
-5
186
-1
27
-4
R2 = 0.0%    adjusted R2 = 0.0%    F = 0.00, p = .958    se = 18,091.7    D = 1.57
17.19
Starts    lag1     lag2
333.0     *        *
270.4     333.0    *
281.1     270.4    333.0
443.0     281.1    270.4
432.3     443.0    281.1
428.9     432.3    443.0
443.2     428.9    432.3
413.1     443.2    428.9
391.6     413.1    443.2
361.5     391.6    413.1
318.1     361.5    391.6
308.4     318.1    361.5
382.2     308.4    318.1
419.5     382.2    308.4
453.0     419.5    382.2
430.3     453.0    419.5
468.5     430.3    453.0
464.2     468.5    430.3
521.9     464.2    468.5
550.4     521.9    464.2
529.7     550.4    521.9
556.9     529.7    550.4
606.5     556.9    529.7
670.1     606.5    556.9
745.5     670.1    606.5
756.1     745.5    670.1
826.8     756.1    745.5
F = 198.67    se = 70.84

The model with 1 lag is the best model, with a strong R2 = 89.2%. The model with 2 lags is also relatively strong.

The F value for this model is 27.0, which is significant at α = .001. The value of R2 is 56.2%, which denotes modest predictability. The adjusted R2 is 54.2%. The standard error of the estimate is 216.6. The Durbin-Watson statistic is 1.70, indicating that there is no significant autocorrelation in this model.
17.21
Year    Price    a.) Index1950    b.) Index1980
1950    22.45    100.0            32.2
1955    31.40    139.9            45.0
1960    32.33    144.0            46.4
1965    36.50    162.6            52.3
1970    44.90    200.0            64.4
1975    61.24    272.8            87.8
1980    69.75    310.7            100.0
1985    73.44    327.1            105.3
1990    80.05    356.6            114.8
1995    84.61    376.9            121.3
2000    87.28    388.8            125.1
2005    89.56    398.9            128.4
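Each index number above is the year's price divided by the base-year price, times 100; a minimal Python sketch checked against two rows of the table:

```python
# Simple price index: index_t = 100 * price_t / price_base.
def simple_index(price, base_price):
    return round(100 * price / base_price, 1)

print(simple_index(22.45, 22.45))  # 1950 on the 1950 base
print(simple_index(69.75, 22.45))  # 1980 on the 1950 base
```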
17.22
Year    Patents    Index
1980    66.2       66.8
1981    71.1       71.7
1982    63.3       63.9
1983    62.0       62.6
1984    72.7       73.4
1985    77.2       77.9
1986    76.9       77.6
1987    89.4       90.2
1988    84.3       85.1
1989    102.5      103.4
1990    99.1       100.0
1991    106.7      107.7
1992    107.4      108.4
1993    109.7      110.7
1994    113.6      114.8
1995    113.8      115.3
1996    121.7      122.8
1997    124.1      125.2
1998    163.1      164.6
1999    169.1      170.6
2000    176.0      177.6
2001    184.0      185.7
2002    184.4      186.1
2003    187.0      188.7
2004    181.3      182.9
17.23
Year      1993    1999    2005
          1.53    1.40    2.17
          2.21    2.15    2.51
          1.92    2.68    2.60
          3.38    3.10    4.00
Totals    9.04    9.33    11.28
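The index numbers computed next divide each year's total by the 1993 base total; a minimal Python sketch using the four item prices from this problem:

```python
# Unweighted aggregate price index: 100 * sum(P_t) / sum(P_base).
def aggregate_index(prices_t, prices_base):
    return round(100 * sum(prices_t) / sum(prices_base), 1)

p1993 = [1.53, 2.21, 1.92, 3.38]  # totals 9.04
p2005 = [2.17, 2.51, 2.60, 4.00]  # totals 11.28
print(aggregate_index(p2005, p1993))
```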
Index1993 = (9.04/9.04)(100) = 100.0

Index1999 = (9.33/9.04)(100) = 103.2

Index2005 = (11.28/9.04)(100) = 124.8

17.24
Year
1998
2005
1999
2000
2001
2002
2003
2004
2006
1.10
1.16
2.89
1.23
1.23
1.08
1.56
1.85
2.59
1.58
1.61
1.78
1.77
1.61
1.71
1.90
2.05
1.80
1.82
1.98
1.96
1.94
1.90
1.92
1.94
7.95
7.96
8.24
8.21
8.19
8.05
8.12
8.10
2.08
1.96
8.24
Totals (1998–2006): 12.43, 12.55, 13.23, 13.17, 12.82, 13.22, 13.79, 14.68, 15.17

Index1998 = (12.43/13.23)(100) = 94.0
Index1999 = (12.55/13.23)(100) = 94.9
Index2000 = (13.23/13.23)(100) = 100.0
Index2001 = (13.17/13.23)(100) = 99.5
Index2002 = (12.82/13.23)(100) = 96.9
Index2003 = (13.22/13.23)(100) = 99.9
Index2004 = (13.79/13.23)(100) = 104.2
Index2005 = (14.68/13.23)(100) = 111.0
Index2006 = (15.17/13.23)(100) = 114.7

17.25
Quantity
Price
Price
2000
Price
2004
Price
Item
2000
2005
21
0.50
0.67
0.68
0.71
1.23
1.85
1.90
1.91
17
0.84
0.75
0.75
0.80
43
0.15
0.21
0.25
0.25
2006
P2000Q2000    P2004Q2000    P2005Q2000    P2006Q2000
10.50         14.07         14.28         14.91
7.38          11.10         11.40         11.46
14.28         12.75         12.75         13.60
6.45          9.03          10.75         10.75
Totals: 38.61    46.95    49.18    50.72

Index2004 = (ΣP2004Q2000/ΣP2000Q2000)(100) = (46.95/38.61)(100) = 121.6

Index2005 = (ΣP2005Q2000/ΣP2000Q2000)(100) = (49.18/38.61)(100) = 127.4

Index2006 = (ΣP2006Q2000/ΣP2000Q2000)(100) = (50.72/38.61)(100) = 131.4

17.26
Item    Price 2000    Price 2005    Quantity 2005    Price 2006    Quantity 2006
1       22.50         27.80         13               28.11         12
2       10.90         13.10         5                13.25         8
3       1.85          2.25          41               2.35          44

P2000Q2005    P2000Q2006    P2005Q2005    P2006Q2006
292.50        270.00        361.40        337.32
54.50         87.20         65.50         106.00
75.85         81.40         92.25         103.40
Totals: 422.85    438.60    519.15    546.72

Index2005 = (ΣP2005Q2005/ΣP2000Q2005)(100) = (519.15/422.85)(100) = 122.8

Index2006 = (ΣP2006Q2006/ΣP2000Q2006)(100) = (546.72/438.60)(100) = 124.7

17.27 a)

F = 219.24, p = .000    R2 = 90.9%    se = .3212
F = 176.21, p = .000    R2 = 94.4%    se = .2582
b)
10.08
10.05
9.24
9.23
9.69
9.65
.04
9.55
9.37
9.55
9.43
.00
.06
8.55
9.46
.91
8.36
9.29
.93
8.59
8.96
.37
7.99
8.72
.73
8.12
8.37
.25
7.91
8.27
.36
7.73
8.15
.42
7.39
7.94
.55
7.48
7.79
.31
7.52
7.63
.11
7.48
7.53
.05
7.35
7.47
.12
7.04
7.46
.42
6.88
7.35
.47
6.88
7.19
.31
7.17
7.04
.13
7.22
6.99
.23
Σ|e| = 6.77

MAD = 6.77/20 = .3385

c)  Exponential smoothing with α = .3 and with α = .7:
x        F(α=.3)    |e|    F(α=.7)    |e|
10.08
10.05 10.08
.03
10.08 .03
9.24 10.07
.83
10.06 .82
9.23
9.82
.59
9.49 .26
9.69
9.64
.05
9.31 .38
9.55
9.66
.11
9.58 .03
9.37
9.63
.26
9.56 .19
8.55
9.55 1.00
9.43 .88
8.36
9.25
.89
8.81 .45
7.99
8.86
.87
8.98
8.50
8.56 .57
8.12
8.60
.48
8.16 .04
7.91
8.46
.55
8.13 .22
7.73
8.30
.57
7.98 .25
7.39
8.13
.74
7.81 .42
8.59
.39
.09
7.48
7.91
.43
7.52 .04
7.52
7.78
.26
7.49 .03
7.48
7.70
.22
7.51 .03
7.35
7.63
.28
7.49 .14
7.04
7.55
.51
7.39 .35
6.88
7.40
.52
7.15 .27
6.88
7.24
.36
6.96 .08
7.17
7.13
.04
6.90 .27
7.22
7.14
.08
7.09 .13
Σ|e|(α=.3) = 10.06;  MADα=.3 = 10.06/23 = .4374

Σ|e|(α=.7) = 5.97;   MADα=.7 = 5.97/23 = .2596
d) α = .7 produces the lowest error (MAD = .2596 from part c).
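Exponential smoothing updates each forecast as F(t+1) = α·x(t) + (1 − α)·F(t), and MAD averages the absolute one-step errors; a minimal Python sketch on a short illustrative series (not this problem's data):

```python
# Simple exponential smoothing with MAD computed on one-step-ahead errors.
# The first forecast is seeded with the first observed value.
def exp_smooth_mad(series, alpha):
    forecast = series[0]
    errors = []
    for x in series[1:]:
        errors.append(abs(x - forecast))
        forecast = alpha * x + (1 - alpha) * forecast
    return sum(errors) / len(errors)

data = [10.0, 9.0, 9.5, 8.5]  # illustrative series
print(round(exp_smooth_mad(data, 0.5), 3))
```

Running this for several α values and keeping the smallest MAD mirrors the comparison done in part d.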
e)
4 period
TCSI
moving tots
10.08
10.05
38.60
8 period
moving tots
TC
SI
9.24
76.81
9.60
96.25
75.92
9.49
97.26
75.55
9.44
102.65
75.00
9.38
101.81
72.99
9.12
102.74
70.70
8.84
96.72
68.36
8.55
97.78
66.55
8.32
103.25
65.67
8.21
97.32
64.36
8.05
100.87
62.90
7.86
100.64
61.66
7.71
100.26
60.63
7.58
97.49
38.21
9.23
37.71
9.69
37.84
9.55
37.16
9.37
35.83
8.55
34.87
8.36
33.49
8.59
33.06
7.99
32.61
8.12
31.75
7.91
31.15
7.73
30.51
7.39
30.12
7.48
59.99
7.50
99.73
59.70
7.46
100.80
59.22
7.40
101.08
58.14
7.27
101.10
56.90
7.11
99.02
56.12
7.02
98.01
56.12
7.02
98.01
29.87
7.52
29.83
7.48
29.39
7.35
28.75
7.04
28.15
6.88
27.97
6.88
28.15
7.17
7.22
1st Period
102.65
98.01
2nd Period
98.01
3rd Period
96.25 102.74
4th Period
97.26
97.32
96.72 100.87
97.49 101.10
99.73
99.02
The highs and lows of each period (underlined) are eliminated, and the remaining values are averaged, resulting in:
Seasonal Indexes:   1st 99.82    2nd 101.05    3rd 98.64    4th 98.67    Total 398.18

Since the total is not 400, adjust each seasonal index by multiplying by 400/398.18:

1st 100.28    2nd 101.51    3rd 99.09    4th 99.12
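The adjustment step multiplies every raw index by 400/398.18; a minimal Python sketch with the four raw indexes reported above:

```python
# Rescale quarterly seasonal indexes so they total 400.
raw = [99.82, 101.05, 98.64, 98.67]  # totals 398.18
adjusted = [round(i * 400 / sum(raw), 2) for i in raw]
print(adjusted)
```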
17.28
Year    Quantity    Index Number
1992    2073        100.0
1993    2290        110.5
1994    2349        113.3
1995    2313        111.6
1996    2456        118.5
1997    2508        121.1
1998    2463        118.8
1999    2499        120.5
2000    2520        121.6
2001    2529        122.0
2002    2483        119.8
2003    2467        119.0
2004    2397        115.6
2005    2351        113.4
2006    2308        111.3

17.29
Item
2002
2003
2004
3.21
3.37
0.51
0.55
2005
3.80
0.68
2006
3.73
0.62
0.59
0.83
0.90
0.91
1.02
1.06
1.30
1.32
1.33
1.32
1.30
1.67
1.72
1.90
1.99
1.98
0.62
0.67
0.70
0.72
0.71
3.65

Totals: 2002: 8.14, 2003: 8.53, 2004: 9.32, 2005: 9.40, 2006: 9.29

Index2002 = (8.14/8.14)(100) = 100.0
Index2003 = (8.53/8.14)(100) = 104.8
Index2004 = (9.32/8.14)(100) = 114.5
Index2005 = (9.40/8.14)(100) = 115.5
Index2006 = (9.29/8.14)(100) = 114.1

17.30
2003
Item
2004
2.75
12
2.98
0.85
47
1.33
20
Laspeyres:
Totals
2005
Q
9
3.21 11
0.89 52
0.95 61
0.98 66
1.32 28
1.36 25
1.40 32
P2003Q2003
3.10
2006
P2006Q2003
33.00
38.52
39.95
46.06
26.60
28.00
99.55
112.58
Laspeyres Index2006 = (ΣP2006Q2003/ΣP2003Q2003)(100) = (112.58/99.55)(100) = 113.1

Paasche (2005):

P2003Q2005    P2005Q2005
24.75         27.90
51.85         57.95
33.25         34.00
Totals: 109.85    119.85

Paasche Index2005 = (ΣP2005Q2005/ΣP2003Q2005)(100) = (119.85/109.85)(100) = 109.1
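Laspeyres indexes weight prices by base-period quantities while Paasche indexes use current-period quantities; a minimal Python sketch with illustrative prices and quantities (not this problem's data):

```python
# Weighted aggregate price index: 100 * sum(p_t * q) / sum(p_0 * q),
# where q is base-period quantities (Laspeyres) or current-period
# quantities (Paasche).
def weighted_index(p_t, p_0, q):
    return round(100 * sum(p * w for p, w in zip(p_t, q)) /
                 sum(p * w for p, w in zip(p_0, q)), 1)

p0, pt = [2.0, 3.0], [2.5, 3.3]  # illustrative base and current prices
q0, qt = [10, 5], [12, 4]        # illustrative base and current quantities
print(weighted_index(pt, p0, q0))  # Laspeyres
print(weighted_index(pt, p0, qt))  # Paasche
```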
17.31
b) = .2
a) moving average
Year
Quantity
1980
6559
1981
6022
6022.00
1982
6439
6022.00
1983
6396
6340.00
1984
6405
56.00 6105.40
290.60
6285.67
119.33 6163.52
1985
6391
6413.33
22.33 6211.82
179.18
6397.33
245.33 6247.65
95.65
241.48
1986
6152
1987
7034
6316.00
718.00 6228.52
805.48
1988
7400
6525.67
874.33 6389.62
1010.38
1989
8761
6862.00
1899.00 6591.69
2169.31
1990
9842
7731.67
2110.33 7025.56
2816.45
1991
10065
8667.67
1397.33 7588.84
2476.16
1992
10298
9556.00
742.00 8084.08
2213.93
1993
10209
10068.33
140.67
8526.86
1994
10500
10190.67
309.33
8863.29
1995
9913
10335.67
422.67
9190.63
9644
10207.33
563.33
9335.10
1682.14
1636.71
722.37
1996
308.90
1997
9952
10019.00
67.00
9396.88
1998
9333
9836.33
503.33 9507.91
174.91
1999
9409
9643.00
234.00 9472.93
63.93
2000
9143
9564.67
421.67 9460.14
317.14
2001
9512
9295.00
217.00 9396.71
115.29
2002
9430
9354.67
75.33 9419.77
10.23
9513
9361.67
151.33 9421.82
91.18
10085
9485.00
600.00
555.12
2003
2004
c)
Σ|e| for exponential smoothing with α = .2 = 18,621.46

MADα=.2 = Σ|e|/number of forecasts = 18,621.46/22 = 846.43

Σ|e| for the three-year moving average = 11,889.67

MADmoving average = 11,889.67/22 = 540.44

The three-year moving average (MAD = 540.44) did a better job of forecasting the data than exponential smoothing with α = .2 (MAD = 846.43). Using MAD as the criterion, the three-year moving average was a better forecasting tool than exponential smoothing with α = .2.
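The moving-average MAD above averages the absolute errors of three-year-average forecasts; a minimal Python sketch using the first five annual values of this series:

```python
# MAD for a three-period moving-average forecast: each forecast is the
# mean of the previous `window` values.
def moving_avg_mad(series, window=3):
    errors = [abs(series[i] - sum(series[i - window:i]) / window)
              for i in range(window, len(series))]
    return sum(errors) / len(errors)

data = [6559, 6022, 6439, 6396, 6405]  # first five values of this problem
print(round(moving_avg_mad(data), 2))
```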
17.32-17.34
Month
Chem
Jan(1)
23.701
Feb
12m tot
2yr tot
TC
SI
TCI
24.189
Mar
24.200
Apr
24.971
May
24.560
June
24.992
288.00
July
94.08
22.566
23.872 23.917
575.65
23.985
287.65
Aug
24.037
575.23
24.134 23.919
23.968
100.29
287.58
Sept
25.047
576.24
24.047 23.921
24.010 104.32
288.66
Oct
24.115
577.78
24.851 23.924
24.074 100.17
289.12
Nov
23.034
578.86
24.056 23.926
24.119
95.50
289.74
Dec
22.590
23.731 23.928
580.98 24.208
93.32
24.333
95.95
24.423
98.77
291.24
Jan(2)
23.347
584.00
24.486 23.931
292.76
Feb
24.122
586.15
24.197 23.933
293.39
Mar
25.282
587.81
23.683 23.936
24.492 103.23
294.42
Apr
25.426
23.938
24.450
294.63
May
25.185
590.05
24.938 23.940
24.585 102.44
295.42
June
26.486
592.63
24.763 23.943
24.693 107.26
297.21
July
24.088
595.28
25.482 23.945
24.803
97.12
24.908
99.05
298.07
Aug
24.672
597.79
24.771 23.947
299.72
Sept
26.072
23.950
25.031
302.03
Oct
24.328
605.59
25.070 23.952
25.233
96.41
25.327
94.07
25.440
95.81
25.553
94.73
303.56
Nov
23.826
607.85
24.884 23.955
304.29
Dec
24.373
610.56
25.605 23.957
306.27
Jan(3)
24.207
613.27
25.388 23.959
307.00
Feb
25.772
614.89
25.852 23.962
25.620 100.59
307.89
Mar
27.591
616.92
25.846 23.964
25.705 107.34
309.03
Apr
26.958
619.39
25.924 23.966
25.808 104.46
310.36
May
June
25.920
25.666
622.48
23.969
28.460
625.24
26.608 23.971
25.937
99.93
312.12
26.052 109.24
313.12
July
Aug
24.821
26.257 23.974
25.560
629.12
25.663 23.976
627.35 26.140
94.95
314.23
26.213
97.51
314.89
Sept
27.218
631.53
26.131 23.978
26.314 103.44
316.64
Oct
25.650
635.31
26.432 23.981
26.471
96.90
26.660
95.98
26.835
94.54
26.985
93.82
27.208
97.16
318.67
Nov
25.589
639.84
26.725 23.983
321.17
Dec
25.370
644.03
26.652 23.985
322.86
Jan(4)
25.316
647.65
26.551 23.988
324.79
Feb
26.435
652.98
26.517 23.990
328.19
Mar
29.346
659.95
27.490 23.992
27.498 106.72
331.76
Apr
28.983
666.46
27.871 23.995
27.769 104.37
334.70
May
28.424
672.57
28.145 23.997
28.024 101.43
337.87
June
30.149
28.187 24.000
679.39
28.308 106.50
341.52
July
26.746
686.66
28.294 24.002
28.611
93.48
345.14
Aug
28.966
694.30
29.082 24.004
28.929 100.13
349.16
Sept
30.783
701.34
29.554 24.007
29.223 105.34
352.18
Oct
28.594
706.29
29.466 24.009
29.429
97.16
354.11
Nov
28.762
710.54 29.606
30.039 24.011
97.14
356.43
Dec
29.018
715.50
30.484 24.014
29.813
97.33
30.031
96.34
359.07
Jan(5)
28.931
720.74
30.342 24.016
361.67
Feb
30.456
725.14
30.551 24.019
30.214 100.80
363.47
Mar
32.372
727.79
30.325 24.021
30.325 106.75
364.32
Apr
30.905
730.25
29.719 24.023
30.427 101.57
365.93
May
30.743
733.94
30.442 24.026
30.581 100.53
368.01
June
32.794
738.09
30.660 24.028
30.754 106.63
370.08
July
29.342
Aug
30.765
Sept
31.637
Oct
30.206
Nov
30.842
Dec
31.090
Seasonal Indexing:
Month
Year1
Year2
Year3
Year4
94.73
93.82
Jan
95.95
Feb
98.77 100.59
Year5
Index
96.34
95.34
97.16 100.80
99.68
Mar
106.74
Apr
103.98
May
102.44
100.98
June
106.96
July
94.08
97.12
94.95
93.48
Aug
100.29
99.05
97.51 100.13
Sept
104.32
Oct
100.17
96.41
96.90
97.16
Nov
95.50
94.07
95.98
97.14
Dec
93.32
95.81
94.54
97.33
94.52
99.59
103.98 103.44 105.34
104.15
97.03
95.74
95.18
Total
1199.88
Final Seasonal Indexes:
Month    Index
Jan      95.35
Feb      99.69
Mar      106.75
Apr      103.99
May      100.99
June     106.96
July     94.53
Aug      99.60
Sept     104.16
Oct      97.04
Nov      95.75
Dec      95.19
ŷ = 22.4233 + 0.144974x    R2 = .913
ŷ = 23.8158 + 0.01554x + .000247x²    R2 = .964
In this model, the linear term yields a t = 0.66 with p = .513 but the
squared term predictor yields a t = 8.94 with p = .000.
ŷ = 23.9339 + 0.00236647x²    R2 = .964
Note: The trend model derived using only the squared predictor was used in computing T
(trend) in the decomposition process.
17.35
                       2004               2005               2006
Item                Price   Quantity   Price   Quantity   Price   Quantity
Margarine (lb.)     1.26    21         1.32    23         1.39    22
Shortening (lb.)    0.94               0.97               1.12
                    1.43    70         1.56    68         1.62    65
Cola (2 liters)     1.05    12         1.02    13         1.25    11
                    2.81    27         2.86    29         2.99    28
Totals: 2004: 7.49, 2005: 7.73, 2006: 8.37

Index2004 = (ΣP2004/ΣP2004)(100) = (7.49/7.49)(100) = 100.0
Index2005 = (ΣP2005/ΣP2004)(100) = (7.73/7.49)(100) = 103.2
Index2006 = (ΣP2006/ΣP2004)(100) = (8.37/7.49)(100) = 111.8
Laspeyres:

P2004Q2004    P2005Q2004    P2006Q2004
26.46         27.72         29.19
4.70          4.85          5.60
100.10        109.20        113.40
12.60         12.24         15.00
75.87         77.22         80.73
Totals: 219.73    231.23    243.92

IndexLaspeyres2005 = (ΣP2005Q2004/ΣP2004Q2004)(100) = (231.23/219.73)(100) = 105.2

IndexLaspeyres2006 = (ΣP2006Q2004/ΣP2004Q2004)(100) = (243.92/219.73)(100) = 111.0

Paasche:

P2004Q2005    P2005Q2005    P2004Q2006    P2006Q2006
28.98         30.36         27.726        30.58
2.82          2.91          3.76          4.48
97.24         106.08        92.95         105.30
13.65         13.26         11.55         13.75
81.49         82.94         78.68         83.72
Totals: 224.18    235.55    214.66    237.83

IndexPaasche2005 = (ΣP2005Q2005/ΣP2004Q2005)(100) = (235.55/224.18)(100) = 105.1

IndexPaasche2006 = (ΣP2006Q2006/ΣP2004Q2006)(100) = (237.83/214.66)(100) = 110.8

17.36  ŷ = 9.5382 − 0.2716x

ŷ(7) = 7.637

R2 = 40.2%    F = 12.78, p = .002    se = 0.264862

Durbin-Watson: n = 21, k = 1, α = .05, D = 0.44
17.37 Year
Fma
Fwma
SEMA
SEWMA
1988 118.5
1989 123.0
1990 128.5
1991 133.6
1992 137.5
125.9
128.4
134.56
82.08
1993 141.2
130.7
133.1
111.30
65.93
1994 144.8
135.2
137.3
92.16
56.25
1995 148.5
139.3
141.1
85.10
54.17
1996 152.8
143.0
144.8
96.04
63.52
1997 156.8
146.8
148.8
99.50
64.80
1998 160.4
150.7
152.7
93.61
58.68
1999 163.9
154.6
156.6
86.03
53.14
86.12
188.38
135.26
161.93
100.80
150.06
89.30
137.48
85.56
184.9
SE =
167.70
ΣSE = 1,727.60

MSEma = ΣSE/No. Forecasts = 1,727.60/14 = 123.4
115.78
ΣSE = 1,111.40

MSEwma = ΣSE/No. Forecasts = 1,111.40/14 = 79.39
17.38 The regression model with the one-month lag is:

The model with the four-month lag does not have overall significance and has an adjusted R2 of 1%. This model has virtually no predictability. The model with the one-month lag has relatively strong predictability, with an adjusted R2 of 83.3%. In addition, its F value is significant at α = .001, and its standard error of the estimate is less than 40% as large as the standard error for the four-month lag model.
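A lagged regression pairs each observation with the value one (or more) periods back; a minimal Python sketch of building the lag-1 pairs with illustrative numbers (the regression fit itself is omitted):

```python
# Build a lagged-predictor column for a one-period-lag regression:
# pair each observation with the previous period's value.
series = [118.5, 123.0, 128.5, 133.6, 137.5]  # illustrative series
pairs = list(zip(series[:-1], series[1:]))    # (lagged value, current value)
for lagged, current in pairs:
    print(lagged, current)
```

A four-period lag would pair `series[:-4]` with `series[4:]`, which also shortens the usable sample, one reason longer lags can fit more poorly.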
17.39
Qtr
TSCI
4qrtot
Year1 1
54.019
56.495
8qrtot
TC
SI
TCI
213.574
3
50.169
425.044 53.131
94.43
51.699 53.722
211.470
4
52.891
52.341 55.945
210.076
Year2
51.915
423.402 52.925
98.09
52.937 58.274
213.326
2
55.101
53.063 60.709
217.671
3
53.419
440.490 55.061
97.02
55.048 63.249
56.641 65.895
222.819
4
57.236
230.206
Year3 1 57.063
467.366 58.421
97.68
58.186 68.646
60.177 71.503
237.160
2 62.488
3 60.373
492.176 61.522
98.13
62.215 74.466
62.676 77.534
248.918
4 63.334
254.810
Year4 1 62.723
512.503 64.063
97.91
63.957 80.708
65.851 83.988
257.693
2 68.380
260.805
3 63.256
524.332 65.542
96.51
65.185 87.373
65.756 90.864
263.527
4 66.446
263.158
Year5 1 65.445
526.305 65.788
99.48
66.733 94.461
65.496 98.163
263.147
2 68.011
263.573
3 63.245
521.415 65.177
97.04
65.174 101.971
66.177 105.885
257.842
4 66.872
253.421
Year6 1 59.714
501.685 62.711
95.22
60.889 109.904
61.238 114.029
Year2
Year5
248.264
2 63.590
3 58.088
4 61.443
Quarter    Year1     Year2     Year3     Year4     Year5     Year6     Index
1                    98.09     97.68     97.91     99.48     95.22     97.89
2                    102.28    104.06    105.51    103.30    103.59    103.65
3          94.43     97.02     98.13     96.51     97.04               96.86
4          100.38    101.07    100.58    100.93    104.64              100.86
                                                             Total     399.26

Final Seasonal Indexes (adjusted by 400/399.26 = 1.00185343):

Quarter    Index
1          98.07
2          103.84
3          97.04
4          101.05
Total      400.00
17.40
Time Period    Deseasonalized Data
Q1(yr1)        55.082
Q2             54.406
Q3             51.699
Q4             52.341
Q1(yr2)        52.937
Q2             53.063
Q3             55.048
Q4             56.641
Q1(yr3)        58.186
Q2             60.177
Q3             62.215
Q4             62.676
Q1(yr4)        63.957
Q2             65.851
Q3             65.185
Q4             65.756
Q1(yr5)        66.733
Q2             65.496
Q3             65.174
Q4             66.177
Q1(yr6)        60.889
Q2             61.238
Q3             59.860
Q4             60.805
17.41 Linear Model:   ŷ = 53.41032 + 0.532488x    R2 = 55.7%    se = 3.43

Quadratic Model:   R2 = 76.6%    se = 2.55
In the quadratic regression model, both the linear and squared terms
have significant t statistics at alpha .001 indicating that both are
contributing. In addition, the R2 for the quadratic model is
considerably higher than the R2 for the linear model. Also, se is smaller
for the quadratic model. All of these indicate that the quadratic model
is a stronger model.
17.42 R2 = 55.8%    se = 50.18

This model with a lag of one year has modest predictability. The overall F is significant at α = .05 but not at α = .01.
R2 = 88.2%
se = 582.685
D = 0.84
17.44
Year    PurPwr    F(α=.1)    |e|    F(α=.5)    |e|    F(α=.8)    |e|
6.04
5.92
6.04
.12
6.04
.12
6.04
.12
5.57
6.03
.46
5.98
.41
5.94
.37
5.40
5.98
.58
5.78
.38
5.64
.24
5.17
5.92
.75
5.59
.42
5.45
.28
5.00
5.85
.85
5.38
.38
5.23
.23
4.91
5.77
.86
5.19
.28
5.05
.14
4.73
5.68
.95
5.05
.32
4.94
.21
4.55
5.59
1.04
4.89
.34
4.77
.22
10
4.34
5.49
1.15
4.72
.38
4.59
.25
11
4.67
5.38
.71
4.53
.14
4.39
.28
12
5.01
5.31
.30
4.60
.41
4.61
.40
13
4.86
5.28
.42
4.81
.05
4.93
.07
14
4.72
5.24
.52
4.84
.12
4.87
.15
15
4.60
5.19
.59
4.78
.18
4.75
.15
16
4.48
5.13
.65
4.69
.21
4.63
.15
17
4.86
5.07
.21
4.59
.27
4.51
.35
18    5.15    5.05    .10    4.73    .42    4.79    .36

Σ|e|(α=.1) = 10.26;   MAD1 = 10.26/17 = .60
Σ|e|(α=.5) = 4.83;    MAD2 = 4.83/17 = .28
Σ|e|(α=.8) = 3.97;    MAD3 = 3.97/17 = .23
17.45 The model is:
et            et − et-1    (et − et-1)²    et²
-1,338.58                                  1,791,796
- 8,588.28
- 7,249.7
52,558,150
73,758,553
- 7,050.61
1,537.7
2,364,521
49,711,101
1,115.01
8,165.6
66,677,023
1,243,247
12,772.28
11,657.3
135,892,643
163,131,136
14,712.75
1,940.5
3,765,540
216,465,013
- 3,029.45
-17,742.2
314,785,661
9,177,567
- 2,599.05
430.4
185,244
6,755,061
622.39
3,221.4
10,377,418
387,369
9,747.30
9,124.9
83,263,800
95,009,857
458.5
210,222
86,282,549
- 434.76
- 9,723.6
94,548,397
189,016
-10,875.36
-10,440.6
109,006,128
118,273,455
- 9,808.01
1,067.4
1,139,343
96,197.060
- 4,277.69
5,530.3
30,584,218
18,298,632
4,020.9
16,167,637
65,946
9,288.84
256.80
Σ(et − et-1)² = 921,525,945

Σet² = 936,737,358

D = Σ(et − et-1)²/Σet² = 921,525,945/936,737,358 = 0.98