Instructor's Solutions Manual - Chapter 12
Chapter 12 Solutions

Develop Your Skills 12.1
1. We are looking for evidence of a decrease in the proportion of on-time flights after
the merger. Call the population of flights before the merger population 1, and the
population of flights after the merger population 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$

$\alpha = 0.04$

$\hat{p}_1 = \frac{85}{100} = 0.85$, $n_1 = 100$, $\hat{p}_2 = \frac{78}{100} = 0.78$, $n_2 = 100$

Sampling is done without replacement, but it is likely that the airline handles many
thousands of flights, so we can still use the binomial distribution as the appropriate
underlying model.

Check for normality of the sampling distribution:
$n_1\hat{p}_1 = 100(0.85) = 85 > 10$
$n_1\hat{q}_1 = 100(1 - 0.85) = 100(0.15) = 15 > 10$
$n_2\hat{p}_2 = 100(0.78) = 78 > 10$
$n_2\hat{q}_2 = 100(1 - 0.78) = 100(0.22) = 22 > 10$

Since the null hypothesis is that there is no difference in the proportions, we can pool
the sample data to estimate $\hat{p}$.

$$\hat{p} = \frac{85 + 78}{100 + 100} = 0.815$$

We calculate the z-score as:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.85 - 0.78) - 0}{\sqrt{(0.815)(1 - 0.815)\left(\frac{1}{100} + \frac{1}{100}\right)}} = \frac{0.07}{0.054913568} = 1.2747$$

p-value = $P(z > 1.27) = 1 - 0.8980 = 0.1020$

Since p-value > $\alpha$, fail to reject $H_0$. There is insufficient evidence to infer that the
proportion of on-time flights decreased after the merger.
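
The pooled test above can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names `normal_cdf` and `pooled_two_proportion_z` are hypothetical, and the normal probability is computed from the error function rather than read from a table.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z-score for H0: p1 - p2 = 0, using the pooled estimate of p."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error

# 85 of 100 flights on time before the merger, 78 of 100 after
z = pooled_two_proportion_z(85, 100, 78, 100)
p_value = 1 - normal_cdf(z)  # right-tailed test
print(round(z, 4), round(p_value, 4))
```

Because the z-score is not rounded to two decimal places here, the computed p-value comes out slightly below the table-based 0.1020; the conclusion is unchanged.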
Copyright 2011 Pearson Canada Inc.


2. Call the data on use of social network profiles by online Canadians in 2009 sample 1
from population 1, and the data on use of social network profiles by online
Canadians 18 months previously sample 2 from population 2.

$H_0: p_1 - p_2 = 0.10$
$H_1: p_1 - p_2 > 0.10$

$\alpha = 0.05$

$\hat{p}_1 = \frac{462}{824} = 0.5607$, $n_1 = 824$, $\hat{p}_2 = 0.39$, $n_2 = 800$

Sampling is done without replacement, but there are millions of online Canadians, so
we can still use the binomial distribution as the appropriate underlying model.

$n_1\hat{p}_1 = 462 > 10$
$n_1\hat{q}_1 = 824 - 462 = 362 > 10$
$n_2\hat{p}_2 = 800(0.39) = 312 > 10$
$n_2\hat{q}_2 = 800(1 - 0.39) = 800(0.61) = 488 > 10$

Since the null hypothesis is that there is a 10% difference in the proportions, we
cannot pool the sample data to estimate $\hat{p}$.

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}} = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}} = \frac{(0.5607 - 0.39) - 0.10}{\sqrt{\frac{(0.5607)(0.4393)}{824} + \frac{(0.39)(0.61)}{800}}} = 2.89$$

p-value = $P(z > 2.89) = 1 - 0.9981 = 0.0019$

Since p-value < $\alpha$, reject $H_0$. There is sufficient evidence to infer that the proportion
of online Canadians with a social network profile is more than 10% higher in 2009
than it was 18 months previously.
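
When the hypothesized difference is not zero, the standard error is built from the two sample proportions separately. The sketch below illustrates this for the figures in this exercise; the function names are hypothetical and the normal probability is computed with the error function instead of the table.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def unpooled_two_proportion_z(p1_hat, n1, p2_hat, n2, hypothesized_diff):
    """z-score for H0: p1 - p2 = hypothesized_diff (no pooling)."""
    std_error = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return ((p1_hat - p2_hat) - hypothesized_diff) / std_error

z = unpooled_two_proportion_z(462 / 824, 824, 0.39, 800, 0.10)
p_value = 1 - normal_cdf(z)  # right-tailed test
print(round(z, 2), round(p_value, 4))
```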


3. Call the data on perceptions of female bank employees sample 1 from population 1,
and the data on perceptions of male bank employees sample 2 from population 2.
We want to know if the proportion of male employees who felt that female
employees had as much opportunity for advancement as male employees is more
than 10% higher than the proportion of female employees who thought so. So, we
are wondering if $p_2$ is more than 10% higher than $p_1$, that is, if $p_2 - p_1 > 0.10$.
Rewriting this in standard format, we ask the equivalent question: is $p_1 - p_2 < -0.10$?

$H_0: p_1 - p_2 = -0.10$
$H_1: p_1 - p_2 < -0.10$

$\alpha = 0.05$

$\hat{p}_1 = 0.825$, $n_1 = 240$, $\hat{p}_2 = 0.943$, $n_2 = 350$

Sampling is done without replacement, but Canadian banks are large employers (in
2006, the Royal Bank employed about 69,000 people, for instance), so we can still
use the binomial distribution as the appropriate underlying model.

$n_1\hat{p}_1 = 240(0.825) = 198 > 10$
$n_1\hat{q}_1 = 240(1 - 0.825) = 42 > 10$
$n_2\hat{p}_2 = 350(0.943) = 330.05 > 10$
$n_2\hat{q}_2 = 350(1 - 0.943) = 19.95 > 10$

The null hypothesis is that there is a 10% difference in the proportions, so we cannot
pool the sample data to estimate $\hat{p}$.

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}} = \frac{(0.825 - 0.943) - (-0.10)}{\sqrt{\frac{(0.825)(0.175)}{240} + \frac{(0.943)(0.057)}{350}}} = -0.655$$

p-value = $P(z \leq -0.66) = 0.2546$

Since p-value > $\alpha$, fail to reject $H_0$. There is not enough evidence to infer that the
proportion of male employees who felt that female employees had as much
opportunity for advancement as male employees was more than 10% higher than the
proportion of female employees who thought so.



4. Call the data on customers told about the extended warranty by cashiers sample 1
from population 1. Call the data on customers exposed to the display at the checkout
sample 2 from population 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 \neq 0$

$\alpha = 0.10$

We presume the store has many thousands of customers, so although we are
sampling without replacement, we can still use the binomial distribution as the
appropriate underlying model.

The data are available in Excel, so we will use Excel to do this problem. First the
raw data must be organized. The Excel output from the Histogram tool is shown
below.

Cashier Display
Bin Frequency Bin Frequency
0 122 0 145
1 28 1 55

We can then use the Excel template to proceed. The output is shown below.

Making Decisions About Two Population Proportions
Sample 1 Size                                          150
Sample 2 Size                                          200
Sample 1 Proportion                                    0.18666667
$n_1\hat{p}_1$                                         28
$n_1\hat{q}_1$                                         122
$n_2\hat{p}_2$                                         55
$n_2\hat{q}_2$                                         145
Are np and nq >= 10?                                   yes
Hypothesized Difference in Population
Proportions, $p_1 - p_2$ (decimal form)                0
z-Score                                                -1.92275784
One-Tailed p-Value                                     0.02725523
Two-Tailed p-Value                                     0.05451047



This is a two-tailed test, so the appropriate p-value is 0.0545. Since this is less than
$\alpha$ = 0.10, reject $H_0$. There is sufficient evidence to infer there is a difference in the
proportion of customers who buy the extended warranty when exposed to promotion
by a display or informed by the cashier.

5. We will use Excel to calculate this confidence interval (it could also be done by
hand, based on the information acquired in Exercise 4 above). The Excel template is
shown below.

Confidence Interval Estimate for the Difference in Population Proportions
Confidence Level (decimal form)     0.9
Sample 1 Size                       150
Sample 2 Size                       200
$n_1\hat{p}_1$                      28
$n_1\hat{q}_1$                      122
$n_2\hat{p}_2$                      55
$n_2\hat{q}_2$                      145
Are np and nq >= 10?                yes
Upper Confidence Limit              -0.0146
Lower Confidence Limit              -0.1621

With 90% confidence, we estimate that the interval (-0.162, -0.015) contains the true
difference in the proportion of customers who buy the extended warranty, when told
about it by the cashier, compared with being exposed to a prominent display at the
checkout. Another way to say this: a greater proportion of those exposed to the
display bought the extended warranty. We estimate the difference to be contained in
the interval (1.5%, 16.2%).

This confidence interval corresponds to the hypothesis test in the preceding exercise.
Since we rejected the hypothesis of no difference, we would not expect the
confidence interval to contain zero (and it does not).
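
For readers doing this by hand rather than with the template, the interval can be reproduced as follows. This is a sketch under the usual large-sample assumptions; `diff_proportions_ci` is a hypothetical helper name, and the critical z-value is obtained from Python's standard library rather than from a table.

```python
import math
from statistics import NormalDist

def diff_proportions_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    std_error = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_crit * std_error, diff + z_crit * std_error

# 28 of 150 bought when told by the cashier, 55 of 200 when shown the display
lower, upper = diff_proportions_ci(28, 150, 55, 200, 0.90)
print(round(lower, 4), round(upper, 4))
```

The sample proportions are not pooled here because no null hypothesis of equal proportions is being assumed when estimating the difference.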


6. First, summarize the data, and calculate expected values. See the table below.

                     Past %    Observed    Expected
Pay Now              0.26      23          19.5
Pay In Six Months    0.37      32          27.75
Pay In One Year      0.37      20          27.75
Total                          75          75

Expected values are calculated as follows. In the past, 26% of customers paid
immediately. Out of the 75 customers surveyed, we would expect 26% × 75 = 19.5
customers to pay immediately. The other expected values are calculated in a similar
fashion.

$H_0$: The distribution of customers according to method of payment is the same now
as it was in the past.
$H_1$: The distribution of customers according to method of payment is different now,
compared to the past.
$\alpha = 0.05$ (given)

All expected values are more than 5, so we can proceed.

$$X^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{(23 - 19.5)^2}{19.5} + \frac{(32 - 27.75)^2}{27.75} + \frac{(20 - 27.75)^2}{27.75} = 3.444$$

Degrees of freedom = k - 1 = 2.
Using the tables, we see that p-value > 0.100.
Using CHITEST, we see that p-value = 0.1787.
Fail to reject $H_0$. There is insufficient evidence to infer that there has been a change
in customers' preferences for the different payment plans.
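
The goodness-of-fit calculation above can be sketched in a few lines. The variable names are illustrative only. The p-value line relies on a special case: with exactly 2 degrees of freedom, the upper-tail area of the Chi-square distribution is exp(-x/2), so no table or spreadsheet function is needed. This shortcut does not apply to other degrees of freedom.

```python
import math

past_shares = [0.26, 0.37, 0.37]   # Pay Now, Pay In Six Months, Pay In One Year
observed = [23, 32, 20]

n = sum(observed)
expected = [share * n for share in past_shares]

# Test statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Valid only because df == 2 in this exercise
p_value = math.exp(-chi_sq / 2)
print(round(chi_sq, 3), df, round(p_value, 4))
```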

The graph below summarizes the changes.

[Figure: Customer Preferences for Payment Plans. Bar chart comparing past and current
customer preferences (percentage of customers, 0% to 45%) for Pay Now, Pay in Six
Months, and Pay in One Year.]



7. $H_0$: The distribution of customers' brand preferences is as claimed by the previous
manager.
$H_1$: The distribution of customers' brand preferences is different from what was
claimed by the previous manager.
$\alpha = 0.05$ (given)

Expand the table of claimed brand preferences to show expected and observed
values, as shown below.

                              Labatt    Labatt Blue    Molson                 Rickard's
                              Blue      Light          Canadian    Kokanee    Honey Brown
Claimed Preference            33%       8%             25%         19%        15%
Expected (for 75 Customers)   24.75     6.00           18.75       14.25      11.25
Observed                      29        6              21          16         13

All expected values are more than 5, so we can proceed.

$$X^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{(29 - 24.75)^2}{24.75} + \frac{(6 - 6.00)^2}{6.00} + \frac{(21 - 18.75)^2}{18.75} + \frac{(16 - 14.25)^2}{14.25} + \frac{(13 - 11.25)^2}{11.25} = 1.487$$

Using the tables, we see that p-value > 0.100.
Using CHITEST, we see that p-value = 0.8289.
Fail to reject $H_0$. There is insufficient evidence to infer that the distribution of
customers' brand preferences is different from what the previous manager claimed.



8. $H_0$: The die is fair (the probability of occurrence of each side is 1/6).
$H_1$: The die is not fair.
$\alpha = 0.025$ (given)

Expand the table of observations from repeated tosses of a die to show expected and
observed values, as shown below.

                         1 Spot      2 Spots     3 Spots     4 Spots     5 Spots     6 Spots
Observed                 18          24          17          25          16          25
Expected (Out Of 125)    20.83333    20.83333    20.83333    20.83333    20.83333    20.83333

All expected values are more than 5.
Using the formula as before, we calculate $X^2 = 4.36$.

From the table, we see p-value > 0.100.
Fail to reject $H_0$. There is insufficient evidence to infer that the die is not fair.
John's troubles are of his own making. We have no way to know if Mary will ever
forgive him.

9. $H_0$: The distribution of customer destination preferences at the travel agency is the
same as in the past.
$H_1$: The distribution of customer destination preferences at the travel agency is
different from the past.
$\alpha = 0.04$ (given)

Summarize the data from the random sample and calculate expected values.

                                                                      Australia/
                                 Canada   U.S.    Caribbean   Europe   Asia    New Zealand   Other
Past Preferences                 28%      32%     22%         12%      2%      3%            1%
Expected (for a sample of 54)    15.12    17.28   11.88       6.48     1.08*   1.62*         0.54*
Observed                         22       14      8           8        0       2             0

In this case, there are three expected values < 5 (these are marked with * in the table).
It seems logical to combine these categories, and then proceed. The new table of
expected and observed values is shown below.

                                 Canada   U.S.    Caribbean   Europe   Other
Past Preferences                 28%      32%     22%         12%      6%
Expected (for a sample of 54)    15.12    17.28   11.88       6.48     3.24
Observed                         22       14      8           8        2

However, even this change still leaves us with an expected value < 5. We must
combine categories again. It is less satisfying to combine Europe with Asia,
Australia/New Zealand and Other. However, all of these destinations represent
destinations at a significant distance from the North American continent, so there is
some sense to combining them.

The final table of expected and observed values is shown below.

                                                              Europe, Asia,
                                                              Australia/New
                                 Canada   U.S.    Caribbean   Zealand, Other
Past Preferences                 28%      32%     22%         18%
Expected (for a sample of 54)    15.12    17.28   11.88       9.72
Observed                         22       14      8           10

Now that all expected values are ≥ 5, we can proceed.
Using the formula as before, we calculate $X^2 = 5.028$.
Using the tables, with 3 degrees of freedom, we see p-value > 0.100.
Fail to reject $H_0$. There is insufficient evidence to infer that there has been a change
in customer destination preferences at this travel agency.
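
The category-combining steps in this exercise can be traced with a short script. This is only a sketch: the dictionary layout and variable names are illustrative, and the choice of which categories to merge is the judgment call described above, not something the code decides.

```python
past_shares = {
    "Canada": 0.28, "U.S.": 0.32, "Caribbean": 0.22, "Europe": 0.12,
    "Asia": 0.02, "Australia/New Zealand": 0.03, "Other": 0.01,
}
observed = {
    "Canada": 22, "U.S.": 14, "Caribbean": 8, "Europe": 8,
    "Asia": 0, "Australia/New Zealand": 2, "Other": 0,
}
n = sum(observed.values())

# Step 1: flag categories whose expected count is below 5
too_small = [name for name, share in past_shares.items() if share * n < 5]

# Step 2: merge Europe with the flagged categories (the final grouping used above)
merge = ["Europe", "Asia", "Australia/New Zealand", "Other"]
kept = [name for name in past_shares if name not in merge]
shares = [past_shares[k] for k in kept] + [sum(past_shares[k] for k in merge)]
counts = [observed[k] for k in kept] + [sum(observed[k] for k in merge)]

# Step 3: recompute expected counts and the test statistic
expected = [share * n for share in shares]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
df = len(counts) - 1
print(too_small, [round(e, 2) for e in expected], round(chi_sq, 3), df)
```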

10. $H_0$: The distribution of responses to the survey at the local branch is the same as the
national benchmarks.
$H_1$: The distribution of responses to the survey at the local branch is not the same as
the national benchmarks.

$\alpha = 0.025$; $X^2 = 20.859$; from the tables (4 degrees of freedom), p-value < 0.005.
Reject $H_0$. There is sufficient evidence to infer that the distribution of responses to
the survey at the local branch differs from the national benchmarks. The graph below
gives some indication of where the differences lie.


[Figure: Response to a Survey of Financial Services Company Customers: "The staff at my
local branch can provide me with good advice on my financial affairs". Bar chart
comparing the national benchmark with the observed percentage (0% to 60%) for Strongly
Agree, Agree, Neither Agree nor Disagree, Disagree, and Strongly Disagree.]

11. $H_0$: There is no relationship between the views on the proposed health benefit
changes and the type of job held in the organization.
$H_1$: There is a relationship between the views on the proposed health benefit
changes and the type of job held in the organization.
$\alpha = 0.01$

The calculations of expected values for a contingency table can be done manually,
but are somewhat tedious. We will use Excel's Non-Parametric Tool for Chi-Squared
Expected Value Calculations. The Excel output is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic     16.44338
# of expected values < 5       0
p-value                        0.002478

                          In favour    Opposed     Undecided
Management                19.32377     15.62705    6.04918
Professional, Salaried    53.72951     43.45082    16.81967
Clerical, Hourly Paid     41.94672     33.92213    13.13115

(The Excel output will allow you to check your manual calculations.)
We see that there are no expected values < 5, so we can proceed.

The p-value is 0.002478, which is < $\alpha$ = 0.01. Reject $H_0$. There is sufficient
evidence to infer that there is a relationship between the views on the proposed
health benefits changes and the type of job held in the organization.

12. $H_0$: There is no relationship between hair colour and tendency to use sunscreen.
$H_1$: There is a relationship between hair colour and tendency to use sunscreen.
$\alpha = 0.05$

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown
below.

p-value     0.011016

Hair Colour    Always      Usually     Once in a While    Never
Red            22.94828    23.51724    8.913793           10.62069
Blonde         30.5977     31.35632    11.88506           14.16092
Brown          29.2069     29.93103    11.34483           13.51724
Black          38.24713    39.1954     14.85632           17.70115

All of the expected values are ≥ 5, so we can proceed. The p-value is 0.011 < 0.05.
Reject $H_0$. There is evidence of a relationship between hair colour and tendency to
use sunscreen.


13. $H_0$: There is no relationship between household income and the section of the paper
read most closely.
$H_1$: There is a relationship between household income and the section of the paper
read most closely.
$\alpha = 0.25$

The output of the Excel tool for Chi-squared Expected Value Calculations is shown
below.

Chi-squared test statistic     51.92698
# of expected values < 5       0
p-value                        1.74E-08

                       National and
Household Income       World News      Business    Sports      Arts        Lifestyle
Under $40,000          48.96498        40.74708    46.56809    19.1751     20.54475
$40,000 to $70,000     52.58171        43.75681    50.00778    20.59144    22.06226
Over $70,000           41.45331        34.49611    39.42412    16.23346    17.393

All expected values are ≥ 5, so we can proceed. The p-value is 0.0000000174, which
is extremely small. Reject $H_0$. There is evidence of a relationship between
household income and the section of the paper read most closely.

14. $H_0$: There is no difference in the proportions of students whose first language is
English, French or something else among these four schools.
$H_1$: There is a difference in the proportions of students whose first language is
English, French or something else among these four schools.
$\alpha = 0.05$

The output of the Excel tool for Chi-squared Expected Value Calculations is shown
below.

p-value     0.14647

First Language    School #1    School #2    School #3    School #4
English           44.25        44.25        44.25        44.25
French            44.25        44.25        44.25        44.25
Other             11.5         11.5         11.5         11.5

All expected values are ≥ 5, so we can proceed. The p-value is 0.14647 > $\alpha$ = 0.05.
Fail to reject $H_0$. There is insufficient evidence to infer that there are differences in
the proportions of students whose first language is English, French, or something
else for these four schools.

15. $H_0$: The proportions of students drawn from inside or outside the local area are the
same for the Business, Technology and Nursing programs at a college.
$H_1$: The proportions of students drawn from inside or outside the local area are
different for the Business, Technology and Nursing programs at a college.
$\alpha = 0.025$

The output of the Excel tool for Chi-squared Expected Value Calculations is shown
below.

# of expected values < 5     0
p-value                      0.662621

                       Business    Technology    Nursing
From local area        68.46154    45.64103      63.89744
Not from local area    81.53846    54.35897      76.10256

All expected values are ≥ 5, so we can proceed. The p-value is 0.66 > $\alpha$ = 0.025.
Fail to reject $H_0$. There is not enough evidence to infer that the proportions of
students drawn from inside or outside the local area are different for the Business,
Technology and Nursing programs at a college.

Chapter Review Questions
1. We can pool data when the null hypothesis is that there is NO difference in the
population proportions. If that is the case (as we assume), then we can pool the
sample data, because both samples provide estimates of the same proportion of
successes. We cannot pool the data when the null hypothesis is that the population
proportions differ by 5%, because the sample data are providing estimates of two
different proportions of success.

2. We can't pool the sample data when we are constructing a confidence interval
estimate of $p_1 - p_2$, because we are not assuming there is no difference in the
population proportions. We do not have a null hypothesis in mind when we are
estimating the difference in proportions.

3. $H_0: p_1 - p_2 = -0.10$
$H_1: p_1 - p_2 < -0.10$

This may not be immediately obvious. Remember, the subscript 1 corresponds to last
year's results, and the subscript 2 corresponds to this year's results. If the proportion
of people who pass this year is more than 10% higher, then when we subtract $p_1 - p_2$,
we will get a negative number, and it will be to the left of -0.10 on the number line.

4. The Chi-square goodness-of-fit test measures only how closely the observed
frequencies match the expected frequencies. The test does not take into account
whether the differences are positive or negative (differences are squared in the
calculation of the test statistic). Larger differences result in larger values of the
Chi-square test statistic. Unusually large values (in the right tail of the distribution)
signal that there are significant differences in the distributions.

5. Repeated tests on the same data set lead to higher chances of Type I error, and are
therefore not reliable. A Chi-square test allows us to compare all three proportions
simultaneously.
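
To see why repeated tests inflate the chance of a Type I error, consider the rough calculation sketched below. It assumes a significance level of 0.05 for each test and treats the three pairwise comparisons as independent. Both are illustrative assumptions that are not stated in the question, and pairwise tests on the same data are not truly independent, so the result is only an indication of the size of the problem.

```python
alpha = 0.05        # assumed significance level for each individual test
comparisons = 3     # pairwise tests among three proportions: 1 vs 2, 1 vs 3, 2 vs 3

# Probability of at least one Type I error across all tests (independence assumed)
familywise_error = 1 - (1 - alpha) ** comparisons
print(round(familywise_error, 4))
```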

6. Call the data on members who were taking fitness classes sample 1 from population
1. Call the data on members who were working with a personal trainer sample 2 from
population 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 \neq 0$

$\alpha = 0.05$

$\hat{p}_1 = \frac{38}{60} = 0.63333333$, $n_1 = 60$, $\hat{p}_2 = \frac{60}{80} = 0.75$, $n_2 = 80$

Sampling is done without replacement, but presumably the fitness club has hundreds
of members. This is an assumption that we should note before we proceed to use the
binomial distribution as the underlying model.

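$n_1\hat{p}_1 = 38 > 10$
$n_1\hat{q}_1 = 60 - 38 = 22 > 10$
$n_2\hat{p}_2 = 60 > 10$
$n_2\hat{q}_2 = 80 - 60 = 20 > 10$

$$\hat{p} = \frac{38 + 60}{60 + 80} = 0.70$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\left(\frac{38}{60} - 0.75\right) - 0}{\sqrt{(0.70)(1 - 0.70)\left(\frac{1}{60} + \frac{1}{80}\right)}} = -1.49$$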

(Note that $\hat{p}_1$ is left in fractional form to preserve accuracy for calculations with a
calculator.)

p-value = $2 \times P(z \leq -1.49) = 2 \times 0.0681 = 0.1362$

Since p-value > $\alpha$, fail to reject $H_0$. There is insufficient evidence to infer that there
is a difference in the proportion of new members still working out regularly six
months after joining the club, when comparing those who attend fitness classes with
those who work out with a personal trainer.

7. Since the Chi-square test is equivalent to the test of proportions, we expect to get the
same answer. First, set up the appropriate contingency table for the data, as shown
below.

                                   Still working out in    Quit working out in
                                   first six months        first six months
Taking fitness classes             38                      22
Working with a personal trainer    60                      20

The setup of the problem is the same, with the same null and alternative hypotheses.
The output of the Excel tool for Chi-Squared Expected Values Calculations is shown
below.

p-value     0.136037

                                   Still working out in    Quit working out in
                                   first six months        first six months
Taking fitness classes             42                      18
Working with a personal trainer    56                      24

The p-value is the same as in the previous problem (it differs slightly only because we
used the tables for the calculation in the previous question, which involves rounding
the z-score to two decimal places). Of course, the conclusion is also the same.
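
The equivalence between the two tests can be verified directly. The sketch below computes the expected counts and the Chi-square statistic for the 2 × 2 table; variable names are illustrative. With 1 degree of freedom, the upper-tail Chi-square area equals the two-tailed normal area for z = sqrt(X²), which is what the `erfc` line computes.

```python
import math

# Rows: fitness classes, personal trainer. Columns: still working out, quit.
observed = [[38, 22], [60, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count = (row total x column total) / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = 1: P(X^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))
z_equivalent = math.sqrt(chi_sq)
print(expected, round(chi_sq, 4), round(z_equivalent, 4), round(p_value, 6))
```

The square root of the Chi-square statistic matches the magnitude of the z-score from the previous question before rounding.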


8. Call the data on deliveries by the private courier sample 1 from population 1. Call
the data on deliveries by Canada Post sample 2 from population 2.

$H_0: p_1 - p_2 = 0.05$
$H_1: p_1 - p_2 > 0.05$

$\alpha = 0.025$

$\hat{p}_1 = 0.89$, $n_1 = 100$, $\hat{p}_2 = 0.80$, $n_2 = 75$

Sampling is done without replacement, but presumably both the private courier and
Canada Post make a very large number of deliveries, so we can use the binomial
distribution as the appropriate underlying model.

$n_1\hat{p}_1 = 100(0.89) = 89 > 10$
$n_1\hat{q}_1 = 100(1 - 0.89) = 11 > 10$
$n_2\hat{p}_2 = 75(0.80) = 60 > 10$
$n_2\hat{q}_2 = 75(1 - 0.80) = 15 > 10$

Since the null hypothesis is that there is a 5% difference in the proportions, we cannot
pool the sample data.

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}} = \frac{(0.89 - 0.80) - 0.05}{\sqrt{\frac{(0.89)(0.11)}{100} + \frac{(0.80)(0.20)}{75}}} = 0.72$$

p-value = $P(z > 0.72) = 1 - 0.7642 = 0.2358$

Since p-value > $\alpha$, fail to reject $H_0$. There is insufficient evidence to infer that the
on-time or early percentage for the private courier is more than 5% higher than
Canada Post's. Using this criterion for the decision, the mail order company should
not use the private courier service.



9. Call the data on students who were called by program faculty sample 1 from
population 1. Call the data on students who were only sent a package in the mail
sample 2 from population 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$

$\alpha = 0.025$

$\hat{p}_1 = \frac{234}{278} = 0.841726618$, $n_1 = 278$, $\hat{p}_2 = \frac{232}{302} = 0.76821192$, $n_2 = 302$

Sampling is done without replacement. We have no information on college
enrolment. Sample sizes are fairly large. For these samples to be at most 5% of the
relevant populations, the college would have to have about 5560 students who were
called by faculty in total, and about 6040 who were sent acceptance packages. This
means a fairly large potential first-year enrolment. This is an assumption that we
should note before we proceed to use the binomial distribution as the underlying
model.
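
The population sizes quoted above follow from requiring the sample to be at most 5% of the population, which is the same as requiring the population to be at least 20 times the sample size. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def minimum_population(sample_size, max_percent=5):
    """Smallest population for which the sample is at most max_percent of it."""
    # n <= (max_percent / 100) * N  is equivalent to  N >= 100 * n / max_percent
    return 100 * sample_size / max_percent

called = minimum_population(278)   # students called by program faculty
mailed = minimum_population(302)   # students sent only a package in the mail
print(called, mailed)
```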

$n_1\hat{p}_1 = 234 > 10$
$n_1\hat{q}_1 = 278 - 234 = 44 > 10$
$n_2\hat{p}_2 = 232 > 10$
$n_2\hat{q}_2 = 302 - 232 = 70 > 10$

$$\hat{p} = \frac{234 + 232}{278 + 302} = 0.80344828$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\left(\frac{234}{278} - \frac{232}{302}\right) - 0}{\sqrt{\left(\frac{466}{580}\right)\left(1 - \frac{466}{580}\right)\left(\frac{1}{278} + \frac{1}{302}\right)}} = 2.23$$

(Note that some proportions are left in fractional form to preserve accuracy for
calculations with a calculator.)

p-value = $P(z > 2.23) = 1 - 0.9871 = 0.0129$

Since p-value < $\alpha$, reject $H_0$. There is sufficient evidence to infer that the proportion
of prospective students who send acceptances is higher when they get calls from
program faculty (compared with receiving a package in the mail).

10. This confidence interval could be calculated manually. The output of the Excel
template is shown below (manual calculations should be very close).

Proportions
Sample 1 Size             278
Sample 2 Size             302
$n_1\hat{p}_1$            234
$n_1\hat{q}_1$            44
$n_2\hat{p}_2$            232
$n_2\hat{q}_2$            70
Are np and nq >= 10?      yes

With 95% confidence, we estimate that the proportion of students who send
acceptances when called by program faculty is 0.9% to 13.8% higher than the
proportion who send acceptances when they receive only a package in the mail.

11. Call the data on managers who have been sent to conflict resolution training sample
1 from population 1. Call the data on non-managerial employees who have been sent
to conflict resolution training sample 2 from population 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 \neq 0$

$\alpha = 0.025$

$\hat{p}_1 = \frac{36}{50} = 0.72$, $n_1 = 50$, $\hat{p}_2 = \frac{38}{76} = 0.50$, $n_2 = 76$


Sampling is done without replacement. We have no information on the total number
of employees who have been sent to conflict resolution training. We are told that the
company is large. For these samples to be at most 5% of the relevant populations,
the company would have had to send 1000 managerial employees to the training, and
1520 non-managerial employees. This is an assumption that we should note before
we proceed to use the binomial distribution as the underlying model.

$n_1\hat{p}_1 = 36 > 10$
$n_1\hat{q}_1 = 50 - 36 = 14 > 10$
$n_2\hat{p}_2 = 38 > 10$
$n_2\hat{q}_2 = 76 - 38 = 38 > 10$

$$\hat{p} = \frac{36 + 38}{50 + 76} = 0.587301587$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.72 - 0.50) - 0}{\sqrt{\left(\frac{74}{126}\right)\left(1 - \frac{74}{126}\right)\left(\frac{1}{50} + \frac{1}{76}\right)}} = 2.45$$

p-value = $2 \times P(z > 2.45) = 2 \times (1 - 0.9929) = 2 \times 0.0071 = 0.0142$

Since p-value < $\alpha$, reject $H_0$. There is sufficient evidence to infer there is a
difference in the proportions of managers and non-managers who thought that
conflict resolution training was a waste of time.

12. In the sample, 72% of managers and 50% of non-managers thought the training was
a waste of time. There is no way to know why, and this is something that might be
worthy of further research. Was the training perceived as a waste of time because
the employees felt they did not benefit? If they did not benefit, was this because they
learned nothing new, or they thought the training was poorly done? Was it a waste
of time only because they felt they had more important tasks to complete?



The initial research was not really that helpful. It would have been more appropriate
to ask the employees what they learned at the training, and whether they were likely
to put what they learned into practice.

However, generally, the sample results raise a question about whether the training is
accomplishing its intended goals. Before continuing to spend money on training, the
decision to implement the training should be revisited, with further data collection a
possibility.

13. We can set this up as a Chi-square test, with the information organized as in the table
below.

                            Manufacturer #1    Manufacturer #2    Manufacturer #3
Defective Components        36                 30                 38
Non-Defective Components    89                 95                 87
Total                       125                125                125

$H_0$: The proportions of defective items are the same for all three manufacturers.
$H_1$: The proportions of defective items are different among the three manufacturers.
$\alpha = 0.05$

The output from the Excel tool for Chi-Squared Expected Value Calculations is
shown below.

p-value     0.500633

                            #1          #2          #3
Defective Components        34.66667    34.66667    34.66667
Non-Defective Components    90.33333    90.33333    90.33333

All the expected values are ≥ 5, so we can proceed. The p-value is 0.5 > $\alpha$ = 0.05.
Fail to reject $H_0$. There is not enough evidence to infer that the proportions of
defective items are different among the three manufacturers.



14. Refer to the two plants as Plant 1 and Plant 2.

$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 \neq 0$

$\alpha = 0.05$

$\hat{p}_1 = \frac{23}{150} = 0.15333333$, $n_1 = 150$, $\hat{p}_2 = \frac{23}{125} = 0.184$, $n_2 = 125$

Sampling is done without replacement. We have no information on the total number
of employees at the two plants. As long as Plant 1 has at least 3000 employees, and
Plant 2 has at least 2500 employees, we can still use the binomial distribution as the
appropriate underlying model. We note this assumption and proceed.

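$n_1\hat{p}_1 = 23 > 10$
$n_1\hat{q}_1 = 150 - 23 = 127 > 10$
$n_2\hat{p}_2 = 23 > 10$
$n_2\hat{q}_2 = 125 - 23 = 102 > 10$

$$\hat{p} = \frac{23 + 23}{150 + 125} = \frac{46}{275} = 0.16727272727$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\left(\frac{23}{150} - 0.184\right) - 0}{\sqrt{\left(\frac{46}{275}\right)\left(1 - \frac{46}{275}\right)\left(\frac{1}{150} + \frac{1}{125}\right)}} = -0.68$$

p-value = $2 \times P(z \leq -0.68) = 2 \times 0.2483 = 0.4966$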

Using Excel, we calculate the exact p-value as 0.497. (See the output from the Excel
template for Making Decisions About Two Population Proportions, Qualitative Data,
shown below.)



Making Decisions About Two Population Proportions
Sample 1 Size                                          150
Sample 2 Size                                          125
$n_1\hat{p}_1$                                         23
$n_1\hat{q}_1$                                         127
$n_2\hat{p}_2$                                         23
$n_2\hat{q}_2$                                         102
Are np and nq >= 10?                                   yes
Hypothesized Difference in Population
Proportions, $p_1 - p_2$ (decimal form)                0
z-Score                                                -0.67847976
One-Tailed p-Value                                     0.24873378
Two-Tailed p-Value                                     0.49746755

Since p-value > $\alpha$, fail to reject $H_0$. There is insufficient evidence to infer there is a
difference in the proportions of employees who had accidents at the two plants.

15. Exercise 14 could also be done as a Chi-square test. First, organize the data as
shown below.

                                 Plant 1    Plant 2
Employees Who Had An Accident    23         23
Employees Who Had No Accident    127        102
Total                            150        125

$H_0$: The proportions of employees who had accidents are the same at the two plants.
$H_1$: The proportions of employees who had accidents are different at the two plants.
$\alpha = 0.05$



The output from the Excel tool for Chi-Squared Expected Value Calculations is
shown below.

p-value     0.497468

                                 Plant 1     Plant 2
Employees Who Had An Accident    25.09091    20.90909
Employees Who Had No Accident    124.9091    104.0909

We arrive at the same conclusion as before (as we would expect). Once again, the
p-value is 0.497. Since p-value > $\alpha$, fail to reject $H_0$. There is insufficient evidence to
infer there is a difference in the proportions of employees who had accidents at the
two plants.

16. $H_0$: There is no relationship between an individual's family status and his/her
willingness to accept a foreign posting.
$H_1$: There is a relationship between an individual's family status and his/her
willingness to accept a foreign posting.
$\alpha = 0.05$

The output of the Excel tool for Chi-squared Expected Value Calculations is shown
below.

Chi-squared test statistic     5.474924
# of expected values < 5       0
p-value                        0.140146

                           Accepted Foreign    Declined Foreign
Family Status              Posting             Posting
Single, No Children        45.78947            12.21053
Single with Children       25.26316            6.736842
Partnered, No Children     37.89474            10.10526
Partnered with Children    41.05263            10.94737

All expected values are greater than 5, so we can proceed. The p-value is 0.14,
which is greater than $\alpha$ = 0.05. Fail to reject $H_0$. There is not enough evidence to
infer that there is a relationship between an individual's family status and his/her
willingness to accept a foreign posting.


17. $H_0$: The absences are equally distributed across the five working days of the week.
$H_1$: The absences are not equally distributed across the five working days of the
week.
$\alpha = 0.05$

There are 48 absences in total, in the sample. If the absences are equally distributed
across the five working days of the week, then we would expect each of the five days
to have 48/5 = 9.6 absences.

$$X^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{(15 - 9.6)^2}{9.6} + \frac{(6 - 9.6)^2}{9.6} + \frac{(4 - 9.6)^2}{9.6} + \frac{(7 - 9.6)^2}{9.6} + \frac{(16 - 9.6)^2}{9.6} = 12.625$$

p-value = $P(X^2 > 12.625) = 0.013261$ (using Excel's CHITEST).

Using the table, for four degrees of freedom, we see $0.010 < P(X^2 > 12.625) < 0.025$.
Reject $H_0$. There is enough evidence to suggest that the absences are not equally
distributed across the five working days of the week.
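
For a test against equal expected frequencies, the calculation is short enough to sketch in full. The helper `chi_square_upper_tail_even_df` is a hypothetical name; it uses a closed-form expression for the upper-tail area that holds only when the degrees of freedom are even (here, 4), so it stands in for CHITEST in this case but is not a general replacement.

```python
import math

def chi_square_upper_tail_even_df(x, df):
    """P(X^2 > x) for an even number of degrees of freedom (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form applies only to positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

observed = [15, 6, 4, 7, 16]                  # absences on each of the five days
expected_each = sum(observed) / len(observed)  # 48 / 5 = 9.6

chi_sq = sum((o - expected_each) ** 2 / expected_each for o in observed)
df = len(observed) - 1
p_value = chi_square_upper_tail_even_df(chi_sq, df)
print(round(chi_sq, 3), df, round(p_value, 6))
```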

18. $H_0$: There is no relationship between gender and preferred movie type.
$H_1$: There is a relationship between gender and preferred movie type.
$\alpha = 0.04$

This problem could be done manually, of course. The output of the Excel tool for
Chi-squared Expected Value Calculations is shown below.

p-value     0.091571

Favourite Movie Type    Male        Female
Action/Adventure        29.79672    34.20328
Comedy                  14.43279    16.56721
Drama                   21.88197    25.11803
Fantasy                 16.76066    19.23934
Horror                  20.95082    24.04918
Romance                 19.08852    21.91148
Thriller                19.08852    21.91148

All of the expected values are ≥ 5, so we can proceed. The p-value is 0.092, and
since this is greater than $\alpha$ = 0.04, fail to reject $H_0$. There is insufficient evidence to
infer that there is a relationship between gender and preferred movie type.


19. H
0
: The proportions of workers who travel to work via the different methods are the
same for the software firm and the accounting firm.
H
1
: The proportions of workers who travel to work via the different methods at the
software firm are different from the proportions of workers who travel to work
via the different methods at the accounting firm.
= 0.05

below.

Chi-squared test statistic    n/a
p-value                       n/a

                   By Transit    In Car     On Bicycle    On Foot
Software Firm      50.8481       15.3038    9.873418      1.974684
Accounting Firm    52.1519       15.6962    10.12658      2.025316

Since some of the expected values are less than 5, we cannot proceed. First we must
amalgamate categories in a meaningful way. It seems reasonable to combine the
categories of travel by bicycle and by foot, since both of these are self-propelled.

The reorganized data set will then be:

                   By Transit    In Car    On Bicycle Or On Foot
Software Firm      51            8         19
Accounting Firm    52            23        5

The new output from the Excel tool for Chi-Squared Expected Value Calculations is
shown below.

p-value 0.00045

                   By Transit    In Car     On Bicycle Or On Foot
Software Firm      50.8481       15.3038    11.8481
Accounting Firm    52.1519       15.6962    12.1519



Since the expected values are now all ≥ 5, we can proceed. The p-value is very
small, at 0.00045, so we reject H0. We have very convincing evidence that the
proportions of workers who travel to work via the different methods at the software
firm are different from the proportions of workers who travel to work via the
different methods at the accounting firm.
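For readers working without the Excel tool, the expected values and p-value for the reorganized table can be reproduced with a minimal Python sketch. It takes the observed counts from the reorganized data set. It also relies on the fact that, for 2 degrees of freedom, the upper-tail chi-square probability reduces to exp(−x/2). The variable names are illustrative only.

```python
import math

# Observed counts after combining "On Bicycle" and "On Foot".
# Column order: By Transit, In Car, On Bicycle Or On Foot.
observed = {
    "Software Firm": [51, 8, 19],
    "Accounting Firm": [52, 23, 5],
}

row_totals = {firm: sum(counts) for firm, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected count under H0 = row total * column total / grand total.
expected = {
    firm: [row_totals[firm] * c / grand_total for c in col_totals]
    for firm in observed
}

# Chi-square statistic summed over all six cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for firm in observed
    for o, e in zip(observed[firm], expected[firm])
)

# Degrees of freedom = (rows - 1) * (columns - 1) = 1 * 2 = 2.
df = (len(observed) - 1) * (len(col_totals) - 1)

# With df = 2 the upper-tail probability is simply exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(expected)
print(f"X^2 = {chi_sq:.4f}, df = {df}, p-value = {p_value:.5f}")
```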

20. H0: The distribution of preferences for beer, wine and other alcoholic drinks is the
same for males and females.
H1: The distribution of preferences for beer, wine and other alcoholic drinks is
different for males and females.
α = 0.05

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown
below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic    0
# of expected values < 5      0
p-value                       1

          Beer    Wine    Other Alcoholic Drinks
Male      42      36      22
Female    63      54      33

The p-value is 100% in this case, and the test statistic is zero. There is no evidence to
support the hypothesis that the distribution of preferences for beer, wine and other
alcoholic drinks is different for males and females, because the proportions of males
and females who prefer each drink type are exactly the same, for all types of drinks.
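A quick way to see why the statistic is zero is to compare the within-gender shares directly. The sketch below treats the counts in the table as the observed data. That is an assumption, but a safe one, since a test statistic of zero means the observed and expected values coincide.

```python
# Counts from the table. Column order: Beer, Wine, Other Alcoholic Drinks.
observed = {"Male": [42, 36, 22], "Female": [63, 54, 33]}

# Share of each drink type within each gender.
shares = {
    gender: [count / sum(counts) for count in counts]
    for gender, counts in observed.items()
}

row_totals = {gender: sum(counts) for gender, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected count under H0 = row total * column total / grand total.
expected = {
    gender: [row_totals[gender] * c / grand_total for c in col_totals]
    for gender in observed
}

chi_sq = sum(
    (o - e) ** 2 / e
    for gender in observed
    for o, e in zip(observed[gender], expected[gender])
)

print(shares["Male"], shares["Female"])
print("X^2 =", chi_sq)
```

Both genders show shares of 42%, 36% and 22%, so every expected count equals its observed count and each term of the statistic vanishes.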

21. H0: The proportions of mixed nuts are as specified.
H1: The proportions of mixed nuts are not as specified.
α = 0.025

                                Almonds    Peanuts    Hazelnuts    Cashews    Pecans
Desired %                       22%        48%        10%          10%        10%
Observed Number                 80         190        36           31         37
Expected Number (out of 374)    82.28      179.52     37.4         37.4       37.4

Expected values are calculated as: desired % × total number of nuts (374 in this
case).


All of the expected values are ≥ 5, so we can proceed.

X² = 1.827
From the table (degrees of freedom = k − 1 = 4), we see that p-value > 0.100.
Fail to reject H0. There is insufficient evidence to infer that the proportions of mixed
nuts are not as specified.
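The expected values and the test statistic can be verified with a short Python sketch. The dictionary names are illustrative. The p-value uses the closed form for 4 degrees of freedom, exp(−x/2)(1 + x/2), as a stand-in for the table lookup.

```python
import math

# Desired proportions and observed counts from the table.
desired = {"Almonds": 0.22, "Peanuts": 0.48, "Hazelnuts": 0.10,
           "Cashews": 0.10, "Pecans": 0.10}
observed = {"Almonds": 80, "Peanuts": 190, "Hazelnuts": 36,
            "Cashews": 31, "Pecans": 37}

n = sum(observed.values())  # total number of nuts in the sample

# Expected count = desired proportion * total number of nuts.
expected = {nut: share * n for nut, share in desired.items()}

chi_sq = sum((observed[nut] - expected[nut]) ** 2 / expected[nut]
             for nut in observed)

# Degrees of freedom = k - 1 = 4; closed-form upper tail for df = 4.
half = chi_sq / 2
p_value = math.exp(-half) * (1 + half)

print(f"n = {n}, X^2 = {chi_sq:.3f}, p-value = {p_value:.3f}")
```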

22. H0: p = 0.50
H1: p > 0.50
α = 0.025
p̂ = 190/374 = 0.50802139
n = 374
Sampling is done without replacement. The company presumably produces a
significant quantity of mixed nuts, so the sample is presumably not more than 5% of
the population. This means the binomial distribution is still the appropriate
underlying model. In this case, we are presuming that one package of mixed nuts
constitutes a random sample.

np = 374(0.50) = 187
nq = 374(1 − 0.50) = 187
Both are > 10, so the sampling distribution of p̂ will be approximately normal, with
a mean of 0.50, and a standard error of
\[
\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.50)(0.50)}{374}} = 0.025854384
\]

\[
P\left(\hat{p} > \frac{190}{374}\right)
= P\left(z > \frac{\frac{190}{374} - 0.50}{\sqrt{\frac{(0.50)(0.50)}{374}}}\right)
= P(z > 0.31) = 1 - 0.6217 = 0.3783
\]

p-value = 0.3783 > α = 0.025
Fail to reject H0. There is insufficient evidence to infer that there are more than 50%
peanuts in the mixed nuts packages.
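The same one-tailed test can be sketched in Python using only the standard library. Because the z-score is not rounded to 0.31 before the normal lookup, the p-value may differ from the table-based 0.3783 in the fourth decimal place. The variable names are illustrative.

```python
import math
from statistics import NormalDist

successes, n = 190, 374  # peanuts observed, total nuts in the package
p0 = 0.50                # proportion claimed under H0
alpha = 0.025

p_hat = successes / n

# Standard error uses the hypothesized proportion p0, not p_hat.
std_error = math.sqrt(p0 * (1 - p0) / n)

z = (p_hat - p0) / std_error

# Right-tailed test: p-value = P(Z > z).
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```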



23. H0: The proportions of types of nuts are the same for the two companies.
H1: The proportions of types of nuts are different for the two companies.
α = 0.05

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown
below.

Chi-Squared Expected Values Calculations
p-value    0.729017

             Almonds     Peanuts    Hazelnuts    Cashews     Pecans
Company B    161.2716    374.973    67.03058     70.34892    64.3759
Company A    81.72842    190.027    33.96942     35.65108    32.6241

All expected values are greater than 5, so we can proceed. The p-value is 0.729,
which is greater than α = 0.05. Fail to reject H0. There is insufficient evidence to
infer that the proportions of types of nuts are different for the two companies.

24. H0: p_A − p_B = 0
H1: p_A − p_B ≠ 0

α = 0.05

p̂_A = 190/374 = 0.50802139, n_A = 374, p̂_B = 375/738 = 0.508130081, n_B = 738

Sampling is done without replacement, but it is likely that both companies produce
many packages of mixed nuts. Again, we are assuming that one package is a random
sample.

n_A p̂_A = 190 > 10
n_A q̂_A = 184 > 10
n_B p̂_B = 375 > 10
n_B q̂_B = 363 > 10

\[
\hat{p} = \frac{190 + 375}{374 + 738} = 0.508093525
\]

z-score = −0.003

P(z ≤ −0.003) ≈ 50%, so the two-tailed p-value is approximately 100%, which is far
greater than α = 0.05.
Fail to reject H0. There is insufficient evidence to suggest that there is a difference in
the proportions of peanuts in the mixed-nuts packages of the two companies.
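The pooled two-proportion z-test can be reproduced with the following Python sketch. It uses the pooled estimate of p in the standard error, as in the calculation of p̂, and doubles the tail area because the alternative hypothesis is two-sided. The variable names are illustrative.

```python
import math
from statistics import NormalDist

x_a, n_a = 190, 374  # peanuts and total nuts, Company A
x_b, n_b = 375, 738  # peanuts and total nuts, Company B
alpha = 0.05

p_a = x_a / n_a
p_b = x_b / n_b

# Under H0 the two proportions are equal, so the samples are pooled.
p_pooled = (x_a + x_b) / (n_a + n_b)
q_pooled = 1 - p_pooled

std_error = math.sqrt(p_pooled * q_pooled * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / std_error

# Two-tailed p-value: twice the area beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"pooled p = {p_pooled:.9f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```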

25. The Excel template output is shown below.

Proportions
Sample 1 Size            374
Sample 2 Size            738
n1 p̂1                    190
n1 q̂1                    184
n2 p̂2                    375
n2 q̂2                    363
Are np and nq >= 10?     yes

With 90% confidence, we estimate the interval (-0.0523, 0.0521) contains the true
difference in the proportion of peanuts in the mixed-nuts packages of the two
companies.

This 90% confidence interval is narrower than the 95% confidence interval that would
correspond to the two-tailed hypothesis test at α = 0.05 in the previous exercise.
Failing to reject the hypothesis of no difference in that test tells us that the 95%
interval would contain zero, but it does not by itself guarantee that the narrower
90% interval does. Here the difference in the sample proportions is so close to zero
(z ≈ −0.003) that the 90% interval also contains zero.
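The interval limits can be reproduced with the sketch below. It assumes the Excel template uses the unpooled standard error and the standard normal critical value for 90% confidence. The reported limits are consistent with that assumption, but the template's internals are not shown here.

```python
import math
from statistics import NormalDist

x1, n1 = 190, 374  # peanuts and total nuts, sample 1
x2, n2 = 375, 738  # peanuts and total nuts, sample 2
confidence = 0.90

p1 = x1 / n1
p2 = x2 / n2

# Unpooled standard error (assumed to be what the template uses).
std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Critical value leaving (1 - confidence) / 2 in each tail, about 1.645.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * std_error
lower = (p1 - p2) - margin
upper = (p1 - p2) + margin

print(f"({lower:.4f}, {upper:.4f})")
print("Contains zero:", lower < 0 < upper)
```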

26. You cannot use the Chi-square test on the weights of the different-coloured candies
directly. The Chi-square test works with discrete qualitative data, and weights are
continuous quantitative data. As well, it is important to use the correct counts for the
test. Don't be lazy!

If you convert the weights into an approximate number of candies, you can proceed,
although you are only approximating. So, for example, if you knew each candy
weighed 1.5 grams, you could convert the weights for each colour into a number of
candies.

Try it! Suppose the weight breakdown was as follows:

          Red    Yellow    Green    Black    Orange    Total
Weight    62     46        58       39       45        250

Use Excel to do a Chi-square test for this data set. Then divide each of the weights
by 1.5 grams to get the number of candies of each colour, and repeat. You will see
that you do not get the same Chi-square statistic or p-value for the two versions.
Only the second version, based on counts, is correct.
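The comparison can also be tried in Python instead of Excel. The sketch below assumes a null hypothesis of equal proportions across the five colours. The exercise text here does not state the null hypothesis, so this is an assumption made only to have something to test. The function names are our own, and the converted counts are left as fractions because the conversion is only approximate.

```python
import math

# Weights in grams for each colour, from the table.
weights = {"Red": 62, "Yellow": 46, "Green": 58, "Black": 39, "Orange": 45}
GRAMS_PER_CANDY = 1.5


def chi_square_equal_shares(values):
    """Chi-square statistic against equal expected values (assumed H0)."""
    expected = sum(values) / len(values)
    return sum((v - expected) ** 2 / expected for v in values)


def upper_tail_df4(x):
    """P(X^2 > x) for 4 degrees of freedom (closed form)."""
    half = x / 2
    return math.exp(-half) * (1 + half)


# Version 1 (incorrect): treat the weights as if they were counts.
chi_sq_weights = chi_square_equal_shares(list(weights.values()))
p_weights = upper_tail_df4(chi_sq_weights)

# Version 2 (correct): convert the weights to approximate counts first.
counts = [w / GRAMS_PER_CANDY for w in weights.values()]
chi_sq_counts = chi_square_equal_shares(counts)
p_counts = upper_tail_df4(chi_sq_counts)

print(f"weights: X^2 = {chi_sq_weights:.3f}, p-value = {p_weights:.4f}")
print(f"counts:  X^2 = {chi_sq_counts:.3f}, p-value = {p_counts:.4f}")
```

Dividing every value by 1.5 divides the statistic by 1.5 as well, which is why the two versions disagree: the Chi-square statistic depends on the scale of the numbers, so it is only meaningful for counts.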
