
CHAPTER 6

Hypothesis Testing

A statistical hypothesis is a claim about a population characteristic (and on occasion more than one). An example of a hypothesis is the claim that the population mean is some value, e.g. $\mu = \mu_0$ for a specified number $\mu_0$.

Hypotheses and Test Procedures
Null Hypothesis: H0
The claim that is initially assumed to be true.

Alternative Hypothesis: H1 or Ha
The complementary assertion to H0.
The new statement that we wish to test.

Hypothesis Test Example


We own a paint company. The old paint takes 60 minutes to dry. We want to see if a new paint will dry faster. Let $\mu$ be the mean drying time of the new paint.
H0: $\mu = 60$
H1: $\mu < 60$

A test procedure is created under the assumption of H0, and then it is determined how likely that assumption is compared to its complement Ha.
The decision will be based on
Test Statistic and Rejection Region
or
p-value and Significance Level

Test Procedures
Test Statistic - a function of the sample data on which the decision (reject or do not reject H0) is made
Rejection Region - the set of all test statistic values for which H0 will be rejected
The basis for choosing a particular rejection region lies in an understanding of the errors that can be made.

Hypothesis Test Errors
Type I Error: rejecting a true H0
Type II Error: failing to reject a false H0

              REJECT H0        FAIL TO REJECT H0
True H0       TYPE I ERROR     CORRECT
False H0      CORRECT          TYPE II ERROR

Since we wish to control for the type I error, we set $P(\text{Type I error}) = \alpha$.
The default value of $\alpha$ (the significance level) is usually taken to be 0.05.
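
To make the meaning of $\alpha$ concrete, the following is a minimal simulation sketch, not part of the original slides. It draws many samples with H0 true and counts how often a lower-tail z test rejects. The null mean 60 echoes the paint example; the standard deviation, sample size, number of trials and seed are hypothetical values chosen only for illustration.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(mu0=60.0, sigma=5.0, n=30, alpha=0.05, trials=5000, seed=1):
    """Fraction of samples, drawn with H0 true, for which H0 is (wrongly) rejected."""
    rng = random.Random(seed)
    # Lower-tail critical value: reject when z falls below it (about -1.645 for alpha = 0.05).
    z_crit = NormalDist().inv_cdf(alpha)
    rejections = 0
    for _ in range(trials):
        # sigma and n are hypothetical; the data are generated with mean mu0, so H0 is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if z < z_crit:
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land near the chosen $\alpha$, which is what "controlling the type I error" means in practice.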

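Motivating the test procedure
Example: The drying time of a certain type of paint, under fixed environmental conditions, is known to be normally distributed with mean 75 min. and standard deviation 9 min. Chemists have added a new additive that is believed to decrease drying time, have obtained a sample of 35 drying times, and wish to test their assertion at significance level $\alpha$.
Solution: Here we are interested in testing the following hypotheses (let $\mu$ be the mean drying time),
$$H_0: \mu = 75 \quad \text{vs} \quad H_a: \mu < 75.$$
An obvious candidate for a test statistic is the sample mean $\bar{X}$, which is normally distributed.
Thus, under H0,
$$\bar{X} \sim N\!\left(75, \frac{9^2}{35}\right),$$
or,
$$Z = \frac{\bar{X} - 75}{9/\sqrt{35}} \sim N(0, 1).$$
If the test value is small enough, i.e.
$$\text{T.S.} = \frac{\bar{x} - 75}{9/\sqrt{35}} < -z_{\alpha},$$
where $z_{\alpha}$ is the value with upper-tail probability $\alpha$ under the standard normal distribution, then we reject H0.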

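What is the logic?
We assume that the sample mean $\bar{x}$ is a good estimate for $\mu$, and hence under H0 the difference $\bar{x} - 75$ should be close to 0, which implies T.S. should be close to zero. However, if it is not, then it implies that 75 was not a good hypothesis value for the true mean.

Assume that $\bar{x} = 70.8$ from the 35 samples; then,
$$\text{T.S.} = \frac{70.8 - 75}{9/\sqrt{35}} = -2.76,$$
thus, taking $\alpha = 0.05$,
$$-2.76 < -z_{0.05} = -1.645.$$
So, we reject the null hypothesis at the 0.05 significance level.
We can also reach a conclusion using the p-value.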

P-value
The p-value of a hypothesis test is the probability of
observing the specific value of the test statistic,
T.S., or a more extreme value, under the null
hypothesis.
The direction of the extreme values is indicated by
the alternative hypothesis.

Computing p-value for our example
In this example, values more extreme than -2.76 are those smaller than -2.76, as the alternative, Ha: $\mu < 75$, indicates values less than 75.
Thus,
$$\text{p-value} = P(Z < -2.76) = 0.0029,$$
which indicates that p-value $< \alpha$ for the default $\alpha = 0.05$, so we reject the null hypothesis.

The null hypothesis is rejected in favor of the alternative hypothesis because the probability of observing the test statistic value of -2.76, or a more extreme one (as indicated by Ha), is smaller than the probability of the type I error ($\alpha$) we are willing to undertake.
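
As a rough check of the paint example, the sketch below carries out both decision routes (rejection region and p-value) with the Python standard library. The sample mean 70.8 and the level 0.05 are assumptions: 70.8 is the value consistent with the reported test statistic of -2.76, and 0.05 is the default level mentioned earlier.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 75.0, 9.0, 35   # null mean, known standard deviation, sample size
xbar = 70.8                     # assumed sample mean (consistent with T.S. = -2.76)
alpha = 0.05                    # assumed significance level (the usual default)

std_normal = NormalDist()
z = (xbar - mu0) / (sigma / sqrt(n))

# Route 1: rejection region for the lower-tail alternative Ha: mu < 75.
z_crit = std_normal.inv_cdf(alpha)      # equals -z_alpha
reject_by_region = z < z_crit

# Route 2: p-value, the probability of a value at least as small as z under H0.
p_value = std_normal.cdf(z)
reject_by_p_value = p_value < alpha

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0:", reject_by_region, reject_by_p_value)
```

The two routes always agree, because $z < -z_{\alpha}$ holds exactly when $P(Z < z) < \alpha$.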

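Large sample test for population mean (section 6.1)
Let $X_1, \dots, X_n$ be a random sample with large $n$ ($n > 30$), so that by the C.L.T. $\bar{X}$ is approximately normally distributed.
To test,
I. H0: $\mu = \mu_0$ vs Ha: $\mu > \mu_0$
II. H0: $\mu = \mu_0$ vs Ha: $\mu < \mu_0$
III. H0: $\mu = \mu_0$ vs Ha: $\mu \neq \mu_0$
at the $\alpha$ significance level, first compute the test statistic,
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$

Making decision
Reject the null if,
(i) $z > z_{\alpha}$; p-value $= P(Z > z)$
(ii) $z < -z_{\alpha}$; p-value $= P(Z < z)$
(iii) $|z| > z_{\alpha/2}$; p-value $= 2P(Z > |z|)$
Here $z_{\alpha}$ denotes the value with upper-tail probability $\alpha$ under the standard normal distribution. In each case the rule is equivalent to rejecting when the p-value is less than $\alpha$.
Remark 6.1. If $\sigma$ is unknown and $s$ is used instead, one should use Student's t and the relevant t-table instead of the z-table, but since the sample size is large the two distributions are nearly equivalent.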

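Example: A scale is to be calibrated by weighing a 1000 g test weight 60 times. The 60 scale readings have mean 1000.6 g and standard deviation 2 g. Find the P-value for testing H0: $\mu = 1000$ versus Ha: $\mu \neq 1000$.
Solution: Assuming H0 is true, from the C.L.T. we can say,
$$\bar{X} \sim N\!\left(1000, \frac{\sigma^2}{60}\right).$$
We can approximate $\sigma$ with $s = 2$ because the sample size is large. Thus,
$$z = \frac{1000.6 - 1000}{2/\sqrt{60}} = 2.32, \qquad \text{p-value} = 2P(Z > 2.32) = 0.0204.$$
The P-value is small, so we have strong evidence to reject the null hypothesis.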
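
The three large-sample cases can be collected in one small helper. This is an illustrative sketch, not a library routine: the function name `z_test` and the labels "greater", "less" and "two-sided" are made up here, and the call assumes the scale example is the two-sided test of $\mu = 1000$.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sd, n, alternative):
    """Large-sample z test; `alternative` is 'greater', 'less' or 'two-sided'."""
    z = (xbar - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: mu > mu0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: mu < mu0
        p = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p

# Scale calibration data: 60 readings, mean 1000.6 g, standard deviation 2 g.
z, p_value = z_test(1000.6, 1000, 2, 60, "two-sided")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value printed here uses the unrounded z, so it can differ in the last digits from a table lookup at z = 2.32.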

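Making decision solely based on p-value, i.e. when significance level is not given:
the smaller the p-value, the stronger the evidence against H0. From the largest p-values to the smallest, the evidence against H0 is graded as
No evidence
Weak evidence
Strong evidence
Very strong evidence

If the significance level $\alpha$ is given in the problem, then compare the p-value with $\alpha$ and reject H0 whenever the p-value is less than $\alpha$.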

Example:
in the previous example perform a

hypotheses testing for versus at significance level .
Solution:
p-value=
So, we do not have any evidence to reject the null
hypothesis.

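Tests for population proportion (section 6.3)
Let $X$ be the number of successes in $n$ i.i.d. Bernoulli trials with probability of success $p$; then
$$X \sim \text{Bin}(n, p).$$
By the C.L.T. we know that under certain conditions ($np > 5$, $n(1-p) > 5$), approximately,
$$\hat{p} = \frac{X}{n} \sim N\!\left(p, \frac{p(1-p)}{n}\right).$$
To test,
I. H0: $p = p_0$ vs Ha: $p > p_0$
II. H0: $p = p_0$ vs Ha: $p < p_0$
III. H0: $p = p_0$ vs Ha: $p \neq p_0$
we must assume that, under the null hypothesis H0, the number of successes and failures is greater than 5, i.e. $np_0 > 5$ and $n(1-p_0) > 5$, such that under H0 and using the C.L.T. we can say,
$$\hat{p} \sim N\!\left(p_0, \frac{p_0(1-p_0)}{n}\right).$$
The test statistic is
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}},$$
and the r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis assumption. Reject the null if
(i) $z > z_{\alpha}$; p-value $= P(Z > z)$
(ii) $z < -z_{\alpha}$; p-value $= P(Z < z)$
(iii) $|z| > z_{\alpha/2}$; p-value $= 2P(Z > |z|)$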

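Example: For a sample of 1225 baselines, 926 gave results that were within the class C spirit leveling tolerance limits. Can we conclude that this method produces results within the tolerance limits more than 75% of the time?
Solution: First, we should write the hypotheses,
$$H_0: p = 0.75 \quad \text{vs} \quad H_a: p > 0.75.$$
Second, we should check the normality conditions under the null hypothesis,
$$np_0 = 1225(0.75) = 918.75 > 5, \qquad n(1-p_0) = 1225(0.25) = 306.25 > 5.$$
So, we have normality under the assumption of H0. Thus,
$$\hat{p} \sim N\!\left(0.75, \frac{0.75(0.25)}{1225}\right).$$
The observed sample proportion is,
$$\hat{p} = \frac{926}{1225} = 0.7559,$$
the test statistic is,
$$z = \frac{0.7559 - 0.75}{\sqrt{0.75(0.25)/1225}} = 0.48,$$
and the p-value is,
$$\text{p-value} = P(Z > 0.48) = 0.3156.$$
So, we do not have any evidence to reject H0.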
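
The baseline example can be reproduced in a few lines. This sketch assumes the upper-tail alternative $p > 0.75$ (the question asks about "more than 75%") and uses the "greater than 5" normality check stated above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 1225, 926, 0.75

# Normality check under H0: expected successes and failures both exceed 5.
normal_ok = n * p0 > 5 and n * (1 - p0) > 5

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # the variance uses p0, not p_hat
p_value = 1 - NormalDist().cdf(z)            # upper tail, since Ha: p > 0.75

print(f"normality conditions met: {normal_ok}")
print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Note that the standard error is computed from the hypothesized $p_0$, because the test statistic is evaluated under H0.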

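Small sample test for population mean (section 6.4)
If the sample size is small, i.e. $n \leq 30$, then the C.L.T. is not applicable for $\bar{X}$ and therefore we must assume that the individual random variables corresponding to the sample are normal random variables with mean $\mu$ and variance $\sigma^2$. As a result,
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).$$
Thus, if $\sigma$ is known then we can proceed exactly as in the case of the large sample test for the population mean.

What if $\sigma$ is unknown?
If $\sigma$ is unknown, which is usually the case, we replace it by its sample estimate $s$. Consequently, under H0 we have,
$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1},$$
and then for the observed value
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$
At the $\alpha$ significance level, for the same hypothesis tests as before, we reject H0 if
(i) Ha: $\mu > \mu_0$: $t > t_{\alpha,\,n-1}$; p-value $= P(T_{n-1} > t)$
(ii) Ha: $\mu < \mu_0$: $t < -t_{\alpha,\,n-1}$; p-value $= P(T_{n-1} < t)$
(iii) Ha: $\mu \neq \mu_0$: $|t| > t_{\alpha/2,\,n-1}$; p-value $= 2P(T_{n-1} > |t|)$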

Example: Muzzle velocities of eight shells tested with a new gunpowder yield a sample mean of 2959 feet per second and a standard deviation of 39.4. The manufacturer claims that the new gunpowder produces an average velocity of no less than 3000 feet per second. Does the sample provide enough evidence to contradict the manufacturer's claim at the 0.05 significance level? (Assume the velocity of the new gunpowder is normally distributed.)

Solution: Let $\mu$ be the mean velocity of the new gunpowder.
Here, we are interested in testing
H0: $\mu \geq 3000$
H1: $\mu < 3000$
because we want to see whether there is evidence to refute the manufacturer's claim.
The test statistic is,
$$t = \frac{2959 - 3000}{39.4/\sqrt{8}} = -2.943,$$
and the rejection region is
$$t < -t_{0.05,\,7} = -1.895.$$
The observed value falls in the rejection region, and the p-value is $P(T_7 < -2.943) \approx 0.011 < 0.05$.
So, we have strong evidence against the null hypothesis and reject the manufacturer's claim at the 0.05 level.
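
The Python standard library has no Student's t distribution, so the sketch below obtains the p-value for the gunpowder example by numerically integrating the t density with Simpson's rule. The helpers `t_pdf` and `t_cdf` are illustrative names written for this note, not library functions, and the step count is an arbitrary choice.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Gunpowder example: n = 8, sample mean 2959, sample sd 39.4, H1: mu < 3000.
n, xbar, s, mu0, alpha = 8, 2959.0, 39.4, 3000.0, 0.05
t_stat = (xbar - mu0) / (s / sqrt(n))
p_value = t_cdf(t_stat, n - 1)            # lower tail, matching H1
reject = p_value < alpha

print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With a statistics package installed, a built-in t distribution would normally replace the hand-written integration.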

Remark: The values contained within a two-sided $100(1-\alpha)\%$ C.I. are precisely those values for which the p-value of a two-sided hypothesis test will be greater than $\alpha$.
Example: The lifetime of a single-cell organism is believed to be on average 257 hours. A small preliminary study was conducted to test whether the average lifetime was different when the organism was placed in a certain medium. The measurements are assumed to be normally distributed and turned out to be 253, 261, 258, 255, and 256.
Solution 1: Here we want to test H0: $\mu = 257$ vs. Ha: $\mu \neq 257$. With $\bar{x} = 256.6$ and $s = 3.05$, the test statistic value is
$$t = \frac{256.6 - 257}{3.05/\sqrt{5}} = -0.293,$$
$$\text{p-value} = 2P(T_4 > 0.293) \approx 0.78.$$
Hence, since the p-value is large, we fail to reject the null hypothesis and we conclude that the population mean is not statistically different from 257.
Solution 2: Instead of hypothesis testing, if a two-sided 95% confidence interval is constructed by
$$\bar{x} \pm t_{0.025,\,4}\,\frac{s}{\sqrt{n}} = 256.6 \pm 2.776\,\frac{3.05}{\sqrt{5}} = (252.8,\ 260.4),$$
it is clear that the null hypothesis value of $\mu = 257$ is a plausible value, and consequently we do not reject H0 at the 0.05 significance level.
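
The agreement between the two solutions can be checked directly from the five measurements. In this sketch the critical value 2.776 is the standard t-table entry $t_{0.025,\,4}$, entered by hand; everything else is computed from the data.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [253, 261, 258, 255, 256]
mu0 = 257
t_crit = 2.776          # t-table value t_{0.025, 4}, typed in by hand

n = len(lifetimes)
xbar = mean(lifetimes)
se = stdev(lifetimes) / sqrt(n)          # estimated standard error of the mean

# Solution 1: two-sided t test at the 0.05 level.
t_stat = (xbar - mu0) / se
fail_to_reject = abs(t_stat) < t_crit

# Solution 2: two-sided 95% confidence interval.
ci = (xbar - t_crit * se, xbar + t_crit * se)
inside_ci = ci[0] < mu0 < ci[1]

print(f"xbar = {xbar}, t = {t_stat:.3f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print("same decision from both routes:", fail_to_reject == inside_ci)
```

Both conditions reduce to $|\bar{x} - \mu_0| < t_{0.025,\,4}\, s/\sqrt{n}$, which is why they cannot disagree.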

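Large sample test for difference of two means (section 6.5)
Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ represent two independent large random samples with $n_X > 30$ and $n_Y > 30$, with means $\mu_X$, $\mu_Y$ and variances $\sigma_X^2$, $\sigma_Y^2$, respectively. By the C.L.T. we have,
$$\bar{X} - \bar{Y} \sim N\!\left(\mu_X - \mu_Y,\ \frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}\right).$$
How to test the following hypotheses?
I. H0: $\mu_X - \mu_Y = \Delta_0$ vs Ha: $\mu_X - \mu_Y > \Delta_0$
II. H0: $\mu_X - \mu_Y = \Delta_0$ vs Ha: $\mu_X - \mu_Y < \Delta_0$
III. H0: $\mu_X - \mu_Y = \Delta_0$ vs Ha: $\mu_X - \mu_Y \neq \Delta_0$

We assume that the variances are known, and the test statistic is
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}.$$
The r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis H0, that is, when $\mu_X - \mu_Y = \Delta_0$. Reject the null if
(i) $z > z_{\alpha}$; p-value $= P(Z > z)$
(ii) $z < -z_{\alpha}$; p-value $= P(Z < z)$
(iii) $|z| > z_{\alpha/2}$; p-value $= 2P(Z > |z|)$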

Example: Two welding procedures are to be tested on the property of the diameter of inclusions, which are particles embedded in the weld. A sample of 544 inclusions in welds made using method X averaged 0.37 $\mu$m in diameter, with a standard deviation of 0.25 $\mu$m. A sample of 581 inclusions in welds made using method Y averaged 0.40 $\mu$m in diameter, with a standard deviation of 0.26 $\mu$m. Can you conclude that the mean diameter for Y exceeds that of X by more than 0.015 $\mu$m?
Solution:
H0: $\mu_Y - \mu_X = 0.015$ vs Ha: $\mu_Y - \mu_X > 0.015$
The test statistic is
$$z = \frac{0.40 - 0.37 - 0.015}{\sqrt{0.25^2/544 + 0.26^2/581}} = 0.99.$$
This is a one-tailed test with p-value $= P(Z > 0.99) = 0.1611$.
We fail to reject the null hypothesis.
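
A short sketch of the welding calculation follows. It assumes the upper-tail alternative that the mean for Y exceeds the mean for X by more than 0.015, and, because both samples are large, it plugs the sample standard deviations in for the unknown population values.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the example (diameters in micrometres).
n_x, mean_x, sd_x = 544, 0.37, 0.25
n_y, mean_y, sd_y = 581, 0.40, 0.26
delta0 = 0.015                      # hypothesized difference mu_Y - mu_X under H0

se = sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)
z = (mean_y - mean_x - delta0) / se
p_value = 1 - NormalDist().cdf(z)   # upper tail: Ha says the difference exceeds delta0

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value is well above any common significance level, in line with the decision not to reject H0.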

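Tests for the difference between two proportions (section 6.6)
Let $X \sim \text{Bin}(n_X, p_X)$ and $Y \sim \text{Bin}(n_Y, p_Y)$ represent two independent binomial random variables resulting from two independent sets of i.i.d. Bernoulli trials. To test,
I. H0: $p_X - p_Y = 0$ vs Ha: $p_X - p_Y > 0$
II. H0: $p_X - p_Y = 0$ vs Ha: $p_X - p_Y < 0$
III. H0: $p_X - p_Y = 0$ vs Ha: $p_X - p_Y \neq 0$
we first need an appropriate test statistic.

We must assume that the number of successes and failures is greater than 10 for both samples. As the null hypothesis values for $p_X$ and $p_Y$ are not available, we simply check that the sample successes and failures are greater than 10. By virtue of the C.L.T.,
$$\hat{p}_X - \hat{p}_Y \sim N\!\left(p_X - p_Y,\ \frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}\right),$$
and the test statistic would be constructed in the usual way.

However, under H0 it is assumed that $p_X = p_Y$, which implies that the variances of the two Bernoulli trials are equal ($p_X(1-p_X) = p_Y(1-p_Y)$).

Therefore we can replace $p_X$ and $p_Y$ in the variance by the pooled estimate,
$$\hat{p} = \frac{x + y}{n_X + n_Y}.$$
The test statistic is then,
$$z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}},$$
and the r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis.
Thus, we reject the null hypothesis whenever,
(i) Ha: $p_X - p_Y > 0$: $z > z_{\alpha}$; p-value $= P(Z > z)$
(ii) Ha: $p_X - p_Y < 0$: $z < -z_{\alpha}$; p-value $= P(Z < z)$
(iii) Ha: $p_X - p_Y \neq 0$: $|z| > z_{\alpha/2}$; p-value $= 2P(Z > |z|)$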

Example: We want to compare the proportion of defective electric motors turned out by two shifts of workers. From the large number produced in a given week, 250 motors were selected from the output of shift I and 200 motors were selected from the output of shift II. The sample from shift I revealed 25 to be defective and the sample from shift II revealed 30 faulty motors.
Is it true to say the difference between the proportions of defective motors produced in the two shifts is not equal to zero? Use a 0.05 significance level.

Solution:
Let $p_X$ be the proportion of defective motors produced by workers in shift I and $p_Y$ be the proportion of defective motors produced by workers in shift II.

The goal is testing H0: $p_X - p_Y = 0$ versus Ha: $p_X - p_Y \neq 0$.
Using $n_X = 250$, $n_Y = 200$, $x = 25$ and $y = 30$ we get $\hat{p}_X = 25/250 = 0.10$ and $\hat{p}_Y = 30/200 = 0.15$.
Since
$n_X\hat{p}_X = 25 > 10$ and $n_X(1-\hat{p}_X) = 225 > 10$,
and
$n_Y\hat{p}_Y = 30 > 10$ and $n_Y(1-\hat{p}_Y) = 170 > 10$,
the sample sizes are large enough to use the normal approximation.
Also,
$$\hat{p} = \frac{25 + 30}{250 + 200} = 0.1222.$$
Thus,
$$z = \frac{0.10 - 0.15}{\sqrt{0.1222(1 - 0.1222)\left(\frac{1}{250} + \frac{1}{200}\right)}} = -1.61,$$
and,
$$\text{P-value} = 2P(Z < -1.61) = 0.1074.$$
So, we fail to reject the null hypothesis at the 0.05 significance level. That is, at the 0.05 significance level there is not enough evidence to conclude that the difference between the proportions of defective motors produced in the two shifts is different from zero.
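
The motor example can be recomputed from the raw counts as a sanity check. This sketch follows the pooled two-proportion procedure described above; the variable names are illustrative. The pooled estimate is taken as the total number of defectives over the total number of motors sampled, (25 + 30)/(250 + 200).

```python
from math import sqrt
from statistics import NormalDist

n_x, x = 250, 25      # shift I: motors sampled, defectives
n_y, y = 200, 30      # shift II: motors sampled, defectives
alpha = 0.05

# Sample successes and failures should all exceed 10 for the normal approximation.
counts_ok = min(x, n_x - x, y, n_y - y) > 10

p_x, p_y = x / n_x, y / n_y
p_pool = (x + y) / (n_x + n_y)                      # pooled estimate under H0: p_X = p_Y
se = sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
z = (p_x - p_y) / se
p_value = 2 * NormalDist().cdf(-abs(z))             # two-sided alternative
reject = p_value < alpha

print(f"pooled p = {p_pool:.4f}, z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The p-value exceeds 0.05, so the data do not establish a difference between the two shifts.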

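Small sample test for the difference between two means (6.7)
In this case, since the C.L.T. is not applicable, we must assume that the two random samples are normally distributed and independent. The null hypothesis is H0: $\mu_X - \mu_Y = \Delta_0$.
1. If the variances are known, the test statistic is,
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}},$$
which has a standard normal distribution under the null hypothesis.

2. If the variances are unknown (which is usually the case), the test statistic is
$$t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}},$$
which has approximately a $t_{\nu}$ distribution under H0, where the degrees of freedom are given by
$$\nu = \frac{\left(s_X^2/n_X + s_Y^2/n_Y\right)^2}{\dfrac{(s_X^2/n_X)^2}{n_X - 1} + \dfrac{(s_Y^2/n_Y)^2}{n_Y - 1}},$$
rounded down to the nearest integer when a t-table is used.

We reject H0 if
(i) Ha: $\mu_X - \mu_Y > \Delta_0$: $t > t_{\alpha,\,\nu}$; p-value $= P(T_{\nu} > t)$
(ii) Ha: $\mu_X - \mu_Y < \Delta_0$: $t < -t_{\alpha,\,\nu}$; p-value $= P(T_{\nu} < t)$
(iii) Ha: $\mu_X - \mu_Y \neq \Delta_0$: $|t| > t_{\alpha/2,\,\nu}$; p-value $= 2P(T_{\nu} > |t|)$
Remark: If we have equality of variances ($\sigma_X^2 = \sigma_Y^2$) then we replace both $s_X^2$ and $s_Y^2$ with the pooled variance
$$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2},$$
and in this case the degrees of freedom for the t distribution is $n_X + n_Y - 2$.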

Example: The prestressing wire on each of two concrete pipes manufactured at different times was compared for torsion properties. Ten specimens randomly selected from each pipe were twisted in a laboratory apparatus until they broke; the number of revolutions until complete failure was recorded. The results are as follows, with C1 and C2 denoting the two concrete pipes:
C1: 5.83, 8.66, 4.75, 3.00, 3.37, 3.63, 4.00, 4.63, 4.25, 4.13
C2: 3.38, 2.81, 7.00, 1.50, 5.88, 5.25, 4.08, 7.63, 4.50, 4.88
Is there any evidence to suggest that the true mean revolutions to failure differ for the wire on the two pipes?
Solution: MINITAB
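
The slides defer this solution to MINITAB. As a stand-in, the sketch below computes the unequal-variance t statistic and its degrees of freedom from the listed data. It is not the MINITAB output. The example states no significance level, so a two-sided test at 0.05 is assumed, and the critical values are standard t-table entries typed in by hand for the degrees of freedom near the computed value.

```python
from math import floor, sqrt
from statistics import mean, variance

c1 = [5.83, 8.66, 4.75, 3.00, 3.37, 3.63, 4.00, 4.63, 4.25, 4.13]
c2 = [3.38, 2.81, 7.00, 1.50, 5.88, 5.25, 4.08, 7.63, 4.50, 4.88]

# Assumed two-sided test at alpha = 0.05; t_{0.025, df} from a standard t-table.
T_CRIT_025 = {16: 2.120, 17: 2.110, 18: 2.101}

v1 = variance(c1) / len(c1)     # s_X^2 / n_X
v2 = variance(c2) / len(c2)     # s_Y^2 / n_Y
t_stat = (mean(c1) - mean(c2)) / sqrt(v1 + v2)

# Degrees of freedom for the unequal-variance case, rounded down for table lookup.
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(c1) - 1) + v2 ** 2 / (len(c2) - 1))
df_floor = floor(df)

reject = abs(t_stat) > T_CRIT_025[df_floor]
print(f"t = {t_stat:.3f}, df = {df:.2f} (use {df_floor}), reject H0: {reject}")
```

The two sample means are nearly identical, so the statistic is close to zero and the data give no indication of a difference.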

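Test for paired data (section 6.8)
In the event that two samples are dependent, i.e. paired, such as when two different measurements are made on the same experimental unit, we consider the data in the form of the pairs $(X_i, Y_i)$, $i = 1, \dots, n$, and construct the one-dimensional, i.e. one-sample, differences $D_i = X_i - Y_i$ for $i = 1, \dots, n$. As shown earlier, $\mu_D = \mu_X - \mu_Y$.
To test,
I. H0: $\mu_D = \Delta_0$ vs Ha: $\mu_D > \Delta_0$
II. H0: $\mu_D = \Delta_0$ vs Ha: $\mu_D < \Delta_0$
III. H0: $\mu_D = \Delta_0$ vs Ha: $\mu_D \neq \Delta_0$

perform a one-sample hypothesis test by either a large or small sample inference using the test statistic
$$z = \frac{\bar{d} - \Delta_0}{\sigma_D/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}.$$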

Example: The two drying methods for concrete were used on seven different mixes, with each mix of concrete subjected to each drying method. The resulting strength test measurements (in psi) are given below. Is there evidence of a difference between average strengths for the two drying methods at the 10% significance level?

Mix    Method I    Method II
1      3160        3170
2      3240        3220
3      3190        3160
4      3520        3530
5      3480        3440
6      3220        3210
7      3120        3120

Solution: MINITAB
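
Since the solution is again left to MINITAB, here is a sketch of the paired calculation. It rests on an assumption about the flattened table: the fourteen listed values are read as seven consecutive (Method I, Method II) pairs, one pair per mix. The critical value 1.943 is the standard t-table entry $t_{0.05,\,6}$ for a two-sided test at the 10% level.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed pairing: values read in order as (Method I, Method II) for each mix.
method_1 = [3160, 3240, 3190, 3520, 3480, 3220, 3120]
method_2 = [3170, 3220, 3160, 3530, 3440, 3210, 3120]
t_crit = 1.943                      # t_{0.05, 6}, two-sided test at the 10% level

diffs = [a - b for a, b in zip(method_1, method_2)]
n = len(diffs)

# One-sample t test on the differences, with H0: mu_D = 0.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
reject = abs(t_stat) > t_crit

print(f"differences = {diffs}")
print(f"t = {t_stat:.3f} on {n - 1} degrees of freedom, reject H0: {reject}")
```

Under this reading of the table the statistic stays below the critical value, so no difference between the drying methods is detected at the 10% level.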

Power of the Test
The power of a test is the probability of rejecting H0 whenever it is false.
$$\text{Power} = P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - P(\text{Type II error})$$
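
Power depends on how far the true mean is from the hypothesized one. The sketch below evaluates it for the lower-tail paint test (null mean 75, standard deviation 9, sample size 35). The level 0.05 and the alternative mean of 72 minutes are hypothetical choices made only to illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tail(mu_true, mu0=75.0, sigma=9.0, n=35, alpha=0.05):
    """Probability that the lower-tail z test rejects H0: mu = mu0 when the mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(alpha)             # reject when z < z_crit
    shift = (mu0 - mu_true) / (sigma / sqrt(n))    # how far the truth sits below mu0, in standard errors
    return std_normal.cdf(z_crit + shift)

power_at_72 = power_lower_tail(72.0)   # hypothetical true mean of 72 minutes
power_at_75 = power_lower_tail(75.0)   # H0 true: the rejection probability equals alpha

print(f"power when mu = 72: {power_at_72:.3f}")
print(f"rejection probability when mu = 75: {power_at_75:.3f}")
```

As the true mean moves further below 75, the shift term grows and the power approaches 1.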

Exam 2
1. Section 2.6
2. Section 4.11
3. Sections 5.1-5.7
4. Sections 6.1-6.8
