Hypothesis Testing
A statistical hypothesis is a claim about a population characteristic
(and on occasion more than one).
An example of a hypothesis is the claim that the population mean
equals some specified value, e.g. $\mu = \mu_0$.
Alternative Hypothesis: H1 or Ha
The complementary assertion to H0
The new statement that we wish to test
Test Procedures
Test Statistic - a function of the sample data on
which the decision (reject or do not reject H0) is
made
Rejection Region - set of all test statistic values for
which H0 will be rejected
The basis for choosing a particular rejection region
lies in an understanding of the errors that can be
made.
           | REJECT H0     | FAIL TO REJECT H0
True H0    | TYPE I ERROR  | CORRECT
False H0   | CORRECT       | TYPE II ERROR
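An obvious candidate for a test statistic is the sample mean $\bar{X}$,
which is normally distributed.
Thus, under $H_0: \mu = \mu_0$,
$\bar{X} \sim N(\mu_0, \sigma^2/n)$
or,
T.S. $= \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$.
If the test value is small enough, i.e. T.S. $< -z_\alpha$, where $z_\alpha$ is the
upper-$\alpha$ critical value of the standard normal distribution,
then we reject $H_0$.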
$\bar{X}$ is a good estimate for $\mu$ and hence, if $H_0$ is true,
$\bar{X} - \mu_0$ should be close to 0,
which implies T.S. should be close to zero. However,
if it is not, then it implies that $\mu_0$ was not a good
hypothesis value for the true mean.
Assume that, from the 35 samples, the computed test statistic is
T.S. $= -2.76$,
thus T.S. $< -z_\alpha$.
So, we reject the null hypothesis at significance
level $\alpha$.
We can also reach a conclusion using the p-value.
P-value
The p-value of a hypothesis test is the probability of
observing the specific value of the test statistic,
T.S., or a more extreme value, under the null
hypothesis.
The direction of the extreme values is indicated by
the alternative hypothesis.
The null hypothesis is rejected in favor of the
alternative hypothesis as the probability of
observing the test statistic value of -2.76 or more
extreme (as indicated by Ha) is smaller than the
probability of the type I error ($\alpha$) we are willing to
accept.
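The meaning of $\alpha$ as the probability of a type I error can be checked by simulation. The sketch below is only an illustration: it assumes a lower-tailed z-test with known $\sigma$, and the values of `mu0`, `sigma`, `alpha`, `trials` and `seed` are arbitrary choices that do not come from the slides. It draws many samples for which H0 is true and counts how often H0 is rejected anyway.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(mu0=0.0, sigma=1.0, n=35, alpha=0.05, trials=5000, seed=1):
    """Fraction of samples drawn with H0 true (mu = mu0) that still reject H0."""
    rng = random.Random(seed)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        ts = (xbar - mu0) / (sigma / sqrt(n))
        if ts < -z_alpha:  # lower-tail rejection region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"observed type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen `alpha`, since every rejection in this simulation is by construction a type I error.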
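Making decision
Reject the null if,
(i) for $H_a: \mu < \mu_0$: T.S. $< -z_\alpha$
(i) p-value $= P(Z < \text{T.S.}) < \alpha$
(ii) for $H_a: \mu > \mu_0$: T.S. $> z_\alpha$
(ii) p-value $= P(Z > \text{T.S.}) < \alpha$
(iii) for $H_a: \mu \neq \mu_0$: $|\text{T.S.}| > z_{\alpha/2}$
(iii) p-value $= 2P(Z > |\text{T.S.}|) < \alpha$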
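The decision rules for a z-test can be sketched as a small function. This is a minimal illustration, not a prescribed implementation: the labels `"less"`, `"greater"` and `"two-sided"` and the default `alpha=0.05` are assumptions made for this example, and the standard normal distribution comes from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test(ts, alternative, alpha=0.05):
    """Return (p_value, reject) for an observed z test statistic.

    alternative is one of "less", "greater" or "two-sided"
    (labels chosen for this sketch).
    """
    if alternative == "less":
        p_value = STD_NORMAL.cdf(ts)
    elif alternative == "greater":
        p_value = 1 - STD_NORMAL.cdf(ts)
    elif alternative == "two-sided":
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(ts)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return p_value, p_value < alpha


# The test statistic value -2.76 discussed above, with a lower-tailed alternative.
p_value, reject = z_test(-2.76, "less")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

For T.S. = -2.76 the lower-tail p-value is about 0.003, well below 0.05, which agrees with the rejection stated earlier.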
Remark 6.1. If $\sigma$ is unknown and $s$ is used instead,
one should use Student's t distribution and the relevant t-table instead of the z-table, but since the sample
size is large the two distributions are practically equivalent.
Example:
A scale is to be calibrated by weighing a
1000 g test weight 60 times. The 60 scale readings
have mean 1000.6 g and standard deviation 2 g.
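Find the P-value for testing
$H_0: \mu = 1000$ versus $H_a: \mu \neq 1000$.
Solution:
Assuming $H_0$ is true, from the C.L.T. we can say,
$\bar{X} \sim N(1000, \sigma^2/60)$.
We can approximate $\sigma$ with $s = 2$ because the sample size is
large. Thus,
T.S. $= \frac{1000.6 - 1000}{2/\sqrt{60}} \approx 2.32$,
p-value $= 2P(Z > 2.32) \approx 0.02$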
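The scale calibration numbers can be checked with a few lines of Python. The sketch assumes the hypotheses are $H_0: \mu = 1000$ against the two-sided alternative $H_a: \mu \neq 1000$ (an assumption, natural for a calibration question) and uses $s$ in place of $\sigma$ because the sample is large.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 1000.0                   # hypothesized mean reading in grams
xbar, s, n = 1000.6, 2.0, 60   # sample mean, sample standard deviation, sample size

ts = (xbar - mu0) / (s / sqrt(n))
# Two-sided alternative (assumed): both tails count as "more extreme".
p_value = 2 * (1 - NormalDist().cdf(abs(ts)))

print(f"T.S. = {ts:.2f}, p-value = {p_value:.3f}")
```

The test statistic comes out near 2.32 and the two-sided p-value near 0.02.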
Evidence against
No evidence
Weak evidence
Strong evidence
Very strong evidence
Example:
In the previous example, perform a
hypothesis test of $H_0$ versus $H_a$ at significance level $\alpha$.
Solution:
p-value=
So, we do not have any evidence to reject the null
hypothesis.
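To test,
I. $H_0: p = p_0$ vs $H_a: p < p_0$
II. $H_0: p = p_0$ vs $H_a: p > p_0$
III. $H_0: p = p_0$ vs $H_a: p \neq p_0$
we must assume that, under the null hypothesis $H_0: p = p_0$,
the expected numbers of successes and failures are each greater
than 5, i.e. $np_0 > 5$ and $n(1 - p_0) > 5$, such that under $H_0$ and using the
C.L.T. we can say,
$\hat{p} \sim N(p_0, p_0(1 - p_0)/n)$.
The test statistic is
T.S. $= \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
and the r.v. corresponding to the test statistic has a
standard normal distribution under the null
hypothesis assumption. Reject the null if
(i) T.S. $< -z_\alpha$
(ii) T.S. $> z_\alpha$
(iii) $|\text{T.S.}| > z_{\alpha/2}$
(i) p-value $= P(Z < \text{T.S.})$
(ii) p-value $= P(Z > \text{T.S.})$
(iii) p-value $= 2P(Z > |\text{T.S.}|)$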
Example:
For a sample of 1225 baselines, 926 gave
results that were within the class C spirit leveling
tolerance limits. Can we conclude that this method
produces results within the tolerance limits more
than 75% of the time?
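Solution: First, we should write the hypotheses,
$H_0: p = 0.75$ vs $H_a: p > 0.75$.
The observed sample proportion is,
$\hat{p} = 926/1225 \approx 0.756$,
the test statistic is,
T.S. $= \frac{0.756 - 0.75}{\sqrt{0.75(0.25)/1225}} \approx 0.48$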
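The baseline example can be worked through numerically as follows. This is a sketch under the assumption that the question translates to $H_0: p = 0.75$ against $H_a: p > 0.75$; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 1225, 926, 0.75

# Large-sample condition under H0: expected successes and failures above 5.
assert n * p0 > 5 and n * (1 - p0) > 5

p_hat = successes / n
ts = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(ts)  # upper tail, since Ha is p > p0

print(f"p_hat = {p_hat:.4f}, T.S. = {ts:.2f}, p-value = {p_value:.2f}")
```

With a test statistic near 0.48 and a p-value of roughly 0.32, the data do not support the claim that the method stays within tolerance more than 75% of the time.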
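What if $\sigma$ is unknown?
If $\sigma$
is unknown, which is usually the case, we replace it
by its sample estimate s. Consequently, under $H_0$ we
have,
$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$
and then for the observed value
T.S. $= \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
At the $\alpha$ significance level, for the same hypothesis tests
as before, we reject $H_0$ if
(i) T.S. $< -t_{n-1,\alpha}$
(ii) T.S. $> t_{n-1,\alpha}$
(iii) $|\text{T.S.}| > t_{n-1,\alpha/2}$
(i) p-value $= P(T_{n-1} < \text{T.S.})$
(ii) p-value $= P(T_{n-1} > \text{T.S.})$
(iii) p-value $= 2P(T_{n-1} > |\text{T.S.}|)$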
Remark:
The values contained within a two-sided $100(1 - \alpha)\%$
C.I. are precisely those values for which the p-value
of a two-sided hypothesis test will be greater than $\alpha$.
Example: The lifetime of a single-cell organism is
believed to be on average 257 hours. A small
preliminary study was conducted to test whether
the average lifetime was different when the
organism was placed in a certain medium. The
measurements are assumed to be normally
distributed and turned out to be 253, 261, 258, 255,
and 256.
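Solution 1: Here we want to test $H_0: \mu = 257$ vs. $H_a: \mu \neq 257$.
With $\bar{x} = 256.6$ and $s = 3.05$, the test statistic value is
T.S. $= \frac{256.6 - 257}{3.05/\sqrt{5}} \approx -0.29$,
p-value $= 2P(T_4 > 0.29) \approx 0.78$.
Hence, since the p-value is large we fail to reject the
null hypothesis and we conclude that the population
mean is not statistically different from 257.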
Solution 2: Instead of hypothesis testing, if a two-sided
95% confidence interval is constructed by
$\bar{x} \pm t_{4,0.025}\, s/\sqrt{n} = 256.6 \pm 2.776(3.05/\sqrt{5}) = (252.8, 260.4)$,
it is clear that the null hypothesis value of $\mu = 257$ is a
plausible value and consequently we do not reject $H_0$
at the 0.05 significance level.
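Both solutions for the lifetime example can be reproduced with the standard library. The sketch below assumes the two-sided test of $\mu = 257$ described in the example and takes the critical value $t_{4,0.025} = 2.776$ from a t-table, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [253, 261, 258, 255, 256]  # measurements from the example, in hours
mu0 = 257                              # hypothesized mean lifetime
t_crit = 2.776                         # tabulated t critical value, 4 d.f., upper 0.025

n = len(lifetimes)
xbar = mean(lifetimes)
s = stdev(lifetimes)       # sample standard deviation (divides by n - 1)
se = s / sqrt(n)

ts = (xbar - mu0) / se
ci = (xbar - t_crit * se, xbar + t_crit * se)  # two-sided 95% confidence interval
reject = abs(ts) > t_crit

print(f"xbar = {xbar:.1f}, s = {s:.2f}, T.S. = {ts:.2f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f}), reject H0: {reject}")
```

The interval contains 257 exactly when |T.S.| does not exceed the critical value, which is the correspondence between confidence intervals and two-sided tests noted in the remark.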
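To test,
I. $H_0: \mu_X - \mu_Y = \Delta_0$ vs $H_a: \mu_X - \mu_Y < \Delta_0$
II. $H_0: \mu_X - \mu_Y = \Delta_0$ vs $H_a: \mu_X - \mu_Y > \Delta_0$
III. $H_0: \mu_X - \mu_Y = \Delta_0$ vs $H_a: \mu_X - \mu_Y \neq \Delta_0$
we
assume that the variances are known and the
test statistic is
T.S. $= \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}$
The r.v. corresponding to the test statistic has a
standard normal distribution under the null
hypothesis $H_0$, that $\mu_X - \mu_Y = \Delta_0$. Reject the null if
(i) T.S. $< -z_\alpha$
(ii) T.S. $> z_\alpha$
(iii) $|\text{T.S.}| > z_{\alpha/2}$
(i) p-value $= P(Z < \text{T.S.})$
(ii) p-value $= P(Z > \text{T.S.})$
(iii) p-value $= 2P(Z > |\text{T.S.}|)$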
Example:
Two welding procedures are to be tested on
the diameter of inclusions, which are
particles embedded in the weld. A sample of 544 inclusions
in welds made using method X averaged 0.37 μm in
diameter, with a standard deviation of 0.25 μm. A sample
of 581 inclusions in welds made using method Y
averaged 0.40 μm in diameter, with a standard deviation
of 0.26 μm. Can you conclude that the mean diameter for
Y exceeds that of X by more than 0.015 μm?
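Solution:
$H_0: \mu_Y - \mu_X = 0.015$ vs $H_a: \mu_Y - \mu_X > 0.015$
The test statistic, using the sample standard deviations in place of $\sigma_X$ and $\sigma_Y$ since both samples are large, is
T.S. $= \frac{0.40 - 0.37 - 0.015}{\sqrt{0.25^2/544 + 0.26^2/581}} \approx 0.99$
This is a one-tailed test with p-value $= P(Z > 0.99) \approx 0.16$.
We fail to reject the null hypothesis.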
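The welding comparison can be verified numerically. The sketch assumes the hypotheses $H_0: \mu_Y - \mu_X = 0.015$ against $H_a: \mu_Y - \mu_X > 0.015$ and, because both samples are large, uses the sample standard deviations in place of the population values.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the example (diameters in micrometres).
n_x, xbar, s_x = 544, 0.37, 0.25   # method X
n_y, ybar, s_y = 581, 0.40, 0.26   # method Y
delta0 = 0.015                     # hypothesized difference mu_Y - mu_X under H0

se = sqrt(s_x ** 2 / n_x + s_y ** 2 / n_y)
ts = (ybar - xbar - delta0) / se
p_value = 1 - NormalDist().cdf(ts)  # upper tail, since Ha is mu_Y - mu_X > delta0

print(f"T.S. = {ts:.2f}, p-value = {p_value:.2f}")
```

A test statistic near 0.99 gives an upper-tail p-value of about 0.16, so the null hypothesis is not rejected at the usual significance levels.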
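We
must assume that the number of successes and
failures is greater than 10 for both samples.
As the null hypothesis values for $p_X$ and $p_Y$ are not available
we simply check that the sample successes and failures
are greater than 10. By virtue of the C.L.T.
$\hat{p}_X - \hat{p}_Y \sim N\left(p_X - p_Y, \frac{p_X(1 - p_X)}{n_X} + \frac{p_Y(1 - p_Y)}{n_Y}\right)$
Therefore, under $H_0: p_X = p_Y$,
we can replace $p_X$ and $p_Y$ in the variance by
the pooled estimate,
$\hat{p} = \frac{x + y}{n_X + n_Y}$, where $x$ and $y$ are the numbers of successes in the two samples.
The test statistic is then,
T.S. $= \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1 - \hat{p})(1/n_X + 1/n_Y)}}$
and the r.v. corresponding to the test statistic has a
standard normal distribution under the null
hypothesis.
Thus, we reject the null hypothesis whenever,
(i) for $H_a: p_X < p_Y$: T.S. $< -z_\alpha$
(i) p-value $= P(Z < \text{T.S.})$
(ii) for $H_a: p_X > p_Y$: T.S. $> z_\alpha$
(ii) p-value $= P(Z > \text{T.S.})$
(iii) for $H_a: p_X \neq p_Y$: $|\text{T.S.}| > z_{\alpha/2}$
(iii) p-value $= 2P(Z > |\text{T.S.}|)$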
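The pooled two-proportion test can be sketched as below. The function name `two_proportion_z` and the counts in the usage line are hypothetical, chosen only to illustrate the calculation; they are not data from the slides. The function returns the test statistic together with the two-sided p-value.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x, n_x, y, n_y):
    """Pooled z-test for H0: p_X = p_Y. Returns (ts, two_sided_p_value).

    x and y are the numbers of successes in samples of size n_x and n_y.
    """
    # Sample-based check: more than 10 successes and failures in each sample.
    for count in (x, n_x - x, y, n_y - y):
        if count <= 10:
            raise ValueError("need more than 10 successes and failures per sample")
    p_x, p_y = x / n_x, y / n_y
    pooled = (x + y) / (n_x + n_y)  # pooled estimate of the common proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n_x + 1 / n_y))
    ts = (p_x - p_y) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(ts)))
    return ts, p_value


# Hypothetical counts: 30 of 200 versus 45 of 200 successes.
ts, p_value = two_proportion_z(30, 200, 45, 200)
print(f"T.S. = {ts:.2f}, two-sided p-value = {p_value:.3f}")
```

For a one-sided alternative the p-value would instead be the corresponding single tail area of the standard normal distribution.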
Solution:
Let $p_X$ be the proportion of defective motors produced by workers in shift I
and $p_Y$ be the proportion of defective motors produced by workers in shift II.
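With the variances unknown and estimated by $s_X^2$ and $s_Y^2$, the test statistic
T.S. $= \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}}$
is compared with a t distribution with $\nu$ degrees of freedom. We
reject $H_0$ if
(i) for $H_a: \mu_X - \mu_Y < \Delta_0$: T.S. $< -t_{\nu,\alpha}$
(i) p-value $= P(T_\nu < \text{T.S.})$
(ii) for $H_a: \mu_X - \mu_Y > \Delta_0$: T.S. $> t_{\nu,\alpha}$
(ii) p-value $= P(T_\nu > \text{T.S.})$
(iii) for $H_a: \mu_X - \mu_Y \neq \Delta_0$: $|\text{T.S.}| > t_{\nu,\alpha/2}$
(iii) p-value $= 2P(T_\nu > |\text{T.S.}|)$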
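Remark: If we have equality of variances ($\sigma_X^2 = \sigma_Y^2$) then
we replace both $s_X^2$ and $s_Y^2$ with the pooled estimate
$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}$
And in this case the degrees of freedom for the t
distribution is $n_X + n_Y - 2$.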
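The equal-variance case in the remark can be sketched as a short function. The name `pooled_t` and the summary statistics in the usage line are hypothetical, picked so the arithmetic is easy to follow by hand; the function returns the test statistic and the degrees of freedom to use with a t-table.

```python
from math import sqrt


def pooled_t(xbar, s_x, n_x, ybar, s_y, n_y, delta0=0.0):
    """Two-sample t statistic under equal variances. Returns (ts, df)."""
    df = n_x + n_y - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n_x - 1) * s_x ** 2 + (n_y - 1) * s_y ** 2) / df
    ts = (xbar - ybar - delta0) / sqrt(sp2 * (1 / n_x + 1 / n_y))
    return ts, df


# Hypothetical summaries: means 10 and 8, both standard deviations 2, 8 observations each.
ts, df = pooled_t(10.0, 2.0, 8, 8.0, 2.0, 8)
print(f"T.S. = {ts:.2f} on {df} degrees of freedom")
```

Here the pooled variance is 4 and the standard error is 1, so the statistic equals the difference in means, 2.0, on 14 degrees of freedom.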
perform
Solution: MINITAB
Mix | Method I | Method II
1   | 3160     | 3170
2   | 3240     | 3220
3   | 3190     | 3160
4   | 3520     | 3530
5   | 3480     | 3440
6   | 3220     | 3210
7   | 3120     | 3120
Exam 2
1. Section 2.6
2. Section 4.11
3. Sections 5.1-5.7
4. Sections 6.1-6.8