Section 6.2: Tests of Significance

1
Section 6.2
Tests of Significance
2
Test of Significance
Statistical inference:
To draw conclusions about a population
To estimate a population parameter
The 3 most common ways to make inferences:
Point estimation (x-bar or p-hat)
Confidence intervals (plausible values of the parameter)
Tests of significance (also called hypothesis tests)

Test of Significance is the focus of this section
3
Test of Significance:
Assess the evidence provided by data in favor of
some claim about the population
Start by setting up a hypothesis
This is a statement about a population parameter
Results of the test are expressed in terms of a
probability
Usually called a p-value
This probability (p-value) measures how well the data
and the hypothesis agree.
4
Is the observed effect too large or too
unusual to be due to chance variation?
An approach to significance testing
1. Define a research question in terms of a parameter and
specify the claimed parameter value.
2. Conduct an experiment or survey to obtain data.
3. Plot data and compute observed statistic (x-bar or p-hat).
4. Check sampling distribution for normality.
5. Using the sampling distribution of the statistic with center
given by claimed parameter value, calculate the probability
of getting differences more extreme than the observed effect
6. If the probability is small enough, declare the observed effect
statistically significant (i.e., too large to be due to chance).
5
Test of Significance: An example
The mean yield of corn in the US is about
135 bushels per acre. A survey of 40
randomly selected farmers this year gives a
sample mean yield of $\bar{x} = 138.8$ bushels per
acre.

We want to know whether this is good
evidence that the national mean this year is
not 135 bushels per acre
6
Farmers' corn yield
We want to measure the average corn yield $\mu$ for the US this year.
Sample: n = 40 farmers
Statistic: sample mean $\bar{x} = 138.8$
Population std. dev.: $\sigma = 10$

Since $n = 40 \ge 30$, according to the CLT, $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ approximately
Question: Is it possible that, even though we observed $\bar{x} = 138.8$,
the true $\mu$ is in fact 135 bushels per acre?
(Is it likely that observing $\bar{x} = 138.8$ is due to pure chance?)
7
Using a z-calculation and Table A, the
probability of observing $\bar{x} = 138.8$ or
larger when the true $\mu$ is 135, with
sd $= 10/\sqrt{40}$, is only 0.008
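The tail probability quoted for the corn example can be checked without Table A. The following is a minimal sketch, assuming the sample mean is approximately Normal with mean 135 and standard deviation 10/sqrt(40); the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 135.0      # claimed national mean (bushels per acre)
sigma = 10.0      # population standard deviation, taken as known
n = 40            # number of farmers sampled
x_bar = 138.8     # observed sample mean

# Standard deviation of the sampling distribution of the sample mean
sd_x_bar = sigma / sqrt(n)

# Upper-tail probability P(x-bar >= 138.8) when the true mean is 135
sampling_dist = NormalDist(mu=mu_0, sigma=sd_x_bar)
tail_prob = 1 - sampling_dist.cdf(x_bar)

print(f"sd of x-bar = {sd_x_bar:.3f}")
print(f"P(x-bar >= {x_bar}) = {tail_prob:.4f}")
```

The result should agree with the table-based value of about 0.008.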
8

The probability 0.008 is very small
We would observe $\bar{x} \ge 138.8$ only about 8 times per 1000
samples if the true $\mu = 135$
Hence, it is not likely that the true $\mu$ is 135, given that a
sample mean of 138.8 is observed
Two possibilities:
$\mu = 135$ and we observed something very unusual
$\mu$ is not 135 but is some other value that makes the
observed data more probable
9
What if a sample mean $\bar{x} = 136.6$ is observed?
The probability of observing $\bar{x} \ge 136.6$ given $\mu = 135$ is about 0.16
We would observe this about 16 times per 100 samples
This value is not as extreme as 0.008, and it could
easily happen by chance
We don't have strong evidence to believe $\mu \ne 135$.
Conclusion: A sample outcome that would be extreme if a
hypothesis were true is evidence that the hypothesis is not true. In
other words, if the sample result is extreme, we likely have
evidence against our original hypothesis.
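The idea of "chance variation" in the corn example can also be illustrated by simulation. This is a rough sketch, assuming sample means are drawn from a Normal distribution with mean 135 and standard deviation 10/sqrt(40). The seed, the number of simulated samples, and the helper `share_at_least` are arbitrary illustrative choices.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

mu_0, sigma, n = 135.0, 10.0, 40
sd_x_bar = sigma / sqrt(n)
n_sims = 100_000

# Simulated sample means under the assumption that the true mean is 135
sample_means = [random.gauss(mu_0, sd_x_bar) for _ in range(n_sims)]

def share_at_least(cutoff):
    """Fraction of simulated sample means at or above the cutoff."""
    return sum(m >= cutoff for m in sample_means) / n_sims

share_138_8 = share_at_least(138.8)
share_136_6 = share_at_least(136.6)

print(f"share of sample means >= 138.8: {share_138_8:.4f}")
print(f"share of sample means >= 136.6: {share_136_6:.4f}")
```

A sample mean of 138.8 or more should turn up in well under 1% of simulated samples, while 136.6 or more turns up far more often, which is why only the first counts as surprising.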
10

In this example, we considered 2 situations of
how the observed value of $\bar{x}$ can occur:
1. $\mu = \mu_0$
2. $\mu \ne \mu_0$
These are the hypotheses
The first step in a test of significance is to state
a claim that we will try to find evidence against
11

Stating hypothesis
Null Hypothesis ($H_0$)
The statement being tested in a test of significance
is called the null hypothesis
Usually the null hypothesis
is a statement of no effect or no difference,
is a statement about a population,
is expressed in terms of a (some) parameter(s).
Example: $H_0: \mu = \mu_0$
12
Stating hypothesis
Alternative Hypothesis ($H_a$)
The name we give to the statement we hope (or
suspect) is true.
Example: $H_a: \mu \ne \mu_0$
Hypotheses always refer to some population or
model, not a particular outcome
We must decide whether the alternative hypothesis
($H_a$) should be one-sided or two-sided
13
Stating hypothesis
One-sided alternative hypotheses:
Example: $H_a: \mu < \mu_0$ or $H_a: \mu > \mu_0$

Two-sided alternative hypothesis:
Example: $H_a: \mu \ne \mu_0$
14

Stating hypothesis
Choosing one-sided or two-sided Hypothesis
The alternative hypothesis should express the hopes or
suspicions we had in mind when we decided to collect the
data.
You are cheating if you first look at the data and then
frame $H_a$ to fit what the data show. Choose a one-sided or
two-sided $H_a$ before you look at the data.
If you do not have a specific direction in advance, use a
two-sided alternative
15
Stating hypothesis
Example: Your company hopes to reduce the mean time
($\mu$) required to process customer orders. At present, this
mean is 3.8 days. You study the process and eliminate
some unnecessary steps.
Q: Did you succeed in decreasing the average process
time?
Target: to show that the mean is now less than 3.8 days.
So the alternative hypothesis is one-sided: $H_a: \mu < 3.8$
The null hypothesis is the "no change" value: $H_0: \mu = 3.8$
16
Stating hypothesis
The mean area of several thousand
apartments in a new development is
advertised to be 1250 sqft. A tenant
group thinks that the apartments are
smaller than advertised. They hire an
engineer to measure a sample of
apartments to test their suspicion.
$H_0: \mu = 1250$ vs. $H_a: \mu < 1250$
17
Stating hypothesis
Experiments concerning learning in animals
sometimes measure how long it takes a mouse
to find its way through a maze. The mean time
is 18 seconds for one particular maze. A
researcher thinks that a loud noise will cause
the mice to complete the maze faster. She
measures how long each of 10 mice takes with
a noise as stimulus.
$H_0: \mu = 18$ vs. $H_a: \mu < 18$
18
Stating hypothesis
Last year, your company's service technicians
took an average of 2.6 hours to respond to
trouble calls from business customers who
purchased service contracts. Do this year's
data show a different average response time?
$H_0: \mu = 2.6$ vs. $H_a: \mu \ne 2.6$
19

The test is based on a statistic that estimates the
parameter appearing in the hypothesis
Usually this is the same estimate we would use in
a confidence interval for the parameter
When $H_0$ is true, we expect the estimate to take a
value near the parameter value specified by $H_0$
Values of the estimate far from the parameter
value specified by $H_0$ give evidence against $H_0$
The alternative hypothesis ($H_a$) determines which
directions count against $H_0$

20

Test statistics
A test statistic measures compatibility between the null
hypothesis and the data
Many test statistics can be thought of as a distance
between a sample estimate of a parameter and the
value of the parameter specified by the null hypothesis

21

Example:
The Census Bureau reports that households spend an
average of 31% of their total spending on housing. A
homebuilders association in Cleveland wonders if the
national finding applies in their area. They interview a
sample of 40 households in the Cleveland metropolitan area
to learn what percent of their spending goes toward
housing. Take $\mu$ to be the mean percent of spending
devoted to housing among all Cleveland households.
Assume that $\sigma = 9.6\%$.
What are the null and alternative hypotheses?
$H_0: \mu = 31\%$ vs. $H_a: \mu \ne 31\%$
22

Example: Spending on housing
Sample: n = 40 households
Sample mean: $\bar{x} = 28.6\%$
Pop. std. dev.: $\sigma = 9.6\%$
Test: $H_0: \mu = 31\%$ vs. $H_a: \mu \ne 31\%$

The test statistic is the standardized version of the
distance between the sample mean and the parameter value
given in $H_0$:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{28.6 - 31}{9.6/\sqrt{40}} = -1.58$$
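The standardization in the housing example is easy to reproduce. Below is a small sketch; the function name `one_sample_z` is a hypothetical helper chosen for illustration, not something defined in the slides.

```python
from math import sqrt

def one_sample_z(x_bar, mu_0, sigma, n):
    """Standardized distance between the sample mean and the null value."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

# Housing example: sample mean 28.6%, null value 31%, sigma 9.6%, n = 40
z = one_sample_z(x_bar=28.6, mu_0=31.0, sigma=9.6, n=40)
print(f"z = {z:.2f}")
```

The negative sign records that the sample mean falls below the null value of 31%.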
23

Example: Spending on Housing
The Central Limit Theorem assures us that $\bar{x}$ is
approximately Normal
This implies that the test statistic has an approximately
standard Normal distribution
To move from the test statistic z to a probability, we
must do Normal probability calculations.
24
Test of Significance P-Values
P-values
A test of significance assesses the evidence against the null
hypothesis and provides a numerical summary of this evidence in
terms of a probability
The idea is that surprising or unusual outcomes are evidence
against $H_0$
A surprising or unusual outcome is one that is far from what we would
expect if $H_0$ were true
25
Test of Significance P-Values

A test of significance finds the probability of getting an outcome as extreme
or more extreme than the actually observed outcome
The direction or directions that count as "far from what we would expect" are
determined by the alternative hypothesis
Definition: The probability, computed assuming that $H_0$ is true, that the test
statistic would take a value as extreme or more extreme than that actually observed
is called the P-value of the test
The smaller the P-value, the stronger the evidence against $H_0$
provided by the data in our sample
26
P-values
To calculate the P-value, we must use the
sampling distribution of the test statistic
Since our test statistic z follows a standard
Normal distribution, Normal probability calculations are
all we will need in Chapter 6. (Matters change in Chapter 7.)
27

P-values
Example (continued): Spending on housing
$H_0: \mu = 31\%$ vs. $H_a: \mu \ne 31\%$

Previously, we calculated z = -1.58
If the null hypothesis is true, we expect z to take a value
not too far from 0
Because the alternative is two-sided, values of z far
from 0 in either direction count as evidence against $H_0$
and in favor of $H_a$
28
P-Values
Example: Continued
So the P-value is:
$P(z < -1.58) + P(z > 1.58) = 2P(z > 1.58) = 2(0.057) = 0.114$

What is the P-value if $H_a: \mu < 31\%$?
The P-value is:
$P(z < -1.58) = 0.057$
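Both P-values for the housing example can be computed from the standard Normal distribution in software instead of Table A. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

z = -1.58                   # test statistic from the housing example
std_normal = NormalDist()   # standard Normal: mean 0, sd 1

# One-sided alternative (mu < 31%): area to the left of z
p_lower = std_normal.cdf(z)

# Two-sided alternative (mu != 31%): area in both tails beyond |z|
p_two_sided = 2 * std_normal.cdf(-abs(z))

print(f"one-sided P-value = {p_lower:.3f}")
print(f"two-sided P-value = {p_two_sided:.3f}")
```

Using `-abs(z)` for the two-sided case means the same line works whether the observed z is negative or positive.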
29

Statistical Significance
Statistical software automates the task of calculating
the test statistic and its P-value
You must still decide which test is appropriate and
whether to use a one-sided or two-sided test
You must also decide what conclusion the computer's
numbers support
We know that smaller P-values indicate stronger
evidence against the null hypothesis
30
How strong is strong enough?
One approach is to announce in advance how much
evidence against $H_0$ we will require to reject $H_0$
We compare the P-value with a significance level that says
"this evidence is strong enough"
The significance level is denoted by $\alpha$
If we choose $\alpha = 0.05$, we are requiring that the data give
evidence against $H_0$ so strong that it would happen no more
than 5% of the time when $H_0$ is true
If the P-value is as small as or smaller than $\alpha$, we say
that the data are statistically significant at level $\alpha$
31
P-value $\le \alpha$ => The data are significant. Reject $H_0$
P-value $> \alpha$ => The data are not significant at the given
significance level. There is not enough evidence to
reject $H_0$
32

"Significant" in the statistical sense does
not mean "important"
The term is used to indicate only that the
evidence against the null hypothesis reached
the standard set by $\alpha$
33

Test Statistic
Previously, we calculated P-value = 0.057
with $H_a: \mu < 31\%$
Is this significant at the $\alpha = 0.10$ level?
0.057 < 0.10 => There is enough evidence to reject $H_0$ at this
$\alpha$-level
Is there statistical significance at the $\alpha = 0.05$ level?
0.057 > 0.05 => There is not enough evidence to reject $H_0$ at this
$\alpha$-level
34

Test for a population mean
There are four steps in carrying out a
significance test
1. State the hypotheses ($H_0$ vs. $H_a$)
2. Calculate the test statistic (usually a standardized x-bar)
3. Find the P-value and make a decision
4. State your conclusion in the context of your
specific setting

35
We have an SRS of size n drawn from a Normal
population with unknown mean $\mu$
We want to test the hypothesis that $\mu$ has a
specified value
Call the specified value $\mu_0$

1. State the null hypothesis
The null hypothesis is $H_0: \mu = \mu_0$
36

The test is based on the sample mean $\bar{x}$
2. Calculate the test statistic
Because Normal calculations require standardized
variables, we will use the standardized sample mean as
our test statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

This one-sample z statistic has the standard Normal
distribution when $H_0$ is true
37

3. Find the P-value and make a decision
The P-value of the test is the probability that z
takes a value at least as extreme as the value for
our sample
What counts as extreme is determined by the
alternative hypothesis $H_a$
One-sided: $H_a: \mu < \mu_0$ or $H_a: \mu > \mu_0$
Two-sided: $H_a: \mu \ne \mu_0$
38

Test of Significance (Page 391)
39

3. Find a p-value and make a decision
P-value $\le \alpha$ => The data are significant. Reject $H_0$
P-value $> \alpha$ => The data are not significant at the given
significance level. There is not enough evidence to
reject $H_0$
4. State your conclusion within the context of the problem

40

z Test for a Population Mean
Example:
A manufacturer of cereal wants to test the performance of
one of its filling machines
The machine is designed to discharge a mean amount of
$\mu = 12$ ounces per box
The manufacturer wants to detect any departure from this
setting
Suppose the sample yields the following results
n = 100 observations (boxes)
$\bar{x} = 11.85$ ounces
$\sigma = 0.5$ ounces
41
Solution:
1. State the null and alternative hypotheses and specify an $\alpha$-level:
$H_0: \mu = 12$ vs. $H_a: \mu \ne 12$, with $\alpha = 0.01$
2. Calculate the test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{11.85 - 12}{0.5/\sqrt{100}} = -3$$
3. Find the P-value:
P-value $= 2P(z > 3.0) \approx 0.0026$
4. Conclusion: P-value = 0.0026 < $\alpha = 0.01$, so reject $H_0$. At this
significance level, the sample provides enough evidence to
believe that the mean amount of cereal the machine discharges is
different from 12 ounces per box.
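The four steps of the z test can be collected into one small routine and applied to the cereal example. This is a sketch under the assumptions of this section (SRS, known population standard deviation, Normal sampling distribution). The function `z_test` and its `alternative` labels are hypothetical names chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative="two-sided"):
    """One-sample z test; returns (z, p_value).

    alternative is one of "less", "greater", or "two-sided".
    """
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * std_normal.cdf(-abs(z))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value

# Cereal example: H0: mu = 12 vs. Ha: mu != 12 at alpha = 0.01
alpha = 0.01
z, p_value = z_test(x_bar=11.85, mu_0=12.0, sigma=0.5, n=100)
decision = "reject H0" if p_value <= alpha else "do not reject H0"

print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

The software value of the P-value (about 0.0027) differs slightly from the Table A value of 0.0026 because the table rounds each tail area to four decimals. The decision at the 0.01 level is the same either way.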
42

Remember
Tests of significance assess the evidence against $H_0$
If the evidence is strong, we can confidently reject $H_0$ in
favor of the alternative
Failing to find evidence against $H_0$ means only that the
data are consistent with $H_0$, not that we have clear
evidence that $H_0$ is true
43

Two-sided Significance Tests and Confidence Intervals:
A 95% confidence interval captures the true value of $\mu$ in
95% of all samples
If we are 95% confident that the true $\mu$ lies in our
interval, we are also confident that values of $\mu$ that fall
outside our interval are incompatible with the data
That sounds like the conclusion of a test of
significance!
44

There is a close connection between 95% confidence
intervals and significance at the 5% level
The same connection holds between 99% confidence
intervals and significance at the 1% level, and so on
So we can use confidence intervals to conduct a
two-sided hypothesis test (in these simple cases)
45


46
Example cont'd:
Take $\alpha = 0.01$
n = 100 observations
$\bar{x} = 11.85$ ounces
$\sigma = 0.5$
99% confidence interval:

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 11.85 \pm 2.576 \cdot \frac{0.5}{\sqrt{100}} = (11.72, 11.98)$$
Hypotheses:
$H_0: \mu = 12$ vs. $H_a: \mu \ne 12$

Conclusion:
12 is not in (11.72, 11.98)
We reject $H_0: \mu = 12$ at significance
level $\alpha = 0.01$

Does this match the previous conclusion?
48
P-value versus Fixed $\alpha$
A P-value is more informative than a reject-or-not finding
at a fixed significance level

Assessing significance at a fixed level $\alpha$ is easier, because
no probability calculation is required
Simply look up a critical value in a table

Because the practice of statistics almost always employs
software that calculates P-values automatically, tables of
critical values are basically outdated
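The fixed-level approach and the P-value approach lead to the same decision, which a short sketch can show for the cereal example (z = -3, two-sided alternative, significance level 0.01). The variable names are illustrative, and the critical value comes from software instead of a table.

```python
from statistics import NormalDist

alpha = 0.01
z = -3.0   # test statistic from the cereal example
std_normal = NormalDist()

# Fixed-level approach: compare |z| with the two-sided critical value
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_fixed_level = abs(z) >= z_crit

# P-value approach: compare the two-sided P-value with alpha
p_value = 2 * std_normal.cdf(-abs(z))
reject_p_value = p_value <= alpha

print(f"critical value = {z_crit:.3f}, reject: {reject_fixed_level}")
print(f"P-value = {p_value:.4f}, reject: {reject_p_value}")
```

The P-value carries extra information: it shows how far beyond the cutoff the evidence lies, not just on which side it falls.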
49
Section 6.2 Summary
A test of significance is intended to
assess the evidence provided by data
against a null hypothesis (Ho) and in
favor of an alternative hypothesis Ha.
The test provides a method for ruling out
chance as an explanation for data that
deviate from what we expect under Ho.

50
Section 6.2 Summary
The hypotheses (Ho and Ha) are stated in
terms of population parameters. Usually
Ho is a statement that no effect is present,
and Ha says that a parameter differs from
its null value, in a specific direction (one-
sided alternative) or in either direction
(two-sided alternative).

51
Section 6.2 Summary
The test is based on a test statistic. The P-
value is the probability, computed assuming that
Ho is true, that the test statistic will take a value
at least as extreme as that actually observed.
Small P-values indicate strong evidence against
Ho. Calculating P-values requires knowledge of
the sampling distribution of the test statistic
when Ho is true.

52
Section 6.2 Summary
If the P-value is as small as or smaller than a
specified value alpha, the data are
statistically significant at significance
level alpha.

53
Section 6.2 Summary
Significance tests for the hypothesis $H_0: \mu = \mu_0$
concerning the unknown mean $\mu$ of a
population are based on the z statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

54
Section 6.2 Summary
The z test assumes
An SRS of size n
Known population standard deviation
Either a Normal population or a large sample.
P-values are computed from the Normal
distribution (Table A) or from computers.
