Tests of Significance

DEFINITIONS
POPULATION
The total number of individuals of a particular species present in a defined area is called the population.
SAMPLE
A finite subset of statistical individuals in a population is called a sample.
RANDOM SAMPLING
The best way to get a representative sample is usually to choose a proportion of the population at random: without bias, with every possible experimental unit having an equal chance of being selected.
A random sample is one in which each unit of the population has an equal chance of being selected.
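A minimal Python sketch of simple random sampling, using only the standard library. The population of 1,000 numbered individuals, the sample size of 50 and the seed are hypothetical values chosen for illustration.

```python
import random

# Hypothetical population: ID numbers for 1,000 individuals.
population = list(range(1, 1001))

# A fixed seed makes the draw reproducible; any seed would do.
rng = random.Random(42)

# Simple random sampling without replacement: every unit in the
# population has the same chance of ending up in the sample.
sample = rng.sample(population, k=50)

print(sorted(sample)[:10])  # first few selected IDs
```

Because `sample` draws without replacement, no individual can appear twice in the sample.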
PROBLEMS WITH RANDOM SAMPLING
First, even a random sample may not be a good representative of the population from which it has been taken.
• For example, by chance, sample 1 may contain a group of relatively large fish, while those in sample 2 are relatively small.
• So, if you take a random sample from each of two similar populations, the samples may differ from each other simply by chance.
• On that basis, you might mistakenly conclude that the two populations are very different.
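The effect of chance on small samples can be seen in a short simulation. This is a sketch under assumed values: the fish lengths (mean 30 cm, s.d. 5 cm), the population size and the sample size of 5 are all hypothetical.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of fish lengths in cm (mean 30, s.d. 5).
population = [rng.gauss(30, 5) for _ in range(10_000)]

# Two small random samples drawn from the SAME population.
sample_1 = rng.sample(population, k=5)
sample_2 = rng.sample(population, k=5)

mean_1 = statistics.mean(sample_1)
mean_2 = statistics.mean(sample_2)

# The two means differ even though the population is identical;
# the gap is due to sampling variation alone.
print(f"sample 1 mean = {mean_1:.2f} cm, sample 2 mean = {mean_2:.2f} cm")
```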
Second, even if two populations are very different, samples from each may be similar and give the misleading impression that the populations are also similar.
Simply by chance, sample 1 and sample 2 may be similar.
Third, natural variation among individuals within a sample may obscure any effect of an experimental treatment.
For example, if tomato plants treated with a new fertilizer yielded from 1.5 to 9 kg of fruit per plant, compared with 1.5 to 7.5 kg per plant in an untreated group, can you conclude that the fertilizer really had an effect?
It is clear from the above discussion that it is often difficult to make a decision about a difference between samples from different populations or different experimental treatments.
Is it the sort of difference you would expect by chance, or are the populations really different? Is the experimental treatment having an effect?
PARAMETER AND STATISTIC
• For a population, the values of the mean (μ), standard deviation (σ) and variance (σ²) are called parameters.
• For a sample, the values of the mean (x̄), standard deviation (s) and variance (s²) are called sample statistics.
Tests of significance enable us to decide, on the basis of sample results, whether
• the deviation between the observed sample statistic and the hypothetical parameter value, or
• the deviation between two independent sample statistics
is significant or might be attributed to chance.

• If you take many samples of a given size (n) at random from a normal population and calculate the mean of each sample, the means are unlikely to be the same. But the sample means will be dispersed around the population mean μ.
• The distribution of these sample means is also normal, with its own mean (which is also μ) and standard deviation.
• The standard deviation of the distribution of sample means is called the standard error of the mean (abbreviated as SEM or SE):
SEM = σ / √n
where σ = standard deviation of the population
n = sample size
• As the sample size n increases, the standard error of the mean decreases and therefore the sample mean becomes a more precise estimate of the population mean.
• So, the distribution of the means of samples of a particular size (n) taken from a normal population will also be normal, with a mean of μ and a standard error of the mean of σ / √n.
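The behaviour of sample means can be checked by simulation. The sketch below assumes the standard result SEM = σ / √n and uses hypothetical values (μ = 50, σ = 10, n = 25, 5,000 repeated samples); none of these numbers come from the slides.

```python
import math
import random
import statistics

rng = random.Random(0)

# Hypothetical population parameters and sample size.
mu, sigma, n = 50.0, 10.0, 25

# Draw 5,000 samples of size n and record the mean of each one.
sample_means = [
    statistics.mean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(5000)
]

theoretical_sem = sigma / math.sqrt(n)           # 10 / 5 = 2.0
empirical_sem = statistics.stdev(sample_means)   # spread of the sample means

print(f"mean of sample means = {statistics.mean(sample_means):.2f}")
print(f"theoretical SEM = {theoretical_sem:.2f}, empirical SEM = {empirical_sem:.2f}")
```

The mean of the sample means should sit close to μ, and their standard deviation close to σ / √n.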
THE 95% CONFIDENCE INTERVAL
• 95% of the means of samples of size n, taken from a population with a known μ and σ, would be expected to occur within the range μ ± (1.96 × SEM).
• This range is called the 95% confidence interval, and μ - (1.96 × SEM) and μ + (1.96 × SEM) are called the 95% confidence limits.
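A small sketch of how the limits might be computed. The function name `confidence_limits` and the example values (μ = 100, σ = 15, n = 36) are hypothetical; the multiplier is taken from the standard normal distribution via `statistics.NormalDist` rather than hard-coded as 1.96.

```python
import math
from statistics import NormalDist


def confidence_limits(mu, sigma, n, level=0.95):
    """Return (lower, upper) limits mu -/+ z * SEM for the given level."""
    sem = sigma / math.sqrt(n)
    # For level = 0.95 this is the 97.5th percentile, about 1.96.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mu - z * sem, mu + z * sem


# Hypothetical population: mu = 100, sigma = 15, samples of size 36.
lower, upper = confidence_limits(mu=100, sigma=15, n=36)
print(f"95% limits: {lower:.2f} to {upper:.2f}")
```

Here SEM = 15 / 6 = 2.5, so the limits are roughly 100 ± 4.9.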
USING ‘Z’ STATISTIC TO COMPARE A SAMPLE MEAN AND POPULATION MEAN WHEN POPULATION PARAMETERS ARE KNOWN
• Set up the Null Hypothesis, H0: there is no significant difference between the sample mean x̄ and the population mean μ.
• Set up the Alternative Hypothesis, H1.
• Compute the standard normal variate:
Z = (x̄ - μ) / (σ / √n)

• If the value of Z falls between the limits -1.96 and +1.96, there is no significant difference between the sample mean and the population mean at the 5% level of significance.
• If Z < -1.96 or Z > +1.96, the null hypothesis is rejected and there is a significant difference between the sample mean and the population mean.
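The decision rule can be sketched as two small helper functions. The names `z_statistic` and `is_significant_5pct` are hypothetical, and the formula assumed is the usual standard normal variate Z = (x̄ - μ) / (σ / √n). The numbers in the usage lines are made up for illustration.

```python
import math


def z_statistic(sample_mean, mu, sigma, n):
    """Standard normal variate for a sample mean when sigma is known."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


def is_significant_5pct(z):
    """Two-tailed decision at the 5% level: reject H0 when |Z| > 1.96."""
    return abs(z) > 1.96


# Hypothetical data: sample mean 52 from n = 100, population mu = 50, sigma = 10.
z = z_statistic(sample_mean=52, mu=50, sigma=10, n=100)
print(z, is_significant_5pct(z))
```

With these values the standard error is 1, so Z = 2.0 and the difference is significant at the 5% level.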
EXAMPLE 1
• A sample of 900 members has a mean of 3.4 cm and a standard deviation of 2.61 cm. Is the sample from a population with mean 3.25 cm and s.d. 2.61 cm?
SOLUTION
Null Hypothesis, H0: The sample has been drawn from a population with mean μ = 3.25 cm and σ = 2.61 cm
Alternative Hypothesis, H1: μ ≠ 3.25 cm

SOLUTION (CONTD.)
x̄ = 3.4 cm, n = 900, μ = 3.25 cm, σ = 2.61 cm
Z = (x̄ - μ) / (σ / √n) = (3.40 - 3.25) / (2.61 / √900) = 0.15 / 0.087 = 1.72
Since the value of Z lies between -1.96 and +1.96, the sample data do not provide any evidence against the null hypothesis at the 5% level of significance.
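The arithmetic of Example 1 can be checked with a few lines of Python. The two-sided p-value at the end is an addition for illustration and is not part of the original solution.

```python
import math
from statistics import NormalDist

# Values from Example 1.
x_bar, mu, sigma, n = 3.4, 3.25, 2.61, 900

sem = sigma / math.sqrt(n)      # 2.61 / 30 = 0.087
z = (x_bar - mu) / sem          # about 1.72

# Two-sided p-value: probability of |Z| at least this large under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Since |Z| is below 1.96 (equivalently, p is above 0.05), H0 is not rejected at the 5% level.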
EXAMPLE 2
• A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably regarded as a sample from a large population with mean height 67.3 inches and standard deviation 1.30 inches? Test at the 5% level of significance.
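The slides pose Example 2 without a worked solution. One way it might be worked, following the same steps as Example 1 and assuming a two-tailed test (H0: μ = 67.3, H1: μ ≠ 67.3), is sketched below.

```python
import math

# Values from Example 2.
x_bar, mu, sigma, n = 67.47, 67.3, 1.30, 400

sem = sigma / math.sqrt(n)       # 1.30 / 20 = 0.065
z = (x_bar - mu) / sem           # 0.17 / 0.065, about 2.62

# Two-tailed test at the 5% level: reject H0 when |Z| > 1.96.
reject_h0 = abs(z) > 1.96

print(f"Z = {z:.2f}, reject H0: {reject_h0}")
```

Under these assumptions Z exceeds 1.96, so H0 would be rejected: the sample cannot reasonably be regarded as drawn from a population with mean height 67.3 inches.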
TWO-TAILED & ONE-TAILED TESTS
A two-tailed test rejects the null hypothesis if the sample mean is significantly higher or lower than the hypothesized value of the mean of the population.
Symbolically, a two-tailed test is appropriate when we have: H0: μ = μH0 and H1: μ ≠ μH0, which may mean μ > μH0 or μ < μH0

A one-tailed test is used when we want to test whether the population mean differs from some hypothesized value in one direction only, i.e., whether it is significantly lower, or significantly higher.
For example, if we have:
H0: μ = μH0 and H1: μ < μH0
then it is called a left-tailed test, wherein the rejection region lies only in the left tail of the curve.
If we have:
H0: μ = μH0 and H1: μ > μH0
then it is called a right-tailed test, wherein the rejection region lies only in the right tail of the curve.
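The critical values that bound the rejection regions at the 5% level can be obtained from the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05                # level of significance

# Two-tailed: alpha is split equally between both tails.
two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

# One-tailed: the whole of alpha sits in a single tail.
right_tailed = std_normal.inv_cdf(1 - alpha)     # about 1.645
left_tailed = std_normal.inv_cdf(alpha)          # about -1.645

print(f"two-tailed: reject H0 if |Z| > {two_tailed:.3f}")
print(f"right-tailed: reject H0 if Z > {right_tailed:.3f}")
print(f"left-tailed: reject H0 if Z < {left_tailed:.3f}")
```

The one-tailed cut-off (1.645) is smaller in magnitude than the two-tailed one (1.96) because the full 5% is placed in one tail.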
EXAMPLE 3
• The average age and standard deviation of policy holders insured by all insurance agents are 30.5 years and 6.35 years respectively. An insurance agent claims that the average age of policy holders who insure through him is less than the average age for all agents. A random sample of 100 policy holders who insured through him had a mean age of 28.8 years. Test his claim at the 5% level of significance.
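Example 3 is also left unsolved in the slides. A possible working, assuming a left-tailed test (H0: μ = 30.5, H1: μ < 30.5) because the agent claims a lower average age, is sketched below.

```python
import math
from statistics import NormalDist

# Values from Example 3.
x_bar, mu, sigma, n = 28.8, 30.5, 6.35, 100
alpha = 0.05

z = (x_bar - mu) / (sigma / math.sqrt(n))   # -1.7 / 0.635, about -2.68

# Left-tailed critical value at the 5% level, about -1.645.
critical = NormalDist().inv_cdf(alpha)

# H0 is rejected when Z falls in the left-tail rejection region.
claim_supported = z < critical

print(f"Z = {z:.2f}, critical value = {critical:.3f}, claim supported: {claim_supported}")
```

Under these assumptions Z lies well inside the left-tail rejection region, so H0 would be rejected and the agent's claim supported at the 5% level.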
t - test
• The t-test is used to test whether the sample mean x̄ differs significantly from the hypothetical value μ of the population mean when the population standard deviation is not known.
• Here, we use the standard deviation of the sample as an estimate of the population standard deviation.
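Under the null hypothesis, H0:
• the sample has been drawn from the population with mean μ, or
• there is no significant difference between the sample mean x̄ and the population mean μ,
the test statistic is
t = (x̄ - μ) / (s / √n), where s = √[Σ(X - x̄)² / (n - 1)],
which follows the t-distribution with (n - 1) degrees of freedom.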
The calculated value of t is compared with the tabulated value at a chosen level of significance. If calculated |t| > tabulated t, the null hypothesis is rejected; if calculated |t| < tabulated t, H0 may be accepted at the level of significance adopted.
NOTE: The t-test applies in the case of small samples when the population variance is unknown.
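A minimal sketch of the one-sample t statistic. It assumes the usual form t = (x̄ - μ) / (s / √n), with s computed using the divisor n - 1 and n - 1 degrees of freedom. The function name `t_statistic` and the five measurements are hypothetical.

```python
import math
import statistics


def t_statistic(data, mu):
    """Return (t, degrees_of_freedom) for a one-sample t-test."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)   # sample s.d., divisor n - 1
    t = (x_bar - mu) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements tested against a hypothesized mean of 5.0.
t, df = t_statistic([5.1, 4.9, 5.3, 5.0, 5.2], mu=5.0)
print(f"t = {t:.3f} with {df} d.f.")
```

The calculated |t| is then compared with the tabulated t for `df` degrees of freedom. The standard library has no t-distribution table, so that value has to come from a printed table or a statistics package.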
EXAMPLE 4
• A random sample of 10 boys had the following I.Q.s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100?
SOLUTION – EXAMPLE 4
Null Hypothesis, H0: The data are consistent with the assumption of a mean I.Q. of 100 in the population, i.e., μ = 100
Alternative Hypothesis, H1: μ ≠ 100

SOLUTION – EXAMPLE 4 (CONTD)
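X       X - x̄     (X - x̄)²
70      -27.2     739.84
120      22.8     519.84
110      12.8     163.84
101       3.8      14.44
88       -9.2      84.64
83      -14.2     201.64
95       -2.2       4.84
98        0.8       0.64
107       9.8      96.04
100       2.8       7.84
Total: ΣX = 972, Σ(X - x̄)² = 1833.60
x̄ = ΣX / n = 972 / 10 = 97.2
s² = Σ(X - x̄)² / (n - 1) = 1833.60 / 9 = 203.73, so s = 14.27
t = (x̄ - μ) / (s / √n) = (97.2 - 100) / (14.27 / √10) = -2.8 / 4.514 = -0.62, so |t| = 0.62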

Tabulated t for (10 - 1), i.e. 9 d.f., for a two-tailed test at the 5% significance level is 2.262.
Conclusion: Since calculated |t| is less than tabulated t for 9 d.f., H0 may be accepted at the 5% level of significance, and we may conclude that the data are consistent with the assumption of a mean I.Q. of 100 in the population.
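The figures in Example 4 can be reproduced in Python. The sketch assumes s is computed with the divisor n - 1 and the standard error is s / √n. The tabulated value 2.262 is taken from the text, not computed.

```python
import math
import statistics

# I.Q. scores from Example 4.
iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu = 100
n = len(iq)

x_bar = statistics.mean(iq)                    # 972 / 10 = 97.2
sum_sq = sum((x - x_bar) ** 2 for x in iq)     # sum of squared deviations, 1833.60
s = math.sqrt(sum_sq / (n - 1))                # about 14.27
t = (x_bar - mu) / (s / math.sqrt(n))          # about -0.62

tabulated_t = 2.262   # 9 d.f., two-tailed, 5% level (from the text)
print(f"mean = {x_bar}, s = {s:.2f}, t = {t:.2f}, reject H0: {abs(t) > tabulated_t}")
```

Since |t| is far below 2.262, H0 is not rejected, in line with the conclusion above.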
EXAMPLE 5
• A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040 inch. Compute the statistic you would use to test whether the work is meeting the specifications.
SOLUTION - EXAMPLE 5
Null Hypothesis, H0: μ = 0.700, i.e., the product is conforming to specifications
Alternative Hypothesis, H1: μ ≠ 0.700
Here, μ = 0.700 inch, x̄ = 0.742 inch, s = 0.040 inch and n = 10.
We will use a two-tailed t-test.

SOLUTION – EXAMPLE 5
t = (x̄ - μ) / (s / √n) = (0.742 - 0.700) / (0.040 / √10) = 0.042 / 0.01265 = 3.32
Tabulated t for (10 - 1), i.e. 9 d.f., at the 5% significance level is 2.262.
Since calculated t, i.e. 3.32, > tabulated t, we can say the value of t is significant. This means that x̄ differs significantly from μ, and thus H0 is rejected at the 5% level of significance. So, the product is not meeting the specifications.
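A short check of the Example 5 statistic. The slides do not say whether the quoted standard deviation uses the divisor n - 1 or n, so both conventions are computed here as an assumption-labelled sketch: s / √n in the first case, s / √(n - 1) in the second.

```python
import math

# Values from Example 5.
mu, x_bar, s, n = 0.700, 0.742, 0.040, 10

# Assumption A: s was computed with divisor n - 1, so SE = s / sqrt(n).
t = (x_bar - mu) / (s / math.sqrt(n))

# Assumption B: s was computed with divisor n, so SE = s / sqrt(n - 1).
t_alt = (x_bar - mu) / (s / math.sqrt(n - 1))

tabulated_t = 2.262   # 9 d.f., two-tailed, 5% level (from the text)
print(f"t = {t:.2f} (SE = s/sqrt(n)), t = {t_alt:.2f} (SE = s/sqrt(n-1))")
print("reject H0 under both:", t > tabulated_t and t_alt > tabulated_t)
```

The first convention gives t near 3.32 and the second about 3.15. Both exceed 2.262, so the conclusion that the product is not meeting the specifications does not depend on the choice.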
