
TESTS OF SIGNIFICANCE

DEFINITIONS

POPULATION

The total number of individuals of a particular species present in a defined area is called the population.

SAMPLE

A finite subset of statistical individuals in a population is called a sample.
DEFINITIONS

RANDOM SAMPLING

The best way to get a representative sample is usually to choose a proportion of the population at random – without bias, with every possible experimental unit having an equal chance of being selected.

A random sample is one in which each unit of the population has an equal chance of being selected.
PROBLEMS WITH RANDOM SAMPLING

First, even a random sample may not be a good representative of the population from which it has been taken.

(See Next Slide)


PROBLEMS WITH RANDOM SAMPLING

• By chance, sample 1 contains a group of relatively large fish, while those in sample 2 are relatively small.

• So, if you take a random sample from each of two similar populations, the samples may differ from each other simply by chance.

• On that basis, you might mistakenly conclude that the two populations are very different.
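
The effect described on this slide can be reproduced with a small simulation. The sketch below is illustrative only: the population of 1,000 fish lengths (mean 20 cm, standard deviation 4 cm), the sample size of 10 and the seeds are assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics


def draw_two_samples(population, n, seed):
    """Draw two independent simple random samples of size n from one population."""
    rng = random.Random(seed)
    return rng.sample(population, n), rng.sample(population, n)


# Hypothetical population: 1,000 fish lengths in cm (assumed values).
pop_rng = random.Random(0)
population = [pop_rng.gauss(20.0, 4.0) for _ in range(1000)]

sample_1, sample_2 = draw_two_samples(population, 10, seed=1)

print("population mean:", round(statistics.mean(population), 2))
print("sample 1 mean:  ", round(statistics.mean(sample_1), 2))
print("sample 2 mean:  ", round(statistics.mean(sample_2), 2))
```

Both samples come from the same population, yet their means will generally differ from each other and from the population mean.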
PROBLEMS WITH RANDOM SAMPLING

Second, even if two populations are very different, samples from each may be similar and give the misleading impression that the populations are also similar.

(See Next Slide)


Simply by chance, sample 1 and sample 2 are similar.
PROBLEMS WITH RANDOM SAMPLING

Third, natural variation among individuals within a sample may obscure any effect of an experimental treatment.

For example, if tomato plants treated with a new fertilizer yielded from 1.5 to 9 kg of fruit per plant, compared with 1.5 to 7.5 kg per plant in an untreated group, can you conclude that the fertilizer really had an effect?
PROBLEMS WITH RANDOM SAMPLING

It is clear from the above discussion that it is often difficult to make a decision about a difference between samples from different populations or different experimental treatments.

Is it the sort of difference you would expect by chance, or are the populations really different? Is the experimental treatment having an effect?
PARAMETER AND STATISTIC

• For a population, the values of the mean (μ), standard deviation (σ) and variance (σ²) are called parameters.

• For a sample, the values of the mean (x̄), standard deviation (s) and variance (s²) are called sample statistics.
TESTS OF SIGNIFICANCE

Tests of significance enable us to decide, on the basis of sample results, whether

• the deviation between the observed sample statistic and the hypothetical parameter value, or

• the deviation between two independent sample statistics

is significant or might be attributed to chance.


TESTS OF SIGNIFICANCE

• If you take many samples of a certain size (n) at random from a normal population and calculate the mean of each sample, the means are unlikely to be the same. But the sample means will be dispersed around the population mean μ.

• The distribution of these sample means is also normal, with its own mean (which is also μ) and its own standard deviation.
TESTS OF SIGNIFICANCE

• The standard deviation of the distribution of sample means is called the standard error of the mean (abbreviated as SEM or SE).

SEM = σ / √n

where σ = standard deviation of the population
n = sample size
TESTS OF SIGNIFICANCE

• As the sample size, i.e., the value of ‘n’, increases, the standard error of the mean decreases and therefore the sample mean becomes a more reliable estimate of the population mean.

• So, the distribution of the means of samples of a particular size (n) taken from a normal population will also be normal, with a mean of μ and a standard error of the mean of σ / √n.
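
A short simulation can make the standard error concrete. The sketch below assumes the usual relation SEM = σ / √n and uses a hypothetical normal population (μ = 50, σ = 10), samples of size 25 and a fixed seed; none of these values come from the slides.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard error of the mean, assuming SEM = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, n_samples, seed):
    """Draw n_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_samples)
    ]


# Hypothetical population parameters and sample size (assumed values).
MU, SIGMA, N = 50.0, 10.0, 25

means = simulate_sample_means(MU, SIGMA, N, n_samples=2000, seed=42)
observed_sem = statistics.stdev(means)      # spread of the simulated sample means
theoretical_sem = standard_error(SIGMA, N)  # 10 / sqrt(25) = 2.0

print("mean of sample means:", round(statistics.fmean(means), 2))
print("observed SEM:        ", round(observed_sem, 2))
print("theoretical SEM:     ", theoretical_sem)
```

The mean of the simulated sample means should land close to μ, and their standard deviation close to σ / √n; raising N shrinks both the theoretical and the observed value.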
THE 95% CONFIDENCE INTERVAL

• 95% of the means of samples of size n, taken from a population with a known μ and σ, would be expected to occur within the range μ ± (1.96 × SEM).

• This range is called the 95% confidence interval, and μ - (1.96 × SEM) and μ + (1.96 × SEM) are called the 95% confidence limits.
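
The limits can be computed directly. This is a minimal sketch: the function name `confidence_limits` and the population values (μ = 100, σ = 15, n = 36) are assumptions for illustration. The multiplier is taken from the standard normal distribution and comes out at about 1.96 for the 95% level.

```python
import math
from statistics import NormalDist


def confidence_limits(mu, sigma, n, level=0.95):
    """Return (lower, upper) limits within which `level` of sample means should fall."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    sem = sigma / math.sqrt(n)
    return mu - z * sem, mu + z * sem


# Hypothetical population values, chosen only for illustration.
lower, upper = confidence_limits(mu=100.0, sigma=15.0, n=36)
print(f"95% confidence limits: {lower:.1f} to {upper:.1f}")  # 95.1 to 104.9
```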
USING ‘Z’ STATISTIC TO COMPARE A
SAMPLE MEAN AND POPULATION MEAN
WHEN POPULATION PARAMETERS ARE
KNOWN

• Set up the Null Hypothesis, H0: there is no significant difference between the sample mean, x̄, and the population mean, μ.

• Set up the Alternative Hypothesis, H1.

• Compute the standard normal variate: Z = (x̄ - μ) / (σ / √n)


USING ‘Z’ STATISTIC TO COMPARE A
SAMPLE MEAN AND POPULATION MEAN
WHEN POPULATION PARAMETERS ARE
KNOWN

• If the value of Z falls between the limits -1.96 and +1.96, there is no significant difference between the sample mean and the population mean at the 5% level of significance.

• If the value of Z is < -1.96 or > +1.96, the null hypothesis is rejected and there is a significant difference between the sample mean and the population mean.
EXAMPLE 1

• A sample of 900 members has a mean of 3.4 cm and a standard deviation of 2.61 cm. Is the sample from a population with mean 3.25 cm and s.d. 2.61 cm?
SOLUTION

Null Hypothesis, H0: The sample has been drawn from a population with mean μ = 3.25 cm and σ = 2.61 cm

Alternative Hypothesis, H1: μ ≠ 3.25 cm


SOLUTION (CONTD.)

x̄ = 3.4 cm, n = 900, μ = 3.25 cm, σ = 2.61 cm

Z = (x̄ - μ) / (σ / √n) = (3.4 - 3.25) / (2.61 / √900) = 0.15 / 0.087 ≈ 1.72

Since the value of Z lies between -1.96 and +1.96, the sample data do not provide any evidence against the null hypothesis at the 5% level of significance.
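
The arithmetic of Example 1 can be checked in a few lines. This is a sketch only; the helper name `z_statistic` is an assumption, and the critical value 1.96 is the two-tailed 5% limit used in the slides.

```python
import math


def z_statistic(sample_mean, mu, sigma, n):
    """Standard normal variate for a sample mean when sigma is known."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


# Values from Example 1.
z = z_statistic(sample_mean=3.4, mu=3.25, sigma=2.61, n=900)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level

print(f"Z = {z:.2f}, reject H0: {reject_h0}")  # Z = 1.72, reject H0: False
```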
EXAMPLE 2

• A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably regarded as a sample from a large population with mean height 67.3 inches and standard deviation 1.30 inches? Test at the 5% level of significance.
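
The slides leave Example 2 unsolved. One way to work it, sketched below, is to compute Z as in Example 1 and also a two-tailed p-value from the standard normal distribution. The p-value step and the function name `two_tailed_z_test` are additions for illustration, not part of the slides.

```python
import math
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu, sigma, n):
    """Return (Z, two-tailed p-value) for a sample mean when sigma is known."""
    z = (sample_mean - mu) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Values from Example 2.
z, p_value = two_tailed_z_test(sample_mean=67.47, mu=67.3, sigma=1.30, n=400)

print(f"Z = {z:.2f}")  # Z = 2.62
print("reject H0 at the 5% level:", abs(z) > 1.96)  # True
print("p-value below 0.05:", p_value < 0.05)        # True
```

Under these assumptions Z lies outside the limits -1.96 and +1.96, so the sample would not be regarded as drawn from a population with mean 67.3 inches at the 5% level.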
TWO-TAILED & ONE-TAILED TESTS

A two-tailed test rejects the null hypothesis if the sample mean is significantly higher or lower than the hypothesized value of the mean of the population.

Symbolically, a two-tailed test is appropriate when we have: H0: μ = μH0 and H1: μ ≠ μH0

which may mean μ > μH0 or μ < μH0


TWO-TAILED & ONE-TAILED TESTS

A one-tailed test is used when we want to test in one direction only: whether the population mean is significantly lower than some hypothesized value, or whether it is significantly higher.

For example, if we have:

H0: μ = μH0 and H1: μ < μH0

then it is called a left-tailed test, in which the rejection region lies only in the left tail of the curve.
TWO-TAILED & ONE-TAILED TESTS

If we have:

H0: μ = μH0 and H1: μ > μH0

then it is called a right-tailed test, in which the rejection region lies only in the right tail of the curve.
EXAMPLE 3

• The average age and standard deviation of policy holders insured by all insurance agents are 30.5 years and 6.35 years respectively. An insurance agent claims that the average age of policy holders who insure through him is less than the average age for all agents. A random sample of 100 policy holders who insured through him had a mean age of 28.8 years. Test his claim at the 5% level of significance.
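
Example 3 calls for a left-tailed test, since the claim is that the agent's mean is lower (H0: μ = 30.5, H1: μ < 30.5). The slides give no worked solution; the sketch below is one possible working. It takes the one-tailed 5% critical value from the standard normal distribution (about -1.645) rather than the two-tailed limits of ±1.96. The function name `left_tailed_z_test` is an assumption.

```python
import math
from statistics import NormalDist


def left_tailed_z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Return (Z, critical value, reject H0?) for H1: population mean < mu."""
    z = (sample_mean - mu) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(alpha)  # about -1.645 for alpha = 0.05
    return z, critical, z < critical


# Values from Example 3.
z, critical, reject_h0 = left_tailed_z_test(28.8, 30.5, 6.35, 100)

print(f"Z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject_h0}")
# Z = -2.68, critical value = -1.645, reject H0: True
```

Because Z falls in the left-tail rejection region, the sample supports the agent's claim at the 5% level under these assumptions.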
t - test

• The t-test is used to test whether the sample mean, x̄, differs significantly from the hypothetical value, μ, of the population mean when the population standard deviation is not known.

• Here, we use the standard deviation of the sample as an estimate of the population standard deviation.
t - test
Under the null hypothesis, H0:
• the sample has been drawn from the population with mean μ.
• there is no significant difference between the sample mean, x̄, and the population mean, μ.
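t - test

The test statistic is t = (x̄ - μ) / (s / √n), where s² = Σ(X - x̄)² / (n - 1); under H0 it follows the t distribution with (n - 1) degrees of freedom.

The calculated value of ‘t’ is compared with the tabulated value at a chosen level of significance. If the calculated |t| > tabulated t, the null hypothesis is rejected; if the calculated |t| < tabulated t, H0 may be accepted at the level of significance adopted.

NOTE: The t-test applies only in the case of small samples when the population variance is unknown.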
EXAMPLE 4

• A random sample of 10 boys had the following I.Q.s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100?
SOLUTION – EXAMPLE 4

Null Hypothesis, H0: The data are consistent with the assumption of a mean I.Q. of 100 in the population, i.e., μ = 100

Alternative Hypothesis, H1: μ ≠ 100


SOLUTION – EXAMPLE 4 (CONTD)

   X      X - x̄     (X - x̄)²
  70      -27.2      739.84
 120       22.8      519.84
 110       12.8      163.84
 101        3.8       14.44
  88       -9.2       84.64
  83      -14.2      201.64
  95       -2.2        4.84
  98        0.8        0.64
 107        9.8       96.04
 100        2.8        7.84

Total: ΣX = 972, Σ(X - x̄)² = 1833.60


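SOLUTION – EXAMPLE 4 (CONTD)

x̄ = ΣX / n = 972 / 10 = 97.2

s² = Σ(X - x̄)² / (n - 1) = 1833.60 / 9 ≈ 203.73, so s ≈ 14.27

t = (x̄ - μ) / (s / √n) = (97.2 - 100) / (14.27 / √10) ≈ -0.62, so |t| ≈ 0.62

Tabulated t for (10 - 1), i.e. 9 d.f., for a two-tailed test at the 5% significance level is 2.262.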
SOLUTION – EXAMPLE 4 (CONTD)

Conclusion: Since the calculated |t| is less than the tabulated t for 9 d.f., H0 may be accepted at the 5% level of significance and we may conclude that the data are consistent with the assumption of a mean I.Q. of 100 in the population.
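
Example 4 can be reproduced from the raw data. The sketch below assumes the sample standard deviation uses the (n - 1) divisor, which is what `statistics.stdev` computes, and takes the tabulated value 2.262 from the slides rather than computing it.

```python
import math
import statistics

# I.Q. scores from Example 4.
iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
MU_0 = 100      # hypothesized population mean
T_CRIT = 2.262  # tabulated t, 9 d.f., two-tailed, 5% level (from the slides)

n = len(iq)
x_bar = statistics.mean(iq)    # sample mean
s = statistics.stdev(iq)       # sample s.d. with (n - 1) divisor
t = (x_bar - MU_0) / (s / math.sqrt(n))

print(f"mean = {x_bar:.1f}, s = {s:.2f}, t = {t:.2f}")
# mean = 97.2, s = 14.27, t = -0.62
print("reject H0:", abs(t) > T_CRIT)  # False
```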
EXAMPLE 5

• A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040 inch. Compute the statistic you would use to test whether the work is meeting the specifications.
SOLUTION - EXAMPLE 5

Null Hypothesis, H0: μ = 0.700, i.e., the product is conforming to specifications

Alternative hypothesis, H1: μ ≠ 0.700

Here, μ = 0.700 inch, x̄ = 0.742 inch, s = 0.040 inch and n = 10.

We will use a two-tailed t-test.


SOLUTION – EXAMPLE 5

t = (x̄ - μ) / (s / √n) = (0.742 - 0.700) / (0.040 / √10) = 0.042 / 0.01265 ≈ 3.32

Tabulated t for (10 - 1), i.e. 9 d.f., at the 5% significance level is 2.262.

Since the calculated ‘t’, i.e. 3.32, is greater than the tabulated t, we can say the value of ‘t’ is significant. This means that x̄ differs significantly from μ and thus H0 is rejected at the 5% level of significance. So, the product is not meeting the specifications.
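
When only summary statistics are given, as in Example 5, t can be computed directly from them. The sketch below assumes the form t = (x̄ - μ) / (s / √n), with s taken as the (n - 1)-divisor sample standard deviation; under that assumption t comes out at about 3.32. If s had instead been computed with divisor n, the denominator would be s / √(n - 1) and t would be about 3.15. Either value exceeds the tabulated 2.262. The function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(sample_mean, mu, s, n):
    """t statistic from summary values, assuming t = (mean - mu) / (s / sqrt(n))."""
    return (sample_mean - mu) / (s / math.sqrt(n))


T_CRIT = 2.262  # tabulated t, 9 d.f., two-tailed, 5% level (from the slides)

# Values from Example 5.
t = t_from_summary(sample_mean=0.742, mu=0.700, s=0.040, n=10)

print(f"t = {t:.2f}")                 # t = 3.32
print("reject H0:", abs(t) > T_CRIT)  # True
```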
