
Chapter

χ² Test

For B. V. Sc. & A. H.



Parametric and non-parametric tests
• Parametric test: a test whose model specifies certain conditions about the
parameters of the populations from which the samples are drawn.
• Non-parametric test: a test that does not depend on the particular form of the
basic frequency function from which the samples are drawn.


Difference between parametric and non-parametric tests
Parametric tests:
1. They are based on the assumption that the parent population is normally
distributed, i. e. they specify certain conditions about the parameters of the
population from which the sample has been drawn.
2. They are designed to test statistical hypotheses about one or more parameters
of the population.
3. They are mostly applied to data measured on an interval or ratio scale.
4. They are generally the more powerful tests when their assumptions hold.
5. They are applied in a wide range of problems.
Non-parametric tests:
1. They do not specify any condition about the parameters of the population from
which the sample has been drawn.
2. They are designed to test statistical hypotheses that do not involve any
parameter of the population.
3. They are applied mainly to data measured on a nominal or ordinal scale.
4. They are generally less powerful than parametric tests.
5. They are applied in a narrower range of problems.


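χ² - Definition
• If X is a random variable following a normal distribution with mean μ and
standard deviation σ, then $Z = \dfrac{X - \mu}{\sigma}$ is a standard normal variate.
• The square of a standard normal variate, $Z^2 = \left(\dfrac{X - \mu}{\sigma}\right)^2$, is called
a chi-square variate with 1 degree of freedom (d. f.).
• [χ is the Greek letter chi; χ² is pronounced chi-square.]
• If X₁, X₂, X₃, …, Xₙ are n independent random variables following normal
distributions with means μ₁, μ₂, …, μₙ and standard deviations σ₁, σ₂, …, σₙ
respectively, then the variate
$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$
follows the chi-square distribution with n degrees of freedom.
• When the observations share a common mean and standard deviation and the
population mean is replaced by the sample mean, one degree of freedom is lost,
leaving (n – 1) degrees of freedom.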
χ² Distribution
[Figure: probability density curves P(χ²) of the χ² distribution for ν = 1, 2, 3, 4, 5 and 6 degrees of freedom, plotted against χ² from 1 to 10.]


Conditions for applying the χ² test
1. The frequencies used in the χ² test must be absolute, not relative.
2. The total number of items must be reasonably large, at least 50.
3. The observations in the sample must be independent of each other.
4. The constraints on the cell frequencies should be linear (i. e. they should
not involve squares or higher powers of the frequencies), such as $\sum O_i = \sum E_i = N$.
5. The χ² test cannot be used for estimating the value of a population
parameter.
6. The observed frequencies should not be replaced by relative frequencies or
proportions; the data should be given in original units.
7. The expected frequency of any item or cell must not be less than 5.
8. If any expected cell frequency is less than 5, the frequencies of adjacent
items or cells should be pooled together to make it 5 or more, and the degrees of
freedom adjusted accordingly.


Data used in the χ² test
• Measurement data: data obtained by actual measurement, for example height,
weight, age, income, area, etc.
• Enumeration data: data obtained by enumeration or counting, for example the
number of blue flowers, the number of intelligent boys, the number of curled
leaves, etc.
• Note: the χ² test is used for enumeration data, which generally relate to
discrete variables, whereas the t-test and the standard normal deviate test are
used for measurement data, which generally relate to continuous variables.
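Critical value
• The critical value χ²ₙ(α) is the value that a χ² variate with n degrees of
freedom exceeds with probability α:
• P[χ² > χ²ₙ(α)] = α ………(2)
[Figure: χ² density curve with the critical value χ²ₙ(α) marked on the χ² axis; the area to its left is the acceptance region and the right-hand tail of area α is the rejection region.]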
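• As a cross-check on the printed table of significant values of χ², the critical value in equation (2) can also be computed numerically. The following is a minimal sketch using only the Python standard library. The function names chi2_sf and chi2_critical are hypothetical, the closed-form tail sums are valid only for whole-number degrees of freedom, and the bisection search is an assumed numerical method rather than anything prescribed in these slides.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P[chi-square > x] for a whole-number df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even d.f.: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term = 1.0
        total = 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd d.f.: erfc(sqrt(x/2))
    #           + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j - 1/2) / Gamma(j + 1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi2_critical(alpha, df):
    """Value c with P[chi-square > c] = alpha, found by bisection (assumed method)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen the bracket until the tail area is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Critical values at the 5% level for the degrees of freedom used in the examples.
for df in (1, 2, 3, 6):
    print(df, round(chi2_critical(0.05, df), 3))
```

• The printed values can be compared with the tabulated values 3.841, 5.99, 7.815 and 12.592 used in the worked examples.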


Case I: Chi-square Test of Goodness of Fit
• Suppose a set of observed frequencies from some experiment is given and we
want to test whether the experimental results support a particular hypothesis or
theory. Karl Pearson developed a test of significance of the discrepancy between
the experimental (observed) values and the theoretical (expected) values obtained
under that theory or hypothesis, called the χ² test of goodness of fit.
• If Oᵢ and Eᵢ (i = 1, 2, …, n) are the observed and expected frequencies, then
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
follows the χ² distribution with (n – 1) degrees of freedom.


Procedure
Step 1: Formulate the null and alternative hypotheses.
H₀: There is no significant difference between the observed and expected frequencies.
H₁: There is a significant difference between the observed and expected frequencies.
Step 2: Test statistic: $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$
• Now compute: (i) $(O_i - E_i)$
• (ii) $(O_i - E_i)^2$
• (iii) $(O_i - E_i)^2 / E_i$
• (iv) $\sum (O_i - E_i)^2 / E_i$
Step 3: Fix the level of significance α.
Step 4: d. f. = (n – 1)
Step 5: Find the tabulated (critical) value of χ² for (n – 1) d. f. at the α level of significance from the
table of “significant values of χ²”.
Step 6: Make the decision (critical value approach):
• (i) If χ² (calculated) ≤ χ² (tabulated at the α level of significance), then H₀ is accepted.
• (ii) If χ² (calculated) > χ² (tabulated at the α level of significance), then H₁ is accepted.
Step 7: (i) If H₀ is accepted, then there is no significant difference between the observed and expected
frequencies.
• (ii) If H₁ is accepted, then there is a significant difference between the observed and expected
frequencies.
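• The procedure above can be sketched in a few lines of Python. This is only an illustration: the function names chi_square_statistic and goodness_of_fit are hypothetical, the critical value is passed in from the printed table rather than computed, and the data are the analgesic counts of Example 1 (observed 20, 18, 13 against an expected 17 in each group).

```python
def chi_square_statistic(observed, expected):
    """Step 2: sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit(observed, expected, critical_value):
    """Steps 2, 4 and 6: statistic, degrees of freedom and decision."""
    statistic = chi_square_statistic(observed, expected)
    df = len(observed) - 1                   # Step 4: d.f. = n - 1
    reject_h0 = statistic > critical_value   # Step 6: critical value approach
    return statistic, df, reject_h0


# Example 1 data; 5.99 is the tabulated value for 2 d.f. at the 5% level.
observed = [20, 18, 13]
expected = [17, 17, 17]
statistic, df, reject_h0 = goodness_of_fit(observed, expected, critical_value=5.99)
print(round(statistic, 3), df, "reject H0" if reject_h0 else "accept H0")
```

• Keeping the critical value as an argument mirrors Step 5, where it is read from the table for the chosen level of significance.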


Example 1: Uniform distribution
• Three groups of patients with 17 patients each were administered analgesics
A, B and C, and the effect was noted in 20, 18 and 13 cases respectively. Is this
difference due to the drug or to chance?
• Solution:
• Step 1: Hypothesis setting:
• H₀: There is no significant difference between the observed and expected
frequencies.
• H₁: There is a significant difference between the observed and expected
frequencies.
• Step 2: Test statistic: Calculation of χ²
Contd…
Group   Oi   Ei   (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
A       20   17     3           9          9/17 = 0.529
B       18   17     1           1          1/17 = 0.059
C       13   17    -4          16          16/17 = 0.941
Total   51   51                            χ² = 1.529

• Hence χ²(cal) = 1.529
• Step 3: Level of significance (α) = 0.05
• Step 4: degrees of freedom = n – 1 = 3 – 1 = 2
• Step 5: Critical (table) value: χ² at α = 0.05 for (3 – 1) = 2 d. f. = 5.99
• Step 6: Make decision: since χ²(cal) = 1.529 < χ²(α = 0.05) = 5.99, we
accept the null hypothesis (H₀).
• Step 7: Conclusion: There is no significant difference between the observed
and expected frequencies; the difference may be due to chance.
Example 2: Finding the expected value
• The following table gives the number of child births that occurred on the
various days of the week. Test whether the child births are uniformly distributed
throughout the week.
Days            Sun   Mon   Tue   Wed   Thu   Fri   Sat   Total
No. of births    65    53    35    38    62    54    43     350


Solution
• Step 1: Hypothesis setting:
• H₀: There is no significant difference between the observed and expected
numbers of births throughout the week.
• H₁: There is a significant difference between the observed and expected
numbers of births throughout the week.
• Step 2: Test statistic: Calculation of χ² (under H₀ each day has Ei = 350/7 = 50)
Days    Oi    Ei    (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
Sun     65    50     15         225          225/50 = 4.5
Mon     53    50      3           9          9/50 = 0.18
Tue     35    50    -15         225          225/50 = 4.5
Wed     38    50    -12         144          144/50 = 2.88
Thu     62    50     12         144          144/50 = 2.88
Fri     54    50      4          16          16/50 = 0.32
Sat     43    50     -7          49          49/50 = 0.98
Total  350   350                             χ² = 16.24


Contd…
• Hence χ²(cal) = 16.24
• Step 3: Level of significance (α) = 0.05
• Step 4: degrees of freedom = n – 1 = 7 – 1 = 6
• Step 5: Critical value: χ² at α = 0.05 for 6 d. f. = 12.592
• Step 6: Make decision:
• Since χ²(cal) = 16.24 > χ²(α = 0.05) = 12.592, the test is
significant, so we reject the null hypothesis and accept the
alternative hypothesis (H₁).
• Step 7: Conclusion: There is a significant difference between the
observed and expected numbers of births; the births are not uniformly
distributed throughout the week.
Example 3: For unequal proportions
• A sample analysis of the examination results of 200 MBBS students was made.
It was found that 46 students had failed, 68 secured a third division, 62
secured a second division and the rest were placed in the first division. Are
these figures commensurate with the general examination result, which is in the
ratio 4:3:2:1 for the various categories respectively?
• Step 1: Hypothesis setting:
• H₀: The observed figures do not differ significantly from the
hypothetical frequencies, which are in the ratio 4:3:2:1.
• H₁: The observed figures differ significantly from the
hypothetical frequencies, which are in the ratio 4:3:2:1.
Solution
• Step 2: The χ² is computed as:
Category   Observed (Oi)   Expected (Ei)          (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
Failed      46             (4/10) x 200 = 80      -34         1156         14.450
Third       68             (3/10) x 200 = 60        8           64          1.067
Second      62             (2/10) x 200 = 40       22          484         12.100
First       24             (1/10) x 200 = 20        4           16          0.800
Total      200             200                                             χ² = 28.417


Contd…
• Step 3: Level of significance (α) = 0.05
• Step 4: degrees of freedom = n – 1 = 4 – 1 = 3
• Step 5: The tabulated value of χ² at the α = 0.05
level of significance for 3 d. f. is 7.815
• Step 6: Decision: Since the calculated value
χ² = 28.417 > χ² (tabulated) = 7.815, the result is
highly significant. Therefore, H₀ is rejected and H₁ is accepted.
• Step 7: Conclusion: We can conclude that the data
are not commensurate with the general
examination result.
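• In Examples 2 and 3 the expected frequencies are not given but derived from a hypothesised ratio (1:1:1:1:1:1:1 for the days of the week, 4:3:2:1 for the examination categories). A minimal Python sketch of that step follows; the helper names expected_from_ratio and chi_square_statistic are hypothetical and chosen only for illustration.

```python
def expected_from_ratio(total, ratio):
    """Split the total frequency in proportion to the hypothesised ratio."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example 3: failed, third, second and first division in the ratio 4:3:2:1.
exam_observed = [46, 68, 62, 24]
exam_expected = expected_from_ratio(sum(exam_observed), [4, 3, 2, 1])
exam_chi2 = chi_square_statistic(exam_observed, exam_expected)

# Example 2: births per day, uniform over the week (ratio 1:1:1:1:1:1:1).
births_observed = [65, 53, 35, 38, 62, 54, 43]
births_expected = expected_from_ratio(sum(births_observed), [1] * 7)
births_chi2 = chi_square_statistic(births_observed, births_expected)

print(exam_expected, round(exam_chi2, 3))
print(births_expected, round(births_chi2, 2))
```

• Treating the uniform case as a ratio of equal parts lets one helper cover both examples.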
Example 4: Merging degrees of freedom problem
• A. S. Parkes gives the observed and expected frequency distributions of the
number of males in litters of piglets. Test whether the observed and expected
frequencies differ significantly at the 5% level of significance.
No. of males           0    1    2    3    4    5    6    7    8   Total
Observed frequencies   1    8   37   81  162   77   30    5    1   402
Expected frequencies   2   12   44   88  110   88   44   12    2   402


Solution
• Calculation of χ² (class 0 is pooled with class 1, and class 8 with class 7,
because their expected frequencies are less than 5):
No. of males   Oi    Merged Oi   Ei    Merged Ei   (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
0                1     9           2    14           -5           25         1.79
1                8                12
2               37    37          44    44           -7           49         1.11
3               81    81          88    88           -7           49         0.56
4              162   162         110   110           52         2704        24.58
5               77    77          88    88          -11          121         1.38
6               30    30          44    44          -14          196         4.45
7                5     6          12    14           -8           64         4.57
8                1                 2
Total          402               402                                         χ² = 38.44

Contd…
• Step 1: Setting of Hypothesis:
• H₀: The observed and expected frequencies are the same.
• H₁: The observed and expected frequencies are different.
• Step 2: Test statistic: χ²cal = 38.44
• Step 3: d. f. = (n – 1 – k₁ – k₂) = 9 – 1 – 2 – 0 = 6 (k₁ = 2 classes lost by pooling; k₂ = 0)
• Step 4: Level of significance (α) = 0.05
• Step 5: Table value of χ² for 6 d. f. = 12.592
• Step 6: Decision: since χ²cal = 38.44 > 12.592, the test is significant,
so we reject the null hypothesis (H₀) and accept the alternative
hypothesis (H₁).
• Step 7: Conclusion: Hence we conclude that the observed frequencies of
males per litter differ significantly from the expected frequencies.
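• The pooling step of Example 4 can be sketched as follows. This is a simplified illustration, not a general rule: the hypothetical helper pool_tail_cells merges only the end classes into their neighbours, which is enough for the piglet data, and the degrees of freedom are taken as (number of pooled classes – 1) on the assumption that no parameter was estimated from the sample.

```python
def pool_tail_cells(observed, expected, minimum=5):
    """Merge end classes into their neighbours until each end has expected >= minimum."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[0] < minimum:     # lower tail
        obs[1] += obs[0]
        exp[1] += exp[0]
        del obs[0], exp[0]
    while len(exp) > 1 and exp[-1] < minimum:    # upper tail
        obs[-2] += obs[-1]
        exp[-2] += exp[-1]
        del obs[-1], exp[-1]
    return obs, exp


# Example 4: number of males (0 to 8) in litters of piglets.
observed = [1, 8, 37, 81, 162, 77, 30, 5, 1]
expected = [2, 12, 44, 88, 110, 88, 44, 12, 2]

pooled_obs, pooled_exp = pool_tail_cells(observed, expected)
chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
df = len(pooled_obs) - 1   # 9 classes - 2 lost by pooling - 1

print(pooled_obs, pooled_exp)
print(round(chi2, 2), df)
```

• Pooling before computing χ² keeps every expected frequency at 5 or more, as the conditions for applying the test require.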


Case II: χ²-test for Independence of Attributes
(a) 2 x 2 Contingency Table: Under the null hypothesis of independence of
attributes, the value of χ² for the 2 x 2 contingency table
a          b          (a + b)
c          d          (c + d)
(a + c)    (b + d)    N = a + b + c + d
is given by:
• $\chi^2 = \dfrac{N(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$, where N = a + b + c + d = total frequency.

OR we can apply the test by finding the expected frequencies by the following
rules:
• $E(a) = \dfrac{(a + b)(a + c)}{N}$, $E(b) = \dfrac{(a + b)(b + d)}{N}$, $E(c) = \dfrac{(c + d)(a + c)}{N}$
• & $E(d) = \dfrac{(c + d)(b + d)}{N}$


Example 5: 2 x 2 contingency problem
• Test whether there is an association between region and contracting
kala-azar from the given data. (Test at the 5% level of significance.)
Contracted kala-azar   Region
                       Hill   Terai   Total
Yes                     20     40      60
No                     100     40     140
Total                  120     80     200

• Solution
• Step 1: Setting of Hypothesis:
• H₀: There is no significant difference between the Hilly region and the
Terai region in contracting kala-azar (region and disease are independent).
• H₁: There is a significant difference between the Hilly region and the
Terai region in contracting kala-azar (region and disease are associated).
Contd…
• Step 2: Calculation of expected frequencies and χ²:
Oi          Ei          (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
20          36          -16         256           7.11
40          24           16         256          10.67
100         84           16         256           3.05
40          56          -16         256           4.57
ΣOi = 200   ΣEi = 200                            χ² = 25.4

Contd…
• Or, using the 2 x 2 formula:
$$\chi^2 = \frac{N(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)} = \frac{200\,(20 \times 40 - 40 \times 100)^2}{60 \times 140 \times 120 \times 80} = 25.4$$
• Step 3: Level of significance: α = 0.05
• Step 4: degrees of freedom = (2 – 1)(2 – 1) = 1 d. f.
• Step 5: The tabulated value of χ² for 1 d. f. at the α = 0.05 level of
significance is 3.841.
• Step 6: Decision: Here, the calculated value χ² = 25.4 > χ² (α =
0.05) = 3.841. The calculated value of χ² is greater than the
tabulated value, so the result is highly significant. Hence, H₁ is
accepted.
• Step 7: Conclusion: We may conclude that there is a significant
difference between the Hilly region and the Terai region in
contracting kala-azar.
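• The two routes to χ² for a 2 x 2 table (the direct formula and the expected-frequency method) can be checked against each other with a short Python sketch. The function names chi_square_2x2 and expected_2x2 are hypothetical; the data are the kala-azar counts of Example 5.

```python
def chi_square_2x2(a, b, c, d):
    """Direct formula: N(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def expected_2x2(a, b, c, d):
    """Expected frequencies E(a), E(b), E(c), E(d) = row total x column total / N."""
    n = a + b + c + d
    return [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]


# Example 5: rows Yes/No, columns Hill/Terai.
cells = (20, 40, 100, 40)
direct = chi_square_2x2(*cells)
expected = expected_2x2(*cells)
by_cells = sum((o - e) ** 2 / e for o, e in zip(cells, expected))

print(expected)
print(round(direct, 2), round(by_cells, 2))
```

• Both routes give the same statistic, so the direct formula is simply a shortcut for the four-cell sum.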
(b) Yates' correction for continuity for a 2 x 2 table
• If any cell frequency in a 2 x 2 table is less than 5, we have to
apply the correction due to F. Yates (1934). This approach
consists in adding 0.5 to the cell frequency that is less than
5 and adjusting the remaining frequencies so that the marginal totals are unchanged.
• The test statistic after applying Yates' correction is:
$$\chi^2 = \frac{N\left(|ad - bc| - \frac{N}{2}\right)^2}{(a + b)(c + d)(a + c)(b + d)}$$
• Remarks:
i. Yates' correction for continuity can be applied only in the
case of a 2 x 2 table.
ii. If N is sufficiently large, the value of χ² is not affected much
by Yates' correction. However, if N is small, application of
Yates' correction may overstate the probability.


Example 6: Yates' correction problem
• An experiment was conducted to test the efficiency of
Chloromycetin in checking typhoid in a certain hospital.
Chloromycetin was given to 87 out of 109 patients
suffering from typhoid. The number of typhoid cases was
as follows. Test the effectiveness of Chloromycetin in
checking typhoid.
                   Typhoid   No Typhoid   Total
Chloromycetin        56        31           87
No Chloromycetin     18         4           22
Total                74        35          109


Solution
• Since the cell frequency 4 (cell a₂₂) is less than 5, we should
apply Yates' correction in computing the value of χ². This
consists in adding 0.5 to the cell frequency which is less than 5
and adjusting the remaining frequencies so that the totals are
unchanged, as given in the table.
• Step 1: Hypothesis setting:
• H₀: Chloromycetin is not effective in checking typhoid.
• H₁: Chloromycetin is effective in checking typhoid.
                          Typhoid (T)   No Typhoid (NT)   Total
Chloromycetin (Ch)          56.5          30.5              87
No Chloromycetin (N Ch)     17.5           4.5              22
Total                       74            35               109
Contd…
• Step 2: Test statistic: using Yates' formula, we get
$$\chi^2 = \frac{109\left(|56 \times 4 - 31 \times 18| - \frac{109}{2}\right)^2}{87 \times 22 \times 74 \times 35} = \frac{109 \times (279.5)^2}{4957260} = 1.718$$
• Step 3: level of significance (α) = 0.05
• Step 4: d. f. = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1 d. f.
• Now, the tabulated value of χ² at 1 d. f. = 3.841.
• Decision: Here, the calculated value χ² = 1.718 < χ²α
= 3.841. So it is not significant, and H₀ is accepted.
• Conclusion: We may conclude that Chloromycetin is
not effective in checking typhoid.
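• Yates' formula and the "add 0.5 and adjust" table are two descriptions of the same correction. The minimal Python sketch below checks this on the Chloromycetin data of Example 6. The function names are hypothetical, and clipping the corrected difference at zero is an added safeguard assumed here, not part of the formula as stated in these slides.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected statistic: N(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_yates(a, b, c, d):
    """Corrected statistic: N(|ad - bc| - N/2)^2 / [(a+b)(c+d)(a+c)(b+d)]."""
    n = a + b + c + d
    # Assumption: never let the correction push the difference below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Example 6: rows Chloromycetin / No Chloromycetin, columns Typhoid / No Typhoid.
yates = chi_square_yates(56, 31, 18, 4)

# Same result from the adjusted table with unchanged marginal totals.
adjusted = chi_square_2x2(56.5, 30.5, 17.5, 4.5)

print(round(yates, 3), round(adjusted, 3))
print("accept H0" if yates <= 3.841 else "reject H0")
```

• Because the marginal totals are kept fixed, shifting each cell by 0.5 reduces |ad – bc| by exactly N/2, which is why the two computations agree.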


(c) Fisher's exact test for 2 x 2 contingency tables:
• When any cell frequency is less than 5, Yates' correction is applied to
calculate χ² for the 2 x 2 contingency table.
• But if N is very small (less than 40) and one or more of the expected
frequencies is less than 5, then Yates' correction may
not improve the χ² approximation.
• To overcome this difficulty, R. A. Fisher gave a test
procedure known as Fisher's exact test.
• In this method, exact probabilities are calculated
instead of comparing the observed and expected
frequencies.
Procedure
• Step 1: Setting of Hypothesis: H₀: Population proportions are equal.
• H₁: Population proportions are not equal.
• Step 2: Test statistic: The configuration of the 2 x 2 contingency table is given below
Group   Category                        Total
        I              II
1       a              b               r₁ = (a + b)
2       c              d               r₂ = (c + d)
Total   c₁ = (a + c)   c₂ = (b + d)    n

• Reducing the smallest cell frequency by 1 (one) step by step, with the marginal totals
held fixed, gives more extreme configurations of the cell frequencies. Each time a new
table is formed. This process is continued till the smallest frequency becomes 0.
• For each table the conditional probability of the cell frequencies is computed as
$$p = \frac{r_1!\, r_2!\, c_1!\, c_2!}{n!\, a!\, b!\, c!\, d!}$$
• The p-value is the sum of the probabilities of the observed table and of all the more extreme tables.
• Step 3: Fix the level of significance, α = 0.05 unless stated otherwise.
• Step 4: Decision: Compare the calculated p-value with the level of significance.
• If p-value ≥ α (level of significance) then accept H₀.
• If p-value < α (level of significance) then accept H₁.
• Step 5: Conclusion: write the conclusion based on the decision.


Example 7: Fisher's exact test
• An experiment on the effect of immunization of goats against a
disease was conducted. Two batches, each of 10 animals, were
taken. One batch was inoculated and the other was not inoculated.
Then both batches were exposed to infection with the
disease. The frequencies of dead and survived animals were
observed in both batches. The results are given in the table:
• The frequency of dead and survived animals
                  Dead   Survived   Total
Inoculated          2       8         10
Not inoculated      7       3         10
Total               9      11         20


Solution
• Step 1: Hypothesis setting:
• H₀: The attributes inoculation with vaccine and survival from Anthrax are
independent. In other words, the vaccine is not effective in the immunization
of goats against Anthrax.
• H₁: The attributes are not independent, i. e. the vaccine is effective.
• Step 2: Test statistic: The following are the possible configurations of the cell
frequencies given in the table:
• Table 1 (observed):
                  Dead   Survived   Total
Inoculated          2       8         10
Not Inoculated      7       3         10
Total               9      11         20

• For table 1: $p_1 = \dfrac{10!\,10!\,9!\,11!}{20!\,2!\,8!\,7!\,3!} = 0.032150$


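Contd…
• Table 2:
        D    S    Total
I       1    9    10
NI      8    2    10
Total   9   11    20
• For table 2: $p_2 = \dfrac{10!\,10!\,9!\,11!}{20!\,1!\,9!\,8!\,2!} = 0.002679$

• Table 3:
        D    S    Total
I       0   10    10
NI      9    1    10
Total   9   11    20
• For table 3: $p_3 = \dfrac{10!\,10!\,9!\,11!}{20!\,0!\,10!\,9!\,1!} = 0.000060$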


Contd…
• Step 3: level of significance (α) = 0.05
• Step 4: d. f. = 1
• Step 5: p-value: The required probability is:
• p = p₁ + p₂ + p₃ = 0.032150 + 0.002679 +
0.000060 = 0.034889
• Decision: Here, p (calculated) = 0.034889 < α
= 0.05. The result is significant, so we reject H₀ and accept H₁.
• Conclusion: The hypothesis that the inoculation is not effective
against the disease is rejected; the inoculation is effective.
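• The table-by-table computation in Example 7 can be sketched in Python as follows. This is a minimal illustration of the procedure described in these slides (reduce the smallest cell step by step with the margins fixed and add up the probabilities); the function names table_probability and fisher_probabilities are hypothetical, and math.comb requires Python 3.8 or later.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table with fixed margins:
    r1! r2! c1! c2! / (n! a! b! c! d!), written with binomial coefficients."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_probabilities(a, b, c, d):
    """Probabilities of the observed table and of each more extreme table,
    obtained by lowering the smallest cell to 0 with the margins held fixed."""
    smallest = min(a, b, c, d)
    # Lowering a or d raises b and c; lowering b or c raises a and d.
    sign = -1 if smallest in (a, d) else 1
    probabilities = []
    for k in range(smallest + 1):
        s = sign * k
        probabilities.append(table_probability(a + s, b - s, c - s, d + s))
    return probabilities


# Example 7: rows Inoculated / Not inoculated, columns Dead / Survived.
probabilities = fisher_probabilities(2, 8, 7, 3)
p_value = sum(probabilities)

print([round(p, 6) for p in probabilities])
print(round(p_value, 6), "reject H0" if p_value < 0.05 else "accept H0")
```

• The sum covers only the tables that are more extreme in the observed direction, as in the worked example, so it is a one-tailed probability.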


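Case III: χ²-test for equality of several population
proportions:
• Null Hypothesis (H₀): P₁ = P₂ = P₃ = … = Pₙ [i. e.
P₁, P₂, …, Pₙ represent the true proportions in the populations
from which samples of sizes n₁, n₂, …, nₙ are drawn.]
• Alternative Hypothesis (H₁): P₁, P₂, P₃, …, Pₙ are
not all equal.
• Test statistic:
$$\chi^2 = \sum \frac{(O - E)^2}{E}, \qquad E = \frac{\text{row total} \times \text{column total}}{N}$$
which follows the χ² distribution with (r – 1)(c – 1) degrees of freedom.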


Example 8: Test for several populations
• The following data give the HDL levels in random samples of size 120,
200, 150 and 130 from the adult populations of the four cities A, B, C and D.
Test the equality of the proportions of adults with high HDL cholesterol in
these four cities. Use the 5% level of significance.
Cities          A     B     C     D    Total
High HDL        53    80    68    57    258
Not High HDL    67   120    82    73    342
Total          120   200   150   130    600

• Step 1: Hypothesis setting:
• H₀: P₁ = P₂ = P₃ = P₄ (i. e. P₁, P₂, P₃ and P₄ represent the true
proportions of adults with high HDL cholesterol in the cities A,
B, C and D respectively).
• H₁: P₁, P₂, P₃ and P₄ are not all equal.


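Solution
• Expected frequency: E = (row total x column total) / N
Category       City   Expected frequencies (E)
High HDL       A      (258 x 120)/600 = 51.6
               B      (258 x 200)/600 = 86
               C      (258 x 150)/600 = 64.5
               D      (258 x 130)/600 = 55.9
Not High HDL   A      (342 x 120)/600 = 68.4
               B      (342 x 200)/600 = 114
               C      (342 x 150)/600 = 85.5
               D      (342 x 130)/600 = 74.1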


Contd…
• Step 2: Test Statistic: Computation of χ².
Category       City   Oi    Ei      (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
High HDL       A       53    51.6     1.4         1.96        0.038
               B       80    86      -6          36           0.419
               C       68    64.5     3.5        12.25        0.190
               D       57    55.9     1.1         1.21        0.022
Not High HDL   A       67    68.4    -1.4         1.96        0.029
               B      120   114       6          36           0.316
               C       82    85.5    -3.5        12.25        0.143
               D       73    74.1    -1.1         1.21        0.016
Total                                                         χ² = 1.172

Contd…
• Therefore χ² = 1.172
• Step 3: Level of significance (α) = 0.05
• Step 4: d. f. = (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3
• Step 5: The tabulated value of χ² at the α = 0.05 level of
significance for 3 d. f. = 7.815
• Step 6: Decision: Since the calculated value χ² =
1.172 < χ²α = 7.815, the result is not significant.
Therefore, H₀ is accepted.
• Step 7: Conclusion: The proportion of adults with a
high HDL cholesterol level is most likely the same in
the four cities A, B, C and D.
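• The computation in Example 8 generalises to any table of r rows and c columns. The following is a minimal Python sketch under that reading; the function names expected_frequencies and chi_square_contingency are hypothetical, and the critical value 7.815 is taken from the table rather than computed.

```python
def expected_frequencies(table):
    """E = row total x column total / N for every cell of an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_contingency(table):
    """Return the chi-square statistic, its degrees of freedom and the expected table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Example 8: rows High HDL / Not High HDL, columns cities A, B, C, D.
hdl = [
    [53, 80, 68, 57],
    [67, 120, 82, 73],
]
statistic, df, expected = chi_square_contingency(hdl)

print([[round(e, 1) for e in row] for row in expected])
print(round(statistic, 3), df, "accept H0" if statistic <= 7.815 else "reject H0")
```

• The same function reproduces the 2 x 2 case of Example 5 when given a table with two rows and two columns.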
THANK YOU
