
Non-parametric tests

Need for non-parametric tests


1. The researcher is not sure that the data are normally distributed, or the data are known not to be normal.
2. Many non-parametric tests convert the values into ranks and compute the test statistic from the ranks.
3. The data are measured on an ordinal or nominal scale.
4. They are distribution-free.
5. They make fewer assumptions than parametric tests.

Classification of non-parametric tests

One sample
    Categorical values: binomial test (two categories); chi-square one-sample test
    Continuous variable: Kolmogorov-Smirnov test; sign test
Two samples
    Related: sign test
    Independent: Wilcoxon rank-sum test; Mann-Whitney test; chi-square two-sample test
More than two samples
    Independent: Kruskal-Wallis test; chi-square k-samples test
    Related: Cochran's Q test; Friedman test

WILCOXON SIGNED RANK TEST FOR ONE SAMPLE


Problem: An environmental activist believes her community's drinking water contains at least
the 40.0 parts per million (ppm) limit recommended by health officials for a certain metal. In
response to her claim, the health department analyzes drinking water from a sample
of 11 households in the community. The measured residue levels are 39.0, 20.2, 40.0, 32.2, 30.5,
26.5, 42.1, 45.6, 42.1, 29.9, and 40.9 ppm. At the 0.05 level of significance, can we conclude that
the community's drinking water might equal or exceed the 40.0 ppm recommended limit?
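
The notes state this problem without a worked solution. A minimal sketch of how the signed-rank statistics could be computed is given here. It assumes the hypotheses H0: median ≥ 40.0 against H1: median < 40.0 (an assumption, not stated in the notes), drops the observation equal to 40.0, and gives tied absolute differences their average rank. The helper name average_ranks is illustrative only.

```python
from itertools import product

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

levels = [39.0, 20.2, 40.0, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, 40.9]
hypothesised_median = 40.0  # assumed null value, taken from the problem text

# Differences from the hypothesised median (rounded to the data's one decimal);
# zero differences are dropped.
diffs = [round(x - hypothesised_median, 1) for x in levels]
diffs = [d for d in diffs if d != 0]

ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # sum of ranks of positive differences
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)  # sum of ranks of negative differences

# Exact lower-tail probability P(W+ <= observed) under H0: every rank is
# equally likely to carry a plus or a minus sign (2**n sign patterns).
patterns = list(product([0, 1], repeat=len(ranks)))
p_lower = sum(
    1 for signs in patterns
    if sum(r for s, r in zip(signs, ranks) if s) <= w_plus
) / len(patterns)

print(w_plus, w_minus, p_lower)
```

Under the assumed one-sided alternative, a small W+ (few large positive differences) would count as evidence that the median is below 40.0 ppm; the lower-tail probability is compared with α = 0.05.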

Sign test
1. Tests one population median, η.
2. Corresponds to the t-test for one mean.
3. Assumes the population is continuous.
4. Small-sample test statistic: the number of sample values above (or below) the hypothesized median.
5. The normal approximation can be used if n ≥ 10.

Question: A test is given before and after training. Fifteen people were tested, and the
scores are not normally distributed. Determine whether the training has improved the test
scores.

Person   Before score   After score   After − Before   Sign of change
1        5              6              1               +
2        3              2             −1               −
3        4              4              0
4        2              4              2               +
5        1              3              2               +
6        6              6              0
7        7              7              0
8        3              5              2               +
9        2              3              1               +
10       3              5              2               +
11       5              5              0
12       1              3              2               +
13       4              4              0
14       4              5              1               +
15       3              2             −1               −

Solution:

From the table above:

Total number of people = 15
Number of positive signs = 8
Number of negative signs = 2
Number of ties (no change) = 5

The ties carry no information about the direction of change and are dropped, so the test
uses the n = 8 + 2 = 10 people whose score changed.

Under the null hypothesis a change is equally likely to be positive or negative, so the number
of positive signs k follows a binomial distribution with p = 1/2 for a positive change and
q = 1 − p = 1/2 for a negative change.

H0: Positive changes are no more likely than negative changes (p ≤ 0.5)

H1: Positive changes are more likely than negative changes (p > 0.5)

The probability of observing k ≥ 8 positive signs is given by the binomial formula:

P(k ≥ 8) = C(10,8)(1/2)^8(1/2)^2 + C(10,9)(1/2)^9(1/2)^1 + C(10,10)(1/2)^10(1/2)^0

= 0.0439 + 0.0098 + 0.0010

= 0.0547

The significance level is 5 percent. Since 0.0547 > 0.05, H0 cannot be rejected: at the
5 percent level the data do not show a significant improvement in test scores after training.
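
The sign-test arithmetic above can be checked with a short script. This is only a sketch: it recomputes the signs from the before/after table, drops the ties, and sums the upper binomial tail with p = 0.5. The function name upper_tail is illustrative.

```python
from math import comb

# Scores copied from the table (persons 1 to 15).
before = [5, 3, 4, 2, 1, 6, 7, 3, 2, 3, 5, 1, 4, 4, 3]
after = [6, 2, 4, 4, 3, 6, 7, 5, 3, 5, 5, 3, 4, 5, 2]

diffs = [a - b for a, b in zip(after, before)]
n_pos = sum(d > 0 for d in diffs)   # number of positive signs
n_neg = sum(d < 0 for d in diffs)   # number of negative signs
n = n_pos + n_neg                   # ties (zero differences) are dropped

def upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

p_value = upper_tail(n_pos, n)
print(n_pos, n_neg, n, round(p_value, 4))
```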

Problem 1: Shri Vishnu is an analyst for Hanuman Enterprises. He asked 7 people to
rate a new bike on a 5-point scale (1 = terrible, …, 5 = excellent). The ratings are: 2 5 3 4 1
4 5. At the .05 level, is there evidence that the median rating is at least 3?
Solution: The hypothesized median is 3.
The values less than the hypothesized median are 1 and 2, so S = 2.
With n = 7 and p = 0.5,
P(X ≥ 2) = 1 − P(X ≤ 1)
= 1 − 8/128
= 0.9375
This probability is far above 0.05, so there is no evidence that the median rating is below 3.
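
As a quick check of this tail probability, the sketch below counts the ratings below the hypothesized median and evaluates 1 − P(X ≤ S − 1) for a binomial with n = 7 and p = 0.5. Keeping all 7 ratings, including the one equal to 3, follows the solution above; some textbooks drop such ties instead.

```python
from math import comb

ratings = [2, 5, 3, 4, 1, 4, 5]
eta0 = 3  # hypothesized median

s_below = sum(r < eta0 for r in ratings)  # S: ratings below the hypothesized median
n = len(ratings)                          # the rating equal to 3 is kept, as in the solution

# P(X >= S) = 1 - P(X <= S - 1) for X ~ Binomial(n, 0.5)
p_at_least_s = 1 - sum(comb(n, x) for x in range(s_below)) / 2**n
print(s_below, p_at_least_s)
```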

McNemar test
• The McNemar test is used to determine whether there are differences on a dichotomous
dependent variable between two related groups.
• It can be considered similar to the paired-samples t-test, but for a dichotomous
rather than a continuous dependent variable.
• The McNemar test is used to analyse pretest-posttest study designs.

Requirements:
• A dichotomous variable.
• Two related groups.
• The two groups (categories) of the dependent variable are mutually exclusive.
Problem 1: The following table summarizes a poll in which participants were asked which
candidate (BJP or Congress) they planned to vote for, once before a debate and again
after it. The table labels the two choices D and R.

                      After
                 D       R       Total
Before    D      63      21      84
          R      4       12      16
Total                            100

Did the respondents' opinions change after the debate?

Solution:

H0: The population's voting alignment was not altered by the debate.

H1: The population's voting alignment was altered by the debate.

With b = 21 (D before, R after) and c = 4 (R before, D after), the test statistic is

χ² = (b − c)² / (b + c)
= (21 − 4)² / (21 + 4)
= 289 / 25
= 11.56 > 3.84 = χ²(1) critical value at α = 0.05

With α = 0.05 we reject the null hypothesis and conclude that the debate did alter
the voting alignment.

Problem 2: 51 trainees undergoing SAP training were evaluated before and 1 month after
SAP basic training. The trainees were interviewed about the frequency of their hands-on
exercises per day.

Frequency of hands-on exercises per day

Before SAP basic training    1 month after SAP basic training    Total
                             ≤1           >1
≤1                           25 (a)       15 (b)                 40
>1                           0 (c)        11 (d)                 11
Total                        25           26                     51

Solution:
χ² = (b − c)² / (b + c)
= (15 − 0)² / (15 + 0)
= 15

The critical value for the McNemar test from the chi-square table with α = 0.05 and df = 1 is 3.84.
Since 15 > 3.84, H0 is rejected.
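
Both McNemar calculations use only the two discordant cells b and c. A minimal sketch of the computation follows; the function name mcnemar_statistic is illustrative, no continuity correction is applied (matching the hand calculation), and 3.84 is the chi-square critical value quoted in the notes.

```python
def mcnemar_statistic(b, c):
    """McNemar chi-square statistic (b - c)^2 / (b + c), without continuity correction."""
    if b + c == 0:
        raise ValueError("the statistic is undefined when there are no discordant pairs")
    return (b - c) ** 2 / (b + c)

CRITICAL_VALUE = 3.84  # chi-square, df = 1, alpha = 0.05 (as quoted in the notes)

stat_debate = mcnemar_statistic(21, 4)    # Problem 1: debate poll
stat_training = mcnemar_statistic(15, 0)  # Problem 2: SAP training

for name, stat in [("debate", stat_debate), ("training", stat_training)]:
    decision = "reject H0" if stat > CRITICAL_VALUE else "do not reject H0"
    print(name, round(stat, 2), decision)
```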

Mann-Whitney U test


Question 1: Ten children, one selected from each of 10 sets of identical twins, were trained by
method A and the remaining 10 were trained by method B. At the end of the year, the
following IQ scores were obtained.
Pair:      1   2   3   4   5   6   7   8   9   10
Method A:  31  25  38  33  42  40  44  26  43  35
Method B:  44  30  34  47  35  32  35  47  48  34

Is this sufficient evidence to indicate a difference in the average IQ scores of the two groups?
Use α = 0.10.
Solution:

Step 1:
The null hypothesis is H0: μ1 = μ2 and the alternative hypothesis is H1: μ1 ≠ μ2.
We first rank the items of both samples together (treating the whole group as one) and then find
the rank sum of each sample. Tied values receive the average of the ranks they occupy.

Method A   Rank    Method B   Rank
31         4       44         16.5
25         1       30         3
38         12      34         7.5
33         6       47         18.5
42         14      35         10
40         13      32         5
44         16.5    35         10
26         2       47         18.5
43         15      48         20
35         10      34         7.5
Total      93.5    Total      116.5

Step 2:
Here n1 = 10
n2 = 10
R1 = 93.5
R2 = 116.5

Test statistic

U = n1 n2 + [n1(n1 + 1)/2] − R1

where n1 is the number of observations in the first sample, n2 is the number of observations in the
second sample, and Ri is the rank sum of the ith sample.

U = 10 × 10 + [10(10 + 1)/2] − 93.5

= 100 + 55 − 93.5

= 61.5

Step 3:
Now we find the mean and variance of the sampling distribution of the U statistic, which are given by

Mean = n1 n2 / 2 and

Variance = n1 n2 (n1 + n2 + 1) / 12

Mean = (10 × 10)/2 = 50

Variance = 10 × 10 × (10 + 10 + 1)/12 = 175

Step 4:
We find the standard normal variate of U using the formula

Z = (U − Mean) / √Variance

= (61.5 − 50) / √175

= 11.5 / 13.2288

= 0.8693

At α = 0.10 (two-tailed), the critical z value from the standard normal table is 1.645.
Since the calculated |z| = 0.8693 is less than 1.645, we do not reject the null hypothesis. We conclude that there is
not sufficient evidence to indicate a difference in the average IQ scores of the two groups.
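
Steps 1 to 4 can be reproduced with a short script. This is a sketch under the same assumptions as the hand calculation: pooled ranking with average ranks for ties, U computed from the rank sum of the first sample, and the normal approximation without a tie correction to the variance. The function names are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_z(sample1, sample2):
    """Return (R1, R2, U, z) using U = n1*n2 + n1*(n1+1)/2 - R1."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the pooled data
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12  # no tie correction, as in the notes
    z = (u - mean_u) / sqrt(var_u)
    return r1, r2, u, z

method_a = [31, 25, 38, 33, 42, 40, 44, 26, 43, 35]
method_b = [44, 30, 34, 47, 35, 32, 35, 47, 48, 34]

r1, r2, u, z = mann_whitney_z(method_a, method_b)
print(r1, r2, u, round(z, 4))
```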

Question 2: The nicotine content of two brands of cigarettes, measured in milligrams, was found
to be as follows:
Brand A: 2.1 4 6.3 5.4 4.8 3.7 6.1 3.3
Brand B: 4.1 0.6 3.1 2.5 4 6.2 1.6 2.2 1.9 5.4
At α = 0.05, test whether the average nicotine contents of the two brands are equal against the
alternative that they are unequal.

Solution:

The null hypothesis is H0: μ1 = μ2 and the alternative hypothesis is H1: μ1 ≠ μ2.

We first rank the items of both samples together (treating the whole group as one) and then find
the rank sum of each sample.

Brand A   Rank    Brand B   Rank
2.1       4       4.1       12
4         10.5    0.6       1
6.3       18      3.1       7
5.4       14.5    2.5       6
4.8       13      4.0       10.5
3.7       9       6.2       17
6.1       16      1.6       2
3.3       8       2.2       5
                  1.9       3
                  5.4       14.5
Total     93      Total     78

Here n1 = 8
n2 = 10
R1 = 93
R2 = 78
The test statistic is U = n1 n2 + [n1(n1 + 1)/2] − R1, where n1 is the number of observations in the
first sample, n2 is the number of observations in the second sample, and Ri is the rank sum of the ith sample.

U = 8 × 10 + [8(8 + 1)/2] − 93

= 80 + 36 − 93

= 23

Now we find the mean and variance of the sampling distribution of the U statistic, which are given by
Mean = n1 n2 / 2 and

Variance = n1 n2 (n1 + n2 + 1) / 12
Mean = (8 × 10)/2 = 40

Variance = 8 × 10 × (8 + 10 + 1)/12
= 126.67

Then we find the standard normal variate of U using the formula

Z = (U − Mean) / √Variance
= (23 − 40) / √126.67
= −17 / 11.2547
= −1.5105

At α = 0.05 (two-tailed), the critical z values from the standard normal table are ±1.96.
Since the calculated z = −1.5105 lies between −1.96 and 1.96, we do not reject the null hypothesis. We conclude that there is
not sufficient evidence that the average nicotine contents of the two brands differ.
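
Instead of comparing z with a table value, the two-sided p-value of the normal approximation can be computed directly. The sketch below uses the identity 2(1 − Φ(|z|)) = erfc(|z|/√2) from the standard library; the function name two_sided_p is illustrative.

```python
from math import erfc, sqrt

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return erfc(abs(z) / sqrt(2))

p_iq = two_sided_p(0.8693)         # Question 1 (IQ scores), compare with alpha = 0.10
p_nicotine = two_sided_p(-1.5105)  # Question 2 (nicotine), compare with alpha = 0.05

print(round(p_iq, 3), round(p_nicotine, 3))
```

In both questions the p-value exceeds the chosen significance level, which agrees with the decisions reached from the critical values.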

Kruskal-Wallis test


The Kruskal-Wallis test is a non-parametric technique for comparing two or more populations;
it is analogous to one-way ANOVA. Just as in the case of two independent samples, ranks are
computed for each observation according to the relative magnitude of the measurements when
the data from all the samples are combined. The test statistic is a function of the
rank sums of the samples, and the following hypotheses are tested.
• H0: The probability distributions for all samples are identical.
• H1: At least two of the probability distributions differ in location.
Assumptions
• The p samples are random and independent.
• There are 5 or more measurements in each sample.
• The p probability distributions from which the samples are drawn are continuous.
Problem 1: Use the Kruskal-Wallis test to test for differences in mean among 3 samples at α = 0.05.
Sample 1: 100, 65, 102, 86, 80, 89, 98, 96, 91, 101
Sample 2: 84, 103, 126, 62, 92, 97, 95, 90, 94, 76
Sample 3: 90, 99, 57, 106, 88, 91, 88, 102, 77, 90

Solution:

The null hypothesis is H0: μ1 = μ2 = μ3 and the alternative hypothesis is H1: μ1, μ2, μ3 are not all equal.
We first rank the items of all samples together (treating the whole group as one) and then find
the rank sum of each sample.

Sample 1   Rank    Sample 2   Rank    Sample 3   Rank
100        24      84         7       90         13
65         3       103        28      99         23
102        26.5    126        30      57         1
86         8       62         2       106        29
80         6       92         17      88         9.5
89         11      97         21      91         15.5
98         22      95         19      88         9.5
96         20      90         13      102        26.5
91         15.5    94         18      77         5
101        25      76         4       90         13
R1         161     R2         159     R3         145

Here
n = 30
ni = 10 for all i
m = 3
Degrees of freedom = m − 1 = 3 − 1 = 2.
Test statistic:

H = [12 / (n(n + 1))] × Σ(i = 1 to m) (Ri² / ni) − 3(n + 1)

H = [12 / (30(30 + 1))] × (161²/10 + 159²/10 + 145²/10) − 3(30 + 1)

= 0.196

From the χ² distribution with m − 1 = 2 degrees of freedom and α = 0.05 we get the critical
value 5.991. Since H = 0.196 < 5.991, we do not reject the null hypothesis and conclude that there is no
significant difference in the mean among the 3 samples.
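
The ranking and the H statistic can be verified with a short script. This is a sketch that follows the hand calculation: pooled ranking with average ranks for ties and no tie correction to H. The function names are illustrative, and the first value of Sample 2 is taken as 84, as in the problem statement.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """Return (rank sums, H) with H = 12/(n(n+1)) * sum(Ri^2/ni) - 3(n+1)."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)
    n = len(pooled)
    rank_sums = []
    start = 0
    for s in samples:  # slice the pooled ranks back into their samples
        rank_sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    h = 12 / (n * (n + 1)) * sum(r**2 / len(s) for r, s in zip(rank_sums, samples)) - 3 * (n + 1)
    return rank_sums, h

sample1 = [100, 65, 102, 86, 80, 89, 98, 96, 91, 101]
sample2 = [84, 103, 126, 62, 92, 97, 95, 90, 94, 76]
sample3 = [90, 99, 57, 106, 88, 91, 88, 102, 77, 90]

rank_sums, h = kruskal_wallis_h([sample1, sample2, sample3])
print(rank_sums, round(h, 3))
```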

Cochran's Q test
In the analysis of two-way randomized block designs where the response variable
can take only two possible outcomes (coded as 0 and 1),
Cochran's Q test is a non-parametric statistical test of whether k treatments have identical
effects. It is named after William Gemmell Cochran.
Cochran's Q test should not be confused with Cochran's C test, which is a variance outlier test.
Put simply, Cochran's Q test requires that the response be binary (e.g.
success/failure or 1/0) and that there be more than 2 groups of the same size. The test assesses
whether the proportion of successes is the same between groups. It is often used to assess whether
different observers of the same phenomenon have consistent results (interobserver variability).
The Cochran's Q test statistic is

Q = k(k − 1) × [Σ(j = 1 to k) (X•j − N/k)²] / [Σ(i = 1 to b) Xi•(k − Xi•)]

where
k is the number of treatments
X•j is the column total for the jth treatment
b is the number of blocks
Xi• is the row total for the ith block
N is the grand total
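
To make the symbols concrete, the following sketch evaluates the statistic Q = k(k − 1) Σ(X•j − N/k)² / Σ Xi•(k − Xi•) on a small table of 0/1 responses, with rows as blocks and columns as treatments. The data are hypothetical and invented purely for illustration; they do not come from these notes. The function name cochran_q is also illustrative.

```python
def cochran_q(table):
    """Cochran's Q for a b x k table of 0/1 responses (rows = blocks, columns = treatments)."""
    k = len(table[0])                                            # number of treatments
    col_totals = [sum(row[j] for row in table) for j in range(k)]  # X.j
    row_totals = [sum(row) for row in table]                     # Xi.
    n_total = sum(row_totals)                                    # N, the grand total
    numerator = k * (k - 1) * sum((c - n_total / k) ** 2 for c in col_totals)
    denominator = sum(r * (k - r) for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every block is all 0s or all 1s")
    return numerator / denominator

# Hypothetical data: 6 blocks (rows) rated under 3 treatments (columns).
hypothetical_table = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]

q = cochran_q(hypothetical_table)
print(q)
```

Under the null hypothesis Q is compared with the chi-square distribution with k − 1 degrees of freedom; with k = 3 the 5 percent critical value is 5.991, the same value used in the Kruskal-Wallis problem.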
