Inferential Statistics
©drtamil@gmail.com 2012
Drug A Better Than Drug B?
Null Hypothesis
Can reindeer fly?
You believe reindeer can fly
Null hypothesis: “reindeer cannot fly”
Experimental design: to throw reindeer off the roof
Implementation: they all go splat on the ground
Evaluation: null hypothesis not rejected
• This does not prove reindeer cannot fly: what you have shown
is that
– “from this roof, on this day, under these weather conditions,
these particular reindeer either could not, or chose not to, fly”
Significance
Inferential statistics determine whether a significant
difference in effectiveness exists between drug A and
drug B.
If there is a significant difference (p < 0.05), then the
null hypothesis is rejected.
Otherwise, if there is no significant difference (p ≥ 0.05), then
the null hypothesis is not rejected.
The usual level of significance used to decide whether to
reject the null hypothesis is either 0.05 or 0.01.
In the above example, it was set at 0.05.
Confidence interval
[Figure: sampling distribution with two-tailed rejection regions. H0 is rejected in either tail (area 0.025 each); the marked critical values are ±1.96 and ±2.0639 on the t axis.]
Fisher’s Use of p-Values
R.A. Fisher referred to the probability used to declare
significance as the “p-value”.
“It is a common practice to judge a result significant, if it
is of such magnitude that it would be produced by
chance not more frequently than once in 20 trials.”
1/20 = 0.05. If the p-value is less than 0.05, then a result at
least as extreme as the one observed would arise by chance less
than 5% of the time if the null hypothesis were true.
Such a result is conventionally declared significant and taken as
evidence of a real effect rather than chance.
Error
[Table: DECISION (rows) against REALITY (columns: treatments are not different / treatments are different).]
Type I Error
• Type I Error – rejecting the null hypothesis
although the null hypothesis is correct, e.g.
when we compare the mean/proportion of
the 2 groups, the difference is small but the
difference is found to be significant.
Therefore the null hypothesis is rejected.
• It may occur due to an inappropriate choice of
alpha (level of significance).
Type II Error
p = 0.136, which is bigger than 0.05. There was no significant difference and the null hypothesis was not
rejected.
There was a large difference between the rates, but it was not
significant. Type II Error?
Not significant, since the power of
the study is less than 80%.
Power is only
32%!
Check for the errors
Determining the
appropriate statistical test
Data Analysis
Test of Association
Problem Flow Chart
Independent Variables
Suicidal Tendencies
Dependent Variable
Multivariate
Hypothesis Testing
Distinguish parametric & non-parametric
procedures
Test two or more populations using
parametric & non-parametric procedures
• Means
• Medians
• Variances
Parametric Test
Procedures
Involve population parameters
• Example: Population mean
Require interval scale or ratio scale
• Whole numbers or fractions
• Example: Height in inches: 72, 60.5,
54.7
Have stringent assumptions
• Example: Normal distribution
Examples: Z test, t test
Nonparametric Test
Procedures
Statistic does not depend on population
distribution
Data may be nominally or ordinally
scaled
• Example: Male-female
May involve population parameters such
as median
Example: Wilcoxon rank sum test
Parametric Analysis –
Quantitative
non-parametric tests
Variable 1                  Variable 2                  Criteria                                             Type of Test
Qualitative, dichotomous    Qualitative, dichotomous    Sample size < 20, or < 40 but with at least          Fisher Test
                                                        one expected value < 5
Qualitative, dichotomous    Quantitative                Data not normally distributed                        Wilcoxon Rank Sum Test or
                                                                                                             Mann-Whitney U Test
Qualitative, polynomial     Quantitative                Data not normally distributed                        Kruskal-Wallis One Way ANOVA Test
Quantitative                Quantitative                Repeated measurement of the same individual & item   Wilcoxon Signed Rank Test
Quantitative, continuous    Quantitative, continuous    Data not normally distributed                        Spearman/Kendall Rank Correlation
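The selection rules in the table above can be sketched as a small lookup. This is only an illustration: the function name, the category labels ("dichotomous", "multi_category", "quantitative") and the parameter names are hypothetical, and the sketch assumes the data have already been judged not normally distributed where the table requires it.

```python
def suggest_nonparametric_test(var1, var2, repeated=False, n=None, min_expected=None):
    """Return the test named in the table for a pair of variable types, or None."""
    kinds = sorted([var1, var2])
    if kinds == ["dichotomous", "dichotomous"]:
        # Fisher test: sample size < 20, or < 40 with an expected value < 5.
        small_sample = n is not None and n < 20
        sparse_cells = (
            n is not None and n < 40
            and min_expected is not None and min_expected < 5
        )
        return "Fisher test" if small_sample or sparse_cells else None
    if kinds == ["dichotomous", "quantitative"]:
        return "Wilcoxon rank sum / Mann-Whitney U test"
    if kinds == ["multi_category", "quantitative"]:
        return "Kruskal-Wallis one-way ANOVA"
    if kinds == ["quantitative", "quantitative"]:
        # Repeated measurements on the same individuals are a paired design.
        if repeated:
            return "Wilcoxon signed rank test"
        return "Spearman/Kendall rank correlation"
    return None  # combination not covered by the table


print(suggest_nonparametric_test("quantitative", "dichotomous"))
```

Returning None for combinations the table does not list keeps the sketch from recommending a test the slide never mentions.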
Statistical Tests - Qualitative
Data Analysis
Using SPSS;
http://161.142.92.104/spss/
Using Excel;
http://161.142.92.104/excel/
FF2613
Independent T-Test
Student’s T-Test
Paired T-Test
ANOVA
Student’s T-test
To compare the means of two independent
groups. For example; comparing the mean
Hb between cases and controls. 2 variables
are involved here, one quantitative (i.e. Hb)
and the other a dichotomous qualitative
variable (i.e. case/control).
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
Examples: Student’s t-test
Comparing the level of blood cholesterol
(mg/dL) between the hypertensive and
normotensive.
Comparing the HAMD score of two
groups of psychiatric patients treated
with two different types of drugs (i.e.
Fluoxetine & Sertraline).
Example
Group Statistics
Assumptions of T test
Manual Calculation
$$ s_0^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} $$
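The pooled variance formula can be evaluated with a few lines of Python. This is a minimal sketch: the function name `pooled_variance` is illustrative, and the inputs are the summary figures from the cholesterol example in this deck (n = 64, s.d. = 39.22 and n = 36, s.d. = 37.26), used here only to show the arithmetic.

```python
import math


def pooled_variance(n1, s1, n2, s2):
    """Pooled variance of two groups from their sizes and standard deviations."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    denominator = (n1 - 1) + (n2 - 1)
    return numerator / denominator


s0_squared = pooled_variance(64, 39.22, 36, 37.26)
s0 = math.sqrt(s0_squared)  # pooled standard deviation
print(f"pooled variance = {s0_squared:.2f}, pooled s.d. = {s0:.2f}")
```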
Example – compare
cholesterol level
Hypertensive: Mean: 214.92, s.d.: 39.22, n: 64
Normal: Mean: 182.19, s.d.: 37.26, n: 36
• Comparing the cholesterol level between
hypertensive and normal patients.
• The difference is (214.92 – 182.19) = 32.73 mg%.
• H0 : There is no difference of cholesterol level
between hypertensive and normal patients.
• n > 30, (64+36=100), therefore use the first formula.
Calculation
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
$$ t = \frac{214.92 - 182.19}{\sqrt{\dfrac{39.22^2}{64} + \dfrac{37.26^2}{36}}} = 4.137 $$
df = n1 + n2 − 2 = 64 + 36 − 2 = 98
Refer to t table; with t = 4.137, p < 0.001
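The calculation above can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library; the function name `two_sample_t` is illustrative. Because the standard library has no t distribution, the p-value shown is a two-sided normal approximation (an assumption of this sketch), which slightly understates the t-based p-value at df = 98 but leads to the same conclusion.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups, using separate variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


t = two_sample_t(214.92, 39.22, 64, 182.19, 37.26, 36)
df = 64 + 36 - 2

# Two-sided p-value from the standard normal distribution (approximation).
p_normal = math.erfc(abs(t) / math.sqrt(2))

print(f"t = {t:.3f}, df = {df}, approximate p = {p_normal:.6f}")
```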
If df > 100, we can refer to Table A1.
Table A1 does not list 4.137, so we
use 3.99 instead. If t = 3.99,
then p = 0.00003 × 2 = 0.00006.
Therefore if t = 4.137,
p < 0.00006.
Alternatively, refer to Table A3.
Table A3 does not list df = 98,
so we use df = 60 instead.
t = 4.137 > 3.46 (p = 0.001), therefore p < 0.001.
Conclusion
• Therefore p < 0.05, null hypothesis rejected.
• There is a significant difference of
cholesterol level between hypertensive and
normal patients.
• Hypertensive patients have a significantly
higher cholesterol level compared to
normotensive patients.
T-Test In SPSS
                      SGA      N     Mean     Std. Deviation   Std. Error Mean
Weight at first ANC   Normal   108   58.666   11.2302          1.0806
                      SGA      109   51.037    9.3574           .8963
Paired T-Test
Formula
$$ t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} $$
$$ s_d = \sqrt{\frac{\sum d_i^2 - \dfrac{(\sum d_i)^2}{n}}{n - 1}} $$
$$ df = n_p - 1 $$
Examples of paired t-test
Example
Paired Differences
                              Mean      Std. Deviation   t       df   Sig. (2-tailed)
Pair 1  DHAMAWK0 - DHAMAWK6   10.1563   6.75903          8.500   31   .000
Manual Calculation
Calculation
∑d = 112, ∑d² = 1842, n = 36
Mean d = 112/36 = 3.11
$$ s_d = \sqrt{\frac{1842 - 112^2/36}{35}} = 6.53 $$
$$ t = \frac{3.11}{6.53/\sqrt{36}} = \frac{3.11}{6.53/6} = 2.858 $$
df = n_p − 1 = 36 − 1 = 35
Refer to the t table.
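The same paired calculation can be sketched in Python from the three totals given on the slide (∑d, ∑d² and n). The function name `paired_t_from_sums` is illustrative. Unrounded intermediate values are carried through, so the last digit can differ slightly from the hand calculation.

```python
import math


def paired_t_from_sums(sum_d, sum_d_squared, n):
    """Paired t-test quantities from the sum and sum of squares of differences."""
    mean_d = sum_d / n
    # Standard deviation of the differences.
    sd = math.sqrt((sum_d_squared - sum_d ** 2 / n) / (n - 1))
    t_stat = mean_d / (sd / math.sqrt(n))
    return mean_d, sd, t_stat, n - 1


mean_d, sd, t_stat, df = paired_t_from_sums(112, 1842, 36)
print(f"mean d = {mean_d:.2f}, sd = {sd:.2f}, t = {t_stat:.3f}, df = {df}")
```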
Refer to Table A3.
We don’t have df=35,
so we use df=30 instead.
t = 2.858, larger than 2.75
(p=0.01) but smaller than 3.03
(p=0.005). 3.03>t>2.75
Therefore if t=2.858,
0.005<p<0.01.
Conclusion
With t = 2.858, 0.005 < p < 0.01.
Therefore p < 0.05 and the null hypothesis is
rejected.
Conclusion: There is a significant
difference in systolic blood pressure
between the first and second
measurements. The mean of the first
reading is significantly higher than
that of the second reading.
Paired T-Test In SPSS
              Mean     N    Std. Deviation   Std. Error Mean
Pair 1  HB2   10.247   70   .3566            .0426
        HB3   10.594   70   .9706            .1160
Paired T-Test Results
Paired Samples Test
(the first five value columns are the Paired Differences; Lower and Upper are the 95% Confidence Interval of the Difference)
                    Mean    Std. Deviation   Std. Error Mean   Lower   Upper   t        df   Sig. (2-tailed)
Pair 1  HB2 - HB3   -.347   .9623            .1150             -.577   -.118   -3.018   69   .004
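As a consistency check on the SPSS output above, the standard error and t value can be recomputed from the mean difference, its standard deviation and the number of pairs (n = 70). This is a sketch using the rounded figures printed in the table, so the result agrees with SPSS only to about two decimal places.

```python
import math

mean_diff = -0.347   # mean of HB2 - HB3 from the table
sd_diff = 0.9623     # standard deviation of the differences
n_pairs = 70

# Standard error of the mean difference, then the paired t statistic.
std_error = sd_diff / math.sqrt(n_pairs)
t_value = mean_diff / std_error
df = n_pairs - 1

print(f"SE = {std_error:.4f}, t = {t_value:.3f}, df = {df}")
```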
Before treatment (HB2) vs after treatment (HB3): n = 70, mean difference ± s.d. = 0.35 ± 0.96, paired t-test, t = 3.018, p = 0.004
ANOVA
ANOVA –
Analysis of Variance
Examples
One-Way ANOVA
F-Test Assumptions
Randomness & independence of errors
• Independent random samples are drawn
Normality
• Populations are normally distributed
Homogeneity of variance
• Populations have equal variances
Manual Calculation
ANOVA
Example:
Time To Complete
Analysis
45 samples were
analysed using 3
different blood analysers
(Mach1, Mach2 &
Mach3).
If the combined
distances of the group means from the grand mean are large,
that indicates we
should reject H0.
The Anova Statistic
To combine the differences from the grand mean we
• Square the differences
• Multiply by the numbers of observations in the groups
• Sum over the groups
$$ SSB = 15(\bar{X}_{Mach1} - \bar{X})^2 + 15(\bar{X}_{Mach2} - \bar{X})^2 + 15(\bar{X}_{Mach3} - \bar{X})^2 $$
How big is big?
$$ MSE = \frac{1}{N - K} \sum_j \sum_i (x_{ij} - \bar{X}_j)^2 $$
Notes on MSE
Under H0 the F statistic has an “F” distribution,
with K-1 and N-K degrees of freedom (N is the
total number of observations)
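The between-group sum of squares, the MSE and the resulting F statistic can be computed from raw data as sketched below. The function name `one_way_anova` is illustrative, and the three small groups are made-up numbers chosen for easy checking; they are not the analyser timings from the example, which the slides do not list.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: squared distance of each group mean from the grand
    # mean, weighted by the number of observations in the group.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: squared distance of each observation from its group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)
    mse = sse / (n_total - k)
    return msb / mse, k - 1, n_total - k


# Hypothetical data for illustration only.
f_stat, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) degrees of freedom")
```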
Time to Analyse:
F test p-value
To get a p-value we
compare our F statistic to
an F(2, 42) distribution.
In our example
F = 89.015
The F value is so large that it falls far beyond the plotted range of the
F distribution, therefore the p-value
is very small.
Refer to F Dist. Table (α=0.01).
We don’t have df=2;42,
so we use df=2;40 instead.
F = 89.015, larger than 5.18
(p=0.01)
Therefore if F=89.015, p<0.01.
Results are often displayed using an ANOVA Table
ANOVA Table
                  Sum of Squares   df   Mean Square   F        Sig.
Between Groups    141.492          2    70.746        89.015   .0000000
Within Groups      33.380          42     .795
Total             174.872          44
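The entries of an ANOVA table are tied together by simple arithmetic, which the sketch below retraces from the between-groups and total sums of squares and their degrees of freedom. The within-groups values are derived here by subtraction (an assumption that the table follows the usual one-way layout), and the variable names are illustrative.

```python
ss_between, df_between = 141.492, 2
ss_total, df_total = 174.872, 44

# Within-groups quantities follow by subtraction from the totals.
ss_within = ss_total - ss_between
df_within = df_total - df_between

# Mean squares are sums of squares divided by their degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_ratio = ms_between / ms_within
print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}, F = {f_ratio:.3f}")
```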
Pop Quiz!: Where are the following quantities presented in this
table?
Select ‘Descriptive’,
‘Homogeneity of
variance test’ and
‘Means plot’.
Click ‘Continue’ and
then ‘OK’.
Results & Homogeneity of
Variances
Test of Homogeneity of Variances
Birth weight
Levene Statistic   df1   df2   Sig.
.757               2     215   .470
ANOVA Results
ANOVA
Birth weight
                 Sum of Squares   df    Mean Square   F      Sig.
Between Groups     .153           2     .077          .263   .769
Within Groups    62.550           215   .291
Total            62.703           217
How to present the
result?
Housewife: 2.78 ± 0.53; ANOVA, F = 0.263, p = 0.769
Proportionate Test
Formula
$$ z = \frac{p_1 - p_2}{\sqrt{p_0 q_0 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$
$$ p_0 = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} $$
$$ q_0 = 1 - p_0 $$
• where p1 is the rate for event 1 = a1/n1
• p2 is the rate for event 2 = a2/n2
• a1 and a2 are frequencies of event 1 and 2
We refer to the normal distribution table to decide whether or not to reject the null hypothesis.
http://stattrek.com/hypothesis-test/proportion.aspx
$$ p_0 = \frac{(29/96 \times 96) + (24/104 \times 104)}{96 + 104} = 0.265 $$
q0 = 1 – 0.265 = 0.735
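The pooled proportion and the z statistic can be computed together, as in the sketch below, using the frequencies 29 out of 96 and 24 out of 104 from the calculation above. The function name `two_proportion_z` is illustrative, and the two-sided p-value is taken from the standard normal distribution via `math.erfc` instead of a printed table.

```python
import math


def two_proportion_z(a1, n1, a2, n2):
    """z test for the difference between two proportions with a pooled rate."""
    p1 = a1 / n1
    p2 = a2 / n2
    p0 = (a1 + a2) / (n1 + n2)  # same as (p1*n1 + p2*n2) / (n1 + n2)
    q0 = 1 - p0
    z = (p1 - p2) / math.sqrt(p0 * q0 * (1 / n1 + 1 / n2))
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p0, q0, z, p_value


p0, q0, z, p_value = two_proportion_z(29, 96, 24, 104)
print(f"p0 = {p0:.3f}, q0 = {q0:.3f}, z = {z:.3f}, p = {p_value:.3f}")
```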
Cont.