learning objectives:
to understand the relationship between point
estimation and interval estimation
to calculate and interpret the confidence
interval
Statistical estimation
Random sample: every member of the population has the same chance of being selected in the sample
Population: parameters
Random sample: statistics
Estimation: sample statistics are used to estimate population parameters
Statistical estimation
Estimate
Point estimate
sample mean
sample proportion
Interval estimate
Interval estimation
Confidence interval (CI)
provides us with a range of values that we believe, with a given level of confidence, contains the true value
CI for the population mean
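As a minimal sketch of how a point estimate and an interval estimate relate, the Python snippet below computes a sample mean and an approximate 95% confidence interval for the population mean. The ages are hypothetical, the function name `mean_confidence_interval` is an illustrative assumption, and the interval uses the normal approximation (z = 1.96 for 95%); for small samples a t-based interval would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Return (point estimate, lower limit, upper limit) for the population mean."""
    n = len(values)
    point_estimate = mean(values)           # point estimate: the sample mean
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value of the standard normal distribution (1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return point_estimate, point_estimate - margin, point_estimate + margin


# Hypothetical ages, not taken from the slides
ages = [38, 45, 41, 36, 52, 44, 39, 47, 40, 43]
estimate, lower, upper = mean_confidence_interval(ages)
print(f"point estimate = {estimate:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The interval is centred on the point estimate, and its width shrinks as the sample size grows or the standard deviation falls.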
Interval estimation
Confidence interval (CI)
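[Figure: standard normal curve over z from -3.0 to 3.0. Approximate areas between successive whole-number z values: 2%, 14%, 34%, 34%, 14%, 2%. Marked critical values: z = ±1.96 and z = ±2.58.]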
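The critical values z = 1.96 and z = 2.58 marked on the normal curve correspond to 95% and 99% confidence. The sketch below, which uses `statistics.NormalDist` from Python's standard library, shows where they come from; the helper names `critical_z` and `central_area` are illustrative assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def critical_z(confidence):
    """z value that leaves (1 - confidence) / 2 in each tail."""
    return standard_normal.inv_cdf(0.5 + confidence / 2)


def central_area(z):
    """Area under the standard normal curve between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.2f}")

for z in (1.0, 2.0, 3.0):
    print(f"area within +/-{z:.0f} SD: {central_area(z):.1%}")
```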
Interval estimation
Confidence interval (CI), interpretation and example
[Figure: histogram of age in years (x-axis, 22.5 to 60.0) against frequency (y-axis, 0 to 50).]
Testing of hypotheses
learning objectives:
to understand the role of the significance test
to distinguish the null and alternative hypotheses
to interpret the p-value and type I and II errors
Scientific knowledge
Reason and intuition: formulate hypotheses
Empirical observation: collect data to test hypotheses
CHANCE
Accept hypothesis
Reject hypothesis
Testing of hypotheses
Significance test
Subjects: random sample of 352 nurses from HUS surgical hospitals
Mean age of the nurses (based on the sample): 41.0
Another random sample gave a mean value of 42.0.
Question: Is it possible that the true mean age of nurses from HUS surgical hospitals was 41 years and the observed mean ages differed just because of sampling error?
The answer can be given by significance testing.
Testing of hypotheses
Null hypothesis H0 - there is no difference
Alternative hypothesis HA - there is a difference
Testing of hypotheses
Example
The purpose of the study: to assess the effect of the lactation nurse on attitudes towards breast feeding among women
Research question: Does the lactation nurse have an effect on attitudes towards breast feeding?
HA :
H0 :
Testing of hypotheses
Definition of p-value.
[Figure: histogram of AGE (x-axis, 23.8 to 58.8) against frequency (y-axis, 0 to 90). Two green lines enclose the central 95% of the distribution, with 2.5% in each tail.]
If our observed age value lies outside the green lines, the
probability of getting a value as extreme as this if the null
hypothesis is true is < 5%
Testing of hypotheses
Definition of p-value.
p-value = probability of observing a value as extreme as, or more extreme than, the value actually observed, if the null hypothesis is true
The smaller the p-value, the less plausible the null hypothesis seems as an explanation for the data
Interpretation for the example
If the result falls outside the green lines, p<0.05; if it falls inside the green lines, p>0.05
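The definition of the p-value can be made concrete with a small sketch of a two-sided z-test for one mean. The sample size (352) and the two means (41.0 and 42.0) come from the nurse example; the standard deviation of 10 years is a hypothetical value, since the slides do not give one, and the function name `two_sided_p_value` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, sd, n):
    """Return (z, p) for a two-sided z-test of H0: population mean = null_mean."""
    standard_error = sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability of a value at least as extreme as z, in either tail, if H0 is true
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# n and the means are from the nurse example; sd = 10.0 is hypothetical
z, p = two_sided_p_value(sample_mean=42.0, null_mean=41.0, sd=10.0, n=352)
print(f"z = {z:.2f}, p = {p:.3f}")
print("reject H0 at the 5% level" if p < 0.05 else "do not reject H0 at the 5% level")
```

With a different assumed standard deviation the conclusion could change, which is why the observed spread of the data matters as much as the difference between the means.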
Testing of hypotheses
Type I and Type II Errors
No study is perfect, there is always the chance for error
Decision | H0 true / HA false | H0 false / HA true
Accept H0 / reject HA | OK, p = 1 - α | Type II error (β), p = β
Reject H0 / accept HA | Type I error (α), p = α | OK, p = 1 - β
α - level of significance
1 - β - power of the test
Testing of hypotheses
Type I and Type II Errors
α = 0.05
Testing of hypotheses
Type I and Type II Errors
The probability of making a Type II error (β) can be decreased by increasing the level of significance (α).
However, this will increase the chance of a Type I error.
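The trade-off between the two error types can be illustrated numerically. The sketch below assumes a one-sided z-test of a single mean with known standard deviation; the effect size, standard deviation and sample size are hypothetical, and `type_ii_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Beta for a one-sided z-test of H0: mu = mu0 against HA: mu = mu0 + effect."""
    z_critical = NormalDist().inv_cdf(1 - alpha)   # rejection threshold on the z scale
    shift = effect / (sigma / sqrt(n))             # true mean expressed in standard errors
    # Probability that the test statistic stays below the threshold although HA is true
    return NormalDist().cdf(z_critical - shift)


# Hypothetical scenario: true difference of 1 year, sd of 10 years, n = 352
for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, effect=1.0, sigma=10.0, n=352)
    print(f"alpha = {alpha:.2f}  beta = {beta:.2f}  power = {1 - beta:.2f}")
```

As alpha rises, beta falls and the power (1 - beta) rises, at the cost of more Type I errors.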
Testing of hypotheses
Type I and Type II Errors. Example
Suppose there is a test for a particular disease.
If the disease really exists and is diagnosed early, it can be
successfully treated
If it is not diagnosed and treated, the person will become
severely disabled
If a person is erroneously diagnosed as having the disease
and treated, no physical damage is done.
Testing of hypotheses
Type I and Type II Errors. Example.
Decision | No disease | Disease
Not diagnosed | OK | Type II error: irreparable damage would be done
Diagnosed | Type I error | OK
Testing of hypotheses
Confidence interval and significance test
Null hypothesis is accepted: p-value > 0.05
Null hypothesis is rejected: p-value < 0.05
Rows: one group; two unrelated groups; two related groups; K-unrelated groups; K-related groups
Columns: nonparametric tests (nominal data; ordinal data); parametric tests (ordinal, interval, ratio data)
mean
skewness
kurtosis
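$$\chi^2 = \sum_{i} \frac{(f_{oi} - f_{ei})^2}{f_{ei}}$$
where f_oi is the observed frequency and f_ei the expected frequency in category i
Example: f_e2 = 10, f_e3 = 5, f_e4 = 5; f_o1 = 95, f_o2 = 2, f_o3 = 2, f_o4 = 1
χ² = 14.2, df = (4 - 1) = 3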
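The goodness-of-fit example can be checked with a few lines of Python. The observed frequencies and the expected frequencies for categories 2 to 4 are taken from the slide; the expected frequency for category 1 is missing there, and the value 80 used below is an assumption (it makes the expected counts sum to 100 and reproduces the reported statistic). The 5% critical value for df = 3 is the standard table value.

```python
observed = [95, 2, 2, 1]
expected = [80, 10, 5, 5]   # first value (80) is an assumption; the rest are from the slide

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_5_PERCENT_DF3 = 7.815   # chi-square table value for df = 3, alpha = 0.05

print(f"chi-square = {chi_square:.1f}, df = {df}")
print("reject H0" if chi_square > CRITICAL_5_PERCENT_DF3 else "do not reject H0")
```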
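Expected frequency for the cell in row r and column c, where f_r is the row total, f_c the column total and N the total number of subjects:
$$F_{rc} = \frac{f_r f_c}{N}$$
then
$$\chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - F_{ij})^2}{F_{ij}}$$
df = (fr-1) (fc-1), where fr and fc here denote the number of rows and the number of columns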
Observed frequencies
Cardiac cath | Sex: male | Sex: female | Row total
No | 15 | 16 | 31
Yes | 45 | 24 | 69
Column total | 60 | 40 | 100
Expected frequencies
Cardiac cath | Sex: male | Sex: female | Row total
No | 18.6 | 12.4 | 31
Yes | 41.4 | 27.6 | 69
Column total | 60 | 40 | 100
χ² = 2.52, df = (2 - 1)(2 - 1) = 1
p > 0.05
Null hypothesis is accepted at the 5% level
Conclusion: Recommendation for cardiac
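The cardiac catheterisation example can be reproduced step by step. The sketch below takes the observed 2 x 2 counts from the slide, derives the expected frequencies from the row and column totals, and computes the chi-square statistic without continuity correction. The variable names are illustrative, and the 5% critical value for df = 1 is the standard table value.

```python
# Observed counts: rows = cardiac cath (No, Yes), columns = sex (male, female)
observed = [[15, 16],
            [45, 24]]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell: row total * column total / N
expected = [[r * c / n for c in column_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(row_totals))
    for j in range(len(column_totals))
)
df = (len(row_totals) - 1) * (len(column_totals) - 1)

CRITICAL_5_PERCENT_DF1 = 3.841   # chi-square table value for df = 1, alpha = 0.05

print("expected:", [[round(value, 1) for value in row] for row in expected])
print(f"chi-square = {chi_square:.2f}, df = {df}")
print("reject H0" if chi_square > CRITICAL_5_PERCENT_DF1 else "do not reject H0 (p > 0.05)")
```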
Frequency data
Expected frequencies should not be less than 5
Measures independent of each other
No subject can be counted more than once
Categories should be defined prior to data collection and analysis
on the same subjects (repeated
Groups | Nonparametric tests: nominal data | Nonparametric tests: ordinal data | Parametric tests: ordinal, interval, ratio data
One group | Chi square goodness of fit | |
Two unrelated groups | Chi square | |
Two related groups | McNemar's test | |
K-unrelated groups | Chi square test | |
K-related groups | | |
Groups | Nominal data | Ordinal data | Parametric tests
One group | Chi square goodness of fit | Wilcoxon signed rank test |
Two unrelated groups | Chi square | Wilcoxon rank sum test, Mann-Whitney test |
Two related groups | McNemar's test | Wilcoxon signed rank test |
K-unrelated groups | Chi square test | Kruskal-Wallis one way analysis of variance |
K-related groups | | Friedman matched samples |
Question: Does the mean serum calcium concentration differ statistically significantly before and after operation?
Test of significance: paired t-test
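A paired t-test works on the within-subject differences. The sketch below computes the t statistic and degrees of freedom for before and after measurements on the same patients; the serum calcium values are hypothetical and `paired_t` is an illustrative helper name. The resulting t would then be compared with the t distribution for the given degrees of freedom to obtain a p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for a paired t-test on two equally long measurement lists."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    # t = mean difference divided by the standard error of the differences
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical serum calcium concentrations (mmol/l) for six patients
before = [2.40, 2.35, 2.50, 2.45, 2.38, 2.42]
after = [2.30, 2.33, 2.41, 2.40, 2.31, 2.36]

t, df = paired_t(before, after)
print(f"t = {t:.2f}, df = {df}")
```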
on weight
Study design:
Samples: group of patients treated with drug A (n=35); control group (n=40)
Outcome measure: weight at Time 1 (before using the drug) and Time 2 (after using the drug)
Cannot be used to analyze frequency data
Measures independent of each other
Homogeneity of group variances
Groups | Nonparametric tests: nominal data | Nonparametric tests: ordinal data | Parametric tests: ordinal, interval, ratio data
One group | Chi square goodness of fit | Wilcoxon signed rank test | One group t-test
Two unrelated groups | Chi square | Wilcoxon rank sum test, Mann-Whitney test | Student's t-test
Two related groups | McNemar's test | Wilcoxon signed rank test | Paired Student's t-test
K-unrelated groups | Chi square test | Kruskal-Wallis one way analysis of variance | ANOVA
K-related groups | | Friedman matched samples | ANOVA with repeated measurements
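The test-selection table can also be read as a simple lookup from study design and data type to a test. The sketch below encodes only the entries listed on the slide; the key names and the function `choose_test` are illustrative assumptions, and the combination the slide leaves empty (K-related groups with nominal data) is reported as not listed.

```python
# Entries copied from the test-selection table; keys are (groups, data type)
TEST_TABLE = {
    ("one group", "nominal"): "Chi square goodness of fit",
    ("one group", "ordinal"): "Wilcoxon signed rank test",
    ("one group", "parametric"): "One group t-test",
    ("two unrelated groups", "nominal"): "Chi square",
    ("two unrelated groups", "ordinal"): "Wilcoxon rank sum test / Mann-Whitney test",
    ("two unrelated groups", "parametric"): "Student's t-test",
    ("two related groups", "nominal"): "McNemar's test",
    ("two related groups", "ordinal"): "Wilcoxon signed rank test",
    ("two related groups", "parametric"): "Paired Student's t-test",
    ("k unrelated groups", "nominal"): "Chi square test",
    ("k unrelated groups", "ordinal"): "Kruskal-Wallis one way analysis of variance",
    ("k unrelated groups", "parametric"): "ANOVA",
    ("k related groups", "ordinal"): "Friedman matched samples",
    ("k related groups", "parametric"): "ANOVA with repeated measurements",
}


def choose_test(groups, data_type):
    """Look up the test for a design; returns a notice if the slide lists none."""
    return TEST_TABLE.get((groups.lower(), data_type.lower()), "not listed in the table")


print(choose_test("Two related groups", "parametric"))
print(choose_test("K unrelated groups", "ordinal"))
```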
Description of the sample
The sample consisted of 1028 teachers from comprehensive school and upper secondary school. Of the teachers, n=775 (75%) were women and n=125 (25%) were men. The teachers were distributed across the school levels as follows: n=330 (%) taught at the primary level (lågstadiet); n=303 (%) at the lower secondary level (högstadiet) and n=288 (%) in upper secondary school (gymnasiet). A small group of teachers, n=81 (%), taught at both the primary and lower secondary levels, or at both the lower secondary level and upper secondary school, or at all levels. In the analyses this group was called the combined group.
The factor analysis
The following should be described:
the original instrument (e.g. K&T) with the 17 variables and the theoretical grouping of the variables
Kaiser's criterion and Cattell's scree test for the potential number of factors to be found
The communalities of the variables
The method of factor analysis
The rotation method
The variance explained by the factors, expressed in %
The criterion for a loading to be considered significant
The final rotated factor matrix
Sum variables and their reliability, i.e. Cronbach's alpha
Data analysis methods
The data were analyzed quantitatively. Frequencies, percentages, the mean, the median, the standard deviation, and minimum and maximum values were used to describe the variables. All variables were tested for the shape of their distribution with the Kolmogorov-Smirnov test. Hypothesis testing of differences between groups with respect to the background variables was carried out with the Mann-Whitney test and, when the number of groups was > 2, with the Kruskal-Wallis test. The association between variables was tested with Pearson's correlation coefficient. The measurement instrument was validated with factor analysis, as described in detail in section xx. The reliability of the sum variables was tested with Cronbach's alpha. Statistical significance was accepted if p<0.05, and the data were analyzed with SPSS 11.5.
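As an illustration of the reliability measure mentioned above, the sketch below computes Cronbach's alpha from item scores using the usual formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). It is a minimal sketch in Python rather than SPSS; the item scores are hypothetical and `cronbach_alpha` is an illustrative helper name.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item, same respondents in each."""
    k = len(items)
    # Total (sum variable) score of each respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents to three items of one sum variable
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```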