
Basic Statistics

Fundamentals of Hypothesis Testing:


One-Sample, Two-Sample Tests

PREPARED BY:
ASHOK THAKKAR
Mail id: ashokmba2000@yahoo.co.in

ASHOK THAKKAR© 2002 Prentice-Hall, Inc. Chap 9-1


What is biostatistics?

• Statistics is the science and art of collecting, summarizing, and analyzing data that are subject to random variation.

• Biostatistics is the application of statistics and mathematical methods to the design and analysis of health, biomedical, and biological studies.


Different Tests of Significance
1. One-sample z-test or t-test
   a. Compares one sample mean versus a population mean
2. Two-sample t-test
   a. Compares one sample mean versus another sample mean
      i. Independent t-tests (independent samples)
      ii. Dependent t-tests (dependent/paired samples)
3. One-way analysis of variance (ANOVA)
   a. Compares several sample means


How to properly use Biostatistics

• Develop an underlying question of interest
• Generate a hypothesis
• Design a study (protocol)
• Collect data
• Analyze data
  - Descriptive statistics
  - Statistical inference


Relationship between population and sample

(Simple random sampling)



Sampling Techniques

A sample can be drawn from the population by one of five techniques:

Technique                   Resulting sample
Simple random sample        Bias-free
Stratified random sample    Bias-free
Systematic sampling         Biased
Cluster sampling            Bias-free
Convenience sampling        Biased
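The first three techniques can be sketched in a few lines of Python. This is only an illustration: the patient IDs, the two strata, and the helper names (`simple_random_sample`, `systematic_sample`, `stratified_sample`) are assumptions made for the example and do not come from the slides.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of selection."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, n, seed=0):
    """Every k-th unit after a random start, with k = len(population) // n."""
    k = len(population) // n
    start = random.Random(seed).randrange(k)
    return population[start::k][:n]

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

# Hypothetical population: patient IDs 1..100, split into two strata
patients = list(range(1, 101))
strata = {"male": patients[:60], "female": patients[60:]}

srs = simple_random_sample(patients, 10)
sys_sample = systematic_sample(patients, 10)
strat = stratified_sample(strata, 0.10)
print(srs, sys_sample, strat, sep="\n")
```

The stratified draw keeps the 60/40 split of the hypothetical population in the sample, which a simple random sample of 10 does not guarantee.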


Example

• How are my 10 patients doing after I put them on antihypertensive medication?
  - Describe the results of your 10 patients


Example
• What is the in-hospital mortality rate after open heart surgery at SAL hospital so far this year?
  - Describe the mortality

• What is the in-hospital mortality after open heart surgery likely to be this year, given results from last year?
  - Estimate the probability of death for patients like those seen in the previous year.


Misuse of statistics

• About 25% of biological research is flawed because of incorrect conclusions drawn from confounded experimental designs and misuse of statistical methods.


What is a Hypothesis?
• A hypothesis is a claim (assumption) about the population parameter.
  - Example: "I claim that the mean number of CVD in India is at least 3!"
• Hypothesis testing evaluates whether the difference between the value of a sample statistic and the corresponding hypothesized parameter value is too large to be explained by sampling variation alone.
(Clip art © 1984-1994 T/Maker Co.)


Hypothesis Testing Process
1. Identify the population. Assume the population mean age is 50 ($H_0: \mu = 50$).
2. Take a sample. The sample mean is $\bar{X} = 20$.
3. Is $\bar{X} = 20$ likely if $\mu = 50$?
4. No, not likely! Reject the null hypothesis.

Reason for Rejecting H0
[Figure: sampling distribution of $\bar{X}$, centered at $\mu = 50$ if $H_0$ is true, with the observed sample mean of 20 far out in the left tail]
It is unlikely that we would get a sample mean of this value (20) if in fact $\mu = 50$ were the population mean. Therefore, we reject the null hypothesis that $\mu = 50$.

Components of Biostatistics

Biostatistics
• Descriptive
• Statistical inference
  - Estimation (confidence intervals)
  - Hypothesis testing (p-values)


Normal Distribution

A variable is said to be normally distributed, or to have a normal distribution, if its distribution has the shape of a normal curve.

Normal distribution
• Bell-shaped
• Symmetrical about the mean (no skewness)
• Total area under the curve = 1
• Approximately 68% of the distribution is within one standard deviation of the mean
• Approximately 95% of the distribution is within two standard deviations of the mean
• Approximately 99.7% of the distribution is within three standard deviations of the mean
• Mean = Median = Mode


Empirical Rule
• About 68% of the area lies within 1 standard deviation of the mean
• About 95% of the area lies within 2 standard deviations of the mean
• About 99.7% of the area lies within 3 standard deviations of the mean
[Figure: normal curve with the horizontal axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$]

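The three percentages of the empirical rule can be checked numerically. The sketch below uses the identity $P(|Z| < k) = \mathrm{erf}(k/\sqrt{2})$ for a standard normal variable; the helper name `area_within` is chosen here for illustration.

```python
import math

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The exact areas are slightly different from the rounded figures on the slide, which is why the rule is stated as "about" 68%, 95%, and 99.7%.
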
Level of Significance, $\alpha$

• Is designated by $\alpha$ (level of significance)
• Typical values are .01, .05, .10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test


The z-Test for Comparing Population Means
Critical values for standard normal distribution

Test        Alpha   Critical value of z
2-tailed    .05     ±1.96
2-tailed    .01     ±2.58
1-tailed    .05     1.645 (sign follows the direction of the test)
1-tailed    .01     2.33 (sign follows the direction of the test)
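The tabulated critical values can be reproduced with the inverse cumulative distribution function of the standard normal distribution. A minimal sketch using Python's standard library follows; the function name `z_critical` is an assumption made for this example.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed):
    """Positive critical value of z; a two-tailed test puts alpha/2 in each tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for two_tailed in (True, False):
    for alpha in (0.05, 0.01):
        label = "2-tailed" if two_tailed else "1-tailed"
        print(label, alpha, round(z_critical(alpha, two_tailed), 3))
```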


Level of Significance and the Rejection Region
Claim: "I claim that the mean number of CVD in India is at least 3!"

Hypotheses                              Rejection region (bounded by the critical value(s))
$H_0: \mu \ge 3$, $H_1: \mu < 3$        Left tail, area $\alpha$
$H_0: \mu \le 3$, $H_1: \mu > 3$        Right tail, area $\alpha$
$H_0: \mu = 3$, $H_1: \mu \ne 3$        Both tails, area $\alpha/2$ each


Hypothesis Testing

1. State the research question.
2. State the statistical hypothesis.
3. Set the decision rule.
4. Calculate the test statistic.
5. Decide if the result is significant.
6. Interpret the result as it relates to your research question.


Rejection and Nonrejection Regions

                    Two-tailed test   Left-tailed test   Right-tailed test
Sign in $H_a$       $\ne$             $<$                $>$
Rejection region    Both sides        Left side          Right side

The Null Hypothesis, H0

• States the assumption (numerical) to be tested
  - e.g.: The average number of CVD in India is at least three ($H_0: \mu \ge 3$)
• Is always about a population parameter ($H_0: \mu \ge 3$), not about a sample statistic ($H_0: \bar{X} \ge 3$)


The Null Hypothesis, H0
(continued)

• Begins with the assumption that the null hypothesis is true
  - Similar to the notion of innocent until proven guilty


The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
  - e.g.: The average number of CVD in India is less than 3 ($H_1: \mu < 3$)
• Never contains the "=" sign
• May or may not be accepted


General Steps in Hypothesis Testing
e.g.: Test the assumption that the true mean number of CVD in India is at least three ($\sigma$ known)
1. State the $H_0$: $H_0: \mu \ge 3$
2. State the $H_1$: $H_1: \mu < 3$
3. Choose $\alpha$: $\alpha = .05$
4. Choose $n$: $n = 100$
5. Choose test: Z test

General Steps in Hypothesis Testing (continued)
6. Set up critical value(s): reject $H_0$ if $Z < -1.645$
7. Collect data: 100 persons surveyed
8. Compute test statistic and p-value: computed test statistic = -2, p-value = .0228
9. Make statistical decision: reject the null hypothesis
10. Express conclusion: the true mean number of CVD is less than 3 in the human population

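Steps 6 to 9 can be retraced in code. The sketch below takes the test statistic of -2 directly from the slide (the underlying survey data are not given) and computes the left-tail critical value and p-value from the standard normal distribution.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = -2.0                            # computed test statistic, as given on the slide

z_crit = NormalDist().inv_cdf(alpha)     # left-tail critical value, about -1.645
p_value = NormalDist().cdf(z_stat)       # P(Z <= -2) for a left-tailed test
reject_h0 = p_value < alpha              # same decision as z_stat < z_crit

print(round(z_crit, 3), round(p_value, 4), reject_h0)
```
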


p-Value Approach to Testing
• Convert sample statistic (e.g. $\bar{X}$) to test statistic (e.g. Z, t or F statistic)
• Obtain the p-value from a table or computer
• Compare the p-value with $\alpha$
  - If p-value $\ge \alpha$, do not reject $H_0$
  - If p-value $< \alpha$, reject $H_0$


Comparison of Critical-Value and P-Value Approaches

Critical-Value Approach
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, $\alpha$.
Step 3: Compute the value of the test statistic.
Step 4: Determine the critical value(s).
Step 5: If the value of the test statistic falls in the rejection region, reject $H_0$; otherwise, do not reject $H_0$.
Step 6: Interpret the result of the hypothesis test.

P-Value Approach
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, $\alpha$.
Step 3: Compute the value of the test statistic.
Step 4: Determine the P-value.
Step 5: If $P < \alpha$, reject $H_0$; otherwise, do not reject $H_0$.
Step 6: Interpret the result of the hypothesis test.

Result Probabilities
H0: Innocent

Jury Trial
Verdict     Truth: Innocent    Truth: Guilty
Innocent    Correct            Error
Guilty      Error              Correct

Hypothesis Test
Decision            Truth: H0 True             Truth: H0 False
Do not reject H0    $1 - \alpha$               Type II error ($\beta$)
Reject H0           Type I error ($\alpha$)    Power ($1 - \beta$)


Type I and II Errors Have an Inverse Relationship
If you reduce the probability of one error, the probability of the other one increases, provided everything else (such as the sample size) is unchanged.
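The inverse relationship can be made concrete for a right-tailed z test. The sketch below reuses the cereal-box numbers from the later example ($\mu_0 = 368$, $\sigma = 15$, $n = 25$) and assumes, purely for illustration, that the true mean is 372.5; the function name `type_ii_error` is also an assumption.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """beta for a right-tailed z test of H0: mu <= mu0 when the true mean is mu_true."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                      # reject H0 when Z > z_crit
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))   # true mean in standard-error units
    return z.cdf(z_crit - shift)                       # P(do not reject H0 | mu_true)

# Hypothetical true mean of 372.5; all other inputs unchanged between the two calls
beta_05 = type_ii_error(0.05, 368, 372.5, 15, 25)
beta_01 = type_ii_error(0.01, 368, 372.5, 15, 25)
print(round(beta_05, 3), round(beta_01, 3))
```

Lowering $\alpha$ from .05 to .01 moves the critical value to the right, so the probability of failing to reject a false $H_0$ goes up.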


Critical Values Approach to Testing

• Convert sample statistic (e.g.: $\bar{X}$) to test statistic (e.g.: Z, t or F statistic)
• Obtain critical value(s) for a specified $\alpha$ from a table or computer
  - If the test statistic falls in the critical region, reject $H_0$
  - Otherwise do not reject $H_0$


One-tail Z Test for Mean ($\sigma$ Known)

• Assumptions
  - Population is normally distributed
  - If not normal, requires large samples
  - Null hypothesis has $\le$ or $\ge$ sign only
• Z test statistic

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$


Rejection Region
Left-tailed test: $H_0: \mu \ge \mu_0$, $H_1: \mu < \mu_0$
• Reject $H_0$ in the left tail (area $\alpha$)
• Z must be significantly below 0 to reject $H_0$
Right-tailed test: $H_0: \mu \le \mu_0$, $H_1: \mu > \mu_0$
• Reject $H_0$ in the right tail (area $\alpha$)
• Small values of Z don't contradict $H_0$. Don't reject $H_0$!

Example: One Tail Test
Q. Does an average box of cereal contain more than 368 grams of cereal? A random sample of 25 boxes showed $\bar{X} = 372.5$. The company has specified $\sigma$ to be 15 grams. Test at the $\alpha = 0.05$ level.

$H_0: \mu \le 368$
$H_1: \mu > 368$


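Finding Critical Value: One Tail
What is Z given $\alpha = 0.05$?
For the standard normal distribution ($\sigma_Z = 1$), a right-tail area of $\alpha = .05$ leaves a cumulative area of .95 to the left of the critical value.

Standardized Cumulative Normal Distribution Table (Portion)

Z      .04      .05      .06
1.6    .9495    .9505    .9515
1.7    .9591    .9599    .9608
1.8    .9671    .9678    .9686
1.9    .9738    .9744    .9750

The cumulative area .95 lies midway between .9495 (Z = 1.64) and .9505 (Z = 1.65), so the critical value = 1.645.
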
Example Solution: One Tail Test
$H_0: \mu \le 368$
$H_1: \mu > 368$
$\alpha = 0.05$
$n = 25$
Critical Value: 1.645 (reject $H_0$ if $Z > 1.645$; right-tail area .05)

Test Statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$$

Decision: Do not reject at $\alpha = .05$ (1.50 < 1.645)
Conclusion: No evidence that the true mean is more than 368

p-Value Solution
The p-value is $P(Z \ge 1.50) = 0.0668$.
• Use the alternative hypothesis to find the direction of the rejection region.
• From the Z table: look up 1.50 (the Z value of the sample statistic) to obtain .9332.
• p-value = 1.0000 - .9332 = .0668

p-Value Solution (continued)

(p-value = 0.0668) $\ge$ ($\alpha = 0.05$): Do not reject.

[Figure: standard normal curve with the rejection region ($\alpha = 0.05$) to the right of 1.645 and the p-value area (0.0668) to the right of 1.50]

Test statistic 1.50 is in the do-not-reject region.

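The one-tail cereal example can be reproduced end to end. The sketch below uses only the numbers stated in the example and the standard normal distribution from Python's standard library; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 372.5, 368, 15, 25, 0.05

z = (x_bar - mu0) / (sigma / math.sqrt(n))    # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)      # right-tail critical value
p_value = 1 - NormalDist().cdf(z)             # P(Z >= z) under H0

print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Both approaches lead to the same decision: Z is below the critical value, and the p-value is above $\alpha$.
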
Example: Two-Tail Test
Q. Does an average box of cereal contain 368 grams of cereal? A random sample of 25 boxes showed $\bar{X} = 372.5$. The company has specified $\sigma$ to be 15 grams. Test at the $\alpha = 0.05$ level.

$H_0: \mu = 368$
$H_1: \mu \ne 368$


Example Solution: Two-Tail Test
$H_0: \mu = 368$
$H_1: \mu \ne 368$
$\alpha = 0.05$
$n = 25$
Critical Value: ±1.96 (reject $H_0$ if $Z < -1.96$ or $Z > 1.96$; area .025 in each tail)

Test Statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$$

Decision: Do not reject at $\alpha = .05$ (-1.96 < 1.50 < 1.96)
Conclusion: No evidence that the true mean is not 368

p-Value Solution
(p-value = 0.1336) $\ge$ ($\alpha = 0.05$): Do not reject.
p-value = 2 x 0.0668 = 0.1336

[Figure: standard normal curve with rejection regions beyond ±1.96 (total area $\alpha = 0.05$) and the test statistic at 1.50]

Test statistic 1.50 is in the do-not-reject region.

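For the two-tail version, the only change is that both tails count toward the p-value. A short sketch with the same cereal numbers (variable names are illustrative):

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 372.5, 368, 15, 25, 0.05

z = (x_bar - mu0) / (sigma / math.sqrt(n))
one_tail = 1 - NormalDist().cdf(abs(z))        # area beyond |z| in one tail
p_two_tail = 2 * one_tail                      # both tails count for H1: mu != 368
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail

print(f"p-value = {p_two_tail:.4f}, critical values = ±{z_crit:.2f}")
print("Reject H0" if p_two_tail < alpha else "Do not reject H0")
```
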
Connection to Confidence Intervals
For $\bar{X} = 372.5$, $\sigma = 15$ and $n = 25$, the 95% confidence interval is:
$$372.5 - 1.96 \cdot \frac{15}{\sqrt{25}} \le \mu \le 372.5 + 1.96 \cdot \frac{15}{\sqrt{25}}$$
or
$$366.62 \le \mu \le 378.38$$
If this interval contains the hypothesized mean (368), we do not reject the null hypothesis.
It does. Do not reject.

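The interval and the resulting decision can be checked with a few lines of code. This sketch again uses only the numbers from the cereal example; `margin`, `lower`, and `upper` are names chosen for illustration.

```python
import math
from statistics import NormalDist

x_bar, sigma, n, mu0 = 372.5, 15, 25, 368
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"{lower:.2f} <= mu <= {upper:.2f}")
print("Do not reject H0" if lower <= mu0 <= upper else "Reject H0")
```
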
What is a t Test?

• Commonly used definition: comparing two means to see if they are significantly different from each other
• Technical definition: any statistical test that uses the t family of distributions


Independent Samples t Test
• Use this test when you want to compare the means of two independent samples on a given variable
  - “Independent” means that the members of one sample do not include, and are not matched with, members of the other sample
• Example:
  - Compare the average height of 50 randomly selected men to that of 50 randomly selected women
[Figure: independent mean #1 and independent mean #2, compared using a t test]

Dependent Samples t Test
• Used to compare the means of a single sample or of two matched or paired samples
• Example:
  - If a group of students took a math test in March and that same group of students took the same math test two months later in May, we could compare their average scores on the two test dates using a dependent samples t test

Comparing the Two t Tests

Independent Samples
• Tests the equality of the means from two independent groups
• Relies on the t distribution to produce the probabilities used to test statistical significance
• Diagram: Person #1 in the treatment group, Person #2 in the control group

Dependent Samples
• Tests the equality of the means between related groups or of two variables within the same group
• Relies on the t distribution to produce the probabilities used to test statistical significance
• Diagram: Person #1 before treatment, the same Person #1 after treatment
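The practical difference between the two tests shows up when the same numbers are analyzed both ways. In the sketch below the blood pressure readings are made up for illustration, the independent version uses the pooled-variance form (which assumes equal population variances), and the function names are assumptions.

```python
import math
from statistics import mean, stdev, variance

def paired_t(before, after):
    """Dependent samples: one-sample t statistic on the within-person differences."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def independent_t(x, y):
    """Independent samples: pooled-variance t statistic (assumes equal variances)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))

# Hypothetical systolic blood pressure for 5 patients before and after treatment
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 135, 130]

t_paired = paired_t(before, after)
t_indep = independent_t(before, after)
print(round(t_paired, 3), round(t_indep, 3))
```

With these made-up readings the paired statistic is far larger, because pairing removes the person-to-person variability that dominates the independent comparison. Treating paired data as independent would be the wrong analysis; it is done here only to show the contrast.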


Types

• One sample: compare with population
• Unpaired: compare with control
• Paired: same subjects, pre-post
• Z-test: large samples (n > 30)

Compare Means (or Medians)
• Example: compare blood pressures of two or more groups, or compare the BP of one group with a theoretical value.
• 1 group:
  1. One-sample t test
  2. Wilcoxon signed-rank test
• 2 groups:
  1. Unpaired t test
  2. Paired t test
  3. Mann-Whitney test
  4. Welch's corrected t test
  5. Wilcoxon matched pairs test
• 3-26 groups:
  1. One-way ANOVA
  2. Repeated measures ANOVA
  3. Kruskal-Wallis test
  4. Friedman test
  (All with post tests)
• Data formats: raw data; average data (mean, SD, and N); average data (mean, SEM, and N)


Is there a difference?

between you…means,
who is meaner?
Statistical Analysis
[Figure: control group mean versus treatment group mean]
Is there a difference?

Slide downloaded from the Internet
What does difference mean?
[Figure: three pairs of distributions with medium, high, and low variability]
The mean difference is the same for all three cases.

What does difference mean?
[Figure: the same three pairs of distributions with medium, high, and low variability]
Which one shows the greatest difference?

t Test: $\sigma$ Unknown
• Assumption
  - Population is normally distributed
  - If not normal, requires a large sample
• t test statistic with n-1 degrees of freedom

$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$


Example: One-Tail t Test

Does an average box of cereal contain more than 368 grams of cereal? A random sample of 36 boxes showed $\bar{X} = 372.5$ and $S = 15$. Test at the $\alpha = 0.01$ level.

$\sigma$ is not given.

$H_0: \mu \le 368$
$H_1: \mu > 368$

Example Solution: One-Tail
$H_0: \mu \le 368$
$H_1: \mu > 368$
$\alpha = 0.01$
$n = 36$, df = 35
Critical Value: 2.4377 (reject $H_0$ if $t > 2.4377$; right-tail area .01 of the $t_{35}$ distribution)

Test Statistic:
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{36}} = 1.80$$

Decision: Do not reject at $\alpha = .01$ (1.80 < 2.4377)
Conclusion: No evidence that the true mean is more than 368

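The t statistic in this example can be recomputed directly. Python's standard library has no t distribution, so the sketch below simply takes the critical value 2.4377 from the slide (35 degrees of freedom, one-tail $\alpha = .01$) rather than computing it.

```python
import math

x_bar, mu0, s, n = 372.5, 368, 15, 36

t_stat = (x_bar - mu0) / (s / math.sqrt(n))   # sample standard deviation replaces sigma
df = n - 1
t_crit = 2.4377                                # from the slide: t table, df = 35, alpha = .01

print(f"t = {t_stat:.2f} with {df} df, critical value = {t_crit}")
print("Reject H0" if t_stat > t_crit else "Do not reject H0")
```
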
The t Table
• Since it takes into account the changing shape of the distribution as n increases, there is a separate curve for each sample size (or degrees of freedom).
• However, there is not enough space in the table to put all of the different probabilities corresponding to each possible t score.
• The t table lists commonly used critical regions (at popular alpha levels).


Z-distribution versus t-distribution





Summary
• We can use the z distribution for testing hypotheses involving one or two independent samples
• To use z, the samples must be independent and normally distributed
• The sample size must be greater than 30
• Population parameters must be known

