STATISTICS PAPER
Received 25 August 2010; received in revised form 21 December 2010; accepted 11 January 2011
KEYWORDS
Statistics; Chi-square test; Test for goodness of fit; Test for independence; Fisher's exact test; McNemar's test; Confidence intervals

Summary This paper is the sixth in a series of statistics articles recently published by Australian Critical Care. In this paper we explore the most commonly used statistical tests to compare groups of data at the nominal level of measurement. The chosen statistical tests are the chi-square test, chi-square test for goodness of fit, chi-square test for independence, Fisher's exact test, McNemar's test and the use of confidence intervals for proportions. Examples of how to use and interpret the tests are provided.
Crown Copyright 2011 Published by Elsevier Australia (a division of Reed International Books Australia Pty Ltd) on behalf of Australian College of Critical Care Nurses Ltd. All rights reserved.
1036-7314/$ see front matter
doi:10.1016/j.aucc.2011.01.005
134 M.J. Fisher et al.
before implementation of a bowel management protocol with those after the protocol was implemented. The tests of significance for nominal data vary depending on the nature of the chosen measurements for the variables. Table 2 presents the most commonly used tests for comparing two groups using nominal measurement level.

Table 2 The tests of significance for nominal data.
Sample types                       Test of significance
One-sample case                    Chi-square goodness of fit
Two or more independent samples    Chi-square test for independence
Two dependent (paired) samples     McNemar's test for binomial distributions
Small samples                      Fisher's exact test

Chi-square test

The chi-square test compares the observed frequency distribution ($f_o$) for each category of the scale with the expected frequency distribution ($f_e$) of the null hypothesis. When using a chi-square test it is assumed that there has been random sampling; that 80% of the cells have an expected frequency of greater than five; that no cell has an observed frequency of 0; and, that a large sample is used, as small sample sizes lead to a small expected frequency which causes large chi-square values.[3] A limitation of the chi-square test is that it is sensitive to either very small or large samples. Quantifying the minimum sample is difficult as it is dependent on the number of cells in the crosstab. A sample is considered too small when the above assumptions are not met. When these assumptions are not met the chi-square cannot be meaningfully interpreted.[4] The chance of finding a significant difference between samples is greater with larger samples. If you double the sample size, the chi-square statistic will double due to the large sample size rather than a strong pattern of dependence between the variables.[5]

When these assumptions are violated the results may lead to erroneous interpretation of the data; that is, the results may be considered statistically significant when in reality they may not be statistically significant. When samples are small and the assumptions for the chi-square are violated, Fisher's exact test could be used.[6]

The formula for calculating the chi-square statistic is

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where $\chi^2$ is equal to the sum of the squared difference between the observed and the expected frequencies divided by the expected frequency for each cell.

The concept of degrees of freedom (df) is important and is a mathematical limitation that needs to be factored in when calculating an estimate of one statistic from an estimate of another. The df are used in conjunction with the table of critical values for chi-square. The df for a chi-square test is calculated with the following equation:

$$df = (R - 1)(C - 1)$$

where R equals the number of rows and C equals the number of columns.[3]

Chi-square test for goodness of fit

The chi-square test for goodness of fit is used for a single population and is a test used when you have one categorical variable. This test determines how well the frequency distribution from that sample fits the model distribution. Consider the data provided in the contingency table (Table 3) which reports the frequency of patients who developed diarrhoea for three different wards within a hospital.

Table 3 Frequency of diarrhoea in patients admitted to three wards.
Ward A    Ward B    Ward C    Total
30        25        40        95
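As an illustration, the goodness-of-fit calculation for the Table 3 counts can be sketched in Python. The sketch makes assumptions that are not stated in the text above: the model distribution is taken to be an equal number of cases in each ward, the df is taken as the number of categories minus one (the usual convention for a one-variable test), and 5.99 is the tabulated critical value for df = 2 at the 0.05 level of significance. The function and variable names are illustrative only.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Observed counts from Table 3 (Ward A, Ward B, Ward C).
observed = [30, 25, 40]

# ASSUMPTION: the null model is an equal share of cases in every ward.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)

# ASSUMPTION: goodness of fit with k categories has k - 1 degrees of freedom.
df = len(observed) - 1

# Tabulated critical value of chi-square for df = 2 at the 0.05 level.
CRITICAL_VALUE_DF2 = 5.99

reject_null = chi2 > CRITICAL_VALUE_DF2
print(f"chi2 = {chi2:.2f}, df = {df}, reject null: {reject_null}")
```

Under these assumptions the statistic is about 3.68, which is below 5.99, so the null hypothesis of an even spread of cases across the three wards would not be rejected.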
Testing differences in proportions 135
Table 4 Calculating the expected number of patients with diarrhoea in the pre-intervention sample for the null hypothesis.
                              Pre-intervention    Post-intervention    Total
Patients with diarrhoea       ? (fe)              63                   201 (fr)
Patients without diarrhoea    241                 214                  455

Table 5 Expected frequency distribution for patients with and without diarrhoea at pre-intervention and post-intervention time periods.
                              Pre-intervention    Post-intervention    Total
Patients with diarrhoea       116.13              84.87                201
Patients without diarrhoea    262.87              192.13               455
Total                         379                 277                  656
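The expected frequencies in Table 5 follow from the row and column totals, and the chi-square statistic follows from them. A minimal Python sketch is given here as a check on the arithmetic. It assumes the observed pre-intervention count of patients with diarrhoea is 138, a figure derived from the Table 4 row total (201 - 63) rather than printed in the table; variable names are illustrative. Because the sketch keeps unrounded expected frequencies, its statistic comes out near 14.07, marginally different from the 14.06 obtained when the rounded values are used in a hand calculation.

```python
# Observed counts. 63, 241 and 214 are from Table 4; 138 is derived
# from the Table 4 row total (201 - 63) and is an assumption of this sketch.
observed = [
    [138, 63],   # patients with diarrhoea: pre, post
    [241, 214],  # patients without diarrhoea: pre, post
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell under the null: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (f_o - f_e)^2 / f_e over every cell.
chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Tabulated critical value of chi-square for df = 1 at the 0.05 level.
CRITICAL_VALUE_DF1 = 3.84

for row in expected:
    print([round(fe, 2) for fe in row])
print(f"chi2 = {chi2:.2f}, df = {df}, exceeds critical value: {chi2 > CRITICAL_VALUE_DF1}")
```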
null hypothesis. In this case the $f_e$ would be:

$$f_e = \frac{379(201)}{656} = 116.13$$

The expected frequency distribution for the null hypothesis in this example would be calculated as depicted in Table 5.

At a glance it would appear that in this example there is a difference between the observed frequency ($f_o$) and the expected frequency ($f_e$). Table 6 presents the difference between the observed and expected frequency for each cell.

To calculate whether there is a statistical difference the chi-square formula is used.

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where $\chi^2$ is equal to the sum of the squared difference between the observed and the expected frequency divided by the expected frequency for each cell. In this case the chi-square statistic is equal to:

$$\chi^2 = \frac{(21.87)^2}{116.13} + \frac{(-21.87)^2}{84.87} + \frac{(-21.87)^2}{262.87} + \frac{(21.87)^2}{192.13} = 4.12 + 5.63 + 1.82 + 2.49 = 14.06$$

The critical value for $\chi^2$ needs to be determined; first calculate the df (see Box 1) and determine the level of significance. Referring to a table listing the critical values of chi-square and using the calculated df (1) and level of significance of 0.05, the critical value for $\chi^2$ is 3.84. The chi-square value of 14.06 calculated above exceeds the critical value, therefore the null hypothesis is rejected and we conclude that there is a statistical difference between the number of patients with diarrhoea in the pre-intervention sample as compared to those in the post-intervention sample.

The Statistical Package for the Social Sciences (SPSS version 18) was used to examine this sample of patients with or without diarrhoea. The reported SPSS output confirms that there was a statistical difference in the incidence of diarrhoea between the pre-intervention and post-intervention samples, $\chi^2$ (1, n = 656) = 14.06, p < 0.0001. In the original study Ferrie and East[2] identified a statistical difference in the incidence of diarrhoea between the two samples (p < 0.0001); however, this claim could have been strengthened by reporting the $\chi^2$ statistic.

Fisher's exact test

Fisher's exact test is used in cases where there are cells with an expected frequency ($f_e$) less than 5 and/or with small sample sizes, as Fisher's exact test has no sample size restriction.[6] The method of calculation of Fisher's exact test is different from that of the chi-square statistic: it determines the probability of getting the observed frequency distribution by establishing and comparing it to all other possible distributions where the column and row totals remain the same as the observed distribution. In this case the null hypothesis indicates that all the cells would be close to equal. The calculation of Fisher's exact test is complex and is not available in all statistical packages but can be performed using the Statistical Package for Social Sciences.
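To make the idea behind Fisher's exact test concrete, the following Python sketch enumerates every 2 x 2 table that shares the observed row and column totals and adds up the probabilities of those that are no more likely than the observed table. This is one common two-sided convention and is not necessarily the one a given statistical package applies. The counts in `table` are hypothetical and are not taken from the study discussed in this paper; the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Sum the probabilities of all tables that are no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# HYPOTHETICAL small-sample counts, chosen only to illustrate the method.
table = [[3, 1], [1, 3]]
p_value = fisher_exact_two_sided(table)
print(f"two-sided p = {p_value:.4f}")
```

With these hypothetical counts only five tables share the margins, and the two-sided p value is 34/70, or about 0.49, so the null hypothesis would not be rejected at the 0.05 level.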
Table 6 Difference between frequency observed and expected frequency.
                              Pre-intervention    Post-intervention
Patients with diarrhoea       21.87               -21.87
Patients without diarrhoea    -21.87              21.87

McNemar's test

The McNemar test compares dependent (paired or matched) samples in terms of a dichotomous variable.[4] It is the best test for comparing dichotomous variables with two dependent sample studies as opposed to the chi-square test which examines nominal level variables with two samples that are independent of each other.[4] A dichotomous variable has only two possible outcomes, for example
Table 7 Contingency table of diarrhoea at two time periods (with cells named).
                   Time 1
                   No             Yes            Total
Time 2    Yes      Cell A = 40    Cell B = 67    107
          No       Cell C = 60    Cell D = 33    93
          Total    100            100            200
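A McNemar statistic for the Table 7 counts can be sketched in Python as follows. The sketch uses the standard textbook form of the test, which compares only the two cells in which the outcome changed between time periods, and it shows the statistic both with and without a continuity correction. These formulas are assumptions of the sketch and may differ from the calculation the authors report; the reading of Table 7 (rows as Time 2, columns as Time 1) is likewise an interpretation of the table layout.

```python
# Cells as named in Table 7.
# ASSUMPTION: rows are Time 2 and columns are Time 1, as the table is laid out.
cell_a = 40  # Time 1 No,  Time 2 Yes (outcome changed)
cell_b = 67  # Time 1 Yes, Time 2 Yes (outcome unchanged)
cell_c = 60  # Time 1 No,  Time 2 No  (outcome unchanged)
cell_d = 33  # Time 1 Yes, Time 2 No  (outcome changed)

# Only the discordant pairs (cells A and D) carry information about change.
discordant_total = cell_a + cell_d

# ASSUMPTION: standard McNemar chi-square, without and with continuity correction.
mcnemar_chi2 = (cell_a - cell_d) ** 2 / discordant_total
mcnemar_chi2_corrected = (abs(cell_a - cell_d) - 1) ** 2 / discordant_total

# Tabulated critical value of chi-square for df = 1 at the 0.05 level.
CRITICAL_VALUE_DF1 = 3.84

print(f"uncorrected chi2 = {mcnemar_chi2:.2f}")
print(f"corrected chi2   = {mcnemar_chi2_corrected:.2f}")
print(f"exceeds critical value: {mcnemar_chi2 > CRITICAL_VALUE_DF1}")
```

Under these assumptions both versions of the statistic fall well below 3.84, so this sketch would not indicate a significant change between the two time periods.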
These results indicate that the lower limit of a 95% CI is 6% and the upper limit is 20% with the sample proportion difference at 13%. Note that CIs may not be symmetrical around the sample proportion; it just happens to be so in this instance. With a 0.05 level of significance, there is a significant result with p < 0.0001 (as reported earlier in the paper) and the CI provides additional information as it gives a range of where the population proportion is likely to lie. Patients with the intervention are somewhere between 6% and 20% more likely to experience no diarrhoea than those without the intervention. The clinical significance and research conclusions should be drawn from the individual context for the study.[3]

Conclusion

This paper has provided an introduction to the statistical tests commonly used to test differences in proportions for nominal level data. Chi-square tests are commonly used in health care research and where sample sizes are small, Fisher's exact test may be used. If the data from dependent, paired samples are binomial, then the McNemar test may be more appropriate. As with many other statistical tests, assessment of the critical values, p values and CI may assist the reader in determining the clinical and statistical significance of the results.

References

1. Fisher M, Marshall AP. Understanding descriptive statistics. Aust Crit Care 2009;22:93-7.
2. Ferrie S, East V. Managing diarrhoea in intensive care. Aust Crit Care 2007;20:7-13.
3. Corty EW. Using and interpreting statistics: a practical text for the health, behavioral and social sciences. St. Louis: Mosby Elsevier; 2007.
4. Argyrous G. Statistics for social research. Melbourne: Macmillan Education Australia Pty. Ltd.; 1996.
5. Smithson MJ. Statistics with confidence: an introduction for psychologists. Canberra: Sage; 2000.
6. Altman DG. Practical statistics for medical research. London: Chapman & Hall; 1996.
7. Periera SMC, Leslie G. Hypothesis testing. Aust Crit Care 2009;22:187-91.
8. Polit DF, Beck CT. Nursing research: generating and assessing evidence for nursing practice. 8th ed. St. Louis: Mosby Elsevier; 2008.
9. Carlin JB, Doyle LW. Statistics for Clinicians 6: comparison of means and proportions using confidence intervals. J Paediatr Child Health 2001;37:583-6.