
Psychological Reports, 1967, 21, 70-72.

© Southern Universities Press 1967

SAMPLE SIZE REQUIRED TO OBSERVE AT LEAST k RARE EVENTS¹

JOHN E. OVERALL
The University of Texas Medical Branch

Summary.-Tables based upon the Poisson distribution are presented which specify
the total sample size adequate to ensure (p > .95) that a certain specified number
of rare events will be observed. The tables are useful in planning sampling
surveys where special interest is in obtaining at least a specified number of cases
of special research interest. The tables can also be used as a basis for rejecting the
hypothesis that the occurrence rate for a rare event exceeds some specified value in
the population.

One of the most frequently recurring problems in clinical research concerns
estimation of the total sample size required in order for an investigator to have
reasonable confidence that at least some specified number of rare cases will be
observed. Much clinical research involves interest in a particular sub-population,
such as students with IQ above 140 or patients with a particular rare MMPI
profile. The investigator may have to screen, by examination or other methods,
large numbers of individuals in order to locate a relatively small number of special
interest to him. In considering the possible cost and feasibility of such a
study, it is important to have some initial estimate of the total sample size
required for confidence that a specified number of acceptable cases will be
identified.
In many instances it is possible to obtain at least a crude estimate of the
occurrence rate for cases meeting criteria for inclusion in a study. Where
selection criteria are specified in terms of a test score, normative data for the test
will provide the necessary estimate. Sometimes multiple independent criteria
may be involved in defining the sub-population of interest, such as 16-yr.-old
females with IQ above 140 and weight less than 100 lbs. In the population of
16-yr.-old females, IQ and weight can be assumed statistically independent for
practical purposes. The joint probability of IQ > 140 and Wt < 100 can be
obtained by simply multiplying the separate probabilities obtained from test
norms and growth charts.
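
The multiplication rule just described can be sketched in a few lines of Python. This is only an illustration: the helper name `joint_rate`, the normal model for IQ (mean 100, SD 15), and the 10% figure for weight are assumptions made for the example, not values taken from the article or from any actual norms.

```python
from statistics import NormalDist


def joint_rate(*marginal_rates: float) -> float:
    """Multiply marginal occurrence rates, assuming the criteria are independent."""
    rate = 1.0
    for r in marginal_rates:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each rate must lie in [0, 1]")
        rate *= r
    return rate


# Assumption: IQ is normally distributed with mean 100 and SD 15.
p_iq = 1.0 - NormalDist(mu=100, sigma=15).cdf(140)

# Assumption: a purely hypothetical 10% of the population weighs under 100 lbs.
p_wt = 0.10

p_joint = joint_rate(p_iq, p_wt)
print(f"P(IQ > 140)               = {p_iq:.5f}")
print(f"P(IQ > 140 and Wt < 100)  = {p_joint:.6f}")
```

The product is only as good as the independence assumption; where the criteria are correlated, the joint rate has to be estimated directly.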
A rare event will be defined as one which occurs with frequency not exceeding
50 per thousand. Given that an investigator has an estimate of the expected
occurrence rate of the rare event in a specified population, he wants to
know the total number of cases that he may have to screen in order to be
reasonably sure of obtaining at least k cases meeting his selection criteria. It is
generally accepted that the frequency of occurrence of rare events follows a
Poisson distribution, as a limiting case of the binomial. There is nothing new
theoretically about the results presented in this article. Tables of total sample size
required for reasonable certainty of obtaining at least k rare events have not been
available in the form presented here, and the purpose of this note is to provide
them for the convenience of psychologists planning survey sampling.

¹Calculation of sample size estimates was possible using computer facility supported by
USPH Grant 2 PO7 FR-00024-03. The tables presented in this article were actually
constructed in relation to the problem of estimating sample size required to ensure
identification of a certain number of cases of rare cardio-vascular disease. An investigator
interested in studying the etiology of rare disease must start by following a sample large
enough to provide reasonable confidence that it will eventually contain at least a specified
minimum number of cases of interest. The same general problem arises frequently in
psychological research.

TABLE 1 (fragment: column for p = .050)
60  95  176  155  181  710  217  763  299  314  164  411  462  510  558  652  745  817  928  1019

The cumulative Poisson probability of observing 0, 1, 2, 3, . . . , k − 1 rare
events when sampling at random from a population containing proportion p
of the rare events is given by the formula,

$$P = \sum_{a=0}^{k-1} \frac{(pN)^{a}}{a!}\, e^{-pN}.$$

In calculations upon which the tabled values are based, sample size N was syste-
matically increased until P approached as closely as possible, but did not exceed,
P = .05. The value of N obtained in this way is the total sample size required
for 95% confidence that at least k rare events will be observed.
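
The search procedure described above can be sketched as follows. This is a minimal illustration, not the program used to build the tables: the function names are hypothetical, and the stopping rule (the smallest N for which P does not exceed .05) is one reading of the text, so individual values may differ by a unit from the printed table.

```python
import math


def poisson_cdf(k_max: int, lam: float) -> float:
    """P(X <= k_max) for X ~ Poisson(lam); returns 0.0 when k_max < 0."""
    term = math.exp(-lam)  # probability of exactly 0 events
    total = 0.0
    for a in range(k_max + 1):
        total += term
        term *= lam / (a + 1)  # advance from P(X = a) to P(X = a + 1)
    return total


def required_sample_size(k: int, p: float, alpha: float = 0.05) -> int:
    """Smallest N such that P(fewer than k events) <= alpha under Poisson(p * N)."""
    if k < 1 or not 0.0 < p < 1.0 or not 0.0 < alpha < 1.0:
        raise ValueError("need k >= 1, 0 < p < 1 and 0 < alpha < 1")
    n = 1
    # Increase N one unit at a time, as in the description above.
    while poisson_cdf(k - 1, p * n) > alpha:
        n += 1
    return n


print(required_sample_size(1, 0.05))  # 60
print(required_sample_size(2, 0.05))  # 95
```

A linear search is slow for very small p but mirrors the description directly; a bisection on N would give the same answer faster.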
The left-hand margin of Table 1 indicates the minimum number of rare
events which the user wants to identify by random sampling from a population
containing proportion p designated at the top of the table. The entry in the
body of Table 1 is the total sample size necessary for 95% confidence that at
least the number of rare events specified at the left will be observed in random
sampling from a population containing the proportion shown above.
It will be noted incidentally that the table of Poisson sample size estimates
provides a basis for a direct α = .05 test of the hypothesis that the true population
proportion is equal to or exceeds a p value at the top of the table. If the number
of rare events actually observed is less than the number shown in the left-hand
margin for a given total sample size, then the true population proportion is
unlikely (p < .05) to be as great as the value entered at the top. For example,
the observation of only 4 cases of rare disease X out of a total sample of 2000
is sufficient evidence for an investigator to conclude that the true population
occurrence rate is less than a previously hypothesized 5/1000. On the other hand,
the observation of 7 cases of disease X out of a total sample of 2000 is inadequate
evidence upon which to base a conclusion that the true population occurrence
rate is less than 5/1000. Use of the table in this manner depends upon the
assumption of random sampling from the population in question.
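
The two disease X examples can be checked by computing the lower-tail Poisson probability directly rather than reading it from the table. The sketch below is an equivalent calculation under the same Poisson assumption; the function names are hypothetical and the code is an illustration, not part of the original note.

```python
import math


def poisson_cdf(k_max: int, lam: float) -> float:
    """P(X <= k_max) for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # probability of exactly 0 events
    total = 0.0
    for a in range(k_max + 1):
        total += term
        term *= lam / (a + 1)  # advance from P(X = a) to P(X = a + 1)
    return total


def rate_rejected(observed: int, n: int, p0: float, alpha: float = 0.05) -> bool:
    """True if seeing `observed` or fewer events in n cases is unlikely
    (probability below alpha) when the true rate is p0."""
    return poisson_cdf(observed, p0 * n) < alpha


for observed in (4, 7):
    tail = poisson_cdf(observed, 0.005 * 2000)
    verdict = "reject" if rate_rejected(observed, 2000, 0.005) else "do not reject"
    print(f"{observed} cases in 2000: P = {tail:.4f} -> {verdict} rate >= 5/1000")
```

With an expected count of 10 cases, 4 or fewer observed cases has probability of about .03, while 7 or fewer has probability of about .22, which agrees with the two conclusions drawn in the text.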

Accepted July 14, 1967.
