net/publication/10611966
All content following this page was uploaded by Yi Tsong on 25 November 2016.
¹Office of Biostatistics, Office of Pharmacoepidemiology and Statistical Sciences, and
²Office of New Drug Chemistry, Office of Pharmaceutical Science, Center for Drug
Evaluation and Research, FDA, Rockville, Maryland, USA
ABSTRACT
# The views expressed in this paper are the authors’ professional opinions. They do not represent the
official positions of the U.S. Food and Drug Administration.
*Correspondence: Yi Tsong, HFD-705, Quantitative Methods and Research Staff, Office
of Biostatistics, Center for Drug Evaluation and Research, FDA, 5600 Fishers Lane, Rockville,
MD 20875, USA; E-mail: tsong@cder.fda.gov.
Key Words: Stability pooling test; Linear regression; Shelf life testing; Equivalence test.
I. INTRODUCTION
In general, a single shelf life is used for all batches of the drug product of a given
strength and packaging size. Often the shelf life is determined by the shortest shelf life of
all the batches examined. However, when the manufacturing procedure produces quality
batches with batch-to-batch differences so small that the difference between batches
cannot be shown to be statistically significant, a common shelf life based on pooled data can be
used. For example, with the analysis of covariance (ANCOVA) approach described in the
Food and Drug Administration (FDA) Draft Guidelines (1987) and the International
Conference on Harmonization (ICH) Guidance (2001a,b; 2003), batches may be pooled to
estimate the slope and/or intercept of a common regression line when pooling is supported by the
data. The pooling decision is made by testing the following hypotheses (Chow and Liu,
1995; Lin et al., 1993),
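$$H_0: \beta_j = \beta \ \text{for all } j \quad \text{versus} \quad H_a: \beta_j \neq \beta \ \text{for some } j = 1 \text{ to } J \tag{1}$$
and
$$H_0: \alpha_j = \alpha \ \text{for all } j \quad \text{versus} \quad H_a: \alpha_j \neq \alpha \ \text{for some } j = 1 \text{ to } J \tag{2}$$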
where $\alpha_j$ and $\beta_j$ are, respectively, the intercept and slope of the regression line of the chemical measurement
of batch $j$ of the product. The slopes and intercepts can be pooled if the null
hypotheses (1) and (2), respectively, are not rejected. Since the number of
batches is often small at the premarketing stage, a large significance level of 0.25 is used conventionally
to protect against a large rate of false pooling (Asano, 1960;
Bancroft, 1944; Bancroft, 1964; Chow and Liu, 1995; Draft ICH Consensus Guideline,
2001; FDA, 1987; Guidance for Industry: ICH, 2001; Guidance for Industry: ICH, 2003;
Johnson et al., 1977; Larson and Bancroft, 1963; Lin et al., 1993). This approach is often criticized because
failing to reject H0 provides no evidence to support pooling; it is merely a “failure to show
difference.” On the other hand, using a large significance level of 0.25 leads only to
inflation of the type-I error rate of testing hypotheses (1) and (2) without properly
assessing power. Two recent simulation studies support the need to use a large significance
level in pooling tests in order to protect against the type-I error of falsely determining a
longer shelf life (Chen and Tsong, 2003; Chen et al., 1995).
As early as 1990 and 1991, Ruberg and Hsu (1990) and Ruberg and Stegemen (1991)
proposed testing for equivalence of slopes or intercepts as an alternative batch pooling test.
Their approach is to test the following hypotheses:
$$H_0: |\beta_j - \beta_{j'}| \ge \delta \ \text{for some } j \neq j' \quad \text{versus} \quad H_a: |\beta_j - \beta_{j'}| < \delta \ \text{for all } j \neq j'$$
However, no equivalence limit was proposed for this approach. A simulation study showed
that, with an appropriately chosen equivalence limit $\delta$, the slope equivalence test gives results
comparable to those of the ANCOVA approach cited in the FDA Guidelines. However, there is a problem
in having a fixed predetermined equivalence limit. Lin and Tsong (1991) showed through a
simulation study that the same equivalence limit for the slope has a different impact on
shelf life equivalence depending on the value of the slope.
Yoshioka et al. (1996a,b) revisited the equivalence approach and proposed pooling the
batches if the difference in shelf life between any two batches is within an equivalence
limit that is a prespecified percentage of the longest sample shelf life. The range-based test of
Yoshioka et al. (1996a,b) is proposed to test the following hypotheses:
vs.
where $T_j$ and $T_{j'}$ are the true shelf lives of batches $j$ and $j'$, respectively, and $0 \le \gamma \le 1$ is a
constant. Yoshioka et al. (1996a) compared the range-based equivalence test using $\gamma = 0.15$
with the ANCOVA approach through Monte Carlo simulation and found that the
proposed method is more powerful in rejecting pooling when the batches have different
shelf lives. However, no proper statistical procedure was proposed for its implementation.
In this manuscript, we propose two pooling tests based on equivalence assessment.
The first was inspired by Yoshioka et al. (1996a,b; 1997) but uses an equivalence assessment
setup. The second approach is inspired by the hypothesis testing statement of Chen and
Tsong (2003) and Tsong et al. (2003), in which shelf life determination is made by
comparing the chemical measurement against its acceptance criteria, $S_L$ and $S_U$, at a
prespecified proposed date $T_0$ with the following hypotheses,
versus
where $Y_j(T_0)$ is the true chemical measurement of batch $j$ at $T_0$. For a one-sided
acceptance criterion, use either $S_L$ or $S_U$ in hypothesis (6).
For simplicity but without loss of generality, a fixed set of $(S_L, S_U) = (95\%, 105\%)$
of label claim is used in all examples.
MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016
©2003 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.
In determining the shelf life of a drug product, the 1987 FDA Guidelines and the ICH
Guidance require that the stability data of at least three batches be used to account for
batch-to-batch variability. This variability is dealt with statistically through a classical analysis of
covariance model (Chow and Liu, 1995; Draft ICH Consensus Guideline, 2001; FDA,
1987; Guidance for Industry: ICH, 2001; 2003; Lin et al., 1993)
$$Y_{jk} = \alpha_j + \beta_j t_{jk} + \varepsilon_{jk} \tag{7}$$
where
$Y_{jk}$ = the assay value, as a percentage of the label claim, of the j-th batch at the k-th
time point,
$\alpha_j$ = the intercept at the manufacture date of the j-th batch,
$\beta_j$ = the regression rate of the j-th batch,
$t_{jk}$ = the time of the k-th time point of the j-th batch,
$\varepsilon_{jk}$ = the random error corresponding to the k-th time point of the j-th batch; the errors are
independent and identically distributed (iid) as $N(0, \sigma^2)$.
For simplicity, we assume that the assays were performed at the same time points for all
batches.
As recommended in the 1987 FDA Guidelines, the shelf life of the multiple batches of
the product is estimated based on the final ANCOVA model resulting from the pooling
test. The hypotheses of the intercept and slope pooling tests are given below.
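$$\text{Slope test:} \quad H_{01}: \beta_j = \beta \ \text{for all } j \quad \text{vs.} \quad H_{a1}: \beta_j \neq \beta \ \text{for some } j \tag{8}$$
and
$$\text{Intercept test:} \quad H_{02}: \alpha_j = \alpha \ \text{for all } j \quad \text{vs.} \quad H_{a2}: \alpha_j \neq \alpha \ \text{for some } j \tag{9}$$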
If neither null hypothesis is rejected, then the batches are considered to be poolable, and
the data of the batches will be combined to estimate the shelf life, which is the common
shelf life of all batches. However, if H02 is rejected and H01 is not, then a common slope
estimated from the pooled data can be used to estimate the shelf life, but the intercepts are
estimated individually. Otherwise, the shortest shelf life among the batches is used as the shelf life of all
batches of the same drug product. Since the sample size is not determined through a power
calculation, some protection against falsely pooling batches with different slopes or
intercepts is needed. The FDA addressed this by raising the type I error rate from 0.05 to 0.25
based on recommendations in the literature (Asano, 1960; Bancroft, 1944; 1964; Johnson
et al., 1977; Larson and Bancroft, 1963). The individual regression lines using separate
intercepts and slopes and the 95% confidence bands can be estimated using the ANCOVA
model.
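The slope pooling test in hypotheses (8) can be sketched in a few lines. The code below is a minimal illustration, not the procedure used by the authors or by any regulatory software: it computes the usual F statistic comparing a common-slope fit with separate-slope fits (separate intercepts in both), and, because three batches give a numerator of 2 degrees of freedom, it uses the closed-form upper tail of the F(2, ν) distribution, which is valid only in that case. The function names and the data (`TIMES`, `NOISE`, `make_batch`) are hypothetical.

```python
def batch_sums(times, values):
    """Return n, S_tt, S_yy and S_yt for one batch."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_yy = sum((y - y_bar) ** 2 for y in values)
    s_yt = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    return n, s_tt, s_yy, s_yt


def slope_pooling_test(batches):
    """F statistic for equal slopes, with a separate intercept per batch."""
    sums = [batch_sums(t, y) for t, y in batches]
    n_batches = len(sums)
    n_total = sum(s[0] for s in sums)
    # Residual sum of squares with separate slopes (full model).
    sse_separate = sum(s_yy - s_yt ** 2 / s_tt for _, s_tt, s_yy, s_yt in sums)
    # Residual sum of squares with one common slope (reduced model).
    sse_common = (sum(s[2] for s in sums)
                  - sum(s[3] for s in sums) ** 2 / sum(s[1] for s in sums))
    df1, df2 = n_batches - 1, n_total - 2 * n_batches
    f_stat = max(0.0, sse_common - sse_separate) / df1 / (sse_separate / df2)
    return f_stat, df1, df2


def upper_tail_f_df1_equals_2(f_stat, df2):
    """P(F > f_stat) for F(2, df2); closed form valid only for numerator df = 2."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


# Hypothetical data: three batches measured at the same seven time points.
TIMES = [0, 3, 6, 9, 12, 15, 18]
NOISE = [0.10, -0.15, 0.05, 0.12, -0.08, -0.10, 0.06]


def make_batch(intercept, slope):
    return TIMES, [intercept + slope * t + e for t, e in zip(TIMES, NOISE)]


parallel = [make_batch(100.5, -0.20), make_batch(100.0, -0.20), make_batch(99.5, -0.20)]
diverging = [make_batch(100.0, -0.10), make_batch(100.0, -0.20), make_batch(100.0, -0.30)]

for name, data in (("parallel", parallel), ("diverging", diverging)):
    f_stat, df1, df2 = slope_pooling_test(data)
    p_value = upper_tail_f_df1_equals_2(f_stat, df2)
    decision = "pool slopes" if p_value > 0.25 else "do not pool slopes"
    print(name, round(f_stat, 2), (df1, df2), decision)
```

The comparison against 0.25 mirrors the significance level discussed above; with a different number of batches the closed-form tail probability no longer applies and a general F distribution routine would be needed.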
For example, consider the data of the three batches in Table 1. The mean estimates and
the 95% confidence intervals (CI) of the shelf life for each of the three batches based on the
ANCOVA model are shown in Fig. 1. The point estimate and the 95% confidence interval
of the shelf life for the three batches are:
Table 1 (column headings only): Batch | Time in months: 0, 3, 6, 9, 12, 15, 18
Using pooling tests based on the ANCOVA approach recommended in the ICH Guidance, the
equal intercept null hypothesis is rejected but the equal slope null hypothesis is not. The
95% confidence bands of the three individual regression lines with a common slope
intersect the lower acceptance criterion (i.e., 95% of label claim) at 29, 25, and 25 months,
respectively (Fig. 2). Hence, a target shelf life of 24 months for the product is supported
by the ANCOVA approach.
Rather than treating the batch difference as either negligible or a fixed
amount (Chen and Tsong, 2003; Chen et al., 1995; Johnson et al., 1977; Ruberg and
Stegemen, 1991), a random effects model has been proposed that incorporates the batch-to-batch
variation into the model. This approach is most suitable for a drug product with a large
number of batches at the manufacturing stage rather than at the time of premarketing
approval.
The practice of using a large significance level such as 0.25 in the ANCOVA pooling
test has often been criticized because it gives no assurance about the power of
the tests while increasing the chance of falsely rejecting the null hypotheses (Asano, 1960;
Bancroft, 1944; 1964; Larson and Bancroft, 1963).
$$Y_k = \alpha + \beta t_k + \varepsilon_k \tag{11}$$
where
$\alpha$ = the intercept at the manufacture date,
$\beta$ = the regression rate,
$t_k$ = the time of the k-th time point,
$\varepsilon_k$ = the random error corresponding to the k-th time point,
and the $\varepsilon_k$ are assumed to be independent and identically distributed across all time
points with a normal distribution of mean zero and variance $\sigma^2$. The 95% confidence band
of the regression line is determined by the set of solutions $Y$ at $t_k$ of the following
equation
$$[Y - (a + b t_k)]^2 = (t_{0.975,\,n-2})^2\, s^2 \left(\frac{1}{n} + \frac{(t_k - \bar{t})^2}{S_{tt}}\right) \tag{12}$$
where $b = S_{yt}/S_{tt}$ and $a = \bar{y} - b\bar{t}$ are the estimates of $\beta$ and $\alpha$, respectively, $t_{0.975,\,n-2}$ is the
97.5th percentile of the t-distribution with $n-2$ degrees of freedom, $S_{yt} = \sum_k (y_k - \bar{y})(t_k - \bar{t})$,
$S_{tt} = \sum_k (t_k - \bar{t})^2$, and $S_{yy} = \sum_k (y_k - \bar{y})^2$.
Let $Y_L$ and $Y_U$ denote the lower and upper 95% confidence bounds over time,
respectively. The intersection of the confidence band of the regression line and either the
upper or lower acceptance criterion is used to estimate the mean shelf life. Furthermore,
the intersections $(T_L, T_U)$ of the lower and upper 95% confidence bounds with the
acceptance criterion are the 95% confidence limits of the mean shelf life. Note that when
two-sided acceptance limits are applied, the shorter of the two 95% confidence limits is the
lower limit of the mean shelf life and is the regulatory defined shelf life of the batch. When
a one-sided acceptance criterion is used, a 95% one-sided confidence interval is used in
place of the two-sided 95% confidence interval. The equation used to determine the 95%
confidence limits of the mean shelf life is given in Appendix A.
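The intersection of the confidence band in Eq. (12) with an acceptance criterion can be found by solving a quadratic in time. The sketch below is one way to do this for a single batch under the simple linear model (11); it is an illustration rather than the authors' Appendix A formula, the function name and the assay values are hypothetical, and the t quantile is passed in from a table (2.571 is the 97.5th percentile of the t-distribution with 5 degrees of freedom).

```python
import math


def shelf_life_limits(times, values, criterion, t_quantile):
    """Point estimate and confidence limits of the time at which the fitted
    line, and its confidence band, cross the acceptance criterion."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_yy = sum((y - y_bar) ** 2 for y in values)
    s_yt = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    b = s_yt / s_tt                      # slope estimate
    a = y_bar - b * t_bar                # intercept estimate
    s2 = (s_yy - b * b * s_tt) / (n - 2)  # residual variance
    c = t_quantile ** 2 * s2
    d = criterion - y_bar
    # With u = t - t_bar, Eq. (12) at Y = criterion becomes
    # (b^2 - c/S_tt) u^2 - 2 b d u + (d^2 - c/n) = 0.
    qa = b * b - c / s_tt
    qb = -2.0 * b * d
    qc = d * d - c / n
    if qa <= 0:
        raise ValueError("slope is not distinguishable from zero; limits are unbounded")
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    u_low, u_high = sorted(((-qb - root) / (2 * qa), (-qb + root) / (2 * qa)))
    t_hat = (criterion - a) / b
    return t_hat, t_bar + u_low, t_bar + u_high


# Hypothetical single-batch assay data (% of label claim) at months 0 to 18.
TIMES = [0, 3, 6, 9, 12, 15, 18]
ASSAY = [100.2, 99.8, 99.1, 98.9, 98.4, 98.0, 97.3]

t_hat, t_low, t_high = shelf_life_limits(TIMES, ASSAY, criterion=95.0, t_quantile=2.571)
print(round(t_hat, 1), round(t_low, 1), round(t_high, 1))
```

For a decreasing assay and an estimate beyond the mean sampling time, the upper limit lies farther from the point estimate than the lower limit does, which is the asymmetry the text describes.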
The sampling distribution of $\hat{T}$ is unknown. The complexity of the problem is well
documented in the literature on the statistical calibration problem. For the shelf life regression, it is
understandable that the 95% confidence limits $T_U$ and $T_L$ are asymmetric about $\hat{T}$. For example,
when $b < 0$, the distance between $\hat{T}$ and $T_U$ is longer than the distance between $\hat{T}$ and $T_L$.
Example 1. For illustration purposes, only the stability data of batch #1 in Table 1 are
considered in this example. The 95% confidence interval of the mean shelf life is skewed
to the right and is not a symmetric normal interval (see Fig. 3). Note that in Fig. 2, a 95%
confidence band for the mean regression line is used to determine the confidence interval
of the shelf life. It is in contrast to the single 95% confidence band in Fig. 1 for shelf life
determination using the ANCOVA approach. The sampling distribution of $\hat{T}$ is unknown.
In general, a log, log(log), or even more complicated transformation may be needed for
the interval to be approximated by a normal interval. For example, a log transformation may be
suitable if $\hat{T}^2 \approx T_L \cdot T_U$. As shown in Fig. 3, the estimate of the mean shelf life is 36 months.
The 95% lower and upper limits are 26 months and 63 months, respectively. Because the interval is
skewed to the right, a log transformation is needed to make the confidence interval
approximately symmetric. The transformed confidence interval has 3.258, 3.584, and
4.143 as the lower limit, mean, and upper limit, respectively.
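The effect of the log transformation on the interval in Example 1 can be checked with a few lines of arithmetic. This is only a numerical illustration using the three values quoted in the example (26, 36, and 63 months); the variable names are hypothetical.

```python
import math

# Values quoted in Example 1 for batch #1 (months).
t_lower, t_hat, t_upper = 26.0, 36.0, 63.0

log_lower, log_hat, log_upper = (math.log(v) for v in (t_lower, t_hat, t_upper))

# Asymmetry = (upper distance) - (lower distance); zero means symmetric.
raw_asymmetry = (t_upper - t_hat) - (t_hat - t_lower)
log_asymmetry = (log_upper - log_hat) - (log_hat - log_lower)

print([round(v, 3) for v in (log_lower, log_hat, log_upper)])  # [3.258, 3.584, 4.143]
print(round(raw_asymmetry, 1), round(log_asymmetry, 3))
```

On the log scale the interval is still somewhat skewed, but much less so relative to its width than on the original scale, which is why the symmetry is described as approximate.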
Assume that a natural log transformation is needed to make the confidence interval
approximately symmetric. Let us consider two batches with expected shelf lives $T_1$ and $T_2$,
respectively. A ratio of 1.15 as the equivalence limit, originally proposed by Yoshioka et al.
(1996b), would imply that two shelf lives are equivalent, or of no practical difference, if the
ratio of the shelf lives is between 0.8696 and 1.15 of the maximum shelf life of the batches
in the study. It is used in the following example in the sense that, for a proposed shelf life $T_0$,
any two batches in the study are equivalent if their true shelf lives are
between $0.8696\,T_0$ and $1.15\,T_0$. It appears to be a reasonable choice, as shown in Table 2.
Using the equivalence limits of 0.8696 and 1.15, a batch that is equivalent in
shelf life to a given batch with a true shelf life of 12 months has a shelf life between 10.44 months and
13.8 months. With the ln values of the lower and upper equivalence limits being −0.1398
[i.e., ln(0.8696)] and 0.1398 [i.e., ln(1.15)], respectively, the shelf life of an equivalent
batch differs from that of the given batch by no more than 1.8 months. The
equivalence limit in shelf life increases with the true shelf life of the given batch. For
example, if the true shelf life of a given batch is 36 months, a batch with equivalent shelf
life will have a true shelf life of no more than 41.4 months and no less than 31.3 months. It is no
more than 5.4 months shorter or longer in shelf life than the given batch with 36 months of
true shelf life. It is clear that for a batch with a 12-month shelf life, the equivalence limit of
the second batch is less than two months. This difference is shorter than three months,
which is the shortest interval between two measured time points. For a batch with a shelf life
of 24 or 36 months, the equivalence limits are all less than six months, which is shorter
than the interval between the last two time points as suggested for some cases by the
FDA Guidelines (1987) and the ICH Q1A Guidance (2001). The equivalence limit is greater
than six months only if the target shelf life is four years or longer, by which time the interval
between two measurements is one year. The use of these equivalence limits in the
regulatory setting may require further investigation.
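The arithmetic behind these equivalence windows is simple enough to tabulate. The following sketch uses the two ratio limits quoted in the text (0.8696 and 1.15); the function name and the chosen list of shelf lives are illustrative assumptions.

```python
import math

LOWER_RATIO, UPPER_RATIO = 0.8696, 1.15  # limits quoted in the text


def equivalence_window(shelf_life_months):
    """Range of true shelf lives treated as equivalent to the given one."""
    return shelf_life_months * LOWER_RATIO, shelf_life_months * UPPER_RATIO


for months in (12, 24, 36, 48):
    low, high = equivalence_window(months)
    print(months, round(low, 2), round(high, 2), round(high - months, 2))

# The two limits are (almost exactly) symmetric on the natural log scale.
print(round(math.log(LOWER_RATIO), 4), round(math.log(UPPER_RATIO), 4))
```

The last column printed is the largest allowed excess over the given shelf life, which grows in proportion to the shelf life itself, as the paragraph above notes.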
When using this definition of the shelf life equivalence limit, two shelf lives are
equivalent if the following null hypothesis H0 is rejected.
$$H_0: T_j/T_{j'} \le 0.8696 \ \text{or} \ T_j/T_{j'} \ge 1.15 \quad \text{vs.} \quad H_a: 0.8696 < T_j/T_{j'} < 1.15 \tag{13}$$
Assume that the 95% confidence interval $(\ln T_{jL}, \ln T_{jU})$, $j = 1$ to $J$, is symmetric about
$\ln \hat{T}_j$; the standard error of $\ln \hat{T}_j$ is then $\hat{\sigma}_j \approx [\ln T_{jU} - \ln T_{jL}]/(2 \cdot 1.96)$. The difference
between two batches, $\ln T_j - \ln T_{j'}$, is estimated by $d = \ln \hat{T}_j - \ln \hat{T}_{j'}$, and the
standard error of $\ln \hat{T}_j - \ln \hat{T}_{j'}$ is $\sqrt{\hat{\sigma}_j^2 + \hat{\sigma}_{j'}^2}$.
Hypotheses (13) can be tested with either two one-sided tests or a confidence interval
decision rule. With the two one-sided tests approach, we consider the following two sets of
one-sided hypotheses
The shelf life of multiple batches of the same strength and package type configuration can
be determined by comparing the individual batch shelf life with the target shelf life. If the
shelf life of each batch determined by the ANCOVA model (7) is greater than the target
shelf life, then the target shelf life should be accepted as the shelf life of all batches.
Otherwise, a statistical shelf-life equivalence assessment may be performed between every
pair of batches in order to determine whether a pooled shelf life should be used. The
procedure is illustrated with the following examples.
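Example 1 (continued).
The standard errors of the log-transformed shelf life estimates of the three batches are approximately 0.221, 0.275, and 0.145.
The 90% confidence intervals of the pairwise differences in log-transformed shelf life are (−0.533, 0.608), (−0.800, 0.071), and (−0.903, 0.119).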
None of the 90% CIs lies between −0.1398 [i.e., ln(0.8696)] and 0.1398 [i.e., ln(1.15)].
This indicates that the shelf lives of the three batches are not equivalent. The shortest 95% lower
confidence limit of the mean shelf life of the three batches is 19 months. Hence, a shelf life
of 18 months would be derived for this product based on the proposed shelf life
equivalence test.
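The confidence interval decision rule for a pair of batches can be sketched as follows. This is an illustration under the approximation described above (log-scale intervals treated as symmetric and normal, batches treated as independent); the function names are hypothetical, the first batch uses the values quoted for batch #1 in Example 1, and the remaining batches are invented for the demonstration.

```python
import math
from statistics import NormalDist

LOG_LIMIT = math.log(1.15)  # equivalence limit on the log scale


def log_standard_error(t_lower, t_upper, level=0.95):
    """Approximate SE of ln(T-hat) from a two-sided confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.log(t_upper) - math.log(t_lower)) / (2 * z)


def shelf_life_equivalent(batch_a, batch_b, ci_level=0.90):
    """Each batch is (t_hat, t_lower, t_upper) in months.

    Returns the CI of the log difference and whether it lies inside
    (-LOG_LIMIT, LOG_LIMIT)."""
    diff = math.log(batch_a[0]) - math.log(batch_b[0])
    se = math.sqrt(log_standard_error(batch_a[1], batch_a[2]) ** 2
                   + log_standard_error(batch_b[1], batch_b[2]) ** 2)
    z = NormalDist().inv_cdf(0.5 + ci_level / 2)
    low, high = diff - z * se, diff + z * se
    return (low, high), (-LOG_LIMIT < low and high < LOG_LIMIT)


batch_1 = (36.0, 26.0, 63.0)   # quoted in Example 1
batch_x = (34.0, 31.0, 37.5)   # hypothetical
batch_y = (31.0, 30.0, 32.0)   # hypothetical, tightly estimated
batch_z = (30.0, 29.0, 31.0)   # hypothetical, tightly estimated

print(shelf_life_equivalent(batch_1, batch_x))
print(shelf_life_equivalent(batch_y, batch_z))
```

The wide interval of batch #1 alone is enough to prevent a conclusion of equivalence, whereas two tightly estimated shelf lives that differ by one month pass the rule.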
As pointed out earlier, using a large significance level for the slope and intercept pooling
tests may help control the type I error of testing against the target shelf life.
However, this is achieved by inflating the type I error rate of the pooling test, and it penalizes
stability studies with replicate observations. When the sample size in each batch increases,
with either more observed time points or replicates at each point, the standard error of the
estimate is reduced and the power to reject H0 increases. As a result,
rejection of pooling becomes more likely as the sample size increases. On the other hand,
the reduced standard error increases the power to establish shelf life equivalence. This is
illustrated in Example 2 below.
Example 2. The stability data of Table 3 represent one study with three batches and
three replicate measurements at each time point. When using the ANCOVA approach as
recommended in regulatory guidance, the null hypothesis of equal
slopes is rejected, since the p-value of the F test is 0.1687, which is below the 0.25
significance level. The three regression lines are
The lower 95% confidence bands of the regression lines intersect the acceptance limit (i.e.,
95% of label claim) at 31, 30, and 27 months, respectively. Based on these results, a
24-month shelf life will be used for the drug product.
On the other hand, when using the shelf life equivalence test as proposed above, the
estimates of the shelf life of the three batches are
Table 3 (column headings only): Batch | Replicate | Time: 0, 3, 6, 9, 12, 18, 24
Since the confidence limits are between −0.139 and 0.139, the three batches are
equivalent in shelf life. The estimate of the log-transformed common shelf life is 3.434,
with 95% CI = (3.3863, 3.4810). After anti-log transformation, the shelf life estimate is
31 months with 95% CI = (30, 32). Hence, based on the shelf life equivalence test, the
common shelf life is 30 months.
Note that when using the ANCOVA model, the shelf life estimates of the three batches
are not independent, and the derivation of the sampling distributions or the joint sampling
distribution is complex. The shelf life equivalence test stated in this section is an
approximate test. However, given the iid normal assumption for the error term of the
ANCOVA model (7), the distribution of the Y values at any time point is
normal. Hence an equivalence test for pooling may be derived based on the Y values at
the target shelf life, as given in the following section.
As pointed out by Chen and Tsong (2003) and Tsong et al. (2003), the regulatory
requirement of a shelf life $T_0$ is based on the evidence that the chemical characteristic of
the batch is within the acceptance criterion or criteria at $T_0$. Hence, the hypotheses of
interest are
Let J be the total number of batches and K be the number of observational time points of
each batch. The regression models can be represented in matrix form,
$$Y_I = X_I B_I + \varepsilon \tag{17}$$
where $Y_I$ is an observed data vector with JK elements formed by data from J batches and
K time points, $\varepsilon$ is the vector of the random errors $\varepsilon_{jk}$, $B_I = (\alpha_1, \alpha_2, \ldots, \alpha_J, \beta_1, \beta_2, \ldots, \beta_J)'$ is a parameter vector with 2J
parameters, and $X_I$ is the $JK \times 2J$ design matrix of the individual regression model with elements
0, 1, or $t_k$. Then, $Y_I$ follows a normal distribution with mean $X_I B_I$ and covariance matrix $\sigma^2 I$
[i.e., $N(X_I B_I, \sigma^2 I)$].
Let $\hat{B}_I$ be the least squares estimate of $B_I$. Then,
$$\hat{B}_I = (X_I' X_I)^{-1} X_I' Y_I.$$
For a given batch, $(\hat{\alpha}_j, \hat{\beta}_j)' \sim N((\alpha_j, \beta_j)', \Sigma_I)$, where $\Sigma_I$ is the matrix that consists of the
$(j, j)$-th, $(j, j+J)$-th, $(j+J, j)$-th, and $(j+J, j+J)$-th entries of the matrix
$(X_I' X_I)^{-1} \sigma^2$.
In turn,
$$\hat{Y}_j(T_0) = X_j(T_0) \hat{B}_I$$
where $\hat{Y}_j(T_0)$ is the estimate of $Y_j$ at $T_0$, and $X_j(T_0)$ is the $1 \times 2J$ row vector whose j-th
entry is 1, whose $(j+J)$-th entry is $T_0$, and whose remaining entries are 0, using the individual regression line.
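By general linear model theory, the following results are obtained:
$$\hat{Y}_j(T_0) \sim N\left(X_j(T_0) B_I,\; X_j(T_0)(X_I' X_I)^{-1} X_j'(T_0)\,\sigma^2\right),$$
and
$$\hat{Y}_P(T_0) = \frac{1}{J}\sum_{j=1}^{J} X_j(T_0)\hat{B}_I \sim N\left(\frac{1}{J}\sum_{j=1}^{J} X_j(T_0) B_I,\; \Big[\frac{1}{J}\sum_{j=1}^{J} X_j(T_0)\Big](X_I' X_I)^{-1}\Big[\frac{1}{J}\sum_{j=1}^{J} X_j'(T_0)\Big]\sigma^2\right),$$
$$\hat{Y}_j(T_0) - \hat{Y}_P(T_0) = \frac{1}{J}\Big[J X_j(T_0) - \sum_{j'=1}^{J} X_{j'}(T_0)\Big](X_I' X_I)^{-1} X_I' Y_I,$$
and
$$\hat{\sigma}^2 = Y_I'\,[I - X_I (X_I' X_I)^{-1} X_I']\,Y_I/(N - 2J),$$
where $N - 2J = \operatorname{rank}(I - X_I (X_I' X_I)^{-1} X_I')$.
Since
$$\frac{1}{J}\Big[J X_j(T_0) - \sum_{j'=1}^{J} X_{j'}(T_0)\Big](X_I' X_I)^{-1} X_I'\,[I - X_I (X_I' X_I)^{-1} X_I'] = 0,$$
it follows that the confidence limits of the difference are
$$\big(\hat{Y}_j(T_0) - \hat{Y}_P(T_0)\big) \pm t_f(0.025)\sqrt{\Big[\frac{1}{J}\Big(J X_j(T_0) - \sum_{j'=1}^{J} X_{j'}(T_0)\Big)\Big](X_I' X_I)^{-1}\Big[\frac{1}{J}\Big(J X_j(T_0) - \sum_{j'=1}^{J} X_{j'}(T_0)\Big)\Big]'\,\hat{\sigma}^2}.$$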
Hence hypotheses (16) can be tested by comparing the confidence interval of
$\big[\alpha_i + \beta_i T_0 - (1/J)\sum_{j=1}^{J}(\alpha_j + \beta_j T_0)\big]$ with $(-\delta_T, \delta_T)$. The null hypothesis of hypotheses (16)
is rejected if the 90% confidence interval of $\big[\alpha_i + \beta_i T_0 - (1/J)\sum_{j=1}^{J}(\alpha_j + \beta_j T_0)\big]$ is
contained within $(-\delta_T, \delta_T)$.
For illustration purposes, a $\delta_T$ of 1.5% is used in Examples 1 and 2. It implies that for all
batches in a study with a common assay value at the proposed shelf life $T_0$, the difference
in assay value of any two batches can be no more than 3% of the label claim. The use of
these equivalence limits in the regulatory setting may require further investigation.
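The assay value equivalence test can be sketched without matrix algebra. With a separate intercept and slope for each batch, the design matrix is block diagonal, so the batch fits are independent simple regressions that share one pooled residual variance, and the variance of the difference between a batch prediction and the average prediction reduces to a weighted sum of per-batch terms. The code below is a minimal illustration under that assumption; the function names and the data are hypothetical, and the t quantile is supplied from a table (1.753 is the 95th percentile of the t-distribution with 15 degrees of freedom).

```python
import math


def fit_batch(times, values):
    """Simple linear regression for one batch."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_yt = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = s_yt / s_tt
    intercept = y_bar - slope * t_bar
    sse = sum((y - intercept - slope * t) ** 2 for t, y in zip(times, values))
    return {"n": n, "t_bar": t_bar, "s_tt": s_tt, "a": intercept, "b": slope, "sse": sse}


def assay_equivalence(batches, t0, delta, t_quantile):
    """Confidence intervals of (batch prediction - average prediction) at t0."""
    fits = [fit_batch(t, y) for t, y in batches]
    n_batches = len(fits)
    df = sum(f["n"] for f in fits) - 2 * n_batches
    sigma2 = sum(f["sse"] for f in fits) / df          # pooled residual variance
    preds = [f["a"] + f["b"] * t0 for f in fits]
    # Var(prediction of batch j at t0) = sigma2 * unit[j]
    unit = [1 / f["n"] + (t0 - f["t_bar"]) ** 2 / f["s_tt"] for f in fits]
    pooled = sum(preds) / n_batches
    intervals = []
    for j in range(n_batches):
        # Weights are (1 - 1/J) on batch j and -1/J on every other batch.
        var = sigma2 * ((1 - 2 / n_batches) * unit[j] + sum(unit) / n_batches ** 2)
        half = t_quantile * math.sqrt(var)
        diff = preds[j] - pooled
        intervals.append((diff - half, diff + half))
    poolable = all(-delta < low and high < delta for low, high in intervals)
    return pooled, intervals, poolable


# Hypothetical data: three batches, seven time points each (df = 21 - 6 = 15).
TIMES = [0, 3, 6, 9, 12, 15, 18]
NOISE = [0.10, -0.15, 0.05, 0.12, -0.08, -0.10, 0.06]


def make_batch(intercept, slope):
    return TIMES, [intercept + slope * t + e for t, e in zip(TIMES, NOISE)]


close = [make_batch(100.2, -0.2), make_batch(100.0, -0.2), make_batch(99.9, -0.2)]
apart = [make_batch(100.2, -0.2), make_batch(100.0, -0.2), make_batch(96.0, -0.2)]

print(assay_equivalence(close, t0=24, delta=1.5, t_quantile=1.753)[2])
print(assay_equivalence(apart, t0=24, delta=1.5, t_quantile=1.753)[2])
```

Using the 95th percentile gives the 90% two-sided interval named in the decision rule; a different confidence level or number of observations requires a different tabulated quantile.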
Example 1 (continued).
The estimate of Y(24) and the 95% CI of the three batches in Table 1 can be estimated
using the ANCOVA model. They are
Then,
$$\hat{Y}_{I1}(24) \sim N(97.07, 0.40)$$
$$\hat{Y}_{I2}(24) \sim N(96.89, 0.40)$$
$$\hat{Y}_{I3}(24) \sim N(95.32, 0.40)$$
and
$$\hat{Y}_P(24) \sim N(96.43, 0.13).$$
Finally, the 95% CI of $Y_P(24)$ is (95.65, 97.21), and the three 90% CIs of
$Y_{I1}(24) - Y_P(24)$, $Y_{I2}(24) - Y_P(24)$, and $Y_{I3}(24) - Y_P(24)$ are, respectively,
(−0.26, 1.55), (−0.44, 1.37), and (−2.01, −0.20).
The 90% confidence intervals of the differences $Y_{I1}(24) - Y_P(24)$ and $Y_{I3}(24) - Y_P(24)$ are
not bounded within $(-\delta_T, \delta_T)$ [i.e., (−1.5, 1.5)]; hence the assay values of the three batches
at 24 months cannot be pooled for shelf life testing, and the shortest shelf life of the three batches,
18 months, is used as the shelf life of the three batches.
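The three 90% intervals in this example can be reproduced from the quoted normal distributions. The check below rests on two assumptions that are not stated explicitly in the text: the three batch estimates are treated as independent with the common variance 0.40, and the residual degrees of freedom are taken as 21 − 6 = 15 (seven time points for each of three batches), so that the tabulated quantile 1.753 applies. The variable names are hypothetical.

```python
import math

estimates = [97.07, 96.89, 95.32]   # quoted estimates of Y(24) per batch
variance = 0.40                     # quoted variance of each estimate
T_95_DF15 = 1.753                   # assumption: 95th percentile of t with 15 df
DELTA_T = 1.5                       # equivalence limit used in the example

n_batches = len(estimates)
pooled = sum(estimates) / n_batches
# Var(batch estimate - average) = variance * (J - 1) / J for independent,
# equal-variance estimates.
half_width = T_95_DF15 * math.sqrt(variance * (n_batches - 1) / n_batches)

intervals = [(y - pooled - half_width, y - pooled + half_width) for y in estimates]
inside = [(-DELTA_T < low and high < DELTA_T) for low, high in intervals]

for (low, high), ok in zip(intervals, inside):
    print(round(low, 2), round(high, 2), ok)
```

Under these assumptions the computed intervals round to the values quoted in the example, and only the second batch falls inside (−1.5, 1.5).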
Example 2 (continued). The estimate of Y(30) and the 95% CI of the three batches in
Table 3 are
Batch #1: $\hat{Y}_{I1}(30) = 95.450$, 95% CI = (95.236, 95.665)
$\hat{Y}_P(30) = 95.253$.
$$\hat{Y}_{I1}(30) \sim N(95.45, 0.012)$$
$$\hat{Y}_{I2}(30) \sim N(95.32, 0.012)$$
$$\hat{Y}_{I3}(30) \sim N(94.98, 0.012)$$
and
$$\hat{Y}_P(30) \sim N(95.25, 0.004).$$
Finally, the 95% CI of $Y_P(30)$ is (95.13, 95.38), and the three 90% CIs of $Y_{I1}(30) - Y_P(30)$,
$Y_{I2}(30) - Y_P(30)$, and $Y_{I3}(30) - Y_P(30)$ are respectively reported below:
They clearly indicate that the assay values of the three batches at the 30th month can be pooled.
The lower 95% confidence limit of the assay value averaged over the three batches is
$95.25 - 1.645\sqrt{0.004} \approx 95.15$.
Hence a 30-month shelf life may be given based on the proposed assay value equivalence at the
thirtieth month with equivalence limit = 1.5%.
Note that the equivalence limit applied in these examples is selected only for the
purpose of demonstrating the examples. One justification of the choice is that it is selected
to assure that the difference between any two batches equivalent to the mean is no more
than 3% for the examples discussed here with the acceptance criteria restricted to no more
than 5% from the label claim.
The issues concerning the significance level used in the pooling tests of slope and intercept in the
conventional ANCOVA approach to stability studies have been documented in the literature.
The concept of equivalence testing for slope and/or intercept pooling was introduced early,
and its difficulty was recognized. Articles by Yoshioka et al. (1996a,b; 1997) reintroduced
the concept of pooling by equivalence in stability studies. Recent statistical developments
in equivalence testing make the concept of pooling batches based on equivalence much
more acceptable.
When pooling batches based on equivalence of shelf life, one encounters
the difficulty of deriving the sampling distribution of the shelf life estimate. In the example
of shelf life equivalence, a natural log transformation is applied to make the confidence
interval symmetric. However, the choice of transformation in this case can be data
dependent. Furthermore, the estimates of shelf lives are not independent, and the derivation
of the joint distribution of two shelf lives is complicated. The proposed approximation
approach does not take the covariance of the two shelf lives into consideration. On the
other hand, the sampling distribution of the estimate of the difference of the assay values
of two batches is estimable under the general ANCOVA model assumptions, and the
equivalence test is therefore more statistically sound.
The justifications of the selected equivalence limits in the two approaches are
discussed briefly. However, the choice is still debatable and requires further investigation.
When applying the ANCOVA model to a multiple factor stability design, we
recommend testing the interaction terms of the ANCOVA model using the procedure
proposed by Tsong et al. (2003). After eliminating as many of the interaction terms as
possible, one can start performing equivalence testing across all product combinations
based on the interaction-reduced model. If they fail the equivalence test, one may
test equivalence of the products within a given level of a factor to see if a common
chemical characteristic value can be used in supporting the proposed shelf life, $T_0$. The
generalization is simple but tedious.
The estimate of mean shelf life can be obtained by reversing Eq. (A.1), such that
$$\hat{T} = (y^* - a)/b \tag{A.1}$$
where $S^2 = \frac{1}{n-2}\sum_j \{(y_j - \bar{y})^2 - b^2 (t_j - \bar{t})^2\}$, and $F_{1,\,n-2}(\alpha)$ is the $\alpha$-th percentile of the F
distribution with degrees of freedom 1 and $n-2$.
V. ACKNOWLEDGMENTS
The authors want to thank the referee for the careful review of the manuscript and the
important comments. This manuscript was prepared with the support of Regulatory
Science Research Grant RSR02-015 of the Center for Drug Evaluation and Research,
FDA. The authors thank the members of FDA CDER Office of Biostatistics Stability
Working Group for their support and discussion on the development of the proposed
approaches.
REFERENCES
Asano, C. (1960). Tests due to pooling data through preliminary test on biological direct
assay. Bull. Math. Stat. 9:25–39.
Bancroft, T. A. (1944). On biases in estimation due to the use of preliminary tests of
significance. Ann. Math. Stat. 15:190–204.
Bancroft, T. A. (1964). Analysis and inference for incompletely specified models
involving the use of preliminary tests of significance. Biometrics 20(3):427–442.
Chen, W. J., Tsong, Y. (2003). Significance level for stability pooling test: a simulation
study. J. Biopharm. Stat. 13(3):355–374.
Chen, J. J., Hwang, J.-S., Tsong, Y. (1995). Estimation of the shelf-life of drugs with
mixed effects models. J. Biopharm. Stat. 5(1):131–140.
Chow, S. C., Liu, J. P. (1995). Statistical Design and Analysis in Pharmaceutical Sciences.
New York, NY, USA: Marcel Dekker.
Draft ICH Consensus Guideline. (2001a). Q1E Stability Data Evaluation. Food and
Drug Administration, Center for Drug Evaluation and Research and Center for
Biologics Evaluation and Research, www.fda.gov/cder/guidance/4983dft.pdf.
FDA. (1987). Guidelines for Submitting Documentation for the Stability of Human Drugs
and Biologics. Rockville, MD, USA: Food and Drug Administration, Center for Drugs
and Biologics.
Guidance for Industry: ICH. (2001b). Q1A(R) Stability Testing of New Drug Substances
and Products. Food and Drug Administration, Center for Drug Evaluation and
Research and Center for Biologics Evaluation and Research,
www.fda.gov/cder/guidance/4282fnl.pdf.
Guidance for Industry: ICH. (2003). Q1D Bracketing and Matrixing Designs for Stability
Testing of New Drug Substances and Products. Food and Drug Administration,
Center for Drug Evaluation and Research and Center for Biologics Evaluation and
Research, www.fda.gov/cder/guidance/4985fnl.pdf.
Johnson, J. P., Bancroft, T. A., Han, C. P. (1977). A pooling methodology for regressions
in prediction. Biometrics 33:57–67.