KEY POINTS
The following key points are discussed:
Analysis of covariance (ANCOVA) is an important
kind of multiple regression that involves two predictor variables: one continuous (e.g., time) and one
categorical (e.g., batch of material).
Like simple linear regression, simple ANCOVA
fits straight lines to response measurements (e.g.,
potency, related substance, or moisture content)
over time: one line for each level (i.e., batch) of the
categorical variable.
A key objective of ANCOVA is to determine whether
the straight lines for all batches are best described as
having a common-intercept-common-slope (CICS)
model, a separate-intercepts-common-slope (SICS)
model, or a separate-intercepts-separate-slopes (SISS)
model.
In ANCOVA, model choice is based on two statistical
gxpandjv t.com
Journal
of
47
Statistical Viewpoint.
INTRODUCTION
A previous installment of Statistical Viewpoint
described simple linear regression in which there
is a single continuous independent variable such as
time, temperature, concentration, or weight (1). Many
important relationships involve multiple independent variables, some of which may be categorical in
nature (e.g., batch of material, supplier, manufacturing
site, laboratory, preservative type, clinical subject).
Understanding such relationships requires the use of
multiple linear regression. In this installment, we deal
with the simplest kind of multiple linear regression
in which there are two independent variables: one
continuous (called the covariate) and one categorical. The following are some examples in which this
kind of relationship is important:
Pre-clinical studies. Ten xenograft rodents are
treated with a range of doses of an anti-tumor
agent and the tumor weight for each animal
decreases as dose increases. The objective is to
quantify the animal-to-animal differences in the dose-response
profile. Here, tumor weight is the dependent variable, rodent identity is the categorical
variable, and dose is the covariate.
Process scale-up. Active pharmaceutical ingredient (API) concentration is measured over time
in three chemical reactors. The reactors differ
in size (scale). The objective is to estimate scale
effects on the rate of API synthesis. Here, API
concentration is the dependent variable, scale
is the categorical variable, and time is the
covariate.
Analytical methods. An assay measures the
concentration of an analyte in plasma samples
based on a fluorescence response. Samples are
tested in duplicate. Each test provides a blank
response and a test response. The objective is to
compare analyte concentrations among samples,
while correcting each for the effect of the blank. In
this case, the test response is the dependent variable, sample identity is the categorical variable,
and blank is the covariate.
Pharmaceutical product stability. The drug
potency, related substance (a degradation product), and moisture level are measured over time
in multiple batches of product stored in a temperature- and humidity-controlled chamber. The
objective is to estimate the shelf life of the product. Here, the potency, related substance, and
moisture levels are the dependent variables, batch
identity is the categorical variable, and storage
time is the covariate.
Figure 1: Multiple-batch models of instability: CICS (common intercept and common slope),
SICS (separate intercept and common slope), SISS (separate intercept and separate slope).
MODELS OF INSTABILITY
We will assume here that, for a given batch, the change
in level over time can be approximated by a straight
line. Chemists refer to this as a pseudo-zero-order kinetic mechanism. The real kinetic mechanism is almost
certainly more complex, but this linear assumption
is often found to be adequate. In any real application
this linear assumption should be justified. In some
cases, the response measurements or the time scale
can be altered using appropriate transformation(s) to
obtain a linear stability profile.
Consider the case where stability data are available for three batches of product. Figure 1 illustrates
possible models, or scenarios, of product instability
where the response is, for instance, the level of some
related substance or degradation product of the active
drug. However, the models described in Figure 1 apply
equally well for decreasing responses (e.g., potency)
or for responses that may rise or fall over time (e.g.,
moisture). In Figure 1, the mean response level for
each batch is indicated by a different colored line.
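As a rough illustration of the three scenarios, the sketch below computes the error sum of squares (SSE) for the CICS, SICS, and SISS straight-line fits using closed-form least-squares formulas. It is a minimal sketch in plain Python and not the macro's implementation; the function names and the small three-batch data set are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def _sums(points):
    """Centered sums (Sxx, Sxy, Syy) for a list of (time, response) pairs."""
    n = len(points)
    xbar = sum(t for t, _ in points) / n
    ybar = sum(y for _, y in points) / n
    sxx = sum((t - xbar) ** 2 for t, _ in points)
    sxy = sum((t - xbar) * (y - ybar) for t, y in points)
    syy = sum((y - ybar) ** 2 for _, y in points)
    return sxx, sxy, syy


def model_sse(rows):
    """rows: list of (batch, time, response). Returns the error SS per model."""
    by_batch = defaultdict(list)
    for batch, t, y in rows:
        by_batch[batch].append((t, y))

    # CICS: a single line fitted to the pooled data.
    sxx, sxy, syy = _sums([(t, y) for _, t, y in rows])
    sse_cics = syy - sxy ** 2 / sxx

    # Per-batch centered sums are reused by the other two models.
    per_batch = [_sums(points) for points in by_batch.values()]

    # SICS: one shared slope; each batch keeps its own intercept.
    total_sxx = sum(s[0] for s in per_batch)
    total_sxy = sum(s[1] for s in per_batch)
    total_syy = sum(s[2] for s in per_batch)
    sse_sics = total_syy - total_sxy ** 2 / total_sxx

    # SISS: an independent line for each batch.
    sse_siss = sum(syy_b - sxy_b ** 2 / sxx_b for sxx_b, sxy_b, syy_b in per_batch)

    return {"CICS": sse_cics, "SICS": sse_sics, "SISS": sse_siss}


# Hypothetical data: three batches that each fall exactly on their own line.
months = [0, 3, 6, 12]
lines = {"A": (100.0, -0.20), "B": (101.0, -0.20), "C": (99.0, -0.30)}
rows = [(b, t, a + s * t) for b, (a, s) in lines.items() for t in months]
sse = model_sse(rows)
print(sse)
```

Because the three models are nested, the SSE can only shrink or stay the same when moving from CICS to SICS to SISS; the ANCOVA F-tests ask whether each reduction is larger than chance alone would explain.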
Table II: ANCOVA table output from the Minitab stability macro.
Source      DF            Seq SS                       Seq MS               F                 p-value
Time        DF_T = 1      SS_T = SSE_CINS - SSE_CICS   MS_T = SS_T/DF_T     F_T = MS_T/MSE    p-value_T
Batch       DF_B = B-1    SS_B = SSE_CICS - SSE_SICS   MS_B = SS_B/DF_B     F_B = MS_B/MSE    p-value_B
Batch*Time  DF_BT = B-1   SS_BT = SSE_SICS - SSE_SISS  MS_BT = SS_BT/DF_BT  F_BT = MS_BT/MSE  p-value_BT
Error       DF_E = N-2B   SSE = SSE_SISS               MSE = SSE/DF_E
Total       DF_tot = N-1  SS_tot = SSE_CINS
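The arithmetic of Table II can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `ancova_table` is hypothetical, and the four error sums of squares passed in below are back-calculated from the sequential sums of squares reported for example three in Table IX (N = 24 results, B = 3 batches). The p-values are omitted because they require the upper tail of the F distribution (available, for instance, as `scipy.stats.f.sf`), which is left out to keep the sketch free of dependencies.

```python
def ancova_table(sse_cins, sse_cics, sse_sics, sse_siss, n_obs, n_batches):
    """Sequential ANCOVA rows following the formulas of Table II."""
    df_error = n_obs - 2 * n_batches
    mse = sse_siss / df_error
    terms = [
        ("Time", 1, sse_cins - sse_cics),
        ("Batch", n_batches - 1, sse_cics - sse_sics),
        ("Batch*Time", n_batches - 1, sse_sics - sse_siss),
    ]
    table = {}
    for name, df, ss in terms:
        ms = ss / df
        # Every F ratio uses the error mean square of the SISS model.
        table[name] = {"DF": df, "SS": ss, "MS": ms, "F": ms / mse}
    table["Error"] = {"DF": df_error, "SS": sse_siss, "MS": mse}
    table["Total"] = {"DF": n_obs - 1, "SS": sse_cins}
    return table


# Error sums of squares back-calculated from Table IX (example three).
table = ancova_table(sse_cins=120.230, sse_cics=74.779, sse_sics=9.861,
                     sse_siss=8.101, n_obs=24, n_batches=3)
for source, row in table.items():
    print(source, {key: round(value, 3) for key, value in row.items()})
```

With these inputs, the F ratios agree with those reported in Table IX to within rounding of the tabulated sums of squares.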
Figure 3: Illustration of shelf-life determination for a single batch. Red horizontal lines indicate
upper (U) or lower (L) acceptance limits. The solid straight line is the mean regression line, and the
dashed line is the upper or lower confidence limit. The maximum batch shelf life is indicated by S.
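The construction in Figure 3 can be sketched for a decreasing response and a lower acceptance limit as follows. This is a minimal sketch under stated assumptions, not the macro's algorithm: the function name `shelf_life_lower`, the search horizon, and the single-batch data are hypothetical, the Student t quantile is supplied by the caller, and the bisection assumes the confidence bound crosses the limit only once within the horizon.

```python
import math


def shelf_life_lower(times, values, lower_limit, t_crit, horizon=120.0):
    """Time at which the lower confidence bound on the mean line meets lower_limit."""
    n = len(times)
    xbar = sum(times) / n
    ybar = sum(values) / n
    sxx = sum((t - xbar) ** 2 for t in times)
    sxy = sum((t - xbar) * (y - ybar) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((y - intercept - slope * t) ** 2 for t, y in zip(times, values))
    s = math.sqrt(sse / (n - 2))          # residual standard deviation

    def lower_bound(t):
        se_mean = s * math.sqrt(1.0 / n + (t - xbar) ** 2 / sxx)
        return intercept + slope * t - t_crit * se_mean

    if lower_bound(0.0) < lower_limit:
        return 0.0                        # already below the limit at time zero
    if lower_bound(horizon) >= lower_limit:
        return horizon                    # no crossing within the search horizon
    lo, hi = 0.0, horizon
    for _ in range(60):                   # bisection on the single crossing
        mid = (lo + hi) / 2
        if lower_bound(mid) >= lower_limit:
            lo = mid
        else:
            hi = mid
    return lo


months = [0, 3, 6, 9, 12, 18]
potency = [100.3, 99.9, 99.2, 98.9, 98.1, 97.2]   # hypothetical single batch
# 2.132 is the one-sided 95% Student t quantile for n - 2 = 4 degrees of freedom.
shelf_life = shelf_life_lower(months, potency, lower_limit=95.0, t_crit=2.132)
# With t_crit = 0 the bound collapses onto the mean regression line.
mean_crossing = shelf_life_lower(months, potency, lower_limit=95.0, t_crit=0.0)
print(shelf_life, mean_crossing)
```

The shelf life S obtained from the confidence bound is always shorter than the time at which the mean line itself reaches the limit, which is the margin Figure 3 depicts.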
STABILITY ANALYSIS
The following illustrates five stability analyses using
this macro. The potency data were obtained
from a published literature example (6). The related
substance and moisture data are realistic but artificially constructed.
Subcommand (input parameters): definition
LIFE (c.1 c.z)
STORE (out.1-out.n): Specifies storage columns for the fitted values and confidence/prediction limits for each row of data. Either 3 or 5 columns for one- or two-sided limits, respectively. These may be separated by spaces (c4 c5 c6) or given as a range using a dash (c4-c6). When using the xvalues subcommand, fits and limits are provided only for the batches/times in the columns specified in the xvalues subcommand, and not for every value in the data set.
ITYPE (it)
CONFIDENCE (cl)
XVALUES (xpredt xpredb): Requests fitted values and limits for batch/time combinations that were not included in your stability data set. The desired times and batches are entered into columns xpredt and xpredb, respectively, prior to invoking the macro. The xvalues subcommand must always be used in conjunction with the store subcommand.
NOGRAPH (N/A)
CRITERIA (alpha): Defines the significance level used in the ANCOVA F-tests. By default, the significance level is 0.25.
Row  C1       C2     C3     C4       C5        C6
     Potency  Month  Batch  Fit      Lower CL  Lower PL
1    101.0                  100.567  100.215   99.1808
2    102.0                  100.567  100.215   99.1808
3    101.3                  100.567  100.215   99.1808
4    101.3                  100.374  100.043   98.9928
5    101.4                  100.374  100.043   98.9928
6    101.5                  100.374  100.043   98.9928
7    100.8                  100.181  99.869    98.8043
8    99.8                   99.988   99.693    98.6152
9    100.2                  99.988   99.693    98.6152
10   100.2                  99.988   99.693    98.6152
11   99.2                   99.988   99.693    98.6152
12   99.7                   99.988   99.693    98.6152
13   99.8                   99.988   99.693    98.6152
14   99.5                   99.409   99.154    98.0442
15   98.8                   99.409   99.154    98.0442
16   99.0                   99.409   99.154    98.0442
17   97.8                   99.409   99.154    98.0442
18   98.5                   99.409   99.154    98.0442
19   98.5                   99.409   99.154    98.0442
20   97.4     12            98.251   97.994    96.8857
21   98.0     12            98.251   97.994    96.8857
22   98.5     12            98.251   97.994    96.8857
23   97.2     12            98.251   97.994    96.8857
24   97.1     12            98.251   97.994    96.8857
25   97.4     12            98.251   97.994    96.8857
26   96.9     24            95.935   95.436    94.5045
27   96.6     24            95.935   95.436    94.5045
28   96.6     24            95.935   95.436    94.5045
29   96.0     24            95.935   95.436    94.5045
30   96.1     24            95.935   95.436    94.5045
31   96.4     24            95.935   95.436    94.5045
Table V: Example one ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.
ANCOVA
Source      DF  Seq SS  Seq MS  F      p-value
Time
Batch
Batch*Time  2   0.314   0.157   0.229  0.797
Error       25  17.146  0.686
Total       30  98.417
Model 1 Analysis
Regression Equation
y = 100.567 - 0.192994 time
Summary of Model
S = 0.789106
R-Sq = 81.65%
R-Sq(adj) = 81.02%
Source       DF  Seq SS   Adj SS   Seq MS  F       p-value
Regression   1
Time
Error
Lack-of-Fit      13.1613  13.1613  2.6323  12.901  0.0000037
Pure Error   24  4.8967   4.8967   0.2040
Total        30  98.4168
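The lack-of-fit (LOF) row in this output compares the scatter of the time-point means about the fitted line with the pure error among replicate results at the same time point. In Table V the ratio of the two mean squares is 2.6323/0.2040, or approximately 12.9. The sketch below shows how such a statistic can be computed for a single straight-line fit; the function name `lack_of_fit` and the small replicate data set are hypothetical, and this is not the macro's own code.

```python
from collections import defaultdict


def lack_of_fit(times, values):
    """Lack-of-fit F statistic for one straight line fitted to replicated data."""
    n = len(times)
    xbar = sum(times) / n
    ybar = sum(values) / n
    sxx = sum((t - xbar) ** 2 for t in times)
    sxy = sum((t - xbar) * (y - ybar) for t, y in zip(times, values))
    syy = sum((y - ybar) ** 2 for y in values)
    sse = syy - sxy ** 2 / sxx            # residual SS about the fitted line

    # Pure error: scatter of replicates about their own time-point mean.
    groups = defaultdict(list)
    for t, y in zip(times, values):
        groups[t].append(y)
    ss_pe = sum(sum((y - sum(ys) / len(ys)) ** 2 for y in ys)
                for ys in groups.values())
    df_pe = n - len(groups)
    df_lof = len(groups) - 2              # time points minus two line parameters
    ss_lof = sse - ss_pe
    f_lof = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"SS_LOF": ss_lof, "DF_LOF": df_lof,
            "SS_PE": ss_pe, "DF_PE": df_pe, "F": f_lof}


# Hypothetical duplicates whose time-point means (100, 99, 98) lie on a line,
# so nearly all residual variation is pure error.
times = [0, 0, 6, 6, 12, 12]
values = [100.1, 99.9, 99.1, 98.9, 98.2, 97.8]
result = lack_of_fit(times, values)
print(result)
```

A large F here indicates that a straight line does not describe the time-point means well, which is a signal to revisit the linearity assumption before relying on the estimated shelf life.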
%stability c1 c2 c3;
life 95 105.
Table IX shows the ANCOVA and other statistical output from this analysis.
There is evidence for both separate slopes (p-value
= 0.17) and separate intercepts (p-value < 0.01); both p-values
are below the regulatory limit of 0.25. Comparison with
the ANCOVA decision process of Figure 2 shows that the
SISS model is appropriate in this case. The regression equations for each batch are given in Table IX and, as expected, the slopes
and intercepts differ among batches. We note
that in this case the LOF test is not statistically significant
(p-value = 0.100568). For this test we use the traditional
Type I error rate of 0.05 to judge statistical significance.
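The model choice described above can be sketched as a short decision function. This is a minimal sketch that assumes the sequence commonly used for poolability testing (test the Batch*Time interaction first, then the Batch term); Figure 2 is not reproduced here, so the exact branching and the name `select_model` are assumptions rather than a transcription of the macro.

```python
def select_model(p_batch_time, p_batch, alpha=0.25):
    """Choose CICS, SICS, or SISS from the two ANCOVA p-values."""
    if p_batch_time < alpha:
        return "SISS"   # evidence that slopes differ among batches
    if p_batch < alpha:
        return "SICS"   # common slope, but intercepts differ
    return "CICS"       # batches can be pooled into one line


# Example three: slopes p-value = 0.17; 0.005 stands in for "p-value < 0.01".
print(select_model(p_batch_time=0.17, p_batch=0.005))
```

The relatively large default of 0.25 makes it harder to pool batches, which guards against overlooking real batch-to-batch differences when only a few batches are available.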
Figure 6: Example two potency stability profiles for each batch based on a SICS model and a one-sided lower acceptance limit.
Table VI: Example two potency stability data and estimated fits and limits.
Row  C1       C2     C3     C4       C5        C6
     Potency  Month  Batch  Fit      Lower CL  Lower PL
1    104.8                  102.176  101.434   100.192
2    104.0                  104.255  103.463   102.252
3    102.0                  100.820  100.163   98.866
4    101.4                  100.607  99.971    98.660
5    100.8                  100.394  99.777    98.453
6    103.0                  101.536  100.857   99.575
7    103.2                  103.616  102.887   101.637
8    100.2                  100.181  99.581    98.245
9    101.2                  101.536  100.857   99.575
10   99.7                   100.181  99.581    98.245
11   100.8                  100.897  100.261   98.950
12   102.8                  102.976  102.295   101.014
13   98.8                   99.541   98.977    97.617
14   99.2                   100.897  100.261   98.950
15   103.3                  102.976  102.295   101.014
16   98.5                   99.541   98.977    97.617
17   98.6     12            99.618   98.999    97.677
18   102.4    12            101.698  101.045   99.745
19   98.0     12            98.263   97.688    96.335
20   97.2     12            99.618   98.999    97.677
21   101.2    12            101.698  101.045   99.745
22   97.1     12            98.263   97.688    96.335
23   97.6     24            97.061   96.215    95.035
24   99.1     24            99.140   98.291    97.113
25   96.6     24            95.705   94.853    93.677
26   98.0     24            97.061   96.215    95.035
27   99.5     24            99.140   98.291    97.113
28   96.1     24            95.705   94.853    93.677
Table VII: Example two ANCOVA, regression, ANOVA, and estimated shelf-life
output from the Minitab stability macro.
ANCOVA
Source      DF  Seq SS   Seq MS  F      p-value
Time
Batch
Batch*Time  2   0.455    0.227   0.183  0.834
Error       22  27.309   1.241
Total       27  156.221
Model 2 Analysis
Regression Equation
batch
3
Summary of Model
S = 1.07556
R-Sq = 82.23%
R-Sq(adj) = 80.01%
Source       DF  Seq SS   Adj SS   Seq MS   F        p-value
Regression       128.457  128.457  42.8191  37.0144  0.0000000
time             74.489   88.734   74.4895  64.3914  0.0000000
batch            53.968   53.968   26.9839  23.3259  0.0000024
Error
Lack-of-Fit  13  22.179   22.179   1.7061   3.3602   0.0258372
Pure Error   11  5.585    5.585    0.5077
Total        27  156.221
%stability c1 c2 c3;
itype 0;
confidence 0.95;
life 95 105;
xvalues c4 c5;
store C6 c7 c8 c9 c10.
Row  C1       C2     C3     C4       C5        C6
     Potency  Month  Batch  Fit      Lower CL  Lower PL
1    104.0                  104.071  103.402   102.729
2    102.0                  100.782  100.280   99.515
3    101.6                  101.259  100.375   99.798
4    101.4                  100.573  100.101   99.318
5    100.8                  100.365  99.919    99.119
6    103.2                  103.482  102.921   102.191
7    100.2                  100.156  99.736    98.919
8    100.0                  100.269  99.618    98.936
9    99.7                   100.156  99.736    98.919
10   102.8                  102.894  102.419   101.637
11   98.8                   99.530   99.164    98.311
12   99.0                   99.278   98.754    98.002
13   103.3                  102.894  102.419   101.637
14   98.5                   99.530   99.164    98.311
15   102.4    12            101.717  101.302   100.482
16   98.0     12            98.279   97.897    97.054
17   97.8     12            97.297   96.514    95.895
18   101.2    12            101.717  101.302   100.482
19   97.1     12            98.279   97.897    97.054
20   97.0     12            97.297   96.514    95.895
21   99.1     24            99.363   98.605    97.975
22   96.6     24            95.775   95.027    94.392
23   99.5     24            99.363   98.605    97.975
24   96.1     24            95.775   95.027    94.392

%stability c1 c2 c3;
store c4 c5 c6;
itype 1;
confidence 0.95;
life 0.3;
criteria 0.25.
Table IX: Example three ANCOVA, regression, ANOVA, and estimated shelf-life
output from the Minitab stability macro.
ANCOVA
Source      DF  Seq SS   Seq MS  F      p-value
Time
Batch
Batch*Time  2   1.760    0.880   1.955  0.17
Error       18  8.101    0.450
Total       23  120.230
Model 3 Analysis
Regression Equation
batch
4
Summary of Model
S = 0.670850
R-Sq = 93.26%
R-Sq(adj) = 91.39%
Source       DF  Seq SS   Adj SS   Seq MS   F        p-value
Regression   5   112.129  112.129
time             45.451   45.950   45.4513  100.994  0.000000
batch            64.918   21.635   32.4588  72.124   0.000000
time*batch       1.760    1.760    0.8800   1.955    0.170420
Error        18  8.101    8.101    0.4500
Lack-of-Fit  10  6.156    6.156    0.6156   2.532    0.100568
Pure Error       1.945    1.945    0.2431
Total        23  120.230
Table X: Example three fit, confidence limit, and prediction limit estimates for
time and batch combinations not present in the stability data.
C4            C5            C6       C7        C8        C9        C10
Xvalue_Month  Xvalue_Batch  Fit      Lower CL  Upper CL  Lower PL  Upper PL
15                          101.128  100.574   101.683   99.6139   102.643
15                          97.653   97.110    98.195    96.1426   99.163
15                          96.306   95.036    97.577    94.4088   98.204
Figure 7: Example three potency stability profiles for each batch based on a SISS model and a one-sided
lower acceptance limit.
Table XI: Example four related substance stability data and estimated fits and limits.
Row  C1       C2     C3     C4        C5        C6
     Related  Month  Batch  Fit       Upper CL  Upper PL
1    0.030                  0.027881  0.047950  0.068139
2    0.054                  0.045534  0.062376  0.084284
3    0.066                  0.063188  0.077421  0.100878
4    0.051                  0.063188  0.077421  0.100878
5    0.078    12            0.098495  0.110942  0.135547
6    0.114    12            0.098495  0.110942  0.135547
7    0.177    24            0.169110  0.191852  0.210765
8    0.165    24            0.169110  0.191852  0.210765
9    0.090                  0.126544  0.141610  0.164556
10   0.108                  0.132802  0.146984  0.170472
11   0.126                  0.139060  0.152420  0.176429
12   0.144                  0.145319  0.157933  0.182427
13   0.159                  0.145319  0.157933  0.182427
14   0.186                  0.164093  0.175072  0.200678
15   0.195                  0.164093  0.175072  0.200678
16   0.210    12            0.201643  0.213096  0.238373
17   0.237    12            0.201643  0.213096  0.238373
18   0.252    24            0.276742  0.299188  0.318236
19   0.267    24            0.276742  0.299188  0.318236
20   0.102                  0.112219  0.138754  0.156060
21   0.150                  0.141938  0.161447  0.181919
22   0.180                  0.171656  0.187385  0.209936
23   0.216    12            0.231094  0.254586  0.273163
24   0.240    12            0.231094  0.254586  0.273163
Table XII: Example four ANCOVA, regression, ANOVA, and estimated shelf-life
output from the Minitab stability macro.
ANCOVA
Source      DF  Seq SS  Seq MS  F        p-value
Time            0.041   0.041   100.994  0.00
Batch
Batch*Time  2   0.002   0.001   1.955    0.17
Error       18  0.007   0.000
Total       23  0.108
Model 3 Analysis
Regression Equation
batch
4
Summary of Model
S = 0.0201255
R-Sq = 93.26%
R-Sq(adj) = 91.39%
Source       DF  Seq SS    Adj SS    Seq MS     F        p-value
Regression       0.100916  0.100916
time             0.040906  0.041355  0.0409062  100.994  0.000000
batch            0.058426  0.019472  0.0292129  72.124   0.000000
time*batch       0.001584  0.001584  0.0007920  1.955    0.170420
Error        18
Lack-of-Fit  10  0.005540  0.005540  0.0005540  2.532    0.100568
Pure Error       0.001751  0.001751  0.0002188
Total        23  0.108207
Row  C1        C2     C3
     Moisture  Month  Batch
1    2.20059
2    1.70372
3    3.32395
4    2.75907
5    2.43192
6    1.76331
7    1.56801
8    2.19423   12
9    3.22311   12
10   3.16325   24
11   1.54837   24
12   2.81078
13   1.94915
14   2.49058
15   2.00485
16   3.30700
17   2.99309
18   3.30159
19   2.72512   12
20   1.88341   12
21   2.77215   24
22   1.69048   24
23   2.45301
24   2.16138
25   2.26631
26   2.12853
27   2.51775
28   2.31034
29   3.36915
30   2.32070   12
31   2.72001   12
32   2.19393   24
33   3.45895   24
Table XIV: Example five ANCOVA, regression, ANOVA, and estimated shelf-life
output from the Minitab stability macro.
ANCOVA
Source      DF  Seq SS  Seq MS  F      p-value
Time
Batch
Batch*Time  2   0.531   0.265   0.748  0.483
Error       27  9.573   0.355
Total       32  10.366
Model 1 Analysis
Regression Equation
y = 2.45678 + 0.0022724 time
Summary of Model
S = 0.577939
R-Sq = 0.11%
R-Sq(adj) = -3.11%
Source       DF  Seq SS   Adj SS   Seq MS    F         p-value
Regression       0.0116   0.0116   0.011599  0.034726  0.853386
time             0.0116   0.0116   0.011599  0.034726  0.853386
Error
Lack-of-Fit      1.0545   1.0545   0.210898  0.589613  0.707875
Pure Error   26  9.2999   9.2999   0.357689
Total        32  10.3660
CONCLUSION
We have illustrated here the ANCOVA process that is used
to set shelf life for pharmaceutical products. We
have also illustrated the use of a convenient Minitab macro
that can be used to perform the ANCOVA, choose
the appropriate stability model, and execute the multiple
regressions to estimate shelf life and produce other useful
statistical tests and statistics. The macro is flexible enough
to handle a variety of common situations and produces
graphics that serve as useful regression diagnostics.
It is essential to stress here the critical aspect of software validation. Validation is a regulatory requirement
for any software used to estimate pharmaceutical product
shelf life. Reliance on any statistical software, whether
validated or not, carries with it the risk of producing
misleading results. It is incumbent on the users of statistical software to determine not only that the statistical
packages they use can produce accurate results on a
battery of standard data sets, but also that the statistical
model and other assumptions being made apply to the
particular data set being analyzed, and that data and command language integrity are maintained. It is not uncommon for a computer package to perform differently when
installed on different computing equipment, in different
environments, or when used under different operating
systems. In our hands, using a number of representative
data sets, the Minitab Stability macro performs admirably
Figure 9: Example five moisture stability profile for all batches based on a CICS model and a
two-sided acceptance limit.
REFERENCES
1. Hu Yanhui, Linear Regression 101, Journal of Validation Technology 17(2), 15-22, 2011.
2. LeBlond D., Chapter 23, Statistical Design and Analysis of Long-Term Stability Studies for Drug Products, In Qiu Y, Chen Y, Zhang G, Liu L, Porter W (Eds.), 539-561, 2009.
3. Minitab Stability Studies Macro (2011). A technical support document describing the use of the macro in Minitab version 16 is available from the Minitab Knowledgebase at http://www.minitab.com/support/answers/answer.aspx?id=2686.
4. International Conference on Harmonization. ICH Q1E, Step 4: Evaluation for Stability Data, 2003. http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html
5. Neter J, Kutner MH, Nachtsheim CJ, and Wasserman W, Applied Linear Statistical Models, Chapter 23. 3rd edition, Irwin, Chicago, 1996.
Analysis of Covariance
Analysis of Variance
Active Pharmaceutical Ingredient
Common Intercept and Common Slope
Confidence Limit
Degrees of Freedom
Lack of fit
Percent of Label Claim
Mean Square Error
Prediction Limit
Predicted Residual Sum of Squares
Root Mean Squared Error
R-square
Adjusted R-square
Prediction R-square
Separate Intercept and Common Slope
Separate Intercept and Separate Slope