
HUMAN FACTORS, 1973, 15(4), 295-310

Response Surface Methodology

Central-Composite Design Modifications for Human Performance Research

CHRISTINE CLARK and ROBERT C. WILLIGES, University of Illinois at Urbana-Champaign

Selected response surface methodology (RSM) designs that are viable alternatives in human performance research are discussed. Two major RSM designs that are variations of the basic, blocked, central-composite design have been selected for consideration: (1) central-composite designs with multiple observations at only the center point, and (2) central-composite designs with multiple observations at each experimental point. Designs of the latter type are further categorized as: (a) designs which collapse data across all observations at the same experimental point; (b) between-subjects designs in which no subject is observed more than once, and observations at each experimental point may be multiple and unequal or multiple and equal; and (c) within-subject designs in which each subject is observed only once at each experimental point. The ramifications of these designs are discussed in terms of various criteria such as rotatability, orthogonal blocking, and estimates of error.

INTRODUCTION

Frequently, an investigator's aim is to determine a quantitative relationship between human performance and one or more system parameters. Among the most immediate benefits accruing from such a known, quantitative relationship are the ability to predict performance levels corresponding to given levels of the system variables and, conversely, the ability to determine the system variable levels necessary to maintain a designated performance level. One particularly promising procedure for gathering the data needed to make these and other quantitative determinations is response surface methodology (RSM), originally introduced by Box and Wilson (1951). Unlike traditional factorial analysis of variance designs, RSM focuses primarily on determining the functional relationship that exists between the response and specified continuous, quantitative factors, rather than merely determining the significance of the various factors.

In addition to approximating the relationship between performance and factors in the form of a prediction equation, RSM advances a variety of experimental designs to achieve that estimate as efficiently and economically as possible. When using factorial designs, the investigator is often forced by practical considerations to limit the number of factors studied to even fewer than the number that he believes has a critical effect on performance. In such a case he must conduct multiple studies, each of which investigates only a few factors at any one time. This results in an unrealistic view


Downloaded from hfs.sagepub.com at PENNSYLVANIA STATE UNIV on February 19, 2016



of any system in which factors are not independent of each other. By allowing the investigator to consider larger numbers of factors within a single study, RSM proves a valuable investigatory tool. Through strategic sampling of data points, RSM also provides the most essential information and allows one to decide whether or not the collection of additional data is merited.

Most RSM designs are special cases of the Box and Wilson (1951) central-composite design. Although this design was originally developed for application in chemical research, its utility in psychological research, especially in studies of human performance, has been documented (Meyer, 1963; Simon, 1970; Williges and Simon, 1971). It is not unreasonable, however, to anticipate the need for some modification in that basic design to make it more appropriate for research involving human subjects. The purpose of this paper is to suggest several appropriate design modifications that attempt to retain as many of the positive traits of the RSM central-composite design as possible. Before discussing these modifications, a description of central-composite designs is necessary.

CENTRAL-COMPOSITE DESIGNS

Suppose that an investigator is interested in predicting radar target detection, $Y$, given levels of display resolution, $X_1$, visual angle, $X_2$, and random noise, $X_3$. Further suppose that the true relationship between target detection and the three display-related variables could be expressed as a function $f$ of the levels of $X_1$, $X_2$, and $X_3$. That is, in symbolic form

$$Y = f(X_1, X_2, \ldots, X_m) + e, \qquad (1)$$

where $m = 3$; $X_i$, $i = 1, 2, 3$, is the level of the $i$th display-related variable; $e$ is the associated experimental error; and $Y$ is the corresponding level of target detection. The particular function which describes the relationship in question is called the response surface. Of course, in practice one usually does not know just what that function is. Therefore, the investigator attempts to derive a reasonable estimate of the unknown function, basing his estimate upon the examination of representative data. In other words, the investigator attempts to approximate the response surface, the true functional relationship between response and factor levels, by using a derived polynomial equation. For example, in lieu of the function $f$, he might substitute a complete second-order polynomial in $X_1$, $X_2$, and $X_3$ of the form

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_1^2 + b_5X_2^2 + b_6X_3^2 + b_7X_1X_2 + b_8X_1X_3 + b_9X_2X_3, \qquad (2)$$

where the numerical values of $b_0$ through $b_9$ are determined empirically according to multiple-regression techniques. The complete second-order polynomial includes the linear effect of each variable, the linear-by-linear interactions, and the quadratic effect of each variable.

Factorial Design: A Data Collection Procedure

When developing an equation to approximate the response surface, the investigator measures the desired response at relatively few data points, each designated by some unique combination of independent variable or factor levels. For example, the investigator studying target detection might adopt a factorial design in which each of the three display-related variables assumes two levels, -1 and +1. Of course, these two factor levels can represent any desired real-world factor levels simply by applying the appropriate linear transformation. Determination of real-world factor levels using such a transformation is illustrated in a later section. The $2^3$, or 8, possible combinations of factor levels designate the particular set of points at which the investigator measures the response. In simple terms, the factorial design serves as a set of directions for collecting data.
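The two ideas introduced above, a two-level factorial as a list of data collection points and a complete second-order polynomial as the approximating function, can be sketched in a few lines of Python. This is a minimal illustration only: the function names `factorial_points` and `second_order_terms` are hypothetical and do not come from the paper, and the ordering of the ten predictor terms is an assumption.

```python
from itertools import combinations, product


def factorial_points(k, levels=(-1, 1)):
    """All coded factor-level combinations of a full factorial in k factors."""
    return [tuple(p) for p in product(levels, repeat=k)]


def second_order_terms(x):
    """Predictor values of a complete second-order polynomial at point x.

    Assumed order: constant, linear terms, quadratic terms,
    then linear-by-linear interactions.
    """
    linear = list(x)
    quadratic = [xi * xi for xi in x]
    interactions = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return [1] + linear + quadratic + interactions


points = factorial_points(3)
print(len(points))                         # 8 data points in the 2^3 design
print(len(second_order_terms(points[0])))  # 10 coefficients, b0 through b9
```

With only the levels -1 and +1, every squared term equals 1 at every point, so the quadratic columns cannot be separated from the constant term; the eight points alone therefore cannot support all ten coefficients.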


If the factors are continuous and quantitative, the data collected in this manner can serve as the raw input data for either a traditional analysis of variance or a multiple-regression analysis. When the investigator's aim is to derive a polynomial approximation to a response surface, rather than merely to determine the significance of the various factors, multiple regression is the more appropriate analysis. The factorial design provides the quantitative levels of the relevant factors or predictor variables, and the investigator makes direct measurements of the response level at each data point designated by the design. In the case of the preceding example, because each of the three factors (display resolution, visual angle, and random noise) assumes two distinct, quantitative levels, a first-order polynomial equation in each factor can be fitted to the data.

If the investigator suspects that target detection is at least a complete second-order function of the three display-related factors, he must measure detection performance at more than two levels of each of those variables. He could, for example, provide for a complete second-order equation in all three factors by collecting the appropriate data according to another factorial design in which each factor assumes three levels. Such a design designates a total of $3^3$, or 27, points at which target detection performance is measured, an increase of 19 data points over the previous design.

Central-Composite Design: An Alternative Data Collection Procedure

An alternative procedure could be followed to direct data collection efforts. Suppose the investigator maintained the initial two-level factorial design involving only eight unique factor combinations. He could augment that basic design by including the following $(2 \cdot 3 + 1)$, or 7, additional distinct factor combinations, expressed here as ordered triplets of factor levels:

$(0, 0, 0)$; $(-\alpha, 0, 0)$; $(\alpha, 0, 0)$; $(0, -\alpha, 0)$; $(0, \alpha, 0)$; $(0, 0, -\alpha)$; $(0, 0, \alpha)$.

Again, these factor levels can represent any desired real-world factor levels simply by applying the appropriate linear transformation. The numerical value which $\alpha$ assumes is chosen to insure certain advantageous design properties (to be discussed later). The particular $\alpha$ value is not crucial to the current discussion; suffice it to say at this point that $\alpha$ is merely one of the levels which the factors can assume.

The addition of these seven new data points to the basic factorial design results in a design composed of 15 distinct factor combinations. Yet the investigator can now fit not only a complete second-order polynomial to the resulting data, but also a polynomial involving selected higher-order predictors as well. This is usually more than adequate for approximating most response surfaces. With an increase of only seven in the number of distinct data collection points, the investigator is able to measure the response at five levels of each factor, those five levels being the values $\pm\alpha$, $\pm 1$, and 0. (The corresponding complete factorial design involving five levels of each factor entails 125 distinct data points for a single replication.) Moreover, if repeated observations were made at the center point $(0, 0, 0)$, the resulting design would provide for an estimate of experimental error variance. This error estimate allows the investigator to test the significance of the derived polynomial and each of its components, as well as testing the significance of effects not included in the derived equation.

This proposed alternative design is merely a combination or composite of a traditional $2^3$ factorial design and some strategically selected additional points (Box and Wilson, 1951). In particular, the design is a three-factor central-composite design in that the designated factor combinations, or data points, are spaced symmetrically about a central or center point


designated by the ordered triplet of factor levels $(0, 0, 0)$ as shown in Figure 1. More generally, a $K$-factor central-composite design is realized by combining a basic $2^K$ factorial with the $(2 \cdot K + 1)$ additional distinct factor combinations

$(0, 0, \ldots, 0)$; $(-\alpha, 0, \ldots, 0)$; $(\alpha, 0, \ldots, 0)$; $(0, -\alpha, \ldots, 0)$; $(0, \alpha, \ldots, 0)$; $\ldots$; $(0, 0, \ldots, -\alpha)$; $(0, 0, \ldots, \alpha)$

(Cochran and Cox, 1957, p. 343).

Note that each of the $2K$ noncenter points is defined so that all factors except one are held at the 0 level, whereas the remaining factor assumes the values $-\alpha$ and $+\alpha$, in turn. The aggregate of these $2K$ additional noncenter points is referred to as the star or axial portion of the resulting central-composite design. As the number of factors increases to five or more, a $2^{(K-p)}$ fractional factorial, where $p$ is a positive integer, is often substituted for the complete $2^K$ factorial, thereby reducing still further the number of distinct data points (see Cochran and Cox, 1957, Ch. 6A). In such instances, a $K$-factor central-composite design is realized by combining a $2^{(K-p)}$ fractional factorial with the same $(2 \cdot K + 1)$ combinations given above. More specifically, when fractional factorials are incorporated into a second-order central-composite design, one chooses the defining contrast so that all the first- and second-order components are present and are not aliases of each other. Were this restriction not observed, the first- and second-order effects would be inextricably mixed with one another. Regardless of the number of factors, however, each factor assumes five distinct levels corresponding to the coded values $\pm\alpha$, $\pm 1$, and 0. Moreover, the designated factor combinations fall symmetrically about the center point $(0, 0, \ldots, 0)$.

[Figure 1 (labels: star, center point; axis $X_1$)]
Figure 1. Three-factor, central-composite design.

Again, if the factors and the response are continuous quantitative entities, the data can be analyzed using multiple-regression techniques. To test for the significance of the derived polynomial and its components and the significance of all other terms not included in the equation, the investigator needs an estimate of experimental error variance. The central-composite design provides for an estimate of error by repeating observations at the center point $(0, 0, \ldots, 0)$. Choosing the appropriate number of replications results in a design in which the standard error of estimate is roughly the same at all points within the immediate vicinity of the design. Hence, the estimate of error at the center is used as an estimate of error throughout the entire $K$-space, thereby minimizing redundancy. Too many replications at the center yield standard errors of estimate which increase rapidly for those points farther from the center. On the other hand, with too few replications of the center point, the standard error is apt to be greater at the center than at the surrounding data points. In the case of a three-factor central-composite design, for example, the suggested number of replications at the center point is six, thereby increasing the total number of observations to 20 (see Table 1). Although the derivation procedures are beyond the scope of this discussion, procedures exist for determining the


optimum number of center points of a $K$-factor design (Box and Hunter, 1957).

TABLE 1
Coded Value Coordinates of Data Points for a Second-Order Central-Composite Design in Three Variables

Observation    X1          X2          X3
 1              1.0        -1.0         1.0
 2              1.0         1.0        -1.0
 3             -1.0         1.0         1.0
 4             -1.0        -1.0        -1.0
 5             -1.0         1.0        -1.0
 6             -1.0        -1.0         1.0
 7              1.0        -1.0        -1.0
 8              1.0         1.0         1.0
 9             $-\alpha$    0.0         0.0
10              0.0        $-\alpha$    0.0
11              0.0         0.0        $-\alpha$
12             $\alpha$     0.0         0.0
13              0.0        $\alpha$     0.0
14              0.0         0.0        $\alpha$
15              0.0         0.0         0.0
16              0.0         0.0         0.0
17              0.0         0.0         0.0
18              0.0         0.0         0.0
19              0.0         0.0         0.0
20              0.0         0.0         0.0

Design Limitations

Of course, reducing the size of an experiment by eliminating data points has its price. Coincidental with the reduction in data is a reduction in obtained information. In particular, when fractional factorials are incorporated into the central-composite design, at least one factorial effect, the defining contrast, is lost entirely. Prudent choice of the defining contrast(s), however, usually results in losing information concerning some higher-order interaction(s) which seldom affects performance anyway. In addition, interpretation of that information which is provided by a fractional-factorial, central-composite design is somewhat more ambiguous in that certain effects are mixed with one another, as indicated above. By choosing the highest-order interaction as the defining contrast, the experimenter can insure that first- and second-order effects are not confounded with one another.

Rotatability

One desirable property of some central-composite designs is rotatability (Box and Hunter, 1957). Rotatability exists when the variance of the predicted response is the same for all data points equidistant from the center. This is an especially convenient design quality in exploratory work when the investigator is ignorant of the response surface and its relative orientation to the orthogonal factor axes. Rotatability imposes the additional constraint on factor-level selection that the value of $\alpha$ be equal to $2^{K/4}$ (Box and Hunter, 1957). When a $2^{(K-p)}$ fractional factorial design is used in place of the full $2^K$ factorial, then $\alpha$ must equal $2^{(K-p)/4}$ if rotatability is to exist (Box and Hunter, 1957). Thus, if the hypothetical three-factor design diagrammed in Figure 1 is to be rotatable, the $\alpha$ value must be 1.682, because $2^{K/4} = 2^{3/4} = 8^{1/4} = 1.682$.

Selection of Factor Levels

The first, and perhaps most crucial, step in selecting factor levels for a central-composite design (or even a basic factorial) is to determine the experimental range of each factor to be incorporated into the design. Because polynomials cannot be extrapolated with confidence, the derived polynomial equation should be considered an approximation to the response surface only within the region defined by the respective factor ranges. When appropriately transformed, the limiting real-world values of each factor, as determined by the selected range, yield the coded values $-\alpha$ and $+\alpha$, and the center of that range yields the coded value 0. For example, suppose that the values of interest for display resolution range from 168 to 504 TV lines/decimeter. Further suppose that $-\alpha$ and $+\alpha$ assume the values -1.68 and +1.68, respectively, thus insuring that the resulting design is rotatable. The investigator's next task is to determine the linear transformation


which: (a) when applied to the center of the factor range, 336, yields the coded value 0, and (b) when applied to the lower and upper limiting values of display resolution, 168 and 504, yields the coded values -1.68 and +1.68, respectively. It can be demonstrated that the following linear transformation satisfies both these requirements:

$$X_1^* = \frac{X_1 - 336}{100},$$

where $X_1^*$ is a coded factor level, and $X_1$ is the corresponding real-world factor level. The remaining two levels of display resolution are determined by solving for $X_1$ where $X_1^*$ assumes the values -1 and +1, in turn. Therefore, the appropriate five real-world levels of display resolution are 168, 236, 336, 436, and 504 TV lines/decimeter.

The appropriate real-world levels of all other experimental factors are determined in like manner. In each case, (a) the range of the factor and the center point are established, (b) the appropriate linear transformation is determined, and (c) the remaining two levels of the factor are determined in accordance with the transformation. Although coding the appropriate real-world factor levels, once they are determined, is not necessary, the use of linear transformations of the data simplifies analysis without affecting the result of any subsequent statistical tests. On occasion, this rigid demand regarding the selection of data points makes the central-composite design impractical for some human factors studies. For example, variables such as target type, target complexity, and briefing instructions are not readily quantifiable. Moreover, it is sometimes neither practical nor feasible to measure even certain quantifiable variables at the five levels specified by the central-composite design. Alternative RSM designs have been developed which require fewer than five levels (Box and Behnken, 1960, and Draper and Stoneman, 1968).

Blocking

An additional feature of central-composite designs which affords the investigator greater efficiency and flexibility is blocking. Under blocking conditions, subsets of the complete set of data collection points are studied together. If the blocking is orthogonal, any differences in mean performance among blocks are independent of any main effects due to the independent variable manipulations, and as such, they do not affect the underlying quantitative relationship between factors and performance. If blocking were not orthogonal, the derived prediction equation would be a function of block effects as well as main effects. This aspect of design is valuable to the human factors engineer who is concerned with isolating potential effects due to such factors as different experimenters, changes in apparatus, and variable environmental conditions. In our example of the investigator studying radar target detection as affected by display resolution, visual angle, and random noise, it is unlikely that all the necessary data can be collected during a single flight or perhaps not even in the same aircraft. By taking advantage of orthogonal blocking techniques, the investigator can guard against the parameters of the derived prediction equation being affected by such differences. For example, a block could refer to that set of observations which was made during any given flight.

Blocking a central-composite design is readily accomplished by subdividing the design into two parts: (a) the $2^K$ factorial (or $2^{(K-p)}$ fractional factorial) portion and (b) the set of $2K$ points comprising the star or axial portion of the design. As the number of factors increases, the $2^K$ factorial (or $2^{(K-p)}$ fractional factorial) can be subdivided further into additional blocks by using fractional factorials. When fractional factorials are used for blocking second-order designs, care must be taken not to confound any first- or second-order effects with


blocks, and none of these effects should be an alias of any other within a given block.

Orthogonal blocking places additional constraints on the central-composite design concerning the selection of $\alpha$ and the number and distribution of center points. These parameters must be chosen to insure that the average predicted response level is the same for every block. Orthogonal blocking of the central-composite design requires that the following condition be met (Box and Hunter, 1957, p. 230):

[equation not recovered] (3)

or, in the event that a $2^{(K-p)}$ fractional factorial is incorporated into the design,

[equation not recovered] (4)

where $N_{co}$ and $N_{so}$ are the number of center points added to the intact $2^K$ factorial (or $2^{(K-p)}$ fractional factorial) portion and the $2K$ star portion of the design, respectively. $N_c$ and $N_s$ reflect the number of noncenter points in the $2^K$ factorial (or $2^{(K-p)}$ fractional factorial) and in the $2K$ star, respectively. In addition, if the $2^K$ factorial (or $2^{(K-p)}$ fractional factorial) is itself subdivided into blocks, the $N_{co}$ center points added to that portion of the design must be distributed equally across the resulting blocks.

Given the proposed design in Figure 1 for studying radar target detection, orthogonal blocking can be achieved by dividing the 20 data points given in Table 1 into subsets of 6, 6, and 8 observations, as depicted in Figure 2. The first two blocks each represent one-half replicates of the complete $2^3$ factorial portion, and the third block is the six-point star portion. Two center points have been included in each of the three blocks for replication. Solving Equation 3 for $\alpha$ yields a value of 1.633. Given this revised value of $\alpha$, the investigator must revise his choices of real-world factor levels for display resolution. Transforming $X_1$, where $X_1^*$ assumes the revised $\alpha$ values -1.633 and +1.633, in turn, yields revised levels for the lower and upper limiting values of display resolution; the revised real-world levels are 173 and 499 TV lines/decimeter, respectively. Similarly, it can be shown that the change in $\alpha$ value does not necessitate a change in the three intermediate real-world values of display resolution. Hence, the five levels appropriate to the orthogonally-blocked design are 173, 236, 336, 436, and 499 TV lines/decimeter.

[Figure 2. Block 3 points as shown: (-1.633, 0, 0); (0, -1.633, 0); (0, 0, -1.633); (1.633, 0, 0); (0, 1.633, 0); (0, 0, 1.633); (0, 0, 0); (0, 0, 0)]
Figure 2. Orthogonal blocking of second-order, central-composite design in three variables with coded value coordinates of data points.

The investigator must also recompute the appropriate real-world levels of visual angle and random noise in like manner. Note that the value of $\alpha$ required to insure orthogonality is slightly different from the 1.682 value required for rotatability. To achieve orthogonal blocking, it is often necessary to sacrifice rotatability, although the appropriate $\alpha$ values are usually quite similar. In human factors applications, however, the potential gains from orthogonal blocking probably outweigh the risk of forfeiting rotatability.

Added flexibility can accrue from use of blocking techniques, as Box and Hunter (1957)


illustrated when they employed blocking to facilitate exploration of a response surface. A properly blocked design permits research to be conducted in stages. Each block of data points from the complete second-order design constitutes a first-order, rotatable central-composite design. Gathering data from the first series of blocks, the investigator can judge, for example, whether or not any of the original experimental variables merits being dropped from further consideration or whether or not an equation of higher order than a linear polynomial is needed to explain the data adequately. If so, the design can be altered here rather than after all data are collected. The ability to make such decisions at an early stage may mean that the investigator is able to conclude his study after collection of considerably fewer data than he had anticipated.

Analyses

Basically, two standard statistical analyses are conducted on the data accrued from an RSM design. First, a least-squares multiple-regression analysis is performed on the data to determine the functional relationship between performance ($Y$) and the system variables ($X$). Multiple regression is merely an extension of simple linear regression such that the multiple-regression analysis includes more than one predictor or terms other than linear components. Because of the numerical complexity involved in multiple regression, matrix algebra ordinarily is used for the calculation of the regression-equation coefficients. In addition, a matrix algebra solution using correlation matrices rather than raw score matrices provides a flexible and efficient means for handling a variety of possible regression equations within the same computer program. A correlation matrix solution results in a standard regression equation (variables are stated in terms of z scores and the intercept is 0) that can be converted easily into a nonstandard or raw score regression equation.

The second analysis usually performed on data obtained from an RSM design is an analysis of variance performed on the regression analysis. Essentially, the analysis of variance partitions the sums of squares into variation due to regression and variation not due to regression (residual). The regression sum of squares is subdivided into the variation of the particular partial regression weights resulting from the preceding multiple-regression analysis. The residual sum of squares can also be further subdivided into block effects, subject effects, lack of fit, and error. The main purposes of this analysis of variance are to test the significance of the given partial-regression weights and to test for a significant lack of fit which might indicate additional parameters are necessary in the regression equation. All of the sums of squares are converted to mean squares by dividing by the appropriate degrees of freedom. The resulting $F$ ratios are constructed by using the error mean square as the denominator.

Consider again the study of radar target detection performance, $Y$, as a function of display resolution, visual angle, and random noise, $X_1$, $X_2$, and $X_3$, respectively. Hypothetical data for such a study are presented in Table 2. A multiple-regression analysis of these hypothetical data yields the following generalized, first-order prediction equation:

$$Y = 16.115 - 1.203X_1 - 0.503X_2 + 0.847X_3. \qquad (5)$$

Substituting given levels of the independent variables into this equation affords the investigator a corresponding predicted level of detection latency.

The results of a subsequent analysis of variance performed on the regression analysis appear in Table 3. The derived equation accounts for nearly 74% of the total variance in detection latency. Each of the coefficients, excluding the constant term $b_0$, is significant at


well beyond the .01 level. Blocks are significant; however, because blocking is orthogonal, the values of the regression weights have not been affected. Noting that the lack-of-fit term is significant, the investigator will submit his data to a second multiple regression analysis to determine a higher-order prediction equation. For a detailed discussion of the analysis procedures, see Clark and Williges, 1972.

TABLE 2
Hypothetical Data in Coded Form for a Three-Factor, Second-Order, RSM Central-Composite Design

Observation  Block  X1 (Resolution)  X2 (Visual Angle)  X3 (Random Noise)  Y (Detection Latency, Seconds)
 1           1       1.00            -1.00               1.00              16.2
 2           1       1.00             1.00              -1.00              14.3
 3           1      -1.00             1.00               1.00              17.0
 4           1      -1.00            -1.00              -1.00              17.4
 5           1       0.00             0.00               0.00              15.5
 6           1       0.00             0.00               0.00              15.8
 7           2      -1.00             1.00              -1.00              16.8
 8           2      -1.00            -1.00               1.00              18.1
 9           2       1.00            -1.00              -1.00              14.9
10           2       1.00             1.00               1.00              16.2
11           2       0.00             0.00               0.00              15.0
12           2       0.00             0.00               0.00              14.8
13           3      -1.63             0.00               0.00              19.0
14           3       0.00            -1.63               0.00              17.3
15           3       0.00             0.00              -1.63              14.8
16           3       1.63             0.00               0.00              13.9
17           3       0.00             1.63               0.00              14.6
18           3       0.00             0.00               1.63              19.2
19           3       0.00             0.00               0.00              15.8
20           3       0.00             0.00               0.00              15.7

TABLE 3
First-Order Regression Analysis of Variance Summary Table for Hypothetical Detection Latency Data

Source           df      MS        F
Regression       (3)    10.73    536.50**
  b1              1     19.26    963.00**
  b2              1      3.37    168.51**
  b3              1      9.54    477.00**
Residual        (16)     0.71
  Blocks          2      0.21     10.50*
  Lack of Fit    11      0.99     49.50**
  Error           3      0.02
Total           (19)

* p < .05
** p < .001
Multiple Regression Coefficient, R = 0.86
Coefficient of Determination, R^2 = 0.74

DESIGN CONSIDERATIONS

In a recent article, Williges and Simon (1971) discussed several general advantages of the RSM technique which contribute to its potential value in human factors research. Among the most obvious benefits is the economy of data collection. Not only is sampling restricted to the experimental region of greatest interest, but also repeated observations are restricted to the center point of that region. As originally conceived, RSM was developed as a methodology for quickly locating optimums by means of a series of experiments each dependent on the results of the preceding one. More specifically, Box and Wilson (1951) were interested in determining the optimum combination of factor levels needed to produce the maximum yield from a chemical reaction. However, human factors engineers are largely interested in deriving global prediction equations which allow them to predict performance levels accurately throughout an entire range of factor levels.
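The first-order analysis of the hypothetical detection latency data can be retraced with a short least-squares sketch. The code below is an illustration and not the authors' program: it solves the raw-score normal equations directly rather than using the correlation-matrix approach described in the Analyses section, and the names `DATA`, `solve`, `fit_first_order`, and `r_squared` are hypothetical. The data rows are the Table 2 values.

```python
# Table 2 rows: (block, X1, X2, X3, Y); coded levels, latency in seconds.
DATA = [
    (1, 1, -1, 1, 16.2), (1, 1, 1, -1, 14.3), (1, -1, 1, 1, 17.0),
    (1, -1, -1, -1, 17.4), (1, 0, 0, 0, 15.5), (1, 0, 0, 0, 15.8),
    (2, -1, 1, -1, 16.8), (2, -1, -1, 1, 18.1), (2, 1, -1, -1, 14.9),
    (2, 1, 1, 1, 16.2), (2, 0, 0, 0, 15.0), (2, 0, 0, 0, 14.8),
    (3, -1.63, 0, 0, 19.0), (3, 0, -1.63, 0, 17.3), (3, 0, 0, -1.63, 14.8),
    (3, 1.63, 0, 0, 13.9), (3, 0, 1.63, 0, 14.6), (3, 0, 0, 1.63, 19.2),
    (3, 0, 0, 0, 15.8), (3, 0, 0, 0, 15.7),
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_first_order(data):
    """Least-squares weights b0..b3 from the normal equations X'X b = X'y."""
    rows = [[1.0, x1, x2, x3] for _, x1, x2, x3, _ in data]
    y = [obs[-1] for obs in data]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    return solve(xtx, xty)


def r_squared(data, b):
    """Proportion of total variation in Y accounted for by the fitted equation."""
    y = [obs[-1] for obs in data]
    mean = sum(y) / len(y)
    fitted = [b[0] + b[1] * x1 + b[2] * x2 + b[3] * x3
              for _, x1, x2, x3, _ in data]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


b = fit_first_order(DATA)
print([round(v, 3) for v in b])       # [16.115, -1.203, -0.503, 0.847]
print(round(r_squared(DATA, b), 2))   # 0.74
```

The weights agree with Equation 5 and the coefficient of determination agrees with Table 3. Block, lack-of-fit, and error terms are not partitioned in this sketch.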

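The coded and real-world levels used in the radar example can be checked numerically. The sketch below is illustrative only, and its function names are hypothetical. `rotatable_alpha` follows the rotatability rule stated in the text. `orthogonal_blocking_alpha` assumes the standard textbook form of the orthogonal-blocking condition, $\alpha^2 = N_c(N_s + N_{so}) / [2(N_c + N_{co})]$, which may differ in form from Equation 3 but reproduces the value 1.633 quoted for the three-block design. `decode` inverts the display resolution transformation with center 336 and unit 100.

```python
import math


def rotatable_alpha(k, p=0):
    """Axial distance for rotatability with a 2^(k-p) factorial portion."""
    return 2 ** ((k - p) / 4)


def orthogonal_blocking_alpha(n_c, n_co, n_s, n_so):
    """Axial distance for orthogonal blocking (assumed standard condition).

    n_c, n_s: noncenter points in the factorial and star portions.
    n_co, n_so: center points added to the factorial and star portions.
    """
    return math.sqrt(n_c * (n_s + n_so) / (2 * (n_c + n_co)))


def decode(coded, center, unit):
    """Real-world level corresponding to a coded level."""
    return center + unit * coded


print(round(rotatable_alpha(3), 3))                     # 1.682
print(round(orthogonal_blocking_alpha(8, 4, 6, 2), 3))  # 1.633

# Display resolution levels (TV lines/decimeter) for both alpha values.
for alpha in (1.68, 1.633):
    coded_levels = (-alpha, -1, 0, 1, alpha)
    print([round(decode(c, 336, 100)) for c in coded_levels])
# [168, 236, 336, 436, 504]
# [173, 236, 336, 436, 499]
```

In the call `orthogonal_blocking_alpha(8, 4, 6, 2)`, the eight factorial points carry four center points (two in each half-replicate block) and the six star points carry two, matching the 6, 6, and 8 observation blocks described for Figure 2.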

When the goal is to approximate an entire response surface (rather than merely that portion of the surface surrounding the optimum), limiting multiple observations to a single experimental point may not be the most judicious strategy. Indeed, the actual variability in response may be so great across subjects and data points that it would be unrealistic to presume the standard error of estimate at the center point is an adequate estimate of error at all points. A recent study concerning transfer of training (Williges and Baron, 1973) affords a striking demonstration of the effect of estimating experimental error at a single replicated point as opposed to estimating it across a series of replicated points. When replications were restricted to the center point, none of the experimental factors was found to contribute significantly to the response level, despite their apparent importance in the resulting prediction equation. When multiple observations were made at each of the data points, however, the subsequent analysis revealed that some of the experimental variables were significant in determining the response level. Of course, when the basic RSM central-composite design is modified in such a manner, methodological questions arise concerning how best to retain the positive attributes of the basic design, while still making the modifications appropriate to research with human subjects. For example, should repeated observations be made at more than one experimental point?; should all data be retained or should they be collapsed?; should different subjects be observed at each experimental point or should the same subjects be observed at all points?; and under what conditions are particular design variations especially appropriate?

The following discussion proposes several design variations appropriate to human factors research, together with the ensuing methodological considerations. A generalized computer program to analyze data from each of these design variations, as well as data from the basic RSM central-composite design, has been developed by Clark, Williges, and Carmer (1971), and a detailed discussion of the statistical procedures is presented by Clark and Williges (1972).

Collapsed Designs

The simplest modification is achieved merely by replicating the entire central-composite design a given number of times. Consider, for example, the orthogonally-blocked, RSM central-composite design depicted in Figure 2. Suppose the investigator elects to replicate that design five times. The data points remain the same as those listed under Figure 2. Now, however, the design involves a total of 100 observations, over a total of 15 distinct factor combinations. Block 1 now contains 30 observations, Block 2 contains 30 observations, and Block 3 contains 40 observations. Note that, although multiple observations have been made at each of the experimental points, the center point has still been replicated six times more than any other point. Although the points comprising the factorial and the star have each been replicated five times, the center point has been replicated 30 times, 10 times within each of the three blocks.

At this point the investigator must decide whether or not to retain and analyze directly the data corresponding to all 100 observations. He could collapse his data across those subjects within the same block who were observed at the same experimental point and then analyze the collapsed data without having to make any modifications in calculation procedures. The net effect of collapsing in this manner is a data matrix identical in form and number of observations to one resulting from the original blocked RSM central-composite design shown in Figure 2. Now, however, the data are combined values obtained from collapsing rather than values representing a single observation. In addition, estimates of experimental error are obtained from the resulting six center points, each of which is a collapsed score.
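The replication and collapsing steps just described can be sketched as follows. The block layout is an assumption: the assignment of factorial points to Blocks 1 and 2 is borrowed from Figure 3, each block is given two center points, and the star distance is the 1.633 quoted in the text for the Figure 2 design. The median is used as the combining statistic, the scores are made up solely to exercise the code, and the names `replicate` and `collapse` are hypothetical.

```python
from collections import Counter
from statistics import median

ALPHA = 1.633                 # star distance quoted for the Figure 2 design
CENTER = (0.0, 0.0, 0.0)

# Assumed layout of the basic blocked design (20 data points, 6 at the center).
BLOCKS = {
    1: [(1, -1, 1), (1, 1, -1), (-1, 1, 1), (-1, -1, -1), CENTER, CENTER],
    2: [(-1, 1, -1), (-1, -1, 1), (1, -1, -1), (1, 1, 1), CENTER, CENTER],
    3: [(-ALPHA, 0, 0), (0, -ALPHA, 0), (0, 0, -ALPHA),
        (ALPHA, 0, 0), (0, ALPHA, 0), (0, 0, ALPHA), CENTER, CENTER],
}


def replicate(blocks, times):
    """Return one (block, slot, point) record per observation."""
    return [(block, slot, point)
            for block, points in blocks.items()
            for slot, point in enumerate(points)
            for _ in range(times)]


def collapse(observations, scores):
    """Collapse the scores within each block and design slot to their median."""
    cells = {}
    for key, score in zip(observations, scores):
        cells.setdefault(key, []).append(score)
    return {key: median(values) for key, values in cells.items()}


obs = replicate(BLOCKS, times=5)
per_block = Counter(block for block, _, _ in obs)
center_obs = sum(1 for _, _, point in obs if point == CENTER)
distinct_points = {point for _, _, point in obs}

# Made-up latencies, used only to exercise the collapsing step.
scores = [15.0 + 0.1 * (i % 7) for i in range(len(obs))]
collapsed = collapse(obs, scores)

print(len(obs), dict(per_block), center_obs, len(distinct_points), len(collapsed))
```

Collapsing within each design slot, rather than pooling all center observations in a block, is the choice that leaves six collapsed center scores, as the text requires.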


This procedure has the advantage of retaining all the features of a RSM central-composite design, as well as adding stability to the experimental data points, because the collapsed data are not heavily biased by the results of any one extreme subject. This is especially valid if the median is used as the combining statistic. Because it is probably of little value to develop unique prediction equations for each subject, such a collapsing procedure may be appropriate even though degrees of freedom are lost from the design.

A recent cross-validation study (Williges and North, 1973), however, illustrates a potential drawback of collapsing data prior to analysis. When median data were used to derive prediction equations, the resulting multiple regression coefficient R was notably higher than the corresponding value resulting from the comparable noncollapsed data analysis. However, the shrinkage of R from the original sample to the cross-validation sample was very pronounced when regression was based on collapsed data. There was far greater shrinkage than that predicted by the modified Wherry shrinkage formula (Herzberg, 1969; Lord and Novick, 1968). On the other hand, shrinkage was minimal when derivation was based upon noncollapsed data. Hence, for predicting response levels for individuals not included in the derivation sample, the collapsed analysis did not afford appreciably better prediction despite the deceivingly greater accuracy of the derived prediction equation as suggested by the initially-high multiple R value. Indeed, the multiple R deriving from noncollapsed data was far more representative of the predictive accuracy of the equation.

Noncollapsed Designs

Suppose that the investigator replicating the blocked central-composite design chooses not to collapse his data across subjects. Rather, he retains the data of each subject for subsequent analysis. By retaining all this information he gains degrees of freedom for the error term which were previously lost by collapsing the data. Error is now estimated across all points at which replications occur, instead of using only the estimate of the error at the center point as in the collapsed design and the original design. It is quite possible that there may be certain areas of the experimental region in which there is considerable variability in response and other areas in which the variability is negligible. This is particularly true if the range of factor levels under consideration is sizable. Given this variability, it is not reasonable to use the estimate of error obtained from only one area as an estimate of error throughout the experimental region. The prediction equation which one develops should afford a reasonable description of the entire response surface, not merely a selected area of that response surface.

When noncollapsed designs are used, the investigator must make another major decision with respect to his selected design. If, due to the nature of his research problem, he chooses to observe different subjects at each of the experimental points, the resulting study constitutes a between-subjects design. If, on the other hand, he elects to observe each of a set of subjects under all experimental conditions, the resulting study constitutes a within-subject design. The choice of a between- versus a within-subject design is dictated by the particular question which the researcher is investigating. In either case, if the necessary restrictions are observed, the design conforms to the basic central-composite design.

Between-subjects designs. Given certain research questions, observing the same subjects under more than one experimental condition would lead to invalid conclusions concerning the effect of the various experimental manipulations. Consider, for example, an investigation of the comparative efficacy of selected training methods. Certainly Training Method B cannot be evaluated accurately by observing the performance of subjects who have previously been trained to criterion under Method A, because


the observed performance may be a function of not only the condition itself, but also of the preceding condition which he has experienced. In such a case it is imperative that the investigator adopt a between-subjects design, observing each subject under only one experimental condition. The transfer of training study cited earlier (Williges and Baron, 1973) provides such an example.

In the detection latency study which replicates the orthogonally-blocked central-composite design of Figure 2 five times, if 100 different subjects are observed across those 20 data points (6 of which are the center point), a between-subjects design is realized. Because the full central-composite design is being replicated intact, the necessary relationship guaranteeing orthogonal blocking, as given in Equation 3, is still satisfied. As in the original design, the center point is being replicated six times more than any other point. Although experimental error is now being estimated across all data points and includes subject-to-subject variation, the results of a subsequent analysis to determine a first-order prediction equation are of the same type shown in Table 3. The increased number of observations is reflected in the values for total degrees of freedom, residual degrees of freedom, and error degrees of freedom; the adjusted values are 99, 96, and 83, respectively. Meyer (1963) has used this design procedure successfully in a human learning experiment.

If, indeed, the variability in response at each of a series of data points is used as an estimate of experimental error variance, there is no need to replicate one point more than any other. In the original central-composite design, in which only the center point is replicated, the additional observations at that point provide the investigator with his only estimate of error. But, with repeated observations occurring at each of the experimental points, there appears no need to make more observations at the center merely for the sake of obtaining an estimate of error. The investigator could choose instead to replicate each of the experimental points, including the center, an equal number of times, while still maintaining the use of different subjects for each observation.

Eliminating observations at the center point, however, has implications for orthogonal blocking. It is now necessary to adjust the value of α accordingly, because the original blocking has been disturbed due to the elimination of center points from the factorial portion of the design and the reduction in the number of center points in the star portion of the design. With respect to the target detection latency example in which repeated observations are made at each of 15 unique experimental points, making the appropriate adjustment results in an α value of 1.87 rather than 1.633, as defined by Equation 3. This change in the α value is reflected in Figure 3 which designates the orthogonal blocking of the 15 unique experimental points. Note the reduction of data collection points within each of the three blocks, and the complete absence of center points in Blocks 1 and 2. Changing the coded value of α also necessitates reselecting the real-world levels of the various factors under study. Recalculating the levels of display resolu-

BLOCK 1          BLOCK 2          BLOCK 3

( 1, -1,  1)     (-1,  1, -1)     (-1.87, 0, 0)
( 1,  1, -1)     (-1, -1,  1)     (0, -1.87, 0)
(-1,  1,  1)     ( 1, -1, -1)     (0, 0, -1.87)
(-1, -1, -1)     ( 1,  1,  1)     (1.87, 0, 0)
                                  (0, 1.87, 0)
                                  (0, 0, 1.87)
                                  (0, 0, 0)

Figure 3. Orthogonal blocking of second-order, central-composite design in three variables with coded value coordinates of data points employing equal number of replications.
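The adjustment of α for orthogonal blocking can be illustrated with a short calculation. The sketch assumes that the blocking condition has the form given by Box and Hunter (1957), α² = k(1 + n_s0/n_s)/(1 + n_c0/n_c), where n_c and n_s are the numbers of factorial and star points and n_c0 and n_s0 are the numbers of center points run with each portion; whether this matches the Equation 3 referred to in the text exactly is an assumption. The center value of 336 and unit spacing of 100 are inferred from the resolution levels quoted in the text, and the function names are hypothetical.

```python
import math


def alpha_for_orthogonal_blocking(k, n_fact, n_fact_center, n_star, n_star_center):
    """Star distance satisfying the assumed orthogonal-blocking condition.

    alpha^2 = k * (1 + n_star_center / n_star) / (1 + n_fact_center / n_fact)
    """
    ratio = (1 + n_star_center / n_star) / (1 + n_fact_center / n_fact)
    return math.sqrt(k * ratio)


def real_levels(center, unit, alpha):
    """Map the coded levels -alpha, -1, 0, 1, alpha to real-world values."""
    return [center + coded * unit for coded in (-alpha, -1.0, 0.0, 1.0, alpha)]


# Basic blocked design: 8 factorial points run with 4 center points
# (2 in each factorial block), 6 star points run with 2 center points.
alpha_basic = alpha_for_orthogonal_blocking(3, 8, 4, 6, 2)

# Equal-replication design of Figure 3: no center points in the factorial
# blocks, a single center point in the star block.
alpha_equal = alpha_for_orthogonal_blocking(3, 8, 0, 6, 1)

# Assumed center (336) and unit spacing (100) for display resolution.
levels = [round(value) for value in real_levels(336, 100, alpha_equal)]

print(round(alpha_basic, 3), round(alpha_equal, 2), levels)
```

Under these assumptions the two α values agree with the 1.633 and 1.87 quoted in the text, and the rounded levels agree with the five resolution levels reported for the modified design.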


tion, for example, the investigator learns that the five levels appropriate to the new orthogonally blocked design are 149, 236, 336, 436, and 523. Selecting these five levels retains the center of the experimental region but increases its range beyond that indicated in Figure 2. Replicating this modified RSM central-composite design five times, the investigator makes a total of 75 observations, 20 in Block 1, 20 in Block 2, and 35 in Block 3. Submitting these 75 observations to direct analysis to determine a first-order prediction equation yields results similar to those shown in Table 3. Again, the change in design is reflected in corresponding changes in values of total degrees of freedom, residual degrees of freedom, and error degrees of freedom; the adjusted values are 74, 71, and 60, respectively.

Within-subject design. On occasion the objectives of an experiment make it appropriate and desirable to observe each subject in each treatment condition. In such a case, each individual serves as his own control, and between-subjects variability does not affect the experimental conditions. Moreover, observing the same set of subjects under each treatment condition affords another obvious advantage over the between-subjects designs in that fewer subjects are needed to conduct the study; albeit one may encounter the familiar problem of subject attrition. Of course, this design strategy is not appropriate when a subject’s performance in one condition is affected by prior experience with any of the other conditions. As previously mentioned, a within-subject design is inappropriate for studying differential training effectiveness. However, it could be used effectively to investigate the differential suitability of various display formats to enhance target detection where there is little or no differential transfer from display to display. When these within-subject designs are used, caution must be exercised to implement the proper counterbalancing so as to avoid spurious sequence effects.

The within-subject design combines several features of the RSM central-composite design variations previously discussed. Again, a check should be made to insure that the selected α value guarantees orthogonality in the case of blocked designs, or rotatability in the case of unblocked designs. The appropriate real-world levels of the experimental factors are then determined accordingly. Data are retained uncollapsed from repeated observations made at each of the experimental points, thereby affording increased degrees of freedom for the resulting error term. As in the other design variations, the within-subject design permits tests for the significance of blocking and of lack of fit as well as tests of individual partial regression coefficients. In addition, a subject term can be isolated and tested for significance. Because subjects are completely crossed with treatments (every subject receives every treatment once), one can refine the estimate of experimental error variance by accounting for the variability within the individual subjects after assessing the variability within treatment conditions. In a within-subject design, the error term which results from merely accounting for the variability of response at the experimental points is comprised of intersubject variations, the interactions between subjects and treatment conditions, and random error. By removing the subject effect, a better estimate of experimental error is available for subsequent tests for significance. Moreover, if one assumes no interactions between subjects and treatment conditions, one can test the isolated subject term to determine the existence of significant intersubject variation. (For greater detail concerning the appropriate analysis see Clark and Williges, 1972.)

By way of example, the same four subjects might be observed at each of the 15 experimental points designated in Figure 3, thereby yielding a total of 60 observations. Hypothetical data for such a design are presented in Table 4. Note that the 1.87 value for α is still appropriate because all 15 points, including the center point, are being replicated an equal


TABLE 4

Hypothetical Data in Coded Form for a Three-Factor, Second-Order, RSM Central-Composite Design Using Repeated Measures on Four Subjects

                                            Detection Latency (Seconds)
                                                 for Four Subjects
Resolution   Visual Angle   Random Noise     S1      S2      S3      S4

   1.00         -1.00           1.00        15.8    15.9    16.1    16.4
   1.00          1.00          -1.00        14.3    14.5    14.0    14.8
  -1.00          1.00           1.00        17.0    17.3    17.1    16.9
  -1.00         -1.00          -1.00        17.4    17.5    17.0    17.3
  -1.00          1.00          -1.00        16.8    16.7    17.0    17.0
  -1.00         -1.00           1.00        18.1    18.3    18.6    18.1
   1.00         -1.00          -1.00        14.9    15.2    14.5    15.0
   1.00          1.00           1.00        16.2    16.7    16.4    15.9
  -1.87          0.00           0.00        19.0    19.1    18.9    19.5
   0.00         -1.87           0.00        17.3    16.9    17.4    16.8
   0.00          0.00          -1.87        15.1    15.3    14.4    15.0
   1.87          0.00           0.00        13.9    14.2    13.7    14.1
   0.00          1.87           0.00        14.9    15.0    14.8    15.0
   0.00          0.00           1.87        19.2    19.0    20.0    18.9
   0.00          0.00           0.00        15.8    16.1    16.4    16.0
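The first-order prediction equation reported for these data (Equation 6) can be checked directly from Table 4. The following sketch takes advantage of the fact that the three coded columns each sum to zero and are mutually orthogonal, so that the least-squares intercept is the grand mean and each slope is Σxy/Σx²; it is a minimal illustration under that assumption rather than the authors' analysis program, and the function names are hypothetical.

```python
A = 1.87  # coded star-point distance used in Table 4

# ((resolution, visual angle, random noise), latencies of subjects S1..S4)
TABLE_4 = [
    ((1, -1, 1), (15.8, 15.9, 16.1, 16.4)),
    ((1, 1, -1), (14.3, 14.5, 14.0, 14.8)),
    ((-1, 1, 1), (17.0, 17.3, 17.1, 16.9)),
    ((-1, -1, -1), (17.4, 17.5, 17.0, 17.3)),
    ((-1, 1, -1), (16.8, 16.7, 17.0, 17.0)),
    ((-1, -1, 1), (18.1, 18.3, 18.6, 18.1)),
    ((1, -1, -1), (14.9, 15.2, 14.5, 15.0)),
    ((1, 1, 1), (16.2, 16.7, 16.4, 15.9)),
    ((-A, 0, 0), (19.0, 19.1, 18.9, 19.5)),
    ((0, -A, 0), (17.3, 16.9, 17.4, 16.8)),
    ((0, 0, -A), (15.1, 15.3, 14.4, 15.0)),
    ((A, 0, 0), (13.9, 14.2, 13.7, 14.1)),
    ((0, A, 0), (14.9, 15.0, 14.8, 15.0)),
    ((0, 0, A), (19.2, 19.0, 20.0, 18.9)),
    ((0, 0, 0), (15.8, 16.1, 16.4, 16.0)),
]


def fit_first_order(rows):
    """Fit y = b0 + b1*x1 + b2*x2 + b3*x3 for an orthogonal, centered design."""
    pairs = [(x, y) for x, ys in rows for y in ys]
    b0 = sum(y for _, y in pairs) / len(pairs)      # grand mean
    slopes, sums_of_squares = [], []
    for j in range(3):
        sxy = sum(x[j] * y for x, y in pairs)
        sxx = sum(x[j] ** 2 for x, _ in pairs)
        slopes.append(sxy / sxx)                    # slope for factor j
        sums_of_squares.append(sxy ** 2 / sxx)      # its 1-df sum of squares
    return b0, slopes, sums_of_squares


def predict(b0, slopes, x):
    """Predicted detection latency at coded factor levels x."""
    return b0 + sum(b * xj for b, xj in zip(slopes, x))


b0, slopes, ss = fit_first_order(TABLE_4)
print("intercept:", round(b0, 2))
print("slopes:", [round(b, 2) for b in slopes])
print("sums of squares:", [round(s, 2) for s in ss])
print("prediction at (1, 1, -1):", round(predict(b0, slopes, (1, 1, -1)), 2))
```

Rounded to two decimals, the intercept and slopes agree with Equation 6, and the single-degree-of-freedom sums of squares agree with the mean squares listed for the three coefficients in Table 5.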

number of times as in the between-subjects design with equal replication at all data points.

A multiple-regression analysis of these hypothetical data yields the following first-order prediction equation:

$$\text{Detection Latency} = 16.44 - 1.17X_1 - 0.40X_2 + 0.82X_3 \qquad (6)$$

Substituting given levels of display resolution, visual angle, and random noise for $X_1$, $X_2$, and $X_3$, respectively, into this equation provides a corresponding predicted level of detection latency.

The results of a subsequent analysis of variance performed on the hypothetical data of the regression analysis appear in Table 5. Note the additional “subjects” component into which residual variance has been subdivided. The corresponding degrees of freedom reflect the use of four subjects throughout the experiment. Notice also that the error degrees of freedom are reduced by three, the degrees of freedom attributed to the subject factor. Had this experiment utilized different subjects throughout, the value of error degrees of freedom would have been 45 rather than 42. But, in the case of within-subject designs, the error term is refined by removing the subject effect from it.

Mills and Williges (1973) have used a within-subject design in a recent study of radar target initiation and maintenance. Their results reveal highly significant intersubject variability which was removed from the regression equation. In addition, the resulting prediction equations appear to demonstrate a high degree of predic-

TABLE 5

First-Order Regression Analysis of Variance Summary Table for Hypothetical Detection Latency Data of Four Subjects

Source            df      MS         F
Regression        (3)    43.87     548.37**
  b1               1     81.75    1021.87**
  b2               1      9.42     117.75**
  b3               1     40.44     505.50**
Residual         (56)     0.42
  Blocks           2      0.65       8.13*
  Subjects         3      0.05       0.63
  Lack of Fit      9      2.04      25.50**
  Error          (42)     0.08
Total             59

*  p < .01
** p < .001
Multiple Regression Coefficient, R = 0.92
Coefficient of Determination, R² = 0.85
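The degrees of freedom quoted for the several design variations follow one bookkeeping pattern, sketched below. The rule is a reconstruction that reproduces the values given in the text and in Tables 3 and 5, not a procedure taken from the authors' program: a cell is taken to be a distinct combination of block and experimental point, lack of fit is the number of cells less one, less the regression terms, less the block degrees of freedom, and error is the number of observations less the number of cells, reduced by the subject degrees of freedom in a within-subject design. The function name and arguments are hypothetical.

```python
def residual_df(n_obs, n_cells, n_blocks=3, n_terms=3, n_subjects=None):
    """Partition the degrees of freedom of a first-order RSM analysis."""
    total = n_obs - 1
    residual = total - n_terms
    blocks = n_blocks - 1
    subjects = 0 if n_subjects is None else n_subjects - 1
    lack_of_fit = n_cells - 1 - n_terms - blocks
    error = (n_obs - n_cells) - subjects
    # The components must add up to the residual degrees of freedom.
    assert blocks + subjects + lack_of_fit + error == residual
    return {"total": total, "residual": residual, "blocks": blocks,
            "subjects": subjects, "lack_of_fit": lack_of_fit, "error": error}


designs = {
    # 14 non-center points plus the center point in each of 3 blocks: 17 cells.
    "basic design (Table 3)": residual_df(n_obs=20, n_cells=17),
    "five replications, between subjects": residual_df(n_obs=100, n_cells=17),
    # Equal replication: 15 points, center point in Block 3 only: 15 cells.
    "equal replication, between subjects": residual_df(n_obs=75, n_cells=15),
    "within subject (Table 5)": residual_df(n_obs=60, n_cells=15, n_subjects=4),
}

for name, df in designs.items():
    print(name, df)
```

Calling the function for the within-subject case without `n_subjects` gives the error degrees of freedom that would apply had different subjects been used throughout.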


tive validity to other points within the originally sampled surface (Williges and Mills, 1973).

CONCLUSIONS

The techniques of RSM, and the central-composite design in particular, can be effectively used in human factors research, where the goal is frequently the development of an equation to describe the relationship between human performance and a host of equipment parameters. Certain modifications in the basic RSM central-composite design, however, appear to make the method more appropriate to research involving human subjects. In making the appropriate design modifications, the investigator must make several major decisions. He must decide whether or not to make repeated observations over a series of experimental points rather than at a single point. If his goal is to develop a global prediction equation to approximate the entire response surface, replication at each of the experimental data collection points appears to be a wise strategy. The basic central-composite design, calling for replication at only the center point, is perhaps better reserved for preliminary research where the primary aim is to ascertain quickly what major factors appear worthy of more thorough study.

The investigator must also select either a between-subjects or a within-subject design. This choice is dictated by the objectives of his particular experiment. Of the design variants discussed above, those advocating multiple and equal replications at all experimental points, followed by analysis of uncollapsed data, appear the most advantageous, whether they are conceived as between- or within-subject designs. The particular modifications which the investigator elects to implement have ramifications for other aspects of the design such as orthogonal blocking and rotatability. Appropriate adjustments must be made in factor level selection in order to retain such attributes in view of the overall design modification.

ACKNOWLEDGMENTS

This research was supported by the Life Sciences Program, Air Force Office of Scientific Research and is based on a paper given at the Fifteenth Annual Meeting of the Human Factors Society, New York, New York, October 1971. Dr. Glen Finch, Program Manager, Life Sciences Directorate, was the program monitor. The authors wish to express their appreciation to Dr. Richard DeVor and Beverly H. Williges for their helpful comments on an earlier version of this paper.

REFERENCES

Box, G. E. P. and Behnken, D. W. Some new three-level designs for the study of quantitative variables. Technometrics, 1960, 2, 455-475.
Box, G. E. P. and Hunter, J. S. Multifactor experimental designs for exploring response surfaces. Annals of Mathematical Statistics, 1957, 28, 195-241.
Box, G. E. P. and Wilson, K. B. On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society, Series B (Methodological), 1951, 13, 1-45.
Clark, C. and Williges, R. C. Central-composite response surface methodology design and analyses. Savoy, Ill.: University of Illinois, Institute of Aviation, Aviation Research Laboratory, Technical Report ARL-72-10/AFOSR-72-5, June 1972.
Clark, C., Williges, R. C., and Carmer, S. G. General computer program for response surface methodology analyses. Savoy, Ill.: University of Illinois, Institute of Aviation, Aviation Research Laboratory, Technical Report ARL71-8/AFOSR-71-1, May 1971.
Cochran, W. G. and Cox, G. M. Experimental designs. (2nd ed.) Chapter 6A. Factorial experiments in fractional replications. New York: Wiley, 1957, 244-292.
Cochran, W. G. and Cox, G. M. Experimental designs. (2nd ed.) Chapter 8A. Some methods for the study of response surfaces. New York: Wiley, 1957, 335-375.
Herzberg, P. A. The parameters of cross-validation.


Psychometric Monographs Supplement, 1969, 34, No. 16.
Lord, F. M. and Novick, M. R. Statistical theories of mental test scores. Chapter 13. The selection of predictor variables. Reading, Mass.: Addison-Wesley, 1968, 284-301.
Meyer, D. L. Response surface methodology in education and psychology. The Journal of Experimental Education, 1963, 31, 329-336.
Mills, R. G. and Williges, R. C. Performance prediction in a single-operator simulated surveillance system. Human Factors, 1973, 15, 337-348.
Simon, C. W. The use of central-composite designs in human factors engineering experiments. Culver City, Calif.: Hughes Aircraft Co., Display Systems and Human Factors Department, Technical Report AFOSR-70-6, December 1970.
Williges, R. C. and Baron, M. L. Transfer assessment using a between-subjects central-composite design. Human Factors, 1973, 15, 311-319.
Williges, R. C. and Mills, R. G. Predictive validity of central-composite design regression equations. Human Factors, 1973, 15, 349-354.
Williges, R. C. and North, R. A. Prediction and cross-validation of video cartographic symbol location performance. Human Factors, 1973, 15, 321-336.
Williges, R. C. and Simon, C. W. Applying response surface methodology to problems of target acquisition. Human Factors, 1971, 13, 511-519.
