TECHNOMETRICS, VOL. 21, NO. 2, MAY 1979

A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code

M. D. McKay and R. J. Beckman
Los Alamos Scientific Laboratory
P.O. Box 1663
Los Alamos, NM 87545

W. J. Conover
Department of Mathematics
Texas Tech University
Lubbock, TX 79409

Received January 1977; revised May 1978

Two types of sampling plans are examined as alternatives to simple random sampling in Monte Carlo studies. These plans are shown to be improvements over simple random sampling with respect to variance for a class of estimators which includes the sample mean and the empirical distribution function.

KEY WORDS
Latin hypercube sampling
Sampling techniques
Simulation techniques
Variance reduction

1. INTRODUCTION

Numerical methods have been used for years to provide approximate solutions to fluid flow problems that defy analytical solutions because of their complexity. A mathematical model is constructed to resemble the fluid flow problem, and a computer program (called a "code"), incorporating methods of obtaining a numerical solution, is written. Then for any selection of input variables X = (X_1, ..., X_K) an output variable Y = h(X) is produced by the computer code. If the code is accurate the output Y resembles what the actual output would be if an experiment were performed under the conditions X. It is often impractical or impossible to perform such an experiment. Moreover, the computer codes are sometimes sufficiently complex so that a single set of input variables may require several hours of time on the fastest computers presently in existence in order to produce one output. We should mention that a single output Y is usually a graph Y(t) of output as a function of time, calculated at discrete time points t, t_0 ≤ t ≤ t_1.

When modeling real world phenomena with a computer code one is often faced with the problem of what values to use for the inputs. This difficulty can arise from within the physical process itself when system parameters are not constant, but vary in some manner about nominal values. We model our uncertainty about the values of the inputs by treating them as random variables. The information desired from the code can be obtained from a study of the probability distribution of the output Y(t). Consequently, we model the "numerical" experiment by Y(t) as an unknown transformation h(X) of the inputs X, which have a known probability distribution F(x) for x ∈ S. Obviously several values of X, say X_1, ..., X_N, must be selected as successive input sets in order to obtain the desired information concerning Y(t). When N must be small because of the running time of the code, the input variables should be selected with great care.

The next section describes three methods of selecting (sampling) input variables. Sections 3, 4 and 5 are devoted to comparing the three methods with respect to their performance in an actual computer code.
The computer code used in this paper was developed in the Hydrodynamics Group of the Theoretical Division at the Los Alamos Scientific Laboratory, to study reactor safety [8]. The computer code is named SOLA-PLOOP and is a one-dimensional version of another code, SOLA [7]. The code was used by us to model the blowdown depressurization of a straight pipe filled with water at fixed initial temperature and pressure. Input variables include: X_1, phase change rate; X_2, drag coefficient for drift velocity; X_3, number of bubbles per unit volume; and X_4, pipe roughness. The input variables are assumed to be uniformly distributed over given ranges. The output variable is pressure as a function of time, where the initial time t_0 is the time the pipe ruptures and depressurization
initiates, and the final time t_1 is 20 milliseconds later. The pressure is recorded at 0.1 millisecond time intervals. The code was used repeatedly so that the accuracy and precision of the three sampling methods could be compared.

2. A DESCRIPTION OF THE THREE METHODS USED FOR SELECTING THE VALUES OF INPUT VARIABLES

From the many different methods of selecting the values of input variables, we have chosen three that have considerable intuitive appeal. These are called random sampling, stratified sampling, and Latin hypercube sampling.

Random Sampling. Let the input values X_1, ..., X_N be a random sample from F(x). This method of sampling is perhaps the most obvious, and an entire body of statistical literature may be used in making inferences regarding the distribution of Y(t).

Stratified Sampling. Using stratified sampling, all areas of the sample space of X are represented by input values. Let the sample space S of X be partitioned into I disjoint strata S_i. Let p_i = P(X ∈ S_i) represent the size of S_i. Obtain a random sample X_ij, j = 1, ..., n_i from S_i. Then of course the n_i sum to N. If I = 1, we have random sampling over the entire sample space.

Latin Hypercube Sampling. The same reasoning that led to stratified sampling, ensuring that all portions of S were sampled, could lead further. If we wish to ensure also that each of the input variables X_k has all portions of its distribution represented by input values, we can divide the range of each X_k into N strata of equal marginal probability 1/N, and sample once from each stratum. Let this sample be X_kj, j = 1, ..., N. These form the X_k component, k = 1, ..., K, in X_i, i = 1, ..., N. The components of the various X_k's are matched at random. This method of selecting input values is an extension of quota sampling [13], and can be viewed as a K-dimensional extension of Latin square sampling [11].
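To make the three plans concrete, here is a minimal sketch (ours, not the authors'), assuming K independent U(0,1) inputs so that equal-probability strata are simply equal-length intervals; for other marginals each column would be pushed through the inverse marginal CDF. The function names and the N = m^K restriction in the stratified version are our own conveniences.

```python
# Illustrative sketch (not from the paper): the three sampling plans for
# K independent U(0,1) inputs.
import numpy as np

rng = np.random.default_rng(0)

def random_sample(N, K):
    """Simple random sample X_1, ..., X_N from F(x)."""
    return rng.random((N, K))

def latin_hypercube_sample(N, K):
    """Divide the range of each input into N equal-probability strata,
    sample once per stratum, and match the K components at random
    (one random permutation per column)."""
    X = np.empty((N, K))
    for k in range(K):
        strata = (np.arange(N) + rng.random(N)) / N   # one draw per stratum
        X[:, k] = rng.permutation(strata)             # random matching
    return X

def stratified_sample_equal(N, K):
    """Stratified sampling with N equal-probability strata and one draw per
    stratum.  Here the strata are built by splitting each axis into m pieces
    (assumes N = m**K), as in the 2^4 = 16 strata of the example below."""
    m = round(N ** (1.0 / K))
    assert m ** K == N, "this simple construction needs N = m**K"
    grids = np.meshgrid(*[np.arange(m)] * K, indexing="ij")
    corners = np.stack([g.ravel() for g in grids], axis=1)  # N x K cell indices
    return (corners + rng.random((N, K))) / m               # one point per cell

X_R = random_sample(16, 4)
X_L = latin_hypercube_sample(16, 4)
X_S = stratified_sample_equal(16, 4)
```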

One advantage of the Latin hypercube sample appears when the output Y(t) is dominated by only a few of the components of X. This method ensures that each of those components is represented in a fully stratified manner, no matter which components might turn out to be important.

We mention here that the N intervals on the range of each component of X combine to form N^K cells which cover the sample space of X. These cells, which are labeled by coordinates corresponding to the intervals, are used when finding the properties of the sampling plan.

2.1 Estimators

In the Appendix (Section 8), stratified sampling and Latin hypercube sampling are examined and compared to random sampling with respect to the class of estimators of the form

T(Y_1, ..., Y_N) = (1/N) Σ_{i=1}^{N} g(Y_i),

where g(·) is an arbitrary function.
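The class T is trivial to compute once the N outputs are in hand; a short sketch (ours, with made-up outputs) for the three choices of g discussed in the next paragraph:

```python
# Sketch (ours): T(Y_1, ..., Y_N) = (1/N) * sum g(Y_i) for three choices of g.
import numpy as np

def T(y, g):
    """Generic estimator of the class (1/N) * sum_i g(Y_i)."""
    y = np.asarray(y, dtype=float)
    return g(y).mean()

y = np.array([3.1, 2.7, 4.0, 3.3])          # hypothetical code outputs Y_i

sample_mean   = T(y, lambda u: u)            # g(Y) = Y    -> estimates E(Y)
second_moment = T(y, lambda u: u**2)         # g(Y) = Y^r  -> r-th sample moment
ecdf_at_3_5   = T(y, lambda u: (u <= 3.5).astype(float))  # empirical d.f. at y = 3.5
```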
If g(Y) = Y then T represents the sample mean which is used to estimate E(Y). If g(Y) = Y^r we obtain the rth sample moment. By letting g(Y) = 1 for Y ≤ y, 0 otherwise, we obtain the usual empirical distribution function at the point y. Our interest is centered around these particular statistics.

Let τ denote the expected value of T when the Y_i's constitute a random sample from the distribution of Y = h(X). We show in the Appendix that both stratified sampling and Latin hypercube sampling yield unbiased estimators of τ.

If T_R is the estimate of τ from a random sample of size N, and T_S is the estimate from a stratified sample of size N, then Var(T_S) ≤ Var(T_R) when the stratified plan uses equal probability strata with one sample per stratum (all p_i = 1/N and n_i = 1). No direct means of comparing the variance of the corresponding estimator from Latin hypercube sampling, T_L, to Var(T_S) has been found. However, the following theorem, proved in the Appendix, relates the variances of T_L and T_R.

Theorem. If Y = h(X_1, ..., X_K) is monotonic in each of its arguments, and g(Y) is a monotonic function of Y, then Var(T_L) ≤ Var(T_R).
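The theorem is easy to check by simulation. The sketch below (ours; the monotone h is only a toy stand-in for a computer code) repeats the random and Latin hypercube plans many times and compares the observed variances of the sample mean:

```python
# Sketch (ours): empirical check of Var(T_L) <= Var(T_R) for a toy monotone h.
import numpy as np

rng = np.random.default_rng(1)
N, K, REPS = 16, 4, 2000

def h(x):                            # toy stand-in for the computer code,
    return np.exp(x).sum(axis=1)     # increasing in every argument

def lhs(N, K):
    u = (np.arange(N)[:, None] + rng.random((N, K))) / N   # one value per stratum
    for k in range(K):
        u[:, k] = rng.permutation(u[:, k])                  # match at random
    return u

T_R = [h(rng.random((N, K))).mean() for _ in range(REPS)]
T_L = [h(lhs(N, K)).mean() for _ in range(REPS)]

print("Var(T_R) ~", np.var(T_R))     # the Latin hypercube variance
print("Var(T_L) ~", np.var(T_L))     # should come out smaller
```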

2.2 The SOLA-PLOOP Example

The three sampling plans were compared using the SOLA-PLOOP computer code with N = 16. First a random sample consisting of 16 values of X = (X_1, X_2, X_3, X_4) was selected, entered as inputs, and 16 graphs of Y(t) were observed as outputs. These output values were used in the estimators.

For the stratified sampling method the range of each input variable was divided at the median into two parts of equal probability. The combinations of ranges thus formed produced 2^4 = 16 strata S_i. One observation was obtained at random from each S_i as input, and the resulting outputs were used to obtain the estimates.

To obtain the Latin hypercube sample the range of each input variable X_i was stratified into 16 intervals of equal probability, and one observation was drawn at random from each interval. These 16 values for the 4 input variables were matched at random to form 16 inputs, and thus 16 outputs from the code.

The entire process of sampling and estimating for the three selection methods was repeated 50 times in order to get some idea of the accuracies and precisions involved. The total computer time spent in running the SOLA-PLOOP code in this study was 7 hours on a CDC-6600.

Some of the standard deviation plots appear to be inconsistent with the theoretical results. These occasional discrepancies are believed to arise from the non-independence of the estimators over time and the small sample sizes.
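The repetition protocol itself is simple; a sketch of it (ours, with a cheap analytic surrogate in place of SOLA-PLOOP and a generic `design` argument standing for any of the three plans):

```python
# Sketch (ours): the repeat-50-times protocol used to compare the plans.
# `design(N, K)` is any of the three sampling plans; `code(X)` stands in for
# the computer code h(X) (a cheap surrogate here, not SOLA-PLOOP).
import numpy as np

rng = np.random.default_rng(2)

def code(X):                          # hypothetical stand-in for Y = h(X)
    return 100.0 * np.exp(-X.sum(axis=1))

def random_design(N, K):
    return rng.random((N, K))

def summarize(design, estimator, N=16, K=4, reps=50):
    """Mean and standard deviation of an estimator over repeated designs."""
    values = np.array([estimator(code(design(N, K))) for _ in range(reps)])
    return values.mean(), values.std(ddof=1)   # accuracy and precision

mean_of_estimates, sd_of_estimates = summarize(random_design, np.mean)
```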
3. ESTIMATING THE MEAN

The goodness of an unbiased estimator of the mean can be measured by the size of its variance. For each sampling method, the estimator of E(Y(t)) is of the form

Ȳ(t) = (1/N) Σ_{i=1}^{N} Y_i(t),   (3.1)

where Y_i(t) = h(X_i), i = 1, ..., N.

In the case of the stratified sample, the X_i comes from stratum S_i, p_i = 1/N and n_i = 1. For the Latin hypercube sample, the X_i is obtained in the manner described earlier. Each of the three estimators Ȳ_R, Ȳ_S, and Ȳ_L is an unbiased estimator of E(Y(t)). The variances of the estimators are given in (3.2):

Var(Ȳ_R(t)) = (1/N) Var(Y(t))

Var(Ȳ_S(t)) = Var(Ȳ_R(t)) − (1/N²) Σ_{i=1}^{N} (μ_i − μ)²

Var(Ȳ_L(t)) = Var(Ȳ_R(t)) + ((N − 1)/N) (1/(N^K (N − 1)^K)) Σ_R (μ_i − μ)(μ_j − μ)   (3.2)

where μ = E(Y(t)),
μ_i = E(Y(t) | X ∈ S_i) in the stratified sample, or
μ_i = E(Y(t) | X ∈ cell i) in the Latin hypercube sample,
and R means the restricted space of all pairs (μ_i, μ_j) having no cell coordinates in common.
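The stratified line of (3.2) can be verified numerically in one dimension; in the sketch below (ours, with a toy h, N equal-probability strata, and one observation per stratum) the stratum means μ_i are approximated by brute force and the formula is compared with the observed variance of Ȳ_S:

```python
# Sketch (ours): check Var(Y_S) = Var(Y_R) - (1/N^2) * sum (mu_i - mu)^2
# for K = 1, N equal-probability strata, one observation per stratum.
import numpy as np

rng = np.random.default_rng(3)
N, REPS = 16, 4000

def h(x):                                  # toy stand-in for the code output
    return np.exp(3.0 * x)

# Stratum means mu_i and the overall mean mu, by brute-force averaging.
pts = (np.arange(N)[:, None] + rng.random((N, 100_000))) / N  # points inside each stratum
mu_i = h(pts).mean(axis=1)
mu = mu_i.mean()
var_Y = h(rng.random(1_000_000)).var()

rhs = var_Y / N - ((mu_i - mu) ** 2).sum() / N**2             # right-hand side of (3.2)

# Observed variance of the stratified estimator of the mean.
Ybar_S = [h((np.arange(N) + rng.random(N)) / N).mean() for _ in range(REPS)]
print(rhs, np.var(Ybar_S))                 # the two numbers should agree closely
```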
For the SOLA-PLOOP computer code the means and standard deviations, based on 50 observations, were computed for the estimators just described. Comparative plots of the means are given in Figure 1. All of the plots of the means are comparable, demonstrating the unbiasedness of the estimators.

Comparative plots of the standard deviations of the estimators are given in Figure 2. The standard deviation of Ȳ_S(t) is smaller than that of Ȳ_R(t), as expected. However, Ȳ_L(t) clearly demonstrates superiority as an estimator in this example, with a standard deviation roughly one-fourth that of the random sampling estimator.

FIGURE 1. Estimating the mean: the sample mean of Ȳ_R(t), Ȳ_S(t), and Ȳ_L(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

FIGURE 2. Estimating the mean: the standard deviation of Ȳ_R(t), Ȳ_S(t), and Ȳ_L(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

4. ESTIMATING THE VARIANCE

For each sampling method, the form of the estimator of the variance is

S²(t) = (1/N) Σ_{i=1}^{N} (Y_i(t) − Ȳ(t))²,   (4.1)

and its expectation is

E(S²(t)) = Var(Y(t)) − Var(Ȳ(t)),   (4.2)

where Ȳ(t) is one of Ȳ_R(t), Ȳ_S(t), or Ȳ_L(t).

In the case of the random sample, it is well known that N S_R²(t)/(N − 1) is an unbiased estimator of the variance of Y(t). The bias in the case of the stratified sample is unknown. However, because Var(Ȳ_S(t)) ≤ Var(Ȳ_R(t)),

(1 − 1/N) Var(Y(t)) ≤ E(S_S²(t)) ≤ Var(Y(t)).   (4.3)

The bias in the Latin hypercube plan is also unknown, but for the SOLA-PLOOP example it was small. Variances for these estimators were not found.
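A direct transcription of (4.1), together with the N/(N − 1) correction that is unbiased under random sampling (our sketch, with hypothetical outputs):

```python
# Sketch (ours): the variance estimator (4.1) and the unbiased version
# N * S_R^2 / (N - 1) available under random sampling.
import numpy as np

def S2(y):
    """Equation (4.1): (1/N) * sum (Y_i - Ybar)^2 at a fixed time point."""
    y = np.asarray(y, dtype=float)
    return ((y - y.mean()) ** 2).mean()

def S2_unbiased_random(y):
    """N * S_R^2 / (N - 1); unbiased for Var(Y) only under random sampling."""
    n = len(y)
    return n * S2(y) / (n - 1)

y = np.array([101.3, 96.8, 99.0, 104.2])   # hypothetical outputs at one time t
print(S2(y), S2_unbiased_random(y))
```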

Again using the SOLA-PLOOP example, means and standard deviations (based on 50 observations) were computed. The mean plots are given in Figure 3. They indicate that all three estimators are in relative agreement concerning the quantities they are estimating. In terms of standard deviations of the estimators, Figure 4 shows that, although stratified sampling yields about the same precision as does random sampling, Latin hypercube furnishes a clearly better estimator.

FIGURE 3. Estimating the variance: the sample mean of S_R²(t), S_S²(t), and S_L²(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

FIGURE 4. Estimating the variance: the standard deviation of S_R²(t), S_S²(t), and S_L²(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]



5. ESTIMATING THE DISTRIBUTION FUNCTION

The distribution function, D(y, t), of Y(t) = h(X) may be estimated by the empirical distribution function. The empirical distribution function can be written as

G(y, t) = (1/N) Σ_{i=1}^{N} u(y − Y_i(t)),   (5.1)

where u(z) = 1 for z ≥ 0 and is zero otherwise. Since equation (5.1) is of the form of the estimators in Section 2.1, the expected value of G(y, t) under the three sampling plans is the same, and under random sampling, the expected value of G(y, t) is D(y, t).

The variances of the three estimators are given in (5.2). D_i again refers to either stratum i or cell i, as appropriate, and R represents the same restricted space as it did in (3.2).

Var(G_R(y, t)) = (1/N) D(y, t)(1 − D(y, t))

Var(G_S(y, t)) = Var(G_R(y, t)) − (1/N²) Σ_{i=1}^{N} (D_i(y, t) − D(y, t))²

Var(G_L(y, t)) = Var(G_R(y, t)) + ((N − 1)/N) (1/(N^K (N − 1)^K)) Σ_R (D_i(y, t) − D(y, t)) (D_j(y, t) − D(y, t)).   (5.2)
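Equation (5.1) is just the empirical distribution function evaluated at a fixed time point; a direct transcription (ours, with hypothetical pressures):

```python
# Sketch (ours): the empirical distribution function estimator (5.1)
# at a fixed time t, G(y, t) = (1/N) * sum u(y - Y_i(t)).
import numpy as np

def G(y, outputs_at_t):
    """u(z) = 1 for z >= 0, else 0, so this is the fraction of outputs <= y."""
    out = np.asarray(outputs_at_t, dtype=float)
    return (out <= y).mean()

outputs_at_t = np.array([41.0, 37.5, 52.3, 44.8])  # hypothetical pressures at t = 1.4 ms
print(G(45.0, outputs_at_t))
```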

As with the cases of the mean and variance estimators, the distribution function estimators were compared for the three sampling plans. Figures 5 and 6 give the means and standard deviations of the estimators at t = 1.4 ms. This time point was chosen to correspond to the time of maximum variance in the distribution of Y(t). Again the estimates obtained from a Latin hypercube sample appear to be more precise in general than the other two types of estimates.

FIGURE 5. Estimating the CDF: the sample mean of G_R(y, t), G_S(y, t), and G_L(y, t) at t = 1.4. [Plot versus pressure for the random, stratified, and Latin hypercube plans.]

FIGURE 6. Estimating the CDF: the standard deviation of G_R(y, t), G_S(y, t), and G_L(y, t) at t = 1.4. [Plot versus pressure for the random, stratified, and Latin hypercube plans.]



6. DISCUSSION AND CONCLUSIONS

We have presented three sampling plans and associated estimators of the mean, the variance, and the population distribution function of the output of a computer code when the inputs are treated as random variables. The first method is simple random sampling. The second method involves stratified sampling and improves upon the first method. The third method is called here Latin hypercube sampling. It is an extension of quota sampling [13], and it is a first cousin to the "random balance" design discussed by Satterthwaite [12], Budne [2], Youden et al. [15], and Anscombe [1], and to the highly fractionalized factorial designs discussed by Ehrenfeld and Zacks [5, 6], Dempster [3, 4], and Zacks [16, 17], and to lattice sampling as discussed by Jessen [9]. This third method improves upon simple random sampling when certain monotonicity conditions hold, and it appears to be a good method to use for selecting values of input variables.
7. ACKNOWLEDGMENTS

We extend a special thanks to Ronald K. Lohrding, for his early suggestions related to this work and for his continuing support and encouragement. We also thank our colleagues Larry Bruckner, Ben Duran, C. Phive, and Tom Boullion for their discussions concerning various aspects of the problem, and Dave Whiteman for assistance with the computer.

This paper was prepared under the support of the Analysis Development Branch, Division of Reactor Safety Research, Nuclear Regulatory Commission.
8. APPENDIX

In the sections that follow we present some general results about stratified sampling and Latin hypercube sampling in order to make comparisons with simple random sampling. We move from the general case of stratified sampling to stratified sampling with proportional allocation, and then to proportional allocation with one observation per stratum. We examine Latin hypercube sampling for the equal marginal probability strata case only.
8.1 Type I Estimators

Let X denote a K variate random variable with probability density function (pdf) f(x) for x ∈ S. Let Y denote a univariate transformation of X given by Y = h(X). In the context of this paper we assume

X ~ f(x), x ∈ S   KNOWN pdf
Y = h(X)          UNKNOWN but observable transformation of X.

The class of estimators to be considered are those of the form

T(U_1, ..., U_N) = (1/N) Σ_{i=1}^{N} g(U_i),   (8.1)

where g(·) is an arbitrary, known function. In particular we use g(u) = u^r to estimate moments, and g(u) = 1 for u ≤ y, = 0 elsewhere, to estimate the distribution function.

The sampling schemes described in the following sections will be compared to random sampling with respect to T. The symbol T_R denotes T(Y_1, ..., Y_N) when the arguments Y_1, ..., Y_N constitute a random sample of Y. The mean and variance of T_R are denoted by τ and σ²/N. The statistic T given by (8.1) will be evaluated at arguments arising from stratified sampling to form T_S, and at arguments arising from Latin hypercube sampling to form T_L. The associated means and variances will be compared to those for random sampling.
8.2 Stratified Sampling

Let the range space, S, of X be partitioned into I disjoint subsets S_i of size p_i = P(X ∈ S_i), with Σ_{i=1}^{I} p_i = 1. Let X_ij, j = 1, ..., n_i, be a random sample from stratum S_i. That is, let X_ij ~ iid f(x)/p_i, j = 1, ..., n_i, for x ∈ S_i, but with zero density elsewhere. The corresponding values of Y are denoted by Y_ij = h(X_ij), and the strata means and variances of g(Y) are denoted by

μ_i = E(g(Y_ij)) = ∫_{S_i} g(y) (1/p_i) f(x) dx

σ_i² = Var(g(Y_ij)) = ∫_{S_i} (g(y) − μ_i)² (1/p_i) f(x) dx.

It is easy to see that if we use the general form

T_S = Σ_{i=1}^{I} (p_i/n_i) Σ_{j=1}^{n_i} g(Y_ij),

that T_S is an unbiased estimator of τ with variance given by

Var(T_S) = Σ_{i=1}^{I} (p_i²/n_i) σ_i².   (8.2)

The following results can be found in Tocher [14].

Stratified Sampling with Proportional Allocation. If the probability sizes, p_i, of the strata and the sample sizes, n_i, are chosen so that n_i = p_i N, proportional allocation is achieved. In this case (8.2) becomes

Var(T_S) = Var(T_R) − (1/N) Σ_{i=1}^{I} p_i (μ_i − τ)².   (8.3)

Thus, we see that stratified sampling with proportional allocation offers an improvement over random sampling, and that the variance reduction is a function of the differences between the strata means μ_i and the overall mean τ.
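The general stratified estimator is a weighted sum of within-stratum averages; a sketch (ours, with placeholder strata, weights, and g):

```python
# Sketch (ours): the general stratified estimator
#   T_S = sum_i (p_i / n_i) * sum_j g(Y_ij),
# which reduces to the ordinary sample mean of g when n_i = p_i * N.
import numpy as np

def T_S(samples_by_stratum, p, g):
    """samples_by_stratum[i] holds the outputs Y_ij from stratum i,
    and p[i] = P(X in S_i)."""
    total = 0.0
    for y_i, p_i in zip(samples_by_stratum, p):
        y_i = np.asarray(y_i, dtype=float)
        total += (p_i / len(y_i)) * g(y_i).sum()
    return total

# Two equal-probability strata, two observations each (proportional allocation, N = 4).
samples = [[2.0, 2.4], [5.1, 4.7]]
print(T_S(samples, p=[0.5, 0.5], g=lambda u: u))
```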

Proportional Allocation with One Sample per Stratum. Any stratified plan which employs subsampling, n_i > 1, can be improved by further stratification. When all n_i = 1, (8.3) becomes

Var(T_S) = Var(T_R) − (1/N²) Σ_{i=1}^{N} (μ_i − τ)².   (8.4)

8.3 Latin Hypercube Sampling

In stratified sampling the range space S of X can be arbitrarily partitioned to form strata. In Latin hypercube sampling the partitions are constructed in a specific manner using partitions of the ranges of each component of X. We will only consider the case where the components of X are independent.

Let the ranges of each of the K components of X be partitioned into N intervals of probability size 1/N. The Cartesian product of these intervals partitions S into N^K cells each of probability size N^{−K}. Each cell can be labeled by a set of K cell coordinates m_i = (m_{i1}, m_{i2}, ..., m_{iK}), where m_{ij} is the interval number of component X_j represented in cell i. A Latin hypercube sample of size N is obtained from a random selection N of the cells m_1, ..., m_N, with the condition that for each j the set {m_{ij}, i = 1, ..., N} is a permutation of the integers 1, ..., N. One random observation is made in each cell. The density function of X given X ∈ cell i is N^K f(x) for x ∈ cell i, zero otherwise. The marginal (unconditional) distribution of Y_i(t) is easily seen to be the same as that for a randomly drawn X as follows:

P(Y_i ≤ y) = Σ_{all cells q} P(Y_i ≤ y | X ∈ cell q) P(X ∈ cell q)
           = Σ_{cells q} (1/N^K) ∫_{x ∈ cell q, h(x) ≤ y} N^K f(x) dx
           = ∫_{h(x) ≤ y} f(x) dx.

From this we have T_L as an unbiased estimator of τ.
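In this cell language a Latin hypercube sample is simply K independent random permutations of 1, ..., N read row by row; the sketch below (ours) returns the selected cell coordinates m_1, ..., m_N and one observation per cell for U(0,1) marginals:

```python
# Sketch (ours): selecting the N cells of a Latin hypercube sample.  Column j of
# `cells` is a permutation of 1..N, so no two chosen cells share a coordinate in
# any component and every marginal interval is used exactly once.
import numpy as np

rng = np.random.default_rng(4)

def lhs_cells(N, K):
    return np.column_stack([rng.permutation(N) + 1 for _ in range(K)])

def observe(cells):
    """One random observation inside each selected cell (U(0,1) marginals)."""
    N, K = cells.shape
    return (cells - 1 + rng.random((N, K))) / N

cells = lhs_cells(16, 4)    # cell coordinates m_i = (m_i1, ..., m_iK)
X = observe(cells)
```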

To arrive at a form for the variance of T_L we introduce indicator variables w_i, with

w_i = 1 if cell i is in the sample, 0 if not.

The estimator can now be written as

T_L = (1/N) Σ_{i=1}^{N^K} w_i g(Y_i),   (8.5)

where Y_i = h(X_i) and X_i ~ cell i. The variance of T_L is given by

Var(T_L) = (1/N²) [ Σ_{i=1}^{N^K} Var(w_i g(Y_i)) + Σ_{i=1}^{N^K} Σ_{j≠i} Cov(w_i g(Y_i), w_j g(Y_j)) ].   (8.6)

The following results about the w_i are immediate:

1. P(w_i = 1) = 1/N^{K−1} = E(w_i) = E(w_i²),
   Var(w_i) = (1/N^{K−1})(1 − 1/N^{K−1}).   (8.7)

2. If w_i and w_j correspond to cells having no cell coordinates in common, then
   E(w_i w_j) = E(w_i w_j | w_j = 0) P(w_j = 0) + E(w_i w_j | w_j = 1) P(w_j = 1) = 1/(N(N − 1))^{K−1}.

3. If w_i and w_j correspond to cells having at least one common cell coordinate, then E(w_i w_j) = 0.

Now

Var(w_i g(Y_i)) = E(w_i²) Var(g(Y_i)) + E²(g(Y_i)) Var(w_i),

so that

Σ_{i=1}^{N^K} Var(w_i g(Y_i)) = N^{−K+1} Σ_{i=1}^{N^K} E(g(Y_i) − μ_i)² + (N^{−K+1} − N^{−2K+2}) Σ_{i=1}^{N^K} μ_i²,   (8.8)

where μ_i = E{g(Y) | X ∈ cell i}. Since

E(g(Y_i) − μ_i)² = N^K ∫_{cell i} (g(y) − τ)² f(x) dx − (μ_i − τ)²,   (8.9)

we have

Σ_{i=1}^{N^K} Var(w_i g(Y_i)) = N Var(Y) − N^{−K+1} Σ_i (μ_i − τ)² + (N^{−K+1} − N^{−2K+2}) Σ_i μ_i².   (8.10)

Furthermore

Σ_{i=1}^{N^K} Σ_{j≠i} Cov(w_i g(Y_i), w_j g(Y_j)) = Σ_{i≠j} μ_i μ_j E{w_i w_j} − N^{−2K+2} Σ_{i≠j} μ_i μ_j,   (8.11)

which combines with (8.10) to give

Var(T_L) = (1/N) Var(Y) − N^{−K−1} Σ_i (μ_i − τ)² + (N^{−K−1} − N^{−2K}) Σ_i μ_i² + (N − 1)^{−K+1} N^{−K−1} Σ_R μ_i μ_j − N^{−2K} Σ_{i≠j} μ_i μ_j,   (8.12)

where R means the restricted space of N^K(N − 1)^K pairs (μ_i, μ_j) corresponding to cells having no cell coordinates in common. After some algebra, and with Σ_i μ_i = N^K τ, the final form for Var(T_L) becomes


Var(T_L) = Var(T_R) + ((N − 1)/N) [N^{−K}(N − 1)^{−K} Σ_R (μ_i − τ)(μ_j − τ)].   (8.13)

Note that Var(T_L) ≤ Var(T_R) if and only if

N^{−K}(N − 1)^{−K} Σ_R (μ_i − τ)(μ_j − τ) ≤ 0,   (8.14)

which is equivalent to saying that the covariance between cells having no cell coordinates in common is negative. A sufficient condition for (8.14) to hold is given by the following theorem.

THEOREM. If Y = h(X_1, ..., X_K) is monotonic in each of its arguments, and if g(Y) is a monotonic function of Y, then Var(T_L) ≤ Var(T_R).

PROOF. The proof employs a theorem by Lehmann [10]. Two functions r(x_1, ..., x_K) and s(y_1, ..., y_K) are said to be concordant in each argument if r and s either increase or decrease together as a function of x_i = y_i, with all x_j, j ≠ i, and y_j, j ≠ i, held fixed, for each i. Also, a pair of random variables (X, Y) are said to be negatively quadrant dependent if P(X ≤ x, Y ≤ y) ≤ P(X ≤ x) P(Y ≤ y) for all x, y. Lehmann's theorem states that if (i) (X_1, Y_1), (X_2, Y_2), ..., (X_K, Y_K) are independent, (ii) (X_i, Y_i) is negatively quadrant dependent for all i, and (iii) X = r(X_1, ..., X_K) and Y = s(Y_1, ..., Y_K) are concordant in each argument, then (X, Y) is negatively quadrant dependent.

We earlier described a stage-wise process for selecting cells for a Latin hypercube sample, where a cell was labeled by cell coordinates m_1, ..., m_K. Two cells (l_1, ..., l_K) and (m_1, ..., m_K) with no coordinates in common may be selected as follows. Randomly select two integers (R_11, R_21) without replacement from the first N integers 1, ..., N. Let l_1 = R_11 and m_1 = R_21. Repeat the procedure to obtain (R_12, R_22), (R_13, R_23), ..., (R_1K, R_2K) and let l_k = R_1k and m_k = R_2k. Thus two cells are randomly selected and l_k ≠ m_k for k = 1, ..., K.

Note that the pairs (R_1k, R_2k), k = 1, ..., K, are mutually independent. Also note that because P(R_1k ≤ x, R_2k ≤ y) = ([x][y] − min([x], [y]))/(N(N − 1)) ≤ P(R_1k ≤ x) P(R_2k ≤ y), where [·] represents the "greatest integer" function, each pair (R_1k, R_2k) is negatively quadrant dependent.

Let μ_1 be the expected value of g(Y) within the cell designated by (l_1, ..., l_K), and let μ_2 be similarly defined for (m_1, ..., m_K). Then μ_1 = μ(R_11, R_12, ..., R_1K) and μ_2 = μ(R_21, R_22, ..., R_2K) are concordant in each argument under the assumptions of the theorem. Lehmann's theorem then yields that μ_1 and μ_2 are negatively quadrant dependent. Therefore,

P(μ_1 ≤ x, μ_2 ≤ y) ≤ P(μ_1 ≤ x) P(μ_2 ≤ y).

Using Hoeffding's equation

Cov(X, Y) = ∫∫ [P(X ≤ x, Y ≤ y) − P(X ≤ x) P(Y ≤ y)] dx dy

(see Lehmann [10] for a proof), we have Cov(μ_1, μ_2) ≤ 0. Since Var(T_L) = Var(T_R) + ((N − 1)/N) Cov(μ_1, μ_2), the theorem is proved.

Since g(t) as used in both Sections 3 and 5 is an increasing function of t, we can say that if Y = h(X) is a monotonic function of each of its arguments, Latin hypercube sampling is better than random sampling for estimating the mean and the population distribution function.
REFERENCES

[1] ANSCOMBE, F. J. (1959). Quick analysis methods for random balance screening experiments. Technometrics, 1, 195-209.
[2] BUDNE, T. A. (1959). The application of random balance designs. Technometrics, 1, 139-155.
[3] DEMPSTER, A. P. (1960). Random allocation designs I: On general classes of estimation methods. Ann. Math. Statist., 31, 885-905.
[4] DEMPSTER, A. P. (1961). Random allocation designs II: Approximate theory for simple random allocation. Ann. Math. Statist., 32, 387-405.
[5] EHRENFELD, S., and ZACKS, S. (1961). Randomization and factorial experiments. Ann. Math. Statist., 32, 270-297.
[6] EHRENFELD, S., and ZACKS, S. (1967). Testing hypotheses in randomized factorial experiments. Ann. Math. Statist., 38, 1494-1507.
[7] HIRT, C. W., NICHOLS, B. D., and ROMERO, N. C. (1975). SOLA: a numerical solution algorithm for transient fluid flows. Los Alamos Scientific Laboratory Report LA-5852, Los Alamos.
[8] HIRT, C. W., and ROMERO, N. C. (1975). Application of a drift-flux model to flashing in straight pipes. Los Alamos Scientific Laboratory Report LA-6005-MS, Los Alamos.
[9] JESSEN, RAYMOND J. (1975). Square and cubic lattice sampling. Biometrics, 31, 449-471.
[10] LEHMANN, E. L. (1966). Some concepts of dependence. Ann. Math. Statist., 37, 1137-1153.
[11] RAJ, DES (1968). Sampling Theory, pp. 206-209. New York: McGraw-Hill.
[12] SATTERTHWAITE, F. E. (1959). Random balance experimentation. Technometrics, 1, 111-137.
[13] STEINBERG, H. A. (1963). Generalized quota sampling. Nuc. Sci. and Engr., 15, 142-145.
[14] TOCHER, K. D. (1963). The Art of Simulation, pp. 106-107. Princeton, N.J.: D. Van Nostrand.
[15] YOUDEN, W. J., KEMPTHORNE, O., TUKEY, J. W., BOX, G. E. P., and HUNTER, J. S. (1959). Discussion of the papers of Messrs. Satterthwaite and Budne. Technometrics, 1, 157-193.
[16] ZACKS, S. (1963). On a complete class of linear unbiased estimators for randomized factorial experiments. Ann. Math. Statist., 34, 769-779.
[17] ZACKS, S. (1964). Generalized least squares estimators for randomized fractional replication designs. Ann. Math. Statist., 35, 696-704.
