TECHNOMETRICS, VOL. 21, NO. 2, MAY 1979

A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code

M. D. McKay and R. J. Beckman
Los Alamos Scientific Laboratory
P.O. Box 1663
Los Alamos, NM 87545

W. J. Conover
Department of Mathematics
Texas Tech University
Lubbock, TX 79409

Received January 1977; revised May 1978

Two types of sampling plans are examined as alternatives to simple random sampling in Monte Carlo studies. These plans are shown to be improvements over simple random sampling with respect to variance for a class of estimators which includes the sample mean and the empirical distribution function.

KEY WORDS
Latin hypercube sampling
Sampling techniques
Simulation techniques
Variance reduction

1. INTRODUCTION

Numerical methods have been used for years to provide approximate solutions to fluid flow problems that defy analytical solutions because of their complexity. A mathematical model is constructed to resemble the fluid flow problem, and a computer program (called a "code"), incorporating methods of obtaining a numerical solution, is written. Then for any selection of input variables X = (X_1, ..., X_K) an output variable Y = h(X) is produced by the computer code. If the code is accurate the output Y resembles what the actual output would be if an experiment were performed under the conditions X. It is often impractical or impossible to perform such an experiment. Moreover, the computer codes are sometimes sufficiently complex so that a single set of input variables may require several hours of time on the fastest computers presently in existence in order to produce one output. We should mention that a single output Y is usually a graph Y(t) of output as a function of time, calculated at discrete time points t, t_0 ≤ t ≤ t_1.

When modeling real world phenomena with a computer code one is often faced with the problem of what values to use for the inputs. This difficulty can arise from within the physical process itself when system parameters are not constant, but vary in some manner about nominal values. We model our uncertainty about the values of the inputs by treating them as random variables. The information desired from the code can be obtained from a study of the probability distribution of the output Y(t). Consequently, we model the "numerical" experiment by Y(t) as an unknown transformation h(X) of the inputs X, which have a known probability distribution F(x) for x ∈ S. Obviously several values of X, say X_1, ..., X_N, must be selected as successive input sets in order to obtain the desired information concerning Y(t). When N must be small because of the running time of the code, the input variables should be selected with great care.

The next section describes three methods of selecting (sampling) input variables. Sections 3, 4 and 5 are devoted to comparing the three methods with respect to their performance in an actual computer code.
The computer code used in this paper was developed in the Hydrodynamics Group of the Theoretical Division at the Los Alamos Scientific Laboratory, to study reactor safety [8]. The computer code is named SOLA-PLOOP and is a one-dimensional version of another code, SOLA [7]. The code was used by us to model the blowdown depressurization of a straight pipe filled with water at fixed initial temperature and pressure. Input variables include: X_1, phase change rate; X_2, drag coefficient for drift velocity; X_3, number of bubbles per unit volume; and X_4, pipe roughness. The input variables are assumed to be uniformly distributed over given ranges. The output variable is pressure as a function of time, where the initial time t_0 is the time the pipe ruptures and depressurization
initiates, and the final time t_1 is 20 milliseconds later. The pressure is recorded at 0.1 millisecond time intervals. The code was used repeatedly so that the accuracy and precision of the three sampling methods could be compared.

2. A DESCRIPTION OF THE THREE METHODS USED FOR SELECTING THE VALUES OF INPUT VARIABLES

From the many different methods of selecting the values of input variables, we have chosen three that have considerable intuitive appeal. These are called random sampling, stratified sampling, and Latin hypercube sampling.

Random Sampling. Let the input values X_1, ..., X_N be a random sample from F(x). This method of sampling is perhaps the most obvious, and an entire body of statistical literature may be used in making inferences regarding the distribution of Y(t).

Stratified Sampling. Using stratified sampling, all areas of the sample space of X are represented by input values. Let the sample space S of X be partitioned into I disjoint strata S_i. Let p_i = P(X ∈ S_i) represent the size of S_i. Obtain a random sample X_ij, j = 1, ..., n_i from S_i. Then of course the n_i sum to N. If I = 1, we have random sampling over the entire sample space.

Latin Hypercube Sampling. The same reasoning that led to stratified sampling, ensuring that all portions of S were sampled, could lead further. If we wish to ensure also that each of the input variables X_k has all portions of its distribution represented by input values, we can divide the range of each X_k into N strata of equal marginal probability 1/N, and sample once from each stratum. Let this sample be X_kj, j = 1, ..., N. These form the X_k component, k = 1, ..., K, in X_i, i = 1, ..., N. The components of the various X_k's are matched at random. This method of selecting input values is an extension of quota sampling [13], and can be viewed as a K-dimensional extension of Latin square sampling [11].
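To make the three plans concrete, here is a minimal sketch (ours, not the authors'), assuming K independent U(0,1) inputs so that equal-probability strata are simply equal-length intervals; for other marginals each column would be pushed through the inverse marginal CDF. The function names and the N = m^K restriction in the stratified version are our own conveniences.

```python
# Illustrative sketch (not from the paper): the three sampling plans for
# K independent U(0,1) inputs.
import numpy as np

rng = np.random.default_rng(0)

def random_sample(N, K):
    """Simple random sample X_1, ..., X_N from F(x)."""
    return rng.random((N, K))

def latin_hypercube_sample(N, K):
    """Divide the range of each input into N equal-probability strata,
    sample once per stratum, and match the K components at random
    (one random permutation per column)."""
    X = np.empty((N, K))
    for k in range(K):
        strata = (np.arange(N) + rng.random(N)) / N   # one draw per stratum
        X[:, k] = rng.permutation(strata)             # random matching
    return X

def stratified_sample_equal(N, K):
    """Stratified sampling with N equal-probability strata and one draw per
    stratum.  Here the strata are built by splitting each axis into m pieces
    (assumes N = m**K), as in the 2^4 = 16 strata of the example below."""
    m = round(N ** (1.0 / K))
    assert m ** K == N, "this simple construction needs N = m**K"
    grids = np.meshgrid(*[np.arange(m)] * K, indexing="ij")
    corners = np.stack([g.ravel() for g in grids], axis=1)  # N x K cell indices
    return (corners + rng.random((N, K))) / m               # one point per cell

X_R = random_sample(16, 4)
X_L = latin_hypercube_sample(16, 4)
X_S = stratified_sample_equal(16, 4)
```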

One advantage of the Latin hypercube sample appears when the output Y(t) is dominated by only a few of the components of X. This method ensures that each of those components is represented in a fully stratified manner, no matter which components might turn out to be important.

We mention here that the N intervals on the range of each component of X combine to form N^K cells which cover the sample space of X. These cells, which are labeled by coordinates corresponding to the intervals, are used when finding the properties of the sampling plan.

2.1 Estimators

In the Appendix (Section 8), stratified sampling and Latin hypercube sampling are examined and compared to random sampling with respect to the class of estimators of the form

T(Y_1, ..., Y_N) = (1/N) Σ_{i=1}^{N} g(Y_i),

where g(·) is an arbitrary function.
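The class T is trivial to compute once the N outputs are in hand; a short sketch (ours, with made-up outputs) for the three choices of g discussed in the next paragraph:

```python
# Sketch (ours): T(Y_1, ..., Y_N) = (1/N) * sum g(Y_i) for three choices of g.
import numpy as np

def T(y, g):
    """Generic estimator of the class (1/N) * sum_i g(Y_i)."""
    y = np.asarray(y, dtype=float)
    return g(y).mean()

y = np.array([3.1, 2.7, 4.0, 3.3])          # hypothetical code outputs Y_i

sample_mean   = T(y, lambda u: u)            # g(Y) = Y    -> estimates E(Y)
second_moment = T(y, lambda u: u**2)         # g(Y) = Y^r  -> r-th sample moment
ecdf_at_3_5   = T(y, lambda u: (u <= 3.5).astype(float))  # empirical d.f. at y = 3.5
```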
If g(Y) = Y then T represents the sample mean which is used to estimate E(Y). If g(Y) = Y^r we obtain the rth sample moment. By letting g(Y) = 1 for Y ≤ y, 0 otherwise, we obtain the usual empirical distribution function at the point y. Our interest is centered around these particular statistics.

Let τ denote the expected value of T when the Y_i's constitute a random sample from the distribution of Y = h(X). We show in the Appendix that both stratified sampling and Latin hypercube sampling yield unbiased estimators of τ.

If T_R is the estimate of τ from a random sample of size N, and T_S is the estimate from a stratified sample of size N, then Var(T_S) ≤ Var(T_R) when the stratified plan uses equal probability strata with one sample per stratum (all p_i = 1/N and n_i = 1). No direct means of comparing the variance of the corresponding estimator from Latin hypercube sampling, T_L, to Var(T_S) has been found. However, the following theorem, proved in the Appendix, relates the variances of T_L and T_R.

Theorem. If Y = h(X_1, ..., X_K) is monotonic in each of its arguments, and g(Y) is a monotonic function of Y, then Var(T_L) ≤ Var(T_R).
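The theorem is easy to check by simulation. The sketch below (ours; the monotone h is only a toy stand-in for a computer code) repeats the random and Latin hypercube plans many times and compares the observed variances of the sample mean:

```python
# Sketch (ours): empirical check of Var(T_L) <= Var(T_R) for a toy monotone h.
import numpy as np

rng = np.random.default_rng(1)
N, K, REPS = 16, 4, 2000

def h(x):                            # toy stand-in for the computer code,
    return np.exp(x).sum(axis=1)     # increasing in every argument

def lhs(N, K):
    u = (np.arange(N)[:, None] + rng.random((N, K))) / N   # one value per stratum
    for k in range(K):
        u[:, k] = rng.permutation(u[:, k])                  # match at random
    return u

T_R = [h(rng.random((N, K))).mean() for _ in range(REPS)]
T_L = [h(lhs(N, K)).mean() for _ in range(REPS)]

print("Var(T_R) ~", np.var(T_R))     # the Latin hypercube variance
print("Var(T_L) ~", np.var(T_L))     # should come out smaller
```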

2.2 The SOLA-PLOOP Example

The three sampling plans were compared using the SOLA-PLOOP computer code with N = 16. First a random sample consisting of 16 values of X = (X_1, X_2, X_3, X_4) was selected, entered as inputs, and 16 graphs of Y(t) were observed as outputs. These output values were used in the estimators.

For the stratified sampling method the range of each input variable was divided at the median into two parts of equal probability. The combinations of ranges thus formed produced 2^4 = 16 strata S_i. One observation was obtained at random from each S_i as input, and the resulting outputs were used to obtain the estimates.

To obtain the Latin hypercube sample the range of each input variable X_i was stratified into 16 intervals of equal probability, and one observation was drawn at random from each interval. These 16 values for the 4 input variables were matched at random to form 16 inputs, and thus 16 outputs from the code.

The entire process of sampling and estimating for the three selection methods was repeated 50 times in order to get some idea of the accuracies and precisions involved. The total computer time spent in running the SOLA-PLOOP code in this study was 7 hours on a CDC-6600.

Some of the standard deviation plots appear to be inconsistent with the theoretical results. These occasional discrepancies are believed to arise from the non-independence of the estimators over time and the small sample sizes.
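The repetition protocol itself is simple; a sketch of it (ours, with a cheap analytic surrogate in place of SOLA-PLOOP and a generic `design` argument standing for any of the three plans):

```python
# Sketch (ours): the repeat-50-times protocol used to compare the plans.
# `design(N, K)` is any of the three sampling plans; `code(X)` stands in for
# the computer code h(X) (a cheap surrogate here, not SOLA-PLOOP).
import numpy as np

rng = np.random.default_rng(2)

def code(X):                          # hypothetical stand-in for Y = h(X)
    return 100.0 * np.exp(-X.sum(axis=1))

def random_design(N, K):
    return rng.random((N, K))

def summarize(design, estimator, N=16, K=4, reps=50):
    """Mean and standard deviation of an estimator over repeated designs."""
    values = np.array([estimator(code(design(N, K))) for _ in range(reps)])
    return values.mean(), values.std(ddof=1)   # accuracy and precision

mean_of_estimates, sd_of_estimates = summarize(random_design, np.mean)
```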
3. ESTIMATING THE MEAN

The goodness of an unbiased estimator of the mean can be measured by the size of its variance. For each sampling method, the estimator of E(Y(t)) is of the form

Ȳ(t) = (1/N) Σ_{i=1}^{N} Y_i(t),   (3.1)

where Y_i(t) = h(X_i), i = 1, ..., N.

In the case of the stratified sample, the X_i comes from stratum S_i, p_i = 1/N and n_i = 1. For the Latin hypercube sample, the X_i is obtained in the manner described earlier. Each of the three estimators Ȳ_R, Ȳ_S, and Ȳ_L is an unbiased estimator of E(Y(t)). The variances of the estimators are given in (3.2):

Var(Ȳ_R(t)) = (1/N) Var(Y(t))

Var(Ȳ_S(t)) = Var(Ȳ_R(t)) − (1/N²) Σ_{i=1}^{N} (μ_i − μ)²

Var(Ȳ_L(t)) = Var(Ȳ_R(t)) + ((N − 1)/N) (1/(N^K (N − 1)^K)) Σ_R (μ_i − μ)(μ_j − μ)   (3.2)

where μ = E(Y(t)),
μ_i = E(Y(t) | X ∈ S_i) in the stratified sample, or
μ_i = E(Y(t) | X ∈ cell i) in the Latin hypercube sample,
and R means the restricted space of all pairs (μ_i, μ_j) having no cell coordinates in common.
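The stratified line of (3.2) can be verified numerically in one dimension; in the sketch below (ours, with a toy h, N equal-probability strata, and one observation per stratum) the stratum means μ_i are approximated by brute force and the formula is compared with the observed variance of Ȳ_S:

```python
# Sketch (ours): check Var(Y_S) = Var(Y_R) - (1/N^2) * sum (mu_i - mu)^2
# for K = 1, N equal-probability strata, one observation per stratum.
import numpy as np

rng = np.random.default_rng(3)
N, REPS = 16, 4000

def h(x):                                  # toy stand-in for the code output
    return np.exp(3.0 * x)

# Stratum means mu_i and the overall mean mu, by brute-force averaging.
pts = (np.arange(N)[:, None] + rng.random((N, 100_000))) / N  # points inside each stratum
mu_i = h(pts).mean(axis=1)
mu = mu_i.mean()
var_Y = h(rng.random(1_000_000)).var()

rhs = var_Y / N - ((mu_i - mu) ** 2).sum() / N**2             # right-hand side of (3.2)

# Observed variance of the stratified estimator of the mean.
Ybar_S = [h((np.arange(N) + rng.random(N)) / N).mean() for _ in range(REPS)]
print(rhs, np.var(Ybar_S))                 # the two numbers should agree closely
```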
For the SOLA-PLOOP computer code the means and standard deviations, based on 50 observations, were computed for the estimators just described. Comparative plots of the means are given in Figure 1. All of the plots of the means are comparable, demonstrating the unbiasedness of the estimators.

Comparative plots of the standard deviations of the estimators are given in Figure 2. The standard deviation of Ȳ_S(t) is smaller than that of Ȳ_R(t), as expected. However, Ȳ_L(t) clearly demonstrates superiority as an estimator in this example, with a standard deviation roughly one-fourth that of the random sampling estimator.

FIGURE 1. Estimating the mean: the sample mean of Ȳ_R(t), Ȳ_S(t), and Ȳ_L(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

FIGURE 2. Estimating the mean: the standard deviation of Ȳ_R(t), Ȳ_S(t), and Ȳ_L(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

4. ESTIMATING THE VARIANCE

For each sampling method, the form of the estimator of the variance is

S²(t) = (1/N) Σ_{i=1}^{N} (Y_i(t) − Ȳ(t))²,   (4.1)

and its expectation is

E(S²(t)) = Var(Y(t)) − Var(Ȳ(t)),   (4.2)

where Ȳ(t) is one of Ȳ_R(t), Ȳ_S(t), or Ȳ_L(t).

In the case of the random sample, it is well known that N S_R²(t)/(N − 1) is an unbiased estimator of the variance of Y(t). The bias in the case of the stratified sample is unknown. However, because Var(Ȳ_S(t)) ≤ Var(Ȳ_R(t)),

(1 − 1/N) Var(Y(t)) ≤ E(S_S²(t)) ≤ Var(Y(t)).   (4.3)

The bias in the Latin hypercube plan is also unknown, but for the SOLA-PLOOP example it was small. Variances for these estimators were not found.
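A direct transcription of (4.1), together with the N/(N − 1) correction that is unbiased under random sampling (our sketch, with hypothetical outputs):

```python
# Sketch (ours): the variance estimator (4.1) and the unbiased version
# N * S_R^2 / (N - 1) available under random sampling.
import numpy as np

def S2(y):
    """Equation (4.1): (1/N) * sum (Y_i - Ybar)^2 at a fixed time point."""
    y = np.asarray(y, dtype=float)
    return ((y - y.mean()) ** 2).mean()

def S2_unbiased_random(y):
    """N * S_R^2 / (N - 1); unbiased for Var(Y) only under random sampling."""
    n = len(y)
    return n * S2(y) / (n - 1)

y = np.array([101.3, 96.8, 99.0, 104.2])   # hypothetical outputs at one time t
print(S2(y), S2_unbiased_random(y))
```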

Again using the SOLA-PLOOP example, means and standard deviations (based on 50 observations) were computed. The mean plots are given in Figure 3. They indicate that all three estimators are in relative agreement concerning the quantities they are estimating. In terms of standard deviations of the estimators, Figure 4 shows that, although stratified sampling yields about the same precision as does random sampling, Latin hypercube furnishes a clearly better estimator.

FIGURE 3. Estimating the variance: the sample mean of S_R²(t), S_S²(t), and S_L²(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]

FIGURE 4. Estimating the variance: the standard deviation of S_R²(t), S_S²(t), and S_L²(t). [Plot versus time for the random, stratified, and Latin hypercube plans.]



5. ESTIMATING THE DISTRIBUTION FUNCTION

The distribution function, D(y, t), of Y(t) = h(X) may be estimated by the empirical distribution function. The empirical distribution function can be written as

G(y, t) = (1/N) Σ_{i=1}^{N} u(y − Y_i(t)),   (5.1)

where u(z) = 1 for z ≥ 0 and is zero otherwise. Since equation (5.1) is of the form of the estimators in Section 2.1, the expected value of G(y, t) under the three sampling plans is the same, and under random sampling, the expected value of G(y, t) is D(y, t).

The variances of the three estimators are given in (5.2). D_i again refers to either stratum i or cell i, as appropriate, and R represents the same restricted space as it did in (3.2).

Var(G_R(y, t)) = (1/N) D(y, t)(1 − D(y, t))

Var(G_S(y, t)) = Var(G_R(y, t)) − (1/N²) Σ_{i=1}^{N} (D_i(y, t) − D(y, t))²

Var(G_L(y, t)) = Var(G_R(y, t)) + ((N − 1)/N) (1/(N^K (N − 1)^K)) Σ_R (D_i(y, t) − D(y, t)) (D_j(y, t) − D(y, t)).   (5.2)
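Equation (5.1) is just the empirical distribution function evaluated at a fixed time point; a direct transcription (ours, with hypothetical pressures):

```python
# Sketch (ours): the empirical distribution function estimator (5.1)
# at a fixed time t, G(y, t) = (1/N) * sum u(y - Y_i(t)).
import numpy as np

def G(y, outputs_at_t):
    """u(z) = 1 for z >= 0, else 0, so this is the fraction of outputs <= y."""
    out = np.asarray(outputs_at_t, dtype=float)
    return (out <= y).mean()

outputs_at_t = np.array([41.0, 37.5, 52.3, 44.8])  # hypothetical pressures at t = 1.4 ms
print(G(45.0, outputs_at_t))
```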

As with the cases of the mean and variance estimators, the distribution function estimators were compared for the three sampling plans. Figures 5 and 6 give the means and standard deviations of the estimators at t = 1.4 ms. This time point was chosen to correspond to the time of maximum variance in the distribution of Y(t). Again the estimates obtained from a Latin hypercube sample appear to be more precise in general than the other two types of estimates.

FIGURE 5. Estimating the CDF: the sample mean of G_R(y, t), G_S(y, t), and G_L(y, t) at t = 1.4. [Plot versus pressure for the random, stratified, and Latin hypercube plans.]

FIGURE 6. Estimating the CDF: the standard deviation of G_R(y, t), G_S(y, t), and G_L(y, t) at t = 1.4. [Plot versus pressure for the random, stratified, and Latin hypercube plans.]



6. DISCUSSION AND CONCLUSIONS

We have presented three sampling plans and associated estimators of the mean, the variance, and the population distribution function of the output of a computer code when the inputs are treated as random variables. The first method is simple random sampling. The second method involves stratified sampling and improves upon the first method. The third method is called here Latin hypercube sampling. It is an extension of quota sampling [13], and it is a first cousin to the "random balance" design discussed by Satterthwaite [12], Budne [2], Youden et al. [15], and Anscombe [1], and to the highly fractionalized factorial designs discussed by Ehrenfeld and Zacks [5, 6], Dempster [3, 4], and Zacks [16, 17], and to lattice sampling as discussed by Jessen [9]. This third method improves upon simple random sampling when certain monotonicity conditions hold, and it appears to be a good method to use for selecting values of input variables.
7. ACKNOWLEDGMENTS

We extend a special thanks to Ronald K. Lohrding, for his early suggestions related to this work and for his continuing support and encouragement. We also thank our colleagues Larry Bruckner, Ben Duran, C. Phive, and Tom Boullion for their discussions concerning various aspects of the problem, and Dave Whiteman for assistance with the computer.

This paper was prepared under the support of the Analysis Development Branch, Division of Reactor Safety Research, Nuclear Regulatory Commission.
8. APPENDIX

In the sections that follow we present some general results about stratified sampling and Latin hypercube sampling in order to make comparisons with simple random sampling. We move from the general case of stratified sampling to stratified sampling with proportional allocation, and then to proportional allocation with one observation per stratum. We examine Latin hypercube sampling for the equal marginal probability strata case only.
8.1 Type I Estimators

Let X denote a K variate random variable with probability density function (pdf) f(x) for x ∈ S. Let Y denote a univariate transformation of X given by Y = h(X). In the context of this paper we assume

X ~ f(x), x ∈ S   KNOWN pdf
Y = h(X)          UNKNOWN but observable transformation of X.

The class of estimators to be considered are those of the form

T(U_1, ..., U_N) = (1/N) Σ_{i=1}^{N} g(U_i),   (8.1)

where g(·) is an arbitrary, known function. In particular we use g(u) = u^r to estimate moments, and g(u) = 1 for u ≤ y, = 0 elsewhere, to estimate the distribution function.

The sampling schemes described in the following sections will be compared to random sampling with respect to T. The symbol T_R denotes T(Y_1, ..., Y_N) when the arguments Y_1, ..., Y_N constitute a random sample of Y. The mean and variance of T_R are denoted by τ and σ²/N. The statistic T given by (8.1) will be evaluated at arguments arising from stratified sampling to form T_S, and at arguments arising from Latin hypercube sampling to form T_L. The associated means and variances will be compared to those for random sampling.
8.2 Stratified Sampling

Let the range space, S, of X be partitioned into I disjoint subsets S_i of size p_i = P(X ∈ S_i), with Σ_{i=1}^{I} p_i = 1. Let X_ij, j = 1, ..., n_i, be a random sample from stratum S_i. That is, let X_ij ~ iid f(x)/p_i, j = 1, ..., n_i, for x ∈ S_i, but with zero density elsewhere. The corresponding values of Y are denoted by Y_ij = h(X_ij), and the strata means and variances of g(Y) are denoted by

μ_i = E(g(Y_ij)) = ∫_{S_i} g(y) (1/p_i) f(x) dx

σ_i² = Var(g(Y_ij)) = ∫_{S_i} (g(y) − μ_i)² (1/p_i) f(x) dx.

It is easy to see that if we use the general form

T_S = Σ_{i=1}^{I} (p_i/n_i) Σ_{j=1}^{n_i} g(Y_ij),

that T_S is an unbiased estimator of τ with variance given by

Var(T_S) = Σ_{i=1}^{I} (p_i²/n_i) σ_i².   (8.2)

The following results can be found in Tocher [14].

Stratified Sampling with Proportional Allocation. If the probability sizes, p_i, of the strata and the sample sizes, n_i, are chosen so that n_i = p_i N, proportional allocation is achieved. In this case (8.2) becomes

Var(T_S) = Var(T_R) − (1/N) Σ_{i=1}^{I} p_i (μ_i − τ)².   (8.3)

Thus, we see that stratified sampling with proportional allocation offers an improvement over random sampling, and that the variance reduction is a function of the differences between the strata means μ_i and the overall mean τ.
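The general stratified estimator is a weighted sum of within-stratum averages; a sketch (ours, with placeholder strata, weights, and g):

```python
# Sketch (ours): the general stratified estimator
#   T_S = sum_i (p_i / n_i) * sum_j g(Y_ij),
# which reduces to the ordinary sample mean of g when n_i = p_i * N.
import numpy as np

def T_S(samples_by_stratum, p, g):
    """samples_by_stratum[i] holds the outputs Y_ij from stratum i,
    and p[i] = P(X in S_i)."""
    total = 0.0
    for y_i, p_i in zip(samples_by_stratum, p):
        y_i = np.asarray(y_i, dtype=float)
        total += (p_i / len(y_i)) * g(y_i).sum()
    return total

# Two equal-probability strata, two observations each (proportional allocation, N = 4).
samples = [[2.0, 2.4], [5.1, 4.7]]
print(T_S(samples, p=[0.5, 0.5], g=lambda u: u))
```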

Proportional Allocation with One Sample per Stratum. Any stratified plan which employs subsampling, n_i > 1, can be improved by further stratification. When all n_i = 1, (8.3) becomes

Var(T_S) = Var(T_R) − (1/N²) Σ_{i=1}^{N} (μ_i − τ)².   (8.4)

8.3 Latin Hypercube Sampling

In stratified sampling the range space S of X can be arbitrarily partitioned to form strata. In Latin hypercube sampling the partitions are constructed in a specific manner using partitions of the ranges of each component of X. We will only consider the case where the components of X are independent.

Let the ranges of each of the K components of X be partitioned into N intervals of probability size 1/N. The Cartesian product of these intervals partitions S into N^K cells each of probability size N^{−K}. Each cell can be labeled by a set of K cell coordinates m_i = (m_{i1}, m_{i2}, ..., m_{iK}), where m_{ij} is the interval number of component X_j represented in cell i. A Latin hypercube sample of size N is obtained from a random selection N of the cells m_1, ..., m_N, with the condition that for each j the set {m_{ij}, i = 1, ..., N} is a permutation of the integers 1, ..., N. One random observation is made in each cell. The density function of X given X ∈ cell i is N^K f(x) for x ∈ cell i, zero otherwise. The marginal (unconditional) distribution of Y_i(t) is easily seen to be the same as that for a randomly drawn X as follows:

P(Y_i ≤ y) = Σ_{all cells q} P(Y_i ≤ y | X ∈ cell q) P(X ∈ cell q)
           = Σ_{cells q} (1/N^K) ∫_{x ∈ cell q, h(x) ≤ y} N^K f(x) dx
           = ∫_{h(x) ≤ y} f(x) dx.

From this we have T_L as an unbiased estimator of τ.
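In this cell language a Latin hypercube sample is simply K independent random permutations of 1, ..., N read row by row; the sketch below (ours) returns the selected cell coordinates m_1, ..., m_N and one observation per cell for U(0,1) marginals:

```python
# Sketch (ours): selecting the N cells of a Latin hypercube sample.  Column j of
# `cells` is a permutation of 1..N, so no two chosen cells share a coordinate in
# any component and every marginal interval is used exactly once.
import numpy as np

rng = np.random.default_rng(4)

def lhs_cells(N, K):
    return np.column_stack([rng.permutation(N) + 1 for _ in range(K)])

def observe(cells):
    """One random observation inside each selected cell (U(0,1) marginals)."""
    N, K = cells.shape
    return (cells - 1 + rng.random((N, K))) / N

cells = lhs_cells(16, 4)    # cell coordinates m_i = (m_i1, ..., m_iK)
X = observe(cells)
```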

To arrive at a form for the variance of T_L we introduce indicator variables w_i, with

w_i = 1 if cell i is in the sample, 0 if not.

The estimator can now be written as

T_L = (1/N) Σ_{i=1}^{N^K} w_i g(Y_i),   (8.5)

where Y_i = h(X_i) and X_i ~ cell i. The variance of T_L is given by

Var(T_L) = (1/N²) [ Σ_{i=1}^{N^K} Var(w_i g(Y_i)) + Σ_{i=1}^{N^K} Σ_{j≠i} Cov(w_i g(Y_i), w_j g(Y_j)) ].   (8.6)

The following results about the w_i are immediate:

1. P(w_i = 1) = 1/N^{K−1} = E(w_i) = E(w_i²),
   Var(w_i) = (1/N^{K−1})(1 − 1/N^{K−1}).   (8.7)

2. If w_i and w_j correspond to cells having no cell coordinates in common, then
   E(w_i w_j) = E(w_i w_j | w_j = 0) P(w_j = 0) + E(w_i w_j | w_j = 1) P(w_j = 1) = 1/(N(N − 1))^{K−1}.

3. If w_i and w_j correspond to cells having at least one common cell coordinate, then E(w_i w_j) = 0.

Now

Var(w_i g(Y_i)) = E(w_i²) Var(g(Y_i)) + E²(g(Y_i)) Var(w_i),

so that

Σ_{i=1}^{N^K} Var(w_i g(Y_i)) = N^{−K+1} Σ_{i=1}^{N^K} E(g(Y_i) − μ_i)² + (N^{−K+1} − N^{−2K+2}) Σ_{i=1}^{N^K} μ_i²,   (8.8)

where μ_i = E{g(Y) | X ∈ cell i}. Since

E(g(Y_i) − μ_i)² = N^K ∫_{cell i} (g(y) − τ)² f(x) dx − (μ_i − τ)²,   (8.9)

we have

Σ_{i=1}^{N^K} Var(w_i g(Y_i)) = N Var(Y) − N^{−K+1} Σ_i (μ_i − τ)² + (N^{−K+1} − N^{−2K+2}) Σ_i μ_i².   (8.10)

Furthermore

Σ_{i=1}^{N^K} Σ_{j≠i} Cov(w_i g(Y_i), w_j g(Y_j)) = Σ_{i≠j} μ_i μ_j E{w_i w_j} − N^{−2K+2} Σ_{i≠j} μ_i μ_j,   (8.11)

which combines with (8.10) to give

Var(T_L) = (1/N) Var(Y) − N^{−K−1} Σ_i (μ_i − τ)² + (N^{−K−1} − N^{−2K}) Σ_i μ_i² + (N − 1)^{−K+1} N^{−K−1} Σ_R μ_i μ_j − N^{−2K} Σ_{i≠j} μ_i μ_j,   (8.12)

where R means the restricted space of N^K(N − 1)^K pairs (μ_i, μ_j) corresponding to cells having no cell coordinates in common. After some algebra, and with Σ_i μ_i = N^K τ, the final form for Var(T_L) becomes


Var(T_L) = Var(T_R) + ((N − 1)/N) [N^{−K}(N − 1)^{−K} Σ_R (μ_i − τ)(μ_j − τ)].   (8.13)

Note that Var(T_L) ≤ Var(T_R) if and only if

N^{−K}(N − 1)^{−K} Σ_R (μ_i − τ)(μ_j − τ) ≤ 0,   (8.14)

which is equivalent to saying that the covariance between cells having no cell coordinates in common is negative. A sufficient condition for (8.14) to hold is given by the following theorem.

THEOREM. If Y = h(X_1, ..., X_K) is monotonic in each of its arguments, and if g(Y) is a monotonic function of Y, then Var(T_L) ≤ Var(T_R).

PROOF. The proof employs a theorem by Lehmann [10]. Two functions r(x_1, ..., x_K) and s(y_1, ..., y_K) are said to be concordant in each argument if r and s either increase or decrease together as a function of x_i = y_i, with all x_j, j ≠ i, and y_j, j ≠ i, held fixed, for each i. Also, a pair of random variables (X, Y) are said to be negatively quadrant dependent if P(X ≤ x, Y ≤ y) ≤ P(X ≤ x) P(Y ≤ y) for all x, y. Lehmann's theorem states that if (i) (X_1, Y_1), (X_2, Y_2), ..., (X_K, Y_K) are independent, (ii) (X_i, Y_i) is negatively quadrant dependent for all i, and (iii) X = r(X_1, ..., X_K) and Y = s(Y_1, ..., Y_K) are concordant in each argument, then (X, Y) is negatively quadrant dependent.

We earlier described a stage-wise process for selecting cells for a Latin hypercube sample, where a cell was labeled by cell coordinates m_1, ..., m_K. Two cells (l_1, ..., l_K) and (m_1, ..., m_K) with no coordinates in common may be selected as follows. Randomly select two integers (R_11, R_21) without replacement from the first N integers 1, ..., N. Let l_1 = R_11 and m_1 = R_21. Repeat the procedure to obtain (R_12, R_22), (R_13, R_23), ..., (R_1K, R_2K) and let l_k = R_1k and m_k = R_2k. Thus two cells are randomly selected and l_k ≠ m_k for k = 1, ..., K.

Note that the pairs (R_1k, R_2k), k = 1, ..., K, are mutually independent. Also note that because P(R_1k ≤ x, R_2k ≤ y) = ([x][y] − min([x], [y]))/(N(N − 1)) ≤ P(R_1k ≤ x) P(R_2k ≤ y), where [·] represents the "greatest integer" function, each pair (R_1k, R_2k) is negatively quadrant dependent.

Let μ_1 be the expected value of g(Y) within the cell designated by (l_1, ..., l_K), and let μ_2 be similarly defined for (m_1, ..., m_K). Then μ_1 = μ(R_11, R_12, ..., R_1K) and μ_2 = μ(R_21, R_22, ..., R_2K) are concordant in each argument under the assumptions of the theorem. Lehmann's theorem then yields that μ_1 and μ_2 are negatively quadrant dependent. Therefore,

P(μ_1 ≤ x, μ_2 ≤ y) ≤ P(μ_1 ≤ x) P(μ_2 ≤ y).

Using Hoeffding's equation

Cov(X, Y) = ∫∫ [P(X ≤ x, Y ≤ y) − P(X ≤ x) P(Y ≤ y)] dx dy

(see Lehmann [10] for a proof), we have Cov(μ_1, μ_2) ≤ 0. Since Var(T_L) = Var(T_R) + ((N − 1)/N) Cov(μ_1, μ_2), the theorem is proved.

Since g(t) as used in both Sections 3 and 5 is an increasing function of t, we can say that if Y = h(X) is a monotonic function of each of its arguments, Latin hypercube sampling is better than random sampling for estimating the mean and the population distribution function.
REFERENCES

[1] ANSCOMBE, F. J. (1959). Quick analysis methods for random balance screening experiments. Technometrics, 1, 195-209.
[2] BUDNE, T. A. (1959). The application of random balance designs. Technometrics, 1, 139-155.
[3] DEMPSTER, A. P. (1960). Random allocation designs I: On general classes of estimation methods. Ann. Math. Statist., 31, 885-905.
[4] DEMPSTER, A. P. (1961). Random allocation designs II: Approximate theory for simple random allocation. Ann. Math. Statist., 32, 387-405.
[5] EHRENFELD, S., and ZACKS, S. (1961). Randomization and factorial experiments. Ann. Math. Statist., 32, 270-297.
[6] EHRENFELD, S., and ZACKS, S. (1967). Testing hypotheses in randomized factorial experiments. Ann. Math. Statist., 38, 1494-1507.
[7] HIRT, C. W., NICHOLS, B. D., and ROMERO, N. C. (1975). SOLA: a numerical solution algorithm for transient fluid flows. Los Alamos Scientific Laboratory Report LA-5852, Los Alamos.
[8] HIRT, C. W., and ROMERO, N. C. (1975). Application of a drift-flux model to flashing in straight pipes. Los Alamos Scientific Laboratory Report LA-6005-MS, Los Alamos.
[9] JESSEN, RAYMOND J. (1975). Square and cubic lattice sampling. Biometrics, 31, 449-471.
[10] LEHMANN, E. L. (1966). Some concepts of dependence. Ann. Math. Statist., 37, 1137-1153.
[11] RAJ, DES (1968). Sampling Theory, pp. 206-209. New York: McGraw-Hill.
[12] SATTERTHWAITE, F. E. (1959). Random balance experimentation. Technometrics, 1, 111-137.
[13] STEINBERG, H. A. (1963). Generalized quota sampling. Nuc. Sci. and Engr., 15, 142-145.
[14] TOCHER, K. D. (1963). The Art of Simulation, pp. 106-107. Princeton, N.J.: D. Van Nostrand.
[15] YOUDEN, W. J., KEMPTHORNE, O., TUKEY, J. W., BOX, G. E. P., and HUNTER, J. S. (1959). Discussion of the papers of Messrs. Satterthwaite and Budne. Technometrics, 1, 157-193.
[16] ZACKS, S. (1963). On a complete class of linear unbiased estimators for randomized factorial experiments. Ann. Math. Statist., 34, 769-779.
[17] ZACKS, S. (1964). Generalized least squares estimators for randomized fractional replication designs. Ann. Math. Statist., 35, 696-704.
