
Reliability Engineering and System Safety 142 (2015) 310–325


Refined Stratified Sampling for efficient Monte Carlo based uncertainty quantification

Michael D. Shields (a,c,*), Kirubel Teferra (a), Adam Hapij (b), Raymond P. Daddazio (b)

(a) Department of Civil Engineering, Johns Hopkins University, United States
(b) Applied Science & Investigations, Weidlinger Associates, Inc., United States
(c) Department of Materials Science and Engineering, Johns Hopkins University, United States

(*) Corresponding author. E-mail address: michael.shields@jhu.edu (M.D. Shields).

DOI: 10.1016/j.ress.2015.05.023

ARTICLE INFO

Article history:
Received 3 July 2014
Received in revised form 18 May 2015
Accepted 30 May 2015
Available online 9 June 2015

Keywords:
Uncertainty quantification
Monte Carlo simulation
Stratified sampling
Latin hypercube sampling
Sample size extension

ABSTRACT

A general adaptive approach rooted in stratified sampling (SS) is proposed for sample-based uncertainty quantification (UQ). To motivate its use in this context, the space-filling, orthogonality, and projective properties of SS are compared with simple random sampling and Latin hypercube sampling (LHS). SS is demonstrated to provide attractive properties for certain classes of problems. The proposed approach, Refined Stratified Sampling (RSS), capitalizes on these properties through an adaptive process that adds samples sequentially by dividing the existing subspaces of a stratified design. RSS is proven to reduce variance compared to traditional stratified sample extension methods while providing comparable or enhanced variance reduction when compared to sample size extension methods for LHS – which do not afford the same degree of flexibility to facilitate a truly adaptive UQ process. An initial investigation of optimal stratification is presented and motivates the potential for major advances in variance reduction through optimally designed RSS. Potential paths for extension of the method to high dimension are discussed. Two examples are provided. The first involves UQ for a low dimensional function where convergence is evaluated analytically. The second presents a study to assess the response variability of a floating structure to an underwater shock.

1. Introduction

Monte Carlo methods are used for uncertainty quantification (UQ) in nearly every field of engineering and computational science. They are often (rightfully) criticized for their high computational cost – especially for reliability analysis and other applications where extreme response is of interest. Yet, they remain the most robust and effective means of quantifying uncertainty in computational analyses. For this reason, there has been much interest in improving the computational efficiency of Monte Carlo methods. As Janssen [1] points out, there are two means of accomplishing this computational savings: (1) improve the convergence rate of the sampling routine; and/or (2) use a sample set that is optimally small. This work addresses both of these aspects.

To achieve an improved convergence rate, we focus on a class of variance reduction techniques called stratified sampling [2,3] that includes the popular Latin hypercube sampling (LHS) method [4,3]. Stratified sampling methods operate by subdividing the sample space into smaller regions and sampling within these regions. In so doing, the produced samples more effectively fill the sample space and therefore reduce the variance of computed statistical estimators. LHS in particular has been widely used for uncertainty quantification (e.g. [5,6]) and its properties have been studied extensively [4,7–11]. True stratified sampling, on the other hand, has not received a similar level of attention in the UQ literature – largely due to the desirable properties of LHS for many applications – although it is widely recognized as an effective means of variance reduction [12–14]. In this work, some properties of stratified sampling methods (including LHS) are studied and the case is made for the use of true stratified sampling for certain classes of problems. In particular, there are many low-to-moderate dimensional applications where true stratified sampling is competitive with or even more effective than LHS. This motivates the need to extend these benefits to high dimensional problems, and some insights for future work along these lines are provided.

As for minimizing the sample size, there is no generally accepted means to identify an optimally small sample set a priori. In this work, we espouse a general adaptive paradigm for UQ (Fig. 1) wherein samples are progressively added and analyses conducted until a user-defined convergence condition is met. This is not a new concept, but it is one that is perhaps underdeveloped. Early work by Gilman [15] demonstrated this idea for classical Monte Carlo analyses using simple random sampling (SRS), and a recent work by Janssen [1] provides a renewed emphasis on its importance. Until recently though, most stratified sampling methods required a priori prescription of the sample size and precluded sample size extension. Therefore, it was necessary to conservatively oversample – otherwise the entire analysis would need to be restarted if the convergence criteria were not met.


[Fig. 1. General adaptive process for UQ. Initial samples: define N_min and conduct initial sampling. Analyses: conduct analysis for each sample (ODE, PDE, FE model, etc.). Statistics: evaluate convergence (bootstrap methods, KDEs, t-statistics, etc.) and STOP if converged. Otherwise, add samples: define N_add, generate N_add additional samples, and repeat.]
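To fix ideas, the loop in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration only: the toy model, the sampler (plain SRS here; RSS is developed in Section 5), and the convergence test (a simple relative change in the sample variance) are all placeholder choices, not the paper's recommended settings.

```python
import numpy as np

rng = np.random.default_rng(0)
model = lambda x: x[0] ** 2 * x[1]        # toy stand-in for an expensive simulation

N_min, N_add, tol = 20, 10, 1e-3          # illustrative settings
x = rng.random((N_min, 2))                # initial sampling (SRS on the unit square)
y = np.array([model(xi) for xi in x])
while True:
    x_new = rng.random((N_add, 2))        # add samples (replaced by RSS in Section 5)
    y_new = np.array([model(xi) for xi in x_new])
    y_old, y = y, np.concatenate([y, y_new])
    if abs(np.var(y) - np.var(y_old)) < tol * np.var(y):   # convergence check
        break
print(len(y), "samples; mean =", y.mean())
```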

Recent developments in LHS enable sample size extension using the so-called Hierarchical Latin Hypercube Sampling (HLHS) [16–18], Replicated Latin Hypercubes (RLHs) [19,20], and related methods [21–23]. These methods, while important, do not afford the flexibility necessary to fully capitalize on the UQ paradigm promoted in this work. Hierarchical methods, for example, produce – at a minimum – a new sample for each existing sample during the extension. Consequently, sample size increases exponentially with the number of extensions performed. RLHs grow linearly with the number of extensions ($n_{i+1} = 2n_i$) and require each subsequent LHS to be the same size and utilize the same stratification as the original. Since there is no refinement of the strata using RLHs, there is a tradeoff between loss of desirable sample properties and sample size growth rate. These considerations limit the adoption of a truly adaptive UQ process.

A few adaptive SS methods have been proposed in the past – primarily in the physics community, focusing on applications in multidimensional integration [24,25]. However, these methods have slightly different aims and, again, do not possess the desired level of adaptability. To enable effective sample size extension from a stratified design, a new methodology – Refined Stratified Sampling (RSS) – is proposed that is conceptually simple and adds as many or as few samples as desired at each extension. The method operates by dividing existing strata and generating samples in the newly created empty strata. The method is proven to unconditionally reduce the variance of the sample set when compared with existing methods for sample size extension in stratified sampling. It is demonstrated that the method affords convergence rates for statistical estimators that are comparable or superior to LHS for certain classes of problems, with the added advantage of maximal flexibility in its sample size extension capabilities. Moreover, through an exploration of stratum optimality, we suggest that the RSS method provides a convenient and rigorous avenue for identifying a stratification of the input space that samples optimally in the output space of interest. This is the basis of a parallel effort for applications in reliability analysis [26].

The emphasis of this paper is on developing the RSS methodology, which is then demonstrated on low-dimensional applications where the benefits of true stratified sampling are observed. Considerations for its extension to high dimensional problems are discussed and motivation provided for future research along these lines. Two specific example applications are provided.

2. Review of sampling methods

Consider a stochastic system described by the relation:

$Y = F(\mathbf{X})$   (1)

where the random vector $\mathbf{X} = \{X_1, X_2, \ldots, X_n\}$ possesses independent components defined over the sample space $S$ describing $n$ input random variables with marginal cumulative distribution functions (CDFs) $D_{X_i}(\cdot)$. Correlated random variables are not considered in this work as it is common practice to produce a set of uncorrelated random variables from a correlated set using methods such as Principal Component Analysis and the Nataf or Rosenblatt transformations. The operator $F(\cdot)$ commonly represents a computer simulation, such as a finite element model possessing strong nonlinearities and/or instabilities, such that $Y$ is difficult to assess probabilistically. The following presents a brief review of common Monte Carlo methods for randomly sampling $\mathbf{X}$ in order to perform statistical analysis of $Y$.

For notational clarity, subscript $i$ is used to denote a vector component, subscript $k$ is used to denote a stratum of the space, and subscript $l$ denotes a specific sample. Additionally, $n$ refers to the total number of vector components (dimension), $M$ refers to the number of strata in a design, and $N$ refers to the total number of samples.

2.1. Simple Random Sampling

Traditional Monte Carlo methods rely on so-called Simple Random Sampling (SRS) or Monte Carlo Sampling in which realizations of $\mathbf{X}$, denoted $\mathbf{x}_l,\ l = 1, \ldots, N$ (samples), are generated as independent and identically distributed (iid) realizations on $S$ by

$x_{li} = D_{X_i}^{-1}(U_i), \quad i = 1, 2, \ldots, n$   (2)

where the $U_i$ are iid uniformly distributed samples on [0,1]. The realizations $\mathbf{x}$ are then applied to the system $y = F(\mathbf{x})$ and $y$ is statistically evaluated.

2.2. Stratified Sampling

Stratified Sampling (SS) divides the sample space $S$ into a collection of $M$ disjoint subsets (strata) $\Omega_k,\ k = 1, 2, \ldots, M$ with $\bigcup_{k=1}^{M} \Omega_k = S$ and $\Omega_p \cap \Omega_q = \emptyset,\ p \neq q$. Samples $\mathbf{x}_l = [x_{l1}, x_{l2}, \ldots, x_{ln}],\ l = 1, 2, \ldots, N$ are generated by randomly drawing $M_k$ samples ($\sum_{k=1}^{M} M_k = N$) within each stratum $k$ according to

$x_{li}^{(k)} = D_{X_i}^{-1}(U_{ik}), \quad i = 1, 2, \ldots, n$   (3)

where the $U_{ik}$ are iid uniformly distributed samples on $[\xi_{ik}^{lo}, \xi_{ik}^{hi}]$ with $\xi_{ik}^{lo} = D_{X_i}(\zeta_{ik}^{lo})$ and $\xi_{ik}^{hi} = D_{X_i}(\zeta_{ik}^{hi})$, and $\zeta_{ik}^{lo}$ and $\zeta_{ik}^{hi}$ denote the lower and upper bounds respectively of the $i$th vector component of stratum $\Omega_k$. For our purposes, stratification is performed directly in the probability space, meaning that the strata are defined by prescribing the bounds $\xi_{ik}^{lo}$ and $\xi_{ik}^{hi}$ on the $n$-dimensional unit hypercube.
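A minimal sketch of Eqs. (2) and (3) in Python follows. The inverse marginal CDFs are taken from scipy.stats for illustration, and the stratum bounds are assumed to be given directly in the probability space, as described above; one sample is drawn per stratum here ($M_k = 1$).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def srs(dists, N):
    # Eq. (2): iid uniforms pushed through each marginal inverse CDF
    U = rng.random((N, len(dists)))
    return np.column_stack([d.ppf(U[:, i]) for i, d in enumerate(dists)])

def ss(dists, xi_lo, xi_hi):
    # Eq. (3): one sample per stratum; xi_lo/xi_hi are (M x n) probability bounds
    U = xi_lo + (xi_hi - xi_lo) * rng.random(xi_lo.shape)
    return np.column_stack([d.ppf(U[:, i]) for i, d in enumerate(dists)])

dists = [norm(0, 1), norm(0, 1)]
# four equal-probability squares (a symmetrically balanced design in the
# terminology defined below)
lo = np.array([[0.0, 0.0], [0.0, 0.5], [0.5, 0.0], [0.5, 0.5]])
hi = lo + 0.5
print(ss(dists, lo, hi))
```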

Beyond being disjoint ($\Omega_p \cap \Omega_q = \emptyset,\ p \neq q$) and filling the space ($\bigcup_{k=1}^{M} \Omega_k = S$), there are no restrictions on the strata definitions. Strata can be defined with different sizes and shapes in general and can even be defined a posteriori based on an existing sample set (so-called post-stratification) [2]. The size of a stratum in the probability space, $p_k$, is equal to its probability of occurrence, $p_k = P[\Omega_k]$. With these considerations in mind, we define three general classes of stratified designs:

1. Symmetrically Balanced Stratified Design (SBSD): A stratified design is said to be symmetrically balanced if all strata possess equal probability, $p_i = p_j\ \forall i, j$, and the probability space is divided equally in all variables of the design. In an SBSD, each stratum is an $n$-dimensional hypercube of equal size.
2. Asymmetrically Balanced Stratified Design (ABSD): A stratified design is said to be asymmetrically balanced if all strata possess equal probability, $p_i = p_j\ \forall i, j$, but the strata limits are assigned arbitrarily. In the simplest case of an ABSD, each stratum is an $n$-dimensional orthotope (or hyperrectangle). However, general ABSDs may possess polyhedral or other arbitrarily shaped strata.
3. Unbalanced Stratified Design (UBSD): A stratified design is said to be unbalanced if all strata do not possess equal probability (i.e. $\exists\, p_i, p_j : p_i \neq p_j$).

We refer to these terms throughout the paper, and samples drawn from these designs are referred to as Symmetrically Balanced Stratified Samples (SBSS), Asymmetrically Balanced Stratified Samples (ABSS), and Unbalanced Stratified Samples (UBSS) respectively.

2.3. Latin Hypercube Sampling

Latin Hypercube Sampling (LHS) divides the range of each vector component $X_i,\ i = 1, 2, \ldots, n$ into $M$ disjoint subsets (strata) of equal probability $\Omega_{ik},\ i = 1, 2, \ldots, n;\ k = 1, 2, \ldots, M$. Samples of each vector component are drawn from the respective strata according to

$x_{li}^{(k)} = D_{X_i}^{-1}(U_{ik}), \quad i = 1, 2, \ldots, n;\ k = 1, 2, \ldots, M$   (4)

where the $U_{ik}$ are iid uniformly distributed samples on $[\xi_k^{lo}, \xi_k^{hi}]$ with $\xi_k^{lo} = (k-1)/M$ and $\xi_k^{hi} = k/M$. The samples $\mathbf{x}_l = [x_{l1}, x_{l2}, \ldots, x_{ln}],\ l = 1, 2, \ldots, N$ are assembled by (uniformly) randomly grouping the terms of the generated vector components. That is, a term $x_{li}$ is generated by randomly selecting from the generated components $x_{li}^{(k)}$ (without replacement), and these terms are grouped to produce a sample. This process is repeated $M$ times.

Because the component samples are randomly paired, an LHS is not unique; there are $(M!)^{n-1}$ possible combinations. Improved LHS algorithms have been developed to determine optimal pairings that either enhance space-filling or reduce spurious correlation (increasing orthogonality). To improve space-filling, several Latin hypercube design methods have been developed that minimize the L2-discrepancy [11], maximize the minimum distance between points ('maximin' designs) [27–30], optimize projection properties to ensure samples are evenly spread when projected onto a known subspace [31], and minimize integrated mean square error and maximize entropy [32], among others. Similarly, numerous efforts have been made to reduce spurious correlations, beginning with Iman and Conover [33] and later Florian [34], who rearrange the matrix of samples based on a transformation of the rank number matrix. Huntington and Lyrintzis [9] and Vorechovsky and Novak [35] utilized iterative optimization methods to reduce spurious correlations, while others use orthogonal arrays [36]. Further, several authors have developed methods for constructing orthogonal Latin hypercubes that additionally possess enhanced space-filling properties [37,38]. Many of these designs are complex, laborious to implement, and/or computationally intensive. Meanwhile, there is no consensus on whether space-filling or minimal spurious correlation is preferable, but certainly the two properties are linked and there may be a trade-off in achieving either.

2.4. Statistical evaluation and variance reduction

For the purposes of evaluating statistical properties of the different sampling methods, consider the general statistical estimator defined by

$T(y_1, \ldots, y_N) = \sum_{l=1}^{N} w_l\, g(y_l)$   (5)

where $y_l = F(\mathbf{x}_l)$ and $\mathbf{x}_l$ denotes a sample generated according to SRS, LHS, or SS, $w_l$ are weights attributed to each sample ($w_l = 1/N$ for SRS and LHS but may differ for SS – see below), and $g(\cdot)$ is an arbitrary function. If $g(y) = y^r$, then $T$ represents an estimate of the $r$th moment, while $g(y) = \mathbf{1}\{y \leq Y\}$, where $\mathbf{1}\{\cdot\}$ denotes the indicator function, specifies estimation of the empirical CDF. In keeping with past notational conventions, we denote $T_R$, $T_L$, and $T_S$ as the statistical estimates produced from SRS, LHS, and SS respectively.

For SRS, it is well known that the sample variance of the statistical estimator is given by

$\mathrm{Var}[T_R] = \dfrac{\sigma^2}{N}$   (6)

where $\sigma^2$ denotes the variance of $g(Y)$. This estimate serves as a benchmark for comparison with the considered variance reduction techniques.

Stratified sampling does not necessitate an equal probability of occurrence for each sample. Consequently, in order to statistically evaluate the response quantity $Y$ generated using stratified samples, probabilistic weights are assigned to each sample according to the probability of occurrence of the stratum being sampled. That is, for a sample $\mathbf{x}_l \in \Omega_k$, the sample weight is defined as

$w_l = \dfrac{P[\Omega_k]}{M_k} = \dfrac{p_k}{M_k}$   (7)

where $M_k$ is the total number of samples in $\Omega_k$, subject to $\sum_{k=1}^{M} M_k = N$ and $\sum_{k=1}^{M} P[\Omega_k] = 1$.

Stratified sampling has been proven to unconditionally reduce the variance of statistical estimators when compared to SRS. McKay et al. [4] have shown that for a balanced stratified design (SBSD or ABSD)

$\mathrm{Var}[T_S] = \mathrm{Var}[T_R] - \dfrac{1}{N} \sum_{k=1}^{M} p_k (\mu_k - \tau)^2$   (8)

where $\mu_k$ is the mean value of the response evaluated over stratum $k$, and $\tau$ is the overall response mean.

Similarly, McKay et al. [4] have shown that the variance of the statistical estimator produced from a Latin hypercube design ($T_L$) is given by

$\mathrm{Var}[T_L] = \mathrm{Var}[T_R] + \dfrac{N-1}{N} \dfrac{1}{N^n (N-1)^n} \sum_{R} (\mu_p - \tau)(\mu_q - \tau)$   (9)

where $\tau$ is defined as before, $\mu_p$ is the mean value of LHS cell $p$ defined by the bounds of the strata on each of the $n$ marginal distributions for sample $l$, and $R$ denotes the summation over the restricted space of $N^n(N-1)^n$ pairs $(\mu_p, \mu_q)$ of cells having no cell coordinates in common. Notice that LHS does not unconditionally reduce the variance of the estimate, although it has been shown that the variance is reduced for any function $F(\cdot)$ having finite second moment when $N \gg n$ [7]. Stein [7] has also shown that the closer $F(\mathbf{X})$ is to additive ($F(\mathbf{X}) = \sum_{i=1}^{n} F_i(X_i)$), the more the variance of $T_L$ is reduced.

To the authors' knowledge, no comprehensive study exists which compares the variance reductions from Eqs. (8) and (9). Although such a study is beyond the scope of this paper, it is immediately clear that SS and LHS reduce variance through different statistical mechanisms. Thus, different classes of problems exist for which each method will afford superior variance reduction. LHS, for example, has been demonstrated to perform well for applications where $F(\mathbf{X})$ has strong additive components, while SS will be shown to perform well for applications with strong interactions among the variables of $\mathbf{X}$.
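The random pairing that defines LHS (Eq. (4)) and the weighted estimator of Eqs. (5) and (7) can both be sketched compactly. This is an illustrative implementation only; the marginals here are uniform on [0,1], so the inverse CDF is the identity.

```python
import numpy as np

rng = np.random.default_rng(1)

def lhs(N, n):
    # Eq. (4): one point per equal-probability 1D stratum, columns randomly paired
    strata = (np.arange(N)[:, None] + rng.random((N, n))) / N
    for i in range(n):
        strata[:, i] = rng.permutation(strata[:, i])   # random grouping of components
    return strata

def weighted_estimate(y, w, g=lambda y: y):
    # Eq. (5): T = sum_l w_l g(y_l); for SS the weights follow Eq. (7)
    return np.sum(np.asarray(w) * g(np.asarray(y)))

x = lhs(8, 2)
y = x[:, 0] * x[:, 1]
print(weighted_estimate(y, np.full(8, 1 / 8)))   # SRS/LHS weights are simply 1/N
```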

3. Space-filling, orthogonality, and projective properties of sample designs

In this section, we study and compare the space-filling, orthogonality, and projection properties of the designs considered in the previous section. It is demonstrated that, under certain conditions, SS has merits that, to date, have not been exploited and can lead to meaningful improvements in sample design.

3.1. Space-filling

Numerous metrics have been developed to quantify the space-filling of a sample design, including various discrepancy measures and the maximin and minimax distances. Each metric provides a slightly different interpretation of space-filling and can therefore yield apparently inconsistent results (one design may be "good" by one measure and "poor" by another). It is not our intention to discuss the relative merits of these measures in general; the interested reader is referred to the works of Fang et al. [39] and Dalbey and Karystinos [40] for further discussion. Instead, we aim to compare SRS, LHS, and SS using a couple of metrics to gain some insight into how these methods fill the space. For this purpose, we will utilize the commonly employed wrap-around L2-discrepancy [41] and a new method derived from the Voronoi decomposition [42].

The Voronoi decomposition is defined such that each point in space is associated with the nearest sample point. That is, each Voronoi cell is defined as the region of the space $P$, denoted $R_k$ and associated with point $P_k$, containing all points in $P$ whose distance to $P_k$ is less than or equal to the distance, $d$, to any other point $P_j,\ j \neq k$. That is,

$R_k = \{\mathbf{x} \in P \mid d(\mathbf{x}, P_k) \leq d(\mathbf{x}, P_j)\ \forall j \neq k\}$   (10)

where, for the purposes of this work, $d(\mathbf{x}, \mathbf{y})$ is the Euclidean distance. Dividing the space in this way facilitates evaluation of different sample designs because the cell sizes provide a direct measure of space-filling. In particular, large Voronoi cells correspond to regions of the space that are sparsely sampled, while small cells correspond to regions of the space that are densely sampled. By this measure, an optimal space-filling design produces cells whose sizes have minimal ensemble variance. In low dimension, the Voronoi decomposition can be established explicitly. In high dimension, however, the Voronoi decomposition must be estimated implicitly. To do so, the space is populated with a large number of points ($N_p \approx 10^5$–$10^8$) and each of these points is attributed to its nearest sample in the sample set. The size (volume) of each Voronoi cell is therefore determined by the proportion of points attributed to each sample, $N_i$:

$V_i = \dfrac{N_i}{N_p}$   (11)

With this general procedure for estimating cell sizes, we propose that the coefficient of variation of the cell sizes ($\sigma_V / \mu_V$), referred to herein as the "V-metric," provides an effective metric of space-filling that can be applied regardless of sample size or dimension. It is straightforward to interpret (a low value means that the points occupy approximately the same amount of space) and is visually clear when observed in 2D. Consider 100 samples drawn using SRS, LHS, and SBSS. A plot of the Voronoi decomposition (Fig. 2) shows that the SBSS produces cells that are more uniform in size than LHS and SRS. The inset statistics verify that, indeed, SBSS produces cells whose size variance is significantly smaller than both LHS and SRS. These relations are valid regardless of sample size, as demonstrated in Fig. 3, which shows the V-metric along with maximum and minimum observed cell sizes (from all sample sets) relative to sample size as evaluated by Monte Carlo simulation averaging over 10 independent trials. Note also that, by this metric, the space-filling of LHS degrades while the SBSS and ABSS maintain their space-filling properties as the sample size grows.

To study the dependence on dimension of the space-filling properties for these sample designs, we compute the average V-metric for each design from 1024 samples (repeated 10 times) of different dimension. Fig. 4(a) shows that, by this metric, SS provides a more even distribution of samples than either LHS or SRS for low-dimensional random vectors, while all sampling methods lose effectiveness for higher dimension. Notice that, for LHS, the use of the procedure by Iman and Conover [33] (denoted LHS-Corr) to reduce spurious correlation has only a small effect on space-filling although, as we will see in the following section, it significantly improves the orthogonality of the sample set.

As a cautionary note, the V-metric is meaningful only for random sampling methods, with the metric estimated over an ensemble of realizations. There are many examples where deterministic sampling or single instantiations of a sample set will produce a zero V-metric (or perfect space-filling) when, in fact, the space-filling is rather poor. One such example is a two-dimensional problem where the sample points lie at (0.25, 0.5) and (0.75, 0.5). In such a case, the Voronoi decomposition produces two equally sized rectangles with zero V-metric.

The second metric we consider is the wrap-around L2-discrepancy ($D_{L2}$) [41], defined through the standard discrepancy measure

$D(\mathbf{X}) = \left\| \dfrac{|\mathbf{X} \cap c_M|}{N} - \mathrm{Vol}(c_M) \right\|$   (12)

with $c_M \subseteq [0,1]^M$, where $\|\cdot\|$ denotes an L2 norm and the subset $c_M$ includes all hyperrectangles that can exist in a periodic domain $[0,1]^M$. Fig. 4(b), which plots $D_{L2}$ for different sample designs and dimensions, is in contrast with the Voronoi metric in that LHS is shown to provide better space-filling properties. This is due, as highlighted by Dalbey and Karystinos [40], to the fact that $D_{L2}$ is sensitive to differences in low-dimensional subspaces.

[Fig. 2. Voronoi tessellation of the 2D probability space from 100 samples drawn using SRS (left), LHS (middle), and SBSS (right). Inset cell-size statistics – SRS: Max = 0.0223, Min = 0.0014, St.D. = 0.0048; LHS: Max = 0.0248, Min = 0.0015, St.D. = 0.0046; SBSS: Max = 0.0161, Min = 0.0050, St.D. = 0.0027.]
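The implicit Voronoi volume estimate of Eq. (11) and the resulting V-metric are simple to sketch. The probing count Np here is deliberately small for speed; the paper uses 10^5–10^8 points.

```python
import numpy as np

rng = np.random.default_rng(2)

def v_metric(samples, Np=20_000):
    # Eq. (11): estimate Voronoi cell volumes by nearest-sample assignment of
    # Np uniformly distributed probing points, then return sigma_V / mu_V
    probes = rng.random((Np, samples.shape[1]))
    d2 = ((probes[:, None, :] - samples[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d2.argmin(axis=1), minlength=len(samples))
    V = counts / Np
    return V.std() / V.mean()

x = rng.random((100, 2))          # 100 SRS points on the unit square
print(v_metric(x))
```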

Since LHS discretizes the 1D subspaces very evenly, $D_{L2}$ indicates a high degree of space-filling. Meanwhile, the V-metric quantifies the uniformity over the whole $M$-dimensional space.

The comparison above does not conclude, therefore, that one sampling method is "better" than the other at space-filling. Instead, it serves to highlight that LHS and SS fill the space in different ways. Owing to the hierarchical ordering principle [43], which states that main effects and low-order interactions are usually more important than higher-order effects, the first-order space-filling of LHS is desirable for many problems. However, when interaction effects play an important role, the higher-order space-filling of SS may be desirable.

[Fig. 3. (a) Voronoi cell size coefficient of variation (V-metric) and (b) maximum/minimum cell size for different sample designs as a function of sample size for 2D samples.]

[Fig. 4. Space-filling properties of LHS, LHS with correlation correction (LHS-Corr), SRS, and SS as measured using (a) the Voronoi cell metric (V-metric) and (b) the wrap-around L2 discrepancy. Note that for n = 2, 5, 10 SS designs are SBSD while for n = 8, 20 SS designs are ABSD.]

[Fig. 5. Evaluation of spurious correlation produced by various 2D sample designs as a function of sample size: (a) standard deviation of spurious correlation, (b) maximum spurious correlation, and (c) condition number.]

3.2. Orthogonality

An orthogonal sample is one whose sample correlation matrix is diagonal. That is, the random samples are perfectly uncorrelated. In practice, an orthogonal (or near orthogonal) design can be difficult to achieve. Latin hypercube designs often produce spurious correlations unless corrected or iterated in some way (e.g. [9,33–35]).

Stratified samples, however, intuitively possess little artificial correlation to begin with. This is demonstrated in the following, where we consider two measures of orthogonality similar to those considered by Cioppa and Lucas [38].

First, we evaluate the spurious correlations among the sample variates for a two-dimensional sample produced using the various designs and consider their variations from the intended zero correlation using Monte Carlo simulation and averaging over 10 independent trials. Fig. 5(a) and (b) shows the standard deviation and maximum spurious correlation, respectively, for five different sample designs as a function of sample size. Notice that the stratified samples (both symmetric [SBSS] and asymmetric with 2:1 aspect ratio [ABSS]) produce smaller spurious correlations than even the LHS with explicit correlation control (LHS-Corr), while LHS without correlation control produces relatively large spurious correlations. Moreover, the relative improvement of a stratified design over LHS and LHS-Corr increases with sample size.

Next, we evaluate the condition number of $\mathbf{X}^T\mathbf{X}$, where $\mathbf{X}$ is the $N \times n$ design matrix with values in the probability space scaled to the range $[-1, 1]$, defined by

$\mathrm{cond}(\mathbf{X}^T\mathbf{X}) = \dfrac{\phi_1}{\phi_n}$   (13)

where $\phi_1$ and $\phi_n$ are the 1st and $n$th eigenvalues of $\mathbf{X}^T\mathbf{X}$ respectively. The condition number is a more general metric of orthogonality than any single correlation value, with a value of 1 indicating a perfectly orthogonal sample. It is therefore useful for higher dimensional samples. We consider first the condition number for the 2D samples above as a function of sample size (Fig. 5(c)) and observe the same general trend. However, as the dimension grows, stratified designs are less effective at achieving orthogonality, as demonstrated in Fig. 6, which shows the mean condition number from 10 independent trials for samples of size N = 256 (Fig. 6(a)) and N = 1024 (Fig. 6(b)) as a function of dimension for the different sampling methods.

As the dimension grows, LHS methods with active correlation reduction significantly improve the orthogonality of the sample, while SS produces sample sets with reduced spurious correlations up to moderate-dimensional samples without the need for any sample adjustment.

[Fig. 6. Condition number of the design matrix as a function of dimension for the different sampling methods from (a) N = 256 samples and (b) N = 1024 samples.]

3.3. Projective properties and variable interactions

Often, in an uncertainty analysis, certain variables have a strong effect on the response while others are relatively insignificant. Global, or variance-based, sensitivity analysis is one means of assessing the significance of each variable (and interactions) [44]. But this significance can usually only be established a posteriori (i.e. after concluding the Monte Carlo study). Consequently, it is necessary that a sampling method project all variables through the transformation effectively in order to capture the effects of the significant variables (even when they are not known). This property of a sample design is referred to as the projective property, and it is perhaps the greatest strength of LHS. Because LHS discretizes each variable finely (highly resolving all of the marginal distributions), it projects each variable through the transformation well.

The projective properties of LHS follow directly from the properties illuminated by Stein [7], who showed that LHS has the effect of filtering out the additive components (or main effects) of the transformation. More specifically, Stein showed that if the transformation $h(\mathbf{x})$ is decomposed in terms of its main effects $h_a(\mathbf{x})$ and interaction effects $r(\mathbf{x})$ as

$h(\mathbf{x}) = h_a(\mathbf{x}) + r(\mathbf{x})$   (14)

then the variance of the LHS estimator becomes

$\mathrm{Var}(T_L) = \dfrac{1}{N} \int r(\mathbf{x})^2\, dF(\mathbf{x}) + o\!\left(\dfrac{1}{N}\right)$   (15)

The variance associated with the main effects $h_a(\mathbf{x})$ is very small, $o(N^{-1})$, while the variance associated with the interactions $r(\mathbf{x})$ is equal to that of a standard Monte Carlo estimate (i.e. there is no variance reduction on the interactions). By contrast, SS reduces the variance on the main effects and the interactions in equal measure. This leads to the following conclusion. LHS has excellent projective properties for individual variables. SS, on the other hand, has moderate projective properties for individual variables but much better projective properties for variable interactions. This benefit, however, diminishes as the dimension grows, as demonstrated by the following example.

Consider two simple transformations. The first is an additive function defined by

$Y_1 = \dfrac{2}{n} \sum_{i=1}^{n} X_i$   (16)

where $X_i \sim U(0, 1)$. The second is a multiplicative function with strong variable interactions given by

$Y_2 = \prod_{i=1}^{n} X_i$   (17)

where $X_i \sim U(1 - \sqrt{3},\ 1 + \sqrt{3})$. Each transformation has been designed to produce $E[Y_1] = E[Y_2] = 1$.
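The contrast between Eqs. (16) and (17) is easy to reproduce numerically. The sketch below compares the spread of the mean estimate of $Y_2$ under LHS and under a symmetrically balanced SS design for n = 2; the sample count, trial count, and seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N, trials = 2, 1024, 200          # illustrative settings

def lhs(N, n):
    # one point per equal-probability 1D stratum, columns independently permuted
    ranks = np.argsort(rng.random((N, n)), axis=0)
    return (ranks + rng.random((N, n))) / N

def sbss(N, n):
    # symmetrically balanced design: N = m^n equal hypercubes, one point each
    m = round(N ** (1.0 / n))
    grid = np.stack(np.meshgrid(*[np.arange(m)] * n, indexing="ij"), -1).reshape(-1, n)
    return (grid + rng.random(grid.shape)) / m

Y2 = lambda u: np.prod(1.0 + np.sqrt(3.0) * (2.0 * u - 1.0), axis=1)   # Eq. (17)

for name, sampler in [("LHS", lhs), ("SBSS", sbss)]:
    estimates = [Y2(sampler(N, n)).mean() for _ in range(trials)]
    print(name, np.std(estimates))    # SS shows the smaller spread for Y2 here
```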

Fig. 7 shows the average standard deviation from 1000 Monte Carlo estimates of $E[Y_1]$ and $E[Y_2]$ from 1024 samples using SRS, LHS, and SS for varying problem dimensions. Note that since $E[X_i] \neq 0$, the transformation in Eq. (17) possesses both main effects and interaction effects. This can be seen by expressing $X_i$ in terms of zero-mean variables $\hat{X}_i$ as $X_i = \hat{X}_i + 1$ such that $Y_2 = (\hat{X}_1 + 1)(\hat{X}_2 + 1)\cdots(\hat{X}_n + 1)$. It is these main effects that enable a variance reduction from LHS on Eq. (17). As expected, LHS performs exceptionally well for transformation $Y_1$ regardless of dimension. However, SS reduces variance considerably over LHS for $Y_2$. This variance reduction diminishes with dimension but remains more effective than LHS up to $n = \log_2(N)$ (here $n = 10$ for $N = 1024$), after which neither method is capable of producing a meaningful variance reduction.

[Fig. 7. Response standard deviation for (a) an additive transformation and (b) a multiplicative transformation as a function of dimension for SRS, LHS, and SS.]

4. Sample size extension

The adaptive UQ methodology used in this paper requires the ability to easily add samples to an existing set. Sample size extension for SRS is straightforward because samples are iid realizations. However, in SS and LHS, sample size extension is not necessarily trivial. For SS, samples are traditionally added to existing strata although, as we will show, this is not optimal. Extension of LHS, meanwhile, requires maintaining equal weight for all samples, for which a few methodologies have been developed recently.

The first methodologies developed to extend Latin hypercube samples are referred to as Replicated Latin Hypercubes (RLH) [19,20]. RLH entails performing an additional (independent) LHS with identical stratification upon completion of the prior LHS. Using RLH, the minimum achievable sample size extension is limited by the original LHS size (it requires adding N additional samples at each refinement) and the sample does not benefit from any additional space refinement.

Recent works by Tong [16], Sallaberry et al. [17], and Vorechovsky [18] have proposed methods commonly referred to as Hierarchical Latin Hypercube Sampling (HLHS). Although these methods differ in their details (e.g. the methods in [17,18] enable extension for samples of correlated variables), the basic premise is the same. Given an LHS of size $N$, the strata of each sample component are further divided into $t + 1$ strata (one containing the original sample) – where $t$ is referred to as the refinement factor – and the new components are randomly paired as in a typical LHS implementation.

HLHS introduces $N_{new} = tN$ new samples to the set and, in general, through $r$ sample size extensions, the size of the sample set grows exponentially as

$N_{tot} = N(t+1)^r$   (18)

Its space-filling properties and statistical convergence are improved over RLH, though, because each sample size extension involves a division of the sample strata.

Lastly, an alternate means of sample size "extension" is through a method called Nested Latin Hypercube Sampling (NLHS), developed in [21] and [23]. Extension is presented in quotations because it is not truly an extension technique. Rather, it works in the opposite manner from HLHS by producing one very large LHS sample (larger than would presumably be necessary for the problem at hand) and dividing this sample into smaller "nested" Latin hypercubes. It can therefore be used as an extension method by considering first a single nest and then considering subsequent nests as the sample extensions.

These developments for adding samples to a Latin hypercube design represent a significant step toward achieving the desired adaptive Monte Carlo framework. Each of the methods, with the exception of RLH, strictly maintains the character (and all associated properties) of a Latin hypercube design as discussed in Section 3. Given the broad appeal of LHS and its many desirable features, these methods are attractive for many practical applications.
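To make the hierarchical refinement concrete, the following sketch extends a given LHS by one HLHS step with refinement factor t. It assumes the input u is an N x n LHS on the unit hypercube; the pairing of the new coordinates is random, as in a basic LHS, with no correlation control.

```python
import numpy as np

rng = np.random.default_rng(2)

def hlhs_extend(u, t=1):
    # Each of the N strata per component is split into t+1 substrata; the
    # original point keeps its substratum and the tN empty substrata per
    # component receive new coordinates, randomly paired across components.
    N, n = u.shape
    M = N * (t + 1)                                   # refined strata per component
    new_cols = []
    for i in range(n):
        occupied = np.floor(u[:, i] * M).astype(int)  # substrata already filled
        empty = np.setdiff1d(np.arange(M), occupied)  # the tN empty substrata
        coords = (empty + rng.random(empty.size)) / M # one new point in each
        new_cols.append(rng.permutation(coords))      # random pairing (LHS-style)
    return np.vstack([u, np.column_stack(new_cols)])  # size N(t+1), per Eq. (18)

u0 = (np.argsort(rng.random((4, 2)), axis=0) + rng.random((4, 2))) / 4  # LHS, N = 4
u1 = hlhs_extend(u0, t=1)   # 4 -> 8 samples, still an LHS on the refined grid
```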

5. Refined Stratified Sampling

The primary drawback of the LHS extension methods is the rapid sample size growth associated with sample size extension. The methodology developed in this section, referred to as Refined Stratified Sampling (RSS), makes use of the unequal sample weighting allowed by stratified sampling to extend a stratified sample set by as little as a single sample. Furthermore, after each extension, the sample maintains the properties of a stratified sample, with weights that can be easily computed if the sample components are independent and uncorrelated.

Consider the input random vector $\mathbf{X}$ with $n$ independent and uncorrelated components defined on the sample space $S$ as described in Section 2. Given an initial stratified sample set of size $N$ distributed over $M = N$ strata, the Refined Stratified Sampling methodology proceeds as follows:

1. Select a stratum $\Omega_k$ to divide according to the following criteria:
   - If a single stratum $\Omega_k$ exists such that $w_k > w_j\ \forall j \neq k$, divide this stratum.
   - If $N_s$ strata $\Omega_k,\ k = 1, 2, \ldots, N_s$ exist with $w_k = \max_j(w_j)$, randomly select the stratum $\Omega_k$ to divide with the probability of dividing $\Omega_k$ equal to $P[D(\Omega_k)] = 1/N_s$.
2. Divide stratum $\Omega_k$ in half according to the following criteria:
   - Compute the unstratified lengths of $\Omega_k$, defined as
     $\lambda_{ik} = \xi_{ik}^{hi} - \xi_{ik}^{lo}, \quad i = 1, 2, \ldots, n$   (19)
     where $\xi_{ik}^{hi}$ and $\xi_{ik}^{lo}$ are defined as in Section 2.2.
   - Determine the maximum unstratified length of $\Omega_k$ as $\Lambda_k = \max_i(\lambda_{ik})$.
   - Divide stratum $\Omega_k$ along the component $i_n$ corresponding to $\Lambda_k$. If $N_c$ components exist such that the individual unstratified lengths $\lambda_{i_n k} = \Lambda_k,\ i_n = 1, 2, \ldots, N_c$, then randomly select the component to divide with the probability of dividing component $i_n$ equal to $P[D(i_n)] = 1/N_c$.
3. Keeping all existing samples $\mathbf{x}_l,\ l = 1, 2, \ldots, N$, randomly sample in the newly defined empty stratum. Compute new weights for each of the existing samples and the new sample according to Eq. (7).
4. Repeat steps 1–3 for every new extension.

The RSS process is described graphically in the flowchart provided in Fig. 8 for a two-component random vector starting with N = M = 1 and showing the first four sample-size extensions using RSS.

[Fig. 8. Refined Stratified Sampling: flowchart of sample size extension procedure for two random variables.]

5.1. Why refine strata?

The methodology presented herein utilizes a refinement of the strata definitions rather than adding samples to existing strata. The benefits of this strategy can be significant – but the strategy can also be detrimental if not implemented appropriately. The benefits of the method rely on the following theorem.

Theorem 1. Let $\Omega_1$ denote a stratum of space $S$ and $\omega_k,\ k = 1, \ldots, N_{ss}$ denote $N_{ss}$ disjoint substrata of $\Omega_1$ such that $\bigcup_{k=1}^{N_{ss}} \omega_k = \Omega_1$ and $P[\omega_k] = P[\omega_j]\ \forall k, j \in 1, \ldots, N_{ss}$. The variance of a statistical estimator computed using $N_{ss}$ samples drawn singularly from $\omega_k,\ k = 1, \ldots, N_{ss}$ will always be less than the variance of the same estimator computed using $N_{ss}$ samples drawn from $\Omega_1$.

Proof 1. Consider a statistical estimator from a stratified design, $T_S$, as given in Section 2.4. Next, consider two different stratified designs wherein all strata are identical except in one region, denoted $\Omega_1$. In design one, the region $\Omega_1$ has only one stratum possessing $N_{ss}$ samples. In design two, $\Omega_1$ is divided into $N_{ss}$ balanced strata, each possessing a single sample. The variances of the statistical estimators from these two designs are given by

$\mathrm{Var}[T_{s1}] = \dfrac{p_1^2}{N_{ss}} \sigma_1^2 + \sum_{j=2}^{M} \dfrac{p_j^2}{N_j} \sigma_j^2$   (20a)

$\mathrm{Var}[T_{s2}] = \sum_{k=1}^{N_{ss}} p_k^2 \sigma_k^2 + \sum_{j=2}^{M} \dfrac{p_j^2}{N_j} \sigma_j^2$   (20b)

where $\sum_{k=1}^{N_{ss}} p_k = p_1$ and the $M$-term summation refers to all points in the stratified design outside of region $\Omega_1$. The difference in variance between these estimators is given by

$\mathrm{Var}[T_{s1}] - \mathrm{Var}[T_{s2}] = \dfrac{p_1^2}{N_{ss}} \sigma_1^2 - \sum_{k=1}^{N_{ss}} p_k^2 \sigma_k^2$   (21)

Under the condition that $p_k = p_1 / N_{ss}$ (i.e. balanced stratum refinement of $\Omega_1$), the difference reduces to

$\mathrm{Var}[T_{s1}] - \mathrm{Var}[T_{s2}] = \dfrac{p_1^2}{N_{ss}} \left( \sigma_1^2 - \dfrac{1}{N_{ss}} \sum_{k=1}^{N_{ss}} \sigma_k^2 \right)$   (22)

It is straightforward to show that the term on the right hand side of Eq. (22) is strictly positive. Therefore, variance is reduced using a balanced restratification of $\Omega_1$. □
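A minimal sketch of the RSS loop (steps 1–4 above) on the n-dimensional unit hypercube follows, using the balanced halving whose benefit Theorem 1 establishes. Ties in stratum probability and in stratum side length are broken at random, and since every stratum holds exactly one sample, the returned weights are simply the stratum probabilities (Eq. (7) with $M_k = 1$).

```python
import numpy as np

rng = np.random.default_rng(3)

def rss(n, N):
    # start from a single stratum (N = M = 1) and add one sample per extension
    lo, hi = np.zeros((1, n)), np.ones((1, n))         # stratum bounds
    x = [rng.uniform(lo[0], hi[0])]                    # first sample
    for _ in range(N - 1):
        w = np.prod(hi - lo, axis=1)                   # stratum probabilities p_k
        k = rng.choice(np.flatnonzero(w == w.max()))   # step 1: largest stratum
        lam = hi[k] - lo[k]                            # unstratified lengths, Eq. (19)
        i = rng.choice(np.flatnonzero(lam == lam.max()))
        mid = 0.5 * (lo[k, i] + hi[k, i])              # step 2: balanced halving
        lo_new, hi_new = lo[k].copy(), hi[k].copy()
        if x[k][i] <= mid:                             # existing sample keeps its half
            hi[k, i] = mid; lo_new[i] = mid
        else:
            lo[k, i] = mid; hi_new[i] = mid
        lo = np.vstack([lo, lo_new]); hi = np.vstack([hi, hi_new])
        x.append(rng.uniform(lo_new, hi_new))          # step 3: sample the empty half
    return np.asarray(x), np.prod(hi - lo, axis=1)     # samples and weights

x, w = rss(n=2, N=9)
print(x.shape, w.sum())   # weights are stratum probabilities and sum to 1
```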

5.1.1. Optimal strata refinement for various output distributions

Theorem 1 states that a balanced stratum refinement of any given stratum will always reduce the variance of statistical estimates when compared with simply adding samples to the existing stratum. Note, however, that Theorem 1 does not say that any stratum refinement in general will result in reduced variance. In fact, improper stratum refinement can increase the variance. Furthermore, Theorem 1 does not imply that a balanced stratum refinement produces the optimal variance reduction. It often will not.

Consider a statistical estimate of the expected value of $Y$, $T_S = \hat{E}[Y]$, from a stratified design. Here, we establish the optimal stratum division (on the basis of variance reduction) for different distributions of $Y$ produced from the transformation $Y = F(\mathbf{X})$. For demonstration purposes, the estimate is computed initially from two samples, one in each stratum $\Omega_1$ and $\Omega_2$ with $p_1 = P[\Omega_1] = 0.5$ and $p_2 = P[\Omega_2] = 0.5$, as depicted in Fig. 9.

[Fig. 9. Identification of the optimal stratum refinement location for a normal distribution.]

We are interested in identifying the optimal location to divide the strata such that a single sample can be added while minimizing $\mathrm{Var}[T_S]$. In each example, stratum $\Omega_2$ is divided into two substrata denoted $\omega_{21}$ and $\omega_{22}$ possessing, respectively, probability weights $p_{21}$ and $p_{22}$ subject to

$p_{21} = P[\omega_{21}] = z p_2$   (23a)

$p_{22} = P[\omega_{22}] = (1 - z) p_2$   (23b)

where $z \in [0, 1]$, referred to as the 'Imbalance Factor,' is a measure of the imbalance of the resulting stratum division (Fig. 9). Note that $z = 0.5$ corresponds to balanced stratum refinement. The objective of the optimization can be stated as

minimize over $z$: $\mathrm{Var}[T_s] = p_1^2 \sigma_1^2 + p_{21}^2 \sigma_{21}^2 + p_{22}^2 \sigma_{22}^2$
subject to: $p_{21} = z p_2$; $p_1 + p_{21} + p_{22} = 1$   (24)

We consider three different output distributions: $Y \sim \mathrm{Normal}(0, 1)$, $Y \sim \mathrm{Uniform}(0, 1)$, and $Y \sim \mathrm{LogNormal}(-1.49, 1.27)$. Optimization as defined by Eq. (24) yields the stratum refinement outlined in Table 1 for each distribution. Table 2, meanwhile, compares the resulting variances of $T_S$ for each distribution considering different strata definitions: (1) two samples – one from $\Omega_1$ and one from $\Omega_2$; (2) three samples with no refinement – a single sample is added to stratum $\Omega_2$; (3) three samples with balanced refinement – stratum $\Omega_2$ is divided such that $p_{21} = p_{22} = 0.25$; (4) three samples with optimal refinement of stratum $\Omega_2$ as defined in Table 1. Notice that optimal stratification can have a profound influence on variance reduction.

Table 1
Optimal strata division for uniform, normal, and lognormal output distributions.

Dist.           | p21    | p22    | σ²21    | σ²22    | z
U(0,1)          | 0.25   | 0.25   | 5.2e-3  | 5.2e-3  | 0.5
N(0,1)          | 0.3162 | 0.1838 | 6.53e-2 | 0.21    | 0.632
LN(-1.49,1.27)  | 0.4426 | 0.0574 | 0.1165  | 6.97    | 0.885

Table 2
Variance of the estimated mean value (Var[T_S]) for different strata refinements and different output distributions.

Dist.           | Two samples | Three samples:
                |             | No refinement | Bal. refinement | Opt. refinement
U(0,1)          | 0.0104      | 7.8125e-3     | 5.859e-3        | 5.859e-3
N(0,1)          | 0.18175     | 0.1363        | 0.1083          | 0.1045
LN(-1.49,1.27)  | 0.4138      | 0.2073        | 0.1696          | 0.04667

Fig. 10 shows the variance of the mean estimates for the normal, uniform, and lognormal distributions as a function of the imbalance factor $z$, with the optimal value demarcated by an 'x'. The dashed line shows the variance resulting from samples produced without stratum refinement. Notice that the balanced refinement is not necessarily optimal but always reduces variance, while many refinement strategies (e.g. $z < 0.4$ for lognormal output) actually increase the variance relative to estimates produced without refinement. Analysis of these results shows that the closer the stratum $\Omega_2$ is to possessing a symmetric conditional distribution $D_Y(y \mid y \in \Omega_2)$, the closer the balanced refinement strategy becomes to optimal. Conversely, when the conditional distribution $D_Y(y \mid y \in \Omega_2)$ possesses strong asymmetry, the optimal refinement will be unbalanced. In such cases, the balanced refinement procedure proposed herein produces diminished returns in terms of variance reduction.

[Fig. 10. Variance of the estimated mean value from three samples with (a) normal, (b) uniform, and (c) lognormal output distributions as a function of stratum imbalance factor.]

The analysis performed in this section assumes stratification in the domain of the output $Y$. In general, the corresponding strata in the input domain of $\mathbf{X}$ cannot be identified unless $F(\mathbf{X})$ is known and invertible. Additionally, the distribution of $Y$ is not known in general. Consequently, it can be difficult and sometimes impossible to identify the stratum division for $\mathbf{X}$ that corresponds to the optimal division on $Y$ – particularly if $F(\mathbf{X})$ is strongly nonlinear, non-monotonic, discontinuous, or otherwise complex. Balanced stratum refinement on the input space, on the other hand, necessarily corresponds to a balanced stratum refinement on the output space and will always provide a reduction in variance. For this reason, its use is recommended when stratum optimization is not possible.
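The one-dimensional optimization in Eq. (24) is easy to reproduce numerically. The sketch below does so for the standard normal case, representing the strata at probability level ($\Omega_1 = [0, 0.5)$, $\Omega_2 = [0.5, 1)$) and approximating the conditional variances by an inverse-CDF quadrature; it should recover $z \approx 0.632$, as in Table 1. Availability of scipy is assumed.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def stratum_var(a, b, num=20001):
    # conditional variance of Y ~ N(0,1) restricted to the probability slice (a, b)
    p = np.linspace(a, b, num)[1:-1]     # interior points avoid ppf(0) and ppf(1)
    return np.var(norm.ppf(p))

def objective(z):
    # Eq. (24): Var[T_S] = p1^2 s1^2 + p21^2 s21^2 + p22^2 s22^2, p21 = z * p2
    p1, p21, p22 = 0.5, 0.5 * z, 0.5 * (1.0 - z)
    return (p1**2 * stratum_var(0.0, 0.5)
            + p21**2 * stratum_var(0.5, 0.5 + p21)
            + p22**2 * stratum_var(0.5 + p21, 1.0))

res = minimize_scalar(objective, bounds=(0.01, 0.99), method="bounded")
print(res.x, objective(res.x))   # ~0.632 and ~0.1045 (cf. Tables 1 and 2)
```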

Identifying optimal stratum refinement, however, represents a potentially important direction for future research. While some similar concepts have been developed for optimally sampling the input space with probabilistically weighted draws to match the input probability model (e.g. [45]), the potential capability to use RSS within the adaptive framework promoted herein represents an opportunity to define a partitioning of the input space that optimizes the partitioning of the output space. This development is one that is potentially very powerful for efficient uncertainty analysis. An initial exploration along these lines is undertaken in a parallel effort for applications in reliability analysis [26], where strata in the input variable space are divided such that samples are concentrated in the region of the output space near the limit state function.

5.2. RSS yields unbalanced stratified designs

Globally, RSS produces stratified designs that are balanced only when $N = M = 2^p$, are symmetrically balanced only when $N = M = 2^{np}$ for integer values of $p$, and are otherwise unbalanced (assuming an initial sample size of one). In general, producing unbalanced designs is recommended only when the output distribution calls for it (see the previous section). However, the output distribution is unknown in general, and therefore balanced stratification should be used whenever optimization is not possible. For this reason, it is recommended that the initial stratified design be a balanced one (asymmetric or symmetric). The proposed methodology will then produce unbalanced designs during most iterations, but this is justified by Theorem 1. That said, the analysis conducted in the previous section and the variance reduction expression in Eq. (8) together imply that optimal refinement will not rely on randomly selecting the stratum to divide but rather on selecting it based on some established criteria that may optimize variance reduction. One condition that could be considered, for example, is to enforce some symmetry conditions in stratum selection such that the stratum design does not become too heavily imbalanced.

5.3. Expected performance and extension to high dimensional applications

As demonstrated in Section 5.1, the efficiency of the method is dependent on the resulting distribution (and more specifically the conditional distributions of the response from within each stratum). Given these considerations, the method is expected to converge more rapidly for conditional output distributions that are closer to symmetric. For problems whose output distribution is strongly asymmetric and characterized by a dominant peak and heavy tails (i.e. those with high skewness and kurtosis), the convergence rate of the RSS method will be reduced due to the sub-optimality of the stratum refinement. Optimal stratum refinement methods need to be developed to fully capitalize on the variance reduction potential of RSS.

In its present form, the proposed methodology is not well-suited to problems with very high dimensional input vectors. As the analysis in Section 3 implies, stratified sampling loses considerable effectiveness for dimensions above $n = \log_2(N)$ and its benefits are most pronounced only for low dimensional problems. To extend the RSS method to very high dimensional spaces, it is currently being considered to conduct RSS on low-dimensional subspaces with samples generated on these subspaces randomly paired as in LHS. Moreover, a major deficiency of the proposed method from a dimensionality perspective is that rectilinear stratification requires stratum division in only a single dimension. A generalized RSS algorithm would refine strata of arbitrary shape (i.e. Voronoi cells) to allow refinement in multiple dimensions simultaneously – thereby greatly reducing dimensional dependence. Such algorithms are the topic of a forthcoming work.

Table 3
Input distributions and resulting analytical moments of the output [LN = LogNormal(μ, σ), U = Uniform(a, b), N = Normal(μ, σ)].

Case | X1          | X2      | α        | E[Y]    | Var[Y]  | Skew[Y] | Kurt[Y]
A    | LN(0,0.01)  | U(0,20) | N(1,0.1) | -113.33 | 12012.0 | -0.77   | -0.55
B    | LN(0,0.1)   | U(0,10) | N(1,0.1) | -23.37  | 621.18  | -0.89   | -0.24
C    | LN(0,0.1)   | U(0,7)  | N(1,0.1) | -9.32   | 121.97  | -0.97   | -0.09
D    | LN(0,0.1)   | U(0,6)  | N(1,0.1) | -5.98   | 58.46   | -1.01   | 0.002
E    | LN(0,0.1)   | U(0,5)  | N(1,0.1) | -3.31   | 23.65   | -1.08   | 0.17
F    | LN(0,0.3)   | U(0,5)  | N(1,0.1) | -3.10   | 25.03   | -1.17   | 0.81
G    | LN(0,0.4)   | U(0,5)  | N(1,0.1) | -2.87   | 26.80   | -0.99   | 2.08
H    | LN(0,0.45)  | U(0,5)  | N(1,0.1) | -2.70   | 28.85   | -0.48   | 9.42
I    | LN(0,0.475) | U(0,5)  | N(1,0.1) | -2.60   | 30.56   | 0.09    | 23.84
J    | LN(0,0.5)   | U(0,5)  | N(1,0.1) | -2.48   | 33.05   | 1.08    | 60.62

[Fig. 11. (a) Number of samples required to obtain 95% convergence using SRS, HLHS, and RSS as a function of response kurtosis. (b) Reduction in sample size for HLHS and RSS compared to SRS as a function of response kurtosis.]

6. Applications

6.1. Statistical analysis of a stochastic function

Consider the cubic polynomial function given by

$Y = F(\mathbf{X}) = X_1^2 X_2 - \alpha X_1 X_2^2 + X_1 X_2$   (25)

The function possesses three random variables: $X_1$, $X_2$, and $\alpha$. Several sets of input distributions are considered to elicit significantly different response distributions, as described through analytically computed moments in Table 3. The cases cover a vast range of distributions, from those approaching uniform to those with sharp peaks and heavy tails. Convergence is defined based on the following relative error:

$\left| \dfrac{\sigma_y^2 - \mathrm{Var}[Y]}{\mathrm{Var}[Y]} \right| \leq \epsilon_{th}$   (26)

where $\mathrm{Var}[Y]$ is the analytically determined variance, $\sigma_y^2$ is the variance computed from the generated samples, and $\epsilon_{th} = 0.01$ (in general, analytical moments will not be available – this example is selected for demonstration purposes to allow a simple and clearly defined convergence criterion).

For each of the parameter sets, 1000 calculation sets were performed using RSS, HLHS, and SRS. Each calculation set begins with 20 samples and is extended until the convergence criterion is satisfied. The number of samples required for convergence is recorded for each set. The results of this study are summarized in Fig. 11. Fig. 11(a) shows the number of samples required to achieve convergence for 95% of the calculation sets using SRS, HLHS, and RSS as a function of response excess kurtosis. In general, a larger number of samples is required for convergence as the kurtosis increases, although the proposed RSS method consistently outperforms both SRS and HLHS in this regard. Fig. 11(b) shows the relative reduction in sample size over SRS for both HLHS and RSS defined by

$\dfrac{N_S - N_{H/R}}{N_S}$   (27)

where $N_S$ and $N_{H/R}$ are the numbers of samples required for 95% convergence (plotted on the left) for SRS and HLHS/RSS respectively. By this measure, the RSS method affords upwards of a 90% reduction in sample size when compared to SRS for responses with low kurtosis. Initially, the effectiveness drops precipitously as the kurtosis increases. After kurtosis ≈ 2, it stabilizes while still providing significant sample size reduction. Notice also that RSS consistently produces a 10–20% greater reduction in sample size than HLHS. A complete tabular summary of this study is provided in the Appendix.

Next, we consider the convergence of the empirical CDFs. The "true" CDFs $D_Y(y)$ for each case A–J are determined by Monte Carlo simulation with 100,000 stratified samples. Convergence to $D_Y(y)$ is measured for the first 10,000 samples using the area validation metric (a.k.a. the Minkowski L1 norm) as [46]

$\delta^{(l)} = \int_{-\infty}^{\infty} \left| D_Y(y) - \hat{D}_Y^{(l)}(y) \right| dy$   (28)

where $\hat{D}_Y^{(l)}(y)$ denotes the empirical CDF computed from $l$ samples. Sample plots showing the convergence of $\delta^{(l)}$ using SRS, HLHS, and RSS are provided in Fig. 12. While HLHS and RSS provide comparable rates of convergence, HLHS allows only those sample sizes shown with an 'x', highlighting the need to add many samples during an HLHS sample size extension. RSS, meanwhile, affords extension by as many or as few samples as desirable for the application.

[Fig. 12. Convergence of the empirical CDF to the "true" CDF for select cases (Distributions C, E, H, and J) using SRS, HLHS, and RSS.]

Lagrangian
Structural Model

Eulerian Fluid Mesh

Fig. 13. (a) Schematic of coupled fluid structure model for underwater shock response of a surface vessel. (b) Simplified model used for demonstration purposes.

6.2. Structural response to underwater shock

Growing emphasis is being placed on uncertainty quantification in computational physical modeling and simulation, where it is desirable to evaluate the response variability of large and complex physical systems. We consider the case of the multi-physical response of a floating structure to underwater shock. These calculations involve large Lagrangian structural finite element models (with tens of thousands of elements) coupled with extremely large Eulerian fluid models (with tens to hundreds of millions of fluid elements), as depicted in Fig. 13(a). Such calculations often require several days to weeks of computation time on massively parallel computers. An accompanying paper on the validation of computational models for this problem was recently published by the authors [47].

To demonstrate the tractability of uncertainty quantification for these problems, it was necessary to show that statistical convergence could be accomplished for the response in a small number of samples. To do this, a simplified representation of the system was constructed as shown in Fig. 13(b). The simplified model is an uncoupled one-dimensional, two-degree-of-freedom damped system consisting of a mass M1 representing a primary structure supporting a much larger piece of equipment with M2 = 5M1. Simplified representations of this form have been used for decades to study fluid–structure interaction in near-surface underwater shock. Early work by Bleich and Sandler [48] presented the effects of bulk surface cavitation on the velocity response of a floating structure subject to underwater shock in a bilinear cavitating fluid. This was followed by the development of a numerically uncoupled treatment of the surface shock response problem in 1D that accounts for the effects of cavitation by DiMaggio et al. [49]. In a few more recent papers describing a numerical technique for capturing the effects of cavitation in finite element models, Sprague and Geers [50,51], with corrections by Stultz and Daddazio [52], presented the two-mass problem (neglecting the effects of structural damping) as it is studied here.

Unlike the previous authors, we are not specifically concerned with the effects of bulk cavitation and the complex physics that drive the response following the initial kickoff. Instead, we are interested in capturing the variability in peak velocity of the mass M2, given uncertainties in the model, using a minimal number of samples. This is motivated by the need to verify that simulation-based uncertainty quantification is a tractable option for the more complex multi-physics model. By considering the sensitivity of the model to various parameters and assessing realistic uncertainties associated with this problem, we identify two uncertain quantities. First, damping of real structures is notoriously difficult to ascertain and model with any accuracy. This is further complicated by the dissipation of energy resulting from fluid–structure interaction. Linear viscous damping (ξd) is assumed and is considered to possess a truncated (non-negative) normal distribution with mean μd = 0.025 and standard deviation σd = 0.01. Second, noticeable variabilities are known to arise in the peak pressure produced by the underwater detonation of certain explosives. Here, we consider charge weight to be a normal random variable with mean μc = 117 lb TNT and standard deviation σc = 1.17 lb TNT to account for this variability in peak pressure.
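For reference, the two uncertain inputs can be drawn with SciPy as follows. This is a minimal sketch under stated assumptions: the plain uniform draws are a placeholder for the stratified unit-hypercube samples that RSS would supply, which are then mapped through the inverse CDFs.

```python
import numpy as np
from scipy.stats import norm, truncnorm

rng = np.random.default_rng(1)
mu_d, sigma_d = 0.025, 0.01   # damping ratio: truncated (non-negative) normal
mu_c, sigma_c = 117.0, 1.17   # charge weight (lb TNT): normal

u = rng.random((1000, 2))     # placeholder for stratified samples in [0,1)^2
# truncnorm takes its truncation bounds in standardized units: (0 - mu)/sigma.
a = (0.0 - mu_d) / sigma_d
xi_d = truncnorm.ppf(u[:, 0], a, np.inf, loc=mu_d, scale=sigma_d)
w_c = norm.ppf(u[:, 1], loc=mu_c, scale=sigma_c)
```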
6.2.1. Modified bootstrap convergence evaluation

To evaluate statistical convergence, a bootstrap procedure is used that enables us to estimate approximate confidence bounds for response statistics of interest. The traditional bootstrap procedure, developed by Efron [53], performs a random resampling (with replacement) of a dataset to obtain a large number of surrogate sample sets. Statistical analysis of these so-called bootstrap samples provides an estimate of the variability of the computed statistic. The traditional bootstrap method resamples from the original dataset with each sample possessing equal probability of occurrence. That is, the probability of selecting sample Sl from a set of N samples is given by P[Sl] = 1/N. However, in stratified samples, and more specifically the RSS procedure developed herein, the samples possess associated weights that are not necessarily equal. To account for this unequal weighting (imbalanced stratified design), a modified bootstrap method is proposed as follows:

1. Find the least common denominator D for all sample weights and create a modified dataset of size D by duplicating samples such that each sample in the modified set has equal weight.
2. Resample a set of N points (bootstrap sample) with replacement from the modified set.
3. Compute the statistic of interest (bootstrap statistic) from the bootstrap sample.
4. Repeat a large number of times to obtain an empirical distribution for the bootstrap statistic.

The proposed resampling process, sketched in code at the end of this section, accounts for the sample weights, but it should be emphasized that the bootstrap datasets produced from this method are not stratified samples. In any given bootstrap resample, it is likely that certain strata will not be represented. The consequence of this loss of stratification is a potential increase in the variance of the bootstrap statistics.

This new bootstrap method provides an admittedly imperfect monitor of convergence. The shortcomings of the proposed monitor motivate the need for further research to establish a more effective and general monitor.
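A minimal sketch of the procedure follows (function and variable names are illustrative). Note that resampling with probabilities equal to the normalized sample weights is numerically equivalent to steps 1–2 above, since in the duplicated set of size D each sample appears a number of times proportional to its weight.

```python
import numpy as np

def modified_bootstrap(samples, weights, stat, n_boot=10_000, rng=None):
    """Weighted bootstrap for an imbalanced stratified design (Section 6.2.1)."""
    rng = rng or np.random.default_rng()
    samples = np.asarray(samples)
    p = np.asarray(weights, dtype=float)
    p /= p.sum()                  # normalize the sample weights
    n = len(samples)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Weighted resampling with replacement (equivalent to steps 1-2)
        idx = rng.choice(n, size=n, replace=True, p=p)
        boot[b] = stat(samples[idx])   # step 3: bootstrap statistic
    return boot                        # step 4: empirical distribution

# e.g., a 95% bootstrap confidence interval for the mean of peak velocity
# V2, where v2 holds the RSS results and w the associated sample weights:
#   boot = modified_bootstrap(v2, w, np.mean)
#   lo95, hi95 = np.percentile(boot, [2.5, 97.5])
```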
[Fig. 14. Evolution of response statistics (mean, standard deviation, skewness, and kurtosis versus number of samples) for peak velocity of mass M2 using SRS (left) and RSS (right), showing 95% bootstrap confidence intervals for 4000 samples.]
6.2.2. Results

The bootstrap procedure outlined in the previous section is used to determine confidence intervals for the response statistics from the empirical CDF of the bootstrap statistics. We are interested in evaluating statistics of the peak velocity of mass M2, denoted by V2. Convergence is defined such that the bootstrap 95% confidence bounds for each statistic of interest are sufficiently narrow. Denoting a computed response statistic by T, with bootstrap 95% confidence intervals given by $T_{95}^l$ and $T_{95}^u$ corresponding to the lower and upper bounds respectively, the convergence criterion is expressed as follows:

$$\left| \frac{T_{95}^u - T_{95}^l}{T} \right| \le \epsilon_{th} \qquad (29)$$

Note that a convergence criterion based on confidence intervals from an empirical CDF was selected in order to account for skewness of the likely non-Gaussian bootstrap CDF. A convergence criterion based on the variance of the bootstrap statistics, for example, would not account for this reality.
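In code, this criterion might read as follows, continuing the modified_bootstrap sketch above; here t is the statistic computed from the original weighted sample, and the percentile bounds respect any skewness in the bootstrap distribution:

```python
import numpy as np

def ci_converged(t, boot_stats, eps_th):
    """Eq. (29): relative width of the bootstrap 95% confidence interval
    around the computed statistic t."""
    t_l95, t_u95 = np.percentile(boot_stats, [2.5, 97.5])
    return abs(t_u95 - t_l95) / abs(t) <= eps_th
```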
Convergence was compared for RSS and SRS. Considering the performance comparison presented previously and the need for maximal flexibility in minimizing sample size, HLHS was not considered a good candidate for these analyses. Initially, 90,000 analyses were performed using SRS and their convergence evaluated. To achieve superior convergence in all statistics, far fewer calculations were required using RSS. In all, 4000 RSS analyses were conducted, although fewer than this were necessary to match the convergence of SRS. Plots of the first four moments (with bootstrap confidence intervals) as a function of sample number for peak velocity V2 are provided in Fig. 14 using both SRS (left) and RSS (right).

By establishing convergence thresholds, we determine the number of samples required to produce sample statistics that are considered adequate. Table 4 shows the number of samples required for various convergence criteria of the first four statistical moments for both RSS and SRS. It is noted that the mean value converges extremely rapidly, with a cutoff threshold of 0.1% requiring only 300 samples for RSS compared to nearly 85,000 for SRS. Furthermore, a reasonably small error can be attained in the standard deviation with only hundreds of samples. Skewness and kurtosis, meanwhile, require thousands of samples to converge even using RSS. Nonetheless, the improvement in convergence over SRS is an order of magnitude or better.

These data are encouraging when considering the prospects for an uncertainty quantification study of the large-scale multi-physics problem, particularly if the study is interested in evaluating low-order statistics (mean and standard deviation) of various response quantities. As indicated by the plots in Fig. 14, the method converges very rapidly for these statistics even when the response quantity of interest (here, peak velocity of mass M2) deviates appreciably from Gaussian. As a result, this large-scale multi-physics study was performed, but details of this study are beyond the scope of this paper.

7. Conclusions

In this work, we promote an adaptive approach to Monte Carlo-based UQ rooted in stratified sampling. To motivate the advantages of SS for certain classes of problems, the space-filling, orthogonality, and projective properties of SS are studied and compared with simple random sampling (SRS) and Latin hypercube sampling (LHS). It is shown that SS provides superior properties for many low-to-moderate dimensional problems – especially when strong variable interactions occur. To enable the adaptive approach, a new sample size extension methodology – Refined Stratified Sampling (RSS) – is proposed that adds samples sequentially by dividing the existing strata. RSS is proven to reduce variance compared to existing sample size extension methods for SS that add samples to existing strata, and is shown to possess comparable or improved convergence when compared to existing extension methods for LHS while affording maximal flexibility. Using RSS, samples can be added one at a time, whereas sample size grows exponentially using hierarchical LHS. Optimality of the stratum refinement is examined and its extension to high dimensional applications discussed. Several future research opportunities along these lines are identified that may facilitate significant breakthroughs in sample-based UQ if achieved. In particular, the potential ability to adaptively identify a partitioning (stratification) of the input space that optimizes the partitioning of the output space could serve to minimize sample size dramatically compared to the current state-of-the-art. Additionally, the RSS method, when combined with new methods for decomposing the sample space into low-dimensional subspaces, will allow its extension to very high dimensional problems.

Two examples are presented. The first involved the forward propagation of uncertainty through a low-dimensional stochastic function where convergence can be explicitly evaluated at each sample size extension. The second involved a real multi-physics computational model possessing uncertain parameters for the response of a floating structure subject to an underwater shock. In the second example, a new bootstrap procedure was proposed for resampling from a stratified design to compute confidence intervals for response statistics.
Table 4
Number of samples required to achieve various levels of convergence. Dashes (–) indicate that this level of convergence was not obtained in the number of samples performed (4000 for RSS and 90,000 for SRS).

Statistic of V2    15%       5%        2.5%      1%       0.5%     0.1%

μ
SRS                <20       40        140       840      3480     84,900
RSS                <20       <20       <20       60       140      300

σ
SRS                460       4500      18,800    –        –        –
RSS                120       280       460       1760     3760     –

γ1
SRS                10,750    73,350    –         –        –        –
RSS                1760      2240      –         –        –        –

γ2
SRS                69,700    –         –         –        –        –
RSS                2920      –         –         –        –        –
Table 5
Number of samples required to achieve different proportions of convergence from 1000 sample sets using SRS, HLHS, and RSS.

Dist.    10%     25%     50%      75%      90%       95%

SRS
A        72      156     636      2663     6878      12,880
B        74      199     710      2713     7669      14,023
C        73      189     687      2891     8969      15,710
D        81      217     788      3086     9453      17,503
E        93      237     901      3693     11,654    20,151
F        101     276     993      4080     15,513    29,009
G        125     371     1777     6938     21,263    36,146
H        134     496     2301     10,919   36,507    71,982
I        185     619     4329     24,708   84,181    165,086
J        199     1004    7136     32,866   115,060   236,090

HLHS
A        40      80      160      640      1280      2560
B        40      160     320      1280     2560      5120
C        80      160     320      1280     2560      5120
D        40      160     640      1280     2560      5120
E        80      160     640      1280     2560      5120
F        160     640     1280     2560     10,240    10,240
G        320     1280    2560     5120     10,240    20,480
H        640     2560    5120     10,240   40,960    40,960
I        640     2560    5120     20,480   40,960    81,920
J        1280    5120    10,240   40,960   81,920    163,840

RSS
A        42      73      144      322      505       942
B        48      84      177      358      670       933
C        51      88      196      387      666       954
D        53      93      205      392      710       1130
E        59      113     236      459      881       1339
F        64      130     278      676      1382      1897
G        92      219     560      1621     3643      5975
H        132     368     1056     3842     14,628    32,475
I        165     463     1861     7219     29,408    65,372
J        228     775     3211     13,336   57,076    125,842
Appendix A

Example 1 analyzes the number of simulations required to achieve convergence for the 3-dimensional stochastic function given in Eq. (25) with 10 different sets of input distributions using SRS, HLHS, and RSS. Table 5 provides the results of this study for all three sample size extension methods and each set of input distributions defined in Table 3. The columns show the number of samples required to achieve different proportions of converged calculations. For example, 10% of the 1000 (or 100) calculation sets converged within 72, 40, and 42 samples for Distribution A using SRS, HLHS, and RSS respectively. In general, the RSS method shows a drastic reduction (often greater than an order of magnitude) in the number of samples needed for convergence when compared to SRS, and even affords significant savings when compared to HLHS. In particular, it is apparent that oversampling is necessary with HLHS given its exponentially increasing sample size.
References

[1] Janssen H. Monte Carlo based uncertainty analysis: sampling efficiency and sampling convergence. Reliab Eng Syst Saf 2013;109:123–32.
[2] Tocher K. The art of simulation. London, UK: The English Universities Press; 1963.
[3] Helton J, Davis F. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Saf 2003;81:23–69.
[4] McKay M, Beckman R, Conover W. A comparison of three methods of selecting values of input variables in the analysis of output from a computer code. Technometrics 1979;21(2):239–45.
[5] Olsson A, Sandberg G, Dahlblom O. On Latin hypercube sampling for structural reliability analysis. Struct Saf 2003;25:47–68.
[6] Wang G. Adaptive response surface method using inherited Latin hypercube design points. Trans ASME: J Mech Des 2003;125:210–20.
[7] Stein M. Large sample properties of simulations using Latin hypercube sampling. Technometrics 1987;29(2):143–51.
[8] Owen A. A central limit theorem for Latin hypercube sampling. J R Stat Soc Ser B 1992;54(2):541–51.
[9] Huntington D, Lyrintzis C. Improvements to and limitations of Latin hypercube sampling. Probab Eng Mech 1998;13(4):245–53.
[10] Fang K-T, Ma C-X. Wrap-around L2-discrepancy of random sampling, Latin hypercube, and uniform designs. J Complex 2001;17:608–24.
[11] Fang K-T, Ma C-X, Winker P. Centered L2-discrepancy of random sampling and Latin hypercube design, and construction of uniform designs. Math Comput 2002;71:275–96.
[12] Glasserman P. Monte Carlo methods in financial engineering. New York, NY: Springer; 2003.
[13] Rubinstein R, Kroese D. Simulation and the Monte Carlo method. New York, NY: John Wiley and Sons; 2008.
[14] Kalos M, Whitlock P. Monte Carlo methods. Weinheim, Germany: Wiley; 2008.
[15] Gilman M. A brief survey of stopping rules for Monte Carlo. In: Proceedings of the second conference on applications of simulations, New York, NY, USA; 1968.
[16] Tong C. Refinement strategies for stratified sampling methods. Reliab Eng Syst Saf 2006;91:1257–65.
[17] Sallaberry C, Helton J, Hora S. Extension of Latin hypercube samples with correlated variables. Reliab Eng Syst Saf 2008;93:1047–59.
[18] Vorechovsky M. Hierarchical refinement of Latin hypercube samples. Comput Aided Civil Infrastruct Eng 2015;30:394–411.
[19] Iman R. Statistical methods for including uncertainties associated with the geologic isolation of radioactive waste which allow for comparison with licensing criteria. In: Proceedings of the symposium on uncertainties associated with the regulation of the geologic disposal of high-level radioactive waste, Gatlinburg, TN, USA; 1981.
[20] McKay M. Evaluating prediction uncertainty. Technical report NUREG/CR-6311, Los Alamos National Laboratory; 1995.
[21] Qian P. Nested Latin hypercube designs. Biometrika 2009;96(4):957–70.
[22] Xiong F, Xiong Y, Chen W, Yang S. Optimizing Latin hypercube design for sequential sampling of computer experiments. Eng Optim 2009;41:793–810.
[23] Rennen G, Husslage B, van Dam E, Hertog D. Nested maximin Latin hypercube designs. Struct Multidiscip Optim 2010;41:371–95.
[24] Lepage G. A new algorithm for adaptive multidimensional integration. J Comput Phys 1978;27:192–203.
[25] Press W, Farrar G. Recursive stratified sampling for multidimensional Monte Carlo integration. Comput Phys 1990;4:190–5.
[26] Shields M, Sundar V. Targeted random sampling: a new approach for efficient reliability estimation of complex systems. Int J Reliab Saf.
[27] Johnson M, Moor L, Ylvisaker D. Minimax and maximin distance designs. J Stat Plan Inference 1990;26:131–48.
[28] Morris M, Mitchell T. Exploratory designs for computational experiments. J Stat Plan Inference 1995;43:381–402.
[29] Ye K, Li W, Sudjianto A. Algorithmic construction of optimal symmetric Latin hypercube designs. J Stat Plan Inference 2000;90:145–59.
[30] Joseph V, Hung Y. Orthogonal-maximin Latin hypercube designs. Stat Sin 2008;18:171–86.
[31] Liefvendahl M, Stocki R. A study on algorithms for optimization of Latin hypercubes. J Stat Plan Inference 2006;136(9):3231–47.
[32] Park J. Optimal Latin-hypercube designs for computer experiments. J Stat Plan Inference 1994;39:95–111.
[33] Iman R, Conover W. A distribution-free approach to inducing rank correlation among input variables. Commun Stat: Simul Comput 1982;11(3):311–34.
[34] Florian A. An efficient sampling scheme: updated Latin hypercube sampling. Probab Eng Mech 1992;7:123–30.
[35] Vorechovsky M, Novak D. Correlation control in small-sample Monte Carlo type simulations I: a simulated annealing approach. Probab Eng Mech 2009;24:452–62.
[36] Tang B. Orthogonal array-based Latin hypercubes. J Am Stat Assoc 1993;88:1392–7.
[37] Ye K. Orthogonal column Latin hypercubes and their application in computer experiments. J Am Stat Assoc 1998;93:1430–9.
[38] Cioppa T, Lucas T. Efficient nearly orthogonal and space-filling Latin hypercubes. Technometrics 2007;49(1):45–55.
[39] Fang K-T, Li R, Sudjianto A. Design and modeling for computer experiments. Boca Raton, FL: Chapman and Hall/CRC; 2006.
[40] Dalbey K, Karystinos G. Fast generation of space-filling Latin hypercube sample designs. In: Proceedings of the 13th AIAA/ISSMO multidisciplinary analysis and optimization conference; 2010.
[41] Hickernell F. Lattice rules: how well do they measure up? In: Random and quasi-random point sets. New York, NY: Springer; 1998.
[42] Aurenhammer F. Voronoi diagrams—a survey of a fundamental geometric data structure. ACM Comput Surv 1991;23(3):345–405.
[43] Wu C, Hamada M. Experiments: planning, analysis, and parameter design optimization. New York, NY: Wiley; 2000.
[44] Saltelli A, Ratto M, Andres T, Campolongo F, Cariboni J, Gatelli D, et al. Global sensitivity analysis: the primer. Wiley; 2008.
[45] Grigoriu M. Reduced order models for random functions: application to stochastic problems. Appl Math Model 2009;33:161–75.
[46] Roy C, Oberkampf W. A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing. Comput Methods Appl Mech Eng 2011;200:2131–44.
[47] Teferra K, Shields M, Hapij A, Daddazio R. Mapping model validation metrics to subject matter expert scores for model adequacy assessment. Reliab Eng Syst Saf 2014;132:9–19.
[48] Bleich H, Sandler I. Interaction between structures and bilinear fluids. Int J Solids Struct 1970;6:617–39.
[49] DiMaggio F, Sandler I, Rubin D. Uncoupling approximations in fluid–structure interaction problems with cavitation. J Appl Mech 1981;48:753–6.
[50] Sprague M, Geers T. Computational treatments of cavitation effects in near-free-surface underwater shock. Shock Vib 2001;7:105–22.
[51] Sprague M, Geers T. Spectral elements and field separation for an acoustic field subject to cavitation. J Comput Phys 2003;184:149–62.
[52] Stultz K, Daddazio R. The applicability of fluid–structure interaction approaches to the analysis of floating targets subjected to undex loading. In: Proceedings of the 73rd shock and vibration symposium, Newport, RI; 2002.
[53] Efron B. Bootstrap methods: another look at the jackknife. Ann Stat 1979;7:1–26.