GEOSTATISTICAL MODELLING
ALEXANDER PLONER*
Institute of Mathematics and Applied Statistics, University of Agricultural Sciences, Gregor Mendel Straße 33,
A-1180 Wien, Austria
SUMMARY
This paper gives an overview of some of the possible applications of the variogram cloud in geostatistical
modelling, mainly in exploratory data analysis (EDA), but also in preliminary parameter quantification
and model validation. Copyright © 1999 John Wiley & Sons, Ltd.
KEY WORDS spatial data analysis; geostatistics; variogram cloud; exploratory data analysis
1. INTRODUCTION
The variogram cloud in its traditional sense, i.e. as a scatter plot of the variogram estimates
versus the distances, is well established in geostatistical practice (e.g. Isaaks and Srivastava 1989,
p. 181). In our work with geochemical datasets we have found a number of generalizations of this
concept, which are useful, but less well known, and which we will present in some detail in the
following sections.
1.1. Definitions
The basic tool for describing the autocorrelation structure of a spatial random process $Z(x)$
ranging over some domain $D \subset \mathbb{R}^n$ is the variogram given by
\[ 2\gamma(x, \tilde{x}) = \operatorname{Var}[Z(x) - Z(\tilde{x})], \tag{1} \]
where $x$, $\tilde{x}$ are locations in $D$; as we will generally have only one observation $z_i$ per sampling
location $x_i$, $i = 1, \ldots, n$, we need the intrinsic hypothesis as an additional assumption, which
basically amounts to
\[ 2\gamma(x, \tilde{x}) = E[(Z(x) - Z(\tilde{x}))^2] = 2\gamma(\tilde{x} - x), \tag{2} \]
implying constant mean of the process Z(x) and invariance to translation of its covariance
structure.
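The step from (1) to (2) can be made explicit. The following short derivation is only a sketch and assumes nothing beyond the constant mean stated above:
\[ \operatorname{Var}[Z(x) - Z(\tilde{x})] = E[(Z(x) - Z(\tilde{x}))^2] - \bigl(E[Z(x)] - E[Z(\tilde{x})]\bigr)^2 = E[(Z(x) - Z(\tilde{x}))^2], \]
because $E[Z(x)] = E[Z(\tilde{x})]$ makes the second term vanish.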
CCC 1180-4009/99/040413-25$17.50
Received 11 May 1998; accepted 10 December 1998
Copyright © 1999 John Wiley & Sons, Ltd.
Environmetrics, 10, 413-437 (1999)
* Correspondence to: A. Ploner, Institute of Mathematics and Applied Statistics, University of Agricultural Sciences,
Gregor Mendel Straße 33, A-1180 Wien, Austria.
The variogram is typically estimated by a method-of-moments estimator,
\[ 2\hat{\gamma}(h) = \frac{1}{|N(h)|} \sum_{N(h)} (Z(x) - Z(\tilde{x}))^2, \tag{3} \]
where summation is over the index set
\[ N(h) = \{ (x, \tilde{x}) \mid x, \tilde{x} \in D,\ \tilde{x} - x = h \}. \tag{4} \]
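As an illustration of (3) and (4), the estimator can be sketched in a few lines of Python. This is a minimal sketch, not the S code used for the paper: the function name `exact_lag_variogram`, the tolerance `tol` and the toy transect are assumptions made here for illustration.

```python
import math

def exact_lag_variogram(coords, values, h, tol=1e-9):
    """Method-of-moments estimate of 2*gamma(h) from pairs with x_tilde - x == h.

    coords: list of d-dimensional tuples (sampling locations)
    values: list of observations z_i, same length as coords
    h:      separation vector (tuple of length d)
    tol:    numerical tolerance for matching h (an assumption of this sketch)
    Returns (estimate, number_of_pairs); the estimate is NaN if no pair matches.
    """
    squared_diffs = []
    for x, zx in zip(coords, values):
        for x_t, zx_t in zip(coords, values):
            sep = [b - a for a, b in zip(x, x_t)]  # x_tilde - x
            if all(abs(s - hk) <= tol for s, hk in zip(sep, h)):
                squared_diffs.append((zx - zx_t) ** 2)
    if not squared_diffs:
        return math.nan, 0
    return sum(squared_diffs) / len(squared_diffs), len(squared_diffs)

# Toy transect with unit spacing (made-up values, not data from the paper)
coords = [(0.0,), (1.0,), (2.0,), (3.0,)]
values = [1.0, 2.0, 4.0, 7.0]
two_gamma_1, n_pairs_1 = exact_lag_variogram(coords, values, (1.0,))
two_gamma_half, n_pairs_half = exact_lag_variogram(coords, values, (0.5,))
```

On the regular toy transect the lag 1 is matched by three pairs, whereas the lag 0.5 is matched by none, which is the typical situation for irregularly spaced sampling locations.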
In practice, when dealing with non-gridded data, we are hardly ever able to use (4), because there
will be only few if any pairs of observations where the sampling locations differ exactly by some
vector $h$. Therefore we have to define classes of distance vectors in order to get tolerably stable
estimates, using the index set
N
Figure 2. Variogram cloud for variable Ni, observation 118 marked
Figure 3. Linked variogram cloud and map for variable Ni, neighbourhood of observation 118 marked
Figure 4. Linked variogram cloud and map for variable Ni, neighbourhood of observation 136 marked
Figure 5. Linked variogram cloud and map for variable Ni, two transects through observation 136 marked
Figure 6. Variogram cloud for variable Th
cloud may help in getting a visual impression of the appropriateness of the assumption of
normality.
Examples: In Figure 7 we see the square-root-cloud for the variable Pb, where all pairs with
observation 139 are marked; obviously the transformation of the differences has pulled this
outlying value closer in, so that it appears more palatable than in Figure 1; the differences are
distributed more evenly, so that we can now see that there seems to be a problem with small
differences: there are disproportionally many equal values all over the place, and the corresponding
zero differences are distinctly apart from the main bulk of the cloud; it appears that there are
too many equal values to uphold the assumption of normality.
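The check described in this example can be sketched as follows. The sketch assumes that the square-root-differences cloud plots $\sqrt{|z_i - z_j|}$ against the distance between the two locations; this definition is not part of the present excerpt, and the function names and toy data are hypothetical.

```python
import math
from itertools import combinations

def sqrt_difference_cloud(coords, values):
    """Return (i, j, distance, sqrt(|z_i - z_j|)) for all pairs i < j."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        dist = math.dist(coords[i], coords[j])
        cloud.append((i, j, dist, math.sqrt(abs(values[i] - values[j]))))
    return cloud

def pairs_with(cloud, k):
    """Select the points of the cloud that involve observation k (for marking)."""
    return [p for p in cloud if k in (p[0], p[1])]

def zero_difference_share(cloud):
    """Fraction of pairs whose observed values are exactly equal."""
    return sum(1 for p in cloud if p[3] == 0.0) / len(cloud)

# Toy data (made up): three equal values and one outlying observation, index 2
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 8.0)]
values = [5.0, 5.0, 9.0, 5.0]
cloud = sqrt_difference_cloud(coords, values)
marked = pairs_with(cloud, 2)
share = zero_difference_share(cloud)
```

A large value of `share` is the numerical counterpart of the band of zero differences that stands apart from the bulk of the cloud.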
3.2. Exploratory parameter assessment (EPA)
We use the term 'exploratory parameter assessment' in analogy to 'exploratory data analysis' as a
euphemism for initial visual parameter estimation.
3.2.1. Assessing anisotropy. In the presence of anisotropy, outlier detection or model fitting
requires the identification of the anisotropy parameters beforehand. In the case of two-dimensional
locations, these are the directions and lengths of the principal axes of the ellipsoid characterizing
the linear transformation $A$. (Actually, the ratio between the lengths of the axes is sufficient.) We
have found different representations of the untransformed variogram cloud to be of varying
interest:
(i) Maybe the most obvious idea is to consider directional variogram clouds, in direct
analogy to modelling anisotropy with empirical variogram functions, i.e. to consider the
distances between points along a transect through $D$, and to compare these sub-clouds
for different directions; this turns out to be not very satisfying for two reasons: on the one
hand, we have to define some kind of tolerance region around any given angle in order to
get a reasonable number of observations per plot, which is basically what we wanted to
avoid, and on the other hand, it is very difficult to get a visual range estimate from a
variogram cloud, which would be necessary in order to find the long and the short axis
among the given directional plots.
(ii) The next obvious idea is probably a conventional 2D-symbol-plot, with the coordinates of
the separation vectors on the axes, and different levels of squared differences coded with
different symbols: if the underlying process $Z(x)$ is isotropic, then the values should
increase at least approximately in concentric circles centered on the origin; if the rate of
increase is markedly different in some directions, we might deal with realizations from an
anisotropic process. In practice, such a plot is not easy to read; even for a modest number
of observations, and using a set of symbols designed to minimize overlap (e.g. Cleveland
1994, p. 146), the plot is very crowded; and even if there are clear axes of anisotropy, it is
hard to read off their directions from this kind of plot.
(iii) The shortcomings described above can be overcome by using polar coordinates for the
symbol-plot: we have done so by marking the actual distances between points on the
horizontal axis of the plot, and the angle on the vertical axis; a circle in the plane is then a
vertical line, therefore the squared differences should be approximately vertically constant,
if $Z(x)$ is isotropic; a horizontal line with markedly lower values corresponds to a direction
in which the squared differences grow more slowly, which may indicate that this is the
direction of the shorter axis of the anisotropy ellipsoid; it is now easy to read off the actual
angle of the axis, and to verify that the direction orthogonal to a potential minor axis
shows fast-growing squared differences and is therefore the corresponding major axis.
Besides, we can reduce the clutter by only considering angles in the range $[0, \pi)$, eliminating
the redundant symmetry induced by considering both $x_i - x_j$ and $x_j - x_i$, which takes up
half the plotting area in a simple symbol-plot. Another advantage of this kind of plot is the
fact that it shows clearly beyond which distance there are only pairs of observations for a
subset of all possible directions. This can lead to spurious correlations in variogram
estimation due to odd border and corner effects, which is why usually only pairs of
observations up to half the maximum distance are included during estimation; by using a
polar symbol-plot of the kind described, this limit can be easily verified and possibly also
increased.
(iv) Once the orientation of the anisotropy ellipsoid has been assessed, we can use simple
projections onto the planes defined by the axes of the ellipsoid and perpendicular to the
plane, in order to verify the correctness of our initial estimate: a projection on the plane
defined by the short axis should show a pronounced central valley in the cloud of projected
differences, whereas a projection on the plane of the long axis should show a uniform wall
of values forming this valley.
After estimating the orientation of the ellipsoid, these plots can be used to assess the anisotropy
factor by experimenting with different values until the transformed data appear to be sufficiently
isotropic.
Figure 7. Square-root-differences cloud for variable Pb, observation 139 marked
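The coordinates behind the polar symbol-plot of (iii) and the projections of (iv) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's S routines: the function names are hypothetical, locations are two-dimensional, and a projection is taken to be the signed component of the separation vector along a direction `theta`, paired with the squared difference.

```python
import math
from itertools import combinations

def polar_variogram_cloud(coords, values):
    """Return (distance, angle, squared_difference) per pair, angle in [0, pi)."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        dx = coords[j][0] - coords[i][0]
        dy = coords[j][1] - coords[i][1]
        # Folding modulo pi identifies x_j - x_i with x_i - x_j
        angle = math.atan2(dy, dx) % math.pi
        if angle >= math.pi:  # guard against rounding for tiny negative angles
            angle = 0.0
        cloud.append((math.hypot(dx, dy), angle, (values[j] - values[i]) ** 2))
    return cloud

def project_cloud(coords, values, theta):
    """Project the symmetric cloud onto the vertical plane through the origin at angle theta."""
    projected = []
    for i, j in combinations(range(len(values)), 2):
        dx = coords[j][0] - coords[i][0]
        dy = coords[j][1] - coords[i][1]
        u = dx * math.cos(theta) + dy * math.sin(theta)
        sq = (values[j] - values[i]) ** 2
        projected.extend([(u, sq), (-u, sq)])  # both orderings of the pair
    return projected

# Toy data (made up, not from the paper)
coords = [(0.0, 0.0), (2.0, 0.0), (1.0, -1.0)]
values = [1.0, 4.0, 2.0]
cloud = polar_variogram_cloud(coords, values)
proj = project_cloud(coords, values, 0.0)
```

Plotting distance against angle, with the squared difference coded by symbol, gives the polar symbol-plot; plotting the pairs returned by `project_cloud` gives one projection per chosen direction.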
Examples: Figure 8 shows a standard symbol-plot for variable Na; even though it gives the
impression that differences are increasing more slowly along a transect approximately angled at $0.75\pi$,
the situation is not very clear when compared with Figure 9, which is the same plot in polar
coordinates: we can see that the angle for the short axis of the anisotropy ellipsoid is actually
closer to $0.8\pi$, and that the increase of the differences is markedly stronger in the orthogonal
direction of $0.3\pi$; besides we can see that it will be a good idea to consider only distances up to
approximately 175,000 m for fitting a variogram model. Figures 10 and 11 show confirmatory
projection plots for $0.3\pi$/$0.8\pi$ as angles of the anisotropy axes.
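Experimenting with the anisotropy factor amounts to recomputing the distances after a linear transformation of the separation vectors. The transformation $A$ itself is defined outside this excerpt, so the following is only a sketch under assumptions: a rotation onto the direction `theta` in which the differences grow more slowly, followed by a rescaling of that component with a factor `ratio`. The angle $0.8\pi$ is taken from the Na example; the factor 0.5 is purely hypothetical.

```python
import math

def transformed_distance(sep, theta, ratio):
    """Distance after rescaling the component of sep along direction theta.

    sep:   separation vector (dx, dy)
    theta: direction of slow growth of the squared differences (assumed known)
    ratio: trial anisotropy factor applied along theta (hypothetical value)
    """
    dx, dy = sep
    u = dx * math.cos(theta) + dy * math.sin(theta)    # component along theta
    v = -dx * math.sin(theta) + dy * math.cos(theta)   # orthogonal component
    return math.hypot(ratio * u, v)

theta = 0.8 * math.pi   # from the Na example
ratio = 0.5             # trial value, not an estimate from the paper

along = (2.0 * math.cos(theta), 2.0 * math.sin(theta))     # length 2 along theta
across = (-2.0 * math.sin(theta), 2.0 * math.cos(theta))   # length 2 orthogonal to theta
d_along = transformed_distance(along, theta, ratio)
d_across = transformed_distance(across, theta, ratio)
```

Redrawing the cloud with `transformed_distance` for several trial values of `ratio` is one way to judge by eye when the transformed data look sufficiently isotropic.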
Figure 8. Symbol-plot of the variogram cloud for variable Na
Figure 9. Symbol-plot of the variogram cloud for variable Na, polar coordinates
Figure 10. Projection of the variogram cloud for variable Na on a vertical plane passing through the origin at $0.3\pi$
Figure 11. Projection of the variogram cloud for variable Na on a vertical plane passing through the origin at $0.8\pi$
3.2.2. Assessing structure. Obviously, scaling the squared differences by some power $\lambda$ of the
distance, i.e. plotting
\[ \left\{ \left( |x_j - x_i|,\ \frac{(z(x_j) - z(x_i))^2}{|x_j - x_i|^{\lambda}} \right) \,\middle|\, i \neq j;\ i, j = 1, \ldots, n \right\}, \tag{8} \]
will be useful when fitting a power model of the form $\gamma(h) = c|h|^{\lambda}$, but it can be useful in
exploring and summarizing the dependency structure even in situations where a global plot of (8)
does not make sense, because the scale will be determined by only a few observations lying closely
together, so that the rest of the plot is compressed into too little space to show any detail. This
can be resolved by defining some kind of cut-off limit for the scaled distances, but it is
more rewarding to divide the whole range of distances into subsets that can then be plotted and
scaled individually. For a simple dependency structure, we may typically find the following
subdivisions:
(i) a small set of pairs markedly closer to each other than the rest of the data, i.e. those that
will compress the scale on a global plot into uselessness; these pairs will be primarily
responsible for the presence and size of a nugget constant in our model, so while finding a
proper power relation between squared differences and distances via $\lambda$ may not be our
main concern here, the identification of this set may be helpful in assessing the nugget
effect.
(ii) the main part of the data, where the actual structure, i.e. the decrease in correlation with
increase of distance, is displayed. Experimentation with different values may yield a $\lambda$ that
describes this relationship adequately.
(iii) the set of pairs with the largest distances (in practice usually the pairs with separation
larger than half the maximum distance) may show the same pattern as the main part of
the data for similar $\lambda$, which generally indicates that a power model can be fitted. More
often, the scaled differences will decrease with distances increasing beyond a certain limit,
even for a $\lambda$ that produced a stable pattern with the rest of the data. A possible explanation
for this is the presence of a sill, so that the squared differences do not increase beyond the
corresponding range; another possibility is a border effect, e.g. when for a rectangular
domain $D$ the observations that are separated by larger distances tend to cluster in the
corners of the rectangle, thereby giving the impression of higher correlations, as
mentioned in 3.2.1. In the latter case, we are more interested in the rough range estimate we
get by this subdivision than in finding a reasonable $\lambda$.
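The scaled cloud (8) and its subdivision into small, medium and large distances can be sketched as follows. The function names and the two thresholds `lower` and `upper` are hypothetical; in practice the limits would be chosen by inspecting the plots, as described above.

```python
import math
from itertools import combinations

def scaled_cloud(coords, values, lam):
    """Return (distance, squared_difference / distance**lam) for all pairs i < j."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        dist = math.dist(coords[i], coords[j])
        if dist == 0.0:
            continue  # coincident locations cannot be scaled; skipped in this sketch
        cloud.append((dist, (values[i] - values[j]) ** 2 / dist ** lam))
    return cloud

def split_by_distance(cloud, lower, upper):
    """Split the cloud into small (< lower), medium, and large (> upper) distances."""
    small = [p for p in cloud if p[0] < lower]
    medium = [p for p in cloud if lower <= p[0] <= upper]
    large = [p for p in cloud if p[0] > upper]
    return small, medium, large

# Toy transect (made-up values); thresholds are arbitrary illustration values
coords = [(0.0,), (1.0,), (4.0,), (9.0,)]
values = [0.0, 1.0, 2.0, 3.0]
cloud = scaled_cloud(coords, values, 1.0)
small, medium, large = split_by_distance(cloud, 2.0, 6.0)
```

Each of the three subsets can then be plotted on its own vertical scale, and `lam` varied until the spread within the subset of interest looks stable.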
Examples: Figure 12 shows the standard plot of the variogram cloud for chromium (Cr); Figure 13
shows that scaling the cloud by the distances to the power of 0.4 stabilizes the spread uniformly
over all distances, suggesting the fit of such a model.
Figure 14 shows the standard plot of the variogram cloud for lanthanum (La); overall scaling
does not work here, the reason being the single pair of observations in the upper left corner of
Figure 15, where we can see the scaled differences in the neighbourhood of the origin. Figure 16
shows that the scaled differences exhibit a constant spread for the power of 0.6 over a fairly wide
range; note the difference in the vertical scales of Figures 15 and 16! Figure 17 finally shows
that a sill seems to be reached at a distance of about 140,000 m, as the scaled differences decrease
strongly for the exponent 0.6, which produced stable spread in Figure 16.
Figure 12. Variogram cloud for variable Cr, unscaled
Figure 13. Variogram cloud for variable Cr, scaled by the distances to the power of 0.4
Figure 14. Variogram cloud for variable La, unscaled
Figure 15. Variogram cloud for variable La, scaled by the distances to the power of 0.6, small distances
Figure 16. Variogram cloud for variable La, scaled by the distances to the power of 0.6, medium distances
Figure 17. Variogram cloud for variable La, scaled by the distances to the power of 0.6, large distances
3.3. Model validation
Once a model $\gamma(h)$ has been fitted to the data, the quality of the fit should be judged, comparing it for
different areas in the domain $D$ of interest. Barry (1996) uses the assumption of normality to
highlight pairs of observations for which (6) is below or above specified critical quantiles of the
$\chi^2_1$-distribution, in order to spot places where the fit of $\gamma(h)$ is uncomfortable. Similarly, we can
plot
\[ \left\{ \left( |x_j - x_i|,\ \frac{(z(x_j) - z(x_i))^2}{2\gamma(x_j - x_i)} \right) \,\middle|\, i \neq j;\ i, j = 1, \ldots, n \right\} \]
linked to a map of observations, as described in 3.1.1; extreme pairs of observations will stand
out more clearly and individually.
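The standardized cloud above, together with quantile-based highlighting in the spirit of Barry (1996), can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: the fitted model is isotropic and passed as a function `gamma` of the distance, the quantile levels are arbitrary defaults, and the $\chi^2_1$ quantile is obtained as the square of a standard normal quantile.

```python
import math
from itertools import combinations
from statistics import NormalDist

def chi2_1_quantile(p):
    """Quantile of the chi-square distribution with 1 df via the normal quantile."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2

def flag_pairs(coords, values, gamma, lower_p=0.025, upper_p=0.975):
    """Return (i, j, distance, ratio) for pairs whose ratio falls outside the quantiles."""
    lo, hi = chi2_1_quantile(lower_p), chi2_1_quantile(upper_p)
    flagged = []
    for i, j in combinations(range(len(values)), 2):
        dist = math.dist(coords[i], coords[j])
        if dist == 0.0:
            continue  # gamma(0) may be zero; coincident pairs are skipped here
        ratio = (values[i] - values[j]) ** 2 / (2.0 * gamma(dist))
        if ratio < lo or ratio > hi:
            flagged.append((i, j, dist, ratio))
    return flagged

# Toy data and a hypothetical linear model gamma(h) = 0.5 * h (not from the paper)
coords = [(0.0,), (1.0,), (2.0,)]
values = [0.0, 0.1, 5.0]
flagged = flag_pairs(coords, values, lambda h: 0.5 * h)
```

The indices returned in `flagged` are what would be highlighted on the linked map of observations.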
4. SUMMARY
We have found the generalized concept of the variogram cloud as the set of all pairwise
differences in location and observation and its graphical representations to be a promising tool in
the initial stages of geostatistical modelling.
As a technical note we would like to add that we were pleasantly surprised by the speed of
computation and display of the plots: even on our elderly workstation, the response time was
quite good for datasets up to approximately 400 observations. For larger datasets, a faster
machine, a high-end monitor, and the use of colour in our routines seem to be advisable.
The S functions used to create the figures in this article will be made available via StatLib.
REFERENCES
Barry, R. P. (1996). 'A diagnostic to assess the fit of a variogram model to spatial data'. Journal of Statistical
Software 1.
Cleveland, W. S. (1994). The Elements of Graphing Data. Summit, New Jersey: Hobart Press.
Cressie, N. A. C. (1991). Statistics for Spatial Data. New York: Wiley & Sons.
Haslett, J., Bradley, R., Craig, P. S., Wills, G. and Unwin, A. R. (1991). 'Dynamic graphics for exploring
spatial data, with application to locating global and local anomalies'. The American Statistician 45,
234-242.
Isaaks, E. H. and Srivastava, R. M. (1989). An Introduction to Applied Geostatistics. New York: Oxford
University Press.
Reimann, C., Äyräs, M., Chekushin, V., Bogatyrev, I., Boyd, R., de Caritat, P., Dutter, R., Finne, T. E.,
Halleraker, J. H., Jæger, Ø., Kashulina, G., Niskavaara, H., Pavlov, V., Räisänen, M. L., Strand, T.,
Volden, T. (1996). 'A geochemical atlas of the central parts of the Barents region'. In The 6th Seminar on
Hydrogeology and Environmental Geochemistry 1996, no. 96.128, 46-47.