TEACHING AID

Ricardo A. Olea
A six-step practical approach to semivariogram modeling
Published online: 9 May 2006
Springer-Verlag 2006
Abstract Geostatistical prediction and simulation are being increasingly used in the earth sciences and engineering to address the imperfect knowledge of attributes that fluctuate over large areas or volumes, such as pollutant concentration, electromagnetic fields, porosity, or thickness of a geological formation. Central to the application of such techniques is the need to know the spatial continuity, knowledge that is commonly condensed in the form of covariance or semivariogram models. Their preparation is subdivided here into the following steps: (1) Data editing, (2) Exploratory data analysis, (3) Semivariogram estimation, (4) Directional investigation, (5) Simple modeling, (6) Nested modeling. I illustrate these stages practically with a real data set from a geophysical survey from Elk County, Kansas, USA. The applicability of the approach is not limited by the physical nature of the attribute of interest.
Keywords Geostatistics · Continuity · Uncertainty · Semivariogram · Model fitting
1 Introduction
Shortage of information is common in the modeling and
evaluation of natural resources. Geostatistics has been
developed in recent decades to assist in the process and
to quantify the uncertainty associated with such imperfect
knowledge (e.g. Goovaerts 1997; Chilès and Delfiner 1999;
Olea 1999).
A fundamental difference between geostatistics and
classical statistics is the assumption by geostatistics of the
existence of spatial autocorrelation, which matches the
commonly held notion that in the vicinity of small values
there are other small values, while large values tend to be
close to other large values. The assessment of such
autocorrelation is a prerequisite in most applications of
geostatistics, which is generally done by modeling the
semivariogram based on the information provided by a
sampling of the attribute of interest. Equivalent results
can be obtained employing covariances instead, which
can be readily derived from semivariograms.
Despite numerous publications on the subject, modeling a semivariogram remains to the uninitiated the most difficult and intriguing aspect in the application of geostatistics. What follows is a hands-on approach intended to teach by example, by breaking the task of modeling into six sequential steps. Proper execution of the task requires computer software for calculations and graphical display, for which I provide references. The use of a two-dimensional gravimetrical survey here is merely pedagogical. The basic steps are the same, regardless of the physical nature of the attribute and the dimensionality of the sampling. The focus is on handling typical peculiarities of spatial continuity as revealed by a proper sampling, rather than on complications derived from insufficient data, blunders in their preparation, or a combination of the two.
2 Step 1: Data editing
To remain as focused as possible, I will start by
assuming that there is a sampling already available. This
will avoid going into sampling design, a major step
worthy of a separate paper.
Typically, a geostatistical sampling comprises one
record per measurement. Each record gives the observed
value and its spatiotemporal location, which may be in
one, two, or three spatial dimensions. The most frequent
case, and the one to which I will devote my attention, is the atemporal, two-dimensional case, in which
location comprises a couple of geographic coordinates.
Where locations are given as legal descriptions or as
latitude and longitude, it is necessary to
convert them to Cartesian coordinates using programs
such as the one prepared by Collins (1999).
R. A. Olea
Institut für Ostseeforschung Warnemünde,
18119 Rostock, Germany
E-mail: ricardo@oleageostats.com
Stoch Environ Res Risk Assess (2006) 20: 307-318
DOI 10.1007/s00477-005-0026-1
Where every observation is some type of average over a line, area, or volume (porosity, chemical concentration, ore grade, crop yield), all measurements must have the same type of underlying line, area, or volume in terms of shape, size and orientation. Observations having significant variations in the underlying line, area, or volume should form different sub-samplings that must be treated separately.
The first task is to review every record to eliminate any possible reading, recording, or processing errors, either in the locations or in the measurement of the attribute. Human errors are more common than most people imagine. Final results will be particularly sensitive to anomalous fluctuations; one needs to scrutinize them as much as possible. The sampling should retain only those observations denoting real, natural fluctuations, within instrumental precision, of course. Failure to eliminate sampling blunders can sometimes be detected later in the modeling. Then it is necessary to repeat the work after taking corrective action. Undetected blunders usually degrade the representativity of the results.
A posting of values or preliminary mapping, such as
the one in Fig. 1, is helpful to gain familiarity with the
data and detect dubious sampling sites. The sampling in
Fig. 1 is a medium-size survey with 668 measurements
and will be used extensively to illustrate the methodol-
ogy. The sampling is part of a larger geophysical study
comprising more than 27,000 stations that cover all of
the state of Kansas (Lam 1987; Xia et al. 1992). The
survey was done to detect Bouguer gravity anomalies in
the gravitational force relative to the standard geoid,
which in this case mainly indicate depth to bedrock.
Values of Bouguer anomaly are gravity averages over
the same vertically elongated volumes, with cross-sec-
tions that are regarded as points relative to the size of
the sampling area. The coordinates in this case are those
of the instrument on the surface of the earth.
The reader is encouraged to download and work with the sampling available at URL http://www.kgs.ku.edu/Mathgeo/Books/Elk/index.html. As rendered in Fig. 1, the survey is free of errors, ready for reliable use, within a combined measurement and processing precision of 0.1 mgal.
3 Step 2: Exploratory data analysis
Before calculating the semivariogram, the user needs to
examine both the spatial distribution of sampling sites
and the cumulative distribution of the measurements to
assess any need to modify the original data.
First, for the proper modeling of a semivariogram, the sampling should not have preferential areas, as is the case when some measurements concentrate in clusters with a much greater sampling density than the rest of the sampling area. If that is the situation, the sampling needs preprocessing to eliminate the influence of clusters. One way to do this is by assigning weights (Isaaks and Srivastava 1989, pp. 241-247), which can be done by the program declus (Deutsch and Journel 1998, pp. 213-214). The Elk County gravimetric survey was designed to take one measurement at every intersection of the almost perfectly regular network of roads every mile in the east-west and north-south directions. Hence the demonstration sampling is as free of clustering as any sampling can be.
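As a rough illustration of weighting by clustering, the following Python sketch assigns each site a weight inversely proportional to the number of sites sharing its grid cell. This is a minimal cell-based scheme written for this note, not the declus program; the function name, the cell size, and the coordinates are made-up assumptions.

```python
from collections import Counter

def cell_declustering_weights(coords, cell_size):
    """Weight each site by 1 / (sites in its grid cell), rescaled to sum to n."""
    # Identify the square cell that contains each (easting, northing) pair.
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in coords]
    counts = Counter(cells)
    raw = [1.0 / counts[c] for c in cells]
    scale = len(coords) / sum(raw)
    return [w * scale for w in raw]

# Hypothetical sites: three clustered near the origin and one isolated.
sites = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (5.5, 5.5)]
weights = cell_declustering_weights(sites, cell_size=1.0)
```

In this toy case the isolated site receives three times the weight of each clustered site. In practice the result depends on the cell size, which usually has to be explored.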
The second decision relates to the requirement or convenience of transforming the data to increase its univariate normality, namely the ability of the measurements to approximate a normal distribution. The most common practice is to convert the data to normal scores. The transformed data will have a normal distribution with a mean of zero and a variance of one, a transformation that is also known as a Gaussian anamorphosis. Often the transformation is optional and makes sense solely in the case of clear deviation from normality. Some applications, however, such as sequential Gaussian simulation, require normal scores, and so the transformation is mandatory. For more details, see Verly (1986). If the reader needs to make a normal score transformation, then see for example Deutsch and Journel (1998, pp. 223-226).
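A minimal sketch of a rank-based normal score transformation in Python follows. It is an illustration under stated assumptions, not the program in the cited reference: the plotting position (rank + 0.5)/n and the handling of ties by input order are choices made here, and the sample values are made up.

```python
from statistics import NormalDist

def normal_scores(values):
    """Replace each value by the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices, smallest first
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores

sample = [3.2, 1.5, 8.9, 4.4, 2.7]  # made-up values, not the Elk data
scores = normal_scores(sample)
```

The transformation preserves the ranking of the data, so the smallest measurement receives the most negative score and the median maps to zero.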
Figure 2 shows the cumulative univariate distribution for the Elk data on a normal probability scale. In this instance, the maximum deviation from normality occurs for 56 mgal and is 5.7 percentage points. As a practical rule, there are no clear advantages to working with normal scores unless the deviation from normality is above 10 percentage points. For cases such as the Bouguer values from Elk County, one should avoid a normal score transformation. Most statistical libraries have programs to calculate maximum deviation between cumulative distributions, such as the one in Press et al. (1992, pp. 617-619). Given the nonlinearity of the probability scale, it is safer to run this simple calculation than trying to read the maximum discrepancy from a graphical comparison of the distributions like the one in Fig. 2.
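The maximum deviation can be computed in a few lines. The Python sketch below compares the empirical cumulative distribution with a normal distribution having the sample mean and standard deviation; it is not the routine in the cited reference, and the function name is an assumption made here.

```python
from statistics import NormalDist, mean, stdev

def max_deviation_from_normal(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    worst = 0.0
    for k, x in enumerate(xs):
        f = fitted.cdf(x)
        # The empirical CDF steps from k/n to (k+1)/n at x; check both sides.
        worst = max(worst, abs(f - k / n), abs((k + 1) / n - f))
    return worst

# Synthetic check values: evenly spaced normal quantiles (close to normal).
near_normal = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
deviation_points = 100.0 * max_deviation_from_normal(near_normal)
```

Multiplying the result by 100 expresses it in percentage points, so that it can be compared with the 10 percentage point rule of thumb.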
4 Step 3: Semivariogram estimation
At this stage it seems opportune to define what the semivariogram is. Given two sites $\mathbf{h}$ units apart and the difference for a variable of interest at those sites, the semivariogram, $\gamma(\mathbf{h})$, is half the variance of this difference. The semivariogram has the property of measuring the degree of dissimilarity between pairs of measurements in terms of how far apart they are and the orientation of the line between those two sampling sites.
Fig. 1 Location of Elk County in the state of Kansas, United States (star), posting of observation stations (+), and contour map of Bouguer anomaly in Elk County. Lines close to the margins are the Elk County boundary lines. Distances along the axes are in kilometers and contour interval is 1 mgal

Fig. 2 Cumulative distribution for the Bouguer gravity anomaly, Elk County, Kansas, denoted by the dots, and a normal distribution with the same mean and variance, given by the straight line. The sample mean is 61.3 mgal, the standard deviation 4.8 mgal, and the maximum discrepancy with a normal distribution with these same parameters is 5.7 percentage points

Statistics and geostatistics are sciences of the unknown. Therefore, it follows that the true semivariogram is never known, and as in statistics generally, all that is customarily known is an estimate of the semivariogram. Although there are several semivariogram estimators, the predominant practice is to use the following unbiased estimator:

$$
\hat{\gamma}(\mathbf{h}) = \frac{1}{2\,n(\mathbf{h})} \sum_{i=1}^{n(\mathbf{h})} \left[ z(\mathbf{x}_i + \mathbf{h}) - z(\mathbf{x}_i) \right]^2, \qquad (1)
$$

where $z(\mathbf{x}_i)$ is a measurement taken at location $\mathbf{x}_i$ and $n(\mathbf{h})$ is the number of pairs $\mathbf{h}$ units apart in the direction of the vector. In the geostatistical jargon, $\hat{\gamma}(\mathbf{h})$ is known as the experimental semivariogram and $\mathbf{h}$ is the lag. An important assumption for the validity of this estimator is the absence of any systematic variations, that is to say, there should be no trend. Note that the estimator is conveniently independent from the individual sites $\mathbf{x}_i$.
The lag is written in bold to denote that the argument simultaneously has a magnitude defined by a distance and an orientation, which in two dimensions is any of the points in the compass. Location is also in bold to denote that it is a vector of coordinates, as many as the dimensions in the sampling space. In two dimensions, they are commonly denoted as easting and northing.
Given an orientation, the estimator in (1) is strictly applicable to a sampling at regular intervals, $d$. $\hat{\gamma}(0)$ is always zero, as the difference of any measurement with itself is zero. $\hat{\gamma}(d)$ is half the mean square of all differences of a measurement with its immediate neighbor, $\hat{\gamma}(2d)$ is half the mean square of all differences of a measurement with the neighbor that results from skipping one measurement, and so on. For a numerical example, see, e.g., Olea (1999, p. 74).
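For a line sampled at a constant interval, estimator (1) reduces to a few lines of code. The Python sketch below is only an illustration; the function name and the five values are made up.

```python
def semivariogram_regular(z, max_steps):
    """Estimator (1) along a line sampled at a constant interval d.

    Returns (k, number of pairs, estimate) for lags k*d, k = 1..max_steps.
    """
    results = []
    for k in range(1, max_steps + 1):
        # Pair each measurement with the one k sampling intervals ahead.
        diffs = [z[i + k] - z[i] for i in range(len(z) - k)]
        if not diffs:
            break
        gamma_hat = sum(d * d for d in diffs) / (2 * len(diffs))
        results.append((k, len(diffs), gamma_hat))
    return results

profile = [1.0, 3.0, 2.0, 5.0, 4.0]  # made-up measurements
table = semivariogram_regular(profile, max_steps=2)
```

The number of pairs decreases by one with every additional step, which is one reason why estimates at long lags are less reliable than those at short lags.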
In irregular sampling schemes, for the purpose of establishing pairs, $z(\mathbf{x}_i + \mathbf{h})$ is regarded as a centroid of a distance class. Figure 3 illustrates the situation of a two-dimensional irregular sampling, in which any measurement inside the shaded area is considered in the calculation of $\hat{\gamma}(\mathbf{h})$, although it is not exactly a distance $\mathbf{h}$ from $\mathbf{x}_i$. The lag spacing is typically taken close to the value of the average sampling distance; the lag tolerance, $t_h$, is set to half this spacing; the lateral tolerance, $t_b$, 0.5 to 2 times the lag spacing; and the angular tolerance, $\delta$, 0.25 to 0.5 times the increment in the azimuth. The increment in the azimuth is customarily between 22.5° and 45°. Clearly there is a trade-off between the resolution of small distance classes with few observations per class, and the reliability and smoothing of large distance classes. Options become more numerous as the sampling size increases. In practice, the pairing of measurements and calculation of half-square averages is better done with the assistance of computer programs, such as VARIOWIN (Pannatier 1996) or gamv (Deutsch and Journel 1998, pp. 53-55).
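The pairing by distance classes can be sketched in Python as follows. This is a simplified illustration of the tolerances described above, not the algorithm of VARIOWIN or gamv; the function name, the argument names, and the convention of measuring azimuth clockwise from north are assumptions made here.

```python
import math

def directional_semivariogram(points, azimuth_deg, lag, n_lags,
                              lag_tol=None, ang_tol_deg=22.5, bandwidth=None):
    """Experimental semivariogram for (easting, northing, value) records."""
    lag_tol = lag / 2 if lag_tol is None else lag_tol
    az = math.radians(azimuth_deg)
    ux, uy = math.sin(az), math.cos(az)  # unit vector along the azimuth
    cos_tol = math.cos(math.radians(ang_tol_deg))
    sums = [0.0] * (n_lags + 1)
    counts = [0] * (n_lags + 1)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            dx, dy = xj - xi, yj - yi
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue
            # Pairs are unordered, so a direction and its opposite are equivalent.
            along = abs(dx * ux + dy * uy)
            if along / dist < cos_tol:
                continue  # outside the angular tolerance
            across = math.sqrt(max(dist * dist - along * along, 0.0))
            if bandwidth is not None and across > bandwidth:
                continue  # outside the lateral tolerance
            k = round(dist / lag)  # nearest distance class
            if 1 <= k <= n_lags and abs(dist - k * lag) <= lag_tol:
                sums[k] += (zj - zi) ** 2
                counts[k] += 1
    return [(k * lag, counts[k], sums[k] / (2 * counts[k]))
            for k in range(1, n_lags + 1) if counts[k] > 0]

# Made-up records on an east-west line, one unit apart.
pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (2.0, 0.0, 2.0),
       (3.0, 0.0, 5.0), (4.0, 0.0, 4.0)]
east_west = directional_semivariogram(pts, 90.0, 1.0, 2)
```

Each returned tuple holds the lag, the number of pairs, and the estimate, so the pair counts can be checked against the rules of thumb on minimum number of pairs before any value is trusted.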
Figure 4 is an example of a typical experimental semivariogram, which in this case shows the results of processing the Elk data with gamv using an incremental lag of 1.6 km (1 mi), an angular tolerance of 10° and a lateral tolerance of 1 km. The degree of dissimilarity provided by the semivariogram often shows a bounded increase. The distance at which the semivariogram reaches the limiting value is called the range and the bound is denoted as the sill. In the case of experimental semivariograms, the range and the sill are hard to define accurately because of irregularity in the fluctuations.
Fig. 3 Definition of the distance class in the estimation of a semivariogram. The shaded area is defined by an angular tolerance $\delta$, a lag tolerance $t_h$, and a lateral tolerance $t_b$, relative to the point here appearing in the upper left corner. Any measurement inside the shaded area can be used in the calculation of $\hat{\gamma}(\mathbf{h})$ despite not being exactly $\mathbf{h}$ units apart from $\mathbf{x}_i$

Fig. 4 Experimental semivariogram for demonstration sampling along N63°E showing a range of about 21.3 km and a sill of about 1.07 mgal²

Let me conclude this section with some remarks about the significance of the point estimates comprising the experimental semivariogram. The autocorrelation assumption at the core of geostatistics is both a blessing and a problem. It is an advantage in that it allows for better characterizations than would be possible without spatial continuity. Autocorrelation, however, introduces enough theoretical complications that it has been impossible to develop any kind of test of significance in the style of classical statistics. So, for example, the question of the minimum number of pairs in $\hat{\gamma}(\mathbf{h})$ required to provide a reliable estimation can be addressed solely by the following couple of recommendations:
1. The minimum number of pairs in semivariogram estimation must be 30, according to Journel and Huijbregts (1978, p. 194), and 50 if one is going to follow the advice of Chilès and Delfiner (1999, p. 38). Please see Webster and Oliver (1992) for a discussion.
2. If it is necessary to estimate the semivariogram for a large lag close to the diameter of the sampling area, then the pairs of observations that are that far apart are only those located at opposite extremes of the sampling area, thus excluding the central points from the analysis. Hence the justification for a second practical rule: the lag of the experimental semivariogram should be limited to half the extreme distance in the sampling domain for the direction of interest (Journel and Huijbregts 1978, p. 194).
Indirectly, these two rules collectively make it difficult to properly estimate a semivariogram with fewer than 50 measurements at different locations.
5 Step 4: Directional investigation
In more than one dimension, the semivariogram gener-
ally has directional properties. Hence the user should
not stop at estimating a semivariogram for a single
direction, such as the one in Fig. 4. In general, the data
permitting, the more directions are investigated, the
better. In two dimensions, the bare minimum is the
investigation of three azimuths (Goovaerts 1997, p. 98).
In three dimensions, there is the additional need to run a
sensitivity analysis in the declination.
One might observe at least three basic types of behavior in a directional semivariogram survey. In the simplest situation, there is no significant difference among experimental semivariograms for the different directions tested. In this circumstance, one speaks of isotropy and it is acceptable to average all experimental semivariograms regardless of orientation. The result is an omnidirectional semivariogram, which is usually smoother than the individual directional semivariograms.
If the sampling is trend free and the average size of the anomalies is smaller than the maximum length of the sampling area, then the typical situation is that one obtains semivariograms with different rates of increase for short lags that level off at a common sill that approximates the sampling variance (Barnes 1991), such as in Fig. 5. This is the second basic type of behavior, a semivariogram with what is called a geometric anisotropy, which is a true function of distance and direction.
Figure 6 shows the third type of basic behavior through eight semivariograms for the Elk data set at increments of 22.5°, starting from the east-west direction. Now, instead of observing a sill for every semivariogram, one can see an exponential increase without bound at different incremental rates. Many of the estimated values surpass the value of the variance, which in this instance is 23 mgal². What we have in such situations is a collection of artifacts, not genuine experimental semivariograms. Even a casual inspection of Fig. 1 reveals that there is a systematic increase in the Bouguer gravity from northwest to southeast, which violates the assumption of no trend for the proper use of the estimator in Eq. 1. Hence, the curves in Fig. 6 are neither semivariogram estimates, nor the kind of semivariograms that will be required in a geostatistical characterization of an attribute with a trend. In the presence of trend, the semivariogram required for modeling is that computed on the residuals obtained after removing the trend. Yet to remove the trend, it is necessary to have the semivariogram of the residual. The simplest, yet most effective way out of this conundrum is to find a trend-free direction, namely a direction that on average has a constant mean, and use the semivariogram in that direction as the semivariogram for the residual. The justification for this is based on the fact that the experimental semivariogram depends on differences $z(\mathbf{x}_i + \mathbf{h}) - z(\mathbf{x}_i)$, in which the addition or subtraction of a constant to each term does not change the increment. The trend-free direction is perpendicular to the direction of maximum dip and coincides with the direction of minimal increase in the pseudo-semivariograms of a directional survey. In the case of the Elk County data, this direction is about N67°E.
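The argument can be made concrete with a short worked equation. It rests on an assumption introduced here for illustration only: that the trend is approximately linear over the sampling area. The symbols $m$, $r$, $\beta_0$ and $\boldsymbol{\beta}$ are not part of the original notation. Write the attribute as a trend plus a residual,

$$
z(\mathbf{x}) = m(\mathbf{x}) + r(\mathbf{x}), \qquad m(\mathbf{x}) = \beta_0 + \boldsymbol{\beta}^{\mathsf{T}} \mathbf{x},
$$

so that the increments entering estimator (1) become

$$
z(\mathbf{x}_i + \mathbf{h}) - z(\mathbf{x}_i) = \boldsymbol{\beta}^{\mathsf{T}} \mathbf{h} + \left[ r(\mathbf{x}_i + \mathbf{h}) - r(\mathbf{x}_i) \right].
$$

If the residual increments have zero mean, half the expected squared increment is

$$
\tfrac{1}{2}\, E\left\{ \left[ z(\mathbf{x} + \mathbf{h}) - z(\mathbf{x}) \right]^2 \right\} = \gamma_r(\mathbf{h}) + \tfrac{1}{2} \left( \boldsymbol{\beta}^{\mathsf{T}} \mathbf{h} \right)^2,
$$

where $\gamma_r$ denotes the semivariogram of the residual. For lags perpendicular to the gradient, $\boldsymbol{\beta}^{\mathsf{T}} \mathbf{h} = 0$ and estimator (1) applied to the raw data targets $\gamma_r$ directly. In any other direction the second term grows without bound as the lag increases, which is consistent with the unbounded curves obtained away from the trend-free direction.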
A second directional analysis around the approximately trend-free direction, such as the one in Fig. 7, helps to narrow the solution, which in this case is approximately N63°E. Other more sophisticated approaches not considered here include iterative modeling of the trend and the semivariogram (Chilès and Delfiner 1999, pp. 115-128) and removal of the trend by filtering through the calculation of increments (Chilès and Delfiner 1999, chap. 4).
Fig. 5 Schematic example of geometrically anisotropic semivariogram (horizontal axis: lag; vertical axis: semivariogram; curves for Directions 1 to 4 level off at a common sill, Var(Z))
6 Step 5: Simple modeling
If all that the user wants to obtain are some conclusions
from the inspection of the experimental semivariogram,
step 4 is the end of the process. Yet, more often than not,
the semivariogram is required for kriging estimation or
for some form of stochastic simulation involving kri-
ging. In these situations, modeling of the semivariogram
becomes mandatory.
Fig. 6 Directional semivariogram investigation for the Elk County gravity data

Fig. 7 Directional semivariogram investigation around the trend-free direction

Kriging is the solution to the quadratic minimization problem of finding weights that minimize the estimation error in a mean square sense. Any quadratic minimization problem has a unique, positive solution, provided that the coefficient matrix is not singular, which in the case of the kriging minimization problem introduces the requirement that the semivariogram be negative definite. By a positive solution, it is meant that the objective function, the estimation error, be positive, which is essential to avoid imaginary standard errors. A semivariogram model is any negative definite analytical expression of a shape likely to capture and emulate the style of variation of some experimental semivariogram. By replacing the experimental semivariogram by a negative definite model, the user avoids singular kriging matrices no matter what the combination of arguments. This explanation is given to dispel any notion that the use of semivariogram models is an unnecessary complication introduced solely to make life more miserable. The use of negative definite models is far more efficient than the alternative of testing the non-singularity of every kriging coefficient matrix for each particular set of values derived from direct interpolation of the table of experimental semivariogram values, even though this would be a valid approach.
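To show where the semivariogram model enters the kriging matrices, the Python sketch below sets up and solves a small kriging system written in terms of the semivariogram. Ordinary kriging is used only as an example of the minimization problem; the spherical model parameters, the site coordinates, the values, and all function names are made-up assumptions.

```python
import math

def spherical(h, c=1.0, a=10.0):
    """Spherical model with sill c and range a (made-up parameters)."""
    if h >= a:
        return c
    return c * (1.5 * h / a - 0.5 * (h / a) ** 3)

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def ordinary_kriging(sites, values, target, gamma):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(sites)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Semivariogram values between sites, bordered by the unbiasedness constraint.
    A = [[gamma(dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(dist(s, target)) for s in sites] + [1.0]
    solution = solve(A, b)
    weights, mu = solution[:n], solution[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

corners = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
observed = [1.0, 2.0, 3.0, 4.0]
est, var, w = ordinary_kriging(corners, observed, (1.0, 1.0), spherical)
```

With an admissible model the system has a unique solution and the variance is positive. Feeding the same system with values interpolated from an experimental semivariogram offers no such guarantee, which is the practical motivation for fitting a model.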
Although there is an infinite number of negative definite functions, the basic shape of the semivariogram, rising from zero to reach a limiting value, restricts to a few the negative definite functions that are of interest. Those most commonly employed are defined in Table 1 and displayed in Fig. 8. Parameters C and a conveniently relate directly to the sill and the range. A special case of the negative definite model is the pure nugget model, N, which can be considered a limiting case of some of the other models when the range is infinitesimally small:

$$
N = C_0 \left[ 1 - H(0) \right],
$$

where $H(0)$ is the Heaviside function, which is 1 at lag 0 and 0 otherwise. In this particular case, the constant $C_0$ is not called the sill but the nugget effect, a term derived from the modeling of semivariograms of gold deposits.
The sum of a simple model and a pure nugget effect model is also negative definite. The higher the nugget effect relative to the sill, the poorer the spatial continuity. In the limit, a pure nugget effect semivariogram indicates complete absence of spatial continuity, making the use of geostatistics meaningless as it produces the same results as classical statistics. Such lack of continuity can be real, the result of too large a sampling spacing, or the consequence of numerous blunders in the data. For example, an attribute with a true range in its semivariogram of 100 m will have a pure nugget effect semivariogram when the attribute is sampled at intervals of 1 km. In my experience, this is commonly the case of geochemical data.
Modeling of a semivariogram is the process of replacing the collection of estimated values by the closest negative definite model, namely, the selection of the most adequate model type and the determination of its parameters. This can be done by:
(a) trial and error (Goovaerts 1997, pp. 97-104);
(b) maximum likelihood (Kitanidis 1997, chap. 4); and
(c) weighted least squares (Jian et al. 1996).
Considering that this paper presents a practical approach, I refer the reader to the references for theoretical discussions. For years, I have had the most satisfactory results employing weighted least squares, particularly when the experimental semivariogram follows a typical behavior devoid of anomalous fluctuations, such as the case of the semivariogram in Fig. 4.

Table 1 Most commonly used simple semivariogram models ($0 < a$, $0 < C$)
Power model: $P(h) = a\,h^{b}, \quad 0 < b < 2$
Exponential model: $Ex(h) = C \left( 1 - e^{-3h/a} \right)$
Gaussian model: $G(h) = C \left( 1 - e^{-3 (h/a)^{2}} \right)$
Spherical model: $Sp(h) = \begin{cases} C \left[ \frac{3}{2} \frac{h}{a} - \frac{1}{2} \left( \frac{h}{a} \right)^{3} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases}$
Pentaspherical model: $Pe(h) = \begin{cases} C \left[ \frac{15}{8} \frac{h}{a} - \frac{5}{4} \left( \frac{h}{a} \right)^{3} + \frac{3}{8} \left( \frac{h}{a} \right)^{5} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases}$
Cubic model: $Cu(h) = \begin{cases} C \left[ 7 \left( \frac{h}{a} \right)^{2} - \frac{35}{4} \left( \frac{h}{a} \right)^{3} + \frac{7}{2} \left( \frac{h}{a} \right)^{5} - \frac{3}{4} \left( \frac{h}{a} \right)^{7} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases}$
Sine hole effect model: $S(h) = C \left[ 1 - \frac{\sin \left( \pi h / a \right)}{\pi h / a} \right]$

Fig. 8 Most common negative definite semivariogram models when the two parameters are equal to 1 (horizontal axis: lag, 0 to 2; vertical axis: semivariogram, 0 to 2; curves: power, exponential, Gaussian, spherical, pentaspherical, cubic, sine hole effect)

Fig. 9 Best fits for experimental semivariogram along N63°E employing each one of the models in Table 1 plus a pure nugget model when necessary
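The simple models of Table 1 translate directly into code. The Python sketch below is one possible transcription for scalar, non-negative lags; the function names are choices made here.

```python
import math

def power(h, a, b):
    return a * h ** b  # requires 0 < b < 2

def exponential(h, C, a):
    return C * (1.0 - math.exp(-3.0 * h / a))

def gaussian(h, C, a):
    return C * (1.0 - math.exp(-3.0 * (h / a) ** 2))

def spherical(h, C, a):
    r = h / a
    return C if h >= a else C * (1.5 * r - 0.5 * r ** 3)

def pentaspherical(h, C, a):
    r = h / a
    return C if h >= a else C * (1.875 * r - 1.25 * r ** 3 + 0.375 * r ** 5)

def cubic(h, C, a):
    r = h / a
    if h >= a:
        return C
    return C * (7.0 * r ** 2 - 8.75 * r ** 3 + 3.5 * r ** 5 - 0.75 * r ** 7)

def sine_hole_effect(h, C, a):
    if h == 0:
        return 0.0  # limit of the expression as the lag goes to zero
    u = math.pi * h / a
    return C * (1.0 - math.sin(u) / u)
```

The spherical, pentaspherical, and cubic models reach the sill C exactly at the range a, whereas the exponential and Gaussian models only approach it: the factor 3 in the exponent makes them reach about 95% of C at h = a.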
Regardless of the method used to find the model parameters, it is particularly hard for the novice to pick the best type of model by simple inspection. Hence, the safest approach is to fit all models and then select the model with the best goodness of fit, which is a trivial and instantaneous undertaking when the modeling is not done by trial and error. Figure 9 shows the results for the Elk data using a program described in Jian et al. (1996) that is available from the Internet at http://www.iamg.org/CGEditor/index.htm. Table 2 contains the optimal parameters and the sum of the squares of the weighted differences, $R_m$.
In a weighted least squares sense, the best model is the Gaussian one, closely followed by the cubic model.
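The idea of a weighted least squares fit can be sketched with a simple search in Python. This is not the program of Jian et al. (1996): the grid search over the range, the use of one weight per lag (for example the number of pairs), and the synthetic, noise-free data are all assumptions made for illustration. For each trial range, the nugget and the sill enter the model linearly, so they follow from a pair of weighted normal equations.

```python
import math

def gaussian_shape(h, a):
    """Gaussian model with unit sill."""
    return 1.0 - math.exp(-3.0 * (h / a) ** 2)

def fit_nugget_plus_gaussian(lags, gammas, weights, a_grid):
    """Return (weighted squared error, C0, C, a) for the best trial range."""
    best = None
    for a in a_grid:
        g = [gaussian_shape(h, a) for h in lags]
        sw = sum(weights)
        sg = sum(w * gi for w, gi in zip(weights, g))
        sgg = sum(w * gi * gi for w, gi in zip(weights, g))
        sy = sum(w * y for w, y in zip(weights, gammas))
        sgy = sum(w * gi * y for w, gi, y in zip(weights, g, gammas))
        det = sw * sgg - sg * sg
        if abs(det) < 1e-12:
            continue
        # Solve the 2x2 normal equations for the nugget C0 and the sill C.
        c0 = (sy * sgg - sg * sgy) / det
        c = (sw * sgy - sg * sy) / det
        if c0 < 0.0 or c <= 0.0:
            continue  # sketch only: inadmissible fits are skipped, not refitted
        err = sum(w * (y - c0 - c * gi) ** 2
                  for w, gi, y in zip(weights, g, gammas))
        if best is None or err < best[0]:
            best = (err, c0, c, a)
    return best

# Synthetic experimental values generated from a known model (not the Elk data).
lags = [1.6 * k for k in range(1, 21)]
gammas = [0.04 + 0.99 * gaussian_shape(h, 15.0) for h in lags]
pairs = [600 - 20 * k for k in range(20)]  # made-up number of pairs per lag
fit = fit_nugget_plus_gaussian(lags, gammas, pairs,
                               [10.0 + 0.5 * i for i in range(21)])
```

Repeating the search with each model shape and comparing the resulting errors mirrors the recommendation to fit all models and keep the one with the best goodness of fit.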
In the absence of a trend, if there is anisotropy, one has to model as many directions as dimensions in the sampling space. For two dimensions, one has to model one semivariogram for the direction of maximum range and another one for the direction of the minimum range. Programs making use of anisotropic models generally will expect that:
(a) the two directions are perpendicular;
(b) the type of model is the same; and
(c) both models have the same sill;
which in most cases forces some approximations. Those programs automatically model the range for intermediate directions as the radius of an ellipse in which the axes are the minimum and maximum ranges.
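The range for an intermediate direction follows from the polar equation of an ellipse. The Python sketch below is an illustration, with azimuths measured in degrees clockwise from north as an assumed convention and a function name chosen here; the example values are the first anisotropic pair of ranges in Table 3.

```python
import math

def directional_range(azimuth_deg, a_max, a_min, azimuth_max_deg):
    """Radius of the range ellipse along the given azimuth."""
    # Angle between the requested direction and the direction of maximum range.
    phi = math.radians(azimuth_deg - azimuth_max_deg)
    return (a_max * a_min) / math.hypot(a_min * math.cos(phi),
                                        a_max * math.sin(phi))

along_major = directional_range(63.0, 16.38, 13.40, 63.0)
along_minor = directional_range(153.0, 16.38, 13.40, 63.0)
in_between = directional_range(108.0, 16.38, 13.40, 63.0)
```

The radius equals the maximum range along the major axis, the minimum range along the perpendicular, and an intermediate value elsewhere.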
For cases with trend and modeled only in the trend-free direction, such as the demonstration data along N63°E, if the user wants to investigate the possibility of anisotropy, the investigation may be done indirectly making use of crossvalidation. Crossvalidation is a verification process in which each observation is removed with replacement to produce an estimate at the same site of the removal. Each estimate is then used to calculate a difference with the corresponding censored measurement, thus generating a set of errors that one can use to investigate their sensitivity to the selection of the estimation method and some of its parameters (Olea 1999, Chap. 7). If one employs an estimator involving a semivariogram model, such as the most adequate form of kriging, which in the case of the Elk data may be universal kriging because of the presence of a trend, one can use crossvalidation to study the sensitivity of the errors to changes in the semivariogram. Conclusions derived from the analysis of the errors, however, must be taken cautiously because the errors are not independent.
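The mechanics of crossvalidation can be sketched independently of the estimator. In the Python example below, an inverse distance estimator is used purely as a stand-in so that the block stays short and self-contained; in an actual study the estimator would be a form of kriging with the semivariogram model under test. All names and data are made up.

```python
import math

def inverse_distance_estimate(target, data, power=2.0):
    """Stand-in estimator; a kriging routine would take its place."""
    num = den = 0.0
    for x, y, z in data:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den

def crossvalidate(points, estimator):
    """Leave each observation out in turn and estimate it from the rest."""
    errors = []
    for i, (x, y, z) in enumerate(points):
        others = points[:i] + points[i + 1:]  # the observation is put back afterwards
        errors.append(estimator((x, y), others) - z)
    return errors

def mean_square_error(errors):
    return sum(e * e for e in errors) / len(errors)

# Made-up records: (easting, northing, value).
records = [(0.0, 0.0, 5.0), (1.0, 0.0, 6.0), (0.0, 1.0, 4.0), (1.0, 1.0, 5.5)]
mse = mean_square_error(crossvalidate(records, inverse_distance_estimate))
```

Running the same loop with different semivariogram parameters inside the estimator and comparing the mean square errors is the kind of sensitivity analysis described above.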
Table 3 shows the sensitivity of crossvalidation errors to two sets of anisotropic models, one with a maximum range 10% larger than the range in the best simple model and a minimum range 10% less than the range in the best simple model (16.38, 13.40) and another set with a discrepancy of 20% (17.86, 11.91). Although there is a systematic improvement that suggests a minimum mean square error for a largest range oriented approximately in a north-south direction, for this example the improvement is not large enough to be considered significant or to justify the complications of an anisotropic model.
7 Step 6: Nested modeling
A sum of negative definite semivariograms is also negative definite. The sum of a simple model plus a pure nugget effect model is just a special case. This property of negative definite semivariograms opens infinite possibilities of semivariogram mixing, which in geostatistical jargon is called semivariogram nesting. In practice, the goodness of fit rapidly reaches a saturation point, explaining why one rarely sees nested models involving more than a pure nugget effect model plus two simple models, not necessarily of the same type.
Table 2 Results of fitting simple models by weighted least squares
Model              R_m (mgal⁴)   C_0 (mgal²)   a or C (mgal²)   b or a (km)
Gaussian           0.081         0.038         0.990            14.885
Cubic              0.085         0.033         0.991            20.525
Spherical          0.202         0.000         1.083            25.449
Pentaspherical     0.217         0.000         1.100            31.958
Exponential        0.349         0.000         1.663            72.665
Sine hole effect   0.410         0.038         0.909            15.904
Power              0.565         0.000         0.069            0.857
Table 3 Sensitivity of Elk County Bouguer gravity to anisotropic semivariogram models
Maximum range (km)   Minimum range (km)   Orientation of largest range   Mean square error (mgal²)
14.89                14.89                -                              0.231
16.38                13.40                N63°E                          0.233
                                          N83°E                          0.235
                                          N77°W                          0.235
                                          N57°W                          0.234
                                          N37°W                          0.232
                                          N17°W                          0.230
                                          N3°E                           0.229
                                          N23°E                          0.229
                                          N43°E                          0.231
17.86                11.91                N63°E                          0.236
                                          N83°E                          0.240
                                          N77°W                          0.241
                                          N57°W                          0.238
                                          N37°W                          0.234
                                          N17°W                          0.230
                                          N3°E                           0.227
                                          N23°E                          0.228
                                          N43°E                          0.231
To keep an eye on the parsimony of nested modeling, one can use the Akaike information criterion (AIC) from time series analysis. The AIC is a measure of goodness of fit involving not only the weighted errors, but the number of points used for the fitting, $n$, and the number of parameters, $p$, as well (Tong 1983, p. 135):

$$
\mathrm{AIC} = n \ln \left( \frac{R_m}{n} \right) + 2p
$$

The smaller the Akaike information criterion, the better is the fit. Given an experimental semivariogram, when $n$ and $p$ remain constant, such as in Table 2, the ranking by $R_m$ or AIC is the same.
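The criterion is straightforward to evaluate. In the Python sketch below, the errors are the values of $R_m$ reported for the simple and the double nested Gaussian models, with 3 and 5 parameters respectively. The number of fitted points, n = 20, is not stated in the text; it is an assumption inferred here because it reproduces the magnitude of the tabulated AIC of the simple model.

```python
import math

def aic(r_m, n, p):
    """Akaike information criterion from the weighted squared error r_m."""
    return n * math.log(r_m / n) + 2 * p

N_POINTS = 20  # assumed number of experimental points (not given in the text)

aic_simple = aic(0.081, N_POINTS, 3)  # nugget + one Gaussian model
aic_nested = aic(0.025, N_POINTS, 5)  # nugget + two Gaussian models
```

Both values are negative because $R_m / n$ is much smaller than 1. The nested model has the smaller (more negative) value, so under this criterion the two extra parameters are justified.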
By looking at the best simple model in Fig. 10a, one can see that between a lag of 8 and 20 km there are 8 experimental points below the curve, two of them clearly below, which justifies trying a more complex model to aim for a better fit. Table 4 and Fig. 10b provide the answer: a double nested Gaussian model.

Fig. 10 Best semivariogram models. (a) Simple Gaussian model. (b) Best double nested Gaussian model.

Table 4 Best simple and double nested models
Model             R_m (mgal⁴)   AIC      C_0 (mgal²)   C (mgal²)      a (km)
Simple Gaussian   0.081         -104.2   0.038         0.990          14.885
Nested Gaussian   0.025         -123.4   0.010         0.690, 0.380   22.311, 8.042

The AIC for the nested model is indeed smaller than that for the simple one. Hence, the improvement is worth increasing the number of parameters from 3 to 5. Considering that the crossvalidation does not show any significant evidence of anisotropy for this nested model either, the isotropic model

$$
\gamma(\mathbf{h}) = N(0.01) + G(h;\, 0.69,\, 22.3) + G(h;\, 0.38,\, 8.0)
$$

is the best model for the Bouguer gravity anomaly data from Elk County.
If there is a sill, then the covariance is easily obtained by subtracting the semivariogram from the total sill, which in this case would be:

$$
\mathrm{Cov}(\mathbf{h}) = 1.08 - N(0.01) - G(h;\, 0.69,\, 22.3) - G(h;\, 0.38,\, 8.0)
$$

The main advantage of this indirect way to model the covariance is that it does not require knowledge of the mean.
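The final nested model and the covariance derived from it can be evaluated as in the following Python sketch. The parameters are those of the model above, with lags in kilometers; the function names are choices made here.

```python
import math

def nugget(h, c0):
    """Pure nugget model: zero at lag 0 and c0 at any other lag."""
    return 0.0 if h == 0 else c0

def gaussian(h, c, a):
    return c * (1.0 - math.exp(-3.0 * (h / a) ** 2))

def gamma_nested(h):
    """Nugget of 0.01 plus two Gaussian models (sills 0.69 and 0.38 mgal^2)."""
    return nugget(h, 0.01) + gaussian(h, 0.69, 22.3) + gaussian(h, 0.38, 8.0)

def covariance(h, total_sill=1.08):
    """Covariance as the total sill minus the semivariogram."""
    return total_sill - gamma_nested(h)

values = [(h, gamma_nested(h), covariance(h)) for h in (0.0, 5.0, 10.0, 20.0, 40.0)]
```

The covariance starts at the total sill of 1.08 mgal² at lag zero and decays toward zero as the semivariogram approaches its sill.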
8 Concluding remarks
I hope the novice reader, for whom this paper is intended, is now less intimidated and more confident of being able to model a semivariogram or a covariance. Sophistications and variants abound. The six steps described here are by no means the absolute way to go. The ultimate test of understanding for the reader will be to feel confident enough to try her or his own version of these basic steps.
Acknowledgements I am grateful to John H. Doveton and two
anonymous reviewers for critical reading of the manuscript that
resulted in suggestions that improved the presentation.
References
Barnes RJ (1991) The variogram sill and the sample variance. Math Geol 23:673-678
Chilès J-P, Delfiner P (1999) Geostatistics: modeling spatial uncertainty. Wiley, New York, 695 p
Collins DR (1999) User's guide for the LEO system, version 3.9. Kansas Geological Survey Open-File Report 99-48, 13 p
Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user's guide. Oxford University Press, New York, 369 p and 1 compact disk
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York, 483 p
Isaaks EH, Srivastava RM (1989) Introduction to applied geostatistics. Oxford University Press, New York, 561 p
Jian X, Olea RA, Yu Y-S (1996) Semivariogram modeling by weighted least squares. Comput Geosci 22:387-397
Journel AG, Huijbregts CJ (1978) Mining geostatistics. Academic, London, 600 p
Kitanidis PK (1997) Introduction to geostatistics: applications to hydrology. Cambridge University Press, New York, 249 p
Lam C-K (1987) Interpretation of statewide gravity survey of Kansas. Kansas Geological Survey Open-File Report 87-1, 213 p and 6 plates
Olea RA (1999) Geostatistics for engineers and earth scientists. Kluwer, Boston, 303 p
Pannatier Y (1996) VARIOWIN: software for spatial data analysis in 2D. Springer, New York, 91 p
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in Fortran, 2nd edn. Cambridge University Press, New York, 963 p
Tong H (1983) Threshold models in non-linear time series analysis. Springer, New York, 323 p
Verly G (1986) Multigaussian kriging: a complete case study. In: Ramani RV (ed) Proceedings of the 19th APCOM international symposium, Society of Mining Engineers, Littleton, Colorado, pp 283-298
Webster R, Oliver MA (1992) Sample adequately to estimate variograms of soil properties. J Soil Sci 43:177-192
Xia J, Yarger H, Lam C-K, Steeples D, Miller R (1992) Bouguer gravity anomaly map of Kansas. Kansas Geological Survey Map M-31