An Introduction To Applied Geostatistics

Introduction to applied geostatistics
Short version
Overheads
D G Rossiter
Department of Earth Systems Analysis
International Institute for Geo-information Science & Earth Observation (ITC)
<http://www.itc.nl/personal/rossiter>
March 21, 2007

Topic: Resources
There are many resources, at various mathematical levels, some aimed at
particular applications. These lists are not comprehensive but should be good
starting points:
• Texts
• Web pages
• Computer programmes
Texts: Mathematical
• Chilès, J.-P. and Delfiner, P., 1999. Geostatistics: modeling spatial uncertainty.
Wiley series in probability and statistics. John Wiley & Sons, New York.
• Christakos, G., 2000. Modern spatiotemporal geostatistics. Oxford University
Press, New York.
• Cressie, N., 1993. Statistics for spatial data. John Wiley & Sons, New York.
• Ripley, B.D., 1981. Spatial statistics. John Wiley & Sons, New York.
Texts: In the context of a particular application field
• Davis, J.C., 2002. Statistics and data analysis in geology. John Wiley & Sons,
New York.
• Fotheringham, A.S., Brunsdon, C. and Charlton, M., 2000. Quantitative
geography: perspectives on spatial data analysis. Sage Publications, London;
Thousand Oaks, Calif.
• Stein, A., Meer, F.v.d. and Gorte, B.G.F. (Editors), 1999. Spatial statistics for
remote sensing. Kluwer Academic, Dordrecht.
• Kitanidis, P.K., 1997. Introduction to geostatistics: applications to
hydrogeology. Cambridge University Press, Cambridge, England.
Texts: Application-oriented but mathematical
• Webster, R., and Oliver, M. A., 2001. Geostatistics for environmental scientists.
Wiley & Sons, Chichester.
• Goovaerts, P., 1997. Geostatistics for natural resources evaluation. Oxford
University Press, Oxford and New York.
• Isaaks, E.H. and Srivastava, R.M., 1990. An introduction to applied
geostatistics. Oxford University Press, New York.
Texts: Emphasis on computational methods
• Venables, W.N. & Ripley, B.D., 2002. Modern applied statistics with S, 4th
edition. Springer-Verlag, New York.
• Deutsch, C. V., & Journel, A. G., 1992. GSLIB: Geostatistical software library
and user’s guide. Oxford University Press, Oxford.
Web pages
• R: http://www.r-project.org/
• R spatial projects: http://sal.uiuc.edu/csiss/Rgeo/
• gstat: http://www.gstat.org/
• gslib: http://www.gslib.com/
• GEOEAS: http://www.epa.gov/ada/csmos/models/geoeas.html
• ILWIS: http://www.itc.nl/ilwis/
• ArcGIS Geostatistical Analyst:
http://www.esri.com/software/arcgis/arcgisxtensions/geostatistical/
• Geostatistical analysis tutor [Colorado (USA) School of Mines]:
http://uncert.mines.edu/tutor/
Computer programmes
• ILWIS 3.3 (ITC)
• R: open-source environment for statistical computing and visualisation;
relevant packages include
* gstat, by Pebesma
* spatial, by Ripley
* geoR, by Ribeiro & Diggle
* spdep, by Rowlingson & Diggle
* spatstat, by Baddeley & Turner (point pattern analysis)
* sp, underlying spatial data structures (used by others)
• ArcGIS Geostatistical Analyst (ESRI) [requires ArcGIS base]
• PCRaster + gstat (Utrecht) [free]
• GeoEAS, GSLIB, Variowin, VESPER . . .
Topic: Introduction to Spatial Analysis
1. Concepts of space: geographic and feature spaces
2. What is special about spatial data?
3. Key concepts in spatial analysis
4. Measuring spatial correlation
What is “space”?
• A set of n continuous dimensions; dimension i has range $[x_{i,\min}, \ldots, x_{i,\max}]$
• Points are mathematical n-dimensional vectors: $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
• Depending on how we choose the axes, we can speak of both geographic and
feature spaces . . .
Feature space
This “space” is not geographic space, but rather a mathematical space formed by
any set of variables:
• Axes are the range of each variable
• Coordinates are values of variables, possibly transformed or combined
• Not included in the common use of the term “spatial” data or analysis
• But observations may be related in this ‘space’ . . .
• . . . and we often plot variables in this space, e.g. 2-D scatterplots
This is the “space” in which univariate, bivariate, or multivariate analyses are
carried out.
Geographic space
• Axes are 1-d lines
• One-dimensional: coordinates are on a line with respect to some origin (0):
$(x_1) = x$
• Two-dimensional: coordinates are on a grid with respect to some origin (0, 0):
$(x_1, x_2) = (x, y) = (E, N)$
• Three-dimensional: coordinates are grid and elevation from a reference
elevation: $(x_1, x_2, x_3) = (x, y, z) = (E, N, H)$
• Must transform latitude-longitude to grid coordinates in some 2-d projection;
distortions occur over large areas
• Can work directly with geographic coordinates, but not as a grid
What is special about spatial data? (1)
1. The location of a sample is an intrinsic part of its definition.
2. All data sets from a given area are implicitly related by their coordinates →
models of spatial structure
3. Values at sample points cannot be assumed to be independent
4. That is, there may be a spatial structure to the data
• Classical statistics assumes independence, at least within sampling strata
• Major implications for sampling design and statistical inference
5. Data values may be related to their coordinates → spatial trend
Key Concepts
• Spatial dependence: the value of a variable at a point in space is related to its
value at nearby points; knowing the value of these points allows us to predict
(with some degree of certainty) the value at the chosen point
• Spatial structure: the nature of the spatial relation: how far, and in what
directions, is the spatial dependence? How does the dependence vary with
distance and direction between points?
• Support of a sample: the physical dimensions it represents (n.b. may try to
predict to coarser or finer resolutions)
Topic: Exploratory spatial data analysis

Since spatial data were collected at known points in geographic space, we should
visualise them in that space.
• Distribution of sample points
• Postplots (values vs. locations): where are which values?
• Geographic postplots: with images, landuse maps etc. as background: does
there appear to be any explanation for the distribution of values?
• Spatial structure: range, direction, strength . . .
• Is there anisotropy? In what direction(s)?
• Do there seem to be several populations with distinct geographic distribution?
Point distribution
This shows how sample points are distributed in space.
• What was the sampling plan?
• Random or clustered?
• Are some areas over- or under-sampled?
Example: Walker Lake: Distribution of points – All points
The Postplot: distribution of values in space

The so-called postplot shows how the data values are distributed in space.
• Are values of nearby points similar to each other, or do the values appear to
be random?
• Does there appear to be a trend?
• Are there distinct clusters of high or low values?
• Is there any directional difference in clustering? (anisotropy)
Meuse – Distribution of Log(Cadmium) in soils
Geographic postplot
This shows the postplot against a background that may explain the distribution of
samples or values. Examples:
• land cover or land use
• geologic or soil units
• structural geology
Meuse – Log(Cadmium) on a false-colour composite
Topic: Spatial correlation
1. What is spatial auto-correlation?
2. Evidence of spatial correlation
3. Computing spatial correlation and covariance
4. Summarizing and visualising spatial covariance; the empirical variogram
Topics for later units:
1. modelling spatial correlation
2. predicting using the modelled structure
Spatial Correlation
• Question: are nearby points in geographic space also ‘nearby’ in feature
space?
• That is, does knowing the value of some variable at some location give us
information on the value at ‘nearby’ locations?
• The concept of correlation between variables can be applied to correlation
within a variable, using distance to model the relation
Covariance and Correlation

Recall: for two non-spatial variables X and Y:
• Sample covariance:
$$s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})$$
• Sample correlation coefficient: the covariance normalized by the sample
standard deviations; range [−1 . . . 1]:
$$r_{XY} = \frac{s_{XY}}{s_X \cdot s_Y} = \frac{\sum (x_i - \bar{x}) \cdot (y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2} \cdot \sqrt{\sum (y_i - \bar{y})^2}}$$
Can we extend this idea to a single variable, which is then correlated with itself?
Auto-correlation
We want to apply the idea of correlation to one variable (auto-correlation); the
prefix auto- means “self”, here referring to the single variable.
Here, the correlation is controlled by some other dimension:
• time – if the variable is collected as a time series
• space – if the variable is collected at points in space
So we will get a measure of how much the variable is correlated to itself,
considering the other factor (time or space).
Auto-covariance
• The spatial auto-covariance is computed within the same variable, using
pairs of observations.
• Each pair of observations $(x_i, x_j)$ has a covariance, showing how they jointly
differ from the variable’s mean $\bar{x}$:
$$(x_i - \bar{x})(x_j - \bar{x})$$
• There are (n · (n − 1))/2 point pairs for which this can be calculated
• This is a large number! For example, with 200 points this is 19,900 point pairs.
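As a quick check of the pair count, the following base R sketch enumerates all point pairs for a handful of made-up values and computes the product of deviations for each pair. The vector x and the variable names are hypothetical and are not taken from the Meuse data.

```r
# Hypothetical values of one variable at n = 5 sample points (illustrative only)
x <- c(2.1, 3.4, 1.8, 2.9, 3.0)
n <- length(x)

# Number of distinct point pairs: n(n - 1)/2
n_pairs <- n * (n - 1) / 2

# Enumerate the pairs (i, j) with i < j; one row per pair
pairs <- t(combn(n, 2))

# Product of deviations from the mean for each pair
xbar <- mean(x)
prod_ij <- (x[pairs[, 1]] - xbar) * (x[pairs[, 2]] - xbar)

# The same count for the 200-point example in the text
pairs_200 <- choose(200, 2)
```

Here n_pairs is 10 and pairs_200 is 19900, which matches the figure quoted above.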
Modelling the auto-covariance
• By themselves the individual auto-covariances are not useful; they just
quantify the covariance of each point pair.
• We need to summarize the individual covariances as a covariance function of
spatial separation
• Theory: the covariance depends only on the separation between points.
• If we can model this function . . .
• . . . we can then predict the covariance between any two locations in space.
Semivariances
It is easier to model semivariances than covariances:
• Each pair of observation points has a semivariance, usually symbolized by the
Greek letter “gamma”, i.e. γ, defined as:
$$\gamma(x_i, x_j) = \frac{1}{2} [z(x_i) - z(x_j)]^2$$
• Each point pair is separated by a known distance, so . . .
• We can plot the semivariances against distance as a variogram “cloud”, with
(n · (n − 1))/2 points in the graph
• Can also summarize in a variogram
• (The ‘semi’ refers to the factor 1/2, because there are two ways to compute for
the same point pair)
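The variogram cloud can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the coordinates and values below are simulated, and the object names (xy, z, cloud) are hypothetical.

```r
set.seed(1)
n <- 6
# Hypothetical sample locations and attribute values (simulated)
xy <- cbind(x = runif(n, 0, 100), y = runif(n, 0, 100))
z <- rnorm(n)

# All pairwise separations, as a full n x n matrix
d <- as.matrix(dist(xy))

# Indices (i, j) of the upper triangle: each point pair exactly once
idx <- which(upper.tri(d), arr.ind = TRUE)

# One row per point pair: separation and semivariance 0.5 * [z(xi) - z(xj)]^2
cloud <- data.frame(
  dist  = d[idx],
  gamma = 0.5 * (z[idx[, 1]] - z[idx[, 2]])^2
)

# plot(cloud$dist, cloud$gamma) would draw the variogram cloud
```

With 6 points the cloud has 6 · 5 / 2 = 15 rows.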
The gstat package of R

We illustrate the concepts of spatial correlation with the gstat package of the R
environment and the meuse example data set.
The meuse data frame has coördinates in fields x and y; these are used to
promote the object to class SpatialPointsDataFrame.
> # view package information
> library(help=gstat)
> # load the packages; sp provides the meuse data and the spatial classes
> library(sp)
> library(gstat)
> ?meuse
> # load sample data
> data(meuse)
> # as loaded is a data frame
> summary(meuse)
> # promote to class SpatialPointsDataFrame
> coordinates(meuse) <- ~ x+y
> # now has explicit coordinates
> summary(meuse)
Object of class SpatialPointsDataFrame
Coordinates:
min max
x 178605 181390
y 329714 333611
Is projected: NA
proj4string : [NA]
Number of points: 155
Data attributes:
cadmium copper lead zinc elev
Min. : 0.20 Min. : 14.0 Min. : 37.0 Min. : 113 Min. : 5.18
1st Qu.: 0.80 1st Qu.: 23.0 1st Qu.: 72.5 1st Qu.: 198 1st Qu.: 7.55
Median : 2.10 Median : 31.0 Median :123.0 Median : 326 Median : 8.18
Mean : 3.25 Mean : 40.3 Mean :153.4 Mean : 470 Mean : 8.17
3rd Qu.: 3.85 3rd Qu.: 49.5 3rd Qu.:207.0 3rd Qu.: 674 3rd Qu.: 8.96
Max. :18.10 Max. :128.0 Max. :654.0 Max. :1839 Max. :10.52
dist om ffreq soil lime landuse dist.m

Min. :0.0000 Min. : 1.00 1:84 1:97 0:111 W :50 Min. : 10
1st Qu.:0.0757 1st Qu.: 5.30 2:48 2:46 1: 44 Ah :39 1st Qu.: 80
Median :0.2118 Median : 6.90 3:23 3:12 Am :22 Median : 270
Mean :0.2400 Mean : 7.48 Fw :10 Mean : 290
3rd Qu.:0.3641 3rd Qu.: 9.00 Ab : 8 3rd Qu.: 450
Max. :0.8804 Max. :17.00 (Other):25 Max. :1000
NA’s : 2.00 NA’s : 1
The empirical variogram
• To summarize the variogram cloud, compute average semivariance at various
separations (‘lags’); this is the empirical variogram
$$\gamma(\mathbf{h}) = \frac{1}{2m(\mathbf{h})} \sum_{i=1}^{m(\mathbf{h})} [z(x_i) - z(x_j)]^2$$
• m(h) is the number of point pairs separated by vector h
• In practice, we have to define the set of vectors in each “bin” (to have enough
points); that is, we collect a distance range into one bin.
• (Note: there are other ways to estimate the variogram from the variogram
cloud; in particular so-called robust estimators.)
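The binning step can be written out in base R as below. This is a minimal sketch, not gstat's implementation: the function name empirical_variogram is hypothetical, direction is ignored (only distance is binned), and the four collinear points are made up so that the result can be checked by hand. The cutoff is assumed to be a multiple of the bin width.

```r
# Average the semivariances of all point pairs falling in each distance bin
empirical_variogram <- function(xy, z, width, cutoff) {
  d <- as.matrix(dist(xy))
  idx <- which(upper.tri(d), arr.ind = TRUE)    # each point pair once
  h <- d[idx]                                   # pair separations
  g <- 0.5 * (z[idx[, 1]] - z[idx[, 2]])^2      # pair semivariances
  keep <- h > 0 & h <= cutoff
  bin <- cut(h[keep], breaks = seq(0, cutoff, by = width))
  data.frame(
    np    = as.vector(table(bin)),                   # pairs per bin
    dist  = as.vector(tapply(h[keep], bin, mean)),   # mean separation
    gamma = as.vector(tapply(g[keep], bin, mean))    # mean semivariance
  )
}

# Hypothetical data: four points on a line, unit spacing
xy <- cbind(x = c(0, 1, 2, 3), y = 0)
z <- c(1, 2, 4, 7)
v <- empirical_variogram(xy, z, width = 1, cutoff = 3)
```

For these values the three bins hold 3, 2 and 1 point pairs, with average semivariances 7/3, 8.5 and 18. The output columns are named as in the gstat listing (np, dist, gamma) for easy comparison.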
Example of an experimental variogram

> (v <- variogram(log(cadmium)~1, data=meuse))
np dist gamma
1 57 79.29244 0.6650872
2 299 163.97367 0.8584648
3 419 267.36483 1.0064382
4 457 372.73542 1.1567136
5 547 478.47670 1.3064732
6 533 585.34058 1.5135658
7 574 693.14526 1.6040086
8 564 796.18365 1.7096998
9 589 903.14650 1.7706890
10 543 1011.29177 1.9875659
11 500 1117.86235 1.8259154
12 477 1221.32810 1.8852099
13 452 1329.16407 1.9145967
14 457 1437.25620 1.8505336
15 415 1543.20248 1.8523791
np are the number of point pairs in the bin; dist is the average separation of
these pairs; gamma is the average semivariance in the bin.
Plotting the experimental variogram

This can be plotted as semivariance gamma against average separation dist,
along with the number of points that contributed to each estimate np:
> plot(v, plot.numbers=T)
(Note: gstat defaults to a maximum distance (cutoff) of 1/3 of the length of the
diagonal of the bounding box of the data, divided into 15 equally-spaced bins.
These can be over-ridden with the cutoff= and width= arguments, respectively;
or explicit bin limits can be set with the boundaries= argument.)
Default variogram of Log(Cd)
[Figure: empirical variogram of log(Cd); semivariance (0 to 2.0) against distance (0 to 1500), each point labelled with its number of point pairs.]
Features of the experimental variogram

Later we will look at fitting a theoretical model to the experimental variogram;
but even without a model we can notice some features, which we define here only
qualitatively:
• Sill: maximum semi-variance
* represents variability in the absence of spatial dependence
• Range: separation between point-pairs at which the sill is reached
* distance at which there is no evidence of spatial dependence
• Nugget: semi-variance as the separation approaches zero
* represents variability at a point that can’t be explained by spatial structure
In the previous slide, we can estimate the sill ≈ 1.9, the range ≈ 1200 m, and the
nugget ≈ 0.5 i.e. ≈ 25% of the sill.
Defining the bins (1)
• Distance interval, specifying the centres. E.g. (0, 100, 200, . . .) means intervals
of [0 . . . 50], [50 . . . 150], . . .
• All point pairs whose separation is in the interval are used to estimate γ(h) for
h as the interval centre
• Narrow intervals: more resolution but fewer point pairs for each sample
> v <- variogram(log(cadmium)~1, meuse, boundaries = seq(50, 2050, by = 100))
> plot(v, plot.numbers=TRUE)
> par(mfrow = c(2, 3)) # show all six plots together
> for (bw in seq(20, 220, by = 40)) {
+   v <- variogram(log(cadmium)~1, meuse, width=bw)
+   plot(v$dist, v$gamma, xlab=paste("bin width", bw))
+ }
Variograms of Log(Cd) with different bin widths
[Figure: six panels of v$gamma against v$dist, one for each bin width: 20, 60, 100, 140, 180 and 220.]
Defining the bins (2)
• Each bin should have > 100 point pairs; > 300 is much more reliable
> v <- variogram(log(cadmium)~1, meuse, width=20)

> v$np
[1] 6 19 27 27 51 65 58 62 62 82 76 75 86 81 76
[16] 91 92 90 88 92 112 103 80 116 108 106 79 94 117 99
[31] 100 101 108 117 110 117 114 107 96 110 109 106 114 117 104
[46] 98 94 117 92 110 105 91 89 98 89 91 103 102 93 92
[61] 73 85 88 91 88 84 75 81 90 73 93 95 76 85 67
[76] 77 88 60
> v <- variogram(log(cadmium)~1, meuse, width=120)
> v$np
[1] 79 380 485 577 583 642 654 648 609 572 522 491 493 148
Topic: Modelling the variogram

From the empirical variogram we now derive a variogram model which
expresses semivariance as a function of separation vector.
The model allows us to:
• Infer the characteristics of the underlying process from the functional form
and its parameters;
• Compute the semi-variance between any point-pair, separated by any vector
. . .
• . . . which is used in an ‘optimal’ interpolator (“kriging”) to predict at
unsampled locations.
A variogram model, with parameters
Authorized variogram models
• Only some functional forms can be used to model the variogram (theoretical
and mathematical constraints)
• The permitted forms are called authorized models
• Simplest: The exponential model; sill c, effective range 3a
$$\gamma(h) = c \left\{ 1 - e^{-h/a} \right\}$$
E.g. if the effective range is estimated as 120, the parameter a is 40.
• Another common model: The Spherical model; sill c, range a
$$\gamma(h) = \begin{cases} c \left[ \dfrac{3h}{2a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^3 \right] & : h < a \\ c & : h \ge a \end{cases}$$
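The two model formulas can be written directly as R functions. This is a sketch for checking the formulas by hand; the function names vgm_exp and vgm_sph are hypothetical and are not part of gstat, and no nugget is included.

```r
# Exponential model: sill c, distance parameter a (effective range 3a)
vgm_exp <- function(h, c, a) {
  c * (1 - exp(-h / a))
}

# Spherical model: sill c, range a; constant at the sill for h >= a
vgm_sph <- function(h, c, a) {
  ifelse(h < a, c * (1.5 * h / a - 0.5 * (h / a)^3), c)
}

# Exponential model at its effective range (a = 40, so 3a = 120)
at_effective_range <- vgm_exp(120, c = 1, a = 40)

# Spherical model at zero, half the range, and the range
sph_values <- vgm_sph(c(0, 600, 1200), c = 1, a = 1200)
```

At the effective range the exponential model has reached 1 − e^(−3), about 95% of the sill, which is why 3a is used as the "effective" range for a model that only approaches its sill asymptotically. The spherical values are 0, 0.6875 and 1.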

Graphs of authorized variogram models

[Figure: six panels, one per model (linear-with-sill, circular, spherical, pentaspherical, exponential, Gaussian), each showing semivariance (0 to 1) against separation distance (0 to 10), with nugget, sill and range marked.]
Comparison of variogram models
[Figure: exponential, Gaussian, circular, spherical, pentaspherical and linear-with-sill models on common axes; semivariance (0 to 1) against separation distance (0 to 8), with nugget, sill and range marked.]
Models vary considerably, from origin to range
Comparison of models available in gstat

> show.vgms()
Choosing a model (1)

The empirical variogram should be one realization of a random process. So,
what do we expect from the process that is supposed to be responsible for the
spatial structure represented in the variogram?
• Exponential: First-order autoregressive process: values are random but with
dependency on the nearest neighbour; boundaries according to a Poisson
process
• Gaussian: as exponential, but with strong close-range dependency, very
smooth at each point.
Choosing a model (1) – continued
• Spherical, circular, pentaspherical: Patches of similar values; patches have
similar size (≈ range) with transition zones (overlap of processes). These differ
mainly in the “shoulder” transition to the sill
Choosing a model (2)
• Which has been successfully applied with this kind of data?
(This is evidence for the nature of this kind of process)
• What do we expect from the supposed process, if we have some other
evidence of its spatial behaviour?
For example, a Gaussian model might be expected for a phenomenon which
physically must be very continuous, e.g. the surface of a ground-water table.
• Visual estimate of functional form from the variogram
• (Fit various models, pick the statistically-best fit)
Fitting the model

Once a model form is selected, then the model parameters must be adjusted for
a ‘best’ fit of the experimental variogram.
• By eye, adjusting parameters for good-looking fit
* Hard to judge the relative value of each point
* This is all that’s possible in ILWIS
• Automatically, looking for the best fit according to some objective criterion
* Various criteria possible in gstat
• In both cases, favour sections of the variogram with more pairs and at shorter
ranges (because kriging is a local interpolator).
• Mixed: adjust by eye, evaluate statistically; or vice versa
Fitting a variogram model in gstat

We’ve decided on a spherical + nugget model:
> # Calculate the experimental variogram and display it

> v1 <- variogram(log(cadmium)~1, meuse); plot(v1, plot.numbers=T)
> # Fit by eye, display fit
> m1 <- vgm(1.4, "Sph", 1200, 0.5); plot(v1, plot.numbers=T, model=m1)
> # Let gstat adjust the parameters, display fit
> m2 <- fit.variogram(v1, m1); m2
model psill range
1 Nug 0.54785 0.000
2 Sph 1.33980 1149.4
> plot(v1, plot.numbers=T, model=m2)
> # Fix the nugget, fit only the sill of spherical model
> m2a <- fit.variogram(v1,m1,fit.sills=c(F,T),fit.range=F); m2a
model psill range
1 Nug 0.5000 0
2 Sph 1.4651 1200
In this case, the eyeball did a pretty good job . . .
[Figure: two panels showing the empirical variogram of log(Cd), semivariance against distance, with the fitted spherical models (fitted by eye and fitted automatically).]
By eye: c0 = 0.5, c1 = 1.4, a = 1200; total sill c0 + c1 = 1.9
Automatic: c0 = 0.548, c1 = 1.340, a = 1149; total sill c0 + c1 = 1.888
The total sill was almost unchanged; gstat raised the nugget and lowered the
partial sill of the spherical model a bit; the range was shortened by 51 m.
What sample size to fit a variogram model?
• Can’t use non-spatial formulas for sample size, because spatial samples are
correlated, and each sample is used multiple times in the variogram estimate
• Stochastic simulation from an assumed random field with a known variogram
suggests:
1. < 50 points: not at all reliable
2. 100 to 150 points: more or less acceptable
3. > 250 points: almost certainly reliable
• More points are needed to estimate an anisotropic variogram.
This is very worrying for many environmental datasets (soil cores, vegetation
plots, . . . ) especially from short-term fieldwork, where sample sizes of 40 – 60
are typical. Should variograms even be attempted on such small samples?
Topic: Approaches to spatial prediction

This is the prediction of the value of some variable at an unsampled point,
based on the values at the sampled points.
This is often called interpolation, but strictly speaking:
• Interpolation: prediction is only for points that are geographically inside the
(convex hull of the) sample set;
• Extrapolation: prediction outside this geographic area
(Note: same usage as in feature-space predictions)
A taxonomy of spatial prediction methods
Strata: divide the area to be mapped into ‘homogeneous’ strata; predict within each
stratum from all samples in that stratum
Global predictors: use all samples to predict at all points; also called regional
predictors
Local predictors: use only ‘nearby’ samples to predict at each point
Mixed predictors: some of the structure is explained by strata or globally, some
locally
Which approach is “best”?
• No theoretical answer
• Depends on how well the approach models the ‘true’ spatial structure, and
this is unknown (but we may have prior evidence)
• Should correspond with what we know about the process that created the
spatial structure
Polynomial trend surfaces
• A global predictor which models a regional trend
• The value of a variable at each point depends only on its coördinates and
parameters of a fitted surface
• This is modelled with a smooth function of position, $z = f(x, y) = f(E, N)$ for
grid coördinates; this is called the trend surface
• Simple form (plane, 1st order):
$$z = \beta_0 + \beta_x E + \beta_y N$$
• Higher-order surfaces may also be fitted (beware of fitting the noise!)
Fitting trend surfaces
• The trend surface is fitted by linear regression, with the coördinates as the
predictor variables and the variable to be mapped as the response, using data
from all sample points.
• All samples participate equally in the prediction
• We can measure the goodness of fit of the trend surface to the sample by the
residual sum of squares
• The same cautions as in feature-space regression analysis!
• Ordinary Least Squares (OLS) is often used but is not really correct, since it
ignores possible correlation among closely-spaced samples; better is
Generalised Least Squares (GLS)
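First- and second-order trend surfaces can be fitted with base R's lm(). The sketch below uses simulated data (the coordinates E and N, the response z, and the generating coefficients are all made up) and fits by OLS, so the caution above about spatially correlated residuals applies to it.

```r
set.seed(42)
n <- 50
# Hypothetical grid coordinates and a response with a planar trend plus noise
E <- runif(n, 0, 1000)
N <- runif(n, 0, 1000)
z <- 2 + 0.003 * E - 0.001 * N + rnorm(n, sd = 0.1)

# 1st-order (plane) and 2nd-order (quadratic) trend surfaces, fitted by OLS
ts1 <- lm(z ~ E + N)
ts2 <- lm(z ~ E + N + I(E^2) + I(N^2) + I(E * N))

# Goodness of fit to the sample: residual sum of squares
rss1 <- sum(residuals(ts1)^2)
rss2 <- sum(residuals(ts2)^2)

# Predict the 1st-order surface at an unsampled location
p <- predict(ts1, newdata = data.frame(E = 500, N = 500))
```

The second-order surface never has a larger residual sum of squares than the first-order one, because the models are nested; a lower RSS is therefore not by itself evidence of a better predictor (the "fitting the noise" warning).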
Predictions of 1st and 2nd order Trend Surfaces in the study area
[Figure: maps of the 1st and 2nd order trend surfaces (TS1, TS2) over the study area; axes x and y in grid coördinates.]
Predictions of 1st and 2nd order Trend Surfaces in the bounding box
[Figure: maps of the 1st and 2nd order trend surfaces (TS1, TS2) over the bounding box; axes x and y in grid coördinates.]
Approaches to prediction: Local predictors
• No strata
• No regional trend
• Value of the variable is predicted from “nearby” samples
* Example: concentrations of soil constituents (e.g. salts, pollutants)
* Example: vegetation density
Local Predictors
Each interpolator has its own assumptions, i.e. theory of spatial variability
• Nearest neighbour (Thiessen polygons)
• Average within a radius
• Average of the n nearest neighbours
• Distance-weighted average within a radius
• Distance-weighted average of n nearest neighbours
• ...
• “Optimal” weighting ⇒ Kriging
Local predictor: Nearest neighbour (Thiessen polygons)
• Predict each point from its single nearest sample point
• Conceptually simple, makes minimal assumptions about spatial structure
• No way to estimate prediction variances, ignores other ‘nearby’ information
• Maps show abrupt discontinuities at boundaries, so don’t look very realistic
• But may be a more accurate predictor than poorly-modelled predictors
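Nearest-neighbour prediction is short enough to write out in base R. The sketch below uses four made-up sample points; the function name predict_nn is hypothetical.

```r
# Hypothetical sample locations and values
xy <- cbind(x = c(0, 10, 10, 0), y = c(0, 0, 10, 10))
z <- c(1.0, 2.0, 3.0, 4.0)

# Predict at x0 from the single nearest sample point
predict_nn <- function(x0, xy, z) {
  d <- sqrt((xy[, 1] - x0[1])^2 + (xy[, 2] - x0[2])^2)
  z[which.min(d)]
}

# (8, 1) is closest to the sample at (10, 0)
p_nn <- predict_nn(c(8, 1), xy, z)
```

Predicting over a fine grid with this function would reproduce the Thiessen polygons, with an abrupt jump wherever the nearest sample changes.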
Local predictor: Average within a radius
• Use the set of all neighbouring sample points within some radius r
• Predict by averaging:
$$\hat{x}_0 = \frac{1}{n} \sum_{i=1}^{n} x_i , \quad d(x_0, x_i) \le r$$
• Although we can calculate prediction variances from the neighbours, these
assume no spatial structure closer than the radius
• Problem: How do we select a radius?
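The averaging rule can be sketched in base R as follows. The sample points are made up and the function name predict_radius is hypothetical; returning NA when no sample falls within the radius is a choice made for this sketch.

```r
# Hypothetical sample locations and values
xy <- cbind(x = c(0, 10, 10, 0), y = c(0, 0, 10, 10))
z <- c(1.0, 2.0, 3.0, 4.0)

# Unweighted average of all samples within radius r of x0
predict_radius <- function(x0, xy, z, r) {
  d <- sqrt((xy[, 1] - x0[1])^2 + (xy[, 2] - x0[2])^2)
  inside <- d <= r
  if (!any(inside)) return(NA_real_)   # no neighbours: no prediction
  mean(z[inside])
}

p_wide   <- predict_radius(c(5, 0), xy, z, r = 6)   # two samples at distance 5
p_narrow <- predict_radius(c(5, 0), xy, z, r = 1)   # no sample within 1
```

The first call averages the two samples at distance 5 and gives 1.5; the second finds no neighbours. The result depends directly on r, which illustrates the radius-selection problem noted above.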
Local predictors: Distance-weighted average
• Inverse of distance to some set of n nearest-neighbours:
$$\hat{x}_0 = \sum_{i=1}^{n} \frac{x_i}{d(x_0, x_i)} \Big/ \sum_{i=1}^{n} \frac{1}{d(x_0, x_i)}$$
• Inverse of distance to some set of n nearest-neighbours, to some power k
$$\hat{x}_0 = \sum_{i=1}^{n} \frac{x_i}{d(x_0, x_i)^k} \Big/ \sum_{i=1}^{n} \frac{1}{d(x_0, x_i)^k}$$
• Implicit theory of spatial structure (a power model), but this is not testable
• Can select all points within some limiting distance (radius), or some fixed
number of nearest points, or . . .
• How to select radius or number and power?
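The distance-weighted average can be sketched in base R as below. This is an illustration, not gstat's idw(): the function name predict_idw and its arguments are hypothetical, the sample points are made up, and returning the sample value when x0 coincides with a sample is a convention chosen here to avoid division by zero.

```r
# Hypothetical sample locations and values
xy <- cbind(x = c(0, 10, 10, 0), y = c(0, 0, 10, 10))
z <- c(1.0, 2.0, 3.0, 4.0)

# Inverse-distance-weighted average of the n_nearest samples, power k
predict_idw <- function(x0, xy, z, k = 1, n_nearest = nrow(xy)) {
  d <- sqrt((xy[, 1] - x0[1])^2 + (xy[, 2] - x0[2])^2)
  if (any(d == 0)) return(z[which(d == 0)[1]])   # x0 is a sample point
  nb <- order(d)[seq_len(min(n_nearest, length(d)))]
  w <- 1 / d[nb]^k                               # weights from distance only
  sum(w * z[nb]) / sum(w)
}

p_centre <- predict_idw(c(5, 5), xy, z, k = 2)                 # equidistant
p_two    <- predict_idw(c(5, 0), xy, z, k = 1, n_nearest = 2)  # two nearest
```

At the centre all four samples are equidistant, so the prediction is their plain mean, 2.5, whatever the power. The weights never look at how the samples are arranged relative to one another, which is the limitation that kriging addresses.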

Inverse distance in gstat

The idw method is used. There is no model of spatial variability, so there is no
way to estimate a prediction variance.
> kid <- idw(log(cadmium) ~ 1, meuse, meuse.grid)

[inverse distance weighted interpolation]
> levelplot(var1.pred ~ x+y, as.data.frame(kid), aspect="iso")
The weights are computed only from the inverse distance; they do not account for
spatial structure nor for the relative positions of the sample points.
Compare inverse distance (linear) to Ordinary Kriging with a spherical model
(range = 1150 m): OK gives a much smoother map.
[Figure: maps of predicted log(Cd) over the study area, axes x and y in grid coördinates; panels: Inverse distance, Ordinary kriging.]
Approaches to prediction: Mixed predictors
• For situations where there is both long-range structure (trend) or strata and
local structure
* Example: Particle size in the soil: strata (rock type), trend (distance from a
river), and local variation in depositional or weathering processes
• One approach: model strata or global trend, subtract from each value, then
model residuals → Regression Kriging.
• Another approach: model everything together → Universal Kriging or Kriging
with External Drift
Topic: Ordinary Kriging

The theory of regionalised variables leads to an “optimal” interpolation method, in
the sense that the prediction variance is minimized.
This is based on the theory of random functions, and requires certain
assumptions.
Kriging
• A “Best Linear Unbiased Predictor” (BLUP) that satisfies a certain optimality
criterion (so it’s “best” with respect to the criterion)
• It is only “optimal” with respect to the chosen model and the chosen
optimality criterion
• Based on the theory of random processes, with covariances depending only
on separation (i.e. a variogram model)
• Theory developed several times (Kolmogorov 1930s, Wiener 1949), but current
practice dates back to Matheron (1963), formalizing the practical work of the
mining engineer D G Krige (RSA).
* Should really be written as “krigeing” (Fr. krigeage) but it’s too late for that.
What is so special about kriging?
• Predicts at any point as the weighted average of the values at sampled points
* as for inverse distance (to a power)
• Weights given to each sample point are optimal, given the spatial covariance
structure as revealed by the variogram model (in this sense it is “best”)
• So, the prediction is only as good as the model of spatial structure!
• The prediction error at each point is automatically generated as part of the
process of computing the weights.
How do we use Kriging?
1. Sample, preferably at different resolutions
2. Calculate the experimental variogram
3. Model the variogram with one or more authorized functions
• N.b. the variogram model may already be known from other studies or
theoretical considerations
4. Apply the kriging system of equations, with the variogram model of spatial
dependence, at each point to be predicted
• Predictions are often at each point on a regular grid (e.g. a raster map)
5. Calculate the variance of each prediction; this is based only on the sample
point locations, not their data values.
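Steps 4 and 5 can be sketched in base R for a single prediction point. This is only an illustration under stated assumptions, not the gstat implementation: the five sample locations and values are made up, the helper names sph_gamma and ok_point are hypothetical, and the spherical variogram parameters (nugget 0.5, partial sill 1.4, range 1200) are assumed rather than fitted.

```r
# Spherical semivariogram with nugget c0, partial sill c1 and range a.
# By convention gamma(0) = 0. Parameter values are assumptions for this sketch.
sph_gamma <- function(h, c0 = 0.5, c1 = 1.4, a = 1200) {
  g <- ifelse(h < a, c0 + c1 * (1.5 * h / a - 0.5 * (h / a)^3), c0 + c1)
  ifelse(h == 0, 0, g)
}

# Ordinary kriging at one point x0 from sample coordinates xy and values z
ok_point <- function(xy, z, x0) {
  n <- nrow(xy)
  # left-hand side: semivariances between samples, bordered by the
  # unbiasedness constraint (weights sum to 1)
  A <- rbind(cbind(sph_gamma(as.matrix(dist(xy))), 1), c(rep(1, n), 0))
  # right-hand side: semivariances between each sample and x0
  h0 <- sqrt((xy[, 1] - x0[1])^2 + (xy[, 2] - x0[2])^2)
  b <- c(sph_gamma(h0), 1)
  sol <- solve(A, b)               # weights, then the Lagrange multiplier
  w <- sol[1:n]
  list(weights = w,
       pred = sum(w * z),          # weighted average of the data values
       var = sum(sol * b))         # kriging variance: z does not appear
}

# Made-up sample: five locations (m) and data values
xy <- cbind(x = c(0, 300, 600, 200, 800), y = c(0, 400, 100, 900, 700))
z  <- c(1.2, 0.4, 2.1, 0.9, 1.5)
x0 <- c(400, 400)

r1 <- ok_point(xy, z, x0)
r2 <- ok_point(xy, 10 * z, x0)     # same locations, different data values
sum(r1$weights)                    # the weights sum to 1
c(r1$var, r2$var)                  # same variance for both data sets
```

Changing the data values changes the prediction but not the weights or the variance, which is why a sampling design can be evaluated before any measurements are made, provided the variogram model is known.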
OK in gstat
The krige method is used with a variogram model:
library(sp); library(gstat); library(lattice)
# sample data; skip this line if meuse is already a spatial object
data(meuse); coordinates(meuse) <- ~ x + y
# compute experimental variogram
v <- variogram(log(cadmium) ~ 1, meuse)
# estimated model: partial sill, model type, range, nugget
m <- vgm(1.4, "Sph", 1200, 0.5)
# fitted model
m.f <- fit.variogram(v, m)
# interpolation grid
data(meuse.grid); coordinates(meuse.grid) <- ~ x + y
kr <- krige(log(cadmium) ~ 1, locations = meuse, newdata = meuse.grid, model = m.f)
[using ordinary kriging]
# visualize interpolation; note aspect option to get correct geometry
levelplot(var1.pred ~ x + y, as.data.frame(kr), aspect = "iso")
# visualize prediction variance
levelplot(var1.var ~ x + y, as.data.frame(kr), aspect = "iso")
Note the model specification (model = m.f); this gives the assumed covariance
structure with which to compute the optimal weights.
Ordinary kriging (OK) results for Meuse log(Cd)
[Figure: two map panels of the OK predictions on the interpolation grid, with sample points plotted; axes x and y (grid coordinates); legend scale from −1.0 to 2.5]
Kriging prediction errors for Meuse log(Cd)
[Figure: two map panels of the kriging prediction errors on the interpolation grid, with sample points plotted; axes x and y (grid coordinates); legend scale from 0.7 to 1.4]
How realistic are maps made by Ordinary Kriging?
• The resulting surface is smooth and shows no noise, even if there is a
nugget effect in the variogram model
• So the predicted field is the best at each point taken separately, but taken as
a whole is not a realistic map
• The sample points are predicted exactly; they are assumed to be without
error, again even if there is a nugget effect in the variogram model
Non-parametric geostatistics
A non-parametric statistic is one that does not assume any underlying data
distribution.
For example:
• a mean is an estimate of a parameter of location of some assumed
distribution (e.g. mid-point of normal, expected proportion of success in a
binomial, . . . )
• a median is simply the value at which half the samples are smaller and half
larger, without knowing anything about the distribution underlying the process
which produced the sample.
In geostatistics, “non-parametric” refers to methods that make no assumptions
about the distribution of the data values, only about spatial structure.
Non-parametric geostatistics: Motivation (1)

There is some positive motivation . . .
• In some applications, we may be most interested in finding areas with values
above a certain threshold (e.g. polluted areas), and not really care if we get
accurate predictions in other areas (as long as we are sure they are below the
threshold)
• So the form of the distribution is not important, just whether a value is above
or below some threshold.
• In these applications, we often want a probability that an interpolated point
exceeds the threshold; this is directly useful for probabilistic decision-making
* e.g. whether or not to clean up a polluted site
Non-parametric geostatistics: Motivation (2)

. . . and there is also some negative motivation:
• The outlier problem: a dataset may contain a few very high values
• These can make the area mean arbitrarily high (n.b. not the median)
• These contribute a disproportionate amount to the total variance as well
• These can make the experimental semivariogram unreliable for “typical” values
and useless for unusual values:
* the point-pairs where the outliers are included will have very high
semivariances
* these contribute disproportionately to the average semivariance in a bin . . .
* . . . so that the variogram is very difficult to model
• E.g. a random sample of 15 N(10, 1) variates with one outlier at 100 (i.e. 10x
the expected value) replacing the last value:
1. Without outlier: x̄ = 9.95, s_x² = 0.57
2. With outlier: x̄ = 15.98, s_x² = 540.8
• The one point with value 100 accounts for (100 − x̄)²/14 ≈ 504 of the
variance (same n − 1 divisor as for s_x²), i.e. 504/541 ≈ 93% of it
• So, it will make semi-variances of point pairs involving this point much
higher than others; these can be seen in the variogram cloud
• Note: the median is only slightly affected: if the outlier replaces a value
above the median, the median is unchanged; otherwise the next-highest value
becomes the median
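The outlier effect can be reproduced with a short base R sketch. The seed is arbitrary, so the numbers will differ from those quoted on the slide; the share of the variance due to the outlier is computed with the n − 1 divisor that R's var() uses.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(15, mean = 10, sd = 1)  # 15 N(10, 1) variates
x_out <- x
x_out[15] <- 100                   # replace the last value with an outlier

n <- length(x_out)
# share of the sample variance (n - 1 divisor) contributed by the outlier
share <- ((x_out[15] - mean(x_out))^2 / (n - 1)) / var(x_out)

summary_table <- rbind(
  without = c(mean = mean(x), var = var(x), median = median(x)),
  with    = c(mean = mean(x_out), var = var(x_out), median = median(x_out))
)
print(round(summary_table, 2))
print(round(share, 2))
```

The mean and variance change drastically while the median hardly moves, which is the motivation for working with quantiles rather than moments.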
“Solutions” to the outlier problem
1. Ignore (assume that they represent a different population and remove from the
dataset before further analysis)
• → under-estimation, can’t find “hot spots”
2. Set to some arbitrary maximum, nearer the bulk of the population; same
problem
3. Transform the variable to logarithms for modelling; transform back for the
final maps and estimates
• Good solution if the whole distribution is lognormal

• Not optimal if the aim is just to bring some outliers closer (i.e. the rest of
the distribution is not lognormal)
4. → Transform to indicator variables, interpolate by Indicator Kriging (IK)
Lognormal Kriging
1. Transform the data to their (natural) logarithms: $y(\vec{x}_i) = \log z(\vec{x}_i)$; this
should be approximately normally distributed
2. Model and interpolate with the transformed variable (OK, block kriging, UK,
KED, trend surfaces . . . )
3. Optional: Back-transform to original units of measure
Back-transformation is not required if we don’t care about the original variable,
e.g. if the logarithm itself is a useful index.
(Back-transformation of prediction variances is only possible for SK.)
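A minimal sketch of step 3, assuming Simple Kriging was used on the log scale. The prediction and variance values are hypothetical. Under the lognormal assumption, exp() of the log-scale prediction gives the median in original units; adding half the SK variance before exponentiating is the usual correction to obtain the mean. This sketch does not back-transform the variances themselves.

```r
# Hypothetical simple-kriging output on the log scale at three points
y_pred <- c(0.2, 1.0, 1.8)     # predicted log values
y_var  <- c(0.9, 0.6, 1.1)     # SK prediction variances (log scale)

z_median <- exp(y_pred)               # back-transformed median
z_mean   <- exp(y_pred + y_var / 2)   # back-transformed mean (SK, lognormal)

round(cbind(y_pred, y_var, z_median, z_mean), 2)
```

The back-transformed mean always exceeds the median, and the gap grows with the prediction variance, so naive exponentiation under-estimates the mean most where the prediction is least certain.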
Indicator kriging
This is a simple non-parametric (also called distribution-free) method
of interpolation.
It is used primarily to estimate the probability of exceeding some pre-defined
threshold value.
It can also be used to estimate an entire cumulative distribution function
(CDF).
Note that there are other non-parametric methods, e.g. disjunctive kriging;
these have a much more difficult theory.
Distribution-free estimates
So far we have assumed an approximately normal or lognormal distribution of the
target spatially-correlated random variable. But this may be demonstrably not
true.
A non-parametric approach does not attempt to fit a distribution to the data, but
rather works directly with the experimental CDF, by dividing it into sample
quantiles.
To work with these, we introduce the idea of indicator variables.
Indicator variables
• Binary variables: Take one of the values {1, 0} depending on whether the point
is ‘in’ or ‘out’ of the set; i.e. if it does or does not meet some criterion
* These are suitable for binary nominal variables, e.g. {“urban”, “not urban”};
{“land use changed”, “land use did not change”}
• A continuous variable z can be converted to an indicator by a threshold or
cut-off value z_t: the indicator is 1 ⇐⇒ z ≤ z_t
* e.g. z_t = 350 to cut off at 350 mg kg⁻¹
* Formally: $I(\vec{x}_i; z_t) = 1$ iff $Z(\vec{x}_i) \le z_t$; 0 otherwise
* By convention 1 indicates values at or below the threshold (to model the CDF);
inverting reverses the sense
Setting up indicators in gstat

> mind <- as.data.frame(meuse)[c("x","y","cadmium")]; str(mind)
‘data.frame’: 155 obs. of 3 variables:
$ x : num 181072 181025 181165 181298 181307 ...
$ y : num 333611 333558 333537 333484 333330 ...
$ cadmium: num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
> attach(mind)
> quantile(cadmium, seq(0,1,.1))
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.20 0.20 0.64 1.20 1.56 2.10 2.64 3.10 5.64 8.26 18.10
> # one indicator column per decile: 1 if cadmium <= that decile, else 0
> for (q in seq(.1,.9,.1)) mind <-
+ cbind(mind, as.numeric(cadmium<=quantile(cadmium,q)))
> names(mind)[4:12] <- paste("q",1:9,sep="")
> mind$q5[1:30]
[1] 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
So field q5 of data frame mind contains a 1 if the corresponding value of field
cadmium is ≤ 2.10, the fifth decile (i.e. the median).
Indicator map
• Every sample point is either 1 (‘in’) or 0 (‘out’); a binary map
• No measure of ‘how far’ in or out
• Prepare a series of indicator maps, with increasing thresholds, to visualise
the cumulative sample distribution
• A common strategy is to divide the range of the sample values into quartiles
or deciles and prepare an indicator for each
• The proportion of 1’s will increase with increasing quantile.
The Indicator variogram
• Compute as for a parametric variogram; every sample point has either value 1
(below the cutoff, in the set) or 0.
• The semivariance of each point pair is either 0 (both above or below; both out
or in) or 0.5 (one above, one below; one out, one in).
• For a quantized continuous variable, each indicator variable (quantile) might
well have different spatial structure
• Variograms near the two ends of the CDF have few 1’s or 0’s (depending on the
end), so few point-pairs will have semivariance 0.5 → hard to model (fluctuates)
• Model as for parametric variogram; however the total sill must be < 0.5
(generally it’s a lot lower: the variance of an indicator with a proportion p of
1’s is p(1 − p), which is at most 0.25)
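The values that an indicator semivariance can take are easy to check in base R. This sketch uses made-up lognormal data and deliberately ignores distance: it pools all point pairs into a single "bin", so it shows the overall level of the indicator semivariance, not its spatial structure.

```r
set.seed(42)                              # arbitrary seed
z <- rlnorm(50)                           # hypothetical skewed data values
ind <- as.numeric(z <= quantile(z, 0.9))  # indicator for the 9th decile

# semivariance of every point pair: 0.5 * (difference)^2
d <- outer(ind, ind, "-")
gam <- 0.5 * d[upper.tri(d)]^2

pair_values <- sort(unique(gam))          # the distinct semivariances
p <- mean(ind)                            # proportion of 1's
print(pair_values)
print(c(mean_pair_semivariance = mean(gam),
        indicator_variance = var(ind),
        p_times_1_minus_p = p * (1 - p)))
```

Averaged over all pairs, the semivariance equals the sample variance of the indicator, which is close to p(1 − p); near the ends of the CDF this is small, and only a few pairs carry the value 0.5.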
Probability kriging using indicator variables
1. Calculate the indicator at the required threshold
2. Calculate the empirical variogram for that indicator (not for the median
indicator)
• (May have to use a threshold closer to the median if there are too few 1’s
or 0’s, so that the variogram is erratic)
3. Model the variogram
4. Solve the kriging system at each point to be predicted, using Simple Kriging
(SK) with the quantile proportion as the expected value (e.g., for the 6th
decile, 0.6 of the values are expected to be 1’s)
• Note! this is only true if the original sampling scheme was unbiased! If not,
also estimate the mean (use OK).
5. If necessary, limit the results to the range [0, 1]
6. The result may be interpreted as the probability that the point does not exceed
the threshold
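Steps 5 and 6 can be sketched in base R. The raw predictions below are hypothetical values, chosen to include results slightly outside [0, 1], which can occur because kriging weights may be negative.

```r
# Hypothetical raw indicator-kriging predictions at five points
p_below_raw <- c(-0.03, 0.12, 0.58, 0.97, 1.04)

p_below  <- pmin(pmax(p_below_raw, 0), 1)  # step 5: limit to [0, 1]
p_exceed <- 1 - p_below                    # probability of exceeding the threshold

cbind(p_below_raw, p_below, p_exceed)
```

Because the indicator is 1 at or below the threshold, the kriged value is the probability of not exceeding it; taking the complement gives the exceedance probability needed for decisions such as whether to clean up a site.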
Indicator kriging in gstat

> # convert to spatial object
> coordinates(mind) <- ~ x + y
> # compute the variogram for the 9th decile (90th percentile) indicator
> vq9 <- variogram(q9 ~ 1, mind)
> plot(vq9, plot.numbers=TRUE)
> # estimated model: partial sill 0.05, spherical, range 500, nugget 0.04
> mq9 <- vgm(0.05, "Sph", 500, 0.04)
> plot(vq9, plot.numbers=TRUE, model=mq9)
> mq9f <- fit.variogram(vq9, mq9)
> plot(vq9, plot.numbers=TRUE, model=mq9f)
> # erratic around sill, leads to very short range variogram
> # and a negative (unrealistic) fitted nugget
> mq9f
model psill range
1 Nug -0.006447182 0.0000
2 Sph 0.090135765 167.9867
> # krige this quantile; note expected proportion of 1’s is known
> k9 <- krige(q9 ~ 1, mind, meuse.grid, beta=0.9, model=mq9f)
[using simple kriging]
> levelplot(var1.pred ~ x + y, as.data.frame(k9), aspect="iso")
> # this is the probability of being *below* the cutoff of 8.26 mg kg-1
[Figure: four panels. (1) Indicator variogram (9th decile): semivariance vs. distance (0 to 1500), labelled with the number of point-pairs per bin; (2) Estimated model; (3) Fitted model; note unrealistic nugget; (4) Probability < 8.26 mg kg-1: map over x and y, legend scale from 0 to 1]
Summary: Advantages of IK
• Makes no assumption about the theoretical distribution of the data values,
yet still gives realistic probability estimates
• Outlier-resistant: outliers cannot arbitrarily increase the estimate or
prediction variance of an indicator; as data values they only affect one quantile
• Simple Kriging is used at each quantile, which improves the estimate.
Summary: Disadvantages of IK
• Variograms may be difficult to model, especially at the highest and lowest
quantiles (few pairs with different 0/1 values)