Downloaded 06/26/14 to 134.153.184.170. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
Covariance Analysis
for
Seismic Signal Processing
Edited by
R. Lynn Kirlin
William J. Done
Series Editor
Stephen J. Hill
Covariance analysis for seismic signal processing / edited by R. Lynn Kirlin and William J. Done.
p. cm. (Geophysical development series; v. 8)
Includes bibliographical references and index.
ISBN 1-56080-081-X (vol.). ISBN 0-931830-41-9 (series)
1. Seismic prospecting. 2. Signal Processing. 3. Analysis of covariance.
I. Kirlin, R. Lynn. II. Done, William J.
III. Series.
TN269.8.C68 1998
622.1592 dc21
98-8792
CIP
ISBN 978-0-931830-41-9 (Series)
ISBN 978-1-56080-081-1 (Volume)
Society of Exploration Geophysicists
P.O. Box 702740
Tulsa, OK 74170-2740
© 1999 Society of Exploration Geophysicists
All rights reserved. This book or parts hereof may not be reproduced in any form without written
permission from the publisher.
Published 1999
Reprinted 2009
Printed in the United States of America.
Contents

1   Introduction ...................................................... 1
    R. Lynn Kirlin

2   Data Vectors and Covariance Matrices
    2.1   Analysis Regions ........................................... 6
    2.2   Data Windows ............................................... 7
    2.3   Data Vectors ............................................... 8
    2.4   Sample Data Covariance Matrix .............................. 9
    2.5   Rationale for Sample Covariance Analysis .................. 10
    2.6   Statistics of the Sample Covariance Matrix ................ 11
    2.7   Robust Estimation of Sample Covariance Matrices ........... 13
    2.8   References ................................................ 17

3   Eigenstructure, the Karhunen-Loève Transform, and
    Singular-Value Decomposition (sections 3.1-3.8)

4   (sections 4.1-4.6)

6   6.1   Introduction ............................................. 83
    6.2   Multiple Wavefront Model ................................. 83
    6.3   Frequency Focusing and Spatial Smoothing ................. 87
    6.4   Discussion ............................................... 92
    6.5   Comparison of MUSIC with Semblance ....................... 93
    6.6   Keys Algorithm ........................................... 95
    6.7   A Subspace Semblance Coefficient ......................... 98
    6.8   Multiple Sidelobe Canceler .............................. 101
    6.9   Summary of Coherence Detection and Velocity Estimation .. 105
    6.10  References .............................................. 107

7   (sections 7.1-7.8)

8   (sections 8.1-8.6)

9   (sections 9.1-9.8)

10  (sections 10.1-10.6)

11  11.1  Introduction ............................................ 227
    11.2  A Brief Mathematical Description ........................ 228
    11.3

12  12.1  Introduction ............................................ 241
    12.2  Theory .................................................. 242
          12.2.1  Eigenimages and the KL Transformation ........... 245
          12.2.2  Eigenimages and the Fourier Transform ........... 250
          12.2.3  Computing the Filtered Image .................... 251
    12.3  Applications ............................................ 252
          12.3.1  Signal to Noise Enhancement ..................... 252
          12.3.2  Wavefield Decomposition ......................... 256
                  12.3.2.1  Event identification ................. 257
                  12.3.2.2  Vertical Seismic Profiling ........... 263
          12.3.3  Residual Static Correction ...................... 265
    12.4  Discussion .............................................. 268
    12.5  References .............................................. 272

13  13.1  Introduction ............................................ 275
    13.2  Time Windows in Polarization Analysis ................... 275
    13.3  The Triaxial Covariance Matrix .......................... 276
    13.4  Principal Components Transforms by SVD .................. 278
    13.5  Analysis of the Results of SVD .......................... 283
    13.6  Summary ................................................. 287
    13.7  References .............................................. 289

14  14.1  Introduction ............................................ 291
    14.2  Single-Station Polarization Analysis .................... 292
    (sections 14.3-14.8)

15  15.1  Introduction ............................................ 323
    15.2  Background .............................................. 323
    15.3  Estimation of the Component Powers ...................... 325
    15.4  Results Using 0.1-0.2 Hz Geophysical Data at a
          Triaxial Array .......................................... 329
    15.5  Signal Model in the Case of One Rayleigh and One
          Love Wave ............................................... 330
    15.6  Application of the MUSIC Algorithm to the Array Data .... 337
    15.7  Conclusions ............................................. 339
    15.8  References .............................................. 339
William J. Done
6204 S. 69th Place
Tulsa, Oklahoma 74133
Other Contributors:
Sergio L. M. Freire
Petrobras - DEXBA/DEPEX
Salvador, Bahia - Brazil
Hui Liu
Department of Electrical Engineering
Portland State University
Portland, Oregon
Brian N. Fuller
Paulsson Geophysical Services, Inc.
7035 S. Spruce Dr. E.
Englewood, Colorado
I. M. Mason
ARCO Geophysical Imaging Laboratory
Department of Engineering Science
Oxford University
Parks Road, Oxford, U. K.
S. A. Greenhalgh
School of Earth Sciences
Flinders University of South Australia
Bedford Park, Adelaide, Australia
John Nabelek
College of Oceanic and Atmospheric
Sciences
Oregon State University
Ocean Admin. Bldg. 104
Corvallis, OR 97331
G. M. Jackson
Elf Geoscience Research Centre
114A Cromwell Road
London, U. K.
M. J. Rutty
School of Earth Sciences
Flinders University of South Australia
Bedford Park, Adelaide, Australia
Fu Li
Department of Electrical Engineering
Portland State University
Portland, Oregon
Mauricio D. Sacchi
Department of Geophysics and Astronomy
University of British Columbia
Vancouver, Canada
Guibiao Lin
College of Oceanic and Atmospheric Sciences
Oregon State University
Ocean Admin. Bldg. 104
Corvallis, OR 97331

Tadeusz J. Ulrych
Department of Geophysics and Astronomy
University of British Columbia
Vancouver, Canada
Acknowledgments
The editors are indebted to the contributing authors for their efforts and
patience during the preparation of the manuscript.
Our appreciation is owed to John Claassen, Sandia National Laboratories,
and Lonnie Ludeman, Dept. of Electrical & Computer Engineering, New
Mexico State University, for their review of the manuscript.
Kurt Marfurt, University of Houston (formerly with Amoco Tulsa Technology Center), provided valuable suggestions for improvements to and figures
for the first five chapters.
Maureen Denning, Dept. of Electrical and Computer Engineering, University of Victoria, prepared the first draft of several of Lynn Kirlin's chapters.
We also thank Julie Youngblood and Vicki Wilson, University of Houston
(formerly with Amoco Tulsa Technology Center) for their efforts in producing
the manuscript from the individual contributors' documents and its many
revisions.
Chapter 1
Introduction
R. Lynn Kirlin
This reference is intended to give the geophysical signal analyst sufficient
material to understand the usefulness of data covariance matrix analysis in the
processing of geophysical signals. A background of basic linear algebra,
statistics, and fundamental random signal analysis is assumed. This reference
is unique in that the data vector covariance matrix is used throughout. Rather
than dealing with only one seismic data processing problem and presenting
several methods, we will concentrate on only one fundamental
methodology, analysis of the sample covariance matrix, and we present
many seismic data problems to which the methodology applies.
This is very much like writing about seismic
applications of spectral or Fourier analysis. With Fourier analysis, the data are
represented in a domain other than the original, and each independent
estimate of frequency content contains a measure of information about the
source data. With covariance analysis, information from the data has been
compressed into the elements of the covariance matrix, and the structure of
the covariance matrix, if viewed properly, contains similar independent
measures of information about the data. The major difference is that the
Fourier transform gives a one-to-one mapping of the data and is therefore
invertible. The covariance matrix is a many-to-one mapping, and, when
appropriately applied, compresses the voluminous original data into a much
smaller amount, but still sufficient to adequately estimate the desired
unknown parameters within the data.
We will demonstrate the methodology of covariance matrix analysis and
relate the covariance matrix structure to the physical parameters of interest in
a number of seismic data analysis problems. In some cases, we will be able to
Ulrych et al. provide several demonstrations of the usefulness of singular-value decomposition for enhancing seismic components through the use of
eigenimages, or orthogonal images, that make up the raw 2-D seismic data
record. Their work is accompanied by thorough theoretical analyses.
Three chapters demonstrate the application of covariance subspace
analysis to three-component data (triaxial geophones). In Chapter 13,
Jackson et al. analyze three-component data at a single station, and, in
Chapter 14, Rutty and Greenhalgh extend the work to multiple stations.
From the covariance matrix eigenstructure, they produce signal-space
enhanced waveforms and test statistically for rectilinearity. Rayleigh and Love
waves in the 0.1-0.2 Hz range coincidentally arriving at triaxial arrays are
analyzed by Kirlin et al. in Chapter 15. This work separates the two waves by
estimating the joint covariance matrix of their components. Recent work from
other authors regarding the number of waves and parameters that can be
separated and estimated using vector sensor arrays is also included in
Chapter 15.
Thus covariance analysis of seismic data is seen to be of current interest to
many researchers and a method amenable to many distinct applications. We
are not attempting to provide an encyclopedia of these applications nor of the
theory and the literature that has developed to date. Instead, we wish to
provide a diverse sampling and a discussion of that work from a common
viewpoint.
Chapter 2
Data Vectors and Covariance Matrices
R. Lynn Kirlin
Seismic signals are sensed by geophones in land acquisition or by hydrophones in marine acquisition. Typically, the signals are excited in the earth
with some sort of energy source such as an explosion, vibrator, or air or water
gun. These signals travel through the subsurface structures and are reflected
from boundaries having distinct physical properties. Eventually they produce
multiple reflections observed at each recording phone. Often the geological
structure is not simple, and although simple models have sufficed for many
regions of exploration, the recorded reflections are interpreted well only if the
geophysicist has much experience and is familiar with other sources of information.
For much of the methodology, geological structure is assumed to be reasonably simple, such as horizontally layered strata. However, such simple
structures are not always required. Often it is only necessary that there be
knowledge that some specific temporal or spatial structure (coherence) is
present in the array of received signals in order to obtain some processing
advantage.
Now, here is some of the terminology inherent in the methods we will discuss. Analysis regions, data windows, data vectors, and covariance matrices are
shown in Figure 2.1 and are described in the following sections.
Figure 2.1. An analysis region spanning many traces and time samples; a moving analysis window within which spatio-temporal adaptive processing is done; and sample vector windows within the moving window are indicated.

2.1 Analysis Regions
Figure 2.2. Panels a) through e).

2.2 Data Windows
Within the analysis region are many data points, and around each point
the data may have features or parameters that are considered to have local stationarity. A window of data around this point may be analyzed separately to
provide the desired local parameter estimate or localized information. Such a
window may be said to be a running window or a sliding window. The running analysis window may be positioned at every point in the analysis region
or it may be positioned only at selected points. The alternative windows range
from maximally overlapping, where every point is a window center; to partially overlapping, where window centers are spaced closer
than the window breadth or length; to nonoverlapping, where consecutive
windows in either direction touch but do not overlap. Nontouching windows
are also possible, but these omit some data from the analysis.
When choosing window size and spacing, realize that smaller windows allow more spatial or temporal variability in the output that results from processing within the window, but provide fewer samples with which to estimate any parameters of interest. That is, small windows allow higher spatial or temporal frequency in the resulting parameter estimates, but give those estimators less statistical stability (fewer degrees of freedom).
2.3
Data Vectors
Within each data window, divide the data into vectors as shown in Figure 2.1. The elements within these vectors are data points taken from vector windows, which are subwindows of the running window; these vector windows may have any shape within the constraints of the running window size. Commonly, the vector is from a vector window 1 × M in size, covering either M time points from the same trace (down the trace), allowing only temporal analysis; M points taken from M traces at the same time (time slice, snapshot, or "across traces"), allowing only spatial analysis; or points along and parallel to a prescribed space-time curve, allowing constrained space-time analysis. Other vector windows are possible, such as every kth point, which would allow an M-length vector to span Mk points (subsampling). The vector window may also be two dimensional, such as 2 × M, resulting in a length-2M vector.
In any case, the vector window is moved over all possible positions within
the data window, gathering a total of L sample vectors of data. Maximally
overlapped vector windows are usually taken. The assignment of data points
from within the vector elements is arbitrary as long as it is known, but it is
usually a logical ordering such as from lesser time to greater time and from
lesser offset to greater offset and must be consistent from vector to vector. For
example, a 2 × 5 vector window surrounding points of data x(i,j), i = 10 to 11 and j = -2 to 2, where i is the time index and j is the trace index, similar to that shown in Figure 2.1, would become the 2M = 10 by 1 vector

$$x = \left[x(10,-2),\, x(11,-2),\, x(10,-1),\, x(11,-1),\, \ldots,\, x(10,2),\, x(11,2)\right]^T.$$
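The stacking of a vector window into a data vector can be sketched numerically; the following is an illustration only (the array shape, seed, and random data are assumed, not from the text), using column-major ordering so that the time index varies fastest, as described above:

```python
import numpy as np

def vector_from_window(data, i0, j0, n_t=2, n_x=5):
    # Stack an n_t-by-n_x vector window of data[i, j] (i = time, j = trace)
    # into a length n_t*n_x vector, ordered lesser time to greater time
    # within lesser offset to greater offset.
    block = data[i0:i0 + n_t, j0:j0 + n_x]
    return block.flatten(order="F")  # column-major: time index varies fastest

rng = np.random.default_rng(0)
data = rng.standard_normal((20, 8))   # hypothetical: 20 time samples by 8 traces
x = vector_from_window(data, i0=10, j0=2)
print(x.shape)   # a length-10 vector, as in the 2-by-5 example
```

Sliding this subwindow over all positions in the data window yields the L sample vectors gathered below.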
2.4 Sample Data Covariance Matrix
Within the window, and if the mean of x is 0, the averaged sum of vector outer products x_i x_i^T, i = 1, 2, ..., L, gives the sample covariance matrix. It is the sample covariance matrix that we will be analyzing in most of the remainder of the book. However, the vectors will not always have come from a time-space window as discussed above. In such a case the distinction will be obvious.
In the foregoing, the vectors have been taken from 2-D time-trace data
sets. Data vectors can come from anywhere. Another common source of seismic data vectors is two- or three-component geophones (see Chapters 1315).
In this situation, a vector x might contain just three elements that are the
three-component data samples at the one point in time-space only, as if there
were a data vector window of size one by one in time/space. However, the vector may be extended to contain 3n elementsthe samples from n geophones.
In any case, a collection of L such vectors from within a data window
(there may be just one analysis window that spans the entire analysis region)
may be averaged in the outer product to give the sample covariance matrix Cx:
$$C_x = \frac{1}{L}\sum_{i=1}^{L} x_i x_i^H. \qquad (2.1)$$
Again we have assumed that the data vectors are zero mean. When the vector mean is not zero, the mean must first be subtracted from x before forming
the outer products. Because, in practice, seismic data are zero mean, we generally have no need to estimate or remove any mean. However, some recording
systems, such as those for well logging, occasionally have trouble with dc bias.
Some processing systems include a debiasing routine.
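Equation (2.1) can be sketched directly; the trace data, window length, and seed below are illustrative assumptions, with maximally overlapped length-M vector windows taken down a single trace:

```python
import numpy as np

def sample_covariance(X):
    # Equation (2.1): C_x = (1/L) * sum_i x_i x_i^H,
    # assuming zero-mean data vectors as the columns of X.
    M, L = X.shape
    return (X @ X.conj().T) / L

rng = np.random.default_rng(1)
trace = rng.standard_normal(200)      # hypothetical single trace
M = 8
# maximally overlapped length-M vector windows down the trace
X = np.stack([trace[k:k + M] for k in range(len(trace) - M + 1)], axis=1)
Cx = sample_covariance(X)
print(Cx.shape)   # an M-by-M sample covariance matrix
```

Note that Cx is Hermitian and positive semidefinite by construction, the properties exploited throughout the following chapters.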
2.5 Rationale for Sample Covariance Analysis
All the methodology in the remainder of this volume is based on the sample covariance matrix in equation (2.1). The sample covariance matrix arises
from many areas of science, engineering, and statistics: it is needed in multivariate data analysis, pattern recognition, least-squares problems, hypothesis
testing, parameter estimation, etc.
For example, if we draw one such length-M vector x from an N(m, R) distribution (Gaussian multivariate vectors with mean m and covariance R), its a priori probability density function is (Eaton, 1983)

$$f(x) = (2\pi)^{-M/2}\, |R|^{-1/2} \exp\{-(x - m)^T R^{-1} (x - m)/2\}, \qquad (2.2)$$

and the joint density of L independent such vectors is

$$f(X) = (2\pi)^{-LM/2}\, |R|^{-L/2} \exp\left\{-\frac{1}{2}\sum_{i=1}^{L} (x_i - m)^T R^{-1} (x_i - m)\right\}, \qquad (2.3)$$
Downloaded 06/26/14 to 134.153.184.170. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
where X = (x_1, x_2, ..., x_L). It may be shown that the maximum likelihood estimate of m is

$$\hat{m} = \frac{1}{L}\sum_{i=1}^{L} x_i,$$

the sample mean. Further, $\hat{m}$ is distributed $N(m, L^{-1}R)$, and $\frac{1}{L}R$ is the covariance of the errors in estimating m.
When m is known, the maximum likelihood estimate of R is C_x, the sample covariance matrix. This estimate of R and the above estimate of m are also appropriate when the vectors x have complex Gaussian elements and we define, for zero-mean x,

$$R = E\{x x^H\}, \qquad (2.4)$$

where $(\cdot)^H$ indicates complex conjugate transpose.

2.6 Statistics of the Sample Covariance Matrix

For zero-mean complex Gaussian vectors, the joint density of X = (x_1, ..., x_L) is

$$f(X) = \pi^{-LM}\, |R|^{-L} \exp\left\{-\sum_{i=1}^{L} x_i^H R^{-1} x_i\right\}. \qquad (2.5)$$
If C_x is a sample complex covariance matrix, then LC_x has its distinct real and imaginary elements Re{A_pq} and Im{A_pq} distributed with a complex Wishart density (Eaton, 1983). For

$$A = L C_x = \sum_{i=1}^{L} x_i x_i^H,$$

this density is

$$f(A) = \frac{1}{h(R, L, M)}\, |A|^{L-M} \exp\{-\mathrm{Tr}(R^{-1} A)\}, \qquad (2.6)$$

where

$$h(R, L, M) = \pi^{M(M-1)/2}\, \Gamma(L) \cdots \Gamma(L - M + 1)\, |R|^{L},$$

and Tr[·] indicates trace, the sum of diagonal elements.
When x is zero-mean with real elements, the distinct M(M + 1)/2 elements of S = LC_x jointly have the Wishart probability density (Eaton, 1983; Goodman, 1963; Anderson, 1958):

$$f(S) = \frac{1}{W(L, M)}\, |R|^{-L/2}\, |S|^{(L-M-1)/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}[R^{-1} S]\right\}, \qquad (2.7)$$

where

$$W(L, M) = 2^{LM/2}\, \pi^{M(M-1)/4}\, \Gamma(L/2) \cdots \Gamma\!\left(\frac{L - M + 1}{2}\right).$$

The elements S_ik and S_ki are equal and therefore not distinct.
When the x_i have nonzero mean, then x_i in equation (2.5) is replaced with $x_i - \hat{m}$, and x_i in C_x of equation (2.1) is replaced with $x_i - \hat{m}$, where

$$\hat{m} = \frac{1}{L}\sum_{i=1}^{L} x_i.$$

For this case, where we have had to estimate an unknown mean, the Wishart densities are rewritten similar to equations (2.6) and (2.7), except that L is replaced by L - 1, the degrees of freedom of LC_x.
An iterative estimator of R, given C_x and the constraint that R must be Toeplitz (i.e., R = R^H and all elements of any one diagonal or off-diagonal are equal), is given by Burg et al. (1982), who show that, for the real-L-vector normal density of equation (2.3), the R matrix that maximizes its likelihood (ML) also maximizes the function

$$g(C_x, R) = -\log|R| - \mathrm{Tr}(R^{-1} C_x). \qquad (2.8)$$

Writing $R = (R_{ij})$, $i, j = 1, 2, \ldots, M$, the variation of $g(C_x, R)$ for Toeplitz-constrained (or any) variation $\delta R$ in R must satisfy

$$\delta g(C_x, R) = -\mathrm{Tr}[(R^{-1} - R^{-1} C_x R^{-1})\, \delta R] = 0. \qquad (2.9)$$
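The stationarity condition (2.9) implies that, without the Toeplitz constraint, the maximizer of g is R = C_x itself. A small numerical sketch of this fact follows; the matrix size, seed, and perturbations are illustrative assumptions:

```python
import numpy as np

def g(Cx, R):
    # Equation (2.8): g(Cx, R) = -log|R| - Tr(R^{-1} Cx).
    _, logdet = np.linalg.slogdet(R)
    return -logdet - np.trace(np.linalg.solve(R, Cx))

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 40))
Cx = A @ A.T / 40                      # a hypothetical sample covariance

# Unconstrained, the maximizer is R = Cx: random positive-definite
# perturbations of Cx never score higher.
best = g(Cx, Cx)
for _ in range(100):
    P = 0.1 * rng.standard_normal((4, 4))
    R = Cx + P @ P.T + 0.05 * np.eye(4)
    assert g(Cx, R) < best
print("g is maximized at R = Cx")
```

Under the Toeplitz constraint, however, R = C_x is generally infeasible, and Burg's iteration searches the constrained set for the stationary point of (2.9).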
2.7 Robust Estimation of Sample Covariance Matrices
Often the data vector x contains not only signal and Gaussian noise, but
also wild points. The wild points in seismic data arise from a number of
sources including dead geophones or hydrophones, noisy phones, poor phone
placement, local noise sources that cause one trace to include significantly different data from its neighbors, faults in other acquisition hardware, and temporally transient noise sources such as lightning-induced noise, earth tremors, etc. Even large-amplitude noise events can be wild-point in nature, especially when the data are sorted into a different order, e.g., CMP. Also, in marine data, interference occurs when other seismic vessels are shooting nearby.
When the set of sample vectors is large, one or two wild points in time or
space may not result in a serious difference between Cx and R, but a dead or
noisy trace is certainly going to result in a significant error in one row and column of Cx. Detectors of such errors and robust estimators are useful or necessary in such situations. The effects of such errors depend greatly on the
application of the covariance analysis.
We have already mentioned maximum likelihood (ML) estimation of
structured covariance matrices. That method assumes normal data, which is
true of the algorithms to be presented in the rest of this text as well, including
those of Chapter 8, where methods of enhancing noisy covariance matrices are
presented.
The problems of dead, missing, and noisy traces have already been dealt
with by the industry, resulting in various interpolation or editing schemes.
However, it is worthwhile to note some literature that particularly addresses
the covariance estimation problem.
Robust methods of parameter estimation in general were dealt with in the fundamental work of Huber (1964). The results of that work are summarized, along with those of several others, by Andrews et al. (1971). Robust estimators of scalar covariances are presented by Mosteller and Tukey (1977). The fundamental ideas in these references have to do with trimming extreme points adaptively. Quite often the median is used as the location estimator, and median absolute deviation (MAD) from the median is used as a spread estimator. Knowledge of these two adaptively computed measures allows wild points to be defined as those in excess of k MADs, where k is selected ad hoc, often 5 to 9. Other methods of nonlinearly weighting data have been proposed (Andrews et al., 1971; Mosteller and Tukey, 1977; Devlin et al., 1981). These often lead to iterative procedures, because after a location and a spread parameter are computed and wild points trimmed, the remaining data can be reexamined for location and spread, etc.
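The median/MAD trimming rule just described can be sketched as follows; the data, the choice k = 6, and the injected spike are illustrative assumptions only:

```python
import numpy as np

def trim_wild_points(x, k=6.0):
    # Flag samples more than k MADs from the median;
    # k is selected ad hoc, as in the text (often 5 to 9).
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    wild = np.abs(x - med) > k * mad
    return x[~wild], np.flatnonzero(wild)

rng = np.random.default_rng(3)
x = rng.standard_normal(500)
x[100] = 40.0   # one wild point, e.g., a lightning-induced spike
clean, flagged = trim_wild_points(x)
print(flagged)
```

Re-running the rule on the trimmed data gives the iterative procedure mentioned above: location and spread are recomputed and further wild points, if any, are removed.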
Many such robust methods have been proposed for estimating covariance matrices. Nine of these were tested under various types of noise by Devlin et al. (1981). Of the nine methods, the raw covariance C_x is best only when the data are essentially free of wild points. A robust alternative weights each sample vector, giving the weighted mean

$$\bar{x} = \sum_{i=1}^{L} w_i x_i \Big/ \sum_{i=1}^{L} w_i \qquad (2.10)$$

and the weighted covariance estimate

$$\hat{C} = \sum_{i=1}^{L} w_i^2 (x_i - \bar{x})(x_i - \bar{x})^T \Big/ \left(\sum_{i=1}^{L} w_i^2 - 1\right), \qquad (2.11)$$

where

$$w_i = w(d_i) = \varphi(d_i)/d_i, \qquad (2.12)$$

$$d_i = \left((x_i - \bar{x})^T \hat{R}^{-1} (x_i - \bar{x})\right)^{1/2},$$

and

$$\varphi(d) = \begin{cases} d, & d \le d_0 \\ d_0 \exp\left\{-\tfrac{1}{2}(d - d_0)^2 / b_2^2\right\}, & d > d_0 \end{cases}, \qquad d_0 = \sqrt{M} + b_1/\sqrt{2}.$$
Expression (2.12) simply weights the ith vector's outer product with unity for small deviations from the mean, but with less than unity for greater deviations. Note that if R is diagonal and has no zeros on the diagonal, d² is a χ² random variable with M - 1 degrees of freedom; M is the vector length, and one degree of freedom is removed for estimating the mean vector. This degrees-of-freedom count holds for general (nondiagonal) R as well.
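The weight function of equations (2.10)-(2.12) can be sketched as below. The tuning constants b1 = 2 and b2 = 1.25, the identity R, and the random data are assumptions for illustration, not values from the text:

```python
import numpy as np

def campbell_weights(X, R_inv, xbar, b1=2.0, b2=1.25):
    # Weights w_i = phi(d_i)/d_i of equation (2.12), from the Mahalanobis
    # distances d_i; b1 and b2 are assumed tuning constants.
    M = X.shape[0]
    d0 = np.sqrt(M) + b1 / np.sqrt(2.0)
    diffs = X - xbar[:, None]
    d = np.sqrt(np.einsum('ij,ij->j', diffs, R_inv @ diffs))
    phi = np.where(d <= d0, d, d0 * np.exp(-0.5 * (d - d0) ** 2 / b2 ** 2))
    return phi / d

rng = np.random.default_rng(4)
X = rng.standard_normal((3, 100))      # hypothetical triaxial samples
X[:, 0] += 25.0                        # one wild vector
w = campbell_weights(X, np.eye(3), X.mean(axis=1))
print(w[0], np.median(w))              # near zero vs. unity
```

Vectors with small deviations receive unit weight, while the wild vector is effectively removed; iterating (2.10)-(2.12) with the reweighted mean and covariance gives the robust estimate.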
Campbell (1980) also proposes a robust principal component analysis. The eigenstructure can either be calculated from the robust covariance matrix of the above procedure, or the weights can be determined through the means and variances of the principal components of the x_i. This will be detailed in Section 3.7 of the next chapter.
2.8 References
Chapter 3
Eigenstructure, the Karhunen-Loève
Transform, and Singular-Value Decomposition
R. Lynn Kirlin
An M × M covariance matrix R exhibits many special properties. For example, it is complex Hermitian, equal to its conjugate transpose, R^H = R; and it is positive semidefinite, x^H R x ≥ 0. Because of the latter, its eigenvalues are greater than or equal to zero as well. In many cases, it is also Toeplitz, R_{i,j} = R_{i+m,j+m}; that is, elements along any diagonal are equal. In this chapter, I will review some of the more important properties of covariance matrices and their eigenstructure, and discuss some simple applications.
3.1

Let the eigenvalues λ and eigenvectors v of R_x satisfy

$$R_x v = \lambda v; \qquad (3.1)$$
then, for the largest eigenvalue λ and associated v satisfying equation (3.1),

$$\hat{x} = (v^T x)\, v \qquad (3.2)$$

gives minimum $E\{|x - \hat{x}|^2\}$. Note that the scale factor on v is v^T x. When the L vectors x_k are drawn from an infinite set, the sample covariance C_x replaces R_x.

There are M eigenvalues and M associated eigenvectors that satisfy equation (3.1). Throughout the rest of this text we will assume that the eigenvectors are ordered such that

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_M. \qquad (3.3)$$
3.2

The eigendecomposition of the sample covariance matrix is

$$C_x = \sum_{i=1}^{M} \lambda_i v_i v_i^H = V \Lambda V^H, \qquad (3.4)$$

where $V = (v_1 \cdots v_M)$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$.
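The factorization (3.4) and the ordering convention (3.3) can be verified numerically; the matrix below is an illustrative sample covariance built from assumed random data:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 200))
Cx = A @ A.T / 200                 # hypothetical sample covariance

# Eigendecomposition Cx = V Lambda V^H of equation (3.4), reordered to
# the descending convention of equation (3.3).
lam, V = np.linalg.eigh(Cx)        # eigh returns ascending order
lam, V = lam[::-1], V[:, ::-1]     # reorder to lambda_1 >= ... >= lambda_M

assert np.allclose(V @ np.diag(lam) @ V.T, Cx)
assert np.all(lam[:-1] >= lam[1:])
print(lam)
```

Because Cx here is real symmetric and positive semidefinite, all the eigenvalues are nonnegative, as asserted above for covariance matrices.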
3.3

We have seen that the sample covariance matrix is factorable into its eigenstructure form; that is,

$$C_x = V \Lambda V^H, \qquad (3.5)$$
(3.6)
(3.7)
$$G = (V_1\; V_2)\begin{pmatrix}\Sigma_1^2 & 0\\ 0 & \Sigma_2^2\end{pmatrix}\begin{pmatrix}V_1^H\\ V_2^H\end{pmatrix}, \qquad (3.8)$$

$$\begin{pmatrix}V_1^H\\ V_2^H\end{pmatrix} X^H \left[X (V_1\; V_2)\right] = \begin{pmatrix}\Sigma_1^2 & 0\\ 0 & \Sigma_2^2\end{pmatrix}. \qquad (3.9)$$
$$X (V_1\; V_2) = (U_1\; U_2)\begin{pmatrix}\Sigma_1 & 0\\ 0 & \Sigma_2\end{pmatrix}, \qquad (3.10)$$

where

$$\begin{pmatrix}U_1^H\\ U_2^H\end{pmatrix}(U_1\; U_2) = \begin{pmatrix}I & 0\\ 0 & I\end{pmatrix}.$$

Let $V = (V_1\; V_2)$, $U = (U_1\; U_2)$, and $\Sigma = \begin{pmatrix}\Sigma_1 & 0\\ 0 & \Sigma_2\end{pmatrix}$; then, by postmultiplying equation (3.10) by V^H, X is found to be

$$X = (U_1\; U_2)\begin{pmatrix}\Sigma_1 & 0\\ 0 & \Sigma_2\end{pmatrix}\begin{pmatrix}V_1^H\\ V_2^H\end{pmatrix} = U \Sigma V^H \qquad (3.11a)$$

$$= U_1 \Sigma_1 V_1^H + U_2 \Sigma_2 V_2^H \qquad (3.11b)$$

$$= \sum_{i=1}^{r} \sigma_i u_i v_i^H + \sum_{i=r+1}^{p} \sigma_i u_i v_i^H. \qquad (3.11c)$$
$$X X^H = U_1 \Sigma_1^2 U_1^H = \sum_{i=1}^{r} \sigma_i^2 u_i u_i^H \qquad (3.12)$$

and

$$X^H X = V_1 \Sigma_1^2 V_1^H = \sum_{j=1}^{r} \sigma_j^2 v_j v_j^H. \qquad (3.13)$$
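Relations (3.12) and (3.13) tie the SVD of X to the eigenstructure of the two outer products, and are easy to check numerically; the matrix below is an assumed random M × p example:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((6, 4))        # hypothetical M x p data matrix

U, s, Vh = np.linalg.svd(X, full_matrices=False)

# Equation (3.12): XX^H = U1 Sigma1^2 U1^H
assert np.allclose(X @ X.T, U @ np.diag(s ** 2) @ U.T)
# Equation (3.13): X^H X = V1 Sigma1^2 V1^H
assert np.allclose(X.T @ X, Vh.T @ np.diag(s ** 2) @ Vh)
print(s)
```

This is why the singular vectors of a data matrix and the eigenvectors of its sample covariance matrix carry the same subspace information: XX^H differs from C_x only by the factor 1/L.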
3.3.1
(3.14)
One use of SVD is that it allows any of the columns x_i in X to be written as a linear combination of the singular vectors u_k of U. Thus,

$$x_i = \sum_{k=1}^{p} \alpha_k u_k = \sum_{k=1}^{p} (u_k u_k^H)\, x_i = U U^H x_i, \qquad (3.15)$$

where $\alpha_k = u_k^H x_i$. The transformation T = U^H on any x_i constitutes the Karhunen-Loève Transform (KLT), and the vector U^H x_i contains the principal components of x_i. For random vectors, U is found from $E\{x x^H\} = U \Lambda U^H$.
$$X_r = \sum_{i=1}^{r} \sigma_i u_i v_i^H = U_r \Sigma_r V_r^H, \qquad (3.16)$$

and the squared error of this rank-r approximation is

$$\epsilon_r^2 = \mathrm{Tr}[\bar{\Sigma}_r^H \bar{\Sigma}_r] = \sum_{i=r+1}^{p} \sigma_i^2. \qquad (3.17)$$
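The truncation of (3.16) and the error expression (3.17) can be sketched together; the matrix size, rank r = 2, and random data below are illustrative assumptions:

```python
import numpy as np

def rank_r_approx(X, r):
    # Equation (3.16): X_r = sum_{i=1}^{r} sigma_i u_i v_i^H.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :], s

rng = np.random.default_rng(7)
X = rng.standard_normal((8, 5))
Xr, s = rank_r_approx(X, 2)

# Equation (3.17): the squared (Frobenius) error of the rank-r
# approximation equals the sum of the discarded squared singular values.
err2 = np.linalg.norm(X - Xr, 'fro') ** 2
assert np.isclose(err2, np.sum(s[2:] ** 2))
print(err2)
```

Keeping only the largest singular values therefore discards the least possible energy, which is the basis of the eigenimage filtering applications in Chapter 12.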
3.3.2

Clearly, for p < M there are vectors y of length M that are not linear combinations (LC) of the p independent vectors x_i in X. Such vectors lie in the null space of X; they are not LC of either the p vectors x_i or of the p vectors in U_1, the submatrix of U = (U_1 U_2) associated with the p nonzero singular values of X. Rather, they are LC of the M - p vectors in U_2.

A general vector y of length M has components both in the null space of X and in the range of X, the range being defined as all vectors that are LC of the columns of X:
$$y = y_x + y_n, \qquad (3.18)$$

where

$$y_x = [X (X^T X)^{-1} X^T]\, y = P^{\#} y, \qquad (3.19)$$

$$y_n = [I - X (X^T X)^{-1} X^T]\, y = (I - P^{\#})\, y, \qquad (3.20)$$

$$P^{\#} = U_1 U_1^H = X (X^T X)^{-1} X^T, \qquad (3.21)$$

and

$$I - P^{\#} = U_2 U_2^T, \qquad (3.22)$$

$$X^{\#} = (X^T X)^{-1} X^T = V \begin{pmatrix}\Sigma_1^{-1} & 0\\ 0 & 0\end{pmatrix} U^T. \qquad (3.23)$$
These are convenient notations for determining the least-squares fit of the columns of X to a general vector y. That is, what p coefficients α = (α_1, α_2, ..., α_p)^T give the best least-squares fit ŷ_x = Xα to y? The result is

α̂ = X# y,  (3.24)

ŷ = X X# y = P# y,  (3.25)
giving
(3.26)
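The projector and pseudoinverse relations can be checked numerically. The following sketch (with a made-up X and y) forms X# and P# and verifies the least-squares equations (3.24)-(3.25):

```python
import numpy as np

rng = np.random.default_rng(1)
M, p = 10, 3
X = rng.standard_normal((M, p))        # full column rank, real-valued here
y = rng.standard_normal(M)

# Pseudoinverse X# = (X^H X)^{-1} X^H and projector P# = X X#.
X_pinv = np.linalg.inv(X.T @ X) @ X.T
alpha = X_pinv @ y                     # alpha-hat = X# y   (3.24)
y_hat = X @ alpha                      # y-hat = P# y       (3.25)

# The residual is orthogonal to the range of X.
assert np.allclose(X.T @ (y - y_hat), 0)
# Same projector from the SVD: P# = U1 U1^H.
U, s, Vh = np.linalg.svd(X, full_matrices=False)
assert np.allclose(U @ U.T @ y, y_hat)
```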
3.4
A Seismic Example
Suppose a region of data is ideally flattened to a prescribed velocity corresponding to the only reflection present, i.e., the exact delays have been removed from each trace. Then, ignoring wavelet stretch, each trace x_i is now identically a vector s except for additive noise and interference. That is, suppose

x_i = s + n_i,  i = 1, 2, ..., p,  (3.27)
(3.28)
where S = (s, s, ..., s) and N = (n_1, n_2, ..., n_p) are both M × p. Note S = s(1 1 ... 1), so that SS^H = p s s^H, which is clearly a rank-one matrix. Now the statistical mean of the cross terms 2 Re{S N^H} is zero, while SS^H = p s s^H and E{N N^H} = p σ_n² I, where σ_n² is the variance of the noise on each trace. Thus, as p increases,

XX^H → p (s s^H + σ_n² I) = p R.  (3.29)
(3.30)
3.5
A Second Example
In the second example, vectors are taken across traces, and we assume there are M traces of length p, so that the vectors are of length M. As before, we assume that the traces have been flattened to some true event, so that each vector time slice is the signal sample s(t_i) repeated across the M traces, plus noise.
XX^H = (1 ··· 1) D_s D_s^H (1 ··· 1)^H + N N^H + cross terms,  (3.31)

where D_s = diag(s(t_1), s(t_2), ..., s(t_p)), possibly complex valued, and the ith column of N is (n_1(t_i), n_2(t_i), ..., n_M(t_i))^T, for i = 1, 2, ..., p. By arguments similar to those in Section 3.4, we see that with large p, p^{-1} XX^H (M × M) approaches (E_s 1̄ 1̄^T + σ_n² I) = R, where 1̄ = (1/√M)(1, 1, ..., 1)^T, and the eigenvalues of R are as before, except that there are only M of them, i.e., λ_1 = E_s + σ_n², λ_i = σ_n², i = 2, ..., M. However, the major eigenvector is now v_1 = 1̄ of length M, whereas with the choice of vectors x_i = ith trace, as in Section 3.4, v_1 ∝ s.
Now note that an SVD would have found both eigenstructures. Define X as in Section 3.4, but find the SVD

X = U Σ V^H,  (3.32)

where the columns of V are the eigenvectors of X^H X (p × p) and the columns of U are the eigenvectors of XX^H (M × M). Then u_1 ∝ s, v_1 ∝ 1̄ (the normalized ones vector), and σ_1 = (E_s + σ_n²)^{1/2}; these are the singular vector u_1 of X for the first example, the eigenvector v_1 of X^H X for the current example, and the first singular value σ_1 of the SVD of X. The eigenvalues of either XX^H or X^H X are λ_i = E_s + σ_n², σ_n², ..., σ_n², to either p or M values, respectively.
These two examples are basic to many of the algorithms presented elsewhere in this book. In general and with no noise, the singular vectors are LCs
of the signals down traces (first example), and the eigenvectors are LCs of
wavefront vectors across traces (second example). In the narrowband case,
the wavefront vectors equate to delay vectors whose elements are complex
phasor rotations. More will be said on this in Chapter 4.
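A minimal numerical sketch of these two examples, assuming a made-up flattened wavelet s and a noise level chosen for illustration: the first left singular vector tracks s, and the first right singular vector tracks the ones ("wavefront") direction:

```python
import numpy as np

rng = np.random.default_rng(2)
M, p = 64, 32
s = np.sin(2 * np.pi * np.arange(M) / 16)   # hypothetical flattened wavelet
N = 0.1 * rng.standard_normal((M, p))
X = np.outer(s, np.ones(p)) + N             # x_i = s + n_i, equation (3.27)

U, sing, Vh = np.linalg.svd(X, full_matrices=False)
u1, v1 = U[:, 0], Vh[0, :]

# u1 is (up to sign) close to the normalized wavelet s ...
assert abs(abs(u1 @ s) / np.linalg.norm(s) - 1) < 0.01
# ... and v1 is close to the normalized ones vector.
assert abs(abs(v1 @ np.ones(p)) / np.sqrt(p) - 1) < 0.01
```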
3.6
Often the noise-free portion of XHX or XXH has eigenvalues that are all
nonzero. In these situations, and when noise is present, it is still useful to use a
rank-reduced version of either X or R. So even though we are deleting some
signal energy by not including all of its components, we are excluding more
noise with each singular vector dimension or eigenvector dimension that is
not used. In effect, more bias in an estimate is being allowed in exchange for
reducing variance.
Many times this exchange can be made interactively. Sometimes the approximate signal dimensionality is known. In many cases, it is possible to know or to estimate statistically what the quantitative trade-off is. The following presentation is based on Scharf (1991, Chapter 9).
Suppose that the data matrix X, consisting of M traces of length LT, is the sum of a signal matrix S plus an independent, white, zero-mean noise matrix N. Then the SVD representation is

X = U Σ V^H,  (3.33)
(3.34)
(3.35)
giving

XX^H = SS^H + NN^H + cross terms.  (3.36)
Now the rank of SS^H is p, but XX^H may have singular values that do not clearly indicate this fact, because of the influence of the noise terms in XX^H and similarities among the signal traces. We now try to estimate S with a reduced-rank X. That is, we want to use
Ŝ_r = Σ_{i=1}^{r} u_i u_i^H X = U_r U_r^H X  (3.37)
as a rank-r (r ≤ p) estimate of S. If we let r = M, X is reproduced exactly, summing S with N. Otherwise, there is a bias in Ŝ_r, an estimate of which is

b_r = Ŝ_p − Ŝ_r = Σ_{i=r+1}^{p} u_i u_i^H X.  (3.38)
The principal components may be ordered by energy,

|û_(1)^H x_i|² ≥ |û_(2)^H x_i|² ≥ ··· ≥ |û_(p)^H x_i|²,  (3.39)

where we assume the û_k are good approximations of the u_k. This assumption, together with the assumption that all noise is Gaussian, leads to an optimum r for each x_i (Scharf, 1991). The optimum r* for x_i is the r that minimizes the estimated mse,

mse = b_r^H b_r + (2r − p) σ_n²,  (3.40)

where

b_r = (u_{r+1} u_{r+1}^H + ··· + u_p u_p^H) x_i.  (3.41)
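The rank-reduction estimate of equation (3.37) can be sketched as follows; the synthetic rank-2 signal and the noise level are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(3)
M, p = 32, 6
# Hypothetical rank-2 signal: two wavelets spread across 6 traces.
S = (np.outer(np.sin(2 * np.pi * np.arange(M) / 8), rng.standard_normal(p))
     + np.outer(np.cos(2 * np.pi * np.arange(M) / 5), rng.standard_normal(p)))
X = S + 0.3 * rng.standard_normal((M, p))

U, s, Vh = np.linalg.svd(X, full_matrices=False)

def rank_r_estimate(r):
    Ur = U[:, :r]
    return Ur @ Ur.conj().T @ X        # S-hat_r = U_r U_r^H X   (3.37)

# Truncating to the signal rank removes most of the noise energy;
# keeping all p components (r = p) keeps all of it.
err2 = np.linalg.norm(S - rank_r_estimate(2))
err6 = np.linalg.norm(S - rank_r_estimate(6))
assert err2 < err6
```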
3.7
At the end of Chapter 2 we alluded to a robust covariance matrix estimator that was used to estimate the eigenstructure: the Campbell method (Campbell, 1980).

Recall first that the normalized eigenvector v_1 of C_x, associated with the largest eigenvalue λ_1, is such that y_m = v_1^H x_m has maximum sample variance. The eigenstructure may be taken from C_x or from a robust version of C_x obtained as in Section 2.7. In that approach, however, the weights on the data vectors x_i were functions of a Mahalanobis distance d_i that used the iterated robust mean vector x̄ and robust covariance estimate.
The iterative process can be modified to give weights on x_i that are functions of y_m = v_i^H x_m. Because the process is iterative, in each iteration the minimum of the current and previous weight measures is retained to ensure convergence. In the following, estimates of the eigenvectors v_i are denoted u_i, and estimates of the matrix V are denoted U.
The proposed procedure is as follows:
1) As an initial estimate of u1, take the first eigenvector from an eigenanalysis of V.
2) Form the principal-component scores y_m = u_1^T x_m.
3) Determine the M-estimates of mean and variance of the y_m and the associated weights w_m. The median and [0.74 × (interquartile range)]² of the y_m can be used to provide initial robust estimates. Here 0.74 ≈ (2 × 0.675)^{-1}, and 0.675 is the 75% point of the N(0, 1) distribution. This initial choice ensures that the proportion of observations downweighted is kept reasonably small.
After the first iteration, take the weights wm as the minimum of
the weights for the current and previous iterations; this prevents
oscillation of the solution.
4) Calculate x and V as in steps 1 and 2 using the weights wm for step 3.
5) Determine the first eigenvalue and eigenvector u1 of V.
6) Repeat steps 2 to 5 until successive estimates of the eigenvalue are sufficiently close. To determine successive directions ui, 2 i, project the
data onto the space orthogonal to that spanned by the previous eigenvectors, u1, ..., ui1, and repeat steps 2 to 5; as the initial estimate, take
the second eigenvector from the last iteration for the previous eigenvector. The proposed procedure for successive directions can be set out
as follows.
7) Form x_im = (I − U_{i−1} U_{i−1}^T) x_m, where U_{i−1} = (u_1, ..., u_{i−1}).
8) Repeat steps 2 to 5 with x_im replacing x_m, and determine the first eigenvector u.
9) The principal-component scores are given by u^T x_im = u^T (I − U_{i−1} U_{i−1}^T) x_m, and hence u_i = (I − U_{i−1} U_{i−1}^T) u.
Repeat steps 7, 8, and 9 until all eigenvalues and eigenvectors ui, together
with the associated weights, are determined. Alternatively, the procedure may
be terminated after some specified proportion of variation is explained.
Finally, an alternative robust estimate of the covariance or correlation matrix can be found from U E U^T. Both this approach and that described in the previous section give a positive-definite correlation/covariance matrix. Robust estimation of each entry separately does not always achieve this.
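The reweighting idea behind the procedure can be sketched for the first eigenvector. This is a simplified illustration only: the redescending weight rule (c/d)² and the cutoff c below are assumptions of the sketch, not Campbell's exact M-estimator weight function.

```python
import numpy as np

def robust_first_eigenvector(X, c=2.0, iters=20):
    # X is (n_samples, n_features). Downweight samples whose
    # principal-component score is an outlier (steps 2-6 in spirit).
    w = np.ones(len(X))
    u1 = None
    for _ in range(iters):
        mu = (w[:, None] * X).sum(0) / w.sum()
        C = (w[:, None] * (X - mu)).T @ (X - mu) / w.sum()
        u1 = np.linalg.eigh(C)[1][:, -1]      # current first eigenvector
        y = (X - mu) @ u1                     # principal-component scores
        # robust scale: 0.74 * interquartile range (step 3)
        scale = 0.74 * (np.quantile(y, 0.75) - np.quantile(y, 0.25))
        d = np.abs(y - np.median(y)) / (scale + 1e-12)
        # keep the minimum of current and previous weights (step 3)
        w = np.minimum(w, np.where(d <= c, 1.0, (c / d) ** 2))
    return u1

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.5])
X[:5] += 40.0                                 # a few gross outliers
u1 = robust_first_eigenvector(X)
# The robust direction aligns with the true dominant axis e1.
assert abs(u1[0]) > 0.95
```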
3.8
References
Chapter 4
Vector Subspaces
R. Lynn Kirlin
Over the past decade, much research has been devoted to the understanding and application of what has come to be known as signal subspace and
noise subspace processing. This methodology is based on the linear statistical
model for vector data. All data vectors are linear combinations of their signal
and noise components. Given such vectors of length M, a vector space CM
may be spanned by any M independent, length M complex vectors. In many
situations the spanning vectors may be partitioned or chosen such that r vectors are adequate to span the set of all possible signal vectors, the signal subspace. The remaining M r vectors lie in the noise subspace. The two
subspaces are orthogonal, meaning that any signal subspace vector has zero
inner product with any noise subspace vector.
The data covariance matrix is used to estimate the two subspaces. When
the estimation is good, for example when S/N is sufficiently high and sample
size sufficiently large, then n r dimensions of noise power can be removed
effectively from the data, allowing processing to proceed with higher S/N
data. This results in better parameter estimations, decisions, or interpretations.
The ability to separate signal and noise subspaces rests not only on S/N
and sample size, but also on a priori knowledge of the linear statistical model.
In the following, I will define the linear statistical model, explain the mathematics of subspaces, and give some examples of interest.
4.1
The linear statistical model assumes that the mean vector m of data x is a
linear combination of r vectors which comprise the columns of H. Thus
x = H θ + w,  (4.1)
If x is real and Gaussian, the joint density of L independent samples x_i is

f(x) = (2π)^{−ML/2} |R|^{−L/2} exp[ −(1/2) Σ_{i=1}^{L} (x_i − m)^T R^{-1} (x_i − m) ];  (4.2)

and if x is complex,

f(x) = π^{−ML} |R|^{−L} exp[ −Σ_{i=1}^{L} (x_i − m)^H R^{-1} (x_i − m) ],  (4.3)

where m = H θ. The above are duplicates of equations (2.3) and (2.5), and R is the data covariance matrix.
The reader is referred to Scharf (1991) for specific techniques of either detection of m ≠ 0, where 0 is the M × 1 zero vector, or estimation of m, H, or θ under various assumptions, knowns, and unknowns.
Often the exact density of x is not known; nevertheless, the sample covariance matrix Cx of x, given the linear statistical model, carries a good deal of
information. (See Chapter 2 for the statistics of Cx when x is Gaussian.) When
the L samples of x are arranged into the columns of X, the sample covariance
matrix can be written
C_x = X X^H / L  (4.4)

and

R = E{C_x} = H E{θ θ^H} H^H + N,  (4.5)

because C_x = (H θ θ^H H^H + H θ w^H + w θ^H H^H + w w^H)/L, where E{w w^H} = N.
4.1.1
4.2
Assume now that at each sample time a vector time slice or snapshot is
taken across M sensors. The linear statistical model becomes
x(t) = A s(t) + n(t),  (4.6)
independent. Thus the rank of A P_s A^H is r, where P_s = E{s s^H} is the source covariance matrix. Further, the data covariance matrix is

R = A P_s A^H + N,  (4.7)

which, for spatially white noise, becomes

R = A P_s A^H + σ_n² I.  (4.8)
4.2.1
(Note that, as used above, the term noise subspace is not strictly correct, because the noise has equal power σ_n² in all dimensions, including the signal subspace. It is more appropriately termed the orthogonal subspace, meaning orthogonal to the signal subspace.)
Thus, R may be rewritten

R = Σ_{i=1}^{r} λ_i v_i v_i^H + σ_n² Σ_{i=r+1}^{M} v_i v_i^H  (4.9a)

= V_s Λ_s V_s^H + V_n Λ_n V_n^H  (4.9b)

= (V_s V_n) [Λ_s 0; 0 Λ_n] (V_s V_n)^H  (4.9c)

= V Λ V^H,  (4.9d)
where the eigenvalue matrix Λ = diag(λ_1, ..., λ_M) has been partitioned to give Λ_s and Λ_n, diagonal eigenvalue matrices of size r and M − r, respectively, and the eigenvector matrix V has been partitioned into signal-subspace eigenvectors V_s and noise-subspace eigenvectors V_n.
The above four eigenstructure properties are explained as follows. First, we note that As has r degrees of freedom; therefore A E{s s^H} A^H has rank r. Further, A P_s A^H must have r positive eigenvalues, the last M − r equaling zero. Next, we observe that if μ is an eigenvalue of A P_s A^H, then μ + σ_n² is an eigenvalue of A P_s A^H + σ_n² I; because if v is the eigenvector associated with μ, then

R v = (A P_s A^H + σ_n² I) v = μ v + σ_n² v = (μ + σ_n²) v.
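The eigenvalue shift just derived is easy to confirm numerically; the source frequencies and powers below are arbitrary choices for the sketch:

```python
import numpy as np

# r = 2 sources at M = 6 sensors; check the eigenvalue structure of
# R = A Ps A^H + sigma2 I: M - r eigenvalues equal sigma2, r exceed it.
M, sigma2 = 6, 0.5
w = np.array([0.7, 1.9])                       # assumed source frequencies
A = np.exp(1j * np.outer(np.arange(M), w))     # M x 2 steering matrix
Ps = np.diag([2.0, 1.0])                       # assumed source powers
R = A @ Ps @ A.conj().T + sigma2 * np.eye(M)

lam = np.sort(np.linalg.eigvalsh(R))
assert np.allclose(lam[:4], sigma2)            # M - r noise eigenvalues
assert np.all(lam[4:] > sigma2 + 1.0)          # r shifted signal eigenvalues
```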
4.2.2

The eigenvalues of R are

λ_i = μ_i + σ_n², i = 1, ..., r, and λ_i = σ_n², i = r + 1, ..., M,

where the μ_i are the nonzero eigenvalues of A P_s A^H, so that

Σ_{i=1}^{M} λ_i = Σ_{k=1}^{r} μ_k + M σ_n².

The source covariance may be recovered from the signal-subspace eigenstructure as

P_s = (A^H A)^{-1} A^H V_s (Λ_s − σ_n² I) V_s^H A (A^H A)^{-1}.  (4.10)
(4.11)
For large sample size N, the eigenvector estimates are asymptotically unbiased,

E{v̂_i} = v_i,  (4.12)

the eigenvalue estimates are asymptotically uncorrelated,

cov{λ̂_i, λ̂_j} = (1/N) λ_i λ_j δ_ij,  (4.13)

and

cov{v̂_i, v̂_j} = (1/N) Σ_{k=1}^{M} Σ_{l=1}^{M} [λ_k λ_l (δ_kl δ_ij − δ_ki δ_lj) / ((λ_i − λ_k)(λ_j − λ_l))] v_k v_l^H,  (4.14)

where terms with k = i or l = j are excluded from the sums.
4.2.3

Let

P_s = Σ_{i=1}^{r} v_i v_i^H  (4.15)

be the projection operator that projects a vector x onto the signal subspace. Similarly, let

P_B = Σ_{i=r+1}^{M} v_i v_i^H = I − P_s  (4.16)

be the projection onto the orthogonal (noise) subspace. Writing the estimated eigenvectors as

v̂_i = v_i + Δv_i,  (4.17)

with, to first order,

P_s Δv_i ≈ 0,  (4.18)

the estimation error lies essentially in the orthogonal subspace:

Δv_i ≈ P_B v̂_i.  (4.19)
To first order,

Δv_i ≈ P_B [ (1/(KQ)) Σ_{q=1}^{Q} Σ_{k=1}^{K} B_k(q) (Y_k(q) + B_k(q))^H ] v_i / λ_i  (4.20a)

= P_B B̄_{B,X} v_i / λ_i,  (4.20b)
where q is the usual time index on the subarray snapshots x_k(q) of length m, and k is the index on the subarrays used for spatial smoothing (see Chapter 8), giving K = M − m + 1. B_k(q) is the noise component of x_k(q), and Y_k(q) = A_k s(q), as in equation (4.6), except that we have indexed time samples and subarrays. With K = 1, equation (4.20a) indicates no spatial smoothing, and x(q) has length M. A_k and B_k contain appropriate transformation matrices to yield coherence of the signals in subarray k with the signals at the reference subarray (see Chapter 8). Thus, equations (4.20a) and (4.20b) give the noise-subspace (B) component of v̂_i as a function of the additive noise B_k(q), the signal components in x_k(q), and the true ith eigenvector and eigenvalue.

The formula is the result of projecting onto the noise subspace the finite average in time (q) and space (k) of all the noise vectors, each weighted by its associated data-vector's component in the direction of v_i, normalized by λ_i.
The factor v_i/λ_i at the end of equations (4.20a) and (4.20b) can be replaced by R_y^{||} v_i, where

R_y^{||} = Σ_{j=1}^{r} λ_j^{-1} v_j v_j^H.
It is easy to see that any true solution vector, which of course lies in the signal subspace, is therefore a linear combination of the v_i. Thus, any true solution vector a has noise-space components given by

Δa ≈ P_B B̄_{B,X} R_y^{||} a.  (4.21)
The covariance of this error component of a is derived in Clergeot
et al. (1989) from this expression, and error variances on the parameters of
interest in a (such as rms velocity or bearing or frequency) follow, but are
dependent upon the specific algorithm, source correlation, S/N, and relative
source locations. Similar analyses applied to the velocity estimation are used
by Li and Liu in Chapter 7.
4.3
x(i) = H s(i) + n(i),  (4.22)

where H is the 10 × 2 matrix with (m, k)th element e^{j m w_k}, m = 0, 1, ..., 9, and s(i) = (s_1 e^{jw_1 i}, s_2 e^{jw_2 i})^T.
The covariance is then

R_x = H [σ_{s1}² 0; 0 σ_{s2}²] H^H + σ_n² I  (4.23a)

= H P_s H^H + σ_n² I.  (4.23b)
This covariance matrix will have two eigenvalues greater than σ_n² and eight equal to σ_n². The two eigenvectors associated with the larger eigenvalues span the same signal subspace as do h_1 and h_2, the columns of H. Both columns of H are orthogonal to the eigenvectors associated with λ_3 = ··· = λ_10 = σ_n². We note that in no case will either signal-subspace eigenvector equal either h_1 or h_2; each will always be some combination of both h_1 and h_2. However, with only one signal, the rank-one case, v_1 = h_1/√10 and λ_1 = 10 σ_{s1}² + σ_n².
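A sketch of this first example, with the two frequencies and the noise power chosen arbitrarily for illustration:

```python
import numpy as np

# Two unit-power complex sinusoids in an M = 10 sample window, plus
# white noise of power sigma2 (values are assumptions of the example).
M, sigma2 = 10, 0.1
w1, w2 = 0.5, 1.3
h1 = np.exp(1j * w1 * np.arange(M))
h2 = np.exp(1j * w2 * np.arange(M))
H = np.column_stack([h1, h2])

Rx = H @ H.conj().T + sigma2 * np.eye(M)     # equation (4.23b), Ps = I
lam = np.sort(np.linalg.eigvalsh(Rx))[::-1]

assert np.allclose(lam[2:], sigma2)   # eight eigenvalues equal sigma_n^2
assert np.all(lam[:2] > sigma2)       # two exceed it

# With one signal only: lambda_1 = 10*sigma_s1^2 + sigma_n^2.
R1 = np.outer(h1, h1.conj()) + sigma2 * np.eye(M)
assert np.isclose(np.linalg.eigvalsh(R1).max(), 10 + sigma2)
```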
In the second example, I intend to estimate the directions of two independent, narrowband, analytic sources at bearing angles θ_1 and θ_2 and at infinite distance. The equivalent problem is estimation of two reflections' slownesses. The plane waves arrive at M equispaced sensors. The relative delays of signal k appear in the H-matrix as the elements e^{j u_k m τ_k}, where τ_k = Δ sin θ_k / c, u_1 and u_2 are the radian frequencies of the two sources, Δ is the sensor spacing, and c is the propagation velocity.
x(i) = H s(i) + n(i),  (4.24)

where now H is the M × 2 matrix with (m, k)th element e^{j u_k m τ_k}, m = 0, 1, ..., M − 1, and s(i) = (s_1(i), s_2(i))^T.
R_x in this case is identical to that for the first example, wherein the noise is spatially white and stationary. The eigenstructure is identical to that of R_x in the first example if both P_s and σ_n² are unchanged and if w_1 = u_1 τ_1 and w_2 = u_2 τ_2. For this reason a normalized frequency f ≤ 0.5 is often used for both problems (2πf = w or u_k τ_k). The parameter of interest, w_i or τ_i, is then extracted from the solution values of f.
4.4

The two-way traveltime to sensor m is

T_m = [T_0² + (mΔ/V)²]^{1/2},  (4.25)

where Δ is the sensor spacing, V is the wavefront's rms velocity, and T_0 is the zero-offset, two-way traveltime. Thus, the relative delay at sensor m in reference to sensor zero (m = 0) is

τ_m = [T_0² + (mΔ/V)²]^{1/2} − T_0  (4.26)

≈ (mΔ)² / (2 T_0 V²).  (4.27)
The relative delays τ_mk from the kth wavefront replace m τ_k in the elements of H in equation (4.24), to the extent that the signals can be considered narrowband (Kirlin, 1991).
For more broadband signals, there will be more than one plane wave per
source. The eigenstructure properties discussed for the examples above still
hold under certain restrictions. Basically, the wavefronts to be analyzed must
be fairly flat and mostly encompassed by the window of analysis. However,
because an unflattened wavefront has nonzero relative delays and the seismic
wavefronts are broadband, each frequency component would have its unique
phase rotation at each sensor. Thus, straightforward application of the above
will yield eigenstructures with an unclear demarcation of the signal subspace,
because there will be more than one larger eigenvalue per complex wavefront present.
If the reflections are band-pass filtered to create a more narrowband signal,
the model is better matched, but signal energy has been lost. Methods of combining Fourier coefficients are used in broadband extensions. We will discuss
these methods and the above problems of model unsuitability in Chapter 6.
4.5
Nonwhite Noise
Let the noise covariance have the eigendecomposition

N = V Λ V^H,  (4.28)

so that the noise may be written

n = V w,  (4.29)

with

E{n n^H} = V E{w w^H} V^H = V Λ V^H.  (4.30)

More generally, any U satisfying

U^H N U = D (diagonal)  (4.31)

gives

N^{-1} = U D^{-1} U^H.  (4.32)

The transformation D^{-1/2} U^H applied to n orthogonalizes and whitens n. If the columns of U are normalized, then the transformed noise is also stationary (equal power in all dimensions), and the elements of D become identical.
Transform the data with T:

z = T x = T(A s + w) = T A s + T w.  (4.33)

The covariance matrix of z is

R_z = T A P_s A^H T^H + σ_n² I.  (4.34)

The eigenvectors u of R_z satisfy R_z u = T R_x T^H u = λ u, so that T^H T R_x T^H u = λ T^H u; with T^H T = N^{-1}, this is the generalized eigenproblem

R_x (T^H u) = λ N (T^H u).  (4.35)
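The whitening transformation can be sketched as follows; the noise covariance N here is randomly generated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
M = 5
# A hypothetical nonwhite noise covariance N (symmetric positive definite).
B = rng.standard_normal((M, M))
N = B @ B.T + 0.1 * np.eye(M)

# Eigendecomposition N = U D U^H; then T = D^{-1/2} U^H whitens the noise.
D, U = np.linalg.eigh(N)
T = np.diag(D ** -0.5) @ U.T

# Transformed noise covariance T N T^H is the identity ...
assert np.allclose(T @ N @ T.T, np.eye(M))
# ... and T^H T = N^{-1}, as used in the eigenproblem above.
assert np.allclose(T.T @ T, np.linalg.inv(N))
```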
4.6
References
Clergeot, H., Tressens, S., and Ouamri, A., 1989, Performance of high-resolution frequency estimation methods compared to the Cramer-Rao bounds: IEEE Trans. Acoust., Speech, and Sig. Proc., 37, 1703-1720.
Kirlin, R. L., 1991, A note on the effects of narrowband and stationary signal model assumptions on the covariance matrix of sensor array data vectors: IEEE Trans. Signal Processing, 503-506.
Pillai, S. U., 1989, Array signal processing: Springer-Verlag, Inc.
Scharf, L. L., 1991, Statistical signal processing: Addison-Wesley Publ. Co.
Chapter 5
Temporal and Spatial Spectral Analysis
R. Lynn Kirlin
Spectral analysis is a broad topic. Most scientists and engineers who deal with signals are quite comfortable with the concepts of time-frequency relationships. They are familiar with the Fourier transform and the common theorems such as Parseval's, convolution, delay, etc. The fast Fourier transform, or FFT, is widely used and understood. The FFT gives a value of the Fourier transform at all integer multiples of the reciprocal of the record length T. This spacing in frequency is also the resolution. The Nyquist frequency, or folding frequency, is half the sampling frequency. FFT coefficients between the folding frequency f_s/2 and the sampling frequency f_s are identical to those between −f_s/2 and 0, due to the periodicity of the coefficients in the frequency domain.
We assume the reader is also familiar with the z-transform, which is essentially the Fourier transform's equivalent for equispaced samples of either temporal or spatial signals. The classical reference by Oppenheim and Schafer (1975) discusses the z-transform, the discrete Fourier transform (DFT), and the fast Fourier transform (FFT).
Much information can be gained from the frequency-domain representation of signals. Just as the FFT can apply to temporal signal samples, it may equally apply to spatial signal samples. Quite commonly, two-dimensional (2-D) FFTs are applied to 2-D seismic data, where one dimension is temporal and the other is spatial. Some drawbacks of the FFT spectrum are: (1) it usually gives more data than is necessary, (2) it is not suitable for transient signals of short duration (few samples), (3) it does not directly give estimates of the few parameters which often determine precisely the statistics of the time sequence, (4) the FFT frequencies are almost never the same as those of real sinusoids that may be present in the data, (5) it does not result in a polynomial-ratio form, and (6) the spectral resolution is limited to T^{-1}.
Although I have not given the deserved equal time to the virtues of Fourier analysis, the above gives sufficient reason for seeking alternative approaches to signal analysis. I will suggest alternatives to the FFT for temporal, spatial, or spatio-temporal signal analysis.

In this chapter, I will relate a discrete signal's power or energy spectrum, or simply spectrum, to both its discrete autocorrelation function and the autocovariance or covariance matrix of sample vectors from the discrete sequence. I will also explain the relationship between the eigenstructure of the sequence's covariance matrix and spectral values. However, the chief purpose of this chapter is to demonstrate a number of high-resolution algorithms and show their commonality.
5.1
r_x(τ) = ∫ S_x(f) e^{j2πfτ} df  (5.1)

and

S_x(f) = ∫ r_x(τ) e^{−j2πfτ} dτ.  (5.2)

For sampled data, the corresponding z-transform pair is

r_x(n) = (1/2πj) ∮_c S_x(z) z^{n−1} dz  (5.3)

and

S_x(z) = Σ_{k=−∞}^{∞} r_x(k) z^{−k},  (5.4)

where c is a closed contour within the region of convergence of S_x(z).
5.1.1

Consider the eigenrelation

R_x v = λ v,  (5.5)

where v = [v_1, v_2, ..., v_M]^T. Now consider the ith element of λv on the right-hand side of equation (5.5). It is the inner product of the ith row of R_x and v:

Σ_{k=1}^{M} r(i, k) v_k = λ v_i,  (5.6)

where r(i, k) = R_x(i, k), the (i, k)th element of R_x. More precisely, for stationary processes r(i, k) = r_x(k − i) = r_x(i − k). We may consider v_j = v(j) to be a sequence in time with j = 0 corresponding to the time origin. Clearly, equation (5.6) indicates a convolution operation:

Σ_{k=1}^{M} r(i − k) v(k) = λ v(i).  (5.7)
(5.8)
(5.9)
From this it is clear that, in the limit as the covariance matrix incorporates values of r_x(k) for all k,

λ → S_x(z), z = e^{jwΔ},  (5.10)

where Δ is the sampling interval. Thus, λ is the spectral value of S_x(z) at the frequency w. What then can we deduce with regard to the sequence v(i)?
According to equations (5.8) and (5.10),

Σ_k r(i − k) v(k) = S_x(e^{jw}) v(i),  (5.11)

or

Σ_k r(k) v(i − k) = S_x(e^{jw}) v(i).  (5.12)

Trying v(i) = e^{jwi} gives

Σ_k r(k) v(i − k) = Σ_k r(k) e^{−jwk} e^{jwi} = S_x(e^{jw}) e^{jwi}.  (5.13)
Thus, in the limit as M → ∞, λ → S_x(e^{jw}); v_i, the ith element of the eigenvector v, approaches e^{jwi}; and v approaches the complex sinusoid at radian frequency w.
Thus, we expect that eigenstructure carries information relevant to the
spectrum of a process. We may also infer without proof that a finite number of
5.1.2
It can be shown that any rational spectrum of a sampled signal can be adequately represented as an all-pole spectrum. A hint that this should be so is given by noting that a finite z-plane zero factor (the polynomial 1 − az^{−1} has a zero at z = a) can be expanded into an all-pole function:

1 − az^{−1} = 1 / (1 + az^{−1} + a²z^{−2} + ···).

Continue then with the assumption that the process at hand may be considered to be that produced at the output of an all-pole filter G(z), driven by white noise w(k) with variance σ_w². Thus a finite-length sequence x(k) from the output of this filter would have the z-transform

X(z) = W(z) / Π_{i=1}^{M} (1 − p_i z^{−1}) = G(z) W(z).  (5.14)
For such a process it may easily be shown that an Mth-order linear predictor will optimally predict x(k) from x(k − i), i = 1, 2, ..., M, with minimum mean-squared error (mmse). That is, the mmse prediction is

x̂(k) = Σ_{i=1}^{M} a_i x(k − i),  (5.15)

where the a_i are the solutions to the Yule-Walker equations (Haykin, 1991, Chapter 2),

R_x a = r,  (5.16)
The generating and predicting filters are shown in Figure 5.1. The error in the prediction is ε(k) = x(k) − x̂(k). From the diagram of Figure 5.1, or from the above equation, we can easily deduce that the spectrum of ε(k) is

S_ε(z) = S_x(z) |1 − H(z)|²,  (5.17)

where

H(z) = Σ_{i=1}^{M} a_i z^{−i}.  (5.18)

When the predictor is optimum, ε(k) is the white driving process, so that

S_x(z) = σ_w² / |1 − H(z)|²,  (5.19)

with

σ_w² = r_x(0) − Σ_{i=1}^{M} a_i r_x(i),  (5.20)
where R_x is M × M, a = [a_1, a_2, ..., a_M]^T, and r = [r_x(1), r_x(2), ..., r_x(M)]^T.
Figure 5.1. White noise w(k), all-pole filter model G(z), data x(k), linear predictor H(z), white error process ε(k).
These relations can be collected into the augmented Yule-Walker equations

R_x^+ [1; −a] = [σ_w²; 0],  (5.21)

where R_x^+ is the (M + 1) × (M + 1) augmented covariance matrix.
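As an illustration of the Yule-Walker solution (5.16), the following sketch fits an order-2 predictor to a synthetic AR(2) process; the coefficients and sample size are assumptions of the example:

```python
import numpy as np

# Solve R_x a = r (5.16) for an order-2 predictor of a synthetic AR(2).
rng = np.random.default_rng(6)
a_true = np.array([1.2, -0.8])        # assumed (stable) AR coefficients
n = 100_000
x = np.zeros(n)
w = rng.standard_normal(n)
for k in range(2, n):
    x[k] = a_true[0] * x[k - 1] + a_true[1] * x[k - 2] + w[k]

def acov(x, lag):                      # biased autocovariance estimate
    return np.mean(x[:len(x) - lag] * x[lag:]) if lag else np.mean(x * x)

Rx = np.array([[acov(x, 0), acov(x, 1)],
               [acov(x, 1), acov(x, 0)]])
r = np.array([acov(x, 1), acov(x, 2)])
a_hat = np.linalg.solve(Rx, r)         # predictor coefficients
assert np.allclose(a_hat, a_true, atol=0.05)
```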
5.1.3
Sx(z) as a Function of Rx
In the remainder of this chapter, I will describe a number of the high-resolution estimators, each of which has much in common with the linear-prediction spectral estimate in equation (5.17). Before continuing, write S_x(w) as a function of R_x, the covariance matrix. First, normalize the sampling frequency to unity; then let
e = [1 e^{jw} e^{j2w} ... e^{jMw}]^T,  (5.22)

b = [1 −a_1 −a_2 ... −a_M]^T,  (5.23)

and

1_1 = [1 0 0 ... 0]^T.  (5.24)
Then

σ_w² / S_x(z) = |1 − H(z)|² = |e^H b|²,

and, from equation (5.21), b = σ_w² (R_x^+)^{-1} 1_1, so that

S_x(z) = σ_w^{-2} [e^H (R_x^+)^{-1} 1_1 1_1^T (R_x^+)^{-1} e]^{-1}.  (5.25)
We shall see in the following that equation (5.25) is one specific formulation of one of two more general forms of high-resolution spectral estimators.
The distinctions among the specific estimators usually are due to the specific criterion, such as mmse prediction, as we have just seen, but are sometimes due to technique, either in solving for solution frequencies (or DOAs)
or in estimating the covariance matrix. Many methods specifically address the
problem of a finite number of sinusoids (or a finite number of plane-wave
arrivals). Yet all make use of the covariance matrix, and most make use of its
eigenstructure.
5.2
In the foregoing we saw that the linear predictor can produce a spectral estimator Ŝ_x(w) by varying the frequency in e = e(w) in equation (5.25), where e is composed of elements of the form exp(jkw), k = 0, 1, ..., M. In this section I will use P(f) rather than S_x(f) for the high-resolution estimates, because most of them are not actually spectral estimates but rather frequency or DOA estimators.
58
The exact covariance matrix is

R_x = E{x x^H},  (5.26)

estimated by the sample covariance

C_x = (1/N) Σ_{i=1}^{N} x_i x_i^H.  (5.27)

The eigendecomposition of R_x is

R_x = V Λ V^H = Σ_{i=1}^{M} λ_i v_i v_i^H,  (5.28)

where V is a matrix whose columns are the eigenvectors v_i, and Λ is a diagonal matrix with elements λ_i.
(5.29)
5.2.1 Minimum Variance

P_MV(f) = (e^H R_x^{-1} e)^{-1} = [e^H ( Σ_{i=1}^{M} λ_i^{-1} v_i v_i^H ) e]^{-1} = [Σ_{i=1}^{M} λ_i^{-1} |e^H v_i|²]^{-1}.  (5.30)
5.2.2 MUSIC

The MUSIC estimator exploits the orthogonality of the steering vector to the noise subspace,

e^H(f) v_i = 0, i = D + 1, ..., M,

when f is chosen correctly. Thus,

P_MU(f) = [Σ_{i=D+1}^{M} |e^H v_i|²]^{-1}.  (5.31)

In root form,

P_MU(z) = [e^H(z) V_N V_N^H e(z)]^{-1}, with e^H(z) V_N V_N^H e(z) = k Π_{i=1}^{M−1} (1 − r_i z^{−1})(1 − r_i* z),  (5.32)

where V_N contains the noise-subspace eigenvectors, analogous to U_2 in Section 3.3, except that here we assume the null-space eigenvectors are taken from the exact covariance matrix rather than from XX^H.
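A minimal MUSIC sketch built on the exact covariance of two complex sinusoids in white noise; the frequencies, powers, and search grid are all assumptions of the example:

```python
import numpy as np

# MUSIC pseudospectrum, equation (5.31).
M, D, sigma2 = 10, 2, 0.5
w_true = np.array([0.6, 1.4])
A = np.exp(1j * np.outer(np.arange(M), w_true))
Rx = A @ A.conj().T + sigma2 * np.eye(M)

lam, V = np.linalg.eigh(Rx)            # eigenvalues in ascending order
Vn = V[:, :M - D]                      # noise-subspace eigenvectors

def P_music(w):
    e = np.exp(1j * w * np.arange(M))
    return 1.0 / np.sum(np.abs(e.conj() @ Vn) ** 2)

grid = np.linspace(0.05, 3.1, 3000)
P = np.array([P_music(w) for w in grid])
# The pseudospectrum peaks sharply at the true frequencies.
for w0 in w_true:
    i = int(np.argmin(np.abs(grid - w0)))
    assert P[i] > 100 * np.median(P)
```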
5.2.3 Eigenvalue

P_EV(f) = [Σ_{i=D+1}^{M} λ_i^{-1} |e^H v_i|²]^{-1}.  (5.33)

5.2.4
P_EMV(f) = [Σ_{i=1}^{M} q_i |e^H v_i|²]^{-1}  (5.34a)

= (e^H V Q V^H e)^{-1},  (5.34b)

where

Q = diag(q_i),  q_i = (ν λ_i)^{−2} for 1 ≤ i ≤ D, and q_i = σ^{−4} for D + 1 ≤ i ≤ M;  (5.35)
ν is a real, positive parameter to be selected, and σ² is the assumed independent noise variance at each sensor.

The division of the summation over the eigenstructure into 1 ≤ i ≤ D and D + 1 ≤ i ≤ M reflects the assumed knowledge either that there are D narrowband signals present in the data x or that the data may be represented by a reduced-rank covariance. The other dimensions are due mostly, if not completely, to noise.
5.2.5 Maximum Entropy

P_ME(f) = |1_1^H ( Σ_{i=1}^{M} λ_i^{-1} v_i v_i^H ) e|^{-2}.  (5.36)

The maximum-entropy spectrum maximizes the entropy rate

H_x = ∫_{−1/2}^{1/2} ln P(f) df  (5.37)

subject to the constraints that P(f) must match the known autocorrelation lags r_x(m) (= R_x(m) for zero-mean x) for 0 ≤ m ≤ M − 1. The solution leads to the augmented Yule-Walker equation (5.21), except in equation (5.21) we had size M + 1 instead of M for the size of the covariance, having derived an order-M predictor.
In most of the above estimators, various scale factors in the numerator have been omitted. In equation (5.36), for example, the usual whitened signal power $(\mathbf{u}_1^H\mathbf{R}_x^{-1}\mathbf{u}_1)^{-1}$ is not shown in the numerator. Any constraint for this scalar might be used, such as the whitened MMSE just discussed; alternatively, a total power equal to unity can be used. When simply detecting sinusoids or estimating their frequencies or target (emitter) bearing angles, a scale factor is unimportant and only the location or relative size of spectral peaks is of interest. When comparing estimators, their spectral maxima are often set to unity.
Reconsidering the maximum entropy spectrum, we note that it might also be written using $\gamma_i = \lambda_i^{-1}v_{1i}$:
$$ P_{ME}(f) = \left|\sum_{i=1}^{M}\gamma_i\,\mathbf{v}_i^H\mathbf{e}\right|^{-2}. \tag{5.38} $$
Expanding the magnitude squared,
$$ P_{ME}(f) = \left[\sum_{i=1}^{M}\sum_{k=1}^{M} q_{ik}\,\mathbf{e}^H\mathbf{v}_i\mathbf{v}_k^H\mathbf{e}\right]^{-1} \tag{5.39a} $$
$$ \phantom{P_{ME}(f)} = \left(\mathbf{e}^H\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{e}\right)^{-1}, \tag{5.39b} $$
where
$$ \mathbf{Q} = \boldsymbol{\Lambda}^{-1}\mathbf{V}^H\mathbf{u}_1\mathbf{u}_1^H\mathbf{V}\boldsymbol{\Lambda}^{-1}, \qquad q_{ik} = (\lambda_i\lambda_k)^{-1}v_{1i}^{*}v_{1k}, \tag{5.40} $$
and $v_{1i}$ is the first element of $\mathbf{v}_i$. For unspecified $q_{ik}$, the form of equation (5.39b) is more general than that of equation (5.34b) and incorporates all the other spectra. If all $q_{ik} = 0$ when $i \ne k$, equation (5.39b) degenerates to the general form of $P_{EMV}$ in equations (5.34a)-(5.34b) with unspecified diagonal $q_i$.
5.2.6
Minimum Norm
$$ P_{MN}(f) = \left[\sum_{i=D+1}^{M}\sum_{k=D+1}^{M} q_{ik}\,\mathbf{e}^H\mathbf{v}_i\mathbf{v}_k^H\mathbf{e}\right]^{-1} \tag{5.41a} $$
$$ \phantom{P_{MN}(f)} = \left(\mathbf{e}^H\mathbf{V}_N\mathbf{V}_N^H\mathbf{u}_1\mathbf{u}_1^H\mathbf{V}_N\mathbf{V}_N^H\mathbf{e}\right)^{-1} = \left|\mathbf{c}^H\mathbf{e}\right|^{-2}, \quad \mathbf{c} = \mathbf{V}_N\mathbf{V}_N^H\mathbf{u}_1. \tag{5.41b} $$
5.2.7
$$ r_x(m) = \int_{-1/2}^{1/2} P(f)\,e^{\,j2\pi fm}\,df, \quad 0 \le m \le M-1. \tag{5.42} $$
Matching all of these lags is undesirable because, given an estimate of $\mathbf{R}_x$ from a short data record, it is well known that estimates of $r_x(m)$ for larger values of $m$ are quite poor. Also, values of $r_x(m)$ may be near or at zero-crossings of $r_x(\tau)$, in which case their normalized estimates diminish uniformly with their index, have error variances inversely related to the number of vector samples, and are proportional to the square of their true value (see Chapter 3). However, as the index increases, eventually either computation or sample-size errors will cause smaller eigenvalues to have erroneous inverses and will give considerable MSE in spectral estimates based on $\mathbf{R}_x^{-1}$. Lastly, positive projections ensure a positive-definite estimate.
Thus consider maximizing either
$$ J = \int_{-1/2}^{1/2}\ln P(f)\,df - \sum_{i=1}^{M} q_i\left[\int_{-1/2}^{1/2} P(f)\left|\mathbf{e}^H\mathbf{v}_i\right|^2 df - C_i\right] \tag{5.43a} $$
or
$$ J = \int_{-1/2}^{1/2}\ln P(f)\,df - \sum_{i=1}^{M} q_i\left[\int_{-1/2}^{1/2} P(f)\,\mathbf{e}^H\mathbf{v}_i\,df - C_i\right], \tag{5.43b} $$
the first subject to the power constraints
$$ \int_{-1/2}^{1/2} P(f)\left|\mathbf{e}^H\mathbf{v}_i\right|^2 df = C_i, \quad 1 \le i \le M, \tag{5.44} $$
where, for example, $C_i$ may be the eigenvalue $\lambda_i$. Taking the variation of $J$ in equation (5.43a) with respect to $P(f)$ easily gives
$$ P(f) = \left[\sum_{i=1}^{M} q_i\left|\mathbf{e}^H\mathbf{v}_i\right|^2\right]^{-1} \tag{5.45a} $$
$$ \phantom{P(f)} = \left(\mathbf{e}^H\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{e}\right)^{-1}, \quad \mathbf{Q} = \mathrm{diag}(q_i). \tag{5.45b} $$
This is a satisfying result in that it predicts the form of a large subset of the estimators given by equations (5.43a) and (5.43b). Simply by choosing the $C_i$ according to some criteria, we can produce more such estimates.
In order to satisfy the constraints of equation (5.44), substitute $P(f)$ from equations (5.45a)-(5.45b) to yield
$$ \int_{-1/2}^{1/2}\frac{\left|\mathbf{e}^H\mathbf{v}_i\right|^2}{\sum_{j=1}^{M} q_j\left|\mathbf{e}^H\mathbf{v}_j\right|^2}\,df = C_i, \quad 1 \le i \le M. \tag{5.46} $$
Multiplying the $i$th constraint by $q_i$ and summing over $i$ gives
$$ \int_{-1/2}^{1/2}\frac{\sum_{i=1}^{M} q_i\left|\mathbf{e}^H\mathbf{v}_i\right|^2}{\sum_{j=1}^{M} q_j\left|\mathbf{e}^H\mathbf{v}_j\right|^2}\,df = \sum_{i=1}^{M} q_i C_i = \mathbf{q}^T\mathbf{C} = 1, \tag{5.47} $$
where $\mathbf{q} = [q_1, q_2, \ldots, q_M]^T$ and $\mathbf{C} = [C_1, C_2, \ldots, C_M]^T$.
Because only the sampled covariance is available, we are usually uncertain of the eigenstructure. Thus, we let the constraint values $C_i$ be random variables with $E\{C_i\} = \bar{C}_i$ and covariance matrix $\mathbf{K}_c$. Because the $C_i$ are random variables, we choose the $q_i$ to minimize the variance of $\mathbf{q}^T\mathbf{C}$ while enforcing the expectation of the constraint in equation (5.47); that is,
$$ E\{\mathbf{q}^T\mathbf{C}\} = \mathbf{q}^T\bar{\mathbf{C}} = 1. \tag{5.48} $$
The resulting estimator is
$$ P_{NME}(f) = \bar{\mathbf{C}}^T\mathbf{K}_c^{-1}\bar{\mathbf{C}}\left(\mathbf{e}^H\mathbf{V}\mathbf{D}_q\mathbf{V}^H\mathbf{e}\right)^{-1}, \tag{5.49} $$
where
$$ \mathbf{D}_q = \mathrm{diag}(q_1, q_2, \ldots, q_M) \tag{5.50} $$
and
$$ \mathbf{q} = \mathbf{K}_c^{-1}\bar{\mathbf{C}}\,/\left(\bar{\mathbf{C}}^T\mathbf{K}_c^{-1}\bar{\mathbf{C}}\right). \tag{5.51} $$
Because equation (5.49) has been derived in terms of $\mathbf{C}$ and its variance, it applies to general uncertain constraints. Thus, for example, according to convention, we might split the signal and noise space into eigenvector sets, estimating $C_i = \hat\lambda_i = l_i + \hat\sigma^2$, $1 \le i \le D$, for the signal eigenvalues, and $C_i = \hat\sigma^2$, $D+1 \le i \le M$, for the noise eigenvalues. Alternatively, we might use $C_i = l_i$, $1 \le i \le D$, and $C_i = 0$, $D+1 \le i \le M$. Each of these uncertain $C_i$ has a respective variance that can be inserted into equation (5.49).
5.2.7.1 Example 1
Consider then the case where a pure spectral analysis is being attempted, having no knowledge of sensor noise, etc. Letting $C_i = \hat\lambda_i$, the eigenvalue estimates give the realization that the constraints are a bit uncertain. In fact, it is known (see Chapter 4) that for Gaussian data and distinct $\lambda_i$, the asymptotic mean and covariance of the $\hat\lambda_i$ are
$$ E\{\hat\lambda_i\} = \lambda_i, \tag{5.52a} $$
$$ E\{(\hat\lambda_i - \lambda_i)(\hat\lambda_j - \lambda_j)\} = \frac{\lambda_i\lambda_j}{N}\,\delta_{ij}. \tag{5.52b} $$
The weights of equation (5.51) then become
$$ q_k = \frac{\hat\lambda_k/\hat\lambda_k^2}{\sum_{i=1}^{M}\hat\lambda_i/\hat\lambda_i^2} = \frac{\hat\lambda_k^{-1}}{\sum_{i=1}^{M}\hat\lambda_i^{-1}}. \tag{5.53} $$
5.2.7.2 Example 2

With the uncertain constraints $C_i = \hat\lambda_i$ and a diagonal $\mathbf{K}_c$, equation (5.49) becomes
$$ P(f) = \left[\sum_{i=1}^{M}\frac{\hat\lambda_i^2}{\mathrm{var}\{\hat\lambda_i\}}\right]\left[\sum_{k=1}^{M}\frac{\hat\lambda_k}{\mathrm{var}\{\hat\lambda_k\}}\left|\mathbf{e}^H\mathbf{v}_k\right|^2\right]^{-1}. \tag{5.54} $$
Estimating the variances as
$$ \mathrm{var}\{\hat\lambda_k\} = \hat\lambda_k^2/N, \tag{5.55} $$
we obtain
$$ P(f) = M\left[\sum_{k=1}^{M}\hat\lambda_k^{-1}\left|\mathbf{e}^H(f)\mathbf{v}_k\right|^2\right]^{-1} \tag{5.56a} $$
$$ \phantom{P(f)} = M P_{MV}(f). \tag{5.56b} $$
This method is not effective for estimating the variance of small eigenvalues. Thus, the NME estimate with uncertain constraints $C_i = \hat\lambda_i$, when the variance of $\hat\lambda_i$ is estimated with $\mathrm{var}\{\hat\lambda_i\} = \hat\lambda_i^2/N$, is $P_{MV}$, the minimum variance spectral estimate, scaled by $M$.
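This collapse to $M$ times the minimum variance spectrum can be confirmed numerically; the random Hermitian covariance below is purely illustrative (NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 100
A = rng.normal(size=(M, M)) + 1j * rng.normal(size=(M, M))
R = A @ A.conj().T                    # stand-in for an estimated covariance
lam, V = np.linalg.eigh(R)
e = np.exp(2j * np.pi * 0.17 * np.arange(M))
proj = np.abs(e.conj() @ V) ** 2
var = lam ** 2 / N                    # (5.55)
# (5.54) with Cbar_i = lam_i:
p_nme = np.sum(lam ** 2 / var) / np.sum((lam / var) * proj)
p_mv = 1.0 / np.sum(proj / lam)       # minimum variance form (5.30)
```

The factor $N$ cancels between numerator and denominator, leaving exactly $M\,P_{MV}(f)$.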
5.2.7.3 Example 3
Suppose next that we wish to use the constraints
$$ C_i = \begin{cases} (\hat\lambda_i - \hat\sigma^2)/\varepsilon = l_i/\varepsilon, & 1 \le i \le D \\ \hat\sigma^2, & D+1 \le i \le M, \end{cases} \tag{5.57} $$
where $l_i$ is an $i$th-mode signal power estimate, $\varepsilon$ is Owsley's (1985) signal-space enhancement factor, and
$$ \hat\sigma^2 = \frac{1}{M-D}\sum_{i=D+1}^{M}\hat\lambda_i \tag{5.58a} $$
is a noise power estimate. When the true noise eigenvalues are not distinct, we need an estimate of the variance of $\hat\sigma^2$. When $M - D \ge 3$ and $N$ is large,
$$ \frac{\hat\sigma^2 - \sigma^2}{s/\sqrt{M-D}} $$
is approximately $t$-distributed with $M - D - 1$ degrees of freedom. In this case, $\hat\sigma^2$ has sample variance
$$ \mathrm{var}\{\hat\sigma^2\} = \frac{M-D-1}{(M-D-3)(M-D)}\,s^2, \tag{5.58b} $$
where
$$ s^2 = (M-D-1)^{-1}\sum_{i=D+1}^{M}\left(\hat\lambda_i - \hat\sigma^2\right)^2. $$
However, if we assume that the smaller eigenvalues are, in fact, distinct, then the uncertainties of the $C_i$ are expressed by
$$ \mathrm{var}\{C_i\} = \begin{cases} \mathrm{var}\{\hat\lambda_i\} + (M-D)^{-2}\displaystyle\sum_{j=D+1}^{M}\mathrm{var}\{\hat\lambda_j\}, & 1 \le i \le D \\ (M-D)^{-2}\displaystyle\sum_{j=D+1}^{M}\mathrm{var}\{\hat\lambda_j\}, & D+1 \le i \le M. \end{cases} \tag{5.59} $$
Again replacing the value of $\lambda_i$ with $\hat\lambda_i$ in $\mathrm{var}\{\hat\lambda_i\}$ in equation (5.52b), and writing $\bar D$ for $M - D$,
$$ \mathrm{var}\{C_i\} = \begin{cases} \hat\lambda_i^2/N + (\bar D^2 N)^{-1}\displaystyle\sum_{j=D+1}^{M}\hat\lambda_j^2, & 1 \le i \le D \\ (\bar D^2 N)^{-1}\displaystyle\sum_{j=D+1}^{M}\hat\lambda_j^2, & D+1 \le i \le M. \end{cases} \tag{5.60} $$
(5.60)
With these variances and $C_i$ as in equation (5.57), the estimator of equation (5.54) becomes
$$ P_{NME}(f,\varepsilon) = \frac{\displaystyle \frac{1}{\varepsilon^2}\sum_{i=1}^{D}\frac{l_i^2}{\hat\lambda_i^2 + \bar D^{-2}\sum_{j=D+1}^{M}\hat\lambda_j^2} + \sum_{i=D+1}^{M}\frac{(\hat\sigma^2)^2}{\bar D^{-2}\sum_{j=D+1}^{M}\hat\lambda_j^2}}{\displaystyle \frac{1}{\varepsilon}\sum_{i=1}^{D}\frac{l_i\left|\mathbf{e}^H\mathbf{v}_i\right|^2}{\hat\lambda_i^2 + \bar D^{-2}\sum_{j=D+1}^{M}\hat\lambda_j^2} + \sum_{i=D+1}^{M}\frac{\hat\sigma^2\left|\mathbf{e}^H\mathbf{v}_i\right|^2}{\bar D^{-2}\sum_{j=D+1}^{M}\hat\lambda_j^2}}, \tag{5.61} $$
where the common factor $1/N$ in the variances has canceled.

5.2.8
5.2.8
If, instead, we use the projections given in equation (5.43b), the corresponding spectral estimator is the complex new maximum entropy estimator
(CNME) (Kirlin, 1992):
$$ P_{CNME}(f) = \left|\sum_{i=1}^{M} q_i\,\mathbf{e}^H\mathbf{v}_i\right|^{-2} = \left[\sum_{i=1}^{M}\sum_{j=1}^{M} q_i^{*}q_j\,\mathbf{e}^H\mathbf{v}_i\mathbf{v}_j^H\mathbf{e}\right]^{-1} \tag{5.62a} $$
$$ \phantom{P_{CNME}(f)} = \left(\bar{\mathbf{C}}^H\mathbf{K}_c^{-1}\bar{\mathbf{C}}\right)^{2}\left(\mathbf{e}^H\mathbf{V}\mathbf{Q}_c\mathbf{V}^H\mathbf{e}\right)^{-1}, \tag{5.62b} $$
where
$$ \mathbf{Q}_c = \mathbf{K}_c^{-1}\bar{\mathbf{C}}\bar{\mathbf{C}}^H\mathbf{K}_c^{-1} \tag{5.63} $$
and $\mathbf{q}$ is given by equation (5.51).
The first significant difference between this result and that for the power constraints is that the magnitude-squared operation is outside the sum in equation (5.62a) instead of inside as in equation (5.49). This form corresponds to that of $P_{ME}$ and $P_{MN}$, as is seen in the matrix $\mathbf{Q}_c$.
In Kirlin (1992), some qualitative comparisons are shown between $P_{CNME}(f)$ and $P_{ME}$ and $P_{MN}$ under various assumptions regarding $\bar{\mathbf{C}}$ and $\mathbf{K}_c$. Although this general form is of interest, our personal preference is to avoid the form of $P_{ME}$, $P_{MN}$, and $P_{CNME}$ due to their tendency to give false peaks (instability).
5.2.9
To demonstrate relative results, take an estimated signal spectrum generated by two sinusoids at normalized frequencies 0.225 and 0.250 in white Gaussian noise at S/N = 3 dB. The data record is 64 points long and the covariance matrix is 20 × 20. The following spectral estimators have been used: conventional ME, Figure 5.2; minimum variance ($P_{EMV}$ with $\varepsilon = 1$), Figure 5.3; enhanced minimum variance $P_{EMV}$ with enhancement factor $\varepsilon = 100$, Figure 5.4 (essentially $P_{MU}$); the new maximum entropy method $P_{NME}$ of equation (5.61) with $\varepsilon = 1$, Figure 5.5; and a modified forward-backward linear prediction (FBLP) method (Marple, 1987), Figure 5.6.
A number of expected effects can be seen among these results. Although $P_{ME}$ resolves the two frequencies, it is quite noisy. The basic $P_{MV}$ is much more stable, but its peaks are nearly unresolved. At the other extreme is $P_{FBLP}$, which has sharp, well-resolved peaks but a large number of sidelobes up to -8 dB from the maximum.
Figure 5.2. [Conventional ME estimate; vertical axis: spectrum in dB.]

Figure 5.3.
Minimum variance estimate. S/N = 3 dB, 64 samples, 20 × 20 covariance matrix. (© 1992 IEEE. Used with permission. R. L. Kirlin, "New Maximum Entropy Spectrum Using Uncertain Eigenstructure Constraints," IEEE Trans. on Aerospace and Electronic Systems, vol. 28, no. 1, January 1992.)
Two additional comparisons to MN have been made. In the first, the algorithms were each told that D = 1 when, in fact, D = 2. At S/N = 10 dB and $\varepsilon$ = 1 for NME, 10 runs of each algorithm gave a single peak each time. However, when the two algorithms were each told that D = 3 (3 signals present) when in fact D = 2, 10 runs of each algorithm gave the plots in Figures 5.9 and 5.10. The greater tendency toward instability is clearly shown in the $P_{MN}$ plots. This kind of error (false alarm) is obviously minimized by incorporating the model uncertainty offered by $P_{NME}$. If nothing else, this example shows that more study should be done before designing specifications for miss and false-alarm probabilities.
5.3
Conclusions
It has been shown that all modern spectral estimators are of the form given in equations (5.62a) and (5.62b), a subset of which is given by equations (5.34a) and (5.34b). These estimators have been derived under various criteria.
Figure 5.4.
Enhanced minimum variance estimate, $\varepsilon$ = 100. S/N = 3 dB, 64 samples, 20 × 20 covariance matrix; essentially equivalent to $P_{NME}$ with $\varepsilon$ = 100 and to $P_{MU}$. (© 1992 IEEE. Used with permission. R. L. Kirlin, "New Maximum Entropy Spectrum Using Uncertain Eigenstructure Constraints," IEEE Trans. on Aerospace and Electronic Systems, vol. 28, no. 1, January 1992.)
Figure 5.5.
Figure 5.6.
5.4
References
Figure 5.7.
Kaveh, M., and Barabell, A. J., 1986, The statistical performance of the MUSIC and minimum-norm algorithms in resolving plane waves in noise: IEEE Trans. Acoust., Speech and Sig. Proc., 34, 331-341.
Key, S. C., Kirlin, R. L., and Smithson, S. B., 1987, Seismic velocity analysis using maximum likelihood weighted eigenvalue ratios: 57th Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 461-464.
Kirlin, R. L., 1992, New maximum entropy spectrum using uncertain eigenstructure constraints: IEEE Trans. Aerosp. Elect. Systems, 28, 2-14.
Kumaresan, R., and Tufts, D. W., 1983, Estimating the angles of arrival of multiple plane waves: IEEE Trans. Aerosp. Elect. Systems, 19, 134-139.
Li, F., Liu, H., and Vaccaro, R. J., 1993, Performance analysis for DOA estimation algorithms: further unification, simplification and observations: IEEE Trans. Aerosp. and Elect. Systems, 29.
Figure 5.8.
Marple, S. L., Jr., 1987, Digital spectral analysis with applications: Prentice-Hall, Inc.
Mars, J., Glangeaud, F., Lacoume, J. L., Fourmann, J. M., and Spitz, S., 1987, Separation of seismic waves: 57th Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 489-492.
Oppenheim, A. V., and Schafer, R. W., 1975, Digital signal processing: Prentice-Hall, Inc.
Owsley, N. L., 1985, Sonar array processing, in Haykin, S., Ed., Array signal processing: Prentice-Hall, Inc.
Papoulis, A., 1981, Maximum entropy and spectral estimation: A review: IEEE Trans. Acoust., Speech, and Sig. Proc., 29, 1176-1186.
Figure 5.9.
Rao, B. D., and Hari, K. V. S., 1989, Performance analysis of root MUSIC: IEEE Trans. Acoust., Speech and Sig. Proc., 37, 1789-1794.
Shirley, T. E., Laster, S. J., and Meek, R. A., 1987, Assessment of modern spectral analysis methods to improve wavenumber resolution of f-k spectra: 57th Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 607-609.
Wax, M., Shan, T., and Kailath, T., 1984, Spatio-temporal spectral analysis by eigenstructure methods: IEEE Trans. Acoust., Speech, and Sig. Proc., 32, 817-827.
Figure 5.10.
Ten runs for the new ME. S/N = 10 dB, M = 20, n = 64, f1 = 0.225, f2 = 0.25, $\varepsilon$ = 1. (© 1992 IEEE. Used with permission. R. L. Kirlin, "New Maximum Entropy Spectrum Using Uncertain Eigenstructure Constraints," IEEE Trans. on Aerospace and Electronic Systems, vol. 28, no. 1, January 1992.)
Chapter 6
Root-Mean-Square Velocity Estimation
R. Lynn Kirlin
6.1
Introduction
6.2
$$ \mathbf{x}(i) = \mathbf{H}\mathbf{s}(i) + \mathbf{n}(i), \tag{6.1} $$
$$ \tau_{mk} = \left(T_{0k}^2 + x^2(m)/V_k^2\right)^{1/2} - T_{0k}, \tag{6.2} $$
$$ \tau_{mk} \approx x^2(m)/\left(2T_0 V_k^2\right) = p_k\,x^2(m). \tag{6.3} $$
Figure 6.1.
We have shown (Kirlin, 1991) that wavefronts not meeting the narrowband and stationarity assumptions lose energy into dimensions of the space other than the ideal rank-one dimension of the plane wave. This energy appears as colored noise, adding magnitude to the near-diagonal elements of the covariance matrix.
The next assumption that must be overcome is that of independent signals $s_k(i)$. If, for example, there are two independent signals, then the covariance matrix
$$ \mathbf{P}_s = E\{\mathbf{s}\mathbf{s}^H\} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} \tag{6.4} $$
is diagonal, and
$$ E\{\mathbf{x}\mathbf{x}^H\} = \sigma_N^2\mathbf{I} + \mathbf{H}\mathbf{P}_s\mathbf{H}^H, \tag{6.5} $$
$$ \left(T_0^2 + x^2/V^2\right)^{1/2} \approx \left(T_0^2 + x^2/V_a^2\right)^{1/2}. $$
Figure 6.2. Direct MUSIC analysis of the data in Figure 6.1, using the Ricker spectral peak as the frequency. Peaks should be at 9000 ft/s (2700 m/s) and 12 000 ft/s (3600 m/s).
6.3
Figure 6.3.
Clearly, each Fourier frequency coefficient of a signal represents a narrowband component. This leads to consideration of frequency-shifting each component to a reference or central frequency, coherently combining those, and then applying the narrowband algorithm. This procedure, called frequency focusing, yields some success when the bandwidths are not too great and when the rms velocities are approximately the same for all wavefronts in the window.
To begin this method, the $n$th Fourier coefficients $F_{ni}$ are found for each ($i$th) trace; i.e.,
Figure 6.4.
$$ x_i(t) = \sum_{n} F_{ni}\,e^{\,jn\Omega t}, \tag{6.6} $$
$$ \mathbf{y}_n = [F_{n1}, F_{n2}, \ldots, F_{nM}]^T. \tag{6.7} $$
$$ \mathbf{C}_n = K^{-1}\sum_{k=1}^{K}\mathbf{y}_{nk}\mathbf{y}_{nk}^H, \tag{6.8} $$
The focusing transformation $\mathbf{U}_{w_1 w_2}$ is designed so that
$$ E\{\mathbf{U}_{w_1 w_2}\,\mathbf{y}_n(w_2)\mathbf{y}_n^H(w_2)\,\mathbf{U}_{w_1 w_2}^H\} \approx E\{\mathbf{y}_n(w_1)\mathbf{y}_n^H(w_1)\}. $$
Basically, this transformation will remove the delay factors on $\mathbf{y}_n(w_2)$ and replace them with the approximate delay factors of $\mathbf{y}_n(w_1)$. To estimate the delays, we might use the parabolic approximation
$$ \tau_w \approx p\,x^2(w), \tag{6.9} $$
where
$$ p = 1/(2T_0 V^2). \tag{6.10} $$
$V$ is the trial velocity and $T_0$ is the time center of the analysis window. We might also use the hyperbolic expression
$$ \tau_w = \left(T_0^2 + (x_w/V)^2\right)^{1/2} - T_0. \tag{6.11} $$
Use of $p$ rather than equation (6.11) allows precalculation of $\tau_w$ for all $T_0$; if a wavefront has $p = p(T_0)$ in a window positioned at $T_0$ s, then the corresponding $V$ is extracted by rearranging equation (6.10).
In either case we need to choose a reference sensor for the analysis region. Suppose we let the first trace in the most-offset window (of length $L$) be the time-delay reference. Then the index of this trace is $M - L + 1$. The window whose first element is located on trace $w$ would then yield transformation element
$$ U_{M-L+1,\,w}(i) = \exp\left[-j2\pi f_n\left(\tau_{w+i-1} - \tau_{M-L+i}\right)\right] \tag{6.12} $$
for the $i$th diagonal element $U_w(i,i) = U_{M-L+1,\,w}(i)$. The elements of $\mathbf{U}_w$ are spatially dependent. If we have not already preflattened the data, we can choose $V_a$ or $p_a$ to be a central approximation of all possible parameters. Then the phase shifts in equation (6.12) are approximately correct for $f_n$, and the sample vectors, transformed to the spatial reference, will give a good coherent estimate of the vector $\mathbf{y}_n(M-L+1)$ at the reference location.
On the other hand, if we have already preflattened to $V_a$ in the time domain, we have already effected this approximate phase shift at all frequencies and do not need to use $\mathbf{U}_w$ at all. Note, however, that in consideration of this fact, and by observation of the two wavefronts in Figure 6.1, it should be clear that neither flattening nor spatial smoothing with a single approximate $V$ will yield the desired approximation $\mathbf{y}_n(w) \approx \mathbf{y}_n(M - L + 1)$ when there are two distinct wavefronts present. Spatial smoothing of the sum of any two or more wavefronts with distinct velocities is not effective because of the lack of spatial stationarity. (Spatial smoothing is still appropriate for any wavefront which has been exactly flattened, however, and subsequently I will do this.) We are left with only frequency focusing (Wang and Kaveh, 1985) to achieve multiple samples of $\mathbf{y}_n$.
For frequency focusing, each coefficient vector is phase shifted to the reference by the diagonal matrix
$$ \mathbf{T}_n = \mathrm{diag}\left\{\exp\left(j2\pi f_n\hat{T}_w\right)\right\}, \quad w = 1, \ldots, M, \tag{6.13} $$
where $\hat{T}_w$ is the estimated relative delay at the $w$th sensor. The smoothed covariance matrix is
$$ \mathbf{C} = \sum_{n}\mathbf{T}_n\mathbf{y}_n\mathbf{y}_n^H\mathbf{T}_n^H, \tag{6.14} $$
and the summation indexes over all frequencies of sufficiently high S/N. Weighting each $\mathbf{T}_n\mathbf{y}_n$ according to S/N has also been proposed. The true delays can only be approximated using an approximate velocity representative of the range to be searched.
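The focusing step can be sketched as follows. The offsets, frequency, and window parameters below are hypothetical, and the sketch assumes $\mathbf{T}_n$ is a diagonal phase matrix built from the estimated delays, per equation (6.13), with exact delay estimates (NumPy):

```python
import numpy as np

M, fn, T0, V = 12, 30.0, 1.85, 3000.0
x_off = 100.0 * np.arange(M)                  # hypothetical offsets (m)
tau = np.sqrt(T0**2 + (x_off / V) ** 2) - T0  # hyperbolic delays, cf. (6.11)
yn = np.exp(-2j * np.pi * fn * tau)           # delayed wavefront at frequency f_n
Tn = np.diag(np.exp(2j * np.pi * fn * tau))   # focusing matrix with exact delays
aligned = Tn @ yn
# After focusing, the wavefront is flattened: all elements equal one,
# so contributions from different frequencies add coherently in (6.14).
```

With approximate rather than exact delays, the alignment degrades, which is why the method is limited to moderate bandwidth and velocity ranges, as noted below.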
Obviously, this method will not work well over both a broad range of frequencies and a broad range of velocities, because the addition of the covariance at each frequency is intended to be coherent.
We have experimented with both spatial and frequency smoothing. Spatial
smoothing is clearly not appropriate for single covariance matrix analysis of
multiple curved wavefronts. Frequency smoothing has been moderately effective, but the short durations of seismic signals make time-domain methods
more attractive.
Nevertheless, I want to draw attention to a recent publication which addresses the broadband problem very well, even though the method is directed at plane-wave analysis. Allam and Moghaddamjoo (1994) have introduced a frequency-domain remapping method, projecting (proportioning) the spatial frequencies used at each temporal frequency out to those spatial frequencies at the reference temporal frequency. The projection is based on the linear relationship between the spatial and temporal frequency content of a plane wave. The results are impressive for plane waves, and the methodology might possibly be extended to hyperbolic wavefronts.
Each temporal frequency provides a vector of spatial frequency coefficients
which can then contribute coherently to a sample covariance matrix.
6.4
Discussion
Next, preflattening is done for each trial velocity. If a wavefront has been flattened exactly, a number of beamforming and interference-canceling plane-wave detection and parameter-estimation algorithms apply to that wavefront. Preflattening also allows spatial smoothing for improved covariance matrix estimation. Other wavefronts in the analysis window will not be planar but will appear as high- (or multi-) dimensional coherent interference. Some degradation from the ideal is seen.
6.5
$$ S_c = \frac{\displaystyle\sum_{j=k-N/2}^{k+N/2}\left[\sum_{i=1}^{M} x(j,i)\right]^2}{M\displaystyle\sum_{j=k-N/2}^{k+N/2}\sum_{i=1}^{M} x^2(j,i)}. \tag{6.15} $$
In terms of the windowed sample covariance matrix $\mathbf{C}_k$,
$$ S_c = \mathbf{1}^T\mathbf{C}_k\mathbf{1}\,/\left(M\,\mathrm{Tr}[\mathbf{C}_k]\right). \tag{6.16} $$
Decomposing $\mathbf{C}_k$ into signal- and noise-subspace parts,
$$ S_c = \frac{\mathbf{1}^T\left(\mathbf{V}_s\boldsymbol{\Lambda}_s\mathbf{V}_s^H\right)\mathbf{1} + \mathbf{1}^T\left(\mathbf{V}_n\boldsymbol{\Lambda}_n\mathbf{V}_n^H\right)\mathbf{1}}{M\displaystyle\sum_{m=1}^{M}\lambda_m}. \tag{6.17} $$
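The identity between the data form (6.15) and the covariance form (6.16) can be checked directly. The synthetic traces below are illustrative, with $\mathbf{C}$ taken as the unnormalized zero-lag sample covariance of the windowed data (NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
Nw, M = 50, 10                       # window samples x traces
X = rng.normal(size=(Nw, M))         # x(j, i) over the analysis window
# (6.15): ratio of stacked power to total power
num = np.sum(np.sum(X, axis=1) ** 2)
den = M * np.sum(X ** 2)
s_c_window = num / den
# (6.16): same quantity from the sample covariance C = X^T X
C = X.T @ X
ones = np.ones(M)
s_c_cov = (ones @ C @ ones) / (M * np.trace(C))
```

This is why semblance over many trial velocities is, in effect, forming one covariance matrix per trial, as noted later in the chapter.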
Figure 6.5.
For a single exactly flattened wavefront, this reduces to
$$ S_c = \left[1 + \sum_{m=2}^{M}\lambda_m/\lambda_1\right]^{-1}, $$
and a sharpened display can be formed as
$$ S'_c = (1 - S_c)^{-1} = \frac{M(\mathrm{S/N} + 1)}{M - 1}, \tag{6.18} $$
which approaches infinity as $S_c$ approaches one. We have used $S'_c$ for the data of Figure 6.6, where S/N = 5 dB and the two wavefronts have velocities of 9500 ft/s (2900 m/s) and 10 500 ft/s (3200 m/s). The semblance analysis of Figure 6.7 should be compared to the MUSIC analysis of Figure 6.8 for data flattened to the nominal velocity of 10 000 ft/s (3000 m/s). Both spectra have been normalized by their peak values. MUSIC uses f = 30 Hz.
We can see that both methods have resolved the two wavefronts, MUSIC somewhat better than semblance. Computation time using MATLAB on a Sparcstation SLC is 200 s for semblance and 30 s for MUSIC, including the single flattening process. However, semblance is more accurate. Both algorithms used increments of 50 ft/s (15 m/s) in searching velocity, or 201 trials. It is clear that even though MUSIC requires eigenstructure analysis, doing this once is much more efficient than forming 201 covariance matrices, which in effect is what semblance requires.
The potential utility of broadband MUSIC is shown in Figure 6.9, by a second application of MUSIC to the data of Figure 6.6, but with the center frequency input as 60 Hz. Note that the bias has essentially disappeared but resolution has decreased. The potential for application of wideband MUSIC with frequency focusing is evident; however, we have not found it to be useful for our test cases.
6.6
Key's Algorithm
Figure 6.6.

If all the information with regard to the wavefronts and noise is in the covariance matrix, perhaps the eigenvalues themselves give adequate information for some purposes. Key (Key and Smithson, 1990) developed an algorithm which windows the data per trial velocity and steps along in time like semblance. The covariance matrix for each trial velocity is analyzed for its eigenvalues.
Assuming only one wavefront exists in the analysis window, then, ideally, $\lambda_1 = M\sigma_s^2 + \sigma_n^2$ and $\lambda_m = \sigma_n^2$ for $m \ne 1$. Thus, a coherence measure is
$$ J_K = (\lambda_1 - \bar\lambda)/\bar\lambda = M\sigma_s^2/\sigma_n^2, \tag{6.19} $$
where
$$ \bar\lambda = \left(\mathrm{Tr}(\mathbf{C}) - \lambda_1\right)/(M - 1). \tag{6.20} $$
Figure 6.7.
(6.21)
Because both $\lambda_1$ and $\bar\lambda$ are estimates of the true values, their difference in the numerator of $J_K$ gives rise to more variability than that for $\bar\lambda$ alone. Further, when both $\lambda_1$ and $\bar\lambda$ are inappropriately small, the ratio can be inappropriately large. Thus, the bias of $\sigma_n^2$ in the numerator of $S_c$ ($\propto 1 + M\sigma_s^2/\sigma_n^2$) is traded for the variance in the numerator of $J_K$. Note also
Figure 6.8.
that Sc contains signal power as well as noise power to normalize the coherence
measure.
Results of Key's algorithm for the data in Figure 6.6 are shown in Figure 6.10. These computations required 724 s.
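Key's measure is easily sketched from the eigenvalues of an idealized one-wavefront covariance; the signal and noise powers below are hypothetical (NumPy):

```python
import numpy as np

# Ideal flattened-wavefront covariance: C = sigma_s^2 * 1 1^T + sigma_n^2 * I,
# for which (6.19)-(6.20) give J_K = M * sigma_s^2 / sigma_n^2 exactly.
M, sig_s2, sig_n2 = 16, 2.0, 0.5
C = sig_s2 * np.ones((M, M)) + sig_n2 * np.eye(M)
lam = np.linalg.eigvalsh(C)
lam1 = lam[-1]                                # largest eigenvalue
lam_bar = (np.trace(C) - lam1) / (M - 1)      # (6.20)
J_K = (lam1 - lam_bar) / lam_bar              # (6.19)
```

In practice both $\lambda_1$ and $\bar\lambda$ are sample estimates, which is the source of the instability discussed above.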
6.7
Because semblance is the conventional coherency measure, we have constructed an equivalent based on an estimate of the signal subspace (Kirlin, 1992). This is somewhat of a compromise between semblance, which can never go to zero when noise is present (numerator ≥ 1), and Key's algorithm, which uses subspace ideas but is quite unstable and is not normalized.
We derive that the coherence measure
Figure 6.9.
MUSIC analysis at f = 60 Hz of the data in Figure 6.6. Note that the accuracy has improved over the use of f = 30 Hz, but resolution has degraded and the baseline has risen.
$$ S_K = 1 - \frac{\mathbf{1}^T\mathbf{V}_n\mathbf{V}_n^T\mathbf{1}}{M - 1} \tag{6.22} $$
$$ \phantom{S_K} = \frac{\mathbf{1}^T\mathbf{v}_1\mathbf{v}_1^T\mathbf{1} - 1}{M - 1} \tag{6.23} $$
is an estimate of how well the signal subspace has been flattened. If there is only one signal present, $\mathbf{v}_1 = \mathbf{1}/\sqrt{M}$, and with no noise $S_K = 1$. However, with no signal at all, $\mathbf{v}_1$ is a randomly oriented vector; the average power of the vector $\mathbf{1}$ in the direction of $\mathbf{v}_1$ equals $1/M$th of $\|\mathbf{1}\|^2$, or 1. Thus the average no-signal value of $S_K$ is zero, and $S_K$ may go negative.

Figure 6.10.
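Both forms (6.22) and (6.23), and the limiting value $S_K = 1$, can be checked on an idealized flattened-wavefront covariance; the powers below are hypothetical (NumPy):

```python
import numpy as np

M = 12
# One exactly flattened wavefront plus white noise
C = 3.0 * np.ones((M, M)) + 0.2 * np.eye(M)
lam, V = np.linalg.eigh(C)
v1 = V[:, -1]                 # first (largest-eigenvalue) eigenvector
Vn = V[:, :-1]                # noise subspace for D = 1
ones = np.ones(M)
s_k_noise = 1.0 - (ones @ Vn @ Vn.T @ ones) / (M - 1)    # (6.22)
s_k_v1 = ((ones @ v1) ** 2 - 1.0) / (M - 1)              # (6.23)
```

The squared projection in (6.23) makes the result insensitive to the arbitrary sign of the computed eigenvector.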
A plot of the spectrum of $S_K$ for the data of Figure 6.6 at t = 1850 ms is shown in Figure 6.11. Comparing it with semblance $S_c$ in Figure 6.7, note that the accuracy of semblance is preserved while the background level is lowered and resolution is apparently enhanced.
This algorithm seems to work quite well. Run time is comparable to semblance, and it is time efficient if a special routine is used that finds only the first eigenvector and eigenvalue. However, solving for all 32 eigenvectors in this problem used 703 s. (Compare to 210 s for semblance and 30 s for preflattened MUSIC.)
The satisfactory outcome of subspace semblance has led to a similar algorithm, which I describe in the following section. Subspace semblance calculates coherence assuming only one wavefront is present. The next algorithm optimally and adaptively cancels any wavefronts which may be present before computing a coherence measure.
Figure 6.11.
6.8
The multiple sidelobe canceler (MSC) is taken from the beamforming literature. Conventional semblance is like conventional beamforming: one must apply a delay at each phone so as to align the wavefront, making it appear as if it were a broadside plane wave. This allows coherent stacking, or addition, of the M time-shifted traces, thereby reducing the random noise power by M. However, when another wavefront is present, its signal interferes with the ideal stacking process.
For example, in the previous section two wavefronts were present. This causes the first eigenvector to differ from $\mathbf{1}/\sqrt{M}$, no matter what the trial velocity, because the first eigenvector is that linear combination of the two direction (delay) vectors which gives the greatest temporal variance of the sum. With window processing, we rely somewhat on the time gate to reduce the energy of the nonflattened wave and its correlation with the flattened one.
The multiple sidelobe canceler seeks to subtract from the data any wavefronts which do not have the trial velocity used for flattening. If this is possible, a more accurate stack is effected. The first step is to apply the flattening delays per the trial velocity. If $\mathbf{x}$ is an unflattened data vector from the analysis window, then we let $\mathbf{D}\mathbf{x}$ represent the flattened data. The usual stack is $y_m = \mathbf{1}^T\mathbf{D}\mathbf{x}$. The second step is to remove the data at the flattening velocity. Our approximation of the residual, the auxiliary or interference reference for noise canceling, is
$$ \mathbf{x}_a = (\mathbf{I} - \mathbf{D})\mathbf{x}. \tag{6.24} $$
However, over any finite time, for any sources not in nulls of the $(\mathbf{I} - \mathbf{D})$ beam (that is, with any finite-length array), $y_m$ and $\mathbf{x}_a$ are correlated. Therefore, we use a minimum-mean-squared-error criterion to subtract the optimal linear combination of the elements of $\mathbf{x}_a$ from $y_m$. That is, we find $\mathbf{w}_a$ such that
$$ E\{|y_m - \mathbf{w}_a^H\mathbf{x}_a|^2\} $$
is minimized. The solution for $\mathbf{w}_a$ is
$$ \mathbf{w}_a = \mathbf{R}_a^{\#}\mathbf{r}_{ma}, \tag{6.25} $$
with the pseudo-inverse
$$ \mathbf{R}_a^{\#} = \mathbf{U}_1\boldsymbol{\Lambda}_1^{-1}\mathbf{U}_1^H, \tag{6.26} $$
where $\boldsymbol{\Lambda}_1$ is $p \times p$ and $\mathbf{U}_1$ is $M \times p$.
The interference-canceled main beam, or waveform estimator, is
$$ \hat y_m = y_m - \left(\mathbf{R}_a^{\#}\mathbf{r}_{ma}\right)^T\mathbf{x}_a \tag{6.27} $$
$$ \phantom{\hat y_m} = \left[\mathbf{1}^T\mathbf{D} - \left(\mathbf{R}_a^{\#}\mathbf{r}_{ma}\right)^T(\mathbf{I} - \mathbf{D})\right]\mathbf{x}. $$
The expectations for $\mathbf{R}_a$ and $\mathbf{r}_{ma}$ must be estimated with the data in the analysis window. Let the window be $N \times M$. Writing $\mathbf{x}_n$ for the $n$th time slice of $\mathbf{x}$, the $n$th stack value is
$$ y_n = \mathbf{1}^T\mathbf{x}_n/M, \tag{6.28} $$
$$ \hat{\mathbf{r}}_{ma} = N^{-1}\sum_{n=1}^{N}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/M\right)\mathbf{x}_n\,y_n, \tag{6.29} $$
$$ \hat{\mathbf{R}}_a = N^{-1}\sum_{n=1}^{N}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/M\right)\mathbf{x}_n\mathbf{x}_n^H\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/M\right). \tag{6.30} $$
Expression (6.27) is an estimate of the interference-free flattened waveform. A comparison to semblance is possible using the expression
$$ S_{MSC} = \frac{\displaystyle\sum_{n=1}^{N}\hat y_n^2}{\displaystyle\sum_{n=1}^{N}\mathbf{x}_n^H\,\mathrm{diag}\left(\mathbf{r}_{ma}^T\mathbf{R}_a^{\#}\right)\mathbf{x}_{an}}. \tag{6.31} $$
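The cancellation idea of (6.24)-(6.30) can be sketched end to end. The synthetic flattened wavefront, interferer pattern, and noise level below are hypothetical, and the stack-removal projector $(\mathbf{I} - \mathbf{1}\mathbf{1}^T/M)$ of (6.29)-(6.30) is used to form the auxiliary traces (NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 400, 8
s = rng.normal(size=N)                      # flattened wavefront amplitudes
g = np.linspace(0.5, 2.0, M)                # interferer's trace pattern
a = rng.normal(size=N)                      # interferer amplitudes
X = np.outer(s, np.ones(M)) + np.outer(a, g) + 0.1 * rng.normal(size=(N, M))
P = np.eye(M) - np.ones((M, M)) / M         # removes the stacked component
y = X @ np.ones(M) / M                      # main-beam stack, cf. (6.28)
Xa = X @ P                                  # auxiliary (stack-removed) traces
r_ma = Xa.T @ y / N                         # (6.29)
Ra = Xa.T @ Xa / N                          # (6.30)
wa = np.linalg.pinv(Ra) @ r_ma              # (6.25), pseudo-inverse solution
y_hat = y - Xa @ wa                         # interference-canceled stack
err_raw = np.mean((y - s) ** 2)             # raw stack error
err_msc = np.mean((y_hat - s) ** 2)         # canceled-stack error (smaller)
```

The flattened wavefront is invisible in the auxiliary traces, so the regression removes mostly the interferer, leaving a cleaner estimate of the stacked waveform.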
Figure 6.12.
$(1 - S_{MSC})^{-1}$ for the data of Figure 6.6. Spatial smoothing was used to enhance a 20 × 20 covariance matrix. The time gate is 100 ms.
For this example, as for the other algorithms, a single window length of approximately 100 ms around 1850 ms has been applied. Shorter windows can also be used; these may result in higher-S/N estimates of coherence at times when wavefront peaks are present, but more variable estimates are expected at other times. A 20-ms window has been used for the result in Figure 6.13.
Figure 6.13. (Horizontal axis: velocity, ft/s.)
6.9
The planar narrowband wave model, so prevalent in the sonar and radar literature, has been shown to have application, but also limitations, in the estimation of seismic wavefronts' rms velocities. In particular, the MUSIC algorithm has been shown to be quite fast and to yield high resolution for two wavefronts with close velocities; however, the data must be preflattened to the approximate rms velocity, and estimates are likely to be biased. Iterated flattening, as is practiced with conventional semblance, is recommended in all coherence detection and estimation methods. Preflattening can be said variously to make the wavefronts planar and narrowband for array-processing purposes, to reduce rank in the covariance matrix, or to increase the stability of estimators.
MUSIC with preflattening, MVDR with preflattening, low-rank semblance, or the multiple sidelobe canceler all achieve good results, improving on conventional semblance in one aspect or another. Certainly there are other variations that we have not discussed, although we have suggested, for example, that a low-rank pseudo-inverse might be used in the MSC.
Our MATLAB computation times for the algorithms of this chapter are
shown in Table 6.1. Note that substantial time costs could be greatly reduced
in several cases with more efficient programming, sometimes as simple as only
solving for the first eigenvector and eigenvalue rather than all (32 in these
examples).
Table 6.1. Comparative computation time.

Algorithm               Seconds
MUSIC                        30
MVDR                         48
Semblance                   210
Keys                        724
Subspace Semblance          703
MSC                        2090*

*Using x_s = (I - V_1 V_1^H)x.
6.10 References

Allam, M., and Moghaddamjoo, A., 1994, Two-dimensional DFT projection for wideband direction-of-arrival estimation: IEEE Signal Processing Letters, 1, 35-37.
Haimovich, A. M., and Bar-Ness, Y., 1991, An eigenanalysis interference canceller: IEEE Trans. Signal Processing, 39, 76-84.
Key, S. C., and Smithson, S. D., 1990, New approach to seismic-reflection event detection and velocity determination: Geophysics, 55, 1057-1069.
Kirlin, R. L., 1991, A note on the effects of narrowband and stationary signal model assumptions on the covariance matrix of sensor array data vectors: IEEE Trans. Signal Processing, 39, 503-505.
Kirlin, R. L., 1992, The relationship between semblance and eigenstructure velocity estimators: Geophysics, 57, 1027-1033.
Wang, H., and Kaveh, M., 1985, Coherent subspace processing for the detection and estimation of angles of arrival of multiple wideband sources: IEEE Trans. Acous., Speech and Sig. Proc., 33, 823-831.
Chapter 7
Subspace-Based Seismic Velocity Analysis
Fu Li and Hui Liu
In this chapter, we present a new approach to simultaneously estimate the stacking velocity and zero-offset time of seismic wave propagation. The approach comprises the following steps: preprocessing to extract structure, application of several subspace methods (ESPRIT, MUSIC, and Minimum Norm) to estimate the time delay at each sensor, and postprocessing to estimate the stacking velocity and zero-offset time. The advantages of the proposed approach are high resolution and lower computation.
Wave propagation velocity often reflects the properties of the media through which a wave propagates. Therefore, estimating the stacking velocity of a seismic wavefront is an important signal-processing task in exploration seismology. However, because of the special hyperbolic trajectory that often occurs with seismic wave propagation, the stacking velocity must be estimated together with the two-way normal-incidence time (zero-offset time).
Conventionally, the stacking velocity and zero-offset time are estimated by varying the seismic data window to seek the maxima of some coherency measure function, for instance, the semblance coefficient (Neidell and Taner, 1971) or Key's method (Key and Smithson, 1990). However, this approach must either estimate velocity and zero-offset time iteratively or plot a two-dimensional semblance spectrum over a range of the velocity and zero-offset time variables. The computational expense associated with varying the data window is high.
Goldstein and Archuleta (1987) first applied the MUltiple SIgnal Classification (MUSIC) algorithm (Schmidt, 1979) and the spatial-smoothing technique to estimate the directions of seismic arrivals and later applied MUSIC, spatial smoothing, and seismogram alignment to estimate several seismic
7.1 Problem Formulation
Figure 7.1.

t^2 = T_0^2 + \frac{x^2}{v^2}.    (7.1)

The signal received at the ith sensor is a delayed version of the signal at the (i-1)th sensor:

y_i(t) = y_{i-1}(t - \tau).    (7.2)

If the delays at all the sensors are chosen with respect to a common reference, then equation (7.2) can be rewritten as

y(t) = \big(y(t - \tau_0), \ldots, y(t - \tau_M)\big).    (7.3)
Further, if the signals in the analysis window (see Figure 7.2) are sampled at t_j for j = 0, 1, ..., K, we can form a data matrix (or analysis window, as it is known in other literature) of dimension (K+1) \times (M+1):

Y(\tau) = \begin{bmatrix} y(t_0 - \tau_0) & \cdots & y(t_0 - \tau_M) \\ \vdots & & \vdots \\ y(t_K - \tau_0) & \cdots & y(t_K - \tau_M) \end{bmatrix},    (7.4)

Figure 7.2.
where Y(t) is the time-domain data matrix. When y(t) is a narrowband signal (at center frequency \omega_c), a time delay can generally be approximated by a phase delay, as implied in Goldstein and Archuleta (1991) and other DOA estimation literature:

y(t - \tau) \approx y(t)e^{-j\omega_c\tau}.    (7.5)
7.2 Subspace Approach

Taking the discrete Fourier transform (DFT) of all the columns in Y(t) of equation (7.4), we obtain

Y(\omega) = \begin{bmatrix} y(\omega_0)e^{-j\omega_0\tau_0} & \cdots & y(\omega_0)e^{-j\omega_0\tau_M} \\ \vdots & & \vdots \\ y(\omega_K)e^{-j\omega_K\tau_0} & \cdots & y(\omega_K)e^{-j\omega_K\tau_M} \end{bmatrix},    (7.7)

where \omega_k = 2\pi k/(K+1) (for k = 0, 1, ..., K) and y(\omega_k) = |y(\omega_k)|e^{j\phi_k}. Y(\omega) is the frequency-domain data matrix, and \omega is the vector of discrete frequencies. An element y(\omega_k)e^{-j\omega_k\tau_m} is the DFT at \omega_k for the mth trace.
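The column-wise DFT that produces the frequency-domain data matrix can be sketched as follows, using synthetic traces in place of real data (the window size is hypothetical):

```python
import numpy as np

# Build the frequency-domain data matrix by taking the DFT of each
# column (trace) of the time-domain analysis window of equation (7.4).
rng = np.random.default_rng(1)
K, M = 63, 5                                # K+1 samples, M+1 traces (hypothetical)
Yt = rng.standard_normal((K + 1, M + 1))    # (K+1) x (M+1) time-domain window
Yw = np.fft.fft(Yt, axis=0)                 # column-wise DFT over time
# Entry Yw[k, m] is the DFT of trace m at discrete frequency 2*pi*k/(K+1)
```

Working in the frequency domain converts the wideband time delays into per-frequency phase delays, which is the structure the subspace methods below exploit.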
7.2.1 Structure Extraction

With the first trace taken as the reference (\tau_0 = 0), the frequency-domain data matrix is

Y(\omega) = \begin{bmatrix} y(\omega_0) & y(\omega_0)e^{-j\omega_0\tau_1} & \cdots & y(\omega_0)e^{-j\omega_0\tau_M} \\ \vdots & \vdots & & \vdots \\ y(\omega_K) & y(\omega_K)e^{-j\omega_K\tau_1} & \cdots & y(\omega_K)e^{-j\omega_K\tau_M} \end{bmatrix},    (7.8)

which is used to define the span of the signal subspace. Also notice that the narrowband data matrix has low rank, as in most of the DOA estimation literature. But when y(t) is a wideband signal, y(t - \tau) cannot be expressed as y(t)e^{-j\omega\tau}, so applying a subspace approach directly to time-domain data does not utilize the subspace structure appropriately. Therefore, we perform subspace processing in the frequency domain, because time delays in the time domain correspond to phase delays in the frequency domain.
Define the diagonal weighting matrix

T(\omega) = \begin{bmatrix} e^{-j\phi_0}/|y(\omega_0)| & & \\ & \ddots & \\ & & e^{-j\phi_K}/|y(\omega_K)| \end{bmatrix},    (7.9)

so that

T(\omega)Y(\omega) = \begin{bmatrix} 1 & e^{-j\omega_0\tau_1} & \cdots & e^{-j\omega_0\tau_M} \\ \vdots & \vdots & & \vdots \\ 1 & e^{-j\omega_K\tau_1} & \cdots & e^{-j\omega_K\tau_M} \end{bmatrix}.    (7.10)

The first column in T(\omega)Y(\omega) does not contain any useful information, so we define a matrix \bar{A} that has all the columns of T(\omega)Y(\omega) except the first:

\bar{A} = \begin{bmatrix} e^{-j\omega_0\tau_1} & \cdots & e^{-j\omega_0\tau_M} \\ \vdots & & \vdots \\ e^{-j\omega_K\tau_1} & \cdots & e^{-j\omega_K\tau_M} \end{bmatrix}.    (7.11)
Generally, most of the signal energy in the frequency domain is distributed within a certain bandwidth and possesses conjugate symmetry with respect to zero frequency. Assume the range of this energy band is (P, Q), with 0 \le P < Q \le K, and denote

\omega_d = \frac{2\pi}{K+1}.

Then we can write A, a reduced version of \bar{A}, as

A = \begin{bmatrix} e^{-jP\omega_d\tau_1} & \cdots & e^{-jP\omega_d\tau_M} \\ e^{-j(P+1)\omega_d\tau_1} & \cdots & e^{-j(P+1)\omega_d\tau_M} \\ \vdots & & \vdots \\ e^{-jQ\omega_d\tau_1} & \cdots & e^{-jQ\omega_d\tau_M} \end{bmatrix}.    (7.12)
7.2.2

In almost all subspace-based processing algorithms for narrowband signals, the signal and orthogonal subspaces are determined by either a singular-value decomposition of the data matrix or an eigenvalue decomposition of the data covariance matrix. This is because the data matrix, and thus the covariance matrix, are rank deficient in the noise-free case. However, in the seismic data representation, the data matrix, and thus the covariance matrix, if used, are of full rank due to the wide bandwidth. That is, the (Q - P + 1) \times M (assuming Q - P + 1 > M) matrix A(\tau) has rank M. The advantage of this fact is that we do not have to perform those computationally expensive decompositions. Instead, we determine the subspaces by

\Pi_s = A(A^H A)^{-1}A^H  and  \Pi_o = I - \Pi_s,    (7.13)

which satisfy

\Pi_s A = A  and  \Pi_o A = 0.    (7.14)

With \Pi_s and \Pi_o obtained in equation (7.13), we can perform all the high-resolution subspace algorithms:
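The projectors of equation (7.13) can be computed directly from A, with no SVD or EVD. The sketch below uses hypothetical band limits P, Q and delays tau (in samples), not values from the text:

```python
import numpy as np

# Signal and orthogonal projectors from the steering matrix A, eq (7.13).
K = 50
w_d = 2 * np.pi / (K + 1)
P_idx, Q_idx = 3, 23                               # hypothetical energy band
kk = np.arange(P_idx, Q_idx + 1)[:, None]
tau = np.array([2.0, 5.0, 9.0, 13.0])              # hypothetical delays (samples)
A = np.exp(-1j * w_d * kk * tau[None, :])          # (Q-P+1) x M, full column rank

Pi_s = A @ np.linalg.inv(A.conj().T @ A) @ A.conj().T   # signal projector
Pi_o = np.eye(A.shape[0]) - Pi_s                        # orthogonal projector
# Equation (7.14): Pi_s A = A and Pi_o A = 0, to machine precision
```

Avoiding the eigendecomposition is exactly the computational advantage the text claims for the full-rank wideband case.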
7.2.2.1 MUSIC

We form a delay spectrum function

P(\tau) = a(\tau)^H \Pi_o a(\tau),    (7.15)

which satisfies

P(\tau_i) = 0,  i = 1, 2, \ldots, M,    (7.16)

for the noise-free case. In the presence of noise, we will get minima instead of M zeros; we choose the \tau's corresponding to the M smallest minima. Root-MUSIC can also be implemented by forming a delay spectrum polynomial

P(z) = a(z)^H \Pi_o a(z)    (7.17)
     = C\prod_{i=1}^{Q-P}(1 - r_i z^{-1})(1 - r_i^* z),    (7.18)

from which we get 2(Q - P) roots. We choose the M double roots on the unit circle (for the noise-free case) or the M roots closest to but inside the unit circle (for the noisy case) as the signal roots. We then calculate the \tau's from these signal roots.
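A grid search over the delay spectrum of equation (7.15) can be sketched as follows. The band limits and the two true delays are hypothetical, and the noise-free case is used so the minima are exact nulls:

```python
import numpy as np

# MUSIC-style delay spectrum search: nulls of P(tau) at the true delays.
K = 50
w_d = 2 * np.pi / (K + 1)
P_idx, Q_idx = 3, 23
kk = np.arange(P_idx, Q_idx + 1)
tau_true = np.array([5.0, 9.0])                    # hypothetical delays (samples)
A = np.exp(-1j * w_d * np.outer(kk, tau_true))
Pi_o = np.eye(kk.size) - A @ np.linalg.inv(A.conj().T @ A) @ A.conj().T

def null_spectrum(tau):
    a = np.exp(-1j * w_d * kk * tau)               # steering vector a(tau)
    return float(np.real(a.conj() @ Pi_o @ a))     # P(tau), eq (7.15)

grid = np.linspace(0.0, 15.0, 1501)
spec = np.array([null_spectrum(t) for t in grid])
# interior local minima; the M deepest fall at the true delays
mins = np.where((spec[1:-1] < spec[:-2]) & (spec[1:-1] < spec[2:]))[0] + 1
est = np.sort(grid[mins[np.argsort(spec[mins])[:2]]])
```

With noise, the nulls become finite minima, and a practical search picks the M smallest local minima exactly as described above.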
7.2.2.2 Minimum-Norm

The linear prediction-error vector is defined as

d = \frac{\Pi_o e_1}{\|\Pi_o e_1\|^2},    (7.19)

and the corresponding polynomial is

D(z) = a(z)^H d = C\prod_{i=1}^{Q-P}(1 - r_i z^{-1}).    (7.20)

The M roots on (for the noise-free case) or closest to (for the noisy case) the unit circle are chosen as signal roots. We can also apply our searching algorithm (Li and Vaccaro, 1989), that is, to search over \tau for the M zeros or smallest minima of D(\tau) = a(\tau)^H d.

7.2.2.3 ESPRIT

A shift invariance exists in the rows of A, as shown in the following expression:

\begin{bmatrix} e^{-j(P+1)\omega_d\tau_1} & \cdots & e^{-j(P+1)\omega_d\tau_M} \\ \vdots & & \vdots \\ e^{-jQ\omega_d\tau_1} & \cdots & e^{-jQ\omega_d\tau_M} \end{bmatrix} = \begin{bmatrix} e^{-jP\omega_d\tau_1} & \cdots & e^{-jP\omega_d\tau_M} \\ \vdots & & \vdots \\ e^{-j(Q-1)\omega_d\tau_1} & \cdots & e^{-j(Q-1)\omega_d\tau_M} \end{bmatrix} \begin{bmatrix} e^{-j\omega_d\tau_1} & & 0 \\ & \ddots & \\ 0 & & e^{-j\omega_d\tau_M} \end{bmatrix}.    (7.21)

In equation (7.21), the matrix on the left side is the matrix A (or \Pi_s A) excluding the first row, so we denote it A_{\uparrow}; the first matrix on the right side is the matrix A (or \Pi_s A) excluding the last row, so we denote it A_{\downarrow}. If we further denote the diagonal matrix

\Phi = \begin{bmatrix} e^{-j\omega_d\tau_1} & & 0 \\ & \ddots & \\ 0 & & e^{-j\omega_d\tau_M} \end{bmatrix},    (7.22)

then we have

A_{\uparrow} = A_{\downarrow}\Phi,    (7.23)

so that

\Phi = A_{\downarrow}^{\dagger}A_{\uparrow},    (7.24)
with eigenvalue relation

\Phi a_i = \phi_i a_i.    (7.25)
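The shift-invariance estimator of equations (7.21)-(7.24) can be sketched numerically. The band limits and delays are hypothetical, and the noise-free case makes the least-squares solution exact:

```python
import numpy as np

# ESPRIT-style delay estimation: A_up = A_down * Phi, eq (7.23)-(7.24).
K = 50
w_d = 2 * np.pi / (K + 1)
P_idx, Q_idx = 3, 23
tau_true = np.array([5.0, 9.0])                    # hypothetical delays (samples)
kk = np.arange(P_idx, Q_idx + 1)[:, None]
A = np.exp(-1j * w_d * kk * tau_true[None, :])

A_down, A_up = A[:-1, :], A[1:, :]                 # drop last / first row
Psi, *_ = np.linalg.lstsq(A_down, A_up, rcond=None)  # least-squares A_down^+ A_up
phi = np.linalg.eigvals(Psi)                       # phi_i = exp(-1j * w_d * tau_i)
tau_hat = np.sort(-np.angle(phi) / w_d)            # delays from eigenvalue angles
```

In the noisy case Psi is only approximately diagonalizable by the e_i, which is what the perturbation analysis of Section 7.4.3.3 quantifies.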
7.2.3

Each delay satisfies, from equation (7.1) with t_i = T_0 + \tau_i,

2T_0\tau_i + \tau_i^2 = \frac{x_i^2}{v^2}.    (7.26)

For all the offsets x_i and delay estimates \tau_i, we therefore have

\begin{bmatrix} x_1^2 & -2\tau_1 \\ \vdots & \vdots \\ x_M^2 & -2\tau_M \end{bmatrix}\begin{bmatrix} 1/v^2 \\ T_0 \end{bmatrix} = \begin{bmatrix} \tau_1^2 \\ \vdots \\ \tau_M^2 \end{bmatrix},    (7.27)

whose least-squares solution is

\begin{bmatrix} 1/\hat{v}^2 \\ \hat{T}_0 \end{bmatrix} = \begin{bmatrix} x_1^2 & -2\tau_1 \\ \vdots & \vdots \\ x_M^2 & -2\tau_M \end{bmatrix}^{\dagger}\begin{bmatrix} \tau_1^2 \\ \vdots \\ \tau_M^2 \end{bmatrix}.    (7.28)
Thus we obtain the estimates of stacking velocity v and zero-offset time T_0 simultaneously. Notice that equation (7.28) is the solution to an overdetermined estimation problem; it is equivalent to identifying the hyperbola with parameters v and T_0 that best fits all the estimated arrival delays \tau_i.
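The least-squares fit of equation (7.28) can be sketched directly. The offsets and the true (v, T_0) below are hypothetical; with exact delays, the fit recovers them exactly:

```python
import numpy as np

# Recover stacking velocity v and zero-offset time T0 from offsets and delays.
v_true, T0_true = 2000.0, 0.1                # ft/s, s (hypothetical)
x = np.linspace(100.0, 1000.0, 10)           # offsets (ft)
t = np.sqrt(T0_true**2 + (x / v_true) ** 2)  # hyperbolic arrival times, eq (7.1)
tau = t - T0_true                            # delays relative to zero offset

# Equation (7.27): x_i^2 * (1/v^2) - 2*tau_i * T0 = tau_i^2
B = np.column_stack([x**2, -2.0 * tau])
theta, *_ = np.linalg.lstsq(B, tau**2, rcond=None)  # eq (7.28)
v_hat = 1.0 / np.sqrt(theta[0])
T0_hat = theta[1]
```

With noisy delay estimates, the same solve returns the best-fit hyperbola in the least-squares sense.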
7.3

Since seismic signals are transient, they occupy only a small part of the total data; the rest of the data contain only noise. To improve the performance of subspace processing for velocity estimation, we should use only the part of the data that contains the signals. This is the reason for using a data analysis window. However, because the seismic reflection has a hyperbolic trajectory, applying a rectangular data analysis window directly to seismic data will still include data that contain only noise. To achieve the highest possible S/N, we apply a hyperbolic window to select the signals. This hyperbolic window concept is similar to seismogram alignment.
A scheme that improves the proposed subspace processing approach is the following:
1. Estimate the seismic arrival delays \hat{\tau}_i using the subspace processing algorithms described in Section 7.2, and obtain the velocity and zero-offset estimates \hat{v} and \hat{T}_0 using equation (7.28).
2. Construct a set of delays \tilde{\tau}_i from equation (7.1) using the estimated parameters \hat{v} and \hat{T}_0. (The hyperbolic trajectory passes through all the \tilde{\tau}_i, while it is only a best fit for the \hat{\tau}_i.)
3. Apply a data analysis window (ones inside the window and zeros outside) centered on the times \tilde{\tau}_i to the seismic data (see Figure 7.1).
4. Repeat step 1 on the windowed data for more accurate estimates.
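Step 3 above can be sketched as a zero-one mask that follows the predicted hyperbola. All geometry values (sample rate, offsets, window half-width) are hypothetical:

```python
import numpy as np

# Hyperbolic analysis window: ones in a band around the predicted arrival
# at each trace, zeros elsewhere.
v_hat, T0_hat, dt = 2000.0, 0.1, 0.002       # ft/s, s, s (hypothetical)
x = np.linspace(100.0, 1000.0, 10)           # trace offsets (ft)
n_samp, half_w = 512, 8                      # samples per trace, half-width

t_i = np.sqrt(T0_hat**2 + (x / v_hat) ** 2)  # predicted arrivals, eq (7.1)
idx = np.round(t_i / dt).astype(int)
W = np.zeros((n_samp, x.size))
for m, c in enumerate(idx):
    lo, hi = max(c - half_w, 0), min(c + half_w + 1, n_samp)
    W[lo:hi, m] = 1.0                        # window column for trace m
# Windowed data would be W * data (elementwise) before re-estimation
```

Because the mask tracks the hyperbola, each windowed trace retains the signal while excluding the noise-only samples that a rectangular window would admit.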
7.4 Performance Analysis
7.4.1

In the noisy case, the A matrix is perturbed, so the estimated parameters are also perturbed. The noise-perturbed A matrix can be expressed as

\tilde{A} = A + N.    (7.29)
The perturbed signal-subspace projector is then

\tilde{\Pi}_s = (A+N)[(A+N)^H(A+N)]^{-1}(A+N)^H = (A+N)[A^H A + A^H N + N^H A + N^H N]^{-1}(A+N)^H.

Neglecting the second-order term and using the matrix inverse lemma, we get

\tilde{\Pi}_s \approx (A+N)[A^H A(I + (A^H A)^{-1}(A^H N + N^H A))]^{-1}(A+N)^H
  \approx (A+N)(I - (A^H A)^{-1}(A^H N + N^H A))(A^H A)^{-1}(A+N)^H
  \approx A(A^H A)^{-1}A^H + A(A^H A)^{-1}N^H + N(A^H A)^{-1}A^H - A(A^H A)^{-1}(A^H N + N^H A)(A^H A)^{-1}A^H.    (7.30)

The first-order projector perturbations are therefore

\Delta\Pi_s = \tilde{\Pi}_s - \Pi_s = \Pi_o N P^H + P N^H \Pi_o,    (7.31)

where

P = A(A^H A)^{-1},    (7.32)

and

\Delta\Pi_o = \tilde{\Pi}_o - \Pi_o = -(\Pi_o N P^H + P N^H \Pi_o).    (7.33)
7.4.2

The noise matrix N used here is a transformation of the noise matrix from the spatial domain. Denote the spatial-domain noise matrix as \tilde{N}. For convenience of the analysis, we assume it to be white Gaussian, so that its covariance matrix is

E(\tilde{N}\tilde{N}^H) = \sigma_n^2 I,

and let

D = \begin{bmatrix} e^{-j\phi_P}/|y(\omega_P)| & & 0 \\ & \ddots & \\ 0 & & e^{-j\phi_Q}/|y(\omega_Q)| \end{bmatrix},    (7.34)

where the interval (P, Q) is the range of the energy band. The covariance matrix of N = DF\tilde{N} then becomes

E(NN^H) = DFE(\tilde{N}\tilde{N}^H)F^H D^H = \sigma_n^2 DFF^H D^H = \sigma_n^2 DD^H.

Here we use the property that the DFT matrix F is unitary.

In the case of the improved subspace approach, the noise matrix in the time domain is weighted by a window matrix W_i at each trace, so that

N = DF(W_1\tilde{n}_1, \ldots, W_M\tilde{n}_M),

where \tilde{n}_i is the noise vector at trace i.
7.4.3

Basically, there are two ways to implement MUSIC and MN for parameter estimation: extrema searching and polynomial rooting. It has been proved that the two methods have the same performance in the sense of the mean-squared error of the estimated parameters (Li, 1990; Li et al., 1990; Li and Vaccaro, 1991c; Li et al., 1993). We first analyze the perturbation of the time-delay estimates (Li et al., 1993) for the MUSIC and MN algorithms. Readers are referred to Li et al. (1993) for details of the development.
7.4.3.1 Extrema Searching: MUSIC and MN

The null spectrum function associated with the MUSIC and MN searching algorithms can be written as

P(\tau) = a(\tau)^H \Pi_o W \Pi_o a(\tau),    (7.35)

where the weighting matrix W equals I for MUSIC and

W = \frac{e_1 e_1^H}{\|\Pi_o e_1\|^4}

for MN. Define e_1 as the vector of all zeros except a 1 in the first position. The time delays can be estimated by searching for the minima of the null spectrum.

The perturbation of the time-delay estimates can be obtained via a first-order expansion of P(\tau_i, \tilde{\Pi}_o):

\Delta\tau_i = -\frac{\partial\Delta P(\tau_i, \Pi_o)/\partial\tau}{\partial^2 P(\tau_i, \Pi_o)/\partial\tau^2},  i = 1, \ldots, M.    (7.36)
Taking the first and second partial derivatives with respect to \tau_i, we easily obtain

\frac{\partial^2 P(\tau_i, \Pi_o)}{\partial\tau^2} = 2a^{(1)}(\tau_i)^H \Pi_o W \Pi_o a^{(1)}(\tau_i)    (7.37)

and

\frac{\partial\Delta P(\tau_i, \Pi_o)}{\partial\tau} = a^{(1)}(\tau_i)^H(\Pi_o + \Delta\Pi_o)(W + \Delta W)(\Pi_o + \Delta\Pi_o)a(\tau_i) + a(\tau_i)^H(\Pi_o + \Delta\Pi_o)(W + \Delta W)(\Pi_o + \Delta\Pi_o)a^{(1)}(\tau_i),    (7.38)

where a^{(1)}(\tau) is the derivative of a(\tau) with respect to \tau. These expressions simplify by using

A^H \Pi_o = 0.    (7.39)
7.4.3.2 Polynomial Rooting: MUSIC and MN

The delay spectrum polynomial can be written in factored form as

P(z) = a(z^{-1})^T \Pi_o W \Pi_o^H a(z) = A\prod_{i=1}^{L-1}(1 - r_i z^{-1})(1 - r_i^* z).    (7.40)
Differentiating with respect to z and perturbing the root r_i gives

\frac{\partial P(z, \tilde{r}_i)}{\partial z}\bigg|_{z=r_i} = -2jA r_i^*\,\mathrm{Im}(\Delta r_i r_i^*)\,G(r_i),    (7.41)

where

G(r_i) = \prod_{j=1, j\ne i}^{L-1}(1 - r_j r_i^{-1})(1 - r_j^* r_i).
Alternatively, we can compute \partial P(z, \tilde{\Pi}_o)/\partial z and evaluate it at z = r_i. The first-order terms yield

\frac{\partial P(z, \tilde{\Pi}_o)}{\partial z}\bigg|_{z=r_i} = -2j r_i^*\,\mathrm{Im}[r_i^* a(r_i)^H \Delta\Pi_o W \Pi_o^H a^{(1)}(r_i)].    (7.42)

Using r_i^* = r_i^{-1} and the angle-root relation given in Tufts et al. (1989),

\Delta\tau_i = C_i\,\mathrm{Im}\frac{\Delta r_i}{r_i} = C_i\,\frac{\mathrm{Im}[r_i^{-1} a(r_i)^H \Delta\Pi_o W \Pi_o^H a^{(1)}(r_i)]}{A\,G(r_i)}.    (7.45)
If we define

\beta_i = \Pi_o W \Pi_o^H a^{(1)}(r_i)\,C_i r_i^{-1},

then (7.45) can be simplified to

\Delta\tau_i = \frac{\mathrm{Im}[e_i^H N^H \beta_i]}{A\,G(r_i)}.    (7.46)

This result can be shown to be the solution to equation (7.40) (see Li, 1990).
7.4.3.3 ESPRIT Algorithm

In the noisy case, the data structure is perturbed, so we solve for a first-order perturbation. We now have

(A_{\downarrow} + N_{\downarrow})(\Phi + \Delta\Phi) = A_{\uparrow} + N_{\uparrow}.

Canceling A_{\downarrow}\Phi = A_{\uparrow} and neglecting the second-order term, we obtain

A_{\downarrow}\Delta\Phi = N_{\uparrow} - N_{\downarrow}\Phi.    (7.47)

Since \Phi is a diagonal matrix, its eigenvectors are the e_i. The first-order perturbation of the eigenvalues of \Phi due to \Delta\Phi is

\Delta\phi_i = e_i^H A_{\downarrow}^{\dagger}(N_{\uparrow} - N_{\downarrow}\Phi)e_i,

which leads to

\Delta\tau_i = \frac{\mathrm{Im}[e_i^H N^H \alpha_i]}{\delta_i},    (7.48)

where C_i = 1/\omega_d,

\alpha_i^H = C_i e_i^H(A_{\downarrow}^{\dagger} - \phi_i A_{\uparrow}^{\dagger}),

and \delta_i = 1.
7.4.3.4 Vector-Wise ESPRIT Algorithm

In the noisy case,

\tilde{\phi}_i = \frac{(a_{\downarrow}(\tau_i) + n_{\downarrow i})^H(a_{\uparrow}(\tau_i) + n_{\uparrow i})}{K - m + 1},    (7.49)

so that, to first order,

\Delta\phi_i = C_i\,\frac{a_{\downarrow}^H(\tau_i)n_{\uparrow i} + n_{\downarrow i}^H a_{\uparrow}(\tau_i)}{K + 1}.    (7.51)

Since a^H(\tau_i)/(K+1) corresponds to a row e_i^H A^{\dagger} and Ne_i = n_i, we can rewrite equation (7.51) as

\Delta\tau_i = \frac{\mathrm{Im}[e_i^H N^H \alpha_i]}{\delta_i},    (7.52)

where

\alpha_i^H = C_i e_i^H(A_{\downarrow}^{\dagger} - \phi_i A_{\uparrow}^{\dagger})

and \delta_i = 1. This is the same as for the matrix ESPRIT algorithm.
7.4.3.5 Mean-Squared Error of Time-Delay Estimation

The elements of the delay-estimate covariance matrix C_\tau are given by

C_\tau(i,j) = E(\Delta\tau_i\Delta\tau_j) = \frac{E\{[e_i^H N^H \alpha_i][e_j^H N^H \alpha_j]^*\}}{\delta_i\delta_j}.    (7.53)

Under the assumption that the noise elements are uncorrelated circular random variables with equal variances \sigma_n^2/2 (for the real and imaginary parts, respectively), then (see Li and Vaccaro, 1991)

C_\tau(i,j) = \frac{(\alpha_i^H DD^H \alpha_j)\sigma_n^2}{2\delta_i\delta_j}\,\delta(i,j).    (7.54)
For the improved (windowed) subspace approach, the corresponding result is

C_\tau(i,j) = \frac{(\alpha_i^H DW_i W_j^H D^H \alpha_j)\sigma_n^2}{2\delta_i\delta_j}\,\delta(i,j).    (7.55)

7.4.4

Perturbing the delays in equation (7.27) by \Delta\tau_i perturbs the parameter estimates:

\begin{bmatrix} x_1^2 & -2(\tau_1 + \Delta\tau_1) \\ \vdots & \vdots \\ x_M^2 & -2(\tau_M + \Delta\tau_M) \end{bmatrix}\begin{bmatrix} 1/(v + \Delta v)^2 \\ T_0 + \Delta T_0 \end{bmatrix} = \begin{bmatrix} (\tau_1 + \Delta\tau_1)^2 \\ \vdots \\ (\tau_M + \Delta\tau_M)^2 \end{bmatrix}.    (7.56)

Let us define

x = (x_1^2, x_2^2, \ldots, x_M^2)^T  and  \tau = (\tau_1, \tau_2, \ldots, \tau_M)^T,

and write the normal equations of (7.56) as (B_1 + B_2)\tilde{\theta} = b_1 + b_2, where B_1\theta = b_1 are the unperturbed normal equations of (7.27) and B_2 and b_2 are first order in \Delta\tau (explicit forms are given in the appendix).    (7.57)
To first order,

\tilde{\theta} \approx (I - B_1^{-1}B_2)B_1^{-1}(b_1 + b_2) \approx B_1^{-1}b_1 - B_1^{-1}B_2B_1^{-1}b_1 + B_1^{-1}b_2.

The velocity enters through

\frac{1}{(v + \Delta v)^2} \approx \frac{1}{v^2} - \frac{2\Delta v}{v^3}.    (7.58)

Since

\begin{bmatrix} T_0 \\ 1/v^2 \end{bmatrix} = B_1^{-1}b_1,    (7.59)

this becomes

\begin{bmatrix} \Delta T_0 \\ \Delta v \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -0.5v^3 \end{bmatrix}(\tilde{\theta} - B_1^{-1}b_1) \stackrel{\mathrm{def}}{=} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}\Delta\tau,    (7.60)

where the row vectors d_1 and d_2 are given in the appendix. Now we have the mean-squared error of the estimated parameters as

E(\Delta T_0)^2 = d_1 C_\tau d_1^H    (7.61)

and
E(\Delta v)^2 = d_2 C_\tau d_2^H.    (7.62)

7.5 Simulations
The simulations use a single wavefront with wavelet

w(t) = 2e^{-(2\pi f)^2 t^2/2}.    (7.63)

It arrives at the first sensor at T_0 = 0.1 s with a velocity of 2000 ft/s (600 m/s); 501 data points are sampled uniformly from each trace, and the signal-to-noise ratio is 20 dB.
Twenty-one frequency-domain data points (Q - P = 20) are used from each trace within the main lobe of the wavelet (30 Hz to 50 Hz) with \Delta f = 1 Hz. Figures 7.3-7.5 show the velocity and zero-offset time estimates from twenty trials using the MUSIC, MN, and ESPRIT algorithms, respectively. Figures 7.7 and 7.8 show the velocity and zero-offset time estimates from twenty trials using the improved MUSIC, MN, and ESPRIT algorithms, respectively. The window width is 0.3 s.

Table 7.1 shows the root-mean-squared error (RMSE) of the velocity and zero-offset time estimates averaged over 100 trials using the MUSIC, MN, and ESPRIT algorithms. Table 7.2 shows the RMSE using the improved MUSIC, MN, and ESPRIT algorithms. Table 7.3 gives the RMSE for the same sample using the semblance and Key's algorithms. The same window used for the improved subspace approach is used for both the semblance and Key's algorithms. The search steps in both algorithms are 0.8 ft/s (20 cm/s) for the velocity variable and 0.0004 s for the time variable.
Table 7.1. RMSE for MUSIC, Minimum Norm, and ESPRIT

RMSE                   MUSIC      MN         ESPRIT
Velocity (ft/s)        10.8298    25.3919    19.2899
Zero-offset time (s)   0.00299    0.00713    0.00856
Figure 7.3.
Table 7.2. RMSE for Improved MUSIC, Minimum Norm, and ESPRIT

RMSE                   MUSIC      MN         ESPRIT
Velocity (ft/s)        10.7464    14.2129    9.6694
Zero-offset time (s)   0.00278    0.00411    0.00267

Table 7.3. RMSE for Semblance and Key's Method

RMSE                   Semblance  Key's Method
Velocity (ft/s)        11.426     15.899
Zero-offset time (s)   0.00856    0.0122
Figure 7.4.
From these results, we can see that the improved subspace processing algorithms outperform the semblance and Key's algorithms. For the original subspace algorithms, MUSIC has the smallest RMSE, followed by ESPRIT; for the improved subspace algorithms, ESPRIT has the smallest RMSE, followed by MUSIC.

Figures 7.9 and 7.10 give the mean-squared errors of the velocity and zero-offset time estimates, in which the lines are theoretical predictions and the discrete symbols are simulation measurements. They show that the simulated results and the theoretical predictions agree very well.
Figure 7.5.
7.6 Conclusion
Figure 7.6.
as noise. However, an estimation scheme for the parameters of multiple wavefronts will be useful and worth pursuing in the future.
7.7 References
Goldstein, P., and Archuleta, R. J., 1987, Array analysis of seismic signals:
Geophys. Res. Lett., 14, 13-16.
Goldstein, P., and Archuleta, R. J., 1991, Deterministic frequency-wavenumber methods and direct measurements of rupture propagation during earthquakes using a dense array: Theory and methods: J. Geophys. Res., 96, 6173-6185.
Key, S. C., and Smithson, S. D., 1990, New approach to seismic-reflection event detection and velocity determination: Geophysics, 55, 1057-1069.
Kirlin, R. L., 1992, The relationship between semblance and eigenstructure velocity estimators: Geophysics, 57, 1027-1033.
Figure 7.7.
Figure 7.8.
Figure 7.9.
Figure 7.10.
Proc., 2613-2616.
Li, F., and Vaccaro, R. J., 1991a, Performance degradation of DOA estimation due to unknown noise fields: Proc. IEEE Internat. Conf. on Acoust., Speech and Sig. Proc., 1413-1416.
Li, F., and Vaccaro, R. J., 1991b, On frequency-wavenumber estimation by state-space realization: IEEE Trans. on Circuits and Systems, 38, 800-804.
Li, F., and Vaccaro, R. J., 1991c, Unified analysis for DOA estimation algorithms in array signal processing: Signal Processing, 22, 147-169.
Li, F., Vaccaro, R. J., and Tufts, D. W., 1990, Unified performance analysis of subspace-based estimation algorithms: Proc. IEEE Internat. Conf. on Acoust., Speech and Sig. Proc., 2575-2578.
Neidell, N. S., and Taner, M. T., 1971, Semblance and other coherency measures for multichannel data: Geophysics, 36, 482-497.
Paulraj, A., Roy, R., and Kailath, T., 1985, Estimation of signal parameters via rotational invariance techniques - ESPRIT: Proc. 19th Asilomar Conf. on Signals, Systems and Computers, 83-89.
Schmidt, R. O., 1979, Multiple emitter location and signal parameter estimation: Proc. RADC Spectral Estimation Workshop, 243-258.
7.8 Appendix

With x = (x_1^2, \ldots, x_M^2)^T, \tau = (\tau_1, \ldots, \tau_M)^T, and \theta = B_1^{-1}b_1 = (a_1, a_2)^T = (T_0, 1/v^2)^T, the first-order perturbations of the normal equations of (7.27) are

B_2 = 2\begin{bmatrix} 0 & -x^H\Delta\tau \\ -\Delta\tau^H x & 4\tau^H\Delta\tau \end{bmatrix}  and  b_2 = \begin{bmatrix} 2(x\circ\tau)^H\Delta\tau \\ -6(\tau\circ\tau)^H\Delta\tau \end{bmatrix},    (7.64)

where \circ denotes the elementwise product. Substituting into the first-order expansion preceding (7.58) and applying (7.60),

\begin{bmatrix} \Delta T_0 \\ \Delta v \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -0.5v^3 \end{bmatrix} B_1^{-1}\begin{bmatrix} 2(x\circ\tau)^H + 2a_2 x^H \\ -6(\tau\circ\tau)^H + 2a_1 x^H - 8a_2\tau^H \end{bmatrix}\Delta\tau \stackrel{\mathrm{def}}{=} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}\Delta\tau.    (7.65)
Chapter 8
Enhanced Covariance Estimation with
Application to the Velocity Spectrum
R. Lynn Kirlin
Seismic reflections from an interface are relatively short in time; they are
transient. Because of this fact, estimation of the covariance matrix is accomplished by the use of either time-slice vectors (across traces) or Fourier transform value vectors. Although we might have a dozen or so time-slice vectors,
this is generally not adequate (a few hundred would be nice, and appropriate for temporally stationary signals). In the frequency domain, the short reflection affords only one Fourier transform. Thus, the multiplicity of vector samples must be found in another way if possible.
As always with a finite data set, we may exchange spatial resolution for statistical stability. The exchange is effected in two ways: break the array into spatially offset subarrays or, in the frequency domain, utilize the correlated
variations of transform values among distinct (usually neighboring) frequencies. Both of these schemes are usually implemented with a sliding window,
either in space or frequency, as appropriate. In this chapter, I examine these
possibilities for enhancing covariance estimation and determine their applicability to seismic array processing.
In practice, the data covariance matrix,

R_x = E\{xx^H\} - E\{x\}E\{x\}^H,    (8.1)

is not known, and usually it must be estimated with the finite data available. Generally, it is known that E\{x\} = 0; thus, the estimator of R_x is as in equation (2.1):
C_x = \hat{R}_x = \frac{1}{N-1}\sum_{i=1}^{N} x_i x_i^H.    (8.2)
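The sample covariance estimator of equation (8.2) is a one-liner in practice; the sketch below uses synthetic zero-mean data (sizes hypothetical):

```python
import numpy as np

# Sample covariance of N zero-mean data vectors, eq (8.2).
rng = np.random.default_rng(3)
M, N = 4, 100
X = rng.standard_normal((M, N))      # columns are the N data vectors x_i
Cx = X @ X.T / (N - 1)               # M x M sample covariance
```

For real Gaussian data these elements are Wishart distributed, as noted below; for complex data, replace the transpose with a conjugate transpose.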
For real Gaussian data, the elements of Cx are distributed Wishart, and for
complex data they are complex Wishart, as discussed in Chapter 2. Also in
Chapter 2, I described estimation in the maximum likelihood sense and
robust estimation for contaminated Gaussian data.
Different forms of a priori knowledge help improve covariance estimation. The primary case is when the data are known to be a sum of signals, each having a plane wavefront, and the array is uniformly spaced and linear. However, I will describe a method for dealing with hyperbolically curved wavefronts in Section 8.4. Equivalently, we could consider vectors from a time sequence composed of sinusoids. In either case, all noise samples are considered independent, white, stationary, zero-mean Gaussian random variables with variance \sigma_n^2. Note that not all of the following analyses will apply directly to hyperbolically curved wavefronts.
8.1 Spatial Smoothing
Under the conditions assumed above, it is easy to imagine that the covariance matrices of data from different subarrays of m adjacent sensors should be identical. See Figure 8.1 for the general concept of subarray covariance matrices related to the whole-array covariance matrix. However, in general this is not true, because temporal coherence among the signal sources causes spatial nonstationarity. Yet, by properly combining data from different subarrays coherently, a more stable estimate of the m \times m covariance matrix results. For any single plane-wave narrowband signal arrival, the delay between any two sensors at positions i_1 = i_0 and i_2 = i_0 + q is indicated by the same factor \exp\{-j2\pi fq\Delta\} (where \Delta is the delay between adjacent sensors), regardless of i_0. Thus, we may coherently combine narrowband, analytic signals from sensors i_1 and i_2 if we shift the phase of the signal at i_2 by multiplying the analytic signals by the factor \exp\{j2\pi fq\Delta\}. This is easily effected for each sensor in one subarray spaced q positions from each sensor in another subarray.
Thus let x1(i) be a time-slice vector taken from a reference subarray at time
i and let xq(i) be the corresponding time-slice vector taken from an array q
sensor spacings removed. Then if the covariance matrix associated with x1(i) is
R_1 and that with x_q(i) is R_q, then for a multiple-source signal vector s and
noise covariance matrix \sigma_n^2 I,

R_1 = A E{s s^H} A^H + \sigma_n^2 I = E{x_1(i) x_1^H(i)}    (8.3)

and

R_q = A D_q E{s s^H} D_q^H A^H + \sigma_n^2 I = E{x_q(i) x_q^H(i)}.    (8.4)

The steered subarray vector

\tilde{x}_1(i) = T_q x_q(i),    (8.5)

where T_q is a diagonal matrix with elements exp{j 2\pi f q \Delta} and \Delta is the intersensor delay for the trial direction in the search vector of the direction-finding
algorithm, i.e.,

\Delta = d \sin\theta / v,    (8.6)

has the same covariance as x_1(i). Therefore the qth subarray's steered sample covariance

\hat{R}_1^{(q)} = (1/N) \sum_{i=1}^{N} T_q x_q(i) x_q^H(i) T_q^H = T_q C_q T_q^H    (8.7)

is also an estimate of R_1.
Suppose that we have an M-sensor array; then equation (8.7) implies that
each subarray of m elements can be used to estimate the covariance matrix
R_1. There are M - m + 1 of these; therefore the spatially smoothed covariance estimate is

C^o_1 = (1/(M - m + 1)) \sum_{q=0}^{M-m} T_q C_q T_q^H .    (8.8)
8.2
Partition the covariance matrix of the whole array into subarray blocks,

R = [ R_11  R_12 ]
    [ R_21  R_22 ] .    (8.9)
vector by another random vector; that is, we want to find a linear transformation f, given by

\hat{x}_1 = f(x_2) = A x_2 + x_0 ,    (8.10)

such that

J = E{ || x_1 - \hat{x}_1 ||^2 } = E{ || x_1 - (A x_2 + x_0) ||^2 }    (8.11)

is minimized, where x_0 \in C^M and A \in C^{M x N} is an M x N transformation
matrix. The optimal linear transformation to minimize J is easily shown to
give

A = R_12 R_22^{-1}    (8.12)

and

x_0 = \mu_1 - R_12 R_22^{-1} \mu_2 ,    (8.13)

so that

\hat{x}_1 = \mu_1 + R_12 R_22^{-1} (x_2 - \mu_2) ,    (8.14)

where

Cov(\hat{x}_1) = R_12 R_22^{-1} R_21 .    (8.15)
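Equations (8.12) and (8.15) can be checked numerically; the joint covariance below is randomly generated for the sketch, and the means are taken as zero, so x_0 = 0:

```python
import numpy as np

rng = np.random.default_rng(1)

# A random valid joint covariance for the stacked vector [x1; x2],
# partitioned as in equation (8.9).
M = 3
B = rng.standard_normal((2 * M, 2 * M))
R = B @ B.T
R11, R12 = R[:M, :M], R[:M, M:]
R21, R22 = R[M:, :M], R[M:, M:]

# Optimal predictor of x1 from x2, equation (8.12) (zero means, x0 = 0):
A = R12 @ np.linalg.inv(R22)

# Covariance of the prediction, equation (8.15):
Cov_pred = A @ R22 @ A.T
```

The error covariance R11 - Cov_pred is the Schur complement of R22 and is positive semidefinite, confirming that the predictor can only remove variance.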
the concern. Note that the above proposition not only provides the optimal
prediction of the random vector x_1 from the random vector x_2, but also
implies that the autocovariance matrix R_11 of x_1 can be predicted from x_2,
since

\hat{R}_{p11} = Cov(\hat{x}_1) = R_12 R_22^{-1} R_21 .    (8.16)

The improved estimate averages the two:

\bar{R}_{11} = (1/2)(\hat{R}_{11} + \hat{R}_{p11}) .    (8.17)

The argument behind the averaging operation in the example of array processing, where x_1 and x_2 are from different subarrays and received wavefronts
are spatially stationary, is that both \hat{R}_{11} and \hat{R}_{p11} are noisy estimates of R_11.
These can be written

\hat{R}_{11} = R_11 + N_1    (8.18)

and

\hat{R}_{p11} = R_11 + N_2 ,    (8.19)

where N_1 and N_2 are the perturbations due to finite averaging. Equation (8.17)
can then be rewritten as

\bar{R}_{11} = R_11 + (1/2)(N_1 + N_2) .    (8.20)
A weighted version of the average uses

\bar{R}_{11} = (1/2)(\hat{R}_{11} + w_{12}^q \hat{R}_{p11}) ,    (8.21)

with weighting factor

w_{12} = Tr{ R_12 R_22^{-1} R_21 } / Tr{ R_11 } ,    (8.22)

where q is any positive number. When the two traces are equal, w_12 has a value
of one; and when the two random vectors are uncorrelated, with
R_12 = R_21 = 0, w_12 has a value of zero.
8.3
Consider two congruent subarrays whose sensor locations are related by

z_i' = T z_i + z_0 ,   i = 1, 2, . . . , M .    (8.23)
The output of the kth subarray is

x_k(t) = \sum_{i=1}^{d} s_i(t) e^{j\phi_i} a_k(\theta_i) + n(t) ,   k = 1, 2 ,    (8.24)

where \phi_i is the initial phase of the ith signal, n(t) \in C^M is the additive noise
vector, and a_k(\theta_i) is the steering vector of the kth subarray for the ith signal,
which is given by

a_k(\theta_i) = [ e^{j\omega_0 \theta_i^T z_1^{(k)}/c}, e^{j\omega_0 \theta_i^T z_2^{(k)}/c}, . . . , e^{j\omega_0 \theta_i^T z_M^{(k)}/c} ]^T ,    (8.25)

with z_m^{(k)} the position of the mth sensor of the kth subarray, \theta_i the unit vector
in the ith signal's direction of arrival, c the propagation velocity of the signals,
\omega_0 the signal-band center frequency, and T the transpose operation. Using
matrix-vector notation, the kth subarray output vector can be presented as follows:

x_k(t) = A_k s(t) + n(t) ,   for k = 1, 2 ,    (8.26)

where

s(t) = [ s_1(t) e^{j\phi_1}, s_2(t) e^{j\phi_2}, . . . , s_d(t) e^{j\phi_d} ]^T

and

A_k = [ a_k(\theta_1), a_k(\theta_2), . . . , a_k(\theta_d) ] .

The additive noise is assumed to be a stationary zero-mean random process that is temporally and spatially white and uncorrelated with the signals.
With this assumption we get the cross-covariance matrix of the subarray outputs,

R_ij = E[ x_i x_j^H ] = A_i S A_j^H + \sigma_n^2 I \delta_ij   for i, j = 1, 2 ,    (8.27)

where S = E[ s s^H ] is the signal covariance matrix.
Substituting equation (8.27) into equation (8.16) gives

\hat{R}_{p11} = A_1 S A_2^H R_22^{-1} A_2 S A_1^H = A_1 \tilde{S} A_1^H ,    (8.28)

where \tilde{S} = S A_2^H R_22^{-1} A_2 S is the estimate of the signal covariance matrix.
8.3.1
In the application to DOA estimation, the structure of the signal covariance matrix is not of concern, except that it must be of full rank for the eigen-type
algorithms to apply. The spatial smoothing techniques will generally change
the structure of the signal covariance matrix and increase its rank. This is in
fact necessary when sources are fully coherent, because then the rank is less than
the source count. From matrix theory, we know that the rank of the product
of two matrices is always less than or equal to the lower rank of the two matrices, or

rank(AB) <= min[ rank(A), rank(B) ] .    (8.29)
A simpler prediction omits the inverse factor:

\hat{R}_{p11} = w_{12} R_12 R_21 ,    (8.30)

where w_12 is the weighting factor. The function of w_12 is twofold: it normalizes \hat{R}_{p11} to cancel the scaling effects arising from neglecting R_22^{-1}, and it reduces
noise propagation from the second subarray when its outputs have very low
S/N. Similar to the definition of w_12 in equation (8.22), we can define w_12 by

w_{12} = Tr{ R_12 R_21 } / Tr{ R_11 R_22 } .    (8.31)
With this weighting,

\hat{R}_{p11} ≈ R_11 R_11 .

The above result implies that R_12 R_21 can be approximately regarded as the
prediction of R_11 R_11, the square of covariance matrix R_11, which contains the
same DOA information as R_11. Omitting R_22^{-1} in the expression of \hat{R}_{p11} not
only reduces the computational requirements but also avoids the numerical
errors introduced by the matrix inversion operation. Note again that in the above
discussion we assume that the true crosscorrelations are known, for convenience of formulation. If only finite observations are available, the improved
estimate of R_11 is given by

\bar{R}_{11} = (1/2)(\hat{R}_{11} + \hat{R}_{p11}) ,    (8.32)

where \hat{R}_{p11} = w_{12} \hat{R}_{12} \hat{R}_{21} and \hat{R}_{ij} is the estimated cross-covariance matrix.
Similarly, the improved estimate of R_22 is given by

\bar{R}_{22} = (1/2)(\hat{R}_{22} + \hat{R}_{p22}) .    (8.33)
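A sketch of the improvement in equations (8.32)-(8.35), generalized to L subarrays with unit weights w_ij = 1; the white-noise snapshots are invented purely to exercise the bookkeeping:

```python
import numpy as np

def improved_subarray_cov(R_hat, i):
    """Equations (8.34)-(8.35) with w_ij = 1: predict subarray i's
    covariance from the cross-covariances with the other subarrays,
    then average the prediction with subarray i's own estimate.
    R_hat[i][j] is the estimated cross-covariance of subarrays i, j."""
    L = len(R_hat)
    pred = sum(R_hat[i][j] @ R_hat[j][i] for j in range(L) if j != i) / (L - 1)
    return 0.5 * (R_hat[i][i] + pred)

rng = np.random.default_rng(2)
m, L, N = 3, 4, 200
X = [rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))
     for _ in range(L)]
R_hat = [[Xi @ Xj.conj().T / N for Xj in X] for Xi in X]
R_bar = improved_subarray_cov(R_hat, 0)
```

Because R_hat[j][i] is the conjugate transpose of R_hat[i][j], each product R_ij R_ji is Hermitian, so the improved estimate stays Hermitian.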
Although we have only considered the case of two subarrays, the above
results apply to more than two subarrays. For example, if there are L subarrays,
the predicted covariance matrix of the ith subarray from the crosscorrelations
will be

\hat{R}_{pii} = (1/(L - 1)) \sum_{j=1, j≠i}^{L} w_{ij} \hat{R}_{ij} \hat{R}_{ji} ,    (8.34)

and \bar{R}_{ii} will be

\bar{R}_{ii} = (1/2)(\hat{R}_{ii} + \hat{R}_{pii}) .    (8.35)

8.3.2
Alternatively, autocorrelation and crosscorrelation terms can be combined with equal treatment:

\bar{R}_{ii} = (1/L) \sum_{j=1}^{L} w_{ij} \hat{R}_{ij} \hat{R}_{ji} ,    (8.36)

where w_{ii} = 1. Improvement using equation (8.36) is then followed by averaging the L improved estimates of the squared subarray covariance matrices.
Thus the recommended forward and forward-backward spatially
smoothed covariance matrices are given by

\bar{R}_f = (1/L^2) \sum_{i=1}^{L} \sum_{j=1}^{L} w_{ij} \hat{R}_{ij} \hat{R}_{ji}    (8.37)

and

\bar{R}_{fb} = (1/(2 L^2)) \sum_{i=1}^{L} \sum_{j=1}^{L} w_{ij} ( \hat{R}_{ij} \hat{R}_{ji} + T \hat{R}_{ij}^* \hat{R}_{ji}^* T ) ,    (8.38)
where T is the exchange matrix

T = [ 0  0  . . .  0  1 ]
    [ 0  0  . . .  1  0 ]
    [        . . .       ]
    [ 1  0  . . .  0  0 ] ,

with ones on the antidiagonal and zeros elsewhere. Thus it is reasonable to
estimate R_ii with (\hat{R}_{ii} + T \hat{R}_{ii}^* T)/2 in place of \hat{R}_{ii}.
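The exchange matrix T and the backward average can be sketched as follows; the rank-1 plane-wave covariance is an invented check case:

```python
import numpy as np

def exchange(m):
    # T: ones on the antidiagonal, zeros elsewhere
    return np.fliplr(np.eye(m))

def forward_backward(R):
    # average R with its backward counterpart T R* T
    T = exchange(R.shape[0])
    return 0.5 * (R + T @ R.conj() @ T)

# A ULA plane-wave covariance is persymmetric: T R* T equals R,
# so forward-backward averaging leaves it unchanged.
m = 5
a = np.exp(1j * np.pi * np.arange(m) * 0.3)
R = np.outer(a, a.conj()) + 0.1 * np.eye(m)
```

On sample covariances, which are only approximately persymmetric, the average suppresses the perturbation components that violate the structure.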
Generally speaking, the outputs of all subarrays have similar S/N values.
Therefore, I am not concerned about the noise-propagation problem introduced by using the crosscorrelations with some random signals of very low
S/N values. For this reason, I usually set w_ij = 1.
There are some redundancies in equations (8.37) and (8.38), since the
crosscorrelations have been exploited more than once. An essential consideration in proposing the above equations is to make the new algorithms capable
of increasing the rank of the signal covariance matrix after spatial smoothing.
It was proved in Du and Kirlin (1991) that when w_ij = 1, the numbers
of subarrays required by algorithms (8.37) and (8.38) are the same as those
required by the conventional algorithms in Shan et al. (1985), Williams et al.
(1988), and Pillai and Kwon (1989).
To illustrate how the conventional methods ignore the crosscorrelation
matrices, consider a method for finding the forward, spatially smoothed covariance matrix using the covariance matrix of the overall array, R \in C^{M x M}, with
M denoting the number of sensors in the overall array. First, we form a band
matrix in R from the 2m - 1 central diagonals, i.e., the elements r_ij with
i, j = 1, . . . , M and |i - j| < m, where m denotes the number of sensors in each
subarray. Then, along the main diagonal of R, an m x m square window slides
down to sample this band matrix, yielding L m x m submatrices, which correspond to the
autocovariance matrices of the L = M - m + 1 subarrays. The average of
these L autocovariance matrices is the forward, spatially smoothed covariance
matrix of the conventional method. This shows that the conventional spatial
smoothing methods ignore the information in R that lies outside the
band matrix (see Figure 8.1).
Figure 8.1. Conventional spatial smoothing method using autocorrelations
of subarrays in the band matrix only. Correlations outside the band matrix,
which correspond to the cross-subarray correlations, are used in the suggested
smoothing.

8.3.3
Simulations
In the simulation, we use a uniform linear array of 16 sensors; the wavenumber is defined as sin θ. Three coherent signals with equal powers arrive
from bearing angles 10, 14, and 80 degrees, respectively. The S/N,
defined as the signal-to-noise power ratio, is 0 dB; the number of
snapshots is 64; the size of the subarray is 8; and the number of subarrays is 9.
For both the conventional methods and the new methods, we use the forward-backward spatial smoothing scheme to obtain the smoothed covariance matrix
and apply the MUSIC algorithm to find the power spectra. Five independent
runs using the conventional forward-backward spatial smoothing method are
plotted in Figure 8.2. This figure shows that the two signals at the bearing
angles of 10 and 14 degrees are not resolved. Using the same sets of data, the
results for the improved forward-backward spatial smoothing method are
shown in Figure 8.3, where resolution of the two signals at close bearing
angles is achieved. This example illustrates that, by using the proposed
method, a more stable estimate of the covariance matrix can be obtained and,
based on this estimate, the recently developed high-resolution algorithms
will achieve better performance.
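The MUSIC step of this simulation can be sketched at reduced complexity. To keep the example deterministic, the sketch below uses the exact covariance of three uncorrelated (rather than coherent) equal-power sources on the 16-sensor half-wavelength array, so no smoothing is needed; it is meant only to show the noise-subspace search:

```python
import numpy as np

M = 16                                      # sensors, half-wavelength ULA
true_angles = np.array([10.0, 14.0, 80.0])  # bearings, degrees

def steer(theta_deg):
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

A = np.column_stack([steer(t) for t in true_angles])
R = A @ A.conj().T + 0.1 * np.eye(M)        # uncorrelated, equal powers

# MUSIC: steering vectors of true arrivals are orthogonal to the noise
# subspace (the M - 3 eigenvectors with the smallest eigenvalues).
lam, V = np.linalg.eigh(R)                  # eigenvalues in ascending order
En = V[:, :M - 3]
grid = np.arange(0.0, 90.0, 0.1)
P = np.array([1.0 / np.linalg.norm(En.conj().T @ steer(g)) ** 2
              for g in grid])
peaks = np.sort(grid[np.argsort(P)[-3:]])   # three largest spectrum values
```

With coherent sources, as in the text's experiment, the smoothing of equations (8.37)-(8.38) would be applied to R before the eigendecomposition.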
8.4
To some extent it is safe to say that velocity analysis is equivalent to
DOA estimation and spectral estimation. However, high-resolution spectral
estimators have had limited success in this area, since seismic reflections do not
satisfy some important assumptions on which these estimators are
based. Rather, seismic reflections have nonplanar (hyperbolic) wavefronts and
are temporally transient. Therefore, they are neither temporally nor spatially
stationary.
The above features of seismic signals deserve special consideration when
applying modern array-processing techniques.
Figure 8.2.
Figure 8.3.
The hyperbolic model of reflection time versus offset provides a means for
establishing the necessary velocity relationships. Based on the hyperbolic
model, several methods have been developed for this purpose (Robinson and
Treitel, 1980; Neidell and Taner, 1971). Semblance is perhaps the most
widely used velocity estimation method, as outlined in the following subsection.
8.4.1
Semblance Review
S_c = (u^H R u) / (M Tr(R)) ,    (8.39)

where u is the M-element vector of ones.
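Equation (8.39) can be sketched directly; the two covariance matrices below are invented limiting cases:

```python
import numpy as np

def semblance(R):
    """Equation (8.39): S_c = (u^H R u) / (M * tr(R)), with u = ones."""
    M = R.shape[0]
    u = np.ones(M)
    return float(np.real(u @ R @ u)) / (M * float(np.real(np.trace(R))))

# Identical traces give semblance 1; independent traces give about 1/M.
N, M = 500, 8
rng = np.random.default_rng(3)
t = rng.standard_normal(N)
X_same = np.tile(t, (M, 1))            # M copies of one trace
X_ind = rng.standard_normal((M, N))    # M independent traces
R_same = X_same @ X_same.T / N
R_ind = X_ind @ X_ind.T / N
```

This makes explicit that semblance is the ratio of stacked power to total power, bounded between 0 and 1.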
8.4.2
Figure 8.4.
P_BF(θ) = ( a^H(θ) R a(θ) ) / ( a^H(θ) a(θ) ) ,    (8.40)

where, for a uniform linear array,

a(θ) = [ 1, e^{j \omega_0 d \sinθ / c}, . . . , e^{j (M-1) \omega_0 d \sinθ / c} ]^T ,

and d, θ, \omega_0, and c denote, respectively, the sensor spacing, steering direction, center frequency of the signal, and the propagation speed of the wave in
the medium. The locations of the peaks of the spectrum are interpreted as the
estimates of DOA. We can rewrite the above equation as

P_BF(θ) = ( u^H R(θ) u ) / M ,    (8.41)

where u is again the vector of ones,

R(θ) = T(θ) R T^H(θ) ,

and

T(θ) = diag{ 1, e^{-j \omega_0 d \sinθ / c}, . . . , e^{-j (M-1) \omega_0 d \sinθ / c} } .
The covariance matrix steering operation can also be implemented in the
time domain by adding appropriate delays to different traces.
8.4.3
The steered covariance matrix R_M(t_0, v_s) of the whole array is spatially smoothed over its subarrays:

\bar{R}(t_0, v_s) = (1/(M - m + 1)) \sum_{i=0}^{M-m} I_{m,M}(i) R_M(t_0, v_s) I_{m,M}^T(i)    (8.42)

and

\bar{R}^2(t_0, v_s) = (1/(M - m + 1)) \sum_{i=0}^{M-m} [ I_{m,M}(i) R_M(t_0, v_s) I_{m,M}^T(i) ]^2 ,    (8.43)

where I_{m,M}(i) = [ 0_{m x i}  I_m  0_{m x (M-m-i)} ] selects the ith subarray.
u^H w = 1 ,    (8.44)

where w is the weight vector of the beamformer. If this is the only constraint
used, we obtain the optimal weights

w = R^{-1}(t_0, v_s) u / ( u^H R^{-1}(t_0, v_s) u )    (8.45)

and the corresponding array output power

P(t_0, v_s) = w^H R(t_0, v_s) w = 1 / ( u^H R^{-1}(t_0, v_s) u ) .    (8.46)
R(t_0, v_s) in the above two equations can be replaced with \bar{R}^2(t_0, v_s). We use
the array output power to indicate the degree of match between the data and
the preassumed signal, or the confidence level of the hypothesis test. If information regarding interferences is available (e.g., in interactive seismic-processing software, strong interferences are visible and can be approximately
measured), more constraints can readily be added to the constraint matrix
so that the new velocity estimator has zero response to these known interferences. This feature makes the new estimator especially suitable for interactive
seismic-processing software packages.
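The single-constraint weights of equation (8.45) and the output power of equation (8.46) can be sketched as follows; the covariance matrix is invented, and the constraint vector u is all ones, as in the stacking beamformer:

```python
import numpy as np

def mv_weights(R, u):
    """Equation (8.45): w = R^{-1} u / (u^H R^{-1} u)."""
    Ri_u = np.linalg.solve(R, u)
    return Ri_u / (u.conj() @ Ri_u)

def mv_power(R, u):
    """Equation (8.46): output power w^H R w = 1 / (u^H R^{-1} u)."""
    return 1.0 / float(np.real(u.conj() @ np.linalg.solve(R, u)))

rng = np.random.default_rng(4)
M = 6
B = rng.standard_normal((M, M))
R = B @ B.T + M * np.eye(M)    # a well-conditioned steered covariance
u = np.ones(M)
w = mv_weights(R, u)
```

Additional nulling constraints would replace u by a constraint matrix and the scalar normalization by a small matrix inverse, in the usual LCMV fashion.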
8.4.4
As Neidell and Taner (1971) argued, noise present on the data channels
affects coherency measures primarily through the apparent amplitude and
shape diversity it creates. The precise character of the effects depends on the
noise statistics and the signal-noise interactions. Therefore, it is reasonable to
perform an experiment on noise-free data to establish in some sense the discrimination or resolution of semblance and the optimal velocity estimator.
Figure 8.5 depicts a synthetic CMP gather. These data are specially designed
to test the resolving power and discrimination threshold of candidate coherency measures. Each region indicated in the data contains a doublet; the
pulses are separated by 20 ms in two-way time, and the rms velocities differ by
200 ft/s (60 m/s). All events are Ricker wavelets with a dominant frequency of
40 Hz. Since the objective of this computation is a resolution test rather than
the modeling of physical reality, the time separations and velocity increments
for the doublets have been chosen so that the trajectories tend to cross. The
sensor array used is a uniform linear array of 32 sensors with sensor spacing
200 ft (60 m). The data are recorded from 0.8 s to 2.8 s, and the sampling
period is 2 ms. All calculations use a 10-ms time step and a 48-ms time gate,
but different coherency measures. Contoured results will be presented for final
comparisons. Table 8.1 summarizes the data characteristics. Figures 8.6 and
8.7 show the velocity spectra computed by the semblance method and by the
new method with conventional spatial smoothing. The final size of the
covariance matrix for the new method is 20. Since this is a noise-free case, the
improvement of the spatial smoothing is not necessary. In each contour plot
the coherency measure has been normalized to have unity peaks. Ideally the
contour centers should be located at the correct events, parameterized by the
two-way traveltime t_0 and the velocity v. The conventional semblance method
does not have clear contours at the correct locations, but the new method
does. Thus we conclude that the new method has better resolution and
parameter-identification properties than semblance.
Table 8.1. Parameters for data in Figure 8.5.

Event no.   t0 (10^2 ms)   vs (10^3 ft/s)
1           9              8
2           9.2            8.2
3           15             9
4           15.2           9.2
5           22             10
6           22.2           10.2
Figure 8.5.
Figure 8.6.
Figure 8.7. Contour plot of the velocity spectrum using the new velocity estimator with conventional spatial smoothing (clean data).
The semblance may be rescaled (equation (8.47)) such that the modified semblance \tilde{S}_c has the same range as the new estimator.
The contour plot for the modified semblance is given in Figure 8.8. This figure shows that either the resolution of the three doublets is not achieved or the
estimates are strongly biased. This example shows that the performance of
the new method is credible.
Since the new velocity estimator involves a matrix inversion, it requires
more computation than semblance. When the covariance matrix is
Figure 8.8.
8.4.5
Discussion
We have applied spatial smoothing to rms velocity estimation. Conventional semblance was found to be a scaled conventional beamformer. We
proposed an optimal velocity estimator based on LCMV beamformers, using
spatial smoothing and enhanced spatial smoothing to improve the estimate of
the steered covariance matrix. Comparison of conventional semblance and the
optimal velocity estimator shows that the latter performs much better in
discriminating between close events.
8.5
C_x^{(p)} = \sum_{k=1}^{p} \sigma_k u_k v_k^H ≡ F_{(p)}{ C_x } ,    (8.48)

where \sigma_k are the p largest (assumed distinct) singular values of C_x, and u_k and
v_k are the corresponding left and right singular vectors (see Chapter 3). This
mapping yields the rank-p matrix closest to C_x in the Frobenius-norm sense
[the sum of squared magnitudes of the differences of C_x(i, k) and C_x^{(p)}(i, k) over all i, k].
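A sketch of the mapping F_(p) of equation (8.48) via the SVD; the test matrix is randomly generated:

```python
import numpy as np

def best_rank_p(Cx, p):
    """Equation (8.48): truncate the SVD to the p largest singular values.
    By the Eckart-Young theorem this is the closest rank-p matrix to Cx
    in the Frobenius norm."""
    U, s, Vh = np.linalg.svd(Cx)
    return (U[:, :p] * s[:p]) @ Vh[:p, :]

rng = np.random.default_rng(5)
C = rng.standard_normal((6, 6))
C2 = best_rank_p(C, 2)
C1 = best_rank_p(C, 1)
```

Raising p can only reduce (never increase) the Frobenius approximation error, which the test below checks for p = 1 versus p = 2.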
Similarly, retaining only the positive eigenvalues of a Hermitian C_x,

C_x^{(+)} = \sum_{\lambda_k > 0} \lambda_k u_k u_k^H ≡ F_{(+)}{ C_x } ,    (8.49)

yields the closest positive-semidefinite matrix.
A Hermitian-Toeplitz approximation may also be defined. For example, with M = 3,

C_x^{(T)} = [ \alpha_1    \alpha_2    \alpha_3  ]
            [ \alpha_2^*  \alpha_1    \alpha_2  ] ≡ F_{(T)}{ C_x } ,    (8.50)
            [ \alpha_3^*  \alpha_2^*  \alpha_1  ]

where \alpha_1 is real and \alpha_2 = \alpha_{2R} + j\alpha_{2I} and \alpha_3 = \alpha_{3R} + j\alpha_{3I} are complex. Stacking the real parameters in \alpha = [\alpha_1, \alpha_{2R}, \alpha_{2I}, \alpha_{3R}, \alpha_{3I}]^T, we can write
vec(C_x^{(T)}) = A\alpha,
where the columns of

A = [ 1  0   0  0   0 ]
    [ 0  1  -j  0   0 ]
    [ 0  0   0  1  -j ]
    [ 0  1   j  0   0 ]
    [ 1  0   0  0   0 ]
    [ 0  1  -j  0   0 ]
    [ 0  0   0  1   j ]
    [ 0  1   j  0   0 ]
    [ 1  0   0  0   0 ]    (8.51)

map the parameter vector \alpha to the stacked (column-ordered) elements of C_x^{(T)}. The least-squares Hermitian-Toeplitz approximation to C_x is then the projection

vec(C_x^{(T)}) = A (A^H A)^{-1} A^H vec(C_x) ≡ F_{(T)}{ C_x } ,    (8.52)

which simply averages C_x along each diagonal.
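The projection of equation (8.52) reduces to diagonal averaging, which the sketch below implements directly (assuming the input is a Hermitian sample covariance, or taking its Hermitian part first):

```python
import numpy as np

def toeplitz_project(C):
    """F_(T){C}: closest Hermitian-Toeplitz matrix in the Frobenius norm,
    formed by averaging the (Hermitized) matrix along each diagonal --
    the action of the projection A (A^H A)^{-1} A^H in equation (8.52)."""
    n = C.shape[0]
    Ch = 0.5 * (C + C.conj().T)               # Hermitian part first
    alpha = [np.diagonal(Ch, k).mean() for k in range(n)]
    T = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            T[i, j] = alpha[j - i] if j >= i else np.conj(alpha[i - j])
    return T

rng = np.random.default_rng(6)
C = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Ct = toeplitz_project(C)
```

Since this is an orthogonal projection, applying it twice changes nothing, which is a convenient sanity check.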
Lastly, we often wish to find the matrix C_x^{(p)} that is closest to C_x, has rank p,
and has its n - p smallest eigenvalues equal. This was shown in Section
4.2.1 to be given by

C_x^{(p)} = \sum_{k=1}^{p} \lambda_k u_k u_k^H + \sigma_n^2 \sum_{k=p+1}^{n} u_k u_k^H ≡ F_{(p)}{ C_x } ,    (8.53)

where

\sigma_n^2 = (1/(n - p)) \sum_{k=p+1}^{n} \lambda_k .    (8.54)
\hat{C}_x(k) = F_{(T)}{ F_{(p)}{ \hat{C}_x(k - 1) } } ,    (8.55)

where we assume in F_{(p)} that only p positive eigenvalues are used, and in F_{(T)}
that a Hermitian-Toeplitz form is produced.
Some spectral-estimation and bearing-estimation results of the iterative
method are given in Cadzow (1988). It is not claimed that the iteration converges to a maximum-likelihood solution; in fact, it generally would not.
Another difficulty is the selection of the rank p. If p is chosen too large, the
method will faithfully give a rank-p solution when perhaps only p - 1 signals
are truly present. As in most seismic-processing applications, good-sense interpretation is a requirement. However, we note that when p is known or correctly
chosen, just a few iterations give enhanced results.
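A sketch of the iteration (8.55): rank-p truncation of the eigendecomposition alternated with the Hermitian-Toeplitz projection (implemented here as diagonal averaging). The noisy rank-1 example is invented:

```python
import numpy as np

def toep(C):
    # F_(T): average the Hermitian part of C along each diagonal
    n = C.shape[0]
    Ch = 0.5 * (C + C.conj().T)
    a = [np.diagonal(Ch, k).mean() for k in range(n)]
    return np.array([[a[j - i] if j >= i else np.conj(a[i - j])
                      for j in range(n)] for i in range(n)])

def rank_p(C, p):
    # F_(p): keep the p largest eigenvalues of a Hermitian matrix
    lam, V = np.linalg.eigh(C)
    return (V[:, -p:] * lam[-p:]) @ V[:, -p:].conj().T

def cadzow(C, p, iters=50):
    # equation (8.55): alternate the two projections
    C = 0.5 * (C + C.conj().T)    # start from the Hermitian part
    for _ in range(iters):
        C = toep(rank_p(C, p))
    return C

# Denoise a perturbed rank-1 Hermitian-Toeplitz matrix.
n = 6
a = np.exp(1j * 0.4 * np.arange(n))
C0 = np.outer(a, a.conj())                   # rank 1, Hermitian-Toeplitz
rng = np.random.default_rng(7)
noise = 0.05 * rng.standard_normal((n, n))
C_hat = cadzow(C0 + noise, 1)
```

The output is exactly Toeplitz (the Toeplitz projection is applied last) and substantially closer to the clean matrix than the noisy input was.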
8.6
References
Krolik, J., and Swingler, D., 1989, Multiple broad-band source location using
steered covariance matrices: IEEE Trans. Acoust., Speech, and Sig. Proc.,
37, 1481-1494.
Neidell, N. S., and Taner, M. T., 1971, Semblance and other coherency measures for multichannel data: Geophysics, 36, 482-497.
Pillai, S. U., and Kwon, B. H., 1989, Forward/backward spatial smoothing
techniques for coherent signal identification: IEEE Trans. Acoust.,
Speech, and Sig. Proc., 37, 8-15.
Robinson, E. A., and Treitel, S., 1980, Geophysical signal analysis: Prentice-Hall, Inc.
Shan, T. J., Wax, M., and Kailath, T., 1985, On spatial smoothing for direction-of-arrival estimation of coherent signals: IEEE Trans. Acoust.,
Speech, and Sig. Proc., 33, 806-811.
Tufts, D. W., Parthasarathy, S., and Kumaresan, R., 1991, Effect of predictor
order on the accuracy of frequency estimates, in Haykin, S., Ed.,
Advances in spectrum analysis and array processing, 1: Prentice-Hall,
Inc., 114-140.
Van Veen, B. D., and Buckley, K. M., 1988, Beamforming: A versatile
approach to spatial filtering: IEEE Acoust., Speech, and Sig. Proc.
Magazine, 5, 4-24.
Wang, H., and Kaveh, M., 1985, Coherent signal-subspace processing for the
detection and estimation of angles of arrival of multiple wide-band sources:
IEEE Trans. Acoust., Speech, and Sig. Proc., 33, 823-831.
Williams, R. T., Prasad, S., Mahalanabis, A. K., and Sibul, L. H., 1988, An
improved spatial smoothing technique for bearing estimation in a multipath environment: IEEE Trans. Acoust., Speech, and Sig. Proc., 36,
425-432.
Chapter 9
Waveform Reconstruction and Elimination of
Multiples and Other Interferences
R. Lynn Kirlin
Most chapters in this book deal with methods that enhance either the estimation of wavefront phase velocities (directions of arrival) or the covariance
matrices from which those and other parameters are inferred. Occasionally the
exact waveform from a single source or reflector is the desired result. This
brings a few problems into play:
1) Multiple removal, where a reflection from a single boundary appears to
have more than one arrival because it has been caught in a waveguide layer
and transmits part of its energy to the surface with each cycle of reflection
within that layer;
2) Secondary interference removal, where uncontrolled sources outside of the
central seismic experiment cause wavefronts to be superimposed on the
desired reflections;
3) Thin-bed resolution, where features of distinct reflections from the top
and bottom of the bed are to be individually analyzed but the waveforms
overlap considerably in the profile; and
4) Separation of up- and downgoing waves in vertical seismic profiling.
Many techniques have been developed to deal with each of these problems. Generally the idea is to filter out the unwanted waveforms. Often this is
done following transformation of the data within the processing window into
another domain where wavefronts at different velocities may be more easily
distinguished. Those deemed to be interference are nulled out with the appropriate two-dimensional filter, and the remaining data are inverse transformed.
In almost all such schemes, whole bands of f-k or τ-p data are passed or
stopped.
9.1
Signal-Plus-Interference Subspace
We begin with the data model of equation (4.24), using time index i,

x(i) = A s(i) + n(i) ,    (9.1)

where A is an M x r matrix, the columns of which are the delay vectors, one
for each of the total of r source wavefronts. Vector s(i) contains the r source
signals (waveforms), some of which we would like to recover. Lastly, n(i) is a
vector of additive, independent, white Gaussian noise with variance \sigma_n^2 at
each sensor. The r sources may include multipaths; in fact, all seismic reflections from the same source are quite coherent, and thus the elements of the
source-plus-interference (SPI) covariance matrix P_SI indicate such coherence. We
assume zero-mean signals and interferences, so

P_SI = cov(s) = E{ s s^H }    (9.2)

and

R_x = A P_SI A^H + \sigma_n^2 I    (9.3a)
    = V Λ V^H = V_SI Λ_SI V_SI^H + V_n Λ_n V_n^H .    (9.3b)
Following the concepts of Section 4.2, we may assume that the eigenstructure
of R_x allows separation into signal and noise subspaces, giving for the signal-plus-interference eigenvectors the r columns of V_SI and for the associated eigenvalues
the r largest diagonal elements of Λ_SI. When the sources are highly coherent,
not only in frequency but also in space and time, we presume that either spatial smoothing, frequency focusing, or some other process has been applied to
cause the transformed SPI-subspace covariance matrix to have full rank (r). See,
for example, Shan and Kailath (1985). (Note that for curved wavefronts and
partially nonoverlapping reflections, total coherence is not a concern; this is
the case in common-midpoint gathers, other prestack data, and actually most
typical situations.) For further details on dealing with coherent sources see
Chapter 6.
Suppose then that we have identified the SPI subspace and that, by some
high-resolution means, such as those in Chapter 5, we have found the r solution estimates for velocities of the r wavefronts; that is, we have an estimate \hat{A}
of A in equation (9.1). From equation (4.10), we have that the estimate of the
SPI covariance matrix is

\hat{P}_SI = (\hat{A}^H \hat{A})^{-1} \hat{A}^H V_SI \tilde{Λ}_SI V_SI^H \hat{A} (\hat{A}^H \hat{A})^{-1} ,    (9.4)

where

\tilde{Λ}_SI = Λ_SI - \hat{\sigma}_n^2 I .    (9.5)
The remaining M - r dimensions define the noise subspace, wherein all vectors are orthogonal to the SPI subspace.
9.2
The simplest waveform estimate is the unconstrained least-squares (ULS) solution,

\hat{s}_ULS(i) = (\hat{A}^H \hat{A})^{-1} \hat{A}^H x(i) ,    (9.6)

where the r elements of \hat{s}_ULS(i) are the estimates of the r waveforms at time
sample i.
The next most common solution may be the Wiener solution, where we
utilize knowledge of exact or approximated data and signal covariance matrices. This estimate is the minimum-mean-squared-error solution

\hat{s}_W(i) = P_SI A^H R_x^{-1} x(i) .    (9.7)
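The two estimators, equations (9.6) and (9.7), can be sketched side by side on synthetic data (two plane waves with model quantities known exactly; the scene is invented):

```python
import numpy as np

rng = np.random.default_rng(8)
M, r, N = 8, 2, 400
sigma2 = 0.02                                  # noise variance per sensor

# Delay (steering) matrix A for two plane waves on a half-wavelength ULA.
A = np.exp(1j * np.pi * np.outer(np.arange(M), np.sin([0.3, -0.5])))
S = rng.standard_normal((r, N)) + 1j * rng.standard_normal((r, N))
X = A @ S + np.sqrt(sigma2 / 2) * (
    rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))

# Unconstrained least-squares estimate, equation (9.6):
s_uls = np.linalg.solve(A.conj().T @ A, A.conj().T @ X)

# Wiener estimate, equation (9.7), using the model covariances:
P_SI = 2.0 * np.eye(r)                         # E{|s_k|^2} = 2 here
Rx = A @ P_SI @ A.conj().T + sigma2 * np.eye(M)
s_w = P_SI @ A.conj().T @ np.linalg.solve(Rx, X)
```

At this high S/N both estimates recover the source waveforms closely; the differences the text discusses appear at low S/N and short records.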
9.3
Subspace Estimators
Even the Wiener solution does not utilize all possible a priori information;
further, there are other optimality criteria of interest. Because we know that a
wavefront is planar (or hyperbolic), estimation of the appropriate parameters
allows us to make a better estimate of the signal covariance matrix needed in
equation (9.7). In fact, in blind processing we have no a priori estimate of P_SI.
R_x can of course be estimated with the data in the processing window. Further,
without an examination of eigenvalues or singular values of the data, we have
no good guess as to the number of wavefronts present. However, it is sometimes assumed, even realistically, that there is only one strong reflection in the
window.
It has been shown, for example (Ottersten et al., 1989), that when sample
size is not large, use of the low-rank (r) \hat{P}_SI from equation (9.4) in

\hat{R}_x = \hat{A} \hat{P}_SI \hat{A}^H + \hat{\sigma}_n^2 I    (9.8)

improves the waveform estimates. The asymptotic mean-squared errors of the two estimators are

mse_SI = \sigma_n^2 Tr{ P_SI^{1/2} (A^H R_x^{-1} A)^{-1} P_SI^{1/2} }    (9.9)

and

mse_ULS = \sigma_n^2 Tr{ (P_SI A^H A)^{-1} } .    (9.10)

The asymptotic improvement using the low-rank \hat{P}_SI is shown to be
about an order of magnitude for a two-source uncorrelated problem when
S/N = 0 dB. When S/N is higher (10 dB or above), asymptotic mse's are
identical, but either the subspace or ULS estimators converge much faster
than using equation (9.7) without a structured R_x.
9.4
Interference Canceling
In the preceding, we assumed that all sources, both signals and interferences, were to be estimated jointly. However, the purpose of this chapter is to
optimally enhance just one of a number of signals in the presence of others. In
seismic processing, only one wavefront for any one two-way traveltime can be
considered signal, although multipath energies ideally might be combined for
maximum S/N recovery. In the following, I will consider that only one of the
r sources is desired and the others are interferences; thus r_s = 1 and r_I = r - 1.
In many cases, we not only select which wavefront is considered signal but
also resample the data, interpolating after time-shifting it so that the desired (or
C = [ 1  -1   0  . . .  0 ]
    [ 0   1  -1  . . .  0 ]
    [         . . .        ]
    [ 0  . . .  0   1  -1 ] .    (9.11)

After alignment, the transformed data

x_a(i) = C x(i)    (9.12)

have the covariance matrix

R_a = C A_I P_I A_I^H C^H + \sigma_n^2 C C^H ,    (9.13)

and the desired waveform is estimated as a weighted combination,

\hat{s}(i) = w^H x_a(i) .    (9.14)
The minimum-mean-square-error criterion minimizes

J = E{ | s - w^H x_a |^2 } .    (9.15)

The output S/N is

S/N = ( w^H 1 E{ s s^* } 1^T w ) / ( w^H R_a w ) = E{ |s|^2 } |w^H 1|^2 / ( w^H R_a w ) ,    (9.16)

and the minimum-variance criterion solves

min_w  w^H R_a w   subject to   w^H 1 = 1 .    (9.17)

Each solution satisfies

w^H C C^H E_I = 0 ,    (9.18)
(9.18)
which says that the weight vector is orthogonal to the transformed (by C)
interference subspace eigenvectors, columns of EI, where EI is dened by the
generalized eigenstructure of the matrix Ra,
H
R a e CC e .
(9.19)
In equation (9.19), is a generalized eigenvalue, and e is the associated generalized eigenvector (g-eigenvalues and g-eigenvectors). Haimovich and BarNess (1991) state that Ra (which is (M 1) M) will have rI g-eigenvalues
2
2
larger than n and M rI 1 g-eigenvalues equal to n . The M rI 1
associated noise-subspace eigenvalues, columns of En, are orthogonal to the
interference subspace, which is spanned by the transformed g-eigenvectors
CCHEI. Thus a generalized eigendecomposition is required for the three optimizations equations (9.15) and (9.17).
The required structure can be computed from the decomposition

C C^H E_I = Q V = ( Q_I  Q_n ) [ V_I ]
                               [ O  ] ,    (9.20)

where Q = (Q_I  Q_n) has orthonormal columns and the columns of Q_n span the
orthogonal complement of the transformed interference subspace.
Having obtained the analyses of equations (9.19) and (9.20), the solutions for
the three optimized weight vectors are
1) Minimum mean-square error:

w_mmse = \sigma_s^2 Q_n Q_n^H 1 / ( \sigma_s^2 || Q_n^H 1 ||^2 + \sigma_n^2 ) ;    (9.21)

2) Maximum S/N:

w_SNR = Q_n Q_n^H 1 / || Q_n^H 1 || ;    (9.22)

3) Minimum variance:

w_MV = Q_n Q_n^H 1 / || Q_n^H 1 ||^2 .    (9.23)

The three weight vectors are parallel; they differ only in normalization.
As with the subspace methods of Section 9.3, the exploited a priori knowledge of the structure of the signal and interference wavefronts gives a much
faster approach to the asymptotically optimum results than do the conventional
methods that use only raw estimates of covariance matrices and no subspace
analyses. For the short time records of a seismic reflection, this is a significant
advantage.
9.5
d(h, t) = \int dp \int d\tau \, U(p, \tau)\, \delta\big(t - \sqrt{\tau^2 + p^2 h^2}\big) + n(h, t) ,    (9.24)

d(h, t) = \int dp \int d\tau \, U(p, \tau)\, \delta\big(t - (\tau + p h^2)\big) + n(h, t) ,    (9.25)
wherein
d(h, t) = measured seismogram at offset h and two-way time t,
U(p, τ) = hyperbolic transform coefficient at slowness p and zero-offset time τ,
and
n(h, t) = measurement noise at offset h and two-way time t.
The δ-function is a hyperbolic or parabolic edge in the (t, h) plane. After performing a discrete Fourier transform (DFT) on U(p, τ) from the τ-domain to ω, and discretizing in p to N_p values, equation (9.25) transforms to

d(h, \omega) = \sum_p U(p, \omega)\, e^{-j\omega p h^2} + n(h, \omega) ,    (9.26)
where
U(p, \omega) = \sum_\tau U(p, \tau)\, e^{-j\omega\tau} .    (9.27)

The time-domain data are recovered by the inverse DFT,

d(h, t) = \sum_\omega d(h, \omega)\, e^{j\omega t} .    (9.28)
In matrix form at a single frequency ω, the model is

d = L U + n ,    (9.29)

where the elements of the matrix L are

L_{ik} = e^{-j\omega p_k h_i^2} , \quad 1 \le k \le N_p \ \text{and} \ 1 \le i \le N_h ;    (9.30)
d is an N_h-element column vector with elements d(h_i, ω), and U is an N_p-element column vector with elements U_k = U(p_k, ω). The unconstrained least-squares solution for the unknown U is
U = (L^H L)^{-1} L^H d .    (9.31)
By finding U from equation (9.31), the estimate d̂(h, ω) can be determined by equation (9.26) for each ω, and then d̂(h, t) found using equation (9.28).
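Equations (9.29)-(9.31) can be exercised at a single frequency with synthetic values; the geometry, frequency, and slowness grid below are assumptions for illustration, not values from the text:

```python
import numpy as np

# Sketch of equations (9.29)-(9.31) at one frequency: d = L U + n with
# L_ik = exp(-j w p_k h_i^2), solved by unconstrained least squares.
w = 2 * np.pi * 20.0                      # one angular frequency (assumed)
h = np.linspace(100.0, 1500.0, 24)        # offsets h_i (assumed geometry)
p = np.linspace(0.0, 3e-7, 15)            # trial parabolic slownesses p_k

L = np.exp(-1j * w * np.outer(h**2, p))   # Nh x Np matrix of equation (9.30)

U_true = np.zeros(p.size, dtype=complex)
U_true[4] = 1.0 + 0.5j                    # one synthetic event at p[4]
d = L @ U_true                            # noise-free data vector, eq. (9.29)

U_hat, *_ = np.linalg.lstsq(L, d, rcond=None)   # implements (9.31)
assert int(np.argmax(np.abs(U_hat))) == 4
assert np.linalg.norm(L @ U_hat - d) < 1e-8 * np.linalg.norm(d)
```

With noisy data the same call returns the least-squares estimate; regularized variants (damped least squares) are common in practice when L is poorly conditioned.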
The 2-D plot of U(p, t) as determined by equation (9.28) can be used to filter slowness by selecting acceptable values. Hampson uses high-passed p-values (recall that the data were originally flattened) to estimate multiples, and then subtracts their reconstruction of d from the original to get estimates of primaries that, when flattened, had nearly zero moveout.
9.6
employed, but a finite number of wavefronts are assumed to compose d(h_i, t). In the narrowband model (appropriate particularly after flattening) a wavefront s_k(t) (the kth element of U) with slowness p_k arrives at the ith sensor as

s_k(t)\, e^{-j\omega p_k h_i^2} .

In the wideband model the DFT of s_k(t) gives at the ith sensor the Fourier transform value

S_k(\omega_0)\, e^{-j\omega_0 p_k h_i^2}

at each ω_0.
The most significant difference between Hampson's method and the subspace methods is that with eigenstructure the model assumes an unknown but fixed number K of wavefronts to be found; therefore the L matrix has exactly K columns (wherein p = p_k, k = 1, 2, ..., K) rather than one for each trial p_k, and U has exactly K elements, one for each wavefront. Otherwise equation (9.29) is identical to MUSIC's wideband model at a single given frequency.
Other major differences between Hampson's and the subspace methods are the subspace assumptions of (1) stationary processes and (2) the treatment of the elements of U as random variables rather than deterministic constants. Thus, for the eigenstructure methods we have the K-column matrix L with kth column L_k indicating the delays at the offsets h_i as given in equation (9.30), and the U vector containing exactly K complex source signals s_k(t). Thus, we have in the narrowband model
U = \begin{bmatrix} s_1(t) \\ s_2(t) \\ \vdots \\ s_K(t) \end{bmatrix} .    (9.32a)
or for the wideband model at each selected frequency ω_0, the elements of U are K source Fourier transform values:
U = \begin{bmatrix} S_1(\omega_0) \\ S_2(\omega_0) \\ \vdots \\ S_K(\omega_0) \end{bmatrix} .    (9.32b)
The data covariance matrix is then

R_d = E\{d d^H\} = L\, E\{U U^H\}\, L^H + E\{n n^H\} = L R_u L^H + \sigma^2 I ,    (9.33)

wherein the meaning of R_u and σ² is implicit. Finding the eigenvalues λ_i and corresponding eigenvectors v_i of R_d and using the properties given in Chapter 4, we have, ideally,
R_d = \sum_{k=1}^{K} (\lambda_k - \sigma^2)\, v_k v_k^H + \sigma^2 I ,    (9.34)

where

\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_K > \lambda_{K+1} = \cdots = \lambda_{N_h} = \sigma^2 .    (9.35)
The K values of p_k are then located where the steering columns L_k are orthogonal to the estimated noise subspace,

\sum_{i=K+1}^{N_h} | L_k^H \hat{v}_i |^2 \approx 0 ,    (9.36)

where approximation is used due to the fact that the \hat{v}_i are estimates of v_i from a noisy covariance matrix estimate of R_d. We note that, as in Chapter 4, root MUSIC is faster. We note again that rather than solving U for all values of p, the subspace methods reduce these equations to only K.
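The search implied by equation (9.36) can be sketched with an assumed parabolic delay model and synthetic snapshots; every numerical value below is illustrative:

```python
import numpy as np

# Sketch of the MUSIC-style search of equation (9.36): eigendecompose Rd,
# keep the Nh - K smallest-eigenvalue eigenvectors as the noise subspace,
# and scan trial slownesses for steering vectors orthogonal to it.
rng = np.random.default_rng(2)
w = 2 * np.pi * 20.0
h = np.linspace(100.0, 1500.0, 16)        # Nh = 16 offsets (assumed)

def steer(p):
    # steering vector with elements exp(-j w p h_i^2), as in equation (9.30)
    return np.exp(-1j * w * p * h**2)

K, p_true, N = 1, 1.2e-7, 200
d = np.array([steer(p_true) * (rng.standard_normal() + 1j * rng.standard_normal())
              + 0.01 * (rng.standard_normal(h.size) + 1j * rng.standard_normal(h.size))
              for _ in range(N)])
Rd = d.T @ d.conj() / N                   # sample covariance estimate of Rd

lam, V = np.linalg.eigh(Rd)               # eigenvalues in ascending order
En = V[:, : h.size - K]                   # noise-subspace eigenvectors v_i

p_grid = np.linspace(0.5e-7, 2.0e-7, 151)
null = [np.linalg.norm(En.conj().T @ steer(p)) ** 2 for p in p_grid]
p_hat = p_grid[int(np.argmin(null))]
assert abs(p_hat - p_true) < 2e-9         # minimum lands at the true slowness
```

Plotting the reciprocal of `null` against `p_grid` gives the familiar MUSIC spectrum with a sharp peak at each wavefront slowness.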
Having found the K solution vectors L_k, one or more of these is considered signal and the others are considered interference. Reconstruction of the signal alone is accomplished via one of the subspace methods given in Section 9.3 or Section 9.4.

The accuracy of using only one value of p to describe one reflection is a function of how well the parabolic model fits. Further, if wavefronts are not well flattened, more than one p per waveform is more appropriate. This is the case for multiples, and some experimentation may be required.
9.7
Even though both Hampson's and the subspace methods are based on the same equations, considerable differences in application result. Both require a DFT of each trace unless a narrowband subspace method is used, both give estimates of slowness or velocity, both allow separation of signal and interference, and both are optimum in different senses for dealing with noise. Computationally, Hampson requires the solution of N_p equations for each ω; MUSIC requires finding the eigenstructure at one or more frequencies of an N_h by N_h covariance matrix and then solving for K solution values of p, where K is usually much less than N_p. Some subspace methods do not require eigenstructure; for example, QR decomposition to find the subspaces is much faster.
faster. In many methods solving for the K values of p is much faster than solv-
181
10Chapter9.indd 13
12/3/09 2:09 PM
Downloaded 06/26/14 to 134.153.184.170. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
ing Hampsons equation (9.31). The covariance matrix must also be estimated, but this is not computationally demanding.
Three major differences appear:
1) Hampson's method produces an estimate of U(p, t) after the inverse DFT, and this is convenient for reproducing parts or all of the data. In contrast, MUSIC will easily produce |U(p, ω)|², and the zero-offset signal amplitudes must be determined by a separate reconstruction process such as presented in Sections 9.4 and 9.5, but the size of the vectors and matrices involved is smaller than in equation (9.31).
2) Subspace methods undoubtedly will give improved resolution for velocity picking, thereby allowing separation of more similar reflections. However, the accuracy is a function of the parabolic fit.
3) Broadband subspace methods require covariance estimation with samples of d(h, ω). The eigenstructure literature suggests each trace be segmented in time; each segment yields a sample of d(h, ω) to be used in the covariance estimate. However, for nonstationary processes such as primary seismic reflections, this is not appropriate except for multiples, which recur down the trace. Typically a processing window is stepped down through time and solutions found at each step. Solutions for U(p_k, t_k) are assumed to be for the center of the time span of the window, although Li (Chapter 7) gives a method of explicitly solving for both p_k and t_k for single wavefronts.
In summary, Hampson's method generally is applied to DFTs of whole gathers, having first been processed to NMO. It solves for all values of zero-offset reflections versus time and can be used to band-pass or band-stop on the parameter p before reconstruction into the time domain.
Subspace methods utilize a model having the same equations, but assume that only K wavefronts are present in the windowed data. By one of several means (Chapter 4, Chapter 6), some of which are quite fast, the number K and the corresponding p_k are found. These algorithms operate on the data covariance matrix, easily obtained from either the time domain (narrowband or preflattened) or the frequency domain (broadband). Typically, subspace processing data windows span only a short time interval, and the window must be moved to process all two-way times of interest. For parameterization of multiples, the whole record can be processed in one window, but subsegments must give sample DFTs for the covariance matrix estimation. An alternative to segmenting for samples is to use multiple frequencies for samples. We have not experimented with estimation or suppression of multiples.
The use of a priori knowledge, giving the covariance matrix structure, is an advantage that was emphasized in Section 9.3, where faster convergence to the mmse was noted. This is important because of the finite duration of a primary reflection.
More complex methods of signal reconstruction (weighted stacking), such as presented in Section 9.4, require more computation yet allow optimal estimation of each waveform, considering all others as interference. This can be of significance when fine features of the waveform are to be interpreted.
The above comparisons also apply to the Radon transform (Beylkin, 1987), which converts two-dimensional data much the same as Hampson's procedure. Typically applied to a whole gather (perhaps NMO corrected), the Radon transform also allows band-pass and band-stop filtering in p before inversion back into the space-time domain. F-K filtering uses 2-D FFTs and wedge-shaped stop- or passband filtering, and is most appropriate for plane wavefronts. Again, the advantages of subspace processing over F-K lie in the structured estimate of the signal covariance matrix and the exclusion of noise-space energy from the estimate through limiting the signal-plus-interference dimensionality to K << N_p.
9.8 References
Beylkin, G., 1987, Discrete Radon transform: IEEE Trans. Acoust., Speech and Sig. Proc., 35, 162-172.
Haimovich, A. M., and Bar-Ness, Y., 1991, An eigenanalysis interference canceller: IEEE Trans. Acoust., Speech and Sig. Proc., 39, no. 1, 76-84.
Hampson, D., 1986, Inverse velocity stacking for multiple elimination: J. Canadian S.E.G., 22, 44-55.
Ottersten, B., Roy, R., and Kailath, T., 1989, Signal waveform estimation in sensor array processing: Proc. Asilomar Conference, WA5-6, 1-5.
Scharf, L. L., 1991, Statistical signal processing: Addison-Wesley Publ. Co.
Shan, T.-J., and Kailath, T., 1985, Adaptive beamforming for coherent signals and interference: IEEE Trans. Acoust., Speech and Sig. Proc., 33, 527-536.
Thorson, J. R., 1984, Velocity-stack and slant-stack inversion methods: Ph.D. thesis, Stanford Univ.
Thorson, J. R., and Claerbout, J. F., 1985, Velocity-stack and slant-stack stochastic inversion: Geophysics, 50, 2727-2741.
Chapter 10
Removal of Interference Patterns in Seismic
Gathers
William J. Done
The Karhunen-Loeve Transform (KLT) technique is frequently applied to
a variety of seismic data processing problems. Initial seismic applications of
the KLT (Hemon and Mace, 1978; Jones, 1985) were based on principal components analysis, usually on stacked data. A subset of the principal components obtained from the KLT of a seismic data set is used to reconstruct the
data. Using the dominant principal components in the reconstruction emphasizes the lateral coherence which characterizes poststack seismic data. Using
subdominant principal components during reconstruction can emphasize
detail in the result by eliminating the strong lateral coherency carried by the
high-order principal components. The usual approach to principal components analysis is also used to suppress random noise in the final reconstruction
by always eliminating the low order principal components from any reconstructions. These low-order principal components contribute to the randomness in the data.
The three applications described in this chapter, however, use the eigendecomposition methods of the KLT in a manner more closely associated with
interference canceling (Widrow and Stearns, 1985). The reader is also referred
to Haimovich and Bar-Ness (1991). In all three examples, the goal is the suppression of coherent noise in the seismic data. The coherent noise forms an
interference pattern in the seismic data that is correlated in the spatial and
temporal domains. Suppression of this interference is accomplished by first
estimating the coherent noise and then subtracting that estimate from the
original data, the difference being an estimate of the desired seismic signal
components.
Figure 10.1. The adaptive interference canceling model. The primary input x_k = s_k + n_k is the desired signal s_k plus noise n_k; the reference input y_k comes from the noise source through the filter G(z); the adaptive filter w_k produces the noise estimate, which is subtracted from x_k to form the error e_k ≈ ŝ_k.
On the right side of Figure 10.1 is the data analysis portion of the adaptive
interference canceling model. The two recorded signals xk and yk are the inputs
to the interference canceler. Reference signal yk is filtered by the adaptive filter
wk. The output nk of wk is subtracted from the data signal xk, the difference
signal ek being an estimate of the desired signal sk. Adaptivity comes about
because, initially, the characteristics of wk and thus ek are not known. By
assuming that the noise process nk is not correlated with the desired signal sk,
various schemes based on the minimization of the energy in the error signal ek
p_{ki} = [ y_k - \bar{y} ]^T a_i ,    (10.1)
where T denotes the matrix transpose operation and \bar{y} is the vector mean of the N vectors y_k. It is assumed that we can generate the vector y_k from the p_{ki} using a set of M × 1 vectors b_j according to

y_k = \bar{y} + \sum_{j=1}^{M} p_{kj} b_j .    (10.2)
Equations (10.1) and (10.2) are the forward and inverse transform relations of the KLT. An approximation to y_k is made by using only the first m < M terms in the sum in equation (10.2):
\hat{y}_k = \bar{y} + \sum_{j=1}^{m} p_{kj} b_j .    (10.3)

The error in this truncated representation is

e_k = y_k - \hat{y}_k    (10.4)

= \sum_{j=m+1}^{M} p_{kj} b_j .    (10.5)
A measure of the approximation quality is the mean-squared error

J(m) = \frac{1}{N-1} \sum_{k=0}^{N-1} e_k^T e_k .    (10.6)

When J(m) is minimized, the result is

a_i = b_i = u_i , \quad i = 1, \ldots, M .    (10.7)

The M × 1 vectors u_i are determined from the covariance matrix of the y_k. This matrix is estimated by the sample covariance

R = \frac{1}{N-1} \sum_{k=0}^{N-1} [ y_k - \bar{y} ][ y_k - \bar{y} ]^T    (10.8)

= U \Lambda U^T ,    (10.9)

and is real and symmetric.
The vectors u_i are the normalized eigenvectors of R, with u_i being the ith column of the M × M matrix U. Λ is an M × M diagonal matrix:

\Lambda = \mathrm{diag}[ \lambda_1 \ \lambda_2 \ \cdots \ \lambda_M ] .    (10.10)
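The forward transform (10.1), truncated reconstruction (10.3), and sample covariance (10.8) can be sketched on synthetic data; the data model below is an assumption for illustration only:

```python
import numpy as np

# KLT sketch: sample covariance of vectors y_k, eigenvectors u_i, and
# reconstruction from the first m principal components (synthetic data).
rng = np.random.default_rng(3)
N, M = 500, 10
basis = np.linalg.qr(rng.standard_normal((M, M)))[0]   # random orthonormal directions
y = (rng.standard_normal((N, 2)) * [5.0, 3.0]) @ basis[:, :2].T \
    + 0.1 * rng.standard_normal((N, M))    # two strong components plus weak noise

ybar = y.mean(axis=0)                      # the vector mean y-bar
Yc = y - ybar
R = Yc.T @ Yc / (N - 1)                    # sample covariance, equation (10.8)
lam, U = np.linalg.eigh(R)                 # ascending eigenvalues
U = U[:, ::-1]                             # reorder so u_1 is dominant

m = 2
P = Yc @ U[:, :m]                          # coefficients p_kj, equation (10.1)
y_hat = ybar + P @ U[:, :m].T              # truncated reconstruction, eq. (10.3)

err_trunc = np.mean((y - y_hat) ** 2)
err_full = np.mean((y - (ybar + (Yc @ U) @ U.T)) ** 2)
assert err_full < 1e-20                    # all M terms reconstruct y exactly
assert err_trunc < 0.05                    # m = 2 captures the dominant structure
```

Keeping the dominant terms reproduces the coherent structure; the discarded low-order terms carry mostly the random part, which is the rationale behind the reconstructions in this chapter.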
(10.11)
(10.12)
are then projected onto the noise subspace. This reduced order representation
(see Scharf and Tufts, 1987) is
Figure 10.2. The eigendecomposition interference canceler. The reference input y_k (the noise source filtered by G(z)) supplies the training data for eigenanalysis; the primary input x_k = s_k + n_k is reconstructed on the noise subspace to give the interference estimate n̂_k, and the difference ŝ_k = x_k − n̂_k estimates the desired signal.
x_k = s_k + n_k .    (10.13)
(10.14)
and the projection of the x_k on the noise subspace becomes an estimate of the interference n_k obscuring the desired seismic signal s_k. The estimate of s_k is obtained from

\hat{s}_k = x_k - \hat{n}_k .    (10.15)
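Equations (10.13)-(10.15) can be sketched end to end on synthetic vectors: eigenvectors are trained on noise-only data, the dominant eigenvector is taken as the noise subspace, and each data vector is projected onto it to form n̂_k. Everything below is an illustrative stand-in, not the chapter's data:

```python
import numpy as np

# Train on noise-only vectors, project data onto the dominant (noise)
# subspace to estimate the interference, and subtract: s_hat = x - n_hat.
rng = np.random.default_rng(5)
M, N = 16, 400
noise_dir = np.sin(2 * np.pi * np.arange(M) / M)   # one coherent noise pattern
train = np.outer(5.0 * rng.standard_normal(N), noise_dir) \
        + 0.05 * rng.standard_normal((N, M))       # training region: noise only

R = train.T @ train / (N - 1)                      # sample covariance (mean ~ 0)
lam, U = np.linalg.eigh(R)
noise_space = U[:, -1:]                            # dominant eigenvector

s = rng.standard_normal(M)                         # desired signal vector s_k
n = 3.0 * noise_dir                                # interference n_k
x = s + n                                          # equation (10.13)

n_hat = noise_space @ (noise_space.T @ x)          # projection onto noise space
s_hat = x - n_hat                                  # equation (10.15)

# the coherent interference is largely removed
assert np.linalg.norm(s_hat - s) < 0.5 * np.linalg.norm(n)
```

Note that the component of s_k lying in the trained subspace is removed along with the interference, which is one reason the chapter cautions against over-interpreting laterally mixed output.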
of gain control or flattening of noise events. Even though this procedure looks
at 2-D data blocks, it is likely that data with a strong horizontal correlation
will exhibit a greater spread in eigenvalues than data in which that same correlation occurs diagonally through the data block. This emphasis in the spread
of the eigenvalues is desirable because the goal is to train on the noise to be
removed, associating that noise with the dominant structure. This will be
illustrated in the examples to follow.
The target region is that portion of the seismic record in which it is
desired to suppress the interference. Typically this is the entire seismic record,
including the training region.
Once the training and target regions have been selected, each is divided
into data vectors, the y_k and x_k. In Jones (1985), the data vectors were large and one-dimensional in nature. Typically, a vector comprised all of the samples at a constant time in a stack. Within a vector, there was variation only in one independent axis, usually the spatial axis. The vectors are large because seismic stacks usually consist of several hundred traces. This makes M the same as the number of traces. In the EIC application, the data vectors are formed by taking a rectangle of samples from the data, with variation in both the spatial and temporal axes. In Chapter 2, Figure 2.1, and Section 2.3, the samples are arranged in an M × 1 vector, where M is no larger than 25 in the following examples. The order in which the samples are loaded into the vectors is not important, but it must be identical for all vectors.
In the procedure originally specified in Done et al. (1991), the residual data exhibit an artifact caused by dividing the target region into small, adjacent blocks. This artifact causes a strong visual correlation between output traces common to a column of data blocks. An abrupt change between the boundaries of adjacent blocks, especially in the trace direction, causes a degradation in the appearance of the output data. This phenomenon is similar to the edge effect which occurs when using the KLT to encode portions of images. By overlapping adjacent data blocks, both in the trace and time directions, this artifact is reduced. A data value in the reconstructed target region is the average of all elements that correspond to that data value, taken from all reconstructed target-region vectors in which that element is found. The averaging process smooths out the boundaries between data blocks. The examples illustrate this effect.
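The overlapped-block averaging described above can be sketched as follows; block size, step, and data are illustrative assumptions:

```python
import numpy as np

# Accumulate every reconstructed block into the output array while a parallel
# counter tracks how many blocks cover each sample; dividing the two averages
# the overlaps and smooths block boundaries.
def average_overlapping_blocks(blocks, starts, shape, block_shape):
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    bt, bx = block_shape
    for blk, (i, j) in zip(blocks, starts):
        acc[i:i + bt, j:j + bx] += blk
        cnt[i:i + bt, j:j + bx] += 1.0
    return acc / np.maximum(cnt, 1.0)      # avoid dividing uncovered samples by 0

# identity check: re-averaging unmodified blocks returns the original data
rng = np.random.default_rng(4)
data = rng.standard_normal((12, 8))
bt, bx, step = 4, 4, 2                     # 50% overlap in both directions
starts = [(i, j) for i in range(0, 12 - bt + 1, step)
                 for j in range(0, 8 - bx + 1, step)]
blocks = [data[i:i + bt, j:j + bx] for (i, j) in starts]
out = average_overlapping_blocks(blocks, starts, data.shape, (bt, bx))
assert np.allclose(out, data)
```

In the actual procedure the blocks fed to the averager would be the eigenvector reconstructions of each data block rather than raw copies.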
Eigenvalues computed without and with data-block overlap:

No Overlap    Overlap
  419123.     403433.
  298751.     307242.
  155691.     157946.
   10186.      10405.
    9116.       8953.
    7864.       8023.
Figure 10.4a shows an isolated view of the training region from the input
data in Figure 10.3. The training region, as reconstructed using the two most
dominant eigenvectors from tests 1 and 2, is shown in parts b and c of
Figure 10.4, respectively. Either reconstruction of the training interference is
Figure 10.3. One shot record of marine seismic data, DAVC applied, exhibiting coherent noise. (© 1991 IEEE. Used with permission, from W. J. Done, R. L. Kirlin, and A. Moghaddamjoo, "Two-dimensional coherent noise suppression in seismic data using eigendecomposition," IEEE Trans. on Geoscience and Remote Sensing, vol. 29, no. 3, May 1991.)
Figure 10.4. (a) The training region from the input data of Figure 10.3; (b) the training region reconstructed using the two most dominant eigenvectors, test 1; (c) the same reconstruction for test 2.
visually accurate, with the overlapped data blocks in test 2 producing a slightly
smoother appearing reconstruction.
The reconstructed target and resulting residual for test 1 are shown in Figure 10.5, parts a and b, respectively. Figure 10.6 contains the results for the overlapping data blocks used in test 2. Comparing Figures 10.5b and 10.6b to the input data in Figure 10.3, it can be seen that the coherent interference below 3 s (the target zone) has been suppressed. The test 2 result with overlapping data blocks tends to have smoother transitions between the seismic traces. With the suppression of the coherent interference, the hyperbolic seismic arrivals are more visible below 3 s.

Interpretation of these data should be done with caution, as with any technique which causes lateral mixing of seismic traces. The danger is that the lateral mixing can create false events or smooth over fine features. But with this method, the smearing is limited to a known number of traces determined by the data block size and amount of overlap. Notice that three dead traces present in this record of data have been reconstructed, though the signal levels in these three traces are somewhat lower than in the adjacent traces.
Downloaded 06/26/14 to 134.153.184.170. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
Figure 10.5a. Test 1: reconstructed target region.
Figure 10.5b. Test 1: residual target region.
Figure 10.6a. Test 2 with overlapping data blocks: reconstructed target region.
Figure 10.6b. Test 2 with overlapping data blocks. Residual target region (3.0 to 6.0 s). (© 1991 IEEE. Used with permission, from W. J. Done, R. L. Kirlin, and A. Moghaddamjoo, "Two-dimensional coherent noise suppression in seismic data using eigendecomposition," IEEE Trans. on Geoscience and Remote Sensing, vol. 29, no. 3, May 1991.)
Figure 10.7.
One shot record of South Florida Basin data, AGC applied, exhibiting repeating reverberations.
data block parameters. Each data block covers a 12-trace by 2-time-sample pattern, and adjacent data blocks overlap by three samples in the trace direction and one sample in the time direction. This results in 1840 data blocks in the training region. Computing the sample covariance matrix and performing an eigendecomposition produces the eigenvalues listed in Table 10.2 under the columns labeled "Flattened Data." Only the largest 10 eigenvalues are listed. This is for a training region defined by four (trace, sample number) pairs: (1, 50), (1, 450), (76, 50), and (76, 150) on the flattened data. Also listed in the table is the normalized running sum of those eigenvalues.
Table 10.2. Eigenvalue comparison showing the effects of preflattening the data.

            Flattened Data            Unflattened Data
Order    Eigenvalue  Cum. Sum     Eigenvalue  Cum. Sum
                     (Norm.)                  (Norm.)
  1      3 779 219.   0.7377      1 829 858.   0.3264
  2        782 533.   0.8905      1 670 745.   0.6245
  3        186 719.   0.9269        633 756.   0.7375
  4        110 805.   0.9485        595 499.   0.8437
  5         68 174.   0.9619        280 942.   0.8939
  6         43 636.   0.9704        203 065.   0.9301
  7         34 036.   0.9770         96 328.   0.9473
  8         21 295.   0.9812         93 658.   0.9640
  9         17 580.   0.9846         61 414.   0.9749
 10         14 005.   0.9873         46 034.   0.9831
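As a quick, illustrative check of the "Cum. Sum (Norm.)" column (not a computation from the book), the full eigenvalue sum can be inferred from the first listed ratio and the running sums reproduced:

```python
import numpy as np

# Verify the normalized running sum for the flattened-data column of
# Table 10.2; the total eigenvalue sum is inferred from the first entry,
# since only the largest 10 eigenvalues are listed.
flat_eigs = np.array([3779219., 782533., 186719., 110805., 68174.,
                      43636., 34036., 21295., 17580., 14005.])
table_cum = np.array([0.7377, 0.8905, 0.9269, 0.9485, 0.9619,
                      0.9704, 0.9770, 0.9812, 0.9846, 0.9873])

total = flat_eigs[0] / table_cum[0]          # infer full eigenvalue sum
cum = np.cumsum(flat_eigs) / total           # normalized running sum
assert np.allclose(cum, table_cum, atol=5e-4)
```

The small tolerance absorbs the rounding of the table's four-decimal entries.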
The flattened-data training region vertices can be moved to their corresponding positions in the original, unflattened data domain. Doing this, the four-point zone defined previously becomes the region bounded by vertices (1, 402), (1, 802), (76, 191), and (76, 291). These vertices define a training region on the unflattened data that covers the same data area as the region defined above for the flattened data. Figure 10.9 shows the unflattened test
Figure 10.8a.
The data after being flattened using a velocity of 8720 ft/s (2660 m/s).
The entire data set.
training zone. Using the same data block configuration, there are 1649 data
blocks in the unflattened training region. The eigenvalues found from the
sample covariance matrix and the normalized running sum of those eigenvalues are listed in Table 10.2 under the columns labeled Unflattened Data.
The differences in the number of data blocks found in the flattened and
unflattened cases arise from the definition of the data block and the manner in
which data blocks are extracted from a training zone. With the current version
of the algorithm, no data block can extend beyond the boundaries determined
Figure 10.8b.
The data used as the training zone for the flattened data tests.
Figure 10.9. The unflattened test training zone.
of the first eigenvalue in the flattened case. This finding correlates with statements on flattening for rank reduction in Sections 6.2 and 6.9.
The next step is to use selected eigenvectors from the training region to
reconstruct the data in the target region and compute the residual between the
original data and reconstruction. The eigenvalues in Table 10.2 suggest several
tests. Using the flattened data, (a) reconstruct in one case with eigenvector 1
and (b) in the second case, reconstruct with eigenvectors 1 and 2. Three
reconstructions are done with the unflattened data: reconstruct the target
Figure 10.10a. Flattened data test, reconstructed target zone. Reconstruction with
eigenvector 1.
Figure 10.10b. Flattened data test, reconstructed target zone. Reconstruction with eigenvectors 1 and 2.
Figure 10.10c. Unflattened data test, reconstructed target zone. Reconstructed with
eigenvector 1.
Figure 10.10d. Unflattened data test, reconstructed target zone. Reconstruction with
eigenvectors 1 and 2.
Figure 10.10e. Unflattened data test, reconstructed target zone. Reconstruction with
eigenvectors 1, 2, and 3.
Figure 10.11a. Residual from flattened data test. Reconstruction with eigenvector 1.
Figure 10.11b. Residual from flattened data test. Reconstruction with eigenvectors
1 and 2.
Figure 10.11c. Residual from unflattened data test. Reconstruction with eigenvector 1.
Figure 10.11d. Residual from unflattened data test. Reconstruction with eigenvectors
1 and 2.
Figure 10.11e. Residual from unflattened data test. Reconstruction with eigenvectors
1, 2, and 3.
the preceding section is used to suppress the reverberating refraction. A velocity of 4434 ft/s (1351 m/s) is used to flatten the interference event. Figure 10.13 shows the training region used, defined by (trace, time sample) vertices (100, 150), (121, 150), (100, 300), and (121, 300) in the flattened data domain. All plots are shown after restoration to the unflattened domain. The data block configuration used for this analysis is an 8-trace by 3-time-sample pattern, with adjacent blocks overlapping by one element in both the trace and time directions. There are 225 data blocks in the training zone. The first three eigenvalues determined from the training-zone covariance structure account for approximately 50%, 21%, and 13%, respectively, of the total eigenvalue sum.
The reconstructions of the entire record using the dominant first and second eigenvectors are shown in Figures 10.14a and b. Figure 10.14a shows the
result obtained when only eigenvector 1 is used in the reconstruction. If eigenvectors 1 and 2 are used, the reconstruction is as shown in Figure 10.14b. The
residuals obtained by subtracting these reconstructions from the original data
are plotted in Figure 10.15. Results for the two cases are similar, but the one
eigenvector reconstruction may be somewhat better, because the desired
reflection data appears stronger.
An alternative to this interference canceling procedure is velocity filtering. Velocity filtering can produce a smoothed-over, wormy appearance that is indicative of strong trace-to-trace mixing. Figure 10.16 shows the results of applying velocity filtering to the original shot record. The velocity filter was designed to reject velocities from 0 to 6125 ft/s (1867 m/s) and 0 to −6125 ft/s (−1867 m/s). In some situations, such as range-dependent attribute analysis, the trace mixing characteristic of velocity filtering may be undesirable.
Trace mixing effects in the eigendecomposition interference canceling method
are limited to the trace width of the data blocks. Velocity filtering can also be a
problem when bad traces are present, as is the case for the near range traces in
this data set. High amplitude values in these traces reproduce the impulse
response of the 2-D velocity filter, introducing artifacts into the output. This
can also happen at the edges of mute zones or around dead traces. On the
other hand, a properly designed velocity filter can remove interference over a
range of velocities, while the eigendecomposition approach may have to be
applied repeatedly to suppress interference arriving with different velocities.
Comparing Figure 10.15a or b to 10.16 shows that the eigendecomposition
Figure 10.12.
Figure 10.13.
Training zone.
Figure 10.16.
224
Downloaded 06/26/14 to 134.153.184.170. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
results still contain some interference events which have a different velocity
than the events that were dominant in the selected training zone.
10.6 References
Done, W. J., Kirlin, R. L., and Moghaddamjoo, A., 1991, Two-dimensional
coherent noise suppression in seismic data using eigendecomposition:
IEEE Trans. on Geosci. and Remote Sensing, 29, 379-384.
Haimovich, A. M., and Bar-Ness, Y., 1991, An eigenanalysis interference canceller: IEEE Trans. Acoust., Speech and Sig. Proc., 39, 76-84.
Hemon, C., and Mace, D., 1978, The use of the Karhunen-Loève transformation in seismic data processing: Geophys. Prosp., 26, 600-626.
Jones, I. F., 1985, Applications of the Karhunen-Loève transform in reflection
seismology: Ph.D. Dissertation, Univ. of British Columbia.
Kramer, H. P., and Matthews, M. V., 1956, A linear coding for transmitting a
set of correlated signals: IRE Trans. on Information Theory, 2, 41-46.
Scharf, L. L., and Tufts, D. W., 1987, Rank reduction for modeling
stationary signals: IEEE Trans. on Acoust., Speech, and Signal Proc.,
35, 350-355.
Widrow, B., and Stearns, S. D., 1985, Adaptive signal processing: Prentice-Hall, Inc.
Chapter 11
Principal Component Methods for
Suppressing Noise and Detecting Subtle
Reflection Character Variations
Brian N. Fuller
11.1 Introduction
Seismic interpreters are sometimes presented with the challenge of identifying small lateral reflection character changes that can indicate significant
lithologic variations. Examples of this occur in stratigraphic trap exploration
and in estimating the lateral extent of a reservoir. Standard seismic trace plots,
even in color, sometimes do not have sufficient dynamic range to show a small
waveform variation against a background of traces that are very similar to one
another. Additionally, noise can obscure subtle waveform variations.
It is the purpose of this study to present methods and examples which
show how eigenvector methods can be used to aid an interpreter in detecting
small trace-to-trace variations in seismic waveforms when the lateral reflection
variation is very small and/or when the variations are obscured by noise. This
work extends that of D.C. Hagen (1982), who presented methods for using
the correlation coefficients of data traces with principal components to identify porous sand zones on a CMP stacked section. This work also draws from
other papers that have discussed the uses of principal component reconstruction as a noise suppression technique (Huang and Narendra, 1975; Andrews
and Patterson, 1976). Some of those papers have dealt specifically with seismic
reflection data (Jones and Levy, 1987; Done et al., 1991).
The approach here is to first calculate the normal-incidence seismic
response of a simulated reservoir sand that varies in thickness between 0.3 and
4.5 m. Eigenvector methods are then applied to the model data with and
x_k = [x_{1k}, x_{2k}, \ldots, x_{Nk}]^T,   (11.1)

where the superscript T indicates the transpose of the matrix. An N × N sample covariance matrix C is formed from the data in the window, and each element C_ij of the matrix is calculated by equation (11.2):

C_{ij} = \sum_{k=1}^{M} x_{ik} x_{jk}.   (11.2)
An orthonormal vector basis for the dataset is then given by the normalized eigenvectors of the matrix C. The variance in the data accounted for in
the dimension of each eigenvector is given by their associated eigenvalues. The
data component in the dimension of the eigenvector associated with the largest eigenvalue is known as the first principal component. The second principal
component is the data component in the dimension of the eigenvector associated with the second largest eigenvalue, and the third, fourth, etc., principal
components are named in a like manner. The jth normalized eigenvector is
annotated as vj. The eigenvalues are ordered largest first, smallest last.
An estimate \hat{x}_i of a data vector (a portion of a seismic data trace) may be
constructed by using (see Section 3.6)

\hat{x}_i = \sum_{j=1}^{N'} \alpha_{ij} v_j,   N' \le N.   (11.3)
In equation (11.3), α_ij is a coefficient relating the vector x_i to the eigenvector v_j. The value of α_ij, which shall for the remainder of this paper be referred to as a projection coefficient, is found by the dot product operation

\alpha_{ij} = x_i^T v_j.   (11.4)
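The machinery of equations (11.1) through (11.4) can be sketched compactly with numpy. The data window below is hypothetical (similar noisy traces), invented only to exercise the formulas; note that equation (11.2) is an unnormalized covariance, so no division by M appears.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data window: M = 50 traces (rows), each a vector x_k of
# N = 32 samples as in equation (11.1); the traces share one waveform
# and differ only by additive noise.
M, N = 50, 32
base = np.sin(2 * np.pi * np.arange(N) / 16.0)
window = base + 0.1 * rng.standard_normal((M, N))

# Equation (11.2): C_ij = sum_k x_ik x_jk, i.e. C = X^T X (unnormalized).
C = window.T @ window

# Orthonormal eigenvectors, eigenvalues ordered largest first.
evals, V = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]
evals, V = evals[order], V[:, order]            # v_j is column V[:, j]

# Equation (11.4): projection coefficients a_ij = x_i^T v_j, all at once.
A = window @ V

# Equation (11.3): estimate each trace from the first N' = 3 components.
recon = A[:, :3] @ V[:, :3].T
```

Because the traces are nearly identical, the first eigenvalue dominates, and a three-component reconstruction recovers the window closely.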
As noted above, the first, second, etc., eigenvectors of a data set's covariance matrix are ranked in accordance with the proportion of the variance of
the data along each eigenvector's dimension. If all of the data traces in a data
window were identical, only one principal component would be necessary to
account for all of the variance in the data window. The first eigenvector v1
would be identical, except for a scaling factor, to each of the data traces.
The second and subsequent eigenvalues would equal zero because there
would be no other energy in the data that is not accounted for by the first
eigenvector. In the seismic data case that I am describing here, however, the
assumption is that the data traces in the window are merely similar to one
another. Variability from one trace to the next is due to lithologic variations
and/or noise. In this case, the data traces are still highly correlated with (meaning they are very similar to) the first eigenvector. The dimensions of the second and higher eigenvectors account for the trace-to-trace variability in the data. The
values of α_ij would vary with the variability in the data. If the variations are
due solely to random noise, then the values of α_ij will vary randomly. However, if the data vary systematically, then the values of α_ij vary systematically,
and we expect this in the first few eigenvectors.
Figure 11.1.
(a) A geologic cross-section of a sand body in shale. (b) A normal-incidence seismic section calculated using the cross-section in part a above.
The eigenvectors from this dataset were calculated. (c) Projection coefficients for each of the data traces in part b onto the first eigenvector. The
values of projection coefficients are not shown as they are somewhat arbitrary, depending on the scaling of the eigenvectors. (d) Projection coefficients for each of the data traces in part b onto the second eigenvector.
Figures 11.1c and 11.1d show plots of projection coefficients [see equation (11.4)] of each data trace on the first and second principal components,
respectively. As above, there is a one-to-one correspondence between the horizontal axes of Figures 11.1c and 11.1d with the horizontal axes of
Figures 11.1a and 11.1b. Note that the projection coefficients change with the
changes in the sand layer thickness. The slope of the projection coefficient
curves is steepest in places where the sand thickness changes most quickly.
This experiment establishes that, at least in noise-free data, the principal component method can detect subtle reflection character changes for changes in
sand thickness of much less than a meter.
The experiment above is repeated under more realistic, but still controlled, conditions. Figure 11.2a is a plot of the data in Figure 11.1b with noise added.
The noise is in the same frequency band as that of the reflections (8 Hz to
50 Hz) and the signal power to noise power of each trace is 4.0. The covariance matrix was formed from the noisy data and the principal components
were calculated. In this case, 98.5% of the variance in the data was accounted
for by the first 10 eigenvalues with 81.0% accounted for by the first. The
information is more dispersed among the eigenvalues than in the noise-free
case because the added energy is random and uncorrelated with either the signal or the noise on the other traces.
Figure 11.2b shows projection coefficients for the noisy data onto the second eigenvector (jagged curve) along with the projection coefficient curve for
the noise-free case that was shown in Figure 11.1. The projection coefficient
curve for the noisy data is similar in shape to that of the noise-free curve, but
the added noise causes random variations about the noise-free curve. A running average smoothing operator was applied to the noisy projection coefficient curve to reduce the random variations and to obtain a better estimate of
the noise-free projection coefficients. The smoothing operator calculates the
mean of 2K + 1 projection coefficients and outputs a value at the ith trace that
is the mean of the ith projection coefficient and the K projection coefficients
on either side of the ith trace. The smoothed curve is shown as the data marks
with no lines connecting them. This curve was calculated using a value of 9
for K, or averaging 19 projection coefficients.
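The (2K + 1)-point running mean just described can be written in a few lines. This is a minimal sketch (the function name `smooth_coeffs` is ours); the end handling, where the window shrinks near the edges, is one simple convention among several.

```python
import numpy as np

def smooth_coeffs(a, K):
    """(2K + 1)-point running mean of a projection-coefficient curve.

    The output at trace i is the mean of a[i] and the K coefficients on
    either side; near the ends the window shrinks to what is available."""
    a = np.asarray(a, dtype=float)
    out = np.empty_like(a)
    for i in range(len(a)):
        lo, hi = max(0, i - K), min(len(a), i + K + 1)
        out[i] = a[lo:hi].mean()
    return out
```

With K = 9 this averages 19 coefficients, matching the operator used in the text; a constant curve passes through unchanged, and interior points of a linear trend are preserved exactly.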
The primary result of the smoothing exercise is that the smoothed projection coefficient curve exhibits an easily observed and systematic variation that
is correlated with changes in the sand layer thickness. The sensitivity of the
projection coefficient plot to changes in sand thickness is not as great in the
Figure 11.2.
(a) Noise added to the data in Figure 11.1b. (b) The projection coefficients of each trace onto the second eigenvector are plotted here as the
jagged, dashed curve. The jagged curve was smoothed with a 19-point
smoothing operator plotted as distinct data points. The projection coefficients curve for the noise-free second eigenvector (Figure 11.1d) is plotted
as a solid curve. The values of projection coefficients are not shown as they
are somewhat arbitrary, depending on the scaling of the eigenvectors.
(c) The traces in part b were reconstructed from the first three principal
components. The projection coefficients for the first principal component
were smoothed. The reconstructed data have a signal-to-noise ratio of
32.5.
noisy data as it is in the noise-free data, but there is a clear difference in the
smoothed projection coefficients between where the sand thickness is 2 m and
where it is 4.5 m. The observable difference in sand thickness is about 2% of
the wavelength of the data. Other smoothing methods may give a better estimate of the noise-free projection coefficients and provide better resolution of
variations in sand thickness.
This part of the experiment shows that under controlled conditions, principal components methods can detect small lithologic variations and do so
better than visual interpretation.
The noisy data traces in Figure 11.2a were reconstructed from the first three principal components, with the projection coefficients of the first principal component smoothed before reconstruction. Justification for reconstructing the traces using smoothed projection coefficients
comes from the experiment above from which we learned that a better estimate of noise-free projection coefficients can be obtained by smoothing the
projection coefficients. Use of a less noisy projection coefficient should result
in a better estimate of the noise free trace after reconstruction. The projection
coefficients for the first eigenvector were smoothed with a 19-point equal
weight averaging operator while the projection coefficients for the second and
third dimensions were left unsmoothed. The reconstructed data are shown in
Figure 11.2c, which has a signal-to-noise ratio of 32.5. This is substantially
better than the signal-to-noise ratio of 4.0 in the original data.
Other reconstruction experiments resulted in output signal-to-noise ratios
of 15.8 using the first two principal components with no smoothing and
138.6 using the first two principal components and a 19-point smoothing
operator on both components. In the last case, the reconstructed data are visually indistinguishable from the original noise-free data.
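The reconstruction procedure can be sketched end to end. The synthetic below is our own stand-in (a rank-one signal with slowly varying amplitude), so the measured S/N values will differ from the 32.5 and 138.6 reported in the text; the point is only that smoothing the first-component coefficients before reconstructing from a few components raises S/N well above the input value of about 4.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the chapter's synthetic: 100 similar traces
# whose amplitude varies slowly along the line, with noise added so that
# signal power / noise power is about 4, as in the text.
M, N, K = 100, 64, 9
wave = np.sin(2 * np.pi * np.arange(N) / 20.0)
gain = 1.0 + 0.2 * np.sin(2 * np.pi * np.arange(M) / 50.0)
clean = gain[:, None] * wave[None, :]
noisy = clean + rng.standard_normal((M, N)) * np.sqrt((clean ** 2).mean() / 4.0)

# Principal components of the noisy window (largest eigenvalue first).
evals, V = np.linalg.eigh(noisy.T @ noisy)
V = V[:, np.argsort(evals)[::-1]]
A = noisy @ V                                   # projection coefficients

# Smooth the first-component coefficients with a (2K + 1)-point running
# mean (window shrinks at the ends); leave components 2 and 3 alone.
a1s = np.array([A[max(0, i - K):i + K + 1, 0].mean() for i in range(M)])
recon = np.column_stack([a1s, A[:, 1], A[:, 2]]) @ V[:, :3].T

snr_in = (clean ** 2).sum() / ((noisy - clean) ** 2).sum()
snr_out = (clean ** 2).sum() / ((recon - clean) ** 2).sum()
```

The smoothing removes noise only along the first-component dimension, while truncating to three components discards the noise in the remaining dimensions, which is where most of the S/N gain comes from.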
The idea of treating data in separate subspaces before reconstruction can
be applied as a general concept. The advantage of operating on data in separate signal subspaces before reconstruction is that the appropriate noise-suppression measure (band-pass filter, smoothing, etc.) for each kind of noise can be applied without affecting the data in the other
subspaces. In the data shown in Figure 11.2c for example, some noise was suppressed by smoothing the projection coefficients for the first principal component but the lateral resolution provided by the second and third principal
components was preserved. Additionally, the projection coefficients can be
weighted more or less heavily before reconstruction, thus providing seismic
data in which character variations are easier to see than if the projection coefficients provided by the original eigendecomposition had been used.
The noise content of the data was examined by reconstructing the data traces, excluding the first through the third principal components,
and visually verifying that there was little coherency among the reconstructed
traces.
The projection coefficients for the second and third eigenvectors are plotted in Figure 11.3c. These projection coefficients have been smoothed with a
19-point smoothing operator. Once again smoothing sacrifices some lateral
resolution in exchange for a better view of the trends indicated by the projection coefficients. The trend in the projection coefficients for principal component 2 changes abruptly near CMP 1375. This change correlates with the
eastern termination of the porous sand. Another change in the trend for projection coefficient 2 occurs at about CMP 1400, which correlates well with
the eastern edge of the reservoir sand. The third principal component
decreases at a nearly constant rate from the left side of the seismic section to
about CMP 1400, the eastern termination of the sand body, where it reverses
slope and increases to the right side of the seismic section. Unlike the second
principal component, the third shows no significant variation at the eastern edge of the porous sand facies. This observation has not been explained.
The most notable changes in the projection coefficients occur at locations
where the change in sand thickness is about 5 m. This gives us a gauge of the
sensitivity of the principal component method in terms of the seismic wavelength. The sensitivity of the method in terms of wavelength is estimated in
the following way. The frequency band of the data is between 8 Hz and
59 Hz. If we assume that the center frequency of the data is 33.5 Hz and a
seismic velocity (from well logs) at the depth of the reservoir to be 4200 m/s,
then 5 m is about 4% of the dominant wavelength of the data. Perhaps
smaller lithologic changes could be detected, but this would require better
well control than was available for this study.
Figure 11.3.
(a) A geological cross-section across Hartzog Draw Oil Field showing the
reservoir sand facies. (b) A seismic section recorded across the cross-section shown in part a showing the time interval affected by the presence
of the reservoir sand. The data traces are spatially coincident with the part
of the cross section directly above the trace. Eigenvectors were calculated
for the data in the window. (c) Projection coefficients of each trace onto
the second and third eigenvectors. Changes in the projection coefficient trends are coincident with changes in the reservoir facies. The values of projection coefficients are not shown as they are somewhat arbitrary, depending on the
scaling of the eigenvectors. (d) Data traces in part b were reconstructed
using the first three principal components. The noise was reduced by this
operation.
11.8 Discussion
The methods presented in this chapter have the weakness that there is no
established connection between reflection coefficients, geology, and the information conveyed by the projection coefficients. Until we can say directly what
the value of a projection coefficient means relative to reflection coefficients,
these methods will remain an indirect, albeit powerful, indicator of stratigraphic variations.
11.9 Conclusions
In tests on synthetic data, principal component methods indicate where
changes in lithology thickness occur that are on the order of 2% of the dominant wavelength of the data. The signal-to-noise ratio in synthetic data was
increased from 4.0 to as high as 138.6 by reconstructing the data using a subset of the principal components. Improvements in S/N can also be made in
the data by operating on separate parts of the signal subspace before reconstructing data traces using selected principal components. In the real data case
presented, principal component methods provide an indicator of vertical
lithologic changes that are about 4% of the dominant wavelength of the data.
11.10 References
Andrews, H. C., and Patterson, C. L., 1976, Singular value decompositions
and digital image processing: IEEE Trans. Acous., Speech, and Sig.
Proc., 24, 26-53.
Devijver, P. A., and Kittler, J., 1982, Statistical pattern recognition: Prentice-Hall International.
Done, W. J., Kirlin, R. L., and Moghaddamjoo, A., 1991, Two-dimensional
coherent noise suppression in seismic data using eigendecomposition:
IEEE Trans. on Geosci. and Remote Sensing, 29, 379-384.
Fuller, B. N., 1988, Seismic detection of upper cretaceous stratigraphic oil
traps in the Powder River Basin, Wyoming: Ph.D. thesis, Univ. of
Wyoming.
Hagen, D. C., 1982, The application of principal components to seismic data
sets: Geoexploration, 20, 93-111.
Huang, T. S., and Narendra, P. M., 1975, Image restoration by singular value
decomposition: Applied Optics, 14, 2213-2216.
Jones, I. F., and Levy, S., 1987, Signal-to-noise ratio enhancement in multichannel seismic data via the Karhunen-Loève transform: Geophys.
Prosp., 35, 12-32.
Ranganathan, V., and Tye, R. S., 1986, Petrography, diagenesis and facies controls on porosity in Shannon sandstone, Hartzog Draw Field, Wyoming: AAPG Bull., 70, 56-69.
Tillman, R. W., and Martinsen, R. S., 1987, Sedimentologic model and production characteristics of Hartzog Draw field, Wyoming, A Shannon
shelf-ridge sandstone, in Tillman, R. W., and Weber, K. J., Eds., Reservoir characterization: SEPM special publications no. 40. 15-112.
Widess, M. B., 1973, How thin is a thin bed?: Geophysics, 38, 1176-1180.
Chapter 12
Eigenimage Processing of Seismic Sections
Tadeusz J. Ulrych, Mauricio D. Sacchi, and
Sergio L. M. Freire
12.1 Introduction
This chapter briefly reviews the important theoretical aspects of eigenimage processing and demonstrates the unique properties of this approach using
various examples, such as the separation of upgoing and downgoing waves, multiple
attenuation, and residual static correction. In particular, we will compare the
eigenimage technique to the well-known frequency-wavenumber (f-k)
method (Treitel et al., 1967), and discuss important differences which arise
especially with respect to spatial aliasing and the separation of signal and
noise.
In order to fully understand the similarities and differences of this
approach versus approaches described in the previous chapters, it is important
to begin with a little history. The first publication which introduced and
applied aspects of eigenimage processing to seismic data was the paper by
Hemon and Mace (1978). These authors investigated the application of a particular linear transformation known as the Karhunen-Loève (KL) transformation. The KL transformation is also known as the principal component
transformation, the eigenvector transformation, or the Hotelling transformation (Anderson, 1967; Loève, 1955). It has been used by various authors for
one- and two-dimensional data compression and to select features for pattern
recognition. Of particular relevance to the ensuing discussion is the excellent
paper by Ready and Vintz (1973) which deals with information extraction
and S/N improvement in multispectral imagery. In 1983, the work of Hemon
and Mace was extended by a group of researchers at the University of British
Columbia (Levy et al., 1983; Ulrych et al., 1983) which culminated in the
work of Jones (1985) and Jones and Levy (1987).
In 1988, Freire and Ulrych applied the KL transformation in a somewhat
different manner to the processing of vertical seismic profiling data. The
actual approach which was adopted in this work was by means of singular-value decomposition (SVD), which is another way of viewing the KL transformation (the relationship between the KL and SVD transformations is discussed in this chapter; see also Chapter 3). In later works, Ulrych et al. (1988)
and Freire and Ulrych (1990) applied the SVD approach to various other
problems, including the attenuation of multiple reflections, and adopted the
nomenclature of eigenimage decomposition to this method of data processing.
Eigenimages were first introduced into the literature by Andrews and Hunt
(1977) in the context of image processing and, in our opinion, this description is the most succinct one for the purpose at hand.
A seismic section that consists of M traces with N points per trace may be
viewed as a data matrix X where each element xij represents the jth point of the
ith trace. A singular-value decomposition (Lanczos, 1961) transforms X into a
weighted sum of orthogonal rank one matrices which have been designated by
Andrews and Hunt (1977) as eigenimages of X. A particularly useful aspect of
the eigenimage decomposition is its application in the complex form. In this
instance, if each trace is transformed into the analytic form, then the eigenimage processing of the complex data matrix allows both time and phase shifts to
be considered. This is particularly important in the case of the correction of
residual statics.
12.2 Theory
We consider the data matrix X to be composed of M traces with N data
points per trace, the M traces forming the rows of X. The SVD of X is given
by (Lanczos, 1961)

X = \sum_{i=1}^{r} \sigma_i u_i v_i^T,   (12.1)
Figure 12.1. [Diagram: reconstruction of X as the weighted sum of rank-one eigenimages \sigma_i u_i v_i^T, as in equation (12.1).]

where r is the rank of X, the \sigma_i are the singular values, and u_i and v_i are eigenvectors of XX^T and X^TX, respectively; the \sigma_i^2 are the eigenvalues of the matrices XX^T and X^TX. These eigenvalues are always positive
because of the positive definite nature of the matrices XX^T and X^TX. In
matrix form, equation (12.1) is written as

X = U \Sigma V^T,   (12.2)
where the definitions of the matrices are clear from equation (12.1).
Andrews and Hunt (1977) designate the outer product u_i v_i^T as the ith
eigenimage of the matrix X. Owing to the orthonormality of the eigenvectors,
the eigenimages form an orthonormal basis which may be used to reconstruct
X according to equation (12.1). This concept is illustrated diagrammatically
in Figure 12.1. It is clear from equation (12.1) that the contribution of a particular eigenimage in the reconstruction of X is proportional to the magnitude
of the associated singular value. Since in the SVD the singular values are
always ordered in decreasing magnitude, it is possible, depending of course on
the data, to reconstruct the matrix X using only the first few eigenimages.
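Equations (12.1) and (12.2) translate directly into numpy, whose `svd` returns the singular values already ordered largest first. The rank-one section below is a hypothetical example of our own (the helper name `eigenimage_sum` is ours, not the book's), built so that a single eigenimage reconstructs it perfectly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical section: M = 24 traces of N = 200 samples, all equal to
# within a scale factor, so the data matrix X has rank one.
M, N = 24, 200
trace = np.sin(2 * np.pi * np.arange(N) / 25.0)
X = np.outer(rng.uniform(0.5, 1.5, size=M), trace)

# Equation (12.2): X = U S V^T; numpy orders singular values largest
# first, matching the convention assumed in equation (12.1).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def eigenimage_sum(p, q):
    """Sum of eigenimages p..q (1-based, inclusive): sum_i s_i u_i v_i^T."""
    return (U[:, p - 1:q] * s[p - 1:q]) @ Vt[p - 1:q, :]
```

Because all traces are scaled copies of one another, the second and later singular values vanish to machine precision, so the first eigenimage alone recovers X.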
Suppose, for example, that X represents a seismic section and that all M
traces are linearly independent. In this case X is of full rank M, all the i are
different from zero, and a perfect reconstruction of X requires all eigenimages.
On the other hand, in the case where all M traces are equal to within a scale
factor, all traces are linearly dependent, X is of rank one and may be perfectly
Freire and Ulrych (1988) defined band-pass X_BP, low-pass X_LP, and high-pass X_HP eigenimages in terms of the ranges of singular values used. The band-pass image is reconstructed by rejecting highly correlated as well as highly
uncorrelated traces and is given by

X_BP = \sum_{i=p}^{q} \sigma_i u_i v_i^T,   1 \le p \le q \le r.   (12.3)

The fraction of the total energy contained in these eigenimages is

E = \frac{\sum_{i=p}^{q} \sigma_i^2}{\sum_{i=1}^{r} \sigma_i^2}.   (12.4)
The choice of p and q depends on the relative magnitudes of the singular values, which are a function of the input data. These parameters may, in general,
be estimated from a plot of the eigenvalues \lambda_i = \sigma_i^2 as a function of the index
i. This is reasonable given the form of equation (12.4). In certain cases, an
abrupt change in the eigenvalues is easily recognized. In other cases, the
change in eigenvalue magnitude is more gradual and care must be exercised in
the choice of the appropriate index values. Figure 12.2 illustrates the above
discussion in a simple fashion. Figure 12.2a represents a synthetic seismic section showing three reflectors, one of which is faulted. The section has been
corrupted with additive pseudo-white noise with a standard deviation of 20%
of the maximum amplitude. Figure 12.2b shows the variation of the relative
magnitudes of the eigenvalues. In this particular case Figure 12.2b shows that
the signal portion of X is contained in only the first two eigenimages. Indeed,
the first two eigenimages and the sum of these eigenimages, which are shown
in Figure 12.2c, d, and e, respectively, illustrate this point very clearly. We
note in particular that the second eigenimage bears the signature of the faulted
reflector and the highly correlated horizontal information appears in the first
eigenimage.
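The band-pass reconstruction of equation (12.3) and the energy fraction of equation (12.4) can be sketched as follows. The synthetic is a loose analogue of Figure 12.2 invented for illustration (one flat, rank-one event plus weak noise), and the helper names `x_bp` and `energy_fraction` are ours.

```python
import numpy as np

rng = np.random.default_rng(3)

# Loose analogue of Figure 12.2: a strong flat (rank-one) event plus weak
# white noise, M = 16 traces by N = 120 samples.
M, N = 16, 120
X = np.outer(np.ones(M), np.sin(2 * np.pi * np.arange(N) / 30.0))
X = X + 0.05 * rng.standard_normal((M, N))

U, s, Vt = np.linalg.svd(X, full_matrices=False)

def x_bp(p, q):
    """Equation (12.3): band-pass eigenimage from singular values p..q."""
    return (U[:, p - 1:q] * s[p - 1:q]) @ Vt[p - 1:q, :]

def energy_fraction(p, q):
    """Equation (12.4): fraction of total energy in components p..q."""
    return (s[p - 1:q] ** 2).sum() / (s ** 2).sum()
```

Plotting `s**2` against the index reproduces the eigenvalue plot used to pick p and q: here the drop after the first eigenvalue is abrupt, so nearly all the energy sits in the first eigenimage and the "high-pass" remainder is essentially noise.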
12.2.1
As we have seen, decomposition of an image X into eigenimages is performed by means of the SVD of X. Many authors also refer to this decomposition as the Karhunen-Loève or KL transformation. We believe, however, that
the SVD and KL approaches are not theoretically equivalent for image processing and, in order to avoid confusion, we suggest the adoption of the term
eigenimage processing. Some clarification is in order.
A wide-sense stationary process x(t) allows the expansion

\hat{x}(t) = \sum_{n=1}^{\infty} c_n \phi_n(t),   0 \le t \le T,   (12.5)
where the \phi_n(t) are a set of orthonormal functions in the interval (0, T) and the
coefficients c_n are random variables. The Fourier series is a special case of the
expansion given by equation (12.5), and it can be shown that, in this case,
\hat{x}(t) = x(t) for every t and the coefficients c_n are uncorrelated only when
x(t) is mean-squared periodic. Otherwise, \hat{x}(t) = x(t) only for 0 \le t \le T,
and the coefficients c_n are no longer uncorrelated. In order to guarantee that
the c_n are uncorrelated and that \hat{x}(t) = x(t) for every t without the requirement of mean-squared periodicity, it turns out that the \phi_n(t) must be determined from the solution of the integral equation
\int_0^T R(t_1, t_2) \phi(t_2) \, dt_2 = \lambda \phi(t_1),   0 < t_1 < T,   (12.6)
Figure 12.2.
An example showing an abrupt change in eigenvalue magnitude: (a) synthetic seismic section with noise, (b) magnitude of resulting eigenvalues,
(c) first eigenimage, (d) second eigenimage, and (e) sum of first two
eigenimages.
where R(t_1, t_2) is the autocovariance of the process x(t). Substituting the eigenfunctions that are the solutions of equation (12.6) into equation (12.5) gives the
KL expansion of x(t). An infinite number of basis functions is required to
form a complete set. However, for an N × 1 random vector x the dimensionality N is finite, and we may write equation (12.5) in terms of a linear combination of N orthonormal basis vectors w_i = (w_{i1}, w_{i2}, \ldots, w_{iN})^T as

x_k = \sum_{i=1}^{N} y_i w_{ik},   k = 1, 2, \ldots, N,   (12.7)
which is equivalent to
x = Wy,   (12.8)
where W = (w_1, w_2, \ldots, w_N). Now only N basis vectors are required for completeness. The KL transformation or, as it is also often called, the KL transformation to principal components, is obtained as

y = W^T x,   (12.9)

where the columns of W are the orthonormal eigenvectors of the covariance matrix of x,

C_x = W \Lambda W^T.   (12.10)
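Equations (12.9) and (12.10) are easy to verify numerically: transforming correlated data with the eigenvectors of its covariance yields components that are exactly uncorrelated, with variances equal to the eigenvalues. The mixing matrix below is a hypothetical example of our own.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical correlated data: rows of x are realizations of an
# N-component zero-mean vector with a non-diagonal covariance.
N, draws = 4, 2000
Amix = rng.standard_normal((N, N))
x = rng.standard_normal((draws, N)) @ Amix.T

# Equation (12.10): C_x = W Lambda W^T via the sample covariance.
Cx = (x.T @ x) / draws
lam, W = np.linalg.eigh(Cx)

# Equation (12.9): y = W^T x applied to every realization. The components
# of y are uncorrelated, and their variances are the eigenvalues lam.
y = x @ W
Cy = (y.T @ y) / draws
```

Because W diagonalizes the sample covariance exactly, the covariance of y is diagonal to machine precision; no statistical convergence argument is needed for this check.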
The entropy of a zero-mean Gaussian process x with covariance C_x is H(x) = E[-\ln p(x)], where |C_x| is the determinant of C_x appearing in the Gaussian density. Since the quadratic form in the exponent of the density is identically the expectation

E[x^T C_x^{-1} x] = E[\mathrm{tr}(C_x^{-1} x x^T)] = \mathrm{tr}\, I_N = N,

where E[\cdot] is the expectation operator, and 1 = \ln(e), we obtain the final
expression

H(x) = \frac{1}{2} \ln|C_x| + \frac{N}{2} \ln(2\pi e).

Since the determinant of a matrix is equivalent to the product of its eigenvalues, the above equation may be written as

H(x) = \frac{1}{2} \sum_{i=1}^{N} \ln(\lambda_i) + \frac{N}{2} \ln(2\pi e),

which defines the entropy of the Gaussian process x.
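The determinant and eigenvalue forms of the Gaussian entropy can be checked numerically on an arbitrary positive definite covariance (the matrix below is a hypothetical example; values are in nats):

```python
import numpy as np

rng = np.random.default_rng(5)

# Arbitrary positive definite covariance matrix for the check.
N = 5
B = rng.standard_normal((N, N))
Cx = B @ B.T + N * np.eye(N)

lam = np.linalg.eigvalsh(Cx)
const = 0.5 * N * np.log(2 * np.pi * np.e)
H_det = 0.5 * np.log(np.linalg.det(Cx)) + const   # ln|C_x| form
H_eig = 0.5 * np.log(lam).sum() + const           # eigenvalue form
```

The two agree because |C_x| is the product of the eigenvalues, so its logarithm is the sum of their logarithms.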
Using the above definition of the entropy, Young and Calvert (1974) show
that, given a positive definite matrix W, maximizing the entropy of y in
equation (12.9), subject to w_i^T w_i = 1, i = 1, 2, \ldots, N, results in W = U,
where U is obtained from the eigendecomposition XX^T = U \Sigma^2 U^T. In
other words, the principal component transformation constrains y to have
maximum entropy.
Let us now turn our attention to the problem of the KL transformation
for multivariate statistical analysis. In this case, we consider M row vectors x_i,
i = 1, \ldots, M, arranged as rows in an M × N data matrix X. The M rows of the
data matrix are viewed as M realizations of the stochastic process x and, consequently, the assumption is that all rows have the same row covariance matrix
C_r. The KL transform applied to X now gives

Y = (y_1, y_2, \ldots, y_N) = W^T X.   (12.11)
The row covariance matrix is estimated by

\hat{C}_r = \frac{1}{M-1} \sum_{i=1}^{M} x_i x_i^T = \frac{1}{M-1} XX^T = \frac{1}{M-1} U \Sigma^2 U^T,   (12.12)
assuming a zero-mean process for convenience. Since the factor M - 1 does
not influence the orthonormal eigenvectors, we can see from equation (12.12)
and the definition of U that W = U. Consequently, we can rewrite
equation (12.11) as

Y = U^T X.   (12.13)
Y U UV
V T
(12.14)
(12.15)
X UV .
(12.16)
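The identity Y = U^T X = Σ V^T can be verified numerically. The sketch below (an illustration, not from the text) builds a small data matrix and compares the two expressions:

```python
import numpy as np

rng = np.random.default_rng(1)

# M traces (rows) by N time samples (columns), with M < N.
M, N = 6, 20
X = rng.standard_normal((M, N))

# Reduced SVD: X = U @ diag(s) @ Vt, U is (M x M), Vt is (M x N).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# KL transform of the rows: Y = U^T X ...
Y = U.T @ X

# ... which equals Sigma V^T: each principal component is a scaled
# right singular vector.
assert np.allclose(Y, np.diag(s) @ Vt)
```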
12.2.2

F_2[X] = Σ_{i=1}^{r} σ_i F_2(u_i v_i^T),   (12.17)
where F_2[·] represents the 2-D FT. It is clear from equation (12.17) that the f-k representation of X may also be viewed in terms of eigenimages in that domain. Further, since the M rows of the ith eigenimage (u_{ki} v_i^T, k = 1, …, M) are equal to within a scale factor, we may write

F_2[u_i v_i^T] = F_1[u_i] F_1[v_i]^T,

where F_1[·] represents the 1-D FT. Consequently, we may write equation (12.17) as
F_2[X] = Σ_{i=1}^{r} σ_i F_1[u_i] F_1[v_i]^T.   (12.18)
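The separability used in equation (12.18), namely that the 2-D FT of an outer product is the outer product of the 1-D FTs, can be checked numerically (a sketch, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)

# One eigenimage: the outer product of a column of U and a column of V.
M, N = 8, 16
u = rng.standard_normal(M)
v = rng.standard_normal(N)
E = np.outer(u, v)

# 2-D DFT of the eigenimage versus outer product of the 1-D DFTs.
lhs = np.fft.fft2(E)
rhs = np.outer(np.fft.fft(u), np.fft.fft(v))

assert np.allclose(lhs, rhs)
```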
12.2.3
Finally, we turn to the problem of the actual computation of the eigenimage-filtered section. Even with the availability of efficient algorithms, the computation of the full decomposition may be a time-consuming task. This is particularly true in the seismic case, where the dimension N may be large. Fortunately, in our case, the dimension M is often considerably less than N, and we are also concerned with the reconstruction of X from only a few eigenimages. Consequently, we can reconstruct the filtered section, X_LP say, rapidly by computing only those eigenvectors of the (M × M) matrix XX^T which enter into the summation in equation (12.1). In order to make the derivation quite general, we will concern ourselves with the construction of a general X_BP, a band-pass SVD data matrix, using the singular values σ_p, σ_{p+1}, …, σ_q, where p ≥ 1, q ≤ r, and r is the rank of the matrix. We wish to compute X_BP without the necessity of computing the complete SVD of the data matrix X.
Using equation (12.2), the band-pass matrix X_BP is given by

X_BP = U_BP Σ_BP V_BP^T,   (12.19)

where U_BP, V_BP, and Σ_BP are equal to U, V, and Σ, respectively, with the exception that the first p − 1 and the last r − q columns of each matrix are zeroed. Without loss of generality we consider the case where M, the number of traces, is less than N, the number of time samples per trace, and we compute the covariance matrix XX^T of smaller dimension, which allows the decomposition

XX^T = U Σ^2 U^T.

Premultiplying X = U Σ V^T by U_BP^T gives

U_BP^T X = U_BP^T U Σ V^T.   (12.20)
Noting that

U_BP^T U = U_BP^T U_BP,   (12.21)

and substituting equation (12.21) into equation (12.20), we obtain
U_BP^T X = Σ_BP V^T.   (12.22)

Since the zeroed singular values in Σ_BP annihilate the corresponding rows of V^T,

Σ_BP V^T = Σ_BP V_BP^T.

Using the above expression in equation (12.22),

U_BP^T X = Σ_BP V_BP^T.   (12.23)

Premultiplying by U_BP and using equation (12.19),

X_BP = U_BP U_BP^T X.   (12.24)

It is interesting to note from equation (12.24) that in the case when p = 1 and q = r = M,

X_LP = U_LP U_LP^T X = X.
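A sketch of this computation (assuming the notation above; not code from the text): the band-pass reconstruction X_BP = U_BP U_BP^T X needs only the eigenvectors of the small M × M matrix XX^T, and agrees with the reconstruction obtained from the full SVD.

```python
import numpy as np

rng = np.random.default_rng(3)

M, N = 8, 64                      # M traces, N samples per trace, M < N
X = rng.standard_normal((M, N))

p, q = 2, 4                       # band of singular values to keep (1-based)

# Eigendecomposition of the small M x M matrix XX^T.
lam, U = np.linalg.eigh(X @ X.T)
order = np.argsort(lam)[::-1]     # sort eigenvalues descending
U = U[:, order]

# Band-pass reconstruction without computing V: X_BP = U_BP U_BP^T X.
U_bp = U[:, p - 1:q]
X_bp = U_bp @ U_bp.T @ X

# Check against the reconstruction from the full SVD of X.
Uf, s, Vt = np.linalg.svd(X, full_matrices=False)
X_bp_svd = Uf[:, p - 1:q] @ np.diag(s[p - 1:q]) @ Vt[p - 1:q, :]

assert np.allclose(X_bp, X_bp_svd)
```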
12.3 Applications

In this section we will illustrate, with synthetic and real data examples, some of the applications of eigenimage processing to seismic sections and, where applicable, we will compare the results with those obtained using f-k filtering. In particular, we are interested in addressing such issues as signal-to-noise enhancement, wavefield decomposition, and residual statics correction.

12.3.1
layers. Only primary reflections are considered, and the signal has been corrupted with additive white noise with a standard deviation equal to 20% of the maximum signal amplitude. Figure 12.3b shows the 2-D amplitude spectrum of the noiseless input and is indicative of the horizontal character of the input. To separate the signal from the background noise, we applied an f-k filter with a cutoff of −1 ms/trace to +1 ms/trace. The 2-D amplitude spectrum of the output is shown in Figure 12.3c. Figure 12.3d illustrates the 2-D amplitude spectrum of the output of the eigenimage filter with q = 1, i.e., we have used only the first eigenimage. Simple visual inspection shows immediately the close similarity between the actual noiseless signal spectrum and that obtained using eigenimage decomposition. The eigenimage filter has recovered almost the exact f-k signature of the input, including the symmetry with respect to the f-axis. The f-k filter has certainly increased the S/N, but at the expense of signal distortion. Another quite striking view of the two different filtering schemes is illustrated in Figures 12.3e-g, which show the 2-D amplitude spectra of the actual noise and the noise rejected by the f-k and the eigenimage filters, respectively. As can be seen, the spectrum of the noise output from the eigenimage filter is almost exactly that of the input, whereas the spectrum of the noise rejected by the f-k filter shows clearly that this filter cannot separate noise which occupies the same f-k band as the signal. The difference is that whereas the f-k filter rejects f-k components, the eigenimage filter rejects uncorrelated components.
To gain deeper insight into the eigenimage filter, it is interesting to consider this example in a little more detail. Let S and N represent the noiseless data matrix, which is composed of M identical traces, and the noise matrix, which is uncorrelated with the data, respectively. The input matrix is X = S + N, as illustrated in Figure 12.3a. Since S is of rank one, it may be reconstructed from the first eigenimage. Using the reconstruction

X_LP = U_LP U_LP^T X,

we may write

S = u_1 u_1^T S,

where u_1 is the eigenvector associated with the largest eigenvalue and, in this particular case, the matrix u_1 u_1^T is composed of elements each equal to 1/M.
Figure 12.3.
The eigenvectors of the covariance matrix satisfy

C_x u = λ u,

or, with C_x = C_s + C_n,

(C_s + C_n) u = λ u,   (12.25)

and, for white noise with variance σ_N^2,

C_s u = (λ − σ_N^2) u.   (12.26)

From equations (12.25) and (12.26) it is evident that the eigenvectors of XX^T are insensitive to white noise, while the eigenvalues are increased by σ_N^2. Consequently, the first eigenimage of X is u_1 u_1^T X and

u_1 u_1^T X = u_1 u_1^T S + u_1 u_1^T N
            = S + u_1 u_1^T N.   (12.27)
Equation (12.27) shows the behavior of the SVD filter very clearly. The signal is recovered fully, and the noise is suppressed in a manner equivalent to an average stack. In the more general case, when the signal may vary from trace to trace, the rank of S is greater than one and the noise will be suppressed by an optimally weighted average. This average is optimum in the sense that the weight vectors u_i are obtained as a result of a maximum variance criterion. We have computed the S/N for this example using the expression

S/N = Tr{SS^T} / Tr{(Y − S)(Y − S)^T},

where Y is the filtered matrix and Tr{·} represents the trace of the matrix. The actual S/N of the input was 0.6. The values computed for the two filtering schemes were 3.20 for the f-k filter and 4.24 for the eigenimage filter, a 5-dB improvement.
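The rank-one case can be simulated directly. The sketch below (assumptions: 24 identical traces and white noise at 20% of the peak amplitude, loosely following the synthetic example; not the authors' code) applies the first-eigenimage filter and evaluates the S/N expression above:

```python
import numpy as np

rng = np.random.default_rng(4)

M, N = 24, 256
t = np.arange(N)

# Rank-one signal: M identical traces (a simple Ricker-like pulse).
arg = (np.pi * 0.03 * (t - 128)) ** 2
trace = (1 - 2 * arg) * np.exp(-arg)
S = np.tile(trace, (M, 1))

# Additive white noise, standard deviation 20% of the peak amplitude.
Noise = 0.2 * np.abs(S).max() * rng.standard_normal((M, N))
X = S + Noise

def snr(Y, S):
    """S/N = Tr{S S^T} / Tr{(Y - S)(Y - S)^T}."""
    return np.trace(S @ S.T) / np.trace((Y - S) @ (Y - S).T)

# First-eigenimage (low-pass) filter: X_LP = u1 u1^T X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
u1 = U[:, :1]
X_lp = u1 @ u1.T @ X

assert snr(X_lp, S) > snr(X, S)   # the eigenimage filter raises the S/N
```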
12.3.2 Wavefield Decomposition
The model for the ith linear event on the kth trace is

s_ik(t) = a_ik w_i(t − τ_i − (k − 1)ΔX/V_i),   (12.28)

where a_ik is the amplitude of the ith event on the kth trace, w_i is the wavelet associated with the ith event, and τ_i is the time of occurrence of the ith event on the first trace. The spatial interval is ΔX, and V_i is the apparent velocity of the ith event. Each trace in the section is now represented by

x_k(t) = Σ_{i=1}^{L} s_ik(t) + n_k(t),   (12.29)
matrix of the data, we will consider a coherency measure based on these eigenvalues.

It is evident from the discussion concerning equations (12.25), (12.26), and (12.27) that, for an input section consisting of one event only [L = 1 in equation (12.29)], which has been aligned to simulate zero moveout, the ideal eigenvalue distribution of the row covariance matrix consists of the major eigenvalue λ_1 = σ_S^2 + σ_N^2, where σ_S^2 is the variance of the signal, together with M − 1 eigenvalues λ_i = σ_N^2 for 2 ≤ i ≤ M. Various measures which indicate the presence of signal immediately suggest themselves. For example, we can define measures K_1, K_2, and K_3 given by

K_1 = λ_1 / Σ_{i=1}^{M} λ_i,   K_2 = λ_1 / Σ_{i=2}^{M} λ_i,   K_3 = λ_1 / λ_2.
These measures were applied by Ulrych et al. (1983) and Jones (1985) in
velocity analysis of CMP sections. As pointed out by Key and Smithson
(1990), however, although important, these measures fail to take into account
the presence of the noise variance in the energy estimate. In an attempt to
overcome this shortcoming, Key and Smithson (1990) present a measure
which appears to have high sensitivity and resolution. We briefly outline this
approach here and compare the various measures from the point of view of
event identification and separation.
First, since C_r ∝ XX^T is only an estimate of the true covariance matrix of the data, an estimate of the noise variance is determined from the eigenvalues of C_r as

σ_N^2 = (1/(M − 1)) Σ_{i=2}^{M} λ_i,

and the signal variance as

σ_S^2 = λ_1 − σ_N^2,

giving an estimated S/N

S/N = σ_S^2 / σ_N^2 = [(M − 1)λ_1 − Σ_{i=2}^{M} λ_i] / Σ_{i=2}^{M} λ_i.   (12.30)
Due to the sensitivity of the eigenvalues to the presence of signal, Key and Smithson (1990) formulate a weighting function W_ML which is a log-generalized likelihood ratio testing the hypothesis that no signal is present:

W_ML = −MN ln[ (Π_{i=1}^{M} λ_i)^{1/M} / ((1/M) Σ_{i=1}^{M} λ_i) ].   (12.31)

The point behind W_ML is that, given precise knowledge of C_r, in the presence of noise only, λ_i = σ_N^2 for i = 1 to M and hence W_ML = 0. In the presence of signal only, λ_1 = σ_S^2, λ_j = 0 for j ≥ 2, and W_ML → ∞. W_ML thus provides a strong discriminant in the presence of signal. Key and Smithson (1990) combine equations (12.30) and (12.31) to obtain a new coherency measure K_ML given by

K_ML = W_ML · S/N.   (12.32)

If the events are indeed linear, M and N in equations (12.30) and (12.31) may be taken to represent the full seismic section. Since, in practice, events may often show nonlinear moveout, K_ML is computed using suitable windows which then define the indices M and N.
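A sketch of these statistics (assuming the eigenvalue conventions above; not code from the text) computes σ_N^2, the S/N of equation (12.30), W_ML, and K_ML from the eigenvalues of a windowed row covariance matrix:

```python
import numpy as np

def coherency_measures(lam, N):
    """lam: eigenvalues of the M x M row covariance matrix.
    N: number of time samples in the analysis window."""
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    M = lam.size

    sigma_n2 = lam[1:].sum() / (M - 1)       # noise variance estimate
    snr = (lam[0] - sigma_n2) / sigma_n2     # equation (12.30)

    # Log-generalized likelihood ratio, equation (12.31):
    # geometric over arithmetic mean of the eigenvalues.
    gm = np.exp(np.mean(np.log(lam)))
    am = np.mean(lam)
    w_ml = -M * N * np.log(gm / am)

    k_ml = w_ml * snr                        # equation (12.32)
    return sigma_n2, snr, w_ml, k_ml

# Noise only: all eigenvalues equal, so W_ML = 0 and S/N = 0.
_, snr0, w0, _ = coherency_measures([2.0, 2.0, 2.0, 2.0], N=100)

# Signal present: a dominant eigenvalue makes both measures positive.
_, snr1, w1, _ = coherency_measures([10.0, 1.0, 1.0, 1.0], N=100)
```

By the arithmetic-geometric mean inequality, W_ML is zero exactly when all eigenvalues are equal and positive otherwise, which is the discriminant behavior described above.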
An example of the use of the measures we have discussed for event identification is illustrated in Figure 12.4. Figure 12.4a shows an input section composed of two equal-amplitude dipping events of opposite polarity with additive random noise. This example is close to the one used by Rutty and Jackson (1992) in wavefield decomposition using spectral matrix techniques. The dips are twice the sample interval and the S/N is 20. Figure 12.4b shows the eigenvalue variation as a function of dip and illustrates very clearly the philosophy behind the various measures which we compare. Specifically, Figures 12.4c-e show the measures K_1, K_2, and K_3, respectively. The various components of the Key and Smithson (1990) measure, K_ML, are shown in Figures 12.4f-h.

Although K_ML certainly exhibits a far superior resolution to the other three measures, the problem is that this measure assumes that only one event exists in the window of interest. The weighting function given by equation (12.31) completely dominates the measure, and the information in the S/N, computed using equation (12.30), is swamped. In this particular instance, the best
Figure 12.4.

Figure 12.4. An example of seismic event detection (continued): (i) dipping event 1 recovered by low-pass eigenimage filtering, (j) dipping event 2 recovered by low-pass eigenimage filtering, (k) sum of dipping events i and j.
measure is K_3, but much work in this regard needs to be done. One path we are at present pursuing is the definition of norms based on the concept of entropy.

The actual separation of the dipping events using low-pass eigenimage filtering is shown in Figures 12.4i and j. We note the low amplitudes in the middle of the reconstructed section, which correspond to destructive interference of the events as seen in the input data. The improvement in the S/N is shown in Figure 12.4k, which is the sum of the two low-passed reconstructed events.
12.3.2.2 Vertical Seismic Profiling
A classic example of the use of eigenimage decomposition is its application to vertical seismic profiling (VSP) for the purpose of separating up- and downgoing waves (Freire and Ulrych, 1988). The processing sequence is illustrated schematically in Figure 12.5. Figure 12.5a is the input section showing the up- and downgoing events, and Figure 12.5b is the same section following time shifting of each trace by appropriate amounts. The time shifting may be performed either in the time domain, which will in general require some type of interpolation, or, as is our preference, in the frequency domain by operating on the phase of each trace. This latter approach requires two FFTs per trace, but obviates interpolation. Figures 12.5c and d show the shifted up- and downgoing waves, respectively, which are the outputs of band-pass and low-pass eigenimage filtering. The values of p and q required for this purpose [equation (12.3)] are determined from an examination of the behavior of the relative magnitudes of the eigenvalues of the matrix XX^T, as previously discussed. The final steps in the processing are illustrated in Figures 12.5e-g, and consist of time shifting the recovered up- and downgoing waves to their original positions, transferring the upgoing components into two-way traveltime using computed first breaks, and stacking to produce a final trace.
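The frequency-domain time shift mentioned above can be sketched as follows (an illustration, not the authors' code): multiplying the spectrum by a linear phase delays the trace with no time-domain interpolation, and for an integer-sample shift the result matches a circular shift exactly.

```python
import numpy as np

def phase_shift(trace, shift_samples):
    """Delay a trace by shift_samples (may be fractional) by applying
    a linear phase ramp exp(-2j*pi*f*t0) in the frequency domain."""
    n = trace.size
    f = np.fft.fftfreq(n)                 # frequency in cycles/sample
    ramp = np.exp(-2j * np.pi * f * shift_samples)
    return np.fft.ifft(np.fft.fft(trace) * ramp).real

rng = np.random.default_rng(5)
x = rng.standard_normal(128)

# For an integer shift, the phase-shifted trace equals a circular shift.
assert np.allclose(phase_shift(x, 5), np.roll(x, 5))
```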
An example from the state of Bahia in Brazil is shown in Figure 12.6. Figure 12.6a shows the input VSP section, and Figure 12.6b illustrates the time-shifted section reconstructed from the first four eigenimages. This section is in fact the X_LP reconstruction with q = 4 and represents the downgoing waves. Figure 12.6c is X_BP with p = 5, q = 41, shifted to the two-way traveltime, and low-pass reconstructed with p = 6. This figure represents the separated and signal-to-noise-enhanced upgoing wave section and compares with the f-k fan-filtered upgoing wave section shown in Figure 12.6d. Striking
Figure 12.5. A schematic illustration of the process used to separate up- and downgoing VSP events: (a) input VSP section, (b) time-shifted version of (a), flattening the downgoing events, (c) time-shifted upgoing events after band-pass eigenimage filtering, (d) time-shifted downgoing events after low-pass eigenimage filtering, (e) downgoing events after reversing the time shift, (f) upgoing events after reversing the time shift.
and important differences may be seen, particularly in the first 2.2 s of the sections. Of special interest is the reflection at 1.9 s, which has been very well recovered in the eigenimage section, whereas f-k fan filtering was unable to do so.

In comparing eigenimage and f-k filtering, it is important to point out that whereas, due to the periodic nature of the FT, spatial aliasing occurs in f-k filtering whenever the maximum spatial frequency exceeds 1/(2Δx), where Δx is the spatial sampling interval, eigenimage reconstruction does not entail a periodicity assumption and, consequently, spatial aliasing does not arise.
12.3.3

A time shift t_0 of a trace x(t) corresponds in the frequency domain to multiplication of its spectrum X(f) by the linear phase e^{−i2πft_0}. For a constant phase angle φ, the shifted spectrum is

e^{−iφ} X(f), f > 0,
e^{+iφ} X(f), f < 0,

and hence

F_1[x(t − t_0)] = (cos φ − i sgn f sin φ) X(f), all f.

Transforming back to time,
Figure 12.6. An example of separating up- and downgoing VSP events: (a) input VSP section, (b) time-shifted section reconstructed from the first four eigenimages, the estimate of the downgoing events, (c) estimate of upgoing waves from band-pass eigenimage filtering, (d) upgoing waves estimated by f-k filtering.
x(t − t_0) = ℜ[ x̂(t) e^{−iφ} ],   (12.33)

where x̂(t) denotes the analytic (complex) trace of x(t).
12.4 Discussion

Eigenimage reconstruction is a nonlinear, data-dependent filtering method which may be applied either in the real or the complex domain to achieve a number of important processing objectives. Since seismic sections are, in general, highly correlated trace to trace, eigenimage reconstruction is a parsimonious representation in the sense that the data may be reconstructed from only a few images. A natural consequence of such a reconstruction is, as we have shown, an improvement of the S/N. Eigenimage reconstruction has a capacity similar to f-k filtering to remove events that show an apparently different phase velocity on the section. The actual process by which this is accomplished is quite different from that entailed in f-k filtering. In the former approach, events are removed which do not correlate well from trace to trace. In the latter, events are rejected which possess different f-k signatures. One of the consequences of this difference is that eigenimage filtering is not subject to spatial aliasing in the sense of f-k filtering. However, eigenimage reconstruction encounters difficulties similar to those of the f-k approach in separating events with similar dips.

As we have mentioned, the linear moveout model is not essential in eigenimage processing. In fact, Ulrych et al. (1983) and Key and Smithson (1990) employed a hyperbolic moveout model to estimate stacking velocities in shot gathers. A very fine example of S/N enhancement using eigenimages on stacked seismic sections which contain highly curved events has recently been presented by Al-Yahya (1991), who applied the filter for selected dips and then combined the results for all dips to produce the composite enhanced section.
Figure 12.7. A synthetic example of residual statics correction using eigenimage decomposition: (a) synthetic section formed by repeating one trace 24 times, (b) input section formed by subjecting the section in (a) to random time and phase shifts and adding white noise, (c) processed section with residuals estimated using the standard approach of correlating against a stacked trace, (d) section in (c) phase corrected using the phase estimate from the eigenvector associated with the major eigenvalue, (e) processed section using low-pass eigenimage reconstruction of section (b), assuming only time shifts, (f) phase correction determined from the major eigenvector, applied to section (b).
Figure 12.8. An example of residual statics correction using seismic data: (a) input data section, (b) application of conventional residual statics correction.
Much of the subject matter in this book involves the use of the spectral matrix. It is appropriate, therefore, that we comment on the correspondence between eigenimage and spectral matrix techniques. The spectral matrix is formed in the frequency domain as the complex covariance matrix at each frequency. For linear events in time, each frequency contains a sum of L harmonics associated with the L events on the section. Separation of the events is then achieved by means of averaging in frequency and/or space using window functions which depend on the input data. This is possible because of the redundancy of information which exists in the frequency-space domain. Unlike the eigenimage technique, a priori information concerning the dips of the events is not required. However, as shown by the recent work of Rutty and Jackson (1992), this entails certain penalties, which manifest themselves as edge effects and the requirement that the events to be separated have different energies. These points are well illustrated by a comparison of Figures 12.4i and 12.4j in this chapter with Figures 15.3a and 15.3b in Rutty and Jackson (1992). These latter figures show the edge effects in the separated events, even when the events have different energies, at the same time illustrating that, unlike the eigenimage approach, the amplitudes of the events are recovered. At the time of this writing, we are engaged in research that involves combining the spectral matrix and eigenvector methods to develop an approach that reflects the best of both techniques.
12.5 References

Al-Yahya, K. M., 1991, Application of the partial Karhunen-Loève transform to suppress random noise in seismic sections: Geophys. Prosp., 39, 77-93.

Anderson, T. W., 1971, An introduction to multivariate statistical analysis: John Wiley & Sons, Inc.

Andrews, H. C., and Hunt, B. R., 1977, Digital image restoration: Prentice-Hall, Inc., Signal Processing Series.

Freire, S. L. M., and Ulrych, T. J., 1988, Application of singular value decomposition to vertical seismic profiling: Geophysics, 53, 778-785.

Freire, S. L., and Ulrych, T. J., 1990, An eigenimage approach to the attenuation of multiple reflections: Butsuri-Tansa, 43, 1-13.

Gerbrands, J. J., 1981, On the relationship between SVD, KLT, and PCA: Pattern Recognition, 14, 375-381.

Hemon, C. H., and Mace, D., 1978, Essai d'une application de la transformation de Karhunen-Loève au traitement sismique: Geophys. Prosp., 26, 600-626.

Jones, I. F., 1985, Applications of the Karhunen-Loève transform in reflection seismic processing: Ph.D. thesis, Univ. of British Columbia.

Jones, I. F., and Levy, S., 1987, Signal-to-noise ratio enhancement in multichannel seismic data via the Karhunen-Loève transform: Geophys. Prosp., 35, 12-32.

Key, S. C., and Smithson, S. B., 1990, New approach to seismic-reflection event detection and velocity determination: Geophysics, 55, 1057-1069.

Lanczos, C., 1961, Linear differential operators: D. Van Nostrand Co.

Levy, S., Jones, I. F., Ulrych, T. J., and Oldenburg, D. W., 1983, Applications of common signal analysis in exploration seismology: 53rd Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 325-328.

Levy, S., and Oldenburg, D. W., 1987, The deconvolution of phase shifted wavelets: Geophysics, 47, 1285-1294.

Loève, M., 1951, Probability theory: D. Van Nostrand Co.

Marchisio, G., Pendrel, J. V., and Mattocks, B. W., 1988, Applications of full and partial Karhunen-Loève transformations in geophysical image enhancement: 58th Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 1266-1269.

Ready, R. J., and Wintz, P. A., 1973, Information extraction, S/N improvement and data compression in multispectral imagery: IEEE Trans. Communications, 21, 1123-1130.

Rutty, M. J., and Jackson, G. M., 1992, Wavefield decomposition using spectral matrix techniques: Exploration Geophysics, 23, 293-298.

Treitel, S., Shanks, J. L., and Frazier, C. M., 1967, Some aspects of fan filtering: Geophysics, 32, 789-800.

Ulrych, T. J., Freire, S. L., and Siston, P., 1988, Eigenimage processing of seismic sections: 58th Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 1261-1265.

Ulrych, T. J., Levy, S., Oldenburg, D. W., and Jones, I. F., 1983, Applications of the Karhunen-Loève transformation in reflection seismology: 53rd Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, (page numbers).

Young, T. Y., and Calvert, T. W., 1974, Classification, estimation and pattern recognition: American Elsevier Publishing Co., Inc.
Chapter 13
Single-Station Triaxial Data Analysis
G. M. Jackson, I. M. Mason, S. A. Greenhalgh
13.1 Introduction
In a high signal-to-noise environment, a single triaxial geophone can provide estimates of the polarization state of a seismic arrival. Knowledge of the
mode of the event (e.g., P-wave) and elastic properties of the host rock can be
used to infer the direction of propagation. Alternatively, knowledge of the
direction of propagation can be combined with the polarization state to identify the various wave modes and characterize the elastic properties of the host
rock. Polarization analysis might, for example, allow body waves (rectilinear
polarization) to be distinguished from surface waves (elliptical polarization).
Polarization measurements are therefore important in many areas of seismology.
Polarization analysis can be achieved efficiently by treating a time window of a single-station triaxial recording as a matrix and doing a singular-value decomposition (SVD) of this seismic data matrix. SVD is a standard matrix algebra technique which produces both an eigenanalysis of the data covariance (cross-energy) matrix and a rotation of the data onto the directions given by the eigenanalysis (the Karhunen-Loève transform). Before proceeding with the SVD approach, however, it is necessary to discuss the selection of the data entering the polarization analysis.
betti and Kanasewich (1970), Esmersoy (1984), and Jurkevics (1988) for more detail. We have

M = X^T X = Σ_{i=1}^{n} [ x_x^2(i)       x_x(i)x_y(i)   x_x(i)x_z(i)
                          x_y(i)x_x(i)   x_y^2(i)       x_y(i)x_z(i)
                          x_z(i)x_x(i)   x_z(i)x_y(i)   x_z^2(i)  ].   (13.1)
Matrix X: The data matrix. The columns x_x, x_y, x_z are the traces of the triaxial recording. The matrix is of dimension n × 3, where n is the number of samples in the window.

Matrix W: A diagonal matrix (3 × 3) produced by SVD of X. The diagonal elements w_1, w_2, w_3 are the singular values. Each singular value is the square root of the corresponding eigenvalue of X^T X or XX^T (they share eigenvalues). Since X^T X = M, w_1, w_2, w_3 are the square roots of the eigenvalues λ_1, λ_2, λ_3.
M = V Λ V^T,   (13.2)

K = XV,   (13.3)

X = UWV^T.   (13.4)
Figure 13.1.

Figure 13.2.

Figure 13.3.
The columns of matrix V are v_1, v_2, v_3: the eigenvectors of the cross-energy matrix M. W is a 3 × 3 diagonal matrix with the singular values w_1, w_2, w_3 as the diagonal elements. Each singular value w_i is the positive square root of the corresponding eigenvalue λ_i of the cross-energy matrix M. Each column of matrix U is the same column of the rotated data matrix K divided by the corresponding singular value. This follows from

K = XV
  = UWV^T V   (SVD of X)
  = UW   (V is orthonormal).   (13.5)
U is the signal matrix X after both projection into the Cartesian frame of the eigenvectors of the cross-energy matrix M and normalization by division with the singular values. Since the eigenvalues of M are the energies of the principal components in the window, the singular values are proportional to the amplitudes of the data principal components. Division of the columns of K by the singular values produces the n-dimensional unit vectors u_1, u_2, and u_3 describing the time behavior of the data principal components.

The SVD of data matrix X can also be written as the sum of three eigenimages or principal components:

X = UWV^T = Σ_{i=1}^{3} w_i u_i v_i^T   (13.6)
  = E_1 + E_2 + E_3.

Each eigenimage is the outer product of a column of U with a column of V, weighted by the corresponding singular value. Each eigenimage is a principal component of the data (a trace of the KL transform) expressed in the Cartesian frame of the recording. The eigenimages are mutually orthogonal in that no one eigenimage can be reconstructed from a combination of the other two. Superposition or stacking of the eigenimages reconstructs the data X.
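As a numerical sketch (assuming a synthetic n × 3 window; not code from the text), the eigenimage decomposition of a triaxial window can be computed with a standard SVD routine:

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic triaxial window: n samples by 3 components (x, y, z).
n = 50
X = rng.standard_normal((n, 3))

# SVD: X = U W V^T, with the singular values on the diagonal of W.
U, w, Vt = np.linalg.svd(X, full_matrices=False)

# The squared singular values are the eigenvalues of the cross-energy
# matrix M = X^T X.
lam = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
assert np.allclose(w ** 2, lam)

# Sum of the three eigenimages E_i = w_i * outer(u_i, v_i) rebuilds X.
E = [w[i] * np.outer(U[:, i], Vt[i, :]) for i in range(3)]
assert np.allclose(E[0] + E[1] + E[2], X)
```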
Signal/Noise = (λ_1 + λ_2 − 2λ_3) / (3λ_3).   (13.7)
If one has the a priori knowledge that the arrival is rectilinearly polarized, the estimate of the noise energy along v_2 becomes λ_2, and the estimate of the noise energy along v_1 can be taken as (λ_2 + λ_3)/2. The signal-to-noise ratio (energy) is then

Signal/Noise = [λ_1 − (λ_2 + λ_3)/2] / [λ_2 + λ_3 + (λ_2 + λ_3)/2].   (13.8)
Since the singular values give estimates of the noise energy and of the energy of the signal principal components, the singular values can be used to calculate a probability as to (a) whether there is signal present (not just random noise) and (b) whether the signal is rectilinearly polarized (a body wave), as opposed to elliptically polarized (a surface wave). F-tests are used to calculate the significance of the differences between energies.

First, we evaluate the null hypothesis that the data energy in the window is the same as the noise energy. The ratio of the data energy (λ_1 + λ_2 + λ_3) to the noise energy for the more general elliptically polarized signal model (3λ_3) gives an F-ratio that is evaluated in terms of the number of independent samples N that contributed to the energies. Adapting the F-test of Press et al. (1986, p. 468), we have

F = (λ_1 + λ_2 + λ_3) / (3λ_3)   (13.9)

and

Q(F|N) = I_{N/(N + NF)}(N/2, N/2),   (13.10)
where Q (F/N) is the significance level of the null hypothesis that the energies
are equal, given the ratio of the energies F and the number of independent
samples N contributing to the energies. I is the incomplete beta function (see
Press et al., 1986, p. 166). The number of independent samples is best
obtained by doing a discrete Fourier transform of the real data window and
counting the number of frequency samples with significant amplitude. Only if
the data are sampled at the Nyquist rate will the number of independent samples (N) be equal to the total number of samples in the analysis window (n).
Typically N is less than n.
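The counting procedure just described can be sketched as follows; the 10-percent amplitude threshold is an assumed definition of "significant," not a value from the text:

```python
import cmath
import math

def independent_samples(x, threshold=0.1):
    """Estimate N by counting DFT bins (k = 0 .. n//2) whose amplitude exceeds
    `threshold` times the largest amplitude. The threshold is an assumed
    choice of what counts as a significant frequency sample."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        X = sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
        mags.append(abs(X))
    peak = max(mags)
    return sum(1 for m in mags if m > threshold * peak)

# A single sinusoid occupies one frequency bin ...
n = 32
mono = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
# ... while a superposition occupies more, so the estimated N grows
# with the bandwidth of the data in the window.
duo = [math.sin(2 * math.pi * 3 * i / n) + 0.5 * math.sin(2 * math.pi * 7 * i / n)
       for i in range(n)]
print(independent_samples(mono), independent_samples(duo))
```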
If the null hypothesis has low significance in this first F-test, one can
assume that signal is present. It then becomes meaningful to ask whether the
residual energies (noise energy) of the rectilinear and elliptically polarized signal models are different. If the null hypothesis (that the energies are the same)
is significant, the data are rectilinearly polarized. The residual energy is
σ₂² + σ₃² + (σ₂² + σ₃²)/2 for the rectilinearly polarized signal model and 3σ₃² for the elliptical signal model, giving

F = (σ₂² + σ₃²) / (2σ₃²).
Again, the F-ratio is evaluated in terms of the number of independent samples N that contributed to the energies.
Referring to Figure 13.1, the singular values in the presence of added noise
were 3.600, 1.145, and 0.868. This gives the total energy in the data window
as 15.024 (sum of the squares of the singular values) with a noise energy of
3.097 for the linearly polarized model and 2.260 for the elliptically polarized
signal model. The F-test gives a significance of only 0.0025% to the hypothesis that there is no signal present (an F-ratio of 0.15 on 24 independent samples). Given the near certainty of the presence of signal in the window, one
can then test the null hypothesis that the noise variance of the linearly polarized signal model is no greater than that of the elliptically polarized signal
model. The F-ratio of 1.370 on 24 independent samples gives a 46% significance level to the null hypothesis that the rectilinear and elliptically polarized
noise energies are the same. One can therefore say that the signal is rectilinear
with a confidence level of only 46%. Noise has degraded the certainty with
which one can say the signal is rectilinearly polarized.
The signal is unknown so noise energy has to be defined as that which is
uncorrelated between channels. Consequently, the energy identified as noise
on specific examples will show a stochastic variation representing the interaction of the true signal (unknown) with the true added noise (also unknown).
Significance levels must therefore be interpreted in the knowledge that first,
Figure 13.4.
Figure 13.3 shows the addition of noise to the elliptically polarized example. The noise is the same as that used in the example of Figure 13.1. The first
principal component of the signal is still recognizable but the second component of the elliptically polarized signal is buried in the noise.
A comparison of the direction v₁ in Figures 13.2 and 13.3 shows a 180-degree flip, which illustrates a fundamental ambiguity in single station triaxial
recordings. It is not possible to distinguish the true arrival from an arrival of
opposite polarity arriving from the opposite direction. This ambiguity can
often be eliminated by other constraints: for example, an explosion must produce a compressional (not rarefactional) first motion, and the direction of a
wave recorded on the surface cannot be downgoing. In the absence of a priori
information, components of opposite polarity along antiparallel directions are
equivalent. The 180-degree flip must be removed before looking at the deviation due to the added noise. Note that the polarity of the first trace of U is
reversed accordingly.
The singular values for the noisy elliptically polarized example
(Figure 13.3) were 3.470, 0.990, and 0.803. Performing the F-test for the
presence of signal, we get a significance of only 0.0015% for the null hypothesis, strongly suggesting the presence of signal. The second F-test to distinguish rectilinear and elliptical polarization is, however, inconclusive. We get a
significance level for the null hypothesis of 58% from an F-ratio of 1.260 and
24 independent samples.
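Both worked examples can be checked numerically: the residual-energy ratio reduces algebraically to (σ₂² + σ₃²)/(2σ₃²). A quick sketch (function name ours):

```python
def rectilinearity_f_ratio(w1, w2, w3):
    """Ratio of the rectilinear-model residual energy, (3/2)(w2^2 + w3^2),
    to the elliptical-model residual energy, 3*w3^2; this simplifies to
    (w2^2 + w3^2) / (2*w3^2). w1 is unused but kept for symmetry."""
    return (w2**2 + w3**2) / (2.0 * w3**2)

# Figure 13.1 example (rectilinear signal): the text quotes F = 1.370.
print(rectilinearity_f_ratio(3.600, 1.145, 0.868))
# Figure 13.3 example (elliptical signal): the text quotes F = 1.260.
print(rectilinearity_f_ratio(3.470, 0.990, 0.803))
```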
A tighter window around the arrival would have reduced the noise energy
in the window, while leaving the signal energy unchanged. The consequent
increase in signal-to-noise ratio would have reduced σ₃ with respect to σ₂, giving rise to a higher F-ratio and a decreased significance for the null hypothesis
(rectilinear polarization). One could then state with a higher level of confidence that the arrival was elliptically polarized. Note, however, that since
noise is defined as that which is uncorrelated between channels and since correlation over too short a window finds spurious correlation between independent time series, too short a window will give an estimate of noise that might
be less than the "true" noise. Noise in the short window is indistinguishable
from signal, so the signal estimate is corrupted.
13.6 Summary
Polarization analysis can therefore be efficiently achieved by treating a
time window of a single station triaxial recording as a matrix and doing a singular-value decomposition (SVD) of this seismic data matrix. SVD of the triaxial data matrix produces an eigenanalysis of the data covariance (cross-energy) matrix and a rotation of the data onto the directions given by the
eigenanalysis (Karhunen-Loève transform), all in one step.
Singular-value decomposition offers a computationally efficient method
of analyzing seismic arrivals at a triaxial station. An eigenanalysis of the cross-energy matrix is produced along with a rotation of the data onto the principal
component directions given by the eigenanalysis:
1) The signal is contained in the plane perpendicular to the column v3 of V
corresponding to the smallest singular value (σ₃).
2) The first and second columns v1, v2 of V provide least-squares best estimates of the axes of the signal polarization ellipse. These directions are
mutually perpendicular.
3) The squares of the singular values (found along the diagonal of matrix W)
give the energies of the three data principal components. The noise and
signal principal component energies can be inferred, allowing an F-test for
the significance of the hypothesis of rectilinear polarization, as well as the
calculation of signal-to-noise ratios.
4) The projection of the data along the principal axes of polarization is
obtained as the columns of matrix U multiplied by the corresponding singular values. This is the Karhunen-Loève transform K of the data X.
5) The eigenimages produced by SVD are the projection of the data principal components along the Cartesian axes of the recording frame.
Thus SVD provides a complete principal-components analysis of the data
in the analysis time window. Selection of this time window is crucial to the
success of the analysis and is governed by three considerations: the window
should contain only one arrival, the window should be such that the signal-to-noise ratio is maximized, and the window should be long enough to allow the
discrimination of random noise from signal.
The SVD analysis provides estimates of signal, signal polarization directions, and noise. F-tests based on the singular values can be used to give confidence levels for hypotheses like the absence of signal, or rectilinear (versus
elliptical) polarization of the signal.
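As an illustrative sketch of the whole procedure (our own synthetic data and numpy, not code from the text), steps 1 through 4 reduce to a single call to a library SVD:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic triaxial window: n samples x 3 components, one rectilinear
# arrival along direction d plus a little random noise (assumed setup).
n = 64
t = np.arange(n)
wavelet = np.sin(2 * np.pi * 3 * t / n) * np.hanning(n)
d = np.array([2.0, 1.0, 1.0])
d /= np.linalg.norm(d)
X = np.outer(wavelet, d) + 0.05 * rng.standard_normal((n, 3))

# SVD of the data matrix: X = U W V^T, all in one step.
U, w, Vt = np.linalg.svd(X, full_matrices=False)

# v3 (smallest singular value) is normal to the signal plane;
# v1 estimates the dominant polarization direction.
v1, v3 = Vt[0], Vt[2]
K = U * w     # Karhunen-Loeve transform of the data (columns of U scaled by w)

snr = (w[0]**2 + w[1]**2 - 2 * w[2]**2) / (3 * w[2]**2)   # equation (13.7)
print(abs(np.dot(v1, d)))   # close to 1: v1 aligned with the true direction
```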
13.7 References
Dankbaar, J. W. M., 1985, Separation of P and S waves: Geophys. Prosp., 33,
970-986.
Esmersoy, C., 1984, Polarization analysis, orientation and velocity estimation
in three-component VSP, in Toksöz, M. N., and Stewart, R. R., Eds., Vertical seismic profiling, Part B: Advanced concepts: Geophysical
Press.
Flinn, E. A., 1965, Signal analysis using rectilinearity and direction of particle
motion: Proc. IEEE, 53, 1874-1876.
Freire, S. L. M., and Ulrych, T. J., 1988, Application of SVD to vertical seismic profiling: Geophysics, 53, 778-785.
Hemon, C., and Mace, D., 1978, The use of the Karhunen-Loève transformation in seismic data processing: Geophys. Prosp., 26, 600-626.
Jones, I. F., and Levy, S., 1987, Signal-to-noise ratio enhancement in multichannel seismic data via the Karhunen-Loève transform: Geophys.
Prosp., 35, 12-32.
Jurkevics, A., 1988, Polarization analysis of three-component array data: Bull.,
Seis. Soc. Am., 78, 1725-1743.
Levy, S., Ulrych, T. J., Jones, I. F., and Oldenburg, D. W., 1983, Applications
of complex common signal analysis in exploration seismology: 53rd
Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 325-328.
Montalbetti, J. F., and Kanasewich, E. R., 1970, Enhancement of teleseismic
body phases with a polarization filter: Geophys. J. Roy. Astr. Soc., 21,
119-129.
Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T., 1986,
Numerical recipes: The art of scientific computing (Fortran and Pascal): Cambridge Univ. Press.
Ulrych, T. J., Levy, S., Oldenburg, D. W., and Jones, I. F., 1983, Applications
of the Karhunen-Loève transformation in reflection seismology: 53rd Ann. Internat. Mtg., Soc. Expl. Geophys., Expanded Abstracts, 323-325.
Chapter 14
Correlation Using Triaxial Data from Multiple
Stations in the Presence of Coherent Noise
M. J. Rutty and S. A. Greenhalgh
14.1 Introduction
Polarization analysis of single station multicomponent seismic data has
been performed with success by several researchers (Montalbetti and
Kanasewich, 1970; Vidale, 1986; Flinn, 1965; Esmersoy, 1984; Magotra et
al., 1987). The technique has two major objectives: (1) to devise filters to distinguish between events with different modes of vibration (e.g., P- and S-waves versus Rayleigh waves) and (2) to provide a means of estimating the
direction of particle motion for use in seismic direction finding.
The most common applications of triaxial seismic recording to date are in
vertical seismic profiling (VSP) and earthquake seismology. Various filtering
techniques have been applied to individual three-component records, but very
little work has been carried out using the polarization information from more
than a single station at a time. One exception is Jurkevics (1988) who advocates averaging the covariance matrices formed at different stations. This simple procedure reduces the estimation variance by a factor 1/M, where M is the
number of stations used, but it cannot cope with coherent noise between the
stations. Bataille and Chiu (1991) use a similar averaging procedure to reduce
the effects of incoherent arrivals. They also consider the error introduced
when an interfering event is present within the time window of the single station polarization analysis.
The major problem with the conventional single station approach to
polarization analysis is that it places some severe restrictions on the types of
data set that can be processed: generally, data must have high signal-to-noise
ratio events and no coherent noise. This means that all events under examination must be well separated in time. These restrictions have led to the current
rather limited use of the technique.
Since multicomponent information is often available from more than one
position in space, techniques which use this information should be developed.
These would then supplement the single-station processing techniques when
the data fail to meet their criteria. If a coherent event could be singled out
from any coherent noise prior to applying polarization processing procedures,
a significant improvement in the results of polarization analysis would follow.
The technique described below is an attempt in this direction. Correlating
events are picked from a multicomponent seismic section in the presence of
both random and coherent noise. This represents the first stage of a possible
processing procedure to deal with multicomponent data acquired over a sparse
spatial array currently under development.
The theory behind monochromatic electromagnetic polarization (with its
vibration restricted to a plane) is well documented (e.g., Born and Wolf,
1965), but seismic polarization analysis for polychromatic transients, in which
the motion is three-dimensional, is poorly developed. After a brief discussion
of single station polarization theory and its limitations, an analysis of a
two-station approach is presented. An interpretation of the physical significance of the multidimensional vector space is proposed and a correlation procedure developed. This is then applied to synthetic and physical scale model
data with varying levels of both coherent and random noise.
C[t₀] = Σ u(i) uᵀ(i),  summed over i from t₀ - W/2 to t₀ + W/2.   (14.1)
Here W is the window length, T represents the transpose operator and t0 is the
center of the window. Any rectilinear motion within the time window will
constructively add in the covariance matrix, improving the ratio of signal to
random noise. If the ratio of signal and random noise energy is sufficient for
the window chosen, the direction of the signal will dominate within the covariance matrix. A coordinate rotation made to the principal axes of the covariance matrix using similarity transforms (rotations) will reveal this direction
and the proportion of energy along it. This transformation is achieved by an
eigendecomposition of the covariance matrix, with its eigenvectors being the
principal axes and its eigenvalues proportional to the energy in each of these
directions. Expressed in matrix notation:
Λ = Rᵀ C R,   (14.2)

where R is the matrix whose columns are the eigenvectors (the rotation) and Λ is the diagonal matrix of eigenvalues.
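Equations (14.1) and (14.2) amount to only a few lines of numpy. The following sketch (synthetic data of our own choosing, not the book's) forms C over a window of noisy rectilinear motion and recovers the arrival direction from the dominant eigenvector:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic triaxial record: rectilinear motion along `direction` plus noise.
n = 200
direction = np.array([1.0, 2.0, 2.0]) / 3.0     # unit direction cosines
s = np.sin(2 * np.pi * np.arange(n) / 25.0)
u = np.outer(s, direction) + 0.1 * rng.standard_normal((n, 3))

# Equation (14.1): C[t0] = sum over the window of u(i) u(i)^T.
t0, W = 100, 80
window = u[t0 - W // 2 : t0 + W // 2]
C = window.T @ window

# Equation (14.2): rotation to principal axes via eigendecomposition.
lam, R = np.linalg.eigh(C)           # eigenvalues in ascending order
lam, R = lam[::-1], R[:, ::-1]       # reorder to descending

print(lam[0] / lam[1])               # large: one dominant direction
print(abs(R[:, 0] @ direction))      # close to 1: principal axis recovered
```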
14.2.1
The expected particle polarization for perfect single mode elastic arrivals
will be either one-dimensional (e.g., rectilinear P- or S-waves) or two-dimensional (e.g., plane, polarized Rayleigh waves). A rectilinear arrival is easily dealt
with in the time domain using the covariance analysis since each sample in the
time window constructively sums along its arrival direction. This produces a
unique dominant direction in the eigenanalysis of its covariance matrix. However, if the arrival being examined is a nonrectilinearly polarized event, such as
a Rayleigh wave, the signal energy is observed in a plane. The covariance
matrix analysis cannot then indicate this as a coherent event. This type of
event must be examined in a different data domain. Vidale (1986) uses the
spectral coherency matrix, which is a complex analogue to the covariance matrix.
Analytic signals are generated by taking the Hilbert transform of the data as its
imaginary part. These are then used to calculate the coherency matrix C as
C[t₀] = Σ u(i) u^H(i),  summed over i from t₀ - T/2 to t₀ + T/2,   (14.3)
where T is the time window, u is the analytic signal and H represents the complex conjugate transpose operator. A complex eigendecomposition performed
on this matrix reveals any clean coherent signals (including nonrectilinearly
polarized ones) in a unique complex direction.
Another equally valid approach to deal with nonrectilinearly polarized
events is to work in the frequency domain. This uses the (complex) cross-spectral matrix averaged over a frequency window. Plane-polarized events which
are well separated (in frequency) from one another again sum constructively in
a single complex principal direction, which may be obtained by eigendecomposition.
When using a covariance matrix style of processing, it is important to be
sure of the type of signal that is under examination. If it is a plane polarized
event, then a complex representation must be used (i.e., the analytic signal in
the time domain or the Fourier transform in the frequency domain). However, if the signal is rectilinearly polarized, computing time may be saved by a
factor of about four by using the real signal.
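The contrast between the real and complex treatments can be illustrated as follows. The FFT-based analytic-signal construction is standard; the circular particle motion is our own synthetic example, not the book's data:

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (Hilbert transform as imaginary part)."""
    n = len(x)                # n assumed even here
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:n // 2] = 2.0
    h[n // 2] = 1.0
    return np.fft.ifft(X * h)

# Plane-polarized (Rayleigh-like) particle motion: a 90-degree phase shift
# between two components, so no single real direction dominates.
n = 128
t = np.arange(n)
u = np.stack([np.cos(2 * np.pi * 4 * t / n),
              np.sin(2 * np.pi * 4 * t / n),
              np.zeros(n)], axis=1)

# Real covariance: the energy is split between two equal eigenvalues.
lam_real = np.linalg.eigvalsh(u.T @ u)[::-1]

# Complex coherency of the analytic signal, equation (14.3): the elliptical
# event collapses into a single complex direction (one dominant eigenvalue).
ua = np.apply_along_axis(analytic_signal, 0, u)
lam_cplx = np.linalg.eigvalsh(ua.conj().T @ ua)[::-1]

print(lam_real[0] / lam_real[1])             # ~1: ambiguous in the real domain
print(lam_cplx[0] / (lam_cplx[1] + 1e-9))    # large: unique complex direction
```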
14.2.2
For a single rectilinear arrival with wavelet s₁(t), direction-cosine vector d, and additive noise n(t), the recorded signal is

u(t) = s₁(t) d + n(t).   (14.4)

Substituting this model into the coherency matrix and expanding gives

C[t₀] = (1/T) ∫ s₁(t) s₁^H(t) d d^H dt + (1/T) ∫ s₁(t) d n^H(t) dt
      + (1/T) ∫ n(t) s₁^H(t) d^H dt + (1/T) ∫ n(t) n^H(t) dt,   (14.5)

with all integrals taken over the window from t₀ - T/2 to t₀ + T/2.
For random noise, the cross terms in equation (14.5) average toward zero, concentrating the coherent signal in the dominant eigenimage. Most single station polarization processing therefore assumes that any noise within the time window of analysis is
random. If this criterion is satisfied, the technique is highly successful. However, it is important to realize that this places major restrictions on the types of
data set that may be legitimately processed.
If the noise is spatially coherent or polarized, or the signal-to-noise ratio
falls below unity, the noise terms in equation (14.5) dominate the covariance
matrix and a successful polarization analysis is not possible. Single station
polarization data must therefore have a good signal-to-noise ratio and must
not contain more than one event in each time window used. In practice, the
window for the analysis optimally should be positioned to contain all significant samples of the signal (maximizing signal-to-noise energy) without interference from any previous or subsequent events. It should also be as long as
possible within these constraints so that signal may be enhanced above the
random noise (Jackson et al., 1991).
14.2.3
When considering realistic data, it is very important to appreciate the significance of the eigenvalues of the covariance matrix. If there were no noise,
then any nonzero eigenvalue would be significant. When noise is present, the
significance of certain eigenvalues depends on the types of signal and noise
present. Let us denote the eigenvalues by λᵢ and arrange them in descending order. If there is a rectilinear event in purely random noise (which implies only one event in our time window), the noise energy within the window may be estimated as 3λ₃ (Kanasewich, 1981). In this case λ₁ is significant if λ₁ ≫ λ₂ ≈ λ₃. However, if several phase-shifted overlapping events with different arrival directions were present within our time window, then, even with no random noise, λ₃ may be comparable with λ₁ and so provides no indication of the significance of λ₁.
Significance of eigenvalues may usually be tested by comparison with the
smallest eigenvalue of the covariance matrix. However, this is only true if the
signal and any coherent noise within the window are known to span a vector
space of smaller dimension than our covariance matrix. The smallest eigenvalue is then purely due to random noise. Given a specific window over the
data, the dimension of the particle motion within it can be estimated. A filter
function can then be calculated to enhance rectilinear signals using the relative
size of the eigenvalues. This function is then applied to the midpoint of the window.
14.2.4
Figure 14.1.
A VSP source receiver geometry. Triaxial receivers down a borehole receive signals generated by a source near the surface. Coherent noise may
be caused by local scatterers and reflected events.
The number of significant eigenvalues represents the dimensionality of the six-component signal (or the rank of its signal space).
14.3.1
u₁ = s₁(t) d₁   (14.6)

and

u₂ = s₁(t - t₀) d₂ + s₃(t) d₃;   (14.7)

s₁ represents the correlating wavelet, delayed by t₀ on station 2, and s₃ the wavelet produced by the local scatterer. The direction cosines of the arrivals at each station are given by dᵢ = {d_xi, d_yi, d_zi}ᵀ.
Setting uᵀ = {u₁ᵀ u₂ᵀ}, the binocular 6 × 6 coherency matrix formed from u is

C[τ] = | A d₁d₁^H     B d₁d₂^H |  +  | 0            D d₁d₃^H                          |
       | B^H d₂d₁^H   E d₂d₂^H |     | D^H d₃d₁^H   F d₂d₃^H + F^H d₃d₂^H + G d₃d₃^H |,   (14.8)

where

A = (1/T) ∫ s₁(t) s₁^H(t) dt,
B(τ) = (1/T) ∫ s₁(t) s₁^H(t - t₀ + τ) dt,
D(τ) = (1/T) ∫ s₁(t) s₃^H(t + τ) dt,
E = (1/T) ∫ s₁(t - t₀) s₁^H(t - t₀) dt,
F(τ) = (1/T) ∫ s₁(t - t₀) s₃^H(t + τ) dt,
G = (1/T) ∫ s₃(t) s₃^H(t) dt,

with each integral taken over the window from -T/2 to T/2.
T is the length of the time window on the data and H represents the conjugate
transpose operator.
The coherency matrix has been split into the sum of two separate matrices
in equation (14.8). The first of these represents the coherence between the
correlating arrivals of s1. Its characteristic equation is:
λ⁶ - (A + E)λ⁵ + (AE - BB^H)λ⁴ = 0,   (14.9)
implying at least four of its six eigenvalues must be zero. This is due to its representing only one rectilinear event on each station. Each event contributes to
the signal space rank, which therefore has a maximum value of 2 in this example.
As the time difference τ - t₀ tends to zero, i.e., when the event on both stations is perfectly correlated, then A, B, and E all tend to the same constant. This implies that AE - BB^H → 0, which in turn implies that only one eigenvalue will be nonzero (λ₁ = A + E). Thus, if the signal correlating on both
stations is not correctly aligned, its phase shift introduces some linear independence. When it is time-aligned, the data on the two stations are linear
combinations of each other. Since the eigenanalysis separates the traces into
independent orthogonal signals, the dimension of the eigenanalysis must then
decrease. This result holds irrespective of the directions of arrival (in real
space) at the two stations.
If there were no scattered event s₃, the right-hand matrix in equation (14.8) would be zero and so the rank of the coherency matrix would fall from
two to one when the event on each station correlates perfectly. When the overlapping event s3 is present, we might expect a drop in rank from three to two
when the correlating event s1 is correctly aligned. This procedure may therefore be able to identify correlating events in coherent noise.
A simple synthetic experiment was performed to test this possibility. Signal s1 on station 1 is correlated with a signal at station 2 created as the superposition of signal s1 and another signal s3 (Figure 14.2). Signal s1 has arrival
direction ratios of 2:1:1 at station 1 and 1:1:2 in the merged signal at station
2. The superposed three-component signal on station 2 thus contains signal s1
but in differing proportions between separate traces on the station.
The time window used in the test was the complete trace. A cyclic time
shift was applied to the traces of the second station and the covariance matrix
and its eigenvalues calculated at each shift. The eigendimension of the motion
is indicated in the lower part of Figure 14.2 by the plot of the ratio of the
3rd/1st eigenvalues against the time shift in the covariance matrix. The rank
of the covariance matrix clearly drops at τ = t₀, the propagation delay of the
correlating signal s1. When the merged event s1 on station 2 is not aligned
with event s1 on station 1 then there are three nonzero eigenvalues (one from
station 1 and two from station 2). However, when event s1 is aligned, there is a
linear dependence inherent in the covariance matrix and so there are only two
nonzero eigenvalues (one from signal s₁ on both stations and one from signal s₃ on station 2). This dependence occurs despite the somewhat arbitrary
choice of arrival directions (particle motion directions) of the correlating signal.
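A minimal numpy reconstruction of this experiment (our own wavelets, delay, and direction ratios, not the book's exact data) exhibits the same rank collapse:

```python
import numpy as np

n = 256
t = np.arange(n)
# Band-limited wavelets for the correlating event s1 and the scatterer s3
# (assumed shapes, chosen only to be linearly independent time series).
s1 = np.exp(-((t - 60) / 6.0) ** 2) * np.cos(2 * np.pi * (t - 60) / 20.0)
s3 = np.exp(-((t - 90) / 6.0) ** 2) * np.sin(2 * np.pi * (t - 90) / 16.0)

t0 = 40                                     # propagation delay of s1 to station 2
d1 = np.array([2.0, 1.0, 1.0]) / np.sqrt(6.0)
d2 = np.array([1.0, 1.0, 2.0]) / np.sqrt(6.0)
d3 = np.array([2.0, 1.0, 2.0]) / 3.0

u1 = np.outer(s1, d1)                                    # station 1, equation (14.6)
u2 = np.outer(np.roll(s1, t0), d2) + np.outer(s3, d3)    # station 2, equation (14.7)

def third_to_first(shift):
    """lambda3/lambda1 of the 6x6 covariance with station 2 cyclically shifted."""
    u = np.hstack([u1, np.roll(u2, -shift, axis=0)])
    lam = np.linalg.eigvalsh(u.T @ u)[::-1]              # descending
    return lam[2] / lam[0]

ratios = [third_to_first(s) for s in range(n)]
print(int(np.argmin(ratios)))    # minimum (rank collapse 3 -> 2) at the true delay
```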
Figure 14.2.
Synthetic traces for two triaxial geophones. A single rectilinear event arrives on stations 1 and 2 with direction ratios 1:2:3 and 2:1:1, respectively. Another rectilinear signal is superposed on station 2 constituting
coherent noise. The ratio of 3rd/1st eigenvalues of the binocular covariance
matrix is plotted for different time shifts of one station to the other. Perfect correlation is observed when the 3rd eigenvalue becomes zero at the
correct time shift, τ = t₀.
14.3.2
and should be of sufficient signal-to-noise ratio such that a single-station covariance analysis is capable of detecting it. Once the reference wavelet has been
selected and windowed, a similar window is placed over the beginning of the
traces of the station we wish to examine (the correlation station). The eigenvalues of both the correlation station and the combined station covariance
matrices are calculated and stored before the window on the correlation station is time shifted and the calculation repeated.
As the window is shifted across the data frame at the correlation station,
several events may be encountered. These events may or may not be rectilinear
with varying amounts of random and spatially coherent noise. The eigenstructure of both the binocular and correlation station covariance matrices will be
duly modified. If a correlating event is correctly time aligned, the rank of the
two-station covariance matrix should decrease compared to the sum of the
ranks of the two single stations. The signal space ranks for the combined and
single station data are related by the inequality
Rank(C(τ)) ≤ Rank(C₁(τ)) + Rank(C₂(τ)),   (14.10)

while the total energy (the trace of the covariance matrix) is conserved:

Σᵢ λᵢ^comb = Σᵢ λᵢ^ref + Σᵢ λᵢ^corr,   (14.11)
where λᵢ^comb, λᵢ^ref, and λᵢ^corr represent the eigenvalues of the combined,
reference and correlation station covariance matrices, respectively. The distribution of these eigenvalues changes as different events appear in the correlation window in accordance with the rank inequality [equation (14.10)].
Correlation functions R1 and R2 have therefore been defined below. The functions maximize when a rectilinear reference event is compared with a similar
rectilinear event in both purely random noise (R1) and in random and coherent noise (R2):
R₁(τ) = [λ₁^comb(τ) / λ₂^comb(τ)] · [λ₂^sing(τ) / λ₁^sing(τ)]   (14.12)

and

R₂(τ) = [λ₁^comb(τ) λ₂^comb(τ) / λ₃^comb(τ)] · [λ₃^sing(τ) / (λ₁^sing(τ) λ₂^sing(τ))].   (14.13)

Here λᵢ^comb is the ith eigenvalue of the combined covariance matrix and λᵢ^sing is the ith ordered eigenvalue in the set of all eigenvalues from both single station covariance matrices. These eigenvalues vary with τ according to equation (14.8).
The function R1 has a maximum when the second eigenvalue of the binocular covariance matrix is a minimum and the principal eigenvalues of the
individual covariance matrices are maximum. This represents a collapse of
rank of the combined covariance matrix from dimension 2 to 1, implying a
correlating rectilinear signal without significant coherent noise. Similarly, R2
seeks a collapse of the 6 × 6 covariance matrix from rank 3 to rank 2. This
maximizes when a correlating rectilinear signal and another rectilinear event
are superposed on station 2. Clearly, other functions Rⱼ (j > 2) could be
formed to pick for correlating plane polarized events.
Each function is formed using a ratio of the eigenvalues of the combined
covariance matrix and a weight comprising the inverse ratio of the corresponding single station eigenvalues. If no event is present in a range of the correlation station's traces, R1 and R2 will be flat functions over that range with
amplitudes around unity. Spikes are formed when there is correlation at the
corresponding time delay. Their amplitudes are not dependent on the amplitude of either reference or correlation events but on the linear dependence
between their wavelets.
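Under the same kind of synthetic setup as before (our own wavelet, delay, and directions, not the book's data), equations (14.12) and (14.13) can be sketched as:

```python
import numpy as np

def picking_functions(u1, u2, shift):
    """R1 and R2 of equations (14.12)-(14.13) for one trial shift of the
    correlation station u2 (stations given as n x 3 arrays)."""
    u2s = np.roll(u2, -shift, axis=0)
    both = np.hstack([u1, u2s])
    lam_comb = np.linalg.eigvalsh(both.T @ both)[::-1]        # descending
    lam_sing = np.sort(np.concatenate([
        np.linalg.eigvalsh(u1.T @ u1),
        np.linalg.eigvalsh(u2s.T @ u2s)]))[::-1]
    eps = 1e-12          # guard against exactly zero eigenvalues (assumed choice)
    r1 = (lam_comb[0] / (lam_comb[1] + eps)) * (lam_sing[1] / (lam_sing[0] + eps))
    r2 = (lam_comb[0] * lam_comb[1] / (lam_comb[2] + eps)) * \
         (lam_sing[2] / (lam_sing[0] * lam_sing[1] + eps))
    return r1, r2

# One rectilinear wavelet on station 1; the same wavelet delayed by t0 = 25
# samples (different direction) on station 2, plus light random noise.
n, t0 = 128, 25
t = np.arange(n)
w = np.exp(-((t - 40) / 5.0) ** 2) * np.cos(2 * np.pi * (t - 40) / 15.0)
d1 = np.array([2.0, 1.0, 1.0]) / np.sqrt(6.0)
d2 = np.array([1.0, 2.0, 1.0]) / np.sqrt(6.0)
rng = np.random.default_rng(2)
u1 = np.outer(w, d1) + 0.02 * rng.standard_normal((n, 3))
u2 = np.outer(np.roll(w, t0), d2) + 0.02 * rng.standard_normal((n, 3))

r1 = [picking_functions(u1, u2, s)[0] for s in range(n)]
print(int(np.argmax(r1)))    # R1 spikes at the correct delay
```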
The statistical distribution describing the accuracy of the correlation procedure is unknown. Standard statistical theory uses normalized nondimensional standard distributions to test for the significance of specific events. R1
and R2 are nondimensional functions whose amplitudes indicate some measure of the confidence of the corresponding type of event. They may therefore
Figure 14.3.
Figure 14.4.
Figure 14.5.
Synthetic seismogram with a rectilinear zero-phase band-pass filtered (0-60 Hz) spike reference wavelet on station 1 correlated against both separated and interfering versions of itself on station 2 in the presence of 50
percent noise. The reference window length is 72 ms and is shown by the
rectangle on the reference station traces.
Figure 14.6.
Synthetic seismogram with a rectilinear zero-phase band-pass filtered (0-60 Hz) spike reference wavelet on station 1 correlated against both separated and interfering versions of itself on station 2 in the presence of 100
percent noise. The reference window length is 72 ms and is shown by the
rectangle on the reference station traces.
Figure 14.7.
have been superposed with direction ratios 3:2:1 and 2:1:2, respectively.
Uniformly distributed random noise was applied to the data such that its
amplitude limits were the stated percentage of the maximum peak to trough
value of the reference signal.
The general appearance of the functions R1 and R2 is that of several spikes
within a flat unit background. The amplitude of the spikes decreases with the
increase of noise (Figure 14.6) indicating the lowering confidence level of
picking the respective events. Figure 14.4 shows the reference station and correlation stations with 10% noise. The window length of the analysis is 72 ms
and the reference window is marked by the rectangular box. All three correlating events are correctly picked by spikes on R1 and R2, which clearly dominate
the picking functions. Note how the event free of coherent noise is picked by
R1 while the separate picks for both of the interfering events are successfully
identified on R2.
As the noise level applied to the synthetic data is increased, the confidence
of our picking functions is eroded. Figure 14.5 shows the same data set with
50% noise on both reference and correlation stations. Again, the expected
events are indicated by spikes on the relative picking functions. However,
there are some unexpected maxima present, especially on R1. Sidelobes around
correct picks originate due to the wavelet shape and our choice of window
length. As with one-component correlation, a wavelet with an oscillatory
nature will exhibit ringing in its autocorrelation. However, when this effect
is observed on R1, the sidelobes must all lie within a time interval equal to the
sum of the window length chosen for the correlation and the time the transient event is significant. Thus, if we choose the window length to be the
observable period of the signal, we expect any sidelobes to be within two window lengths on R1. More significantly, if separate events are present but separated by a time less than the chosen window length, then they should show as
interfering events on R2 and not on R1. The reason for this is that the covariance matrices cannot contain information from only one of the signals. Any
secondary maxima occurring within a window length of an event picked on R1
must therefore be due to ringing.
There are also maxima on R1 when an overlapping event is detected as
shown around 380 ms (Figure 14.5). These are due to the way the signal vector space is affected by the overlapping events. However, if R2 picks any significant overlapping events, then there should not be any valid picks on R1
present within the time window used for each such pick. Thus, any maxima
showing on R1 within such a time window are not significant.
Figure 14.6 shows the same data with 100% noise superposed. The corresponding events can still be picked correctly, though our confidence levels are
much lower with values of R1 and R2 lying between 2.0 and 3.0. Spurious
maxima can be detected as outlined above. This result is very encouraging and
provides some measure of the significance of picking correct times in high
noise.
For comparison, a single-station analysis has been performed on the correlation station data with 10% noise using the same window length. Four rectilinearity (or filter gain) functions commonly used in single-station polarization
analysis have been plotted in Figure 14.7. All are formed from the normalized
eigenvalues of the covariance matrix. The function F1 is the rectilinearity filter
function defined by Esmersoy [1984, Equation (13)]:
$$F_1 = \frac{1}{2}\left(\frac{3\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3} - 1\right), \qquad (14.14)$$

$$F_2 = 1 - \left(\frac{\lambda_2 + \lambda_3}{2\lambda_1}\right)^{N}, \qquad (14.15)$$

$$F_3 = 1 - \left(\frac{\lambda_2}{\lambda_1}\right)^{N}, \qquad (14.16)$$

where λ1 ≥ λ2 ≥ λ3 are the ordered eigenvalues, with suggested values of N = 0.3 to 1.0. (We have used N = 1.0.) The polarization measure F4 is referred to by Esmersoy:

$$F_4 = \frac{\lambda_1}{\lambda_2 + \lambda_3}. \qquad (14.17)$$
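The four measures, in the forms reconstructed above with ordered eigenvalues λ1 ≥ λ2 ≥ λ3 and N = 1.0, can be sketched numerically; the test wavelet, direction ratios, and noise level below are hypothetical:

```python
import numpy as np

def rectilinearity_measures(C, N=1.0):
    """Polarization measures from the ordered eigenvalues l1 >= l2 >= l3 of a
    3x3 three-component covariance matrix C [equations (14.14)-(14.17)]."""
    l3, l2, l1 = np.linalg.eigvalsh(C)          # eigvalsh returns ascending order
    F1 = 0.5 * (3.0 * l1 / (l1 + l2 + l3) - 1.0)
    F2 = 1.0 - ((l2 + l3) / (2.0 * l1)) ** N
    F3 = 1.0 - (l2 / l1) ** N
    F4 = l1 / (l2 + l3)
    return F1, F2, F3, F4

rng = np.random.default_rng(0)
# Rectilinear particle motion along one direction (ratios 3:2:1) plus weak noise
w = np.sin(2 * np.pi * 30 * np.arange(512) / 1000.0)        # 30-Hz test wavelet
d = np.array([3.0, 2.0, 1.0]); d /= np.linalg.norm(d)
X = np.outer(w, d) + 0.01 * rng.standard_normal((512, 3))
F1, F2, F3, F4 = rectilinearity_measures(X.T @ X / len(X))

# Isotropic noise: all eigenvalues comparable, so F1 falls toward zero
F1n, _, _, _ = rectilinearity_measures(np.cov(rng.standard_normal((3, 4096))))
```

A strongly rectilinear window drives F1, F2, and F3 toward 1 and makes F4 large, while isotropic noise drives F1 toward 0.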
All these functions show local maxima for the single event at 126 ms.
However, the resolution of the picks is poor. The maxima have a width nearly
twice the window length used in all cases except for F4. The interfering events
at 362 ms and 382 ms have not been detected and would be suppressed using
such rectilinearity measures. Of even greater concern, two erroneous maxima
occur at times corresponding to where the processing window contains the
beginning of the first and the end of the last interfering wavelets. This is due
to each signal having no appreciable interference from the other wavelet at
these positions. The failure of single-station processing on such a high-quality
data set indicates the problems caused by coherent noise. When processing such a
data set, a multiple station approach has a clear advantage.
Figure 14.8.
Moving surface profile geometry for recording reflected signals from the
bottom of an aluminum plate in the laboratory. Wave paths and types are
shown schematically. A biaxial receiver records the wavefield produced
by a source which is repeated at several locations down the borehole. The
source is moved in intervals of 3 cm from the top (trace 1) to the bottom
(trace 41) of the plate.
Figure 14.9 identifies all the major events using the correlation procedure. The
two-component model data have been rotated into horizontal and vertical
directions, and zero traces were used in a dummy transverse direction. The reference event used was formed by placing a 16-ms window around the direct P
arrival on station 22 (shown by the small rectangle). The correlation functions
R1 and R2 are plotted for station 31, highlighted by the larger rectangular box
in the figure. All expected events are labelled. These correspond to direct P (P,
210 ms), reflected P (PP, 310 ms), direct S (S, 350 ms), P to S conversion (PS,
360 ms), S to P conversion (SP, 480 ms) and reflected S (SS, 540 ms). Note
how the overlapping events (S and PS) appear on R2 while those free of interference occur on R1.
The correlation of a small subset of the traces is examined in detail in
Figures 14.10–14.12. These illustrations are concerned with the interference
of the reflected P event (PP) and the direct S event (S) at stations 27 to 29 and
indicate the practicality of the technique. All figures have the same reference
event as Figure 14.9 (window length 16 ms) and use a significance cut-off
confidence value of Ri = 2.0. Figure 14.10 successfully picks the overlapping
PP (324 ms) and direct S (320 ms) events on station 27. The pick on R1 corresponds to that of the direct S but it is overridden by the same pick on R2 since
it is within a window length of a valid pick of an interfering arrival. However,
it serves to reinforce the interpretation of the presence of a coherent event. It is
particularly encouraging to see that the function R2 can distinguish the two
events when they are so closely overlapping and when one has much more
energy content than the other.
Figures 14.11 and 14.12 show the results of the procedure applied to stations 28 and 29 to examine the same two events. In Figure 14.11, the interfering PP and S events are clearly picked on station 28 at 320 ms and 328 ms,
respectively, using function R2. The spike on R1 is at the correct time for S but
again is overridden by the pick on R2. In Figure 14.12, the events have now
separated by more than a window length (16 ms) and are therefore clearly
picked by the function R1.
14.7 Conclusions
Three-component seismic data acquisition and processing is still a long
way from becoming a standard processing technique. This is because single-station polarization processing requires data with high signal-to-noise ratio
events and no coherent noise. There are also practical problems associated
Figure 14.9.
Raw data from the experiment shown in Figure 14.8. The data have
been rotated into horizontal and vertical components and the times
scaled by a factor of 1000. This corresponds to increasing the dimensions of the experiment by 1000. Correlation functions are shown for a
reference event on station 22 and correlation data on station 31 indicated by the rectangles on the seismograms. The window length used is 16
ms and the major separated and overlapping events are labelled on the
functions R1 and R2.
Figure 14.10.
Figure 14.11.
Figure 14.12.
with acquiring the data, as the stations must be carefully calibrated to be able
to gain any directional information. On the other hand, the advantages of triaxial data are numerous in certain environments. Whenever there is restricted
access, conventional seismic techniques are not possible. A triaxial geophone
station records the complete wavefield with no assumption of a two-dimensional earth model. This then allows the polarization techniques to be applied
in areas with complicated geology.
If realistic data sets are to be examined using polarization analysis, a multiple-station processing approach is more likely to succeed than that of the single station. This will enable the problem of coherent noise to be tackled and
also provide an increase in signal-to-noise ratios. A multicomponent binocular
correlation technique has therefore been developed with the potential to distinguish overlapping arrivals. This technique will supplement conventional
single station event detection algorithms which cannot perform in the presence of coherent noise. A confidence estimate of the event pick is also directly
related to the size of the correlation picking function. Performing the two-station covariance eigenanalysis as opposed to that of the single station increases
the computational time by a factor of around four.
The algorithm has been tested successfully using synthetic data with signal-to-noise ratios varying down to at least unity and with physical scale
model data obtained in the laboratory. The authors are currently in the process of acquiring three-component field data in a mine and hope to apply the
technique to this data set in the near future.
This technique would be of special interest in areas where conventional
seismic surveys cannot be carried out due to access difficulties. The procedure
also has direct applications in the areas of statics and wavetype identification
as well as earthquake seismology where multicomponent data are commonly
acquired using sparse sensor arrays.
14.8 References
Aki, K., and Richards, P. G., 1980, Quantitative seismology: W. H. Freeman
Co., 2 vols.
Bataille, K., and Chiu, J. M., 1991, Polarization analysis of high-frequency,
three-component seismic data: Bull., Seis. Soc. Am., 81, 622-642.
Benhama, A., Cliet, C., and Dubesset, M., 1988, Study and applications of
spatial directional filtering in three-component recordings: Geophys.
Prosp., 36, 591-613.
Born, M. and Wolf, E., 1965, Principles of optics, 3rd edition: Pergamon
Press, Inc.
Esmersoy, C., 1984, Polarization analysis, rotation and velocity estimation in
three-component VSP, in Toksoz, M. N., and Stewart, R. R., Eds., Vertical seismic profiling, Part B: Advanced concepts: Geophysical
Press, 236-255.
Flinn, E. A., 1965, Signal analysis using rectilinearity and direction of particle
motion: Proc. IEEE, 53, 1874-1876.
Freire, S. L. M., and Ulrych, T. J., 1988, Application of singular value decomposition to vertical seismic profiling: Geophysics, 53, 778-785.
Jackson, G. J., Mason, I. M., and Greenhalgh, S. A., 1991, Principal component transforms of triaxial recordings by singular value decomposition:
Geophysics, 56, 528-533.
Jurkevics, A., 1988, Polarization analysis of three-component array data, Bull.,
Seis. Soc. Am., 78, 1725-1743.
Kanasewich, E. R., 1981, Time sequence analysis in geophysics: Univ. of
Alberta Press.
Kennett, B. L. N., 1991, The removal of free surface interactions from
three-component seismograms: Geophys. J. Internat., 104, 153-163.
Magotra, N., Ahmed, N., and Chael, E., 1987, Seismic event detection and
source location using single-station (three-component) data: Bull.,
Seis. Soc. Am., 77, 958-971.
Means, J. D., 1972, The use of the three dimensional covariance matrix in
analyzing the polarization properties of plane waves. J. Geophys. Res.,
77, 5551-5559.
Montalbetti, J. F., and Kanasewich, E. R., 1970, Enhancement of teleseismic
body phases with a polarization filter: Geophys. J. Roy. Astr. Soc., 21,
119-129.
Pant, D. R., 1989, Physical seismic modelling: Studies in ground roll suppression, reflector resolution and multi-component wavefield separation:
Ph.D. thesis, The Flinders University of South Australia.
Pant, D. R., and Greenhalgh, S. A., 1990, A multi-component offset VSP
scale model investigation: Geoexploration, 26, 191-212.
Vidale, J. E., 1986, Complex polarization analysis of particle motion: Bull.,
Seis. Soc. Am., 76, 1393-1405.
Chapter 15
Parameterization of Narrowband Rayleigh
and Love Waves Arriving at a Triaxial Array
R. Lynn Kirlin, John Nabelek, and Guibiao Lin
15.1 Introduction
It is desirable to separate and parameterize Rayleigh and Love wavefronts
arriving at a three-component seismometer. We show that methods of modern
array signal processing and parameter estimation will accomplish this task. An
overall covariance matrix of vectors having elements representing the traces
from each component of all seismometers yields the necessary information.
We assume that the only vertical-axis response is due to the Rayleigh component rv, and that the Love
wave components LE and LN appear only on the east and north axes, respectively.
The relative powers of the three Rayleigh components and the two Love
components are constant across the array, although the phase relationships
between seismometers will, of course, vary. That is to say, the 5 × 5 covariance
matrix of the five components (rv, re, rn, LE, LN) at a single seismometer is not
a function of seismometer location, but the 5 × 5 cross-covariance matrix of
the five-component vector of one seismometer with the five-component vector of another seismometer has complex values whose phases
vary according to the relative locations of the two seismometers. After formulating a method of estimating the powers of each component, we show that we
can use all seismometers' contributions coherently to determine the possibly
different azimuths and horizontal velocities of the Rayleigh and Love waves.
15.2 Background
The MUSIC (multiple signal classification) algorithm is a maximum likelihood estimate of directions of arrival of simultaneous narrowband plane
waves. However, it requires knowledge of the relative amplitudes of the arrivals at each sensor. Often equal amplitudes are assumed. Because this assumption does not hold at all axes of any seismometer for either the Rayleigh or
Love wave, we must estimate the amplitude relationships before we can estimate directions of arrival. We also assume that all signal source components
and all data are Gaussian and that sensor spacing is always less than one-half
wavelength. With these assumptions and an overall covariance matrix of all
sensors' data, we can estimate all parameters of interest.
We first discuss the component response method of Jepson and Kennett
(1990). They show that the pressure (P), shear vertical (Sv), and shear
horizontal (Sh) components of a source signal arriving at a surface-level, three-component seismometer from azimuth φ appear as Z, N, and E (vertical, north, and east) components according to
$$\begin{pmatrix} Z \\ N \\ E \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} w_{zp} & w_{zs} & 0 \\ w_{np} & w_{ns} & 0 \\ 0 & 0 & w_{eh} \end{pmatrix}
\begin{pmatrix} P \\ S_v \\ S_h \end{pmatrix}, \qquad (15.1)$$

with $w_{eh} = 2$,
where v_{p0} and v_{s0} are the pressure and shear surface velocities, respectively, q_{p0} and q_{s0} are the pressure and shear vertical slownesses, p is the horizontal slowness,
$$C_1 = \frac{2 v_{s0}^{-2}\left(v_{s0}^{-2} - 2p^2\right)}{\left(v_{s0}^{-2} - 2p^2\right)^2 + 4p^2 q_{s0} q_{p0}}$$

and

$$C_2 = \frac{4 v_{s0}^{-2}\, q_{s0} q_{p0}}{\left(v_{s0}^{-2} - 2p^2\right)^2 + 4p^2 q_{s0} q_{p0}}.$$
The source vector is $s = (L_E \;\; L_N \;\; r_v)^T$, with covariance

$$R_s = E\{ss^H\} =
\begin{pmatrix}
\sigma_{L_E}^2 & \sigma_{L_E L_N} & 0 \\
\sigma_{L_E L_N}^* & \sigma_{L_N}^2 & 0 \\
0 & 0 & \sigma_{r_v}^2
\end{pmatrix}. \qquad (15.2)$$
We allow that the two Love components may be less than fully correlated,
but that all three Rayleigh components will be complex scalar multiples
of rv. Later we revert to a further simplification in which the Love components are
simply related by a scalar multiplier.
Now let there be M sensors having three axial components each. These are
all placed in a data vector x = (x1 x2...x3M)T, where
$$\begin{aligned}
x_i &= \alpha r_{vi} + L_{Ei} + n_i, & i &= 1, 4, 7, \ldots, 3M-2; \\
x_i &= \beta r_{vi} + L_{Ni} + n_i, & i &= 2, 5, 8, \ldots, 3M-1; \\
x_i &= r_{vi} + n_i, & i &= 3, 6, 9, \ldots, 3M,
\end{aligned} \qquad (15.3)$$
and n = (n1 n2 ... n3M)T is a vector of white Gaussian noise samples having covariance matrix σn²I. Then the data covariance matrix is

$$R_x = E\{xx^H\}. \qquad (15.4)$$
The signal part of $x_i$ is

$$x_i^s = \begin{pmatrix} \alpha r_v + L_E \\ \beta r_v + L_N \\ r_v \end{pmatrix} = K s, \qquad (15.5)$$
where

$$K = \begin{pmatrix} 1 & 0 & \alpha \\ 0 & 1 & \beta \\ 0 & 0 & 1 \end{pmatrix},
\qquad s = (L_E \;\; L_N \;\; r_v)^T. \qquad (15.6)$$
The covariance of $x_i$ is then

$$R_{x_i} = (K R_s K^H) + N = R_{x_i}^s + N, \qquad (15.7)$$
where the noise covariance matrix N = σn²I is the same at all sensors due to the assumption of spatial stationarity, and the covariance of the signal part of xi is
$$R_{x_i}^s = K R_s K^H =
\begin{pmatrix}
|\alpha|^2 \sigma_{r_v}^2 + \sigma_{L_E}^2 & \sigma_{L_E L_N} + \alpha\beta^* \sigma_{r_v}^2 & \alpha \sigma_{r_v}^2 \\
\left(\sigma_{L_E L_N} + \alpha\beta^* \sigma_{r_v}^2\right)^* & |\beta|^2 \sigma_{r_v}^2 + \sigma_{L_N}^2 & \beta \sigma_{r_v}^2 \\
\alpha^* \sigma_{r_v}^2 & \beta^* \sigma_{r_v}^2 & \sigma_{r_v}^2
\end{pmatrix}. \qquad (15.8)$$
Note that if we can estimate this matrix, we can easily identify all of its separate parameters: σ²rv, α, β, σ²LE, σ²LN, and σLELN. Having accomplished this, we can
construct the covariance matrix of any subset of the components (rv, rE, rN,
LE, LN). Of course, we assume that Rayleigh components are uncorrelated with
Love components.
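As a numerical check of the structure in equation (15.8), the sketch below builds K and R_s from hypothetical values of α, β, and the powers, forms K R_s K^H, and recovers every parameter from the matrix entries, as claimed in the text:

```python
import numpy as np

# Hypothetical parameter values (alpha, beta near 90 degrees out of phase: imaginary)
alpha, beta = 0.6j, -0.4j
s_rv2, s_LE2, s_LN2 = 2.0, 1.0, 0.5
s_LELN = 0.3 + 0.1j                      # partial correlation of the Love components

K = np.array([[1, 0, alpha],
              [0, 1, beta],
              [0, 0, 1]], dtype=complex)             # equation (15.6)
Rs = np.array([[s_LE2, s_LELN, 0],
               [np.conj(s_LELN), s_LN2, 0],
               [0, 0, s_rv2]], dtype=complex)        # equation (15.2)
Rxs = K @ Rs @ K.conj().T                            # equation (15.8)

# Every parameter is identifiable from entries of Rxs:
s_rv2_hat  = Rxs[2, 2].real
alpha_hat  = Rxs[0, 2] / Rxs[2, 2]
beta_hat   = Rxs[1, 2] / Rxs[2, 2]
s_LE2_hat  = (Rxs[0, 0] - abs(alpha_hat) ** 2 * s_rv2_hat).real
s_LN2_hat  = (Rxs[1, 1] - abs(beta_hat) ** 2 * s_rv2_hat).real
s_LELN_hat = Rxs[0, 1] - alpha_hat * np.conj(beta_hat) * s_rv2_hat
```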
Using a derivation from estimation theory, the conditional estimate of the
reference sensor's components x1 = (x11, x21, x31)T, given any other sensor's
components xi = (x1i, x2i, x3i)T, is the conditional mean (Scharf, 1991,
chapter 7)

$$\hat{x}_{1|x_i} = E\{x_1 \mid x_i\} = R_{1i} R_{ii}^{-1} x_i, \qquad (15.9)$$

which has covariance

$$R_{\hat{x}_{1|x_i}} = R_{1i} R_{ii}^{-1} R_{1i}^H. \qquad (15.10)$$
We note that the covariance $R_{\hat{x}_{1|x_i}}$ is an estimate of the signal-part covariance $R_{x_i}^s$. This is because the noise at sensor i does not correlate with either
the signal or the noise at sensor 1. In fact, the covariance R1i is given by
$$R_{1i} = E\{x_1 x_i^H\} = K R_s G_i^H, \qquad (15.11)$$
where Gi is the transfer matrix for the signal vector to the signal component of
xi, or
$$x_i = G_i s + n_i. \qquad (15.12)$$
Substituting,

$$\begin{aligned}
R_{\hat{x}_{1|x_i}} &= K R_s G_i^H R_{ii}^{-1} G_i R_s K^H \\
&= K R_s G_i^H \left(G_i R_s G_i^H + \sigma_n^2 I\right)^{-1} G_i R_s K^H \\
&\approx K R_s K^H = R_x^s.
\end{aligned} \qquad (15.13)$$

Averaging the estimates obtained from all other sensors gives

$$\hat{R}_{x_i}^s = \frac{1}{M-1} \sum_{\substack{j=1 \\ j \neq i}}^{M} R_{ij} R_{jj}^{-1} R_{ij}^H. \qquad (15.14)$$
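The noise-suppression property behind these equations can be sketched with population covariances: because R1i is noise-free, the product R1i Rii⁻¹ R1iᴴ comes out close to the signal-part covariance K Rs Kᴴ even though Rii itself contains the σn²I term. The transfer matrix Gi (a pure delay here) and all numbers are hypothetical:

```python
import numpy as np

alpha, beta = 0.5j, -0.3j
K = np.array([[1, 0, alpha], [0, 1, beta], [0, 0, 1]], dtype=complex)
Rs = np.diag([1.0, 0.8, 2.0]).astype(complex)   # uncorrelated sources for simplicity
G = np.exp(1j * 0.7) * K                        # hypothetical transfer to sensor i: a pure delay
sn2 = 0.01                                      # small noise power

Rxs = K @ Rs @ K.conj().T                       # signal-part covariance we want
R1i = K @ Rs @ G.conj().T                       # equation (15.11): noise-free cross-covariance
Rii = G @ Rs @ G.conj().T + sn2 * np.eye(3)     # single-sensor covariance, noise included
Rhat = R1i @ np.linalg.inv(Rii) @ R1i.conj().T  # equations (15.10)/(15.13)
```

Rhat approaches Rxs as the noise power shrinks, while Rii always carries the additive σn²I bias.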
Since it is known that the horizontal components of the Rayleigh wave are 90
degrees out of phase with the vertical, we use the estimates

$$\hat{\alpha} = \operatorname{imag}\!\left(\hat{R}_{x_i}^s(3,1) / \hat{R}_{x_i}^s(3,3)\right), \qquad
\hat{\beta} = \operatorname{imag}\!\left(\hat{R}_{x_i}^s(3,2) / \hat{R}_{x_i}^s(3,3)\right), \qquad (15.16a)$$
using the minimum norm or unconstrained least-squares error [see section 9.2
or Scharf (1991, chapter 9)] estimate of s given x. Alternately, since K is nonsingular, we may easily write
$$R_s = E(ss^H) = K^{-1} R_x^s K^{-H}. \qquad (15.16b)$$
If s includes more than one Rayleigh and Love wave and we want to estimate their directions of arrival, we need to enlarge the reference sensor vector
to include more than one sensor. In that case, only sensor subsets with similar
geometric distribution can be averaged coherently. When all sensors are
spaced on a grid or equally spaced in line, the condition for geometric similarity of subarrays is satisfied. Although we address only one simultaneous
Rayleigh wave and one Love wave, more general treatments of the limitations of this sort of
problem are currently being researched.
Anderson and Nehorai (1994) have found the limiting number of wavefronts that can be resolved using data from one and two three-component geophones. Burgess and VanVeen (1994) have analyzed the generalized likelihood
test for detecting signals using subspace data from an arbitrary array of vector
sensors (multicomponent sensors). Also, Wax et al. (1996) have determined
the number of waveform parameters that can be estimated with an arbitrary
M-sensor array for signals in colored noise. They show that more than the
usual M − 1 wavefronts can be localized with their method. These
recent works represent the state of the art in this field at the time of our writing.
In the remainder of this chapter we assume that only one Rayleigh and
one Love wave are present.
At any FFT frequency between 0.12 Hz and 0.18 Hz, a vector (one vector
element from each trace) is produced, as in equation (15.3); but the samples
from the traces are the Fourier transform values at a single frequency rather
than time samples. Given the sample rate fs =1/s and the Hanning window,
the effective filter bandwidth for each transform value is a bit wider than the
2/1024 Hz that would be in effect without windowing. The Hanning window
suppresses leakage from neighboring frequencies (Bendat and Piersol, 1986,
chapter 11).
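The leakage suppression can be demonstrated directly for a 1024-point transform at 1 sample/s, as in the text; the off-bin test frequency below is hypothetical:

```python
import numpy as np

n = 1024
t = np.arange(n)                       # 1 sample/s, 1024-point transform
f0 = 0.1303                            # hypothetical off-bin frequency in the band
x = np.sin(2 * np.pi * f0 * t)

rect = np.abs(np.fft.rfft(x))
hann = np.abs(np.fft.rfft(np.hanning(n) * x))
rect /= rect.max()
hann /= hann.max()

freqs = np.fft.rfftfreq(n, d=1.0)
far = np.abs(freqs - f0) > 0.02        # bins well away from the peak
leak_rect = rect[far].max()            # leakage with no window (rectangular)
leak_hann = hann[far].max()            # leakage with the Hanning window
```

The Hanning window trades a slightly wider main lobe (hence the effective bandwidth a bit wider than 2/1024 Hz) for far lower leakage away from the peak.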
Operating on the data covariance matrix as suggested above in
equations (15.13), (15.14), and (15.15), we find the estimates of α and β versus frequency as
shown in Figure 15.2. Figure 15.3 gives the angles of the complex α and β
before the imaginary parts are taken. Note that the angles are very nearly 90
degrees, except that one of the angles changes dramatically above 0.155 Hz.
Figure 15.4 shows the Rayleigh vertical, east and north components versus
frequency. In Figure 15.5 are plotted total vertical, Rayleigh vertical, total east,
Love east, total north and Love north components. By the model, there is no
difference between total vertical (signal power) and Rayleigh vertical. The
Love components are approximately 20% to 50% of the respective total east
and north powers.
Let

$$x = (E_1 \;\; N_1 \;\; Z_1 \;\; \cdots \;\; E_M \;\; N_M \;\; Z_M)^T \qquad (15.17)$$

be the Fourier transform values of the east, north, and vertical (Z) components at
all M sensors. Here we redefine the source vector

$$s = (L \;\; r_v)^T \qquad (15.18)$$
to have elements that are the transforms of the sources' waveforms. This representation of the sources is different from that in equation (15.5). (We could have
used this definition before, letting LN = γLE, in which case γ rather than LE
would be estimated.) Now we can write
Figure 15.1.
$$x = A s + n, \qquad (15.19)$$
where A is the transfer coefficient matrix from the three components to all
traces and n is a vector of zero mean and spatially and temporally white Gaussian noise samples. A must have two columns, one for each signal arrival. We
write the first column for the Rayleigh wave and the second for the Love, giving
Figure 15.2.
Alpha and beta versus frequency using equation (15.15). Alpha and
beta are the Rayleigh east and Rayleigh north components, respectively
[see equation (15.5)].
Figure 15.3.
Figure 15.4.
Figure 15.5.
$$a_1 = \begin{pmatrix}
\alpha \\ \beta \\ 1 \\
\alpha e^{i\omega\tau_2^r} \\ \beta e^{i\omega\tau_2^r} \\ e^{i\omega\tau_2^r} \\
\vdots \\
\alpha e^{i\omega\tau_M^r} \\ \beta e^{i\omega\tau_M^r} \\ e^{i\omega\tau_M^r}
\end{pmatrix}, \qquad
a_2 = \begin{pmatrix}
L_E \\ L_N \\ 0 \\
L_E e^{i\omega\tau_2^L} \\ L_N e^{i\omega\tau_2^L} \\ 0 \\
\vdots \\
L_E e^{i\omega\tau_M^L} \\ L_N e^{i\omega\tau_M^L} \\ 0
\end{pmatrix}. \qquad (15.20)$$
These columns represent the delays and attenuations of the Rayleigh and
Love waves at the sensor and axis represented by each element of the vector x.
The delay τi at sensor i for one of the waves is a function of the azimuth of
arrival θ, the slowness p for that wave, and the relative x and y offsets of the
sensor from the reference sensor. For a sensor at (x, y) we have

$$\tau_i = (x \sin\theta + y \cos\theta)\, p. \qquad (15.21)$$
We have assumed that the first sensor is the reference sensor, so its components' relative delays are zero (unity exponential factor). Delay elements repeat
in triplicate because all three components of both the Rayleigh and Love
waves have equal relative delays, although Love has zero amplitude on the vertical axis.
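A sketch of building one column of A from equations (15.20) and (15.21); the grid geometry, frequency, axis gains, and slownesses below are hypothetical:

```python
import numpy as np

def delays(xy, azimuth, p):
    """Equation (15.21): tau_i = (x sin(theta) + y cos(theta)) p, relative to a
    reference sensor at the origin (zero delay)."""
    return (xy[:, 0] * np.sin(azimuth) + xy[:, 1] * np.cos(azimuth)) * p

def steering_vector(xy, azimuth, p, omega, axis_gains):
    """One column of A, equation (15.20): the three axis gains repeated per
    sensor, each triplet sharing that sensor's delay phase factor."""
    phases = np.exp(1j * omega * delays(xy, azimuth, p))   # one phase per sensor
    return np.kron(phases, axis_gains)                     # triplets share the delay

# Hypothetical 2 x 2 grid of triaxial sensors (km), reference sensor first
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
omega = 2 * np.pi * 0.13
a_rayleigh = steering_vector(xy, np.deg2rad(30), 0.30, omega,
                             np.array([0.5j, -0.3j, 1.0]))   # (alpha, beta, 1)
a_love = steering_vector(xy, np.deg2rad(30), 0.25, omega,
                         np.array([0.8, 0.6, 0.0]))          # (L_E, L_N, 0)
```

The reference sensor's triplet carries a unity phase factor, and the Love column has zeros on every vertical axis, as the text requires.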
Because the delay factors in any delay vector a are a function of frequency,
vectors x(ωi), sample vectors of the Fourier coefficients at different frequencies
ωi, will have different phases. The covariance method of analyzing direction of
arrival requires that many samples of the vector x(ωi) have the same average
relative phases. For purely stationary sources, many sequential Fourier transforms can produce these samples, and this is the approach we have taken.
The data covariance matrix is then

$$R_x = E\{xx^H\} = A R_s A^H + \sigma_n^2 I. \qquad (15.22)$$

Note that there are 3M elements in the data vector, and 3M is the size of the
covariance matrix.
We can find the vectors in A, which are modeled in equation (15.22) and are
functions of the directions of arrival and slownesses, by searching for each
through all possible values of the parameters p and θ, to as fine an increment as
is useful. This finds those parameter pairs which yield coefficient vectors a ideally orthogonal to the noise-space eigenvectors (Haykin, 1991). This is the
MUSIC method of Chapter 5. If Vn is the matrix whose columns comprise
the set of 3M − 2 noise-space eigenvectors and Vs is the matrix whose columns are the two signal-space eigenvectors, then, ideally, a trial delay vector a
is a solution if
$$J = a^H V_n V_n^H a = a^H (I - V_s V_s^H) a = 0. \qquad (15.23)$$
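This search can be sketched end to end for a hypothetical line of five triaxial sensors and two synthetic arrivals; the covariance is built directly from its model rather than from data, and all gains, slownesses, and azimuths are made up:

```python
import numpy as np

M, omega = 5, 2 * np.pi * 0.13
x_pos = np.arange(M) * 1.0                  # equally spaced line of triaxial sensors (km)

def steer(azimuth, p, gains):
    tau = x_pos * np.sin(azimuth) * p       # equation (15.21) with y = 0
    return np.kron(np.exp(1j * omega * tau), gains)

g_r = np.array([0.5j, -0.3j, 1.0])          # hypothetical Rayleigh axis gains (alpha, beta, 1)
g_l = np.array([0.8, 0.6, 0.0])             # hypothetical Love axis gains (no vertical)
a1 = steer(np.deg2rad(40.0), 0.30, g_r)     # Rayleigh from 40 degrees
a2 = steer(np.deg2rad(-25.0), 0.25, g_l)    # Love from -25 degrees

R = 2.0 * np.outer(a1, a1.conj()) + 1.0 * np.outer(a2, a2.conj()) + 0.01 * np.eye(3 * M)
w, V = np.linalg.eigh(R)                    # ascending eigenvalues
Vn = V[:, :-2]                              # the 3M - 2 noise-subspace eigenvectors

def J(azimuth, p, gains):                   # equation (15.23); ideally zero at a solution
    a = steer(azimuth, p, gains)
    return np.real(np.linalg.norm(Vn.conj().T @ a) ** 2)

az_grid = np.deg2rad(np.arange(-90.0, 91.0))
best_r = np.rad2deg(az_grid[np.argmax([1.0 / J(az, 0.30, g_r) for az in az_grid])])
best_l = np.rad2deg(az_grid[np.argmax([1.0 / J(az, 0.25, g_l) for az in az_grid])])
```

The inverse spectrum 1/J peaks sharply at the true azimuth for each wavetype when the correct axis gains and slowness are supplied.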
Usually the inverse J⁻¹ is searched for peaks, because true zeros of J are not
realized in practice. We have used both MUSIC and minimum variance distortionless response (see Chapter 5) and get similar results; however, MUSIC
can give more resolution. Because the actual number of arrivals is not known
Figure 15.6.
exactly, we tried ranks up through five to define the signal subspace. The
results at 0.13 Hz are shown in Figure 15.6. It appears that there are only two
strong point sources, but there may be other, less strong sources. This process
agreed reasonably with findings produced by other means.
Having found the p and θ parameters of two waves, we might proceed to
find the waveforms themselves, waveform parameters, and quality measures.
Waveform estimation is discussed in Chapter 9. Quality measures can be estimated through the eigenvalues of Rx. The two largest have a sum equal to M
times the total power in the signals, and the average of the 3M − 2 smaller
eigenvalues is an estimate of the noise on each sensor component. For example, if only one signal were present, the first eigenvalue λ1 would represent the
total variation of the data which has the delays implied by the elements of the
associated eigenvector. This variation also includes noise in that dimension,
however. Thus λ1 = Mσs² + σn², where the average powers of signal and
noise over the time window analyzed are σs² and σn², respectively. Of course,
σs² includes P, S, and H component powers of the wave. When more than one
wave is present, the eigenvalues do not correspond one to one with wavefront
powers.
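The single-wave relation λ1 = Mσs² + σn² can be checked numerically with a unit-amplitude delay vector; the delays and powers below are hypothetical:

```python
import numpy as np

M, ss2, sn2 = 8, 1.5, 0.2                      # sensors, signal power, noise power
tau = np.arange(M) * 0.37                      # arbitrary relative delays (s)
a = np.exp(1j * 2 * np.pi * 0.13 * tau)        # unit-amplitude delay vector, a^H a = M
R = ss2 * np.outer(a, a.conj()) + sn2 * np.eye(M)

lam = np.sort(np.linalg.eigvalsh(R))[::-1]     # descending eigenvalues
# lam[0] carries the coherent signal; the other M - 1 eigenvalues equal sn2
```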
We have already found the powers at the solution coefficient vectors a
through the elements of the matrix $R_{x_i}^s$ of equation (15.8), which we estimated via equation (15.14). A measure of coherence for these estimates is
given by the ratio of their powers,

$$\hat\sigma_{r_v}^2 \operatorname{Im}^2(\hat\alpha), \quad \hat\sigma_{r_v}^2 \operatorname{Im}^2(\hat\beta), \quad \hat\sigma_{r_v}^2, \quad \hat\sigma_{L_E}^2, \quad \hat\sigma_{L_N}^2,$$

to the estimate of the noise power,

$$\hat\sigma_n^2 = \frac{1}{3M-2} \sum_{i=3}^{3M} \lambda_i. \qquad (15.24)$$
15.7 Conclusions
We have formulated the received data such that elements of the signal
covariance matrix reveal the unknown parameters of the powers of both Rayleigh
and Love waves arriving at the sensor array. The signal covariance matrix is
estimated by a unique spatial averaging that tends to nullify the random
noise on the diagonal. We have demonstrated our method with real data, and the
results concur with theories on the source of the data. We find clear distinctions between the two wavetypes and a 90-degree phase difference in the Rayleigh
components, as it should be, over most of the frequency analysis band. Source
directions were established in a separate report and concur with other knowledge available.
15.8 References
Anderson, S., and Nehorai, A., 1994, Analysis of a polarized seismic wave
model: 8th IEEE Workshop on Statistical Signal and Array Processing,
281-284.
Bendat, J. A., and Piersol, A. G., 1986, Random data, analysis and measurement procedures, 2nd ed.: John Wiley & Sons, Inc.
Burgess, K., and VanVeen, B., 1994, Vector-sensor detection using a subspace
GLRT: 8th IEEE Workshop on Statistical Signal and Array Processing,
109-112.
Haykin, S., 1991, Adaptive filter theory: Prentice-Hall, Inc.
Jepson, D. C., and Kennett, B. L. N., 1990, Three-component analysis of
regional seismograms: Bull. Seis. Soc. Am., 80, 2032-2052.
Scharf, L. L., 1991, Statistical signal processing: Addison-Wesley Publ. Co.
Wax, M., Sheinvald, J., and Weiss, A., 1996, Detection and localization of
correlated and uncorrelated signals in colored noise via generalized least
squares: IEEE Trans. Sig. Proc., 44, 1734-1743.
INDEX
A
analysis region 5, 6
beamformer 156
bias 9
binocular 298
broadband 47, 83, 86, 87, 92, 95
Hampson's algorithm
comparison to subspace method 178,
179, 181
Hampson's multiple elimination method
170, 182
Hermitian 19, 165, 166, 167
high-resolution spectral estimators 58, 154
patterns 3, 185
suppression 186, 187, 206
range space 26
rank 20, 21, 24, 26, 27, 30, 38
low rank approximation 21, 24, 83,
104, 106
reduced rank 26, 29, 30, 63, 105
Rayleigh 291, 293, 323, 324
273
waveform 60, 103, 169, 227, 329
estimation 183
estimator 102
wavefront 28, 29, 44, 46, 57, 83, 115, 323
multiple model 83, 87, 92, 93, 136
Wishart distribution 10
spectral
analysis 1, 3, 51, 57, 59, 68, 77, 79,
80, 167
estimation 40, 41, 57, 140, 154, 167
estimators 58, 61, 64, 65, 67, 71, 72,
74, 77, 154
spectrum 51, 53, 57, 253, 254, 255
discrete power 52
static correction 241, 252, 265
stationarity 37, 142, 326
stationary 37
subarray 43, 141, 142, 329
subarrays 146
subspace 35, 39, 42, 83, 98, 113, 170,
173, 192, 234, 329
and eigenstructure 38, 41
estimators 83, 172
examples, signal 44
noise 35, 38, 43, 171, 189
perturbation 41
signal 35, 38, 39, 42, 44, 83, 93, 94,
98, 110, 113, 147, 158,
159, 168, 170, 189, 235
statistics of components 42
vector 35
time 125
time delay 84, 91, 109, 123, 305
estimation 115, 118, 125
perturbation 122
time gate 84, 101, 156
time window 156, 230, 275, 277, 292
Toeplitz 12, 13, 19, 57, 165
constraints 13, 164
triaxial
array 323, 329
covariance matrix 276
data 275, 278, 288, 291, 297, 321