Inversion Linear Case

Training Course on
Joint Inversions in Geophysics
Barcelonnette, June 15-19, 2015
Michel Dietrich
Objectives of this lecture
Introduce useful concepts of parameter estimation
Provide recipes to solve linear inverse problems
Give simple examples
Two useful books on the subject:
Geophysical Data Analysis: Discrete Inverse Theory (Revised Edition)
William Menke (1989), Academic Press
Parameter Estimation and Inverse Problems (Second Edition)
Richard C. Aster, Brian Borchers & Clifford H. Thurber (2013), Academic Press
MD
Joint Inversions in Geophysics

15--19 June 2015
Outline
Introduction to discrete linear systems
Vector norms
Matrix norms
Conditioning of a linear system
Classification of linear inverse problems
Solutions based on norm minimization
Overdetermined problems
Underdetermined problems
Mixed-determined problems
Damped least squares solutions
Other a priori information
Properties of generalized inverses: data and model resolution matrices
Summary

Mathematical representation and terminology

We are interested in the relationships between physical (or chemical, economic, ...) model parameters $\mathbf{m}$ and a set of data $\mathbf{d}$.
We assume a good knowledge of the laws governing the investigated phenomena (underlying physics), in the form of a function $G$ such that
$$ \mathbf{d} = G(\mathbf{m}) $$
In the mathematical model $\mathbf{d} = G(\mathbf{m})$, the forward modeling operator $G$ can be defined as
  a linear or nonlinear system of algebraic equations
  the solution of an ODE or PDE
Forward problem: find $\mathbf{d}$ given $\mathbf{m}$
Inverse problem: find $\mathbf{m}$ given $\mathbf{d}$
Model identification problem: find $G$ knowing some values of $\mathbf{d}$ and $\mathbf{m}$

Forward and inverse problems
[Diagram: model parameters (physical properties, unknowns) on one side, predicted or measured data on the other.]
Forward problem: model parameters -> forward modeling -> predicted data
Inverse problem: observed data -> inverse modeling -> parameter estimation

Continuous and discrete inverse problem (1)

Quite often, our goal is to determine a finite number $M$ of model parameters:
  physical quantities, e.g. distributions of densities, temperatures, seismic velocities
  coefficients entering functional relationships describing the mathematical model.
In many cases also, we have a finite number $N$ of data points.
In such situations, the model parameters and data set can be expressed as vectors and we will write
$$ \mathbf{d} = G(\mathbf{m}) $$
where $\mathbf{d}$ is an $N$-element vector and $\mathbf{m}$ an $M$-element vector.
In this case, finding $\mathbf{m}$ given $\mathbf{d}$ is a discrete inverse problem.
In the other cases, when the model parameters and data are functions of continuous variables (time or space), we must solve a continuous inverse problem.

Continuous and discrete inverse problem (2)

In continuous inverse problems,
$$ d(x) = \int G(x, \xi)\, m(\xi)\, d\xi $$
and $G(x, \xi)$ is called the data kernel.
When $G(x, \xi)$ can be written in the form $G(x - \xi)$, the integral representation above becomes a convolution integral and the inverse problem can be solved via a deconvolution.
The theory of continuous inverse problems relies on functional analysis and is more abstract than the theory of discrete inverse problems.
This is compensated for by gains in computation time, since the computations can be partly carried out in symbolic form.
Functional analysis also lends itself to a better physical interpretation of the operations needed to solve the inverse problem.
Discrete inverse formulations are also applicable to properly sampled continuous problems, and therefore represent a vast domain of applications.

Example of convolution integral
Example taken from Aster et al. (2013): inversion of a vertical gravity anomaly $d(x)$, observed at some height $h$, to estimate an unknown line mass density distribution relative to a background model, $m(x) = \Delta\rho(x)$:
$$ d(x) = \Gamma \int_{-\infty}^{\infty} \frac{h}{\left[(\xi - x)^2 + h^2\right]^{3/2}}\, m(\xi)\, d\xi = \int_{-\infty}^{\infty} g(x - \xi)\, m(\xi)\, d\xi $$
where $\Gamma$ is Newton's gravitational constant.
Here, because the kernel is a smooth function, $d(x)$ will be a smoothed and scaled transformation of $m(x)$.
The solution of the inverse problem, $m(x)$, will be a roughened transformation of $d(x)$.
Noise in the data can seriously affect the solution of the inverse problem.

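Classification of inverse problems (1)

We focus on discrete inverse problems: model parameters and data are represented by vectors
$$ \mathbf{d} = [d_1\ d_2\ d_3\ \ldots\ d_N]^T \quad \text{and} \quad \mathbf{m} = [m_1\ m_2\ m_3\ \ldots\ m_M]^T $$
respectively.
Nonlinear implicit equations: series of $L$ equations
$$ f_1(\mathbf{d}, \mathbf{m}) = 0, \quad f_2(\mathbf{d}, \mathbf{m}) = 0, \quad \ldots, \quad f_L(\mathbf{d}, \mathbf{m}) = 0 $$
or, in matrix form,
$$ \mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} $$
Linear implicit equations
The former matrix equation simplifies to
$$ \mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{F} \begin{pmatrix} \mathbf{d} \\ \mathbf{m} \end{pmatrix} $$
where $\mathbf{F}$ is a matrix of dimensions $L \times (M+N)$.
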
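Classification of inverse problems (2)

Nonlinear explicit equations: when data and model parameters can be separated, $L = N$ equations can be written in matrix form
$$ \mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{d} - \mathbf{g}(\mathbf{m}) $$
where $\mathbf{g}$ is a nonlinear vector operator.
Linear explicit equations: if $\mathbf{g}$ is a linear operator, the general equation is written
$$ \mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{d} - \mathbf{G}\mathbf{m} $$
where $\mathbf{G}$ is an $N \times M$ matrix. Matrix $\mathbf{F}$ defined above can then be partitioned in the form
$$ \mathbf{F} = \begin{pmatrix} \mathbf{I} & -\mathbf{G} \end{pmatrix} $$
so that $\mathbf{F}\,[\mathbf{d};\ \mathbf{m}] = \mathbf{d} - \mathbf{G}\mathbf{m}$.
In the following, we concentrate on the linear explicit equations $\mathbf{d} = \mathbf{G}\mathbf{m}$.
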
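Discrete linear systems

Simplest mathematical formulation. Uses linear algebra tools.
Obey superposition and scaling laws:
$$ \mathbf{G}(\mathbf{m}_1 + \mathbf{m}_2) = \mathbf{G}\mathbf{m}_1 + \mathbf{G}\mathbf{m}_2 $$
$$ \mathbf{G}(\alpha\mathbf{m}) = \alpha\,\mathbf{G}\mathbf{m} $$
Broad range of applications: seemingly nonlinear problems can be cast in a linear form (see next examples).
Mathematical linearity is associated with physical linearity (straight rays, ...)
Can be used as local approximations for (weakly) nonlinear problems, via a Taylor series expansion around $\mathbf{m}^{(n)}$:
$$ \mathbf{d} = \mathbf{g}(\mathbf{m}) \approx \mathbf{g}(\mathbf{m}^{(n)}) + \nabla\mathbf{g}(\mathbf{m})\big|_{\mathbf{m} = \mathbf{m}^{(n)}} \left(\mathbf{m} - \mathbf{m}^{(n)}\right) = \mathbf{g}(\mathbf{m}^{(n)}) + \mathbf{G}_n\, \Delta\mathbf{m}_{n+1}\ ; \quad n = 0, 1, 2, \ldots $$
where
$$ \mathbf{G}_n = \nabla\mathbf{g}(\mathbf{m})\big|_{\mathbf{m} = \mathbf{m}^{(n)}}\ ; \quad [\mathbf{G}_n]_{ij} = \frac{\partial g_i}{\partial m_j}\bigg|_{\mathbf{m}^{(n)}}\ ; \quad \Delta\mathbf{m}_{n+1} = \mathbf{m} - \mathbf{m}^{(n)} $$
$$ \Delta\mathbf{d} = \mathbf{d} - \mathbf{g}(\mathbf{m}^{(n)}) = \mathbf{G}_n\, \Delta\mathbf{m}_{n+1} $$
Invert for $\Delta\mathbf{m}_{n+1}$ (starting from $\mathbf{m}^{(0)}$).
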
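Linear regression
Problem of fitting a function to a data set. The function is defined by a series of parameters.
When the problem can be solved as a linear inverse problem, it is referred to as a linear regression.
Example: ballistic trajectory
$$ z(t) = m_1 + m_2 t - \tfrac{1}{2} m_3 t^2 $$
Quadratic in time $t$, but linear with respect to the $m_i$:
$$ \begin{pmatrix} 1 & t_1 & -\tfrac{1}{2} t_1^2 \\ 1 & t_2 & -\tfrac{1}{2} t_2^2 \\ 1 & t_3 & -\tfrac{1}{2} t_3^2 \\ \vdots & \vdots & \vdots \\ 1 & t_N & -\tfrac{1}{2} t_N^2 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ m_3 \end{pmatrix} = \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_N \end{pmatrix} $$
[Figure (credit: Space Archive)]
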
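The ballistic example can be sketched numerically as follows. This is a minimal illustration, not part of the original lecture: the parameter values, the observation times and all variable names are assumptions chosen for the example, and the fit uses NumPy's least squares routine (the least squares solution itself is derived later in these notes).

```python
import numpy as np

# Hypothetical "true" parameters: initial height, initial velocity, gravity.
m_true = np.array([10.0, 100.0, 9.8])

# Hypothetical observation times (N = 10 data points).
t = np.linspace(1.0, 10.0, 10)

# Each row of G is [1, t_i, -t_i^2 / 2], as in the matrix equation above.
G = np.column_stack([np.ones_like(t), t, -0.5 * t**2])

# Noise-free synthetic data, so the fit should recover m_true.
z = G @ m_true

# Least squares fit of the three parameters.
m_est, residuals, rank, sing_vals = np.linalg.lstsq(G, z, rcond=None)
print("estimated parameters:", m_est)
```

Although the trajectory is quadratic in time, the unknowns enter linearly, which is why a single linear solve is enough.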
Tomography
Deals with path-integrated properties:
  Travel times of acoustic, seismic, EM waves
  Attenuation of waves, of X-rays, of muons
In seismic traveltime tomography, the problem is nonlinear if it is expressed in terms of wave velocities $v$.
It is linearized by considering the wave slowness $u$:
$$ T = \int_s \frac{ds}{v(s)} = \int_s u(s)\, ds $$
In case of slowness perturbations,
$$ \Delta T = T^{obs} - T^{pred} = \int_s \Delta u(s)\, ds $$
If the medium is discretized into blocks,
$$ \Delta T_i = \sum_j l_{ij}\, \Delta u_j $$

Tomographic reconstruction via backprojection
[Figure (credit: Mo Dai, Master 1 EGID, U. Bordeaux 3)]

Vector norms (1)

One way of solving the linear inverse problem is to measure the "length" of some vectors.
This is related to the problem of minimizing a misfit function, which will be addressed later on in this course.
For example, the linear regression problem is solved by the so-called least squares method, in which one tries to minimize the overall error
$$ E = \sum_{i=1}^{N} e_i^2 \quad \text{where} \quad e_i = d_i^{obs} - d_i^{pre}\ ; \quad \mathbf{d}^{pre} = \mathbf{G}\mathbf{m}^{est} $$
$$ E = \mathbf{e}^T \mathbf{e} \quad \text{(squared Euclidean length of vector } \mathbf{e}\text{)} $$
The least squares method uses the L2 (or Euclidean) norm, which is defined, for a vector $\mathbf{v}$, by
$$ \|\mathbf{v}\|_2 = \Big[\sum_i |v_i|^2\Big]^{1/2} = (\mathbf{v}, \mathbf{v})^{1/2} $$

Vector norms (2)

Other norms can be used, such as the L1 norm
$$ \|\mathbf{v}\|_1 = \sum_i |v_i| $$
which is a particular case of the Lp norm
$$ \|\mathbf{v}\|_p = \Big[\sum_i |v_i|^p\Big]^{1/p}, \quad p \geq 1 $$
The higher norms give the largest element of vector $\mathbf{v}$ a larger weight.
[Figure: data set containing an outlier]
The limiting case $p \to \infty$ is the L∞ norm
$$ \|\mathbf{v}\|_\infty = \max_i |v_i| $$
which selects the vector element with the largest absolute value as the measure of length.

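The effect of the norm order on an outlier can be illustrated with a short sketch. The error vector below is a made-up example (its last entry plays the role of an outlier), and `lp_norm` is a hypothetical helper written for this illustration.

```python
import numpy as np

# Hypothetical prediction errors; the last entry acts as an outlier.
e = np.array([0.1, -0.2, 0.1, 3.0])

l1 = np.linalg.norm(e, 1)         # sum of absolute values
l2 = np.linalg.norm(e, 2)         # Euclidean length
linf = np.linalg.norm(e, np.inf)  # largest absolute value


def lp_norm(v, p):
    """Lp norm computed directly from its definition (p >= 1)."""
    return np.sum(np.abs(v) ** p) ** (1.0 / p)


# As p grows, the Lp norm is increasingly dominated by the outlier
# and tends to the L-infinity norm.
for p in (1, 2, 4, 10, 50):
    print(p, lp_norm(e, p))
print("L-infinity:", linf)
```

With a low-order norm the small errors still contribute noticeably to the total, whereas a high-order norm essentially measures the outlier alone.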
Vector norms (3)

Why is the L2 norm so frequently used? Which norm should we use?
1. Computations are simpler with the L2 norm than with the L1 norm.
2. The choice depends on the importance one chooses to give to outliers, i.e., data that fall far from the mean trend:
   If the data are very accurate with only a few outliers, we may want to be sensitive to these anomalous values. In this case, we would use a high-order norm.
   On the contrary, if the data scatter widely around the trend, then the large prediction errors do not carry a special significance. In such cases, a low-order norm would be used because it gives a more balanced weight to errors of different sizes.
3. Similar arguments could be developed by considering a probabilistic approach to the inverse problem. Let us just point out that using the L2 norm implicitly assumes that the data errors obey Gaussian statistics. Gaussians are rather short-tailed distributions (their tails decay rapidly), which implies very few scattered points.

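Matrix norms (1)

A vector-induced matrix norm is defined as
$$ \|\mathbf{A}\| = \max_{\mathbf{v} \neq \mathbf{0}} \frac{\|\mathbf{A}\mathbf{v}\|}{\|\mathbf{v}\|} \quad \text{or} \quad \|\mathbf{A}\| = \max_{\|\mathbf{v}\| = 1} \|\mathbf{A}\mathbf{v}\| $$
Therefore:
$$ \|\mathbf{A}\mathbf{v}\| \leq \|\mathbf{A}\|\, \|\mathbf{v}\| \quad \text{and} \quad \|\mathbf{I}\| = 1 $$
The L1, L2 and L∞ norms thus correspond to
$$ \|\mathbf{A}\|_1 = \max_{\mathbf{v} \neq \mathbf{0}} \frac{\|\mathbf{A}\mathbf{v}\|_1}{\|\mathbf{v}\|_1} = \max_j \sum_i |a_{ij}| = \max_j \|\mathbf{c}_j\|_1 $$
where $\mathbf{c}_j$ is the $j$th column of matrix $\mathbf{A}$;
$$ \|\mathbf{A}\|_2 = \max_{\mathbf{v} \neq \mathbf{0}} \frac{\|\mathbf{A}\mathbf{v}\|_2}{\|\mathbf{v}\|_2} = \sqrt{\rho(\mathbf{A}^*\mathbf{A})} = \sqrt{\rho(\mathbf{A}\mathbf{A}^*)} = \|\mathbf{A}^*\|_2 $$
where $\rho(\mathbf{K})$ is the spectral radius of matrix $\mathbf{K}$ and $\mathbf{A}^*$ is the adjoint of matrix $\mathbf{A}$;
$$ \|\mathbf{A}\|_\infty = \max_{\mathbf{v} \neq \mathbf{0}} \frac{\|\mathbf{A}\mathbf{v}\|_\infty}{\|\mathbf{v}\|_\infty} = \max_i \sum_j |a_{ij}| = \max_i \|\mathbf{r}_i\|_1 $$
where $\mathbf{r}_i$ is the $i$th row of matrix $\mathbf{A}$.
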
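Matrix norms (2)

The L1 norm of matrix $\mathbf{A}$ is the largest L1 norm of the columns of the matrix.
The L∞ norm of matrix $\mathbf{A}$ is the largest L1 norm of the rows of the matrix.
Both are easily calculated from the elements of matrix $\mathbf{A}$.
The L2 norm of matrix $\mathbf{A}$ requires more computations. Let us give a few practical reminders on matrix calculation.
Transpose: $(\mathbf{A}^T)_{ij} = a_{ji}$
Adjoint: $(\mathbf{A}^*)_{ij} = \overline{a_{ji}}$
Singular values: $\mu_i(\mathbf{A}) = \sqrt{\lambda_i(\mathbf{A}^T\mathbf{A})} = \sqrt{\lambda_i(\mathbf{A}\mathbf{A}^T)}$
For square matrices:
Trace: $\mathrm{tr}(\mathbf{A}) = \sum_{i=1}^{N} a_{ii}$
Eigenvalue / eigenvector problem: $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$
Characteristic polynomial: $\det(\mathbf{A} - \lambda\mathbf{I}_N) = 0$
Spectral radius: $\rho(\mathbf{A}) = \max_{1 \leq i \leq N} |\lambda_i(\mathbf{A})|$
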
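Matrix norms (3)

The L2 norm of matrix $\mathbf{A}$ is the square root of the largest eigenvalue of matrix $\mathbf{A}\mathbf{A}^*$ or matrix $\mathbf{A}^*\mathbf{A}$.
It is the largest singular value of matrix $\mathbf{A}$.
If matrix $\mathbf{A}$ is Hermitian (or self-adjoint), $\mathbf{A} = \mathbf{A}^*$, or symmetric, $\mathbf{A} = \mathbf{A}^T$, its L2 norm is the spectral radius of matrix $\mathbf{A}$:
$$ \|\mathbf{A}\|_2 = \rho(\mathbf{A}) $$
For any induced norm, $\|\mathbf{A}\| \geq \rho(\mathbf{A})$.
The Frobenius norm is not vector-induced but can easily be computed:
$$ \|\mathbf{A}\|_F = \Big[\sum_{i=1}^{N} \sum_{j=1}^{M} |a_{ij}|^2\Big]^{1/2} \quad \text{or} \quad \|\mathbf{A}\|_F = \left[\mathrm{tr}(\mathbf{A}^*\mathbf{A})\right]^{1/2} $$
It is an effective way to bound the L2 norm of a matrix, since
$$ \|\mathbf{A}\|_2 \leq \|\mathbf{A}\|_F \leq \sqrt{N}\, \|\mathbf{A}\|_2 $$
for an $N \times N$ matrix.
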
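The relations between these matrix norms can be checked numerically. The sketch below uses an arbitrary 3 × 3 matrix chosen for illustration (it does not come from the lecture) and NumPy's `numpy.linalg.norm`.

```python
import numpy as np

# Arbitrary test matrix (an assumption made for this illustration).
A = np.array([[1.0, -2.0, 3.0],
              [4.0, 0.0, -1.0],
              [2.0, 5.0, 1.0]])
N = A.shape[0]

norm_1 = np.linalg.norm(A, 1)         # induced L1 norm
norm_inf = np.linalg.norm(A, np.inf)  # induced L-infinity norm
norm_2 = np.linalg.norm(A, 2)         # induced L2 (spectral) norm
norm_f = np.linalg.norm(A, "fro")     # Frobenius norm

# Column and row sums of absolute values.
col_sums = np.abs(A).sum(axis=0)
row_sums = np.abs(A).sum(axis=1)

# Singular values, and eigenvalues of A^T A.
sing_vals = np.linalg.svd(A, compute_uv=False)
eig_ata = np.linalg.eigvalsh(A.T @ A)

print("L1 norm:", norm_1, "= largest column sum", col_sums.max())
print("Linf norm:", norm_inf, "= largest row sum", row_sums.max())
print("L2 norm:", norm_2, "= largest singular value", sing_vals[0])
print("Frobenius norm:", norm_f)
```

The Frobenius norm only needs the matrix entries, which is why it is convenient as an upper bound on the L2 norm when the singular values are not available.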
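Conditioning of a linear system (1)

Let's consider the linear system $\mathbf{A}\mathbf{u} = \mathbf{b}$ (taken from Ciarlet, 1994)
$$ \begin{pmatrix} 10 & 7 & 8 & 7 \\ 7 & 5 & 6 & 5 \\ 8 & 6 & 10 & 9 \\ 7 & 5 & 9 & 10 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = \begin{pmatrix} 32 \\ 23 \\ 33 \\ 31 \end{pmatrix} \quad \text{whose solution is} \quad \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, $$
and the perturbed system
$$ \begin{pmatrix} 10 & 7 & 8 & 7 \\ 7 & 5 & 6 & 5 \\ 8 & 6 & 10 & 9 \\ 7 & 5 & 9 & 10 \end{pmatrix} \begin{pmatrix} u_1 + \delta u_1 \\ u_2 + \delta u_2 \\ u_3 + \delta u_3 \\ u_4 + \delta u_4 \end{pmatrix} = \begin{pmatrix} 32.1 \\ 22.9 \\ 33.1 \\ 30.9 \end{pmatrix} \quad \text{whose solution is} \quad \begin{pmatrix} 9.2 \\ -12.6 \\ 4.5 \\ -1.1 \end{pmatrix}. $$
A very weak relative perturbation of the right-hand side, $\|\delta\mathbf{b}\| / \|\mathbf{b}\| = 0.2 / 60 \approx 0.0033$, induces a large relative error in the solution, $\|\delta\mathbf{u}\| / \|\mathbf{u}\| = 16.4 / 2 = 8.2$, that is, an amplification of the relative errors by a factor of about 2461.
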

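Conditioning of a linear system (2)

Let's also consider a perturbed system in which we slightly modify the elements of matrix $\mathbf{A}$:
$$ \begin{pmatrix} 10 & 7 & 8.1 & 7.2 \\ 7.08 & 5.04 & 6 & 5 \\ 8 & 5.98 & 9.89 & 9 \\ 6.99 & 4.99 & 9 & 9.98 \end{pmatrix} \begin{pmatrix} u_1 + \delta u_1 \\ u_2 + \delta u_2 \\ u_3 + \delta u_3 \\ u_4 + \delta u_4 \end{pmatrix} = \begin{pmatrix} 32 \\ 23 \\ 33 \\ 31 \end{pmatrix} \quad \text{whose solution is} \quad \begin{pmatrix} -81 \\ 137 \\ -34 \\ 22 \end{pmatrix} $$
(exact values, as can be checked by substitution).
These error amplifications may be surprising in this example, considering the good appearance of the original matrix $\mathbf{A}$, which is symmetric and full; its determinant is equal to 1, and its inverse
$$ \mathbf{A}^{-1} = \begin{pmatrix} 25 & -41 & 10 & -6 \\ -41 & 68 & -17 & 10 \\ 10 & -17 & 5 & -3 \\ -6 & 10 & -3 & 2 \end{pmatrix} $$
doesn't show anything special.
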
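The two perturbation experiments on this 4 × 4 system can be reproduced with a few lines of NumPy. This is a sketch for illustration; the variable names are assumptions, and the matrix entries are those of the example taken from Ciarlet.

```python
import numpy as np

A = np.array([[10.0, 7.0, 8.0, 7.0],
              [7.0, 5.0, 6.0, 5.0],
              [8.0, 6.0, 10.0, 9.0],
              [7.0, 5.0, 9.0, 10.0]])
b = np.array([32.0, 23.0, 33.0, 31.0])

# Unperturbed solution.
u = np.linalg.solve(A, b)

# Case 1: small perturbation of the right-hand side.
db = np.array([0.1, -0.1, 0.1, -0.1])
u_rhs = np.linalg.solve(A, b + db)

# Case 2: small perturbation of the matrix elements.
A_pert = np.array([[10.0, 7.0, 8.1, 7.2],
                   [7.08, 5.04, 6.0, 5.0],
                   [8.0, 5.98, 9.89, 9.0],
                   [6.99, 4.99, 9.0, 9.98]])
u_mat = np.linalg.solve(A_pert, b)

# Amplification of the relative error in case 1 (L2 norms).
rel_err_b = np.linalg.norm(db) / np.linalg.norm(b)
rel_err_u = np.linalg.norm(u_rhs - u) / np.linalg.norm(u)
amplification = rel_err_u / rel_err_b

print("u            :", u)
print("u (pert. b)  :", u_rhs)
print("u (pert. A)  :", u_mat)
print("amplification:", amplification)
```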

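These behaviors can be analyzed by considering the norms of matrices $\mathbf{A}$ and $\mathbf{A}^{-1}$.
In the first case, we compare the solutions $\mathbf{u}$ and $\mathbf{u} + \delta\mathbf{u}$ of the systems
$$ \mathbf{A}\mathbf{u} = \mathbf{b}, \qquad \mathbf{A}(\mathbf{u} + \delta\mathbf{u}) = \mathbf{b} + \delta\mathbf{b} \quad \Rightarrow \quad \mathbf{A}\,\delta\mathbf{u} = \delta\mathbf{b} $$
For any vector norm and its induced matrix norm, we infer from $\mathbf{b} = \mathbf{A}\mathbf{u}$ and $\delta\mathbf{u} = \mathbf{A}^{-1}\delta\mathbf{b}$ that
$$ \|\mathbf{b}\| \leq \|\mathbf{A}\|\, \|\mathbf{u}\| \quad \text{and} \quad \|\delta\mathbf{u}\| \leq \|\mathbf{A}^{-1}\|\, \|\delta\mathbf{b}\| $$
The relative error on the result is therefore bounded by the quantity
$$ \frac{\|\delta\mathbf{u}\|}{\|\mathbf{u}\|} \leq \|\mathbf{A}\|\, \|\mathbf{A}^{-1}\|\, \frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|} $$
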

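In the second case, we compare the solutions $\mathbf{u}$ and $\mathbf{u} + \delta\mathbf{u}$ of the systems
$$ \mathbf{A}\mathbf{u} = \mathbf{b}, \qquad (\mathbf{A} + \delta\mathbf{A})(\mathbf{u} + \delta\mathbf{u}) = \mathbf{b} \quad \Rightarrow \quad \mathbf{A}\,\delta\mathbf{u} = -\delta\mathbf{A}\,(\mathbf{u} + \delta\mathbf{u}) $$
We infer from the last equality that
$$ \delta\mathbf{u} = -\mathbf{A}^{-1}\, \delta\mathbf{A}\, (\mathbf{u} + \delta\mathbf{u}) $$
that is,
$$ \frac{\|\delta\mathbf{u}\|}{\|\mathbf{u} + \delta\mathbf{u}\|} \leq \|\mathbf{A}^{-1}\|\, \|\delta\mathbf{A}\| = \|\mathbf{A}\|\, \|\mathbf{A}^{-1}\|\, \frac{\|\delta\mathbf{A}\|}{\|\mathbf{A}\|} $$
For small perturbations $\delta\mathbf{A}$, $\|\delta\mathbf{u}\| / \|\mathbf{u}\|$ is a good approximation of $\|\delta\mathbf{u}\| / \|\mathbf{u} + \delta\mathbf{u}\|$, and therefore
$$ \frac{\|\delta\mathbf{u}\|}{\|\mathbf{u}\|} \lesssim \|\mathbf{A}\|\, \|\mathbf{A}^{-1}\|\, \frac{\|\delta\mathbf{A}\|}{\|\mathbf{A}\|} $$
In both cases, the relative error on the result is bounded by the relative error of the modified quantities multiplied by
$$ \mathrm{cond}(\mathbf{A}) = \|\mathbf{A}\|\, \|\mathbf{A}^{-1}\| $$
which is the condition number of matrix $\mathbf{A}$.
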

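The condition number measures the sensitivity of the solution $\mathbf{u}$ of system $\mathbf{A}\mathbf{u} = \mathbf{b}$ with respect to variations in data $\mathbf{b}$ or in elements of matrix $\mathbf{A}$.
A linear system is well-conditioned if its condition number is small; it is ill-conditioned if its condition number is large.
Properties
$\mathrm{cond}(\mathbf{A}) \geq 1$ (since $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$, $1 = \|\mathbf{I}\| \leq \|\mathbf{A}\|\, \|\mathbf{A}^{-1}\|$)
$\mathrm{cond}(\mathbf{A}) = \mathrm{cond}(\mathbf{A}^{-1})$
$\mathrm{cond}(\alpha\mathbf{A}) = \mathrm{cond}(\mathbf{A})$ for any scalar $\alpha \neq 0$
$\mathrm{cond}_2(\mathbf{A}) = \dfrac{\max_i \mu_i(\mathbf{A})}{\min_i \mu_i(\mathbf{A})}$, where $\mu_i(\mathbf{A})$ denote the nonzero singular values of matrix $\mathbf{A}$
$\mathrm{cond}_2(\mathbf{A}) = \dfrac{\max_i |\lambda_i(\mathbf{A})|}{\min_i |\lambda_i(\mathbf{A})|}$ if $\mathbf{A}$ is a normal matrix ($\mathbf{A}\mathbf{A}^* = \mathbf{A}^*\mathbf{A}$), where $\lambda_i(\mathbf{A})$ denote the nonzero eigenvalues of matrix $\mathbf{A}$
$\mathrm{cond}_2(\mathbf{A}) = 1$ if $\mathbf{A}$ is a unitary matrix ($\mathbf{A}\mathbf{A}^* = \mathbf{A}^*\mathbf{A} = \mathbf{I}$)
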

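Finally, $\mathrm{cond}_2(\mathbf{A})$ is invariant by unitary transformation: if $\mathbf{U}\mathbf{U}^* = \mathbf{U}^*\mathbf{U} = \mathbf{I}$, then
$$ \mathrm{cond}_2(\mathbf{A}) = \mathrm{cond}_2(\mathbf{U}\mathbf{A}) = \mathrm{cond}_2(\mathbf{A}\mathbf{U}) = \mathrm{cond}_2(\mathbf{U}^*\mathbf{A}\mathbf{U}) $$
Numerical analysis of the previous example
The eigenvalues of the symmetric (positive definite) matrix
$$ \mathbf{A} = \begin{pmatrix} 10 & 7 & 8 & 7 \\ 7 & 5 & 6 & 5 \\ 8 & 6 & 10 & 9 \\ 7 & 5 & 9 & 10 \end{pmatrix} $$
are equal to its singular values:
$$ \lambda_1 \approx 30.2887, \quad \lambda_2 \approx 3.858, \quad \lambda_3 \approx 0.8431, \quad \lambda_4 \approx 0.01015 $$
Using the L2 norm,
$$ \mathrm{cond}_2(\mathbf{A}) = \frac{\lambda_1}{\lambda_4} \approx 2984.108\ ; \quad \frac{\|\delta\mathbf{u}\|_2}{\|\mathbf{u}\|_2} = \frac{16.397}{2}\ ; \quad \frac{\|\delta\mathbf{b}\|_2}{\|\mathbf{b}\|_2} = \frac{0.2}{60.025} $$
so that the condition
$$ \frac{\|\delta\mathbf{u}\|_2}{\|\mathbf{u}\|_2} \leq \mathrm{cond}_2(\mathbf{A})\, \frac{\|\delta\mathbf{b}\|_2}{\|\mathbf{b}\|_2} $$
becomes $8.199 < 9.942$.
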
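The numbers quoted in this analysis can be recomputed as follows. This is an illustrative sketch with assumed variable names; it relies on `numpy.linalg.eigvalsh` (eigenvalues of a symmetric matrix) and `numpy.linalg.cond`.

```python
import numpy as np

A = np.array([[10.0, 7.0, 8.0, 7.0],
              [7.0, 5.0, 6.0, 5.0],
              [8.0, 6.0, 10.0, 9.0],
              [7.0, 5.0, 9.0, 10.0]])
b = np.array([32.0, 23.0, 33.0, 31.0])
db = np.array([0.1, -0.1, 0.1, -0.1])

# Eigenvalues of the symmetric matrix, sorted in decreasing order.
eigvals = np.linalg.eigvalsh(A)[::-1]

# L2 condition number: ratio of extreme singular values.
cond_2 = np.linalg.cond(A, 2)

# Relative error of the solution versus its upper bound.
u = np.linalg.solve(A, b)
du = np.linalg.solve(A, b + db) - u
lhs = np.linalg.norm(du) / np.linalg.norm(u)
rhs = cond_2 * np.linalg.norm(db) / np.linalg.norm(b)

print("eigenvalues:", eigvals)
print("cond_2(A)  :", cond_2)
print("relative error", lhs, "bound", rhs)
```

In this example the bound is nearly attained: the perturbation of the right-hand side happens to point mostly along the direction that the inverse amplifies most.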

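The Frobenius norm is useful to bound the L2 norm of a matrix without knowing its singular values.
Here, for the symmetric matrix $\mathbf{A}$,
$$ \|\mathbf{A}\|_2 = \rho(\mathbf{A}) = 30.2887 \leq 30.5450 = \|\mathbf{A}\|_F $$
which verifies the property $\|\mathbf{A}\|_2 \leq \|\mathbf{A}\|_F \leq \sqrt{N}\, \|\mathbf{A}\|_2$, and
$$ \|\mathbf{A}^{-1}\|_2 = \rho(\mathbf{A}^{-1}) = \frac{1}{\lambda_4} = 98.5222 \leq 98.5292 = \|\mathbf{A}^{-1}\|_F $$
Therefore,
$$ \mathrm{cond}_2(\mathbf{A}) = \|\mathbf{A}\|_2\, \|\mathbf{A}^{-1}\|_2 = 2984 \leq 3009 = \|\mathbf{A}\|_F\, \|\mathbf{A}^{-1}\|_F = \mathrm{cond}_F(\mathbf{A}) $$
We also verify the inequality seen previously for a perturbation of the matrix,
$$ \frac{\|\delta\mathbf{u}\|}{\|\mathbf{u} + \delta\mathbf{u}\|} \leq \mathrm{cond}(\mathbf{A})\, \frac{\|\delta\mathbf{A}\|}{\|\mathbf{A}\|} $$
With $\mathbf{u} + \delta\mathbf{u} = (-81,\ 137,\ -34,\ 22)^T$, we have $\|\delta\mathbf{u}\|_2 \approx 163.970$ and $\|\mathbf{u} + \delta\mathbf{u}\|_2 \approx 164.225$; with $\|\delta\mathbf{A}\|_F = 0.266645$ and $\|\mathbf{A}\|_F = 30.5450$, this leads to $0.998 < 26.267$.
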
Classification of linear inverse problems (1)

When solving the linear inverse problem $\mathbf{d} = \mathbf{G}\mathbf{m}$, several cases must be distinguished according to the quantity of information contained in the equation $\mathbf{d} = \mathbf{G}\mathbf{m}$.
This information depends on the number $N$ of data and the number $M$ of model parameters, but it also depends on the structure of matrix $\mathbf{G}$, which can be sparse or full.
If $N > M$, there are more data than unknowns, i.e., too much information to exactly solve the inverse problem.
This corresponds to an overdetermined problem.
Curve (or surface) fitting procedures are typical overdetermined problems.
If $N < M$, there are more unknowns than data: we do not have enough information to determine all model parameters.
This corresponds to an underdetermined problem.

Classification of linear inverse problems (2)

In reality, we often must deal with mixed-determined problems, which are partly overdetermined and partly underdetermined.
This can happen even if $N > M$.
This situation is typical of tomographic experiments when the medium is divided into blocks: some blocks are crossed by many rays whereas others are not crossed by any rays.
Each of the situations described above must be solved in an appropriate manner.
The solution for mixed-determined problems applies to all situations but is not necessarily optimal.

Solutions based on norm minimization

Minimization of the prediction error in overdetermined problems.
Minimization of the norm of the estimated solution in underdetermined problems.
Combination of the two approaches in mixed-determined problems.

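Overdetermined problems (1)
Consider the prediction error $\mathbf{e} = \mathbf{d} - \mathbf{d}^{pre}$ between observations $\mathbf{d}$ and data $\mathbf{d}^{pre} = \mathbf{G}\mathbf{m}$ predicted from the model $\mathbf{m}$ we seek to find.
Minimization of
$$ E = \mathbf{e}^T\mathbf{e} = (\mathbf{d} - \mathbf{G}\mathbf{m})^T (\mathbf{d} - \mathbf{G}\mathbf{m}) $$
by canceling the derivatives $\partial E / \partial m_q$ with respect to parameter $m_q$.
Explicitly,
$$ E = \sum_{i=1}^{N} \Big[d_i - \sum_{j=1}^{M} G_{ij} m_j\Big] \Big[d_i - \sum_{j=1}^{M} G_{ij} m_j\Big] $$
$$ \frac{\partial E}{\partial m_q} = \sum_{i=1}^{N} \Big[ -G_{iq} \Big(d_i - \sum_{j=1}^{M} G_{ij} m_j\Big) - \Big(d_i - \sum_{j=1}^{M} G_{ij} m_j\Big) G_{iq} \Big] = -2 \sum_{i=1}^{N} G_{iq} d_i + 2 \sum_{i=1}^{N} \sum_{j=1}^{M} G_{iq} G_{ij} m_j $$
$$ \frac{\partial E}{\partial m_q} = 0 \quad \Rightarrow \quad \sum_{i=1}^{N} G_{iq} d_i - \sum_{j=1}^{M} \sum_{i=1}^{N} G_{iq} G_{ij} m_j = 0 $$
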
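Overdetermined problems (2)
The last equation can be written in matrix form as
$$ \mathbf{G}^T\mathbf{d} - \mathbf{G}^T\mathbf{G}\mathbf{m} = \mathbf{0}, $$
where
$$ \mathbf{G}^T\mathbf{G} = \begin{pmatrix} G_{11} & G_{21} & \cdots & G_{N1} \\ G_{12} & \ddots & & \vdots \\ \vdots & & & \\ G_{1M} & \cdots & & G_{NM} \end{pmatrix} \begin{pmatrix} G_{11} & G_{12} & \cdots & G_{1M} \\ G_{21} & \ddots & & \vdots \\ \vdots & & & \\ G_{N1} & \cdots & & G_{NM} \end{pmatrix} $$
is an $M \times M$ square matrix.
If the inverse of $\mathbf{G}^T\mathbf{G}$ exists, then the estimated model is given by the least squares solution
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d} $$
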
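Tomographic reconstruction of a 3 × 3 model (1)

3 × 3 model:
$$ \mathbf{M} = \begin{pmatrix} -1 & 1 & 2 \\ 2 & 8 & -2 \\ 1 & -1 & 3 \end{pmatrix} $$
[Figure: 16-ray scan of the model, with the ray sums written at the end of each ray]
Note: it is assumed that all ray segments have unit length.

Cell numbering adopted: cells are numbered 1 to 9, row by row, so that
$$ \mathbf{m} = (-1,\ 1,\ 2,\ 2,\ 8,\ -2,\ 1,\ -1,\ 3)^T $$
Matrix of ray paths (other expressions are also possible):
G =
  1 1 1 0 0 0 0 0 0    3 horizontal rays (from top to bottom)
  0 0 0 1 1 1 0 0 0
  0 0 0 0 0 0 1 1 1
  1 0 0 1 0 0 1 0 0    3 vertical rays (from left to right)
  0 1 0 0 1 0 0 1 0
  0 0 1 0 0 1 0 0 1
  0 0 1 0 0 0 0 0 0    5 oblique rays "NW-SE" (from upper right corner to lower left corner)
  0 1 0 0 0 1 0 0 0
  1 0 0 0 1 0 0 0 1
  0 0 0 1 0 0 0 1 0
  0 0 0 0 0 0 1 0 0
  1 0 0 0 0 0 0 0 0    5 oblique rays "NE-SW" (from upper left corner to lower right corner)
  0 1 0 1 0 0 0 0 0
  0 0 1 0 1 0 1 0 0
  0 0 0 0 0 1 0 1 0
  0 0 0 0 0 0 0 0 1
Data vector:
$$ \mathbf{d} = \mathbf{G}\mathbf{m} = (2,\ 8,\ 3,\ 2,\ 8,\ 3,\ 2,\ -1,\ 10,\ 1,\ 1,\ -1,\ 3,\ 11,\ -3,\ 3)^T $$

True model:
$$ \mathbf{M} = \begin{pmatrix} -1 & 1 & 2 \\ 2 & 8 & -2 \\ 1 & -1 & 3 \end{pmatrix} $$
Back-projection, $\hat{\mathbf{m}} = \mathbf{G}^T\mathbf{d}$:
$$ \hat{\mathbf{M}} = \begin{pmatrix} 13 & 12 & 18 \\ 14 & 37 & 7 \\ 17 & 9 & 19 \end{pmatrix} $$
Filtered back-projection, $\tilde{\mathbf{m}} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \hat{\mathbf{m}}$:
$$ \tilde{\mathbf{M}} = \begin{pmatrix} -1 & 1 & 2 \\ 2 & 8 & -2 \\ 1 & -1 & 3 \end{pmatrix} $$
(Maple computations and graphics)
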
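The 3 × 3 tomographic example can be reproduced with the sketch below. It is an illustration under stated assumptions: cells are numbered row by row, the two oblique ray families are generated from the cell indices, and the signs of the model values are chosen so that the back-projection reproduces the values 13, 12, 18, 14, 37, 7, 17, 9, 19 quoted in the slide. NumPy's `lstsq` is used in place of an explicit inverse of G^T G.

```python
import numpy as np

n = 3
# Model values (signs chosen to match the quoted back-projection).
M_true = np.array([[-1.0, 1.0, 2.0],
                   [2.0, 8.0, -2.0],
                   [1.0, -1.0, 3.0]])
m_true = M_true.ravel()  # cells numbered row by row

# Row index i and column index j of each of the 9 cells.
i, j = np.divmod(np.arange(n * n), n)

rays = []
rays += [i == r for r in range(n)]                   # 3 horizontal rays
rays += [j == c for c in range(n)]                   # 3 vertical rays
rays += [j - i == k for k in range(n - 1, -n, -1)]   # 5 oblique rays
rays += [i + j == k for k in range(2 * n - 1)]       # 5 oblique rays
G = np.array(rays, dtype=float)                      # 16 x 9, unit segment lengths

d = G @ m_true                                       # synthetic data

# Simple back-projection: smear each datum back along its ray.
M_back = (G.T @ d).reshape(n, n)

# "Filtered" back-projection, i.e. the least squares solution.
m_ls, *_ = np.linalg.lstsq(G, d, rcond=None)
M_ls = m_ls.reshape(n, n)

print("back-projection:\n", M_back)
print("least squares:\n", np.round(M_ls, 6))
```

The plain back-projection only gives a blurred image of the model, whereas the least squares solution recovers it exactly here because the 16 rays constrain all 9 cells.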
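Underdetermined problems (1)
When the number of unknowns, $M$, exceeds the number of data, $N$, the problem admits an infinity of solutions.
To obtain a solution, one has to provide some a priori information (e.g., physical constraints) to reduce the number of solutions.
We can also find a solution minimizing the norm of the model while imposing at the same time a zero prediction error:
$$ L = \mathbf{m}^T\mathbf{m} = \sum_{j=1}^{M} m_j^2 \ \text{ minimum}, \quad \text{with} \quad \mathbf{e} = \mathbf{d} - \mathbf{G}\mathbf{m} = \mathbf{0} $$
This problem can be solved with Lagrange multipliers, which we first illustrate with the minimization of a function of two variables $E(x, y)$ subject to a constraint $\Phi(x, y) = 0$ defined implicitly.
Minimize $E + \lambda\Phi$:
$$ dE + \lambda\, d\Phi = \Big(\frac{\partial E}{\partial x} + \lambda \frac{\partial \Phi}{\partial x}\Big) dx + \Big(\frac{\partial E}{\partial y} + \lambda \frac{\partial \Phi}{\partial y}\Big) dy = 0 $$
where $\lambda$ is the Lagrange multiplier.
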
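Since $dx$ and $dy$ are not independent, we have to consider the 3 equations
$$ \frac{\partial E}{\partial x} + \lambda \frac{\partial \Phi}{\partial x} = 0\ ; \quad \frac{\partial E}{\partial y} + \lambda \frac{\partial \Phi}{\partial y} = 0\ ; \quad \Phi(x, y) = 0 $$
that will allow us to determine the 3 unknowns: the values of $x$ and $y$ at the minimum of function $E$, and $\lambda$ (not needed).
When we have $M$ unknowns in a vector $\mathbf{m}$ and $N$ constraints $\Phi_i(\mathbf{m}) = 0$, we introduce a Lagrange multiplier for each constraint.
We then have to solve $M+N$ simultaneous equations for $M+N$ unknowns.
In our underdetermined problem, we minimize the function
$$ \Phi(\mathbf{m}) = L + \sum_{i=1}^{N} \lambda_i e_i = \sum_{j=1}^{M} m_j^2 + \sum_{i=1}^{N} \lambda_i \Big[d_i - \sum_{j=1}^{M} G_{ij} m_j\Big] $$
with respect to variables $m_q$, $q = 1, \ldots, M$.
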
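The differentiation gives
$$ \frac{\partial \Phi}{\partial m_q} = \sum_{j=1}^{M} 2 m_j \frac{\partial m_j}{\partial m_q} - \sum_{i=1}^{N} \lambda_i \sum_{j=1}^{M} G_{ij} \frac{\partial m_j}{\partial m_q} = 2 m_q - \sum_{i=1}^{N} \lambda_i G_{iq} = 0 $$
or, in matrix form,
$$ 2\mathbf{m} - \mathbf{G}^T\boldsymbol{\lambda} = \mathbf{0}, $$
an equation that must be solved with the constraint $\mathbf{e} = \mathbf{d} - \mathbf{G}\mathbf{m} = \mathbf{0}$, which implies
$$ \mathbf{d} = \mathbf{G}\mathbf{m} = \tfrac{1}{2}\, \mathbf{G}\mathbf{G}^T \boldsymbol{\lambda} $$
$\mathbf{G}\mathbf{G}^T$ is an $N \times N$ square matrix. If it is invertible, then $\boldsymbol{\lambda} = 2 \left[\mathbf{G}\mathbf{G}^T\right]^{-1} \mathbf{d}$ and, by using the first matrix equation, we obtain the minimum length solution
$$ \mathbf{m}^{est} = \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T\right]^{-1} \mathbf{d} $$
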
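A small numerical sketch of the minimum length solution follows. The 2 × 3 system is a made-up example (two data, three unknowns), and the null-space vector used for comparison is specific to this example.

```python
import numpy as np

# Hypothetical underdetermined problem: N = 2 data, M = 3 unknowns.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
d = np.array([2.0, 4.0])

# Minimum length solution m = G^T (G G^T)^{-1} d.
m_ml = G.T @ np.linalg.solve(G @ G.T, d)

# Any vector of the null space of G can be added without changing the fit.
null_vec = np.array([1.0, -1.0, 1.0])  # G @ null_vec = 0 for this G
m_other = m_ml + 0.5 * null_vec

print("minimum length solution:", m_ml, "length^2 =", m_ml @ m_ml)
print("another exact solution :", m_other, "length^2 =", m_other @ m_other)
```

Both models reproduce the data exactly; the minimum length solution is simply the one with no component along the null space, hence the shortest.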
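Mixed-determined problems (1)
[Figure: (a) overdetermined; (b) underdetermined; (c) 2 parameters, 3 data, but only 1 piece of information]
Example taken from Menke (1984).
We can only determine the average properties of the two cells in case (c).
$$ \begin{cases} G_{11} m_1 + G_{12} m_2 = d_1 \\ G_{21} m_1 + G_{22} m_2 = d_2 \\ G_{31} m_1 + G_{32} m_2 = d_3 \end{cases} $$
By introducing
$$ m'_1 = \frac{m_1 + m_2}{2}, \qquad m'_2 = \frac{m_1 - m_2}{2}, $$
then
$$ m_1 = m'_1 + m'_2, \qquad m_2 = m'_1 - m'_2 $$
and
$$ \begin{cases} (G_{11} + G_{12})\, m'_1 = d_1 \\ (G_{21} + G_{22})\, m'_1 = d_2 \\ (G_{31} + G_{32})\, m'_1 = d_3 \end{cases} $$
since $G_{11} = G_{12}$; $G_{21} = G_{22}$; $G_{31} = G_{32}$.
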
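This shows that parameter $m'_1$ is overdetermined whereas parameter $m'_2$ is underdetermined. This suggests a partitioning of the equations into overdetermined and underdetermined parts by forming linear combinations of the initial parameters:
$$ \begin{pmatrix} \mathbf{G}_{over} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}_{under} \end{pmatrix} \begin{pmatrix} \mathbf{m}_{over} \\ \mathbf{m}_{under} \end{pmatrix} = \begin{pmatrix} \mathbf{d}_{over} \\ \mathbf{d}_{under} \end{pmatrix} $$
The new problem $\mathbf{d}' = \mathbf{G}'\mathbf{m}'$ can be solved by using the least squares solution for the overdetermined part and the minimum length solution for the underdetermined part, i.e., we minimize the data prediction error and introduce only minimal a priori information.
For this, we write equation $\mathbf{d} = \mathbf{G}\mathbf{m}$ in the form
$$ \mathbf{d}_r + \mathbf{d}_0 = \mathbf{G}\,(\mathbf{m}_r + \mathbf{m}_0), $$
where $\mathbf{m}_0$ and $\mathbf{d}_0$ belong to the model null space and data null space, respectively: $\mathbf{G}\mathbf{m}_0 = \mathbf{0}$ and $\mathbf{d}_0^T \mathbf{G}\mathbf{m} = 0$.
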
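With this decomposition, the prediction error $E$ and the norm of the solution $L$ are written:
$$ E = (\mathbf{d} - \mathbf{G}\mathbf{m})^T (\mathbf{d} - \mathbf{G}\mathbf{m}) = (\mathbf{d}_r - \mathbf{G}\mathbf{m}_r)^T (\mathbf{d}_r - \mathbf{G}\mathbf{m}_r) + \mathbf{d}_0^T \mathbf{d}_0 $$
as $\mathbf{G}\mathbf{m}_0 = \mathbf{0}$ and $\mathbf{d}_0^T \mathbf{d}_r = \mathbf{d}_r^T \mathbf{d}_0 = 0$, and
$$ L = \mathbf{m}^T\mathbf{m} = \mathbf{m}_r^T \mathbf{m}_r + \mathbf{m}_0^T \mathbf{m}_0 $$
since $\mathbf{m}_r^T \mathbf{m}_0 = \mathbf{m}_0^T \mathbf{m}_r = 0$.
Returning to the fact that we want to minimize the data prediction error and introduce only minimal a priori information, we therefore impose
$$ E_r = (\mathbf{d}_r - \mathbf{G}\mathbf{m}_r)^T (\mathbf{d}_r - \mathbf{G}\mathbf{m}_r) = 0 $$
(the error on $\mathbf{d}_0$ cannot be reduced), and we may choose $\mathbf{m}_0 = \mathbf{0}$ to limit a priori information.
The vector subspaces to which $\mathbf{m}_r$, $\mathbf{m}_0$, $\mathbf{d}_r$ and $\mathbf{d}_0$ belong can be identified by a singular value decomposition (SVD) of matrix $\mathbf{G}$, which is written
$$ \mathbf{G} = \mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T $$
$\mathbf{U}_r$, $\boldsymbol{\Lambda}_r$ and $\mathbf{V}_r$ are matrices of dimensions $N \times r$, $r \times r$ and $M \times r$, respectively.
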
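$\boldsymbol{\Lambda}_r$ is a diagonal matrix containing the $r$ nonzero singular values of matrix $\mathbf{G}$.
$\mathbf{U}_r$ contains the eigenvectors associated with the nonzero eigenvalues of matrix $\mathbf{G}\mathbf{G}^T$.
$\mathbf{V}_r$ contains the eigenvectors associated with the nonzero eigenvalues of matrix $\mathbf{G}^T\mathbf{G}$.
We introduce $\mathbf{U}_0$ and $\mathbf{V}_0$ as the contributions of the zero singular values of $\mathbf{G}$.
With these definitions, $\mathbf{m}_r$, $\mathbf{m}_0$, $\mathbf{d}_r$ and $\mathbf{d}_0$ belong to the subspaces spanned by $\mathbf{V}_r$, $\mathbf{V}_0$, $\mathbf{U}_r$ and $\mathbf{U}_0$, respectively.
Notes on the null spaces $\mathbf{U}_0$ and $\mathbf{V}_0$:
It is easy to verify that $\mathbf{U}_0^T \mathbf{G}\mathbf{m} = \mathbf{U}_0^T \mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T \mathbf{m} = \mathbf{0}$ because $\mathbf{U}_0 \perp \mathbf{U}_r$. This implies that the data cannot entirely be described by operator $\mathbf{G}$.
$\mathbf{V}_0$ is responsible for the nonuniqueness of the solutions because, for any $\mathbf{m}_0 \in \mathbf{V}_0$, $\mathbf{G}(\mathbf{m}_1 + \mathbf{m}_0) = \mathbf{G}\mathbf{m}_1$ since $\mathbf{G}\mathbf{m}_0 = \mathbf{0}$.
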
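By analogy with square matrices $\mathbf{G}$, for which the inverse operator is $\mathbf{G}^{-1}$, and since we have $\mathbf{G} = \mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T$, we define for our mixed-determined problem $\mathbf{d} = \mathbf{G}\mathbf{m}$ a generalized inverse
$$ \mathbf{G}^{-g} = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T $$
so that
$$ \mathbf{m}^{est} = \mathbf{G}^{-g}\mathbf{d} = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d} $$
We can easily verify that this solution has no component in the model null space $\mathbf{V}_0$:
$$ \mathbf{V}_0^T \mathbf{m}^{est} = \mathbf{V}_0^T \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d} = \mathbf{0} \quad \text{as} \quad \mathbf{V}_0 \perp \mathbf{V}_r $$
and that the error $\mathbf{e}$ has no components in the $\mathbf{U}_r$ subspace: since $\mathbf{U}_r^T \mathbf{U}_r = \mathbf{I}_r$ and $\mathbf{V}_r^T \mathbf{V}_r = \mathbf{I}_r$,
$$ \mathbf{U}_r^T \mathbf{e} = \mathbf{U}_r^T (\mathbf{d} - \mathbf{G}\mathbf{m}^{est}) = \mathbf{U}_r^T \mathbf{d} - \mathbf{U}_r^T \mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d} = \mathbf{U}_r^T \mathbf{d} - \mathbf{U}_r^T \mathbf{d} = \mathbf{0} $$
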
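In addition to being the natural solution for mixed-determined problems, the generalized inverse $\mathbf{G}^{-g}$ can be used in all situations described previously: over-, under- and exactly determined problems.
It contains all solutions derived before:
When $\mathbf{U}_0$ and $\mathbf{V}_0$ are of zero dimension, $r = N = M$ and $\mathbf{G}^{-g} = \mathbf{G}^{-1}$ (exact determination).
When $\mathbf{V}_0$ has zero dimension and $\mathbf{U}_0$ has nonzero dimension (overdetermination),
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d} = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-2} \mathbf{V}_r^T\, \mathbf{V}_r \boldsymbol{\Lambda}_r \mathbf{U}_r^T \mathbf{d} = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d} = \mathbf{G}^{-g}\mathbf{d} $$
When $\mathbf{V}_0$ has nonzero dimension and $\mathbf{U}_0$ has zero dimension (underdetermination),
$$ \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T\right]^{-1} = \mathbf{V}_r \boldsymbol{\Lambda}_r \mathbf{U}_r^T \left[\mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T \mathbf{V}_r \boldsymbol{\Lambda}_r \mathbf{U}_r^T\right]^{-1} = \mathbf{V}_r \boldsymbol{\Lambda}_r \mathbf{U}_r^T\, \mathbf{U}_r \boldsymbol{\Lambda}_r^{-2} \mathbf{U}_r^T = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T = \mathbf{G}^{-g} $$
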
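The construction of the generalized inverse from a truncated SVD can be sketched as follows. The 4 × 3 matrix is a made-up mixed-determined example: only the sum of the first two parameters is constrained (twice), while the third parameter is measured twice. The relative threshold used to decide which singular values count as nonzero is an assumption of this sketch.

```python
import numpy as np

# Hypothetical mixed-determined problem (N = 4, M = 3, rank 2).
G = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
d = np.array([1.0, 2.0, 3.0, 5.0])

U, s, Vt = np.linalg.svd(G, full_matrices=False)

# Keep only the r singular values regarded as nonzero.
r = int(np.sum(s > 1e-10 * s[0]))
U_r, s_r, V_r = U[:, :r], s[:r], Vt[:r].T

# Generalized inverse G^{-g} = V_r diag(1/s_r) U_r^T.
G_g = V_r @ np.diag(1.0 / s_r) @ U_r.T
m_est = G_g @ d

# A vector spanning the model null space of this particular G.
v_null = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)

print("rank r:", r)
print("m_est :", m_est)
print("component of m_est in the null space:", v_null @ m_est)
```

The estimate behaves as described above: the overdetermined combinations are fitted in the least squares sense (averages of the repeated data), and the undetermined difference between the first two parameters is set to zero.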
Weak underdetermination
In case of weak underdetermination, rather than partitioning vectors $\mathbf{m}$ and $\mathbf{d}$, we can minimize a combination of the data prediction error $E$ and the length of the solution $L$, i.e., we minimize a function
$$ \Phi(\mathbf{m}) = E + \varepsilon^2 L = \mathbf{e}^T\mathbf{e} + \varepsilon^2\, \mathbf{m}^T\mathbf{m} $$
with respect to the elements $m_q$ of the model $\mathbf{m}$ we seek to find.
Factor $\varepsilon^2$ determines the importance of length $L$ relative to error $E$ in the minimization of function $\Phi(\mathbf{m})$.
By solving this minimization problem explicitly, we end up with the damped least squares solution
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G} + \varepsilon^2 \mathbf{I}\right]^{-1} \mathbf{G}^T\mathbf{d} $$
The additional term $\varepsilon^2 \mathbf{I}$ regularizes matrix $\mathbf{G}^T\mathbf{G}$ and stabilizes its inverse, at the expense of the model resolution.

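The damped least squares solution can be tried on a small problem. In the sketch below, the ray geometry (a 2 × 2 medium crossed by two horizontal, two vertical and two diagonal rays of unit segment length) and the model values are assumptions made for illustration. The code also evaluates the equivalent form G^T [G G^T + ε² I_N]^{-1} d, which gives the same model.

```python
import numpy as np

# Assumed geometry: cells numbered 1 2 / 3 4, six rays.
G = np.array([[1.0, 1.0, 0.0, 0.0],   # top row
              [0.0, 0.0, 1.0, 1.0],   # bottom row
              [1.0, 0.0, 1.0, 0.0],   # left column
              [0.0, 1.0, 0.0, 1.0],   # right column
              [1.0, 0.0, 0.0, 1.0],   # main diagonal
              [0.0, 1.0, 1.0, 0.0]])  # anti-diagonal
m_true = np.array([2.0, 4.0, 5.0, 8.0])
d = G @ m_true
N, M = G.shape

eps = 0.1  # damping parameter (eps**2 multiplies the identity)

# Form based on the M x M matrix G^T G.
m_over = np.linalg.solve(G.T @ G + eps**2 * np.eye(M), G.T @ d)

# Equivalent form based on the N x N matrix G G^T.
m_under = G.T @ np.linalg.solve(G @ G.T + eps**2 * np.eye(N), d)

e = d - G @ m_over
E = e @ e            # data prediction error
L = m_over @ m_over  # solution length

print("damped solution:", m_over)
print("E =", E, " L =", L)
```

Compared with the undamped solution (E = 0, L = 109), the damping trades a small misfit for a slightly shorter model.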
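Other a priori information (1)
The criterion that was adopted (minimization of $L = \mathbf{m}^T\mathbf{m}$) is not always suitable. It can be generalized in several ways.
For instance, we can introduce a priori information on the model and consider minimizing
$$ L = (\mathbf{m} - \mathbf{m}_{priori})^T (\mathbf{m} - \mathbf{m}_{priori}) $$
In other cases, we will be looking for smooth solutions by introducing weighting factors in the form of an $M \times M$ matrix $\mathbf{W}_m$:
$$ L = \mathbf{m}^T \mathbf{W}_m \mathbf{m} $$
We may estimate the roughness of discrete model parameters via
$$ \mathbf{D}\mathbf{m} = \begin{pmatrix} -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_M \end{pmatrix} $$
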
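Matrix $\mathbf{D}$ is an approximate differentiation operator. Minimizing the roughness of vector $\mathbf{m}$ amounts to minimizing
$$ L = (\mathbf{D}\mathbf{m})^T (\mathbf{D}\mathbf{m}) = \mathbf{m}^T \mathbf{D}^T \mathbf{D}\, \mathbf{m} = \mathbf{m}^T \mathbf{W}_m \mathbf{m} $$
The off-diagonal terms of matrix $\mathbf{W}_m$ represent the interdependence of the model parameters. Matrix $\mathbf{W}_m$ can be designed to impose some relationship between the model parameters.
By combining the a priori information $\mathbf{m}_{priori}$ and the weighting matrix $\mathbf{W}_m$,
$$ L = (\mathbf{m} - \mathbf{m}_{priori})^T \mathbf{W}_m (\mathbf{m} - \mathbf{m}_{priori}) $$
We can similarly define an $N \times N$ weighting matrix $\mathbf{W}_d$ for the data, to favor the good data at the expense of the noisy data, and define a generalized prediction error
$$ E = \mathbf{e}^T \mathbf{W}_d\, \mathbf{e} $$
$\mathbf{W}_d$ is a diagonal matrix when there is no coupling between the data.
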
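The roughness operator and the associated weighting matrix are easy to build explicitly. The following sketch uses a made-up model vector; building D with `numpy.diff` applied to the identity matrix is one convenient choice among others.

```python
import numpy as np

M = 4
# First-difference operator: row i holds -1 at column i and +1 at column i+1.
D = np.diff(np.eye(M), axis=0)      # shape (M-1, M)

# Weighting matrix W_m = D^T D (symmetric, tridiagonal).
W_m = D.T @ D

m_flat = np.full(M, 3.0)                   # constant model: no roughness
m_rough = np.array([0.0, 1.0, 3.0, 6.0])   # hypothetical rough model

L_flat = m_flat @ W_m @ m_flat
L_rough = m_rough @ W_m @ m_rough

print("D =\n", D)
print("W_m =\n", W_m)
print("roughness of flat model :", L_flat)
print("roughness of rough model:", L_rough)
```

Because a constant model has zero roughness, W_m is singular; this measure of length penalizes only variations between neighboring parameters, not their overall level.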
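We thus obtain new solutions of the discrete linear inverse problem, which generalize the expressions obtained so far.
Overdetermined problems
Minimization of $E = \mathbf{e}^T \mathbf{W}_d\, \mathbf{e}$ leads to the weighted least squares solution
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T \mathbf{W}_d \mathbf{G}\right]^{-1} \mathbf{G}^T \mathbf{W}_d\, \mathbf{d} $$
Purely underdetermined problems
Minimization of $L = (\mathbf{m} - \mathbf{m}_{priori})^T \mathbf{W}_m (\mathbf{m} - \mathbf{m}_{priori})$ with constraint $E = 0$ leads to the weighted minimum length solution
$$ \mathbf{m}^{est} = \mathbf{m}_{priori} + \mathbf{W}_m^{-1} \mathbf{G}^T \left[\mathbf{G} \mathbf{W}_m^{-1} \mathbf{G}^T\right]^{-1} \left[\mathbf{d} - \mathbf{G}\mathbf{m}_{priori}\right] $$
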
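Weakly underdetermined problems
Minimization of the quantity
$$ \Phi(\mathbf{m}) = E + \varepsilon^2 L = \mathbf{e}^T \mathbf{W}_d\, \mathbf{e} + \varepsilon^2 (\mathbf{m} - \mathbf{m}_{priori})^T \mathbf{W}_m (\mathbf{m} - \mathbf{m}_{priori}) $$
leads to the damped and weighted least squares solution, which can be written
$$ \mathbf{m}^{est} = \mathbf{m}_{priori} + \left[\mathbf{G}^T \mathbf{W}_d \mathbf{G} + \varepsilon^2 \mathbf{W}_m\right]^{-1} \mathbf{G}^T \mathbf{W}_d \left[\mathbf{d} - \mathbf{G}\mathbf{m}_{priori}\right] $$
or, equivalently,
$$ \mathbf{m}^{est} = \mathbf{m}_{priori} + \mathbf{W}_m^{-1} \mathbf{G}^T \left[\mathbf{G} \mathbf{W}_m^{-1} \mathbf{G}^T + \varepsilon^2 \mathbf{W}_d^{-1}\right]^{-1} \left[\mathbf{d} - \mathbf{G}\mathbf{m}_{priori}\right] $$
Linear equality constraints: yet another class of a priori information
Linear combinations of model parameters can be expressed as $\mathbf{h} = \mathbf{F}\mathbf{m}$.
For example, if the average of the model parameters is equal to a constant, then
$$ \mathbf{F}\mathbf{m} = \frac{1}{M} \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_M \end{pmatrix} = h $$
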
Other example: in a curve fitting procedure, impose that the curve passes through a specified point.
One way to account for linear equality constraints is to combine the $p$ equations $\mathbf{h} = \mathbf{F}\mathbf{m}$ with the $N$ equations $\mathbf{d} = \mathbf{G}\mathbf{m}$ and to put strong weights on equations $\mathbf{h} = \mathbf{F}\mathbf{m}$.
This will impose a prediction error which is very small for the equations $\mathbf{h} = \mathbf{F}\mathbf{m}$, at the expense of the prediction error of equations $\mathbf{d} = \mathbf{G}\mathbf{m}$, which can be large.

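Another solution to this problem is to minimize the prediction error $E = \mathbf{e}^T\mathbf{e}$ with the $p$ constraints $\mathbf{F}\mathbf{m} - \mathbf{h} = \mathbf{0}$ by using the Lagrange multipliers technique once more.
We minimize the function
$$ \Phi(\mathbf{m}) = \sum_{i=1}^{N} \Big[\sum_{j=1}^{M} G_{ij} m_j - d_i\Big]^2 + 2 \sum_{i=1}^{p} \lambda_i \Big[\sum_{j=1}^{M} F_{ij} m_j - h_i\Big] $$
with respect to variables $m_q$, $q = 1, \ldots, M$.
In matrix form:
$$ \begin{pmatrix} \mathbf{G}^T\mathbf{G} & \mathbf{F}^T \\ \mathbf{F} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \mathbf{m} \\ \boldsymbol{\lambda} \end{pmatrix} = \begin{pmatrix} \mathbf{G}^T\mathbf{d} \\ \mathbf{h} \end{pmatrix} $$
Finally, the solution of the overdetermined problem $\mathbf{d} = \mathbf{G}\mathbf{m}$ with linear constraints $\mathbf{h} = \mathbf{F}\mathbf{m}$ is
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d} - \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{F}^T \left[\mathbf{F} \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{F}^T\right]^{-1} \left[\mathbf{F} \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d} - \mathbf{h}\right] $$
where the first term, $\left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d}$, is the solution without constraint.
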
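As an illustration of the constrained solution, the sketch below fits a straight line to four made-up data points while forcing the line through a prescribed point. The data, the constraint point and the variable names are assumptions of this example. The block system with Lagrange multipliers and the closed-form expression are both evaluated and give the same model.

```python
import numpy as np

# Hypothetical data for a straight-line fit d = m1 + m2 * x.
x = np.array([0.0, 1.0, 2.0, 3.0])
d = np.array([1.1, 1.9, 3.2, 3.8])
G = np.column_stack([np.ones_like(x), x])

# Constraint F m = h: the line must pass through the point (2, 2).
F = np.array([[1.0, 2.0]])
h = np.array([2.0])
p = F.shape[0]

# 1) Block system [[G^T G, F^T], [F, 0]] [m; lambda] = [G^T d; h].
K = np.block([[G.T @ G, F.T],
              [F, np.zeros((p, p))]])
rhs = np.concatenate([G.T @ d, h])
m_block = np.linalg.solve(K, rhs)[:G.shape[1]]

# 2) Closed form: unconstrained solution plus a correction term.
GtG_inv = np.linalg.inv(G.T @ G)
m_ls = GtG_inv @ G.T @ d
correction = GtG_inv @ F.T @ np.linalg.solve(F @ GtG_inv @ F.T, F @ m_ls - h)
m_closed = m_ls - correction

print("unconstrained:", m_ls)
print("constrained  :", m_block)
```

Unlike the strong-weights approach, the constraint is satisfied exactly here, at the price of a larger misfit on the data.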
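Properties of generalized inverses (1)
We have obtained model parameter estimates in various situations that we can write
$$ \mathbf{m}^{est} = \mathbf{G}^{-g}\, \mathbf{d}^{obs} $$
The inverse operator $\mathbf{G}^{-g}$ is not a matrix inverse in the classical sense (except in the exactly determined problem, where $\mathbf{G}^{-g} = \mathbf{G}^{-1}$). The matrix products $\mathbf{G}\mathbf{G}^{-g}$ and $\mathbf{G}^{-g}\mathbf{G}$ are generally not identity matrices.
Data predicted from the estimated models are obtained via
$$ \mathbf{d}^{pre} = \mathbf{G}\mathbf{m}^{est} = \mathbf{G}\left[\mathbf{G}^{-g} \mathbf{d}^{obs}\right] = \left[\mathbf{G}\mathbf{G}^{-g}\right] \mathbf{d}^{obs} = \mathbf{N}\, \mathbf{d}^{obs} $$
The $N \times N$ square matrix $\mathbf{N} = \mathbf{G}\mathbf{G}^{-g}$ is the data resolution matrix, which should ideally be the $\mathbf{I}_N$ identity matrix. In this case, the data prediction error would be zero.
The importance of off-diagonal terms can be evaluated by defining
$$ \mathrm{spread}(\mathbf{N}) = \|\mathbf{N} - \mathbf{I}_N\|_F^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} \left(n_{ij} - \delta_{ij}\right)^2 $$
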
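Properties of generalized inverses (2)
Similarly, we may wonder how close the estimated model $\mathbf{m}^{est}$ is to the true model $\mathbf{m}^{true}$, which is such that $\mathbf{G}\mathbf{m}^{true} = \mathbf{d}^{obs}$.
Therefore,
$$ \mathbf{m}^{est} = \mathbf{G}^{-g} \mathbf{d}^{obs} = \mathbf{G}^{-g}\left[\mathbf{G}\mathbf{m}^{true}\right] = \left[\mathbf{G}^{-g}\mathbf{G}\right] \mathbf{m}^{true} = \mathbf{R}\, \mathbf{m}^{true} $$
The $M \times M$ square matrix $\mathbf{R} = \mathbf{G}^{-g}\mathbf{G}$ is the model resolution matrix, which should ideally be the $\mathbf{I}_M$ identity matrix to uniquely determine each model parameter.
The importance of off-diagonal terms can be evaluated by defining
$$ \mathrm{spread}(\mathbf{R}) = \|\mathbf{R} - \mathbf{I}_M\|_F^2 = \sum_{i=1}^{M} \sum_{j=1}^{M} \left(r_{ij} - \delta_{ij}\right)^2 $$
The unit covariance matrix characterizes the degree of error amplification that occurs in the mapping from data to model parameters. If the data are all uncorrelated and have equal variance $\sigma^2$, the unit covariance matrix, written $\mathbf{C}_u$ here, is given by
$$ \mathbf{C}_u = \sigma^{-2}\, \mathbf{G}^{-g}\, \mathbf{C}_d\, \mathbf{G}^{-g\,T} = \mathbf{G}^{-g}\, \mathbf{G}^{-g\,T} $$
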
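The resolution matrices and the unit covariance matrix can be computed directly from a generalized inverse. The sketch below does so for the minimum length inverse of a made-up 2 × 3 problem; the variable names (`N_res`, `R_res`, `C_u`) are assumptions of this illustration.

```python
import numpy as np

# Hypothetical underdetermined problem: N = 2 data, M = 3 unknowns.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
n_data, n_model = G.shape

# Minimum length generalized inverse G^{-g} = G^T (G G^T)^{-1}.
G_g = G.T @ np.linalg.inv(G @ G.T)

N_res = G @ G_g      # data resolution matrix (n_data x n_data)
R_res = G_g @ G      # model resolution matrix (n_model x n_model)
C_u = G_g @ G_g.T    # unit covariance matrix

# Spread: squared Frobenius distance to the identity matrix.
spread_N = np.linalg.norm(N_res - np.eye(n_data), "fro") ** 2
spread_R = np.linalg.norm(R_res - np.eye(n_model), "fro") ** 2

print("N =\n", np.round(N_res, 6))
print("R =\n", np.round(R_res, 6))
print("spread(N) =", spread_N, " spread(R) =", spread_R)
print("C_u =\n", np.round(C_u, 6))
```

For this purely underdetermined case the data are perfectly resolved, while the model parameters are only recovered as weighted averages of the true ones.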
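Summary (1)
Resolution of linear systems $\mathbf{d} = \mathbf{G}\mathbf{m}$ for $N$ data and $M$ unknowns using the L2 norm

If $N > M$: overdetermined problem
Minimization of prediction error: $E = (\mathbf{d} - \mathbf{G}\mathbf{m})^T (\mathbf{d} - \mathbf{G}\mathbf{m})$
Least squares solution: $\mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d}$
Data resolution matrix: $\mathbf{N} = \mathbf{G} \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T$
Model resolution matrix: $\mathbf{R} = \mathbf{I}_M$
Unit covariance matrix: $\left[\mathbf{G}^T\mathbf{G}\right]^{-1}$

If $N < M$: underdetermined problem
Minimization of norm of solution: $L = \mathbf{m}^T\mathbf{m}$ with constraint $E = 0$
Minimum length solution: $\mathbf{m}^{est} = \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T\right]^{-1} \mathbf{d}$
Data resolution matrix: $\mathbf{N} = \mathbf{I}_N$
Model resolution matrix: $\mathbf{R} = \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T\right]^{-1} \mathbf{G}$
Unit covariance matrix: $\mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T\right]^{-2} \mathbf{G}$
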
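Summary (2)
Resolution of linear systems $\mathbf{d} = \mathbf{G}\mathbf{m}$ for $N$ data and $M$ unknowns using the L2 norm

For any given $N$, $M$: mixed-determined problem
Singular value decomposition: $\mathbf{G} = \mathbf{U}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T$
Generalized inverse solution: $\mathbf{m}^{est} = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d}$
Data resolution matrix: $\mathbf{N} = \mathbf{U}_r \mathbf{U}_r^T$
Model resolution matrix: $\mathbf{R} = \mathbf{V}_r \mathbf{V}_r^T$
Unit covariance matrix: $\mathbf{V}_r \boldsymbol{\Lambda}_r^{-2} \mathbf{V}_r^T$

Damped least squares solutions
$$ \mathbf{m}^{est} = \left[\mathbf{G}^T\mathbf{G} + \varepsilon^2 \mathbf{I}_M\right]^{-1} \mathbf{G}^T\mathbf{d} $$
$$ \mathbf{m}^{est} = \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T + \varepsilon^2 \mathbf{I}_N\right]^{-1} \mathbf{d} $$
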
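Exercise: medium described by 4 cells
Compute solutions with
$\mathbf{m}_1 = \left[\mathbf{G}^T\mathbf{G}\right]^{-1} \mathbf{G}^T\mathbf{d}$
$\mathbf{m}_2 = \mathbf{G}^T \left[\mathbf{G}\mathbf{G}^T + \varepsilon^2 \mathbf{I}_N\right]^{-1} \mathbf{d}$ ; $\varepsilon = 0.01,\ 0.1,\ 1$
$\mathbf{m}_3 = \left[\mathbf{G}^T\mathbf{G} + \varepsilon^2 \mathbf{I}_M\right]^{-1} \mathbf{G}^T\mathbf{d}$ ; $\varepsilon = 0.01,\ 0.1,\ 1$
$\mathbf{m}_4 = \mathbf{V}_r \boldsymbol{\Lambda}_r^{-1} \mathbf{U}_r^T \mathbf{d}$
In all these cases, compute
  the solution length $L = \mathbf{m}^T\mathbf{m}$
  the data prediction error $E = \mathbf{e}^T\mathbf{e} = (\mathbf{d} - \mathbf{G}\mathbf{m})^T (\mathbf{d} - \mathbf{G}\mathbf{m})$
Use your preferred computation tool to do the matrix operations:
  Matlab, Octave
  Maple, Mathematica
  LibreOffice, Gnumeric, Excel (MMULT, MINVERSE, TRANSPOSE, ...) (*)
  Python, R
  Fortran, C
(*) Norms can be tricky to compute

Main results:
$\mathbf{m}_1 = (2,\ 4,\ 5,\ 8)^T$ ; $E_1 = 0$ ; $L_1 = 109$
$\mathbf{m}_2 = (2.000090,\ 3.999990,\ 4.999940,\ 7.999790)^T$ ; $E_2 = 0.1479 \times 10^{-6}$ ; $L_2 = 108.9963201$ [$\varepsilon = 0.01$]
$\mathbf{m}_2 = (2.00577777,\ 3.99582753,\ 4.99085241,\ 7.97592705)^T$ ; $E_2 = 0.2427478313 \times 10^{-2}$ ; $L_2 = 108.5138021$ [$\varepsilon = 0.1$]
$\mathbf{m}_3 = (2.005778095,\ 3.995827847,\ 4.990852723,\ 7.975927349)^T$ ; $E_3 = 0.2427360908 \times 10^{-2}$ ; $L_3 = 108.5138140$ [$\varepsilon = 0.1$]

Main results (continued):
[Figure: solid line, damped overdetermined solution; symbols, damped underdetermined solution]
The 6 × 6 data resolution matrix for the least squares solution (1): data predicted from the estimated model do not entirely explain the data (they are slightly smoothed).
The 4 × 4 model resolution matrix for the least squares solution is the $\mathbf{I}_4$ identity matrix.
When $\varepsilon$ increases (> 0.6), we degrade both the data and model resolutions (smoothing).
Tradeoff between resolution and variance.
[Figure: traces and spreads of the resolution matrices $\mathbf{N}$ and $\mathbf{R}$, and trace of the unit covariance matrix, plotted against the damping parameter]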
