
Sparse & Redundant Representations
and Their Applications in
Signal and Image Processing

Greedy Pursuit Algorithms – The Practice

Michael Elad
The Computer Science Department
The Technion – Israel Institute of Technology
Haifa 32000, Israel
Defining Our Objective and Directions

We Return to (P0)

So, we are considering again the general (P0) problem

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

and this time we would like to discuss practical ways for solving it.

[Figure: the matrix A, of size n×m with m > n, multiplies the sparse vector x ∈ R^m to produce b ∈ R^n]


What are Our Options?

Here is a possible recipe for solving (P0): an exhaustive search over all the possible supports.

- Denote by k the number of non-zeros in the solution.
- As we do not know how many non-zeros there are in the optimal solution, we should check k = 1, 2, ... till we find the sparsest solution.

The recipe as a flowchart:
1. Set k = 1.
2. Gather all the supports $\{S_i\}_i$ of cardinality k; there are $\binom{m}{k}$ such supports.
3. For each support, solve the LS problem $\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_i$.
4. If the LS error is ≤ ε² for some support: done. Otherwise, set k = k+1 and return to step 2.


Option 1: Exhaustive Search

A typical example: assume that
o m = 2000 (number of atoms in A),
o k is known – k = 15 (so no need to start at k = 1, ...),
o it takes 1 nano-second to check each LS.

We shall need ~7.5e20 years to solve this problem!!

Running the flowchart above at such sizes is hopeless: this is a combinatorial problem, proven to be NP-Hard!
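To make the recipe concrete, here is a minimal Python/numpy sketch of the exhaustive search; the function name is illustrative, and it is feasible only for tiny m and k, for exactly the combinatorial reason just described:

    import numpy as np
    from itertools import combinations

    def exhaustive_p0(A, b, eps):
        """Brute-force (P0): scan supports of growing cardinality k.
        Each k costs (m choose k) least-squares solves."""
        n, m = A.shape
        for k in range(1, m + 1):
            for S in combinations(range(m), k):
                idx = list(S)
                xS, *_ = np.linalg.lstsq(A[:, idx], b, rcond=None)
                if np.linalg.norm(A[:, idx] @ xS - b) <= eps:
                    x = np.zeros(m)
                    x[idx] = xS
                    return x, idx
        return None, None   # no solution within the error tolerance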


What are Our Options?

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

DEFINITELY NOT EXHAUSTIVE SEARCH!

- So, what are the alternatives? The general answer is Approximation Algorithms.
- Approximation? This means that we are willing to sacrifice accuracy and NOT obtain the truly optimal solution of (P0).
- So, how do we design such approximation algorithms?


Approximation Algorithms: Greedy

- Very similar to the exhaustive-search rationale, one could say that the true unknown in (P0) is the support of the solution, which is discrete by nature.
- The set of support possibilities forms a tree with branching factor m – root S={}, then S={1}, S={2}, ..., S={m}, then S={2,1}, S={2,3}, ..., S={2,m}, and so on – and exhaustive search implies checking each node.
- An approximation: search the tree of possibilities while pruning many "unlikely" states.
- This leads to "Greedy Methods".
Approximation Algorithms: Relaxation

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

- As opposed to the above, one could consider the whole vector x as an unknown, rather than focusing on its support.
- We have massive knowledge in continuous optimization ... but ...
- Main difficulty: (P0) is highly non-smooth due to the L0 penalty.
- Solution: smooth (P0) somehow, which leads to "Relaxation Methods".


Approximation Algorithms

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

Greedy methods: build the solution one non-zero element at a time.
Relaxation methods: smooth the L0 and use continuous-optimization techniques.


Greedy Algorithms: The Orthogonal Matching Pursuit Algorithm
Let's Go Greedy

Core idea: exploit the best support from the last round.

- Start: find the atom that best matches Ax to b.
- Next: given the previously chosen atoms, find the next one that best fits Ax to b.
- The solution grows the support one item at a time.
- The algorithm should stop when the error Ax - b gets close enough to zero.
The Relation to the Pruned Tree

    $\|x\|_0 = 0$:   S={}
    $\|x\|_0 = 1$:   S={1}, S={2}, ..., S={m}
    $\|x\|_0 = 2$:   S={2,1}, S={2,3}, ..., S={2,m}
    $\|x\|_0 = 3$:   S={2,3,1}, S={2,3,4}, S={2,3,7}, ..., S={2,3,m}
    $\|x\|_0 = 4$:   S={2,3,7,1}, S={2,3,7,4}, S={2,3,7,8}, ..., S={2,3,7,m}

Many of the possibilities are never checked, as every round has only O(m) tests (instead of m-choose-k).
From Concept to Algorithms

- As it turns out, there are various ways to practice the above rationale, all of them considered "Greedy Algorithms".
- We shall meet several such variants, ranging from the most sophisticated down to simpler methods:
  o Least-Squares Orthogonal Matching Pursuit (LS-OMP)
  o Orthogonal Matching Pursuit (OMP) – our starting point
  o Matching Pursuit (MP)
  o Weak Matching Pursuit (WMP)
  o The Thresholding Algorithm


OMP: The Rationale

- Greedy algorithms such as OMP build the solution sequentially, adding one non-zero at a time: $x_0, x_1, x_2, \ldots, x_k, \ldots$
- Along this path, the found solution $x_k$ may not satisfy the equation: $Ax_k \approx b$.
- We shall refer to this error as the current residual: $r_k = b - Ax_k$.
- OMP strategy: choose the next non-zero so as to reduce the "energy" of the residual as much as possible.

[Figure: b ≈ Ax with a sparse x, leaving the residual r]
OMP: The Details

Our goal: approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.

Initialization: $k=0$, $x_0=0$, $r_0 = b - Ax_0 = b$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Compute $E(i) = \min_z \|z\,a_i - r_{k-1}\|_2^2$ for $1 \le i \le m$.
2. Choose $i_0$ s.t. $E(i_0) \le E(i)$ for all $1 \le i \le m$.
3. Update the support: $S_k = S_{k-1} \cup \{i_0\}$.
4. LS: $x_k = \arg\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_k$.
5. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$; otherwise, iterate.
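A minimal numpy sketch of this iteration, assuming the columns of A are L2-normalized (the function name and the iteration cap are illustrative):

    import numpy as np

    def omp(A, b, eps=1e-6, k_max=None):
        """Orthogonal Matching Pursuit for min ||x||_0 s.t. Ax = b (approx.)."""
        n, m = A.shape
        k_max = k_max or n
        x, r, S = np.zeros(m), b.copy(), []
        while np.linalg.norm(r) > eps and len(S) < k_max:
            # Sweep: minimizing E(i) amounts to maximizing |a_i^T r| (shown next)
            i0 = int(np.argmax(np.abs(A.T @ r)))
            S.append(i0)
            # LS over the current support: all chosen coefficients are re-fitted
            xS, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
            x = np.zeros(m)
            x[S] = xS
            r = b - A @ x
        return x, S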


OMP: Choosing the Next Atom

- Let's assume that the columns of A are L2-normalized.
- Evaluating which atom to choose relies on computing E(i):

    $E(i) = \min_z \|z\,a_i - r_{k-1}\|_2^2$ for $1 \le i \le m$.

  Setting the derivative to zero, $a_i^T(z\,a_i - r_{k-1}) = 0$, gives

    $z_{opt} = \frac{a_i^T r_{k-1}}{a_i^T a_i} = a_i^T r_{k-1}$,

  and plugging it back in,

    $E(i) = \|(a_i^T r_{k-1})\,a_i - r_{k-1}\|_2^2 = \|r_{k-1}\|_2^2 - (a_i^T r_{k-1})^2$.

Conclusion: instead of minimizing E(i), we can maximize $|a_i^T r_{k-1}|$.


OMP: Choosing the Next Atom

In order to choose the next atom to join the support, OMP computes the length-m vector $A^T r_{k-1}$ and seeks its maximal entry in absolute value – its location points to the atom to be chosen.


OMP: Least-Squares

After we have updated the support, we should update both the current solution $x_k$ and the residual $r_k$:

    $x_k = \arg\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_k$

Restricted to the support, this reads $\min_{x_S} \|A_S x_S - b\|_2^2$, whose closed-form solution is

    $x_k(S_k) = (A_S^T A_S)^{-1} A_S^T b = A_S^\dagger b$.


Why the Name "Orthogonal ..."?

- Observe that the updated solution $x_k$ minimizes $\|A_S x - b\|_2^2$, and therefore

    $0 = A_S^T (A_S x_k - b) = -A_S^T r_k$.

- The solution in each step is chosen such that the new residual $r_k$ is orthogonal to all the chosen atoms in A.

A positive consequence: OMP can never choose the same atom twice.


Numerical Shortcut

- Our Least-Squares task is given as $\min_{x_k} \|A_{S_k} x_k - b\|_2^2$, solved by $x_k = (A_{S_k}^T A_{S_k})^{-1} A_{S_k}^T b$.
- Could we exploit the fact that we already have the inversion $(A_{S_{k-1}}^T A_{S_{k-1}})^{-1}$ from the previous iteration, given that $A_{S_k} = [A_{S_{k-1}}, a_k]$?
- The answer is positive – there is a recursive method to update the solution (which will not be discussed here).
OMP: Complexity

- The two most demanding parts of the OMP are:
  o The sweep stage, in which we choose the next atom. This stage requires the computation of $A^T r_{k-1}$ and finding the maximal value, a process that requires O(mn) operations.
  o The Least-Squares stage, in which we update the solution. This stage requires computing $A_S^T A_S$ and updating the solution, a process that requires O(k²m) operations.
- The overall complexity of the OMP is governed by the first of the two (applied k times): OMP complexity is O(mnk).


Variations over the Orthogonal Matching Pursuit
Other Greedy Algorithms

- The OMP is just one interpretation of the greedy rationale, and there are others that could be proposed.
- The alternative algorithms suggest different tradeoffs between accuracy and complexity. From fastest to most accurate:

    Thresholding → Weak Matching Pursuit (WMP) → Matching Pursuit (MP) → OMP → Least-Squares OMP


Least-Squares OMP

Here is an algorithm that may appear at first to be equivalent to the OMP ...

1. Set k = 1 and $S_0 = \{\}$.
2. Gather all the supports $\{S_i = S_{k-1} \cup \{i\}\}_{i=1}^m$ (there are only m such supports).
3. For each support, find the solution of $\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_i$.
4. Find the support with the smallest error, $\{S_k, E_k\}$.
5. If $E_k \le \varepsilon$: done. Otherwise, set k = k+1 and return to step 2.

- Is it different from OMP?
- Yes! While OMP uses the residual as a proxy for the error, this method computes the actual error directly.
LS-OMP vs. OMP

Our goal: approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.

Recall the OMP: initialize $k=0$, $x_0=0$, $r_0=b$, $S_0=\{\}$; then per iteration compute $p(i) = |a_i^T r_{k-1}|$ for all i, choose the maximizer $i_0$, update the support, solve the LS over it, and update the residual, stopping when $\|r_k\|_2 \le \varepsilon$.

The atom-selection part (steps 1-2) is the part that will be replaced in order to obtain the LS-OMP.


LS-OMP: Key Observation

- Who is the best atom to join the support?
  o OMP's answer: the one that correlates most with the residual.
  o LS-OMP's answer: the OMP strategy is sub-optimal, since it relies on the residual. A better choice can be made by computing the following set of m LS problems:

    $E(i) = \min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_{k-1} \cup \{i\}$, for $1 \le i \le m$,

    and choosing the atom that led to the smallest error.
- Observation: OMP does this very LS, but only once per iteration, while LS-OMP performs it m times in order to choose the next atom – this is the best possible greedy step.
LS-OMP: Details

Initialization: $k=0$, $x_0=0$, $r_0 = b - Ax_0 = b$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Compute $E(i) = \min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_{k-1} \cup \{i\}$, for $1 \le i \le m$.
2. Choose $i_0$ s.t. $E(i_0) \le E(i)$ for all i.
3. Update the support: $S_k = S_{k-1} \cup \{i_0\}$.
4. LS: $x_k = \arg\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_k$.
5. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$.

Comments:
o Step 1 can be done faster by exploiting the recursive LS algorithm.
o Step 4 is not needed, since its result is already given in step 1.
o The residual is computed here only for the stopping criterion.
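A direct (and deliberately naive) numpy sketch of the LS-OMP sweep; in practice step 1 would reuse the previous factorization recursively, but the plain version below shows the logic (names are illustrative):

    import numpy as np

    def ls_omp(A, b, eps=1e-6, k_max=None):
        """LS-OMP: add the atom whose inclusion yields the smallest LS error."""
        n, m = A.shape
        k_max = k_max or n
        x, r, S = np.zeros(m), b.copy(), []
        while np.linalg.norm(r) > eps and len(S) < k_max:
            best_err, best_i, best_x = np.inf, None, None
            for i in range(m):
                if i in S:
                    continue
                trial = S + [i]
                xT, *_ = np.linalg.lstsq(A[:, trial], b, rcond=None)
                err = np.linalg.norm(A[:, trial] @ xT - b)
                if err < best_err:
                    best_err, best_i, best_x = err, i, xT
            S.append(best_i)
            x = np.zeros(m)
            x[S] = best_x
            r = b - A @ x
        return x, S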


LS-OMP: Complexity

Claim 1: Complexity(OMP) < Complexity(LS-OMP).

Claim 2: Nevertheless, LS-OMP can be made more efficient by exploiting the recursive LS solution mentioned earlier.

From fastest to most accurate:

    Thresholding → Weak Matching Pursuit (WMP) → Matching Pursuit (MP) → OMP → Least-Squares OMP


Simplifying the OMP

- How can we simplify the OMP?
  o Matching Pursuit: by avoiding the LS step somehow.
  o Weak MP: by simplifying the search for the next atom.

(Recall the OMP iteration: compute $p(i) = |a_i^T r_{k-1}|$, choose the maximizer $i_0$, update $S_k$, solve the LS over $S_k$, and update the residual, stopping when $\|r_k\|_2 \le \varepsilon$.)


Matching Pursuit: Rationale

- MP uses the same method for choosing the next atom, so assume that the current support $S_k$ has been set.
- When updating the solution, OMP chooses to "forget" the previous solution $x_{k-1}$ and re-compute it by a full LS:

    $\min_{x_k} \|A_S x_k - b\|_2^2$    (the original OMP LS)

- MP suggested instead keeping $x_{k-1}$ fixed and optimizing only the new coefficient:

    $\min_z \|A_S x_{k-1} + a_{i_0} z - b\|_2^2 = \min_z \|a_{i_0} z - r_{k-1}\|_2^2$    (looks familiar?)

- MP approach: keep $x_{k-1}$ and simply update it by adding the new atom with its coefficient, $x_k(i_0) = a_{i_0}^T r_{k-1}$.
MP: Details

Initialization: $k=0$, $x_0=0$, $r_0 = b - Ax_0 = b$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Compute $p(i) = |a_i^T r_{k-1}|$ for $1 \le i \le m$.
2. Choose $i_0$ s.t. $p(i_0) \ge p(i)$ for all i.
3. Update the support: $S_k = S_{k-1} \cup \{i_0\}$.
4. Update $x_k$: $x_k = x_{k-1}$, and then $x_k(i_0) = x_k(i_0) + a_{i_0}^T r_{k-1}$.
5. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$.

Comments:
o MP might choose the same atom twice, explaining the '+' in the update formula.
o Clearly, MP is faster (and less accurate) than the OMP, since it avoids all the LS computations.
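The corresponding update in numpy – a sketch under the same assumptions (L2-normalized columns; the iteration cap is illustrative):

    import numpy as np

    def mp(A, b, eps=1e-6, max_iter=1000):
        """Matching Pursuit: only the new coefficient is updated (no LS),
        so the same atom may be selected more than once."""
        n, m = A.shape
        x, r = np.zeros(m), b.copy()
        for _ in range(max_iter):
            if np.linalg.norm(r) <= eps:
                break
            c = A.T @ r
            i0 = int(np.argmax(np.abs(c)))
            x[i0] += c[i0]          # '+=': atom i0 may be revisited
            r -= c[i0] * A[:, i0]   # cheaper than recomputing b - A @ x
        return x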


Weak Matching Pursuit: Rationale

- Weak-MP seeks to further simplify the Matching Pursuit by targeting the step of choosing the next atom:
  o MP approach: compute $|A^T r_{k-1}|$ and choose the largest entry.
  o WMP's rationale: if A is huge, this step is too expensive.
- The alternative: compute the values $|a_i^T r_{k-1}|$ one at a time, and stop when the obtained value is big enough. How big is "big enough"?
- Since $E(i) = \|r_{k-1}\|_2^2 - (a_i^T r_{k-1})^2 \ge 0$, the range of possible values is

    $0 \le |a_i^T r_{k-1}| \le \|r_{k-1}\|_2$.

- Weak-MP strategy: if $|a_i^T r_{k-1}|$ is above t (< 1) times this upper bound, it is sufficiently big.


WMP: Details

Initialization: $k=0$, $x_0=0$, $r_0 = b - Ax_0 = b$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Compute $p(i) = |a_i^T r_{k-1}|$, scanning i = 1, ..., m.
2. Choose $i_0$ as soon as $p(i_0) \ge t \cdot \|r_{k-1}\|_2$.
3. Update the support: $S_k = S_{k-1} \cup \{i_0\}$.
4. Update $x_k$: $x_k = x_{k-1}$, and then $x_k(i_0) = x_k(i_0) + a_{i_0}^T r_{k-1}$.
5. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$.

Comments:
o WMP uses the same 'trick' as the MP for avoiding the LS computation.
o t is a parameter – the smaller it is, the earlier the scan stops, so the faster (and less accurate) the algorithm becomes.
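A sketch of the weak selection rule (assuming, as above, L2-normalized columns; falling back to the best atom seen when none passes the test is an implementation assumption):

    import numpy as np

    def wmp_select(A, r, t=0.5):
        """Weak-MP atom choice: stop scanning once the correlation
        exceeds t * ||r||_2."""
        thresh = t * np.linalg.norm(r)
        best_i, best_p = 0, -1.0
        for i in range(A.shape[1]):
            p = abs(A[:, i] @ r)
            if p >= thresh:
                return i            # early exit: the whole point of WMP
            if p > best_p:
                best_i, best_p = i, p
        return best_i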


The Thresholding Algorithm
What Have We Seen So Far?

- There is a set of possible greedy methods that span a wide range of complexities and accuracies. From fastest to most accurate:

    Thresholding → Weak Matching Pursuit (WMP) → Matching Pursuit (MP) → OMP → Least-Squares OMP

- We now turn to discuss the thresholding algorithm, which is the simplest and crudest of all, but could also be considered the most popular pursuit technique.


Thresholding: Core Idea

- How should we choose the next atom to join the support?
  o LS-OMP's answer: by trying each of the atoms and choosing the one that leads to the eventual smallest error.
  o OMP's answer: the one that correlates most with the residual.
  o Thresholding's answer: the above two are too complicated ...
- So, what is the alternative? Take the very first OMP step, the length-m vector $|A^T b|$, and extract from it all the decisions about the order of atoms to bring into S, based on the magnitude of the values of this vector.
Thresholding: Details

Our goal: approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.

Initialization:
1. Compute $A^T b$ and sort this vector by descending absolute value, yielding the order $i_1, i_2, \ldots$ with $|(A^Tb)_{i_1}| \ge |(A^Tb)_{i_2}| \ge |(A^Tb)_{i_3}| \ge \ldots$
2. Set $k=0$, $x_0=0$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Update the support: $S_k = S_{k-1} \cup \{i_k\}$.
2. LS: $x_k = \arg\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_k$.
3. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$.

Here as well, one could use the recursive LS to speed up the process.
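A compact numpy sketch of the thresholding algorithm as described (names are illustrative):

    import numpy as np

    def thresholding(A, b, eps=1e-6, k_max=None):
        """Thresholding: the atom order is fixed up front by |A^T b|;
        atoms are added one by one, each addition followed by an LS fit."""
        n, m = A.shape
        k_max = k_max or n
        order = np.argsort(-np.abs(A.T @ b))   # descending |correlation|
        x = np.zeros(m)
        for k in range(1, k_max + 1):
            S = list(order[:k])
            xS, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
            x = np.zeros(m)
            x[S] = xS
            if np.linalg.norm(b - A @ x) <= eps:
                break
        return x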
Thresholding Made Even Simpler

Our goal: approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.

The algorithm is as above, except for the solution update:
- In high-dimensional problems, the LS step becomes prohibitive.
- It can be replaced by a simple projection, following the Matching-Pursuit approach: $x_k(S_k) = A_{S_k}^T b$.
A Test Case: Demonstrating and Testing Greedy Algorithms

Proposed Experiment

- We have several algorithms for approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.
- Let's test these algorithms. This is the structure of the experiment:
  1. Draw $A \in \mathbb{R}^{n \times m}$ (m > n) somehow.
  2. Draw an s-sparse $x_0 \in \mathbb{R}^m$ at random, $\|x_0\|_0 = s \ll n$.
  3. Multiply: $b = Ax_0$.
  4. Feed {A, b} to a solver of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$, obtaining $\hat{x}$.
  5. Compare $\hat{x}$ to $x_0$.


Proposed Experiment (cont.)

- We shall use a random A of size 50×100 with normal entries.
- We shall L2-normalize the columns of A.


Proposed Experiment (cont.)

- We shall test the cardinalities s in the range [1, 15].
- The non-zeros will be drawn from the uniform distribution over [1, 2] and given a random sign, so the values fall in [-2, -1] ∪ [1, 2].


Proposed Experiment (cont.)

- We will compute the relative L2 error:

    $\text{Error}_{L2} = \|\hat{x} - x_0\|_2^2 / \|x_0\|_2^2$

- We will also evaluate the support recovery error:

    $\text{Error}_S = 1 - |\hat{S} \cap S_0| / \max(|\hat{S}|, |S_0|)$

- Results are averaged over 200 experiments.
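A sketch of this experiment harness in numpy, reusing the omp sketch from above as the solver under test (names and defaults are illustrative; the two distances match the error measures just defined):

    import numpy as np

    def run_trial(n=50, m=100, s=5, rng=np.random.default_rng()):
        A = rng.standard_normal((n, m))
        A /= np.linalg.norm(A, axis=0)               # L2-normalize columns
        S0 = rng.choice(m, size=s, replace=False)
        x0 = np.zeros(m)
        x0[S0] = rng.uniform(1, 2, s) * rng.choice([-1, 1], s)
        b = A @ x0
        x_hat, S_hat = omp(A, b, eps=1e-6, k_max=s)  # solver under test
        err_l2 = np.sum((x_hat - x0)**2) / np.sum(x0**2)
        inter = len(set(S_hat) & set(S0.tolist()))
        err_s = 1 - inter / max(len(S_hat), s)
        return err_l2, err_s

    # average over 200 trials, as in the experiment
    mean_errs = np.mean([run_trial(s=5) for _ in range(200)], axis=0)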
Proposed Experiment: Results

[Figure: the average relative L2 error, $\|\hat{x}-x_0\|_2^2 / \|x_0\|_2^2$, as a function of the cardinality s, for the greedy algorithms tested]

Proposed Experiment: Results

[Figure: the average support recovery error, $1 - |\hat{S} \cap S_0| / \max(|\hat{S}|,|S_0|)$, as a function of s]


Performance of Greedy Algorithms

Question: should we be happy with the results we got?

Answer: Yes & No.

Negative: the success is conditioned on a very low s value.

Positive: these algorithms succeed for sparse enough $x_0$ (e.g., consider the cost of OMP for s=9 versus an exhaustive search).


Relaxation Pursuit Algorithms

Relaxation of the L0-Norm: The Core Idea
Back to (P0)

We are considering again the general (P0) problem

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

and we would like to discuss practical ways for solving it that are NOT based on the greedy rationale.

[Figure: A of size n×m multiplies the sparse x ∈ R^m to produce b ∈ R^n]
Relaxation?

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

- We have massive knowledge in continuous optimization ... but ...
- The main difficulty: (P0) is highly non-smooth due to the L0 penalty.
- Solution: smooth (P0). This leads us to "Relaxation Methods".


Smoothing the L0-Norm

- Recall the L0-norm:

    $\|x\|_0 = \sum_{k=1}^m \rho^*(x_k)$, where $\rho^*(x) = 0$ for $x = 0$ and $\rho^*(x) = 1$ for $x \ne 0$.

- There are infinitely many ways to smooth this function.
- We shall propose a few smooth variations of $\rho^*(x)$ to illustrate the possibilities.


Here are a Few Popular Options ...

    $\rho_\alpha(x) = 1 - \exp(-x^2/\alpha)$      $\rho_\alpha(x) = \frac{x^2}{x^2 + \alpha}$      $\rho_\alpha(x) = \frac{|x|}{|x| + \alpha}$

These options all share the property

    $\rho_\alpha(x) \xrightarrow{\alpha \to 0} \rho^*(x)$.

[Figure: the smooth surrogates plotted over [-4, 4], approaching the 0/1 indicator function]


Graduated Optimization

An appealing option is to solve a chain of such problems, with a smoothing effect that starts wide (nearly convex) and progressively gets closer to L0.

For example: $\rho_\alpha(x) = 1 - e^{-x^2/\alpha}$, decreasing $\alpha$ along the sequence $\alpha_1 = 5.0$, $\alpha_2 = 1.5$, $\alpha_3 = 0.5$, $\alpha_4 = 0.1$.

[Figure: the surrogate plotted for the four decreasing α values, sharpening toward the L0 indicator]


Graduated Optimization (cont.)

- Define

    $(P_0^\alpha):\ \min_x \sum_{k=1}^m \rho_\alpha(x_k) \ \text{s.t.}\ Ax = b$.

- Each problem provides a warm-start initialization to the next:
  1. Initialize $x_0 = 0$, set j = 1, and choose $\alpha_1$ somehow.
  2. Solve $(P_0^{\alpha_j})$ initialized at $x_{j-1}$, obtaining $x_j$.
  3. Decrease α, set j = j+1, and repeat.
- Such a process may lead to a better final solution compared to the use of a single α value.
Numerical Solution of the Relaxed (P0)

- How can we numerically solve the problem $\min_x \sum_{k=1}^m \rho(x_k) \ \text{s.t.}\ Ax = b$?
- Optimization theory provides various such ways.
- We shall present one simple option: Iterative Reweighted Least-Squares (IRLS).
- The core idea: write the penalty as a pseudo L2-norm,

    $\sum_{k=1}^m \rho(x_k) = \sum_{k=1}^m \frac{\rho(x_k)}{x_k^2} \cdot x_k^2$,

  and refer to the term $w_k = \rho(x_k)/x_k^2$ as "fixed" weights.
IRLS: The Details

    $\min_x \sum_{k=1}^m w_k x_k^2 \ \text{s.t.}\ Ax = b$, with $w_k = \rho(x_k)/x_k^2$

This algorithm is also known as the FOcal Underdetermined System Solver (FOCUSS).

    $(P_2\{W\}):\ \min_x x^T W x \ \text{s.t.}\ Ax = b$

This problem has a closed-form solution. IRLS iterates between a solution of the L2 problem and an update of the weights:
1. Initialize $x_0 = \mathbf{1}$.
2. Solve $(P_2\{W\})$ for x.
3. Update the diagonal matrix W from the new x, and repeat.
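A minimal IRLS/FOCUSS sketch for the surrogate $\rho(x) = |x|^p$ (with p = 0.5, matching the experiment below); the weighted-L2 step has the closed form $x = W^{-1}A^T(AW^{-1}A^T)^{-1}b$, and the small eps guarding the division is an implementation assumption:

    import numpy as np

    def irls(A, b, p=0.5, n_iter=100, eps=1e-8):
        """IRLS (FOCUSS) for min sum |x_k|^p s.t. Ax = b."""
        n, m = A.shape
        x = np.ones(m)                        # non-degenerate start
        for _ in range(n_iter):
            w = (np.abs(x) + eps) ** (p - 2)  # w_k = rho(x_k)/x_k^2 = |x_k|^(p-2)
            Winv = 1.0 / w
            AWAt = (A * Winv) @ A.T           # A W^{-1} A^T
            lam = np.linalg.solve(AWAt, b)
            x = Winv * (A.T @ lam)            # x = W^{-1} A^T (A W^{-1} A^T)^{-1} b
        return x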


Alternative: Convex Relaxation

- Another possible relaxation that has drawn much attention is $\rho(x) = |x|$, implying ...

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$
    $(P_1):\ \min_x \|x\|_1 \ \text{s.t.}\ Ax = b$

- The resulting problem (P1) is known as Basis Pursuit.
- This relaxation is a convex problem that can be handled by Linear-Programming solvers.
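One standard way to pose (P1) as a linear program – a sketch using scipy's linprog, under the usual split x = u - v with u, v ≥ 0:

    import numpy as np
    from scipy.optimize import linprog

    def basis_pursuit(A, b):
        """(P1) as an LP: min 1^T(u+v) s.t. A(u-v) = b, u,v >= 0."""
        n, m = A.shape
        c = np.ones(2 * m)
        A_eq = np.hstack([A, -A])
        res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
        u, v = res.x[:m], res.x[m:]
        return u - v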
A Test Case: Demonstrating and Testing Relaxation Algorithms
Proposed Experiment: As Before

1. Draw $A \in \mathbb{R}^{n \times m}$ (m > n) somehow.
2. Draw an s-sparse $x_0 \in \mathbb{R}^m$ at random, $\|x_0\|_0 = s \ll n$.
3. Multiply: $b = Ax_0$.
4. Feed {A, b} to an approximate solver of (P0), obtaining $\hat{x}$.
5. Compare $\hat{x}$ to $x_0$.
Proposed Experiment: As Before ...

- We use a random A of size 50×100 with iid Gaussian entries, and normalize A's columns.
- We test the cardinalities s in the range [1, 20].
- The non-zeros are drawn from the uniform distribution over [1, 2] and given a random sign.
- We evaluate the results by
  a) the L2 error of the computed solution, and
  b) a support recovery score.
- We average the results over 200 experiments.
Proposed Experiment: Algorithms

We compare three algorithms:
1. The OMP, applied exactly as in the previous experiment.
2. The IRLS, used for approximating the solution of $(P_{1/2})$ (where $\rho_{1/2}(x) = |x|^{0.5}$):
   o 100 iterations
   o the weights are given by $w_k = 1/|x_k|^{1.5}$
3. A direct solution of (P1) using the linprog instruction in Matlab.
Proposed Experiment: Results

[Figure: the average relative L2 error, $\|\hat{x}-x_0\|_2^2 / \|x_0\|_2^2$, as a function of s, for OMP, IRLS, and BP]

Proposed Experiment: Results

[Figure: the average support recovery error, $1 - |\hat{S} \cap S_0| / \max(|\hat{S}|,|S_0|)$, as a function of s]


Guarantees of Pursuit Algorithms

Our Goal: Theoretical Justification for the Proposed Algorithms
Back to (P0)

- For the problem

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

  we came up with various algorithms, all aiming to approximate its solution:
  o Greedy methods: OMP, LS-OMP, MP, WMP, THR
  o Relaxation methods, such as Basis Pursuit
- Why should we trust these methods?
- A partial and indirect answer was given by the experiments we did.


Theoretical Guarantees

- Our goal now is to show that pursuit algorithms could in fact be accurate, leading to the desired solution.
- Wait! Doesn't this contradict our earlier statement that (P0) is NP-Hard?
- Answer: Yes and no. We shall develop theoretical guarantees for the success of several pursuit algorithms under some conditions on the cardinality of the unknown x.
- The message: when these conditions are met, (P0) is not NP-Hard anymore.
Rules of the Game

1. Choose a specific $A \in \mathbb{R}^{n \times m}$ (m > n).
2. Draw an s-sparse x at random, $\|x\|_0 = s \ll n$.
3. Multiply: $b = Ax$.
4. Feed {A, b} to a pursuit algorithm, obtaining $\hat{x}$.

We shall prove that if s is small enough (i.e., x is sufficiently sparse), then OMP, THR, and BP are all guaranteed to give $\hat{x} = x$.
Implications

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

- While (P0) is generally NP-Hard, it is far less complicated if its unknown, x, is known to be sparse.
- As we are about to show, the guarantees we shall develop may pose additional conditions, either on x or on A.
- We shall commonly refer to all these results as EQUIVALENCE GUARANTEES.


Worst-Case Analysis

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

- The guarantees we are going to develop adopt a worst-case point of view.
- This means that for all {A, b} satisfying the conditions, success is perfectly guaranteed.
- There exists a more sophisticated approach that adopts a probabilistic point of view, claiming the success of the pursuit with probability → 1.
- These offer more "generous" bounds, but their analysis is typically more complicated.
Equivalence: Analyzing the OMP Algorithm
Recall the OMP

Our goal: approximating the solution of $\min_x \|x\|_0 \ \text{s.t.}\ Ax = b$.

Initialization: $k=0$, $x_0=0$, $r_0 = b - Ax_0 = b$, $S_0 = \{\}$.

Main iteration ($k \leftarrow k+1$):
1. Compute $p(i) = |a_i^T r_{k-1}|$ for $1 \le i \le m$.
2. Choose $i_0$ s.t. $p(i_0) \ge p(i)$ for all i.
3. Update the support: $S_k = S_{k-1} \cup \{i_0\}$.
4. LS: $x_k = \arg\min_x \|Ax - b\|_2^2 \ \text{s.t.}\ \text{supp}(x) = S_k$.
5. Update the residual: $r_k = b - Ax_k$.
Stop when $\|r_k\|_2 \le \varepsilon$.


Underlying Assumptions

- Assume that the first s elements in x are the non-zero ones, ordered in decreasing order of absolute values, so that the true support is S = {1, ..., s}:

    $b = \sum_{i=1}^s x_i a_i$, where $|x_1| \ge |x_2| \ge \ldots \ge |x_s| > 0$.


OMP: Condition for Success

- The first step of the OMP succeeds if the inner product of b with $a_1$ is bigger (in absolute value) than with all the columns of A outside the support:

    $|b^T a_1| > \max_{j>s} |b^T a_j|$

- We shall proceed by expanding these two expressions, lower-bounding the left and upper-bounding the right, this way deriving a condition for this inequality to be satisfied:

    $|b^T a_1| \ge \text{(lower bound for the LHS)} > \text{(upper bound for the RHS)} \ge \max_{j>s} |b^T a_j|$


Upper-Bounding the RHS

Using $b = \sum_{i=1}^s x_i a_i$ and the mutual coherence $\max_{i \ne j} |a_i^T a_j| \le \mu(A)$:

    $\text{RHS} = \max_{j>s} |b^T a_j| = \max_{j>s} \left|\sum_{i=1}^s x_i a_i^T a_j\right| \le \max_{j>s} \sum_{i=1}^s |x_i|\,|a_i^T a_j| \le |x_1| \cdot s \cdot \mu(A)$


Lower-Bounding the LHS

Using $b = \sum_{i=1}^s x_i a_i$, the inequality $|a+b| \ge |a| - |b|$, the coherence bound $\max_{i \ne j}|a_i^T a_j| \le \mu(A)$, and $|x_1| \ge |x_i|$ for $i \ge 2$:

    $\text{LHS} = |b^T a_1| = \left|x_1 + \sum_{i=2}^s x_i a_i^T a_1\right| \ge |x_1| - \sum_{i=2}^s |x_i|\,|a_i^T a_1| \ge |x_1|\left(1 - (s-1)\,\mu(A)\right)$


Gathering the Bounds ...

    $|b^T a_1| \ge |x_1|(1 - (s-1)\mu(A)) > |x_1| \cdot s \cdot \mu(A) \ge \max_{j>s} |b^T a_j|$

The middle inequality holds when

    $1 - (s-1)\mu(A) > s\,\mu(A) \iff 1 + \mu(A) > 2s\,\mu(A) \iff s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$


Moving to the Next Step

Conclusion so far: if $x_0$ is sparse enough,

    $\|x\|_0 = s < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$,

then the first step of the OMP is successful – it finds an atom $i_0$ from within the support.

The next OMP steps:
o Update the solution by $x_1(i_0) = b^T a_{i_0} \equiv c_1$.
o Update the residual by $r_1 = b - Ax_1 = b - c_1 a_{i_0}$.


Moving to the Next Step

Observe that the residual is a linear combination of the s original atoms (as in b):

    $r_1 = b - c_1 a_{i_0} = \sum_{i=1}^s \tilde{x}_i a_i$  (with modified coefficients).

The same condition applies for the second OMP step ... i.e., if $\|x\|_0 = s < \frac{1}{2}(1 + \frac{1}{\mu(A)})$, then the second atom chosen, $i_1$, is within the true support of x.
Moving to the Next Step

- As the algorithm proceeds, the solution $x_k$ is restricted to be a linear combination of atoms from the support S, and thus the residual always has the form $r_k = \sum_{i=1}^s \tilde{x}_i a_i$.
- Therefore, the obtained condition $\|x\|_0 = s < \frac{1}{2}(1 + \frac{1}{\mu(A)})$ guarantees the success of each step of the OMP.
- Since $r_k$ is orthogonal to all chosen atoms, OMP always selects a new one.
- After s steps, OMP finds the exact solution.
OMP (& MP) Equivalence

We are given A and b defining the problem (P0):

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

and we deploy OMP or MP for its solution.

Theorem: Given the above (P0), if the unknown to be found is sparse enough,

    $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$,

then OMP and MP are guaranteed to find it. Furthermore, OMP finds the solution in exactly s = |S| steps.
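The bound is easy to evaluate numerically; here is a small sketch computing $\mu(A)$ and the implied guarantee (function name and the example matrix are illustrative):

    import numpy as np

    def mutual_coherence(A):
        """mu(A) = max_{i != j} |a_i^T a_j|, for L2-normalized columns."""
        A = A / np.linalg.norm(A, axis=0)
        G = np.abs(A.T @ A)
        np.fill_diagonal(G, 0)
        return G.max()

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 100))
    mu = mutual_coherence(A)
    print(mu, 0.5 * (1 + 1 / mu))   # e.g. mu near 0.5 guarantees only s = 1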


Equivalence: Analyzing the THR Algorithm
THR: Terms for Success

- We shall use the same assumptions as before: the support S is the set $1 \le i \le s$, and the coefficients are given in descending order of absolute value.
- The THR algorithm succeeds if the inner products with the true atoms are dominant:

    $\min_{1 \le i \le s} |b^T a_i| > \max_{j>s} |b^T a_j|$

- Just as before, we will use the bounding idea:

    $\min_{1 \le i \le s} |b^T a_i| \ge \text{(lower bound for the LHS)} > \text{(upper bound for the RHS)} \ge \max_{j>s} |b^T a_j|$


Upper-Bounding the RHS

As before, using $b = \sum_{i=1}^s x_i a_i$ and $\max_{i \ne j}|a_i^T a_j| \le \mu(A)$:

    $\text{RHS} = \max_{j>s} |b^T a_j| \le \max_{j>s} \sum_{i=1}^s |x_i|\,|a_i^T a_j| \le |x_{max}| \cdot s \cdot \mu(A)$

where $|x_{max}| = |x_1|$ is the largest magnitude among the non-zeros.


Lower-Bounding the LHS

With $b = \sum_{t=1}^s x_t a_t$ and $|a+b| \ge |a| - |b|$:

    $\text{LHS} = \min_{1 \le i \le s} |b^T a_i| = \min_{1 \le i \le s} \left|\sum_{t=1}^s x_t a_t^T a_i\right| \ge \min_{1 \le i \le s} \left(|x_i| - \sum_{t=1, t \ne i}^s |x_t|\,|a_t^T a_i|\right) \ge \min_{1 \le i \le s} |x_i| - \max_{1 \le i \le s} \sum_{t=1, t \ne i}^s |x_t|\,|a_t^T a_i|$


Lower-Bounding the LHS (cont.)

Continuing, with $\max_{i \ne j}|a_i^T a_j| \le \mu(A)$ and $|x_1| \ge \ldots \ge |x_s|$:

    $\text{LHS} \ge |x_{min}| - |x_{max}|\,(s-1)\,\mu(A)$,

where $|x_{min}| = |x_s|$ and $|x_{max}| = |x_1|$.


Gathering the Bounds ...

    $\min_{1 \le i \le s} |b^T a_i| \ge |x_{min}| - |x_{max}|(s-1)\mu(A) > |x_{max}|\,s\,\mu(A) \ge \max_{j>s} |b^T a_j|$

The middle inequality is equivalent to

    $\frac{|x_{min}|}{|x_{max}|} > (s-1)\mu(A) + s\,\mu(A) \iff s < \frac{1}{2}\left(\frac{|x_{min}|}{|x_{max}|} \cdot \frac{1}{\mu(A)} + 1\right)$
THR Equivalence

We are given A and b defining the problem (P0):

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

and we deploy the THR algorithm for its solution.

Theorem: Given the above (P0), if the unknown to be found satisfies

    $\|x\|_0 = s < \frac{1}{2}\left(\frac{|x_{min}|}{|x_{max}|} \cdot \frac{1}{\mu(A)} + 1\right)$,

then THR is guaranteed to find it.
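The two cardinality bounds are cheap to compare numerically; a small sketch (the numbers are illustrative):

    def omp_bound(mu):
        return 0.5 * (1 + 1 / mu)

    def thr_bound(mu, x_min, x_max):
        # THR pays for the contrast x_min/x_max <= 1 of the non-zeros
        return 0.5 * ((x_min / x_max) / mu + 1)

    print(omp_bound(0.2), thr_bound(0.2, x_min=1.0, x_max=2.0))  # 3.0  1.75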


THR vs. OMP: Bounds

OMP condition: $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$
THR condition: $\|x\|_0 < \frac{1}{2}\left(\frac{|x_{min}|}{|x_{max}|} \cdot \frac{1}{\mu(A)} + 1\right)$

- The THR condition is weaker than the OMP one, showing a dependency on the unknown's "contrast" $|x_{min}|/|x_{max}|$.
- Is this sensitivity to the contrast true, or just an artifact of the proof?
- Answer: we present an experiment verifying that THR does not handle highly contrasted non-zeros well.
THR vs. OMP: Experiment

We experiment with the OMP and the THR algorithms following the very same test as before:
- A is a matrix of size 50×100.
- $x_0$ is a sparse vector of cardinality s = 5 and s = 10.
- The non-zeros have a varying contrast c in the range [1, 20].
- The experiment is averaged over 1000 trials.

[Figure: the non-zero magnitudes, with the levels 1 and 0.2c marked]


THR vs. OMP: Experiment (s=5)

[Figure: the average relative L2 error, $\|\hat{x}-x_0\|_2^2 / \|x_0\|_2^2$, versus the contrast c, for OMP and THR with s=5]

THR vs. OMP: Experiment (s=10)

[Figure: the same error measure versus the contrast c, for s=10]


Equivalence: Analyzing the Basis-Pursuit Algorithm – Part 1
Basis Pursuit (BP) Rationale

We approximate the solution of the problem

    $(P_0):\ \hat{x} = \arg\min_x \|x\|_0 \ \text{s.t.}\ b = Ax$

by solving instead

    $(P_1):\ \hat{x} = \arg\min_x \|x\|_1 \ \text{s.t.}\ b = Ax$

We aim to show that a sparse vector x is also the shortest w.r.t. the L1 measure.


BP: Analysis Approach

- We define a set C of all the possible solutions to b = Ax whose L1 length is shorter than (or even equal to) that of the sparse vector x we started with:

    $C = \{z : b = Ax = Az,\ z \ne x,\ \text{and}\ \|z\|_1 \le \|x\|_1\}$

- This set represents the possible solutions of BP that would be considered erroneous.
- Our strategy will be to inflate this set (and simplify it as a consequence), while showing that it is actually empty under sparsity conditions on x.
An Error-Driven Set C

    $C = \{z : b = Ax = Az,\ z \ne x,\ \text{and}\ \|z\|_1 \le \|x\|_1\}$

Rather than defining the set C w.r.t. candidate alternative solutions, let's redefine it w.r.t. the solution's error e, where z = x + e. Since b = Ax = Az implies Ae = 0:

    $C_e = \{e : 0 = Ae,\ e \ne 0,\ \text{and}\ \|x+e\|_1 \le \|x\|_1\}$
Simplifying Ce – Part 1 (1)

Let us focus on Ae = 0 and simplify it:

    $0 = Ae$
    $0 = A^T A e$    (multiply by $A^T$)
    $-e = (A^T A - I)e$    (subtract e from both sides)
    $|e| = |(A^T A - I)e| \le |A^T A - I| \cdot |e|$    (apply abs on both sides, using $|ax + by| \le |a||x| + |b||y|$)


Simplifying Ce – Part 1 (2)

    $|e| \le |A^T A - I| \cdot |e| \ \Rightarrow\ |e| \le \mu\,(\mathbf{1}\mathbf{1}^T - I)\,|e|$

The matrix $|A^T A - I|$ has several properties:
o It is non-negative, due to the absolute value.
o Its main diagonal contains zeros.
o All its off-diagonal entries are in the range [0, μ].

Thus, $|A^T A - I| \le \mu(\mathbf{1}\mathbf{1}^T - I)$, where $\mathbf{1}\mathbf{1}^T$ is a square matrix filled with ones.
Simplifying Ce – Part 1 (3)

    $|e| \le \mu(\mathbf{1}\mathbf{1}^T - I)|e| = \mu(\|e\|_1 \mathbf{1} - |e|) \ \Rightarrow\ (1+\mu)|e| \le \mu\|e\|_1 \mathbf{1} \ \Rightarrow\ |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1}$

- Observe that any e satisfying Ae = 0 satisfies the last inequality, but not vice versa. This means that

    $\{e : 0 = Ae\} \subseteq \left\{e : |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1}\right\}$

- We should be pleased with this replacement because:
  o A is replaced by its property μ.
  o The condition is posed w.r.t. |e|.


Simplifying Ce – Part 1 (4)

We started with this set of problematic vectors:

    $C_e = \{e : 0 = Ae,\ e \ne 0,\ \text{and}\ \|x+e\|_1 \le \|x\|_1\}$

and inflated it to

    $C_e^1 = \left\{e : |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1},\ e \ne 0,\ \text{and}\ \|x+e\|_1 \le \|x\|_1\right\}$

We now target the term $\|x+e\|_1 \le \|x\|_1$ and aim to simplify it to be stated in terms of |e| as well.
Simplifying Ce – Part 2 (1)

- We focus on the condition $\|x+e\|_1 - \|x\|_1 \le 0$.
- Writing these norms explicitly, we get

    $\sum_{i=1}^m \left(|x_i + e_i| - |x_i|\right) \le 0$.

- This sum can be divided into its on-support and off-support parts:

    $\sum_{i=1}^s \left(|x_i + e_i| - |x_i|\right) + \sum_{i>s} |e_i| \le 0$.


Simplifying Ce – Part 2 (2)

- We exploit the inequality $|x+y| - |x| \ge -|y|$ and apply it to the first part, resulting in

    $-\sum_{i=1}^s |e_i| + \sum_{i>s} |e_i| \ \le\ \sum_{i=1}^s \left(|x_i + e_i| - |x_i|\right) + \sum_{i>s} |e_i| \ \le\ 0$.

- Adding and subtracting $\sum_{i=1}^s |e_i|$ leads to

    $\|e\|_1 - 2 \cdot \mathbf{1}_S^T |e| \le 0$,

  where $\mathbf{1}_S^T = (1, 1, \ldots, 1, 0, \ldots, 0)$ is the m-element indicator of the first s entries.


Simplifying Ce – Part 2 (3)

    $\|x+e\|_1 - \|x\|_1 \le 0 \ \Rightarrow\ \|e\|_1 - 2 \cdot \mathbf{1}_S^T |e| \le 0$

- Here as well we get an inclusion of the form

    $\{e : \|x+e\|_1 - \|x\|_1 \le 0\} \subseteq \{e : \|e\|_1 - 2 \cdot \mathbf{1}_S^T |e| \le 0\}$,

  i.e., every e satisfying the original condition satisfies the new one, but not vice versa.
- We are pleased with this replacement because:
  o The condition is posed w.r.t. |e|.
  o The dependency on x is replaced by a simpler dependency on the support S.
Equivalence: Analyzing the Basis-Pursuit Algorithm – Part 2
Simplifying Ce – Summary

We started with this set of problematic vectors:

    $C_e = \{e : 0 = Ae,\ e \ne 0,\ \text{and}\ \|x+e\|_1 \le \|x\|_1\}$

and inflated it, in two stages, to

    $C_e^1 = \left\{e : |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1},\ e \ne 0,\ \text{and}\ \|x+e\|_1 \le \|x\|_1\right\}$
    $\subseteq\ C_S = \left\{e : |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1},\ e \ne 0,\ \text{and}\ \|e\|_1 - 2 \cdot \mathbf{1}_S^T |e| \le 0\right\}$
Scale-Invariance

    $C_S = \left\{e : |e| \le \frac{\mu}{1+\mu}\|e\|_1 \mathbf{1},\ e \ne 0,\ \text{and}\ \|e\|_1 - 2 \cdot \mathbf{1}_S^T |e| \le 0\right\}$

- Observe that if $e \in C_S$ then $\alpha e \in C_S$ for all $\alpha \ne 0$.
- Since our aim is to investigate whether $C_S$ is empty or not, it is sufficient to consider the intersection of this set with the unit L1 sphere.
- Thus, we impose $\|e\|_1 = 1$, getting

    $C_N = \left\{e : \|e\|_1 = 1,\ |e| \le \frac{\mu}{1+\mu}\mathbf{1},\ 1 - 2 \cdot \mathbf{1}_S^T |e| \le 0\right\}$


Last Step

    $C_N = \left\{e : \underbrace{\|e\|_1 = 1}_{\text{Condition 1}},\ \underbrace{|e| \le \frac{\mu}{1+\mu}\mathbf{1}}_{\text{Condition 2}},\ \underbrace{1 - 2 \cdot \mathbf{1}_S^T |e| \le 0}_{\text{Condition 3}}\right\}$

- Condition 3 requires concentrating as much energy as possible in the support elements of e.
- Condition 2 gives an upper bound on these entries.
- Thus, choosing each of the s elements in the support as $\frac{\mu}{1+\mu}$, Condition 3 is violated (and $C_N$ is empty) if

    $1 - 2s\frac{\mu}{1+\mu} > 0 \iff s < \frac{1}{2}\left(1 + \frac{1}{\mu}\right)$


BP Equivalence

We are given A and b defining the problem (P0):

    $(P_0):\ \min_x \|x\|_0 \ \text{s.t.}\ Ax = b$

and we deploy the Basis Pursuit for its solution.

Theorem: Given the above (P0), if the unknown to be found satisfies

    $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$,

then Basis-Pursuit is guaranteed to find it.


BP vs. OMP: Bounds

OMP condition: $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$
BP condition: $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$

- The two bounds are the same!!
- Does this mean that the two algorithms are equivalent? Definitely not.
- The above implies that up to the specified bound, the two algorithms provide perfect recovery, and beyond it, each has its own pattern of mistakes.


Theory vs. Reality?

Remember this graph? [Figure: the relative L2 error $\|\hat{x}-x_0\|_2^2/\|x_0\|_2^2$ versus s, showing perfect recovery for s < 8.] How are these results related to the obtained bound

    $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$ ?

Answer: In this experiment, $\mu \approx 0.48$, implying that this bound predicts success only for s < 1.5.

Too Pessimistic!
