Dual-Tree Wavelets and Their DTCWT&Vision - UDRC

Dual-Tree Wavelets and their application to Vision Systems: key-point detectors and descriptors with polar matching.
Nick Kingsbury
Signal Processing and Communications Laboratory Department of Engineering, University of Cambridge, UK. email: ngk@eng.cam.ac.uk web: www.eng.cam.ac.uk/~ngk
Part 2, UDRC Short Course, 21 March 2013
Nick Kingsbury (Univ. of Cambridge)
Dual-Tree Wavelets & Vision
UDRC 2013 (2)
1 / 30
Introduction
Dual-Tree Wavelets and their application to Vision Systems

Properties of Dual-Tree Complex Wavelets (DT CWT) and why we use them Multiscale Keypoint Detection using Complex Wavelets (Julien Fauqueur and Pashmina Bendale) Rotation-invariant Local Feature Matching (an alternative to Lowes popular SIFT system): Modications to DT CWTto improve rotational symmetry Resampling using bandpass interpolation The Polar Matching Matrix descriptor Efcient Fourier-based matching Similar to log-polar mapping of Fourier domain, but much more localised. Enhancements for resilience to keypoint location errors and scale estimates.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 2 / 30
Dual-Tree CWT
Features of the Dual Tree Complex Wavelet Transform (DT CWT) Good shift invariance = negligible aliasing. Hence transfer function through each subband is independent of shift and wavelet coefs can be interpolated within each subband, independent of all other subbands. Good directional selectivity in 2-D, 3-D etc. derives from analyticity in 1-D (ability to separate positive from negative frequencies). Similarity to Primary (V1) Cortex lters of the human visual system. Perfect reconstruction with short support lters. Limited redundancy 2:1 in 1-D, 4:1 in 2-D etc. Low computation much less than the undecimated ( trous) DWT.
UDRC 2013 (2)
3 / 30
Dual-Tree CWT
Q-shift Dual Tree Complex Wavelet Transform in 1-D

Level 4 x0000a x000a H00a Level 3 - 2 Tree a (3q ) x00a H00a Level 2 - 2 (3q ) x0001a x0a H00a Level 1 - 2 - H01a - 2 (3q ) (q ) - H0a - 2 - H01a - 2 - x 001 a (1) (q ) H01a - 2 - x x0000b 01a (q ) x000b H00b - 2 - H1a - 2 - x (q ) 1a x00b H00b (0) - 2 ( q ) x0001b x0b H00b H01b - 2 - 2 (q ) H01b (3q ) - H0b - 2 - 2 - x 001b (0) H01b (3q ) - 2 - x 01b (3q ) Tree b - H1b - 2 - x 1b (1)
1-D Input Signal x

Figure: Dual tree of real lters for the Q-shift CWT, giving real and imaginary parts of complex coefcients from tree a and tree b respectively.
Dual-Tree CWT
Q-shift DT CWT Basis Functions Levels 1 to 3

1D complex wavelets & scaling functions at levels 1 to 3.
1
Level 1: scaling fn.

0
Tree a Tree b
wavelet
1
|a + jb|
Basis functions for adjacent sampling points are shown dotted.

2
wavelet
3

4
wavelet
5
20
15
10
0 5 sample number
10
15
20
UDRC 2013 (2)
5 / 30
Dual-Tree CWT
Frequency Responses of 18-tap Q-shift lters

Frequency responses of 18tap Qshift wavelets (n = 9) 1.6 1.4 1.2 Scaling fn. at level 4 1 0.8 0.6 0.4 0.2 0 8 Wavelets at level: 4 3 2 1
2 0 2 frequency (sample freq / 16)

8
UDRC 2013 (2) 6 / 30
DT CWT in 2-D
The DT CWT in 2-D

When the DT CWT is applied to 2-D signals (images), it has the following features: It is performed separably, using 2 trees for the rows of the image and 2 trees for the columns yielding a Quad-Tree structure (4:1 redundancy). The 4 quad-tree components of each coefcient are combined by simple sum and difference operations to yield a pair of complex coefcients. These are part of two separate subbands in adjacent quadrants of the 2-D spectrum. This produces 6 directionally selective subbands at each level of the 2-D DT CWT. Fig 2 shows the basis functions of these subbands at level 4, and compares them with the 3 subbands of a 2-D DWT. The DT CWT is directionally selective (see g ??) because the complex lters can separate positive and negative frequency components in 1-D, and hence separate adjacent quadrants of the 2-D spectrum. Real separable lters cannot do this!
DT CWT in 2-D
0.6 0.4
10
2-D Basis Functions at level 4

DT CWT real part
0.2 0 0.2 15 0.1 0.05 0
HL2 (z) H0 (z2 ) G0 (z2 )
10
20 10 5 0 5 10 15
30
Level 4 Sc. funcs.
dB
40
50
60 40 20 0 20 40 60
15
45
75
75
45
15(deg)
60 0.1 0.05 0 0.05
70
Level 4 Wavelets
80
Magnitude
0.4
0.3
0.2
DT CWT imaginary part

60 40
Real
20 0
Imaginary
20 40 60
0.1
Real DWT
Time (samples)
e j 1 x . e j 2 y = e j (1 x +2 y )
90 45(?) 0
Figure: Basis functions of 2-D Q-shift complex wavelets (top), and of 2-D real wavelet lters (bottom), all illustrated at level 4 of the transforms. The complex wavelets provide 6 directionally selective lters, while real wavelets provide 3 lters, only two of which have a dominant direction. The 1-D bases, from which the 2-D complex bases are derived, are shown to the right.
DT CWT in 2-D
Test Image and Colour Palette for Complex Coefcients
Colour palette for complex coefs.

1 0.8 0.6 0.4
Imaginary part
0.2 0 0.2 0.4 0.6 0.8 1 1 0.5 0 0.5 1
Real part
UDRC 2013 (2)
9 / 30
DT CWT in 2-D
2-D DT CWT Decomposition into Subbands
Figure: Four-level DT CWT decomposition of Lenna into 6 subbands per level (only
the central 128 128 portion of the image is shown for clarity). A colour-wheel palette is used to display the complex wavelet coefcients.
DT CWT in 2-D
2-D Shift Invariance of DT CWT vs DWT

Components of reconstructed disc images
Input (256 x 256)
DT CWT
DWT
wavelets: level 1
level 2
level 3
level 4
level 4 scaling fn.
Figure: Wavelet and scaling function components at levels 1 to 4 of an image of a light circular disc on a dark background, using the 2-D DT CWT (upper row) and 2-D DWT (lower row). Only half of each wavelet image is shown in order to save space.
Keypoint detection
Multi-scale Keypoint Detection using Accumulated Maps Basic Method:

Detect keypoint energy as locations x at scale k where complex wavelet energy exists in multiple directions d using: Ekp (k , x) or Ekp (k , x) Ekp (k , x) =
d =1...6 6
min |wk ,d (x)|

1 6
|wk ,d (x)|
d =1 3 1 d =1 3 1 6
or
|wk ,d (x)|.|wk ,d +3 (x)|

6 d =1
|wk ,d (x)|2
Interpolate and accumulate Ekp (k , x) across groups of relevant scales k to create the Accumulated Map. Pick keypoints as locations of maxima in the Accumulated Map.
Keypoint detection
Multi-scale Keypoint Detection using Accumulated Maps
UDRC 2013 (2)
13 / 30
Keypoint detection
Comparison of keypoint detectors
UDRC 2013 (2)
14 / 30
Rotation-Invariant Feature Matching
Rotation-Invariant Local Feature Matching Aims:

To derive a local feature descriptor for the region around a detected keypoint, so that keypoints from similar objects may be matched reliably. Matching must be performed in a rotationally invariant way if all rotations of an object are to be matched correctly. The feature descriptor must have sufcient complexity to give good detection reliability and low false-alarm rates. The feature descriptor must be simple enough to allow rapid pairwise comparisons of keypoints. Raw DT CWT coefcients provide multi-resolution local feature descriptors, but they are tied closely to a rectangular sampling system (as are most other multi-resolution decompositions). Hence we rst need better rotational symmetry for the DT CWT.
Frequency Responses of 2-D Q-shift lters at levels 3 and 4

Contours of 2D DT CWT freq responses at levels 3 & 4
4
Contours shown at 1 dB and 3 dB.
4 4
1 0 1 frequency (sample freq / 32)
UDRC 2013 (2)
16 / 30
Modication of 45 and 135 subband responses for improved rotational symmetry (shown at level 4).
(a) DualTree Complex Wavelets: Real Part
(c) Frequency responses of original and modified 1D filters 3
2
15 45 75 105 135 165
modified
original
Imaginary Part (b) Modified Complex Wavelets: Real Part
0 2
1.5
0.5 0 0.5 1 frequency / output sample rate
1.5
(a) Original 2-D impulse responses; (b) 2-D responses, modied to have lower centre frequencies (reduced by 1/ 1.8) in the 45 and 135 subbands, and even / odd symmetric real / imaginary parts; (c) Original and modied 1-D lters.
Imaginary Part
15
45
75
105
135
165
Better rotational symmetry is achieved, but we have lost Perfect Reconstruction.

13-point circular pattern for sampling DT CWT coefs at each keypoint location
M is a precise keypoint location, obtained from the keypoint detector.

D 1 B M C E F
0.5
0.5
L K J 1 0.5 0 0.5 I
Bandpass interpolation calculates the required samples and can be performed on each subband independently because of the shift-invariance of the transform: 1. Shift by {1 , 2 } down to zero frequency (i.e. multiply by ej (1 x1 +2 x2 ) at each point {x1 , x2 }); 2. Lowpass interpolate to each new point (spline / bi-cubic / bi-linear); 3. Shift up by {1 , 2 } (multiply by e j (1 y1 +2 y2 ) at each new point {y1 , y2 }).
UDRC 2013 (2)
18 / 30
Form the Polar Matching Matrix P

5 4 3 2 6 7 8 1 12 11 9 10 8 9
m1 m2 m3 m4 m5 m6 P= m1 m2 m 3 m 4 m5 m6
j1 i2 h3 g4 f5 e6 d1 c2 b3 a4 l5 k6
k1 j2 i3 h4 g5 f6 e1 d2 c3 b4 a5 l6
l1 k2 j3 i4 h5 g6 f1 e2 d3 c4 b5 a6
a1 l2 k3 j4 i5 h6 g1 f2 e3 d4 c5 b6
b1 a2 l3 k4 j5 i6 h1 g2 f3 e4 d5 c6
c1 b2 a3 l4 k5 j6 i1 h2 g3 f4 e5 d6
Column 1
8 9 10 11 12
6 5 4 3 2 10 11 12
7 6 5 4 11 12 1
10
8 7 6 5
Column 2
10
Column 3
11
Column 4
12
11 12 1 2 3
9 8 7 6 5 1 2 3
12
10 9 8 7 2 3 4
11 10 9 8
Column 5
Column 6
Column 7
Each column of P comprises a set of rotationally symmetric samples from the 6 subbands and their conjugates ( ), whose orientations are shown by the arrows and locations in the 13-point pattern by the letters. Numbers for each arrow give the row indices in P .
Efcient Fourier-based Matching

Columns of P shift cyclically with rotation of the object about keypoint M. Hence we perform correlation matching in the Fourier domain, as follows: First, take 12-point FFT of each column of Pk at every keypoint k to give P k and normalise each P k to unit total energy. Then, for each pair of keypoints (k , l ) to be matched: Multiply P k by P l element-by-element to give S k ,l . Accumulate the 12-point columns of S k ,l into a 48-element spectrum vector sk ,l (to give a 4-fold extended frequency range and hence ner correlation steps). Different columns of S k ,l are bandpass signals with differing centre frequencies, so optimum interpolation occurs if zero-padding is introduced over the part of the spectrum which is likely to contain least energy in each case. Take the real part of the inverse FFT of sk ,l to obtain the 48-point correlation result sk ,l . The peak in sk ,l gives the rotation and value of the best match. Extra columns can be added to P for multiple scales or colour components.
Correlation plots for two simple images

1
Test image (a)

0.5
0.5 0 90 180 270 360
Each set of curves shows the output of the normalised correlator for 48 angles in 7.5 increments, when the test image is rotated in 5 increments from 0 to 90 . Levels 4 and 5 of the DT CWT were used in an 8-column P matrix format. The diameter of the 13-point sampling pattern is half the width of the subimages shown.
Rotation (degrees)
1
Test image (b)

0.5
0.5 0 90 180 270 360
Rotation (degrees)
Correlation plots for more complicated images

1
Test image (c)

0.5
0.5 0 90 180 270 360
Rotation (degrees)
1
Test image (d)

0.5
0.5 0 90 180 270 360
Rotation (degrees)
Matching hand-picked keypoints across different 3D views
Matching hand-picked keypoints in heavy noise
UDRC 2013 (2)
23 / 30
Improving resilience to keypoint errors
Improving resilience to errors in keypoint location and scale

The basic P -matrix normalised correlation measure is highly resilient to changes in illumination, contrast and rotation. BUT it is still rather sensitive to discrepancies in keypoint location and estimated dominant scale. To correct for small errors (typically a few pixels) in keypoint location, we modify the algorithm as follows: Measure derivatives of P k with respect to shifts x in the sampling circle. Using the derivatives, calculate the shift vectors xi which maximise the normalised correlation measures sk ,l at each of the 48 rotations i (using LMS methods with approximate adjustments for normalised vectors). By regarding the 48-point IFFT as a sparse matrix multiplication, the computation load is only 3 times that of the basic algorithm. We do the same for small scale errors using a derivative of P k wrt scale dilation, , increasing computation to 4 times that of the basic algorithm.
A keypoint and its corresponding sampling circles over levels 5, 4 and 3

1 32 2 64 3 96 4 128 32 64 96 128 1 2 3 4
12
8 2 4 6 8
16 4 8 12 16
Background images show magnitudes of DT CWT coefs for subband 1, oriented for edges at 15 above the horizontal.
Mean correlation surfaces with no change of scale

Results averaged over 73 images from Caltech PP Toys 03 dataset, using the centre of the image as the keypoint. red no tolerance; blue shift tolerance; green shift + scale tolerance.
1 0.9 Correlation value 0.8 0.7 0.6 0.5 0.4 Correlation value 6 4 2 0 x 2 4 6 1 0.9 0.8 0.7 0.6 0.5 0.4
0 x
Left) No change of orientation; Right) Average over rotations from 0 to 90 in steps of 7.5 .
Mean correlation surfaces with change of scale (dilation)

Results averaged over 73 images from Caltech PP Toys 03 dataset, using the centre of the image as the keypoint. red no tolerance; green shift + scale tolerance.
1 0.9 Correlation value
1 0.9 Correlation value 0.8 0.7 0.6 0.5 0.4 1
0.8 0.7 0.6 0.5 0.4
0 x
1.1
1.2
1.3 Scale
1.4
1.5
Left) Average over dilations from 1.0 to 1.4, with respect to shift; Right) Mean correlation values with respect to scale change (dilation), no shift.
Histograms of non-target correlation scores

red no tolerance; blue shift tolerance; green shift + scale tolerance.
8000 original shift tol. scale tol.
Frequency
6000
4000
2000
0 0.2
0.2 0.4 0.6 Correlation value
0.8
The histograms demonstrate the increase in false positive detection rate when shift and scale tolerances are introduced.
Methods for handling colour and lighting changes
Methods for handling colour and lighting changes
For colour keypoint detection, we add the wavelet energies from the DT CWT of each R, G and B component, so that we get the total wavelet response to any changes in RGB-space; and use this to form a single Accumulated Map. Alternatively we could do this in a perceptually more uniform space such as Lab. We perform contrast equalisation over local image regions to handle variations between lighted / shaded parts of the image and between images. For colour feature matching, we form 3 polar-matching matrices into a single matrix [PR PG PB ] and perform matching with this. Prior to matching, we normalise each composite P-matrix with a single scaling factor for all 3 colours, so that relative amplitudes of colour variations are maintained, despite lighting changes.
UDRC 2013 (2)
29 / 30
Conclusions
Conclusions
New local Keypoint Detectors and Feature Descriptors have been proposed for use when comparing the similarity of detected objects in images. They have the following properties:
They are based on a modied form of the efcient Dual-Tree Complex Wavelet Transform. The modications improve the rotational symmetry of the 6 directionally selective subbands at each scale. The 12-point circular sampling patterns and the Polar Matching Matrix P are dened such that arbitrary rotations of an object can be accomodated efciently in the matching process. Shift-tolerant and scale-tolerant extensions of the basic P -matrix method give greater robustness at the expense of some increase in false-positive rate. There is scope for applying non-linear pre-processing (e.g. magnitude compression and phase adjustment) to the complex wavelet coefcients prior to matching, to improve resilience to illumination and viewpoint changes.
Papers on complex wavelets and applications are available at: www.eng.cam.ac.uk/~ngk/


Dual-Tree Wavelets and Their DTCWT&Vision - UDRC

Încărcat de

Informații document

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Dual-Tree Wavelets and Their DTCWT&Vision - UDRC

Încărcat de

Drepturi de autor:

Formate disponibile

Dual-Tree Wavelets and their application to Vision Systems: key-point detectors and descriptors with polar matching.

Part 2, UDRC Short Course, 21 March 2013

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Dual-Tree Wavelets and their application to Vision Systems

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Q-shift Dual Tree Complex Wavelet Transform in 1-D

1-D Input Signal x

Q-shift DT CWT Basis Functions Levels 1 to 3

Level 1: scaling fn.

Basis functions for adjacent sampling points are shown dotted.

Level 2: scaling fn.

Level 3: scaling fn.

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Frequency Responses of 18-tap Q-shift lters

2 0 2 frequency (sample freq / 16)

Nick Kingsbury (Univ. of Cambridge)

The DT CWT in 2-D

2-D Basis Functions at level 4

0.2 0 0.2 15 0.1 0.05 0

HL2 (z) H0 (z2 ) G0 (z2 )

Level 4 Sc. funcs.

60 0.1 0.05 0 0.05

DT CWT imaginary part

Test Image and Colour Palette for Complex Coefcients

Colour palette for complex coefs.

0.2 0 0.2 0.4 0.6 0.8 1 1 0.5 0 0.5 1

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

2-D DT CWT Decomposition into Subbands

2-D Shift Invariance of DT CWT vs DWT

Input (256 x 256)

level 4 scaling fn.

Multi-scale Keypoint Detection using Accumulated Maps Basic Method:

min |wk ,d (x)|

|wk ,d (x)|.|wk ,d +3 (x)|

Multi-scale Keypoint Detection using Accumulated Maps

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Comparison of keypoint detectors

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Rotation-Invariant Feature Matching

Rotation-Invariant Local Feature Matching Aims:

Rotation-Invariant Feature Matching

Frequency Responses of 2-D Q-shift lters at levels 3 and 4

Contours shown at 1 dB and 3 dB.

1 0 1 frequency (sample freq / 32)

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

Rotation-Invariant Feature Matching

(c) Frequency responses of original and modified 1D filters 3

Imaginary Part (b) Modified Complex Wavelets: Real Part

0.5 0 0.5 1 frequency / output sample rate