Sunteți pe pagina 1din 30

Dual-Tree Wavelets and their application to Vision Systems: key-point detectors and descriptors with polar matching.

Nick Kingsbury
Signal Processing and Communications Laboratory Department of Engineering, University of Cambridge, UK. email: ngk@eng.cam.ac.uk web: www.eng.cam.ac.uk/~ngk

Part 2, UDRC Short Course, 21 March 2013

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

1 / 30

Introduction

Dual-Tree Wavelets and their application to Vision Systems


Properties of Dual-Tree Complex Wavelets (DT CWT) and why we use them Multiscale Keypoint Detection using Complex Wavelets (Julien Fauqueur and Pashmina Bendale) Rotation-invariant Local Feature Matching (an alternative to Lowes popular SIFT system): Modications to DT CWTto improve rotational symmetry Resampling using bandpass interpolation The Polar Matching Matrix descriptor Efcient Fourier-based matching Similar to log-polar mapping of Fourier domain, but much more localised. Enhancements for resilience to keypoint location errors and scale estimates.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 2 / 30

Dual-Tree CWT

Features of the Dual Tree Complex Wavelet Transform (DT CWT) Good shift invariance = negligible aliasing. Hence transfer function through each subband is independent of shift and wavelet coefs can be interpolated within each subband, independent of all other subbands. Good directional selectivity in 2-D, 3-D etc. derives from analyticity in 1-D (ability to separate positive from negative frequencies). Similarity to Primary (V1) Cortex lters of the human visual system. Perfect reconstruction with short support lters. Limited redundancy 2:1 in 1-D, 4:1 in 2-D etc. Low computation much less than the undecimated ( trous) DWT.

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

3 / 30

Dual-Tree CWT

Q-shift Dual Tree Complex Wavelet Transform in 1-D


Level 4  x0000a x000a H00a Level 3 - 2 Tree a  (3q )  x00a H00a Level 2 - 2  (3q )   x0001a x0a H00a Level 1 - 2 - H01a - 2  (3q )   (q )  - H0a - 2 - H01a - 2 - x 001 a  (1) (q )  H01a  - 2 - x  x0000b 01a  (q ) x000b H00b  - 2 - H1a - 2 - x  (q ) 1a x00b H00b  (0)  - 2  ( q ) x0001b x0b H00b  H01b  - 2 - 2  (q )  H01b  (3q )  - H0b - 2 - 2 - x 001b (0)  H01b  (3q )  - 2 - x 01b  (3q )  Tree b - H1b - 2 - x 1b (1)

1-D Input Signal x



Figure: Dual tree of real lters for the Q-shift CWT, giving real and imaginary parts of complex coefcients from tree a and tree b respectively.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 4 / 30

Dual-Tree CWT

Q-shift DT CWT Basis Functions Levels 1 to 3


1D complex wavelets & scaling functions at levels 1 to 3.
1

Level 1: scaling fn.


0

Tree a Tree b

wavelet
1

|a + jb|

Basis functions for adjacent sampling points are shown dotted.

Level 2: scaling fn.


2

wavelet
3

Level 3: scaling fn.


4

wavelet
5

20

15

10

0 5 sample number

10

15

20

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

5 / 30

Dual-Tree CWT

Frequency Responses of 18-tap Q-shift lters


Frequency responses of 18tap Qshift wavelets (n = 9) 1.6 1.4 1.2 Scaling fn. at level 4 1 0.8 0.6 0.4 0.2 0 8 Wavelets at level: 4 3 2 1

2 0 2 frequency (sample freq / 16)


Dual-Tree Wavelets & Vision

8
UDRC 2013 (2) 6 / 30

Nick Kingsbury (Univ. of Cambridge)

DT CWT in 2-D

The DT CWT in 2-D


When the DT CWT is applied to 2-D signals (images), it has the following features: It is performed separably, using 2 trees for the rows of the image and 2 trees for the columns yielding a Quad-Tree structure (4:1 redundancy). The 4 quad-tree components of each coefcient are combined by simple sum and difference operations to yield a pair of complex coefcients. These are part of two separate subbands in adjacent quadrants of the 2-D spectrum. This produces 6 directionally selective subbands at each level of the 2-D DT CWT. Fig 2 shows the basis functions of these subbands at level 4, and compares them with the 3 subbands of a 2-D DWT. The DT CWT is directionally selective (see g ??) because the complex lters can separate positive and negative frequency components in 1-D, and hence separate adjacent quadrants of the 2-D spectrum. Real separable lters cannot do this!
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 7 / 30

DT CWT in 2-D

0.6 0.4

10

2-D Basis Functions at level 4


DT CWT real part

0.2 0 0.2 15 0.1 0.05 0

HL2 (z) H0 (z2 ) G0 (z2 )

10

20 10 5 0 5 10 15

30

Level 4 Sc. funcs.

dB

40

50

60 40 20 0 20 40 60

15

45

75

75

45

15(deg)

60 0.1 0.05 0 0.05

70

Level 4 Wavelets

80

Magnitude

0.4

0.3

0.2

DT CWT imaginary part


60 40

Real
20 0

Imaginary
20 40 60

0.1

Real DWT

Time (samples)

e j 1 x . e j 2 y = e j (1 x +2 y )
90 45(?) 0

Figure: Basis functions of 2-D Q-shift complex wavelets (top), and of 2-D real wavelet lters (bottom), all illustrated at level 4 of the transforms. The complex wavelets provide 6 directionally selective lters, while real wavelets provide 3 lters, only two of which have a dominant direction. The 1-D bases, from which the 2-D complex bases are derived, are shown to the right.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 8 / 30

DT CWT in 2-D

Test Image and Colour Palette for Complex Coefcients

Colour palette for complex coefs.


1 0.8 0.6 0.4

Imaginary part

0.2 0 0.2 0.4 0.6 0.8 1 1 0.5 0 0.5 1

Real part

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

9 / 30

DT CWT in 2-D

2-D DT CWT Decomposition into Subbands

Figure: Four-level DT CWT decomposition of Lenna into 6 subbands per level (only
the central 128 128 portion of the image is shown for clarity). A colour-wheel palette is used to display the complex wavelet coefcients.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 10 / 30

DT CWT in 2-D

2-D Shift Invariance of DT CWT vs DWT


Components of reconstructed disc images

Input (256 x 256)

DT CWT

DWT

wavelets: level 1

level 2

level 3

level 4

level 4 scaling fn.

Figure: Wavelet and scaling function components at levels 1 to 4 of an image of a light circular disc on a dark background, using the 2-D DT CWT (upper row) and 2-D DWT (lower row). Only half of each wavelet image is shown in order to save space.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 11 / 30

Keypoint detection

Multi-scale Keypoint Detection using Accumulated Maps Basic Method:


Detect keypoint energy as locations x at scale k where complex wavelet energy exists in multiple directions d using: Ekp (k , x) or Ekp (k , x) Ekp (k , x) =
d =1...6 6

min |wk ,d (x)|


1 6

|wk ,d (x)|
d =1 3 1 d =1 3 1 6

or

|wk ,d (x)|.|wk ,d +3 (x)|


6 d =1

|wk ,d (x)|2

Interpolate and accumulate Ekp (k , x) across groups of relevant scales k to create the Accumulated Map. Pick keypoints as locations of maxima in the Accumulated Map.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 12 / 30

Keypoint detection

Multi-scale Keypoint Detection using Accumulated Maps

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

13 / 30

Keypoint detection

Comparison of keypoint detectors

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

14 / 30

Rotation-Invariant Feature Matching

Rotation-Invariant Local Feature Matching Aims:


To derive a local feature descriptor for the region around a detected keypoint, so that keypoints from similar objects may be matched reliably. Matching must be performed in a rotationally invariant way if all rotations of an object are to be matched correctly. The feature descriptor must have sufcient complexity to give good detection reliability and low false-alarm rates. The feature descriptor must be simple enough to allow rapid pairwise comparisons of keypoints. Raw DT CWT coefcients provide multi-resolution local feature descriptors, but they are tied closely to a rectangular sampling system (as are most other multi-resolution decompositions). Hence we rst need better rotational symmetry for the DT CWT.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 15 / 30

Rotation-Invariant Feature Matching

Frequency Responses of 2-D Q-shift lters at levels 3 and 4


Contours of 2D DT CWT freq responses at levels 3 & 4
4

Contours shown at 1 dB and 3 dB.

4 4

1 0 1 frequency (sample freq / 32)

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

16 / 30

Rotation-Invariant Feature Matching

Modication of 45 and 135 subband responses for improved rotational symmetry (shown at level 4).
(a) DualTree Complex Wavelets: Real Part

(c) Frequency responses of original and modified 1D filters 3

2
15 45 75 105 135 165

modified

original

Imaginary Part (b) Modified Complex Wavelets: Real Part

0 2

1.5

0.5 0 0.5 1 frequency / output sample rate

1.5

(a) Original 2-D impulse responses; (b) 2-D responses, modied to have lower centre frequencies (reduced by 1/ 1.8) in the 45 and 135 subbands, and even / odd symmetric real / imaginary parts; (c) Original and modied 1-D lters.
Imaginary Part

15

45

75

105

135

165

Better rotational symmetry is achieved, but we have lost Perfect Reconstruction.


Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 17 / 30

Rotation-Invariant Feature Matching

13-point circular pattern for sampling DT CWT coefs at each keypoint location

M is a precise keypoint location, obtained from the keypoint detector.


D 1 B M C E F

0.5

0.5

L K J 1 0.5 0 0.5 I

Bandpass interpolation calculates the required samples and can be performed on each subband independently because of the shift-invariance of the transform: 1. Shift by {1 , 2 } down to zero frequency (i.e. multiply by ej (1 x1 +2 x2 ) at each point {x1 , x2 }); 2. Lowpass interpolate to each new point (spline / bi-cubic / bi-linear); 3. Shift up by {1 , 2 } (multiply by e j (1 y1 +2 y2 ) at each new point {y1 , y2 }).

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

18 / 30

Rotation-Invariant Feature Matching

Form the Polar Matching Matrix P


5 4 3 2 6 7 8 1 12 11 9 10 8 9

m1 m2 m3 m4 m5 m6 P= m1 m2 m 3 m 4 m5 m6

j1 i2 h3 g4 f5 e6 d1 c2 b3 a4 l5 k6

k1 j2 i3 h4 g5 f6 e1 d2 c3 b4 a5 l6

l1 k2 j3 i4 h5 g6 f1 e2 d3 c4 b5 a6

a1 l2 k3 j4 i5 h6 g1 f2 e3 d4 c5 b6

b1 a2 l3 k4 j5 i6 h1 g2 f3 e4 d5 c6

c1 b2 a3 l4 k5 j6 i1 h2 g3 f4 e5 d6

Column 1

8 9 10 11 12

6 5 4 3 2 10 11 12

7 6 5 4 11 12 1

10

8 7 6 5

Column 2
10

Column 3
11

Column 4
12

11 12 1 2 3

9 8 7 6 5 1 2 3

12

10 9 8 7 2 3 4

11 10 9 8

Column 5

Column 6

Column 7

Each column of P comprises a set of rotationally symmetric samples from the 6 subbands and their conjugates ( ), whose orientations are shown by the arrows and locations in the 13-point pattern by the letters. Numbers for each arrow give the row indices in P .
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 19 / 30

Rotation-Invariant Feature Matching

Efcient Fourier-based Matching


Columns of P shift cyclically with rotation of the object about keypoint M. Hence we perform correlation matching in the Fourier domain, as follows: First, take 12-point FFT of each column of Pk at every keypoint k to give P k and normalise each P k to unit total energy. Then, for each pair of keypoints (k , l ) to be matched: Multiply P k by P l element-by-element to give S k ,l . Accumulate the 12-point columns of S k ,l into a 48-element spectrum vector sk ,l (to give a 4-fold extended frequency range and hence ner correlation steps). Different columns of S k ,l are bandpass signals with differing centre frequencies, so optimum interpolation occurs if zero-padding is introduced over the part of the spectrum which is likely to contain least energy in each case. Take the real part of the inverse FFT of sk ,l to obtain the 48-point correlation result sk ,l . The peak in sk ,l gives the rotation and value of the best match. Extra columns can be added to P for multiple scales or colour components.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 20 / 30

Rotation-Invariant Feature Matching

Correlation plots for two simple images


1

Test image (a)


0.5

0.5 0 90 180 270 360

Each set of curves shows the output of the normalised correlator for 48 angles in 7.5 increments, when the test image is rotated in 5 increments from 0 to 90 . Levels 4 and 5 of the DT CWT were used in an 8-column P matrix format. The diameter of the 13-point sampling pattern is half the width of the subimages shown.

Rotation (degrees)
1

Test image (b)


0.5

0.5 0 90 180 270 360

Rotation (degrees)
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 21 / 30

Rotation-Invariant Feature Matching

Correlation plots for more complicated images


1

Test image (c)


0.5

0.5 0 90 180 270 360

Rotation (degrees)
1

Test image (d)


0.5

0.5 0 90 180 270 360

Rotation (degrees)
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 22 / 30

Rotation-Invariant Feature Matching

Matching hand-picked keypoints across different 3D views

Matching hand-picked keypoints in heavy noise

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

23 / 30

Improving resilience to keypoint errors

Improving resilience to errors in keypoint location and scale


The basic P -matrix normalised correlation measure is highly resilient to changes in illumination, contrast and rotation. BUT it is still rather sensitive to discrepancies in keypoint location and estimated dominant scale. To correct for small errors (typically a few pixels) in keypoint location, we modify the algorithm as follows: Measure derivatives of P k with respect to shifts x in the sampling circle. Using the derivatives, calculate the shift vectors xi which maximise the normalised correlation measures sk ,l at each of the 48 rotations i (using LMS methods with approximate adjustments for normalised vectors). By regarding the 48-point IFFT as a sparse matrix multiplication, the computation load is only 3 times that of the basic algorithm. We do the same for small scale errors using a derivative of P k wrt scale dilation, , increasing computation to 4 times that of the basic algorithm.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 24 / 30

Improving resilience to keypoint errors

A keypoint and its corresponding sampling circles over levels 5, 4 and 3


1 32 2 64 3 96 4 128 32 64 96 128 1 2 3 4

12

8 2 4 6 8

16 4 8 12 16

Background images show magnitudes of DT CWT coefs for subband 1, oriented for edges at 15 above the horizontal.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 25 / 30

Improving resilience to keypoint errors

Mean correlation surfaces with no change of scale


Results averaged over 73 images from Caltech PP Toys 03 dataset, using the centre of the image as the keypoint. red no tolerance; blue shift tolerance; green shift + scale tolerance.
1 0.9 Correlation value 0.8 0.7 0.6 0.5 0.4 Correlation value 6 4 2 0 x 2 4 6 1 0.9 0.8 0.7 0.6 0.5 0.4

0 x

Left) No change of orientation; Right) Average over rotations from 0 to 90 in steps of 7.5 .
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 26 / 30

Improving resilience to keypoint errors

Mean correlation surfaces with change of scale (dilation)


Results averaged over 73 images from Caltech PP Toys 03 dataset, using the centre of the image as the keypoint. red no tolerance; green shift + scale tolerance.
1 0.9 Correlation value

1 0.9 Correlation value 0.8 0.7 0.6 0.5 0.4 1

0.8 0.7 0.6 0.5 0.4

0 x

1.1

1.2

1.3 Scale

1.4

1.5

Left) Average over dilations from 1.0 to 1.4, with respect to shift; Right) Mean correlation values with respect to scale change (dilation), no shift.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 27 / 30

Improving resilience to keypoint errors

Histograms of non-target correlation scores


red no tolerance; blue shift tolerance; green shift + scale tolerance.
8000 original shift tol. scale tol.

Frequency

6000

4000

2000

0 0.2

0.2 0.4 0.6 Correlation value

0.8

The histograms demonstrate the increase in false positive detection rate when shift and scale tolerances are introduced.
Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 28 / 30

Methods for handling colour and lighting changes

Methods for handling colour and lighting changes

For colour keypoint detection, we add the wavelet energies from the DT CWT of each R, G and B component, so that we get the total wavelet response to any changes in RGB-space; and use this to form a single Accumulated Map. Alternatively we could do this in a perceptually more uniform space such as Lab. We perform contrast equalisation over local image regions to handle variations between lighted / shaded parts of the image and between images. For colour feature matching, we form 3 polar-matching matrices into a single matrix [PR PG PB ] and perform matching with this. Prior to matching, we normalise each composite P-matrix with a single scaling factor for all 3 colours, so that relative amplitudes of colour variations are maintained, despite lighting changes.

Nick Kingsbury (Univ. of Cambridge)

Dual-Tree Wavelets & Vision

UDRC 2013 (2)

29 / 30

Conclusions

Conclusions
New local Keypoint Detectors and Feature Descriptors have been proposed for use when comparing the similarity of detected objects in images. They have the following properties:
They are based on a modied form of the efcient Dual-Tree Complex Wavelet Transform. The modications improve the rotational symmetry of the 6 directionally selective subbands at each scale. The 12-point circular sampling patterns and the Polar Matching Matrix P are dened such that arbitrary rotations of an object can be accomodated efciently in the matching process. Shift-tolerant and scale-tolerant extensions of the basic P -matrix method give greater robustness at the expense of some increase in false-positive rate. There is scope for applying non-linear pre-processing (e.g. magnitude compression and phase adjustment) to the complex wavelet coefcients prior to matching, to improve resilience to illumination and viewpoint changes.

Papers on complex wavelets and applications are available at: www.eng.cam.ac.uk/~ngk/


Nick Kingsbury (Univ. of Cambridge) Dual-Tree Wavelets & Vision UDRC 2013 (2) 30 / 30