Time-frequency analysis
expressing a complex signal f as a superposition of exponentials $e^{2\pi i \xi t}$. If f vanishes outside some finite set then the exponentials, which extend over all time, must cancel each other in some fantastic way that makes it virtually impossible to quantify in any intuitive way which frequencies play a dominant role at any particular time t.
3.1.2 Frequency local in time
Consider a physical signal to be a square integrable real-valued function of time, x(t). One can define a complex extension z(t) = x(t) + iy(t) by letting y(t) be the inverse Fourier transform of $-i\,\mathrm{sgn}(\xi)\,\hat{x}(\xi)$, where sgn denotes the signum function $\xi/|\xi|$. In this case $\hat{z} = \hat{x} + i\hat{y}$ has only positive frequencies.
Exercise 3.1.1. Explain why z(t) has an extension to a complex argument t + is, s > 0 that is analytic
in the upper half plane {t + is : s > 0}. You may assume that x(t) is a continuous, bounded, absolutely
integrable function.
The analytic signal z has the polar form $r(t)e^{i\theta(t)}$ where $r = \sqrt{x^2 + y^2}$ and $\theta = \arctan(y/x)$. The instantaneous frequency can be defined as $d\theta/dt$. This point of view, however, is a little too simple because x(t) can be a superposition of multiple oscillating components and the instantaneous frequency cannot resolve multiple oscillating contributions. We will return to this point later in this chapter. First we want to consider some fundamental issues governing the impossibility of joint time-frequency localization and, in view of these limitations, mathematical tools that aim to characterize compositions of signals in terms of time-localized oscillations. One typically refers to such tools as time-frequency representations. As with the case of Fourier transforms we will encounter both continuous and discrete parameter time-frequency representations.
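As a quick illustration, here is a minimal MATLAB sketch (the chirp signal and all parameter choices are ours, for illustration only) that builds the analytic signal by zeroing the negative frequencies of the DFT and then reads off the instantaneous frequency from the unwrapped phase:

% Analytic signal z = x + i*Hx via the FFT, then instantaneous frequency.
fs = 1000; t = (0:1/fs:1-1/fs)';       % one second sampled at 1 kHz
x  = cos(2*pi*(50*t + 30*t.^2));       % chirp: instantaneous frequency 50 + 60t Hz
N  = numel(x); X = fft(x);
H  = zeros(N,1); H(1) = 1; H(N/2+1) = 1; H(2:N/2) = 2;  % keep positive frequencies
z  = ifft(X.*H);                       % analytic signal (same idea as hilbert(x))
phi   = unwrap(angle(z));              % instantaneous phase theta(t)
instf = diff(phi)*fs/(2*pi);           % d(theta)/dt, converted to Hz
plot(t(2:end), instf);                 % climbs from about 50 to about 110 Hz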
3.1.3 The Heisenberg-Wiener inequality
Variance inequalities
Theorem 3.1.2. (Heisenberg uncertainty principle) If $f \in L^2(\mathbb{R})$ with $\|f\|_2 = 1$ then
$$\|x f(x)\|_2 \, \|\xi \hat{f}(\xi)\|_2 \ge \frac{1}{4\pi}.$$
Moreover, one has equality if and only if $f(x) = c\, e^{-\alpha x^2}$ for some $\alpha > 0$, with $c$ the normalizing constant.
This inequality states that f cannot have most of its energy near zero in time or space, and most of its energy near zero (or really any other point) in frequency. This type of inequality is called a variance inequality because, when $|f(t)|^2\,dt$ is regarded as a continuous probability density on $\mathbb{R}$, its variance is $\|x f(x)\|_2^2$ provided $\int x\,|f(x)|^2\,dx = 0$, while $\|\xi \hat{f}(\xi)\|_2^2$ is the variance of the density $|\hat{f}(\xi)|^2\,d\xi$ provided $\int \xi\,|\hat{f}(\xi)|^2\,d\xi = 0$. In quantum mechanics the Heisenberg inequality has an interpretation as a joint variance of position and momentum operators and can be construed, roughly, as saying that one cannot jointly measure the position and momentum of a subatomic particle with arbitrary precision, a fact that was verified in the case of an electron by photon scattering in the famous Compton effect.
Our interest in uncertainty inequalities will take more of a macroscopic interpretation involving the formulation of a joint time-frequency picture of a signal.
Heisenberg's inequality has an interesting and simple extension to the case of (possibly unbounded) operators on a Hilbert space H. Define the domain of a self-adjoint operator A to be the set of $u \in H$ such that $Au \in H$.
Theorem 3.1.3. If A and B are self-adjoint operators on a Hilbert space H then, whenever u is in the domain of both AB and of BA and $a, b \in \mathbb{C}$, one has
$$\|(A - a)u\|\,\|(B - b)u\| \ge \frac{1}{2}\,\big|\langle (AB - BA)u,\, u\rangle\big|.$$
Equality holds precisely when $(A - a)u$ and $(B - b)u$ are pure imaginary scalar multiples of one another.
Exercise 3.1.4. Show that, at least formally,
$$\langle (AB - BA)u,\, u\rangle = 2i\, \Im\, \langle (B - b)u,\, (A - a)u\rangle.$$
Then apply Cauchy-Schwarz to conclude the theorem.
The Heisenberg inequality has a covariant form called the Robertson-Schrödinger inequality. It takes the form
$$(\Delta x_j)^2\, (\Delta \xi_j)^2 \ge \frac{1}{16\pi^2} + \big(\mathrm{Cov}(x_j, \xi_j)\big)^2$$
where the notation refers to covariance of operators (see, e.g., [?]).
Hermite functions
The Fourier transform exchanges differentiation and multiplication; specifically, $\big(\frac{d}{dt} f\big)^{\wedge}(\xi) = 2\pi i \xi \hat{f}(\xi)$. Writing $D = \frac{1}{2\pi i} \frac{d}{dt}$ and writing $P(D)$ for a differential operator $P(D) = \sum_k a_k D^k$, we have, formally, $\mathcal{F}(P(D) f) = P(\xi)\hat{f}$ where $P(\xi) = \sum_k a_k \xi^k$. Additionally, if $P(t, D)$ is a homogeneous polynomial of degree m in t and D, meaning that it has the form $\sum_{k=0}^{m} a_k t^k D^{m-k}$, then, also formally,
$$\mathcal{F}\big(P(t, D) f\big) = \sum_{k=0}^{m} a_k\, (-D_\xi)^k\, \xi^{m-k}\, \hat{f}, \qquad D_\xi = \frac{1}{2\pi i}\frac{d}{d\xi},$$
since the Fourier inversion formula implies that $\mathcal{F}(t f) = -D_\xi \hat{f}$. This and the observation about Gaussians being preserved leads to a description of $L^2$-eigenfunctions for the Fourier transform on $\mathbb{R}$. These eigenfunctions are the Hermite functions
$$h_m(t) = \frac{2^{1/4}\,(-1)^m}{\sqrt{m!}\,(2\sqrt{\pi})^m}\; e^{\pi t^2}\, \frac{d^m}{dt^m}\big(e^{-2\pi t^2}\big). \tag{3.1}$$
Exercise 3.1.5. Show that $h_m$ is an eigenfunction of the Fourier transform with eigenvalue $(-i)^m$.
Exercise 3.1.6. Explain why the Hermite functions are orthogonal with respect to the standard inner product on $L^2(\mathbb{R})$.
It turns out that, in fact, the Hermite functions form an orthonormal basis for L2 (R).
Exercise 3.1.7. Compute the moment $\int t\, h_m^2(t)\, dt$.
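The eigenvalue relation in Exercise 3.1.5 can be probed numerically. The following sketch (our own check; the grid is chosen so that the DFT output grid coincides with the time grid) samples $h_0$ and $h_1$ and applies a centered Riemann-sum approximation of $\mathcal{F}$:

% Numerical check that F(h_m) = (-i)^m h_m for m = 0, 1.
N  = 1024; dt = 1/sqrt(N);             % dt chosen so frequency spacing equals dt
t  = (-N/2:N/2-1)'*dt;
h0 = 2^(1/4)*exp(-pi*t.^2);                    % h_0
h1 = 2^(1/4)*2*sqrt(pi)*t.*exp(-pi*t.^2);      % h_1 from (3.1)
F  = @(f) dt*fftshift(fft(ifftshift(f)));      % approximates int f(t) e^{-2 pi i xi t} dt
disp(norm(F(h0) - h0));                % ~ 0: eigenvalue (-i)^0 = 1
disp(norm(F(h1) + 1i*h1));             % ~ 0: eigenvalue (-i)^1 = -i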
Entropy inequality
There are a vast number of known alternative, precise mathematical statements of the fact that a function and its Fourier transform cannot both be arbitrarily well localized. One important form has to do with information, usually regarded in terms of entropy. For $f \in L^2(\mathbb{R})$ with $\|f\|_2 = 1$ one defines its entropy $E(f) = -\int |f|^2 \ln |f|$. Entropy can take on positive or negative values (including infinity), but if f were highly concentrated then it would have a large negative entropy. For example, if $f = \sqrt{N}$ on $[0, 1/N)$ and zero elsewhere then (using the rule $0 \ln 0 = 0$) one would have $E(f) = -\frac{1}{2}\ln N$ whereas, if $f = 1/\sqrt{N}$ on $[0, N)$ and zero elsewhere then $E(f) = +\frac{1}{2}\ln N$. This typical example indicates that entropy is a measure of energy spread. A very illuminating mathematical discussion of entropy can be found in Landau [?]. The following entropy inequality was proved by Beckner [?]. It says that $E(f)$ and $E(\hat{f})$ cannot both have large negative values.
Theorem 3.1.8. If $f \in L^2(\mathbb{R})$, $\|f\|_2 = 1$, then $E(f) + E(\hat{f}) \ge \frac{1}{2}(1 - \ln 2)$.
In the discrete setting, for $z \in \mathbb{C}^N$ with $\|z\|_2 = 1$ one defines the entropy $E(z) = -\sum_{k=1}^{N} |z_k|^2 \ln |z_k|$ and, for $p > 0$, the quantity
$$\|z\|_p = \Big(\sum_{k=1}^{N} |z_k|^p\Big)^{1/p}.$$
Exercise 3.1.10. Prove that $\|z\|_p$ defines a norm on $\mathbb{C}^N$ for $1 \le p \le \infty$, but that the triangle inequality can fail when $p < 1$.
Exercise 3.1.11. Explain why $\|\hat{z}\|_\infty \le \frac{1}{\sqrt{N}}\, \|z\|_1$.
Interpolation
A convexity principle known as the Riesz-Thorin interpolation theorem (e.g., [?]) allows us to conclude from Plancherel's identity (that the DFT is unitary) and from the inequality $\|\hat{z}\|_\infty \le \frac{1}{\sqrt{N}}\|z\|_1$ that
$$\|\mathcal{F}(z)\|_{p'} \le N^{(p-2)/(2p)}\, \|z\|_p, \qquad \frac{1}{p} + \frac{1}{p'} = 1, \tag{3.2}$$
whenever $1 \le p \le 2$.
Now define the quantity
$$H_p(z) = \Big(\frac{1}{p} - \frac{1}{2}\Big)^{-1} \ln \frac{\|z\|_p}{\|z\|_2}.$$
By Plancherel's identity the two sides of (3.2) agree when $p = 2$, while for $1 \le p < 2$ the left-hand side is dominated by the right-hand side. If one differentiable function is dominated by another on an interval and the two agree at an endpoint, then at that endpoint the one-sided derivative of the smaller function has to be at least as large as that of the larger function. In this case we are saying that the logarithmic derivative with respect to p of $\|\hat{z}\|_{p'}/\|z\|_p$ at $p = 2$ is at least the logarithmic derivative of $N^{(p-2)/(2p)}$. Using
$$\frac{d}{dp}\ln\|z\|_p\Big|_{p=2} = \frac{1}{2}\sum_k |z_k|^2 \ln |z_k| \quad (\text{when } \|z\|_2 = 1), \qquad \frac{dp'}{dp}\Big|_{p=2} = -1,$$
this translates into the statement that
$$-\sum_k |z_k|^2 \ln |z_k| \; - \; \sum_k |\hat{z}_k|^2 \ln |\hat{z}_k| \;\ge\; \frac{1}{2}\ln N \tag{3.3}$$
whenever $\|z\|_2 = 1$.
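Inequality (3.3) is easy to probe numerically. The sketch below (our illustration; note the unitary normalization fft(z)/sqrt(N)) tests a random unit vector and a picket fence, the latter attaining equality:

% Check E(z) + E(zhat) >= 0.5*log(N) for unit vectors under the unitary DFT.
N = 64;
ent = @(z) -sum(abs(z).^2 .* log(max(abs(z),eps)));  % -sum |z_k|^2 ln|z_k|
z = randn(N,1) + 1i*randn(N,1); z = z/norm(z);
zh = fft(z)/sqrt(N);                   % unitary DFT, so norm(zh) = 1
fprintf('%.4f >= %.4f\n', ent(z) + ent(zh), 0.5*log(N));
q = 8; pf = zeros(N,1); pf(1:q:N) = 1/sqrt(N/q);     % picket fence, Q = 8
fprintf('picket fence: %.4f vs %.4f\n', ent(pf) + ent(fft(pf)/sqrt(N)), 0.5*log(N));

For the picket fence both entropies equal $\frac{1}{2}\ln(N/Q)$ plus $\frac{1}{2}\ln Q$ in the transform domain, and the two sides agree exactly, consistent with the equality discussion below.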
Uncertainty can also be quantified in terms of the sizes of the supports of f and $\hat{f}$: a nonzero $f \in L^2(\mathbb{R})$ cannot be supported on a set S of finite length while $\hat{f}$ is also supported on a set of finite length. Here |S| denotes the total length of S when S can be expressed as a (possibly infinite) union of pairwise disjoint intervals.
Exercise 3.1.14. Can a function f and its Fourier transform $\hat{f}$ both be concentrated on sets A and B of finite length, in the sense that $\int_{\mathbb{R}\setminus A} |f|^2 < \epsilon^2$ and $\int_{\mathbb{R}\setminus B} |\hat{f}|^2 < \epsilon^2$? Explain.
The entropy inequality does not tell us when this inequality becomes an identity. It turns out that equality occurs if and only if z is a shifted or modulated picket fence vector [?]. That is, an appropriate shift or modulation of z is a multiple of $\mathbf{1}_{\mathbb{Z}_{N/Q}}$ where Q divides evenly into N; here $\mathbf{1}_{\mathbb{Z}_{N/Q}}$ is the vector z such that $z_k = 1$ if k is a multiple of Q and $z_k = 0$ otherwise. The inequality between arithmetic and geometric means, applied to the support inequality $\#z \cdot \#\hat{z} \ge N$ (where $\#z$ denotes the number of nonzero coordinates of z), implies that $\#z + \#\hat{z} \ge 2\sqrt{N}$. One can infer a stronger inequality when N = P is prime (see [?]).
Corollary 3.1.21. If N = P is prime, then for $z \ne 0$,
$$\#z + \#\hat{z} \ge P + 1.$$
Equality holds only when z is a modulated version of a multiple of the vector all of whose coordinates equal one.
Exercise 3.1.22. Write a MATLAB script to verify this statement experimentally (see the sketch below).
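A minimal such experiment might look as follows (this sketch is ours; P, the tolerance, and the number of trials are arbitrary choices). Random vectors are generically far from extremal, so the observed minimum typically sits well above the bound; testing the all-ones vector shows the bound attained:

% Empirical check of #z + #(supp of fft(z)) >= P + 1 for prime P.
P = 31; trials = 500; tol = 1e-9;
worst = inf;
for j = 1:trials
    k = randi(P);                      % random support size
    z = zeros(P,1);
    z(randperm(P,k)) = randn(k,1) + 1i*randn(k,1);
    s = nnz(abs(z) > tol) + nnz(abs(fft(z)) > tol);
    worst = min(worst, s);
end
fprintf('smallest #z + #zhat observed: %d (bound %d)\n', worst, P+1);
e = ones(P,1);                         % extremal case: fft(e) = P*e_0
fprintf('all-ones vector: %d\n', nnz(e) + nnz(abs(fft(e)) > tol));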
Concentration inequalities can actually be used to say something about approximations from sparse data.
Exercise 3.1.23. Suppose that N = MP and that it is known that $\hat{z}$ is sparse in the sense that only $K < \min\{M, P\}$ entries are nonzero. What is the minimal number of coordinates of z required to reconstruct z? Explain.
Exercise 3.1.24. Suppose that z is such that $z_k = 0$ unless $k = nL$ for some n, where L divides N. Can the complexity of computing the DFT for such z be reduced? Explain.
The problem of estimating z based on the prior assumption that $\hat{z}$ has only a small number of nonzero coefficients is a deep and challenging active area of applied mathematics. The work of Tao, Candès et al. [?, ?] gives examples that build on earlier work of Donoho and Elad [?].
Theorem 3.2.2. (Balian-Low) Suppose that $G(g, \alpha, \beta)$ forms an orthonormal basis for $L^2(\mathbb{R})$. Then the time-frequency variance product $\|x g(x)\|_2\, \|\xi \hat{g}(\xi)\|_2 = \infty$.
This tells us that the Gabor window cannot have good time-frequency localization. We can ask whether an overcomplete Gabor representation can have good time-frequency localization (finite time-frequency variance). This time we are in luck, but we need a little basic machinery to describe the main result and how it can be applied.
Frames
Given a separable Hilbert space H, a countable subset $\{f_n\}$ is called a frame for H provided that there are constants $0 < A \le B < \infty$ such that for any $f \in H$ one has
$$A\|f\|^2 \le \sum_n |\langle f, f_n\rangle|^2 \le B\|f\|^2.$$
These inequalities imply that the frame operator is bounded and continuously invertible. Frames are necessarily complete sets, but typically they are overcomplete or redundant. For example, any three unit vectors in $\mathbb{R}^2$ that differ from one another by a rotation by $2\pi/3$ will form a frame for the Hilbert space $\mathbb{R}^2$ (see the sketch below). In the case of Gabor systems $G(g, \alpha, \beta)$ one defines the frame operator
$$S_{g,\alpha,\beta} f = \sum_{n,k \in \mathbb{Z}} \langle f, g_{n\alpha,k\beta}\rangle\, g_{n\alpha,k\beta}.$$
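For instance, the three rotated unit vectors mentioned above are easily checked to form a tight frame; the following sketch (our illustration) verifies that the frame operator is $\frac{3}{2}I$:

% Three unit vectors in R^2 separated by rotations of 2*pi/3.
th = [0; 2*pi/3; 4*pi/3];
F  = [cos(th) sin(th)];                % rows are the frame vectors f_n
S  = F'*F;                             % frame operator: S f = sum <f,f_n> f_n
disp(S);                               % equals (3/2)*eye(2), so A = B = 3/2
f  = randn(2,1);
disp(sum((F*f).^2) - 1.5*norm(f)^2);   % frame identity: ~0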
To g one can assign a canonical dual window $\gamma = S_{g,\alpha,\beta}^{-1} g$ and then one has the reproducing formula
$$f = \sum_{n,k} \langle f, \gamma_{n,k}\rangle\, g_{n,k} = \sum_{n,k} \langle f, g_{n,k}\rangle\, \gamma_{n,k}.$$
In this case one can say that f is expressed in a natural way as a superposition of time-frequency localized Gabor atoms if g (or $\gamma$) is time-frequency localized. At this stage we just have a couple of minor problems. First, what does $\gamma$ look like, and second, if g is time-frequency localized in a suitable sense then will $\gamma$ be localized in the same sense? A third issue comes in determining conditions on $g, \alpha, \beta$ such that one has a Gabor frame. When g is the Gaussian function $g(x) = e^{-\pi x^2}$ one has the following frame density criterion.
Theorem 3.2.3. (Seip and Lyubarskii) When $g(x) = e^{-\pi x^2}$ the family $G(g, \alpha, \beta)$ forms a frame for $L^2(\mathbb{R})$ if and only if $\alpha\beta < 1$.
The product $\alpha\beta$ can be regarded as a time-frequency density in the sense that $1/(\alpha\beta)$ is the number of Gabor time-frequency shifts per unit area. This has to be at least one in order that the family is complete in the case of a Gaussian. Overcompleteness implies that typically there will be more than one dual function to a Gabor frame generator g. The dual function $\gamma = S_{g,\alpha,\beta}^{-1} g$ is called the canonical dual. Except for a minor technical condition, Wexler and Raz [?, ?] characterized the Gabor duals in the analogous finite frame case that will be discussed in a minute. In the case of Gabor frames for $L^2(\mathbb{R})$ this characterization was carried out rigorously by Daubechies, H. Landau and Z. Landau as follows.
Theorem 3.2.4. (Wexler-Raz) The pair $(g, \gamma)$ is a pair of dual Gabor windows, in the sense that $S_{g,\gamma,\alpha,\beta} f = \sum_{n,k} \langle f, g_{n\alpha,k\beta}\rangle \gamma_{n\alpha,k\beta} = f$, if and only if
$$\frac{1}{\alpha\beta}\, \big\langle \gamma_{n/\beta,\, k/\alpha},\; g_{m/\beta,\, \ell/\alpha}\big\rangle = \delta_{mn}\, \delta_{\ell k}.$$
Equivalently, $S_{g,\gamma,\alpha,\beta} f = T_{\gamma,\alpha,\beta}^{*} T_{g,\alpha,\beta} f$, where $T_{g,\alpha,\beta}$ denotes the analysis (coefficient) mapping. As noted, the main difficulty with this expansion is the computation of the dual function $\gamma$. A very important consequence of the Wexler-Raz identity is that the canonical dual is the same as the so-called Wexler-Raz dual, which is defined as follows:
$$\gamma = T_{g,1/\beta,1/\alpha}^{*}\, \big(T_{g,1/\beta,1/\alpha}\, T_{g,1/\beta,1/\alpha}^{*}\big)^{-1} e_{0,0} \tag{3.4}$$
where $e_{0,0}$ is the coefficient sequence on $\mathbb{Z} \times \mathbb{Z}$ such that $e_{0,0}(n,k) = 1$ if $n = k = 0$ and $e_{0,0}(n,k) = 0$ otherwise. What is important about the formula (3.4) is that it allows for a discrete calculation of the dual function $\gamma$. This calculation can be performed numerically as follows.
Proposition 3.2.5. (Neumann series expansion) Suppose that S is an operator on a Hilbert space H such that $0 < S < I$ in the sense that, for any $x \in H$, one has $0 < \langle (I - S)x, x\rangle < \|x\|^2$. Then one can write
$$S^{-1} = \sum_{k=0}^{\infty} (I - S)^k. \tag{3.5}$$
Formally, (3.5) is the same as the geometric series expansion $x^{-1} = \sum_{k=0}^{\infty} (1 - x)^k$ when $0 < x < 1$, with S in place of x. If A and B are frame bounds for $T_{g,1/\beta,1/\alpha} T_{g,1/\beta,1/\alpha}^{*}$ then $\frac{2}{A+B}\, T_{g,1/\beta,1/\alpha} T_{g,1/\beta,1/\alpha}^{*}$ satisfies the hypotheses of the proposition, so
$$\big(T_{g,1/\beta,1/\alpha} T_{g,1/\beta,1/\alpha}^{*}\big)^{-1} e_{0,0} = \frac{2}{A+B} \sum_{k=0}^{\infty} \Big(I - \frac{2}{A+B}\, T_{g,1/\beta,1/\alpha} T_{g,1/\beta,1/\alpha}^{*}\Big)^{k} e_{0,0} = \lim_{K \to \infty} e_K,$$
say, where $e_K$ denotes the K-th partial sum applied to $e_{0,0}$, and the corresponding windows
$$\gamma_K = \sum_{n,k} e_K(n,k)\, g_{n/\beta,\, k/\alpha}$$
converge to the dual window $\gamma$ as $K \to \infty$.
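Before turning to the discrete transform itself, here is a small sketch of the Neumann iteration (our own illustration on an arbitrary positive definite matrix standing in for the frame operator; the relaxation factor $2/(A+B)$ is exactly the one above):

% Neumann-series inversion of a positive definite matrix S applied to e0.
n = 50; R = randn(n); S = R'*R + eye(n);   % stand-in "frame operator"
ev = eig(S); A = min(ev); B = max(ev);     % its frame bounds
lam = 2/(A+B);                             % guarantees ||I - lam*S|| < 1
e0 = zeros(n,1); e0(1) = 1;                % analogue of e_{0,0}
u = zeros(n,1); r = e0;
for k = 0:200                              % u -> lam * sum_k (I - lam*S)^k e0
    u = u + lam*r;
    r = r - lam*S*r;                       % r = (I - lam*S)^{k+1} e0
end
disp(norm(u - S\e0));                      % ~0: iteration converges to S^{-1} e0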
In the finite setting the discrete Gabor transform (DGT) of a vector x of length L, with window g, time shift a, and M frequency channels, is given by
$$c(k, n) = \sum_{\ell=0}^{L-1} x(\ell)\, e^{-2\pi i \ell k / M}\, g(\ell - a n + 1), \qquad k = 0, \dots, M-1;\quad n = 0, \dots, N-1 = L/a - 1.$$
The ltfat syntax is
[c,Ls]=dgt(f,g,a,M);
fr=idgt(c,gd,a,Ls)
where f is the input vector x having Ls entries, g is the Gabor window with Gabor dual gd, a is the sample shift parameter, and M is the normalized frequency parameter, referred to in the ltfat documentation as the number of channels. Practically, it is the effective number of Fourier modes to be considered in the windowed data. Gabor window and dual window design for finite implementations is discussed briefly in the ltfat help. In Figure 3.1 the Gabor transform of the signal buellershort is plotted on a log intensity scale. The parameters were chosen as follows in order to give a fairly robust time-frequency picture.
g=pgauss(1024,256,512);
[c,Ls]=dgt(x,g,2,1024);
In general, the more robust the picture, the more redundancy required. This means small a and large M. Minimum redundancy would require large a and small M, the ratio M/a = 1 being the smallest possible choice. One other thing worth mentioning is that the DGT treats data as being a single period of a periodic sequence. The window functions used are also periodic so that there are no edge effects.
Fig. 3.1. Log scale intensity plot of Gabor transform of bueller signal of length L = 8192.
Gabor coefficients are samples of the integral
$$\int f(t)\, \overline{g(x - t)}\, e^{-2\pi i \xi t}\, dt \tag{3.6}$$
where g(x) is actually replaced by its reflection $g(-x)$ and x takes the value $n\alpha$ and $\xi$ the value $k\beta$. However, if one is willing to compute all of the values then one ends up with the short-time Fourier transform $S(f,g)(x,\xi)$, which is a mapping from a function f(t) to a function of the variables $(x, \xi)$.
Fig. 3.6. Top plot shows noise added to the bueller signal. Bottom plot shows the residual: the signal cleaned using the top 10 percent of Gabor coefficients, minus the noisy bueller signal. This residue looks much the same as the noise, and the cleaned signal sounds much like the original bueller signal.
As an integral, it is linear in f. In fact it is also (conjugate) linear in g, but one usually regards g as a fixed window function and then regards $f \mapsto S_g(f) = S(f,g)$ as a linear mapping. The short-time Fourier transform satisfies a remarkable inversion property. Suppose that $\|g\|_2 = 1$. Then
$$f(t) = \int\!\!\int S(f,g)(x, \xi)\, g(t - x)\, e^{2\pi i \xi t}\, d\xi\, dx. \tag{3.7}$$
The inversion formula is very similar, on the one hand, to the Fourier inversion formula (corresponding to the case where g is replaced by the Dirac point mass $\delta$ here), as well as to the Gabor representation formula in the limit as the time and frequency shift parameters tend to zero. Formula (3.7) is interpreted in the sense of convergence in $L^2(\mathbb{R})$. We will not justify the formula rigorously (see [?]) but we will give a formal proof based on the formal identity $\int e^{2\pi i \xi t}\, d\xi = \delta(t)$. Then
$$\int\!\!\int S(f,g)(x,\xi)\, g(t-x)\, e^{2\pi i \xi t}\, d\xi\, dx = \int\!\!\int\!\!\int f(s)\, \overline{g(x - s)}\, e^{-2\pi i \xi s}\, ds\; g(t-x)\, e^{2\pi i \xi t}\, d\xi\, dx$$
$$= \int f(s) \int \overline{g(x-s)}\, g(t-x) \int e^{-2\pi i \xi (s - t)}\, d\xi\, dx\, ds = \int f(s) \int \overline{g(x-s)}\, g(t-x)\, \delta(s-t)\, dx\, ds$$
$$= f(t) \int |g(t - x)|^2\, dx = f(t).$$
In fact, one can show that S(f,g) is energy preserving in the sense that, when $\|g\|_2 = 1$,
$$\int\!\!\int |S(f,g)(t, \xi)|^2\, dt\, d\xi = \int |f(t)|^2\, dt.$$
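The energy identity can be checked by brute force; the sketch below (ours, with Riemann sums on grids we chose and a unit-norm Gaussian window) discretizes the STFT integral directly:

% Riemann-sum check that the STFT preserves energy when ||g||_2 = 1.
dt = 0.05; t = (-8:dt:8)';             % time grid
dxi = 0.05; xi = -8:dxi:8;             % frequency grid
f = exp(-pi*t.^2).*exp(2i*pi*3*t);     % test signal: Gaussian modulated to 3 Hz
gfun = @(s) 2^(1/4)*exp(-pi*s.^2);     % window with ||g||_2 = 1
S = zeros(numel(t), numel(xi));
for m = 1:numel(t)                     % x = t(m): S(x,xi) = int f(s) g(s-x) e^{-2 pi i xi s} ds
    S(m,:) = ((f.*gfun(t - t(m))).')*exp(-2i*pi*t*xi)*dt;
end
Ef = sum(abs(f).^2)*dt;                % ||f||^2
Es = sum(abs(S(:)).^2)*dt*dxi;         % double integral of |S|^2
fprintf('signal energy %.4f, STFT energy %.4f\n', Ef, Es);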
Fig. 3.7. Log intensity spectrogram of bueller signal computed using tfrsp.
A quantity like $\int t\, |\psi(t)|^2\, dt$ can be thought of as the expected location of a particle modelled by $\psi$. Wigner sought a mathematical object to model the joint distribution of such a density in time and frequency. Such a joint density W(f) should satisfy, among other things, a short list of natural properties (TF1)-(TF5); among them is the nonnegativity requirement (TF3): $W(f)(t, \nu) \ge 0$ for all $(t, \nu)$.
It turns out that these properties are incompatible. Wigner proposed as a substitute a mapping $(f, g) \mapsto W(f, g)$, the so-called Wigner distribution, that satisfies all of the properties except (TF3), requiring instead only that W(f) be real-valued. The Wigner distribution is defined as
$$W(f,g)(t, \nu) = \int e^{-2\pi i \nu \tau}\, f\Big(t + \frac{\tau}{2}\Big)\, \overline{g\Big(t - \frac{\tau}{2}\Big)}\, d\tau. \tag{3.8}$$
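A naive discrete implementation of (3.8) is a useful way to see the distribution in action. In the sketch below (our own prototype, with periodic indexing as a simplifying assumption) the product $f(t+\tau/2)\overline{f(t-\tau/2)}$ is formed over integer lags and Fourier transformed in the lag variable:

% Naive discrete Wigner distribution of a chirp.
N = 256; n = (0:N-1)';
f = exp(2i*pi*(0.05*n + 0.0003*n.^2));   % discrete chirp
k = -(N/2):(N/2-1);                      % lag variable tau
W = zeros(N);
for m = 1:N
    ip = mod(m-1 + k, N) + 1;            % index of f(t + tau/2), periodized
    im = mod(m-1 - k, N) + 1;            % index of f(t - tau/2), periodized
    r  = f(ip).*conj(f(im));             % instantaneous autocorrelation
    W(m,:) = real(fftshift(fft(ifftshift(r.'))));  % FFT over the lag
end
imagesc(abs(W).'); axis xy;              % energy concentrates along the chirp line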
Exercise 3.2.7. With W(f) = W(f, f), verify properties (TF2), (TF4) and (TF5).
Exercise 3.2.9. Denote $D_a f(t) = \sqrt{a}\, f(at)$, $E_b f(t) = e^{2\pi i b t} f(t)$ and $T_a f(t) = f(t - a)$. Compute $W(D_a f)(t, \nu)$, $W(E_b f)(t, \nu)$ and $W(T_a f)(t, \nu)$. Also compute $W(\hat{f})(t, \nu)$ where $\hat{f}$ is the Fourier transform of f. Express your answers in terms of $W(f)(t, \nu)$.
Exercise 3.2.10. (Hard) Show that shifted, dilated and modulated Gaussians are the only $L^2$ functions having completely nonnegative Wigner distributions.
Moyal's formula
The Wigner distribution is a unitary mapping from $L^2(\mathbb{R})$ to $L^2(\mathbb{R}^2)$, a fact that is known as Moyal's formula.
Theorem 3.2.11. If $f_1, f_2, g_1, g_2 \in L^2(\mathbb{R})$ then
$$\langle f_1, f_2\rangle\, \overline{\langle g_1, g_2\rangle} = \langle W(f_1, g_1),\, W(f_2, g_2)\rangle.$$
On a formal level the proof is much the same as that of the inversion formula for the short-time Fourier transform; that is, it involves changes of order of integration and the formal identity $\int e^{-2\pi i x \xi}\, d\xi = \delta_x$:
$$\langle W(f_1,g_1), W(f_2,g_2)\rangle = \int\!\!\int \Big[\int e^{-2\pi i \nu \tau_1} f_1\big(t + \tfrac{\tau_1}{2}\big)\, \overline{g_1\big(t - \tfrac{\tau_1}{2}\big)}\, d\tau_1\Big]\, \overline{\Big[\int e^{-2\pi i \nu \tau_2} f_2\big(t + \tfrac{\tau_2}{2}\big)\, \overline{g_2\big(t - \tfrac{\tau_2}{2}\big)}\, d\tau_2\Big]}\, d\nu\, dt$$
$$= \int\!\!\int\!\!\int f_1\big(t + \tfrac{\tau_1}{2}\big)\, \overline{g_1\big(t - \tfrac{\tau_1}{2}\big)}\, \overline{f_2\big(t + \tfrac{\tau_2}{2}\big)}\, g_2\big(t - \tfrac{\tau_2}{2}\big) \int e^{-2\pi i \nu (\tau_1 - \tau_2)}\, d\nu\; d\tau_1\, d\tau_2\, dt$$
$$= \int\!\!\int f_1\big(t + \tfrac{\tau}{2}\big)\, \overline{f_2\big(t + \tfrac{\tau}{2}\big)}\, \overline{g_1\big(t - \tfrac{\tau}{2}\big)}\, g_2\big(t - \tfrac{\tau}{2}\big)\, d\tau\, dt$$
$$= \int\!\!\int f_1(u)\, \overline{f_2(u)}\, \overline{g_1(u - \tau)}\, g_2(u - \tau)\, d\tau\, du = \langle f_1, f_2\rangle\, \overline{\langle g_1, g_2\rangle}.$$
Fig. 3.8. Log intensity Wigner distribution of bueller signal computed using tfrwv.
Routines of this type share the calling syntax
[tfr,t,f]=tfrname(signal);
where tfrname is the name of the distribution. For example, the spectrogram name is tfrsp. Some alternative time-frequency distributions are provided in Figure 3.10.
Fig. 3.9. Log intensity plot of smoothed pseudo Wigner distribution of bueller signal.
Fig. 3.10. Alternative time-frequency distributions of the bueller signal.
In reassignment, each value of the spectrogram is moved to a new center $(\hat{t}, \hat{\nu})$ dictated by the local phase $\Phi$ of the short-time Fourier transform; formally,
$$\hat{t} = t - \frac{\partial \Phi}{\partial \nu}(t, \nu), \qquad \hat{\nu} = \nu + \frac{\partial \Phi}{\partial t}(t, \nu).$$
Numerical implementation using these rules is not completely effective, so one substitutes instead the following:
$$\hat{t} = t - \Re\left\{ \frac{S\big(f,\, t\,g(t)\big)(t, \nu)\; \overline{S(f, g)(t, \nu)}}{|S(f, g)(t, \nu)|^2} \right\}, \qquad \hat{\nu} = \nu + \Im\left\{ \frac{S\big(f,\, \tfrac{d}{dt} g(t)\big)(t, \nu)\; \overline{S(f, g)(t, \nu)}}{2\pi\, |S(f, g)(t, \nu)|^2} \right\}.$$
3.3 Return to time-frequency orthonormal bases: local trigonometric bases and Wilson bases
so that $\xi\, \hat{g}(\xi) \notin L^2(\mathbb{R})$. It came as a surprise, then, when K. Wilson suggested the possibility of finding an orthonormal basis for $L^2(\mathbb{R})$ consisting of alternating windowed sines and cosines, as follows. Set
$$\psi_{nk}(t) = \begin{cases} \sqrt{2}\, w(t - n/2), & k = 0,\ n \text{ even},\\ \sqrt{2}\, w(t - n/2)\, \cos 2\pi k t, & k = 1, 2, 3, \dots,\ n \text{ even}, \end{cases}$$
with windowed sines $\sqrt{2}\, w(t - n/2)\, \sin 2\pi k t$ playing the corresponding role for n odd.
$$\sum_{n=-\infty}^{\infty} b^2(t - n) = 1. \tag{3.10}$$
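A concrete bell makes (3.10) tangible: take $b(t) = \cos(\pi t/2)$ on $[-1, 1]$ and zero elsewhere (our choice for illustration), so that adjacent translates mesh through $\cos^2 + \sin^2 = 1$. A numerical check:

% Partition of unity sum_n b(t-n)^2 = 1 for b(t) = cos(pi*t/2) on [-1,1].
b = @(t) cos(pi*t/2).*(abs(t) <= 1);   % a simple bell supported on [-1,1]
t = linspace(-3, 3, 2001);
s = zeros(size(t));
for n = -5:5                           % only neighboring shifts contribute
    s = s + b(t - n).^2;
end
disp(max(abs(s - 1)));                 % ~0 up to rounding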
Proposition 3.3.2. The functions $e_{nk}(x) = \sqrt{2}\, b(x - n)\, \cos\big((k + \tfrac{1}{2})\pi (x - n)\big)$, $n \in \mathbb{Z}$ and $k = 0, 1, \dots$, form an orthonormal basis for $L^2(\mathbb{R})$.
Proof. One must show that overlapping functions are orthogonal. We prove this for the case when one of the functions lives on $[-1/2, 3/2]$, so n = 0. The integral defining $\langle e_{0k}, e_{m\ell}\rangle$ vanishes unless m = 0 or m = ±1. We will consider the case m = 0 and leave the other cases as an exercise.
$$\langle e_{0k}, e_{0\ell}\rangle = 2\Big\{\int_{-1/2}^{1/2} + \int_{1/2}^{3/2}\Big\}\, b^2(t)\, \cos\big((k+\tfrac12)\pi t\big)\, \cos\big((\ell+\tfrac12)\pi t\big)\, dt.$$
The first integral can be split into $\int_{-1/2}^{0} + \int_{0}^{1/2}$. On $[-1/2, 0]$, $b(t) = s_{1/2}(t)$ and on $[0, 1/2)$, $b(t) = c_{1/2}(t)$. Also, cosine is an even function, so
$$\Big\{\int_{-1/2}^{0} + \int_{0}^{1/2}\Big\}\, b^2(t)\cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt = \int_0^{1/2} \big(s_{1/2}^2(t) + c_{1/2}^2(t)\big)\cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt = \int_0^{1/2} \cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt$$
by the Pythagorean identity. For the integral $\int_1^{3/2}$ one uses the fact that $\cos\big((k+1/2)\pi t\big) = -\cos\big((k+1/2)\pi(2-t)\big)$ and that, on $[1, 3/2)$, $b(t) = c_{1/2}(t-1)$ to write
$$\int_1^{3/2} b^2(t)\cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt = \int_1^{3/2} c_{1/2}^2(t-1)\cos\big((k+\tfrac12)\pi(2-t)\big)\cos\big((\ell+\tfrac12)\pi(2-t)\big)\, dt$$
$$= \int_{1/2}^{1} c_{1/2}^2(1-t)\cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt = \int_{1/2}^{1} s_{1/2}^2(t-1)\cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt,$$
where we have used the fact that $c_{1/2}(-u) = s_{1/2}(u)$. Therefore, adding $\int_1^{3/2}$ to $\int_{1/2}^{1}$ and using the Pythagorean identity, we end up with the integral of the cosine terms alone over $[1/2, 1)$. Altogether, then,
$$\langle e_{0k}, e_{0\ell}\rangle = 2\int_0^1 \cos\big((k+\tfrac12)\pi t\big)\cos\big((\ell+\tfrac12)\pi t\big)\, dt = \int_0^1 \cos\big((k-\ell)\pi t\big) + \cos\big((k+\ell+1)\pi t\big)\, dt = \delta_{k\ell},$$
which was to be shown. The completeness of the system {enk } follows from completeness of the trigonometric
system over each of the unit intervals [n, n + 1).
Exercise 3.3.3. Show that $\langle e_{0k}, e_{1\ell}\rangle = 0$ for all k and $\ell$. Explain why, in general, all inner products $\langle e_{nk}, e_{m\ell}\rangle$ can be reduced to calculating inner products on [0, 1], to conclude that the $\{e_{nk}\}$ form an orthonormal family.
Though we will not go into detail here, local trigonometric bases can in fact be adapted to any partition of the real line into intervals $I_n$ whose left endpoints $t_n$ form a strictly increasing sequence such that $\lim_{n\to\infty} t_n = \infty$. The main idea is to use a sequence of cutoffs in such a way that the bells $b_n(t) = s_n(t - t_n)\, c_{n+1}(t - t_{n+1})$ satisfy $\sum_n b_n^2 = 1$ and such that the local trigonometric basis elements alternate polarities at the endpoints in the same sense as the cosines just considered.
Discrete implementations of the local trigonometric bases are often called Malvar bases because H. Malvar was the first to use them in signal processing applications. They really amount to nothing other than sampled versions of the local trigonometric bases. Coefficient pictures produced by discrete analogues of the systems $\{e_{nk}\}$ just discussed will look very much like the corresponding Wilson basis pictures such as in Figure 3.13. As will be discussed in more detail later, it is possible to use recursive decision trees to decide whether to split a given interval for local trigonometric analysis into two subintervals with corresponding decompositions for each subinterval. Such splittings give rise to families of local trigonometric bases indexed by interval splittings, and one can ask which, among this family of bases, represents given data in the most efficient way. Efficiency will be discussed in Chapter 5. The local trigonometric functions are called cosine packets, and routines for implementing cosine packet analysis can be found in the Stanford WaveLab package WaveLab802. Figure 3.14 shows the pattern of nonzero cosine packet basis coefficients in an analysis of buellershort. Although the intensity does not show up in this scheme, the pattern essentially follows that of the other time-frequency analysis tools.
Fig. 3.14. Phase plane of the nonzero cosine packet coefficients of buellershort (horizontal axis time, vertical axis frequency).
If $g = Q_T f$ is in the image of the operator $Q_T$ then g is not bandlimited, but $P Q_T f$ is, and we can write
$$P Q_T f(x) = \int_{-T/2}^{T/2} f(t)\, \frac{\sin \pi(x-t)}{\pi(x-t)}\, dt.$$
Expanding f in its samples by means of the sampling theorem,
$$P Q_T f(x) = \int_{-T/2}^{T/2} \sum_{k\in\mathbb{Z}} f(k)\, \frac{\sin \pi(t-k)}{\pi(t-k)}\, \frac{\sin \pi(x-t)}{\pi(x-t)}\, dt = \langle \{f(\ell)\},\, s_T(\cdot, x)\rangle_{\ell^2},$$
where
$$s_T(k, x) = \int_{-T/2}^{T/2} \frac{\sin \pi(t-k)}{\pi(t-k)}\, \frac{\sin \pi(x-t)}{\pi(x-t)}\, dt.$$
Of course, when $T \to \infty$, $s_T(k, x)$ converges to $\frac{\sin \pi(x-k)}{\pi(x-k)}$ and one recovers the sampling theorem. Now consider the case in which $f = \varphi_n$ is the n-th eigenfunction of the operator $P Q_T$. Here we make use of the fact that the operator $P Q_T$, as a self-adjoint operator on the Hilbert space PW, has a discrete spectrum $1 > \lambda_0 \ge \lambda_1 \ge \cdots \to 0$. If $\varphi$ is an eigenfunction of $P Q_T$ with eigenvalue $\lambda$ then
$$P Q_T \varphi(m) = \lambda\, \varphi(m) = \langle \{\varphi(\ell)\},\, s_T(\cdot, m)\rangle_{\ell^2}.$$
In other words, the sample vector $v = \{\varphi(m)\}$ is a $\lambda$-eigenvector of the matrix $A_T(m, \ell) = s_T(\ell, m)$. This means that the eigenvalue/eigenvector problem for the prolate spheroidal wave functions can be reduced to that of the discrete matrix $A_T$.
Exercise 3.4.1. Fix T to be a fairly large even integer, say T = 10. Estimate the entries of the matrix $A_T$ by first approximating the sinc function by means of Taylor polynomials centered at the origin of sufficiently high order, on the one hand, and by using Legendre polynomials on the other. Then compute the SVD of the matrix to obtain approximate sample eigenfunctions and plot several of them.
Now consider the operator $Q_T P Q_T$. The only difference between this operator and $P Q_T$ is that the elements of the range of $P Q_T$ are now truncated to $[-T/2, T/2]$, so the eigenfunctions can be considered as eigenfunctions of $P Q_T$ restricted to $[-T/2, T/2]$. This is no great observation, but here is a surprising one.
Proposition 3.4.2. The eigenfunctions of $P Q_T$ are orthogonal on the whole real line and also on the interval $[-T/2, T/2]$; that is, if $\varphi_n$ is the n-th eigenfunction of $P Q_T$ then
$$\int_{-T/2}^{T/2} \varphi_n(t)\, \varphi_m(t)\, dt = \lambda_n\, \delta_{nm} = \lambda_n \int_{\mathbb{R}} \varphi_n(t)\, \varphi_m(t)\, dt.$$
Proof. Orthogonality on all of $\mathbb{R}$ follows from the fact that eigenvectors coming from different eigenvalues are orthogonal. To prove orthogonality on $[-T/2, T/2]$ we use Parseval's theorem together with an interesting little fact: the eigenfunctions are, in a sense, invariant under the Fourier transform. First, if $c = \Omega T$ is the time-frequency area associated with $P_\Omega Q_T$ then, setting $f_\rho(x) = \sqrt{\rho}\, f(\rho x)$,
$$P_{\Omega\rho} Q_{T/\rho} f_\rho = \big(\widehat{(Q_T f)_\rho}\, \mathbf{1}_{[-\Omega\rho/2,\,\Omega\rho/2]}\big)^\vee = \big((\widehat{Q_T f})_{1/\rho}\, \mathbf{1}_{[-\Omega\rho/2,\,\Omega\rho/2]}\big)^\vee = \big(\big(\widehat{Q_T f}\; \mathbf{1}_{[-\Omega/2,\,\Omega/2]}\big)_{1/\rho}\big)^\vee = (P_\Omega Q_T f)_\rho.$$
In other words, dilation in a sense commutes with time-frequency localization, and the eigenfunctions of the operator $P_{\Omega\rho} Q_{T/\rho}$ have the form $\varphi_\rho$ where $\varphi$ is an eigenfunction of $P_\Omega Q_T$.
What does the Fourier transform do to a $\lambda$-eigenfunction? Well,
$$\lambda\, \widehat{(Q_T \varphi)}(\xi) = \widehat{(Q_T P Q_T \varphi)}(\xi) = P_T\, \widehat{(P Q_T \varphi)}(\xi) = P_T Q_\Omega\, \widehat{(Q_T \varphi)}(\xi),$$
where $Q_\Omega \hat{g}(\xi) = \hat{g}(\xi)\, \mathbf{1}_{[-\Omega/2, \Omega/2]}(\xi)$. This says that $\widehat{(Q_T \varphi)}$ is a $\lambda$-eigenfunction of the operator $P_T Q_\Omega$, which is $P_\Omega Q_T$ with the roles of T and $\Omega$ exchanged, and, by what we just observed, it follows that the eigenfunctions of $P_T Q_\Omega$ are unitary dilations by a factor $T/\Omega$ of the eigenfunctions of $P_\Omega Q_T$. This tells us that $\widehat{(Q_T \varphi_n)}(\xi) = c_n\, \varphi_{n, T/\Omega}(\xi)$, where $|c_n|^2 = \lambda_n$ since $\|Q_T \varphi_n\|^2 = \langle P Q_T \varphi_n, \varphi_n\rangle = \lambda_n$. Therefore, by Parseval,
$$\lambda_n \lambda_m \int_{-T/2}^{T/2} \varphi_n\, \varphi_m = \int_{-T/2}^{T/2} (P Q_T \varphi_n)(P Q_T \varphi_m) = \int_{\mathbb{R}} (Q_T P Q_T \varphi_n)\, \overline{(Q_T P Q_T \varphi_m)}$$
$$= \lambda_n \lambda_m \int \widehat{(Q_T \varphi_n)}\, \overline{\widehat{(Q_T \varphi_m)}} = \lambda_n \lambda_m\, c_n \overline{c_m} \int \varphi_{n, T/\Omega}(\xi)\, \varphi_{m, T/\Omega}(\xi)\, d\xi = \lambda_n \lambda_m\, \lambda_n\, \delta_{nm},$$
so that $\int_{-T/2}^{T/2} \varphi_n \varphi_m = \lambda_n \delta_{nm}$, as claimed.
Problem 3.4.3. Do the sample sequences of the prolate spheroidal wave functions satisfy some extrapolation property?
In other words, one would like to determine, from the sample values of $\varphi_m$ inside of $[-T/2, T/2]$, the values outside of $[-T/2, T/2]$. This problem is highly ill-posed, but it is less ill-posed when we observe that $\varphi_m$ is an analytic function, and even less so when we observe that it is an eigenfunction.
3.4.1 Numerical generation of PSWFs
Figure 3.15 shows numerically generated prolate spheroidal wave functions where T = 10 and $\Omega = 1$. The figure illustrates the fact that the first several eigenfunctions are highly concentrated in $[-T/2, T/2]$, but that the concentration of $\varphi_n$ decreases with n, and once $n \gtrsim [\Omega T]$ at most half of the energy of $\varphi_n$ is localized inside $[-T/2, T/2]$. Here is a brief description of how they were created. First a MATLAB function sincmatrix was used to generate a partial matrix $s_T(k, \ell)$ for values k and $\ell$ running from $-N$ to N for some user input N. The integral defining $s_T(k, \ell)$ was computed using MATLAB's built-in quad function for numerical estimation of integrals. MATLAB also has a built-in sinc function, but for earlier versions the sinc function can be input manually, with a small correction in the denominator to avoid division by zero at $t = k, \ell$. Computing $s_T(k, \ell)$ is computationally intensive but only has to be done once. The eigenvectors are then estimated numerically by using the MATLAB built-in svd. These eigenvectors are the samples of the PSWFs. Finally, one multiplies the matrix containing the eigenvectors of $s_T$ by a matrix containing densely sampled values of the shifted sinc functions. The columns of the resulting product are densely sampled approximate prolate spheroidal wave functions. The approximations here depend on two things: (i) the error tolerance in the quadrature defining $s_T$ and (ii) the parameter N governing the size of the partial matrix of $s_T$. In fact, the entries of $s_T$ decay fairly rapidly away from the diagonal. This is illustrated in Figure 3.16, showing that the entries $s_T(k, \ell)$ are significant only when $k, \ell$ are approximately between $-5$ and 5 and when $k \approx \ell$. In addition, one sees in Figure 3.17 that the partial sinc matrix $s_T$ has only about 14 significant eigenvalues and only 10 eigenvalues greater than 1/2. In fact, a theorem due to Landau [?] states that, in general, $P_\Omega Q_T$ has at most $[\Omega T] + 1$ eigenvalues larger than 1/2. In our case, $\Omega = 1$ and T = 10, so our numerical results are certainly consistent with the theory. Taking N any larger will provide slightly better approximations but at a higher computational cost.
Once one has the samples of the PSWFs $\varphi_n$ one can compute the projection of any $f \in PW$ onto the range of $P Q_T$ simply by computing the $\ell^2(\mathbb{Z})$ inner product (or, numerically, the partial inner product) of the samples of f with the samples of each $\varphi_n$. This is the same as multiplying the partial sample vector by the orthogonal matrix obtained in the svd above. Then one expands each of the $\varphi_n$'s as before.
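A condensed version of this procedure (ours; it substitutes the newer integral for quad and uses a guarded sinc, with T, N, and the plotting grid as arbitrary choices) is:

% Numerical PSWF samples: build A(k,l) = s_T(k,l), then take singular vectors.
T = 10; N = 20; ks = -N:N; n = numel(ks);
sinc0 = @(t) sin(pi*t)./(pi*t + (t==0)) + (t==0);   % sinc with guarded zero
A = zeros(n);
for i = 1:n
    for j = 1:n
        A(i,j) = integral(@(t) sinc0(t-ks(i)).*sinc0(t-ks(j)), -T/2, T/2);
    end
end
[U,S,~] = svd(A);                      % columns of U: PSWF samples
lambda = diag(S);                      % about [T]+1 of these exceed 1/2
x = linspace(-15, 15, 1500)';
Phi = sinc0(x - ks)*U;                 % dense PSWFs: sum_k U(k,n) sinc(x-k)
plot(x, Phi(:,1:4));                   % first few approximate eigenfunctions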
Figs. 3.15-3.17. Numerically generated PSWFs (phi2, phi3, phi10); the entries of the partial sinc matrix $s_T$; and its eigenvalues.
$$\frac{d^2 x}{dt^2} + \frac{b}{m}\frac{dx}{dt} + \omega^2 x(t).$$
Damping proportional to velocity, or viscous damping, is an oversimplification; in many cases Coulomb or frictional damping is also significant. One can also call Hooke's law into question. For example, if one allows spring stiffness to depend slightly on displacement then one is quickly led to oscillators such as the Duffing oscillator
$$\frac{d^2 x}{dt^2} + \frac{b}{m}\frac{dx}{dt} + \omega^2 x(t) + \epsilon\, x(t)^3.$$
Thus, even for small amplitude vibrations and for the simplest possible nonlinearities, the result of treating oscillations as superpositions of pure sinusoids leads to nontrivial bandwidth. Finally, there is the possibility that a signal is generated by a coupled system of oscillators.
It is impossible to take all of these issues into account when trying to express a system or signal as a superposition of damped vibrations or a measurement thereof. In many applications it is more important that time-frequency analysis can point to fundamental changes over time in the oscillatory behavior of observed signals, as time-frequency representations attempt to do. But when a signal is fundamentally a superposition of a small number of oscillators, it makes sense to try to identify waveforms beyond sinusoids.
3.5.1 Defining the spectrum of a real signal
3.5.2 Instantaneous frequency and the Hilbert transform
At the beginning of this chapter we discussed the polar or amplitude-phase decomposition $z = re^{i\theta}$ of a complex signal $z(t) = x(t) + iy(t)$ with $r = \sqrt{x^2 + y^2}$ and $\theta = \arctan(y/x)$. At any time t, r represents the magnitude of z and $e^{i\theta(t)}$ represents the position of a point on the unit circle. In the ideal case $z = e^{i\omega t}$, with $\omega$ expressed in radians, the signal z oscillates at a rate of $\omega$ radians per unit time and we call $\omega$ the instantaneous frequency. When z is not a pure exponential we define the instantaneous phase at time t as $\theta(t)$, and $d\theta/dt$ is called the instantaneous frequency.
These are nice definitions but they suffer from a fundamental problem, which is that, ordinarily, measurable signals are real-valued. In order to make sense of instantaneous phase one has to make sense of the virtual complex signal companion of x(t).
When x is a square integrable signal of time there is a somewhat canonical way of manufacturing an analytic signal from x. This tool is called the Hilbert transform and it enters in the following way:
$$f \mapsto \hat{f} \mapsto \hat{f}\, \mathbf{1}_{[0,\infty)} \mapsto \big(\hat{f}\, \mathbf{1}_{[0,\infty)}\big)^\vee = \frac{1}{2}(I + iH) f.$$
The operator H is called the Hilbert transform and can be defined analytically by the integral formula
$$Hf(t) = \frac{1}{\pi}\, \mathrm{p.v.}\!\int \frac{f(s)}{t - s}\, ds.$$
Here p.v. stands for principal value meaning that the integral is rigorously defined by taking an appropriate limit near the singularity s = t in the denominator of the integrand. Corresponding discrete Hilbert
transforms can be defined for discrete and finite signals respectively.
Exercise 3.5.1. Give a reasonable definition of a Hilbert transform on CN .
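One reasonable answer mirrors the analytic-signal construction: multiply the DFT by $-i\,\mathrm{sgn}$ of the frequency. The sketch below (ours; the handling of the DC and Nyquist bins is a convention one must choose) does exactly that:

% Discrete Hilbert transform on C^N: multiply DFT coefficients by -i*sgn(k),
% where k runs over the frequencies -N/2..N/2-1 (DC and Nyquist left alone).
function y = dht(x)
    N = numel(x); X = fft(x(:));
    s = zeros(N,1);
    s(2:floor((N-1)/2)+1) = -1i;       % positive frequencies
    s(floor(N/2)+2:N)     =  1i;       % negative frequencies
    y = ifft(s.*X);
end

With this choice, z = x + 1i*dht(x) has DFT supported on the nonnegative frequencies, in analogy with the continuous analytic signal.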
The instantaneous frequency of f is that of its analytic extension $(I + iH)f/2$. While instantaneous frequency can thus be defined in rigorous mathematical terms, its physical interpretation is still problematic. For one thing, the phase $\theta(t)$ may not be a differentiable function of t. But just as critically, instantaneous frequency may not reflect the nature of the signal if it is composed of multiple oscillating components. We are still haunted by the fundamental tradeoff between localization in time and localization in frequency.
3.5.3 Monocomponent versus multicomponent
Instantaneous frequency only yields one value at any given time. This is fine if a signal consists of a
single oscillating component but not if several components are present. One way to quantify the amount of
frequency variation a physical signal possesses is in terms of bandwidth.
Writing $z(t) = r(t) e^{i\theta(t)}$ with $\|z\|_2 = 1$, the mean frequency can be computed directly from z:
$$\langle \omega\rangle = \Re \int \overline{z(t)}\, \frac{1}{i}\frac{dz}{dt}\, dt = \Re \int \Big(\frac{d\theta}{dt} - \frac{i}{r}\frac{dr}{dt}\Big)\, r^2(t)\, dt = \int \frac{d\theta}{dt}\, r^2(t)\, dt,$$
by equating real and imaginary parts. In this notation one can define the bandwidth in terms of instantaneous amplitude and frequency averages as
$$\sigma^2 = \frac{\langle \omega^2\rangle - \langle\omega\rangle^2}{\langle\omega\rangle^2} = \frac{1}{\langle\omega\rangle^2} \int (\omega - \langle\omega\rangle)^2\, |Z(\omega)|^2\, d\omega = \frac{1}{\langle\omega\rangle^2} \int \Big|\frac{1}{i}\frac{dz}{dt}(t) - \langle\omega\rangle z(t)\Big|^2 dt = \frac{1}{\langle\omega\rangle^2} \int \Big[\Big(\frac{dr}{dt}\Big)^2 + r^2(t)\Big(\frac{d\theta}{dt} - \langle\omega\rangle\Big)^2\Big]\, dt.$$
For a narrowband signal both terms of the last integral have to be small, meaning that both the amplitude and the instantaneous frequency have to vary slowly. In order that the Fourier transform be supported in $[0, \infty)$, one should also have $\sigma$ small compared with $\langle\omega\rangle$. However, this definition of bandwidth still takes global averages of local information, which can lead to negative frequencies.
When signals are generated by a Gaussian stationary process, meaning that random errors follow a normal distribution and the nature of the distribution does not change over time, it is possible to compute the expected number of zero crossings per unit time. If the average value of the signal is zero and the signal is essentially one oscillating component, it makes sense to define the frequency locally, approximately, as the number of zero crossings per unit time over a short time average, provided a local maximum or minimum intervenes between consecutive zero crossings. When the signal does not have an average value of zero, this approach can be misleading.
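A crude zero-crossing estimate takes only a few lines; in the sketch below (ours; frame length and test tone are arbitrary) each pair of crossings counts as one cycle:

% Local frequency estimate from zero crossings, in cycles per second.
fs = 1000; t = (0:1/fs:2)';
x  = sin(2*pi*40*t);                   % zero-mean test tone at 40 Hz
w  = 200;                              % frame length in samples (0.2 s)
nf = floor(numel(x)/w); est = zeros(nf,1);
for j = 1:nf
    seg = x((j-1)*w + (1:w));
    nc  = sum(abs(diff(sign(seg))) == 2);   % sign changes within the frame
    est(j) = (nc/2)/(w/fs);            % crossings/2 cycles per frame duration
end
disp(mean(est));                       % close to 40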
Exercise 3.5.2. Compute the instantaneous frequency of the analytic signal z(t) whose real part is $x(t) = a + \sin t$ for (i) a = 0, (ii) a = 2/3, (iii) a = 4/3. What does this say about using the analytic signal to define instantaneous frequency?
Exercise 3.6.1. (Easy) For which values of a is $x(t) = \sin t + a$ an IMF? Is $e^{-t}\sin t$ an IMF?
The term intrinsic mode function refers to the oscillation mode embedded in the data. An IMF is meant to have only one oscillation mode, although we will see that algorithms for extracting IMFs do not always succeed in this regard. But they do succeed in extracting riding waves that arise in the analytic extension of $a + \sin t$, for example. An IMF is not restricted to be narrowband and is typically nonstationary. For instance, any frequency modulated (FM) signal can be an IMF.
Instantaneous frequency of an IMF
Suppose that x(t) is an IMF and let $z(t) = r(t)e^{i\theta(t)}$ be its analytic extension. Then its Fourier transform is
$$Z(\nu) = \int r(t)\, e^{2\pi i \big(\frac{\theta(t)}{2\pi} - \nu t\big)}\, dt,$$
and by the principle of stationary phase the dominant contribution to $Z(\nu)$ comes from those times at which
$$\frac{d\theta}{dt} = 2\pi\nu.$$
Spectrogram intensity plots of sums of the first few IMFs are given in Figure 3.20. Each of the figures illustrates one nontrivial issue with EMD, namely that multiple overlapping frequencies can show up in a single IMF.
Fig. 3.18. First eight IMFs of bueller data (Bueller IMF 1 through Bueller IMF 8). EMD identified 13 IMFs but the last several have small amplitude.
Fig. 3.20. Sums of the first few IMFs (the first one, the first three, and the first four). The sum of the first four captures most of the spectrogram in Figure 3.7.
Fig. 3.21. Beating in the first IMF of the Bueller signal illustrates one limitation of EMD.
The following script (using ltfat) computes the DGT of a signal X, keeps only the coefficients whose magnitude exceeds the (100 - P)th percentile, and reconstructs from them:

% Threshold the top P percent of Gabor (DGT) coefficients of X and
% reconstruct; X and P are assumed to be in the workspace.
L=2^(floor(log2(length(X))));          % dyadic length
X=double(X(1:L));                      % floating point
p=(100-P)/100;                         % fraction of coefficients to discard

g=pgauss(L/4,L/16,L/8);                % Gabor window
a=8;                                   % default shift factor
M=L/a;                                 % number of channels
dg=candual(g,a,M);                     % dual window

tic                                    % start clock
[dgtX,Ls] = dgt(X,g,a,M);
disp('-----------');
disp('Forward DGT');
toc                                    % stop clock
disp('-----------');
s1=size(dgtX,1)
s2=size(dgtX,2)
s=s1*s2
figure;
imagesc(log(1+10*abs(flipud(dgtX(1:floor(s1/2),:)))));

tic
fcsort = sort(abs(dgtX(:)));           % coefficient magnitudes, ascending
fcerr = cumsum(fcsort.^2);             % cumulative energy of discarded coefficients
fcerr = flipud(fcerr);
fthresh = fcsort(floor(p*s));          % threshold at the (100-P)th percentile
cf_X = dgtX .* (abs(dgtX) > fthresh);  % keep only coefficients above threshold
disp('-----------');
disp('Sorting/thresholding');
toc
disp('-----------');
figure;
imagesc(log(1+10*abs(flipud(cf_X(1:floor(s1/2),:)))));

tic
idgt_X = idgt(cf_X,dg,a,Ls);
disp('-----------');
disp('Inverse DGT');
toc
disp('-----------');
figure;
y=real(idgt_X);
subplot(1,2,1);
plot(X);
axis([1 length(X) min(X) max(X)]);
title('original data');
subplot(1,2,2)
plot(y);
axis([1 length(X) min(X) max(X)]);
title('reconstruction from large Gabor coefficients');