SpeechLab, Department of Physics and Informatics, Institute of Physics at Sao Carlos, University of Sao Paulo.
Av. Trabalhador Sao Carlense 400 - 13566-590, Sao Carlos - SP, Brazil.
Phone: 55 - 16 - 33739789. e-mail: guido@ifsc.usp.br http://speechlab.ifsc.usp.br
Abstract: This paper introduces the paradigm of joint time-frequency-shape analysis of discrete-time signals by means of a novel transform called the Discrete Shapelet Transform (DST) (1). This approach allows one to find the time-support of frequencies and, at the same time, to analyze a shape. Applications are possible in a diversity of fields in signal processing, and the results assure the efficacy of the proposed transform.

(1) This work is funded by the State of Sao Paulo Research Foundation, Brazil, under process nr. 2005/00015-1.

I. Introduction and Motivation

Much research and literature have appeared recently describing techniques for time-frequency analysis of signals, with the Discrete Wavelet Transform (DWT) [1] being the most traditional and widely known tool for this purpose. In the same way, the field of pattern matching [2], and also wavelet-based pattern matching [3], have attracted much attention from the scientific community during the last decades, especially for speech recognition, biomedical signal processing, and computer vision, among other applications.

In order to improve the traditional time-frequency signal analysis approaches, this article proposes the concept of joint time-frequency-shape analysis of discrete-time signals. This approach is possible by means of the Discrete Shapelet Transform (DST), a novel transform used to find the time-support of frequencies and, at the same time, match patterns. Shapelet is inspired by two works: Spikelet and Speechlet. The former, described in the awarded paper [4], relates to the creation of a novel wavelet transform that distinguishes among seven patterns of spikes and overlaps from H1, a motion-sensitive neuron of the fly's visual system. The latter corresponds to the adaptation of Spikelet to distinguish among patterns of normal and pathologically affected voice signals. Although these experiments were successful, the last two transforms could not assure the efficacy of the pattern matching for a wide diversity of input signals, due to the use of the Least-Squares adjustment, which somewhat disturbs the shape matching. The adequate conditions for vanishing moments and perfect reconstruction [1] could also not be achieved. These drawbacks justify the present research to improve the previous ideas and create the DST.

The remainder of this work is organized as follows. Section II details the construction of Shapelets, where the basic concepts of wavelet transforms are also reviewed, since DSTs are inspired by DWTs. DSTs are used to match patterns according to the algorithms described in section III. An example Shapelet is designed in section IV. Applications, tests and comparisons are presented in section V and, lastly, section VI presents the conclusions.

II. The Construction of Shapelets

The elements that form the Shapelet structure, which are similar to the elements of the DWT structure, are:

- a low-pass half-band finite impulse response (FIR) [1] filter, $p[n]$, which has a frequency response $P[\omega]$ so that $P[\omega = \pi] = 0$, and a phase response that is not necessarily linear;
- the corresponding FIR high-pass mirror filter, $q[n]$, defined as $q_k = (-1)^k p_{N-k-1}$;
- the FIR filter $\bar{p}[n]$, defined as $\bar{p}_k = p_{N-k-1}$;
- the FIR filter $\bar{q}[n]$, defined as $\bar{q}_k = (-1)^{k+1} p_k$;
- the support-size of the filters, i.e., the number of coefficients or the length, $N$, ($N \geq 4$), that must be even;
- the minor and major shapelet functions, as discussed later.

The filters $p[n]$ and $q[n]$ form the analysis filter pair, and the filters $\bar{p}[n]$ and $\bar{q}[n]$ form the synthesis filter pair. The former pair is used to transform the original signal from the time-domain to the time-frequency-shape domain, and the latter pair is used to invert the transformation, restoring the original signal from the transformed one. The requirements above for $p[n]$, $q[n]$, $\bar{p}[n]$, $\bar{q}[n]$, and $N$ force the filters to form a perfect-reconstruction filter bank (PRFB) [1], i.e., the anti-aliasing conditions, in equation 1, and the no-distortion condition, in equation 2, both in the z-domain, are satisfied.

$$\bar{P}[z] = Q[-z] , \qquad \bar{Q}[z] = -P[-z] . \qquad (1)$$

$$P[z]\bar{P}[z] + Q[z]\bar{Q}[z] = 2z^{-N+1} . \qquad (2)$$

Once the coefficients of one of the FIR filters specified above are determined, Shapelet can be completely defined. To determine them, the next algorithm, with five sequences of steps, has to be followed: sequence A, sequence B, sequence C, sequence D (D1 and D2) and sequence E. Sequences A, B, C, and D can happen in parallel, but E requires the completion of the previous sequences to be carried out. A detailed description of A, B, C, D, and E follows in order to determine $q[n]$.

Step A: Let the filter have unitary energy, i.e.,

$$\sum_{k=0}^{N-1} q_k^{2} = 1 . \qquad (3)$$
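The relations among the four filters and conditions (1) to (3) can be checked numerically. The sketch below is only an illustration and not the authors' implementation: the names `shapelet_filter_bank` and `is_perfect_reconstruction` are hypothetical, NumPy is assumed, and the example coefficients are the length-4 filter p[n] of section IV, written with the signs that make the design equations of that section hold (an assumption made here).

```python
import numpy as np

# Example low-pass filter p[n]: magnitudes from section IV; the signs are an
# assumption chosen so that the design equations of that section are satisfied.
P_EXAMPLE = np.array([0.43255112987095045, 0.84727330834929903,
                      0.27455565131559717, -0.14016652716275140])


def shapelet_filter_bank(p):
    """Derive q, p_bar and q_bar from the low-pass filter p (section II)."""
    p = np.asarray(p, dtype=float)
    n_taps = len(p)
    if n_taps < 4 or n_taps % 2 != 0:
        raise ValueError("the support size N must be even and at least 4")
    k = np.arange(n_taps)
    q = (-1.0) ** k * p[::-1]          # q_k = (-1)^k p_{N-k-1}
    p_bar = p[::-1].copy()             # p_bar_k = p_{N-k-1}
    q_bar = (-1.0) ** (k + 1) * p      # q_bar_k = (-1)^(k+1) p_k
    return q, p_bar, q_bar


def is_perfect_reconstruction(p, q, p_bar, q_bar, tol=1e-9):
    """Check the anti-aliasing and no-distortion conditions on the coefficients."""
    p = np.asarray(p, dtype=float)
    n_taps = len(p)
    alt = (-1.0) ** np.arange(n_taps)  # coefficients of H[-z] are (-1)^k h_k
    # Alias term P[-z]P_bar[z] + Q[-z]Q_bar[z]; it vanishes when equation 1 holds.
    alias = np.convolve(alt * p, p_bar) + np.convolve(alt * q, q_bar)
    # Distortion term P[z]P_bar[z] + Q[z]Q_bar[z]; equation 2 asks for 2 z^(-N+1).
    distortion = np.convolve(p, p_bar) + np.convolve(q, q_bar)
    target = np.zeros(2 * n_taps - 1)
    target[n_taps - 1] = 2.0
    return bool(np.allclose(alias, 0.0, atol=tol)
                and np.allclose(distortion, target, atol=tol))


q, p_bar, q_bar = shapelet_filter_bank(P_EXAMPLE)
print("unit energy (equation 3):", np.isclose(np.sum(q ** 2), 1.0))
print("perfect reconstruction:", is_perfect_reconstruction(P_EXAMPLE, q, p_bar, q_bar))
```

Polynomial products in z^-1 are computed with `np.convolve`, so the check works directly on the coefficient vectors without evaluating the transfer functions at particular values of z.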
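$$B[\,\cdot\,] = \begin{bmatrix} b_0 & b_1 & b_2 & b_3 & \cdots & b_{n-2} & b_{n-1} \end{bmatrix}^{T} , \qquad C[\,\cdot\,] = \begin{bmatrix} c_0 & c_1 & c_2 & \cdots & c_{n-2} & c_{n-1} \end{bmatrix}^{T} .$$

The major and minor shapelet functions, $\phi(x)$ and $\psi(x)$, are similar to the father and mother functions [1] of the DWT, respectively. Neither $\phi$ nor $\psi$ is required to calculate the DST of a signal; however, they play an important role in terms of signal decomposition, since the jth level DST of a signal $f(x)$ consists of writing it as the linear combination of these two functions, according to equation 7.

$$f(x) = \sum_{k} \langle f, \phi_{j,k}(x) \rangle \, \phi_{j,k}(x) + \sum_{t=1}^{j} \sum_{k} \langle f, \psi_{t,k}(x) \rangle \, \psi_{t,k}(x) , \qquad (7)$$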
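Equation 7 describes one approximation band at level j plus the detail bands of levels 1 to j. The sketch below illustrates that structure under two assumptions that the text does not state: the coefficients are computed with the same pyramid (filter and downsample) algorithm commonly used for the DWT [1], and the signal is extended periodically at its borders. All function names are hypothetical, and the example filter uses the section IV magnitudes with signs assumed so that the filter is orthogonal.

```python
import numpy as np

# Example low-pass filter p[n]; the signs are an assumption (see lead-in).
P_EXAMPLE = np.array([0.43255112987095045, 0.84727330834929903,
                      0.27455565131559717, -0.14016652716275140])


def analyze(x, h):
    """Circular convolution of x with h, keeping the even-indexed outputs."""
    length, n_taps = len(x), len(h)
    idx = (2 * np.arange(length // 2)[:, None] - np.arange(n_taps)[None, :]) % length
    return x[idx] @ h


def synthesize(approx, detail, p_bar, q_bar):
    """Upsample by two and filter with the synthesis pair, undoing the N-1 delay."""
    length, n_taps = 2 * len(approx), len(p_bar)
    x = np.zeros(length)
    for k in range(len(approx)):
        for j in range(n_taps):
            x[(2 * k + j - (n_taps - 1)) % length] += (p_bar[j] * approx[k]
                                                       + q_bar[j] * detail[k])
    return x


def decompose(x, p, levels):
    """Return the level-`levels` approximation and the details of levels 1..levels."""
    p = np.asarray(p, dtype=float)
    q = (-1.0) ** np.arange(len(p)) * p[::-1]      # analysis high-pass filter
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        details.append(analyze(approx, q))
        approx = analyze(approx, p)
    return approx, details


def reconstruct(approx, details, p):
    """Invert `decompose` with the synthesis pair p_bar, q_bar."""
    p = np.asarray(p, dtype=float)
    p_bar = p[::-1]
    q_bar = (-1.0) ** (np.arange(len(p)) + 1) * p
    for detail in reversed(details):
        approx = synthesize(approx, detail, p_bar, q_bar)
    return approx


signal = np.sin(2 * np.pi * np.arange(32) / 32.0)   # length divisible by 2**levels
approx, details = decompose(signal, P_EXAMPLE, levels=3)
print(np.allclose(reconstruct(approx, details, P_EXAMPLE), signal))
```

The signal length must be divisible by 2 raised to the number of levels. Other border treatments are possible, and they would change the coefficients near the borders.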
III. Pattern Matching with Shapelets

A. The time-frequency-shape approach

As defined above, the DST of a discrete-time signal $f[n]$ generates another discrete signal that belongs to a time-frequency-shape plane. This means that it is possible to use the DST to find two basic relationships: the time-supports of frequencies, and the time-supports of a particular shape contained in $f[n]$. In particular, it is possible to discover the time interval in which a certain shaped signal exists inside a certain sub-band of frequencies. The former relation can be found by using exactly the same procedure adopted with the DWT. Since this procedure is widely known and can be easily found in the literature [1], it is not described here. The latter relation is the focus of this section, which has two basic objectives: to measure the degree of similarity between an unknown signal and a signal with a particular shape, and to classify a pattern among several templates. The algorithms in the next two subsections carry out each one of these tasks.

B. The algorithm for shape analysis

The algorithm described below measures the degree of similarity between a discrete-time unknown signal, $u[n]$, and a signal of interest, $m[n]$, in a particular sub-band of frequencies.

BEGINNING
Step SA-1: Create one Shapelet $S$, i.e., one set of filters, considering $m[n]$ as being the matching signal;
Step SA-2: Calculate the $l$th level DST of $u[n]$, with $l$ chosen in accordance with the sub-band of interest;
Step SA-3: Obtain the similarity rate, $R = (1 - |\alpha - \beta|) \cdot 100\%$, ($0 \leq R \leq 100\%$), where $\alpha$ and $\beta$ are, respectively, the fractal dimension of $u[n]$ and the fractal dimension of DST($u[n]$). The higher the value of $R$, the more similar $u[n]$ and $m[n]$ are in the particular band of frequencies being considered;
END.

C. The algorithm for pattern classification

The algorithm to use the DST for pattern classification follows. To carry it out, a set of $J$ template models, i.e., $\Gamma = \{\gamma_1, \gamma_2, ..., \gamma_J\}$, and the input signal, $U$, which is of unknown pattern, are assumed.

BEGINNING
Step PC-1: Create one Shapelet $S_i$, i.e., one set of filters, for each $\gamma_i$, ($1 \leq i \leq J$);
Step PC-2: Calculate $J$ $l$th level DSTs of $U$, each one using one $S_i$ obtained in step PC-1, with $l$ chosen in accordance with the sub-band of interest;
Step PC-3: For each DST, obtain one $R_i = (1 - |\alpha_i - \beta_i|) \cdot 100\%$, ($0 \leq R_i \leq 100\%$), considering the matched pattern as the one with the highest score in the comparison, i.e., the highest value for $R_i$;
END.

IV. Example: designing a Shapelet

In this section, a Shapelet system with support-size 4 is designed to match the signal $m[n]$ that is shown in figure 1 together with the numerical values (x = sample; y = amplitude) for each one of its 32 points. The length of $m[n]$ is arbitrary: 32 is merely the example used in this case.

(0; 4533)   (1; 5562)   (2; 6517)   (3; 7358)
(4; 7858)   (5; 7278)   (6; 4903)   (7; 187)
(8; 6648)   (9; 14330)  (10; 21331) (11; 26532)
(12; 29400) (13; 29865) (14; 28071) (15; 24728)
(16; 20572) (17; 15930) (18; 11207) (19; 6682)
(20; 2644)  (21; 807)   (22; 3627)  (23; 5771)
(24; 7318)  (25; 8302)  (26; 8771)  (27; 8772)
(28; 8421)  (29; 7822)  (30; 7044)  (31; 6222)

Fig. 1: the matching signal m[n] (x: sample; y: amplitude).

A. Designing the filter bank

According to the algorithm presented in section II, the following system of equations is obtained for the example signal $m[n]$:

$$\begin{cases} q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1 \\ q_0 q_2 + q_1 q_3 = 0 \\ q_0 + q_1 + q_2 + q_3 = 0 \\ D(q_0, q_1, q_2, q_3, m[n]) = 1.24638 \end{cases}$$

where $D(q_0, q_1, q_2, q_3, m[n])$ is the fractal dimension of DST($m[n]$), which is an equation in the unknowns $q_0, ..., q_3$, and 1.24638 is the fractal dimension of $m[n]$. The system has the solution:

q0 = -0.14016652716275140    p0 = 0.43255112987095045
q1 = -0.27455565131559717    p1 = 0.84727330834929903
q2 = 0.84727330834929903     p2 = 0.27455565131559717
q3 = -0.43255112987095045    p3 = -0.14016652716275140

The next two subsections show how to find $\phi$ and $\psi$ by using the filter coefficients, with $\sum_{k=0}^{N-1} p_k = 2$ being the normalization required to do so, since the area under the major shapelet is unitary.

B. The major shapelet

The major shapelet, $\phi(x)$, defined recursively using the dilation equation $\phi(n) = \sum_{k} p_k \phi(2n - k)$ for a system of support $N = 4$, does not exist outside the interval $[0, 3]$. We therefore get:

$$\begin{aligned} \phi(0) &= p_0 \phi(0) \\ \phi(1) &= p_0 \phi(2) + p_1 \phi(1) + p_2 \phi(0) \\ \phi(2) &= p_1 \phi(3) + p_2 \phi(2) + p_3 \phi(1) \\ \phi(3) &= p_3 \phi(3) \end{aligned}$$

or $MT = T$, with

$$M = \begin{bmatrix} p_0 & 0 & 0 & 0 \\ p_2 & p_1 & p_0 & 0 \\ 0 & p_3 & p_2 & p_1 \\ 0 & 0 & 0 & p_3 \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} \phi(0) \\ \phi(1) \\ \phi(2) \\ \phi(3) \end{bmatrix} .$$

So, matrix $T$ with scaling function values is the eigenvector of $M$ corresponding to eigenvalue 1. Using the normalizing condition $\sum_{k} \phi(k) = 1$, we get:

$$\begin{cases} (p_0 - 1)\phi(0) = 0 \\ p_2 \phi(0) + (p_1 - 1)\phi(1) + p_0 \phi(2) = 0 \\ p_3 \phi(1) + (p_2 - 1)\phi(2) + p_1 \phi(3) = 0 \\ (p_3 - 1)\phi(3) = 0 \\ \phi(0) + \phi(1) + \phi(2) + \phi(3) = 1 \end{cases} \quad \overset{(*)}{\Longrightarrow} \quad \begin{cases} \phi(0) = \phi(3) = 0 \\ (p_1 - 1)\phi(1) + p_0 \phi(2) = 0 \\ p_3 \phi(1) + (p_2 - 1)\phi(2) = 0 \\ \phi(1) + \phi(2) = 1 \end{cases}$$

The transformation $(*)$ holds because $p_0 \neq 1$ and $p_3 \neq 1$, which forces $\phi(0) = \phi(3) = 0$. The solution of this last system gives the response for the integer points, whereas the intermediate points satisfy $\phi(\frac{x}{2}) = \sum_{k} p_k \phi(x - k)$. Thus:

$$\begin{aligned} \phi(\tfrac{1}{2}) &= p_0 \phi(1) \\ \phi(\tfrac{3}{2}) &= p_1 \phi(2) + p_2 \phi(1) \\ \phi(\tfrac{5}{2}) &= p_3 \phi(2) \end{aligned}$$

For $p[n]$ designed above, normalized so that $\sum_{k} p_k = 2$, we get:

$\phi(0) = 0$ ; $\phi(\frac{1}{2}) = 0.9050$ ; $\phi(1) = 1.4794$ ; $\phi(\frac{3}{2}) = 0$ ; $\phi(2) = -0.4794$ ; $\phi(\frac{5}{2}) = 0.0950$ ; $\phi(3) = 0$.
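The integer and half-integer samples of the major shapelet can also be computed numerically from the equations of this subsection. The following sketch is a hedged illustration rather than the authors' procedure: the function name `major_shapelet_samples` is hypothetical, NumPy is assumed, the filter is rescaled inside the function so that its coefficients sum to 2, and the eigenvector problem MT = T with the normalizing condition is solved as one least-squares system, which is a choice made here for brevity. The example filter uses the section IV magnitudes with signs assumed so that the design equations hold.

```python
import numpy as np

# Example low-pass filter p[n]; the signs are an assumption (see lead-in).
P_EXAMPLE = np.array([0.43255112987095045, 0.84727330834929903,
                      0.27455565131559717, -0.14016652716275140])


def major_shapelet_samples(p):
    """Return (integer samples T, dict of half-integer samples) of the major shapelet."""
    p = np.asarray(p, dtype=float)
    c = 2.0 * p / p.sum()                 # normalization: coefficients sum to 2
    n_taps = len(c)

    # M[n, m] = c[2n - m], so that (M T)[n] = sum_k c_k * phi(2n - k).
    m_matrix = np.zeros((n_taps, n_taps))
    for n in range(n_taps):
        for m in range(n_taps):
            k = 2 * n - m
            if 0 <= k < n_taps:
                m_matrix[n, m] = c[k]

    # Stack (M - I) T = 0 with the normalizing condition sum_k phi(k) = 1.
    system = np.vstack([m_matrix - np.eye(n_taps), np.ones((1, n_taps))])
    rhs = np.zeros(n_taps + 1)
    rhs[-1] = 1.0
    integer_samples, *_ = np.linalg.lstsq(system, rhs, rcond=None)

    def phi_at_integer(m):
        return integer_samples[m] if 0 <= m < n_taps else 0.0

    # Intermediate points: phi(x / 2) = sum_k c_k * phi(x - k) for odd x.
    half_samples = {
        x / 2.0: sum(c[k] * phi_at_integer(x - k) for k in range(n_taps))
        for x in range(1, 2 * (n_taps - 1), 2)
    }
    return integer_samples, half_samples


integer_samples, half_samples = major_shapelet_samples(P_EXAMPLE)
print("phi at 0, 1, 2, 3:", np.round(integer_samples, 4))
for x, value in sorted(half_samples.items()):
    print(f"phi({x}) = {value:.4f}")
```

Repeating the last step on the refined grid (quarter points, eighth points, and so on) would give progressively denser samples of the function.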
C. The minor shapelet

Using the equation $\psi(n) = \sum_{k} q_k \phi(2n - k)$, we get:

$$\begin{aligned} \psi(0) &= q_0 \phi(0) \\ \psi(1) &= q_0 \phi(2) + q_1 \phi(1) + q_2 \phi(0) \\ \psi(2) &= q_1 \phi(3) + q_2 \phi(2) + q_3 \phi(1) \\ \psi(3) &= q_3 \phi(3) \end{aligned}$$

so that $\psi(0) = \psi(3) = 0$.

Fig. 2. [left]: major shapelet $\phi$ ; [right]: minor shapelet $\psi$ ; (x: sample ; y: amplitude).

V. Tests and comparisons

A. Shape analysis for biological data

The algorithm presented in section III-B and the shapelet designed in the last section were used to measure the degree of similarity between the input signal $m[n]$, i.e., the signal shown in figure 1, and 7 other patterns extracted from biological data [4], which are shown in figure 3. Clearly, the signals P1 and P7 are, respectively, the most similar (almost identical, in this case) and the most different patterns in comparison to $m[n]$. The results presented in table I indicate that the proposed technique identified these patterns.

-    P1    P2    P3    P4    P5    P6    P7
R    98%   96%   94%   90%   84%   83%   80%

TABLE I. Results for shape analysis (R: similarity rate).

B. Pattern classification for biological data

The 7 signals shown in figure 3 were also used for pattern classification, according to the algorithm presented in section III-C. In particular, one shapelet $S_i$ was designed for each $P_i$ and 7 tests were performed. For each test $T_i$, the input signal $P_i$ was compared to the set of signals $\{P_1, ..., P_7\}$ by using $S_i$, in the first level of decomposition. The results shown in table II demonstrate the efficacy of the DST in each test.

C. Pattern classification for speech data

DSTs were used for pattern classification among short speech utterances sampled at 8000 Hz, 16-bit, particularly the phonemes /a/, /e/, /i/, /o/, and /u/ extracted from the words dogma, men, ship, boy, and super, respectively. The same procedure presented in sub-section V-B was carried out, and the experiment was repeated with 8 speakers, 4 male and 4 female, aged from 20 to 60. In all the tests, the pattern was correctly classified.

Fig. 3. The 7 patterns used during the tests. Their names are, from left to right: P1, ..., P7.

D. Comparisons and discussion

Dynamic Time Warping (DTW) [2] is one of the most used algorithms for pattern matching. The proposed approach (DST) is much faster than DTW, since the latter algorithm has a higher order of complexity in relation to the lengths of the input and the templates; furthermore, DTW was not designed for joint time-frequency-shape analysis. In particular, no other well-known pattern matching approach was designed for this joint analysis. In regard to the DST, the support-size of the filters must be chosen according to the requirements for frequency selectivity and time resolution, exactly as in the DWT [1]. In the same way, the level of decomposition can be chosen according to the sub-band frequency of interest. For pattern matching and shape analysis, the inverse DST (IDST) is not required; however, depending on the specific application, for instance joint pattern matching and compression, where some coefficients of the transformed signal are modified or discarded, the IDST is needed, since the original signal must be reconstructed. This justifies the design of orthogonal PRFBs and the specification of $\phi(x)$ and $\psi(x)$.

VI. Conclusions

This paper introduced the Discrete Shapelet Transform, a novel tool for joint time-frequency-shape analysis of discrete-time signals, which extends the traditional wavelet transform so that a shape analysis becomes possible. The results obtained for pattern matching and shape analysis show the efficacy of the proposed approach, which is much faster than other similar techniques.

References

[1] Strang, G.; Nguyen, T. Wavelets and Filter Banks. Wellesley-Cambridge Press, Wellesley, 1997.
[2] Theodoridis, S.; Koutroumbas, K. Pattern Recognition, 3rd ed. Academic Press, 2006.
[3] Chapa, J. O.; Rao, R. M. Algorithms for designing wavelets to match a specified signal. IEEE Trans. on Signal Processing, v. 48, n. 12, Dec. 2000, pp. 3395-3406.
[4] Guido, R. C.; Slaets, J. F. W.; Koberle, R.; Almeida, L. O. B.; Pereira, J. C. A new technique to construct a wavelet transform matching a specified signal with applications to digital, real-time, spike and overlap pattern recognition. Digital Signal Processing, v. 16, n. 1, Jan. 2006, pp. 24-44.
[5] Al-Akaidi, M. Fractal Speech Processing. Cambridge University Press, Cambridge, 2004.