Pattern Recognition Matlab Manual


Aggelos Pikrakis, Sergios Theodoridis, Konstantinos Koutroumbas and Dionisis Cavouras

February 2009

Copyright 2009, Elsevier Inc


Contents

1 Preface

2 Clustering
   2.1  BSAS.m
   2.2  reassign.m
   2.3  GMDAS.m
   2.4  gustkess.m
   2.5  k_means.m
   2.6  possibi.m
   2.7  k_medoids.m
   2.8  spectral_Ncut2.m
   2.9  spectral_Ncut_gt2.m
   2.10 spectral_Ratiocut2.m
   2.11 spectral_Ratiocut_gt2.m
   2.12 competitive.m
   2.13 valley_seeking.m
   2.14 distant_init.m
   2.15 rand_data_init.m
   2.16 rand_init.m
   2.17 cost_comput.m
   2.18 dgk.m
   2.19 dist.m

3 Feature Selection
   3.1  simpleOutlierRemoval.m
   3.2  normalizeStd.m
   3.3  normalizeMnmx.m
   3.4  normalizeSoftmax.m
   3.5  ROC.m
   3.6  divergence.m
   3.7  divergenceBhata.m
   3.8  ScatterMatrices.m
   3.9  ScalarFeatureSelectionRanking.m
   3.10 SequentialBackwardSelection.m
   3.11 SequentialForwardSelection.m
   3.12 exhaustiveSearch.m

4 Image Features
   4.1  generateCoOccMat.m
   4.2  CoOccMatFeatures.m
   4.3  CoOccASM.m
   4.4  CoOccContrast.m
   4.5  CoOccCOR.m
   4.6  CoOccVariance.m
   4.7  CoOccIDM.m
   4.8  CoOccSUA.m
   4.9  CoOccSUV.m
   4.10 CoOccSUE.m
   4.11 CoOccEntropy.m
   4.12 CoOccDEN.m
   4.13 CoOccDVA.m
   4.14 CoOccCIMI.m
   4.15 CoOccCIMII.m
   4.16 CoOccPXandPY.m
   4.17 CoOccPXminusY.m
   4.18 CoOccPxplusY.m
   4.19 ImageHist.m
   4.20 HistMoments.m
   4.21 HistCentralMoments.m
   4.22 LawMasks.m
   4.23 RL_0_90.m
   4.24 RL_45_135.m
   4.25 SRE.m
   4.26 LRE.m
   4.27 GLNU.m
   4.28 RLNU.m
   4.29 RP.m

5 Audio Features
   5.1  sfSpectralCentroid.m
   5.2  sfSpectralRolloff.m
   5.3  sfFundAMDF.m
   5.4  sfFundAutoCorr.m
   5.5  sfFundCepstrum.m
   5.6  sfFundFreqHist.m
   5.7  sfMFCCs.m
   5.8  computeMelBank.m
   5.9  stEnergy.m
   5.10 stZeroCrossingRate.m
   5.11 stSpectralCentroid.m
   5.12 stSpectralRolloff.m
   5.13 stSpectralFlux.m
   5.14 stFundAMDF.m
   5.15 stMelCepstrum.m
   5.16 stFundFreqHist.m
   5.17 stFundAutoCorr.m
   5.18 stFundCepstrum.m
   5.19 stFourierTransform.m

6 Dynamic Time Warping
   6.1  editDistance.m
   6.2  DTWSakoe.m
   6.3  DTWSakoeEndp.m
   6.4  DTWItakura.m
   6.5  DTWItakuraEndp.m
   6.6  BackTracking.m


1 Preface

The Matlab m-files that are provided here are intended to satisfy pedagogical needs. We have not covered all the algorithms that are described in the book. We have focused on the most popular methods, as well as on methods that can help the reader become familiar with the basic concepts associated with the different methodologies. All the m-files have been developed by ourselves, and we have not included (with very few exceptions) code that is already available in the Matlab toolboxes, e.g., the Statistics Toolbox and the Image Processing Toolbox. Examples are the routines related to Support Vector Machines, the k-NN classifier, etc.

Currently, a companion book is being developed, including short descriptions of theory as well as a number of Matlab exercises. This book will be based on the routines given here, as well as on those available in the Matlab toolboxes.

We would appreciate any comments from the readers, and we are happy to try to accommodate them.

A. Pikrakis
S. Theodoridis
K. Koutroumbas
D. Cavouras


2 Clustering

2.1 BSAS.m

Syntax: [bel, m]=BSAS(X,theta,q,order)

Description: This function implements the BSAS (Basic Sequential Algorithmic Scheme) algorithm. It performs a single pass on the data. If the currently considered vector lies at a significant distance (greater than a given dissimilarity threshold) from the clusters formed so far, a new cluster is formed with the current vector as its representative. Otherwise, the considered vector is assigned to its closest cluster. The results of the algorithm are influenced by the order of presentation of the data.

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- theta: the dissimilarity threshold.
- q: the maximum allowable number of clusters.
- order: an N-dimensional vector containing a permutation of the integers 1, 2, ..., N. Its i-th element specifies the order of presentation of the i-th vector to the algorithm.

Output:
- bel: an N-dimensional vector whose i-th element indicates the cluster to which the i-th data vector has been assigned.
- m: a matrix, each column of which contains the l-dimensional (mean) representative of a cluster.
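A minimal usage sketch on synthetic data follows (the data set and the parameter values are illustrative, not part of the manual; theta, in particular, is data dependent):

    % Two well-separated 2-D point clouds, one data vector per column.
    randn('state',0);
    X = [randn(2,50), 5+randn(2,50)];
    order = randperm(100);            % random order of presentation
    theta = 2.5;                      % dissimilarity threshold (illustrative value)
    q = 10;                           % maximum allowable number of clusters
    [bel, m] = BSAS(X, theta, q, order);

Running BSAS with several values of theta and several orderings is a common way to probe the sensitivity of the resulting clustering.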

2.2 reassign.m

Syntax: [bel]=reassign(X,m,order)

Description: This function performs a single pass on the data set and reassigns the data vectors to their closest clusters, taking into account their distances from the cluster representatives. It may be applied to the clustering produced by BSAS in order to obtain more compact clusters.

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- m: the matrix whose columns contain the l-dimensional (mean) representatives of the clusters.
- order: an N-dimensional vector containing a permutation of the integers 1, 2, ..., N. Its i-th element specifies the order of presentation of the i-th vector to the algorithm.

Output:
- bel: an N-dimensional vector whose i-th element indicates the cluster to which the i-th data vector has been assigned.


2.3 GMDAS.m

Syntax: [ap,cp,mv,mc,iter,divec]=GMDAS(X,mv,mc,e,maxiter,sed)

Description: This function implements the GMDAS (Generalized Mixture Decomposition Algorithmic Scheme) algorithm, where each cluster is characterized by a normal distribution. The aim is to estimate the means and the covariance matrices of the distributions characterizing each cluster, as well as the a priori probabilities of the clusters. This is carried out in an iterative manner, which terminates when no significant change in the values of the above parameters is encountered between two successive iterations. Once more, the number of clusters m is assumed to be known.

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- mv: an l x m matrix whose i-th column contains an initial estimate of the mean of the i-th cluster.
- mc: an l x l x m matrix whose i-th l x l two-dimensional slice is an initial estimate of the covariance matrix of the i-th cluster.
- e: the threshold that controls the termination of the algorithm. Specifically, the algorithm terminates when the sum of the absolute differences of the mv's, mc's and a priori probabilities between two successive iterations is smaller than e.
- maxiter: the maximum number of iterations the algorithm is allowed to run.
- sed: the seed used for the initialization of the random generator function rand.

Output:
- ap: an m-dimensional vector whose i-th coordinate contains the a priori probability of the i-th cluster.
- cp: an N x m matrix whose (i, j) element contains the probability that the i-th vector belongs to the j-th cluster.
- mv: an l x m matrix whose i-th column contains the final estimate of the mean of the i-th cluster.
- mc: an l x l x m matrix whose i-th l x l two-dimensional slice is the final estimate of the covariance matrix of the i-th cluster.
- iter: the number of iterations performed by the algorithm.
- divec: a vector whose i-th coordinate contains the sum of the absolute differences of the mv's, mc's and a priori probabilities between the i-th and the (i-1)-th iteration.
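A usage sketch; the initial means are drawn from the data via rand_data_init (Section 2.15) and the initial covariances are set to identity slices. These initialization choices, and all parameter values, are illustrative:

    X = [randn(2,50), 5+randn(2,50)];      % synthetic data, one vector per column
    mv = rand_data_init(X, 2, 0);          % initial means: 2 randomly chosen data vectors
    mc = repmat(eye(2), [1 1 2]);          % initial covariances: one 2x2 identity per cluster
    [ap, cp, mv, mc, iter, divec] = GMDAS(X, mv, mc, 1e-5, 100, 0);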

2.4 gustkess.m

Syntax: [u,c,S,iter]=gustkess(X,u,c,S,q,e)

Description: This function implements the Gustafson-Kessel algorithm, an algorithm of fuzzy nature that is able to unravel planar clusters. Once more, the number of clusters, m, is a prerequisite for the algorithm. The j-th cluster is represented by a center c(:,j) and a covariance matrix S(:,:,j). The distance of a point X(:,i) from the j-th cluster is a weighted form of the Mahalanobis distance and is a function of X(:,i), c(:,j) and S(:,:,j). The algorithm aims at grouping points that lie around a line (a hyperplane, in general) into the same cluster, via iterative adjustment of the cluster parameters (centers and covariances).

Input:
- X: an l x N matrix, each column of which corresponds to a data vector.
- u: an N x m matrix whose (i, j) element is an initial estimate of the grade of membership of the i-th data vector to the j-th cluster (all elements of u lie in the interval [0, 1] and the entries of each row sum up to unity).
- c: an l x m matrix whose j-th column is an initial estimate of the center of the j-th cluster.
- S: an l x l x m matrix whose j-th l x l two-dimensional slice is an initial estimate of the covariance matrix of the j-th cluster.
- q: the fuzzifier parameter.
- e: the parameter used in the termination criterion of the algorithm (the algorithm terminates when the sum of the absolute differences of the u's between two successive iterations is less than e).

Output:
- u: an N x m matrix with the final estimates of the grades of membership of each vector to each cluster.
- c: an l x m matrix with the final estimates of the centers of the clusters.
- S: an l x l x m matrix whose j-th l x l two-dimensional slice is the final estimate of the covariance matrix of the j-th cluster.
- iter: the number of iterations performed by the algorithm.

NOTE: This function calls the dgk.m function for the computation of the distance of a point from a cluster.

2.5 k_means.m

Syntax: [w,bel]=k_means(X,w)

Description: This function implements the k-means algorithm, which requires as input the number of clusters underlying the data set. The algorithm starts with an initial estimate of the cluster representatives and iteratively tries to move them into regions that are dense in data vectors, so that a suitable cost function is minimized. This is achieved by performing (usually) a few passes on the data set. The algorithm terminates when the values of the cluster representatives remain unaltered between two successive iterations.

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- w: a matrix whose columns contain the l-dimensional (mean) representatives of the clusters.

Output:
- w: a matrix whose columns contain the final estimates of the representatives of the clusters.
- bel: an N-dimensional vector whose i-th element indicates the cluster to which the i-th data vector has been assigned.
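A usage sketch; the initial representatives are chosen here with rand_data_init (Section 2.15), although any of the initialization routines of this chapter could be used instead:

    X = [randn(2,100), 6+randn(2,100)];    % two groups, one vector per column
    w_ini = rand_data_init(X, 2, 0);       % 2 data vectors as initial representatives
    [w, bel] = k_means(X, w_ini);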


2.6 possibi.m

Syntax: [U,w]=possibi(X,m,eta,q,sed,init_proc,e_thres)

Description: Implements the possibilistic clustering algorithm, when the squared Euclidean distance is adopted. The algorithm iteratively moves the cluster representatives to regions that are dense in data, so that a suitable cost function is minimized. It terminates when no significant difference in the values of the representatives is encountered between two successive iterations. Once more, the number of clusters is required a priori. However, when it is overestimated, the algorithm has the ability to return a solution where more than one representative coincide.

Input:
- X: an l x N matrix, each column of which corresponds to a data vector.
- m: the number of clusters.
- eta: an m-dimensional array of the eta parameters of the clusters.
- q: the q parameter of the algorithm. When this is not equal to 0, the original cost function is considered, while if it is 0 the alternative one is considered.
- sed: a scalar integer, which is used as the seed for the random generator function rand.
- init_proc: an integer taking the values 1, 2 or 3, with
  - 1 corresponding to the rand_init.m initialization procedure (this procedure chooses randomly m vectors from the smallest hyperrectangle that contains all the vectors of X and whose sides are parallel to the axes),
  - 2 corresponding to rand_data_init.m (this procedure chooses randomly m vectors among the vectors of X), and
  - 3 corresponding to distant_init.m (this procedure chooses the m vectors of X that are most distant from each other; this is a more computationally demanding procedure).
- e_thres: the threshold controlling the termination of the algorithm. Specifically, the algorithm terminates when the sum of the absolute differences of the representatives between two successive iterations is smaller than e_thres.

Output:
- U: an N x m matrix whose (i, j) element denotes the possibility that the i-th data vector belongs to the j-th cluster, after the convergence of the algorithm.
- w: an l x m matrix, each column of which corresponds to a cluster representative, after the convergence of the algorithm.

NOTE: This function calls rand_init.m, rand_data_init.m and distant_init.m.
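A usage sketch (the eta values and thresholds are illustrative; eta is typically tuned to the expected cluster "size"):

    X = [randn(2,50), 5+randn(2,50)];      % synthetic data, one vector per column
    eta = [1 1];                           % one eta parameter per cluster (illustrative)
    [U, w] = possibi(X, 2, eta, 2, 0, 2, 1e-4);   % q=2: original cost; init_proc=2: rand_data_init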

2.7 k_medoids.m

Syntax: [bel,cost,w,a]=k_medoids(X,m,sed)

Description: This function implements the k-medoids algorithm. The aim of this algorithm is the same as that of k-means, i.e., to iteratively move the cluster representatives to regions that are dense in data, so that a suitable cost function is minimized. However, now the representatives are constrained to be vectors of the data set. The algorithm terminates when no change in the representatives is encountered between two successive iterations.

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- m: the number of clusters.
- sed: a scalar integer, which is used as the seed for the random generator function rand.

Output:
- bel: an N-dimensional vector whose i-th element contains the cluster to which the i-th data vector is assigned, after the convergence of the algorithm.
- cost: a scalar which is the sum of the distances of each data vector from its closest representative, computed after the convergence of the algorithm.
- w: an l x m matrix, each column of which corresponds to a cluster representative, after the convergence of the algorithm.
- a: an m-dimensional vector containing the indices of the data vectors that are used as representatives.

NOTE: This function calls cost_comput.m to compute the cost associated with a specific partition.
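A usage sketch:

    X = [randn(2,50), 5+randn(2,50)];         % synthetic data, one vector per column
    [bel, cost, w, a] = k_medoids(X, 2, 0);   % 2 clusters, seed 0
    % Since the representatives are constrained to be data vectors, w equals X(:,a).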

2.8 spectral_Ncut2.m

Syntax: [bel]=spectral_Ncut2(X,e,sigma2)

Description: This function performs spectral clustering based on the N x N normalized graph Laplacian L1, produced from an l x N matrix X, each column of which corresponds to a data vector. The number of clusters in this case is fixed to 2. The algorithm determines the N-dimensional eigenvector that corresponds to the 2nd smallest eigenvalue of L1. A new N-dimensional vector y is produced by multiplying the above eigenvector with a suitable matrix D. Finally, the elements of y are divided into two groups, according to whether they are greater or less than the median value of y. This division specifies the clustering of the vectors in the original data set X. The algorithm minimizes the so-called Ncut criterion.

Input:
- X: an l x N matrix, each column of which is a data vector.
- e: the parameter that defines the size of the neighborhood around each vector.
- sigma2: the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

Output:
- bel: an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.
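A usage sketch (the neighborhood size e and the kernel width sigma2 are illustrative and strongly data dependent):

    X = [randn(2,60), 4+randn(2,60)];      % one data vector per column
    bel = spectral_Ncut2(X, 1.5, 2);       % e=1.5, sigma2=2 (illustrative values)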


2.9 spectral_Ncut_gt2.m

Syntax: [bel]=spectral_Ncut_gt2(X,e,sigma2,k)

Description: This function performs spectral clustering based on the N x N normalized graph Laplacian L1, produced from an l x N matrix X, each column of which corresponds to a data vector. The number of clusters, k, is again assumed to be known (it may be greater than or equal to 2). The algorithm determines an N x k matrix U, the j-th column of which is the eigenvector corresponding to the j-th smallest eigenvalue of L1. A matrix V is produced by multiplying U with a suitable matrix. Then, the i-th data vector X(:,i) is mapped to the i-th row vector V(i,:) of V. Finally, the data vectors X(:,i) are clustered based on the clustering produced by the k-means algorithm applied on the V(i,:)'s. The algorithm minimizes the so-called Ncut criterion.

Input:
- X: an l x N matrix, each column of which is a data vector.
- e: the parameter that defines the size of the neighborhood around each vector.
- sigma2: the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).
- k: the number of clusters.

Output:
- bel: an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.

2.10 spectral_Ratiocut2.m

Syntax: [bel]=spectral_Ratiocut2(X,e,sigma2)

Description: This function performs spectral clustering based on the N x N unnormalized graph Laplacian L, produced from an l x N matrix X, each column of which corresponds to a data vector. The number of clusters in this case is fixed to 2. The algorithm determines the N-dimensional eigenvector y that corresponds to the 2nd smallest eigenvalue of L. The elements of y are divided into two groups, according to whether they are greater or less than the median value of y. This division specifies the clustering of the vectors in the original data set X. The algorithm minimizes the so-called Ratiocut criterion.

Input:
- X: an l x N matrix, each column of which is a data vector.
- e: the parameter that defines the size of the neighborhood around each vector.
- sigma2: the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

Output:
- bel: an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.


2.11 spectral_Ratiocut_gt2.m

Syntax: [bel]=spectral_Ratiocut_gt2(X,e,sigma2,k)

Description: This function performs spectral clustering based on the N x N unnormalized graph Laplacian L, produced from an l x N matrix X, each column of which corresponds to a data vector. The number of clusters, k, is again assumed to be known (it may be greater than or equal to 2). The algorithm determines an N x k matrix V, the j-th column of which is the eigenvector corresponding to the j-th smallest eigenvalue of L. Then, the i-th data vector X(:,i) is mapped to the i-th row vector V(i,:) of V. Finally, the data vectors X(:,i) are clustered based on the clustering produced by the k-means algorithm applied on the V(i,:)'s. The algorithm minimizes the so-called Ratiocut criterion.

Input:
- X: an l x N matrix, each column of which is a data vector.
- e: the parameter that defines the size of the neighborhood around each vector.
- sigma2: the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).
- k: the number of clusters.

Output:
- bel: an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.

2.12 competitive.m

Syntax: [w,bel,epoch]=competitive(X,w_ini,m,eta_vec,sed,max_epoch,e_thres,init_proc)

Description: This function implements the leaky learning competitive algorithm. It is an iterative algorithm in which the representatives are updated after the presentation of each data vector X(:,i). Specifically, all representatives move towards X(:,i). However, the learning rate for the one that lies closest to X(:,i) (the winner) is much higher than the learning rate for the remaining representatives (the losers). As a consequence, the closest representative moves much closer to X(:,i) than the remaining ones. In this way, the representatives are moved towards regions that are dense in data vectors. The number of representatives is assumed to be known. The basic competitive learning scheme (where only the closest representative moves towards the current vector) can be viewed as a special case of the leaky learning algorithm, where the learning rate for the losers is set equal to 0.

IMPORTANT NOTE: In this implementation, (a) the vectors are presented in random order within each epoch, and (b) the termination condition of the algorithm is "the clustering remains unaltered during two successive epochs".

Input:
- X: an l x N matrix containing the data vectors.
- w_ini: an l x m matrix containing the initial estimates of the representatives. If it is empty, the representatives are initialized by the algorithm.
- m: the number of representatives. This is utilized only when w_ini is empty.
- eta_vec: a 2-dimensional parameter vector whose first component is the learning rate for the winning representative, while its second component is the learning rate for all the remaining representatives.
- sed: a seed for the rand function of MATLAB.
- max_epoch: the maximum number of epochs.
- e_thres: the parameter used in the termination condition (in this version of the algorithm its value is of no importance).
- init_proc: an integer taking the values 1, 2 or 3, with
  - 1 corresponding to the rand_init.m initialization procedure (this procedure chooses randomly m vectors from the smallest hyperrectangle that contains all the vectors of X and whose sides are parallel to the axes),
  - 2 corresponding to rand_data_init.m (this procedure chooses randomly m vectors among the vectors of X), and
  - 3 corresponding to distant_init.m (this procedure chooses the m vectors of X that are most distant from each other; this is a more computationally demanding procedure).
  This choice is activated only if the user does not provide the initial conditions.

Output:
- w: an l x m matrix containing the final estimates of the representatives.
- bel: an N-dimensional vector whose i-th element contains the index of the representative closest to X(:,i).
- epoch: the number of epochs required for convergence.

NOTE: This function calls rand_init.m, rand_data_init.m and distant_init.m.
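A usage sketch with an empty w_ini, so that the representatives are initialized internally (here with init_proc=2, i.e., rand_data_init); the learning rates are illustrative:

    X = [randn(2,50), 5+randn(2,50)];      % synthetic data, one vector per column
    eta_vec = [0.1 0.001];                 % winner / loser learning rates (illustrative)
    [w, bel, epoch] = competitive(X, [], 3, eta_vec, 0, 100, 1e-4, 2);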

2.13 valley_seeking.m

Syntax: [bel,iter]=valley_seeking(X,a,bel_ini,max_iter)

Description: This function implements the valley-seeking algorithm. The algorithm starts with an initial clustering of the data vectors and iteratively adjusts it in order to identify the regions that are dense in data (which correspond to the physically formed clusters). Specifically, at each iteration and for each data vector X(:,i), its closest neighbors are considered, and X(:,i) is assigned to the cluster that has the most vectors among the neighbors of X(:,i). The algorithm terminates when no point is reassigned to a different cluster between two successive iterations.

Input:
- X: an l x N matrix, each column of which corresponds to a data vector.
- a: a parameter that specifies the size of the neighborhood.
- bel_ini: an N-dimensional vector whose i-th coordinate contains the index of the cluster to which the i-th vector is initially assigned.
- max_iter: a parameter that specifies the maximum allowable number of iterations.

Output:
- bel: an N-dimensional vector, which has the same structure as bel_ini described above.
- iter: the number of iterations performed until convergence is achieved.

NOTES:
- This function calls dist.m, which computes the squared Euclidean distance between two vectors.
- It is assumed that the cluster indices are in the set {1, 2, ..., m}.
- The algorithm is extremely sensitive to the parameter settings (a and bel_ini). It can be used after, e.g., a sequential algorithm has produced an initial clustering. The valley-seeking algorithm can then take over, using this clustering as its initial condition, in order to identify the true number of clusters.
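A usage sketch along the lines of the last note: a sequential algorithm (BSAS, Section 2.1) provides a deliberately over-clustered initial partition, which valley_seeking then refines (the values of theta and a are illustrative):

    X = [randn(2,50), 5+randn(2,50)];            % synthetic data, one vector per column
    order = randperm(size(X,2));
    [bel_ini, m] = BSAS(X, 1.5, 15, order);      % over-clustered starting point
    [bel, iter] = valley_seeking(X, 1.2, bel_ini, 50);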

2.14 distant_init.m

Syntax: [w]=distant_init(X,m,sed)

Description: This function chooses the m vectors that are most distant from each other among the N vectors contained in a data set X. Specifically, the mean vector of the N data vectors of X is computed, and the vector of X that lies furthest from the mean is assigned first to the set w containing the most distant points. The i-th element of w is selected via the following steps: (a) the minimum distance of each of the N-i+1 points of X-w from the i-1 vectors already in w is computed, and (b) the point with the maximum of the above N-i+1 minimum distances joins w as its i-th element.

Input:
- X: an l x N matrix whose columns are the data vectors.
- m: the number of vectors to be chosen.
- sed: the seed for the random number generator (in this case it does not affect the results of the algorithm).

Output:
- w: an l x m matrix whose columns are the selected most distant vectors.

2.15 rand_data_init.m

Syntax: [w]=rand_data_init(X,m,sed)

Description: This function chooses randomly m vectors among the N vectors contained in a data set X.

Input:
- X: an l x N matrix whose columns are the data vectors.
- m: the number of vectors to be chosen.
- sed: the seed for the random number generator.

Output:
- w: an l x m matrix whose columns are the randomly selected vectors of X.


2.16 rand_init.m

Syntax: [w]=rand_init(X,m,sed)

Description: This function chooses randomly m vectors from the smallest hyperrectangle that contains all the vectors of a given data set X and whose edges are parallel to the axes.

Input:
- X: an l x N matrix whose columns are the data vectors.
- m: the number of vectors to be chosen.
- sed: the seed for the random number generator.

Output:
- w: an l x m matrix whose columns are the randomly selected vectors.

2.17 cost_comput.m

Syntax: [bel,cost]=cost_comput(X,w)

Description: This is an auxiliary function, called by the k_medoids function. Its aim is twofold: (a) it computes the value of the cost function employed by the k-medoids algorithm, i.e., the sum of the distances of each data vector from its closest representative, and (b) it assigns each vector X(:,i) to the cluster whose representative lies closest to X(:,i).

Input:
- X: an l x N matrix, each column of which corresponds to an l-dimensional data vector.
- w: an l x m matrix, each column of which corresponds to a cluster representative.

Output:
- bel: an N-dimensional vector whose i-th element contains the cluster to which the i-th data vector is assigned.
- cost: a scalar which is the sum of the distances of each data vector from its closest representative.

NOTE: This function calls the function dist.m, which computes the squared Euclidean distance between two vectors.

2.18 dgk.m

Syntax: [z]=dgk(x,c,S)

Description: This function computes the distance, as used in the Gustafson-Kessel algorithm, between a point x and a cluster characterized by a center c and a covariance matrix S.

Input:
- x: an l-dimensional column vector.
- c: the center of the cluster at hand.
- S: the covariance matrix of the cluster at hand.

Output:
- z: the distance between the point and the cluster, as defined in the framework of the Gustafson-Kessel algorithm.

2.19 dist.m

Syntax: [z]=dist(x,y)

Description: Computes the squared Euclidean distance between two column vectors of equal length.

Input:
- x, y: two column vectors of equal length.

Output:
- z: the squared Euclidean distance between the vectors x and y.

NOTE: This function is called by the cost_comput.m function.


3 Feature Selection

3.1 simpleOutlierRemoval.m

Syntax: [outls,Index,datOut]=simpleOutlierRemoval(dat,ttimes)

Description: Detects and removes outliers from a normally distributed data set by means of a thresholding technique. The threshold depends on the median and the standard deviation of the data set.

Input:
- dat: holds the normally distributed data.
- ttimes: sets the outlier threshold.

Output:
- outls: the outliers that have been detected.
- Index: the indices of the outliers in the input dat matrix.
- datOut: the reduced data set, after the outliers have been removed.

3.2 normalizeStd.m

Syntax: [class1n,class2n]=normalizeStd(class1,class2)

Description: Normalizes the data to zero mean and unit standard deviation.

Input:
- class1: row vector of data for class 1.
- class2: row vector of data for class 2.

Output:
- class1n: row vector of normalized data for class 1.
- class2n: row vector of normalized data for class 2.
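A usage sketch on synthetic one-dimensional feature values (the same pair of row vectors can also be fed to normalizeMnmx and normalizeSoftmax below):

    class1 = 5 + 2*randn(1,100);           % feature values for class 1
    class2 = 8 + 3*randn(1,100);           % feature values for class 2
    [class1n, class2n] = normalizeStd(class1, class2);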

3.3 normalizeMnmx.m

Syntax: [c1,c2]=normalizeMnmx(class1,class2,par1,par2)

Description: A linear data normalization technique that limits the feature values to the range [par1, par2] (e.g., [-1, 1]) by proper scaling.

Input:
- class1: row vector of data for class 1.
- class2: row vector of data for class 2.
- par1: desired minimum value.
- par2: desired maximum value.

Output:
- c1: normalized data for class 1.
- c2: normalized data for class 2.

3.4 normalizeSoftmax.m

Syntax: [c1,c2]=normalizeSoftmax(class1,class2,r)

Description: Softmax normalization. This is basically a squashing function limiting the data to the range [0, 1]. The range of values that corresponds to the linear section depends on the standard deviation of the data and on the input parameter r. Values away from the mean are squashed exponentially.

Input:
- class1: row vector of data for class 1.
- class2: row vector of data for class 2.
- r: a factor that affects the range of values corresponding to the linear section (e.g., r=0.5).

Output:
- c1: normalized data for class 1.
- c2: normalized data for class 2.

3.5 ROC.m

Syntax: [auc]=ROC(x,y)

Description: Plots the ROC curve and computes the area under the curve.

Input:
- x: row vector of data for both classes.
- y: row vector of data labels. Each element is -1 or 1.

Output:
- auc: the area under the ROC curve.
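A usage sketch with synthetic values for two partially overlapping classes:

    x = [randn(1,100), 1+randn(1,100)];    % feature values for both classes
    y = [-ones(1,100), ones(1,100)];       % labels: -1 for the first class, 1 for the second
    auc = ROC(x, y);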

3.6 divergence.m

Syntax: [D]=divergence(c1,c2)

Description: Computes the divergence between two classes.

Input:
- c1: data of the first class, one pattern per column.
- c2: data of the second class, one pattern per column.

Output:
- D: the value of the divergence.


3.7 divergenceBhata.m

Syntax: [D]=divergenceBhata(c1,c2)

Description: Computes the Bhattacharyya distance between two classes.

Input:
- c1: data of the first class, one pattern per column.
- c2: data of the second class, one pattern per column.

Output:
- D: the Bhattacharyya distance.

3.8 ScatterMatrices.m

Syntax: [J]=ScatterMatrices(class1,class2)

Description: Computes the distance measure J3 between two classes with scattered (non-Gaussian) feature samples.

Input:
- class1: data of the first class, one pattern per column.
- class2: data of the second class, one pattern per column.

Output:
- J: the distance measure J3, which is computed from the within-class and mixture scatter matrices.

3.9 ScalarFeatureSelectionRanking.m

Syntax: [T]=ScalarFeatureSelectionRanking(c1,c2,sepMeasure)

Description: Features are treated individually and are ranked according to the adopted separability criterion (given in sepMeasure).

Input:
- c1: matrix of data for the first class, one pattern per column.
- c2: matrix of data for the second class, one pattern per column.
- sepMeasure: the class separability criterion. Possible values are 't-test', 'divergence', 'Bhata', 'ROC' and 'Fisher'.

Output:
- T: the feature ranking matrix. The first column contains the class separability costs and the second column the respective feature ids.
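A usage sketch; that the criterion is passed as a string is an assumption based on the listed parameter values:

    c1 = randn(4,50);                      % 4 candidate features, one pattern per column
    c2 = 0.8 + randn(4,50);
    T = ScalarFeatureSelectionRanking(c1, c2, 't-test');   % rank features via the t-test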


3.10 SequentialBackwardSelection.m

Syntax: [cLbestOverall,JMaxOverall]=SequentialBackwardSelection(class1,class2)

Description: Feature vector selection by means of the Sequential Backward Selection technique. Uses the function ScatterMatrices to compute the class separability measure.

Input:
- class1: matrix of data for the first class, one pattern per column.
- class2: matrix of data for the second class, one pattern per column.

Output:
- cLbestOverall: the selected feature subset; a row vector of feature ids.
- JMaxOverall: the class separability cost derived from the scatter-matrices measure.

3.11 SequentialForwardSelection.m

Syntax: [cLbestOverall,JMaxOverall]=SequentialForwardSelection(class1,class2)

Description: Feature vector selection by means of the Sequential Forward Selection technique. Uses the function ScatterMatrices for the class separability measure.

Input:
- class1: matrix of data for the first class, one pattern per column.
- class2: matrix of data for the second class, one pattern per column.

Output:
- cLbestOverall: the selected feature subset; a row vector of feature ids.
- JMaxOverall: the class separability cost derived from the scatter-matrices measure.

3.12 exhaustiveSearch.m

Syntax: [cLbest,Jmax]=exhaustiveSearch(class1,class2,CostFunction)

Description: Exhaustive search for the best feature combination, depending on the adopted class separability measure (given in CostFunction).

Input:
- class1: data of the first class, one pattern per column.
- class2: data of the second class, one pattern per column.
- CostFunction: possible choices are 'divergence', 'divergenceBhata' and 'ScatterMatrices'.

Output:
- cLbest: the best feature combination; a row vector of feature ids.
- Jmax: the value of the adopted cost function for the best feature combination.
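A usage sketch; as above, passing the cost function by name as a string is an assumption:

    class1 = randn(4,50);                  % 4 candidate features, one pattern per column
    class2 = 0.8 + randn(4,50);
    [cLbest, Jmax] = exhaustiveSearch(class1, class2, 'ScatterMatrices');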


4 Image Features

4.1 generateCoOccMat.m

Syntax: [P_0,P_45,P_90,P_135]=generateCoOccMat(dat,Ng)

Description: Generates four co-occurrence matrices, corresponding to the directions of 0, 45, 90 and 135 degrees.

Input:
- dat: gray-level image (matrix).
- Ng: the number of gray levels to which the image is reduced as a preprocessing step.

Output:
- P_0, P_45, P_90, P_135: the four co-occurrence matrices.

NOTE: This function differs from Matlab's graycomatrix.m in that it implements a two-way scan for each direction; i.e., for the horizontal direction, the data are scanned both from left to right and from right to left.

4.2 CoOccMatFeatures.m

Syntax: [features]=CoOccMatFeatures(CoMat)

Description: Computes a total of 13 image features, given a co-occurrence matrix. It calls 13 functions, one per feature; these functions are described in the sequel.

Input:
- CoMat: the co-occurrence matrix.

Output:
- features: a vector of 13 features.
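A minimal sketch of the co-occurrence pipeline (the synthetic image and the choice of 16 gray levels are illustrative):

    I = round(255*rand(64,64));                          % synthetic gray-level image
    [P_0, P_45, P_90, P_135] = generateCoOccMat(I, 16);  % requantize to 16 levels
    features = CoOccMatFeatures(P_0);                    % 13 features for the 0-degree matrix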

4.3 CoOccASM.m

Syntax: [ASM]=CoOccASM(M)

Description: Computes the Angular Second Moment, given a co-occurrence matrix. This feature is a measure of the smoothness of the image.

Input:
- M: the co-occurrence matrix.

Output:
- ASM: the value of the Angular Second Moment.


4.4 CoOccContrast.m

Syntax: [CON]=CoOccContrast(M)

Description: Computes the Contrast, given a co-occurrence matrix. This is a measure of the image contrast, i.e., a measure of local gray-level variations.

Input:
- M: the co-occurrence matrix.

Output:
- CON: the value of the Contrast.

4.5 CoOccCOR.m

Syntax: [COR]=CoOccCOR(M)

Description: Computes the Correlation, given a co-occurrence matrix. This feature is a measure of the gray-level linear dependencies of an image.

Input:
- M: the co-occurrence matrix.

Output:
- COR: the value of the Correlation.

4.6 CoOccVariance.m

Syntax: [variance]=CoOccVariance(M)

Description: Computes the Variance of a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- variance: the value of the Variance.

4.7 CoOccIDM.m

Syntax: [INV]=CoOccIDM(M)

Description: Computes the Inverse Difference Moment of a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- INV: the value of the Inverse Difference Moment.


4.8 CoOccSUA.m

Syntax: [SUA]=CoOccSUA(M)

Description: Computes the Sum (Difference) Average of a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- SUA: the value of the Sum (Difference) Average.

4.9 CoOccSUV.m

Syntax: [SUV]=CoOccSUV(M)

Description: Computes the Sum Variance of a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- SUV: the value of the Sum Variance.

4.10 CoOccSUE.m

Syntax: [SUE]=CoOccSUE(M)

Description: Computes the Sum Entropy of a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- SUE: the value of the Sum Entropy.

4.11 CoOccEntropy.m

Syntax: [entropy]=CoOccEntropy(M)

Description: Computes the Entropy of a co-occurrence matrix. Entropy is a measure of randomness and takes low values for smooth images.

Input:
- M: the co-occurrence matrix.

Output:
- entropy: the value of the Entropy.


4.12 CoOccDEN.m

Syntax: [DEN]=CoOccDEN(M)

Description: Computes the Difference Entropy, given a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- DEN: the value of the Difference Entropy.

4.13 CoOccDVA.m

Syntax: [DVA]=CoOccDVA(M)

Description: Computes the Difference Variance, given a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- DVA: the value of the Difference Variance.

4.14 CoOccCIMI.m

Syntax: [CIMI]=CoOccCIMI(M)

Description: Computes the Information Measure I, given a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- CIMI: the value of the Information Measure I.

4.15 CoOccCIMII.m

Syntax: [CIMII]=CoOccCIMII(M)

Description: Computes the Information Measure II, given a co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- CIMII: the value of the Information Measure II.


4.16 CoOccPXandPY.m

Syntax: [px,py]=CoOccPXandPY(M,px,py)

Description: Used by other feature generation functions. It computes the marginal probability vectors by summing the rows (px) and the columns (py) of the co-occurrence matrix.

Input:
- M: the co-occurrence matrix.

Output:
- px: vector formed by the sums of the rows of the co-occurrence matrix.
- py: vector formed by the sums of the columns of the co-occurrence matrix.

4.17 CoOccPXminusY.m

Syntax: [Px_minus_y]=CoOccPXminusY(M)

Description: Used by other feature generation functions. It generates a probability vector by scanning the diagonals of the co-occurrence matrix in the direction of 135 degrees.

Input:
- M: the co-occurrence matrix.

Output:
- Px_minus_y: vector formed by the sums of the diagonals of M in the direction of 135 degrees.

4.18 CoOccPxplusY.m

Syntax: [Px_plus_y]=CoOccPxplusY(M)

Description: Used by other feature generation functions. It generates a probability vector by scanning the diagonals of the co-occurrence matrix in the direction of 45 degrees.

Input:
- M: the co-occurrence matrix.

Output:
- Px_plus_y: vector formed by the sums of the diagonals of M in the direction of 45 degrees.


4.19 ImageHist.m

Syntax: [h]=ImageHist(A,Ng)

Description: Generates the histogram of a gray-level image for Ng levels of gray.

Input:
- A: gray-level image (matrix).
- Ng: desired number of gray levels.

Output:
- h: the histogram of the image (an Ng x 1 vector).

4.20 HistMoments.m

Syntax: [feat]=HistMoments(dat,mom)

Description: Computes the moment of order mom from the histogram of a gray-level image.

Input:
- dat: gray-level image (matrix).
- mom: the order of the moment (integer >= 1).

Output:
- feat: the value of the moment of order mom.

4.21 HistCentralMoments.m

Syntax: [feat]=HistCentralMoments(dat,c_mom)

Description: Computes the central moment of order c_mom from the histogram of a gray-level image.

Input:
- dat: gray-level image (matrix).
- c_mom: the order of the moment (integer >= 1).

Output:
- feat: the value of the central moment of order c_mom.
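A usage sketch for the histogram-based features (synthetic image, illustrative orders):

    I = round(255*rand(64,64));            % synthetic gray-level image
    h = ImageHist(I, 32);                  % 32-bin histogram
    m2 = HistMoments(I, 2);                % second-order moment
    c2 = HistCentralMoments(I, 2);         % second-order central moment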


4.22 LawMasks.m

Syntax: [A]=LawMasks(kernelLength)

Description: Generates the Laws masks [Laws 80], given the kernel length (3 or 5).

Input:
- kernelLength: the length of the kernels (3 or 5).

Output:
- A: a cell array with 9 or 16 masks (matrices), depending on the adopted kernel length.

4.23 RL_0_90.m

Syntax: [Q]=RL_0_90(m,N_runs,Ng,degrees)

Description: Generates the run length matrix of a gray-level image, given the number of runs, the number of gray levels and the direction (0 or 90 degrees).

Input:
- m: gray-level image (matrix).
- N_runs: the number of runs.
- Ng: the number of gray levels.
- degrees: the direction in degrees (0 or 90).

Output:
- Q: the resulting run length matrix.

4.24 RL_45_135.m

Syntax: [Q]=RL_45_135(m,N_runs,Ng,degrees)

Description: Generates the run length matrix of a gray-level image, given the number of runs, the number of gray levels and the direction (45 or 135 degrees).

Input:
- m: gray-level image (matrix).
- N_runs: the number of runs.
- Ng: the number of gray levels.
- degrees: the direction in degrees (45 or 135).

Output:
- Q: the resulting run length matrix.


4.25 SRE.m

Syntax: [SRuEm]=SRE(M)

Description: Computes the Short-Run Emphasis (SRE) from a run length matrix. This feature emphasizes small run lengths, and it is thus expected to be large for fine-textured (less smooth) images.

Input:
- M: the run length matrix.

Output:
- SRuEm: the value of the Short-Run Emphasis.
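A minimal sketch of the run length pipeline (the image and the parameter values are illustrative):

    I = round(15*rand(64,64));             % synthetic image with 16 gray levels
    Q = RL_0_90(I, 64, 16, 0);             % run length matrix at 0 degrees, up to 64 runs
    SRuEm = SRE(Q);                        % short-run emphasis of the image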

4.26 LRE.m

Syntax: [LRunEmph]=LRE(M)

Description: Computes the Long-Run Emphasis (LRE) from a run length matrix. This feature is expected to be large for smoother images.

Input:
- M: the run length matrix.

Output:
- LRunEmph: the value of the Long-Run Emphasis.

4.27 GLNU.m

Syntax: [GLNonUn]=GLNU(M)

Description: Computes the Gray-Level Nonuniformity (GLNU) from a run length matrix. When the runs are uniformly distributed among the gray levels, this feature takes small values.

Input:
- M: the run length matrix.

Output:
- GLNonUn: the value of the Gray-Level Nonuniformity.


4.28 RLNU.m

Syntax: [RLNonUn]=RLNU(M)

Description: Computes the Run Length Nonuniformity (RLNU) from a run length matrix. In analogy with the gray-level nonuniformity, this feature measures the nonuniformity of the run lengths.

Input:
- M: the run length matrix.

Output:
- RLNonUn: the value of the Run Length Nonuniformity.

4.29 RP.m

Syntax: [RuPe]=RP(M)

Description: Computes the Run Percentage (RP) from a run length matrix. This feature takes low values for smooth images.

Input:
- M: the run length matrix.

Output:
- RuPe: the value of the Run Percentage.


5 Audio Features

5.1 sfSpectralCentroid.m

Syntax: [Sc]=sfSpectralCentroid(x,Fs)

Description: Computes the spectral centroid of a single frame.

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).

Output:
- Sc: the spectral centroid (Hz).
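A usage sketch on a single synthetic frame:

    Fs = 16000;                            % sampling frequency (Hz)
    t = (0:511)/Fs;                        % a 512-sample frame
    x = sin(2*pi*440*t);                   % 440 Hz tone
    Sc = sfSpectralCentroid(x, Fs);

For the sfFund* functions of the following subsections, period bounds such as Tmin=round(Fs/400) and Tmax=round(Fs/80) correspond to a fundamental frequency search range of 80-400 Hz.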

5.2 sfSpectralRolloff.m

Syntax: [Sr]=sfSpectralRolloff(x,Fs,RolloffThresh)

Description: Computes the spectral rolloff frequency of a single frame, given a rolloff threshold.

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- RolloffThresh: the rolloff threshold, 0 <= RolloffThresh <= 1.

Output:
- Sr: the rolloff frequency (Hz).

5.3 sfFundAMDF.m

Syntax: [Fr]=sfFundAMDF(x,Fs,Tmin,Tmax)

Description: Computes the fundamental frequency of a single frame, using the Average Magnitude Difference Function for periodicity detection [Rabi 78].

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- Tmin: minimum period length (in samples).
- Tmax: maximum period length (in samples).

Output:
- Fr: the fundamental frequency (Hz).


5.4 sfFundAutoCorr.m

Syntax: [Fr]=sfFundAutoCorr(x,Fs,Tmin,Tmax)

Description: Computes the fundamental frequency of a single frame, using the autocorrelation function for periodicity detection [Rabi 78].

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- Tmin: minimum period length (in samples).
- Tmax: maximum period length (in samples).

Output:
- Fr: the fundamental frequency (Hz).

5.5 sfFundCepstrum.m

Syntax: [Fr]=sfFundCepstrum(x,Fs,Tmin,Tmax)

Description: Computes the fundamental frequency of a single frame, using the cepstrum coefficients [Rabi 78]. Requires the function rceps of Matlab's Signal Processing Toolbox.

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- Tmin: minimum period length (in samples).
- Tmax: maximum period length (in samples).

Output:
- Fr: the fundamental frequency (Hz).

5.6 sfFundFreqHist.m

Syntax: [Fund,FinalHist]=sfFundFreqHist(x,Fs,F1,F2,NumOfPeaks)

Description: Computes the fundamental frequency of a single frame, using Schroeder's frequency histogram [Schr 68]. The frequency histogram is generated by taking into account only the spectral peaks (local maxima).

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- F1: minimum fundamental frequency (Hz).
- F2: maximum fundamental frequency (Hz).
- NumOfPeaks: the number of spectral peaks to be taken into account.

Output:
- Fund: the fundamental frequency (Hz).
- FinalHist: the frequency histogram.

5.7 sfMFCCs.m

Syntax: [cMel,Y]=sfMFCCs(x,Fs,bins)

Description: Computes the Mel-frequency cepstrum coefficients (MFCCs) from a single frame of data, assuming a filter bank of triangular non-overlapping filters. Prior to using this function, the center frequency and the frequency range of each filter in the filter bank need to be computed. This can be achieved by first calling the function computeMelBank and passing its output to the bins variable of this m-file.

Input:
- x: signal frame (sequence of samples).
- Fs: sampling frequency (Hz).
- bins: the Mel filter bank (see computeMelBank for details).

Output:
- cMel: the MFCCs. The number of MFCCs is equal to the length of the frame.
- Y: the output of the filter bank after the insertion of zeros has taken place. The length of Y is equal to the length of the frame.

5.8 computeMelBank.m

Syntax: [bins]=computeMelBank(N,Fs,melStep)

Description: Computes the center frequencies and the frequency limits of the filters of a Mel filter bank, assuming triangular non-overlapping filters. This function computes the frequencies of the FFT of an N-point frame and then shifts the center frequency of each filter to the closest FFT frequency.

Input:
- N: number of FFT coefficients.
- Fs: sampling frequency (Hz).
- melStep: the distance between successive filter centers of the filter bank (in mel units).

Output:
- bins: the filter bank matrix. The i-th row contains three values for the i-th filter of the filter bank, i.e., its lowest frequency, its center frequency and its rightmost frequency.
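A sketch of the MFCC computation for one frame, following the order described in 5.7 (the frame length and melStep values are illustrative, and the random frame is a stand-in for real audio):

    Fs = 8000; N = 512;                    % sampling frequency and frame length
    bins = computeMelBank(N, Fs, 100);     % filter centers every 100 mel (illustrative)
    frame = randn(1,N);                    % stand-in for a real signal frame
    [cMel, Y] = sfMFCCs(frame, Fs, bins);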


5.9 stEnergy.m

Syntax: [E,T]=stEnergy(x,Fs,winlength,winstep)

Description: Computes the short-term energy envelope.

Input:
- x: signal (sequence of samples).
- Fs: sampling frequency (Hz).
- winlength: length of the moving window (number of samples).
- winstep: step of the moving window (number of samples).

Output:
- E: the sequence of short-term energy values.
- T: the time equivalent of the first sample of each window.
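A usage sketch; the 20 ms window and 10 ms step are common but illustrative choices, and the same winlength/winstep pattern applies to all st* functions in this section:

    Fs = 16000;
    x = randn(1, 2*Fs);                            % a 2-second stand-in signal
    winlength = round(0.02*Fs);                    % 20 ms window
    winstep = round(0.01*Fs);                      % 10 ms step
    [E, T] = stEnergy(x, Fs, winlength, winstep);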

5.10 stZeroCrossingRate.m

Syntax: [Zcr,T]=stZeroCrossingRate(x,Fs,winlength,winstep)

Description: Computes the short-term zero-crossing rate.

Input:
- x: signal (sequence of samples).
- Fs: sampling frequency (Hz).
- winlength: length of the moving window (number of samples).
- winstep: step of the moving window (number of samples).

Output:
- Zcr: the sequence of zero-crossing rate values.
- T: the time equivalent of the first sample of each window.

5.11 stSpectralCentroid.m

Syntax: [Sc,T]=stSpectralCentroid(x,Fs,winlength,winstep,windowMultiplier)

Description: Computes the short-term spectral centroid by calling the MovingWindow function.

Input:
- x: signal (sequence of samples).
- Fs: sampling frequency (Hz).
- winlength: length of the moving window (number of samples).
- winstep: step of the moving window (number of samples).
- windowMultiplier: use 'hamming', 'hanning', etc., i.e., any valid Matlab window function (or [] for a rectangular window).

Output:
- Sc: the sequence of spectral centroids.
- T: the time equivalent of the first sample of each window.

5.12 stSpectralRolloff.m

Syntax: [Sr,T]=stSpectralRolloff(x,Fs,winlength,winstep,windowMultiplier,RolloffThresh)

Description: Computes the short-term spectral rolloff feature by calling the MovingWindow function.

Input:
- x: signal (sequence of samples).
- Fs: sampling frequency (Hz).
- winlength: length of the moving window (number of samples).
- winstep: step of the moving window (number of samples).
- windowMultiplier: use 'hamming', 'hanning', etc., i.e., any valid Matlab window function (or [] for a rectangular window).
- RolloffThresh: the rolloff threshold, 0 <= RolloffThresh <= 1.

Output:
- Sr: the sequence of spectral rolloff values.
- T: the time equivalent of the first sample of each window.

5.13

stSpectralFlux.m

Syntax: [Sf,T]=stSpectralFlux(x,Fs,winlength,winstep,windowMultiplier)
Description: Computes the short-term spectral flux.
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
Output:
Sf: sequence of spectral flux values.
T: time equivalent of the first sample of each window.
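The three spectral features of Sections 5.11-5.13 share the same calling convention. Example (continuing the sketch of Section 5.10, with x, Fs, winlength, and winstep as defined there; [] selects the rectangular window, as documented above, and the 0.9 threshold is illustrative):

[Sc, T] = stSpectralCentroid(x, Fs, winlength, winstep, []);
[Sr, T] = stSpectralRolloff(x, Fs, winlength, winstep, [], 0.9);  % 90% rolloff threshold
[Sf, T] = stSpectralFlux(x, Fs, winlength, winstep, []);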


5.14

stFundAMDF.m

Syntax: [Fr,T]=stFundAMDF(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)
Description: Fundamental frequency tracking using the Average Magnitude Difference Function (AMDF) [Rabi 78].
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
Tmin: minimum period length (in samples).
Tmax: maximum period length (in samples).
Output:
Fr: sequence of fundamental frequencies (Hz).
T: time equivalent of the first sample of each window.
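Tmin and Tmax follow from the desired fundamental-frequency search range, since a period of T samples corresponds to a frequency of Fs/T Hz. Example (continuing the sketch of Section 5.10; the 70-400 Hz range is illustrative):

Fmin = 70; Fmax = 400;            % fundamental-frequency search range (Hz)
Tmin = floor(Fs/Fmax);            % shortest period corresponds to the highest frequency
Tmax = ceil(Fs/Fmin);             % longest period corresponds to the lowest frequency
[Fr, T] = stFundAMDF(x, Fs, winlength, winstep, [], Tmin, Tmax);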

5.15

stMelCepstrum.m

Syntax: [MelCepsMat,T]=stMelCepstrum(x,Fs,winlength,winstep,windowMultiplier,melStep)
Description: Computes the short-term Mel cepstrum. It calls computeMelBank to compute the center frequency and frequency range of each filter in the Mel filter-bank.
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
melStep: distance between the center frequencies of successive filters in the Mel filter-bank (mel units).
Output:
MelCepsMat: matrix of Mel-cepstrum coefficients, one column per frame.
T: time equivalent of the first sample of each window.
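Example (continuing the sketch of Section 5.10; the 100-mel spacing is illustrative):

[MelCepsMat, T] = stMelCepstrum(x, Fs, winlength, winstep, [], 100);
size(MelCepsMat, 2)               % number of frames; each column holds one frame's coefficients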


5.16

stFundFreqHist.m

Syntax: [FundFreqs,T]=stFundFreqHist(x,Fs,winlength,winstep,windowMultiplier,F1,F2,NumOfPeaks)
Description: Fundamental frequency tracking based on Schroeder's histogram method [Schr 68]. This function calls MovingWindow.
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
F1: minimum fundamental frequency (Hz).
F2: maximum fundamental frequency (Hz).
NumOfPeaks: number of spectral peaks to take into account for the histogram generation.
Output:
FundFreqs: sequence of fundamental frequencies (Hz).
T: time equivalent of the first sample of each window.
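Example (continuing the sketch of Section 5.10; the search range and number of peaks are illustrative):

[FundFreqs, T] = stFundFreqHist(x, Fs, winlength, winstep, [], 70, 400, 5);
% the 5 strongest spectral peaks of each frame contribute to the histogram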

5.17

stFundAutoCorr.m

Syntax: [Fr,T]=stFundAutoCorr(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)
Description: Autocorrelation-based fundamental frequency tracking [Rabi 78].
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
Tmin: minimum period length (in samples).
Tmax: maximum period length (in samples).
Output:
Fr: sequence of fundamental frequencies (Hz).
T: time equivalent of the first sample of each window.


5.18

stFundCepstrum.m

Syntax: [Fr,T]=stFundCepstrum(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)
Description: Cepstrum-based fundamental frequency tracking [Rabi 78].
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
Tmin: minimum period length (in samples).
Tmax: maximum period length (in samples).
Output:
Fr: sequence of fundamental frequencies (Hz).
T: time equivalent of the first sample of each window.
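stFundAMDF, stFundAutoCorr, and stFundCepstrum share the same argument list, so the three trackers can be compared directly on the same signal. Example (with x, Fs, winlength, winstep, Tmin, and Tmax as in the sketches of Sections 5.10 and 5.14):

[FrAmdf, T] = stFundAMDF(x, Fs, winlength, winstep, [], Tmin, Tmax);
[FrAcor, T] = stFundAutoCorr(x, Fs, winlength, winstep, [], Tmin, Tmax);
[FrCeps, T] = stFundCepstrum(x, Fs, winlength, winstep, [], Tmin, Tmax);
plot(T, FrAmdf, T, FrAcor, T, FrCeps);   % agreement among the trackers suggests a reliable estimate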

5.19

stFourierTransform.m

Syntax: [StftMat,Freqs,T]=stFourierTransform(x,Fs,winlength,winstep,windowMultiplier,GenPlot)
Description: Computes the short-time Fourier transform (STFT) of a signal.
Input:
x: signal (sequence of samples).
Fs: sampling frequency (Hz).
winlength: length of the moving window (number of samples).
winstep: step of the moving window (number of samples).
windowMultiplier: any valid Matlab window multiplier, e.g., hamming, hanning (or [] for a rectangular window).
GenPlot (optional): if set to 1, a spectrogram plot is generated.
Output:
StftMat: matrix of short-time Fourier transform coefficients (one column per frame).
Freqs: vector of frequencies (multiples of Fs/winlength).
T: time equivalent of the first sample of each window.
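Example (continuing the sketch of Section 5.10):

[StftMat, Freqs, T] = stFourierTransform(x, Fs, winlength, winstep, [], 1);
% GenPlot=1 also draws the spectrogram; StftMat holds one column of coefficients per frame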


6

Dynamic Time Warping

6.1

editDistance.m
Syntax: [editCost,Pred]=editDistance(refStr,testStr)
Description: Computes the edit (Levenshtein) distance between two strings, where the first argument is the reference string (prototype). The prototype is placed on the horizontal axis of the matching grid.
Input:
refStr: reference string.
testStr: string to compare with the prototype.
Output:
editCost: the matching cost.
Pred: matrix of node predecessors. The real part of Pred(j, i) is the row index of the predecessor of node (j, i) and the imaginary part of Pred(j, i) is the column index of the predecessor of node (j, i).
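Example (a minimal sketch; with the usual unit insertion, deletion, and substitution costs the distance below is 3, i.e., two substitutions and one insertion, although the exact value depends on the costs adopted inside the m-file):

[editCost, Pred] = editDistance('kitten', 'sitting');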

6.2

DTWSakoe.m

Syntax: [MatchingCost,BestPath,D,Pred]=DTWSakoe(ref,test)
Description: Computes the dynamic time warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the Sakoe-Chiba local constraints on a cost grid, where the Euclidean distance is used as the distance metric. No end-point constraints are adopted. This function calls BackTracking.m to extract the best path.
Input:
ref: reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.
test: test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.
genPlot (optional): if set to 1, a plot of the best path is generated.
Output:
MatchingCost: matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.
BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.
D: cost grid. Its size is I × J.
Pred: matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).
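Example (a minimal sketch with one-dimensional feature sequences, i.e., m = 1; the two sequences are synthetic row vectors of different lengths):

ref  = sin(2*pi*(0:0.010:1));            % 1 x 101 reference sequence
test = sin(2*pi*(0:0.008:1).^1.2);       % 1 x 126 time-warped version of the reference
[MatchingCost, BestPath, D, Pred] = DTWSakoe(ref, test);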


6.3

DTWSakoeEndp.m

Syntax: [MatchingCost,BestPath,D,Pred]=DTWSakoeEndp(ref,test,omitLeft,omitRight)
Description: Computes the dynamic time warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the Sakoe-Chiba local constraints on a type N cost grid, where the Euclidean distance is used as the distance metric. End-point constraints are permitted for the test sequence. This function calls BackTracking.m to extract the best path.
Input:
ref: reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.
test: test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.
omitLeft: left end-point constraint for the test sequence, i.e., the number of frames that can be omitted from the beginning of the test sequence.
omitRight: right end-point constraint for the test sequence, i.e., the number of frames that can be omitted from the end of the test sequence.
genPlot (optional): if set to 1, a plot of the best path is generated.
Output:
MatchingCost: matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.
BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.
D: cost grid. Its size is I × J.
Pred: matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).
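Example (reusing ref and test from the sketch of Section 6.2; the five-frame slacks are illustrative):

omitLeft = 5; omitRight = 5;             % frames that may be skipped at either end of test
[MatchingCost, BestPath, D, Pred] = DTWSakoeEndp(ref, test, omitLeft, omitRight);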

6.4

DTWItakura.m

Syntax: [MatchingCost,BestPath,D,Pred]=DTWItakura(ref,test)
Description: Computes the dynamic time warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the standard Itakura local constraints on a cost grid, where the Euclidean distance is used as the distance metric. No end-point constraints are adopted. This function calls BackTracking.m to extract the best path.
Input:
ref: reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.
test: test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.
genPlot (optional): if set to 1, a plot of the best path is generated.
Output:
MatchingCost: matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.
BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.
D: cost grid. Its size is I × J.
Pred: matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).

6.5

DTWItakuraEndp.m

Syntax: [MatchingCost,BestPath,D,Pred]=DTWItakuraEndp(ref,test,omitLeft,omitRight)
Description: Computes the dynamic time warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the standard Itakura local constraints on a cost grid, where the Euclidean distance is used as the distance metric. End-point constraints are permitted for the test sequence. This function calls BackTracking.m to extract the best path.
Input:
ref: reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.
test: test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.
omitLeft: left end-point constraint for the test sequence, i.e., the number of frames that can be omitted from the beginning of the test sequence.
omitRight: right end-point constraint for the test sequence, i.e., the number of frames that can be omitted from the end of the test sequence.
genPlot (optional): if set to 1, a plot of the best path is generated.
Output:
MatchingCost: matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.
BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.
D: cost grid. Its size is I × J.
Pred: matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).


6.6

BackTracking.m

Syntax: [BestPath]=BackTracking(Pred,startNodek,startNodel)
Description: Performs backtracking on a matrix of node predecessors and returns the extracted best path, starting from node (startNodek, startNodel). The best path can optionally be plotted.
Input:
Pred: matrix of node predecessors. The real part of Pred(i, j) is the row index of the predecessor of node (i, j). The imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).
startNodek: row index of the node from which backtracking starts.
startNodel: column index of the node from which backtracking starts.
genPlot (optional): if set to 1, a plot of the best path is generated.
Output:
BestPath: backtracking path, i.e., vector of nodes. Each node is represented as a complex number, where the real part stands for the row index of the node and the imaginary part stands for the column index of the node.
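Example (a minimal sketch that decodes the complex-coded path into row and column indices; D and Pred are assumed to come from one of the DTW functions above, and backtracking is assumed to start from the top-right grid node (I, J), as in matching without end-point constraints):

[I, J] = size(D);                        % dimensions of the cost grid
BestPath = BackTracking(Pred, I, J);     % backtrack from node (I, J)
rows = real(BestPath);                   % row indices along the best path
cols = imag(BestPath);                   % column indices along the best path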


References
[Theo 09] S. Theodoridis, K. Koutroumbas, Pattern Recognition, 4th edition, Academic Press, 2009.
[Rabi 78] L.R. Rabiner, R.W. Schafer, Digital Processing of Speech Signals, Prentice Hall, 1978.
[Schr 68] M.R. Schroeder, "Period histogram and product spectrum: New methods for fundamental-frequency measurement," Journal of the Acoustical Society of America, Vol. 43(4), pp. 829-834, 1968.
[Laws 80] K.I. Laws, Textured Image Segmentation, Ph.D. Thesis, University of Southern California, 1980.

