BRAD A. HAWICKHORST AND STEPHEN A. ZAHORIAN
Department of Electrical and Computer Engineering
Old Dominion University, Norfolk, VA 23529

RAM RAJAGOPAL
Department of Electrical and Computer Engineering
Old Dominion University, Norfolk, VA 23529

ABSTRACT: Neural networks are often used as powerful discriminating classifiers for tasks in automatic speech recognition. They have several advantages over parametric classifiers, such as those based on a Mahalanobis distance measure, since no (possibly invalid) assumptions are made about data distributions. However, there are disadvantages in terms of the amount of training data required and the length of training time. In this paper we compare three neural network architectures with respect to these issues: two-layer feedforward perceptrons trained with backpropagation, Radial Basis Function (RBF) networks, and Learning Vector Quantization (LVQ) networks. These networks were also evaluated experimentally with vowel classification experiments using a binary-pair partitioned scheme, which requires a total of 120 pairwise sub-networks for a 16-category vowel classifier. Although the three network architectures give similar average classification performance (i.e., averaged over the 120 sub-networks), there are large differences in training time, performance of individual sub-networks, and amount of training data required. These results, and their implications for choosing network architectures in a larger speech recognition system, are presented.

INTRODUCTION

Neural networks trained with backpropagation have been used to accurately classify vowel sounds in continuous speech recognition systems. However, the time required to train the networks can present problems, particularly when investigating a variety of feature sets to represent speech data.
This paper compares the performance of multi-layer feedforward perceptrons trained with backpropagation with that of two alternative neural network paradigms: Radial Basis Function (RBF) networks and Learning Vector Quantization (LVQ) networks. Classification accuracy, training speed, and network evaluation speed are considered. The effect of reduced training data is also compared for each of the three networks.
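The binary-pair partitioned scheme mentioned in the abstract assigns one two-class sub-network to every unordered vowel pair, so a 16-category classifier needs C(16, 2) = 120 of them. A minimal sketch in Python follows; the `decide` stand-in and the majority-vote combination rule are illustrative assumptions, not the paper's trained networks or its fusion method:

```python
from itertools import combinations

N_VOWELS = 16
pairs = list(combinations(range(N_VOWELS), 2))   # one sub-network per pair
assert len(pairs) == 120                         # C(16, 2) = 120

# Stand-in pairwise decider: picks whichever class index is nearer to a
# scalar "feature" (a real system would query the trained two-class
# network for that vowel pair).
def decide(pair, x):
    a, b = pair
    return a if abs(x - a) < abs(x - b) else b

def classify(x):
    # Assumed combination rule: majority vote over all 120 pairwise
    # decisions (the paper does not spell out how outputs are fused).
    votes = [0] * N_VOWELS
    for pair in pairs:
        votes[decide(pair, x)] += 1
    return votes.index(max(votes))

print(classify(4.2))  # → 4: the nearest vowel index wins all 15 of its contests
```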

DESCRIPTION OF DATA SET

All networks were evaluated on the same data set, which consisted of a ten-feature cepstral coefficient representation of vowel sounds extracted from continuous speech (TIMIT database). All coefficients were scaled to a mean of 0.0 and a standard deviation of 0.2. For each of 16 vowel sounds, a training set and an evaluation set were used for training and testing. The relative sizes of these sets corresponded to the frequency of occurrence of the vowels in the speech samples. The number of vowel tokens per set ranged from approximately 500 to 2500 for training, with about 1/10 as much data for testing.

For the tests conducted in this study, the task of each neural network is to classify an input vector as one of only the two vowel sounds on which it was trained. This method has been found to have advantages over a single large network (Zahorian and Nossair, 1993). Since there are 16 vowel sounds, 120 networks are required to cover all possible vowel combinations. Performance of the network paradigms was evaluated using average results over all 120 networks.

BACKPROPAGATION MODEL

Backpropagation uses a gradient descent approach to minimize output error in a feedforward network. The algorithm involves presenting an input vector, comparing the network output to the desired output for that vector, and updating each weight by an amount corresponding to the derivative of the error with respect to that weight times a learning rate, adjusted by a momentum factor. The architecture used in these experiments consisted of five tan-sigmoid hidden units and one tan-sigmoid output unit. Weights were updated after the presentation of each training vector, using a variable learning rate and momentum to speed training. After each epoch (i.e., once all tokens from each vowel set had been processed), the learning rate was reduced by a factor of 0.89 and processing continued using a random starting index for each set.
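The training procedure just described can be sketched as follows. The initial learning rate, the momentum value, and the toy data are assumptions; the paper reports only the architecture, the per-vector updates, and the 0.89 per-epoch decay:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data standing in for one vowel pair: 10 features,
# scaled like the paper's cepstra (mean 0.0, standard deviation 0.2).
X = rng.normal(0.0, 0.2, size=(200, 10))
targets = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)  # arbitrary separable labels

# Two-layer network: five tan-sigmoid hidden units, one tan-sigmoid output.
W1 = rng.normal(0, 0.5, size=(10, 5)); b1 = np.zeros(5)
W2 = rng.normal(0, 0.5, size=(5, 1));  b2 = np.zeros(1)
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)

lr, momentum = 0.05, 0.5   # assumed values; not reported in the paper

for epoch in range(50):
    for i in rng.permutation(len(X)):       # random presentation order
        x = X[i:i + 1]
        h = np.tanh(x @ W1 + b1)            # hidden activations
        o = np.tanh(h @ W2 + b2)            # network output
        # Error derivatives for squared error (o - t)^2 / 2.
        d_o = (o - targets[i]) * (1 - o ** 2)
        d_h = (d_o @ W2.T) * (1 - h ** 2)
        # Update each weight by its error derivative times the learning
        # rate, adjusted by a momentum term.
        vW2 = momentum * vW2 - lr * (h.T @ d_o); W2 += vW2
        vb2 = momentum * vb2 - lr * d_o[0];      b2 += vb2
        vW1 = momentum * vW1 - lr * (x.T @ d_h); W1 += vW1
        vb1 = momentum * vb1 - lr * d_h[0];      b1 += vb1
    lr *= 0.89  # reduce the learning rate after each epoch, as in the paper

out = np.tanh(np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
accuracy = np.mean(np.sign(out) == targets)
print(f"training accuracy: {accuracy:.2f}")
```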
Training continued until 160,000 training vectors had been processed. Classification results for backpropagation, for both training and test data, are presented in Table 3 along with the best results of the other network paradigms, as an average over all 120 networks. The backpropagation networks displayed good classification and, more importantly, good generalization to the test set. The major drawback was the 77 million floating point operations required to train each network.

RADIAL BASIS FUNCTION MODEL

The radial basis function network uses hidden units with localized receptive fields. An RBF hidden unit can be thought of as representing a point in N-dimensional space, where N is the number of features in the data representation, which responds to input vectors whose Cartesian coordinates are close to those of the hidden unit. Many transfer functions may be used for the hidden units, but the most common is the Gaussian, which gives a response that drops off rapidly as the distance between the hidden unit and the input vector increases and is symmetrical about the radial axis -- hence the name Radial Basis Function. The rate at which the response drops off is determined by the spread of the hidden unit. The output layer of an RBF network is linear.

The challenge of designing an RBF network lies in properly placing the hidden layer neurons and choosing an optimal value for the spread constant such that the entire input space of interest is covered with minimum overlap. These decisions are usually made empirically rather than through automatic training methods. Like feedforward perceptron networks trained with backpropagation, radial basis function networks are capable of approximating any continuous function with arbitrary accuracy given enough hidden units. A major advantage of the radial basis function network is usually considered to be its short training time in comparison to backpropagation, although the computation and storage requirements for classifying inputs after the network is trained are usually greater.

LEARNING VECTOR QUANTIZATION MODEL

Learning vector quantization employs a self-organizing network approach which uses the training vectors to recursively tune the placement of competitive hidden units that represent categories of the inputs. Once the network is trained, an input vector is categorized as belonging to the class represented by the nearest hidden unit. The hidden units may be thought of as having inhibitory connections between one another, so that the unit with the largest input wins and inhibits all other units to such an extent that only the winning unit generates an output. In the computer simulation there are no actual inhibitory connections; the winner is simply the hidden unit closest to the input vector. As with the RBF network, each hidden unit can be thought of as representing a point in N-dimensional space.
In both network types the outputs of the hidden units are based on the proximity of the input vector; the difference between the two is that in an RBF network several Gaussian hidden units can have significant outputs, while in the LVQ network the output of all but one competitive unit is zero. The final output of both network types is determined by the weights of the linear output unit. These weights are computed by solving the linear network equation Wo = T/P, where Wo is the linear weight vector, P is a matrix of inputs into the output layer corresponding to all training vectors, and T is a vector containing the target outputs associated with the training vectors.

Training an LVQ network is accomplished by presenting input vectors and adjusting the locations of the hidden units based on their proximity to the input vector. The nearest hidden unit is moved a distance proportional to the learning rate: toward the training vector if the class of the hidden unit and the training vector match, and away from it if they do not. The hidden layer weights are trained in this manner for an arbitrary number of iterations, usually with the learning rate decreasing as training progresses. The objective is to place the hidden units so as to cover the decision regions of the training set. LVQ networks have been found to perform well in pattern classification (Kohonen, 1990). As with RBF networks, they tend to have shorter training times than feedforward networks trained with backpropagation, but the processing required to classify an input may be greater, since more hidden units are often required.

EXPERIMENTAL RESULTS

All networks were simulated using the neural network toolbox of the software package MATLAB, by The MathWorks, Inc. The programming language is not compiled and is therefore rather inefficient in execution. The intent of these experiments was to compare the relative performance of the three network types, not to produce efficient neural network code.

RBF

The method used to train the hidden layer of the RBF network was to evaluate every input as a possible location for a hidden unit, then choose the location which resulted in the fewest misclassifications. The process was repeated until the desired number of hidden units had been found. A vector quantization algorithm was used to reduce the training sets to 256 codewords per vowel, and the networks were then trained on the codewords. The trained networks were evaluated on the original training sets as well as the evaluation sets. The two primary variables affecting performance were the number of hidden units and the Gaussian spread. The best value for the spread constant changed as the number of hidden units increased, since the density of hidden units in vector space increased. For networks with fewer than 20 hidden units, a spread value of 0.8 produced the best classification accuracy for most vowel pairs.
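A minimal sketch of this RBF training scheme, with the linear output layer solved by least squares (the Wo = T/P step): the toy data are an assumption, and the candidate pool is subsampled here for brevity instead of evaluating every input as the paper did:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-class data standing in for one vowel pair (10 cepstral-like features).
X = rng.normal(0.0, 0.2, size=(300, 10))
t = np.where(X[:, 0] > 0, 1.0, -1.0)        # targets for the linear output unit

def hidden_outputs(X, centers, spread):
    # Gaussian response that falls off with distance from each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * spread ** 2))

def fit_output_and_error(centers, spread):
    # Linear output layer (plus bias) solved by least squares: Wo = T/P.
    P = np.hstack([hidden_outputs(X, centers, spread), np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(P, t, rcond=None)
    return w, np.mean(np.sign(P @ w) != t)

# Greedy center selection in the spirit of the paper: repeatedly add the
# candidate training vector that yields the fewest misclassifications.
spread, n_hidden = 0.8, 8
centers = np.empty((0, 10))
for _ in range(n_hidden):
    best_err, best_c = np.inf, None
    for c in X[rng.choice(len(X), 30, replace=False)]:
        _, err = fit_output_and_error(np.vstack([centers, c]), spread)
        if err < best_err:
            best_err, best_c = err, c
    centers = np.vstack([centers, best_c])

w, err = fit_output_and_error(centers, spread)
train_acc = 1.0 - err
print(f"training accuracy with {n_hidden} hidden units: {train_acc:.2f}")
```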

TABLE 1: EFFECT OF NUMBER OF HIDDEN UNITS ON RBF (Vowel Pair OY-UH)

                           2 hidden   8 hidden   16 hidden   100 hidden
                           units      units      units       units
Classification Accuracy
  Training Set             77.3%      83.2%      81.5%       91.5%
  Evaluation Set           82.4%      80.7%      80.7%       79.0%
Computation Required
  Training (MFLOPS)        16         31         52          700
  Evaluation (FLOPS)       94         370        738         4602

The accuracy of classification of the training set increased with the number of hidden units; however, generalization degraded with large numbers of hidden units, and the amount of computation required for both training and evaluation also increased greatly. Eight hidden units produced the best compromise between classification accuracy and computational burden.

LVQ

The primary factors controlling the behavior of the LVQ network were the number of hidden units, the learning rate, and the training time. Two different schemes for initial placement of the hidden units were tried: random placement, and placement of all units at the mean value of the training set. The networks seemed to stabilize to the same solution regardless of the initialization of the hidden layer weights. The number of hidden units had a greater effect, as shown in Table 2 for one vowel pair, with a learning rate of 0.03 reduced by a factor of 0.98 every 100 of the 5000 training cycles.
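The LVQ training loop described earlier, with this decaying learning-rate schedule, can be sketched as follows; the toy data, the initial learning-rate value used here, and the one-unit-per-class assignment are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-class data standing in for one vowel pair.
X = rng.normal(0.0, 0.2, size=(400, 10))
labels = (X[:, 0] > 0).astype(int)

# Two competitive hidden units, one assigned to each class, initialized
# at the training-set mean (one of the paper's initialization schemes).
units = np.tile(X.mean(axis=0), (2, 1))
unit_class = np.array([0, 1])

lr = 0.03
for cycle in range(5000):
    i = rng.integers(len(X))
    x, c = X[i], labels[i]
    # Winner = nearest hidden unit (no explicit inhibitory connections).
    win = np.argmin(((units - x) ** 2).sum(axis=1))
    # Move toward the input if the classes match, away otherwise.
    step = lr * (x - units[win])
    units[win] += step if unit_class[win] == c else -step
    if (cycle + 1) % 100 == 0:
        lr *= 0.98          # decay schedule from the experiments

# Classify by nearest hidden unit, as in the trained LVQ network.
pred = unit_class[((X[:, None, :] - units[None, :, :]) ** 2).sum(-1).argmin(1)]
accuracy = np.mean(pred == labels)
print(f"training accuracy with 2 hidden units: {accuracy:.2f}")
```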

TABLE 2: EFFECT OF NUMBER OF HIDDEN UNITS ON LVQ (Vowel Pair OY-UH)

                           2 hidden   8 hidden   16 hidden   100 hidden
                           units      units      units       units
Classification Accuracy
  Training Set             78.8%      77.7%      77.9%       77.9%
  Evaluation Set           84.2%      75.4%      75.4%       70.1%
Computation Required
  Training (MFLOPS)        0.8        2.7        5.1         28.9
  Evaluation (FLOPS)       82         322        642         4002

The generalization capacity of the networks seemed to decrease as hidden units were added. Experiments run over all 120 vowel pairs indicated that the best LVQ performance for this application was achieved with only 2 hidden units trained with 1000 input vector presentations.

COMPARISON OF BEST RESULTS FOR THE THREE NETWORKS

The best results for RBF and LVQ are summarized in Table 3 along with the backpropagation results. Classification accuracy is presented as an average over all 120 vowel-pair networks.

TABLE 3: BEST PERFORMANCE OF THREE NETWORK TYPES (Average over 120 Networks)

                           RBF          LVQ          Backpropagation
                           (8 hidden)   (2 hidden)   (5 hidden)
Classification Accuracy
  Training Set             89.9%        88.8%        91.9%
  Evaluation Set           90.6%        89.7%        92.5%
Computation Required
  Training (MFLOPS)        32.9         0.2          77.4
  Evaluation (FLOPS)       370          82           152

An examination of the performance of individual networks revealed that in cases in which one vowel in a pair had many more input vectors than the other, the backpropagation networks tended to favor the vowel with more inputs, misclassifying a much greater percentage of the smaller set than of the larger. Neither the RBF nor the LVQ networks suffered from this problem.

The number of MFLOPS shown above is somewhat misleading in terms of execution time. MATLAB reported roughly twice the number of floating point operations to train each backpropagation network compared to RBF; however, the time required to train a network with backpropagation was about ten times that required for an RBF network. The cause of this discrepancy appears to be that the RBF code used large matrix operations, which seemed to be the most efficient processing mode for MATLAB, while the backpropagation code used a large amount of looping, which was apparently very inefficient in uncompiled MATLAB code.

EFFECT OF SMALLER TRAINING SETS

The size of the available training set is often less than desired in automated speech recognition applications. To compare the effects of a smaller training set on classification accuracy, the best performing networks of the three types were trained on much smaller input sets, reduced by selecting every thirtieth input vector and discarding the rest. These tests represent the average of six vowel-pair networks chosen arbitrarily. Note that these six vowel pairs were more difficult to discriminate than the average of all 120 vowel pairs.

TABLE 4: EFFECT OF REDUCED TRAINING SET (Average over 6 Networks)

                     Backpropagation         LVQ                     RBF
                     Full Set   Reduced      Full Set   Reduced      Full Set   Reduced
Classification Accuracy
  Training Set       84.3%      87.3%        80.4%      83.9%        83.5%      89.8%
  Evaluation Set     85.2%      80.7%        82.5%      77.4%        83.2%      82.1%

In all three cases, the networks were able to learn the much smaller training sets with a greater degree of accuracy; however, the RBF networks appear to have maintained a higher degree of generalization than the other two network types.

CONCLUSIONS

Neither of the two alternative network paradigms was superior in classification accuracy to feedforward perceptrons trained with backpropagation; however, some interesting characteristics were observed. The RBF network came very close to matching the accuracy of backpropagation with a much shorter training time, and seemed to retain its generalization capacity better than backpropagation when trained on a drastically reduced input set. Both are important considerations in speech processing applications. The LVQ network was less accurate than feedforward multi-layer perceptrons trained with backpropagation, and it was somewhat surprising that its best performance was achieved with only two hidden units. Although it is not as accurate as the backpropagation or RBF networks, an LVQ network with two hidden units may be useful for evaluating speech data feature sets, since both its training time and its evaluation time are extremely short. Neither RBF nor LVQ shared the backpropagation problem of favoring the larger set in vowel pairs with mismatched input set sizes; however, backpropagation training can also be modified to eliminate this discrepancy.

REFERENCES

Kohonen, T. (1990). Statistical Pattern Recognition Revisited. In R. Eckmiller (Ed.), Advanced Neural Computers, 137-144.

Norton, C. A., and Zahorian, S. A. (1995). Speaker Verification with Binary-Pair Partitioned Neural Networks. ANNIE-95.

Zahorian, S. A., Nossair, Z. B., and Norton, C. A. (1993). A Partitioned Neural Network Approach for Vowel Classification using Smoothed Time/Frequency Features. Eurospeech-93, 1225-1228.
