Logic Gate Analysis With Feed Forward Neural Network

WHAMPOA - An Interdisciplinary Journal 51(2006) 235-251
235
Logic Gate Analysis with Feed Forward Neural Network

David C.T. Tai1, Shu-hui Wu2
1
Dept. of Computer Information Science, ROC Military Academy 2Dept. of Mechanical Engineering, ROC Military Academy
Abstract Logic gates are main components to design and assemble an integrated circuit part. A multifunction electronic part is often composed by a great many of basic gates and derived gates. To analyze such a part is very tedious job. With advent of knowledge and technique in neural network, the complexity can be simplified with this technology to remarkably extent. The purpose of the study is to focus on how to establish a feed forward neural network to analyze a complex logic gates. Firstly, the mathematical background of a neural network model is reviewed. Then, the neural network toolbox in MatLAB is described. Finally, a simple hybrid logic gate is used as an illustrated example. To generate the truth-value for the gate, LabVIEW is employed to produce the dataset for training, validating and testing the neural network. Keywords: Feed Forward Neural Network, Logic gate, MatLAB, LabVIEW
I. Introduction Logic gates process signals, which represent true or false. The basic gates are AND, NAND, OR and XOR gates [4]. Figure 1, shows traditional symbol for the
basic gates. The Integrated Circuit (IC) composed of lots of such basic gates and some derived gates.
AND Gate NAND Gate
OR Gate
NOR Gate
Figure 1 Traditional Symbol for Basic Gates
212
In digital system design to analyze the input and output status for a combined gate is a very important and tedious job. With advent of knowledge and technique in neural network, feed forward neural network is employed to help the analysis. In this paper, the mathematical background for neural network will be reviewed firstly. Then, II. Mathematical Background for Feed Forward Neural Network 1. Introduction to Neural Networks Neural networks are computer systems of biological neurons, composed of nonlinear computational elements operating in parallel. The brain is a collection of about 10 billion interconnected neurons. Figure 2 shows the scheme of a neuron. Each neuron is a cell that uses biochemical reactions to receive process and transmit information. A neuron's dendrite tree is connected to a thousand neighboring neurons. As one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation. Spatial summation occurs when several weak signals are converted into a single large one, while temporal summation converts a rapid series of weak pulses from one source into one large signal. The aggregate input is then passed to the soma. The soma and the enclosed nucleus don't play a significant role in the processing of incoming and outgoing data. Their primary function is to perform
Neural Network Toolbox in MATLB will be used to develop the neural network for assisting the analyzing of a combined logic gate. Finally, a simple hybrid logic gate constructed with LabVIEW. It generates the dataset for training, validating and testing the Feed Forward Neural Network established [3]. the continuous maintenance required to upkeep the neuron function.
Figure 2 Organization of a Nueron1 The part of the soma that does concern itself with the signal is the axon hillock. If the aggregate input is greater than the axon hillock's threshold value, then the neuron fires, and an output signal is transmitted down the axon (Fraser, 2000) 2 . Neural networks consist of nodes (neurons) and synaptic connections that connect these
Source from the web site,
http://vv.carleton.ca/~neil/neural/neuron-a.html
2
Fraser, N. ,1998, Training Neural Networks, Web
site: http://vv.carleton.ca/~neil/neural/neuron-d.htm
David C.T. Tai, Shu-hui WuLogic Gate Analysis with Feed Forward Neural Network
213
nodes. Each connection is assigned a relative weight, also called connection strength, or synaptic strength. The output at each node depends on the threshold, also called bias or offset, specified and a transfer (activation) function. In mathematical terms, a neural network model can be represented by the following parameters: A State variable s is associated with each node i.
connected to node i, and its bias.The state variable s i of node i is given by

s i = f i ( wij s j v i )
j =1 J
(1)
weight
wij
associates
with
Note that transfer functions and bias terms are absent for the input nodes. The sigmoid function is the commonly used transfer function (Russell and Norving, 1995). It is given by 1 f ( ) = (2) (1 + e 2 )
connection between a node i and, a node j that the node i is connected to, such that signals flow from j to i.
Where
determine
the
steepness.
Generally, is set to unity, which results in a sigmoid transfer function of the form below,
A bias vi associated with each node i.

f i ( s j , wij , v i ) A transfer function
for
f ( ) =
1 (1 + e )
(3)
each node I, which determines the state of the node as a function of the summed output from all the nodes that are
Graphical representation of sigmoid function is shown in Figure 3.
238
1 0.9 0.8 0.7 0.6
f( )
0.5 0.4 0.3 0.2 0.1 0 -10 -5 0 5 10
Figure 3 Plot of Sigmoid Function

a finite derivative. If y i = f (hi ), where hi is the summed input to a node i, then
Another popular activation function is the hyper-tangent function, tanh, which is given by e e f ( ) = e + e
f ' (hi ) = y i (1 y i ).
(5)
(4)
2. Three-layered Feed-forward Network
The activation function is equivalent to the sigmoid function if we apply a linear transformation = / 2 to the input and a
~
linear transformation
f = 2 f 1 to the
output. These functions are monotonic with
Multi-layered networks are those with input, output and inner (or hidden) neuron layers that intervene between the input and output layers. The input-output relation defines a mapping, and the neural network provides a representation of this mapping. The number of hidden layers and number of nodes in each hidden layer depend on the complexity of the problem, and to a large extent vary from problem to problem. Increasing the number of hidden
215
layers increases the complexity of a neural network and may or may not result in a better network performance but will lead to a longer training time. Based on previous experience, one or two hidden layers provide a better performance, while not requiring extensive training time [2]. A three-layer feed-forward network consists of an input layer, an output layer and a hidden layer, resulting in two layers of weights. One connects the input to the hidden layer and the other connects the hidden layer to the output layer. Figure 4 illustrates a three-layered feed-forward implementation architecture for evaluating
Threshold (H) Initial Performance Reliability
the training breaks, training cycle numbers and training performance with initial performance and reliability parameters for this study. Bias or threshold of nodes in the hidden layer can be treated as one of the input node generally with the value 1. Bias of nodes in the output layer has same property as that of nodes in the hidden layer. Summation of the weight value of nodes in the input layer including that of bias is transformed as the output value of nodes in the hidden layer. And summation of the weight value of nodes in the hidden layer including that of bias is transformed as the output value of nodes in the output layer.
Training Breaks Cycle Numbers Performance
Figure 4 A Three-layered Feed-forward Network propagation is the most extensively used neural network training method. The backpropagation algorithm is an iterative gradient method based algorithm developed to introduce synaptic correction (or weight adjustments) by minimizing the sum of square errors (objective function). Figure 5 is architecture of three-layer network and is used to derive the back-propagation method. Node indexes i, j and k respectively
3. The Back-Propagation Algorithm
Learning for a neural network is accomplished through an adaptive procedure, known as a learning rule or algorithm. Learning algorithms indicate how weights which connecting nodes between two layers should be incrementally adapted to improve a predefined performance measure. Learning can be viewed as a search in multidimensional weight space for a solution, which gradually optimizes a predefined objective function [1]. Back-
represent the index of nodes in the input, hidden and output layers. The wij is the
216
weight or synaptic strength connecting from node i in the input layer to node j in the hidden layer, while the
w jk
is the weight or
synaptic strength from node j in the hidden layer to node k in the output layer. Bias or
k
threshold for hidden layer nodes will be represented as node 0 in the input layer, while bias for output layer nodes will be represented as node 0 in the hidden layer. The number of nodes in input, hidden and output layers is equal to l, m and n separately.
Output Layer
wjk
Hidden Layer
wij Input Layer
Figure 5 Architecture of Three-layered Network The back-propagation method is derived by minimizing the error on the output units over all the patterns using the following formula for this error, E, 1 E = (t pk o pk ) 2 (10) 2 p k where p is the subscript for the pattern and k is the subscript for the output units. Then, tpk is the target value of output unit k for pattern p and opk is the actual output value of output layer unit k for pattern p. This is by far the most commonly used error function; however from time to time people have tried other error functions. Notice that E is the sum over all the patterns, however it is assume that by minimizing that error for each pattern individually E will also be minimized. Therefore the subscript, p, will
Then, the node output value for the hidden and output layer, oi and o j will be represented as follows:
oj = 1 net 1+ e j
m
(6) (7)
net j = wij oi
j =1
ok =
1 1 + e netk
n
(8) (9)
net k = w jk o j
k =1
217
be dropped form here on and the weight change formulas for just one pattern will be derived. The first part of the problem is to find how E changes as the weight, E with respect to
output unit changes. The second part of the problem is to find out how E changes as the weight leading into a hidden layer unit j, changes. derivative
wij
w jk
, leading into an
By chain rule, the partial of
w jk
is
E ok net k E = (t k ok )(1 ok )o j = w jk ok net k w jk
(11)
The plot of an error function looks something like the curve in Figure 6. This final term is the slope of the error curve for one weight,
w jk
. If the slope is positive, we
need to decrease the weight by a small amount to lower the error, and if the slope is negative, we need to increase the weight by a small amount3.
McAuley, D., 1997, The BackPropagation Network:
Learning by Examples, URL:http://www2.psy.uq.edu.au/~brainwav/Manual/ BackProp.html
242
Figure 6 Plot of an Error Function
Hence, the weight adjustment can be given by
w jk (t + 1) = w jk (t ) + ( )((t k ok )ok (1 ok )o j
The w jk (t + 1) and w jk (t ) represent the
(12)
this case, a change to the weight
wij
changes
weight after and before adjustment respectively.
oj and this changes the inputs into each unit k in the output layer. The change in E with a change in wij is therefore the sum of the
A concise form is given by w jk = k o j where k = (t k o k )o k (1 o k ) . Next the relation between the changes in E and in the weight wij will be derived. In (13)
changes to each of the output units. Again application of the chain rule, the results will be obtained as
219
The weight changes can be written by

E E o k net k = wij o j net j wij k o k net k
=(tk ok )ok (1ok )wjkoj (1oj )oi
k
o j net j
wij = j oi where j = o j (1 o j ) k w jk .
k
(15)
= k w jk o j (1 o j )oi
k
=oi oj (1oj ) k wjk

k
(14)
Continuously using the procedure, a general formula of weight change for more than three layers of network can be obtained. In order to use the mathematical analysis mentioned above, training set (or training example) which contain several pairs of input and output values (a training pattern)
has to be prepared. Using the known input/output value, the training rule of back-
propagation algorithm can be displayed as pseudo program syntax in Figure 7.
Repeat For each training pattern Train on that pattern End of for loop Until the error comes to a acceptable level
Figure 7 Pseudo Codes for Back-propagation Algorithm
The main procedure for the backpropagation algorithm may be expanded to the following steps.
Randomly generate weights from input to hidden layer and from hidden to output layer, including the weights of bias of nodes in hidden and output layers.
220
For each training pattern, calculate the node output values and delta values for hidden and output layers. Update weights leading to hidden layers and to output layers. Continue step 2 and 3 until all patterns have been trained. If the error is small enough, then training will stop. Otherwise, training will go on. That is, return to step 2, 3 and 4. In the above step, cycle from step 2 to step 4 is typically called an epoch (or iteration). A lower acceptable error level will generally require a large number of epochs.
III. Neural MatLAB Network ToolBox in
needs to be run in the environment of MatLAB. It can be used from simple neuron to multilayer neurons. Three popular Transfer Functions are provided. They are Hard-Limit Transfer Function, Linear Transfer Function and Log-Sigmoid Transfer Function. Figure 8 describes the three functions.
Neural Network ToolBox accessory to MatLAB is widely applied to Industry and business, such as Aerospace, Automotive, Banking, Electronics, Entertainment and etc. The toolbox module
221
Figure 8 Three Popular Transfer Functions Offered The first step in training a feed forward network is to create the network object. The toolbox module offers the function, newff, to create a feed forward network. It requires four inputs and returns the network object. The usage for this function is given as:
net=newff([-1 2; 0 5],[3,1],{'tansig','purelin'},'traingd');
The first input is an R-by-2 matrix of minimum and maximum values for each of the R elements of the input vector. The second input is an array containing the sizes of each layer. The third input is a cell array containing the names of the transfer functions to be used in each layer. The final input contains the name of the training function to be used. There are lots of training functions provided with the toolbox. In this study, Batch Gradient Descent (traingd) is used. This command creates the network object and also initializes the weights and biases of the network; therefore the network is ready for training. There are times when you might want to reinitialize the weights, or to perform a custom initialization. The following command can be used to initialize or reinitialize a net. net = init(net);
There are seven training parameters associated with traingd: epochs show goal time min_grad max_fail lr The learning rate lr is multiplied times the negative of the gradient to determine the changes to the weights and biases. The larger the learning rate is, the bigger the step. If the learning rate is made too large, the algorithm becomes unstable. If the learning rate is set too small, the algorithm takes a long time to converge. The training status is displayed for every 50 iterations of the algorithm. The other parameters determine when the training stops. The training stops if the number of iterations exceeds epochs, if the performance function drops below goal,
222
if the magnitude of the gradient is less than mingrad, or if the training time is longer than time seconds. The max_fail is associated with the early stopping technique. The following code creates a training set of inputs p and targets t. For batch training, all the input vectors are placed in one matrix. p = [-1 -1 2 2;0 5 0 5]; t = [-1 -1 1 1]; Create the feedforward network. Here the function minmax is used to determine the range of the inputs to be used in creating the network. net=newff(minmax(p),[3,1],{'tansig','purelin' },'traingd'); Some of the default training parameters can be modified as bellows: net.trainParam.show = 50; net.trainParam.lr = 0.05; net.trainParam.epochs = 300; net.trainParam.goal = 1e-5; If you want to use the default training parameters, the preceding commands are not necessary. In addition to traingd, there is another batch algorithm for feed forward networks that often provides faster convergence: traingdm, steepest descent with momentum. Momentum allows a network to respond not only to the local gradient, but also to recent trends in the error surface. Acting like a lowpass filter, momentum allows the network to ignore small features in the error surface. Without momentum a network can get stuck in a shallow local minimum. With
momentum a network can slide through such a minimum. A momentum constant, mc, mediates the magnitude of the effect that the last weight change is allowed to have which can be any number between 0 and 1. When the momentum constant is 0, a weight change is based solely on the gradient. When the momentum constant is 1, the new weight change is set to equal the last weight change and the gradient is simply ignored. Based on the functions offered, a example will be provided in next paragraph to demonstrate the details to develop a feed forward neural network.
IV. Illustrated Example
In this section, a combined logic gate will be constructed to develop a feed forward neural network to analyze the inputoutput truth-values. National Instrument LabVIEW [6] is employed to define the logic gate. Figure 7 depicts the front panel for interaction. And the graphical programming diagram block is shown in Figure 8. As A, B and C buttons are pressed; their status will be changed. That is, truthvalue will be toggled from 1 to 0 or vise visa. Each status will bring different byte-value for button Q. With this system, eight patterns truth-value can be used as the dataset for training, validating and testing the feed forward neural network. Table 1 is the truth-value is generated by the system.
223
Figure 7 Front Panel of Logic Gate in LabVIEW
Figure 8 Block Diagram in LabVIEW
224
Table 1 Truth Table for the combined logic gate A 0 0 0 1 1 0 1 1 B 0 1 0 0 1 1 0 1 C 0 0 1 0 0 1 1 1 Q 1 0 1 0 0 1 0 1 There are generally six steps to develop a neural network with MatLAB [5]. 1. Prepare the dataset. 2. Create network object. 3. Initialize the network 4. Set network training parameters. 5. Training the network. 6. Simulate the response with test data. The script command for the network with function provided by the tools is given in the figure 9. In the program, there are three neurons in the hidden layer. One neuron composes the output layer. In this model, logsigmoid transfer functions are both used by the neurons in the two layers and batch gradient descent with momentum training style is employed. The performance goal (Mean Square Error) is set to 0.001. In figure 10, the training session is displayed each 50 iterations. The training comes to stop and met the performance goal.
Figure 9 MatLAB Script File Figure 11 depicts the relationship of Performance goal and Epochs. In this study, since size of the dataset is small. Hence, validation is not conducted. For a lump of dataset, 20% of total number of data can be used for validating to check whether the model is over-trained. Another 20% of data can be used for testing the model. Hence, only 60% of data is used to train the neural network. For more precision, increase training epochs and set the performance goal nearly to zero can get the good result.
225
However, it needs to consume much com putin g resou
rce as well as time.
Figure 10 Part Result of Training Session
226
Figure 11 Relationships between Performance Goal and Epochs
227
V. Conclusion
In this study, mathematical background for feed forward neural networks is reviewed. Use National Instrument LabVIEW to construct a combined logic gate and generate a dataset to use for training neural network. The Neural Network Toolbox in MatLAB is employed to establish the multilayer neural network model. The logsigmoid transfer function is used by neurons in each layer. Batch Gradient Decent with Momentum method is used to train the networks. The size of dataset in the research is small; they are only used for training the net. To construct a more precise model, epochs needs more many and mean square errors should be nearly equal to zero. However, it is time and computing resource consuming. Finding more data for training, validating and testing with searching an optimal network training parameters such as learning rate and momentum will be an interesting topic for future study.
Reference
International Conference on Industrial Engineering-Theory, Applications and Practice, Hsinchu, Taiwan (1995). [3] Kehtarnavaz, N. and Kim N., Digital Signal Processing System-Level Design Using LabVIEW, Elsevier, New York( 2005). [4] Marek J. P, Janos L. and Koster K., Digital Fuzzy Logic Controller: Design and Implementation, IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 439459(1996). [5] Martin T. H., Howard B. D. and Mark H. B., Neural Network Design, University of Colorado Bookstore (2006).
[6] Neural Network User Guide, available from http://www.mathworks.com/access/help desk/help/toolbox/nnet/ (2006). LabVIEW Fundamentals, available from http://sine.ni.com/manuals/ (2006).
[1] Huang, C. L., et al., The Construction of Production Performance Prediction System for Semiconductor Manufacturing with Artificial Neural Network, International Journal of Production Research, Vol. 37, No. 6, 1387 1402(1999). [2] Hassoun, M. H., Fundamentals of Artificial Neural Networks, The MIT Press, Cambridge, Massaachussetts, Proceedings of the 5th Annual
228
1 2
2
MatLAB LabVIEW MatLAB, LabVIEW
229

Logic Gate Analysis With Feed Forward Neural Network

Încărcat de

Informații document

Descriere originală:

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Logic Gate Analysis With Feed Forward Neural Network

Încărcat de

Drepturi de autor:

Formate disponibile

WHAMPOA - An Interdisciplinary Journal 51(2006) 235-251

Logic Gate Analysis with Feed Forward Neural Network

AND Gate NAND Gate

Figure 1 Traditional Symbol for Basic Gates

Source from the web site,

Fraser, N. ,1998, Training Neural Networks, Web

connected to node i, and its bias.The state variable s i of node i is given by

A bias vi associated with each node i.

Graphical representation of sigmoid function is shown in Figure 3.

1 0.9 0.8 0.7 0.6

0.5 0.4 0.3 0.2 0.1 0 -10 -5 0 5 10

Figure 3 Plot of Sigmoid Function

output. These functions are monotonic with

Training Breaks Cycle Numbers Performance

3. The Back-Propagation Algorithm

wij Input Layer

By chain rule, the partial of

E ok net k E = (t k ok )(1 ok )o j = w jk ok net k w jk

. If the slope is positive, we

McAuley, D., 1997, The BackPropagation Network:

Learning by Examples, URL:http://www2.psy.uq.edu.au/~brainwav/Manual/ BackProp.html

Figure 6 Plot of an Error Function

Hence, the weight adjustment can be given by

this case, a change to the weight

weight after and before adjustment respectively.

The weight changes can be written by

=oi oj (1oj ) k wjk

propagation algorithm can be displayed as pseudo program syntax in Figure 7.

Figure 7 Pseudo Codes for Back-propagation Algorithm

Figure 7 Front Panel of Logic Gate in LabVIEW

Figure 8 Block Diagram in LabVIEW

However, it needs to consume much com putin g resou

rce as well as time.

Figure 10 Part Result of Training Session

Figure 11 Relationships between Performance Goal and Epochs

MatLAB LabVIEW MatLAB, LabVIEW

S-ar putea să vă placă și