
Adaline/Madaline

Dr. Bernard Widrow*


Professor of Electrical Engineering, Stanford University

Dr. Bernard Widrow is Professor of Electrical Engineering at Stanford University. His fields of teaching and research are signal processing, neural networks, acoustics, and control systems. Before coming to Stanford in 1959, he taught at MIT, where he received the Doctor of Science degree in 1956. Dr. Widrow is the author of two books, Adaptive Signal Processing and Adaptive Inverse Control, both published by Prentice-Hall. Each is the first of its kind, establishing new fields of research and engineering that are being pursued worldwide by students, faculty, and practicing engineers. Dr. Widrow is the inventor or co-inventor of 17 patents. One of his inventions, an adaptive filter based on the LMS (least mean square) algorithm, is used in almost all the computer modems in the world, making high-speed digital communications (such as the Internet) possible. He is co-inventor of a directional hearing aid that will enable many people with severe to profound hearing loss to regain speech recognition and communication ability. Dr. Widrow has started Cardinal Sound Labs to develop and commercialize the technology. He has been honored many times for his research. The Institute of Electrical and Electronics Engineers (IEEE) elected him a Fellow in 1976. In 1984, he received the IEEE Alexander Graham Bell Medal. He was inducted into the National Academy of Engineering in 1995. Dr. Widrow is currently supervising ten doctoral students at Stanford. Over the years, more than sixty students have completed their Ph.D.s under his supervision. Many of his former students have become founders and top scientists in Silicon Valley companies. About ten have become university professors, four have gone on to medical school and become MDs, and two have become Admirals in the U.S. Navy.
*http://www.svec.org/hof/1999.html#widrow

Adaline
The name comes from ADAptive LInear NEuron, a tribute to its resemblance to a single biological nerve cell.

Invented by Bernard Widrow in 1959. Like the perceptron, it uses a threshold logic device that performs a linear summation of its inputs, so a single adaline can classify only linearly separable patterns.
Its weight parameters are adapted over time.

Judith Dayhoff, Neural Network Architectures: An Introduction, Van Nostrand Reinhold

Adaline Structure

Neural Computing: NeuralWorks, NeuralWare, Inc
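
As a concrete illustration of this structure, here is a minimal Python sketch of an adaline's forward pass (the function name and the NumPy representation are mine, not from the slides); it assumes the convention used later in these slides of a fixed bias input x0 = +1:

```python
import numpy as np

def adaline_output(w, x):
    """Adaline forward pass: a linear summation of the inputs
    followed by a hard threshold (a threshold logic device).

    w -- weight vector; w[0] is the bias/threshold weight
    x -- input vector with x[0] fixed at +1 (the bias input)
    """
    excitation = np.dot(w, x)           # linear summation
    return 1 if excitation > 0 else -1  # hard threshold to a bipolar output
```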

Adaline Learning Algorithm


A learning control mechanism samples the inputs, the output, and the desired output, and uses these to adjust the weights. There are several variants of the adaline learning algorithm.
We use the variant from B. Widrow and F. W. Smith, "Pattern-Recognizing Control Systems," Computer and Information Sciences Symposium Proceedings, Spartan Books, Washington, DC, 1963:
$$W_i(t+1) = W_i(t) + \eta\left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]X_i(t)$$

where $0 \le i \le n$ and $\eta$ is the learning rate, usually a small number ranging from 0 to 1 (typically $1/n$).
Neural Computing: NeuralWorks, NeuralWare, Inc
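
A direct transcription of this update into Python (a minimal sketch; `eta` stands for the learning rate $\eta$ in the formula above):

```python
import numpy as np

def lms_update(w, x, d, eta):
    """One Widrow-Hoff (LMS) update of the weight vector.

    Implements W(t+1) = W(t) + eta * [d(t) - W(t).X(t)] * X(t),
    where the error uses the linear (unthresholded) output.
    """
    error = d - np.dot(w, x)  # d(t) minus the weighted sum
    return w + eta * error * x
```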

Adaline Learning Algorithm


The algorithm computes the error signal at each iteration and adjusts the weights to reduce that error using the delta rule, also known as the Widrow-Hoff learning rule.
This algorithm has been shown to converge: at the very least, it finds the set of weights that minimizes the error in a least-mean-square (LMS) sense.

Neural Computing: NeuralWorks, NeuralWare, Inc

Least Mean Square Error


The delta rule for adjusting the $i$th weight for each training pattern is

$$W_i(t+1) = W_i(t) + \eta\left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]X_i(t)$$

The squared error for a particular training pattern is

$$E = \left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]^2$$

L. Fausett, Fundamentals of Neural Networks, Prentice Hall

Least Mean Square Error (Cont.)


The error can be reduced by adjusting the weight $W_i$ in the direction of the negative gradient $-\partial E / \partial W_i$. Expanding the squared error,

$$E = \left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]^2 = d(t)^2 - 2\,d(t)\sum_{i=1}^{n} W_i(t)\,X_i(t) + \left[\sum_{i=1}^{n} W_i(t)\,X_i(t)\right]^2$$

and

$$\frac{\partial E}{\partial W_i} = -2\left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]X_i(t)$$

The local error will be reduced most rapidly (for a given learning rate) by adjusting the weights according to the delta rule:

$$W_i(t+1) = W_i(t) + \eta\left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]X_i(t)$$
L. Fausett, Fundamentals of Neural Networks, Prentice Hall
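
The derivation can be checked numerically. The sketch below (weights, inputs, and the desired output are arbitrary values of my choosing) compares the analytic gradient above with a central finite-difference estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.uniform(-0.5, 0.5, size=3)   # arbitrary weights
x = np.array([1.0, -1.0, 1.0])       # arbitrary bipolar inputs, x[0] = bias
d = 1.0                              # desired output

def squared_error(w):
    return (d - np.dot(w, x)) ** 2   # E for this single pattern

# Analytic gradient from the derivation: dE/dW_i = -2 [d - sum W X] X_i
analytic = -2.0 * (d - np.dot(w, x)) * x

# Central finite differences as an independent check
h = 1e-6
numeric = np.array([
    (squared_error(w + h * np.eye(3)[i]) - squared_error(w - h * np.eye(3)[i])) / (2 * h)
    for i in range(3)
])

print(np.allclose(analytic, numeric))  # True: the delta rule follows the negative gradient
```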

Adaline: Storage Capacity*


[Figure: probability that N patterns can be stored, plotted against the ratio N/(n+1) from 1.0 to 3.0. The probability falls from 1.0 to 0.0 in the region N/(n+1) of roughly 1.5 to 2.5, with the transition sharpening as n grows (curves shown for n > 5 and n > 50).]

Estimates of the storage capacity of an adaline have been made and verified experimentally.

N = number of patterns to be trained; n + 1 = number of weights (the n input weights plus the bias weight).

Neural Computing: NeuralWorks, NeuralWare, Inc
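
These capacity estimates can be reproduced, roughly, by experiment. The sketch below (all names and parameter choices are mine) trains an adaline on N random bipolar patterns with random targets and reports how often training succeeds; since LMS training may miss a perfect classification even when one exists, this gives a conservative estimate:

```python
import numpy as np

def storage_success_rate(N, n, trials=100, epochs=200, eta=None, seed=0):
    """Fraction of trials in which an adaline learns N random bipolar
    patterns on n inputs perfectly: a rough storage-capacity estimate."""
    rng = np.random.default_rng(seed)
    eta = eta if eta is not None else 1.0 / (n + 1)
    successes = 0
    for _ in range(trials):
        X = rng.choice([-1.0, 1.0], size=(N, n))
        X = np.hstack([np.ones((N, 1)), X])          # bias column x0 = +1
        d = rng.choice([-1.0, 1.0], size=N)          # random desired outputs
        w = rng.uniform(-0.5, 0.5, size=n + 1)
        for _ in range(epochs):
            for xi, di in zip(X, d):
                w += eta * (di - np.dot(w, xi)) * xi # Widrow-Hoff update
        y = np.where(X @ w > 0, 1, -1)
        successes += np.array_equal(y, d)
    return successes / trials

# Expect a rate near 1 for N well below 2(n+1) and near 0 well above it
print(storage_success_rate(N=20, n=20), storage_success_rate(N=60, n=20))
```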

Adaline: Learning Procedure
Step 1: Initialize Weights (W1..Wn) and Threshold (W0)
Set all weights and the threshold to small bipolar random values.

Step 2: Present New Input and Desired Output

Present the input vector x1, x2, ..., xn along with the desired output d(t).
Note: x0 is a fixed bias input, always set equal to +1; d(t) takes the value +1 or -1.

Adaline: Learning Procedure
Step 3: Calculate Actual Output [y(t)]

$$y(t) = F_h\left[\sum_{i=0}^{n} w_i(t)\,x_i(t)\right]$$

where $F_h(e) = +1$ when $e > 0$, and $-1$ when $e \le 0$.

Step 4: Adapt Weights

$$W_i(t+1) = W_i(t) + \eta\left[d(t) - \sum_{i=1}^{n} W_i(t)\,X_i(t)\right]X_i(t)$$

where $0 \le i \le n$.

Note:
$\eta$ is the learning rate and is usually a small number ranging from 0 to 1.
$w_i(t+1) = w_i(t)$ if $d(t) = y(t)$.

Adaline: Learning Procedure
Step 5: Repeat Steps 2 to 4
Repeat until the desired outputs and the actual network outputs are equal for all the input vectors of the training set.
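
Putting steps 1 through 5 together, here is a sketch of the full procedure in Python (the function name and the `max_epochs` safety cap are mine, not from the slides):

```python
import numpy as np

def train_adaline(patterns, targets, eta=None, max_epochs=1000, seed=0):
    """Adaline learning procedure, following steps 1-5 above.

    patterns -- (N, n) array of bipolar input vectors
    targets  -- length-N array of desired outputs, each +1 or -1
    """
    rng = np.random.default_rng(seed)
    N, n = patterns.shape
    X = np.hstack([np.ones((N, 1)), patterns])    # step 2: fixed bias x0 = +1
    w = rng.uniform(-0.5, 0.5, size=n + 1)        # step 1: small random weights
    eta = eta if eta is not None else 1.0 / n     # typical learning rate 1/n
    for _ in range(max_epochs):                   # step 5: repeat ...
        y = np.where(X @ w > 0, 1, -1)            # step 3: thresholded outputs
        if np.array_equal(y, targets):            # ... until all outputs match
            break
        for xi, di in zip(X, targets):
            w += eta * (di - np.dot(w, xi)) * xi  # step 4: adapt weights
    return w

# Example: the (linearly separable) AND function in bipolar form
w = train_adaline(np.array([[-1.0, -1], [-1, 1], [1, -1], [1, 1]]),
                  np.array([-1, -1, -1, 1]))
```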

Thoughts on Adaline
Same basic neural structure as the perceptron.
A single adaline can classify only linearly separable patterns.
The Widrow-Hoff update rule guarantees that the weights will minimize the error in a least-mean-square sense, and thus the local error is reduced most rapidly.
Experimental results indicate that an adaline typically converges to a stable solution in about five times as many learning trials as there are weights.
Neural Computing: NeuralWorks, NeuralWare, Inc

Madaline
Consists of Many Adaptive Linear Neurons arranged in a multilayer net. Employs a majority-vote rule on the outputs of the adaline layer:
if more than half of the adalines output +1, then the madaline unit outputs +1 (and similarly for -1).

Can classify patterns that are not linearly separable, similar to a multilayer perceptron. The original learning rule uses the Widrow-Hoff rule.

Judith Dayhoff, Neural Network Architectures: An Introduction, Van Nostrand Reinhold

Madaline Structure
[Figure: madaline structure. An input layer feeds a layer of adalines through adjustable weights; the adaline outputs feed a fixed majority function that produces the madaline output.]


Neural Computing: NeuralWorks, NeuralWare, Inc
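
A sketch of this structure in Python (names are mine; an odd number of adalines keeps the vote unambiguous):

```python
import numpy as np

def madaline_output(W, x):
    """Madaline forward pass.

    W -- (k, n+1) array; row k holds adaline k's weights (bias weight first)
    x -- input vector with x[0] fixed at +1
    """
    y = np.where(W @ x > 0, 1, -1)   # adaline layer: thresholded outputs
    return 1 if y.sum() > 0 else -1  # fixed majority vote on the outputs
```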

Madaline: Learning Procedure
Step 1: Initialize Weights (Wk1..Wkn) and Threshold (Wk0)
Set all weights and thresholds to small bipolar random values.
k denotes adaline unit k, and n is the number of inputs to each adaline unit.

Step 2: Present New Input and Desired Output

Present the input vector x1, x2, ..., xn along with the desired output d(t).
Note: x0 is a fixed bias input, always set equal to +1; d(t) takes the value +1 or -1.

Madaline: Learning Procedure
Step 3: Calculate Actual Adaline Outputs [yk(t)]

$$y_k(t) = F_h\left[\sum_{i=0}^{n} w_{ki}(t)\,x_i(t)\right]$$

where $F_h(e) = +1$ when $e > 0$, and $-1$ when $e \le 0$; $y_k(t)$ is the output from adaline unit $k$.

Step 4: Determine Actual Madaline Output [M(t)]

$$M(t) = \mathrm{Majority}\big(y_k(t)\big)$$

Madaline: Learning Procedure
Step 5: Determine Error and Update Weights
If M(t) equals the desired output, there is no need to update the weights. Otherwise:
In a madaline network, the processing elements in the adaline layer compete. The winner is the neuron whose excitation (weighted sum) is nearest to zero but whose output is wrong. Only this neuron is adapted:

$$W_{ci}(t+1) = W_{ci}(t) + \eta\left[d(t) - \sum_{i=1}^{n} W_{ci}(t)\,X_i(t)\right]X_i(t)$$

where $0 \le i \le n$, $\eta$ is the learning rate (typically $1/n$), and $c$ is the winning adaline unit.

Madaline: Learning Procedure
Step 6: Repeat Steps 2 to 5
Repeat until the desired outputs and the actual network outputs are equal for all the input vectors of the training set.
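
Here is a sketch of steps 1 through 6 in Python (the function name, the `max_epochs` cap, and the default learning rate are mine; adapting only the wrong-output adaline whose excitation is nearest zero is sometimes described as a minimal-disturbance principle):

```python
import numpy as np

def train_madaline(patterns, targets, k=3, eta=None, max_epochs=100, seed=0):
    """Madaline learning procedure, following steps 1-6 above.

    k adalines share the same inputs; when the majority output is wrong,
    only the wrong-output adaline whose excitation is nearest zero adapts.
    """
    rng = np.random.default_rng(seed)
    N, n = patterns.shape
    X = np.hstack([np.ones((N, 1)), patterns])       # bias input x0 = +1
    W = rng.uniform(-0.5, 0.5, size=(k, n + 1))      # step 1: random weights
    eta = eta if eta is not None else 1.0 / n        # typical learning rate
    for _ in range(max_epochs):                      # step 6: repeat ...
        all_correct = True
        for xi, di in zip(X, targets):               # step 2: present pattern
            s = W @ xi                               # step 3: excitations
            y = np.where(s > 0, 1, -1)
            M = 1 if y.sum() > 0 else -1             # step 4: majority output
            if M == di:
                continue                             # step 5: no update needed
            all_correct = False
            wrong = np.flatnonzero(y != di)          # adalines with wrong outputs
            c = wrong[np.argmin(np.abs(s[wrong]))]   # winner: nearest to zero
            W[c] += eta * (di - s[c]) * xi           # Widrow-Hoff update, winner only
        if all_correct:
            break
    return W
```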

Madaline: Example
Train a madaline to recognize the following (the XOR problem):

X1    X2    Desired O/P
-1    -1    -1
-1    +1    +1
+1    -1    +1
+1    +1    -1

[Figure: the four training patterns plotted in the (x1, x2) plane.]
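
Using the `train_madaline` sketch from the previous section, this training set can be written as:

```python
import numpy as np

# The XOR training set from the table above, in bipolar form
patterns = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
targets = np.array([-1, 1, 1, -1])  # +1 exactly when the two inputs differ

W = train_madaline(patterns, targets, k=3)  # three adalines, as in the figure below
```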

Madaline: Example (Cont.)

[Figure: the madaline for this example. Inputs x1 and x2, plus a fixed bias input of +1, feed three adalines (#1, #2, #3) through weights wk0, wk1, wk2; the three adaline outputs feed a majority element (Maj) that produces the madaline output.]

Madaline: Example (Cont.)
Step 1: Set all weights and thresholds to small bipolar random values:

        wk0       wk1       wk2
#1      0.0037    0.3566   -0.4300
#2     -0.2779    0.0232    0.1117
#3     -0.3823    0.2843    0.4550

Madaline: Example (Cont.)
[Figure: the initial decision lines wk0 + wk1*x1 + wk2*x2 = 0 for the three adalines, plotted in the (x1, x2) plane.]

Madaline: Example (Cont.)
Step 2: Present new input and desired output
Let's apply input (-1, -1) with desired output = -1.

Step 3: Calculate actual adaline outputs [yk(t)]
y1(t) = F(0.0037 + (0.3566 * -1) + (-0.43 * -1)) = F(+0.0771) = +1
y2(t) = F(-0.2779 + (0.0232 * -1) + (0.1117 * -1)) = F(-0.4128) = -1
y3(t) = F(-0.3823 + (0.2843 * -1) + (0.455 * -1)) = F(-1.1216) = -1

Step 4: Determine actual madaline output [M(t)]
M(t) = Majority(+1, -1, -1) = -1
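
The same computation, checked in NumPy with the initial weights from step 1:

```python
import numpy as np

W = np.array([[ 0.0037, 0.3566, -0.4300],   # adaline #1: w10, w11, w12
              [-0.2779, 0.0232,  0.1117],   # adaline #2
              [-0.3823, 0.2843,  0.4550]])  # adaline #3
x = np.array([1.0, -1.0, -1.0])             # bias +1, then input (-1, -1)

s = W @ x                                   # excitations: [0.0771, -0.4128, -1.1216]
y = np.where(s > 0, 1, -1)                  # adaline outputs: [+1, -1, -1]
M = 1 if y.sum() > 0 else -1                # majority output: -1
```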

Madaline: Example (Cont.)
Step 5: Determine error and update weights
Since M(t) equals the desired output, no weight updates are needed.

Step 6: Repeat steps 2 to 5
Repeat until the desired outputs and the actual network outputs are equal for all the input vectors of the training set.

Madaline: Example (Cont.)
Repeat Step 2: Present new input and desired output
Let's apply input (-1, +1) with desired output = +1.

Step 3: Calculate actual adaline outputs [yk(t)]
y1(t) = F(0.0037 + (0.3566 * -1) + (-0.43 * 1)) = F(-0.7829) = -1
y2(t) = F(-0.2779 + (0.0232 * -1) + (0.1117 * 1)) = F(-0.1894) = -1
y3(t) = F(-0.3823 + (0.2843 * -1) + (0.455 * 1)) = F(-0.2116) = -1

Step 4: Determine actual madaline output [M(t)]
M(t) = Majority(-1, -1, -1) = -1

Madaline: Example (Cont.)
Step 5: Determine error and update weights
M(t) does not equal the desired output, and adaline #2 is the winning neuron: its excitation of -0.19 is nearest to zero (vs. -0.78 for #1 and -0.21 for #3). Update only adaline #2.

Madaline: Example (Cont.)
[Figure: the decision lines wk0 + wk1*x1 + wk2*x2 = 0 after the update; only adaline #2's line has moved.]

Madaline: Example (Cont.)
Step 6: Repeat steps 2 to 5
Repeat until the desired outputs and the actual network outputs are equal for all the input vectors of the training set.

Madaline: Example (Cont.)
After 3 epochs, the desired and actual outputs match for all four training patterns.

Madaline: Example (Cont.)
[Figure: the final decision lines wk0 + wk1*x1 + wk2*x2 = 0 for the three adalines.]

Madaline: Example (Cont.)
Same problem with a new set of random weights:

Madaline: Example (Cont.)
The solution converged after 3 epochs.
