
The McCulloch Neuron (1943)

[Figure: neuron with inputs p_1, ..., p_n, weights w_1, ..., w_n, a summing junction and threshold b.]

    a = g( Σ_{i=1..n} w_i p_i − b ) = g( wᵀp − b ),    a ∈ {0, 1}

    g = step function

For n = 2 the decision boundary is the line w_1 p_1 + w_2 p_2 = b:
the Euclidean space ℝⁿ is divided into two regions, A and B
(in the (p_1, p_2) plane, region A on one side of the line, region B on the other).
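A minimal MATLAB sketch (not part of the original slide) of this neuron: the weights w, the threshold b and the test patterns P are illustrative values chosen so that the neuron acts as a logical AND.

% McCulloch neuron: a = g(w'*p - b), g = step function
w = [1; 1];                        % illustrative weights
b = 1.5;                           % illustrative threshold
g = @(s) double(s >= 0);           % step function
P = [0 0 1 1; 0 1 0 1];            % four test patterns p (one per column)
a = g(w'*P - b)                    % = [0 0 0 1]: region A only for p = [1;1]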

The McCulloch Neuron
as a pattern classifier
[Figure: point sets of x's and o's in the plane — left: linearly separable collections; right: non-separable collections.]

Some Boolean functions of two variables represented in the binary plane.

Linear and Non-Linear Classifiers

There exist 2^m = 2^(2^n) possible logical functions connecting n inputs to one binary output.

n    # binary patterns    # logical functions    # linearly separable    % linearly separable
1            2                      4                       4                 100
2            4                     16                      14                 87.5
3            8                    256                     104                 40.6
4           16                  65536                    1772                  2.9
5           32             4.3 x 10^9                   94572           2.2 x 10^-3
6           64             1.8 x 10^19               5028134           3.1 x 10^-13

The logical functions of one variable:

    A, ¬A, 0, 1

The logical functions of two variables:

    A, B, ¬A, ¬B, 0, 1,
    A∧B, A∨B, ¬(A∧B), ¬(A∨B), A⊕B, A≡B,
    A∧¬B, ¬A∧B, A∨¬B, ¬A∨B

Two Step Binary Perceptron

Neuron 6 implements a logical AND function by choosing b_6 = Σ_{i=3..5} w_i6.

For example:

    w_36 = w_46 = w_56 = 1/3,  b_6 = 1   ⇒   a_6 = 1 if and only if a_3 = a_4 = a_5 = 1
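A short MATLAB check (illustrative, not from the slide) of this choice: with w_36 = w_46 = w_56 = 1/3 and b_6 = 1, neuron 6 fires only when all three first-layer outputs are 1.

% second-layer AND (neuron 6) over the first-layer outputs a3, a4, a5
w36 = 1/3; w46 = 1/3; w56 = 1/3;   % weights
b6  = w36 + w46 + w56;             % = 1, the threshold chosen above
step = @(s) double(s >= 0);
a6 = @(a3,a4,a5) step(w36*a3 + w46*a4 + w56*a5 - b6);
a6(1,1,1)                          % = 1: all first-layer neurons active
a6(1,0,1)                          % = 0: any inactive input switches the AND off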

Three Step Binary Perceptron
[Figure: three-step binary perceptron — inputs p_1 and p_2 feed a first layer of neurons (3-7), a second layer forms the regions A and B (neurons 9, 10), and the output neuron 11 computes a = A ∧ B.]

Neurons and Artificial Neural Networks
Micro-structure
    characteristics of each neuron in the network
Meso-structure
    organization of the network
Macro-structure
    association of networks, possibly with some analytical processing;
    an approach for complex problems

[Figure: neuron with inputs p_1, ..., p_n, weights w_1, ..., w_n, a summing junction and a bias input b.]

Bias: with p = 0, an output different from 0 is still possible!

Typical activation functions
Linear:
    f(s) = s                                          Hopfield, BSB       purelin

Sign:
    f(s) = +1 if s ≥ 0;  −1 if s < 0                  Perceptron          hardlims

Step:
    f(s) = +1 if s ≥ 0;  0 if s < 0                   Perceptron, BAM     hardlim

Hopfield/BAM:
    f(s) = +1 if s > 0;  −1 if s < 0;                 Hopfield, BAM
           unchanged if s = 0

Typical activation functions

BSB / Logical Threshold (saturating linear):
    f(s) = −K if s ≤ −K;  s if −K < s < +K;  +K if s ≥ +K     BSB                          satlin / satlins

Logistic:
    f(s) = 1 / (1 + e^(−s))                                    Perceptron, Hopfield,        logsig
                                                               BAM, BSB

Hyperbolic Tangent:
    f(s) = tanh(s) = (1 − e^(−2s)) / (1 + e^(−2s))             Perceptron, Hopfield,        tansig
                                                               BAM, BSB
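A small MATLAB sketch (illustrative) of these activation functions written as anonymous functions; the comments give the toolbox names quoted on the slides (purelin, hardlims, hardlim, satlins, logsig, tansig), here with K = 1 for the saturating linear case.

f_lin  = @(s) s;                     % linear                (purelin)
f_sgn  = @(s) 2*double(s >= 0) - 1;  % sign                  (hardlims)
f_stp  = @(s) double(s >= 0);        % step                  (hardlim)
f_sat  = @(s) max(-1, min(1, s));    % saturating linear     (satlins, K = 1)
f_log  = @(s) 1 ./ (1 + exp(-s));    % logistic              (logsig)
f_tanh = @(s) tanh(s);               % hyperbolic tangent    (tansig)
s = -5:0.1:5;                        % plot all six over [-5, 5]
plot(s, [f_lin(s); f_sgn(s); f_stp(s); f_sat(s); f_log(s); f_tanh(s)])
legend('purelin','hardlims','hardlim','satlins','logsig','tansig')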

Meso-Structure Network Organization...

- # neurons per layer
- # network layers
- connection type (forward, backward, lateral)

1 - Multilayer Feedforward
    Multilayer Perceptron (MLP)

Meso-Structure Network Organization...

2 - Single layer, laterally connected (BSB (self-feedback), Hopfield)

3 - Bilayer Feedforward/Feedback

Meso-Structure Network Organization

4 - Multilayer Cooperative/Comparative Network

5 - Hybrid Network

[Figure: hybrid network built from Sub-Network 1 and Sub-Network 2.]

Neural Macro-Structure

[Figure: macro-structure composed of Network 1, Network 2 (sub-networks 2a, 2b, 2c) and Network 3.]

- # networks
- connection type
- size of networks
- degree of connectivity

Supervised Learning
[Figure: supervised learning scheme — input x, network output y, desired output d, error ε = d − y used to adjust the weights.]

Delta Rule (Perceptron):
    w ← w + η ε x          η - learning rate

Widrow-Hoff delta rule (LMS) - ADALINE, MADALINE:
    w_ij ← w_ij + η ε_j x_ij / Σ_k x_k²

Generalized Delta Rule (Multilayer Perceptron, see below)

Delta rule Perceptron
Perceptron (Rosenblatt, 1957)

[Figure: perceptron j with inputs p_1j, ..., p_nj, weights w_1j, ..., w_nj, bias b_j, summation s_j and output y_j, compared with the desired output d.]

Dynamics:

    s_j = Σ_i w_ij p_ij + b_j

    y_j = f(s_j) = +1 if s_j ≥ 0
                    0 if s_j < 0

    ε_j = d_j − y_j

    w_ij ← w_ij + η ε_j x_ij        Delta Rule
    η - learning rate
    If ε_j = 0 the weight is not changed.

Psychological reasoning:
    - positive reinforcement
    - negative reinforcement
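A minimal MATLAB sketch (illustrative, plain MATLAB rather than the toolbox) of this delta-rule training loop for a single perceptron; the AND target, the zero initialization and the learning rate are assumptions chosen to give a separable example.

% Delta rule: w <- w + eta*err*x, err = d - y
P   = [0 0 1 1; 0 1 0 1];              % input patterns (columns)
d   = [0 0 0 1];                       % desired outputs (logical AND, separable)
w   = zeros(2,1);  b = 0;              % initial weights and bias
eta = 0.5;                             % learning rate
for epoch = 1:20
    for k = 1:size(P,2)
        y   = double(w'*P(:,k) + b >= 0);   % f(s): 1 if s >= 0, 0 otherwise
        err = d(k) - y;                      % error eps_j
        w   = w + eta*err*P(:,k);            % weight update (delta rule)
        b   = b + eta*err;                   % bias treated as a weight on input 1
    end
end
double(w'*P + b >= 0)                   % reproduces d = [0 0 0 1]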

ADALINE and MADALINE
Widrow & Hoff, 1960 - (Multiple) Adaptive Linear Element

    y_j = Σ_i w_ij p_ij + b_j

Training:

    ε_j = d_j − s_j = d_j − ( Σ_i w_ij p_ij + b_j )

    w_ij ← w_ij + η ε_j x_ij / Σ_k x_k²

Widrow-Hoff delta rule
LMS - Least Mean Squares algorithm
0.1 < η < 1 - stability and convergence speed.
MatLab: NEWLIN, NEWLIND, ADAPT, LEARNWH

Obs.: with ε_j written as δ_j, w_ij ← w_ij + η δ_j x_ij is the Delta Rule.

LMS Algorithm
Objective: learn a function f: ℝⁿ → ℝ from the samples (x_k, d_k)

{x_k}, {d_k} and {e_k} are stationary stochastic processes.

    e = d − y        actual stochastic error

Linear neuron:
    y = Σ_{i=1..n} x_i w_i = x wᵀ

Expected value:
    E[e²] = E[(d − y)²]
          = E[(d − x wᵀ)²]
          = E[d²] − 2 E[d x] wᵀ + w E[xᵀx] wᵀ

Assuming w deterministic. With
    E[xᵀx] = R      input autocorrelation matrix
    E[d x] = P      cross-correlation vector

    E[e²] = E[d²] − 2 P wᵀ + w R wᵀ

Setting the partial derivatives to 0 for the optimal w*:

    0 = 2 w* R − 2 P    ⇒    w* = P R⁻¹

Optimal analytic solution of the optimization (solvelin.m).
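A small MATLAB sketch (illustrative) of this analytic solution: R and P are estimated from samples and w* = P R⁻¹ is computed directly, in the spirit of solvelin.m / newlind; the data-generating weights and the noise level are assumptions made up for the example.

% Analytic LMS solution w* = P*R^-1 from sample estimates of R and P
N      = 1000;                        % number of samples
x      = randn(N, 3);                 % input samples x_k (one per row)
w_true = [2 -1 0.5];                  % assumed "true" linear relation
d      = x*w_true' + 0.1*randn(N,1);  % desired outputs with a little noise
R = (x'*x)/N;                         % autocorrelation matrix  E[x'x]
P = (d'*x)/N;                         % cross-correlation row vector E[d x]
w_opt = P / R                         % w* = P*R^-1, close to w_true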

Iterative LMS Algorithm
Objective: adaptively learn a function f: ℝⁿ → ℝ from the samples (x_k, d_k)

Knowing P and R (and hence R⁻¹), then for some w:

    ∇_w E[e²] = 2 w R − 2 P

Post-multiplying by R⁻¹:

    ∇_w E[e²] R⁻¹ = 2 w − 2 P R⁻¹ = 2 (w − w*)

    w* = w − ½ ∇_w E[e²] R⁻¹

    w_k+1 = w_k − c_k ∇_w E[e²] R⁻¹        (c_k = ½ → Newton's method)

How to cautiously find new (better) values for w_i, the free parameters?

LMS hypothesis:

    E[ e²_k+1 | e²_0, e²_1, ..., e²_k ] = e²_k

Iterative LMS Algorithm...
Assuming R = I, we obtain the estimated steepest descent algorithm:

    w_k+1 = w_k − c_k ∇_w e²_k

Gradient of e²_k with respect to w:

    ∇_w e²_k = [ ∂e²_k/∂w_1, ..., ∂e²_k/∂w_n ]
             = [ ∂(d_k − y_k)²/∂w_1, ..., ∂(d_k − y_k)²/∂w_n ]
             = [ −2 (d_k − y_k) ∂y_k/∂w_1, ..., −2 (d_k − y_k) ∂y_k/∂w_n ]
             = −2 e_k [ ∂y_k/∂w_1, ..., ∂y_k/∂w_n ]
             = −2 e_k [ x_k1, ..., x_kn ] = −2 e_k x_k        (since y_k = x_k wᵀ)

The LMS algorithm therefore reduces to the iterative (adaptive) solution

    w_k+1 = w_k + 2 c_k e_k x_k

(The optimal solution is never reached!)

With normalization by Σ_k x_k², this is the MADALINE update
w_ij ← w_ij + η ε_j x_ij / Σ_k x_k²    (i - input, j - neuron).
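A minimal MATLAB sketch (illustrative) of this adaptive update w_k+1 = w_k + 2 c e_k x_k with a fixed step size; the target relation (the same illustrative one as in the analytic sketch above) and the step size are assumptions.

% Iterative (adaptive) LMS: w_{k+1} = w_k + 2*c*e_k*x_k
N = 2000;  c = 0.01;                  % number of samples, fixed step size
w_true = [2 -1 0.5];                  % assumed target relation
w = zeros(1, 3);                      % row-vector weights, y_k = x_k*w'
for k = 1:N
    xk = randn(1, 3);                 % input sample
    dk = xk*w_true' + 0.1*randn;      % desired output
    ek = dk - xk*w';                  % instantaneous error e_k
    w  = w + 2*c*ek*xk;               % step along -grad(e_k^2)
end
w                                     % converges towards w* (never exactly reached)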
The Multilayer Perceptron
- The Generalized Delta Rule
Rumelhart, Hinton and Williams, PDP/MIT, 1986

[Figure: two-layer MLP — inputs p_1 = x_1^(0), p_2 = x_2^(0), p_3 = x_3^(0); hidden-layer signals x_1^(1), x_2^(1), x_3^(1); outputs x_1^(2) = y_1, x_2^(2) = y_2.]

Neuron Dynamics:

    s_j^(k) = w_0j^(k) + Σ_i w_ij^(k) x_i^(k−1)

    x_j^(k) = f( s_j^(k) )

Processing Element (PE) j in layer k, input i,
with f (activation function) continuous and differentiable.

Turning Point Question:
How to find the error associated with an internal neuron??

The generalized delta rule
Training:

    ε² = Σ_{j=1..m} (d_j − y_j)²                        quadratic error

    w_j^(k) = ( w_0j^(k), w_1j^(k), ..., w_mj^(k) )     weights of PE j

    x_j^(k−1) = ( 1, x_1j^(k−1), ..., x_nj^(k−1) )      input vector of PE j

With s_j^(k) = w_j^(k) · x_j^(k−1) we have ∂s_j^(k)/∂w_j^(k) = x_j^(k−1).

Instantaneous gradient:

    ∇_j^(k) ε² = ∂ε²/∂w_j^(k) = [ ∂ε²/∂w_0j^(k), ∂ε²/∂w_1j^(k), ..., ∂ε²/∂w_mj^(k) ]
               = (∂ε²/∂s_j^(k)) (∂s_j^(k)/∂w_j^(k)) = (∂ε²/∂s_j^(k)) x_j^(k−1)

Defining the quadratic derivative error as

    δ_j^(k) = −(1/2) ∂ε²/∂s_j^(k)

we get

    ∇_j^(k) ε² = −2 δ_j^(k) x_j^(k−1)

Gradient of the error with respect to the weights as a function of the
former layer signals!!

The generalized delta rule...
For the output layer, the quadratic derivative error is:

    δ_j^(k) = −(1/2) ∂/∂s_j^(k) Σ_{i=1..N_k} (d_i − y_i)²
            = −(1/2) ∂/∂s_j^(k) Σ_{i=1..N_k} (d_i − f(s_i^(k)))²

The partial derivatives are 0 for i ≠ j:

    δ_j^(k) = −(1/2) ∂(d_j − f(s_j^(k)))² / ∂s_j^(k)
            = −(d_j − f(s_j^(k))) ∂(d_j − f(s_j^(k))) / ∂s_j^(k)
            = (d_j − x_j^(k)) f'(s_j^(k))

The output error associated with PE j in the last layer:

    ε_j^(k) = d_j − x_j^(k) = d_j − y_j

Giving:

    δ_j^(k) = ε_j^(k) · f'(s_j^(k))

Remember: the activation function f is continuous and differentiable.

The generalized delta rule...
For a hidden layer k, the quadratic derivative error can be calculated
using the linear outputs of layer k+1 (chain rule):

    δ_j^(k) = −(1/2) ∂ε²/∂s_j^(k)
            = −(1/2) Σ_{i=1..N_k+1} (∂ε²/∂s_i^(k+1)) (∂s_i^(k+1)/∂s_j^(k))
            = Σ_{i=1..N_k+1} δ_i^(k+1) ∂s_i^(k+1)/∂s_j^(k)

Taking into account that

    s_i^(k+1) = w_0i^(k+1) + Σ_{l=1..N_k} w_li^(k+1) f(s_l^(k)),

that ∂f(s_l^(k))/∂s_j^(k) = 0 if l ≠ j, and that ∂f(s_j^(k))/∂s_j^(k) = f'(s_j^(k)):

    ∂s_i^(k+1)/∂s_j^(k) = w_ji^(k+1) f'(s_j^(k))

We have:

    δ_j^(k) = ( Σ_{i=1..N_k+1} δ_i^(k+1) w_ji^(k+1) ) · f'(s_j^(k))
            = ε_j^(k) · f'(s_j^(k)),      with  ε_j^(k) = Σ_{i=1..N_k+1} δ_i^(k+1) w_ji^(k+1)

Finally, the quadratic derivative error for a hidden layer:

    δ_j^(k) = ε_j^(k) · f'(s_j^(k))

The Error Backpropagation algorithm

1. wij( k ) random , initialize the network weigths


m
2. for (x,d), training pair, obtain y. Feedforward propagation: = 2
(d
j =1
j y j )2

3. k last layer
4. for each element j in the layer k do:
Compute (kj ) using (jk ) = d j x (jk ) = d j y j if k is the last layer,
N k +1
(k )
j = i( k +1) w(jik +1) if it is a hidden layer;
i =1

Compute (j k ) = (jk ) . f ( s (jk ) )


5. k k 1 if k > 0 go to step 4, else continue.
6. w (jk ) (n + 1) = w (jk ) (n) + 2 i( k ) x i( k )
7. For the next training pair go to step 2.
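A compact MATLAB sketch (illustrative, plain MATLAB) of these steps for a 2-2-1 network with logistic activations, trained on XOR; the network size, learning rate and number of epochs are assumptions, and a bad random start may occasionally end in a local minimum (see the next slide).

% Error backpropagation for a 2-2-1 MLP with logsig activations
P = [0 0 1 1; 0 1 0 1];   D = [0 1 1 0];   % XOR training pairs (x, d)
f  = @(s) 1./(1 + exp(-s));                % logsig
df = @(x) x.*(1 - x);                      % f'(s) written in terms of x = f(s)
W1 = 0.5*randn(2,3);  W2 = 0.5*randn(1,3); % step 1: random weights (last column = bias)
eta = 0.5;
for epoch = 1:5000
    for n = 1:4
        x0 = P(:,n);                       % layer-0 signals
        x1 = f(W1*[x0; 1]);                % step 2: feedforward, hidden layer
        x2 = f(W2*[x1; 1]);                % output layer, y = x2
        e2 = D(n) - x2;                    % step 4 (last layer): eps = d - y
        d2 = e2 .* df(x2);                 % delta of the output layer
        e1 = W2(:,1:2)' * d2;              % step 4 (hidden layer): backpropagated eps
        d1 = e1 .* df(x1);                 % delta of the hidden layer
        W2 = W2 + 2*eta * d2 * [x1; 1]';   % step 6: weight updates
        W1 = W1 + 2*eta * d1 * [x0; 1]';
    end
end
f(W2*[f(W1*[P; ones(1,4)]); ones(1,4)])    % outputs approach [0 1 1 0]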

The Backpropagation Algorithm in practice

1 In the standard form BP is very slow.


Ee-2 Energia da rede
2 BP Pathologies: paralysis in regions of small gradient.

3 Initial conditions can lead to local minima. PadroStart


Bad esprio Good Start
Valor Inicial

4 Stop conditions number of epochs, wij <

5 BP variants
- trainbpm (with momentum) Padro recuperado
Optimum
PadresMinima
Local armazenados
- trainbpx (adaptive learning rate)
- ....
wi,j
Estados
- trainlm (Levenberg-Marquard J, Jacobian)
e2(wi,j) - Illustrative quadratic error
W (j k ) = ( J T J + J ) 1 J T e as function of the weights

Obs: the error surface is, normally, unknown.

Steepest descent go in the opposite


direction of the local gradient (downhill).
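A one-step MATLAB sketch (illustrative) of the Levenberg-Marquardt update quoted above; it assumes the Jacobian J of the network outputs with respect to the (vectorized) weights W and the error vector e = d − y are already available, and mu is the damping factor.

% Levenberg-Marquardt step: dW = (J'*J + mu*I)^-1 * J'*e
% J  - Jacobian of the outputs w.r.t. the weights (n_samples x n_weights)
% e  - error vector d - y (n_samples x 1)
% mu - damping: large mu ~ gradient descent, small mu ~ Gauss-Newton
dW = (J'*J + mu*eye(size(J,2))) \ (J'*e);
W  = W + dW;                       % apply the step to the vectorized weights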

Computational Tools

SNNS
MatLab
- Neural Network Toolbox
NeuralWorks
Java
C++

Hardware Implementations of ANNs

SNNS - Stuttgart Neural Network Simulator

MatLab
- complete environment
- system simulation
- training
- control

[Figure: Simulink model of a neural Model Reference Controller — a reference model (4th-order liquid-level plant), a neural network controller producing the control signal q_i, the plant output h4 and a scope; the neural blocks use purelin, logsig, tansig and radbas activations together with discrete state-space blocks, matrix gains, switches, a saturation and a zero-order hold.]

Demonstration - Perceptron
% Perceptron
% Training an ANN to learn to classify a non-linear problem
% Input patterns
P=[ 0 0 0 0 1 1 1 1
    0 0 1 1 0 0 1 1
    0 1 0 1 0 1 0 1]
% Target
%T=[1 0 1 1 1 0 1 0]  % linearly separable
T=[1 0 0 1 1 0 1 0]   % non-separable

% Try with Rosenblatt's Perceptron
net=newp(P,T,'hardlim')
% train the network
net=train(net,P,T)

Y=sim(net,P)

Results: for the separable target T = 1 0 1 1 1 0 1 0 the perceptron reproduces it, Y = 1 0 1 1 1 0 1 0;
for the non-separable target T = 1 0 0 1 1 0 1 0 it fails, Y = 1 0 1 0 0 0 1 0.

Demonstration - OCR
[Figure: training vectors (character bitmaps) with 20 % noise presented to the ANN.]

Demonstration OCR...
[Plot: % of misclassifications of the neural OCR classifier vs. noise level.
Training with 10 x (0, 10, 20, 30, 40, 50) % noise — noisy patterns used in training
(up to the given % of bits flipped).
* - error without noisy training patterns
* - error using noisy training patterns]

With some noisy training patterns the network learns how to treat any noise.
Demonstration LMS, ADALINE, FIR
    y(k) = w_0 u(k) + w_1 u(k−1) + w_2 u(k−2) + ... + w_n u(k−n)

    Y(z)/U(z) = w_0 + w_1 z⁻¹ + w_2 z⁻² + ... + w_n z⁻ⁿ

FIR model (always stable, only zeros)

Obs.: an IIR model is more compact, but can be unstable!

    g_1 (0 - 79.9 s) = 1 / (s² + 0.2 s + 1)        g_2 (80 - 150 s) = 3 / (s² + 2 s + 1)

The system changes at 80 s.    Sampling time Ts = 0.1 s.

(TDL - Tapped Delay Line)

[Plot: system output from 0 to 150 s, amplitude roughly between −4 and 4.]
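A tiny MATLAB sketch (illustrative): the FIR model above is just a weighted tapped delay line, so its response to an input sequence can be computed with filter(); the weight values here are arbitrary examples.

% FIR model: y(k) = w0*u(k) + w1*u(k-1) + ... + wn*u(k-n)
w = [0.5 0.3 0.1 -0.05];           % example FIR weights w0..w3
u = idinput(200, 'PRBS');          % PRBS excitation, as in the demo that follows
y = filter(w, 1, u);               % always stable: only zeros, no poles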

Demo LMS, ADALINE, FIR...
% ADALINE - Adaptive dynamic system identification
% First sampled system - until 80 sec
g1=tf(1,[1 .2 1]), gd1=c2d(g1,.1)
% System changes dramatically - after 80 sec
g2=tf(3,[1 2 1]), gd2=c2d(g2,.1)

% Pseudo Random Binary Signal - good for identification
u=idinput(120*10,'PRBS',[0 0.01],[-1 1]);

% time vector
...
[y1,t1,x1]=lsim(gd1,u1,t1);
[y2,t2,x2]=lsim(gd2,u2,t2,x1);

% Create a new ADALINE network with delayed inputs (FIR)
% Learning rate = 0.09
net=newlin(t,y,[1 2 3 4 5 6 7 8 9 10],0.09)
[net,Y,E]=adapt(net,t,y)

% design an average transfer function
netd=newlind(t,y)

Demo LMS, ADALINE, FIR...
RMSE Set 1=6.5742 u=idinput(1500,'PRBS',[0 0.01])
4

2 n=10, lr=0.1
0

-2

-4
0 500 1000 1500
ADALINE
Error
Learns System AND 4

also Changes in the Dynamics!! 2

-2

-4
0 500 1000 1500

10
RMSE Set 2=22.7817 u=idinput(1200,'PRBS',[0 0.05])

5 n=10, lr=0.1 Verification Signal


0
But, in other frequency range
not so good... -5
0 200 400 600 800 1000 1200

(needs to Adjust TDL, lr, Ts) Error


5

-5
0 200 400 600 800 1000 1200

