Outline
- Limitation of the single-layer perceptron
- Multi-layer perceptron (MLP)
- Backpropagation algorithm
- MLP for non-linearly separable classification problems
- MLP for function approximation problems
[Figure: a two-layer MLP with inputs x1, x2, x3, hidden nodes y1, y2, y3 (layer 1), and output node o1 (layer 2); the weights w_{i,j}^{(h)} label the connections.]

The output of node i in layer h is

    y_i^{(h)} = f( \sum_j w_{i,j}^{(h)} y_j^{(h-1)} + \theta_i^{(h)} )

where h = layer number, i = node i of layer h, w_{i,j}^{(h)} = weight from node j of layer h-1 to node i of layer h, and \theta_i^{(h)} = bias of node i.
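As a minimal sketch of this equation in code (not from the slides; the sigmoid activation and all values are assumptions for illustration):

% A single layer as an anonymous function: y = f(W*x + theta).
f     = @(v) 1 ./ (1 + exp(-v));            % logistic sigmoid (assumed)
layer = @(W, theta, x) f(W * x + theta);    % output of one MLP layer
% Example: 3 inputs into 2 hidden nodes with arbitrary weights.
W = [0.2 -0.4 0.1; 0.7 0.3 -0.5];  theta = [0.1; -0.2];  x = [1; 0; 1];
y = layer(W, theta, x)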
The XOR problem:

    x1 : 0 0 1 1
    x2 : 0 1 0 1
    y  : 0 1 1 0
[Figure: a 2-2-1 MLP for XOR with inputs x1, x2, hidden nodes y1, y2, and output node o.]

Layer 1:
    y_1 = f( w_{1,1}^{(1)} x_1 + w_{1,2}^{(1)} x_2 + \theta_1^{(1)} )
    y_2 = f( w_{2,1}^{(1)} x_1 + w_{2,2}^{(1)} x_2 + \theta_2^{(1)} )

Layer 2:
    o = f( w_{1,1}^{(2)} y_1 + w_{1,2}^{(2)} y_2 + \theta_1^{(2)} )

f( ) = activation (or transfer) function.
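One illustrative weight choice (an assumption, not taken from the slides) shows how this 2-2-1 network realizes XOR with a hard-limit activation:

% Hand-picked weights that solve XOR with f(v) = (v >= 0).
f  = @(v) double(v >= 0);
X  = [0 0 1 1; 0 1 0 1];                 % the four XOR inputs (columns)
y1 = f(1*X(1,:) + 1*X(2,:) - 0.5);       % hidden node 1: acts like OR
y2 = f(1*X(1,:) + 1*X(2,:) - 1.5);       % hidden node 2: acts like AND
o  = f(1*y1 - 1*y2 - 0.5);               % output: OR and (not AND) = XOR
disp(o)                                  % prints 0 1 1 0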
Outputs at layer 1: each input pair (x1, x2) is mapped by the hidden nodes to a pair (y1, y2).

[Figure: the x1-x2 input space showing the points (0,0) and (1,0) and the decision line of hidden node 2.]

    Line L2: w_{2,1}^{(1)} x_1 + w_{2,2}^{(1)} x_2 + \theta_2^{(1)} = 0
[Figure: left, the x1-x2 space with the XOR points (0,0), (1,0), (1,1) and the two hidden-node lines; right, the y1-y2 space, where the mapped points of class 0 and class 1 are linearly separable.]

    Line L1: w_{1,1}^{(1)} x_1 + w_{1,2}^{(1)} x_2 + \theta_1^{(1)} = 0
    Line L2: w_{2,1}^{(1)} x_1 + w_{2,2}^{(1)} x_2 + \theta_2^{(1)} = 0
[Figure: the y1-y2 space with the mapped points and line L3, defined by the output-node weights w_{1,1}^{(2)} and w_{1,2}^{(2)}.]

The y1-y2 space is linearly separable; therefore the line L3 can classify (separate) class 0 and class 1.
In this case the activation function is the identity (linear) function, f(x) = x. We need to adjust w1 and w2 so that the network output ŷ is close (or equal) to the target y.
The mean squared error (MSE), where the bar means the average over all training examples:

    MSE = \overline{ (\hat{y} - y)^2 }

[Figure: the MSE plotted as an error surface over the weight plane (w1, w2).]

Therefore, w1 and w2 must be adjusted to reach the minimum point of this error surface.
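As a sketch, the error surface can be computed and plotted for a linear unit ŷ = w1 x1 + w2 x2; the data below is made up for illustration:

% Visualize the MSE surface over a grid of (w1, w2).
X = [0 1 2 3; 1 0 2 1];                  % inputs (each column is one example)
y = [0.5 1.0 3.0 2.5];                   % targets (made up)
[W1, W2] = meshgrid(-2:0.1:4, -2:0.1:4);
MSE = zeros(size(W1));
for k = 1:numel(W1)
    yhat   = W1(k)*X(1,:) + W2(k)*X(2,:);
    MSE(k) = mean((yhat - y).^2);        % average squared error
end
surf(W1, W2, MSE); xlabel('w1'); ylabel('w2'); zlabel('MSE');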
Delta Learning Rule (Widrow-Hoff Rule)

w1 and w2 are adjusted step by step down the error gradient, w_i <- w_i - \eta \, \partial MSE / \partial w_i, until they reach the minimum point:

[Figure: contour plot of the error surface over (w1, w2); successive weight updates step toward the target (minimum) point.]
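A minimal sketch of the rule for a linear unit; the data and learning rate are made up for illustration:

% Delta (Widrow-Hoff) rule for a linear unit yhat = w'*x.
X   = [0 1 2 3; 1 0 2 1];                     % inputs (columns are examples)
y   = [0.5 1.0 3.0 2.5];                      % targets
w   = [0; 0];                                 % initial weights w1, w2
eta = 0.01;                                   % learning rate (assumed)
for epoch = 1:500
    for n = 1:size(X,2)
        yhat = w' * X(:,n);                   % linear unit output
        w    = w + eta * (y(n) - yhat) * X(:,n);  % delta rule update
    end
end
disp(w')                                      % weights near the MSE minimum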
Backpropagation Algorithm
2-layer case:

Output layer:
    o_k = f( \sum_j w_{k,j}^{(2)} y_j + \theta_k^{(2)} ) = f( h_k^{(2)} )

Hidden layer:
    y_j = f( \sum_i w_{j,i}^{(1)} x_i + \theta_j^{(1)} ) = f( h_j^{(1)} )

Input layer: the inputs x_i.

Here h_m^{(n)} denotes the weighted sum of the inputs of node m in layer n.
The derivative of the squared error \varepsilon^2 = \sum_k ( \hat{o}_k - o_k )^2 with respect to w_{k,j}^{(2)}, where \hat{o}_k is the desired output and o_k the actual output:

    \partial \varepsilon^2 / \partial w_{k,j}^{(2)} = -2 ( \hat{o}_k - o_k ) f'( h_k^{(2)} ) y_j        (2.2)

The derivative of \varepsilon^2 with respect to \theta_k^{(2)}:

    \partial \varepsilon^2 / \partial \theta_k^{(2)} = -2 ( \hat{o}_k - o_k ) f'( h_k^{(2)} )        (2.3)
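Equation (2.2) can be checked numerically with a central finite difference; the values below are made up for illustration:

% Numerical check of eq. (2.2) for one output node, sigmoid activation.
f  = @(v) 1./(1 + exp(-v));          % sigmoid
df = @(v) f(v).*(1 - f(v));          % its derivative
yj = [0.3; 0.8];  w = [0.5; -0.2];  theta = 0.1;  o_hat = 1.0;
h  = w' * yj + theta;
grad_formula = -2*(o_hat - f(h)) * df(h) * yj(1);     % eq. (2.2), j = 1
eps2 = @(w1) (o_hat - f(w1*yj(1) + w(2)*yj(2) + theta)).^2;
d = 1e-6;
grad_numeric = (eps2(w(1)+d) - eps2(w(1)-d)) / (2*d); % central difference
fprintf('%g vs %g\n', grad_formula, grad_numeric)     % the two values match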
For a hidden-layer weight w_{j,i}^{(1)} the same chain rule applies, with the error propagated back through the output layer. Each factor of eq. (2.2) has a role:

    \partial \varepsilon^2 / \partial w_{k,j}^{(2)} = [ -2 ( \hat{o}_k - o_k ) ] \cdot [ f'( h_k^{(2)} ) ] \cdot [ y_j ]
        error term                derivative of current node       input from lower node

Writing the error term of node j in layer n as \delta_j^{(n)}, the gradient for any weight is the error term times the input from the lower node:

    \partial \varepsilon^2 / \partial w_{j,i}^{(n)} = \delta_j^{(n)} y_i^{(n-1)},
    where \delta_j^{(n)} = f'( h_j^{(n)} ) \sum_k \delta_k^{(n+1)} w_{k,j}^{(n+1)} for a hidden layer.
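Putting the pieces together, a minimal backpropagation sketch for the 2-2-1 sigmoid network on XOR; the learning rate, initialization, and the factor of 2 absorbed into eta are assumptions:

% 2-layer backprop (2 inputs, 2 hidden, 1 output, sigmoid units) on XOR.
f  = @(v) 1./(1 + exp(-v));
X  = [0 0 1 1; 0 1 0 1];  T = [0 1 1 0];      % XOR patterns and targets
rng(0);                                       % a different seed may be needed
W1 = rand(2,2) - 0.5;  b1 = rand(2,1) - 0.5;  % hidden layer
W2 = rand(1,2) - 0.5;  b2 = rand(1,1) - 0.5;  % output layer
eta = 0.5;
for epoch = 1:20000
    for n = 1:4
        x  = X(:,n);
        y  = f(W1*x + b1);                    % hidden outputs
        o  = f(W2*y + b2);                    % network output
        d2 = (T(n) - o) * o * (1 - o);        % output error term (delta)
        d1 = (W2' * d2) .* y .* (1 - y);      % backpropagated hidden deltas
        W2 = W2 + eta * d2 * y';   b2 = b2 + eta * d2;
        W1 = W1 + eta * d1 * x';   b1 = b1 + eta * d1;
    end
end
disp(f(W2*f(W1*X + repmat(b1,1,4)) + repmat(b2,1,4)))  % approx. 0 1 1 0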
Adjusting Weights for a Nonlinear Unit

Calculating f' in the case of a nonlinear activation function:

1. Sigmoid function
    f(x) = 1 / ( 1 + e^{-2x} )
   We get
    f'(x) = 2 f(x) ( 1 - f(x) )

2. Function tanh(x)
    f(x) = tanh(x)
   We get
    f'(x) = 1 - f(x)^2
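Both derivative formulas can be verified numerically (a quick sketch; the test point is arbitrary):

% Activation functions and derivatives, with a finite-difference check.
f1  = @(x) 1./(1 + exp(-2*x));   df1 = @(x) 2*f1(x).*(1 - f1(x));
f2  = @(x) tanh(x);              df2 = @(x) 1 - f2(x).^2;
xq = 0.7;  d = 1e-6;
fprintf('%g vs %g\n', df1(xq), (f1(xq+d) - f1(xq-d)) / (2*d));  % sigmoid
fprintf('%g vs %g\n', df2(xq), (f2(xq+d) - f2(xq-d)) / (2*d));  % tanh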
Example: Application of MLP for classification

Desired output o: if (x1, x2) lies in a circle of radius 1 centered at the origin, then o = 1; else o = 0.

[Figure: the x1-x2 plane over [-3, 3] x [-3, 3] with the unit circle as the class boundary.]
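A sketch of generating training data for this problem (the grid spacing of 0.5 is an assumption, not taken from the slides):

% Training data for the circle problem over the input range [-3, 3].
[x1g, x2g] = meshgrid(-3:0.5:3, -3:0.5:3);
P = [x1g(:)'; x2g(:)'];                    % inputs, one column per point
T = double(x1g(:)'.^2 + x2g(:)'.^2 < 1);   % o = 1 inside the unit circle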
Network setup: the range of inputs is specified, together with the number of training rounds and the maximum desired error (0.002). The training command is issued, the (continuous) network outputs are computed, and the outputs are then converted to binary.
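A sketch of these steps, assuming the legacy MATLAB Neural Network Toolbox functions newff, train, and sim; the hidden-layer size of 10 follows the figure caption below, and P, T come from the data snippet above:

net = newff(minmax(P), [10 1], {'logsig','logsig'});  % 10 hidden nodes, 1 output
net.trainParam.epochs = 20000;                        % no. of training rounds
net.trainParam.goal   = 0.002;                        % maximum desired error
net = train(net, P, T);                               % training command
A   = sim(net, P);                                    % network outputs (continuous)
Ab  = double(A >= 0.5);                               % convert to binary outputs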
[Figure: the training data in the x1-x2 plane over [-3, 3] x [-3, 3], class 1 inside the circle and class 0 outside. The initial weights of the 10 hidden-layer nodes are displayed as the lines w1 x1 + w2 x2 + \theta = 0.]
[Figure: training curve (training in blue, goal in black) on a logarithmic error scale from 10^0 down to 10^-3, over 20000 epochs.]
[Figure: the resulting hidden-node lines in the x1-x2 plane after training, together with the training curve (training in blue, goal in black) over 10^4 epochs; one hidden node remains an unused node.]
Example: Application of MLP for classification (cont.)

Summary: MLP for Classification Problem
- Each lower-layer (hidden) node of the neural network creates a local decision boundary.
- The upper-layer nodes combine all local decision boundaries into a global decision boundary.
Example: Application of MLP for function approximation

Network setup: the range of inputs, the number of input nodes, and the layer sizes are specified; two configurations are tried, S1 = 6, S2 = 1 and S1 = 3, S2 = 1.

[Figure: network output versus input x over the range 0 to 4.]
Function to be approximated
x = 0:0.01:4; y = (sin(2*pi*x)+1).*exp(-x.^2);
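A sketch of training a network on this function with the legacy newff/train/sim functions; the epoch count and the tansig/purelin transfer functions are assumptions, and S1 = 5 matches the configuration below:

net = newff(minmax(x), [5 1], {'tansig','purelin'});  % 5 hidden nodes (S1 = 5)
net.trainParam.epochs = 5000;                         % assumed; not fixed by the slides
net.trainParam.goal   = 0.002;
net = train(net, x, y);
plot(x, y, x, sim(net, x))                            % target vs. approximation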
A further configuration: S1 = 5; S2 = 1.

[Figure: the network approximation of the target function over the input range 0 to 4.]
Example: Application of MLP for function approximation (cont.)

Summary: MLP for Function Approximation Problem
- Each lower-layer (hidden) node of the neural network creates a local (short) approximated function.
- The upper-layer nodes combine all local approximated functions into a global approximated function covering the whole input range.
Summary
Backpropagation can train multilayer feed-forward networks with differentiable transfer functions to perform function approximation, pattern association, and pattern classification. The term backpropagation refers to the process by which derivatives of the network error, with respect to the network weights and biases, are computed. The numbers of inputs and outputs of the network are constrained by the problem; however, the number of layers between the network inputs and the output layer, and the sizes of those layers, are up to the designer. A two-layer sigmoid/linear network can represent any functional relationship between inputs and outputs if the sigmoid layer has enough neurons.