1/10/2017
Neural Networks for Pattern Classification
Neural Networks for Pattern Classification
• General discussion
• Linear separability
• Hebb nets
• Perceptron
General discussion
• Pattern recognition
• Patterns: images, personal records, driving habits, etc.
• Represented as a vector of features (encoded as integers or real numbers in NN)
• Pattern classification:
• Classify a pattern to one of the given classes
• Form pattern classes
• Pattern associative recall
• Using a pattern to recall a related pattern
• Pattern completion: using a partial pattern to recall the whole pattern
• Pattern recovery: deals with noise, distortion, missing information
• General architecture: single-layer net

net input to Y:

net = b + Σ_{i=1}^{n} x_i w_i

bias b is treated as the weight from a special unit with constant output 1.

threshold θ related to Y output:

y = f(net) = { 1 if net ≥ θ; -1 if net < θ }

classify (x_1, ..., x_n) into one of the two classes
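This single-layer computation can be sketched in a few lines of Python (a minimal illustration; the weight and bias values below are assumptions chosen to match the bipolar AND boundary discussed later, not values prescribed on this slide):

```python
def classify(x, w, b, theta=0.0):
    """Single-layer net: net = b + sum_i x_i * w_i; output 1 if net >= theta, else -1."""
    net = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if net >= theta else -1

# Hypothetical weights realizing the bipolar AND boundary -1 + x1 + x2 = 0.
w, b = [1.0, 1.0], -1.0
print(classify([1, 1], w, b))   # class one: 1
print(classify([1, -1], w, b))  # class two: -1
```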
• Decision region/boundary: n = 2, b ≠ 0, θ = 0

b + x_1 w_1 + x_2 w_2 = 0, or x_2 = -(w_1 / w_2) x_1 - b / w_2,

is a line, called the decision boundary, which partitions the plane into two decision regions.
• If a point/pattern (x_1, x_2) is in the positive region, then b + x_1 w_1 + x_2 w_2 ≥ 0, and the output is 1 (belongs to class one)
• Otherwise, b + x_1 w_1 + x_2 w_2 < 0, and the output is -1 (belongs to class two)

n = 2, b = 0, θ ≠ 0 would result in a similar partition.
• If n = 3 (three input units), then the decision boundary is a two-dimensional plane in a three-dimensional space.
• In general, a decision boundary

b + Σ_{i=1}^{n} x_i w_i = 0

is an (n-1)-dimensional hyperplane in an n-dimensional space, which partitions the space into two decision regions.
• This simple network thus can classify a given pattern into one of the two classes, provided one of these two classes is entirely in one decision region (one side of the decision boundary) and the other class is in another region.
• The decision boundary is determined completely by the weights W and the bias b (or threshold θ).
Linear Separability Problem
• If two classes of patterns can be separated by a decision boundary, represented by the linear equation

b + Σ_{i=1}^{n} x_i w_i = 0

then they are said to be linearly separable. The simple network can correctly classify any patterns.
• Decision boundary (i.e., W, b or θ) of linearly separable classes can be determined either by some learning procedures or by solving linear equation systems based on representative patterns of each class
• If such a decision boundary does not exist, then the two classes are said to be linearly inseparable.
• Linearly inseparable problems cannot be solved by the simple network; a more sophisticated architecture is needed.
• Examples of linearly separable classes

 Logical AND function
patterns (bipolar):          decision boundary:
 x1   x2   y                 w1 = 1
  1    1   1                 w2 = 1
  1   -1  -1                 b = -1
 -1    1  -1                 θ = 0
 -1   -1  -1                 -1 + x1 + x2 = 0

 Logical OR function
patterns (bipolar):          decision boundary:
 x1   x2   y                 w1 = 1
  1    1   1                 w2 = 1
  1   -1   1                 b = 1
 -1    1   1                 θ = 0
 -1   -1  -1                 1 + x1 + x2 = 0

x: class I (y = 1), o: class II (y = -1)
• Examples of linearly inseparable classes

 Logical XOR (exclusive OR) function
patterns (bipolar):
 x1   x2   y
  1    1  -1
  1   -1   1
 -1    1   1
 -1   -1  -1

x: class I (y = 1), o: class II (y = -1)
No line can separate these two classes.
• XOR can be solved by a more complex network with hidden units
(x1, x2)      (z1, z2)      y
(1, 1)        (-1, -1)     -1
(1, -1)       (1, -1)       1
(-1, 1)       (-1, 1)       1
(-1, -1)      (-1, -1)     -1

where the hidden units compute z1 = x1 AND NOT x2 and z2 = x2 AND NOT x1, and the output unit computes y = z1 OR z2.
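The hidden-unit construction above can be checked in code. The weights below are one concrete choice implementing z1 = x1 AND NOT x2 and z2 = x2 AND NOT x1 (an assumption for illustration; the slide does not give specific weights):

```python
def bipolar_step(net):
    return 1 if net >= 0 else -1

def xor_net(x1, x2):
    # Hidden units: z1 = x1 AND NOT x2, z2 = x2 AND NOT x1 (hypothetical weights).
    z1 = bipolar_step(x1 - x2 - 1)
    z2 = bipolar_step(-x1 + x2 - 1)
    # Output unit: y = z1 OR z2.
    return bipolar_step(z1 + z2 + 1)

for x1, x2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    print((x1, x2), "->", xor_net(x1, x2))
```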
Different non-linearly separable problems
Can a single neuron learn a task?
Hebb Nets

• Hebb, in his influential book The Organization of Behavior (1949), claimed:
• Behavior changes are primarily due to the changes of synaptic strengths (w_ij) between neurons i and j
• w_ij increases only when both i and j are "on": the Hebbian learning law
• In ANN, the Hebbian law can be stated: w_ij increases only if the outputs of both units x_i and y_j have the same sign.
• In our simple network (one output and n input units):

Δw_ij = w_ij(new) - w_ij(old) = x_i · y

or, w_ij(new) = w_ij(old) + x_i · y
• Hebb net (supervised) learning algorithm (p.49)
Step 0. Initialization: b = 0, w_i = 0, i = 1 to n
Step 1. For each training sample s:t, do Steps 2-4
  /* s is the input pattern, t the target output of the sample */
Step 2. x_i := s_i, i = 1 to n   /* set s to input units */
Step 3. y := t                   /* set y to the target */
Step 4. w_i := w_i + x_i * y, i = 1 to n   /* update weights */
        b := b + y                          /* update bias; the bias input is the constant 1 */
Notes: 1) α = 1; 2) each training sample is used only once.
• Example: AND function
• Binary units (1, 0), with a bias unit whose input is always 1

(x1, x2, 1)    y=t    w1   w2   b
(1, 1, 1)       1      1    1   1
(1, 0, 1)       0      1    1   1
(0, 1, 1)       0      1    1   1
(0, 0, 1)       0      1    1   1

An incorrect boundary 1 + x1 + x2 = 0 is learned after using each sample once.
• Bipolar units (1, -1)

(x1, x2, 1)     y=t    w1   w2    b
(1, 1, 1)        1      1    1    1
(1, -1, 1)      -1      0    2    0
(-1, 1, 1)      -1      1    1   -1
(-1, -1, 1)     -1      2    2   -2

A correct boundary -1 + x1 + x2 = 0 is successfully learned.
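The bipolar trace above can be reproduced with a short script (α = 1 and each sample used once, as in the algorithm):

```python
# Hebb rule on the bipolar AND samples: w_i := w_i + x_i * t, b := b + t (alpha = 1).
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = [0.0, 0.0], 0.0
for (x1, x2), t in samples:
    w[0] += x1 * t
    w[1] += x2 * t
    b += t        # bias unit has constant input 1
print(w, b)       # [2.0, 2.0] -2.0, i.e. the boundary -1 + x1 + x2 = 0 after scaling
```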
• It will fail to learn x1 ^ x2 ^ x3, even though the function is linearly separable.
• Stronger learning methods are needed.
• Error driven: for each sample s:t, compute y from s based on current W and b, then compare y and t
• Use training samples repeatedly, and each time only change weights slightly (α << 1)
• Learning methods of Perceptron and Adaline are good examples
The Perceptron
In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a simple ANN: a perceptron.
The operation of Rosenblatt's perceptron is based on the McCulloch and Pitts neuron model. The model consists of a linear combiner followed by a hard limiter.
The weighted sum of the inputs is applied to the hard limiter, which produces an output equal to +1 if its input is positive and -1 if it is negative.
Single-layer two-input perceptron

[Figure: inputs x1 and x2 feed a linear combiner followed by a hard limiter with threshold θ, producing output Y.]
The aim of the perceptron is to classify inputs x_1, x_2, ..., x_n into one of two classes, say A_1 and A_2.
In the case of an elementary perceptron, the n-dimensional space is divided by a hyperplane into two decision regions. The hyperplane is defined by the linearly separable function:

Σ_{i=1}^{n} x_i w_i - θ = 0
Linear separability in the perceptron
How does the perceptron learn its classification tasks?
• This is done by making small adjustments in the weights to reduce the difference between the actual and desired outputs of the perceptron.
• The initial weights are randomly assigned, usually in the range [-0.5, 0.5], and then updated to obtain the output consistent with the training examples.
• If at iteration p, the actual output is Y(p) and the desired output is Y_d(p), then the error is given by:

e(p) = Y_d(p) - Y(p)   where p = 1, 2, 3, ...

Iteration p here refers to the pth training example presented to the perceptron.
• If the error, e(p), is positive, we need to increase perceptron output Y(p), but if it is negative, we need to decrease Y(p).
The perceptron learning rule

w_i(p+1) = w_i(p) + α · x_i(p) · e(p)

• where p = 1, 2, 3, ...
• α is the learning rate, a positive constant less than unity.
• The perceptron learning rule was first proposed by Rosenblatt in 1960. Using this rule we can derive the perceptron training algorithm for classification tasks.
Perceptron's training algorithm

Step 1: Initialization
Set initial weights w_1, w_2, ..., w_n and threshold θ to random numbers in the range [-0.5, 0.5].

Step 2: Activation
Activate the perceptron by applying inputs x_1(p), x_2(p), ..., x_n(p) and desired output Y_d(p). Calculate the actual output at iteration p = 1:

Y(p) = step[ Σ_{i=1}^{n} x_i(p) w_i(p) - θ ]
Perceptron’s training algorithm (continued)
• where n is the number of the perceptron inputs, and step is a step activation function.

Step 3: Weight training
Update the weights of the perceptron:

w_i(p+1) = w_i(p) + Δw_i(p)

where Δw_i(p) is the weight correction at iteration p.
The weight correction is computed by the delta rule:

Δw_i(p) = α · x_i(p) · e(p)
Perceptron’s training algorithm (continued)
Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
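Steps 1-4 can be sketched as follows (a minimal illustration; zero initial weights are used instead of random ones so the run is reproducible, and θ = 0.2, α = 0.1 follow the AND example later in the slides):

```python
def step(net):
    """Hard limiter: 1 if net >= 0, else 0 (binary version)."""
    return 1 if net >= 0 else 0

def train_perceptron(samples, alpha=0.1, theta=0.2, max_epochs=100):
    w = [0.0, 0.0]                       # zero init for reproducibility
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), yd in samples:
            y = step(x1 * w[0] + x2 * w[1] - theta)
            e = yd - y                   # e(p) = Y_d(p) - Y(p)
            if e != 0:
                errors += 1
                # w_i(p+1) = w_i(p) + alpha * x_i(p) * e(p)
                w[0] += alpha * x1 * e
                w[1] += alpha * x2 * e
        if errors == 0:                  # converged: a full epoch with no error
            break
    return w

# Train on the logical AND operation.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(samples)
print("learned weights:", w)
```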
Example of perceptron learning: the logical operation AND

[Table: epoch-by-epoch trace of inputs x1, x2, desired output, actual output, error, and weights over five epochs; threshold θ = 0.2, learning rate α = 0.1.]
Two-dimensional plots of basic logical operations

A perceptron can learn the operations AND and OR, but not Exclusive-OR.
Multilayer neural networks

A multilayer perceptron is a feedforward neural network with one or more hidden layers.
The network consists of an input layer of source neurons, at least one middle or hidden layer of computational neurons, and an output layer of computational neurons.
The input signals are propagated in a forward direction on a layer-by-layer basis.
Multilayer perceptron with two hidden layers
What does the middle layer hide?
A hidden layer “hides” its desired output. Neurons in the hidden
layer cannot be observed through the input/output behavior of the
network. There is no obvious way to know what the desired output of the hidden layer should be.
Commercial ANNs incorporate three and sometimes four layers,
including one or two hidden layers. Each layer can contain from
10 to 1000 neurons. Experimental neural networks may have five or even six layers, including three or four hidden layers, and utilize millions of neurons.
Backpropagation neural network
Learning in a multilayer network proceeds the same way as for a perceptron.
A training set of input patterns is presented to the network.
The network computes its output pattern, and if there is an error (that is, a difference between the actual and desired output patterns), the weights are adjusted to reduce this error.
In a backpropagation neural network, the learning algorithm has two phases.
First, a training input pattern is presented to the network input layer.
The network propagates the input pattern from layer to layer until
the output pattern is generated by the output layer.
If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network
from the output layer to the input layer. The weights are
modified as the error is propagated.
Three-layer backpropagation neural network
The backpropagation training algorithm
Step 1: Initialization
Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range:

(-2.4 / F_i, +2.4 / F_i)

where F_i is the total number of inputs of neuron i in the network. The weight initialization is done on a neuron-by-neuron basis.
Step 2: Activation
Activate the backpropagation neural network by applying inputs x_1(p), x_2(p), ..., x_n(p) and desired outputs y_d,1(p), y_d,2(p), ..., y_d,n(p).
(a) Calculate the actual outputs of the neurons in the hidden layer:

y_j(p) = sigmoid[ Σ_{i=1}^{n} x_i(p) · w_ij(p) - θ_j ]

where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function.
Step 2: Activation (continued)
(b) Calculate the actual outputs of the neurons in the output layer:

y_k(p) = sigmoid[ Σ_{j=1}^{m} y_j(p) · w_jk(p) - θ_k ]

where m is the number of inputs of neuron k in the output layer.
Step 3: Weight training
Update the weights in the backpropagation network, propagating backward the errors associated with output neurons.
(a) Calculate the error gradient for the neurons in the output layer:

δ_k(p) = y_k(p) · [1 - y_k(p)] · e_k(p)

where e_k(p) = y_d,k(p) - y_k(p)

Calculate the weight corrections:

Δw_jk(p) = α · y_j(p) · δ_k(p)

Update the weights at the output neurons:

w_jk(p+1) = w_jk(p) + Δw_jk(p)
Step 3: Weight training (continued)
(b) Calculate the error gradient for the neurons in the hidden layer:

δ_j(p) = y_j(p) · [1 - y_j(p)] · Σ_{k=1}^{l} δ_k(p) · w_jk(p)

where l is the number of neurons in the output layer.
Calculate the weight corrections:

Δw_ij(p) = α · x_i(p) · δ_j(p)

Update the weights at the hidden neurons:

w_ij(p+1) = w_ij(p) + Δw_ij(p)
Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until the selected error criterion is satisfied.

As an example, we may consider the three-layer backpropagation network. Suppose that the network is required to perform the logical operation Exclusive-OR. Recall that a single-layer perceptron could not do this operation. Now we will apply the three-layer net.
Three-layer network for solving the Exclusive-OR operation

[Figure: 2-2-1 network with inputs x1, x2, hidden neurons 3 and 4, and output neuron y5.]
The effect of the threshold applied to a neuron in the hidden or output layer is represented by its weight, θ, connected to a fixed input equal to -1.
The initial weights and threshold levels are set randomly as follows:
w_13 = 0.5, w_14 = 0.9, w_23 = 0.4, w_24 = 1.0, w_35 = -1.2, w_45 = 1.1, θ_3 = 0.8, θ_4 = -0.1 and θ_5 = 0.3.
We consider a training set where inputs x_1 and x_2 are equal to 1 and desired output y_d,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as:

y_3 = sigmoid(x_1 w_13 + x_2 w_23 - θ_3) = 1 / [1 + e^-(1·0.5 + 1·0.4 - 1·0.8)] = 0.5250
y_4 = sigmoid(x_1 w_14 + x_2 w_24 - θ_4) = 1 / [1 + e^-(1·0.9 + 1·1.0 + 1·0.1)] = 0.8808

Now the actual output of neuron 5 in the output layer is determined as:

y_5 = sigmoid(y_3 w_35 + y_4 w_45 - θ_5) = 1 / [1 + e^-(-0.5250·1.2 + 0.8808·1.1 - 1·0.3)] = 0.5097

Thus, the following error is obtained:

e = y_d,5 - y_5 = 0 - 0.5097 = -0.5097
The next step is weight training. To update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer.
First, we calculate the error gradient for neuron 5 in the output layer:

δ_5 = y_5 (1 - y_5) e = 0.5097 · (1 - 0.5097) · (-0.5097) = -0.1274

Then we determine the weight corrections assuming that the learning rate parameter, α, is equal to 0.1:

Δw_35 = α · y_3 · δ_5 = 0.1 · 0.5250 · (-0.1274) = -0.0067
Δw_45 = α · y_4 · δ_5 = 0.1 · 0.8808 · (-0.1274) = -0.0112
Δθ_5 = α · (-1) · δ_5 = 0.1 · (-1) · (-0.1274) = 0.0127
Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:

δ_3 = y_3 (1 - y_3) · δ_5 · w_35 = 0.5250 · (1 - 0.5250) · (-0.1274) · (-1.2) = 0.0381
δ_4 = y_4 (1 - y_4) · δ_5 · w_45 = 0.8808 · (1 - 0.8808) · (-0.1274) · 1.1 = -0.0147

We then determine the weight corrections:

Δw_13 = α · x_1 · δ_3 = 0.1 · 1 · 0.0381 = 0.0038
Δw_23 = α · x_2 · δ_3 = 0.1 · 1 · 0.0381 = 0.0038
Δθ_3 = α · (-1) · δ_3 = 0.1 · (-1) · 0.0381 = -0.0038
Δw_14 = α · x_1 · δ_4 = 0.1 · 1 · (-0.0147) = -0.0015
Δw_24 = α · x_2 · δ_4 = 0.1 · 1 · (-0.0147) = -0.0015
Δθ_4 = α · (-1) · δ_4 = 0.1 · (-1) · (-0.0147) = 0.0015
1/10/2017
_{4}_{7}
At last, we update all weights and thresholds:

w_13 = 0.5 + 0.0038 = 0.5038, w_14 = 0.9 - 0.0015 = 0.8985, w_23 = 0.4 + 0.0038 = 0.4038, w_24 = 1.0 - 0.0015 = 0.9985, w_35 = -1.2 - 0.0067 = -1.2067, w_45 = 1.1 - 0.0112 = 1.0888, θ_3 = 0.8 - 0.0038 = 0.7962, θ_4 = -0.1 + 0.0015 = -0.0985, θ_5 = 0.3 + 0.0127 = 0.3127

The training process is repeated until the sum of squared errors is less than 0.001.
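The single-step numbers in this worked example can be verified directly from the sigmoid and delta-rule formulas:

```python
import math

sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
alpha = 0.1

# Forward pass with the initial weights of the example.
y3 = sigmoid(1 * 0.5 + 1 * 0.4 - 0.8)        # 0.5250
y4 = sigmoid(1 * 0.9 + 1 * 1.0 + 0.1)        # 0.8808
y5 = sigmoid(y3 * -1.2 + y4 * 1.1 - 0.3)     # 0.5097
e = 0 - y5                                   # -0.5097

# Backward pass.
d5 = y5 * (1 - y5) * e                       # -0.1274
d3 = y3 * (1 - y3) * d5 * -1.2               # 0.0381
d4 = y4 * (1 - y4) * d5 * 1.1                # -0.0147
dw35 = alpha * y3 * d5                       # -0.0067
dt5 = alpha * -1 * d5                        # 0.0127
print(round(y5, 4), round(d5, 4), round(d3, 4), round(dw35, 4))
```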
Learning curve for the Exclusive-OR operation

[Figure: sum of squared errors versus training epoch.]
Final results of three-layer network learning

Inputs       Desired    Actual        Sum of
x1    x2     output     output y5     squared errors
1     1      0          0.0155        0.0010
0     1      1          0.9849
1     0      1          0.9849
0     0      0          0.0175
Network represented by McCulloch-Pitts model for solving the Exclusive-OR operation

[Figure: network with inputs x1 and x2, two hidden neurons, fixed threshold inputs of -1, and output y5.]
Decision boundaries

(a) Decision boundary constructed by hidden neuron 3;
(b) Decision boundary constructed by hidden neuron 4;
(c) Decision boundaries constructed by the complete three-layer network
Pattern Association and Associative Memory
Neural networks were designed on analogy with the brain. The brain’s memory, however, works by association.
For example, we can recognize a familiar face even in an unfamiliar environment within 100-200 ms. We can also recall a complete sensory experience, including sounds and scenes, when we hear only a few bars of music. The brain routinely associates one thing with another.
Multilayer neural networks trained with the backpropagation
algorithm are used for pattern recognition problems. However, to emulate the human memory’s associative characteristics we need
a different type of network: a recurrent neural network. A recurrent neural network has feedback loops from its
outputs to its inputs. The presence of such loops has a profound impact on the learning capability of the network.
Associative-Memory Networks
Input: Pattern (often noisy/corrupted)
Output: Corresponding pattern (complete / relatively noise-free)
Process
1. Load input pattern onto core group of highly interconnected neurons.
2. Run core neurons until they reach a steady state.
3. Read output off of the states of the core neurons.
Example: Input (1 0 1 -1 -1)  Output (1 -1 1 -1 -1)
Associative Network Types
1. Autoassociative: X = Y
*Recognize noisy versions of a pattern
2. Heteroassociative Bidirectional: X <> Y
BAM = Bidirectional Associative Memory
*Iterative correction of input and output
Associative Network Types (2)
3. Heteroassociative Input Correcting: X <> Y
*Input clique is autoassociative => repairs input patterns
4. Heteroassociative Output Correcting: X <> Y
*Output clique is autoassociative => repairs output patterns
Hebb’s Rule
Connection Weights ~ Correlations
``When one cell repeatedly assists in firing another, the axon of the first cell develops synaptic knobs (or enlarges them if they already exist) in contact
with the soma of the second cell.” (Hebb, 1949)
In an associative neural net, if we compare two pattern components (e.g. pixels) within many patterns and find that they are frequently in:
a) the same state, then the arc weight between their NN nodes should be positive
b) different states, then the arc weight between their NN nodes should be negative

Matrix Memory:
The weights must store the average correlations between all pattern components across all patterns. A net presented with a partial pattern can then use the correlations to recreate the entire pattern.
Correlated Field Components
• Each component is a small portion of the pattern field (e.g. a pixel).
• In the associative neural network, each node represents one field component.
• For every pair of components, their values are compared in each of several patterns.
• Set weight on arc between the NN nodes for the 2 components ~ avg correlation.
Quantifying Hebb's Rule

Compare two nodes to calculate a weight change that reflects the state correlation:

Auto-association:   Δw_jk = i_pk · i_pj
Hetero-association: Δw_jk = i_pk · o_pj

* When the two components are the same (different), increase (decrease) the weight.
i = input component, o = output component

Ideally, the weights will record the average correlations across all patterns:

Auto:   w_jk = Σ_{p=1}^{P} i_pk · i_pj
Hetero: w_jk = Σ_{p=1}^{P} i_pk · o_pj

Hebbian Principle: If all the input patterns are known prior to retrieval time, then init weights as:

Auto:   w_jk = (1/P) Σ_{p=1}^{P} i_pk · i_pj
Hetero: w_jk = (1/P) Σ_{p=1}^{P} i_pk · o_pj

Weights = Average Correlations
Matrix Representation

Let X = matrix of input patterns, where each ROW is a pattern; so x_{k,i} = the ith bit of the kth pattern. Let Y = matrix of output patterns, where each ROW is a pattern; so y_{k,j} = the jth bit of the kth pattern.
Then the average correlation between input bit i and output bit j across all patterns is:

w_{i,j} = (1/P) (x_{1,i} y_{1,j} + x_{2,i} y_{2,j} + ... + x_{P,i} y_{P,j})

To calculate all weights:

Hetero-assoc: W = X^T Y
Auto-assoc:   W = X^T X

X:
  pattern 1: x_{1,1} ... x_{1,n}
  pattern 2: x_{2,1} ... x_{2,n}
  :
  pattern p: x_{p,1} ... x_{p,n}
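The W = X^T Y construction can be sketched without any libraries (the two stored pattern pairs below are made-up bipolar examples, and the 1/P factor is dropped since thresholding is scale-invariant):

```python
# Hetero-associative matrix memory, W = X^T Y (sum of correlations; the
# stored pattern pairs below are hypothetical bipolar examples).
X = [[1, -1, 1],      # input pattern 1
     [-1, 1, 1]]      # input pattern 2
Y = [[1, -1],         # output pattern 1
     [-1, 1]]         # output pattern 2

n, m, P = len(X[0]), len(Y[0]), len(X)
W = [[sum(X[p][i] * Y[p][j] for p in range(P)) for j in range(m)] for i in range(n)]

def recall(x):
    """Retrieve the stored output for input x by thresholding x @ W."""
    return [1 if sum(x[i] * W[i][j] for i in range(n)) >= 0 else -1
            for j in range(m)]

print(W)                   # [[2, -2], [-2, 2], [0, 0]]
print(recall([1, -1, 1]))  # recovers output pattern 1: [1, -1]
```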
Auto-Associative Memory
1. Auto-associative patterns to remember
   Comp/node value legend: dark (blue) with x => +1; dark (red) without x => -1; light (green) => 0
2. Distributed storage of all patterns:
   • 1 node per pattern unit
   • Fully connected: clique
   • Weights = avg correlations across all patterns of the corresponding units
3. Retrieval
Hetero-Associative Memory
1. Hetero-associative patterns (pairs) to remember
2. Distributed storage of all patterns:
   • 1 node per pattern unit for X & Y
   • Full interlayer connection
   • Weights = avg correlations across all patterns of the corresponding units
3. Retrieval
   [Figure: stored pattern pairs and their retrieval.]
The Hopfield Network
&
Bidirectional Associative Memory
The Hopfield Network
John Hopfield formulated the physical principle of storing information in a dynamically stable network.
• AutoAssociation Network
• Fullyconnected (clique) with symmetric weights
• State of node = f(inputs)
• Weight values based on Hebbian principle
• Performance: must iterate a bit to converge on a pattern, but generally much less computation than in backpropagation networks.
Hopfield Networks

[Figure: input pattern converging to the stored output after many iterations.]
The Hopfield network uses McCulloch and Pitts neurons with the sign activation function as its computing element:

Y^sign = { +1, if X > 0;  -1, if X < 0;  Y, if X = 0 }

The current state of the Hopfield network is determined by the current outputs of all neurons, y_1, y_2, ..., y_n.
Thus, for a single-layer n-neuron network, the state can be defined by the state vector as:

Y = [y_1  y_2  ...  y_n]^T
The Hopfield Network
Retrieval Algorithm
• The output update rule for Hopfield auto-associative memory can be expressed in the form:

v_i^(k+1) = sgn( Σ_{j=1}^{n} w_ij · v_j^(k) )

where k is the index of recursion and i is the number of the neuron currently undergoing an update.
• The asynchronous update sequence considered here is random.
• Assuming that recursion starts at v^0, and a random sequence of updating neurons m, p, q, ... is chosen, each successive output vector differs from the previous one in at most the single component just updated.
Storage Algorithm
In the Hopfield network, synaptic weights between neurons are usually represented in matrix form.
• Assume that the bipolar binary prototype vectors that need to be stored are s^(m), for m = 1, 2, ..., p. The storage algorithm for calculating the weight matrix is:

W = Σ_{m=1}^{p} s^(m) (s^(m))^t - p·I

where p is the number of states to be memorized by the network, I is the n×n identity matrix, and superscript t denotes matrix transposition.
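The storage rule and the asynchronous sign-rule retrieval can be sketched as follows (a minimal illustration; a fixed update order is used instead of the random order above, and the two prototype vectors are taken from the example on the following slides):

```python
def store(prototypes):
    """W = sum_m s(m) s(m)^T - p * I for bipolar prototype vectors s(m)."""
    n, p = len(prototypes[0]), len(prototypes)
    return [[sum(s[i] * s[j] for s in prototypes) - (p if i == j else 0)
             for j in range(n)] for i in range(n)]

def recall(W, v, sweeps=5):
    """Asynchronous retrieval with the sign rule; a neuron keeps its state
    when its net input is zero. Fixed update order here (the slides use a
    random order)."""
    v = list(v)
    n = len(v)
    for _ in range(sweeps):
        for i in range(n):
            net = sum(W[i][j] * v[j] for j in range(n))
            if net != 0:
                v[i] = 1 if net > 0 else -1
    return v

# Two opposite prototypes, as in the example on the following slides.
W = store([[1, 1, 1], [-1, -1, -1]])
print(W)                      # [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
print(recall(W, [-1, 1, 1]))  # noisy input settles to [1, 1, 1]
```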
Possible states for the three-neuron Hopfield network

[Figure: cube of the eight possible bipolar states in (y1, y2, y3) space, each component ±1.]
Hopfield Network
The stable state-vertex is determined by the weight matrix W, the current input vector X, and the threshold matrix θ. If the input vector is partially incorrect or incomplete, the initial state will converge into the stable state-vertex after a few iterations.

Hopfield Network Example: Suppose, for instance, that our network is required to memorize two opposite states, (1, 1, 1) and (-1, -1, -1). Thus,

Y_1 = [1 1 1]^T and Y_2 = [-1 -1 -1]^T

where Y_1 and Y_2 are the three-dimensional vectors.
The 3 × 3 identity matrix I is

I = | 1 0 0 |
    | 0 1 0 |
    | 0 0 1 |

Thus, we can now determine the weight matrix as follows:

W = Y_1 Y_1^T + Y_2 Y_2^T - 2I = | 0 2 2 |
                                 | 2 0 2 |
                                 | 2 2 0 |