
Lecture 4

Logistic regression
and neural networks

Machine Learning
Andrey Filchenkov

08.06.2016
Lecture plan
• Logistic regression
• Single-layer neural network
• Completeness problem of neural
networks
• Multilayer neural networks
• Backpropagation
• Modern neural networks

• The presentation is prepared using materials from K.V. Vorontsov's course "Machine Learning".





Logistic regression

We may want to talk about the probability of belonging to a class
(we will discuss this in detail in Lecture 5).
The sigmoid maps (−∞, +∞) to (0, 1):

y = σ(⟨w, x⟩) = 1 / (1 + e^(−⟨w, x⟩)),

where σ(z) is the logistic (sigmoid) function.

Then the classification model is a(x) = sign⟨w, x⟩, and the weights are found by minimizing

Q(a, T^ℓ) = Σ_{i=1}^{ℓ} ln(1 + exp(−⟨w, x_i⟩ y_i)) → min_w.

This is the logarithmic loss function.
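
A minimal sketch of the model and the loss in Python (assuming labels y_i ∈ {−1, +1} and objects stored as rows of a NumPy array X; all names are illustrative):

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: maps (-inf, +inf) to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    # Q(w) = sum_i ln(1 + exp(-<w, x_i> y_i)) for labels y_i in {-1, +1}.
    margins = (X @ w) * y
    return np.sum(np.log1p(np.exp(-margins)))
```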



Logarithmic loss function plot



Gradient descent

The derivative of the sigmoid:

σ′(s) = σ(s) σ(−s).

Gradient on a single object, where M_i(w) = ⟨w, x_i⟩ y_i is the margin:

∇Q(w) = −y_i x_i σ(−M_i(w)).

Gradient descent step:

w^[t+1] := w^[t] + µ y_i x_i σ(−M_i(w^[t])).
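
One such update in code, reusing the sigmoid helper defined above (the step size µ and its default value are illustrative):

```python
def sgd_step(w, x_i, y_i, mu=0.1):
    # Margin M_i = <w, x_i> y_i; the gradient of ln(1 + exp(-M_i))
    # with respect to w is -y_i * x_i * sigmoid(-M_i).
    M = np.dot(w, x_i) * y_i
    return w + mu * y_i * x_i * sigmoid(-M)
```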



Smoothed Hebb’s rule

Hebb's rule:
if −⟨w^[t], x_i⟩ y_i > 0, then w^[t+1] := w^[t] + µ x_i y_i.

The threshold loss [M_i < 0] and its smoothed version σ(−M_i):





Logistic regression implementation

Python (scikit-learn): LogisticRegression with different solvers

Weka: Logistic
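
A minimal usage sketch for the scikit-learn class mentioned above (X_train, y_train, X_test are placeholders for your data):

```python
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(solver="lbfgs")   # other solvers: "liblinear", "saga", ...
clf.fit(X_train, y_train)                  # learn the weights
probs = clf.predict_proba(X_test)          # probabilities of class membership
```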



Single-layer neural network


Biological intuition



Neuron

Generalized McCulloch–Pitts neuron:

a(x) = σ( Σ_{j=1}^{n} w_j f_j(x) − w_0 ),

where σ is an activation function and f_1, …, f_n are numerical features.
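
A sketch of such a neuron, assuming the features f_j(x) are simply the components of x and reusing the sigmoid helper from the logistic regression section:

```python
def neuron(x, w, w0, activation=sigmoid):
    # a(x) = activation(sum_j w_j * x_j - w0)
    return activation(np.dot(w, x) - w0)
```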



Activation functions



Rosenblatt’s rule and Hebb’s rule

Rosenblatt's rule for weight learning in the {1; 0} classification case: for each object x^(i), update the weight vector:

w^[t+1] := w^[t] − η (a(x^(i)) − y^(i)) x^(i).

Hebb's rule for weight learning in the {1; −1} classification case: for each object x^(i), update the weight vector:

if ⟨w^[t], x^(i)⟩ y^(i) < 0, then w^[t+1] := w^[t] + η x^(i) y^(i).
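
Both rules as single update steps (a threshold neuron a(x) = [⟨w, x⟩ > 0] is assumed for Rosenblatt's rule; η and its value are illustrative):

```python
def rosenblatt_step(w, x, y, eta=0.1):
    # Rosenblatt's rule, y in {0, 1}: w := w - eta * (a(x) - y) * x.
    a = float(np.dot(w, x) > 0)
    return w - eta * (a - y) * x

def hebb_step(w, x, y, eta=0.1):
    # Hebb's rule, y in {-1, +1}: update only on misclassified objects.
    if np.dot(w, x) * y < 0:
        return w + eta * x * y
    return w
```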



Delta rule

Let L(a, x^(i)) = (⟨w, x^(i)⟩ − y^(i))².
The delta rule for weight learning: for each object x^(i), update the weight vector:

w^[t+1] := w^[t] − η (⟨w^[t], x^(i)⟩ − y^(i)) x^(i).
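
The same step in code (a gradient step on the squared loss; the constant factor 2 is absorbed into η):

```python
def delta_step(w, x, y, eta=0.1):
    # L = (<w, x> - y)^2, so dL/dw is proportional to (<w, x> - y) * x.
    return w - eta * (np.dot(w, x) - y) * x
```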





Completeness problem (for neuron)

Basic idea: synthesize combinations of neurons.

Completeness problem: how rich is the family of functions that can be represented by a neural network?

Let us start with a single neuron.



Logical functions as neural networks

Logical AND:
x₁ ∧ x₂ = [x₁ + x₂ − 3/2 > 0]

Logical OR:
x₁ ∨ x₂ = [x₁ + x₂ − 1/2 > 0]

Logical NOT:
¬x₁ = [−x₁ + 1/2 > 0]
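
Each of these is a single threshold neuron; a direct transcription:

```python
def AND(x1, x2):
    return int(x1 + x2 - 1.5 > 0)

def OR(x1, x2):
    return int(x1 + x2 - 0.5 > 0)

def NOT(x1):
    return int(-x1 + 0.5 > 0)
```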



Two ways of making it more complex

Example (Minsky): x₁ ⊕ x₂ (XOR) is not linearly separable, so a single neuron cannot represent it.

Two ways of making the model more complex:

1. Use a non-linear transformation:
x₁ ⊕ x₂ = [x₁ + x₂ − 2x₁x₂ − 1/2 > 0]
2. Build a superposition:
x₁ ⊕ x₂ = [(x₁ ∨ x₂) − (x₁ ∧ x₂) − 1/2 > 0]
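
Both constructions in code, reusing the AND/OR neurons above; each reproduces the XOR truth table:

```python
def xor_nonlinear(x1, x2):
    # One neuron over the non-linear feature x1 * x2.
    return int(x1 + x2 - 2 * x1 * x2 - 0.5 > 0)

def xor_superposition(x1, x2):
    # A two-layer superposition of linear threshold neurons.
    return int(OR(x1, x2) - AND(x1, x2) - 0.5 > 0)
```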



Completeness problem (Boolean functions)

Completeness problem: how rich is the family of functions that can be represented by a neural network?

DNF theorem:
Any Boolean function can be represented by one and only one full disjunctive normal form. Since AND, OR and NOT are each computable by a single neuron, any Boolean function is computable by a two-layer network.

What about arbitrary (not necessarily Boolean) functions?



Gorban Theorem

Theorem (Gorban, 1998)

Let
• X be a compact space,
• C(X) be the algebra of continuous real-valued functions on X,
• F be a linear subspace of C(X), closed with respect to a nonlinear continuous function φ and containing the constant function (1 ∈ F),
• F separate points in X.
Then F is dense in C(X).


Multilayer neural network




Any number of layers.
Any number of neurons in each layer.
Any number of connections between different layers.



Weights adjusting

Let us use SGD to learn the weights w = (w_jh, w_hm) ∈ ℝ^P of all layers:

w^[t+1] := w^[t] − η ∇L(w^[t], x_i, y_i),

where L(w, x_i, y_i) is the loss function (it depends on the problem we are solving).
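
A generic SGD loop under these assumptions (loss_grad is a hypothetical callable returning ∇L(w, x_i, y_i); η and the epoch count are illustrative):

```python
def sgd(w, loss_grad, data, eta=0.01, epochs=10):
    # w := w - eta * grad L(w, x_i, y_i), one object at a time.
    for _ in range(epochs):
        for x_i, y_i in data:
            w = w - eta * loss_grad(w, x_i, y_i)
    return w
```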



Backpropagation


Differentiating a superposition of functions

For a two-layer network with outputs a^m and hidden units u_h:

a^m(x) = σ_m( Σ_h w_hm u_h(x) );
u_h(x) = σ_h( Σ_j w_jh f_j(x) ).

Let L_i(w) = ½ Σ_m ( a^m(x_i) − y_i^m )².
Find the partial derivatives

∂L_i(w)/∂a^m and ∂L_i(w)/∂u_h.
Errors on layers

∂L_i(w)/∂a^m = a^m(x_i) − y_i^m;
ε_m = a^m(x_i) − y_i^m is the error on the output layer.

∂L_i(w)/∂u_h = Σ_m ( a^m(x_i) − y_i^m ) σ′_m w_hm = Σ_m ε_m σ′_m w_hm;
ε_h = Σ_m ε_m σ′_m w_hm is the error on the hidden layer.
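
Putting the two error terms together gives one backpropagation step. A sketch for a two-layer network with sigmoid activations on both layers and the squared loss above (the names W1, W2 and their shapes are assumptions; sigmoid is the helper defined earlier):

```python
def backprop_step(W1, W2, x, y, eta=0.1):
    # Forward pass: u_h = sigma(sum_j w_jh x_j), a_m = sigma(sum_h w_hm u_h).
    u = sigmoid(W1 @ x)                  # hidden layer, shape (H,)
    a = sigmoid(W2 @ u)                  # output layer, shape (M,)
    # Errors: eps_m on the output layer, eps_h on the hidden layer.
    eps_out = a - y
    delta_out = eps_out * a * (1 - a)    # sigma'(s) = sigma(s) * sigma(-s)
    eps_hid = W2.T @ delta_out           # eps_h = sum_m eps_m * sigma'_m * w_hm
    delta_hid = eps_hid * u * (1 - u)
    # Gradient steps for both weight matrices.
    W2 = W2 - eta * np.outer(delta_out, u)
    W1 = W1 - eta * np.outer(delta_hid, x)
    return W1, W2
```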



Backpropagation discussion (advantages)

Advantages:
• efficiency: the gradient can be computed in time comparable to a forward pass through the network;
• easily applies to any σ and any loss L;
• can be used for dynamic (online) learning;
• does not require the whole sample: learning can proceed on a subset of objects;
• can be parallelized.



Backpropagation discussion (disadvantages)

Disadvantages:
• does not always converge;
• can get stuck in local optima;
• the number of neurons in the hidden layer must be fixed in advance;
• the more connections, the more probable overfitting becomes;
• "paralysis" (saturation) of a single neuron or of the whole network.



Modern neural networks


Plethora of neural networks

Tens or even hundreds of different neural network types exist:
• self-organizing maps
• deep learning networks
• recurrent neural networks
• radial basis function networks
• Bayesian neural networks
• modular neural networks
• …
