Lecture 1: Feedforward
Princeton University COS 495
Instructor: Yingyu Liang
Features
Color Histogram
[Figure: pipeline from input image: (1) extract features (a color histogram over the red, green, and blue channels), then (2) build the hypothesis on those features]
Linear model
$y = \mathrm{sign}(w^\top f(x) + b)$
The feature map $f(\cdot)$ is fixed (hand-crafted); the weights $w$ and the bias $b$ are learned.
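As a concrete illustration, here is a minimal sketch of this pipeline in Python/NumPy, assuming a toy 3-bin color histogram as the fixed feature map; the feature function and the weight values are hypothetical, not taken from the lecture.

```python
import numpy as np

def color_histogram(image):
    """Fixed (hand-crafted) feature map f(x): mean intensity of the
    red, green, and blue channels of an H x W x 3 image in [0, 1]."""
    return image.reshape(-1, 3).mean(axis=0)

# Hypothetical learned parameters w and b (in practice fit from data).
w = np.array([1.0, -0.5, -0.5])
b = 0.1

def predict(image):
    """Linear model: y = sign(w^T f(x) + b)."""
    return np.sign(w @ color_histogram(image) + b)

image = np.random.rand(32, 32, 3)   # toy RGB image
print(predict(image))               # -1.0 or 1.0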
Feedforward networks
View each dimension of $f(x)$ as something to be learned
Feedforward networks
Linear functions $f(x) = \theta^\top x$ don't work: we need some nonlinearity
Feedforward networks
Typically, set $f(x) = r(\theta^\top x)$ where $r(\cdot)$ is some nonlinear function
Figure from Deep Learning by Goodfellow, Bengio, and Courville. Dark boxes are things to be learned.
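A minimal sketch of this learned-feature version, assuming tanh as the nonlinearity $r$; the dimensions and the random initialization are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 8, 4                      # input dim, number of learned features
theta = rng.normal(size=(d, k))  # now learned, not hand-crafted

def f(x):
    """Learned features: f(x) = r(theta^T x), here with r = tanh."""
    return np.tanh(theta.T @ x)

x = rng.normal(size=d)
print(f(x).shape)                # (4,)
```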
Motivation: neurons
Figure from Wikipedia.
Components in feedforward networks
Components
Representations:
Input
Hidden variables
Layers/weights:
Hidden layers
Output layer
Components
[Figure: network diagram labeling the input $x$, the first layer, the hidden variables $h^1, \dots$, and the output layer]
Input
Represented as a vector
Sometimes requires some preprocessing (sketched after this list), e.g.,
Subtract mean
Normalize to [-1,1]
Expand
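A sketch of the first two preprocessing steps in NumPy; the toy dataset and the exact scaling convention are assumptions.

```python
import numpy as np

X = np.random.rand(100, 8) * 255   # toy dataset: 100 input vectors

# Subtract mean (per feature, computed over the training set).
X_centered = X - X.mean(axis=0)

# Normalize to [-1, 1] by dividing by the per-feature max absolute value.
X_norm = X_centered / np.abs(X_centered).max(axis=0)

print(X_norm.min(), X_norm.max())   # both within [-1, 1]
```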
Output layers
Regression: $y = w^\top h + b$
Linear units: no nonlinearity
Output layers
Multi-dimensional regression: $y = W^\top h + b$
Linear units: no nonlinearity
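Both regression output layers are pure linear maps applied to the last hidden variables $h$; a minimal sketch (the shapes and random values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=5)              # last hidden layer

# Scalar regression: y = w^T h + b
w, b = rng.normal(size=5), 0.0
y_scalar = w @ h + b

# Multi-dimensional regression: y = W^T h + b, here with 3 outputs
W, b_vec = rng.normal(size=(5, 3)), np.zeros(3)
y_vec = W.T @ h + b_vec

print(y_scalar, y_vec.shape)        # a float and (3,)
```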
Output layers
Binary classification: $y = \sigma(w^\top h + b)$
Corresponds to using logistic regression on $h$
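A sketch of this output unit, with $\sigma$ the logistic sigmoid; the values below are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
h = rng.normal(size=5)
w, b = rng.normal(size=5), 0.0

# y = sigma(w^T h + b): probability of the positive class,
# exactly as in logistic regression, but applied to h instead of raw x.
y = sigmoid(w @ h + b)
print(y)        # a value in (0, 1)
```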
Output layers
Multi-class classification: $y = \mathrm{softmax}(z)$ where $z = W^\top h + b$
Corresponds to using multi-class logistic regression on $h$
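A sketch of the softmax output layer; subtracting the max of $z$ before exponentiating is a standard numerical-stability trick, not something stated on the slide:

```python
import numpy as np

def softmax(z):
    """softmax(z)_i = exp(z_i) / sum_j exp(z_j), computed stably."""
    z = z - z.max()      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
h = rng.normal(size=5)
W, b = rng.normal(size=(5, 3)), np.zeros(3)   # 3 classes

z = W.T @ h + b       # logits
y = softmax(z)        # class probabilities
print(y, y.sum())     # sums to 1
```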
Hidden layers
$h^{i+1} = r(W^\top h^i + b)$
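Stacking this rule layer by layer gives the whole feedforward pass; a minimal sketch with two hidden layers and tanh as $r$ (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [8, 6, 4]                  # input dim, then two hidden widths

# One (W, b) pair per hidden layer.
params = [(rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes, sizes[1:])]

def forward(x):
    """Apply h^{i+1} = r(W^T h^i + b) layer by layer, with r = tanh."""
    h = x
    for W, b in params:
        h = np.tanh(W.T @ h + b)
    return h

print(forward(rng.normal(size=8)).shape)   # (4,)
```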
Typical activation functions $r(z)$:
Threshold: $t(z) = \mathbb{1}[z \ge 0]$
Sigmoid: $\sigma(z) = 1/(1 + \exp(-z))$
Tanh: $\tanh(z) = 2\sigma(2z) - 1$
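The three activations written directly from the formulas above (a sketch; `threshold` uses the convention $\mathbb{1}[z \ge 0]$, and the last line checks the tanh identity):

```python
import numpy as np

def threshold(z):
    return (z >= 0).astype(float)        # 1[z >= 0]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh_via_sigmoid(z):
    return 2.0 * sigmoid(2.0 * z) - 1.0  # tanh(z) = 2*sigma(2z) - 1

z = np.linspace(-3, 3, 7)
assert np.allclose(tanh_via_sigmoid(z), np.tanh(z))   # identity holds
```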
Hidden layers
Problem: saturation. In the flat regions of sigmoid/tanh (large $|z|$), the gradient is too small, so learning stalls.
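The saturation is easy to see numerically: the sigmoid's derivative $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ collapses for large $|z|$, as this quick check shows:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)        # sigma'(z) = sigma(z) (1 - sigma(z))

for z in [0.0, 2.0, 5.0, 10.0]:
    print(z, sigmoid_grad(z))   # 0.25, ~0.105, ~0.0066, ~0.000045
```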
Hidden layers
Activation function ReLU (rectified linear unit)
$\mathrm{ReLU}(z) = \max\{z, 0\}$
The gradient is 0 for $z < 0$ and 1 for $z > 0$, so it does not saturate on the positive side.
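ReLU and its (sub)gradient in NumPy; at exactly $z = 0$ the gradient is undefined, and this sketch follows the common convention of using 0 there:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)           # max{z, 0}

def relu_grad(z):
    return (z > 0).astype(float)        # 0 for z < 0, 1 for z > 0

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))        # [0.  0.  0.  0.5 2. ]
print(relu_grad(z))   # [0. 0. 0. 1. 1.]
```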
Hidden layers
Generalizations of ReLU: $\mathrm{gReLU}(z) = \max\{z, 0\} + \alpha \min\{z, 0\}$
Leaky-ReLU: $\max\{z, 0\} + 0.01 \min\{z, 0\}$
Parametric-ReLU: the slope $\alpha$ is learnable
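All three variants are instances of the same formula, differing only in whether the negative-side slope $\alpha$ is fixed or learned; a sketch, with the parametric $\alpha$ value chosen arbitrarily for illustration:

```python
import numpy as np

def g_relu(z, alpha):
    """gReLU(z) = max{z, 0} + alpha * min{z, 0}."""
    return np.maximum(z, 0.0) + alpha * np.minimum(z, 0.0)

def leaky_relu(z):
    return g_relu(z, alpha=0.01)   # fixed small negative-side slope

# Parametric-ReLU: same formula, but alpha is a learnable parameter
# updated by gradient descent along with the weights.
alpha = 0.25                        # e.g., a current learned value
z = np.array([-2.0, -0.5, 1.5])
print(leaky_relu(z))                # [-0.02  -0.005  1.5 ]
print(g_relu(z, alpha))             # [-0.5   -0.125  1.5 ]
```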