References
• Machine Learning, Wiley
• Brett Lantz, Machine Learning with R, Packt
• M S Ram, Introduction to Deep Learning, Indian Institute of Technology Kanpur
• Papers, websites, tutorials

Deep Learning Overview
• The term "deep learning" has become widely used to describe machine learning methods that can learn more abstract concepts than previously available methods, primarily in the areas of image recognition and text processing
• Deep learning is another name for a set of algorithms that use a neural network as an architecture and learn the features automatically

Deep Learning - Brief History
• The earliest example of artificial neural networks is the perceptron algorithm, developed by Rosenblatt in 1958
• In the late 1970s, researchers discovered that the perceptron cannot approximate many nonlinear decision functions
• In the 1980s, researchers found a solution to that problem: stacking multiple layers of linear classifiers (hence the name "multilayer perceptron") to approximate nonlinear decision functions
• Due to the lack of computational power and labeled data, neural networks were left out of mainstream research in the late 1990s and early 2000s
• Since the late 2000s, neural networks have recovered and become more successful, thanks to the availability of inexpensive parallel hardware (graphics processors, computer clusters) and massive amounts of labeled data

Perceptron
• It was the first algorithmically described neural network, invented by Rosenblatt in 1958
• It is the simplest form of a neural network, used for the classification of patterns said to be linearly separable
• A perceptron built around a single neuron is limited to performing pattern classification with only two classes (hypotheses)

Neural Network - Why Success?
• The most important reason is that neural networks have a lot of parameters and can approximate very nonlinear functions.
So if the problem is complex and has a lot of data, neural networks are good approximators for it
– For instance, neural networks with three or more hidden layers have proven to do quite well at tasks such as recognizing handwritten digits or selecting images containing dogs
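The history above can be made concrete with a short sketch (illustrative code, not from the slides): Rosenblatt's learning rule separates a linearly separable pattern such as OR, while XOR, the classic nonlinear case, needs a hidden layer — here with hand-chosen rather than learned weights.

```python
import numpy as np

def step(z):
    """Step activation: fires (1) when the weighted sum is positive."""
    return (np.asarray(z) > 0).astype(int)

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Rosenblatt's perceptron learning rule (1958)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            err = target - step(xi @ w + b)   # 0 if correct, +/-1 if wrong
            w += lr * err * xi
            b += lr * err
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# OR is linearly separable: a single perceptron learns it
w, b = train_perceptron(X, np.array([0, 1, 1, 1]))
or_preds = [int(step(xi @ w + b)) for xi in X]   # -> [0, 1, 1, 1]

def mlp_xor(x):
    """XOR needs a hidden layer; these weights are hand-chosen, not learned."""
    h = step(np.array([[1.0, 1.0], [1.0, 1.0]]) @ x + np.array([-0.5, -1.5]))
    return int(step(np.array([1.0, -1.0]) @ h - 0.5))

xor_preds = [mlp_xor(x) for x in X]              # -> [0, 1, 1, 0]
```

The hidden units compute "at least one input on" and "both inputs on"; their difference is exactly XOR, which no single linear threshold unit can represent.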
• The second reason is that neural networks are very flexible: we can change the architecture fairly easily to adapt to specific problems/domains (such as convolutional neural networks and recurrent neural networks)

Deep Learning Application

Neural Network & Deep Learning
• Ideas drawn from neural networks and machine learning are hybridized to perform improved learning tasks beyond the capability of either one operating on its own, and
• ideas inspired by the human brain lead to new perspectives wherever they are of particular importance

Deep Learning - Examples 1-3

Deep Learning - Aim
• Create algorithms
– that can understand scenes and describe them in natural language
– that can infer semantic concepts to allow machines to interact with humans using these concepts
• Requires creating a series of abstractions
– Image (pixel intensities) -> Objects in image -> Object interactions -> Scene description
• Deep learning aims to automatically learn these abstractions with little supervision

How do we train?
• Inspiration from the mammalian brain
• Multiple layers of "neurons" (Rumelhart et al., 1986)
• Train each layer to compose the representations of the previous layer, to learn a higher-level abstraction
– Ex: Pixels -> Edges -> Contours -> Object parts -> Object categories
– Local features -> Global features
• Train the layers one by one (Hinton et al., 2006)
– Greedy strategy

Neuron vs Artificial Network

Element of Neural Network
1. Inputs are fed into the perceptron
2. Weights are multiplied with each input
3. The weighted inputs are summed, and then a bias is added
4. An activation function is applied. Note that here we use a step function, but there are other, more sophisticated activation functions such as sigmoid, hyperbolic tangent (tanh), rectifier (ReLU) and more. No worries, we will cover many of them in the future!
5. The output is either triggered as 1, or not, as 0.
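Steps 1-5 map directly onto a few lines of code. This sketch uses the step function from the slide and also defines the other activations mentioned in step 4 for comparison; the input and weight values are illustrative assumptions.

```python
import numpy as np

def perceptron_forward(x, w, b):
    """Steps 1-5: inputs -> weights -> sum + bias -> step activation -> 0/1."""
    z = np.dot(w, x) + b          # steps 2-3: weighted sum, then add bias
    return 1 if z > 0 else 0      # steps 4-5: step activation triggers 1 or 0

# Alternative activations mentioned in step 4 (applied to the same z):
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def tanh(z):    return np.tanh(z)
def relu(z):    return max(0.0, z)

x = np.array([1.0, 1.0])          # example inputs (assumed values)
w = np.array([0.5, 0.5])          # example weights (assumed values)
y_hat = perceptron_forward(x, w, b=-0.6)   # -> 1, since 0.5 + 0.5 - 0.6 > 0
```

Swapping the step function for sigmoid, tanh, or ReLU changes only step 4; the weighted-sum machinery of steps 1-3 stays the same.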
Note that we use ŷ (y hat) to label the output produced by our perceptron model

Deep Learning Model

First DL Model

Challenge
• Training neural networks takes a long time, especially when the training set is large
• It therefore makes sense to use many machines to train our neural networks

Challenge
• Too many concepts to learn
– Too many object categories
– Too many ways of interaction between object categories
• Behaviour is a highly varying function of underlying factors
– f: L -> V
– L: latent factors of variation
• low-dimensional latent factor space
– V: visible behaviour
• high-dimensional observable space
– f: a highly nonlinear function
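The f: L -> V picture above can be sketched numerically. Everything here — the dimensions, the random projection, and tanh as the nonlinearity — is an assumption chosen only to illustrate a low-dimensional latent space driving high-dimensional observations.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, visible_dim = 3, 100          # assumed sizes for this sketch
W = rng.normal(size=(visible_dim, latent_dim))

def f(latent):
    """Nonlinear map f: L -> V (here, a random projection through tanh)."""
    return np.tanh(W @ latent)

z = rng.normal(size=latent_dim)   # a point in the latent factor space L
v = f(z)                          # its high-dimensional visible behaviour in V
```

A few latent factors generate a 100-dimensional observation; deep learning's aim, in this framing, is to recover such factors from the visible data alone.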