
Convolutional Neural Networks (CNN):

A Convolutional Neural Network (CNN) is a neural network that uses convolution [1][2], also known as filtering, of the input — in the case of CNNs, an image — in the feature-extraction step of the algorithm. In this process, a kernel, also known as a filter, slides over the image, performing a convolution at each location. The result is a new image that contains the characteristics defined by the kernel at every location in the original image. Figure 2 shows the result of applying a kernel that extracts edges in the vertical direction, i.e. rapid changes in brightness, to an image.
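The sliding-kernel computation described above can be sketched in plain NumPy. This is a minimal illustration, not the report's implementation; the Sobel-style 3x3 kernel and the toy two-tone image are assumptions chosen for demonstration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each location."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A Sobel-style kernel that responds to vertical edges
# (rapid changes in brightness along the horizontal axis)
vertical_edge_kernel = np.array([[-1, 0, 1],
                                 [-2, 0, 2],
                                 [-1, 0, 1]])

# Toy image: dark left half, bright right half -> one vertical edge
image = np.zeros((5, 5))
image[:, 3:] = 1.0

edges = convolve2d(image, vertical_edge_kernel)
print(edges)  # non-zero only where the brightness changes
```

The output image is strong exactly along the dark-to-bright boundary and zero in the flat regions, which is what Figure 2 illustrates.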

Figure 2: The result of applying a kernel that extracts edges in the vertical direction

To analyze what happens in the convolution, the concepts of the neuron and the layer should first be defined.
A neuron is a computational unit which takes one or more inputs, performs some calculations and produces an output [2], as can be seen in Figure 3:

Figure 3: Structure of a neuron


The neuron shown in Figure 3 is the one used in neural networks: it takes the inputs and some weights (parameters), computes the dot product of these two vectors and produces the result, which can be any continuous value from -infinity to +infinity. After a convolution has been performed, an activation function is applied to the output. Multiple convolutions, each followed by an activation layer, can be stacked one after another. The purpose of the activation function is to restrict the output values.
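The computation a single neuron performs — a dot product followed by an activation — can be sketched as follows (a minimal illustration; the input values, weights and bias are arbitrary examples, and sigmoid is used here just as one possible activation):

```python
import numpy as np

def sigmoid(z):
    """Squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    """Dot product of inputs and weights, then an activation function."""
    z = np.dot(inputs, weights) + bias  # raw output: anywhere in (-inf, +inf)
    return sigmoid(z)                   # restricted to (0, 1)

x = np.array([0.5, -1.0, 2.0])   # example inputs
w = np.array([0.8, 0.2, -0.5])   # example weights (parameters)
out = neuron(x, w, bias=0.1)
print(out)
```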
The activation function squashes the output and produces a value within a range that depends on the type of activation function, as shown in Figure 4:

Figure 4: Activation functions

Usually, one of the following three functions is used: Sigmoid (range 0 to 1), Tanh (range -1 to 1) or ReLU (range 0 to +infinity).
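The three activation functions and their output ranges can be checked directly (a minimal NumPy sketch; the sample input values are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # range (0, 1)

def tanh(z):
    return np.tanh(z)                 # range (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # range [0, +inf)

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))    # negative inputs are clipped to zero
```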
The task of these activation functions is to add non-linearity to the network. Without them, the entire network would be a series of linear operations, which could be collapsed into a single linear operation, i.e. one layer. Since most real-life tasks are non-linear, a purely linear algorithm would not be able to solve them satisfactorily. As mentioned before, a neural network is a set of layers (each layer being a set of neurons) stacked together sequentially. The layers of a NN can be classified into three types (see Figure 1):

1. Input layer: a set of input neurons, where each neuron represents one feature in the dataset. It takes the inputs and passes them on to the next layer.
2. Hidden layer(s): a set of n neurons, each with a weight (parameter) assigned to it. A hidden layer takes the input from the previous layer, computes the dot product of inputs and weights, applies the activation function (as seen above), produces the result and passes the data on to the next layer.
3. Output layer: identical to a hidden layer, except that it gives the final result (outcome/class/value).
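The three layer types can be sketched as a tiny forward pass. This is a hypothetical network with assumed sizes (4 input features, 3 hidden neurons, 2 outputs) chosen purely for illustration, not an architecture from this report:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 4 input features -> 3 hidden neurons -> 2 outputs
W1 = rng.normal(size=(4, 3))   # hidden-layer weights (parameters)
W2 = rng.normal(size=(3, 2))   # output-layer weights

def forward(x):
    h = np.tanh(x @ W1)        # hidden layer: dot product + activation
    return np.tanh(h @ W2)     # output layer: same computation, final result

x = rng.normal(size=4)         # one sample presented to the input layer
print(forward(x))
```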

How is a Neural Network trained?

There are two algorithms in neural networks:

1. Forward propagation (FP).

2. Back propagation (BP).

BP will be used in this report.

Back propagation (BP) [3]:

The main goal of back propagation is to update each of the weights in the network so that the predicted output moves closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole. The steps of the back propagation algorithm are [2]:

 Start with random weights.

 Select an input pair (the input x and its desired output d(x)).

 Modify the weights whenever 𝑦 ≠ 𝑑(𝑥).

 If there exists a set of connection weights w* that performs the transformation 𝑦 = 𝑑(𝑥), the perceptron learning rule will converge to some solution (which may or may not be the same as w*) in a finite number of steps, for any initial choice of the weights.
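The steps above can be sketched for a single perceptron. This is a minimal illustration of the perceptron learning rule, not the report's training code; the AND-gate dataset and the number of epochs are assumptions for demonstration:

```python
import numpy as np

def perceptron_train(X, d, epochs=100):
    """Perceptron learning rule: start with random weights and update
    them only when the prediction y disagrees with the target d(x)."""
    rng = np.random.default_rng(42)
    w = rng.normal(size=X.shape[1])     # step 1: start with random weights
    for _ in range(epochs):
        for x, target in zip(X, d):     # step 2: select an input pair (x, d(x))
            y = 1 if np.dot(w, x) > 0 else 0
            if y != target:             # step 3: modify weights when y != d(x)
                w += (target - y) * x
    return w

# Toy linearly separable problem: an AND gate (first column is a bias input)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
d = np.array([0, 0, 0, 1])

w = perceptron_train(X, d)
preds = (X @ w > 0).astype(int)
print(preds)
```

Because a separating set of weights w* exists for this data, the rule converges to weights that classify every example correctly, as the convergence statement above guarantees.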

The features are extracted as follows:

1. Apply a convolution to the image by taking the dot product between the image matrix and a number of filters, in order to extract the required features from the input image.
2. Apply max-pooling: typically, each 2-by-2 block in the image is replaced by its largest value, reducing the size of the image by a factor of 2 in each dimension. Subsampling pixels does not change the object, but characterizing the image then needs fewer parameters.
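The 2-by-2 max-pooling step can be sketched in a few lines (a minimal NumPy illustration with an arbitrary 4x4 example image, assuming even image dimensions):

```python
import numpy as np

def max_pool_2x2(image):
    """Replace each non-overlapping 2x2 block with its largest value,
    halving the image in both dimensions."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.array([[1, 3, 2, 0],
                  [4, 2, 1, 5],
                  [0, 1, 3, 2],
                  [2, 6, 0, 1]])
print(max_pool_2x2(image))  # -> [[4 5]
                            #     [6 3]]
```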

These operations are then cascaded in layers until a satisfactory level of detail extraction has been achieved.

This process is followed by a fully-connected neural network, in which each output in the next layer is connected to all the inputs of the previous layer. The fully-connected network is then trained to recognize features missing from the extracted ones, and the extraction process is repeated until all the desired features have been extracted.

An example of a CNN, with feature-extraction (convolution and max-pooling) layers followed by a fully-connected layer, is shown in figure 5:

Figure 5: Example of CNN


Loss Function:

A loss function is a method of evaluating how well the algorithm models the dataset. Most machine learning algorithms use some sort of loss function during optimization, i.e. when finding the best parameters (weights) for the data: the parameters of the network are changed based on the gradient of the loss function, using gradient descent optimization. Gradient descent is an optimization algorithm used while training a machine learning model; based on a convex function, it tweaks the parameters iteratively to minimize a given function towards its local minimum.
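Gradient descent can be sketched on a one-dimensional convex function (an illustrative toy, not the network's actual loss; the function f(w) = (w - 3)², its gradient, the learning rate and the step count are all chosen for demonstration):

```python
def gradient_descent(grad_fn, w0, lr=0.1, steps=100):
    """Iteratively step against the gradient to minimize a function."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad_fn(w)   # move downhill along the gradient
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3);
# the minimum is at w = 3
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_min)  # approaches 3.0
```

In a real network the same update is applied to every weight at once, with the gradient of the loss computed by back propagation.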
When the output of the algorithm does not match the desired output, the loss function takes large values; it takes small values when there is a good correspondence between the output and the desired output.
One of the more common loss functions used in Convolutional Neural Networks is the Softmax loss. This is a generalization of the binary Logistic Regression classifier to multiple classes. The typical implementation of the loss function does not differentiate between classes, meaning that misclassifying a pixel as class 1 when it should be class 0 produces the same amount of loss as misclassifying a pixel as class 0 when it should be class 1.
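A minimal sketch of the Softmax (cross-entropy) loss shows the behaviour described above: a confident correct prediction yields a small loss, a confident wrong one a large loss (the logit values here are arbitrary examples):

```python
import numpy as np

def softmax(logits):
    """Generalize logistic regression to multiple classes:
    turn raw scores into probabilities that sum to 1."""
    e = np.exp(logits - logits.max())   # subtract the max for numerical stability
    return e / e.sum()

def softmax_loss(logits, true_class):
    """Small when the true class gets high probability, large otherwise."""
    probs = softmax(logits)
    return -np.log(probs[true_class])

good = np.array([5.0, 0.0, 0.0])   # confident, correct prediction for class 0
bad  = np.array([0.0, 5.0, 0.0])   # confident, wrong prediction

print(softmax_loss(good, 0))   # small loss
print(softmax_loss(bad, 0))    # large loss
```

Note that the loss is symmetric between classes: swapping which class was confused with which produces exactly the same loss value.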
