
SIGNATURE AUTHENTICATION USING DEEP LEARNING

ABSTRACT
This paper describes a method for authenticating signatures using deep learning. It covers
the various modules and the architecture required to achieve this purpose. Convolutional neural
networks are implemented to parse signatures, and feed-forward neural networks are implemented to
analyse the characteristics of the signature. Errors are corrected using gradient descent through a
method called back-propagation. The paper describes how the signature on a cheque is compared with
the signature in the database and how a final authentication score is provided to the user.

Keywords: Machine learning, Deep learning, Neural networks, Convolution, Pooling, Activation
function, ReLU, Gradient Descent, Training, Test, Validation
INTRODUCTION

Machine learning is the idea that generic algorithms can extract interesting information from a set of data without any custom code specific to the problem: data is fed into the generic algorithm and the logic is built from the data. Machine learning is used in several fields such as image classification, speech recognition, cybersecurity and robotics. Deep learning is the part of machine learning that covers neural networks.

A neural network is a computational model that works in a way similar to the neurons in the human brain. Each neuron takes an input, performs some operations and passes the output to the following neuron. For image classification, Convolutional Neural Networks (CNNs) are used. CNNs are primarily used for image processing but can also be applied to other types of input, such as audio. A typical use case for CNNs is feeding the network images, which the network then classifies.

The data set is divided into three parts: training data, testing data and validation data. Training data is used to train a multi-layer perceptron. The next step is to find out how well it performs. For a trained network, the error can be computed as the sum-of-squares error between output and target. The next step is to decide which data should be used to test the network. It is not wise to use the same data for training as well as testing, since we would not know how well the network generalises and whether or not the model overfits the data. Hence we need to keep separate (input, target) pairs in reserve which are not used for training; this is called the test data set.

We also check how well the network is learning during training, so that we can decide when to stop training the model. To make this decision we cannot use the training data, as we would not detect overfitting, but we cannot use the testing data either, as it is saved for the final test. Thus a third kind of data set, called the validation set, is required to validate the learning so far; in statistics this is known as cross validation. With a large amount of data, the proportion of training to testing to validation data is around 50:25:25; with less data it is around 60:20:20.
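As a rough sketch of how such a split might be implemented (the array names, sizes and the 60:20:20 ratio below are illustrative, not prescribed by the system):

import numpy as np

def split_dataset(inputs, targets, train=0.6, val=0.2, seed=0):
    # Shuffle the (input, target) pairs, then cut them into
    # training, validation and test sets (e.g. 60:20:20).
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(inputs))
    n_train = int(train * len(inputs))
    n_val = int(val * len(inputs))
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]   # remainder is reserved for the final test
    return ((inputs[train_idx], targets[train_idx]),
            (inputs[val_idx], targets[val_idx]),
            (inputs[test_idx], targets[test_idx]))

# Example: 1000 signature images of 70 x 150 pixels with binary labels
X = np.zeros((1000, 70, 150))
y = np.zeros(1000)
train_set, val_set, test_set = split_dataset(X, y)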
CNNs tend to start with an input scanner. The scanner scans the input image and passes it on to the next layer. This input data is then fed through convolutional layers, which tend to shrink as they become deeper, mostly by easily divisible factors of the input. Besides convolutional layers, CNNs also often feature pooling layers. Pooling is a way to filter out details: a commonly used pooling technique is max pooling, which takes 2 x 2 pixels and passes on the pixel with the largest value.

The output of the CNN acts as the input to a fully connected layer, where the features of the image are established. The fully connected layer is a feed-forward neural network with multiple hidden layers. The characteristics of the signature, such as dots, dashes and curves, are established in each of the hidden layers and then passed on to the next layer for processing. The output of the fully connected layer is the classification of the signature.

PROPOSED SYSTEM

My work involves authenticating the signature on a cheque by scanning the cheque and applying a machine learning model to the scanned image. Convolutional neural networks (CNNs) and feed-forward neural networks are used to accomplish this task.

The first step in the process is to input the scanned image of the cheque to a CNN. A scanner, or filter, scans the image, and the output of the scanner is then passed on to the next layer. The scanner is typically a matrix with a random set of initial values. This matrix is first placed on the top-left corner of the image, and a dot product of the pixel values of the image and the respective values of the matrix is calculated. The output of the scanner is a set of channels (the size of this set varies with the number of scanners). This process can be repeated many times to increase accuracy. The output of the final convolutional layer is then fed into a feed-forward neural network.

Representation of a convolutional neural network
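As a minimal sketch of the scanning operation described above: the filter matrix slides over the image one pixel at a time, and at each position the dot product of the window and the filter becomes one entry of the output channel (a single filter and a stride of 1 are assumed here):

import numpy as np

def scan(image, kernel):
    # Place the kernel at the top-left corner and slide it across the
    # image, taking a dot product at each position (stride 1).
    # The output has shape (m - k + 1, n - k + 1).
    m, n = image.shape
    k = kernel.shape[0]                 # assume a square k x k filter
    out = np.zeros((m - k + 1, n - k + 1))
    for i in range(m - k + 1):
        for j in range(n - k + 1):
            window = image[i:i + k, j:j + k]
            out[i, j] = np.sum(window * kernel)   # element-wise product, summed
    return out

image = np.random.rand(70, 150)         # preprocessed grayscale cheque image
kernel = np.random.rand(5, 5)           # filter with random initial values
feature_map = scan(image, kernel)       # shape (66, 146)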
Once the scanner scans the image, a pooling function is used to highlight the most important aspects of the image and to reduce its size.

The pooling layer is frequently used in convolutional neural networks to progressively reduce the spatial size of the representation, reducing the number of features and the computational complexity of the network. The most commonly used pooling layer is the maxpool layer. A maxpool of 2 x 2 causes a 2 by 2 filter to traverse the entire matrix and pick the largest element from each window to be included in the next representation map.

Representation of max pooling

Thus, effectively, the dimensions of the feature map reduce from (m, n) to (m/k, n/k) on application of a (k, k) non-overlapping filter. Hence, k needs to be chosen in keeping with the dimensions of the input feature map. In contrast, the dimensions on application of a convolution layer would move to (m-k+1, n-k+1). Unlike the convolution layer, the pooling filter does not operate on overlapping segments of the input feature map; however, we can explicitly specify the stride to make the operation overlapping. Pooling can also be implemented in other ways.

The main reason for using the pooling layer is to prevent the model from overfitting. In quite a few models, a dropout layer follows the pooling layer.

It is important to be careful in the use of the pooling layer, particularly in vision tasks. While it significantly reduces the complexity of the model, location sensitivity might be lost.

Consider, for instance, a vision task that involves identifying a ball in an image. Using a pooling layer is beneficial if we only need to determine whether the ball exists in the image. However, if the task is also concerned with determining the exact location of the ball, we must be careful about using a pooling layer.

Thus, the pooling layer is primarily used to reduce the computational complexity of the model. We may prefer it in scenarios where we can afford to lose some localisation information. Pooling is widely used in deep learning models.
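A sketch of the non-overlapping max pooling described above, consistent with the (m, n) to (m/k, n/k) reduction (m and n are assumed to be divisible by k):

import numpy as np

def max_pool(feature_map, k=2):
    # Non-overlapping k x k max pooling: keep the largest element
    # in each window, reducing (m, n) to (m/k, n/k).
    m, n = feature_map.shape
    assert m % k == 0 and n % k == 0, "k must divide both dimensions"
    blocks = feature_map.reshape(m // k, k, n // k, k)
    return blocks.max(axis=(1, 3))      # maximum within each k x k block

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fmap, k=2))              # [[ 5.  7.]
                                        #  [13. 15.]]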

Once the convolutional step is completed, the output passes through an activation function that introduces nonlinearity. There are several activation functions, such as identity, Heaviside, sigmoid, tanh and ReLU. For this problem, a Rectified Linear Unit (ReLU) is used.

Graphical representation of the ReLU activation function

ReLU is the most widely used activation function, appearing in nearly all convolutional neural networks in deep learning. The function and its derivative are both monotonic. The issue is that all negative values become zero immediately, which decreases the ability of the model to fit or train from the data properly: any negative input given to the ReLU activation function turns into zero, so negative values are not mapped appropriately. A variation of ReLU, called leaky ReLU, takes care of the negative values to some extent.

ReLU trains about six times faster than tanh. ReLU also addresses the vanishing gradient problem that the sigmoid and tanh functions suffer from. The output value of ReLU is zero when the input value is less than zero; if the input is greater than or equal to zero, the output is equal to the input. When the input value is positive, the derivative is 1, so there is no squeezing effect of the kind that occurs when back-propagating errors through the sigmoid function.
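The two activations discussed above can be written directly; the 0.01 slope used for leaky ReLU below is a common convention rather than a value taken from this paper:

import numpy as np

def relu(x):
    # Zero for negative inputs, identity otherwise. The derivative is 1
    # for positive inputs, avoiding the squeezing effect of the sigmoid
    # when errors are back-propagated.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Keeps a small slope for negative inputs so that negative
    # values are not discarded entirely.
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))          # [0.   0.   0.   1.5]
print(leaky_relu(x))    # [-0.02  -0.005  0.     1.5 ]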
The feed-forward neural network is a set of neurons in multiple layers, starting with the input layer and ending with the output layer. All layers in between are called hidden layers. Increasing the number of hidden layers increases the capacity of the neural network but also requires more processing power to compute.

Representation of a feed-forward neural network

In the feed-forward neural network, the characteristics of the signature, such as the curves, dashes and dots, are analysed. The output of the feed-forward neural network is a classification of that signature.

The network is first trained on a training set and then tested on a testing set. Once the network is trained, new inputs can be given to it for classification. When the network is first trained, the error is high. To lower the error, optimisation is applied to the neural network. Optimisation is possible with a process called gradient descent, where the predicted output is compared with the actual output and the weights are adjusted in a way that decreases the error.

First order optimisation algorithms minimise or maximise a loss function E(x) using its gradient values with respect to the parameters. The most widely used first order optimisation algorithm is gradient descent. The first order derivative tells us whether the function is decreasing or increasing at a particular point; it essentially gives us a line tangential to a point on the error surface.

A gradient is simply a vector: a multi-variable generalisation of the derivative (dy/dx), which is the instantaneous rate of change of y with respect to x. The difference is that when a function depends on more than one variable, the gradient takes the derivative's place, and the gradient is calculated using partial derivatives. Another major difference is that the gradient of a function produces a vector field.

Gradient descent is mainly used to update the weights in a neural network model, i.e. to update and tune the model's parameters in a direction that minimises the loss function. A neural network trains via a well-known technique called back-propagation: we first propagate forward, calculating the dot product of the input signals and their corresponding weights and then applying an activation function to the sum of products. This transforms the input signal into an output signal; it is also what allows the model to represent complex non-linear functions, since the non-linearities enable the model to learn almost any arbitrary functional mapping.

After this we propagate backwards through the network, carrying error terms and updating the weight values using gradient descent: we calculate the gradient of the error function E with respect to the weights W and update the parameters in the direction opposite to the gradient of the loss function with respect to the model's parameters. This step is carried out multiple times until a good enough output, with minimum error, is obtained.

Representation of gradient descent
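A minimal sketch of the forward and backward passes described above, for a single layer of weights with a sum-of-squares error; the sigmoid activation and the learning rate of 0.1 are illustrative choices:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(W, x, target, lr=0.1):
    # Forward pass: dot product of inputs and weights, then activation.
    output = sigmoid(x @ W)
    error = 0.5 * np.sum((output - target) ** 2)   # sum-of-squares error E
    # Backward pass: dE/dW via the chain rule, using sigmoid'(z) = out * (1 - out).
    delta = (output - target) * output * (1.0 - output)
    grad = np.outer(x, delta)
    return W - lr * grad, error          # step opposite to the gradient

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))              # 4 inputs, 2 output neurons
x = rng.normal(size=4)
target = np.array([1.0, 0.0])
for step in range(100):                  # repeat until the error is small enough
    W, error = gradient_descent_step(W, x, target)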
The neural network first parses the signature from the document. The respective outputs for the signature are compared, and a final authentication score is displayed.

SYSTEM ARCHITECTURE

The user uploads the scanned copy of the signature through the user interface. The scanned image is preprocessed in the preprocessing step: the RGB channels are converted into a single grayscale channel, and each image is resized to 70 x 150 units.

The processed image is then sent as input to the first convolutional layer. This layer converts the input into 32 channels by passing an image filter over the input. The result is sent to a max pool layer, which extracts only the most important features of the image. The output of the first layer is then sent to the second convolutional layer, which takes the 32 channels as input and outputs 64 channels; this is also passed through a max pool layer. The output of the second layer is then passed as a one-dimensional array through a fully connected neural network, which outputs the accuracy percentage.
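A sketch of this architecture in PyTorch; the layer sizes follow the text (grayscale 70 x 150 input, two 5 x 5 convolutions producing 32 and 64 channels, each followed by 2 x 2 max pooling, then a fully connected network), while the padding, hidden-layer width and sigmoid output are my assumptions:

import torch
import torch.nn as nn

class SignatureNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, padding=2),   # 1 x 70 x 150 -> 32 channels
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32 x 35 x 75
            nn.Conv2d(32, 64, kernel_size=5, padding=2),  # 32 -> 64 channels
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 64 x 17 x 37
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # one-dimensional array
            nn.Linear(64 * 17 * 37, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
            nn.Sigmoid(),                                 # authentication score in [0, 1]
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SignatureNet()
scan = torch.randn(1, 1, 70, 150)        # one preprocessed cheque image
score = model(scan)                      # e.g. tensor([[0.52]])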
MODULES

MODULE 1: USER INTERFACE


The user interacts with the program through the command prompt, which serves as the user interface for input and output.

MODULE 2: IMAGE PREPROCESSING


The images are stored at a particular address. All images used for training and testing pass through this module, where they are converted to a specific format.
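A sketch of this conversion using the Pillow library; the file path is hypothetical, convert('L') performs the RGB-to-grayscale conversion, and resize takes a (width, height) pair:

from PIL import Image
import numpy as np

def preprocess(path):
    # Convert an RGB scan to a single grayscale channel and
    # resize it to 70 x 150 units, scaled to [0, 1].
    img = Image.open(path).convert("L")
    img = img.resize((150, 70))          # width 150, height 70
    return np.asarray(img, dtype=np.float32) / 255.0

x = preprocess("signatures/cheque_001.png")   # hypothetical path
print(x.shape)                                 # (70, 150)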

MODULE 3: NEURAL NETWORK


The neural network takes the image as input and outputs a score. It consists of two convolutional layers with 5 x 5 filters and a fully connected layer. The fully connected layer presents the authentication score to the user interface.

OUTPUT

CONCLUSION

Machine learning is being adopted by many industries the world over, especially the financial sector. My work is an attempt to complement the efforts in this area. My model will help reduce the number of fraudulent transactions in the banking industry by providing an accurate evaluation of the signature on documents, which will help reduce the losses incurred by banks. It will also help bring about quicker turnaround times by drastically cutting manual intervention. The cost of implementing the machine learning model is low, and the scalability of the model is high. A recent improvement in convolutional neural networks is the creation of a new model called Capsule Networks, in which each neuron holds a nested neural network. Alongside the Capsule Network, a new activation function is being developed which can produce better results. With the advancement of machine learning models and improved computational power, the accuracy of machine learning models will increase by leaps and bounds and turnaround times will be reduced severalfold.
