Abstract: In this study we set out to prove the high degree of security that can be offered by using ANNs as the base of a biometric system. Neural networks based upon a feed-forward architecture are used in problem solving as universal approximators for concrete tasks such as classification (including nonlinearly separable classes), prediction and compression. The error backpropagation algorithm has been used to train the multi-layered perceptron network. The results showed that errors can be reduced by increasing the number of learning epochs and the number of input characters up until a point and that, of course, there is room for improvement.

Key words: neural networks, biometrics, character recognition
1. INTRODUCTION

A biometric system is essentially a pattern recognition system, which makes a personal identification by determining the authenticity of a specific physiological or behavioural characteristic possessed by the user. Pattern recognition, as a branch of artificial intelligence, aims at identifying similarity relationships between abstract representations of objects or phenomena; to recognize is to classify input data as belonging to certain classes, using classification criteria based on previously built information.
An important issue in designing a practical system is to determine how an individual is identified. Biometrics dates back to the ancient Egyptians, who measured people to identify them. Keeping to the basics, we submit to your attention the idea of identifying someone by his handwriting. Every person who desires to enter a secured perimeter is obliged to write a random text, which is then compared with previously taken samples of his handwriting. Depending on the result, a percentage which illustrates the similarity, the person shall or shall not be allowed to enter.
Biometric devices have three primary components: an automated mechanism that scans and captures a digital/analog image of a living personal characteristic; a second that handles compression, processing, storage and comparison of the image with the stored data; and a third that interfaces with application systems.
This study deals with the first part of the biometric system, illustrated by the use of artificial neural networks, which are, as their name indicates, computational networks that attempt to simulate the networks of nerve cells (neurons) of the biological central nervous system. The neural network is in fact a novel computer architecture and a novel algorithmization architecture relative to conventional computers. It allows very simple computational operations (additions, multiplications and fundamental logic elements) to be used to solve complex, mathematically ill-defined problems. A conventional algorithm will employ complex sets of equations, and will apply only to a given problem and exactly to it. The ANN will be computationally and algorithmically very simple and it will have a self-organizing feature that allows it to hold for a wide range of problems.
A multitude of types of neural networks have been proposed over time. Actually, neural networks have been so intensely studied (for example by IT engineers, electronics engineers, biologists and psychologists) that they have received a variety of names. Scientists refer to them as „Artificial Neural Networks”, „Multi-Layered Perceptrons” and „Parallel-Distributed Processors”. Despite this fact, only a small group of classical networks is used in practice, mainly networks which use the BackPropagation algorithm, Hopfield networks, „competitive” networks and networks which use spiking neurons.
Knowledge can be classified by its degree of generality. At the basic level are signals, which contain useful data as well as parasite elements (noise). Data consists of elements which can raise a potential interest. We must consider the fact that processed data lead to information, driven by a specific interest. When a piece of information is subjected to a certain specialization, then we face knowledge. Knowledge-based systems, depending on their purpose and type, can reason on their own, having as a starting point signals, data and pieces of information; furthermore, in these knowledge-based systems we may also be dealing with metaknowledge.
Here are a number of reasons why we should study neural networks:
1. They are a viable alternative to the computational paradigm based upon the use of a formal model and the design of algorithms whose behaviour does not alter during use.
2. They incorporate results which derive from different fields of study, for the purpose of obtaining simple calculus architectures.
3. They model human intelligence, helping us to better understand the way the human brain works.
4. They can offer better rejection of errors, being able to perform well even if the input data are flawed.
ANNs have multiple representational forms, but the most common is the mathematical one. For each artificial neuron, the mathematical form consists of a function g(x) of the input vector x, where x = (x1, x2, ..., xi). Each input xi is weighted by the corresponding component of the weight vector w = (w1, w2, ..., wi). K is the post-processing function that is applied at the end. This results in the following equation for a single neuron:

    g(x) = K( Σi wi·xi )    (1)
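To make equation (1) concrete, the following minimal sketch (using NumPy; the choice of a sigmoid as the post-processing function K is only an assumption for illustration) computes the output of a single artificial neuron:

```python
import numpy as np

def sigmoid(v):
    # one possible choice for the post-processing function K
    return 1.0 / (1.0 + np.exp(-v))

def neuron_output(x, w, post_processing=sigmoid):
    """Single artificial neuron, equation (1): g(x) = K(sum_i w_i * x_i)."""
    weighted_sum = np.dot(w, x)           # sum_i w_i * x_i
    return post_processing(weighted_sum)  # K(...)

# example: three inputs and their weights
x = np.array([0.0, 1.0, 1.0])
w = np.array([0.5, -0.2, 0.8])
print(neuron_output(x, w))
```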
When interpreting the results we must take into consideration the fact that in handwritten text we face variability due to the loss of synchronism between the muscles of the hand, as well as variation of one's style due to several factors, including but not limited to education, mood, etc.
Reading handwriting is a very difficult task considering the diversities that exist in ordinary penmanship. However, progress is being made. Early devices, using non-reading inks to define specifically sized character boxes, could read constrained handwritten entries.
[…] of each resistor is determined by the section of the hole and the corresponding inverse synaptic efficacy. The circuit integration can be achieved both in standard bipolar technology, with a modest level of integration, and in CMOS technology.
In order to avoid the difficulty brought about by the imposed compromise between the high level of connectivity and the inaccessibility of the synaptic contacts, we could use a neural network scheme implemented with CCD (Charge Coupled Device) type microelectronic circuits, because CCD shift registers can store discrete groups of electrons in well defined positions, which can then be quickly moved by applying external potentials while keeping their local value.

Fig. 1. A neural network architecture implemented with CCD-type shift registers

The circuit shown in the previous figure mostly avoids the limitations imposed by the high degree of connectivity, being characterized by a synaptic matrix that is easily accessible and modifiable. On the other hand, the proposed arrangement partially sacrifices the parallelism and the asynchronous signal processing which gave rise to the original idea. It is true that the relatively high speed of CCD circuits partially compensates for the non-parallel data processing. Currently, shift registers containing up to 2000 CCD cells operating at frequencies of 10 MHz can be implemented.

4. PROPOSED SOFTWARE ARCHITECTURE

In order to obtain positive results, a feedforward network was chosen for implementing the integrated system, consisting of 150 neurons in the input layer, 250 neurons in the hidden layer and 16 neurons in the output layer; the output neurons represent the characters of the alphabet in binary code, each character being uniquely represented on 16 bits.
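The 150-250-16 topology can be summarised in a short sketch (a minimal NumPy illustration written for this text, not the application's actual code; the weight range and the sigmoid activation follow choices described later in the paper):

```python
import numpy as np

INPUT_SIZE, HIDDEN_SIZE, OUTPUT_SIZE = 150, 250, 16   # 10x15 pixels in, 16-bit code out

rng = np.random.default_rng(0)
# weights start as random values, here in [-1, 1] as described for training below
w_hidden = rng.uniform(-1.0, 1.0, size=(HIDDEN_SIZE, INPUT_SIZE))
w_output = rng.uniform(-1.0, 1.0, size=(OUTPUT_SIZE, HIDDEN_SIZE))

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward(pixels_10x15):
    """Forward pass: a 10x15 binary character image -> 16 output activations."""
    x = np.asarray(pixels_10x15, dtype=float).reshape(INPUT_SIZE)
    hidden = sigmoid(w_hidden @ x)
    return sigmoid(w_output @ hidden)

# example: an empty (all-white) character cell
print(forward(np.zeros((10, 15))))
```

During training, the 16 output activations would be compared against the 16-bit binary code of the expected character.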
Inserting hidden layers enhances the representation capacity of feedforward networks, but raises difficulties in terms of learning, as „delta”-type algorithms cannot be directly applied. This was one of the main reasons for the stagnation of the development of feedforward networks with supervised learning between 1969 (when Papert and Minsky highlighted the limits of single-level networks) and 1985 (when the BackPropagation algorithm, developed in parallel by several researchers, became known).
In determining the number of neurons in each layer the following had been taken into account:
- Both the input level and the output level should have as many units as needed to represent the input data and the output, respectively.
- The number of hidden units should be just enough to solve the problem, but not higher than necessary. The number of hidden units is based either on theoretical results concerning the representation capacity of the architecture (as in the case of the currently chosen network) or on heuristic rules (e.g. for a network with N input units, M output units and one hidden layer, we can choose for the latter a size of M ⋅ N).
If the number of hidden layer neurons is too small, the network fails to form an adequate internal representation of the training data and thus the classification error will be high. With a number that is too large, the network learns the training data very well but turns out to be incapable of good generalization, obtaining high levels of error on the test data.
Therefore the input vector consists of 150 components representing the elements of a 10x15 pixel binary matrix representation of the character. The matrix size was chosen considering the average size of the represented characters, with a minimum of noise introduced.
The algorithm used for network learning is the well-known Backpropagation, proposed in 1986 by Rumelhart, Hinton and Williams for setting the weights and hence for training multi-layer perceptrons.
Here is how learning proceeds: the network's weights are initialized with random numbers, usually between -1 and 1. The next step consists of applying the set of input data and calculating the output (this step is called the "forward step"). The calculation brings a result completely different from our target, because all the weights had random values. At this point the error of each neuron is calculated, which usually follows the formula: Target - Actual output. This error is then used to modify the weights so that the error becomes increasingly smaller. The process is repeated until the error is minimal.
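A minimal sketch of this loop (random weights in [-1, 1], a forward step, an error of the form target minus actual output, and repeated weight adjustments) is given below. It is an illustrative one-hidden-layer implementation written for this text, not the application's code, and it uses a textbook-scale learning rate rather than the 1-200 scale discussed next.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train(inputs, targets, n_hidden=8, learning_rate=0.5,
          max_epochs=5000, min_error=1e-3):
    """Toy backpropagation with one hidden layer (illustrative only)."""
    n_in, n_out = inputs.shape[1], targets.shape[1]
    # 1. initialize the weights with random numbers between -1 and 1
    w1 = rng.uniform(-1.0, 1.0, (n_hidden, n_in))
    w2 = rng.uniform(-1.0, 1.0, (n_out, n_hidden))
    for _ in range(max_epochs):
        total_error = 0.0
        for x, t in zip(inputs, targets):
            # 2. forward step: compute the network output
            h = sigmoid(w1 @ x)
            y = sigmoid(w2 @ h)
            # 3. error of each output neuron: target - actual output
            e = t - y
            total_error += float(np.sum(e ** 2))
            # 4. propagate the error backwards and adjust the weights
            delta_out = e * y * (1.0 - y)
            delta_hid = (w2.T @ delta_out) * h * (1.0 - h)
            w2 += learning_rate * np.outer(delta_out, h)
            w1 += learning_rate * np.outer(delta_hid, x)
        # 5. repeat until the error is small enough
        if total_error < min_error:
            break
    return w1, w2

# example usage on a toy problem (XOR)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
w1, w2 = train(X, T)
```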
The learning rate (η) speeds up or slows down the learning process, as appropriate. We have decided upon a learning rate of 150 as being the appropriate one for this system, but we allow the user to modify it in the range of 1 to 200 and configure it according to his own needs.
The detection of the symbols is a very important part of the program. It is based on the premise that we are dealing only with black and white images, where white, RGB (255,0,0,0), means space and black, RGB (255,255,255,255), means character bitmap, at any resolution. It is also considered that the image contains only characters; any other form present (table line, edge, etc.) is considered noise.
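As a rough illustration of this premise, the sketch below (written for this text with Pillow and NumPy; the greyscale thresholding, the column-gap segmentation and the 10x15 rescaling are our own assumptions, not the application's exact procedure) turns a scanned line of text into 150-element input vectors:

```python
import numpy as np
from PIL import Image  # Pillow, assumed available for this sketch

def load_binary(path, threshold=128):
    """Load an image in greyscale and binarize it: 1 = ink (character), 0 = white (space)."""
    grey = np.array(Image.open(path).convert("L"))
    return (grey < threshold).astype(np.uint8)

def split_characters(binary_line):
    """Split a single line of text into characters using empty (all-white) columns."""
    ink_columns = np.flatnonzero(binary_line.any(axis=0))
    if ink_columns.size == 0:
        return []
    characters, start = [], ink_columns[0]
    for prev, cur in zip(ink_columns, ink_columns[1:]):
        if cur - prev > 1:                       # a run of white columns ends a character
            characters.append(binary_line[:, start:prev + 1])
            start = cur
    characters.append(binary_line[:, start:ink_columns[-1] + 1])
    return characters

def to_input_vector(char_bitmap, rows=15, cols=10):
    """Rescale one character bitmap to 10x15 and flatten it into the 150-element input."""
    img = Image.fromarray(char_bitmap * 255).resize((cols, rows))
    return (np.array(img) > 127).astype(float).reshape(-1)
```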
Fig. 2. The chosen neural network architecture

We should also mention that each training set consists of an image and a text file containing the desired output.
Concerning the user's ability to customize the application, we have granted the possibility to choose an activation function. To fully understand the mechanism we should acknowledge that in biologically-inspired neural networks, the activation function is usually an abstraction representing the rate of action potential firing in the cell. In its simplest form, this function is binary: either the neuron is firing or it is not. The function looks like φ(vi) = U(vi), where U is the Heaviside step function.
In this case a large number of neurons must be used in computation to go beyond linear separation of categories.
Activation functions for the hidden units are needed to introduce nonlinearity into the network. Without nonlinearity, hidden units would not make nets more powerful than plain perceptrons (which have no hidden units, just input and output units). The reason is that a linear function of linear functions is again a linear function. However, it is the nonlinearity (i.e. the capability to represent nonlinear functions) that makes multilayer networks so powerful. Almost any nonlinear function does the job, except for polynomials.
For hidden units, sigmoid activation functions are usually preferable to threshold activation functions. Here they are:
- the standard sigmoid function, which ranges from 0 to 1:

    y = 1 / (1 + e^(-D·x))    (2)

- the hyperbolic tangent, which ranges from -1 to 1:

    y = 2 / (1 + e^(-2·x)) - 1    (3)

- the Gauss function:

    y = e^(-x²)    (4)

- another commonly used sigmoid:

    y = x / (1 + |x|)    (5)

As for the last function, if the network output is a set of numerical values, then it will require more iterations to achieve the target value. But if the problem is a classification one, as in this case, this function is appropriate because it consumes less processing time in the central unit, without the number of iterations being affected.
Networks with threshold units are difficult to train because the error function is stepwise constant, hence the gradient either does not exist or is zero, making it impossible to use backprop or more efficient gradient-based training methods.
Considering all of the above, the user has a choice to make: unipolar sigmoid, bipolar sigmoid, linear function, […]
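For reference, functions (2)-(5) can be written directly in code (a small NumPy sketch; D in (2) is the slope parameter from the formula above):

```python
import numpy as np

def unipolar_sigmoid(x, D=1.0):
    """Equation (2): ranges from 0 to 1."""
    return 1.0 / (1.0 + np.exp(-D * x))

def bipolar_sigmoid(x):
    """Equation (3): equivalent to tanh(x), ranges from -1 to 1."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def gauss(x):
    """Equation (4): the Gauss function."""
    return np.exp(-x ** 2)

def fast_sigmoid(x):
    """Equation (5): x / (1 + |x|), cheap to evaluate."""
    return x / (1.0 + np.abs(x))
```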
5. EXPERIMENTAL RESULTS

5.1. Results obtained for variation in the number of epochs of training are illustrated in the following tables:

Activation function: bipolar sigmoid. Number of symbols = 90. Learning rate = 150. Each cell gives the number of misidentified characters and the corresponding error rate.

Used Font        | 50 epochs       | 200 epochs      | 500 epochs
                 | Misid. / Error  | Misid. / Error  | Misid. / Error
Arial            | 19 / 24%        | 2 / 3%          | 4 / 4.5%
Tahoma           | 13 / 14.5%      | 5 / 5.6%        | 3 / 3.4%
Times New Roman  | 11 / 12.3%      | 4 / 4.5%        | 1 / 1.2%

Used Font        | 800 epochs      | 1000 epochs     | 2000 epochs
                 | Misid. / Error  | Misid. / Error  | Misid. / Error
Arial            | 2 / 3%          | 1 / 1.2%        | 2 / 3%
Tahoma           | 2 / 3%          | 2 / 3%          | 2 / 3%
Times New Roman  | 2 / 3%          | 1 / 1.2%        | 1 / 1.2%

Used Font        | 3000 epochs
                 | Misid. / Error
Arial            | 0 / 0%
Tahoma           | 1 / 1.2%
Times New Roman  | 0 / 0%

(Chart: variation in the number of epochs versus the number of misidentified characters.)
5.2. Results obtained for variation of the number of symbols:

Used Font          | 20 symbols      | 50 symbols      | 90 symbols
                   | Misid. / Error  | Misid. / Error  | Misid. / Error
Latin Arial        | 0 / 0%          | 6 / 12%         | 11 / 12.22%
Latin Tahoma       | 0 / 0%          | 3 / 6%          | 8 / 8.89%
Latin Times Roman  | 0 / 0%          | 2 / 4%          | 9 / 10%
5.3. Results obtained for variation of learning rate:

Activation function: bipolar sigmoid. Number of symbols = 90. Number of epochs = 300.

Used Font    | Learning rate 1  | Learning rate 10  | Learning rate 40
             | Misid. / Error   | Misid. / Error    | Misid. / Error
Arial        | 56 / 63%         | 6 / 6.78%         | 5 / 5.6%
Tahoma       | 70 / 78%         | 8 / 8.9%          | 10 / 11.2%
Times Roman  | 48 / 54%         | 4 / 4.5%          | 3 / 3.4%

Used Font    | Learning rate 80  | Learning rate 120
             | Misid. / Error    | Misid. / Error
Arial        | 2 / 2.33%         | 0 / 0%
Tahoma       | 2 / 2.33%         | 2 / 2.33%
Times Roman  | 0 / 0%            | 0 / 0%

(Chart: variation of the learning rate versus the number of misidentified characters.)
§ […] due to the fact that the adjustments of the parameters are very small.
§ Overtraining: the network provides a good approximation on the training set, but possesses a low generalization ability.
Starting from the standard BP, BackPropagation algorithm variants can be developed that differ in:
§ How the learning rate is chosen: constant or adaptive;
§ The adjustment relations (determined by the minimization algorithm used, which may differ from the simple gradient algorithm: conjugate gradient algorithms, Newton-type algorithms, random descent algorithms, genetic algorithms, etc.);
§ How the parameters are initialized: random initialization or initialization based upon a search algorithm;
§ How the training set is traversed (this only influences the serial version of the algorithm): sequentially or randomly;
§ The error function: besides the mean squared error, specific error measures can be used for the problem at hand (e.g. in the case of classification problems an entropy-based error can be used);
§ The stopping criterion: in addition to the criterion based on the maximum number of epochs and the corresponding error on the training set, we can use criteria related to the validation set error and to the size of the adjustment in the last epoch.
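To make a few of these variants concrete, here is a small sketch (illustrative code written for this text, not part of the described application) of an adaptive learning rate, an entropy-based error for classification, and a stopping criterion that watches the validation error:

```python
import numpy as np

def cross_entropy(target, output, eps=1e-12):
    """An entropy-based error for classification, an alternative to the mean squared error."""
    output = np.clip(output, eps, 1.0 - eps)
    return float(-np.sum(target * np.log(output) + (1 - target) * np.log(1 - output)))

def adapt_learning_rate(rate, previous_error, current_error, grow=1.05, shrink=0.7):
    """Adaptive learning rate: grow it while the error keeps falling, shrink it otherwise."""
    return rate * grow if current_error < previous_error else rate * shrink

def should_stop(epoch, max_epochs, validation_errors, patience=10):
    """Stopping criterion: maximum number of epochs, or no recent improvement on the validation set."""
    if epoch >= max_epochs:
        return True
    if len(validation_errors) > patience:
        best_recent = min(validation_errors[-patience:])
        return best_recent >= min(validation_errors[:-patience])
    return False
```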
7. CONCLUSIONS

The NN was trained and tested for different test and training patterns.
§ In all the cases the amount of convergence and the error rate were observed.
§ The convergence greatly depended on the hidden layers and on the number of neurons in each hidden layer.
§ The number of neurons in each hidden layer should be neither too low nor too high.
§ The NN, once properly trained, was very accurate in […]

8. REFERENCES