Abstract
No reliable method currently exists for translating scanned
handwriting into a text file. This paper describes a classifier
for handwritten digits based on the structural similarity index.
The classifier is implemented in MATLAB.
The results of the performed experiments show that the
classifier successfully recognizes digits in 86% of cases.
1 Introduction
There is no reliable automatic method for recognizing
handwriting and translating it into a text file. A solution to
this problem would simplify and accelerate the digitization of
handwritten documents, as well as their browsing and
searching. A special case of handwriting recognition in which
significant progress has been achieved is the recognition of
handwritten digits [1, 2, 3]. In this paper, a classifier of
handwritten digits is developed, based on a signal fidelity
measure that uses the structural similarity index.
The standard criterion for measuring signal quality and the
difference of a signal from the original (signal fidelity) is
the mean squared error. It is used both to compare signal
processing methods and to optimize signal processing systems.
Nonetheless, the mean squared error does not perform well,
especially for signals that represent speech and images [4].
As criticism of the mean squared error has grown, other signal
fidelity measures have been proposed. One of these is the
structural similarity index. It was proposed because
neighboring signal samples (pixels) have strong mutual
dependencies, which the mean squared error ignores. These
dependencies carry important information about the structure
of the objects in the image [4].
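The limitation described above can be illustrated with a minimal sketch. The signal values and the helper name `mse` are illustrative assumptions, not taken from the paper. A constant brightness shift preserves the structure of the signal, while an alternating perturbation of the same magnitude destroys it, yet both yield the same mean squared error.

```python
def mse(x, y):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Hypothetical one-dimensional "image row" with a smooth ramp structure.
x = [10, 20, 30, 40, 50, 60]

# Structure-preserving distortion: every sample is brightened by 5.
shifted = [v + 5 for v in x]

# Structure-changing distortion: samples are alternately raised and lowered by 5.
alternating = [v + (5 if i % 2 == 0 else -5) for i, v in enumerate(x)]

# Both distortions have the same per-sample magnitude, so the MSE cannot
# tell them apart even though only the second one alters the structure.
print(mse(x, shifted), mse(x, alternating))
```

Because the mean squared error sums per-sample differences independently, it carries no information about how the errors are arranged relative to neighboring samples.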
Although it was originally intended as a measure of image
quality, the structural similarity index is a measure of the
similarity between images, so it can also be used to classify
them. Since the structural similarity index yields better
results than the mean squared error in measuring signal
fidelity [4], applying it to the problem of handwritten digit
recognition could also lead to improvements.
In this paper, a nearest neighbor classifier for handwritten
digits, based on the structural similarity index, is described.
We experimentally evaluated the classifier on a subset of
handwritten digits from the MNIST database [5] and discuss the
results of the experiment. The results show that the classifier
recognized the test digits in 86% of cases. This paper is
similar to papers [6] and
2 Signal fidelity
A universal measure of signal fidelity that is appropriate for
all areas of signal processing does not exist [4], because a
suitable measure depends on the area in which it is used. The
main reason is that human perception is highly subjective.
Therefore, in this paper a measure of signal fidelity is
considered appropriate only if it is consistent with the whole
communication system. Here the communication system consists
of the natural image (transmitter) and human visual perception
(receiver). In other words, the measure should assign close
values to visually close images, and vice versa.
The structural similarity index is based on the observation
that natural image signals have strong mutual dependencies
between neighboring pixels, and that those dependencies carry
information about the structure of the objects in the image.
It also relies on the fact that the human visual system is
adapted to extract information about the structure of the
objects it sees. This is why it is important for a measure of
signal fidelity to capture information about the structure of
the image. A measure of signal fidelity can also be built by
measuring structural distortion. Since the human visual system
is very sensitive to structural distortions (additive noise,
blurring, heavy lossy compression, etc.), while it adapts well
to distortions that do not change the structure of the image
(changes in lighting or brightness, or a spatial shift), a
measure of image fidelity should reproduce both of these
properties [4].
2.1 Structural similarity index
Let two images be compared, and let x be a set of pixels from
the first image and y a set of pixels from the second image,
taken at the same coordinates in their respective images. The
local structural similarity index measures three properties of
these two sets: the similarity of brightness l(x,y), of
contrast c(x,y), and of local structure s(x,y). Combining
these local values yields the structural similarity index:
$$S(x, y) = l(x, y) \cdot c(x, y) \cdot s(x, y) \qquad (1)$$
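The product of the three components in equation (1) can be sketched in code. The sketch assumes the commonly used definitions of the components from the structural similarity literature (means for brightness, standard deviations for contrast, covariance for structure). The stabilizing constants `c1`, `c2` and `c3 = c2 / 2`, the use of population statistics over a whole patch, and the function name `ssim_patch` are assumptions made for illustration, not details confirmed by this paper.

```python
import math


def ssim_patch(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Structural similarity of two equally sized pixel sets (flat lists).

    Assumed component definitions:
      l = (2*mu_x*mu_y + c1) / (mu_x^2 + mu_y^2 + c1)
      c = (2*sd_x*sd_y + c2) / (sd_x^2 + sd_y^2 + c2)
      s = (cov_xy + c3) / (sd_x*sd_y + c3), with c3 = c2 / 2
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    c3 = c2 / 2

    brightness = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return brightness * contrast * structure


# Hypothetical 2x2 patches, flattened: a checker pattern and its inverse.
patch = [0, 255, 0, 255]
inverse = [255, 0, 255, 0]

print(ssim_patch(patch, patch))    # identical patches give an index of about one
print(ssim_patch(patch, inverse))  # inverted structure gives a negative index
```

In practice the index is usually evaluated in a sliding local window and averaged over the image; the single-patch form is kept here only to make the three components visible.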
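$$\tilde{S}(c_x, c_y) = \frac{2 \sum_{i=1}^{N} |c_{x,i}| |c_{y,i}| + K}{\sum_{i=1}^{N} |c_{x,i}|^2 + \sum_{i=1}^{N} |c_{y,i}|^2 + K} \cdot \frac{2 \left| \sum_{i=1}^{N} c_{x,i} c_{y,i}^{*} \right| + K}{2 \sum_{i=1}^{N} |c_{x,i} c_{y,i}^{*}| + K} \qquad (2)$$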
changes of $c_x$ and $c_y$. Its maximum value, one, is achieved
only if the phase difference between $c_{x,i}$ and $c_{y,i}$ is
constant for every i. The phase component of the structural
similarity index in the complex wavelet domain conveys the
structural similarity of images correctly, because the local
structural similarity is calculated using the relative phase
patterns of the local frequencies of the image. Moreover, a
constant phase offset of all coefficients does not affect the
structure of the local samples of the image. In this paper, the
index in the complex wavelet domain is calculated using the
equation given in [9]:
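$$\tilde{S}(c_x, c_y) = \frac{2 \left| \sum_{i=1}^{N} c_{x,i} c_{y,i}^{*} \right| + K}{\sum_{i=1}^{N} |c_{x,i}|^2 + \sum_{i=1}^{N} |c_{y,i}|^2 + K} \qquad (3)$$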
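Equation (3) can be sketched directly on two lists of complex coefficients. The coefficient values, the value of the constant `k`, and the function name `cw_ssim` are hypothetical. In an actual implementation the coefficients would come from a complex wavelet decomposition of local image regions, which is not reproduced here.

```python
import cmath


def cw_ssim(cx, cy, k=0.01):
    """Index of equation (3) for two equally long lists of complex coefficients."""
    cross = sum(a * b.conjugate() for a, b in zip(cx, cy))
    energy = sum(abs(a) ** 2 for a in cx) + sum(abs(b) ** 2 for b in cy)
    return (2 * abs(cross) + k) / (energy + k)


# Hypothetical coefficients standing in for one local wavelet neighborhood.
cx = [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 + 0.4j]

# The same coefficients with a constant phase offset applied to all of them.
rotated = [c * cmath.exp(1j * 0.7) for c in cx]

# The same magnitudes, but with a different phase offset for every coefficient.
scrambled = [c * cmath.exp(1j * 1.5 * i) for i, c in enumerate(cx)]

print(cw_ssim(cx, rotated))    # constant phase difference: index stays at about one
print(cw_ssim(cx, scrambled))  # inconsistent phase differences: index drops
```

The first case shows the property stated above: a constant phase offset of all coefficients leaves the index at its maximum, whereas phase differences that vary with i reduce it.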
4 Experimental results
During the experiments, the number of correct classifications
was counted, which allows the quality of the classifier to be
assessed afterwards. The combinations of images for which the
classifier made an error were also recorded. The classification
results are presented as a confusion matrix, given in Fig. 2.
The confusion matrix shows not only how many times the
classifier was wrong, but also in what way, that is, which
digits it confused with which. However, it does not show which
exact image of a digit was misidentified. For this reason a
third criterion for assessing the classifier is included: the
combinations of images for which the classifier made errors
are shown in Fig. 3.
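The three assessment criteria described above can be sketched as follows. The label lists are made-up toy data chosen only to exercise the code and are not results from the experiments; the function names are likewise illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Rows index the true digit, columns index the digit chosen by the classifier."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of decisions that fall on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def misclassified(true_labels, predicted_labels):
    """(test index, true digit, predicted digit) for every wrong decision."""
    pairs = zip(true_labels, predicted_labels)
    return [(i, t, p) for i, (t, p) in enumerate(pairs) if t != p]


# Toy labels (illustrative only, not data from the paper).
true_labels = [3, 5, 7, 7, 9, 4]
predicted_labels = [3, 3, 7, 1, 9, 9]

matrix = confusion_matrix(true_labels, predicted_labels)
print(accuracy(matrix))
print(misclassified(true_labels, predicted_labels))
```

The list of misclassified indices supplies what the confusion matrix alone cannot: it identifies the individual test images behind each off-diagonal entry.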
http://pajkanovic.netne.net/cwssim
[Fig. 3, panels (a) to (l): combinations of images for which the classifier made errors]
5 Conclusions
In this paper the structural similarity index is presented in
its two forms (sections 2.1 and 2.2), and its mathematical
properties, advantages, and flaws as a measure of signal
fidelity are listed. Since it has been shown in the literature
[4], [9] that, in the assessment of the visual quality of
images, the structural similarity index yields the results
closest to human perception, we decided to use this index as
the basis for a classifier of scanned images of handwritten
digits. The classifier works on the nearest neighbor
principle. The classification algorithm is implemented in
MATLAB.
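The nearest neighbor principle can be sketched with a pluggable similarity function. This is a Python illustration under stated assumptions, not the authors' MATLAB implementation: the function names, the 3x3 toy images, and the cosine similarity used as a stand-in for the structural similarity index are all hypothetical.

```python
import math


def classify_nearest(test_image, references, similarity):
    """Return the label of the reference image most similar to the test image.

    references is a list of (image, label) pairs; similarity is any function
    that returns a larger value for more similar images.
    """
    _, best_label = max(references, key=lambda pair: similarity(test_image, pair[0]))
    return best_label


def cosine_similarity(x, y):
    """Stand-in similarity measure (assumption); an index such as the
    structural similarity index would be plugged in here instead."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


# Toy 3x3 binary "digits", flattened row by row (illustrative only).
one = [0, 1, 0,
       0, 1, 0,
       0, 1, 0]
zero = [1, 1, 1,
        1, 0, 1,
        1, 1, 1]
references = [(one, 1), (zero, 0)]

# A vertical stroke with one extra pixel set, as a noisy test sample.
noisy_one = [0, 1, 0,
             0, 1, 0,
             0, 1, 1]

print(classify_nearest(noisy_one, references, cosine_similarity))
```

Keeping the similarity measure as a parameter separates the decision rule from the fidelity measure, so the same rule can be evaluated with different indices.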
Given its simplicity, the classifier gives promising results:
its decisions were correct in 86% of the cases.
Acknowledgments
The authors wish to express their gratitude to Professor
Branimir Reljin, PhD. for his many useful suggestions during
the writing of this paper.
References
[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner.
"Gradient-based learning applied to document
recognition." Proceedings of the IEEE, vol. 86, no. 11,
pp. 2278-2324, November 1998.
[2] K. Labusch, E. Barth, T. Martinetz, Simple Method for
High-Performance Digit Recognition Based on Sparse
visited: December
visited: