
An Efficient Scheme for Tilt Correction in Arabic OCR Systems

M. Sarfraz*, S. A. Shahab
Department of Information and Computer Science
KFUPM, Dhahran 31261, Saudi Arabia
Email: sarfraz@ccse.kfupm.edu.sa

Abstract

A preprocessing stage is required in almost every image processing application, ranging from biometric analysis to document image analysis. An input image or other input information needs to be normalized and converted into a format acceptable to the OCR (Optical Character Recognition) system. OCR systems typically assume that documents were printed with a single direction of the text and that the acquisition process did not introduce a relevant skew. In practice this assumption does not hold well, and printed documents may be skewed at some angle to the horizontal axis. In this paper, we propose a skew estimation scheme for document images in Arabic fonts. It is based upon a specific feature of Arabic script: we scan for occurrences of the letter 'alif' and estimate the tilt from its slope. Extensive experimentation was performed and the scheme was found to be very effective.

1. Introduction

Preprocessing is a stage in a typical OCR system which focuses on enhancing the acquired image to increase the ease of feature extraction and to compensate for the eventual poor quality of the scanned document [2]. When patterns are scanned and digitized, the raw data may carry a certain amount of noise. If the acquired image contains noise, it is subjected to the preprocessing stage, where “de-noising” of the image takes place. Furthermore, when a document is fed to the scanner either mechanically or by a human operator, a few degrees of skew (tilt) are unavoidable. Skew estimation is a process which aims at detecting the deviation of the document orientation angle from the horizontal or vertical direction. Skew detection and correction are important preprocessing steps of document layout analysis and OCR approaches.

OCR systems typically assumed that documents were printed with a single direction of the text and that the acquisition process did not introduce a relevant skew. The advent of flatbed scanners and the need to process large amounts of documents at high rates made this assumption unreliable, and the introduction of a skew estimation phase became mandatory. In fact, a slight skew of the page is often introduced during processes such as copying or scanning. Moreover, documents today are increasingly free-styled, and text aligned along different directions is not uncommon. The subsequent stages of an OCR system depend chiefly upon the accuracy of the preprocessing stage. For instance, if an OCR system underestimates or overestimates the skew angle, any later stage that relies on a projection-based technique will fail.

In this paper, we address the issue of tilt estimation in Arabic document images. The paper is organized as follows. Section 2 deals with related work, Section 3 describes our proposed scheme, Section 4 presents experiments and results, and Section 5 concludes the paper.

2. Related Work

A variety of techniques have been proposed in the literature to estimate and correct the skew of a document image. In this section, we briefly review some of the existing techniques for skew estimation and correction in document processing. A comprehensive survey can be found in [3].

Most skew estimation techniques can be divided into the following main classes according to the basic approach they adopt [3]: analysis of projection profiles, Hough transform, connected components clustering, and correlation between lines. Other techniques have been proposed which are based on the analysis of the Fourier spectrum [13], on the use of morphological transforms, and on subspace line detection [6].

Proceedings of the Computer Graphics, Imaging and Vision: New Trends (CGIV’05)
0-7695-2392-7/05 $20.00 © 2005 IEEE
Nazim et al. [4] detected the angle of rotation of the text by creating an array whose indices run from -90 to 90, initialized to zero. All black pixels in the image are grouped in a vector from which they are retrieved in pairs. For each pair, the angle of the straight line segment defined by the pair is calculated and the corresponding cell in the array is incremented. After exhausting all pairs, the array is examined and the index of the maximum-value cell is taken as the rotation angle of the text. Although this is a simple procedure, it is very expensive in terms of the number of computations performed, since the number of pairs grows quadratically with the number of black pixels.

Another skew estimation technique is based on the projection profile of the document [5]. The horizontal/vertical projection profile is a histogram of the number of black pixels along horizontal/vertical scan-lines. For a script with horizontal text lines, the horizontal projection profile will have peaks at text line positions and troughs at positions in between successive text lines. To determine the skew angle of a document, the projection profile is computed at a number of angles, and for each angle, the difference between peak and trough heights is measured. The maximum difference corresponds to the best alignment with the text line direction. This in turn determines the skew angle.

A recursive morphological transform [6] produces skew estimates for documents whose text lines have different skews. Text skew angles within 0.5º of the true text skew angles are computed with a probability of 99%. To process a 300 dpi document image, the algorithm takes 10 seconds on a SUN SPARC 10 machine.

In another approach [7], the skew angle is calculated using text row accumulation based on the computation of the first eigenvector of the data covariance matrix. In contrast to other works, where an image resolution of 100-300 dpi is necessary, a lower resolution (50 dpi) is enough for this method to obtain a correct result.

Another work on skew angle estimation is based on Cohen’s class distributions [9]. It is suitable for most types of document pages: printed and handwritten pages, with graphics or borders, poor resolution, and various types and sizes of fonts. Skew angles of up to ±89 degrees are handled with an accuracy of ±0.5. The method is based on the fact that the histogram of the non-skewed page presents more pronounced peaks and dips than any other histogram of the same page corresponding to a skew angle. A Cohen distribution of a histogram represents its time-frequency distribution, where in this case the time increases with the height of the page. Consequently, the Cohen distribution presents maximum intensity for the histograms at 0 and 180 degrees, which show the strongest alternation of peaks and dips. The closer the skew angle is to 0º or 180º, the larger the values of the maximum intensity. This fact guarantees the success of the algorithm, provided that the skew angle ranges between -89 and +89 degrees with respect to the upright page position. Otherwise the page would end up oriented upside down.

A nearest-neighbor based approach [10] also performs accurate document skew estimation. A size restriction is introduced to the detection of nearest-neighbor pairs. Then the chains with the largest possible number of nearest-neighbor pairs are selected, and their slopes are computed to give the skew angle of the document image.

Another approach is based on fixing the threshold statically and dynamically in order to separate the text lines [11]. This method proceeds with the assumption that there is space between text lines. It gives accurate results for skews of up to ±30°, and the skew is computed by considering all the text lines in the document. It works well for images of any size. The major disadvantage of this method is that it loses accuracy for documents skewed by more than 30 degrees.

3. Proposed Schemes

As discussed in Section 2, there is no specific technique for skew estimation in Arabic OCR systems. We have devised a new technique for this purpose. In this technique, we scan for 'alif', a character in Arabic script that is used very often and is almost vertical when there is no tilt. The shape of 'alif' is found to be similar in all fonts. We utilize this fact and scan the document for the character 'alif'. We categorize a document into two basic parts: a structural part (single dots, two dots, 'ء' and other stress marks) and an alphabetic part (everything other than the structural part, that is, connected or isolated characters).

Let
Smin = element in the structural part with the minimum number of pixels

Smax = element in the structural part with the maximum number of pixels
Amin = element in the alphabetic part with the minimum number of pixels
Amax = element in the alphabetic part with the maximum number of pixels

We have found that the difference between Smax and Amin is more than 3 pixels for any Arabic font of size greater than or equal to 12. This difference grows as the font size increases. Utilizing this fact, we search for estimates of the above-mentioned parameters. For instance, consider the document in Fig. 1.

Figure 1: Arabic Document with Font Size 14


Here, Smin = single dot, Smax = 'ء', Amin = 'ا', Amax = 'لمكتبي'.

It can be noted that Amax depends strictly upon the text in the document image. Similarly, Smax can also change, but in almost every case Smin and Amin will not change. Another important point is the separation between Smax and Amin. This condition cannot be compromised: the numbers of pixels in Smax and Amin must differ considerably (by more than 3 pixels). We now scan for the element of the alphabetic part with the maximum number of occurrences. 'ا' (alif) is a letter that occurs frequently compared with other elements.

The skew angle of alif can be found by drawing a bounding box around it. Let P1(x1, y1) be the topmost point of 'ا' and P2(x2, y2) be the bottommost point. The slope of the tilted 'ا' can be found using the following formula:

$\theta = \tan^{-1}\left(\frac{y_2 - y_1}{x_2 - x_1}\right)$

The document needs to be rotated by the angle $\theta$.

However, one problem with this scheme is that 'ا' is not exactly vertical even if there is no tilt. Table 1 shows a typical bit pattern of a template 'ا' of font size 14.

Table 1: Bit Pattern of template 'ا' of font size 14

1 0
1 0
1 1
1 1
1 1
1 1
1 1
0 1
0 1
0 1
0 1
0 1
0 1
0 1

The skew angle of the template 'alif' given in Table 1 is found, in magnitude, to be:

$\left|\theta_{\text{alif}}\right| = \left|\tan^{-1}\left(\frac{14 - 0}{0 - 1}\right)\right| \approx 85.91^\circ$

The bit pattern of an 'alif' tilted at 60° is shown in Table 2.

Table 2: Bit Pattern of 'ا' tilted at 60°

0 0 0 0 0 0 1
0 0 0 0 0 1 1
0 0 0 0 0 1 1
0 0 0 0 1 1 0
0 0 0 1 1 1 0
0 0 0 0 1 0 0
0 0 0 1 0 0 0
0 0 0 1 0 0 0
0 0 1 0 0 0 0
0 1 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 0
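
The endpoint-based slope estimate can be tried directly on bit patterns such as those in Tables 1 and 2. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the (column, row) coordinate convention, the use of row-index differences, averaging the columns when an extreme row holds several foreground pixels, and reporting only the magnitude of the angle are all choices made here for illustration. Because index differences are used, the values can differ slightly from a calculation based on the bounding-box height.

```python
import math

def endpoints(bitmap):
    """Return the topmost and bottommost foreground points as (x, y) = (column, row)."""
    rows = [r for r, row in enumerate(bitmap) if any(row)]
    if not rows:
        raise ValueError("bitmap has no foreground pixels")

    def mean_col(r):
        # Assumption: if an extreme row has several set pixels, use their mean column.
        cols = [c for c, v in enumerate(bitmap[r]) if v]
        return sum(cols) / len(cols)

    top, bottom = rows[0], rows[-1]
    return (mean_col(top), top), (mean_col(bottom), bottom)

def stroke_angle_deg(bitmap):
    """Magnitude of the angle between the stroke and the horizontal axis, in [0, 90] degrees."""
    (x1, y1), (x2, y2) = endpoints(bitmap)
    return math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))

# Bit pattern of the template alif (Table 1).
TABLE_1 = [
    [1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1], [1, 1],
    [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1],
]

# Bit pattern of the tilted alif (Table 2).
TABLE_2 = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print("template alif:", round(stroke_angle_deg(TABLE_1), 2), "degrees")
    print("tilted alif:  ", round(stroke_angle_deg(TABLE_2), 2), "degrees")
```

With this convention the template comes out close to vertical and the tilted pattern at roughly sixty degrees from the horizontal, which matches the qualitative picture given by the two tables.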

In order to rotate the document image, the adjusted skew angle can be found using the following equation:

$\theta' = \theta_{\text{tilted-alif}} - \theta_{\text{alif}}$
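
The adjustment and the subsequent rotation can be illustrated with a few lines of code. This is a hedged sketch rather than the procedure used in the paper: the helper names are hypothetical, points are treated as (x, y) coordinates with the y axis pointing up (in image coordinates, with y pointing down, the visual direction of rotation is mirrored), and the rotation is the standard 2D rotation about a chosen center.

```python
import math

def adjusted_skew_deg(theta_tilted_alif, theta_alif):
    """Adjusted skew: measured alif angle minus the template alif angle (degrees)."""
    return theta_tilted_alif - theta_alif

def rotate_points(points, angle_deg, center=(0.0, 0.0)):
    """Rotate (x, y) points by angle_deg about center (counter-clockwise for y-up axes)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return rotated

def deskew_points(points, theta_tilted_alif, theta_alif, center=(0.0, 0.0)):
    """Rotate in the direction opposite to the adjusted skew to cancel it."""
    correction = adjusted_skew_deg(theta_tilted_alif, theta_alif)
    return rotate_points(points, -correction, center)

if __name__ == "__main__":
    # Illustrative angles only: a stroke measured at 60 degrees, template at 85 degrees.
    tip = [(math.cos(math.radians(60.0)), math.sin(math.radians(60.0)))]
    (x, y), = deskew_points(tip, 60.0, 85.0)
    print("angle after correction:", round(math.degrees(math.atan2(y, x)), 2))
```

Subtracting the template angle rather than 90° means the corrected stroke returns to the template's own slight lean instead of being forced exactly vertical.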

4. Experiments and Results

We have tested this scheme on a variety of tilted documents, and the accuracy is found to be within a range of -0.017 to +0.017 radians (roughly ±1°). We have also applied this scheme to different Arabic fonts and found it to be very effective. For example, consider an image tilted at an angle of 60°. Fig. 2 shows the tilted document and Fig. 3 depicts the detected 'ا'.

Figure 2: Tilted Document

Figure 3: Detected 'Alif'


After detecting the skew angle of 'alif', the document is rotated in the opposite direction to negate the effect of the skew, as shown in Fig. 4.

Figure 4: Rotated Document

With adjustment of the skew angle against the template 'alif', the document image can be segmented by a projection-based method even if it is not exactly aligned with the horizontal axis. Fig. 5 shows the horizontal projection of the skew-corrected document.

Figure 5: Horizontal Projection of skew corrected document

Without adjustment of the skew angle against the template 'alif', the document cannot be segmented with a projection-based method. This is visible in Fig. 6.

Figure 6: Horizontal Projection of rotated image without skew adjustment
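
The role of the horizontal projection in line segmentation can be shown with a small example. The sketch below is illustrative only: the function names, the zero threshold, and the two toy bitmaps are assumptions, not data from the experiments. Text lines are taken to be runs of rows whose black-pixel count exceeds the threshold, so segmentation works only when empty rows separate the lines.

```python
def horizontal_projection(bitmap):
    """Number of foreground (black) pixels in each row of a binary image."""
    return [sum(row) for row in bitmap]

def segment_lines(profile, threshold=0):
    """Return (first_row, last_row) for each run of rows with count > threshold."""
    segments, start = [], None
    for r, count in enumerate(profile):
        if count > threshold and start is None:
            start = r                          # a text line begins
        elif count <= threshold and start is not None:
            segments.append((start, r - 1))    # a text line ends
            start = None
    if start is not None:
        segments.append((start, len(profile) - 1))
    return segments

# Toy page with two horizontal text lines separated by an empty row.
ALIGNED = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

# Toy page with two slanted text lines: every row contains some ink.
SKEWED = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

if __name__ == "__main__":
    for name, page in (("aligned", ALIGNED), ("skewed", SKEWED)):
        profile = horizontal_projection(page)
        print(name, profile, segment_lines(profile))
```

On the aligned page the profile drops to zero between the lines and two segments are found; on the skewed page the profile never reaches zero, so the two lines merge into one segment.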

We also tested our proposed scheme on skew correction of a document in the Andalus font tilted at 63°. The results are shown in the following figures.

Figure 7: Normal Document

Figure 8: Skewed Document

Figure 9: Detected 'Alif' from tilted document

Figure 10: Rotated Document after skew correction

Figure 11: Horizontal Projection

5. Conclusion

Tilt correction in document images has been a major problem in OCR systems. Tilts are induced in document images by operational mistakes. The recognition accuracy of OCR systems depends heavily upon the pre-processing stage, and tilt correction is considered to be among the primary pre-processing steps. In this paper we have proposed an efficient scheme for tilt correction in Arabic document images. This scheme is specific to Arabic fonts, and it attempts to correct the tilt by observing the orientation and slope of the Arabic letter 'ا', which is present in almost all documents. The scheme should outperform the others in speed because it does not consider the entire document. The speed depends upon the position of the first detected 'ا', which has a high probability of being found in the first line. The scheme has been tested extensively on Arabic fonts and found to be accurate within all practical limits.

Acknowledgements

The authors acknowledge the support of King Fahd University of Petroleum & Minerals (KFUPM) for this research. This work has been developed under Project# EE/AUTO-TEXT/232 funded by KFUPM.

6. References

[1] S. Sadallah and S. Yacu; "Design of an Arabic Character Reading Machine"; Proceedings Computer process, Arabic Language, Kuwait, (1985).

[2] A. Cheung, M. Bennamoun and N. W. Bergmann; "Implementation of a Statistical Based Arabic Character Recognition System"; IEEE Tencon - Speech and Image Technologies for Computing and Telecommunications, pp. 531-534, (1997).

[3] Jonathan J. Hull; "Document Image Skew Detection: Survey and Annotated Bibliography"; World Scientific, pp. 40-64, 1998.

[4] Syed Nazim Nawaz; "Offline Arabic Character Recognition System"; Master Thesis, KFUPM, June 2003.

[5] T. Akiyama and N. Hagita; "Automatic entry system for printed documents"; Pattern Recognition, Volume 23, Issue 11, 1990, pp. 1141-1154, Elsevier Science.

[6] Su Chen and Robert M. Haralick; "An Automatic Algorithm for Text skew estimation in document images using Recursive morphological transforms"; Proc. of ICIP, pp. 139-143, 1994.

[7] Oleg Okun, Matti Pietikäinen and Jaakko Sauvola; "Robust Skew Estimation on Low-Resolution Document Images"; Proc. 5th International Conference on Document Analysis and Recognition, Bangalore, India, pp. 621-624, 1999.

[8] Younki Min, Sung-Bae Cho and Yillbyung Lee; "A Data Reduction Method for Efficient Document Skew Estimation Based on Hough Transformation"; Proceedings of ICPR '96, IEEE.

[9] E. Kavallieratou, N. Fakotakis and G. Kokkinakis; "Skew Angle Estimation In Document Processing Using Cohen’s Class Distributions"; Pattern Recognition Letters, Vol. 20, Issue 11, pp. 1305-1311, Nov. 1999.

[10] Yue Lu and Chew Lim Tan; "Improved Nearest Neighbor Based Approach to Accurate Document Skew Estimation"; Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR’03), IEEE, 2003.

[11] P. Shivakumara, G. Hemantha Kumar, D. S. Guru and P. Nagabhushan; "Skew Estimation of Binary Document Images Using Static and Dynamic Thresholds Useful for Document Image Mosaicing"; Proceedings: National Workshop on IT Services and Applications (WITSA2003), Feb 27-28, 2003.

[12] Adnan Amin and Ricky Shiu; "Page Segmentation and Classification Utilizing Bottom-Up Approach"; International Journal of Image and Graphics, Vol. 1, No. 2 (2001), pp. 345-361.

[13] W. Postl; "Detection of linear oblique structures and skew in digitized documents"; Proc. Eighth Int. Conf. on Pattern Recognition, 1986.

[14] Rajiv Kapoor, Deepak Bagai and T. S. Kamal; "Skew angle detection of a cursive handwritten Devanagari script character image"; Journal of Indian Institute, May - Aug 2002, pp. 161-175.
