M. Sarfraz*, S. A. Shahab
Department of Information and Computer Science
KFUPM, Dhahran 31261, Saudi Arabia
Email: sarfraz@ccse.kfupm.edu.sa
Proceedings of the Computer Graphics, Imaging and Vision: New Trends (CGIV’05)
0-7695-2392-7/05 $20.00 © 2005 IEEE
Nazim et al. [4] detected the angle of rotation of the text by creating an array whose indices run from -90 to 90, with every cell initialized to zero. All black pixels in the image are grouped in a vector from which they are retrieved in pairs. For each pair, the angle of the straight line segment defined by the pair is calculated and the corresponding cell in the array is incremented. After exhausting all pairs, the array is examined, and the index of the maximum-value cell is the rotation angle of the text. Although this is a simple procedure, it is very expensive in terms of the number of computations performed.

Another skew estimation technique is based on the projection profile of the document [5]. The horizontal/vertical projection profile is a histogram of the number of black pixels along horizontal/vertical scan-lines. For a script with horizontal text lines, the horizontal projection profile will have peaks at text line positions and troughs at positions in between successive text lines. To determine the skew angle of a document, the projection profile is computed at a number of angles, and for each angle, the difference between peak and trough heights is measured. The maximum difference corresponds to the best alignment with the text line direction. This in turn determines the skew angle.

A recursive morphological transform [6] generates a skew estimate for documents having text lines with different skews. Text skew angles within 0.5° of the true text skew angles are computed with a probability of 99%. To process a 300 dpi document image, the algorithm takes 10 seconds on a SUN SPARC 10 machine.

In another approach [7], the skew angle is calculated using text row accumulation based on the computation of the first eigenvector of the data covariance matrix. In contrast to other works, where an image resolution of 100-300 dpi is necessary, a lower resolution (50 dpi) is enough for this method to obtain a correct result.

Another work on skew angle estimation was based on Cohen's class distributions [9]. It is suitable for most types of document pages: printed and handwritten pages, with graphics or borders, poor resolution, and various types and sizes of fonts. Skew angles of up to ±89 degrees are managed with an accuracy of ±0.5 degrees. The method is based on the fact that the histogram of the non-skewed page presents more pronounced peaks and dips than any other histogram of the same page corresponding to a skew angle. A Cohen distribution of a histogram represents its time-frequency distribution, where in this case the time increases according to the height of the page. Consequently, the Cohen distribution presents maximum intensity for the histograms of 0 and 180 degrees, which show the major alternations of peaks and dips. The closer the skew angle is to 0° or 180°, the larger the values of the maximum intensity. This fact guarantees the success of the algorithm, provided that the skew angle ranges between -89 and +89 degrees with respect to the right page position. Otherwise, the page would be oriented upside down.

A nearest-neighbor based approach [10] also performs accurate document skew estimation. A size restriction is introduced into the detection of nearest-neighbor pairs. Then the chains with the largest possible number of nearest-neighbor pairs are selected, and their slopes are computed to give the skew angle of the document image.

Another approach is based on fixing the threshold statically and dynamically in order to separate the text lines [11]. This method proceeds with the assumption that there is space between text lines. It gives accurate results for skews of up to ±30°, and the skew is computed by considering all the text lines in the document. It works well for images of any size. The major disadvantage of this method is that it loses accuracy for documents with a skew greater than 30 degrees.

3. Proposed Schemes

As discussed in Section 2, there is no specific technique for skew estimation in Arabic OCR systems. We have devised a new technique for this purpose. In this technique, we scan the document for 'alif', a character that is used very often in Arabic script and is almost vertical when there is no tilt. The shape of 'alif' is found to be similar in all fonts, and we utilize this fact. We have categorized a document into two basic parts: a structural part (single dots, two dots, '' and other stress marks) and an alphabetic part (everything other than the structural part, that is, connected or isolated characters).

Let
Smin = Element in structural part with minimum number of pixels
Smax = Element in structural part with maximum number of pixels
Amin = Element in alphabetic part with minimum number of pixels
Amax = Element in alphabetic part with maximum number of pixels

For instance, consider the document in Fig. 1.

We have found that the difference between Smax and Amin is more than 3 pixels for any Arabic font of size greater than or equal to 12. This difference increases with the font size. Utilizing this fact, we search for estimates of the above-mentioned parameters.
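The use of the gap between Smax and Amin can be illustrated with a minimal sketch. It assumes that each element has already been reduced to its pixel count, and that the boundary between the structural and alphabetic parts is taken at the first gap of more than 3 pixels between consecutive sorted counts. That splitting rule, the function name `split_components`, and the sample counts are illustrative assumptions, not the procedure used in this paper.

```python
def split_components(pixel_counts, min_gap=3):
    """Estimate Smin, Smax, Amin and Amax from element pixel counts.

    Assumption (not taken from the paper): the structural/alphabetic
    boundary is the first gap between consecutive sorted counts that
    exceeds `min_gap` pixels.
    """
    sizes = sorted(pixel_counts)
    for i in range(1, len(sizes)):
        if sizes[i] - sizes[i - 1] > min_gap:
            structural, alphabetic = sizes[:i], sizes[i:]
            return {
                "Smin": structural[0], "Smax": structural[-1],
                "Amin": alphabetic[0], "Amax": alphabetic[-1],
            }
    return None  # no gap wide enough: the two parts cannot be separated


# Hypothetical counts: dots and marks of 4-9 pixels, characters of 30+ pixels.
params = split_components([5, 4, 7, 9, 31, 48, 36, 6, 120])
print(params)
```

With larger fonts the counts inside the structural part may themselves differ by more than 3 pixels, so a practical implementation would likely need a more robust splitting rule than this one.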
Table 2: Bit pattern of 60° tilted 'alif'
0 0 0 0 0 0 1
0 0 0 0 0 1 1
0 0 0 0 0 1 1
0 0 0 0 1 1 0
0 0 0 1 1 1 0
0 0 0 0 1 0 0
0 0 0 1 0 0 0
0 0 0 1 0 0 0
0 0 1 0 0 0 0
0 1 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 0
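As an illustration of how an angle can be read off such a bit pattern, the sketch below computes the direction of the principal axis of the black pixels from their second-order moments. This is a generic moment-based estimate, offered as an assumption; it is not necessarily how the angle is obtained in this paper, and the function name `stroke_angle_degrees` is hypothetical.

```python
import math

# Bit pattern copied from Table 2 (1 = black pixel), one string per row.
PATTERN = [
    "0000001",
    "0000011",
    "0000011",
    "0000110",
    "0001110",
    "0000100",
    "0001000",
    "0001000",
    "0010000",
    "0100000",
    "0100000",
    "1000000",
    "0000000",
]


def stroke_angle_degrees(rows):
    """Angle of the principal axis of the black pixels, measured
    counter-clockwise from the horizontal axis, in the range [0, 180)."""
    # x grows to the right; y grows upward, so the row index is negated.
    points = [(c, -r) for r, row in enumerate(rows)
              for c, bit in enumerate(row) if bit == "1"]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Orientation of the eigenvector with the largest variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(theta) % 180.0


angle = stroke_angle_degrees(PATTERN)
print(f"stroke angle from the horizontal: {angle:.1f} degrees")
```

For the pattern in Table 2 this gives a stroke direction of roughly 61 degrees from the horizontal, close to the 60° named in the caption. The deviation from the vertical is then 90 degrees minus this value; which of the two conventions the caption uses is not settled by the table alone.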
4. Experiments and Results

We have tested this scheme on a variety of tilted documents, and the accuracy is found to be within the range of -0.017 to +0.017 radians (about ±1 degree). We have also applied this scheme to different Arabic fonts and found it to be very effective. For example, consider the image tilted at an angle of 60°. Fig. 2 shows a tilted document and Fig. 3 depicts the detected 'alif'.
Figure 6: Horizontal Projection of rotated image without skew adjustment
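A horizontal projection such as the one in Figure 6 can be computed, and used for a profile-based skew search of the kind surveyed in [5], roughly as follows. This is a minimal sketch on a synthetic page, not the implementation behind the figure; the function names, the candidate angle range, and the convention that y grows upward are assumptions made for the example.

```python
import math
from collections import Counter


def profile_score(points, angle_deg):
    """Peak-minus-trough height of the horizontal projection profile
    obtained after rotating the black pixels by -angle_deg."""
    a = math.radians(angle_deg)
    # Row index of every black pixel after the rotation.
    rows = Counter(round(-x * math.sin(a) + y * math.cos(a)) for x, y in points)
    lo, hi = min(rows), max(rows)
    heights = [rows.get(r, 0) for r in range(lo, hi + 1)]
    return max(heights) - min(heights)


def estimate_skew(points, candidate_angles):
    """Return the candidate angle whose profile has the largest score."""
    return max(candidate_angles, key=lambda ang: profile_score(points, ang))


# Synthetic page: five text lines of 200 pixels, 10 rows apart, skewed by 5 degrees.
SKEW = 5
points = [(x, round(10 * k + x * math.tan(math.radians(SKEW))))
          for k in range(5) for x in range(200)]
print(estimate_skew(points, range(-10, 11)))
```

When the candidate angle matches the skew, each text line collapses into very few rows, so the peaks are tall and the rows between lines are empty. At other angles the pixels of a line spread over many rows and the profile flattens, which is the effect visible in a projection taken without skew adjustment.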
6. References

[1] S. Sadallah and S. Yacu; "Design of an Arabic Character Reading Machine"; Proceedings Computer process, Arabic Language, Kuwait, 1985.
[2] A. Cheung, M. Bennamoun and N. W. Bergmann; "Implementation of a Statistical Based Arabic Character Recognition System"; IEEE Tencon - Speech and Image Technologies for Computing and Telecommunications, pp. 531-534, 1997.
[3] Jonathan J. Hull; "Document Image Skew Detection: Survey and Annotated Bibliography"; World Scientific, pp. 40-64, 1998.
[4] Syed Nazim Nawaz; "Offline Arabic Character Recognition System"; Master Thesis, KFUPM, June 2003.
[5] T. Akiyama and N. Hagita; "Automatic entry system for printed documents"; Pattern Recognition, Vol. 23, Issue 11, pp. 1141-1154, 1990, Elsevier Science.
[6] Su Chen and Robert M. Haralick; "An Automatic Algorithm for Text skew estimation in document images using Recursive morphological transforms"; Proc. of ICIP, pp. 139-143, 1994.
[7] Oleg Okun, Matti Pietikäinen and Jaakko Sauvola; "Robust Skew Estimation on Low-Resolution Document Images"; Proc. 5th International Conference on Document Analysis and Recognition, Bangalore, India, pp. 621-624, 1999.
[8] Younki Min, Sung-Bae Cho and Yillbyung Lee; "A Data Reduction Method for Efficient Document Skew Estimation Based on Hough Transformation"; Proceedings of ICPR '96, IEEE.
[9] E. Kavallieratou, N. Fakotakis and G. Kokkinakis; "Skew Angle Estimation In Document Processing Using Cohen's Class Distributions"; Pattern Recognition Letters, Vol. 20, Issue 11, pp. 1305-1311, Nov. 1999.
[10] Yue Lu and Chew Lim Tan; "Improved Nearest Neighbor Based Approach to Accurate Document Skew Estimation"; Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR'03), IEEE, 2003.
[11] P. Shivakumara, G. Hemantha Kumar, D. S. Guru and P. Nagabhushan; "Skew Estimation of Binary Document Images Using Static and Dynamic Thresholds Useful for Document Image Mosaicing"; Proceedings: National Workshop on IT Services and Applications (WITSA2003), Feb 27-28, 2003.
[12] Adnan Amin and Ricky Shiu; "Page Segmentation and Classification Utilizing Bottom-Up Approach"; International Journal of Image and Graphics, Vol. 1, No. 2, pp. 345-361, 2001.
[13] W. Postl; "Detection of linear oblique structures and skew in digitized documents"; Proc. Eighth Int. Conf. on Pattern Recognition, 1986.
[14] Rajiv Kapoor, Deepak Bagai and T. S. Kamal; "Skew angle detection of a cursive handwritten Devanagari script character image"; Journal of Indian Institute, May - Aug 2002, pp. 161-175.