
An Efficient Scheme for Tilt Correction in Arabic OCR system

M. Sarfraz*, S. A. Shahab
Department of Information and Computer Science, KFUPM, Dhahran 31261, Saudi Arabia
Email: sarfraz@ccse.kfupm.edu.sa

Abstract

A preprocessing stage is required in almost every image processing application, ranging from biometric analysis to document image analysis. An input image needs to be normalized and converted into a format acceptable to an OCR (Optical Character Recognition) system. OCR systems typically assume that documents were printed with a single text direction and that the acquisition process did not introduce a relevant skew. In practice this assumption often fails, and printed documents may be skewed at some angle to the horizontal axis. In this paper, we propose a skew estimation scheme for document images in Arabic fonts. It is based upon a specific feature of Arabic script: we scan for occurrences of the letter 'alif' and estimate the tilt from its slope. Extensive experimentation was performed and the scheme was found to be very effective.

1. Introduction

Preprocessing is a stage in a typical OCR system that focuses on enhancing the acquired image, both to ease feature extraction and to compensate for the possibly poor quality of the scanned document [2]. When patterns are scanned and digitized, the raw data may carry a certain amount of noise. If the acquired image contains noise, it is "de-noised" in the preprocessing stage. Furthermore, when a document is fed to the scanner, either mechanically or by a human operator, a few degrees of skew (tilt) are unavoidable. Skew estimation aims at detecting the deviation of the document orientation angle from the horizontal or vertical direction. Skew detection and correction are important preprocessing steps in document layout analysis and OCR.

OCR systems have typically assumed that documents were printed with a single direction of the text and that the acquisition process did not introduce a relevant skew. The advent of flatbed scanners and the need to process large numbers of documents at high rates made this assumption unreliable, and a skew estimation phase became mandatory. In fact, a slight skew of the page is often introduced during processes such as copying or scanning. Moreover, documents today are increasingly free-styled, and text aligned along different directions is not uncommon. The subsequent stages of an OCR system depend chiefly upon the accuracy of the preprocessing stage. For instance, if the system underestimates or overestimates the skew angle, any later stage that relies on a projection-based technique will fail.

In this paper, we address tilt estimation in Arabic document images. The paper is organized as follows. Section 2 reviews related work, Section 3 describes our proposed scheme, Section 4 presents experiments and results, and Section 5 concludes the paper.

2. Related Work

A variety of techniques have been proposed in the literature to estimate and correct the skew of a document image. In this section, we briefly review some of the existing techniques for skew estimation and correction in document processing. A comprehensive survey can be found in [3].

Most skew estimation techniques can be divided into the following main classes according to the basic approach they adopt [3]: analysis of projection profiles, Hough transform, connected components clustering, and correlation between lines. Other techniques have been proposed which are based on the analysis of the Fourier spectrum [13], on the use of morphological transforms, and on subspace line detection [6].

Proceedings of the Computer Graphics, Imaging and Vision: New Trends (CGIV’05) 0-7695-2392-7/05 $20.00 © 2005 IEEE


Nazim et al [4] detected the angle of rotation of the text by creating an array whose indices are -90 to 90, initialized to zero. All black pixels in the image are grouped in a vector from which they are retrieved in pairs. For each pair, the angle of the straight line segment defined by the pair is calculated and the corresponding cell in the array is incremented. After exhausting all pairs, the array is examined and the index of the maximum-value cell is the rotation angle of the text. Although this is a simple procedure, it is very expensive in terms of the number of computations performed.

Another skew estimation technique is based on the projection profile of the document [5]. The horizontal/vertical projection profile is a histogram of the number of black pixels along horizontal/vertical scan-lines. For a script with horizontal text lines, the horizontal projection profile will have peaks at text line positions and troughs at positions in between successive text lines. To determine the skew angle of a document, the projection profile is computed at a number of angles, and for each angle, the difference between peak and trough heights is measured. The maximum difference corresponds to the best alignment with the text line direction. This in turn determines the skew angle.

A recursive morphological transform [6] generates skew estimation of documents having text lines with different skews. Text skew angles within 0.5º of the true text skew angles are computed with a probability of 99%. To process a 300 dpi document image, the algorithm takes 10 seconds on SUN SPARC 10 machines.

In another approach [7] the skew angle is calculated using text row accumulation based on the computation of the first eigenvector of the data covariance matrix. In contrast to other works, where an image resolution of 100-300 dpi is necessary, a lower resolution (50 dpi) is enough for this method to obtain a correct result.

Another work regarding skew angle estimation was based on the Cohen's class distributions [9]. It is suitable for most types of document pages: printed and handwritten pages, with graphics or borders, poor resolution, and various types and sizes of fonts. Skew angles of up to ±89 degrees are managed with an accuracy of ±0.5. The method is based on the fact that the histogram of the non-skewed page presents more pronounced peaks and dips than any other histogram of the same page corresponding to a skew angle. A Cohen distribution of a histogram represents its time-frequency distribution, where in this case the time increases according to the height of the page. Consequently, the Cohen distribution presents maximum intensity for the histograms of 0 and 180 degrees, which show the major peaks and dips alternations. The closer the skew angle is to 0º and 180º, the larger are the values of the maximum intensity. This fact guarantees the success of the algorithm, provided that the skew angle ranges between -89 and +89 degrees with respect to the right page position. Otherwise the page would be oriented at reverse side.

A nearest-neighbor based approach [10] also performs an accurate document skew estimation. Size restriction is introduced to the detection of nearest-neighbor pairs. Then the chains with the largest possible number of nearest-neighbor pairs are selected, and their slopes are computed to give the skew angle of the document image.

Another approach is based on fixing the threshold statically and dynamically in order to separate the text lines [11]. This method proceeds with the assumption that there is space between text lines. These methods give accurate results for up to ±30° and the skew is computed by considering all the text lines in the document. It works well for images of any size. The major disadvantage of this method is that it loses accuracy for documents having a skew greater than 30 degrees.
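The projection-profile search reviewed in this section can be illustrated with a minimal sketch. It is not the implementation of [5]; the function names `profile_score` and `estimate_skew`, the representation of the page as a list of black-pixel coordinates, and the candidate angle grid are all assumptions made for illustration.

```python
import math
from collections import Counter

def profile_score(pixels, angle_deg):
    """Peak-to-trough height of the horizontal projection profile
    obtained after rotating the black pixels by -angle_deg."""
    a = math.radians(angle_deg)
    # Row index of each pixel once a skew of angle_deg has been undone.
    rows = Counter(round(y * math.cos(a) - x * math.sin(a)) for x, y in pixels)
    counts = [rows.get(r, 0) for r in range(min(rows), max(rows) + 1)]
    return max(counts) - min(counts)

def estimate_skew(pixels, candidate_angles):
    """Return the candidate angle with the largest peak-to-trough difference."""
    return max(candidate_angles, key=lambda ang: profile_score(pixels, ang))

# Synthetic example (assumed data): two text "lines" skewed by 5 degrees.
slope = math.tan(math.radians(5))
pixels = [(x, round(base + x * slope)) for base in (0, 20) for x in range(200)]
skew = estimate_skew(pixels, range(-10, 11))
```

Every candidate angle requires a pass over all black pixels, which is why approaches of this kind become costly on full pages.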

3. Proposed Schemes

None of the techniques discussed in Section 2 is specific to Arabic OCR systems. We have devised a new technique for skew estimation in Arabic OCR systems. In this technique, we scan for 'alif', a character that occurs very often in Arabic script and is almost vertical when there is no tilt. The shape of 'alif' is found to be similar across all fonts. We utilize this fact and scan the document for the character 'alif'. We categorize the elements of a document into two basic parts: a structural part (single dots, two dots, ' ' and other stress marks) and an alphabetic part (everything other than the structural part, i.e., connected or isolated characters).

Let

$S_{min}$ = element in the structural part with the minimum number of pixels

$S_{max}$ = element in the structural part with the maximum number of pixels

$A_{min}$ = element in the alphabetic part with the minimum number of pixels

$A_{max}$ = element in the alphabetic part with the maximum number of pixels
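The pixel counts in these definitions presuppose that the document has already been split into connected elements. The source does not state how the elements are extracted, so the following is only a minimal sketch of one common way to obtain the counts, assuming 8-connectivity and a binary image stored as a list of rows; the name `component_sizes` is hypothetical.

```python
from collections import deque

def component_sizes(image):
    """Return the pixel count of each 8-connected component of 1-pixels."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    sizes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited black pixel.
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < height and 0 <= nc < width
                                and image[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            sizes.append(size)
    return sizes

# Toy image (assumed data): a one-pixel dot and a four-pixel vertical stroke.
image = [
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
sizes = component_sizes(image)
```

The smallest and largest counts within each part then play the roles of the four quantities defined above.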

We have found that the difference between $S_{max}$ and $A_{min}$ is more than 3 pixels for any Arabic font of size greater than or equal to 12. This difference grows as the font size increases. Utilizing this fact, we estimate the above-mentioned parameters. For instance, consider the document in Fig. 1.

Figure 1: Arabic Document with Font Size 14

Here, $S_{min}$ = single dot, $S_{max}$ = ' ', $A_{min}$ = ' ', $A_{max}$ = ' '

It can be noted that $A_{max}$ depends strictly upon the text in the document image. Similarly, $S_{max}$ can also change, but in almost every case $S_{min}$ and $A_{min}$ will not change. Another important point is the separation between $S_{max}$ and $A_{min}$. This condition cannot be compromised: the pixel counts of $S_{max}$ and $A_{min}$ need to differ considerably (by more than 3 pixels). We then scan for the element of the alphabetic part with the maximum number of occurrences; 'alif' is a letter that is found frequently compared with the other elements.
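The search for the most frequent alphabetic element can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `most_frequent_alphabetic` is hypothetical, the pixel count of the largest structural element is assumed to be known in advance, and elements are assumed to be identified by their pixel counts alone.

```python
from collections import Counter

def most_frequent_alphabetic(sizes, s_max, min_gap=3):
    """Return the pixel count that occurs most often among alphabetic elements.

    sizes   : pixel counts of all connected elements in the document
    s_max   : assumed pixel count of the largest structural element
    min_gap : required separation between s_max and the smallest alphabetic element
    """
    # Everything more than min_gap pixels above s_max is treated as alphabetic.
    alphabetic = [s for s in sizes if s > s_max + min_gap]
    if not alphabetic:
        raise ValueError("no alphabetic element found above the structural range")
    size, _count = Counter(alphabetic).most_common(1)[0]
    return size

# Assumed pixel counts: dots (4, 8), a repeated 'alif'-like stroke (28), longer shapes.
sizes = [4, 8, 4, 28, 61, 28, 45, 28, 8, 90]
alif_size = most_frequent_alphabetic(sizes, s_max=8)
```

In practice a shape check would probably be needed as well, since two different characters can share the same pixel count.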

The skew angle of 'alif' can be found by drawing a bounding box around it. Let $P_1(x_1, y_1)$ be the topmost point of 'alif' and $P_2(x_2, y_2)$ be the bottommost point. The slope of the tilted 'alif' can be found using the following formula:

$$\theta = \tan^{-1}\left(\frac{y_2 - y_1}{x_2 - x_1}\right)$$

The document needs to be rotated by the angle $\theta$.
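The slope computation can be sketched on a binary glyph as follows. The helper names `glyph_endpoints` and `slope_angle_deg` are hypothetical, and the sketch departs from the formula in one respect, which is an assumption of this example: it uses `math.atan2` on absolute differences, so that a perfectly vertical stroke ($x_1 = x_2$) does not cause a division by zero and the result is the magnitude of the angle.

```python
import math

def glyph_endpoints(glyph):
    """Return (x, y) of the topmost and bottommost black pixels of a glyph.

    glyph is a list of rows of 0/1 values; y grows downward, x to the right.
    When a row holds several black pixels, the first one is used (an arbitrary choice).
    """
    black_rows = [y for y, row in enumerate(glyph) if 1 in row]
    top, bottom = black_rows[0], black_rows[-1]
    return (glyph[top].index(1), top), (glyph[bottom].index(1), bottom)

def slope_angle_deg(p1, p2):
    """Magnitude, in degrees, of the angle between segment P1P2 and the horizontal axis."""
    (x1, y1), (x2, y2) = p1, p2
    # atan2 on absolute differences avoids division by zero when x1 == x2.
    return math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))

# Toy glyph (assumed data): a stroke leaning from top-right to bottom-left.
glyph = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
p1, p2 = glyph_endpoints(glyph)
angle = slope_angle_deg(p1, p2)
```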

However, one problem with this scheme is that 'alif' is not exactly vertical even if there is no tilt. Table 1 shows a typical bit pattern of a template 'alif' at font size 14.

Table 1: Bit Pattern of template 'alif' of font size 14

1 0
1 0
1 1
1 1
1 1
1 1
1 1
0 1
0 1
0 1
0 1
0 1
0 1
0 1

The skew angle of the template 'alif' given in Table 1 is found to be, in magnitude:

$$\theta_{alif} = \tan^{-1}\left(\frac{14 - 0}{0 - 1}\right) = 85.94^\circ$$

The bit pattern of a 60° tilted 'alif' is shown in Table 2.


Table 2: Bit Pattern of 60° tilted 'alif'

0 0 0 0 0 0 1
0 0 0 0 0 1 1
0 0 0 0 0 1 1
0 0 0 0 1 1 0
0 0 0 1 1 1 0
0 0 0 0 1 0 0
0 0 0 1 0 0 0
0 0 0 1 0 0 0
0 0 1 0 0 0 0
0 1 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 0

In order to rotate the document image, the adjusted skew angle can be found using the following equation:

$$\theta' = \theta_{tilted\text{-}alif} - \theta_{alif}$$
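As a worked check of this adjustment, the sketch below evaluates both angles numerically. It rests on several assumptions: the helper names are hypothetical, the template endpoints are the ones used in the text, P1 = (1, 0) and P2 = (0, 14), the tilted pattern is the list of Table 2 values read seven per row, and angles are taken in magnitude via `math.atan2`.

```python
import math

def angle_deg(p1, p2):
    """Magnitude of the angle between segment P1P2 and the horizontal axis."""
    (x1, y1), (x2, y2) = p1, p2
    return math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))

def adjusted_skew(theta_tilted, theta_template):
    """Adjusted angle: measured angle of the tilted glyph minus the template angle."""
    return theta_tilted - theta_template

# Template 'alif' endpoints as used in the text.
theta_alif = angle_deg((1, 0), (0, 14))

# Table 2 values, laid out as 13 rows of 7 (layout is an assumption).
TILTED = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
rows = [y for y, row in enumerate(TILTED) if 1 in row]
p1 = (TILTED[rows[0]].index(1), rows[0])    # topmost black pixel
p2 = (TILTED[rows[-1]].index(1), rows[-1])  # bottommost black pixel
theta_tilted = angle_deg(p1, p2)
theta_adjusted = adjusted_skew(theta_tilted, theta_alif)
```

Evaluating the inverse tangent of 14 directly gives about 85.91°, marginally below the 85.94° quoted in the text; the difference is far smaller than the ±0.017 radian accuracy reported in Section 4.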

4. Experiments and Results

We have tested this scheme on a variety of tilted documents, and the accuracy is found to be within the range of -0.017 to +0.017 radians (approximately ±1°). We have also applied this scheme to different Arabic fonts and found it to be very effective. For example, consider an image tilted at an angle of 60°. Fig. 2 shows the tilted document and Fig. 3 depicts the detected 'alif'.

Figure 2: Tilted Document

Figure 3: Detected 'Alif'

After detecting the skew angle of 'alif', the document is rotated in the opposite direction to negate the effect of the skew, as shown in Fig. 4.

Figure 4: Rotated Document

With adjustment of the skew angle by the template 'alif', even if the document is not exactly aligned with the horizontal axis, the document image can be segmented by a projection-based method. Fig. 5 demonstrates the horizontal projection of the skew-corrected document.

Figure 5: Horizontal Projection of skew corrected document
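The rotation step described above, turning the document in the direction opposite to the detected skew, can be sketched on a list of black-pixel coordinates. The names `rotate_pixels` and `deskew` and the choice of rotation center are assumptions for illustration. The sketch maps source pixels forward and rounds them, which can leave small holes in a real image; practical implementations usually map each output pixel back to the source and interpolate.

```python
import math

def rotate_pixels(pixels, angle_deg, center=(0.0, 0.0)):
    """Rotate (x, y) points by angle_deg about center and round to the pixel grid."""
    a = math.radians(angle_deg)
    cx, cy = center
    cos_a, sin_a = math.cos(a), math.sin(a)
    rotated = []
    for x, y in pixels:
        dx, dy = x - cx, y - cy
        rotated.append((round(cx + dx * cos_a - dy * sin_a),
                        round(cy + dx * sin_a + dy * cos_a)))
    return rotated

def deskew(pixels, skew_deg, center=(0.0, 0.0)):
    """Undo a detected skew by rotating in the opposite direction."""
    return rotate_pixels(pixels, -skew_deg, center)

# Round trip on assumed data: skew a horizontal run of pixels by 30 degrees, then undo it.
line = [(x, 0) for x in range(0, 50, 10)]
skewed = rotate_pixels(line, 30)
restored = deskew(skewed, 30)
```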

Without adjustment of the skew angle with the template 'alif', the document cannot be segmented with a projection-based method. This is visible in Fig. 6.

Figure 6: Horizontal Projection of rotated image without skew adjustment
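The projection-based segmentation referred to here can be sketched as follows: text lines are taken to be runs of rows whose black-pixel count exceeds a threshold. The function names and the zero threshold are assumptions for illustration; the source does not give the segmentation procedure in detail.

```python
def horizontal_projection(image):
    """Number of black (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]

def segment_lines(profile, threshold=0):
    """Return (start_row, end_row) pairs for runs of rows whose count exceeds threshold."""
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y                      # a text line begins
        elif count <= threshold and start is not None:
            lines.append((start, y - 1))   # the text line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

# Toy image (assumed data): two text lines separated by an empty row.
image = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
profile = horizontal_projection(image)
lines = segment_lines(profile)
```

When residual skew remains, the empty rows between text lines fill in and the runs merge, which is the failure that Fig. 6 illustrates.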

We also tested our proposed scheme on skew correction of a document in the Andalus font tilted at 63°. The results are shown in the following figures.

Figure 7: Normal Document

Figure 8: Skewed Document

Figure 9: Detected 'Alif' from tilted document

Figure 10: Rotated Document after skew correction

Figure 11: Horizontal Projection

5. Conclusion

Tilt correction in document images has been a major problem in OCR systems. Tilt is introduced into document images by operational mistakes. The recognition accuracy of OCR systems depends heavily upon the preprocessing stage, and tilt correction is among the primary preprocessing techniques. In this paper we have proposed an efficient scheme for tilt correction in Arabic document images. The scheme is specific to Arabic fonts: it corrects the tilt by observing the orientation and slope of the Arabic letter 'alif', which is present in almost all documents. Because the scheme does not process the entire document, it is expected to be faster than approaches that do. Its speed depends upon the position of the first detected 'alif', which has a high probability of being found in the first line. The scheme has been tested extensively on Arabic fonts and found to be accurate within all practical limits.

Acknowledgements

The authors acknowledge the support of King Fahd University of Petroleum & Minerals (KFUPM) for this research. This work has been developed under Project# EE/AUTO-TEXT/232 funded by KFUPM.


6. References

[1] S. Sadallah and S.Yacu; “Design of an Arabic Character Reading Machine”; Proceedings Computer process, Arabic Language, Kuwait, (1985).

[2] A. Cheung, M. Bennamoun and N.W. Bergmann “Implementation of a Statistical Based Arabic Character Recognition System”, IEEE Tencon – Speech and Image Technologies for Computing and Telecommunications, pp.531-534, (1997).

[3] Jonathan J. Hull; "Document Image Skew Detection: Survey and Annotated Bibliography"; World Scientific, pp. 40-64, 1998.

[4] Syed Nazim Nawaz; "Offline Arabic Character Recognition System"; Master Thesis, KFUPM, June 2003.

[5] T. Akiyama and N. Hagita; "Automatic entry system for printed documents"; Pattern Recognition, Vol. 23, Issue 11, pp. 1141-1154, 1990.

[6] Su Chen and Robert M.Haralick; "An Automatic Algorithm for Text skew estimation in document images using Recursive morphological transforms"; proc. of ICIP, pp 139-143, 1994

[7] Oleg Okun, Matti Pietikäinen and Jaakko Sauvola; "Robust Skew Estimation on Low-Resolution Document Images"; Proc. 5th International Conference on Document Analysis and Recognition, Bangalore, India, 621 – 624, 1999.


[8] Younki Min, Sung-Bae Cho and Yillbyung Lee; "A Data Reduction Method for Efficient Document Skew Estimation Based on Hough Transformation"; Proceedings of ICPR '96 IEEE

[9] E. Kavallieratou, N. Fakotakis and G. Kokkinakis; "Skew Angle Estimation In Document Processing Using Cohen's Class Distributions"; Pattern Recognition Letters, Vol. 20, Issue 11, pp. 1305-1311, Nov. 1999.

[10] Yue Lu and Chew Lim Tan; "Improved Nearest Neighbor Based Approach to Accurate Document Skew Estimation"; Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR'03), IEEE, 2003.

[11] P.Shivakumara, G. Hemantha Kumar, D. S Guru and P. Nagabhushan; "Skew Estimation of Binary Document Images Using Static and Dynamic Thresholds Useful for Document Image Mosaicing"; Proceedings: National Workshop on IT Services and Applications (WITSA2003) Feb 27-28, 2003

[12] Adnan Amin, Ricky Shiu; "Page Segmentation and Classification Utilizing Bottom-Up Approach"; International Journal of Image and Graphics, Vol. 1, No. 2 (2001) pp. 345-361.

[13] W. Postl; "Detection of linear oblique structures and skew in digitized documents"; Proc. Eighth Int. Conf. on Pattern Recognition, 1986.

[14] Rajiv Kapoor, Deepak Bagai, T. S. Kamal; "Skew angle detection of a cursive handwritten Devanagari script character image"; Journal of Indian Institute May - Aug 2002, pp 161–175.